Top Posts

 

MapReduce goes evolutionary

Scientists from Texas A&M University have developed a new algorithm, MrsRF (MapReduce Speeds up Robinson-Foulds), for analyzing large collections of evolutionary trees using the MapReduce framework. Matthews et al. have used their MapReduce algorithm to compute the all-to-all Robinson-Foulds (RF) distance matrix on multi-core computing platforms. Calculating all possible pairwise Robinson-Foulds distances is a computationally intensive task. The results show that a significant speedup can be achieved using MrsRF compared to the fastest sequential algorithms.
We studied the performance of our MrsRF algorithm on two large biological trees sets consisting of 20,000 trees of 150 taxa each and 33,306 trees of 567 taxa each. Our experiments show that MrsRF is a scalable approach reaching a speedup of over 18 on 32 total cores.
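To make the cost of the all-to-all computation concrete, here is a minimal sequential sketch in Python. It is not the MrsRF implementation and uses no MapReduce. It assumes each tree has already been reduced to a set of non-trivial bipartitions, each stored as the frozenset of taxa on the side that does not contain a reference taxon (here "A"), and it takes the RF distance to be the size of the symmetric difference of the two sets (some definitions halve this value). The function names and the toy trees are hypothetical.

```python
from itertools import combinations


def rf_distance(bipartitions_a, bipartitions_b):
    """Count the bipartitions that appear in exactly one of the two trees."""
    return len(bipartitions_a ^ bipartitions_b)


def rf_matrix(trees):
    """Build the symmetric all-to-all RF distance matrix for a list of trees."""
    n = len(trees)
    matrix = [[0] * n for _ in range(n)]
    # Only n * (n - 1) / 2 pairs need computing because the distance is symmetric.
    for i, j in combinations(range(n), 2):
        distance = rf_distance(trees[i], trees[j])
        matrix[i][j] = matrix[j][i] = distance
    return matrix


# Hypothetical toy input: three unrooted trees on taxa A-E.
trees = [
    {frozenset("CDE"), frozenset("DE")},  # ((A,B),C,(D,E))
    {frozenset("BDE"), frozenset("DE")},  # ((A,C),B,(D,E))
    {frozenset("CDE"), frozenset("CE")},  # ((A,B),D,(C,E))
]
matrix = rf_matrix(trees)
for row in matrix:
    print(row)
```

Under these assumptions, a collection of 20,000 trees yields 199,990,000 distinct pairs, which suggests why spreading the work across cores pays off.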

Phase 1 of the MrsRF algorithm. Two mappers and two reducers are used to process the input files.

Apart from speeding up the phylogenetic analysis, this study presents a new type of MapReducible problem...
read more >>

Defining Analytics: Data, Information and Knowledge

Following up on my blog post 'Just Tell Me What I'm Doing', I'm starting a series of posts that define the key concepts and terms that make up my analytic world. Everything I do is coloured by my experience actually doing analytics in commercial organisations. So while I believe these posts will present practical definitions that will be actionable in the business world, I know that there are other worlds in academia and science where they are less relevant. At the very least, people in these areas will gain a better understanding of how business regards analytics.


Bennett's Analytica
A Practitioner's Guide To Analytics



Data, Information and Knowledge


Information is a collection of related data – often transformed and aggregated – about a topic. In business, that topic is often insight about...

read more >>

Recent Posts

 

Social Media Marketers Should Get Ahead of the Curve

Recently I was reviewing an interesting article by Ross Mayfield who is an advisor to www.Slideshar.com and co-founder of Socialtext. He is also at @ross on Twitter. My compliments to him and his team. He has this to say:

 

As Chief Marketing Officers develop their social media marketing strategy for 2010, they are demanding business results. In 2009, 89% of CMOs tracked social media’s impact by using standard metrics such as site traffic, pageviews, and number of fans (as discussed in a recent survey). However, CMOs expect that in 2010 top metrics will track more closely to P&L business goals, not just Web-related goals. The study forecasts the growth of adoption of the top three metrics in 2010, as follows:

  • A 333% increase in tracking revenue
  • A 174% increase in tracking conversion
  • A 150% increase in tracking average order value

Such a shift in measurement expectation is significant. CMOs indicate a 300% year-over-year increase in 2010...

read more >>

When does a hard science become a team sport?

The newest report from analyst firm Gartner just came out – and we’re all excited! It’s called the “Magic Quadrant for Data Warehouse Database Management Systems.” Gartner issues it about once a year, and it positions database vendors based on vision and execution. I am pleased to share that Teradata is in the top spot of the Leaders’ Quadrant. To see our press release and get details, click here.

 

The report is a serious one because CIOs everywhere will read it, see the quadrants and how they are populated – then make more informed decisions. Those decisions shape our future business. I read the report’s fine print, too.

 

As always, the report focuses on the relevant capabilities of vendors. However, the analysts remarked this year that Teradata’s customers play a role in our product development. I’m glad they did, because it’s an important factor. I commented on it in the press release because I believe that much of the credit for our success as a business comes from...

read more >>

Why Google needed a Superbowl ad

We were watching the Superbowl on TiVo, about a half hour behind, when I got a text message from my son at college: "Google's super bowl ad was horrible. I'm going to use Bing."

I didn't think it was bad, but I'm more prone to sentimentality than my son (to put it mildly). As I started this post, my point was going to be that the ad wasn't targeted at people like him, a blogger who actually considers which search engine to use. And it certainly wasn't for the masses who had clicked the ad on YouTube. I figured it was more for millions of people who use search occasionally, but don't yet understand the breadth of its applications. After all, how does Google fare with people who have migrated in the last two or three years from dial-up, or are still there? For a company with 70% of the search market, that segment might represent growth. And you're more likely to reach them on the Superbowl than with a viral campaign on the Net.

But as I rewatch the ad, I suspect it might move too fast... read more >>

Socialytics

With the continued growth of both the public social web and private collaboration, communication and social business tools, we are creating an explosion of social data. As businesses get more deeply involved in the social business movement and as software vendors create more and different social tools, there is a compelling case for tools to help businesses make sense out of all this social data. I wrote about the idea of social analytics last year. This year in our IDC Top 10 Predictions we included a prediction about social analytics: "Business applications will undergo a fundamental transformation, fusing business applications with social/collaboration software and analytics into a new generation of 'socialytic' apps, challenging current market leaders." Here's a simple model that shows the concept of socialytic platforms / apps and how they might be applied:

[Figure: model of socialytic platforms and apps]

Some characteristics of socialytic apps...

read more >>

Big Data

Several months ago a short video appeared on YouTube with an interview of LinkedIn's Chief Scientist DJ Patil. In it he discusses how 'Big Data' impacts the practice of analytics. I've only just got around to posting about it, but I am doing so now because he has some insights that I agree with and would like to share, as they are still relevant.

Big data is today most often associated with the internet superstars like Google, eBay and Amazon. There are 3 other areas with lower profiles where big data is important: intelligence (spooks, the military, etc.), scientific and academic research, and the financial markets.

Big data's future is much bigger than this because more and more areas of human activity are going to be faced with vast data sets. When you hear people talking about the growth of knowledge and statements like 'if this data were printed then the stack would grow faster than NASA’s fastest rocket', you have to remember that there is a good chance that each page of new data is adding to someone's analytic data set.

    I'm not quoting the guy verbatim but here's what I heard...

    read more >>

    Selling to enterprises

    For some reason when you are selling information technology, big companies are referred to as “enterprises.” I’m guessing the word was invented by a software vendor who was trying to justify a million-dollar price tag. As a rule of thumb, think of enterprise sales as products/services that cost $100K/year or more.

    I am by no means an expert in enterprise sales. Personally, I vastly prefer marketing (one-to-many) versus sales (one-to-one), hence only start companies making consumer or small business products (advertising based or sub-$5000 price tags). But I have been involved in a few enterprise companies over the years. Here’s the main thing I’ve observed. Almost every enterprise startup I’ve seen has a product that would solve a problem their prospective customers have. But that isn’t the key question. The key question is whether...

    read more >>

    After phylogenetics Microsoft patents personal data mining

    I hope you remember that some time back Microsoft tried to patent clustering methods for phylogenetics, which was shocking news for the bioinformatics community, as the community had used these methods for a long time without any restriction. Now Microsoft has patented a personal data mining system. According to the patent abstract, which was accepted just last week:
    Personal data mining mechanisms and methods are employed to identify relevant information that otherwise would likely remain undiscovered. Users supply personal data that can be analyzed in conjunction with data associated with a plurality of other users to provide useful information that can improve business operations and/or quality of life. Personal data can be mined alone or in conjunction with third party data to identify correlations amongst the data and associated users. Applications or services can interact with such data and present it to users in a myriad of manners, for instance as notifications of opportunities.

    The patent includes some heavyweight names from Microsoft, including...
    read more >>

    WSDM 2010: Day 2

    Unfortunately, I woke up this morning rather under the weather, so I’m having to resort to remotely reporting on the second day of WSDM 2010 conference, based on the published proceedings and the tweet stream.

    The day started with a keynote from Harvard economist Susan Athey. Her research focuses on the design of auction-based markets, a topic core to the business of search which largely relies on auction-based advertising models (cf. Google AdWords). Then came a session focused on learning and optimization. One paper proposed a method to learn ranking functions and query categorization simultaneously, reflecting that different categories of queries lead users to have different expectations about ranking. Another combined traditional list-based ranking with pair-wise comparisons between results to separate the results into tiers reflecting grades of relevance. An intriguing approach to query recommendation treated it as an optimization problem, perturbing users’ query-reformulation path to maximize the expected value of a utility function over the search session. Another paper looked not at...

    read more >>

    Huffington Post: Crawling with data addicts

    The Huffington Post has recently passed the Washington Post in traffic. It got 410 million page views last month (and 35 million on the iPhone alone). Data is a big part of the site's success.

    At the media panel at the Webtrends conference this week, Huffington Post's chief technology officer, Paul Berry, described some of the methods. Every hour, editors see how the traffic that hour compares to the same hour a week ago. The site is a laboratory for so-called A/B testing, where stories are played against each other to see which draws more traffic, and how long each story should stay on the front page.

    "We've built a lot of internal tools," Berry said. "A lot of us are addicted, like crack addicts, to these stats."
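The two measurements described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Huffington Post's internal tooling: the function names, the hourly pageview dictionary, and all figures are hypothetical, and the A/B comparison simply picks the higher click-through rate without any statistical significance test.

```python
from datetime import datetime, timedelta


def week_over_week_change(pageviews_by_hour, hour):
    """Fractional change in traffic versus the same hour one week earlier."""
    previous = pageviews_by_hour[hour - timedelta(weeks=1)]
    current = pageviews_by_hour[hour]
    return (current - previous) / previous


def better_variant(clicks_a, views_a, clicks_b, views_b):
    """Return the variant with the higher click-through rate (ties go to A)."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    return "A" if rate_a >= rate_b else "B"


# Hypothetical figures, not Huffington Post data.
this_hour = datetime(2010, 2, 5, 15)
pageviews_by_hour = {
    this_hour - timedelta(weeks=1): 600_000,
    this_hour: 660_000,
}
change = week_over_week_change(pageviews_by_hour, this_hour)
winner = better_variant(clicks_a=120, views_a=4_000, clicks_b=150, views_b=4_000)
print(f"Week-over-week change: {change:+.1%}")
print(f"Better-performing variant: {winner}")
```

A production version would likely want a significance test before declaring a winner, since small differences in click-through rate can be noise.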

    HuffPost also tracks readers' shifting moods by carrying out automated "sentiment analysis" on the two million comments the site generates every month. (In other words, machines look for key words and report on whether the comment was favorable or not. If you look at individual results, they're fairly primitive. A sentence like, "I'm not saying I'm not crazy about it," can throw a machine for a loop. But they get the big ... read more >>

    Microsoft takes on Google and IBM in science cloud

    Microsoft, the Times reports, is offering scientists free access to its cloud computing. This is important because scientists are grappling with mountainous troves of data, and they need Google-like (or Bing-like) computing clusters to crunch them. I read recently that the biological data amassed last year surpassed all of the biological data in history, presumably from the dissection of the first frog until weeks before the Obama inauguration.

    The need for scientific clouds is clear, and as I wrote two years ago in BusinessWeek, IBM and Google are on a similar track. My question is this: Is scientific data going to get tangled in a software battle? IBM, Google and others are offering an open-source cloud software known as Hadoop, which is based on Google's MapReduce. Microsoft is providing its own platform, Azure. The grand promise of cloud computing will be for scientists to share data sets, and even to delve into ones from seemingly unrelated fields. That way they might find correlations between, say, meteorology and disease.

    But if scientists in the Microsoft cloud are doing their work in Azure, will they be able to collaborate with others working in the ... read more >>

    Weekly Most Discussed

     

    Socialytics


    Converting Prospects into Repeat Buyers in Five Steps



    Here are five best practices that work for marketers to convert their prospects into buyers, and then repeat buyers - basically fans that start following the brand with a passion.

    1. Making a connection - Target people with their preferences. If they have not provided you with preferences, then give them options via an email or a direct mail piece. Watch what they prefer, and add that to their preferences. Use this information to create targeted offers for them.

    2. Try to create a memory - Make sure that your copy and image are interesting enough for the recipient to remember. Also, make the call to action stand out. In an ideal situation the recipient will click through to purchase. If not, coax them to review your offer or even perhaps add the offer to their shopping basket which you can store for them.

    3. Provide a trigger... read more >>

    Why Google needed a Superbowl ad
