Recently, The Economist ran a special report titled “Data, data everywhere”. The report examines the rapid increase in data volumes and its implications. It got the attention of the blogosphere (example) and I recommend taking a look if you haven’t already.
When I read articles like these, I try to extract three categories of “knowledge” for future use: factoids, stories, and insights.
- Factoids are simply data points that I feel might come in handy someday.
- Stories are real-world anecdotes. The most memorable ones have an “aha!” element to them.
- Insights are observations (usually at a higher level of abstraction than stories) that make me go “I never thought of that before. But it makes total sense.”
Think of this crude categorization as my personal approach to dealing with information overload. Of course, there’s a fair amount of subjectivity here: what I think of as an insight may be obvious to you and vice-versa.
So what did I make of The Economist article? There were numerous factoids that I cut-and-stored away (too many to list here but email me if you want the list), a few memorable stories, and a couple of insights.
Let’s start with the stories.
In 2004 Wal-Mart peered into its mammoth databases and noticed that before a hurricane struck, there was a run on flashlights and batteries, as might be expected; but also on Pop-Tarts, a sugary American breakfast snack. On reflection it is clear that the snack would be a handy thing to eat in a blackout, but the retailer would not have thought to stock up on it before a storm.
Memorable and concrete. Neat.
Consider Cablecom, a Swiss telecoms operator. It has reduced customer defections from one-fifth of subscribers a year to under 5% by crunching its numbers. Its software spotted that although customer defections peaked in the 13th month, the decision to leave was made much earlier, around the ninth month (as indicated by things like the number of calls to customer support services). So Cablecom offered certain customers special deals seven months into their subscription and reaped the rewards.
Four months before the customer defected, early-warning signs were beginning to appear. Nice but not particularly unexpected.
Airline yield management improved because analytical techniques uncovered the best predictor that a passenger would actually catch a flight he had booked: that he had ordered a vegetarian meal.
Hey, I knew this all along! Over 20 years, I have ordered vegetarian meals almost every time and have almost never missed a flight.
Just kidding. This came out of left field; I had never seen it before. While the claim that airline yield management improved substantially due to this single discovery feels like a stretch, the story is certainly memorable.
Sometimes those data reveal more than was intended. For example, the city of Oakland, California, releases information on where and when arrests were made, which is put out on a private website, Oakland Crimespotting. At one point a few clicks revealed that police swept the whole of a busy street for prostitution every evening except on Wednesdays, a tactic they probably meant to keep to themselves.
Worry-free Wednesdays! Great story, difficult to forget.
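If you are curious how a pattern like this falls out of the data, here is a minimal sketch in Python. The dates are made up purely for illustration (I don't have the Oakland Crimespotting data), and `quiet_weekdays` is a hypothetical helper of my own, not anything the site actually uses. The idea is simply to count records by day of the week and see which days never show up.

```python
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def quiet_weekdays(arrest_dates):
    """Return the names of weekdays on which no arrest was recorded."""
    # date.weekday() returns 0 for Monday through 6 for Sunday
    counts = Counter(d.weekday() for d in arrest_dates)
    return [name for i, name in enumerate(WEEKDAYS) if counts[i] == 0]


# Toy data (an assumption, not real arrest records): four weeks with
# one arrest every evening except Wednesdays.
start = date(2010, 3, 1)  # a Monday
all_days = [start + timedelta(days=n) for n in range(28)]
toy_arrests = [d for d in all_days if d.weekday() != 2]

print(quiet_weekdays(toy_arrests))
```

With real data you would also want to check that a gap like this persists over many weeks before reading anything into it.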
Let’s now turn to the two insights that stood out for me.
a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.
This wasn’t completely new to me (I have friends whose job title is “Data Scientist”), but seeing the sentence in black and white crystallized the insight for me and made me appreciate the power of the trend. I particularly liked the point that the data scientist needs to sit at the intersection of programming, stats and storytelling.
As more corporate functions, such as human resources or sales, are managed over a network, companies can see patterns across the whole of the business and share their information more easily.
What the author means by “managed over a network” is “managed in the cloud”. In my experience, data silos are all too common, and this often leads to decisions being optimized one silo at a time, even though optimizing across silos can produce dramatic benefits.
I had not appreciated that, as data for more and more business functions gets housed in the cloud, data silos will naturally disappear and it will become increasingly easy to optimize across functions.
Well, that was what I gleaned from the article. If you “extract knowledge” in a different way than factoids/stories/insights, do share in the comments – I would love to know.