How To Lose Credibility When Using Data
I have been saying for a while that data is the next big thing.
No, not data by itself (which we have been talking about forever) or even “Big Data” (which is thankfully past the peak of its hype and going back to its maiden name: Data). We are talking about how we use data, what we do with it, and how it leads us to better decisions.
Almost two years ago, when I was working with Attensity researching the issues that plagued analytics, I wrote a great piece (if I may say so) titled “How Accuracy Matters in Analytics”. I looked at the bias humans introduce into data and analytics and how we are the cause of poor results. If you have time to read it you’ll be a better person (OK, not that much, but it will give you something to think about; the post is here) and there are some concrete steps you can take to remove bias from your data projects.
Data is wonderful. Data gives you credibility. Data makes you smart.
Data, unfortunately, is easy to twist around.
If you want to find a data point to support your theory, whatever it may be, I promise you can find it. You can get a 90% customer satisfaction score very easily (formula in a post I wrote long ago). More and more of my clients, vendors and non-vendors alike, now want specific data to support their theories. I can create a survey, study, or research project to support basically any statement you want to make, as long as we can agree on certain specifications.
Data can be found to support what you are saying; always has, always will. The problem is not finding the data, it is using it. I am seeing more and more people use data poorly. Instead of making contextual statements, people are making absolute statements about the data they have.
Making data absolute is so incredibly wrong.
If you remember nothing else about data, remember this point: data is contextual, not absolute.
Anything you measure has to be measured in context to be valid. I am not just talking about demographics (that is so old school; it still works in some cases, but in today’s world demographics have become too fragmented to be reliable, and long-tail analytics has drastically changed that), but about differentiated segments.
When you introduce or use data you must provide context.
Context is what makes you smart, not just the data. If you can properly use data in context, you will be far smarter than just using data.
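To make the point concrete, here is a minimal sketch of the same data read two ways: as one absolute number and as numbers in context. The segments and counts are invented purely for illustration; they do not come from any real survey.

```python
from collections import defaultdict

# Hypothetical survey responses as (segment, satisfied?) pairs.
# All segments and counts are made up for this illustration.
responses = (
    [("long-time customers", True)] * 85
    + [("long-time customers", False)] * 5
    + [("new customers", True)] * 5
    + [("new customers", False)] * 5
)


def satisfaction_rate(rows):
    """Share of respondents in rows who said they were satisfied."""
    return sum(1 for _, satisfied in rows if satisfied) / len(rows)


def rate_by_segment(rows):
    """Satisfaction rate computed separately for each segment."""
    grouped = defaultdict(list)
    for segment, satisfied in rows:
        grouped[segment].append((segment, satisfied))
    return {segment: satisfaction_rate(group) for segment, group in grouped.items()}


# The absolute statement: one number for everybody.
overall = satisfaction_rate(responses)
# The contextual statement: the same data, split by who answered.
by_segment = rate_by_segment(responses)

print(f"Overall: {overall:.0%}")
for segment, rate in by_segment.items():
    print(f"{segment}: {rate:.0%}")
```

In this made-up sample the headline number is 90% satisfaction, yet only half of the new customers are satisfied. The absolute statement is not false; it just hides who answered.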
Here are a few examples. A few days or weeks ago (I don’t keep track of time anymore, reduces the stress) good friend Emanuele Quintarelli (@absolutesubzero on Twitter) said in the aforementioned social network:
— Emanuele Quintarelli (@absolutesubzero) November 2, 2012
That sparked a conversation between us where I said that the data was not wrong, just biased. I had not yet read the report, to be honest, but I have since then and my opinion has not changed. The data is biased by the theory being proven (people complain in private channels more than in public channels) and by how the questions were asked: people using private channels were asked to reply. The report is behind Forrester’s paywall (it is worth reading if you can get it; in spite of its bias it is an interesting report) so you may not be able to get it, but the questions are structured so that the results come out exactly the way you see them above. The people selected to reply to this survey, and the manner in which the survey was conducted, biased the answers towards private networks. If the exact same survey had been conducted in public social networks, the results would have been different, as they are in other studies asking similar questions of different respondents.
Another example is the Harris Interactive Customer Experience Impact Report from 2009, which is widely quoted (more specifically, the data point that says that 86% of consumers would change a service provider after just one bad experience). Again, there is significant bias in this survey, and not only in how the data was collected (it was biased towards generating a large number of positive responses by removing follow-up questions and context, e.g. nothing was ever said about whether they DID CHANGE the service provider), but in how it was used. The argument goes that since people would change providers after one bad experience (which was not defined either), companies should focus all their efforts on providing better experiences. Beyond the question of whether this is true or not, the data was used to showcase a doomsday scenario to propel people to act on something that may or may not be a problem.
The same argument goes against my favorite evil-fuzzy metric: NPS.
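The effect of who gets asked can be sketched in a few lines. This is an illustration under invented numbers, not a reconstruction of the Forrester survey: the population, the channel labels, and the helper names are all hypothetical.

```python
# Hypothetical population of customers, each described as
# (channel where the survey can reach them, channel they prefer to complain in).
# Every count here is invented for illustration.
population = (
    [("private", "private")] * 40
    + [("private", "public")] * 10
    + [("public", "private")] * 15
    + [("public", "public")] * 35
)


def share_preferring_private(people):
    """Fraction of people who prefer to complain in private channels."""
    return sum(1 for _, prefers in people if prefers == "private") / len(people)


def sample_from(people, recruited_via):
    """Keep only the people the survey recruited through one kind of channel."""
    return [person for person in people if person[0] == recruited_via]


everyone = share_preferring_private(population)
private_sample = share_preferring_private(sample_from(population, "private"))
public_sample = share_preferring_private(sample_from(population, "public"))

print(f"Whole population:         {everyone:.0%}")
print(f"Recruited in private:     {private_sample:.0%}")
print(f"Recruited in public:      {public_sample:.0%}")
```

With these made-up counts, 55% of the whole population prefers private channels, but a survey that recruits only through private channels would report 80%, and one that recruits only through public channels would report 30%. Same question, same population, different respondents, different "fact".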
I could continue giving you examples for a long time, but you get the idea.
Data builds credibility. It also takes it away if not used wisely. Go forth and use data; just be careful about how you use it.
Other Posts by Esteban Kolsky