I have questions about data.
Most of us who have more than a cursory knowledge of the English language have heard of the phrase ‘too much information’. We know what it means, even if we don’t always know when to apply it.
For those who don’t know, or are unsure, the Urban Dictionary describes ‘too much information’ as “An expression of exasperation and disgust when a person is divulging personal details of his sex life, toilet habits, or anything the listener finds disgusting, uninteresting, and unwelcome.”[1]
In sum: just because we know something doesn’t mean we should share it, or even try to remember it, never mind go about analysing the hell out of it.
This is where Big Data comes in.
I have written at length on the subject of Big Data, but with the plethora of hype, dubious applications and instrumental thinking surrounding the subject, I wonder if our incessant hunger for more and more data is actually counterproductive.
We think that having data is good, but in business terms, is having even more peripheral data actually going to improve anything worth improving?
Do we really want to know all that there is to know, at all times, about all that is going on around us?
What are the possible ramifications of having all of history available to us, right down to the tiniest and most trivial piece of personal data?
Put it this way. There has never been so much data about the trifling aspects of what we do, or about what is speculated and opined, as there is now. In fact, there has never been so much data of any kind. The volumes are growing ever more rapidly and there is no visible ceiling to that growth. Some see this as normal; others see it as revolutionary, in both a bad and a good sense.
Amongst all this digital upheaval, one question bothers me. Do we really want a disinhibited rush towards an all-knowing, data-driven future, a future built on the accumulation, integration and analysis of as much data as we can get our hands on?
If knowing is good, and having more data is intrinsically about knowing more, then we can’t have too much of a good thing, can we?
Let’s look at some concrete examples.
Philip Kegelmeyer, a senior scientist at Sandia National Labs, states that one way big data can trip you up is through the magnification of errors in logical reasoning. A case in point is the base rate fallacy, in which the underlying rarity of an event is ignored when interpreting evidence that appears to point to it. The risk grows when unrelated sets of incongruous data are brought together into one coalesced whole in order to draw conclusions. Some people call this Data Science, but it looks uncannily similar to Data Voodoo.
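To make the base rate fallacy concrete, here is a minimal sketch in Python. The numbers are entirely hypothetical and are not taken from Kegelmeyer or any real system: assume a model that flags records "of interest", where only 1 record in 1,000 really is of interest, the model catches 99% of those, and it wrongly flags 5% of everything else. The function name `posterior_given_flag` is my own, chosen for illustration.

```python
def posterior_given_flag(base_rate, true_positive_rate, false_positive_rate):
    """Bayes' rule: probability a record is truly of interest, given it was flagged."""
    flagged_and_interesting = base_rate * true_positive_rate
    flagged_and_boring = (1.0 - base_rate) * false_positive_rate
    return flagged_and_interesting / (flagged_and_interesting + flagged_and_boring)


# Hypothetical figures, assumed purely for illustration.
base_rate = 0.001            # 1 in 1,000 records is genuinely of interest
true_positive_rate = 0.99    # the model flags 99% of the interesting records
false_positive_rate = 0.05   # the model also flags 5% of the uninteresting ones

posterior = posterior_given_flag(base_rate, true_positive_rate, false_positive_rate)
print(f"P(of interest | flagged) = {posterior:.3f}")  # prints 0.019
```

Under these assumed figures, fewer than 2 in 100 flagged records are actually of interest, even though the model sounds "99% accurate". Collecting more data does not fix this on its own: the more records you scan for a rare event, the more false alarms pile up around the few genuine hits.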
Some time back I read an interesting paper entitled On the Pursuit and Misuse of Useless Information, written by Anthony Bastardi and Eldar Shafir. My take-away from that research is that decision makers will search out data that seems relevant even when it is not, and they will then use that newly acquired data, now imbued with magical powers of prediction, regardless of its relevance, simply because they now have it to hand.
There are other aspects to the growing plethora of data. Claire Porter writing in The Guardian had this to say: “In the era of big data, the battle for privacy has already been fought and lost – personal data is routinely collected and traded in the new economy and there are few effective controls over how it is used or secured. Data researchers and analysts now say that it’s time for legislation to reclaim some of that privacy and ensure that any data that is collected remains secure.”
The American Civil Liberties Union have singled out the personal-data collection psychosis of government agencies and private businesses as being perhaps the greatest assault on the privacy of ordinary citizens, and they claim that the USA is “undergoing a rapid expansion of data collection, storage, tracking, and mining.” They argue that additional invasions of privacy, more powerful combinations of new technologies, expanded government powers and expanded private-sector data collection efforts are “creating a new ‘surveillance society’ that is unlike anything seen before.”
Jeff Stibel, writing in the Harvard Business Review, put it this way: “Wisdom can be shattered by too much information. Great scholars, for instance, tend to be great in very narrow disciplines. These scholars give ground on colloquial information so that they can digest more within their field. In many ways, we are all idiot savants: our expertise in certain areas necessitates weakness elsewhere.”
My position is that too much data and too much information are bad for business, bad for society and, ultimately, bad for us. Too much information is bad for relations with our friends, family and colleagues, and even for community cohesion. The bottom line is that data disinhibition is potentially as damaging as any other extreme form of disinhibition, and is not the mark of a civilised and decent society.
I will once again leave you with this thought. Every time we double the volume of Big Data we halve our capacity for wisdom.
Many thanks for reading.
A couple of closing points of a more general nature.
Firstly, please consider joining The Big Data Contrarians, the nicest and most professional data community here on LinkedIn: https://www.linkedin.com/groups/8338976
Secondly, keep in touch. My strategy blog is at http://www.goodstrat.com and I can be followed on Twitter at @GoodStratTweet. Please also connect on LinkedIn if you wish. If you have any follow-up questions, leave a comment or send me an email at martyn.jones@cambriano.es
Thirdly, you may be interested in other articles I have written, such as:
Free Business Analytics Content – Thanks to Wikipedia – Part 1: http://goodstrat.com/2016/03/05/free-business-analytics-content-thanks-to-wikipedia-part-1/
Free Business Analytics Content – Thanks to Wikipedia – Part 2: http://goodstrat.com/2016/03/07/free-business-analytics-content-thanks-to-wikipedia-part-2/
Free Business Analytics Content – Thanks to Wikipedia – Part 3: http://goodstrat.com/2016/03/08/free-business-analytics-content-thanks-to-wikipedia-part-3/
Free Business Analytics Content – Thanks to Wikipedia – Part 4: http://goodstrat.com/2016/03/09/free-business-analytics-content-thanks-to-wikipedia-part-4/
Data Warehousing explained to Big Data friends: http://goodstrat.com/2015/07/20/data-warehousing-explained-to-big-data-friends/
Stuff a great data architect should know: http://goodstrat.com/2015/08/16/stuff-a-great-data-architect-should-know-how-to-be-a-professional-expert/
Big Data is not Data Warehousing: http://goodstrat.com/2015/03/06/consider-this-big-data-is-not-data-warehousing/
What can data warehousing do for us now: http://www.computerworld.com/article/3006473/big-data/what-can-data-warehousing-do-for-us-now.html
Looking for your most valuable data? Follow the money: http://www.computerworld.com/article/2982352/big-data/looking-for-your-most-valuable-data-follow-the-money.html