Bad data is costly. With data driving so many decisions in our lives, the cost of bad data affects us all, whether or not we realize it. IBM estimates that bad data costs the U.S. economy around $3.1 trillion each year. Most people who work with data know that bad data can be extremely costly, but this number is still stunning. The data that most businesses analyze is about their customers, and a business that relies on bad data about them is unlikely to succeed.
Additional research from Experian Data found that bad data has a direct impact on the bottom line of 88% of American companies, with the average company losing around 12% of its total revenue. These numbers paint a very real picture of the negative impact of bad data on our economy.
Beyond the financial cost, bad data also contributes to the spread of misinformation. There are many examples of bad data mistakes throughout history that have helped shape the world we live in today.
By taking the time to understand the lessons we can learn from bad data mistakes made throughout history, organizations can take the necessary steps to protect against them. With the proper data protection, unified security and analytics measures in place, organizations can safeguard their data and ensure accuracy and integrity.
A group of data analysts from Utopia, Inc have curated a comprehensive list of examples in an infographic that shows how bad data mistakes have led to disastrous decisions that changed the course of history and the society we live in today. Let’s explore some of the more interesting examples from their list.
The 2016 United States Presidential Election
The 2016 U.S. Presidential Election was mired in bad data. From the myriad polls and poll aggregators to the exalted political oracles at FiveThirtyEight and the New York Times, most pollsters and predictors got this election completely wrong and predicted a landslide Hillary Clinton victory. That forecast obviously did not materialize. Many Democrats argued that this error caused a historic number of voters to stay home on Election Day.
This spread of bad data could have been prevented by using advanced statistics to analyze previous elections, and by using machine learning to build “kitchen-sink” models based on voter rolls. This may sound complicated, but it is an established way to improve the underlying assumptions of the polls. These methods, however, are costly and time-intensive for most pollsters, who instead rely on online surveys and publicly available Census data.
The Enron Scandal of 2001
Enron was once one of the largest and most powerful companies in the world. By the early 2000s it was known for jaw-dropping executive compensation and a soaring stock price. However, the company’s downfall can be directly attributed to a host of fraudulent financial data.
From internal whistleblowers to the shredding of documents by Enron’s external auditors, there is little question that the data being provided to shareholders was largely fictionalized. The data that Enron’s executives and their auditing firm delivered to stockholders and the Board of Directors in annual reports and financial statements proved to be false.
An ethical external auditing firm at Enron could have prevented this financial fraud from occurring. The Sarbanes-Oxley Act of 2002 was enacted following the Enron scandal to address auditor independence, corporate responsibility, financial disclosures, conflicts of interest and overall public company oversight. Had this act been in place earlier, it could have prevented the Enron disaster from occurring.
Tetraethyllead in Gasoline in the 1920s
Added to gasoline in the 1920s to control knocking in engines, tetraethyllead contributed to over 5,000 fatalities in the United States alone. This was made possible, in part, by intentionally inconclusive tests led by the leaded gas industry and the willful deceit of the American government.
For decades, the lead paint and the leaded gas industries blamed each other for lead poisoning, both suggesting their products were safe for humans. Industry scientists even suggested the human body naturally harbors lead, so high levels shouldn’t be a health concern.
After the initial discovery of the potential threat from leaded gasoline, an independent study of its harmful effects should have been conducted. The U.S. Government and the gas industry both turned a blind eye and instead relied on bad data that cost many people their lives.
Christopher Columbus and the Discovery of the Americas
Even the discovery of the Americas was a result of bad data. Christopher Columbus made a few significant miscalculations when charting the distance between Europe and Asia. First, he favored values given by the Persian geographer Alfraganus over the more accurate calculations of the Greek geographer Eratosthenes. Second, Columbus assumed Alfraganus was referring to Roman miles in his calculations when, in reality, he was referring to the longer Arabic miles.
Columbus himself is to blame for the bad data. He could have stuck with one geographer’s calculations and verified that the units of measurement he was using were actually correct.
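To see how much a unit mix-up like this matters, here is a rough sketch of the arithmetic in Python. The figures are assumptions for illustration, not values from the infographic: Alfraganus's estimate is commonly cited as 56⅔ miles per degree, and the Roman and Arabic miles are often approximated as about 1,480 m and 1,830 m, though historians debate the exact lengths.

```python
# Approximate unit lengths in metres. These are commonly cited
# approximations (assumptions here); the historical values are uncertain.
ROMAN_MILE_M = 1480.0
ARABIC_MILE_M = 1830.0

# Alfraganus's estimate of one degree of latitude, as commonly cited.
MILES_PER_DEGREE = 56 + 2 / 3


def circumference_km(miles_per_degree, metres_per_mile):
    """Earth's circumference implied by a miles-per-degree estimate."""
    return miles_per_degree * 360 * metres_per_mile / 1000


# Same number, two different readings of the word "mile".
as_read_by_columbus = circumference_km(MILES_PER_DEGREE, ROMAN_MILE_M)
as_intended = circumference_km(MILES_PER_DEGREE, ARABIC_MILE_M)

# Fraction by which the Roman-mile reading shrinks the globe.
shortfall = 1 - as_read_by_columbus / as_intended

print(f"Read as Roman miles:  {as_read_by_columbus:,.0f} km")
print(f"Read as Arabic miles: {as_intended:,.0f} km")
print(f"Underestimate: {shortfall:.0%}")
```

Under these assumed unit lengths, the same number read in the wrong unit shrinks the globe by roughly a fifth, before any other error is added. The point is less the exact figures than the habit: store the unit alongside the value, and convert explicitly.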
The lessons we can learn from bad data mistakes in the past
There are countless examples of bad data mistakes throughout history. Better data leads to better and more accurate decisions, while relying on bad data has negative effects on businesses and on our society as a whole. Can you think of examples where bad data has affected your business or your personal life?