Having looked at timeliness in Part 1, let’s turn our attention to consistency.
Early proponents of data warehousing, including myself, majored on the role of the Enterprise Data Warehouse (EDW) as a repository of a consistent, integrated and historical view of the business. Leaving aside the historical aspect for now, the desire for consistency and integration can be traced directly to one of the main concerns of decision makers in the 1980s. There existed a growing proliferation of applications–operational systems–that were responsible for running the business. These systems were being introduced in an ad hoc manner throughout the business, often on different platforms and addressing different but overlapping aspects of the same process.
In a bank, for example, a mainframe-based application running against an IMS database handled checking accounts. A new relational database system running on a minicomputer was introduced to handle savings accounts. The difficulty for decision makers was to understand the combined account position for individual customers. The need, stated in a nutshell, was for a “single version of the truth”.
This divergence of sources, combined with the often poor data quality in individual operational sources, as well as the need for a single truth, led EDW designers and developers to focus almost maniacally on how to achieve consistency and integration of information in the warehouse. Enterprise modeling, ETL tools and intricate, often lengthy projects were all used in service of this goal.
Today, we need to pose two important questions. First, is there really a single version of the truth that can be created and stored in the EDW? Second, do we have the time and the money to create it?
On the first question, I feel that we have become blinded by our unswerving belief in a universal truth. Yes, there do exist “truths” in the business that need to be universally agreed. The quarterly figures announced to the stock markets absolutely need to be internally consistent and well-integrated. The underlying numbers that lead to these results are similarly constrained. But it is equally clear that some numbers can exist as best estimates, close approximations or even “swag” (some wild-assed guess!). As a culture, we have become obsessed with the second or third decimal place on many numbers. How many times have you heard election polls being reported with candidates separated by half a percentage point, while the 2% margin of error on the poll is hidden in a footnote?
Answering the first question as we just have leads easily to an answer to the second. We need to divert resources from seeking complete consistency to achieving consistency where it matters and timeliness where that is important. Beyond that, we need to get the best return on investment in both areas–timeliness and consistency. The real business value in some data lies in its early availability to decision makers; the value in other data resides in its consistency and integrity.
Distinguishing between the two is the key to success.
Join me on my upcoming webinar, “Business Intelligence: the Quicker, the Better”, on October 25th for further insights into this important issue.
And for my European readers, allow me to remind you that Larissa Moss is presenting a two-day seminar in Rome on October 20-21, entitled “Agile Approach to Data Warehousing & Business Intelligence”, which will also show how to address this dilemma.