If you want to make an apple pie from scratch, you must first create the universe.—Carl Sagan
Making an enterprise data warehouse from scratch may not necessitate recreating the universe, but it’s also not easy as pie. It’s a major undertaking, which must be handled incrementally if a workable result is to be achieved. Along the way there are a multitude of design decisions to be made, each of which will have ramifications for downstream work such as ETL processing, business intelligence development and enterprise services.
I have spent some years working with the set of Industry Models from IBM, and while I have avoided speaking directly about them on this blog, so as not to reveal privileged information, I would like to take this opportunity to point out some of the real benefits of incorporating them into an EDW initiative. The models are not without their frustrations and limitations. But having experienced the alternative – of having to develop models from scratch – I can attest to their value.
Here are some of the major benefits of having a template data model:
- Standards
  - Data Architecture – The industry models provide a cohesive infrastructure, with repeating patterns, following a strict, disciplined approach. The design is based on a classification model that presents business information according to “what it is” rather than “how it’s used”. This allows information to be stored in a way that serves many perspectives, rather than a single line of business.
  - Lower Risk of Redundancy – Because of the models' structure and their evolution over numerous implementations, redundant objects and conflicting “versions” of information are avoided. Every piece of incoming information has a single placeholder within the data model. The EDW System of Record becomes the “Single Source of Truth”.
  - Definitions and Naming Standards – The models come with a full set of logical names and physical abbreviations, reducing the risk of misidentifying terms and facilitating the adoption of a single set of enterprise names.
- Enterprise Perspective
  - Core Concepts – The template model incorporates all aspects of each industry through its structured hierarchies of core concepts.
  - Iterative – The models facilitate incremental adoption while minimizing the risk of being “painted into a corner” by design decisions that later prove limiting.
  - Multi-Industry Integration – On some occasions an enterprise may be involved in multiple lines of business, such as retail and banking, or retail and insurance. Because of the common design principles on which the models are based, they are well-positioned for such integration.
- Flexibility
  - History – The models accommodate history at “attribute-level”, minimizing redundancy and providing an efficient structure for historical analysis.
  - Growth – The models accommodate growth with minimal structural changes, particularly with regard to metrics and relationships between business entities.
  - Perspectives – Highly normalized structures are well-positioned for creating different perspectives of dimensions and metrics for reporting.
- Process Acceleration
  - Data Design Patterns – While the models are not intended to be used “out-of-the-box” without customization, the template offers a complete picture of the enterprise and a significant head start on development.
  - ETL Patterns – Repeating patterns in Data Architecture translate to repeatable ETL patterns, with a modular approach to each set of recurring data structures (e.g., writing history of a classification value, handling single values in an attributive table, approach to normalized hierarchies).
  - Testing Patterns – Acceleration of effort extends to unit and system integration testing. With repeating data structures comes the ability to craft SQL code that follows repeatable patterns, reducing the effort to prepare test scripts.
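To make the ETL and testing points a little more concrete, here is a minimal Python sketch of what a repeatable pattern might look like. It is not taken from the IBM models: the table names, the four-column history layout, the `9999-12-31` open-ended date and the function names are all hypothetical, chosen only to show how one routine and one test query could be reused across every table that shares the same shape.

```python
import sqlite3

# Hypothetical history tables that all share one shape:
# (entity_id, value, start_dt, end_dt). Names are illustrative only.
HISTORY_TABLES = ["customer_status_hist", "account_type_hist"]
OPEN_END = "9999-12-31"  # assumed convention marking the current row


def create_history_table(conn, table):
    # Table names come from the fixed list above, never from user input.
    conn.execute(
        f"CREATE TABLE {table} "
        "(entity_id INTEGER, value TEXT, start_dt TEXT, end_dt TEXT)"
    )


def write_history(conn, table, entity_id, value, as_of):
    """Record a classification value, closing the prior row if it changed.

    Returns True when a new row was written, False when nothing changed.
    """
    current = conn.execute(
        f"SELECT value FROM {table} WHERE entity_id = ? AND end_dt = ?",
        (entity_id, OPEN_END),
    ).fetchone()
    if current is not None and current[0] == value:
        return False  # same value as the current row: nothing to write
    if current is not None:
        # Close the existing current row at the change date.
        conn.execute(
            f"UPDATE {table} SET end_dt = ? WHERE entity_id = ? AND end_dt = ?",
            (as_of, entity_id, OPEN_END),
        )
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        (entity_id, value, as_of, OPEN_END),
    )
    return True


def open_row_violations(conn, table):
    """Reusable test query: entities with more than one current row."""
    return conn.execute(
        f"SELECT entity_id, COUNT(*) FROM {table} "
        "WHERE end_dt = ? GROUP BY entity_id HAVING COUNT(*) > 1",
        (OPEN_END,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for table in HISTORY_TABLES:
        create_history_table(conn, table)

    write_history(conn, "customer_status_hist", 1, "ACTIVE", "2020-01-01")
    write_history(conn, "customer_status_hist", 1, "CLOSED", "2021-03-15")
    write_history(conn, "account_type_hist", 7, "SAVINGS", "2019-05-01")

    # The same check runs unchanged against every table of this shape.
    for table in HISTORY_TABLES:
        print(table, open_row_violations(conn, table))
```

The point of the sketch is only that neither `write_history` nor `open_row_violations` knows anything about customers or accounts. When the data structures repeat, the load logic and the test scripts can be written once and applied by table name.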
In the absence of a set of Industry Models, from IBM or another vendor, all of these points must be addressed afresh, or left unconsidered, with the risk that mistakes will be made. The structures will follow inconsistent patterns, the names will conform to no specific set of conventions, and the effort to create a flexible repository to service the whole business will end up as a silo of information used by a small corner of the organization. Until the next time the executive decides to try again to create an enterprise asset.
If you want to make an apple pie from scratch, follow the recipe.
If you have experiences with or without industry models, please feel free to share them…