In an earlier post, Slowly Changing Dimensions – Special Attention Needed, I touched on why slowly changing dimensions need special attention. Organizations typically implement solutions for slowly changing dimensions in one of three ways.
Type 1: in these implementations, only the latest data is retained. This is used when there is no need for historical analysis. For example, an online transactional system that needs to display the latest list of values in its pull-down pick lists may use this type.
Type 2: in these implementations, the history, or the validity period, of each change is persisted. Whenever a value changes, the old value and the period it was valid for are kept rather than deleted. The latest record may be flagged as the only active instance, or its validity end date may be set far into the future. This is the best way to address slowly changing dimensions, but it comes with the overhead of tagging every dataset with its timeline.
Type 3: in these implementations, at most two versions of the data are stored. Data sets are marked current/previous or active/inactive.
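To make the difference between the three types concrete, here is a minimal Python sketch of how an update might be applied under each one. The function names, the dictionary layout and the far-future `OPEN_END` date are assumptions chosen for illustration; they are not taken from any particular product.

```python
from datetime import date

# Far-future sentinel for "still valid" (an assumption; some designs use
# NULL or an is-active flag instead).
OPEN_END = date(9999, 12, 31)


def apply_type1(row, new_value):
    """Type 1: overwrite the value; the old value is lost."""
    return {**row, "value": new_value}


def apply_type2(history, key, new_value, change_date):
    """Type 2: close the active record for `key` and append a new one."""
    updated = []
    for rec in history:
        if rec["key"] == key and rec["valid_to"] == OPEN_END:
            # End the old record's validity on the change date.
            rec = {**rec, "valid_to": change_date}
        updated.append(rec)
    updated.append({"key": key, "value": new_value,
                    "valid_from": change_date, "valid_to": OPEN_END})
    return updated


def apply_type3(row, new_value):
    """Type 3: keep only the current and the immediately previous value."""
    return {**row, "previous": row["current"], "current": new_value}
```

Under Type 2 a customer whose region changes from East to West ends up with two records, one closed and one open-ended. Under Type 1 only West survives, and under Type 3 a second change would push East out of the record altogether.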
One of the main new features in Teradata version 13.10, support for temporal data, addresses slowly changing dimensions. Teradata v13 introduced the new PERIOD data type, which can be used to implement solutions for slowly changing dimensions. Rob detailed the technical aspects of this in his post, Exploring Teradata 13’s PERIOD Data Type. Ramakrishna explained ETL best practices for loading slowly changing dimensions in his post, Recognizing Change. Teradata refers to enabling the storage of validity periods against data as “making it temporal.”
However, migrating from Teradata v12 to v13 does not automatically make your Enterprise Data Warehouse (EDW) temporal. The data needs to be inserted or migrated as temporal data. The real challenge is dealing with, and migrating, non-temporal data and source systems when upgrading to a temporal EDW built on Teradata v13.
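As an illustration of what migrating non-temporal data as temporal data can involve, the sketch below collapses dated snapshots of a dimension into validity periods. It is a simplified Python example, not Teradata syntax. The `(key, value, snapshot_date)` layout and the `snapshots_to_periods` name are hypothetical, and it assumes the first snapshot in which a value appears is an acceptable stand-in for the date the value took effect.

```python
from datetime import date
from itertools import groupby

# Far-future sentinel for an open-ended validity period (an assumption).
OPEN_END = date(9999, 12, 31)


def snapshots_to_periods(snapshots):
    """Collapse (key, value, snapshot_date) rows into validity periods.

    Consecutive snapshots with an unchanged value are merged into one
    period. Each period ends where the next one starts, and the last
    period per key is left open-ended.
    """
    periods = []
    ordered = sorted(snapshots, key=lambda s: (s[0], s[2]))
    for key, rows in groupby(ordered, key=lambda s: s[0]):
        current = None
        for _, value, snap_date in rows:
            if current is not None and current["value"] == value:
                continue  # value unchanged, the period keeps running
            if current is not None:
                current["valid_to"] = snap_date  # close the previous period
            current = {"key": key, "value": value,
                       "valid_from": snap_date, "valid_to": OPEN_END}
            periods.append(current)
    return periods
```

Three monthly snapshots of one product with the values A, A, B would yield two periods: A from the first snapshot date until the third, and B from the third snapshot date onward. Where the source system kept no history at all, the only honest option is usually a single open-ended period starting at an agreed migration date.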
Tips and Suggestions
Here are a few tips and suggestions for the journey from a non-temporal to a temporal EDW.
a) Identify and categorize the slowly changing dimensions by the three types of implementations described above.
b) Ask your business about validity period – it is important to turn business subject matter experts into temporal thinkers. While it may be an irritating question, it is always appropriate to ask for the validity period of each piece of reference and master data that users enter into the systems.
c) Address the data collection systems – temporally aware organizations will start by upgrading the data collection source systems that are unaware of validity periods.
d) Temporal alone cannot address all the design issues – even with a temporal EDW, there are situations in which the structure or the nature of a dimension changes over time. For example, a product promotion may have been based last year on selling a set of new brands, but this year it may be based on how many of those are being returned. The promotion’s key and the attributes themselves change over time. If a historical analysis is needed to compare sales performance for the promotion over time, one needs to tag the underlying changes to the promotion definition over time. One good way to address this could be to implement a surrogate key alongside the natural key for a table that changes over the course of time.
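The surrogate key idea in tip d) can be sketched as follows. This is one possible reading, written in Python for illustration: every new definition of a promotion gets its own surrogate key, while the natural key ties the versions together. The class names, field names and the sample promotion code are hypothetical. If the natural key itself changes, an additional durable identifier would be needed to link the versions.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count


@dataclass(frozen=True)
class PromotionVersion:
    surrogate_key: int   # generated, unique per version of the definition
    natural_key: str     # business identifier shared by all versions
    basis: str           # what the promotion is measured on
    valid_from: date


class PromotionDimension:
    """Keeps every version of a promotion under its own surrogate key."""

    def __init__(self):
        self._next_key = count(1)
        self.versions = []

    def add_version(self, natural_key, basis, valid_from):
        version = PromotionVersion(next(self._next_key), natural_key,
                                   basis, valid_from)
        self.versions.append(version)
        return version

    def versions_of(self, natural_key):
        """All definitions the promotion has had, in insertion order."""
        return [v for v in self.versions if v.natural_key == natural_key]
```

Sales facts would then reference the surrogate key, so a report can compare last year's "units sold" definition with this year's "units returned" definition of the same promotion without confusing the two.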