This landscape has changed dramatically in recent years. Not so long ago, IT investment was focused on building data warehouses to store large amounts of static data in a structure designed to facilitate reporting. The reports were usually designed and delivered by BI workers inside the IT department. But as organizations began to recognize the value of their warehoused data, demand grew for broader and more flexible access among business users.
Delivering broader and better access proved to be quite a challenge, however, because data warehousing was fundamentally designed to improve query response for IT-designed information delivery products, not to provide ordinary business people with analytic capabilities. Between 2003 and 2008, major data warehousing vendors IBM, Oracle, and SAP acquired three of the leading BI products, and many organizations adopted a “single stack” IT strategy in hopes of improving BI integration. But it seems fair to say that serious growing pains accompanied attempts to graft self-service analytics onto the data warehouse framework.
All the while, business demands continued to escalate and accelerate. An agenda that began with the need for power users to do their own slicing and dicing soon expanded to include sophisticated “must-haves” such as near-real-time dashboards, predictive analytics, desktop data mining, and much more. Conventional data warehouse/BI strategies, typified by costly development, slow implementation, and steep learning curves, often did not match up well with these emerging business drivers.
Then a new generation of “data discovery” tools began to alter the landscape. These analytics platforms used in-memory processing for speed and had built-in visualization capabilities that gave them a strong edge in meeting certain types of business needs.
Where Are We Now?
The 2010 Gartner Magic Quadrant for Business Intelligence Platforms reported: “Organizations are rapidly embracing the idea of providing data to end users and empowering them with an ability to navigate and visualize the data in a ‘surf and save’ mode as an alternative to a report-only structure.” So while reporting (which makes great use of the traditional data warehouse) remains the dominant form of information delivery in many organizations, the role of interactive analytics is growing rapidly—and challenging conventional data warehousing structures.
In fact, some view in-memory platforms as an alternative to building a data warehouse. But as analyst Cindi Howson points out, “this option usually applies to smaller organizations that may only have a single source system. For larger companies that have multiple source systems, the data warehouse continues to be the ideal place to transform, model and cleanse the data for analysis.” At the same time, however, the growing need to incorporate unstructured data into analytics creates additional issues in relation to data warehousing.
Where Are We Headed?
In the near future, it’s likely that innovative approaches to working with data will continue to mature and gain acceptance. Growing influences include:
- Data mashups (ad hoc integration of data from different sources)
- SaaS (analytics “in the cloud”)
- Social BI (sharing and combining data from multiple content sources)
These developments raise obvious problems for traditional views of the data warehouse. For one thing, the importance of the data warehouse as a “single source of truth” falls by the wayside if everyone can get their own flavor of data from anywhere they want. And the role of the data warehouse in assuring data quality also comes into question.
But there’s little chance of putting the genie back in the bottle. As lead Forrester advanced analytics analyst James Kobielus recently blogged, in An Enterprise Data Warehouse without a Database—Is That Even Conceivable?: “The trend is toward a virtualized enterprise content cloud geared both to the traditional EDW roles supporting BI and operational reporting, and to the new world of advanced analytics for social media analytics, sentiment analysis, and many other compute-intensive functions involving complex content and dynamically shifting mixed workloads.”
James Kobielus explores this growing trend in greater detail in a webcast titled BI in the Cloud: A self-service alternative to “Big BI”.