In my white paper, Analytics Best Practices: The Analytical Hub, I present five design principles. The first two are below; I’ll cover principles 3-5 in a subsequent post:
1. Data from everywhere needs to be accessible and integrated in a timely fashion
Expanding beyond traditional internal BI sources is necessary as data scientists examine the behavior of a company’s customers and prospects, exchange data with partners, suppliers and governments, gather machine data, acquire attitudinal survey data, and examine econometric data. Unlike internal systems, where IT can manage data quality, many of these new sources are incomplete and inconsistent, forcing data scientists to rely on the analytical hub to clean the data or synthesize it for analysis.
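To make that cleanup burden concrete, here is a minimal sketch in pandas. The survey extract, column names and recoding rules are hypothetical placeholders, not anything from the white paper; the point is simply the kind of normalization and gap-filling the hub has to absorb before external data is usable.

```python
import pandas as pd

# Hypothetical attitudinal survey extract landing in the analytical hub;
# external sources like this often arrive with mixed types and inconsistent codes.
survey = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "satisfaction": ["5", "four", None, "3"],
    "region": ["NE", "ne", "Northeast", "SW"],
})

# Normalize inconsistent category labels before analysis.
region_map = {"ne": "NE", "northeast": "NE", "sw": "SW"}
survey["region"] = survey["region"].str.lower().map(region_map).fillna(survey["region"])

# Coerce mixed-type scores to numbers and synthesize missing values
# (here, a simple median fill) so the column is usable downstream.
survey["satisfaction"] = pd.to_numeric(survey["satisfaction"], errors="coerce")
survey["satisfaction"] = survey["satisfaction"].fillna(survey["satisfaction"].median())

print(survey)
```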
Advanced analytics has been inhibited by the difficulty of accessing data and by the length of time traditional IT approaches take to physically integrate it. The analytical hub needs to enable data scientists to get the data they need in a timely fashion, either by physically integrating it or by accessing virtually integrated data. Data virtualization speeds time-to-analysis and avoids the productivity drain and error-prone work of physically integrating data.
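The mechanics vary by product, but the idea behind virtual integration can be sketched in a few lines. In this illustrative Python example the source names (a warehouse table and a clickstream CSV drop) are assumptions; the integrated view is assembled at query time rather than copied into a new store first.

```python
import sqlite3
import pandas as pd

def virtual_customer_view(warehouse_path: str, clickstream_csv: str) -> pd.DataFrame:
    # Internal warehouse table, read directly from the source system.
    with sqlite3.connect(warehouse_path) as conn:
        customers = pd.read_sql("SELECT customer_id, segment FROM customers", conn)

    # External behavioral data, read from wherever it lands (here, a CSV drop).
    clicks = pd.read_csv(clickstream_csv)

    # The "integrated" view exists only at query time; nothing is physically
    # consolidated, so the data scientist gets to analysis sooner.
    return customers.merge(clicks, on="customer_id", how="left")
```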
2. Building solutions must be fast, iterative and repeatable
Today’s competitive business environment and fluctuating economy are putting pressure on businesses to make fast, smart decisions. Predictive modeling and advanced analytics help inform those decisions. Data scientists need to get data and create tentative models quickly, change variables and data to refine the models, and do it all over again as behavior, attitudes, products, competition and the economy change. The analytical hub needs to be architected so that solutions can be built quickly, refined iteratively and repeated reliably.
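To make the fast-iterative-repeatable loop concrete, here is a minimal scikit-learn sketch. The synthetic data and the choice of regularization strength as the variable being tuned are placeholders of my own, not anything prescribed by the white paper; the pattern is simply fit a tentative model, change something, and score it again.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data for whatever the analytical hub serves up on a given day.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Iterate quickly: fit a tentative model, vary a parameter, re-evaluate.
# As behavior or the economy shifts, rerun the same loop on fresh data.
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C, max_iter=1000)
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"C={C}: mean CV accuracy {score:.3f}")
```

Because the loop is scripted end to end, it is repeatable by design: the same code can be rerun whenever the underlying data changes.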