As I discussed in the state of data and analytics in the cloud recently, usability is a top evaluation criterion for organizations selecting cloud-based analytics software. Access to data in both cloud and on-premises systems is an essential antecedent of usability; it can help business people perform analytic tasks themselves without having to rely on IT. Some tools allow business users to integrate data on an ad hoc basis, but providing an enterprise integration process and a governed information platform often requires IT involvement. Once that is done, though, using cloud-based data for analytics can empower business users and improve communication and processes.
To make the best decisions, organizations need access to multiple integrated data sources. The research finds that the most common data sources are predictable: business applications (51%), business intelligence applications (51%), data warehouses or operational data stores (50%), relational databases (41%) and flat files (33%). Increasingly, though, organizations are also including less structured sources such as semistructured documents (33%), social media (27%) and nonrelational database systems (19%). In addition, there are important external data sources, including business applications (61%), social media data (48%), Internet information (42%), government sources (33%) and market data (29%). Whether stored in the cloud or locally, data must be normalized and combined into a single data set so that analytics can be performed.
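To make the last point concrete, here is a minimal Python sketch of normalizing records from two differently shaped sources and combining them into one data set. The sample data, the field names and the helper functions (`from_csv`, `from_json`, `combine`, `revenue_by_region`) are hypothetical illustrations, not taken from the research or from any particular product.

```python
import csv
import io
import json

# Hypothetical export from an on-premises system (flat file, CSV).
ON_PREM_CSV = """customer_id,Region,revenue_usd
101,East,1200.50
102,West,830.00
"""

# Hypothetical payload from a cloud application (JSON, different field names).
CLOUD_JSON = '[{"custId": "103", "region": "east", "revenue": {"amount": 415.25}}]'


def from_csv(text):
    """Map CSV rows onto the shared schema."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "customer_id": int(row["customer_id"]),
            "region": row["Region"].strip().lower(),
            "revenue": float(row["revenue_usd"]),
            "source": "on_premises",
        }


def from_json(text):
    """Map JSON records onto the same shared schema."""
    for rec in json.loads(text):
        yield {
            "customer_id": int(rec["custId"]),
            "region": rec["region"].strip().lower(),
            "revenue": float(rec["revenue"]["amount"]),
            "source": "cloud",
        }


def combine(*sources):
    """Merge normalized rows from all sources into one sorted data set."""
    rows = [row for source in sources for row in source]
    return sorted(rows, key=lambda r: r["customer_id"])


def revenue_by_region(rows):
    """A simple analytic over the combined data set."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["revenue"]
    return totals


combined = combine(from_csv(ON_PREM_CSV), from_json(CLOUD_JSON))
print(revenue_by_region(combined))
```

Once every source is mapped to the same field names, types and value conventions (here, lowercase region names), the analytic step no longer needs to know where a record came from.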
Given the distributed nature of data sources as well as the diversity of data types, information platforms and integration approaches are changing. While more than three in five companies (61%) still do integration primarily between on-premises systems, significant percentages are now doing integration from the cloud to on-premises (47%) and from on-premises to the cloud (39%). In the future, this trend will become more pronounced. According to our research, 85 percent of companies eventually will integrate cloud data with on-premises sources, and 84 percent will do the reverse. We expect that hybrid architectures, a mix of on-premises and cloud data infrastructures, will prevail in enterprise information architectures for years to come while slowly evolving to equality of bidirectional data transfer between the two types.
Further analysis shows that a focus on integrating data for cloud analytics can give organizations a competitive advantage. Participants who said it is very important to integrate data for cloud-based analytics (42% of all participants) said they are very confident in their ability to use the cloud for analytics (35%) more than three times as often as those who said integrating data is important (10%) or somewhat important (9%). Those saying that integration is very important also said more often that cloud-based analytics helps their customers, partners and employees in an array of ways, including improved presentation of data and analytics (62% vs. 43% of those who said integration is important or somewhat important), access to many different data sources (57% vs. 49%) and improved data quality and data management (59% vs. 53%). These numbers indicate that organizations that neglect the integration aspects of cloud analytics are likely to be at a disadvantage compared to peers that make it a priority.
Integration for cloud analytics is typically a manual task. In particular, almost half (49%) of organizations in the research use spreadsheets to manage the integration and preparation of cloud-based data. Yet doing so poses serious challenges: 58 percent of those using spreadsheets said it hampers their ability to manage processes efficiently. While traditional methods may suffice for integrating relatively small and well-defined data sets in an on-premises environment, they have limits when dealing with the scale and complexity of cloud-based data. The research also finds that organizations utilizing newer integration tools are satisfied with them more often than those using older tools. More than three-fourths (78%) of those using tools provided by a cloud applications provider said they are satisfied or somewhat satisfied with them, as are even more (86%) of those using data integration tools designed for cloud computing; by comparison, fewer of those using spreadsheets (56%) or traditional enterprise data integration tools (71%) are satisfied.
This is not surprising. Modern cloud connectors are designed to connect via loosely coupled interfaces that allow cloud systems to share data in a flexible manner. The research thus suggests that for organizations needing to integrate data from cloud-based data sources, switching to modern integration tools can streamline the process.
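As a rough illustration of what a loosely coupled interface can look like, the following Python sketch has the integration code depend only on a small shared contract rather than on any one system. The `Connector` protocol, the two in-memory connectors and `extract_all` are hypothetical names invented for this example; real connectors would wrap a vendor's API.

```python
from typing import Iterable, Iterator, Protocol

Record = dict


class Connector(Protocol):
    """The only contract the integration code relies on (hypothetical)."""

    name: str

    def fetch(self) -> Iterable[Record]: ...


class InMemoryCrm:
    """Stand-in for a cloud CRM connector; returns a list."""

    name = "crm"

    def fetch(self):
        return [{"account": "Acme", "stage": "won"}]


class InMemoryBilling:
    """Stand-in for a billing connector; yields records lazily."""

    name = "billing"

    def fetch(self):
        yield {"account": "Acme", "invoice_total": 250.0}


def extract_all(connectors: Iterable[Connector]) -> Iterator[Record]:
    """Pull records from every connector and tag each with its source."""
    for connector in connectors:
        for record in connector.fetch():
            yield {"source": connector.name, **record}


records = list(extract_all([InMemoryCrm(), InMemoryBilling()]))
print(records)
```

Because `extract_all` knows nothing about how each connector obtains its data, a new source can be added by writing one more class with a `name` and a `fetch` method, without changing the rest of the pipeline.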
Overall, three-quarters of companies in our research said that it is important or very important to access data from cloud-based sources for analysis. Cloud-based analytics isn’t useful unless the right data can be fed into the analytic process, and without capable tools this is not easy to do. A substantial impediment is that analysts spend the majority of their time accessing and preparing data rather than analyzing it. Complicating the task, each data source can represent a different, possibly complex, data model. Furthermore, data sets may have varying formats and interface requirements, which are not easily addressed with legacy integration tools.
Such complexity is the new reality, and new tools and approaches have come to market to address it. For organizations looking to integrate their data for cloud-based analytics, we recommend exploring these new integration processes and technologies.