We are reaching a crossroads in the world of analytics. Over the years, we have seen analytic environments and data environments begin to merge through the advent of in-database processing within relational database engines. This trend has been leading to a world where analytic professionals don’t have to worry about moving data or accessing different systems to perform analytics. They can simply run their analytic processes against the data where it sits using the tools they know best. Scale can be added to analytic processes while also simplifying them and using fewer system resources.
Two Steps Forward, Four Steps Back?
Unfortunately, in the world of big data, many organizations seem to be taking steps backwards rather than forwards. Instead of being freed from concerns over where data sits, many analytic professionals are forced to navigate multiple distinct systems and move massive amounts of data in order to develop analytic processes with big data. Big data is adding a burden on users, and for those who were starting to enjoy the freedom from worrying about the underlying systems, this isn’t a welcome development.
The good news is that it doesn’t have to be this way. The market’s momentum is building towards merging big data systems and toolsets into a single, unified analytics platform that enables users to access any amount of data of any type for any analysis at any time**.
If analytic professionals are going to execute an analysis, do they really care if the data is in a relational database, a NoSQL system, or a hybrid of the two? Not in the absolute sense. They do care about getting good performance from their processes and having access to powerful, flexible functionality to allow them to run the desired analytics. However, they don’t typically care where the data physically sits as long as their needs are met.
The Case For A Unified Analytic Environment
The above point leads to the need for a more cohesive, unified analytics environment. As an analytic professional, I simply want to be able to see all the available data and request that my analytic logic be run against it. Perhaps the logic is SQL, perhaps it is a MapReduce process written in Java, or perhaps it is some SAS or R code. Regardless, I want a simple way to view, access, and analyze all of the data from a single entry point. Exactly where the data sits isn’t a top concern for me. I’d rather not worry about it, in fact.
It is important for readers to keep up on the evolution of what’s available in the marketplace. In a short time, the ideal of having a single, unified analytics environment will be very real. Don’t take your organization down the dead-end paths that stem from having siloed systems created to handle only one type of data at a time and only certain genres of analysis. Enterprise analytic requirements are too broad to be met in such an environment. You’ll need to combine these silos to succeed.
Ask your analytic teams whether they really care about where the data is physically sitting. I am confident you’ll come to the conclusion that users only care that each data source has been placed in an environment that best fits the nature of the data and how they plan to use it. However, they don’t want to have to worry about exactly where it is as they build an analytic process. They want seamless access so that their complex analytics can be executed at scale. The emergence of unified analytics environments will achieve that for you.
To see a video version of this blog, visit my YouTube channel.
** For purposes of full disclosure, my employer, Teradata, is marketing our Unified Data Architecture, which is intended to do this. However, the point here isn’t our offerings specifically, but the general architecture and concepts, which apply more broadly.