Am I surprised? Not at all. With the hype around big data and the conflation (partially erroneous, in my view) of big data and soft (unstructured) information, it was obvious that the big relational database vendors would need a beachhead. Oracle has landed. Further speculation has already begun as to the location of the second front. IBM targeting Coveo? Teradata aiming at MarkLogic? Microsoft and Attivio (the founders of Attivio previously sold FAST to Microsoft in 2008)? The guessing game is fun, but my concerns about this war run far deeper.
Leaving aside the other aspects of big data for now, let’s focus on soft information, and in particular on the textual subset, including information that can be easily converted to text, such as audio and document scans. While leaving out images and video reduces volumes dramatically, text is a large and highly valuable segment of the overall information market. In the past, this segment has stood largely separate from database management under the umbrella of enterprise content management. What Forrester observed in 2008 (and it was already becoming evident even then) was that business users neither know nor care about any division between content and data, between search and BI. They simply want to find and benefit from the burgeoning wealth of digital information that is being stored in computers everywhere. I wrote in a 2010 white paper that I see “data and content as two ends of a continuum of the same business information asset… [with a] depth of integration required for full business value.”
The problem I see in this likely battle for acquisitions by database vendors is as follows: UIA is a comparatively young technology coming largely from the content management space, with “unstructured” search bridging over into the BI query world. This direction of innovation flow makes sense; soft information is more complex and extensive than relational data, and for business users more “natural” and easier to understand. In simple terms, the concept of search must be enhanced with query rather than query extended to search. Hard data flows from soft information, both conceptually and in its implementation. Can you imagine converting a relational database in its entirety into an engaging novel?
The risk is that large database vendors will try to shoehorn search into their existing query-centric view of the world, and that the innovative solutions we need to gain real value from the explosion of soft information will be stifled as a result. There are some small startups (such as NeutrinoBI) that come from the hard data space with an understanding of the primacy of search as an entry point into analytics, but in my experience, the larger players either focus solely on hard data or split hard and soft information management into very separate organizational silos.
In the Oracle acquisition, I expect that the likely BI outcome is the positioning of Endeca’s Latitude component behind OBIEE. My challenge to Larry (if he is listening!) is to conceptually put the two components the other way around and see what it offers to business users.