Sometimes it’s good to step out of your normal activities just to see what ideas get sparked or what unexpected connections you make. Last week I went to the Text Analytics Summit in Boston for just this reason. I took loads of notes – my tag cloud is below.
As you might imagine, the term Big Data came up a lot at this conference. My favorite definition came from Meta Brown of LinguaSys: “If it’s hard for you to handle, it’s Big.” More formally, Big Data refers to the enormous volume, velocity, and variety of data that exists and has the potential to be turned into business value. Big Data can be structured or unstructured. It can be created by people, calculated by systems, or generated by machines. According to IDC, the volume of digital content in the world will grow to 2.7 billion terabytes in 2012, up 48% from 2011 — and it’s rocketing toward 8 billion terabytes by 2015. (See the IDC report, “IDC Predictions 2012: Competing for 2020,” December 2011.)
Here are a couple of specific examples from the Text Analytics Summit:
- NASA (the National Aeronautics and Space Administration) is using aviation data to improve flight safety. In the U.S. alone there are roughly 9 million flight departures a year, each of which generates data on hundreds of parameters every second during a flight. Data is generated by the aircraft, satellites, and other systems. In addition, each flight has unstructured data associated with it, such as safety reports and write-ups from pilots and co-pilots.* Some data resides with the airlines; other data resides in government systems. NASA is using analytics to dig through all this data to uncover insights that can identify and prevent potential runway incursions and other accidents.
- eBay is using social media data to get closer to customers. As of June 2012, eCommerce giant eBay had indexed more than 40 million blogs and forums (60 billion posts – that’s 10,000 a second!), which amounts to 65 terabytes of data. Why? The company has a social data intelligence program in place to help decision makers better understand the company’s audiences, influencers, and competitive position, and to deliver superior customer service. A global social analytics team works with multiple groups across the company to find and share insights from all this data.**
A theme that emerged from the Text Analytics Summit is what I think of as “little data” (though it’s really only little in comparison to Big Data). In use cases like NASA’s aviation safety research program, where every single piece of data is important for the analysis at hand, it’s not useful to take just a subset of the data because something critically important may be missed.
But in use cases like eBay’s (using customer insights drawn from social media to drive revenue and improve the customer experience), it can be highly effective to use sampling and other techniques to grab a subset of all the data and perform the analytics on that. It’s similar in the legal sector: during a lawsuit, a company’s legal team wants to use and present to the opposing side only the documents that are relevant to a particular case and issue.
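For readers who like to see the idea in code, here is a minimal sketch of one common sampling technique, reservoir sampling, which draws a fixed-size random subset from a stream too large to hold in memory. This is only an illustration, not anything eBay described: the function name `reservoir_sample`, the `post-N` strings, and the sample size are all made up for the example.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable.

    The stream is read once and never held in memory as a whole,
    so this works even when its total length is unknown.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in [0, i]; keep the new item only if the
            # slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


# Hypothetical stand-in for a large stream of social media posts.
posts = (f"post-{n}" for n in range(100_000))
subset = reservoir_sample(posts, k=1_000, seed=42)
print(len(subset))
```

The analytics would then run on `subset` rather than on the full stream. Whether a sample like this is acceptable depends on the use case, which is exactly the distinction between the NASA and eBay examples.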
It may seem like a no-brainer, but it came across loud and clear during presentations at the Text Analytics Summit: an organization’s ability to optimize the process of deriving value from Big Data in a cost-effective way depends on the use case, business drivers, and characteristics of the data.
———————————
* According to the Bureau of Transportation Statistics, Research and Innovative Technology Administration, in 2012 (ending on the last day of February) there were 9,098,000 departures, compared to 9,125,000 in the same period of 2011, a change of -0.3%. For more information, see http://www.transtats.bts.gov/.
** On June 13, 2012, eBay’s social commerce analyst Palm Norchoovech shared these insights in a presentation titled “Global Social Analytics @eBay” at the Text Analytics Summit in Boston, Massachusetts. You can find more info here: http://bit.ly/GSnH03.