Author: Steve McDonnell of the Spotfire Blogging Team
Consider a simple trip to a child’s birthday party. You send a tweet that you’re headed to the party, and you create data. You get in the car, stop to get gas, pay at the pump, and you create data. You buy a card at the store, scan your frequent shopper card, pay with cash, and you create data. You take pictures and a short video at the party, post them on Facebook, Flickr and YouTube, and you create data. You send a text message while at the party, and you create data. Throughout the entire trip, your cell phone creates data as it continually tracks its location by GPS, and your car creates data as it tracks fuel efficiency. Take the data for this one activity, multiply it by the number of activities you have, multiply that by the number of people who have activities, and you probably have only a small fraction of the data that’s constantly being generated.
According to IBM, we create 2.5 quintillion bytes of data every day. Ninety percent of the data we have has been created in the past two years and the amount of data is expected to increase exponentially. The data we create is expanding rapidly as enterprises capture more data in greater detail, as multimedia becomes more common, as social media conversations explode and as we use the Internet to get things done. This is “big data,” and it’s getting even bigger.
Big data is complex. It’s complex because of the variety of data that it encompasses – from structured data, such as the transactions we make or the measurements we record and store, to unstructured data, such as text conversations, multimedia presentations and video streams. Big data is complex because of the speed at which it’s delivered and used, often in real time. And obviously, big data is complex because of the volume of information we are creating. We used to speak in terms of megabytes and gigabytes of home storage – now we speak in terms of terabytes. Enterprises speak in terms of petabytes.
Big Data Challenges
Big data presents a number of challenges relating to its complexity. One challenge is how we can understand and use big data when it comes in an unstructured format, such as text or video. Another challenge is how we can capture the most important data as it happens and deliver it to the right people in real time. A third challenge is how we can store the data, and how we can analyze and understand it given its size and our computational capacity. And there are numerous other challenges, from privacy and security to access and deployment.
Big Data Opportunities
But even greater than the challenges are the opportunities that big data presents. McKinsey calls big data “the next frontier for innovation, competition and productivity.” We can answer questions with big data that were beyond reach in the past. We can extract insight and knowledge, identify trends and use the data to improve productivity, gain competitive advantage and create substantial value for the world economy. The challenges of big data are small compared to the potential benefits, which are limited only by our creativity and our ability to make connections among the trillions of bytes of data we have access to.
Big data provides an opportunity to find insight in new and emerging types of data. How will you take advantage of this opportunity?