More than a decade ago, the enterprise was a very different place. Companies struggled with data growth: application performance issues, rising storage costs, and difficult application upgrades. Data became more manageable over time, with the introduction of solutions like data archiving.
However, the needs of the enterprise are no longer met by standard data archiving.
It’s 2015, and companies are drowning in data. The amount of data produced every day is estimated to be more than eight times the amount of information stored in U.S. libraries, and the pace of data growth is still accelerating rapidly.
The big issue is that “data” does not mean what it once did. A decade ago, data meant structured data from internal sources such as financial systems and ERP. Today, Big Data includes gigantic amounts of social media data, machine-generated logs, and click rates on Web sites. The newest data sources come from the Internet of Things (IoT): sensor data from everything from machines to consumer devices like FitBits. A single flight of a single commercial jet plane is estimated to generate a terabyte of data. In the near future everything – cars and trucks, bridges, consumer appliances of all types, people themselves – will be covered in sensors.
The Big Data promise
For many companies, this is a blessing and a curse. On one hand, all this new information promises huge business opportunities. Large mobile communications carriers in the United States, such as AT&T, Verizon, and Sprint, have been capturing and analyzing customer data from multiple sources, including service calls and social media, to find where their customer-facing processes break down. Those breakdowns are the key to why customers constantly switch from one carrier to another. Fixing them to provide a better customer experience is paying off in much higher rates of customer retention.
Experts say that companies will be able to combine traditional sales data with analysis of social media and other forms of Big Data to make forward-looking predictions of emerging market trends, allowing them to get ahead of their markets and reap competitive advantage. Some of the most advanced companies are already using Big Data to find new market opportunities or entirely new markets. For instance, a century-old train brake manufacturer is analyzing IoT sensor data from freight engines to identify the techniques the best train engineers use to meet schedules while burning the least fuel. The manufacturer is feeding this analysis back to its Class One freight railroad customers worldwide, giving them the keys to cutting fuel consumption by 2 to 4 percent. A single Class One freight railroad burns more fuel oil than the U.S. Navy, so the savings from this one analysis can reach many millions of dollars.
But all this comes with challenges. One of the largest is cost. Data storage is a major component of the overall IT budget, and despite the steady decrease in the cost of storage media per gigabyte, the overall storage budget is increasing. Put simply, data is growing faster than the cost of storage is falling, and with new data sources like IoT, data growth is not going to slow down soon.
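The arithmetic behind that claim can be sketched in a few lines. The following is a minimal illustration, not a forecast: the function name and every input figure (starting volume, price per terabyte, growth rate, and rate of price decline) are hypothetical values chosen only to show how the two rates compound against each other.

```python
def storage_budget(initial_tb, cost_per_tb, growth_rate, price_decline, years):
    """Return yearly storage spend when data volume grows and unit price falls.

    Both rates compound annually. All inputs are illustrative assumptions.
    """
    budgets = []
    for year in range(years + 1):
        volume = initial_tb * (1 + growth_rate) ** year      # data keeps growing
        price = cost_per_tb * (1 - price_decline) ** year    # media keeps getting cheaper
        budgets.append(volume * price)
    return budgets


# Hypothetical inputs: 100 TB today at $1,000 per TB,
# data growing 40% a year, price per TB falling 20% a year.
budgets = storage_budget(100, 1000.0, 0.40, 0.20, 3)
for year, spend in enumerate(budgets):
    print(f"year {year}: ${spend:,.0f}")
```

Under these assumed rates, spend rises every year because (1 + 0.40) × (1 − 0.20) is greater than 1. The budget only shrinks when that product drops below 1, that is, when prices fall faster than data grows.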
The storage management challenge
Capturing and storing this increasing volume of data is extraordinarily taxing on IT departments. Whether businesses know it or not, the cost of storing and keeping data is one of the heaviest burdens on a company’s infrastructure resources. These costs extend beyond the purchase price of a data storage system. Physically, the data explosion is driving data center power consumption higher than ever before. Data growth also slows system processes and lengthens outage windows, creating situations ranging from inconvenienced users to total system shutdowns.
As expensive as it is, however, companies cannot afford not to capture these huge volumes of data. Big Data promises huge business advantage to those who harness it, and the dark side is that those who do not will face a growing competitive disadvantage. Big Data and associated IT trends like mobile are disrupting most markets, and companies have no choice but to invest and change as fast as they can just to stay where they are.
To meet these challenges, CIOs need to pivot their data management strategies and turn to Big Data solutions to cut costs and improve application performance. Semiconductor technology now allows massive amounts of data to be stored at lower cost. Virtualization and scalable platforms like Apache Hadoop allow data to be stored with extraordinary efficiency.
Advanced data storage management is vital to controlling these costs. According to Information Lifecycle Management (ILM) frameworks, only data that is less than 18 months old is considered current. This is the data that should be stored in memory or tier 1 storage. The rest should be moved to a nearline repository, where it remains easy to access.
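The 18-month rule can be expressed as a simple placement policy. The sketch below is one possible reading of it, not a reference implementation: it assumes age is measured from a record's last-modified date (the rule itself does not say which date to use), and the record names, tier labels, and helper functions are hypothetical.

```python
from datetime import date

CURRENT_WINDOW_MONTHS = 18  # threshold taken from the ILM rule described above


def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def assign_tier(last_modified, today):
    """Return 'tier1' for data younger than 18 months, otherwise 'nearline'."""
    if months_between(last_modified, today) < CURRENT_WINDOW_MONTHS:
        return "tier1"
    return "nearline"


# Hypothetical records: (name, last-modified date).
records = [
    ("invoice_2015_03", date(2015, 3, 10)),
    ("invoice_2013_11", date(2013, 11, 2)),
    ("order_2014_01", date(2014, 1, 15)),
]
today = date(2015, 7, 15)

placement = {name: assign_tier(modified, today) for name, modified in records}
print(placement)
```

A record that is exactly 18 months old is treated as no longer current here. A real policy would also need to decide whether age means creation date, last update, or last access, since each choice moves a different set of records to nearline storage.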
Experts agree that as much as 80 percent of production data in ERP, CRM, file servers, and other mission-critical applications may not be in active use, and both structured and unstructured data become less active as they age. Large amounts of inactive data kept online for too long reduce the performance of production applications, increase costs, and create compliance challenges.
However, storage mechanisms alone aren’t enough. You can place Big Data in a repository, but its true value is only realized when structure and analytics are added. Historical data is important both for documenting regulatory compliance and for analyzing market trends.
Big Data technologies like Hadoop offer a low-cost bulk storage alternative to keeping inactive enterprise data online. Moving inactive data to nearline storage shrinks the active data set, which improves application performance, reduces costs, and makes workloads more manageable. Universal data access is maintained through analytics applications, structured query and reporting, or just simple text search.
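To get a rough sense of the effect, the estimate below applies the 80 percent inactive figure cited earlier to a two-tier cost model. It is a back-of-the-envelope sketch: the function name, the total volume, and both per-terabyte costs are hypothetical placeholders, and it ignores migration effort and operating costs.

```python
def tiering_estimate(total_tb, inactive_fraction, tier1_cost_per_tb, nearline_cost_per_tb):
    """Compare storage cost before and after moving inactive data to nearline.

    Returns the size of the remaining active data set and both cost totals.
    """
    inactive_tb = total_tb * inactive_fraction
    active_tb = total_tb - inactive_tb
    before = total_tb * tier1_cost_per_tb  # everything kept on tier 1
    after = active_tb * tier1_cost_per_tb + inactive_tb * nearline_cost_per_tb
    return {"active_tb": active_tb, "before": before, "after": after}


# Hypothetical inputs: 500 TB in total, 80% inactive,
# tier 1 at $3,000 per TB and nearline at $500 per TB.
result = tiering_estimate(500, 0.80, 3000.0, 500.0)
print(f"active data set: {result['active_tb']:.0f} TB")
print(f"cost before:     ${result['before']:,.0f}")
print(f"cost after:      ${result['after']:,.0f}")
```

The saving depends entirely on the price gap between the two tiers and on how much data is truly inactive, so both inputs are worth measuring before committing to a migration.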
Big Data and this new enterprise blueprint enable organizations to increase the value of their data. Enterprise data warehouse (EDW) and analytics applications leverage Big Data for better-described views of critical information. As a low-cost repository for copies of enterprise data, Hadoop is an ideal platform on which to stage critical enterprise data for later use by EDW and analytics applications.
Businesses today face an exciting new world of possibilities. Advanced data management that provides the right data at the right time, at the least cost, is the necessary foundation for success in this brave new world.