Since 2005, Hadoop has been the foundation for hundreds of big data companies, thanks to its open-source nature. Over 170 well-known companies have contributed to its development since launch, and the project is currently valued at over $2 billion.
But what exactly is Hadoop, and why is it so important? In layman’s terms, Hadoop is a framework for building and supporting big data and large-scale processing applications, something that traditional software isn’t able to do. The whole Hadoop framework relies on four main modules that work together:
Hadoop Common is like the SDK for the whole Hadoop framework, providing the libraries and utilities needed by the other three modules.
Hadoop Distributed File System (HDFS) is the file system that stores all of the data at high bandwidth, replicated across clusters of machines (think RAID, but spread over many servers); a short usage sketch follows this list.
Hadoop YARN (Yet Another Resource Negotiator) is the module that manages the cluster’s computational resources and schedules applications onto them.
Finally, Hadoop MapReduce is the programming model for writing the large-scale, big data applications themselves; the word-count sketch below shows the model in action.
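To give a feel for what “storing data in clusters” looks like to a developer, here is a minimal sketch that writes and then reads a file through HDFS’s Java FileSystem API. The class name, cluster address, and file path are illustrative placeholders; in a real deployment the fs.defaultFS setting would come from the cluster’s core-site.xml rather than being set in code.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsHello {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Normally read from core-site.xml; set here for clarity.
    // "hdfs://namenode:8020" is an illustrative cluster address.
    conf.set("fs.defaultFS", "hdfs://namenode:8020");
    FileSystem fs = FileSystem.get(conf);

    Path path = new Path("/tmp/hello.txt"); // illustrative path

    // Write a small file; HDFS replicates its blocks across the cluster.
    try (FSDataOutputStream out = fs.create(path, true)) {
      out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
    }

    // Read it back through the same API.
    try (FSDataInputStream in = fs.open(path)) {
      byte[] buf = new byte[64];
      int n = in.read(buf);
      System.out.print(new String(buf, 0, n, StandardCharsets.UTF_8));
    }
  }
}
```

The point of the abstraction is that application code sees one big file system, while HDFS handles splitting files into blocks and replicating them across machines behind the scenes.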
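And to make the MapReduce programming model concrete, below is a sketch of the classic word-count job, written in the style of the standard Hadoop tutorial example. The class names and the command-line input/output paths are illustrative; the job simply counts how often each word appears in the input files.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map step: emit (word, 1) for every word in this mapper's input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce step: sum the counts emitted for each distinct word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The division of labor mirrors the modules above: HDFS serves the input splits, YARN schedules the map and reduce tasks across the cluster, and MapReduce defines what each task actually does.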
Hadoop is a very powerful framework for big data companies, and its use has been on the rise since its inception in 2005: over 25% of organizations currently use Hadoop to manage their data, up from 10% in 2012. Because Hadoop is open source and flexible enough to fit a variety of needs, it has been applied to almost every industry imaginable in the current big data boom, from finance and retail to education and government.
Solix, a leading provider of Enterprise Data Management (EDM) solutions, has created an infographic that goes into more depth on Hadoop, along with some interesting predictions; you can take a look at it below. Is your company using Hadoop to manage your data?