With so many businesses touting the benefits of big data, it becomes easy for companies to get caught up in the hype and opportunity surrounding the term. However, many businesses still fail to effectively capture the real value of the data available to them. A recent survey of IT decision makers and CIOs conducted by Dimensional Research on behalf of Qubole showed that while nearly all businesses are running or planning to run big data projects, less than 10 percent report having established mature big data processes within their organization. So, where is the disconnect between data teams’ perceived capabilities and their actual performance when running big data projects?
The following are five of the most significant challenges companies report when implementing a big data project or working toward big data maturity:
Ensuring Data Quality
Few will deny the important role big data now plays in organizations all over the world, but realizing its benefits requires maintaining high-quality data – something that has become increasingly difficult to do, and that IT and data professionals reported as their top challenge. In many cases, key aspects of the data businesses collect become corrupted by mistakes, errors, or incomplete values, any of which can lead data teams to the wrong conclusions. This is referred to as dirty data, and it represents a formidable obstacle for companies hoping to use that data to drive insights and improve business operations. Dirty data is no minor issue, either: according to The Data Warehouse Institute (TDWI), corrupted or incomplete data costs U.S. companies around $600 billion every year.
Taking steps now to clean data and prevent corruption will go a long way toward helping organizations make the most of the information they collect. Businesses can keep data cleaner by regularly updating their systems to ensure they can handle large-scale collection and analysis without damaging records in the process. Businesses with the right technology can go further with data scrubbing, a thorough cleaning process that involves filtering, decoding, and translating data sets.
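To make the scrubbing steps concrete, here is a minimal sketch of what a filter/decode/translate pass might look like in Python with pandas. The column names, cleaning rules, and input file are hypothetical placeholders chosen for illustration, not a prescription for any particular pipeline.

```python
# A minimal data-scrubbing sketch using pandas. The columns ("order_id",
# "amount", "country") and all cleaning rules are hypothetical, chosen only
# to illustrate the filter/decode/translate steps described above.
import pandas as pd

def scrub(df: pd.DataFrame) -> pd.DataFrame:
    # Filter: drop records missing required key fields.
    df = df.dropna(subset=["order_id", "amount"])

    # Decode: coerce numeric fields, discarding values that fail to parse.
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df[df["amount"] > 0].copy()

    # Translate: normalize inconsistent encodings of the same value.
    country_map = {"USA": "US", "U.S.": "US", "United States": "US"}
    df["country"] = df["country"].str.strip().replace(country_map)

    # Deduplicate: keep the most recent record per key.
    return df.drop_duplicates(subset="order_id", keep="last")

clean = scrub(pd.read_csv("orders.csv"))  # hypothetical input file
```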
Keeping Costs Contained
It can often be difficult for CIOs to accurately project the cost of a big data project, especially when they lack prior experience. The challenge lies in accounting for the disparate costs associated with each project – acquiring new hardware or software, paying a cloud provider, hiring additional personnel, and beyond. Because big data projects tend to scale quickly, their costs can become overwhelming if companies are not prepared. For businesses pursuing on-premises projects, it is important that decision makers factor in the cost of training, maintenance, and expansion of their databases and staff. On the other hand, while cloud-based big data deployments typically offer lower costs and faster time-to-production than their on-premises counterparts, businesses pursuing the cloud model also need to evaluate the service-level agreement with their provider to determine how usage will be billed and whether any additional fees may be incurred.
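As a rough illustration of how those disparate costs might be tallied, the sketch below compares hypothetical on-premises and cloud projections. Every figure (hardware price, staffing, per-node-hour rate, fees) is a made-up placeholder; real numbers come from vendor quotes and the provider's pricing schedule and service-level agreement.

```python
# A back-of-the-envelope cost projection for a big data project. Every figure
# here is a hypothetical placeholder, not a benchmark.

def on_prem_cost(years: int) -> float:
    hardware = 250_000            # up-front servers and storage
    staff_and_training = 180_000  # per year: admins, training, support
    maintenance = 40_000          # per year: upkeep and expansion
    return hardware + years * (staff_and_training + maintenance)

def cloud_cost(years: int, node_hours_per_month: float) -> float:
    rate = 0.35                   # per node-hour, from the provider's pricing
    egress_and_fees = 1_500       # per month: data transfer and other fees
    monthly = node_hours_per_month * rate + egress_and_fees
    return years * 12 * monthly

for years in (1, 3, 5):
    print(f"{years}y  on-prem: ${on_prem_cost(years):>9,.0f}  "
          f"cloud: ${cloud_cost(years, 5_000):>9,.0f}")
```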
Satisfying Business Needs and Expectations
While data teams have high confidence in their ability to deliver self-service insights to meet growing demand, few have been able to deliver on the high expectations set by their businesses. Part of this problem stems from a lack of the technical resources required to run big data operations effectively. In fact, nearly a third of respondents to the recent Dimensional Research survey said they did not have access to the infrastructure or technology needed to implement an in-house big data project.
Although companies may begin a big data project with high expectations, they often fail to invest in the resources data teams need to implement those projects properly. To avoid this issue, data team leaders should consult with business leaders before beginning a project and set expectations based on the resources available. At the same time, data teams must act as educators, informing decision-makers about the technology, infrastructure, and staff needed to meet specific goals.
Quantifying Values of Big Data Projects
While most organizations will argue for the benefits of implementing their own big data project, understanding the need and being able to quantify the value of the required investment do not always go hand in hand. For instance, businesses that decide to run data analytics on-premises must purchase a number of costly servers, deploy them in their data centers with the appropriate software, and run tests to ensure everything functions properly together. This process alone can take months or even years – and that is before the first query can even be run. If a data team were asked at that point for its return on investment, it would simply have no way to answer.
The cloud has changed that scenario dramatically. While quite a few organizations still choose to invest in on-premises big data infrastructure, a growing number of companies are realizing the benefits of cloud-based infrastructure, including lower up-front investment and faster deployment.
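One way to see why early ROI questions are unanswerable is to model when value actually begins to accrue. The sketch below is purely illustrative; the monthly value, costs, and ramp-up times are hypothetical assumptions.

```python
# A sketch of why on-premises ROI is hard to state early: no value accrues
# until the first query can actually run. All figures are hypothetical.

def roi(value_delivered: float, total_cost: float) -> float:
    """Standard ROI: net gain divided by cost."""
    return (value_delivered - total_cost) / total_cost

MONTHLY_VALUE = 50_000  # hypothetical value of insights once queries run

def value_after(months_elapsed: int, months_to_first_query: int) -> float:
    # Value only accrues once the infrastructure is live.
    return max(0, months_elapsed - months_to_first_query) * MONTHLY_VALUE

# Twelve months in, an on-prem build-out may still be pre-launch, while a
# cloud deployment has been producing insights for most of the year.
print(roi(value_after(12, months_to_first_query=12), total_cost=500_000))  # -1.0
print(roi(value_after(12, months_to_first_query=1), total_cost=60_000))    # ~8.2
```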
Lack of Industry Expertise
Finally, one of the biggest challenges an organization faces when implementing a big data project is finding qualified personnel. While 83 percent of survey respondents said their data teams are growing, more than one-third reported difficulty finding people with the expertise and skills needed to handle their data operations. The problem is further compounded by the fact that a successful big data project cannot be handled by a single type of user – companies need to hire developers, data scientists, analysts, and others, each with their own skill sets and areas of expertise.
Even if an organization has a skilled team in place, however, many of today's data teams become bogged down in the manual effort of maintaining a big data infrastructure. Rather than simply adding personnel to handle these data management tasks, businesses should focus on finding tools that help their data teams work more effectively. With the cloud and machine learning, time-consuming tasks like capacity planning and software updates can be seamlessly automated, freeing teams to focus on high-value work that drives operational improvements and revenue.
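As a toy example of the kind of capacity-planning rule a cloud platform can automate, the sketch below doubles or halves a cluster based on recent utilization. The Cluster type, thresholds, and scaling policy are hypothetical simplifications of what managed autoscalers do.

```python
# A toy illustration of automated capacity planning: a scale-up/scale-down
# rule applied so data teams do not have to size clusters by hand. The
# Cluster type and all thresholds are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Cluster:
    nodes: int
    min_nodes: int = 2
    max_nodes: int = 50

def autoscale(cluster: Cluster, avg_utilization: float) -> Cluster:
    """Adjust node count from recent average CPU utilization (0.0-1.0)."""
    if avg_utilization > 0.80 and cluster.nodes < cluster.max_nodes:
        cluster.nodes = min(cluster.nodes * 2, cluster.max_nodes)   # scale up
    elif avg_utilization < 0.25 and cluster.nodes > cluster.min_nodes:
        cluster.nodes = max(cluster.nodes // 2, cluster.min_nodes)  # scale down
    return cluster

cluster = Cluster(nodes=8)
print(autoscale(cluster, 0.92).nodes)  # -> 16
```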
Big data is difficult and complex, and it presents many obstacles to organizations hoping to harness it effectively. While many companies appear cognizant of the difficulty of implementing a big data project, they often hold unrealistic expectations of the effort and expertise required to reach the next maturity stage. Before companies can execute a successful, mature big data program of their own, they first need to develop the infrastructure, tools, and expert resources necessary to overcome each of the challenges above. By taking a DataOps approach, companies can build a self-service data model that delivers insight-driven business decisions throughout the organization.