Where are all the big-data people? A recent front-page article in the Boston Globe raised this very issue.
The article, “Mass. firms see riches, jobs in charting oceans of data”, discussed reports on Big Data by the Mass Technology Leadership Council (MTLC) and McKinsey Global Institute regarding the role Massachusetts plays as a hotbed for both Big Data software firms and the people with the skills to leverage Big Data in their businesses.
Big Data’s promised business benefits range from capturing more of a consumer’s wallet to discovering the cure for cancer. Software firms certainly see the potential of Big Data. They’re investing in better ways to capture, integrate, manage and analyze Big Data. Both established high tech firms and venture capitalists are making significant investments in it.
The Big-Data Skills Gap
The gap we are going to encounter in this industry is a shortage of both the skilled people who develop Big Data analytics models, often referred to as data scientists, and the data-savvy business people who can take advantage of those models in their business. The McKinsey report states, “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
Our industry has already faced a shortage of skilled people in business intelligence, analytics and data integration, and that shortage has kept businesses from effectively using the data they already have. With the onslaught of Big Data and the advanced skills it requires, we’re destined to fall even further behind.
Data Scientist Definition
If we are ever to fill that gap, we need to broaden our definition of what a data scientist is. The title, and the job descriptions posted for it, concentrate too heavily on programming and IT skills. To create Big Data analytical and predictive models, a person does need programming skills but, more importantly, needs to understand statistical modeling. To develop the model, the person also needs to understand the business and its industry. It also helps to understand how to apply customer behavior and economic models.
Although Big Data means “a lot of data,” the reality is, in many cases, that data is incomplete and inconsistent. Developing analytical models means dealing with dirty data and gaps. Many “numbers” people have trouble dealing with these conditions.
Data scientists need expertise in statistics, economics and business in addition to basic programming skills. Although the ranks of software engineers and IT are an excellent place to start, our industry needs to broaden its appeal and reach to people with mathematics, actuarial, statistical, economics or business backgrounds if the data scientist gap is ever to be filled.
McKinsey also mentions the shortage of data-savvy business managers who will be the users of the Big Data analytical models developed by the data scientists. Fortunately, many business schools have started offering analytics curricula at both the undergraduate and graduate levels. For business people with deep business expertise, it is time to brush up on statistics.