John Marshall, CTO, Intelligence Directorate, Joint Staff/J2, moderated this panel of government and industry technologists. He began by outlining the need for Big Data experts in all fields. Data Scientist was projected to be the second hottest career for college graduates in 2012, and research suggests that better data analysis for sales, inventory, and operations could increase retailers' profits by 60%. Companies and government agencies increasingly need to incorporate Big Data into their decision-making processes and to develop personnel who can do so.
Russ Richardson, SVP of Sotera Defense Solutions, then explained the role of Data Scientists in national security. Big Data means different things in different branches and agencies. For the Intelligence Community, the data is very large, with months of video and audio recordings or reams of unstructured text to be analyzed. The Army also deals with text, but the sum total of Army reports for the last few decades is only in the gigabytes. Instead, the data is incredibly complex because of the analysis the DoD performs on it. When these reports are indexed, the data expands 1,000- or 2,000-fold.
Michael Lazar, Senior Solutions Architect at VMware, continued describing Big Data challenges in the public sector. He noted that, for the wide range of Big Data problems faced, Data Scientist is too broad a term, similar to doctor, which could describe a neurologist, cardiologist, or general practitioner, all of whom perform very different roles. Also like doctors, Data Scientists expect to be well compensated for their expertise. A Chief Data Officer or Data Scientist could make several hundred thousand dollars a year in the private sector at firms such as Yahoo or Facebook, and even the highest government salaries can’t compete. This means that the federal space will have to grow its own Big Data experts for the specific needs it faces. To retain their Data Scientists, organizations like the NSA have to offer them unique problems and missions they won’t see in the private sector.
Matt Schumpert, Director of Solutions for Datameer, questioned the idea of Data Scientists in general, noting that you shouldn’t have to be a scientist just to work with Big Data. To really make the most of a company’s information, the barriers to entry need to be lower so that analysts, marketing, and the sales team can also benefit. He added that high barriers to entry hurt transparency and collaboration.
Digital Reasoning’s David Yee spoke about his experience in both government and industry. He is currently bridging the gap between the two, as Digital Reasoning is supporting an Army Big Data project. Yee agreed with Schumpert on the need to lower barriers to entry for working with Big Data and gave the example of Digital Reasoning’s Synthesis Cloud, where you don’t need a PhD to perform high-level analysis. All it takes is some basic knowledge of programming languages such as Python.
Following the panel, there was a question and answer session with conference attendees. One attendee asked which major companies have Chief Data Officers. While CDOs are mostly employed by Web 2.0 firms such as Facebook, more conventional businesses are now making use of them, notably Bank of America. Another attendee asked whether, given the diverse tasks that fall under the duties of a Data Scientist, the role should be filled by one person or by a set of experts. While it depends on the problems and resources available, enterprises will need both when possible. Experts may be necessary for specific aspects of design, management, and analysis, but generalists are also needed to lead and unify the team.