Neil focused his session on Analytics. Analytics, he says, has roared to the front office in recent years (driven in part by the explosion of data and data management technology). Big, as in Big Data, is relative. The rate of increase has not particularly changed in recent years, but that rate is being applied to a bigger and bigger base, so the absolute amount of data is much, much greater. And this data can be captured and managed thanks to the dramatic increase in processing power and the matching decrease in the cost of storage and processing. Moore’s law, along with matching improvements in memory and data storage, means that IT has far more capability for managing and analyzing data than ever before. Nevertheless we still act as if we have too much data and too little compute power, as if we have to build simpler models – as if computing resources were scarce. This, he says (quite rightly), is a mistake. We can now manage and build the most complicated things and solve the most complicated problems. It’s time to think that nothing is impossible in this space.
Nevertheless analytics is hard. It takes resources and effort and it has to be focused – focused where it matters, on the decisions that matter to your business. You can’t do everything with analytics; you have to focus or you won’t do anything well. Neil divided analytic types into five categories:
- Quantitative R&D types, generally PhDs, who develop new analytic algorithms, do research into analytic approaches etc.
- Data Scientists or Quantitative Analysts who have advanced math skills but are focused on using algorithms and have solid business domain knowledge.
- Operational Analytics professionals who implement, run and manage analytic models, with good domain expertise and some math skills
- Business Intelligence or Discovery people who are number-oriented and focused on BI, spreadsheets, visualization etc.
The fifth group is the folks who just use operationalized analytics embedded in their systems and processes. These are the most numerous and are the ones who ultimately must be empowered to make better decisions. In many ways these are, he says, the most important.
Neil put up a long list of traditional use cases for predictive analytics – energy consumption, machine downtime prediction, optimizing hospitalization, improving supply chains. He also pointed out that there are tremendous opportunities for using analytics for good and he encouraged everyone to do something about this (pointing to Data Kind for instance).
So what ARE the analytical needs?
- Data is needed to record what happened. But this data is stored for a primary operational purpose, with analytics only a secondary one – a challenge, because the data increasingly needs to be usable for analytics as well.
- We then need to be able to make predictions, with manageable uncertainty, about what will happen. And we have to manage how consumers of analytics deal with probabilities, because that’s what a prediction is – a probability (see the sketch after this list).
- We also want to be able to see what could happen.
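To make that probability point concrete, here is a minimal sketch (my own illustration, with toy, hypothetical data and scikit-learn) of the difference between a hard yes/no label and the probability a consumer of analytics actually has to manage:

```python
# A minimal sketch (toy, hypothetical data) of why a prediction is a probability.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy history: hours of abnormal machine vibration vs. whether the machine failed.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1])

model = LogisticRegression().fit(X, y)

new_case = [[5.5]]
print(model.predict(new_case))        # collapses the uncertainty into a hard 0/1 label
print(model.predict_proba(new_case))  # the probabilities the consumer actually has to manage
```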
And he points out that just because we have data and WANT to solve a problem does not mean that data will yield anything useful.
Any given data, he says, gets stored to be used for some processing. This data might be reported simply and quickly so we can see that there’s activity. More complex analytics and visualizations might help us get an overview of what’s going on. But as we move further into analysis and into the use of statistics we are further and further removed from the raw data and at greater risk of misinterpretation. He used Anscombe’s quartet as an example – four VERY different datasets that generate the same regression curve. You need some understanding of what a model is telling you before you can effectively consume it.
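To illustrate (not from Neil’s slides – just the standard Anscombe data), a small Python sketch shows that all four datasets produce essentially the same fitted line and correlation, even though plots of them look nothing alike:

```python
# Anscombe's quartet: four very different datasets with (near-)identical summary
# statistics and the same fitted regression line (roughly y = 3 + 0.5x).
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (x, y) in quartet.items():
    slope, intercept = np.polyfit(x, y, 1)   # least-squares line for this dataset
    r = np.corrcoef(x, y)[0, 1]              # correlation between x and y
    print(f"{name}: slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
# All four print roughly slope=0.50, intercept=3.00, r=0.82 -- only a plot
# (or real understanding of the data) reveals how different they actually are.
```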
Analytics, of course, has a spectrum from BI and reporting to data mining, predictive analytics and optimization – from describing knowledge to prescribing appropriate actions. Data mining is focused on descriptive analytics, more accurately describing your business using your data. Predictive analytics “scores” probabilities, showing you how likely something is. As you move up this curve, the need for statistical awareness grows and so does the amount of coding involved. Neil’s comments reminded me of those made by Ted Vandenberg of Farmers in his talk at Insurance Analytics, where he compared analytics today to IT 20 years ago – hand coding, obscure languages, no requirements process, no agile development methodologies. And even once you have analytics that work, there is the challenge that models age: the value of an analytic model degrades over time. This means you have to constantly challenge models and try new approaches, using A/B testing and champion/challenger to find models that keep working going forward.
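As a rough illustration of the champion/challenger idea (my own sketch, with hypothetical model and outcome functions, not anything Neil showed): route a small share of decisions to the challenger, track outcomes for both models, and only promote the challenger if it demonstrably outperforms the champion.

```python
# A minimal champion/challenger sketch: most cases go to the champion model,
# a small share to the challenger, and outcomes are tracked for both.
import random

def run_champion_challenger(champion, challenger, cases, outcome_of, challenger_share=0.1):
    results = {"champion": [], "challenger": []}
    for case in cases:
        if random.random() < challenger_share:
            name, model = "challenger", challenger
        else:
            name, model = "champion", champion
        decision = model(case)                            # score/decide with the chosen model
        results[name].append(outcome_of(case, decision))  # e.g. 1 = good outcome, 0 = bad
    # Average outcome per model -- compare these before promoting the challenger.
    return {k: sum(v) / len(v) if v else None for k, v in results.items()}

# Hypothetical usage: two toy scoring rules and a stand-in outcome check.
champion   = lambda case: case > 0.5
challenger = lambda case: case > 0.4
outcome_of = lambda case, decision: int(decision == (case > 0.45))
print(run_champion_challenger(champion, challenger,
                              [random.random() for _ in range(1000)], outcome_of))
```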
Enter the Data Scientist. Neil pointed out that much of the work bundled under the Data Scientist label is the same work analytics staff have always done. But greater compute power lets them handle the increased volume of data, and they are increasingly using data from outside their own environment. Many of the companies that specialize in data science are data companies – data is their business. The problem now is that more traditional companies are trying to apply these approaches, and they need to approach the problem differently: there simply aren’t enough data scientists out there, and they view the world differently than most large companies do. In addition, he says, it is not just about mathematical talent but about people who have the right training and tools to apply the math to real-world problems. Neil believes that 80% of the work of a data scientist can be done by other members of a team – everything from data gathering and cleansing, platform availability and presentations to collecting domain expertise. By focusing on a team to apply analytics, not just a single expert, you can deliver the results you need (and I agree – one of the reasons I think decision modeling matters is that it helps business analysts participate). You can also develop the people you need, shifting people between the categories he identified – using mentoring and technology to move people up the curve from outside into type 4 (BI or Discovery), from type 4 to type 3 or from type 3 to type 2.
As Neil and I pointed out in our book, making better decisions is not automatically a consequence of better analytics. You have to create a culture and a technical environment that are focused on decisions and on making better decisions using those analytics. And this is hard because no-one seems to own the portfolio for decision-making – not just the analytics but the decision process, continuous improvement of decision-making and tracking of decision outcomes. There is a persistent myth that simply having more, better data will result in better decisions, but this is not in fact the case.
He wrapped up with a call to action:
The challenge of analytics is communication and creating a shared understanding. It’s about focusing on high impact areas, moving forward one step at a time, being skeptical, being creative, searching for the truth. Any company can compete on analytics.