I have always loved science and math, and that’s why I got into statistics and focused on analytics for a career. One thing that has always fascinated me is how certain patterns show up again and again in different places across nature and mathematics. Two topics can seem unrelated at first, and then it suddenly becomes clear that there is a strong link between them and that they are simply different examples of the same underlying concept.
One example of this is the Fibonacci sequence, which shows up regularly in nature in places such as the way seashell spirals grow and the pattern of seeds in a sunflower. I recently came across a terrific example of similar patterns at work within the realm of data and analytics.
A Recurring Pattern in Analytics
I recently took part in an event (see a summary video here) where Professor Eric Bradlow of Wharton gave a presentation about research he’s done on what he calls “clumpiness” in customer purchasing. Eric and I got excited about a tie between his formal work on customer clumpiness and some work my team had done a few years prior around store sales forecasts. My team had effectively identified a very similar situation in a totally different setting.
This was an important realization because I consider it a powerful reinforcement when formal research and real-world project work independently confirm the same concept. The two situations were not directly comparable – individual customer purchases and store-level product sales – but they did share some similar mathematics under the hood.
“Clumpy” Data and Customer Purchasing
The central theme of Eric Bradlow’s research and talk was that while some customers purchase in a consistent pattern over time, others are quite clumpy. Some customers will not buy for a period of time, but then buy in rapid succession before pausing again. He likened this pattern to binge watching on a streaming content service.
Far from being just an academically interesting pattern, his research shows that accounting for the “clumpiness” of a customer’s purchasing will increase the power of standard customer behavioral models. The recommendation is, therefore, to embrace and account for clumpy purchasing instead of just making a note of its existence. His research focused upon a method to do that.
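To make the contrast between steady and clumpy purchasing concrete, here is a minimal sketch in Python. It is not the measure from Eric Bradlow’s research; it simply uses the coefficient of variation of the gaps between purchases as a rough stand-in. The function name and the two sample customers are made up for illustration.

```python
from statistics import mean, pstdev

def gap_variation(purchase_days):
    """Coefficient of variation of the gaps between consecutive purchases.

    Illustrative stand-in only: a value near 0 means evenly spaced purchases,
    and larger values mean purchases bunched together with long pauses.
    """
    days = sorted(purchase_days)
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three purchases spanning more than one day")
    return pstdev(gaps) / mean(gaps)

# Two hypothetical customers: six purchases each over the same 50-day span.
steady = [0, 10, 20, 30, 40, 50]   # buys every 10 days
clumpy = [0, 1, 2, 3, 49, 50]      # a burst, a long pause, then more purchases

print(gap_variation(steady))
print(gap_variation(clumpy))
```

Both customers buy the same number of times over the same period, so a simple purchase count cannot tell them apart. Only the spacing of the purchases does, and that spacing is the kind of information a clumpiness measure is meant to capture.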
“Lumpy” Data and Store Sales
In our case, we were hired by a large retailer to help them better forecast what they called “lumpy demand” in some of their products. The standard forecasting algorithms all make certain assumptions about sales patterns, including a fairly regular cadence of purchasing, and many of this retailer’s products broke those assumptions. This was leading to forecasts that were not as accurate as expected or required.
In the retailer’s case, imagine a product like floor tiles. Not a single box of a given tile will sell for a number of weeks or months. However, when it does sell, many boxes will be sold to support a kitchen or bath remodel. Therefore, it is a tricky balance to figure out how much inventory to carry on hand and when to require a special order. There were a variety of factors to take into account including inventory carrying cost and the frequency and magnitude of the lumpy sales, among others.
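As a rough illustration of the frequency and magnitude factors mentioned above, the sketch below splits a weekly sales history into how often a sale happens and how large it is when it does. The numbers and the function name are hypothetical, and this is not the model my team built for the retailer; it only shows why a single average hides the lumpiness.

```python
def lumpy_profile(weekly_units):
    """Summarize intermittent sales as frequency and magnitude (illustrative only)."""
    selling_weeks = [units for units in weekly_units if units > 0]
    if not selling_weeks:
        raise ValueError("no sales in the history")
    sale_frequency = len(selling_weeks) / len(weekly_units)  # share of weeks with any sale
    avg_sale_size = sum(selling_weeks) / len(selling_weeks)  # units sold in a selling week
    return {
        "sale_frequency": sale_frequency,
        "avg_sale_size": avg_sale_size,
        "avg_weekly_demand": sale_frequency * avg_sale_size,
    }

# Hypothetical 12 weeks of sales for one floor tile: mostly zeros, two big orders.
tile_sales = [0, 0, 0, 40, 0, 0, 0, 0, 35, 0, 0, 0]

print(lumpy_profile(tile_sales))
```

In this made-up history the average weekly demand is only a few boxes, yet no week ever sells a few boxes: it is either zero or a full remodel’s worth. Keeping frequency and magnitude separate preserves that distinction for inventory decisions.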
You Say Clumpy, I Say Lumpy
As Eric Bradlow found for customer purchasing, we had also found that it was possible to account for the lumpy demand of products in a store and provide better forecasts. By accounting for the lumpiness in sales, the models were able to turn what had been noise in the data into information they could use. Eric Bradlow called it “clumpy”, we called it “lumpy”, but we were all describing the same principle and seeing the same general pattern!
This experience led me to consider where else it would be possible to identify the same fundamental patterns across different types of data and analytics. I believe it is more than an academic exercise. If a certain pattern has already been handled in another context, we can potentially save a great deal of effort when handling it in a new one.
In the end, I don’t care if you call it clumpy, lumpy, or something else. What I do care about is that you look for the pattern in your analytics efforts and make use of what has already been done to deal with it. Much like the Fibonacci sequence appears repeatedly in nature, there are recurring patterns in data that, once recognized, can improve both our analytics and our efficiency in creating them.