Selecting Big Data Sources for Predictive Analytics
Why is big data valuable? Because it’s big? Not really. In fact, very large datasets come with a very large set of headaches. Storing, maintaining, and managing them is not simple. And just having a lot of data doesn’t guarantee a lot of value.
The value of any dataset is determined by the quality of information you can extract from it. The key to value in big data is the detail. In other words, the value of big data is in the small stuff.
Every business has a rough idea of how many customers it has, how much they spend in total, and perhaps the average spend per client. But if all you know is the average, then what are you going to do, treat every customer as average?
If you could meet each client personally, get to know each person, you wouldn’t think of anyone as average. You’d know each person’s habits. You’d know that Maria Perez shops for herself each week, and buys a gift now and then, while Laura Carter shops for her family of five, and Lily Yu does not use your products herself, but often purchases them for her parents. You’d know the times of day when each person prefers to shop, whether each is a relaxed or hurried shopper, and the products that each person prefers.
Because you would know your customers as individuals, you would treat them as individuals. You’d let Maria know that you offer gift wrapping. You’d direct Laura to the economical family-size package of her favorite product. You’d make sure that Lily chose the container that was easy for her parents to open.
The promise of big data is in the details. You want the data to give you the information you’d get if you observed each customer in person. You want to know what each person does. You want to know how each responds to a variety of things – products offered, pricing, presentation, and so on.
You only realize value from data if you do something valuable with it. Remember that in a face-to-face customer interaction, you use what you know about the customer to make appropriate suggestions. The better the suggestions, the more likely the customer is to buy, return, and recommend you to others. The best data gives you information. Information creates opportunity. Value enters when you use the information to take meaningful action.
So, what does this tell you about selecting datasets for big data applications? Let’s look at the process.
First, you need a reason. What do you want to accomplish?
Then, you must know what kinds of action you have the option of taking. Can you offer new products, change the selection you offer, or must you work within the bounds of what you have now? Can you develop new ads, new offers?
Now, imagine that you have the same goal, and the same options, in a face-to-face situation. What information would you want? Knowing that, you are ready to look for data sources that meet your needs.
Here’s an example.
The problem: Your brick-and-mortar stores are crowded at peak hours, so crowded that customers often walk away in frustration, while at other times the stores are nearly empty. You are selling below your potential because shoppers abandon their carts and because you fail to attract customers throughout the day.
What do you want to accomplish? Increase revenue by better distributing activity throughout the day.
What kinds of action can you take? You have a marketing budget and the authority to send print and email advertisements and to make special offers using coupons and other promotions. You also have some influence over staff scheduling and checkout procedures.
Now, imagine that you are in the store, observing customers. What useful facts could you observe?
Some shoppers habitually shop at off-peak hours. Who are they? Others habitually shop at peak hours. Why? What are they buying? Are there also shoppers who vary the times when they come to the store? Who’s giving up and walking out? What had those people intended to buy? What can you learn about the reasons for each shopper’s behavior?
How might you put that information into action?
Perhaps you have found that some of the shoppers who come at busy times are simply not aware of the times when the store is not so busy. An information campaign might work for them. It could be as simple as posting signs in the store or adding that information to your regular circular.
Others might be coaxed into shifting their shopping to off-peak times if you made it worth their while, with a discount or special offer.
How about the people who already do all their shopping during the quiet hours? There’s no benefit to you in offering them incentives for what they are already doing. But maybe you can motivate them to buy more. If you know what they’re buying, you might offer a coupon for a product they haven’t tried or a deal on a larger quantity of a favorite.
You can’t speak to every customer personally. You can’t follow everyone around and observe. But you may have access to data that provides you with much of the same information. If you are dealing with many people, and lots of detail, you’re talking about big data, the kind of big data that fuels profitable predictive analytics.
Where can you find detailed information about the behavior of your customers and prospects? Start with the data you already own. Your transaction records are a treasure chest of behavioral data. You know when each transaction took place, what was purchased, and at what price. If you have a loyalty program or house credit card, then you also know who was buying. Your own data is more valuable to you than anything you could buy, and it’s already paid for. And this data is yours alone, giving you a unique information advantage over your competitors.
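To make this concrete for the crowded-store example, here is a minimal sketch of how transaction records tied to a loyalty card could sort shoppers into habitual peak, habitual off-peak, and mixed groups. Everything specific in it is an assumption for illustration: the record layout (customer id, time of purchase), the 5 pm to 8 pm peak window, the 80 percent cutoff, and the sample records themselves.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical peak window (5 pm to 8 pm); replace with your stores' real busy hours.
PEAK_HOURS = range(17, 20)


def segment_by_shopping_time(transactions, threshold=0.8):
    """Label each customer 'peak', 'off-peak', or 'mixed'.

    transactions: iterable of (customer_id, timestamp) pairs.
    threshold: share of visits needed to call a habit (an assumed cutoff).
    """
    visits = defaultdict(lambda: [0, 0])  # customer_id -> [peak visits, all visits]
    for customer_id, timestamp in transactions:
        counts = visits[customer_id]
        if timestamp.hour in PEAK_HOURS:
            counts[0] += 1
        counts[1] += 1

    segments = {}
    for customer_id, (peak, total) in visits.items():
        share = peak / total
        if share >= threshold:
            segments[customer_id] = "peak"
        elif share <= 1 - threshold:
            segments[customer_id] = "off-peak"
        else:
            segments[customer_id] = "mixed"
    return segments


# Made-up sample records: (loyalty card id, time of purchase)
sample = [
    ("maria", datetime(2024, 3, 4, 18, 5)),
    ("maria", datetime(2024, 3, 11, 17, 40)),
    ("laura", datetime(2024, 3, 5, 10, 15)),
    ("laura", datetime(2024, 3, 12, 9, 50)),
    ("lily", datetime(2024, 3, 6, 18, 30)),
    ("lily", datetime(2024, 3, 13, 11, 0)),
]
segments = segment_by_shopping_time(sample)
print(segments)
```

Each group then maps to one of the actions described earlier: information or incentives for the peak shoppers, and offers aimed at larger baskets for those who already shop in the quiet hours.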
If you do business online, get an understanding of the information collected in your web activity logs. These logs contain revealing details about shopping behavior, including details on the behavior of non-buyers.
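As a rough illustration of what non-buyer behavior looks like in a log, the sketch below lists the products each visitor looked at but did not purchase. The event layout (session id, action, product) and the action names "view" and "purchase" are assumptions for this example; real web logs vary widely and usually need parsing before they reach this shape.

```python
from collections import defaultdict


def viewed_but_not_bought(events):
    """Return {session_id: sorted products viewed but not purchased}.

    events: iterable of (session_id, action, product) tuples, where action
    is "view" or "purchase" (hypothetical names for this sketch).
    """
    viewed = defaultdict(set)
    bought = defaultdict(set)
    for session_id, action, product in events:
        if action == "view":
            viewed[session_id].add(product)
        elif action == "purchase":
            bought[session_id].add(product)

    result = {}
    for session_id, products in viewed.items():
        missed = products - bought[session_id]
        if missed:
            result[session_id] = sorted(missed)
    return result


# Made-up sample events
events = [
    ("s1", "view", "tea"),
    ("s1", "purchase", "tea"),
    ("s2", "view", "coffee"),
    ("s2", "view", "tea"),
    ("s3", "view", "mugs"),
    ("s3", "purchase", "mugs"),
    ("s3", "view", "tea"),
]
missed = viewed_but_not_bought(events)
print(missed)
```

Here session s2 is a pure non-buyer, while s3 bought one product and passed on another; both are behavior you would never see in transaction records alone.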
Only when you’ve thoroughly investigated the possibilities of your internal data sources should you look beyond your walls. Once you have a clear idea of what you want to know, and of the limits of your own data, you can shop selectively, and shrewdly, for information that fills in the blanks.
When you look for additional data, you’re still looking for the kind of information that you’d observe in person. Most often, businesses look for demographic information, and a lot of demographic data is available through government and commercial sources. The most valuable data for predictive analytics, though, is not demographic; it’s behavioral. Look beyond demographics for sources of information about purchasing, interests, and any type of behavior that might be relevant to your business issue.
So, when you’re looking for valuable big data sources, look for data that is relevant to a specific business problem, one that you have the means to address once you have the right information to guide you. Take advantage of internal sources first. Then move beyond your own walls for sources that add depth.
Like what you read here? Read more from Meta Brown in "Big Data, Mining, and Analytics: Components of Strategic Decision Making" (forthcoming from CRC Press) and "The IBM SPSS Modeler Cookbook". She has introduced and expanded the use of analytics in offices and factories across the US and beyond. Got a question about promoting analytics? Or on using analytics? Just want to say hello? Email ...