1. Online Survey Data Quality
This may be the topic that has received the most attention from industry players, with various efforts underway by organizations such as CASRO, The ARF, and ESOMAR to define quality standards and measure the quality of data obtained through online surveys. For example, The ARF fielded a major study comparing a number of Internet panels with telephone research.
Without the kind of sampling frame for Internet surveys that enabled RDD (random digit dialing) methods, I don’t think the industry is anywhere near a “solution” to the problems of representativeness and reliability (getting the same results from repeated surveys, at least on measures that should be stable over the interval). Moreover, it appears that river sampling may become more prevalent, as companies that relied primarily on respondent panels or client-provided lists find it more and more difficult to satisfy study quota requirements and to source panelists from portals such as Yahoo!.
Still, it looks like the industry is moving towards some agreement on best practices for online research, and Joel Rubinson, Chief Research Officer of the ARF, has summarized some of these in a recent post.
2. Respondent Pool “Spiral Down”
Declining survey response rates have been a problem for market researchers for the last couple of decades, first in telephone-based research, and now with online surveys as well. In addition, as online survey administration has increased, the addressable or covered share of the population has shrunk.
I observed something like this when I worked in database marketing. There’s enough stability in the characteristics of people who respond to direct marketing that, after a while, predictive targeting models converge on the same set of individuals (given that they use the same commercially available data sources, and so forth). There’s a paradox in this: as a marketing program gets more efficient (meaning that those unlikely to respond are dropped from mailing lists), the remaining individuals get so many offers that response rates for any one offer may decrease.
3. Research 2.0
Customer-generated content (CGC) on the web has been with us for a while. Still, it’s not clear to what extent the gobs of data generated by customers without direct involvement or intervention by marketers can be tamed. Some promising developments, such as the “Facebook National Happiness Index” (see my post on October 14, 2009), extract high-level, aggregate information from content posted on the web.
While much of the Research 2.0 focus has been content, network structure and communication patterns may provide even more powerful insights. It’s fairly easy to summarize activity at individual nodes–number of page views, for example–but much more difficult to map connections, both between customers/users, and web pages. Of course, search engines rely on such data, but that technology has not quite made its way into market research, as far as I can tell.
4. The “Cinematch” Effect
Cinematch is the collaborative filtering system that Netflix uses to make recommendations to customers. Cinematch is quite sophisticated (at least compared to some collaborative filtering systems) and incorporates predictive modeling based on customers’ ratings of films. In one of the most publicized examples of crowdsourcing, Netflix sponsored a $1 million prize, to be awarded to the person or team that could improve the predictive accuracy of Cinematch by a certain amount (for more on the prize, see my posts on August 4 and August 6, 2009).
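To make the idea of collaborative filtering concrete, here is a minimal Python sketch of a user-based approach: predict a customer's rating of a film from the ratings of other customers, weighted by how similar their tastes are. This is not Cinematch's actual algorithm, which is not described here. The customers, films, ratings, and function names are all hypothetical.

```python
from math import sqrt

# Hypothetical 1-5 star ratings; customers and films are made up for illustration.
ratings = {
    "ann": {"film_a": 5, "film_b": 4, "film_c": 1},
    "bob": {"film_a": 4, "film_b": 5, "film_d": 2},
    "cat": {"film_a": 1, "film_c": 5, "film_d": 4},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the films both customers rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[f] * v[f] for f in shared)
    norm_u = sqrt(sum(u[f] ** 2 for f in shared))
    norm_v = sqrt(sum(v[f] ** 2 for f in shared))
    return dot / (norm_u * norm_v)

def predict(customer, film):
    """Similarity-weighted average of other customers' ratings of `film`."""
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == customer or film not in their_ratings:
            continue
        weight = cosine_similarity(ratings[customer], their_ratings)
        num += weight * their_ratings[film]
        den += weight
    return num / den if den else None  # None when nobody else rated the film

# Ann has not rated film_d; her tastes look more like Bob's than Cat's,
# so the prediction is pulled toward Bob's rating.
print(predict("ann", "film_d"))
```

Production systems add a great deal on top of this (rating normalization, model-based predictions, blending of many models), which is where the predictive modeling skills of the prize contestants come in.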
As far as I can tell, the contestants all come from the world of computing. While computational and algorithmic approaches have been important in market research and customer knowledge for quite some time, the Netflix prize may encourage more programmers and computer scientists to address problems that, traditionally, have fallen to social scientists. I think we can expect to see even more new entrants into MR that are driven primarily by technology solutions (text mining and prediction markets come to mind).
5. Bayesian Thinking
Bayesian inference and modeling methods that incorporate Bayesian estimation offer new solutions to problems that have plagued market researchers (and other social scientists) for years. Bayesian thinking involves more than using methods like the Metropolis-Hastings algorithm to model heterogeneity in the kind of hierarchical (“HB”) models that have found application in choice-based conjoint analysis and other market research techniques. Some enterprising practitioners like in4mation insights are using HB methods for marketing mix models, as one example. I believe that some of the problems we’re encountering with the reliability and representativeness of online samples might be addressed with Bayesian methods.
In my view Bayesian thinking leads to model-driven rather than data-driven analyses. Bayesian thinking facilitates the development and testing of different conceptual models of the behaviors of interest. Bayesian thinking also maps more closely to managerial decision processes than does “classical” or “frequentist” thinking. Managers, after all, are trying to make the best “bets” with the resources they have available, and it’s fairly easy to incorporate possible gains and losses into a Bayesian decision framework.
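As a rough illustration of folding gains and losses into a Bayesian decision, the Python sketch below updates a prior belief about a purchase rate with test-market data (a Beta-Binomial model) and then evaluates a launch decision against the posterior. Every number here (prior, test result, market size, margin, launch cost) is a hypothetical assumption chosen for the example.

```python
import random

# Hypothetical prior: purchase rate believed to be around 20% (Beta(2, 8)).
prior_a, prior_b = 2, 8

# Hypothetical test-market result: 18 buyers out of 60 respondents.
buyers, sample_size = 18, 60

# Conjugate update: the posterior is Beta(prior_a + buyers, prior_b + non-buyers).
post_a = prior_a + buyers
post_b = prior_b + (sample_size - buyers)

# Hypothetical economics of the launch decision.
market_size = 10_000      # prospects reached by a launch
margin = 40.0             # profit per purchase
launch_cost = 100_000.0   # fixed cost of launching

def profit(rate):
    """Profit from launching if the true purchase rate is `rate`."""
    return rate * market_size * margin - launch_cost

# Draw purchase rates from the posterior and score the "bet" on each draw.
rng = random.Random(42)
draws = [rng.betavariate(post_a, post_b) for _ in range(20_000)]
expected_profit = sum(profit(r) for r in draws) / len(draws)
prob_loss = sum(profit(r) < 0 for r in draws) / len(draws)

print(f"Expected profit of launching: {expected_profit:,.0f}")
print(f"Probability the launch loses money: {prob_loss:.2f}")
```

The manager sees both the expected payoff and the chance of losing money, which is closer to how the decision is actually framed than a significance test on the purchase rate.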
6. Dynamic Simulation
Simulation is not new to MR. “Choice simulators,” for example, are used with conjoint studies to understand (and predict) the responsiveness of a market to changes in product features and pricing. Regression-based “key driver models” are often incorporated into decision tools that permit “What if?” analysis, and Monte Carlo simulation methods also are employed to aid in decision making. For the most part, conjoint-based choice simulations and similar regression-based decision tools are static; the results are deterministic given the inputs and the model coefficients. The questions answered by such simulators tend to be of the form, “All other things being equal, what happens if I change this one input?”
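A static choice simulator of the kind described above can be sketched in a few lines of Python. This version uses a simple logit "share of preference" rule on aggregate part-worths. Real simulators typically work from individual-level utilities, and the attributes, levels, and part-worth values below are hypothetical.

```python
from math import exp

# Hypothetical part-worth utilities, as might be estimated in a conjoint study.
partworths = {
    "brand": {"A": 0.6, "B": 0.0},
    "price": {"$10": 0.8, "$12": 0.3, "$15": -0.5},
}

def utility(product):
    """Total utility: the sum of the part-worths for the product's levels."""
    return sum(partworths[attr][level] for attr, level in product.items())

def shares(products):
    """Logit share of preference: exp(utility) divided by the sum over products."""
    exp_u = {name: exp(utility(p)) for name, p in products.items()}
    total = sum(exp_u.values())
    return {name: e / total for name, e in exp_u.items()}

base = {
    "ours": {"brand": "A", "price": "$12"},
    "rival": {"brand": "B", "price": "$10"},
}
# "What if?" scenario: raise our price, all other things being equal.
what_if = {
    "ours": {"brand": "A", "price": "$15"},
    "rival": base["rival"],
}

base_shares = shares(base)
what_if_shares = shares(what_if)
print(base_shares)
print(what_if_shares)
```

Running it twice with the same inputs gives the same shares, which is what "static" means here: nothing in the model varies except the one input being changed.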
Spreadsheet-based Monte Carlo simulation tools like @RISK use random number generators to explore the potential variation in outcomes when uncertain inputs are treated as random variables. Some conjoint-based choice simulators incorporate a degree of randomization (e.g., “randomized first choice” selection methods).
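The kind of Monte Carlo "What if?" analysis that spreadsheet add-ins perform can be approximated in plain Python. In this sketch the uncertain inputs to a revenue forecast are drawn from distributions rather than fixed at single values. The distributions and their parameters are hypothetical assumptions, and this is not a description of how any particular commercial tool works internally.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

def simulate_once():
    """One trial: draw each uncertain input, then compute revenue."""
    # Hypothetical input distributions for illustration only.
    market_size = rng.triangular(80_000, 120_000, 100_000)  # low, high, mode
    share = min(max(rng.gauss(0.12, 0.02), 0.0), 1.0)       # clipped to [0, 1]
    price = rng.uniform(9.0, 11.0)
    return market_size * share * price

# Repeat the trial many times and summarize the distribution of outcomes.
results = sorted(simulate_once() for _ in range(10_000))
mean_revenue = statistics.mean(results)
p5, p95 = results[500], results[9500]  # approximate 5th and 95th percentiles

print(f"Mean revenue: {mean_revenue:,.0f}")
print(f"90% of trials fall between {p5:,.0f} and {p95:,.0f}")
```

Instead of a single deterministic answer, the output is a range of plausible outcomes, which makes the uncertainty in the inputs visible to the decision maker.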
Over the past few years, some market researchers have begun to explore the use of agent-based simulation (ABS) in conjunction with more traditional simulation models. As one example, I modified a choice simulator to take into account word-of-mouth by incorporating agent-based elements.
Agent-based simulation has developed in the social sciences as a way to study complex social systems. A parallel development in computer science applies ABS to a wide variety of operational problems (such as creating search bots to find the lowest price for something on the Internet). And a computer-generated battle in The Lord of the Rings: The Two Towers employed agent-based models to create realistic motion and achieve the visual outcome that director Peter Jackson sought.
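To give a flavor of how agent-based elements can add word-of-mouth to a simulation, here is a minimal Python sketch. It is not the model mentioned above. It assumes a hypothetical population in which each agent has a few randomly assigned friends, and an agent's chance of adopting a product in a given period rises with the number of friends who have already adopted. All parameter names and values are illustrative assumptions.

```python
import random

def run_word_of_mouth(n_agents=200, n_friends=4, base_prob=0.01,
                      wom_boost=0.05, periods=30, seed=1):
    """Return the cumulative number of adopters after each period."""
    rng = random.Random(seed)
    # Each agent listens to a fixed, randomly chosen set of friends.
    friends = [rng.sample([j for j in range(n_agents) if j != i], n_friends)
               for i in range(n_agents)]
    adopted = [False] * n_agents
    history = []
    for _ in range(periods):
        new_adopters = []
        for i in range(n_agents):
            if adopted[i]:
                continue
            # Word-of-mouth: each adopting friend raises the adoption probability.
            n_adopted_friends = sum(adopted[j] for j in friends[i])
            p = min(1.0, base_prob + wom_boost * n_adopted_friends)
            if rng.random() < p:
                new_adopters.append(i)
        # Update after the loop so everyone reacts to last period's state.
        for i in new_adopters:
            adopted[i] = True
        history.append(sum(adopted))
    return history

with_wom = run_word_of_mouth()
no_wom = run_word_of_mouth(wom_boost=0.0)
print("Adopters with word-of-mouth:   ", with_wom[-1])
print("Adopters without word-of-mouth:", no_wom[-1])
```

Unlike the static simulators, the outcome here emerges from interactions among agents over time, so the same starting inputs can produce different adoption paths depending on who adopts early and who they are connected to.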
7. R
No doubt SAS and SPSS will continue to dominate the analysis of market research data, but the open source statistical package ‘R’ has made enough inroads into major corporations to warrant coverage in the New York Times. The bayesm package contains programs for several models for MR (see Bayesian Statistics and Marketing by Peter Rossi, Greg Allenby and Robert McCulloch for more information).
8. Other Cool Stuff
The Advanced Research Techniques Forum, sponsored by the American Marketing Association and held annually in June, and the Sawtooth Software Conference (held every 18 months) provide a look at new developments in analysis and data generation. One of the neatest things I saw at the recent Sawtooth conference (March 2009) was the use of game playing for data collection. To be effective, market research has to engage consumers and encourage them to “tell the truth” under conditions that may be very unrealistic. Lynd Bacon, in a paper co-authored with Ashwin Sridhar (“Playing for Fun and Profit: Serious Games for Marketing Decision-Making”), showed how “purposive games” can be used to elicit information from consumers.
Copyright 2010 by David G. Bakken. All rights reserved.