The surest way to tell that a phrase has entered broad usage is to apply the cliché test: if you leave the last word(s) blank and most readers know how to complete it, the phrase is best used carefully, if at all, in good writing. In broad swaths of the online literate, the phrase “wisdom of _____” will not refer to elders, the East, science, or abstinence. Only four years after James Surowiecki’s book of the same name, it’s an article of Internet faith that groups of people can give better answers than will individuals.
It’s timely to revisit the notion given a powerful paradox: many observers — including yours truly, who raved in this newsletter about the Iowa Electronic Markets (IEM) after trekking to Iowa City nine years ago — see great potential in idea markets at the same time that financial markets are proving eminently fallible. In concrete terms, the number of businesses built on markets as information-processing mechanisms is soaring even as the number of U.S. investment banks, home to tens of thousands of well-supported best and brightest, has shrunk from five to two in a matter of months after more than 150 years in business.
It seems clear that U.S. financial markets are suffering in the aftermath of an inflated mortgage-products market, but it turns out that financial scholars can’t agree on what a bubble is. According to Cornell’s Maureen O’Hara in The Review of Financial Studies (February 2008), the “less controversial” approach is to follow one scholar’s mild assertion that “bubbles are typically associated with dramatic asset price increases, followed by a collapse.” Leaving aside the question of what constitutes a collapse, the issue for our purposes is the potential for bubble equivalents in information markets.
The fate of the many information market startups, some of which we will discuss, will unfold over the next few years. In under two months we will see whether IEM maintains its record of performance in tracking presidential election results. Come February, traders on the Hollywood Stock Exchange (owned by the finance firm Cantor Fitzgerald) will try to improve on its 10-year average of 82% correct picks across the top eight Oscar award categories. Specific companies aside, it’s worth discussing some of the larger issues.
1) How do crowds express wisdom?
Several mechanisms come to mind:
-Voting, whether officially in the process of politics, or unofficially with product reviews, Digg or similar feedback (“Was this review helpful?”). All of these actions are voluntary and unsolicited, so the resulting samples are self-selected rather than statistically representative.
-Betting, the putting of real (as at IEM) or imagined (at HSX) currency where one’s mouth is. Given the right kind of topic and the right kind of crowd, this process can be extremely powerful, albeit with constrained questions.
-Surveys, constructed with elaborate statistical tools and aimed at carefully framed questions. Interaction among respondents is usually low, making surveys useful in collecting independent opinions.
-Convened feedback. This catch-all includes tagging, blogs and comments, message boards, trackbacks, wikis, and similar vehicles. Once again, the action is voluntary, but the field of play is unconstrained. Compared to the other three categories, convened feedback can contain substantial noise, but its free form allows topics to emerge from the group rather than from the pollster, market maker, or publisher.
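At bottom, the statistical engine behind several of these mechanisms is nothing more exotic than averaging independent estimates, as in the classic weight-guessing experiments. The sketch below uses entirely made-up numbers — no particular site’s method — to show why the average of many independent guesses beats the typical guesser:

```python
import random

random.seed(1)
TRUE_WEIGHT = 1198                      # hypothetical "true" answer, in pounds

# Each participant guesses independently, with his or her own error.
guesses = [random.gauss(TRUE_WEIGHT, 150) for _ in range(500)]

crowd_estimate = sum(guesses) / len(guesses)
avg_individual_error = sum(abs(g - TRUE_WEIGHT) for g in guesses) / len(guesses)

print(f"crowd estimate:       {crowd_estimate:.0f}")
print(f"crowd error:          {abs(crowd_estimate - TRUE_WEIGHT):.0f}")
print(f"avg individual error: {avg_individual_error:.0f}")
# With independent errors, the crowd's error is far smaller than the
# typical individual's -- the statistical core of "wisdom of crowds."
```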
2) What kind of questions best lend themselves to group wisdom?
On this topic Surowiecki is direct: “Groups are only smart when there is a balance between the information that everyone in the group shares and the information that each of the members of the group holds privately.” Conversely, “what happens when [a] bubble bursts is that the expectations converge.” (pp. 255-6)
A great example of this effect can be found at Metafilter. A year ago the question was posed, “What single book is the best introduction to your field (or specialization within your field) for laypeople?” Hundreds of people replied, in areas from homicide forensics to astrophysics. The results are truly priceless, a distillation of centuries of experience into a modest library.
Cass Sunstein, a University of Chicago law professor, agrees in his book Infotopia (2006). He states that “This is the most fundamental limitation of prediction markets: They cannot work well unless investors have dispersed information that can be aggregated.” (pp. 136-7) Elsewhere, in a blog post, he notes that in an informal experiment with U of C law professors, the crowd came extremely close to the weight of the horse that won the Kentucky Derby, did “pretty badly” on the number of lines in Sophocles’ Antigone, and performed “horrendously” when asked the number of Supreme Court invalidations of state and federal law. He speculates that markets benefit from a self-selection bias: “participants have strong incentives to be right, and won’t participate unless they think they have something to gain.”
The best questions for prediction markets, then, involve issues about which people have formed independent judgments and on which they are willing to stake a financial and/or reputational investment. It may also be that topics cannot lie too close to participants’ professional interests, as the fate of the investment banks would suggest and as the accuracy of the HSX Oscar predictions is consistent with.
3) Where is error introduced?
The French political philosopher Condorcet (1743-1794) originally formulated the jury theorem that explains the wisdom of groups: when each individual is more than 50% likely to be right, a majority vote is more likely to be right than any single member, and ever more so as the group grows. Bad things happen when people are less than 50% likely to be right, however, and crowds then amplify error.
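The arithmetic is easy to demonstrate. The sketch below — illustrative numbers only, not Condorcet’s own formulation — computes the probability that a simple majority of n independent voters is right when each voter is right with probability p:

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability a simple majority of n independent voters is right,
    when each voter is right with probability p (n odd)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

for p in (0.55, 0.45):
    print(f"p = {p}:",
          [round(majority_correct(n, p), 3) for n in (1, 11, 101, 1001)])
# p = 0.55: the majority verdict approaches certainty as the group grows;
# p = 0.45: the same mechanism drives the group's accuracy toward zero.
```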
Numerous experiments have shown that group averages suffer when participants start listening to outside authorities or to each other. What Sunstein called “dispersed information” and what Surowiecki contrasts to mob behavior — independence — is more and more difficult to find. Many of the startups in idea markets include chat features — they are, after all, often social networking plays — making for yet another category of echo chamber.
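One way to put a rough number on the cost of lost independence: if every pair of opinions shares the same correlation (a simplifying assumption, and my figures rather than Surowiecki’s or Sunstein’s), the error of the crowd’s average stops shrinking no matter how many members join:

```python
def crowd_error(sigma: float, rho: float, n: int) -> float:
    """Standard error of the average of n estimates, each with standard
    deviation sigma and pairwise correlation rho."""
    return (sigma ** 2 * ((1 - rho) / n + rho)) ** 0.5

sigma = 100.0                            # each individual's typical error
for rho in (0.0, 0.3, 0.7):
    errors = [round(crowd_error(sigma, rho, n), 1) for n in (10, 100, 10_000)]
    print(f"rho = {rho}: crowd error at n = 10, 100, 10,000 -> {errors}")
# rho = 0.0: the crowd's error keeps shrinking as the group grows;
# rho = 0.7: once opinions echo each other, adding members barely helps.
```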
Another kind of error comes when predictions ignore randomness. Particularly in thickly traded markets with many actors, the complexity of a given market can expose participants to phenomena for which there is no logical explanation — even though many will be offered. As Nassim Nicholas Taleb pointed out in The Black Swan (2007), newswire reports on market movement routinely and fallaciously link events and price changes: it’s not uncommon to see the equivalent of both “Dow falls on higher oil prices” and “Dow falls on lower oil prices” during the same day.
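A toy simulation with purely synthetic data makes the point: when two series move independently at random, every combination of directions still turns up, so a plausible-sounding headline is always available:

```python
import random
from collections import Counter

random.seed(2)
# 250 "trading days" of independent, random up/down moves for two series.
days = [(random.choice((-1, 1)), random.choice((-1, 1))) for _ in range(250)]

headlines = Counter()
for dow, oil in days:
    headlines[("Dow falls" if dow < 0 else "Dow rises",
               "lower" if oil < 0 else "higher")] += 1

for (dow_move, oil_move), count in sorted(headlines.items()):
    print(f'"{dow_move} on {oil_move} oil prices": {count} days')
# All four combinations show up in roughly equal numbers, even though the
# two series were generated with no connection to each other.
```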
Varieties of market experience
The following are just some of many businesses seeking to monetize prediction markets:
–Newsfutures makes a B2B play, building internal prediction markets for the likes of Eli Lilly, the Department of Defense, and Yahoo.
–Spigit sells enterprise software to support internal innovation and external customer interaction. Communities are formed to collect and evaluate new ideas.
–Intrade is an Irish firm that trades in real money (with a play money sandbox) applied to questions in politics, business (predictions on market share are common), entertainment, and other areas. The business model is built on small transaction fees on every trade.
–Hubdub, from Edinburgh, trades in play money but prominently features leaderboards, which intensify user involvement. Topics under discussion are limited only by users’ imaginations and curiosity, as any member can propose a question. The current leader, orlin, has done well on European football but has also advanced wide-ranging predictions, including one on whether the Large Hadron Collider will discover the Higgs boson within a year. He or she has made nearly 6,000 predictions.
Apart from social networking plays and prediction markets, seemingly trivial stakes can enforce commitments elsewhere. Cass Sunstein’s more recent book, Nudge (2008), co-authored with the Chicago behavioral economist Richard Thaler, points to the value of commitment for such personal behaviors as weight loss or project fulfillment. For example, a Ph.D. candidate, already hired as a lecturer at a substantial discount from an assistant professor’s salary, was behind on his dissertation. Thaler made him write a $100 check at the beginning of every month a chapter was due. If the chapter came in on time, the check was ripped up. If the work came in late, the $100 went into a fund for a party to which the candidate would not be invited. The incentive worked, notwithstanding the fact that $400 or $500 was a tiny portion of the salary differential at stake. A Yale economics professor who lost weight under a similar game has co-founded stickK.com, an ad-funded online business designed to institutionalize similar “Commitment Contracts.”
Futures
It’s clear that crowds can in fact be smart when the members don’t listen to each other too closely. It’s also clear that financial and/or reputational investment is connected to both good predictions and fulfilled commitments. Several other issues are less obvious. Is there a novelty effect with prediction markets? Will clever people and/or software devise ways to game the system, similar to short-selling in finance or sniping on eBay? What do prediction bubbles look like, and what are their implications? When are crowds good at answering questions and when, if ever, are they good at posing them? (Note that on most markets, questions are posed by individuals, not by groups.) Can we reliably predict whether a given group will predict wisely?
At a larger level, how do online information markets relate to older forms of group expression, particularly voting? The U.S.’s filtration of a state’s individual votes through the winner-take-all Electoral College is already controversial (only Maine and Nebraska currently depart from winner-take-all, allotting most of their votes by congressional district), and so-called National Popular Vote legislation has passed or is pending in states with 274 electoral votes – enough to overturn the current process. Will some form of prediction market or other crowd wisdom accelerate or obviate this potential change?
Any process that can, under the right circumstances, deliver such powerful results will surely have unintended consequences. The controversy over John Poindexter’s Futures Markets Applied to Prediction (FutureMAP) program, which was canceled by DARPA in July 2003, will certainly not be the last of the tricky issues revolving around this class of tools.