The Luck and Skill of Scrabble

Scrabble is a game that involves both skill and luck. There’s skill in knowing the words you can play and — especially — the most advantageous ways to play them. This allows players with a wide vocabulary to excel from the get-go, while others rely on a word unscrambler to up their odds of winning. One might argue that even using a tool to automatically unscramble words gives players the chance to learn and improve their own skills for the next game. But there’s also luck in the tiles you draw randomly from the bag: get saddled with a rack containing four I’s and there’s usually not much you can do. That’s why professional Scrabble tournaments are decided by playing multiple games between each pair of players. Tournaments do this to average out the variability in tile draws between players, to make the deciding factor skill rather than luck.

But how much does luck affect a typical Scrabble game? Andrew C Thomas, a professor in Statistics at Carnegie Mellon University, came up with an ingenious idea to test this. (Thomas’s research will soon be published in a paper, a draft of which is now available on arXiv.org.) What if we could observe a game between a couple of equally matched players, where we “fix” the luck factor by determining the tiles each player gets in advance? Then, we can eliminate the “skill” factor by having those players re-play the fixed game many times: they’ll get the same letters, but might make different strategic plays during the game. Each player’s scores will vary over the series of games, but over many games, the player with the better “luck” (in other words, the better pre-determined sequence of letters) will have the higher average score.

There are two problems with this approach. First of all, it’s not practical to get expert players to play the same game over and over with consistent results. Thomas solves this by using open-source Scrabble AI software instead. (To avoid the robot players making exactly the same moves each time, he adds a small random factor to the AI decision-making process, by weighting the future value of any given move up or down a point or two.)

The second problem is more of a mechanical one: how can you guarantee that each robot player will get the same sequence of letters each time? In Scrabble, each player may play anywhere between one and seven tiles each move (with a 50-point “bingo” bonus for all seven), or play none at all and exchange some tiles for a new set randomly selected from the pool. The scheme Thomas comes up with to address this is very clever: rather than have each player draw from the same sequence, he pre-generates one sequence of tiles and has each player draw from opposite ends, as shown in this diagram from his paper:

In this diagram, Player 1 drew seven tiles from the left of the sequence to create the rack; Player 2 drew from the right. This way, each player gets the same sequence of tiles in the repeated games, regardless of the number of tiles played each move. Or nearly so: if Player 1 plays many long words in a game, he may access a tile toward the right of the sequence he doesn’t usually get. And tile exchanges, which are mixed in with the reserve sequence, add more variability. But in general, the more a letter is towards the left of the sequence, the more likely Player 1 will get to play it, and vice versa.
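The two-ended draw can be sketched in a few lines of Python. This is only an illustration of the idea, not Thomas’s code (his simulation is written in R): the names make_sequence and TwoEndedBag are made up for this example, the tile counts are the standard English-language distribution with "?" standing for a blank, and tile exchanges are left out entirely.

```python
import random
from collections import deque

# Standard English-language Scrabble tile distribution (100 tiles, "?" = blank).
TILE_COUNTS = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1, "?": 2,
}


def make_sequence(seed):
    """Return one reproducible shuffled ordering of all 100 tiles."""
    rng = random.Random(seed)
    tiles = [tile for tile, count in TILE_COUNTS.items() for _ in range(count)]
    rng.shuffle(tiles)
    return tiles


class TwoEndedBag:
    """Hypothetical helper: Player 1 draws from the left, Player 2 from the right."""

    def __init__(self, sequence):
        self._tiles = deque(sequence)

    def draw(self, player, count):
        """Draw up to `count` tiles from the given player's end of the sequence."""
        drawn = []
        for _ in range(count):
            if not self._tiles:
                break  # the sequence is exhausted
            if player == 1:
                drawn.append(self._tiles.popleft())
            else:
                drawn.append(self._tiles.pop())
        return drawn

    def __len__(self):
        return len(self._tiles)


if __name__ == "__main__":
    bag = TwoEndedBag(make_sequence(seed=42))
    print("Player 1 rack:", bag.draw(1, 7))
    print("Player 2 rack:", bag.draw(2, 7))
    print("Tiles left in the sequence:", len(bag))
```

Because the sequence is fixed by the seed, replaying with the same seed hands each player the same opening rack every time, whatever happens on the board in between.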

Again, in a real Scrabble game it would be impractical to lay out the tiles in a pre-determined sequence like this (especially without the players seeing them!). Thomas solves this problem by simulating the games in software: code in the R programming language simulates the sequence of tiles, hands new tiles to the AI players, and then observes their final score. 100 simulated matches are played for each sequence: the average score difference between Player 1 and Player 2 is then a measure of how “lucky” that sequence is for Player 1. And Thomas repeats this process for 10,000 different random sequences, which allows him to do statistical analysis in R on how the tile sequence (or “luck”) affects Scrabble games on average. For example, Thomas noted that most sequences where the Q was towards the left led to a point advantage for Player 2, and so in that sense Q is an “unlucky” tile to get.
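The “replay and average” step might look something like the following Python sketch. It is an assumption-laden stand-in rather than Thomas’s R code: estimate_luck and stub_game are invented names, and stub_game is a meaningless placeholder (one point per tile in each player’s half of the sequence, plus a point or two of noise) that takes the place of the real Scrabble AI. Only the averaging logic mirrors the description above.

```python
import random
from statistics import mean


def stub_game(sequence, rng):
    """Placeholder for one AI-vs-AI game; says nothing about real Scrabble.

    Each player gets one point per tile in their half of the sequence,
    nudged by up to two points of random noise.
    """
    half = len(sequence) // 2
    score1 = len(sequence[:half]) + rng.randint(-2, 2)
    score2 = len(sequence[half:]) + rng.randint(-2, 2)
    return score1, score2


def estimate_luck(sequence, play_game, n_games=100, seed=0):
    """Average (Player 1 score - Player 2 score) over repeated replays.

    `play_game(sequence, rng)` must return a (score1, score2) pair.
    A positive result means the sequence favours Player 1.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_games):
        score1, score2 = play_game(sequence, rng)
        diffs.append(score1 - score2)
    return mean(diffs)


if __name__ == "__main__":
    sequence = list("SCRABBLE")
    print("Estimated luck for Player 1:", estimate_luck(sequence, stub_game))
```

Running estimate_luck over many different random sequences would give one luck estimate per sequence, which is the kind of data set the statistical analysis then works on.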

Thomas takes this analysis even further: the point in the game at which you draw a high-value “power tile” like a Q or a Z also makes a difference. Getting a Q early in the game when there are few options to play it is bad; getting it later in the game when the board has more options is better; letting your opponent draw it is best. These scenarios are reflected in where the Q falls in the initial sequence (used by both players): towards the left, in the middle, or on the right. Using this method, Thomas maps the average player scores for Q, J, X, and Z depending on where in the sequence they fall:

To the left of the chart, Player 1 has each tile early in the game; towards the right, Player 2 has it. In contrast to the Q, the X is generally beneficial to the player who draws it. Using these techniques, Thomas reaches the following conclusions about tiles:

The blank is worth about 30 points to a good player, mainly by making 50-point “bingo” plays possible.
Each S is worth about 10 points to the player who draws it.
The Q is a burden to whichever player receives it, effectively serving as a 5-point penalty: it reduces bingo opportunities, since it needs either a U or a blank for a chance at a bingo and the 50-point bonus.
The J is essentially neutral pointwise.
The X and the Z are each worth about 3-5 extra points to the player who receives them. Their difficulty in playing in bingoes is mitigated by their usefulness in other short words.
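
As a back-of-the-envelope illustration, the figures above can be turned into a rough tally of how lucky a set of drawn tiles was. This Python sketch rests on two assumptions that are not from the paper: that the per-tile values simply add up, and that X and Z are each worth 4 points (the midpoint of the 3-5 range). The name rough_luck is made up for this example, and "?" stands for a blank.

```python
# Approximate point value of drawing each tile, taken from the list above.
# X and Z use 4, an assumed midpoint of the quoted 3-5 point range.
TILE_LUCK_POINTS = {"?": 30, "S": 10, "Q": -5, "J": 0, "X": 4, "Z": 4}


def rough_luck(tiles_drawn):
    """Sum the approximate luck value of the tiles a player drew.

    Tiles not covered by the list above are counted as 0. Treating the
    values as additive is a simplifying assumption.
    """
    return sum(TILE_LUCK_POINTS.get(tile, 0) for tile in tiles_drawn)


if __name__ == "__main__":
    print(rough_luck("?SQ"))  # 30 + 10 - 5 = 35
    print(rough_luck("XZJ"))  # 4 + 4 + 0 = 8
```

By this crude measure, a player who draws a blank and an S comes out well ahead even if the Q comes along with them.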

(Of course, all of these conclusions will depend on exactly which Scrabble dictionary you’re using: there are a lot more words available to play from the larger SOWPODS word list, but I presume this analysis is based on the official TWL dictionary used in American Scrabble games. I’d love to see the effect on this chart of using British rules.)

Thomas also finds that the player who goes first generally has an advantage, to the tune of about 14 points. So if you’re a gracious Scrabble player, let your opponent go first.

AC Thomas: Statistics and Scrabble, Together At Last