As just about every statistics student can attest, Simpson’s Paradox (a statistical phenomenon where an apparent trend is reversed when you look at subgroups) is notoriously hard to explain. You can look at examples, such as the fact that US wages are rising overall but dropping within every educational group, but those don’t really help to explain the paradox.
It’s not really a paradox at all, though: it is simply what happens when the analysis fails to account for the different rates at which members of the study join the subgroups. To demonstrate this effect, the Visualizing Urban Data ideaLab at UC Berkeley has created an interactive tool that simulates a data set based on a famous example: the rates at which men and women are accepted to two university departments. One is an ‘Easy’ department that accepts most applicants: 80% of women and 62% of men. The other is a ‘Hard’ department that accepts 27% of women and 26% of men. Even though both departments admit women at a higher rate than men, if a much larger share of women than men apply to the hard department (a rate you can control with the sliders), the overall acceptance rate for women will be lower than that for men.
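The arithmetic behind the reversal can be sketched in a few lines of Python. Each gender's overall acceptance rate is a weighted average of the two department rates, with the weights given by where that gender applies. The department rates below are the ones quoted above; the application shares (80% of women and 20% of men applying to the hard department) and the `overall_rate` helper are made-up assumptions for illustration, not values or code from the ideaLab tool.

```python
# Acceptance rates quoted in the article, by department and gender.
ACCEPT = {
    "easy": {"women": 0.80, "men": 0.62},
    "hard": {"women": 0.27, "men": 0.26},
}


def overall_rate(gender, share_hard):
    """Overall acceptance rate for `gender` when a fraction `share_hard`
    of that gender's applicants apply to the hard department."""
    share_easy = 1.0 - share_hard
    return (share_easy * ACCEPT["easy"][gender]
            + share_hard * ACCEPT["hard"][gender])


# Hypothetical application mixes (assumptions, not real data):
# 80% of women but only 20% of men apply to the hard department.
women = overall_rate("women", share_hard=0.8)
men = overall_rate("men", share_hard=0.2)

print(f"women: {women:.3f}, men: {men:.3f}")  # women: 0.376, men: 0.548
```

With these assumed shares, women are accepted at a higher rate in each department but at a lower rate overall. Giving both genders the same `share_hard` makes the reversal disappear, which is the sense in which the application mix, not the departments, drives the result.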
In the real-life case, societal pressures led women and men to apply at different rates to departments of varying difficulty. From the paper:
Women are shunted by their socialization and education toward fields of graduate study that are generally more crowded, less productive of completed degrees, and less well funded, and that frequently offer poorer professional employment prospects.
To learn more about Simpson’s Paradox and try out the interactive tool, follow the link below.
Visualizing Urban Data ideaLab: Simpson’s Paradox