Simpson's Paradox

Simpson's paradox is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined. It is often encountered in social-science and medical-science statistics and is particularly problematic when frequency data are unduly given causal interpretations. The paradox can be resolved when causal relations are appropriately addressed in the statistical modeling.
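The reversal can be made concrete with a small numerical sketch. The counts below are hypothetical, invented purely for illustration (they do not come from any study), and the names `counts`, `group_rates`, and `pooled_rates` are likewise assumptions of this example. Option A has the higher success rate inside each group, yet option B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, trials) counts, invented for illustration only.
counts = {
    "mild":   {"A": (18, 20), "B": (64, 80)},
    "severe": {"A": (40, 80), "B": (8, 20)},
}

def success_rate(successes, trials):
    """Fraction of trials that succeeded."""
    return successes / trials

# Success rate of each option within each group.
group_rates = {
    group: {arm: success_rate(*st) for arm, st in arms.items()}
    for group, arms in counts.items()
}

def pooled_rate(arm):
    """Success rate of one option after combining all groups."""
    successes = sum(counts[group][arm][0] for group in counts)
    trials = sum(counts[group][arm][1] for group in counts)
    return success_rate(successes, trials)

pooled_rates = {arm: pooled_rate(arm) for arm in ("A", "B")}

for group, rates in group_rates.items():
    print(f"{group:>8}: A={rates['A']:.2f}  B={rates['B']:.2f}")
print(f"combined: A={pooled_rates['A']:.2f}  B={pooled_rates['B']:.2f}")
```

In this sketch A leads 0.90 to 0.80 in the mild group and 0.50 to 0.40 in the severe group, but trails 0.58 to 0.72 overall. The combined rate is a weighted average of the group rates, and the weights differ: most of A's trials fall in the severe group, where success is less likely for either option.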

Simpson's paradox has been used to illustrate, for non-specialist and public audiences, the kind of misleading results that misapplied statistics can generate. Martin Gardner wrote a popular account of Simpson's paradox in his March 1976 Mathematical Games column in Scientific American.

Edward H. Simpson first described this phenomenon in a technical paper in 1951, but the statisticians Karl Pearson et al., in 1899, and Udny Yule, in 1903, had mentioned similar effects earlier. The name Simpson's paradox was introduced by Colin R. Blyth in 1972.

It is also referred to as Simpson's reversal, the Yule–Simpson effect, the amalgamation paradox, or the reversal paradox.

Discussed on

"Simpson's Paradox" | 2024-03-11 | 365 Upvotes 106 Comments
"Simpson’s Paradox" | 2022-02-06 | 11 Upvotes 3 Comments
"Simpson's paradox" | 2011-07-29 | 174 Upvotes 34 Comments
"Simpson's paradox: why mistrust seemingly simple statistics" | 2009-08-28 | 152 Upvotes 17 Comments