The Inverse Simpson Paradox (How To Win Without Overtly Cheating)
Given two sets of data that lead to the same statistical conclusion, the Simpson paradox describes the tactic of combining the two sets to reach the opposite conclusion. Depending on the data, this may or may not succeed. The inverse Simpson paradox is a method of decomposing a given set of comparison data into two disjoint sets, each of which supports the opposite conclusion. This is always possible; however, the statistical significance of the reversed conclusions depends on the details of the given data.
💡 Research Summary
The paper introduces the concept of the “Inverse Simpson Paradox,” a counterpart to the classic Simpson paradox. While the classic paradox shows that aggregating two data sets can reverse a conclusion that holds within each set, the inverse paradox asks whether a single aggregated data set can be partitioned into two disjoint subsets such that each subset yields the opposite conclusion to the original aggregate. The authors claim that this is always mathematically possible, but the statistical significance of the reversed conclusions depends on the details of the data and on how the partition is performed.
The authors first review the classic Simpson paradox with a toy example involving two drugs (A and B) tested under two trial conditions. They show how the overall success rate can be lower for the drug that dominates in each individual trial. They then define three strategies for the inverse paradox: (a) legitimate decomposition when data from two sources were combined for simplicity, (b) a deliberately engineered decomposition that maximally reverses the original conclusion (potentially for litigation), and (c) a neutral decomposition that highlights hidden sub‑populations.
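The classic reversal is easy to reproduce numerically. The sketch below uses hypothetical counts in the style of the well-known kidney-stone example (not the paper's own numbers): drug A wins inside each trial condition, yet loses once the two trials are pooled.

```python
# Hypothetical counts (kidney-stone-style, NOT the paper's own numbers):
# (successes, total) per drug and trial condition.
trials = {
    "trial_1": {"A": (81, 87),   "B": (234, 270)},
    "trial_2": {"A": (192, 263), "B": (55, 80)},
}

# Drug A wins inside each trial...
for name, arms in trials.items():
    sa, na = arms["A"]
    sb, nb = arms["B"]
    print(f"{name}: A={sa/na:.3f}  B={sb/nb:.3f}")

# ...yet loses once the two trials are pooled (Simpson's paradox).
SA = sum(v["A"][0] for v in trials.values())
NA = sum(v["A"][1] for v in trials.values())
SB = sum(v["B"][0] for v in trials.values())
NB = sum(v["B"][1] for v in trials.values())
print(f"pooled:  A={SA/NA:.3f}  B={SB/NB:.3f}")
```

Running this shows A ahead in both trials (0.931 vs 0.867, 0.730 vs 0.688) but behind on the pooled counts, because A's trials are concentrated in the harder condition.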
To assess the statistical strength of any decomposition, the paper adopts a Bayesian framework. Each observation is modeled as a Bernoulli trial; the number of successes S out of N trials yields a binomial likelihood. Assuming a uniform prior for the success probability p, the posterior is a Beta distribution. Equation (3.9) gives the posterior probability that p ≥ ½ as a ratio of incomplete Beta functions; this can be expressed in terms of the standard normal cumulative distribution function φ when N is large. The same machinery yields a closed‑form approximation for the probability that p_A ≥ p_B (Equation 4.10), again expressed via φ. This large‑sample normal approximation is justified by the central limit theorem and a steepest‑descent evaluation of the Beta integrals.
The core of the inverse paradox is the choice of partition fractions α and β, which allocate the total counts N_A and N_B into two sub‑samples (N_A1 = α N_A, N_A2 = (1‑α) N_A, etc.). The authors derive constraints on α and β that guarantee each sub‑sample reverses the original ordering. These constraints appear in Equations (5.4)–(5.7). In particular, when α ≥ β, the feasible region depends on whether P_A + P_B is greater than or less than one, leading to different upper bounds on the standardized difference C′ in each sub‑sample. The bounds involve the overall success rates P_A, P_B and the proportion γ = N_A/N.
The paper then illustrates the theory with two real‑world examples. The first is the well‑known Berkeley admissions data, where the overall admission rate for men exceeds that for women, yet each department shows no gender bias. By selecting appropriate α and β, the apparent bias can be eliminated or even reversed. The second example concerns two hospitals treating a lethal disease. Overall, Hospital A has a higher recovery rate (90 % vs 80 %). However, when patients are split by “good shape” versus “poor shape,” Hospital B outperforms A in both sub‑groups, with standardized differences C′₁ ≈ 0.038 (1.7 σ) and C′₂ ≈ 0.176 (7.9 σ). This demonstrates how the inverse Simpson paradox can arise naturally in stratified medical data.
The authors discuss the practical implications: in legal disputes or policy debates, data can be deliberately partitioned to support a desired narrative, even though the overall evidence may point elsewhere. Their Bayesian‑normal approximation framework provides a quantitative tool to evaluate whether such a partition yields statistically meaningful reversed conclusions. They also note that finding the “most damaging” partition is an optimization problem, suggesting future work on algorithmic approaches and extensions to multivariate data.
In conclusion, the paper establishes that any aggregated binary outcome data can be split into two subsets that each invert the original conclusion, provided the partition fractions satisfy certain algebraic inequalities. The statistical significance of the inversion is governed by the Beta‑posterior and its normal approximation, captured succinctly by the φ‑function. By applying the theory to classic and contemporary datasets, the authors highlight both the methodological power and the ethical hazards of the inverse Simpson paradox.