A first-digit anomaly in the 2009 Iranian presidential election

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

A local bootstrap method is proposed for the analysis of electoral vote-count first-digit frequencies, complementing the Benford’s Law limit. The method is calibrated on five presidential-election first rounds (2002–2006) and applied to the 2009 Iranian presidential-election first round. Candidate K has a highly significant (p < 0.15%) excess of vote counts starting with the digit 7. This leads to other anomalies, two of which are individually significant at p ~ 0.1%, and one at p ~ 1%. Independently, Iranian pre-election opinion polls significantly reject the official results unless the five polls favouring candidate A are considered alone. If the latter represent normalised data and a linear, least-squares, equal-weighted fit is used, then either candidates R and K suffered a sudden, dramatic (70% ± 15%) loss of electoral support just prior to the election, or the official results are rejected (p ~ 0.01%).


💡 Research Summary

The paper introduces a “local bootstrap” technique for examining first‑digit frequencies in election vote counts, aiming to complement the traditional Benford’s Law approach. The authors argue that Benford’s Law, which predicts a logarithmic distribution of leading digits for data that span several orders of magnitude, can be misleading when applied to election data because each precinct or voting district has its own size, turnout, and demographic characteristics. To address this, the method first estimates, for each precinct, the expected number of votes (the product of the number of registered voters and the observed turnout). Assuming that the actual vote count for a candidate in that precinct follows a log‑normal distribution around this expectation, the algorithm draws a large number (one million) of synthetic vote counts for every precinct. The leading digits of these simulated counts form a “local” expected distribution that respects the heterogeneity of the electoral map.
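The procedure described above can be sketched in a few lines. This is a minimal illustration of the summary's description, not the authors' implementation: the function names, the log-normal width `sigma`, the reduced number of draws, and the example district sizes are all assumptions made here.

```python
import math
import random
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading decimal digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n


def benford(d: int) -> float:
    """Benford's Law probability of leading digit d (1..9)."""
    return math.log10(1 + 1 / d)


def local_first_digit_distribution(expected_votes, sigma=0.5, draws=2000, seed=0):
    """Pool simulated leading digits over all districts.

    expected_votes: hypothetical expected vote count per district.
    sigma: assumed log-normal width (not taken from the paper).
    draws: synthetic counts per district (the summary mentions one
           million; a smaller number keeps this sketch fast).
    """
    rng = random.Random(seed)
    counts = Counter()
    for mean_votes in expected_votes:
        mu = math.log(mean_votes)  # the log-normal median equals mean_votes
        for _ in range(draws):
            votes = round(rng.lognormvariate(mu, sigma))
            if votes >= 1:  # a count of zero has no leading digit
                counts[first_digit(votes)] += 1
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


if __name__ == "__main__":
    # Hypothetical district sizes spanning several orders of magnitude.
    districts = [120, 950, 4300, 18000, 76000]
    local = local_first_digit_distribution(districts)
    for d in range(1, 10):
        print(d, f"local={local[d]:.3f}", f"benford={benford(d):.3f}")
```

With only a handful of districts the local distribution can differ visibly from the Benford values, which is the heterogeneity the local benchmark is meant to capture.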

The technique is calibrated on five presidential‑election first rounds held between 2002 and 2006. For each of these elections the local bootstrap reproduces the Benford‑expected digit frequencies, and chi‑square tests confirm that the simulated and observed digit distributions are statistically indistinguishable at the conventional 5 % significance level. This calibration indicates that the method does not generate spurious anomalies when applied to elections regarded as clean and transparent.
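A chi‑square comparison of this kind can be illustrated as follows. This is a hedged sketch: the function names and the example counts are hypothetical, and the expected probabilities passed in could equally be the pooled output of a local bootstrap rather than the Benford values used here. The critical value 15.507 is the standard 5 % threshold for 8 degrees of freedom (nine digits minus one).

```python
import math

# Benford probabilities for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# 5 % critical value of the chi-square distribution with 8 degrees of freedom.
CHI2_CRIT_DF8_5PCT = 15.507


def chi_square_statistic(observed_counts, expected_probs):
    """Pearson chi-square statistic for nine leading-digit counts."""
    n = sum(observed_counts)
    return sum(
        (obs - n * p) ** 2 / (n * p)
        for obs, p in zip(observed_counts, expected_probs)
    )


def rejects_at_5_percent(observed_counts, expected_probs):
    """True if the observed digits differ from expectation at the 5 % level."""
    return chi_square_statistic(observed_counts, expected_probs) > CHI2_CRIT_DF8_5PCT


if __name__ == "__main__":
    # Hypothetical counts for 1000 districts, close to Benford's Law.
    near_benford = [301, 176, 125, 97, 79, 67, 58, 51, 46]
    print(chi_square_statistic(near_benford, BENFORD))
    print(rejects_at_5_percent(near_benford, BENFORD))
```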

Having validated the approach, the authors apply it to the 2009 Iranian presidential election first round. They analyze the vote totals for the main candidates (A, R, K, and others) across all voting districts. The bootstrap simulation predicts that about 5 % of vote counts should start with the digit 7, yet the observed data show a 7.8 % frequency for candidate K. By counting how often a simulated dataset produces an excess at least as large as the observed one, the authors obtain a p‑value below 0.0015 (that is, p < 0.15 %). This constitutes a highly significant deviation from the null hypothesis of a “normal” election.
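The counting step that yields such a p‑value can be sketched as a simple Monte Carlo loop. This is an illustration under stated assumptions, not the paper's code: `digit_probs` stands for a hypothetical per‑district probability that a vote count starts with 7 (for example, taken from a local bootstrap), and the toy numbers below are invented.

```python
import random


def monte_carlo_p_value(digit_probs, observed_count, n_sims=2000, seed=1):
    """Estimate P(simulated count >= observed count).

    digit_probs: assumed probability, per district, that the vote
                 count begins with the digit of interest.
    observed_count: number of districts where it actually does.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # One simulated election: each district independently shows the digit or not.
        simulated = sum(1 for p in digit_probs if rng.random() < p)
        if simulated >= observed_count:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Toy example: 50 districts, each with a 5 % chance of a leading 7.
    probs = [0.05] * 50
    print(monte_carlo_p_value(probs, observed_count=6))
```

The estimate cannot resolve probabilities finer than `1 / n_sims`, so a claim such as p < 0.15 % needs far more than a thousand simulated datasets.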

Two additional, related anomalies are identified. First, districts where K’s vote count begins with 7 tend to have a lower average turnout than the national average, a pattern that would occur by chance only about one in a thousand times (p ≈ 0.1 %). Second, the geographic distribution of the “7‑districts” is non‑random: they cluster in the western and southern provinces, and in those districts the vote share of candidate R is unusually depressed (p ≈ 1 %). Both findings reinforce the notion that the digit‑7 excess is not an isolated statistical fluke.

The paper then turns to pre‑election opinion polls. Nationwide polls conducted between late 2008 and June 2009 are compared with the official results, and they significantly reject those results unless the five polls favouring candidate A are considered alone. If those five polls represent normalised data and are fitted with a linear, equal‑weighted least‑squares model, two readings remain: either candidates R and K suffered a sudden, dramatic loss of electoral support (70 % ± 15 %) just before the election, or the official results are rejected at p ≈ 0.01 %. In other words, either the two candidates experienced an implausibly abrupt collapse in support in the days immediately preceding the election, or the official results are inconsistent with the independent polling data.
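The idea of an equal‑weighted linear fit extrapolated to election day can be shown with a short sketch. Everything numeric here is hypothetical: the poll dates, the support percentages, and the official share are invented for illustration and are not the paper's data.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, equal weights."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(xs, ys, x_new):
    """Predict y at x_new from the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    return slope * x_new + intercept


def relative_loss(predicted, official):
    """Fraction of predicted support missing from the official result."""
    return (predicted - official) / predicted


if __name__ == "__main__":
    # Hypothetical polls: days before the election and combined support (%).
    days_before = [-40, -30, -20, -10, -5]
    support = [30.0, 32.0, 35.0, 36.0, 38.0]
    predicted = extrapolate(days_before, support, 0)  # election day
    hypothetical_official = 12.0
    print(predicted, relative_loss(predicted, hypothetical_official))
```

A large relative loss between the extrapolated and official values corresponds to the first of the two readings above, a sudden collapse in support.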

The authors conclude that the local bootstrap provides a more nuanced benchmark for first‑digit analysis than Benford’s Law alone, because it respects precinct‑level heterogeneity. The 2009 Iranian election exhibits multiple, mutually reinforcing statistical irregularities: a pronounced digit‑7 excess for candidate K, associated turnout and geographic anomalies, and a stark mismatch with contemporaneous opinion polls. While the analysis cannot prove intentional fraud, it demonstrates that the official results are highly unlikely under a model of a fair, transparent election. The paper calls for broader adoption of such localized statistical diagnostics in election monitoring and suggests further work to incorporate Bayesian treatment of poll uncertainties and to test the method on proportional‑representation systems and multi‑party contests.

