When is Menzerath-Altmann law mathematically trivial? A new approach

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Menzerath’s law, the tendency of Z, the mean size of the parts, to decrease as X, the number of parts, increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z = Y/X, which would imply that Z scales with X as Z ~ 1/X. That scaling is a very particular case of Menzerath-Altmann law that has been rejected by means of a correlation test between X and Y in genomes, where X is the number of chromosomes of a species, Y its genome size in bases and Z the mean chromosome size. Here we review the statistical foundations of that test and consider three non-parametric tests based upon different correlation metrics and one parametric test to evaluate if Z ~ 1/X in genomes. The most powerful test is a new non-parametric test based upon the correlation ratio, which is able to reject Z ~ 1/X in nine out of eleven taxonomic groups and detect a borderline group. Rather than a fact, Z ~ 1/X is a baseline that real genomes do not meet. The view of Menzerath-Altmann law as inevitable is seriously flawed.


💡 Research Summary

The paper “When is Menzerath‑Altmann law mathematically trivial? A new approach” addresses a recent claim that the Menzerath‑Altmann law (MAL) observed in genomes is a mere mathematical consequence of the definition Z = Y/X, where X is the number of chromosomes, Y the total genome size in bases, and Z the mean chromosome size. The claim implies that Z must scale with X as Z ∝ 1/X (i.e., a power‑law with exponent ‑1 and no exponential term). If true, the law would be “inevitable” and not a genuine empirical regularity.

The authors first formalize the condition under which Z ∝ 1/X holds. By proving Theorem 2.1 they show that this scaling is equivalent to Y being mean‑independent of X, i.e., E(Y|X)=E(Y) for every X. Mean independence is weaker than full statistical independence but stronger than simple Pearson uncorrelation (ρ = 0). Consequently, a necessary condition for Z ∝ 1/X is that the covariance Cov(X,Y) be zero.
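The equivalence can be seen in a few lines. The sketch below is one way to write the argument, assuming Z ∝ 1/X is read as a statement about the conditional mean E(Z|X) and that c denotes an arbitrary constant (a symbol introduced here, not taken from the paper); the paper's own statement and proof of Theorem 2.1 may be phrased differently.

```latex
\begin{gather*}
E(Z \mid X = x) = E\!\left(\frac{Y}{X} \,\middle|\, X = x\right) = \frac{E(Y \mid X = x)}{x} \\
E(Z \mid X = x) = \frac{c}{x} \ \text{for all } x \iff E(Y \mid X = x) = c = E(Y) \ \text{for all } x \\
E(Y \mid X) = E(Y) \implies \operatorname{Cov}(X, Y) = E\bigl(X \, E(Y \mid X)\bigr) - E(X)\,E(Y) = 0
\end{gather*}
```

The last line gives only a necessary condition: zero covariance does not, by itself, guarantee mean independence.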

Recognizing that zero covariance does not rule out non‑linear dependencies, the authors adopt three non‑parametric correlation statistics: (i) Pearson’s linear correlation ρ, (ii) Spearman’s rank correlation ρ_S (which captures monotonic relationships), and (iii) the correlation ratio η (Kruskal’s η), which measures the proportion of variance in Y explained by the conditional means E(Y|X). η equals zero exactly when Y is mean‑independent of X and equals one when Y is a deterministic function of X. Importantly, η is always at least as large as |ρ|, so it can flag non‑linear dependence of the conditional mean on X that ρ misses.
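To make the three statistics concrete, here is a minimal pure-Python sketch. The function names and the toy data are illustrative assumptions, not the authors' code, and η is computed by grouping Y on the distinct values of X, which suits an integer-valued X such as a chromosome count.

```python
from collections import defaultdict
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 0-based positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_ratio(x, y):
    """eta = sqrt(between-group sum of squares / total sum of squares), grouping y by x."""
    groups = defaultdict(list)
    for a, b in zip(x, y):
        groups[a].append(b)
    my = mean(y)
    between = sum(len(g) * (mean(g) - my) ** 2 for g in groups.values())
    total = sum((b - my) ** 2 for b in y)
    return sqrt(between / total)


# Hypothetical toy data: y is a symmetric, non-linear function of x.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [9, 9, 4, 4, 1, 1, 4, 4, 9, 9]
print(pearson(x, y), spearman(x, y), correlation_ratio(x, y))
```

On this toy input the two correlation coefficients vanish (up to floating-point rounding) while η is essentially 1, which is the kind of dependence the correlation ratio is meant to catch.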

The empirical data comprise eleven taxonomic groups (Fungi, Angiosperms, Gymnosperms, Insects, Reptiles, Birds, Mammals, Cartilaginous fishes, Jawless fishes, Ray‑finned fishes, Amphibians). For each group the authors have species‑level records of chromosome number (X), genome size in megabase pairs (Y), and consequently mean chromosome size Z = Y/X. Sample sizes range from 13 species (Jawless fishes) to 4 706 species (Angiosperms).

To test the hypothesis, the authors perform permutation (randomisation) tests with 10⁷ random shuffles of X for each statistic, estimating p‑values as the proportion of permutations that produce a statistic at least as extreme as the observed one. A two‑sided test is used for ρ and ρ_S, while a one‑sided test (large η indicates dependence) is used for η. The significance threshold is α = 0.05.
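The permutation procedure described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names, the default of 2,000 shuffles (far fewer than the 10⁷ reported), the fixed seed, and the toy data are all assumptions chosen to keep the example fast.

```python
import random
from collections import defaultdict


def correlation_ratio(x, y):
    """eta: square root of the share of y's variance explained by group means over x."""
    groups = defaultdict(list)
    for a, b in zip(x, y):
        groups[a].append(b)
    grand = sum(y) / len(y)
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    total = sum((b - grand) ** 2 for b in y)
    return (between / total) ** 0.5


def permutation_p_value(x, y, statistic, n_perm=2000, two_sided=False, seed=0):
    """Share of random shuffles of x whose statistic is at least as extreme as observed."""
    rng = random.Random(seed)
    extreme = abs if two_sided else (lambda v: v)
    observed = extreme(statistic(x, y))
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between x and y
        if extreme(statistic(shuffled, y)) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical toy data in which y clearly depends on x.
x = [2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
y = [10, 11, 12, 20, 21, 22, 30, 31, 32, 40, 41, 42]
p_value = permutation_p_value(x, y, correlation_ratio)  # one-sided, as for eta
print(p_value)
```

For ρ or ρ_S one would pass the corresponding statistic with `two_sided=True`; for η the one-sided default applies because only large values indicate dependence.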

Results (Table 2) show that Pearson’s ρ detects a significant linear relationship in only six groups, whereas Spearman’s ρ_S finds significant monotonic dependence in nine groups. The correlation ratio η yields highly significant results (p < 0.001) in nine groups. In the remaining two groups (Jawless fishes and Amphibians) η is not significant at α = 0.05, so the data there are compatible with mean‑independence; the abstract describes one of these as a borderline case, and the authors caution that such groups may still hide subtle dependencies.

Thus, the hypothesis Z ∝ 1/X is rejected for the vast majority of taxonomic groups. The authors argue that Z ∝ 1/X should be regarded as a baseline or null model rather than an inevitable law of genome organization. The failure of the baseline indicates that the relationship between chromosome number and genome size is more complex, involving non‑linear, possibly taxon‑specific mechanisms.
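A small simulation can illustrate why Z ∝ 1/X works as a baseline rather than a certainty. The sketch below uses synthetic data only: the distributions, parameter values, seed, and the log-log least-squares slope as a rough estimate of the scaling exponent are all assumptions made for illustration, not anything taken from the paper.

```python
import math
import random


def loglog_slope(x, z):
    """Least-squares slope of log z against log x."""
    lx = [math.log(v) for v in x]
    lz = [math.log(v) for v in z]
    mx = sum(lx) / len(lx)
    mz = sum(lz) / len(lz)
    num = sum((a - mx) * (b - mz) for a, b in zip(lx, lz))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


rng = random.Random(1)
n = 2000
x = [rng.randint(2, 40) for _ in range(n)]  # synthetic "number of parts"

# Case 1: y is drawn independently of x, so z = y / x should scale like 1 / x.
y_indep = [rng.lognormvariate(7.0, 0.5) for _ in range(n)]
# Case 2: y grows in proportion to x, so z = y / x should not depend on x.
y_dep = [xi * rng.lognormvariate(4.0, 0.5) for xi in x]

slope_indep = loglog_slope(x, [yi / xi for xi, yi in zip(x, y_indep)])
slope_dep = loglog_slope(x, [yi / xi for xi, yi in zip(x, y_dep)])
print(round(slope_indep, 2), round(slope_dep, 2))
```

The first slope should come out close to -1 and the second close to 0: the identity Z = Y/X alone does not fix the exponent, which depends on how Y relates to X.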

The paper also clarifies conceptual distinctions: (a) Menzerath’s law is a weak, model‑neutral observation that Z tends to decrease with X; (b) the Menzerath‑Altmann law (Eq. 1 in the paper) is a specific functional form that imposes a deterministic relationship; (c) the “trivial” version b = ‑1, c = 0 (Z ∝ 1/X) is only justified when Y is mean‑independent of X. By demonstrating that this condition is rarely met, the authors refute the claim that MAL in genomes is mathematically inevitable.
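For reference, the functional form behind (b) and (c) can be written out as below. This assumes the common parametrization with a power-law factor and an exponential factor, consistent with the description of b and c in this summary; the sign convention for c varies across the literature, which does not affect the c = 0 case.

```latex
\begin{align*}
Z &= a\,X^{b}\,e^{cX} && \text{(general form, parameters } a, b, c\text{)} \\
Z &= a\,X^{-1} = \frac{a}{X} && \text{(trivial case } b = -1,\ c = 0\text{)} \\
\log Z &= \log a - \log X && \text{(slope } -1 \text{ on log-log axes)}
\end{align*}
```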

In the discussion, the authors situate their findings within a broader debate on the status of statistical laws in linguistics, music, and biology. They note that many purported power‑law patterns have been overturned by rigorous statistical testing (e.g., degree distributions in biological networks). The present work adds to that literature by providing a robust, non‑parametric testing framework that can be applied to any situation where a variable is defined as a ratio of two others.

Finally, the authors suggest future directions: employing hierarchical Bayesian models or other flexible regression techniques to capture the true functional form of Y = f(X), exploring mechanistic explanations for the observed non‑linearities, and extending the analysis to other domains where Menzerath‑Altmann‑type scaling has been reported. The overall conclusion is that while a negative correlation between part size and part number is a genuine empirical regularity, the specific inverse‑proportional scaling is not a universal law, and treating it as inevitable is statistically unfounded.

