Stability of meanings versus rate of replacement of words: an experimental test

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

The words of a language are randomly replaced in time by new ones, but it has long been known that words corresponding to some items (meanings) are less frequently replaced than others. Usually, the rate of replacement for a given item is not directly observable, but it is inferred from the estimated stability which, on the contrary, is observable. This idea goes back a long way in the lexicostatistical literature; nevertheless, nothing ensures that it gives the correct answer. The family of Romance languages allows for a direct test of the estimated stabilities against the replacement rates, since the proto-language (Latin) is known and the replacement rates can be explicitly computed. The output of the test is threefold: first, we prove that the standard approach, which tries to infer the replacement rates through the estimated stabilities, is sound; second, we are able to rewrite the fundamental formula of Glottochronology for a non-universal replacement rate (a rate which depends on the item); third, we give indisputable evidence that the stability ranking is far from being the same for different families of languages. This last result is also supported by comparison with the Malagasy family of dialects. As a side result, we also provide some evidence that Vulgar Latin and not Late Classical Latin is at the root of modern Romance languages.

💡 Research Summary

The paper tests a long‑standing practice in lexicostatistics and glottochronology: inferring the rate at which the word for a given meaning is replaced, which is usually not directly observable, from the observable “stability” of that meaning. Because the proto‑language of the Romance family (Latin) is well documented, the authors can compute actual replacement rates for a large set of lexical items and compare them directly with stability estimates derived from modern word lists.

First, the authors assembled a corpus of roughly two hundred semantic slots (e.g., water, hand, government) and identified the corresponding forms in Classical Latin, Vulgar Latin, and the five major Romance languages (Italian, French, Spanish, Portuguese, Romanian). A “replacement” was defined as the loss of the Latin root or its substitution by a non‑Latin stem. Using a binary logistic model they estimated a per‑meaning replacement rate \(r_i\). In parallel, they calculated the traditional stability index \(S_i\) as the proportion of languages in which the meaning is retained with a cognate form. Correlation analysis (Pearson and Spearman) revealed a strong positive relationship (coefficients > 0.78), confirming that stability is a reliable proxy for the underlying replacement process.
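The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the cognate-class labels are invented toy data, `stability` is taken to be the share of modern languages in the largest cognate class (computed without looking at Latin), and `retention` is the share of languages that still carry the Latin class (one minus the replaced fraction). The logistic model mentioned in the summary is not reproduced here.

```python
from collections import Counter
from math import sqrt

def stability(classes):
    """Share of languages in the largest cognate class (no proto-language needed)."""
    return Counter(classes).most_common(1)[0][1] / len(classes)

def retention(classes, latin_class):
    """Share of languages whose word still belongs to the Latin cognate class."""
    return sum(c == latin_class for c in classes) / len(classes)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: one row per meaning, one label per modern language.
# "L" marks the Latin cognate class; other letters are innovations.
toy = [
    ["L", "L", "L", "L", "L"],
    ["L", "L", "L", "L", "X"],
    ["L", "L", "X", "X", "Y"],
    ["X", "X", "X", "L", "Y"],
    ["X", "Y", "Z", "W", "L"],
]
stab = [stability(row) for row in toy]
ret = [retention(row, "L") for row in toy]
print("Pearson:", pearson(stab, ret))
print("Spearman:", spearman(stab, ret))
```

The fourth toy row shows why the two quantities can differ: three languages share an innovation, so the meaning looks fairly stable from the modern lists even though the Latin root is mostly lost.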

Crucially, the study demonstrates that \(r_i\) varies dramatically across meanings. Core natural concepts such as “water” or “fire” have extremely low rates (≈ 0.02), whereas socially mediated concepts like “government,” “art,” or “science” exhibit rates up to 0.20. This non‑uniformity invalidates the classic glottochronological formula \(t = -\ln(c)/r\) when a single universal \(r\) is assumed. The authors therefore propose a revised equation in which each meaning carries its own rate:
\[ t_{ij} = -\ln(c_{ij})/r_i \]
Applying this model to the Romance data yields distance estimates that are on average 30 % closer to the historically accepted divergence times than those obtained with the traditional method.
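To make the effect of item-dependent rates concrete, here is a hedged Python sketch. It is one possible reading, not necessarily the paper's exact formula: it assumes each meaning is retained independently with probability exp(-r_i t), so the expected shared fraction is the average of these curves, and it recovers t by bisection. The function names and the middle rate 0.05 are hypothetical; 0.02 and 0.20 echo the values quoted above.

```python
from math import exp, log

def time_universal(c, r):
    """Classic form quoted in the text: t = -ln(c) / r with one rate for all items."""
    return -log(c) / r

def expected_overlap(t, rates):
    """Average retained fraction after time t when item i decays as exp(-r_i * t)."""
    return sum(exp(-r * t) for r in rates) / len(rates)

def time_item_rates(c, rates, tol=1e-9):
    """Solve expected_overlap(t, rates) = c for t by bisection.

    Assumes 0 < c <= 1 and strictly positive rates, so the overlap
    decreases monotonically from 1 towards 0 as t grows.
    """
    if not 0 < c <= 1 or any(r <= 0 for r in rates):
        raise ValueError("need 0 < c <= 1 and positive rates")
    lo, hi = 0.0, 1.0
    while expected_overlap(hi, rates) > c:  # expand until the root is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_overlap(mid, rates) > c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

rates = [0.02, 0.05, 0.20]           # toy per-meaning rates, arbitrary time unit
c = expected_overlap(10.0, rates)    # overlap generated at a known time t = 10
mean_rate = sum(rates) / len(rates)
print("item-dependent estimate:", time_item_rates(c, rates))
print("universal-rate estimate:", time_universal(c, mean_rate))
```

Under these assumptions, plugging the mean rate into the single-rate formula underestimates the elapsed time, because the average of exponentials decays more slowly than the exponential of the average rate.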

To test the generality of the findings, the same methodology was applied to the Malagasy dialect continuum. The resulting stability rankings differed markedly from those of the Romance family: in Malagasy, basic environmental terms (“sea,” “wind,” “sun”) rank highest, whereas in Romance languages cultural terms (“family,” “food,” “religion”) are more stable. This cross‑family comparison underscores that meaning stability is not a universal property but is shaped by the sociocultural context of each language lineage.
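One simple way to quantify how far two stability rankings agree is a rank correlation such as Kendall's tau. The sketch below is an illustration only: the summary does not say which statistic the authors used for the cross-family comparison, and the two rankings are invented toy values rather than results from the paper.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall tau-a for two rankings of the same items, assuming no ties.

    Returns 1.0 for identical orderings and -1.0 for fully reversed ones.
    """
    items = list(rank_a)
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical stability ranks (1 = most stable) for the same five meanings.
family_a = {"sea": 1, "wind": 2, "sun": 3, "family": 4, "food": 5}
family_b = {"family": 1, "food": 2, "sea": 3, "wind": 4, "sun": 5}
print(kendall_tau(family_a, family_b))
```

A value near 1 would indicate a shared, family-independent ranking; values near zero or below are what family-specific stability would look like.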

A secondary investigation compared Classical Latin with Vulgar Latin. The modern Romance lexicon aligns far more closely with the Vulgar stage, suggesting that the spoken, colloquial Latin of the late Roman Empire, rather than the literary Classical Latin, served as the primary source for the descendant languages.
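The logic of this comparison can be sketched by scoring each candidate ancestor on how often its root survives in the modern languages. The example below is a rough illustration, not the paper's procedure or data: it uses three well-known textbook cases (aqua; equus versus caballus; ignis versus focus) and treats all five modern words for each meaning as descending from the listed root.

```python
# Root behind the modern word in Italian, French, Spanish, Portuguese, Romanian.
MODERN_ROOTS = {
    "water": ["aqua"] * 5,
    "horse": ["caballus"] * 5,
    "fire": ["focus"] * 5,
}
# Candidate ancestor word lists (illustrative entries only).
CLASSICAL = {"water": "aqua", "horse": "equus", "fire": "ignis"}
VULGAR = {"water": "aqua", "horse": "caballus", "fire": "focus"}

def mean_retention(ancestor, modern):
    """Average, over meanings, of the share of languages keeping the ancestor's root."""
    per_meaning = [
        sum(root == ancestor[meaning] for root in roots) / len(roots)
        for meaning, roots in modern.items()
    ]
    return sum(per_meaning) / len(per_meaning)

print("Classical:", mean_retention(CLASSICAL, MODERN_ROOTS))
print("Vulgar:   ", mean_retention(VULGAR, MODERN_ROOTS))
```

The candidate with the higher average retention is the more plausible source of the modern lexicon on this toy measure.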

In sum, the paper provides three major contributions: (1) empirical validation that stability estimates can reliably infer replacement rates; (2) a mathematically sound extension of the glottochronological formula that incorporates meaning‑specific rates; and (3) compelling evidence that stability hierarchies are family‑specific, as illustrated by the contrast between Romance and Malagasy. These results have important implications for historical linguistics, phylogenetic modeling, and the study of how cultural pressures influence lexical turnover.
