Comparing a Menagerie of Models for Estimating Molecular Divergence Times

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Estimation of molecular evolutionary divergence times requires models of rate change. These vary with regard to the assumption of what quantity is penalized. The possibilities considered are the rate of evolution, the log of the rate of evolution, and the inverse of the rate of evolution. These models also vary with regard to how time affects the expected variance of rate change. Here the alternatives are: not at all, linearly with time, and as the product of rate and time. This results in a set of nine models, both random walks and Brownian motion. A priori any of these models could be correct, yet different researchers may well prefer, or simply use, one rather than the others. Another variable is whether to use a scaling factor to take account of the variance of the process of rate change being unknown and therefore avoid minimizing the penalty function with unrealistically large times. Here the difference these models and assumptions make on a tree of mammals, with the root fixed and with a single internal node fixed, is measured. The similarity of models is measured as the correlation of their time estimates and visualized with a least squares tree. The fit of model to data is measured and Q-Q plots are shown. Comparing model estimates with each other, the ages of clades within Laurasiatheria are seen to vary far more across models than those within Supraprimates (informally called Euarchontoglires). Especially problematic are the often-used fossil-calibrated nodes of horse/rhino and whale/hippo, which clash with times within Supraprimates and in particular with the absence of fossil rodent teeth older than ~60 mybp. A scaling factor in addition to penalizing rate change is seen to yield consistent relative time estimates irrespective of exactly where the calibration point is placed.


💡 Research Summary

The paper conducts a systematic comparison of nine stochastic models used to estimate molecular divergence times, focusing on how each model penalizes changes in evolutionary rate and incorporates the effect of time on rate variance. The three rate‑penalty bases are the raw rate, the logarithm of the rate, and the inverse of the rate. For each, three temporal variance structures are considered: (1) variance independent of time, (2) variance increasing linearly with time, and (3) variance proportional to the product of rate and time. This yields a 3 × 3 matrix of models, each representing either a random‑walk or a Brownian‑motion process.
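The 3 × 3 structure can be made concrete with a minimal Python sketch. It reads each model as a squared change in a transformed rate, divided by an assumed variance term. The names `TRANSFORMS`, `VARIANCE`, and `rate_change_penalty` are hypothetical, and the choice to use the parent rate and the elapsed time `t` in the variance term is an assumption for illustration, not the paper's stated formulation.

```python
import math
from itertools import product

# Three candidate quantities whose change is penalized.
TRANSFORMS = {
    "rate": lambda r: r,
    "log_rate": lambda r: math.log(r),
    "inverse_rate": lambda r: 1.0 / r,
}

# Three candidate ways the expected variance of a rate change behaves.
VARIANCE = {
    "constant": lambda r, t: 1.0,       # time has no effect
    "time": lambda r, t: t,             # linear in time
    "rate_x_time": lambda r, t: r * t,  # product of rate and time
}


def rate_change_penalty(r_parent, r_child, t, transform, variance):
    """Squared change in the transformed rate over its assumed variance.

    Assumption: the parent rate and the elapsed time t set the variance.
    """
    f = TRANSFORMS[transform]
    return (f(r_child) - f(r_parent)) ** 2 / VARIANCE[variance](r_parent, t)


# All nine (transform, variance) combinations.
MODELS = list(product(TRANSFORMS, VARIANCE))

for transform, variance in MODELS:
    p = rate_change_penalty(2.0, 4.0, 1.0, transform, variance)
    print(f"{transform:13s} {variance:12s} {p:.4f}")
```

Keeping the transform and the variance rule in separate tables makes it easy to see that the nine models differ only in these two choices.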

A further methodological choice is whether to include a scaling factor that accounts for the unknown absolute variance of the rate‑change process. Without scaling, the penalty function can be minimized by inflating divergence times to unrealistic values; with scaling, the model adjusts the variance magnitude, preventing such pathological solutions.
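The pathology described here can be illustrated with a toy calculation. The sketch assumes two sister branches of equal duration `t` with fixed branch lengths, a log-rate penalty, and variance linear in time. The function name and the branch lengths are hypothetical, and the sketch deliberately does not implement the paper's scaling factor, whose exact form is not given in this summary.

```python
import math


def unscaled_penalty(b1, b2, t, transform=math.log):
    """Unscaled penalty for the rate difference between two sister branches.

    b1, b2: branch lengths (e.g. substitutions per site); t: shared duration.
    Assumption: variance of rate change grows linearly with t.
    """
    r1, r2 = b1 / t, b2 / t  # rate = branch length / duration
    return (transform(r1) - transform(r2)) ** 2 / t


b1, b2 = 0.10, 0.15  # hypothetical branch lengths, for illustration only
durations = (10.0, 100.0, 1000.0)
penalties = [unscaled_penalty(b1, b2, t) for t in durations]

for t, p in zip(durations, penalties):
    print(f"t = {t:7.1f}  penalty = {p:.6f}")
```

In this toy setting the penalty shrinks as the duration grows, so an optimizer that is free to stretch times would prefer ever larger values. That is the behavior a scaling factor is meant to prevent.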

The authors apply all nine models to a mammalian phylogeny in which the root node is fixed and a single internal calibration point is constrained. For each model they estimate node ages, then assess similarity across models by computing Pearson correlations between the sets of age estimates. These correlations are visualized as a least‑squares tree, revealing clusters of models that produce similar chronograms. Model fit is evaluated using log‑likelihood values and quantile‑quantile (Q‑Q) plots of residuals.
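The similarity step can be sketched as follows. The node ages below are placeholder values, not results from the paper, and converting a correlation `r` to a distance as `1 - r` is an assumption made for illustration. Fitting the least-squares tree to the resulting distance matrix is left out.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Placeholder node-age estimates per model (NOT values from the paper).
ages = {
    "model_A": [10.0, 20.0, 30.0, 40.0],
    "model_B": [12.0, 22.0, 32.0, 42.0],
    "model_C": [10.0, 25.0, 28.0, 45.0],
}

names = sorted(ages)

# Assumed distance transform: highly correlated models are "close".
dist = {
    (a, b): 1.0 - pearson(ages[a], ages[b])
    for a in names
    for b in names
}

for a in names:
    print(a, " ".join(f"{dist[(a, b)]:.4f}" for b in names))
```

A matrix like `dist` is the kind of input a least-squares tree-fitting routine would take, with each leaf of the tree standing for one model.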

Key findings include: (1) Age estimates for clades within Laurasiatheria (e.g., horse/rhino, whale/hippo) vary dramatically across models, whereas estimates for Supraprimates (Euarchontoglires) are comparatively stable. (2) Frequently used fossil calibration points for horse/rhino and whale/hippo clash with the inferred ages of Supraprimates, and no rodent fossil older than ~60 million years is known, yet some models predict much older rodent divergences. (3) Incorporating a scaling factor yields relative age estimates that are robust to the exact placement of the calibration point, whereas models lacking scaling produce divergent results that depend heavily on calibration choice.

The authors conclude that model selection has a profound impact on inferred divergence times, especially for deep Laurasiatherian splits. They advocate a multi‑model approach, recommending that researchers report results from several plausible models and examine the sensitivity of their conclusions to model assumptions. Moreover, they emphasize the importance of variance scaling to obtain realistic time estimates and to avoid over‑reliance on any single fossil calibration. The study highlights the need for additional fossil data, particularly for early Laurasiatherian lineages, to constrain molecular clocks more reliably.

