Probability that a chromosome is lost without trace under the neutral Wright-Fisher model with recombination

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

I describe an analytical approximation for calculating the short-term probability of loss of a chromosome under the neutral Wright-Fisher model with recombination. I also present an upper and lower bound for this probability. Exact analytical calculation of this quantity is difficult and computationally expensive because the number of different ways in which a chromosome can be lost grows very large in the presence of recombination. Simulations indicate that the probabilities obtained using my approximate formula are always comparable to the true expectations, provided that the number of generations remains small. These results are useful in the context of an algorithm that we recently developed for simulating Wright-Fisher populations forward in time. C++ programs that can efficiently calculate these formulas are available on request.


💡 Research Summary

The paper addresses the problem of quantifying the probability that a specific chromosome disappears from a Wright‑Fisher population when recombination is allowed. In the classic neutral Wright‑Fisher model, each generation consists of 2N chromosomes (N diploid individuals) sampled with replacement from the previous generation. The author extends this framework by allowing a per‑generation per‑sequence recombination rate r, such that a sampled chromosome may undergo at most one crossover event, producing a mosaic offspring that combines a prefix from the sampled chromosome with a suffix from its homologous partner.
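The sampling scheme described above can be sketched as a small forward simulation. This is a minimal illustration, not the author's C++ code: it assumes a chromosome can be modeled as a fixed number of discrete sites, that chromosomes at indices 2i and 2i+1 are the homologs within one individual, and that the crossover point is uniform along the sequence. The names `next_generation`, `lost_without_trace`, `estimate_loss`, and the parameter `n_sites` are all hypothetical.

```python
import random


def next_generation(pop, r, rng):
    """One Wright-Fisher generation with at most one crossover per offspring.

    pop is a list of 2N chromosomes. Each chromosome is a tuple of booleans,
    one per site, marking whether that site descends from the focal
    chromosome. Indices 2i and 2i+1 are treated as homologs (an assumption).
    """
    size = len(pop)
    n_sites = len(pop[0])
    offspring = []
    for _ in range(size):
        idx = rng.randrange(size)            # sample a parent with replacement
        chrom = pop[idx]
        if n_sites > 1 and rng.random() < r:
            homolog = pop[idx ^ 1]           # partner in the same individual
            cut = rng.randrange(1, n_sites)  # uniform breakpoint (assumed)
            chrom = chrom[:cut] + homolog[cut:]  # prefix + homolog suffix
        offspring.append(chrom)
    return offspring


def lost_without_trace(two_n, r, k, n_sites, rng):
    """True if no site of chromosome 0 survives after k generations."""
    pop = [(True,) * n_sites] + [(False,) * n_sites] * (two_n - 1)
    for _ in range(k):
        pop = next_generation(pop, r, rng)
        if not any(any(c) for c in pop):
            return True                      # loss is permanent once it occurs
    return False


def estimate_loss(two_n, r, k, n_sites=100, reps=2000, seed=1):
    """Monte Carlo estimate of the loss probability for one focal chromosome."""
    rng = random.Random(seed)
    hits = sum(lost_without_trace(two_n, r, k, n_sites, rng) for _ in range(reps))
    return hits / reps
```

A call such as `estimate_loss(two_n=20, r=0.1, k=5)` returns a simulation-based estimate that an analytical bound could be compared against. The discrete-site representation is chosen only for brevity; a continuous-interval representation would serve equally well.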

The central quantity of interest is NA(k, r, 2N), the probability that a chromosome present in the current generation leaves no descendants after k generations. Exact calculation of NA is intractable because the state space explodes: every recombination creates new mosaic lineages, and the number of possible ancestry graphs grows combinatorially with k and r. To make progress, the author defines a tractable lower bound, L(k, r, 2N), based on two restricted scenarios:

  1. Class 1 – The focal chromosome and all its descendants never recombine, and they all disappear within k generations.
  2. Class 2 – The homolog of the focal chromosome recombines in the first generation, after which both chromosomes and all their descendants never recombine and disappear within k generations.

For each class the author derives explicit transition probabilities. Let s(n,m) be the binomial probability that, given n copies of the focal chromosome in one generation, exactly m copies are sampled into the next. The n copies can be arranged as x homologous pairs and y singletons (2x + y = n); the probability of a particular (x,y) arrangement is h(x,y,n). The probability that n copies become m copies while none of the chromosomes involved recombine is denoted sh(n,m). Treating sh as the one-step transition matrix of a time-homogeneous Markov chain, P₁(1,a,r,2N) is the probability that the chain, started in state 1 (one copy), first hits state 0 after a + 1 steps. Summing P₁ over a = 0,…,k-1 yields T₁, the contribution of Class 1.
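The first-hit computation can be illustrated with a short sketch. Because the summary does not give explicit formulas for h(x,y,n) or sh(n,m), the example below substitutes the plain binomial s(n,m) as the one-step transition, which corresponds to the recombination-free case r = 0; it is therefore not the paper's T₁ for r > 0. The function names and the `step` parameter (a hypothetical hook where an sh-like function could be supplied) are assumptions of this sketch.

```python
from math import comb


def s(n, m, two_n):
    """Binomial probability that n copies leave exactly m copies next generation."""
    p = n / two_n
    return comb(two_n, m) * p**m * (1 - p) ** (two_n - m)


def first_hit_zero(k, two_n, step=s):
    """Probabilities of first reaching 0 copies at step a+1, for a = 0..k-1.

    The chain starts from a single copy. `step(n, m, two_n)` is the one-step
    transition probability; the default is the r = 0 binomial (an assumption
    standing in for sh).
    """
    # dist[n]: probability of holding n copies now without having hit 0 before
    dist = [0.0] * (two_n + 1)
    dist[1] = 1.0
    hits = []
    for _ in range(k):
        # mass that moves from any non-zero state into state 0 on this step
        hits.append(sum(dist[n] * step(n, 0, two_n) for n in range(1, two_n + 1)))
        # advance the surviving mass one generation, excluding state 0
        dist = [0.0] + [
            sum(dist[n] * step(n, m, two_n) for n in range(1, two_n + 1))
            for m in range(1, two_n + 1)
        ]
    return hits


def loss_within(k, two_n, step=s):
    """Sum of the first-hit probabilities over a = 0..k-1."""
    return sum(first_hit_zero(k, two_n, step))
```

With the default `step`, `loss_within(k, two_n)` is the probability that a single copy is lost within k generations when there is no recombination. Excluding state 0 from `dist` after each step is what turns ordinary occupancy probabilities into first-hit probabilities.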

Class 2 requires a special first‑step transition ss(2,m), which accounts for the event that the homolog recombines in generation 1. Subsequent steps again use sh. The analogous first‑hit probability is P₂(2,a,r,2N), and its sum over a gives T₂. The lower bound is therefore

 L(k,r,2N) = T₁ + T₂ = ∑_{a=0}^{k-1} [ P₁(1,a,r,2N) + P₂(2,a,r,2N) ].

