Markov processes forced on a subspace by a large drift, with applications to population genetics
Samuel Ayomide Adeosun¹, Peter Pfaffelhuber²

Abstract: Consider a sequence of Markov processes $X^1, X^2, \dots$ with state space $E$, where $X^N$ has a strong drift to $D \subseteq E$, such that $\Phi(X^N)$ is slow for some appropriate $\Phi : E \to D$. Using the method of martingale problems, we give a limit result such that $\Phi(X^N) \Rightarrow Z$ as $N\to\infty$ in the space of càdlàg paths, and $X^N \Rightarrow X$ as $N\to\infty$ in measure. We apply the general limit result to models for copy number variation of genetic elements in a diploid Moran model of size $N$. The population at time $t$ is described by $X^N \in \mathcal{P}(\mathbb{N}_0)$, where $X^N_k$ is the frequency of individuals with copy number $k$, and $\Phi : \mathcal{P}(\mathbb{N}_0) \to \mathbb{R}$ is the first moment.

Keywords: martingale problem; convergence of stochastic processes; slow-fast system.
MSC2020 subject classifications: Primary 92D15; Secondary 60J80; 60F17; 60G57.

1 Introduction

Slow-fast systems arise frequently in probabilistic models (see e.g. Ball et al., 2006; Berglund and Gentz, 2006; Li and Sieber, 2022; Kifer, 2024; Champagnat and Hass, 2025). We study the situation of a fast evolving sequence of Markov processes $X^N$ such that (i) $Z^N := \Phi(X^N)$ evolves slowly and (ii) $X^N$ is pushed fast towards a slow subset of the state space. A similar situation was studied by Katzenberger (1991) using semimartingale techniques. However, we do not show convergence of $X^N$ in path space using Lyapunov functions, but rather use tightness and martingale techniques in order to show convergence of $X^N$ in measure, and of $\Phi(X^N)$ in path space. This approach has appeared in a special situation in Pfaffelhuber and Wakolbinger (2023), but is here carried out in full generality. As our main application, we use a population genetic model, where reproduction involves two parents and each individual has a type in $\mathbb{N}_0$, counting the number of genetic elements it carries.
¹ University of Freiburg, Germany. E-mail: samuel.adeosun@stochastik.uni-freiburg.de
² University of Freiburg, Germany. E-mail: p.p@stochastik.uni-freiburg.de

The model is fully specified once we specify the distribution of genetic elements a parent gives to its offspring. Such models build on the two-parental Moran model and have e.g. been studied in Coron and Le Jan (2022); Otto et al. (2022); Otto and Wiehe (2023); Pfaffelhuber and Wakolbinger (2023); Omole and Czuppon (2025).

2 The abstract result

Recall that a process $X = (X_t)_{t\ge0}$ with complete and separable metric state space $(E, r)$ solves the $(G, \mathcal{D})$-martingale problem for some linear $G : \mathcal{D} \subseteq C_b(E) \to C_b(E)$ (where $C_b(E)$ is the set of real-valued, bounded continuous functions on $E$) if
$$
\Big( f(X_t) - \int_0^t Gf(X_s)\, ds \Big)_{t\ge0}
$$
is a martingale for all $f \in \mathcal{D}$. A Markov process is the unique solution to its martingale problem (when taking $\mathcal{D}$ large enough), and if there is a unique such solution, it is a strong Markov process (see e.g. Theorems 4.3.1 and 4.3.2 in Ethier and Kurtz, 1986). Usually, such a process has càdlàg paths $[0,\infty) \to E$, and we denote the set of such paths by $D(E)$; see Theorem 4.3.6 in Ethier and Kurtz (1986).

Assume we have a sequence of Markov processes $X^1, X^2, \dots$ with state space $(E, r_E)$ such that the generator $G^N$ of $X^N$ has domain $\mathcal{D}_E \subseteq C_b(E)$ and is of the form
$$
G^N = N G_1^N + G_0^N. \tag{2.1}
$$
We are interested in the weak limit of $X^N$ as $N\to\infty$ in the special situation that $D$ is another Polish space and $\Phi : E \to D$ is such that, for some $\mathcal{D}_D \subseteq C_b(D)$, if $g \in \mathcal{D}_D$, we have $g\circ\Phi \in \mathcal{D}_E$ and
$$
G_1^N(g\circ\Phi) = 0. \tag{2.2}
$$
(In other words, the dynamics given by $G_1^N$ change $X^N$ fast, but do not change $\Phi(X^N)$.) Recall that there are two kinds of convergence on $D(E)$. First, the usual Skorohod convergence; see e.g. Chapter 3 in Ethier and Kurtz (1986).
Second, there is convergence in measure: define the weighted occupation measure of $\xi \in D(E)$ as the probability measure
$$
\Gamma_\xi([0,t] \times A) := \int_0^t e^{-s}\, 1_{\{\xi_s \in A\}}\, ds, \tag{2.3}
$$
where $t \ge 0$ and $A$ is a measurable subset of $E$. Following Kurtz (1991), we say that a sequence $(\xi^N)$ in $D(E)$ converges in measure to $\xi \in D(E)$ if the sequence of probability measures $\Gamma_{\xi^N}$ converges weakly to $\Gamma_\xi$. We assume:

A1 $(\Phi(X^N), X^N) \Rightarrow (Z, X)$ as $N\to\infty$ for some $(Z, X)$, where convergence to $Z$ is with respect to the Skorohod topology on $D(D)$, and convergence to $X$ is in measure.

For A1 to hold along a subsequence, it suffices to assume that $(\Phi(X^N))_{N=1,2,\dots}$ is tight (in the space of càdlàg paths on $D$) and that $(X^N)_{N=1,2,\dots}$ is tight in measure, i.e. the sequence of occupation measures is tight.

A2 There is $G_0$ such that, for all $f \in \mathcal{D}_E$, $G_0^N f(X^N) \Rightarrow G_0 f(X)$ in measure as $N\to\infty$.

A3 There are $\Xi : D \to E$ and $\mathcal{D}'_E \subseteq \mathcal{D}_E$ with the following property: for $x \in E$, if $G_1 f(x) = 0$ for all $f \in \mathcal{D}'_E$, then $x = \Xi(\Phi(x))$. (In other words, we can recover $x$ if we are given $\Phi(x)$ and know that $G_1 f(x) = 0$ for all $f \in \mathcal{D}'_E$.)

Then, we have the following.

Theorem 2.1. Let $(E, r_E)$ and $(D, r_D)$ be complete and separable metric spaces, $\mathcal{D}_E \subseteq C_b(E)$ and $\mathcal{D}_D \subseteq C_b(D)$, and let $\Phi : E \to D$ be such that $g\circ\Phi \in \mathcal{D}_E$ for $g \in \mathcal{D}_D$. Assume $X^N$ is a solution of the $(G^N, \mathcal{D}_E)$-martingale problem with $G^N$ as in (2.1), where $G_1^N$ satisfies (2.2). If A1, A2 and A3 hold, then $\Phi(X^N) \Rightarrow Z$ as $N\to\infty$, where $Z$ solves the martingale problem for $g \mapsto (G_0(g\circ\Phi))\circ\Xi$ with $g \in \mathcal{D}_D$.

Proof. From A1, assume that $\Phi(X^N) \Rightarrow \Phi(X)$ weakly in the space of càdlàg paths, and $X^N \Rightarrow X$ weakly in measure.
Then, for $f \in \mathcal{D}'_E$, using (2.1),
$$
\Big( \frac1N f(X^N_t) - \int_0^t \Big( G_1 f + \frac1N G_0^N f \Big)(X^N_s)\, ds \Big)_{t\ge0}
$$
is a martingale for each $N$, and converges as $N\to\infty$ to $\big( -\int_0^t G_1 f(X_s)\, ds \big)_{t\ge0}$. So the right-hand side is a martingale with paths of bounded variation, hence it vanishes, i.e. $G_1 f(X_s) = 0$ for Lebesgue-almost all $s$. From A3, this implies that $X_t = \Xi(\Phi(X_t))$ for Lebesgue-almost all $t \ge 0$.

In order to put everything together, consider $g \in \mathcal{D}_D$ and note that $g\circ\Phi \in \mathcal{D}_E$. Hence, using (2.2) and $X^N \Rightarrow \Xi(\Phi(X))$ in measure,
$$
\Big( g(\Phi(X^N_t)) - \int_0^t G_0^N(g\circ\Phi)(X^N_s)\, ds \Big)_{t\ge0} \Rightarrow \Big( g(\Phi(X_t)) - \int_0^t G_0(g\circ\Phi)\big(\Xi(\Phi(X_s))\big)\, ds \Big)_{t\ge0},
$$
and the limit is a martingale. In particular, $\Phi(X)$ solves the martingale problem for $g \mapsto (G_0(g\circ\Phi))\circ\Xi$ with $g \in \mathcal{D}_D$. □

Let us consider the following simple example: let $E = \mathbb{R}^2$,
$$
G_1 f(x,y) = (x-y)\big( \partial_y f(x,y) - \partial_x f(x,y) \big), \qquad G_0 f(x,y) = \tfrac12\big( \partial_{xx} f + \partial_{yy} f \big)(x,y).
$$
So $X^N$ is a Brownian motion in $\mathbb{R}^2$ with a strong force towards the diagonal. Taking $\Phi(x,y) = \tfrac12(x+y)$ and $\mathcal{D}_E = \mathcal{D}'_E := C^2(\mathbb{R}\times\mathbb{R})$, we find that $G_1 f(x) = 0$ for all $f \in \mathcal{D}_E$ implies that $x$ is on the diagonal, i.e. $x = \Xi(\Phi(x))$ with $\Xi(z) := (z,z)$. So the limit $\Phi(X)$ has generator
$$
Hg(z) = G_0(g\circ\Phi)(\Xi(z)) = \tfrac12\big( \partial_{xx} + \partial_{yy} \big)(g\circ\Phi)(\Xi(z)) = \tfrac14 g''(z).
$$
So, as anticipated, the limit of $\tfrac12(X^N + Y^N)$ is a Brownian motion (run at speed $\tfrac12$). In addition, $X^N - Y^N \Rightarrow 0$ in measure.

3 Modeling copy number variation of genetic elements

Here is an extension of the population model for diploid organisms from Pfaffelhuber and Wakolbinger (2023), which we will study in detail in the remainder of the paper:

• A diploid population of constant size $N$ consists of individuals, each carrying a certain number of genetic elements.
• Reproduction events occur at rate $N^2/2$. Upon a reproduction event, choose individuals $a, b, c$.
Individual $c$ dies and is replaced by an offspring of $a$ and $b$.
• For probability distributions $p^N_k$, $k = 0,1,2,\dots$: if individual $a$ has $k$ genetic elements, it transfers a random number of genetic elements to its offspring, distributed like $p^N_k$; individual $b$ contributes an independent number of genetic elements, distributed as $p^N_l$ if $b$ has $l$ genetic elements.

We call $(p^N_k)_{k=0,1,2,\dots}$ the family of inheritance distributions, and we assume that there is $(p_k)_{k=0,1,2,\dots}$ with
$$
\text{(a)} \quad \sum_j j\, p_k(j) = \frac k2, \qquad \text{(b)} \quad N\big( p^N_k(\cdot) - p_k(\cdot) \big) \xrightarrow{N\to\infty} r_k(\cdot) \ \text{ with } \ \sum_j j\, r_k(j) = \alpha k, \qquad k \in \mathbb{N}, \tag{3.1}
$$
for some $r_k$ and some $\alpha \in \mathbb{R}$. Note that
$$
\sum_j r_k(j) = \lim_{N\to\infty} N \sum_j \big( p^N_k(j) - p_k(j) \big) = 0. \tag{3.2}
$$
Recall that the case $p^N_k := p_k := B(k, 1/2)$ was studied in Pfaffelhuber and Wakolbinger (2023). We want to study the evolution of the distribution of these genetic elements in the limit $N\to\infty$. Therefore, let $\mathcal{P}(\mathbb{N}_0)$ be the set of probability distributions on $\mathbb{N}_0 = \{0,1,2,\dots\}$, equipped with the topology of weak convergence, and let $X^N$ be the $\mathcal{P}(\mathbb{N}_0)$-valued Markov jump process describing the evolution of copy numbers in the population with $N$ individuals. For $p^N_k$, we study the two cases
$$
\text{(i)} \quad p^N_k = B\big(k, \tfrac12 + \varepsilon_N\big) \ \text{ with } \ N\varepsilon_N \xrightarrow{N\to\infty} \alpha, \qquad \text{(ii)} \quad p^N_k = p_k = U(\{0,\dots,k\}), \tag{3.3}
$$
and show that (3.1) holds at the beginning of Sections 5.1 and 5.2.

Remark 3.1 (Motivation for (i) and (ii)). Case (i) extends earlier work of Pfaffelhuber and Wakolbinger (2023) to a case with bias. For (ii), we refer to Otto et al. (2022) and Otto and Wiehe (2023). The idea behind this model is that there is a uniformly distributed breakpoint within the $k$ (respectively $l$) genetic elements of each parent, and each parent passes on only the part on one side of the breakpoint to the offspring. Clearly, in this case, (3.1) holds with $r_k = \alpha = 0$.
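Condition (3.1) can be checked numerically for case (i) of (3.3); the following is a small sketch using only the standard library (the values of `alpha` and `k` are arbitrary illustrative choices):

```python
from math import comb

def binom_pmf(k, p):
    # pmf of B(k, p), indexed by j = 0, ..., k
    return [comb(k, j) * p**j * (1 - p)**(k - j) for j in range(k + 1)]

alpha, k = 0.7, 6
for N in (10**3, 10**6):
    eps = alpha / N                                   # so that N * eps_N -> alpha
    r = [N * (a - b) for a, b in zip(binom_pmf(k, 0.5 + eps), binom_pmf(k, 0.5))]
    total = sum(r)                                    # approximates sum_j r_k(j), cf. (3.2)
    mean = sum(j * rj for j, rj in enumerate(r))      # approximates sum_j j r_k(j), cf. (3.1)(b)
    print(round(abs(total), 6), round(mean, 6))       # → 0.0 4.2 (for both N)
```

Both lines print `0.0 4.2`, in line with $\sum_j r_k(j) = 0$ and $\sum_j j\, r_k(j) = \alpha k = 4.2$.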
By the dynamics from above, $X^N$ jumps from $x$ to $x + (e_m - e_n)/N$ (where $e_m$ is the $m$-th unit vector) at rate
$$
\lambda^N_{m,n}(x) := \frac{N^2}{2}\, x_n \sum_{k,l} x_k x_l \sum_j p^N_k(j)\, p^N_l(m-j). \tag{3.4}
$$
In our models, a parent carrying $k$ genetic elements contributes on average (i) $(\tfrac12+\varepsilon_N)k$, respectively (ii) $\tfrac k2$, genetic elements. As we will see now, the mean copy number
$$
\Phi(X^N) := \rho_1(X^N) := \sum_{j=0}^\infty j\, X^N_j \tag{3.5}
$$
evolves slowly, while $X^N$ is fast.

Theorem 3.2. Let $X^N$ be a Markov jump process with state space $\mathcal{P}(\mathbb{N}_0)$ and transition rates given by $\lambda^N_{m,n}$ from (3.4). Assume that, for some $z > 0$, $\Phi(X^N_0) \to z$ in probability as $N\to\infty$, and $\sup_N \mathbb{E}\big[ \sum_{k=1}^\infty k^3 X^N_0(k) \big] < \infty$.

1. If $p^N_n$ satisfies (3.3)(i), then $(X^N, \Phi(X^N)) \Rightarrow (X, Z)$ as $N\to\infty$, where $Z$ solves
$$
dZ = \alpha Z\, dt + \sqrt{Z}\, dW,
$$
and $X_t = \mathrm{Poi}(Z_t)$ for all $t \ge 0$.
2. If $p^N_n$ satisfies (3.3)(ii), then $(X^N, \Phi(X^N)) \Rightarrow (X, Z)$ as $N\to\infty$, where $Z$ solves
$$
dZ = \sqrt{Z(Z+2)/2}\; dW,
$$
and $X_t = \mathrm{NB}(2, 2/(2+Z_t))$ (the negative binomial distribution with number of successes $2$ and mean $Z_t$) for all $t \ge 0$.

In both cases, $\Phi(X^N) \Rightarrow Z$ in path space and $X^N \Rightarrow X$ in measure.

Remark 3.3 (More general result for general $p_k$). As the structure of the theorem suggests, there is a correspondence between the family $(p^N_k)_{k=0,1,\dots}$, the form of $X_t$ given $Z_t$ (which holds for all $t > 0$), and the dynamics of $Z$. For the former, for each choice of $(p^N_k)_{k=0,1,\dots}$ there is a family of distributions $(q_z)_{z\ge0}$ for the limit $X$, parameterized by its mean $Z$, i.e. $X_t = q_{Z_t}$ for all $t$. This connection will be made in Lemma 5.1 for (i) and Lemma 5.4 for (ii). As for the dynamics of $Z$, note that the diffusion term is governed by the variance of $q_z$: in case (i), we have $q_z = \mathrm{Poi}(z)$ with variance $z$; in case (ii), we have $q_z = \mathrm{NB}(2, 2/(2+z))$, which has variance $z(z+2)/2$.
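The mean–variance correspondence can be verified numerically from the probability mass functions; a sketch (standard library only; `z` and the truncation level `K` are illustrative choices):

```python
from math import comb, exp, factorial

def mean_var(pmf):
    m = sum(j * w for j, w in enumerate(pmf))
    return m, sum((j - m) ** 2 * w for j, w in enumerate(pmf))

z, K = 1.5, 100                     # K: truncation level, ample for this z
poi = [exp(-z) * z**j / factorial(j) for j in range(K)]
p = 2 / (2 + z)
nb = [comb(j + 1, j) * p**2 * (1 - p)**j for j in range(K)]   # NB(2, p) pmf
m1, v1 = mean_var(poi)
m2, v2 = mean_var(nb)
print(round(m1, 6), round(v1, 6))   # → 1.5 1.5    (mean z, variance z)
print(round(m2, 6), round(v2, 6))   # → 1.5 2.625  (mean z, variance z(z+2)/2)
```

Both laws have mean $z$; the variance of $\mathrm{NB}(2, 2/(2+z))$ comes out as $z(z+2)/2 = 2.625$ for $z = 1.5$, the diffusion coefficient appearing in case (ii).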
This connection holds in greater generality, as we will see in Lemma 4.2.

Remark 3.4 ($p_n = \frac12\delta_0 + \frac12\delta_n$). Yet another canonical choice for the inheritance distributions is $p_n = \frac12\delta_0 + \frac12\delta_n$ for $n = 0,1,2,\dots$, i.e. a parent passes on either none or all of its genetic elements to the offspring, each with probability $\frac12$. This case can be studied directly, since $Y := X(0)$, the frequency of individuals carrying no genetic elements, is an autonomous process. $Y$ jumps from $y$ to
$$
y + \tfrac1N \ \text{ at rate } \ N^2 (1-y)\big( \tfrac14(1-y)^2 + (1-y)y + y^2 \big), \qquad y - \tfrac1N \ \text{ at rate } \ N^2 y\big( \tfrac34(1-y)^2 + (1-y)y \big).
$$
For example, the term $\tfrac14 N^2 (1-y)^3$ accounts for all events in which all three individuals involved in the reproduction event carry at least one genetic element, and both parents pass on none of their elements. Since
$$
(1-y)\big( \tfrac14(1-y)^2 + (1-y)y + y^2 \big) - y\big( \tfrac34(1-y)^2 + (1-y)y \big) \ge (1-y)\big( (1-y)y + y^2 \big) - y\big( (1-y)^2 + (1-y)y \big) = 0,
$$
the process $1 - Y$ is a non-negative supermartingale, hence converges to $0$ almost surely (the convergence happens on the fast time-scale $N^2\, dt$). In other words, although $p_n$ has mean $\frac n2$, in the limit $N\to\infty$ no individual carries any genetic element.

4 Preparation

The proof of Theorem 3.2 is based on an application of Theorem 2.1. In this section, we prepare the proof of some of the assumptions of Theorem 2.1 for the population model from above. We keep a general family of inheritance distributions $(p^N_k)_{k=0,1,\dots}$ for as long as possible; this means that we only assume a certain form for the first three factorial moments of $p_k$ and $r_k$; see (4.4) and (4.11). In Section 5, we restrict ourselves to the cases (i) and (ii) and finalize the proof of Theorem 3.2 in both cases.
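Before the generator computations, the claimed slow evolution of $\Phi(X^N)$ can be sanity-checked directly from the rates (3.4), since the drift of the mean is $\sum_{m,n} \lambda^N_{m,n}(x)\,(m-n)/N$. A sketch for case (i), standard library only (the distribution `x` and `alpha` are arbitrary illustrative choices):

```python
from math import comb

def binom_pmf(k, p):
    return [comb(k, j) * p**j * (1 - p)**(k - j) for j in range(k + 1)]

def mean_drift(x, N, eps):
    # sum_{m,n} lambda^N_{m,n}(x) * (m - n)/N for p^N_k = B(k, 1/2 + eps)
    rho1 = sum(k * xk for k, xk in enumerate(x))
    drift = 0.0
    for k, xk in enumerate(x):
        for l, xl in enumerate(x):
            mean_k = sum(j * w for j, w in enumerate(binom_pmf(k, 0.5 + eps)))
            mean_l = sum(j * w for j, w in enumerate(binom_pmf(l, 0.5 + eps)))
            # offspring copy number m has mean mean_k + mean_l; the dying n has mean rho1
            drift += 0.5 * N * xk * xl * (mean_k + mean_l - rho1)
    return drift, rho1

alpha = 0.7
x = [0.2, 0.5, 0.3]                              # some distribution on {0, 1, 2}
for N in (10**2, 10**5):
    d, rho1 = mean_drift(x, N, alpha / N)
    print(round(d, 8), round(alpha * rho1, 8))   # → 0.77 0.77 (for both N)
```

The $O(N)$ part of the generator cancels in the mean, leaving the drift $N\varepsilon_N\,\rho_1(x) \to \alpha\rho_1(x)$, in line with the drift term of Theorem 3.2.1.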
For the generator $G^N$ of $X^N$ and $f \in \mathcal{D} := C^2_b(\mathcal{P}(\mathbb{N}_0))$, we see directly from the jump rates (3.4), using (3.1) and (3.2),
$$
G^N f(x) = \frac{N^2}{2} \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p^N_k(j)\, p^N_l(m-j)\, \big( f(x + (e_m-e_n)/N) - f(x) \big) = (N G_1 + G_0) f(x) + o(1)
$$
with
$$
\begin{aligned}
G_1 f(x) &= \frac12 \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\, (e_m - e_n) \cdot \nabla f(x), \\
G_0 f(x) &= \frac12 \sum_{k,l} x_k x_l \sum_{j,m} \big( p_k(j)\, r_l(m-j) + r_k(j)\, p_l(m-j) \big)\, e_m \cdot \nabla f(x) \\
&\quad + \frac14 \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\, (e_m - e_n) \cdot \nabla^2 f(x) \cdot (e_m - e_n). 
\end{aligned} \tag{4.1}
$$
This shows that the form (2.1) applies, and A2 holds (with $G_0^N = G_0 + o(1)$), provided that $X^N \Rightarrow X$ in measure. Moreover, for (2.2), using (3.1) and recalling $\Phi$ from (3.5), with $g \in C^1_b(\mathbb{R}_+)$,
$$
G_1(g\circ\Phi)(x) = \frac12 g'(\Phi(x)) \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\, \big( j + (m-j) - n \big) = \frac12 g'(\Phi(x)) \Big( 2\sum_k x_k \sum_j j\, p_k(j) - \rho_1(x) \Big) = 0.
$$
For the remaining tasks, A1 and A3, we will use, for $x \in \mathcal{P}(\mathbb{N}_0)$, the generating function $s \mapsto \psi_s(x)$ and the factorial moments $\rho_n(x)$, $n = 1, 2, \dots$:
$$
\psi_s(x) = \sum_{n=0}^\infty x_n (1-s)^n = \sum_{k=0}^\infty \rho_k(x)\, \frac{(-s)^k}{k!} = 1 - s\rho_1(x) + \frac12 s^2 \rho_2(x) - \frac16 s^3 \rho_3(x) + O(s^4) \tag{4.2}
$$
with
$$
\rho_n(x) = (-1)^n \frac{\partial^n}{\partial s^n}\psi_s(x)\Big|_{s=0} = \sum_k k(k-1)\cdots(k-n+1)\, x_k, \qquad n = 0, 1, 2, \dots
$$
Let us consider the dynamics on the fast time-scale, i.e. let us look at A3. We set
$$
\mathcal{D}' := \text{the algebra generated by } \{ \psi_s : \mathcal{P}(\mathbb{N}_0) \to \mathbb{R}_+ \,:\, s \in [0,1] \}
$$
and now consider the corresponding dynamics.

Lemma 4.1 (Dynamics on the fast time-scale). It holds that
$$
2 G_1 \psi_s(x) = \Big( \sum_k x_k\, \psi_s(p_k) \Big)^2 - \psi_s(x). \tag{4.3}
$$
Assume, in addition, that $p_k$ is such that (3.1) holds and, for suitable $a_2, a_3$,
$$
\rho_2(p_k) = a_2\, k(k-1), \qquad \rho_3(p_k) = a_3\, k(k-1)(k-2). \tag{4.4}
$$
Then,
$$
2 G_1 \rho_2(x) = \tfrac12 \rho_1^2(x) - (1-2a_2)\rho_2(x), \qquad 2 G_1 \rho_3(x) = 3 a_2\, \rho_2(x)\rho_1(x) - (1-2a_3)\rho_3(x). \tag{4.5}
$$

Proof. For the first assertion, we write, using (4.1),
$$
2 G_1 \psi_s(x) = \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\, \big( (1-s)^{j+(m-j)} - (1-s)^n \big) = \sum_{k,l} x_k x_l\, \psi_s(p_k)\,\psi_s(p_l) - \sum_n x_n (1-s)^n = \Big( \sum_k x_k \psi_s(p_k) \Big)^2 - \psi_s(x).
$$
Recalling that $\rho_1(p_k) = \frac k2$ by assumption (see (3.1)), use (4.2) and (4.4) in order to write
$$
\sum_k x_k \psi_s(p_k) = \sum_k x_k \Big( 1 - s\rho_1(p_k) + \tfrac12 s^2 \rho_2(p_k) - \tfrac16 s^3 \rho_3(p_k) + O(s^4) \Big) = 1 - \tfrac12 s\,\rho_1(x) + \tfrac12 s^2 a_2 \rho_2(x) - \tfrac16 s^3 a_3 \rho_3(x) + O(s^4), \tag{4.6}
$$
which implies (writing $\rho_i := \rho_i(x)$, $i = 1,2,3$)
$$
\Big( \sum_k x_k \psi_s(p_k) \Big)^2 = 1 - s\rho_1 + \tfrac12 s^2 \big( \tfrac12\rho_1^2 + 2a_2\rho_2 \big) - \tfrac16 s^3 \big( 3a_2\rho_2\rho_1 + 2a_3\rho_3 \big) + O(s^4). \tag{4.7}
$$
Therefore, evaluating (4.3) as a series in $s$ and comparing coefficients, we obtain (4.5). □

With Lemma 4.1, we will show in Section 5.1 for the binomial case (Lemma 5.1 and Remark 5.2) and in Section 5.2 for the uniform case (Lemma 5.4 and Remark 5.5) that
$$
G_1 \psi_s(x) = 0 \text{ for all } s \quad \text{iff} \quad x \text{ is Poisson (negative binomial) whenever } (p_k)_{k=0,1,2,\dots} \text{ is binomial (uniform)}. \tag{4.8}
$$
In other words, if $G_1\psi_s(x) = 0$ for all $s$, we can define $\Xi(z) = \mathrm{Poi}(z)$ in case (i) and $\Xi(z) = \mathrm{NB}(2, 2/(2+z))$ in case (ii) and have $x = \Xi(\Phi(x))$, as needed for A3.

For A1, i.e. tightness, we need to consider the slow time-scale as well. The corresponding calculations are carried out in Lemmas 4.2, 4.3 and 4.4, which lead to a proof of tightness in Proposition 4.6.

Lemma 4.2 (Dynamics of $\Phi(X^N)$ on the slow time-scale and limiting generator). Assume $r_k$ is as in (3.1), as well as (see Lemma 4.1) $\rho_2(p_k) = a_2 k(k-1)$ for some $a_2$.
Then,
$$
G_0(g\circ\Phi)(x) = g'(\Phi(x))\,\alpha\,\Phi(x) + \tfrac12 g''(\Phi(x)) \Big( \big(a_2+\tfrac12\big)\rho_2(x) + \rho_1(x) - \tfrac34\rho_1^2(x) \Big). \tag{4.9}
$$
In addition, if $x$ solves $G_1\psi_s(x) = 0$, i.e. $x = \Xi(z)$ with $z = \Phi(x)$,
$$
G_0(g\circ\Phi)\circ\Xi(z) = \alpha z\, g'(z) + \tfrac12 g''(z)\, v(\Xi(z)), \tag{4.10}
$$
where $v(x) := \rho_2(x) + \rho_1(x) - \rho_1^2(x)$ is the variance of $x$.

Proof. Using $\sum_k x_k = 1$ and $\sum_j r_k(j) = 0$ several times, as well as
$$
(m-n)^2 = \big( j + (m-j) - n \big)^2 = j(j-1) + j + (m-j)(m-j-1) + (m-j) + n(n-1) + n + 2j(m-j) - 2jn - 2(m-j)n,
$$
we obtain
$$
\begin{aligned}
G_0(g\circ\Phi)(x) &= \frac12 \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} \big( p_k(j)\, r_l(m-j) + r_k(j)\, p_l(m-j) \big)\,(m-n)\; g'(\Phi(x)) \\
&\quad + \frac14 \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\,(m-n)^2\; g''(\Phi(x)) \\
&= \frac12 g'(\Phi(x)) \sum_{k,l} x_k x_l \sum_j \big( j\, r_k(j) + j\, r_l(j) \big) \\
&\quad + \frac14 g''(\Phi(x)) \Big( \sum_k x_k \big( 2\rho_2(p_k) + 2\rho_1(p_k) - 4\rho_1(p_k)\rho_1(x) \big) + 2\Big(\sum_k x_k \rho_1(p_k)\Big)^2 + \rho_2(x) + \rho_1(x) \Big) \\
&= g'(\Phi(x)) \sum_k x_k\, \alpha k + g''(\Phi(x)) \Big( \tfrac12 a_2 \rho_2(x) + \tfrac14\big( \rho_1(x) - 2\rho_1^2(x) + \tfrac12\rho_1^2(x) + \rho_2(x) + \rho_1(x) \big) \Big) \\
&= g'(\Phi(x))\,\alpha\,\Phi(x) + \tfrac12 g''(\Phi(x)) \Big( \big(a_2+\tfrac12\big)\rho_2(x) + \rho_1(x) - \tfrac34\rho_1^2(x) \Big),
\end{aligned}
$$
which is the first assertion. Next, recall that $\Xi(z)$ solves $G_1\psi_s(\Xi(z)) = 0$. Taking two derivatives at $s = 0$ in $G_1\psi_s(\Xi(z)) = 0$ and using (4.3), we find
$$
\rho_2(\Xi(z)) = \frac{\partial^2}{\partial s^2} \Big( \sum_k \Xi(z)_k\, \psi_s(p_k) \Big)^2\Big|_{s=0} = 2 \sum_k \Xi(z)_k\, \rho_2(p_k) + 2\Big( \sum_k \Xi(z)_k\, \rho_1(p_k) \Big)^2 = 2\Big( a_2\rho_2(\Xi(z)) + \tfrac14\rho_1^2(\Xi(z)) \Big),
$$
so that $\big(a_2+\tfrac12\big)\rho_2(\Xi(z)) + \rho_1(\Xi(z)) - \tfrac34\rho_1^2(\Xi(z)) = \rho_2(\Xi(z)) + \rho_1(\Xi(z)) - \rho_1^2(\Xi(z))$. Therefore, we finish the proof with
$$
G_0(g\circ\Phi)\circ\Xi(z) = g'(z)\,\alpha z + \tfrac12 g''(z)\Big( \rho_2(\Xi(z)) + \rho_1(\Xi(z)) - \rho_1^2(\Xi(z)) \Big) = \alpha z\, g'(z) + \tfrac12 g''(z)\, v(\Xi(z)). \;\square
$$

We need some more bounds on the slow time-scale.

Lemma 4.3 (Dynamics on the slow time-scale, $G_0$).
It holds that
$$
G_0 \psi_s(x) = \Big( \sum_k x_k\, \psi_s(p_k) \Big)\Big( \sum_l x_l\, \psi_s(r_l) \Big)
$$
and
$$
\begin{aligned}
G_0\big( \psi_s \psi_t \big)(x) &= \psi_s(x)\, G_0\psi_t(x) + \psi_t(x)\, G_0\psi_s(x) + \tfrac14 \Big( \sum_k x_k\, \psi_{s+t-st}(p_k) \Big)^2 \\
&\quad - \tfrac14 \psi_s(x) \Big( \sum_k x_k\, \psi_t(p_k) \Big)^2 - \tfrac14 \psi_t(x) \Big( \sum_k x_k\, \psi_s(p_k) \Big)^2 + \tfrac14 \psi_{s+t-st}(x).
\end{aligned}
$$
In addition, with $a_2, a_3$ from Lemma 4.1, if (3.1) holds and, for suitable $b_2, b_3$,
$$
\rho_1(r_k) = \alpha k, \qquad \rho_2(r_k) = b_2\, k(k-1), \qquad \rho_3(r_k) = b_3\, k(k-1)(k-2), \tag{4.11}
$$
then
$$
\begin{aligned}
G_0 \rho_1(x) &= \alpha \rho_1(x), \\
G_0 \rho_1^2(x) &= \big( 2\alpha - \tfrac34 \big)\rho_1^2(x) + \big( a_2 + \tfrac12 \big)\rho_2(x) + \rho_1(x), \\
G_0 \rho_2(x) &= \alpha \rho_1^2(x) + b_2 \rho_2(x), \\
G_0 \rho_1^3(x) &= \big( 3\alpha - \tfrac94 \big)\rho_1^3(x) + 3\big( a_2+\tfrac12 \big)\rho_2(x)\rho_1(x) + 3\rho_1^2(x), \\
G_0\big( \rho_2\rho_1 \big)(x) &= \big( a_2 - b_2 + \tfrac12(1-2\alpha) \big)\rho_2(x)\rho_1(x) + \tfrac12\big( \tfrac12 + 2\alpha \big)\rho_1^3(x) - (2a_2+1)\rho_2(x) - \tfrac12\rho_1^2(x), \\
G_0 \rho_3(x) &= \big( 3\alpha a_2 + \tfrac32 b_2 \big)\rho_2(x)\rho_1(x) + b_3 \rho_3(x).
\end{aligned} \tag{4.12}
$$

Proof. We start with a calculation similar to (4.6), replacing $p_k$ by $r_k$; noting that $\psi_0(r_k) = \sum_j r_k(j) = 0$, this leads to
$$
\sum_k x_k\, \psi_s(r_k) = -\alpha s \rho_1(x) + \tfrac12 s^2 b_2 \rho_2(x) - \tfrac16 s^3 b_3 \rho_3(x) + O(s^4). \tag{4.13}
$$
For the first assertion, compute
$$
\begin{aligned}
G_0 \psi_s(x) &= \frac12 \sum_{k,l} x_k x_l \sum_{j,m} \big( p_k(j)\, r_l(m-j) + r_k(j)\, p_l(m-j) \big) (1-s)^{j+(m-j)} \\
&= \frac12 \sum_{k,l} x_k x_l \big( \psi_s(p_k)\psi_s(r_l) + \psi_s(r_k)\psi_s(p_l) \big) = \Big( \sum_k x_k \psi_s(p_k) \Big)\Big( \sum_l x_l \psi_s(r_l) \Big) \\
&= \Big( 1 - \tfrac12 s\rho_1(x) + \tfrac12 s^2 a_2\rho_2(x) + O(s^3) \Big)\Big( -\alpha s\rho_1(x) + \tfrac12 s^2 b_2\rho_2(x) - \tfrac16 s^3 b_3\rho_3(x) + O(s^4) \Big) \\
&= -\alpha s\rho_1(x) + \tfrac12 s^2 \big( \alpha\rho_1^2(x) + b_2\rho_2(x) \big) - \tfrac16 s^3 \Big( 3\alpha a_2\,\rho_2(x)\rho_1(x) + \tfrac32 b_2\,\rho_2(x)\rho_1(x) + b_3\rho_3(x) \Big) + O(s^4),
\end{aligned} \tag{4.14}
$$
which gives the first, third and sixth equalities in (4.12) by comparison of coefficients.
Note that the first, second, and fourth equalities in (4.12) can also be read off from (4.9). For the fifth equality, we need to compute mixed terms. For these, note that $G_0$ consists of both a first- and a second-order derivative part. So, we can write, using (4.2), (4.7) and (4.14),
$$
\begin{aligned}
2 G_0\big(\psi_s \psi_t\big)(x) &= 2\psi_t(x)\, G_0\psi_s(x) + 2\psi_s(x)\, G_0\psi_t(x) \\
&\quad + \sum_n x_n \sum_{k,l} x_k x_l \sum_{j,m} p_k(j)\, p_l(m-j)\, \big((1-t)^m - (1-t)^n\big)\big((1-s)^m - (1-s)^n\big) \\
&= \psi_t(x) \Big(\sum_k x_k \psi_s(p_k)\Big) \Big( 2\sum_l x_l \psi_s(r_l) - \sum_l x_l \psi_s(p_l) \Big) \\
&\quad + \psi_s(x) \Big(\sum_k x_k \psi_t(p_k)\Big) \Big( 2\sum_l x_l \psi_t(r_l) - \sum_l x_l \psi_t(p_l) \Big) \\
&\quad + \Big(\sum_k x_k \psi_{t+s-st}(p_k)\Big)^2 + \psi_{t+s-st}(x).
\end{aligned}
$$
Expanding up to second order in $s$ and first order in $t$ (writing $\rho_i := \rho_i(x)$), this equals
$$
\begin{aligned}
2 G_0\big(\psi_t\psi_s\big)(x) &= -2\alpha t \rho_1 - 2\alpha s \rho_1 + s^2\big( b_2\rho_2 + \alpha\rho_1^2 \big) + st\Big( (2a_2+1)\rho_2 + \big(4\alpha - \tfrac32\big)\rho_1^2 + 2\rho_1 \Big) \\
&\quad + s^2 t\Big( \big(a_2 - b_2 + \tfrac12(1-2\alpha)\big)\rho_2\rho_1 + \tfrac12\big(\tfrac12 + 2\alpha\big)\rho_1^3 - (2a_2+1)\rho_2 - \tfrac12\rho_1^2 \Big) + O(s^3 + t^2).
\end{aligned}
$$
Comparing coefficients yields the result. □

Lemma 4.4 (Bounds on third moments). Assume that $\sup_N \mathbb{E}[\rho_3(X^N_0)] < \infty$ and $a_2, a_3 < \frac12$ (recall (4.4)). Then, for all $T > 0$, there is $C_T < \infty$ with
$$
\sup_N \sup_{0\le t\le T} \mathbb{E}\big[ \rho_3(X^N_t) \big] < C_T.
$$

Proof. Let us rearrange some results from Lemmas 4.1 and 4.3. Writing $\rho := (\rho_1, \rho_1^2, \rho_2, \rho_1^3, \rho_2\rho_1, \rho_3)^\top$, we have $G_1 \rho = M_1 \rho$ and $G_0 \rho = M_0 \rho$ with
$$
M_1 = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac14 & -\tfrac12(1-2a_2) & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac14 & -\tfrac12(1-2a_2) & 0 \\
0 & 0 & 0 & 0 & \tfrac32 a_2 & -\tfrac12(1-2a_3)
\end{pmatrix},
$$
$$
M_0 = \begin{pmatrix}
\alpha & 0 & 0 & 0 & 0 & 0 \\
1 & 2\alpha-\tfrac34 & a_2+\tfrac12 & 0 & 0 & 0 \\
0 & \alpha & b_2 & 0 & 0 & 0 \\
0 & 3 & 0 & 3\alpha-\tfrac94 & 3\big(a_2+\tfrac12\big) & 0 \\
0 & -\tfrac12 & -(2a_2+1) & \tfrac12\big(\tfrac12+2\alpha\big) & a_2-b_2+\tfrac12(1-2\alpha) & 0 \\
0 & 0 & 0 & 0 & 3\alpha a_2+\tfrac32 b_2 & b_3
\end{pmatrix}.
$$
An analysis using a computer algebra system yields: the matrix $N M_1 + M_0$ has eigenvalues $\lambda_1, \dots, \lambda_6$ with $\lambda_1, \lambda_2, \lambda_3 = O(1)$, and $\lambda_4, \lambda_5 = -N\big(\tfrac12 - a_2\big) + O(1)$, $\lambda_6 = -N\big(\tfrac12 - a_3\big) + O(1)$. So, since $a_2, a_3 < 1/2$, for every $T > 0$ there is $c_T < \infty$ such that $\sup_{0\le t\le T} e^{\lambda_i t} \le c_T$, $i = 1, \dots, 6$. Moreover, we can represent the vector $e_6$ (in direction $\rho_3$) as a linear combination of the corresponding eigenvectors $v_1, \dots, v_6$, so $e_6 = \sum_{i=1}^6 a_i v_i$, where $v_i$ is an eigenvector for the eigenvalue $\lambda_i$, $i = 1, \dots, 6$.
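The computer-algebra step can be reproduced numerically; the following sketch assumes `numpy` is available and uses the matrices $M_1$, $M_0$ from the proof with the case (i) coefficients (the value of `alpha` is an arbitrary illustrative choice):

```python
import numpy as np

# basis order: (rho1, rho1^2, rho2, rho1^3, rho2*rho1, rho3)
def matrices(alpha, a2, a3, b2, b3):
    M1 = np.zeros((6, 6))
    M1[2, 1], M1[2, 2] = 1/4, -(1 - 2*a2)/2
    M1[4, 3], M1[4, 4] = 1/4, -(1 - 2*a2)/2
    M1[5, 4], M1[5, 5] = 3*a2/2, -(1 - 2*a3)/2
    M0 = np.zeros((6, 6))
    M0[0, 0] = alpha
    M0[1, 0], M0[1, 1], M0[1, 2] = 1, 2*alpha - 3/4, a2 + 1/2
    M0[2, 1], M0[2, 2] = alpha, b2
    M0[3, 1], M0[3, 3], M0[3, 4] = 3, 3*alpha - 9/4, 3*(a2 + 1/2)
    M0[4, 1], M0[4, 2] = -1/2, -(2*a2 + 1)
    M0[4, 3], M0[4, 4] = (1/2 + 2*alpha)/2, a2 - b2 + (1 - 2*alpha)/2
    M0[5, 4], M0[5, 5] = 3*alpha*a2 + 3*b2/2, b3
    return M1, M0

alpha, a2, a3, b2, b3 = 0.3, 1/4, 1/8, 1/4, -1/8   # case (i) coefficients
N = 10**6
M1, M0 = matrices(alpha, a2, a3, b2, b3)
lam = np.sort(np.linalg.eigvals(N * M1 + M0).real)
print(np.round(lam / N, 3))   # ≈ [-0.375, -0.25, -0.25, 0, 0, 0]
```

Three eigenvalues stay $O(1)$, two sit near $-N(\tfrac12-a_2) = -N/4$, and one near $-N(\tfrac12-a_3) = -3N/8$, as claimed.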
Since $(e^{-\lambda_i t}\, v_i^\top \rho(X^N_t))_{t\ge0}$ is a martingale for $i = 1, \dots, 6$, we can write
$$
\sup_{0\le t\le T} \mathbb{E}[\rho_3(X^N_t)] = \sup_{0\le t\le T} \sum_{i=1}^6 a_i\, \mathbb{E}[v_i^\top \rho(X^N_t)] = \sup_{0\le t\le T} \sum_{i=1}^6 a_i e^{\lambda_i t}\, \mathbb{E}[v_i^\top \rho(X^N_0)] < C_T \sup_N \mathbb{E}[\rho_3(X^N_0)]
$$
for some $C_T < \infty$ depending only on $T$, since all eigenvalues are either $O(1)$ or negative (use $a_2, a_3 < \frac12$ here). This gives the result. □

Lemma 4.5 (A martingale). For each $N$, the process $(M^N_t)_{t\ge0}$ with $M^N_t := e^{-\alpha t}\Phi(X^N_t)$ is a martingale with quadratic variation (recall $a_2$ from Lemma 4.1)
$$
\Big( \int_0^t e^{-2\alpha s}\, F(X^N_s)\, ds \Big)_{t\ge0}, \qquad F(X^N_s) := \big(a_2+\tfrac12\big)\rho_2(X^N_s) + \rho_1(X^N_s) - \tfrac34\rho_1^2(X^N_s). \tag{4.15}
$$

Proof. Recall from Ethier and Kurtz (1986), Lemma 4.3.2, that
$$
e^{-\alpha t}\Phi(X^N_t) + \int_0^t e^{-\alpha s}\big( \alpha\Phi(X^N_s) - G_0\Phi(X^N_s) \big)\, ds = e^{-\alpha t}\Phi(X^N_t) \tag{4.16}
$$
is a martingale, where the integral vanishes since $G_0\Phi = \alpha\Phi$ by (4.12). We compute its quadratic variation using Lemma 4.3:
$$
\big\langle M^N \big\rangle_t = \int_0^t e^{-2\alpha s}\Big( G_0\Phi^2(X^N_s) - 2\Phi(X^N_s)\, G_0\Phi(X^N_s) \Big)\, ds = \int_0^t e^{-2\alpha s}\, F(X^N_s)\, ds,
$$
since $G_0\rho_1^2 - 2\rho_1\, G_0\rho_1 = (2\alpha-\tfrac34)\rho_1^2 + (a_2+\tfrac12)\rho_2 + \rho_1 - 2\alpha\rho_1^2 = F$ by (4.12). □

Proposition 4.6. Let $\mathcal{D}'_E := \{\psi_t : t \in [0,1]\}$, and assume that, for all $T > 0$, there is $C_T < \infty$ with $\sup_N \sup_{0\le t\le T}\mathbb{E}[\rho_3(X^N_t)] < C_T$. Then $((\Phi(X^N_t))_{t\ge0})_N$ is tight, and $((X^N_t)_{t\ge0})_N$ is tight in measure.

Proof. We will show the following:
1. (one-dimensional tightness) For every $t \in [0,T]$, the family $(\Phi(X^N_t))_{N\ge1}$ is tight.
2. (tightness of $(X^N)_{N\ge1}$) The family $(X^N)_{N\ge1}$ is tight in measure.
3. (Aldous condition) For every $\varepsilon > 0$, $T > 0$, and every sequence of stopping times $\tau_N$ bounded by $T$,
$$
\lim_{\delta\downarrow0} \limsup_{N\to\infty} \sup_{0\le\theta\le\delta} \mathbb{P}\big( |\Phi(X^N_{\tau_N+\theta}) - \Phi(X^N_{\tau_N})| > \varepsilon \big) = 0.
$$
Then, tightness of $(\Phi(X^N))_{N\ge1}$ in $D(\mathbb{R}_+)$ follows from 1. and 3. by the Aldous–Rebolledo criterion (see Theorem 1.17 in Etheridge, 2001). The second claim, tightness in measure of $(X^N)_{N\ge1}$, equals 2.
We will use the martingale $(M^N_t)_{t\ge0}$ with $M^N_t := e^{-\alpha t}\Phi(X^N_t)$ and the notation from Lemma 4.5, in particular its quadratic variation, as given in (4.15).

For 1., note that since $(e^{-\alpha t}\Phi(X^N_t))_{t\ge0}$ is a martingale (see (4.16)), we have $\mathbb{E}[\Phi(X^N_t)] = e^{\alpha t}\,\mathbb{E}[\Phi(X^N_0)]$ for every $t \ge 0$. Using the Markov inequality, for any $C > 0$,
$$
\mathbb{P}\big( \Phi(X^N_t) > C \big) \le \frac{e^{\alpha t}\,\mathbb{E}[\Phi(X^N_0)]}{C}.
$$
Since $\sup_N \mathbb{E}[\Phi(X^N_0)] < \infty$ under the assumptions of Theorem 3.2, we find that $(\Phi(X^N_t))_{N\ge1}$ is tight for all $t \ge 0$.

For 2., we show that the sequence $(\Gamma_{X^N})_{N\ge1}$, defined in (2.3), is tight in $\mathcal{P}([0,\infty) \times \mathcal{P}(\mathbb{N}_0))$. For each $N$, let $M^N$ be as in Lemma 4.5; this is a non-negative martingale with uniformly bounded initial expectations, $\sup_N \mathbb{E}[\Phi(X^N_0)] < \infty$. Fix $\varepsilon > 0$ and choose $T > 0$ such that $\int_0^T e^{-s}\, ds \ge 1 - \varepsilon$. Applying Doob's maximal inequality to the martingale $(M^N_t)_{t\ge0}$, there exists a constant $\lambda_\varepsilon < \infty$ such that
$$
\mathbb{P}\Big( \sup_{0\le t\le T} M^N_t \le \lambda_\varepsilon \Big) \ge 1 - \varepsilon \quad \text{for all } N.
$$
Observe that, on this event,
$$
\sup_{0\le t\le T} \Phi(X^N_t) = \sup_{0\le t\le T} e^{\alpha t} M^N_t \le (1 \vee e^{\alpha T})\,\lambda_\varepsilon =: C_\varepsilon.
$$
Define the relatively compact set $K_{C_\varepsilon} := \{x \in \mathcal{P}(\mathbb{N}_0) : \Phi(x) \le C_\varepsilon\}$. Then, on the event $\{\sup_{0\le t\le T}\Phi(X^N_t) \le C_\varepsilon\}$, we have $X^N_t \in K_{C_\varepsilon}$ for all $t \in [0,T]$, which implies
$$
\Gamma_{X^N}\big([0,T] \times K_{C_\varepsilon}\big) = \int_0^T e^{-s}\, 1_{\{X^N_s \in K_{C_\varepsilon}\}}\, ds \ge \int_0^T e^{-s}\, ds \ge 1 - \varepsilon.
$$
Combining the probability bound with the inequality above, we obtain
$$
\mathbb{P}\Big( \Gamma_{X^N}\big([0,T] \times K_{C_\varepsilon}\big) \ge 1 - \varepsilon \Big) \ge \mathbb{P}\Big( \sup_{0\le t\le T} M^N_t \le \lambda_\varepsilon \Big) \ge 1 - \varepsilon,
$$
which is precisely the tightness condition required by Prohorov's theorem. Consequently, the sequence $(X^N)_{N\ge1}$ is tight in measure.

For 3., recall the martingale $(M^N_t)_{t\ge0}$ with $M^N_t := e^{-\alpha t}\Phi(X^N_t)$ from Lemma 4.5 and its quadratic variation, as given in (4.15).
Let $(\tau_N)_{N\ge1}$ be stopping times bounded by $T > 0$, and fix $\varepsilon > 0$. By the assumption on finite third factorial moments, there exists a constant $C_T < \infty$ such that (recall $F$ from (4.15))
$$
\sup_{N\ge1} \sup_{0\le s\le T+1} \mathbb{E}[F(X^N_s)] \le C_T.
$$
Hence, for any $0 \le \theta \le \delta \le 1$,
$$
\mathbb{E}\big[ [M^N]_{\tau_N+\theta} - [M^N]_{\tau_N} \big] = \mathbb{E}\Big[ \int_{\tau_N}^{\tau_N+\theta} e^{-2\alpha s}\, F(X^N_s)\, ds \Big] \le \big( 1 \vee e^{2|\alpha|(T+1)} \big)\, \delta\, C_T =: \delta\, C'_T.
$$
So we may write, with $c_T := 1 \vee e^{|\alpha|(T+1)}$,
$$
\big| \Phi(X^N_{\tau_N+\theta}) - \Phi(X^N_{\tau_N}) \big| = \big| e^{\alpha(\tau_N+\theta)} M^N_{\tau_N+\theta} - e^{\alpha\tau_N} M^N_{\tau_N} \big| \le c_T \Big( \big( e^{|\alpha|\delta} - 1 \big) M^N_{\tau_N+\theta} + \big| M^N_{\tau_N+\theta} - M^N_{\tau_N} \big| \Big)
$$
and bound each term. By Doob's inequality and optional stopping,
$$
\mathbb{P}\Big( \big| M^N_{\tau_N+\theta} - M^N_{\tau_N} \big| > \frac{\varepsilon}{2c_T} \Big) \le \frac{4c_T^2}{\varepsilon^2}\, \mathbb{E}\big[ [M^N]_{\tau_N+\theta} - [M^N]_{\tau_N} \big] \le \frac{4 c_T^2\, \delta\, C'_T}{\varepsilon^2}.
$$
Choose $\delta_1 := \frac{\varepsilon^3}{8 c_T^2 C'_T}$, so that this probability is at most $\varepsilon/2$, uniformly in $N$. For the multiplicative term, $M^N_{\tau_N+\theta} \le \sup_{0\le t\le T+1} M^N_t$, and we may choose $K_\varepsilon > 0$ such that $\sup_N \mathbb{P}\big( \sup_{0\le t\le T+1} M^N_t > K_\varepsilon \big) \le \varepsilon/2$. Choose $\delta_2 > 0$ such that $c_T\big( e^{|\alpha|\delta_2} - 1 \big) K_\varepsilon \le \varepsilon/2$. Finally, let $\delta := \min\{\delta_1, \delta_2, 1\}$. Then, for all $0 \le \theta \le \delta$,
$$
\mathbb{P}\big( |\Phi(X^N_{\tau_N+\theta}) - \Phi(X^N_{\tau_N})| > \varepsilon \big) \le \mathbb{P}\Big( c_T\big( e^{|\alpha|\delta} - 1 \big) M^N_{\tau_N+\theta} > \varepsilon/2 \Big) + \mathbb{P}\Big( c_T \big| M^N_{\tau_N+\theta} - M^N_{\tau_N} \big| > \varepsilon/2 \Big) \le \varepsilon,
$$
uniformly in $N$. This verifies the Aldous condition for tightness. □

Remark 4.7. Let us summarize what remains to be done for the proof of Theorem 3.2, as an application of Theorem 2.1:
1. Show that the factorial moments take the forms given in (4.4) and (4.11).
2. For the resulting $a_2, a_3$, check that $a_2, a_3 < \frac12$, so that the assumptions of Lemma 4.4 hold; this yields the required tightness of $(\Phi(X^N))_N$ and $(X^N)_N$ via Proposition 4.6.
3.
For A3, show (4.8), i.e. that $G_1\psi_s(x) = 0$ for all $s$ implies that $x$ is Poisson (negative binomial).
4. Show that the right-hand side of (4.10) corresponds to the generator of $Z$.

We will prove these four assertions in the next section.

5 Proof of Theorem 3.2

5.1 Case (i): binomial/Poisson

In this part, we focus on case (3.3)(i). The goal of this section is to prove 1.–4. from Remark 4.7; as explained there, this will conclude the proof of Theorem 3.2.1.

Note that, for $p_k = B(k, \frac12)$,
$$
\psi_s(p_k) = \sum_{i=0}^k \binom{k}{i} \frac{1}{2^k} (1-s)^i = \Big( 1 - \frac s2 \Big)^k = 1 - s\,\frac k2 + \frac12 s^2\, \frac{k(k-1)}{4} - \frac16 s^3\, \frac{k(k-1)(k-2)}{8} + O(s^4),
$$
i.e.
$$
\rho_1(p_k) = \frac k2, \qquad \rho_2(p_k) = \frac{k(k-1)}{4}, \qquad \rho_3(p_k) = \frac{k(k-1)(k-2)}{8}.
$$
This shows (4.4) with $a_2 = \frac14$ and $a_3 = \frac18$; in particular, $a_2, a_3 < \frac12$. Next, for $p^N_k = B(k, \frac12 + \varepsilon_N)$ with $N\varepsilon_N \to \alpha \in \mathbb{R}$, recall that $\psi_s(B(k,p)) = (1-sp)^k$, so
$$
N\big( \psi_s(p^N_k) - \psi_s(p_k) \big) = N\Big( \big( 1 - s(\tfrac12+\varepsilon_N) \big)^k - \big( 1 - \tfrac s2 \big)^k \Big) \xrightarrow{N\to\infty} -\alpha k s\,\big( 1 - \tfrac s2 \big)^{k-1} = -\alpha k s + \frac12 s^2\, \alpha k(k-1) - \frac16 s^3\, \frac34 \alpha k(k-1)(k-2) + O(s^4),
$$
which shows 1. and 2. from Remark 4.7 and gives (4.11) with $b_2 = \alpha$ and $b_3 = \frac34\alpha$.

Next, we turn to 3. Note that
$$
2 G_1 \psi_s(x) = \psi_{s/2}(x)^2 - \psi_s(x). \tag{5.1}
$$
Recall that, if $x = \mathrm{Poi}(\lambda)$, then $\lambda = \rho_1(x)$,
$$
\psi_s(x) = e^{-\lambda} \sum_{k=0}^\infty \frac{\lambda^k}{k!} (1-s)^k = e^{-s\lambda},
$$
and $s \mapsto \psi_s(x)$ characterizes $x$ uniquely. We immediately see that $G_1\psi_s(x) = 0$ if $x$ is Poisson. The reverse implication is given next; here is a version of Lemma 3.8 of Pfaffelhuber and Wakolbinger (2023).

Lemma 5.1 (Characterization of Poisson distributions). Let $\psi_s$ and $\rho_n$ be as in (4.2), and let $x \in \mathcal{P}(\mathbb{N}_0)$ with $\rho_1(x) < \infty$. Then the following are equivalent:
1. $x = \mathrm{Poi}(\rho_1(x))$;
2. for all $n = 1, 2, \dots$
and $s_1,\dots,s_n \in [0,1]$,
$$\psi_{s_1}(x)\cdots\psi_{s_n}(x) = \frac1n\sum_{j=1}^n \psi^2_{s_j/2}(x)\prod_{\substack{k=1\\ k\neq j}}^n \psi_{s_k}(x).$$

Proof. Since generating functions uniquely determine probability distributions, 1. is equivalent to

1'. $\psi_s(x) = e^{-s\rho_1(x)}$.

1'. $\Rightarrow$ 2.: By assumption we have $\psi_{s_1}(x)\cdots\psi_{s_n}(x) = e^{-(s_1+\cdots+s_n)\rho_1(x)}$. Since the right-hand side only depends on $s_1+\cdots+s_n$, and replacing $\psi_{s_j}$ by $\psi^2_{s_j/2}$ leaves the sum of the indices unchanged, the result follows from
$$\psi_{s_1}(x)\cdots\psi_{s_n}(x) = \psi^2_{s_j/2}(x)\prod_{\substack{k=1\\ k\neq j}}^n \psi_{s_k}(x).$$

2. $\Rightarrow$ 1'.: We start with the following observation: for $s > 0$, let $(s_{kj})_{k\in\mathbb N,\,j=1,\dots,k}$ be asymptotically negligible (in the sense that $\sup_j |s_{kj}| \xrightarrow{k\to\infty} 0$) with $\sum_{j=1}^k s_{kj} = s$. Then, since
$$\psi_{s_{kj}}(x) = \sum_{i=0}^\infty x_i(1-s_{kj})^i = 1 - \big(s_{kj}+o(s_{kj})\big)\sum_{i=0}^\infty i x_i$$
(where we have used that $\rho_1(x) < \infty$), we have
$$\log\prod_{j=1}^k \psi_{s_{kj}}(x) = \sum_{j=1}^k \log\big(1 - (s_{kj}+o(s_{kj}))\rho_1(x)\big) \xrightarrow{k\to\infty} -s\rho_1(x). \qquad (5.2)$$
Now we come to proving the assertion. Fix $s \in [0,1]$, and let $\mathcal P_n$ be a random partition of $[0,s)$ with $n$ elements, which arises iteratively as follows: starting with $\mathcal P_1 = \{[0,s)\}$, let $\mathcal P_{n+1}$ arise from $\mathcal P_n$ by randomly taking one partition element $[a,b)$ from $\mathcal P_n$ and replacing it by the two elements $[a,(a+b)/2)$ and $[(a+b)/2,b)$. (For example, we may have $\mathcal P_1 = \{[0,s)\}$, $\mathcal P_2 = \{[0,s/2),[s/2,s)\}$, $\mathcal P_3 = \{[0,s/4),[s/4,s/2),[s/2,s)\}$, $\mathcal P_4 = \{[0,s/4),[s/4,3s/8),[3s/8,s/2),[s/2,s)\}$, ...) From 2., we find iteratively
$$\psi_s(x) = \sum_{\Pi_n}\mathbb P(\mathcal P_n = \Pi_n)\prod_{\pi\in\Pi_n}\psi_{|\pi|}(x) = \mathbb E\Big[\prod_{\pi\in\mathcal P_n}\psi_{|\pi|}(x)\Big].$$
It is not hard to see that – almost surely – every partition element in $\mathcal P_n$ eventually gets split in two, so $\{|\pi| : \pi\in\mathcal P_n\}$ is asymptotically negligible as $n\to\infty$.
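As a quick numerical illustration (not part of the proof), both mechanisms used here can be checked for a Poisson law, for which $\psi_s(x) = e^{-s\rho_1(x)}$ exactly: the product of $\psi$ over a fine partition of $[0,s)$, cf. (5.2), and the splitting identity of 2. for $n=2$. The parameter and truncation cutoff below are arbitrary choices of this sketch.

```python
import math

# Sanity check for x = Poi(LAM): psi_s(x) = exp(-s * LAM), so products of psi
# over fine partitions of [0, s) equal exp(-s * LAM), and the splitting
# identity of 2. holds. LAM and CUTOFF are arbitrary choices for this sketch.
LAM, CUTOFF = 1.7, 200

# Poisson weights x_i = exp(-LAM) * LAM^i / i!, computed iteratively
weights = [math.exp(-LAM)]
for i in range(1, CUTOFF):
    weights.append(weights[-1] * LAM / i)

def psi(s):
    # psi_s(x) = sum_i x_i * (1 - s)^i
    return sum(w * (1.0 - s) ** i for i, w in enumerate(weights))

# dyadic partition of [0, s) into 2^k pieces of length s / 2^k, cf. (5.2)
s, k = 0.8, 10
assert abs(psi(s / 2 ** k) ** (2 ** k) - math.exp(-s * LAM)) < 1e-6

# splitting identity of 2. with n = 2:
# psi_{s1} psi_{s2} = (psi_{s1/2}^2 psi_{s2} + psi_{s2/2}^2 psi_{s1}) / 2
s1, s2 = 0.3, 0.6
lhs = psi(s1) * psi(s2)
rhs = (psi(s1 / 2) ** 2 * psi(s2) + psi(s2 / 2) ** 2 * psi(s1)) / 2
assert abs(lhs - rhs) < 1e-12
```

For a non-Poisson law with the same mean, the splitting identity fails, which is the content of the lemma.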
Therefore,
$$\prod_{\pi\in\mathcal P_n}\psi_{|\pi|}(x) \xrightarrow{n\to\infty} e^{-s\rho_1(x)}$$
almost surely by (5.2) and dominated convergence. Combining the last two equalities gives 1'.

Remark 5.2. Note that from (5.1), since $G_1$ is a first derivative,
$$\frac 2n G_1(\psi_{s_1}\cdots\psi_{s_n})(x) = \frac1n\sum_{j=1}^n \psi^2_{s_j/2}(x)\prod_{\substack{k=1\\ k\neq j}}^n \psi_{s_k}(x) - \psi_{s_1}(x)\cdots\psi_{s_n}(x).$$
In particular, Lemma 5.1 shows that for some $x$ with $\rho_1(x) < \infty$, using again that $G_1$ is a first derivative, we have $G_1\psi_s(x) = 0$ for all $s \in [0,1]$ iff $x = \mathrm{Poi}(\rho_1(x))$.

Lemma 5.3. If $\Xi(z) = \mathrm{Poi}(z)$, then
$$G_0(g\circ\Phi)\circ\Xi(z) = \alpha z g'(z) + \frac12 z g''(z).$$

Proof. This is straightforward from (4.10), since the variance of a Poisson distribution coincides with its parameter.

Proof of Theorem 3.2.1. As announced at the end of Section 4, we have to show (4.8) for A3, which is the precise result from Remark 5.2 (based on Lemma 5.1). From Proposition 4.6, we see – based on finite third moments – that $((\Phi(X^N_t))_{t\geq 0})_N$ is tight and $(X^N_t)_{t\geq 0}$ is tight in measure. In particular, A1 holds along subsequences. Last, the form of the generator of $Z$ is given in Lemma 4.10. Noting that $v(\Xi(z)) = z$ (the variance of a Poisson distribution coincides with its parameter), we are done.

5.2 Case (ii): Uniform/negative binomial

In this part, we focus on case (3.3)(ii). The goal of this section is to prove 1.–4. from Remark 4.7. Note that
$$\psi_s(p_k) = \sum_{j=0}^k \frac{1}{k+1}(1-s)^j = \frac{1}{k+1}\,\frac{1-(1-s)^{k+1}}{s} = \frac1s\int_0^s (1-r)^k\,dr = \frac{1}{k+1}\sum_{j=0}^k \binom{k+1}{j+1}(-s)^j$$
$$= 1 - s\,\frac k2 + \frac{s^2}{2}\,\frac{k(k-1)}{3} - \frac{s^3}{6}\,\frac{k(k-1)(k-2)}{4} + O(s^4),$$
i.e.
$$\rho_1(p_k) = \frac k2,\qquad \rho_2(p_k) = \frac{k(k-1)}{3},\qquad \rho_3(p_k) = \frac{k(k-1)(k-2)}{4},$$
which shows (4.4) with $a_2 = \frac13 < 1$ and $a_3 = \frac14$. Moreover, (4.11) holds since $r_k = 0$ for all $k$. This shows 1. and 2. For 3., from Lemma 4.1,
$$2G_1\psi_t(x) = \Big(\frac1t\int_0^t \psi_s(x)\,ds\Big)^2 - \psi_t(x).$$
(5.3)

We denote by $\mathrm{NB}(k,p)$ the negative binomial distribution, i.e. the distribution of the number of failures in a Bernoulli experiment with success probability $p$ until the $k$th success. Recall that expectation and variance are given by
$$\rho_1(\mathrm{NB}(2,p)) = \frac{2(1-p)}{p},\qquad v(\mathrm{NB}(2,p)) = \frac{2(1-p)}{p^2}.$$
In other words, for $p = \frac{2}{z+2}$, we have
$$\rho_1\Big(\mathrm{NB}\Big(2,\frac{2}{z+2}\Big)\Big) = z,\qquad \rho_2\Big(\mathrm{NB}\Big(2,\frac{2}{z+2}\Big)\Big) = \frac32 z^2,\qquad v\Big(\mathrm{NB}\Big(2,\frac{2}{z+2}\Big)\Big) = \frac12 z(z+2). \qquad (5.4)$$

Lemma 5.4 (Characterization of a negative binomial distribution). Let $x \in \mathcal P(\mathbb N_0)$ with $\rho_1(x) < \infty$. Then the following are equivalent:

1. $x = \mathrm{NB}(2,p)$;
2. $p = \frac{2}{z+2}$ with $\rho_1(x) = -\psi'_0(x) = z$, and for all $t \in [0,1]$ we have
$$\Big(\frac1t\int_0^t \psi_s(x)\,ds\Big)^2 - \psi_t(x) = 0.$$

Proof. Recall that for $x = \mathrm{NB}(k,p)$ and $z = \frac2p - 2$ (which is equivalent to $p = \frac{2}{z+2}$),
$$\psi_t(x) = \Big(\frac{p}{1-(1-p)(1-t)}\Big)^k = \Big(\frac{2}{2+tz}\Big)^k.$$
From this, 1. $\Rightarrow$ 2. is a straightforward calculation. For 2. $\Rightarrow$ 1., we study the integral equation
$$\Big(\frac1t\int_0^t \psi_s(x)\,ds\Big)^2 - \psi_t(x) = 0. \qquad (5.5)$$
Define
$$\beta_s(x) := \frac1s\int_0^s \psi_r(x)\,dr,$$
and note that (5.5) for all $t$ implies, by integration,
$$0 = \int_0^t \beta^2_s(x)\,ds - t\beta_t(x),$$
i.e. (by taking derivatives with respect to $t$)
$$\beta^2_t(x) - \beta_t(x) - t\,\frac{d}{dt}\beta_t(x) = 0. \qquad (5.6)$$
Note that, taking another derivative with respect to $t$ at $t=0$,
$$\frac{d}{dt}\Big(\beta^2_t(x) - \beta_t(x) - t\,\frac{d}{dt}\beta_t(x)\Big)\Big|_{t=0} = \frac{d}{dt}\big(\beta^2_t(x) - 2\beta_t(x)\big)\Big|_{t=0} = 2\beta'_0(x)\big(\beta_0(x) - 1\big) = 0,$$
since $\beta_0(x) = 1$; i.e. in order to have a unique solution of the initial value problem (5.6), we need to fix $z := -2\beta'_0(x)$. In addition,
$$\beta'_0(x) = \lim_{t\to 0}\frac{\frac1t\int_0^t \psi_r(x)\,dr - 1}{t} = \lim_{t\to 0}\frac{\frac1t\int_0^t \big(r\psi'_0(x) + o(r)\big)\,dr}{t} = \frac12\psi'_0(x).$$
Since the ODE (5.6) satisfies the usual Lipschitz condition, it has a unique solution with $-2\beta'_0(x) = -\psi'_0(x) = z := \rho_1(x)$, which can be computed using separation of variables, and is given by
$$\beta_t(x) = \frac{1}{1+tz/2},\qquad\text{so}\qquad \psi_t(x) = \beta^2_t(x) = \Big(\frac{2}{2+tz}\Big)^2.$$
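As a numerical aside (not part of the proof), the explicit formulas above can be sanity-checked directly: the pmf of $\mathrm{NB}(2,p)$ is $x_k = (k+1)p^2(1-p)^k$, and for $p = 2/(z+2)$ one can verify $\psi_t(x) = \beta^2_t(x) = (2/(2+tz))^2$, the moments in (5.4), and that $\beta_t$ solves (5.6). The parameter $z$ and the truncation cutoff below are arbitrary choices of this sketch.

```python
import math

# Numerical check: for x = NB(2, p) with p = 2/(z+2), the pmf is
# x_k = (k+1) p^2 (1-p)^k. Then psi_t(x) should equal (2/(2+t z))^2, the
# moments should match (5.4), and beta_t(x) = 1/(1 + t z/2) should solve
# (5.6). Z and CUTOFF are arbitrary choices for this sketch.
Z = 3.0
P = 2.0 / (Z + 2.0)
CUTOFF = 4000

pmf = [(k + 1) * P ** 2 * (1.0 - P) ** k for k in range(CUTOFF)]

def psi(t):
    # psi_t(x) = sum_k x_k * (1 - t)^k
    return sum(x * (1.0 - t) ** k for k, x in enumerate(pmf))

def beta(t):
    # claimed solution of (5.6)
    return 1.0 / (1.0 + t * Z / 2.0)

# psi_t = beta_t^2 = (2 / (2 + t z))^2
for t in (0.1, 0.5, 0.9):
    assert abs(psi(t) - (2.0 / (2.0 + t * Z)) ** 2) < 1e-9
    assert abs(psi(t) - beta(t) ** 2) < 1e-9

# factorial moments and variance, cf. (5.4)
rho1 = sum(k * x for k, x in enumerate(pmf))
rho2 = sum(k * (k - 1) * x for k, x in enumerate(pmf))
var = rho2 + rho1 - rho1 ** 2
assert abs(rho1 - Z) < 1e-9                 # rho_1 = z
assert abs(rho2 - 1.5 * Z ** 2) < 1e-9      # rho_2 = (3/2) z^2
assert abs(var - 0.5 * Z * (Z + 2)) < 1e-9  # v = z(z+2)/2

# beta solves (5.6): beta^2 - beta - t * beta' = 0 (central finite difference)
h = 1e-6
for t in (0.2, 0.7):
    dbeta = (beta(t + h) - beta(t - h)) / (2 * h)
    assert abs(beta(t) ** 2 - beta(t) - t * dbeta) < 1e-6
```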
The claim follows since $t\mapsto\psi_t(x)$ determines $x$ uniquely.

Remark 5.5. Note that from (5.3),
$$2G_1\psi_t(x) = \Big(\frac1t\int_0^t \psi_s(x)\,ds\Big)^2 - \psi_t(x).$$
In particular, Lemma 5.4 shows that for some $x$ with $\rho_1(x) < \infty$, we have $G_1 f(x) = 0$ for all $f \in \mathcal D'_E := \{\psi_t : t\in[0,1]\}$ iff $x = \mathrm{NB}(2, 2/(\rho_1(x)+2))$.

Lemma 5.6. If $\Xi(z) = \mathrm{NB}\big(2,\frac{2}{2+z}\big)$, then $\rho_1(\Xi(z)) = z$ and
$$G_0(g\circ\Phi)\circ\Xi(z) = \frac12 z(z+2)\,g''(z).$$

Proof. This is straightforward from (4.10), since the variance of $\mathrm{NB}\big(2,\frac{2}{z+2}\big)$ is $\frac12 z(z+2)$; see (5.4).

Proof of Theorem 3.2.2. We proceed as in the proof of Theorem 3.2.1. Again, for A3, see Remark 5.5 (based on Lemma 5.4). Again, A1 holds along a subsequence. Last, for the form of the generator of $Z$ as given in Lemma 4.10, note that $v(\Xi(z)) = \frac12 z(z+2)$ (see (5.4)), so we are done.

Acknowledgements

We thank Emmanuel Schertzer for bringing Otto and Wiehe (2023) to our attention. PP is supported by the Freiburg Center for Data Analysis, Modeling, and AI.

References

Ball, K., T. G. Kurtz, L. Popovic, and G. Rempala (2006). Asymptotic analysis of multiscale approximations to reaction networks. The Annals of Applied Probability 16(4), 1925–1961.

Berglund, N. and B. Gentz (2006). Noise-Induced Phenomena in Slow-Fast Dynamical Systems: A Sample-Paths Approach. Springer Science & Business Media.

Champagnat, N. and V. Hass (2025). Convergence of population processes with small and frequent mutations to the canonical equation of adaptive dynamics. The Annals of Applied Probability 35(1), 1–63.

Coron, C. and Y. Le Jan (2022). Pedigree in the biparental Moran model. Journal of Mathematical Biology 84(6), 51.

Etheridge, A. (2001). An Introduction to Superprocesses. American Mathematical Society.

Ethier, S. N. and T. G. Kurtz (1986). Markov Processes: Characterization and Convergence. John Wiley, New York.

Katzenberger, G. S.
(1991). Solutions of a stochastic differential equation forced onto a manifold by a large drift. Ann. Probab. 19, 1587–1628.

Kifer, Y. (2024). Strong diffusion approximation in averaging and value computation in Dynkin's games. The Annals of Applied Probability 34(1A), 103–147.

Kurtz, T. G. (1991). Random time changes and convergence in distribution under the Meyer-Zheng conditions. Ann. Probab. 19, 1010–1034.

Li, X.-M. and J. Sieber (2022). Slow-fast systems with fractional environment and dynamics. The Annals of Applied Probability 32(5), 3964–4003.

Omole, A. D. and P. Czuppon (2025). A population genetics model explaining overdispersion in active transposable elements. bioRxiv, 2025–11.

Otto, M. and T. Wiehe (2023). The structured coalescent in the context of gene copy number variation. Theoretical Population Biology 154, 67–78.

Otto, M., Y. Zheng, and T. Wiehe (2022). Recombination, selection, and the evolution of tandem gene arrays. Genetics 221(3), iyac052.

Pfaffelhuber, P. and A. Wakolbinger (2023). A diploid population model for copy number variation of genetic elements. Electronic Journal of Probability 28, 1–15.