The waiting time for m mutations

by Jason Schweinsberg∗
University of California, San Diego
October 25, 2018

Abstract

We consider a model of a population of fixed size $N$ in which each individual gets replaced at rate one and each individual experiences a mutation at rate $\mu$. We calculate the asymptotic distribution of the time that it takes before there is an individual in the population with $m$ mutations. Several different behaviors are possible, depending on how $\mu$ changes with $N$. These results have applications to the problem of determining the waiting time for regulatory sequences to appear and to models of cancer development.

1 Introduction

It is widely accepted that many types of cancer arise as a result of not one but several mutations. For example, Moolgavkar and Luebeck [26] write that “the concept of multistage carcinogenesis is one of the central dogmas of cancer research”, while Beerenwinkel et al. [5] write that “the current view of cancer is that tumorigenesis is due to the accumulation of mutations in oncogenes, tumor suppressor genes, and genetic instability genes.” The idea that several mutations are required for cancer goes back at least to 1951, when Muller [28] wrote, “There are, however, reasons for inferring that many or most cancerous growths would require a series of mutations in order for cells to depart sufficiently from the normal.” Three years later, Armitage and Doll [2] proposed a simple mathematical multi-stage model of cancer. Motivated by the goal of explaining the power law relationship between age and incidence of cancer that had been observed by Fisher and Holloman [12] and Nordling [29], they formulated a model in which a cell that has already experienced $k-1$ mutations experiences a $k$th mutation at rate $u_k$.
They showed that asymptotically as $t \to 0$, the probability that the $m$th mutation occurs in the time interval $[t, t+dt]$ is given by

$$r(t)\,dt = \frac{u_1 u_2 \cdots u_m\, t^{m-1}}{(m-1)!}\,dt. \tag{1}$$

They fit their model to data from 17 different types of cancer, and found that for many types of cancer the incidence rate $r(t)$ increases like the fifth or sixth power of age, suggesting that perhaps 6 or 7 mutations are involved in cancer progression. Because of concerns that having 6 or 7 stages may not be biologically plausible, Armitage and Doll [3] later proposed a two-stage model as an alternative. A more general two-stage model was proposed by Moolgavkar and Knudson [24], who demonstrated that two-stage models are flexible enough to fit a wide range of data if one allows for the possibilities that the number of healthy cells with no mutations may change over time, and that cells with one mutation may divide rapidly, causing the second mutation, and therefore the onset of cancer, to happen more quickly than it otherwise would.

∗ Supported in part by NSF Grant DMS-0504882.
AMS 2000 subject classifications. Primary 60J99; Secondary 60J85, 92D25, 92C50.
Key words and phrases. Waiting times, mutations, population genetics.

Since the seminal papers of Armitage and Doll, multi-stage models have been applied to a number of different types of cancer. Knudson [19, 15] discovered that retinoblastoma is a result of getting two mutations. Multi-stage models of colon cancer have been studied extensively. Moolgavkar and Luebeck [26] argued that a three-stage model fit the available data slightly better than a two-stage model. Later in [22], they found a good fit to a four-stage model. Calabrese et al.
[6] worked with data from 1022 cancers from 9 hospitals in Finland and estimated that between 4 and 9 mutations are required for cancer, with fewer mutations being required for hereditary cancers than for sporadic (nonhereditary) cancers. A recent study [32] of over 13,000 genes from breast and colon cancers suggests that as many as 14 mutations may be involved in colon cancer and as many as 20 may be involved in breast cancer. Multi-stage models have also been fit to data on lung cancer [13] and T-cell leukemia [31]. See [20] for a recent survey of applications of multi-stage cancer models.

In this paper, we formulate a simple mathematical model and calculate the asymptotic distribution of the time that it takes for cancer to develop. Our model is as follows. Consider a population of fixed size $N$. We think of the individuals in the population as representing $N$ cells, which could develop cancer. We assume that the population evolves according to the Moran model [27]. That is, each individual independently lives for an exponentially distributed amount of time with mean one, and then is replaced by a new individual whose parent is chosen at random from the $N$ individuals in the population (including the one being replaced). These births and deaths represent cell division and cell death. We also assume that each individual independently experiences mutations at times of a rate $\mu$ Poisson process, and each new individual born has the same number of mutations as its parent. We refer to an individual that has $j$ mutations as a type $j$ individual, and a mutation that takes an individual's number of mutations from $j-1$ to $j$ as a type $j$ mutation. Let $X_j(t)$ be the number of type $j$ individuals at time $t$.
For each positive integer $m$, let $\tau_m = \inf\{t : X_m(t) > 0\}$ be the first time at which there is an individual in the population with $m$ mutations. We view $\tau_m$ as representing the time that it takes for cancer to develop. Clearly $\tau_1$ has the exponential distribution with rate $N\mu$ because the $N$ individuals are each experiencing mutations at rate $\mu$. Our goal in this paper is to compute the asymptotic distribution of $\tau_m$ for $m \ge 2$.

When a new mutation occurs, eventually either all individuals having the mutation die, causing the mutation to disappear from the population, or the mutation spreads to all individuals in the population, an event which we call fixation. Because a mutation initially appears on only one individual and is assumed to offer no selective advantage or disadvantage, each mutation fixates with probability $1/N$. Once one mutation fixates, the problem reduces to waiting for $m-1$ additional mutations. However, it is possible for one individual to accumulate $m$ mutations before any mutation fixates in the population, an event which is sometimes called stochastic tunneling (see [17]). It is also possible for there to be $j$ fixations, and then for one individual to get $m-j$ mutations that do not fixate. Because there are different ways to get $m$ mutations, the limiting behavior is surprisingly complex, as the form of the limiting distribution of $\tau_m$ depends on how $\mu$ varies as a function of $N$.

There is another source of biological motivation for this model coming from the evolution of regulatory sequences. Regulatory sequences are short DNA sequences that control how genes are expressed. Getting a particular regulatory sequence would require several mutations, so to understand the role that regulatory sequences play in evolution, one needs to understand how long it takes before these mutations occur.
See Durrett and Schmidt [8, 9] for work in this direction.

In addition to this motivation from biology, there is mathematical motivation for studying this model as well. The model is simple and natural and, as will be seen from the results, gives rise to different asymptotic behavior depending on how $\mu$ scales as a function of $N$. In particular, the usual diffusion scaling from population genetics in which $N\mu$ tends to a constant is just one of several regimes.

This paper can be viewed as a sequel to [10], in which the authors considered a more general model in which an individual with $k-1$ mutations experiences a $k$th mutation at rate $u_k$. The model considered here is the special case in which $u_k = \mu$ for all $k$, so we are assuming that all mutation rates are the same. However, whereas in [10] results were obtained only for specific ranges of the mutation rates $u_k$, here we are able to obtain all possible limiting behaviors for the case in which the mutation rates are the same. We also emphasize that although our model accounts for cell division and cell death, we assume that the rates of cell division and cell death are the same, unlike many models in the biology literature which specify that individuals with between 1 and $m-1$ mutations have a selective advantage, allowing their numbers to increase rapidly (see, for example, [3, 24, 25, 26, 5]).

As we explain below, several special cases of our results have previously appeared in the biology literature, especially for the two-stage models when $m = 2$. However, here we are able to give complete asymptotic results for all $m$, as well as to provide rigorous proofs of the results. We state our main results in section 2. Proofs are given in sections 3, 4, and 5.
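The model described in this introduction is straightforward to simulate, which can help build intuition before reading the limit theorems. The following is a minimal sketch, not taken from the paper: the function name `simulate_tau` and the choice to track only the counts of each type are illustrative assumptions. It uses the fact that, with $N$ individuals each replaced at rate one and each mutating at rate $\mu$, events occur at total rate $N(1+\mu)$ and each event is a mutation with probability $\mu/(1+\mu)$.

```python
import random


def simulate_tau(N, mu, m, rng):
    """Simulate the Moran model with mutation and return one sample of tau_m.

    counts[j] is the number of individuals with j mutations, for j < m.
    (Hypothetical helper for illustration; not code from the paper.)
    """
    counts = [0] * m
    counts[0] = N
    types = range(m)
    total_rate = N * (1.0 + mu)  # N replacement clocks plus N mutation clocks
    t = 0.0
    while True:
        t += rng.expovariate(total_rate)
        if rng.random() < mu / (1.0 + mu):
            # Mutation: a uniformly chosen individual gains one mutation.
            j = rng.choices(types, weights=counts)[0]
            if j == m - 1:
                return t  # first individual with m mutations appears
            counts[j] -= 1
            counts[j + 1] += 1
        else:
            # Replacement: a uniformly chosen individual dies and is replaced
            # by a copy of a uniformly chosen parent (possibly itself).
            dead = rng.choices(types, weights=counts)[0]
            parent = rng.choices(types, weights=counts)[0]
            counts[dead] -= 1
            counts[parent] += 1


if __name__ == "__main__":
    rng = random.Random(2018)
    samples = [simulate_tau(N=50, mu=0.02, m=2, rng=rng) for _ in range(200)]
    print("empirical mean of tau_2:", sum(samples) / len(samples))
```

For $m = 1$ the simulated times should be exponential with rate $N\mu$, which gives a quick sanity check. Direct simulation becomes slow when $\mu$ is very small relative to $1/N$, which is exactly where the asymptotic results are most useful.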
2 Main results

In this section, we state our results on the limiting behavior of the waiting time for an individual to acquire $m$ mutations, and we explain the heuristics behind the results. Many of the heuristics are based on approximation by branching processes. In the Moran model, if $k$ individuals have a mutation, then the number of individuals with the mutation is decreasing by one at rate $k(N-k)/N$ (because the $k$ individuals with the mutation are dying at rate $k$, and the probability that the replacement individual does not have a mutation is $(N-k)/N$) and is increasing by one at rate $k(N-k)/N$ (because the $N-k$ individuals without a mutation are dying at rate one, and the replacement individual has a mutation with probability $k/N$). Therefore, when $k$ is much smaller than $N$, the number of individuals with a given mutation behaves approximately like a continuous-time branching process in which each individual gives birth and dies at rate one.

To keep track of further mutations, it is natural to consider a continuous-time multitype branching process in which initially there is a single type 1 individual, each individual gives birth and dies at rate 1, and a type $j$ individual mutates to type $j+1$ at rate $\mu$. If $p_j$ denotes the probability that there is eventually a type $j$ individual in the population, then

$$p_j = \frac{1}{2+\mu}\,(2p_j - p_j^2) + \frac{\mu}{2+\mu}\,p_{j-1}. \tag{2}$$

To see this result, condition on the first event. With probability $1/(2+\mu)$, the first event is a death, and there is no chance of getting a type $j$ individual. With probability $1/(2+\mu)$, the first event is a birth, in which case each individual has a type $j$ descendant with probability $p_j$ and therefore the probability that at least one has a type $j$ descendant is $2p_j - p_j^2$.
With probability $\mu/(2+\mu)$, the first event is a mutation to type 2, in which case the probability of a type $j$ descendant is $p_{j-1}$ because $j-2$ further mutations are needed. Equation (2) can be rewritten as $p_j^2 + \mu p_j - \mu p_{j-1} = 0$, and the positive solution is

$$p_j = \frac{-\mu + \sqrt{\mu^2 + 4\mu p_{j-1}}}{2}.$$

When $\mu$ is small, the second term under the square root dominates the numerator, and we get $p_j \approx \sqrt{\mu p_{j-1}}$. Since $p_1 = 1$, the approximation $p_j \approx \mu^{1-2^{-(j-1)}}$ follows by induction. Because the Moran model can be approximated by a branching process when the number of mutant individuals is much smaller than $N$, this result suggests that under appropriate conditions, the probability that a type 1 individual in the population has a type $m$ descendant should be approximately $\mu^{1-2^{-(m-1)}}$. Proposition 1 below, which is a special case of Proposition 4.1 in [10], establishes that this approximation is indeed valid.

Here and throughout the paper, the mutation rate $\mu$ depends on $N$ even though we do not record this dependence in the notation. Also, if $f$ and $g$ are two functions of $N$, we write $f(N) \sim g(N)$ if $f(N)/g(N) \to 1$ as $N \to \infty$. We also write $f(N) \ll g(N)$ if $f(N)/g(N) \to 0$ as $N \to \infty$ and $f(N) \gg g(N)$ if $f(N)/g(N) \to \infty$ as $N \to \infty$.

Proposition 1. Consider a model which is identical to the model described in the introduction, except that initially there is one individual of type 1 and $N-1$ individuals of type 0, and no further type 1 mutations are possible. Let $q_m$ be the probability that a type $m$ individual eventually is born. Suppose that $N\mu^{1-2^{-(m-1)}} \to \infty$ as $N \to \infty$, and that there is a constant $a > 0$ such that $N^a\mu \to 0$. Then $q_m \sim \mu^{1-2^{-(m-1)}}$.

Note that $q_m$ is the probability that a given type 1 individual eventually has a type $m$ descendant.
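The branching-process recursion for $p_j$ and its power-law approximation can be compared numerically. The sketch below is only an illustration of the heuristic, under the assumption that $p_1 = 1$ and that each $p_j$ is the positive root of $p_j^2 + \mu p_j - \mu p_{j-1} = 0$; the function names are hypothetical.

```python
import math


def hitting_probabilities(mu, m):
    """Return [p_1, ..., p_m] from the recursion
    p_j = (-mu + sqrt(mu^2 + 4 mu p_{j-1})) / 2 with p_1 = 1."""
    p = [1.0]
    for _ in range(2, m + 1):
        prev = p[-1]
        p.append((-mu + math.sqrt(mu * mu + 4.0 * mu * prev)) / 2.0)
    return p


def power_law_approximation(mu, j):
    """Heuristic approximation mu^(1 - 2^-(j-1)) for p_j."""
    return mu ** (1.0 - 2.0 ** (-(j - 1)))


if __name__ == "__main__":
    mu = 1e-8
    for j, pj in enumerate(hitting_probabilities(mu, 4), start=1):
        print(j, pj, power_law_approximation(mu, j))
```

The agreement degrades slowly as $j$ grows for a fixed $\mu$, because each step of the induction drops a lower-order term; the approximation is an asymptotic statement as $\mu \to 0$.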
Because a number of our arguments involve considering each type 1 mutation and its descendants separately from other type 1 mutations, this result will be used repeatedly.

To understand the order of magnitude of $q_m$ another way, recall that the probability that the total progeny of a critical branching process exceeds $M$ is of order $M^{-1/2}$ (see, for example, [14]), so if there are $L$ independent branching processes, the most successful will have a total progeny of order $L^2$. Furthermore, the sum of the total progenies of the $L$ processes will also be of order $L^2$. Therefore, if there are $L$ type 1 mutations, the number of descendants they produce will be of order $L^2$. Each type 1 descendant will experience a type 2 mutation before dying with probability approximately $\mu$, so this should lead to on the order of $L^2\mu$ type 2 mutations. It follows that the number of type 2 descendants should be on the order of $L^4\mu^2$, and this will lead to on the order of $L^4\mu^3$ type 3 mutations. Repeating this reasoning, we see that the number of type $m$ mutations should be of order $L^{2^{m-1}}\mu^{2^{m-1}-1}$. By setting this expression equal to one and solving for $L$, we see that it should take on the order of $\mu^{-(1-2^{-(m-1)})}$ type 1 mutations before one of these mutations gets a type $m$ descendant. That is, the probability that a type 1 individual has a type $m$ descendant is of order $\mu^{1-2^{-(m-1)}}$.

2.1 Gamma limits when $N\mu \to 0$

Because mutations occur at times of a Poisson process of rate $N\mu$, there will be approximately $N\mu T$ mutations by time $T$. We have seen that after a mutation occurs, the number of individuals with the mutation behaves approximately like a critical branching process. By a famous result of Kolmogorov [21], the probability that a critical branching process survives for time $t$ is of order $1/t$.
This means that if we have $N\mu T$ independent critical branching processes, the most successful will survive for a time which is of order $N\mu T$. Therefore, all mutations that appear before time $T$ should either die out or fixate after being in the population for a time of order $N\mu T$. If $N\mu \ll 1$, then this time is much smaller than the time $T$ that we have to wait for the mutation. Therefore, when $N\mu \ll 1$, we can consider each mutation separately and determine whether either it fixates or gives birth to a type $m$ descendant without fixating. We can ignore the time that elapses between when the original mutation appears, and when either it fixates or the descendant with $m$ mutations is born. The importance of the condition $N\mu \ll 1$ was previously noted, for example, in [17] and [18].

We have already seen that a mutation fixates with probability $1/N$ and gives birth to a type $j$ descendant with probability approximately $\mu^{1-2^{-(j-1)}}$. Therefore, fixation of some mutation will happen first if $N\mu^{1-2^{-(j-1)}} \to 0$ as $N \to \infty$ or, equivalently, if $\mu \ll N^{-2^{j-1}/(2^{j-1}-1)}$. This leads to the following result when $N\mu \ll 1$. Note that when $m = 2$, the result in part 1 of the theorem matches (12.12) of [30], while the result in part 3 matches (12.14) of [30]; see also section 3 of [18], section 4 of [16], and Theorem 1 of [9].

Theorem 2. Let $Z_1, Z_2, \dots$ be independent random variables having the exponential distribution with rate 1, and let $S_k = Z_1 + \cdots + Z_k$, which has a gamma distribution with parameters $(k, 1)$.

1. If $\mu \ll N^{-2}$, then $\mu\tau_m \to_d S_{m-1}$.
2. If $N^{-2^{j-1}/(2^{j-1}-1)} \ll \mu \ll N^{-2^{j}/(2^{j}-1)}$ for some $j = 2, \dots, m-1$, then $\mu\tau_m \to_d S_{m-j}$.
3. If $N^{-2^{m-1}/(2^{m-1}-1)} \ll \mu \ll N^{-1}$, then $N\mu^{2-2^{-(m-1)}}\tau_m \to_d Z_1$.
To understand this result, note that in part 1 of the theorem, when $\mu \ll N^{-2}$, fixation occurs before any individual gets two mutations without a fixation. Therefore, to get $m$ mutations, we have to wait for $m-1$ different mutations to fixate, and this is the sum of $m-1$ independent exponential waiting times. The exponential random variables have rate parameter $\mu$, because there are mutations at rate $N\mu$ and each fixates with probability $1/N$, so mutations that fixate occur at rate $\mu$. Once $m-1$ fixations have occurred, the $m$th mutation occurs quickly, at rate $N\mu$ rather than at rate $\mu$, so only the waiting times for the $m-1$ fixations contribute to the limiting distribution.

For part 2 of the theorem, when $N^{-2^{j-1}/(2^{j-1}-1)} \ll \mu \ll N^{-2^{j}/(2^{j}-1)}$ for some $j = 2, \dots, m-1$, fixation occurs before an individual can accumulate $j+1$ mutations, but an individual can accumulate $j$ mutations before fixation. Therefore, we wait for $m-j$ fixations, and then the remaining $j$ mutations happen without fixation. Because the $j$ mutations without fixation happen on a faster time scale, the limit is a sum of $m-j$ exponential random variables.

In part 3, we get $m$ mutations before the first fixation, and there is an exponential waiting time until the first mutation that is successful enough to produce an offspring with $m$ mutations. Mutations happen at rate $N\mu$, and mutations are successful with probability approximately $\mu^{1-2^{-(m-1)}}$, which explains the time-scaling factor of $N\mu^{2-2^{-(m-1)}}$.

Part 3 of Theorem 2 is the special case of Theorem 2 of [10] in which $u_j = \mu$ for all $j$. Condition (i) of that theorem becomes the condition $\mu \ll N^{-1}$, while condition (iv) becomes the condition $N^{-2^{m-1}/(2^{m-1}-1)} \ll \mu$. Parts 1 and 2 of Theorem 2 above are proved in section 3.
2.2 The borderline cases

Theorem 2 does not cover the cases when $\mu$ is of the order $N^{-2^{j-1}/(2^{j-1}-1)}$ for some $j$. On this time scale, for the reasons discussed in the previous section, we can still neglect the time between when a mutation first appears in the population and when it either fixates or dies out because this time will be much shorter than the time we had to wait for the mutation to occur. However, fixations happen on the same time scale as events in which an individual gets $j$ mutations without fixation. Therefore, to get to $m$ mutations, we start with $m-j$ fixations. Then we can either have another fixation (followed by $j-1$ additional mutations, which happen on a faster time scale) or we can get $j$ mutations without any fixation. The waiting time is the sum of $m-j$ independent exponential random variables with rate $\mu$ and another exponential random variable having the faster rate $\lambda_j\mu$. The last exponential random variable comes from waiting for a mutation that either fixates or has a descendant with $j-1$ additional mutations but does not fixate. This leads to the following result.

Theorem 3. Suppose $\mu \sim AN^{-2^{j-1}/(2^{j-1}-1)}$ for some $j = 2, \dots, m$ and some constant $A > 0$. Let $Z_1, Z_2, \dots$ be independent random variables having the exponential distribution with rate 1, and let $S_k = Z_1 + \cdots + Z_k$. Let $Y$ be independent of $Z_1, Z_2, \dots$, and assume that $Y$ has the exponential distribution with rate $\lambda_j$, where

$$\lambda_j = \sum_{k=1}^{\infty} \frac{A^{2k(1-2^{-(j-1)})}}{(k-1)!\,(k-1)!} \Bigg/ \sum_{k=1}^{\infty} \frac{A^{2k(1-2^{-(j-1)})}}{k!\,(k-1)!}. \tag{3}$$

Then $\mu\tau_m \to_d S_{m-j} + Y$.

This result when $j = m$ is the special case of Theorem 3 of [10] in which $u_j = \mu$ for all $j$. As will be seen in section 3, the result for $j \le m-1$ follows easily from the result when $j = m$.
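The series ratio (3) converges quickly and is easy to evaluate numerically. The following sketch truncates both sums after a fixed number of terms; the function name `lambda_j` and the default truncation length are illustrative assumptions, and for very large $A$ a more careful (for example, log-scale) summation would be needed to avoid overflow.

```python
def lambda_j(A, j, terms=200):
    """Evaluate the ratio of series in (3), truncated after `terms` terms.

    With x = A^(2(1 - 2^-(j-1))), the numerator sums x^k / ((k-1)!)^2 and the
    denominator sums x^k / (k! (k-1)!) over k >= 1.
    """
    x = A ** (2.0 * (1.0 - 2.0 ** (-(j - 1))))
    numerator = 0.0
    denominator = 0.0
    term = x  # x^k / ((k-1)!)^2 at k = 1
    for k in range(1, terms + 1):
        numerator += term
        denominator += term / k   # dividing by k turns (k-1)! into k!
        term *= x / (k * k)       # advance to x^(k+1) / (k!)^2
    return numerator / denominator


if __name__ == "__main__":
    for A in (0.01, 1.0, 5.0):
        print(A, lambda_j(A, j=2))
```

As $A \to 0$ both sums are dominated by their first term, so $\lambda_j \to 1$, which is consistent with part 2 of Theorem 2: on the slow side of the boundary, the extra waiting time is just the wait for one more fixation. As a side observation not made in the source, the two sums are modified Bessel series, so for $j = 2$ and $A = 1$ the ratio equals $I_0(2)/I_1(2) \approx 1.433$.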
To explain where the formula for $\lambda_j$ comes from, we review here the outline of the proof of Theorem 3 in [10]. Assume that we already have $m-j$ fixations, and now we need to wait either for another fixation or for a mutation that will have a descendant with $j-1$ additional mutations. We cannot approximate the probability of the latter event by $\mu^{1-2^{-(j-1)}}$ in this case because to get $j-1$ further mutations, the number of individuals with the original mutation will need to be of order $N$, so the branching process approximation does not hold. Instead, we consider a model in which there is one individual with a mutation at time zero, and $X(t)$ denotes the number of individuals with the mutation at time $t$. At time $t$, the individuals with the mutation each experience further mutations at rate $\mu$, and these further mutations each have probability approximately $\mu^{1-2^{-(j-2)}}$ of having an offspring with $j$ total mutations. Therefore, at time $t$, successful mutations are happening at rate $\gamma X(t)$, where

$$\gamma \approx \mu \cdot \mu^{1-2^{-(j-2)}} = \mu^{2(1-2^{-(j-1)})}.$$

At time $t$, the jump rate of the process is $2X(t)(N - X(t))/N$. Therefore, by making a time-change, we can work instead with a continuous-time simple random walk $(Y(t), t \ge 0)$ which jumps at rate one, and the mutation rate at time $t$ becomes

$$\gamma Y(t) \cdot \frac{N}{2Y(t)(N - Y(t))} = \frac{\gamma}{2(1 - Y(t)/N)}.$$

Therefore, the probability that there is no fixation and no further successful mutation is approximately

$$E\left[\exp\left(-\int_0^T \frac{\gamma}{2(1 - Y(t)/N)}\,dt\right) 1_{\{Y(T) = 0\}}\right],$$

where $T = \inf\{t : Y(t) \in \{0, N\}\}$.
Simple random walk converges to Brownian motion, so if instead of starting with just one mutant individual we assume that $Y(0) = \lfloor Nx \rfloor$, where $0 < x < 1$, then the above expression is approximately

$$u(x) = E\left[\exp\left(-\frac{A^{2(1-2^{-(j-1)})}}{2}\int_0^U \frac{1}{1 - B(s)}\,ds\right) 1_{\{B(U) = 0\}}\right], \tag{4}$$

where $U = \inf\{t : B(t) \in \{0, 1\}\}$ and $(B(t), t \ge 0)$ is Brownian motion started at $x$. Here we are also using that $N^2\gamma \sim A^{2(1-2^{-(j-1)})}$, where the factor of $N^2$ comes from the time change in replacing random walk with Brownian motion. Since the probability that we get either fixation or a successful mutation is $1 - u(x)$, and we need to take a limit as the number of mutants at time zero gets small, we have

$$\lambda_j = \lim_{x \to 0} \frac{1 - u(x)}{x}.$$

Thus, the problem reduces to evaluating the Brownian functional (4). One can obtain a differential equation for $u(x)$ using the Feynman-Kac formula, and then get a series solution to the differential equation, from which the formula (3) follows. Details of this argument occupy section 6 of [10].

2.3 Rapid mutations

It remains to handle the case when $N\mu \not\to 0$. With this scaling, fixation will not occur before time $\tau_m$. However, the waiting time between the type 1 mutation that will eventually produce a type $m$ descendant and the actual appearance of the type $m$ descendant can no longer be ignored. As a result, waiting times are no longer sums of exponential random variables. Instead, we obtain the following result. The $m = 2$ case of part 3 is equivalent to the special case of Theorem 1 in [10] when $u_1 = u_2 = \mu$.

Theorem 4. We have the following limiting results when $N\mu \not\to 0$.

1. If $\mu \gg N^{-2/m}$, then
$$\lim_{N \to \infty} P(\tau_m > N^{-1/m}\mu^{-1}t) = \exp\left(-\frac{t^m}{m!}\right).$$

2. If $N^{-1/(1+(m-j-2)2^{-(j+1)})} \ll \mu \ll N^{-1/(1+(m-j-1)2^{-j})}$ for some $j = 1, \dots, m-2$, then
$$\lim_{N \to \infty} P(\tau_m > N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}t) = \exp\left(-\frac{t^{m-j}}{(m-j)!}\right).$$

3. If $\mu \sim AN^{-1/(1+(m-j-1)2^{-j})}$ for some $j = 1, \dots, m-1$ and some constant $A > 0$, then
$$\lim_{N \to \infty} P(\tau_m > \mu^{-(1-2^{-j})}t) = \exp\left(-\frac{A^{1+(m-j-1)2^{-j}}}{(m-j-1)!}\int_0^t (t-s)^{m-j-1}\,\frac{1 - e^{-2s}}{1 + e^{-2s}}\,ds\right).$$

We now explain the intuition behind these results. Recall that $X_j(t)$ is the number of individuals with $j$ mutations at time $t$. Because there are $N$ individuals getting mutations at rate $\mu$, we have $E[X_1(t)] \approx N\mu t$ for small $t$. Each of these individuals acquires a second mutation at rate $\mu$, so

$$E[X_2(t)] \approx \mu\int_0^t N\mu s\,ds = \frac{N\mu^2 t^2}{2}.$$

Repeating this reasoning, we get $E[X_j(t)] \approx N\mu^j t^j/j!$. When the mutation rate is sufficiently large, there is a Law of Large Numbers, and the fluctuations in the number of individuals with $j$ mutations are small relative to $E[X_j(t)]$. In this case, $X_j(t)$ is well approximated by its expectation. When the mutation rate is sufficiently small, most of the time there are no individuals with $j$ mutations in the population, and when an individual gets a $j$th mutation, this mutation either dies out or, with probability $q_{m-j+1}$, produces a type $m$ descendant on a time scale much faster than $\tau_m$. In this case, the problem reduces to determining how long we have to wait for a $j$th mutation that is successful enough to produce a type $m$ descendant. There is also a borderline case in which we get stochastic effects in the limit both from the number of type $j$ individuals in the population and from the time between the appearance of a type $j$ individual that will eventually have a type $m$ descendant and the birth of the type $m$ descendant.
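The limit in part 3 of Theorem 4 has no elementary closed form in general, but it is simple to evaluate by quadrature. The sketch below uses composite Simpson's rule and the identity $(1 - e^{-2s})/(1 + e^{-2s}) = \tanh s$; the function name `survival_limit` and the step count are illustrative assumptions rather than anything specified in the paper.

```python
import math


def survival_limit(t, m, j, A, steps=2000):
    """Right-hand side of part 3 of Theorem 4 at time t.

    Integrates (t - s)^(m-j-1) * tanh(s) over [0, t] with composite
    Simpson's rule; `steps` must be even.
    """
    r = m - j - 1
    h = t / steps

    def integrand(s):
        return (t - s) ** r * math.tanh(s)

    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    prefactor = A ** (1.0 + r * 2.0 ** (-j)) / math.factorial(r)
    return math.exp(-prefactor * integral)


if __name__ == "__main__":
    # m = 3, j = 2 reduces to exp(-A * log cosh t) = cosh(t)^(-A).
    print(survival_limit(1.5, m=3, j=2, A=2.0), math.cosh(1.5) ** -2.0)
```

When $j = m-1$ the integral is $\int_0^t \tanh s\,ds = \log\cosh t$, so the limit is $(\cosh t)^{-A}$, which gives a convenient check on the numerical routine.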
If the mutation rate is fast enough so that $X_{m-1}(t) \approx E[X_{m-1}(t)]$ up to time $\tau_m$, then since each individual with $m-1$ mutations gets an $m$th mutation at rate $\mu$, we get

$$P(\tau_m > t) \approx \exp\left(-\mu\int_0^t \frac{N\mu^{m-1}s^{m-1}}{(m-1)!}\,ds\right) = \exp\left(-\frac{N\mu^m t^m}{m!}\right). \tag{5}$$

This leads to the result in part 1 of Theorem 4 if we substitute $N^{-1/m}\mu^{-1}t$ in place of $t$ in (5). In this regime, mutations are happening fast enough that births and deaths do not affect the limiting result, and we get the same result that we would get if $\tau_m$ were simply the first time that one of $N$ independent rate $\mu$ Poisson processes reaches the value $m$. Consequently, as can be seen by integrating (1), this result agrees with the result of Armitage and Doll [2], who did not consider cell division and cell death in their original model. The result when $m = 2$ agrees with a result in section 4 of [16], and with (12.18) of [30].

Next, suppose mutation rates are fast enough so that $X_{m-j-1}(t) \approx E[X_{m-j-1}(t)]$ up to time $\tau_m$, but slow enough that the time between the appearance of a “successful” type $m-j$ individual that will have a type $m$ descendant and the birth of the type $m$ descendant is small relative to $\tau_m$. Then each type $m-j-1$ individual experiences “successful” mutations at rate $\mu q_{j+1} \approx \mu^{2-2^{-j}}$ by Proposition 1, so

$$P(\tau_m > t) \approx \exp\left(-\mu^{2-2^{-j}}\int_0^t \frac{N\mu^{m-j-1}s^{m-j-1}}{(m-j-1)!}\,ds\right) = \exp\left(-\frac{N\mu^{m-j+1-2^{-j}}\,t^{m-j}}{(m-j)!}\right).$$

This leads to the result in part 2 of Theorem 4. The borderline cases are handled by part 3 of Theorem 4.

To understand where the boundaries between the different types of behavior occur, first recall that the number of type $k$ individuals born by time $t$ is of the order $N\mu^k t^k$.
Because each individual gives birth and dies at approximately rate one, the number of births and deaths of type $k$ individuals by time $t$ is of order $N\mu^k t^{k+1}$. Because the standard deviation of the position of a random walk after $M$ steps is of order $M^{1/2}$, the standard deviation of the number of type $k$ individuals by time $t$ is of order $N^{1/2}\mu^{k/2}t^{(k+1)/2}$. Therefore, we have $X_k(t) \approx E[X_k(t)]$ whenever $N^{1/2}\mu^{k/2}t^{(k+1)/2} \ll N\mu^k t^k$ or, equivalently, whenever $1 \ll N\mu^k t^{k-1}$. See Proposition 11 below for a precise statement of this result.

Each type $k$ individual experiences a mutation that will have a type $m$ descendant at rate $\mu q_{m-k} \approx \mu^{2-2^{-(m-k-1)}}$. Therefore, the expected number of such mutations by time $t$ is of the order $N\mu^k t^k \cdot \mu^{2-2^{-(m-k-1)}} \cdot t = N\mu^{k+2-2^{-(m-k-1)}}t^{k+1}$. This expression is of order one when $t$ is of order $N^{-1/(k+1)}\mu^{-1-(1-2^{-(m-k-1)})/(k+1)}$, which is consequently the order of magnitude of the time we have to wait for one such mutation to occur. It now follows from the result of the previous paragraph that $X_k(t) \approx E[X_k(t)]$ up to time $\tau_m$ whenever

$$1 \ll N\mu^k\left(N^{-1/(k+1)}\mu^{-1-(1-2^{-(m-k-1)})/(k+1)}\right)^{k-1}. \tag{6}$$

The expression on the right-hand side of (6) can be simplified to $\left(N^2\mu^{2+(k-1)2^{-(m-k-1)}}\right)^{1/(k+1)}$, so (6) is equivalent to the condition

$$\mu \gg N^{-1/(1+(k-1)2^{-(m-k)})}. \tag{7}$$

This condition can be compared to the condition for part 2 of Theorem 4, which entails that (7) holds for $k = m-j-1$ but not for $k = m-j$, and therefore the number of type $m-j-1$ individuals, but not the number of type $m-j$ individuals, is approximately deterministic through time $\tau_m$.

If instead $\mu$ is of the order $N^{-1/(1+(m-j-1)2^{-j})}$ for some $j = 1, \dots, m-1$, then on the relevant time scale the number of individuals of type $m-j-1$ behaves deterministically, but the number of individuals of type $m-j$ has fluctuations of the same order as the expected value. As a result, there are stochastic effects from the number of type $m-j$ individuals in the population. In this case, there are also stochastic effects from the time between the birth of a type $m-j$ individual that will have a type $m$ descendant and the time that the type $m$ descendant is born. Calculating the form of the limiting distribution in these borderline cases involves working with a two-type branching process. This branching process is very similar to a process analyzed in chapter 3 of [33], which explains the resemblance between part 3 of Theorem 4 and (3.20) of [33]. Similar analysis using generating functions of branching processes that arise in multi-stage models of cancer has been carried out in [23, 25, 26]. The work in [25] allows for time-dependent parameters, while a three-stage model is analyzed in [26].

2.4 The case $m = 3$

To help the reader understand the different limiting behaviors, we summarize here the results when $m = 3$. There are 9 different limiting regimes in this case; in general for the waiting time to get $m$ mutations, there are $4m-3$ limiting regimes. Below $Z_1$ and $Z_2$ have the exponential distribution with mean one, and $Y_1$ and $Y_2$ have the exponential distributions with rates $\lambda_2$ and $\lambda_3$ respectively, as in Theorem 3, where $\lambda_2$ and $\lambda_3$ are given by (3). The random variables $Z_1$, $Z_2$, $Y_1$, and $Y_2$ are assumed to be independent.

• If $\mu \ll N^{-2}$, then by part 1 of Theorem 2, $\mu\tau_3 \to_d Z_1 + Z_2$. We wait for two fixations, and then the third mutation happens quickly.
• If $\mu \sim AN^{-2}$, then by the $j = 2$ case of Theorem 3, $\mu\tau_3 \to_d Z_1 + Y_1$.
W e wait for one ﬁxation, then either a second ﬁxation (after which the third mutation would happ en quic kly) or a second mutation that will not ﬁ xate but w ill ha ve a descendant that gets a third m utation. • If N − 2 ≪ µ ≪ N − 4 / 3 , then b y the j = 2 case of part 2 of Th eorem 2, µτ 3 → d Z 1 . W e wa it for one ﬁ xation, and then the other t wo mutations happ en quickly . • If µ ∼ AN − 4 / 3 , then by the j = 3 case of Th eorem 3, µτ 3 → d Y 2 . W e wa it either f or a ﬁxation (after whic h the other t w o mutatio ns would h app en quic kly) or a m utation that will not ﬁxate but w ill h a v e a descendant with t wo add itional muta tions. • If N − 4 / 3 ≪ µ ≪ N − 1 , then by part 3 of Theorem 2, N µ 7 / 4 τ 3 → d Z 1 . Fixation do es not happ en b efore time τ 3 , but we wait an exp onen tially distribu ted time for a m utation that is successful enough to ha v e a descendant with three mutatio ns. • If µ ∼ AN − 1 , then by the j = 2 case of p art 3 of T heorem 4, P ( µ 3 / 4 τ 3 > t ) → exp  − A Z t 0 1 − e − 2 s 1 + e − 2 s ds  . • If N − 1 ≪ µ ≪ N − 2 / 3 , then by the j = 1 case of p art 2 of Theorem 4, P ( N 1 / 2 µ 5 / 4 τ 3 > t ) → exp( − t 2 / 2). Th e num b er of individu als with one m utation is appr o ximately deterministic, and the sto c hastic eﬀect comes from wa iting for a second mutat ion that is successful en ou gh to h a v e a d escend an t with a third mutation. • If µ ∼ AN − 2 / 3 , then b y the j = 1 case of part 3 of Theorem 4, P ( µ 1 / 2 τ 3 > t ) → exp  − A 3 / 2 Z t 0 ( t − s ) 1 − e − 2 s 1 + e − 2 s ds  . • If µ ≫ N − 2 / 3 , then by p art 1 of Theorem 4, P ( N 1 / 3 µτ 3 > t ) → exp( − t 3 / 6). The num b er of ind ividuals with t wo m utations is approximat ely deterministic, and the sto chastic eﬀect comes f r om wai ting for the th ird mutat ion. 
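The count of $4m-3$ regimes can be reproduced by listing the boundary exponents. The following is a minimal sketch, not part of the paper: it writes $\mu \sim N^{-a}$ and collects the critical values of $a$ from the two families of borderline conditions quoted in the text, namely $a = 2^{j-1}/(2^{j-1}-1)$ for $j = 2, \dots, m$ (Theorem 3) and $a = 1/(1+(m-j-1)2^{-j})$ for $j = 1, \dots, m-1$ (part 3 of Theorem 4). The function names `critical_exponents` and `count_regimes` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def critical_exponents(m):
    """Return the exponents a for which mu ~ N**(-a) is a borderline case.

    Two families of boundaries are collected, as quoted in the text:
      * a = 2^{j-1} / (2^{j-1} - 1)      for j = 2, ..., m    (Theorem 3)
      * a = 1 / (1 + (m-j-1) 2^{-j})     for j = 1, ..., m-1  (Theorem 4, part 3)
    The list is sorted from the smallest mutation rate (largest a) down.
    """
    if m < 2:
        raise ValueError("this sketch assumes m >= 2")
    theorem3 = [Fraction(2 ** (j - 1), 2 ** (j - 1) - 1) for j in range(2, m + 1)]
    theorem4 = [1 / (1 + Fraction(m - j - 1, 2 ** j)) for j in range(1, m)]
    return sorted(theorem3 + theorem4, reverse=True)


def count_regimes(m):
    """Count open intervals between boundaries plus the boundary cases."""
    boundaries = critical_exponents(m)
    return (len(boundaries) + 1) + len(boundaries)


if __name__ == "__main__":
    # For m = 3 the boundaries are a = 2, 4/3, 1, 2/3, giving 9 regimes.
    print([str(a) for a in critical_exponents(3)])
    print(count_regimes(3))
```

Exact rational arithmetic is used so that the boundaries can be compared without rounding; for $m = 3$ the sketch recovers the exponents $2$, $4/3$, $1$, and $2/3$ that separate the nine cases listed above.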
2.5 Power law asymptotics and implications for cancer modeling

Because the probability that an individual develops a particular type of cancer during his or her lifetime is small, it seems unlikely that it will be possible to observe the full limiting distribution of the waiting time for cancer from data on cancer incidence. Instead, we will observe only the left tail of this distribution. Consequently, what is likely to be most relevant for applications are asymptotic formulas as $t \to 0$. Throughout this subsection, write $f(t) \approx g(t)$ to mean that $f(t)/g(t) \to 1$ as $t \to 0$. Recall that if $S_j$ is the sum of $j$ independent exponential random variables with mean one, then $P(S_j \le t) \approx t^j/j!$. This fact, combined with the approximation $1 - \exp(-t^{m-j}/(m-j)!) \approx t^{m-j}/(m-j)!$, allows us to deduce the following corollary of Theorems 2 and 4.

Corollary 5. We have the following asymptotic formulas as $t \to 0$:

1. If $\mu \ll N^{-2}$, then
\[ \lim_{N\to\infty} P(\tau_m \le \mu^{-1}t) \approx \frac{t^{m-1}}{(m-1)!}. \]

2. If $N^{-2^{j-1}/(2^{j-1}-1)} \ll \mu \ll N^{-2^j/(2^j-1)}$ for some $j = 2, \dots, m-1$, then
\[ \lim_{N\to\infty} P(\tau_m \le \mu^{-1}t) \approx \frac{t^{m-j}}{(m-j)!}. \]

3. If $N^{-2^{m-1}/(2^{m-1}-1)} \ll \mu \ll N^{-1}$, then
\[ \lim_{N\to\infty} P(\tau_m \le N^{-1}\mu^{-2+2^{-(m-1)}}t) \approx t. \]

4. If $N^{-1/(1+(m-j-2)2^{-(j+1)})} \ll \mu \ll N^{-1/(1+(m-j-1)2^{-j})}$ for some $j = 1, \dots, m-2$, then
\[ \lim_{N\to\infty} P(\tau_m \le N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}t) \approx \frac{t^{m-j}}{(m-j)!}. \]

5. If $\mu \gg N^{-2/m}$, then
\[ \lim_{N\to\infty} P(\tau_m \le N^{-1/m}\mu^{-1}t) \approx \frac{t^m}{m!}. \]

By integrating (1), we see that the result in part 5 of the corollary, which says that the probability of getting cancer by time $t$ behaves like $Ct^m$, agrees with the result of Armitage and Doll. However, parts 1 through 4 of the corollary show that in an $m$-stage model of cancer, the probability of getting cancer by time $t$ could behave like $Ct^j$ for any $j = 1, 2, \dots, m$, depending on the relationship between $\mu$ and $N$. This range of behavior can occur because not all of the $m$ events required for cancer are necessarily "rate limiting". For example, when part 2 of the corollary applies, there are $m-j$ fixations, and then the remaining $j$ mutations happen on a much faster time scale. Consequently, it is not possible to deduce the number of mutations required for cancer just from the power law relationship between age and cancer incidence.

Corollary 5 also shows that in our $m$-stage model, the probability of getting cancer by time $t$ will never behave like $Ct^j$ for $j > m$. However, as noted by Armitage and Doll (see [1, 2]), higher powers could arise if the mutation rate, instead of being constant, increases over time like a power of $t$. Also, the probability of getting cancer by time $t$ could increase more rapidly than $t^m$ if cells with mutations have a selective advantage over other cells, allowing their number to increase more rapidly than our model predicts. This explains, in part, the success of two-stage models in fitting a wide variety of cancer incidence data, as documented in [24].

3 Proof of Theorems 2 and 3

Recall that part 3 of Theorem 2 is a special case of Theorem 2 of [10], so we need to prove only parts 1 and 2. We begin by recording three lemmas. Lemma 6, which just restates (3.6), (3.8), and Lemma 3.1 of [10], bounds the amount of time that a mutation is in the population before it dies out or fixates. Lemma 7 complements Proposition 1. Lemma 8 is a direct consequence of part 3 of Theorem 2. In these lemmas and throughout the rest of the paper, $C$ denotes a positive constant not depending on $N$ whose value may change from line to line.

Lemma 6. Consider a model of a population of size $N$ in which all individuals are either type 0 or type 1.
The population starts with just one type 1 individual and evolves according to the Moran model, so each individual dies at rate one and then gets replaced by a randomly chosen individual from the population. Let $X(t)$ be the number of type 1 individuals at time $t$. Let $T = \inf\{t : X(t) \in \{0, N\}\}$. Let $L_k$ be the Lebesgue measure of $\{t : X(t) = k\}$. Then for $k = 1, \dots, N-1$,
\[ E[L_k] = \frac{1}{k}. \tag{8} \]
Also,
\[ E[T] \le C \log N \tag{9} \]
and for all $0 \le t \le N$,
\[ P(T > t) \le C/t. \tag{10} \]

Lemma 7. Consider the model of Proposition 1. Let $q'_m$ be the probability that a type $m$ individual is born at some time, but that eventually all individuals have type zero. Suppose $N\mu^{1-2^{-(m-1)}} \to 0$ as $N \to \infty$. Then $q'_m \ll 1/N$.

Proof. The event that all individuals eventually have type zero has probability $(N-1)/N$ regardless of the mutation rate. On this event, reducing the mutation rate can only reduce the probability of eventually getting a type $m$ individual. Therefore, it suffices to prove the result when
\[ N\mu^{1-2^{-(m-2)}} \to \infty. \tag{11} \]
If a type $m$ individual eventually is born, then some type 2 mutation must have a type $m$ descendant. By (8), for $k = 1, \dots, N-1$, the expected amount of time for which there are $k$ individuals of nonzero type is $1/k$. While there are $k$ individuals of nonzero type, type 2 mutations occur at rate at most $k\mu$. On the event that there is no fixation, the number of individuals of nonzero type never reaches $N$, and the expected number of type 2 mutations while there are fewer than $N$ individuals of nonzero type is at most
\[ \sum_{k=1}^{N-1} \frac{1}{k} \cdot k\mu \le N\mu. \]
When (11) holds, we can apply Proposition 1 to see that if $m \ge 3$ then each type 2 mutation has probability at most $C\mu^{1-2^{-(m-2)}}$ of having a type $m$ descendant. This inequality holds trivially if $m = 2$.
It follows that
\[ q'_m \le (N\mu)(C\mu^{1-2^{-(m-2)}}) = CN\mu^{2-2^{-(m-2)}}, \]
and therefore $Nq'_m \le C(N\mu^{1-2^{-(m-1)}})^2 \to 0$, as claimed.

Lemma 8. Suppose $j \ge 2$. If $N^{-2^{j-1}/(2^{j-1}-1)} \ll \mu \ll 1/N$, then for all $\epsilon > 0$,
\[ \lim_{N\to\infty} P(\tau_j < \epsilon\mu^{-1}) = 1. \]

Proof. Part 3 of Theorem 2 gives $\lim_{N\to\infty} P(N\mu^{2-2^{-(j-1)}}\tau_j \le t) = 1 - e^{-t}$ for all $t > 0$. The result follows immediately because $\mu \ll N\mu^{2-2^{-(j-1)}}$ by assumption.

Proof of parts 1 and 2 of Theorem 2. Suppose either $j = 1$ and $\mu \ll N^{-2}$, or $j = 2, \dots, m-1$ and $N^{-2^{j-1}/(2^{j-1}-1)} \ll \mu \ll N^{-2^j/(2^j-1)}$. Let $\gamma_i$ be the time of the $i$th mutation, so the points $(\gamma_i)_{i=1}^\infty$ form a rate $N\mu$ Poisson process on $[0, \infty)$. Call the $i$th mutation bad if at time $\gamma_i$, there is another mutation in the population that has not yet died out or fixated. Otherwise, call the mutation good. For all $i$, let $\xi_i = 1$ if the $i$th mutation fixates, and let $\xi_i = 0$ otherwise. We have $P(\xi_i = 1) = 1/N$ for all $i$, but the random variables $(\xi_i)_{i=1}^\infty$ are not independent because if two mutations are present at the same time on different individuals, at most one of the mutations can fixate.

Let $(\tilde\xi_i)_{i=1}^\infty$ be a sequence of i.i.d. random variables, independent of the population process, such that $P(\tilde\xi_i = 1) = 1/N$ and $P(\tilde\xi_i = 0) = (N-1)/N$ for all $i$. Define another sequence $(\xi'_i)_{i=1}^\infty$ such that $\xi'_i = \xi_i$ if the $i$th mutation is good and $\xi'_i = \tilde\xi_i$ if the $i$th mutation is bad. If the $i$th mutation is good, then $P(\xi_i = 1 \mid (\xi'_k)_{k=1}^{i-1}) = 1/N$, so $(\xi'_i)_{i=1}^\infty$ is an i.i.d. sequence. Let $\sigma_1 = \inf\{\gamma_i : \xi_i = 1\}$ and for $k \ge 2$, let $\sigma_k = \inf\{\gamma_i > \sigma_{k-1} : \xi_i = 1\}$. Likewise, let $\sigma'_1 = \inf\{\gamma_i : \xi'_i = 1\}$ and for $k \ge 2$, let $\sigma'_k = \inf\{\gamma_i > \sigma'_{k-1} : \xi'_i = 1\}$.
The points $\gamma_i$ for which $\xi'_i = 1$ form a Poisson process of rate $\mu$, so $\mu\sigma'_{m-j}$ has the gamma distribution with parameters $(m-j, 1)$. Let $\epsilon > 0$, and choose $t$ large enough that
\[ P(\sigma'_{m-j} > \mu^{-1}t) < \epsilon. \tag{12} \]
Note that because $\mu\sigma'_{m-j}$ has a gamma distribution for all $N$, here $t$ does not depend on $N$. The expected number of mutations by time $\mu^{-1}t$ is $(N\mu)(\mu^{-1}t) = Nt$. After a mutation occurs, the number of individuals descended from this mutant individual evolves in the same way as the number of type 1 individuals in Lemma 6. Therefore, by (9), the expected amount of time, before time $\mu^{-1}t$, that there is a mutation in the population that has not yet disappeared or fixated is at most $C(N \log N)t$. Therefore, the expected number of bad mutations before time $\mu^{-1}t$ is at most $(N\mu)(C(N\log N)t) = C(N^2 \log N)\mu t$. If a bad mutation occurs at time $\gamma_i$, the probability that either $\xi_i$ or $\xi'_i$ equals one is at most $2/N$, so
\[ P(\xi_i = \xi'_i \text{ for all } i \text{ such that } \gamma_i \le \mu^{-1}t) \ge 1 - 2C(N\log N)\mu t. \]
Because $\mu \ll 1/(N \log N)$, it follows by letting $\epsilon \to 0$ that
\[ \lim_{N\to\infty} P(\sigma'_{m-j} = \sigma_{m-j}) = 1. \tag{13} \]
Thus, $\mu\sigma_{m-j} \to_d S_{m-j}$. To complete the proof, it remains to show that
\[ \mu(\tau_m - \sigma_{m-j}) \to_p 0. \tag{14} \]
We first prove that
\[ \lim_{N\to\infty} P(\tau_m < \sigma_{m-j}) = 0. \tag{15} \]
If $\tau_m < \sigma_{m-j}$, then before time $\sigma_{m-j}$, there must be a type $k$ mutation for some $k \le m-j$ that does not fixate but has a type $m$ descendant. We will bound the probability of this event. Recall that the expected number of mutations before time $\mu^{-1}t$ is $Nt$. Because $\mu \ll N^{-2^j/(2^j-1)}$, we can apply Lemma 7 with $j+1$ in place of $m$ to get that the probability that a type $m-j$ mutation does not fixate but has a type $m$ descendant is asymptotically much smaller than $1/N$.
Thus, the probability that before time $\mu^{-1}t$, there is a type $k$ mutation for some $k \le m-j$ that does not fixate but has a type $m$ descendant is asymptotically much smaller than $(Nt)(1/N)$, and therefore goes to zero as $N \to \infty$. Combining this result with (12) and (13) gives (15).

We now prove (14). Choose $\epsilon > 0$. Let $\tilde\gamma_i$ be the time when the mutation at time $\gamma_i$ disappears or fixates. By (9), we have $E[\tilde\gamma_i - \gamma_i] \le C\log N$. It follows from Markov's Inequality that $P(\tilde\gamma_i - \gamma_i > \mu^{-1}\epsilon) \le C\log N/(\mu^{-1}\epsilon)$. Because the expected number of mutations by time $\mu^{-1}t$ is $Nt$, another application of Markov's Inequality gives
\[ P(\tilde\gamma_i - \gamma_i > \mu^{-1}\epsilon \text{ for some } i \text{ such that } \gamma_i < \mu^{-1}t) \le Nt \cdot \frac{C\log N}{\mu^{-1}\epsilon} = \frac{Ct}{\epsilon}(N\log N)\mu, \]
which goes to zero as $N \to \infty$. Therefore, in view of (12) and (13), if $\zeta$ is the time when the mutation at time $\sigma_{m-j}$ fixates, we have
\[ \mu(\zeta - \sigma_{m-j}) \to_p 0. \tag{16} \]
Now (14) will be immediate from (15) and (16) once we show that for all $\epsilon > 0$,
\[ \lim_{N\to\infty} P(\mu(\tau_m - \zeta) > \epsilon) = 0. \tag{17} \]
When $j \ge 2$, equation (17) follows from Lemma 8 because after time $\sigma_{m-j}$, at most $j$ more mutations are needed before we reach time $\tau_m$. When $j = 1$, we reach the time $\tau_m$ as soon as there is another mutation after time $\sigma_{m-j}$, so $\tau_m - \zeta$ is stochastically dominated by an exponentially distributed random variable with rate $N\mu$. It follows that (17) holds in this case as well.

Most of the work involved in proving Theorem 3 is contained in the proof of the following result, which is a special case of Lemma 7.1 of [10].

Lemma 9. Suppose $\mu \sim AN^{-2^{j-1}/(2^{j-1}-1)}$ for some $j = 2, \dots, m$ and some constant $A > 0$. Consider the model of Proposition 1. Let $q'_j$ be the probability that either a type $j$ individual is born at some time, or eventually all individuals in the population have type greater than zero.
Then $\lim_{N\to\infty} Nq'_j = \lambda_j$, where $\lambda_j > 1$ is given by (3).

Proof of Theorem 3. The proof is similar to the proof of parts 1 and 2 of Theorem 2. Define the sequences $(\gamma_i)_{i=1}^\infty$, $(\xi_i)_{i=1}^\infty$, $(\tilde\xi_i)_{i=1}^\infty$ and $(\xi'_i)_{i=1}^\infty$ as in the proof of parts 1 and 2 of Theorem 2. Also define a sequence $(\zeta_i)_{i=1}^\infty$ of $\{0,1\}$-valued random variables such that $\zeta_i = 1$ if the mutation at time $\gamma_i$ either fixates or has a descendant that gets $j-1$ additional mutations. Let $(\tilde\zeta_i)_{i=1}^\infty$ be a sequence of i.i.d. random variables, independent of the population process, such that $P(\tilde\zeta_i = 1) = \lambda_j/N$ and $P(\tilde\zeta_i = 0) = (N-\lambda_j)/N$ for all $i$, and $\tilde\zeta_i = 1$ whenever $\tilde\xi_i = 1$. Let $\zeta'_i = \zeta_i$ if the $i$th mutation is good, and let $\zeta'_i = \tilde\zeta_i$ otherwise. Let $\sigma_0 = 0$. For $k = 1, \dots, m-j$, let $\sigma_k = \inf\{\gamma_i > \sigma_{k-1} : \xi_i = 1\}$. Let $\sigma_{m-j+1} = \inf\{\gamma_i > \sigma_{m-j} : \zeta_i = 1\}$. Define $\sigma'_1, \dots, \sigma'_{m-j+1}$ in the same way using the random variables $\xi'_i$ and $\zeta'_i$. It is clear from the construction that $\mu\sigma'_{m-j+1}$ has the same distribution as $S_{m-j} + Y$. By the same argument used in the proof of parts 1 and 2 of Theorem 2, with a bound of $2\lambda_j/N$ replacing the bound of $2/N$, we get
\[ \lim_{N\to\infty} P(\sigma'_{m-j+1} = \sigma_{m-j+1}) = 1, \]
which implies $\mu\sigma_{m-j+1} \to_d S_{m-j} + Y$. This argument also gives that the mutation at time $\sigma_{m-j+1}$ is good with probability tending to one as $N \to \infty$.

We next claim that
\[ \lim_{N\to\infty} P(\tau_m < \sigma_{m-j+1}) = 0. \tag{18} \]
If $\sigma_{m-j} < \gamma_i < \sigma_{m-j+1}$, then by the definition of $\sigma_{m-j+1}$, no descendant of the mutation at time $\gamma_i$ can have a type $m$ descendant. Therefore, if $\tau_m < \sigma_{m-j+1}$, then before time $\sigma_{m-j}$ there must be a type $k$ mutation for some $k \le m-j$ that does not fixate but has a type $m$ descendant.
Because $\mu \ll N^{-2^j/(2^j-1)}$, the probability of this event goes to zero by the same argument given in the proof of parts 1 and 2 of Theorem 2, which implies (18).

It remains only to prove
\[ \mu(\tau_m - \sigma_{m-j+1}) \to_p 0. \tag{19} \]
Let $\epsilon > 0$, and choose $t$ large enough that $P(\sigma'_{m-j+1} > \mu^{-1}t) < \epsilon$. By the same argument given in the proof of parts 1 and 2 of Theorem 2, the probability that some mutation before time $\mu^{-1}t$ takes longer than $\mu^{-1}\epsilon$ to die out or fixate tends to zero as $N \to \infty$. Therefore, if $\zeta$ is the time when the mutation at time $\sigma_{m-j+1}$ dies out or fixates, then $\mu(\zeta - \sigma_{m-j+1}) \to_p 0$. If the mutation at time $\sigma_{m-j+1}$ fixates, then only $j-1$ more mutations are needed before we reach time $\tau_m$. Therefore, conditional on this fixation, when $j \ge 3$ we get $\mu(\tau_m - \zeta) \to_p 0$ by applying Lemma 8 with $j-1$ in place of $j$, while the result $\mu(\tau_m - \zeta) \to_p 0$ is immediate when $j = 2$. Alternatively, if the mutation at time $\sigma_{m-j+1}$ does not fixate and the mutation at time $\sigma_{m-j+1}$ is good, then $\tau_m \le \zeta$. Because the mutation at time $\sigma_{m-j+1}$ is good with probability tending to one as $N \to \infty$, we conclude (19).

4 Proof of parts 1 and 2 of Theorem 4

The first step in the proof of Theorem 4 is to establish conditions, stated in Proposition 11 below, under which the number of type $k$ individuals is essentially deterministic, in the sense that it can be well approximated by its expectation. It will follow that when $\mu \gg N^{-2/m}$, the number of individuals with type $m-1$ is approximately deterministic until time $\tau_m$. Since each type $m-1$ individual experiences a type $m$ mutation at rate $\mu$, the approximately deterministic behavior of the type $m-1$ individuals leads easily to a proof of part 1 of Theorem 4.
When instead $N^{-1/(1+(m-j-2)2^{-(j+1)})} \ll \mu \ll N^{-1/(1+(m-j-1)2^{-j})}$, the number of individuals of type $m-j-1$ is approximately deterministic up to time $\tau_m$, as will be shown in Lemma 12 below. The remainder of the proof of part 2 of Theorem 4 involves using a Poisson approximation technique to calculate the distribution of the time we have to wait for one of the type $m-j-1$ individuals to have a type $m-j$ mutation that will give rise to a type $m$ descendant.

We begin with a lemma bounding the expected number of type $k$ individuals. Recall that $X_j(t)$ denotes the number of type $j$ individuals at time $t$, and $X_j(0) = 0$ for all $j \ge 1$.

Lemma 10. Let $Y_k(t) = \sum_{j=k}^\infty X_j(t)$ be the number of individuals of type $k$ or higher at time $t$. For all $k \ge 0$ and $t \ge 0$, we have $E[X_k(t)] \le E[Y_k(t)] \le N\mu^k t^k/k!$.

Proof. The first inequality is obvious, so it suffices to show $E[Y_k(t)] \le N\mu^k t^k/k!$. We proceed by induction. Since $Y_0(t) \le N$ for all $t \ge 0$, the result is true for $k = 0$. Suppose $k \ge 1$ and $E[Y_{k-1}(t)] \le N\mu^{k-1}t^{k-1}/(k-1)!$ for all $t \ge 0$. The expected number of type $k$ mutations before time $t$ is at most
\[ \mu \int_0^t E[X_{k-1}(s)]\, ds \le \int_0^t \frac{N\mu^k s^{k-1}}{(k-1)!}\, ds = \frac{N\mu^k t^k}{k!}. \]
Because individuals of type $k$ and higher give birth and die at the same rate, it follows that $E[Y_k(t)] \le N\mu^k t^k/k!$.

Proposition 11. Suppose $k \ge 0$ and $T$ is a time that depends on $N$. Assume that as $N \to \infty$, we have $\mu T \to 0$, $N\mu^k T^{k-1} \to \infty$, and $N\mu^k T^k \to \infty$. Then for all $\epsilon > 0$,
\[ \lim_{N\to\infty} P\Big( \max_{0 \le t \le T} \Big| X_k(t) - \frac{N\mu^k t^k}{k!} \Big| > \epsilon N\mu^k T^k \Big) = 0. \tag{20} \]

Proof. We prove the result by induction and begin with $k = 0$. Individuals of type one or higher are always being born and dying at the same rate.
Since new individuals of type one or higher also appear because of type 1 mutations, the process $(N - X_0(t), t \ge 0)$ is a bounded submartingale. Let $\zeta = \inf\{t : N - X_0(t) > \epsilon N\}$. By the Optional Sampling Theorem, we have $E[N - X_0(T) \mid \zeta \le T] \ge \epsilon N$. Since the rate of type 1 mutations is always bounded by $N\mu$, we have $E[N - X_0(T)] \le N\mu T$. Therefore,
\[ P\Big( \max_{0 \le t \le T} |X_0(t) - N| > \epsilon N \Big) = P(\zeta \le T) \le \frac{E[N - X_0(T)]}{E[N - X_0(T) \mid \zeta \le T]} \le \frac{N\mu T}{\epsilon N} \to 0 \]
as $N \to \infty$ because $\mu T \to 0$. It follows that when $k = 0$, (20) holds for all $\epsilon > 0$.

Let $k \ge 1$. Assume that (20) holds with $k-1$ in place of $k$. Let $B_k(t)$ be the number of type $k$ mutations up to time $t$. Let $S_k(t)$ be the number of times, until time $t$, that a type $k$ individual gives birth minus the number of times that a type $k$ individual dies. Note that $X_k(t) = B_k(t) - B_{k+1}(t) + S_k(t)$, so
\[ \Big| X_k(t) - \frac{N\mu^k t^k}{k!} \Big| \le B_{k+1}(t) + |S_k(t)| + \Big| B_k(t) - \frac{N\mu^k t^k}{k!} \Big|. \tag{21} \]
Therefore, it suffices to show that with probability tending to one as $N \to \infty$, the three terms on the right-hand side of (21) stay below $\epsilon N\mu^k T^k/3$ for $t \le T$.

By Lemma 10, for $0 \le t \le T$,
\[ E[B_{k+1}(t)] \le E[B_{k+1}(T)] = \mu \int_0^T E[X_k(s)]\, ds \le \frac{N\mu^{k+1}T^{k+1}}{(k+1)!}. \]
By Markov's Inequality,
\[ P\Big( \max_{0 \le t \le T} B_{k+1}(t) > \frac{\epsilon}{3} N\mu^k T^k \Big) = P\Big( B_{k+1}(T) > \frac{\epsilon}{3} N\mu^k T^k \Big) \le \frac{3\mu T}{\epsilon (k+1)!} \to 0 \tag{22} \]
as $N \to \infty$ because $\mu T \to 0$.

Note that $S_k(0) = 0$, and since type $k$ individuals give birth and die at the same rate, the process $(S_k(t), 0 \le t \le T)$ is a martingale. By Wald's Second Equation, $E[S_k(T)^2]$ is the expected number of births plus deaths of type $k$ individuals (not counting replacements of a type $k$ individual by another type $k$ individual) up to time $T$, which by Lemma 10 is at most
\[ 2 \int_0^T E[X_k(t)]\, dt \le \frac{2N\mu^k T^{k+1}}{(k+1)!}. \]
Therefore, by the $L^2$-Maximal Inequality for martingales,
\[ E\Big[ \max_{0 \le t \le T} |S_k(t)|^2 \Big] \le 4E[S_k(T)^2] \le \frac{8N\mu^k T^{k+1}}{(k+1)!}. \]
Now using Chebyshev's Inequality,
\[ P\Big( \max_{0 \le t \le T} |S_k(t)| > \frac{\epsilon}{3} N\mu^k T^k \Big) \le \frac{8N\mu^k T^{k+1}}{(k+1)!} \Big( \frac{3}{\epsilon N\mu^k T^k} \Big)^2 = \frac{72}{(k+1)!\, \epsilon^2 N\mu^k T^{k-1}} \to 0 \tag{23} \]
as $N \to \infty$ because $N\mu^k T^{k-1} \to \infty$.

To bound the third term in (21), note that type $k-1$ individuals mutate to type $k$ at rate $\mu$. Therefore, there exist inhomogeneous Poisson processes $(N_1(t), t \ge 0)$ and $(N_2(t), t \ge 0)$ whose intensities at time $t$ are given by $N\mu^k t^{k-1}/(k-1)! - \epsilon N\mu^k T^{k-1}/6$ and $N\mu^k t^{k-1}/(k-1)! + \epsilon N\mu^k T^{k-1}/6$ respectively such that on the event that
\[ \max_{0 \le t \le T} \Big| X_{k-1}(t) - \frac{N\mu^{k-1}t^{k-1}}{(k-1)!} \Big| \le \frac{\epsilon}{6} N\mu^{k-1}T^{k-1}, \tag{24} \]
we have $N_1(t) \le B_k(t) \le N_2(t)$ for $0 \le t \le T$. To achieve this coupling, one can begin with points at the times of type $k$ mutations. To get $(N_1(t), t \ge 0)$, when there is a type $k$ mutation at time $t$, keep this point with probability $[N\mu^k t^{k-1}/(k-1)! - \epsilon N\mu^k T^{k-1}/6]/\mu X_{k-1}(t-)$ and remove it otherwise. To get $(N_2(t), t \ge 0)$, add points of a time-inhomogeneous Poisson process whose rate at time $t$ is $[N\mu^k t^{k-1}/(k-1)! + \epsilon N\mu^k T^{k-1}/6] - \mu X_{k-1}(t)$.

Note that
\[ E[N_1(t)] = \int_0^t \Big( \frac{N\mu^k s^{k-1}}{(k-1)!} - \frac{\epsilon N\mu^k T^{k-1}}{6} \Big)\, ds = \frac{N\mu^k t^k}{k!} - \frac{\epsilon}{6} N\mu^k T^{k-1} t \tag{25} \]
and likewise
\[ E[N_2(t)] = \frac{N\mu^k t^k}{k!} + \frac{\epsilon}{6} N\mu^k T^{k-1} t. \]
The process $(N_1(t) - E[N_1(t)], t \ge 0)$ is a martingale, and
\[ E\big[ (N_1(T) - E[N_1(T)])^2 \big] = E[N_1(T)] = \frac{N\mu^k T^k}{k!} - \frac{\epsilon}{6} N\mu^k T^k. \tag{26} \]
Therefore, Chebyshev's Inequality and the $L^2$-Maximal Inequality for martingales give
\[ P\Big( \max_{0 \le t \le T} \big| N_1(t) - E[N_1(t)] \big| > \frac{\epsilon}{6} N\mu^k T^k \Big) \le \frac{36\, E\big[ \max_{0 \le t \le T} |N_1(t) - E[N_1(t)]|^2 \big]}{(\epsilon N\mu^k T^k)^2} \le \frac{144\, E\big[ (N_1(T) - E[N_1(T)])^2 \big]}{(\epsilon N\mu^k T^k)^2} \to 0 \tag{27} \]
as $N \to \infty$ by (26) because $N\mu^k T^k \to \infty$. Combining (25) with (27) gives
\[ \lim_{N\to\infty} P\Big( \max_{0 \le t \le T} \Big| N_1(t) - \frac{N\mu^k t^k}{k!} \Big| > \frac{\epsilon}{3} N\mu^k T^k \Big) = 0. \tag{28} \]
The same argument gives
\[ \lim_{N\to\infty} P\Big( \max_{0 \le t \le T} \Big| N_2(t) - \frac{N\mu^k t^k}{k!} \Big| > \frac{\epsilon}{3} N\mu^k T^k \Big) = 0. \tag{29} \]
By the induction hypothesis, the event in (24) occurs with probability tending to one as $N \to \infty$, so $N_1(t) \le B_k(t) \le N_2(t)$ for $0 \le t \le T$ with probability tending to one as $N \to \infty$. Therefore, equations (28) and (29) imply that
\[ \lim_{N\to\infty} P\Big( \max_{0 \le t \le T} \Big| B_k(t) - \frac{N\mu^k t^k}{k!} \Big| > \frac{\epsilon}{3} N\mu^k T^k \Big) = 0. \tag{30} \]
The result follows from (21), (22), (23), and (30).

Proof of part 1 of Theorem 4. Suppose $\mu \gg N^{-2/m}$, and let $T = N^{-1/m}\mu^{-1}t$. As $N \to \infty$, we have $\mu T = N^{-1/m}t \to 0$, $N\mu^{m-1}T^{m-2} = N^{2/m}\mu t^{m-2} \to \infty$, and $N\mu^{m-1}T^{m-1} = N^{1/m}t^{m-1} \to \infty$. Therefore, by Proposition 11, if $\epsilon > 0$, then with probability tending to one as $N \to \infty$,
\[ \max_{0 \le s \le T} \Big| X_{m-1}(s) - \frac{N\mu^{m-1}s^{m-1}}{(m-1)!} \Big| \le \epsilon N\mu^{m-1}T^{m-1}. \tag{31} \]
Because each type $m-1$ individual experiences a type $m$ mutation at rate $\mu$, the random variable
\[ V = \int_0^{\tau_m} \mu X_{m-1}(s)\, ds \]
has an exponential distribution with mean one. When (31) holds, we have
\[ \frac{N\mu^m T^m}{m!} - \epsilon N\mu^m T^m \le \int_0^T \mu X_{m-1}(s)\, ds \le \frac{N\mu^m T^m}{m!} + \epsilon N\mu^m T^m. \]
Since $N\mu^m T^m = t^m$, it follows that
\[ \limsup_{N\to\infty} P(\tau_m > T) \le \limsup_{N\to\infty} P\Big( V > \frac{N\mu^m T^m}{m!} - \epsilon N\mu^m T^m \Big) = P\Big( V > \frac{t^m}{m!} - \epsilon t^m \Big) = \exp\Big( -\frac{t^m}{m!} + \epsilon t^m \Big), \]
and likewise
\[ \liminf_{N\to\infty} P(\tau_m > T) \ge \liminf_{N\to\infty} P\Big( V > \frac{N\mu^m T^m}{m!} + \epsilon N\mu^m T^m \Big) = \exp\Big( -\frac{t^m}{m!} - \epsilon t^m \Big). \]
Because these bounds hold for all $\epsilon > 0$, the result follows.

We now work towards proving part 2 of Theorem 4. For the rest of this section, we assume that
\[ N^{-1/(1+(m-j-2)2^{-(j+1)})} \ll \mu \ll N^{-1/(1+(m-j-1)2^{-j})} \tag{32} \]
for some $j = 1, \dots, m-2$. This condition implies that $N\mu \to \infty$ and $\mu \to 0$ as $N \to \infty$, and therefore
\[ N\mu^{1-2^{-j}} \to \infty. \tag{33} \]
Also, for the rest of this section, $t$ is fixed and
\[ T = N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}\, t. \tag{34} \]
This means that
\[ N\mu^{m-j}T^{m-j} = \mu^{-(1-2^{-j})}\, t^{m-j}. \tag{35} \]
Let $\epsilon > 0$. Let $G_N$ be the event that
\[ \max_{0 \le s \le T} \Big| X_{m-j-1}(s) - \frac{N\mu^{m-j-1}s^{m-j-1}}{(m-j-1)!} \Big| \le \epsilon N\mu^{m-j-1}T^{m-j-1}. \]
The next lemma shows that $G_N$ occurs with high probability, indicating that on the time scale of interest, the number of individuals with $m-j-1$ mutations stays close to its expectation.

Lemma 12. We have $\lim_{N\to\infty} P(G_N) = 1$.

Proof. We need to verify the conditions of Proposition 11 with $m-j-1$ in place of $k$. By (33), as $N \to \infty$,
\[ \mu T = N^{-1/(m-j)}\mu^{-(1-2^{-j})/(m-j)}\, t = (N\mu^{1-2^{-j}})^{-1/(m-j)}\, t \to 0. \tag{36} \]
Also, using the first inequality in (32),
\begin{align*}
N\mu^{m-j-1}T^{m-j-2} &= N^{1-(m-j-2)/(m-j)}\, \mu^{m-j-1-(m-j-2)-(m-j-2)(1-2^{-j})/(m-j)}\, t^{m-j-2} \\
&= N^{2/(m-j)}\, \mu^{2/(m-j)+(m-j-2)2^{-j}/(m-j)}\, t^{m-j-2} \\
&= (N\mu^{1+(m-j-2)2^{-(j+1)}})^{2/(m-j)}\, t^{m-j-2} \to \infty. \tag{37}
\end{align*}
Using the second inequality in (32) and the fact that $m-j+1-2^{-j} > 1+(m-j-1)2^{-j}$,
\[ T = (N\mu^{m-j+1-2^{-j}})^{-1/(m-j)}\, t \gg \big( N^{1-(m-j+1-2^{-j})/(1+(m-j-1)2^{-j})} \big)^{-1/(m-j)}\, t \to \infty. \]
This result and (37) imply $N\mu^{m-j-1}T^{m-j-1} \to \infty$, which, in combination with (36) and (37), gives the lemma.
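The exponent bookkeeping behind (35), (36), and (37) can be checked mechanically. The following is a minimal sketch, not part of the paper: it represents a monomial $N^a\mu^b$ by the pair of exact rational exponents $(a, b)$, ignores the fixed factor $t$, and confirms that the three identities hold for a range of $m$ and $j$. The helper names `t_exponents`, `monomial`, and `check_identities` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def t_exponents(m, j):
    """Exponents (a, b) with T = N**a * mu**b * t, read off from (34)."""
    d = m - j
    return Fraction(-1, d), -1 - (1 - Fraction(1, 2 ** j)) / d


def monomial(n_exp, mu_exp, t_power, m, j):
    """Exponents of N and mu in N**n_exp * mu**mu_exp * T**t_power."""
    a, b = t_exponents(m, j)
    return n_exp + t_power * a, mu_exp + t_power * b


def check_identities(m, j):
    """Return True if (35), (36) and (37) hold at the level of exponents."""
    d = m - j
    x = Fraction(1, 2 ** j)  # the quantity 2^{-j}
    # (35): N mu^{m-j} T^{m-j} = mu^{-(1 - 2^{-j})} t^{m-j}
    ok35 = monomial(1, d, d, m, j) == (0, -(1 - x))
    # (36): mu T = (N mu^{1 - 2^{-j}})^{-1/(m-j)} t
    p = Fraction(-1, d)
    ok36 = monomial(0, 1, 1, m, j) == (p, p * (1 - x))
    # (37): N mu^{m-j-1} T^{m-j-2}
    #       = (N mu^{1 + (m-j-2) 2^{-(j+1)}})^{2/(m-j)} t^{m-j-2}
    r = Fraction(2, d)
    ok37 = monomial(1, d - 1, d - 2, m, j) == (r, r * (1 + (d - 2) * x / 2))
    return ok35 and ok36 and ok37


if __name__ == "__main__":
    # j ranges over 1, ..., m-2, as in the assumption (32).
    print(all(check_identities(m, j) for m in range(3, 10) for j in range(1, m - 1)))
```

Working with exponent pairs rather than numerical values of $N$ and $\mu$ keeps the check exact; for $m = 3$ and $j = 1$ the exponents of $T$ are $(-1/2, -5/4)$, matching the scaling $N^{1/2}\mu^{5/4}\tau_3$ in Section 2.4.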
The rest of the proof of part 2 of Theorem 4 is similar to the proof of Theorem 2 in [10]. It depends on the following result on Poisson approximation, which is part of Theorem 1 of [4] and was used also in [10].

Lemma 13. Suppose $(A_i)_{i \in \mathcal{I}}$ is a collection of events, where $\mathcal{I}$ is any index set. Let $W = \sum_{i \in \mathcal{I}} 1_{A_i}$ be the number of events that occur, and let $\lambda = E[W] = \sum_{i \in \mathcal{I}} P(A_i)$. Suppose for each $i \in \mathcal{I}$, we have $i \in \beta_i \subset \mathcal{I}$. Let $\mathcal{F}_i = \sigma((A_j)_{j \in \mathcal{I} \setminus \beta_i})$. Define
\begin{align*}
b_1 &= \sum_{i \in \mathcal{I}} \sum_{j \in \beta_i} P(A_i)P(A_j), \\
b_2 &= \sum_{i \in \mathcal{I}} \sum_{i \ne j \in \beta_i} P(A_i \cap A_j), \\
b_3 &= \sum_{i \in \mathcal{I}} E\big[ |P(A_i \mid \mathcal{F}_i) - P(A_i)| \big].
\end{align*}
Then $|P(W = 0) - e^{-\lambda}| \le b_1 + b_2 + b_3$.

We will use the next lemma to get the second moment estimate needed to bound $b_2$. When we apply this result, the individuals born at times $t_1$ and $t_2$ will both have the same type. We use different types in the statement of the lemma to make it easier to distinguish the descendants of the two individuals. This result is Lemma 5.2 of [10].

Lemma 14. Fix times $t_1 < t_2$. Consider a population of size $N$ which evolves according to the Moran model in which all individuals initially have type 0. There are no mutations, except that one individual becomes type 1 at time $t_1$, and one type 0 individual (if there is one) becomes type 2 at time $t_2$. Fix a positive integer $L \le N/2$. For $i = 1, 2$, let $Y_i(t)$ be the number of type $i$ individuals at time $t$ and let $B_i$ be the event that $L \le \max_{t \ge 0} Y_i(t) \le N/2$. Then
\[ P(B_1 \cap B_2) \le 2/L^2. \]

Lemma 15. Consider the model introduced in Proposition 1. Assume $N\mu^{1-2^{-j}} \to \infty$ as $N \to \infty$. We define the following three events:

1. Let $R_1$ be the event that eventually a type $j+1$ individual is born.

2. Let $R_2$ be the event that the maximum number of individuals of nonzero type at any time is between $\epsilon\mu^{-1+2^{-j}}$ and $N/2$.

3. Let $R_3$ be the event that all individuals still alive at time $\epsilon^{-1}\mu^{-1+2^{-j}}$ have type zero.

Let $\bar q_{j+1} = P(R_1 \cap R_2 \cap R_3)$. Then there exists a constant $C$, not depending on $\epsilon$, such that $q_{j+1} - C\epsilon\mu^{1-2^{-j}} \le \bar q_{j+1} \le q_{j+1}$.

Proof. Because $q_{j+1} = P(R_1)$, the inequality $\bar q_{j+1} \le q_{j+1}$ is immediate. We need to show that $P(R_1 \cap (R_2^c \cup R_3^c)) \le C\epsilon\mu^{1-2^{-j}}$. Because $\epsilon^{-1}\mu^{-1+2^{-j}} \le N$ for sufficiently large $N$, we have $P(R_3^c) \le C\epsilon\mu^{1-2^{-j}}$ by (10). It remains to show that $P(R_1 \cap R_2^c) \le C\epsilon\mu^{1-2^{-j}}$. The probability that the number of individuals of nonzero type ever exceeds $N/2$ is at most $2/N \ll \epsilon\mu^{1-2^{-j}}$. By (8) and the fact that each type 1 individual experiences type 2 mutations at rate $\mu$, the expected number of type 2 mutations while there are $k$ individuals of nonzero type is at most $(k\mu)(1/k) = \mu$. Therefore, the expected number of type 2 mutations while there are fewer than $\epsilon\mu^{-1+2^{-j}}$ individuals of nonzero type is at most $\epsilon\mu^{2^{-j}}$. The probability that a given type 2 mutation has a type $j+1$ descendant is at most $C\mu^{1-2^{-(j-1)}}$ by Proposition 1. It now follows, using Markov's Inequality, that the probability that some type 2 mutation that occurs while there are fewer than $\epsilon\mu^{-1+2^{-j}}$ individuals of nonzero type has a type $j+1$ descendant is at most $C\epsilon\mu^{2^{-j}+1-2^{-(j-1)}} = C\epsilon\mu^{1-2^{-j}}$. Thus, $P(R_1 \cap R_2^c) \le C\epsilon\mu^{1-2^{-j}}$. The result follows.

We now define the events to which we will apply Lemma 13. Divide the interval $[0, T]$ into $M$ subintervals of equal length called $I_1, I_2, \dots, I_M$, where $M$ will tend to infinity with $N$. Because type $m-j-1$ individuals experience type $m-j$ mutations at rate $\mu$, we can construct an inhomogeneous Poisson process $K$ on $[0, T]$ whose intensity at time $s$ is given by
\[ \frac{N\mu^{m-j}s^{m-j-1}}{(m-j-1)!} + \epsilon N\mu^{m-j}T^{m-j-1} \tag{38} \]
such that on the event $G_N$, all the times of the type $m-j$ mutations before time $T$ are points of $K$. Let $D_i$ be the event that there is a point of $K$ in the interval $I_i$. Let $\xi_1, \xi_2, \dots, \xi_M$ be i.i.d. $\{0,1\}$-valued random variables, independent of $K$ and the population process, such that $P(\xi_i = 1) = \bar q_{j+1}$ for all $i$, where $\bar q_{j+1}$ comes from Lemma 15. Let $A_i$ be the event that $D_i$ occurs, and one of the following occurs:

• The first point of $K$ in $I_i$ is the time of a type $m-j$ mutation, and the three events defined in Lemma 15 hold. That is, the type $m-j$ mutation eventually has a type $m$ descendant, the maximum number of descendants that it has in the population at any future time is between $\epsilon\mu^{-1+2^{-j}}$ and $N/2$, and it has no descendants remaining a time $\epsilon^{-1}\mu^{-1+2^{-j}}$ after the mutation occurs.

• There is no mutation at the time of the first point of $K$ in $I_i$, and $\xi_i = 1$.

Let $W = \sum_{i=1}^M 1_{A_i}$ be the number of the events $A_i$ that occur, and let $\lambda = E[W]$.

Lemma 16. We have $\limsup_{N\to\infty} |P(W = 0) - e^{-\lambda}| = 0$.

Proof. Let $\beta_i$ be the set of all $h \le M$ such that the distance between the intervals $I_i$ and $I_h$ is at most $\epsilon^{-1}\mu^{-1+2^{-j}}$. Define $b_1$, $b_2$, and $b_3$ as in Lemma 13. We need to show that $b_1$, $b_2$, and $b_3$ all tend to zero as $N \to \infty$.

It is clear from properties of Poisson processes that the events $D_1, \dots, D_M$ are independent, and it is clear from the construction that $P(A_i \mid D_i) = \bar q_{j+1}$ for all $i$. The events $A_1, \dots, A_M$ are not independent because mutations in two intervals $I_h$ and $I_i$ may have descendants alive at the same time. However, if $I_i = [a, b]$, then the third event in Lemma 15 guarantees that whether or not $A_i$ has occurred is determined by time $b + \epsilon^{-1}\mu^{-1+2^{-j}}$, and therefore $A_i$ is independent of all $A_h$ with $h \notin \beta_i$.
It follows that $b_3 = 0$. The length $|I_i|$ of the interval $I_i$ is $T/M$. In view of (38),
$$P(D_i) \le CN\mu^{m-j}T^{m-j-1}|I_i| = CN\mu^{m-j}T^{m-j}/M. \tag{39}$$
Because (33) holds, we can apply Proposition 1 to get $\bar{q}_{j+1} \le q_{j+1} \le C\mu^{1-2^{-j}}$. Therefore, using also (35),
$$P(A_i) = P(D_i)\,\bar{q}_{j+1} \le \frac{CN\mu^{m-j+1-2^{-j}}T^{m-j}}{M} \le \frac{C}{M}$$
for all $i$. There are at most $2(1 + \epsilon^{-1}\mu^{-1+2^{-j}}/|I_i|) \le C\epsilon^{-1}\mu^{-1+2^{-j}}M/T$ indices in $\beta_i$. It follows that
$$b_1 \le M\left(\frac{C\epsilon^{-1}\mu^{-1+2^{-j}}M}{T}\right)\left(\frac{C}{M}\right)^2 \le C\epsilon^{-1}\mu^{-1+2^{-j}}T^{-1} \le C\epsilon^{-1}\mu^{-1+2^{-j}}N^{1/(m-j)}\mu^{1+(1-2^{-j})/(m-j)} = C\epsilon^{-1}\left(N\mu^{1+2^{-j}(m-j-1)}\right)^{1/(m-j)} \to 0 \tag{40}$$
as $N \to \infty$, using the second inequality in (32).

To bound $b_2$, suppose $h \ne i$. Suppose $D_h$ and $D_i$ both occur. If the first points of the Poisson process in $I_h$ and $I_i$ are times of type $m-j$ mutations, then for $A_h \cap A_i$ to occur, the event $B_1 \cap B_2$ in Lemma 14 must occur with $L = \epsilon\mu^{-1+2^{-j}}$. It follows that
$$P(A_h \cap A_i \mid D_h \cap D_i) \le \max\{2/(\epsilon\mu^{-1+2^{-j}})^2,\ \bar{q}_{j+1}^{\,2}\} \le C\epsilon^{-2}\mu^{2-2^{-(j-1)}}.$$
Therefore, using (39), (35), and the fact that $P(D_h \cap D_i) = P(D_h)P(D_i)$ by independence,
$$P(A_h \cap A_i) \le P(D_h)P(D_i)P(A_h \cap A_i \mid D_h \cap D_i) \le \left(\frac{CN\mu^{m-j}T^{m-j}}{M}\right)^2\left(C\epsilon^{-2}\mu^{2-2^{-(j-1)}}\right) \le \frac{C}{\epsilon^2M^2}.$$
Thus, by reasoning as in (40), we get
$$b_2 \le M\left(\frac{C\epsilon^{-1}\mu^{-1+2^{-j}}M}{T}\right)\left(\frac{C}{\epsilon^2M^2}\right) \to 0$$
as $N \to \infty$, which completes the proof.

Lemma 17. Let $\sigma_m$ be the time of the first type $m-j$ mutation that will have a type $m$ descendant. Then
$$\lim_{N\to\infty} P(\sigma_m > T) = \exp\left(-\frac{t^{m-j}}{(m-j)!}\right).$$

Proof. We claim there is a constant $C$, not depending on $\epsilon$, such that for sufficiently large $N$,
$$\left|\lambda - \frac{t^{m-j}}{(m-j)!}\right| \le C\epsilon, \tag{41}$$
where $\lambda$ comes from Lemma 16, and
$$|P(W = 0) - P(\sigma_m > T)| \le C\epsilon. \tag{42}$$
The result follows from this claim by letting $\epsilon \to 0$ and applying Lemma 16.
Recall that we have divided the interval $[0,T]$ into the subintervals $I_1, \dots, I_M$. By letting $M$ tend to infinity sufficiently rapidly as $N$ tends to infinity, we can ensure that the expected number of points of the Poisson process $K$ that are in the same subinterval as some other point tends to zero as $N \to \infty$. Therefore, $\sum_{i=1}^{M} P(D_i)$ is asymptotically equivalent to the expected number of points of $K$. That is,
$$\sum_{i=1}^{M} P(D_i) \sim \int_0^T \left(\frac{N\mu^{m-j}s^{m-j-1}}{(m-j-1)!} + \epsilon N\mu^{m-j}T^{m-j-1}\right) ds = \frac{N\mu^{m-j}T^{m-j}}{(m-j)!} + \epsilon N\mu^{m-j}T^{m-j}. \tag{43}$$
Now
$$\lambda = \sum_{i=1}^{M} P(A_i) = \bar{q}_{j+1}\sum_{i=1}^{M} P(D_i),$$
so using Proposition 1, the second inequality in Lemma 15, (43), and (35),
$$\limsup_{N\to\infty} \lambda \le \limsup_{N\to\infty} \mu^{1-2^{-j}}\left(\frac{N\mu^{m-j}T^{m-j}}{(m-j)!} + \epsilon N\mu^{m-j}T^{m-j}\right) = \frac{t^{m-j}}{(m-j)!} + t^{m-j}\epsilon. \tag{44}$$
Likewise, dropping the second term and using the first inequality in Lemma 15, we get
$$\liminf_{N\to\infty} \lambda \ge \liminf_{N\to\infty}\, (1 - C\epsilon)\,\mu^{1-2^{-j}}\left(\frac{N\mu^{m-j}T^{m-j}}{(m-j)!}\right) = \frac{t^{m-j}(1 - C\epsilon)}{(m-j)!}. \tag{45}$$
Equations (44) and (45) imply (41).

It remains to prove (42). The only way to have $W > 0$ and $\sigma_m > T$ is if for some $i$, there is a point of $K$ in $I_i$ that is not the time of a type $m-j$ mutation and $\xi_i = 1$. On $G_N$, points of $K$ that are not mutation times occur at rate at most $2\epsilon N\mu^{m-j}T^{m-j-1}$. Because the Poisson process runs for time $T$ and $P(\xi_i = 1) = \bar{q}_{j+1} \le C\mu^{1-2^{-j}}$ by Lemma 15 and Proposition 1, we have, using (35),
$$P(W > 0 \text{ and } \sigma_m > T) \le P(G_N^c) + C\epsilon N\mu^{m-j+1-2^{-j}}T^{m-j} \le P(G_N^c) + C\epsilon. \tag{46}$$
We can have $W = 0$ with $\sigma_m \le T$ in two ways. One possibility is that two points of $K$ occur in the same subinterval, an event whose probability goes to zero if $M$ goes to infinity sufficiently rapidly with $N$.
The other possibility is that some type $m-j$ mutation before time $T$ could have a type $m$ descendant but fail to satisfy one of the other two conditions of Lemma 15. The probability of this event is at most
$$P(G_N^c) + CN\mu^{m-j}T^{m-j}(q_{j+1} - \bar{q}_{j+1}) \le P(G_N^c) + C\epsilon N\mu^{m-j+1-2^{-j}}T^{m-j} \le P(G_N^c) + C\epsilon \tag{47}$$
by Lemma 15 and (35). Equation (42) follows from (46), (47), and Lemma 12.

Proof of part 2 of Theorem 4. Recall the definition of $T$ from (34). Define $\sigma_m$ to be the time of the first type $m-j$ mutation that will have a type $m$ descendant. Then $\sigma_m \le \tau_m$, and by Lemma 17, it suffices to show that
$$\lim_{N\to\infty} P\left(\sigma_m < T \text{ and } \tau_m - \sigma_m > \delta N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}\right) = 0 \tag{48}$$
for all $\delta > 0$. The event in (48) can only occur if some type $m-j$ mutation before time $T$ either fixates or takes longer than time $\delta N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}$ to disappear from the population. By Lemma 10, before time $T$ the expected rate of type $m-j$ mutations is at most $CN\mu^{m-j}T^{m-j-1}$, so the expected number of type $m-j$ mutations by time $T$ is at most $CN\mu^{m-j}T^{m-j}$. Because the probability that a mutation fixates is $1/N$, the probability that some type $m-j$ mutation before time $T$ fixates is at most $C\mu^{m-j}T^{m-j}$, which goes to zero as $N \to \infty$ because $\mu T \to 0$ by (36).

Next, note that $\delta N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)} \ll N$, which can be seen by dividing both sides by $N$ and observing that $\delta(N\mu)^{-1}(N\mu^{1-2^{-j}})^{-1/(m-j)} \to 0$ because $N\mu \to \infty$ and $N\mu^{1-2^{-j}} \to \infty$. Therefore, for sufficiently large $N$, we can apply (10) to show that the probability that a given mutation lasts longer than time $\delta N^{-1/(m-j)}\mu^{-1-(1-2^{-j})/(m-j)}$ before disappearing or fixating is at most $C\delta^{-1}N^{1/(m-j)}\mu^{1+(1-2^{-j})/(m-j)}$.
Thus, the probability that some mutation before time $T$ lasts this long is at most
$$C\delta^{-1}N^{1/(m-j)}\mu^{1+(1-2^{-j})/(m-j)} \cdot N\mu^{m-j}T^{m-j} \le C\delta^{-1}N^{1/(m-j)}\mu^{1+(1-2^{-j})/(m-j)}\mu^{-(1-2^{-j})}t^{m-j} = C\delta^{-1}\left(N\mu^{1+(m-j-1)2^{-j}}\right)^{1/(m-j)}t^{m-j} \to 0$$
by the second inequality in (32), and (48) follows.

5 Proof of part 3 of Theorem 4

Throughout this section, we assume
$$\mu \sim AN^{-1/(1+(m-j-1)2^{-j})} \tag{49}$$
for some $j = 1, \dots, m-1$, as in part 3 of Theorem 4. Also, let $T = \mu^{-(1-2^{-j})}t$. Then
$$\lim_{N\to\infty} N\mu^{m-j}T^{m-j}\mu^{1-2^{-j}} = \lim_{N\to\infty} N\mu^{1+(m-j-1)2^{-j}}t^{m-j} = A^{1+(m-j-1)2^{-j}}t^{m-j}. \tag{50}$$
We first show that the number of individuals of type $m-j-1$ is approximately deterministic through time $T$.

Lemma 18. Let $\epsilon > 0$. Let $G_N(\epsilon)$ be the event that
$$\max_{0\le s\le T}\left|X_{m-j-1}(s) - \frac{N\mu^{m-j-1}s^{m-j-1}}{(m-j-1)!}\right| \le \epsilon N\mu^{m-j-1}T^{m-j-1}.$$
Then $\lim_{N\to\infty} P(G_N(\epsilon)) = 1$.

Proof. As in the proof of Lemma 12, we need to check the conditions of Proposition 11 with $m-j-1$ in place of $k$. Because $\mu \to 0$ as $N \to \infty$, we have
$$\mu T = \mu^{2^{-j}}t \to 0 \tag{51}$$
as $N \to \infty$. Also, using that $\mu \sim AN^{-1/(1+(m-j-1)2^{-j})} \gg N^{-1/(1+(m-j-2)2^{-j})}$, we have
$$N\mu^{m-j-1}T^{m-j-2} = N\mu^{m-j-1}\mu^{-(1-2^{-j})(m-j-2)}t^{m-j-2} = N\mu^{1+(m-j-2)2^{-j}}t^{m-j-2} \to \infty$$
as $N \to \infty$. Since $T \to \infty$ as $N \to \infty$, we also have $N\mu^{m-j-1}T^{m-j-1} \to \infty$ as $N \to \infty$, and the lemma follows.

Although the number of type $m-j-1$ individuals is approximately deterministic, there are stochastic effects both from the number of type $m-j$ individuals in the population and from the time that elapses between the appearance of the type $m-j$ mutation that will have a type $m$ descendant and the birth of the type $m$ descendant.
Further complicating the proof is that because births and deaths occur at the same time in the Moran model, the fates of two type $m-j$ mutations that occur at different times are not independent, nor is the number of type $m-j$ individuals in the population independent of whether or not the type $m-j+1$ mutations succeed in producing a type $m$ descendant. Our proof is very similar to the proof of Proposition 4.1 in [10] and involves a comparison between the Moran model and a two-type branching process. To carry out this comparison, we introduce five models.

Model 1: This will be the original model described in the introduction.

Model 2: This model is the same as Model 1 except that there are no type 1 mutations and no individuals of types $1, \dots, m-j-1$. Instead, at times of an inhomogeneous Poisson process whose rate at time $s$ is $N\mu^{m-j}s^{m-j-1}/(m-j-1)!$, a type zero individual (if there is one) becomes type $m-j$.

Model 3: This model is the same as Model 2, except that type $m-j+1$ mutations are suppressed when there is another individual of type $m-j+1$ or higher already in the population.

Model 4: This model is the same as Model 3, except that two changes are made so that the evolution of type $m-j+1$ individuals and their offspring is decoupled from the evolution of the type $m-j$ individuals:

• Whenever there would be a transition that involves exchanging a type $m-j$ individual with an individual of type $k \ge m-j+1$, we instead exchange a randomly chosen type 0 individual with a type $k$ individual.

• At the times of type $m-j+1$ mutations, a randomly chosen type 0 individual, rather than a type $m-j$ individual, becomes type $m-j+1$.

Model 5: This model is a two-type branching process with immigration. Type $m-j$ individuals immigrate at times of an inhomogeneous Poisson process whose rate at time $s$ is $N\mu^{m-j}s^{m-j-1}/(m-j-1)!$. Each individual gives birth at rate 1 and dies at rate 1, and type $m-j$ individuals become type $m$ at rate $\mu q_j$, where $q_j$ comes from Proposition 1.

For $i = 1, 2, 3, 4, 5$, let $Y_i(s)$ be the number of type $m-j$ individuals in Model $i$ at time $s$, and let $Z_i(s)$ be the number of individuals in Model $i$ at time $s$ of type $m-j+1$ or higher. Let $r_i(s)$ be the probability that through time $s$, there has never been a type $m$ individual in Model $i$. Note that $r_1(T) = P(\tau_m > T)$, so to prove part 3 of Theorem 4, we need to calculate $\lim_{N\to\infty} r_1(T)$. We will first find $\lim_{N\to\infty} r_5(T)$ and then bound $|r_i(T) - r_{i+1}(T)|$ for $i = 1, 2, 3, 4$.

5.1 A two-type branching process with immigration

Here we consider Model 5. Our analysis is based on the following lemma concerning two-type branching processes, which is proved in section 2 of [10]; see equation (2.4).

Lemma 19. Consider a continuous-time two-type branching process started with a single type 1 individual. Each type 1 individual gives birth and dies at rate one, and mutates to type 2 at rate $r$. Let $f(t)$ be the probability that a type 2 individual is born by time $t$. If $r$ and $t$ depend on $N$ with $r \to 0$ and $r^{1/2}t \to s$ as $N \to \infty$, then
$$\lim_{N\to\infty} r^{-1/2}f(t) = \frac{1 - e^{-2s}}{1 + e^{-2s}}.$$

Lemma 20. We have
$$\lim_{N\to\infty} r_5(T) = \exp\left(-\frac{A^{1+(m-j-1)2^{-j}}}{(m-j-1)!}\int_0^t (t-s)^{m-j-1}\,\frac{1 - e^{-2s}}{1 + e^{-2s}}\,ds\right). \tag{52}$$

Proof. Let $g(w)$ be the probability that in Model 5, a type $m-j$ individual that immigrates at time $w$ has a type $m$ descendant by time $T$.
Because type $m-j$ individuals immigrate at times of an inhomogeneous Poisson process whose rate at time $w$ is $N\mu^{m-j}w^{m-j-1}/(m-j-1)!$, we have
$$r_5(T) = \exp\left(-\frac{1}{(m-j-1)!}\int_0^T N\mu^{m-j}w^{m-j-1}g(w)\,dw\right). \tag{53}$$
Making the substitution $s = \mu^{1-2^{-j}}w$, we get
$$\int_0^T N\mu^{m-j}w^{m-j-1}g(w)\,dw = \int_0^t N\mu^{1+(m-j-1)2^{-j}}s^{m-j-1}g(\mu^{-(1-2^{-j})}s)\,\mu^{-(1-2^{-j})}\,ds. \tag{54}$$
As $N \to \infty$, we have $N\mu^{1+(m-j-1)2^{-j}} \to A^{1+(m-j-1)2^{-j}}$ by (49). Note also that $g(\mu^{-(1-2^{-j})}s) = f(\mu^{-(1-2^{-j})}(t-s))$, where $f$ is the function in Lemma 19 when $r = \mu q_j$. Also, by Proposition 1, $\mu q_j \sim \mu \cdot \mu^{1-2^{-(j-1)}} = (\mu^{1-2^{-j}})^2$, so $r^{-1/2} \sim \mu^{-(1-2^{-j})}$ and $r^{1/2}\mu^{-(1-2^{-j})}(t-s) \to t-s$ as $N \to \infty$. Therefore, by Lemma 19,
$$\lim_{N\to\infty} g(\mu^{-(1-2^{-j})}s)\,\mu^{-(1-2^{-j})} = \frac{1 - e^{-2(t-s)}}{1 + e^{-2(t-s)}}.$$
Using also (54) and the Dominated Convergence Theorem,
$$\lim_{N\to\infty}\int_0^T N\mu^{m-j}w^{m-j-1}g(w)\,dw = A^{1+(m-j-1)2^{-j}}\int_0^t s^{m-j-1}\,\frac{1 - e^{-2(t-s)}}{1 + e^{-2(t-s)}}\,ds. \tag{55}$$
The result follows from (53) and (55) after interchanging the roles of $s$ and $t-s$.

5.2 Bounding the number of individuals of type m − j and higher

We begin with the following lemma, which bounds in all models the expected number of individuals in the models having type $m-j$ or higher.

Lemma 21. For $i = 1, 2, 3, 4, 5$, we have
$$\max_{0\le s\le T} E[Y_i(s) + Z_i(s)] \le CN\mu^{m-j}T^{m-j}. \tag{56}$$
Also, for all five models, the expected number of type $m-j+1$ mutations by time $T$ is at most $CN\mu^{m-j+1}T^{m-j+1}$.

Proof. Because each type $m-j$ individual experiences type $m-j+1$ mutations at rate $\mu$, the second statement of the lemma follows easily from the fact that $E[Y_i(s)] \le CN\mu^{m-j}T^{m-j}$, which is a consequence of (56).
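Returning briefly to Lemma 20: the limit in (52) is easy to evaluate numerically, since $(1 - e^{-2s})/(1 + e^{-2s}) = \tanh(s)$. The following is a minimal sketch, not part of the original argument; the function name `limit_r5`, the midpoint quadrature, and the step count are illustrative assumptions.

```python
import math

def limit_r5(t, A, m, j, steps=20000):
    """Right-hand side of (52), using (1 - e^{-2s})/(1 + e^{-2s}) = tanh(s).

    Hypothetical helper for illustration only; `steps` controls the
    midpoint-rule resolution and is an arbitrary choice.
    """
    k = m - j                                # k = m - j in the paper's notation
    expo = 1 + (k - 1) * 2.0 ** (-j)         # exponent of A in (52)
    h = t / steps
    # Composite midpoint rule for the integral of (t - s)^(k-1) * tanh(s) on [0, t].
    integral = h * sum(
        (t - (i + 0.5) * h) ** (k - 1) * math.tanh((i + 0.5) * h)
        for i in range(steps)
    )
    return math.exp(-(A ** expo) / math.factorial(k - 1) * integral)
```

When $m = 2$ and $j = 1$ the integral reduces to $\int_0^t \tanh(s)\,ds = \log\cosh t$, so the expression in (52) equals $(\cosh t)^{-A}$, which gives a convenient check on the quadrature.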
To prove (56), first note that because births and deaths occur at the same rate, in all five models $E[Y_i(s) + Z_i(s)]$ is the expected number of individuals of types $m-j$ and higher that appear up to time $s$ as a result of mutations, or immigration in the case of Model 5. For $i = 2, 3, 5$, these mutation or immigration events occur at times of a rate $N\mu^{m-j}s^{m-j-1}/(m-j-1)!$ Poisson process (unless they are suppressed in Model 2 or 3 because no type zero individuals remain), so (56) holds. In Model 1, the mutation rate depends on the number of type $m-j-1$ individuals, but (56) holds by Lemma 10. Model 4 is different because type 0 rather than type $m-j$ individuals are replaced at the times of type $m-j+1$ mutations. The above argument still gives $E[Y_4(s)] \le CN\mu^{m-j}T^{m-j}$ for $s \le T$ because type $m-j$ individuals give birth and die at the same rate. Thus, the expected number of type $m-j+1$ mutations by time $T$ is at most $CN\mu^{m-j+1}T^{m-j+1}$. It follows that $E[Z_4(s)] \le CN\mu^{m-j+1}T^{m-j+1} \ll N\mu^{m-j}T^{m-j}$ for $s \le T$, using the fact that $\mu T \to 0$ as $N \to \infty$ by (51). Therefore, (56) holds for Model 4 as well.

Lemma 21 easily implies the following bound on the maximum number of individuals of type $m-j$ or higher through time $T$. The lemma below with $f(N) = 1/N$ implies that with probability tending to one as $N \to \infty$, the number of individuals of type $m-j$ or higher does not reach $N$ before time $T$.

Lemma 22. Suppose $f$ is a function of $N$ such that $N\mu^{(m-j)2^{-j}}f(N) \to 0$ as $N \to \infty$. Then for $i = 1, 2, 3, 4, 5$, as $N \to \infty$ we have
$$\max_{0\le s\le T}\,(Y_i(s) + Z_i(s))\,f(N) \to_p 0. \tag{57}$$

Proof. Because individuals of type $m-j$ or higher give birth and die at the same rate, and they can appear but not disappear as a result of mutations, the process $(Y_i(s) + Z_i(s),\ 0 \le s \le T)$ is a nonnegative submartingale for $i = 1, 2, 3, 4, 5$. By Doob's Maximal Inequality, for all $\delta > 0$,
$$P\left(\max_{0\le s\le T}\,(Y_i(s) + Z_i(s)) > \frac{\delta}{f(N)}\right) \le \frac{E[Y_i(T) + Z_i(T)]\,f(N)}{\delta}. \tag{58}$$
Since $N\mu^{m-j}T^{m-j} = N\mu^{(m-j)2^{-j}}t^{m-j}$, equation (56) implies that if $N\mu^{(m-j)2^{-j}}f(N) \to 0$ as $N \to \infty$, then the right-hand side of (58) goes to zero as $N \to \infty$ for all $\delta > 0$, which proves (57).

5.3 Comparing Models 1 and 2

In this subsection, we establish the following result which controls the difference between Model 1 and Model 2. The advantage to working with Model 2 rather than Model 1 is that the randomness in the rate of the type $m-j$ mutations is eliminated.

Lemma 23. We have $\lim_{N\to\infty} |r_1(T) - r_2(T)| = 0$.

Proof. Lemma 22 with $f(N) = 1/N$ implies that with probability tending to one as $N \to \infty$, up to time $T$ there is always at least one type 0 individual in Model 2, so hereafter we will make this assumption. In this case, a type $m-j$ individual replaces a randomly chosen type 0 individual in Model 2 at times of a Poisson process $K$ whose rate at time $s$ is $N\mu^{m-j}s^{m-j-1}/(m-j-1)!$. We will first compare Model 2 to another model called Model 2′, which will be the same as Model 2 except that type $m-j$ individuals arrive at times of a Poisson process $K'$ whose rate at time $s$ is $\max\{0,\ N\mu^{m-j}s^{m-j-1}/(m-j-1)! - \epsilon N\mu^{m-j}T^{m-j-1}\}$, where $\epsilon > 0$ is fixed. Models 2 and 2′ can be coupled so that births and deaths occur at the same times in both models, and each point of $K'$ is also a point of $K$.
Consequently, a coupling can be achieved so that if an individual has type $k \ge m-j$ in Model 2′, then it also has type $k$ in Model 2. With such a coupling, the only individuals whose types are different in the two models are those descended from individuals that in Model 2 became type $m-j$ at a time that is in $K$ but not $K'$. The rate of points in $K$ but not $K'$ is bounded by $\epsilon N\mu^{m-j}T^{m-j-1}$. The probability that a given type $m-j$ individual has a type $m$ descendant is at most $C\mu^{1-2^{-j}}$ by Proposition 1. Therefore, the probability that there is a type $m$ individual in Model 2 but not Model 2′ before time $T$ is bounded by
$$\epsilon N\mu^{m-j}T^{m-j} \cdot C\mu^{1-2^{-j}} \le C\epsilon, \tag{59}$$
using (50). Therefore, letting $r_{2'}(T)$ denote the probability that there is no type $m$ individual in Model 2′ by time $T$,
$$|r_2(T) - r_{2'}(T)| \le C\epsilon. \tag{60}$$
We now compare Model 1 and Model 2′. These models can be coupled so that births and deaths in the two models happen at the same times and, on $G_N(\epsilon)$, there is a type $m-j$ mutation in Model 1 at all of the times in $K'$. This coupling can therefore achieve the property that on $G_N(\epsilon)$, any individual of type $k \ge m-j$ in Model 2′ also has type $k$ in Model 1. The only individuals in Model 1 of type $k \ge m-j$ that do not have the same type in Model 2′ are those descended from individuals that became type $m-j$ at a time that is not in $K'$. On $G_N(\epsilon)$, the rate of type $m-j$ mutations at times not in $K'$ is bounded by $2\epsilon N\mu^{m-j}T^{m-j-1}$. Therefore, by the same calculation made in (59), the probability that $G_N(\epsilon)$ occurs and that Model 1 but not Model 2′ has a type $m$ descendant by time $T$ is at most $C\epsilon$. This bound and Lemma 18 give
$$|r_1(T) - r_{2'}(T)| \le C\epsilon. \tag{61}$$
The result follows from (60) and (61) after letting $\epsilon \to 0$.
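The coupling of $K$ and $K'$ used in the proof of Lemma 23 can be realized by thinning: generate candidate points from a dominating homogeneous Poisson process, and accept each candidate into $K$ and into $K'$ using the same uniform random variable. The sketch below is an illustration under stated assumptions rather than the paper's construction; the function names and any parameter values passed in are hypothetical, and `k` stands for $m-j$.

```python
import math
import random

def rate_K(s, N, mu, k):
    """Intensity of K at time s: N mu^k s^(k-1) / (k-1)!, where k = m - j."""
    return N * mu**k * s**(k - 1) / math.factorial(k - 1)

def rate_K_prime(s, N, mu, k, T, eps):
    """Intensity of K': that of K lowered by eps*N*mu^k*T^(k-1), floored at 0."""
    return max(0.0, rate_K(s, N, mu, k) - eps * N * mu**k * T**(k - 1))

def coupled_points(N, mu, k, T, eps, rng):
    """Return (K, K') on [0, T] with every point of K' also a point of K."""
    # rate_K is nondecreasing in s for k >= 1, so its value at T dominates on [0, T].
    top = rate_K(T, N, mu, k)
    K, K_prime = [], []
    s = 0.0
    while True:
        s += rng.expovariate(top)        # next candidate from a rate-`top` Poisson process
        if s > T:
            break
        u = rng.random() * top           # one uniform shared by both acceptance tests
        if u < rate_K(s, N, mu, k):      # accept into K with probability rate_K(s)/top
            K.append(s)
            if u < rate_K_prime(s, N, mu, k, T, eps):
                K_prime.append(s)        # accepted into K' only if already in K
    return K, K_prime
```

Because the two acceptance tests share one uniform variable and the intensity of $K'$ never exceeds that of $K$, the points of $K$ that are missing from $K'$ form a Poisson process whose rate is at most $\epsilon N\mu^{m-j}T^{m-j-1}$, which is the bound used before (59).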
5.4 Comparing Models 2 and 3

In this subsection, we establish the following lemma.

Lemma 24. We have $\lim_{N\to\infty} |r_2(T) - r_3(T)| = 0$.

The advantage to working with Model 3 rather than Model 2 is that in Model 3, descendants of only one type $m-j+1$ mutation can be present in the population at a time. As a result, each type $m-j+1$ mutation independently has probability $q_j$ of producing a type $m$ descendant. With Model 2, there could be dependence between the outcomes of different type $m-j+1$ mutations whose descendants overlap in time.

The only difference between Model 2 and Model 3 is that some type $m-j+1$ mutations are suppressed in Model 3. Therefore, it is easy to couple Model 2 and Model 3 so that until there are no type 0 individuals remaining in Model 2, the type of the $i$th individual in Model 2 is always at least as large as the type of the $i$th individual in Model 3, with the only discrepancies involving individuals descended from a type $m-j+1$ mutation that was suppressed in Model 3. Because Lemma 22 with $f(N) = 1/N$ implies that the probability that all type zero individuals disappear by time $T$ goes to zero as $N \to \infty$, Lemma 24 follows from the following result.

Lemma 25. In Model 2, the probability that some type $m-j+1$ mutation that occurs while there is another individual of type $m-j+1$ or higher in the population has a type $m$ descendant tends to zero as $N \to \infty$.

Proof. By Lemma 21, the expected number of type $m-j+1$ mutations by time $T$ is at most $CN\mu^{m-j+1}T^{m-j+1}$. By (9), the expected amount of time, before time $T$, that there is an individual in the population of type $m-j+1$ or higher is at most $CN\mu^{m-j+1}T^{m-j+1}(\log N)$.
By Lemma 22 with $f(N) = 1/(N\mu^{m-j}T^{m-j}\log N)$, the probability that the number of type $m-j$ individuals stays below $N\mu^{m-j}T^{m-j}\log N$ until time $T$ tends to one as $N \to \infty$. On this event, the expected number of type $m-j+1$ mutations by time $T$ while there is another individual in the population of type $m-j+1$ or higher is at most
$$h_N = (CN\mu^{m-j+1}T^{m-j+1}\log N)(N\mu^{m-j}T^{m-j}\log N)\,\mu.$$
The probability that a given such mutation produces a type $m$ descendant is $q_j \le C\mu^{1-2^{-(j-1)}}$ by Proposition 1, so the probability that at least one such mutation produces a type $m$ descendant is at most
$$h_Nq_j \le C\left(\mu T(\log N)^2\right)\left[N\mu^{m-j}T^{m-j}\mu^{1-2^{-j}}\right]^2.$$
Because $\mu T(\log N)^2 = \mu^{2^{-j}}t\,(\log N)^2 \to 0$ as $N \to \infty$ and $N\mu^{m-j}T^{m-j}\mu^{1-2^{-j}}$ stays bounded as $N \to \infty$ by (50), the lemma follows.

5.5 Comparing Models 3 and 4

In both Model 3 and Model 4, each type $m-j+1$ mutation independently has probability $q_j$ of producing a type $m$ descendant. The advantage to Model 4 is that whether or not a given type $m-j+1$ mutation produces a type $m$ descendant is decoupled from the evolution of the number of type $m-j$ individuals.

We first define a more precise coupling between Model 3 and Model 4. We will assume throughout the construction that there are fewer than $N/2$ individuals in each model with type $m-j$ or higher. Eventually this assumption will fail, but by Lemma 22, the assumption is valid through time $T$ with probability tending to one as $N \to \infty$, which is sufficient for our purposes. For both models, the $N$ individuals will be assigned labels $1, \dots, N$ in addition to their types. Let $L$ be a Poisson process of rate $N$ on $[0,\infty)$, and let $I_1, I_2, \dots$ and $J_1, J_2, \dots$ be independent random variables, uniformly distributed on $\{1, \dots, N\}$.
Let $K$ be an inhomogeneous Poisson process on $[0,\infty)$ whose rate at time $s$ is $N\mu^{m-j}s^{m-j-1}/(m-j-1)!$, and let $L_1, \dots, L_N$ be independent rate $\mu$ Poisson processes on $[0,\infty)$. In both models, if $s$ is a point of $K$, then at time $s$ we choose an individual at random from those that have type 0 in both models to become type $m-j$. Birth and death events occur at the times of $L$. At the time of the $m$th point of $L$, in both models we change the type of the individual labeled $I_m$ to the type of the individual labeled $J_m$. In Model 4, if $I_m$ has type $m-j$ and $J_m$ has type $k \ge m-j+1$, then we choose a type 0 individual to become type $m-j$ to keep the number of type $m-j$ individuals constant. Likewise, in Model 4, if $I_m$ has type $k \ge m-j+1$ and $J_m$ has type $m-j$, then we choose a type $m-j$ individual to become type 0. In both models, the individual labeled $i$ experiences mutations at times of $L_i$, with the exceptions that type 0 individuals never get mutations and mutations of type $m-j$ individuals are suppressed when there is already an individual of type $m-j+1$ or higher in the population. Also, in Model 4, if $s$ is a point of $L_i$ and the individual labeled $i$ has type $m-j$ at time $s-$, then in addition to changing the type of the individual labeled $i$, we choose a type 0 individual to become type $m-j$ so that the number of type $m-j$ individuals stays constant.

Note that by relabeling the individuals, if necessary, after each transition, we can ensure that for all $s \ge 0$, at time $s$ there are $\min\{Y_3(s), Y_4(s)\}$ integers $i$ such that the individual labeled $i$ has type $m-j$ in both models. The rearranging can be done so that no individual has type $m-j$ in one of the models and type $m-j+1$ or higher in the other.
Also, with this coupling, if a type $m-j+1$ mutation occurs at the same time in both models, descendants of this mutation will have the same type in both models. In particular, if the mutation has a type $m$ descendant in one model, it will have a type $m$ descendant in the other.

Let $W(s) = Y_3(s) - Y_4(s)$, which is the difference between the number of type $m-j$ individuals in Model 3 and the number of type $m-j$ individuals in Model 4. There are three types of events that can cause the process $(W(s),\ 0 \le s \le T)$ to jump:

• When a type $m-j$ individual experiences a mutation in Model 3 and becomes type $m-j+1$, there is no change to the number of type $m-j$ individuals in Model 4. At time $s$, such changes occur at rate either 0 or $\mu Y_3(s)$, depending on whether or not there is already an individual in Model 3 of type $m-j+1$ or higher.

• When one of the individuals that is type $m-j$ in one process but not the other experiences a birth or death, the $W$ process can increase or decrease by one. If $Y_3(s) > Y_4(s)$, then at time $s$, both increases and decreases are happening at rate $|W(s)|(N - |W(s)|)/N$ because the $W$ process changes unless the other individual involved in the exchange also has type $m-j$ in Model 3 but not Model 4. If $Y_4(s) > Y_3(s)$, then increases and decreases are each happening at rate $|W(s)|(N - |W(s)| - Z_4(s))/N$ because in Model 4, transitions exchanging a type $m-j$ individual with an individual of type $m-j+1$ or higher are not permitted.

• The number of type $m-j$ individuals changes in Model 3 but not Model 4 when there is an exchange involving one of the individuals that has type $m-j$ in both models and one of the individuals that has type $m-j+1$ or higher in Model 4. Changes in each direction happen at rate $Z_4(s)\min\{Y_3(s), Y_4(s)\}/N$.

Therefore, the process $(W(s),\ 0 \le s \le T)$ at time $s$ is increasing by one at rate $\lambda(s)$ and decreasing by one at rate $\lambda(s) + \gamma(s)$, where
$$0 \le \gamma(s) \le \mu Y_3(s) \tag{62}$$
and
$$\lambda(s) = \frac{|W(s)|\left(N - |W(s)| - Z_4(s)1_{\{Y_4(s) > Y_3(s)\}}\right)}{N} + \frac{Z_4(s)\min\{Y_3(s), Y_4(s)\}}{N}. \tag{63}$$
The next lemma bounds the process $(W(s),\ 0 \le s \le T)$.

Lemma 26. For $0 \le s \le t$, let
$$W_N(s) = \frac{1}{N\mu^{(m-j)2^{-j}}}\,W(s\mu^{-(1-2^{-j})}).$$
Then as $N \to \infty$,
$$\max_{0\le s\le t} |W_N(s)| \to_p 0. \tag{64}$$

Proof. The proof is similar to the proof of Lemma 4.6 in [10]. We use Theorem 4.1 in chapter 7 of [11] to show that the processes $(W_N(s),\ 0 \le s \le t)$ converge as $N \to \infty$ to a diffusion $(X(s),\ 0 \le s \le t)$ which satisfies the stochastic differential equation
$$dX(s) = b(X(s))\,ds + \sqrt{a(X(s))}\,dB(s) \tag{65}$$
with $b(x) = 0$ and $a(x) = 2A^{-1-(m-j-1)2^{-j}}|x|$ for all $x$, where $A$ is the constant from (49). The Yamada-Watanabe Theorem (see, for example, (3.3) on p. 193 of [7]) gives pathwise uniqueness for this SDE, which implies that the associated martingale problem is well-posed. For all $N$ and all $s \in [0,t]$, define
$$B_N(s) = -\frac{1}{N\mu^{(m-j)2^{-j}}}\int_0^s \frac{\gamma(r\mu^{-(1-2^{-j})})}{\mu^{1-2^{-j}}}\,dr = -\frac{1}{N\mu^{1+(m-j-1)2^{-j}}}\int_0^s \gamma(r\mu^{-(1-2^{-j})})\,dr$$
and
$$A_N(s) = \frac{1}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}}\int_0^s \left(2\lambda(r\mu^{-(1-2^{-j})}) + \gamma(r\mu^{-(1-2^{-j})})\right) dr.$$
At time $s$, the process $(W_N(s),\ 0 \le s \le t)$ experiences positive jumps by $1/(N\mu^{(m-j)2^{-j}})$ at rate $\lambda(s\mu^{-(1-2^{-j})})\mu^{-(1-2^{-j})}$ and negative jumps by the same amount at the slightly larger rate $(\lambda(s\mu^{-(1-2^{-j})}) + \gamma(s\mu^{-(1-2^{-j})}))\mu^{-(1-2^{-j})}$.
Therefore, letting $M_N(s) = W_N(s) - B_N(s)$, the processes $(M_N(s),\ 0 \le s \le t)$ and $(M_N^2(s) - A_N(s),\ 0 \le s \le t)$ are martingales. We claim that as $N \to \infty$,
$$\sup_{0\le s\le t} |B_N(s)| \to_p 0 \tag{66}$$
and
$$\sup_{0\le s\le t}\left|A_N(s) - 2A^{-1-(m-j-1)2^{-j}}\int_0^s |W_N(r)|\,dr\right| \to_p 0. \tag{67}$$
The results (66) and (67) about the infinitesimal mean and variance respectively enable us to deduce from Theorem 4.1 in chapter 7 of [11] that as $N \to \infty$, the processes $(W_N(s),\ 0 \le s \le t)$ converge in the Skorohod topology to a process $(X(s),\ 0 \le s \le t)$ satisfying (65). Because $W_N(0) = 0$ for all $N$, we have $X(0) = 0$, and therefore $X(s) = 0$ for $0 \le s \le t$. The result (64) follows.

To complete the proof, we need to establish (66) and (67). Equation (62) and Lemma 22 with $f(N) = t/(N\mu^{(m-j-1)2^{-j}})$ imply that as $N \to \infty$,
$$\sup_{0\le s\le t} |B_N(s)| \le \frac{t}{N\mu^{(m-j-1)2^{-j}}}\max_{0\le s\le T} Y_3(s) \to_p 0,$$
which proves (66).

To prove (67), note that
$$A_N(s) - 2A^{-1-(m-j-1)2^{-j}}\int_0^s |W_N(r)|\,dr = \int_0^s \left(\frac{2\lambda(r\mu^{-(1-2^{-j})}) + \gamma(r\mu^{-(1-2^{-j})})}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}} - \frac{2A^{-1-(m-j-1)2^{-j}}\,|W(r\mu^{-(1-2^{-j})})|}{N\mu^{(m-j)2^{-j}}}\right) dr.$$
It therefore follows from (62) and (63) that
$$\begin{aligned}
\sup_{0\le s\le t}&\left|A_N(s) - 2A^{-1-(m-j-1)2^{-j}}\int_0^s |W_N(r)|\,dr\right| \\
&\le \sup_{0\le s\le t}\int_0^s \left|\frac{2}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}} - \frac{2A^{-1-(m-j-1)2^{-j}}}{N\mu^{(m-j)2^{-j}}}\right| |W(r\mu^{-(1-2^{-j})})|\,dr \\
&\quad + \sup_{0\le s\le t}\int_0^s \frac{2W(r\mu^{-(1-2^{-j})})^2 + 2|W(r\mu^{-(1-2^{-j})})|\,Z_4(r\mu^{-(1-2^{-j})})}{N(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}}\,dr \\
&\quad + \sup_{0\le s\le t}\int_0^s \frac{2Z_4(r\mu^{-(1-2^{-j})})\min\{Y_3(r\mu^{-(1-2^{-j})}),\ Y_4(r\mu^{-(1-2^{-j})})\}}{N(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}}\,dr \\
&\quad + \sup_{0\le s\le t}\int_0^s \frac{\mu Y_3(r\mu^{-(1-2^{-j})})}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}}\,dr.
\end{aligned} \tag{68}$$
We need to show that the four terms on the right-hand side of (68) each converge in probability to zero. Because $t$ is fixed, in each case it suffices to show that the supremum of the integrand over $r \in [0,t]$ converges in probability to zero as $N \to \infty$. We have
$$\sup_{0\le s\le T}\left|\frac{2}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}} - \frac{2A^{-1-(m-j-1)2^{-j}}}{N\mu^{(m-j)2^{-j}}}\right| |W(s)| = \sup_{0\le s\le T}\left|\frac{2}{N\mu^{1+(m-j-1)2^{-j}}} - \frac{2}{A^{1+(m-j-1)2^{-j}}}\right| \cdot \frac{|W(s)|}{N\mu^{(m-j)2^{-j}}} \to_p 0$$
by Lemma 22 because $|W(s)| \le \max\{Y_3(s), Y_4(s)\}$ and the first factor goes to zero as $N \to \infty$ by (49). Thus, the first term in (68) converges in probability to zero. Also, $N\mu^{1-2^{-j}} \to \infty$ as $N \to \infty$, so Lemma 22 gives
$$\sup_{0\le s\le T}\frac{W(s)^2 + |W(s)|Z_4(s)}{N(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}} = \sup_{0\le s\le T}\left(\frac{|W(s)|}{N\mu^{(m-j)2^{-j}}(N\mu^{1-2^{-j}})^{1/2}}\right)\left(\frac{|W(s)| + Z_4(s)}{N\mu^{(m-j)2^{-j}}(N\mu^{1-2^{-j}})^{1/2}}\right) \to_p 0,$$
which is enough to control the second term in (68). The same argument works for the third term, using $Z_4(s)Y_4(s)$ in the numerator of the left-hand side in place of $W(s)^2 + |W(s)|Z_4(s)$. Finally,
$$\sup_{0\le s\le T}\frac{\mu Y_3(s)}{(N\mu^{(m-j)2^{-j}})^2\,\mu^{1-2^{-j}}} = \sup_{0\le s\le T}\frac{\mu Y_3(s)}{N\mu^{(m-j)2^{-j}}} \cdot \frac{1}{N\mu^{1+(m-j-1)2^{-j}}} \to_p 0$$
by Lemma 22 because $\mu \to 0$ as $N \to \infty$ and $N\mu^{1+(m-j-1)2^{-j}}$ is bounded away from zero as $N \to \infty$ by (49). Therefore, the fourth term on the right-hand side of (68) converges in probability to zero, which completes the proof of (67).

Lemma 27. In both Model 3 and Model 4, the probability that there is a type $m-j+1$ mutation before time $T$ that has a type $m$ descendant born after time $T$ converges to zero as $N \to \infty$.

Proof. The same argument works for both models. Let $\epsilon > 0$.
By Lemma 21, the expected number of type $m-j+1$ mutations by time $T$ is at most $N\mu^{m-j+1} T^{m-j+1}$. Since $N\mu^{1-2^{-j}} \to \infty$ as $N \to \infty$, we have $\epsilon T \ll N$. Therefore, by (10), the probability that a given mutation stays in the population for a time at least $\epsilon T$ before dying out or fixating is at most $C/(\epsilon T)$. It follows that the probability that some type $m-j+1$ mutation before time $T$ lasts for a time at least $\epsilon T$ is at most
\[ C\epsilon^{-1} N\mu^{m-j+1} T^{m-j} \le C\epsilon^{-1} N\mu^{1+(m-j)2^{-j}} \to 0 \]
as $N \to \infty$ by (49). Thus, with probability tending to one as $N \to \infty$, all type $m-j+1$ mutations that have a descendant alive at time $T$ originated after time $(1-\epsilon)T$. Arguing as above, the expected number of type $m-j+1$ mutations between times $(1-\epsilon)T$ and $T$ is at most $\epsilon N\mu^{m-j+1} T^{m-j+1}$, and the probability that a given such mutation has a type $m$ descendant is $q_j \le C\mu^{1-2^{-(j-1)}}$ by Proposition 1. Thus, the probability that some type $m-j+1$ mutation between times $(1-\epsilon)T$ and $T$ has a type $m$ descendant is at most
\[ C\epsilon N\mu^{m-j+1} T^{m-j+1} \mu^{1-2^{-(j-1)}} \le C\epsilon N\mu^{1+(m-j-1)2^{-j}} \le C\epsilon \tag{69} \]
by (49). The lemma follows by letting $\epsilon \to 0$.

Lemma 28. We have $\lim_{N \to \infty} |r_3(T) - r_4(T)| = 0$.

Proof. For $i = 3, 4$, let $D_i$ be the event that no type $m-j+1$ mutation that occurs before time $T$ has a type $m$ descendant. By Lemma 27, it suffices to show that
\[ \lim_{N \to \infty} |P(D_3) - P(D_4)| = 0. \tag{70} \]
Recall that Model 3 and Model 4 are coupled so that when a type $m-j+1$ mutation occurs at the same time in both models, it will have a type $m$ descendant in one model if and only if it has a type $m$ descendant in the other. Therefore, $|P(D_3) - P(D_4)|$ is at most the probability that some type $m-j+1$ mutation that occurs in one process but not the other has a type $m$ descendant.
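As an aside on the proof of Lemma 27, the exponent bookkeeping in the two displayed bounds there, including (69), can be checked mechanically. The sketch below is only an illustration: it assumes the time scaling $T = t\mu^{-(1-2^{-j})}$ suggested by the rescaled argument $r\mu^{-(1-2^{-j})}$ used in this section, treats powers of $t$ as constants absorbed into $C$, and uses function names that are not part of the paper.

```python
from fractions import Fraction

def T_exponent(j):
    # Assumed scaling T = t * mu**(-(1 - 2**-j)); this is the power of mu in T.
    return -(1 - Fraction(1, 2**j))

def lasting_bound_exponent(m, j):
    # Power of mu in N * mu**(m-j+1) * T**(m-j).
    return (m - j + 1) + (m - j) * T_exponent(j)

def descendant_bound_exponent(m, j):
    # Power of mu in N * mu**(m-j+1) * T**(m-j+1) * mu**(1 - 2**-(j-1)),
    # where the last factor is the bound on q_j from Proposition 1.
    q_exponent = 1 - Fraction(1, 2**(j - 1))
    return (m - j + 1) + (m - j + 1) * T_exponent(j) + q_exponent

# Both exponents agree with the right-hand sides stated in the text.
for m in range(2, 8):
    for j in range(1, m):
        assert lasting_bound_exponent(m, j) == 1 + (m - j) * Fraction(1, 2**j)
        assert descendant_bound_exponent(m, j) == 1 + (m - j - 1) * Fraction(1, 2**j)
```

Exact rational arithmetic is used so that the comparison of exponents is not affected by floating-point rounding.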
There are two sources of type $m-j+1$ mutations that occur in one process but not the other. Some type $m-j+1$ mutations are suppressed in one model but not the other because there is already an individual of type $m-j+1$ or higher in the population. That the probability of some such mutation having a type $m$ descendant goes to zero follows from the argument used to prove Lemma 25, which is also valid for Model 3 and Model 4. The other type $m-j+1$ mutations that appear in one process but not the other occur when one of the $|W(s)|$ individuals that has type $m-j$ in one model but not the other gets a mutation. Let $\epsilon > 0$. By Lemma 26, for sufficiently large $N$,
\[ P\left( \max_{0 \le s \le T} |W(s)| \le \epsilon N\mu^{(m-j)2^{-j}} \right) > 1 - \epsilon. \]
Therefore, on an event of probability at least $1 - \epsilon$, the expected number of type $m-j+1$ mutations that occur in one model but not the other and have a type $m$ descendant is at most
\[ \epsilon N\mu^{(m-j)2^{-j}} \cdot \mu T q_j \le C\epsilon N\mu^{1+(m-j-1)2^{-j}} \le C\epsilon \]
by Proposition 1 and (49), because each such individual mutates at rate $\mu$ for a time at most $T$. The result follows by letting $\epsilon \to 0$.

5.6 Comparing Models 4 and 5

In both Model 4 and Model 5, type $m-j$ individuals appear at times of a Poisson process whose rate at time $s$ is $N\mu^{m-j} s^{m-j-1}/(m-j-1)!$. In both models, type $m-j$ individuals experience mutations that will lead to type $m$ descendants at rate $\mu q_j$. The two models differ in the following three ways:

• In Model 4, some type $m-j+1$ mutations are suppressed because there is another individual of type $m-j+1$ or higher already in the population.

• In Model 4, some time elapses between the time of the type $m-j+1$ mutation that will produce a type $m$ descendant, and the time that the type $m$ descendant appears.
• In Model 4, when there are $k$ individuals of type $m-j$ and $\ell$ individuals of type $m-j+1$ or higher, the rate at which the number of type $m-j$ individuals increases (or decreases) by one is $k(N-\ell)/N$ because the number of type $m-j$ individuals changes only when a type $m-j$ individual is exchanged with a type 0 individual. This rate is simply $k$ in Model 5. An additional complication is that the factor $(N-\ell)/N$ is not independent of whether previous type $m-j+1$ mutations are successful in producing type $m$ descendants.

We prove Lemma 29 below by making three modifications to Model 4 to eliminate these differences, and then comparing the modified model to Model 5. Lemmas 20, 23, 24, 28, and 29 immediately imply part 3 of Proposition 4.

Lemma 29. We have $\lim_{N \to \infty} |r_4(T) - r_5(T)| = 0$.

Proof. We obtain Model 4′ from Model 4 by making the following modifications. First, whenever a type $m-j+1$ mutation is suppressed in Model 4 because there is another individual in the population of type $m-j+1$ or higher, in Model 4′ we add a type $m$ individual with probability $q_j$. Second, whenever a type $m-j+1$ mutation occurs in Model 4 that will eventually produce a type $m$ descendant, we change the type of the mutated individual in Model 4′ to type $m$ immediately. Third, for every type $m-j+1$ mutation in Model 4′, including the events that produce a type $m$ individual that were added in the first modification, if there are $\ell$ individuals of type $m-j$ or higher in the population, then we suppress the mutation with probability $\ell/N$. This means that at all times, every type $m-j$ individual in Model 4′ experiences a mutation that will produce a type $m$ descendant at rate $\mu q_j (N-\ell)/N$, while new type $m-j$ individuals appear and disappear at rate $k(N-\ell)/N$.
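The rates just described can be tabulated side by side with those of Model 5. The following is a minimal sketch, not part of the original argument: the function names and the numerical values of $k$, $\ell$, $N$, $\mu$, and $q_j$ are arbitrary illustrative assumptions. It only records that each Model 4′ rate is the corresponding Model 5 rate multiplied by $(N-\ell)/N$, a factor that lies between $1-\epsilon$ and 1 whenever $\ell < \epsilon N$.

```python
from fractions import Fraction

def model5_rates(k, mu, q_j):
    # Rates at which type m-j individuals appear, disappear, and give rise
    # to a type m individual in Model 5 when there are k type m-j individuals.
    return (k, k, k * mu * q_j)

def model4prime_rates(k, ell, N, mu, q_j):
    # The same three rates in Model 4', each multiplied by (N - ell) / N.
    slow = Fraction(N - ell, N)
    return tuple(slow * rate for rate in model5_rates(k, mu, q_j))

# Hypothetical values chosen only for illustration.
k, ell, N = 7, 30, 1000
mu, q_j = Fraction(1, 1000), Fraction(1, 50)
eps = Fraction(1, 20)

r5 = model5_rates(k, mu, q_j)
r4p = model4prime_rates(k, ell, N, mu, q_j)
ratios = [a / b for a, b in zip(r4p, r5)]

# Every ratio equals (N - ell) / N, and since ell < eps * N it exceeds 1 - eps.
assert all(r == Fraction(N - ell, N) for r in ratios)
assert ell < eps * N and all(1 - eps < r <= 1 for r in ratios)
```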
Note that the number of type $m-j$ individuals is always the same in Model 4′ as in Model 4. Let $r_{4'}(T)$ be the probability that there is a type $m$ individual in Model 4′ by time $T$. Lemma 25, whose proof is also valid for Model 4′, implies that with probability tending to one as $N \to \infty$, the first modification above does not cause a type $m$ individual to be added to Model 4′ before time $T$. Lemma 27 implies this same result for the second modification. As for the third modification, let $\epsilon > 0$, and let $D_N$ be the event that the number of individuals of type $m-j$ or higher in Model 4 stays below $\epsilon N$ through time $T$. By Lemma 22, we have $\lim_{N \to \infty} P(D_N) = 1$. By Lemma 21, the expected number of type $m-j+1$ mutations by time $T$ is at most $CN\mu^{m-j+1} T^{m-j+1}$. On $D_N$, we always have $\ell/N < \epsilon$, so the probability that $D_N$ occurs and a type $m-j+1$ mutation that produces a type $m$ descendant in Model 4 gets suppressed in Model 4′ is at most $CN\mu^{m-j+1} T^{m-j+1} \cdot q_j \epsilon \le C\epsilon$, using (69) and Proposition 1. Thus,
\[ \limsup_{N \to \infty} |r_4(T) - r_{4'}(T)| < \epsilon. \tag{71} \]
It remains to compare Model 4′ and Model 5. In Model 5, when there are $k$ type $m-j$ individuals, the rates that type $m-j$ individuals appear, disappear, and give rise to a type $m$ individual are $k$, $k$, and $k\mu q_j$ respectively, as compared with $k(N-\ell)/N$, $k(N-\ell)/N$, and $k\mu q_j (N-\ell)/N$ respectively in Model 4′. Consequently, Model 4′ is equivalent to Model 5 slowed down by a factor of $(N-\ell)/N$, which on $D_N$ stays between $1-\epsilon$ and 1. We can obtain a lower bound for $r_{4'}(T)$ by considering Model 5 run all the way to time $T$, so $r_{4'}(T) \ge r_5(T)$. An upper bound for $r_{4'}(T)$ on $D_N$ is obtained by considering Model 5 run only to time $T(1-\epsilon)$, so $r_{4'}(T) \le r_5((1-\epsilon)T) + P(D_N^c)$.
Now $\lim_{N \to \infty} r_5((1-\epsilon)T)$ is given by the right-hand side of (52) with $(1-\epsilon)t$ in place of $t$. Therefore, by letting $N \to \infty$ and then $\epsilon \to 0$, we get
\[ \lim_{N \to \infty} |r_{4'}(T) - r_5(T)| = 0, \]
which, combined with (71), proves the lemma.

Acknowledgments

The author thanks Rick Durrett for many helpful discussions regarding this work. He also thanks Rinaldo Schinazi for a discussion related to section 2.5, and a referee for comments about the presentation of the paper.

References

[1] P. Armitage (1985). Multistage models of carcinogenesis. Environmental Health Perspectives 63, 195-201.
[2] P. Armitage and R. Doll (1954). The age distribution of cancer and a multi-stage theory of carcinogenesis. Brit. J. Cancer 8, 1-12.
[3] P. Armitage and R. Doll (1957). A two-stage theory of carcinogenesis in relation to the age distribution of human cancer. Brit. J. Cancer 11, 161-169.
[4] R. Arratia, L. Goldstein, and L. Gordon (1989). Two moments suffice for Poisson approximations: the Chen-Stein method. Ann. Probab. 17, 9-25.
[5] N. Beerenwinkel, T. Antal, D. Dingli, A. Traulsen, K. W. Kinzler, V. E. Velculescu, B. Vogelstein, and M. A. Nowak (2007). Genetic progression and the waiting time to cancer. PLoS Comput. Biol. 3, no. 11, 2239-2246.
[6] P. Calabrese, J. P. Mecklin, H. J. Järvinen, L. A. Aaltonen, S. Tavaré, and D. Shibata (2005). Numbers of mutations to different types of colorectal cancer. BMC Cancer 5: 126.
[7] R. Durrett (1996). Stochastic Calculus: A Practical Introduction. CRC Press, Boca Raton.
[8] R. Durrett and D. Schmidt (2007). Waiting for regulatory sequences to appear. Ann. Appl. Probab. 17, 1-32.
[9] R. Durrett and D. Schmidt (2007). Waiting for two mutations: with applications to regulatory sequence evolution and the limits of Darwinian selection. Preprint, available at http://www.math.cornell.edu/~durrett/recent.html.
[10] R. Durrett, D. Schmidt, and J. Schweinsberg (2007). A waiting time problem arising from the study of multi-stage carcinogenesis. Preprint, available at arXiv:0707:2057.
[11] S. N. Ethier and T. G. Kurtz (1986). Markov Processes: Characterization and Convergence. John Wiley and Sons, New York.
[12] J. C. Fisher and J. H. Holloman (1951). A hypothesis for the origin of cancer foci. Cancer 4, 916-918.
[13] D. A. Freedman and W. C. Navidi (1989). Multistage models for carcinogenesis. Environmental Health Perspectives 81, 169-188.
[14] T. E. Harris (1963). The Theory of Branching Processes. Springer-Verlag, Berlin.
[15] H. W. Hethcote and A. G. Knudson (1978). Model for the incidence of embryonal cancers: application to retinoblastoma. Proc. Natl. Acad. Sci. USA 75, 2453-2457.
[16] Y. Iwasa, F. Michor, N. L. Komarova, and M. A. Nowak (2005). Population genetics of tumor suppressor genes. J. Theor. Biol. 233, 15-23.
[17] Y. Iwasa, F. Michor, and M. A. Nowak (2004). Stochastic tunnels in evolutionary dynamics. Genetics 166, 1571-1579.
[18] N. L. Komarova, A. Sengupta, and M. A. Nowak (2003). Mutation-selection networks of cancer initiation: tumor suppressor genes and chromosomal instability. J. Theor. Biol. 223, 433-450.
[19] A. G. Knudson (1971). Mutation and cancer: statistical study of retinoblastoma. Proc. Natl. Acad. Sci. USA 68, 820-823.
[20] A. G. Knudson (2001). Two genetic hits (more or less) to cancer. Nat. Rev. Cancer 1, 157-162.
[21] A. N. Kolmogorov (1938). On the solution of a problem in biology. Izv. NII Mat. Mekh. Tomsk. Univ. 2, 7-12.
[22] E. G. Luebeck and S. H. Moolgavkar (2002). Multistage carcinogenesis and the incidence of colorectal cancer. Proc. Natl. Acad. Sci. USA 99, 15095-15100.
[23] S. H. Moolgavkar, A. Dewanji, and D. J. Venzon (1988). A stochastic two-stage model for cancer risk assessment. I. The hazard function and the probability of tumor. Risk Analysis 8, 383-392.
[24] S. H. Moolgavkar and A. G. Knudson (1981). Mutation and cancer: a model for human carcinogenesis. J. Natl. Cancer Inst. 66, 1037-1052.
[25] S. H. Moolgavkar and G. Luebeck (1990). Two-event model for carcinogenesis: biological, mathematical, and statistical considerations. Risk Analysis 10, 323-341.
[26] S. H. Moolgavkar and E. G. Luebeck (1992). Multistage carcinogenesis: population-based model for colon cancer. J. Natl. Cancer Inst. 18, 610-618.
[27] P. A. P. Moran (1958). Random processes in genetics. Proc. Cambridge Philos. Soc. 54, 60-71.
[28] H. J. Muller (1951). Radiation damage to the genetic material. In Science in Progress, Seventh Series (G. A. Baitsell, ed.), Yale University Press, pp. 93-165.
[29] C. O. Nordling (1953). A new theory on cancer-inducing mechanism. Brit. J. Cancer 7, 68-72.
[30] M. A. Nowak (2006). Evolutionary Dynamics: Exploring the Equations of Life. Harvard University Press, Cambridge.
[31] T. Okamoto (1990). Multi-stop carcinogenesis model for adult T-cell leukemia. Rinsho Ketsueki 31, 569-571.
[32] T. Sjöblom et. al. (2006). The consensus coding sequences of human breast and colorectal cancers. Science 314, 268-274.
[33] D. Wodarz and N. L. Komarova (2005). Computational Biology of Cancer: Lecture Notes and Mathematical Modeling. World Scientific, New Jersey.
