Computing hitting times via fluid approximation: application to the coupon collector problem


Authors: Nicolas Gast

Abstract. In this paper, we show how to use stochastic approximation to compute the hitting time of a stochastic process, based on the study of the time for a fluid approximation of this process to be at distance 1/N of its fixed point. This approach is developed to study a generalized version of the coupon collector problem. The system is composed of N independent identical Markov chains. At each time step, one Markov chain is picked at random and performs one transition. We show that the time at which all chains have hit the same state is bounded by c_1 N log N + c_2 N log log N + O(N), where c_1 and c_2 are two constants depending on the eigenvalues of the Markov chain.

1. Introduction

The coupon collector is a classical problem in probability theory. There are N types of coupons. Coupons are collected at random with replacement. The goal is to compute the number of coupons to be collected to have at least one coupon of each kind. This simple problem has a simple answer and a simple proof: if you already have k different types of coupons, each new draw yields a new type with probability (N - k)/N, so it takes on average N/(N - k) draws to get one. Summing over k, one has to buy on average N/N + N/(N - 1) + ... + N/1 = N H_N ≈ N log N coupons to complete a collection.

Because of its simplicity, this problem has many applications, especially in computer science, where it often serves as a basic tool for computing the completion time of randomized algorithms [11, 8]. Many variants of it have been studied over the years. For example, the time needed to complete T collections of the same N coupons is shown to be N(log N + (T - 1) log log N + O(1)) in [13, 12]. The time to complete the first collection is N log N; however, the time to complete each subsequent collection is only N log log N.
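The N H_N formula above is easy to check numerically. The following short Python sketch (not part of the paper; the chosen N and sample size are arbitrary) compares the exact expectation with a Monte Carlo estimate:

```python
import random
import math

def expected_coupons(n):
    # Exact mean number of draws: sum over k of n/(n-k) = n * H_n ~ n log n.
    return sum(n / (n - k) for k in range(n))

def simulate_coupons(n, rng):
    # Draw coupons uniformly with replacement until all n types are seen.
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n = 50
exact = expected_coupons(n)
rng = random.Random(0)
avg = sum(simulate_coupons(n, rng) for _ in range(2000)) / 2000
print(f"exact E[draws] = {exact:.1f}, Monte Carlo ~ {avg:.1f}, N log N = {n * math.log(n):.1f}")
```

The gap between N H_N and N log N is the Euler-Mascheroni constant times N, which is why all the bounds below carry an O(N) term.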
However, even a slight modification, such as obtaining T different collections instead of one, leads to much more complicated proofs. The approach taken in this paper aims at being more general, but also at giving new insight into the relation between hitting times and stochastic approximation.

Contributions. We develop an approach based on stochastic approximation to compute the hitting time of a stochastic process that has an absorbing state. The system is composed of N identical Markov chains that have an absorbing state 0. At each time step, one chain is picked at random and performs one transition. Our goal is to compute the number of steps until all Markov chains are in their absorbing state. The coupon collector problem is a particular case of this problem, obtained by considering N deterministic Markov chains whose state is 1 (no coupon of that type has been collected) or 0 (at least one coupon of type i has been collected).

Contact: nicolas.gast@epfl.ch

Using a classical stochastic approximation approach, as in [2], one can show that if N is large, the proportion of Markov chains that are in a given state can be approximated by a linear ordinary differential equation (ODE), ṁ = mQ. This ODE has a unique fixed point to which all trajectories converge exponentially fast, corresponding to a state where all chains are in state 0. However, this approximation is not accurate enough to bound the hitting time of the stochastic process: the time for the ODE to reach its equilibrium is infinite, whereas the expected time for the stochastic system to hit this equilibrium is finite. In this paper, we establish a relation between the expected time E[T_N] for all the chains to be completed and the time t_N for the ODE to be at distance 1/N of its equilibrium point. The main results of this paper are Theorem 1 and Theorem 2.
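The general model just described, N copies of a finite absorbing chain with one uniformly chosen copy moving per time step, can be simulated directly. A minimal Python sketch (the transition matrix and sizes below are illustrative choices, not taken from the paper):

```python
import random

def simulate_TN(P, x0, N, rng):
    """Simulate N copies of an absorbing chain with transition matrix P
    (state 0 absorbing); one uniformly chosen copy moves per time step.
    Returns T_N, the first step at which every copy is in state 0."""
    state = [x0] * N
    active = N if x0 != 0 else 0
    steps = 0
    while active:
        steps += 1
        i = rng.randrange(N)
        if state[i] == 0:
            continue  # an absorbed chain stays put
        u, acc = rng.random(), 0.0
        for j, p in enumerate(P[state[i]]):  # sample next state from row of P
            acc += p
            if u < acc:
                if j == 0:
                    active -= 1
                state[i] = j
                break
    return steps

# Coupon collector as a special case: state 1 -> 0 with probability 1.
P = [[1.0, 0.0],
     [1.0, 0.0]]
rng = random.Random(0)
print(simulate_TN(P, 1, 100, rng))  # one sample; E[T_N] = N * H_N, about 519 for N = 100
```

Replacing P by any absorbing transition matrix gives the generalized model studied in the rest of the paper.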
We first show that T_N is bounded by N · t_N. Using this result, we derive the existence of constants c_1, c_2, depending on the spectral properties of the original Markov chain, such that

E[T_N] ≤ N t_N + O(N) ≤ c_1 N log N + c_2 N log log N + O(N).

Applied to the time to complete T collections, this allows one to derive directly the results of [13]. We also study two particular cases for which we have simple closed-form bounds for T_N. In the most general case, if we only know that the expected hitting time of one Markov chain is bounded by T starting from its initial state, then E[T_N] is only bounded by N^2 T. However, if the expected hitting time of a Markov chain is bounded by T independently of its initial state, then we show that E[T_N] ≤ N T log N + O(N). We provide examples showing that these bounds are tight up to a linear term. Finally, we also show that this method can be applied to study the completion time of distributed algorithms.

We believe that the interest of this method is twofold. On the one hand, it gives new insight into the coupon collector problem by providing a new proof of a more general result. On the other hand, we think that these results could be adapted to more general stochastic approximation algorithms, and that this could help understand the relationship between the extinction time of a stochastic model and the time for a fluid approximation of it to get close to extinction.

Related work. Stochastic approximation algorithms were introduced in [14] for solving root-finding problems. Applications of these methods are scattered over many fields, such as economics [3] or computer science [2]. In all of these works, a first step is to show that the stochastic system can be approximated on any finite time interval by a fluid approximation, e.g. described by a differential equation.
Then, this approximation is used to derive asymptotic properties, such as characterizing the limiting dynamics [1], computing approximations of the steady-state distribution [3, 2] or proving stability properties [5, 7]. However, there are few results on the relation between the time for a stochastic process to escape a region and the behavior of a fluid approximation of it. In [6], the authors show that if the time for the differential equation to escape a region is finite, then the time for the stochastic system to escape this region converges to the same value (Theorem 4.3 of [6]). When the differential equation stays far from the absorbing boundary, the time for the stochastic system to reach its absorbing state can be bounded by large-deviation results, as in [9]. The case of interest in this paper is that the deterministic system converges to a fixed point but does not reach this point in finite time, while the stochastic process does hit this point in finite time.

Outline of the paper. The rest of the paper is organized as follows. In Section 2, we give a formal definition of the problem. Section 3 contains the main results of this paper: we show that T_N is bounded by N t_N + O(N) and derive an asymptotic development of t_N. Section 4 establishes an explicit link between T_N and the average completion time of one algorithm. Finally, we show how this can be used to compute the completion time of randomized algorithms in Section 5.

2. Formal description and notations

Let Y be a Markov chain on a finite state space S = {0 ... S} and let P denote its transition matrix. We assume that this chain has an absorbing state, denoted 0, all the other states being transient.
If I denotes the identity matrix, then P − I can be written as:

P − I = ( 0    0
          Q^0  Q ),

where Q is a non-singular matrix such that for all i, j: Q_ii < 0, Q_ij ≥ 0 for i ≠ j and ∑_j Q_ij ≤ 0, and Q^0 is a vector such that for all i, ∑_j Q_ij + Q^0_i = 0.

We consider a Markov chain on S^N composed of N copies of the original Markov chain. Its state at time t is denoted (X_1(t), ..., X_N(t)). The evolution of the Markov chain is as follows:

(1) at each time step, a chain i ∈ {1 ... N} is picked uniformly at random;
(2) the i-th chain X_i(t) changes its state according to the transition matrix P. The states of the other chains do not change.

Our goal is to compute the time for all chains to hit 0 starting from a state (x_1, ..., x_N) ∈ S^N. We define this hitting time T_N by:

(1)  T_N := inf{t : (X_1(t), ..., X_N(t)) = (0, ..., 0)}.

2.1. Notations. For a state x ∈ S, we denote by e_x the row vector that has all coordinates equal to 0 except for the x-th one, which is equal to 1. The vector 1 denotes the column vector with all coordinates equal to 1. For a row vector α and a matrix P, αP denotes the classical matrix product. For example, if P is an S × S matrix, then αP1 = ∑_{i=1}^S ∑_{j=1}^S α_i P_ij.

For each state x ∈ S, we denote by W(x) the expected hitting time of 0 starting from x: if Y is a Markov chain of transition matrix P, then:

W(x) := E[inf{t : Y(t) = 0} | Y(0) = x].

Using the vector notation above, we have P(inf{t : Y(t) = 0} ≥ 1 + i | Y(0) = x) = e_x (I + Q)^i 1 for all i ∈ N. Thus, W can also be written W(x) = ∑_{i=0}^∞ e_x (I + Q)^i 1.

3. Hitting time and fluid approximation

In this part, we bound the expectation of T_N using a deterministic ordinary differential equation (ODE) approximation.
In particular, we show that E[T_N] is bounded by N times the time t_N for the linear ODE (3) to be at distance 1/N from its fixed point, plus a term of order O(N). Moreover, this time t_N is of order Ω(log N), showing that the O(N) term becomes negligible compared to N t_N as N grows.

3.1. A differential equation approximation. For any state x ∈ S and any time step k, we define the quantity M̄^N_x(k) to be the proportion of Markov chains that are in state x at time step k:

M̄^N_x(k) = (1/N) ∑_{i=1}^N 1_{X_i(k)=x},

where 1_{X_i(k)=x} equals 1 if X_i(k) = x and 0 otherwise. M̄^N(k) denotes the vector of all M̄^N_x(k) for x ∈ S: M̄^N(k) = ∑_{x∈S} M̄^N_x(k) e_x, where e_x denotes the unit vector having its x-th coordinate equal to 1 and the others 0.

The process M̄^N(k) is a Markov chain: with probability M̄^N_i(k), a chain that is in state i is chosen and moves to state j with probability P_ij. This shows that the expected variation of M̄^N(k) during one time step is:

(2)  E[M̄^N(k+1) − M̄^N(k) | M̄^N(k)] = ∑_{i∈S} ∑_{j≠i} M̄^N_i(k) P_ij (1/N)(e_j − e_i) = (1/N) M̄^N(k) Q.

The function f : m ↦ mQ is called the drift of the system. Let us consider the system of differential equations corresponding to the drift:

(3)  ṁ(t) = m(t) · Q.

Equation (2) shows that M̄^N(k) can be described by a stochastic approximation with constant step size 1/N: it corresponds to an Euler discretization of the ODE (3) with a random noise U^N (i.e. such that E[U^N(k+1) | M̄^N(k)] = 0):

(4)  M̄^N(k+1) = M̄^N(k) + (1/N)(f(M̄^N(k)) + U^N(k+1)).

Let us call M^N(t) the state of the system when time has been rescaled by a factor N:

M^N(t) = M̄^N(⌊tN⌋).
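As a quick sanity check of this fluid limit, one can compare the empirical proportion M^N_0(t) with the ODE solution in the simplest case where each chain jumps from state 1 to state 0 surely, so that m_0(t) = 1 − e^{−t}. A Python sketch with arbitrary parameters:

```python
import math
import random

def simulate_fraction_absorbed(N, t, rng):
    # N two-state chains (1 -> 0 surely); one uniformly chosen chain moves
    # per step. Returns M^N_0(t), the fraction absorbed after floor(t*N) steps.
    state = [1] * N
    for _ in range(int(t * N)):
        state[rng.randrange(N)] = 0  # the chosen chain jumps to (or stays at) 0
    return state.count(0) / N

t = 1.0
ode = 1 - math.exp(-t)  # fluid limit m_0(t) solving the linear ODE
emp = simulate_fraction_absorbed(10_000, t, random.Random(1))
print(f"ODE m_0({t}) = {ode:.4f}, empirical M^N_0({t}) = {emp:.4f}")
```

For N = 10,000 the two values agree to within the O(1/√N) fluctuations predicted by the theory.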
Using classical tools of stochastic approximation (Theorem 1 of [2], for example), one can show that if M^N(0) converges in probability to m(0), then M^N(t) converges in probability to m(t), uniformly on [0, T]:

lim_{N→∞} sup_{0≤t≤T} ‖M^N(t) − m(t)‖ = 0 in probability.

However, when one wants to compute the hitting time of 0 by M^N(t), this approximation is not accurate enough and leads to overestimated bounds. In the following, we will see how to link the hitting time of M^N(t) and the time for m_0(t) to be greater than 1 − 1/N.

3.2. Link between the hitting time and the time to reach 1 − 1/N. Let us now look at the quantity M^N_0(t), which is the proportion of Markov chains in state 0. The quantity T_N can be defined as T_N = inf{t : M^N_0(t) = 1}. If m(t) is the solution of the ODE (3), one clearly has lim_{t→∞} m_0(t) = 1, but unless m(t) starts exactly with m_0(0) = 1, the time to reach m_0(t) = 1 is infinite: inf{t : m_0(t) = 1} = +∞.

Due to its discrete nature, M^N_0(t) takes values in {0, 1/N, 2/N, ..., N/N}. Thus, when M^N_0(t) is greater than 1 − 1/N, it is equal to 1. This suggests introducing t_N, the time for the ODE to satisfy m_0(t) ≥ 1 − 1/N:

(5)  t_N := inf{t : m_0(t) ≥ 1 − 1/N}.

Figure 1 reports two simulations for the coupon collector with 2 cards. We compare the hitting time T_N of the stochastic system for N = 20 and N = 1000 with the time t_N for the ODE to reach 1 − 1/N. The time of the stochastic system has been rescaled by a factor N. This suggests that t_N is indeed a good estimate of T_N/N.
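For the classical collector (two-state chains, so m_0(t) = 1 − e^{−t} and t_N = log N), the estimate T_N/N ≈ t_N can itself be checked by simulation. A sketch with arbitrary N and sample counts:

```python
import math
import random

def simulate_collector_TN(N, rng):
    # Steps until each of the N two-state chains (1 -> 0) has been picked
    # at least once: the classical coupon collector time.
    absorbed = [False] * N
    remaining, steps = N, 0
    while remaining:
        steps += 1
        i = rng.randrange(N)
        if not absorbed[i]:
            absorbed[i] = True
            remaining -= 1
    return steps

N = 200
t_N = math.log(N)  # inf{t : 1 - e^{-t} >= 1 - 1/N}
rng = random.Random(2)
avg = sum(simulate_collector_TN(N, rng) for _ in range(300)) / 300
H_N = sum(1 / k for k in range(1, N + 1))
print(f"t_N = {t_N:.3f}, simulated E[T_N]/N ~ {avg / N:.3f}, H_N = {H_N:.3f}")
```

The simulated T_N/N concentrates around H_N = log N + γ + o(1), so t_N = log N captures the leading term, with the Euler-Mascheroni constant γ absorbed in the O(N) correction of Theorem 1.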
Figure 1. Comparison of the hitting time of the stochastic system, rescaled by 1/N, and the time for the differential equation to reach 1 − 1/N, for the coupon collector problem with 2 cards; (a) N = 20, (b) N = 1000. The smooth curve represents the differential equation m_0(t), the dotted line is the line y = 1 − 1/N and the curve with jumps represents M^N_0(Nt) for one sample of the simulation for N = 20 or N = 1000. For each curve, the hitting time of 1 for the stochastic system is close to the hitting time of 1 − 1/N for the deterministic ODE.

Classical stochastic approximation results show that the rate of convergence of M^N(t) to m(t) is of order O(1/√N). This bound is too loose to guarantee the convergence of T_N/N to t_N. In the next Theorem 1, we use a slightly different approach to show that N t_N is indeed a very good approximation of T_N.

Theorem 1. Let t_N be defined by Equation (5), with m satisfying the differential equation (3) with initial condition m(0) = α := N^{-1} ∑_{i=1}^N e_{x_i}. Then the hitting time T_N of (0, ..., 0) for the stochastic system composed of the N chains starting from (x_1, ..., x_N) satisfies:

E[T_N] ≤ N (t_N + α(I − R)^{-1} 1 + 2 max_{j,k} (−Q^{-1}_{jk})) = N t_N + O(N),

where R is the matrix defined by R_ii = 0 and R_ij = −Q_ij / Q_ii for i ≠ j.

Proof. The outline of the proof is as follows. The main idea is to write T_N as the maximum of N dependent random variables that correspond to the times for each chain to reach its absorbing state. Then we establish a relation between the expectation of this maximum and the tail behavior of the marginal distribution of each random variable.
The marginal distribution for each chain follows a phase-type distribution. We show in Lemma 1 that its tail behavior can be approximated by that of a continuous phase-type distribution, which leads to the term t_N.

Let us pick a chain i ∈ {1 ... N} at random. The distribution of the initial state of i is α. If chain i were alone, the probability that it has not yet hit its absorbing state 0 at time step k would be α(I + Q)^k 1. When considering the system composed of the N Markov chains, chain i makes a transition with probability 1/N at each time step. Thus, the probability for this particular chain not to be in its absorbing state 0 at time k is α(I + Q/N)^k 1. Therefore, the time T^i at which the Markov chain i hits its absorbing state satisfies:

(6)  P(T^i ≥ k) = α(I + (1/N) Q)^k 1.

If i_1, ..., i_N denotes a random permutation of {1 ... N}, then the time for all the Markov chains to have hit 0 is T_N = max_{1≤j≤N} T^{i_j}. The variables T^{i_j} are identically distributed, following the law given by Equation (6). However, these variables are not independent. Using the union bound and the fact that P(T_N ≥ k) ≤ 1, we have:

P(T_N ≥ k) ≤ min(1, ∑_{j=1}^N P(T^{i_j} ≥ k)) = min(1, N α(I + (1/N) Q)^k 1).

Therefore, the expectation of T_N can be bounded by:

(7)  E[T_N] = ∑_{k=1}^∞ P(T_N ≥ k) ≤ ∑_{k=1}^∞ min(1, N α(I + (1/N) Q)^k 1) ≤ ∑_{k=1}^{x_N − 1} 1 + ∑_{k=x_N}^∞ N α(I + (1/N) Q)^k 1,

where x_N = min{k ∈ N : α(I + N^{-1} Q)^k 1 ≤ 2/N}.
Moreover, using that ∑_{k=x_N}^∞ (I + N^{-1} Q)^k = −(I + N^{-1} Q)^{x_N} N Q^{-1}, we have

∑_{k=x_N}^∞ α(I + (1/N) Q)^k 1 = −α(I + (1/N) Q)^{x_N} N Q^{-1} 1
  ≤ max_{j,k}(−Q^{-1}_{j,k}) · N · ∑_{j=1}^S (α(I + (1/N) Q)^{x_N})_j
  ≤ max_{j,k}(−Q^{-1}_{j,k}) · N · (2/N)                                   (8)
  = 2 max_{j,k}(−Q^{-1}_{j,k}),

where Inequality (8) comes from the definition of x_N. Combining this inequality with Equation (7), we get:

E[T_N] ≤ x_N + 2 N max_{j,k}(−Q^{-1}_{j,k}).

The quantity x_N is defined by x_N = min{k : α(I + (1/N) Q)^k 1 ≤ 2/N}. We show in Lemma 1 that:

x_N ≤ N (inf{t : α exp(tQ) 1 ≤ 1/N} + α(I − R)^{-1} 1) = N t_N + N α(I − R)^{-1} 1,

where R is the matrix defined by R_ii = 0 and R_ij = −Q_ij/Q_ii for i ≠ j. □

3.3. Discrete and continuous phase-type distributions. A random variable X such that P(X ≥ t) = α exp(tQ) 1 is said to have a continuous phase-type distribution of parameters (Q, α). Let Y(t) be a Markov chain on S such that the rate of transition from i ≠ 0 to j ≠ i is Q_ij and the rate of transition from 0 to i ≠ 0 is zero. If α is the initial distribution of Y(0), then the time for Y(·) to reach zero follows a phase-type distribution of parameters (Q, α).

Similarly, a random variable X such that P(X ≥ k) = α(I + Q/N)^k 1 is said to have a discrete phase-type distribution of parameters (I + Q/N, α). This corresponds to the time for a discrete Markov chain of transition matrix I + Q/N to reach zero. We refer to [10], Chapter 2, for more definitions and properties of phase-type distributions.

The next lemma gives the relation between the tail of a continuous phase-type distribution of parameters (Q, α) and the tail of a discrete phase-type distribution of parameters (I + Q/N, α).

Lemma 1. Let x_N = min{k ∈ N : α(I + N^{-1} Q)^k 1 ≤ 2/N} and t_N = min{t ∈ R : α exp(Qt) 1 ≤ 1/N} be defined as in the proof of Theorem 1.
Then:

x_N ≤ N (t_N + α(I − R)^{-1} 1),

where R is the matrix defined by R_ii = 0 and R_ij = −Q_ij/Q_ii for i ≠ j.

Proof. Let us consider the Markov chain Y(·) associated with the continuous phase-type distribution of parameters (Q, α). With probability α_i, the Markov chain starts in state i. If after k jumps the Markov chain is in state i, it stays there for a time T_ki exponentially distributed with parameter −Q_ii and then jumps to a state j ≠ i with probability −Q_ij/Q_ii. Thus, the probability of being in state i after k jumps is (αR^k)_i, where R denotes the matrix with R_ii = 0 and R_ij = −Q_ij/Q_ii for i ≠ j. Therefore, if X is a continuous phase-type random variable of parameters (Q, α), X has the same distribution as:

(9)  X = ∑_{k=0}^∞ ∑_{i=1}^S U_ki T_ki,

where the U_ki are (dependent) Bernoulli variables of parameters (αR^k)_i and the T_ki are (independent) exponentially distributed variables of parameters −Q_ii.

Similarly, the quantity α(I + N^{-1} Q)^k 1 corresponds to the probability for a discrete phase-type random variable to be at least k, and a variable X^{(N)} with discrete phase-type distribution of parameters (I + Q/N, α) has the same distribution as:

(10)  X^{(N)} = ∑_{k=0}^∞ ∑_{i=1}^S U_ki T^{(N)}_ki,

where the U_ki are the same variables as before and the T^{(N)}_ki are independent geometric random variables of parameter −Q_ii/N.

Since T^{(N)}_ki is a geometric random variable of parameter −Q_ii/N, for all t ∈ R_+ we have:

P(T^{(N)}_ki ≥ tN) = (1 + Q_ii/N)^{⌈tN⌉−1} ≤ (1 + Q_ii/N)^{tN−1} ≤ exp(Q_ii (t − 1/N)) = P(T_ki ≥ t − 1/N),

where the last inequality comes from the fact that log(1 + x) ≤ x. This shows that N^{-1} T^{(N)}_ki is less than T_ki + 1/N for the stochastic order. As pointed out above, all the T^{(N)}_ki and T_ki are independent in Equations (9) and (10). Therefore, we can assume that N^{-1} T^{(N)}_ki ≤ T_ki + 1/N almost surely.
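The tail comparison between the geometric and exponential holding times can be illustrated numerically in the simplest case S = 1 and Q = (−1), where the discrete tail is (1 − 1/N)^{⌈tN⌉} and the continuous one is e^{−t} (a small sketch outside the paper's formal argument):

```python
import math

def discrete_tail(t, N):
    # P(T^(N) >= t*N) for a geometric holding time with parameter 1/N.
    return (1 - 1 / N) ** math.ceil(t * N)

def continuous_tail(t):
    # P(T >= t) for an exponential holding time with rate 1.
    return math.exp(-t)

N = 100
for t in (0.5, 1.0, 3.0):
    d, c = discrete_tail(t, N), continuous_tail(t)
    print(f"t = {t}: discrete = {d:.5f}, continuous = {c:.5f}")
```

Since 1 − x ≤ e^{−x}, the discrete tail is always below the continuous one, and the gap is of order 1/N, which matches the +1/N slack in the coupling above.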
Using that, for any positive random variables A, B, any t ∈ (0, ∞) and any ℓ ∈ [0, t], we have

P(A + B ≥ t) ≤ P((A ≥ t − ℓ) ∪ (B ≥ ℓ)) ≤ P(A ≥ t − ℓ) + P(B ≥ ℓ),

this shows that for any ℓ:

(11)  P(∑_{k=0}^∞ ∑_{i=1}^S U_ki T^{(N)}_ki ≥ Nt) ≤ P(∑_{k=0}^∞ ∑_{i=1}^S U_ki T_ki + (1/N) ∑_{k=0}^∞ ∑_{i=1}^S U_ki ≥ t)
      ≤ P(∑_{k=0}^∞ ∑_{i=1}^S U_ki T_ki ≥ t − ℓ) + P(∑_{k=0}^∞ ∑_{i=1}^S U_ki ≥ Nℓ).

By the Markov inequality, P(∑_{k=0}^∞ ∑_{i=1}^S U_ki ≥ Nℓ) ≤ E[∑_{k=0}^∞ ∑_{i=1}^S U_ki]/(Nℓ), with E[∑_{k=0}^∞ ∑_{i=1}^S U_ki] = ∑_k αR^k 1 = α(I − R)^{-1} 1. This shows that if ℓ = α(I − R)^{-1} 1, then the second term of (11) is at most 1/N. Moreover, if t = ℓ + t_N, the first term of (11) is at most 1/N. This shows that if t ≥ t_N + α(I − R)^{-1} 1, then:

P(∑_{k=0}^∞ ∑_{i=1}^S U_ki T^{(N)}_ki ≥ Nt) ≤ 2/N.

Thus, this shows that x_N ≤ N(t_N + α(I − R)^{-1} 1). □

3.4. The logarithmic trend. The quantity α exp(Qt) 1 is equal to one minus the cumulative distribution function F(·) of a continuous phase-type random variable of parameters (Q, α). According to Theorem 2.7.2 of [10], there exist γ > 0 and k ≥ 0 such that the density f(·) of this variable satisfies:

(12)  f(t) = γ t^k exp(−νt) + o(t^k exp(−νt)),

where −ν is an eigenvalue of Q such that ν > 0 and k + 1 ≥ 1 is the multiplicity of the eigenvalue −ν. Equation (12) leads to the logarithmic bound on T_N expressed by the following theorem.

Theorem 2. Let −ν be the eigenvalue of Q with the greatest real part and let k + 1 ≥ 1 denote its multiplicity. Then ν is real and positive, and T_N satisfies:

E[T_N] ≤ (1/ν) N log N + (k/ν) N log log N + O(N).

Proof. By Equation (12), the cumulative distribution function F satisfies:

α exp(Qt) 1 = 1 − F(t) = ∫_t^∞ f(s) ds = γ t^k exp(−νt) + o(t^k exp(−νt)).
Let s_N(x) := ν^{-1}(log(γN) + k log log(γN) − k log ν + x). For all fixed x, we have s_N(x) = t_N + ν^{-1} x + o(1). Using that exp(−ν s_N(x)) = (1/(γN)) (log(γN))^{−k} ν^k exp(−x), the quantity s_N(x)^k exp(−ν s_N(x)) is equal to

(13)  ν^{−k} (log(γN) + k log log(γN) − k log ν + x)^k (1/(γN)) (log(γN))^{−k} ν^k exp(−x)
    = (1/(γN)) exp(−x) (1 + (k log log(γN) − k log ν + x)/log(γN))^k.

The last factor of (13) goes to 1 as N goes to infinity. Therefore, if x > 0 (resp. x < 0), then (13) is strictly less (resp. greater) than 1/(γN) if N is large enough. This shows that for all ε > 0, if N is large enough, we have

1 − F(s_N(ε)) < 1/N < 1 − F(s_N(−ε)).

This shows that the number t_N such that t ≥ t_N implies α exp(Qt) 1 ≤ 1/N is equal to:

(14)  t_N = (1/ν)(log(γN) + k log log N − k log ν + o(1)).

Combining (14) and Theorem 1 concludes the proof of the theorem. □

3.5. Application to the coupon collector problem. Let us consider the classical coupon collector problem: there are N different types of coupons and, at each time step, a coupon whose type is drawn uniformly at random is collected. It has been shown in [13] that the time to collect T coupons of each type is bounded by N(log N + (T − 1) log log N + O(1)). In this section, we show that our approach allows one to retrieve this result directly.

Figure 2. Markov chains used to represent the coupon collector problem: the state indicates the number of coupons of that type that remain to be collected; (a) collecting T cards of each type, (b) classical coupon collector.

Let us consider the Markov chain represented in Figure 2(a). Its state space is {0 ... T}. The initial state is T and, for all 0 < i ≤ T, P_{i,i−1} = 1. The matrix Q corresponding to this Markov chain is a T × T matrix that has −1 on its diagonal and 1 on its sub-diagonal.
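As derived just below, for this chain m_0(t) is the distribution function of an Erlang(T, 1) variable, so t_N is the time at which the Erlang tail drops below 1/N, and the expansion (14) applies with ν = 1, k = T − 1 and γ = 1/(T − 1)!. A numerical sketch comparing the exact t_N with the asymptotic expansion, for arbitrary N and T:

```python
import math

def erlang_tail(t, T):
    # 1 - F(t) for an Erlang(T, 1) variable: sum_{j<T} e^{-t} t^j / j!.
    return sum(math.exp(-t) * t ** j / math.factorial(j) for j in range(T))

def t_N_exact(N, T):
    # Bisection for inf{t : erlang_tail(t, T) <= 1/N}; the tail is decreasing.
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if erlang_tail(mid, T) > 1 / N:
            lo = mid
        else:
            hi = mid
    return hi

def t_N_asymptotic(N, T):
    # Expansion (14) with nu = 1, k = T - 1, gamma = 1/(T-1)!.
    gamma = 1 / math.factorial(T - 1)
    return math.log(gamma * N) + (T - 1) * math.log(math.log(N))

N, T = 10**6, 3
print(f"exact t_N = {t_N_exact(N, T):.3f}, asymptotic = {t_N_asymptotic(N, T):.3f}")
```

The two values differ by the o(1) term of (14), which decays only like log log N / log N, so the agreement is rough for moderate N and improves slowly.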
Figure 2(b) represents the particular case T = 1. The ODE corresponding to this system is:

ṁ_T(t) = −m_T(t),
ṁ_i(t) = −m_i(t) + m_{i+1}(t)   for 0 < i < T,
ṁ_0(t) = m_1(t),

with m_T(0) = 1 and m_i(0) = 0 for i ∈ {0 ... T − 1}. A direct computation shows that m_0(t) is the cumulative distribution function of an Erlang variable of parameters (T, 1) (i.e. the sum of T i.i.d. exponential variables of parameter 1), which can be written:

m_0(t) = 1 − ∑_{k=0}^{T−1} exp(−t) t^k/k! = 1 − exp(−t) t^{T−1}/(T − 1)! + O(exp(−t) t^{T−2}).

Using Theorem 1, this shows that the time T_N to collect T cards of each type is bounded by

E[T_N] ≤ N log N + (T − 1) N log log N + (T + 2) N + o(N),

where the T + 2 comes from Q^{-1}_{ij} = −1_{j≤i} and I − R = −Q.

4. Explicit formulas for two particular cases

Theorem 2 gives a precise idea of the behavior of T_N in the general case. However, the computation of the constants ν, k or of the O(N) term can be difficult when the state space of the original Markov chain is large. In this section, we derive explicit formulas for these constants, assuming that the hitting time for one Markov chain is bounded by T.

We first show that if the hitting time of the absorbing state is bounded by T for each single chain starting from its initial state, then E[T_N] is less than N^2 T (Theorem 3), which is a loose bound in many cases. When the hitting time of the absorbing state is uniformly bounded by T for all initial states x ∈ S, then E[T_N] is bounded by T N log N + O(N) (Theorem 4). At the end of the section, we provide two examples showing that these bounds are tight. The results presented in this section remain valid if the state space of the chain is countable instead of finite.

4.1. Unbounded case. If W(x) denotes the expected hitting time of 0 for a single Markov chain, then the following result holds.
Theorem 3. The time T_N at which all the chains have reached 0 satisfies:

E[T_N] ≤ N ∑_{i=1}^N W(X_i(0)).

In particular, if W(X_i(0)) ≤ T for all i, then E[T_N] ≤ T N^2.

Proof. For all i ∈ {1 ... N}, let us call R_i the time at which the Markov chain i reaches 0: R_i = inf{t : X_i(t) = 0}. It should be clear that T_N = max_{1≤i≤N} R_i ≤ ∑_{i=1}^N R_i. Moreover, the expected hitting time for a single Markov chain performing one transition at each time step is W(X_i(0)). As the probability for a Markov chain to perform a transition during one time step is 1/N, we have E[R_i] = N W(X_i(0)). □

This result may seem to contradict Theorem 2, which shows that if we fix a Markov chain, the expected hitting time of a system composed of N copies of this chain is bounded by O(N log N). However, the constant hidden in the O(N log N) depends on the Markov chain, and the N log N trend is only valid as N goes to infinity, while the bound T N^2 depends only on T. At the end of the section, we provide an example showing that this bound is tight; the Markov chain used in this example depends on N.

4.2. Uniformly bounded case: sup_{x∈S} W(x) ≤ T. Figure 3(a) presents a Markov chain for which the bound of the previous theorem is tight. This chain has a very particular shape: starting from the initial state, the hitting time of the absorbing state is 1 with probability 1 − 1/N^2; with probability 1/N^2, the chain jumps into a state from which it takes N^2(T − 1) steps to hit 0. These rare but very slow trajectories are what make the hitting time large when multiple chains run in parallel. In this section, we show that if there are no such problematic states, the bound on T_N can be improved dramatically. More precisely, we show that if the hitting time of 0 is bounded by T independently of the initial state — sup_{x∈S} W(x) ≤ T — then E[T_N] is of order T N log N.

Theorem 4.
If the hitting time for one chain is uniformly bounded by T (i.e. sup_{x∈S} W(x) ≤ T), then the time T_N at which all the chains have reached 0 satisfies:

E[T_N] ≤ T N log N + 2 N T + 1.

Proof. Let F_t denote the filtration associated with the process X(t) and let us define the potential of the system at time t, Φ(t), by:

Φ(t) := (1/T) ∑_{i=1}^N W(X_i(t)).

T_N is the time at which all X_i(t) are equal to 0 and can be written T_N = inf{t : Φ(t) = 0}. In the following, we first show that the time for Φ(t) to be lower than 1 is less than N T log N + N T + 1, using [15]. Then, we use Theorem 3 to bound the remaining time by N T.

We say that a Markov chain is active if it has not yet reached 0. If an active Markov chain is picked at step 1, then the potential decreases in expectation by 1/T. Let α(t) denote the number of active Markov chains at time t (i.e. α(t) = ∑_{i=1}^N 1_{X_i(t)≠0}). The probability of picking an active Markov chain is α(t)/N. Therefore, the expected decrease of the potential between times t and t + 1 is:

(15)  E[Φ(t+1) − Φ(t) | F_t] ≤ −α(t)/(T N).

By hypothesis, sup_{x∈S} W(x) ≤ T. Thus, an active chain contributes at most 1 to the potential and we have Φ(t) ≤ α(t). Combining this with (15), we get:

(16)  E[Φ(t+1) | F_t] ≤ Φ(t)(1 − 1/(N T)).

Because of Equation (16), our potential function satisfies the hypotheses of Theorem 1 of [15] with m = 1 and h(r) = 1 − 1/(N T). According to this theorem, we have:

inf{t : Φ(t) < 1} ≤ λ (log Φ(0) + 1) + 1,

where λ = −1/log(1 − 1/(T N)) ≤ T N and Φ(0) ≤ N. By Theorem 3, once Φ(t) is less than 1, we have ∑_i W(X_i(t)) ≤ T, so the remaining time to hit 0 is bounded by N T. □

4.3. Comparison with previous bounds and tightness. Theorems 3 and 4 need stronger assumptions than Theorems 1 and 2 and are often less precise.
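The two-state chain used below for tightness (an active chain is absorbed with probability 1/T) gives a quick numerical illustration of Theorem 4: the simulated mean stays below the T N(log N + 2) + 1 bound and close to the exact value N T H_N. A sketch with arbitrary parameters:

```python
import math
import random

def simulate_two_state(N, T, rng):
    # N chains as in Figure 3(b): an active chain is absorbed with
    # probability 1/T; one uniformly chosen chain moves per step.
    state = [1] * N
    active, steps = N, 0
    while active:
        steps += 1
        i = rng.randrange(N)
        if state[i] == 1 and rng.random() < 1 / T:
            state[i] = 0
            active -= 1
    return steps

N, T = 100, 4
rng = random.Random(4)
avg = sum(simulate_two_state(N, T, rng) for _ in range(100)) / 100
bound = T * N * (math.log(N) + 2) + 1
exact = N * T * sum(1 / k for k in range(1, N + 1))
print(f"simulated ~ {avg:.0f}, exact N*T*H_N = {exact:.0f}, Theorem 4 bound = {bound:.0f}")
```

The gap between the bound and the exact value is the (2 − γ) N T additive term discussed at the end of Section 4.3.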
However, their main advantage is to give explicit formulas for the hitting time, even when computing the time t_N or the eigenvalues of the individual Markov chain is hard. This matters in practical situations where the Markov chains often have a complicated geometry, as is the case for the example of [4] presented in Section 5.

The loss of precision of these bounds is well illustrated by the coupon collector problem. Consider the Markov chain of Figure 2(a), which corresponds to the problem of collecting T cards of each type. The hitting time of 0 from any state is clearly bounded by T. Therefore, using Theorem 4, one gets E[T_N] ≤ T N(log N + 2) + 1. This bound is worse by a factor T compared with the bound obtained by the ODE approach, which was N(log N + (T − 1) log log N) + O(N). This is explained by the fact that Theorem 4 does not take into account the particular shape of the Markov chain of Figure 2(a): the bound of Equation (16) neglects the fact that the hitting time starting from a state in {0 ... T − 1} is strictly less than T.

4.3.1. Tightness of the bounds of Theorems 3 and 4. A Markov chain that shows the tightness of the bound of Theorem 3 is represented in Figure 3(a). The chain has N^2(T − 1) + 2 states, denoted {0, ..., N^2(T − 1)} ∪ {i}. Its initial state is i. From i, the chain goes with probability 1/N^2 to state N^2(T − 1) and with probability 1 − 1/N^2 to state 0. From any state x ∈ {1 ... N^2(T − 1)}, the chain goes to state x − 1 with probability 1.

Figure 3. Two Markov chains used to show that the bounds of Theorems 3 and 4 are tight: (a) example for Theorem 3, (b) example for Theorem 4.

For any state x ∈ {0 ... N^2(T − 1)}, the expected hitting time of
When starting in i, the expected hitting time of 0 is W(i) = 1 + (1/N²)·N²(T − 1) = T. Therefore, Theorem 3 shows that the expected hitting time of (0, ..., 0) for a system composed of N of these chains starting in (i, ..., i) is bounded by TN².

Let us compute a lower bound on the hitting time of (0, ..., 0) starting from (i, ..., i). With probability 1 − (1 − 1/N²)^N, at least one chain needs N²(T − 1) transitions to converge. At each time step, this chain makes a transition with probability 1/N. Thus, on average this chain takes (T − 1)N³ time steps to converge. Since this happens with probability 1 − (1 − 1/N²)^N, a lower bound for the hitting time of (0, ..., 0) starting from every chain in state i is E[T^N] ≥ N³(T − 1)(1 − (1 − 1/N²)^N) = (T − 1)(N² + O(N)). This shows that the bound of Theorem 3 is tight up to an additive term of order N².

To show that the bound of Theorem 4 cannot be improved much without further assumptions, let us consider the Markov chain represented in Figure 3(b): it has two states, and the probability of going from 1 to 0 is 1/T. The expected hitting time of state 0 starting from state 1 is T. Theorem 4 implies that E[T^N] ≤ NT(log N + 2) + 1. The exact value of E[T^N] is NT Σ_{i=1}^N 1/i ≈ NT log N + γNT + o(N), where γ ≈ 0.577 is the Euler–Mascheroni constant. This is close to our theoretical bound, up to an additive term of (2 − γ)NT.

5. Computing completion time of randomized algorithms

In this section, we show how these results can be applied to study the completion time of randomized algorithms, and how this can be used to design efficient distributed protocols.

5.1. Completion time of randomized algorithms. One motivation for this work comes from the study of the time for a set of N distributed randomized algorithms to all finish, in a scenario similar to [4].
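For the two-state chain of Figure 3(b) just discussed, the additive gap of (2 − γ)NT can be checked numerically. The sketch below (the values of N and T are arbitrary illustrative choices, not from the paper) computes the exact value NT Σ_{i=1}^N 1/i through the harmonic number H_N and compares it with the bound NT(log N + 2) + 1 of Theorem 4.

```python
import math

def exact_tn(N, T):
    """Exact expected hitting time of (0,...,0) for N copies of the
    two-state chain of Figure 3(b).  While k chains remain in state 1,
    the per-step probability that one of them jumps to 0 is (k/N)(1/T),
    so that phase lasts NT/k steps in expectation; summing over k gives
    T^N = N * T * H_N, with H_N the N-th harmonic number."""
    harmonic = sum(1.0 / k for k in range(1, N + 1))
    return N * T * harmonic

N, T = 1000, 5                           # illustrative values
exact = exact_tn(N, T)
bound = N * T * (math.log(N) + 2) + 1    # Theorem 4
gamma = 0.5772156649015329               # Euler-Mascheroni constant
# The gap between the bound and the exact value is about (2 - gamma)*N*T,
# since H_N = log(N) + gamma + O(1/N).
assert exact <= bound
assert abs((bound - exact) - (2 - gamma) * N * T) < 10.0
```

The final assertion confirms that, up to an O(1/N) correction in H_N, the bound exceeds the exact value by exactly (2 − γ)NT, as stated above.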
Let us consider that we want to solve a resource allocation problem among a population of agents. We assume that we have a randomized algorithm that converges to a stable allocation of the resource that is efficient but not fair among the different agents. The final allocation might depend on the random choices made by the algorithm, and each allocation favors a different group of agents.

In order to improve the fairness of the equilibrium, we consider the following scenario. We execute N independent copies of the algorithm. At each time step, we perform one computation step of one algorithm taken at random among the N algorithms. After some time, the N algorithms will have reached their stable allocations S_1, ..., S_N. Since each allocation is efficient, the resulting allocation will also be efficient, but it will be more fair, since at each time step one allocation is picked at random among S_1, ..., S_N.

If the original algorithm uses bounded memory, it can be represented by a Markov chain with a finite state space. After some time, the Markov chain will reach an absorbing state, representing the fact that the algorithm has reached a stable allocation. The resulting system can be represented by N independent Markov chains; at each time step, one Markov chain is picked at random and performs one transition. Our framework, and in particular Theorems 3 and 4, can be used to compute the time to reach the final allocation if we know the time taken by a single algorithm to converge.

5.2. Correlated equilibria and distributed protocols. These ideas are applied in [4] to design a distributed algorithm that converges to a fair and efficient allocation of wireless radio channels to a set of users. Their scenario is the following. There are U users that want to share C wireless channels. Time is slotted, and at each time slot, each user can transmit data on one channel.
If two or more users are transmitting on the same channel at the same time, there is interference and no data is received. The only information available to a user before transmitting is whether a given channel was used by one or more users at the previous time slot. The authors of [4] proposed a distributed randomized algorithm that converges to a constant assignment of the C channels to C of the U users, while the other U − C users do not transmit at all. This algorithm guarantees 100% utilization of the channels but is unfair, since U − C users are not transmitting at all. The convergence time of the algorithm can be bounded by some constant T.

In order to improve the fairness of their algorithm, they introduced a centralized entity that sends a correlation signal. At each time step t, this entity sends a signal n(t) ∈ {1, ..., N}, where N is a predefined constant (to be chosen by the entity). The signal n(t) is picked uniformly at random at each time step. The users keep N copies of the previous randomized algorithm, and at time t, the users apply algorithm number n(t). Even if this algorithm is no longer completely distributed, it is still scalable: the centralized entity has to broadcast the signal n(t) at time t, but it does not have to gather any information from the agents.

Moreover, this algorithm improves the fairness of the initial algorithm. As each copy of the algorithm is independent, if N is large, each user will be assigned on average to C/U of the channels. The larger N is, the fairer the allocation will be. However, a large N slows down the convergence of the algorithm. An accurate bound on the convergence time of this algorithm allows one to choose the right compromise between the speed of convergence and the performance of the algorithm. As the convergence time of each copy of the algorithm is bounded in expectation by some T, this model satisfies the hypothesis of Theorem 4.
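The fairness mechanism can be illustrated by a small simulation. As a simplifying assumption (the actual algorithm of [4] is not reproduced here), each converged allocation S_n is modeled as a uniformly random subset of C of the U users; all function names and parameter values below are hypothetical.

```python
import random

def fairness_of_time_sharing(num_copies, num_users, num_channels, slots, rng):
    """Hypothetical model of the correlated-signal scheme.

    Each of the N copies is assumed to have converged to an allocation
    S_n that gives the C channels to a random subset of C of the U
    users.  At each slot the entity broadcasts n(t), uniform in
    {1,...,N}, and allocation S_{n(t)} is used.  Returns the fraction
    of slots in which each user transmits."""
    allocations = [rng.sample(range(num_users), num_channels)
                   for _ in range(num_copies)]
    share = [0] * num_users
    for _ in range(slots):
        n = rng.randrange(num_copies)   # correlation signal n(t)
        for user in allocations[n]:
            share[user] += 1
    return [s / slots for s in share]

rng = random.Random(1)
shares = fairness_of_time_sharing(num_copies=200, num_users=10,
                                  num_channels=4, slots=20000, rng=rng)
# With N large, each user's long-run share approaches C/U = 0.4.
```

With a single copy, U − C of the users would never transmit; averaging over many copies drives every user's share toward C/U. The convergence bound of Theorem 4 then quantifies the price of choosing a larger N.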
Therefore, the convergence time of the whole algorithm is bounded by NT log N + 2NT + 1.

References

[1] Benaim, M. (1999). Dynamics of stochastic approximation algorithms. Séminaire de probabilités XXXIII, 1–68.
[2] Benaim, M. and Le Boudec, J. (2008). A class of mean field interaction models for computer and communication systems. Performance Evaluation 65, 823–838.
[3] Benaim, M. and Weibull, J. (2003). Deterministic approximation of stochastic evolution in games. Econometrica 71, 873–903.
[4] Cigler, L. and Faltings, B. (2011). Reaching correlated equilibria through multi-agent learning. In Proc. of 10th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2011), pp. 509–516.
[5] Dai, J. (1995). On positive Harris recurrence of multiclass queueing networks: a unified approach via fluid limit models. The Annals of Applied Probability 5, 49–77.
[6] Darling, R. and Norris, J. (2008). Differential equation approximations for Markov chains. Probability Surveys 5, 37–79.
[7] Fort, G., Meyn, S., Moulines, E. and Priouret, P. (2008). The ODE method for stability of skip-free Markov chains with applications to MCMC. The Annals of Applied Probability 18, 664–707.
[8] Kenthapadi, K. and Manku, G. (2005). Decentralized algorithms using both local and random probes for P2P load balancing. In Proceedings of the Seventeenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, ACM, pp. 135–144.
[9] Klebaner, F. and Liptser, R. (2001). Asymptotic analysis and extinction in a stochastic Lotka-Volterra model. Annals of Applied Probability 11, 1263–1291.
[10] Latouche, G., Ramaswami, V. and Kulkarni, V. (1999). Introduction to matrix analytic methods in stochastic modeling. Journal of Applied Mathematics and Stochastic Analysis 12.
[11] Mitzenmacher, M. and Upfal, E. (2005).
Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press.
[12] Myers, A. and Wilf, H. (2003). Some new aspects of the coupon collector's problem. arXiv preprint math/0304229.
[13] Newman, D. (1960). The double dixie cup problem. American Mathematical Monthly 67, 58–61.
[14] Robbins, H. and Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics 22, 400–407.
[15] Tchiboukdjian, M., Gast, N. and Trystram, D. (2010). Decentralized list scheduling. Submitted for publication.

Nicolas Gast, EPFL, IC-LCA2, BC 203, Bâtiment BC, Station 14, 1015 Lausanne-EPFL, Switzerland
E-mail address: nicolas.gast@epfl.fr
