Conditioned Poisson distributions and the concentration of chromatic numbers


Authors: John Hartigan, David Pollard, Sekhar Tatikonda

Abstract. The paper provides a simpler method for proving a delicate inequality that was used by Achlioptas and Naor to establish asymptotic concentration for chromatic numbers of Erdős-Rényi random graphs. The simplifications come from two new ideas. The first involves a sharpened form of a piece of statistical folklore regarding goodness-of-fit tests for two-way tables of Poisson counts under linear conditioning constraints. The second idea takes the form of a new inequality that controls the extreme tails of the distribution of a quadratic form in independent Poisson random variables.

Date: October 25, 2021.
Key words and phrases: random graph, chromatic number, second moment method, categorical data, two-way tables, Poisson counts.

1. Introduction

Recently, Achlioptas and Naor (2005) established a most elegant result concerning colorings of the Erdős-Rényi random graph, which has vertex set $V = \{1, 2, \ldots, n\}$ and has each of the $\binom{n}{2}$ possible edges included independently with probability $d/n$, for a fixed parameter $d$. They showed that, as $n$ tends to infinity, the chromatic number concentrates (with probability tending to one) on a set of two values, which they specified as explicit functions of $d$. The main part of their argument used the "second moment method" (Alon and Spencer 2000, Chapter 4) to establish existence of desired colorings with probability bounded away from zero. Most of their paper was devoted to a delicate calculation bounding the ratio of a second moment to the square of a first moment. More precisely, A&N considered the quantity
\[
A_n(c) := n^{k-1} k^{-2n} \Bigl(1 - \frac{1}{k}\Bigr)^{-2nc} \sum_{\ell \in \mathcal{H}_k} \frac{n!}{\prod_{i,j} \ell_{ij}!} \Bigl(1 - \frac{2}{k} + \sum_{i,j} \bigl(\tfrac{\ell_{ij}}{n}\bigr)^2\Bigr)^{nc},
\]
where $\mathcal{H}_k$ denotes the set of all $k \times k$ matrices with nonnegative integer entries for which each row and column sum equals $B := n/k$. (With no loss of generality, A&N assumed that $n$ is an integer multiple of $k$.) They needed to show, for each fixed $k \ge 3$, that
(1) $\qquad A_n(c) = O(1)$ when $c < (k-1)\log(k-1)$.

In this paper we show how the A&N calculations can be simplified by using results about conditioned Poisson distributions. More precisely, we show that the desired behaviour of $A_n(c)$ follows from a sharpening of a conditional limit theorem due to Haberman (1974), together with some elementary facts about the Poisson distribution.

In Section 2 we establish some basic notation and record some elementary facts about the Poisson distribution. In Section 3 we explain how $A_n(c)$ can be bounded by a conditional expectation of an exponential function of the classical goodness-of-fit statistic for two-way tables. We then outline our proof of (1), starting from a $\chi^2$ heuristic that can be sharpened (Section 4) into a rigorous proof that handles the contributions to $A_n(c)$ from all except some extreme values of $\ell$. To control the contributions from the extreme $\ell$ we use an inequality (Lemma 2) that captures the large deviation behaviour of conditioned Poissons. The proof of the Lemma (in Section 5) is actually the most delicate part of our argument.
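To make the central object concrete before the analysis begins: for very small $n$ one can evaluate $A_n(c)$ by brute-force enumeration of $\mathcal{H}_k$ and watch it stay bounded. The sketch below (Python; the helper names are ours, and the computation is illustrative only, not part of the A&N argument) does this for $k = 3$ and a $c$ just inside the regime of (1).

```python
from itertools import product
from math import factorial, log

def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def A(n, k, c):
    """Brute-force A_n(c); feasible only for tiny n (n a multiple of k)."""
    B = n // k
    rows = list(compositions(B, k))          # candidate rows of an H_k matrix
    total = 0.0
    for ell in product(rows, repeat=k):
        # keep only matrices whose column sums also equal B
        if any(sum(ell[i][j] for i in range(k)) != B for j in range(k)):
            continue
        mult = factorial(n)                  # multinomial coefficient n!/prod ell_ij!
        for row in ell:
            for entry in row:
                mult //= factorial(entry)
        quad = 1 - 2 / k + sum((e / n) ** 2 for row in ell for e in row)
        total += mult * quad ** (n * c)
    return n ** (k - 1) * k ** (-2 * n) * (1 - 1 / k) ** (-2 * n * c) * total

k = 3
c = 0.9 * (k - 1) * log(k - 1)               # inside the regime of (1)
for n in (3, 6, 9, 12):
    print(n, A(n, k, c))
```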
2. Facts about the Poisson distribution

Many of the calculations in our paper involve the convex function
(2) $\qquad h(t) = (1+t)\log(1+t) - t$ for $t \ge -1$,
which achieves its minimum value of zero at $t = 0$. Near its minimum, $h(t) = t^2/2 + O(|t|^3)$. In fact, $h(t) = \tfrac12 t^2\psi(t)$ where $\psi$ is a decreasing function with $\psi(0) = 1$ and $\psi'(0) = -1/3$. See Pollard (2001, page 312) for a simple derivation of these facts. Define $\mathbb{N}_0 := \{0, 1, 2, \ldots\}$, the set of all nonnegative integers.

Lemma 1. Suppose $W$ has a Poisson$(\lambda)$ distribution, with $\lambda \ge 1$.
(i) If $\ell = \lambda + \lambda u \in \mathbb{N}_0$ then
\[
\sqrt{2\pi\lambda}\,P\{W = \ell\} = \exp\bigl(-\lambda h(u) - \tfrac12\log(1+u) + O(1/\ell)\bigr) = \exp\bigl(-\tfrac12\lambda u^2 + O(|u| + \lambda|u|^3)\bigr).
\]
(ii) $P\{W = \ell\} \le \exp(-\lambda h(u))$ for all $\ell = \lambda(1+u) \in \mathbb{N}_0$.
(iii) For all $w \ge 0$,
\[
P\{|W - \lambda| \ge \lambda w\} \le 2\exp(-\lambda h(w)) = 2\exp\bigl(-\tfrac12\lambda w^2 + O(\lambda|w|^3)\bigr).
\]

Proof. By Stirling's formula, $\log(\ell!/\sqrt{2\pi}) = (\ell + \tfrac12)\log\ell - \ell + r_\ell$ where $1/(12\ell+1) \le r_\ell \le 1/(12\ell)$. Thus
\[
\log\bigl(\sqrt{2\pi\lambda}\,P\{W = \ell\}\bigr) = -\lambda + \ell\log\lambda - \log(\ell!/\sqrt{2\pi}) + \tfrac12\log\lambda = -\lambda h(u) - \tfrac12\log(1+u) + O(\ell^{-1}),
\]
which gives (i). For (ii), first note that $P\{W = 0\} = e^{-\lambda} = \exp(-\lambda h(-1))$. For $\ell \ge 1$ we have
\[
\log\bigl(\sqrt{2\pi}\,P\{W = \ell\}\bigr) = -\lambda + \ell\log\lambda - (\ell+\tfrac12)\log\ell + \ell - r_\ell \le -\lambda + \ell\log\lambda - \ell\log\ell + \ell = -\lambda h(u).
\]
Inequality (iii) comes from two appeals to the usual trick with the moment generating function $Pe^{tW} = \exp(\lambda(e^t - 1))$. For $w \ge 0$,
\[
P\{W \ge \lambda + \lambda w\} \le \inf_{t\ge0} Pe^{t(W-\lambda-\lambda w)} = \inf_{t\ge0}\exp\bigl(-t\lambda(1+w) + \lambda(e^t - 1)\bigr).
\]
The infimum is achieved at $t = \log(1+w)$, giving the bound $\exp(-\lambda h(w))$. Similarly,
\[
P\{W \le \lambda - \lambda w\} \le \inf_{t\ge0} Pe^{t(\lambda-\lambda w-W)} = \inf_{t\ge0}\exp\bigl(t\lambda(1-w) + \lambda(e^{-t} - 1)\bigr),
\]
with the infimum achieved at $t = -\log(1-w)$ if $0 \le w < 1$, or as $t \to \infty$ if $w = 1$; the resulting bound $\exp(-\lambda h(-w))$ is smaller than $\exp(-\lambda h(w))$. The inequality is trivial for $w > 1$. □

3. Heuristics and an outline of the proof of (1)

We first show that $A_n(c)$ is almost a conditional expectation involving a set of independent random variables, $Y = \{Y_{ij} : 1 \le i, j \le k\}$, each distributed Poisson$(\lambda_{ij})$ with $\lambda_{ij} = n/k^2$ for all $i, j$. For $\ell \in \mathcal{H}_k$,
\[
p(\ell) := P\{Y = \ell\} = \frac{e^{-n}(n/k^2)^n}{\prod_{i,j}\ell_{ij}!} = \frac{n!}{\prod_{i,j}\ell_{ij}!}\cdot\frac{n^n e^{-n}}{n!}\cdot k^{-2n}.
\]
The standardized variables $X_{ij} := (Y_{ij} - \lambda_{ij})/\sqrt{\lambda_{ij}}$ are approximately independent standard normals. As we show in Section 4, the quantity $\beta_n := n^{(2k-1)/2}P\{Y \in \mathcal{H}_k\}$ converges to a strictly positive constant as $n$ tends to infinity. Thus
\[
p_2(\ell) := P\{Y = \ell \mid Y \in \mathcal{H}_k\} = \frac{p(\ell)}{P\{Y \in \mathcal{H}_k\}} = n^{k-1}k^{-2n}\,\frac{n!}{\prod_{i,j}\ell_{ij}!}\cdot\frac{n^{n+1/2}e^{-n}}{n!}\cdot\frac{1}{\beta_n}.
\]
By Stirling's approximation, the factor $n^{n+1/2}e^{-n}/n!$ converges to the nonzero constant $1/\sqrt{2\pi}$. The quantity $A_n(c)$ is therefore bounded by a constant multiple of
(3) $\qquad \bigl(1 - \tfrac1k\bigr)^{-2nc}\sum_{\ell\in\mathcal{H}_k}p_2(\ell)\bigl(1 - \tfrac2k + \sum_{i,j}(\ell_{ij}/n)^2\bigr)^{nc}$.
That is, for some constant $C_0$,
\[
A_n(c) \le C_0\Bigl(1 - \frac1k\Bigr)^{-2nc}P_2\Bigl(1 - \frac2k + \sum_{i,j}(Y_{ij}/n)^2\Bigr)^{nc},
\]
where $P_2(\cdot)$ denotes expectations with respect to the conditional probability distribution $P(\cdot \mid Y \in \mathcal{H}_k)$. Note the similarity to the usual chi-squared goodness-of-fit statistic: on the event $\{Y \in \mathcal{H}_k\}$,
\[
|X|^2 := \sum_{i,j}X_{ij}^2 = -n + nk^2\sum_{i,j}(Y_{ij}/n)^2.
\]
The quantity in (3) equals the $P_2$ expectation of
\[
\Bigl(1-\frac1k\Bigr)^{-2nc}\biggl(\Bigl(1-\frac1k\Bigr)^2 + \frac{|X|^2}{nk^2}\biggr)^{nc} \le \exp\biggl(\frac{c\,|X|^2}{(k-1)^2}\biggr).
\]
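As a numerical aside (ours, not part of the argument): for small $n$ the conditional law $P_2$ can be sampled by crude rejection, which makes the $\chi^2$ heuristic for $|X|^2$ visible. A sketch with numpy, where all names are our own and $n$ is far too small for the asymptotics to be exact:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 18                     # so lambda_ij = n/k^2 = 2 and B = n/k = 6
lam, B = n / k**2, n // k

# Crude rejection sampling from P_2 = P( . | Y in H_k); workable for small n.
draws = rng.poisson(lam, size=(1_000_000, k, k))
in_Hk = (draws.sum(axis=2) == B).all(axis=1) & (draws.sum(axis=1) == B).all(axis=1)
Y = draws[in_Hk]

X2 = ((Y - lam) ** 2 / lam).sum(axis=(1, 2))    # goodness-of-fit statistic |X|^2
print("accepted tables:", len(Y))
print("sample mean of |X|^2:", X2.mean())
print("chi-squared heuristic: mean should be near (k-1)^2 =", (k - 1) ** 2)
```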
Our task has become: for a fixed $J_k := c/(k-1)^2 < \rho_k := \log(k-1)/(k-1)$, show that
(4) $\qquad P_2\exp\bigl(J_k|X|^2\bigr) = O(1)$ as $n \to \infty$.

Under $P_2$, the random vector $X$ has a limiting normal distribution $N$ that concentrates on a $(k-1)^2$-dimensional subspace of $\mathbb{R}^{k\times k}$. The random variable $|X|^2$ has an asymptotic $\chi^2_R$ distribution with $R = (k-1)^2$. If we could assume that $|X|^2$ were exactly $\chi^2_R$-distributed, we could bound the conditional expectation in (4) by a constant times
\[
\int_0^\infty t^{R/2-1}\exp(ct/R - t/2)\,dt,
\]
which would be finite for $c < R/2 = (k-1)^2/2$.

To make the argument rigorous we need to consider the contributions from the large $|Y_{ij} - n/k^2|$ more carefully. As a special case of Theorem 3 in Section 4, we know that for each fixed $\theta > 1$ there exists a $\delta = \delta_\theta$ for which
(5) $\qquad P_2\exp\bigl(J_k|X|^2\bigr)\{|X| \le \delta\sqrt n\} \le \theta\,N\exp\bigl(\theta^2 J_k|x|^2\bigr)$.
(Here, as elsewhere in the paper, braces such as $\{|X| \le \delta\sqrt n\}$ denote indicator functions, and symbols such as $P$, $P_2$, and $N$ also serve as expectation operators.) The expectation with respect to the normal distribution $N$ can be bounded as in the previous paragraph, because $|x|^2 \sim \chi^2_{(k-1)^2}$ under $N$ and $\theta^2 J_k < 1/2$ when $\theta$ is close enough to one.

To control the contribution from $\{|X| > \delta\sqrt n\}$ it is notationally cleaner to work with the variables $U_{ij} := (Y_{ij} - \lambda_{ij})/\lambda_{ij}$, that is, $U = kX/\sqrt n$. Write $\mathcal{U}$ for the set of all $u$ in $\mathbb{R}^{k\times k}$ for which $\lambda_{ij}(1+u_{ij}) \in \mathbb{N}_0$ for all $i, j$ and (because $Y$ is constrained to lie in $\mathcal{H}_k$)
(6) $\qquad -1 \le u_{ij} \le k-1$ and $\sum_i u_{ij} = 0 = \sum_j u_{ij}$.
We need to bound
\[
\frac{P\exp\bigl(nJ_k|U|^2/k^2\bigr)\{|U| > k\delta\}}{P\{Y \in \mathcal{H}_k\}} = O\bigl(n^{(2k-1)/2}\bigr)\sum_{u\in\mathcal{U}}\{|u| > k\delta\}\,P\{U = u\}\exp\bigl(nJ_k|u|^2/k^2\bigr).
\]
From Lemma 1, $P\{U = u\} \le \prod_{i,j}\exp\bigl(-nh(u_{ij})/k^2\bigr)$, which leads us to the task of showing that
(7) $\qquad \sum_{u\in\mathcal{U}}\{|u| > k\delta\}\exp\Bigl(\frac{n}{k^2}\sum_{i,j}\bigl(J_k u_{ij}^2 - h(u_{ij})\bigr)\Bigr) = O\bigl(n^{-(2k-1)/2}\bigr)$.

Here we can make use of an inequality (proved in Section 5) that controls the exponent in (7). Recall that $h(t) = (1+t)\log(1+t) - t$ and $\rho_k = \log(k-1)/(k-1)$.

Lemma 2. For each $u = (u_1, \ldots, u_k) \in \mathbb{R}^k$ for which $\sum_j u_j = 0$ and $-1 \le u_j \le k-1$ for all $j$, we have $\sum_j h(u_j) \ge \rho_k\sum_j u_j^2$.

When invoked for the sum over $j$ for each fixed $i$, the Lemma bounds (7) by
\[
\sum_{u\in\mathcal{U}}\{|u| > k\delta\}\exp\bigl(-n\epsilon_0|u|^2\bigr)\qquad\text{where }\epsilon_0 := (\rho_k - J_k)/k^2 > 0.
\]
The set $\{u \in \mathcal{U} : 2^b k\delta < |u| \le 2^{b+1}k\delta\}$ has cardinality of order $O\bigl((n2^b)^{k^2}\bigr)$. The last sum is therefore less than
\[
O\bigl(n^{k^2}\bigr)\sum_{b\in\mathbb{N}_0}\exp\bigl(k^2 b - n\epsilon_0 4^b\bigr),
\]
after absorbing constants such as $k^2\delta^2$ into $\epsilon_0$; this bound decreases exponentially fast with $n$. The bound asserted in (4) follows.
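Before its proof in Section 5, Lemma 2 can be probed by random search over its constraint set; finding no counterexample proves nothing, but it is a cheap sanity check. A throwaway sketch (numpy; all names ours), which samples points of the set by drawing a centered direction and a random point along it:

```python
import numpy as np

def h(t):
    return (1 + t) * np.log1p(t) - t

rng = np.random.default_rng(1)
k, trials = 6, 500_000
rho_k = np.log(k - 1) / (k - 1)

d = rng.standard_normal((trials, k))
d -= d.mean(axis=1, keepdims=True)            # directions with sum_j d_j = 0
# Largest s keeping -1 <= s*d_j <= k-1 for every j, then a random point on the ray.
with np.errstate(divide="ignore"):
    s_max = np.minimum(
        np.where(d < 0, -1.0 / d, np.inf).min(axis=1),
        np.where(d > 0, (k - 1.0) / d, np.inf).min(axis=1),
    )
u = d * (rng.uniform(0.0, 1.0, trials) * s_max)[:, None]

gap = h(u).sum(axis=1) - rho_k * (u ** 2).sum(axis=1)
print("smallest observed sum_j h(u_j) - rho_k * sum_j u_j^2:", gap.min())
```

The printed minimum should be nonnegative (it approaches zero for points near the origin, where both sides of the Lemma vanish).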
4. Limit theory for conditioned Poisson distributions

The main result in this Section is Theorem 3, which shows that the contributions to the left-hand side of (4) from a large range of $X$ values can indeed be bounded using the $\chi^2$-approximation.

Suppose $Y = (Y_1, \ldots, Y_q)$ is a vector of independent random variables with $Y_i$ distributed Poisson$(\lambda_i)$. Define $\lambda := (\lambda_1, \ldots, \lambda_q)$ and $D := \mathrm{diag}(\lambda_1^{1/2}, \ldots, \lambda_q^{1/2})$. For the rest of this Section assume that $\nu := \sum_i\lambda_i$ converges to infinity and that there exists some fixed constant $\tau > 0$ for which
(8) $\qquad \nu \ge \max_i\lambda_i \ge \min_i\lambda_i \ge \tau\nu$.
The various constants that appear throughout the Section might depend on $\tau$.

Suppose $V_1, \ldots, V_s$ are fixed vectors in $\mathbb{Z}^q$ that are linearly independent, spanning a subspace $\mathcal{L}$ of $\mathbb{R}^q$. The linear independence implies the existence of nonzero constants $C_1$ and $C_2$ for which
(9) $\qquad C_1\max_\alpha|t_\alpha| \le \bigl|\sum_\alpha t_\alpha V_\alpha\bigr| \le C_2\max_\alpha|t_\alpha|$ for all $t \in \mathbb{R}^s$.
We also assume that
(10) $\qquad \mathbb{Z}^q\cap(\lambda\oplus\mathcal{L}) \ne \emptyset$.

Under similar assumptions, Haberman (1974, Chapter 1) proved a central limit theorem for the random vector $X := D^{-1}(Y - \lambda)$, conditional on the event $\{Y \in \lambda\oplus\mathcal{L}\}$. The limit distribution $N_\lambda$ is that of a $N(0, I_q)$ conditioned to lie in the $s$-dimensional subspace $D^{-1}\mathcal{L}$. More precisely, $N_\lambda$ has density $\phi(x) = (2\pi)^{-s/2}\exp(-|x|^2/2)$ with respect to Lebesgue measure $m_\lambda$ on the subspace $D^{-1}\mathcal{L}$. We write $Q(\cdot)$ to denote expectations under $P(\cdot \mid Y \in \lambda\oplus\mathcal{L})$. That is, for the conditional expectation of a function of $Y$,
\[
Qf(Y) = \frac{Pf(Y)\{Y \in \lambda\oplus\mathcal{L}\}}{P\{Y \in \lambda\oplus\mathcal{L}\}}.
\]

For the calculations leading to inequality (5), the $q \times 1$ vectors are more naturally written as $k \times k$ tables. The vector of means becomes a table $\lambda = \{\lambda_{ij} : 1 \le i, j \le k\}$ with $\lambda_{ij} = n/k^2$ for all $i, j$. The constraints on row and column sums can be written using the $2k$ tables with ones in a single row or column and zeros elsewhere, but only $2k-1$ of those tables are linearly independent. Thus $q = k^2$ and $s = k^2 - (2k-1) = (k-1)^2$ and $\nu = n$. The $Q$ in this Section corresponds to the $P_2$ from Section 3.

For each $w \in \mathbb{Z}^s$ define $z_w := \sum_{\alpha\le s}w_\alpha V_\alpha$, a point of $\mathbb{Z}^q$. The key idea in Haberman's argument is that the space $\mathcal{L}$ is partitioned into disjoint boxes
\[
B_w := \Bigl\{\sum_{i\le s}t_i V_i : \lfloor t_i\rfloor = w_i\Bigr\} = z_w\oplus B_0\qquad\text{for }w \in \mathbb{Z}^s,
\]
each containing the same number, $\kappa_V$, of lattice points from $\mathbb{Z}^q$. Assumption (10) ensures that $\kappa_V > 0$.

Theorem 3. Suppose $g$ is a uniformly continuous, increasing function. Then for each $\theta > 1$ there exists a $\delta > 0$ and a subset $\mathcal{L}_\delta$ of $\mathcal{L}$ for which
(i) $\{x \in D^{-1}\mathcal{L} : |x| \le \delta\sqrt\nu\} \subseteq D^{-1}\mathcal{L}_\delta$;
(ii) $Q\exp\bigl(g(|X|^2)\bigr)\{X \in D^{-1}\mathcal{L}_\delta\} \le \theta\,N_\lambda\exp\bigl(g(\theta^2|x|^2)\bigr)\{x \in D^{-1}\mathcal{L}_\delta/\theta\}$.

The proof of the Theorem is given at the end of this Section, as the culmination of a sequence of lemmas based on the elementary facts from Section 2. We first show that most of the contributions to the $Q$ and $N_\lambda$ probabilities come from a large, bounded subset of $\mathcal{L}$.

Lemma 4. For each $\delta > 0$ define $W_\delta := \{w \in \mathbb{Z}^s : \max_\alpha|w_\alpha| \le \delta\nu\}$ and $\mathcal{L}_\delta := \cup_{w\in W_\delta}B_w$. There exists a constant $C_\delta > 0$ for which
\[
P\{Y \in \lambda\oplus(\mathcal{L}\setminus\mathcal{L}_\delta)\} + N_\lambda\bigl(D^{-1}(\mathcal{L}\setminus\mathcal{L}_\delta)\bigr) = O(e^{-C_\delta\nu}).
\]

Proof. If $y \in \lambda\oplus(\mathcal{L}\setminus\mathcal{L}_\delta)$ then $y - \lambda \in z_w\oplus B_0$ for some $w$ with $\max_\alpha|w_\alpha| > \delta\nu$, which implies
\[
\sqrt q\,\max_i|y_i - \lambda_i| \ge |y - \lambda| \ge |z_w| - \mathrm{diam}(B_0) \ge C_1\delta\nu - C_4.
\]
Define $\delta_0 := C_1\delta/(2\sqrt q)$. When $\nu$ is large enough we have $(C_1\delta\nu - C_4)/\sqrt q > \delta_0\nu \ge \delta_0\max_i\lambda_i$, so that
\[
P\{Y \in \lambda\oplus(\mathcal{L}\setminus\mathcal{L}_\delta)\} \le \sum_i P\{|Y_i - \lambda_i| > \delta_0\lambda_i\}.
\]
Invoke Lemma 1 to bound the $i$th summand by $2\exp\bigl(-\lambda_i\delta_0^2/2 + O(\delta_0^3\lambda_i)\bigr)$. With a possible decrease in $\delta_0$ we can ensure that the term $\lambda_i\delta_0^2/2$ is at least twice the other contribution to the exponent.

Similarly, if $x \in D^{-1}(\mathcal{L}\setminus\mathcal{L}_\delta)$ and $\nu$ is large enough then $|x| > \delta_0\sqrt\nu$, and the contribution from $N_\lambda$ is bounded by a sum of tail probabilities for the standard normal. □
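As an aside (ours): the dimension bookkeeping for the two-way-table specialisation, rank $2k-1$ for the row/column constraint tables and hence $s = q - (2k-1) = (k-1)^2$, is easy to confirm mechanically. A minimal sketch (numpy):

```python
import numpy as np

k = 5
q = k * k

# The 2k tables with ones in a single row or column, flattened to vectors in R^q.
tables = []
for i in range(k):
    T = np.zeros((k, k)); T[i, :] = 1.0; tables.append(T.ravel())
for j in range(k):
    T = np.zeros((k, k)); T[:, j] = 1.0; tables.append(T.ravel())

rank = np.linalg.matrix_rank(np.vstack(tables))
print("rank of the 2k constraint tables:", rank, " (expect 2k - 1 =", 2 * k - 1, ")")
print("s = q - rank =", q - rank, " (expect (k-1)^2 =", (k - 1) ** 2, ")")
```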
Next we use Lemma 1 to get good pointwise approximations for $P\{Y = \ell\}$ when $|\ell - \lambda|$ is not too large.

Lemma 5. For each $\theta > 1$ there exists a $\delta > 0$ such that, for all $\ell = \lambda + Dx$ in $\mathbb{N}_0^q$ for which $\max_i\lambda_i^{-1}|\ell_i - \lambda_i| \le \delta$,
\[
\theta^{-1}\phi(\theta x) \le \nu^{q/2}P\{Y = \ell\}/\gamma(\lambda) \le \theta\,\phi(x/\theta),
\]
where $\gamma(\lambda) := (2\pi)^{s/2}\prod_i(2\pi\lambda_i/\nu)^{-1/2}$, a factor that stays bounded away from zero and infinity as $\nu \to \infty$.

Proof. From Lemma 1,
\[
\Bigl(\prod_i\sqrt{2\pi\lambda_i}\Bigr)P\{Y = \ell\} = \prod_i\exp\bigl(-\tfrac12 x_i^2 + r_i\bigr)
\]
where, for some constant $C_3$, $|r_i| \le C_3(|x_i| + |x_i|^3)/\sqrt\nu \le C_3\delta(1 + x_i^2)$. The asserted inequalities follow if $\delta$ is small enough. □

Next we sum the pointwise approximations to get bounds for the probability that $Y$ lies in one of the boxes that partition $\lambda\oplus\mathcal{L}$. The sum for the box $\lambda\oplus B_w$ runs over the lattice points of the form $\lambda + Dx$ with $x$ in the set $\mathcal{X}_w := \{x \in D^{-1}B_w : \lambda + Dx \in \mathbb{N}_0^q\}$.

Lemma 6. For each $\theta > 1$ there exists a $\delta > 0$ such that, for all $w$ in $W_\delta$ and all $\nu$ large enough,
\[
\theta^{-1}N_\lambda\bigl(\theta D^{-1}B_w\bigr) \le \nu^{(q-s)/2}P\{Y \in \lambda\oplus B_w\}/\beta(\lambda) \le \theta\,N_\lambda\bigl(D^{-1}B_w/\theta\bigr),
\]
where $\beta(\lambda)$ is a factor that stays bounded away from zero and infinity as $\nu \to \infty$.

Proof. As the proofs for the two inequalities are similar, we consider only the upper bound. Define $x_w := D^{-1}z_w$. By inequality (9) we have $|z_w| \le C_2\delta\nu$ and hence $|x_w| \le C_4\delta\sqrt\nu$ for some constant $C_4$. Similarly, for each $y = \lambda + Dx$ in $\lambda\oplus B_w$ we have $|y - \lambda - z_w|$ bounded by a constant, which implies $|x - x_w| \le C_5/\sqrt\nu$ and hence
\[
\bigl||x|^2 - |x_w|^2\bigr| \le \delta_0 := C_5^2/\nu + 2(C_5/\sqrt\nu)C_4\delta\sqrt\nu.
\]
It follows that for each $\epsilon > 0$ and $\sigma$ close enough to $1$,
\[
\sup\bigl\{|\phi(x/\sigma)/\phi(x_w/\sigma) - 1| : x \in D^{-1}B_w\bigr\} < \epsilon
\]
if $\nu$ is large enough and $\delta$ is small enough. Taking $\sigma$ equal to the $\theta$ from Lemma 5, we then have
\[
P\{Y \in \lambda\oplus B_w\} = \sum_{x\in\mathcal{X}_w}P\{Y = \lambda + Dx\} \le \theta\gamma(\lambda)\nu^{-q/2}\sum_{x\in\mathcal{X}_w}\phi(x/\theta) \le \theta\gamma(\lambda)\nu^{-q/2}\kappa_V(1+\epsilon)\phi(x_w/\theta).
\]
Similarly,
\[
N_\lambda\bigl(D^{-1}B_w/\theta\bigr) = \int\{\theta t \in D^{-1}B_w\}\phi(t)\,m_\lambda(dt) = \theta^{-s}\int\{x \in D^{-1}B_w\}\phi(x/\theta)\,m_\lambda(dx) \ge \theta^{-s}\phi(x_w/\theta)(1-\epsilon)\,m_\lambda(D^{-1}B_0).
\]
The invariance properties of Lebesgue measure imply the existence of some function $\mu(\lambda)$, bounded away from zero and infinity as $\nu$ tends to infinity, for which $m_\lambda(D^{-1}B_0) = \nu^{-s/2}\mu(\lambda)$. Thus
\[
P\{Y \in \lambda\oplus B_w\} \le \theta^{s+1}\,\frac{1+\epsilon}{1-\epsilon}\,\nu^{-(q-s)/2}\,\frac{\kappa_V\gamma(\lambda)}{\mu(\lambda)}\,N_\lambda\bigl(D^{-1}B_w/\theta\bigr).
\]
Choose $\epsilon$ small enough, and replace $\theta$ by a value closer to $1$, to get the upper half of the asserted inequality with $\beta(\lambda) := \kappa_V\gamma(\lambda)/\mu(\lambda)$. □

Corollary 7. $P\{Y \in \lambda\oplus\mathcal{L}\} = \nu^{-(q-s)/2}\bigl(\beta(\lambda) + o(1)\bigr)$.

Proof. From Lemmas 4 and 6, for each $\theta > 1$,
\[
P\{Y \in \lambda\oplus\mathcal{L}\} = P\{Y \in \lambda\oplus(\mathcal{L}\setminus\mathcal{L}_\delta)\} + \sum_w\{w \in W_\delta\}\,P\{Y \in \lambda\oplus B_w\} \le O(e^{-C_\delta\nu}) + \theta\,\beta(\lambda)\,\nu^{-(q-s)/2}\sum_w\{w \in W_\delta\}\,N_\lambda\bigl(D^{-1}B_w/\theta\bigr) \le \theta\,\nu^{-(q-s)/2}\bigl(\beta(\lambda) + o(1)\bigr),
\]
since the disjoint sets $D^{-1}B_w/\theta$ have total $N_\lambda$ measure at most one. The argument for the lower bound is similar. □

Corollary 8. For all $\nu$ large enough,
\[
\theta^{-1}N_\lambda\bigl(\theta D^{-1}B_w\bigr) \le Q\{Y \in \lambda\oplus B_w\} \le \theta\,N_\lambda\bigl(D^{-1}B_w/\theta\bigr)\qquad\text{for all }w \in W_\delta.
\]
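The pointwise approximation underlying Lemma 5 comes from Lemma 1(i), and its accuracy is easy to inspect numerically. A quick sketch (Python standard library only; names ours):

```python
from math import exp, log, lgamma, pi, sqrt

def h(t):
    return (1 + t) * log(1 + t) - t

lam = 500.0
for ell in (440, 470, 500, 530, 570):
    u = (ell - lam) / lam
    exact = exp(-lam + ell * log(lam) - lgamma(ell + 1))        # P{W = ell}
    approx = exp(-lam * h(u) - 0.5 * log(1 + u)) / sqrt(2 * pi * lam)
    print(ell, f"exact={exact:.6e}", f"approx={approx:.6e}", f"ratio={exact/approx:.5f}")
```

The printed ratios sit close to one, with discrepancy of order $1/\ell$, as Lemma 1(i) predicts.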
We now have all the facts needed to prove Theorem 3. The argument is a slight modification of the method used to prove Lemma 6. Start with the $\delta$ and $\mathcal{L}_\delta$ from Lemma 4; assertion (i), modulo an unimportant constant, was established at the start of the proof of that Lemma. Define $f(x) := \exp\bigl(g(|x|^2)\bigr)$. From the proof of Lemma 6 we know that $\bigl||x|^2 - |x_w|^2\bigr| \le \delta_0$. By uniform continuity of $g$, if $\delta$ is small enough we then have
\[
\bigl|g(|x|^2/\sigma^2) - g(|x_w|^2/\sigma^2)\bigr| < \epsilon\qquad\text{for all }x \in D^{-1}B_w\text{ and all }\sigma\approx 1,
\]
and hence
\[
e^{-\epsilon}f(x_w/\sigma) \le f(x/\sigma) \le e^\epsilon f(x_w/\sigma)\qquad\text{for all }x \in D^{-1}B_w\text{ and all }\sigma\approx 1.
\]
Use the bounds on $f$ over $D^{-1}B_w$ to deduce that
\[
Qf(X)\{Y \in \lambda\oplus B_w\} \le e^\epsilon f(x_w)\,Q\{Y \in \lambda\oplus B_w\} \le e^\epsilon f(\theta x_w)\,\theta\,N_\lambda\bigl(D^{-1}B_w/\theta\bigr) \le e^{2\epsilon}\theta\,N_\lambda f(\theta x)\{x \in D^{-1}B_w/\theta\},
\]
using Corollary 8 and the fact that $g$ is increasing for the middle inequality. Sum over $w$ in $W_\delta$ to complete the argument. □

5. Proof of Lemma 2

At a key step in the argument we will need the inequality
(11) $\qquad \psi(t) \ge \dfrac{2\log(1+2t)}{1+2t}$ for all $t \ge 0$,
for which, unfortunately, we have no direct analytic proof. The assertion is trivially true near the origin, because the lower bound tends to zero as $t$ tends to zero while $\psi(0) = 1$. For large $t$ the ratio of $\psi(t)$ to the lower bound tends to $2$. For intermediate values we have only a proof based on an analytic bound on derivatives together with numerical calculation on a suitably fine grid. It would be satisfying to have a completely analytic proof of (11).
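The grid calculation just mentioned is straightforward to reproduce. A minimal sketch (numpy; the function name psi is ours), which checks the gap in (11) on a fine grid but of course proves nothing between grid points:

```python
import numpy as np

def psi(t):
    # h(t) = (1+t)log(1+t) - t = (t^2/2) psi(t), with psi(0) = 1 by continuity.
    t = np.asarray(t, dtype=float)
    return 2 * ((1 + t) * np.log1p(t) - t) / t ** 2

t = np.linspace(1e-6, 200.0, 2_000_001)          # fine grid over t > 0
gap = psi(t) - 2 * np.log1p(2 * t) / (1 + 2 * t)
print("smallest gap psi(t) - 2 log(1+2t)/(1+2t) on the grid:", gap.min())
```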
Define $g_k(s) := h(s) - \rho_k s^2$. We need to show that the function $G_k(u) := \sum_{j\le k}g_k(u_j)$ is nonnegative on the constraint set. Suppose the minimum is achieved at $t = (t_1, \ldots, t_k)$. Without loss of generality we may suppose $-1 \le t_1 \le t_2 \le \cdots \le t_k \le k-1$. We cannot have $t_1 = -1$, because $h'(s) = \log(1+s)$ diverges as $s \to -1$. Indeed, with $t_1 = -1$ we would have
\[
g_k(t_1+\epsilon) + g_k(t_k-\epsilon) - g_k(t_1) - g_k(t_k) = \epsilon\log\epsilon + O(\epsilon),
\]
which is negative for small $\epsilon > 0$, contradicting minimality (the perturbation preserves the constraints). It then follows that $t_k < k-1$, for otherwise the constraint $\sum_j t_j = 0$ would force $t_j = -1$ for all $j < k$.

Use Lagrange multipliers (or argue directly regarding the first-order effects of perturbations $\epsilon$ with $\sum_j\epsilon_j = 0$) to deduce the existence of some constant $\theta$ for which $g_k'(t_j) = \theta$ for all $j$. Note that $g_k'(s) = \log(1+s) - 2\rho_k s$ is concave (because $g_k''(s)$ is decreasing) with $g_k'(0) = 0$ and $g_k''(0) = 1 - 2\rho_k > 0$. It follows that $\theta \le 0$ and that there are numbers $-1 < a_\theta \le 0 \le b_\theta < k-1$ with $g_k'(a_\theta) = \theta = g_k'(b_\theta)$, such that $t_j$ equals $a_\theta$ for $j \le k-r$ and equals $b_\theta$ otherwise. That is,
\[
(k-r)a_\theta + rb_\theta = 0\qquad\text{and}\qquad G_k(t) = rg_k(b_\theta) + (k-r)g_k(a_\theta).
\]
Thus it suffices for us to show that the functions
\[
M_{r,k}(b) := rg_k(b) + (k-r)\,g_k\bigl(-rb/(k-r)\bigr)\qquad\text{for }0 \le b \le (k-r)/r
\]
are nonnegative for $r = 1, 2, \ldots, k-2$. Note first that $g_k(s) \ge 0$ for $-1 \le s \le 0$, because $\psi(s) \ge \psi(0) = 1 > 2\rho_k$ there. For $r \ge 2$ we have $0 \le b \le (k-2)/2$, and inequality (11) combined with the monotonicity of $\psi$ shows that $g_k(b)$ is also nonnegative:
\[
g_k(b) = b^2\bigl(\tfrac12\psi(b) - \rho_k\bigr) \ge b^2\Bigl(\tfrac12\psi\bigl(\tfrac{k-2}{2}\bigr) - \rho_k\Bigr) \ge b^2\Bigl(\frac{\log(k-1)}{k-1} - \frac{\log(k-1)}{k-1}\Bigr) = 0,
\]
the last inequality by (11) with $1+2t = k-1$. The function $M_{r,k}$ is then a sum of nonnegative functions on $[0, (k-r)/r]$. It remains only to consider the case where $r$ equals $1$.

To simplify notation, write $k_1$ for $k-1$ and abbreviate $M_{1,k}$ to $M_k$. That is,
\[
M_k(b) = h(b) + k_1 h(-b/k_1) - \rho_k\bigl(b^2 + k_1(b/k_1)^2\bigr) = (1+b)\log(1+b) + (k_1-b)\log(1-b/k_1) - k\rho_k b^2/k_1,
\]
whence
\[
M_k'(b) = \log\frac{1+b}{1-b/k_1} - \frac{2k\rho_k b}{k_1},\qquad M_k''(b) = \frac{k}{(1+b)(k_1-b)} - \frac{2k\rho_k}{k_1}.
\]
Notice that $M_k''(b) \ge 0$ except on the interval $I_k := (b_k, b_k')$ in which the inequality $2(1+b)(k_1-b) > k_1/\rho_k$ holds. For $k = 3$ or $4$ the interval $I_k$ is empty; the functions $M_3$ and $M_4$ are convex, and they achieve their minima of zero at $b = 0$ because $M_k'(0) = 0$. For $k \ge 5$ the interval $I_k$ is nonempty, so the derivative $M_k'$ has a local maximum at $b = b_k$ and a local minimum at $b = b_k'$. For $k = 5$ we have $b_5' \approx 2.19$ and $M_5'(b_5') \approx 0.055 > 0$. Thus $M_5$ is an increasing function on $[0, 4]$, achieving its minimum value of zero at $b = 0$.

[Figure 1: plots of $M_k(b)$ for $k = 4, 5, 6, 10, 20$, and $30$. The vertical lines mark off the intervals $I_k$ where the functions are concave.]

For $k \ge 6$ a more delicate analysis is required. The function $M_k$ is concave on the segment $I_k$ and convex on each of the segments $[0, b_k]$ and $[b_k', k_1]$. The global minimum is therefore achieved either at $b = 0$, with $M_k(0) = 0$, or at the local minimum $b^* \in (b_k', k_1)$ where $M_k'(b^*) = 0$ and $M_k''(b^*) > 0$. From the change in sign of the derivative,
\[
M_k'(k_1-1) = \frac{2\log k_1}{k_1^2} > 0\quad\text{for all }k,\qquad M_k'(k_1-2) = \frac{2(k_1+2)\log k_1}{k_1^2} + \log\frac{k_1-1}{2k_1} < 0\quad\text{for }k \ge 6,
\]
we deduce that $k_1-2 < b^* < k_1-1$. The convexity of $M_k$ on $[b_k', k_1]$ then gives a linear lower bound:
\[
M_k(b^*) \ge M_k(k_1-1) + (b^* - k_1 + 1)M_k'(k_1-1) \ge \frac{k_1-1}{k_1^2}\log k_1 - \frac{2\log k_1}{k_1^2} \ge 0\qquad\text{for }k \ge 4.
\]
It follows that $M_k$ is nonnegative also for $k \ge 6$. □

References

Achlioptas, D. and A. Naor (2005). The two possible values of the chromatic number of a random graph. Annals of Mathematics 162, 1335-1351.
Alon, N. and J. H. Spencer (2000). The Probabilistic Method (second ed.). Wiley.
Haberman, S. J. (1974). The Analysis of Frequency Data. University of Chicago Press.
Pollard, D. (2001). A User's Guide to Measure Theoretic Probability. Cambridge University Press.

Statistics and Electrical Engineering Departments, Yale University
E-mail address: firstname.lastname@yale.edu for each author
URL: http://www.stat.yale.edu/~ypng/
