Robust Estimation of Mean Values
Xinjia Chen

November 2008

Abstract

In this paper, we develop a computational approach for estimating the mean value of a quantity in the presence of uncertainty. We demonstrate that, under some mild assumptions, the upper and lower bounds of the mean value are efficiently computable via a sample reuse technique, whose computational complexity is shown to possess a Poisson distribution.

1 Introduction

In many situations, it is desirable to estimate the mean value of a scalar quantity $Q$ which is a function of independent random vectors $V$ and $\Delta$ such that the distribution of $V$ is known and the distribution of $\Delta$ is unknown [4]. Namely, we are interested in estimating the expectation of $Q = q(V, \Delta)$, where $q(\cdot, \cdot)$ is a multivariate function. From modeling considerations, it is reasonable to assume that $\Delta$ is bounded in norm $\|\cdot\|$, and that its probability density function $f_\Delta(\cdot)$ is radially symmetric and nonincreasing in the norm, in the following sense:

(i) The norm $\|\Delta\|$ of $\Delta$ is no greater than a certain value $r$, i.e., $\|\Delta\| \le r$;
(ii) For any realization $\Delta$ of $\Delta$, $f_\Delta(\Delta)$ depends only on $\|\Delta\|$, the norm of $\Delta$;
(iii) For any $\Delta_1$ and $\Delta_2$ such that $\|\Delta_1\| < \|\Delta_2\|$, $f_\Delta(\Delta_1) \ge f_\Delta(\Delta_2)$.

Such assumptions have been proposed by Barmish and Lagoa [1] in the context of robustness analysis of control systems, where $\Delta$ is referred to as "uncertainty" because of the lack of knowledge of its distribution. In this paper, we shall focus on the estimation of the expectation $E[Q] = E[q(V, \Delta)]$ based on assumptions (i), (ii) and (iii). Such a problem is referred to as robust estimation, since the exact distribution of $\Delta$ is not available. In the special case that the maximum norm $r$ of $\Delta$ equals 0, the robust estimation problem reduces to a conventional estimation problem.
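To make the computational setting concrete, the following sketch (not from the paper; the choices of $q$ and the distribution of $V$ are hypothetical) shows the basic Monte Carlo primitive used throughout: drawing $\Delta$ uniformly from a norm ball and averaging $q(V, \Delta)$.

```python
import math
import random

def uniform_ball(d, radius, rng):
    """Draw a point uniformly from the Euclidean ball of given radius in R^d:
    a Gaussian direction, with the radius scaled by U^(1/d)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    scale = radius * rng.random() ** (1.0 / d) / math.sqrt(sum(x * x for x in g))
    return [x * scale for x in g]

def estimate_mean(q, sample_v, d, radius, n, rng):
    """Plain Monte Carlo estimate of E[q(V, Delta)] with Delta uniform over the ball."""
    total = 0.0
    for _ in range(n):
        v = sample_v(rng)
        delta = uniform_ball(d, radius, rng)
        total += q(v, delta)
    return total / n

# Hypothetical example: V standard normal in R^2, q the squared norm of V + Delta.
rng = random.Random(0)
q = lambda v, delta: sum((a + b) ** 2 for a, b in zip(v, delta))
sample_v = lambda r: [r.gauss(0.0, 1.0), r.gauss(0.0, 1.0)]
est = estimate_mean(q, sample_v, d=2, radius=1.0, n=20000, rng=rng)
```

For this hypothetical $q$, the estimate settles near $E\|V\|^2 + E\|\Delta\|^2 = 2 + \rho^2/2 = 2.5$ at $\rho = 1$, which gives a quick sanity check on the sampler.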
Instead of seeking the exact value of $E[Q]$, which is obviously impossible, we aim at obtaining upper and lower bounds for $E[Q]$. It is intuitive that the gap between the upper and lower bounds should be increasing with respect to $r$. Since the relation between $Q$ and $V, \Delta$ can be fairly complicated, the Monte Carlo estimation method is the unique and powerful approach.

∗ The author was previously with Louisiana State University at Baton Rouge, LA 70803, USA, and is now with the Department of Electrical Engineering, Southern University and A&M College, Baton Rouge, LA 70813, USA. Email: chenxinjia@gmail.com

The remainder of the paper is organized as follows. In Section 2, we derive upper and lower bounds for $E[Q]$ based on assumptions (i), (ii) and (iii). In Section 3, we propose a Monte Carlo method for the evaluation of the bounds of $E[Q]$. In particular, we introduce a sample reuse method to substantially reduce the computational complexity. In Section 4, we investigate the computational complexity of the Monte Carlo method implemented with the principle of sample reuse. Section 5 is the conclusion.

2 Bounds of Expectation

In this section, we shall derive upper and lower bounds of $E[Q] = E[q(V, \Delta)]$ based on the assumptions described in Section 1. For this purpose, we have the following fundamental result, which is a slight generalization of the uniform principle proposed by Barmish and Lagoa [1].

Theorem 1  Let $\Delta^u_\rho$ be a random vector with a uniform distribution over $\{\Delta : \|\Delta\| \le \rho\}$. Define
$$M(\rho) = E[q(V, \Delta^u_\rho)], \qquad \underline{M}(r) = \inf_{0 < \rho \le r} M(\rho), \qquad \overline{M}(r) = \sup_{0 < \rho \le r} M(\rho).$$
Then, $\underline{M}(r) \le E[Q] \le \overline{M}(r)$.

See Appendix A for a proof. To evaluate the bounds, one can choose grid points $r = \rho_1 > \rho_2 > \cdots > \rho_m > 0$. Let $B_\ell = \{\Delta : \|\Delta\| \le \rho_\ell\}$. For $\ell = 1, \cdots, m$, estimate $M(\rho_\ell)$ as the empirical mean
$$\frac{\sum_{i=1}^N q(V_i, X_{\ell,i})}{N}$$
where $V_i, X_{\ell,i},\ i = 1, \cdots, N$ are mutually independent random variables such that $V_1, \cdots, V_N$ are i.i.d.
random samples of $V$ and $X_{\ell,1}, \cdots, X_{\ell,N}$ are i.i.d. random samples uniformly distributed over $B_\ell$. Clearly, the total number of simulations is $Nm$ for estimating $M(\rho_\ell),\ \ell = 1, \cdots, m$. A major problem with this approach is that the computational complexity can be extremely high, since the number of grid points $m$ is typically a very large number. To overcome such a problem, we shall develop a sample reuse technique in the next section.

3 Sample Reuse

In this section, we shall explore the idea of sample reuse to reduce the computational complexity. The sample reuse method has been proposed by Chen et al. [2, 3] for the robustness analysis of control systems. The idea of sample reuse is to start simulation from the largest set $B_1$; if a sample also belongs to smaller subsets, the experimental result is saved for later use in the smaller sets. As can be seen from the last section, a conventional approach would require a total of $Nm$ simulations. However, due to sample reuse, the actual number of experiments for set $B_\ell$ is a random number $n_\ell$, which is usually much less than $N$. Hence, this strategy saves a significant amount of computational effort.

In order to provide a precise description of the principle of sample reuse, we assume that all random variables are defined in the same probability space $(\Omega, \mathcal{F}, \Pr)$. We shall introduce a function $G$, referred to as the sample reuse function, as follows. Let $A \supset B$ and $m \le n$. Let $X_1, \cdots, X_m$ be i.i.d. samples uniformly distributed over $A$, and let $Y_1, \cdots, Y_n$ be i.i.d. samples uniformly distributed over $B$. Define the reusable sample size $k$ such that $k(\omega)$ is the number of elements of $\{X_i(\omega) \in B : i = 1, \cdots, m\}$ for any $\omega \in \Omega$.
Define random variables $Z_1, \cdots, Z_n$ such that, for any $\omega \in \Omega$,
$$Z_\ell(\omega) = \begin{cases} X_{i_\ell}(\omega) & \text{for } 1 \le \ell \le k(\omega), \\ Y_\ell(\omega) & \text{for } k(\omega) < \ell \le n, \end{cases}$$
where $i_\ell,\ 1 \le \ell \le k(\omega)$, are the indexes of the elements of $\{X_i(\omega) \in B : i = 1, \cdots, m\}$ such that $i_\ell$ is increasing with respect to $\ell$. This process of generating $Z_1, \cdots, Z_n$ from $X_1, \cdots, X_m$ and $Y_1, \cdots, Y_n$ is denoted by
$$(Z_1, \cdots, Z_n;\ k) = G(X_1, \cdots, X_m;\ Y_1, \cdots, Y_n).$$
With regard to the distribution of $Z_1, \cdots, Z_n$, we have

Theorem 2  Suppose $X_1, \cdots, X_m$ are independent of $Y_1, \cdots, Y_n$. Then, $Z_1, \cdots, Z_n$ are i.i.d. samples uniformly distributed over $B$.

See Appendix B for a proof. Now we can use $G$ to precisely describe the sample reuse algorithm for estimating $M(\rho_\ell),\ \ell = 1, \cdots, m$. Let $X_{\ell,i},\ i = 1, \cdots, N$ be the random samples uniformly distributed over $B_\ell$ for $\ell = 1, \cdots, m$. Let $Y_{1,i} = X_{1,i}$ for $i = 1, \cdots, N$ and
$$(Y_{\ell,1}, \cdots, Y_{\ell,N};\ k_\ell) = G(Y_{\ell-1,1}, \cdots, Y_{\ell-1,N};\ X_{\ell,1}, \cdots, X_{\ell,N})$$
for $\ell = 2, \cdots, m$. As a result of Theorem 2, we have that, for any $\ell \in \{1, \cdots, m\}$, the random variables $Y_{\ell,i},\ i = 1, \cdots, N$ have the same cumulative distribution as the random variables $X_{\ell,i},\ i = 1, \cdots, N$. This implies that $\frac{1}{N}\sum_{i=1}^N q(V_i, Y_{\ell,i})$ has the same distribution as $\frac{1}{N}\sum_{i=1}^N q(V_i, X_{\ell,i})$ for $\ell = 1, \cdots, m$. Therefore, we can use $\frac{1}{N}\sum_{i=1}^N q(V_i, Y_{\ell,i})$ as an estimator of $M(\rho_\ell)$ for $\ell = 1, \cdots, m$. By virtue of such a sample reuse method, the total number of simulations is reduced from $Nm$ to $N + \sum_{\ell=2}^m n_\ell$, where $n_\ell = N - k_\ell$ for $\ell = 2, \cdots, m$. As will be demonstrated in the next section, this can be a huge reduction of complexity for a large $m$.
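The sample reuse scheme can be sketched as follows (an illustration only, with Euclidean balls; the function $q$ and the distribution of $V$ are hypothetical stand-ins, and a fresh independent sample of $V$ is drawn for each new simulation, which preserves unbiasedness). Each level reuses the saved simulation results of the samples inherited from the previous, larger ball; only the $N - k_\ell$ fresh simulations actually evaluate $q$.

```python
import math
import random

def norm(x):
    return math.sqrt(sum(t * t for t in x))

def uniform_ball(d, radius, rng):
    # Uniform draw from the Euclidean ball: Gaussian direction, radius ~ radius * U^(1/d).
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    s = radius * rng.random() ** (1.0 / d) / norm(g)
    return [t * s for t in g]

def level_estimate(records, rho, N, q, rng, d):
    """One grid level of the sample reuse method: records (delta, q-value) from the
    previous, larger ball that fall inside radius rho are reused together with their
    saved simulation results; only the remaining fresh simulations evaluate q."""
    kept = [rec for rec in records if norm(rec[0]) <= rho]
    n_fresh = N - len(kept)                 # fresh simulations (n_l = N - k_l for l >= 2)
    for _ in range(n_fresh):
        delta = uniform_ball(d, rho, rng)
        v = rng.gauss(0.0, 1.0)             # fresh sample of V for each new simulation
        kept.append((delta, q(v, delta)))
    return kept, n_fresh, sum(r[1] for r in kept) / N

# Hypothetical example: V standard normal, q(v, delta) = (v + delta_1)^2.
rng = random.Random(0)
d, N = 2, 4000
radii = [1.0 - 0.08 * j for j in range(11)]     # rho_1 > rho_2 > ... > rho_m > 0
q = lambda v, delta: (v + delta[0]) ** 2
records, fresh_counts, estimates = [], [], []
for rho in radii:
    records, n_fresh, m_hat = level_estimate(records, rho, N, q, rng, d)
    fresh_counts.append(n_fresh)
    estimates.append(m_hat)
total_simulations = sum(fresh_counts)           # N + sum of n_l, versus N*m naively
```

For this hypothetical $q$, $M(\rho) = 1 + \rho^2/4$, so the first and last estimates settle near 1.25 and 1.01, while the total number of simulations stays far below the naive $Nm$.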
4 Poisson Complexity

Since the total number of simulations for using the sample reuse method to estimate $M(\rho_\ell),\ \ell = 1, \cdots, m$ is $N + \sum_{\ell=2}^m n_\ell$, it is important to investigate the distribution of $\sum_{\ell=2}^m n_\ell$. In this regard, we have the following general result.

Theorem 3  For an arbitrary sequence of nested sets $B_1 \supset B_2 \supset \cdots \supset B_m$ with $\mathrm{vol}(B_1) = V_{\max}$ and $\mathrm{vol}(B_m) = V_{\min}$, the cumulative distribution function of $\sum_{\ell=2}^m n_\ell$ is bounded from below by the cumulative distribution function of a Poisson random variable $P$ with mean $\lambda = N \ln \frac{V_{\max}}{V_{\min}}$. That is, $\Pr\{\sum_{\ell=2}^m n_\ell = 0\} = \Pr\{P = 0\}$ and $\Pr\{\sum_{\ell=2}^m n_\ell \le k\} > \Pr\{P \le k\}$ for any positive integer $k$. Moreover, as the maximum difference of volumes of all consecutive sets tends to zero, $\sum_{\ell=2}^m n_\ell$ converges to $P$ in distribution.

See Appendix C for a proof. It should be noted that the volume of a set $B$, denoted by $\mathrm{vol}(B)$, refers to the Lebesgue measure of $B$ in this paper. As an immediate consequence of Theorem 3, we have
$$\Pr\left\{\sum_{\ell=2}^m n_\ell > 0\right\} = \Pr\{P > 0\}, \qquad \Pr\left\{\sum_{\ell=2}^m n_\ell > k\right\} < \Pr\{P > k\}, \quad k = 1, 2, \cdots$$
which implies that
$$E\left[\sum_{\ell=2}^m n_\ell\right] = \sum_{k=0}^\infty \Pr\left\{\sum_{\ell=2}^m n_\ell > k\right\} < \sum_{k=0}^\infty \Pr\{P > k\} = \lambda = N \ln \frac{V_{\max}}{V_{\min}}.$$
By virtue of Theorem 3, we can derive some simple bounds for the distribution of $\sum_{\ell=2}^m n_\ell$ as follows.

Theorem 4  $\Pr\{\sum_{\ell=2}^m n_\ell \ge k\} \le e^{-\lambda}\left(\frac{\lambda e}{k}\right)^k$ for any number $k > \lambda = N \ln \frac{V_{\max}}{V_{\min}}$. In particular, $\Pr\{\sum_{\ell=2}^m n_\ell \ge e\lambda\} \le e^{-\lambda}$ and $\Pr\{\sum_{\ell=2}^m n_\ell \ge (1+\epsilon)\lambda\} < \exp\left(-\frac{\epsilon^2 \lambda}{4}\right)$ for $0 < \epsilon < 1$.

See Appendix D for a proof. Now we apply Theorem 3 to investigate the density of original samples of $\Delta$. Suppose that the volume of $\{\Delta : \|\Delta\| \le \rho\}$ is proportional to $\rho^d$, where $d$ is the dimension of the set.
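As a numerical illustration (not from the paper), the mean bound $E[\sum_{\ell=2}^m n_\ell] < \lambda$ can be checked directly: each $n_\ell$ is binomial with success probability $1 - v_\ell/v_{\ell-1}$ (see the proof of Theorem 3 in Appendix C), so the expected total is $N\sum_\ell (1 - v_\ell/v_{\ell-1})$, which stays below $N \ln \frac{V_{\max}}{V_{\min}}$ for any grid and approaches it as the grid is refined.

```python
import math

def expected_fresh(N, vols):
    """E[sum n_l] for a decreasing volume grid: each n_l ~ Binomial(N, p_l)
    with p_l = 1 - v_l / v_{l-1}, so the mean is N * sum of p_l."""
    return N * sum(1.0 - vols[i] / vols[i - 1] for i in range(1, len(vols)))

def poisson_tail(lam, k):
    # Pr{P >= k} for a Poisson variable of mean lam, via the complement of the CDF.
    cdf, term = 0.0, math.exp(-lam)
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)

N, vmax, vmin = 100, 100.0, 10.0
lam = N * math.log(vmax / vmin)                 # the Poisson mean of Theorem 3
for m in (5, 50, 500):
    vols = [vmax + (vmin - vmax) * j / (m - 1) for j in range(m)]
    assert expected_fresh(N, vols) < lam        # mean bound, for any grid size
# Theorem 4 sanity check at a small mean, where floating point is safe:
small = 5.0
assert poisson_tail(small, math.ceil(math.e * small)) <= math.exp(-small)
```

The bound follows from $1 - x < -\ln x$ for $x \in (0, 1)$ applied to each volume ratio, which is why refining the grid tightens, but never violates, the inequality.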
Let $N_\rho$ denote the number of original samples included in $\{\Delta : \|\Delta\| \le \rho\}$ when applying the sample reuse method to the interval $[\frac{a}{\kappa}, a]$ of radii. Define the density of samples at radius $\rho$ as
$$D(\rho) = \lim_{\delta \to 0} \frac{E[N_{\rho+\delta} - N_\rho]}{\delta}.$$
Then, we have the following result.

Theorem 5  $D(\rho)$ is equal to $\frac{Nd}{\rho}\left(\frac{\kappa\rho}{a}\right)^d$ for $\rho \in (0, \frac{a}{\kappa}]$ and is less than $\frac{Nd}{\rho}$ for $\rho \in (\frac{a}{\kappa}, a]$.

See Appendix E for a proof. From this theorem, we can obtain an upper bound for the expected number of original samples with norm bounded in $[0, a]$. As can be seen from Theorem 5, the density function is unimodal and achieves its largest value at $\rho = \frac{a}{\kappa}$. The density function is displayed in Figure 1.

[Figure 1: Illustrative example ($N = 100$, $a = 100$, $\lambda = 10$): the density of samples and the bound $Nd/\rho$ versus the radius, for dimensions $d = 1, 2, 3$.]

5 Conclusion

We have proposed an efficient computational approach for estimating the mean value of a random function for which the distributions of the relevant random variables are not completely available. A Monte Carlo method with sample reuse as a key mechanism is established. The associated computational complexity is demonstrated to follow a Poisson distribution.

A Proof of Theorem 1

We follow a similar method to that of Barmish and Lagoa [1]. Let $V$ denote the volume of $B = \{\Delta : \|\Delta\| \le r\}$. We partition the set $B$ into $K$ layers of equal volume $\frac{V}{K}$ such that the $k$-th layer is $L_k = \{\Delta : r_{k-1} < \|\Delta\| \le r_k\}$ with $0 = r_0 < r_1 < r_2 < \cdots < r_K = r$. Then, the density function can be expressed as
$$f_\Delta(\Delta) \approx \sum_{k=1}^K I_k(\Delta)\, \lambda_k$$
where $\lambda_k,\ k = 1, \cdots, K$ satisfy
$$\frac{V}{K} \sum_{k=1}^K \lambda_k = 1, \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_K \ge 0 \tag{1}$$
and $I_k(\cdot)$ is the indicator function such that $I_k(\Delta) = 1$ if $\Delta$ falls into the $k$-th layer $L_k$ and $I_k(\Delta) = 0$ otherwise. Let $f_V(\cdot)$ denote the density function of $V$.
Since $V$ and $\Delta$ are independent, we have
$$E[q(V, \Delta)] \approx \int_{\{(v,\Delta) : \|\Delta\| \le r\}} q(v, \Delta) f_V(v)\, f_\Delta(\Delta)\, dv\, d\Delta = \int_{\{(v,\Delta) : \|\Delta\| \le r\}} q(v, \Delta) f_V(v) \left[\sum_{k=1}^K I_k(\Delta)\lambda_k\right] dv\, d\Delta = \sum_{k=1}^K \alpha_k \lambda_k,$$
where $\alpha_k = \int_{\{(v,\Delta) : \|\Delta\| \le r\}} q(v, \Delta)\, I_k(\Delta)\, f_V(v)\, dv\, d\Delta$. Therefore, the upper and lower bounds of $E[q(V, \Delta)]$ correspond to the maximum and minimum of the linear program: maximize or minimize $\sum_{k=1}^K \alpha_k \lambda_k$ subject to constraint (1). From convex analysis, the maximum and minimum of this linear program are achieved at extreme points of the form
$$\lambda_k = \begin{cases} \frac{K}{jV} & \text{for } 1 \le k \le j, \\ 0 & \text{for } j < k \le K. \end{cases}$$
As the number of layers $K$ tends to infinity, the summation $\sum_{k=1}^K I_k(\Delta)\lambda_k$, which is associated with the extreme point $(\lambda_1, \cdots, \lambda_K)$, tends to a uniform distribution. This justifies the theorem.

B Proof of Theorem 2

Let $S_\ell \subseteq B$ for $\ell = 1, \cdots, n$. Define $D = \{1, \cdots, m\}$ and $I_s = \{(i_1, \cdots, i_s) : i_1 < \cdots < i_s;\ i_\ell \in D,\ \ell = 1, \cdots, s\}$. Then,
$$\Pr\{Z_\ell \in S_\ell,\ \ell = 1, \cdots, n\} = \sum_{s=0}^m \sum_{(i_1, \cdots, i_s) \in I_s} \Pr\{X_{i_\ell} \in S_\ell,\ \ell = 1, \cdots, s;\ X_j \notin B,\ j \in D \setminus \{i_1, \cdots, i_s\}\} \times \Pr\{Y_\ell \in S_\ell,\ \ell = s+1, \cdots, n\}.$$
For simplicity of notation, we let $V_{S_\ell} = \mathrm{vol}(S_\ell)$, $V_A = \mathrm{vol}(A)$ and $V_B = \mathrm{vol}(B)$. Note that $\Pr\{Y_\ell \in S_\ell,\ \ell = s+1, \cdots, n\} = \prod_{\ell=s+1}^n \frac{V_{S_\ell}}{V_B}$ and
$$\Pr\{X_{i_\ell} \in S_\ell,\ \ell = 1, \cdots, s;\ X_j \notin B,\ j \in D \setminus \{i_1, \cdots, i_s\}\} = \left(\frac{V_A - V_B}{V_A}\right)^{m-s} \prod_{\ell=1}^s \frac{V_{S_\ell}}{V_A} = \left(\frac{V_B}{V_A}\right)^s \left(1 - \frac{V_B}{V_A}\right)^{m-s} \prod_{\ell=1}^s \frac{V_{S_\ell}}{V_B}.$$
Since there are $\binom{m}{s}$ elements in $I_s$, we have
$$\Pr\{Z_\ell \in S_\ell,\ \ell = 1, \cdots, n\} = \sum_{s=0}^m \binom{m}{s} \left(\frac{V_B}{V_A}\right)^s \left(1 - \frac{V_B}{V_A}\right)^{m-s} \prod_{\ell=1}^s \frac{V_{S_\ell}}{V_B} \prod_{\ell=s+1}^n \frac{V_{S_\ell}}{V_B} = \prod_{\ell=1}^n \frac{V_{S_\ell}}{V_B} \sum_{s=0}^m \binom{m}{s} \left(\frac{V_B}{V_A}\right)^s \left(1 - \frac{V_B}{V_A}\right)^{m-s} = \prod_{\ell=1}^n \frac{V_{S_\ell}}{V_B}.$$
This concludes the proof of the theorem.

C Proof of Theorem 3

We need some preliminary results.

Lemma 1  Let $N_1 \le N_2 \le \cdots \le N_m$.
For $\ell = 1, \cdots, m$, let $v_\ell = \mathrm{vol}(B_\ell)$ and let $X_{\ell,i},\ i = 1, \cdots, N_\ell$ be i.i.d. random samples uniformly distributed over $B_\ell$. Let $Y_{1,i} = X_{1,i}$ for $i = 1, \cdots, N_1$ and
$$(Y_{\ell,1}, \cdots, Y_{\ell,N_\ell};\ k_\ell) = G(Y_{\ell-1,1}, \cdots, Y_{\ell-1,N_{\ell-1}};\ X_{\ell,1}, \cdots, X_{\ell,N_\ell})$$
for $\ell = 2, \cdots, m$. Define $\mathbf{n}_\ell = N_\ell - k_\ell$ for $\ell = 2, \cdots, m$. Then,
$$\Pr\{\mathbf{n}_\ell = n_\ell,\ \ell = 2, \cdots, m\} = \prod_{\ell=2}^m B\!\left(N_\ell - n_\ell,\ N_{\ell-1},\ \frac{v_\ell}{v_{\ell-1}}\right)$$
for $N_\ell - N_{\ell-1} \le n_\ell \le N_\ell$ and $2 \le \ell \le m$, where $B(k, n, p) = \binom{n}{k} p^k (1-p)^{n-k}$.

Proof. We use induction. First, it is easy to show that the lemma is true for $m = 2$. Next, we assume that the lemma is true for $m - 1$ and show that it is also true for $m$. Let $\Pr\{(k_1, \cdots, k_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\}$ denote the probability that, among the $N_1$ samples generated from the biggest set $B_1$, there are $k_\ell$ samples falling into $B_\ell$ for $\ell = 1, 2, \cdots, m$. Let $P_m\{(n_2, \cdots, n_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\}$ denote the probability of the event $\{\mathbf{n}_\ell = n_\ell,\ \ell = 2, \cdots, m\}$ associated with the application of the sample reuse method to sets $B_\ell,\ \ell = 1, \cdots, m$ with required sample sizes $N_1 \le N_2 \le \cdots \le N_m$. Let $P_{m-1}\{(n_3, \cdots, n_m), (N_2 - k_2, \cdots, N_m - k_m), (v_2, \cdots, v_m)\}$ denote the probability of the event $\{\mathbf{n}_\ell = n_\ell,\ \ell = 3, \cdots, m\}$ associated with the application of the sample reuse method to sets $B_\ell,\ \ell = 2, \cdots, m$ with required sample sizes $N_2 - k_2 \le \cdots \le N_m - k_m$. Note that
$$P_m\{(n_2, \cdots, n_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\} = \sum_{k_2 \ge k_3 \ge \cdots \ge k_m \ge 0} \Pr\{(k_1, \cdots, k_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\} \times P_{m-1}\{(n_3, \cdots, n_m), (N_2 - k_2, N_3 - k_3, \cdots, N_m - k_m), (v_2, \cdots, v_m)\}$$
where $n_2 + k_2 = N_2$ and $k_1 = N_1$.
By the mechanism of sample reuse,
$$\Pr\{(k_1, \cdots, k_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\} = \left[\prod_{\ell=2}^m \binom{k_{\ell-1}}{k_\ell} \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell}\right] \left(\frac{v_m}{v_1}\right)^{k_m}.$$
Since $N_\ell$ and $-k_\ell$ are non-decreasing with respect to $\ell$, we have that $N_\ell - k_\ell$ is non-decreasing with respect to $\ell$. Hence, by the induction hypothesis,
$$P_{m-1}\{(n_3, \cdots, n_m), (N_2 - k_2, N_3 - k_3, \cdots, N_m - k_m), (v_2, \cdots, v_m)\} = \prod_{\ell=3}^m B\!\left(N_\ell - n_\ell - k_\ell,\ N_{\ell-1} - k_{\ell-1},\ \frac{v_\ell}{v_{\ell-1}}\right) = \prod_{\ell=3}^m \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell} \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell - k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell - k_{\ell-1} + k_\ell}$$
and consequently,
$$P_m\{(n_2, \cdots, n_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\} = \sum_{k_2 \ge k_3 \ge \cdots \ge k_m \ge 0} \left[\prod_{\ell=2}^m \binom{k_{\ell-1}}{k_\ell} \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell}\right] \left(\frac{v_m}{v_1}\right)^{k_m} \times \prod_{\ell=3}^m \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell} \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell - k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell - k_{\ell-1} + k_\ell}$$
$$= \sum_{k_2 \ge k_3 \ge \cdots \ge k_m \ge 0} \left[\prod_{\ell=2}^m \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] \times \left(\frac{v_m}{v_1}\right)^{k_m} \prod_{\ell=2}^m \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell} \times \prod_{\ell=3}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell - k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell - k_{\ell-1} + k_\ell}.$$
Making use of the relationships $k_1 = N_1$ and $k_2 = N_2 - n_2$, we have
$$\left(\frac{v_m}{v_1}\right)^{k_m} \prod_{\ell=2}^m \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell} \times \prod_{\ell=3}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{-k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{-k_{\ell-1} + k_\ell} = \frac{(v_1 - v_2)^{k_1 - k_2}\, v_2^{k_2}}{v_1^{N_1}} = \left(\frac{v_2}{v_1}\right)^{N_2 - n_2} \left(1 - \frac{v_2}{v_1}\right)^{N_1 - N_2 + n_2}$$
and thus
$$\left(\frac{v_m}{v_1}\right)^{k_m} \prod_{\ell=2}^m \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell} \times \prod_{\ell=3}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell - k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell - k_{\ell-1} + k_\ell}$$
$$= \left(\frac{v_m}{v_1}\right)^{k_m} \prod_{\ell=2}^m \left(\frac{v_{\ell-1} - v_\ell}{v_1}\right)^{k_{\ell-1} - k_\ell} \times \prod_{\ell=3}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{-k_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{-k_{\ell-1} + k_\ell} \times \prod_{\ell=3}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell}$$
$$= \prod_{\ell=2}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell}.$$
On the other hand, if the lemma holds, we have
$$P_m\{(n_2, \cdots, n_m), (N_1, \cdots, N_m), (v_1, \cdots, v_m)\} = \prod_{\ell=2}^m B\!\left(N_\ell - n_\ell,\ N_{\ell-1},\ \frac{v_\ell}{v_{\ell-1}}\right) = \left[\prod_{\ell=2}^m \binom{N_{\ell-1}}{N_\ell - n_\ell}\right] \left[\prod_{\ell=2}^m \left(\frac{v_\ell}{v_{\ell-1}}\right)^{N_\ell - n_\ell} \left(1 - \frac{v_\ell}{v_{\ell-1}}\right)^{N_{\ell-1} - N_\ell + n_\ell}\right].$$
Therefore, to show the lemma, it remains to show
$$\sum_{k_2 \ge k_3 \ge \cdots \ge k_m \ge 0} \left[\prod_{\ell=2}^m \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] = \prod_{\ell=2}^m \binom{N_{\ell-1}}{N_\ell - n_\ell}.$$
Using the relationships $k_1 = N_1$ and $k_2 = N_2 - n_2$, this identity can be reduced to the following identity:
$$\sum_{k_2 \ge k_3 \ge \cdots \ge k_m \ge 0} \left[\prod_{\ell=3}^m \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] = \prod_{\ell=3}^m \binom{N_{\ell-1}}{N_\ell - n_\ell},$$
which can be shown by observing that
$$\sum_{k_2 \ge \cdots \ge k_{m-i} \ge 0} \left[\prod_{\ell=3}^{m-i} \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] = \sum_{k_2 \ge \cdots \ge k_{m-i-1} \ge 0} \left[\prod_{\ell=3}^{m-i-1} \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] \sum_{k_{m-i}=0}^{k_{m-i-1}} \binom{k_{m-i-1}}{k_{m-i}} \binom{N_{m-i-1} - k_{m-i-1}}{N_{m-i} - n_{m-i} - k_{m-i}}$$
$$= \sum_{k_2 \ge \cdots \ge k_{m-i-1} \ge 0} \left[\prod_{\ell=3}^{m-i-1} \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] \binom{N_{m-i-1}}{N_{m-i} - n_{m-i}}$$
for $0 \le i \le m - 4$, and
$$\sum_{k_2 \ge k_3 \ge 0} \left[\prod_{\ell=3}^3 \binom{k_{\ell-1}}{k_\ell} \binom{N_{\ell-1} - k_{\ell-1}}{N_\ell - n_\ell - k_\ell}\right] = \sum_{k_2 \ge k_3 \ge 0} \binom{k_2}{k_3} \binom{N_2 - k_2}{N_3 - n_3 - k_3} = \binom{N_2}{N_3 - n_3}.$$
This completes the proof of the lemma. ✷

Lemma 2  Let $\theta > 1$ and $N \ge 1$. Define $L(\theta, k) = \sum_{i=0}^k \binom{N}{i} \left(1 - \frac{1}{\theta}\right)^i \left(\frac{1}{\theta}\right)^{N-i}$ and $L_P(\theta, k) = \sum_{i=0}^k \frac{(N \ln \theta)^i}{i!} \exp(-N \ln \theta)$ for $k = 0, 1, \cdots, N$. Then, $L(\theta, 0) = L_P(\theta, 0)$ and $L(\theta, k) > L_P(\theta, k)$ for $k = 1, \cdots, N$.

Proof. First, it is evident that $L(\theta, 0) = L_P(\theta, 0) = \theta^{-N}$ and $L(\theta, N) = 1 > L_P(\theta, N)$. Hence, it remains to show the lemma for $k = 1, \cdots, N - 1$. It is easy to show that $\lim_{\theta \to \infty} L(\theta, k) = \lim_{\theta \to \infty} L_P(\theta, k) = 0$ and thus $\lim_{\theta \to \infty} [L(\theta, k) - L_P(\theta, k)] = 0$ for $k = 1, \cdots, N - 1$.
It can also be readily checked that $\lim_{\theta \to 1} L(\theta, k) = \lim_{\theta \to 1} L_P(\theta, k) = 1$ and consequently $\lim_{\theta \to 1} [L(\theta, k) - L_P(\theta, k)] = 0$ for $k = 1, \cdots, N - 1$. Noting that
$$\frac{\partial L(\theta, k)}{\partial \theta} = -\frac{N!}{k!\,(N-k-1)!} \left(1 - \frac{1}{\theta}\right)^k \left(\frac{1}{\theta}\right)^{N-k+1} \qquad \text{and} \qquad \frac{\partial L_P(\theta, k)}{\partial \theta} = -\frac{(N \ln \theta)^k}{k!} \frac{N}{\theta^{N+1}},$$
we have
$$\frac{\partial [L(\theta, k) - L_P(\theta, k)]}{\partial \theta} = \frac{N}{k!\, \theta^{N+1}} \left[(N \ln \theta)^k - \frac{(N-1)!}{(N-k-1)!} (\theta - 1)^k\right] > 0$$
if and only if $\varphi(\theta) > 0$, where $\varphi(\theta) = \ln \theta - \alpha(\theta - 1)$ with
$$\alpha = \left[\frac{(N-1)!}{(N-k-1)!}\right]^{\frac{1}{k}} \frac{1}{N} < 1.$$
Since $\varphi(1) = 0$ and $\frac{d\varphi(\theta)}{d\theta} = \frac{1}{\theta} - \alpha$ is positive for $\theta \in \left(1, \frac{1}{\alpha}\right)$, we have $\varphi(\theta) > 0$ for $\theta \in \left(1, \frac{1}{\alpha}\right]$. Since $\varphi\left(\frac{1}{\alpha}\right) > 0$, $\lim_{\theta \to \infty} \varphi(\theta) < 0$ and $\frac{d\varphi(\theta)}{d\theta} < 0$ for $\theta > \frac{1}{\alpha}$, there exists a unique number $\theta^* > \frac{1}{\alpha}$ such that $\varphi(\theta^*) = 0$. Hence, $\varphi(\theta)$ is positive for $\theta \in (1, \theta^*)$ and negative for $\theta > \theta^*$. This implies that $L(\theta, k) - L_P(\theta, k)$ is monotonically increasing with respect to $\theta \in (1, \theta^*)$ and monotonically decreasing with respect to $\theta \in (\theta^*, \infty)$. Recalling that $\lim_{\theta \to 1} [L(\theta, k) - L_P(\theta, k)] = \lim_{\theta \to \infty} [L(\theta, k) - L_P(\theta, k)] = 0$, we have $L(\theta, k) > L_P(\theta, k)$ for any $\theta > 1$. This completes the proof of the lemma. ✷

Lemma 3  Let $U_i, V_i,\ i = 1, \cdots, n$ be mutually independent non-negative discrete random variables. Suppose that $\Pr\{U_i = 0\} = \Pr\{V_i = 0\}$ and $\Pr\{U_i \le k\} > \Pr\{V_i \le k\}$ for any positive integer $k$ and $i = 1, \cdots, n$. Then, $\Pr\{\sum_{i=1}^n U_i = 0\} = \Pr\{\sum_{i=1}^n V_i = 0\}$ and $\Pr\{\sum_{i=1}^n U_i \le k\} > \Pr\{\sum_{i=1}^n V_i \le k\}$ for any positive integer $k$.

Proof. We use induction. The lemma is obviously true for $n = 1$.
Assuming that the lemma is true for $n = m - 1 \ge 1$, we have
$$\Pr\left\{\sum_{i=1}^m U_i = 0\right\} = \Pr\left\{\sum_{i=1}^{m-1} U_i = 0,\ U_m = 0\right\} = \Pr\left\{\sum_{i=1}^{m-1} V_i = 0\right\} \Pr\{V_m = 0\} = \Pr\left\{\sum_{i=1}^m V_i = 0\right\}$$
and
$$\Pr\left\{\sum_{i=1}^m U_i \le k\right\} = \sum_{l=0}^k \Pr\left\{\sum_{i=1}^{m-1} U_i = l,\ U_m \le k - l\right\} > \sum_{l=0}^k \Pr\left\{\sum_{i=1}^{m-1} V_i = l\right\} \Pr\{V_m \le k - l\} = \Pr\left\{\sum_{i=1}^m V_i \le k\right\}$$
for any positive integer $k$, which implies that the lemma is also true for $n = m$. By the principle of induction, the lemma is established. ✷

We are now in a position to prove the theorem. We shall first show that the distribution of $\sum_{\ell=2}^m n_\ell$ is bounded from below by the distribution of a Poisson variable with mean $N \ln \frac{V_{\max}}{V_{\min}}$. Define $U_i = n_{i+1}$ for $i = 1, \cdots, m - 1$. Then, by Lemma 1, the $U_i$ are independent binomial random variables such that $\Pr\{U_i \le k\} = L(\theta_i, k)$ for $k = 0, 1, \cdots, N$ and $i = 1, \cdots, m - 1$, where $\theta_i = \frac{v_i}{v_{i+1}}$. Define Poisson variables $V_i,\ i = 1, \cdots, m - 1$ such that $U_i, V_i,\ i = 1, \cdots, m - 1$ are mutually independent and $\Pr\{V_i \le k\} = L_P(\theta_i, k)$ for any non-negative integer $k$ and $i = 1, \cdots, m - 1$. By Lemmas 2 and 3, we have $\Pr\{\sum_{\ell=2}^m n_\ell = 0\} = \Pr\{\sum_{i=1}^{m-1} U_i = 0\} = \Pr\{\sum_{i=1}^{m-1} V_i = 0\}$ and $\Pr\{\sum_{\ell=2}^m n_\ell \le k\} = \Pr\{\sum_{i=1}^{m-1} U_i \le k\} > \Pr\{\sum_{i=1}^{m-1} V_i \le k\}$ for $k = 1, 2, \cdots$. Noting that $V_1, \cdots, V_{m-1}$ are independent Poisson variables with corresponding means $N \ln \theta_1, \cdots, N \ln \theta_{m-1}$, we have that $\sum_{i=1}^{m-1} V_i$ is also a Poisson variable, with mean $N \sum_{i=1}^{m-1} \ln \theta_i = N \sum_{i=1}^{m-1} \ln \frac{v_i}{v_{i+1}} = N \ln \frac{v_1}{v_m} = N \ln \frac{V_{\max}}{V_{\min}}$.
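The stochastic dominance supplied by Lemma 2 and used above can be spot-checked numerically for particular values of $N$ and $\theta$ (an illustration only, not part of the proof): the binomial CDF $L(\theta, k)$ and the Poisson CDF $L_P(\theta, k)$ agree at $k = 0$ and the binomial one strictly dominates for $k \ge 1$.

```python
import math

def binom_cdf(N, theta, k):
    # L(theta, k): CDF of a Binomial(N, 1 - 1/theta) variable at k.
    p = 1.0 - 1.0 / theta
    return sum(math.comb(N, i) * p ** i * (1.0 - p) ** (N - i) for i in range(k + 1))

def poisson_cdf(N, theta, k):
    # L_P(theta, k): CDF of a Poisson variable of mean N * ln(theta) at k.
    lam = N * math.log(theta)
    return sum(lam ** i / math.factorial(i) for i in range(k + 1)) * math.exp(-lam)

N = 20
for theta in (1.5, 2.0, 10.0):
    # Equality at k = 0 (both equal theta**(-N)), strict dominance for k >= 1.
    assert abs(binom_cdf(N, theta, 0) - poisson_cdf(N, theta, 0)) < 1e-12
    for k in range(1, N + 1):
        assert binom_cdf(N, theta, k) > poisson_cdf(N, theta, k)
```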
Next, we shall show that the distribution of $\sum_{\ell=2}^m n_\ell$ tends to the distribution of a Poisson variable with mean $N \ln \frac{V_{\max}}{V_{\min}}$ as $\nu = \max\{v_\ell - v_{\ell+1} : 1 \le \ell \le m - 1\}$, the maximum difference between the volumes of two consecutive nested sets, tends to zero, while the volumes of $B_1$ and $B_m$ assume the fixed values $v_1 = V_{\max}$ and $v_m = V_{\min}$, respectively. Since all sample sizes are equal to $N$, by Lemma 1, the original sample numbers $n_\ell,\ \ell = 2, \cdots, m$ are mutually independent binomial random variables such that $\Pr\{n_\ell = k\} = B(k, N, p_\ell)$ for $0 \le k \le N$ and $2 \le \ell \le m$, where $p_\ell = 1 - \frac{v_\ell}{v_{\ell-1}}$ with $v_\ell = \mathrm{vol}(B_\ell)$. Therefore, the probability generating function of $\sum_{\ell=2}^m n_\ell$ can be expressed as $G(s) = \left[\prod_{\ell=2}^m (p_\ell s + 1 - p_\ell)\right]^N$, where $s \in (0, 1]$ is a real number. Since $p_\ell s + 1 - p_\ell$ is positive for any $s \in (0, 1]$ and $\ell = 2, \cdots, m$, it is meaningful to define $g(s) = \sum_{\ell=2}^m \ln(p_\ell s + 1 - p_\ell)$ for $s \in (0, 1]$. Hence, $G(s) = \exp(N g(s))$. For simplicity of notation, define
$$h(s) = (s - 1) \ln \frac{V_{\max}}{V_{\min}},$$
$$I_1(s) = \int_0^s \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{z(v_{\ell-1} - v_\ell) + v_\ell}\, dz - \int_0^1 \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{z(v_{\ell-1} - v_\ell) + v_\ell}\, dz$$
and
$$I_2(s) = \int_0^s \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\, dz - \int_0^1 \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\, dz.$$
The convergence can be established in the following three steps. First, it can be seen that $g(s) = I_1(s)$ for any $s \in (0, 1]$, since $I_1(1) = g(1) = 0$ and
$$\frac{dI_1(s)}{ds} = \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{s(v_{\ell-1} - v_\ell) + v_\ell} = \sum_{\ell=2}^m \frac{p_\ell}{p_\ell s + 1 - p_\ell} = \frac{dg(s)}{ds}$$
for any $s \in (0, 1]$. Second, we need to show that $|I_1(s) - I_2(s)| \to 0$ for any $s \in (0, 1]$ as $\nu \to 0$.
Noting that
$$\int_0^s \frac{z(v_{\ell-1} - v_\ell)^2}{v_\ell^2 + z(v_{\ell-1} - v_\ell) v_\ell}\, dz = \frac{(v_{\ell-1} - v_\ell)^2}{v_\ell} \int_0^s \frac{z}{v_\ell + z(v_{\ell-1} - v_\ell)}\, dz \le \frac{(v_{\ell-1} - v_\ell)^2}{v_\ell} \int_0^s \frac{z}{v_\ell}\, dz = \frac{s^2 (v_{\ell-1} - v_\ell)^2}{2 v_\ell^2} \le \frac{s^2\, \nu\, (v_{\ell-1} - v_\ell)}{2 V_{\min}^2}$$
for any $s \in (0, 1]$, we have
$$|I_1(s) - I_2(s)| \le \sum_{\ell=2}^m \int_0^s \frac{z(v_{\ell-1} - v_\ell)^2}{v_\ell^2 + z(v_{\ell-1} - v_\ell) v_\ell}\, dz + \sum_{\ell=2}^m \int_0^1 \frac{z(v_{\ell-1} - v_\ell)^2}{v_\ell^2 + z(v_{\ell-1} - v_\ell) v_\ell}\, dz \le \sum_{\ell=2}^m \frac{s^2 \nu (v_{\ell-1} - v_\ell)}{2 V_{\min}^2} + \sum_{\ell=2}^m \frac{\nu (v_{\ell-1} - v_\ell)}{2 V_{\min}^2} = \frac{(s^2 + 1)\nu}{2 V_{\min}^2} \sum_{\ell=2}^m (v_{\ell-1} - v_\ell) = \frac{(s^2 + 1)(V_{\max} - V_{\min})\nu}{2 V_{\min}^2}.$$
Therefore, $|I_1(s) - I_2(s)| \to 0$ for any $s \in (0, 1]$ and arbitrary $v_\ell,\ \ell = 1, \cdots, m$, as $\nu \to 0$.

Third, we need to show that $g(s) \to h(s)$ as $\nu \to 0$. Since
$$h(s) - I_2(s) = \int_0^s \left[\ln \frac{V_{\max}}{V_{\min}} - \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\right] dz - \int_0^1 \left[\ln \frac{V_{\max}}{V_{\min}} - \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\right] dz,$$
we have
$$|I_2(s) - h(s)| \le \int_0^s \left|\ln \frac{V_{\max}}{V_{\min}} - \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\right| dz + \int_0^1 \left|\ln \frac{V_{\max}}{V_{\min}} - \sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell}\right| dz.$$
By the definition of Riemann integration, $\sum_{\ell=2}^m \frac{v_{\ell-1} - v_\ell}{v_\ell} \to \int_{V_{\min}}^{V_{\max}} \frac{dv}{v} = \ln \frac{V_{\max}}{V_{\min}}$ as $\nu \to 0$ for arbitrary $v_\ell,\ \ell = 1, \cdots, m$. It follows that, for any $s \in (0, 1]$ and arbitrary $v_\ell,\ \ell = 1, \cdots, m$, $|I_2(s) - h(s)| \to 0$ as $\nu \to 0$. In view of $|g(s) - h(s)| = |I_2(s) - h(s) + I_1(s) - I_2(s)| \le |I_2(s) - h(s)| + |I_1(s) - I_2(s)|$, we have $g(s) \to h(s)$ as $\nu \to 0$ for any $s \in (0, 1]$ and arbitrary $v_\ell,\ \ell = 1, \cdots, m$. Therefore, we can conclude that $G(s) \to \exp\left((s - 1) N \ln \frac{V_{\max}}{V_{\min}}\right)$ as $\nu \to 0$ for any $s \in (0, 1]$ and arbitrary $v_\ell,\ \ell = 1, \cdots, m$. This proves that $\sum_{\ell=2}^m n_\ell$ converges in distribution to a Poisson variable of mean $N \ln \frac{V_{\max}}{V_{\min}}$. The proof of the theorem is thus completed.

D Proof of Theorem 4

We need some preliminary results.
Lemma 4  Let $X$ be a Poisson variable of mean $\lambda > 0$. For any number $k > \lambda$, $\Pr\{X \ge k\} \le e^{-\lambda} \left(\frac{\lambda e}{k}\right)^k$.

Proof. Since $\Pr\{X \ge k\} = \Pr\{e^{t(X-k)} \ge 1\} \le E[e^{t(X-k)}]$ for any $t > 0$, we have $\Pr\{X \ge k\} \le \inf_{t > 0} E[e^{t(X-k)}]$. Note that
$$E\left[e^{t(X-k)}\right] = \sum_{i=0}^\infty e^{t(i-k)} \frac{\lambda^i}{i!} e^{-\lambda} = e^{\lambda e^t} e^{-\lambda} e^{-tk} \sum_{i=0}^\infty \frac{(\lambda e^t)^i}{i!} e^{-\lambda e^t} = e^{-\lambda} e^{\lambda e^t - tk},$$
which is minimized if and only if $\lambda e^t = k$. Since $k > \lambda$, we have $t = \ln \frac{k}{\lambda} > 0$ such that $\lambda e^t = k$. For this value of $t$, we have $e^{-\lambda} e^{\lambda e^t - tk} = e^{-\lambda} \left(\frac{\lambda e}{k}\right)^k$. Hence, we have shown $\Pr\{X \ge k\} \le e^{-\lambda} \left(\frac{\lambda e}{k}\right)^k$. ✷

Now we are in a position to prove the theorem. By Theorem 3, we have $\Pr\{\sum_{\ell=2}^m n_\ell \ge k\} \le \Pr\{X \ge k\} \le e^{-\lambda} \left(\frac{\lambda e}{k}\right)^k$. Setting $k = e\lambda$, we have $\Pr\{\sum_{\ell=2}^m n_\ell \ge e\lambda\} \le e^{-\lambda}$. Moreover, using the inequality $(1 + \epsilon) \ln(1 + \epsilon) > \epsilon + \frac{\epsilon^2}{4}$ for all $\epsilon \in (0, 1]$, we have
$$\Pr\left\{\sum_{\ell=2}^m n_\ell \ge (1 + \epsilon)\lambda\right\} < \left[\frac{e^\epsilon}{(1 + \epsilon)^{1+\epsilon}}\right]^\lambda < \exp\left(-\frac{\epsilon^2 \lambda}{4}\right)$$
for $0 < \epsilon < 1$. This completes the proof of the theorem.

E Proof of Theorem 5

By Theorem 3, we have $E[N_{\rho+\delta}] \le N\left[1 + d \ln \frac{\kappa(\rho + \delta)}{a}\right]$. Now fix the gridding over $(\rho, \rho + \delta]$. By Theorem 3, as the gridding over $[\frac{a}{\kappa}, \rho]$ becomes increasingly dense, we have $E[N_\rho] \to N\left[1 + d \ln \frac{\kappa\rho}{a}\right]$. This implies that, for any $\epsilon > 0$, we have $E[N_\rho] > N\left[1 + d \ln \frac{\kappa\rho}{a}\right] - \epsilon$ for a sufficiently dense gridding over $[\frac{a}{\kappa}, \rho]$. Hence,
$$E[N_{\rho+\delta} - N_\rho] = E[N_{\rho+\delta}] - E[N_\rho] < N d \ln \frac{\kappa(\rho + \delta)}{a} - N d \ln \frac{\kappa\rho}{a} + \epsilon = N d \ln \frac{\rho + \delta}{\rho} + \epsilon.$$
Since the argument holds for any small $\epsilon > 0$, we have $E[N_{\rho+\delta} - N_\rho] \le N d \ln \frac{\rho + \delta}{\rho}$. Therefore, the density
$$D(\rho) = \lim_{\delta \to 0} \frac{E[N_{\rho+\delta} - N_\rho]}{\delta} \le \lim_{\delta \to 0} \frac{N d \ln\left(\frac{\rho + \delta}{\rho}\right)}{\delta} = \frac{N d}{\rho}.$$
On the other hand, as the gridding gets dense, we have $E[N_{\rho+\delta} - N_\rho] \to N d \ln \frac{\rho + \delta}{\rho}$ and thus $D(\rho) \to \frac{N d}{\rho}$. For $\rho \in (0, \frac{a}{\kappa}]$, it follows from Theorem 3 that $N_\rho$ is a binomial random variable corresponding to $N$ i.i.d.
trials with a success probability $\left(\frac{\kappa\rho}{a}\right)^d$. Hence, $E[N_\rho] = N\left(\frac{\kappa\rho}{a}\right)^d$ and accordingly
$$D(\rho) = \frac{N d}{\rho}\left(\frac{\kappa\rho}{a}\right)^d.$$
This completes the proof of the theorem.

References

[1] B. R. Barmish and C. M. Lagoa, "The uniform distribution: a rigorous justification for its use in robustness analysis," Mathematics of Control, Signals and Systems, vol. 10, pp. 203-222, 1997.
[2] X. Chen, K. Zhou, and J. Aravena, "Fast construction of robustness degradation function," SIAM Journal on Control and Optimization, vol. 42, pp. 1960-1971, 2004.
[3] X. Chen, K. Zhou, and J. Aravena, "Probabilistic robustness analysis — risks, complexity and algorithms," SIAM Journal on Control and Optimization, vol. 47, pp. 2693-2723, 2008.
[4] P. J. Huber, Robust Estimation, Wiley, 1981.