Infinitesimally Robust Estimation in General Smoothly Parametrized Models

We describe the shrinking neighborhood approach of Robust Statistics, which applies to general smoothly parametrized models, especially, exponential families. Equal generality is achieved by object oriented implementation of the optimally robust esti…

Authors: Matthias Kohl, Peter Ruckdeschel, Helmut Rieder

Innitesimally Robust Estimation in General Smo othly P arametrized Mo dels Matthias K ohl ∗ , P eter Ru kdes hel † , Helm ut Rieder ‡ Abstrat W e desrib e the shrinking neigh b orho o d approa h of Robust Statistis, whi h applies to general smo othly parametrized mo dels, esp eially , exp onen tial families. Equal generalit y is a hiev ed b y ob jet orien ted implemen tation of the optimally robust estimators. W e ev aluate the estimates on real datasets from literature b y means of our R pa k ages R OptEst and R obL ox . Keyw ords: Exp onen tial family; Inuene urv es; Asymptotially linear estimators; Shrinking on tamination and total v ariation neigh b orho o ds; One-step onstrution; Minmax MSE 1 In tro dution F ollo wing Hub er (1997 ), p 61, the purp ose of robustness is to safeguard against deviations from the assumptions, in partiular against those that are near or b elo w the limits of detetabilit y. The innitesimal approa h of Hub erCarol (1970 ), Rieder (1978 ) and Rieder (1980 ), Bi k el (1981 ), Rieder (1994 ) to robust testing and estimation, resp etiv ely , tak es up this aim b y emplo ying shrink- ing neigh b orho o ds of the parametri mo del, where the shrinking rate n − 1 / 2 , as the sample size n → ∞ , ma y b e dedued in a testing setup; onfer Ru kdes hel (2006 ). It is true that Hub er's o wn minim um Fisher information approa h refers to (small) neigh b orho o ds of xed size; f. Hub er (1981 ). But it only treats v ariane, sets bias = 0 b y assuming symmetry , and is restrited to T uk ey-t yp e neigh b orho o ds ab out lo ation or sale mo dels. It has not b een ex- tended to sim ultaneous lo ation and sale, let alone to more general mo dels. F raiman et al. (2001 ) deriv e MSE optimalit y on xed size neigh b orho o ds. In situations b ey ond one-dimensional lo ation, ho w ev er, they do not determine a solution in losed form either. 
The innitesimal approa h, on the on trary , pro vides losed-form robust solutions for general mo dels (f. Setion 2.1) and fairly general risks based on v ariane and bias (f. Ru kdes hel and Rieder (2004 )). As noted b y Hub er (p 291 of Hub er (1981 )), in view of Theorem 3.7 of Rieder (1978 ), there is a lose relation b et w een the innitesimal neigh b orho o d approa h and Hamp el's Lemma 5 (f. Hamp el (1968 )); see also Theorem 3.2 of Rieder (1980 ) and Theorem 5.5.7 of Rieder (1994 ). Dierenes to Hamp el et al. (1986 ) nev ertheless exist and onern:  denition of the inuene urv e,  neessit y of the form of the optimally robust inuene urv es, ∗ Univ ersit y of Ba yreuth, German y † F raunhofer-Institut, T e hno-und Wirts haftsmathematik, Kaiserslautern, German y ‡ Univ ersit y of Ba yreuth, German y 1  optimalit y riterion: MSE and ev en more general riterions,  determination of the bias b ound (sensitivit y),  uniform asymptotis on neigh b orho o ds, and  o v erage of more mo dels. A fourth robustness approa h pursues eieny in the ideal mo del sub jet to a high breakdo wn p oin t; onfer for example Maronna et al. (2006 ), Setions 5.6.3, 5.6.4 and 6.4.5. A high breakdo wn, though, ma y easily b e inorp orated in our approa h: Giv en some starting estimator ˆ θ n , w e onstrut our optimal estimators S n as one-step estimates, S n = ˆ θ n + n − 1  ψ ˆ θ n ( x 1 ) + · · · + ψ ˆ θ n ( x n )  (1) f. Setion 4 . The pro edure is alled one-step re-w eigh ting in Setion 5.6.3 of Maronna et al. (2006 ) and has already b een used in the Prineton robustness study (f. Andrews et al. (1972 )). Th us, if | ψ θ ( x ) | ≤ b , also | S n − ˆ θ n | ≤ b . Consequen tly , the breakdo wn p oin t of the starting estimator ˆ θ n is inherited to our estimator S n . Giv en the high breakdo wn, ho w ev er, w e do not onsider robustness as settled, then striving just for high eieny in the ideal mo del. 
Our primary aim stays minmax MSE on shrinking neighborhoods about the ideal model, which altogether complies with Huber (1997), p 61, that a high breakdown point is nice to have if it comes for free.

The organisation of the paper is as follows: We review the theory of asymptotic robustness on shrinking neighborhoods, add some recent results and specialize. Then, we compute and apply the infinitesimal robust estimators to datasets from literature using our R packages ROptEst (general models) and RobLox (normal location and scale); confer R Development Core Team (2008), Kohl and Ruckdeschel (2008) and Kohl (2008). Applications of infinitesimal neighborhood robustness to time series will be the subject of another paper.

2 Setup

2.1 General Smoothly Parametrized Models

Denoting by $\mathcal{M}_1(\mathcal{A})$ the set of all probability measures on some measurable space $(\Omega, \mathcal{A})$, we consider a parametric model $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\} \subset \mathcal{M}_1(\mathcal{A})$, whose parameter space $\Theta$ is an open subset of some finite-dimensional $\mathbb{R}^k$, and which is dominated: $dP_\theta = p_\theta\,d\mu$ ($\theta \in \Theta$). At any fixed $\theta \in \Theta$, model $\mathcal{P}$ is required to be $L_2$ differentiable, that is, to have $L_2$ differentiable square root densities such that, in $L_2(\mu)$, as $t \to 0$,

$$ \sqrt{p_{\theta+t}} = \sqrt{p_\theta}\,\bigl(1 + \tfrac12 t'\Lambda_\theta\bigr) + o(|t|) \tag{2} $$

The $\mathbb{R}^k$-valued function $\Lambda_\theta \in L_2^k(P_\theta)$ is called $L_2$ derivative, and its covariance $\mathcal{I}_\theta = \mathrm{E}_\theta \Lambda_\theta\Lambda_\theta'$ under $P_\theta$ is the Fisher information of $\mathcal{P}$ at $\theta$, required of full rank $k$. This type of differentiability is implied by continuous differentiability of $p_\theta$ and continuity of $\mathcal{I}_\theta$, with respect to $\theta$, and then $\Lambda_\theta = \frac{\partial}{\partial\theta}\log p_\theta$. Confer e.g. Lemma A.3 of Hájek (1972), Section 1.8 of Witting (1985), Section 2.3 of Rieder (1994), Rieder and Ruckdeschel (2001).
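As a quick illustration of the relation $\Lambda_\theta = \frac{\partial}{\partial\theta}\log p_\theta$, consider the normal location and scale model that reappears in Section 5.2. The worked computation below is a standard calculus exercise added here for orientation; the standardized variable $u$ is an auxiliary symbol introduced only for this sketch.

```latex
% Normal location and scale: \theta = (\mu, \sigma)', P_\theta = N(\mu, \sigma^2)
\log p_\theta(x) = -\log\sigma - \tfrac12\log(2\pi) - \frac{(x-\mu)^2}{2\sigma^2}

% Differentiate with respect to \mu and \sigma, writing u = (x - \mu)/\sigma
\Lambda_\theta(x) = \frac{1}{\sigma}
  \begin{pmatrix} u \\ u^2 - 1 \end{pmatrix},
  \qquad u = \frac{x-\mu}{\sigma}

% Under P_\theta: E u = 0, E u^2 = 1, E u^3 = 0, E u^4 = 3
\mathcal{I}_\theta = \mathrm{E}_\theta \Lambda_\theta \Lambda_\theta'
  = \frac{1}{\sigma^2}
  \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
```

The Fisher information is of full rank 2 for every $\sigma \in (0, \infty)$, as required in the setup.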
Our main appliations in this artile onern exp onen tial families, in whi h ase p θ ( x ) = exp  ζ ( θ ) ′ T ( x ) − β ( θ )  h ( x ) (3) 2 with some measurable funtions ζ : Θ → R k , h : Ω → [ 0 , ∞ ) , T : Ω → R k of p ositiv e denite o v ariane Cov θ T ≻ 0 , and the normalizing onstan t β ( θ ) . Then P forms a k -dimensional ex- p onen tial family of full rank. The natural parameter spae Z ∗ onsists of all ζ -v alues su h that 0 < R exp  ζ ′ T ( x )  h ( x ) µ ( dx ) < ∞ . P is L 2 dieren tiable under the follo wing assumptions: ζ on tin uously dieren tiable in θ ∈ Θ with regular Jaobian matrix J ζ , and ζ (Θ) ⊂ Z o ∗ (in terior). And then, Λ θ ( x ) = J ′ ζ  T ( x ) − E θ T  I θ = J ′ ζ Cov θ ( T ) J ζ (4) where E θ denotes exp etation under P θ . The result men tioned in v an der V aart (1998 ), Example 7.7, is pro v en in K ohl (2005 ), Lemma 2.3.6 (a). In what follo ws, the parametri mo del P is assumed L 2 dieren tiable at an y θ ∈ Θ . 2.2 Asymptotially Linear Estimators The founders of robust statistis ha v e dened inuene urv es (IC) as Gâteaux deriv ativ es of sta- tistial funtionals; onfer Setion 2.5 of Hub er (1981 ) and Setion 2.1 of Hamp el et al. (1986 ). The lassial denition, ho w ev er, remains v ague. Ev en if su h a deriv ativ e exists, the denition is not strong enough to o v er the empirial; onfer Reeds (1976 ) and F ernholz (1983 ). Our approa h is dieren t: Sine most pro ofs of asymptoti normalit y in the i.i.d. ase amoun t to an estimator expansion with the IC as summands, w e dene the set of all (square in tegrable, R k -v alued) ICs at P θ b eforehand b y Ψ( θ ) =  ψ θ ∈ L k 2 ( P θ ) | E θ ψ θ = 0 , E θ ψ θ Λ ′ θ = I k  (5) where I k denotes the k × k iden tit y matrix. 
Then w e dene asymptotially linear (AL) estimators S to b e an y sequene of estimators S n : Ω n → R k su h that for some ψ θ ∈ Ψ( θ ) , neessarily unique, n 1 / 2 ( S n − θ ) = n − 1 / 2  ψ θ ( x 1 ) + · · · + ψ θ ( x n )  + o P n θ ( n 0 ) (6) where o P n θ ( n 0 ) → 0 in pro dut P n θ probabilit y as n → ∞ . Th us, the originally in tended in terpreta- tion is a hiev ed: ψ θ ( x i ) represen ts the asymptoti, suitably standardized inuene of observ ation x i on S n . The lass of AL estimators as in tro dued b y Rieder (1980 ), Denition 1.1 and Remarks, and Rieder (1994 ), Setion 4.2, o v ers M, L, R, S and MD (minim um distane) estimates. By the Lindeb erg-Lévy CL T, as ψ θ ∈ L k 2 ( P θ ) , E θ ψ θ = 0 , AL estimators are asymptotially normal under P n θ , n 1 / 2 ( S n − θ )( P n θ ) − → w N (0 , Cov θ ( ψ θ )) (7) The third ondition E θ ψ θ Λ ′ θ = I k is equiv alen t to the lo ally uniform extension of (7), with θ on the LHS replaed b y θ n with lim sup n →∞ √ n | θ n − θ | < ∞ . F or the asymptoti v ariane under P θ , the Cramér-Rao b ound holds, Cov θ ( ψ θ )  I − 1 θ = Cov θ ( ψ h,θ ) , ψ θ ∈ Ψ θ (8) with equalit y i ψ θ = ψ h,θ := I − 1 θ Λ θ , the lassial sores. 2.3 Innitesimal P erturbations The i.i.d. observ ations x 1 , . . . , x n ma y no w follo w an y la w Q in some neigh b orho o d ab out P θ . In this artile , the t yp e of neigh b orho o ds in Rieder (1994 ) will b e restrited to (on v ex) on tamination 3 ( ∗ = c ) and total v ariation ( ∗ = v ). Delegating the total v ariation ase to App endix A, the system U c ( θ ) th us onsists of all on tamination neigh b orho o ds U c ( θ, s ) =  (1 − s ) P θ + s Q   Q ∈ M 1 ( A )  , 0 ≤ s ≤ 1 (9) Subsequen tly , s = s n = r n − 1 / 2 for starting radius r ∈ [ 0 , ∞ ) and n → ∞ . Remark 1. Under Q , still the parameter θ has to b e estimated. 
Sine the equation Q = P θ + ( Q − P θ ) in v olving the n uisane omp onen t Q − P θ , ma y ha v e m ultiple solutions θ , the parameter θ is no longer iden tiable. This problem has b een dealt with b y estimating funtionals that extend the parametrization to the neigh b orho o ds. As noted in Setion 4.3.3 of Rieder (1994 ), ho w ev er, b oth approa hes lead to the same optimally robust ICs and pro edures one the  hoie of the funtional is sub jeted to robustness riteria. W e no w x θ ∈ Θ and in tro due the b ounded tangen ts at P θ , Z ∞ ( θ ) =  q ∈ L ∞ ( P θ ) | E θ q = 0  (10) Along an y q ∈ Z ∞ ( θ ) and for starting radius r ∈ [0 , ∞ ) , simple p erturbations are dened b y dQ n ( q , r ) =  1 + rn − 1 / 2 q  dP θ (11) pro vided that n 1 / 2 ≥ − r inf P θ q , where inf P θ denotes the P θ -essen tial inm um. AL estimators, under su h simple p erturbations, are still asymptotially normal, n 1 / 2 ( S n − θ )  Q n n ( q , r )  − → w N k  r E θ ψ θ q , Cov θ ( ψ θ )  (12) with bias r E θ ψ θ q . W e ha v e Q n ( q , r ) ∈ U c ( θ, r n − 1 / 2 ) i q ∈ G c ( θ ) for the lass G c ( θ ) =  q ∈ Z ∞ ( θ ) | inf P θ q ≥ − 1  (13) Confer Rieder (1994 ), pro of to Prop osition 4.3.6 and Lemma 5.3.1. 3 Optimally Robust Inuene Curv es 3.1 Maxim um Risk Our aim is minmax risk. Emplo ying a on tin uous loss funtion ℓ : R k → [ 0 , ∞ ) , the asymptoti maxim um risk of an y estimator sequene on on tamination neigh b orho o ds ab out P θ of size rn − 1 / 2 is lim M →∞ lim n →∞ sup Q ∈ U c ( θ ,rn − 1 / 2 ) Z ℓ M  n 1 / 2 ( S n − θ )  dQ n n (14) where, for ease of attainabilit y of the minim um risk, the trunated loss funtions ℓ M = min { M , ℓ } are emplo y ed. A further simplied and smaller risk is obtained b y a restrition to simple p erturba- tions Q n = Q n ( q , r ) with q ∈ G c ( θ ) and the in ter hange of sup q ∈G c ( θ ) , lim M →∞ , and lim n →∞ . The xed θ will b e dropp ed from notation heneforth whenev er feasible. 
Thus, for an AL estimator $S = (S_n)$ with IC $\psi$ at $P = P_\theta$, and $Z \sim \mathcal{N}_k\bigl(0, \mathrm{Cov}(\psi)\bigr)$,

$$ \sup_{q \in \mathcal{G}_c(\theta)}\,\lim_{M\to\infty}\,\lim_{n\to\infty} \int \ell_M\bigl(n^{1/2}(S_n - \theta)\bigr)\,dQ_n^n(q, r) = \sup_{q \in \mathcal{G}_c(\theta)} \mathrm{E}\,\ell\bigl(r\,\mathrm{E}\psi q + Z\bigr) \tag{15} $$

For the square $\ell(z) = |z|^2$, the (maximum, asymptotic) MSE is obtained as weighted sum of the squared $L_2$- and $L_\infty$-norms of $\psi$ under $P$,

$$ \mathrm{MSE}(\psi, r) = \mathrm{E}|\psi|^2 + r^2\omega_c^2(\psi) \tag{16} $$

since

$$ \omega_c(\psi) = \sup\bigl\{|\mathrm{E}\psi q| \bigm| q \in \mathcal{G}_c(\theta)\bigr\} = \sup\nolimits_P |\psi| \tag{17} $$

the $P$-essential sup of $|\psi|$; confer Sections 5.3.1 and 5.5.2 of Rieder (1994). Other (convex, monotone) combinations of bias and variance (e.g., $L_p$-risks) have been considered in Ruckdeschel and Rieder (2004). A suitable construction achieves that, in case of the optimally robust estimator, risk (14) is not larger than the simplified risk (15); confer Section 4 below.

3.2 Minmax Mean Square Error

The optimally robust $\psi^\star$, the unique solution to minimize $\mathrm{MSE}(\psi, r)$ among all $\psi \in \Psi$, is given in Theorem 5.5.7 of Rieder (1994): There exist some vector $z \in \mathbb{R}^k$ and matrix $A \in \mathbb{R}^{k\times k}$, $A \succ 0$, such that

$$ \psi^\star = A(\Lambda - z)\,w, \qquad w = \min\bigl\{1,\ b\,|A(\Lambda - z)|^{-1}\bigr\} \tag{18} $$

where

$$ r^2 b = \mathrm{E}\bigl(|A(\Lambda - z)| - b\bigr)_+ \tag{19} $$

and

$$ 0 = \mathrm{E}(\Lambda - z)\,w, \qquad A^{-1} = \mathrm{E}(\Lambda - z)(\Lambda - z)'\,w \tag{20} $$

Conversely, form (18)–(20) suffices for $\psi^\star$ to be the solution. The proof uses the Lagrange multipliers supplied by Rieder (1994), Appendix B. The minmax solution to the more general risks considered in Ruckdeschel and Rieder (2004) also is a MSE solution with suitably transformed bias weight; confer their Theorem 4.1 and equation (4.7).

The matrix $A$, in case $r = 0$, equals inverse Fisher information $\mathcal{I}^{-1}$, which appears in the Cramér-Rao bound (8). In general, $A$ is defined by (19) and (20) only implicitly. It is surprising that the statistical interpretation in terms of minimum risk obtains in the extension, with bias now involved.

Theorem 1.
For any $r \in (0, \infty)$ and $\psi \in \Psi$ we have

$$ \mathrm{MSE}(\psi, r) \ge \operatorname{tr} A = \mathrm{MSE}(\psi^\star, r) \tag{21} $$

where equality holds in the first place iff $\psi = \psi^\star$ defined by (18)–(20).

3.3 Relative MSE

The starting radius $r$ for the neighborhoods $\mathcal{U}_c(\theta, r n^{-1/2})$, on which the minmax MSE solution $\psi^\star = \psi^\star_r$ depends, will often be unknown or only known to belong to some interval $[r_{lo}, r_{up}) \subset [0, \infty)$. In this situation that $\psi^\star_s$ is used when in fact $\psi^\star_r$ is optimal, we introduce the relative MSE of $\psi^\star_s$ at radius $r$,

$$ \mathrm{relMSE}(\psi^\star_s, r) = \mathrm{MSE}(\psi^\star_s, r)\,\big/\,\mathrm{MSE}(\psi^\star_r, r) \tag{22} $$

For any radius $s \in [r_{lo}, r_{up})$ the $\sup_r \mathrm{relMSE}(\psi^\star_s, r)$ is attained at the boundary,

$$ \sup_{r \in [r_{lo}, r_{up})} \mathrm{relMSE}(\psi^\star_s, r) = \mathrm{relMSE}(\psi^\star_s, r_{lo}) \vee \mathrm{relMSE}(\psi^\star_s, r_{up}) \tag{23} $$

A least favorable radius $r_0$ is defined by achieving $\inf_s$ of $\sup_r \mathrm{relMSE}(\psi^\star_s, r)$, that is,

$$ \inf_{s \in [r_{lo}, r_{up})}\,\sup_{r \in [r_{lo}, r_{up})} \mathrm{relMSE}(\psi^\star_s, r) = \sup_{r \in [r_{lo}, r_{up})} \mathrm{relMSE}(\psi^\star_{r_0}, r) \tag{24} $$

and is characterized by $\mathrm{relMSE}(\psi^\star_{r_0}, r_{lo}) = \mathrm{relMSE}(\psi^\star_{r_0}, r_{up})$. The IC $\psi^\star_{r_0}$, respectively the AL estimator with this IC, are called radius-minmax (rmx) and recommended. Confer Kohl (2005), in particular Lemma 2.2.3, and Rieder et al. (2008).

The recommendation is in some sense independent of the loss function: In case of unspecified radius (i.e., $r_{lo} = 0$, $r_{up} = \infty$), the rmx IC is the same for a variety of loss functions satisfying a weak homogeneity condition; confer Ruckdeschel and Rieder (2004), Theorem 6.1.

3.4 Cniper Contamination

The notion is suited to demonstrate how relatively small outliers suffice to destroy the superiority of the classical procedure.
Employing, for this purpose, contaminations $R_n := (1 - r n^{-1/2})P + r n^{-1/2}\,\mathrm{I}_{\{a\}}$ by Dirac measures in $a \in \mathbb{R}$, the asymptotic MSE of the classically optimal estimator (i.e., with IC $\psi_h = \mathcal{I}^{-1}\Lambda$) under $R_n$ is $\mathrm{MSE}_a(\psi_h, r) := \operatorname{tr}\mathcal{I}^{-1} + r^2|\psi_h(a)|^2$. Relating this quantity to the minmax MSE $= \operatorname{tr} A$ (Theorem 1), we are interested in the set $\mathcal{C}$ of values $a \in \mathbb{R}$ such that $\mathrm{MSE}_a(\psi_h, r) > \mathrm{MSE}(\psi^\star_r, r)$; that is,

$$ r^2|\psi_h(a)|^2 > \operatorname{tr} A - \operatorname{tr}\mathcal{I}^{-1} \tag{25} $$

In all models we have considered so far, rather small values $a$ suffice to fulfill (25). In a Janus type pun on the words nice and pernicious, the boundary values of $\mathcal{C}$ are called cniper points (acting like a sniper); confer Ruckdeschel (2004) and Kohl (2005), Introduction.

4 Estimator Construction

Given the optimally robust IC $\psi^\star_\theta$, one for each $\theta \in \Theta$, the problem is to construct an estimator $S^\star = (S^\star_n)$ that is AL at each $\theta$ with IC $\psi^\star_\theta$. In addition, the construction should achieve that there is no increase from the simplified risk (15) to the asymptotic maximum MSE (14).

We require initial estimators $\sigma = (\sigma_n)$ which are $n^{1/2}$ consistent on the full neighborhood system $\mathcal{U}_c(\theta)$; that is, for each $r \in [0, \infty)$,

$$ \lim_{M\to\infty}\,\limsup_{n\to\infty}\,\sup\bigl\{Q_n^{(n)}\bigl(n^{1/2}|\sigma_n - \theta| > M\bigr) \bigm| Q_{n,i} \in \mathcal{U}_c(\theta, r n^{-1/2})\bigr\} = 0 \tag{26} $$

with $Q_n^{(n)} = Q_{n,1} \otimes \cdots \otimes Q_{n,n}$. For technical reasons, the $\sigma_n$ are in addition discretized in a suitable sense (cf. Rieder (1994), Section 6.4.2).

In this article, the optimally robust ICs $\psi^\star_\theta$ are bounded. Thus conditions (2)–(6) of Rieder (1994), p 247, on $(\psi^\star_\theta)_{\theta\in\Theta}$ simplify drastically; namely, to continuity in sup-norm,

$$ \lim_{\tau\to\theta}\,\sup_{x\in\Omega} |\psi^\star_\tau(x) - \psi^\star_\theta(x)| = 0 \tag{27} $$

Then, according to Rieder (1994), Theorem 6.4.8 (b), the one-step estimator $S$,

$$ S_n = \sigma_n + n^{-1}\bigl(\psi^\star_{\sigma_n}(x_1) + \cdots + \psi^\star_{\sigma_n}(x_n)\bigr) \tag{28} $$

where $\sigma_n = \sigma_n(x_1, \ldots, x_n)$, is uniformly asymptotically normal such that, for all arrays $Q_{n,i} \in \mathcal{U}_c(\theta, r n^{-1/2})$ and each $r \in (0, \infty)$,

$$ n^{1/2}(S_n - \theta - B_n)\bigl(Q_n^{(n)}\bigr) \xrightarrow{\;w\;} \mathcal{N}\bigl(0, \mathrm{Cov}_\theta(\psi^\star_\theta)\bigr) \tag{29} $$

with $B_n = n^{-1}\bigl(\int \psi^\star_\theta\,dQ_{n,1} + \cdots + \int \psi^\star_\theta\,dQ_{n,n}\bigr)$. Employing a version $\psi^\star_\theta$ of form (18)–(20) which is bounded pointwise by $b = b_\theta$, we obtain

$$ |B_n| \le \sup_{x\in\Omega} |\psi^\star_\theta(x)| = b_\theta \tag{30} $$

Thus (29) ensures that risk (14) is not larger than the simplified risk (15).

Remark 2. As initial estimators we prefer MD estimates, not primarily because of their breakdown point but because of their related tail behavior (cf. Ruckdeschel (2008a)) and their applicability in general models. In particular, both Kolmogorov and Cramér-von Mises MD (CvM) estimates may be employed (cf. Rieder (1994), Theorems 6.3.7 and 6.3.8), with an advantage of the latter, in view of the larger neighborhoods, to which its $n^{1/2}$ consistency extends, and the variance instability, for finite $n$, of the former (cf. Donoho and Liu (1988)). In particular models, other estimators may qualify as starting estimators and may even be preferable for computational reasons; e.g., median, MAD in one-dim location and scale, minimum covariance determinant estimator in multivariate scale, least median of squares, and S estimates in linear regression; confer Rousseeuw and Leroy (1987) and Yohai (1987).

Remark 3. Under additional smoothness, according to Ruckdeschel (2008a) and Ruckdeschel (2008b), assumption (26) of $n^{1/2}$ consistency may be weakened to only $n^{1/4+\delta}$ consistency, for some $\delta > 0$. Consequently, for example, the least median of squares estimator may be employed as a high breakdown starting estimator. Ruckdeschel (2008b) gives other, partly more, partly less stringent conditions.
Moreo v er, Ru kdes hel (2008a ) ensures uniform in tegrabilit y so as to disp ense with the trunation of un b ounded loss funtions in (14). The remainder of the setion deals with ondition (27 ). W e assume that the Lagrange m ultipliers A θ and a θ := A θ z θ in (18 )(20 ) are unique, and, as τ → θ , Λ τ ( P τ ) − → w Λ θ ( P θ ) , tr I τ − → tr I θ (31) sup x ∈D c | Λ τ ( x ) − Λ θ ( x ) | + sup x ∈ c D c | Λ τ ( x ) − Λ θ ( x ) | | A θ Λ θ ( x ) − a θ | − → 0 (32) where D c = { x ∈ Ω | | A t Λ t ( x ) − a t | ≤ b t for t = τ or t = θ } . Then, b y K ohl (2005 ), Theorem 2.3.3, ondition (27) is fullled. F or example, in ase of a lo ation and sale with lo ation parameter β ∈ R and sale parameter σ ∈ (0 , ∞ ) , w e ha v e Λ θ ( x ) = σ − 1 Λ θ 0  ( x − β ) /σ  , hene Λ θ ( P θ ) = σ − 1 Λ θ 0 ( P θ 0 ) and I θ = σ − 2 I θ 0 , where θ = ( β , σ ) ′ and θ 0 = (0 , 1) ′ . Therefore, (31 ) is fullled. Condition (32 ) needs further  he king but seems plausible as Λ θ 0 is on tin uous (if the mo del is to b e L 2 dieren tiable). In the ase of an L 2 dieren tiable exp onen tial family , in view of (4), ondition (31) is satised, while (32 ) holds aording to K ohl (2005 ), Lemma 2.3.6. 7 5 Appliations 5.1 Prop osal Based on the presen ted results w e mak e the follo wing prop osal for appliations: Step 1: Deide on the ideal mo del. Step 2: Deide on the t yp e of neigh b orho o d ( ∗ = c or ∗ = v ). Step 3: Determine lo w er and upp er b ounds s lo , s up for the size s = s n of the neigh b orho o ds U ∗ ( θ, s ) to b e tak en in to aoun t. Step 4: Put r lo = n 1 / 2 s lo , r up = n 1 / 2 s up , and ompute the rmx IC for [ r lo , r up ] . Step 5: Ev aluate an appropriate starting estimator. Step 6: Determine the rmx estimator using the one-step onstrution. Our R pa k ages R obL ox (f. K ohl (2008 )) and R OptEst (f. K ohl and Ru kdes hel (2008 )) pro- vide an easy w a y to p erform steps 46 making use of our pa k ages distr (f. 
Ru k es hel et al. (2006 )), distrEx (f. Ru k es hel et al. (2006 )), distrMo d (f. Ru kdes hel et al. (2008 )), R andV ar (f. K ohl and Ru kdes hel (2008a )) and R obAStBase (f. K ohl and Ru kdes hel (2008b )). The implemen tation of these pa k ages hea vily relies on S4 lasses and metho ds; onfer Cham b er (1998 ). Based on this ob jet orien tated approa h pa k age R OptEst pro vides an implemenation that (so far) w orks for all(!) L 2 dieren tiable parametri mo dels whi h are based on a univ ariate distribution. In the sequel, w e will demonstrate the use of pa k ages R obL ox and R OptEst b y appliation to some datasets from literature. 5.2 Normal Lo ation and Sale W e onsider the follo wing 24 measuremen ts (in parts p er million) of opp er in wholemeal our (f. Analytial Metho ds Committee (1989 )) 2.20 2.20 2.40 2.40 2.50 2.70 2.80 2.90 3.03 3.03 3.10 3.37 3.40 3.40 3.40 3.50 3.60 3.70 3.70 3.70 3.70 3.77 5.28 28.95 where the v alue 28 . 95 is learly onspiuous. In agreemen t with Maronna et al. (2006 ), Setion 2.1, in view of the ma jorit y of the data, w e assume normal lo ation and sale as the ideal mo del, P θ = N ( µ, σ 2 ) with θ = ( µ, σ ) ′ , µ ∈ R , σ ∈ (0 , ∞ ) . Let us sti k to on tamination neigh b orho o ds ( ∗ = c ). W e assume that roughly 15 observ ations, that is, roughly 520% of the 24 observ ations are erroneous. Then the matrix A and en tering v etor a = Az in (18 )(20 ), b y absolute on tin uit y of the normal distribution, are unique. Sine normal lo ation and sale also is an L 2 dieren tiable exp onen tial family , the assumptions for our estimator onstrution are fullled. W e  ho ose the Cramér-v on Mises MD estimator (CvM) as initial estimator. 
The following R code shows how function roptest of package ROptEst can be applied to perform the computations, where x represents the data,

    R> roptest(x = x, L2Fam = NormLocationScaleFamily(), neighbor = ContNeighborhood(),
               eps.lower = 0.05, eps.upper = 0.20, distance = CvMDist)

More specified to the normal ideal model is the function roblox of package RobLox, which only works for, and is optimized for speed in, normal location and scale. It uses median and MAD as starting estimates which is justified by Kohl (2005), Section 2.3.4.

    R> roblox(x = x, eps.lower = 0.05, eps.upper = 0.20)

Table 1 shows the results of these computations as well as mean, standard deviation and some well-known robust estimators. The robust estimators, from median & MAD down to rmx (roblox), yield very similar results, while, obviously, mean and standard deviation represent the data badly.

Table 1: Normal location and scale estimates

    Estimator                $\hat\mu$   $\hat\sigma$
    mean & sd                4.28        5.30
    median & MAD             3.39        0.53
    Huber M (Proposal 2)     3.21        0.67
    Yohai MM                 3.16        0.66
    CvM                      3.23        0.67
    rmx (roptest)            3.16        0.66
    rmx (roblox)             3.23        0.64

Figure 1 shows the location and scale parts of the rmx IC computed via function roblox. The location part of the rmx IC, as of any optimally robust IC, is redescending. Thus, redescending in our setup follows on optimality grounds. For another derivation of redescending M-estimators see Shevlyakov et al. (2008).

[Figure 1: rmx IC computed via roblox. Panels: location part and scale part, each plotting the IC against x.]

Based on these robust estimates, let us assume a mean of $\mu = 3.2$ and a standard deviation of $\sigma = 0.7$ for the ideal distribution $P_\theta = \mathcal{N}(3.2, 0.7^2)$.
F or a on tamination of s n = 10 % at a 9 Length of stays Length of stay Density 0 20 40 60 80 100 0.00 0.02 0.04 0.06 0.08 0.10 MLE CvM rmx Figure 2: Observ ed frequenies and tted Gamma densities. sample size of n = 24 (i.e., r ≈ 0 . 4 9 ), the nip er p oin ts are alulated to 1 . 86 and 4 . 54 , and C = ( −∞ , 1 . 86] ∪ [4 . 54 , ∞ ) . Under an y elemen t of U c ( θ, s n ) the probabilit y of C is 515%, where P θ ( C ) = 5 . 56% . 5.3 Gamma Mo del W e analyze the length of sta ys of 201 patien ts in the Univ ersit y Hospital of Lausanne during the y ear 2000 (f. Hub ert and V andervieren (2006 )). F ollo wing Marrazi et al. (1998 ), w e use the Gamma mo del p θ ( x ) = Γ( α ) − 1 σ − α x α − 1 e − x/σ with shap e and sale parameters σ , α ∈ (0 , ∞ ) and θ = ( σ , α ) ′ . By K ohl (2005 ), Setion 6.1, this exp onen tial family is L 2 dieren tiable. W e assume on tamination neigh b orho o ds ( ∗ = c ) but, on visual insp etion of the data, of only small size 0 . 5% ≤ s n ≤ 5% . Then, due to absolute on tin uit y of P = P θ , equations (18 )(20 ) yield unique solutions A and a = Az . Th us, the one-step onstrution of the rmx estimator, based on the CvM estimate, applies. The algorithm an b e p erformed b y applying funtion roptest of pa k age R OptEst , where x on tains the data, R > roptest(x = x, L2Fam = GammaFamily(), neighbor = ContNeighborhood (), eps.lower = 0.005, eps.upper = 0.05, distane = CvMDist) a all, whi h is v ery similar to the one in the previous example. In fat, the unied all for roptest applies to an y smo oth mo del. Figure 2 ompares the densities of the estimated Gamma distributions with the histogram of the data. T able 2 sho ws the results as w ell as the MLE and the CvM. Again, the MLE is strongly aeted b y a few v ery large observ ations whereas the robust estimators sta y loser to the bulk of the data. 
Figure 3 shows scale and shape parts of the rmx IC (similarly, of any optimally robust IC; confer Kohl (2005), Figure 6.1).

Table 2: Gamma scale and shape estimates

    Estimator        MLE     CvM     rmx
    $\hat\sigma$     7.00    6.53    4.97
    $\hat\alpha$     1.61    1.54    1.86

[Figure 3: rmx IC computed via roptest. Panels: scale part and shape part, each plotting the IC against x.]

Assuming the ideal Gamma distribution $P_\theta$ with $\theta = (5.0, 1.9)'$ and a contamination size $s_n = 2.5\%$ at $n = 201$ (i.e., $r \approx 0.35$), the cniper points are 0.62 and 29.31, and $\mathcal{C} = (-\infty, 0.62] \cup [29.31, \infty)$. Under any element of $\mathcal{U}_c(\theta, s_n)$ the probability of $\mathcal{C}$ is 2.5–5%, where $P_\theta(\mathcal{C}) = 2.63\%$.

5.4 Poisson Model

For the decay counts of polonium recorded by Rutherford and Geiger (1910),

    counts       0    1    2    3    4    5    6    7   8   9  10  11  12  13  14
    frequency   57  203  383  525  532  408  273  139  45  27  10   4   0   1   1

we assume the Poisson model $p_\theta(x) = e^{-\theta}\theta^x/x!$, an exponential family that is $L_2$ differentiable in the parameter $\theta \in (0, \infty)$ (cf. Kohl (2005), Section 4.1). For both contamination ($* = c$) and total variation neighborhoods ($* = v$) of size $0.01 \le s_n \le 0.05$ we compute the rmx estimator. But, in case $* = c$, $a = Az$ may be non-unique, which happens if $\operatorname{med}_P(\Lambda)$, the median of $\Lambda = \Lambda_\theta$ under $P = P_\theta$, is non-unique and $r = n^{1/2}s_n$ is $\ge$ the so called lower case radius $\bar r$ (cf. Kohl (2005), Section 2.1.2). The non-uniqueness of the median occurs for only countably many values $\theta$. Since, as our numerical evaluations show, already small deviations ($\sim \pm 10^{-8}$) from the exceptional values lead to a unique $a$, non-uniqueness may be neglected in practice; confer Kohl (2005), Sections 4.2.1 and 4.4. In case $* = v$, the one-step construction applies without restrictions; confer Appendix A. Then, using the CvM as starting estimator, the rmx estimators are obtained via the following calls to function roptest of package ROptEst, where x contains the data,

    R> roptest(x = x, L2Fam = PoisFamily(), neighbor = *,
               eps.lower = 0.01, eps.upper = 0.05, distance = CvMDist)

where * stands for ContNeighborhood() or TotalVarNeighborhood(), respectively. The results as well as MLE and CvM estimate are given in Table 3. The estimates differ only slightly, as the data, in view of the observed and fitted frequencies in Figure 4, appears in very good agreement with the Poisson model.

Table 3: Poisson mean estimates

    Estimator       MLE      CvM      rmx ($* = c$)   rmx ($* = v$)
    $\hat\theta$    3.8715   3.8953   3.9131          3.9133

[Figure 4: Observed and fitted frequencies. Decay counts of polonium, frequency against count, for observed data, MLE and rmx ($* = c, v$).]

Figure 5 shows the rmx ICs for contamination and total variation neighborhoods. In fact, any optimally robust IC is of similar form (cf. Kohl (2005), Figures 4.1 ($* = c$) and 4.14 ($* = v$)).

[Figure 5: rmx IC computed via roptest for $* = c, v$. Panels: contamination ($* = c$) and total variation ($* = v$), each plotting the IC against x.]

Remark 4. ICs are defined with respect to the ideal model, thus, in case of the Poisson model, on $\mathbb{N}_0$. If we want to allow distributions in the neighborhoods whose supports are more generally in $[0, \infty)$, we only need to extend $\psi^\star$ from $\mathbb{N}_0$ to $[0, \infty)$ such that $|\psi^\star(x)| \le b$ for each $x > 0$; confer (30) in the estimator construction.

Assuming the ideal Poisson distribution $P_\theta$ with $\theta = 3.9$, neighborhood type $* = c$ and a contamination size $s_n = 3\%$ at $n = 2608$ (i.e., $r \approx 1.53$), we get the cniper points 1.26 and 6.54, and $\mathcal{C} = [0, 1.26] \cup [6.54, \infty)$. Under any element of $\mathcal{U}_c(\theta, s_n)$ the probability of $\mathcal{C}$ is 19.5–22.5%, where $P_\theta(\mathcal{C}) = 20.0\%$.
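Several of the numbers quoted for the polonium data can be cross-checked in base R. The sketch below rebuilds the sample from the frequency table, computes the MLE (the sample mean) and the starting radius $r = s_n n^{1/2}$, and evaluates $P_\theta(\mathcal{C})$ for the ideal Poisson distribution with $\theta = 3.9$. The cniper points 1.26 and 6.54 are taken from the text as given; deriving them would require the rmx IC from ROptEst, which is not attempted here. The variable names are illustrative.

```r
# Decay counts of polonium (Rutherford and Geiger, 1910): frequency table
counts <- 0:14
freq <- c(57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1)
x <- rep(counts, times = freq)   # expand the table to the raw sample

n <- length(x)                   # sample size
mle <- mean(x)                   # MLE of the Poisson mean
s <- 0.03                        # contamination size s_n = 3%
r <- s * sqrt(n)                 # starting radius r = s_n * n^(1/2)

# C = [0, 1.26] u [6.54, Inf) contains the integers 0, 1 and 7, 8, ...
theta <- 3.9
p_C <- ppois(1, theta) + ppois(6, theta, lower.tail = FALSE)

# Q(C) for Q = (1 - s) P + s H ranges over this interval as H varies
range_C <- c(lower = (1 - s) * p_C, upper = (1 - s) * p_C + s)

print(c(n = n, mle = mle, r = r, p_C = p_C))
print(range_C)
```

The interval `range_C` should be close to the range quoted above, up to the rounding used in the text.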
A Total variation neighborhoods ($* = v$)

The system $\mathcal{U}_v(\theta)$ consists of the closed balls of radius $s$ about $P_\theta$, in the total variation metric $d_v(Q, P_\theta) = \sup_{A \in \mathcal{A}} |Q(A) - P_\theta(A)|$,

$$ \mathcal{U}_v(\theta, s) = \bigl\{Q \in \mathcal{M}_1(\mathcal{A}) \bigm| d_v(Q, P_\theta) \le s\bigr\}, \qquad 0 \le s \le 1 \tag{33} $$

which have the following representation in terms of contamination neighborhoods,

$$ \mathcal{U}_v(\theta, s) - P_\theta = \bigl(\mathcal{U}_c(\theta, s) - P_\theta\bigr) - \bigl(\mathcal{U}_c(\theta, s) - P_\theta\bigr) \tag{34} $$

In particular, $\mathcal{U}_c(\theta, s) \subset \mathcal{U}_v(\theta, s)$ follows. In our asymptotics, $s = s_n = r n^{-1/2}$ for some $r \in [0, \infty)$, as the sample size $n \to \infty$. Corresponding simple perturbations $Q_n(q, r)$ are defined by (10) and (11) with tangents $q$ in the class

$$ \mathcal{G}_v(\theta) = \bigl\{q \in Z_\infty(\theta) \bigm| \mathrm{E}_\theta|q| \le 2\bigr\} = \mathcal{G}_c(\theta) - \mathcal{G}_c(\theta) \tag{35} $$

We fix $\theta$ and drop it from notation. Then, with $\sup_e$ extending over all unit vectors $e$ in $\mathbb{R}^k$, the standardized (infinitesimal) bias term of an IC $\psi \in \Psi$ is

$$ \omega_v(\psi) = \sup\bigl\{|\mathrm{E}\psi q| \bigm| q \in \mathcal{G}_v(\theta)\bigr\} = \sup_e \bigl(\sup\nolimits_P e'\psi - \inf\nolimits_P e'\psi\bigr) \tag{36} $$

The exact bias term in case $k > 1$ is difficult to handle and has been dealt with only in exceptional cases (cf. Rieder (1994), p 205 and Theorem 7.4.17). The obvious bound $\omega_c(\psi) \le \omega_v(\psi) \le 2\omega_c(\psi)$ suggests an approximate solution by a reduction to the contamination case $* = c$ and radius $2r$.

An exact solution of the MSE problem with bias term $\omega_v$ is still possible in dimension $k = 1$, in which case $\omega_v(\psi) = \sup_P \psi - \inf_P \psi$. In case $k = 1$, the optimally robust IC $\psi^\star$, the unique solution to minimize $\mathrm{MSE}(\psi, r) = \mathrm{E}\psi^2 + r^2\omega_v^2(\psi)$ among all ICs $\psi \in \Psi$ is provided by Rieder (1994), Theorem 5.5.7: For some numbers $c$, $b$, $A$,

$$ \psi^\star = c \vee A\Lambda \wedge (c + b) \tag{37} $$

where

$$ r^2 b = \mathrm{E}(c - A\Lambda)_+ = \mathrm{E}\bigl(A\Lambda - (c + b)\bigr)_+ \tag{38} $$

and

$$ \mathrm{E}\bigl(c \vee A\Lambda \wedge (c + b)\bigr)\Lambda = 1 \tag{39} $$

Conversely, form (37)–(39) suffices for $\psi^\star$ to be the solution.
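Equations (37)–(39) can be solved numerically in simple cases. The sketch below treats the one-dimensional normal location model $\mathcal{N}(\theta, 1)$ at $\theta = 0$, where $\Lambda(x) = x$. It assumes, by symmetry of this model, that $c = -b/2$, so that $\psi^\star$ clips $A\Lambda$ at $\pm b/2$. Writing $\kappa = b/(2A)$ for the clipping point on the $\Lambda$-scale, (39) gives $A = 1/(2\Phi(\kappa) - 1)$ and (38) reduces to $2r^2\kappa = \varphi(\kappa) - \kappa(1 - \Phi(\kappa))$. The function name `tv_ic_normal` and the radius $r = 0.5$ are hypothetical choices for this illustration; this is not the algorithm implemented in ROptEst.

```r
# Minmax MSE IC on total variation neighborhoods for N(theta, 1) location,
# evaluated at theta = 0 where Lambda(x) = x (illustrative special case).
tv_ic_normal <- function(r) {
  stopifnot(r > 0)
  # (38) with c = -b/2 and kappa = b / (2 A):
  #   2 r^2 kappa = E (x - kappa)_+ = dnorm(kappa) - kappa * (1 - pnorm(kappa))
  g <- function(kappa) {
    dnorm(kappa) - kappa * pnorm(kappa, lower.tail = FALSE) - 2 * r^2 * kappa
  }
  kappa <- uniroot(g, lower = 1e-10, upper = 10, tol = 1e-12)$root
  A <- 1 / (2 * pnorm(kappa) - 1)                    # from (39)
  b <- 2 * A * kappa                                 # width of the clipping interval
  cc <- -b / 2                                       # lower clipping bound c
  psi <- function(x) pmin(pmax(A * x, cc), cc + b)   # form (37)
  list(A = A, b = b, c = cc, kappa = kappa, psi = psi)
}

r <- 0.5
sol <- tv_ic_normal(r)

# numerical checks: (39), and MSE(psi, r) = E psi^2 + r^2 b^2 = A (Theorem 1, k = 1)
E_psi_Lambda <- integrate(function(x) sol$psi(x) * x * dnorm(x), -Inf, Inf)$value
E_psi2 <- integrate(function(x) sol$psi(x)^2 * dnorm(x), -Inf, Inf)$value
mse <- E_psi2 + r^2 * sol$b^2

print(c(A = sol$A, b = sol$b, c = sol$c, E_psi_Lambda = E_psi_Lambda, MSE = mse))
```

In this symmetric special case $\omega_v(\psi^\star) = 2\,\omega_c(\psi^\star)$, so the solution coincides with the contamination solution at radius $2r$; in general that reduction is only the approximation mentioned above.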
The solutions $A$, $b$ and $c$ of equations (37)–(39) are always unique, as discussed in Section B.1 below. Moreover, the condition that, as $\tau \to \theta$,

$$ \sup_{x \in D_v} |\Lambda_\tau(x) - \Lambda_\theta(x)| + \sup_{x \in D_v^{\,c}} \frac{|\Lambda_\tau(x) - \Lambda_\theta(x)|}{|\Lambda_\theta(x)|} \longrightarrow 0 \tag{40} $$

where $D_v = \{\, x \in \Omega \mid c_t \le A_t \Lambda_t(x) \le b_t + c_t \text{ for } t = \tau \text{ or } t = \theta \,\}$ and $D_v^{\,c}$ denotes its complement, has been verified by Kohl (2005), Lemma 2.3.6, in the case $* = v$, $k = 1$, for $L_2$ differentiable exponential families. Thus, the one-step construction is valid.

B Auxiliary Results And One Proof

B.1 Boundedness, Uniqueness, Continuity Of Lagrange Multipliers

We discuss boundedness, uniqueness, and continuity of the Lagrange multipliers $A$, $a = Az$, $b$ and $c$ in the optimally robust IC $\psi^\star$. These properties are, on one hand, reassuring for the convergence of our numerical algorithms. On the other hand, they imply the continuity in sup-norm (27) required for the construction.

Boundedness: Given $r > 0$, bounds for the solutions $A$, $a = Az$, $b$ and $c$ of (18)–(20) and (37)–(39), respectively, are derived in Kohl (2005), Section 2.1.3. For example, $|a| \le r^2 b$ holds.

Uniqueness: The Lagrange multipliers (like the separating hyperplanes) need not be unique; confer Rieder (1994), Remark B.2.10 (a). But, at least, $\operatorname{tr} A$, $b$, and $c$ in (18)–(20) and (37)–(39), respectively, are unique since, in terms of the unique $\psi^\star$,

$$ \operatorname{tr} A = \mathrm{MSE}(\psi^\star, r), \qquad b = \omega_*(\psi^\star), \qquad c = \inf\nolimits_P \psi^\star \tag{41} $$

If $k = 1$ and $\operatorname{med}_P(\Lambda)$ is unique, then $a$ is unique; Rieder (1994), Lemma C.2.4. In case $k = 1$ and $\operatorname{med}_P(\Lambda)$ is non-unique, then $a$ is unique for $r < \bar r$ (the so-called lower case radius); confer Kohl (2005), Proposition 2.1.3. In case $* = c$, $k \ge 1$, uniqueness of $A$ and $a$ is ensured by the assumption that

$$ \operatorname{support} \Lambda(P) = \mathbb{R}^k \tag{42} $$

confer Rieder (1994), Remark 5.5.8.
$A$ and $a$ are unique also under the more implicit condition that, for any hyperplane $H \subset \mathbb{R}^k$,

$$ P(\Lambda \in H) < P(|\psi^\star| < b) \tag{43} $$

which certainly is satisfied if $P(\Lambda \in H) = 0$ for any hyperplane $H$; that is,

$$ e \in \mathbb{R}^k,\ \alpha \in \mathbb{R},\ P(e'\Lambda = \alpha) > 0 \implies e = 0 \tag{44} $$

confer Rieder (1994), Section 5.5.3. Both (42) and (44) imply that $\mathcal{I} \succ 0$.

Continuity in $\theta$: Denote by $\psi^\star_\theta$ the MSE solution to variable parameter $\theta \in \Theta$ and fixed radius $r \in (0, \infty)$. Then, under assumption (31), we obtain

$$ \operatorname{tr} A_\tau \longrightarrow \operatorname{tr} A_\theta, \qquad b_\tau \longrightarrow b_\theta, \qquad c_\tau \longrightarrow c_\theta \tag{45} $$

as $\tau \to \theta$. Provided that $A_\theta$ and $a_\theta$ are unique, moreover

$$ A_\tau \longrightarrow A_\theta, \qquad a_\tau \longrightarrow a_\theta \tag{46} $$

Confer Kohl (2005), Theorem 2.1.11.

Continuity in $r$: Continuity in $r$ is needed for the rmx estimator. Denoting by $A_r$, $a_r = A_r z_r$, $b_r$, and $c_r$ the solutions of (18)–(20) and (37)–(39), respectively, for fixed $\theta$ and variable $r \in (0, \infty)$, Kohl (2005), Proposition 2.1.9, says that

$$ \operatorname{tr} A_s \longrightarrow \operatorname{tr} A_r, \qquad b_s \longrightarrow b_r, \qquad c_s \longrightarrow c_r \tag{47} $$

as $s \to r$. Moreover, in case that $A_r$ and $a_r$ are unique,

$$ A_s \longrightarrow A_r, \qquad a_s \longrightarrow a_r \tag{48} $$

For the rmx estimator, in addition some monotonicity in $r$ is needed and supplied by Ruckdeschel and Rieder (2004), Kohl (2005), and Rieder et al. (2008).

B.2 Proof of Theorem 1

$$ \text{minmaxMSE} = E\,|\eta|^2 + r^2 b^2 = -E\,\eta'(Y - \eta) + E\,\eta' Y + r^2 b^2 $$

with the abbreviations $\eta := \psi^\star$, $Y := A\Lambda$, where $E\,\eta' Y = \operatorname{tr} E\,\eta Y' = \operatorname{tr} A' = \operatorname{tr} A$ since $E\,\eta \Lambda' = I_k$.

$* = c$: In this case, $\eta \ne Y$ iff $|Y| > b$, and thus $E\,\eta'(Y - \eta) = b\, E\,(|Y| - b)_+ = r^2 b^2$.

$* = v$, $k = 1$: In this case, $E\,\eta (Y - \eta) = b\, E\,(c - Y)_+ = r^2 b^2$.

Hence, in both cases, the terms $\mp r^2 b^2$ cancel and $\text{minmaxMSE} = \operatorname{tr} A$.

References

Analytical Methods Committee (1989). Robust statistics – how not to reject outliers. The Analyst, 114, 1693–1702.
Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H. and Tukey, J. W. (1972). Robust estimates of location. Survey and advances. Princeton University Press, Princeton, N. J.
Bickel, P. J. (1981). Quelques aspects de la statistique robuste. École d'été de probabilités de Saint-Flour IX-1979, Lect. Notes Math. 876, 2–72.
Chambers, J. M. (1998). Programming with data: a guide to the S language. Springer, New York.
Donoho, D. L. and Liu, R. C. (1988). Pathologies of Some Minimum Distance Estimators. Annals of Statistics, 16(2), 587–608.
Feller, W. (1968). An introduction to probability theory and its applications. I. Wiley, New York.
Fernholz, L. T. (1983). Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics #19. Springer-Verlag, New York.
Fraiman, R., Yohai, V. J. and Zamar, R. H. (2001). Optimal robust M-estimates of location. Ann. Stat., 29(1), 194–223.
Hajek, J. (1972). Local asymptotic minimax and admissibility in estimation. Proc. 6th Berkeley Sympos. math. Statist. Probab., Univ. Calif. 1970, 1, 175–194.
Hampel, F. R. (1968). Contributions to the theory of robust estimation. Dissertation, University of California, Berkeley, CA.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust statistics. The approach based on influence functions. Wiley, New York.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Stat., 35, 73–101.
Huber, P. J. (1981). Robust statistics. Wiley, New York.
Huber, P. J. (1997). Robust statistical procedures. 2nd ed. CBMS-NSF Regional Conference Series in Applied Mathematics. 68. Philadelphia, PA: SIAM.
Huber-Carol, C. (1970). Étude asymptotique de tests robustes. Thèse de Doctorat, ETH Zürich.
Hubert, M. and Vandervieren, E. (2006). An Adjusted Boxplot for Skewed Distributions. Technical Report TR-06-11, KU Leuven, Section of Statistics, Leuven. URL http://wis.kuleuven.be/stat/robust/Papers/TR0611.pdf.
Kohl, M. (2005). Numerical contributions to the asymptotic theory of robustness. Dissertation, University of Bayreuth, Bayreuth.
Kohl, M. (2008). RobLox: Optimally robust influence curves for location and scale. R package version 0.6.1. URL http://robast.r-forge.r-project.org.
Kohl, M. and Ruckdeschel, P. (2008a). RandVar: Implementation of random variables. R package version 0.6.6. URL http://robast.r-forge.r-project.org/.
Kohl, M. and Ruckdeschel, P. (2008b). RobAStBase: Robust Asymptotic Statistics. R package version 0.1.5. URL http://robast.r-forge.r-project.org.
Kohl, M. and Ruckdeschel, P. (2008). ROptEst: Optimally robust estimation. R package version 0.6.3. URL http://robast.r-forge.r-project.org.
Marazzi, A., Paccaud, F., Ruffieux, C. and Beguin, C. (1998). Fitting the distributions of length of stay by parametric models. Medical Care, 36, 915–927.
Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006). Robust Statistics: Theory and Methods. Wiley, New York.
R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
Reeds, J. A. (1976). On the Definition of von Mises Functionals. Ph.D. Thesis, Harvard University, Cambridge.
Rieder, H. (1978). A robust asymptotic testing model. Ann. Stat., 6, 1080–1094.
Rieder, H. (1980). Estimates derived from robust tests. Ann. Stat., 8, 106–115.
Rieder, H. (1994). Robust asymptotic statistics. Springer, New York.
Rieder, H., Kohl, M. and Ruckdeschel, P. (2008). The cost of not knowing the radius. Stat. Meth. & Appl., 17, 13–40.
Rieder, H. and Ruckdeschel, P. (2001). Short Proofs on $L_r$ Differentiability. Stat. Decis., 19, 419–425.
Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. Wiley, New York.
Ruckdeschel, P. (2006). A Motivation for $1/\sqrt{n}$-Shrinking-Neighborhoods. Metrika, 63(3), 295–307.
Ruckdeschel, P. (2004). Higher Order Asymptotics for the MSE of M-Estimators on Shrinking Neighborhoods. Unpublished manuscript.
Ruckdeschel, P. (2008a). Uniform Integrability on Neighborhoods. In preparation.
Ruckdeschel, P. (2008b). Uniform Higher Order Asymptotics for Risks on Neighborhoods. In preparation.
Ruckdeschel, P., Kohl, M., Stabla, T. and Camphausen, F. (2006). S4 classes for distributions. R News, 6(2), 2–6.
Ruckdeschel, P., Kohl, M., Stabla, T. and Camphausen, F. (2008). S4 Classes for Distributions – a manual for packages distr, distrSim, distrTEst, distrEx, distrMod, and distrTeach. Technical Report, Fraunhofer ITWM, Kaiserslautern, Germany.
Ruckdeschel, P. and Rieder, H. (2004). Optimal influence curves for general loss functions. Stat. Decis., 22, 201–223.
Rutherford, E. and Geiger, H. (1910). The Probability Variations in the Distribution of alpha Particles. Philosophical Magazine, 20, 698–704.
Shevlyakov, G., Morgenthaler, S. and Shurygin, A. (2008). Redescending M-estimators. J. Stat. Plan. Inference, 138(10), 2906–2917.
van der Vaart, A. W. (1998). Asymptotic statistics. Cambridge Univ. Press, Cambridge.
Witting, H. (1985). Mathematische Statistik I: Parametrische Verfahren bei festem Stichprobenumfang. B.G. Teubner, Stuttgart.
Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression. Ann. Statist., 15(2), 642–656.
