Yet another breakdown point notion: EFSBP - illustrated at scale-shape models

Peter Ruckdeschel · Nataliya Horbenko

Abstract The breakdown point in its different variants is one of the central notions to quantify the global robustness of a procedure. We propose a simple supplementary variant which is useful in situations where we have no obvious or only partial equivariance: Extending the Donoho and Huber (1983) Finite Sample Breakdown Point, we propose the Expected Finite Sample Breakdown Point to produce less configuration-dependent values while still preserving the finite sample aspect of the former definition. We apply this notion to joint estimation of scale and shape (with only scale-equivariance available), exemplified for generalized Pareto, generalized extreme value, Weibull, and Gamma distributions. In these settings, we are interested in highly robust, easy-to-compute initial estimators; to this end we study Pickands-type and Location-Dispersion-type estimators and compute their respective breakdown points.

Keywords global robustness, finite sample breakdown point, partial equivariance, scale-shape parametric family, LD estimator

This work was supported by a DAAD scholarship for N.H.
P. Ruckdeschel · N. Horbenko, Fraunhofer ITWM, Department of Financial Mathematics, Fraunhofer-Platz 1, D-67663 Kaiserslautern, and Dept. of Mathematics, University of Kaiserslautern, P.O. Box 3049, D-67653 Kaiserslautern. E-mail: peter.ruckdeschel@itwm.fraunhofer.de, nataliya.horbenko@itwm.fraunhofer.de

1 Introduction

In an industrial project to compute robust variants of OpVar, i.e., the regulatory capital required in Basel II (2006) for a bank to cover its operational risk, we came across the problem of determining the (finite sample) breakdown point of certain procedures under consideration. Here operational risk is by definition "the risk of direct or indirect loss resulting from inadequate or failed internal processes, people and systems or from external events."

These extremal events, as motivated by the Pickands-Balkema-de Haan Extreme Value Theorem (see Balkema and de Haan (1974), Pickands (1975)), suggest the use of the generalized Pareto distribution (GPD) for modeling in this context. In an intermediate step this modeling involves estimation of the scale and shape parameters of this distribution. To this end, several robust procedures have been proposed in the literature; see Ruckdeschel and Horbenko (2010) for a more detailed discussion.

One of the quantities to judge robustness of a procedure is the breakdown point (see Definition 3.1). In particular, we are interested in the finite sample version FSBP of this notion, to be able to quantify the degree of protection a procedure provides in the estimation at an actual (finite) set of observations.

It turns out that for our purposes the original definition has some drawbacks, as it depends strongly on the configuration of the actual sample. To get rid of the dependence on possibly highly improbable sample configurations while still preserving the aspect of a finite sample, we propose an expected FSBP, EFSBP, i.e., to integrate out the FSBP with respect to the ideal distribution.
We illustrate the usefulness of this new concept for scale-shape models by means of two types of robust estimators, quantile-type estimators (Pickands Estimator PE) and robust Location-Dispersion (LD) estimators as introduced by Marazzi and Ruffieux (1999); for the latter type we study estimators based on the median for the location part and several robust scale estimators for the dispersion part: a (new) asymmetric version of the median of absolute deviations kMAD, as well as Qn and Sn from Rousseeuw and Croux (1993), combined to MedkMAD, MedQn, and MedSn, respectively.

These estimators are meant to be used as initial estimators with acceptable to good global robustness properties for (more efficient) robust estimators afterwards. In particular, they can be computed without the need of additional (robust, consistent) initial estimators, which precludes otherwise promising alternatives like Minimum Distance estimators, for which we could have read off asymptotic breakdown point values as high as half the optimal value from Donoho and Liu (1988). We have also excluded the method-of-median approach of Peng and Welsh (2001), because, in contrast to PE, MedkMAD, MedQn, and MedSn, no explicit calculations are possible for this estimator in the GPD and GEVD case. We have studied this approach in another paper, though (Ruckdeschel and Horbenko (2010)), and empirically found that in the GPD case its breakdown behavior is worse than that of MedkMAD and MedQn.

Our paper is organized as follows: In Section 2, we list our reference examples for scale-shape models, i.e., the generalized Pareto, the generalized extreme value, the Weibull, and the Gamma distribution, as well as the Gross Error model which we use to capture deviations from the ideal model.
In Section 3, we recall the standard definitions of the asymptotic and finite sample breakdown points ABP and FSBP and introduce the new concept of EFSBP in Definition 3.2. Section 4 then defines the considered estimators, i.e., quantile-type estimators PE, and LD estimators MedkMAD, MedQn, MedSn. At these estimators, we demonstrate our new breakdown point notion in Section 5, giving analytic formulae for FSBP, ABP, and EFSBP in Propositions 5.1, 5.2, and 5.3, together with some numerical evaluations of EFSBP at some reference situation and with simulation-based evaluations. Proofs for our results are gathered in Appendix A.

Remark 1.1 This paper is a part of the PhD thesis of the second author; a preliminary version of it may be found in Ruckdeschel and Horbenko (2010).

2 Model Setting

For notions of invariance of statistical models and equivariance of estimators we refer to Eaton (1989): Given a measurable space $(\Omega, \mathcal{B})$, a family of probability measures $\mathcal{P}$ defined on $\mathcal{B}$ is a statistical model. Notationally, we use the same symbol for the cumulative distribution function (c.d.f.) and the probability measure; we write $F(x-0)$ to denote left and, correspondingly, $+0$ for right limits, and $F^-$ to denote the right continuous quantile function given by $F^-(s) = \inf\{t \in \mathbb{R} : F(t) \ge s\}$.

Definition 1 Suppose a group $G$ acts measurably on $\Omega$. Model $\mathcal{P}$ is called $G$-invariant iff for each $P \in \mathcal{P}$, the image probability $gP$ of $P$ under group action $g$ stays in $\mathcal{P}$.

For simplicity, we assume that $g(P_1) = g(P_2)$ implies $P_1 = P_2$ for any two elements of $\mathcal{P}$. In a $G$-invariant parametric model $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$, where $\Theta$ is the parameter space, group $G$ induces an isomorphic group $\tilde G$, acting on the parameter space with the identification $g(P_\theta) = P_{\tilde g(\theta)}$. In this situation, a point estimator $t$ mapping $\Omega$ to $\Theta$ is equivariant iff $t(g(x)) = \tilde g(t(x))$.
2.1 Generalized Pareto Distribution and Other Scale-Shape Families

We illustrate our concepts at scale-shape models; our reference example is the three-parameter generalized Pareto distribution (GPD) which has c.d.f. and density

$$F_\theta(x) = 1 - \Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-\frac{1}{\xi}}, \qquad f_\theta(x) = \frac{1}{\beta}\Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-\frac{1}{\xi}-1} \tag{2.1}$$

where $x \ge \mu$ for $\xi \ge 0$, and $\mu < x \le \mu - \beta/\xi$ if $\xi < 0$. It has parameter $\theta = (\xi, \beta, \mu)^\tau$, for location $\mu$, scale $\beta > 0$ and shape $\xi$. Special cases of GPDs are the uniform ($\xi = -1$), the exponential ($\xi = 0$, $\mu = 0$), and Pareto ($\xi > 0$, $\beta = 1$) distributions. We limit ourselves to the case of known location $\mu = 0$ and unknown scale and shape here and abbreviate the pair $(\beta, \xi)$ by $\vartheta$, i.e., we are concerned with joint estimation of $\vartheta = (\beta, \xi)$ only.

Other scale-shape families for which our considerations apply mutatis mutandis are the generalized extreme value distribution (GEVD) given by its c.d.f.

$$F_\theta(x) = \exp\Big(-\Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-\frac{1}{\xi}}\Big)\, \mathrm{I}_{(-\beta/\xi + \mu,\ \infty)}(x), \tag{2.2}$$

the Weibull distribution with density

$$f_\vartheta(x) = \frac{\xi}{\beta}\Big(\frac{x}{\beta}\Big)^{\xi-1} \exp\big(-(x/\beta)^{\xi}\big)\, \mathrm{I}_{(0,\infty)}(x), \tag{2.3}$$

and the Gamma distribution with density

$$f_\vartheta(x) = \frac{x^{\xi-1}}{\beta^{\xi}\,\Gamma(\xi)} \exp\big(-(x/\beta)\big)\, \mathrm{I}_{(0,\infty)}(x). \tag{2.4}$$

For the Weibull and Gamma case we require $\xi > 0$, whereas in the GEVD case the same distinction applies as in the GPD case.

Reparametrization. In the Weibull family, passage to the log-observations transforms this model into a location-scale model with the standard Gumbel as central distribution. This approach has been taken by Boudt et al (2011), and allows them to recur to the rich theory (both classical and robust) available for location-scale models.
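The log-transformation of the Weibull model can be checked numerically. The following is a minimal sketch, assuming the Gumbel law in its convention for minima (c.d.f. $1 - \exp(-\exp(y))$) and using hypothetical helper names that do not appear in the paper: the logarithm of a Weibull quantile should equal a Gumbel quantile with location $\log\beta$ and scale $1/\xi$.

```python
import math


def weibull_quantile(p, beta, xi):
    """Quantile of the Weibull distribution with density (2.3)."""
    return beta * (-math.log1p(-p)) ** (1.0 / xi)


def gumbel_min_quantile(p):
    """Quantile of the standard Gumbel law for minima (assumed convention)."""
    return math.log(-math.log1p(-p))


def log_weibull_quantile(p, beta, xi):
    """Location-scale form: location log(beta), scale 1/xi."""
    return math.log(beta) + gumbel_min_quantile(p) / xi


if __name__ == "__main__":
    beta, xi = 1.0, 0.7  # reference values of the paper
    for p in (0.1, 0.5, 0.75, 0.99):
        direct = math.log(weibull_quantile(p, beta, xi))
        loc_scale = log_weibull_quantile(p, beta, xi)
        print(f"p={p}: log-quantile {direct:.6f}, location-scale form {loc_scale:.6f}")
```

Both columns of the printout should agree up to rounding, which is what makes location-scale theory applicable on the log scale.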
In both GPD and GEVD, a similar approach is possible, once instead of $\mu$ we use $\tilde\mu = \mu - \beta/\xi$, so that in this setting we get

$$1 + \xi\,\frac{x-\mu}{\beta} = \xi\,\frac{x-\tilde\mu}{\beta}. \tag{2.5}$$

In the GPD case, this leads to a location-scale model with the standard Exponential as central distribution. This parametrization is used for the two-parameter Pareto distribution, e.g. in Brazauskas and Serfling (2000).

Two issues, however, are bought with this approach: First, knowledge of $\mu$ is not the same as knowledge of $\tilde\mu$, so our original setting where $\mu$ was assumed known does not carry over easily. Second, the corresponding transformed model about the Exponential distribution is not smooth ($L_2$-differentiable, to be precise). The reason for this is essentially that observations around the left endpoint of the distribution carry overwhelmingly much information about the location parameter. As a consequence, usual optimality theory is no longer available, and in the ideal model setting there are estimators which are consistent at faster rates than the usual $1/\sqrt{n}$. On the other hand, this high accuracy requires basing inference almost completely on the minimal observations, which makes these procedures extremely prone to outliers. Robustifications avoid this problem, but still, due to the lack of smoothness, no optimality theory is available. For this reason, we stick to the original parametrization.

Our reference model. In the sequel, we use the reference values $\beta = 1$ and $\xi = 0.7$ for all our scale-shape models; in case of the GPD this amounts to moderately fat tails, which reflects well the situation we met in our application to OpVar.

In-/equivariance. The reduced model enjoys a certain invariance: with an included scale component, it remains invariant under scale transformations $s_\beta(x) = \beta x$ of the observations.
Using the matrix $d_\beta = \mathrm{diag}(\beta, 1)$, this invariance is reflected by a corresponding notion of equivariance of estimators, i.e., an estimator $S$ for $\vartheta = (\beta, \xi)$ is called scale-equivariant if

$$S(\beta x_1, \ldots, \beta x_n) = d_\beta\, S(x_1, \ldots, x_n). \tag{2.6}$$

For the shape parameter $\xi$, there is no obvious such invariance, entailing a dependence of estimator properties like robustness on this parameter.

2.2 Gross Error Model

Extending the ideal model setting, Robust Statistics defines suitable distributional neighborhoods about this ideal model. In this paper, we limit ourselves to the Gross Error Model, i.e., as neighborhoods, we use the sets of all distributions $F^{\rm re}$ representable as

$$F^{\rm re} = (1-\varepsilon)\, F^{\rm id} + \varepsilon\, F^{\rm di} \tag{2.7}$$

for some given size or radius $\varepsilon > 0$, where $F^{\rm id}$ is the underlying ideal distribution and $F^{\rm di}$ some arbitrary, unknown, and uncontrollable contaminating distribution.

3 Global Robustness: the Breakdown Point

In this paper we focus on the Breakdown Point as a global measure of robustness, specifying the reliability of a procedure under massive deviations from the ideal model. In the gross error model (2.7), it gives the largest radius $\varepsilon$ at which the estimator still produces meaningful results.

In standard literature on Robust Statistics, there are two notions of breakdown point: the asymptotic (functional) breakdown point (ABP) and the finite sample breakdown point (FSBP), introduced in Hampel (1968) and Donoho and Huber (1983), respectively:

Definition 3.1 (a) (Hampel et al, 1986, 2.2 Definition 1) The asymptotic breakdown point (ABP) $\varepsilon^*$ of the sequence of estimators $T_n$ for parameter $\theta \in \Theta$ at probability $F$ is given by

$$\varepsilon^* := \sup\Big\{ \varepsilon \in (0,1];\ \text{there is a compact set } K_\varepsilon \subset \Theta \text{ s.t. } \pi(F,G) < \varepsilon \implies G(\{T_n \in K_\varepsilon\}) \xrightarrow{\,n\to\infty\,} 1 \Big\} \tag{3.1}$$

where $\pi$ is Prokhorov distance.
(b) (Hampel et al, 1986, 2.2 Definition 2) The finite sample breakdown point (FSBP) $\varepsilon^*_n$ of the estimator $T_n$ at the sample $(x_1, \ldots, x_n)$ is given by

$$\varepsilon^*_n(T_n; x_1, \ldots, x_n) := \frac{1}{n} \max\Big\{ m;\ \max_{i_1,\ldots,i_m}\ \sup_{y_1,\ldots,y_m} |T_n(z_1, \ldots, z_n)| < \infty \Big\}, \tag{3.2}$$

where the sample $(z_1, \ldots, z_n)$ is obtained by replacing the data points $x_{i_1}, \ldots, x_{i_m}$ by arbitrary values $y_1, \ldots, y_m$.

Note that $\varepsilon^*_n$ from (3.2) is by $1/n$ smaller than the Donoho and Huber (1983) FSBP.

Definition 3.1 (b) does not cover the scale case, where we must take into account the possibility of implosion as well: As noted by an anonymous referee, otherwise one could achieve arbitrarily high breakdown points by choosing estimators based on two very low quantiles, which of course would not be stable at all; this argument is valid in the location-scale case as well. A remedy for the scale parameter is given by the log-transformation as mentioned in He (2005), i.e.,

$$\varepsilon^*_n(T_n; x_1, \ldots, x_n) := \frac{1}{n} \max\Big\{ m;\ \max_{i_1,\ldots,i_m}\ \sup_{y_1,\ldots,y_m} |\log(T_n(z_1, \ldots, z_n))| < \infty \Big\}. \tag{3.3}$$

Breakdown and partial invariance. By arguments given in Davies and Gather (2005), a certain equivariance of the considered estimator under a suitable group of transformations is required to obtain meaningful upper bounds for the breakdown point. In our scale-shape models, however, as indicated in Section 2.1, we canonically only have scale invariance.
This lack of complete equivariance does not invalidate the cited authors' considerations; rather, these can be extended to also cover this partial invariance: Due to the lack of shape-equivariance, we conjecture that defective constructions similar to those which produce breakdown points arbitrarily close to 1 in the AR(1) case (as mentioned in Genton and Lucas (2005)) should be feasible in the pure shape case as well. In the joint scale-shape case, however, imposing scale-equivariance, we do obtain sensible upper bounds, as such constructions are eliminated by this (partial) equivariance.

In particular, as the scale model is a submodel of our scale-shape model, the corresponding upper bounds for the maximal breakdown point among all scale-equivariant estimators from Davies and Gather (2005, Thms. 3.1, 3.2) remain valid in our setting without change. Hence, in the sequel, we restrict ourselves to scale-equivariant estimators. In particular, following Davies and Gather (2007, sec. 4.2), we note that with $n_0$ being the highest frequency of a single data point in the original sample,

$$\varepsilon^*_n \le \Big\lfloor \frac{(n - n_0 - 1)_+}{2} \Big\rfloor \Big/ n \tag{3.4}$$

(adapted to (3.2)) among all scale-equivariant estimators.

Breakdown and restricted parameter space. In the GPD and GEVD families, there are two canonical parameter spaces for $\xi$: Either one does not impose any restriction, i.e., $\xi \in \mathbb{R}$ (which could be seen as "natural" there), or one restricts $\xi$ to be positive (which is the only possibility for the Weibull and Gamma case).

In the GPD and GEVD case, $\xi = 0$ is a discontinuity as to the statistical properties of the model, comparable to parameter values $\pm 1$ in the AR(1) model. While GPD and GEVD for $\xi < 0$ have compact support, in the AR(1) model $\pm 1$ mark the border of stationarity.
In both cases, the discontinuity only becomes visible when passing to sequences of observations, in our case when motivating GPD and GEVD by asymptotic arguments, i.e., by the Pickands-Balkema-de Haan and Fisher-Tippett-Gnedenko Extreme Value Theorems. To this end we need a uniformity over sets of quantiles which gets lost when passing over the value $\xi = 0$. In particular, shape in the GPD and GEVD models decides to which domain of attraction the underlying distribution belongs in the corresponding Extreme Value Limit Theorems.

In both the scale-shape and the AR(1) case, it is hence well debatable to restrict the parameter space accordingly, see Genton and Lucas (2005) and the rejoinder in Davies and Gather (2005, p. 1033). For example, we are mainly interested in the case when $\xi > 0$, which corresponds to heavy-tailed GPD / GEVD, and an estimate $\xi \le 0$ would lead to drastic under-estimation of the corresponding operational risk.

In the sequel, for the GPD and GEVD cases, we hence consider both situations: with and without restriction on the parameter space, i.e., that $\xi > 0$ or $\xi \in \mathbb{R}$. Similar arguments could be carried out in case of shape estimation in the Weibull case, where $0 < \xi < 1$ corresponds to heavy-tailed, $\xi \ge 1$ to light-tailed distributions; we do not pursue this further here.

Breakdown and finite samples. As for our purposes reliability at finite samples is of primary interest, we will focus on the FSBP. For deciding upon which procedure to take before having made observations, in particular for ranking procedures in a simulation study, the FSBP from Definition 3.1 (b) has some drawbacks: It is deliberately probability-free and based on an actual sample $(x_1, \ldots, x_n)$, which we assume from the ideal situation for the moment. Hence its value depends on the configuration of this sample.
This is desirable when checking safety of a procedure at an actual data set, but also entails that for the estimators considered in this paper, a generally valid value for FSBP does not exist, and the only possible universal lower bound will be the minimal possible value of 0; and even if we made a sample-wise restriction, banning such samples from the application of the estimator, we would have other ones to come up with an FSBP of $1/n$ and so forth. This does not reflect the situation to be expected in the ideal model, though. Hence, we follow the general spirit of robustness to tie robustness concepts to a central ideal probability model (compare Definition 3.1 (a)): To get rid of the dependence on possibly highly improbable sample configurations leading to an overly small FSBP, but still preserving the aspect of a finite sample, we propose an expected FSBP:

Definition 3.2 For an estimator $T$ with FSBP $\varepsilon^*_n = \varepsilon^*_n(T; X_1, \ldots, X_n)$, we define the expected FSBP or EFSBP as

$$\bar\varepsilon^*_n(T) := \mathrm{E}\, \varepsilon^*_n(T; X_1, \ldots, X_n) \tag{3.5}$$

where expectation is evaluated in the ideal model. At some places, if existent, for a sequence $T$ of estimators $T_n$, we also consider the limit

$$\bar\varepsilon^*(T) := \lim_{n\to\infty} \bar\varepsilon^*_n(T_n) \tag{3.6}$$

which, for brevity, we also call EFSBP where unambiguous.

Admittedly, the evaluation of the expectation in (3.5) in general assumes knowledge of the parameter, but some vague prior information could be used to restrict the range of the plausible parameter values, say to $\xi \in (0.5; 2)$, and take the worst behavior of $\bar\varepsilon^*_n(T)$ on this range to base our decisions on, compare, e.g. Figure 2.
Weighted by their (ideal) occurrence probability, by this definition, improbable sample configurations of the ideal sample (before contamination) are smoothed out in EFSBP; we still cannot exclude these configurations, but usually, by corresponding Chebyshev-type inequalities, for growing sample size $n$ these will occur with decreasing probability and $\varepsilon^*_n$ will concentrate about $\bar\varepsilon^*_n$. Hence, in practice, without extra knowledge, à priori, the user can rely on being protected against up to $\bar\varepsilon^*_n(T)\, n$ outliers on average; i.e., although there may be (rare) cases where we have considerably less protection, these cases are balanced by corresponding cases with considerably stronger protection.

By averaging, EFSBP is closer again to the ABP of Hampel (1968), but preserves the finite sample aspect of FSBP. In the examples, we will show that this aspect is non-negligible, and that for sample sizes about 40, the ABP will still be somewhat misleading (see Table 2 and Figure 3 below), while at the same time, as mentioned, FSBP will be way too pessimistic. By dominated convergence though, the limit of EFSBP will coincide with the ABP whenever the FSBP converges to the ABP.

Small values of $\varepsilon^*_n$ for particular samples do not only occur in the models discussed here: In the one-dimensional normal scale model, we can already have FSBP of 0 for the median of absolute deviations MAD for large enough values of $n_0$ as introduced before (3.4). Such events (and similarly extraneous sample configurations), however, occur with probability 0 in a continuous setting.
Otherwise, in situations where an FSBP of 0 could occur with positive probability in the ideal model, necessarily we have mass points violating the standard smoothness assumptions usually required in scale models: the corresponding Fisher information of scale would be infinite then, compare Ruckdeschel and Rieder (2010), and one may then rather question the use of MAD. In our case, this is somewhat different, as without arbitrary restrictions on the sample space, samples with FSBP of 0 can occur with small but positive ideal probability (see $p_0$ in Table 2), although our model remains smooth (and Fisher information finite).

4 Robust Estimator Types

We illustrate the concept of FSBP in our scale-shape models for Pickands-type and LD-type estimators, as defined in the sequel.

4.1 Pickands Estimator

Pickands estimator (PE) for GPD is a special case of the Elementary Percentile Method (EPM) as discussed by Castillo and Hadi (1997) for GPD. Such estimators are based on the empirical quantiles; in our case, we follow Pickands (1975) and use the empirical 50% and 75% quantiles $\hat Q_2$ and $\hat Q_3$. Pickands estimators for $\xi$ and $\beta$ in the GPD model then are defined as

$$\hat\xi = \frac{1}{\log(2)} \log\frac{\hat Q_3 - \hat Q_2}{\hat Q_2}, \qquad \hat\beta = \hat\xi\, \frac{\hat Q_2^2}{\hat Q_3 - 2\hat Q_2} \tag{4.1}$$

where we see that for $\hat\beta > 0$ we have to require $\hat Q_3 > 2\hat Q_2$, in which case $\hat\xi > 0$ automatically. Apparently PE is equivariant in the sense of (2.6).

For GEVD, analogue estimates can be obtained by

$$\hat\xi = \Big\{ \xi \in \mathbb{R}\ \Big|\ \frac{\hat Q_3 - \hat Q_2}{\hat Q_2} = q_0(\xi) \Big\}; \qquad q_0(\xi) = \frac{\log(4/3)^{-\xi} - \log(2)^{-\xi}}{\log(2)^{-\xi} - 1}, \tag{4.2}$$

$$\hat\beta = \hat\xi\, \frac{\hat Q_2^2}{\hat Q_3 - 2\hat Q_2} \cdot \frac{\log(4/3)^{-\hat\xi} + 1 - 2\log(2)^{-\hat\xi}}{\log(2)^{-2\hat\xi} + 1 - 2\log(2)^{-\hat\xi}} \tag{4.3}$$

where $q_0$ is obviously smooth, and, if plotted, easily seen to be strictly isotone, compare Figure 1; in particular, $\hat\xi > 0$ iff $\hat Q_3 > \hat Q_2 (1 + q_0(0)) \doteq 3.39\, \hat Q_2$, and $\hat\beta > 0$ iff $\hat Q_3 > 2 \hat Q_2$.
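The closed forms (4.1) and the function $q_0$ from (4.2) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the quantiles are passed in as numbers (so no particular empirical quantile convention is implied), and the branch for $\hat Q_3 = 2\hat Q_2$ uses the exponential limit $\hat\xi = 0$, $\hat\beta = \hat Q_2/\log 2$, which is an added assumption rather than part of the paper's definition.

```python
import math


def gpd_quantile(p, beta, xi):
    """Quantile of the GPD (2.1) with mu = 0 and xi != 0."""
    return beta / xi * math.expm1(-xi * math.log1p(-p))


def pickands_gpd(q2, q3):
    """Pickands estimator (4.1) from the 50% and 75% quantiles; returns (xi, beta)."""
    if q2 <= 0 or q3 <= q2:
        raise ValueError("need 0 < q2 < q3")
    if q3 == 2.0 * q2:
        # assumed handling of the limit case xi = 0 (exponential distribution)
        return 0.0, q2 / math.log(2.0)
    xi_hat = math.log((q3 - q2) / q2) / math.log(2.0)
    beta_hat = xi_hat * q2 ** 2 / (q3 - 2.0 * q2)
    return xi_hat, beta_hat


def q0(xi):
    """q_0 from (4.2) for xi != 0, written with expm1 for stability near 0."""
    a = math.expm1(-xi * math.log(math.log(4.0 / 3.0)))  # log(4/3)^(-xi) - 1
    b = math.expm1(-xi * math.log(math.log(2.0)))        # log(2)^(-xi) - 1
    return (a - b) / b


if __name__ == "__main__":
    # Feeding the exact GPD quantiles into (4.1) should return the true parameters.
    beta, xi = 1.0, 0.7
    print(pickands_gpd(gpd_quantile(0.5, beta, xi), gpd_quantile(0.75, beta, xi)))
    print(q0(1e-6))  # numerical proxy for the limit q_0(0)
```

In practice the two arguments of `pickands_gpd` would be the empirical quantiles $\hat Q_2$ and $\hat Q_3$ of the sample.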
[Fig. 1: $q_0(\xi)$ for different values of $\xi$, with $q_0(0) = 2.3993$ marked; note the logarithmic y-scale]

In the Weibull model, Boudt et al (2011) have shown Pickands (quantile) estimators to have an explicit representation as

$$\hat\xi = \frac{f^{-1}_{1,1}(3/4) - f^{-1}_{1,1}(1/2)}{\log(\hat Q_3) - \log(\hat Q_2)}, \qquad \hat\beta = \hat Q_2 \big/ (-\log(1/2))^{1/\hat\xi} \tag{4.4}$$

where $f^{-1}_{1,1}(\alpha) = \log(-\log(1-\alpha))$.

For the Gamma distribution the quantile estimates have no closed-form solutions, so the matching of empirical and theoretical quantiles is to be done numerically by root solving procedures.

4.2 MedkMAD and other LD estimators

Location-Dispersion estimators, introduced by Marazzi and Ruffieux (1999), match empirical location and dispersion measures of data against their population counterparts to get the estimates of model parameters, and are applicable for asymmetric location-scale (Lognormal), as well as in scale-shape models (GPD, Pareto, Weibull, Gamma).

Let $\theta = (\alpha, \sigma)$ be a parameter vector, $F_n$, $F_{\alpha,\sigma}$ empirical and model distribution functions, and $m(F_n)$, $s(F_n)$, $m(F_{\alpha,\sigma})$, $s(F_{\alpha,\sigma})$ the corresponding empirical and model location and dispersion; then LD estimators $(\hat\alpha, \hat\sigma)$ are solutions of

1) $\hat\sigma\, m(F_{0,1}) + \hat\alpha = m(F_n)$, $\hat\sigma\, s(F_{0,1}) = s(F_n)$ when $\alpha$ is a location parameter,
2) $\hat\sigma\, m(F_{\hat\alpha,1}) = m(F_n)$, $\hat\sigma\, s(F_{\hat\alpha,1}) = s(F_n)$ when $\alpha$ is a shape parameter.

Efficiency and robustness of these estimators depend on the choice of $m(\cdot)$ and $s(\cdot)$, and, of course, on the respective parametric model. Mean and standard deviation are classical measures for location and dispersion, respectively. Robust alternatives are median and trimmed mean for location, and IQR, MAD, trimmed MAD, Sn, Qn for dispersion. In addition, for asymmetric distributions, we propose a new dispersion measure, namely kMAD.
Table 1 displays different variations for LD estimators with increasing efficiency together with corresponding references.

Table 1 LD estimators and literature on their use in scale-shape models

Location     | Dispersion                           | Reference (dispersion)          | Location/Dispersion
Median       | IQR (InterQuantile Range)            |                                 | Marazzi and Ruffieux (1999) (Gamma, Weibull)
Median       | MAD (Median of Absolute Deviations)  |                                 | Boudt et al (2011) (Weibull) [1]
trimmed Mean | trimmed M(ean)AD                     | Marazzi and Ruffieux (1999)     | Marazzi and Ruffieux (1999) (Gamma, Weibull)
Median       | kMAD                                 | Ruckdeschel and Horbenko (2010) | Ruckdeschel and Horbenko (2010) (GPD)
Median       | Sn                                   | Rousseeuw and Croux (1993)      | (none)
Median       | Qn                                   | Rousseeuw and Croux (1993)      | Boudt et al (2011) (Weibull)

[1] unchecked credit given to Olive (2006) in the cited reference

Definitions of some particular LD estimators. Empirical median $\hat m = \hat m_n$ and median of absolute deviations $\hat M = \hat M_n$ are well known for their high breakdown point, jointly achieving the highest possible asymptotic breakdown point of 50% among all affine equivariant estimators at symmetric, continuous univariate distributions. Hence it is plausible to define an estimator for $\xi$ and $\beta$, matching $\hat m$ and $\hat M$ against their population counterparts $m$ and $M$ within a scale-shape model. It turns out that the mapping $(\beta, \xi) \mapsto (m, M)(F_\vartheta)$ is indeed a diffeomorphism, hence for sufficiently large sample size $n$, we can solve the implicit equations for $\beta$ and $\xi$ to obtain the MedMAD estimator.

More efficient estimators for dispersion than MAD, but with the same breakdown point of 50% at continuous distributions, and in particular suitable for asymmetric distributions, have been proposed in Rousseeuw and Croux (1993) as $\hat M = Q_n$ and $\hat M = S_n$. In this context,

$$Q_n = \{ |x_i - x_j|;\ i < j \}_{(k)}, \qquad k = \binom{h}{2} \approx \binom{n}{2} \Big/ 4, \qquad h = \lfloor n/2 \rfloor + 1,$$

while

$$S_n = \mathrm{med}_i \{ \mathrm{med}_j\, |x_i - x_j| \}$$

where in case of discrepancies, the inner median is to be taken as hi-med, the outer as lo-med, where $\text{lo-med}(F) = F^-(1/2)$, and $\text{hi-med}(F) = F^-(1/2 + 0)$. The resulting LD estimators are named MedQn and MedSn, respectively. Note that for asymmetric $G$, the functionals $S(G) = \mathrm{med}_X\, \mathrm{med}_Y |X - Y|$, $X, Y \sim G$, and $Q(G) = \inf\{ s > 0;\ \int G(t + d^{-1} s)\, dG(t) \ge 5/8 \}$ involve expensive, careful numerical calculations, in particular for the heavy-tailed GPD and GEVD cases.

In the GEVD and GPD case, due to their considerable skewness to the right, one can improve the MedMAD estimator considerably, using a dispersion functional that takes this skewness into account: For a distribution $F$ on $\mathbb{R}$ with median $m$ let us define for $k > 0$

$$\mathrm{kMAD}(F, k) := \inf\{ t > 0 \mid F(m + kt) - F(m - t) \ge 1/2 \} \tag{4.5}$$

i.e., kMAD only searches among the class of intervals about the median $m$ with covering probability 50%, where the part right of $m$ is $k$ times longer than the one left of $m$, and returns the shortest of these. In our case, $k$ would be chosen to be a suitable number larger than 1, and $k = 1$ would reproduce the MAD.
Apparently, whenever $F$ is continuous, kMAD preserves the ABP of the MAD of 50%, i.e., covering both the explosion and implosion case.

Computation of LD estimators. Each of our dispersion estimators Sn, Qn, and kMAD is scale-equivariant, and the same also holds for the respective population counterparts, as well as for any fixed quantile, in particular for the median; hence, denoting the dispersion functional by $s$, both the quotient $q(\xi) := s(\beta, \xi) / m(\beta, \xi)$ and its empirical counterpart $\hat q_n$ ($q_k$, $\hat q_{k;n}$ for MedkMAD) are scale-free; so we have reduced the problem by one dimension. In the sequel we also write $q_k$, $\hat q_{k;n}$ for Sn and Qn, where $k$ is then simply void. Assuming continuity and monotonicity, we obtain an estimator for $\xi$ given by $\hat\xi_n = q_k^{-1}(\hat q_{n,k})$. A corresponding estimator for $\beta$ for each of the variants kMAD, Sn, and Qn is then simply given by

$$\hat\beta_n = \hat m / m(1, \hat\xi_n). \tag{4.6}$$

In particular, by construction all LD estimators are equivariant in the sense of (2.6).

Continuity and monotonicity of $q$ as a function in $\xi$ ensure existence and uniqueness of the implicitly defined estimator for $\xi$.

Continuity of $q_k$ in $\xi$ for all our scale-shape models, i.e., GPD, GEVD, Gamma, and Weibull, and all our dispersion functionals $\mathrm{kMAD}(k)$, $S$ and $Q$ is straightforward, even for the limit cases $\xi \to 0$.

Monotonicity of $q_k$, though, is not so obvious from the analytic terms, but the plots of the function $\xi \mapsto q(\xi)$ for dispersions kMAD, Sn, and Qn in Figure 4 indicate strict monotonicity for each of the dispersions and the GPD, Gamma, and Weibull cases, while for the GEVD case, $q$ is bitone with maximum $\bar q_k$ taken in $\xi_0 > 0$. To obtain consistent estimators in this case, we restrict ourselves to the range left or right of $\xi_0$ containing $\xi = 0.7$ in this paper.
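For the GPD with $\mu = 0$, the reduction to one dimension described above can be sketched as follows. Everything here is an assumption made for illustration: the helper names, the plain bisection, the search range for $\xi$, the use of the ordinary sample median, and the order-statistic form of the empirical kMAD. Monotonicity of $q_k$ is taken for granted, as in the text.

```python
import math
import random
import statistics


def gpd_cdf(x, xi):
    """C.d.f. (2.1) with mu = 0, beta = 1, xi > 0."""
    if x <= 0.0:
        return 0.0
    return -math.expm1(-math.log1p(xi * x) / xi)


def gpd_median(xi):
    """Median m(1, xi) = (2^xi - 1) / xi."""
    return math.expm1(xi * math.log(2.0)) / xi


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def q_k(xi, k):
    """Population quotient kMAD / median at GPD(beta = 1, xi), via (4.5)."""
    m = gpd_median(xi)
    cover = lambda t: gpd_cdf(m + k * t, xi) - gpd_cdf(m - t, xi) - 0.5
    return bisect(cover, 0.0, m) / m


def kmad_emp(x, k):
    """Empirical kMAD: smallest t with at least half the data in [m - t, m + k t]."""
    m = statistics.median(x)
    d = sorted((v - m) / k if v >= m else m - v for v in x)
    return d[math.ceil(len(x) / 2) - 1]


def med_kmad_gpd(x, k=10.0, xi_lo=1e-3, xi_hi=10.0):
    """MedkMAD sketch: invert q_k at the empirical quotient, then apply (4.6)."""
    m_hat = statistics.median(x)
    q_hat = kmad_emp(x, k) / m_hat
    # bisect raises ValueError if q_hat lies outside the range of q_k on [xi_lo, xi_hi]
    xi_hat = bisect(lambda xi: q_k(xi, k) - q_hat, xi_lo, xi_hi, tol=1e-8)
    return m_hat / gpd_median(xi_hat), xi_hat


rng = random.Random(1)
sample = [math.expm1(-0.7 * math.log1p(-rng.random())) / 0.7 for _ in range(20000)]
beta_hat, xi_hat = med_kmad_gpd(sample)
print(beta_hat, xi_hat)
```

The sample is drawn from the reference model $\beta = 1$, $\xi = 0.7$ by inverse transform, so the printed estimates should lie near these values.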
Restriction(s) of solvability domain. Besides this restriction of the range of $\xi$ in the GEVD case, we conclude that in the GPD and in the GEVD case, for each of the dispersions, our restriction to $\xi > 0$ implies a restriction of the solvability domain for $q_k(\xi)$ within the set of admissible values of $\xi$:

$$q_k(\xi) \ge \lim_{\xi \to 0} q_k(\xi) =: \check q_k > 0 \tag{4.7}$$

while in the Weibull and Gamma case, $\check q_k$ can be taken as 0. The following lemma gives us yet other restrictions:

Lemma 4.1 Let $s$ be the functional version of any of the scale estimators Sn, Qn, and kMAD (for any $k > 0$). Let $G$ be a distribution on $\mathbb{R}$ such that $-\infty < x_0 = \sup\{x : G(x) = 0\}$, i.e., with finite left endpoint. Then with $m = G^-(1/2 + 0)$, the hi-med of $G$,

$$s(G) \le m - x_0 =: s_0 \tag{4.8}$$

with equality iff
(kMAD) $G((m; m + k s_0)) = 0$.
(Sn) $G(x + 2 s_0 - 0) - G(x) < 1/2$ for each $x \ge x_0$.
(Qn) $G(m) = 1/2$, $G(x_0) = 0$.

Consequently, as $x_0 = 0$, in the GPD, Gamma, and Weibull case,

$$q_k(\xi) < 1 \quad \forall \xi \tag{4.9}$$

and the same relation in the ideal model also holds sample-wise, i.e.,

$$\hat q_{k,n} < 1 =: \bar q_k \tag{4.10}$$

in each sample (from the ideal model distribution) where
(kMAD) at least one observation lies in $\big(\hat m;\ \hat m + k(\hat m - X_{(1)})\big)$.
(Sn) at least one interval of length shorter than $2(\hat m - X_{(1)})$ contains more than $\lfloor n/2 \rfloor + 1$ observations.
(Qn) all observations are finite.

Hence, for the LD estimators, we have to find the unique zero $\hat\xi_n$ of $H_k(\xi) = q_k(\xi) - \hat q_{n,k}$ in the interval $(\check q_k; \bar q_k)$, which can easily be solved with a standard univariate root-finding tool like uniroot in R (R Development Core Team, 2011).

Producing breakdown. Clearly, in the GPD case, we could drive $\hat q_{k,n}$ to values larger than 1 by modifying observations in the original sample to values smaller than $x_0$. These values would then be identifiable as outliers without error, and we could cancel them from the sample.
Instead we only consider contaminations by values larger than $x_0$ (which could also have been produced in the ideal model). At first glance, values of $\hat q_{k,n}$ outside $(\check q_k; \bar q_k)$ would make for a "definition breakdown", but if, for $\hat s_n$ the respective scale estimator, $\hat s_n \to \hat m$, this entails $\hat\xi_n \to \infty$ in the GPD case and $\hat\xi_n \to 0$ in the Gamma and Weibull cases. Hence we can produce a breakdown in the original sense by modifying an original sample such that $\hat s_n \to \hat m$.

5 Calculation of (E)FSBP for Pickands and LD Estimators

In some of our scale-shape models and for some of our estimators we have analytic expressions for the different breakdown point notions.

5.1 Pickands Estimator

Proposition 5.1 (Breakdown for PE) In the GPD, GEVD, Weibull, and Gamma cases, an upper bound for the FSBP of PE is given by 25%, which also invariably is the FSBP in the Weibull case. In the GPD case, no matter if $\xi \in \mathbb{R}$ or $\xi > 0$, and in the unrestricted GEVD case, i.e., $\xi \in \mathbb{R}$, the FSBP is given by
$$\varepsilon^*_n = \hat N^0_n / n, \quad \text{for } \hat N^0_n := \#\{X_i \mid 2\hat Q_2 \le X_i \le \hat Q_3\}. \tag{5.1}$$
The ABP then is given by
$$\bar\varepsilon^* = \varepsilon^* = P_\vartheta(2 Q_2 < X_1 \le Q_3), \tag{5.2}$$
which in the GPD case is just $\bar\varepsilon^* = (2^{\xi+1} - 1)^{-1/\xi} - 1/4$, and, in the GEVD case, $\bar\varepsilon^* = 3/4 - \exp\bigl(-(2 \log(2)^{-\xi} - 1)^{-1/\xi}\bigr)$. In the restricted GEVD case, where $\xi > 0$,
$$\varepsilon^*_n = \tilde N^0_n / n, \quad \text{for } \tilde N^0_n := \#\{X_i \mid q_0(0)\hat Q_2 \le X_i \le \hat Q_3\}. \tag{5.3}$$
The ABP then is given by
$$\bar\varepsilon^* = \varepsilon^* = P_\vartheta(q_0(0) Q_2 < X_1 \le Q_3). \tag{5.4}$$

For $\xi = 0.7$, we obtain $\bar\varepsilon^* \doteq 6.42\%$ in the GPD case, and in the GEVD case, $\bar\varepsilon^* \doteq 15.42\%$ in the unrestricted case and $\bar\varepsilon^* \doteq 6.13\%$ in the restricted case. For the figures for $\bar\varepsilon^*_n$ for $n = 40, 100, 1000$ in the GPD, GEVD, and Weibull cases, see Table 3, where we make use of Proposition 5.3 below. In the Gamma case, the situation is more involved, and we skip the computation of the actual breakdown points.
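As a quick plausibility check of the closed-form ABP expressions for PE, the two values quoted for $\xi = 0.7$ can be recomputed in a few lines. This is only a sketch under assumptions that are not spelled out in this section: the GEVD is taken with location 0 and scale 1, i.e., with cdf $F(x) = \exp(-(1 + \xi x)^{-1/\xi})$, and the function names `abp_pe_gpd` and `abp_pe_gevd` are hypothetical. The formulas are evaluated for $\xi > 0$ only.

```python
import math


def abp_pe_gpd(xi):
    """ABP of PE at the GPD: P(2 Q2 < X <= Q3) = (2**(xi + 1) - 1)**(-1/xi) - 1/4."""
    return (2.0 ** (xi + 1.0) - 1.0) ** (-1.0 / xi) - 0.25


def abp_pe_gevd(xi):
    """ABP of PE at the unrestricted GEVD (location 0, scale 1 assumed): 3/4 - F(2 Q2)."""
    # Median of the assumed GEVD: Q2 = (log(2)**(-xi) - 1) / xi
    q2 = (math.log(2.0) ** (-xi) - 1.0) / xi
    # F(2 Q2) = exp(-(1 + 2 xi Q2)**(-1/xi)) = exp(-(2 log(2)**(-xi) - 1)**(-1/xi))
    f_at_2q2 = math.exp(-((1.0 + 2.0 * xi * q2) ** (-1.0 / xi)))
    return 0.75 - f_at_2q2


abp_gpd = abp_pe_gpd(0.7)    # about 0.0642, i.e. 6.42%
abp_gevd = abp_pe_gevd(0.7)  # about 0.1542, i.e. 15.42%
```

Both results agree with the figures 6.42% and 15.42% stated for $\xi = 0.7$; the restricted GEVD case is not covered because it depends on the constant $q_0(0)$, which is not given here.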
5.2 LD Estimators

The FSBPs of 50% of the median and of the dispersion estimators obviously form an upper bound for the FSBP of the LD estimators, implying that one could at least drive one of the parameters $\beta$ and $\xi$ to $\infty$. However, similarly to the regression-based estimators for the Weibull case of Boudt et al (2011), breakdown is not only entailed by moving mass to 0 or $\infty$, and the actual breakdown points of the LD estimators are smaller; for the MedkMAD, we come up with some explicit expressions, while for the MedSn and MedQn we have to resort to simulations, see Subsection 5.5.

Proposition 5.2 (Breakdown for MedkMAD) In the GPD, Weibull, and Gamma cases, the FSBP of MedkMAD is given by
$$\varepsilon^*_n = \begin{cases} \hat N'_n / n & \text{Weibull; Gamma; GPD, unrestricted case, i.e., } \xi \in \mathbb{R} \\ \min(\hat N'_n, \hat N''_n) / n & \text{GPD, restricted case, i.e., } \xi > 0 \end{cases} \tag{5.5}$$
$$\hat N'_n := \#\{X_i \mid \hat m < X_i \le (k+1)\hat m\}, \tag{5.6}$$
$$\hat N''_n := \lceil n/2 \rceil - \#\{X_i \mid (1 - \check q_k)\hat m < X_i < (k \check q_k + 1)\hat m\}. \tag{5.7}$$
The ABP in this case is given by $\bar\varepsilon^* = \bar\varepsilon'$ for the unrestricted and $\bar\varepsilon^* = \min(\bar\varepsilon', \bar\varepsilon'')$ for the restricted case, where
$$\bar\varepsilon' = F_\vartheta((k+1)m) - 1/2, \qquad \bar\varepsilon'' = 1/2 - F_\vartheta\bigl((k \check q_k + 1)m\bigr) + F_\vartheta\bigl((1 - \check q_k)m\bigr). \tag{5.8}$$

At $k = 10$ and $\xi = 0.7$, we obtain $\bar\varepsilon^* \doteq 44.75\%$ (GPD; $\xi \in \mathbb{R}$), $11.87\%$ (GPD; $\xi > 0$), $49.47\%$ (Gamma), and $47.56\%$ (Weibull). For further figures for $\varepsilon^*_n$, $\bar\varepsilon^*_n$, and $\bar\varepsilon^*$, see Table 3, where again we make use of Proposition 5.3. In particular, contrary to Boudt et al (2011), not only does our FSBP vary sample-wise in these cases, but ABP and EFSBP also depend on $\xi$. A plot of the dependency $\xi \mapsto \bar\varepsilon^*(\mathrm{MedkMAD}_{10}; \mathrm{GPD}(\xi))$ is displayed in Figure 2.

5.3 Calculation of EFSBP

To obtain actual values of EFSBP, we have the following proposition.

Proposition 5.3 Consider $\hat N^0_n$, $\hat N'_n$, $\hat N''_n$ as defined in (5.1), (5.6), (5.7) and write $\bar F$ for $1 - F$.
Then for $n \ge 3$,

(a) setting $i_1 = \lfloor n/2 \rfloor$, $i_2 = \lceil 3n/4 \rceil$, and abbreviating $2F^{-1}(u)$ by $t_2$, we obtain for $l \in \{1, \ldots, i_2 - i_1 - 1\}$
$$P(\hat N^0_n = l) = n \int_0^1 \binom{n-1}{i_1 - 1,\; i_2 - i_1 - l - 1} u^{i_1 - 1} \bigl(F(t_2) - u\bigr)^{i_2 - i_1 - l - 1} \bar F(t_2)^{n - i_2 + l + 1} \, du \tag{5.9}$$
and
$$P(\hat N^0_n = 0) = n \sum_{l=0}^{n - i_2} \int_0^1 \binom{n-1}{i_1 - 1,\; i_2 - i_1 + l} u^{i_1 - 1} \bigl(F(t_2) - u\bigr)^{i_2 - i_1 + l} \bar F(t_2)^{n - i_2 - l} \, du. \tag{5.10}$$
The case of $\tilde N^0_n$ is obtained from (5.9), (5.10) by replacing $t_2$ by $t_q := q_0(0) F^{-1}(u)$.

(b) using the hi-med and setting $t_k := (k+1) F^{-1}(u)$, we obtain for $l \in \{0, \ldots, \lceil n/2 \rceil - 2\}$
$$P(\hat N'_n = l) = n \int_0^1 \binom{n-1}{\lfloor n/2 \rfloor + 1,\; l} u^{n/2} \bigl(F(t_k) - u\bigr)^{l} \bar F(t_k)^{n/2 - 1 - l} \, du \tag{5.11}$$

(c) setting $t_+ := (1 + k \check q_k) F^{-1}(u)$, $t_- := (1 - \check q_k) F^{-1}(u)$, we obtain for $l \in \{0, \ldots, n/2 - 1\}$
$$P(\hat N''_n = n/2 - l) = n \sum_{l_2 = 0}^{l} \binom{n-1}{n/2 - l_2 - 1,\; l_2,\; l - l_2} \int_0^1 F(t_-)^{n/2 - l_2 - 1} \bigl(u - F(t_-)\bigr)^{l_2} \bigl(F(t_+) - u\bigr)^{l - l_2} \bigl(1 - F(t_+)\bigr)^{n/2 + l_2 - l} \, du. \tag{5.12}$$

[Figure 2: ABP of MedkMAD$_{10}$ in % at GPD$(1, \xi)$, plotted against the shape $\xi$, with curves for the case without restriction ($\xi \in \mathbb{R}$) and with restriction $\xi > 0$; $\xi = 0.7$ is marked.]
Fig. 2 $\bar\varepsilon^*(\mathrm{MedkMAD}_{10}; \mathrm{GPD}_{\vartheta = (1, \xi)})$ for different $\xi$ with or without restriction $\xi > 0$

The dependency of EFSBP on $n$ is visualized in Figure 3. We see a saw-tooth-like oscillation which is explained by the use of finite sample quantiles in Proposition 5.3. In particular there are considerable deviations from ABP for moderate sample sizes.

5.4 Illustration: Usefulness of EFSBP

The expressions given in Propositions 5.1, 5.2, and 5.3 illustrate that in both the Pickands and the LD estimator case, even starting from an ideal sample, the "usual" sample-wise fluctuations of FSBP $= \hat N_n / n$ are considerable.
Moreover, Proposition 5.3 shows that we even have a positive, although very small, ideal probability
$$p_0 := P_X(\hat N_n = 0) > 0 \tag{5.13}$$
for breakdown already in the ideal model. Now, on the event $\{\hat N_n = 0\}$, $\varepsilon^*_n = 0$, so no universal non-trivial lower bound can be given for the FSBP in both the Pickands and the LD estimator case. As the figures in Table 2 below illustrate, however, such an event will hardly ever occur unless the sample size is very small, and the same goes for similarly small realizations of $\hat N_n$, so these cases, as motivated in the introduction of EFSBP, are indeed not representative.

[Figure 3: EFSBP in % for MedkMAD$_{10}$ and PE at GPD$(1, 0.7)$, plotted against the sample size $n$ from 20 to 100.]
Fig. 3 $\bar\varepsilon^*_n$ for PE and MedkMAD$_{10}$ at GPD$(1, 0.7)$ (restricted to $\xi > 0$) as a function of $n$

To grasp the difference between $\bar\varepsilon^*_n$ and $\varepsilon^*$, we consider the following Hoeffding-type lemma for empirical quantiles.

Lemma 5.4 (a) Let $0 < \delta < 1/2$ and $t \in \mathbb{R}$, and for given $\alpha \in (0, 1)$ and cdf $F$, let $q = F^{-}(\alpha)$ and $\hat q_n = \hat F_n^{-}(\alpha)$. Assume that $F$ is differentiable in $q$ with density $f(q) > 0$. Then with $t_n = t n^{-1/2 + \delta}$, for $n$ large enough,
$$P(|\hat q_n - q| \ge t_n) \le \exp(-2 f(q)^2 n^{\delta}). \tag{5.14}$$
(b) Let $a_i \ne 0$, $\alpha_i \in (0, 1)$, $i = 1, 2$, with $\alpha_1 \ne \alpha_2$, be given as well as a cdf $F$; assume $F$ differentiable in $a_i q_i$, $i = 1, 2$. Then under the assumptions of (a) for $q_i$, for $\hat I_n = (a_1 \hat q_{1,n}; a_2 \hat q_{2,n})$ and $I = (a_1 q_1; a_2 q_2)$, we have for $n$ large enough,
$$P_X(\hat I_n) = P_X(I) + O(n^{-1/2 + \delta/2}). \tag{5.15}$$

To illustrate the size of the $O(n^{-1/2 + \delta/2})$-term, let us also determine the upper $p_1$-quantile of $\varepsilon^*_n$ for $p_1 = 0.95^{0.0001}$, i.e., the minimal number $q_1$ such that with probability 0.95 we will not see realizations with $\varepsilon^*_n < q_1$ in 10000 runs of sample size $n$.
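The sample-wise fluctuation of the FSBP and its expectation can also be explored by simulation. The following is a minimal sketch, not the authors' procedure: it assumes the GPD(1, $\xi$) with cdf $F(x) = 1 - (1 + \xi x)^{-1/\xi}$, uses the hi-med (the order statistic with index $\lfloor n/2 \rfloor + 1$), and evaluates the unrestricted-case expressions (5.5), (5.6), and (5.8) for MedkMAD. The names `rgpd`, `fsbp_medkmad`, and `abp_medkmad`, the seed, and the number of runs are hypothetical choices.

```python
import random


def gpd_cdf(x, xi):
    """CDF of the GPD(1, xi) with left endpoint 0 (assumed form, xi > 0)."""
    return 1.0 - (1.0 + xi * x) ** (-1.0 / xi) if x > 0.0 else 0.0


def rgpd(n, xi, rng):
    """Draw n GPD(1, xi) variates by inverting the cdf."""
    return [((1.0 - rng.random()) ** (-xi) - 1.0) / xi for _ in range(n)]


def fsbp_medkmad(sample, k):
    """Sample-wise FSBP N'_n / n from (5.5)-(5.6), unrestricted case, hi-med."""
    x = sorted(sample)
    n = len(x)
    m_hat = x[n // 2]  # hi-med: order statistic number floor(n/2) + 1
    n_prime = sum(1 for v in x if m_hat < v <= (k + 1) * m_hat)
    return n_prime / n


def abp_medkmad(xi, k):
    """ABP from (5.8), unrestricted case: F((k + 1) m) - 1/2."""
    m = (2.0 ** xi - 1.0) / xi  # median of the GPD(1, xi)
    return gpd_cdf((k + 1) * m, xi) - 0.5


rng = random.Random(2011)  # fixed seed so that the run is reproducible
n, k, xi, runs = 40, 10, 0.7, 2000
values = [fsbp_medkmad(rgpd(n, xi, rng), k) for _ in range(runs)]

efsbp_sim = sum(values) / runs          # Monte Carlo estimate of the EFSBP
fsbp_min, fsbp_max = min(values), max(values)
abp = abp_medkmad(xi, k)                # about 0.4475
p1 = 0.95 ** (1.0 / 10000)              # the level p_1 = 0.95^{0.0001} used for q_1
```

The spread between `fsbp_min` and `fsbp_max` shows how strongly the FSBP varies from sample to sample, while `efsbp_sim` averages this out and can be compared with `abp`. With only 2000 runs the extreme lower quantile $q_1$ itself cannot be estimated reliably; for that, the exact distribution from Proposition 5.3 is the appropriate tool.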
Evaluations for PE and MedkMAD. Using the actual distribution of $\hat N_n$ given in Proposition 5.3, in Table 2, for Pickands (PE) and MedkMAD with $k = 10$, we determine $\bar\varepsilon^*_n$, $p_0$, and $q_1$ for $n = 40, 100, 1000$ in the GPD (with and without restriction to $\xi > 0$), Gamma, and Weibull cases, each with $\xi = 0.7$. The Gamma case is skipped, though, for PE for lack of explicit formulae.

Apparently $\bar\varepsilon^*_n$ converges quickly in $n$, so $\bar\varepsilon^*$ indeed gives a useful bound on average. According to the values of $p_0$, breakdown in the ideal model will hardly ever happen for PE for $n \ge 1000$, and for MedkMAD for $n \ge 100$, and only rarely for $n \ge 40$. The values for $q_1$ demonstrate that in a simulation study at the GPD with $\xi = 0.7$ with 10000 runs of sample size up to $n = 1000$, we will probably see breakdowns for PE, as well as for the MedkMAD restricted to $\xi > 0$. Contrary to this, as long as we have no more outliers than 8, 30, 411 for sample sizes $n = 40, 100, 1000$, we will not see a breakdown for MedkMAD in the unrestricted case; in the Gamma case with the same shape we obtain 9, 38, 476, and in the Weibull case 7, 32, 442; analogous figures for PE at the Weibull with $\xi = 0.7$ are 10, 25, 250.

Table 2 $p_0$, $q_1$, and $\bar\varepsilon^*_n$ for PE and MedkMAD ($k = 10$)

GPD
quantity   estimator                n = 10    n = 40    n = 100   n = 1000   n = ∞
p_0        PE                       5.1e-01   2.7e-01   7.9e-02   5.4e-08    0
p_0        MedkMAD, ξ ∈ R           3.3e-04   1.6e-15   7.2e-38   < 1e-300   0
p_0        MedkMAD, ξ > 0           1.4e-01   3.5e-02   2.7e-03   2.9e-18    0
q_1        PE                       0.00%     0.00%     0.00%     1.00%      6.42%
q_1        MedkMAD, ξ ∈ R           0.00%     20.00%    30.00%    41.10%     44.75%
q_1        MedkMAD, ξ > 0           0.00%     0.00%     0.00%     5.70%      11.87%
EFSBP      PE                       6.44%     5.26%     5.78%     6.34%      6.42%
EFSBP      MedkMAD, ξ ∈ R           35.85%    42.53%    43.86%    44.66%     44.75%
EFSBP      MedkMAD, ξ > 0           18.37%    13.45%    12.48%    11.94%     11.87%

GEVD
quantity   estimator                n = 10    n = 40    n = 100   n = 1000   n = ∞
p_0        PE, ξ ∈ R                2.8e-01   3.8e-02   6.8e-04   8.2e-28    0
p_0        PE, ξ > 0                5.4e-01   3.7e-01   2.0e-01   5.0e-04    0
q_1        PE, ξ ∈ R                0.00%     0.00%     0.00%     9.10%      15.42%
q_1        PE, ξ > 0                0.00%     0.00%     0.00%     0.00%      6.13%
EFSBP      PE, ξ ∈ R                12.50%    14.38%    14.78%    15.33%     15.42%
EFSBP      PE, ξ > 0                4.80%     5.54%     6.04%     6.09%      6.13%

Gamma
quantity   estimator                n = 10    n = 40    n = 100   n = 1000   n = ∞
p_0        MedkMAD                  2.3e-04   2.7e-14   4.8e-34   < 1e-300   0
q_1        MedkMAD                  0.00%     22.50%    38.00%    47.60%     49.47%
EFSBP      MedkMAD                  39.03%    46.80%    48.40%    49.37%     49.47%

Weibull
quantity   estimator                n = 10    n = 40    n = 100   n = 1000   n = ∞
p_0        PE                       0         0         0         0          0
p_0        MedkMAD                  6.4e-04   5.5e-13   5.6e-31   < 1e-300   0
q_1        PE                       25.00%    25.00%    25.00%    25.00%     25.00%
q_1        MedkMAD                  0.00%     17.50%    32.00%    44.20%     47.56%
EFSBP      PE                       25.00%    25.00%    25.00%    25.00%     25.00%
EFSBP      MedkMAD                  37.68%    45.03%    46.54%    47.46%     47.56%

(EFSBP denotes $\bar\varepsilon^*_n$; the column $n = \infty$ gives the ABP $\bar\varepsilon^*$.)

We may interpret the values of $\bar\varepsilon^*_n$ as follows: Before having made any observations, at the GPD at $\xi = 0.7$, using PE, one may be confident to be protected against 3 outliers for sample size 40, 7 for sample size 100, and 65 for sample size 1000, while for MedkMAD, the corresponding figures are 17, 43, and 447 in the unrestricted case and 5, 12, and 118 when restricted to $\xi > 0$; calculations in the Gamma and Weibull cases give comparable numbers.

5.5 Breakdown Calculations in the Remaining Cases: Simulational Approach

For the breakdown point of MedQn and MedSn, as well as for MedkMAD in the GEVD case, there are no analytical expressions, so we calculate them using simulations.

More precisely, for each of the estimators MedkMAD ($k = 10$), MedQn, MedSn, PE, and each of the ideal distributional settings GPD, GEVD, Weibull, and Gamma (each at $\vartheta = (1, 0.7)$), we produced $M = 10000$ runs of sample sizes $n = 40, 100, 1000$ and noted the number of alterations needed to move $\hat q_{k,n}$ to $\bar q_k$. In a second round, starting from the same runs of ideal observations, for GPD and GEVD, we noted the minimal number of alterations needed to move $\hat q_{k,n}$ to $\check q_k$, and then took the minimum of these two rounds. In the cases where explicit formulae are available this gives us a possibility to cross-check our results. Some small discrepancies should arise though, as we use the default median in R (R Development Core Team, 2011), i.e., (hi-med + lo-med)/2 for even sample size, while Proposition 5.3 is limited to the hi-med. For actual simulated values for $\bar\varepsilon^*_n$, see Table 3.

Conclusion

This article provides a new measure for the global robustness of an estimator at finite samples, i.e., EFSBP, a variant of the finite sample breakdown point which is particularly useful in situations where we have only partial equivariance and no non-trivial, universal lower bounds for FSBP are available. This variant comes closer to the (sample-free) ABP while still retaining the finite sample aspect of FSBP.

We have illustrated this measure at a set of scale-shape models, applying it to LD and Pickands/quantile-type estimators meant as high-breakdown initial estimators to be enhanced in efficiency by reweighting afterwards.

Although kMAD, Qn, and Sn all share the same breakdown properties in the location-scale setting, where they are defined, the corresponding LD estimators in the considered scale-shape models exhibit a differentiated breakdown behavior, and there is not one single best estimator.

In the unrestricted GEVD case, the easy-to-compute Pickands-type estimator turned out to have the highest breakdown point among all considered estimators, while in the setting restricted to $\xi > 0$, from sample size 100, MedkMAD becomes superior.
In all other situations, the best estimator is either MedkMAD or MedQn. In the unrestricted and restricted GPD case MedQn performs best, with MedkMAD close in the unrestricted case for $n = 40$. In the Weibull and Gamma cases MedkMAD performs best, except for the Weibull at $n = 1000$ where MedQn is best, but with MedkMAD close by. For deciding between MedkMAD and MedQn in cases where their breakdown points are similar, one should also take into account computational costs, which so far clearly favor MedkMAD.

A Proofs

Proof to Lemma 4.1: For any $k > 0$, $G(m + k s_0) - G(m - s_0 - 0) = G(m + k s_0) \ge 1/2$, so $s_0 \ge \mathrm{kMAD}(G, k)$. For $x \ge x_0$ and $Y \sim G$, let $g_G(x) = \mathrm{med}_x(|Y - x|) = \inf\{s \ge 0 : G(s + x) - G(x - s - 0) \ge 1/2\}$. But $G(s_0 + x) - G(x - s_0 - 0) = G(s_0 + x)$ for $x \le m$, so $g_G(x) \le s_0$ for $x \le m$, and hence, as $\{x \le m\} \subset \{g_G(x) \le s_0\}$, $S(G) = \inf\{t \ge 0 : P(g_G(X) \le t) \ge 1/2\} \le s_0$. Finally, for $X, Y \sim G$ stochastically independent, $Q(G) = \inf\{s : P(|X - Y| \le s) \ge 1/4\} \le s_0$, as
$$P(|X - Y| \le s_0) = \int \bigl[G(x + s_0) - G(x - s_0 - 0)\bigr] \, G(dx) \ge \int_{[x_0; m]} G(x + s_0) \, G(dx) \ge \int_{[x_0; m]} \tfrac{1}{2} \, G(dx) \ge \tfrac{1}{4}. \tag{A.1}$$
Assume $s(G) = s_0$. In case of kMAD this happens iff $G(m + k s_0) = 1/2$, or, equivalently, $G((m; m + k s_0)) = 0$. In case of Sn, $S(G) = s_0$ iff $P(g_G(X) > s) \ge 1/2$ for all $s < s_0$, or, equivalently, $P(x : G((x - s_0; x + s_0)) < 1/2) \ge 1/2$. But $x - s_0 < x_0$ whenever $x < m$, so $G((x - s_0; x + s_0)) = G(x + s_0) \ge G(m) = 1/2$. Hence $S(G) = s_0$ iff $G((x - s_0; x + s_0)) < 1/2$ for all $x \ge m$, or, equivalently, iff $G(x + 2 s_0 - 0) - G(x) < 1/2$ for $x \ge x_0$. In case of Qn, $Q(G) = s_0$ iff the inequalities in (A.1) are equalities, i.e., iff $G([x_0; m]) = 1/2 = G(m + s_0)$ and $\int_{(m; \infty)} \bigl[G(x + s_0) - G(x - s_0 - 0)\bigr] \, G(dx) = 0$.
The last integral is 0 iff $G((m; \infty)) = 0$, so that altogether, $Q(G) = s_0$ iff $G(m) = G(\{\infty\}) = 1/2$. □

Proof to Proposition 5.1: For all models, i.e., GPD, GEVD, Weibull, and Gamma, we can render the scale estimator arbitrarily large for $\hat Q_3$ sufficiently large, so $\varepsilon^*_n \le 1/4$. In case of GPD and GEVD, $\hat\beta < 0$ once $\hat Q_3 \le 2 \hat Q_2$, which certainly happens if, in an ideally distributed sample, we replace all observations $X_i$ with $2 \hat Q_2 \le X_i \le \hat Q_3$ by $\hat Q_2$, entailing (5.1). Appealing to Lemma 5.4, up to an event of probability $O(\exp(-c n^{\delta}))$ for some $c > 0$,
$$\varepsilon^*_n = \bar\varepsilon^* + O_{P^n_\vartheta}(n^{-1/2 + \delta/2}). \tag{A.2}$$
As (4.4) gives valid values for $\xi$ and $\beta$ for any values of $\hat Q_3$ and $\hat Q_2$, in the Weibull case we cannot lower the upper bound of 25%, i.e., $\lim_n \bar\varepsilon^*_n = \bar\varepsilon^* = \varepsilon^* = 1/4$. □

Proof to Proposition 5.2: As we have seen in the considerations in Section 4.2 on producing breakdown, we can only solve (uniquely) for $\xi$ and $\beta$ as long as the quotient $\hat q_{k,n}$ falls into $(\check q_k; \bar q_k)$; case-by-case considerations indeed show that driving $\hat q_{k,n}$ to either $\check q_k$ (in case of GPD and GEVD) or $\bar q_k$ (in all cases) produces breakdown. That is, breakdown could be achieved either by moving all $\hat N'_n$ observations from (5.6), for which $\hat m < X_i \le \hat m + \hat M_k$, to $(k+1)\hat m$ (entailing $\hat q_{k,n} \approx 1$), or by moving a number of $\hat N''_n$ observations (as defined in (5.7)) to the interval $[(1 - \check q_k)\hat m; (k \check q_k + 1)\hat m]$ up to the point that it contains $n/2$ observations (entailing $\hat q_{k,n} < \check q_k$). The actual FSBP is then given by the alternative needing to move fewer observations. The terms for ABP follow with the usual LLN argument. □

Proof to Proposition 5.3: We start with the fact that for $X_i \overset{\text{i.i.d.}}{\sim} F$ with Lebesgue density $f$, the joint c.d.f.
of the order statistics $X_{[i_1:n]}$, $X_{[i_2:n]}$ for $1 \le i_1 < i_2 \le n$ and $s \le t$ can be written as
$$G(s, t) = n \int_{-\infty}^{s} f(x) \binom{n-1}{i_1 - 1} F(x)^{i_1 - 1} \sum_{k_2 = i_2 - i_1}^{n - i_1} \binom{n - i_1}{k_2} \bigl(F(t) - F(x)\bigr)^{k_2} \bar F(t)^{n - i_1 - k_2} \, dx.$$
Hence
$$P(\hat N'_n \ge l) = P\bigl(X_{[(n/2 + l + 1):n]} \le (k+1) X_{[(n/2 + 1):n]}\bigr) = n \int_0^1 \binom{n-1}{n/2} u^{n/2} \sum_{k_2 = l}^{n/2 - 1} \binom{n/2 - 1}{k_2} \bigl(F(t_k) - u\bigr)^{k_2} \bar F(t_k)^{n/2 - 1 - k_2} \, du$$
and (5.11) follows by taking differences. Cases (5.9) and (5.12) follow similarly. □

Proof to Lemma 5.4: We note that $\{\hat q_n \le t\} = \{\sum_i I(X_i \le t) \ge n\alpha\}$. Hence with Hoeffding's inequality, Hoeffding (1963), $P(|\hat q_n - q| \ge t_n) \le 2 \exp(-2n(F(t_n + q) - \alpha)^2)$, and (a) follows from $F(t_n + q) - \alpha = f(q) t_n + o(t_n)$. For (b), note that $P(\hat I_n \,\Delta\, I) \le E|F(\hat q_{1,n}) - \alpha_1| + E|F(\hat q_{2,n}) - \alpha_2|$. Hence, for large enough $n$, $P(\hat I_n \,\Delta\, I) \le 2 f(a_1 q_1) |a_1| \, E|\hat q_{1,n} - q_1| + 2 f(a_2 q_2) |a_2| \, E|\hat q_{2,n} - q_2|$. Applying that for a random variable $Z$ taking values in $[0, 1]$ and for $t \in (0, 1)$, $0 \le E Z \le t + \int_t^1 P(Z > s) \, ds$, we obtain by Mills' ratio $P(\hat I_n \,\Delta\, I) \le 2t + \sum_i \exp(-2 n t^2 f(q_i)^2) / (2 n t f(q_i)^2)$. Plugging in $t = n^{-1/2 + \delta}$, we obtain (b). □

Acknowledgement We thank two anonymous referees for their helpful comments.

References

Basel Committee on Banking Supervision (2006) International Convergence of Capital Measurement and Capital Standards: A Revised Framework. http://www.bis.org/publ/bcbs128.pdf
Balkema A, de Haan L (1974) Residual life time at great age. Annals of Probability 2: 792–804
Brazauskas V, Serfling R (2000) Robust Estimation of Tail Parameters for Two Parameter Pareto and Exponential Models via Generalized Quantile Statistics.
Extremes 3(3): 231–249
Boudt K, Caliskan D, Croux C (2011) Robust and Explicit Estimators for Weibull Parameters. Metrika 73(2): 187–209
Castillo E, Hadi AS (1997) Fitting the Generalized Pareto Distribution to Data. Journal of the American Statistical Association 92(440): 1609–1620
Davies PL, Gather U (2005) Breakdown and groups (with discussion). Annals of Statistics 33(3): 977–1035
Davies PL, Gather U (2007) The Breakdown Point - Examples and Counterexamples. REVSTAT 5(1): 1–17
Donoho DL, Huber PJ (1983) The notion of breakdown point. In: Bickel PJ, Doksum K, Hodges JL Jr (eds) A Festschrift for Erich L. Lehmann. Wadsworth, Belmont, CA, pp 157–184
Donoho DL, Liu RC (1988) The "Automatic" Robustness of Minimum Distance Functionals. Annals of Statistics 16(2): 552–586
Eaton ML (1989) Group Invariance Applications in Statistics. Regional Conference Series in Probability and Statistics, vol. 1, University of Minnesota
Fernholz LT (1979) Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics #19, Springer
Genton MG, Lucas A (2005) Discussion of "Breakdown and Groups" by PL Davies and U Gather. Annals of Statistics 33(3): 988–993
He X (2005) Discussion of "Breakdown and Groups" by PL Davies and U Gather. Annals of Statistics 33(3): 998–1000
Hampel FR (1986) Contributions to the theory of robust estimation. Dissertation, University of California, Berkeley
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics. The approach based on influence functions. Wiley
Hoeffding W (1963) Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58(301): 13–30
Marazzi A, Ruffieux C (1999) The truncated mean of asymmetric distribution. Computational Statistics & Data Analysis 32: 79–100
Olive D (2006) Robust estimators for transformed location scale families.
Mimeo
Peng L, Welsh AH (2001) Robust Estimation of the Generalized Pareto Distribution. Extremes 4(1): 53–65
Pickands J (1975) Statistical Inference Using Extreme Order Statistics. Annals of Statistics 3(1): 119–131
R Development Core Team (2011) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org
Rieder H (1994) Robust Asymptotical Statistics. Springer
Rousseeuw PJ, Croux C (1993) Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association 88(424): 1273–1283
Ruckdeschel P, Horbenko N (2010) Robustness Properties of Estimators in Generalized Pareto Models. Technical Report No. 182, Fraunhofer ITWM, Kaiserslautern
Ruckdeschel P, Rieder H (2010) Fisher Information of Scale. Statist. Probab. Lett. 80: 1881–1885

Table 3 Simulated EFSBP in % with CLT-based 95%-confidence interval (CI) for $\theta = (\xi = 0.7, \beta = 1)$; number of runs is 10000

n = 40
Model          MedSn    ± CI    MedQn    ± CI    MedkMAD_10   ± CI    PE       ± CI
GPD, ξ ∈ R     34.69    0.33    43.74    0.09    44.68        0.13    5.94     0.10
GPD, ξ > 0     8.78     0.18    23.44    0.21    10.65        0.07    5.94     0.10
GEVD, ξ ∈ R    6.99     0.21    5.89     0.21    13.38        0.24    14.85    0.13
GEVD, ξ > 0    6.99     0.21    5.89     0.21    4.75         0.13    7.87     0.16
Weibull        37.63    0.34    40.32    0.11    47.31        0.02    25.00*   0.00*
Gamma          34.55    0.32    41.97    0.10    49.17        0.02    n.a.     -

n = 100
Model          MedSn    ± CI    MedQn    ± CI    MedkMAD_10   ± CI    PE       ± CI
GPD, ξ ∈ R     23.55    0.21    47.51    0.04    44.73        0.09    6.12     0.07
GPD, ξ > 0     12.44    0.16    18.42    0.16    11.32        0.05    6.12     0.07
GEVD, ξ ∈ R    3.25     0.09    2.88     0.09    8.86         0.14    15.01    0.09
GEVD, ξ > 0    3.25     0.09    2.88     0.09    6.32         0.11    6.71     0.05
Weibull        26.58    0.30    45.12    0.05    47.41        0.02    25.00*   0.00*
Gamma          25.42    0.21    45.90    0.04    49.35        0.02    n.a.     -

n = 1000
Model          MedSn    ± CI    MedQn    ± CI    MedkMAD_10   ± CI    PE       ± CI
GPD, ξ ∈ R     21.86    0.03    49.75    0.00    44.75        0.03    6.38     0.03
GPD, ξ > 0     14.99    0.13    16.06    0.02    11.82        0.02    6.37     0.03
GEVD, ξ ∈ R    1.06     0.03    1.27     0.03    7.25         0.05    15.39    0.04
GEVD, ξ > 0    1.06     0.03    1.27     0.03    7.22         0.05    6.20     0.08
Weibull        19.77    0.03    49.01    0.01    47.55        0.01    25.00*   0.00*
Gamma          24.13    0.04    49.16    0.01    49.46        0.01    n.a.     -

*: theoretical values; n.a.: not available; in these cases, 25% is an upper bound

[Figure 4: a grid of panels with rows GPD, GEVD, Weibull, and Gamma and columns kMAD ($k = 1$), kMAD ($k = 10$), Qn, and Sn, each showing the quotient $q(\xi)$ for $\xi$ between 0 and 6. In the GPD row the annotated values are 0.69, 0.24, 0.42, and 0.85, respectively; the GEVD panels additionally mark the location $\xi_0$ of the maximum.]
Fig. 4 Quotients kMAD$(\xi, k = 1)$/med$(\xi)$, kMAD$(\xi, k = 10)$/med$(\xi)$, Qn$(\xi)$/med$(\xi)$, and Sn$(\xi)$/med$(\xi)$ as functions of $\xi$; the respective $\check q$, $\bar q$ are also included
