Coherent frequentism

By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. T…

Authors: David R. Bickel

David R. Bickel

October 24, 2018

Ottawa Institute of Systems Biology; Department of Biochemistry, Microbiology, and Immunology; Department of Mathematics and Statistics, University of Ottawa; 451 Smyth Road; Ottawa, Ontario, K1H 8M5
+01 (613) 562-5800, ext. 8670; dbickel@uottawa.ca

Abstract

By representing the range of fair betting odds according to a pair of confidence set estimators, dual probability measures on parameter space called frequentist posteriors secure the coherence of subjective inference without any prior distribution. The closure of the set of expected losses corresponding to the dual frequentist posteriors constrains decisions without arbitrarily forcing optimization under all circumstances. This decision theory reduces to those that maximize expected utility when the pair of frequentist posteriors is induced by an exact or approximate confidence set estimator or when an automatic reduction rule is applied to the pair. In such cases, the resulting frequentist posterior is coherent in the sense that, as a probability distribution of the parameter of interest, it satisfies the axioms of the decision-theoretic and logic-theoretic systems typically cited in support of the Bayesian posterior. Unlike the p-value, the confidence level of an interval hypothesis derived from such a measure is suitable as an estimator of the indicator of hypothesis truth since it converges in sample-space probability to 1 if the hypothesis is true or to 0 otherwise under general conditions.
Keywords: attained confidence level; coherence; coherent prevision; confidence distribution; decision theory; minimum expected loss; fiducial inference; foundations of statistics; imprecise probability; maximum utility; observed confidence level; problem of regions; significance testing; upper and lower probability; utility maximization

1 Introduction

1.1 Motivation

A well-known mistake in the interpretation of an observed confidence interval confuses confidence as a level of certainty with confidence as the coverage rate, the almost-sure limiting rate at which a confidence interval would cover a parameter value over repeated sampling from the same population. This results in using the stated confidence level, say 95%, as if it were a probability that the parameter value lies in the particular confidence interval that corresponds to the observed sample. A practical solution that does not sacrifice the 95% coverage rate is to report a confidence interval that matches a 95% credibility interval computable from Bayes's formula given some matching prior distribution (Rubin, 1984). In addition to canceling the error in interpretation, such matching enables the statistician to leverage the flexibility of the Bayesian approach in making jointly consistent inferences, involving, for example, the probability that the parameter lies in any given region of the parameter space, on the basis of a posterior distribution firmly anchored to valid frequentist coverage rates. Priors yielding exact matching of predictive probabilities are available for many models, including location models and certain location-scale models (Datta et al., 2000; Severini et al., 2002).
Although exact matching of fixed-parameter coverage rates is limited to location models (Welch and Peers, 1963; Fraser and Reid, 2002), priors yielding asymptotic matching have been identified for other models, e.g., a hierarchical normal model (Datta et al., 2000). For mixture models, all priors that achieve matching to second order necessarily depend on the data but asymptotically converge to fixed priors (Wasserman, 2000). Data-based priors can also yield second-order matching with insensitivity to the sampling distribution (Sweeting, 2001). Agreeably, Fraser (2008b) suggested a data-dependent prior for approximating the likelihood function integrated over the nuisance parameters to attain accurate matching between Bayesian probabilities and coverage rates. These advances approach the vision of building an objective Bayesianism, defined as a universal recipe for applying Bayes's theorem in the absence of prior information (Efron, 1998).

Viewed from another angle, the fact that close matching can require resorting to priors that change with each new observation, cracking the foundations of Bayesian inference, raises the question of whether many of the goals motivating the search for an objective posterior can be achieved apart from Bayes's formula. It will in fact be seen that such a probability distribution lies dormant in nested confidence intervals, securing the above benefits of interpretation and coherence without matching priors, provided that the confidence intervals are constructed to yield reasonable inferences about the value of the parameter for each sample from the available information.
Unless the confidence intervals are conservative by construction, the condition of adequately incorporating any relevant information is usually satisfied in practice since confidence intervals are most appropriate when information about the parameter value is either largely absent or included in the interval estimation procedure, as it is in random-effects modeling and various other frequentist shrinkage methods. Likewise, confidence intervals known to lead to pathologies tend to be avoided. (Pathological confidence intervals often emphasized in support of credibility intervals include formally valid confidence intervals that lie outside the appropriate parameter space (Mandelkern, 2002) and those that can fail to ascribe 100% confidence to an interval deduced from the data to contain the true value (Bernardo and Smith, 1994).) A game-theoretic framework makes the requirement more precise: for the 95% confidence interval to give a 95% degree of certainty in the single case and to support coherent inferences, it must be generated to ensure that, on the available information, 19:1 are approximately fair betting odds that the parameter lies in the observed interval. This condition rules out the use of highly conservative intervals, pathological intervals, and intervals that fail to reflect substantial pertinent information. In relying on an observed confidence interval to that extent, the decision maker ignores the presence of any recognizable subsets (Gleser, 2002), not only slightly conservative subsets, as in the tradition of controlling the rate of Type I errors (Casella, 1987), but also slightly anti-conservative subsets.
Given the ubiquity of recognizable subsets (Buehler and Feddersen, 1963; Bondar, 1977), this strategy uses pre-data confidence as an approximation to post-data confidence in the sense in which expected Fisher information approximates observed Fisher information (Efron and Hinkley, 1978), aiming not at exact inference but at a pragmatic use of the limited resources available for any particular data analysis. Certain situations may instead call for careful applications of conditional inference (Goutis and Casella, 1995; Sundberg, 2003; Fraser, 2004) for basing decisions more directly on the data actually observed.

1.2 Direct inference and attained confidence

The above betting interpretation of a frequentist posterior will be generalized in a framework of decision to formalize, control, and extend the common practice of equating the level of certainty that a parameter lies in an observed confidence interval with the interval estimator's rate of coverage over repeated sampling. Many who fully understand that the 95% confidence interval is defined to achieve a 95% coverage rate over repeated sampling will for that reason often be substantially more certain that the true value of the parameter lies in an observed 99% confidence interval than that it lies in a 50% confidence interval computed from the same data (Franklin, 2001; Pawitan, 2001, pp. 11-12). This direct inference, reasoning from the frequency of individuals of a population that have a certain property to a level of certainty about whether a particular sample from the population has that property, is a notable feature of inductive logic (e.g., Franklin, 2001; Jaeger, 2005) and often proves effective in everyday decisions.
Knowing that the new cars of a certain model and year have speedometer readings within 1 mile per hour (mph) of the actual speed in 99.5% of cases, most drivers will, when betting on whether they comply with speed limits, have a high level of certainty that the speedometer readings of their particular new cars of that model and year accurately report their current speed in the absence of other relevant information. (Such information might include a reading of 10 mph when the car is stationary, which would indicate a defect in the instrument at hand.) If the above betting interpretation of the confidence level holds for an interval given by some predetermined level of confidence, then coherence requires that it hold equally for a level of confidence given by some predetermined hypothesis.

Fisher's fiducial argument also employed direct inference (Fisher, 1945; Fisher, 1973, pp. 34-36, 57-58; Hacking, 1965, Chapter 9; Zabell, 1992). The present framework departs from his in its applicability to inexact confidence sets, in the closer proximity of its probabilities to repeated-sampling rates of covering vector parameters, in its toleration of reference classes with relevant subsets, and in its theory of decision. Since the second and third departures are shared with recent methods of computing the confidence probability of an arbitrary hypothesis (§3.2.2), the main contribution of this paper is the general framework of inference that both motivates such methods given an exact confidence set and extends them for use with approximate, valid, and nonconservative set estimators and for coherent decision making, including prediction and point estimation. This framework draws from the theory of coherent upper and lower probabilities for the cases in which no exact confidence set with the desired properties is available.
To allow indecision in light of inconclusive evidence, these non-additive probabilities have been formulated for lotteries in which the agent may either place a bet or refrain from betting or, equivalently, in which the casino posts different odds to be used depending on whether a gambler bets for or against a hypothesis. Confidence decision theory will be formulated for this scenario by setting an agent's prices of buying and selling a gamble on the hypothesis that a parameter $\theta$ is in some set $\Theta' \subseteq \Theta$ according to the confidence levels of a valid set estimate and a nonconservative confidence set estimate that coincide with $\Theta'$. As a result, the hypothesis has an interval of confidence levels rather than a single confidence level. Equating the buying and selling prices reduces the upper and lower probability functions to a single frequentist posterior, a probability measure on parameter space $\Theta$, and thus reduces the interval to a point.

1.3 Overview

This subsection outlines the organization of the remainder of the paper while offering a brief summary.

After preliminary concepts are defined (§2.1), Section 2.2 presents the new framework for confidence-based inference and decision. The family of probability measures (frequentist posteriors) used in inference and decision can be stated in terms of coherent lower and upper probabilities and is thus completely self-consistent according to a widely accepted account of coherence derived from ideas of Bruno de Finetti (§2.3). This lays a foundation for decisions and for flexible inference about the truth of hypotheses without invoking the likelihood principle (§2.4, §2.5). The framework is compared to other versions of frequentist coherence based on upper and lower probabilities in Section 2.5.
While reporting an interval level of confidence in a hypothesis has the advantage of honestly communicating the insufficiency of the data to determine a single confidence level, such intervals are less useful in situations requiring the automation of decisions. Under such circumstances, the family of frequentist posteriors can be reduced to a single frequentist posterior by the use of exact or approximate confidence sets or by an automatic reduction rule (§3.1). For a single frequentist posterior, confidence decision theory is equivalent to the minimization of expected posterior loss (§3.2). As a probability measure on hypothesis space, the resulting frequentist posterior satisfies the same coherence axioms as the Bayesian posterior whether or not it is compatible with any prior distribution (§3.3). The important special case of a scalar parameter of interest provides an arena for contrasting frequentist posterior probabilities and p-values (§3.4).

The confidence framework provides direct and simple approaches to common problems of data analysis, as will be illustrated by example in Sections 3.2 and 3.4.3. Examples include reporting probabilistic levels of confidence of the interval, two-sided null hypotheses required in bioequivalence testing, assigning confidence to a complex region, and assessing practical or scientific significance. Posterior point estimates and predictions that account for parameter uncertainty are also available without relinquishing the objectivity of the Neyman-Pearson framework. Section 4 concludes the paper by highlighting the main properties of the proposed framework.

2 Confidence decision theory

2.1 Preliminaries

2.1.1 Basic notation

The values of $x \wedge y$ and $x \vee y$ are respectively the minimum and maximum of $x$ and $y$. The symbols $\subseteq$ and $\subset$ respectively signify subset and proper subset.
$1_{\Theta'} : \Theta \to \{0, 1\}$ is the usual indicator function: $1_{\Theta'}(\theta)$ is 1 if $\theta \in \Theta'$ or 0 if $\theta \notin \Theta'$. Angular brackets rather than parentheses signal numeric tuples. For example, if $x$ and $y$ are numbers, then $\langle x, y \rangle$ denotes an ordered pair, whereas $(x, y)$ denotes the open interval $\{z : x < z < y\}$.

Given a probability space $(\Omega, \Sigma, P_\xi)$ indexed by the vector parameter $\xi \in \Xi \subseteq \mathbb{R}^d$, consider the random quantity $X$ of distribution $P_\xi$ and with a realization $x$ in some sample set $\Omega \subseteq \mathbb{R}^n$. Without loss of generality, partition the full parameter $\xi$ into an interest parameter $\theta \in \Theta$ and, unless $\theta = \xi$, a nuisance parameter $\gamma \in \Gamma$, such that $\xi \in \Theta \times \Gamma$ and $P_{\theta,\gamma} = P_\xi$.

Except where otherwise noted, every probability distribution is a standard (Kolmogorov) probability measure. An incomplete probability measure is a standard, additive measure with total mass less than or equal to 1. Let $(\Theta, \mathcal{A})$ represent a measurable space and $\mathcal{B}([0,1])$ the Borel $\sigma$-field of $[0,1]$. The complement and power set of $\Theta'$ are $\bar{\Theta}'$ and $2^{\Theta'}$, respectively. The $\sigma$-field induced by $\mathcal{C}$ is $\sigma(\mathcal{C})$.

2.1.2 Metameasure and metaprobability spaces

The following slight extension of probability theory facilitates a clear and precise presentation of the present framework. To avoid unnecessary confusion between single-valued probability and the specific type of multi-valued probability required, the former will be called probability in agreement with common usage, and the latter will be called metaprobability, a term defined below.

Definition 1. Given a measurable space $(\Theta, \mathcal{A})$ and a metameasure space, the triple $\mathcal{M} = (\Theta, \mathcal{A}, \mathcal{P})$ with a family $\mathcal{P}$ of measures, the metameasure $\mathbf{P}$ of $\mathcal{M}$ is a function $\mathbf{P}$ from $\mathcal{A}$ to the set of all closed intervals of $[0, \infty)$ such that $\mathbf{P}(A)$ is the closure of $\{P(A) : P \in \mathcal{P}\}$ for each $A \in \mathcal{A}$.
The metameasure $\mathbf{P}$ is said to be degenerate if $|\mathcal{P}| = 1$ or nondegenerate if $|\mathcal{P}| > 1$.

Definition 2. The metameasure $\mathbf{P}$ of a metameasure space $\mathcal{M} = (\Theta, \mathcal{A}, \mathcal{P})$ is a probability metameasure if each member of $\mathcal{P}$ is a probability measure. Then $\mathcal{M}$ is called a metaprobability space, and $\mathbf{P}(A)$ is called the metaprobability of event $A$ for all $A \in \mathcal{A}$. The expectation interval or expected interval $\mathcal{E}(L)$ of a measurable map $L : \Theta \to \mathbb{R}^1$ with respect to a probability metameasure $\mathbf{P}$ on $\mathcal{M}$ is the closure of
\[
\left\{ \int L(\vartheta)\, dP(\vartheta) : P \in \mathcal{P} \right\}.
\]
In words, the expectation interval of a random quantity with respect to a probability metameasure is the smallest closed interval containing the expectation values of the random quantity with respect to the probability measures of the metaprobability space.

2.2 Confidence measures and metameasures

Particular types of confidence sets form the basis of the metameasure on which confidence decision theory rests.

Definition 3. A set estimator $\hat{\Theta}$ for $\theta$ is a function defined on $\Omega \times [0,1]$. A set estimator is called valid if its coverage rate over repeated sampling is at least as great as $\rho$, the nominal confidence coefficient:
\[
P_\xi\left(\theta \in \hat{\Theta}(X; \rho)\right) \ge \rho
\]
for all $\xi \in \Xi$ and $\rho \in [0,1]$. A set estimator is called nonconservative if its coverage rate over repeated sampling is no greater than the nominal confidence coefficient:
\[
P_\xi\left(\theta \in \hat{\Theta}(X; \rho)\right) \le \rho
\]
for all $\xi \in \Xi$ and $\rho \in [0,1]$. A set estimator that is both valid and nonconservative is called exact. For some set $\mathcal{C}$ of connected subsets of $\Theta$, a set estimator is called nested if it is a function $\hat{\Theta} : \Omega \times [0,1] \to \mathcal{C}$ such that, for all $x \in \Omega$, there is a $\mathcal{C}(x) \subseteq \mathcal{C}$ such that $\hat{\Theta}(x; \bullet) : [0,1] \to \mathcal{C}(x)$ is bijective, $\hat{\Theta}(x; 0) = \emptyset$, $\hat{\Theta}(x; 1) = \Theta$, and
\[
\hat{\Theta}(x; \rho_1) \subseteq \hat{\Theta}(x; \rho_2) \tag{1}
\]
for all $0 \le \rho_1 \le \rho_2 \le 1$.
Two nested set estimators $\hat{\Theta}_1 : \Omega \times [0,1] \to \mathcal{C}$ and $\hat{\Theta}_2 : \Omega \times [0,1] \to \mathcal{C}$ are dual if the ranges $\mathcal{C}_1(x)$ and $\mathcal{C}_2(x)$ of $\hat{\Theta}_1(x; \bullet)$ and $\hat{\Theta}_2(x; \bullet)$ induce the same $\sigma$-field, i.e., $\sigma(\mathcal{C}_1(x)) = \sigma(\mathcal{C}_2(x))$, for each $x \in \Omega$.

The desired metameasure will be constructed from two confidence measures in turn constructed from dual nested set estimators.

Definition 4. Let $\hat{\Theta} : \Omega \times [0,1] \to \mathcal{C}$ denote a nested set estimator and $\mathcal{A}^x$ the $\sigma$-field induced by $\mathcal{C}(x)$, the range of $\hat{\Theta}(x; \bullet)$ for each $x \in \Omega$. Then, for all $x \in \Omega$, $\hat{\Theta}$ induces the probability space $(\Theta, \mathcal{A}^x, P^x)$ and the confidence measure or frequentist posterior $P^x$, the probability measure on $\mathcal{A}^x$ such that
\[
\Theta' \in \mathcal{C}(x) \implies \Theta' = \hat{\Theta}\left(x; P^x(\Theta')\right). \tag{2}
\]
The probability $P^x(\Theta')$ is the confidence level of the hypothesis that $\theta \in \Theta'$. If $\hat{\Theta}$ is valid, nonconservative, or exact, then $P^x$ and $P^x(\Theta')$ are likewise called valid, nonconservative, or exact, respectively.

The next result provides the confidence level of any hypothesis that $\theta \in \Theta' \in \mathcal{A}^x$ as the sum of confidence levels given more directly by equation (2).

Proposition 5. For each $x \in \Omega$, let $(\Theta, \mathcal{A}^x, P^x)$ be the probability space induced by the nested set estimator $\hat{\Theta} : \Omega \times [0,1] \to \mathcal{C}$, and let $\mathcal{C}(x)$ be the range of $\hat{\Theta}(x; \bullet)$. For some $K \in \{1, 2, \dots\}$, let $\Theta' = \cup_{k=1}^{K} \Theta'_k$, where $\Theta'_k \in \mathcal{A}^x$ and $i \ne j \implies \Theta'_i \cap \Theta'_j = \emptyset$. Then
\[
P^x(\Theta') = \sum_{k=1}^{K} \left[ P^x\left(\Theta_k^{+}\right) - P^x\left(\Theta_k^{-}\right) \right], \tag{3}
\]
where $\Theta_k^{+} = \arg\inf_{\Theta'' \in \mathcal{C}(x),\ \Theta'_k \subseteq \Theta''} |\Theta''|$ and $\Theta_k^{-} = \Theta_k^{+} \setminus \Theta'_k$ for all $k \in \{1, 2, \dots, K\}$.

Proof. $P^x\left(\Theta_k^{+}\right) = P^x(\Theta'_k) + P^x\left(\Theta_k^{-}\right)$ and $P^x\left(\cup_{k=1}^{K} \Theta'_k\right) = \sum_{k=1}^{K} P^x(\Theta'_k)$ follow from the mutual exclusivity of the sets and from the additivity of the measure $P^x$.

Thus, since, for all $k \in \{1, 2, \dots, K\}$, both $\Theta_k^{+}$ and $\Theta_k^{-}$ are in $\mathcal{C}(x)$ and since $\mathcal{C}(x)$ induces $\mathcal{A}^x$, equations (2) and (3) can be used to calculate $P^x(\Theta')$ for any $\Theta' \in \mathcal{A}^x$.

Definition 6. Consider the dual nested set estimators $\hat{\Theta}_{\ge} : \Omega \times [0,1] \to \mathcal{C}$, which is valid, and $\hat{\Theta}_{\le} : \Omega \times [0,1] \to \mathcal{C}$, which is nonconservative. For every $x \in \Omega$, let $\mathcal{A}^x$ denote the common $\sigma$-field induced by each of the ranges of $\hat{\Theta}_{\ge}(x; \bullet)$ and $\hat{\Theta}_{\le}(x; \bullet)$. If $P^x_{\ge}$ is the valid confidence measure, the confidence measure induced by $\hat{\Theta}_{\ge}$, then $P^x_{\ge}(\Theta')$ is called a valid confidence level of the hypothesis that $\theta \in \Theta'$. For each $x \in \Omega$, the dual nonconservative confidence measure $P^x_{\le}$ and nonconservative confidence level $P^x_{\le}(\Theta')$ are defined analogously. On the metaprobability space
\[
\mathcal{M}^x_{\ge,\le} = \left(\Theta, \mathcal{A}^x, \left\{P^x_{\ge}, P^x_{\le}\right\}\right),
\]
called a confidence metameasure space, the probability metameasure $\mathbf{P}^x$ is called the confidence metameasure induced by $\hat{\Theta}_{\ge}$ and $\hat{\Theta}_{\le}$ given some $x$ in $\Omega$. Accordingly, the confidence metalevel of the hypothesis that $\theta \in \Theta'$ is $\mathbf{P}^x(\Theta')$ for all $\Theta' \in \mathcal{A}^x$.

By the definition of metaprobability, any hypothesis $\Theta' \in \mathcal{A}^x$ has a confidence metalevel of
\[
\mathbf{P}^x(\Theta') = \left[ P^x_{\ge}(\Theta') \wedge P^x_{\le}(\Theta'),\ P^x_{\ge}(\Theta') \vee P^x_{\le}(\Theta') \right]. \tag{4}
\]

Remark 7. The restriction to $\sigma$-fields with events common to valid and nonconservative confidence measures strongly constrains the choice of the estimators to ensure the ability to assign a confidence metalevel to any hypothesis of interest without a need for incomplete probability measures. The further flexibility of allowing multiple $\sigma$-fields in a class of measure spaces may be desirable in some applications.

Strategies developed within more conventional frequentist frameworks provide guidance on the choice of the dual set estimators by which to induce the confidence metameasure.
Extending the statistical model to incorporate information from the physics of experimental design and measurement can rule out many pathological set estimators as meaningless (McCullagh, 2002). For instance, the inclusion of transformation-group structure in the model leads to set estimators that exactly match Bayesian posterior credible sets under certain improper priors (Fraser, 1968; Helland, 2004). Without taking advantage of extended models, Barndorff-Nielsen and Cox (1994, pp. 121-122, 132-133), Sprott (2000, pp. 75-76), and Brazzale et al. (2007) highlight advantages of incorporating information from the likelihood function into set estimators; cf. Section 2.5.

Example 8 (normal distribution). For $n$ independent random variables each distributed according to $P_{\theta,\gamma}$, the normal distribution with mean $\theta$ and variance $\gamma$, the interval estimator $\Theta_\alpha$ given by
\[
\Theta_\alpha(x; \rho) = \left( p_x^{-1}(\alpha),\ p_x^{-1}(\rho + \alpha) \right)
\]
for all $\rho \in [0, 1 - \alpha]$ is nested and is an exact $\rho(100\%)$ confidence interval for $\theta$, where $\alpha \in [0,1]$, $p_x(\theta')$ is the upper-tailed p-value of the hypothesis that $\theta = \theta'$, and $p_x^{-1}$ is the inverse of $p_x$. Since $\Theta_\alpha$ is both valid and nonconservative, it is dual to itself, yielding the equality of the valid and nonconservative confidence measures $P^x_{\alpha,\ge}$ and $P^x_{\alpha,\le}$, each the distribution of $\vartheta = \bar{x} + T_{n-1}\hat{\sigma}/\sqrt{n}$, where $T_{n-1}$ is the random variable of the Student t distribution with $n - 1$ degrees of freedom. Hence, the confidence metameasure $\mathbf{P}^x_\alpha$ induced by $\Theta_\alpha$ is degenerate:
\[
\left(\Theta, \mathcal{A}^x, \left\{P^x_{\alpha,\ge}, P^x_{\alpha,\le}\right\}\right) = \left(\Theta, \mathcal{A}^x, \left\{P^x_\alpha\right\}\right).
\]
If $\Theta'$ is an interval, then $P^x_\alpha(\Theta') = p_x(\sup \Theta') - p_x(\inf \Theta')$ for all $x \in \Omega$ and $\Theta' \in \mathcal{A}^x$, from which it follows that the confidence measure $P^x_\alpha$ does not depend on the nested set estimator chosen and can thus be represented by $P^x$.
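As a rough numerical companion to Example 8, the following sketch evaluates the confidence level $p_x(\sup \Theta') - p_x(\inf \Theta')$ of an interval hypothesis, taking $p_x$ to be the distribution function of $\vartheta = \bar{x} + T_{n-1}\hat{\sigma}/\sqrt{n}$ as stated in the example. The function names `t_cdf` and `confidence_level`, the Simpson-rule integration of the Student t density, and the sample values are illustrative assumptions rather than part of the original example.

```python
import math


def t_cdf(t, df, steps=2000):
    """Student t distribution function, by Simpson's rule on the density."""
    if math.isinf(t):
        return 1.0 if t > 0 else 0.0
    if t == 0:
        return 0.5
    const = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )

    def density(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3  # integral of the density over [0, |t|]
    return 0.5 + math.copysign(area, t)


def confidence_level(sample, lower, upper):
    """Confidence level of the hypothesis lower < theta < upper (Example 8)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    se = sd / math.sqrt(n)

    def p_x(theta0):
        # Distribution function of mean + T_{n-1} * se, evaluated at theta0.
        return t_cdf((theta0 - mean) / se, n - 1)

    return p_x(upper) - p_x(lower)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical observations
print(confidence_level(sample, 2.0, 4.0))        # level of 2 < theta < 4
print(confidence_level(sample, -math.inf, 3.0))  # level of theta < sample mean
```

Because the hypothesis is supplied as an interval rather than as a confidence coefficient, the same function reports the level attained by any interval of interest, which is the sense in which the measure does not depend on the nested set estimator chosen.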
Special properties of degenerate confidence metameasures are given in Section 3. The next example involves a nondegenerate confidence metameasure.

Example 9 (binomial distribution). Let $P_\theta$ denote the binomial measure with $n$ trials, success probability $\theta \in \Theta$, and $C$-corrected, upper-tail cumulative probabilities $p_{C,x}(\theta) = P_\theta(X > x) + C P_\theta(X = x)$, with $C \in [0,1]$. Consider the family $\mathcal{F}_C = \{\Theta^{\alpha}_{C} : \alpha \in [0,1]\}$ of nested set estimators such that
\[
\Theta^{\alpha}_{C}(x; \rho) =
\begin{cases}
\left[ p^{-1}_{1-C,x}(\alpha),\ p^{-1}_{C,x}(\alpha + \rho) \right] & \rho \in (0, 1 - \alpha] \\
\emptyset & \rho = 0 \\
[0, 1] & \rho = 1
\end{cases}
\]
for all $\alpha \in [0,1]$, $\rho \in R = [0, 1 - \alpha] \cup \{1\}$, $x \in \{0, 1, \dots\} = \Omega$, where
\[
p^{-1}_{C',x}(\alpha') = \theta' \iff p_{C',x}(\theta') = \alpha'. \tag{5}
\]
Since the rates at which the valid ($C = 0$) and nonconservative ($C = 1$) interval estimators cover $\theta$ are bound according to
\[
P_\theta\left(\theta \in \Theta^{\alpha}_{0}(X; \rho)\right) \ge \rho, \qquad P_\theta\left(\theta \in \Theta^{\alpha}_{1}(X; \rho)\right) \le \rho,
\]
the sets $\mathcal{F}_0$ and $\mathcal{F}_1$ are valid and nonconservative families of nested set estimators, respectively, and for any $\alpha \in [0,1]$, the valid set estimator $\Theta^{\alpha}_{0}$ is dual to the nonconservative set estimator $\Theta^{\alpha}_{1}$, thus inducing the valid confidence measure $P^x_{\alpha,0}$, the nonconservative confidence measure $P^x_{\alpha,1}$, and the confidence metameasure $\mathbf{P}^x_\alpha$ on the $\sigma$-field $\mathcal{B}([0,1])$ for each $x \in \Omega$. In order to weigh evidence in $X = x$ for the hypothesis that $0 \le \theta' \le \theta \le \theta'' \le 1$, equation (2) furnishes
\[
P^x_{\alpha,C}\left(\left[ p^{-1}_{1-C,x}(\alpha),\ p^{-1}_{C,x}(\alpha + \rho_{C,x}) \right]\right) = \rho_{C,x},
\]
and, with equation (3),
\[
\begin{aligned}
P^x_{\alpha,C}([\theta', \theta'']) &= P^x_{\alpha,C}\left(\left[ p^{-1}_{1-C,x}(\alpha),\ p^{-1}_{C,x}\left(\alpha + \rho''_{C,x}\right) \right]\right) - P^x_{\alpha,C}\left(\left[ p^{-1}_{1-C,x}(\alpha),\ p^{-1}_{C,x}\left(\alpha + \rho'_{C,x}\right) \right]\right) \\
&= \rho''_{C,x} - \rho'_{C,x},
\end{aligned} \tag{6}
\]
where
\[
\rho'_{C,x} = p_{C,x}(\theta') - \alpha, \qquad \rho''_{C,x} = p_{C,x}(\theta'') - \alpha.
\]
Since $\alpha$ drops out of the difference, let $P^x_C = P^x_{\alpha,C}$. For any $\Theta' \in \mathcal{B}([0,1])$, equations (6) and (4) specify the confidence metalevel of the hypothesis that $\theta \in \Theta'$. To illustrate the reduction of confidence indeterminacy with additional observations, the boundary values of $\mathbf{P}^x([1/4, 3/4])$ are plotted against $n$ in Fig. 1 for the $\theta = 2/3$ case.

Figure 1: Confidence levels of the hypothesis that $\theta$, the limiting relative frequency of successes, is between 1/4 and 3/4 as a function of $n$, the number of independent trials, with $\theta = 2/3$ as the unknown true value. In the notation of Example 9, the nonconservative confidence level is $P^x_1([1/4, 3/4])$, the valid confidence level is $P^x_0([1/4, 3/4])$, and the half-corrected level is $P^x_{1/2}([1/4, 3/4])$. The confidence level averaged over the convex set is defined in Section 3.1. Sampling variation was suppressed by setting each number $x$ of successes to the lowest integer greater than or equal to $n\theta$ instead of randomly drawing values of $x$ from the $\langle n, \theta \rangle$ binomial distribution.

2.3 Coherence of confidence metalevels

The confidence metameasure $\mathbf{P}^x$ on confidence space $\mathcal{M}^x_{\ge,\le}$ models the reasoning process of an ideal agent betting on inclusion of the true parameter value in elements of $\mathcal{A}^x$, the $\sigma$-field of $\mathcal{M}^x_{\ge,\le}$, with upper and lower betting odds determined by the coverage rates of the corresponding valid and nonconservative confidence sets. The coherence of the agent's decisions may be evaluated by expressing its betting odds in terms of upper and lower probabilities that lack the additivity property of Kolmogorov's probability measures. Given the dual functions $u : \mathcal{A}^x \to [0,1]$ and $v : \mathcal{A}^x \to [0,1]$ such that
\[
u(\Theta') + v(\Theta \setminus \Theta') = 1, \tag{7}
\]
\[
u(\Theta' \cup \Theta'') \ge u(\Theta') + u(\Theta''), \qquad v(\Theta' \cup \Theta'') \le v(\Theta') + v(\Theta'')
\]
for all disjoint $\Theta'$ and $\Theta''$ in $\mathcal{A}^x$, the values $u(\Theta')$ and $v(\Theta')$ are the lower and upper probabilities (Molchanov, 2005, §9.3) of the hypothesis that $\theta \in \Theta'$.
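To make these quantities concrete, the sketch below returns to the binomial setting of Example 9 and computes the levels $P^x_C([\theta', \theta''])$ from equation (6), then takes the smaller of the $C = 0$ and $C = 1$ levels as a lower probability and obtains the matching upper probability through the duality (7). The function names are hypothetical, and treating the complement of the interval by additivity of each $P^x_C$ is an assumption of the sketch, as is the choice of $n$ values; the rule for $x$ follows the caption of Figure 1.

```python
from math import comb


def binom_pmf(k, n, theta):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def p_corrected(c, x, n, theta):
    """C-corrected upper-tail probability P(X > x) + C * P(X = x)."""
    upper_tail = sum(binom_pmf(k, n, theta) for k in range(x + 1, n + 1))
    return upper_tail + c * binom_pmf(x, n, theta)


def level(c, x, n, lo, hi):
    """Confidence level of lo <= theta <= hi, following equation (6)."""
    return p_corrected(c, x, n, hi) - p_corrected(c, x, n, lo)


def metalevel(x, n, lo, hi):
    """Lower and upper probabilities (u, v) of lo <= theta <= hi."""
    levels = [level(c, x, n, lo, hi) for c in (0, 1)]  # valid, nonconservative
    u = min(levels)
    # Each P_C is additive, so the complement has level 1 - P_C([lo, hi]).
    u_complement = min(1 - p for p in levels)
    v = 1 - u_complement  # duality of equation (7)
    return u, v


for n in (3, 30, 300):
    x = -(-2 * n // 3)  # smallest integer >= 2n/3, as in the Figure 1 caption
    print(n, x, metalevel(x, n, 0.25, 0.75))
```

Since $p_{C,x}$ is linear in $C$, the half-corrected level `level(0.5, ...)` always falls midway between the two ends of the interval returned by `metalevel`.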
The decision-theoretic interpretation is that $u(\Theta')$ is the largest price an agent would pay for a gain of $1_{\Theta'}(\theta)$, whereas $v(\Theta')$ is the smallest price for which the same agent would sell that gain, assuming an additive utility function (Walley, 1991). The duality between $u$ and $v$ expressed as equation (7) means each function is completely determined by the other. The function $u$ is called the lower envelope of a family $\mathcal{P}$ of measures on $\mathcal{A}$ if
\[
u(\Theta') = \inf_{P \in \mathcal{P}} P(\Theta')
\]
for all $\Theta' \in \mathcal{A}$ (Coletti and Scozzafava, 2002, §15.2; Molchanov, 2005, §9.3). Since the lower envelope of a family of probability measures is a coherent lower probability (Walley, 1991, §3.3.3; Molchanov, 2005, §9.3) and since $\left\{P^x_{\ge}, P^x_{\le}\right\}$ as specified in Definition 6 constitutes such a family, the agent weighing evidence for any hypothesis $\theta \in \Theta'$ by $\mathbf{P}^x(\Theta')$, with $\Theta' \in \mathcal{A}^x$, satisfies the minimal set of rationality axioms of Walley (1991). It follows that the agent avoids sure loss by making decisions according to the lower and upper probabilities
\[
u(\Theta') = P^x_{\ge}(\Theta') \wedge P^x_{\le}(\Theta'), \qquad v(\Theta') = 1 - u(\Theta \setminus \Theta').
\]

Conversely, the framework of Section 2.2 can be presented starting with de Finetti's prevision and the related concept of coherent extension (Walley, 1991; Coletti and Scozzafava, 2002) as follows. An intelligent agent first sets its prices for buying and selling gambles on the hypotheses corresponding to the elements of $\mathcal{C}$ according to the confidence coefficients of valid and nonconservative nested set estimators. Then it extends its prices or previsions to the family of the two probability measures on the $\sigma$-field induced by $\mathcal{C}$ in order to evaluate the probability of a hypothesis $\theta \in \Theta'$ for some $\Theta'$ in the $\sigma$-field but not in $\mathcal{C}$.
This family in turn yields coherent lower and upper probabilities that equal the initial buying and selling prices whenever the latter apply, i.e., when the hypothesis is that $\theta \in \Theta'$ for some $\Theta' \in \mathcal{C}$. Thus, a Dutch book cannot be made against the agent.

2.4 Decisions under arbitrary loss

This section generalizes betting under 0-1 loss to making confidence-based decisions under any unbounded loss function. Confidence metalevels do not describe the actual betting behavior of any human agent, but instead prescribe decisions, including amounts bet on any hypothesis involving $\theta$, given that the agent will incur a loss of $L_a(\theta)$ for taking action $a$. According to a natural generalization of the Bayes decision rule of minimizing loss averaged over a posterior distribution, action $a'$ dominates (is rationally preferred to) action $a''$ if and only if
\[
\forall E' \in \mathcal{E}(L_{a'}),\ E'' \in \mathcal{E}(L_{a''}) : E' \le E''
\]
\[
\exists E' \in \mathcal{E}(L_{a'}),\ E'' \in \mathcal{E}(L_{a''}) : E' < E'',
\]
where both expectation intervals (Definition 2) are with respect to the same confidence metameasure $\mathbf{P}^x$. The confidence metameasures impose no restrictions on agent decisions other than restricting them to non-dominated actions.

This use of the confidence metameasure in making decisions follows a previous generalization of maximizing expected utility to multi-valued probability. (Here, the utilities are expressed in terms of equivalent losses, as is conventional in the statistics literature.) Kyburg (1990, pp. 180, 231-234; 2003; 2006) and Kaplan (1996, §1.4) used the principle of dominance to make decisions on the basis of intervals of expected utilities determined by the expected utility of each probability measure: an action yielding expected utilities in interval $A$ is preferred to that yielding expected utilities in interval $B$ if at least one member of $A$ is greater than all members of $B$ and if no member of $A$ is less than any member of $B$.

While multi-valued probabilities do not dictate how to choose one of the non-dominated actions in situations that demand a choice equivalent to deciding between accepting a hypothesis or accepting its alternative, they may prove more practical when indecision can be broken by additional considerations, as Walley (1991, pp. 161-162, 235-241) explained. In the case of a human agent, Kyburg (2003) argued for selecting among non-dominated actions on the basis of considerations that cannot be represented mathematically rather than selecting on the basis of an arbitrary prior distribution. If a single-valued estimate of $1_{\Theta'}(\theta)$ is needed for some $\Theta' \in \mathcal{A}$, the indeterminacy $\sup \mathbf{P}^x(\Theta') - \inf \mathbf{P}^x(\Theta')$ can quantify a set estimator's degree of undesirable conservatism; some ways to eliminate such indeterminacy by replacing a confidence metameasure with a confidence measure are mentioned in Section 3. If indeterminacy is removed, the above dominance principle reduces to the principle of minimizing expected loss (§3.2).

2.5 Likelihood principle

While in some cases the likelihood function can guide the construction of set estimators with desirable properties, as noted in Section 2.2, it plays no general role in confidence decision theory.
Consequently, inference does not always obey the likelihood principle: some set estimators lead to values of evidential support and partial proof that depend on information in the sampling model not encoded in the likelihood function; cf. Wilkinson (1977).

An advantage of coherent statistical methods in general is the flexibility they give the researcher to simultaneously consider as many hypotheses and interval estimates for $\theta$ as desired. Although such versatility is usually presented as a consequence of the likelihood principle and Bayesian statistics, they are not needed to secure it once coherence has been established (§2.3).

That the proposed framework is not constrained by the likelihood principle distinguishes it from Peter Walley's $W_1$ and $W_2$, two inferential theories of indeterminate (multi-valued) probability intended to satisfy the best aspects of both coherence and frequentism (Walley, 2002). The coverage error rate of $W_1$ tends to be much higher than the nominal rate in order to ensure simultaneous compliance with the likelihood principle. Although the principle often precludes approximately correct frequentist coverage, more power can be achieved by less stringently controlling the error rate (Walley, 2002). Walley (2002) did not report the degree of conservatism of $W_2$, a normalized likelihood method. With a uniform measure for integration over parameter space, the normalized likelihood is equal to the Bayesian posterior that results from a uniform prior.

3 Frequentist posterior distribution

An important realm for practical applications of the above framework is the situation in which inference may reasonably depend only on a single confidence measure $P^x$ rather than directly on a confidence metameasure $\mathcal{P}^x$.
That is possible not only in the special case of degeneracy due to the availability of a suitable exact nested set estimator (Example 8), but can also be achieved either by transforming a non-degenerate confidence metameasure to a confidence measure (§3.1) or by approximating a confidence measure. Remark 16 concerns the latter strategy in the case of a scalar parameter of interest.

Relying solely on a single confidence measure for inference and decision making (§3.2) enjoys the coherence of theories of utility maximization usually associated with Bayesianism (§3.3). In the ubiquitous special case of a scalar parameter of interest, a single confidence level of a hypothesis is a consistent estimator of whether the hypothesis is true under more general conditions than is the p-value as such an estimator (§3.4).

3.1 Reducing a confidence metameasure

Interpreting upper and lower probabilities as bounds defining a family of permissible probability measures, Williamson (2007) argued for minimizing expected loss with respect to a single distribution within the family instead of using outside considerations to choose among actions that are non-dominated in the sense of Section 2.4. Consider the confidence metameasure space $\mathcal{M}^x_{\ge,\le} = \left(\Theta, \mathcal{A}^x, \{P^x_\ge, P^x_\le\}\right)$ of confidence metameasure $\mathcal{P}^x$ for some $x \in \Omega$. A much larger family $\mathbb{P}$ of measures on $\mathcal{A}^x$ such that $\{P^x_\ge, P^x_\le\}$ and $\mathbb{P}$ have the same lower envelope $u$ is the convex set $\mathbb{P} = \{P^x_D : D \in [0,1]\}$, where $P^x_D = (1 - D) P^x_\ge + D P^x_\le$, thereby forming the metaprobability space $\tilde{\mathcal{M}}^x_{\ge,\le} = (\Theta, \mathcal{A}^x, \mathbb{P})$ and probability metameasure $\tilde{\mathcal{P}}^x$; cf. Smith (1961, §11); Wasserman (1990); Paris (1994, pp. 40-42). Since $\tilde{\mathcal{P}}^x = \mathcal{P}^x$, the measure $P^x \in \mathbb{P}$ selected according to some rule is called a reduction of $\mathcal{P}^x$.
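As a minimal sketch of the convex family just described, the following Python fragment evaluates $P^x_D(\Theta')$ for a single hypothesis $\Theta'$ over a grid of $D$ values and checks that the family has the same lower and upper levels as its two extreme members. The numerical levels and the function names are hypothetical, chosen only for illustration; they do not come from the binomial example.

```python
def mix(p_ge, p_le, d):
    """Level assigned to one hypothesis by P_D = (1 - D) * P_>= + D * P_<=."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("D must lie in [0, 1]")
    return (1.0 - d) * p_ge + d * p_le


def envelope(p_ge, p_le, steps=1000):
    """Smallest and largest level over a grid of D values in [0, 1]."""
    levels = [mix(p_ge, p_le, k / steps) for k in range(steps + 1)]
    return min(levels), max(levels)


# Hypothetical confidence levels of one hypothesis under the two measures.
P_GE, P_LE = 0.62, 0.80

lower, upper = envelope(P_GE, P_LE)   # envelope of the whole family
midpoint = mix(P_GE, P_LE, 0.5)       # the member with D = 1/2
```

Because $P^x_D(\Theta')$ is linear in $D$, the extremes are always attained at $D = 0$ or $D = 1$, which is why enlarging the pair to the convex family leaves the envelope unchanged.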
Eectiv e reduction of P x to a single measure P x can b e accomplished b y a v erag ing o v er P with resp ect to the Leb esgue measure. That a v erage of the con v ex set is simply the me an of the v al id and nonconserv a t i v e condence measures: P x (Θ 0 ) = Z 1 0 P x D (Θ 0 ) dD =  P x ≥ (Θ 0 ) + P x ≤ (Θ 0 )  / 2 = P x 1 / 2 (Θ 0 ) (8) for all Θ 0 ∈ A x ; recall that P x 1 / 2 ∈ P . Other automatic metho ds of reducing a metameasure to a single measure a r e also a v ailable. F or example, the recommendation of Williamson (2007) to select the measure within the family that maximizes the en t rop y is minimax under Kullbac k-Leibler loss (Grün w ald, 2004). Example 10 (Binomial distribution, con tin ued from Example 9) . As the gra y line in Fig. 1 indicates, the m ean measure P x of the con v ex set (8) yields a condence lev el b et w een those of the v alid and nonconserv ativ e condence measures, discarding the notable reduction in condence nondegeneracy from n = 1 to n = 10 as irrelev an t for action in situations that do not p ermit indecision. The appro ximate (half-corrected) condence l ev el also disregards nondegeneracy information, yielding in this sp ecial case the same lev els of condence as do es P x . In con trast, the condence metameasure records the nondegeneracy as the dierence b et w een the agen t's selling and buying prices of a gam ble with a pa y o con tingen t on whether or not θ ∈ [1 / 4 , 3 / 4] , a die r e nce that b e comes less imp ortan t as n increases. 3.2 Condence-based decision and inference 3.2.1 Minimizing exp ected loss In a situation requiring a decision in v olving the acceptance or rejection of the h yp othesis that θ ∈ Θ 0 , that is, under a 0-1 loss function, an agen t guided b y a single measure P x regards P x ( ϑ ∈ Θ 0 ) /P x ( ϑ / ∈ Θ 0 ) as the fair b etting o dds and will act ac cordingly . 
The hypothesis $\theta \in \Theta'$ will be accepted only if the odds $P^x(\vartheta \in \Theta') / P^x(\vartheta \notin \Theta')$ are greater than the ratio of the cost that would be incurred if $\theta \notin \Theta'$ to the benefit that would be gained if $\theta \in \Theta'$. Otherwise, unless the odds are exactly equal to 1, the hypothesis $\theta \notin \Theta'$ will be accepted. Under a more general class of loss functions, the decision theory of Section 2.4 reduces to the minimization of expected loss given the degeneracy or reduction of the confidence metameasure. Section 3.3.2 notes implications for axiomatic coherence.

3.2.2 Applications to hypothesis assessment

As the findings of basic science are arguably valuable even if never applied and since the ways in which any inductive inference will be used are often unpredictable (Fisher, 1973, pp. 95-96, 103-106), $P^x(\vartheta \in \Theta')$ may be reported as an estimate of $1_{\Theta'}(\theta)$ for use with currently unknown loss functions (cf. Jeffrey, 1986; Hwang, 1992). That inferential role is currently played in many of the sciences by the p-value interpreted as a measure of evidence in significance testing (Cox, 1977), but its notorious lack of coherence has prevented its universal acceptance (e.g., Royall, 1997). As will become clear in Section 3.4, $P^x(\vartheta \in \Theta')$ can differ markedly from the p-value for testing $\theta \in \Theta'$ as the null hypothesis not only in interpretation but also in numeric value.

Example 11. Efron and Tibshirani (1998, §3) consider the hypothesis that the mean $\xi$ of a $\nu$-dimensional multivariate normal distribution of an identity covariance matrix is in an origin-centered sphere of radius $\theta''$ but outside a concentric sphere of radius $\theta'$. Let $\theta = \|\xi\|$, and let $\chi^2_\nu$ be the chi-squared cumulative distribution function (CDF) of $\nu$ degrees of freedom.
Since the p-value of the null hypothesis that $\theta \ge \theta'$ is $\chi^2_\nu\left( (\|x\|/\theta')^2 \right)$, the confidence level of the hypothesis that $\theta' < \theta < \theta''$ is

$$P^x(\theta' < \vartheta < \theta'') = \chi^2_\nu\left( (\|x\|/\theta')^2 \right) - \chi^2_\nu\left( (\|x\|/\theta'')^2 \right),$$

the value of which Efron and Tibshirani (1998, §4) justified as an approximation to a Bayesian posterior probability. The coherence of the confidence measure $P^x$ immunizes it against the inconsistencies that Efron and Tibshirani (1998, §3) noticed among p-values: contradictory conclusions would be reached depending on which hypothesis was considered as the null.

A practical implication of working in the confidence metameasure framework is that since the simple bootstrap methods of Efron and Tibshirani (1998) based on a scalar pivot enable close approximations to p-value functions (Efron, 1993; Schweder and Hjort, 2002; Singh et al., 2005; Xiong and Mu, 2009), they can solve related problems too complex for more rigid Neyman-Pearson methods and yet without any need to seek matching priors for justification; cf. Efron (2003). Applications include assigning levels of confidence to phylogenetic tree branches (Efron et al., 1996), to observed local maxima in an estimated function (Efron and Tibshirani, 1998; Hall, 2004), and to gene network connections found on the basis of microarray data (Kamimura et al., 2003). Liu (1997) studied operating characteristics of the empirical strength probability (ESP), which in the one-dimensional case is equal to some confidence probability $P^x(\theta' < \vartheta < \theta'')$ defined with respect to a bootstrap algorithm.
See Polansky (2007) for an accessible introduction to the general problem of observed confidence levels of composite hypotheses, which Efron and Tibshirani (1998) had dubbed the problem of regions, understood to include applications to ranking and selection as well as those mentioned above. The fundamental characteristic of this approach is not the bootstrapping technique as much as the property that the level of confidence in any given region is equal to the coverage rate of a corresponding confidence set.

Until the ESP is seen to have a compelling justification of its own, it may continue to be regarded merely as a method of last resort since it is in general neither a Bayesian posterior probability nor a Neyman-Pearson p-value: "For [the latter] reason, it seems best to use the ESP only when more specific, direct testing methods are not available for a particular problem" (Davison et al., 2003). That the ESP and other approximations of the confidence value are more acceptable than p-values as estimates of whether the parameter lies in a given region (§3.4.4) gives cause to reconsider that judgment even apart from the coherence of the confidence value.

Example 12 (beyond statistical significance). Consider the null hypothesis $\theta' - \Delta \le \theta \le \theta' + \Delta$, where the non-negative scalar $\Delta$ is a minimal degree of practical or scientific significance in a particular application. For instance, researchers developing methods of analyzing microarray data are increasingly calling for specification of a minimal level of biological significance when testing null hypotheses of equivalent gene expression against alternative hypotheses of differential gene expression (Lewin et al., 2006; Van De Wiel and Kim, 2007; Bochkina and Richardson, 2007; Bickel, 2008).
Bickel (2004) and McCarthy and Smyth (2009) in effect approached the problem with p-values of composite null hypotheses, in conflict with the confidence measure approach (Example 18 and Section 3.4.4).

Section 3.4.3 provides additional examples that contrast hypothesis confidence levels with hypothesis p-values in practical applications.

3.2.3 Other applications of minimizing expected loss

The framework of minimizing expected loss with respect to a confidence measure (§3.2.1) not only leads to assigning confidence levels to hypotheses but also provides methods for optimal estimation and prediction. In addition, confidence-measure estimators and predictors have frequentist properties only shared with Bayesian estimators and predictors when the Bayesian posterior is a confidence measure.

As the frequentist posterior, the confidence measure gives all the point estimators provided by the Bayesian posterior. For example, the frequentist posterior mean, minimizing expected squared error loss, is $\bar{\vartheta}_x = \int_\Theta \vartheta \, dP^x(\vartheta)$ and the frequentist posterior $p$-quantile, minimizing expected loss for a threshold-based function of $p$ (Carlin and Louis, 2009, App. B), is $\vartheta(p)$ such that $p = P^x(\vartheta < \vartheta(p))$. Assuming a differentiable CDF of $P^x$, Singh et al. (2007) proved the weak consistency of the frequentist posterior median $\vartheta(1/2)$ and the frequentist posterior mean $\bar{\vartheta}_x$ and proved that the former is median-unbiased. In that case, the frequentist mode, the value maximizing the probability density function of $\vartheta$, is also available if a unique maximum exists.

The frequentist posterior predictive distribution, the frequentist analog of the Bayesian posterior predictive distribution of a new observation of $X$, is $P^{(x)} = \int_\Theta P_{\vartheta,\gamma} \, dP^x(\vartheta)$ for all $x \in \Omega$.
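The point estimators and the predictive distribution above can be sketched numerically for one simple case. The Python fragment below assumes a normal mean with known standard deviation, for which the confidence CDF is $P^x(\vartheta < \theta) = \Phi\left((\theta - \bar{x})/(\sigma/\sqrt{n})\right)$; that model, the data summary, and all function names are assumptions made for illustration only. The quantile is found by bisection on the CDF and the mean by a Riemann-Stieltjes sum, so the same two helpers would apply to any continuous confidence CDF.

```python
from statistics import NormalDist


def confidence_cdf(theta, xbar, sigma, n):
    """P^x(vartheta < theta) for a normal mean with known sigma (assumed model)."""
    return NormalDist(mu=xbar, sigma=sigma / n ** 0.5).cdf(theta)


def quantile(p, cdf, lo, hi, tol=1e-10):
    """Frequentist posterior p-quantile: solve cdf(theta) = p by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_mean(cdf, lo, hi, steps=20000):
    """Riemann-Stieltjes sum approximating the integral of theta dP^x(theta)."""
    width = (hi - lo) / steps
    total, prev = 0.0, cdf(lo)
    for k in range(1, steps + 1):
        right = lo + k * width
        cur = cdf(right)
        total += (right - 0.5 * width) * (cur - prev)  # cell midpoint times cell mass
        prev = cur
    return total


# Hypothetical data summary: sample mean, known sigma, sample size.
XBAR, SIGMA, N = 1.3, 2.0, 25


def cdf(theta):
    return confidence_cdf(theta, XBAR, SIGMA, N)


LO, HI = XBAR - 10 * SIGMA, XBAR + 10 * SIGMA
median = quantile(0.5, cdf, LO, HI)
mean = posterior_mean(cdf, LO, HI)

# Under the same assumed model, averaging N(vartheta, sigma^2) over the
# confidence measure gives a normal predictive distribution for a new X.
predictive = NormalDist(mu=XBAR, sigma=SIGMA * (1 + 1 / N) ** 0.5)
```

In this symmetric case the posterior mean and median both coincide with the sample mean; with an asymmetric confidence CDF they would generally differ.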
(Dawid and Wang (1993), van Berkum (1996), and Hannig (2009) considered this with fiducial-like distributions in place of the confidence measure $P^x$.) Appropriate point predictions are $\bar{\xi}_x = \int_\Omega X(\omega) \, dP^{(x)}(\omega)$ in the regression case of continuous $\Omega$ and $\tilde{\xi}_x = 1_{[1/2, 1]}\left( P^{(x)}(X = 1) \right)$ in the classification case in which $\Omega = \{0, 1\}$. If $P^x$ is approximated using a bootstrap algorithm as in Section 3.2.2, then the resulting values of $\bar{\xi}_x$ and $\tilde{\xi}_x$ are bootstrap aggregation ("bagging") predictions; Breiman (1996) found bagging to reduce prediction error.

The confidence predictive distribution can also be used to determine sizes of new studies by accounting for uncertainty in the effect size. (The classical method of determining the sample size of a planned experiment is often criticized for relying on a point estimate of the effect size.)

3.3 Confidence versus Bayesian probability

As the examples of Section 3.2 illustrate, many uses of Bayesian posterior distributions are completely compatible with confidence measures since both distributions of parameters deliver coherent inferences in the form of probabilities that hypotheses of interest are true. However, to the extent that updating parameter distributions in agreement with valid confidence intervals conflicts with updating them by Bayes's formula, confidence decision theory differs fundamentally from the two dominant forms of Bayesianism: subjective Bayesianism, which is seldom used by the statistics community, and objective Bayesianism, broadly defined as a collection of algorithms for generating prior distributions from sampling distributions or from invariance arguments. Nonetheless, the proposed framework follows from an application of de Finetti's theory of prevision to an agent that makes decisions according to certain confidence levels (§2.3).
3.3.1 Bayesian conditioning

As demonstrated in Section 2.3, the proposed framework for frequentist inference satisfies coherence, which does not require the probability distribution of the parameters to correspond to any Bayesian posterior distribution, a prior distribution conditional on the observed data in the Kolmogorov sense, as is frequently supposed. Not coherence but another pillar of Bayesianism mandates that the posterior distribution, i.e., the parameter distribution used for decisions after making an observation, must equal the prior distribution conditioned on the observation (Goldstein, 1985). That assumption, usually implicit, has been stated as a plausible principle of learning from data:

Definition 13 (Bayesian temporal principle). Consider the prior distribution $\pi$, a probability measure induced by a random vector $\vartheta$ in $\Theta$, the parameter space. Let the update rule $\pi'_\bullet$ denote a function mapping $\Omega$, the sample space, to a set of probability measures, each defined on $\Theta$. If, for all $x' \in \Omega$, the posterior distribution $\pi'_{x'}$ induced by random quantity $\vartheta'_{x'}$ in $\Theta$ is the conditional distribution of $\vartheta$ given $X' = x'$, then $\pi'_\bullet$ satisfies the Bayesian temporal principle, $\pi'_{x'}$ is called a Bayesian posterior distribution, and the equivalence between the posterior and conditional distributions is written as $\vartheta'_{x'} \equiv \vartheta \mid x'$.

Remark 14. In the one-dimensional case, the Bayesian temporal principle stipulates that, for all $\Theta' \subseteq \Theta$, $\pi'_x(\vartheta' \in \Theta') = \pi(\vartheta \in \Theta' \mid X' = x')$, where $\pi'_x$ and $\pi$ are the posterior and prior distributions of $\vartheta'$ and $\vartheta$, respectively. Adding a prime symbol ($'$) for each successive observation gives $\vartheta'_{x'} \equiv \vartheta \mid x'$, $\vartheta''_{x''} \equiv \vartheta'_{x'} \mid x''$, $\vartheta'''_{x'''} \equiv \vartheta''_{x''} \mid x'''$, and so forth.
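What the chain of equivalences in Remark 14 demands can be illustrated with a toy computation. The Python sketch below assumes a finite parameter grid, a uniform prior, and conditionally independent Bernoulli observations, none of which appear in the text; under those assumptions, an update rule that obeys the Bayesian temporal principle at every step must end at the same distribution as a single conditioning of the prior on all observations at once.

```python
def condition(prior, likelihood):
    """Bayes's formula on a finite grid: the prior conditioned on one observation."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def bernoulli_likelihood(grid, x):
    """Likelihood of a single Bernoulli outcome x at each grid value of the parameter."""
    return [t if x == 1 else 1.0 - t for t in grid]


# Hypothetical parameter grid and uniform prior pi.
GRID = [0.1, 0.3, 0.5, 0.7, 0.9]
PRIOR = [0.2] * len(GRID)

# Successive observations x', x'', x''': each posterior is the previous
# distribution conditioned on the newest observation.
sequential = PRIOR
for x in (1, 1, 0):
    sequential = condition(sequential, bernoulli_likelihood(GRID, x))

# One-shot conditioning of the original prior on all three observations.
batch = condition(PRIOR, [t * t * (1.0 - t) for t in GRID])
```

An update rule that revises the prior or the sampling model between observations would generally break the agreement between `sequential` and `batch`.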
Goldstein (2001) coined the name of the principle, explaining that it unreasonably requires that an agent's conditional betting odds (prior odds conditional on a contemplated future observation) determine its future betting odds (posterior odds as a function of the actual observation). In other words, the current rate of machine learning is limited by the previous strength of machine belief.

Goldstein (2001) pointed out that although Bayesians follow the temporal principle when using Bayes's formula, they disregard it every time they revise a prior or sampling model upon seeing new data. Such revision occurs whenever posterior predictions are subjected to frequentist model checking procedures such as cross validation. One rationale for revising the prior is that poor frequentist performance may indicate that it did not adequately reflect the available information as well as it might have had it been more carefully elicited. Another is the receipt of new information that cannot be represented in the probability space of the initial prior (Diaconis and Zabell, 1982).

3.3.2 Non-Bayesian coherence

Confidence decision theory not only satisfies coherence in the sense of avoiding sure loss (§2.3), but, when reduced to the minimization of expected loss with respect to a single confidence measure (§3.2), is also coherent in the sense of axiomatic systems of expected utility maximization (von Neumann and Morgenstern, 1944; Savage, 1954). While both approaches to coherence support the concept of placing bets in accord with the laws of probability, including conditional probability for called-off bets, none of the approaches entails the equality of conditional probability as defined by Kolmogorov and posterior probability as the hypothesis probability updated as a function of observed data.
Replacing probabilities with proposition truth values and conditional probabilities with theorems (statements of implication) furnishes an illustration from deductive logic (Jeffrey, 1986): an agent whose set of propositions held to be true do not contradict each other at any point in time is completely self-consistent. However, the agent cannot comply with the deductive version of the Bayesian temporal principle unless none of the truth values ever requires revision (Howson, 1997). As a finitely additive probability distribution, the confidence measure also agrees with axiomatic systems of probabilistic logic such as that of Cox (1961).

The above accounts of coherence provide no support for the Bayesian temporal principle since their theorems involve conditional probability, not posterior probability as specified by some update rule $\pi'_\bullet$. Simply defining the posterior distribution to be Kolmogorov's conditional distribution given the data either specifies nothing about how parameter distributions are updated with new data or conceals the assumption of the Bayesian temporal principle (Hacking, 1967).

Even though the statistical literature refers to many theorems supporting coherence and rationality as understood in Section 2.3, discussion of the foundational principle of Bayesianism has instead taken place mostly in the philosophical literature. David Lewis (Teller, 1973) presented a transformation of the Dutch book game (§2.3) into one in which the gambler knows the rule the casino agent uses to update its betting odds on receipt of new information. In that game, but not in the original Dutch book game, violation of the Bayesian temporal principle leads the casino to sure loss (Teller, 1973; Vineberg, 1997).
Since such violation occurs over time, it is considered a breach of diachronic game-theoretic coherence, a restriction on the degree to which an agent's betting odds can change over time, as opposed to synchronic game-theoretic coherence, a consistency in an agent's betting odds at any given time (Armendt, 1992). Accordingly, the Dutch book arguments for diachronic coherence have been considered much weaker (Maher, 1992; Goldstein, 2006; Williamson, 2009) than those for synchronic coherence, the type of coherence supported by the theorems of de Finetti (1970) and Savage (1954). Goldstein (1997), Hacking (2001, pp. 256-260), and Williamson (2009), while accepting Dutch book arguments for synchronic coherence, do not consider diachronic coherence to be a requirement of logical thought. Hild (1998) distinguished game-theoretic diachronic coherence from decision-theoretic diachronic coherence, arguing that the latter rules out the Bayesian temporal principle as incoherent. Another difficulty is that some Dutch book arguments lead to versions of diachronic coherence that conflict with the Bayesian temporal principle (Armendt, 1992).

In summary, the theorems routinely presented as proof that all rational thought or coherent decision making must be Bayesian actually prove no more than the irrationality of violating the logic of standard probability theory. Thus, any decision-theoretic framework representing unknown values as random quantities mapped from some probability space stands on equal ground with Bayesianism as far as the minimal requirements of rationality are concerned.
Such frameworks include geometric conditioning (Goldstein, 2001), probability kinematics (Diaconis and Zabell, 1982; Jeffrey, 2004), dynamic coherence (Skyrms, 1997; Zabell, 2002), and relative entropy maximization (Grünwald, 2004; Jaeger, 2005; Williamson, 2009) as well as confidence decision theory (§3.4).

3.3.3 Objections to frequentist posteriors

Since, neglecting sufficiency and ancillarity considerations, the confidence level is numerically equal to the fiducial probability in the case of a one-dimensional parameter of interest given continuous data (Wilkinson, 1977), some classical Bayesian objections against the coherence of fiducial distributions apply with equal force against the coherence of the confidence measure. The strength of such arguments is now evaluated in light of the above distinction between axiomatic coherence and the Bayes update rule.

In the present framework, confidence-based or fiducial probabilities of hypotheses correspond to reasonable betting odds, a consequence that Cornfield (1969) considered impossible since Lindley (1958) had demonstrated that fiducial distributions are Bayesian posteriors only in certain special cases and since placing conditional bets contrary to conditional probability leads to certain loss. The conclusion drawn by Cornfield (1969) would only follow under the widely held but incorrect assumption that a parameter distribution must be a Bayesian posterior for it to satisfy coherence. Lindley (1958), extending the work of Grundy (1956), actually had found conditions under which the fiducial distribution violates the Bayesian temporal principle considered in Section 3.3, not that a conditional fiducial distribution is incompatible with the definition of a conditional probability distribution.
Lindley (1958) also demonstrated that violation of the Bayesian temporal principle means the pivot is not unique, leading to non-unique fiducial distributions. In light of the subsequent failure of a generation of statisticians to identify any genuinely noninformative priors (Dawid et al., 1973; Walley, 1991, pp. 226-235; Kass and Wasserman, 1996; Helland, 2004), the belated rejoinder is that Bayesian posteriors lack uniqueness as well (Fraser, 2008a; Hannig, 2009). Just as given a prior, sampling model, and data, all inferences made using the resulting Bayesian posterior measure are coherent, so given an exact estimator, sampling model, and data, all inferences made using the resulting confidence or fiducial measure are equally coherent. Thus, the selection of frequentist set estimators parallels the selection of priors, and in each case such selection may depend on the intended application. Section 2.2 points to reasonable criteria for such selection.

3.4 Scalar subparameter case

The equality between tail probabilities of a confidence measure and p-values will be used to prove a consistency property that holds under more general conditions for a confidence level than for a p-value as estimators of composite hypothesis truth.

3.4.1 Confidence CDF as the p-value function

If decisions are based on a single confidence measure of a scalar parameter of interest, then the CDF of that measure is an upper-tailed p-value function.

Definition 15. Consider a function $p^+ : \Omega \times \Theta \to [0, 1]$ such that $p^+(x, \bullet) = p^+_x(\bullet)$ is a CDF for all $x \in \Omega$ and such that

$$P_\xi\left( p^+_X(\theta) < \alpha \right) = \alpha \qquad (9)$$

for all $\theta \in \Theta$, $\xi \in \Xi$, and $\alpha \in [0, 1]$. Then, for any $x \in \Omega$, the map $p^+_x : \Theta \to [0, 1]$ is called an upper-tail p-value function for $\theta$.
Likewise, $p^-_x : \Theta \to [0, 1]$ is called a lower-tail p-value function if

$$p^-_x(\theta) = 1 - p^+_x(\theta) \qquad (10)$$

for all $\theta \in \Theta$ and for all $x \in \Omega$.

Uniformly distributed under the simple null hypothesis that $\theta = \theta'$, $p^-_x(\theta')$ and $p^+_x(\theta')$ are exact p-values of one-sided tests. Since equation (10) is an isomorphism between the two p-value functions, the pair $\langle p^-_x(\theta'), p^+_x(\theta') \rangle$ will be called the p-value function, either element of which may be designated by $p^\pm_X(\theta')$. The two-sided p-value of the null hypothesis that $\theta$ is in a central region $\Theta'$ of $\Theta$ is $p_x(\Theta') = 2 \sup_{\theta' \in \Theta'} p^-_x(\theta') \wedge p^+_x(\theta')$ for all $x \in \Omega$, reducing to the usual $p_x(\Theta') = 2\, p^-_x(\theta') \wedge p^+_x(\theta')$ for the point hypothesis that $\theta = \theta'$.

While the name p-value function used by Fraser (1991) has become standard in the scientific literature, significance function is also used in higher-order asymptotics (e.g., Brazzale et al. (2007)). Efron (1993), Schweder and Hjort (2002), and Singh et al. (2007) prefer the term confidence distribution, avoided here to clearly distinguish the p-value function from the confidence measure as a Kolmogorov probability distribution. (Whereas any p-value function is isomorphic to a unique confidence measure as defined in Section 2.2, the p-value function can also be isomorphic to an incomplete probability measure. Wilkinson (1977) constructed a theory of incoherence based on such a measure, underscoring the need to sharply distinguish confidence measures from p-value functions.)

By the usual concept of statistical power, the Type II error rate of $p^\pm$ associated with testing the false null hypothesis that $\theta = \theta'$ at significance level $\alpha$ is $\beta^\pm(\alpha, \theta, \theta') = P_\xi\left( p^\pm_X(\theta') > \alpha \right)$ for any $\theta \gtrless \theta'$.
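The definitions of this subsection can be made concrete with a minimal sketch for a normal mean with known standard error, a model assumed here purely for illustration. In that case $p^+_x(\theta) = \Phi\left((\theta - \bar{x})/\mathrm{se}\right)$ is a CDF in $\theta$, and a small simulation with a fixed seed gives a rough check of the uniformity requirement (9). The function names and the simulation settings are hypothetical.

```python
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def p_upper(theta, xbar, se):
    """Upper-tail p-value function p+_x(theta); nondecreasing in theta."""
    return STD.cdf((theta - xbar) / se)


def p_lower(theta, xbar, se):
    """Lower-tail p-value function, equation (10): p-_x = 1 - p+_x."""
    return 1.0 - p_upper(theta, xbar, se)


def p_two_sided(theta, xbar, se):
    """Usual two-sided p-value of the point hypothesis that the parameter is theta."""
    return 2.0 * min(p_lower(theta, xbar, se), p_upper(theta, xbar, se))


def rejection_rate(theta, se, alpha, draws=20000, seed=1):
    """Monte Carlo estimate of P(p+_X(theta) < alpha) when theta is the true value."""
    rng = random.Random(seed)
    hits = sum(p_upper(theta, rng.gauss(theta, se), se) < alpha
               for _ in range(draws))
    return hits / draws


rate = rejection_rate(theta=0.0, se=1.0, alpha=0.1)  # should be near 0.1
```

For discrete data no function satisfies (9) exactly for every $\alpha$, which is where approximate p-value functions come into play.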
For all $\alpha_1, \alpha_2 \in [0, 1]$ such that $\alpha_1 + \alpha_2 < 1$, $P_\xi\left( \alpha_1 < p^+_X(\theta) < 1 - \alpha_2 \right) = 1 - \alpha_1 - \alpha_2$, implying that $\theta^+_X : [0, 1] \to \Theta$, the inverse function of $p^+_X$, yields $\left( \theta^+_X(\alpha_1), \theta^+_X(1 - \alpha_2) \right)$ as an exact $100(1 - \alpha_1 - \alpha_2)\%$ confidence interval (Fraser, 1991; Efron, 1993; Schweder and Hjort, 2002; Singh et al., 2007).

Remark 16. In many applications, approximate p-value functions replace those that exactly satisfy the definition. For instance, Schweder and Hjort (2002) use a half-corrected p-value function like $p_{C,x}$ of Example 9 for discrete data. Other approximations involve parameter distributions with asymptotically correct frequentist coverage, including the asymptotic p-value functions of Singh et al. (2005), the distributions of asymptotic generalized pivotal quantities of Xiong and Mu (2009), some of the generalized fiducial distributions of Hannig (2009), and the Bayesian posteriors of Section 1.1. As with frequentist inference in general, asymptotics provide approximations that in many applications prove sufficiently accurate for inference in the absence of exact results (Reid, 2003).

3.4.2 Interpretations of the p-value function

In its history, the p-value function has had Neymanian, Fisherian, and Bayesian interpretations. Consistently viewing the p-value function within the Neyman-Pearson framework rather than as the CDF of a probability measure of $\theta$, Fraser (1991), Schweder and Hjort (2002), Singh et al. (2005), and Singh et al. (2007) have used $p^+$ to concisely present information about hypothesis tests and confidence intervals in data analysis results. The p-value function thus interpreted as a warehouse of results of potential hypothesis tests and confidence intervals has also uncovered relationships with the Bayesian and fiducial frameworks (Schweder and Hjort, 2002).
Schweder and Hjort (2002) aimed to demonstrate the power of the frequentist methodology by means of reporting on the p-value function and likelihood function as key components of a unified Neyman-Pearson alternative to Bayesian posterior distributions, which can fail to yield interval estimates guaranteed to cover the true parameter value at some given rate. Interestingly, the incipient p-value function had been originally conceived as a Fisherian alternative to what was seen as a mechanical use of the Neyman-Pearson confidence interval (Cox, 1958).

In a move away from both of the main frequentist interpretations of the p-value function, Efron (1993) proposed a simple, fast algorithm for computing an implied prior density and an implied likelihood from a confidence density assumed to be proportional to a Bayesian posterior density. He reported that with a confidence density based on an exponential model and the ABC confidence interval method, the disagreement between the implied likelihood and the true likelihood observed by Lindley (1958) is small in most cases, with the implication that the confidence density approximates a Bayesian posterior, thereby establishing approximate coherence. However, while compatibility with a Bayesian posterior is sufficient for coherence, it is by no means necessary (§2.3, §3.3). Dropping the requirement of approximating a Bayesian posterior enables more exact frequentist coverage in many instances without sacrificing the coherence achieved by Efron (1993).
The concept of coherence is itself sufficient to recast the p-value function from a pure Neyman-Pearson toolbox into a versatile weapon for statistical inference and decision making, enabling all of the applications available to a Bayesian posterior distribution of the interest parameter, marginal over any nuisance parameters (cf. Efron, 1998).

In addition, information in the form of a subjective prior distribution can be incorporated into frequentist data analysis by combining the prior with the p-value function (Bickel, 2006) under the following circumstances. Suppose Agents A and B each base the posterior probability measure by which they make decisions (§3.2) on confidence sets according to the framework of Section 2.3 whenever the observation that $X = x$ constitutes the only information about the parameter of interest. Agent A observes $x$, which would yield the confidence measure $P^x$ on $(\Theta, \mathcal{A})$, but it also has independent information in the form of $Q$, a probability measure on $(\Theta, \mathcal{A})$ elicited from Agent B, where $\Theta \subseteq \mathbb{R}^1$. Since Agent B would have set $Q$ to equal a confidence measure if possible, Agent A processes $Q$ exactly as it would a confidence measure computed on the basis of data independent of $X$. Since each of several methods of combining p-value functions from independent data sets yields an approximate p-value function incorporating information from both data sets (Singh et al., 2005), Agent A bases its decisions on $P^x \oplus Q$, the probability measure of the CDF obtained by applying any such combination method to the CDFs of $P^x$ and $Q$. It follows that if $Q$ is in fact a confidence measure, then $P^x \oplus Q$ is a confidence measure to the same degree of approximation as the combined CDF is a p-value function.
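A minimal sketch of one way the operation $P^x \oplus Q$ might be carried out follows. It uses a Stouffer-type normal rule, $\Phi\left( k^{-1/2} \sum_i \Phi^{-1}(H_i(\theta)) \right)$, chosen here only for illustration; that this matches any particular method of Singh et al. (2005) is an assumption, as are the two normal CDFs standing in for $P^x$ and $Q$ and every name in the code.

```python
from statistics import NormalDist

STD = NormalDist()   # standard normal distribution
EPS = 1e-15          # keeps inv_cdf away from the endpoints 0 and 1


def combine(cdfs):
    """Combine k CDFs H_1, ..., H_k on a common scalar parameter.

    Returns theta -> Phi( sum_i Phi^{-1}(H_i(theta)) / sqrt(k) ),
    a Stouffer-type rule assumed here for illustration.
    """
    k = len(cdfs)

    def combined(theta):
        z = sum(STD.inv_cdf(min(max(h(theta), EPS), 1.0 - EPS)) for h in cdfs)
        return STD.cdf(z / k ** 0.5)

    return combined


# Hypothetical inputs: the confidence CDF of P^x and the CDF of the elicited Q.
confidence_cdf = NormalDist(mu=1.0, sigma=0.5).cdf
elicited_cdf = NormalDist(mu=2.0, sigma=0.5).cdf

combined_cdf = combine([confidence_cdf, elicited_cdf])

# With two normal CDFs of equal spread, the rule reproduces the CDF that
# pooling the two sources would give: N(1.5, 0.5 / sqrt(2)).
pooled_cdf = NormalDist(mu=1.5, sigma=0.5 / 2 ** 0.5).cdf
```

The equal-spread case is convenient for checking the rule; with unequal spreads a weighted version of the sum would ordinarily be preferred.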
Agents A and B may actually be the same agent, which would be the case if Agent A had computed the prior $Q$ as a confidence measure on the basis of independent data that are no longer available. In conclusion, the presence of important information in the form of a prior probability distribution on $(\Theta, \mathcal{A})$ does not in itself necessitate moving from confidence-based statistics to Bayesian statistics.

3.4.3 Confidence levels versus p-values

Although both confidence levels and p-values can be computed from the same p-value function, the following examples illustrate how they can lead to different inferences and decisions. Section 3.4.4 then demonstrates that the former but not the latter are consistent as estimators of composite hypothesis truth.

Example 17 (point null hypothesis). If $P^x(\vartheta < \bullet)$ is continuous on $\Theta$, then $P^x(\theta = \theta') = 0$ for any interior point $\theta'$ of $\Theta$. This means that given any alternative hypothesis $\theta \in \Theta'$ such that $P^x(\theta \in \Theta') > 0$, betting on $\theta = \theta'$ versus $\theta \in \Theta'$ at any finite betting odds will result in expected loss, reflecting the absence of information singling out the point $\theta = \theta'$ as a viable possibility before the data were observed. (By contrast, the usual two-sided p-value is numerically equal to $p_x(\theta')$, which does not necessarily equal the probability of any hypothesis of interest.) If, on the other hand, the parameter value can equal the null hypothesis value for all practical purposes, that fact may be represented by modeling the parameter of interest as a random effect with nonzero probability at the null hypothesis value. The latter option would define the confidence measure such that its CDF is a predictive p-value function such as that used by Lawless and Fredette (2005).

Example 18 (bioequivalence).
Regulatory agencies often need an estimate of $1_{[\theta' - \Delta,\, \theta' + \Delta]}(\theta)$, the indicator of whether the continuous parameter of interest lies within $\Delta$ of $\theta'$ for some $\Delta > 0$; a value common in bioequivalence studies is $\Delta = \log(125\%)$ with $\exp(\theta')$ as the efficacy of a medical treatment. For the purpose of deciding whether to approve a new treatment or a genetically modified crop, estimates provided by companies with obvious conflicts of interest must be as objective as possible. The Neyman-Pearson framework in effect enables conservative tests of the null hypotheses $\theta \in [\theta' - \Delta, \theta' + \Delta]$, $\theta < \theta' - \Delta$, and $\theta > \theta' + \Delta$ (Wellek, 2003) but without guidance on how to use the resulting p-values $p_x(\theta')$, $p^{+}_x(\theta' - \Delta)$, and $p^{-}_x(\theta' + \Delta)$ to make coherent decisions, which would instead require estimates of $1_{(-\infty,\, \theta' - \Delta)}(\theta)$, $1_{[\theta' - \Delta,\, \theta' + \Delta]}(\theta)$, and $1_{(\theta' + \Delta,\, \infty)}(\theta)$ such that the sum of the estimates is 1. The probabilities $P^x(\vartheta < \theta' - \Delta)$, $P^x(\theta' - \Delta \le \vartheta \le \theta' + \Delta)$, and $P^x(\vartheta > \theta' + \Delta)$ qualify as such estimates without suffering from the subjective or arbitrary nature of assigning a prior distribution. Due to the coherence of probabilistic indicator estimators, regulators may simultaneously consider more complex estimates such as $P^x(\vartheta > \theta' + \Delta \mid \vartheta \notin [\theta' - \Delta, \theta' + \Delta])$, the probability that the effect size is high given that it is non-negligible, without the multiplicity concerns that plague Neymanian statistics (§2.5). Singh et al. (2007) also compared the use of observed confidence levels to conventional methods of bioequivalence.

3.4.4 Consistency of hypothesis confidence

More terminology will be introduced to establish a sense in which the confidence value but not the p-value consistently estimates the hypothesis indicator.

Definition 19.
An indicator estimator $\hat{1}$ is consistent if, for all $\Theta' \in \mathcal{A}$,

$$\hat{1}_{\Theta'}(X) \xrightarrow{P_{\theta,\gamma}} 1_{\Theta'}(\theta)$$

for every $\gamma \in \Gamma$ and for every $\theta$ that is an element of $\Theta$ but not of the boundary of $\Theta'$.

By the usual concept of statistical power, the Type II error rate of $p^{\pm}$ associated with testing the false null hypothesis that $\theta = \theta'$ at significance level $\alpha$ is

$$\beta^{\pm}(\alpha, \theta, \theta') = P_{\theta,\gamma}\left(p^{\pm}_X(\theta') > \alpha\right)$$

for any $\theta \gtrless \theta'$. Commonly used in two-sided testing, the two-sided p-value of the null hypothesis that $\theta \in \Theta'$ is […] for all $\Theta' \subseteq \Theta$ and $x \in \Omega$. The next two propositions contrast the consistency of the confidence value with the inconsistency of the two-sided p-value.

Proposition 20. Assume all one-sided tests represented by the p-value functions $p^{\pm}$ are asymptotically powerful in the sense that $\lim_{n \to \infty} \beta^{\pm}(\alpha, \theta, \theta') = 0$ for all $\alpha \in (0, 1)$ and for all $\theta, \theta' \in \Theta$ such that $\theta \gtrless \theta'$. The function $\hat{1} : \mathcal{A} \times \Omega \to [0, 1]$ is a consistent indicator estimator if $P^x = \hat{1}_{\bullet}(x)$ is a confidence measure corresponding to $p^{\pm}$ given $X = x$ for all $x \in \Omega$.

Proof. By the definition of the boundary of a set $\Theta'$ as the difference between its closure $\bar{\Theta}'$ and its interior $\operatorname{int} \Theta'$, the theorem asserts that, for all $\Theta' \in \mathcal{A}$, $\theta$ is either in $\operatorname{int} \Theta'$, in which case the theorem asserts $P^X(\Theta') \xrightarrow{P_{\theta,\gamma}} 1$, or $\theta$ is in $\Theta \setminus \Theta'$, in which case the theorem asserts $P^X(\Theta') \xrightarrow{P_{\theta,\gamma}} 0$. Let $\mathcal{A}'$ represent the set of all disjoint open […] Each term of the sum expands as

$$P^X(\Theta''') = P^X\big((\inf \Theta''', \sup \Theta''')\big) = p^{+}_X(\sup \Theta''') - p^{+}_X(\inf \Theta''') = p^{-}_X(\inf \Theta''') - p^{-}_X(\sup \Theta''') = 1 - p^{-}_X(\sup \Theta''') - p^{+}_X(\inf \Theta''').$$
As the p-value functions are asymptotically powerful, $p^{\pm}_X(\theta') \xrightarrow{P_{\theta,\gamma}} 0$ for all $\alpha \in (0, 1)$ and for all $\theta, \theta' \in \Theta$ such that $\theta \gtrless \theta'$, with the result that each term may be written as a function of p-values that converge in $P_{\theta,\gamma}$ to 0:

$$P^X(\Theta''') = \begin{cases} p^{-}_X(\inf \Theta''') - p^{-}_X(\sup \Theta''') & \theta < \inf \Theta''' \\ 1 - p^{-}_X(\sup \Theta''') - p^{+}_X(\inf \Theta''') & \theta \in \Theta''' \\ p^{+}_X(\sup \Theta''') - p^{+}_X(\inf \Theta''') & \theta > \sup \Theta''' \end{cases} \xrightarrow{P_{\theta,\gamma}} \begin{cases} 0 - 0 & \theta < \inf \Theta''' \\ 1 - 0 - 0 & \theta \in \Theta''' \\ 0 - 0 & \theta > \sup \Theta''' \end{cases}$$

for all $\Theta''' \in \mathcal{A}'$. Summing the terms over $\mathcal{A}'$ yields

$$P^X(\Theta') \xrightarrow{P_{\theta,\gamma}} \sum_{\Theta''' \in \mathcal{A}'} 1_{\Theta'''}(\theta) = 1_{\Theta'}(\theta)$$

since $\theta \in \operatorname{int} \Theta'$ implies that $\theta$ is in one element of $\mathcal{A}'$.

Remark 21. Polansky (2007, pp. 37-38) proved a similar proposition of consistency given a smooth distribution $P_{\theta,\gamma}$. A suitably transformed likelihood ratio test statistic is also a consistent indicator estimator under the standard regularity conditions (Bickel, 2008).

Proposition 22. Under the conditions of Proposition 20, the two-sided p-value $p_X(\Theta')$ is not a consistent indicator estimator.

Proof. For any $\theta \in \Theta' \in \mathcal{A}$, the distribution of the two-sided p-value $p_X(\Theta')$ converges to the uniform distribution on $[0, 1]$ (Singh et al., 2007), violating consistency (Definition 19).

4 Discussion

The confidence metameasure $\mathcal{P}^x$ and the confidence measure or frequentist posterior $P^x$ bring both coherence and consistency to frequentist inference and decision making. The coherence property established in Section 2.3 confers the ability to consistently and directly report the levels of confidence of as many complex hypotheses as desired and to perform estimation and prediction (§3.2).
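The reporting of several confidence levels at once (Example 18) and their consistency (Proposition 20) can be sketched numerically. The Python code below is only an illustration under stated assumptions: the data are taken to be a normal sample with known standard deviation `sigma`, so that the confidence measure is the normal distribution with mean equal to the sample mean and standard deviation `sigma / sqrt(n)`; the true parameter values 0.1 and 0.5, the center 0, and all function names are hypothetical and do not come from the paper.

```python
import random
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def region_confidences(sample_mean, n, center, delta, sigma=1.0):
    """Confidence levels of theta < center - delta, |theta - center| <= delta,
    and theta > center + delta under the assumed normal confidence measure."""
    scale = sigma / sqrt(n)
    below = STD_NORMAL.cdf((center - delta - sample_mean) / scale)
    above = STD_NORMAL.cdf((sample_mean - center - delta) / scale)  # upper tail by symmetry
    within = 1.0 - below - above  # the three levels sum to 1 by construction
    return below, within, above


def high_given_nonnegligible(below, above):
    """Conditional confidence that theta > center + delta given theta outside the interval."""
    total = below + above
    return above / total if total > 0.0 else float("nan")


delta = log(1.25)        # the bioequivalence margin quoted in Example 18
rng = random.Random(0)   # fixed seed so that the sketch is reproducible

for n in (10, 1000, 100000):
    # Draw each sample mean directly from its sampling distribution N(theta, sigma^2 / n).
    mean_inside = rng.gauss(0.1, 1.0 / sqrt(n))    # true theta = 0.1 lies inside the interval
    mean_outside = rng.gauss(0.5, 1.0 / sqrt(n))   # true theta = 0.5 lies outside it
    print(n,
          round(region_confidences(mean_inside, n, 0.0, delta)[1], 4),
          round(region_confidences(mean_outside, n, 0.0, delta)[1], 4))
```

Under these assumptions, the printed confidence level of the interval hypothesis should approach 1 when the true value lies in the interior of the interval and approach 0 when it lies outside, which is the behavior Proposition 20 describes. The conditional level returned by `high_given_nonnegligible` is obtained from the same measure, so no separate adjustment for reporting several hypotheses is involved.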
Even though the frequentist posterior $P^x$ is a flexible distribution of possible values of a fixed parameter, it requires no prior; in fact, $P^x$ need not even necessarily correspond to any Bayesian posterior distribution (§3.3). In conclusion, the metalevel or level of confidence in a given hypothesis has the internal coherence of the Bayesian posterior or class of such posteriors without requiring a prior distribution or even an exact confidence set estimator. More can be said if the parameter of interest is one-dimensional, in which case the confidence level of a composite hypothesis is consistent as an estimate of whether that hypothesis is true, whereas neither the Bayesian posterior probability nor the p-value is generally consistent in that sense (§3.4.4). Specifically, the equality of the confidence level of $\theta \in \Theta'$ to the coverage rate of the corresponding confidence set guarantees convergence in probability to 1 if $\theta$ is in the interior of $\Theta'$ or to 0 if $\theta \notin \Theta'$ (Proposition 20).

5 Acknowledgments

Anthony Davison furnished many useful comments on an early version of the manuscript that led to greater generality and clarity. I also thank Michael Goldstein for information that fortified the discussion of coherence in Section 3.3 and Corey Yanofsky for providing insightful feedback on a draft of the same section. This work was partially supported by the Faculty of Medicine of the University of Ottawa and by Agriculture and Agri-Food Canada.

References

Armendt, B., 1992. Dutch strategies for diachronic rules: When believers see the sure loss coming. PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1992, 217-229.
Barndorff-Nielsen, O. E., Cox, D. R., 1994. Inference and Asymptotics. CRC Press, London.
Bernardo, J. M., Smith, A. F. M., 1994. Bayesian Theory.
Bickel, D. R., 2004. Degrees of differential gene expression: Detecting biologically significant expression differences and estimating their magnitudes. Bioinformatics (Oxford, England) 20, 682-688.
Bickel, D. R., 2006. Incorporating expert knowledge into frequentist results by combining subjective prior and objective posterior distributions: A generalization of confidence distribution combination. arXiv:math.ST/0602377.
Bickel, D. R., 2008. The strength of statistical evidence for composite hypotheses with an application to multiple comparisons. Technical Report, Ottawa Institute of Systems Biology, COBRA Preprint Series, Article 49, available at tinyurl.com/7yaysp.
Bochkina, N., Richardson, S., 2007. Tail posterior probability for inference in pairwise and multiclass gene expression data. Biometrics 63 (4), 1117-1125.
Bondar, J. V., 1977. A conditional confidence principle. The Annals of Statistics 5 (5), 881-891.
Brazzale, A. R., Davison, A. C., Reid, N., 2007. Applied Asymptotics: Case Studies in Small-sample Statistics. Cambridge University Press, Cambridge.
Breiman, L., 1996. Bagging predictors. Machine Learning 24 (2), 123-140.
Buehler, R. J., Feddersen, A. P., 1963. Note on a conditional property of Student's t. The Annals of Mathematical Statistics 34 (3), 1098-1100.
Carlin, B. P., Louis, T. A., 2009. Bayes and Empirical Bayes Methods for Data Analysis, Second Edition. Chapman & Hall/CRC, New York.
Casella, G., 1987. Conditionally acceptable recentered set estimators. The Annals of Statistics 15 (4), 1363-1371.
Coletti, C., Scozzafava, R., 2002. Probabilistic Logic in a Coherent Setting. Kluwer, Amsterdam.
Cornfield, J., 1969. The Bayesian outlook and its application. Biometrics 25 (4), 617-657.
Cox, D. R., 1958. Some problems connected with statistical inference. The Annals of Mathematical Statistics 29 (2), 357-372.
Cox, D. R., 1977. The role of significance tests. Scandinavian Journal of Statistics 4, 49-70.
Cox, R. T., 1961. The Algebra of Probable Inference. Johns Hopkins Press, Baltimore.
Datta, G. S., Ghosh, M., Mukerjee, R., 2000. Some new results on probability matching priors. Calcutta Statist.Assoc.Bull. 50, 179-192.
Davison, A. C., Hinkley, D. V., Young, G. A., 2003. Recent developments in bootstrap methodology. Statistical Science 18 (2), 141-157.
Dawid, A. P., Stone, M., Zidek, J. V., 1973. Marginalization paradoxes in Bayesian and structural inference. Journal of the Royal Statistical Society. Series B (Methodological) 35 (2), 189-233.
Dawid, A. P., Wang, J., 1993. Fiducial prediction and semi-Bayesian inference. The Annals of Statistics 21 (3), 1119-1138.
de Finetti, B., 1970. Theory of Probability: a Critical Introductory Treatment, 1st Edition. John Wiley and Sons Ltd, New York.
Diaconis, P., Zabell, S. L., 1982. Updating subjective probability. Journal of the American Statistical Association 77 (380), 822-830.
Efron, B., 1993. Bayes and likelihood calculations from confidence intervals. Biometrika 80, 3-26.
Efron, B., 1998. R. A. Fisher in the 21st century, invited paper presented at the 1996 R. A. Fisher lecture. Statistical Science 13 (2), 95-114.
Efron, B., 2003. Second thoughts on the bootstrap. Statistical Science 18 (2), 135-140.
Efron, B., Halloran, E., Holmes, S., 1996. Bootstrap confidence levels for phylogenetic trees. Proceedings of the National Academy of Sciences of the United States of America 93 (23), 13429-13434.
Efron, B., Hinkley, D. V., 1978. Assessing the accuracy of the maximum likelihood estimator: observed versus expected Fisher information. Biometrika 65 (3), 457-487.
Efron, B., Tibshirani, R., 1998. The problem of regions. Annals of Statistics 26 (5), 1687-1718.
Fisher, R. A., 1945. The logical inversion of the notion of the random variable. Sankhya: The Indian Journal of Statistics (1933-1960) 7 (2), 129-132.
Fisher, R. A., 1973. Statistical Methods and Scientific Inference. Hafner Press, New York.
Franklin, J., 2001. Resurrecting logical probability. Erkenntnis 55 (2), 277-305.
Fraser, D. A. S., 1968. The Structure of Inference. John Wiley, New York.
Fraser, D. A. S., 1991. Statistical inference: likelihood to significance. Journal of the American Statistical Association 86, 258-265.
Fraser, D. A. S., 2004. Ancillaries and conditional inference. Statistical Science 19 (2), 333-351.
Fraser, D. A. S., 2008a. Fiducial inference. In: Durlauf, S. N., Blume, L. E. (Eds.), The New Palgrave Dictionary of Economics. Palgrave Macmillan, Basingstoke.
Fraser, D. A. S., Reid, N., 2002. Strong matching of frequentist and Bayesian parametric inference. Journal of Statistical Planning and Inference 103, 263-285.
Fraser, D. A. S., R. N. Y. G. Y., 2008b. Default priors for Bayesian and frequentist inference. Technical Report, Department of Statistics, University of Toronto.
Gleser, L. J., 2002. [Setting confidence intervals for bounded parameters]: Comment. Statistical Science 17 (2), 161-163.
Goldstein, M., 1985. Temporal coherence (with discussion). Vol. 2 of Bayesian Statistics 2. Valencia University Press, New York, pp. 231-248.
Goldstein, M., 1997. Prior inferences for posterior judgements. Structures and Norms in Science, 55-71.
Goldstein, M., 2001. Avoiding foregone conclusions: Geometric and foundational analysis of paradoxes of finite additivity. Journal of Statistical Planning and Inference 94 (1), 73-87.
Goldstein, M., 2006. Subjective Bayesian analysis: principles and practice. Bayesian Analysis 1, 403-420.
Goutis, C., Casella, G., 1995. Frequentist post-data inference. International Statistical Review / Revue Internationale de Statistique 63 (3), 325-344.
Grünwald, P.D., P. D. A., 2004. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory. Annals of Statistics 32 (4), 1367-1433.
Grundy, P. M., 1956. Fiducial distributions and prior distributions: An example in which the former cannot be associated with the latter. Journal of the Royal Statistical Society, Series B 18, 217-221.
Hacking, I., 1965. Logic of Statistical Inference. Cambridge University Press, Cambridge.
Hacking, I., 1967. Slightly more realistic personal probability. Decision, Probability, and Utility.
Hacking, I., 2001. An introduction to probability and inductive logic. Cambridge University Press, Cambridge.
Hall, P., O. H., 2004. Attributing a probability to the shape of a probability density. Annals of Statistics 32 (5), 2098-2123.
Hannig, J., 2009. On generalized fiducial inference. Statistica Sinica 19, 491-544.
Helland, I. S., 2004. Statistical inference under symmetry. International Statistical Review 72 (3), 409-422.
Hild, M., 1998. The coherence argument against conditionalization. Synthese 115 (2), 229-258.
Howson, C., 1997. Logic and Probability. Br J Philos Sci 48 (4), 517-531.
Hwang, J.T., C. G. R. C. W. M. F. R., 1992. Estimation of accuracy in testing. Ann. Statist. 20 (1), 490-509.
Jaeger, M., 2005. A logic for inductive probabilistic reasoning. Synthese 144 (2), 181-248.
Jeffrey, R., 1986. Probabilism and induction. Topoi 5 (1), 51-58.
Jeffrey, R. C., 2004. Subjective Probability: The Real Thing. Cambridge University Press, Cambridge.
Kamimura, T., Shimodaira, H., Imoto, S., Kim, S., Tashiro, K., Kuhara, S., Miyano, S., 2003. Multiscale bootstrap analysis of gene networks based on Bayesian networks and nonparametric regression. Genome Informatics 14, 350-351.
Kaplan, M., 1996. Decision Theory As Philosophy. Cambridge University Press, Cambridge.
Kass, R. E., Wasserman, L., 1996. The selection of prior distributions by formal rules. Journal of the American Statistical Association 91, 1343-1370.
Kyburg, H. E., 2003. Are there degrees of belief? Journal of Applied Logic 1 (3-4), 139-149.
Kyburg, H. E., 2006. Belief, evidence, and conditioning. Philosophy of Science 73 (1), 42-65.
Kyburg, H. E. J., 1990. Science and Reason. Oxford University Press, New York.
Lawless, J. F., Fredette, M., 2005. Frequentist prediction intervals and predictive distributions. Biometrika 92 (3), 529-542.
Lewin, A., Richardson, S., Marshall, C., Glazier, A., Aitman, T., 2006. Bayesian modeling of differential gene expression. Biometrics 62 (1), 1-9.
Lindley, D. V., 1958. Fiducial distributions and Bayes' theorem. Journal of the Royal Statistical Society. Series B (Methodological) 20 (1), 102-107.
Liu, R., S. K., 1997. Notions of limiting p values based on data depth and bootstrap. Journal of the American Statistical Association 92 (437), 266-277.
Maher, P., 1992. Diachronic rationality. Philosophy of Science 59 (1), 120-141.
Mandelkern, M., 2002. Setting confidence intervals for bounded parameters. Statistical Science 17 (2), 149-172.
McCarthy, D. J., Smyth, G. K., 2009. Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics 25 (6), 765-771.
McCullagh, P., Oct. 2002. What is a statistical model? The Annals of Statistics 30 (5), 1225-1267.
Molchanov, I., 2005. Theory of Random Sets. Springer, New York.
Paris, J. B., 1994. The Uncertain Reasoner's Companion: A Mathematical Perspective. Cambridge University Press, New York.
Pawitan, Y., 2001. In All Likelihood: Statistical Modeling and Inference Using Likelihood. Clarendon Press, Oxford.
Polansky, A. M., 2007. Observed Confidence Levels: Theory and Application. Chapman and Hall, New York.
Reid, N., 2003. Asymptotics and the theory of inference. Annals of Statistics 31 (6), 1695-1731.
Royall, R., 1997. Statistical Evidence: A Likelihood Paradigm. CRC Press, New York.
Rubin, D. B., 1984. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann.Statist. 12 (4), 1151-1172.
Savage, L. J., 1954. The Foundations of Statistics. John Wiley and Sons, New York.
Schweder, T., Hjort, N. L., 2002. Confidence and likelihood. Scandinavian Journal of Statistics 29 (2), 309-332.
Severini, T. A., Mukerjee, R., Ghosh, M., 2002. On an exact probability matching property of right-invariant priors. Biometrika 89 (4), 952-957.
Singh, K., Xie, M., Strawderman, W. E., 2005. Combining information from independent sources through confidence distributions. Annals of Statistics 33 (1), 159-183.
Singh, K., Xie, M., Strawderman, W. E., 2007. Confidence distribution (cd) - distribution estimator of a parameter.
Skyrms, B., 1997. The structure of radical probabilism. Erkenntnis (1975-) 45 (2/3, Probability, Dynamics and Causality), 285-297.
Smith, C. A. B., 1961. Consistency in statistical inference and decision. Journal of the Royal Statistical Society. Series B (Methodological) 23 (1), 1-37.
Sprott, D. A., 2000. Statistical Inference in Science. Springer, New York.
Sundberg, R., 2003. Conditional statistical inference and quantification of relevance. Journal of the Royal Statistical Society. Series B: Statistical Methodology 65 (1), 299-315.
Sweeting, T. J., 2001. Coverage probability bias, objective Bayes and the likelihood principle. Biometrika 88 (3), 657-675.
Teller, P., 1973. Conditionalization and observation. Synthese 26 (2), 218-258.
van Berkum, E.E.M., L. H. O. D., 1996. Inference rules and inferential distributions. Journal of Statistical Planning and Inference 49 (3), 305-317.
Van De Wiel, M. A., Kim, K. I., 2007. Estimating the false discovery rate using nonparametric deconvolution. Biometrics 63 (3), 806-815.
Vineberg, S., 1997. Dutch books, Dutch strategies and what they show about rationality. Philosophical Studies 86 (2), 185-201.
von Neumann, J., Morgenstern, O., 1944. Theory of Games and Economic Behavior. Princeton University Press, Princeton.
Walley, P., 1991. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London.
Walley, P., 2002. Reconciling frequentist properties with the likelihood principle. Journal of Statistical Planning and Inference 105 (1), 35-65.
Wasserman, L., 2000. Asymptotic inference for mixture models using data-dependent priors. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 62 (1), 159-180.
Wasserman, L. A., 1990. Prior envelopes based on belief functions. The Annals of Statistics 18 (1), 454-464.
Welch, B. L., Peers, H. W., 1963. On formulae for confidence points based on integrals of weighted likelihoods. J.Roy.Statist.Soc.Ser.B 25, 318-329.
Wellek, S., 2003. Testing Statistical Hypotheses of Equivalence. Chapman and Hall, London.
Wilkinson, G. N., 1977. On resolving the controversy in statistical inference (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 39 (2), 119-171.
Williamson, J., 2007. Probability and Inference. Texts in Philosophy 2. College Publications, London, Ch. Motivating objective Bayesianism: from empirical constraints to objective probabilities, pp. 151-179.
Williamson, J., 2009. Objective Bayesianism, Bayesian conditionalisation and voluntarism. Synthese, 1-19.
Xiong, S., Mu, W., 2009. On construction of asymptotically correct confidence intervals. Journal of Statistical Planning and Inference 139 (4), 1394-1404.
Zabell, S. L., 1992. R. A. Fisher and the fiducial argument. Statistical Science 7 (3), 369-387.
Zabell, S. L., 2002. It all adds up: The dynamic coherence of radical probabilism. Philosophy of Science 69 (3, Supplement: Proceedings of the 2000 Biennial Meeting of the Philosophy of Science Association. Part II: Symposia Papers), S98-S103.
