Objective Bayes testing of Poisson versus inflated Poisson models

IMS Collections
Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh
Vol. 3 (2008) 105–121
© Institute of Mathematical Statistics, 2008
DOI: 10.1214/074921708000000093

M. J. Bayarri (1), James O. Berger (2) and Gauri S. Datta (3)
University of Valencia, Duke University and SAMSI, and University of Georgia

Supported in part by NSF under Grants DMS-01-03265, SES-02-41651 and AST-05-07481, and by the Spanish Ministry of Education and Science, under Grant MTM2007-61554.
(1) Department of Statistics and O. R., University of Valencia, Av. Dr. Moliner 50, 46100 Burjassot, Valencia, Spain, e-mail: susie.bayarri@uv.es
(2) ISDS, Box 90251, Durham, NC 27708-0251, and 19 T.W. Alexander Dr., P.O. Box 14006, Research Triangle Park, NC 27709-4006, USA, e-mail: berger@samsi.info
(3) Department of Statistics, University of Georgia, Athens, GA 30602-1952, USA, e-mail: gaurisdatta@gmail.com

AMS 2000 subject classifications: 62F15, 62F03.
Keywords and phrases: Bayes factor, Jeffreys prior, model selection.

Abstract: The Poisson distribution is often used as a standard model for count data. Quite often, however, such data sets are not well fit by a Poisson model because they have more zeros than are compatible with this model. For these situations, a zero-inflated Poisson (ZIP) distribution is often proposed. This article addresses testing a Poisson versus a ZIP model, using Bayesian methodology based on suitable objective priors. Specific choices of objective priors are justified and their properties investigated. The methodology is extended to include covariates in regression models. Several applications are given.

Contents

1 Introduction
2 Formulation of the problem
  2.1 Bayesian model selection and Bayes factors
  2.2 Specification and justification of the objective priors
  2.3 Objective Bayes factor for Poisson versus ZIP models
3 Applications
4 Model selection in ZIP regression
  4.1 Objective priors for model selection
  4.2 An illustrative application
5 Analysis with insufficient positive counts
  5.1 All zero counts in the non-regression case
  5.2 Insufficient positive counts in the regression case
Appendix
Acknowledgments
References

1. Introduction

The Poisson distribution is often used as a standard probability model for count data. For example, a production engineer may count the number of defects in items randomly selected from a production process. Quite often, however, such data sets are not well fit by a Poisson model because they contain more zero counts than are compatible with the Poisson model. An example is again provided by the production process; indeed, according to Ghosh et al. [14], when some production processes are in a near perfect state, zero defects will occur with a high probability. However, random changes in the manufacturing environment can lead the process to an imperfect state, producing items with defects.
The production process can move randomly back and forth between the perfect and the imperfect states. For this type of production process many items will be produced with zero defects, and this excess might be better modeled by a ZIP distribution than by a Poisson distribution.

For $0 \le p \le 1$ and $\lambda > 0$, the ZIP$(\lambda, p)$ distribution has the probability function

$$f_1(x \mid \lambda, p) = p\, I(x = 0) + (1 - p)\, f_0(x \mid \lambda), \quad x = 0, 1, 2, \ldots, \tag{1.1}$$

where $I(\cdot)$ is the indicator function, and $f_0(x \mid \lambda)$ is the Poisson probability function

$$f_0(x \mid \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots. \tag{1.2}$$

The parameter $p$ is referred to as the zero-inflation parameter.

Many authors have used the ZIP distribution, with and without covariates, to model count data. In a ZIP regression model, Lambert [18] used a frequentist approach and Ghosh et al. [14] used a Bayesian approach to analyze industrial data sets. While the aforementioned authors used the ZIP model to analyze their data, a number of authors have addressed the problem of checking whether a ZIP model is needed to model the data. From the frequentist perspective, score tests have been developed for testing the hypothesis $H_0 : p = 0$ vs. $H_1 : p \ne 0$ in a ZIP regression model ([10], [12]). From the Bayesian perspective, Bhattacharya et al. [9] presented a Bayesian method to test $p \le 0$ versus the alternative $p > 0$ by computing a certain posterior probability of the alternative hypothesis. As in ([10], [12]), $p$ is allowed to be negative in their model [9], as long as $p + (1 - p) e^{-\lambda} \ge 0$.

In this paper, we consider Bayesian testing of $M_0$ versus $M_1$ given by

$$M_0 : X_i \overset{\text{i.i.d.}}{\sim} f_0(\cdot \mid \lambda), \quad i = 1, \ldots, n, \tag{1.3}$$

$$M_1 : X_i \overset{\text{i.i.d.}}{\sim} f_1(\cdot \mid \lambda, p), \quad i = 1, \ldots, n, \tag{1.4}$$

where $f_0$ and $f_1$ are given in (1.2) and (1.1), respectively.
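As a concrete illustration of (1.1) and (1.2), the following minimal Python sketch evaluates both probability functions. The function names `poisson_pmf` and `zip_pmf` are illustrative choices and do not come from the paper.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability function f_0(x | lambda) of (1.2)."""
    # Evaluate on the log scale so that x! does not overflow for large counts.
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def zip_pmf(x: int, lam: float, p: float) -> float:
    """Zero-inflated Poisson probability function f_1(x | lambda, p) of (1.1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p * (x == 0) + (1.0 - p) * poisson_pmf(x, lam)

# With p = 0 the ZIP model reduces to the Poisson model; with p > 0 the
# probability of a zero count rises from e^{-lambda} to p + (1 - p) e^{-lambda}.
print(zip_pmf(0, 2.0, 0.0), zip_pmf(0, 2.0, 0.3))
```

Setting `p = 0` recovers the Poisson model $M_0$, which is why the test can be phrased as a test on $p$ within the ZIP family.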
Note that, as opposed to the situations in the papers mentioned above, $p < 0$ is not possible here. Indeed, we can alternatively formulate the problem as that of testing, within the ZIP model, $H_0 : p = 0$ versus $H_1 : p > 0$. Unlike the analysis in [9], $p = 0$ (i.e., the Poisson model) is assumed to have a priori believability (e.g., prior probability 1/2).

In Section 2 we develop the suggested objective testing of Poisson versus ZIP models when not all counts are zeros. For all zeros, the ZIP distribution is not identifiable, and a proper prior is required for all parameters; we address this in Section 5. Section 3 is devoted to some comparative examples. We consider inclusion of covariates in Section 4, where we address the testing of Poisson versus ZIP regression models and give an example involving AIDS-related deaths in men. In the regression case, in order for the objective Bayesian model selection to be successful we need enough positive counts so that the design matrix based on the positive counts is of full column rank. When this condition does not hold we suggest in Section 5 a partially proper prior on the regression parameters to be used for model selection. Proofs and technical details are relegated to an Appendix.

2. Formulation of the problem

The Bayesian methodology for choosing between two models for some data is conceptually very simple (see, e.g., [3]). One assesses the prior probabilities of each model and the prior distributions for the model parameters, and computes the posterior probabilities of each model. These posterior probabilities can be computed directly from the prior probabilities and the Bayes factor, an (integrated) likelihood ratio for the models which is very popular in Bayesian testing and model selection.
Often it is not possible (for lack of time or resources) to carefully assess in a subjective manner all the needed priors. In these situations, very satisfactory answers are provided by objective Bayesian analyses that do not use external information other than that required to formulate the problem (see [4]).

First we review below some difficulties of model selection via objective Bayesian analysis. Then we justify the objective prior we chose for our problem, derive the corresponding Bayes factor and study properties of the prior and the Bayes factor.

2.1. Bayesian model selection and Bayes factors

To compare two models, $M_0$ and $M_1$, for the data $X = (X_1, \ldots, X_n)$, the Bayesian approach is based on the Bayes factor $B_{10}$ of $M_1$ to $M_0$ given by

$$B_{10} = \frac{m_1(x)}{m_0(x)} = \frac{\int f_1(x \mid \theta_1)\, \pi_1(\theta_1)\, d\theta_1}{\int f_0(x \mid \theta_0)\, \pi_0(\theta_0)\, d\theta_0}, \tag{2.1}$$

where, under model $M_i$, $X$ has density $f_i(x \mid \theta_i)$ and the unknown parameters $\theta_i$ in $M_i$ are assigned a prior density $\pi_i(\theta_i)$, $i = 0, 1$. For given prior model probabilities $\Pr(M_0)$ and $\Pr(M_1) = 1 - \Pr(M_0)$, the posterior probability of, say, $M_0$ is

$$\Pr(M_0 \mid x) = \left[ 1 + B_{10}\, \frac{\Pr(M_1)}{\Pr(M_0)} \right]^{-1}. \tag{2.2}$$

In objective Bayesian analyses $\pi_i(\theta_i)$ is chosen in an objective or conventional fashion and the hypotheses would be assumed to be equally likely a priori.

Use of objective priors has a long history in Bayesian inference (see, for example, [8] and [17] for justifications and references). They are, however, typically improper and are only defined up to an arbitrary multiplicative constant. This is not a problem in the posterior distribution, since the same constant appears in both the numerator and the denominator of Bayes theorem and so cancels.
In model selection and hypothesis testing, however, it can be seen from (2.1) that when at least one of the priors $\pi_i(\theta_i)$ is improper, the arbitrary constant does not cancel, so that the Bayes factor is then arbitrary and undefined. An important exception to this arises in invariant situations for parameters occurring in all of the models; Berger et al. [7] show that use of the (improper) right Haar invariant prior is then permissible.

One of the ways to address this difficulty is to try to directly “fix” the Bayes factor by appropriately choosing the multiplicative constant, as in [13]. Popular methods (the intrinsic Bayes factor [5] and the fractional Bayes factor [20]) for fixing this constant arise as a consequence of “training” the improper priors into proper priors based on part of the data or of the likelihood. We refer to Berger and Pericchi [6] for a review, references and comparisons. Another possibility is to directly derive appropriate “objective” but proper distributions $\pi_i(\theta_i)$ to use in model selection; see [2] and [15] for methods and references. This is the approach taken in this paper (with a slight exception in Section 5).

2.2. Specification and justification of the objective priors

Returning to the testing of the Poisson ($M_0$) vs. the ZIP ($M_1$) models, i.e., testing

$$M_0 : X \sim f_0(x \mid \lambda) \quad \text{vs.} \quad M_1 : X \sim f_1(x \mid \lambda, p), \tag{2.3}$$

the key issue is the choice of the priors $\pi_0(\lambda)$ and $\pi_1(\lambda, p) = \pi_1(\lambda)\, \pi_1(p \mid \lambda)$. A frequent simplifying procedure (both for subjective and objective methods) is to take $\pi_0(\lambda)$ equal to $\pi_1(\lambda)$, that is, to give the same prior to the parameters occurring in all models under consideration.
This, however, may be inappropriate, since $\lambda$ might have entirely different meanings under model $M_0$ and under model $M_1$; the fact that we have used the same label does not imply that they have the same meanings. This frequent mistake is discussed, for example, in [7].

It has been argued that, if the common parameters are orthogonal to the remaining parameters in each model (that is, the Fisher information matrix is block diagonal), then they can be assigned the same prior distribution ([15], [16]). In this case, improper priors can be used, since the arbitrary constant would cancel in the Bayes factor.

Unfortunately, $p$ and $\lambda$ in the ZIP model are not orthogonal. We first reparameterize the original model. With $p^* = p + (1 - p) e^{-\lambda}$, we rewrite $f_1(x \mid \lambda, p)$ as

$$f_1^*(x \mid \lambda, p^*) = p^*\, I(x = 0) + (1 - p^*)\, f_T(x \mid \lambda), \quad x = 0, 1, 2, \ldots, \tag{2.4}$$

where $f_T(x \mid \lambda)$ is the zero-truncated Poisson distribution with parameter $\lambda$. Note that $p^* \ge e^{-\lambda}$. We can trivially express the Poisson ($M_0$) model as

$$f_0^*(x \mid \lambda) = e^{-\lambda}\, I(x = 0) + (1 - e^{-\lambda})\, f_T(x \mid \lambda), \quad x = 0, 1, 2, \ldots, \tag{2.5}$$

and now it can intuitively be seen that $\lambda$ has the same meaning in both $f_1^*$ and $f_0^*$. Indeed the Fisher information matrix for $p^*$ and $\lambda$ can be checked to be diagonal.

With an orthogonal reparameterization, Jeffreys (1961) recommended using (i) the Jeffreys prior (the square root of Fisher information) for the “common” parameters; and (ii) a reasonable proper prior for the extra parameters in the more complex model. The situation here is very unusual, however, in that the Jeffreys prior for the “common” $\lambda$ is different for each model.
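The equivalence of (1.1) and (2.4) takes only two lines to verify. The following worked step uses just the definitions above, together with the standard form of the zero-truncated Poisson, $f_T(x \mid \lambda) = f_0(x \mid \lambda) / (1 - e^{-\lambda})$ for $x \ge 1$ and $f_T(0 \mid \lambda) = 0$:

$$\begin{aligned}
x = 0: \quad & f_1(0 \mid \lambda, p) = p + (1 - p) e^{-\lambda} = p^*, \\
x \ge 1: \quad & f_1(x \mid \lambda, p) = (1 - p)\, f_0(x \mid \lambda) = (1 - p)(1 - e^{-\lambda})\, f_T(x \mid \lambda) = (1 - p^*)\, f_T(x \mid \lambda),
\end{aligned}$$

where the last equality uses $1 - p^* = (1 - p)(1 - e^{-\lambda})$. Setting $p = 0$ gives $p^* = e^{-\lambda}$, which is the Poisson form (2.5).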
The Jeffreys prior for $\lambda$ in the Poisson model is well known to be $\pi_0^J(\lambda) = 1/\sqrt{\lambda}$, whereas the Jeffreys prior for the orthogonalized ZIP model is easily shown to be the same as the Jeffreys prior for the truncated distribution $f_T(x \mid \lambda)$, which is

$$\pi_1^J(\lambda) = \frac{k(\lambda)}{\sqrt{\lambda}}, \quad \text{where} \quad k(\lambda) = \frac{\{1 - (\lambda + 1) e^{-\lambda}\}^{1/2}}{1 - e^{-\lambda}}.$$

That these priors are different after orthogonalization is highly unusual and can be traced to the fact that $\lambda$ also enters into the definition of the nested model, through $p^* = e^{-\lambda}$. In any case, we are left without clear guidance as to whether $\pi_0^J$ or $\pi_1^J$ should be used as the prior for $\lambda$. (Note that, in computing the Bayes factor, the same prior for $\lambda$ must be used in both the numerator and the denominator; otherwise one is facing the indeterminacy issues discussed earlier.)

Under the orthogonalized ZIP model, we also need to specify a proper prior for $p^*$ given $\lambda$, which we propose to take uniform over the interval $(e^{-\lambda}, 1)$, that is,

$$\pi_1(p^* \mid \lambda) = \frac{I(e^{-\lambda} < p^* \le 1)}{1 - e^{-\lambda}}.$$

We can thus write the overall priors being considered for the two models $f_0^*(x \mid \lambda)$ and $f_1^*(x \mid \lambda, p^*)$ as, respectively,

$$\pi_0^l(\lambda) = \frac{k(\lambda)^l}{\sqrt{\lambda}}, \qquad \pi_1^l(\lambda, p^*) = \frac{k(\lambda)^l}{\sqrt{\lambda}}\, \frac{I(e^{-\lambda} < p^* \le 1)}{1 - e^{-\lambda}},$$

where $l$ is 0 or 1 as we utilize one or the other of the two Jeffreys priors for $\lambda$.

It is computationally more convenient to work in the original $(p, \lambda)$ parameterization. A change of variables above then results in the priors

$$\pi_0^l(\lambda) = \frac{k(\lambda)^l}{\sqrt{\lambda}}, \qquad \pi_1^l(\lambda, p) = \frac{k(\lambda)^l}{\sqrt{\lambda}}\, I(0 < p \le 1), \tag{2.6}$$

which we will henceforth consider (for $l$ equal to 0 or 1). We are not aware of any desiderata that would suggest a preference for either the $l = 0$ prior or the $l = 1$ prior, but luckily the two yield almost the same answers.
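One way to see how little the two priors differ is to tabulate $k(\lambda)$, the ratio of the $l = 1$ prior to the $l = 0$ prior. The sketch below is a minimal illustration; the function name `k` mirrors the notation of the text, and the grid of $\lambda$ values is an arbitrary choice.

```python
import math

def k(lam: float) -> float:
    """k(lambda) = {1 - (lambda + 1) e^{-lambda}}^{1/2} / (1 - e^{-lambda})."""
    if lam <= 0:
        raise ValueError("lambda must be positive")
    numerator = math.sqrt(1.0 - (lam + 1.0) * math.exp(-lam))
    denominator = -math.expm1(-lam)  # 1 - e^{-lambda}, accurate for small lambda
    return numerator / denominator

# Arbitrary grid of lambda values, chosen only for illustration.
for lam in (0.01, 0.1, 1.0, 5.0, 20.0):
    print(f"lambda = {lam:5.2f}   k(lambda) = {k(lam):.4f}")
```

On this grid the values stay between $1/\sqrt{2}$ and 1 and increase with $\lambda$.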
Indeed, simple algebra shows that $k(\lambda)$ is a strictly increasing function of $\lambda$ and that

$$\inf_{\lambda} k(\lambda) = \frac{1}{\sqrt{2}} \approx 0.71 \quad \text{and} \quad \sup_{\lambda} k(\lambda) = 1. \tag{2.7}$$

Thus $k(\lambda)$ is quite flat as a function of $\lambda$, so that $k(\lambda)^1$ and $k(\lambda)^0 = 1$ are very similar. An immediate consequence for the Bayes factors $B_{10}^l$, $l = 0, 1$, is that

$$B_{10}^0 / \sqrt{2} \le B_{10}^1 \le \sqrt{2}\, B_{10}^0,$$

so that the two Bayes factors can only differ by a modest amount (and in practice the difference is much smaller than this). It is obviously a bit simpler to work with the $l = 0$ prior, so we drop the $l$ superscript and henceforth utilize the prior

$$\pi_0(\lambda) = \frac{1}{\sqrt{\lambda}}, \qquad \pi_1(p, \lambda) = \frac{1}{\sqrt{\lambda}}\, I(0 < p \le 1). \tag{2.8}$$

2.3. Objective Bayes factor for Poisson versus ZIP models

Recall that the model $M_0$ is the standard Poisson model and the model $M_1$ is the ZIP model. For a sample of $n$ counts $X_1, \ldots, X_n$, let $X$ denote the sample, $k = \sum_{i=1}^{n} I(X_i = 0)$ be the number of zero counts, and $s = \sum_{i=1}^{n} X_i$ be the total count. Note that $k = n$ is equivalent to $s = 0$. For given data $x$, the densities $f_0(x \mid \lambda)$ and $f_1(x \mid \lambda, p)$ under the two models are given by

$$f_0(x \mid \lambda) = \frac{e^{-n\lambda} \lambda^{s}}{\prod_{i=1}^{n} x_i!}, \qquad f_1(x \mid \lambda, p) = \frac{[p + (1 - p) e^{-\lambda}]^{k}\, (1 - p)^{n-k}\, e^{-(n-k)\lambda}\, \lambda^{s}}{\prod_{i=1}^{n} x_i!}.$$

For $s > 0$ (i.e., the counts are not all zero),

$$m_0(x) = \int f_0(x \mid \lambda)\, \pi_0(\lambda)\, d\lambda = \frac{\Gamma(s + \tfrac{1}{2})}{n^{s + 1/2} \prod x_i!}.$$

Using the binomial expansion of $[p + (1 - p) e^{-\lambda}]^{k}$,

$$\begin{aligned}
m_1(x) &= \int f_1(x \mid \lambda, p)\, \pi_1(p, \lambda)\, dp\, d\lambda \\
&= \frac{1}{\prod x_i!} \sum_{j=0}^{k} \frac{k!}{j!\,(k - j)!} \int_0^{\infty} \int_0^{1} p^{j} (1 - p)^{n-j}\, e^{-(n-j)\lambda}\, \lambda^{s - 1/2}\, dp\, d\lambda \\
&= \frac{k!}{(n + 1)! \prod x_i!} \sum_{j=0}^{k} \frac{(n - j)!}{(k - j)!}\, \Gamma(s + \tfrac{1}{2})\, (n - j)^{-(s + 1/2)}.
\end{aligned}$$

Both $m_0(x)$ and $m_1(x)$ are finite and the Bayes factor $B_{10}(x) = m_1(x) / m_0(x)$ is

$$B_{10}(x) = \frac{k!}{(n + 1)!} \sum_{j=0}^{k} \frac{(n - j)!}{(k - j)!} \left(1 - \frac{j}{n}\right)^{-(s + 1/2)}. \tag{2.9}$$
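Formula (2.9) involves ratios of large factorials, so a direct evaluation can overflow for moderate $n$. The following minimal sketch evaluates it on the log scale with `math.lgamma` and a log-sum-exp step. The function name `log_bayes_factor` and the sample used in the last line are hypothetical and are not taken from the paper.

```python
import math

def log_bayes_factor(n: int, k: int, s: int) -> float:
    """log B_10 of (2.9) for n counts, k zero counts and total count s > 0."""
    if not 0 <= k < n or s <= 0:
        raise ValueError("need 0 <= k < n and s > 0 (not all counts zero)")
    # Log of each summand: k! (n-j)! / ((n+1)! (k-j)!) * (1 - j/n)^{-(s + 1/2)}.
    log_terms = [
        math.lgamma(k + 1) - math.lgamma(n + 2)
        + math.lgamma(n - j + 1) - math.lgamma(k - j + 1)
        - (s + 0.5) * math.log1p(-j / n)
        for j in range(k + 1)
    ]
    # Log-sum-exp keeps the sum stable when single terms are very large or small.
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

# Hypothetical sample, not from the paper: 20 counts, 12 zeros, total count 15.
print(math.exp(log_bayes_factor(n=20, k=12, s=15)))
```

With no zero counts ($k = 0$) the sum has a single term and (2.9) reduces to $B_{10} = 1/(n + 1)$, which offers a quick sanity check of any implementation.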
Note that, as intuitively expected, for any given $n$ the Bayes factor is increasing in $s$ (total count) for any fixed $k$ (the number of zeros), and is increasing in $k$ for any fixed $s$. We use (2.9) to calculate the Bayes factors for the examples in Section 3.

When $s = 0$ or, equivalently, all counts are zero ($x = \mathbf{0}$), there is a problem. While $m_0(\mathbf{0}) = \Gamma(1/2) / \sqrt{n}$ remains finite, it is easy to see that $m_1(\mathbf{0})$ is infinite. Indeed for any prior of the form $h(p)\, \pi(\lambda)$, where $\pi(\lambda)$ is improper and $h(p)$ is a proper density (as is required for testing), the marginal density $m_1(\mathbf{0})$ will be infinite. This is because, for $x = \mathbf{0}$, the density $f_1(x \mid \lambda, p) \ge p^{n}$, implying

$$m_1(\mathbf{0}) \ge \int_0^1 p^{n} h(p)\, dp \int_0^{\infty} \pi(\lambda)\, d\lambda = \infty.$$

We discuss what to do for this case in Section 5.

3. Applications

In this section we apply our methodology to two datasets to detect if zero-inflation is present in the data. These examples have been analyzed for zero-inflation previously using both frequentist and Bayesian procedures. Since there are non-zero counts in both examples, the Bayes factors are computed using (2.9).

Example 3.1. The first data set is the Urinary Tract Infection (UTI) data used in Broek [10], which used a score test to detect zero-inflation in a Poisson model. The data are collected from 98 HIV-infected men treated at the Department of Internal Medicine at the Utrecht University hospital. The number of times they had a urinary tract infection was recorded as $X$. The data are recorded in Table 1. Merely by looking at the data it is apparent that zero-inflation is present.

Equation (2.9) yields a Bayes factor $B_{10} = 223.13$ in favor of model $M_1$ versus model $M_0$; if the models were believed to be equally likely a priori, the resulting posterior model probabilities would be $\Pr(M_1 \mid x) = 0.995$ and $\Pr(M_0 \mid x) = 0.005$.
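These figures can be checked from the frequencies in Table 1 (81, 9, 7 and 1 men with 0, 1, 2 and 3 infections). The sketch below is a minimal illustration under the equal-prior-probability assumption stated above. The function names are illustrative, and the printed values should agree with the reported ones only up to rounding.

```python
import math

def bayes_factor(n: int, k: int, s: int) -> float:
    """B_10 of (2.9), summed on the log scale for numerical stability."""
    log_terms = [
        math.lgamma(k + 1) - math.lgamma(n + 2)
        + math.lgamma(n - j + 1) - math.lgamma(k - j + 1)
        - (s + 0.5) * math.log1p(-j / n)
        for j in range(k + 1)
    ]
    top = max(log_terms)
    return math.exp(top) * sum(math.exp(t - top) for t in log_terms)

def posterior_prob_m0(b10: float, prior_m0: float = 0.5) -> float:
    """Pr(M_0 | x) from (2.2) for a given prior probability of M_0."""
    return 1.0 / (1.0 + b10 * (1.0 - prior_m0) / prior_m0)

# Table 1 (UTI data): observed count X -> frequency.
uti = {0: 81, 1: 9, 2: 7, 3: 1}
n = sum(uti.values())                    # sample size
k = uti[0]                               # number of zero counts
s = sum(x * f for x, f in uti.items())   # total count
b10 = bayes_factor(n, k, s)
print(n, k, s, b10, posterior_prob_m0(b10))
```

Here the summary statistics are $n = 98$, $k = 81$ and $s = 26$, which are the only features of the data that enter (2.9).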
This is indeed strong evidence in favor of the ZIP model. In Bayesian testing of $H_0 : p \le 0$ versus $H_1 : p > 0$, Bhattacharya et al. [9] obtained $\Pr(p > 0 \mid x) = 0.999$. The observed value of the score statistic was reported as 15.34 [10], yielding a $p$-value of 0.0001. All three analyses present strong evidence in favor of the ZIP model, but notice that the $p$-value seems to suggest stronger evidence against the Poisson null than the Bayesian analysis, and the point null Bayesian analysis suggests weaker evidence than the interval Bayesian test.

Table 1. UTI data

X            0    1    2    3    Total
Frequency    81   9    7    1    98

Example 3.2. The next data set we consider is the Terrorism data from [11]. Table 2 gives the number of incidents of international terrorism per month ($X$) in the United States between 1968 and 1974. It is not intuitively clear whether or not there is zero-inflation in this data set.

Table 2. Terror data

X            0    1    2    3    4    Total
Frequency    38   26   8    2    1    75

The Bayes factor here is $B_{10} = 0.28$, yielding an objective posterior probability $\Pr(M_1 \mid x) = 0.219$, which actually supports the Poisson model. A previous analysis found $\Pr(p > 0 \mid x) = 0.507$, an indeterminate value [9]. The observed value of the score statistic is 0.04, with a $p$-value of 0.83. Conigliani et al. [11] test a Poisson null model against a nonparametric alternative, finding a fractional Bayes factor $B^{F}_{10}$ of 0.0089 of the nonparametric alternative to the Poisson; the apparent strength of this conclusion, compared with the other results, is rather puzzling.

4. Model selection in ZIP regression

Many applications involve count data where covariate information is available; see, for example, [14] and [18].
In this section we consider selecting between Poisson regression and ZIP regression models given by

$$M_0^R : X_i \overset{\text{ind}}{\sim} \text{Poisson}(\lambda_i), \quad i = 1, \ldots, n, \tag{4.1}$$

$$M_1^R : X_i \overset{\text{ind}}{\sim} \text{ZIP}(\lambda_i, p), \quad i = 1, \ldots, n. \tag{4.2}$$

For a known offset variable $a_{0i}$, a $q \times 1$ vector of covariates $a_i$ and regression parameters $\beta = (\beta_1, \ldots, \beta_q)^T$, suppose the $\lambda_i$ follow the log-linear relationship $\log(\lambda_i) = a_{0i} + a_i^T \beta$. We assume that the matrix $A^T = (a_1, \ldots, a_n)$ is of rank $q$. Let $k$ denote the number of zero counts in the data. For simplicity of notation, we index the observations in such a way that all the zeros are given by the first $k$ counts.

4.1. Objective priors for model selection

Generalizing the argument in Section 2.2 to the regression case is easy in one case, but difficult in the other. If we choose to base the analysis on the Jeffreys prior for $\beta$ under the Poisson regression model $M_0^R$, the generalization is straightforward: the Jeffreys prior is easily computed as

$$\pi_0^R(\beta) = \Big| \sum_{i=1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2}. \tag{4.3}$$

Note that this prior is positive since the rank of $A$ is $q$. Also, utilizing this prior for $\beta$ under model $M_1^R$, along with the independent uniform prior for $p$, results in the following priors to be utilized to compute $B_{10}$:

$$\pi_0^0(\beta) = \Big| \sum_{i=1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2}, \qquad \pi_1^0(\beta, p) = \Big| \sum_{i=1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2} I(0 < p \le 1). \tag{4.4}$$

The generalization to the regression case of the second prior considered in Section 2.2 is much more difficult, because the Jeffreys prior under the ZIP regression model is very complicated. In Section 2.2, the derivation of the corresponding Jeffreys prior was essentially done by ignoring the zero counts, utilizing only the truncated Poisson distribution.
This suggests modifying (4.3) by removing the terms corresponding to the zero counts, resulting in

$$\pi_1^R(\beta) = \Big| \sum_{i=k+1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2}. \tag{4.5}$$

From another intuitive perspective, the zero counts arising from the inflation factor are clearly irrelevant in fitting the log-linear model to the $\lambda_i$ and, since we do not know which zero counts arise from the inflation factor, dropping them all from the Jeffreys prior has an appeal. Let $A_+ = (a_{k+1}, \ldots, a_n)^T$. The prior (4.5) can only be used provided it is positive, which is ensured if the rank of $A_+$ is $q$. The resulting overall prior for use in computing $B_{10}$ is then

$$\pi_0^1(\beta) = \Big| \sum_{i=k+1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2}, \qquad \pi_1^1(\beta, p) = \Big| \sum_{i=k+1}^{n} \lambda_i\, a_i a_i^T \Big|^{1/2} I(0 < p \le 1). \tag{4.6}$$

The first basic issue in use of these priors is whether or not they yield finite marginal distributions. This is addressed in the following theorems, the first of which deals with the marginal density under the Poisson regression model.

Theorem 4.1. For the Poisson regression model and either the Jeffreys prior ($j = 0$) or the modified Jeffreys prior ($j = 1$),

$$m_0^R(x) = \int_{\mathbb{R}^q} \prod_{i=1}^{n} \left\{ \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \right\} \pi_j^R(\beta)\, d\beta < \infty. \tag{4.7}$$

Proof. See the Appendix.

Note that with more than one covariate there is typically no closed-form expression for $m_0^R(x)$. Hence $m_0^R(x)$ needs to be evaluated by numerical or Monte Carlo integration.

For the ZIP regression model, the marginal density $m_1^R(x)$, under an arbitrary improper prior $\pi(\beta)$ for $\beta$ and an independent uniform prior for $p$, is given by

$$m_1^R(x) = \int_{\mathbb{R}^q} \int_0^1 f_1(x \mid \beta, p)\, \pi(\beta)\, dp\, d\beta, \tag{4.8}$$

where the density of $x$, under model $M_1^R$, is given by

$$f_1(x \mid \beta, p) = \prod_{i=1}^{k} \{ p + (1 - p) e^{-\lambda_i} \}\, (1 - p)^{n-k} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!}.$$
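The two candidate priors (4.3) and (4.5) differ only in which observations enter the information sum. The sketch below evaluates either one at a given $\beta$ using only the Python standard library. The function names `determinant` and `regression_prior` and the toy design are hypothetical and serve only as an illustration.

```python
import math

def determinant(matrix):
    """Determinant of a small square matrix by Gaussian elimination with pivoting."""
    a = [list(row) for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for j in range(col, size):
                a[row][j] -= factor * a[col][j]
    return det

def regression_prior(beta, covariates, offsets, counts=None):
    """Unnormalized prior |sum_i lambda_i a_i a_i^T|^{1/2}.

    With counts=None the sum runs over all observations, as in (4.3);
    otherwise observations with a zero count are dropped, as in (4.5).
    """
    q = len(beta)
    info = [[0.0] * q for _ in range(q)]
    for i, (a_i, a0_i) in enumerate(zip(covariates, offsets)):
        if counts is not None and counts[i] == 0:
            continue
        # Log-linear link: log(lambda_i) = a_{0i} + a_i^T beta.
        lam_i = math.exp(a0_i + sum(b * a for b, a in zip(beta, a_i)))
        for r in range(q):
            for c in range(q):
                info[r][c] += lam_i * a_i[r] * a_i[c]
    return math.sqrt(max(determinant(info), 0.0))

# Hypothetical toy design with q = 2 (an intercept and one covariate).
covariates = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
offsets = [0.0, 0.0, 0.0, 0.0]
counts = [0, 0, 3, 1]
beta = (0.0, 0.0)
print(regression_prior(beta, covariates, offsets))          # prior (4.3)
print(regression_prior(beta, covariates, offsets, counts))  # prior (4.5)
```

If the covariate vectors of the positive counts do not span $\mathbb{R}^q$, the determinant in (4.5) is zero and the function returns 0, which is the situation the rank condition on $A_+$ is meant to exclude.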
Again, as for $m_0^R(x)$, there is usually no closed-form expression for $m_1^R(x)$ and the marginal needs to be computed via numerical or Monte Carlo integration.

To investigate the finiteness of $m_1^R(x)$, note first that

$$p^{k} (1 - p)^{n-k} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \le f_1(x \mid \beta, p) \le \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!}. \tag{4.9}$$

In view of this inequality and the independent uniform prior for $p$, the marginal $m_1^R(x)$ is finite if and only if

$$\int_{\mathbb{R}^q} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!}\, \pi(\beta)\, d\beta < \infty. \tag{4.10}$$

Theorem 4.2 below gives sufficient conditions for this to be finite under the priors (4.3) and (4.5), respectively. Recall that the $k$ zeros in the sample are labeled to correspond to the first $k$ observations. A key condition will be that the matrix $A_+$ has rank $q$, which implies that $n \ge k + q$ (analogous to the condition of at least one positive count for the case of no covariate treated in Section 2).

Theorem 4.2. Using $\pi_0^R(\beta)$: Suppose that, for the observations $X_j$, $j = 1, \ldots, k$, corresponding to the zero counts, the corresponding covariate vector $a_j$ is such that

$$a_j = \sum_{m=k+1}^{n} c_{mj}\, a_m \quad \text{with } c_{mj} \ge 0, \quad j = 1, \ldots, k, \ m = k + 1, \ldots, n. \tag{4.11}$$

Then the marginal $m_1^R(x)$ is finite.
Using $\pi_1^R(\beta)$: If $A_+$ has rank $q$, the marginal $m_1^R(x)$ is finite.

Proof. See the Appendix.

Clearly the condition under which $m_1^R(x)$ is finite is more general and much easier to check for $\pi_1^R(\beta)$ than for $\pi_0^R(\beta)$. This, together with the intuitive appeal of $\pi_1^R(\beta)$, leads us to recommend its use in practice. (Note that either of the two priors reduces to the prior recommended in Section 2 for the non-regression case.)

Remark 4.1. If the condition (4.11) fails, the marginal density $m_1^R(x)$ based on the Jeffreys prior may be infinite.
For example, consider $n = 3$ and $q = 2$, with $\lambda_1 = \lambda_2^{c_1} \lambda_3^{c_2}$, $\lambda_2 = \exp(\beta_1)$, $\lambda_3 = \exp(\beta_2)$ for suitable nonzero $c_1$, $c_2$ to be chosen later. Then the determinant of the information matrix for $\beta$ is given by

$$|I(\beta)| = \lambda_2 \lambda_3 + c_1^2\, \lambda_2^{c_1} \lambda_3^{c_2 + 1} + c_2^2\, \lambda_2^{c_1 + 1} \lambda_3^{c_2},$$

so that $|I(\beta)|^{1/2} \ge |c_1|\, \lambda_2^{c_1/2} \lambda_3^{(c_2 + 1)/2}$. If $X_1 = 0$, $X_2 = x_2$ and $X_3 = x_3$, then

$$\begin{aligned}
m_1^R(x) &\ge \frac{|c_1|}{2} \int_{\mathbb{R}^2} \frac{e^{-\lambda_2} \lambda_2^{x_2}}{x_2!}\, \frac{e^{-\lambda_3} \lambda_3^{x_3}}{x_3!}\, \lambda_2^{c_1/2} \lambda_3^{(c_2 + 1)/2}\, d\beta \\
&= \frac{|c_1|}{2\, x_2!\, x_3!} \int_0^{\infty} e^{-\lambda_2} \lambda_2^{x_2 - 1 + 0.5 c_1}\, d\lambda_2 \int_0^{\infty} e^{-\lambda_3} \lambda_3^{x_3 - 1 + 0.5 c_2 + 0.5}\, d\lambda_3 = \infty,
\end{aligned}$$

provided that $x_2 \le -0.5 c_1$ or that $x_3 \le -0.5 - 0.5 c_2$. For example, if $c_1 = -5$ and a sample produces $x_2 = 2$, then $m_1^R(x) = \infty$. Note that here $a_1 = -5 a_2 + c_2 a_3$, with $a_2 = (1, 0)^T$ and $a_3 = (0, 1)^T$, so that the condition (4.11) does not hold.

4.2. An illustrative application

We apply the methodology recommended in Section 4.1 to a dataset involving the number of AIDS-related deaths in men. The data provide the number of deaths for 598 census tracts in a large city of Spain over a period of eight years. The dataset, which was supplied to us by Dr. M.A.M. Beneyto, has a large number of tracts with zero deaths (actually, 303, which is $k$ in our notation). Along with the number of deaths, the dataset also provides, for each census tract, the expected number of deaths $E$ from AIDS (adjusting for the population and the distribution of ages in each tract) and an auxiliary variable $W$ (continuous in nature) measuring the social status of each census tract. In our application and for the $i$th census tract, we take $\log(E_i)$ as the offset $a_{0i}$ and propose a log-linear regression for $\lambda_i$ with $q = 2$ and $a_i = (1, W_i)^T$.
First, we will ignore the covariate $W$ and compute the Bayes factor taking $q = 1$ and $a_i = 1$, based on the Jeffreys prior. This model modifies the common mean model of Section 2.2 by incorporating the offset variable in the mean, which is here given by $E_i \lambda$ with $\lambda = \exp(\beta_1)$. The marginal $m_1(x)$ is computed by one-dimensional numerical integration. Although it has a closed-form expression, it is rather complicated and omitted here to save space. This expression is given in the Appendix in [1]. For the specific data here, $B_{10} = 22{,}975$, which gives overwhelming evidence in favor of the ZIP model.

Epidemiologists who are knowledgeable about this study believed that the large number of zero counts in the data could be explained by the covariate measuring the social status and, indeed, suspected that a ZIP regression model would not be needed if the covariate were incorporated into the analysis. The Bayes factor in favor of the ZIP regression model versus the Poisson regression model (with $q = 2$) is given by 7.25. While this Bayes factor provides a moderate amount of evidence in favor of the ZIP regression model, it is much smaller than 22,975, indicating that, indeed, the covariate can explain most of the excess zero counts.

In this example, it is possible that the same inflation parameter $p$ may not be appropriate for all individuals. Just like using the log-linear models for $\lambda_i$, we can treat each $p_i$ differently (as $p$ may change according to the covariates) and fit a logistic regression model for $p_i$. But it is highly likely that there would be severe confounding between the two regressions, which is particularly problematical with objective Bayesian analysis (since there is not a proper subjective prior to overcome the confounding).

5. Analysis with insufficient positive counts

As noted in Section 2, the marginal density under model $M_1$ based on an improper prior for $\lambda$ is not finite when all counts are zeros, and hence the Bayes factor is not well-defined. This is not a difficulty of only model selection; in this situation, it is also not possible to make inferences about the parameters of the ZIP model, since the joint posterior of the parameters (under the ZIP model) is improper. Indeed, when all counts are zero, the ZIP model parameters are not identifiable, and the data do not provide enough information to estimate the parameters. Since objective Bayes methods are typically based on information from the data alone, it is not surprising that problems are encountered.

We could simply invoke this argument and refrain from considering the case when all counts are zero. However, it is interesting to explore several methodologies that have been proposed for difficult testing situations, partly to judge the success of the methodologies and partly to try to provide a reasonable answer to this case. We continue, throughout the section, to assume that $p \sim Un(0, 1)$.

5.1. All zero counts in the non-regression case

We mentioned that to resolve the identifiability issue in the ZIP model for the data with all zeros we need a proper prior on $\lambda$. This can be done by either subjectively specifying a proper prior for $\lambda$ or by “training” the improper priors into proper priors based on part of the data or of the likelihood. In particular, the intrinsic Bayes factor approach [5] utilizes a part of the data as a training sample to train the improper prior to get a proper posterior. Although this approach works successfully in many examples, it is not successful in the present problem.
Our investigation of this approach [1] is omitted here to save space. We discuss below the case where a subjective proper prior on $\lambda$ is specified based on certain considerations.

If a proper prior is needed to define the Bayes factor for the situation of all zero counts, the most direct approach is to find a proper prior that seems compatible with certain behaviors that we expect of the Bayes factor in this situation. A natural proper prior to consider for $\lambda$ is a Gamma ($Ga(a, b)$) prior, conjugate under the Poisson model ($M_0$), with density

$$g(\lambda \mid a, b) = \frac{b^{a} e^{-b\lambda} \lambda^{a-1}}{\Gamma(a)},$$

where $a, b$ are suitably chosen positive constants. Of course, one is welcome to simply make subjective choices here, but we will argue for a certain choice (or choices) based on rather neutral thinking.

First, we assume that the same gamma prior is appropriate for $\lambda$, both under the Poisson and the ZIP models. This can be justified by the orthogonalization argument used in Section 2.2. With the uniform density for $p$ and the $Ga(a, b)$ prior for $\lambda$, the resulting Bayes factor for arbitrary data $\mathbf{x}$ can be computed to be

$$B_{10}(\mathbf{x}) = \frac{k!}{(n+1)!} \sum_{j=0}^{k} \frac{(n-j)!}{(k-j)!} \left(1 - \frac{j}{n+b}\right)^{-(s+a)}, \tag{5.1}$$

by a similar argument to that leading to (2.9). This Bayes factor includes as a special case the objective Bayes factor in (2.9); indeed, the Jeffreys prior used there is the limiting case of $g(\lambda \mid a, b)$ for $a = 1/2$ and $b = 0$. Note that the Bayes factor (5.1) is increasing in $s$, $k$ and $a$, and decreasing in $b$.

For the special case $\mathbf{x} = \mathbf{0}$ (that is, $s = 0$ and $k = n$), note that $f_1(\mathbf{0} \mid \lambda, p) \geq f_0(\mathbf{0} \mid \lambda)$. Hence, using the same proper prior for $\lambda$ with both the Poisson and the ZIP models, it follows that $m_1(\mathbf{0}) \geq m_0(\mathbf{0})$, and hence $B_{10}(\mathbf{0}) \geq 1$.
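Formula (5.1) is straightforward to evaluate numerically. The following Python sketch shows one way to do so, under our reading of the notation: `n` is the number of observations, `k` the number of zero counts and `s` the sum of the counts. These meanings are assumed from the surrounding text, and the function name `log_bf10` and the data summary in the usage lines are illustrative, not taken from the paper. The sum is accumulated on the log scale because the factorials overflow quickly.

```python
import math


def log_bf10(n, k, s, a=0.5, b=0.0):
    """Log of the Bayes factor B_10 in (5.1), ZIP versus Poisson.

    Assumed meanings: n = sample size, k = number of zero counts,
    s = sum of the counts; p ~ Un(0, 1) and lambda ~ Ga(a, b).
    The defaults a = 1/2, b = 0 correspond to the Jeffreys-prior limit.
    """
    if n < 1 or not 0 <= k <= n:
        raise ValueError("need n >= 1 and 0 <= k <= n")
    if s < 0 or a <= 0 or b < 0:
        raise ValueError("need s >= 0, a > 0, b >= 0")
    if k == n and s != 0:
        raise ValueError("all counts zero implies s = 0")
    log_terms = []
    for j in range(k + 1):
        base = 1.0 - j / (n + b)
        if base <= 0.0:
            # Occurs only when b = 0 and k = n (all zeros): the marginal
            # under the ZIP model is then not finite.
            return math.inf
        log_terms.append(
            math.lgamma(n - j + 1) - math.lgamma(k - j + 1)
            - (s + a) * math.log(base)
        )
    # log-sum-exp for numerical stability
    top = max(log_terms)
    log_sum = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return math.lgamma(k + 1) - math.lgamma(n + 2) + log_sum


if __name__ == "__main__":
    # Hypothetical data summary: 100 observations, 40 zeros, total count 150.
    print(log_bf10(n=100, k=40, s=150))            # Jeffreys-prior limit
    print(log_bf10(n=100, k=40, s=150, a=1, b=1))  # Ga(1, 1) prior
```

Returning `math.inf` in the all-zero case with `b = 0` is a design choice of this sketch; it flags the situation in which the Bayes factor is not well-defined rather than reporting a number.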
In particular, for the $Un(0, 1)$ prior for $p$ and $Ga(a, b)$ prior for $\lambda$, it can be checked that

$$B_{10}(\mathbf{0}) = \frac{(n+b)^{a}}{n+1} \sum_{j=0}^{n} \frac{1}{(j+b)^{a}} \geq 1. \tag{5.2}$$

This is reasonable: when a long stream of only zeros is observed, it is entirely natural to say that the data favor the ZIP model. But the degree of favoritism depends on $a$ and $b$, and we turn to rather speculative desiderata to narrow the choice.

Recall that the mean of the $Ga(a, b)$ distribution for $\lambda$ is $ab^{-1}$ and the variance is $ab^{-2}$. In order for the prior not to be too sharp, it is reasonable to require the prior standard deviation to be no less than the prior mean. This implies that $a \leq 1$. It also seems reasonable to require the prior mean to be at least 1, so that small values of $\lambda$ do not have excessive prior probability. This leads to $b \leq a$. Since the Bayes factor is decreasing in $b$, the smallest Bayes factor satisfying the above constraints (that is, the one lending the most support to the Poisson model $M_0$) is then obtained by taking $b = a$ (this gives a prior mean of 1). It is not unreasonable to select this prior, as it belongs to a reasonable class and is the member of that class most favorable to the null model.

Finally, one might judge it to be unappealing to utilize a prior for $\lambda$ which is not bounded near zero (for $a < 1$ the gamma density is decreasing with an asymptote at $\lambda = 0$), which implies that $a$ should be at least 1. Thus we end up with the choice $a = b = 1$. Note that $a = 1$ is the upper limit of $a \leq 1$, and the choice $a = 1$ now counterbalances the Bayes factor in favor of $M_1$ (whereas $b = a$ in the range $b \leq a$ tilts the Bayes factor in favor of $M_0$). This reasoning is all rather speculative and, of course, the result is a particular prior, which may not reflect actual prior beliefs.
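As a quick numerical check on (5.2) and on the constraints just described, the sketch below evaluates $B_{10}(\mathbf{0})$ for a few $(a, b)$ pairs satisfying $a \leq 1$ and $b \leq a$, together with the prior mean $a/b$ and standard deviation $\sqrt{a}/b$. The function names and the sample size `n = 20` are illustrative choices, not part of the paper.

```python
import math


def bf10_all_zeros(n, a=1.0, b=1.0):
    """B_10(0) from (5.2): n zero counts, p ~ Un(0, 1), lambda ~ Ga(a, b)."""
    if n < 1 or a <= 0 or b <= 0:
        raise ValueError("need n >= 1, a > 0, b > 0")
    total = sum((j + b) ** (-a) for j in range(n + 1))
    return (n + b) ** a / (n + 1) * total


def gamma_mean_sd(a, b):
    """Mean a/b and standard deviation sqrt(a)/b of the Ga(a, b) prior."""
    return a / b, math.sqrt(a) / b


if __name__ == "__main__":
    n = 20  # illustrative sample size
    for a, b in [(0.5, 0.25), (0.5, 0.5), (1.0, 0.5), (1.0, 1.0)]:
        mean, sd = gamma_mean_sd(a, b)
        print(f"a={a}, b={b}: mean={mean:.2f}, sd={sd:.2f}, "
              f"B10(0)={bf10_all_zeros(n, a, b):.3f}")
```

For fixed $a$, the printed Bayes factors decrease as $b$ increases toward $a$, in line with the monotonicity noted for (5.1).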
Nevertheless it is instructive to study the behavior of the Bayes factor when this prior is used. For $a = b = 1$, that is, the Exponential(1) distribution, it can be checked that $B_{10}(\mathbf{0}) = \sum_{j=0}^{n} (j+1)^{-1}$, which is thus our recommended default Bayes factor when observing only zero counts. Note that $B_{10}(\mathbf{0}) \approx \log(n+1)$ for large $n$. So a large string of all zero counts in a sample will lead to a Bayes factor approaching infinity at the slow rate of $\log(n)$. The large sample behavior of the Bayes factor for this type of sample seems intuitively reasonable.

5.2. Insufficient positive counts in the regression case

In the regression situation of Section 4, it was necessary to have sufficient positive counts so that the conditions of Theorem 4.2 were satisfied. We will restrict discussion here to the situation involving the prior specifications in (4.6), for which the key condition needed for the marginal to be finite was that the matrix $A_+$ (of size $(n-k) \times q$) should be of rank $q$. If the number of positive counts $n - k$ is insufficient, so that $t$, the rank of $A_+$, is less than $q$, this solution will not work.

Remark 5.1. Indeed, neither the prior for $\beta$ given by (4.3) nor that given by (4.5) guarantees a finite positive marginal density. We omit the proof to save space. A proof may be found in the Appendix of [1].

We call this situation one of rank deficiency, with the rank deficiency of $A_+$ equal to $q - t$. The situation is analogous to the case of all zero counts without covariates discussed in Subsection 5.1. (In the setup of that section, $q = 1$, and rank of $A_+$ less than 1 means that $k = n$, i.e., no positive counts.) We could again merely recognize that this type of data is just not informative enough to allow for objective Bayes analysis. We shall, however, propose a prior that yields finite marginal densities, following similar reasoning to that used in Section 5.1.
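Whether a given data set falls into this rank-deficient situation can be checked directly from the covariate rows of the positive counts. The following is a minimal sketch in plain Python; the helper names (`matrix_rank`, `rank_deficiency`) and the toy design are illustrative assumptions, and a numerical tolerance stands in for exact rank arithmetic.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) by Gaussian elimination with pivoting."""
    m = [[float(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # The largest remaining entry in this column serves as the pivot.
        pivot = max(range(rank, len(m)),
                    key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            for c in range(col, n_cols):
                m[i][c] -= factor * m[rank][c]
        rank += 1
    return rank


def rank_deficiency(design_rows, counts):
    """Return (t, q - t), where t is the rank of A_+ (rows with positive counts)."""
    q = len(design_rows[0])
    a_plus = [row for row, x in zip(design_rows, counts) if x > 0]
    t = matrix_rank(a_plus)
    return t, q - t


if __name__ == "__main__":
    # Toy design: intercept plus one binary covariate (q = 2).
    A = [[1, 0], [1, 0], [1, 1], [1, 1], [1, 0]]
    x = [0, 0, 3, 1, 0]  # positive counts occur only where the covariate is 1
    print(rank_deficiency(A, x))  # (1, 1)
```

In the toy data both positive counts share the same covariate row, so $t = 1 < q = 2$ and the rank deficiency is 1.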
We continue to use a $Un(0, 1)$ prior for $p$ and focus on proposing suitable priors for $\beta$. A discussion similar to that in Subsection 5.1 shows that this prior has to be at least partially proper.

Note that, instead of specifying a prior on $\beta$, we can specify a prior on $q$ independent parametric functions of $\beta$; our specific proposal is to carefully choose these functions such that $t$ of them are well identified by the data with positive counts while the remaining $q - t$ are not. We then propose to use a version of the Jeffreys prior on the former $t$ functions, and a proper prior on the latter $q - t$ functions.

Specifically, let $A_0$ denote the $k \times q$ matrix whose $k$ rows are $a_1^T, \ldots, a_k^T$. Since $A$ has rank $q$ and $A_+$ has rank $t$, the rank of $A_0$ is at least $q - t$. Let $V_+ \subseteq R^q$ denote the vector space of dimension $t$ formed by the columns of $A_+^T$, and let $a_{j_1}, \ldots, a_{j_t}$ be $t$ linearly independent vectors among $a_{k+1}, \ldots, a_n$. Suppose $a_{i_1}, \ldots, a_{i_r}$ are all of the vectors from $a_1, \ldots, a_k$, corresponding to the zero counts, which are in $V_+$. Note that $0 \leq r \leq k - (q - t)$. These vectors are linear combinations of the vectors $a_{j_1}, \ldots, a_{j_t}$, and the corresponding $\lambda_{i_1}, \ldots, \lambda_{i_r}$ are functions of $\lambda_{j_1}, \ldots, \lambda_{j_t}$. From the set $\{\lambda_j : j \in \{1, \ldots, k\} - \{i_1, \ldots, i_r\}\}$ we select $q - t$ of the $\lambda$'s, say $\lambda_{l_1}, \ldots, \lambda_{l_{q-t}}$, such that $\{a_{j_1}, \ldots, a_{j_t}, a_{l_1}, \ldots, a_{l_{q-t}}\}$ is linearly independent.

Note that there is an $(n - k) \times t$ matrix $C$ of rank $t$ such that $(a_{k+1}, \ldots, a_n) = (a_{j_1}, \ldots, a_{j_t}) C^T$. Let $D \equiv \mathrm{Diag}(\lambda_{j_1}, \ldots, \lambda_{j_t})$. Then the information matrix for $\lambda_{j_1}, \ldots, \lambda_{j_t}$ based on the Poisson model for the observations $k+1, \ldots, n$ is given by

$$I(\lambda_{j_1}, \ldots, \lambda_{j_t}) = D^{-1} C^T \mathrm{Diag}(\lambda_{k+1}, \ldots, \lambda_n)\, C D^{-1}. \tag{5.3}$$

We define a partial Jeffreys prior for $\lambda_{j_1}, \ldots, \lambda_{j_t}$ by

$$\pi^{PJ}(\lambda_{j_1}, \ldots, \lambda_{j_t}) = \Big\{ \prod_{i=1}^{t} \lambda_{j_i}^{-1} \Big\} \, \big| C^T \mathrm{Diag}(\lambda_{k+1}, \ldots, \lambda_n)\, C \big|^{1/2}. \tag{5.4}$$

Let $\{b_1, \ldots, b_{q-t}\}$ denote an orthonormal basis of the space spanned by $a_{l_1}, \ldots, a_{l_{q-t}}$. Define $\xi_w = e^{b_w^T \beta}$, $w = 1, \ldots, q - t$. Note that $\lambda_{l_w}$, $w = 1, \ldots, q - t$, can be expressed in terms of $\xi_1, \ldots, \xi_{q-t}$. Indeed,

$$\log(\lambda_{l_w}) = a_{0 l_w} + \sum_{h=1}^{q-t} d_{wh} \log(\xi_h), \qquad w = 1, \ldots, q - t,$$

where $d_{wh} = b_h^T a_{l_w}$. Finally, we assign independent exponential distributions with mean 1 to each of $\xi_1, \ldots, \xi_{q-t}$. This prior will induce a proper distribution on $\lambda_{l_w}$, $w = 1, \ldots, q - t$, with a density which we denote by $\pi^{prop}(\lambda_{l_1}, \ldots, \lambda_{l_{q-t}})$. The final prior used to calculate the marginal density under model $M_1^R$ is then given by

$$\pi(\lambda_{j_1}, \ldots, \lambda_{j_t}, \lambda_{l_1}, \ldots, \lambda_{l_{q-t}}) = \pi^{PJ}(\lambda_{j_1}, \ldots, \lambda_{j_t}) \, \pi^{prop}(\lambda_{l_1}, \ldots, \lambda_{l_{q-t}});$$

this is partially a Jeffreys prior and partially proper. The corresponding prior density on $\beta$ is, of course, obtained through transformation. Further, along the lines of the proof of Theorem 4.2, it can be checked that the marginal density $m_1^R(\mathbf{x})$ will be finite. We omit the details to save space.

While there is arbitrariness in the specific choice of $\lambda_{l_1}, \ldots, \lambda_{l_{q-t}}$ to which a subjective prior distribution based on exponential distributions is assigned, the partial Jeffreys prior in (5.4) remains invariant to the choice of $t$ independent $\lambda$'s from $\lambda_{k+1}, \ldots, \lambda_n$. This solution thus seems reasonable for small $q - t$. To avoid the arbitrariness, we could consider all possible selections of $q - t$ of the $\lambda$'s from $\lambda_1, \ldots, \lambda_k$ such that these $q - t$, together with $t$ of the $\lambda$'s from $\lambda_{k+1}, \ldots, \lambda_n$, define a reparameterization of $\beta$. For each selection we can calculate the Bayes factor and, in the spirit of the intrinsic Bayes factor (IBF), we can take a suitable average over all these Bayes factors.
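To see how the proper component $\pi^{prop}$ of this prior behaves, one can simulate from it: draw the $\xi_h$ as independent exponential variables with mean 1 and map them to the $\lambda_{l_w}$ through the log-linear relation given above. The sketch below does this, using a Gram-Schmidt step to obtain the orthonormal basis. The function names, the zero offsets and the example vectors are hypothetical, and the vectors $a_{l_w}$ are assumed to be linearly independent.

```python
import math
import random


def orthonormal_basis(vectors):
    """Gram-Schmidt: orthonormal basis b_1, ..., b_m of span(vectors).

    Assumes the input vectors are linearly independent.
    """
    basis = []
    for v in vectors:
        w = [float(c) for c in v]
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm < 1e-12:
            raise ValueError("vectors are not linearly independent")
        basis.append([wi / norm for wi in w])
    return basis


def coefficient_matrix(vectors):
    """d[w][h] = b_h^T a_{l_w}, the exponents linking lambda_{l_w} to xi_h."""
    basis = orthonormal_basis(vectors)
    return [[sum(bi * ai for bi, ai in zip(b, a)) for b in basis]
            for a in vectors]


def sample_lambdas(vectors, offsets, rng):
    """One draw of (lambda_{l_1}, ..., lambda_{l_{q-t}}) from the induced prior."""
    d = coefficient_matrix(vectors)
    # xi_h are independent exponentials with mean 1; work with log(xi_h).
    log_xi = [math.log(rng.expovariate(1.0)) for _ in vectors]
    return [
        math.exp(a0 + sum(dwh * lx for dwh, lx in zip(row, log_xi)))
        for a0, row in zip(offsets, d)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical covariate vectors a_{l_1}, a_{l_2} (so q - t = 2).
    a_l = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
    print(coefficient_matrix(a_l))
    print(sample_lambdas(a_l, offsets=[0.0, 0.0], rng=rng))
```

Because the basis is built sequentially, the matrix of $d_{wh}$ values is lower triangular in this sketch; a different orthonormal basis of the same span would give different $d_{wh}$, which reflects the arbitrariness discussed above.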
If the rank deficiency of $A_+$ is 1, we will have $k - r$ Bayes factors to average.

Appendix

Proof of Theorem 4.1. From (4.3) and (4.5) it is immediate that $\pi_1^R(\beta) \leq \pi_0^R(\beta)$. Thus it is enough to prove (4.7) for $j = 0$. Let $\mathbf{i}$ denote the indices $(i_1, \ldots, i_q)$ and $A(\mathbf{i})$ denote a $q \times q$ submatrix of $A$ based on rows $i_1, \ldots, i_q$. Then by the Binet–Cauchy expansion of the determinant (cf. Noble [19], p. 226) it can be shown that

$$\Big| \sum_{i=1}^{n} \lambda_i a_i a_i^T \Big| = \sum (\lambda_{i_1} \cdots \lambda_{i_q}) \, |A(\mathbf{i}) A(\mathbf{i})^T|, \tag{A1}$$

where the summation is over all submatrices of order $q \times q$. Dropping the terms from the above summation for which $|A(\mathbf{i}) A(\mathbf{i})^T| = 0$, we get from (4.3) that

$$\pi_0^R(\beta) \leq {\sum}^{*} (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} \, |A(\mathbf{i}) A(\mathbf{i})^T|^{1/2}, \tag{A2}$$

where $\sum^{*}$ denotes summation over all $q \times q$ matrices for which $|A(\mathbf{i}) A(\mathbf{i})^T| > 0$. Since $e^{-\lambda_i} \lambda_i^{x_i} / x_i! < 1$, from (4.7) and (A2) we get

$$m_0^R(\mathbf{x}) \leq {\sum}^{*} \int_{R^q} \prod_{j=1}^{q} \Big\{ \frac{e^{-\lambda_{i_j}} \lambda_{i_j}^{x_{i_j}}}{x_{i_j}!} \Big\} (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} \, |A(\mathbf{i}) A(\mathbf{i})^T|^{1/2} \, d\beta. \tag{A3}$$

Recall that $\log(\lambda_i) = a_{0i} + a_i^T \beta$. Now transforming $\beta$ to $(\lambda_{i_1}, \ldots, \lambda_{i_q})$ and using the Jacobian of transformation $(\lambda_{i_1} \cdots \lambda_{i_q})^{-1} |A(\mathbf{i}) A(\mathbf{i})^T|^{-1/2}$, we get from (A3) that

$$m_0^R(\mathbf{x}) \leq {\sum}^{*} \prod_{j=1}^{q} \int_0^{\infty} \frac{e^{-\lambda_{i_j}} \lambda_{i_j}^{x_{i_j} - .5}}{x_{i_j}!} \, d\lambda_{i_j} < \infty, \tag{A4}$$

since each of the integrals on the right hand side of (A4) is finite. This completes the proof of Theorem 4.1.

Proof of Theorem 4.2. First, as in (A1) and (A2), it can be shown that for some positive $c$ (not depending on parameters) less than 1,

$$c \, {\sum}^{*} (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} |A(\mathbf{i}) A(\mathbf{i})^T|^{1/2} \leq \pi_0^R(\beta) \leq {\sum}^{*} (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} |A(\mathbf{i}) A(\mathbf{i})^T|^{1/2}. \tag{A5}$$

In view of this inequality and (4.10), the marginal $m_1^R(\mathbf{x})$ is finite if and only if

$$\int_{R^q} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \, (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} \, |A(\mathbf{i}) A(\mathbf{i})^T|^{1/2} \, d\beta < \infty \tag{A6}$$

for each $\mathbf{i} = (i_1, \ldots, i_q)$ for which $|A(\mathbf{i}) A(\mathbf{i})^T| > 0$. Note that the sufficient condition stated in the theorem and the condition that the rank of $A$ is $q$ imply that the regression matrix $A_+^T = (a_{k+1}, \ldots, a_n)$ corresponding to the set of positive counts has rank $q$. Suppose, with no loss of generality, $i_1 < \cdots < i_q$ in (A6). Also, suppose $i_1 < \cdots < i_u \leq k < i_{u+1} < \cdots < i_q$. It is possible that $u$ may be 0 or may be $q$.

By the assumed condition that for $j = 1, \ldots, k$, $a_j$ can be expressed as a linear combination of $a_{k+1}, \ldots, a_n$ with nonnegative coefficients, it follows that

$$\lambda_{i_j} = h_{i_j} \prod_{m=k+1}^{n} \lambda_m^{c_{m i_j}}, \qquad j = 1, \ldots, u,$$

where $c_{m i_j} \geq 0$ and $h_{i_j} > 0$. Then

$$\prod_{j=1}^{u} \lambda_{i_j} = f \prod_{m=k+1}^{n} \lambda_m^{b_m},$$

where $b_m = \sum_{j=1}^{u} c_{m i_j} \geq 0$ and $f > 0$ are free from parameters. Then the integrand (without $|A(\mathbf{i}) A(\mathbf{i})^T|^{1/2}$) in (A6) can be simplified as

$$\begin{aligned}
\prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \, (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2}
&= f^{1/2} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i + \frac{1}{2} b_i}}{x_i!} \, (\lambda_{i_{u+1}} \cdots \lambda_{i_q})^{1/2} \\
&= f^{1/2} \Big[ \prod_{j=u+1}^{q} \frac{e^{-\lambda_{i_j}} \lambda_{i_j}^{x_{i_j} + \frac{1}{2} b_{i_j} + \frac{1}{2}}}{x_{i_j}!} \Big] \Big[ \prod_{l=1}^{n+u-k-q} \frac{e^{-\lambda_{\alpha_l}} \lambda_{\alpha_l}^{x_{\alpha_l} + \frac{1}{2} b_{\alpha_l}}}{x_{\alpha_l}!} \Big],
\end{aligned} \tag{A7}$$

where $\{\alpha_1, \ldots, \alpha_{n+u-k-q}\} = \{k+1, \ldots, n\} - \{i_{u+1}, \ldots, i_q\}$. Suppose $\{s_1, \ldots, s_q\} \subset \{k+1, \ldots, n\}$ is such that $\{a_{s_1}, \ldots, a_{s_q}\}$ is a linearly independent set (such a set exists since $A_+$ is of rank $q$). Note that for $y > 0$ the function $g(u) = e^{-u} u^{y}$ is maximized at $u = y$, implying

$$e^{-u} u^{y} \leq e^{-y} y^{y} \quad \text{for all } u > 0. \tag{A8}$$

By (A8) we get from (A7) that

$$\prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \, (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} \leq D \Big( \prod_{j=1}^{q} e^{-\lambda_{s_j}} \lambda_{s_j}^{d_{s_j}} \Big), \tag{A9}$$

where $D > 0$ is a constant independent of the parameters, $d_{s_j} = x_{s_j} + \frac{1}{2} b_{s_j} + \frac{1}{2}$ if $s_j \in \{i_{u+1}, \ldots, i_q\}$, and $d_{s_j} = x_{s_j} + \frac{1}{2} b_{s_j}$ if $s_j \in \{\alpha_1, \ldots, \alpha_{n+u-k-q}\}$. The Jacobian of transformation from $\beta$ to $\lambda_{s_1}, \ldots, \lambda_{s_q}$ is $H / (\lambda_{s_1} \cdots \lambda_{s_q})$ for some constant $H > 0$. Then, since $d_{s_j} \geq 1$ for $j = 1, \ldots, q$, by (A9) we have

$$\int_{R^q} \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \, (\lambda_{i_1} \cdots \lambda_{i_q})^{1/2} \, d\beta \leq H D \prod_{j=1}^{q} \int_0^{\infty} e^{-\lambda_{s_j}} \lambda_{s_j}^{d_{s_j} - 1} \, d\lambda_{s_j} < \infty. \tag{A10}$$

By (A10) and (A6) we conclude that $m_1^R(\mathbf{x})$ corresponding to $\pi_0^R(\beta)$ is finite. To prove finiteness of $m_1^R(\mathbf{x})$ corresponding to $\pi_1^R(\beta)$, note that by (4.10)

$$m_1^R(\mathbf{x}) \leq \int_{R^q} \Big( \prod_{i=k+1}^{n} \frac{e^{-\lambda_i} \lambda_i^{x_i}}{x_i!} \Big) \pi_1^R(\beta) \, d\beta.$$

Finiteness of the right hand quantity in the last display follows from a version of Theorem 4.1 corresponding to the prior $\pi_0^R(\beta)$, obtained by replacing $n$ observations from the Poisson by $n - k$ observations from the Poisson. This completes the proof.

Acknowledgments. The authors would like to thank the Community of Valencia Group in the Project Desigualdades socioeconómicas y medioambientales en ciudades en España, Proyecto MEDEA, for the data used in Section 4, a referee for valuable comments, and Archan Bhattacharya for computing help. Part of this research was conducted while Bayarri and Datta were visiting SAMSI/Duke University, whose support is gratefully acknowledged.

References

[1] Bayarri, M. J., Berger, J. O. and Datta, G. S. (2007). Objective Bayes testing of Poisson versus inflated Poisson models. Technical report, Dept. Statistics, Univ. Georgia, Athens, GA 30602, USA.
[2] Bayarri, M. J. and García-Donato, G. (2007). Extending conventional priors for testing general hypotheses in linear models. Biometrika 94 135–152. MR2367828
[3] Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd ed. Springer, New York. MR0804611
[4] Berger, J. (2006). The case for objective Bayesian analysis. Bayesian Analysis 1 385–402. MR2221271
[5] Berger, J. O. and Pericchi, L. R. (1996). The intrinsic Bayes factor for model selection and prediction. J. Amer. Statist. Assoc. 91 109–122. MR1394065
[6] Berger, J. O. and Pericchi, L. R. (2001). Objective Bayesian methods for model selection: introduction and comparison (with discussion). In Model Selection. IMS Lecture Notes – Monograph Series 38 (P. Lahiri, ed.) 135–207. IMS, Beachwood, OH. MR2000753
[7] Berger, J., Pericchi, L. and Varshavsky, J. (1998). Bayes factors and marginal distributions in invariant situations. Sankhyā A 60 307–321. MR1718789
[8] Berger, J. and Sun, D. (2008). Objective priors for a bivariate normal model with multivariate generalizations. Ann. Statist. To appear.
[9] Bhattacharya, A., Clarke, B. S. and Datta, G. S. (2007). A Bayesian test for excess zeros in a zero-inflated power series distribution. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honour of Professor Pranab K. Sen. IMS Lecture Notes and Monographs Series 1 (N. Balakrishnan, E. Peña and M. Silvapulle, eds.) 89–104. IMS, Beachwood, OH.
[10] Broek, J. V. D. (1995). A score test for zero inflation on a Poisson distribution. Biometrics 51 738–743. MR1349912
[11] Conigliani, C., Castro, J. I. and O'Hagan, A. (2000). Bayesian assessment of goodness of fit against nonparametric alternatives. Canad. J. Statist. 28 327–342. MR1792053
[12] Deng, D. and Paul, S. R. (2000). Score test for zero inflation in generalized linear models. Canad. J. Statist. 28 563–570. MR1793111
[13] Ghosh, J. K. and Samanta, T. (2002). Nonsubjective Bayes testing – an overview. J. Statist. Plann. Inference 103 205–223. MR1896993
[14] Ghosh, S. K., Mukhopadhyay, P. and Lu, J. C. (2006). Bayesian analysis of zero-inflated regression models. J. Statist. Plann. Inference 136 1360–1375. MR2253768
[15] Jeffreys, H. (1961). Theory of Probability, 3rd ed. Oxford Univ. Press. MR0187257
[16] Kass, R. E. and Vaidyanathan, S. (1992). Approximate Bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. J. Roy. Statist. Soc. Ser. B 54 129–144. MR1157716
[17] Kass, R. E. and Wasserman, L. (1996). The selection of prior distributions by formal rules. J. Amer. Statist. Assoc. 91 1343–1370. MR1478684
[18] Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34 1–14.
[19] Noble, B. (1969). Applied Linear Algebra. Prentice-Hall, New York. MR0246884
[20] O'Hagan, A. (1995). Fractional Bayes factors for model comparisons. J. Roy. Statist. Soc. Ser. B 57 99–138. MR1325379
[21] Pérez, J. M. and Berger, J. (2001). Analysis of mixture models using expected posterior priors, with application to classification of gamma ray bursts. In Bayesian Methods, with Applications to Science, Policy and Official Statistics (E. George and P. Nanopoulos, eds.) 401–410. Official Publications of the European Communities, Luxembourg.
