Beta-binomial/gamma-Poisson regression models for repeated counts with random parameters

Beta-binomial/gamma-P oisson regression mo dels for rep eated coun ts with random parameters Ma yra Iv anoﬀ Lora Esc ola de Ec onomia d e S˜ ao Paulo F unda¸ c˜ ao Getu lio V ar gas, S˜ ao Paulo, Br a z il Julio M Singer Dep artamen to de Estat ´ ıstic a Universidade de S˜ ao Paulo, Br azi l Abstract Beta-binomial/ Poisso n mo dels hav e b een used by many authors to mo del multiv ari- ate coun t data. Lora and Singer (Statistics in Medicine, 2008 ) extended such mo dels to accommo date rep eated m ultiv aria te count data with ov erdip ersion in the binomial comp onen t. T o overcome some o f the limitations o f that mo del, we co ns ider a b eta- binomial/ga mma -P oisson alternative that a lso allows for b oth overdisper sion and diﬀer- ent co v ar ia nces be t ween the Poisson coun ts . W e obtain maxim um lik eliho od estimates for the parameters using a Newton-Raphson alg orithm and compar e both mo dels in a practical example. Key wor ds: biv ariate coun ts, longitudin al data, o v erdisp ersion, random eﬀects, regres- sion mo d els 1 In tro d uction Beta-binomial mo dels ha ve b een used b y man y authors to mo d el binomial coun t data with diﬀeren t probabilities of success among units fr om the same group of study . Williams (1975 ) used such distrib utions to compare the n umber of f et al abn ormalit ies of pr eg nant rat females on a c hemical diet during pregnancy to a con trol group , b oth with ﬁxed litter size. Gange et al. (1996) analyzed the qualit y of health serv ices (classiﬁed as appropr ia te or not) d uring patient sta y in a h ospita l using a similar app roac h. T o analyze mortalit y 1 data in mouse litters with a ﬁ xed n umber of implan ted fetuses, Bro oks et al. (1997) used suc h mo dels not only to allo w for d iﬀeren t p robabilitie s of s ucce ss among u nits from the same group of study , but also to consider o v erdisp ersion among them. Giv en that in man y studies, the num b er of trials ma y not b e ﬁ xed, C omulada and W ei ss (200 7) considered a m ultiv ariate P oisson distribution to mo del th e num b er of successes and failures in a random n umber of attempts, illustrating th eir prop osal with data from a HIV transmission study . Multiv ariate P oisson distr ib ution ha ve also b een u sed to mo del correlated coun t data, as in Karlis and Ntzo ufras (2003 ) who used su c h distribu tio n to m odel the n u m b er of goals of t w o comp eting teams. In a study where the num b er of su cce sses in a rand om num b er of trials w as observ ed rep eatedly , and th erefore are p ossibly correlated, Lora and Singer (2008) consider m u lti- v ariate b eta-binomial/P oisson mo dels. In their pr oposal, the b eta-binomial comp onen t also accoun ts for o v erdisp ersion across units with the same leve ls of cov ariate s. The multiv ari- ate Poisson comp onen t accommodates b oth the random num b er of trials and the r epeated measures nature of the data. The eﬀect of p ossible cov ariate s is tak en in to account via the regression approac h suggested by Ho an d S in ge r (1997, 2001). Their mo del, ho we ve r, requires a constan t co v ariance term b etw een the rep eated num b er of trials and d oes n ot allo w for ov erdisp ersion in these counts. Sin ce , as suggested by Cox (1983), the precision of parameter estimates ma y b e seriously aﬀected w hen o v erdisp ersion is not accoun ted for in the mo dels considered for analysis, we prop ose a b eta-binomial/gamma-P oisson mo del that not only incorp orate s suc h c haracteristics but is also easier to implement computa- tionally . The mo del, along with maxim um lik eliho od methods f or estimation and testing purp oses are pr esented in Section 2. An illustr at ion us ing data previously an alyzed b y Lora and Singer (2008) is p resen ted in Section 3. A brief discussion and suggestions for fu ture researc h are outlined in Section 4. 2 The b eta-binomial /gamma-P oisson mo del for rep eated mea- suremen ts W e denote the v ector of resp onses for the g -th sample unit ( g = 1 , . . . , M ) by Y g = ( X g 1 , N g 1 , ..., X g p , N g p ) ′ 2 with X g h corresp onding to the n umber of successes in N g h trials p erformed under the h -th ( h = 1 , . . . , p ) observ ation condition. W e assume that for all g and h , X g h | N g h , π g h follo w indep enden t bin omial( N g h , π g h ) d istributions (1) π g h follo w indep enden t Beta( µ ( z µgh ) /θ ( z θ g h ) , [1 − µ ( z µgh )] /θ ( z θ g h )) distributions (2) N g h | τ g follo w indep enden t Poi sson ( λ ( z λg h ) τ g ) distribu tions (3) τ g follo w indep enden t gamma( α ( z αg ) /δ ( z δg ) , 1 /δ ( z δg )) distributions (4) where z µgh , z θ g h , z λg h , z αg and z δg are v ectors of ﬁxed co v ariates. According to (1) and (2), the su cc ess probabilities ma y be diﬀeren t across units, but they are generated by b eta distribu tio ns that may dep end on co v ariates. In (3) and (4), w e follo w Nelson (1985) to sp ecify that the num b ers of trials may also b e diﬀerent across units, but are generated b y gamma distributions that ma y also dep end on co v ariates. The parametrizations (0 < µ < 1, θ > 0) adopted in (2) and ( α > 0, δ > 0) adopted in (4) are used to facilitate maximum lik eliho o d estimation, as suggested b y Gange et al. (1996); their relation to the u sual b eta( a, b ) p aramet rization, as in Johnson and K ot z (1970 ), and the usual gamma( c, d ) p arametrization, as in Moo d et al. (1974), is giv en by µ = a a + b , θ = 1 a + b , α = c d and δ = 1 d . The ﬁrst and second order cen tral momen ts of τ g in (4) are E ( τ g ) = α ( z αg ) (5) V ar ( τ g ) = α ( z αg ) δ ( z δg ) (6) F r om (3) and (4), the ﬁr st and second order cen tral momen ts of the n umber of trials are E ( N g h ) = λ ( z λg h ) α ( z αg ) (7) V ar ( N g h ) = λ ( z λg h ) α ( z αg ) { 1 + λ ( z λg h ) δ ( z δg ) } (8) C ov ( N g h , N g h ′ ) = λ ( z λg h ) λ ( z λg h ′ ) α ( z αg ) δ ( z δg ) (9) for all g , h, h ′ , h 6 = h ′ . Similarly , the ﬁ rst and second ord er cen tral moments of π g h in (2) are E ( π g h ) = µ ( z µgh ) (10) V ar ( π g h ) = µ ( z µgh )[1 − µ ( z µgh )] θ ( z θ g h )[1 + θ ( z θ g h )] − 1 (11) 3 Also, from (1) and (2), we ma y conclud e that, for all g and h , X g h | N g h ∼ b eta − binomial[ N g h , µ ( z µgh ) , θ ( z θ g h )] with E ( X g h ) = µ ( z µgh ) λ ( z λg h ) α ( z αg ) (12) V ar ( X g h ) = µ ( z µgh )[1 − µ ( z µgh )] θ ( z θ g h ) 1 + θ ( z θ g h ) λ 2 ( z λg h ) α ( z αg )[ α ( z αg ) + δ ( z δg )] + µ ( z µgh ) λ ( z λg h ) α ( z αg )[1 + µ ( z µgh ) λ ( z λg h ) δ ( z δg )] (13) C ov ( X g h , X g h ′ ) = µ ( z µgh ) µ ( z µgh ′ ) λ ( z λg h ) λ ( z λg h ′ ) α ( z αg ) δ ( z δg ) (14) for all g , h, h ′ , h 6 = h ′ . The co v ariance b etw een the num b ers of successes and trials is C ov ( X g h , N g h ) = µ ( z µgh ) λ ( z λg h ) α ( z αg ) { 1 + λ ( z λg h ) δ ( z δg ) } . (15) The paramete rs θ ( z θ g h ) go ve rn b oth the v ariabilit y of the succe ss probabilities and the o v erdisp ersion of the num b er of successes, that may also dep end on the parameter δ ( z δg ). When θ ( z θ g h ) and δ ( z δg ) are equal to zero, there is no o v erdisp ersion for the n umb er of successes. Th e p arame ters δ ( z δg ) are also related to the v ariabilit y and o v erdisp ersion of the num b er of trials and to the co v ariance b et ween the num b ers of trials and num b ers of successes. When δ ( z δg ) = 0, the rep eate d counts are ind epend en t. T o inv estigate the eﬀect s of co v ariates, we adopt log-linear mo dels of the form µ ( z µgh ) = exp( z ′ µgh β µ ) 1 + exp( z ′ µgh β µ ) (16) θ ( z θ g h ) = exp( z ′ θ g h β θ ) (17) λ ( z λg h ) = exp( z ′ λg h β λ ) (18) α ( z αg ) = exp( z ′ αg β α ) (19) δ ( z δg ) = exp( z ′ δg β δ ) (20) where β µ , β θ , β λ , β α and β δ are v ectors of parameters to b e estimated. F r om (1), (2), ( 3) and (4) it follo ws that the joint probabilit y mass function for the n umber of trials and successes for the g -th u nit is P ( X g 1 , N g 1 , ..., X g p , N g p ) = p Y h =1 P ( X g h | N g h ) P ( N g 1 , ..., N g p ) = p Y h =1 P ( X g h | N g h ) Z ∞ 0 p Y h =1 P ( N g h | τ g ) f ( τ g ) dτ g ! 4 with f denoting the density of (4). Since the logarithm of the like liho o d is given b y log L ( β µ , β θ , β λ , β α , β δ ) = = M X g =1 p X h =1 log P ( X g h | N g h , β µ , β θ ) + M X g =1 log P ( N g 1 , ..., N g p | β λ , β α , β δ ) , the p arame ters of the b eta-binomial distribu tio n ( β µ , β θ ) can b e estimated separately fr om those of the gamma-P oisson d istribution ( β λ , β α , β δ ). The b eta-binomial p robabilit y mass fu nction can b e written as P ( X g h = x g h | N g h = n g h , β µ , β θ ) = n g h x g h ! ( Γ  1 θ ( z θ g h )   Γ  1 θ ( z θ g h ) + n g h  − 1 ) × ( Γ  µ ( z µgh ) θ ( z θ g h ) + x g h   Γ  µ ( z µgh ) θ ( z θ g h )  − 1 ) × ( Γ  1 − µ ( z µgh ) θ ( z θ g h ) + n g h − x g h   Γ  1 − µ ( z µgh ) θ ( z θ g h )  − 1 ) = n g h x g h ! n gh − 1 Y u =0 [1 + uθ ( z θ g h )] − 1 x gh − 1 Y v =0 [ µ ( z µgh ) + v θ ( z θ g h )] × n gh − x gh − 1 Y w =0 [1 − µ ( z µgh ) + w θ ( z θ g h )] (21) where Γ( r ) = R ∞ 0 t r − 1 e − t dt . The expressions inv olving ratios b et we en tw o gamma fun ct ions (presen ted with in brac k ets) mak e s ense when n g h 6 = 0 (in the ﬁrst r atio), x g h 6 = 0 (in the second ratio) and x g h 6 = n g h (in the third ratio). When th ese conditions are not satisﬁed, the ratios b et w een the gamma fun ct ions ma y b e set equal to one, an d do not aﬀect the conditional probabilit y of X g h giv en N g h , β µ , β θ . The k ernel of the b eta-binomial log-lik eliho od fu nctio n is L ( β µ , β θ ) = M X g =1 p X h =1   x gh − 1 X v =0 log [ µ ( z µgh ) + v θ ( z θ g h )]+ n gh − x gh − 1 X w =0 log [1 − µ ( z µgh ) + w θ ( z θ g h )] − n gh − 1 X u =0 log [1 + uθ ( z θ g h )]   (22) and we may u se maxim um lik eliho od metho ds adopting a Newton-Raphson iterativ e pr ocess to estimate β µ and β θ . T he ﬁrst and s ec ond deriv ativ es of (22) are sho wn in Lora and S inger (2008 ). Me tho d of moment s estimates based on the b eta-binomial distribu tio n m a y b e u sed 5 as initial v alues f or µ ( z µgh ) and θ ( z θ g h ), as suggested by Griﬃ th s (19 73). Lik eliho od ratio tests ma y b e emplo yed for mo del reduction purp oses, i.e., for constructing a parsimonious mo del that captures the explainable v ariabilit y in the d at a. F or example, to v erify if the q -parameter vec tor β ∗ is null, the test statistics LR = 2( L − L ∗ ), with L ∗ indicating the log-lik eliho o d under H 0 and L , this logarithm u nder the al ternativ e h yp othesis m a y b e emplo y ed. Asy m ptoti cally , LR f ollo ws a c hi-squared d istribution with q degrees of fr ee dom under the n ull h yp othesis. The probabilit y function for the rep eated num b er of trials b ased in (3) and (4) is P ( N g 1 = n g 1 , ..., N g p = n g p | β λ , β α , β δ ) = = p Y h =1  [ λ ( z λg h )] n gh n g h !   1 δ ( z δg )  α ( z αg ) /δ ( z δg ) Γ p X h =1 n g h + α ( z αg ) δ ( z δg ) !  Γ  α ( z αg ) δ ( z δg )  − 1 ÷ " p X h =1 λ ( z λg h ) + 1 δ ( z δg ) # Σ p h =1 n gh + α ( z αg ) /δ ( z δg ) = p Y h =1  [ λ ( z λg h )] n gh n g h !  Σ p h =1 n gh − 1 Y u =0 [ α ( z αg ) + uδ ( z δg )] ÷ ( δ ( z δg ) " p X h =1 λ ( z λg h ) # + 1 ) Σ p h =1 n gh + α ( z αg ) /δ ( z δg ) (23) In (23 ), the simpliﬁcations for the ratio ns b et w een t wo gamma f unctions mak e sense when P p h =1 n g h 6 = 0. When th is condition is not satisﬁed, the ratio is also set equal to one, and it do es n ot aﬀect the probabilit y v alue. The k ernel of the gamma-P oisson log-lik eliho o d fun ctio n is L ( β λ , β α , β δ ) = M X g =1    p X h =1 [ n g h log λ ( z λg h )] + Σ p h =1 n gh − 1 X u =0 log [ α ( z αg ) + uδ ( z δg )] − " p X h =1 n g h + α ( z αg ) δ ( z δg ) # log " δ ( z δg ) p X h =1 λ ( z λg h ) ! + 1 #) (24) and w e adopt th e same metho ds used with the b eta-binomial mo del to estimate β λ , β α and β δ . T he ﬁ rst and second deriv ativ es of (24) are sho wn at the App endix. Metho d of momen ts estimates ma y b e used used as the in iti al v alues f or λ g h ( z λ ), α g ( z α ) and δ ( z δ ) here, to o. Lik eliho od ratio tests ma y b e emplo y ed for mo del red u ctio n p urp ose, along similar lines as those considered for the b eta- binomial mo del. Both iterativ e pro cesses are imp lemen ted in the R soft wa re and the corresp onding co de can b e d o wnloaded from ht tp://www.ime.usp.b r/ ∼ jmsinger. 6 3 Data analysis T o compare the b eta-binomial/ gamma-P oisson to the multiv ariate b eta-binomial/ Po isson mo del, w e consider the same d ata presented in Lora and Singer (2008) from a study con- ducted at the Learning Lab oratory of the Departmen t of Physio therapy , Phonotherapy and Occupational Therapy of the Univ ersit y of S˜ ao Pa ulo, Brazil, to ev aluate the p erform an ce of some motor activities of Parkinson’s disease patien ts. F or the sak e of completeness, we rep eat the d escrip tio n of the study here. T wen t y ﬁ v e patien ts with conﬁr m ed clinical di- agnosis of P arkinson’s disease and t wen t y one normal (without an y preceding neur olo gic alteratio ns) sub jects rep eated tw o sequ ences of sp eciﬁed op p osed ﬁnger m ov ements (touc h- ing one of the other four ﬁngers w ith the th u m b ) during one minute p erio ds, with b oth hands. This was done b oth b efore and a fter a four -week exp erimenta l perio d in whic h only one of the sequ en ce s was trained (activ e sequence) with one of the hand s; the other sequence was n ot trained (con trol sequen ce ). Half of the sub jects in eac h group trained the preferred hand (righ t for th e r ig ht-handed and left for the left-handed in the normal group or the less aﬀected by the disease in the exp erimen tal group) and the other half trained the non-preferred hand. Information on the n um b er of attempted and successful trials were recorded with a sp ecia l device attac hed to a computer. Six su b groups ma y b e c haracterized by the com b inatio n of d isea se stage (n orm al , ini- tial or adv anced) and use of the preferred h and (ye s or no). The rep eated measur es are c haracterized by the cross-classiﬁcation of the lev els o f sequence (con trol or activ e) a nd ev aluation session (baseline or ﬁn al) . The sp eciﬁc ob jectiv e of th e stud y wa s to ev aluate whether training is asso ciated with in crea ses in the exp ected num b er of attempted trials p er minute (agilit y) and/or on th e p robabilit y of su cce ssful trials (abilit y). Note that the treatmen t could impro v e ag ilit y without impr oving abilit y , so an ev aluation of its eﬀect on b oth c haracteristics is imp ortan t. The means and v ariances of th e num b er of attempted and successful trials at the baseline and ﬁnal ev aluations with the act iv e and cont rol sequences for p at ien ts at the d iﬀe rent dis- ease stages using the preferred or non-preferred h ands are pr esen ted in T able 1. V ariances, instead of standard d eviations, are disp la y ed to facilit ate identiﬁca tion of o v erdisp ersion in the sense referr ed by Nelder and McCullagh (1989), i.e., cases wh ere v ariances are greater than exp ected under P oisson or binomial distributions. Ov erdisp ersion in the num b er of attempts, und er a Poi sson distrib ution is clearly identiﬁed b y comparin g the observ ed mean and v ariance; for th e n umb er of successes, on the other hand, it is necessary to compare the 7 observ ed and exp ected v ariances u nder the binomial distribu tio n ( n p (1 − p )). F o r example, considering normal sub jects p erform ing the activ e sequ ence at the b asel ine session u sing the preferred hand, the exp ected v ariance und er the binomial mo del is 1 . 4, while the observ ed v ariance is 49 . 0, highligh ting the o v erdisp ersion for these coun ts to o. Correlation co eﬃci ent s for the w ithin-sub ject resp onses for the normal patient s u sing the p referred hand are disp la y ed in T able 2. F or this s ubgroup, only 3 out of the 28 observed correlations are s m all er than 0 . 60; this suggests that the coun ts are probably related and it is s en sible to us e a mo del that can accommo date this relationship. Th e correlation patterns for the other subgroups are similar and are not pr esen ted. The analysis strategy consisted in ﬁtting in itia l mo dels of the form (16)-(20) with all main eﬀects and ﬁrst order in teractions, and trying to r educe them by sequ en tially elimi- nating the non-signiﬁcant terms. Th e parameters are ind exed b y disease stage (0=normal, 1=initial, 2=adv anced), interv enti on hand (P=pr eferred, N=non-pr eferred), ev aluation ses- sion (B=baseline, F=ﬁn al) and s equ ence (C=con trol, A=activ e). W e adopted a reference cell parameterization with the reference cel l corresp onding to the norm al group (0), p er- forming the a ctiv e sequence (A) with the preferred han d (P) at the baseline ev aluatio n (B). 3.1 Mo delling the exp ected proba bilit y and disp ersion of successful at- tempts F or b oth b eta-binomial/g amma-P oisson and multiv ariate b eta-binomial/P oisson mo dels, the parameters of the beta-binomial co mp onent s can b e estimated separately from those of th e gamma-P oisson or the multiv ariate P oisson distributions. Therefore, mo delling the exp ected probab ilities and disp ersion parameters of the successful attempts is exactly the same as in Lora and Singer (2008 ) and it is n ot sho wn here; w e presen t only the estimate s and stand ard errors computed under the ﬁnal b eta-binomial mo del (T able 3) f or comparison with the r esu lts obtained und er the b eta-binomial/ga mma-Po isson mo del. Under this ﬁn al mo del, estimates of the exp ected p robabilities of successful attempts [ E ( π g h ) = µ ( z µgh )] and disp ersion parameters θ ( z θ g h ) (that go vern the v ariabilit y of the probabilities of successful attempts), along with their standard errors, are p resen ted in T able 4. The results su gg est no evidence of d iﬀeren ce b etw een the exp ected pr obabiliti es of successful attempts f or patien ts using preferr ed or non-pr eferred hand ( β µN = 0), n ei ther for activ e n or for con trol sequences in the baseline session ( β µC = 0). P atien ts in th e normal group or with the disease in initial stage h a v e similar exp ected probabilities of su cc essful 8 T a ble 1: Mea n and v ariance (within p aren theses) of the num b er of attempted and successful trials. Disease Ev alua tion Int erven tion Sequence Succes s es A ttempts stage session hand Normal Bas e line Preferr e d Control 17 .1 (49.0) 18.6 (46,2) Normal Bas e line Preferr e d Active 17.1 (72.3 ) 17.9 (79.2) Normal Bas e line Non- preferred Con trol 18.1 (27.0 ) 20.9 (47.6) Normal Bas e line Non- preferred Active 17.1 (37.2) 19.5 (53.3) Normal Final Preferr e d Control 20 .9 (90.3) 26.1 (44.9) Normal Final Preferr e d Active 32.7 (139.2 ) 33.1 (132.3) Normal Final Non-preferre d Control 24 .2 (25.0 ) 28.6 (38.4) Normal Final Non-preferre d Activ e 32.8 (74.0 ) 34.4 (72.3) Initial Baseline Preferr e d Control 13 .7 (24.0) 16.3 (44.9) Initial Baseline Preferr e d Active 12.0 (23.0 ) 13.5 (23.0) Initial Baseline Non-preferred Control 12 .0 (17.6) 14.6 (9.0) Initial Baseline Non-preferred Activ e 1 0.7 (20 .3) 13.6 (10.9) Initial Final Preferr e d Control 13 .2 (30.3) 16.8 (43.6) Initial Final Preferr e d Active 20.2 (9.6) 21.8 (2.9) Initial Final Non-preferre d Control 15.3 (112.4) 20.3 (116.6) Initial Final Non-preferre d Activ e 20.1 (33.6 ) 20.4 (39.7) Adv anced Baseline Preferred Control 4.8 (22.1) 7.1 (11.6) Adv anced Baseline Preferred Active 4 .6 (11.6) 7.9 (14 .4) Adv anced Baseline Non- preferred Control 8.3 (72.3) 12.5 (15.2) Adv anced Baseline Non- preferred Active 13.5 (92.2) 15.5 (57.8) Adv anced Final Pre fer red Control 7.4 (75.7 ) 11.9 (67.2) Adv anced Final Pre fer red Active 13.5 (9 0.3) 14.9 (77.4 ) Adv anced Final Non-prefer red Cont rol 5.8 (31.4) 12.8 (1 2.3) Adv anced Final Non-prefer red Activ e 2 2.5 (75.7 ) 23.8 (75.7) 9 T a ble 2: Correlation coeﬃcients for the within-sub ject resp onses for the n orm al sub jects using the preferred hand Baseline sessio n Final sessio n Activ e se q. Con trol seq. Activ e seq. Control seq. Suc. At t. Suc. A tt. Suc. At t. Suc. Att. Baseline Active Suc. 1 session seq. Att. 0.99 1 Control Suc. 0.85 0 .84 1 seq. Att. 0.78 0.80 0.96 1 Final Activ e Suc. 0.76 0 .76 0.61 0.6 1 1 session seq. Att. 0.74 0.74 0.61 0 .63 0.99 1 Control Suc. 0.53 0 .49 0.59 0.6 3 0.60 0.6 1 1 seq. Att. 0.81 0.82 0.70 0 .69 0.93 0 .92 0.50 1 Co des: Suc.=Successes, Att.=A ttempts a nd seq.= sequence attempts ( β µ 1 = 0), but those with the d isea se in an adv anced stage hav e smaller exp ected probabilities of successfu l attempts ( β µ 2 < 0). Moreo v er, an inte rven tion eﬀect is detected since the exp ect ed pr obabilit ies of successful attempts in the ﬁnal session are greate r than those for the baseline session ( β µF > 0). These v alues are smaller for the con trol sequence than for the activ e sequence ( β µF + β µ ( F ∗ C ) < 0) su gg esting that training is eﬀectiv e with resp ect to abilit y . W e ma y also in f er that th ere is no diﬀerence b et w een the exp ected disp ersion parameter for sub jects p erforming the activ e and control sequ ences ( β θ C = 0). F or the normal sub jects, the exp ected disp ersion parameters are the same ( β θ C , β θ N , β θ F =0), except in the ﬁnal ev aluation using the non-p r eferred hand, f or which the exp ected v alue is smaller than the others ( β θ ( F ∗ N ) < 0). F or patien ts in initial stage of th e disease, the exp ected disp ersion parameters are smaller than for those in the n ormal group ( β θ 1 < 0); how ev er, they change for eac h com bination of session and in terv ent ion hand ( β θ (1 ∗ F ) , β θ (1 ∗ N ) , β θ ( F ∗ N ) 6 = 0). Finally , for patien ts in th e adv anced stage of the disease, the exp ected disp ersion parameter is larger than for those in the n orm al group ( β θ 2 > 0), but this change s for the ﬁnal session when the non-preferred hand is used ( β θ ( F ∗ N ) 6 = 0). 10 T able 3: Parame ter estimates an d stand ard errors und er the ﬁnal b eta- binomial m odel Standard Parameter Related to Estimate erro r β µ 0 Normal gro up, prefer red hand, 1.86 0.15 baseline session a nd ac tive s equence β µ 2 Eﬀect of adv a nced stage -1.35 0.25 β µF Eﬀect of ﬁnal se s sion 1.38 0.30 β µ ( F ∗ C ) Eﬀect of ﬁnal se s sion and control sequence -1.79 0.30 β θ 0 Normal gro up, prefer red hand, -1.07 0.27 baseline session a nd ac tive s equence β θ 1 Eﬀect of initial sta g e -2.98 1.05 β θ 2 Eﬀect of adv a nced stage 1.31 0.37 β θ (1 ∗ F ) Eﬀect of disease in initial sta ge a nd ﬁnal ses s ion 1.6 6 0.82 β θ (1 ∗ N ) Eﬀect of initial sta g e and non-prefer red ha nd 2.78 0.91 β θ ( F ∗ N ) Eﬀect of ﬁnal se s sion and non-pr e ferred hand -1.49 0.44 3.2 Mo delling t he exp ected n um b er of attempts The initia l mo del parameter v ector, with all main eﬀects and ﬁrst order in teractions is β = ( β λ , β α , β δ ) where β λ = ( β λ 0 , β λ 1 , β λ 2 , β λN , β λF , β λC , β λ (1 ∗ F ) , β λ (1 ∗ N ) , β λ (1 ∗ C ) , β λ (2 ∗ F ) , β λ (2 ∗ N ) , β λ (2 ∗ C ) , β λ ( F ∗ N ) , β λ ( F ∗ C ) , β λ ( N ∗ C ) ) β m = ( β m 0 , β m 1 , β m 2 , β mN , β m (1 ∗ N ) , β m (2 ∗ N ) ) with m = α, δ . W e ma y inte rpr et β λ 0 as the logarithm of λ for norm al ind ivid uals, u sing the p referred hand, p erformin g the activ e sequ ence at th e ﬁ nal ev aluation; β λN corresp onds to the v ariation in the logarithm of λ d ue to the eﬀect of the n on-preferred hand compared to the pr eferr ed one; β λ (1 ∗ N ) corresp onds to an ad d itio nal v ariation in the logarithm of λ due to the inte raction b etw een th e initial stage of the d isea se (1) and the u se of the non- preferred hand ( N ). The elemen ts of the vec tor β λ related to diﬀeren t ev aluation sessions (represen ted by F an d C ) allo w for d iﬀeren t num b er of attempts in th ese diﬀerent ev aluation sessions. On the other h and, α ( z αg ) and δ ( z δg ) do not v ary in diﬀerent ev aluation sessions; therefore the v ectors β α and β δ do not ha v e elemen ts to distinguish b et w een sessions, but ha v e elemen ts to compare subgroup s. As noticed in Lora and Sin ge r (2008) for the b eta-binomial mo del, th e iterativ e pro ce ss 11 T able 4: Estimates of exp ecte d p robabilitie s of successful attempts, disp ersion parameters and standard errors under the ﬁnal b eta -binomial mo del Disease Ev alua tion Int erven tion Exp ected Standa rd stage session hand Sequence v alue error Exp ected pr obabilities of success ful a ttempts Normal or initial Baseline Either Either 0.87 0.0 2 Normal or initial Final Either Con trol 0.81 0.0 3 Normal or initial Final Either Active 0.96 0.01 Adv anced Baseline Either Either 0.62 0.0 6 Adv anced Final Either Control 0.52 0.0 6 Adv anced Final Either Active 0.87 0.0 4 Disper sion par ameters Normal Baseline Either Either 0.34 0.0 9 Normal Final Preferr e d Either 0.34 0.0 9 Normal Final Non-preferre d E ither 0.08 0.03 Initial Baseline Prefer red Either 0.02 0 .02 Initial Baseline Non-preferr ed Either 0.28 0.14 Initial Final Preferr ed Either 0.09 0.0 6 Initial Final Non-preferr e d Either 0.33 0 .19 Adv anced Baseline Either Either 1.27 0.3 7 Adv anced Final Preferr e d Either 1.27 0.3 7 Adv anced Final Non-preferre d E ither 0.29 0.13 w as v ery sensitive to initial v alues, sp ecially for the in teractions. T o o v ercome th is d iﬃ cu lt y , w e started with a simp ler mo del co nta ining only the main eﬀects and used the resu lting estimates as initial v alues f or ﬁtting other mo dels, obtained b y includin g the inte ractions one b y one. The estimates of th e interac tion parameters ob tained in this preliminary pro cess w ere used as the initial v alues in our mo delling strategy . The non-signiﬁcan t interac tions w ere iden tiﬁed and their simultaneous elimination fr om the initial mo del w as supp orted ( p = 0 . 211) v ia a test of the h yp othesis H 0 : β λ (1 ∗ F ) , β λ (1 ∗ N ) , β λ (1 ∗ C ) , β λ (2 ∗ F ) , β λ (2 ∗ N ) , β λ (2 ∗ C ) , β λ ( F ∗ N ) , β λ ( N ∗ C ) , β α (1 ∗ N ) , β α (2 ∗ N ) , β δ (1 ∗ N ) , β δ (2 ∗ N ) = 0 Under the resulting redu ce d mo del, the non-signiﬁcan t m ai n eﬀects were ident iﬁed; their 12 sim ultaneous elimination was corrob orated ( p = 0 . 493) via a test of the h yp othesis H 0 : β λN , β λC , β α 1 , β α 2 , β αN , β δ 1 , β δ 2 , β δN = 0 . W e considered other hyp otheses wh ere some of these p aramet ers are equal to zero and th ey w ere all rejected ( p < 0 . 150). Go o dness of ﬁt of the resulting reduced mo del w as conﬁrmed b y a lik elihoo d ratio test in which it was co mpared to the initial mo del ( p = 0 . 289). F or this ﬁnal mo del, th e corresp onding parameter estimates along with their s ta ndard errors are pr esen ted in T able 5. Based on this, w e estimated exp ected v alues for λ ( z λg h ); the results are presented in T able 6. Additionally , since only the parameters β α 0 and β δ 0 w ere included at th e ﬁnal mo del, w e hav e α ( z αg ) = 3 . 67, with standard error of 0 . 18, and δ ( z δg ) = 0 . 2 7, with sta ndard error of 0.07 , for al l disease stages and b oth h ands. The non-zero estimate of δ s u gg ests that the total attempts are o verdisp ersed and that the correlations among the coun ts across the diﬀeren t instan ts of ev aluation are non-null. T able 5: P arameter estimates and standard errors for the ﬁnal gamma-P oisson m odel Parameter Related to Estimate Standard err o r β λ 0 Normal gro up, prefer red hand, 1.68 0.03 initial ev a luation and active seq uence β λ 1 Eﬀect of initial sta g e -0.38 0.05 β λ 2 Eﬀect of adv a nced sta g e -0.71 0.05 β λF Eﬀect of ﬁnal ev aluation 0.52 0.04 β λ ( F ∗ C ) Eﬀect of ﬁnal ev aluation and control s equence -0.22 0.05 β α 0 Normal gro up, prefer red hand 1.30 0.05 β δ 0 Normal gro up, prefer red hand -1.32 0.25 W e ma y conclude that individu als in the initial stage of the disease h a v e smaller exp ected n umber of att empts than normal ones, and for individu al s in the adv anced stage this v alue is ev en smaller ( β λ 2 < β λ 1 < 0 and β α 1 = β α 2 = 0). There is no evidence of diﬀerence b et ween the exp ected num b er of attempts for participan ts using pr eferred or non-preferr ed hands ( β λN = 0 and β αN = 0), neither for activ e n or for con trol sequences in th e baseline session ( β λC = 0). Th e results suggest that the training is also eﬀectiv e w ith resp ect to agilit y , since t he exp ected n umber of attempts und er the ﬁn al ev aluation is b igg er than at the initial one ( β λF > 0). Moreo v er, for th e cont rol sequence, the exp ected n umber of attempts is larger at the ﬁnal ev aluation compared with the initial one ( β λF + β λ ( F ∗ C ) > 0); ho w ev er, considering only the ﬁ nal ev aluation, the exp ected num b er of attempts is larger for the activ e s equences th an for the con trol ones ( β λ ( F ∗ C ) < 0). 13 T able 6: Estimates of exp ected v alues of λ ( z λg h ) Disease Ev alua tion Interv ention Sequence Exp ected Standard stage session hand v alue error Normal Bas e line Either Either 5.4 0.2 Normal Final Either Control 7.2 0.3 Normal Final Either Active 9.0 0.4 Initial Base line Either Either 3 .7 0.2 Initial Final Either Co n trol 5.0 0.4 Initial Final Either Activ e 6.2 0.3 Adv anced Baseline Either E ither 2.3 0.1 Adv anced Final Either Con trol 3.6 0.2 Adv anced Final Either Activ e 4.4 0.3 T able 7 c on tains estima tes o f th e exp ec ted successful and total attempts al ong with the resp ectiv e standard errors. In T able 8 w e present estimat es (with r espectiv e standard errors) of the elemen ts of the cov ariance m at rix fo r norm al su b jects using the preferred hand. C o v ariance p att erns for the other subgroups are similar and are not included. T able 7: Estimates and standard errors (within paren theses) for the exp ected num b er of successful and total attempts under the ﬁnal b eta -binomial/gamma-P oisson mo del Disease Ev aluation Interv ention Sequence Success ful T otal stage session hand attempts attempts Normal Bas e line Either Either 17.2 (1.0) 19.8 (1.1) Normal Final Either Co n trol 21.4 (0 .8) 26.4 (0.1) Normal Final Either Activ e 3 1.7 (1.8) 33.0 (1.8) Initial Baseline Either Either 11.8 (0.7) 13.6 (0.8 ) Initial Final Either Co n trol 14.9 (1 .1) 18.4 (1.2) Initial Final Either Activ e 2 1.9 (1.4) 22.8 (1.4) Adv anced Baseline Either Either 5.2 (0.6) 8.4 (0.6 ) Adv anced Final Either Con trol 6.9 (0.9) 13 .2 (0.9) Adv anced Final Either Activ e 14.0 (1.2) 16.1 (1.1) 4 Discussion The p rop osed b eta-binomial/ga mma-Po isson mo del is more general than the multiv ariate b eta-binomial/ Po isson m odel considered in Lora and Singer (2008) b ecause it allo ws for 14 T able 8: Estimates and standard errors (within parentheses) for the exp ected co v ariance matrix for normal sub jects using the preferred hand Baseline sessio n Final s ession Activ e se q. Control seq. Activ e se q. Control seq. Suc. A tt. Suc. A tt. Suc. A tt. Suc. A tt. Baseline Active Suc. 51.2 session seq. (7.2) A tt. 42.2 48.5 (11.4) (13.0) Control Suc. 2 1.5 0 5 1.2 seq. (5.8) (7.2) A tt. 0 28.7 42.4 48.5 (7.8) (11.4) (13.0 ) Final Active Suc. 40.3 0 40.3 0 118.9 session seq. (10.8) (10.8) (22.2) A tt. 0 48.4 0 48 .4 110.5 115.1 (13.0) (13.0) (30.0) (31.2) Control Suc. 2 7.3 0 2 7.3 0 52.3 0 87.4 seq. (7.3) (7.3) (13.8) (12.9) A tt. 0 39.0 0 39 .0 0 65.8 64.9 80.1 (10.5) (10.5) (17.7) (17.8) (21.8) Co des: Suc.=Successes, Att.=A ttempts a nd seq.= sequence diﬀeren t co v ariances b et ween the num b er of attempts in diﬀeren t ev aluation sessions and considers a p ossible o verdisp ersion of the total at tempts. Moreo ver, the gamma-P oisson comp onen t of the mod el is computationally m uc h easier to use for comparisons among the n umber s of attempts in diﬀeren t ev aluation sessions. While in the m ultiv ariate b eta-binomial/P oisson mo del, the m ultiv ariate Po isson com- p onen t requ ires a diﬀeren t set of parameters for eac h ev aluation session, in the b eta- binomial/gamma-P oisson m o del, the g amma-P oisson compon ent includes a single set of parameters for all ev aluation sessions. T o compare the exp ected n umber of attempts under diﬀeren t conditions using the former, it is necessary rewrite the model and to d eriv e ad ho c estimating equations w hile und er the latter, it su ﬃces to eliminate the corresp onding regression parameter and to obtain new parameter estimates using the same estimating 15 equations. F or the analyzed data, for example, the comparison b et we en the con trol and ac- tiv e sequence du ring th e b aseline ev aluation using th e b eta-binomial/gamma- Po isson m o del is d one by testing if the parameter β λC is null. On the other h and, und er th e multiv ari- ate b eta-binomial/P oisson app roac h, the total num b er of trials is mo delled with a sp eciﬁc v ector of parameters for eac h instan t of observ ation; for th e d at a in the example, th ey are: baseline ev aluation p erforming activ e sequence, baseline ev aluation p erf orm ing the con trol sequence, ﬁ nal ev aluation p erforming the activ e sequence and ﬁ n al ev aluation p erforming the control sequence. T o compare the co ntrol and a ctiv e sequences dur ing th e baseline session w e sh ould r ewrite the mo del using only three parameters: baseline ev aluation (the same for ac tiv e and con trol sequences), ﬁnal ev aluation perform ing activ e sequence and ﬁnal ev aluation p erforming con trol sequence. The a v erage o f th e absolute diﬀerence s bet wee n t he sample m ea ns of t he num b er of successful and t otal attempts and the resp ectiv e exp ect ed v alues und er this ﬁnal mo del (T able 7) is 1.7. The same a v erage b ased on the multiv ariate b eta -binomial/P oisson mod el is 0.9. F u rthermore, the av erage of the absolute diﬀerences b et w een the obs erv ed and estimated co v ariances usin g the m ultiv ariate b eta-binomial/P oisson mo del is 21.5 while it is 19.1 if w e use the b eta-binomial/gamma- Po isson. These d iﬀe rences are attributable to the more ﬂexible co v ariance str ucture indu ce d b y the latter, i.e., allo wing for diﬀerent co v ariances b et we en the rep eated n umber of trials. The v alues of the AIC ( = 1888.0) and the BIC ( = 1919.1 ) for th e b eta-binomial/ga mma- P oisson mo del compared to the corresp onding v alues (AIC = 1935 .6 and BIC = 1974.0) for the m ultiv ariate b eta-binomial/P oisson also suggest a b etter ﬁt of the former. Although the results are quite similar, with the exception of the v alues for patien ts in the adv anced stage of the disease, the b eta-binomial/gamma-P oisson one is preferable to the multiv ariate b eta-binomial/P oisson, b oth b ecause of th e mo delling ﬂexibilit y and the computational adv ant ages mentio ned b efore. As an extension f or the b eta-binomial/ gamma-P oisson mod el, we could incorp orate a parameter to relate the probabilities of success to the total attempts, as in Zh u et al. (2004 ). Another p ossible extension w ould b e to consider the case where attempts could b e done correctly , satisfactorily or incorrectly; in this case, we could generalize the mo del by considering Diric hlet-m u ltinomia l/gamma-P oisson d istribution mo dels. These extensions are current ly under in v estigation. 16 App endix First and second deriv atives for the gamma-Poiss on model ∂ L ( β λ , β α , β δ ) ∂ β λ = Z ′ λ L  L − 1 n − ( I p ⊗ B − 1 )( 1 p ⊗ a )  , ∂ L ( β λ , β α , β δ ) ∂ β α = Z ′ α M [ c − D − 1 log ( b )] and ∂ L ( β λ , β α , β δ ) ∂ β δ = Z ′ δ [ De + D − 1 Mlog ( b ) − B − 1 L s a ] ∂ 2 L ( β λ , β α , β δ ) ∂ β λ ∂ β ′ λ = Z ′ λ L [ I p ⊗ ( AB − 1 )]  L [ I p ⊗ ( DB − 1 )] − I M p  Z λ , ∂ 2 L ( β λ , β α , β δ ) ∂ β λ ∂ β ′ α = − Z ′ λ L [ I p ⊗ ( MB − 1 )]( 1 p ⊗ Z α ) , ∂ 2 L ( β λ , β α , β δ ) ∂ β λ ∂ β ′ δ = − Z ′ λ L [ I p ⊗ ( DB − 2 )] [ I p ⊗ ( N s − ML s )] ( 1 p ⊗ Z δ ) , ∂ 2 L ( β λ , β α , β δ ) ∂ β α ∂ β ′ α = Z ′ α M  C − D − 1 log ( B ) − MF  Z α , ∂ 2 L ( β λ , β α , β δ ) ∂ β α ∂ β ′ δ = Z ′ α MD  − J − D − 1 B − 1 L s + D − 2 log ( B )  Z δ and ∂ 2 L ( β λ , β α , β δ ) ∂ β δ ∂ β ′ δ = Z ′ δ D  E − DQ + MD − 1 B − 1 L s − D − 2 Mlog ( B ) − B − 2 L s [ N s − ML s ]  Z δ with L ( β λ , β α , β δ ) present ed in (24) and a = ( a 1 , ..., a g , ..., a M ) ′ , a g = δ ( z δg ) " p X h =1 n g h # + α ( z αg ) A = diag { a g } B = diag { b g } , b g = δ ( z δg ) " p X h =1 λ ( z λg h ) # + 1 log ( b ) = ( l og ( b 1 ) , ..., l og ( b g ) , ..., l og ( b M )) ′ log ( B ) = diag { log ( b g ) } c = ( c 1 , ..., c g , ..., c M ) ′ , c g = Σ p h =1 n gh − 1 X u =0 1 α ( z αg ) + uδ ( z δg ) , C = diag { c g } 17 e = ( e 1 , ..., e g , ..., e M ) ′ , e g = Σ p h =1 n gh − 1 X u =0 u α ( z αg ) + uδ ( z δg ) E = diag { e g } , F = diag { f g } , f g = Σ p h =1 n gh − 1 X u =0 1 [ α ( z αg ) + uδ ( z δg )] 2 J = diag { j g } , j g = Σ p h =1 n gh − 1 X u =0 u [ α ( z αg ) + uδ ( z δg )] 2 Q = diag { q g } , q g = Σ p h =1 n gh − 1 X u =0  u α ( z αg ) + uδ ( z δg )  2 n = ( n 11 , ..., n g h , ..., n M p ) ′ , N s = diag ( p X h =1 n g h ) L = diag { λ ( z λg h ) } L s = diag ( p X h =1 λ ( z λg h ) ) M = diag { α ( z αg ) } D = diag { δ ( z δg ) } Z λ = ( z ′ λ 11 , ..., z ′ λg h , ..., z ′ λM p ) ′ Z α = ( z ′ α 1 , ..., z ′ αg , ..., z ′ αM ) ′ Z δ = ( z ′ δ 1 , ..., z ′ δg , ..., z ′ δM ) ′ Ac kno wledgemen ts W e are grateful to Conselho Nacional de Desen v olvimen to Cient ´ ıﬁco e T ecnol´ ogico (CNPq) and F unda¸ c˜ ao de Amparo ` a Pesquisa do Estado de S˜ ao P aulo (F APESP), Brazil, for partial ﬁnancial supp ort. W e are also grateful to Maria Elisa Piment el Piemonte f or pr o viding the data. 18 References [1] Br o oks, S . P ., Morgan, B. J. T ., P ack, S. E. (1997). Finite mixture mo dels for prop or- tions. Biometrics 53 , 109 7-1115. [2] C om ulada, W. S., W eiss, R. E. (2007 ). On mo dels for Binomial data with random n umber of trials. Biometrics 63 , 610 -617. [3] C o x, D. R. (1983). Some remarks on o verdisp ersion. Biometrika 70 , 269-274. [4] Gange, S. J., Mun oz, A., Saez, M., Alonso, J. (1996). Use of the Beta-Binomial d is- tribution to mo del the eﬀect of p olicy c hanges on appropriateness of Hospital Sta ys. Applie d Statistics 45 , 371-3 82. [5] Gr iﬃths, D. A. (1973). Maxim um like liho o d estimation for the Beta-Binomial distri- bution and an ap p lica tion to the total n um b er of cases of a disease. Biometr ics 29 , 673-6 48. [6] Ho, L. L., Singer, J. M. (1997). Regression mo dels for biv ariate counts. Br azilian Journal of Pr ob ability and Statistics 11 , 175 -197. [7] Ho, L . L., Singer, J. M. (2001). Generalized least s q u ares metho ds for biv ariate P oisson regression. Communic ations in Statistics 30 , 263 -278. [8] J ohnson, N. L., K ot z, S. (1970). Distributions in Statistics: c ontinuous univariate distributions 2 . Boston: Houghton Miﬄin. [9] K arlis, D., Ntzo ufr as, I. (2003). Analysis of sp orts data by us in g biv ariate Poisson mo dels. The Statistician 52 , 381-3 93. [10] Lora, M. I ., Sin ge r, J. M. (2008). Beta-binomial/P oisson mo dels for rep eated biv ariate coun ts. Statistics in M e dicine 27 , 3366 -3381. [11] Mo od, A. M., Gra ybill, F. A., Bo es, D. C. (19 74). Intr o duction to the the ory of statis- tics . Singap ore: McGra w -Hil l. [12] Nelder, J. A., McCullagh, P . (19 89). Gener alize d Line ar Mo dels . London: Chapman and Hall. [13] Nelson, J. F. (1985). Multiv ariate Gamma-P oisson Mo dels. Journal of the Americ an Statistic al Asso ciation 392 , 828-83 4. 19 [14] Williams, D. A. (1975). The analysis of binary resp onse from to xo co logical exp erimen ts in vlo ving repr odu ct ion and terato genecit y . Biometrics 31 , 949-95 2. [15] Zhu, J., Eic khoﬀ, J. C., Kaiser, M. S . (2003). Mo deling the Dep endence b et ween num- b er of trials and success pr ob ab ility in Beta-Binomia l − Po isson m ixture distributions. Biometrics 59 , 955 -961. 20

Beta-binomial/gamma-Poisson regression models for repeated counts with random parameters

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment