Testing polynomial covariate effects in linear and generalized linear mixed models

Statistics Surveys Vol. 2 (2008) 154–169
ISSN: 1935-7516
DOI: 10.1214/08-SS036

Mingyan Huang and Daowen Zhang
Department of Statistics, North Carolina State University, Raleigh, NC 27695
e-mail: mhuang@ncsu.edu
e-mail: dzhang2@stat.ncsu.edu

This paper was accepted by Michael Kosorok, Associate Editor for the IMS.

Abstract: An important feature of linear mixed models and generalized linear mixed models is that the conditional mean of the response given the random effects, after being transformed by a link function, is linearly related to the fixed covariate effects and random effects. Therefore, it is of practical importance to test the adequacy of this assumption, particularly the assumption of linear covariate effects. In this paper, we review procedures that can be used for testing polynomial covariate effects in these popular models. Specifically, four types of hypothesis testing approaches are reviewed, i.e. R tests, likelihood ratio tests, score tests and residual-based tests. Derivation and performance of each testing procedure will be discussed, including a small simulation study for comparing the likelihood ratio tests with the score tests.

Keywords and phrases: Likelihood Ratio Test, Restricted Maximum Likelihood (REML), Score Test.

Received February 2008.

Contents
1 Introduction
2 Generalized linear mixed models
3 Four testing procedures
   3.1 R tests
   3.2 Likelihood ratio tests
   3.3 Score tests
   3.4 Residual based tests
4 Comparison between the exact likelihood ratio and the score tests
5 Summary
Acknowledgements
References

M. Huang and D. Zhang/Testing polynomial covariate effects

1. Introduction

Linear mixed models (LMMs) [16] and their extension, generalized linear mixed models (GLMMs) [2; 28], are popular statistical models for analyzing correlated data, including longitudinal and clustered data often arising in biomedical research. An important feature of these models is that the conditional mean of the response given covariates and random effects, after being transformed by a link function, is linearly related to the fixed covariate effects and random effects. The correctness of such a model specification, especially that of the parametric linear covariate effects, has a significant impact on the validity of the subsequent statistical inference on the covariate effects. Therefore, it is of practical importance to check the adequacy of the assumption of parametric linear covariate effects.

To evaluate the adequacy of a parametric covariate effect in a regression model, one common approach is to cast the problem in the hypothesis testing framework, where a broader class of models is selected as the alternatives. Nonparametric regression models, due to their flexibility and robustness in modeling the relationship between a response variable and explanatory variables, are often chosen as such alternatives.
In practice, however, one rarely uses pure nonparametric regression models directly as alternatives because nonparametric functions are intrinsically infinite dimensional. To overcome this difficulty, various smoothing techniques, such as kernel smoothing and (penalized) spline smoothing, are often applied to estimate nonparametric functions, and the resulting estimates are then used as the alternatives for testing the adequacy of the parametric covariate effects. In doing so, the infinite dimensional alternatives are reduced to ones with finite dimensions (or even one dimension in some special cases), which significantly simplifies the testing problems. For example, it is well known that a nonparametric function estimated via penalized splines or smoothing splines has a mixed effects representation [3; 29; 30]. An appealing feature of the mixed effects representation is that one can cast the hypothesis test of parametric against nonparametric covariate effects as a variance component test, which in most cases is a simple one-dimensional testing problem [30; 8]. The likelihood ratio and score testing approaches reviewed here are mainly based on this mixed effects representation.

Alternatively, testing the adequacy of parametric covariate effects in LMMs and GLMMs can also be viewed as a goodness-of-fit problem. The residual based tests proposed by Pan and Lin [22] take this view. Specifically, these tests are “based on the cumulative sums of residuals over covariates or predicted values of the response variable” [22]. The major advantage of this approach is that it is valid against any alternatives that deviate from an assumed model.
For checking the adequacy of parametric covariate effects, we present here an overview of four types of hypothesis testing approaches that have received significant attention in the literature: R tests, likelihood ratio tests, score tests and residual-based tests. For each test, the derivation and performance are described first in the linear or generalized linear model framework, and then we mainly focus on their extensions to mixed models. The paper is organized as follows. Section 2 briefly introduces the models to be considered in this review. In Section 3, we review the four testing procedures. In Section 4, we present the results from a small simulation study to compare the performance of two popular testing procedures, the exact likelihood ratio test and the score test, based on the mixed effects representation of (penalized) smoothing spline estimates of a nonparametric function. The paper is concluded in Section 5 with some discussion.

2. Generalized linear mixed models

In this section, we briefly introduce the models to be considered and the notation to be used in this review. Since LMMs are special cases of GLMMs, we will only introduce GLMMs for longitudinal/clustered data.

Suppose there are $m$ subjects (or clusters) in a data set. For the $i$th subject ($i = 1, 2, \ldots, m$), denote by $y_{ij}$ the $j$th measurement of the response variable ($j = 1, 2, \ldots, n_i$), and by $z_{ij}$, $s_{ij}$ and $t_{ij}$ the $j$th measurements of the $q$-dimensional covariates $z$, the $p$-dimensional covariates $s$ (not including the intercept) and a scalar covariate $t$.
Given subject-specific random effects $b_i$ and these covariate values, the $y_{ij}$ are assumed to be independent and to have a conditional density in an exponential family with conditional mean $\mu_{ij} = E(y_{ij} \mid b_i)$ and conditional variance $\mathrm{var}(y_{ij} \mid b_i) = \omega_{ij}^{-1} \phi v(\mu_{ij})$, where $\omega_{ij}$ is a prior weight, $\phi$ is the dispersion parameter and $v(\cdot)$ is the variance function. The conditional mean $\mu_{ij}$ is assumed to be related to the covariates through the following GLMM [2]

$$g(\mu_{ij}) = s_{ij}^T \delta + m(t_{ij}, \gamma) + z_{ij}^T b_i, \tag{2.1}$$

where $g(\cdot)$ is a known monotone link function, $\delta$ are fixed effects of $s$, $m(t, \gamma) = \gamma_0 + \gamma_1 t + \cdots + \gamma_d t^d$ is the $d$-order ($d$ is a non-negative integer) polynomial covariate effect of $t$ with coefficients $\gamma_k$, and the random effects $b_i$ are usually assumed to have a multivariate normal distribution $N\{0, D(\theta)\}$, with $\theta$ being the vector of unique parameters in the variance matrix of the random effects $b_i$.

Model (2.1) includes many popular models as special cases. When $g(\mu) = \mu$ and $y_{ij}$ is assumed to have a conditional normal distribution given the random effects $b_i$, model (2.1) reduces to the LMM considered by Laird and Ware [16].

Suppose we are confident about the parametric linear form $s_{ij}^T \delta$ in model (2.1) and are mainly concerned with the adequacy of $m(t, \gamma)$, the polynomial covariate effect of $t$. For this purpose, we consider the following semiparametric additive mixed models (SAMMs) proposed by Zhang and Lin [30] as alternative models to model (2.1)

$$g(\mu_{ij}) = s_{ij}^T \delta + f(t_{ij}) + z_{ij}^T b_i, \tag{2.2}$$

where $f(t)$ is a smooth but arbitrary function. Denote $y = (y_{11}, \ldots, y_{1n_1}, \ldots, y_{m1}, \ldots, y_{mn_m})^T$, $S = (s_{11}, \ldots, s_{1n_1}, \ldots, s_{m1}, \ldots, s_{mn_m})^T$, $b = (b_1, \ldots, b_m)^T$, $Z_i = (z_{i1}^T, \ldots, z_{in_i}^T)^T$, $Z = \mathrm{diag}\{Z_1, \ldots, Z_m\}$, and $\mu = E(y \mid b)$.
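To make the notation of model (2.1) concrete, the following Python sketch simulates data from one simple special case: an LMM with identity link, a random intercept only ($z_{ij} = 1$), equal cluster sizes and Gaussian errors. The function names, the covariate distributions and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def polynomial_effect(t, gamma):
    """Evaluate m(t, gamma) = gamma_0 + gamma_1 t + ... + gamma_d t^d."""
    t = np.asarray(t, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    # Columns of `powers` are 1, t, t^2, ..., t^d.
    powers = np.vander(t, N=len(gamma), increasing=True)
    return powers @ gamma


def simulate_lmm(m, n_i, delta, gamma, sd_b, sd_eps, rng):
    """Simulate from a random-intercept special case of model (2.1).

    Assumptions (not from the paper): identity link, z_ij = 1,
    b_i ~ N(0, sd_b^2), s_ij standard normal, t_ij uniform on [0, 1],
    and n_i observations for every subject.
    """
    delta = np.asarray(delta, dtype=float)
    subject = np.repeat(np.arange(m), n_i)          # subject index of each row
    s = rng.normal(size=(m * n_i, len(delta)))      # covariates with fixed effects delta
    t = rng.uniform(size=m * n_i)                   # scalar covariate t
    b = rng.normal(scale=sd_b, size=m)              # random intercepts b_i
    mu = s @ delta + polynomial_effect(t, gamma) + b[subject]
    y = mu + rng.normal(scale=sd_eps, size=m * n_i)
    return subject, s, t, y


if __name__ == "__main__":
    rng = np.random.default_rng(2008)
    subject, s, t, y = simulate_lmm(
        m=10, n_i=5, delta=[0.5, -1.0], gamma=[0.0, 1.0, 2.0],
        sd_b=1.0, sd_eps=0.5, rng=rng,
    )
    print(y.shape, s.shape)
```

Replacing `polynomial_effect` by an arbitrary smooth function of `t` would give data from the corresponding SAMM (2.2) under the same assumptions.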
In the next section, we discuss four procedures for checking the assumption that $f(t)$ is adequately represented by a polynomial function $m(t, \gamma)$.

3. Four testing procedures

3.1. R tests

The R tests, discussed by Hastie and Tibshirani [15], were originally developed for testing smoothing parameters during the estimation of nonparametric functions through smoothing techniques for independent data. The idea of the R tests is analogous to the F statistic frequently used in linear regression models. One of the advantages of the R tests is their easy implementation, as under the null hypothesis the asymptotic distribution of the R statistic can be approximated by a chi-square distribution. However, the estimates of the degrees of freedom of the chi-square distributions can be biased, and the resulting approximate critical values might be inaccurate. Moreover, the finite-sample distribution of the R statistic has not been studied [8]. A number of modifications of the original R tests have been made, including correction of the bias of the nonparametric estimates and reconstruction of the original test statistics and the corresponding distributions [1; 4; 8].

Here we briefly describe a version of the R statistic proposed by Härdle et al. [13] under the generalized linear model (GLM) framework. They considered the following generalized partially linear model, a special case of SAMMs (2.2) for independent data ($n_i = 1$):

$$g(\mu_i) = s_i^T \delta + f(t_i). \tag{3.1}$$

Here, no random effect is required as the $y_i$ are independent, so the second subscript $j$ ($j = 1$) can be dropped to simplify the notation. Denote by $\tilde\delta$ and $\tilde f$ the estimates of $\delta$ and $f(t)$ under the null parametric model $H_0: f(t) = m(t, \gamma)$, and by $\hat\delta$ and $\hat f$ the estimates under the alternative model $H_a: f(t) \neq m(t, \gamma)$.
Let $\tilde\mu_i = g^{-1}\{s_i^T \tilde\delta + \tilde f(t_i)\}$ and $\hat\mu_i = g^{-1}\{s_i^T \hat\delta + \hat f(t_i)\}$. The proposed R statistic for testing $H_0: f(t) = m(t; \gamma)$ versus $H_a: f(t) \neq m(t; \gamma)$ is defined as

$$R = -2 \sum_{i=1}^{m} Q(\tilde\mu_i; \hat\mu_i), \tag{3.2}$$

where $Q$ is the log quasi-likelihood function defined as

$$Q(\mu_i; y_i) = \int_{y_i}^{\mu_i} \frac{\omega_i (y_i - u)}{v(u)} \, du.$$

Note that here the nonparametric estimates are based on kernel smoothing methods instead of the spline methods discussed below. As Härdle et al. [13] pointed out, the usual likelihood ratio statistic $L(\hat f, \hat\delta) - L(\tilde f, \tilde\delta)$, where $L(f, \delta) = \sum_{i=1}^{m} Q(\mu_i; y_i)$, is not appropriate in this case as $\delta$ and $f(t)$ are estimated from two different likelihood functions. Under the null hypothesis, Härdle et al. [13] showed that the new R statistic has an asymptotic normal distribution, although this approximation typically does not work well. Hence Härdle et al. [13] proposed several sophisticated bootstrap-based approaches to obtain more accurate critical values for the R tests.

Sperlich and Lombardia [21] extended the above R statistic to test $H_0: f(t) = m(t; \gamma)$ for a special SAMM with a random intercept only (i.e., $z_{ij} = 1$). The test statistic they proposed takes the following form:

$$R_{1w} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} H\{\hat f(t_{ij}), \hat\delta\} \left\{ \hat f(t_{ij}) - \tilde f(t_{ij}) + s_{ij}^T (\hat\delta - \tilde\delta) \right\}^2 \pi(t_{ij}), \tag{3.3}$$

where $\pi(\cdot)$ is a weight function which could be chosen empirically and

$$H\{f(t_{ij}), \delta\} = \left\{ \frac{\partial}{\partial f} l(y_{ij}; f, \delta) \right\}^2,$$

with $l(y_{ij}; f, \delta) = \log f(y_{ij} \mid t, s, f, \delta)$, the log density of $y_{ij}$. The $R_{1w}$ statistic is based on “direct comparison” between estimates from nonparametric alternatives and estimates from null parametric models.
Furthermore, Sperlich and Lombardia [21] showed that the asymptotic normality theory of Härdle et al. [13] carries over to the test statistic $R_{1w}$. However, the asymptotic approximations often depart from the true finite-sample distributions of the test statistics, which can lead to poor estimates of the critical values. Therefore, a number of bootstrap procedures were suggested to approximate the null distribution of the test statistic $R_{1w}$.

It can be immediately seen that construction of the R test statistic and its extension $R_{1w}$ for SAMMs involves the estimation of both the null and alternative models. Estimation of the null model may be relatively straightforward; however, model estimation under the alternatives can be computationally intensive and sometimes challenging. The bootstrap procedure used to calculate the null distribution of the test statistics also requires significant computation time, which may limit the application scope of this testing approach.

3.2. Likelihood ratio tests

For testing a parametric versus nonparametric covariate effect, the likelihood ratio test (LRT) is a natural choice. The LRT has been popular in situations where we need to compare two nested models. However, extending the LRT to testing the adequacy of a parametric covariate effect is not straightforward. A considerable amount of work has been done in constructing likelihood ratio based test statistics for comparing parametric versus nonparametric covariate effects. Depending on how the nonparametric alternatives were specified and what types of smoothing techniques were used, a number of versions of likelihood ratio based testing procedures have been proposed. In this section, we review the LRTs based on the mixed model representation of a nonparametric function estimated using a (penalized) smoothing spline.
Crainiceanu and Ruppert [7] considered the exact LRT and the restricted likelihood ratio test (RLRT) for testing whether the nonparametric function is a polynomial of a certain degree in the following partially linear model, which is a special case of SAMMs (2.2) and generalized partially linear models (3.1),

$$y_i = s_i^T \delta + f(t_i) + \epsilon_i, \tag{3.4}$$

where $\delta$ and $f(t)$ have the same definitions as before, and the $\epsilon_i$ are i.i.d. $N(0, \sigma_\epsilon^2)$ and are assumed to be independent of $s_i$ and $t_i$. The nonparametric function $f(t)$ can be approximated through a penalized smoothing spline by the following spline function

$$f(t) = \gamma_0 + \gamma_1 t + \cdots + \gamma_d t^d + \sum_{k=1}^{K} a_k (t - \xi_k)_+^d, \tag{3.5}$$

where $K$ is a non-negative integer, $\gamma = (\gamma_0, \ldots, \gamma_d)^T$ and $a = (a_1, \ldots, a_K)^T$ are two sets of parameters, $(t)_+^d = t^d$ for $t > 0$ and zero otherwise, $\xi_1 < \cdots < \xi_K$ are fixed knots, and $\xi_k$ could be defined as the $k/(K+1)$th sample quantile of the $t_i$. In order for (3.5) to be a good approximation, $K$ is usually chosen to be large (such as 20), in which case it is not desirable to estimate $\gamma$ and $a$ directly. A penalized spline estimate of $f(t)$ is obtained by minimizing the following penalized least squares criterion

$$\sum_{i=1}^{m} \{ y_i - f(t_i) - s_i^T \delta \}^2 + \frac{1}{\lambda} a^T \Sigma^{-1} a, \tag{3.6}$$

where $\lambda$ is the smoothing parameter and $\Sigma$ is a pre-specified roughness penalty matrix, usually taken to be the identity matrix $\Sigma = I_{K \times K}$. Let $A$ be the $m \times (d+1)$ matrix with $i$th row $A_i = (1, t_i, \ldots, t_i^d)$ and $B$ be the $m \times K$ matrix with $i$th row $B_i = [(t_i - \xi_1)_+^d, \ldots, (t_i - \xi_K)_+^d]$. The penalized least squares criterion (3.6) suggests that $f(t)$ has a mixed effects representation $f = A\gamma + Ba$, where $f = \{f(t_1), f(t_2), \ldots, f(t_m)\}^T$, $\gamma$ is considered as fixed effects and $a$ is regarded as random effects having the distribution $a \sim N(0, \sigma_a^2)$ with $\sigma_a^2 = \lambda \sigma_\epsilon^2$. Denote $\beta = (\delta^T, \gamma^T)^T$ and $X = [S \mid A]$, where $S$ is the $m \times p$ matrix with $i$th row $s_i^T$. Then the original partially linear model has the equivalent linear mixed model representation

$$Y = X\beta + Ba + \epsilon. \tag{3.7}$$

It can be clearly seen from the penalized spline expression (3.5) that in general $f(t)$ is a polynomial of degree $d - h$ ($h = 0, 1, \ldots, d$) if $\gamma_{d-h+1} = \cdots = \gamma_d = 0$ and $a_1 = \cdots = a_K = 0$, which is equivalent to $\gamma_{d-h+1} = \cdots = \gamma_d = 0$ and $\sigma_a^2 = 0$ (or $\lambda = 0$) using the linear mixed model representation. Therefore, testing whether the covariate effect of $t$ is a $(d-h)$-degree polynomial is equivalent to testing

$$H_0: \gamma_{d-h+1} = \cdots = \gamma_d = 0, \ \sigma_a^2 = 0 \ (\lambda = 0) \quad \text{versus} \quad H_a: \gamma_{d-h+1} \neq 0 \text{ or } \cdots \text{ or } \gamma_d \neq 0 \text{ or } \sigma_a^2 > 0 \ (\lambda > 0)$$

if the mixed model representation of a penalized smoothing spline is used. One approach proposed by Crainiceanu and Ruppert [7] for testing this hypothesis is the LRT using the log-likelihood of $\beta$, $\sigma_a^2$ and $\sigma_\epsilon^2$ from the mixed model representation (3.7)

$$\ell(\beta, \sigma_a^2, \sigma_\epsilon^2; Y) = -\frac{1}{2} \log |V| - \frac{1}{2} (Y - X\beta)^T V^{-1} (Y - X\beta),$$

where $V = \sigma_a^2 B B^T + \sigma_\epsilon^2 I_{m \times m}$ is the marginal variance of $Y$ under model (3.7).

In the case where $h = 0$, the testing problem becomes a variance component test, i.e. $H_0: \sigma_a^2 = 0$ versus $H_a: \sigma_a^2 > 0$. Besides the LRT, an alternative choice for testing this particular hypothesis is to use the following REML function

$$\ell_R(\sigma_a^2, \sigma_\epsilon^2; Y) = -\frac{1}{2} \log |V| - \frac{1}{2} \log |X^T V^{-1} X| - \frac{1}{2} (Y - X\hat\beta)^T V^{-1} (Y - X\hat\beta),$$

where $\hat\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} Y$. This method is abbreviated as RLRT.
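As an illustration of (3.5) to (3.7), the Python sketch below builds the polynomial matrix $A$ and the truncated power basis matrix $B$ with knots at the $k/(K+1)$th sample quantiles, and evaluates the REML function $\ell_R$ for given variance components. It is a minimal sketch under stated assumptions: $\Sigma = I$, dense linear algebra chosen for clarity rather than speed, and the function names `spline_design` and `reml_criterion` are hypothetical, not part of any package or of the original paper.

```python
import numpy as np


def spline_design(t, d, K):
    """Return A (m x (d+1)), B (m x K) and the knots used in (3.5)."""
    t = np.asarray(t, dtype=float)
    # Knot xi_k is the k/(K+1)th sample quantile of t, k = 1, ..., K.
    knots = np.quantile(t, np.arange(1, K + 1) / (K + 1))
    A = np.vander(t, N=d + 1, increasing=True)       # rows (1, t_i, ..., t_i^d)
    diff = t[:, None] - knots[None, :]
    # (u)_+^d equals u^d for u > 0 and zero otherwise (also when d = 0).
    B = np.where(diff > 0, diff ** d, 0.0)
    return A, B, knots


def reml_criterion(y, X, B, sigma2_a, sigma2_eps):
    """Evaluate l_R(sigma_a^2, sigma_eps^2; Y) and the GLS estimate of beta."""
    y = np.asarray(y, dtype=float)
    m = y.shape[0]
    V = sigma2_a * (B @ B.T) + sigma2_eps * np.eye(m)   # marginal variance of Y
    Vinv_X = np.linalg.solve(V, X)
    XtVinvX = X.T @ Vinv_X
    beta_hat = np.linalg.solve(XtVinvX, Vinv_X.T @ y)   # (X'V^-1 X)^-1 X'V^-1 Y
    resid = y - X @ beta_hat
    logdet_V = np.linalg.slogdet(V)[1]
    logdet_XVX = np.linalg.slogdet(XtVinvX)[1]
    quad = resid @ np.linalg.solve(V, resid)
    return -0.5 * (logdet_V + logdet_XVX + quad), beta_hat


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    t = np.sort(rng.uniform(size=50))
    y = np.sin(2 * np.pi * t) + rng.normal(scale=0.2, size=50)
    A, B, _ = spline_design(t, d=1, K=20)
    value, beta_hat = reml_criterion(y, A, B, sigma2_a=1.0, sigma2_eps=0.04)
    print(value, beta_hat)
```

In this sketch `X` is passed in directly, so covariates $S$ can be included by stacking `np.column_stack([S, A])`. An RLRT statistic would then compare the maximum of the criterion over both variance components with its maximum under $\sigma_a^2 = 0$; the optimization itself is left out here.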
As pointed out by Crainiceanu and Ruppert [7], under $H_0$ the LRT or RLRT statistic does not asymptotically follow a $0.5\chi_0^2 + 0.5\chi_1^2$ mixture chi-square distribution as suggested by Self and Liang [23] and Stram and Lee [25]. Instead, the LRT or RLRT statistic asymptotically follows a mixture of $\chi_0^2$ and $\chi_1^2$ with a much heavier mass on $\chi_0^2$. A simple and fast algorithm was also proposed to sample from the exact null distribution of the LRT or RLRT statistic, which is summarized as follows [7]:

Step 1: Generate a grid of $\lambda$ values where $0 = \lambda_1 < \lambda_2 < \cdots < \lambda_n$.

Step 2: Simulate $K$ independent random variables $w_1^2, \ldots, w_K^2$ from the $\chi_1^2$ distribution. Let $S_K = \sum_{s=1}^{K} w_s^2$.

Step 3: Independently simulate $X_{m,K,d} = \sum_{s=K+1}^{m-p-d-1} w_s^2$ with $w_s^2 \sim \chi_1^2$.

Step 4: When $h \neq 0$, independently simulate $X_h = \sum_{s=1}^{h} u_s^2$ with $u_s^2 \sim \chi_1^2$.

Step 5: For every grid point $\lambda_i$ calculate

$$N_m(\lambda_i) = \sum_{s=1}^{K} \frac{\lambda_i \mu_{s,m}}{1 + \lambda_i \mu_{s,m}} w_s^2, \qquad D_m(\lambda_i) = \sum_{s=1}^{K} \frac{w_s^2}{1 + \lambda_i \mu_{s,m}} + X_{m,K,d}.$$

Step 6: Obtain $\lambda_{\max}$ that maximizes $f_m(\lambda_i)$ over $\lambda_1, \ldots, \lambda_n$, where

$$f_m(\lambda) = m \log \left\{ 1 + \frac{N_m(\lambda)}{D_m(\lambda)} \right\} - \sum_{s=1}^{K} \log(1 + \lambda \zeta_{s,m}).$$

Step 7: Compute the LRT statistic

$$LRT_m = f_m(\lambda_{\max}) + m \log \left( 1 + \frac{X_h}{S_K + X_{m,K,d}} \right),$$

or $LRT_m = f_m(\lambda_{\max})$ if $h = 0$. For the case of RLRT, compute

$$RLRT_m = \sup_{\lambda \geq 0} \left[ (m - p - d - 1) \log \left\{ 1 + \frac{N_m(\lambda)}{D_m(\lambda)} \right\} - \sum_{s=1}^{K} \log(1 + \lambda \mu_{s,m}) \right].$$

Step 8: Repeat steps 2–7.

Here $\mu_{s,m}$ and $\zeta_{s,m}$ are defined to be the $K$ eigenvalues of the $K \times K$ matrices $Z^T P_0 Z$ and $Z^T Z$ respectively, where $P_0 = I_m - X(X^T X)^{-1} X^T$.

In a recent (unpublished) paper, Claeskens et al. [5] adapted the idea of Crainiceanu and Ruppert [7] and explored the advantages of wavelets over penalized splines for estimating nonparametric smooth functions in partially linear models for independent data.
Two asymptotic distribution theorems were developed for the test statistics proposed therein, and simulation results showed that the wavelet-based test performs better than the penalized spline based test in some situations. They also extended the wavelet based test to the case of simultaneously testing several polynomial covariate effects.

For testing generalized linear models with a single covariate $t$ for independent discrete data, Liu et al. [20] proposed three methods which are “based on the connection between smoothing spline models and Bayesian models”, assuming $f(t)$ in model (3.1) to have the following Bayesian expression

$$f(t) = \gamma_0 + \gamma_1 t + \cdots + \gamma_d t^d + \tau^{1/2} W(t),$$

where $\gamma_0, \gamma_1, \ldots, \gamma_d$ have a flat prior, and $W(t)$ is the $d$-order Wiener process. Under this Bayesian model, they extended the generalized maximum likelihood ratio (GML) test of Wahba [27] to test the adequacy of a generalized linear model, which is equivalent to testing $H_0: \tau = 0$. The test statistic of the GML test proposed by Liu et al. [20] is constructed as

$$t_{GML} = \frac{\sup_{\phi} L(0, \phi \mid y)}{\sup_{\tau, \phi} L(\tau, \phi \mid y)}, \tag{3.8}$$

where $L(\tau, \phi \mid y)$ denotes the marginal density of $y$ under this Bayesian model. Under the mixed model representation of a smoothing spline estimate of a nonparametric function, $t_{GML}$ is essentially an LRT.

One difficulty with the GML test is that there is no closed form expression for $L(\tau, \phi \mid y)$, and the test statistic can only be approximated numerically [20]. Secondly, it is nearly impossible to analytically derive the null distribution of the test statistic as its distribution depends on some unknown parameters. To overcome this difficulty, Liu et al. [20] suggested two approaches to approximating the exact null distribution of the test statistic.
One is the usual bootstrap procedure, which is computationally intensive. The other approach is the so-called empirical approximation method, which was considered superior to the bootstrap-based method.

It should be noted that the testing procedures based on the likelihood ratio were all proposed for models for independent data. Although conceptually they can be extended to SAMMs for longitudinal/clustered data, there are at least two major obstacles. First, the calculation of the likelihood is even more complicated under the alternative using the mixed model representation of a (penalized) smoothing spline estimate of a nonparametric function. Secondly, it may not be easy to extend the algorithm of Crainiceanu and Ruppert [7], originally proposed for simulating the exact distribution of the LRT in a partially linear model, to SAMMs or even LMMs for longitudinal/clustered data. More research is needed in this area.

3.3. Score tests

In generalized linear models, score tests have been used for testing overdispersion and heterogeneity of outcomes [10; 24]. Lin [19] extended score tests to GLMMs, proposing a global score test as well as individual score tests for the null hypotheses that all random-effect variance components are zero and that individual random-effect variance components are zero, respectively.

Zhang and Lin [30] considered the problem of testing whether the nonparametric function $f(t)$ in model (2.2) is a $d$-order polynomial.
They first estimated $f(t)$ by a $d$-order smoothing spline and expressed $f$ with a mixed effects representation, similar to the one in Section 3.2 for a penalized smoothing spline,

$$f = T\gamma + \Sigma a, \tag{3.9}$$

where $f = f(t^0)$, $t^0$ is the vector formed by the distinct $t_{ij}$, $T$ is a matrix formed by the zeroth to the $d$th order polynomials of $t^0$ with corresponding coefficients $\gamma$, $\Sigma$ is a smoothing matrix, and $a \sim N(0, \tau I)$. Note that this mixed effects representation is basically the same as the Bayesian expression presented in Section 3.2. Denote by $N$ the incidence matrix mapping $t^0$ to the $t_{ij}$, and define $X = (NT, S)$, $B = N\Sigma$. Then under the mixed effects representation (3.9), SAMM (2.2) becomes the following GLMM

$$g(\mu) = X\beta + Ba + Zb, \tag{3.10}$$

where $\beta = (\gamma^T, \delta^T)^T$ are the new fixed effects and $(a, b)$ are the new random effects.

As described in the earlier sections, testing whether $f(t)$ in SAMM (2.2) is a $d$-order polynomial is equivalent to testing $H_0: \tau = 0$ in the induced GLMM (3.10). Zhang and Lin [30] adapted the idea of Lin's [19] variance component score tests to test $H_0: \tau = 0$. However, they pointed out that the score tests proposed by Lin [19] for testing zero variance components in GLMMs cannot be used directly for testing $H_0: \tau = 0$. They proposed a scaled chi-squared approximation to the test statistic. Denote by $\psi = (\theta^T, \phi)$ the nuisance parameter vector, and by $\ell_M(\tau, \psi)$ the marginal log-likelihood function of $\tau$ and $\psi$ (obtained by integrating out the random effects $a$, $b$ and the fixed effects $\beta$).
Then under the induced GLMM (3.10), the score $U_\tau$ for testing $H_0: \tau = 0$ takes the following form

$$U_\tau(\hat\psi) = \left. \frac{\partial \ell_M(\tau, \psi; y)}{\partial \tau} \right|_{\tau = 0, \hat\psi} \approx \left. \frac{1}{2} \left\{ (Y - X\beta)^T V^{-1} N \Sigma N^T V^{-1} (Y - X\beta) - \mathrm{tr}(P N \Sigma N^T) \right\} \right|_{\hat\beta, \hat\psi}, \tag{3.11}$$

where $\hat\beta$ is the MLE of $\beta$ and $\hat\psi$ is the REML-type estimate of $\psi$ under the null GLMM (3.12) given next, and $Y = X\beta + Zb + \Delta(y - \mu)$ is the working vector from the null GLMM

$$g(\mu) = X\beta + Zb, \tag{3.12}$$

where $P = V^{-1} - V^{-1} X (X^T V^{-1} X)^{-1} X^T V^{-1}$, $V = W^{-1} + Z G Z^T$, $G = \mathrm{diag}\{D, \ldots, D\}$, $\Delta = \mathrm{diag}\{g'(\mu_{ij})\}$, $W = \mathrm{diag}\{w_{ij}\}$ and $w_{ij} = \{\phi \omega_{ij}^{-1} v(\mu_{ij}) [g'(\mu_{ij})]^2\}^{-1}$. Note that model (3.12) is the matrix representation of the original GLMM (2.1).

Because of the special structure of $\Sigma$, Zhang and Lin [30] found that the score $U_\tau(\hat\psi)$ does not follow an asymptotic normal distribution. Write $U_\tau(\psi)$ as $U_\tau(\psi) = U_\tau(y; \psi) - e(\psi)$, where $U_\tau(y; \psi)$ and $e(\psi)$ denote the first and the second terms of the above score, and define $\psi_0$ as the true value of $\psi$ under $H_0: \tau = 0$. Zhang and Lin [30] showed that the null distribution of $U_\tau(y; \psi_0)$ is approximately that of a weighted sum of chi-squared random variables and can be well approximated by a scaled chi-squared distribution. Since the expectation of $U_\tau(\psi)$ is an increasing function of $\tau$, larger values of $U_\tau(\hat\psi)$ give more evidence against $H_0$, which indicates that the score test should be one-sided.

Compared with the LRTs, one major advantage of using the score test statistic $U_\tau(y; \hat\psi)$ is its easy implementation, as it can be calculated directly by fitting a GLMM (under the null hypothesis) rather than a SAMM. In addition, the critical values can be directly approximated from the regular chi-square distribution.
Therefore, it is not necessary to derive the distribution of the test statistics under the null hypothesis as often required by the LRTs. Secondly, as SAMMs encompass a broad class of statistical models, the above score test can be applied in many situations, such as independent Gaussian data [6], clustered Gaussian or binary data, etc. For clustered data, the implementation of the LRTs can be very difficult as expensive computation is needed to approximate the null distribution of the test statistics.

The simulation results showed that the score test statistic above performs very well for Gaussian outcomes. It performs less well for binary data due to the poor approximation of the Laplace method in calculating the score statistic, but improves rapidly as the binomial denominator increases [30].

3.4. Residual based tests

Inspired by the idea of residual plots for checking the goodness-of-fit of regression models, Pan and Lin [22] recently introduced a graphical and numerical approach to assess the adequacy of GLMMs. These methods are “based on the cumulative sums of residuals over covariates or predicted values of the response variable” [22] and are further extensions of the work by Su and Wei [26] and Lin et al. [18]. Denote by $\mu_{ij}(\beta, \theta, \phi) = E(y_{ij})$ the marginal mean of $y_{ij}$ and define the residual $e_{ij}$ as $e_{ij} = y_{ij} - \hat\mu_{ij}$, where $\hat\mu_{ij} = \mu_{ij}(\hat\beta, \hat\theta, \hat\phi)$, and $\hat\beta$, $\hat\theta$, $\hat\phi$ are the estimates of the corresponding parameters under the original GLMM (2.1) or model (3.12) in matrix notation. Pan and Lin [22] then considered the following two classes of stochastic processes

$$W(x) = m^{-1/2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} I(x_{ij} \leq x) e_{ij}, \qquad W_g(r) = m^{-1/2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} I(\hat\mu_{ij} \leq r) e_{ij},$$

where $x = (x_1, \ldots, x_p)^T$, $r \in \mathbb{R}$, $I(x_{ij} \leq x) = I(x_{1ij} \leq x_1, \ldots, x_{pij} \leq x_p)$, and $x_{kij}$ is the $k$th component of $x_{ij}$.

Under the assumed GLMM, these stochastic processes converge in distribution to zero-mean Gaussian processes, which can be simulated through Monte Carlo techniques. Each observed cumulative-sum process $W(x)$ or $W_g(r)$ can then be compared, both visually and analytically, to a certain zero-mean Gaussian process. If the assumed GLMM is a reasonable model for the given data, the cumulative-sum processes would behave like white noise. Therefore, any abnormal departure of $W(x)$ or $W_g(r)$ from the zero-mean Gaussian processes would be an indication of model mis-specification.

The main advantage of this testing approach is that there is no need to specify the alternatives; therefore it can be used to test whether or not $f(t)$ in SAMM (2.2) can be adequately represented by a polynomial function. Nevertheless, this test may be less powerful than the other procedures specifically designed for testing $f(t)$.

Introduced by Fan and Huang [11], another residual based test is the so-called “adaptive Neyman test”. Although the test statistic is constructed in a completely different way, the basic idea is similar to the one described above, i.e. if a parametric model fits the data well, the residuals should fluctuate around 0. They focused on the classical nonparametric model, which is $y = f(x) + \epsilon$ with $\epsilon \sim N(0, \sigma^2)$. Under the null hypothesis $f(\cdot) = m(\cdot, \gamma)$ for some $\gamma$, where $m(\cdot, \gamma)$ belongs to a given parametric family, the resulting residuals are given as $\hat\epsilon_i = y_i - m(x_i, \hat\gamma)$, $i = 1, \ldots, n$, where $\hat\gamma$ is the estimate of $\gamma$ under the assumed model. Denote $\hat\epsilon = (\hat\epsilon_1, \ldots, \hat\epsilon_n)$; then $\hat\epsilon$ is nearly independently and normally distributed with mean vector $\eta = (\eta_1, \ldots, \eta_n)^T$, where $\eta_i = f(x_i) - m(x_i, \gamma_0)$ and $\gamma_0$ is the limit to which $\hat\gamma$ converges. Thus, the testing problem can be formulated as $H_0: \eta = 0$ versus $H_a: \eta \neq 0$. Fan and Huang [11] adopted the adaptive Neyman test for this testing problem. The adaptive Neyman test statistic is constructed based on the Fourier transform of the residuals $\hat\epsilon$, with its exact null distribution being generated through simulations.

As mentioned earlier, the adaptive Neyman test has only been studied in partially linear models. So, extending it to LMMs or GLMMs could potentially be a future research direction.

4. Comparison between the exact likelihood ratio and the score tests

In this paper, we provided an overview of the four types of testing approaches. Among them, likelihood ratio and score tests have been widely used in a variety of hypothesis testing problems. To our knowledge, however, no comparison between these two tests has been investigated for the current situation, i.e. testing a parametric covariate effect against a nonparametric covariate effect. Here, we conduct a small simulation study to evaluate and compare the performance of these two popular testing procedures. For illustration purposes, we consider testing the linearity of covariate effects under the partially linear model framework, i.e. whether $f(t)$ is a linear function of $t$ in model (3.4). Following the penalized spline approach, we formulate the exact LRT (named LRT1), the RLRT and the score test as variance component tests based on the mixed model representation (3.7) as discussed above.
In additon, for testing the same null hypo thesis, w e also form ulate the exact LR T in a diﬀerent way (na med as LR T2) by mo deling the a lter native through a quadra tic s pline. In the latter case, we a re testing whether f ( t ) is a ( d − h )-deg ree polynomia l of t with d = 2 and h = 1. Since no exa c t LR T or RLR T has b een developed for mixed mo dels for lo ngi- tudinal/clustered data, w e only consider pa r tially linear mo dels fo r independent data even thoug h Z hang and Lin’s [ 30 ] procedur e is applicable to mo re compli- cated models. Data in this simulation are g enerated from the following partia lly linear mo del y i = s i 1 β 1 + s i 2 β 2 + f ( t i ) + ǫ i , i = 1 , 2 , . . . , m where s i 1 is generated fr om N(0 , 0 . 3), s i 2 is generated from N(0 , 0 . 4), t i ’s are equally spaced distinct p oints in [0,1], and ǫ i ∼ N(0 , σ 2 ). The true v alues of β 1 and β 2 are set to be 1.3 and 0.45 resp ectively . The v alues of σ ar e 0.25 and 0.5, and the sample size m is taken to b e 50 and 100. A total of ﬁve diﬀere nt functions of f ( t ) are cons ider ed, i.e., f c ( t ) = (0 . 2 5 c ) t · exp (2 − 2 t ) − t + 0 . 5 , for c = (0 , 1 , 2 , 3 , 4) [ 30 ]. Note that when c = 0, f c ( t ) is a linear function of t and f c ( t ) deviates further fro m linearity w ith increasing c . W e apply the exact LR T1, LR T2, RLR T and the s core testing pro cedur es to eac h simulated data set. The sim ulation r esults are based on 1 000 Mon te Ca rlo sim ulation runs. F or testing the n ull hypothesis tha t f ( t ) is a linear function of t , the size and p ow er of e a ch testing pro cedure are calcula ted by setting c = 0 and c 6 = 0 resp ectively . When a p enalized spline is used to estimate f ( t ) as in the LR T or RLR T, the n um ber of k nots for the pena lized spline is set to be 20 . F or the score testing pr o cedure, the smoo thing matrix Σ is from a natural s mo othing spline. 
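The data-generating step described above can be sketched as follows. This is only an illustrative sketch in Python (standard library only), not the authors' simulation code. The function names and the seed are hypothetical, and reading $N(0, 0.3)$ and $N(0, 0.4)$ as mean and variance (rather than mean and standard deviation) is an assumption. The helper `mc_standard_error` is also an addition: it gives the usual binomial standard error of an empirical rejection rate computed from a given number of Monte Carlo runs.

```python
import math
import random


def f_c(t, c):
    """f_c(t) = (0.25 c) t exp(2 - 2t) - t + 0.5; linear in t when c = 0."""
    return 0.25 * c * t * math.exp(2.0 - 2.0 * t) - t + 0.5


def simulate_partially_linear(m, sigma, c, seed=0, beta1=1.3, beta2=0.45):
    """Draw one data set from y_i = s_i1*beta1 + s_i2*beta2 + f_c(t_i) + eps_i.

    Assumption: 0.3 and 0.4 are treated as variances, so the standard
    deviations passed to gauss() are their square roots.
    """
    rng = random.Random(seed)
    # Equally spaced distinct points in [0, 1].
    t = [i / (m - 1) for i in range(m)]
    s1 = [rng.gauss(0.0, math.sqrt(0.3)) for _ in range(m)]
    s2 = [rng.gauss(0.0, math.sqrt(0.4)) for _ in range(m)]
    y = [
        beta1 * a + beta2 * b + f_c(ti, c) + rng.gauss(0.0, sigma)
        for a, b, ti in zip(s1, s2, t)
    ]
    return y, s1, s2, t


def mc_standard_error(p, n_runs=1000):
    """Binomial standard error of an empirical rejection rate p over n_runs runs."""
    return math.sqrt(p * (1.0 - p) / n_runs)


# One simulated data set under a moderately nonlinear alternative (c = 2).
y, s1, s2, t = simulate_partially_linear(m=50, sigma=0.25, c=2)
```

With 1000 runs, a test whose true size is 0.05 has a Monte Carlo standard error of roughly 0.007, which gives a rough yardstick for judging how far an empirical size can drift from its nominal level by chance alone.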
The simulation results are presented in Table 1 ($m = 50$) and Table 2 ($m = 100$), where the nominal levels are set to be 0.05 and 0.1. Regarding the empirical size, our simulation results show that the exact LRT2, RLRT and the score test are all close to the nominal levels. The empirical size of the LRT1, however, stays unchanged even when the nominal level increases from 0.05 to 0.1. Overall, the increased sample size brings the empirical sizes of all these tests closer to the nominal levels, whereas the error noise seems to have little influence on them. With respect to power, all tests show decreased power as the error variance increases. As expected, the increased sample size improves the overall power. Note that the powers of the LRT1 are also unchanged as the nominal level increases, which implies that the simulated critical values for the LRT1 may not be accurate with a moderate number of Monte Carlo simulation runs. In general, our simulation indicates that the LRT2, RLRT and score test are more powerful than the LRT1, with the score test slightly outperforming the exact LRT2 and RLRT.

Table 1
Empirical sizes and powers of the four tests in testing the linearity of covariate effects in model (3.4), where m = 50

Nominal                   Size     Power
level     σ      Test     c = 0    c = 1    c = 2    c = 3    c = 4
0.05      0.25   LRT1     0.032    0.152    0.696    0.991    1.000
                 LRT2     0.049    0.419    0.935    0.999    1.000
                 RLRT     0.067    0.419    0.927    1.000    1.000
                 Score    0.066    0.443    0.948    1.000    1.000
          0.5    LRT1     0.066    0.094    0.224    0.473    0.782
                 LRT2     0.047    0.135    0.412    0.737    0.923
                 RLRT     0.050    0.123    0.404    0.720    0.915
                 Score    0.060    0.158    0.448    0.762    0.936
0.1       0.25   LRT1     0.032    0.152    0.696    0.991    1.000
                 LRT2     0.115    0.548    0.962    0.999    1.000
                 RLRT     0.138    0.545    0.970    0.999    1.000
                 Score    0.124    0.560    0.972    1.000    1.000
          0.5    LRT1     0.066    0.094    0.224    0.473    0.782
                 LRT2     0.093    0.230    0.545    0.838    0.961
                 RLRT     0.103    0.213    0.531    0.832    0.960
                 Score    0.104    0.242    0.565    0.859    0.970

Table 2
Empirical sizes and powers of the four tests in testing the linearity of covariate effects in model (3.4), where m = 100

Nominal                   Size     Power
level     σ      Test     c = 0    c = 1    c = 2    c = 3    c = 4
0.05      0.25   LRT1     0.044    0.217    0.950    1.000    1.000
                 LRT2     0.053    0.675    0.994    1.000    1.000
                 RLRT     0.052    0.661    0.995    1.000    1.000
                 Score    0.052    0.691    0.997    1.000    1.000
          0.5    LRT1     0.068    0.115    0.364    0.810    0.988
                 LRT2     0.059    0.240    0.681    0.956    0.999
                 RLRT     0.054    0.221    0.670    0.959    0.999
                 Score    0.062    0.249    0.697    0.963    0.999
0.1       0.25   LRT1     0.044    0.217    0.950    1.000    1.000
                 LRT2     0.109    0.778    0.998    1.000    1.000
                 RLRT     0.102    0.762    0.999    1.000    1.000
                 Score    0.107    0.779    1.000    1.000    1.000
          0.5    LRT1     0.068    0.115    0.364    0.810    0.988
                 LRT2     0.103    0.353    0.781    0.975    1.000
                 RLRT     0.112    0.336    0.777    0.982    1.000
                 Score    0.111    0.363    0.798    0.983    1.000

Compared with the likelihood-ratio-based tests, the score test has at least two main advantages. First, the exact LRT (LRT1 and LRT2) and RLRT are computationally much more intensive than the score test, as deriving the null distributions of the LRT and RLRT statistics requires simulation in each run. The computing time of the exact LRT and RLRT in this simulation is 50 times more than that of the score test. Secondly, the exact LRT and RLRT have not yet been developed for more complicated models such as LMMs and GLMMs, whereas the score testing procedure is flexible and can be adapted to many modeling situations. For simplicity, only the linearity test is considered in the current simulation; in practice, however, one might be interested in testing higher-order polynomial covariate effects (i.e. $d > 1$), which can be easily carried out by using a different $d$. Overall, we consider the score test a better choice than the LRT and RLRT.

5. Summary

We have reviewed the main developments of the four types of testing approaches used for testing a parametric covariate effect versus a nonparametric covariate effect. A considerable amount of work has been done with the LRTs under linear or generalized linear models. The likelihood-based tests perform very well for independent data in finite sample situations. However, these test statistics can be difficult to compute in a more complex model, as both the parametric and nonparametric models need to be estimated. In addition, deriving the null distributions of those test statistics can be challenging. Therefore, it is not straightforward to extend the existing LRTs or RLRTs to LMMs and GLMMs. Compared to the LRTs or RLRTs, the score statistics are easy to compute, usually show good performance and are applicable to both LMMs and GLMMs. Further study may be needed to investigate the properties of the score tests for small samples. The R tests are likelihood-ratio-based tests; hence they share the same advantages and disadvantages as the LRTs. The recently developed residual-based test [22] can be considered an omnibus test for detecting model mis-specification and can be used to test the adequacy of a polynomial covariate effect. Since no alternative models need to be specified, the residual-based test is applicable in many situations, including LMMs and GLMMs. However, it may be less powerful than the other testing procedures that are specifically designed for testing a particular covariate effect. Comparison of the residual-based tests with the score tests in mixed models could be of future interest.

Acknowledgements

The research of Daowen Zhang is partly supported by an NIH grant R01 CA85848-08.
I would like to thank the referee and the managing editor Wendy Martinez for many valuable suggestions that greatly improved the presentation of this paper.

References

[1] Azzalini, A. and Bowman, A. (1993). On the use of nonparametric regression for checking linear relationships. Journal of Royal Statistical Society - B 55, 549–557. MR1224417
[2] Breslow, N. E. and Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88, 9–25.
[3] Brumback, B., Ruppert, D. and Wand, M. P. (1999). Comment on `Variable selection and function estimation in additive nonparametric regression using data-based prior' by Shively, Kohn and Wood. Journal of the American Statistical Association 94, 794–797.
[4] Cantoni, E. and Hastie, T. (2002). Degrees-of-freedom tests for smoothing splines. Biometrika 89, 251–263. MR1913957
[5] Claeskens, G., Ding, H. and Jansen, M. (2007). Lack-of-fit tests in semiparametric mixed models. Available on web at www.econ.kuleuven.be/fetew/pdf_publicaties/KBI_0709.pdf
[6] Cox, D., Koh, E., Wahba, G. and Yandell, B. (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Annals of Statistics 21, 903–923. MR0924859
[7] Crainiceanu, C. M. and Ruppert, D. (2004). Likelihood ratio tests in linear mixed models with one variance component. Journal of Royal Statistical Society - B 66, 165–185. MR2035765
[8] Crainiceanu, C. M. and Ruppert, D. (2005). Exact likelihood ratio tests for penalized splines. Biometrika 92, 91–103. MR2158612
[9] Crainiceanu, C. M., Ruppert, D. and Vogelsang, T. J. (2003). Some properties of likelihood ratio tests in linear mixed models (unpublished).
[10] Dean, C. (1992). Testing for overdispersion in Poisson and binomial regression models. Journal of the American Statistical Association 87, 451–457.
[11] Fan, J. Q. and Huang, L. S. (2001). Goodness-of-fit tests for parametric regression models. Journal of the American Statistical Association 96, 640–652. MR1946431
[12] Gu, C. (1992). Penalized likelihood regression: a Bayesian analysis. Statistica Sinica 2, 255–264. MR1152308
[13] Härdle, W., Mammen, E. and Müller, M. (1998). Testing parametric versus semiparametric modeling in generalized linear models. Journal of the American Statistical Association 93, 1461–1474. MR1666641
[14] Harville, D. A. (1976). Extension of the Gauss-Markov theorem to include the estimation of random effects. Annals of Statistics 4, 384–395. MR0398007
[15] Hastie, T. and Tibshirani, R. (1990). Generalized additive models. Chapman & Hall, New York. MR1082147
[16] Laird, N. M. and Ware, J. H. (1982). Random effects models for longitudinal data. Biometrics 38, 963–974.
[17] Liang, H. (2006). Checking linearity of non-parametric component in partially linear models with an application in systemic inflammatory response syndrome study. Statistical Methods in Medical Research 15, 273–284. MR2227449
[18] Lin, D. Y., Wei, L. J. and Ying, Z. (2002). Model-checking techniques based on cumulative residuals. Biometrics 58, 1–12. MR1891037
[19] Lin, X. (1997). Variance component testing in generalized linear models with random effects. Biometrika 84, 309–326. MR1467049
[20] Liu, A., Meiring, W. and Wang, Y. (2004). Testing generalized linear models using smoothing spline methods. Statistica Sinica 15, 235–256. MR2125730
[21] Lombardia, M. J. and Sperlich, S. (2007). Semiparametric inference in generalized mixed effects models. http://ssrn.com/abstract=1010928.
[22] Pan, Z. and Lin, D. Y. (2005). Goodness-of-fit methods for generalized linear mixed models. Biometrics 61, 1000–1009. MR2216193
[23] Self, S. G. and Liang, K. Y. (1987). Asymptotic properties of maximum likelihood estimates and likelihood ratio tests under non-standard conditions. Journal of the American Statistical Association 82, 605–610. MR0898365
[24] Smith, P. J. and Heitjan, D. F. (1993). Testing and adjusting for departures from nominal dispersion in generalized linear models. Applied Statistics 41, 31–41.
[25] Stram, D. O. and Lee, J. W. (1994). Variance components testing in the longitudinal mixed effects model. Biometrics 50, 1171–1177.
[26] Su, J. Q. and Wei, L. J. (1991). A lack-of-fit test for the mean function in a generalized linear model. Journal of the American Statistical Association 86, 420–426. MR1137124
[27] Wahba, G. (1990). Spline models for observational data. CBMS-NSF regional conference series in applied mathematics, SIAM. MR1045442
[28] Zeger, S. L. and Karim, M. R. (1991). Generalized linear models with random effects: A Gibbs sampling approach. Journal of the American Statistical Association 86, 79–86. MR1137101
[29] Zhang, D. and Lin, X. (1998). Semiparametric stochastic mixed models for longitudinal data. Journal of the American Statistical Association 93, 710–719. MR1631369
[30] Zhang, D. and Lin, X. (2003). Hypothesis testing in semiparametric additive mixed models. Biostatistics 4, 57–74.
[31] Zhang, D. (2004). Generalized linear mixed models with varying coefficients for longitudinal data. Biometrics 60, 8–15. MR2043613
