A New Family of Covariate-Adjusted Response Adaptive Designs and their Asymptotic Properties

By Li-Xin Zhang and Feifang Hu^1
Zhejiang University and University of Virginia

^1 Li-Xin Zhang is Professor, Institute of Statistics and Department of Mathematics, Zhejiang University, Hangzhou, China. Feifang Hu is Professor, Department of Statistics, University of Virginia, Charlottesville, VA 22904-4135. The research was partially supported by NSF of China 10771192 (Li-Xin Zhang) and NSF Award DMS-0349048 of USA (Feifang Hu).

Abstract

It is often important to incorporate covariate information in the design of clinical trials. In the literature, there are many designs that use stratification and covariate-adaptive randomization to balance on certain known covariates. Recently, Zhang, Hu, Cheung and Chan (2007) proposed a family of covariate-adjusted response-adaptive (CARA) designs and studied their asymptotic properties. However, these CARA designs often have high variability. In this paper, we propose a new family of CARA designs. We show that the new designs have smaller variability and are therefore more efficient.

1 Introduction

Response-adaptive designs for clinical trials incorporate sequentially accruing response data into future allocation probabilities. A major objective of response-adaptive designs in clinical trials is to minimize the number of patients assigned to the inferior treatment, to a degree that still generates useful statistical inferences. The preliminary idea of response-adaptive randomization can be traced back to Thompson (1933) and Robbins (1952). Many response-adaptive designs have been proposed in the literature (e.g., Rosenberger and Lachin, 2002; Hu and Rosenberger, 2006).

Much recent work has focused on proposing better randomized adaptive designs. The three main components for evaluating a response-adaptive design are allocation proportion, efficiency (power), and variability. The issue of efficiency or power was discussed by Hu and Rosenberger (2003), who showed that the efficiency is a decreasing function of the variability induced by the randomization procedure for any given allocation proportion. Hu, Rosenberger and Zhang (2006) showed that there is an asymptotic lower bound on the variability of response-adaptive designs. A response-adaptive design that attains this lower bound is said to be first order efficient. More recently, Hu, Zhang and He (2008) proposed a new family of efficient randomized adaptive designs that can adapt to any desired allocation proportion. But all these studies are limited to designs that do not incorporate covariates.

In many clinical trials (Pocock and Simon, 1975; Taves, 1974), covariate information is available and has a strong influence on the responses of patients. For instance, the efficacy of a hypertensive drug is related to a patient's initial blood pressure and cholesterol level, whereas the effectiveness of a cancer treatment may depend on whether the patient is a smoker or a non-smoker. Covariate-adaptive designs have been proposed to balance covariates among treatment groups (see Pocock and Simon, 1975; Taves, 1974; and Zelen, 1974). Hu and Rosenberger (2006) defined a covariate-adjusted response-adaptive (CARA) design as a design that sequentially incorporates the history of accruing response and covariate data, as well as the observed covariate information of the incoming patient, into future allocation probabilities. In a CARA design, the assignment of a treatment depends on the history information and the covariate of the incoming patient.
This generates a certain level of technical complexity for studying the properties of the design. Zhang et al. (2007) made progress on CARA designs by proposing a class of CARA designs that allow a wide spectrum of applications to very general statistical models, and by obtaining asymptotic properties that provide a statistical basis for inference after using this kind of design. However, the CARA designs in Zhang et al. (2007) often have high variability and are therefore not efficient (Hu and Rosenberger, 2003). The major purpose of this paper is to study the variability and efficiency of CARA designs and to propose a new family of CARA designs with small variability.

The paper is organized as follows. In Section 2, the Fisher information and the best asymptotic variability are derived for a CARA design with any given target allocation proportion. We will find that the Fisher information and the variability depend on the distribution of each individual response, the target function and the distribution of the covariate. In Section 3, we propose a new CARA design that can target any allocation function and in which a parameter can be tuned so that the asymptotic variability approaches the best one. The design proposed by Zhang et al. (2007) is a special case of this new design and has the largest variability among all designs of this kind. The new design is also an extension of the doubly adaptive biased coin design (DBCD) proposed by Eisele and Woodroofe (1995) and Hu and Zhang (2004a). The technical proofs are given in the Appendix.

2 Variability and efficiency of CARA designs

2.1 General framework of CARA designs.

Consider a clinical trial with $K$ treatments. Suppose that a patient with covariate vector $\xi$ is assigned to treatment $k$, $k = 1, \ldots, K$, and the observed response is $Y_k$. Assume that the response $Y_k$ has a conditional distribution $f_k(y_k \mid \theta_k, \xi)$ given the covariate $\xi$. Here $\theta_k$, $k = 1, \ldots, K$, are unknown parameters, and $\Theta_k \subset \mathbb{R}^d$ is the parameter space of $\theta_k$.

In an adaptive design, we let $X_1, X_2, \ldots$ be the sequence of random treatment assignments. For the $m$-th subject, $X_m = (X_{m,1}, \ldots, X_{m,K})$ represents the assignment of treatment such that if the $m$-th subject is allocated to treatment $k$, then all elements in $X_m$ are 0 except for the $k$-th component, $X_{m,k}$, which is 1. Suppose that $\{Y_{m,k},\ k = 1, \ldots, K,\ m = 1, 2, \ldots\}$ denote the responses such that $Y_{m,k}$ is the response of the $m$-th subject to treatment $k$, $k = 1, \ldots, K$. In practical situations, only $Y_{m,k}$ with $X_{m,k} = 1$ is observed. Denote $Y_m = (Y_{m,1}, \ldots, Y_{m,K})$. Also, assume that covariate information is available in the clinical study. Let $\xi_m$ be the covariate of the $m$-th subject. We assume that $\{(Y_{m,1}, \ldots, Y_{m,K}, \xi_m),\ m = 1, 2, \ldots\}$ is a sequence of i.i.d. random vectors, the distributions of which are the same as that of $(Y_1, \ldots, Y_K, \xi)$. Further, let $\mathcal{X}_m = \sigma(X_1, \ldots, X_m)$, $\mathcal{Y}_m = \sigma(Y_1, \ldots, Y_m)$ and $\mathcal{Z}_m = \sigma(\xi_1, \ldots, \xi_m)$ be the sigma fields corresponding to the assignments, responses and covariates respectively, and let $\mathcal{F}_m = \sigma(\mathcal{X}_m, \mathcal{Y}_m, \mathcal{Z}_m)$ be the sigma field of the history.
A general covariate-adjusted response-adaptive (CARA) design is defined by
$$\psi_{m+1,k} = P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1}) = P(X_{m+1,k} = 1 \mid \mathcal{X}_m, \mathcal{Y}_m, \mathcal{Z}_{m+1}), \quad k = 1, \ldots, K,$$
the conditional probabilities of assigning treatments $1, \ldots, K$ to the $(m+1)$-th patient, conditioning on the entire history, including all previous $m$ assignments, responses, and covariate vectors, plus the covariate vector of the current patient.

2.2 CARA designs with a target.

Let $N_{m,k}$ be the number of subjects assigned to treatment $k$ in the first $m$ assignments and write $N_m = (N_{m,1}, \ldots, N_{m,K})$. Then $N_m = \sum_{i=1}^{m} X_i$. Further, let $N_{n,k|x} = \sum_{m=1}^{n} X_{m,k} I\{\xi_m = x\}$ be the number of subjects with covariate $x$ that are randomized to treatment $k$, $k = 1, \ldots, K$, in the $n$ trials, and let $N_n(x) = \sum_{m=1}^{n} I\{\xi_m = x\}$ be the total number of subjects with covariate $x$. Write $\theta = (\theta_1, \ldots, \theta_K)$. Because the value of $\theta$ and the covariate determine the distributions of the outcomes and, accordingly, the effects of each treatment, in many cases one would like to define a CARA design such that the "conditional" allocation proportion for a given covariate $x$ converges to a pre-specified proportion which is a function of $\theta$ and $x$. That is,
$$\frac{N_{n,k|x}}{N_n(x)} \to \pi_k(\theta, x), \quad k = 1, \ldots, K, \tag{2.1}$$
where $\pi_1(\theta, x), \ldots, \pi_K(\theta, x)$ are $K$ known functions. We call them target allocation functions. Examples for the choice of target functions are discussed in Zhang et al. (2007), Rosenberger et al. (2001), Rosenberger, Vidyashankar and Agarwal (2001) and Hu and Rosenberger (2006). Recently, Tymofyeyev, Rosenberger and Hu (2007) developed a general framework to obtain optimal allocation proportions for $K$-treatment clinical trials.
However, when $P(\xi = x) = 0$, for example in the continuous covariate case, the "conditional" allocation proportion $N_{n,k|x} / N_n(x)$ is not well-defined because both the numerator and the denominator are zero almost surely. Compared with (2.1), it is more meaningful to allocate each individual patient to treatment $k$ with a probability close to $\pi_k(\theta, x)$ for a given covariate $x$. So we consider a class of CARA designs with the property that
$$P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1} = x) \to \pi_k(\theta, x) \quad \text{a.s.} \tag{2.2}$$
The next theorem tells us that (2.2) implies (2.1). Write $\rho_k(\theta) = E\,\pi_k(\theta, \xi)$, $k = 1, \ldots, K$, $\rho(\theta) = (\rho_1(\theta), \ldots, \rho_K(\theta))$ and $\pi(\theta, x) = (\pi_1(\theta, x), \ldots, \pi_K(\theta, x))$.

Theorem 2.1 If (2.2) is satisfied, then
$$\frac{N_{n,k|x}}{N_n(x)} \to \pi_k(\theta, x) \quad \text{a.s. on the event } \{N_n(x) \to \infty\} \tag{2.3}$$
and
$$\frac{N_{n,k}}{n} \to \rho_k(\theta) \quad \text{a.s.} \tag{2.4}$$
Here, "$A$ a.s. on $B$" means that $P(B \setminus A) = 0$ for two events $A$ and $B$. Further, if the density of the covariate is positive at $x$, then
$$\lim_{r \searrow 0} \lim_{n \to \infty} \frac{N_{n,k|B(x,r)}}{N_n(B(x,r))} = \pi_k(\theta, x) \quad \text{a.s.,} \tag{2.5}$$
where $N_{n,k|B(x,r)} = \sum_{m=1}^{n} X_{m,k} I\{\xi_m \in B(x,r)\}$, $N_n(B(x,r)) = \sum_{m=1}^{n} I\{\xi_m \in B(x,r)\}$, and $B(x,r)$ is the ball with center $x$ and radius $r$.

Notice that when $P(\xi = x) = 0$, although the allocation proportion $N_{n,k|x} / N_n(x)$ is not well-defined, (2.3) is trivial because $P(N_n(x) \to \infty) = 0$. Strictly speaking, (2.3) makes sense only in the discrete covariate case, and (2.5) is a version of (2.3) for continuous covariates.

2.3 Variability and efficiency.

For response-adaptive designs which do not incorporate covariates, Hu, Rosenberger and Zhang (2006) found the lower bound of the asymptotic variability of a design, i.e., of the allocation proportions of the design. A design is called asymptotically efficient if its asymptotic variability attains the lower bound.
Next, we study the variability and efficiency of CARA designs. Suppose, given $\xi$, that the response $Y_k$ of a trial of treatment $k$ has a distribution in the exponential family, and takes the form
$$f_k(y_k \mid \xi, \theta_k) = \exp\big\{ (y_k \mu_k - a_k(\mu_k)) / \phi_k + b_k(y_k, \phi_k) \big\} \tag{2.6}$$
with link function $\mu_k = h_k(\xi \theta_k^T)$, where $\theta_k = (\theta_{k1}, \ldots, \theta_{kd})$, $k = 1, \ldots, K$, are coefficients. Assume that the scale parameter $\phi_k$ is fixed. It is easily checked that $E[Y_k \mid \xi] = a_k'(\mu_k)$, $\mathrm{Var}(Y_k \mid \xi) = a_k''(\mu_k) \phi_k$,
$$\frac{\partial \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k} = \frac{1}{\phi_k} \{ y_k - a_k'(\mu_k) \} h_k'(\xi \theta_k^T)\, \xi,$$
$$\frac{\partial^2 \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k^2} = \frac{1}{\phi_k} \Big\{ -a_k''(\mu_k) [h_k'(\xi \theta_k^T)]^2 + [y_k - a_k'(\mu_k)] h_k''(\xi \theta_k^T) \Big\} \xi^T \xi$$
and, given $\xi$, the conditional Fisher information matrix is
$$I_k(\theta_k \mid \xi) = -E\Big[ \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \,\Big|\, \xi \Big] = \frac{1}{\phi_k} a_k''(\mu_k) [h_k'(\xi \theta_k^T)]^2\, \xi^T \xi.$$
For the observations up to stage $n$, the likelihood function is
$$L(\theta) = \prod_{j=1}^{n} \prod_{k=1}^{K} [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} = \prod_{k=1}^{K} \prod_{j=1}^{n} [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} := \prod_{k=1}^{K} L_k(\theta_k) \tag{2.7}$$
with, up to terms that do not depend on $\theta_k$, $\log L_k(\theta_k) \propto \sum_{j=1}^{n} X_{j,k} \{ Y_{j,k} \mu_{j,k} - a_k(\mu_{j,k}) \}$, $\mu_{j,k} = h_k(\xi_j \theta_k^T)$, $k = 1, 2, \ldots, K$. Write
$$I_k = E[\pi_k(\theta, \xi) I_k(\theta_k \mid \xi)], \quad k = 1, \ldots, K. \tag{2.8}$$
Then
$$-E_\theta\Big[ \frac{\partial^2 \log L(\theta)}{\partial \theta_k^2} \Big] = \sum_{j=1}^{n} E_\theta[X_{j,k} I_k(\theta_k \mid \xi_j)] = n I_k + o(n).$$
It follows that the entire Fisher information matrix is
$$I_n(\theta) = -E_\theta\Big[ \frac{\partial^2 \log L(\theta)}{\partial \theta^2} \Big] = n\, \mathrm{diag}(I_1, \ldots, I_K) + o(n).$$
Thus we obtain the following theorem.

Theorem 2.2 Suppose the responses follow the generalized linear model (2.6) and the design satisfies (2.2). Let $I(\theta) = \mathrm{diag}(I_1, \ldots, I_K)$.
Then the Fisher information matrix satisfies $I_n(\theta) = n I(\theta) + o(n)$, and the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\theta$ is $I^{-1}(\theta)/n$.

The limit proportion $\rho(\theta) = (\rho_1(\theta), \ldots, \rho_K(\theta))$ depends on both the parameter $\theta$ and the distribution of $\xi$. When the distribution of $\xi$ is known, according to Theorem 2.2, the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\rho(\theta)$ is
$$\frac{1}{n} \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T.$$
If, instead, the parameter $\theta$ is known, then the nonparametric maximum likelihood estimator (MLE) of $\rho(\theta) = E[\pi(\theta, \xi)]$ is $\frac{1}{n} \sum_{m=1}^{n} \pi(\theta, \xi_m)$ and its variance-covariance matrix is $\mathrm{Var}\{\pi(\theta, \xi)\}/n$. So, in the general case where the parameter $\theta$ and the distribution of $\xi$ are both unknown, the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\rho(\theta)$ is $B(\theta)/n$, where
$$B(\theta) = \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T + \mathrm{Var}\{\pi(\theta, \xi)\}.$$
The allocation proportion $N_n/n$ in an adaptive design with property (2.2) will converge to $\rho(\theta)$ according to Theorem 2.1. So we can now define an asymptotically efficient CARA design as follows.

Definition 1 A covariate-adjusted response-adaptive design with target function $\pi(\theta, x)$ is called asymptotically efficient if it satisfies (2.2) and
$$n^{1/2} \big( N_n/n - \rho(\theta) \big) \xrightarrow{D} N\big(0, B(\theta)\big), \tag{2.9}$$
and $B(\theta)$ is called the best asymptotic variability.

Zhang, Hu, Cheung and Chan (2007) proposed a CARA design (we refer to it as ZHCC's design) by defining
$$P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1}) = \pi_k(\hat\theta_m, \xi_{m+1}),$$
where $\hat\theta_m$ is the MLE of $\theta$ based on the observations up to stage $m$.
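
To make the plug-in rule concrete, the following is a minimal sketch of a ZHCC-type allocation under strong simplifying assumptions that are not taken from the paper: two treatments, a single binary covariate, binary responses, smoothed stratum-wise success proportions standing in for the MLE, and a square-root target function. The names `target_pi1`, `estimate_theta` and `simulate_zhcc` are hypothetical, and treatments 1 and 2 are indexed 0 and 1 in the code.

```python
import random


def target_pi1(theta, x):
    """Hypothetical target pi_1(theta, x): square-root rule within stratum x.

    theta[k][x] is the success probability of treatment k (0 or 1)
    for covariate value x (0 or 1). This choice of target is an assumption.
    """
    s1 = theta[0][x] ** 0.5
    s2 = theta[1][x] ** 0.5
    return s1 / (s1 + s2)


def estimate_theta(successes, trials):
    """Smoothed stratum-wise success proportions.

    The +0.5 / +1.0 smoothing is an assumption that keeps every estimate
    strictly inside (0, 1), so the target is always defined.
    """
    return [[(successes[k][x] + 0.5) / (trials[k][x] + 1.0) for x in (0, 1)]
            for k in (0, 1)]


def simulate_zhcc(theta_true, n, p_x1=0.5, seed=0):
    """Simulate n patients; return trials[k][x], the allocation counts."""
    rng = random.Random(seed)
    successes = [[0, 0], [0, 0]]
    trials = [[0, 0], [0, 0]]
    for _ in range(n):
        # Covariate of the incoming patient.
        x = 1 if rng.random() < p_x1 else 0
        # Plug the current estimate into the target function.
        theta_hat = estimate_theta(successes, trials)
        p1 = target_pi1(theta_hat, x)
        # Randomize, then observe only the response on the assigned arm.
        k = 0 if rng.random() < p1 else 1
        y = 1 if rng.random() < theta_true[k][x] else 0
        trials[k][x] += 1
        successes[k][x] += y
    return trials
```

For example, `simulate_zhcc([[0.7, 0.4], [0.5, 0.6]], n=500)` returns the per-stratum allocation counts, from which the conditional proportions in (2.1) can be inspected.
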
It has been shown that ZHCC's design satisfies (2.2) and
$$n^{1/2} \big( N_n/n - \rho(\theta) \big) \xrightarrow{D} N\big(0, \Sigma(\theta)\big),$$
where
$$\Sigma(\theta) = 2 \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T + \mathrm{diag}(\rho(\theta)) - \big(\rho(\theta)\big)^T \rho(\theta).$$
It is easily seen that
$$\mathrm{diag}(\rho(\theta)) - \big(\rho(\theta)\big)^T \rho(\theta) = \mathrm{Var}\{\pi(\theta, \xi)\} + E\big[ \mathrm{diag}(\pi(\theta, \xi)) - (\pi(\theta, \xi))^T \pi(\theta, \xi) \big] \ge \mathrm{Var}\{\pi(\theta, \xi)\},$$
where $A \ge B$ means that $A - B$ is non-negative definite. Hence, ZHCC's design is not asymptotically efficient.

It is of interest to find an asymptotically efficient CARA design for any given target function $\pi(\theta, x)$. In the next section, we propose a new class of CARA designs whose asymptotic variability can approach the best one.

3 Covariate-adjusted DBCD

Our new design is based on the idea of the doubly adaptive biased coin design (DBCD) proposed by Eisele and Woodroofe (1995) and extended by Hu and Zhang (2004a). In the scenario without covariates, Hu and Zhang's extension can target any desired allocation and can approach the lower bound of the asymptotic variability by tuning a parameter. In this section, we modify the DBCD to incorporate covariates. For simplicity, we only consider the two-treatment case ($K = 2$).

Covariate-adjusted DBCD (CADBCD): To start, we let $\theta_0$ be an initial estimate of $\theta$, and assign $m_0$ subjects to each treatment by using a restricted randomization. Assume that $m$ ($m \ge 2 m_0$) subjects have been assigned to treatments. Their responses $\{Y_j, j = 1, \ldots, m\}$ and the corresponding covariates $\{\xi_j, j = 1, \ldots, m\}$ are observed. We let $\hat\theta_m = (\hat\theta_{m,1}, \hat\theta_{m,2})$ be an estimate of $\theta = (\theta_1, \theta_2)$. Here, for each $k = 1, 2$, $\hat\theta_{m,k} = \hat\theta_{m,k}(Y_{j,k}, \xi_j : X_{j,k} = 1, j = 1, \ldots, m)$ is the estimator of $\theta_k$ based on the observed sample of size $N_{m,k}$, $\{(Y_{j,k}, \xi_j) : X_{j,k} = 1,\ j = 1, \ldots, m\}$.
Write $\hat\rho_m = \frac{1}{m} \sum_{i=1}^{m} \pi_1(\hat\theta_m, \xi_i)$ and $\hat\pi_m = \pi_1(\hat\theta_m, \xi_{m+1})$. Next, when the $(m+1)$-th subject is ready for randomization and the corresponding covariate $\xi_{m+1}$ is recorded, we assign the patient to treatment 1 with probability
$$\psi_{m+1,1} = \frac{\hat\pi_m \Big( \dfrac{\hat\rho_m}{N_{m,1}/m} \Big)^{\gamma}}{\hat\pi_m \Big( \dfrac{\hat\rho_m}{N_{m,1}/m} \Big)^{\gamma} + (1 - \hat\pi_m) \Big( \dfrac{1 - \hat\rho_m}{1 - N_{m,1}/m} \Big)^{\gamma}} \tag{3.10}$$
and to treatment 2 with probability $\psi_{m+1,2} = 1 - \psi_{m+1,1}$, where $\gamma \ge 0$ is a constant that controls the degree of randomness of the procedure, from most random when $\gamma = 0$ to deterministic when $\gamma = \infty$. ZHCC's design is the special case of the CADBCD with $\gamma = 0$.

Asymptotic properties. To study the asymptotic properties, we assume that the target allocation function $\pi_1(\theta^*, x)$ satisfies the following condition.

Condition A We assume that the parameter space $\Theta_k$ is a bounded domain in $\mathbb{R}^d$, and that the true value $\theta_k$ is an interior point of $\Theta_k$, $k = 1, 2$.

1. For each fixed $x$, $0 < \pi_1(\theta^*, x) < 1$ is a continuous function of $\theta^*$ in the closure of $\Theta_1 \times \Theta_2$.

2. $\pi_1(\theta^*, \xi)$ is twice differentiable with respect to $\theta^*$, and the expectations of $\| \partial \pi_1(\theta, \xi) / \partial \theta \|^2$ and $\sup_{\|\theta^* - \theta\| \le \delta} \| \partial^2 \pi_1(\theta^*, \xi) / \partial \theta^2 \|$ are finite for some $\delta > 0$.

Write $v = E[\pi_1(\theta, \xi)]$; then $0 < v < 1$ by Condition A.1.

Theorem 3.1 Suppose that for $k = 1, 2$,
$$\hat\theta_{nk} - \theta_k = \frac{1}{n} \sum_{m=1}^{n} X_{m,k} h_k(Y_{m,k}, \xi_m) \big( 1 + o(1) \big) + o(n^{-1/2}) \quad \text{a.s.,} \tag{3.11}$$
where the $h_k$ are functions with $E[h_k(Y_k, \xi) \mid \xi] = 0$. We also assume that $E\| h_k(Y_k, \xi) \|^2 < \infty$, $k = 1, 2$. Then under Condition A, we have
$$P(X_{n,1} = 1) \to v; \qquad P(X_{n,1} = 1 \mid \mathcal{F}_{n-1}, \xi_n = x) \to \pi_1(\theta, x) \quad \text{a.s.} \tag{3.12}$$
and
$$\frac{N_{n,1}}{n} - v = O\Big( \sqrt{\frac{\log\log n}{n}} \Big) \ \text{a.s.;} \qquad \hat\theta_n - \theta = O\Big( \sqrt{\frac{\log\log n}{n}} \Big) \ \text{a.s.} \tag{3.13}$$
Further, let $V_k = E\{ \pi_k(\theta, \xi) (h_k(Y_k, \xi))^T h_k(Y_k, \xi) \}$, $k = 1, 2$, $V = \mathrm{diag}(V_1, V_2)$,
$$\sigma_1^2 = E[\pi_1(\theta, \xi)(1 - \pi_1(\theta, \xi))], \qquad \sigma_2^2 = \mathrm{Var}\{\pi_1(\theta, \xi)\}, \qquad \sigma_3^2 = E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta}\, V\, \Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T,$$
$$\lambda = \frac{\gamma \sigma_1^2}{v(1 - v)} \qquad \text{and} \qquad \sigma^2 = \frac{\sigma_1^2 + \sigma_3^2}{1 + 2\lambda} + \sigma_2^2 + \sigma_3^2.$$
Then,
$$\sqrt{n}\, (N_{n,1}/n - v) \xrightarrow{D} N(0, \sigma^2) \qquad \text{and} \qquad \sqrt{n}\, (\hat\theta_n - \theta) \xrightarrow{D} N(0, V). \tag{3.14}$$

The proof of this theorem is somewhat involved and is given in the Appendix. According to (3.12), the CADBCD satisfies (2.2). The asymptotic variability $\sigma^2$ of the design ranges from its maximum $2\sigma_3^2 + v(1 - v)$ when $\gamma = 0$ down to the limiting minimum $\sigma_2^2 + \sigma_3^2$ as $\gamma \to \infty$; the value at $\gamma = 0$ follows from $\sigma_1^2 + \sigma_2^2 = v(1 - v)$. The next result for the generalized linear model is a corollary of Theorem 3.1. The proof is given in the Appendix through the verification of Condition (3.11).

Corollary 3.1 Suppose the distributions of the responses follow the generalized linear model (2.6) and satisfy the following regularity condition:
$$H(\delta) =: E_\theta\Big[ \sup_{\|z\| \le \delta} \Big\| \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \Big|_{\theta_k}^{\theta_k + z} \Big\| \Big] \to 0 \quad \text{as } \delta \to 0, \tag{3.15}$$
where $f(x)\big|_a^b = f(b) - f(a)$. Under Condition A, if the matrices $I_1$ and $I_2$ defined as in (2.8) are nonsingular and the MLE $\hat\theta_m$, which maximizes the likelihood function (2.7), is unique, then we have (3.12), (3.13), and (3.14) with $V = I^{-1}(\theta)$ and $I(\theta) = \mathrm{diag}(I_1, I_2)$.

By Definition 1, $B(\theta) = \sigma_2^2 + \sigma_3^2$ is the best asymptotic variability of CARA designs with two treatments. For the CADBCD,
$$\sigma^2 = \frac{\sigma_1^2 + \sigma_3^2}{1 + \dfrac{2\gamma \sigma_1^2}{v(1 - v)}} + B(\theta) > B(\theta), \qquad \text{but } \sigma^2 \searrow B(\theta) \text{ as } \gamma \nearrow \infty.$$
This means that the CADBCD is not asymptotically efficient, but it can approach an asymptotically efficient CARA design if $\gamma$ is chosen large. ZHCC's design is the special case of the CADBCD that has the largest variability.
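
The allocation rule (3.10) and the variance formula of Theorem 3.1 are simple enough to evaluate directly. The sketch below is illustrative only: the function names `cadbcd_prob` and `asymptotic_variance` are hypothetical, the inputs $\hat\pi_m$, $\hat\rho_m$, $N_{m,1}/m$ and the $\sigma_i^2$ are assumed to be supplied by the user, and the current proportion is assumed to lie strictly between 0 and 1 (which holds once each arm has received at least one subject).

```python
def cadbcd_prob(pi_hat, rho_hat, prop1, gamma):
    """Probability of assigning the next patient to treatment 1, as in (3.10).

    pi_hat  : pi_1(theta_hat_m, xi_{m+1}), target for the incoming covariate
    rho_hat : average estimated target over the first m covariates
    prop1   : current proportion N_{m,1}/m on treatment 1
    gamma   : tuning constant, gamma >= 0
    """
    if not (0.0 < prop1 < 1.0 and 0.0 < rho_hat < 1.0):
        raise ValueError("prop1 and rho_hat must lie strictly in (0, 1)")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    up = pi_hat * (rho_hat / prop1) ** gamma
    down = (1.0 - pi_hat) * ((1.0 - rho_hat) / (1.0 - prop1)) ** gamma
    return up / (up + down)


def asymptotic_variance(sigma1_sq, sigma2_sq, sigma3_sq, v, gamma):
    """sigma^2 from Theorem 3.1 as a function of gamma."""
    lam = gamma * sigma1_sq / (v * (1.0 - v))
    return (sigma1_sq + sigma3_sq) / (1.0 + 2.0 * lam) + sigma2_sq + sigma3_sq


# Usage with made-up inputs: treatment 1 is over-represented (0.60 > 0.50),
# so the rule pushes the probability below the raw target 0.55.
p = cadbcd_prob(pi_hat=0.55, rho_hat=0.50, prop1=0.60, gamma=2.0)
```

With $\gamma = 0$ the function returns $\hat\pi_m$ itself, which is ZHCC's rule; increasing `gamma` makes the correction toward the target stronger and moves `asymptotic_variance` toward $B(\theta) = \sigma_2^2 + \sigma_3^2$.
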
4 Concluding Remarks

We have proposed a family of covariate-adjusted response-adaptive designs that are fully randomized and whose asymptotic variability can be brought arbitrarily close to the best one by choosing $\gamma$ large. The CADBCD can be viewed as a generalization of Hu and Zhang's doubly adaptive biased coin design (Hu and Zhang, 2004a) that incorporates covariate information. The asymptotic properties derived here provide the theoretical foundation for inference based on the CADBCD.

In this paper, we have assumed that the responses in each treatment group are available without delay. In practice, there is no logistical difficulty in incorporating delayed responses into the CADBCD, provided that some responses become available during the course of the allocation in the experiment, so that the estimates can be updated whenever new data become available. For clinical trials with uniform (or exponential) patient entry and exponential response times (see Bai, Hu and Rosenberger (2002), Hu and Zhang (2004b) and Zhang et al. (2006) for examples), it is easy to verify the theoretical results in Sections 2 and 3.

5 Appendix: Proofs

Proof of Theorem 2.1. Notice that $E[X_{m+1,k} \mid \mathcal{F}_m] = E[\psi_{m+1,k} \mid \mathcal{F}_m] \to \rho_k(\theta)$ by (2.2), and that $\{ \sum_{m=1}^{n} (X_{m,k} - E[X_{m,k} \mid \mathcal{F}_{m-1}]),\ \mathcal{F}_n \}$ is a martingale. (2.4) follows immediately. For (2.3), let $\mathcal{G}_m = \sigma(\mathcal{F}_m, \xi_{m+1})$. Then $\{ \sum_{m=1}^{n} (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\},\ \mathcal{G}_m \}$ is a martingale with
$$\sum_{m=1}^{n} E\Big[ \big\{ (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\} \big\}^2 \,\Big|\, \mathcal{G}_{m-1} \Big] \le N_n(x).$$
It follows that
$$\frac{\sum_{m=1}^{n} (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\}}{N_n(x)} \to 0 \quad \text{a.s. on } \{N_n(x) \to \infty\}$$
by Theorem 3.3.10 of Stout (1974). On the other hand,
$$\frac{\sum_{m=1}^{n} (E[X_{m,k} \mid \mathcal{G}_{m-1}] - \pi_k(\theta, x)) I\{\xi_m = x\}}{N_n(x)} \to 0 \quad \text{a.s. on } \{N_n(x) \to \infty\}$$
by (2.2). So (2.3) is proved. For (2.5), notice that
$$\frac{N_n(B(x,r))}{n} \to P\{\xi \in B(x,r)\} > 0 \quad \text{a.s.}$$
With a similar argument we have
$$\lim_{n \to \infty} \frac{N_{n,k|B(x,r)}}{N_n(B(x,r))} = \lim_{n \to \infty} \frac{\sum_{m=1}^{n} \pi_k(\theta, \xi_m) I\{\xi_m \in B(x,r)\}}{N_n(B(x,r))} = \frac{E[\pi_k(\theta, \xi) I\{\xi \in B(x,r)\}]}{P\{\xi \in B(x,r)\}} \quad \text{a.s.}$$
Letting $r \searrow 0$ yields (2.5).

Proof of Theorem 3.1. The proof is somewhat long and is completed in four steps.

Step 1. We show that (3.13) holds and that
$$\hat\rho_m = v + O\big( \sqrt{\log\log m / m} \big) \quad \text{a.s.} \tag{5.16}$$
Write $\pi_1 = \pi_1(\theta, \xi)$ for short. Let $M_{n,1} = \sum_{m=1}^{n} (X_{m,1} - E[X_{m,1} \mid \mathcal{F}_{m-1}, \xi_m])$, $M_{n,2} = \sum_{m=1}^{n} (\pi_1(\theta, \xi_m) - E\pi_1)$, $Q_{n,k} = \sum_{m=1}^{n} X_{m,k} h_k(Y_{m,k}, \xi_m)$ for $k = 1, 2$, $Q_n = (Q_{n,1}, Q_{n,2})$ and $M_{n,3} = Q_n \big( E \frac{\partial \pi_1}{\partial \theta} \big)^T$. Then $Q_n$ and $M_{n,j}$, $j = 1, 2, 3$, are martingales. According to the law of the iterated logarithm (LIL) for martingales, we have
$$Q_n = O\big( \sqrt{n \log\log n} \big) \quad \text{and} \quad M_{n,j} = O\big( \sqrt{n \log\log n} \big) \quad \text{a.s., } j = 1, 2, 3. \tag{5.17}$$
Hence, by (3.11), which expresses $\hat\theta_m - \theta$ through $Q_m / m$, it is easily shown that
$$\hat\theta_m - \theta = O\big( \sqrt{\log\log m / m} \big) \quad \text{a.s.} \tag{5.18}$$
It follows that
$$\hat\pi_m = \pi_1(\hat\theta_m, \xi_{m+1}) = \pi_1(\theta, \xi_{m+1}) + (\hat\theta_m - \theta) \Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} \Big)^T + O(1) \|\hat\theta_m - \theta\|^2 \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| \tag{5.19}$$
$$= \pi_1(\theta, \xi_{m+1}) + (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T + (\hat\theta_m - \theta) \Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} - E\frac{\partial \pi_1}{\partial \theta} \Big)^T + O(1) \frac{\log\log m}{m} \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| \quad \text{a.s.}$$
It is easily shown that
$$\sum_{m=1}^{n} (\hat\theta_m - \theta) \Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} - E\frac{\partial \pi_1}{\partial \theta} \Big)^T = o\big( (\log n)^2 \big) \quad \text{a.s.}$$
and
$$\sum_{m=1}^{n} \frac{\log\log m}{m} \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| = o\big( (\log n)^2 \big) \quad \text{a.s.}$$
It follows that
$$\sum_{m=1}^{n} \hat\pi_m = \sum_{m=1}^{n} \pi_1(\theta, \xi_{m+1}) + \sum_{m=1}^{n} (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T + o\big( (\log n)^2 \big) \quad \text{a.s.} \tag{5.20}$$
Similarly,
$$\hat\rho_m = \frac{1}{m} \sum_{i=1}^{m} \pi_1(\hat\theta_m, \xi_i) = \frac{1}{m} \sum_{i=1}^{m} \pi_1(\theta, \xi_i) + (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + (\hat\theta_m - \theta) \frac{1}{m} \sum_{i=1}^{m} \Big( \frac{\partial \pi_1(\theta, \xi_i)}{\partial \theta} - E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + O(1) \|\hat\theta_m - \theta\|^2 \frac{1}{m} \sum_{i=1}^{m} \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_i)}{\partial \theta^2} \Big\| \tag{5.21}$$
$$= \frac{1}{m} \sum_{i=1}^{m} \pi_1(\theta, \xi_i) + (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + O\Big( \frac{\log\log m}{m} \Big). \tag{5.22}$$
It follows that
$$\hat\rho_m = v + \frac{1}{m} \sum_{i=1}^{m} [\pi_1(\theta, \xi_i) - E\pi_1] + O\big( \sqrt{\log\log m / m} \big) = v + O\big( \sqrt{\log\log m / m} \big) \quad \text{a.s.}$$
and
$$\sum_{m=1}^{n} \hat\pi_m = nv + O\big( \sqrt{n \log\log n} \big) \quad \text{a.s.}$$
Now, write
$$g(\pi, a, b) = \frac{\pi (b/a)^{\gamma}}{\pi (b/a)^{\gamma} + (1 - \pi) \big( (1 - b)/(1 - a) \big)^{\gamma}}. \tag{5.23}$$
Then $\psi_{m+1,1} = g(\hat\pi_m, N_{m,1}/m, \hat\rho_m)$. It is easily seen that $g(\pi, a, b)$ is a non-decreasing function of $b$, and so $g(\pi, a, b) \le g(\pi, a, a) = \pi$ if $a \ge b$. Let $l_n = \max\{ m \le n : N_{m,1}/m \le \hat\rho_m \}$; then $\psi_{m+1,1} \le \hat\pi_m$ when $m \ge l_n + 1$. Hence
$$N_{n,1} = N_{l_n+1,1} + M_{n,1} - M_{l_n+1,1} + \sum_{m=l_n+1}^{n-1} \psi_{m+1,1}$$
$$\le 1 + N_{l_n,1} + M_{n,1} - M_{l_n+1,1} + \sum_{m=l_n+1}^{n-1} \hat\pi_m$$
$$\le 1 + l_n \hat\rho_{l_n} + M_{n,1} - M_{l_n+1,1} + \sum_{m=1}^{n-1} \hat\pi_m - \sum_{m=1}^{l_n} \hat\pi_m \tag{5.24}$$
$$\le nv + O\big( \sqrt{n \log\log n} \big) \quad \text{a.s.}$$
Similarly, $n - N_{n,1} \le n(1 - v) + O\big( \sqrt{n \log\log n} \big)$ a.s. (3.13) and (5.16) are now proved.

Step 2. We show (3.12) and the asymptotic normality of $\hat\theta_n$. By (3.13) and (5.16), $\hat\rho_n / (N_{n,1}/n) \to 1$ a.s., and hence $\psi_{m,1} - \pi_1(\hat\theta_{m-1}, \xi_m) \to 0$ a.s. This proves (3.12). Then it is easily checked that $Q_n$ is a martingale with
$$\frac{1}{n} \sum_{m=1}^{n} E\big[ (\Delta Q_m)^T \Delta Q_m \big] = \frac{1}{n} \sum_{m=1}^{n} \mathrm{diag}\Big( E\big[ \psi_{m,1} h_1(Y_{m,1}, \xi_m)^T h_1(Y_{m,1}, \xi_m) \big],\ E\big[ \psi_{m,2} h_2(Y_{m,2}, \xi_m)^T h_2(Y_{m,2}, \xi_m) \big] \Big) \to V.$$
So, applying the central limit theorem for martingales yields
$$n^{1/2} (\hat\theta_n - \theta) \xrightarrow{D} N(0, V).$$
The proof of Step 2 is completed.

Step 3.
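We show that
$$\psi_{m+1,1} = \hat\pi_m - \gamma \frac{\hat\pi_m (1 - \hat\pi_m)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + O\Big( \frac{\log\log m}{m} \Big) \quad \text{a.s.} \tag{5.25}$$
Let $g(\pi, a, b)$ be defined as in (5.23). By an elementary argument, it can be shown that
$$\sup_{0 \le \pi \le 1} \Big| g(\pi, a, b) - \pi + \gamma \frac{\pi(1 - \pi)}{v(1 - v)} (a - b) \Big| = O\big( (a - v)^2 + (b - v)^2 \big) \tag{5.26}$$
as $(a, b) \to (v, v)$. By (3.13) and (5.16), it follows that
$$\sup_{0 \le \pi \le 1} \Big| g(\pi, N_{m,1}/m, \hat\rho_m) - \pi + \gamma \frac{\pi(1 - \pi)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) \Big| = O\Big( \frac{\log\log m}{m} \Big) \quad \text{a.s.}$$
(5.25) is now proved.

Step 4. Finally, we show the asymptotic normality of $N_n$. Notice that $N_{m,1}/m - \hat\rho_m = O\big( \sqrt{\log\log m / m} \big)$ a.s. With the same argument as in deriving (5.20), we can show that
$$\sum_{m=1}^{n} \frac{\hat\pi_m (1 - \hat\pi_m)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) = \sum_{m=1}^{n} \frac{E[\pi_1(1 - \pi_1)]}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o\big( (\log n)^2 \big) \quad \text{a.s.}$$
By (5.25), and recalling that $\lambda = \gamma \sigma_1^2 / (v(1 - v))$, it follows that
$$\sum_{m=1}^{n} \psi_{m,1} = \sum_{m=1}^{n} \pi_1(\theta, \xi_m) + \sum_{m=0}^{n-1} (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o\big( (\log n)^2 \big) \quad \text{a.s.}$$
Then
$$N_{n,1} - nv = M_{n,1} + \sum_{m=1}^{n} \psi_{m,1} - nv$$
$$= M_{n,1} + M_{n,2} + \sum_{m=0}^{n-1} (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o\big( (\log n)^2 \big)$$
$$= M_{n,1} + M_{n,2} + \lambda \sum_{m=1}^{n-1} \frac{M_{m,2}}{m} + (\lambda + 1) \sum_{m=0}^{n-1} (\hat\theta_m - \theta) \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - v \Big) + o\big( (\log n)^2 \big) \quad \text{a.s.}$$
$$= M_{n,1} + \Big( M_{n,2} + \lambda \sum_{m=1}^{n-1} \frac{M_{m,2}}{m} \Big) + (1 + o(1)) \Big( (\lambda + 1) \sum_{m=1}^{n-1} \frac{M_{m,3}}{m} \Big) - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - v \Big) + o(n^{1/2}) \quad \text{a.s.}$$
On the other hand,
$$E[\Delta M_{m,i} \Delta M_{m,j} \mid \mathcal{F}_{m-1}] = 0, \quad i \ne j,$$
$$E[(\Delta M_{m,1})^2 \mid \mathcal{F}_{m-1}] = E[\psi_{m,1}(1 - \psi_{m,1}) \mid \mathcal{F}_{m-1}] \to \sigma_1^2 \quad \text{a.s.,}$$
$$E[(\Delta M_{m,2})^2 \mid \mathcal{F}_{m-1}] = \mathrm{Var}[\pi_1(\theta, \xi_m)] = \sigma_2^2$$
and
$$E[(\Delta M_{m,3})^2 \mid \mathcal{F}_{m-1}] = E\frac{\partial \pi_1}{\partial \theta}\, E[(\Delta Q_m)^T \Delta Q_m \mid \mathcal{F}_{m-1}] \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T \to \sigma_3^2 \quad \text{a.s.}$$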
By applying the functional central limit theorem (cf. Corollary 3.1 of Hall and Heyde, 1980), we have
$$n^{-1/2} \big( M_{[nt],1}, M_{[nt],2}, M_{[nt],3} \big) \xrightarrow{D} \big( \sigma_1 B_t^{(1)}, \sigma_2 B_t^{(2)}, \sigma_3 B_t^{(3)} \big),$$
where $B_t^{(i)}$, $i = 1, 2, 3$, are three independent standard Brownian motions. Then, with the same argument as in Hu and Zhang (2004a), one can show that
$$n^{-1/2} (N_{[nt],1} - [nt] v) \xrightarrow{D} G_t,$$
where
$$G_t = \sigma_1 t^{-\lambda} \int_0^t x^{\lambda}\, dB_x^{(1)} + \sigma_2 B_t^{(2)} + (\lambda + 1) \sigma_3 t^{-\lambda} \int_0^t x^{\lambda - 1} B_x^{(3)}\, dx$$
is a solution of the equation
$$G_t = \sigma_1 B_t^{(1)} + \sigma_2 \Big( B_t^{(2)} + \lambda \int_0^t \frac{B_x^{(2)}}{x}\, dx \Big) + (\lambda + 1) \sigma_3 \int_0^t \frac{B_x^{(3)}}{x}\, dx - \lambda \int_0^t \frac{G_x}{x}\, dx$$
with $G_0 = 0$. It is easily checked that
$$\mathrm{Var}(G_t) = t \Big[ \frac{\sigma_1^2}{1 + 2\lambda} + \sigma_2^2 + \frac{2(\lambda + 1)}{1 + 2\lambda} \sigma_3^2 \Big] = t \Big[ \frac{\sigma_1^2 + \sigma_3^2}{1 + 2\lambda} + \sigma_2^2 + \sigma_3^2 \Big].$$
Hence
$$n^{1/2} (N_{n,1}/n - v) \xrightarrow{D} N(0, \sigma^2).$$

Proof of Corollary 3.1. It is sufficient to show the strong consistency of the MLE $\hat\theta_m$:
$$\hat\theta_n \to \theta \quad \text{a.s.} \tag{5.27}$$
In fact, if (5.27) is proved, then by (5.19) and (5.21) we have $\hat\rho_n \to v$ a.s. and $\frac{1}{n} \sum_{m=1}^{n} \hat\pi_m \to v$ a.s. By (5.24) we then have $N_{n,1}/n \to v$ a.s. It follows that $\psi_{m,k} - \pi_k(\hat\theta_{m-1}, \xi_m) \to 0$ a.s. by (5.26). The rest of the proof is similar to that of Corollary 3.1 of Zhang et al. (2007).

For (5.27), it suffices to show that, for any $\delta > 0$ small enough, with probability one for $m$ large enough we have
$$\log L_k(\theta_k^*) < \log L_k(\theta_k) \quad \text{if } \|\theta_k^* - \theta_k\| = \delta. \tag{5.28}$$
We consider the case $k = 1$ only. The application of Taylor's theorem yields
$$\frac{1}{m} \log L_1(\theta_1^*) - \frac{1}{m} \log L_1(\theta_1) = (\theta_1^* - \theta_1) \frac{1}{m} \frac{\partial \log L_1}{\partial \theta_1} \Big|_{\theta_1} + (\theta_1^* - \theta_1) \frac{1}{m} \frac{\partial^2 \log L_1}{\partial \theta_1^2} \Big|_{\theta_1} (\theta_1^* - \theta_1)^T + (\theta_1^* - \theta_1) \Big\{ \frac{1}{m} \int_0^1 \Big[ \frac{\partial^2 \log L_1}{\partial \theta_1^2} \Big|_{\theta_1}^{\theta_1 + t(\theta_1^* - \theta_1)} \Big] dt \Big\} (\theta_1^* - \theta_1)^T.$$
Write
$$f(a, b, z, \xi) = \frac{\pi_1(z, \xi) \big( \frac{b}{a} \big)^{\gamma}}{\pi_1(z, \xi) \big( \frac{b}{a} \big)^{\gamma} + (1 - \pi_1(z, \xi)) \big( \frac{1 - b}{1 - a} \big)^{\gamma}}.$$
It is obvious that $f$ is a continuous function of $a$, $b$ and $z$ for each given $\xi$. By applying the law of large numbers for martingales, one can show that
$$\frac{1}{m} \frac{\partial \log L_1}{\partial \theta_1} \Big|_{\theta_1} \to 0 \quad \text{a.s.}$$
and
$$\frac{\partial^2 \log L_1}{\partial \theta_1^2} \Big|_{\theta_1} = -\sum_{j=2}^{m} \Big[ E\big[ f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big] \Big]_{a = \frac{N_{j-1,1}}{j-1},\ b = \hat\rho_{j-1},\ z = \hat\theta_{j-1}} + o(m) \quad \text{a.s.}$$
For the details of the proof, one can refer to Zhang et al. (2007). Further, it is obvious that
$$\limsup \hat\rho_m \le \lim \frac{1}{m} \sum_{j=1}^{m} \sup_\theta \pi_1(\theta, \xi_j) = E\big[ \sup_\theta \pi_1(\theta, \xi) \big] < 1 \quad \text{a.s.,}$$
where the supremum is taken over the parameter space. Similarly,
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=1}^{n} \hat\pi_m \le E\big[ \sup_\theta \pi_1(\theta, \xi) \big] < 1 \quad \text{a.s.}$$
By (5.24), $\limsup N_{n,1}/n \le E[\sup_\theta \pi_1(\theta, \xi)] < 1$ a.s. By considering $1 - \hat\rho_m$ and $n - N_{n,1}$ instead of $\hat\rho_m$ and $N_{n,1}$ respectively, we have $\liminf \hat\rho_m \ge E[\inf_\theta \pi_1(\theta, \xi)] > 0$ and $\liminf N_{n,1}/n \ge E[\inf_\theta \pi_1(\theta, \xi)] > 0$ a.s. So we may assume that $\hat\rho_m, N_{n,1}/n \in [\delta_0, 1 - \delta_0]$ for some $0 < \delta_0 < 1$. On the other hand, it is obvious that $y\, E[f(a, b, z, \xi) I_1(\theta_1 \mid \xi)]\, y^T$ is a continuous function of $a, b, y, z$, and is positive for all $0 < a, b < 1$, $y \ne 0$ and all $z$. It follows that there is a constant $c_0 > 0$ for which
$$\liminf_{j \to \infty} \min_{y : \|y\| = 1} \Big[ y\, E\big[ f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big]\, y^T \Big]_{a = \frac{N_{j-1,1}}{j-1},\ b = \hat\rho_{j-1},\ z = \hat\theta_{j-1}} > c_0 \quad \text{a.s.}$$
So with probability one, for $m$ large enough it holds that
$$\frac{1}{m} \log L_1(\theta_1^*) - \frac{1}{m} \log L_1(\theta_1) \le -\|\theta_1^* - \theta_1\|^2 \Big\{ \frac{1}{m} \sum_{j=2}^{m} \min_{y : \|y\| = 1} \Big[ y\, E\big[ f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big]\, y^T \Big]_{a = \frac{N_{j-1,1}}{j-1},\ b = \hat\rho_{j-1},\ z = \hat\theta_{j-1}} \Big\} + \|\theta_1^* - \theta_1\|^2 H(\|\theta_1^* - \theta_1\|) + o(1)$$
$$\le -c_0 \delta^2 + \delta^2 H(\delta) + o(1) < 0$$
uniformly in $\theta_1^*$ with $\|\theta_1^* - \theta_1\| = \delta$ when $\delta$ is small enough. (5.28) is proved.

References

[1] Bai, Z. D., Hu, F. and Rosenberger, W. F. (2002). Asymptotic properties of adaptive designs with delayed response. Annals of Statistics, 30, 122–139.

[2] Eisele, J. and Woodroofe, M. (1995). Central limit theorems for doubly adaptive biased coin designs. Annals of Statistics, 23, 234–254.

[3] Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and its Applications, Academic Press, London.

[4] Hayre, L. S. (1979). Two-population sequential tests with three hypotheses. Biometrika, 66, 465–474.

[5] Hu, F. and Rosenberger, W. F. (2003). Evaluating response-adaptive randomization procedures for treatment comparisons. Journal of the American Statistical Association, 98, 671–678.

[6] Hu, F. and Rosenberger, W. F. (2006). The Theory of Response-Adaptive Randomization in Clinical Trials, John Wiley and Sons, Inc., New York.

[7] Hu, F., Rosenberger, W. F. and Zhang, L.-X. (2006). Asymptotically best response-adaptive randomization procedures. Journal of Statistical Planning and Inference, 136, 1911–1922.

[8] Hu, F. and Zhang, L.-X. (2004a). Asymptotic properties of doubly adaptive biased coin designs for multitreatment clinical trials. Annals of Statistics, 32, 268–301.

[9] Hu, F. and Zhang, L.-X. (2004b). Asymptotic normality of urn models for clinical trials with delayed response. Bernoulli, 10(3), 447–463.

[10] Hu, F., Zhang, L.-X. and He, X. (2008). Efficient randomized adaptive designs. Annals of Statistics, to appear.

[11] Pocock, S. J. and Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 31, 103–115.

[12] Robbins, H. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58, 527–535.

[13] Rosenberger, W. F. and Lachin, J. M. (2002). Randomization in Clinical Trials: Theory and Practice, John Wiley and Sons, Inc., New York.

[14] Rosenberger, W. F., Stallard, N., Ivanova, A., Harper, C. and Ricks, M. (2001). Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.

[15] Rosenberger, W. F., Vidyashankar, A. N. and Agarwal, D. K. (2001). Covariate-adjusted response-adaptive designs for binary response. Journal of Biopharmaceutical Statistics, 11, 227–236.

[16] Taves, D. R. (1974). Minimization: a new method of assigning patients to treatment and control groups. Clinical Pharmacology and Therapeutics, 15, 443–453.

[17] Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in the view of the evidence of the two samples. Biometrika, 25, 275–294.

[18] Tymofyeyev, Y., Rosenberger, W. F. and Hu, F. (2007). Implementing optimal allocation in sequential binary response experiments. Journal of the American Statistical Association, 102, 224–234.

[19] Zelen, M. (1974). The randomization and stratification of patients to clinical trials. Journal of Chronic Diseases, 27, 365–375.

[20] Zhang, L.-X., Chan, W. S., Cheung, S. H. and Hu, F. (2006). A generalized urn model for clinical trials with delayed responses. Statistica Sinica, 17, 387–409.

[21] Zhang, L.-X., Hu, F., Cheung, S. H. and Chan, W. S. (2007). Asymptotic properties of covariate-adjusted adaptive designs. Annals of Statistics, 35, 1166–1182.
