A New Family of Covariate-Adjusted Response Adaptive Designs and their Asymptotic Properties

It is often important to incorporate covariate information in the design of clinical trials. In the literature, there are many designs that use stratification and covariate-adaptive randomization to balance on certain known covariates. Recently Zhang, H…

Authors: Li-Xin Zhang, Feifang Hu

By Li-Xin Zhang and Feifang Hu¹
Zhejiang University and University of Virginia

¹ Li-Xin Zhang is Professor, Institute of Statistics and Department of Mathematics, Zhejiang University, Hangzhou, China. Feifang Hu is Professor, Department of Statistics, University of Virginia, Charlottesville, VA 22904-4135. The research was partially supported by NSF of China 10771192 (Li-Xin Zhang) and NSF Award DMS-0349048 of USA (Feifang Hu).

Abstract

It is often important to incorporate covariate information in the design of clinical trials. In the literature, there are many designs that use stratification and covariate-adaptive randomization to balance on certain known covariates. Recently, Zhang, Hu, Cheung and Chan (2007) proposed a family of covariate-adjusted response-adaptive (CARA) designs and studied their asymptotic properties. However, these CARA designs often have high variabilities. In this paper, we propose a new family of CARA designs. We show that the new designs have smaller variabilities and are therefore more efficient.

1 Introduction

Response-adaptive designs for clinical trials incorporate sequentially accruing response data into future allocation probabilities. A major objective of response-adaptive designs in clinical trials is to minimize the number of patients assigned to the inferior treatment to a degree that still generates useful statistical inferences. The preliminary idea of response-adaptive randomization can be traced back to Thompson (1933) and Robbins (1952). Many response-adaptive designs have been proposed in the literature (e.g., Rosenberger and Lachin, 2002; Hu and Rosenberger, 2006). Much recent work has focused on proposing better randomized adaptive designs. The three main components for evaluating a response-adaptive design are allocation proportion, efficiency (power), and variability. The issue of efficiency or power was discussed by Hu and Rosenberger (2003), who showed that the efficiency is a decreasing function of the variability induced by the randomization procedure for any given allocation proportion. Hu, Rosenberger and Zhang (2006) showed that there is an asymptotic lower bound on the variability of response-adaptive designs. A response-adaptive design that attains this lower bound is said to be first-order efficient. More recently, Hu, Zhang and He (2008) proposed a new family of efficient randomized adaptive designs that can adapt to any desired allocation proportion. But all these studies are limited to designs that do not incorporate covariates.

In many clinical trials (Pocock and Simon, 1975; Taves, 1974), covariate information is available and has a strong influence on the responses of patients. For instance, the efficacy of a hypertensive drug is related to a patient's initial blood pressure and cholesterol level, whereas the effectiveness of a cancer treatment may depend on whether the patient is a smoker or a non-smoker. Covariate-adaptive designs have been proposed to balance covariates among treatment groups (see Pocock and Simon, 1975; Taves, 1974; and Zelen, 1974). Hu and Rosenberger (2006) defined a covariate-adjusted response-adaptive (CARA) design as a design that sequentially incorporates the history of accruing response data and covariates, as well as the observed covariate information of the incoming patient, into future allocation probabilities. In a CARA design, the assignment of a treatment depends on the history information and the covariate of the incoming patient.
This generates a certain level of technical complexity for studying the properties of the design. Zhang et al. (2007) achieved limited success on CARA designs by proposing a class of CARA designs that allows a wide spectrum of applications to very general statistical models, and by obtaining asymptotic properties that provide a statistical basis for inference after using this kind of design. However, the CARA designs in Zhang et al. (2007) often have high variabilities and are therefore not efficient (Hu and Rosenberger, 2003).

The major purpose of this paper is to study the variability and efficiency of CARA designs and to propose a new family of CARA designs with small variabilities. The paper is organized as follows. In Section 2, the Fisher information and the best asymptotic variability are derived for a CARA design with any given target allocation proportion. We will find that the Fisher information and the variability depend on the distribution of each individual response, the target function and the distribution of the covariate. In Section 3, we propose a new CARA design that can target any allocation function and in which a parameter can be tuned so that the asymptotic variability approaches the best one. The design proposed by Zhang et al. (2007) is a special case of this new design and has the largest variability among designs of this kind. The new design is also an extension of the doubly adaptive biased coin design (DBCD) proposed by Eisele and Woodroofe (1995) and Hu and Zhang (2004a). The technical proofs are given in the Appendix.

2 Variability and efficiency of CARA designs

2.1 General framework of CARA designs.

Consider a clinical trial with $K$ treatments. Suppose that a patient with covariate vector $\xi$ is assigned to treatment $k$, $k = 1, \ldots, K$, and that the observed response is $Y_k$. Assume that the response $Y_k$ has conditional distribution $f_k(y_k \mid \theta_k, \xi)$ given the covariate $\xi$. Here $\theta_k$, $k = 1, \ldots, K$, are unknown parameters, and $\Theta_k \subset \mathbb{R}^d$ is the parameter space of $\theta_k$.

In an adaptive design, we let $X_1, X_2, \ldots$ be the sequence of random treatment assignments. For the $m$-th subject, $X_m = (X_{m,1}, \ldots, X_{m,K})$ represents the assignment of treatment such that if the $m$-th subject is allocated to treatment $k$, then all elements of $X_m$ are 0 except for the $k$-th component, $X_{m,k}$, which is 1. Suppose that $\{Y_{m,k}, k = 1, \ldots, K, m = 1, 2, \ldots\}$ denote the responses, such that $Y_{m,k}$ is the response of the $m$-th subject to treatment $k$, $k = 1, \ldots, K$. In practical situations, only $Y_{m,k}$ with $X_{m,k} = 1$ is observed. Denote $Y_m = (Y_{m,1}, \ldots, Y_{m,K})$. Also, assume that covariate information is available in the clinical study. Let $\xi_m$ be the covariate of the $m$-th subject. We assume that $\{(Y_{m,1}, \ldots, Y_{m,K}, \xi_m), m = 1, 2, \ldots\}$ is a sequence of i.i.d. random vectors, the distributions of which are the same as that of $(Y_1, \ldots, Y_K, \xi)$. Further, let $\mathcal{X}_m = \sigma(X_1, \ldots, X_m)$, $\mathcal{Y}_m = \sigma(Y_1, \ldots, Y_m)$ and $\mathcal{Z}_m = \sigma(\xi_1, \ldots, \xi_m)$ be the sigma fields corresponding to the assignments, responses and covariates respectively, and let $\mathcal{F}_m = \sigma(\mathcal{X}_m, \mathcal{Y}_m, \mathcal{Z}_m)$ be the sigma field of the history.
A general covariate-adjusted response-adaptive (CARA) design is defined by
$$\psi_{m+1,k} = P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1}) = P(X_{m+1,k} = 1 \mid \mathcal{X}_m, \mathcal{Y}_m, \mathcal{Z}_{m+1}), \quad k = 1, \ldots, K,$$
the conditional probabilities of assigning treatments $1, \ldots, K$ to the $(m+1)$-th patient, conditioning on the entire history, including the information of all previous $m$ assignments, responses and covariate vectors, plus the information of the current patient's covariate vector.

2.2 CARA designs with a target.

Let $N_{m,k}$ be the number of subjects assigned to treatment $k$ in the first $m$ assignments and write $N_m = (N_{m,1}, \ldots, N_{m,K})$. Then $N_m = \sum_{i=1}^m X_i$. Further, let $N_{n,k|x} = \sum_{m=1}^n X_{m,k} I\{\xi_m = x\}$ be the number of subjects with covariate $x$ that are randomized to treatment $k$, $k = 1, \ldots, K$, in the $n$ trials, and let $N_n(x) = \sum_{m=1}^n I\{\xi_m = x\}$ be the total number of subjects with covariate $x$. Write $\theta = (\theta_1, \ldots, \theta_K)$. Because the value of $\theta$ and the covariate determine the distributions of the outcomes and, accordingly, the effects of each treatment, in many cases one would like to define a CARA design such that the "conditional" allocation proportion for a given covariate $x$ converges to a pre-specified proportion which is a function of $\theta$ and $x$. That is,
$$\frac{N_{n,k|x}}{N_n(x)} \to \pi_k(\theta, x), \quad k = 1, \ldots, K, \tag{2.1}$$
where $\pi_1(\theta, x), \ldots, \pi_K(\theta, x)$ are $K$ known functions. We call them target allocation functions. Examples for the choice of target functions are discussed in Zhang et al. (2007), Rosenberger et al. (2001), Rosenberger, Vidyashankar and Agarwal (2001) and Hu and Rosenberger (2006). Recently, Tymofyeyev, Rosenberger and Hu (2007) developed a general framework to obtain optimal allocation proportions for $K$-treatment clinical trials.
However, when $P(\xi = x) = 0$, for example in the continuous covariate case, the "conditional" allocation proportion $N_{n,k|x}/N_n(x)$ is not well-defined because both the numerator and the denominator are zero almost surely. Compared with (2.1), it is more meaningful to allocate each individual patient to treatment $k$ with a probability close to $\pi_k(\theta, x)$ for a given covariate $x$. So we consider a class of CARA designs with the property that
$$P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1} = x) \to \pi_k(\theta, x) \quad a.s. \tag{2.2}$$
The next theorem tells us that (2.2) implies (2.1). Write $\rho_k(\theta) = E\pi_k(\theta, \xi)$, $k = 1, \ldots, K$, $\rho(\theta) = (\rho_1(\theta), \ldots, \rho_K(\theta))$ and $\pi(\theta, x) = (\pi_1(\theta, x), \ldots, \pi_K(\theta, x))$.

Theorem 2.1 If (2.2) is satisfied, then
$$\frac{N_{n,k|x}}{N_n(x)} \to \pi_k(\theta, x) \quad a.s. \text{ on the event } \{N_n(x) \to \infty\} \tag{2.3}$$
and
$$\frac{N_{n,k}}{n} \to \rho_k(\theta) \quad a.s. \tag{2.4}$$
Here, "$A$ a.s. on $B$" means that $P(B \setminus A) = 0$ for two events $A$ and $B$. Further, if the density of the covariate is positive at $x$, then
$$\lim_{r \searrow 0} \lim_{n \to \infty} \frac{N_{n,k|B(x,r)}}{N_n(B(x,r))} = \pi_k(\theta, x) \quad a.s., \tag{2.5}$$
where $N_{n,k|B(x,r)} = \sum_{m=1}^n X_{m,k} I\{\xi_m \in B(x,r)\}$, $N_n(B(x,r)) = \sum_{m=1}^n I\{\xi_m \in B(x,r)\}$, and $B(x,r)$ is the ball with center $x$ and radius $r$.

Notice that when $P(\xi = x) = 0$, although the allocation proportion $N_{n,k|x}/N_n(x)$ is not well-defined, (2.3) is trivial because $P(N_n(x) \to \infty) = 0$. Strictly speaking, (2.3) makes sense only in the discrete covariate case, and (2.5) is a version of (2.3) for continuous covariates.

2.3 Variability and efficiency.

For response-adaptive designs which do not incorporate covariates, Hu, Rosenberger and Zhang (2006) found the lower bound of the asymptotic variability of a design, i.e., of the allocation proportions of the design.
A design is called asymptotically efficient if its asymptotic variability attains the lower bound. Next, we study the variability and efficiency of CARA designs.

Suppose, given $\xi$, that the response $Y_k$ of a trial of treatment $k$ has a distribution in the exponential family, and takes the form
$$f_k(y_k \mid \xi, \theta_k) = \exp\big\{ (y_k \mu_k - a_k(\mu_k))/\phi_k + b_k(y_k, \phi_k) \big\} \tag{2.6}$$
with link function $\mu_k = h_k(\xi \theta_k^T)$, where $\theta_k = (\theta_{k1}, \ldots, \theta_{kd})$, $k = 1, \ldots, K$, are coefficients. Assume that the scale parameter $\phi_k$ is fixed. It is easily checked that $E[Y_k \mid \xi] = a_k'(\mu_k)$, $\mathrm{Var}(Y_k \mid \xi) = a_k''(\mu_k)\phi_k$,
$$\frac{\partial \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k} = \frac{1}{\phi_k}\{y_k - a_k'(\mu_k)\} h_k'(\xi \theta_k^T)\, \xi,$$
$$\frac{\partial^2 \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k^2} = \frac{1}{\phi_k}\Big\{ -a_k''(\mu_k)[h_k'(\xi \theta_k^T)]^2 + [y_k - a_k'(\mu_k)] h_k''(\xi \theta_k^T) \Big\} \xi^T \xi$$
and, given $\xi$, the conditional Fisher information matrix is
$$I_k(\theta_k \mid \xi) = -E\Big[ \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \,\Big|\, \xi \Big] = \frac{1}{\phi_k} a_k''(\mu_k)[h_k'(\xi \theta_k^T)]^2 \xi^T \xi.$$
For the observations up to stage $n$, the likelihood function is
$$L(\theta) = \prod_{j=1}^n \prod_{k=1}^K [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} = \prod_{k=1}^K \prod_{j=1}^n [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} := \prod_{k=1}^K L_k(\theta_k) \tag{2.7}$$
with $\log L_k(\theta_k) \propto \sum_{j=1}^n X_{j,k}\{Y_{j,k}\mu_{j,k} - a_k(\mu_{j,k})\}$, $\mu_{j,k} = h_k(\xi_j \theta_k^T)$, $k = 1, 2, \ldots, K$. Write
$$I_k = E[\pi_k(\theta, \xi) I_k(\theta_k \mid \xi)], \quad k = 1, \ldots, K. \tag{2.8}$$
Then
$$-E_\theta\Big[ \frac{\partial^2 \log L(\theta)}{\partial \theta_k^2} \Big] = \sum_{j=1}^n E_\theta[X_{j,k} I_k(\theta_k \mid \xi_j)] = n I_k + o(n).$$
It follows that the entire Fisher information matrix is
$$I_n(\theta) = -E_\theta\Big[ \frac{\partial^2 \log L(\theta)}{\partial \theta^2} \Big] = n\, \mathrm{diag}(I_1, \ldots, I_K) + o(n).$$
Thus we obtain the following theorem.

Theorem 2.2 Suppose the responses follow the generalized linear model (2.6) and the design satisfies (2.2). Let $I(\theta) = \mathrm{diag}(I_1, \ldots, I_K)$.
Then the Fisher information matrix satisfies
$$I_n(\theta) = n I(\theta) + o(n),$$
and the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\theta$ is $I^{-1}(\theta)/n$.

The limit proportion $\rho(\theta) = (\rho_1(\theta), \ldots, \rho_K(\theta))$ depends on both the parameter $\theta$ and the distribution of $\xi$. When the distribution of $\xi$ is known, according to Theorem 2.2, the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\rho(\theta)$ is
$$\frac{1}{n} \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T.$$
If instead the parameter $\theta$ is known, then the nonparametric maximum likelihood estimator (MLE) of $\rho(\theta) = E[\pi(\theta, \xi)]$ is $\frac{1}{n}\sum_{m=1}^n \pi(\theta, \xi_m)$ and its variance-covariance matrix is $\mathrm{Var}\{\pi(\theta, \xi)\}/n$. So, in the general case where the parameter $\theta$ and the distribution of $\xi$ are both unknown, the asymptotic variance-covariance matrix of an asymptotically efficient estimator of $\rho(\theta)$ is $B(\theta)/n$, where
$$B(\theta) = \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T + \mathrm{Var}\{\pi(\theta, \xi)\}.$$
The allocation proportion $N_n/n$ in an adaptive design with property (2.2) will converge to $\rho(\theta)$ according to Theorem 2.1. So we can now define an asymptotically efficient CARA design as follows.

Definition 1 A covariate-adjusted response-adaptive design with target function $\pi(\theta, x)$ is called asymptotically efficient if it satisfies (2.2) and
$$n^{1/2}\big( N_n/n - \rho(\theta) \big) \xrightarrow{D} N\big( 0, B(\theta) \big), \tag{2.9}$$
and $B(\theta)$ is called the best asymptotic variability.

Zhang, Hu, Cheung and Chan (2007) proposed a CARA design (we refer to it as ZHCC's design) by defining
$$P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1}) = \pi_k(\hat\theta_m, \xi_{m+1}),$$
where $\hat\theta_m$ is the MLE of $\theta$ based on the observations up to stage $m$.
It has been shown that ZHCC's design satisfies (2.2) and
$$n^{1/2}\big( N_n/n - \rho(\theta) \big) \xrightarrow{D} N\big( 0, \Sigma(\theta) \big),$$
where
$$\Sigma(\theta) = 2 \frac{\partial \rho(\theta)}{\partial \theta} I^{-1}(\theta) \Big( \frac{\partial \rho(\theta)}{\partial \theta} \Big)^T + \mathrm{diag}(\rho(\theta)) - (\rho(\theta))^T \rho(\theta).$$
It is easily seen that
$$\mathrm{diag}(\rho(\theta)) - (\rho(\theta))^T \rho(\theta) = \mathrm{Var}\{\pi(\theta, \xi)\} + E\big[ \mathrm{diag}(\pi(\theta, \xi)) - (\pi(\theta, \xi))^T \pi(\theta, \xi) \big] \ge \mathrm{Var}\{\pi(\theta, \xi)\},$$
where $A \ge B$ means that $A - B$ is non-negative definite. Hence, ZHCC's design is not asymptotically efficient. It is of significance to find an asymptotically efficient CARA design for any given target function $\pi(\theta, x)$. In the next section, we propose a new class of CARA designs whose asymptotic variability can approach the best one.

3 Covariate-adjusted DBCD

Our new design is based on the idea of the doubly adaptive biased coin design (DBCD) proposed by Eisele and Woodroofe (1995) and extended by Hu and Zhang (2004a). In the scenario without covariates, Hu and Zhang's extension can target any desired allocation and can approach the lower bound of the asymptotic variability by tuning a parameter. In this section, we modify the DBCD to incorporate covariates. For simplicity, we only consider the two-treatment case ($K = 2$).

Covariate-adjusted DBCD (CADBCD): To start, we let $\theta_0$ be an initial estimate of $\theta$, and assign $m_0$ subjects to each treatment by using a restricted randomization. Assume that $m$ ($m \ge 2m_0$) subjects have been assigned to treatments. Their responses $\{Y_j, j = 1, \ldots, m\}$ and the corresponding covariates $\{\xi_j, j = 1, \ldots, m\}$ are observed. We let $\hat\theta_m = (\hat\theta_{m,1}, \hat\theta_{m,2})$ be an estimate of $\theta = (\theta_1, \theta_2)$. Here, for each $k = 1, 2$, $\hat\theta_{m,k} = \hat\theta_{m,k}(Y_{j,k}, \xi_j : X_{j,k} = 1, j = 1, \ldots, m)$ is the estimator of $\theta_k$ that is based on the observed sample of size $N_{m,k}$, $\{(Y_{j,k}, \xi_j) : X_{j,k} = 1, j = 1, \ldots, m\}$.
Write $\hat\rho_m = \frac{1}{m}\sum_{i=1}^m \pi_1(\hat\theta_m, \xi_i)$ and $\hat\pi_m = \pi_1(\hat\theta_m, \xi_{m+1})$. Next, when the $(m+1)$-th subject is ready for randomization and the corresponding covariate $\xi_{m+1}$ is recorded, we assign the patient to treatment 1 with probability
$$\psi_{m+1,1} = \frac{\hat\pi_m \Big( \dfrac{\hat\rho_m}{N_{m,1}/m} \Big)^\gamma}{\hat\pi_m \Big( \dfrac{\hat\rho_m}{N_{m,1}/m} \Big)^\gamma + (1 - \hat\pi_m) \Big( \dfrac{1 - \hat\rho_m}{1 - N_{m,1}/m} \Big)^\gamma} \tag{3.10}$$
and to treatment 2 with probability $\psi_{m+1,2} = 1 - \psi_{m+1,1}$, where $\gamma \ge 0$ is a constant that controls the degree of randomness of the procedure, from most random when $\gamma = 0$ to deterministic when $\gamma = \infty$. ZHCC's design is a special case of CADBCD with $\gamma = 0$.

Asymptotic properties. For studying the asymptotic properties, we assume that the target allocation function $\pi_1(\theta^*, x)$ satisfies the following condition.

Condition A We assume that the parameter space $\Theta_k$ is a bounded domain in $\mathbb{R}^d$, and that the true value $\theta_k$ is an interior point of $\Theta_k$, $k = 1, 2$.

1. For each fixed $x$, $0 < \pi_1(\theta^*, x) < 1$ is a continuous function of $\theta^*$ in the closure of $\Theta_1 \times \Theta_2$.

2. $\pi_1(\theta^*, \xi)$ is twice differentiable with respect to $\theta^*$, and the expectations of $\|\partial \pi_1(\theta, \xi)/\partial \theta\|^2$ and $\sup_{\|\theta^* - \theta\| \le \delta} \|\partial^2 \pi_1(\theta^*, \xi)/\partial \theta^2\|$ are finite for some $\delta > 0$.

Write $v = E[\pi_1(\theta, \xi)]$; then $0 < v < 1$ due to Condition A.1.

Theorem 3.1 Suppose that for $k = 1, 2$,
$$\hat\theta_{n,k} - \theta_k = \frac{1}{n}\sum_{m=1}^n X_{m,k} h_k(Y_{m,k}, \xi_m)\big( 1 + o(1) \big) + o(n^{-1/2}) \quad a.s., \tag{3.11}$$
where the $h_k$ are functions with $E[h_k(Y_k, \xi) \mid \xi] = 0$. We also assume that $E\|h_k(Y_k, \xi)\|^2 < \infty$, $k = 1, 2$. Then under Condition A, we have
$$P(X_{n,1} = 1) \to v; \qquad P(X_{n,1} = 1 \mid \mathcal{F}_{n-1}, \xi_n = x) \to \pi_1(\theta, x) \quad a.s. \tag{3.12}$$
and
$$\frac{N_{n,1}}{n} - v = O\Big( \sqrt{\frac{\log\log n}{n}} \Big) \quad a.s.; \qquad \hat\theta_n - \theta = O\Big( \sqrt{\frac{\log\log n}{n}} \Big) \quad a.s. \tag{3.13}$$
Further, let $V_k = E\{\pi_k(\theta, \xi)(h_k(Y_k, \xi))^T h_k(Y_k, \xi)\}$, $k = 1, 2$, $V = \mathrm{diag}(V_1, V_2)$,
$$\sigma_1^2 = E[\pi_1(\theta, \xi)(1 - \pi_1(\theta, \xi))], \qquad \sigma_2^2 = \mathrm{Var}\{\pi_1(\theta, \xi)\},$$
$$\sigma_3^2 = E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta}\, V \Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T,$$
$\lambda = \dfrac{\gamma \sigma_1^2}{v(1 - v)}$ and $\sigma^2 = \dfrac{\sigma_1^2 + \sigma_3^2}{1 + 2\lambda} + \sigma_2^2 + \sigma_3^2$. Then,
$$\sqrt{n}(N_{n,1}/n - v) \xrightarrow{D} N(0, \sigma^2) \quad \text{and} \quad \sqrt{n}(\hat\theta_n - \theta) \xrightarrow{D} N(0, V). \tag{3.14}$$

The proof of this theorem is somewhat involved and is given in the Appendix. According to (3.12), CADBCD satisfies (2.2). The asymptotic variability $\sigma^2$ of the design ranges from the maximum $2\sigma_3^2 + v(1 - v)$ when $\gamma = 0$ to the minimum $\sigma_2^2 + \sigma_3^2$ when $\gamma = \infty$. The next result for the generalized linear model is a corollary of Theorem 3.1. The proof is given in the Appendix through the verification of Condition (3.11).

Corollary 3.1 Suppose the distributions of the responses follow the generalized linear model (2.6) and satisfy the following regularity condition:
$$H(\delta) := E_\theta\Big[ \sup_{\|z\| \le \delta} \Big\| \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \Big|_{\theta_k}^{\theta_k + z} \Big\| \Big] \to 0 \quad \text{as } \delta \to 0, \tag{3.15}$$
where $f(x)\big|_a^b = f(b) - f(a)$. Under Condition A, if the matrices $I_1$ and $I_2$ defined as in (2.8) are nonsingular and the MLE $\hat\theta_m$, which maximizes the likelihood function (2.7), is unique, then we have (3.12), (3.13) and (3.14) with $V = I^{-1}(\theta)$ and $I(\theta) = \mathrm{diag}(I_1, I_2)$.

It is obvious that $B(\theta) = \sigma_2^2 + \sigma_3^2$ is the best asymptotic variability of CARA designs with two treatments according to Definition 1. For the CADBCD,
$$\sigma^2 = \frac{\sigma_1^2 + \sigma_3^2}{1 + \dfrac{2\gamma\sigma_1^2}{v(1 - v)}} + B(\theta) > B(\theta) \quad \text{but} \quad \sigma^2 \searrow B(\theta) \text{ as } \gamma \nearrow \infty.$$
This means that the CADBCD is not asymptotically efficient, but it can approach an asymptotically efficient CARA design if $\gamma$ is chosen large. ZHCC's design is a special case of the CADBCD which has the largest variability.
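To make the role of $\gamma$ concrete, the following minimal Python sketch evaluates the allocation probability (3.10) and the asymptotic variability $\sigma^2$ of Theorem 3.1. The function names `cadbcd_prob` and `asymptotic_variance`, the argument names, and the numerical inputs in the usage note are illustrative assumptions and do not come from the paper; the estimates $\hat\pi_m$, $\hat\rho_m$ and the quantities $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$ are taken as given rather than computed from data.

```python
def cadbcd_prob(pi_hat, rho_hat, frac1, gamma):
    """Probability of assigning the next patient to treatment 1, as in (3.10).

    pi_hat  : estimated target pi_1(theta_hat_m, xi_{m+1}) for the incoming patient
    rho_hat : estimated average target rho_hat_m
    frac1   : current proportion N_{m,1}/m on treatment 1 (must lie in (0, 1))
    gamma   : non-negative tuning constant (gamma = 0 gives ZHCC's design)
    """
    if not (0.0 < frac1 < 1.0 and 0.0 < rho_hat < 1.0 and 0.0 <= pi_hat <= 1.0):
        raise ValueError("proportions must lie strictly inside (0, 1)")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    num = pi_hat * (rho_hat / frac1) ** gamma
    other = (1.0 - pi_hat) * ((1.0 - rho_hat) / (1.0 - frac1)) ** gamma
    return num / (num + other)


def asymptotic_variance(s1_sq, s2_sq, s3_sq, v, gamma):
    """sigma^2 = (s1^2 + s3^2) / (1 + 2*lambda) + s2^2 + s3^2,
    with lambda = gamma * s1^2 / (v * (1 - v)), as in Theorem 3.1."""
    lam = gamma * s1_sq / (v * (1.0 - v))
    return (s1_sq + s3_sq) / (1.0 + 2.0 * lam) + s2_sq + s3_sq


if __name__ == "__main__":
    # Hypothetical inputs chosen so that s1^2 + s2^2 = v(1 - v) = 0.25.
    for g in (0.0, 1.0, 2.0, 10.0):
        print(g, cadbcd_prob(0.6, 0.5, 0.4, g),
              asymptotic_variance(0.2, 0.05, 0.1, 0.5, g))
```

With $\gamma = 0$ the rule returns $\hat\pi_m$ itself, and whenever $N_{m,1}/m = \hat\rho_m$ it also returns $\hat\pi_m$ for every $\gamma$; when treatment 1 is under-represented relative to $\hat\rho_m$, a positive $\gamma$ pushes the probability above $\hat\pi_m$. The variance function decreases in $\gamma$ from $\sigma_1^2 + \sigma_2^2 + 2\sigma_3^2$ toward $B(\theta) = \sigma_2^2 + \sigma_3^2$.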
4 Concluding Remarks

We have proposed a family of covariate-adjusted response-adaptive designs that are fully randomized and whose asymptotic variability can be brought arbitrarily close to the best one by choosing $\gamma$ large. The CADBCD can be viewed as a generalization of Hu and Zhang's doubly adaptive biased coin design (Hu and Zhang, 2004a) that incorporates covariate information. The asymptotic properties derived here provide the theoretical foundation for inference based on the CADBCD.

In this paper, we have assumed that the responses in each treatment group are available without delay. In practice, there is no logistical difficulty in incorporating delayed responses into the CADBCD, provided that some responses become available during the course of the allocation in the experiment, so that we can always update the estimates whenever new data become available. For clinical trials with uniform (or exponential) patient entry and exponential response times (see Bai, Hu and Rosenberger (2002), Hu and Zhang (2004) and Zhang et al. (2006) for examples), it is easy to verify the theoretical results in Sections 2 and 3.

5 Appendix: Proofs

Proof of Theorem 2.1. Notice that $E[X_{m+1,k} \mid \mathcal{F}_m] = E[\psi_{m+1,k} \mid \mathcal{F}_m] \to \rho_k(\theta)$ by (2.2), and that $\{\sum_{m=1}^n (X_{m,k} - E[X_{m,k} \mid \mathcal{F}_{m-1}]), \mathcal{F}_n\}$ is a martingale. (2.4) follows immediately. For (2.3), let $\mathcal{G}_m = \sigma(\mathcal{F}_m, \xi_{m+1})$. Then $\{\sum_{m=1}^n (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\}, \mathcal{G}_n\}$ is a martingale with
$$\sum_{m=1}^n E\Big[ \big\{ (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\} \big\}^2 \,\Big|\, \mathcal{G}_{m-1} \Big] \le N_n(x).$$
It follows that
$$\frac{\sum_{m=1}^n (X_{m,k} - E[X_{m,k} \mid \mathcal{G}_{m-1}]) I\{\xi_m = x\}}{N_n(x)} \to 0 \quad a.s. \text{ on } \{N_n(x) \to \infty\}$$
by Theorem 3.3.10 of Stout (1974). On the other hand,
$$\frac{\sum_{m=1}^n (E[X_{m,k} \mid \mathcal{G}_{m-1}] - \pi_k(\theta, x)) I\{\xi_m = x\}}{N_n(x)} \to 0 \quad a.s. \text{ on } \{N_n(x) \to \infty\}$$
by (2.2). So (2.3) is proved. For (2.5), notice that
$$\frac{N_n(B(x,r))}{n} \to P\{\xi \in B(x,r)\} > 0 \quad a.s.$$
With a similar argument we have
$$\lim_{n \to \infty} \frac{N_{n,k|B(x,r)}}{N_n(B(x,r))} = \lim_{n \to \infty} \frac{\sum_{m=1}^n \pi_k(\theta, \xi_m) I\{\xi_m \in B(x,r)\}}{N_n(B(x,r))} = \frac{E[\pi_k(\theta, \xi) I\{\xi \in B(x,r)\}]}{P\{\xi \in B(x,r)\}} \quad a.s.$$
Letting $r \searrow 0$ yields (2.5).

Proof of Theorem 3.1. The proof is somewhat long and is completed in four steps.

Step 1. We show (3.13) and
$$\hat\rho_m = v + O\big( \sqrt{\log\log m / m} \big) \quad a.s. \tag{5.16}$$
Write $\pi_1 = \pi_1(\theta, \xi)$ for short. Let $M_{n,1} = \sum_{m=1}^n (X_{m,1} - E[X_{m,1} \mid \mathcal{F}_{m-1}, \xi_m])$, $M_{n,2} = \sum_{m=1}^n (\pi_1(\theta, \xi_m) - E\pi_1)$, $Q_{n,k} = \sum_{m=1}^n X_{m,k} h_k(Y_{m,k}, \xi_m)$ for $k = 1, 2$, $Q_n = (Q_{n,1}, Q_{n,2})$ and $M_{n,3} = Q_n \big( E\frac{\partial \pi_1}{\partial \theta} \big)^T$. Then $Q_n$ and $M_{n,j}$, $j = 1, 2, 3$, are martingales. Since these are sums of $n$ martingale differences, the law of the iterated logarithm (LIL) for martingales gives
$$Q_n = O\big( \sqrt{n \log\log n} \big) \quad \text{and} \quad M_{n,j} = O\big( \sqrt{n \log\log n} \big) \quad a.s., \quad j = 1, 2, 3. \tag{5.17}$$
Hence, by (3.11) it is easily shown that
$$\hat\theta_m - \theta = O\big( \sqrt{\log\log m / m} \big) \quad a.s. \tag{5.18}$$
It follows that
$$\hat\pi_m = \pi_1(\hat\theta_m, \xi_{m+1}) = \pi_1(\theta, \xi_{m+1}) + (\hat\theta_m - \theta)\Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} \Big)^T + O(1)\|\hat\theta_m - \theta\|^2 \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| \tag{5.19}$$
$$= \pi_1(\theta, \xi_{m+1}) + (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T + (\hat\theta_m - \theta)\Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} - E\frac{\partial \pi_1}{\partial \theta} \Big)^T + O(1)\frac{\log\log m}{m} \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| \quad a.s.$$
It is easily shown that
$$\sum_{m=1}^n (\hat\theta_m - \theta)\Big( \frac{\partial \pi_1(\theta, \xi_{m+1})}{\partial \theta} - E\frac{\partial \pi_1}{\partial \theta} \Big)^T = o((\log n)^2) \quad a.s.$$
and
$$\sum_{m=1}^n \frac{\log\log m}{m} \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_{m+1})}{\partial \theta^2} \Big\| = o((\log n)^2) \quad a.s.$$
It follows that
$$\sum_{m=1}^n \hat\pi_m = \sum_{m=1}^n \pi_1(\theta, \xi_{m+1}) + \sum_{m=1}^n (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T + o((\log n)^2) \quad a.s. \tag{5.20}$$
Similarly,
$$\hat\rho_m = \frac{1}{m}\sum_{i=1}^m \pi_1(\hat\theta_m, \xi_i) = \frac{1}{m}\sum_{i=1}^m \pi_1(\theta, \xi_i) + (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + (\hat\theta_m - \theta)\frac{1}{m}\sum_{i=1}^m \Big( \frac{\partial \pi_1(\theta, \xi_i)}{\partial \theta} - E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + O(1)\|\hat\theta_m - \theta\|^2 \frac{1}{m}\sum_{i=1}^m \sup_{\|\theta^* - \theta\| \le \delta} \Big\| \frac{\partial^2 \pi_1(\theta^*, \xi_i)}{\partial \theta^2} \Big\| \tag{5.21}$$
$$= \frac{1}{m}\sum_{i=1}^m \pi_1(\theta, \xi_i) + (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1(\theta, \xi)}{\partial \theta} \Big)^T + O\Big( \frac{\log\log m}{m} \Big). \tag{5.22}$$
It follows that
$$\hat\rho_m = v + \frac{1}{m}\sum_{i=1}^m [\pi_1(\theta, \xi_i) - E\pi_1] + O\big( \sqrt{\log\log m / m} \big) = v + O\big( \sqrt{\log\log m / m} \big) \quad a.s.$$
and
$$\sum_{m=1}^n \hat\pi_m = nv + O\big( \sqrt{n \log\log n} \big) \quad a.s.$$
Now, write
$$g(\pi, a, b) = \frac{\pi (b/a)^\gamma}{\pi (b/a)^\gamma + (1 - \pi)\big( (1 - b)/(1 - a) \big)^\gamma}. \tag{5.23}$$
Then $\psi_{m+1,1} = g(\hat\pi_m, N_{m,1}/m, \hat\rho_m)$. It is easily seen that $g(\pi, a, b)$ is a non-decreasing function of $b$, and so $g(\pi, a, b) \le g(\pi, a, a) = \pi$ if $a \ge b$. Let $l_n = \max\{m \le n : N_{m,1}/m \le \hat\rho_m\}$; then $\psi_{m+1,1} \le \hat\pi_m$ when $m \ge l_n + 1$. Hence
$$N_{n,1} = N_{l_n+1,1} + M_{n,1} - M_{l_n+1,1} + \sum_{m=l_n+1}^{n-1} \psi_{m+1,1}$$
$$\le 1 + N_{l_n,1} + M_{n,1} - M_{l_n+1,1} + \sum_{m=l_n+1}^{n-1} \hat\pi_m$$
$$\le 1 + l_n \hat\rho_{l_n} + M_{n,1} - M_{l_n+1,1} + \sum_{m=1}^{n-1} \hat\pi_m - \sum_{m=1}^{l_n} \hat\pi_m \tag{5.24}$$
$$\le nv + O\big( \sqrt{n \log\log n} \big) \quad a.s.$$
Similarly, $n - N_{n,1} \le n(1 - v) + O(\sqrt{n \log\log n})$ a.s. (3.13) and (5.16) are now proved.

Step 2. We show (3.12) and the asymptotic normality of $\hat\theta_n$. By (3.13) and (5.16), $\hat\rho_n/(N_{n,1}/n) \to 1$ a.s., and hence
$$\psi_{m,1} - \pi_1(\hat\theta_{m-1}, \xi_m) \to 0 \quad a.s.,$$
so (3.12) is proved. Then, it is easily checked that $Q_n$ is a martingale with
$$\frac{1}{n}\sum_{m=1}^n E\big[ (\Delta Q_m)^T \Delta Q_m \big] = \frac{1}{n}\sum_{m=1}^n \mathrm{diag}\Big( E\big[ \psi_{m,1} h_1(Y_{m,1}, \xi_m)^T h_1(Y_{m,1}, \xi_m) \big], E\big[ \psi_{m,2} h_2(Y_{m,2}, \xi_m)^T h_2(Y_{m,2}, \xi_m) \big] \Big) \to V.$$
So, applying the central limit theorem for martingales yields
$$n^{1/2}(\hat\theta_n - \theta) \xrightarrow{D} N(0, V).$$
The proof of Step 2 is completed.

Step 3.
We show that
$$\psi_{m+1,1} = \hat\pi_m - \gamma \frac{\hat\pi_m (1 - \hat\pi_m)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + O\Big( \frac{\log\log m}{m} \Big) \quad a.s. \tag{5.25}$$
Let $g(\pi, a, b)$ be defined as in (5.23). By an elementary argument, it can be shown that
$$\sup_{0 \le \pi \le 1} \Big| g(\pi, a, b) - \pi + \gamma \frac{\pi(1 - \pi)}{v(1 - v)}(a - b) \Big| = O\big( (a - v)^2 + (b - v)^2 \big) \tag{5.26}$$
as $(a, b) \to (v, v)$. By (3.13) and (5.16), it follows that
$$\sup_{0 \le \pi \le 1} \Big| g(\pi, N_{m,1}/m, \hat\rho_m) - \pi + \gamma \frac{\pi(1 - \pi)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) \Big| = O\Big( \frac{\log\log m}{m} \Big) \quad a.s.$$
(5.25) is now proved.

Step 4. Finally, we show the asymptotic normality of $N_n$. Notice that $N_{m,1}/m - \hat\rho_m = O(\sqrt{\log\log m / m})$ a.s. With the same argument as in deriving (5.20), we can show that
$$\sum_{m=1}^n \frac{\hat\pi_m (1 - \hat\pi_m)}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) = \sum_{m=1}^n \frac{E[\pi_1(1 - \pi_1)]}{v(1 - v)} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o((\log n)^2) \quad a.s.$$
By (5.25) it follows that
$$\sum_{m=1}^n \psi_{m,1} = \sum_{m=1}^n \pi_1(\theta, \xi_m) + \sum_{m=0}^{n-1} (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o((\log n)^2) \quad a.s.$$
Then
$$N_{n,1} - nv = M_{n,1} + \sum_{m=1}^n \psi_{m,1} - nv$$
$$= M_{n,1} + M_{n,2} + \sum_{m=0}^{n-1} (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - \hat\rho_m \Big) + o((\log n)^2)$$
$$= M_{n,1} + M_{n,2} + \lambda \sum_{m=1}^{n-1} \frac{M_{m,2}}{m} + (\lambda + 1)\sum_{m=0}^{n-1} (\hat\theta_m - \theta)\Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - v \Big) + o((\log n)^2) \quad a.s.$$
$$= M_{n,1} + \Big( M_{n,2} + \lambda \sum_{m=1}^{n-1} \frac{M_{m,2}}{m} \Big) + (1 + o(1))\Big( (\lambda + 1)\sum_{m=1}^{n-1} \frac{M_{m,3}}{m} \Big) - \lambda \sum_{m=1}^{n-1} \Big( \frac{N_{m,1}}{m} - v \Big) + o(n^{1/2}) \quad a.s.$$
On the other hand,
$$E[\Delta M_{m,i} \Delta M_{m,j} \mid \mathcal{F}_{m-1}] = 0, \quad i \ne j,$$
$$E[(\Delta M_{m,1})^2 \mid \mathcal{F}_{m-1}] = E[\psi_{m,1}(1 - \psi_{m,1}) \mid \mathcal{F}_{m-1}] \to \sigma_1^2 \quad a.s.,$$
$$E[(\Delta M_{m,2})^2 \mid \mathcal{F}_{m-1}] = \mathrm{Var}[\pi_1(\theta, \xi_m)] = \sigma_2^2$$
and
$$E[(\Delta M_{m,3})^2 \mid \mathcal{F}_{m-1}] = E\frac{\partial \pi_1}{\partial \theta}\, E[(\Delta Q_m)^T \Delta Q_m \mid \mathcal{F}_{m-1}] \Big( E\frac{\partial \pi_1}{\partial \theta} \Big)^T \to \sigma_3^2 \quad a.s.$$
By applying the functional central limit theorem (cf. Corollary 3.1 of Hall and Heyde, 1980), we have
$$n^{-1/2}\big( M_{[nt],1}, M_{[nt],2}, M_{[nt],3} \big) \xrightarrow{D} \big( \sigma_1 B_t^{(1)}, \sigma_2 B_t^{(2)}, \sigma_3 B_t^{(3)} \big),$$
where $B_t^{(i)}$, $i = 1, 2, 3$, are three independent standard Brownian motions. Then with the same argument as in Hu and Zhang (2004a), one can show that
$$n^{-1/2}(N_{[nt],1} - [nt]v) \xrightarrow{D} G_t,$$
where
$$G_t = \sigma_1 t^{-\lambda}\int_0^t x^\lambda\, dB_x^{(1)} + \sigma_2 B_t^{(2)} + (\lambda + 1)\sigma_3 t^{-\lambda}\int_0^t x^{\lambda - 1} B_x^{(3)}\, dx$$
is a solution of the equation
$$G_t = \sigma_1 B_t^{(1)} + \sigma_2\Big( B_t^{(2)} + \lambda \int_0^t \frac{B_x^{(2)}}{x}\, dx \Big) + (\lambda + 1)\sigma_3 \int_0^t \frac{B_x^{(3)}}{x}\, dx - \lambda \int_0^t \frac{G_x}{x}\, dx$$
with $G_0 = 0$. It is easily checked that
$$\mathrm{Var}(G_t) = t\Big[ \frac{\sigma_1^2}{1 + 2\lambda} + \sigma_2^2 + \frac{2(\lambda + 1)}{1 + 2\lambda}\sigma_3^2 \Big] = t\Big[ \frac{\sigma_1^2 + \sigma_3^2}{1 + 2\lambda} + \sigma_2^2 + \sigma_3^2 \Big].$$
Hence
$$n^{1/2}(N_{n,1}/n - v) \xrightarrow{D} N(0, \sigma^2).$$

Proof of Corollary 3.1. It is sufficient to show the strong consistency of the MLE $\hat\theta_m$:
$$\hat\theta_n \to \theta \quad a.s. \tag{5.27}$$
In fact, if (5.27) is proved, then by (5.19) and (5.21) we have $\hat\rho_n \to v$ a.s. and $\frac{1}{n}\sum_{m=1}^n \hat\pi_m \to v$ a.s. By (5.24) we will have $N_{n,1}/n \to v$ a.s. It follows that $\psi_{m,k} - \pi_k(\hat\theta_{m-1}, \xi_m) \to 0$ a.s. by (5.26). The rest of the proof is similar to that of Corollary 3.1 of Zhang et al. (2007).

For (5.27), it suffices to show that, for any $\delta > 0$ small enough, with probability one for $m$ large enough we have
$$\log L_k(\theta_k^*) < \log L_k(\theta_k) \quad \text{if } \|\theta_k^* - \theta_k\| = \delta. \tag{5.28}$$
We consider the case $k = 1$ only. The application of Taylor's theorem yields
$$\frac{1}{m}\log L_1(\theta_1^*) - \frac{1}{m}\log L_1(\theta_1) = (\theta_1^* - \theta_1)\frac{1}{m}\frac{\partial \log L_1}{\partial \theta_1}\Big|_{\theta_1} + (\theta_1^* - \theta_1)\frac{1}{m}\frac{\partial^2 \log L_1}{\partial \theta_1^2}\Big|_{\theta_1}(\theta_1^* - \theta_1)^T + (\theta_1^* - \theta_1)\Big\{ \frac{1}{m}\int_0^1 \Big[ \frac{\partial^2 \log L_1}{\partial \theta_1^2}\Big|_{\theta_1}^{\theta_1 + t(\theta_1^* - \theta_1)} \Big] dt \Big\}(\theta_1^* - \theta_1)^T.$$
Write
$$f(a, b, z, \xi) = \frac{\pi_1(z, \xi)\big( \frac{b}{a} \big)^\gamma}{\pi_1(z, \xi)\big( \frac{b}{a} \big)^\gamma + (1 - \pi_1(z, \xi))\big( \frac{1 - b}{1 - a} \big)^\gamma}.$$
It is obvious that $f$ is a continuous function of $a$, $b$ and $z$ for each given $\xi$. By applying the law of large numbers for martingales, one can show that
$$\frac{1}{m}\frac{\partial \log L_1}{\partial \theta_1}\Big|_{\theta_1} \to 0 \quad a.s.$$
and
$$\frac{\partial^2 \log L_1}{\partial \theta_1^2}\Big|_{\theta_1} = -\sum_{j=2}^m \Big[ E\big( f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big) \Big]_{a = \frac{N_{j-1,1}}{j-1},\, b = \hat\rho_{j-1},\, z = \hat\theta_{j-1}} + o(m) \quad a.s.,$$
where the minus sign reflects that the second derivative of the log-likelihood is the negative of the information. For the details of the proof, one can refer to Zhang et al. (2007). Further, it is obvious that
$$\limsup \hat\rho_m \le \lim \frac{1}{m}\sum_{j=1}^m \sup_\theta \pi_1(\theta, \xi_j) = E[\sup_\theta \pi_1(\theta, \xi)] < 1 \quad a.s.,$$
where the supremum is taken over the parameter space. Similarly,
$$\limsup_{n \to \infty} \frac{1}{n}\sum_{m=1}^n \hat\pi_m \le E[\sup_\theta \pi_1(\theta, \xi)] < 1 \quad a.s.$$
By (5.24), $\limsup N_{n,1}/n \le E[\sup_\theta \pi_1(\theta, \xi)] < 1$ a.s. By considering $1 - \hat\rho_m$ and $n - N_{n,1}$ instead of $\hat\rho_m$ and $N_{n,1}$ respectively, we have $\liminf \hat\rho_m \ge E[\inf_\theta \pi_1(\theta, \xi)] > 0$ and $\liminf N_{n,1}/n \ge E[\inf_\theta \pi_1(\theta, \xi)] > 0$ a.s. So we may assume that $\hat\rho_m, N_{n,1}/n \in [\delta_0, 1 - \delta_0]$ for some $0 < \delta_0 < 1$. On the other hand, it is obvious that $y E[f(a, b, z, \xi) I_1(\theta_1 \mid \xi)] y^T$ is a continuous function of $a, b, y, z$, and is positive for all $0 < a, b < 1$, $y \ne 0$ and all $z$. It follows that there is a constant $c_0 > 0$ for which
$$\liminf_{j \to \infty} \min_{y : \|y\| = 1} \Big[ y E\big( f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big) y^T \Big]_{a = \frac{N_{j-1,1}}{j-1},\, b = \hat\rho_{j-1},\, z = \hat\theta_{j-1}} > c_0 \quad a.s.$$
So with probability one, for $m$ large enough it holds that
$$\frac{1}{m}\log L_1(\theta_1^*) - \frac{1}{m}\log L_1(\theta_1) \le -\|\theta_1^* - \theta_1\|^2 \Big\{ \frac{1}{m}\sum_{j=2}^m \min_{y : \|y\| = 1} \Big[ y E\big( f(a, b, z, \xi) I_1(\theta_1 \mid \xi) \big) y^T \Big]_{a = \frac{N_{j-1,1}}{j-1},\, b = \hat\rho_{j-1},\, z = \hat\theta_{j-1}} \Big\} + \|\theta_1^* - \theta_1\|^2 H(\|\theta_1^* - \theta_1\|) + o(1)$$
$$\le -c_0\delta^2 + \delta^2 H(\delta) + o(1) < 0$$
uniformly in $\theta_1^*$ with $\|\theta_1^* - \theta_1\| = \delta$ when $\delta$ is small enough. (5.28) is proved.

References

[1] Bai, Z. D., Hu, F. and Rosenberger, W. F. (2002). Asymptotic properties of adaptive designs with delayed response. Annals of Statistics, 30, 122–139.
[2] Eisele, J. and Woodroofe, M. (1995). Central limit theorems for doubly adaptive biased coin designs. Annals of Statistics, 23, 234–254.
[3] Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and its Applications, Academic Press, London.
[4] Hayre, L. S. (1979). Two-population sequential tests with three hypotheses. Biometrika, 66, 465–474.
[5] Hu, F. and Rosenberger, W. F. (2003). Evaluating response-adaptive randomization procedures for treatment comparisons. Journal of the American Statistical Association, 98, 671–678.
[6] Hu, F. and Rosenberger, W. F. (2006). The Theory of Response-Adaptive Randomization in Clinical Trials, John Wiley and Sons, Inc., New York.
[7] Hu, F., Rosenberger, W. F. and Zhang, L.-X. (2006). Asymptotically best response-adaptive randomization procedures. Journal of Statistical Planning and Inference, 136, 1911–1922.
[8] Hu, F. and Zhang, L.-X. (2004a). Asymptotic properties of doubly adaptive biased coin designs for multitreatment clinical trials. Annals of Statistics, 32, 268–301.
[9] Hu, F. and Zhang, L.-X. (2004b). Asymptotic normality of urn models for clinical trials with delayed response. Bernoulli, 10(3), 447–463.
[10] Hu, F., Zhang, L.-X. and He, X. (2008). Efficient randomized adaptive designs. Annals of Statistics, to appear.
[11] Pocock, S. J. and Simon, R. (1975). Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics, 31, 103–115.
[12] Robbins, H. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58, 527–535.
[13] Rosenberger, W. F. and Lachin, J. M. (2002). Randomization in Clinical Trials: Theory and Practice, John Wiley and Sons, Inc., New York.
[14] Rosenberger, W. F., Stallard, N., Ivanova, A., Harper, C., and Ricks, M. (2001).
Optimal ad aptive de signs for binar y respons e trials. Biometrics , 57 , 909-913. [15] R osenbe rger, W. F., Vidy as hankar, A. N. and A gar w al, D. K. (2001). Cov aria te-adjust ed resp onse-adaptive des igns for binar y resp onse. J. Biopharm. Statist. , 11 227-236. [16] T a ve s, D.R. (1974). Minimiza tion: a new method of assigning p a tients to trea tment and control groups. Clin Pharmac ol Ther. , 15 , 443-453. [17] Thompson, W. R. (1933). On th e likel ihood tha t one un- known p robability exceeds another in the view of the evi- dence of the two samples. Biometrika , 25 , 275–294. [18] Tymofyeye v, Y., Rosenberger, W. F. and Hu, F . (2007). Implemen ting opt imal alloc a tion in seq uential binar y r e- sponse ex periments . Journal of the Americ an Statistic al A sso ciation , 102 , 224-234. [19] Zelen, M. ( 1974). The rando miza tion and st ra tifica tion of p a tients to clinical trials. Journal of Chr onic Dise ases , 27 , 365- 375. [20] Zhang, L.X., Chan, W.S., Cheung. S .H. an d Hu , F. ( 2006). A gene ralized urn model for clinical trials with dela yed respons es. Stat istic a Si ni c a , 17 , 387-409 21 [21] Zhang, L.X., Hu, F., Cheung. S.H. and Chan, W.S. ( 2007). Asymptot ic pr oper ties of cov aria te -adjusted adaptive de- signs. Annal s of Statistics , 35 , 1166- 1182. 22
