An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs

Participants in longitudinal studies on the effects of drug treatment and criminal justice system interventions are at high risk for institutionalization (e.g., spending time in an environment where their freedom to use drugs, commit crimes, or engag…

Authors: Beth Ann Griffin, Daniel F. McCaffrey, Andrew R. Morral

The Annals of Applied Statistics 2008, Vol. 2, No. 3, 1034–1055. DOI: 10.1214/08-AOAS179. © Institute of Mathematical Statistics, 2008.

By Beth Ann Griffin, Daniel F. McCaffrey and Andrew R. Morral, RAND Corporation

Participants in longitudinal studies on the effects of drug treatment and criminal justice system interventions are at high risk for institutionalization (e.g., spending time in an environment where their freedom to use drugs, commit crimes, or engage in risky behavior may be circumscribed). Methods used for estimating treatment effects in the presence of institutionalization during follow-up can be highly sensitive to assumptions that are unlikely to be met in applications and thus likely to yield misleading inferences. In this paper we consider the use of principal stratification to control for institutionalization at follow-up. Principal stratification has been suggested for similar problems where outcomes are unobservable for samples of study participants because of dropout, death or other forms of censoring. The method identifies principal strata within which causal effects are well defined and potentially estimable. We extend the method of principal stratification to model institutionalization at follow-up and estimate the effect of residential substance abuse treatment versus outpatient services in a large scale study of adolescent substance abuse treatment programs. Additionally, we discuss practical issues in applying the principal stratification model to data.
We show via simulation studies that the model can only recover true effects provided the data meet strenuous demands and that caution must be taken when implementing principal stratification as a technique to control for post-treatment confounders such as institutionalization.

Received June 2007; revised April 2008.
Supported by NIDA Grants R01 DA015697, R01 DA016722 and R01 DA017507.
Key words and phrases. Principal stratification, post-treatment confounder, institutionalization, causal inference.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2008, Vol. 2, No. 3, 1034–1055. This reprint differs from the original in pagination and typographic detail.

1. Introduction. Each year almost 1.8 million Americans receive alcohol and other drug treatment services. Efforts to improve these services through research or provider profiling are hindered by drug treatment clients' high rates of post-treatment institutionalization, defined here as spending a day or more in a controlled environment (e.g., a jail, prison, hospital, residential treatment or group home setting) where the possibility of drug use and criminal activity is substantially diminished. By reducing the potential for substance use, institutionalization masks the potential effects of substance use treatment programs. For example, outcomes of clients from both effective and ineffective treatment programs will look the same when they have no access to drugs or alcohol.

The confounding of treatment effects due to institutionalization cannot be ignored when evaluating substance abuse treatment programs, because institutionalization is so pervasive among drug treatment clients.
For instance, in the Drug Abuse Treatment Outcomes Study [Hubbard et al. (1997)], 40% of the 2,966 clients in substance abuse treatment programs interviewed 12 months after discharge reported being institutionalized for some part of the preceding year [U.S. Dept. of Health and Human Services, National Institute on Drug Abuse (2004)]. Among those with any institutionalization, the average number of days institutionalized out of the past 365 was 115 [U.S. Dept. of Health and Human Services, National Institute on Drug Abuse (2004)]. Similarly, about 52% of the sample from the National Treatment Improvement Evaluation (NTIES) [U.S. Dept. of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Substance Abuse Treatment (2004), NORC and RTI (1997)] were institutionalized during the study's 12-month post-treatment evaluation, many for the entire length of the evaluation period.

The confounding effects of institutionalization are not, however, limited to substance abuse treatment programs. Criminal justice system clients are also at great risk for being institutionalized following an intervention. Moreover, patients involved in many health studies are at risk for hospitalizations which can limit or suppress the measurement of primary outcomes of interest in those studies (e.g., daily physical activities for a general population or falls for geriatric patients). In all of these cases, a post-treatment factor (e.g., institutionalization or hospitalization) whose value is determined after the start of treatment consequently determines the potential range of the outcome that can be observed, thereby suppressing or censoring the outcome of primary interest.
In many situations, inferences are further complicated by the fact that the confounding factor can take on many levels, giving rise to outcomes which are observed at different values of the confounding variable and invalidating the comparability of outcomes in the treatment and control groups.

In a recent paper McCaffrey et al. (2007) developed a statistical model to describe the different possible estimates of the causal effect of treatment in the presence of institutionalization and the potential confounding effects of institutionalization on these estimates. The paper identifies commonly used analytic methods for estimating treatment effects in the presence of institutionalization and demonstrates that the estimated treatment effects can vary greatly depending on which method is employed. The paper also identifies assumptions under which the various approaches yield unbiased estimates of the treatment effects of interest. Unfortunately, many of the assumptions required appear unlikely to hold in most real world applications.

Institutionalization is similar to the problem of incomplete compliance in drug trials. In both problems, cases assigned to treatment have multiple potential outcomes that can vary depending on a variable not controlled by the experimenter: dose of the drug (i.e., level of compliance) in drug trials or days institutionalized in our example. Hence, methods for estimating the dose-response curve from partial compliance data [Efron and Feldman (1991)] might apply to the institutionalization problem. However, these methods make substantial use of the fact that treatment is a complete dose of the drug, whereas in our example institutionalization is not related to the amount or type of treatment received but is an external event that curtails use, censoring the outcomes of interest.
The problem of institutionalization is more similar to the problems addressed by the methods of principal stratification. Developed by Frangakis and Rubin (2002) for problems where outcomes are unobservable for samples of study participants because of dropout, death or other forms of censoring (e.g., failure to find employment in a jobs program), the method identifies principal strata within which causal effects are well defined and potentially estimable. Roughly speaking, Frangakis and Rubin (2002) noted that in data with censoring, cases that receive treatment and have observed outcomes are a mix of cases that would have observed outcomes under control and those that would not. Similarly, cases that receive the control condition and have observed outcomes are a mix of cases that would have observed outcomes under treatment and those that would not. A study participant's censoring statuses under the treatment and control conditions define the strata within the population, called principal strata. Analyses that condition on cases in the stratum where participants would be observed under both treatment and control can provide unbiased treatment effect estimates for that stratum, assuming no other biasing factors. The key innovation of the work on principal stratification is the recognition that conditioning on censoring statuses under treatment and control is sufficient to allow for unbiased estimation of treatment effects.

Principal strata notions extend to the problem of institutionalization at follow-up with only slight modifications to the methods for censored data. In particular, modifications are necessary because institutionalization can take on more than the two levels of censored and uncensored. Also, outcomes are observed at all levels of institutionalization and treatment effects at various levels may be of interest.

Using data from the Adolescent Treatment Models (ATM) study, fielded between 1998 and 2002 by the Substance Abuse and Mental Health Services Administration, Center for Substance Abuse Treatment, we aim to examine the effects of treatment modality (residential versus outpatient) on the 12-month substance use outcomes for adolescents who participated in the ATM using principal stratification. In the ATM, high rates of institutionalization clearly confound the effects of treatment modality. Adolescents in residential treatment have significantly higher rates of institutionalization and lower mean drug use frequency outcomes at the 12-month follow-up. Given that institutionalization in the sample can be shown to yield lower drug use outcomes [McCaffrey et al. (2007)], it is unclear whether adolescents in residential treatment truly have lower mean values of the outcome because the treatment is more effective than outpatient treatment or because they tend to be institutionalized more often.

The goal of this paper is to extend principal stratification to control for the confounding effects of institutionalization and to evaluate the sensitivity of the extended model to various assumptions about the data. Section 2 describes the data from the ATM study and illustrates the confounding effects institutionalization is likely to have when examining treatment effects in this study. Section 3 describes the method of principal stratification and its extension in more detail and applies the method to the ATM data to examine the effects of substance use treatment modality (residential versus outpatient) on drug use outcomes for adolescents in the study.
Section 4 evaluates the principal stratification method presented using a series of simulation studies and Section 5 provides a discussion of our findings and recommendations for practice and future research.

2. Adolescent treatment models study and the institutionalization confound. The number of adolescents receiving substance abuse treatment has increased by over 65% in the last 10 or so years and policy makers, clinicians and parents want to know if treatment is effective and for whom it is effective, as well as what treatment modalities are best. We address these questions through a study of the effects of treatment modality (residential versus outpatient) on the 12-month substance use outcomes for adolescents who participated in the ATM study. The ATM study collected treatment admission and 12-month outcomes data for new admissions to 10 community-based treatment programs in the United States, including six residential programs and four outpatient programs [Stevens and Morral (2003)]. The sample used in the present analysis includes all new admissions to the 10 programs in the main ATM analytic dataset, which was produced in March of 2002. Of these 1,384 cases, 1,256 (91%) completed a 12-month follow-up survey and provided data on the outcomes of interest. Only cases with follow-up data are included in the analysis presented below.

For purposes of the illustrations presented in this report, we examine the effects of treatment modality on the Substance Frequency Scale (SFS), a widely used scale from the Global Appraisal of Individual Needs (GAIN) [Dennis (1999)], the survey instrument used at every site for baseline and 12-month outcome assessments.
The SFS averages responses to a series of questions on the frequency of recent drug use, intoxication and drug problems in the 90 days prior to the 12-month follow-up.¹ It is scaled so that higher values indicate greater substance use and more drug problems. Days of institutionalization at follow-up is assessed with the maximum number of days (in the past 90) which the respondent reports being in any of several different types of controlled environments (e.g., inpatient psychiatric or medical hospitals, residential treatment facilities, juvenile halls or other criminal justice detention facilities, etc.).

Figure 1 shows a scatter plot and smooth of mean SFS versus the number of months institutionalized for adolescents enrolled in residential and outpatient care treatment modalities in the ATM. As shown, more adolescents in residential treatment have months of institutionalization greater than 0. In fact, 52% of adolescents in residential care were institutionalized for at least one day in the past 90 at the 12-month follow-up, while only 38% of adolescents in outpatient care were institutionalized at follow-up (see Table 1). Figure 1 also reveals the suppression effect that institutionalization can have on SFS. As the number of months institutionalized increases, the observed values of SFS in the ATM data appear to decrease. This relationship can be seen more clearly in the smooth of SFS versus days institutionalized also shown in the figure.

In addition to the suppression effect of institutionalization, Figure 1 also clearly reveals the potential selection effects that exist in these data between adolescents with and without any days of institutionalization.
The mean value of SFS for adolescents who are not institutionalized at follow-up (represented by the black X above 0 days institutionalized in Figure 1) is markedly lower than the mean values for adolescents with only a few days of institutionalization (shown along the smoothed line in Figure 1). The difference in mean SFS between these adolescents suggests that adolescents who were institutionalized in the 90 days prior to the 12-month follow-up tended to be more difficult cases with higher levels of drug use than those adolescents not entering institutional settings. Both the suppression and selection effects of institutionalization in the ATM data could distort inferences about the effects of treatment on SFS.

¹ In this illustration we multiplied SFS by 90 to make it scale with use in the past 90 days rather than use per day.

Table 1
Weighted mean rate of institutionalization and SFS at 12-month follow-up by treatment modality

                         Percent institutionalized at follow-up    Mean SFS
Residential treatment    52%                                       9.0
Outpatient treatment     38%                                       9.7

Table 1 provides weighted descriptive statistics (see below for more details on weighting) for SFS and institutionalization at follow-up by treatment modality. As shown, adolescents in residential treatment have higher rates of institutionalization and lower mean SFS. Given that institutionalization appears to suppress substance use in this sample, it is unclear whether adolescents in residential treatment have lower mean values of SFS because they have decreased their substance use in response to residential treatment or because they tend to be institutionalized more often.

Unfortunately, as described in McCaffrey et al. (2007), current methods for handling institutionalization are not adequate and require strong assumptions which are unlikely to hold in practice. In light of these findings, we propose the use of principal stratification to obtain policy-relevant treatment effects on these data which appropriately control for the suppression and selection effects institutionalization can have on SFS.

Fig. 1. Scatter plot of SFS values by number of months institutionalized for adolescents in both residential and outpatient treatment modalities with smooth of mean SFS by months institutionalized overlaid. The black X denotes the mean SFS value among adolescents with 0 days of institutionalization.

3. Principal stratification. Principal stratification was developed by Frangakis and Rubin (2002) as a method for accounting for post-treatment confounds within the context of the Neyman–Rubin causal model. Hence, we begin by extending the Neyman–Rubin causal model to account for institutionalization and then turn to describe the specific innovations of the principal stratification approach.

3.1. A causal model for treatment effects in the presence of institutionalization. We start by considering the treatment effect of a single intervention versus a control in the simple case without any institutionalization. In this case, the Neyman–Rubin causal model [Holland (1986), Pearl (1996)] considers two potential outcomes and one random variable for each case in the study. The potential outcomes are $Y_0$, the outcome after receiving the control condition, and $Y_1$, the outcome after receiving the treatment condition. Throughout we assume that the Stable Unit Treatment Value Assumption (SUTVA) [Rubin (1990)] holds so that for each case the potential outcomes are unique and well defined.
The treatment effect for each individual is $Y_1 - Y_0$. Typically this is summarized by its mean across study participants. However, we cannot directly estimate the treatment effect for individual cases or the mean across cases because cases cannot be observed under both the treatment and the control conditions. The condition under which each case is observed is determined by the random treatment assignment variable $T$, which equals 1 for treatment and 0 for control. When $T = 1$, we observe $Y_1$; otherwise $Y_0$ is observed. This results in the random variable $Y_{\mathrm{obs}} = Y_T$. Unbiased estimation of the average treatment effect is possible if cases with $T = 1$ have the same expected values of their potential outcomes as cases with $T = 0$. Under this assumption, $E(Y_{\mathrm{obs}} \mid T = 1) = E(Y_1)$ and $E(Y_{\mathrm{obs}} \mid T = 0) = E(Y_0)$, where expectation is over the participating cases. Hence, the difference in the observed treatment and control means yields an unbiased estimate of the average causal effect of treatment.

Consistent estimation is also possible in situations where treatment assignment might depend on observable characteristics, such as observational studies where study participants self-select into the treatment programs being studied. In such circumstances, consistent estimates can be obtained by comparing the means for cases with the same probability of treatment and using propensity score weights to adjust for selection effects into a particular treatment [Rosenbaum and Rubin (1983)].

In the presence of institutionalization, the Neyman–Rubin causal model described above must be expanded to allow for cases to have potential levels of institutionalization, which we denote by $Z_0$ and $Z_1$ for the control and treatment conditions, respectively. The model must also allow for different potential outcomes at each level of institutionalization. For example, a case might have a different potential outcome when institutionalized 0 days, compared to 1, 2 or 3 or more days. We let $Z_{\max}$ equal the maximum possible value for institutionalization. Then, for treatment, $T = 1$, we label the potential outcomes for a case as $Y_1[z]$, $z = 0, \ldots, Z_{\max}$, so that $Y_1[0]$ is the potential outcome if assigned to treatment and not institutionalized during the follow-up period, $Y_1[1]$ is the potential outcome if assigned to treatment and institutionalized for 1 day, and so on to $Z_{\max}$. The potential outcomes for control are $Y_0[z]$, $z = 0, \ldots, Z_{\max}$. Now, for each case, we can define different treatment effects for each of the different levels of institutionalization, for example, $D[z] = Y_1[z] - Y_0[z]$, such that $D[z]$ might change with $z$.

While there are multiple causal effects that might be of interest, not all may be estimable from the data without strong assumptions. We might, for example, be interested in the average causal effect of $D[0]$, the average causal effect had no one been institutionalized, which McCaffrey et al. (2007) refer to as the "unsuppressed treatment effect." Additionally, we might also want to estimate the average causal effect for cases at each of the observed values of institutionalization, as was considered when the notion of principal stratification was derived [Frangakis and Rubin (2002)]. We now turn to that concept.

3.2. Principal stratification. To estimate the causal effects of treatment using the notions of principal stratification, we need a model for both the outcomes and institutionalization. Specifically, we assume that institutionalization can take on only a discrete set of values $0, 1, \ldots, Z_{\max}$ for both treatment and control. For example, if the follow-up interval for observing outcomes is 90 days, then $Z_1$ and $Z_0$ can only take on a value from 0 to 90. Under this assumption, we can in turn define $(Z_{\max} + 1) \times (Z_{\max} + 1)$ principal strata, one for each pair $(z_0, z_1)$, as shown in Table 2. We define the probability that an individual falls into stratum $(z_0, z_1)$ by $p_{z_0,z_1} = \Pr\{Z_0 = z_0, Z_1 = z_1\}$ and define a density function for the potential outcome $Y_t[z_t] \mid Z_0 = z_0, Z_1 = z_1$ if a person falls into this stratum by $f(y; \theta_{z_0,z_1,t})$ for $z_0, z_1 = 0, \ldots, Z_{\max}$, where $\theta_{z_0,z_1,t}$ denotes the parameter vector for the assumed underlying density function $f$ and $t$ denotes whether an individual received treatment ($t = 1$) or control ($t = 0$).

We note that the probabilities, $p_{z_0,z_1}$, do not depend on treatment status because $Z_0$ and $Z_1$ are potential outcomes and a case's principal stratum remains the same regardless of which treatment he/she receives. Conversely, the distribution of the potential outcomes, $f$, is determined by the principal stratum and treatment status of an individual. For example, if we can assume the potential outcomes are normally distributed, then we might have separate means and variances in $\theta_{z_0,z_1,t}$ for each possible pair of values of $(z_0, z_1)$ for both treatment and control.
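To make these definitions concrete, the following minimal Python sketch simulates the binary case ($Z_{\max} = 1$) under randomized assignment. The stratum probabilities and stratum means are hypothetical values chosen only for illustration (they are not estimates from the ATM data), and the normal outcome model is an assumption of the sketch. It contrasts a naive comparison of cases observed with no institutionalization against the comparison within the latent stratum $(Z_0, Z_1) = (0, 0)$, which is only computable here because the simulation reveals both $Z_0$ and $Z_1$.

```python
import random

# Hypothetical stratum probabilities p[(z0, z1)] for binary institutionalization.
STRATA = [(0, 0), (0, 1), (1, 0), (1, 1)]
PROBS = [0.4, 0.2, 0.1, 0.3]

# Hypothetical means of the potential outcome Y_t[z_t], keyed by (z0, z1, t).
MEANS = {
    (0, 0, 0): 10.0, (0, 0, 1): 8.0,   # true effect in stratum (0, 0) is -2
    (0, 1, 0): 10.0, (0, 1, 1): 6.0,
    (1, 0, 0): 7.0,  (1, 0, 1): 18.0,
    (1, 1, 0): 9.0,  (1, 1, 1): 8.0,
}

def simulate(n=200_000, seed=0):
    """Draw (z0, z1, t, z_obs, y) for n cases with randomized treatment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z0, z1 = rng.choices(STRATA, weights=PROBS)[0]
        t = rng.randint(0, 1)             # random treatment assignment T
        z_obs = z1 if t == 1 else z0      # only one of Z0, Z1 is ever observed
        y = rng.gauss(MEANS[(z0, z1, t)], 1.0)
        rows.append((z0, z1, t, z_obs, y))
    return rows

def avg(values):
    values = list(values)
    return sum(values) / len(values)

rows = simulate()

# Naive contrast: treated vs. control among cases OBSERVED with Z = 0.
naive = (avg(y for z0, z1, t, z, y in rows if t == 1 and z == 0)
         - avg(y for z0, z1, t, z, y in rows if t == 0 and z == 0))

# Principal effect: treated vs. control within the latent stratum (0, 0).
principal = (avg(y for z0, z1, t, z, y in rows if t == 1 and (z0, z1) == (0, 0))
             - avg(y for z0, z1, t, z, y in rows if t == 0 and (z0, z1) == (0, 0)))

print("naive contrast:", round(naive, 2))
print("principal effect in stratum (0, 0):", round(principal, 2))
```

With these illustrative values the observed $Z = 0$ groups mix different strata in the two arms, so the naive contrast does not recover the stratum $(0, 0)$ effect even though treatment was randomized.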
Table 2
Principal stratification model for institutionalization under treatment $t = 0, 1$

$$
\begin{array}{c|cccc}
 & Z_0 = 0 & 1 & \cdots & Z_{\max} \\ \hline
Z_1 = 0 & (p_{00}, f(y; \theta_{0,0,t})) & (p_{10}, f(y; \theta_{1,0,t})) & \cdots & (p_{Z_{\max},0}, f(y; \theta_{Z_{\max},0,t})) \\
1 & (p_{01}, f(y; \theta_{0,1,t})) & (p_{11}, f(y; \theta_{1,1,t})) & \cdots & (p_{Z_{\max},1}, f(y; \theta_{Z_{\max},1,t})) \\
2 & (p_{02}, f(y; \theta_{0,2,t})) & (p_{12}, f(y; \theta_{1,2,t})) & \cdots & (p_{Z_{\max},2}, f(y; \theta_{Z_{\max},2,t})) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
Z_{\max} & (p_{0,Z_{\max}}, f(y; \theta_{0,Z_{\max},t})) & (p_{1,Z_{\max}}, f(y; \theta_{1,Z_{\max},t})) & \cdots & (p_{Z_{\max},Z_{\max}}, f(y; \theta_{Z_{\max},Z_{\max},t}))
\end{array}
$$

The primary treatment effects of interest in this model are $E[D[z] \mid Z_0 = z, Z_1 = z]$, that is, the treatment effects for individuals with the same level of institutionalization under both treatment and control. These treatment effects do not confound changes in institutionalization with other effects of treatment and allow policy-makers, stakeholders and caregivers to evaluate treatment independent of the potentially costly and undesirable effects on institutionalization. Each average effect is restricted to cases from a principal stratum. Generalizations to other strata require additional assumptions.

We let $S_{z_t,t}$ denote the set of all cases in condition $t$ whose observed value of institutionalization equals $z_t$. Then if $y_i$ denotes the observed outcome for the $i$th case, the likelihood for the observed data is given by

$$
L(Y, Z, T, \theta) = \prod_{z_1} \prod_{i \in S_{z_1,1}} \Bigl[ \sum_{z_0} f(y_i; \theta_{z_0,z_1,1})\, p_{z_0,z_1} \Bigr] \times \prod_{z_0} \prod_{i \in S_{z_0,0}} \Bigl[ \sum_{z_1} f(y_i; \theta_{z_0,z_1,0})\, p_{z_0,z_1} \Bigr]. \tag{3.1}
$$

This mixture distribution results from the fact that only $Z_1$ or $Z_0$ is observed for each case.
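As a rough illustration of how the mixture in (3.1) is assembled, the sketch below evaluates its logarithm for binary institutionalization. It assumes, purely for illustration, normal densities with a common standard deviation; the function and argument names (`log_likelihood`, `p`, `mu`, `sd`) and all numeric values are hypothetical and not part of the original analysis.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def log_likelihood(data, p, mu, sd, levels=(0, 1)):
    """Log of the likelihood (3.1).

    data : iterable of (y, z_obs, t) with t = 1 for treatment, 0 for control
    p    : dict mapping (z0, z1) to the stratum probability
    mu   : dict mapping (z0, z1, t) to the stratum mean (assumed normal model)
    sd   : common standard deviation (an assumption of this sketch)
    """
    total = 0.0
    for y, z_obs, t in data:
        mix = 0.0
        for z_other in levels:
            # Treated cases reveal Z1, so we sum over the unobserved Z0;
            # control cases reveal Z0, so we sum over the unobserved Z1.
            z0, z1 = (z_other, z_obs) if t == 1 else (z_obs, z_other)
            mix += p[(z0, z1)] * normal_pdf(y, mu[(z0, z1, t)], sd)
        total += math.log(mix)
    return total

# Hypothetical parameter values and a three-case data set.
p = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}
mu = {(z0, z1, t): 5.0 + 2.0 * z0 - 1.5 * z1 - 1.0 * t
      for z0 in (0, 1) for z1 in (0, 1) for t in (0, 1)}
data = [(4.2, 0, 1), (6.1, 1, 0), (3.3, 1, 1)]
print(log_likelihood(data, p, mu, sd=1.0))
```

Case weights, as discussed in the text, could be accommodated by multiplying each case's `math.log(mix)` term by its weight before summing.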
A similar likelihood was presented for an application of principal stratification methods for estimating the causal effect of a jobs program [Zhang, Rubin and Mealli (2005)]. In their model, $Z_{\max}$ was 1 and the potential outcomes $Y_t$ did not exist when $Z_t = 1$. In our case, outcomes can exist at every level of institutionalization.

Weights can be easily incorporated into the likelihood function in equation (3.1), allowing for maximization of a weighted likelihood function. A natural application of such a weighted likelihood includes the maximization of the likelihood when comparing propensity score weighted clients in one treatment program to unweighted clients in another, as will be illustrated in Section 3.3 below.

For many common distributions, mixture likelihoods like (3.1) can be optimized using the EM algorithm. Theoretically, we can use the EM algorithm to estimate causal effects with any set of data. However, in practice, the solution is unlikely to be so simple. First, if institutionalization can take many possible levels, there will be many levels of $Z$ to model; then, without any additional structure, there will be a very large number of parameters to estimate. To diminish the dimensionality of the problem, one can consider modeling $\theta_{z_0,z_1,t}$ as low order polynomials in $z_0$ and $z_1$. For example, we might assume $E(Y_1 \mid Z_1 = z_1, Z_0 = z_0) = \mu + \beta_1 z_1 + \beta_0 z_0 + \gamma z_1 z_0$. Second, it is well known that the convergence of likelihood optimizers for mixture models can be sensitive to the starting values [Biernacki, Celeux and Gerard (2003), Karlis and Xekalaki (2003), McLachlan (1988), Seidel and Sevcikova (2004)].
Given the large numbers of mixtures involved in this likelihood and the fact that identification of the parameters will depend on matching the mixing proportions/probabilities across groups, sensitivity to starting values is likely and careful consideration of these values will be required. We examine these issues more carefully in Section 4.

3.3. Application of principal stratification to estimating modality effects on SFS in the ATM. To tease out the effect of treatment modality on SFS in the presence of institutionalization, we fit a four strata principal stratification model to our data from the ATM study, where $Z_0$ and $Z_1$ can only take on two values, namely, 0 and 1. Thus, we dichotomized days institutionalized such that $Z_t = 0$ if an individual had 0 days of institutionalization and $Z_t = 1$ otherwise for $t = 0, 1$. We let $t = 1$ and $t = 0$ denote residential and outpatient care treatment modalities, respectively. This model allows us to compute treatment effects among adolescents who would not be institutionalized under either treatment or control and among adolescents who would be institutionalized under both treatment and control.

Because SFS has a very skewed distribution with many observed zeros (see Figure 1), we assumed the underlying distribution for $Y$ within each stratum was tobit [e.g., Maddala (1983)], with density

$$
f(y; \theta_{z_0,z_1,t}) = G(0; \eta^t_{z_0 z_1}, \zeta_t^2)^{1(y \le 0)} \times g(y; \eta^t_{z_0 z_1}, \zeta_t^2)^{1(y > 0)},
$$

where $G(y; \eta, \zeta^2)$ and $g(y; \eta, \zeta^2)$ denote the distribution and density functions, respectively, for a normal random variable with mean $\eta$ and variance $\zeta^2$. The parameters $\eta^t_{z_0 z_1}$ and $\zeta_t^2$ depend on treatment to allow for treatment effects. As parameterized, $\eta^t_{z_0 z_1}$ and $\zeta_t^2$ are not the mean and variance of the tobit distribution but of the truncated normal which defines the tobit.
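The tobit density above can be sketched in a few lines of Python. The function names below are hypothetical, and `zeta` is taken to be the standard deviation (the square root of $\zeta^2$). The helper `tobit_mean` uses the standard closed form for the mean of a normal variable censored at zero, $\eta\,\Phi(\eta/\zeta) + \zeta\,\phi(\eta/\zeta)$; reading the tobit this way (observed outcome equal to the larger of zero and a latent normal draw) is an assumption of the sketch, included only to illustrate why $\eta$ is not itself the mean of the observed outcome.

```python
import math

def normal_cdf(x, mean, sd):
    """Distribution function G(x; mean, sd^2) of a normal random variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_pdf(x, mean, sd):
    """Density function g(x; mean, sd^2) of a normal random variable."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def tobit_density(y, eta, zeta):
    """f(y) = G(0; eta, zeta^2) if y <= 0, and g(y; eta, zeta^2) if y > 0."""
    if y <= 0:
        return normal_cdf(0.0, eta, zeta)   # probability mass placed at zero
    return normal_pdf(y, eta, zeta)

def tobit_mean(eta, zeta):
    """Mean of max(0, X) for X ~ N(eta, zeta^2) (assumed censoring at zero)."""
    a = eta / zeta
    return eta * normal_cdf(a, 0.0, 1.0) + zeta * normal_pdf(a, 0.0, 1.0)

# Hypothetical parameter values, for illustration only.
eta, zeta = 2.0, 3.0
print("mass at zero:", tobit_density(0.0, eta, zeta))
print("density at y = 4:", tobit_density(4.0, eta, zeta))
print("eta:", eta, "mean of censored outcome:", tobit_mean(eta, zeta))
```

Under this reading, a comparison of stratum means on the observed SFS scale would be based on quantities like `tobit_mean` rather than on differences in $\eta$ alone.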
Because the ATM is an observational study, there were observable differences in the pretreatment characteristics of youths entering residential and outpatient care. Given these pretreatment differences, any observed differences in treatment group outcomes could result either from differential effectiveness of the treatment modalities, or because of differences in how hard their respective populations are to treat. In order to isolate just the treatment effects of interest, we must compare treatments on equivalent cases. Thus, for the case study, we compare the effectiveness of the two modalities on cases like those in the ATM sample who entered the residential modality. We achieve this comparison by weighting the outpatient sample so that it closely matches the residential sample in terms of the distribution of 86 pretreatment variables expected to be related to substance use and treatment assignment [Morral, McCaffrey and Ridgeway (2004)]. Details on the weighting and comparison of the weighted groups can be found in McCaffrey et al. (2007).

In the remainder of the analysis reported in this example, we compare the unweighted residential sample ($n = 770$) to the weighted outpatient sample ($n = 486$, effective sample size = 125), a comparison designed to examine whether the residential modality produces better 12-month outcomes than outpatient care for cases with pretreatment characteristics like those of clients who usually enter residential care. We maximize the weighted likelihood in (3.1) using the EM algorithm. Preliminary results suggested that estimates could be very sensitive to starting values used with the EM algorithm. Consequently, we developed the following approach for selecting starting values.
First, among the residential cases observed to have $Z_1 = 0$, we fit a mixture model of two normal distributions using a simple EM algorithm [Dempster, Laird and Rubin (1977)] to obtain initial estimates of the two means and their corresponding mixing proportions for this group. We labeled the two means $\hat{\mu}^1_{0,A}$ and $\hat{\mu}^1_{0,B}$. Under our model, we know that the observed mean for residential cases with $Z_1 = 0$ is a mixture of $\mu^1_{00}$ and $\mu^1_{01}$, so $\hat{\mu}^1_{0,A}$ is an estimate of one of these means and $\hat{\mu}^1_{0,B}$ is an estimate of the other; however, we do not know which is which. We thus used starting values which allow for both possible mappings in this group.

We repeated this procedure for residential cases with $Z_1 = 1$ and for outpatient cases with $Z_0$ equal to 0 and to 1. These steps, in turn, yielded preliminary estimates of the eight mean parameters of the model and their associated mixing proportions. All that remained was to determine how the eight means mapped to the model parameters. There are 16 possible mappings from the preliminary estimates of the simple mixture models to the parameters of the model (two labelings in each of the four observed groups, so $2^4 = 16$). For example, one mapping assumes that $(\hat{\mu}^1_{0,A}, \hat{\mu}^1_{0,B}, \hat{\mu}^1_{1,A}, \hat{\mu}^1_{1,B}, \hat{\mu}^0_{A,0}, \hat{\mu}^0_{B,0}, \hat{\mu}^0_{A,1}, \hat{\mu}^0_{B,1})$ estimates $(\mu^1_{0,0}, \mu^1_{0,1}, \mu^1_{1,0}, \mu^1_{1,1}, \mu^0_{0,0}, \mu^0_{1,0}, \mu^0_{0,1}, \mu^0_{1,1})$. The remaining 15 mappings are obtained by exhaustively switching the mapping of A and B to 0 and 1 for each observed value of $Z_1$ and $Z_0$.

To obtain the final parameter estimates of the model, we ran the EM algorithm to maximize (3.1) separately for each of the 16 possible mappings of the estimates from the simple mixture models and used as final estimates the solution that maximized the log-likelihood among all 16 runs.
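The enumeration of the 16 candidate mappings described above can be sketched as follows. This is an illustrative sketch only: the group keys, the numeric values in `prelim`, and the function name are hypothetical placeholders, not estimates from the ATM data or the authors' supplementary code.

```python
from itertools import product

# Hypothetical preliminary estimates. Each observed group (arm, observed Z_t)
# has two component means, arbitrarily labeled A and B by the simple mixture fit.
prelim = {
    ("t=1", "Z1=0"): {"A": 9.0, "B": 14.0},
    ("t=1", "Z1=1"): {"A": 6.0, "B": 11.0},
    ("t=0", "Z0=0"): {"A": 8.0, "B": 12.0},
    ("t=0", "Z0=1"): {"A": 5.0, "B": 10.0},
}

def enumerate_mappings(prelim):
    """Return one starting-value set per way of assigning A/B to the
    unobserved institutionalization value (0 or 1) in every observed group."""
    groups = sorted(prelim)
    mappings = []
    for flips in product([False, True], repeat=len(groups)):
        start = {}
        for group, flip in zip(groups, flips):
            a, b = prelim[group]["A"], prelim[group]["B"]
            # Key 0/1 is the value of the unobserved Z in this group.
            start[group] = {0: b, 1: a} if flip else {0: a, 1: b}
        mappings.append(start)
    return mappings

mappings = enumerate_mappings(prelim)  # 2**4 = 16 candidate starting-value sets
```

Each element of `mappings` would then seed one EM run, and the run with the largest log-likelihood would be retained, as in the procedure described in the text.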
Simulations described below suggested that this procedure can successfully recover the global maximum from the many local maxima.

It is critical that multiple starting values are utilized when applying principal stratification to data. Figure 2 shows the sensitivity of the resulting parameter estimates to the 16 different starting values used in our analysis. The results from each set of starting values are plotted as vertical bands and denoted by a number from 1 to 16. Each of the 8 estimated model means is denoted by a row, and the resulting estimated values for each mean are plotted for each set of starting values using vertical bars. The first row plots the percentage increase in the negative log-likelihood for each solution compared to the value that minimized it across all sets of starting values (estimate 11 in Figure 2). Each set of starting values leads to a different possible solution, all of which represent a local minimum of the negative log-likelihood in the data. Although the resulting log-likelihood values are very similar (differing by no more than 2 percent), each solution gives very different inferences about the estimated strata means and the treatment effects that exist within each stratum. As shown, the solution which gives the minimum negative log-likelihood value (denoted by estimate 11 in Figure 2) is distinctly different even from the next best estimate of the model parameters (estimate 6 in Figure 2), which assigns the estimated values of the means for $\mu^1_{0,0}$ and $\mu^1_{0,1}$, $\mu^1_{1,0}$ and $\mu^1_{1,1}$, $\mu^0_{0,0}$ and $\mu^0_{1,0}$, and $\mu^0_{0,1}$ and $\mu^0_{1,1}$ in reverse of how they are assigned in the minimizing solution. More generally, it is clear that the alternative solutions tend to find similar values for the various means but "flip" the strata labels associated with those values.
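A small numerical check helps show why flipped solutions can have nearly identical likelihoods. The sketch below is an illustration under simplifying assumptions (a single observed group, normal rather than tobit components, made-up data and parameter values): within one group, swapping the two component means together with their mixing proportions leaves the mixture log-likelihood unchanged, so only the constraints linking mixing proportions across groups can distinguish the labelings.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_loglik(ys, pi_a, mu_a, mu_b, sd):
    """Log-likelihood of a two-component normal mixture with common sd."""
    return sum(
        math.log(pi_a * normal_pdf(y, mu_a, sd) + (1.0 - pi_a) * normal_pdf(y, mu_b, sd))
        for y in ys
    )

# Made-up outcomes for one observed group (for example, one arm with Z_t = 0).
ys = [2.1, 3.4, 7.9, 8.3, 2.8, 9.1]

ll_original = mixture_loglik(ys, 0.4, 3.0, 8.0, 1.0)
ll_flipped = mixture_loglik(ys, 0.6, 8.0, 3.0, 1.0)  # same fit, labels exchanged
```

Here `ll_original` and `ll_flipped` agree up to floating-point rounding, which is consistent with the small differences in log-likelihood seen across the 16 solutions.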
The mixture of tobit models appears to fit the data well, as shown in Figure 3, which plots the histogram of fitted probabilities for each observed condition in our data. Goodness of fit can be noted by the grouping of fitted probabilities at 0 and 1 for the cases of each observed condition. These groupings imply that the majority of our cases have a high probability of falling into one of the two strata of which their observed condition is a mixture. If there had been a more even spread of these values, we would worry about the goodness of fit of the model. Moreover, Table 3 shows that the principal stratification model maintains the observed marginal means and probabilities.

The estimated treatment effects from the mixture model are shown in Table 4. In contrast to the patterns shown in Table 1, when we control for the confounding effects of institutionalization using principal stratification, residential treatment leads to significantly worse outcomes among adolescents who would experience the same level of institutionalization under both treatment modalities. The effect of residential treatment is larger among adolescents who are not institutionalized under both treatment modalities.

Fig. 2. The values of the negative log-likelihood and the estimated means for each of the 16 starting values used in the ATM analysis. Each quantity has a separate y-axis and the estimated value of that quantity for each set of starting values is denoted by the heights of the vertical bars.

Fig. 3. Histogram of fitted probabilities for each observed condition.
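The fitted probabilities summarized in Figure 3 are posterior probabilities of stratum membership given a case's outcome and observed condition. A minimal sketch of that calculation follows; it assumes normal components for simplicity (the analysis in the text uses tobit components), and the function name and all numeric values are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def membership_probability(y, pi_first, mu_first, mu_second, sd):
    """Posterior probability that a case with outcome y belongs to the first
    of the two strata that are mixed within its observed condition."""
    first = pi_first * normal_pdf(y, mu_first, sd)
    second = (1.0 - pi_first) * normal_pdf(y, mu_second, sd)
    return first / (first + second)

# Well-separated stratum means push most probabilities toward 0 or 1.
p_near_first = membership_probability(3.0, 0.5, 3.0, 8.0, 1.0)
p_midpoint = membership_probability(5.5, 0.5, 3.0, 8.0, 1.0)
```

A case whose outcome sits near one stratum mean gets a probability close to 0 or 1, while a case midway between the means with equal mixing proportions gets 0.5; a histogram dominated by such intermediate values would correspond to the "more even spread" described in the text.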
The standard errors for treatment effects within each stratum can be adjusted for the effects of clustering (e.g., here adolescents are clustered within treatment sites) by using a Huber–White sandwich estimate of the variance–covariance matrix for the parameters in the likelihood function [Skinner, Holt and Smith (1989)]. When we control for clustering within the ATM data, the treatment effects are no longer significant within the two strata with the same levels of institutionalization under both treatment modalities.

Table 3
Observed versus predicted marginal means for residential and outpatient care cases

                                Residential care        Outpatient care
                                Predicted   Observed    Predicted   Observed
Mean SFS for Z_obs = 0            11.8        9.8          9.9        10.2
Mean SFS for Z_obs = 1             8.1        8.2          7.8         8.1
Proportion institutionalized       51%        52%          43%         38%

Table 4
Treatment effect estimates and standard errors comparing residential to outpatient treatment using the four-strata principal stratification model. Significant treatment effects are denoted by * and standard errors are unadjusted for clustering

              Z_0 = 0        Z_0 > 0
Z_1 = 0       4.1 (1.7)*     1.3 (0.9)
Z_1 > 0       −1.1 (0.9)     3.3 (1.8)*

4. Evaluation of principal stratification. Estimating the effects of treatment modality via principal stratification requires identification of latent variables from complex mixture data using mixture proportions. In particular, means for cases from different principal strata are identified from the mixing proportions within observed groups. Given that our exploration of starting values suggested that the identification of strata is potentially weak and that label swapping might be possible, we felt it important to explore the properties of the principal stratification method before interpreting our findings on substance abuse treatment. To our knowledge, there have been very few studies using either real or simulated data sets to explore the convergence of the estimation algorithms, the identification of desired parameters, and the precision of the estimates with varying sample sizes for treatment effects from mixture models like model (3.1).

4.1. Methods. To evaluate the method of principal stratification, we began by examining the performance of the four-strata model where $Z_0$ and $Z_1$ can take on only two values, namely, 0 and 1. Additionally, we assumed that the underlying distribution of the outcome, $f(y; \theta_{z_0,z_1,t})$, within each stratum is normal with mean $\mu^t_{z_0 z_1}$ and variance $\sigma^2_t$, where the means and variances can depend on treatment so that treatment effects can exist. We conducted a simulation study of the properties of parameters estimated by maximizing (3.1) under this assumed parametric model. First, data were simulated under this model and the impact of sample size, the value of the principal strata probabilities, and the dispersion of the means within each level of $t$ and $Z_t$ on model performance was examined. Second, data were simulated under heavy-tailed and skewed distributions and analyzed using the normal model to examine the sensitivity of the normal model to model misspecification. In each case, we maximized the likelihood using the methods described above and compared the resulting estimates to the values used in generating the data.

4.2. Results. First, we examined the performance of the estimators for the four-strata model under various assumptions about the sample size per treatment arm ($N$), the strata probabilities and the dispersion of the means between strata within the same level of $t$ and $Z_t$.
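The data-generating step of the simulation design in Section 4.1 can be sketched as below. This is a hedged illustration rather than the authors' simulation code: the function names, the strata probabilities, the base means and the random seed are all hypothetical, and the standard deviation is fixed at 1 so that `dispersion` is expressed in standard deviation units.

```python
import random

def make_means(dispersion, base_control=10.0, base_treat=11.0):
    """Hypothetical stratum means keyed by (t, z0, z1). Within each observed
    group (fixed t and Z_t) the two mixed strata differ by `dispersion`."""
    return {
        (t, z0, z1): (base_treat if t == 1 else base_control) + dispersion * (z0 + z1)
        for t in (0, 1) for z0 in (0, 1) for z1 in (0, 1)
    }

def simulate_arm(t, n, strata_probs, means, sd, rng):
    """Draw n cases for arm t (0 = control, 1 = treatment).
    Returns tuples (observed Z_t, outcome y, true stratum (z0, z1));
    the true stratum is latent in real data and kept only for checking."""
    strata = list(strata_probs)
    weights = [strata_probs[s] for s in strata]
    cases = []
    for _ in range(n):
        z0, z1 = rng.choices(strata, weights=weights)[0]
        y = rng.gauss(means[(t, z0, z1)], sd)
        z_obs = z1 if t == 1 else z0  # only the Z for the assigned arm is seen
        cases.append((z_obs, y, (z0, z1)))
    return cases

rng = random.Random(2008)  # arbitrary seed for reproducibility
strata_probs = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}
means = make_means(dispersion=1.6)
cases_treat = simulate_arm(1, 1000, strata_probs, means, 1.0, rng)
cases_ctrl = simulate_arm(0, 1000, strata_probs, means, 1.0, rng)
```

Varying `n`, `strata_probs` and `dispersion` reproduces the three design factors named in the text; fitting the model to each simulated data set and comparing estimates with `means` and `strata_probs` is the evaluation step.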
Figures 4, 5 and 6, respectively, plot the estimated means for the control group, the estimated means for the treatment group and the estimated probabilities for each stratum versus their true values under the various cases considered. As shown, the method did not begin to perform well until the means within a given level of $t$ and $Z_t$ were at least 1.6 standard deviations away from each other and the sample size within each treatment arm was at least $N = 1000$. See the supplementary material for tabulated results [Griffin, McCaffrey and Morral (2008)].

The performance of the estimators was relatively invariant to the true strata probabilities unless the probabilities were uniform, as shown in the third column of Figures 4 and 5. Specifically, we examined three cases for strata probabilities: one in which all four strata had reasonably large probabilities, another in which one stratum had a particularly small probability of 0.05, and a third in which the probabilities were equal. Performance was similar in the two cases where the probabilities were not all equal, as shown in the first and second columns of Figures 4 and 5. When the probabilities were equal, the method could not identify the correct mapping of the mixture components to the principal strata values because the model was under-identified. Consequently, the maximum likelihood estimation was unable to recover the model parameters.

We also examined the sensitivity of the four-strata model to model misspecification [Griffin, McCaffrey and Morral (2008)]. As expected, the method was sensitive to extremely heavy-tailed and skewed distributions. The algorithm appeared to perform well when the data had only moderately heavy tails or moderate skew.

Unfortunately, our simulation study results suggest that our results for the ATM study must be interpreted with caution. While the dispersion of means within each level of institutionalization in the ATM data is 3.04, the effective sample size in the weighted control group is quite small, only 125, drawing into question the ability of this model to converge to the correct solution even if the principal stratification model holds in our data.

Fig. 4. Simulation results. Plot of fitted means (black dots) versus true values (indicated by dashed lines) for the four strata in the control group with means for other strata (indicated as dotted gray lines) for N = 100, 1000 and 5000, different dispersions of the means, and different assumptions about the strata probabilities.

Fig. 5. Simulation results. Plot of fitted means (black dots) versus true values (indicated by dashed lines) for the four strata in the treatment group with means for other strata (indicated as dotted gray lines) for N = 100, 1000 and 5000, different dispersions of the means, and different assumptions about the strata probabilities.

Our simulation study analysis was repeated for a nine-strata model in which $Z_t$ can take on three values, namely, 0, 1 and 2. However, a number of problems were encountered. First, the number of possible sets of starting values (e.g., mappings between the parameters of the full model and estimates from simple mixture models fit to data) increased significantly, leading to 2192 possible sets of starting values. Given the computational demands of such a large number of starting values, we did not run the EM algorithm for all possible mappings.
Instead we calculated the likelihood at each possible mapping and then tried two techniques for selecting reasonable starting values: (i) choosing the 30 starting values which gave the 30 greatest values of the log-likelihood and (ii) choosing the 30 starting values which represented a spread of the log-likelihood surface. We found that option (i) yielded the best solution (i.e., the solution which gave the highest value of the log-likelihood). However, even with a dispersion of means equal to 2.5 and $N = 5000$ per treatment arm, the best solution from option (i) was unable to recover the model parameters used to simulate the data (results available upon request).

It is likely that adding more parametric assumptions into the nine-strata model would help improve the fit of the model. For example, one could consider modeling the means within each stratum and treatment group as a linear function of the value of $Z_1$ and/or $Z_0$. However, as the number of strata increases, more assumptions will be required which may or may not be likely to hold in practice. Inevitably, extending principal stratification beyond the simple four-strata model becomes intractable and unidentifiable.

Fig. 6. Simulation results. Plot of fitted probabilities (black dots) versus their true values (indicated by dashed lines) for the four strata for N = 100, 1000 and 5000, different dispersions of the means, and different assumptions about the strata probabilities.

5. Discussion. Institutionalization during the follow-up poses serious challenges to estimating treatment effects on outcomes following substance abuse treatment because it restricts opportunities to use drugs or partake in other problem behaviors (e.g., criminal activity or risky sexual activity).
If unaccounted for, it can confound treatment effects and lead to incorrect inferences about the ability of treatment to produce desirable outcomes. Similar problems can occur in many other settings, such as criminal justice and mental health outcomes studies where study participants are also at high risk for institutionalization during the follow-up period.

Because institutionalization occurs post-treatment, treating it as a covariate in analyses can lead to biased results since cases with various levels of institutionalization might differ in terms of their potential outcomes [McCaffrey et al. (2007)]. Most common approaches to the problem require strong assumptions, and the results can be very sensitive to which assumptions are made and which methods are used [McCaffrey et al. (2007)]. For example, use of a joint or composite outcome which looks at the effects of treatment on institutionalization and SFS together conflates the effects that treatment has on institutionalization with the effects it has on SFS. Principal stratification allows us to directly model how the treatment effects on SFS may vary within different levels of institutionalization.

Principal stratification provides a framework for defining causal effects on outcomes in the presence of post-treatment institutionalization. The key idea is that causal effects can be obtained by conditioning on cases with equal values for the observed level of institutionalization for the observed treatment status and the unobserved level of institutionalization for the unobserved treatment status. In situations where institutionalization can take on discrete values, the latent principal strata result in finite mixture models and the parameters of interest can potentially be estimated.
When applied to adolescent substance abuse treatment data in the ATM study, the method suggests that residential treatment may be less effective for youth who are likely to experience the same level of institutionalization under both conditions (residential and outpatient). However, the effect is strongest among those youth who are unlikely to be institutionalized following treatment, that is, the least problematic users. Alternative analyses [McCaffrey et al. (2007)] found a similar result, which improves our confidence in these findings, despite the challenges of the method cited above.

As discussed in the Introduction, stakeholders, like parents, want to know if treatment is effective and for whom it is effective. The principal stratification approach estimates causal effects for youth in different principal strata. Thus, we aim to know if treatment is effective for youth within each principal stratum where institutionalization is constant across treatment conditions. However, the principal strata do not necessarily provide a meaningful classification of adolescents to stakeholders. An important follow-up to our analysis is to determine if principal strata membership can be described using measurable and meaningful baseline variables so that stakeholders might have a better idea about which youths will be best served by treatment or by different treatment modalities.

Beyond this application, our investigation of principal stratification suggests that the potential of principal stratification may be impossible to realize in many empirical studies.
Our study shows that the differences in outcome means among the principal strata must be quite large (1.6 standard deviation units or larger) and that the sample sizes must also be very large ($N \ge 1000$) for estimated effects to correctly identify which means belong to each principal stratum. Unfortunately, such dispersion of means and sample sizes are much greater than those that are likely to arise in many applications. With smaller sample sizes and less dispersed means, group labels are often switched; for instance, the means of the $Z_1 = Z_0 = 0$ stratum might be incorrectly labeled as the means for the $Z_1 = 1$ and $Z_0 = 0$ stratum, yielding incorrect inferences about treatment effects. Additionally, the switched mean values may be very plausible so that there would be no indication of misleading results. The limitations of principal stratification must be considered carefully before using the method in all but the most ideal settings.

As the possible values for institutionalization grow, the problem quickly becomes more unmanageable. The number of parameters for the principal strata and the number of mixtures that need to be identified grow rapidly; even finding starting values becomes an extremely challenging task. Estimation in these contexts will require additional assumptions about the relationship between outcome and institutionalization (e.g., outcomes decline linearly with increased institutionalization) and assumptions about the joint distribution of institutionalization under treatment and control to make the problem more tractable.
Given the challenges of principal stratification when we consider many values for institutionalization, a possible option may be to ignore the outcomes from institutionalized cases and try to estimate an unsuppressed treatment effect restricted to cases that would not be institutionalized under either treatment or control. Our approach to starting values could be used and trustworthy estimates might be obtained in some real world settings. The limitation of this approach would be the inability to generalize the estimates beyond cases that are likely never to be institutionalized. That is, we could not estimate the unsuppressed effect for all cases nor could we determine treatment effects on institutionalized youth.

Using a Bayesian approach might address some of the computational problems with maximizing the likelihood since integrating over a prior helps to reduce sensitivity to the starting values. However, special care would be needed when using Markov chain Monte Carlo methods to sample the posterior to avoid label switching of the mixtures within the observed values of institutionalization [Jasra, Holmes and Stephens (2005)]. Strata label switching will occur because the data provide only very indirect information about the joint distribution of potential institutionalization. Use of an informative prior would be one means of overcoming this limitation of the data via Bayesian analysis. However, this approach is unappealing because analysts are unlikely to be able to make good informed guesses about the joint distribution, so that informed priors would be difficult to choose.
Obtaining sufficient information about the joint distribution of the potential institutionalization outcomes is a clear challenge to using the framework of principal stratification to provide useful estimates of treatment effects. One approach to obtaining more information about this distribution would be to collect data at baseline that might be strongly related to the principal strata and use the information as covariates in the model for the strata probabilities. For institutionalization following substance use treatment this information might include criminal activity, detailed information about involvement with the criminal justice system, history with substance abuse treatment, the availability of various types of substance abuse treatment and sources of payment for the treatment.

Another approach for increasing the information about the principal strata is to jointly model multiple outcomes such as multiple indicators of substance use and criminal activity. Because principal strata are defined by institutionalization and not by other outcomes, principal strata designation should not depend on which outcome is modeled. Combining multiple outcomes can therefore provide more information about the latent principal strata membership and should improve estimation of treatment effects for every outcome. A downside of this approach is the necessity of specifying the joint distribution for multiple outcomes, but the additional modeling might be very beneficial for identifying the principal strata and for the accuracy of the resulting treatment effect estimates.

We might consider simply using the principal stratification framework for sensitivity analyses.
For example, rather than trying to estimate the joint distribution of $Z_0$ and $Z_1$, we might specify the joint distribution in terms of the parameters of the marginal distributions of $Z_0$ and $Z_1$ and a parameter specifying their correlation. The value of the correlation parameter could be manipulated to study the sensitivity of the treatment effect estimates to different assumptions about the principal strata. For example, when institutionalization can take on just two values, we could specify the correlation by the interaction from a log-linear model. For continuous institutionalization, we could specify the correlation parameter directly.

Principal stratification is an important tool for approaching the problem of institutionalization during the follow-up, because it provides a framework for defining estimates of interest and possible methods for estimating them. However, it is also clear that it is not a panacea to the problem because the model parameters are at best weakly identified and it does not extend easily to problems with many possible values for institutionalization. The extensions described above provide useful areas for future research that may significantly improve the application of principal stratification in real world settings.

Acknowledgment. The authors are grateful to Rajeev Ramchand for his helpful suggestions on the manuscript.

SUPPLEMENTARY MATERIAL

Supplementary tables for "An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs" (DOI: 10.1214/08-AOAS179SUPPA; .pdf). This file contains tabulated results of the simulation study of the principal stratification method.
Example data for running principal stratification model in "An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs" (DOI: 10.1214/08-AOAS179SUPPB; .csv). This file contains the dataset described in the paper.

Example code for running principal stratification model in "An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs" (DOI: 10.1214/08-AOAS179SUPPC; .txt). This file contains the code used to run the models in the paper.

REFERENCES

Biernacki, C., Celeux, G. and Govaert, G. (2003). Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput. Statist. Data Anal. 41 561–575. MR1968069

Dempster, A., Laird, N. and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1–38. MR0501537

Dennis, M. L. (1999). Global Appraisal of Individual Needs (GAIN) Manual: Administration, Scoring and Interpretation. Lighthouse Publications, Bloomington, IL.

Efron, B. and Feldman, D. (1991). Compliance as an explanatory variable in clinical trials. J. Amer. Statist. Assoc. 86 9–17.

Frangakis, C. and Rubin, D. (2002). Principal stratification in causal inference. Biometrics 58 21–29. MR1891039

Griffin, B., McCaffrey, D. and Morral, A. (2008). Supplement to "An application of principal stratification to control for institutionalization at follow-up in studies of substance abuse treatment programs." DOI: 10.1214/08-AOAS179SUPPA; DOI: 10.1214/08-AOAS179SUPPB; DOI: 10.1214/08-AOAS179SUPPC.

Holland, P. W. (1986). Statistics and causal inference. J. Amer. Statist. Assoc. 81 945–970. MR0867618

Hubbard, R. L., Craddock, S. G., Flynn, P. M., Anderson, J. and Etheridge, R. (1997). Overview of 1-year follow-up outcomes in the Drug Abuse Treatment Outcome Study (DATOS). Psychology of Addictive Behaviors 11 261–278.

Jasra, A., Holmes, C. and Stephens, D. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statist. Sci. 20 50–67. MR2182987

Karlis, D. and Xekalaki, E. (2003). Choosing initial values for the EM algorithm for finite mixtures. Comput. Statist. Data Anal. 41 577–590. MR1968070

Maddala, G. (1983). Limited-Dependent and Qualitative Variables in Econometrics. Cambridge Univ. Press. MR0799154

McCaffrey, D. F., Morral, A. R., Ridgeway, G. and Griffin, B. (2007). Interpreting treatment effects when cases are institutionalized after treatment. Drug and Alcohol Dependence 89 126–138.

McLachlan, G. (1988). On the choice of starting values for the EM algorithm in fitting mixture models. The Statistician 87 417–425.

Morral, A., McCaffrey, D. and Ridgeway, G. (2004). Effectiveness of community-based treatment for substance-abusing adolescents: 12-month outcomes of youths entering Phoenix Academy or alternative probation dispositions. Psychology of Addictive Behaviors 18 257–268.

NORC and RTI (1997). National Treatment Improvement Evaluation Survey (NTIES): Final Report. Center for Substance Abuse Treatment, Substance Abuse and Mental Health Services Administration, Rockville, MD.

Pearl, J. (1996). Proceedings of the Sixth Conference of Theoretical Aspects of Rationality and Knowledge, Chapter Causation, Action and Counterfactuals 57–73. Morgan Kaufmann, San Francisco, CA. MR1443674

Rosenbaum, P. and Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55. MR0742974

Rubin, D. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279–292.

Seidel, W. and Sevcikova, H. (2004). Types of likelihood maxima in mixture models and their implication on the performance of tests. Ann. Inst. Statist. Math. 56 631–654. MR2137630

Skinner, C. J., Holt, D. and Smith, T. M. F., eds. (1989). Analysis of Complex Surveys. Wiley, New York. MR1049386

Stevens, S. and Morral, A., eds. (2003). Adolescent Substance Abuse Treatment in the United States: Exemplary Models from a National Evaluation Study. Haworth Press, New York.

U.S. Dept. of Health and Human Services, National Institute on Drug Abuse (2004). Drug Abuse Treatment Outcome Study (DATOS), 1991–1994: [United States] [Computer file], 2nd ICPSR ed. Inter-university Consortium for Political and Social Research [producer and distributor], Ann Arbor, MI.

U.S. Dept. of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Substance Abuse Treatment (2004). National Treatment Improvement Evaluation Study (NTIES), 1992–1997 [Computer file], 3rd ICPSR ed. Inter-university Consortium for Political and Social Research [producer and distributor], Ann Arbor, MI.

Zhang, J., Rubin, D. and Mealli, F. (2005). Evaluating causal effects in the presence of "truncation by death": likelihood-based analysis via principal stratification. Unpublished. Available at http://faculty.smu.edu/millimet/AIE/pdf/JunniZhang.pdf.

B. A. Griffin
A. R. Morral
RAND Corporation
1200 South Hayes Street
Arlington, Virginia 22202
USA
E-mail: bethg@rand.org
morral@rand.org

D. F. McCaffrey
RAND Corporation
4570 5th Avenue
Pittsburgh, Pennsylvania 15213
USA
E-mail: danielm@rand.org
