The Evaluation of Causal Effects in Studies with an Unobserved Exposure/Outcome Variable: Bounds and Identification

Manabu Kuroki
Department of Systems Innovation, Graduate School of Engineering Science, Osaka University
mkuroki@sigmath.es.osaka-u.ac.jp

Zhihong Cai
Department of Biostatistics, School of Public Health, Kyoto University
cai@pbh.med.kyoto-u.ac.jp

Abstract

This paper deals with the problem of evaluating the causal effect using observational data in the presence of an unobserved exposure/outcome variable, when cause-effect relationships between variables can be described as a directed acyclic graph and the corresponding recursive factorization of a joint distribution. First, we propose identifiability criteria for causal effects when an unobserved exposure/outcome variable is considered to contain more than two categories. Next, when unmeasured variables exist between an unobserved outcome variable and its proxy variables, we provide the tightest bounds based on the potential outcome approach. The results of this paper are helpful for evaluating causal effects in the case where it is difficult or expensive to observe an exposure/outcome variable in many practical fields.

1 INTRODUCTION

The evaluation of causal effects from observational studies is one of the central aims in many fields of practical science. For this purpose, many researchers have attempted to clarify cause-effect relationships and to evaluate the causal effect of an exposure variable on an outcome variable through observed data. Statistical causal analysis, which is one of the powerful tools for solving these problems, started with path analysis (Wright, 1923, 1934), and advanced to structural equation models (Wold, 1954; Bollen, 1989). It has also been modified in order to be applicable to categorical data (Goodman, 1973, 1974a, 1974b; Hagenaars, 1993).
Recently, Pearl (2000) developed a new framework of causal modeling based on a directed acyclic graph and the corresponding nonparametric structural equation model.

In observational studies, there often exist unobserved variables, which makes it difficult to evaluate reliable causal effects. Many researchers have proposed various useful approaches to evaluate causal effects when unobserved variables are confounding factors between an exposure variable and an outcome variable, such as the instrumental variable method and sensitivity analysis. In the context of graphical causal models, Pearl (2000) provided the mathematical definition of the causal effect. In addition, when both an exposure variable and an outcome variable are observed, Pearl (2000), Tian and Pearl (2002) and Shpitser and Pearl (2006) discussed several graphical identification conditions for causal effects, which enable us to recognize situations where the causal effects can be evaluated from observational data.

However, in some situations, even an exposure/outcome variable is unobserved. For example, in a study to examine whether the socioeconomic gradient has an influence on low birth-weight, socioeconomic status is measured by some proxy variables such as income, wealth, education and occupation, since the true socioeconomic status is unobserved (Finch, 2003). Another example concerning an unobserved exposure is in occupational settings. Many epidemiological studies have addressed the question of carcinogenicity in workers exposed to diesel exhaust and coal mine dust, and most showed a low-to-medium increase in the risk of lung cancer. However, exposure measurement in these studies is mainly inferred on the basis of job classifications and may lead to misclassification (Hoffmann and Jockel, 2006). On the other hand, as an example concerning an unobserved outcome, Fleiss et al.
(1976) reported a comparative clinical trial of ibuprofen, aspirin and placebo in the relief of post-extraction pain. Since the true outcome (pain relief) is unobserved, they used the Ridit analysis (Bross, 1958) to divide patients into five categories of pain relief: none, poor, fair, good and very good. These examples show the importance of evaluating causal effects when an exposure/outcome variable is unobserved.

Kuroki et al. (2005) pointed out that it is difficult to apply the identification criteria proposed by Pearl and his colleagues to evaluate causal effects in such situations, and provided the graphical identifiability criteria when an unobserved exposure/outcome variable is continuous. In addition, Kuroki (2007) arranged the identification conditions proposed by Kuroki et al. (2005) to the case where an exposure/outcome variable is dichotomous. However, in many situations, researchers and practitioners are more interested in the different exposure levels (e.g., none, low, medium and high) than the pure binary exposure (exposed vs. unexposed), and are also more interested in the response levels (e.g., none, poor, fair, good and very good) than the simple binary response (improved vs. not improved).

Then, the main purpose of this paper is to provide identifiability criteria for causal effects from observational studies in the presence of an unobserved exposure/outcome variable with more than two categories. It will be shown that if we can observe some proxy variables that are affected by the unobserved variable, then the causal effect can be evaluated by using statistical causal analysis. More generally, we consider the case where there exist unmeasured variables between the unobserved exposure/outcome variable and its proxy variables. Under such a situation, the causal effect is not identifiable but the bounds on the causal effect can be derived.
Finally, we illustrate our results with an example about social science.

2 PRELIMINARIES

2.1 BAYESIAN NETWORKS

Let $f(v_1, v_2, \ldots, v_n)$ be a strictly positive joint distribution of a set $V = \{V_1, V_2, \cdots, V_n\}$ of variables, $f(v_i \mid v_j)$ the conditional distribution of $V_i$ given $V_j = v_j$ ($V_i, V_j \in V$) and $f(v_i)$ the marginal distribution of $V_i$. Similar notations are used for other distributions. For graph theoretic terminology used in this paper, refer to Kuroki et al. (2005).

Suppose that a set $V$ of variables and a directed acyclic graph $G = (V, E)$ are given. When the joint distribution of $V$ is factorized recursively according to the graph $G$ as the following equation, the graph is called a Bayesian network:

$$f(v_1, v_2, \cdots, v_n) = \prod_{i=1}^{n} f(v_i \mid \mathrm{pa}(v_i)). \tag{1}$$

When $\mathrm{pa}(v_i)$ is an empty set, $f(v_i \mid \mathrm{pa}(v_i))$ is the marginal distribution $f(v_i)$ of $v_i$.

If a joint distribution is factorized recursively according to the graph $G$, the conditional independencies implied by the factorization (1) can be obtained from the graph $G$ according to the d-separation criterion (Pearl, 1988), that is, if $Z_1$ d-separates $Z_2$ from $Z_3$ in a directed acyclic graph $G$ ($Z_1, Z_2, Z_3 \subset V$), then $Z_2$ is conditionally independent of $Z_3$ given $Z_1$ in the corresponding recursive factorization (1); see, for example, Geiger et al. (1990).

2.2 CAUSAL EFFECT

Pearl (2000) defined a causal effect as a distribution of an outcome variable when conducting an external intervention, where an ‘external intervention’ means that a variable is forced to take on some fixed value, regardless of the values of other variables. If the distribution of the remaining variables represented in the directed acyclic graph remains essentially unchanged by such an external intervention, then the graph can be regarded as a causal diagram and the effect of the external intervention can be calculated from the joint factorized distribution.
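To make the recursive factorization (1) and the idea of an external intervention concrete, the following Python sketch builds a joint distribution for a three-variable graph and compares intervening on a variable with conditioning on it. The graph Z -> X, Z -> Y, X -> Y, the probability tables, and the function names are all assumptions made for illustration; none of them come from the paper. Intervention is implemented here by dropping the factor of the intervened variable from the product in (1), which is the operation that Definition 1 states formally.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical binary model on the graph Z -> X, Z -> Y, X -> Y.
f_z = {0: F(1, 2), 1: F(1, 2)}                   # f(z)
f_x_given_z = {0: {0: F(4, 5), 1: F(1, 5)},      # f_x_given_z[z][x] = f(x | z)
               1: {0: F(1, 5), 1: F(4, 5)}}
f_y1 = {(0, 0): F(1, 10), (0, 1): F(1, 2),       # f(Y = 1 | x, z), keyed by (x, z)
        (1, 0): F(3, 10), (1, 1): F(7, 10)}


def f_y_given_xz(y, x, z):
    p_one = f_y1[(x, z)]
    return p_one if y == 1 else 1 - p_one


def joint(z, x, y):
    """Equation (1): product of each variable's conditional given its parents."""
    return f_z[z] * f_x_given_z[z][x] * f_y_given_xz(y, x, z)


def intervened(y, x):
    """Divide the joint by the factor f(x | z) of the fixed variable, sum out z."""
    return sum(joint(z, x, y) / f_x_given_z[z][x] for z in (0, 1))


def conditional(y, x):
    """Ordinary conditioning f(y | x), shown for comparison."""
    numerator = sum(joint(z, x, y) for z in (0, 1))
    denominator = sum(joint(z, x, yy) for z in (0, 1) for yy in (0, 1))
    return numerator / denominator


total = sum(joint(z, x, y) for z, x, y in product((0, 1), repeat=3))
print(total, intervened(1, 1), conditional(1, 1))
```

In this hypothetical model the two quantities differ for Y = 1 under X = 1 (1/2 after intervention versus 31/50 under conditioning), because Z influences both X and Y.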
The exact definition is given as follows.

DEFINITION 1
Let $V = \{X, Y\} \cup Q$ ($\{X, Y\} \cap Q = \phi$) be a set of variables represented in a Bayesian network $G$. If the distribution of $Y$ after setting $X$ to a value $x$ is given by

$$f(y \mid \mathrm{set}(X = x)) = \sum_{q} \frac{f(x, y, q)}{f(x \mid \mathrm{pa}(x))}, \tag{2}$$

then $G$ is called a causal diagram with regard to $X$ and equation (2) is called a causal effect of $X$ on $Y$. Here, $\mathrm{set}(X = x)$ means that $X$ is set to a value $x$ by an external intervention. □

If Definition 1 holds true with regard to all pairs of variables in the graph, then the whole graph is said to be causal. For more details about the relationship between Bayesian networks and causal diagrams, see Pearl (2000).

Given a causal diagram $G$, in order to evaluate the causal effect $f(y \mid \mathrm{set}(X = x))$ of $X$ on $Y$ from a joint factorized distribution of observed variables, it is required to observe not only $X$ and $Y$ but also a set $Z$ of other variables, such as confounders. Pearl (2000) provided ‘the back door criterion’ as one of the graphical identifiability criteria for causal effects $f(y \mid \mathrm{set}(X = x))$, where ‘identifiable’ means that $f(y \mid \mathrm{set}(X = x))$ can be determined uniquely from a joint distribution of observed variables.

DEFINITION 2
Suppose that $X$ is a non-descendant of $Y$ in a directed acyclic graph $G$. If a set $Z$ of vertices satisfies the following conditions relative to an ordered pair $(X, Y)$ of vertices, then $Z$ is said to satisfy the back door criterion relative to $(X, Y)$:
(i) no vertex in $Z$ is a descendant of $X$;
(ii) $Z$ blocks every path between $X$ and $Y$ that contains an arrow pointing to $X$. □

If a set $Z$ of variables satisfies the back door criterion relative to $(X, Y)$, then the causal effect $f(y \mid \mathrm{set}(X = x))$ of $X$ on $Y$ is identifiable through the observation of $Z \cup \{X, Y\}$ and is given by the formula

$$f(y \mid \mathrm{set}(X = x)) = \sum_{z} f(y \mid x, z)\, f(z). \tag{3}$$

When the back door criterion cannot be applied to evaluate causal effects, Pearl (2000) provided ‘the front door criterion’, which is as follows:

DEFINITION 3
Suppose that $X$ is a non-descendant of $Y$ in a directed acyclic graph $G$. If a set $Z$ of variables satisfies the following conditions relative to an ordered pair $(X, Y)$ of variables, then $Z$ is said to satisfy the front door criterion relative to $(X, Y)$:
(i) $Z$ blocks all directed paths from $X$ to $Y$;
(ii) an empty set blocks every path between $X$ and $Z$ that contains an arrow pointing to $X$;
(iii) $X$ blocks every path between any vertex in $Z$ and $Y$. □

If a set $Z$ of variables satisfies the front door criterion relative to $(X, Y)$, then the causal effect $f(y \mid \mathrm{set}(X = x))$ of $X$ on $Y$ is identifiable through the observation of $Z \cup \{X, Y\}$ and is given by the formula

$$f(y \mid \mathrm{set}(X = x)) = \sum_{x', z} f(y \mid x', z)\, f(z \mid x)\, f(x'). \tag{4}$$

3 IDENTIFICATION OF CAUSAL EFFECTS

In section 2, it is assumed that both an exposure variable and an outcome variable are observable. If either of them is unobserved, we cannot identify the causal effect of an exposure on an outcome even if a set of variables satisfying the back door criterion or the front door criterion is observed. In this section, we consider the case where an unobserved exposure/outcome variable is assumed to be discrete. Let $X$ be an exposure variable and $Y$ be an outcome variable. Though $X$ or $Y$ is unobserved, researchers are interested in dividing them into $k$ categories. For example, when the domain of $Y$ is divided into $k = 3$ categories, $y_1$, $y_2$ and $y_3$ may represent the poor, fair and good response levels. Then, let $U$ be either $X$ or $Y$ which is an unobserved variable ($u \in \{u_1, \cdots, u_k\}$). In addition, let a set $S$ and a set $T$ be observed proxy variables that are affected by the unobserved variable $U$.
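As a reference point for the fully observed case, formulas (3) and (4) of Section 2.2 can be checked numerically on a small model in which every table is known. The sketch below is an illustration under stated assumptions and is not taken from the paper: the graph C -> X, C -> Y, X -> Z, Z -> Y, the binary probability tables, and all function names are hypothetical. In this graph the set {C} satisfies the back door criterion relative to (X, Y) and the set {Z} satisfies the front door criterion, so both formulas should reproduce the causal effect computed from equation (2).

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical binary model on the graph C -> X, C -> Y, X -> Z, Z -> Y.
f_c1 = F(3, 5)                                   # f(C = 1)
f_x1 = {0: F(1, 5), 1: F(7, 10)}                 # f(X = 1 | c)
f_z1 = {0: F(1, 10), 1: F(4, 5)}                 # f(Z = 1 | x)
f_y1 = {(0, 0): F(1, 10), (0, 1): F(2, 5),       # f(Y = 1 | z, c), keyed by (z, c)
        (1, 0): F(1, 2), (1, 1): F(9, 10)}


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one


def joint(c, x, z, y):
    """Recursive factorization (1) for this graph."""
    return (bern(f_c1, c) * bern(f_x1[c], x)
            * bern(f_z1[x], z) * bern(f_y1[(z, c)], y))


def prob(c=None, x=None, z=None, y=None):
    """Marginal probability of the arguments that are not None."""
    want = (c, x, z, y)
    return sum(joint(*v) for v in product((0, 1), repeat=4)
               if all(w is None or w == g for w, g in zip(want, v)))


def truth(y, x):
    """Equation (2): divide the joint by f(x | pa(x)) = f(x | c), sum out c and z."""
    return sum(joint(c, x, z, y) / bern(f_x1[c], x)
               for c, z in product((0, 1), repeat=2))


def back_door(y, x):
    """Formula (3) with adjustment set {C}; this needs C to be observed."""
    return sum(prob(c=c, x=x, y=y) / prob(c=c, x=x) * prob(c=c) for c in (0, 1))


def front_door(y, x):
    """Formula (4) with Z; only the joint distribution of (X, Z, Y) is used."""
    return sum(prob(x=xp, z=z, y=y) / prob(x=xp, z=z)
               * prob(x=x, z=z) / prob(x=x) * prob(x=xp)
               for xp, z in product((0, 1), repeat=2))


naive = prob(x=1, y=1) / prob(x=1)               # plain conditioning f(y | x)
print(truth(1, 1), back_door(1, 1), front_door(1, 1), naive)
```

Exact rational arithmetic is used so that the three routes to the causal effect can be compared with `==` rather than with a floating-point tolerance. Plain conditioning gives a different value in this model, since C affects both X and Y.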
Assume that we can select $k$ distinct vectors from the domains of a set $S$ and a set $T$ of variables, denoted as $s_1, \cdots, s_k$ and $t_1, \cdots, t_k$, respectively. A set $W$ and a set $Z$ are assumed to be continuous and/or discrete variables. Furthermore, let $P$ and $Q$ be $k$ dimensional nonsingular matrices such that

$$P = \begin{pmatrix} 1 & f(t_1 \mid z) & \cdots & f(t_{k-1} \mid z) \\ f(s_1 \mid z) & f(s_1, t_1 \mid z) & \cdots & f(s_1, t_{k-1} \mid z) \\ \vdots & \vdots & \ddots & \vdots \\ f(s_{k-1} \mid z) & f(s_{k-1}, t_1 \mid z) & \cdots & f(s_{k-1}, t_{k-1} \mid z) \end{pmatrix}, \tag{5}$$

$$Q = \begin{pmatrix} f(w \mid z) & f(w, t_1 \mid z) & \cdots & f(w, t_{k-1} \mid z) \\ f(w, s_1 \mid z) & f(w, s_1, t_1 \mid z) & \cdots & f(w, s_1, t_{k-1} \mid z) \\ \vdots & \vdots & \ddots & \vdots \\ f(w, s_{k-1} \mid z) & f(w, s_{k-1}, t_1 \mid z) & \cdots & f(w, s_{k-1}, t_{k-1} \mid z) \end{pmatrix}. \tag{6}$$

Then, the following theorem is obtained.

THEOREM 1
Given a causal diagram $G$ on $V$ with $S \cup T \cup \{U\} \cup Z \cup W$ ($\subset V$), suppose that
(i) $Z \cup \{U\}$ d-separates $S$ from $T$ and $W$ from $S \cup T$;
(ii) $f(u_1 \mid z) < \cdots$
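For k = 2 the matrices in (5) and (6) are 2 x 2 and can be assembled directly from observable probabilities. The following Python sketch does this for one fixed value of z under a hypothetical model in which S, T and W are binary and mutually independent given U, so that the conditional independencies required by condition (i) hold by construction. All numbers and helper names are assumptions. The final check illustrates one algebraic consequence of that conditional independence: the trace and determinant of Q P^{-1} equal the sum and product of f(w | u_1, z) and f(w | u_2, z). It is offered only as a plausibility check on the construction and should not be read as the statement of Theorem 1.

```python
from fractions import Fraction as F

# Hypothetical latent-class model for one fixed z (z is suppressed below).
# U has k = 2 categories; S, T, W are binary and independent given U.
p_u = [F(3, 10), F(7, 10)]    # f(u_i | z), i = 1, 2
p_s1 = [F(1, 5), F(4, 5)]     # f(s_1 | u_i, z)
p_t1 = [F(1, 4), F(3, 5)]     # f(t_1 | u_i, z)
p_w = [F(1, 10), F(1, 2)]     # f(w | u_i, z)


def mix(*tables):
    """Sum over u_i of f(u_i | z) times the product of the given tables at u_i."""
    total = F(0)
    for i, weight in enumerate(p_u):
        term = weight
        for table in tables:
            term *= table[i]
        total += term
    return total


# Equation (5) with k = 2: entries are observable probabilities given z.
P = [[F(1), mix(p_t1)],
     [mix(p_s1), mix(p_s1, p_t1)]]
# Equation (6) with k = 2: the same entries, jointly with W = w.
Q = [[mix(p_w), mix(p_w, p_t1)],
     [mix(p_w, p_s1), mix(p_w, p_s1, p_t1)]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)  # must be nonzero, i.e. m is nonsingular
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


R = matmul2(Q, inv2(P))
trace_R = R[0][0] + R[1][1]
det_R = det2(R)
print(det2(P), trace_R, det_R)
```

The nonsingularity of P in this sketch depends on the two latent classes having different values of f(s_1 | u_i, z) and of f(t_1 | u_i, z); if either pair coincided, the determinant of P would be zero and the inversion step would fail.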
