Defaults and Normality in Causal Structures

Joseph Y. Halpern∗
Cornell University
Dept. of Computer Science
Ithaca, NY 14853
halpern@cs.cornell.edu
http://www.cs.cornell.edu/home/halpern

Abstract

A serious defect with the Halpern-Pearl (HP) definition of causality is repaired by combining a theory of causality with a theory of defaults. In addition, it is shown that (despite a claim to the contrary) a cause according to the HP condition need not be a single conjunct. A definition of causality motivated by Wright's NESS test is shown to always hold for a single conjunct. Moreover, conditions that hold for all the examples considered by HP are given that guarantee that causality according to (this version of) the NESS test is equivalent to the HP definition.

1 Introduction

Getting an adequate definition of causality is difficult. There have been numerous attempts, in fields ranging from philosophy to law to computer science (see, e.g., [Collins, Hall, and Paul 2004; Hart and Honoré 1985; Pearl 2000]). A definition by Halpern and Pearl (HP from now on), first introduced in [Halpern and Pearl 2001], using structural equations, has attracted some attention recently. The intuition behind this definition, which goes back to Hume [1748], is that A is a cause of B if, had A not happened, B would not have happened. For example, despite the fact that it was raining and I was drunk, the faulty brakes are the cause of my accident because, had the brakes not been faulty, I would not have had the accident.

As is well known, this definition does not quite work. To take an example due to Wright [1985], suppose that Victoria, the victim, drinks a cup of tea poisoned by Paula, but before the poison takes effect, Sharon shoots Victoria, and she dies. We would like to call Sharon's shot the cause of Victoria's death, but if Sharon hadn't shot, Victoria would have died in any case.
HP deal with this by, roughly speaking, considering the contingency where Sharon does not shoot. Under that contingency, Victoria dies if Paula administers the poison, and otherwise does not. To prevent the poisoning from also being a cause of Victoria's death, HP put some constraints on the contingencies that could be considered.

Unfortunately, two significant problems have been found with the original HP definition, each leading to situations where the definition does not match most people's intuitions regarding causality. The first, observed by Hopkins and Pearl [2003] (see Example 3.3), showed that the constraints on the contingencies were too liberal. This problem was dealt with in the journal version of the HP paper [Halpern and Pearl 2005] by putting a further constraint on contingencies. The second problem is arguably deeper. As examples of Hall [2007] and Hiddleston [2005] show, the HP definition gives inappropriate answers in cases that have structural equations isomorphic to ones where the HP definition gives the appropriate answer (see Example 4.1). Thus, there must be more to causality than just the structural equations. The final HP definition recognizes this problem by viewing some contingencies as "unreasonable" or "farfetched". However, in some of the examples, it is not clear why the relevant contingencies are more farfetched than others. I show that the problem is even deeper than that: there is no way of viewing contingencies as "farfetched" independent of the actual contingency that can solve the problem.

∗ Supported in part by NSF under grants ITR-0325453 and IIS-0534064, and by AFOSR under grant FA9550-05-1-0055. Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

This paper has two broad themes, motivated by the two problems in the HP definition.
First, I propose a general approach for dealing with the second problem, motivated by the following well-known observation in the psychology literature [Kahneman and Miller 1986, p. 143]: "an event is more likely to be undone by altering exceptional than routine aspects of the causal chain that led to it." In the language of this paper, a contingency that differs from the actual situation by changing something that is atypical in the actual situation is more reasonable than one that differs by changing something that is typical in the actual situation. To capture this intuition formally, I use a well-understood approach to dealing with defaults and normality [Kraus, Lehmann, and Magidor 1990]. Combining a default theory with causality, using the intuitions of Kahneman and Miller, leads to a straightforward solution to the second problem. The idea is that, when showing that if A hadn't happened then B would not have happened, we consider only contingencies that are more normal than the actual world. For example, if someone typically leaves work at 5:30 PM and arrives home at 6, but, due to unusually bad traffic, arrives home at 6:10, the bad traffic is typically viewed as the cause of his being late, not the fact that he left at 5:30 (rather than 5:20).

The second theme of this paper is a comparison of the HP definition to perhaps the best worked-out approach to causality in the legal literature: the NESS (Necessary Element of a Sufficient Set) test, originally described by Hart and Honoré [1985], and worked out in greater detail by Wright [1985, 1988, 2001]. This is motivated in part by the first problem. As shown by Eiter and Lukasiewicz [2002] and Hopkins [2001], the original HP definition had the property that causes were always single conjuncts; that is, it is never the case that $A \wedge A'$ is a cause of $B$ if $A \neq A'$.
This property, which plays a critical role in the complexity results of Eiter and Lukasiewicz [2002], was also claimed to hold for the revised definition [Halpern and Pearl 2005] (which was revised precisely to deal with the first problem) but, as I show here, it does not. Nevertheless, for all the examples considered in the literature, the cause is always a single conjunct. Considering the NESS test helps explain why.

While the NESS test is simple and intuitive, and deals well with many examples, as I show here, it suffers from some serious problems. In particular, it lacks a clear definition of what it means for a set of events to be sufficient for another event to occur. I provide such a definition here, using ideas from the HP definition of causality. Combining these ideas with the intuition behind the NESS test leads to a definition of causality that (a) often agrees with the HP definition (indeed, does so on all the examples in the HP paper) and (b) has the property that a cause is always a single conjunct. I provide a sufficient condition (that holds in all the examples in the HP paper) for when the NESS test definition implies the HP definition, thus also providing an explanation as to why the cause is a single conjunct according to the HP definition in so many cases.

I conclude this introduction with a brief discussion of related work. There has been a great deal of work on causality in philosophy, statistics, AI, and the law. It is beyond the scope of this paper to review it; the HP paper has some comparison of the HP approach to other approaches, particularly those in the philosophy literature.
It is perhaps worth mentioning here that the focus of this work is quite different from the AI work on formal action theory (see, for example, [Lin 1995; Sandewall 1994; Reiter 2001]), which is concerned with applying causal relationships so as to guide actions, as opposed to the focus here on extracting the actual causality relation from a specific scenario.

2 Causal Models

In this section, I briefly review the formal model of causality used in the HP definition. More details, intuition, and motivation can be found in [Halpern and Pearl 2005] and the references therein.

The HP approach assumes that the world is described in terms of random variables and their values. For example, if we are trying to determine whether a forest fire was caused by lightning or an arsonist, we can take the world to be described by three random variables: $FF$ for forest fire, where $FF = 1$ if there is a forest fire and $FF = 0$ otherwise; $L$ for lightning, where $L = 1$ if lightning occurred and $L = 0$ otherwise; $M$ for match (dropped by arsonist), where $M = 1$ if the arsonist drops a lit match, and $M = 0$ otherwise. The choice of random variables determines the language used to frame the situation. Although there is no "right" choice, clearly some choices are more appropriate than others. For example, when trying to determine the cause of Sam's lung cancer, if there is no random variable corresponding to smoking in a model then, in that model, we cannot hope to conclude that smoking is a cause of Sam's lung cancer.

Some random variables may have a causal influence on others. This influence is modeled by a set of structural equations. For example, to model the fact that if a match is lit or lightning strikes then a fire starts, we could use the random variables $M$, $FF$, and $L$ as above, with the equation $FF = \max(L, M)$.
The equality sign in this equation should be thought of more like an assignment statement in programming languages; once we set the values of $M$ and $L$, the value of $FF$ is set to their maximum. However, despite the equality, if a forest fire starts some other way, that does not force the value of either $M$ or $L$ to be 1.

It is conceptually useful to split the random variables into two sets: the exogenous variables, whose values are determined by factors outside the model, and the endogenous variables, whose values are ultimately determined by the exogenous variables. For example, in the forest fire example, the variables $M$, $L$, and $FF$ are endogenous. However, we want to take as given that there is enough oxygen for the fire and that the wood is sufficiently dry to burn. In addition, we do not want to concern ourselves with the factors that make the arsonist drop the match or the factors that cause lightning. These factors are all determined by the exogenous variables.

Formally, a causal model $M$ is a pair $(\mathcal{S}, \mathcal{F})$, where $\mathcal{S}$ is a signature, which explicitly lists the endogenous and exogenous variables and characterizes their possible values, and $\mathcal{F}$ defines a set of modifiable structural equations, relating the values of the variables. A signature $\mathcal{S}$ is a tuple $(\mathcal{U}, \mathcal{V}, \mathcal{R})$, where $\mathcal{U}$ is a set of exogenous variables, $\mathcal{V}$ is a set of endogenous variables, and $\mathcal{R}$ associates with every variable $Y \in \mathcal{U} \cup \mathcal{V}$ a nonempty set $\mathcal{R}(Y)$ of possible values for $Y$ (that is, the set of values over which $Y$ ranges). $\mathcal{F}$ associates with each endogenous variable $X \in \mathcal{V}$ a function denoted $F_X$ such that

$$F_X : \left(\times_{U \in \mathcal{U}} \mathcal{R}(U)\right) \times \left(\times_{Y \in \mathcal{V} - \{X\}} \mathcal{R}(Y)\right) \to \mathcal{R}(X).$$

This mathematical notation just makes precise the fact that $F_X$ determines the value of $X$, given the values of all the other variables in $\mathcal{U} \cup \mathcal{V}$.
If there is one exogenous variable $U$ and three endogenous variables, $X$, $Y$, and $Z$, then $F_X$ defines the values of $X$ in terms of the values of $Y$, $Z$, and $U$. For example, we might have $F_X(u, y, z) = u + y$, which is usually written as $X = U + Y$.¹ Thus, if $Y = 3$ and $U = 2$, then $X = 5$, regardless of how $Z$ is set.

¹ Again, the fact that $X$ is assigned $U + Y$ (i.e., the value of $X$ is the sum of the values of $U$ and $Y$) does not imply that $Y$ is assigned $X - U$; that is, $F_Y(U, X, Z) = X - U$ does not necessarily hold.

In the running forest fire example, suppose that we have an exogenous random variable $U$ that determines the values of $L$ and $M$. Thus, $U$ has four possible values of the form $(i, j)$, where both of $i$ and $j$ are either 0 or 1. The $i$ value determines the value of $L$ and the $j$ value determines the value of $M$. Although $F_L$ gets as arguments the values of $U$, $M$, and $FF$, in fact, it depends only on the (first component of) the value of $U$; that is, $F_L((i, j), m, f) = i$. Similarly, $F_M((i, j), l, f) = j$. The value of $FF$ depends only on the values of $L$ and $M$. How it depends on them depends on whether having either lightning or an arsonist suffices for the forest fire, or whether both are necessary. If either one suffices, then $F_{FF}((i, j), l, m) = \max(l, m)$, or, perhaps more comprehensibly, $FF = \max(L, M)$; if both are needed, then $FF = \min(L, M)$. For future reference, call the former model the disjunctive model, and the latter the conjunctive model.

The key role of the structural equations is to define what happens in the presence of external interventions. For example, we can explain what happens if the arsonist does not drop the match. In the disjunctive model, there is a forest fire exactly if there is lightning; in the conjunctive model, there is definitely no fire.
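The two forest-fire models, and the effect of preventing the arsonist from dropping the match, can be sketched in a few lines of Python. This is only an illustrative encoding, not part of the HP formalism: the dictionary representation and the helper names `solve`, `intervene`, and `forest_fire` are assumptions made for the example, and the equations are assumed to be listed in causal order.

```python
def solve(equations, context):
    """Compute the unique solution of an acyclic model.

    `equations` maps each endogenous variable to a function of
    (context, values computed so far); it must be listed in causal order.
    """
    values = {}
    for var, f in equations.items():
        values[var] = f(context, values)
    return values


def intervene(equations, var, value):
    """Model obtained by setting `var` to `value`: drop the equation for
    `var` and substitute `value` wherever another equation reads `var`."""
    def substituted(f):
        return lambda u, v: f(u, {**v, var: value})
    return {y: substituted(f) for y, f in equations.items() if y != var}


def forest_fire(combine):
    # The context u is a pair (i, j): i determines L and j determines M.
    return {
        "L": lambda u, v: u[0],
        "M": lambda u, v: u[1],
        "FF": lambda u, v: combine(v["L"], v["M"]),
    }


disjunctive = forest_fire(max)  # FF = max(L, M)
conjunctive = forest_fire(min)  # FF = min(L, M)

# Intervention M = 0: the arsonist does not drop the match.
for context in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    d = solve(intervene(disjunctive, "M", 0), context)
    c = solve(intervene(conjunctive, "M", 0), context)
    print(context, "disjunctive:", d, "conjunctive:", c)
```

In every context the disjunctive model yields a fire exactly when there is lightning, and the conjunctive model yields no fire, which matches the informal description above.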
Setting the value of some variable $X$ to $x$ in a causal model $M = (\mathcal{S}, \mathcal{F})$ results in a new causal model denoted $M_{X=x}$. In the new causal model, since the value of $X$ is set, $X$ is removed from the list of endogenous variables. That means that there is no longer an equation $F_X$ defining $X$. Moreover, $X$ is no longer an argument in the equation $F_Y$ characterizing another endogenous variable $Y$. The new equation for $Y$ is the one that results by substituting $x$ for $X$. More formally, $M_{X=x} = (\mathcal{S}_X, \mathcal{F}^{X=x})$, where $\mathcal{S}_X = (\mathcal{U}, \mathcal{V} - \{X\}, \mathcal{R}|_{\mathcal{V} - \{X\}})$ (this notation just says that $X$ is removed from the set of endogenous variables and $\mathcal{R}$ is restricted so that its domain is $\mathcal{V} - \{X\}$ rather than all of $\mathcal{V}$) and $\mathcal{F}^{X=x}$ associates with each variable $Y \in \mathcal{V} - \{X\}$ the equation $F_Y^{X=x}$, which is obtained from $F_Y$ by setting $X$ to $x$. Thus, if $M$ is the disjunctive causal model for the forest-fire example, then $M_{M=0}$, the model where the arsonist does not drop the match, has endogenous variables $L$ and $FF$, where the equation for $L$ is just as in $M$, and $FF = L$. If $M$ is the conjunctive model, then the equation for $FF$ becomes instead $FF = 0$.

In this paper, following HP, I restrict to acyclic causal models, where causal influence can be represented by an acyclic Bayesian network. That is, there is no cycle $X_1, \ldots, X_n, X_1$ of endogenous variables where the value of $X_{i+1}$ (as given by $F_{X_{i+1}}$) depends on the value of $X_i$, for $i = 1, \ldots, n-1$, and the value of $X_1$ depends on the value of $X_n$. If $M$ is an acyclic causal model, then given a context, that is, a setting $\vec{u}$ for the exogenous variables in $\mathcal{U}$, there is a unique solution for all the equations.

There are many nontrivial decisions to be made when choosing the structural model to describe a given situation. One significant decision is the set of variables used.
As we shall see, the events that can be causes and those that can be caused are expressed in terms of these variables, as are all the intermediate events. The choice of variables essentially determines the "language" of the discussion; new events cannot be created on the fly, so to speak. In our running example, the fact that there is no variable for unattended campfires means that the model does not allow us to consider unattended campfires as a cause of the forest fire.

Once the set of variables is chosen, the next step is to decide which are exogenous and which are endogenous. As I said earlier, the exogenous variables to some extent encode the background situation that we want to take for granted. Other implicit background assumptions are encoded in the structural equations themselves. Suppose that we are trying to decide whether a lightning bolt or a match was the cause of the forest fire, and we want to take for granted that there is sufficient oxygen in the air and the wood is dry. We could model the dryness of the wood by an exogenous variable $D$ with values 0 (the wood is wet) and 1 (the wood is dry).² By making $D$ exogenous, its value is assumed to be given and out of the control of the modeler. We could also take the amount of oxygen as an exogenous variable (for example, there could be a variable $O$ with two values: 0, for insufficient oxygen, and 1, for sufficient oxygen); alternatively, we could choose not to model oxygen explicitly at all. For example, suppose that we have, as before, a random variable $M$ for match lit, and another variable $WB$ for wood burning, with values 0 (it's not) and 1 (it is). The structural equation $F_{WB}$ would describe the dependence of $WB$ on $D$ and $M$. By setting $F_{WB}(1, 1) = 1$, we are saying that the wood will burn if the match is lit and the wood is dry.
Thus, the equation is implicitly modeling our assumption that there is sufficient oxygen for the wood to burn.

² Of course, in practice, we may want to allow $D$ to have more values, indicating the degree of dryness of the wood, but that level of complexity is unnecessary for the points I am trying to make here.

According to the definition of causality in Section 3, only endogenous variables can be causes or be caused. Thus, if no variables encode the presence of oxygen, or if it is encoded only in an exogenous variable, then oxygen cannot be a cause of the forest burning. If we were to explicitly model the amount of oxygen in the air (which certainly might be relevant if we were analyzing fires on Mount Everest), then $F_{WB}$ would also take values of $O$ as an argument, and the presence of sufficient oxygen might well be a cause of the wood burning, and hence the forest burning.

It is not always straightforward to decide what the "right" causal model is in a given situation, nor is it always obvious which of two causal models is "better" in some sense. These decisions often lie at the heart of determining actual causality in the real world. Disagreements about causality relationships often boil down to disagreements about the causal model. While the formalism presented here does not provide techniques to settle disputes about which causal model is the right one, at least it provides tools for carefully describing the differences between causal models, so that it should lead to more informed and principled decisions about those choices.

3 A Formal Definition of Actual Cause

3.1 A language for describing causes

To make the definition of actual causality precise, it is helpful to have a formal language for making statements about causality. Given a signature $\mathcal{S} = (\mathcal{U}, \mathcal{V}, \mathcal{R})$, a primitive event is a formula of the form $X = x$, for $X \in \mathcal{V}$ and $x \in \mathcal{R}(X)$. A causal formula (over $\mathcal{S}$) is one of the form $[Y_1 = y_1, \ldots, Y_k = y_k]\varphi$, where

• $\varphi$ is a Boolean combination of primitive events,
• $Y_1, \ldots, Y_k$ are distinct variables in $\mathcal{V}$, and
• $y_i \in \mathcal{R}(Y_i)$.

Such a formula is abbreviated as $[\vec{Y} = \vec{y}]\varphi$. The special case where $k = 0$ is abbreviated as $\varphi$. Intuitively, $[Y_1 = y_1, \ldots, Y_k = y_k]\varphi$ says that $\varphi$ would hold if $Y_i$ were set to $y_i$, for $i = 1, \ldots, k$.

A causal formula $\psi$ is true or false in a causal model, given a context. As usual, I write $(M, \vec{u}) \models \psi$ if the causal formula $\psi$ is true in causal model $M$ given context $\vec{u}$. The $\models$ relation is defined inductively. $(M, \vec{u}) \models X = x$ if the variable $X$ has value $x$ in the unique (since we are dealing with acyclic models) solution to the equations in $M$ in context $\vec{u}$ (that is, the unique vector of values for the endogenous variables that simultaneously satisfies all equations in $M$ with the variables in $\mathcal{U}$ set to $\vec{u}$). The truth of conjunctions and negations is defined in the standard way. Finally, $(M, \vec{u}) \models [\vec{Y} = \vec{y}]\varphi$ if $(M_{\vec{Y} = \vec{y}}, \vec{u}) \models \varphi$. I write $M \models \varphi$ if $(M, \vec{u}) \models \varphi$ for all contexts $\vec{u}$.

For example, if $M$ is the disjunctive causal model for the forest fire, and $u$ is the context where there is lightning and the arsonist drops the lit match, then $(M, u) \models [M = 0](FF = 1)$, since even if the arsonist is somehow prevented from dropping the match, the forest burns (thanks to the lightning); similarly, $(M, u) \models [L = 0](FF = 1)$. However, $(M, u) \models [L = 0, M = 0](FF = 0)$: if the arsonist does not drop the lit match and the lightning does not strike, then the forest does not burn.
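As a rough illustration of how the truth of a causal formula can be computed, the following Python sketch evaluates $[\vec{Y} = \vec{y}]\varphi$ in the disjunctive forest-fire model. The encoding is an assumption made for the example: `holds` and `valid` are hypothetical helper names, $\varphi$ is represented as a Python predicate over the solution, and the equations are assumed to be listed in causal order.

```python
from itertools import product


def solve(equations, context, do=None):
    """Unique solution of an acyclic model under the intervention `do`."""
    do = do or {}
    values = {}
    for var, f in equations.items():  # equations listed in causal order
        values[var] = do[var] if var in do else f(context, values)
    return values


def holds(equations, context, do, phi):
    """(M, u) |= [Y = y] phi: phi is true in the solution of the model
    obtained by setting the variables in `do`."""
    return phi(solve(equations, context, do))


def valid(equations, contexts, do, phi):
    """M |= [Y = y] phi: the formula holds in every context."""
    return all(holds(equations, u, do, phi) for u in contexts)


disjunctive = {
    "L": lambda u, v: u[0],
    "M": lambda u, v: u[1],
    "FF": lambda u, v: max(v["L"], v["M"]),
}

u = (1, 1)  # lightning strikes and the arsonist drops the match
fire = lambda v: v["FF"] == 1
no_fire = lambda v: v["FF"] == 0

print(holds(disjunctive, u, {"M": 0}, fire))             # [M = 0](FF = 1)
print(holds(disjunctive, u, {"L": 0}, fire))             # [L = 0](FF = 1)
print(holds(disjunctive, u, {"L": 0, "M": 0}, no_fire))  # [L = 0, M = 0](FF = 0)

contexts = list(product([0, 1], repeat=2))
print(valid(disjunctive, contexts, {"L": 1}, fire))      # M |= [L = 1](FF = 1)
```

All four formulas evaluate to true, in agreement with the three examples in the text; the last line additionally checks validity over all four contexts.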
3.2 A preliminary definition of causality

The HP definition of causality, like many others, is based on counterfactuals. The idea is that A is a cause of B if, if A hadn't occurred (although it did), then B would not have occurred. This idea goes back to at least Hume [1748, Section VIII], who said:

"We may define a cause to be an object followed by another, ..., if the first object had not been, the second never had existed."

This is essentially the but-for test, perhaps the most widely used test of actual causation in tort adjudication. The but-for test states that an act is a cause of injury if and only if, but for the act (i.e., had the act not occurred), the injury would not have occurred.

There are two well-known problems with this definition. The first can be seen by considering the disjunctive causal model for the forest fire again. Suppose that the arsonist drops a match and lightning strikes. Which is the cause? According to a naive interpretation of the counterfactual definition, neither is. If the match hadn't dropped, then the lightning would still have struck, so there would have been a forest fire anyway. Similarly, if the lightning had not occurred, there still would have been a forest fire. As we shall see, the HP definition declares both lightning and the arsonist causes of the fire. (In general, there may be more than one cause of an outcome.)

A more subtle problem is what philosophers have called preemption, where there are two potential causes of an event, one of which preempts the other. Preemption is illustrated by the following story taken from [Hall 2004]:

"Suzy and Billy both pick up rocks and throw them at a bottle. Suzy's rock gets there first, shattering the bottle. Since both throws are perfectly accurate, Billy's would have shattered the bottle had it not been preempted by Suzy's throw."
Common sense suggests that Suzy's throw is the cause of the shattering, but Billy's is not. However, Suzy's throw does not satisfy the naive counterfactual definition either; if Suzy hadn't thrown, then Billy's throw would have shattered the bottle.

The HP definition deals with the first problem by defining causality as counterfactual dependency under certain contingencies. In the forest fire example, the forest fire does counterfactually depend on the lightning under the contingency that the arsonist does not drop the match; similarly, the forest fire depends counterfactually on the arsonist's match under the contingency that the lightning does not strike. Clearly we need to be a little careful here to limit the contingencies that can be considered. We do not want to make Billy's throw the cause of the bottle shattering by considering the contingency that Suzy does not throw. The reason that we consider Suzy's throw to be the cause and Billy's throw not to be the cause is that Suzy's rock hit the bottle, while Billy's did not. Somehow the definition must capture this obvious intuition.

With this background, I now give the preliminary version of the HP definition of causality. Although the definition is labeled "preliminary", it is quite close to the final definition, which is given in Section 4. As I pointed out in the introduction, the definition is relative to a causal model (and a context); A may be a cause of B in one causal model but not in another. The definition consists of three clauses. The first and third are quite simple; all the work is going on in the second clause.

The types of events that the HP definition allows as actual causes are ones of the form $X_1 = x_1 \wedge \ldots \wedge X_k = x_k$, that is, conjunctions of primitive events; this is often abbreviated as $\vec{X} = \vec{x}$. The events that can be caused are arbitrary Boolean combinations of primitive events.
The definition does not allow statements of the form "$A$ or $A'$ is a cause of $B$," although this could be treated as being equivalent to "either $A$ is a cause of $B$ or $A'$ is a cause of $B$". On the other hand, statements such as "$A$ is a cause of $B$ or $B'$" are allowed; as we shall see, this is not equivalent to "either $A$ is a cause of $B$ or $A$ is a cause of $B'$".

Definition 3.1: (Actual cause; preliminary version) [Halpern and Pearl 2005] $\vec{X} = \vec{x}$ is an actual cause of $\varphi$ in $(M, \vec{u})$ if the following three conditions hold:

AC1. $(M, \vec{u}) \models (\vec{X} = \vec{x})$ and $(M, \vec{u}) \models \varphi$.

AC2. There is a partition of $\mathcal{V}$ (the set of endogenous variables) into two subsets $\vec{Z}$ and $\vec{W}$ with $\vec{X} \subseteq \vec{Z}$ and a setting $\vec{x}'$ and $\vec{w}$ of the variables in $\vec{X}$ and $\vec{W}$, respectively, such that if $(M, \vec{u}) \models Z = z^*$ for all $Z \in \vec{Z}$, then both of the following conditions hold:
(a) $(M, \vec{u}) \models [\vec{X} = \vec{x}', \vec{W} = \vec{w}] \neg\varphi$.
(b) $(M, \vec{u}) \models [\vec{X} = \vec{x}, \vec{W}' = \vec{w}, \vec{Z}' = \vec{z}^*] \varphi$ for all subsets $\vec{W}'$ of $\vec{W}$ and all subsets $\vec{Z}'$ of $\vec{Z}$, where I abuse notation and write $\vec{W}' = \vec{w}$ to denote the assignment where the variables in $\vec{W}'$ get the same values as they would in the assignment $\vec{W} = \vec{w}$.

AC3. $\vec{X}$ is minimal; no subset of $\vec{X}$ satisfies conditions AC1 and AC2.

$\vec{W}$, $\vec{w}$, and $\vec{x}'$ are said to be witnesses to the fact that $\vec{X} = \vec{x}$ is a cause of $\varphi$.

AC1 just says that $\vec{X} = \vec{x}$ cannot be considered a cause of $\varphi$ unless both $\vec{X} = \vec{x}$ and $\varphi$ actually happen. AC3 is a minimality condition, which ensures that only those elements of the conjunction $\vec{X} = \vec{x}$ that are essential for changing $\varphi$ in AC2(a) are considered part of a cause. Clearly, all the "action" in the definition occurs in AC2. We can think of the variables in $\vec{Z}$ as making up the "causal path" from $\vec{X}$ to $\varphi$.
Intuitively, changing the value of some variable in $\vec{X}$ results in changing the value(s) of some variable(s) in $\vec{Z}$, which results in the values of some other variable(s) in $\vec{Z}$ being changed, which finally results in the value of $\varphi$ changing. The remaining endogenous variables, the ones in $\vec{W}$, are off to the side, so to speak, but may still have an indirect effect on what happens. AC2(a) is essentially the standard counterfactual definition of causality, but with a twist. If we want to show that $\vec{X} = \vec{x}$ is a cause of $\varphi$, we must show (in part) that if $\vec{X}$ had a different value, then so too would $\varphi$. However, this effect of the value of $\vec{X}$ on the value of $\varphi$ may not hold in the actual context; the value of $\vec{W}$ may have to be different to allow this effect to manifest itself. For example, consider the context where both the lightning strikes and the arsonist drops a match in the disjunctive model of the forest fire. Stopping the arsonist from dropping the match will not prevent the forest fire. The counterfactual effect of the arsonist on the forest fire manifests itself only in a situation where the lightning does not strike (i.e., where $L$ is set to 0). AC2(a) is what allows us to call both the lightning and the arsonist causes of the forest fire. Essentially, it ensures that $\vec{X}$ alone suffices to bring about the change from $\varphi$ to $\neg\varphi$; setting $\vec{W}$ to $\vec{w}$ merely eliminates possibly spurious side effects that may mask the effect of changing the value of $\vec{X}$. Moreover, although the values of variables on the causal path (i.e., the variables $\vec{Z}$) may be perturbed by the change to $\vec{W}$, this perturbation has no impact on the value of $\varphi$. If $(M, \vec{u}) \models Z = z^*$, then $z^*$ is the value of the variable $Z$ in the context $\vec{u}$.
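The forest-fire discussion of AC2 can be checked mechanically for one particular choice of witnesses. The sketch below is only an illustration under stated assumptions: it hard-codes the disjunctive model, takes $\vec{X} = \{M\}$, $\vec{W} = \{L\}$, $\vec{w} = 0$, and $\vec{x}' = 0$, and uses a hypothetical helper `solve`; it does not search over all partitions.

```python
def solve(context, do):
    """Disjunctive forest-fire model; `do` maps intervened variables to values."""
    values = {}
    equations = {
        "L": lambda: context[0],
        "M": lambda: context[1],
        "FF": lambda: max(values["L"], values["M"]),
    }
    for var, f in equations.items():  # listed in causal order
        values[var] = do[var] if var in do else f()
    return values


u = (1, 1)  # lightning strikes and the arsonist drops the match
actual = solve(u, {})

# Naive but-for test: does preventing the match alone prevent the fire?
but_for = solve(u, {"M": 0})["FF"] == 0

# AC2(a): under the contingency L = 0, setting M = 0 gives no fire.
ac2a = solve(u, {"M": 0, "L": 0})["FF"] == 0

# AC2(b): with M = 1, the fire occurs for every subset W' of {L} set as in w
# and every subset Z' of {FF} held at its actual value.
ac2b = all(
    solve(u, {"M": 1, **w_part, **z_part})["FF"] == 1
    for w_part in ({}, {"L": 0})
    for z_part in ({}, {"FF": actual["FF"]})
)

print("but-for:", but_for, "AC2(a):", ac2a, "AC2(b):", ac2b)
```

The but-for test fails while both parts of AC2 hold for this witness, which is the sense in which the contingency $L = 0$ lets the effect of the match "manifest itself".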
We capture the fact that the perturbation has no impact on the value of $\varphi$ by saying that if some variables $Z$ on the causal path were set to their original values in the context $\vec{u}$, $\varphi$ would still be true, as long as $\vec{X} = \vec{x}$.

To give some intuition for this definition, I consider three examples that will be relevant later in the paper.

Example 3.2: Can not performing an action be (part of) a cause? Consider the following story, also taken from (an early version of) [Hall 2004]: Suppose that Billy is hospitalized with a mild illness on Monday; he is treated and recovers. In the obvious causal model, the doctor's treatment is a cause of Billy's recovery. Moreover, if the doctor does not treat Billy on Monday, then the doctor's omission to treat Billy is a cause of Billy's being sick on Tuesday. But now suppose there are 100 doctors in the hospital. Although only doctor 1 is assigned to Billy (and he forgot to give medication), in principle, any of the other 99 doctors could have given Billy his medication. Is the nontreatment by doctors 2–100 also a cause of Billy's being sick on Tuesday? Of course, if we do not have variables in the model corresponding to the other doctors' treatment, or treat these variables as exogenous, then there is no problem. But if we have endogenous variables corresponding to the other doctors (for example, if we want to also consider other patients, who are being treated by these other doctors), then the other doctors' nontreatment is a cause, which seems inappropriate. I return to this issue in the next section.

With this background, we continue with Hall's modification of the original story.

"Suppose that Monday's doctor is reliable, and administers the medicine first thing in the morning, so that Billy is fully recovered by Tuesday afternoon. Tuesday's doctor is also reliable, and would have treated Billy if Monday's doctor had failed to. ... And let us add a twist: one dose of medication is harmless, but two doses are lethal."

Is the fact that Tuesday's doctor did not treat Billy the cause of him being alive (and recovered) on Wednesday morning?

The causal model for this story is straightforward. There are three random variables:

• $T$ for Monday's treatment (1 if Billy was treated Monday; 0 otherwise);
• $TT$ for Tuesday's treatment (1 if Billy was treated Tuesday; 0 otherwise); and
• $BMC$ for Billy's medical condition (0 if Billy is fine both Tuesday morning and Wednesday morning; 1 if Billy is sick Tuesday morning, fine Wednesday morning; 2 if Billy is sick both Tuesday and Wednesday morning; 3 if Billy is fine Tuesday morning and dead Wednesday morning).

We can then describe Billy's condition as a function of the four possible combinations of treatment/nontreatment on Monday and Tuesday. I omit the obvious structural equations corresponding to this discussion.

In this causal model, it is true that $T = 1$ is a cause of $BMC = 0$, as we would expect: because Billy is treated Monday, he is not treated on Tuesday morning, and thus recovers Wednesday morning. $T = 1$ is also a cause of $TT = 0$, as we would expect, and $TT = 0$ is a cause of Billy's being alive ($BMC = 0 \vee BMC = 1 \vee BMC = 2$). However, $T = 1$ is not a cause of Billy's being alive. It fails condition AC2(a): setting $T = 0$ still leads to Billy's being alive (with $\vec{W} = \emptyset$). Note that it would not help to take $\vec{W} = \{TT\}$. For if $TT = 0$, then Billy is alive no matter what $T$ is, while if $TT = 1$, then Billy is dead when $T$ has its original value, so AC2(b) is violated (with $\vec{Z}' = \emptyset$).

This shows that causality is not transitive, according to our definitions.
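The four causal claims in this example can be checked by brute force over all partitions and settings in Definition 3.1. The sketch below is a hedged illustration, not the paper's method: the structural equations are my reading of the equations the text omits ($TT = 1 - T$, and $BMC$ given by the table in `BMC_TABLE`), and `solve`, `subsets`, `ac2`, and `is_cause` are hypothetical helper names. Subsets $\vec{Z}'$ are drawn from $\vec{Z} - \vec{X}$, since $\vec{X}$ is already fixed in AC2(b).

```python
from itertools import combinations, product


def solve(eqs, u, do):
    """Unique solution of an acyclic model; eqs are listed in causal order."""
    vals = {}
    for var, f in eqs.items():
        vals[var] = do[var] if var in do else f(u, vals)
    return vals


def subsets(items):
    return [list(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def ac2(eqs, ranges, u, X, phi):
    """Search for witnesses (W, w, x') satisfying AC2(a) and AC2(b)."""
    actual = solve(eqs, u, {})
    rest = [v for v in eqs if v not in X]
    for W in subsets(rest):
        Z_rest = [v for v in rest if v not in W]
        for xp in product(*(ranges[v] for v in X)):
            for w in product(*(ranges[v] for v in W)):
                w_map = dict(zip(W, w))
                if phi(solve(eqs, u, {**dict(zip(X, xp)), **w_map})):
                    continue  # AC2(a) fails for this setting
                if all(phi(solve(eqs, u, {**X,
                                          **{v: w_map[v] for v in Wp},
                                          **{z: actual[z] for z in Zp}}))
                       for Wp in subsets(W) for Zp in subsets(Z_rest)):
                    return True  # AC2(b) holds as well
    return False


def is_cause(eqs, ranges, u, X, phi):
    actual = solve(eqs, u, {})
    ac1 = all(actual[v] == x for v, x in X.items()) and phi(actual)
    ac3 = not any(ac2(eqs, ranges, u, {k: X[k] for k in keys}, phi)
                  for r in range(1, len(X)) for keys in combinations(X, r))
    return ac1 and ac2(eqs, ranges, u, X, phi) and ac3


# Assumed equations: (T, TT) -> BMC, following the description of BMC's values.
BMC_TABLE = {(1, 0): 0, (0, 1): 1, (0, 0): 2, (1, 1): 3}
billy = {
    "T": lambda u, v: u,              # the context decides Monday's treatment
    "TT": lambda u, v: 1 - v["T"],    # Tuesday's doctor treats iff Monday's did not
    "BMC": lambda u, v: BMC_TABLE[(v["T"], v["TT"])],
}
ranges = {"T": [0, 1], "TT": [0, 1], "BMC": [0, 1, 2, 3]}
u = 1  # Billy is treated on Monday
alive = lambda v: v["BMC"] in (0, 1, 2)

print(is_cause(billy, ranges, u, {"T": 1}, lambda v: v["BMC"] == 0))
print(is_cause(billy, ranges, u, {"T": 1}, lambda v: v["TT"] == 0))
print(is_cause(billy, ranges, u, {"TT": 0}, alive))
print(is_cause(billy, ranges, u, {"T": 1}, alive))
```

Under these assumed equations the first three checks succeed and the last fails, reproducing the failure of transitivity described above.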
Although $T = 1$ is a cause of $TT = 0$ and $TT = 0$ is a cause of $BMC = 0 \lor BMC = 1 \lor BMC = 2$, $T = 1$ is not a cause of $BMC = 0 \lor BMC = 1 \lor BMC = 2$. Nor is causality closed under right weakening: $T = 1$ is a cause of $BMC = 0$, which logically implies $BMC = 0 \lor BMC = 1 \lor BMC = 2$, which is not caused by $T = 1$. This distinguishes the HP definition from that of Lewis [2000], which builds in transitivity and implicitly assumes right weakening.

The version of AC2(b) used here is taken from [Halpern and Pearl 2005], and differs from the version given in the conference version of that paper [Halpern and Pearl 2001]. In the current version, AC2(b) is required to hold for all subsets $\vec{W}'$ of $\vec{W}$; in the original definition, it was required to hold only for $\vec{W}$. The following example, due to Hopkins and Pearl [2003], illustrates why the change was made.

Example 3.3: Suppose that a prisoner dies either if $A$ loads $B$'s gun and $B$ shoots, or if $C$ loads and shoots his gun. Taking $D$ to represent the prisoner's death and making the obvious assumptions about the meaning of the variables, we have that $D = 1$ iff $(A = 1 \land B = 1) \lor (C = 1)$. Suppose that in the actual context $u$, $A$ loads $B$'s gun, $B$ does not shoot, but $C$ does load and shoot his gun, so that the prisoner dies. Clearly $C = 1$ is a cause of $D = 1$. We would not want to say that $A = 1$ is a cause of $D = 1$ in context $u$; given that $B$ did not shoot (i.e., given that $B = 0$), $A$'s loading the gun should not count as a cause. The obvious way to attempt to show that $A = 1$ is a cause is to take $\vec{W} = \{B, C\}$ and consider the contingency where $B = 1$ and $C = 0$. It is easy to check that AC2(a) holds for this contingency; moreover, $(M, u) \models [A = 1, B = 1, C = 0](D = 1)$. However, $(M, u) \models [A = 1, C = 0](D = 0)$. Thus, AC2(b) is not satisfied for the subset $\{C\}$ of $\vec{W}$, so $A = 1$ is not a cause of $D = 1$.
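The subset check in Example 3.3 is small enough to enumerate. The sketch below is a hedged illustration rather than a full implementation of the HP definition: it considers only the single contingency $B = 1$, $C = 0$ and only the $\vec{W}'$ part of AC2(b), and it relies on the assumption that $A$, $B$, and $C$ have no endogenous parents, so an intervention simply overrides their actual values. The helper names are hypothetical.

```python
from itertools import combinations


def death(a, b, c):
    """Structural equation D = (A and B) or C."""
    return int((a and b) or c)


ACTUAL = {"A": 1, "B": 0, "C": 1}   # context u: A loads, B does not shoot, C shoots
CONTINGENCY = {"B": 1, "C": 0}      # the setting w of W = {B, C}


def d_under(do):
    """Value of D in context u after the intervention `do`."""
    vals = {**ACTUAL, **do}
    return death(vals["A"], vals["B"], vals["C"])


def sub_settings(setting):
    """Yield the restriction of `setting` to every subset of its variables."""
    names = list(setting)
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            yield {name: setting[name] for name in chosen}


# AC2(a): flipping A under the contingency changes the outcome.
ac2a = d_under({"A": 0, **CONTINGENCY}) == 0
# Original AC2(b): checked for the full set W only.
ac2b_original = d_under({"A": 1, **CONTINGENCY}) == 1
# Modified AC2(b): checked for every subset W' of W.
violations = [s for s in sub_settings(CONTINGENCY) if d_under({"A": 1, **s}) != 1]

print(ac2a, ac2b_original)   # True True
print(violations)            # [{'C': 0}]
```

The only violating subset is $\{C\}$, matching the argument above: under the original AC2(b) the contingency would have certified $A = 1$ as a cause, while under the modified version it does not.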
However, had we required AC2(b) to hold only for $\vec{W}$ rather than all subsets $\vec{W}'$ of $\vec{W}$, then $A = 1$ would have been a cause.

While the change in AC2(b) has the advantage of being able to deal with Example 3.3 (indeed, it deals with the whole class of examples given by Hopkins and Pearl of which this is an instance), it has a nontrivial side effect. For the original definition, it was shown that the minimality condition AC3 guarantees that causes are always single conjuncts [Eiter and Lukasiewicz 2002; Hopkins 2001]. It was claimed in [Halpern and Pearl 2005] that the result is still true for the modified definition, but, as I now show, this is not the case.

Example 3.4: $A$ and $B$ both vote for a candidate. $B$'s vote is recorded in two optical scanners ($C_1$ and $C_2$). If $A$ votes for the candidate, then she wins; if $B$ votes for the candidate and his vote is correctly recorded in the optical scanners, then the candidate wins. Unfortunately, $A$ also has access to the scanners, so she will set them to read 0 if she does not vote for the candidate. In the actual context $\vec{u}$, both $A$ and $B$ vote for the candidate. The following structural equations characterize $C_i$ and $WIN$: $C_i = \min(A, B)$, $i = 1, 2$, and $WIN = 1$ iff $A = 1$ or $C_1 = C_2 = 1$. I claim that $C_1 = 1 \land C_2 = 1$ is a cause of $WIN = 1$, but neither $C_1 = 1$ nor $C_2 = 1$ is a cause. To see that $C_1 = 1 \land C_2 = 1$ is a cause, first observe that AC1 clearly holds. For AC2, let $\vec{W} = \{A\}$ (so $\vec{Z} = \{B, C_1, C_2, WIN\}$) and take $w = 0$ (so we are considering the contingency where $A = 0$). Clearly, $(M, \vec{u}) \models [C_1 = 0, C_2 = 0, A = 0](WIN = 0)$ and $(M, \vec{u}) \models [C_1 = 1, C_2 = 1, A = a](WIN = 1)$, for both $a = 0$ and $a = 1$, so AC2 holds. To show that AC3 holds, I must show that neither $C_1 = 1$ nor $C_2 = 1$ is a cause of $WIN = 1$.
The argument is the same for both $C_1 = 1$ and $C_2 = 1$, so I just show that $C_1 = 1$ is not a cause. To see this, note that if $C_1 = 1$ is a cause with $\vec{W}$, $\vec{w}$, and $\vec{x}'$ as witnesses, then $\vec{W}$ must contain $A$ and $\vec{w}$ must be such that $A = 0$. But since $(M, \vec{u}) \models [C_1 = 1, A = 0](WIN = 0)$, AC2(b) is violated no matter whether $C_2$ is in $\vec{Z}$ or in $\vec{W}$.

Although Example 3.4 shows that causes are not always single conjuncts, they often are. Indeed, it is not hard to show that in all the standard examples considered in the philosophy and legal literature (in particular, in all the examples considered in HP), they are. The following result gives some intuition as to why. Further intuition is given by the results of Section 5. Notice that in Example 3.4, $A$ affects both $C_1$ and $C_2$. As the following result shows, we do not have conjunctive causes if the potential causes cannot be affected by other variables.

Say that $\vec{X} = \vec{x}$ is a weak cause of $\varphi$ under the contingency $\vec{W} = \vec{w}$ in $(M, \vec{u})$ if AC1 and AC2 hold under the contingency $\vec{W} = \vec{w}$, but AC3 does not necessarily hold.

Proposition 3.5: If $\vec{X} = \vec{x}$ is a weak cause of $\varphi$ in $(M, \vec{u})$ with $\vec{W}$, $\vec{w}$, and $\vec{x}'$ as witnesses, $|\vec{X}| > 1$, and each variable $X_i$ in $\vec{X}$ is independent of all the variables in $V - \vec{X}$ in $\vec{u}$ (that is, if $\vec{Y} \subseteq V - \vec{X}$, then for each setting $\vec{y}$ of $\vec{Y}$, we have $(M, \vec{u}) \models \vec{X} = \vec{x}$ iff $(M, \vec{u}) \models [\vec{Y} = \vec{y}](\vec{X} = \vec{x})$), then $\vec{X} = \vec{x}$ is not a cause of $\varphi$ in $(M, \vec{u})$.

In the examples in [Halpern and Pearl 2005] (and elsewhere in the literature), the variables that are potential causes are typically independent of all other variables, so in these examples causes are in fact single conjuncts.
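The interventions used in the argument for Example 3.4 can be evaluated directly from the stated equations $C_i = \min(A, B)$ and $WIN = 1$ iff $A = 1$ or $C_1 = C_2 = 1$. The Python sketch below only replays those specific interventions; it is not a general checker for AC1 to AC3, and the function name `solve` and the dictionary encoding of interventions are illustrative assumptions.

```python
def solve(do=None, u_a=1, u_b=1):
    """Evaluate the voting model in the context where A and B both vote (by default)."""
    do = do or {}
    v = {}
    v["A"] = do.get("A", u_a)
    v["B"] = do.get("B", u_b)
    # Each scanner reads 1 only if both A and B vote for the candidate.
    v["C1"] = do.get("C1", min(v["A"], v["B"]))
    v["C2"] = do.get("C2", min(v["A"], v["B"]))
    v["WIN"] = do.get("WIN", int(v["A"] == 1 or (v["C1"] == 1 and v["C2"] == 1)))
    return v


# AC2(a) for the conjunction, with W = {A} and w = 0.
print(solve({"C1": 0, "C2": 0, "A": 0})["WIN"])                       # 0
# AC2(b) for the conjunction, for both settings of A.
print([solve({"C1": 1, "C2": 1, "A": a})["WIN"] for a in (0, 1)])     # [1, 1]
# AC2(b) fails for C1 = 1 alone: with A = 0, C2 falls back to min(A, B) = 0.
print(solve({"C1": 1, "A": 0})["WIN"])                                # 0
```

The last line shows the mechanism behind the conjunctive cause: because $A$ affects both scanners, fixing only one of them under the contingency $A = 0$ is not enough to preserve the win.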
4 Dealing with normality and typicality

While the definition of causality given in Definition 3.1 works well in many cases, it does not always deliver answers that agree with (most people's) intuition. Consider the following example, taken from Hitchcock [2007], based on an example due to Hiddleston [2005].

Example 4.1: Assassin is in possession of a lethal poison, but has a last-minute change of heart and refrains from putting it in Victim's coffee. Bodyguard puts antidote in the coffee, which would have neutralized the poison had there been any. Victim drinks the coffee and survives. Is Bodyguard's putting in the antidote a cause of Victim surviving? Most people would say no, but according to the preliminary HP definition, it is. For in the contingency where Assassin puts in the poison, Victim survives iff Bodyguard puts in the antidote.

Example 4.1 illustrates an even deeper problem with Definition 3.1. The structural equations for Example 4.1 are isomorphic to those in the forest-fire example, provided that we interpret the variables appropriately. Specifically, take the endogenous variables in Example 4.1 to be $A$ (for "assassin does not put in poison"), $B$ (for "bodyguard puts in antidote"), and $VS$ (for "victim survives"). Then $A$, $B$, and $VS$ satisfy exactly the same equations as $L$, $M$, and $FF$, respectively. In the context where there is lightning and the arsonist drops a lit match, both the lightning and the match are causes of the forest fire, which seems reasonable. But here it does not seem reasonable that Bodyguard's putting in the antidote is a cause. Nevertheless, any definition that just depends on the structural equations is bound to give the same answers in these two examples. (An example illustrating the same phenomenon is given by Hall [2007].) This suggests that there must be more to causality than just the structural equations.
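The isomorphism can be seen by writing both equations down. The sketch below assumes the disjunctive form of the forest-fire example, $FF = L \lor M$, which is what the statement that both the lightning and the match are causes suggests; the function names are hypothetical.

```python
from itertools import product


def forest_fire(lightning, match):
    """Assumed disjunctive forest-fire equation: FF = L or M."""
    return int(lightning or match)


def victim_survives(no_poison, antidote):
    """Example 4.1 with A = 'assassin does not put in poison': VS = A or B."""
    return int(no_poison or antidote)


# The two equations agree on every assignment, so any definition that looks
# only at structural equations must treat the two stories identically.
same = all(
    forest_fire(x, y) == victim_survives(x, y)
    for x, y in product((0, 1), repeat=2)
)
print(same)                     # True

# Contingency A = 0 (Assassin poisons): survival now depends on the antidote.
print(victim_survives(0, 1))    # 1
print(victim_survives(0, 0))    # 0
```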
And, indeed, the final HP definition of causality allows certain contingencies to be labeled as "unreasonable" or "too farfetched"; these contingencies are then not considered in AC2(a) or AC2(b). Unfortunately, it is not always clear what makes a contingency unreasonable. Moreover, this approach will not work to deal with Example 3.2. In this example, we clearly want to consider as reasonable the contingency where no doctor is assigned to Billy and Billy is not treated (and thus is sick on Tuesday). We should also consider as reasonable the contingency where doctor 1 is assigned to Billy and treats him (otherwise we cannot say that doctor 1 is the cause of Billy being sick if he is assigned to Billy and does not treat him). What about the contingency where doctor $i > 1$ is assigned to treat Billy and does so? It seems just as reasonable as the one where doctor 1 is assigned to treat Billy and does so. Indeed, if we do not call it reasonable, then we will not be able to say that doctor $i$ is a cause of Billy's sickness in the context where doctor $i$ is assigned to treat Billy and does not. On the other hand, if we call it reasonable, then if doctor 1 is assigned to treat Billy and does not, then doctor $i > 1$ not treating Billy will also be a cause of Billy's sickness. To deal with this, what is reasonable will have to depend on the context; in the context where doctor 1 is assigned to treat Billy, it should not be considered reasonable that doctor $i > 1$ is assigned to treat Billy.

As suggested in the introduction, the solution involves assuming that an agent has, in addition to a theory of causality (as modeled by the structural equations), a theory of "normality" or "typicality". This theory would include statements like "typically, people do not put poison in coffee" and "typically doctors do not treat patients to whom they are not assigned".
There are many ways of giving semantics to such typicality statements, including preferential structures [Kraus, Lehmann, and Magidor 1990; Shoham 1987], $\epsilon$-semantics [Adams 1975; Geffner 1992; Pearl 1989], possibilistic structures [Dubois and Prade 1991], and ranking functions [Goldszmidt and Pearl 1992; Spohn 1988]. For definiteness, I use the last approach here (although it would be possible to use any of the other approaches as well).

Take a world to be a complete description of the values of all the random variables. I assume that each world has associated with it a rank, which is just a natural number or $\infty$. Intuitively, the higher the rank, the less likely the world. A world with a rank of 0 is reasonably likely, one with a rank of 1 is somewhat likely, one with a rank of 2 is quite unlikely, and so on. Given a ranking on worlds, the statement "if $p$ then typically $q$" is true if in all the worlds of least rank where $p$ is true, $q$ is also true. Thus, in one model where people do not typically put either poison or antidote in coffee, the worlds where neither poison nor antidote is put in the coffee have rank 0, worlds where either poison or antidote is put in the coffee have rank 1, and worlds where both poison and antidote are put in the coffee have rank 2.

Take an extended causal model to be a tuple $M = (\mathcal{S}, \mathcal{F}, \kappa)$, where $(\mathcal{S}, \mathcal{F})$ is a causal model, and $\kappa$ is a ranking function that associates with each world a rank. In an acyclic extended causal model, a context $\vec{u}$ determines a world denoted $s_{\vec{u}}$. $\vec{X} = \vec{x}$ is a cause of $\varphi$ in an extended model $M$ and context $\vec{u}$ if $\vec{X} = \vec{x}$ is a cause of $\varphi$ according to Definition 3.1, except that in AC2(a), there must be a world $s$ such that $\kappa(s) \le \kappa(s_{\vec{u}})$ and $\vec{X} = \vec{x}' \land \vec{W} = \vec{w}$ is true at $s$.
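The ranking semantics of "if $p$ then typically $q$" is easy to state in code. The following Python sketch uses the coffee ranking just described (rank 0 for neither substance, 1 for exactly one, 2 for both); representing a world as a pair `(poison, antidote)`, treating a statement with no $p$-worlds as vacuously true, and the names `rank` and `typically` are assumptions made for illustration.

```python
from itertools import product

# A world is a pair (poison, antidote), each 0 or 1.
WORLDS = list(product((0, 1), repeat=2))


def rank(world):
    """Ranking from the text: 0 if neither is added, 1 if exactly one, 2 if both."""
    poison, antidote = world
    return poison + antidote


def typically(p, q, worlds=WORLDS, kappa=rank):
    """'If p then typically q': q holds in all least-rank worlds where p holds."""
    p_worlds = [w for w in worlds if p(w)]
    if not p_worlds:
        return True  # assumption: vacuously true when p holds nowhere
    least = min(kappa(w) for w in p_worlds)
    return all(q(w) for w in p_worlds if kappa(w) == least)


# "Typically, people do not put poison in coffee."
print(typically(lambda w: True, lambda w: w[0] == 0))        # True
# "If poison is added, then typically no antidote is added."
print(typically(lambda w: w[0] == 1, lambda w: w[1] == 0))   # True
# "If antidote is added, then typically poison is added."
print(typically(lambda w: w[1] == 1, lambda w: w[0] == 1))   # False
```

A normality-aware version of AC2(a) would additionally compare `rank` of a witness world with `rank` of the actual world; that comparison is not carried out here.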
This can be viewed as a formalization of Kahneman and Miller's observation that we tend to alter the exceptional rather than the routine aspects of a world; we consider only alterations that hold in a world that is no more exceptional than the actual world.³ (The idea of extending causal models with a ranking function already appears in [Halpern and Pearl 2001], but it was not used to capture statements about typicality as suggested here. Rather, it was used to talk about $\vec{X} = \vec{x}$ being a cause of $\varphi$ at rank $k$, where $k$ is the lowest rank of the world that shows that $\vec{X} = \vec{x}$ is a cause. The idea was dropped in the journal version of the paper.)

³ I originally considered requiring that $\kappa(s) < \kappa(s_{\vec{u}})$, so that you move to a strictly more normal world, but this seems too strong a requirement. For example, suppose that $A$ wins an election over $B$ by a vote of 6–5. We would like to say that each voter for $A$ is a cause of $A$'s winning. But if we view all voting patterns as equally normal, then no voter is a cause of $A$'s winning, because no contingency is more normal than any other.

This definition deals well with all the problematic examples in the literature. Consider Example 4.1. Using the ranking described above, Bodyguard is not a cause of Victim's survival because the world that would need to be considered in AC2(a), where Assassin poisons the coffee, is less normal than the actual world, where he does not. It also deals well with Example 3.2. Suppose that in fact the hospital has 100 doctors and there are variables $A_1, \ldots, A_{100}$ and $T_1, \ldots, T_{100}$ in the causal model, where $A_i = 1$ if doctor $i$ is assigned to treat Billy and $A_i = 0$ if he is not, and $T_i = 1$ if doctor $i$ actually treats Billy on Monday, and $T_i = 0$ if he does not. Doctor 1 is assigned to treat Billy; the others are not. However, in fact, no doctor treats Billy. Further assume that typically, doctors do not treat patients (that is, a random doctor does not typically treat a random patient), and if doctor $i$ is assigned to Billy, then typically doctor $i$ treats Billy. We can capture this in an extended causal model where the world where no doctor is assigned to Billy and no doctor treats him has rank 0; the 100 worlds where exactly one doctor is assigned to Billy, and that doctor treats him, have rank 1; the 100 worlds where exactly one doctor is assigned to Billy and no one treats him have rank 2; and the $100 \times 99$ worlds where exactly one doctor is assigned to Billy but some other doctor treats him have rank 3. (The ranking given to other worlds is irrelevant.) In this extended model, in the context where doctor $i$ is assigned to Billy but no one treats him, $i$ is the cause of Billy's sickness (the world where $i$ treats Billy has lower rank than the world where $i$ is assigned to Billy but no one treats him), but no other doctor is a cause of Billy's sickness. Moreover, in the context where $i$ is assigned to Billy and treats him, then $i$ is the cause of Billy's recovery (for AC2(a), consider the world where no doctor is assigned to Billy and none treat him).

I consider one more example here, due to Hitchcock [2007], that illustrates the interplay between normality and causality.

Example 4.2: Assistant Bodyguard puts a harmless antidote in Victim's coffee. Buddy then poisons the coffee, using a type of poison that is normally lethal, but is countered by the antidote. Buddy would not have poisoned the coffee if Assistant had not administered the antidote first. (Buddy and Assistant do not really want to harm Victim. They just want to help Assistant get a promotion by making it look like he foiled an assassination attempt.) Victim drinks the coffee and survives. Is Assistant's adding the antidote a cause of Victim's survival?
Using the preliminary HP definition, it is; if Assistant does not add the antidote, Victim survives. However, using an extended causal model with the normality assumptions implied by the story, it is not. Specifically, suppose we assume that if Assistant does not add the antidote, then Buddy does not normally add poison. (Buddy, after all, is normally a law-abiding citizen.) In the corresponding extended causal model, the world where Buddy poisons the coffee and Assistant does not add the antidote has a higher rank than (i.e., is less normal than) the world where Buddy poisons the coffee and Assistant adds the antidote. This is all we need to know about the ranking function to conclude that adding the antidote is not a cause. By way of contrast, if Buddy were a more typical assassin, with reasonable normality assumptions, the world where he puts in the poison and Assistant puts in the antidote would be less normal than the one where Buddy puts in the poison and Assistant does not put in the antidote, so Assistant would be a cause of Victim being alive.

Interestingly, Hitchcock captures this story using structural equations that also make Assistant putting in the antidote a cause of Buddy putting in the poison. This is the device used to distinguish this situation from one where Buddy actually means Victim to die (in which case Buddy would presumably have put in the poison even if Assistant had not added the antidote). However, it is not clear that people would agree that Assistant putting in the antidote really caused Buddy to add the poison; rather, it set up a circumstance where Buddy was willing to put it in. I would argue that this is better captured by using the normality statement "If Assistant does not put in the antidote, then Buddy does not normally add poison." As this example shows, there is a nontrivial interplay between statements of causality and statements of normality.

I leave it to the reader to check that reasonable assumptions about typicality can also be used to deal with the other problematic examples for the HP definition that have been pointed out in the literature, such as Larry the Loanshark [Halpern and Pearl 2005, Example 5.2] and Hall's [2007] watching police example. (The family sleeps peacefully through the night. Are the watching police a cause? After all, if there had been thieves, the police would have nabbed them, and without the police, the family's peace would have been disturbed.)

This is not the first attempt to modify structural equations to deal with defaults; Hitchcock [2007] and Hall [2007] also consider this issue. Neither adds any extra machinery such as ranking functions, but both assume that there is an implicitly understood notion of normality. Roughly speaking, Hitchcock [2007] can be understood as giving constraints on models that guarantee that the answer obtained using the preliminary HP definition agrees with the answer obtained using the definition in extended causal models. I do not compare my suggestion to that of Hall [2007], since, as Hitchcock [2008] points out, there are a number of serious problems with Hall's approach. It is worth noting that both Hall and Hitchcock assume that a variable has a "normal" or "default" setting; any other setting is abnormal. However, it is easy to construct examples where what counts as normal depends on the context. For example, it is normal for doctor $i$ to treat Billy if $i$ is assigned to Billy; otherwise it is not.

5 The NESS approach

In this section I provide a sufficient condition to guarantee that a single conjunct is a cause.
Doing so has the added benefit of providing a careful comparison of the NESS test and the HP approach. Wright does not provide a mathematical formalization of the NESS test; what I give here is my understanding of it.

$A$ is a cause of $B$ according to the NESS test if there exists a set $S = \{A_1, \ldots, A_k\}$ of events, each of which actually occurred, where $A = A_1$, $S$ is sufficient for $B$, and $S - \{A_1\}$ is not sufficient for $B$. Thus, $A$ is an element of a sufficient condition for $B$, namely $S$, and is a necessary element of that set, because any subset of $\{A_1, \ldots, A_k\}$ that does not include $A$ is not sufficient for $B$.⁴

⁴ The NESS test is much in the spirit of Mackie's INUS test [Mackie 1965], according to which $A$ is a cause of $B$ if $A$ is an insufficient but necessary part of a condition which is unnecessary but sufficient for $B$. However, a comparison of the two approaches is beyond the scope of this paper.

The NESS test, as stated, seems intuitive and simple. Moreover, it deals well with many examples. However, although the NESS test looks quite formal, it lacks a definition of what it means for a set $S$ of events to be sufficient for $B$ to occur. As I now show, such a definition is sorely needed.

Example 5.1: Consider Wright's example of Victoria's poisoning from the introduction. First, suppose that Victoria drinks a cup of tea poisoned by Paula, and then dies. It seems clear that Paula poisoning the tea caused Victoria's death. Let $S$ consist of two events:
• $A_1$, Paula poisoned the tea; and
• $A_2$, Victoria drank the tea.
Given our understanding of the world, it seems reasonable to say that $A_1$ and $A_2$ are sufficient for Victoria's death, but removing $A_1$ results in a set that is insufficient.
But now suppose that Sharon shoots Victoria just after she drinks the tea (call this event $A_3$), and she dies instantaneously from the shot (before the poison can take effect). In this case, we would want to say that $A_3$ is the cause of Victoria's death, not $A_1$. Nevertheless, it would seem that the same argument that makes Paula's poisoning a cause without Sharon's shot would still make Paula's poisoning a cause even with Sharon's shot. The set $\{A_1, A_2\}$ still seems sufficient for Victoria's death, while $\{A_2\}$ is not.

Wright [1985] observes that the poisoned tea would be a cause of Victoria's death only if Victoria "drank the tea and was alive when the poison took effect". Wright seems to be arguing that $\{A_1, A_2\}$ is in fact not sufficient for Victoria's death. We need $A_3$: Victoria was alive when the poison took effect. While I agree that the fact that Victoria was alive when the poison took effect is critical for causality, I do not see how it helps in the NESS test, under what seems to me the most obvious definitions of "sufficient". I would argue that $\{A_1, A_2\}$ is in fact just as sufficient for death as $\{A_1, A_2, A_3\}$. For suppose that $A_1$ and $A_2$ hold. Either Victoria was alive when the poison took effect, or she was not. In either case, she dies. In the former case, it is due to the poison; in the latter case, it is not.

But it gets worse. While I would argue that $\{A_1, A_2\}$ is indeed just as sufficient for death as $\{A_1, A_2, A_3\}$, it is not clear that $\{A_1, A_2\}$ is in fact sufficient. Suppose, for example, that some people are naturally immune to the poison that Paula used, and do not die from it. Victoria is not immune. But then it seems that we need to add a condition $A_4$ saying that Victoria is not immune from the poison to get a set sufficient to cause Victoria's death. And why should it stop there?
Suppose that the poison has an antidote that, if administered within five minutes of the poison taking effect, will prevent death. Unfortunately, the antidote was not administered to Victoria, but do we have to add this condition to $S$ to get a sufficient set for Victoria's death? Where does it stop?

I believe that a formal definition of sufficient cause requires the machinery of causal models. (This point echoes criticisms of NESS and related approaches by Pearl [2000, pp. 314–315].) I now sketch an approach to defining sufficiency that delivers reasonable answers in many cases of interest and, indeed, often agrees with the HP definition.⁵

⁵ Interestingly, Baldwin and Neufeld [2003] claimed that the NESS test could be formalized using causal models, but did not actually show how, beyond describing some examples. In a later paper [Baldwin and Neufeld 2004], they seem to retract the claim that the NESS test can be formalized using causal models.

Fix a causal model $M$. Recall that a primitive event has the form $X = x$; a set of primitive events is consistent if it does not contain both $X = x$ and $X = x'$ for some random variable $X$ and $x \ne x'$. If $S = \{X_1 = x_1, \ldots, X_k = x_k\}$ is a consistent set of primitive events, then $S$ is sufficient for $\varphi$ relative to causal model $M$ if $M \models [S]\varphi$, where $[S]\varphi$ is an abbreviation for $[X_1 = x_1; \ldots; X_k = x_k]\varphi$.

Roughly speaking, the idea is to formalize the NESS test by taking $X = x$ to be a cause of $\varphi$ if there is a set $S$ including $X = x$ that is sufficient for $\varphi$, while $S - \{X = x\}$ is not. Example 5.1 already shows that this will not work. If $CP$ is a random variable that takes on value 1 if Paula poisoned the tea and 0 otherwise, then it is not hard to show that in the obvious causal model, $CP = 1$ is sufficient for $PD = 1$ (Victoria dies), even if Sharon shoots Victoria.
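The notion of sufficiency, $M \models [S]\varphi$, quantifies over all contexts, which a small enumeration makes concrete. The Python sketch below is one possible reading of "the obvious causal model": it assumes two binary context components that fix $CP$ and a hypothetical variable `SS` (Sharon shoots), together with the equation $PD = CP \lor SS$. None of these modeling choices is spelled out in the text, and the function names are illustrative.

```python
from itertools import product

# A context fixes whether Paula poisons and whether Sharon shoots.
CONTEXTS = list(product((0, 1), repeat=2))   # (u_cp, u_ss)


def solve(context, do):
    """Evaluate the assumed model in `context` under the intervention `do`."""
    u_cp, u_ss = context
    v = {}
    v["CP"] = do.get("CP", u_cp)                     # Paula poisoned the tea
    v["SS"] = do.get("SS", u_ss)                     # Sharon shoots (hypothetical name)
    v["PD"] = do.get("PD", int(v["CP"] or v["SS"]))  # Victoria dies
    return v


def sufficient(events, phi):
    """S is sufficient for phi relative to M if [S]phi holds in every context."""
    return all(phi(solve(u, events)) for u in CONTEXTS)


def dies(v):
    return v["PD"] == 1


print(sufficient({"CP": 1}, dies))   # True: CP = 1 is sufficient for PD = 1
print(sufficient({}, dies))          # False: the empty set is not
```

Since $\{CP = 1\}$ is sufficient and $\{CP = 1\} - \{CP = 1\} = \emptyset$ is not, the naive formalization counts the poisoning as a cause in every context, including the one where Sharon shoots.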
To deal with this problem, we must strengthen the notion of sufficiency to capture some of the intuitions behind AC2(b). Say that $S$ is strongly sufficient for $\varphi$ in $(M, \vec{u})$ if $S \cup S'$ is sufficient for $\varphi$ in $M$ for all sets $S'$ consisting of primitive events $Z = z$ such that $(M, \vec{u}) \models Z = z$. Intuitively, $S$ is strongly sufficient for $\varphi$ in $(M, \vec{u})$ if $S$ remains sufficient for $\varphi$ even when additional events, which happen to be true in $(M, \vec{u})$, are added to it. As I now show, although $CP = 1$ is sufficient for $PD = 1$, it is not strongly sufficient, provided that the language includes enough events.

As already shown by HP, in order to get the "right" answer for causality in the presence of preemption (here, the shot preempts the poison), there must be a variable in the language that takes on different values depending on which of the two potential causes is the actual cause. In this case, we need a variable that takes on different values depending on whether Sharon shot. Suppose that it would take Victoria $t$ units of time after the poison is administered to die; let $DAP$ be the variable that has value 1 if Victoria dies $t$ units of time after the poison is administered and is alive before that, and has value 0 otherwise. Note that $DAP = 0$ if Victoria is already dead before the poison takes effect. In particular, if Sharon shoots Victoria before the poison takes effect, then $DAP = 0$. Then although $CP = 1$ is sufficient for $PD = 1$, it is not strongly sufficient for $PD = 1$ in the context $\vec{u}'$ where Sharon shoots, since $(M, \vec{u}') \models DAP = 0$, and $M \models [CP = 1; DAP = 0](PD \ne 1)$.

The following definition is my attempt at formalizing the NESS condition, using the ideas above.

Definition 5.2: $\vec{X} = \vec{x}$ is a cause of $\varphi$ in $(M, \vec{u})$ according to the causal NESS test if there exists a set $S$ of primitive events containing $\vec{X} = \vec{x}$ such that the following properties hold:
NT1.
$(M, \vec{u}) \models S$; that is, $(M, \vec{u}) \models Y = y$ for all primitive events $Y = y$ in $S$.
NT2. $S$ is strongly sufficient for $\varphi$ in $(M, \vec{u})$.
NT3. $S - \{\vec{X} = \vec{x}\}$ is not strongly sufficient for $\varphi$ in $(M, \vec{u})$.
NT4. $\vec{X} = \vec{x}$ is minimal; no subset of $\vec{X}$ satisfies conditions NT1–3.⁶

⁶ This definition does not take into account defaults. It can be extended to take defaults into account by requiring that if $\vec{u}'$ is the context showing that $S - \{X = x\}$ is not strongly sufficient for $\varphi$ in NT3, then $\kappa(s_{\vec{u}'}) \le \kappa(s_{\vec{u}})$. For ease of exposition, I ignore this issue here.

$S$ is said to be a witness for the fact that $\vec{X} = \vec{x}$ is a cause of $\varphi$ according to the causal NESS test.

Unlike the HP definition, causes according to the causal NESS test always consist of single conjuncts.

Theorem 5.3: If $\{X_1 = x_1, \ldots, X_k = x_k\}$ is a cause of $\varphi$ in $M$ according to the causal NESS test, then $k = 1$.

It is easy to check that in Example 3.4, both $C_1 = 1$ and $C_2 = 1$ are causes of $WIN = 1$ according to the causal NESS test, while (because of NT4) $C_1 = 1 \land C_2 = 1$ is not. On the other hand, Example 3.4 shows that neither $C_1 = 1$ nor $C_2 = 1$ is a cause according to the HP definition, while $C_1 = 1 \land C_2 = 1$ is. Thus, the two definitions are incomparable.

Nevertheless, the HP definition and the causal NESS test agree in many cases of interest (in particular, in all the examples in the HP paper). In light of Theorem 5.3, this explains in part why, in so many cases, causes are single conjuncts with the HP definition. In the rest of this section I give conditions under which the NESS test and the HP definition agree. Although they are complicated, they apply in all the standard examples in the literature.

I start with conditions that suffice to show that being a cause according to the causal NESS test implies being a cause according to the HP definition.

Theorem 5.4: Suppose that $X = x$ is a cause of $\varphi$ in $(M, \vec{u})$ according to the causal NESS test with witness $S$, and there exists a (possibly empty) set $\vec{T}$ of variables not mentioned in $\varphi$ or $S$ and a context $\vec{u}'$ such that the following properties hold:
SH1. $S - \{X = x\}$ is not a sufficient condition for $\varphi$ in $(M, \vec{u}')$; that is, $(M, \vec{u}') \models [S - \{X = x\}]\neg\varphi$.
SH2. Each variable in $\vec{T}$ is independent of all other variables in contexts $\vec{u}$ and $\vec{u}'$; that is, for all variables $T \in \vec{T}$, if $\vec{W}$ consists of all endogenous variables other than $T$, then for all settings $t$ of $T$ and $\vec{w}$ of $\vec{W}$, we have $(M, \vec{u}) \models T = t$ iff $(M, \vec{u}) \models [\vec{W} = \vec{w}](T = t)$, and similarly for context $\vec{u}'$.
SH3. $\varphi$ is determined by $\vec{T}$ and $X$ in contexts $\vec{u}$ and $\vec{u}'$; that is, for all $\vec{t}$, $\vec{T}'$ disjoint from $\vec{T}$ and $X$, $x'$, and $\vec{t}'$, we have $(M, \vec{u}') \models [\vec{T} = \vec{t}, \vec{T}' = \vec{t}', X = x']\varphi$ iff $(M, \vec{u}) \models [\vec{T} = \vec{t}, \vec{T}' = \vec{t}', X = x']\varphi$.
SH4. In context $\vec{u}$, $S - \{X = x\}$ depends only on $X = x$; that is, for all $\vec{T}'$ disjoint from $S$ and $\vec{t}'$, we have $(M, \vec{u}) \models [X = x, \vec{T}' = \vec{t}']S$.
Then $X = x$ is a cause of $\varphi$ in $(M, \vec{u})$ according to the HP definition.

Getting conditions sufficient for causality according to the HP definition to imply causality according to the NESS test is not so easy. The problem is that the requirement in the NESS definition that there be a witness $S$ such that $(M, \vec{u}') \models [S]\varphi$ in all contexts $\vec{u}'$ is very strong, indeed, arguably too strong. For example, consider a vote that might be called off if the weather is bad, where the weather is part of the context.
Thus, in a context where the weather is bad, there is no winner, even if some votes have been cast. In the actual context, the weather is fine and $A$ votes for Mr. B, who wins the election. $A$'s vote is a cause of Mr. B's victory in this context, according to the HP definition, but not according to the NESS test, since there is no set $S$ that includes $A$ sufficient to make Mr. B win in all contexts; indeed, there is no cause for Mr. B's victory according to the NESS test (which arguably indicates a problem with the definition).

Since the HP definition just focuses on the actual context, there is no obvious way to conclude from $X = x$ being a cause of $\varphi$ in context $\vec{u}$ that a condition holds in all contexts. To deal with this, I weaken the NESS test so that it must hold only with respect to a set $U$ of contexts. More precisely, say that $S$ is sufficient for $\varphi$ with respect to $U$ if $(M, \vec{u}) \models [S]\varphi$ for all $\vec{u} \in U$. We can then define what it means for $S$ to be strongly sufficient for $\varphi$ in $(M, \vec{u})$ with respect to $U$ and for $\vec{X} = \vec{x}$ to be a cause of $\varphi$ in $(M, \vec{u})$ with respect to $U$ in the obvious way; in the latter case, we simply take strong sufficiency in NT2 and NT3 to be with respect to $U$. It is easy to check that Theorem 5.3 holds (with no change in proof) for causality with respect to a set $U$ of contexts; that is, even in this case, a cause must be a single conjunct.

Theorem 5.5: Suppose that $X = x$ is a cause of $\varphi$ in $(M, \vec{u})$ according to the HP definition, with $\vec{W}$, $\vec{w}$, and $x'$ as witnesses. Suppose that there exists a subset $\vec{W}' \subseteq \vec{W}$ such that $(M, \vec{u}') \models \vec{W}' = \vec{w}$ (that is, the assignment $\vec{W}' = \vec{w}$ does not change the values of the variables in $\vec{W}'$ in context $(M, \vec{u})$) and a context $\vec{u}'$ such that the following conditions hold, where $\vec{W}'' = \vec{W} - \vec{W}'$:
SN1. $(M, \vec{u}') \models [\vec{W}' = \vec{w}](X = x' \land \vec{W}'' = \vec{w})$.
SN2.
~ W ′′ is independ ent of ~ Z given X = x and ~ W = ~ w in ~ u ′ , so that if ~ Z ′ ⊆ ~ Z , then for a ll ~ z ′ , we have ( M , ~ u ′ ) | = [ X = x, ~ W ′ = ~ w, ~ Z ′ = ~ z ′ ]( ~ W ′′ = ~ w ) . SN3. ϕ is ind ependen t of ~ u an d ~ u ′ condition al o n X a nd ~ W = ~ w ; that is if ~ Z ′ ⊆ ~ Z , then for all ~ z ′ and x ′′ , we have ( M , ~ u ′ ) | = [ X = x ′′ , ~ W = ~ w ′ , ~ Z = ~ z ′ ] ϕ iff ( M , ~ u ) | = [ X = x ′′ , ~ W = ~ w ′ , ~ Z = ~ z ′ ] ϕ . Then X = x is a cause of ϕ in ( M , ~ u ) with r espect to { ~ u, ~ u ′ } according to the causal NESS test. 6 Discussion It has lo ng been recognized that normality is a key co mpo- nent of ca usal reasonin g. Here I show how it can be inco rpo- rated into the HP framework in a straig htforward way . The HP ap proach deﬁnes c ausality relative to a causal model. But we may be interested in whether a causal statemen t follows from some features of th e structural equatio ns and some default statemen ts, witho ut knowing the whole causal model. For example, in a scenario with many variables, it may be infeasible ( or there migh t not be enoug h inf or- mation) to provid e all the struc tural equatio ns and a com- plete ranking function. This suggests it may be of interest to ﬁnd an approp riate log ic for r easoning about actual causal- ity . Axio ms for causal reasonin g (expressed in th e langua ge of this p aper, using f ormulas of the for m [ ~ X = ~ x ] ϕ , have already been g iv en by Halp ern [200 0]; th e KLM axioms [Kraus, Lehmann, and Magidor 1990] fo r reason ing abou t normality and defaults are well kn own. It would be o f in- terest to put these axiom s to gether, perhap s in corpor ating ideas f rom th e causal NESS test, and ad ding so me state- ments ab out (stron g) sufﬁciency , to see if they lead to in - teresting conclusio ns ab out actual causality . 
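
The weakened notion of sufficiency with respect to a set $U$ of contexts can be illustrated on the voting example with a short sketch. The Python below is not part of the formal development; it assumes a hypothetical encoding in which a context is a pair (weather_fine, a_votes_b), the only endogenous variables are A and B_wins, and Mr. B wins exactly when the vote is held and A votes for him. The function names solve, sufficient, and b_wins are likewise illustrative assumptions.

```python
from itertools import product

# Hypothetical encoding of the voting example: a context is a pair
# (weather_fine, a_votes_b). Both names are assumptions for illustration.
ALL_CONTEXTS = list(product([True, False], repeat=2))


def solve(context, intervention=None):
    """Solve the structural equations in `context`, holding the variables
    in `intervention` fixed at the given values."""
    intervention = intervention or {}
    weather_fine, a_votes_b = context
    values = {}
    # A's vote is read off the context unless it is intervened on.
    values["A"] = intervention.get("A", a_votes_b)
    # Assumed equation: Mr. B wins iff the weather is fine and A votes for him.
    values["B_wins"] = intervention.get("B_wins", weather_fine and values["A"])
    return values


def sufficient(S, phi, contexts):
    """S is sufficient for phi with respect to `contexts` if [S]phi holds
    in every context of the set."""
    return all(phi(solve(u, S)) for u in contexts)


def b_wins(values):
    return values["B_wins"]


S = {"A": True}                                   # the conjunct A = 1
FINE_WEATHER = [u for u in ALL_CONTEXTS if u[0]]  # a restricted set U

print(sufficient(S, b_wins, ALL_CONTEXTS))   # False: bad weather cancels the vote
print(sufficient(S, b_wins, FINE_WEATHER))   # True
print(sufficient({}, b_wins, FINE_WEATHER))  # False: A's vote is needed
```

Under these assumptions, A's vote is sufficient for Mr. B's victory only once the bad-weather contexts are excluded from $U$. The sketch checks plain sufficiency only; strong sufficiency and conditions NT2 and NT3 are not modeled.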
Acknowledgments: I thank Steve Sloman for pointing out [Kahneman and Miller 1986], Denis Hilton and Chris Hitchcock for interesting discussions on causality, and Judea Pearl and the anonymous KR reviewers for useful comments.

References

Adams, E. (1975). The Logic of Conditionals. Reidel.

Baldwin, R. A. and E. Neufeld (2003). On the structure model interpretation of Wright’s NESS test. In Proc. AI 2003, Lecture Notes in AI, Vol. 2671, pp. 9–23.

Baldwin, R. A. and E. Neufeld (2004). The structural model interpretation of the NESS test. In Advances in Artificial Intelligence, Lecture Notes in Computer Science, Vol. 3060, pp. 297–307.

Collins, J., N. Hall, and L. A. Paul (Eds.) (2004). Causation and Counterfactuals. MIT Press.

Dubois, D. and H. Prade (1991). Possibilistic logic, preferential models, non-monotonicity and related issues. In Proc. Twelfth International Joint Conf. on Artificial Intelligence (IJCAI ’91), pp. 419–424.

Eiter, T. and T. Lukasiewicz (2002). Complexity results for structure-based causality. Artificial Intelligence 142(1), 53–89.

Geffner, H. (1992). High probabilities, model preference and default arguments. Mind and Machines 2, 51–70.

Goldszmidt, M. and J. Pearl (1992). Rank-based systems: A simple approach to belief revision, belief update and reasoning about evidence and actions. In Principles of Knowledge Representation and Reasoning: Proc. Third International Conf. (KR ’92), pp. 661–672.

Hall, N. (2004). Two concepts of causation. In J. Collins, N. Hall, and L. A. Paul (Eds.), Causation and Counterfactuals. MIT Press.

Hall, N. (2007). Structural equations and causation. Philosophical Studies 132, 109–136.

Halpern, J. Y. (2000). Axiomatizing causal reasoning. Journal of A.I. Research 12, 317–337.

Halpern, J. Y. and J. Pearl (2001). Causes and explanations: A structural-model approach. Part I: Causes. In Proc. Seventeenth Conf. on Uncertainty in Artificial Intelligence (UAI 2001), pp. 194–202.

Halpern, J. Y. and J. Pearl (2005). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for Philosophy of Science 56(4), 843–887.

Hart, H. L. A. and T. Honoré (1985). Causation in the Law (second ed.). Oxford University Press.

Hiddleston, E. (2005). Causal powers. British Journal for Philosophy of Science 56, 27–59.

Hitchcock, C. (2007). Prevention, preemption, and the principle of sufficient reason. Philosophical Review 116, 495–532.

Hitchcock, C. (2008). Structural equations and causation: six counterexamples. Philosophical Studies.

Hopkins, M. (2001). A proof of the conjunctive cause conjecture. Unpublished manuscript.

Hopkins, M. and J. Pearl (2003). Clarifying the usage of structural models for commonsense causal reasoning. In Proc. AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning.

Hume, D. (1748). An Enquiry Concerning Human Understanding. Reprinted by Open Court Press, 1958.

Kahneman, D. and D. T. Miller (1986). Norm theory: comparing reality to its alternatives. Psychological Review 94(2), 136–153.

Kraus, S., D. Lehmann, and M. Magidor (1990). Nonmonotonic reasoning, preferential models and cumulative logics. Artificial Intelligence 44, 167–207.

Lewis, D. (2000). Causation as influence. Journal of Philosophy XCVII(4), 182–197.

Lin, F. (1995). Embracing causality in specifying the indeterminate effects of actions. In Proc. Fourteenth International Joint Conf. on Artificial Intelligence (IJCAI ’95), pp. 1985–1991.

Mackie, J. (1965). Causes and conditions. American Philosophical Quarterly 2/4, 261–264.

Pearl, J. (1989). Probabilistic semantics for nonmonotonic reasoning: a survey. In Proc. First International Conf. on Principles of Knowledge Representation and Reasoning (KR ’89), pp. 505–516.

Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.

Reiter, R. (2001). Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. MIT Press.

Sandewall, E. (1994). Features and Fluents, Vol. 1. Clarendon Press.

Shoham, Y. (1987). A semantical approach to nonmonotonic logics. In Proc. 2nd IEEE Symposium on Logic in Computer Science, pp. 275–279.

Spohn, W. (1988). Ordinal conditional functions: a dynamic theory of epistemic states. In W. Harper and B. Skyrms (Eds.), Causation in Decision, Belief Change, and Statistics, Vol. 2, pp. 105–134. Reidel.

Wright, R. W. (1985). Causation in tort law. California Law Review 73, 1735–1828.

Wright, R. W. (1988). Causation, responsibility, risk, probability, naked statistics, and proof: Pruning the bramble bush by clarifying the concepts. Iowa Law Review 73, 1001–1077.

Wright, R. W. (2001). Once more into the bramble bush: Duty, causal contribution, and the extent of legal responsibility. Vanderbilt Law Review 54(3), 1071–1132.
