Can rational choice guide us to correct de se beliefs?
Significant controversy remains about what constitute correct self-locating beliefs in scenarios such as the Sleeping Beauty problem, with proponents on both the "halfer" and "thirder" sides. To attempt to settle the issue, one natural approach consi…
Authors: Vincent Conitzer
Can rational choice guide us to correct de se beliefs? [∗]

Vincent Conitzer
Duke University

Abstract

Significant controversy remains about what constitute correct self-locating beliefs in scenarios such as the Sleeping Beauty problem, with proponents on both the “halfer” and “thirder” sides. To attempt to settle the issue, one natural approach consists in creating decision variants of the problem, determining what actions the various candidate beliefs prescribe, and assessing whether these actions are reasonable when we step back. Dutch book arguments are a special case of this approach, but other Sleeping Beauty games have also been constructed to make similar points. Building on a recent article (James R. Shaw. De se belief and rational choice. Synthese, 190(3):491-508, 2013), I show that in general we should be wary of such arguments, because unintuitive actions may result for reasons that are unrelated to the beliefs. On the other hand, I show that, when we restrict our attention to additive games, then a thirder will necessarily maximize her ex ante expected payout, but a halfer in some cases will not (assuming causal decision theory). I conclude that this does not necessarily settle the issue and speculate about what might.

Keywords: Sleeping Beauty, Dutch books, decision theory, game theory.

1 Introduction

The Sleeping Beauty problem [Elga, 2000] illustrates some fundamental issues regarding self-locating beliefs. In it, a study participant referred to as “Sleeping Beauty” is put to sleep on Sunday, and awoken either just on Monday, or on both Monday and Tuesday, according to the outcome of a fair coin toss (Heads or Tails, respectively). After an awakening, she is put back to sleep with her memory of the awakening event erased, so that all awakenings are indistinguishable to her.
When Beauty is awoken, what should be her credence (subjective probability) that the coin came up Heads? Some (“halfers”) argue that it should be 1/2. The standard argument for this position is that this should have been her credence in Heads before the experiment, and she has learned nothing new, knowing all along that she would be awoken at least once. Others (“thirders”) argue that it should be 1/3. The standard argument for this position is that if the experiment is repeated many times, in the long run, only 1/3 of awakenings correspond to a toss of Heads. (It should be emphasized that “halfers” and “thirders” would compute other fractions on different examples, and “halfing” and “thirding” are supposed to refer to the methods of computing these fractions rather than these specific values.) For a summary of reasons why philosophers are interested in the Sleeping Beauty problem, see Titelbaum [2013].

[∗] This paper appears in Synthese, Volume 192, Issue 12, pp. 4107-4119, December 2015. The final publication is available at Springer via http://dx.doi.org/10.1007/s11229-015-0737-x

One approach to settling what Beauty ought to believe is to design scenarios where she must act on her beliefs, and to investigate the consequences of being a thirder or a halfer on these actions. One specific line of attack within this general approach is to design Dutch book arguments. A Dutch book is a set of bets that an agent would all adopt individually in spite of the fact that their combination will lead to a guaranteed loss. If such can be constructed, this is an argument against the rationality of the agent’s beliefs. In the context of the Sleeping Beauty problem, the focus is on diachronic Dutch books, which involve bets at different times.
Dutch book arguments for the Sleeping Beauty problem are considered by Hitchcock [2004], Halpern [2006], Draper and Pust [2008], Briggs [2010], and Conitzer [2015]. These arguments generally favor thirding, though it is sometimes also argued that a halfer can resist Dutch books, particularly when adopting evidential decision theory. Shaw [2013] more generally pursues the agenda of integrating de se beliefs with rational choice in the context of variants of the Sleeping Beauty problem. He allows Beauty to play more complex games, and designs one where, he argues, the thirder makes the wrong decision and the halfer makes the right decision, regardless of whether they adopt causal or evidential decision theory.

In this article, taking Shaw [2013] as a starting point, I further pursue the agenda of settling the correct answer to the Sleeping Beauty problem by looking at the consequences of halfing and thirding on the outcomes of associated decision problems. I first sound a note of caution by showing that in some cases unintuitive outcomes in these examples result not from incorrect credences, but rather from challenges that a rational actor faces when trying to coordinate with her past and future selves under imperfect recall (at least under causal decision theory). From examples that involve such challenges, we cannot comfortably draw any conclusions about the (in)correctness of a particular approach for computing credences. Subsequently, I show that if we restrict the types of decision problem to additive ones, which include typical Dutch book arguments, these coordination challenges disappear; moreover, under causal decision theory, a thirder will always make decisions that maximize her overall expected payout, but a halfer in some cases does not. I conclude by assessing how much we can learn from these results about correct self-locating beliefs.
2 Review: Shaw’s Waking Game

First, a review of Shaw’s Waking Game is in order. He argues that thirders get the wrong answer in this game while halfers get it right. I focus here on his analysis of a thirder who is a causal decision theorist. [1]

Shaw’s Waking Game. At the beginning of the experiment, Beauty is informed of the rules of the game, which are as follows. A fair coin will be tossed; the outcome of this coin toss will not be revealed to Beauty until the game is over. If it lands Heads, she will be woken up only once, on Monday. If it lands Tails, she will be woken up four times, on Monday, Tuesday, Wednesday, and Thursday. Each day, she will be asked to press either Left or Right. Her memory of the awakening will be erased afterwards, she will not be able to take any notes, and the awakenings will be indistinguishable. She will be compensated as follows.

1. If Heads came up and she pressed Left, she will receive $400.
2. If Heads came up and she pressed Right, she will receive $200.
3. If Tails came up and she pressed Left on each of the four days, she will receive $100.
4. If Tails came up and she pressed Right on each of the four days, she will receive $200.
5. If Tails came up and she pressed Left on Monday and Right on at least one other day, she will receive $200.
6. If Tails came up and she pressed Right on Monday and Left on at least one other day, she will receive $100.

Shaw makes two assumptions that he calls Randomizing Prohibited and Previous Runs. The meaning of the former is clear; the latter refers to the fact that Beauty, having seen many similar experiments performed on others, has become convinced that a subject always makes the same decision on each of her awakenings. These imply the following, which is all that is needed for his analysis of the case of a thirder who is a causal decision theorist.
Definition 1 Beauty is said to accept Consistency in Other Rounds if, upon any given awakening, she does not assign any credence to the following event: she has woken up or will wake up (with the same information) multiple additional times and did not/will not take the same action on each of those other occasions.

[1] Throughout, unless otherwise noted, I will focus on causal decision theory. Therefore, some of the conclusions I reach can be avoided by dismissing causal decision theory. If the reader feels compelled to do so by the examples provided here, then that might be an even more significant impact for them to have, but I myself am not willing to go that far.

Then, Shaw provides the following analysis. If Beauty is a thirder and a causal decision theorist, then upon an awakening, she should assign 1/5 credence to Heads/Monday, 1/5 to Tails/Monday, and 3/5 to Tails/some other day. If she accepts Consistency in Other Rounds, then moreover she believes that either (a) on all other awakenings (if any) she chooses Left, or that (b) on all other awakenings she chooses Right. Under (a), if she chooses to now press Left, her expected payout will be

\[ (1/5) \cdot \$400 + (1/5) \cdot \$100 + (3/5) \cdot \$100 = \$160. \]

On the other hand, if she chooses to now press Right, her expected payout will be

\[ (1/5) \cdot \$200 + (1/5) \cdot \$100 + (3/5) \cdot \$200 = \$180. \]

Hence, under (a), she is better off pressing Right. Under (b), if she chooses to now press Left, her expected payout will be

\[ (1/5) \cdot \$400 + (1/5) \cdot \$200 + (3/5) \cdot \$100 = \$180. \]

On the other hand, if she chooses to now press Right, her expected payout will be

\[ (1/5) \cdot \$200 + (1/5) \cdot \$200 + (3/5) \cdot \$200 = \$200. \]

Hence, under (b), she is also better off pressing Right! It follows that Beauty, if she is a thirder and a causal decision theorist, will press Right.
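The four expected payouts in this analysis can be reproduced mechanically. The Python sketch below is only an illustration: the function names (`shaw_payout`, `local_expected_payout`, `ex_ante_payout`) are hypothetical and not from Shaw or from this paper. It encodes the six compensation rules and the thirder credences (1/5, 1/5, 3/5), and represents Consistency in Other Rounds by fixing all other awakenings to one common action.

```python
from fractions import Fraction


def shaw_payout(coin, presses):
    """Payout in Shaw's Waking Game.

    presses holds one 'L' or 'R' per awakening, Monday first
    (one entry on Heads, four entries on Tails).
    """
    if coin == "Heads":
        return 400 if presses[0] == "L" else 200      # rules 1 and 2
    monday, rest = presses[0], presses[1:]
    if monday == "L":
        return 200 if "R" in rest else 100            # rules 5 and 3
    return 100 if "L" in rest else 200                # rules 6 and 4


def local_expected_payout(now, others):
    """Thirder/causal expected payout of pressing `now` at this awakening,
    assuming every other awakening (if any) presses `others`."""
    heads_monday = shaw_payout("Heads", [now])
    tails_monday = shaw_payout("Tails", [now, others, others, others])
    # Any non-Monday Tails awakening is symmetric; Tuesday is used here.
    tails_other = shaw_payout("Tails", [others, now, others, others])
    return (Fraction(1, 5) * heads_monday
            + Fraction(1, 5) * tails_monday
            + Fraction(3, 5) * tails_other)


def ex_ante_payout(action):
    """Expected payout, before the fair coin toss, of always pressing `action`."""
    return (Fraction(1, 2) * shaw_payout("Heads", [action])
            + Fraction(1, 2) * shaw_payout("Tails", [action] * 4))


if __name__ == "__main__":
    for others in ("L", "R"):
        for now in ("L", "R"):
            print(f"others={others} now={now}:", local_expected_payout(now, others))
    print("always L:", ex_ante_payout("L"), "always R:", ex_ante_payout("R"))
```

Under both assumptions about the other rounds, the local comparison favors Right, while `ex_ante_payout` gives the policy-level comparison between always pressing Left and always pressing Right.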
Now, because all awakenings are indistinguishable, she should always press Right, resulting in a payout of $200. But always pressing Left would have resulted in an expected value of $250, which is better (assuming Beauty is risk-neutral), and is hence the correct course of action according to Shaw. (He shows that a thirder who is an evidential decision theorist also should choose Right in this example, but I will not review this analysis here.)

3 Three Awakenings

Shaw’s Waking Game is illuminating, but I believe little can be concluded from it about whether thirding or halfing is correct. To show why, let us consider another example that shares key features of the reasoning above, but without any coin tosses whatsoever.

Three Awakenings. At the beginning of the experiment, Beauty is informed of the rules of the game, which are as follows. She will be woken up exactly three times (Monday, Tuesday, and Wednesday). Each day, she will be asked to press either Left or Right. Her memory of the awakening will be erased afterwards, she will not be able to take any notes, and the awakenings will be indistinguishable. She will be compensated as follows.

1. If she never pressed Right, she will receive $200.
2. If she pressed Right once, she will receive $300.
3. If she pressed Right twice, she will receive $0.
4. If she pressed Right three times, she will receive $100.

Again, note that no coins are tossed at all in Three Awakenings. [2] The only uncertainties that Beauty faces are (1) which day it is and (2) what she herself has done and will do on the other days. In fact, arguably, (1) does not even matter because in this game, all awakenings are treated symmetrically. The key uncertainty is (2).

How should Beauty act in this game? If she always presses Left, she will obtain $200; if she always presses Right, she will obtain only $100.
So something is to be said for pressing Left. However, upon any given awakening, Beauty can reason as follows. There are two other rounds in which she has pressed or will press a button. If she accepts Consistency in Other Rounds, then she believes that either (a) she has pressed or will press Left both other times or (b) she has pressed or will press Right both other times. In case (a), she will be better off pressing Right this round, because pressing Right in only one round pays out $300, whereas never pressing Right pays out $200. In case (b), she will also be better off pressing Right this round, because pressing Right in all three rounds pays out $100, whereas pressing Right in only two pays out nothing. So in either case Beauty is better off pressing Right, gaining $100 from doing so! [3] Then, because all awakenings are indistinguishable, it seems we should expect Beauty to press Right all the time, even though pressing Left all the time results in a higher payout.

From Three Awakenings, it becomes clear that, under causal decision theory, actions that are locally optimal (at least when assuming Consistency in Other Rounds) can result in globally suboptimal outcomes, even in cases where there is no ambiguity about what the correct credences are. (I take it to be uncontroversial that Beauty’s credence upon awakening should be distributed uniformly (1/3, 1/3, 1/3) across Monday, Tuesday, and Wednesday.) I believe the example also makes it clear that the total payout earned by a subject is a very unreliable indicator of the correctness of her credences. [4] To drive home the point, consider the following modification of Three Awakenings.

[2] In this sense, it is closer to the example of O’Leary awakening twice in his trunk [Stalnaker, 1981], except that I need three rather than two awakenings. Nevertheless, I will stick with the Beauty terminology for expository purposes, and will reintroduce coin tosses soon.

[3] Of course, to reason this way, Beauty must be a causal decision theorist; if she were an evidential decision theorist, then she would prefer to press Left and therefore believe that she presses Left in the other rounds as well. The example may thus provide some ammunition for evidential decision theorists, but again, I will attempt to steer clear of that debate here as much as possible.

[4] One might, of course, argue that this is so only because we are using causal decision theory and causal decision theory is flawed. Still, given the prominence of causal decision theory, I believe the example should leave us generally cautious about the strategy of using rational choice to determine what the correct credences are.

Three Awakenings with a Coin Toss. The experiment now begins with a biased coin toss. If it lands Heads (which happens 99% of the time), we proceed with the original Three Awakenings game. If it lands Tails (1%), Beauty will similarly be woken up on Monday, Tuesday, and Wednesday, and asked to press Left or Right, but the payoffs will be different. In fact, they will be much simpler: she will receive $100 for each time she presses Left (and nothing for pressing Right). As always, Beauty knows the setup of this modified game, but will not receive any evidence of how the coin landed until the game has ended.

I take it to be uncontroversial that upon any awakening, Beauty should place a credence of 99% on the event that the coin landed Heads, because whether the coin landed Heads or not, she will be awoken three times. Moreover, in all six possible awakening events, she will have the exact same information.
Given this, the modification is too slight to have an impact on her decision: for any given awakening, there is a 99% chance that she will gain $100 from pressing Right (assuming Consistency in Other Rounds) and a 1% chance that she will lose $100 from doing so, so she should still press Right.

But now, suppose that Beauty’s credence is inexplicably inverted, so that she believes that there is a 99% chance that the coin came up Tails. If so, then from her perspective, now the simpler payoff function dominates and clearly she should press Left. As a result, she will actually obtain a larger expected payout from the actual game, because always pressing Left results in a higher payout in Three Awakenings than always pressing Right. However, it seems clear that this should not lead us to believe that Beauty’s inverted credence is in any sense correct; rather, she was just lucky that she accidentally inverted the credences, thereby escaping the detrimental reasoning to which understanding the game correctly would have led her.

Of course, we do not need to go to such lengths to find examples where incorrect credences lead to a better result. Someone who for some reason believes that in roulette Red comes up 2/3 of the time, and bets on Red once for this reason only (as opposed to not betting at all), may well get lucky on that one spin of the wheel. If so, nobody will argue that this ex post outcome implies that the credence of 2/3 was correct. What is interesting about Three Awakenings with a Coin Toss is that any credences that maximize ex ante expected payoff are clearly incorrect.
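The reasoning in Three Awakenings with a Coin Toss can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions: the names `payout`, `local_gain_from_right`, and `ex_ante_payout` are hypothetical, and Consistency in Other Rounds is modeled by fixing the two other presses to a common action. It compares the local gain from pressing Right under the correct credence (99% Heads) and under the inverted credence (1% Heads), and then evaluates the two constant policies ex ante.

```python
from fractions import Fraction

# Three Awakenings payout, indexed by the number of Right presses.
THREE_AWAKENINGS = {0: 200, 1: 300, 2: 0, 3: 100}


def payout(coin, presses):
    """Payout for a list of three presses ('L' or 'R')."""
    if coin == "Heads":                       # original Three Awakenings game
        return THREE_AWAKENINGS[presses.count("R")]
    return 100 * presses.count("L")           # Tails: $100 per Left press


def local_gain_from_right(credence_heads, others):
    """Expected gain from pressing Right instead of Left now, when the
    other two awakenings both press `others`."""
    gain = Fraction(0)
    for coin, p in (("Heads", credence_heads), ("Tails", 1 - credence_heads)):
        with_right = payout(coin, ["R", others, others])
        with_left = payout(coin, ["L", others, others])
        gain += p * (with_right - with_left)
    return gain


def ex_ante_payout(action, p_heads=Fraction(99, 100)):
    """Expected payout of always pressing `action`, using the true coin bias."""
    return (p_heads * payout("Heads", [action] * 3)
            + (1 - p_heads) * payout("Tails", [action] * 3))


if __name__ == "__main__":
    for label, credence in (("correct", Fraction(99, 100)),
                            ("inverted", Fraction(1, 100))):
        for others in ("L", "R"):
            print(label, "others =", others, "gain from Right:",
                  local_gain_from_right(credence, others))
    print("always L:", ex_ante_payout("L"), "always R:", ex_ante_payout("R"))
```

With the correct credence the local gain from Right is positive whatever the other rounds do; with the inverted credence it is negative, which is what steers the mistaken agent toward the policy that does better ex ante.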
It would seem that it is a very reasonable criterion for evaluating the correctness of credences to see whether they lead to the maximum ex ante expected payoff, but the example shows that this approach is, in general, problematic (at least if we are not willing to dismiss causal decision theory).

4 Additive Games

The examples in Section 3 suggest that in sufficiently rich decision variants of the Sleeping Beauty problem, under causal decision theory, the payouts that Beauty obtains do not provide useful guidance for what her correct credences should be. This is so because in such scenarios, actions that are locally apparently rational may lead to suboptimal payouts even when there can be no serious dispute about what the correct credences should be. But perhaps, if we restrict the space of scenarios, we can avoid such issues.

The problematic aspect of the Three Awakenings game is that Beauty’s “three selves” need to coordinate their actions to maximize payout (the effect of one action on overall payout depends on the other actions) and they fail to do so due to the lack of memory. What happens if we assume away this interdependence? In what follows, I show that in the resulting restricted class of games, additive games, Beauty does in fact maximize her expected total payout by being a thirder (and a causal decision theorist).

Of course, merely showing an example additive game where being a thirder maximizes Beauty’s expected total payout will do little to prove the point, because for all we know there is another example where being a thirder results in suboptimal payout. I have to prove the result at some level of generality for it to be more than merely suggestive.
In particular, for the sake of generality, I wish to allow that Beauty does not necessarily have the same experience in each awakening (thereby allowing us to also address examples such as “Technicolor Beauty” [Titelbaum, 2008]). To do so, I will have to be a bit more formal.

Definition 2 A Sleeping Beauty decision variant with payoff function \( \pi \) is additive if for every realization \( r \) of the initial coin toss, [5]

• (actions do not affect future rounds) \( r \) always leads to the same number \( n_r \) of awakenings by Beauty regardless of Beauty’s actions, and for every \( i \) with \( 1 \le i \le n_r \), the information that Beauty possesses in the \( i \)th awakening depends only on \( r \) and \( i \), and not on Beauty’s earlier actions; and

• (payoff additivity) for every \( i \) with \( 1 \le i \le n_r \), and every two corresponding sequences of actions \( a_1, \ldots, a_{n_r} \) and \( a'_1, \ldots, a'_{n_r} \) that Beauty may take upon her \( n_r \) awakenings, we have that

\[ \pi(r, a_1, \ldots, a_{n_r}) - \pi(r, a_1, \ldots, a_{i-1}, a'_i, a_{i+1}, \ldots, a_{n_r}) = \pi(r, a'_1, \ldots, a'_{i-1}, a_i, a'_{i+1}, \ldots, a'_{n_r}) - \pi(r, a'_1, \ldots, a'_{n_r}). \]

Intuitively, in additive games, Beauty does not need to worry about coordinating her actions with her selves from other awakenings. This is because by the first condition, the only effect of actions is directly on the final payout (as opposed to them affecting the number of awakenings or the information that she has in future rounds), and by the second condition these effects on payout are independent across actions. This intuition leads to the following proposition.

Proposition 1 If Beauty is a thirder and a causal decision theorist, and acts accordingly upon each individual awakening, then she will maximize her ex ante expected payout in additive games.
If she is a halfer and a causal decision theorist, and acts accordingly on each individual awakening, there are additive games in which she does not maximize her ex ante expected payout.

[5] We may assume without loss of generality that a single coin toss at the beginning provides all the randomness needed for the duration of the game, since we can keep as much of this randomness hidden from Beauty as we must, for as long as we must. Indeed, it is commonly agreed that moving the coin toss between Sunday night and Monday night in the standard version of the Sleeping Beauty problem makes no difference.

Proof. For each \( r \) and \( i \) with \( 1 \le i \le n_r \), let \( v(r, i) \) correspond to the awakening event on the \( i \)th day after a coin toss realization of \( r \). Let \( V = \bigcup_{(r,i):\, 1 \le i \le n_r} \{ v(r, i) \} \) be the set of all awakening events. By payoff additivity, we can construct, for every \( v \in V \), a function \( \pi_v \) such that Beauty’s total payout upon coin toss realization \( r \) and actions \( a_1, \ldots, a_{n_r} \) is \( c(r) + \sum_{i \in \{1, \ldots, n_r\}} \pi_{v(r,i)}(a_i) \), where \( c(r) \) is a constant that we may ignore for the purpose of acting optimally. [6] We will use \( I \subseteq V \) to denote an information set, i.e., a set of awakening events that Beauty cannot distinguish. [7] Note that two awakening events in the same information set may correspond either to the same coin toss realization \( r \) (e.g., subsequent Monday and Tuesday awakenings in the standard version of Sleeping Beauty) or to different coin toss realizations (e.g., the two Monday awakening events corresponding to Heads and Tails in the standard version).
When Beauty awakens in information set \( I \), if she is a thirder, then her credence that the realization of the coin toss is \( r \) is given by

\[ P(r \mid I) = \frac{P(r) \cdot \nu(I, r)}{\sum_{r'} P(r') \cdot \nu(I, r')}, \]

where \( \nu(I, r) = |\{ v \in I : r(v) = r \}| \) is the number of times that Beauty will awaken with information \( I \) after coin toss realization \( r \) and \( r(v) \) is the realization that leads to \( v \). (This is the essence of being a thirder: given particular information upon awakening, credence in a particular realization is proportional to the number of times one will awaken with this information under this realization. Indeed, if the experiment is repeated many times, then \( P(r \mid I) \) gives the long-run fraction of the awakenings in information set \( I \) that corresponded to a coin toss realization of \( r \).) Moreover, the credence that she assigns to a specific \( v \in I \) with \( r(v) = r \) is

\[ P(v \mid I) = \frac{P(r \mid I)}{\nu(I, r)} = \frac{P(r)}{\sum_{r'} P(r') \cdot \nu(I, r')}. \]

Hence, if \( A_I \) is the set of actions available to her in information set \( I \), [8] she will choose some \( a_I \in A_I \) that maximizes \( \sum_{v \in I} P(v \mid I)\, \pi_v(a_I) \). If Beauty takes action \( a_I \in A_I \) whenever she is in information set \( I \), then her ex ante expected payout overall is

\[ \sum_r P(r) \sum_{i \in \{1, \ldots, n_r\}} \pi_{v(r,i)}\big(a_{I(v(r,i))}\big) \]

(where \( I(v) \) is the information set in which \( v \) lies). Rearranging, this is equal to

\[ \sum_I \sum_{v \in I} P(r(v))\, \pi_v(a_I). \quad [9] \]

[6] To be specific, we can choose, for every \( v \), a default action \( d_v \). Let \( r(v) \) denote the coin toss realization that leads to \( v \). Then, for any action \( a_v \) that can be taken at \( v \), we let

\[ \pi_v(a_v) = \pi\big(r(v), d_{v(r,1)}, \ldots, d_{v(r,i-1)}, a_v, d_{v(r,i+1)}, \ldots, d_{v(r,n_r)}\big) - \pi\big(r(v), d_{v(r,1)}, \ldots, d_{v(r,i-1)}, d_v, d_{v(r,i+1)}, \ldots, d_{v(r,n_r)}\big), \]

where \( v = v(r, i) \). By payoff additivity it then follows that

\[ \pi(r, a_1, \ldots, a_{n_r}) = \pi_{v(r,1)}(a_1) + \pi(r, d_{v(r,1)}, a_2, \ldots, a_{n_r}) = \pi_{v(r,1)}(a_1) + \pi_{v(r,2)}(a_2) + \pi(r, d_{v(r,1)}, d_{v(r,2)}, a_3, \ldots, a_{n_r}) = \ldots = \Big( \sum_{i \in \{1, \ldots, n_r\}} \pi_{v(r,i)}(a_i) \Big) + \pi\big(r(v), d_{v(r,1)}, \ldots, d_{v(r,n_r)}\big), \]

so we can set \( c(r) = \pi\big(r(v), d_{v(r,1)}, \ldots, d_{v(r,n_r)}\big) \). (It is easy to see that conversely the existence of such \( \pi_v(\cdot) \) implies payoff additivity.)

[7] Note that one awakening event corresponds to many nodes in the standard extensive-form representation of the game, one for each sequence of actions that Beauty has taken so far. However, because of the “actions do not affect future rounds” condition, all these nodes must lie in the same information set.

[8] Note that an agent cannot have different sets of actions available to her in two awakening events that are in the same information set, because then she would be able to rule out some of the awakening events in the information set based on the actions available to her. Some Dutch book arguments are flawed because they violate this criterion.

We will show that if Beauty is a thirder and a causal decision theorist, then in fact for every \( I \) she maximizes \( \sum_{v \in I} P(r(v))\, \pi_v(a_I) \), thereby establishing that she maximizes her ex ante expected payout overall. Indeed, we have already established that for each \( I \), Beauty maximizes \( \sum_{v \in I} P(v \mid I)\, \pi_v(a_I) \). Using that (Beauty being a thirder)

\[ P(v \mid I) = \frac{P(r(v))}{\sum_{r'} P(r') \cdot \nu(I, r')}, \]

we obtain that Beauty maximizes

\[ \frac{\sum_{v \in I} P(r(v))\, \pi_v(a_I)}{\sum_{r'} P(r') \cdot \nu(I, r')}. \]

Because Beauty cannot affect the denominator of this expression, this is equivalent to maximizing \( \sum_{v \in I} P(r(v))\, \pi_v(a_I) \), as was to be shown.
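As a concrete instance of the thirder credence formulas used in this part of the proof, the Python sketch below evaluates \( P(r \mid I) \) and \( P(v \mid I) \) for a single information set. The function name `thirder_credences` and its dictionary-based interface are assumptions made for illustration only. It is applied to the standard Sleeping Beauty problem (one Heads awakening, two Tails awakenings) and to Shaw's Waking Game (one Heads awakening, four Tails awakenings).

```python
from fractions import Fraction


def thirder_credences(prior, nu):
    """Thirder credences within one information set I.

    prior: maps each coin-toss realization r to P(r).
    nu:    maps each r to nu(I, r), the number of awakenings in I under r.
    Returns (p_r, p_v): p_r[r] is P(r | I); p_v[r] is P(v | I) for any single
    awakening event v in I with r(v) = r.
    """
    normalizer = sum(prior[r] * nu[r] for r in prior)
    p_r = {r: prior[r] * nu[r] / normalizer for r in prior}
    p_v = {r: prior[r] / normalizer for r in prior if nu[r] > 0}
    return p_r, p_v


if __name__ == "__main__":
    fair = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
    print("standard:", thirder_credences(fair, {"Heads": 1, "Tails": 2}))
    print("Shaw:    ", thirder_credences(fair, {"Heads": 1, "Tails": 4}))
```

In both cases every awakening event in the information set receives the same credence, because the prior is uniform and \( P(v \mid I) \) is proportional to \( P(r(v)) \); this is the property the argument above relies on.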
On the other hand, if Beauty is a halfer (and a causal decision theorist), then consider the standard Sleeping Beauty game, where a coin is tossed to determine whether she awakens once (upon Heads) or twice, with all her three possible awakenings in the same information set. Let her choose between Left and Right upon each awakening. If the awakening is one corresponding to Heads, she will receive 3 for choosing Left (and 0 for Right); if it is one corresponding to Tails, she will receive 2 for choosing Right (and 0 for Left). If Beauty is a halfer (and a causal decision theorist), upon awakening she will think it equally likely that she is in a Heads awakening and that she is in a Tails awakening, and therefore will choose Left for an expected payoff of 3/2 in this round (rather than Right for 1). However, overall, choosing Right every time gives an ex ante expected total payout of (1/2) · 2 · 2 = 2, whereas choosing Left every time gives an ex ante expected total payout of (1/2) · 1 · 3 = 3/2, so Beauty fails to maximize her expected payout.

Intuitively, the way the proof works is as follows. Because the game is additive, we can separate Beauty’s total ex ante expected payoff into the contributions made to it by individual information sets \( I \). It then remains to show that Beauty maximizes her expected payoff for each information set \( I \) if she is a thirder and a causal decision theorist. Now, the contribution of each individual awakening event \( v \) within the information set \( I \) to the expected payoff is proportional to the probability \( P(r(v)) \) of the coin toss realization \( r(v) \) that gives rise to \( v \). But, when she is in \( I \), Beauty’s credence \( P(v \mid I) \) in \( v \) is also proportional to \( P(r(v)) \).
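The halfer counterexample from the proof can be verified with a few lines of Python. This is a sketch under stated assumptions: the names `credence_in_realization`, `local_choice`, and `ex_ante_payout` are hypothetical, and the halfer is modeled simply as using the prior \( P(r) \) as her credence in \( r \), whereas the thirder weights it by the number of awakenings.

```python
from fractions import Fraction

PRIOR = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
AWAKENINGS = {"Heads": 1, "Tails": 2}   # nu(I, r) for the single information set
# Per-awakening payoff, indexed by (realization, action).
PAYOFF = {("Heads", "Left"): 3, ("Heads", "Right"): 0,
          ("Tails", "Left"): 0, ("Tails", "Right"): 2}


def credence_in_realization(kind):
    """Credence in each coin-toss realization upon awakening.

    'halfer'  uses weights P(r); 'thirder' uses weights P(r) * nu(I, r).
    """
    if kind == "halfer":
        weights = dict(PRIOR)
    else:
        weights = {r: PRIOR[r] * AWAKENINGS[r] for r in PRIOR}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


def local_choice(kind):
    """Action maximizing this round's expected payoff (causal reasoning)."""
    cred = credence_in_realization(kind)
    return max(("Left", "Right"),
               key=lambda a: sum(cred[r] * PAYOFF[r, a] for r in cred))


def ex_ante_payout(action):
    """Expected total payout of choosing `action` at every awakening."""
    return sum(PRIOR[r] * AWAKENINGS[r] * PAYOFF[r, action] for r in PRIOR)


if __name__ == "__main__":
    for kind in ("halfer", "thirder"):
        choice = local_choice(kind)
        print(kind, "chooses", choice, "with ex ante payout", ex_ante_payout(choice))
```

The halfer's per-round choice and the ex ante optimum come apart in this game, while the thirder's per-round choice coincides with the ex ante optimum.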
This proportionality holds because (being a thirder) her credence \( P(r(v) \mid I) \) in \( r(v) \) is proportional to \( P(r(v))\, \nu(I, r(v)) \), where \( \nu(I, r(v)) \) is the number of awakening events in \( I \) that arise under \( r(v) \), across which this credence is equally divided. Because of this, Beauty weighs the awakening events in an information set exactly so as to maximize ex ante expected payoff. In contrast, if she is a halfer and a causal decision theorist, then her credence in \( r(v) \) is not proportional to \( P(r(v))\, \nu(I, r(v)) \) but rather just to \( P(r(v)) \), so that her credence in \( v \) itself is proportional to \( P(r(v)) / \nu(I, r(v)) \). [10] As a result, she places too little weight on awakening events \( v \) in \( I \) that correspond to coin toss outcomes \( r \) that lead to many other awakening events in \( I \). This is what leads her to decide suboptimally in the counterexample at the end of the proof: she insufficiently weighs the Tails awakenings in making her decision. [11]

[9] To see this, note that the first summation sums over all \( v \) by first summing over all \( r \) and then over all \( v \) corresponding to that \( r \). The second summation also sums over all \( v \), but instead by first summing over all information sets and then over all \( v \) in that information set. In both cases, the summand for \( v \) is \( P(r(v))\, \pi_v(a_{I(v)}) \).

[10] At least, it would appear natural to split the credence equally across these \( \nu(I, r(v)) \) awakening events, but note that the counterexample does not actually rely on this.

Proposition 1 also implies that Beauty, if she is a thirder and a causal decision theorist, is invulnerable to certain types of Dutch books. (This is already discussed in prior work [Hitchcock, 2004, Draper and Pust, 2008, Briggs, 2010].)
Specifically, she will not fall for a Dutch book as long as: (a) Beauty, at the beginning of the experiment, is made aware of the bets she will be offered in different awakening states and will not forget this; (b) Beauty's betting actions affect neither her future awakening states nor the outcomes of past or future bets; (c) for every two states in the same information set, the bet posed to Beauty is the same. Here, (c) seems natural, because if two states in the same information set were to have different bets associated with them, then in fact, by (a), they would allow Beauty to distinguish between them before she takes her action, contradicting that they are in the same information set. (a) and (c) together ensure that we can interpret the bets as Beauty playing a game (whose rules she knows), and adding (b) ensures that this game is additive. (Note that we may have to add an initial round to correspond to a bet at the beginning of the experiment.) By the first part of Proposition 1, Beauty will act in a way that maximizes her expected payout. This means she cannot be vulnerable to accepting a set of bets that results in a sure loss, because if she did so she would not be maximizing her expected payout (since, after all, she can also accept none of the bets at all and thereby avoid a loss). Given all this, the second part of Proposition 1 is unsurprising in light of the Dutch book given by Hitchcock [2004] for halfers that use causal decision theory.

One may wonder whether additive games are really the "right" class of games to which to restrict our attention. Perhaps the result can be generalized to a somewhat broader class of games, for example by slightly relaxing the first condition in Definition 2.¹² Such a generalization, of course, would only strengthen the point. More problematically, perhaps a different natural class of games would actually favor halfing. I cannot rule out this possibility, but it seems unlikely to me that such a class would be more compelling than that of additive games.

¹¹ One may wonder whether, along the lines of Briggs [2010], the halfer could correct for this by adopting evidential decision theory instead. The idea would be that her decision provides evidence for what she does in all the $\nu(I, r(v))$ awakenings, thereby undoing the problematic division by $\nu(I, r(v))$ above. Unfortunately, if she adopts evidential decision theory, then in general her decision will also provide evidence about what she does in other information sets (especially, very similar ones) and this prevents the proof from going through. To illustrate, consider the following example (an additive game). We toss a three-sided coin (Heads, Tails, and Edge with probability 1/3 each). On Heads, Beauty will be awakened once in information set $I_1$; on Tails, once in information set $I_2$; on Edge, once in $I_1$ and once in $I_2$. On every awakening, Beauty must choose Left or Right. If the world is Heads or Tails, Left pays out 3 and Right 0; if it's Edge, Left pays out 0 and Right 2. Note that $I_1$ and $I_2$ are completely symmetric. The optimal thing to do from the perspective of ex ante expected payout is to always play Left (and get $(2/3) \cdot 3$ rather than $(1/3) \cdot 2 \cdot 2$ from Right). What will the EDT halfer do? Upon awakening in (say) $I_1$, she will assign credence 1/2 to each of Heads and Edge (and 0 to Tails). (In fact, some variants of halfing will result in different credences; to address such a variant, we can modify the example by adding another awakening in both Heads and Tails (but not Edge) worlds, in an information set $I_3$ where no action is taken. All variants of halfing (and, for that matter, thirding) of which I am aware will result in the desired credences of 1/2 Heads, 1/2 Edge in this modified example.) Now, the key point is that if she plays Right (Left) now, this is very strong evidence that she would play Right (Left) in $I_2$ as well; after all, the situation is entirely symmetric. Thus, conditional on playing Left, she will expect to get 3 in the Heads world and 0 in the Edge world; conditional on playing Right, she will expect to get 0 in the Heads world and $2 \cdot 2 = 4$ in the Edge world. Hence she will choose Right (and by symmetry she will also choose Right in $I_2$), which does not maximize ex ante expected payout. Conitzer [2015] provides a more elaborate example along these lines in the form of a Dutch book to which evidential decision theorists fall prey, along with further discussion. (Incidentally, an evidential decision theorist who is a thirder fails to maximize ex ante expected payoff on a much simpler example: in the counterexample at the end of the proof of Proposition 1, just change the payoff for choosing Left on Heads to 5. Now Left maximizes ex ante expected payoff, but an evidential decision theorist who is a thirder will calculate $(1/3) \cdot 5 = 5/3 < 8/3 = (2/3) \cdot 2 \cdot 2$ and choose Right. What goes wrong is that $\nu(I, \text{Tails}) = 2$ now occurs twice on the right-hand side, once due to thirding (2/3) and once due to evidential decision theory (the second 2; the third 2 is the payoff for choosing Right on Tails). I thank an anonymous reviewer for providing this counterexample. It should also be noted that Briggs [2010] already gives a Dutch book for an evidential decision theorist who is a thirder.)
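The arithmetic of the three-sided-coin example in Footnote 11 can be checked with a short Python sketch. The payoffs, probabilities, and the credences of 1/2 Heads and 1/2 Edge in $I_1$ are taken from the footnote; the variable and function names are illustrative, and the treatment of the EDT halfer (her current action is taken as evidence that she plays the same action at every awakening in the same world) is a simplified reading of the footnote's argument. The last lines reproduce the parenthetical thirder variant, assuming, as the footnote's calculation suggests, that Left pays 0 on Tails and Right pays 0 on Heads.

```python
from fractions import Fraction as F

# Three-sided coin: Heads, Tails, Edge, each with probability 1/3.
prior = {"Heads": F(1, 3), "Tails": F(1, 3), "Edge": F(1, 3)}

# Information sets in which Beauty is awakened under each outcome.
awakenings = {"Heads": ["I1"], "Tails": ["I2"], "Edge": ["I1", "I2"]}

# Payoff per awakening for each action in each world.
payoff = {
    "Left": {"Heads": 3, "Tails": 3, "Edge": 0},
    "Right": {"Heads": 0, "Tails": 0, "Edge": 2},
}

def ex_ante(action):
    """Ex ante expected payout of playing `action` at every awakening."""
    return sum(prior[r] * len(awakenings[r]) * payoff[action][r] for r in prior)

# EDT halfer's credences upon awakening in I1, as stated in the footnote.
credence_in_I1 = {"Heads": F(1, 2), "Edge": F(1, 2)}

def edt_halfer_value(action):
    """Expected total payout conditional on playing `action` now.

    By symmetry, the action is treated as evidence that the same action
    is played at every awakening in that world (including in I2).
    """
    return sum(
        credence_in_I1[r] * len(awakenings[r]) * payoff[action][r]
        for r in credence_in_I1
    )

for action in ("Left", "Right"):
    print(action, ex_ante(action), edt_halfer_value(action))

# Parenthetical variant: classic two-awakening game, Left on Heads pays 5,
# Right on Tails pays 2 per awakening; EDT thirder with credences 1/3, 2/3.
edt_thirder = {"Left": F(1, 3) * 1 * 5, "Right": F(2, 3) * 2 * 2}
ex_ante_variant = {"Left": F(1, 2) * 1 * 5, "Right": F(1, 2) * 2 * 2}
print(edt_thirder, ex_ante_variant)
```

Always playing Left has ex ante value 2 against 4/3 for Right, while the EDT halfer's conditional values are 3/2 for Left and 2 for Right, so she chooses Right. In the variant, Left is ex ante better (5/2 against 2), yet the EDT thirder compares 5/3 with 8/3 and also chooses Right.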
I believe that additive games are well motivated by the discussion given at the beginning of this section about removing the coordination problem between Beauty's multiple selves, and the fact that the result provides a corollary about Dutch books is also encouraging.

5 Conclusion

What can we conclude from the foregoing? First and foremost, the Three Awakenings game shows that we should be very cautious when drawing conclusions about halfing vs. thirding from the outcomes of decision-theoretic variants of the Sleeping Beauty problem. I do believe that Proposition 1 shows some merit to being a thirder rather than a halfer, but surely it does not settle the matter once and for all. One might well argue, for example, that, once she has awakened under particular circumstances, Beauty should no longer care whether she maximizes her ex ante expected payout; instead, she should maximize her expected payout with respect to her beliefs at hand. These two objectives turn out to be aligned in the case of a (causal decision theorist) thirder in additive games, and this may be a nice property. But the battle-hardened halfer is likely more comfortable biting the bullet and accepting nonalignment in these two objectives than giving up on other cherished philosophical commitments. Another possibility for the halfer may be to embrace a version of evidential decision theory instead. More discussion of how halfers may or may not benefit from adopting evidential decision theory, particularly in the context of Dutch book arguments, is given by Arntzenius [2002], Draper and Pust [2008], Briggs [2010], and Conitzer [2015] (see also the discussion in Footnote 11).

How could we create decision variants of the Sleeping Beauty problem that leave no ambiguity about whether rational decisions truly correspond to rational beliefs? One way to do so would be to consider a myopic Beauty. Such a Beauty would be rewarded immediately after taking an action in the game, rather than at the end. We may suppose that she is rewarded in something giving immediate satisfaction (say, chocolate) rather than money. Moreover, she is assumed to care only about the very near future; tomorrow is too far in the future to affect her decisions. Her being myopic is not to be understood as her being irrational. We still assume her to be entirely rational, but she just discounts the future exceptionally heavily (and, to the extent it matters, the past as well). Such a Beauty, in a simple variant (without decisions) where she is certainly woken up on both Monday and Tuesday but given chocolate only on Tuesday, will hope that today is Tuesday when she is awoken.¹³ So a myopic Beauty's preferences are entirely de se and de nunc.

¹² It should be noted that doing so appears nontrivial. For example, suppose we continue to insist that the number of awakenings depends only on the outcome of the coin toss, but we attempt to relax the requirement that actions do not affect the information that Beauty has in future awakenings. Then, an action's value may come less from the payoff resulting directly from it and more from allowing Beauty to obtain increased payoffs in later rounds by improving her information. It is possible that these latter, indirect effects on payoffs are not additive even when the direct payoffs are additive (so that payoff additivity is technically satisfied), and that this would still allow us to embed problematic examples such as the Three Awakenings game.
If we additionally suppose that the game is additive as described above, then she need not worry at all about what she will do or has done in other rounds (including about what her current actions say about what she will do or has done in other rounds), because none of those affect her current circumstances and rewards. Hence, it seems that here beliefs and actions should unambiguously line up. Unfortunately, such extreme assumptions also make it difficult, and perhaps impossible, to set up an example that provides much insight beyond non-decision-theoretic variants of the Sleeping Beauty problem.

There is a tightrope to be walked here. Too permissive a setup will allow us to reach conclusions that are unwarranted; too restricted a setup will not allow us to reach any conclusions at all. Perhaps the best we can hope for is to identify the happy medium and gradually accumulate bits of evidence that, while each not entirely convincing on its own, gradually tilt the balance in favor of one or the other position.¹⁴

¹³ Perhaps such examples are more palatable when we consider variants of the Sleeping Beauty problem that involve clones; see, e.g., Elga [2004] and Schwarz [2014]. The example where she hopes that today is Tuesday then is analogous to the "After the Train Crash" case in Hare [2007], where a victim of a train crash who has forgotten his name, upon learning that the victim named "A" will have to undergo painful surgery, hopes that he is victim "B". (See also Hare [2009, 2010].)

¹⁴ Not all of these bits of evidence would concern decision variants, especially as surprising connections from the Sleeping Beauty problem to other problems continue to be drawn. For example, Pittard [2015] makes an interesting connection to epistemic implications of disagreement that provides a challenge to halfers (and argues that this challenge can be met).
Of course, there are also many direct probabilistic arguments. Many of these were already made early on in the debate about Sleeping Beauty [Elga, 2000, Lewis, 2001, Arntzenius, 2002, Dorr, 2002, etc.], but new ones continue to be made [e.g., Titelbaum, 2012, Conitzer, 2014].

Acknowledgments

I thank the anonymous reviewers for many useful comments that have helped to significantly improve the paper.

References

Frank Arntzenius. Reflections on Sleeping Beauty. Analysis, 62(1):53–62, 2002.

Rachael Briggs. Putting a value on Beauty. In Tamar Szabó Gendler and John Hawthorne, editors, Oxford Studies in Epistemology: Volume 3, pages 3–34. Oxford University Press, 2010.

Vincent Conitzer. A devastating example for the Halfer Rule. Philosophical Studies, 2014. DOI 10.1007/s11098-014-0384-y.

Vincent Conitzer. A Dutch book against sleeping beauties who are evidential decision theorists. Synthese, 2015. DOI 10.1007/s11229-015-0691-7.

Cian Dorr. Sleeping Beauty: in defence of Elga. Analysis, 62(4):292–296, 2002.

Kai Draper and Joel Pust. Diachronic Dutch Books and Sleeping Beauty. Synthese, 164(2):281–287, 2008.

Adam Elga. Self-locating belief and the Sleeping Beauty problem. Analysis, 60(2):143–147, 2000.

Adam Elga. Defeating Dr. Evil with self-locating belief. Philosophy and Phenomenological Research, 69(2):383–396, 2004.

Joseph Y. Halpern. Sleeping Beauty reconsidered: Conditioning and reflection in asynchronous systems. In Tamar Szabó Gendler and John Hawthorne, editors, Oxford Studies in Epistemology: Volume 3, pages 111–142. Oxford University Press, 2006.

Caspar Hare. Self-Bias, Time-Bias, and the Metaphysics of Self and Time. The Journal of Philosophy, 104(7):350–373, July 2007.

Caspar Hare. On Myself, And Other, Less Important Subjects. Princeton University Press, 2009.

Caspar Hare. Realism About Tense and Perspective. Philosophy Compass, 5(9):760–769, 2010.

Christopher Hitchcock. Beauty and the bets. Synthese, 139(3):405–420, 2004.

David Lewis. Sleeping Beauty: reply to Elga. Analysis, 61(3):171–176, 2001.

John Pittard. When Beauties disagree: Why halfers should affirm robust perspectivalism. In Tamar Szabó Gendler and John Hawthorne, editors, Oxford Studies in Epistemology: Volume 5. Oxford University Press, 2015. Forthcoming.

Wolfgang Schwarz. Belief update across fission. British Journal for the Philosophy of Science, 2014. Forthcoming.

James R. Shaw. De se belief and rational choice. Synthese, 190(3):491–508, 2013.

Robert C. Stalnaker. Indexical belief. Synthese, 49:129–151, 1981.

Michael G. Titelbaum. The relevance of self-locating beliefs. Philosophical Review, 117(4):555–605, 2008.

Michael G. Titelbaum. An Embarrassment for Double-Halfers. Thought, 1(2):146–151, 2012.

Michael G. Titelbaum. Ten reasons to care about the Sleeping Beauty problem. Philosophy Compass, 8(11):1003–1017, 2013.