Optimal and Myopic Information Acquisition*

Annie Liang†    Xiaosheng Mu‡    Vasilis Syrgkanis§

May 15, 2018

Abstract

We consider the problem of optimal dynamic information acquisition from many correlated information sources. Each period, the decision-maker jointly takes an action and allocates a fixed number of observations across the available sources. His payoff depends on the actions taken and on an unknown state. In the canonical setting of jointly normal information sources, we show that the optimal dynamic information acquisition rule proceeds myopically after finitely many periods. If signals are acquired in large blocks each period, then the optimal rule turns out to be myopic from period 1. These results demonstrate the possibility of robust and “simple” optimal information acquisition, and simplify the analysis of dynamic information acquisition in a widely used informational environment.

JEL codes: C44, D81, D83
Keywords: Information Acquisition, Correlation, Endogenous Attention, Myopic Choice, Robustness, Value of Information

* This paper was previously circulated under the title “Dynamic Information Acquisition from Multiple Sources.” We thank Yash Deshpande, Johannes Hörner, Michihiro Kandori, Elliot Lipnowski, George Mailath, Juuso Toikka, Anton Tsoy, and Muhamet Yildiz for helpful comments. We are especially grateful to Drew Fudenberg, Eric Maskin and Tomasz Strzalecki for their constant guidance and support.
† Department of Economics, University of Pennsylvania
‡ Department of Economics, Harvard University
§ Microsoft Research

Contents

1 Introduction
2 Preliminaries
   2.1 Model
   2.2 Interpretations
3 (Eventual) Optimality of Myopic Rule
   3.1 Myopic Information Acquisition
   3.2 Main Results
4 Discussion
   4.1 Intuition for Theorems 1-3
   4.2 Precision vs. Correlation
   4.3 How Important is Normality?
5 Games with Dynamic Information Acquisition
6 Extensions
   6.1 Endogenous Learning Intensities
   6.2 Multiple Payoff-Relevant States
   6.3 Continuous Time
7 Related Literature
8 Conclusion
9 Appendix
   9.1 Preliminary Results
      9.1.1 Posterior Variance Function
      9.1.2 The Matrix Q_i
      9.1.3 Order Difference Lemma
   9.2 Dynamic Blackwell Comparison
      9.2.1 The Lemma
      9.2.2 Proof of Lemma 3
   9.3 Proof of Theorem 1 (Large Block of Signals)
   9.4 Proof of Theorem 2 (Separable Environments)
   9.5 Preparation for the Proof of Theorem 3
      9.5.1 Switch Deviations
      9.5.2 Asymptotic Characterization of Optimal Strategy
   9.6 Proof of Theorem 3 (Generic Eventual Myopia)
      9.6.1 Outline of the Proof
      9.6.2 Equivalence at Almost All Times
      9.6.3 A Simultaneous Diophantine Approximation Problem
      9.6.4 Monotonicity of t-Optimal Divisions
      9.6.5 Completing the Proof of Theorem 3
   9.7 Proof of Proposition 1 (Bound on B)
      9.7.1 Preliminary Estimates
      9.7.2 Refined Asymptotic Characterization of n(t)
10 Online Appendix
   10.1 Applications of Results from Section 5 (Multi-player Games)
      10.1.1 Beauty Contest
      10.1.2 Strategic Trading
   10.2 Example in Which n(t) is Not Monotone Even for Large t
   10.3 Additional Result for K = 2
   10.4 Eventual Optimality of the Myopic Strategy

1 Introduction

In a classic problem of sequential information acquisition, a Bayesian decision-maker (DM) repeatedly acquires information and takes actions. His payoff depends on the sequence of actions taken, as well as on an unknown payoff-relevant state. We consider a setting in which the DM acquires information from a limited number of flexibly correlated information sources, and allocates a fixed number of observations across these sources each period. The optimal strategy for information acquisition is of interest. Neglecting dynamic considerations, a simple strategy is to acquire at each period the signal (or the set of signals) that maximally reduces uncertainty about the payoff-relevant state.
We refer to this as the myopic (or greedy) rule, as it is the optimal rule if the DM mistakenly believes each period to be the last possible period of information acquisition.

This myopic rule turns out to possess strong optimality properties in a widely used setting. Suppose that the available signals are jointly normal. If signal observations are acquired in sufficiently large blocks each period, then myopic information acquisition is optimal from period 1 (Theorem 1). We provide a sufficient condition on the required size of the block; this condition depends on primitives of the informational environment but not on the payoff function. Theorem 2 characterizes a condition on the prior and signal structure, given which myopic information acquisition is optimal from period 1 for any block size. In both of these cases, the optimal information acquisition strategy can be exactly and simply characterized. Additionally, instead of solving for the optimal decision rule and information acquisition strategy jointly (as would otherwise be required), our results show that one can separate these two problems, and solve for the optimal decision rule in this setting as if information acquisition were exogenous. Finally, for generic signal structures, and for any block size, the optimal strategy proceeds by myopic acquisition after finitely many periods (Theorem 3). These results hold across all payoff functions (and in particular, independently of discounting); thus, myopic information acquisition is (eventually) “robustly” best.

Why does the myopic rule perform so well? The main inefficiency of myopic planning is that it neglects potential complementarities across signals. A signal that is individually uninformative can be very informative when paired with other signals; thus, repeated (greedy) acquisition of the best single signal need not result in the best sequence of signals.
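This complementarity is easy to check numerically: in the jointly normal model, posterior variances have a closed form (observation precisions add), so the value of any multiset of signals can be computed directly. The sketch below is our own illustration, not an example from the paper, assuming a prior θ = (θ_1, θ_2) ∼ N(0, I_2), a signal X_1 about θ_1 + θ_2, a signal X_2 about θ_2, and unit-variance noise:

```python
import numpy as np

def posterior_var_theta1(V0, coeffs, sigma2, counts):
    """Posterior variance of theta_1 after counts[k] observations of each
    jointly normal signal X_k = <c_k, theta> + eps_k, eps_k ~ N(0, sigma2[k]).
    In the linear-Gaussian model, observation precisions simply add."""
    precision = np.linalg.inv(V0)
    for c, s2, q in zip(coeffs, sigma2, counts):
        c = np.asarray(c, dtype=float)
        precision = precision + q * np.outer(c, c) / s2
    return np.linalg.inv(precision)[0, 0]

V0 = np.eye(2)                     # independent prior over (theta_1, theta_2)
coeffs = [(1.0, 1.0), (0.0, 1.0)]  # X_1: theta_1 + theta_2;  X_2: theta_2 only
sigma2 = [1.0, 1.0]

print(posterior_var_theta1(V0, coeffs, sigma2, (0, 1)))  # 1.0: X_2 alone is useless
print(posterior_var_theta1(V0, coeffs, sigma2, (1, 0)))  # ~0.667 after one X_1
print(posterior_var_theta1(V0, coeffs, sigma2, (1, 1)))  # ~0.600: X_2 now helps
```

One draw of X_2 is worthless on its own (the posterior variance of θ_1 stays at 1), but it becomes valuable once an X_1 draw is in hand (variance falls from 0.667 to 0.600): the two signals are complements, and whether they are perceived as such depends on the DM's current belief rather than on the signals alone.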
A key observation is that whether the DM perceives two signals as providing complementary information depends on his current belief over the state space.

[1] See Section 4.1 for a concrete example and further discussion.

This means that complementarities across signals are not intrinsic to the underlying signal correlation structure: as the DM's beliefs about the states evolve, so too do his perceptions of the correlations across signals.

It is clear that as information accumulates, the DM's beliefs become more precise about each of the unknown states. This does not itself lead to optimality of myopic information acquisition (see Section 4.2 for more detail). We show that the key force comes from a second effect of information accumulation: the DM's beliefs evolve in such a way that the signals endogenously de-correlate from his perspective, and are eventually perceived as providing approximately independent information. In the limit in which all signals are independent, the value of any given signal can be evaluated separately from the others. The dynamic problem is thus “separable,” and can be replaced with a sequence of static problems. Given sufficiently many signal observations, we have only approximate separability, which we show is sufficient for the myopic rule to be optimal.

The mechanism we identify is different from the one underlying a classic result from the experimentation literature. In “learning by experimentation” settings, myopic behavior is eventually near-optimal: in the long run, the DM's beliefs converge, so the value of exploration (i.e., learning) becomes second-order relative to the value of exploiting the perceived best arm. In our paper, signal acquisition decisions are driven exclusively by learning concerns, as there is by design no exploitation incentive.
To see this, recall that in the classic multi-armed bandit problem (Gittins, 1979; Bergemann and Välimäki, 2008), actions play the dual role of influencing the evolution of beliefs and also determining flow payoffs. In our setting (which does not fall into the multi-armed bandit framework), there is a separation between signal choices, which influence the evolution of beliefs, and actions, which determine (unobserved) payoffs. Myopic signal choices become optimal in our framework because they maximize the speed of learning, and not because they optimize a tradeoff between learning and payoff. (Additionally, a myopic strategy is immediately optimal in multi-armed bandit problems only under very restrictive assumptions (Berry and Fristedt, 1988; Banks and Sundaram, 1992).)

[2] As a simple example, suppose the payoff-relevant state is θ_1, and the available signals are about θ_1 + θ_2 and θ_2. These signals are “complementary” when the agent's prior belief is that θ_1 and θ_2 are independent: observations of the first signal improve the value of observing the second signal, and vice versa. But suppose the DM's prior is such that θ_2 = θ_1; then these signals in fact become substitutes.

[3] Easley and Kiefer (1988) and Aghion et al. (1991) show that if there is a unique myopically optimal policy at the limiting beliefs, then the optimal policies must converge to this policy. In our setting, every policy (signal choice) is trivially myopic at the limiting beliefs (a point mass at the true parameter), so we do not have uniqueness and cannot use this argument to identify long-run behavior.

Our results simplify the analysis of optimal dynamic information acquisition in an informational environment that is commonly used in economics: normal signals.
However, the core of our analysis—the “endogenous de-correlation” of signals described above—does not rely on the assumption of normality. As we discuss further in Section 4.1, this de-correlation derives from a Bayesian version of the Central Limit Theorem, which holds for arbitrary signal distributions. This suggests that our eventual optimality result (Theorem 3) generalizes. [4]

We conclude by demonstrating several extensions. To facilitate application of our results, we extend our environment to a multi-player setting in which individuals privately acquire information before playing a one-shot game at a random final period. This extension connects our results to a literature on games with Gaussian information (Hellwig and Veldkamp, 2009; Myatt and Wallace, 2012; Colombo et al., 2014; Lambert et al., 2018). [5] We present corollaries of our main results, and use these to extend results from Hellwig and Veldkamp (2009) and Lambert et al. (2018) in an online appendix. Finally, we demonstrate extensions to environments with choice of information “intensity” (the number of signals to acquire each period), to multiple payoff-relevant states (for a class of prediction problems), and to a continuous-time setting.

Our work primarily builds on a large literature about optimal dynamic information acquisition (Moscarini and Smith, 2001; Fudenberg et al., 2018; Che and Mierendorff, 2018; Mayskaya, 2017; Steiner et al., 2017; Hébert and Woodford, 2018; Zhong, 2018) and a related literature on sequential search (Wald, 1947; Arrow et al., 1949; Weitzman, 1979; Callander, 2011; Ke and Villas-Boas, 2017; Bardhi, 2018). In contrast to an earlier focus on optimal stopping and choice of signal precision, our framework characterizes choice between different kinds of information, as in the work of Fudenberg et al.
(2018) (where the sources are two Brownian motions), and Che and Mierendorff (2018) and Mayskaya (2017) (where the sources are two Poisson signals). [6] Compared to this work, we allow for many (i.e., more than two) sources with flexible correlation.

[4] Specifically, we conjecture that for general signals, the optimal rule eventually proceeds myopically when we restrict to certain decision problems (e.g., prediction of the payoff-relevant state). Immediate optimality of the myopic rule given sufficiently many signals, and also the independence of our results from the payoff function, do rely on properties of the normal environment (see Section 4.3 for further details).

[5] For games of information acquisition beyond the Gaussian setting, see e.g. Persico (2000), Bergemann and Välimäki (2002), Yang (2015) and Denti (2018). All of these papers restrict to a single signal choice.

[6] Che and Mierendorff (2018) and Mayskaya (2017) consider choice between two Poisson signals, each of which provides evidence towards/against a binary-valued state. The Poisson model (with a finite state space) is more suited to applications such as learning about whether a defendant is guilty or innocent, while the Gaussian model describes, for example, learning about the (real-valued) return to an investment.

Another strand of the literature considers a DM who chooses from completely flexible information structures at entropic (or, more generally, “posterior-separable”) costs, such as in Steiner et al. (2017), Hébert and Woodford (2018) and Zhong (2018). Compared to these papers, our DM has access to a prescribed and limited set of signals. [8] Finally, acquisition of Gaussian signals whose means are linear combinations of unknown states appears previously in the work of Meyer and Zwiebel (2007) and Sethi and Yildiz (2016).
In particular, Sethi and Yildiz (2016) characterizes the long-run behavior of a DM who myopically acquires information from experts with independent biases. See Section 7 for remaining connections to the literature.

[7] Callander (2011) also emphasizes correlation across different available signals. But the signals in Callander (2011) are related by a Brownian motion path, which yields a special correlation structure. Similar models are studied in Garfagnini and Strulovici (2016) and in Bardhi (2018).

[8] In Section 6.1, we do allow the DM to also control the intensity of information acquisition by endogenously choosing how many signals to acquire in each period. But even in that extension, we assume that the incurred information cost is a function of the number of observations. This is analogous to Moscarini and Smith (2001) and is distinguished from the above papers, which measure information cost based on belief changes.

2 Preliminaries

2.1 Model

Time is discrete. At each time t = 1, 2, ..., the DM first chooses from among K information sources, and then chooses an action a_t from a set A_t.

[9] Thus, the action a_t can be based on the information received in period t. The timing of these choices is not important for our results.

The DM's payoff U(a_1, a_2, ...; θ_1) is an arbitrary function that depends on the sequence of action choices and a payoff-relevant state θ_1 ∈ R. We assume that payoffs are realized only at an (exogenously or endogenously determined) end date; thus, the information sources described below are the only channel through which the DM learns. This assumption distinguishes our model from multi-armed bandit problems; see Section 7 for further discussion. Stylized cases of such decision problems include:

Exogenous Final Date. An action is taken just once at a final period T that is determined by an arbitrary distribution over periods. [10] The DM's payoff is U(a_1, a_2, ...; θ_1) = u_T(a_T, θ_1), where T is the (random) final time period, and a_T is the action chosen in that period. The time-dependent payoff function u_T(a_T, θ_1) may incorporate discounting.

[10] Special cases include geometric discounting, in which every period (conditional on being reached) has a constant probability of being final, as well as Poisson arrival of the final period.

Endogenous Stopping with Per-Period Costs. Take each action a_t to specify both the decision of whether to stop, and also the action to be taken if stopped. The DM's payoff is U(a_1, a_2, ...; θ_1) = u_T(a_T, θ_1), where T is the (endogenously chosen) final time period, and a_T is the action chosen in that period. The payoff function u_T(a_T, θ_1) may incorporate discounting and/or a per-period cost of signal acquisition. Costs are fixed across sources in a given period, but can vary across periods. [11]

[11] See e.g. Fudenberg et al. (2018) and Che and Mierendorff (2018) for recent models with constant waiting cost per period.

Apart from the decision problem, there are K information sources, which depend on the unknown and persistent state vector θ = (θ_1, ..., θ_K)' ∼ N(μ_0, V_0). [12] This vector includes the payoff-relevant unknown θ_1 and additionally K − 1 payoff-irrelevant unknowns θ_2, ..., θ_K. The role of these auxiliary states is to permit correlations across the information sources conditional on θ_1; this allows, for example, for the sources to be afflicted by common and persistent biases.

[12] Here and later, we exclusively use the apostrophe to denote vector or matrix transpose.

In each period, the DM chooses B sources (allowing for repetition), where B ∈ N_+ is interpreted as a fixed time/attention constraint (see Section 6.1 for extension to endogenous choice of B). Choice of source k = 1, ..., K produces an observation of

X_k = ⟨c_k, θ⟩ + ε_k,   ε_k ∼ N(0, σ_k²),

where the coefficient vectors c_k = (c_{k1}, ..., c_{kK})' and signal variances σ_k² are fixed (and known), but the Gaussian error terms are independent across realizations and sources. Throughout, we use C to denote the matrix of coefficients whose k-th row is c_k'. We impose the following assumption on the informational environment:

Assumption 1 (Non-Redundancy). The matrix C has full rank, and no proper subset of row vectors of C spans the coordinate vector e_1'. Equivalently, the inverse matrix C^{-1} exists, and its first row consists of non-zero entries.

Heuristically, this means that the DM can and must observe each source infinitely often to recover the value of the payoff-relevant state θ_1. Since in this paper we restrict the number of sources to be the same as the number of unknown states, the above assumption is generically satisfied. [13][14]

[13] Throughout, “generic” means with probability 1 for randomly drawn coefficient matrices C.

[14] Although we have assumed that the number of signals and states are the same, our results extend to cases in which there are fewer sources than states, so long as e_1' is spanned by the whole set of signal coefficient vectors and not by any proper subset.

The assumption of no redundant sources simplifies our analysis, as it guarantees that all sources will be sampled infinitely often under the criteria we consider. With redundant sources, a new question emerges regarding which subset of sources the DM will choose from. Characterization of that subset is the focus of Liang and Mu (2018).

2.2 Interpretations

We provide below several interpretations of this framework.

News Sources with Correlated Biases.
On election day T, a DM will choose which of two candidates, I and J, to vote for, where his payoff depends on θ_1 = v_I − v_J, the difference between the candidates' qualities v_I and v_J. In each period up to time T, the DM can acquire information from different news sources. All sources provide biased information, and moreover the biases are correlated across the sources. As the DM acquires information, he learns not only about the payoff-relevant state θ_1, but also how to de-bias (and aggregate) information from the various news sources.

Attribute Discovery. A product has K unknown attributes θ̃_1, ..., θ̃_K. Its value θ_1 is some arbitrary linear combination of these attribute values. For example, the DM may want to learn the value of a conglomerate composed of several companies, where each company i is valued at θ̃_i and the value of the conglomerate is θ_1 := α_1 θ̃_1 + · · · + α_K θ̃_K. The DM has access to (noisy) observations of different linear combinations of the attributes; for example, he might have access to evaluations of each θ̃_i individually. [15] At some endogenously chosen end time, the DM decides whether or not to invest in the conglomerate.

[15] This model can be rewritten in our framework above, where the state vector is (θ_1, θ̃_2, ..., θ̃_K).

Sequential Polling. A polling organization seeks to predict the average opinion in the population towards an issue. There are K demographic groups in the population, and opinions in demographic group k are normally distributed with unknown mean μ_k and known variance σ_k². The fraction of the population in each demographic group k is p_k, so the average opinion is θ_1 := Σ_k p_k μ_k. It is not feasible to directly sample individuals according to the true distribution p_k, but the organization can sample individuals according to other non-representative distributions p̂_k ≠ p_k.
Each period, the polling organization allocates a fixed budget of opinion samples across the available distributions (polling technologies), and posts a prediction for θ_1. Its payoff is the average prediction error across some fixed number of periods.

Intertemporal Investment. Each action a_t is a decision of how to allocate capital between consumption and two investment possibilities: a liquid asset (bond) and an illiquid asset (pension fund). The return to the liquid asset is known: 1 dollar saved today is worth e^r dollars tomorrow. The return to the illiquid asset is unknown, and it is the payoff-relevant state in the worker's problem; that is, every dollar invested today in the pension fund deterministically yields e^{θ_1} dollar(s) tomorrow. The worker works for T periods, and in each of these periods he learns about θ_1 (from some information sources) and then allocates his wealth across consumption, saving, and investment. In period T + 1, the worker retires and receives all the returns from his investments in the illiquid asset. His objective is to maximize the aggregated sum of his discounted consumption utilities and the payoff after retirement. [16]

[16] An important assumption of this example is that the return to investment is deterministic and only observed at the end. However, our model and results extend to a situation where there are “free” signals arriving each period that do not count toward the capacity constraint B. By considering the realized log return as a particular free signal, the extension of our model covers the case where investment returns are stochastic and the DM observes past return realizations.

3 (Eventual) Optimality of Myopic Rule

3.1 Myopic Information Acquisition

A strategy consists of an information acquisition strategy and a decision strategy. An information acquisition strategy is a measurable map from possible histories of signal realizations to multi-sets of B signals, and a decision strategy is a map from histories to actions. We will say that an information acquisition strategy is myopic if it proceeds by choosing signals that maximally reduce (next-period) uncertainty about the payoff-relevant state.

Definition 1. An information acquisition strategy is myopic if, at every period, it prescribes choosing the B signals that (combined with the history of observations) lead to the lowest posterior variance about θ_1.

Note that the B signals which minimize the posterior variance also Blackwell dominate any other multi-set of B signals (see e.g. Hansen and Torgersen (1974)). Thus, myopic acquisition is optimal if the current period is the last chance for information acquisition, and this is true no matter what the payoff function is.

Our results below reveal a close relationship between the optimal information acquisition strategy and the myopic information acquisition strategy. We do not pursue a characterization of the optimal decision strategy, which in general depends on the payoff function, although we point out one application of our main results toward simplifying this characterization.

3.2 Main Results

We present three results regarding optimality of the myopic information acquisition rule: Theorem 1 says that myopic information acquisition is optimal from period 1 if B (the number of signals acquired each period) is sufficiently large.
Our next two results hold for arbitrary B: Theorem 2 provides a sufficient condition on the prior and the coefficient matrix C under which myopic information acquisition is optimal from period 1, and Theorem 3 states that the optimal rule is eventually myopic in generic environments.

Theorem 1 (Immediate Optimality under Many Observations). Fix any prior and signal structure, and suppose B is sufficiently large. Then the DM has an optimal strategy that acquires information myopically. [17]

[17] Without further assumptions on the payoff function U, we cannot assert strict optimality of the myopic information acquisition strategy. For instance, this would not be true if there exists a “dominant” action sequence that maximizes U(a_1, a_2, ...; θ_1) for every value of θ_1. But in most other cases, strictly more precise beliefs do lead to strictly higher expected payoffs, which implies unique optimality of the myopic rule.

Optimally, the DM chooses the most informative B signals in the first period based on his prior, then chooses the most informative B signals in the second period based on his updated posterior, and so on. Note that since posterior variances are independent of signal realizations, and we have assumed that there is no feedback from actions, the above myopic strategy is history-independent, and can be represented as a deterministic signal path. This implies that instead of solving for the optimal decision strategy and information acquisition strategy jointly (as would otherwise be required), one can solve for the optimal decision strategy with respect to an exogenous stream of information. [18]

[18] An application of this two-step approach (in continuous time) appears in the concurrent work of Fudenberg et al. (2018), Section 3.5. See Section 6.3 for a brief discussion of how our model extends to continuous time.

We additionally mention that Theorem 1 can be strengthened to optimality of myopic information acquisition at all histories, including those in which the DM has previously deviated from the myopic rule. Finally, a precise bound on how large B must be appears in Section 4.2.

Our next two results hold for arbitrary block sizes B. First, the myopic rule is again optimal from period 1 in a class of “separable” environments. Let f(q_1, ..., q_K) denote the DM's posterior variance about θ_1, updating from q_i observations of each signal X_i. An informational environment is separable if its posterior variance function can be decomposed in the following way:

Definition 2. The informational environment (V_0, C, {σ_i²}) is separable if there exist convex functions g_1, ..., g_K and a strictly increasing function F such that

f(q_1, ..., q_K) = F(g_1(q_1) + · · · + g_K(q_K)).

Intuitively, separability ensures that observing signal i does not change the relative value of other signals, but strictly decreases the marginal value of signal i relative to every other signal. Note that separability is not defined directly on the primitives of the informational environment (V_0, C, and {σ_i²}), as it is based instead on the posterior variance function f. Nevertheless, f can be directly computed from these primitives, and so is not an endogenous object. The result below says that myopic information acquisition is optimal in all separable informational environments.

Theorem 2 (Immediate Optimality in Separable Informational Environments). Suppose the informational environment is separable. Then for every B ∈ N_+, the DM has an optimal strategy that acquires information myopically.

Separability encompasses several classes of informational environments that are of independent interest. For example:

Example (Orthogonal Signals).
The DM's prior is standard Gaussian (V_0 = I_K), and the row vectors of C are orthogonal to one another. [19]

[19] This is because the signals are independent from each other; see also Example 2 in Figure 1.

Example (Multiple Biases). There is a single payoff-relevant state θ_1 ∼ N(0, v_1). The DM has access to observations of X_1 = θ_1 + θ_2 + · · · + θ_K + ε_1, where each θ_i (i > 1) is a persistent “bias” independently drawn from N(0, v_i), and ε_1 ∼ N(0, σ_1²) is a noise term i.i.d. over time. Additionally, the DM has access to signals about each bias: X_i = θ_i + ε_i (i > 1), where ε_i ∼ N(0, σ_i²). [20]

[20] The DM's posterior variance about θ_1 is given by f(q_1, ..., q_K) = v_1 − v_1² / (v_1 + σ_1²/q_1 + Σ_{i=2}^K [v_i − v_i²/(v_i + σ_i²/q_i)]). This can be written in the separable form.

In all remaining cases, optimal signal choices are eventually myopic.

Theorem 3 (Generic Eventual Myopia). Fix any prior covariance V_0 and signal variances {σ_i²}_{i=1}^K. For generic coefficient matrices C, there exists a time T* ∈ N that depends only on the informational environment. For this T*, and for any B and any decision problem, the DM has an optimal strategy that acquires information myopically after T* periods.

That is, at all late periods, the optimal signal acquisitions are those that maximally reduce posterior variance in the given period. The result above tells us that the optimal rule eventually proceeds by myopic signal acquisition; this is different from the statement that acquiring signals myopically (from period 1) leads to the optimal signal path. We show in Appendix 10.4 that this latter statement is also true. This complementary result (that the myopic signal path is eventually optimal) depends critically on Assumption 1 (no redundant signals).
In particular, in environments with redundant signals, it is possible for the myopic and optimal signal paths to eventually sample from disjoint subsets of signals. [Footnote 21] In contrast, we conjecture that Theorem 3 (the optimal rule eventually proceeds myopically) extends beyond Assumption 1. We leave this conjecture as an open question for future work.

4 Discussion

4.1 Intuition for Theorems 1-3

We begin with a simplified argument for two periods and B = 1: Suppose that the best signal to acquire in period 1 is part of the best pair of signals to acquire. In this case, no tradeoffs are necessary across the two periods, and it is optimal in both periods to acquire information myopically. In general, however, signals that are individually uninformative (from the perspective of period 1) can be very informative as a pair; thus, myopic information acquisition in the first round can preclude acquisition of the best pair of signals.

At a high level, myopic information acquisition fails to be optimal when there are strong complementarities across signals. A key part of our argument is that complementarities "wash out" as information is acquired, so that signals are eventually perceived as providing (approximately) independent information.

[Footnote 21: For example, suppose the available signals are X_1 = 0.5θ1 + ε_1, X_2 = θ1 + θ2 + ε_2, X_3 = θ1 − θ2 + ε_3, where the noise terms are standard normal, the states are independent, and the prior variance about θ2 is larger than 3. Myopic information acquisition (from period 1) leads to exclusive sampling of signal X_1, while a patient DM eventually samples only from X_2 and X_3. This example is generalized in Liang and Mu (2018).]
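As a numerical sketch of the redundant-signals example in footnote 21 (not from the paper): the greedy rule below adds, each period, one observation of whichever signal most reduces the posterior variance about θ1, using standard Gaussian precision updating. The prior variance Var(θ1) = 1 is an assumed value; the footnote only pins down Var(θ2) > 3, and we take Var(θ2) = 4.

```python
import numpy as np

# Footnote 21's redundant-signals example (illustrative sketch).
# States (theta1, theta2) independent; Var(theta1) = 1 is assumed,
# Var(theta2) = 4 satisfies the footnote's requirement Var(theta2) > 3.
C = np.array([[0.5, 0.0],    # X1 = 0.5*theta1 + eps1
              [1.0, 1.0],    # X2 = theta1 + theta2 + eps2
              [1.0, -1.0]])  # X3 = theta1 - theta2 + eps3
prior_prec = np.diag([1.0, 1.0 / 4.0])  # inverse of the prior covariance

def post_var_theta1(counts):
    """Posterior variance of theta1 given counts[i] observations of signal i.

    Each unit-noise observation of signal i adds c_i c_i' to the precision.
    """
    P = prior_prec + sum(q * np.outer(c, c) for q, c in zip(counts, C))
    return np.linalg.inv(P)[0, 0]

counts = [0, 0, 0]
path = []
for _ in range(25):
    # Myopic rule: one more observation of whichever signal helps most now.
    best = min(range(3), key=lambda i: post_var_theta1(
        [q + (j == i) for j, q in enumerate(counts)]))
    counts[best] += 1
    path.append(best)

print(path)  # the myopic rule samples X1 (index 0) every period
```

Running this confirms the footnote's claim that the myopic path samples X_1 exclusively, even though a patient DM would eventually learn θ1 faster through the pair {X_2, X_3}.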
After sufficiently many observations, the best next signal to acquire is (generically) part of the best next pair of signals to acquire (and the best batch of B signals, where B is sufficiently large, is always part of the best batch of 2B signals).

We now proceed with a more detailed intuition. Consider a one-shot version of our problem, in which the DM allocates t observations across the available signals. Define a t-optimal "division vector" n(t) to be any optimal allocation of these signals:

n(t) = (n_1(t), …, n_K(t)) ∈ argmin_{(q_1,…,q_K): q_i ∈ ℤ+, Σ_{i=1}^K q_i = t} f(q_1, …, q_K),

where n_i(t) is the number of observations allocated to signal i. Applying Hansen and Torgersen (1974), this allocation maximizes expected payoffs for any decision. We study the evolution of the vectors n(t) as the number of observations t varies. If each count n_i(t) increases monotonically in t, then the division vectors (n(t))_{t≥1} can be achieved under some sequential sampling rule; moreover, this sampling rule corresponds to myopic information acquisition. [Footnote 22] A key "dynamic Blackwell" lemma shows that a sequence of normal signals is better than another sequence (for all decision problems depending on θ1) if and only if it leads to lower posterior variances at every period. [Footnote 23] Thus, optimality of myopic information acquisition follows directly from the existence of a sequence (n(t))_{t≥1} of monotonically increasing division vectors.

Existence of such a sequence depends on whether there are strong complementarities across signals. In Example 2 of Figure 1, signals are "independent," and any allocation that is as close to balanced across the signals as possible is t-optimal. It is thus possible to find t-optimal division vectors that evolve monotonically.
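The balanced-allocation claim for Example 2 can be checked directly. In a sketch below (exact arithmetic via fractions), the posterior variance about θ1 + θ2 + θ3 decomposes, because the states are independent, into a sum of per-state variances 1/(1 + q_i); brute-forcing the t-optimal divisions then shows they stay balanced and grow monotonically, so they are achievable by sequential (myopic) sampling.

```python
import itertools
from fractions import Fraction

# Example 2 of Figure 1: theta1, theta2, theta3 ~ i.i.d. N(0,1),
# X_i = theta_i + eps_i with standard normal noise, payoff-relevant state
# theta1 + theta2 + theta3. With independent states, q_i unit-precision
# observations of X_i leave Var(theta_i) = 1/(1 + q_i), and the posterior
# variance about the sum is the sum of these terms.
def f(q):
    return sum(Fraction(1, 1 + qi) for qi in q)

def n(t, K=3):
    """A t-optimal division vector: brute-force argmin of f over allocations."""
    allocs = (q for q in itertools.product(range(t + 1), repeat=K)
              if sum(q) == t)
    return min(allocs, key=f)

divisions = [n(t) for t in range(1, 13)]
print(divisions)
# Each division is as balanced as possible, and the counts increase
# monotonically in t.
```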
Theorem 2 generalizes Example 2 to a class of environments in which n(t) evolves monotonically, and the optimal rule is myopic from period 1.

In general, the division vectors n(t) need not be monotone, as shown in Example 1 of Figure 1. Because signals X_2 and X_3 are strong complements (observation of either increases the value of observation of the other), we have that n(5) = (4, 1, 0) but n(6) = (3, 2, 1); that is, after four signal acquisitions, the best next signal to acquire is X_1, but the best pair of signals to acquire is {X_2, X_3}. The optimal allocation of six signals is not achievable from the best allocation of five signals.

[Footnote 22: Proceed by induction: in the first period the myopic rule chooses the signal that minimizes posterior variance. In the second period, the DM again wants to minimize posterior variance at the given period; since the division chosen by the totally optimal rule is best and feasible given the period-1 choice, this is what myopic information acquisition will yield. And so on.]

[Footnote 23: This generalizes a result from Greenshtein (1996); see Section 7 for further discussion.]

[Figure 1. Three informational environments.
Example 1 (complementarities across signals; independent prior): θ1, θ2, θ3 ∼ i.i.d. N(0, 1); X_1 = θ1 − θ2 + ε_1, X_2 = θ2 − θ3 + ε_2, X_3 = θ3 + ε_3; payoff-relevant state: θ1.
Example 2 ("independent" signal structure; independent prior): θ1, θ2, θ3 ∼ i.i.d. N(0, 1); X_1 = θ1 + ε_1, X_2 = θ2 + ε_2, X_3 = θ3 + ε_3; payoff-relevant state: θ1 + θ2 + θ3.
Example 1' ("independent" signal structure; correlated prior): (θ̃1, θ̃2, θ̃3) ∼ N(0, Σ); X_1 = θ̃1 + ε_1, X_2 = θ̃2 + ε_2, X_3 = θ̃3 + ε_3; payoff-relevant state: θ̃1 + θ̃2 + θ̃3; beliefs over (θ̃1, θ̃2, θ̃3) "de-correlate" as the number of observations increases.]
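This failure of monotonicity can be verified numerically. The sketch below brute-forces the t-optimal divisions for Example 1, computing the posterior variance about θ1 via standard Gaussian precision updating (each observation of signal i adds the rank-one matrix c_i c_i' to the precision):

```python
import itertools
import numpy as np

# Example 1 of Figure 1: theta ~ N(0, I_3), X1 = theta1 - theta2 + eps1,
# X2 = theta2 - theta3 + eps2, X3 = theta3 + eps3 (standard normal noise),
# payoff-relevant state theta1.
C = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, 1.0]])

def f(q):
    """Posterior variance about theta1 after q[i] observations of signal i."""
    P = np.eye(3) + sum(qi * np.outer(c, c) for qi, c in zip(q, C))
    return np.linalg.inv(P)[0, 0]

def n(t):
    """A t-optimal division vector, by brute force."""
    allocs = [q for q in itertools.product(range(t + 1), repeat=3)
              if sum(q) == t]
    return min(allocs, key=f)

print(n(5), f(n(5)))  # (4, 1, 0), posterior variance 11/23
print(n(6), f(n(6)))  # (3, 2, 1), posterior variance 5/11
```

The output reproduces n(5) = (4, 1, 0) and n(6) = (3, 2, 1): the count on signal X_1 decreases from five to six observations.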
An important step in our proofs is to show that environments which start off like Example 1 necessarily become "like" Example 2 (as observations of each signal accumulate). To facilitate this comparison, rewrite Example 1 in the following way: Define a new set of states θ̃1 = θ1 − θ2, θ̃2 = θ2 − θ3, and θ̃3 = θ3. The original prior over (θ1, θ2, θ3) defines a new prior over (θ̃1, θ̃2, θ̃3), and the original payoff-relevant state can be re-expressed in the new states as θ1 = θ̃1 + θ̃2 + θ̃3. The signal structure in Example 1' is the same as in Example 2, with the crucial difference that the prior is correlated in Example 1' and independent in Example 2.

Given sufficiently many observations of each signal in Example 1', however, the DM's posterior beliefs over (θ̃1, …, θ̃K) become almost independent. That is, not only does learning about each θ̃k occur, but the states θ̃k "de-correlate." Thus, eventually the signals can be viewed as approximately independent.

The above heuristic statements about de-correlation roughly follow from a Bayesian version of the Central Limit Theorem. We establish a technical lemma (Lemma 2 in Appendix 9.1) that strengthens this with a comparison of the value of different signals. Specifically, we characterize the externality that observation of a given signal X_i has on the marginal value of future observations. We show that the effect of observing X_i on the value of future observations of X_i (eventually) far outweighs its effect on future observations of any X_j, j ≠ i.

[Footnote 24: To see that these are the t-optimal divisions, we calculate that at t = 5, f(4,1,0) = 11/23 < 14/29 = f(3,1,1) = f(3,2,0), whereas at t = 6, f(3,2,1) = 5/11 < 17/37 = f(4,1,1) = f(4,2,0).]
That is, the only effect that observation of X_i can have on the ranking of signals is to reduce the position of signal i. This property is immediate when the covariance matrix is the identity (as in Example 2), and it holds also when the covariance matrix is close to diagonal (almost independent). At late periods we have a setting much like Example 2, and the problem is "near-separable."

Now, finally, observe that the transformation we used to rewrite Example 1 as Example 1' was not special. Indeed, we can rewrite any signal structure in the following way: For each signal k, define a new state θ̃k = ⟨c_k, θ⟩, so that the signal X_k is simply θ̃k plus independent Gaussian noise. Under Assumption 1 (no redundant signals), the payoff-relevant state can be rewritten as a unique linear combination of the transformed states θ̃1, …, θ̃K, and the original prior defines a new prior over (θ̃1, …, θ̃K). The same assumption allows us to show that each signal is sampled infinitely often. Hence "de-correlation" necessarily occurs.

The arguments above form the core of our proofs of Theorem 1 and Theorem 3. They establish that n(t) eventually evolves approximately monotonically. [Footnote 25] We introduce two different conditions that allow us to strengthen this to (eventual) exact optimality of myopic information acquisition.

Our first approach is to allow for acquisition of a larger number of signals each period. We show that in transitioning from the t-optimal division to the (t + d)-optimal division (where t is sufficiently large relative to d), if some signal count decreases (failing sequentiality), then every other signal count increases by at most one. Specifically, taking d = K − 1, we can show that n_i(t + K − 1) ≥ n_i(t) for every signal i at every period t ≥ T, with T a large number depending on the informational environment.
Thus, given block size B ≥ max{K − 1, T}, the division vectors n(Bt) are attainable using a sequential rule from period 1. Applying the dynamic Blackwell lemma mentioned earlier, we have that the optimal strategy immediately follows n(Bt), myopically. This is the intuition behind Theorem 1.

Our second approach quantifies the "typicality" of failures of monotonicity of n(t). (Below, assume B = 1 for illustration.) We show that at late periods t, these failures occur for a purely technical reason: The vectors n(t) eventually seek to approximate an optimal limiting frequency λ ∈ Δ^{K−1} across signals, but do so under an integer constraint; and the best integer approximations to λt may not be monotone. [Footnote 26] A key lemma demonstrates that these integer approximations do (eventually) evolve monotonically for a "measure-1" set of coefficient matrices C. [Footnote 27] Thus, if the optimal strategy coincides with n(t) at some late period t, it will follow these t-optimal divisions afterwards.

The last step is to verify that the optimal strategy will coincide with n(t). We argue that if this were not true, then the DM could "deviate toward the t-optimal divisions" and (in generic environments) improve the posterior variance at every period, contradicting optimality of the original strategy. Hence, generically, the optimal strategy will eventually coincide with n(t) and subsequently follow it. This yields Theorem 3.

[Footnote 25: See Appendix 10.2 for an example in which n(t) fails to be monotone even when we restrict to arbitrarily late periods.]

4.2 Precision vs. Correlation

With sufficiently many observations, the DM's beliefs simultaneously become more precise and less correlated, and these two effects are confounded in our main results. It is tempting to think that Theorem 1 (or Theorem 3) follows from the eventual precision of beliefs.
However, as our discussion above suggests, the key feature is not how precise beliefs are, but how correlated they are. Specifically, the block size B needed in Theorem 1 depends on how many observations are required for the transformed states θ̃1, …, θ̃K to "de-correlate," at which point complementarities across signals are weak. Below we make this formal with a bound on B.

To state our result, we define transformed states θ̃k = (1/σ_k)⟨c_k, θ⟩ (dividing through by σ_k normalizes all error variances to 1), and let Ṽ denote the prior covariance matrix over these transformed states. The payoff-relevant state θ1 can be rewritten as a linear combination of the transformed states: θ1 = ⟨w, θ̃⟩ for some fixed payoff weight vector w ∈ ℝ^K. In the following result we assume for simplicity that w = 1 is the vector of ones, although our analysis can easily be adapted to any w.

Proposition 1. Let R denote the operator norm of the matrix (Ṽ)^{−1}. [Footnote 28] Suppose w = 1. Then B ≥ 8(R + 1)K^{1.5} is sufficient for Theorem 1 to hold.

[Footnote 26: This is indeed the case for Example 2; see Appendix 10.2 for details.]

[Footnote 27: The lemma follows from results in Diophantine approximation theory, which studies the extent to which an arbitrary real number can be approximated by rational numbers.]

[Footnote 28: The operator norm of a matrix M is defined as ‖M‖_op = sup{‖Mx‖/‖x‖ : x ∈ ℝ^K, x ≠ 0}.]

Notice that this bound is increasing in the norm of Ṽ^{−1}. To interpret this, suppose first that we adjust the precision of the DM's prior beliefs over θ̃1, …, θ̃K but fix the degree of correlation, for example by scaling Ṽ by a factor less than 1. Then the norm of Ṽ^{−1} increases, and a larger number of signals B is needed. This is because a more precise prior can be understood as "re-scaling" the state space by shrinking all states towards zero.
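Both comparative statics on the bound in Proposition 1 can be illustrated numerically; the sketch below uses assumed AR(1)-style prior covariance matrices (not from the paper) over the transformed states.

```python
import numpy as np

# Proposition 1's sufficient block size: B >= 8(R+1)K^1.5, with R the
# operator norm of the inverse prior covariance V_tilde over the
# transformed states. The AR(1)-style matrices below are assumed examples.
K = 3

def ar1_cov(rho):
    # unit variances, correlation rho^|i-j| between states i and j
    return np.array([[rho ** abs(i - j) for j in range(K)] for i in range(K)])

def bound(V_tilde):
    R = np.linalg.norm(np.linalg.inv(V_tilde), ord=2)  # operator (spectral) norm
    return 8 * (R + 1) * K ** 1.5

b_base = bound(ar1_cov(0.5))
b_precise = bound(0.5 * ar1_cov(0.5))  # more precise prior (V_tilde scaled down)
b_correlated = bound(ar1_cov(0.9))     # same variances, stronger correlation

print(b_base, b_precise, b_correlated)
# Both a more precise prior and a more correlated prior increase the norm
# of the inverse, and hence the sufficient block size.
```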
Since signal noise is not correspondingly rescaled, each signal now reveals less about the states, and de-correlation takes longer.

In contrast, suppose we hold prior precision fixed and increase the degree of prior correlation. This would correspond to fixing the diagonal entries of Ṽ and increasing the off-diagonal entries, so that the variances about individual states are unchanged but their covariances become larger in magnitude. Then the entire matrix Ṽ becomes closer to being singular, the norm of its inverse increases, and a larger B is required. That is, a more correlated prior requires more observations to de-correlate.

Finally, we highlight that Proposition 1 only provides a sufficient condition on the block size for the myopic rule to be optimal. However, the aforementioned comparative statics hold not just for the B that we identify, but also for the smallest B that produces the result. Indeed, these comparative statics are sharp in the continuous-time limit of our model (see Section 6.3), where doubling the prior precision would also double the capacity B needed for Theorem 1 to be true.

To summarize, optimality of myopic information acquisition obtains quickly when prior beliefs are imprecise and not too correlated. Our results suggest that despite the amount of (residual) uncertainty in these situations, there is not much conflict between short-run and long-run information acquisition incentives.

4.3 How Important is Normality?

De-correlation. The key part of our argument is that signals eventually de-correlate. This de-correlation derives from a Bayesian version of the Central Limit Theorem and does not rely on special properties of normality. Consider a more general setting with signals X_i = θ̃_i + ε_i, where the noise term ε_i has an arbitrary distribution with zero mean and finite variance.
Then the suitably normalized posterior distribution over (θ̃1, …, θ̃K) converges towards a standard normal distribution (so that signals are approximately independent). We thus expect that, given sufficiently many observations, our previous comparisons of the value of information extend beyond normal signals. But if we drop normality, then our results weaken in the following ways: in working with general signal distributions, we sacrifice the potential for immediate optimality of the myopic rule, and also the generality to all intertemporal decisions.

Immediate Optimality of the Myopic Rule. For normal signals, we established a T such that given T observations of each signal, the posterior covariance matrix (over the transformed states) is almost independent. Notably, this bound holds uniformly across all histories of signal realizations, thanks to the fact that posterior variances do not depend on signal realizations under normality. As mentioned above, we can use a Bayesian Central Limit Theorem to argue a similar property for other signal distributions. The difference is that the CLT gives us (near) independence almost surely, so that at every period t there is still positive probability (albeit vanishing as t increases) that the normalized posterior covariance matrix is very different from the identity. This precludes us from demonstrating a block size B given which the optimal rule would be myopic from period 1 (Theorem 1). For general signal distributions, we thus conjecture that almost surely the optimal rule is eventually myopic, but we do not know what conditions would produce immediate optimality of the myopic rule.

General Intertemporal Payoffs. The place where we rely most heavily on normality is the statement that our results hold for all payoff criteria that depend only on θ1 (and actions).
Indeed, when payoff-relevant uncertainty is one-dimensional (as it is here), all normal signals can be Blackwell-ordered based on their posterior variances. We use this fact in Section 3.1 when we define the myopic rule to maximize the reduction in posterior variance. We use it again in Section 4.1 when we define the t-optimal divisions n(t) without explicit reference to the payoff function.

Finally, while the above uses of normality concern static decisions (i.e., taking an action once), we also need normality to be able to compare signal sequences. Generalizing Greenshtein (1996), we show that the ranking of sequences of normal signals is the same whether we consider the class of static decision problems or the broader class of intertemporal decisions. This equivalence does not hold in general; see Greenshtein (1996) for a counterexample involving Bernoulli signals.

5 Games with Dynamic Information Acquisition

We now extend our results to a multi-player setting in which individuals privately acquire information before playing a normal-form game at a random (and exogenously determined) end date. [Footnote 29]

There are N players (indexed by i), each of whom has access to a set of K signals X^i_k = ⟨c^i_k, θ^i⟩ + ε^i_k. The state vector θ^i = (θ^i_1, θ^i_2, …, θ^i_K) represents persistent unknown states particular to player i. The noise terms are (normalized to) standard normal and independent across signals, players, and time.

In each period up to and including the final period, each player i acquires B independent observations of his signals described above, possibly obtaining multiple (independent) realizations of the same signal. Signal choices and their realizations are both privately observed. The final period is determined by an exogenous distribution π.
At this period, agents play a one-shot incomplete-information game, where each player i's payoff u^i(a, ω) depends on the action profile a = (a^1, …, a^N) in addition to a payoff-relevant state ω.

We require that the players share a common prior over all states (ω and the player-specific state vectors (θ^i)_{1≤i≤N}) with the following conditional independence property: For each player i, conditional on the value of θ^i_1, both the payoff-relevant state ω and the other players' unknown states (θ^j)_{j≠i} are conditionally independent of player i's own states θ^i. [Footnote 30] This ensures that no player i infers anything about ω, or about any other player j's information, beyond what he (player i) learns about θ^i_1, which makes θ^i_1 the only state of interest for player i. [Footnote 31]

For concreteness, we provide examples (adapted from Lambert et al. (2018)) that do and do not satisfy conditional independence. [Footnote 32]

Example 1 (Satisfies Conditional Independence). In addition to the payoff-relevant state ω, there is a common unknown state ξ, and two player-specific unknown states b_1 and b_2. All states are independent. Player 1 has access to observations of ω + ρ_1ξ + b_1 + ε^1_1 (where ρ_1 is a constant) and b_1 + ε^1_2. Player 2 has access to observations of ω + ρ_2ξ + b_2 + ε^2_1 (where ρ_2 is a constant) and b_2 + ε^2_2. To see that this example satisfies conditional independence, define θ^1_1 = ω + ρ_1ξ and θ^2_1 = ω + ρ_2ξ.

[Footnote 29: Reinganum (1983) considers a similar multi-agent model with private information acquisition (specifically, firms engaging in R&D before competing in oligopoly). Her model is further developed by Taylor (1995) in the context of research tournaments. However, these papers assume perfectly revealing signals and are thus distinguished from our setting.]
Example 2 (Fails Conditional Independence). The payoff-relevant state is ω, and there is additionally a common unknown state ξ. These states are independent. Player 1 has access to noisy observations of ω + ξ only. Player 2 has access to noisy observations of both ω and ξ. Because both states ω and ξ covary with ω + ξ, there is no way to define the second player's "state of interest" θ^2_1 that would satisfy conditional independence.

We maintain Assumption 1, so that signals are non-redundant (players must observe all the signals available to them in order to learn θ^i_1). The following result generalizes Theorem 1 and Theorem 2. (Although we do not state it here, Theorem 3 also extends.)

Corollary 1. Suppose B is sufficiently large or the informational environment is separable. Then there exists a Nash equilibrium of this model in which each player acquires information myopically.

In fact, we show that the myopic information acquisition strategy is dominant in the following sense: For arbitrary opponent strategies, player i's best response involves acquiring signals myopically.

In Appendix 10.1, we apply the above corollary to extend results from Hellwig and Veldkamp (2009) and Lambert et al. (2018) to a setting with sequential information acquisition.

[Footnote 30: Note that we do not impose conditional independence between ω and the other players' states.]

[Footnote 31: Conditional independence is imposed on players' prior beliefs. However, this implies conditional independence for subsequent posterior beliefs: given the value of θ^i_1, each signal is simply a linear combination of player i's other states plus noise, so conditional independence is preserved under updating.]

[Footnote 32: Example 1 is based on Example OA.3 in Lambert et al. (2018); Example 2 is based on their Example 1.]
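The conditional independence property in Example 1 can be checked with Gaussian partial covariances. The sketch below assumes unit variances for all states and illustrative values of ρ_1, ρ_2 (none of these numbers come from the paper); for jointly normal states, zero conditional covariance is equivalent to conditional independence.

```python
import numpy as np

# Numerical check of conditional independence in Example 1.
# States (omega, xi, b1, b2) independent; unit variances and the values
# of rho1, rho2 below are assumed for illustration.
rho1, rho2 = 0.7, 1.3
V = np.eye(4)  # joint covariance of (omega, xi, b1, b2)

# Coordinate vectors for the linear maps of interest.
omega   = np.array([1.0, 0.0, 0.0, 0.0])
theta11 = np.array([1.0, rho1, 0.0, 0.0])  # theta^1_1 = omega + rho1*xi
theta21 = np.array([1.0, rho2, 0.0, 0.0])  # theta^2_1 = omega + rho2*xi
b1      = np.array([0.0, 0.0, 1.0, 0.0])
b2      = np.array([0.0, 0.0, 0.0, 1.0])

def cond_cov(x, y, s):
    """Cov(x'state, y'state | s'state) for jointly Gaussian states."""
    return x @ V @ y - (x @ V @ s) * (s @ V @ y) / (s @ V @ s)

# Conditional on theta^1_1, player 1's residual state b1 is uncorrelated
# with omega and with player 2's states, as the property requires.
for other in (omega, theta21, b2):
    assert abs(cond_cov(other, b1, theta11)) < 1e-12

# The conditioning is not vacuous: omega covaries with theta^1_1 itself.
print(omega @ V @ theta11)  # 1.0
```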
6 Extensions

6.1 Endogenous Learning Intensities

The main model imposes an exogenous capacity constraint of B signals per period. Suppose now that in each period t, the DM can choose to observe any number N_t ∈ ℤ+ of signal realizations (which are then optimally allocated across signals). The DM incurs a flow cost of information acquisition, modeled as κ(N_t) for some increasing cost function κ(·) with κ(0) = 0. This framework embeds our main model if we define κ(N) = 0 for N ≤ B and κ(N) = ∞ for N > B. We assume that the DM's payoff is U(a_1, a_2, …; θ1) − Σ_t δ^{t−1}·κ(N_t) for some discount factor δ. [Footnote 33] For the special case of endogenous stopping, the payoff function simplifies to

δ^τ · u(a_τ; θ1) − Σ_{t=1}^τ δ^{t−1}·κ(N_t)

whenever the DM stops after τ periods. This is a discrete-time generalization of the framework proposed in Moscarini and Smith (2001), although our focus is on allocation of the signals instead of choice of intensity level. [Footnote 34] Theorems 1 and 2 generalize to this setting:

Corollary 2. Suppose B is sufficiently large or the informational environment is separable. Then, even with endogenous learning intensities, the DM has an optimal strategy that chooses signals myopically.

In the above corollary, "myopic acquisition" means the following: In any period t, given the (endogenous) intensity choice N_t, the optimal acquisitions are the N_t signals that minimize posterior variance about θ1. We emphasize that while myopic signal choices are optimal, myopic intensity choices need not be. However, knowing that the signal choices must follow the myopic path provides a simplifying first step towards the characterization of optimal intensity levels.

[Footnote 33: Our analysis can accommodate more general payoff functions of the form U(N_1, a_1, N_2, a_2, …; θ1).]
Generic eventual myopia (Theorem 3) also extends, but we omit the details.

6.2 Multiple Payoff-Relevant States

In the main model, the DM's payoff function depends on a one-dimensional state. Our results do not extend in general to payoff functions that depend on the full state vector (θ1, …, θK). Loosely, this is because the signals can no longer be Blackwell-ordered; thus, even the statement that the signal which maximally reduces posterior variance is best for static decision problems has no analogue when multiple states are payoff-relevant.

However, our results do extend for a class of prediction problems. Specifically, suppose that at an exogenous end date (determined by an arbitrary distribution over periods), the DM is asked to predict the state vector θ. At this time, he receives a payoff of

−(a − θ)' W (a − θ),

where W is a given positive semi-definite matrix and a ∈ ℝ^K is the DM's prediction.

In such a setting, our main results and their proofs extend essentially without modification. To see this, note that when W is diagonal, the DM simply minimizes a weighted sum of posterior variances about multiple states. Generalizing Lemma 2 in Appendix 9.1, we can show that any such objective function exhibits "eventual near-separability," which is sufficient to derive Theorem 1 and Theorem 3. When W is not diagonal, we can use the spectral theorem to write the DM's objective function as a weighted sum of posterior variances about some linearly-transformed states. Our proofs still carry through.

[Footnote 34: Moscarini and Smith (2001) has a single state and a single signal (K = 1), so the DM chooses only the learning intensities N_t. Unlike Moscarini and Smith (2001), we do not characterize the optimal sequence of intensity choices (N_t)_{t≥1}, but instead show how this problem can be separated from the allocation of those observations across different kinds of sources.]
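The spectral-theorem reduction can be checked numerically. With the prediction a set to the posterior mean, the expected loss equals trace(W·Σ), where Σ is the posterior covariance; writing W = QΛQ' shows this is a weighted sum of posterior variances about the transformed states q_k'θ, with weights the eigenvalues of W. The matrices below are assumed examples.

```python
import numpy as np

# Sketch of the spectral-theorem reduction for a non-diagonal weight
# matrix W: trace(W @ Sigma) equals the eigenvalue-weighted sum of
# posterior variances about the transformed states q_k' theta.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
Sigma = A @ A.T          # an assumed posterior covariance over theta
M = rng.normal(size=(3, 3))
W = M @ M.T              # an assumed positive semi-definite weight matrix

lam, Q = np.linalg.eigh(W)   # W = Q diag(lam) Q'
# Var(q_k' theta) = q_k' Sigma q_k for each eigenvector q_k of W.
weighted_vars = sum(l * (q @ Sigma @ q) for l, q in zip(lam, Q.T))

assert np.isclose(np.trace(W @ Sigma), weighted_vars)
```

So minimizing the expected quadratic loss is the same problem as minimizing a weighted sum of posterior variances about linearly-transformed states, which is the diagonal case again.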
6.3 Continuous Time

In a working paper, we analyze a continuous-time version of our problem. In that model, the DM has B units of attention in total at every point in time. He chooses attention levels β_1(t), …, β_K(t) subject to β_i(t) ≥ 0 and Σ_i β_i(t) ≤ B, and then observes K diffusion processes X_1, …, X_K, whose evolutions are affected by the attention rates in the following way:

dX_i(t) = β_i(t)·⟨c_i, θ⟩ dt + √(β_i(t)) dB_i,

where each B_i is an independent standard Brownian motion. This formulation can be seen as a limit of our discrete-time model in the current paper, where we take the period length to zero and also "divide" the signals to hold constant the amount of information that can be gathered every instant.

In short, all results from this paper extend (and occasionally can be strengthened): the optimal rule is eventually myopic in all informational environments (thus dropping the generic qualifier in Theorem 3); additionally, we provide more permissive sufficient conditions on the informational environment under which an optimal strategy is myopic from period 1. We refer the reader to the working paper for more detail.

7 Related Literature

Besides the references mentioned in the introduction, our setting is related to a recent literature (Bubeck et al., 2009; Russo, 2016) on "best-arm identification" in a multi-armed bandit setting: A DM samples for a number of periods before selecting an arm and receiving its payoff. In Appendix 10.3, we characterize the optimal information acquisition strategy for the case of two states (K = 2), which applies exactly to the problem of identifying the better of two correlated normal arms. However, due to our assumption of a one-dimensional payoff-relevant state, we are not able to handle more than two arms. [Footnote 35]
[Footnote 35: With two arms, the DM only cares about the difference in their expected payoffs. Choosing among more than two arms would involve multi-dimensional payoff uncertainty and a decision problem that is not prediction. As we discussed in Section 6.2, the lack of a complete Blackwell ordering limits the generalization of our argument. Incidentally, in related sequential search settings, Sanjurjo (2017), Ke and Villas-Boas (2017), and Chick and Frazier (2012) also highlight the challenge of characterizing the optimal strategy once there are at least three alternatives.]

We note that correlation is the key feature of our setting, and we are not aware of many papers that study correlated bandits, either in the classical framework or in best-arm identification (see Rothschild (1974), Keener (1985), and Mersereau et al. (2009) for a few stylized cases).

Our results on the comparison of sequential normal experiments (see the discussion in Section 4 and the results in Appendix 9.2) generalize the main result in Greenshtein (1996), which compares two deterministic (i.e., history-independent) sequences of signals, where each signal is θ1 plus independent normal noise. His Theorem 3.1 implies that one sequence is Blackwell-dominant if and only if its cumulative precision is higher at every time. Note that this statement does not refer to prior beliefs; but if we impose a normal prior on θ1, then higher cumulative precision is equivalent to lower posterior variance. Thus, the result of Greenshtein (1996) coincides with ours when θ1 is the only persistent state and all signals are independent conditional on θ1. Our setting features additional correlation across different signals through the persistent (payoff-irrelevant) states θ2, …, θK. Consequently, the dynamic Blackwell comparison in our model depends on prior beliefs. [Footnote 36] This feature, together with the endogenous choice of signals (which may be history-dependent), complicates our problem relative to Greenshtein (1996).

[Footnote 36: This is already the case for static comparisons, since as prior beliefs vary, it is not always the same signal that leads to the lowest posterior variance about θ1.]

Finally, our work is closely related to optimal design, a field initiated by the early work of Robbins (1952) (see Chernoff (1972) for a survey). Specifically, the problem of one-shot allocation of t signals (our t-optimality criterion in Section 4) is equivalent to a Bayesian optimal design problem with respect to the "c-optimality" criterion, which seeks to minimize the variance of an unknown parameter. Our analysis, however, is focused on dynamics, and we demonstrate here the optimality of "greedy design" for a broad class of (intertemporal) objectives.

8 Conclusion

A DM learns about a payoff-relevant state by sequentially sampling (batches of) signals from flexibly correlated Gaussian sources. Under conditions that we provide, myopic information acquisition is optimal and robust across all possible payoff functions. Generically, the optimal strategy eventually acquires signals myopically. These results are robust to extension to multi-player settings, to endogenous choice of the number of signals acquired each period, and to multi-dimensional uncertainty for certain payoff functions.

We conclude with a re-interpretation of the main setting. Suppose there is a sequence of short-lived decision-makers indexed by time t, each of whom acquires information and then takes an action a_t to maximize some private objective u_t(a_t, θ1). All information acquisition is public. [Footnote 37] A social planner has an arbitrary objective function U(a_1, a_2, …; θ1); thus, his incentives are misaligned with those of the short-lived decision-makers. Our main results demonstrate conditions under which this misalignment is of no consequence. If each DM acquires sufficiently many signals, or if the environment is separable, then each DM will acquire exactly the information that the social planner would have wanted. Generically, the social planner will not be able to improve (at late periods) upon the information that has been aggregated so far. We generalize this qualitative insight in our companion piece Liang and Mu (2018) and also demonstrate how it can fail.

[Footnote 37: This separates our model from classic social learning frameworks (Banerjee, 1992; Bikhchandani et al., 1992), where decision-makers only observe coarse summary statistics of past information acquisitions.]

9 Appendix

9.1 Preliminary Results

9.1.1 Posterior Variance Function

We begin by presenting basic results that are used throughout the appendix. The following lemma characterizes the posterior variance function f mentioned in the main text, which maps signal counts to the DM's posterior variance about the payoff-relevant state θ1.

Lemma 1. Given prior covariance matrix V_0 and q_i observations of each signal i, the DM's posterior variance about θ1 is given by [Footnote 38]

f(q_1, …, q_K) = [V_0 − V_0 C' Σ^{−1} C V_0]_{11},   (1)

where Σ = C V_0 C' + D^{−1} and D = diag(q_1/σ_1², …, q_K/σ_K²). The function f is decreasing and convex in each q_i whenever these arguments take non-negative extended real values: q_i ∈ R̄_+ = ℝ_+ ∪ {+∞}.

Proof. The expression (1) comes directly from the conditional variance formula for multivariate Gaussian distributions. To prove ∂f/∂q_i ≤ 0, consider the partial order on positive semi-definite matrices in which A ⪰ B if and only if A − B is positive semi-definite. As q_i increases, the matrices D^{−1} and Σ decrease in this order.
Thus Σ^{-1} increases in this order, which implies that V_0 − V_0 C' Σ^{-1} C V_0 decreases in this order. In particular, the diagonal entries of V_0 − V_0 C' Σ^{-1} C V_0 are uniformly smaller, so that f becomes smaller. Intuitively, more information always improves the decision-maker's estimates.

To prove that f is convex, it suffices to prove that f is midpoint-convex, since the function is clearly continuous. Take q_1, ..., q_K, r_1, ..., r_K ∈ R̄_+ and let s_i = (q_i + r_i)/2.^39 Define the corresponding diagonal matrices to be D_q, D_r, D_s. We need to show f(q_1, ..., q_K) + f(r_1, ..., r_K) ≥ 2 f(s_1, ..., s_K). For this, we first use the Woodbury inversion formula to write

    Σ^{-1} = (C V_0 C' + D^{-1})^{-1} = J − J(J + D)^{-1} J,   with J = (C V_0 C')^{-1}.

Plugging this back into (1), we see that it suffices to show the following matrix order:

    (J + D_q)^{-1} + (J + D_r)^{-1} ⪰ 2 (J + D_s)^{-1}.

Inverting both sides, we need to show 2((J + D_q)^{-1} + (J + D_r)^{-1})^{-1} ⪯ J + D_s. By definition, D_q + D_r = diag((q_1 + r_1)/σ_1², ..., (q_K + r_K)/σ_K²) = 2 D_s. Thus the above follows from the AM-HM inequality for positive definite matrices; see for instance Ando (1983).

^39 We allow the function f to take +∞ as arguments. This relaxation does not affect the properties of f, and it is convenient for our later analysis.

9.1.2 The Matrix Q_i

For each 1 ≤ i ≤ K, define

    Q_i = C^{-1} Δ_{ii} (C')^{-1}    (2)

where Δ_{ii} is the matrix with a 1 in the (i,i)-th entry and zeros elsewhere. We note that [Q_i]_{11} = ([C^{-1}]_{1i})², which is strictly positive under Assumption 1. These matrices Q_i will be used repeatedly in our proofs.

9.1.3 Order Difference Lemma

Here we establish the asymptotic order of the second derivatives of f.

Lemma 2. As q_1, ..., q_K → ∞, ∂²f/∂q_i² is positive with order 1/q_i³, whereas ∂²f/∂q_i∂q_j has order at most 1/(q_i² q_j²) for any j ≠ i. Formally, there is a positive constant L, depending on the informational environment, such that

    ∂²f/∂q_i² ≥ 1/(L q_i³)   and   |∂²f/∂q_i∂q_j| ≤ L/(q_i² q_j²).

To interpret: the second derivative ∂²f/∂q_i² is the effect of observing signal i on the marginal value of the next observation of signal i. The lemma says that this second derivative is always eventually positive, so that each observation of signal i makes the next observation of signal i less valuable. The cross-partial ∂²f/∂q_i∂q_j is the effect of observing signal i on the marginal value of the next observation of a different signal j, and its sign is ambiguous. The key content of the lemma is that, regardless of the sign of the cross-partial, it is always of lower order than the second derivative. In words, the effect of observing a signal on the marginal value of other signals (as quantified by the cross-partial) is eventually second-order relative to its effect on the marginal value of further realizations of the same signal (as quantified by the second derivative). This is true for any signal path along which the signal counts q_1, ..., q_K go to infinity proportionally, which we will justify later.

Proof. Recall from Lemma 1 that f(q_1, ..., q_K) = [V_0 − V_0 C' Σ^{-1} C V_0]_{11}, and therefore

    ∂²f/∂q_i∂q_j = [∂_{ij}(V_0 − V_0 C' Σ^{-1} C V_0)]_{11},   ∂²f/∂q_i² = [∂_{ii}(V_0 − V_0 C' Σ^{-1} C V_0)]_{11}.    (3)

Using properties of matrix derivatives,

    ∂_{ii}(Σ^{-1}) = Σ^{-1}(∂_i Σ)Σ^{-1}(∂_i Σ)Σ^{-1} − Σ^{-1}(∂_{ii} Σ)Σ^{-1} + Σ^{-1}(∂_i Σ)Σ^{-1}(∂_i Σ)Σ^{-1}.    (4)

The relevant derivatives of the covariance matrix Σ are

    ∂_i Σ = −(σ_i²/q_i²) Δ_{ii},   ∂_{ii} Σ = (2σ_i²/q_i³) Δ_{ii}.

Plugging these into (4), we obtain

    ∂_{ii}(Σ^{-1}) = −(2σ_i²/q_i³)(Σ^{-1} Δ_{ii} Σ^{-1}) + O(1/q_i⁴).
Thus by (3),

    ∂²f/∂q_i² = −[V_0 C' · ∂_{ii}(Σ^{-1}) · C V_0]_{11} = (2σ_i²/q_i³) · [V_0 C' Σ^{-1} Δ_{ii} Σ^{-1} C V_0]_{11} + O(1/q_i⁴).    (5)

As q_1, ..., q_K → ∞, Σ → C V_0 C', which is symmetric and non-singular. Thus the matrix V_0 C' Σ^{-1} Δ_{ii} Σ^{-1} C V_0 converges to the matrix Q_i defined earlier in (2). From (5) and [Q_i]_{11} > 0, we conclude that ∂²f/∂q_i² is positive with order 1/q_i³.

Similarly, for i ≠ j, we have

    ∂_{ij}(Σ^{-1}) = Σ^{-1}(∂_j Σ)Σ^{-1}(∂_i Σ)Σ^{-1} − Σ^{-1}(∂_{ij} Σ)Σ^{-1} + Σ^{-1}(∂_i Σ)Σ^{-1}(∂_j Σ)Σ^{-1}.

The relevant derivatives of the covariance matrix Σ are

    ∂_i Σ = −(σ_i²/q_i²) Δ_{ii},   ∂_j Σ = −(σ_j²/q_j²) Δ_{jj},   ∂_{ij} Σ = 0.

From this it follows that ∂_{ij}(Σ^{-1}) = O(1/(q_i² q_j²)). The same holds for ∂²f/∂q_i∂q_j because of (3), completing the proof of the lemma.

9.2 Dynamic Blackwell Comparison

9.2.1 The Lemma

This subsection establishes a dynamic version of Blackwell dominance for sequences of normal signals. As an overview, we first generalize Greenshtein (1996) and show that a deterministic (i.e. history-independent) signal sequence yields higher expected payoff than another in every intertemporal decision problem if (and only if) the former sequence induces lower posterior variances about θ_1 at every period. This will be a corollary of the lemma below, which also covers strategies that may condition on signal realizations.

We introduce some notation. Since θ_1 is the only payoff-relevant state, the DM in our model only needs to remember the expected value of θ_1 and the covariance matrix over all of the states (that is, expected values of the other states do not matter). Thus, we can summarize any history of beliefs by h^T = (µ_1^0, V^0; ...; µ_1^T, V^T), with µ_1^t representing the posterior expected value of θ_1 after t periods and V^t the posterior covariance matrix.
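As an aside, the posterior variance function f of Lemma 1 is straightforward to implement and sanity-check numerically. The sketch below (NumPy; the two-state environment V_0, C, and noise variances are made-up illustrative choices, not taken from the text) evaluates (1) and verifies that f is decreasing in each argument and midpoint-convex:

```python
import numpy as np

# Illustrative environment (made-up numbers): two states (theta_1, theta_2),
# two signals with coefficient matrix C and noise variances sigma_i^2.
V0 = np.array([[1.0, 0.5],
               [0.5, 2.0]])        # prior covariance of the states
C = np.array([[1.0, 1.0],
              [1.0, -1.0]])        # signal coefficients (invertible)
sigma2 = np.array([1.0, 2.0])      # noise variances sigma_1^2, sigma_2^2

def f(q):
    """Posterior variance of theta_1 given q_i observations of signal i, eq. (1).

    Requires q_i > 0 here so that D^{-1} = diag(sigma_i^2 / q_i) is finite.
    """
    q = np.asarray(q, dtype=float)
    Sigma = C @ V0 @ C.T + np.diag(sigma2 / q)
    V_post = V0 - V0 @ C.T @ np.linalg.solve(Sigma, C @ V0)
    return V_post[0, 0]

# f is decreasing in each coordinate: more observations reduce the variance.
assert f([2, 3]) < f([1, 3]) and f([1, 4]) < f([1, 3])

# Midpoint convexity, f(q) + f(r) >= 2 f((q + r) / 2), as shown via AM-HM.
assert f([1, 3]) + f([3, 3]) >= 2 * f([2, 3])

# Cross-check against the equivalent information-form Bayesian update,
# (V0^{-1} + C' diag(q_i / sigma_i^2) C)^{-1}, valid when all q_i > 0.
q = np.array([4.0, 7.0])
alt = np.linalg.inv(np.linalg.inv(V0) + C.T @ np.diag(q / sigma2) @ C)[0, 0]
assert abs(f(q) - alt) < 1e-10
```

The last check uses the standard precision-form Gaussian update, which agrees with (1) by the matrix inversion lemma whenever all counts are strictly positive.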
Since the posterior covariance matrix is a function of the signal counts, we can also keep track of the evolution of posterior covariance matrices by a sequence of division vectors. That is, we will write the history as h^T = (µ_1^0, d(0); ...; µ_1^T, d(T)), where each d(t) = (d_1(t), ..., d_K(t)) counts the number of observations of each signal acquired by time t. We can then view any information acquisition strategy S as a mapping from such sequences of expected values and division vectors to signal choices.

Consider a mapping G̃ from possible sequences of divisions to these sequences themselves: each (d(0), ..., d(T)) is mapped by G̃ to another sequence (d̃(0), ..., d̃(T)), subject to the following "consistency" requirements. First, Σ_i d̃_i(t) = t, meaning that each d̃(t) must be a possible division at time t. Second, d̃_i(t) ≥ d̃_i(t−1), meaning that the sequence d̃ can be attained via a sequential sampling rule. Lastly, we require (d̃(0), ..., d̃(T−1)) = G̃(d(0), ..., d(T−1)), so that nested sequences are mapped to nested sequences.

The following lemma says that if d(·) represents the division vectors under an information acquisition strategy S, and if G̃ is a consistent mapping that uniformly reduces the posterior variance, then we can find another information acquisition strategy S̃ whose division vectors are given by d̃(·). Moreover, our construction ensures that S̃ leads to more dispersed posterior beliefs than S at every period, so that in any decision problem, acquiring signals according to S̃ is weakly better than S (when actions are taken optimally).

Lemma 3. Fix any information acquisition strategy S and any consistent mapping G̃ as defined above. Suppose that for every sequence of divisions (d(0), ..., d(T)) realized under S, it holds that f(d̃(T)) ≤ f(d(T)).
Then there exists a deviation strategy S̃ such that, at every period T, any history h^T = (µ_1^0, d(0); ...; µ_1^T, d(T)) under S can be "associated with" a distribution of histories h̃^T = (ν_1^0, d̃(0); ...; ν_1^T, d̃(T)) with the following properties:

1. the probability of h^T occurring under S is the same as the probability of its associated h̃^T (integrated with respect to the probability of "association") occurring under S̃;

2. the total probability that any h̃^T is associated to (integrated with respect to different possible h^T) is 1;

3. under the association, the distribution of ν_1^t is normal with mean µ_1^t and variance f(d(t)) − f(d̃(t)) for each t.

Consequently, for any decision strategy A, there exists another decision strategy Ã such that the expected payoff under (S̃, Ã) is no less than the expected payoff under (S, A).

To interpret: the first two properties require that the association relation is a Markov kernel between histories under S and histories under S̃; this enables us to compare payoffs under S̃ to those under S. The third property guarantees that the alternative strategy S̃ is more informative than S. We note the following corollary, obtained from the previous lemma by considering a constant mapping G̃.

Corollary 3. Define the t-optimal division vectors as in Section 4.1. Suppose each coordinate of n(Bt) increases in t. Then it is optimal for the DM to achieve n(Bt) at every period.

9.2.2 Proof of Lemma 3

We construct S̃ iteratively, as follows. In the first period, consider the signal choice under S (given the null history). This signal leads to the division d(1). Let S̃ observe the unique signal that would achieve the division d̃(1).
After the first observation, the DM's distribution of posterior beliefs about θ_1 under S is θ_1 ∼ N(µ_1^1, f(d(1))), with µ_1^1 a normal random variable with mean µ_1^0 and variance f(0) − f(d(1)). By comparison, the distribution of posterior beliefs under S̃ is θ_1 ∼ N(ν_1^1, f(d̃(1))), with ν_1^1 drawn from N(µ_1^0, f(0) − f(d̃(1))). Since f(d̃(1)) ≤ f(d(1)), the latter distribution of beliefs (under S̃) is more informative à la Blackwell. Thus, we can associate each belief θ_1 ∼ N(µ_1^1, f(d(1))) under S with a more informative distribution of beliefs N(ν_1^1, f(d̃(1))) under S̃. To be more specific, for fixed µ_1^1, the associated ν_1^1 is distributed normally with mean µ_1^1 and variance f(d(1)) − f(d̃(1)). Thus, by construction, all three properties are satisfied at period 1. To facilitate the discussion below, we say this distribution of beliefs under S̃ "imitates" the belief (µ_1^1, f(d(1))) under S.

In the second period, the deviation strategy S̃ takes the current belief (ν_1^1, f(d̃(1))) and randomly selects some µ_1^1 (with conditional probabilities under the Markov kernel) to "imitate." That is, given any selection of µ_1^1, find the signal that S would observe in period 2 given the belief (µ_1^1, f(d(1))). This signal choice under S leads to the division sequence (d(0), d(1), d(2)), which is mapped to (d̃(0), d̃(1), d̃(2)). Naturally, we let S̃ observe the signal that would lead to the division d̃(2). Such a signal is well-defined due to our consistency requirements on G̃.

To proceed with the analysis, let us fix µ_1^1 and study the distribution of posterior beliefs about θ_1 after two observations. Under S, the distribution of posterior beliefs is θ_1 ∼ N(µ_1^2, f(d(2))), with µ_1^2 normally distributed with mean µ_1^1 and variance f(d(1)) − f(d(2)).
Under S̃, meanwhile, the distribution of posterior beliefs is θ_1 ∼ N(ν_1^2, f(d̃(2))), with ν_1^2 drawn from N(µ_1^1, f(d(1)) − f(d̃(2))).^40 Since f(d̃(2)) ≤ f(d(2)), the distribution of beliefs under S̃ Blackwell-dominates the distribution under S, for each µ_1^1. We can thus associate each history (µ_1^1, d(1); µ_1^2, d(2)) under S with a distribution of histories (ν_1^1, d̃(1); ν_1^2, d̃(2)) under S̃, such that the corresponding beliefs under S̃ are more informative at both periods. Repeating this procedure completes the construction of S̃, which satisfies all three properties stated in the lemma.

Finally, suppose A is any decision strategy that maps histories to actions. We need to find Ã such that the pair (S̃, Ã) does no worse than (S, A). This is straightforward given what we have done: at any history h̃^T under S̃, let h̃^T randomly select an h^T to imitate, and define Ã(h̃^T) = A(h^T). Then a DM who follows the decision strategy A obtains the same payoff along any belief history h as another DM who uses the decision strategy Ã and faces the distribution of belief histories h̃. Integrating over h, we have shown that (S̃, Ã) achieves the same payoff as (S, A). The lemma is proved.

^40 Here we use the following technical result: suppose the DM is endowed with a distribution of prior beliefs θ ∼ N(µ, V), with µ_1 normally distributed with mean y and variance σ². Then, upon observing signal i and performing Bayesian updating, his distribution of posterior beliefs is θ ∼ N(µ̂, V̂), with µ̂_1 normally distributed with mean y and variance σ² + [V − V̂]_{11}. This is proved by noting that the DM's distribution of beliefs about θ_1 must integrate to the same ex-ante distribution of θ_1.

9.3 Proof of Theorem 1 (Large Block of Signals)

By Corollary 3, it suffices to show that for sufficiently large B, each coordinate of n(Bt) is increasing in t. To do this, we first argue that the signal counts grow to infinity (roughly) proportionally. In more detail, define

    λ_i = |[C^{-1}]_{1i}| · σ_i / Σ_{j=1}^K |[C^{-1}]_{1j}| · σ_j.    (6)

We will show that for each signal i, n_i(t) − λ_i · t remains bounded even as t → ∞. Indeed, we must at least have n_i(t) → ∞; otherwise the posterior variance f(n(t)) would be bounded away from zero, which would contradict the optimality of n(t), since f(t/K, ..., t/K) → 0. Additionally, we compute from (1) that

    ∂_i f(n(t)) = −(σ_i²/n_i²) · [V_0 C' Σ^{-1} Δ_{ii} Σ^{-1} C V_0]_{11}.    (7)

As each n_i → ∞, the matrix Σ = C V_0 C' + D^{-1} (see Lemma 1) converges to C V_0 C'. So V_0 C' Σ^{-1} Δ_{ii} Σ^{-1} C V_0 converges to the matrix Q_i defined in (2). It follows from (7) that ∂_i f ∼ −(σ_i²/n_i²) · [Q_i]_{11} (the ratio converges to 1). Since a t-optimal division must satisfy ∂_i f ∼ ∂_j f (because we are doing discrete optimization, ∂_i f and ∂_j f need only be approximately equal), we deduce that n_i and n_j must grow proportionally. Using [Q_i]_{11} = ([C^{-1}]_{1i})², we have n_i(t) ∼ λ_i t.

Next, note that because n_i(t) ∼ λ_i t, we have Σ = C V_0 C' + D^{-1} = C V_0 C' + O(1/t).^41 Thus, in fact, V_0 C' Σ^{-1} Δ_{ii} Σ^{-1} C V_0 converges to Q_i at the rate 1/t. From (7), we obtain

    ∂_i f = −σ_i² · ([Q_i]_{11} + O(1/t)) / n_i².

t-optimality gives us the first-order condition ∂_i f = ∂_j f + O(1/t³).^42 So

    (λ_i² + O(1/t)) / n_i² = (λ_j² + O(1/t)) / n_j².

This is equivalent to λ_i² n_j² − λ_j² n_i² = O(t), which yields λ_i n_j − λ_j n_i = O(1) after factorization. Hence n_i(t) = λ_i · t + O(1), as we claimed.

Having completed this asymptotic characterization of the t-optimal division vectors, we will now show that n(t + K − 1) ≥ n(t) (in each coordinate) whenever t is sufficiently large.

^41 "Big O" notation has the usual meaning.
^42 That is, error terms due to discreteness are of order at most 1/t³. We omit the details.
Theorem 1 will follow once this is proved.^43

Suppose, for the sake of contradiction, that n_1(t + K − 1) ≤ n_1(t) − 1. Note that Σ_{i=1}^K (n_i(t + K − 1) − n_i(t)) = K − 1. So Σ_{i=2}^K (n_i(t + K − 1) − n_i(t)) ≥ K, and we can without loss of generality assume n_2(t + K − 1) ≥ n_2(t) + 2. To summarize: when transitioning from t-optimality to (t + K − 1)-optimality, signal 1 is acquired at least once less and signal 2 at least twice more. Below, we will obtain a contradiction by arguing that at period t + K − 1, the posterior variance could be further reduced by observing signal 1 once more and signal 2 once less.

Indeed, write n_i = n_i(t) and ñ_i = n_i(t + K − 1). Then t-optimality of n(t) gives us

    f(n_1 − 1, n_2 + 1, ..., n_K) ≥ f(n_1, n_2, ..., n_K).

With a slight abuse of notation, we let ∂_i f denote the discrete partial derivative of f: ∂_i f(q) = f(q_i + 1, q_{−i}) − f(q). Then the display above is equivalent to

    ∂_2 f(n_1 − 1, n_2, ..., n_K) ≥ ∂_1 f(n_1 − 1, n_2, ..., n_K).    (8)

We claim this implies the following:

    ∂_2 f(ñ_1, ñ_2 − 1, ..., ñ_K) > ∂_1 f(ñ_1, ñ_2 − 1, ..., ñ_K).    (9)

This would lead to

    f(ñ_1, ñ_2, ..., ñ_K) > f(ñ_1 + 1, ñ_2 − 1, ..., ñ_K),

which would be our desired contradiction. It remains to show (8) ⟹ (9).

^43 To be fully rigorous, this only proves Theorem 1 when B is sufficiently large and a multiple of K − 1. However, we can similarly show n(t + K) ≥ n(t) for sufficiently large t. The two inequalities n(t + K − 1) ≥ n(t) and n(t + K) ≥ n(t) together suffice to deduce Theorem 1 for all large B.
By assumption, we have ñ_1 ≤ n_1 − 1 and ñ_2 ≥ n_2 + 2, and the difference between any ñ_j and n_j is bounded uniformly over t. Thus the LHS of (9) exceeds the LHS of (8) by (at least) a second derivative ∂_{22}f minus a finite number of cross-partial derivatives ∂_{2j}f. By Lemma 2, this difference on the LHS is positive with order 1/t³. The difference between the RHS of (9) and the RHS of (8) can be positive or negative, but either way it has order O(1/t⁴). This shows that (9) is a consequence of (8), and the theorem follows.

9.4 Proof of Theorem 2 (Separable Environments)

Suppose the informational environment is separable. We will show that n(t) increases in t, which implies the theorem via Corollary 3. Note that in a separable environment, the definition of t-optimality reduces to

    n(t) = (n_1(t), ..., n_K(t)) ∈ argmin_{(q_1,...,q_K): q_i ∈ Z_+, Σ_{i=1}^K q_i = t} Σ_{i=1}^K g_i(q_i),

where g_1, ..., g_K are convex functions. In this setting, the myopic information acquisition strategy sequentially chooses the signal i that minimizes the difference g_i(q_i + 1) − g_i(q_i), given the current division vector q. But since the g-functions are convex, the outcome under the myopic strategy coincides with t-optimality at every period t. (This is fairly well known, and it can be proved quickly by induction.) Hence n(t) can, and will, be sequentially attained.

9.5 Preparation for the Proof of Theorem 3

9.5.1 Switch Deviations

We now introduce preliminary results that will be used to show that the optimal rule eventually proceeds myopically in generic environments. Relative to the proofs of Theorems 1 and 2, the new difficulty is that, in general, the optimal information acquisition strategy conditions on signal realizations. As a result, the induced division vectors d(·) are stochastic, and we will need the full power of our dynamic Blackwell lemma.
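The equivalence between the myopic rule and t-optimality under separability can be illustrated concretely. In the sketch below, the convex functions g_i(q) = w_i/(q + 1) and the weights w_i are made-up placeholders (not from the paper); a greedy rule that always takes the largest one-step reduction matches exhaustive minimization at every horizon:

```python
import itertools

# Made-up separable environment: g_i(q) = w_i / (q + 1) is convex and
# decreasing, so marginal reductions shrink with each extra observation.
w = [3.0, 1.0, 2.0]
K = len(w)

def g(i, q):
    return w[i] / (q + 1.0)

def total(qs):
    return sum(g(i, qs[i]) for i in range(K))

def myopic(t):
    """Greedily add the signal with the most negative one-step change."""
    qs = [0] * K
    for _ in range(t):
        i = min(range(K), key=lambda i: g(i, qs[i] + 1) - g(i, qs[i]))
        qs[i] += 1
    return qs

def t_optimal_value(t):
    """Exhaustive search over all divisions summing to t."""
    return min(total(qs)
               for qs in itertools.product(range(t + 1), repeat=K)
               if sum(qs) == t)

# With convex g_i, the myopic outcome is t-optimal at every period.
for t in range(1, 9):
    assert abs(total(myopic(t)) - t_optimal_value(t)) < 1e-12
```

Ties are broken arbitrarily here; with convex g_i any tie-breaking attains the optimal value, though the realized division may differ.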
In what follows, we apply Lemma 3 using a particular class of mappings G̃.

Definition 3. Fix a particular sequence of divisions (d*(0), d*(1), ..., d*(t_0)). Let i be the signal observed in period t_0 and let j be any other signal. An (i,j)-switch mapping G̃ specifies the following:

1. If T < t_0, or d(t) ≠ d*(t) for some t ≤ t_0, then let G̃(d(0), ..., d(T)) be itself.

2. Otherwise, T ≥ t_0 and d(t) = d*(t) for all t ≤ t_0. If d_j(T) = d_j(t_0), then let d̃(T) = (d_i(T) − 1, d_j(T) + 1, d_{−ij}(T)). If d_j(T) > d_j(t_0), then let d̃(T) = d(T).

[Figure 2: Pictorial representation of an (i,j)-switch based on a sequence of divisions d*(0), ..., d*(t_0). The first t_0 − 1 signals match the divisions d*(0), ..., d*(t_0 − 1); signal i in period t_0 is replaced by signal j; subsequent choices are unchanged (none of them being j) until the first later period at which S observes signal j, where the deviation observes signal i instead.]

Let us interpret this definition by relating it to the resulting deviation strategy S̃ constructed in Lemma 3. The first case above says that S̃ only deviates when the history of divisions is d*(0), ..., d*(t_0 − 1) and S is about to observe signal i in period t_0. The second case says that S̃ dictates observing signal j instead at that history; subsequently, S̃ observes the same signal as S (at the imitated belief) until the first period at which S is about to observe signal j. If that period exists, the deviation strategy S̃ switches back to observing signal i and coincides with S afterwards.

The benefit of these "switch deviations" is that their posterior variances are easily compared to those of the original strategy. Specifically, d̃(t) = d(t) except at those histories that begin with d*(0), d*(1), ..., d*(t_0 − 1) (and before signal j is observed again under S).
At such histories, the posterior variance is strictly lower under S̃ if and only if

    f(d_i(t) − 1, d_j(t) + 1, d_{−ij}(t)) < f(d(t)).

Using (absolute values of) the discrete partial derivatives, we can rewrite this condition as

    |∂_i f(d_i(t) − 1, d_j(t), d_{−ij}(t))| < |∂_j f(d_i(t) − 1, d_j(t), d_{−ij}(t))|.    (*)

We thus obtain the following corollary:

Corollary 4. Suppose we can find a history of divisions d(0), ..., d(t_0) realized under S such that d_i(t_0) = d_i(t_0 − 1) + 1, and moreover (*) holds for all divisions d(t) with d_j(t) = d_j(t_0) and d_k(t) ≥ d_k(t_0) for all k. Then the switch deviation S̃ constructed above improves upon S.

Note that the condition d_j(t) = d_j(t_0) captures the fact that d̃(t) differs from d(t) only until signal j is chosen again by S. Meanwhile, d_k(t) ≥ d_k(t_0) for all k holds because we only compare posterior variances after t_0 periods.

9.5.2 Asymptotic Characterization of the Optimal Strategy

Below, we use the contrapositive of Corollary 4 to argue that if S is the optimal information acquisition strategy, then we cannot find any history of realized divisions such that (*) always holds. Technically speaking, one might worry that although S̃ strictly improves upon S in terms of posterior variances, it might achieve the same expected payoff as S (for instance, when the DM faces a constant payoff function). Nonetheless, by Zorn's lemma we can choose S to be an optimal strategy that is additionally "un-dominated" in terms of posterior variances. With that choice, the deviation S̃ cannot exist, and our arguments remain valid. (Note that our theorems only state that the DM has an optimal strategy.)

To illustrate, we now derive the asymptotic signal proportions under the optimal information acquisition strategy S.

Lemma 4. Suppose S is the optimal information acquisition strategy, and d(·) are its induced divisions. Let λ_k be defined as in (6). In generic informational environments, the difference d_k(T) − λ_k · T remains bounded as T → ∞, for any realized division d(T) and each signal k.

Proof. For this proof, we only need the informational environment to be such that each signal has strictly positive marginal value. That is, for any signal k and any possible division q, we require

    f(q_k + 1, q_{−k}) < f(q).

This is "generically" satisfied, because any equality f(q_k + 1, q_{−k}) = f(q) would impose a non-trivial polynomial equation on the signal linear coefficients, and the number of such constraints is at most countable.

Under this genericity assumption, let us first show that d_k(T) → ∞ holds for each signal k, where the speed of divergence depends only on the informational environment. For contradiction, suppose this is not true. Then we can find a sequence of histories {h^{T_m}} such that T_m → ∞ but d_1(T_m) remains bounded (these histories need not nest one another). By passing to a subsequence, we may assume q_k = lim_{m→∞} d_k(T_m) exists for every signal k, where this limit may be infinity. Define I to be the non-empty subset of signals (not including signal 1) with q_k = ∞. Furthermore, we may assume that the signal observed in the last period of each of these histories h^{T_m} is the same signal i. We also assume i ∈ I; otherwise, just truncate the histories by finitely many periods.

Take any signal j ∉ I (for instance, j = 1 works). Choose T_m sufficiently large and consider the (i,j)-switch deviation S̃ that deviates from h^{T_m} by observing signal j instead of i in period T_m. We will verify (*) for all possible divisions d(t) with d_j(t) = d_j(T_m) and d_i(t) ≥ d_i(T_m), which will contradict the optimality of S via Corollary 4.
Indeed, note that as T_m → ∞, d_i(T_m) → ∞ because i ∈ I. Since d_i(t) ≥ d_i(T_m), the LHS of (*) approaches zero as T_m increases. By comparison, the RHS of (*) is bounded away from zero because d_j(t) = d_j(T_m) is bounded, and each signal has strictly positive marginal value. Hence (*) holds, and we have shown that d(T) → ∞ in each coordinate.

Next, from (7), we have the following approximations for the partial derivatives:

    |∂_i f(d_i(t) − 1, d_j(t), d_{−ij}(t))| ∼ σ_i² · [Q_i]_{11} / d_i(t)²,   |∂_j f(d_i(t) − 1, d_j(t), d_{−ij}(t))| ∼ σ_j² · [Q_j]_{11} / d_j(t)².

If lim sup_{t_0→∞} d_i(t_0)/d_j(t_0) > λ_i/λ_j (recall that λ_i is proportional to σ_i · √[Q_i]_{11}), then the above estimates would imply (*) whenever d_i(t) ≥ d_i(t_0) (because t ≥ t_0) and d_j(t) = d_j(t_0). That would contradict the optimality of S. Hence lim sup_{t_0→∞} d_i(t_0)/d_j(t_0) ≤ λ_i/λ_j for every pair of signals i and j. It follows that d_k(t_0) ∼ λ_k · t_0 for all k.

Once these asymptotic proportions are proved, we know that the matrix Σ = C V_0 C' + D^{-1} converges to C V_0 C' at the rate 1/t. By (7), we can deduce the more precise approximations

    |∂_i f(d_i(t) − 1, ···)| = σ_i² · ([Q_i]_{11} + O(1/t)) / d_i(t)²,   |∂_j f(d_i(t) − 1, ···)| = σ_j² · ([Q_j]_{11} + O(1/t)) / d_j(t)².

If d_i(t_0)/d_j(t_0) > λ_i/λ_j + O(1/t_0), then these refined estimates would again imply (*) whenever d_i(t) ≥ d_i(t_0) and d_j(t) = d_j(t_0). To avoid the resulting contradiction, we must have d_i(t_0)/d_j(t_0) ≤ λ_i/λ_j + O(1/t_0) for every signal pair. This enables us to conclude that d_k(t_0) = λ_k · t_0 + O(1), as desired.
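The proportions in Lemma 4 can also be observed numerically. The sketch below (a made-up two-signal environment, with none of the numbers taken from the paper) computes λ_i from (6), finds the exact t-optimal division by enumeration, and checks that n_1(t)/t is close to λ_1:

```python
import numpy as np

# Illustrative environment (made-up numbers).
V0 = np.array([[1.0, 0.5],
               [0.5, 2.0]])
C = np.array([[1.0, 1.0],
              [1.0, -1.0]])
sigma2 = np.array([1.0, 2.0])

def f(q1, q2):
    """Posterior variance of theta_1, eq. (1); requires q1, q2 > 0."""
    Sigma = C @ V0 @ C.T + np.diag(sigma2 / np.array([q1, q2], dtype=float))
    return (V0 - V0 @ C.T @ np.linalg.solve(Sigma, C @ V0))[0, 0]

# lambda_i from eq. (6): proportional to |[C^{-1}]_{1i}| * sigma_i.
weights = np.abs(np.linalg.inv(C)[0]) * np.sqrt(sigma2)
lam = weights / weights.sum()

# Exact t-optimal division by brute force, then compare with lambda_1 * t.
t = 200
n1 = min(range(1, t), key=lambda q1: f(q1, t - q1))
assert abs(n1 / t - lam[0]) < 0.05   # n_1(t) is approximately lambda_1 * t
```

The tolerance 0.05 is an arbitrary slack for the O(1) term at this horizon; the proportions tighten as t grows.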
9.6 Proof of Theorem 3 (Generic Eventual Myopia)

9.6.1 Outline of the Proof

To guide the reader through this appendix, we begin by outlining the proof of the theorem, which is broken down into several steps. Throughout, we focus on the case of B = 1 (one signal each period), but our proof easily extends to arbitrary B. We will first show a simpler (and weaker) result: in generic environments, the set of periods at which the optimal strategy coincides with the t-optimal division has natural density 1. Our proof of this result is based on the observation that if equivalence fails at some time t, there must be two different divisions over signals whose resulting posterior variances about θ_1 are within O(1/t⁴) of each other. This leads to a Diophantine approximation inequality, which we show can occur only at a vanishing fraction of periods t.

To improve the result and demonstrate equivalence at all late periods, we show that the number of "exceptional periods" t is generically finite if there are three different divisions over signals whose posterior variances are within O(1/t⁴) of each other. This allows us to conclude that in generic environments, the t-optimal divisions eventually increase monotonically in t. In such environments, t-optimality can be achieved at every late period. Thus, whenever t-optimality obtains in some late period, it will be sustained in all future periods. Since we have already established that the optimal strategy achieves t-optimality infinitely often, we conclude equivalence at all large t.

We highlight that in this appendix we use a slightly different notion of "generic": we fix the signal coefficient matrix C and instead (randomly) vary the signal variances {σ_i²}. This concept implies (and is stronger than) the previous genericity concept defined on C.
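The density-zero step in the outline rests on the equidistribution of fractional parts of irrational multiples. A quick numerical illustration (with α = √2 standing in for an irrational ratio λ_2/λ_1, and the target and constant chosen arbitrarily, purely for intuition): the fraction of periods t ≤ N at which α·t lands within a shrinking window c/t of a fixed point on the unit circle vanishes as N grows:

```python
import math

alpha = math.sqrt(2)   # stand-in for an irrational ratio lambda_2 / lambda_1
target = 0.37          # arbitrary point on the unit circle R/Z
c = 5.0                # arbitrary constant in the shrinking window c / t

def circle_dist(x, y):
    """Distance between x and y modulo 1."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def hit_density(N):
    """Fraction of t <= N with alpha * t within c / t of target (mod 1)."""
    hits = sum(1 for t in range(1, N + 1)
               if circle_dist(alpha * t, target) <= c / t)
    return hits / N

# The density of "exceptional" periods shrinks as the horizon grows,
# mirroring the natural-density-zero conclusion of the Diophantine step.
d_small, d_large = hit_density(10_000), hit_density(100_000)
assert d_large < d_small < 0.05
```

Heuristically, the expected number of hits up to N grows only logarithmically (roughly like Σ 2c/t), so the density decays like (log N)/N.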
9.6.2 Equivalence at Almost All Times

We begin by proving a weaker result: the optimal strategy induces the t-optimal division n(t) at almost all periods t.

Proposition 2. Suppose the informational environment (V_0, C, {σ_i²}) has the property that for any i ≠ j, the ratio λ_i/λ_j is an irrational number. Then, at a set of times with natural density 1,^44 d(t) = n(t) (which is unique) holds for every decision problem. In particular, the optimal strategy induces a deterministic division vector at such times.

Proof of Proposition 2. Suppose that d_1(T) ≥ n_1(T) + 1 and d_2(T) ≤ n_2(T) − 1. Consider the last period t_0 ≤ T at which the optimal strategy observed signal 1. Then

    d_1(t_0) = d_1(T) ≥ n_1(T) + 1;   d_2(t_0) ≤ d_2(T) ≤ n_2(T) − 1.

Using the contrapositive of Corollary 4 with the (1,2)-switch, we know that (*) cannot always hold. Thus there exists a division d(t) for which the inequality (*) is reversed. That is, we can find a division d(t) with d_1(t) ≥ d_1(t_0) and d_2(t) = d_2(t_0) such that (dropping the absolute values)

    ∂_1 f(d_1(t) − 1, d_2(t), d_{−12}(t)) ≤ ∂_2 f(d_1(t) − 1, d_2(t), d_{−12}(t)).    (10)

On the other hand, t-optimality of n(t) gives us

    ∂_1 f(n_1(T), n_2(T) − 1, n_{−12}(T)) ≥ ∂_2 f(n_1(T), n_2(T) − 1, n_{−12}(T)).    (11)

Note that d_2(t) = d_2(t_0) implies that t − t_0 is bounded (due to Lemma 4). On the other hand, we have d_1(T) = d_1(t_0) by construction (t_0 is the last period at which signal 1 was observed). Hence T − t_0 is also bounded. Combining both, we deduce that t − T must be bounded.

^44 Formally, for any set A of positive integers, let A(N) count the number of integers in A no greater than N. We define the natural density of A to be lim_{N→∞} A(N)/N, when this limit exists.
Applying Lemma 4 again, we know that every difference d_i(t) − n_i(T) is bounded. Now, because d_1(t) − 1 ≥ d_1(t_0) − 1 ≥ n_1(T), the LHS of (10) is at least the LHS of (11) minus a finite number of cross partial derivatives ∂_{1j}. Similarly, the RHS of (10) exceeds the RHS of (11) by at most a finite number of cross partials. Together with the order difference lemma, these imply that the only way (10) and (11) can both hold is if the two sides of (11) differ by at most O(1/T^4). To summarize: a necessary condition for d_1(T) ≥ n_1(T) + 1 and d_2(T) ≤ n_2(T) − 1 is that

  |f(n_1(T) + 1, n_2(T) − 1, ..., n_K(T)) − f(n(T))| = O(1/T^4).  (12)

Hence, to prove the Proposition we only need to show that (12) holds at a set of times with natural density 0. The following lemma proves exactly this property.

Lemma 5. Suppose λ_1/λ_2 is an irrational number. For positive constants c_0, c_1, define A(c_0, c_1) to be the following set of positive integers:

  {t : ∃ q_1, q_2, ..., q_K ∈ Z_+ s.t. |q_i − λ_i·t| ≤ c_0 ∀i, and |f(q_1, q_2 + 1, ..., q_K) − f(q_1 + 1, q_2, ..., q_K)| ≤ c_1/t^4}.

Then A(c_0, c_1) has natural density zero.

Proof of Lemma 5. The proof relies on the following technical result, which gives a precise approximation of the discrete partial derivatives of f:

Lemma 6. Fix the informational environment. There exists a constant a_j such that

  f(q_j, q_{−j}) − f(q_j + 1, q_{−j}) = σ_j^2·[Q_j]_11 / (q_j − a_j)^2 + O(1/t^4; c_0)  (13)

holds for all q_1, ..., q_K with |q_i − λ_i t| ≤ c_0, ∀i. The notation O(1/t^4; c_0) means an upper bound of L/t^4, where the constant L may depend on the informational environment as well as on c_0.

Assuming (13), we see that the condition |f(q_1, q_2 + 1, . . .
, q_K) − f(q_1 + 1, q_2, ..., q_K)| ≤ c_1/t^4 implies

  |σ_1^2·[Q_1]_11/(q_1 − a_1)^2 − σ_2^2·[Q_2]_11/(q_2 − a_2)^2| ≤ c_2/t^4

and thus

  |(λ_1/(q_1 − a_1))^2 − (λ_2/(q_2 − a_2))^2| ≤ c_3/t^4

for some larger positive constants c_2, c_3. This further implies

  |λ_1/(q_1 − a_1) − λ_2/(q_2 − a_2)| ≤ c_4/t^3,

which reduces to

  |(q_2 − a_2) − (λ_2/λ_1)·(q_1 − a_1)| ≤ c_5/t.  (14)

This inequality says that the fractional part of (λ_2/λ_1)·q_1 is very close to the fractional part of (λ_2/λ_1)·a_1 − a_2. But since λ_2/λ_1 is an irrational number, the fractional part of (λ_2/λ_1)·q_1 is "equidistributed" in (0, 1) as q_1 ranges over the positive integers. Thus the Diophantine approximation (14) has solutions only at a set of times t with natural density 0, proving Lemma 5.

Below we supply the technically involved proof of (13).

Proof of Lemma 6. Fix q_1, ..., q_K and the signal j. Recall the diagonal matrix D = diag(q_1/σ_1^2, ..., q_K/σ_K^2). Consider any q̂_j ∈ [q_j, q_j + 1] and let D̂ be the analogue of D for the division (q̂_j, q_{−j}); that is, D̂ = D except that [D̂]_jj = q̂_j/σ_j^2. Let Σ̂ = C V_0 C' + D̂^{-1}. From (7), we have

  ∂_j f(q̂_j, q_{−j}) = −(σ_j^2/q̂_j^2)·[V_0 C' Σ̂^{-1} ∆_jj Σ̂^{-1} C V_0]_11.  (15)

Here and later in this proof, ∂_j f represents the usual continuous derivative rather than the discrete derivative. Let D_0 = diag(λ_1 t/σ_1^2, ..., λ_K t/σ_K^2) and Σ_0 = C V_0 C' + D_0^{-1}. For |q_i − λ_i t| ≤ c_0, ∀i, we have D̂ − D_0 = O(c_0), where the big-O notation applies entry-wise. It follows that

  Σ̂ = C V_0 C' + D̂^{-1} = C V_0 C' + D_0^{-1} + O(1/t^2; c_0) = Σ_0 + O(1/t^2; c_0).

(In applying Lemma 5 to prove Proposition 2, c_0 is taken to be the bound on |n_i − λ_i·t|.)
(The Equidistribution Theorem invoked above states that for any irrational number α and any sub-interval (a, b) ⊂ (0, 1), the set of positive integers n such that the fractional part of αn belongs to (a, b) has natural density b − a. It is a special case of the Ergodic Theorem.)

Observe that the matrix inverse is a differentiable mapping at Σ_0 (which equals C V_0 C' + D_0^{-1} and is thus positive definite). Thus we have

  Σ̂^{-1} = Σ_0^{-1} + O(1/t^2; c_0).

Plugging this into (15) and using q̂_j ∼ λ_j t, we obtain

  ∂_j f(q̂_j, q_{−j}) = −(σ_j^2/q̂_j^2)·[V_0 C' Σ_0^{-1} ∆_jj Σ_0^{-1} C V_0]_11 + O(1/t^4; c_0).  (16)

Since Σ_0 = C V_0 C' + (1/t)·diag(σ_1^2/λ_1, ..., σ_K^2/λ_K), we can apply a Taylor expansion (to the matrix inverse map) and write

  Σ_0^{-1} = (C V_0 C')^{-1} − (1/t)·(C V_0 C')^{-1}·diag(σ_1^2/λ_1, ..., σ_K^2/λ_K)·(C V_0 C')^{-1} + O(1/t^2).  (17)

This implies

  V_0 C' Σ_0^{-1} ∆_jj Σ_0^{-1} C V_0 = V_0 C' (C V_0 C')^{-1} ∆_jj (C V_0 C')^{-1} C V_0 − M_j/t + O(1/t^2) = Q_j − M_j/t + O(1/t^2),  (18)

where M_j is a fixed K × K matrix depending only on the informational environment. For future use, we note that

  M_j = V_0 C' (C V_0 C')^{-1} diag(σ_1^2/λ_1, ..., σ_K^2/λ_K) (C V_0 C')^{-1} ∆_jj (C V_0 C')^{-1} C V_0
      + V_0 C' (C V_0 C')^{-1} ∆_jj (C V_0 C')^{-1} diag(σ_1^2/λ_1, ..., σ_K^2/λ_K) (C V_0 C')^{-1} C V_0
      = C^{-1} diag(σ_1^2/λ_1, ..., σ_K^2/λ_K) (C V_0 C')^{-1} ∆_jj C'^{-1} + C^{-1} ∆_jj (C V_0 C')^{-1} diag(σ_1^2/λ_1, ..., σ_K^2/λ_K) C'^{-1}.  (19)

Using (18), we can simplify (16) to

  ∂_j f(q̂_j, q_{−j}) = −(σ_j^2/q̂_j^2)·[Q_j − M_j/t]_11 + O(1/t^4; c_0).  (20)

Integrating this over q̂_j ∈ [q_j, q_j + 1], we conclude that

  f(q_j, q_{−j}) − f(q_j + 1, q_{−j}) = σ_j^2/(q_j(q_j + 1))·[Q_j − M_j/t]_11 + O(1/t^4; c_0).  (21)

We set a_j = −λ_j·[M_j]_11/(2[Q_j]_11) − 1/2.
Then

  σ_j^2/(q_j(q_j + 1))·[Q_j − M_j/t]_11 = (σ_j^2·[Q_j]_11)·(1 + (2a_j + 1)/(λ_j t))/(q_j(q_j + 1)) = σ_j^2·[Q_j]_11/(q_j − a_j)^2 + O(1/t^4; c_0),

implying the desired approximation (13). The last equality above uses

  (1 + (2a_j + 1)/(λ_j t))/(q_j(q_j + 1)) = 1/(q_j − a_j)^2 + O(1/t^4; c_0),

which holds because, dividing through by q_j(q_j + 1),

  q_j(q_j + 1)/(q_j − a_j)^2 = 1 + (2a_j + 1)/(q_j − a_j) + O(1/(q_j − a_j)^2) = 1 + (2a_j + 1)/(λ_j t) + O(1/t^2; c_0).

9.6.3 A Simultaneous Diophantine Approximation Problem

Lemma 5 above tells us that at most times t, there does not exist a pair of divisions (differing minimally in two signal counts) whose posterior variances are within c_1/t^4 of each other. The next lemma obtains a stronger conclusion when a triple of such divisions exists.

Lemma 7. Fix V_0 and C, and let the signal variances vary. For positive constants c_0, c_1, define A*(c_0, c_1) to be the following set of positive integers:

  {t : ∃ q_1, q_2, q_3, ..., q_K ∈ Z_+ s.t. |q_i − λ_i t| ≤ c_0 ∀i,
    |f(q_1, q_2 + 1, q_3, ..., q_K) − f(q_1 + 1, q_2, q_3, ..., q_K)| ≤ c_1/t^4, and
    |f(q_1, q_2, q_3 + 1, ..., q_K) − f(q_1 + 1, q_2, q_3, ..., q_K)| ≤ c_1/t^4}.

Then, for generic signal variances, A*(c_0, c_1) has finite cardinality.

Proof. So far we have been dealing with fixed informational environments. However, a number of the parameters defined above depend on the signal variances σ = {σ_i^2}_{i=1}^K. Specifically, while the matrix Q_i = C^{-1} ∆_ii C'^{-1} is independent of σ, the asymptotic proportions λ_i ∝ σ_i·([Q_i]_11)^{1/2} do vary with σ. In this proof, we write λ_i(σ) to highlight this dependence. Next, we recall the matrix M_j introduced in (19). We note that for fixed matrices V_0 and C, each entry of M_j(σ) is a fixed linear combination of σ_1^2/λ_1(σ), ..., σ_K^2/λ_K(σ).
Then, the parameter a_j(σ) in (13) is given by (see the previous proof)

  a_j(σ) = −1/2 − λ_j(σ)·[M_j(σ)]_11/(2[Q_j]_11) = −1/2 + λ_j(σ)·Σ_{i=1}^K b̃_i^j·σ_i^2/λ_i(σ) = −1/2 + Σ_{i=1}^K b_i^j·σ_i σ_j  (22)

for some constants b̃_i^j, b_i^j independent of σ. In the last equality above, we used the fact that λ_j(σ)/λ_i(σ) equals a constant times σ_j/σ_i. Thus Lemma 6 gives

  f(q_j, q_{−j}) − f(q_j + 1, q_{−j}) = σ_j^2·[Q_j]_11/(q_j − a_j(σ))^2 + O(1/t^4; c_0)

whenever |q_i − λ_i(σ)·t| ≤ c_0, ∀i. We remark that the big-O constant here may depend on σ; however, a single constant suffices if we restrict each σ_i to be bounded above and bounded away from zero. Since measure-zero sets are closed under countable unions, this restriction does not affect the result we want to prove.

By the above approximation, a necessary condition for t ∈ A*(c_0, c_1) is that q_1, q_2, q_3 satisfy

  |(q_2 − a_2(σ)) − η·(σ_2/σ_1)·(q_1 − a_1(σ))| ≤ c_6/q_1  (23)

as well as

  |(q_3 − a_3(σ)) − κ·(σ_3/σ_1)·(q_1 − a_1(σ))| ≤ c_6/q_1  (24)

for some constant c_6 independent of σ (c_6 may depend on the constants c_0, c_1 stated in the lemma). The constant η is given by η = ([Q_2]_11/[Q_1]_11)^{1/2}, and similarly for κ.

It remains to show that for generic σ, only finitely many positive integer triples (q_1, q_2, q_3) satisfy the simultaneous Diophantine approximations (23) and (24). To prove this, we assume that each σ_i is drawn i.i.d. from the uniform distribution on [1/L, L], where L is a large constant. Denote by F(q_1, q_2, q_3) the event that (23) and (24) hold simultaneously. We claim that there exists a constant c_7 such that P(F(q_1, q_2, q_3)) ≤ c_7/q_1^4 holds for all q_1, q_2, q_3.
Since F(q_1, q_2, q_3) cannot occur for q_2, q_3 > c_8 q_1, this claim will imply

  Σ_{q_1, q_2, q_3} P(F(q_1, q_2, q_3)) < Σ_{q_1} Σ_{q_2, q_3 ≤ c_8 q_1} c_7/q_1^4 < Σ_{q_1} c_7 c_8^2/q_1^2 < ∞.  (25)

Generic finiteness of the tuples (q_1, q_2, q_3) will then follow from the Borel–Cantelli Lemma. (Because of the use of the Borel–Cantelli Lemma, this proof, unlike that of Lemma 5 above, does not allow us to effectively determine for a given σ whether (23) and (24) have only finitely many integer solutions. Nonetheless, a modification of this proof does imply the following finite-time probabilistic statement: when σ_1, ..., σ_K are independently drawn, the probability that the optimal strategy coincides with t-optimality at every period t ≥ T is at least 1 − O(1/T), where the constant involved depends only on the distribution of σ.)

To prove the claim, it suffices to show that if σ = (σ_1, σ_2, σ_3, σ_4, ..., σ_K) and σ' = (σ_1, σ'_2, σ'_3, σ_4, ..., σ_K) both satisfy (23) and (24), then |σ_2 − σ'_2|, |σ_3 − σ'_3| ≤ c/q_1^2 for some constant c. Without loss, we assume |σ_2 − σ'_2| ≥ |σ_3 − σ'_3|. Using (22), we can rewrite the condition (23) as

  |A + B| ≤ c_6/q_1,  where  A = (q_2 + 1/2) − η·(σ_2/σ_1)·(q_1 + 1/2)  and  B = Σ_i β_i·σ_2 σ_i,

for some constants β_i independent of σ. A similar inequality |A' + B'| ≤ c_6/q_1 holds at σ', with A', B' defined analogously using σ'_2, σ'_3. It follows from these two inequalities that |A + B − A' − B'| ≤ 2c_6/q_1. Furthermore, since |A − A'| ≤ |A + B − A' − B'| + |B − B'| (by the triangle inequality), we deduce

  |η·((σ'_2 − σ_2)/σ_1)·(q_1 + 1/2)| ≤ 2c_6/q_1 + |Σ_i β_i·(σ'_2 σ'_i − σ_2 σ_i)|.  (26)

Because σ'_i = σ_i for i ≠ 2, 3, we have

  Σ_i β_i·(σ'_2 σ'_i − σ_2 σ_i) = Σ_i β_i·(σ'_2 − σ_2)·σ_i + Σ_i β_i·σ'_2·(σ'_i − σ_i) = (Σ_i β_i·(σ'_2 − σ_2)·σ_i)
+ β_2·σ'_2·(σ'_2 − σ_2) + β_3·σ'_2·(σ'_3 − σ_3), and the absolute value of this expression is at most (K + 2)·L·max_i |β_i|·|σ'_2 − σ_2|. Plugging this estimate into (26), we obtain the desired result |σ_2 − σ'_2| ≤ c/q_1^2. (This in fact shows that the probability of the event F(q_1, q_2, q_3) conditional on any value of σ_1, σ_4, ..., σ_K is bounded by c_7/q_1^4, which is stronger than the claim.) This completes the proof of the lemma.

9.6.4 Monotonicity of t-Optimal Divisions

We apply Lemma 7 to prove the eventual monotonicity of t-optimal divisions in generic informational environments.

Proposition 3. Fix V_0 and C. For generic signal variances {σ_i^2}_{i=1}^K, there exists T_0 such that for all t ≥ T_0, the t-optimal division n(t) is unique and satisfies n_i(t + 1) ≥ n_i(t), ∀i.

Proof. Uniqueness follows from the stronger fact that in generic informational environments, f(q_1, ..., q_K) differs from f(q'_1, ..., q'_K) whenever q ≠ q'. Below we focus on monotonicity.

Using the order difference lemma, we can already deduce that each difference |n_i(t + 1) − n_i(t)| is at most 1 at sufficiently late periods t. Suppose that n_1(t + 1) = n_1(t) − 1. Then, because Σ_i (n_i(t + 1) − n_i(t)) = 1, we can without loss assume n_2(t + 1) = n_2(t) + 1 and n_3(t + 1) = n_3(t) + 1. For notational ease, write n_i = n_i(t) and n'_i = n_i(t + 1). By t-optimality, we have

  f(n_1, n_2, n_3, ..., n_K) ≤ f(n_1 − 1, n_2 + 1, n_3, ..., n_K),
  f(n'_1, n'_2, n'_3, ..., n'_K) ≤ f(n'_1 + 1, n'_2 − 1, n'_3, ..., n'_K).

These inequalities are equivalent to

  ∂_2 f(n_1 − 1, n_2, n_3, ..., n_K) ≥ ∂_1 f(n_1 − 1, n_2, n_3, ..., n_K),  (27)
  ∂_2 f(n'_1, n'_2 − 1, n'_3, ..., n'_K) ≤ ∂_1 f(n'_1, n'_2 − 1, n'_3, ..., n'_K),  (28)

with ∂_i f representing the discrete partial derivative.
Since n'_2 − 1 = n_2, the LHS of (28) is at least the LHS of (27) minus a number of cross partials. Similarly, the RHS of (28) exceeds the RHS of (27) by at most a number of cross partials. Thus the only way (27) and (28) can both hold is if the two sides of (27) differ by no more than O(1/t^4). That is, for some absolute constant c_1 (as discussed in the proof of Lemma 7, a single constant c_1 works for all σ bounded above and bounded away from zero), we have

  |f(n_1 − 1, n_2 + 1, n_3, ..., n_K) − f(n_1, n_2, n_3, ..., n_K)| ≤ c_1/t^4.  (29)

An analogous argument yields

  |f(n_1 − 1, n_2, n_3 + 1, ..., n_K) − f(n_1, n_2, n_3, ..., n_K)| ≤ c_1/t^4.  (30)

But now we can apply Lemma 7 to show that in generic environments, only finitely many integer tuples (n_1, ..., n_K) satisfy both (29) and (30). This proves the result.

9.6.5 Completing the Proof of Theorem 3

By Proposition 3, generically there exists T_0 such that n(t) is monotonic in t after T_0 periods. Thus, using our dynamic Blackwell lemma, if the DM achieves t-optimality at some period t ≥ T_0, he will continue to do so at all later periods. By Proposition 2, such a time t does exist. This proves Theorem 3.

9.7 Proof of Proposition 1 (Bound on B)

9.7.1 Preliminary Estimates

Throughout, we work with the linearly-transformed model, in which each signal X_i is simply θ̃_i plus standard Gaussian noise, and the DM's prior covariance matrix over the transformed states is Ṽ. Let γ = γ(q_1, ..., q_K) denote the following K × 1 vector:

  γ = (Ṽ + E)^{-1}·Ṽ·w  (31)

with E = diag(1/q_1, ..., 1/q_K). For 1 ≤ i ≤ K, γ_i denotes the i-th coordinate of γ. Here we re-derive the posterior variance function f together with its (usual continuous) first and second derivatives.
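The closed forms re-derived below are easy to sanity-check numerically. The following sketch is our own illustration (Ṽ, w, and the division q are arbitrary test values, not from the paper): it computes f and γ directly from their definitions and compares the formula ∂_i f = −γ_i^2/q_i^2 (Fact 2 below) against a finite-difference derivative.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

V = [[1.0, 0.5], [0.5, 2.0]]   # prior covariance V~ (test value)
w = [1.0, -1.0]                # payoff weights (test value)

def gamma(q):
    """gamma = (V~ + E)^{-1} V~ w  with E = diag(1/q_i), as in (31)."""
    S = [[V[0][0] + 1.0 / q[0], V[0][1]],
         [V[1][0], V[1][1] + 1.0 / q[1]]]
    return matvec(inv2(S), matvec(V, w))

def f(q):
    """Posterior variance f = w'(V~ - V~ (V~+E)^{-1} V~) w."""
    Vw = matvec(V, w)
    g = gamma(q)
    return sum(w[i] * Vw[i] for i in range(2)) - sum(Vw[i] * g[i] for i in range(2))

q = [3.0, 5.0]                 # division vector (test value)
g = gamma(q)
h = 1e-6
results = []
for i in range(2):
    qp = list(q); qp[i] += h
    qm = list(q); qm[i] -= h
    numeric = (f(qp) - f(qm)) / (2 * h)   # finite-difference derivative
    closed = -g[i] ** 2 / q[i] ** 2       # closed form: df/dq_i = -gamma_i^2/q_i^2
    results.append((numeric, closed))
    print(i, numeric, closed)             # the two columns agree
```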
Our formulae below take Ṽ and w as primitives, but they are equivalent to those presented in Appendix 9.1 (for the original model).

Fact 1 (Posterior Variance). f(q_1, ..., q_K) = w'·(Ṽ − Ṽ(Ṽ + E)^{-1}Ṽ)·w.

Fact 2 (Partial Derivatives of Posterior Variance).

  ∂_i f(q_1, ..., q_K) = −(1/q_i^2)·w'Ṽ(Ṽ + E)^{-1} ∆_ii (Ṽ + E)^{-1}Ṽw = −γ_i^2/q_i^2.

Fact 3 (Second-Order Partial Derivatives of Posterior Variance).

  ∂_ii f(q_1, ..., q_K) = (2/q_i^3)·w'Ṽ(Ṽ + E)^{-1} ∆_ii (Ṽ + E)^{-1}Ṽw − (2/q_i^4)·w'Ṽ(Ṽ + E)^{-1} ∆_ii (Ṽ + E)^{-1} ∆_ii (Ṽ + E)^{-1}Ṽw = (2γ_i^2/q_i^3)·(1 − [(Ṽ + E)^{-1}]_ii/q_i).

Fact 4 (Cross-Partial Derivatives of Posterior Variance).

  ∂_ij f(q_1, ..., q_K) = −(2/(q_i^2 q_j^2))·w'Ṽ(Ṽ + E)^{-1} ∆_ii (Ṽ + E)^{-1} ∆_jj (Ṽ + E)^{-1}Ṽw = −(2γ_i γ_j/(q_i^2 q_j^2))·[(Ṽ + E)^{-1}]_ij.

All of the above facts can be proved by straightforward linear algebra, so we omit the details.

9.7.2 Refined Asymptotic Characterization of n(t)

We now specialize to w = 1 and establish the next lemma, which refines the asymptotic characterization of n(t) in Appendix 9.3. Proposition 1 will follow immediately. (It is easy to see that in the transformed model λ_i ∝ |w_i|, so λ_i = 1/K here.)

Lemma 8. For t ≥ 8(R + 1)K√K, it holds that |n_i(t) − t/K| ≤ 4(R + 1)√K.

Proof. Note from (31) that (Ṽ + E)γ = Ṽw. So Ṽ(w − γ) = Eγ = (γ_1/q_1, ..., γ_K/q_K)', and

  w − γ = Ṽ^{-1}·(γ_1/q_1, ..., γ_K/q_K)'.

From the definition of the operator norm, we deduce

  Σ_{i=1}^K (1 − γ_i)^2 = ‖w − γ‖^2 ≤ R^2·(Σ_{j=1}^K γ_j^2/q_j^2).  (32)

This holds for any division vector q and the corresponding γ (which is a function of q). Now suppose without loss of generality that n_1(t) ≥ t/K. Let q = (n_1(t) − 1, n_2(t), ..., n_K(t)) and consider the corresponding γ.
Then from t-optimality we have

  |f(q_1 + 1, q_{−1}) − f(q)| ≥ |f(q_j + 1, q_{−j}) − f(q)|, ∀j.

Note that the discrete partial derivatives above are related to the usual continuous partials by the following inequalities (the RHS follows from the convexity of f; the LHS can be proved using Fact 2 and Fact 3, noting that γ_j^2 is an increasing function of q_j because ∂γ_j(q)/∂q_j = (γ_j/q_j^2)·[(Ṽ + E)^{-1}]_jj has the same sign as γ_j):

  γ_j^2/(q_j(q_j + 1)) ≤ |f(q_j + 1, q_{−j}) − f(q)| ≤ γ_j^2/q_j^2.

We therefore deduce

  γ_1^2/q_1^2 ≥ γ_j^2/(q_j(q_j + 1)), ∀j.  (33)

Combining (32) and (33) and using 1/q_j^2 ≤ 2/(q_j(q_j + 1)), we see that

  Σ_{i=1}^K (1 − γ_i)^2 ≤ 2R^2 K·γ_1^2/q_1^2.  (34)

In particular, we know that γ_1 − 1 ≤ R·√(2K)·γ_1/q_1. It is easy to see that this implies

  γ_1 ≤ 1 + 2R√K/q_1 ≤ √2  (35)

whenever q_1 = n_1(t) − 1 ≥ t/K − 1 ≥ (2√2 + 2)R√K. Plugging this back into the RHS of (34), we then obtain

  γ_j ≥ 1 − 2R√K/q_1 ≥ 2 − √2.  (36)

Now use (33), (35) and (36) to deduce that

  q_j + 1 ≥ (γ_j/γ_1)·q_1 ≥ [(1 − 2R√K/q_1)/(1 + 2R√K/q_1)]·q_1 ≥ (1 − 4R√K/q_1)·q_1 = q_1 − 4R√K.

Recall that q_j = n_j(t) for j > 1 and q_1 = n_1(t) − 1. We thus have

  n_j(t) ≥ n_1(t) − 4R√K − 2.  (37)

Since n_1(t) ≥ t/K, the above implies n_j(t) ≥ t/K − 4(R + 1)√K for each signal j. This proves half of the lemma. For the other half, note that n_j(t) ≤ t/K must hold for some signal j. Thus (37) yields n_1(t) ≤ t/K + 4(R + 1)√K. This is true not only for signal 1 but for any signal i with n_i(t) ≥ t/K. So we conclude that n_i(t) ≤ t/K + 4(R + 1)√K for each signal i. The proof of the lemma is complete.

References

Aghion, P., P. Bolton, C. Harris, and B. Jullien (1991): "Optimal Learning by Experimentation," Review of Economic Studies, 58, 621–654.
Ando, T.
(1983): "On the Arithmetic-Geometric-Harmonic-Mean Inequalities for Positive Definite Matrices," Linear Algebra and its Applications, 52, 31–37.
Arrow, K. J., D. Blackwell, and M. A. Girshick (1949): "Bayes and Minimax Solutions of Sequential Decision Problems," Econometrica, 17, 213–244.
Banerjee, A. (1992): "A Simple Model of Herd Behavior," Quarterly Journal of Economics, 107, 797–817.
Banks, J. S. and R. K. Sundaram (1992): "A Class of Bandit Problems Yielding Myopically Optimal Strategies," Journal of Applied Probability, 29, 625–632.
Bardhi, A. (2018): "Optimal Discovery and Influence Through Selective Sampling," Working Paper.
Bergemann, D. and J. Välimäki (2002): "Information Acquisition and Efficient Mechanism Design," Econometrica, 70, 1007–1033.
Bergemann, D. and J. Välimäki (2008): "Bandit Problems," in The New Palgrave Dictionary of Economics, ed. by S. N. Durlauf and L. E. Blume, Basingstoke: Palgrave Macmillan.
Berry, D. and B. Fristedt (1988): "Optimality of Myopic Stopping Times for Geometric Discounting," Journal of Applied Probability, 25, 437–443.
Bikhchandani, S., D. Hirshleifer, and I. Welch (1992): "A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades," Journal of Political Economy, 100, 992–1026.
Bubeck, S., R. Munos, and G. Stoltz (2009): "Pure Exploration in Multi-armed Bandits Problems," in Algorithmic Learning Theory. ALT 2009. Lecture Notes in Computer Science, vol. 5809, Springer, Berlin, Heidelberg.
Callander, S. (2011): "Searching and Learning by Trial and Error," American Economic Review, 101, 2277–2308.
Che, Y.-K. and K. Mierendorff (2018): "Optimal Sequential Decision with Limited Attention," Working Paper.
Chernoff, H. (1972): Sequential Analysis and Optimal Design, Society for Industrial and Applied Mathematics.
Chick, S. E. and P.
Frazier (2012): "Sequential Sampling with Economics of Selection Procedures," Management Science, 58, 550–569.
Colombo, L., G. Femminis, and A. Pavan (2014): "Information Acquisition and Welfare," Review of Economic Studies, 81, 1438–1483.
Denti, T. (2018): "Unrestricted Information Acquisition," Working Paper.
Dewan, T. and D. P. Myatt (2008): "The Qualities of Leadership: Direction, Communication and Obfuscation," American Political Science Review, 102, 351–368.
Easley, D. and N. M. Kiefer (1988): "Controlling a Stochastic Process with Unknown Parameters," Econometrica, 56, 1045–1064.
Fudenberg, D., P. Strack, and T. Strzalecki (2018): "Stochastic Choice and Optimal Sequential Sampling," American Economic Review.
Garfagnini, U. and B. Strulovici (2016): "Social Experimentation with Interdependent and Expanding Technologies," Review of Economic Studies, 83, 1579–1613.
Gittins, J. C. (1979): "Bandit Processes and Dynamic Allocation Indices," Journal of the Royal Statistical Society, Series B, 148–177.
Greenshtein, E. (1996): "Comparison of Sequential Experiments," The Annals of Statistics, 24, 436–448.
Hansen, O. H. and E. N. Torgersen (1974): "Comparison of Linear Normal Experiments," The Annals of Statistics, 2, 367–373.
Hébert, B. and M. Woodford (2018): "Rational Inattention with Sequential Information Sampling," Working Paper.
Hellwig, C. and L. Veldkamp (2009): "Knowing What Others Know: Coordination Motives in Information Acquisition," The Review of Economic Studies, 76, 223–251.
Ke, T. T. and J. M. Villas-Boas (2017): "Optimal Learning Before Choice," Working Paper.
Keener, R. (1985): "Further Contributions to the Two-armed Bandit Problem," Annals of Statistics, 13, 418–422.
Lambert, N., M. Ostrovsky, and M. Panov (2018): "Strategic Trading in Informationally Complex Environments," Econometrica.
Liang, A. and X.
Mu (2018): "Overabundant Information and Learning Traps," Working Paper.
Mayskaya, T. (2017): "Dynamic Choice of Information Sources," Working Paper.
Mersereau, A. J., P. Rusmevichientong, and J. N. Tsitsiklis (2009): "A Structured Multiarmed Bandit Problem and the Greedy Policy," IEEE Transactions on Automatic Control, 54, 2787–2802.
Meyer, M. and J. Zwiebel (2007): "Learning and Self-Reinforcing Behavior," Working Paper.
Moscarini, G. and L. Smith (2001): "The Optimal Level of Experimentation," Econometrica, 69, 1629–1644.
Myatt, D. P. and C. Wallace (2012): "Endogenous Information Acquisition in Coordination Games," The Review of Economic Studies, 79, 340–374.
Persico, N. (2000): "Information Acquisition in Auctions," Econometrica, 68, 135–148.
Reinganum, J. (1983): "Nash Equilibrium Search for the Best Alternative," Journal of Economic Theory, 30, 139–152.
Robbins, H. (1952): "Some Aspects of the Sequential Design of Experiments," Bulletin of the American Mathematical Society, 58, 527–535.
Rothschild, M. (1974): "A Two-Armed Bandit Theory of Market Pricing," Journal of Economic Theory, 9, 185–202.
Russo, D. (2016): "Simple Bayesian Algorithms for Best-Arm Identification," Working Paper.
Sanjurjo, A. (2017): "Search with Multiple Attributes: Theory and Empirics," Games and Economic Behavior, 535–562.
Sethi, R. and M. Yildiz (2016): "Communication with Unknown Perspectives," Econometrica, 84, 2029–2069.
Steiner, J., C. Stewart, and F. Matějka (2017): "Rational Inattention Dynamics: Inertia and Delay in Decision-Making," Econometrica, 85, 521–553.
Taylor, C. R. (1995): "Digging for Golden Carrots: An Analysis of Research Tournaments," American Economic Review, 85, 872–890.
Wald, A. (1947): "Foundations of a General Theory of Sequential Decision Functions," Econometrica, 15, 279–313.
Weitzman, M. L. (1979): "Optimal Search for the Best Alternative," Econometrica, 47, 641–654.
Yang, M. (2015): "Coordination with Flexible Information Acquisition," Journal of Economic Theory, 158, 721–738.
Zhong, W. (2018): "Optimal Dynamic Information Acquisition," Working Paper.

10 Online Appendix

10.1 Applications of Results from Section 5 (Multi-player Games)

10.1.1 Beauty Contest

Hellwig and Veldkamp (2009) introduced a beauty contest game with endogenous one-shot information acquisition. We build on this by modifying the information acquisition stage so that players sequentially acquire information over many periods (rather than once) and face a capacity constraint each period (rather than costly signals). We show that the basic insights of Hellwig and Veldkamp (2009) hold in this setting.

Specifically, suppose that at an unknown final period, a unit mass of players simultaneously chooses prices p_i ∈ R to minimize the (normalized) squared distance between their price and an unknown target price p*, which depends on the unknown state ω and also on the average price p̄ = ∫ p_i di:

  u_i(p_i, p̄, ω) = −(1/(1 − r)^2)·(p_i − p*)^2,  where  p* = (1 − r)·ω + r·p̄.  (38)

The constant r ∈ (−1, 1) determines whether pricing decisions are complements or substitutes. In every period up until the final period, each player acquires B signals from the set (X_k^i), as in the framework we have developed. To closely mirror the setup in Hellwig and Veldkamp (2009), we set each θ_1^i = ω. Assuming "conditional independence" of players' signals, we can directly apply Corollary 1 and conclude that in every equilibrium, players choose a deterministic (myopic) sequence of information acquisitions. This result echoes Hellwig and Veldkamp (2009), who show that equilibrium is unique when players choose from private signals (see their Subsection 1.3.4). Our extension is to introduce dynamics and to show how the dynamic problem can be reduced to a static one.
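The comparative static developed in this subsection can be previewed with a toy computation. The sketch below is our own illustration (not the paper's code): it uses a stylized decreasing posterior-variance function Σ(n) = 1/(1 + n), a fixed final period t, and opponents concentrated on a single capacity level (a point-mass distribution), and checks that the cross difference of expected utilities is zero without strategic interaction, positive under complements, and negative under substitutes. All parameter values are illustrative.

```python
# Sigma(n): a stylized decreasing posterior variance after n signals
# (illustrative functional form, not derived from the model).
def Sigma(n):
    return 1.0 / (1.0 + n)

def EU(B_i, B_bar, r, t):
    """Expected utility when all opponents choose capacity B_bar
    (a point-mass distribution) and the final period is t."""
    return -Sigma(B_i * t) / (1 - r + r * Sigma(B_bar * t)) ** 2

def cross_difference(r, t=5, B=1, B_hat=3, Bbar=1, Bbar_hat=3):
    """EU(B, mu) + EU(B^, mu^) - EU(B, mu^) - EU(B^, mu)."""
    return (EU(B, Bbar, r, t) + EU(B_hat, Bbar_hat, r, t)
            - EU(B, Bbar_hat, r, t) - EU(B_hat, Bbar, r, t))

print(cross_difference(0.0))    # (numerically) zero: no strategic interaction
print(cross_difference(0.5))    # positive: decisions are complements
print(cross_difference(-0.5))   # negative: decisions are substitutes
```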
Let Σ(t) be the posterior variance about ω after the first t myopic observations. Since the players in our model acquire B signals each period, their (common) posterior variance at the end of t periods is given by Σ(Bt). Thus, conditional on period t being the final period, our game is as if the players acquire a batch of Bt signals and then choose prices. This means that equilibrium prices are determined in the same way as in Hellwig and Veldkamp (2009):

  p(I_i^{≤Bt}) = [(1 − r)/(1 − r + r·Σ(Bt))]·E(ω | I_i^{≤Bt}),  (39)

where I_i^{≤Bt} represents player i's information set, consisting of Bt signal realizations.

(Two remarks. First, when r > 0, best responses are increasing in the prices set by other players, so pricing decisions are complements; conversely, r < 0 implies decisions are substitutes. Second, Hellwig and Veldkamp (2009) also study a case in which players observe signals distorted by a common noise, which violates conditional independence; they show that multiple equilibria generally arise with such "public signals." Dewan and Myatt (2008), Myatt and Wallace (2012) and Colombo et al. (2014) restore a unique linear symmetric equilibrium by assuming perfectly divisible signals, similar to the continuous-time variant of our model. In contrast, our equilibrium analysis relies on the informational environment (i.e., conditional independence), but not on symmetry or linearity of the strategy.)

We can use this characterization of equilibrium to re-evaluate the main insight of Hellwig and Veldkamp (2009): the incentive to acquire more informative signals is increasing in aggregate information acquisition if decisions are complements, and decreasing if decisions are substitutes. For this purpose, we augment the model with a period 0, in which each player i invests in a capacity level B_i at some cost.
Afterwards, players acquire information myopically (under possibly differential capacity constraints) and participate in the beauty contest game. Let µ ∈ ∆(Z_+) be the distribution over capacity levels chosen by player i's opponents. Then player i's expected utility from choosing capacity B_i is given by

  EU(B_i, µ) = −E_{t∼π}[ Σ(B_i t) / (1 − r + r·∫ Σ(Bt) dµ(B))^2 ].

Above, the expectation is taken with respect to the random final period t distributed according to π, while inside the expectation, the term ∫ Σ(Bt) dµ(B) is the average posterior variance among the players. Similar to Proposition 1 in Hellwig and Veldkamp (2009), we have the following result:

Corollary 5. Suppose B̂_i > B_i and µ̂ > µ in the sense of first-order stochastic dominance. Then the sign of the difference

  EU(B_i, µ) + EU(B̂_i, µ̂) − EU(B_i, µ̂) − EU(B̂_i, µ)

is (a) zero, if there is no strategic interaction (r = 0); (b) positive, if decisions are complementary (r > 0); (c) negative, if decisions are substitutes (r < 0).

When decisions are complements, the value of additional information is increasing in the amount of aggregate information. Thus player i has a stronger incentive to choose a higher signal capacity if his opponents (on average) acquire more signals. This incentive goes in the opposite direction when decisions are substitutes, which confirms the main finding of Hellwig and Veldkamp (2009).

10.1.2 Strategic Trading

We consider the strategic trading game introduced in Lambert et al. (2018), in which individuals trade given asymmetric information about the value of an asset. We endogenize the information available to traders by adding a pre-trading stage in which traders sequentially acquire signals.
As before, we suppose that trading occurs at a final time period determined according to an arbitrary full-support distribution. In more detail: at the final time period, a security with unknown value v is traded in a market, and each of n traders submits a demand d_i. There are additionally liquidity traders who generate exogenous random demand u. A market-maker privately observes a signal θ_M (possibly multi-dimensional) and the total demand D = Σ_i d_i + u. He sets the price P(θ_M, D), which in equilibrium equals E[v | θ_M, D]. Each strategic trader then obtains profit Π_i = d_i·(v − P(θ_M, D)).

We suppose that in each period up to and including the final time period, each trader i chooses to observe a signal from his set (X_k^i) (described in Section 5). The requirement of conditional independence is strengthened to apply to a payoff-relevant vector ω = (v, θ_M, u) (instead of a real-valued unknown): that is, for each player i, conditional on the value of θ_1^i, the payoff-relevant vector ω and the other players' unknown states (θ^j)_{j≠i} are assumed to be conditionally independent of player i's states θ^i. Relative to the fully general setting considered in Lambert et al. (2018), this assumption allows for flexible correlation within a player's signals, but places a strong restriction on the correlation across different players' signals. Applying Corollary 1, we conclude:

Corollary 6. Under the above assumptions, there is an essentially unique linear NE, in which the on-path signal acquisitions are myopic, and in the final period players play the unique linear equilibrium described in Lambert et al. (2018).

Thus, the closed-form solutions that are a key contribution of Lambert et al. (2018) extend to our dynamic setting with endogenous information.
10.2 Example in Which n(t) is Not Monotone Even for Large t

Here we continue to study Example 2 presented in Figure 1 of the main text. We will show that the t-optimal division vectors n(t) fail to be monotone, even when we consider only periods after some large T.

The posterior variance function is

f(q₁, q₂, q₃) = [1 − 1/(1 + 1/q₁)] + [1 − 1/(1 + 1/q₂ + 1/(1 + q₃))].

This suggests that the t-optimal problem can be separated into two parts: choosing q₁, and allocating the remaining observations between q₂ and q₃. The latter allocation problem is simple: an optimal division satisfies q₃ = q₂ − 1 or q₃ = q₂. With some extra algebra, we obtain that for N ≥ 1:

1. If t = 3N + 1, then the unique t-optimal division is (N + 2, N, N − 1);
2. If t = 3N + 2, then the unique t-optimal division is (N + 1, N + 1, N);
3. If t = 3N + 3, then the unique t-optimal division is (N + 2, N + 1, N).

Crucially, note that when transitioning from t = 3N + 1 to t = 3N + 2, the t-optimal number of X₁ signals is decreased. This reflects the complementarity between signals X₂ and X₃, which causes the DM to observe them in pairs. Due to this failure of monotonicity, a sequential rule cannot achieve the t-optimal division vectors for all large t.

10.3 Additional Result for K = 2

When there are only two states and two signals, we can show that for a broad class of environments, the myopic information acquisition strategy is optimal from period 1.⁵⁴

Proposition 4. Suppose K = 2, the prior is standard Gaussian (V₀ = I₂), and both signals have variance 1.⁵⁵ Write

C = ( a  b
      c  d )

and assume without loss that |ad| ≥ |bc|. Then the optimal information acquisition strategy is myopic whenever the following inequality holds:

(1 + 2b²) · |ad − bc| ≥ |ad + bc|.   (40)

In particular, this is true whenever abcd ≤ 0.
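The last claim can be checked directly: abcd = (ad)(bc), so abcd ≤ 0 means ad and bc have opposite signs (or one product is zero), in which case |ad − bc| = |ad| + |bc| ≥ |ad + bc|, and (40) follows since 1 + 2b² ≥ 1. A randomized sanity check (the sampling scheme is ours, purely illustrative):

```python
# Sanity check: condition (40) holds whenever abcd <= 0.
# The random sampling is purely illustrative; the claim follows because
# |ad - bc| = |ad| + |bc| >= |ad + bc| when ad and bc have opposite signs.
import random

random.seed(1)

def condition_40(a, b, c, d):
    return (1 + 2 * b**2) * abs(a*d - b*c) >= abs(a*d + b*c)

checked = 0
while checked < 10_000:
    a, b, c, d = (random.uniform(-2, 2) for _ in range(4))
    if a * b * c * d <= 0:
        assert condition_40(a, b, c, d)
        checked += 1
print("condition (40) verified on", checked, "random draws with abcd <= 0")
```

When abcd > 0 and the rows of C are close to collinear (ad − bc near zero), condition (40) can fail, consistent with the interpretation given below the proposition.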
To interpret, (40) requires that the determinant of the matrix C, ad − bc, is not too small (holding other terms constant). Equivalently, the two vectors (in R²) defining the signals should not be close to collinear. This rules out situations where the two signals provide such similar information in the initial periods that they substitute one another.

⁵⁴ In the proposition below, if the linear coefficients a, b, c, d were picked at random, then with probability 1/2 we would have abcd ≤ 0.
⁵⁵ We make these simplifying assumptions so that the condition for immediate optimality is easy to state and interpret.

Proof. Under the assumptions, the DM's posterior variance about θ₁ is computed to be

f(q₁, q₂) = (1 + b²q₁ + d²q₂) / (1 + (a² + b²)q₁ + (c² + d²)q₂ + (ad − bc)²q₁q₂).

Given q_i observations of each signal i in the past, the myopic strategy chooses signal 1 if and only if f(q₁ + 1, q₂) < f(q₁, q₂ + 1), which reduces to

(ad − bc)²b²q₁² + (1 + b²)(ad − bc)²q₁ − (a²d² − b²c²)q₁ + c²(1 + b²)
   < (ad − bc)²d²q₂² + (1 + d²)(ad − bc)²q₂ + (a²d² − b²c²)q₂ + a²(1 + d²).   (41)

The condition |ad| ≥ |bc| ensures that the RHS is an increasing function of q₂, because the coefficients in front of q₂² and q₂ are both positive. Meanwhile, the condition (1 + 2b²)|ad − bc| ≥ |ad + bc| implies the LHS is larger when q₁ = 1 than when q₁ = 0, so that the LHS is also increasing in (integer values of) q₁. Even if f may not be written in separable form, (41) suggests that the comparison between the marginal values of signals 1 and 2 "is separable." It follows that the t-optimal division vectors are increasing in t. Proposition 4 is proved.

10.4 Eventual Optimality of the Myopic Strategy

Below, write m(t) for the division vector at time t achieved under the (history-independent) myopic rule.⁵⁶
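To make this concrete, the myopic division vectors m(t) can be simulated directly in the K = 2 environment of Proposition 4, checking along the way that the direct variance comparison agrees with the reduction (41) from the proof above. The coefficient choice (a, b, c, d) = (1, 1, 1, 2) below is an arbitrary example satisfying |ad| ≥ |bc| and condition (40), not one from the paper.

```python
# Simulate the myopic rule for m(t) in the K = 2 environment of Proposition 4,
# and check that f(q1+1, q2) < f(q1, q2+1) agrees with the reduction (41).
# Coefficients (1, 1, 1, 2) are an arbitrary example satisfying (40).
a, b, c, d = 1, 1, 1, 2
E = (a*d - b*c)**2

def f(q1, q2):
    """Posterior variance about theta_1 after q1, q2 signal observations."""
    return (1 + b**2*q1 + d**2*q2) / (
        1 + (a**2 + b**2)*q1 + (c**2 + d**2)*q2 + E*q1*q2)

def prefers_signal_1(q1, q2):
    return f(q1 + 1, q2) < f(q1, q2 + 1)

def lhs_41(q1):
    return E*b**2*q1**2 + (1 + b**2)*E*q1 - (a**2*d**2 - b**2*c**2)*q1 + c**2*(1 + b**2)

def rhs_41(q2):
    return E*d**2*q2**2 + (1 + d**2)*E*q2 + (a**2*d**2 - b**2*c**2)*q2 + a**2*(1 + d**2)

m = [(0, 0)]  # myopic division vectors m(t), starting from no observations
q1 = q2 = 0
for t in range(1, 21):
    assert prefers_signal_1(q1, q2) == (lhs_41(q1) < rhs_41(q2))  # reduction (41)
    if prefers_signal_1(q1, q2):
        q1 += 1
    else:
        q2 += 1
    m.append((q1, q2))
print(m[1:6])  # → [(1, 0), (2, 0), (3, 0), (3, 1), (4, 1)]
```

As the proof requires, each coordinate of m(t) is nondecreasing in t: the myopic rule only ever adds observations, so its division vectors are automatically monotone.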
We have discussed that when Theorem 1 or 2 applies, the myopic division vector m(t) is t-optimal at every period t. In this appendix, we argue that generically, division vectors m(t) at late periods are t-optimal. This result complements our Theorem 3, and suggests that a DM who naively follows the myopic rule all the way cannot do very poorly.

To avoid repetition, here we only sketch the core argument. The main new step is to show that the division vectors m(t) under the myopic rule grow to infinity in each coordinate; that is, a myopic DM would not get stuck observing a subset of signals. Once this is shown, we can repeat the (rest of the) proof of Lemma 4 and deduce that m_i(t) − λ_i · t remains bounded. And with these asymptotic characterizations, we can reproduce the proof of Theorem 3 (now for the myopic strategy instead of the optimal strategy) without trouble.⁵⁷

To see that myopic signal choices never get stuck, we establish the following lemma.

⁵⁶ That is, m(t) = (m₁(t), . . . , m_K(t)), where m_i(t) is the number of times signal i has been observed under myopic information acquisition prior to and including period t.
⁵⁷ These latter steps are actually simpler to carry out for the myopic strategy. This is because in constructing a deviation from the myopic strategy, we only need to look for a lower posterior variance at a single period. So we no longer need to make use of switch deviations.

Lemma 9. Fix an arbitrary division vector q ∈ R^K_+ (need not be integral). The partial derivatives of f at q are all zero if and only if q₁ = · · · = q_K = ∞.

Intuitively, this holds because for normal linear signals, the posterior variance is globally convex. So if each signal has zero marginal value relative to the division q, then q must be a global minimizer of posterior variance. We conclude by mentioning that a similar result (i.e.
myopic information acquisition does not get stuck) would not in general be true for other signal structures. The following is a counterexample with normal but non-linear signals.

Example 3. Consider three states θ₁, θ₂, θ₃ drawn independently. The DM has access to these three signals:

X₁ = θ₁ + sign(θ₂) + ε₁
X₂ = sign(θ₂θ₃) + ε₂
X₃ = θ₃ + ε₃

where ε₁, ε₂, ε₃ are Gaussian noise terms. We focus on the prediction problem, in which (at a random time) the DM makes a prediction about θ₁ and receives the negative of the squared prediction error.

Note that prior to the first observation of X₂, signal X₃ is completely uninformative about the payoff-relevant state θ₁ (even when combined with previous observations of X₁). Similarly, signal X₂ is individually uninformative about θ₂,⁵⁸ and thus about θ₁. These imply that the DM's uncertainty about θ₁ is not reduced upon the first observation of either X₂ or X₃. Hence, the myopic rule in this example is to always observe X₁, contrary to Lemma 9. Thus, if the DM acquires information myopically, he will never completely learn the value of θ₁. By contrast, if the DM is sufficiently patient, then his optimal strategy will observe each signal infinitely often and identify the value of θ₁ in the long run. Thus, in this example the myopic signal path does not eventually agree with the optimal path.

⁵⁸ This is because the sign of θ₂θ₃ does not contain any new information about θ₂ when θ₃ is equally likely to be positive or negative.
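The myopic DM's long-run failure in Example 3 can be quantified with a small simulation. After observing X₁ forever, the DM learns y = θ₁ + sign(θ₂) exactly, so his residual variance about θ₁ equals his residual variance about sign(θ₂) given y, which stays bounded away from zero. The sketch below assumes standard normal states (a concrete specialization for illustration) and estimates this limiting uncertainty by Monte Carlo.

```python
# Monte Carlo estimate of the myopic DM's limiting uncertainty about theta_1
# in Example 3, assuming standard normal states (an illustrative specialization).
# After observing X1 forever, the DM knows y = theta_1 + sign(theta_2) exactly;
# his residual variance of theta_1 given y equals Var(sign(theta_2) | y).
import math
import random

random.seed(0)

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def residual_var(y):
    # Posterior probability that sign(theta_2) = +1 given y = theta_1 + sign(theta_2).
    p = phi(y - 1) / (phi(y - 1) + phi(y + 1))
    return 4 * p * (1 - p)  # variance of a +/-1 variable with P(+1) = p

n = 100_000
draws = (random.gauss(0, 1) + random.choice([-1, 1]) for _ in range(n))
limit_var = sum(residual_var(y) for y in draws) / n
print(round(limit_var, 3))  # strictly positive: theta_1 is never fully learned
```

The estimate is bounded away from zero, whereas a patient DM who observes all three signals infinitely often drives his posterior variance about θ₁ to zero.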