Learning without Recall in Directed Circles and Rooted Trees
Authors: M. Amin Rahimian, Ali Jadbabaie
Abstract—This work investigates the case of a network of agents that attempt to learn some unknown state of the world amongst finitely many possibilities. At each time step, agents all receive random, independently distributed private signals whose distributions depend on the unknown state of the world. However, it may be the case that some or all of the agents cannot distinguish between two or more of the possible states based only on their private observations, as when several states result in the same distribution of the private signals. In our model, the agents form some initial belief (probability distribution) about the unknown state and then refine their beliefs in accordance with their private observations, as well as the beliefs of their neighbors. An agent learns the unknown state when her belief converges to a point mass that is concentrated at the true state. A rational agent would use Bayes' rule to incorporate her neighbors' beliefs and her own private signals over time. While such repeated applications of Bayes' rule in networks can become computationally intractable, in this paper we show that in the canonical cases of directed star, circle, or path networks and their combinations, one can derive a class of memoryless update rules that replicate that of a single Bayesian agent but replace the self-beliefs with the beliefs of the neighbors. This way, one can realize an exponentially fast rate of learning similar to the case of Bayesian (fully rational) agents. The proposed rules are a special case of the Learning without Recall approach that we develop in a companion paper, and they have the advantage that, while preserving essential features of Bayesian inference, they are made tractable.
In particular, the agents can rely on the observational abilities of their neighbors and their neighbors' neighbors, etc., to learn the unknown state, even though they themselves cannot distinguish the truth.

I. INTRODUCTION & BACKGROUND

Consider a group of agents who try to estimate an unknown state of the world. Each agent receives a sequence of independent and identically distributed (i.i.d.) private signals whose distribution is determined by the unknown state. Suppose further that the belief of each agent about the unknown state is represented by a discrete probability distribution over the finitely many possibilities, and that every agent sequentially applies Bayes' rule to her observations at each step and updates her beliefs accordingly. It is a well-known consequence of the classical results in merging and learning theory [1], [2] that the beliefs formed in the above manner constitute a bounded martingale and converge to a limiting distribution as the number of observations increases. However, the limiting distribution may differ from a point mass centered at the truth, in which case the agent fails to learn the true state asymptotically. This may be the case, for instance, if the agent faces an identification problem, that is, when there are states other than the true state which are observationally equivalent to the true state and induce the same distribution on her sequence of privately observed signals. Accordingly, the agents have an incentive to communicate in a social network so that they can resolve their identification problems by relying on each other's observational abilities.

∗ The authors are with the Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104-6228 USA (email: jadbabai@seas.upenn.edu). This work was supported by ARO MURI W911NF-12-1-0509.
This leads to the problem of social learning, which is a classical focus of behavioral microeconomic theory [3], [4] and has close parallels in distributed estimation and statistical learning theory [5], [6], [7], [8], [9]. Rational agents in a social network would apply Bayes' rule successively to their observations at each step, which include not only their private signals but also the beliefs communicated by their neighbors. However, such repeated applications of Bayes' rule in networks become computationally intractable, especially if the agents are unaware of the global network structure. This is due to the fact that at each step the agents should use their local data, which is increasing with time, and make inferences about the global signal structures that could have led to their observations. Indeed, tractable modeling and analysis of rational behavior in networks is an important problem in Bayesian economics and has attracted much attention [10], [11]. On the other side of the spectrum is the literature, such as [12], [13], that investigates the problem of learning in networks through iterative applications of update rules which do not necessarily result in the Bayesian beliefs, but can nonetheless provide the asymptotic properties of learning and consensus under certain conditions. In this work, we first consider the behavior of a single Bayesian agent that observes a sequence of i.i.d. signals conditioned on the unknown state of the world. We show that the learning rate for such an agent is exponentially fast, with an asymptotic rate that can be expressed in terms of the relative entropies between the likelihood structures of her signals under the various states of the world. We next use these results to upper bound the rate of learning for a Bayesian agent in a social network who observes not only her private signals but also her neighbors' beliefs.
The focus is then restricted to the case of directed circles, for which we propose a class of update rules offering exponentially fast learning at an asymptotic rate that is within a constant factor $1/l$ of the derived upper bound, $l$ being the length of the circle. These updates are also applied to other hybrid structures, where the center node of a rooted tree is replaced by a circle. These updates are a special case of the Learning without Recall rules, which we develop in a companion paper. The remainder of this paper is organized as follows. The modeling and formulation are set forth in Section II. The case of a single Bayesian agent is investigated in Section III. Learning without Recall updates for directed circles and rooted trees, together with their asymptotic properties, including the learning and convergence rates, are then presented in Section IV, and the paper is concluded in Section V.

II. THE MODEL

Notation: Throughout the paper, $\mathbb{R}$ is the set of real numbers, $\mathbb{N}$ denotes the set of all natural numbers, and $\mathbb{W} = \mathbb{N} \cup \{0\}$. For a fixed integer $n \in \mathbb{N}$, the set of integers $\{1, 2, \ldots, n\}$ is denoted by $[n]$, while any other set is represented by a calligraphic capital letter. The cardinality of a set $\mathcal{X}$, which is the number of its elements, is denoted by $|\mathcal{X}|$, and $\mathcal{P}(\mathcal{X}) = \{\mathcal{M};\ \mathcal{M} \subset \mathcal{X}\}$ denotes the power set of $\mathcal{X}$, which is the set of all its subsets. The difference of two sets $\mathcal{X}$ and $\mathcal{Y}$ is defined by $\mathcal{X} \setminus \mathcal{Y} := \{x;\ x \in \mathcal{X} \text{ and } x \notin \mathcal{Y}\}$. Boldface letters denote random variables. Consider a set of $n$ agents that are labeled by $[n]$ and interact according to a directed information flow structure given by a digraph $\mathcal{G} = ([n], \mathcal{E})$, where $\mathcal{E} \subset [n] \times [n]$ is the set of directed edges.
$\mathcal{N}(i) = \{j \in [n];\ (j,i) \in \mathcal{E}\}$ is called the neighborhood of agent $i$ and is the set of all agents whose beliefs are observed by agent $i$, and $\deg(i) = |\mathcal{N}(i)|$ is called the degree of agent $i$. For $l, m \in [n]$, a path $\mathcal{P}_k(l,m)$ of length $k$ from $l$ to $m$ is a sequence of $k$ distinct integers $i_1, i_2, \ldots, i_k$ such that $i_1 = l$, $i_k = m$, and $(i_{j-1}, i_j) \in \mathcal{E}$ for all $j \in \{2, \ldots, k\}$. The set of finitely many possible states of the world is denoted by $\Theta$, and $\Delta\Theta$ is the space of all probability measures on the set $\Theta$. The goal is to decide amongst the finitely many possibilities in the state space $\Theta$. A random variable $\boldsymbol{\theta}$ is chosen randomly from $\Theta$ by nature, according to the probability measure $\nu(\cdot) \in \Delta\Theta$, which satisfies $\nu(\hat\theta) > 0$, $\forall \hat\theta \in \Theta$, and is referred to as the common prior. Associated with each agent $i$, $\mathcal{S}_i$ is a finite set called the signal space of $i$, and given $\boldsymbol{\theta}$, $\ell_i(\cdot \mid \boldsymbol{\theta})$ is a probability measure on $\mathcal{S}_i$, which is referred to as the signal structure or likelihood function of agent $i$. Furthermore, $(\Omega, \mathcal{F}, \mathbb{P})$ is a probability triplet, where
\[
  \Omega = \Theta \times \Big(\prod_{i \in [n]} \mathcal{S}_i\Big)^{\mathbb{W}}
\]
is an infinite product space with a general element $\omega = (\theta; (s_{1,0}, \ldots, s_{n,0}), (s_{1,1}, \ldots, s_{n,1}), \ldots)$ and the associated sigma field $\mathcal{F} = \mathcal{P}(\Omega)$. $\mathbb{P}(\cdot)$ is the probability measure on $\Omega$ that assigns probabilities consistently with the common prior $\nu(\cdot)$ and the likelihood functions $\ell_i(\cdot \mid \boldsymbol{\theta})$, $i \in [n]$, in such a way that, conditional on $\boldsymbol{\theta}$, the random variables $\{\mathbf{s}_{i,t},\ t \in \mathbb{W},\ i \in [n]\}$ are independent. $\mathbb{E}\{\cdot\}$ is the expectation operator, which represents integration with respect to $d\mathbb{P}(\omega)$.

A. Signals

Let $t \in \mathbb{W}$ denote the time index, and for each agent $i$ define $\{\mathbf{s}_{i,t},\ t \in \mathbb{W}\}$ to be a sequence of independent and identically distributed random variables with probability mass function $\ell_i(\cdot \mid \boldsymbol{\theta})$; this sequence represents the private observations made by agent $i$ at each time period $t$. The privately observed signals are independent and identically distributed across time, and at any given time $t$ each agent makes a signal observation that is independent of those of the rest of the agents. In particular, for $t \in \mathbb{W}$ and $\hat\theta \in \Theta$ both fixed, let $\mathcal{L}(\cdot \mid \hat\theta)$ denote the joint law of the random vector $(\mathbf{s}_{1,t}, \ldots, \mathbf{s}_{n,t})$; then $\mathcal{L}(\cdot \mid \hat\theta)$ does not depend on $t$, and it factors as
\[
  \mathcal{L}(\cdot \mid \hat\theta) = \prod_{i \in [n]} \ell_i(\cdot \mid \hat\theta), \tag{1}
\]
the product probability measure on the product space $\prod_{i \in [n]} \mathcal{S}_i$. The signal structures and the joint law given in (1), as well as the common prior $\nu(\cdot)$ and the corresponding sample spaces $\prod_{i \in [n]} \mathcal{S}_i$ and $\Theta$, are common knowledge amongst all the agents. The assumption of common knowledge in the case of fully rational (Bayesian) agents implies that, given the same observations of one another's beliefs or private signals, distinct agents would make identical inferences, in the sense that starting from the same belief about the unknown $\boldsymbol{\theta}$, their updated beliefs given the same observations would be the same.

B. Beliefs

For each time instant $t$, let $\mu_{i,t}(\cdot)$ be the probability mass function on $\Theta$ representing the opinion or belief at time $t$ of agent $i$ about the realized value of $\boldsymbol{\theta}$. Note that $\mu_{i,t}(\cdot)$ is random, since it depends on the random observations of the agent. The goal is to investigate the problem of asymptotic learning, i.e., for each agent to learn the true realized value $\theta \in \Theta$ of $\boldsymbol{\theta}$ asymptotically.
That is, to have $\mu_{i,t}(\cdot)$ converge to a point mass centered at $\theta$, where the convergence could be in probability or in the $\mathbb{P}$-almost sure sense. At $t = 0$ the values $\boldsymbol{\theta} = \theta$, followed by $s_i \in \mathcal{S}_i$ of $\mathbf{s}_{i,0}$, are realized, and the latter is observed by agent $i$ for all $i \in [n]$, who then forms an initial Bayesian opinion $\mu_{i,0}(\cdot)$ about the value of $\boldsymbol{\theta}$. Given $s_{i,0}$, and using Bayes' rule for each agent $i \in [n]$, the initial belief in terms of the observed signal $s_{i,0}$ is given by:
\[
  \mu_{i,0}(\hat\theta) = \frac{\nu(\hat\theta)\,\ell_i(s_{i,0} \mid \hat\theta)}{\sum_{\tilde\theta \in \Theta} \nu(\tilde\theta)\,\ell_i(s_{i,0} \mid \tilde\theta)}. \tag{2}
\]
At any successive time step $t \geq 1$, each agent $i$ observes the realized value of $\mathbf{s}_{i,t}$ as well as the current beliefs of her neighbors, $\mu_{k,t-1}(\cdot)$, $\forall k \in \mathcal{N}(i)$, and forms a refined opinion $\mu_{i,t}(\cdot)$ by incorporating all the data that have been made available to her by time $t$.

III. THE CASE OF A SINGLE BAYESIAN AGENT

A Bayesian agent $i$ that starts with a prior $\nu(\cdot)$ on the state of the world and successively uses Bayes' rule to update her beliefs based on the signals $\{\mathbf{s}_{i,t},\ t \in \mathbb{N}\}$ that she observes would form the initial belief given in (2) and will then sequentially update her beliefs according to Bayes' rule:
\[
  \mu_{i,t}(\hat\theta) = \frac{\mu_{i,t-1}(\hat\theta)\,\ell_i(s_{i,t} \mid \hat\theta)}{\sum_{\tilde\theta \in \Theta} \mu_{i,t-1}(\tilde\theta)\,\ell_i(s_{i,t} \mid \tilde\theta)}, \quad \forall \hat\theta \in \Theta. \tag{3}
\]
For any false state $\check\theta \in \Theta \setminus \{\theta\}$, let $r_{i,t}(\check\theta) := \log\big(\ell_i(\mathbf{s}_{i,t} \mid \check\theta)/\ell_i(\mathbf{s}_{i,t} \mid \theta)\big)$ be the random variable representing the log-likelihood ratio of the private signal that agent $i$ observes at time $t$, under the false state $\check\theta$; and similarly, let $\lambda_{i,t}(\check\theta) := \log\big(\mu_{i,t}(\check\theta)/\mu_{i,t}(\theta)\big)$ be the log-likelihood ratio of the belief of agent $i$ at time $t$ under the false state $\check\theta$.
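The single-agent recursion in (2) and (3), and the resulting downward drift of the log-likelihood ratios $r_{i,t}$, can be illustrated with a minimal numerical sketch. The two-state, binary-signal setup below (the states, the likelihood values, and the 200-step horizon) is a hypothetical example, not taken from the paper.

```python
import math
import random

# Hypothetical example: two states, binary signals s in {0, 1}.
THETA = ["theta", "theta_check"]                # true state and one false state
p_sig1 = {"theta": 0.8, "theta_check": 0.3}     # P(s = 1 | state)

def likelihood(s, state):
    return p_sig1[state] if s == 1 else 1.0 - p_sig1[state]

def bayes_update(belief, s):
    """One application of (3): multiply the current belief by the
    likelihood of the new signal and renormalize over the state space."""
    unnorm = {th: belief[th] * likelihood(s, th) for th in THETA}
    z = sum(unnorm.values())
    return {th: p / z for th, p in unnorm.items()}

random.seed(0)
true_state = "theta"
nu = {th: 0.5 for th in THETA}                  # common prior

signals = [1 if random.random() < p_sig1[true_state] else 0 for _ in range(200)]
belief = bayes_update(nu, signals[0])           # initial belief, cf. (2)
for s in signals[1:]:
    belief = bayes_update(belief, s)

# The belief log-ratio equals the prior log-ratio (zero here) plus the sum
# of the per-signal log-ratios r_{i,t}; its average slope tends to -D_KL < 0.
lam = math.log(belief["theta_check"] / belief["theta"])
mean_r = sum(math.log(likelihood(s, "theta_check") / likelihood(s, "theta"))
             for s in signals) / len(signals)
print(belief["theta"], lam, mean_r)
```

With informative signals, the belief on the true state approaches one while the log-ratio drifts to minus infinity linearly in $t$, in line with the decomposition derived next.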
Note that $\lambda_{i,t}(\check\theta)$ and $\mu_{i,t}(\check\theta)$ are related through
\[
  \mu_{i,t}(\check\theta) = \frac{e^{\lambda_{i,t}(\check\theta)}}{1 + \sum_{\tilde\theta \in \Theta \setminus \{\theta\}} e^{\lambda_{i,t}(\tilde\theta)}}, \tag{4}
\]
and the Bayesian belief update in (3) translates into the following linear update for the log-likelihood ratios: $\lambda_{i,t}(\check\theta) = \lambda_{i,t-1}(\check\theta) + r_{i,t}(\check\theta)$, which leads to
\[
  \lambda_{i,t}(\check\theta) = \lambda_{i,0}(\check\theta) + \sum_{q=1}^{t} r_{i,q}(\check\theta). \tag{5}
\]
Next note that for all $t \in \mathbb{N}$,
\[
  \mathbb{E}\,r_{i,t}(\check\theta) = \sum_{s_i \in \mathcal{S}_i} \ell_i(s_i \mid \theta) \log \frac{\ell_i(s_i \mid \check\theta)}{\ell_i(s_i \mid \theta)} =: -D_{KL}\big(\ell_i(\cdot \mid \theta)\,\|\,\ell_i(\cdot \mid \check\theta)\big) \leq 0,
\]
where the inequality follows from the nonnegativity of the Kullback-Leibler divergence $D_{KL}(\cdot\,\|\,\cdot)$ and is strict whenever $\ell_i(\cdot \mid \check\theta) \not\equiv \ell_i(\cdot \mid \theta)$, i.e., whenever $\exists s \in \mathcal{S}_i$ such that $\ell_i(s \mid \check\theta) \neq \ell_i(s \mid \theta)$ [14, Theorem 2.6.3]. In particular, $r_{i,t}(\check\theta)$, $t \in \mathbb{N}$, are integrable, independent and identically distributed variables, whence by Kolmogorov's strong law of large numbers we get
\[
  \frac{1}{t} \sum_{q=1}^{t} r_{i,q}(\check\theta) \to \mathbb{E}\,r_{i,t}(\check\theta), \tag{6}
\]
$\mathbb{P}$-almost surely. This in turn implies that if $\mathbb{E}\,r_{i,t}(\check\theta) = -D_{KL}(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)) < 0$, or equivalently $\ell_i(\cdot\mid\check\theta) \not\equiv \ell_i(\cdot\mid\theta)$, then $\lambda_{i,t}(\check\theta) \to -\infty$. Substituting the latter in (4) then yields that $\mu_{i,t}(\check\theta) \to 0$, $\mathbb{P}$-almost surely; with probability one, the agent asymptotically rejects any false state $\check\theta \in \Theta \setminus \{\theta\}$ satisfying $D_{KL}(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)) > 0$. In particular we have,

Theorem 1. If $\forall \check\theta \in \Theta \setminus \{\theta\}$, $D_{KL}(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)) > 0$, then $\mu_{i,t}(\theta) \to 1$, $\mathbb{P}$-almost surely, under the update rule in (3) and the specified model.

Remark 1. It is instructive to regard the almost sure convergence of beliefs stated for a Bayesian agent in Theorem 1 as a consequence of the bounded martingale convergence theorem. To see how, for all $\omega \in \Omega$ and $\hat\theta \in \Theta$, let $\mathbf{1}_{\boldsymbol{\theta}=\hat\theta}(\omega)$ be the indicator variable for the true state of the world being $\hat\theta$, i.e.
$\mathbf{1}_{\boldsymbol{\theta}=\hat\theta}(\omega) = 1$ if $\boldsymbol{\theta}(\omega) = \hat\theta$, and $\mathbf{1}_{\boldsymbol{\theta}=\hat\theta}(\omega) = 0$ otherwise. For each $t \in \mathbb{W}$, define $\mathcal{F}_{i,t} = \sigma(\mathbf{s}_{i,t}, \mathbf{s}_{i,t-1}, \mathbf{s}_{i,t-2}, \ldots, \mathbf{s}_{i,1}, \mathbf{s}_{i,0})$ as the sigma field generated by the private signals of the agent up to time $t$. Note in particular that $\mathcal{F}_{i,0} = \sigma(\mathbf{s}_{i,0})$ and that $\{\mathcal{F}_{i,t},\ t \in \mathbb{W}\}$ is a filtration on the measure space $(\Omega, \mathcal{F})$. The Bayes rule in (2) can be interpreted as
\[
  \mu_{i,0}(\hat\theta) = \mathbb{P}\{\boldsymbol{\theta} = \hat\theta \mid \mathbf{s}_{i,0}\} = \mathbb{E}\{\mathbf{1}_{\boldsymbol{\theta}=\hat\theta} \mid \mathcal{F}_{i,0}\}. \tag{7}
\]
Similarly, starting from the Bayesian opinion in (7), application of (3) at times $t > 0$ is exactly the Bayesian update of agent $i$'s belief from time $t-1$ to time $t$, given that at time $t$ agent $i$ has observed the signal $s_{i,t}$. Whence, (3) can be combined with (7) to get $\mu_{i,t}(\hat\theta) = \mathbb{E}\{\mathbf{1}_{\boldsymbol{\theta}=\hat\theta} \mid \mathcal{F}_{i,t}\}$, $\forall t > 0$. The beliefs form a bounded martingale with respect to the filtration introduced above, which is exactly the setting for the martingale convergence theorem. Indeed, the convergence of $\mu_{i,t}(\hat\theta)$ to $\mathbf{1}_{\boldsymbol{\theta}=\hat\theta}$ is now immediate, since $D_{KL}(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)) > 0$, $\forall \check\theta \neq \theta$, implies that $\{\boldsymbol{\theta} = \hat\theta\} \in \mathcal{F}_{i,\infty}$, and by Lévy's zero-one law [15],
\[
  \lim_{t\to\infty} \mu_{i,t}(\hat\theta) = \mathbb{E}\{\mathbf{1}_{\boldsymbol{\theta}=\hat\theta} \mid \mathcal{F}_{i,\infty}\} = \mathbf{1}_{\boldsymbol{\theta}=\hat\theta},
\]
$\mathbb{P}$-almost surely. By the presumption of Theorem 1 we are led to define, for any $\hat\theta \in \Theta$, the set of those states $\tilde\theta \in \Theta \setminus \{\hat\theta\}$ that are observationally equivalent to $\hat\theta$ for agent $i$, denoted by
\[
  \mathcal{O}_i(\hat\theta) = \big\{\tilde\theta \in \Theta \setminus \{\hat\theta\} : D_{KL}\big(\ell_i(\cdot\mid\hat\theta)\,\|\,\ell_i(\cdot\mid\tilde\theta)\big) = 0\big\}.
\]
In the case where $\mathcal{O}_i(\theta) \neq \emptyset$, the statement of Theorem 1 can be refined as follows.

Corollary 1. Under the update rule in (3) and the specified model, it holds true with $\mathbb{P}$-probability 1 that as $t \to \infty$, $\mu_{i,t}(\check\theta) \to 0$ for all $\check\theta \notin \mathcal{O}_i(\theta) \cup \{\theta\}$, and
\[
  \mu_{i,t}(\hat\theta) \to \frac{\nu(\hat\theta)}{\sum_{\tilde\theta \in \mathcal{O}_i(\theta) \cup \{\theta\}} \nu(\tilde\theta)}, \quad \forall \hat\theta \in \mathcal{O}_i(\theta) \cup \{\theta\}.
\]

A. Exponentially Fast Learning

We can push the preceding results further to prove an exponential rate of convergence for the beliefs. Indeed, by applying (5) and (6) to the beliefs on the false states given in (4), it follows that for any $0 < \gamma < D_{KL}(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta))$ we can write $\mu_{i,t}(\check\theta) = e^{-\gamma t} z_t$, where $\{z_t,\ t \in \mathbb{N}\}$ is a process satisfying $z_t \to 0$ with $\mathbb{P}$-probability one as $t \to \infty$. Instead, if we write $\mu_{i,t}(\theta) = 1 - \sum_{\check\theta \in \Theta \setminus \{\theta\}} \mu_{i,t}(\check\theta)$, then it follows that the almost sure convergence stated in Theorem 1 occurs at an exponentially fast asymptotic rate of
\[
  R_i(\theta) := \min_{\check\theta \in \Theta \setminus \{\theta\}} D_{KL}\big(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)\big). \tag{8}
\]
In effect, the states of the world are to be statistically distinguished through the observed signals $\mathbf{s}_{i,t}$, $t \in \mathbb{W}$. Different states $\hat\theta \in \Theta$ are distinguished through their different likelihood functions $\ell_i(\cdot\mid\hat\theta)$, and the more refined such differences are, the better the states are distinguished. The asymptotic rate derived in (8) is one measure of resolution for the likelihood structure, or indeed for the filtration introduced in Remark 1.

B. Bayesian Learning in Networks

The preceding discussion laid out the case for the exponentially fast learning of a single agent who applies Bayes' rule successively to her observed signals and updates her belief about the true state of the world at each round. One might suggest using the same framework in a network setting where the agents have access to their neighbors' beliefs at successive time steps, by considering the conditional probabilities given both the neighbors' beliefs and the private signals.
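As a quick numerical illustration of the rate in (8), the sketch below computes $R_i(\theta)$ for a hypothetical three-state, ternary-signal likelihood structure (the states and probabilities are made up for the example): the rate is set by whichever false state is hardest to distinguish from the truth.

```python
import math

def kl(p, q):
    """D_KL(p || q) in nats for finite distributions given as lists over the signal space."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical signal structure ell_i(. | state): three states, ternary signals.
ell = {
    "a": [0.6, 0.3, 0.1],   # taken as the true state below
    "b": [0.3, 0.4, 0.3],
    "c": [0.1, 0.3, 0.6],
}
theta = "a"

# R_i(theta): minimum over the false states of D_KL(ell(.|theta) || ell(.|false)), cf. (8)
R = min(kl(ell[theta], ell[f]) for f in ell if f != theta)
print(R)
```

In this example state "b" is the nearest rival to "a" in KL divergence, so it determines the learning rate.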
However, repeated applications of Bayes' rule in networks become computationally intractable, partly due to the fact that each agent needs to use her local data, which is increasing over time, and make inferences about the global network structure, which is unknown to her. The complexities associated with the Bayesian framework have limited its application to the simplest networks, such as three-link ones [16]. Apart from the difficulties associated with the network structure, the increasing history of observations that a fully Bayesian agent needs to take into account imposes a forbidding computational burden [17]. Nonetheless, an upper bound for the exponential rate of learning by a particular agent $i$ in a network of Bayesian agents can be obtained as follows. Consider an outside Bayesian agent $\hat{i}$ who shares the same common knowledge of the prior and signal structures with the network agents in $[n]$; in particular, $\hat{i}$ knows the prior $\nu(\cdot)$ as well as the signal structures $\ell_i(\cdot\mid\hat\theta)$, $\forall \hat\theta \in \Theta$ and $\forall i \in [n]$, and she will make the same inference as any other agent in $[n]$ when given access to the same observations. Consider next a Gedanken experiment where $\hat{i}$ is granted direct access to all the signals of agent $i$, together with those of every other agent to whom agent $i$ has direct or indirect access, i.e., her neighbors and neighbors' neighbors and so on. The rate at which agent $i$ learns is then upper-bounded by the learning rate of $\hat{i}$. Formally, define $\mathcal{A}(i) = \{j \in [n] : \text{there exists a path } \mathcal{P}_k(j,i) \text{ in } \mathcal{G} \text{ for some } k \in \mathbb{N}\}$. Then the Bayesian agent $i$ learns at an exponentially fast asymptotic rate that is upper bounded by
\[
  R_i^{\mathcal{G}}(\theta) := \min_{\check\theta \in \Theta \setminus \{\theta\}} D_{KL}\Big(\prod_{j \in \mathcal{A}(i)} \ell_j(\cdot\mid\theta)\,\Big\|\,\prod_{j \in \mathcal{A}(i)} \ell_j(\cdot\mid\check\theta)\Big).
\]
In the next section, a method of belief aggregation is proposed that applies to network topologies where each vertex has either one or no neighbors.
The proposed rule is to use the same Bayesian update as in (3) if agent $i$ has no neighbors; otherwise, if agent $i$ has one neighbor, to use (3) but with the self-belief $\mu_{i,t-1}(\cdot)$ on the right-hand side replaced by the belief $\mu_{j,t-1}(\cdot)$ of the unique neighbor $\{j\} = \mathcal{N}(i)$. With this rule, and in the case of a directed $n$-node circle, where $\mathcal{A}(i) = [n]$ for any $i$, one can realize exponentially fast learning with an asymptotic rate of $(1/n)R_i^{\mathcal{G}}(\theta)$, which is within a constant factor of the above upper bound.

IV. MEMORYLESS NETWORK UPDATES

Consider a digraph $\mathcal{G}$ satisfying $\deg(i) \in \{0, 1\}$, $\forall i \in [n]$. The proposed rule is to use the Bayesian update in (3) if $\deg(i) = 0$, and else to use
\[
  \mu_{i,t}(\hat\theta) = \frac{\mu_{j,t-1}(\hat\theta)\,\ell_i(s_{i,t} \mid \hat\theta)}{\sum_{\tilde\theta \in \Theta} \mu_{j,t-1}(\tilde\theta)\,\ell_i(s_{i,t} \mid \tilde\theta)}, \quad \forall \hat\theta \in \Theta, \tag{9}
\]
where $j$ is the unique vertex in $\mathcal{N}(i)$. We begin the analysis of asymptotic learning under the proposed rules with the special cases of directed circles and rooted trees in Subsections IV-A and IV-B, respectively, followed by a discussion of the class of all networks with node degrees zero and one in Subsection IV-C. These updates are a special case of the Learning without Recall rules that we develop in a companion paper, and they can describe the behavior of rational but memoryless agents who share a common prior $\nu(\cdot)$ and always interpret their current and observed beliefs as having stemmed from this common prior.

A. Directed Circles

In this subsection, we show that the update rules in (9) are particularly amenable to a circular structure, effectively achieving the upper bound derived in Subsection III-B, except for a constant multiplicative factor. Consider a directed circle on $n$ nodes labeled by $[n]$, in such a way that the ordered sequence $(1, 2, \ldots, n)$ constitutes a path. Fix $i \in [n]$ arbitrarily.
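Before the formal derivation, the effect of rule (9) can be seen in a small simulation. The two-agent directed circle and the likelihoods below are hypothetical, chosen so that neither agent can identify the truth on her own (agent 1 cannot tell state "a" from "b", and agent 2 cannot tell "a" from "c"), while their combined signals single out the truth.

```python
import random

random.seed(1)
THETA = ["a", "b", "c"]
# Hypothetical likelihoods [P(s = 0 | state), P(s = 1 | state)] for binary signals:
# agent 1 has identical likelihoods under a and b; agent 2 under a and c.
ell = {
    1: {"a": [0.7, 0.3], "b": [0.7, 0.3], "c": [0.2, 0.8]},
    2: {"a": [0.6, 0.4], "b": [0.1, 0.9], "c": [0.6, 0.4]},
}
neighbor = {1: 2, 2: 1}          # directed circle: N(1) = {2}, N(2) = {1}
true_state = "a"

def draw(i):
    return 0 if random.random() < ell[i][true_state][0] else 1

def update(i, base_belief, s):
    """The memoryless rule (9): Bayes' rule applied to the neighbor's
    belief (in place of the self-belief) and the fresh private signal."""
    unnorm = {th: base_belief[th] * ell[i][th][s] for th in THETA}
    z = sum(unnorm.values())
    return {th: p / z for th, p in unnorm.items()}

nu = {th: 1.0 / 3 for th in THETA}                    # common prior
mu = {i: update(i, nu, draw(i)) for i in (1, 2)}      # initial beliefs, cf. (2)
for _ in range(300):
    mu = {i: update(i, mu[neighbor[i]], draw(i)) for i in (1, 2)}
print(mu[1][true_state], mu[2][true_state])
```

Both beliefs concentrate on the true state, even though each agent's private observations alone leave her with a two-state identification problem.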
Starting from agent $i$ at time $t$, and successively applying (9) at times $t, t-1, \ldots$ down to $t-n+1$, yields for all $t \geq n$ and all $j \in \{i, i-1, \ldots, 1, 0, -1, \ldots, i+1-n\}$ that:
\[
  \mu_{j,t-i+j}(\theta) = \frac{\ell_j(s_{j,t-i+j} \mid \theta)\,\mu_{j-1,t-i+j-1}(\theta)}{\sum_{\tilde\theta \in \Theta} \ell_j(s_{j,t-i+j} \mid \tilde\theta)\,\mu_{j-1,t-i+j-1}(\tilde\theta)} \tag{10}
\]
when $j \geq 1$, and
\[
  \mu_{n+j,t-i+j}(\theta) = \frac{\ell_{n+j}(s_{n+j,t-i+j} \mid \theta)\,\mu_{n+j-1,t-i+j-1}(\theta)}{\sum_{\tilde\theta \in \Theta} \ell_{n+j}(s_{n+j,t-i+j} \mid \tilde\theta)\,\mu_{n+j-1,t-i+j-1}(\tilde\theta)}
\]
when $i+1-n \leq j < 1$. Next, keep the term $\mu_{i,t}(\theta)$ on the left-hand side of (10) and replace each of the terms $\mu_{j-1,t-i+j-1}(\theta)$ on the right-hand side using the corresponding relation at time $t-i+j-1$, until the relation at time $t-n+1$ is reached, whose right-hand side involves $\mu_{i,t-n}$. Executing this procedure leads to the following iteration, which involves only the beliefs of agent $i$ at the two points in time $t$ and $t-n$:
\[
  \mu_{i,t}(\theta) = \frac{\prod_{j=0}^{i-1} \ell_{i-j}(s_{i-j,t-j} \mid \theta)\,\prod_{j=i}^{n-1} \ell_{j+1}(s_{j+1,t-n+1-i+j} \mid \theta)\,\mu_{i,t-n}(\theta)}{\sum_{\tilde\theta \in \Theta} \prod_{j=0}^{i-1} \ell_{i-j}(s_{i-j,t-j} \mid \tilde\theta)\,\prod_{j=i}^{n-1} \ell_{j+1}(s_{j+1,t-n+1-i+j} \mid \tilde\theta)\,\mu_{i,t-n}(\tilde\theta)}, \quad \forall t \geq n. \tag{11}
\]
Next, note that starting from $\mu_{i,0}(\theta)$ given by (7), the above is exactly the Bayesian update of agent $i$'s belief from time $t-n$ to time $t$, given that at time $t$ agent $i$ has observed the signal $s_{i,t}$, at time $t-1$ agent $i-1$ has observed the signal $s_{i-1,t-1}$, and so on up to the observation of signal $s_{1,t-i+1}$ at time $t-i+1$, and then signal $s_{n,t-i}$, followed by $s_{n-1,t-i-1}$, and so on until $s_{i+1,t-n+1}$ at time $t-n+1$.
Hence, if we let $\hat{t} = (1/n)t$ for all $t$ belonging to the integer multiples of $n$, then Theorem 1, together with (8), implies that $\mu_{i,\hat{t}}(\theta) \to 1$, $\mathbb{P}$-almost surely as $\hat{t} \to \infty$, at an exponentially fast asymptotic rate of
\[
  R_{\text{circle}}(\theta) := \min_{\check\theta \in \Theta \setminus \{\theta\}} D_{KL}\Big(\prod_{j=1}^{n} \ell_j(\cdot\mid\theta)\,\Big\|\,\prod_{j=1}^{n} \ell_j(\cdot\mid\check\theta)\Big),
\]
or equivalently, that $\mu_{i,t}(\theta) \to 1$, $\mathbb{P}$-almost surely as $t \to \infty$, at an exponentially fast asymptotic rate of $(1/n)R_{\text{circle}}(\theta)$. Indeed, except for a penalty of constant factor $1/n$, which decreases proportionally to the network size, with the update rules in (9) one can achieve exponentially fast learning at the upper bound rate $R_i^{\mathcal{G}}(\theta) = R_{\text{circle}}(\theta)$.

Remark 2. In the special case where all agents receive independent and identically distributed signals, we have $\ell_i(\cdot\mid\theta) \equiv \ell_j(\cdot\mid\theta)$, $\forall j \in [n] \setminus \{i\}$, and therefore
\[
  \frac{1}{n} R_{\text{circle}}(\theta) = \frac{1}{n} \min_{\check\theta \in \Theta \setminus \{\theta\}} D_{KL}\big(\ell_i^{n}(\cdot\mid\theta)\,\|\,\ell_i^{n}(\cdot\mid\check\theta)\big) = \min_{\check\theta \in \Theta \setminus \{\theta\}} D_{KL}\big(\ell_i(\cdot\mid\theta)\,\|\,\ell_i(\cdot\mid\check\theta)\big) = R_i(\theta),
\]
where $\ell_i^{n}$ denotes the $n$-fold product measure. In other words, every agent learns the true state at the same asymptotic rate as when she relies only on her own private signals; thereby, communications with the neighboring agents offer no advantage in this case. On the other hand, if the agents learn the true parameter at different private rates $R_i(\theta)$, $i \in [n]$, then the asymptotic rate $(1/n)R_{\text{circle}}(\theta)$ would be slower than $\max_i R_i(\theta)$ but faster than $\min_i R_i(\theta)$. That is, the faster agents will be slowed down by the circular communications, while the slower agents will be sped up. However, the true advantage of communications in a directed circle is apparent when some or none of the agents can learn the true parameter on their own, that is, when we have $\mathcal{O}_i(\theta) \neq \emptyset$ for some or all $i \in [n]$.
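The cancellation of the $1/n$ factor in Remark 2 rests on the additivity of the KL divergence over independent coordinates, i.e., $D_{KL}(\ell^{n}\,\|\,\ell^{n}) = n\,D_{KL}(\ell\,\|\,\ell)$ for the $n$-fold products of one marginal. The brute-force check below over a made-up ternary likelihood confirms this numerically.

```python
import itertools
import math

def kl(p, q):
    """D_KL(p || q) in nats for finite distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_product(p, q, n):
    """KL divergence between the n-fold product measures p^n and q^n,
    computed by brute force over all signal n-tuples."""
    total = 0.0
    for combo in itertools.product(range(len(p)), repeat=n):
        pp = math.prod(p[k] for k in combo)
        qq = math.prod(q[k] for k in combo)
        total += pp * math.log(pp / qq)
    return total

# Hypothetical marginal likelihoods under the true and a false state.
l_true, l_false = [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
n = 7

# (1/n) * D_KL(l^n || l^n) coincides with the single-agent rate D_KL(l || l).
print(kl_product(l_true, l_false, n) / n, kl(l_true, l_false))
```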
Then, following the communication rules prescribed by (9), all agents would learn the true parameter exponentially fast and at a common asymptotic rate of $(1/n)R_{\text{circle}}(\theta)$, provided that $\cap_{i \in [n]} \mathcal{O}_i(\theta) = \emptyset$, or equivalently, that $R_{\text{circle}}(\theta) > 0$. This way, by communicating in a circle, the agents are able to benefit from each other's observations and all learn the true state of the world asymptotically, even if none is able to learn the true state on her own. We now shift attention to the case of networks with rooted tree topologies.

Fig. 1: Some of the graph structures considered in the paper: (a) a directed rooted tree; (b) a hybrid structure.

B. Rooted Trees

In a directed rooted tree, a node is designated as the root and all the edges are directed away from it. Let $\mathcal{G}$ be one such directed rooted tree and label its vertices by $[n]$, assigning 1 to the root, as in the example of Fig. 1a. Note, by the tree property, that for any vertex there is a unique path connecting the root to that vertex. Take one such vertex and suppose that it is connected to the root node by a directed path consisting of $i$ distinct nodes for some $i \in [n]$. Without any loss of generality, and for ease of notation, suppose that the vertices of $\mathcal{G}$ are labeled in such a way that the unique path connecting node 1 (the root) to node $i$ is given by the ordered sequence of vertices $(1, 2, 3, \ldots, i)$, as is the case for $i = 4$ in Fig. 1a. Successive applications of (9) at times $t, t-1, t-2, \ldots, t-i+2$ for the nodes $i, i-1, i-2, \ldots, 2$, in their respective order of appearance, show that (10) applies here as well for any $j \in [i] \setminus \{1\}$.
Similar to the way (11) was derived, by starting from the equation for $\mu_{i,t}(\theta)$ and successively substituting for each of the terms $\mu_{j-1,t-i+j-1}(\theta)$ on the right-hand side, we can express the beliefs of node $i$ at each time $t \geq i$ in terms of the root's belief $\mu_{1,t-i+1}$ as follows:
\[
  \mu_{i,t}(\theta) = \frac{\prod_{j=2}^{i} \ell_j(s_{j,t-i+j} \mid \theta)\,\mu_{1,t-i+1}(\theta)}{\sum_{\tilde\theta \in \Theta} \prod_{j=2}^{i} \ell_j(s_{j,t-i+j} \mid \tilde\theta)\,\mu_{1,t-i+1}(\tilde\theta)}. \tag{12}
\]
It is now immediate from (12) that having $\lim_{t\to\infty} \mu_{1,t}(\theta) = 1$, $\mathbb{P}$-almost surely, is sufficient to get $\mu_{i,t}(\theta) \to 1$ as $t \to \infty$ with $\mathbb{P}$-probability one, and at the same asymptotic rate as $\mu_{1,t}(\theta) \to 1$. In particular, if the root node learns the true parameter, so that per Theorem 1 we have $D_{KL}(\ell_1(\cdot\mid\theta)\,\|\,\ell_1(\cdot\mid\check\theta)) > 0$ for all $\check\theta \in \Theta \setminus \{\theta\}$, then for every agent $i \in [n]$ in the directed rooted tree we have that $\mu_{i,t}(\theta) \to 1$, $\mathbb{P}$-almost surely as $t \to \infty$, at the exponentially fast asymptotic rate $R_1(\theta)$ of (8). This, in the main part, is due to the fact that a point mass is a stationary point for the belief iterations proposed in (9). In what follows, we shall combine the results from this and the previous subsection to address the general class of digraphs with zero or one node degrees.

C. Generalization to Hybrid Structures

We begin with the observation that any weakly connected digraph $\mathcal{G}$ which has only degree-zero or degree-one nodes can be drawn as a rooted tree whose root is replaced by a directed circle, e.g., Fig. 1b. This is true since any such digraph can have at most one directed circle, and all other nodes that are connected to this circle should be directed away from it; otherwise $\mathcal{G}$ would have to include a node of degree two or higher. The case of digraphs $\mathcal{G}$ with no circles is that of the rooted trees discussed in Subsection IV-B.
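Returning to the rooted-tree case of the previous subsection, the claim that a follower inherits the root's asymptotic learning behavior can be sketched with a hypothetical two-node chain: the root runs the self-update (3) on informative signals, while the leaf runs (9) with completely uninformative signals of her own, so her belief is driven entirely by the root's. The states, likelihoods, and horizon are made up for the example.

```python
import random

random.seed(2)
THETA = ["a", "b"]
# Hypothetical binary-signal likelihoods [P(s = 0 | state), P(s = 1 | state)]:
# the root (node 1) is informative, the leaf (node 2) is uninformative on her own.
ell = {1: {"a": [0.7, 0.3], "b": [0.4, 0.6]},
       2: {"a": [0.5, 0.5], "b": [0.5, 0.5]}}
true_state = "a"

def draw(i):
    return 0 if random.random() < ell[i][true_state][0] else 1

def update(i, base_belief, s):
    """Bayes' rule over the finite state space; used as (3) when base_belief
    is the agent's own previous belief, and as (9) when it is the neighbor's."""
    unnorm = {th: base_belief[th] * ell[i][th][s] for th in THETA}
    z = sum(unnorm.values())
    return {th: p / z for th, p in unnorm.items()}

nu = {th: 0.5 for th in THETA}
mu = {i: update(i, nu, draw(i)) for i in (1, 2)}
for _ in range(500):
    root_prev = mu[1]
    mu[1] = update(1, mu[1], draw(1))         # root: self-update (3)
    mu[2] = update(2, root_prev, draw(2))     # leaf: neighbor's belief, rule (9)
print(mu[1][true_state], mu[2][true_state])
```

The leaf learns despite having no informative signals of her own, lagging the root by one step, in the spirit of (12).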
There fore suppose that G co ntains a circle of length l consisting of its first l nodes and let them be labeled by [ l ] . Next note that any of the l nodes belo nging to the directed circle would lea rn the true state of the world at the exponentially fast a symptotic rate of (1 /l ) R circle ( θ ) deriv ed in Subsection IV -A. Mo reover , each of th e nodes i ∈ [ n ] K [ l ] is conne cted uniq uely to a node i l ∈ [ l ] and along a distinct path ( i l , i l 1 , i l 2 , . . . , i l k , i ) for some k ∈ [ n − l − 1] . Thereby , using the same argumen t as the on e leadin g to (1 2) in the case of r ooted trees, we get th at for any i ∈ [ n ] K [ l ] th at µ i,t ( θ ) = l i ( s i,t | θ ) k Y j =1 l i j ( s i j ,t − k + j − 1 | θ ) µ i l ,t − k − 1 ( θ ) X ˜ θ ∈ Θ l i ( s i,t | ˜ θ ) k Y j =1 l i j ( s i j ,t − k + j − 1 | ˜ θ ) µ i l ,t − k +1 ( ˜ θ ) , wherefor e w ith P -prob ability one as t → ∞ if µ i l ,t ( θ ) → 1 , then µ i,t ( θ ) → 1 as well, and at the same asymptotic rate. Indeed , we have that if R circle ( θ ) > 0 , then every agent in the network would learn the true state θ asymptoticly exponen- tially fast, and all at the same rate gi ven by (1 /l ) R circle ( θ ) > 0 . A Leader-F ollower Arc h itectur e: The p receding results ca n be summ arized upo n the observation that tho se agent belong- ing to th e so- called “root c ircle” c ombine their observations in a Bayesian manner ( except for a penalty of 1 /l in the asymptotic rate) and o nce the opinions o f the cir cle agen ts conv erge to a p oint mass, the rest of the agents fo llow as well, after a finite numb er o f steps that depen ds on their distance to the root circle. Indeed , the first l agen ts form a circle of leaders where they combine their observations an d reach a consensus; ev ery other agent in th e network then follows whatev er state that the leaders h ave co llectiv ely agr eed upon. V . 
V. CONCLUDING REMARKS

In this paper, a belief aggregation method is proposed and shown to be applicable to a class of directed networks that can be drawn as a rooted tree with the root node replaced by a directed circle. The proposed update rules replicate that of a single Bayesian agent, except that in the case of degree-one nodes the self-beliefs are replaced by the beliefs communicated by the neighboring agents. Accordingly, the agents belonging to the root circle can combine their observations to learn the true state of the world, even if none can distinguish the truth privately. Any peripheral agent that does not belong to the root circle would then follow the beliefs of the root agent to whom she is connected, either directly or indirectly through her neighbor, her neighbor's neighbor, and so on. Thereby, all agents in the network learn the true state of the world exponentially fast and at the same asymptotic rate, so long as the truth is distinguishable through the combined observations of all agents in the root circle. The asymptotic rate at which learning occurs is shown to equal (1/l) R_{circle}, where l is the length of the root circle and R_{circle} is the exponential rate at which a Bayesian agent with direct access to all the observations of the root agents would learn the truth. The authors' ongoing research focuses on the investigation and analysis of belief update rules that provide asymptotic learning in a wider variety of network structures, and on the tractable modeling and analysis of rational behavior in networks, using the so-called Learning without Recall framework.
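As a numerical illustration of the rate expression (with hypothetical parameters, not taken from the paper): for independent signals, the rate at which a Bayesian agent pooling all the circle's observations rules out a false state is the sum over circle agents of the KL divergences between their signal distributions under the true and the false state, and R_circle is the minimum of this sum over false states. In the sketch below each agent alone has a zero-divergence alternative, yet the sum stays positive for every false state:

```python
from math import log

# Numerical sketch (illustrative parameters): a root circle of length l = 2
# with binary private signals. Each agent has one alternative state it cannot
# distinguish privately (zero KL divergence), but the summed divergence over
# the circle is positive for every false state, so R_circle > 0.

def kl_bern(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), in nats."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

THETA, TRUE, l = [0, 1, 2], 0, 2
P = {1: [0.7, 0.7, 0.3],   # agent 1 cannot tell state 0 from state 1
     2: [0.6, 0.2, 0.6]}   # agent 2 cannot tell state 0 from state 2

# combined divergence for each false state, and the worst case over them
per_state = {th: sum(kl_bern(P[i][TRUE], P[i][th]) for i in P)
             for th in THETA if th != TRUE}
R_circle = min(per_state.values())

print(per_state)       # every false state has a positive combined rate
print(R_circle / l)    # (1/l) R_circle: the circle agents' asymptotic rate
```

With these numbers R_circle is about 0.339 nats, so the circle agents (and hence the whole network) learn at roughly 0.169 nats per time step, the 1/l penalty reflecting that each observation circulates around the two-node circle.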