An Information Theoretic Representation of Agent Dynamics as Set Intersections


Authors: Samuel Epstein, Margrit Betke

Boston University, 111 Cummington St, Boston, MA 02215
{samepst, betke}@cs.bu.edu

Abstract. We represent agents as sets of strings. Each string encodes a potential interaction with another agent or environment. We represent the total set of dynamics between two agents as the intersection of their respective sets of strings, and we prove complexity properties of player interactions using Algorithmic Information Theory. We show how the proposed construction is compatible with Universal Artificial Intelligence, in that the AIXI model can be seen as universal with respect to interaction.¹

Keywords: Universal Artificial Intelligence, AIXI Model, Kolmogorov Complexity, Algorithmic Information Theory

1 Introduction

Whereas classical Information Theory is concerned with quantifying the expected number of bits needed for communication, Algorithmic Information Theory (AIT) principally studies the complexity of individual strings. A central measure of AIT is the Kolmogorov complexity C(x) of a string x, which is the size of the smallest program that will output x on a universal Turing machine. Another central definition of AIT is the universal prior m(x), which weights a hypothesis (string) by the complexity of the programs that produce it [LV08]. This universal prior has many remarkable properties; if m(x) is used for induction, then any computable sequence can be learned with only the minimum amount of data. Unfortunately, C(x) and m(x) are not finitely computable. Algorithmic Information Theory can be interpreted as a generalization of classical Information Theory [CT91] and of the Minimum Description Length principle. Some other applications include universal PAC learning and Algorithmic Statistics [LV08,GTV01].
The question of whether AIT can be used to form the foundation of Artificial Intelligence was answered in the affirmative with Hutter's Universal Artificial Intelligence (UAI) [Hut04]. This was achieved by applying the universal prior m(x) to the cybernetic agent model, where an agent communicates with an environment through sequential cycles of action, perception, and reward. It was shown that there exists a universal agent, the AIXI model, that inherits many universality properties from m(x). In particular, the AIXI model will converge to achieve optimal rewards given long enough time in the environment. As almost all AI problems can be formalized in the cybernetic agent model, the AIXI model is a complete theoretical solution to the field of Artificial General Intelligence [GP07].

¹ The authors are grateful to Leonid Levin for insightful discussions and acknowledge partial support by NSF grant 0713229.

In this paper, we represent agents as sets of strings and the potential dynamics between them as the intersection of their respective sets of strings (Sec. 2). We connect this interpretation of interacting agents to the cybernetic agent model (Sec. 2.2). We provide background on Algorithmic Information Theory (Sec. 3) and show how agent learning can be described with algorithmic complexity (Sec. 4). We apply combinatorial and algorithmic proof techniques [VV10] to study the dynamics between agents (Sec. 5). In particular, we describe the approximation of agents (Th. 2), the conditions for removal of superfluous information in the encoding of an agent (Th. 3), and the consequences of having multiple players achieving the same rewards in an environment (Th. 4). We show how the interpretation given in Sec.
2 is compatible with Universal Artificial Intelligence, in that the AIXI model has universality properties with respect to our definition of "interaction" (Sec. 6).

2 Interaction as Intersection

We define players A and B as two sets containing strings of size n. Each string x in the intersection set A ∩ B represents a particular "interaction" between players A and B. We will use the terms string and interaction interchangeably. This set representation can be used to encode non-cooperative games (Sec. 2.1) and instances of the cybernetic agent model (Sec. 2.2). Uncertainties in instances of both domains can be encoded into the size of the intersections. The amount of uncertainty between players is equal to |A ∩ B|. If the interaction between the players is deterministic, then |A ∩ B| = 1. If uncertainty exists, then multiple interactions are possible and |A ∩ B| > 1. We say that player A interacts with B if |A ∩ B| > 0.

2.1 Non-cooperative Games

Sets can be used to encode adversaries in sequential games [RN09], where agents exchange a series of actions over a finite number of plies. Each game or interaction consists of the recording of actions by adversaries α and β, with x = (a1, b1)(a2, b2)(a3, b3) for a game of three rounds. The player (set) representation A of adversary α is the set of games representing all possible actions by α's adversary together with α's responding actions, and similarly for player B representing adversary β. An example game is rock-paper-scissors, where adversaries α and β play two sequential rounds with an action space of {R, P, S}. Adversary α only plays rock, whereas adversary β first plays paper, then copies his adversary's play of the first round. The corresponding players (sets) A and B can be seen in Fig. 1a.
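The rock-paper-scissors construction above can be enumerated directly. The following minimal Python sketch (the string encoding of games is an illustrative assumption on our part) builds both player sets and recovers their single shared game:

```python
from itertools import product

MOVES = "RPS"

def game_str(rounds):
    # encode a two-round game as "(a1,b1)(a2,b2)"
    return "".join(f"({a},{b})" for a, b in rounds)

# Player A: adversary alpha always plays rock, paired with every
# possible sequence of opponent moves.
A = {game_str([("R", b1), ("R", b2)]) for b1, b2 in product(MOVES, repeat=2)}

# Player B: adversary beta plays paper first, then copies the
# opponent's first-round move.
B = {game_str([(a1, "P"), (a2, a1)]) for a1, a2 in product(MOVES, repeat=2)}

print(sorted(A & B))  # ['(R,P)(R,R)']
```

Each set has nine elements (one game per opponent move sequence), and their intersection is the single game stated in the text.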
The intersection set of A and B contains the single interaction x = "(R,P)(R,R)", which is the only possible game (interaction) that α and β can play.

Fig. 1. (a) The set representation of players A and B playing two rounds of rock-paper-scissors. The intersection set of A and B contains the single interaction x = "(R,P)(R,R)". (b) The cybernetic agent model.

Example 1 (Chess Game). We use the example of a chess game with uncertainty between two players: Anatoly as white and Boris as black. An interaction x ∈ A ∩ B between Anatoly and Boris is a game of chess played for at most m plies for each player, with x = a1 b1 a2 b2 ... am bm = ab_{1:m}. The chess move space V ⊂ {0,1}* has a short binary encoding, whose precise definition is not important. If the game has not ended after m rounds, then the game is considered a draw. Both players are nondeterministic: at every ply, they can choose from a selection of moves. Anatoly's decisions can be represented by a function f_A : V* → 2^V, and similarly Boris' decisions by f_B. Anatoly can be represented by a set A, with A = { ab_{1:m} : ∀ 1 ≤ k ≤ m, a_k ∈ f_A(ab_{1:k−1}) }, and similarly Boris by set B. Their intersection, A ∩ B, represents the set of possible games that Anatoly and Boris can play together.

Generally, sets can encode adversaries of non-cooperative normal-form games, with their interactions consisting of pure Nash equilibria [RN09].
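A player set can also be generated mechanically from a decision function, in the spirit of f_A and f_B in Example 1. The sketch below uses a toy two-symbol move space and toy policies of our own invention (not actual chess); it enumerates all games consistent with each player's nondeterministic choices and intersects the resulting sets:

```python
from itertools import product

V = ["0", "1"]  # toy two-symbol move space (stand-in for chess moves)
m = 2           # plies per player; games have the form a1 b1 a2 b2

def f_A(history):
    # toy white policy: any move on an empty history,
    # otherwise repeat white's previous move
    return set(V) if not history else {history[-2]}

def f_B(history):
    # toy black policy: copy white's move from the same round
    return {history[-1]}

def player_set(f, role):
    """All games ab_{1:m} whose moves for `role` are allowed by f."""
    games = set()
    for moves in product(V, repeat=2 * m):
        ok = all(moves[k] in f("".join(moves[:k]))
                 for k in range(2 * m) if k % 2 == role)
        if ok:
            games.add("".join(moves))
    return games

A = player_set(f_A, role=0)  # constrains white's moves a_k
B = player_set(f_B, role=1)  # constrains black's moves b_k
print(sorted(A & B))  # ['0000', '1111']
```

The intersection contains exactly the games playable by both policies at once, mirroring how A ∩ B captures the possible games between Anatoly and Boris.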
A normal-form game is defined as (p, q), with the adversaries represented by normalized payoff functions p and q of the form {0,1}^n × {0,1}^n → [0,1]. The set of pure Nash equilibria is { ⟨x, y⟩ : p(x, y) = q(y, x) = 1 }. For each payoff function p there is a player A = { ⟨x, y⟩ : p(x, y) = 1 }, and for each payoff function q there is a player B. The intersection of A and B is equal to the set of pure Nash equilibria of p and q.

2.2 Cybernetic Agent Model

The interpretation of "interaction as intersection" is also applicable to the cybernetic agent model used in Universal Artificial Intelligence [Hut04]. In the cybernetic agent model, there is an agent and an environment communicating in a series of cycles k = 1, 2, ... (Fig. 1b). At cycle k, the agent performs an action y_k ∈ Y, dependent on the previous history yx_{<k}; the environment returns a perception x_k consisting of a reward r_k and an observation o_k. The agent p is represented by a player A^p_m, the set of histories yx_{1:m} that p can produce over m cycles, and the environment µ at time horizon m and difficulty τ by a player B^µ_{m,τ} = { yx_{1:m} : r_1 + ... + r_m ≥ τ, µ(yx_{1:m}) > 0 }. Player B^µ_{m,τ} represents all possible histories of µ (however unlikely) where the reward is at least τ. If A^p_m ∩ B^µ_{m,τ} = ∅, then environment µ is "too difficult" for the agent p; there is no interaction where the agent can receive a reward of at least τ. We say the agent p interacts with the environment µ at time horizon m and difficulty τ if A^p_m ∩ B^µ_{m,τ} ≠ ∅.

Example 2 (Peter and Magnus). We present a cybernetic agent model interpretation of chess with reward-based players Peter and Magnus (same rules as Example 1). Peter, the agent p, has to be deterministic, whereas Magnus, the environment µ, has uncertainty. At cycle k, each action y_k is Peter's move and each perception x_k is Magnus' move. At ply m in the chess game, Magnus returns a reward of 1 if Peter has won. In rounds where the game is unfinished, or if Peter loses or draws, the reward is 0. The player (set) A^p_m represents Peter's plays for m rounds.
The player (set) for Magnus with difficulty threshold τ = 1 and m plies, B^µ_{m,1}, is the set of all games that Magnus loses in m rounds or less. If A^p_m ∩ B^µ_{m,1} = ∅, then Peter cannot interact with Magnus at difficulty level 1; Peter can never beat Magnus at chess in m rounds or less. If A^p_m ∩ B^µ_{m,1} ≠ ∅, then Peter can beat Magnus at a game of chess in m rounds or less.

Another construction of a player D^µ_{m,τ} with respect to environment µ is D^µ_{m,τ} = { yx_{1:m} : ∀k, V^{*,µ}_{1:m}(yx_{1:k}) / V^{*,µ}_{1:m} ≥ τ }. With this interpretation, player D^µ_{m,τ} represents all histories where at each time k, 1 ≤ k ≤ m, an agent can potentially achieve an expected reward of at least τ times the optimal expected reward. If A^p_m ∩ D^µ_{m,τ} = ∅, then environment µ is "too difficult" for the agent p; there is no interaction where at every cycle k the agent has the potential to receive an expected reward of at least τ V^{*,µ}_{1:m}.

3 Background in Algorithmic Information Theory

We denote finite binary strings by x ∈ {0,1}* and the length of a string by l(x). Let the pairing function ⟨·,·⟩ be the standard one-to-one mapping from N × N to N, where ⟨x, y⟩ = x′y = 1^{l(l(x))} 0 l(x) x y and l(⟨x, y⟩) = l(y) + l(x) + 2 l(l(x)) + 1. The Kolmogorov complexity C(x) is the length of the shortest binary program to compute x on a universal Turing machine ψ: C(x) = min{ l(d) : ψ(d) = x }. The prefix-free Kolmogorov complexity, K(x), restricts the universal machine ψ so that no halting program is a proper prefix of another halting program. For the rest of this paper, we use plain Kolmogorov complexity. Kolmogorov complexity is not finitely computable. The conditional Kolmogorov complexity of x relative to y, C(x | y), is defined as the length of a shortest program to compute x using y as an auxiliary input to the computation.
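The pairing function and its length identity can be checked mechanically. In the Python sketch below (function names are ours), l(x) inside the encoding is written as the binary representation of the length of x:

```python
def pair(x, y):
    """Encode <x, y> = 1^{l(l(x))} 0 l(x) x y over binary strings."""
    lx = format(len(x), "b")           # binary representation of l(x)
    return "1" * len(lx) + "0" + lx + x + y

def unpair(z):
    """Invert the encoding: recover (x, y) from <x, y>."""
    k = 0
    while z[k] == "1":                 # count leading 1s: l(l(x))
        k += 1
    n = int(z[k + 1:k + 1 + k], 2)     # read l(x) from the next k bits
    x = z[2 * k + 1:2 * k + 1 + n]
    return x, z[2 * k + 1 + n:]

x, y = "101", "0110"
z = pair(x, y)
assert unpair(z) == (x, y)             # the mapping is one-to-one
# length identity: l(<x,y>) = l(y) + l(x) + 2*l(l(x)) + 1
assert len(z) == len(y) + len(x) + 2 * len(format(len(x), "b")) + 1
```

The self-delimiting prefix (the run of 1s, the 0, and the binary length) is what lets a decoder find the boundary between x and y without any separator symbol.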
The complexity of two strings x and y is denoted by C(x, y) = C(⟨x, y⟩). The conditional complexity of two strings is C(x | y, z) = C(x | ⟨y, z⟩). The complexity of the information in x about y is I(x : y) = C(y) − C(y | x). The conditional mutual information is I(x : y | z) = C(y | z) − C(y | x, z) and can be interpreted as the information x provides about y when z is given. The complexity of a function f : {0,1}* → {0,1}* is C(f) = min{ C(p) : ∀x, ψ(p, x) = f(x) }. The Levin complexity is defined by C^t(x) = min_p { l(p) + log t(p, x) : ψ(p) = x }, with t(p, x) being the number of steps taken by ψ until x is printed (without ψ necessarily halting). Levin complexity is computable. The complexity of a finite set S is C(S), the length of the shortest program f from which the universal Turing machine ψ computes a listing of the elements of S and then halts. If S = {x1, ..., xn}, then ψ(f) = ⟨x1, ⟨x2, ..., ⟨x_{n−1}, xn⟩...⟩⟩. The conditional complexity C(x | S) is the length of the shortest program from which ψ, given S literally as auxiliary information, computes x. For every set S containing x, it must be that C(x | S) ≤ log |S| + O(1). The randomness deficiency measures the lack of typicality of x with respect to set S, with δ(x | S) = log |S| − C(x | S) for x ∈ S, and ∞ otherwise. If δ(x | S) is small enough, then x is a typical element of S; x satisfies all simple properties that hold for high majorities of strings in S.

Example 3 (Anatoly's Games). Chess player Anatoly with function f_A can be represented as a set A (see Example 1). Set A is simple relative to f_A and the maximum number of plies m, with C(A | f_A, m) = O(1), where O(1) is the length of code required to use f_A and m to enumerate all games x ∈ A.
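The nested-pair listing ψ(f) = ⟨x1, ⟨x2, ..., ⟨x_{n−1}, xn⟩...⟩⟩ can be sketched the same way. Here pair/unpair are a Python rendering of the pairing function from the start of this section, and encode_set folds a sorted listing of S into one string (a toy illustration of the encoding only, not a complexity claim; all names are ours):

```python
from functools import reduce

def pair(x, y):
    # <x, y> = 1^{l(l(x))} 0 l(x) x y, with l(x) written in binary
    lx = format(len(x), "b")
    return "1" * len(lx) + "0" + lx + x + y

def unpair(z):
    k = 0
    while z[k] == "1":
        k += 1
    n = int(z[k + 1:k + 1 + k], 2)
    return z[2 * k + 1:2 * k + 1 + n], z[2 * k + 1 + n:]

def encode_set(S):
    """Encode a nonempty finite set as <x1, <x2, ..., <x_{n-1}, xn>...>>."""
    xs = sorted(S)
    return reduce(lambda acc, x: pair(x, acc), reversed(xs[:-1]), xs[-1])

def decode_listing(z, n):
    """Recover the n-element listing from its nested-pair encoding."""
    xs = []
    for _ in range(n - 1):
        x, z = unpair(z)
        xs.append(x)
    xs.append(z)
    return xs

S = {"00", "01", "11"}
assert decode_listing(encode_set(S), len(S)) == sorted(S)
```

A program that prints such an encoding and halts is exactly the kind of object whose minimal length defines C(S) above.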
The following theorem, used in Section 5, shows that if a string x is contained in a large number of sets of a certain complexity, then it is contained in a simpler set [VV04]. The enumerative complexity, C_E(𝓕), is the Kolmogorov complexity of a non-halting program that enumerates all the sets F ∈ 𝓕. This theorem also holds for conditional complexity bounds, C(F | y).

Theorem 1 ([VV04]). Let 𝓕 be a family of subsets of a set of strings G. If x ∈ G is a member of each of 2^k sets F ∈ 𝓕 with C(F) ≤ r, then x is a member of a set F′ ∈ 𝓕 with C(F′) ≤ r − k + O(log k + log r + log log |G| + C_E(𝓕)).

4 Player Strategy Learning

Players A and B can learn information about each other's strategies from a single interaction (game) x ∈ A ∩ B or from their entire interaction set (all possible games) A ∩ B. The capacity of a player A is the maximum amount of information that A can receive about another player through all possible interactions, i.e., their interaction set. It is equal to the log of the number of possible subsets that it can have, log 2^{|A|} = |A|. We define the lack of typicality of a subset S with respect to A to be δ(S | A) = |A| − C(S | A) for S ⊆ A, and ∞ otherwise.

Example 4 (Capacity). Boris B uses a range of black openings, whereas Bill B′ uses only the Sicilian Defence. So Boris has a higher capacity, |B| ≫ |B′|, and can potentially learn more than Bill.

Example 5 (Randomness Deficiency). Let A be the chess games played by Anatoly. Bob is a simple player B′ who only moves his knight back and forth. Set S = A ∩ B′ represents all of A's games with B′. The randomness deficiency of these games, δ(S | A), is high, as S is easily computable from A, with C(S | A) ≪ |A|. Let T ⊆ A, in which T = A ∩ B are games played between Anatoly and Boris, who uses a range of chess strategies unknown to Anatoly.
Then δ(T | A) is low and C(T | A) is high. If A views every interaction in A ∩ B, the amount of information B reveals about itself is I(A ∩ B : B | A), the mutual information between B and A ∩ B, given A. This term can be reduced to C(A ∩ B | A) − C(A ∩ B | A, B) = C(A ∩ B | A) + O(1). We define the amount of knowledge that A receives about B from the interaction set as

R(B | A) = C(A ∩ B | A). (1)

The higher the randomness deficiency, δ(A ∩ B | A), of an interaction set, A ∩ B, with respect to player A, the less information, R(B | A), player A can receive about its opponent B, with

R(B | A) + δ(A ∩ B | A) = |A|. (2)

Player A receives the most information about its opponent when the randomness deficiency is δ(A ∩ B | A) ≈ 0.

Example 6. Let Anatoly, A, and Bob, B′, be the players of Example 5. Bob has a simple strategy and a lower capacity, |B′| ≪ |A|, but he learns a lot from Anatoly, with δ(A ∩ B′ | B′) ≈ 0 and R(A | B′) ≈ |B′|. Anatoly learns very little from Bob, with R(B′ | A) ≈ 0 and δ(A ∩ B′ | A) ≈ |A|.

Players can reveal information about themselves through a single interaction. The amount of information that A receives about B from their interaction x is

I(x : B | A) = C(x | A) − C(x | A, B). (3)

A graphical depiction of the complexities relating A, B, and x can be seen in Fig. 2. We define the lack of typicality of an interaction x with respect to both players to be

δ(x | A, B) = log |A ∩ B| − C(x | A, B) (4)

for x ∈ A ∩ B, and ∞ otherwise. If δ(x | A, B) is small, then x represents a typical interaction.

Fig. 2. The complexities and information of A, B, and their interaction x. The relationships hold up to logarithmic precision.
The information passed from player B to player A through a single interaction is represented by

I(x : B | A) + δ(x | A) = log(|A| / |A ∩ B|) + δ(x | A, B). (5)

The information passed between players through a single interaction, when the players have the same capacity, satisfies

I(x : B | A) + δ(x | A) = I(x : A | B) + δ(x | B) + O(1). (6)

Example 7. Anatoly A plays a game x with Boris B, who has the same capacity, with |A| = |B|. Anatoly tricks Boris with a King's Gambit, and the game x follows a series of moves extremely familiar to Anatoly. Boris reacts with the most obvious move at every turn. In this case the game is simple to Anatoly, with δ(x | A) being large and I(x : B | A) being small. The game is new to Boris, with δ(x | B) being small and I(x : A | B) being large. Thus Boris learns more than Anatoly from x. If the players have a deterministic interaction, then A ∩ B = {y}, and the information A receives from B reduces to I(y : B | A) + δ(y | A) = log |A|.

5 Player Approximation and Interaction Complexity

We show that, given an interaction x between players A and B, A can "construct" an approximate player B′ that has interaction x using a small number of extra bits ε, where C(B′ | A, x) = ε. We also show that the conditional complexity C(B′ | A) of the approximate player B′ is not greater than the amount of information I(x : B | A) that A obtains about B (up to logarithmic precision). We use the simplified notation log A = log |A|. We also use the player space notation, 𝓑, to denote a set of sets of strings.

Theorem 2. Given are a player space 𝓑 and players A and B ∈ 𝓑 over strings of size n with x ∈ A ∩ B and C(𝓑) = O(log n). Then there is a player B′ ∈ 𝓑 with x ∈ B′, C(B′ | A, x) = O(s), and C(B′ | A) ≤ I(x : B | A) + O(s), with s = log C(B | A) + log n.

Proof. Let r = C(B | A).
We define G as the set of strings of size n, with log log |G| = log n. We set 𝓕 = 𝓑, and so C_E(𝓕) = O(log n). Let N be the number of sets S ∈ 𝓑 with C(S | A) ≤ r and x ∈ S. We first show that C(B | A, x) ≤ log N + O(log nr). There is a program that, when given x, A, 𝓑, and r, with C(𝓑, r) = O(log nr), can enumerate all sets in 𝓑 containing x whose conditional complexity relative to A is less than or equal to r. Thus B can be created using such a program and an index of size ⌈log N⌉. By the application of Theorem 1, conditional on A, with k = ⌊log N⌋, there is a set B′ ∈ 𝓕 with x ∈ B′ and C(B′ | A) ≤ r − k + O(log nr) ≤ C(B | A) − C(B | A, x) + O(log nr) = I(x : B | A) + O(log nr). To prove C(B′ | A, x) = O(s), assume B′ is the set satisfying the above properties that minimizes C(B′ | A) up to precision O(s). It must be that C(B′ | A, x) = O(s). Otherwise C(B′ | A, x) = ω(s) and there is a set B′′ satisfying the properties above with C(B′′ | A) ≤ C(B′ | A) − C(B′ | A, x) + O(s) = C(B′ | A) − ω(s), causing a contradiction.

Example 8 (Opponent Reconstruction). Anatoly, A, plays a chess game x with Boris, B, with x ∈ A ∩ B. The players use a random string b of size C(x | A, B) to help decide their moves. Without using b, Anatoly can "construct" Bob, B′, an impersonator of Boris, using information from the game x and O(log C(B | A) + log l(x)) bits. Bob can play the same game x with Anatoly.

Given are players A and B who interact, in that A ∩ B ≠ ∅. We show that there exists an interacting player B′ that has complexity bounded by the mutual information of A and B.
If Theorem 1 can be strengthened such that the enumerative complexity term C_E(𝓕) is replaced by C_EE(𝓕), the complexity of enumerating both the sets and the elements of the sets of 𝓕, then the precision of Theorems 3 and 4 can be strengthened by replacing the Levin complexity term C^t(A) with the Kolmogorov complexity C(A).

Theorem 3. Given are a player space 𝓑 and players A and B ∈ 𝓑 with A ∩ B ≠ ∅. Then there exists a player B′ ∈ 𝓑 with A ∩ B′ ≠ ∅ and C(B′) ≤ I(A : B) + O(s), with s = log C(B) + log C^t(A) + C(𝓑).

Proof. Let r = C(B), h = C^t(A), and q = 2^{C(𝓑)}. We define G = { ⟨S⟩ : C^t(S) ≤ h }, with ⟨S⟩ being an encoding of set S. This implies log log |G| = O(log h). We define 𝓕 with a recursive function λ : 𝓑 → 𝓕, with λ(S) = { ⟨T⟩ : C^t(T) ≤ h, S ∩ T ≠ ∅ }. It must be that C(λ) = O(log h). The enumeration complexity of 𝓕 requires the encoding of 𝓑 and λ, and so C_E(𝓕) = O(log hq). Thus if ⟨T⟩ ∈ λ(S), then T ∩ S ≠ ∅. Let N be the number of sets S ∈ 𝓑 with C(S) ≤ r and S ∩ A ≠ ∅. Thus C(B | A) ≤ log N + O(log hqr), as there is a program that, when given A, r, 𝓑, and an index of size ⌈log N⌉, can return any such S. By the application of Theorem 1, with x = ⟨A⟩ and k = ⌊log N⌋, there is a set F ∈ 𝓕 with x ∈ F and C(F) ≤ r − k + O(log hqr) ≤ C(B) − C(B | A) + O(log hqr) = I(A : B) + O(log hqr). A set B′ ∈ 𝓑 with λ(B′) = F can be easily recovered from F by enumerating all sets in 𝓑, applying λ to each one, and selecting the first one that produces F. So C(B′) ≤ C(F) + O(log q) ≤ I(A : B) + O(log hqr). Since ⟨A⟩ ∈ λ(B′), it must be that A ∩ B′ ≠ ∅.
We show that if a player A interacts with numerous players of a given complexity and uncertainty, then there exists a simple player B′ who interacts with A with the same uncertainty.

Theorem 4. Given are a player space 𝓑, a player A, and 2^k players B ∈ 𝓑 where for each B, 0 < |A ∩ B| ≤ c and C(B) ≤ r. Then there is a player B′ ∈ 𝓑 such that 0 < |A ∩ B′| ≤ c and C(B′) ≤ r − k + O(s), with s = log C^t(A) + log c + log k + log r + C(𝓑).

Proof. Let h = C^t(A) and q = 2^{C(𝓑)}. We define G ⊆ {0,1}* as a set of strings, each encoding a set (player) S whose Levin complexity is less than or equal to h. This implies log log |G| = O(log h). We represent the encoding of S by ⟨S⟩. We define 𝓕 with a recursive function λ : 𝓑 → 𝓕, with λ(S) = { ⟨T⟩ : C^t(T) ≤ h, 0 < |S ∩ T| ≤ c }. Thus it must be that C(λ) = O(log ch). The enumeration complexity of 𝓕 requires the encoding of c, h, and 𝓑, with C_E(𝓕) = O(log chq). Thus if ⟨T⟩ ∈ λ(S), then player T and player S have a nonempty intersection of size at most c. From the assumptions of this theorem, ⟨A⟩ is covered by at least 2^k sets λ(B) ∈ 𝓕 of complexity C(λ(B)) ≤ r + O(log chq). By the application of Theorem 1, with x = ⟨A⟩, there is a set F ∈ 𝓕 with x ∈ F and C(F) ≤ r − k + O(log chkqr). A set B′ ∈ 𝓑 with λ(B′) = F can be recovered from F by enumerating all sets in 𝓑, applying λ to each one, and selecting the first one that produces F. Therefore C(B′ | F) ≤ O(log chq), and so C(B′) ≤ C(F) + O(log chq) ≤ r − k + O(log chkqr). Since ⟨A⟩ ∈ λ(B′), it must be that 0 < |A ∩ B′| ≤ c, thus proving the theorem.

Example 9. An example application of Theorem 4 is a game of the same form as Example 2. Magnus, represented by set B, plays 2^k games against 2^k young players A ∈ 𝓐.
Furthermore, the players and Magnus are deterministic, with |A ∩ B| = 1 for each A ∈ 𝓐. The difficulty threshold τ is set to 1, so every one of the young players beats Magnus. By Theorem 4, if all players A ∈ 𝓐 have complexity at most C(A) ≤ r, then there is a simpler player A′ ∈ 𝓐 that will win against Magnus, with C(A′) ≤ r − k + ε (with ε being of logarithmic order) and |A′ ∩ B| = 1.

6 Future Work: Universal Interaction

Since the agents and environments of the cybernetic agent model of Section 2.2 can be translated into set representations, there is potential application of the proof techniques used in Section 5 to Universal Artificial Intelligence [Hut04], and in particular to describing properties of the AIXI model. The universal environment, ξ, is defined using a form of the universal prior, m(x) = Σ_{p : ψ(p) = x} 2^{−l(p)}, representing a semimeasure (degenerate probability) over all infinite strings, with ξ(yx_{1:k}) = Σ_ρ 2^{−K(ρ)} ρ(yx_{1:k}). The universal environment ξ is the weighted summation over all chronological environments ρ. The term K(ρ) represents the prefix-free Kolmogorov complexity of ρ. The AIXI model p^ξ_m is the optimal agent for the environment ξ with horizon m, in that p^ξ_m = arg max_p V^{p,ξ}_{1:m}. The sequence of self-optimizing AIXI agents for each time horizon is { p^ξ_i }_{i=1,2,...}. Let M be a set of environments where a sequence of self-optimizing policies p̃_m exists. The sequence converges to receive the optimal average for all environments, with ∀ν ∈ M: (1/m) V^{p̃_m,ν}_{1:m} → (1/m) V^{*,ν}_{1:m} as m → ∞. By Theorem 5.29 from [Hut04], it must be that the sequence of AIXI agents is optimal for M, with (1/m) V^{p^ξ_m,ν}_{1:m} → (1/m) V^{*,ν}_{1:m} as m → ∞. We use the conversion of agents p and environments µ to sets A^p_m and D^µ_{m,τ} as introduced at the end of Section 2.2.
The sequence of self-optimizing AIXI agents, { p^ξ_i }_{i=1,2,...}, is universal with regard to interaction with respect to M. It is easy to see that for all τ and all environments ν ∈ M, there is a number m_{ν,τ} where for all m > m_{ν,τ}, A^{p^ξ_m}_m ∩ D^ν_{m,τ} ≠ ∅. This implies that a set representation of agent dynamics can be used to describe further properties of the AIXI model. There is potential for a deep connection, roughly analogous to how prefix-free Kolmogorov complexity and the universal prior are related by the Coding Theorem K(x) = − log m(x) + O(1) [LV08].

7 Conclusions

We used Algorithmic Information Theory to quantify the information exchanged between agents that interact in non-cooperative games (Sec. 4). We have shown that an agent A can construct an approximation of his opponent B using information from a single interaction (game) with B (Th. 2). We have shown that if an agent B with superfluous information interacts with an environment A and achieves a certain reward, then there exists another agent B′ without this information that can achieve the same reward (Th. 3). We have also shown that if multiple agents interact with an environment to achieve a certain reward, then there exists a simple agent who can achieve the same reward (Th. 4). Our constructions are compatible with Universal Artificial Intelligence, in that the AIXI model can be interpreted as universal with regard to interactions with environments (Sec. 6).

References

CT91. T. Cover and J. Thomas. Elements of Information Theory. Wiley-Interscience, New York, NY, USA, 1991.
GP07. B. Goertzel and C. Pennachin. Artificial General Intelligence (Cognitive Technologies). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2007.
GTV01. P. Gács, J. Tromp, and P. Vitányi. Algorithmic Statistics. IEEE Transactions on Information Theory, 47(6), 2001.
Hut04. M. Hutter.
Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin, 2004.
LV08. M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer Publishing Company, Incorporated, 3rd edition, 2008.
RN09. S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 3rd edition, 2009.
VV04. N. Vereshchagin and P. Vitányi. Algorithmic Rate Distortion Theory. arXiv:cs/0411014v3, 2004. http://arxiv.org/abs/cs.IT/0411014.
VV10. N. Vereshchagin and P. Vitányi. Rate Distortion and Denoising of Individual Data using Kolmogorov Complexity. IEEE Transactions on Information Theory, 56, 2010.
