Encoding dynamics for multiscale community detection: Markov time sweeping for the Map equation

Michael T. Schaub (1, 2, ∗), Renaud Lambiotte (3), and Mauricio Barahona (1, †)

(1) Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom
(2) Department of Chemistry, Imperial College London, London SW7 2AZ, United Kingdom
(3) Department of Mathematics and Naxys, University of Namur, 5000 Namur, Belgium

(Dated: March 1, 2018)

The detection of community structure in networks is intimately related to finding a concise description of the network in terms of its modules. This notion has been recently exploited by the map equation formalism (M. Rosvall and C. T. Bergstrom, PNAS, 105(4), pp. 1118–1123, 2008) through an information-theoretic description of the process of coding inter- and intra-community transitions of a random walker in the network at stationarity. However, a thorough study of the relationship between the full Markov dynamics and the coding mechanism is still lacking. We show here that the original map coding scheme, which is both block-averaged and one-step, neglects the internal structure of the communities and introduces an upper scale, the ‘field-of-view’ limit, in the communities it can detect. As a consequence, map is well tuned to detect clique-like communities but can lead to undesirable overpartitioning when communities are far from clique-like. We show that a signature of this behavior is a large compression gap: the map description length is far from its ideal limit. To address this issue, we propose a simple dynamic approach that introduces time explicitly into the map coding through the analysis of the weighted adjacency matrix of the time-dependent multistep transition matrix of the Markov process.
The resulting Markov time sweeping induces a dynamical zooming across scales that can reveal (potentially multiscale) community structure above the field-of-view limit, with the relevant partitions indicated by a small compression gap.

I. INTRODUCTION

The analysis of biological, technical and social networks has become extremely popular in recent years [1–3]. The availability of high dimensional relational data coupled with increasing computational power has set the ground for the investigation of complex systems from a network perspective, i.e., each agent or entity is viewed as a node interacting via multiple links with other nodes in the network. Such a viewpoint aims to understand the global emergent behavior of the system from the interactions between the individual components of the system, in contrast to focusing on each part on its own.

In many cases of interest, complex networks are far from being unstructured and contain relevant subgroupings or communities, possibly organized into (not necessarily hierarchical) multiple levels [4]. The detection of such community structure can be of importance for the understanding of the interplay between the structural and functional features of the network. In particular, parts of the system operating on given scales could be represented with a simplified description at an appropriate level of coarse graining.

Community detection methods based on a variety of heuristics (including modularity [5, 6] and spectral partitioning methods [7–10] among many others; see Refs. [1, 11] for recent reviews) have been proposed to find an optimized split into communities.
The communities thus found result from identifying groups with high intra-community weights as compared to the expected weights in surrogate models of the network. In adopting such a structural criterion, these methods introduce an intrinsic scale that establishes limits on the communities they can detect, thus leading to potential misdetection [12]. Furthermore, such single scale methods are not suitable for the analysis of networks in which there is not a single ‘best’ mesoscopic level of description, but rather multiple levels associated with different scales in the system [13].

∗ E-mail: michael.schaub09@imperial.ac.uk
† E-mail: m.barahona@imperial.ac.uk

In order to account for the presence of multiple levels of organization, multiscale methods have been introduced that allow one to search for the right scale at which the network should be analyzed [14–17]. Recently, it has been shown that one can use the time evolution of a Markov process on the graph to reveal relevant communities at different scales in a process of dynamic zooming through the so-called partition stability [12, 18, 19]. As the Markov time increases, the diffusive process involves multistep transitions and explores further afield the structure of the graph, resulting in the detection of community structure across scales, from finer to coarser. This dynamic approach has the advantage that it provides a unifying framework for structural community detection methods (such as modularity and spectral methods), which can be seen as particular cases of this approach involving one-step measures.

A different perspective is provided by an information theoretic framework that considers the problem of finding communities in a network as a coding or compression problem [20–23].
The underlying idea is that the presence of communities should imply the existence of an efficient and concise way to encode the behavior of a system in terms of its subgroups. In particular, the map equation method recently introduced by Rosvall et al. [21, 22] relies on a compression of the description length of a random walk inside and between communities to find good graph partitions. This method has received a lot of attention, since it has been shown to be extremely efficient on benchmark tests [24], outperforming the popular modularity [5, 6]. It has also been shown to be immune to the resolution limit [25] that affects the performance of modularity. However, the mathematical properties and possible limitations of the map equation remain relatively unexplored.

Here we show that the map equation can also be understood as a one-step method and, consequently, it suffers from an upper scale (the field-of-view limit) above which it cannot detect communities [12]. This limited field-of-view can lead to overpartitioning when communities are far from being clique-like [12]. In addition, the one-step map coding scheme also neglects the internal structure of the communities and, in doing so, introduces a bias towards communities that are locally fast mixing (and in this sense clique-like). We also show that the quality of the map partitioning can be assessed through the existence of a small compression gap, i.e., a small distance between the compression achieved by Map and its theoretical limit given by the true entropy rate of the Markov process. To alleviate some of these limitations, we introduce a dynamical approach that introduces time explicitly into the map coding scheme, by considering the time-dependent multi-step transition matrix of the Markov process on the network as the object of the map encoding.
This introduces a dynamic zooming by sweeping through the Markov time, which allows the detection of multiscale community structure with the map equation formalism.

II. COMMUNITY DETECTION FROM A CODING PERSPECTIVE: THE MAP EQUATION

The map formalism considers the problem of partitioning a network into non-overlapping communities from a coding perspective. The original map formalism [21] equates the quality of the partition to the efficiency of a code that would describe the notional transitions of a random walker inside and between communities. The Infomap algorithm can then be used to obtain good partitions through the optimization of this quality function.

The underlying principle is that the code for such one-step transitions of the random walker can be efficiently compressed in the presence of a strong community structure: short names for nodes (codewords) can be reused in different communities, much like street names can be reused in different cities of a country [21]. In the original map equation, the movement of the walker is described in terms of two kinds of codebook. The first kind of codebook is specific to each community and assigns a unique codeword for each node inside it and a particular exit codeword for the community. An additional codebook contains unique codewords that describe the movements between different communities. More recently, a hierarchical extension of the map formalism (a recursive version of the original method) has been presented [22] as well as an extension for overlapping modules [26]. We do not consider these extensions in detail here, as both methods are based on the same principles of the standard map equation and our findings are applicable to these as well.

A. Definitions and notation

An explicit rewriting of the original map formalism in terms of the stationary distribution of a random walk is as follows. Consider a discrete time Markov process on a graph with $N$ nodes:

$$ p_{k+1} = p_k D^{-1} A \equiv p_k M, \tag{1} $$

where $p_k$ is the $1 \times N$ (node) probability vector, $A$ is the (weighted) adjacency matrix of the graph, $D$ is the diagonal matrix containing the (weighted) degree of each node, and we have also defined $M$, the transition matrix of the random walk. The stationary distribution of the random walk, $\pi$, is then given by:

$$ \pi = \pi M. \tag{2} $$

Consider now a partition of the network into $c$ communities indexed by $\alpha = 1, \ldots, c$. At stationarity, the probability of leaving community $\alpha$ (or of arriving at community $\alpha$) is

$$ q_{\alpha\curvearrowright} = \sum_{i \in \alpha} \sum_{j \notin \alpha} \pi_i M_{ij}, $$

and the overall probability of changing community is

$$ q_{\curvearrowright} = \sum_{\alpha=1}^{c} q_{\alpha\curvearrowright}. $$

Similarly, the probability to stay within or to leave community $\alpha$ is

$$ p_{\alpha\circlearrowright} = q_{\alpha\curvearrowright} + \sum_{i \in \alpha} \pi_i. $$

The map equation then defines the per-step description length of a code associated with this partition as:

$$ L_M = \sum_{\alpha=1}^{c} p_{\alpha\circlearrowright} H(P_\alpha) + q_{\curvearrowright} H(Q), \tag{3} $$

a weighted combination of the Shannon entropies:

$$ H(P_\alpha) = -\frac{q_{\alpha\curvearrowright}}{p_{\alpha\circlearrowright}} \log_2\left(\frac{q_{\alpha\curvearrowright}}{p_{\alpha\circlearrowright}}\right) - \sum_{i \in \alpha} \frac{\pi_i}{p_{\alpha\circlearrowright}} \log_2\left(\frac{\pi_i}{p_{\alpha\circlearrowright}}\right), $$

$$ H(Q) = -\sum_{\alpha=1}^{c} \frac{q_{\alpha\curvearrowright}}{q_{\curvearrowright}} \log_2\left(\frac{q_{\alpha\curvearrowright}}{q_{\curvearrowright}}\right). $$

The two terms in Eq. (3) correspond to two classes of codebooks that encode one-step transitions at stationarity viewed through the prism of the given partition. The first term stems from the “community-centric” codebooks with probability distributions $P_\alpha$ (and associated entropy) of being at or leaving from each of the communities. The second term corresponds to the “inter-community” codebook with distribution $Q$ (and associated entropy) of changing community.

In the original map formalism it is proposed that a low $L_M$ is a characteristic of good partitions and the Infomap algorithm is used to search computationally for partitions with low $L_M$.

Figure 1. (Color online) Equivalent graph partitions for the map equation. Because the map equation ignores the specific connectivity of the graph, graph partitions with equal equilibrium and leaving probabilities become indistinguishable to Map. Different communities are represented by different colors (shades of gray). Unless indicated, the weight of the edge is 1. (a) Two graphs with different intra-community connectivity but the same map coding length, $L_M$. (b) Two graphs with different inter-community connectivity and the same $L_M$. (c) Two graphs with equal $L_M$ but very different inter- and intra-community connectivity. From the viewpoint of Map, a ring-of-rings is equivalent to a clique-of-cliques with different weights.

III. MAP ENCODES BLOCK-AVERAGED, ONE-STEP TRANSITIONS: IMPLICATIONS FOR COMMUNITY DETECTION

As shown by the definitions above, the original map equation does not fully code for the dynamics of the Markov process (1), as it only uses quantities derived from block-averaging of one-step transitions at stationarity. The simplifications involved in block-averaging the structure and in ignoring longer-term dynamics both have inter-related implications for community detection, which we now study in detail.

A. Block-averaging the connectivity: the compression gap and a bias towards over-fitting to clique-like communities

An examination of the terms in the map equation (3) reveals that the implicit block-averaging neglects the internal structure of the communities as well as the detailed inter-community connectivity.
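
The quantities defined in Section II A translate almost line by line into code. The following is a minimal sketch in plain Python, not the Infomap implementation: the names `map_description_length` and `cycle_graph` are illustrative, and the stationary distribution is taken as $\pi_i = d_i / \sum_j d_j$, which assumes an undirected, connected graph.

```python
from math import log2


def map_description_length(A, labels):
    """Per-step description length L_M (in bits) of Eq. (3).

    A      : weighted adjacency matrix (list of lists) of an undirected,
             connected graph, so that pi_i = d_i / sum_j d_j (assumption of
             this sketch; a directed graph would require solving pi = pi M).
    labels : labels[i] is the community of node i.
    """
    n = len(A)
    degree = [sum(row) for row in A]
    total = sum(degree)
    pi = [d / total for d in degree]                        # stationary distribution
    M = [[A[i][j] / degree[i] for j in range(n)] for i in range(n)]  # M = D^{-1} A

    communities = sorted(set(labels))
    q = {a: 0.0 for a in communities}      # leaving probability of each community
    mass = {a: 0.0 for a in communities}   # sum of pi_i inside each community
    for i in range(n):
        mass[labels[i]] += pi[i]
        for j in range(n):
            if labels[j] != labels[i]:
                q[labels[i]] += pi[i] * M[i][j]
    q_total = sum(q.values())

    length = 0.0
    for a in communities:
        p = q[a] + mass[a]
        # p_a * H(P_a): one exit codeword plus one codeword per member node
        if q[a] > 0:
            length -= q[a] * log2(q[a] / p)
        for i in range(n):
            if labels[i] == a and pi[i] > 0:
                length -= pi[i] * log2(pi[i] / p)
        # contribution of community a to q * H(Q), the inter-community codebook
        if q[a] > 0:
            length -= q[a] * log2(q[a] / q_total)
    return length


def cycle_graph(n):
    """Adjacency matrix of an unweighted ring with n nodes."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][(i + 1) % n] = A[(i + 1) % n][i] = 1.0
    return A


ring = cycle_graph(20)
all_in_one = map_description_length(ring, [0] * 20)
five_arcs = map_description_length(ring, [i // 4 for i in range(20)])
print(round(all_in_one, 4), round(five_arcs, 4))
```

In this sketch the graph enters only through $\pi$ and the leaving probabilities, which is the block-averaging at issue in this section: any two graphs that agree on those quantities for a given partition receive the same value.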
More precisely, given a particular partition, all graphs with the same equilibrium distribution $\pi$ and overall leaving probabilities $q_{\alpha\curvearrowright}$ will be indistinguishable in terms of their map quality, $L_M$, as exemplified in Figure 1.

From the viewpoint of entropies, the map equation (3) is formally equivalent to a weighted sum of the entropies of i.i.d. stochastic processes with states visited according to: normalized “community-centric” probabilities $\{\{\pi_i/p_{\alpha\circlearrowright}\}_{i \in \alpha},\, q_{\alpha\curvearrowright}/p_{\alpha\circlearrowright}\}_{\alpha=1}^{c}$; and normalized “leaving” probabilities $\{q_{\alpha\curvearrowright}/q_{\curvearrowright}\}_{\alpha=1}^{c}$, respectively. Alternatively, this procedure may be seen as formally equivalent to using a block-averaged transition matrix corresponding to a block-structured weighted (and in general directed) complete graph with self-loops. Consequently, Infomap exhibits a bias towards identifying communities that are formally equivalent to clique-like subgraphs.

In this sense, the map equation can be seen to code for a two-level, mean-field organization: one inside communities, one across communities. Such block-structured, all-to-all models are a good representation of community structure based on hierarchical cliques-of-cliques. Indeed, the map equation performs well in block-structured Erdős-Rényi benchmarks [24] and is not afflicted by the ‘resolution limit’ [25]. On the other hand, there are important networks with a more marked local structure in which communities are not clique-like [12]. Because the map formalism has not been designed to detect such non clique-like communities with large effective distances, Infomap will tend to overpartition such networks.

Ignoring the detailed connectivity: the compression gap of the map equation

The fact that Map ignores the detailed connectivity inside and outside the communities leads to a sub-optimal coding scheme.
This sub-optimality can be quantified through the compression gap (defined below), which can be used as a measure of when the Map block-averaging assumptions are a valid simplification for the network under study.

Consider the Markov chain with transition matrix $M$ and stationary distribution $\pi$, as given in Eq. (1). The most efficient coding of the dynamics of the associated Markov process at stationarity is bounded from below by the entropy rate $h$ [27, 28]:

$$ h(\pi; M) = -\sum_{ij} \pi_i M_{ij} \log_2(M_{ij}). \tag{4} $$

The corresponding optimal encoding can be asymptotically achieved by endowing each node with a dictionary for its outgoing links, as shown by Shannon [27]. This is a kind of ‘edge encoding.’

On the other hand, if we consider a coding scheme that gives each node a unique name within the whole graph (i.e., a ‘node encoding’), then the corresponding coding length is bounded by the entropy rate of the i.i.d. random variable with probability distribution $\pi$, which is equal to the entropy of the stationary distribution:

$$ H(\pi) = -\sum_{i} \pi_i \log_2(\pi_i). \tag{5} $$

The map coding scheme can be seen as a mixture of both: it encodes nodes uniquely within communities, but encodes for transitions (‘edges’) between communities. Therefore, in general,

$$ h(\pi; M) < L_M \le H(\pi), \tag{6} $$

and Map is sub-optimal in terms of its coding length [29], as recognized by Rosvall and Bergstrom in their original publication [21]. This sub-optimality can be understood with a simple example: consider a community $\alpha_{\mathrm{out}}$ out from which there is only one possible link to another community $\alpha_{\mathrm{to}}$. Map encodes this transition with two codewords: an exit codeword to signal the leaving of $\alpha_{\mathrm{out}}$ and a codeword to identify the destination community $\alpha_{\mathrm{to}}$. Clearly, the second codeword is redundant.
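
The two bounds in Eq. (6) are straightforward to evaluate for small graphs. The sketch below, in plain Python with illustrative function names, computes the entropy rate of Eq. (4) and the stationary entropy of Eq. (5); it again assumes an undirected, connected graph so that $\pi$ is proportional to the (weighted) degree.

```python
from math import log2


def transition_matrix(A):
    """Row-normalise the adjacency matrix: M = D^{-1} A."""
    return [[a / sum(row) for a in row] for row in A]


def stationary_distribution(A):
    """pi_i = d_i / sum_j d_j (valid for an undirected, connected graph)."""
    degree = [sum(row) for row in A]
    total = sum(degree)
    return [d / total for d in degree]


def entropy_rate(pi, M):
    """h(pi; M) = -sum_ij pi_i M_ij log2(M_ij), Eq. (4), in bits per step."""
    return -sum(pi[i] * m * log2(m)
                for i, row in enumerate(M) for m in row if m > 0)


def stationary_entropy(pi):
    """H(pi) = -sum_i pi_i log2(pi_i), Eq. (5), in bits per step."""
    return -sum(p * log2(p) for p in pi if p > 0)


# Unweighted ring with 20 nodes: each step is a fair choice between 2 neighbours.
n = 20
ring = [[0.0] * n for _ in range(n)]
for i in range(n):
    ring[i][(i + 1) % n] = ring[(i + 1) % n][i] = 1.0

# Four-node clique with self-loops: every step is uniform over all 4 nodes.
clique = [[1.0] * 4 for _ in range(4)]

for name, A in (("ring", ring), ("clique", clique)):
    pi, M = stationary_distribution(A), transition_matrix(A)
    print(name, round(entropy_rate(pi, M), 4), round(stationary_entropy(pi), 4))
```

For the ring the entropy rate (1 bit per step) sits well below the stationary entropy ($\log_2 20$ bits), whereas for the clique with self-loops the two coincide at 2 bits, so there is no room between the bounds of Eq. (6).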
Importantly, if the graph is a weighted, directed clique (i.e., with transition matrix $M = \mathbf{1}\pi$), then $h(\pi; M) = H(\pi)$ and the two coding schemes (‘edge’ and ‘node’) give the same result (see Figure 2(a)). Therefore, the sub-optimality of the map coding is minimal when the graph is close to a clique. Consequently, the minimization of the map cost function is well suited to identify community structure that is a clique of cliques: within each community Map uses a ‘node’ encoding while between communities Map encodes transitions by default. In such a scenario, the map coding scheme is nearly optimal and close to the entropy rate.

The sub-optimality of the map encoding plays a significant role when encoding communities with restricted connectivity. For instance, if the community is a ring, a random walker has only two possible nodes to transition to, instead of $n_\alpha$ as assumed by Map. In this case, there is a large gap between the map description length $L_M$ and the optimal limit established by $h(\pi; M)$, indicating that the full consideration of the graph structure in the Markov dynamics could be exploited for a better encoding (see Figure 2(b) for an example).

Figure 2. The compression gap of the map coding scheme. (a) For a clique with self loops, the map coding scheme is optimal (assuming no communities) and is equivalent to a uniform i.i.d. process with four states. (b) In a directed cycle, the map coding is far from optimal. The movement of a random walker on this graph can be encoded by just denoting the starting position but the map coding scheme enforces unique names for each node and thus requires at least 2 bits/step.
[Panel annotations: (a) nodes A to D, example sequence ...AACBDCCAB..., entropy rate $h = 2$ bits/step, map encoding $L_M = 2$ bits/step; (b) nodes E to H, example sequence ...EFGHEFGHE..., entropy rate $h = 0$ bits/step, map encoding $L_M = 2$ bits/step.]
This discussion highlights the fact that the block-averaging implicit in the original map scheme leads to a sub-optimality of the proposed map coding scheme that becomes significant when the network cannot be well described as a clique-of-cliques. In order to quantify this effect, we define the compression gap, $\delta$:

$$ \delta = (L_M - h)/h, \tag{7} $$

which measures how close the map encoding is to optimality. Note that other measures for the compression gap, such as $\delta' = (L_M - h)/(H - h)$, could be used and may be more suitable, or sensitive, in some cases. In this manuscript we stick mostly to the slightly simpler expression of $\delta$, as it is sufficient for our purposes. The compression gap can be used to establish when the communities identified by Map are far from being clique-like and hence serves as an indicator of the reliability of the partitions obtained by Infomap, as shown below.

B. One-step transitions: the field-of-view limit and a bias towards overpartitioning of non clique-like communities

As discussed above, the original map formalism is based on an implicit clique-like concept of community, and a community structure as a (statistical) clique of cliques. Although this model has proved successful in a variety of fields [24, 30], relevant technological, social and biological networks are far from being clique-like [12]. In such cases, Infomap might tend to overpartition communities as a result of an upper scale (the ‘field-of-view limit’) which stems from map encoding only for one-step transitions at stationarity. This field-of-view limit affects all one-step methods, including not only Map but also modularity. The field-of-view limit occurs on the opposite end of the well-known resolution limit that appears as a lower scale for modularity [25] but does not seem to impact Map [17].
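
As a small worked illustration of Eq. (7) and of the alternative $\delta'$, the sketch below evaluates both on the two panels of Figure 2. The function names and the handling of the degenerate cases ($h = 0$ and $H = h$) are choices made for this example, not part of the original formalism; the value $H(\pi) = 2$ bits for both four-node graphs follows from their uniform stationary distribution.

```python
from math import inf


def compression_gap(L_M, h):
    """delta = (L_M - h) / h, Eq. (7).

    For a deterministic walk (h = 0) the ratio diverges; this sketch
    returns inf in that case (an illustrative convention).
    """
    if h == 0:
        return inf if L_M > 0 else 0.0
    return (L_M - h) / h


def normalised_gap(L_M, h, H):
    """delta' = (L_M - h) / (H - h); returns None when H = h (undefined)."""
    if H == h:
        return None
    return (L_M - h) / (H - h)


# Figure 2(a): clique with self-loops, h = L_M = H = 2 bits/step.
print(compression_gap(2.0, 2.0), normalised_gap(2.0, 2.0, 2.0))

# Figure 2(b): directed cycle, h = 0 and L_M = H = 2 bits/step.
print(compression_gap(2.0, 0.0), normalised_gap(2.0, 0.0, 2.0))
```

Under these conventions $\delta$ vanishes for the clique but diverges for the directed cycle, while $\delta'$ stays bounded for the cycle and is undefined for the clique, which is one possible reading of why either measure may be preferable depending on the network.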
Overpartitioning of lattice-like graphs

The overpartitioning induced by the field-of-view limit can be understood analytically through the following simple examples of lattice-like graphs.

First, consider a cycle graph of length $N$ with unweighted edges. The equilibrium distribution of the random walk on this graph is $\pi_i = 1/N$, $i = 1, \ldots, N$. This graph has no community structure and the only relevant partition should be the global “all-in-one”. For a partition of the ring into $c \ge 2$ communities of contiguous nodes indexed by $\alpha$ we have:

$$ \left\{ q_{\alpha\curvearrowright} = 1/N, \ \forall \alpha; \quad q_{\curvearrowright} = c/N; \quad p_{\alpha\circlearrowright} = (n_\alpha + 1)/N \right\}, $$

where $n_\alpha$ is the number of nodes in community $\alpha$ and, clearly, $\sum_{\alpha=1}^{c} n_\alpha = N$. The map cost function of this partition is

$$ L_M(\{n_\alpha\}_{\alpha=1}^{c}) = \frac{c}{N} \log_2(c) + \sum_{\alpha=1}^{c} \frac{n_\alpha + 1}{N} \log_2(n_\alpha + 1). \tag{8} $$

Using convexity arguments, it is easy to show that for a given $N$ and $c$, the minimal $L_M$ is attained for the partition with equally-sized communities with $n_\alpha = N/c$, $\forall \alpha$, if it exists. For such a partition, the map equation (8) becomes:

$$ L_M(\{N/c\}_{\alpha=1}^{c}) = \left(1 + \frac{c}{N}\right) \log_2(N + c) - \log_2(c), \tag{9} $$

with $c \ge 2$. The case $c = 1$ is the trivial “all-in-one” partition with $L_M(\{N\}_{\alpha=1}) = \log_2 N$. The relevant Map optimization for the cycle graph of size $N$ is then equivalent to finding which of the equal partitions into $c$ communities has the lowest $L_M$:

$$ \min_{c} L_M(\{N/c\}_{\alpha=1}^{c}). $$

Assume $N/c$ to be real to facilitate the analysis, a relaxation which our numerics show not to affect the result. Then the partition with minimal $L_M$ has equal communities of size $N/c^*$ with $c^*(N)$ given by:

$$ \ln(N + c^*) = \frac{N}{c^*} - 1, \qquad c^* \ge 2. \tag{10} $$

It is easy to show that, for a long enough ring, such a partition will have lower $L_M$ than the ‘all-in-one’ partition. Indeed, the map equation partitions all cycles with $N \ge 10$.
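
The closed form of Eq. (9) makes this optimization easy to check by direct enumeration. The sketch below is a plain-Python illustration with hypothetical function names; as a simplifying assumption it considers only values of $c$ that divide $N$, so that the communities are exactly equal in size, and it uses the fact that the entropy rate of the unweighted ring is 1 bit per step.

```python
from math import log2


def cycle_map_length(N, c):
    """L_M for a ring of N nodes split into c equal contiguous arcs.

    Eq. (9) for c >= 2; log2(N) for the all-in-one partition (c = 1).
    """
    if c == 1:
        return log2(N)
    return (1 + c / N) * log2(N + c) - log2(c)


def best_equal_partition(N):
    """Divisor c of N whose equal partition has the lowest L_M."""
    divisors = [c for c in range(1, N + 1) if N % c == 0]
    return min(divisors, key=lambda c: cycle_map_length(N, c))


for N in (8, 10, 20):
    c = best_equal_partition(N)
    L = cycle_map_length(N, c)
    gap = (L - 1.0) / 1.0   # Eq. (7) with h = 1 bit/step for the ring
    print(N, c, round(L, 4), round(gap, 2))
```

With these assumptions the ring of 8 nodes is left whole, the ring of 10 nodes is split in two, and the ring of 20 nodes is split into 5 arcs of 4 nodes with a compression gap of about 2.48, in line with the values quoted for this graph in Section V A.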
Similar results are obtained for the regular $k$-cycles used as the starting point for the small-world construction (see Section V C). In this case, Map partitions the $k$-cycle into equally-sized communities of size $N/c^*$ given by:

$$ \ln\left(\frac{2N}{k+1} + c^*\right) = \frac{2N}{c^*(k+1)} - 1, \qquad c^* \ge 2. \tag{11} $$

The same reasoning can be applied to a torus network, i.e., the cartesian product of two cycles of lengths $R$ and $r$, with $N = rR$. This graph can be thought of as the discretization of a 2-dimensional lattice with periodic boundary conditions. It is easy to show that the optimal radially symmetric partition of the graph (with $R > r$) is into communities of size $N/c^*$ with $c^*$ given by:

$$ \ln(2R + c^*) = \frac{2R}{c^*} - 1. \tag{12} $$

Therefore, as the size of the lattice $N$ increases, Infomap will partition the torus into smaller sections. Our numerical exploration shows that the above solution is a conservative estimate and the overpartitioning induced by Map is even more acute for the torus: as $N$ grows, other even smaller patch-like partitions are obtained by the Infomap optimization.

IV. A DYNAMICAL ENHANCEMENT OF THE MAP SCHEME: MARKOV TIME SWEEPING FOR THE MAP EQUATION

As discussed above, the original map equation does not fully account for the dynamics of the Markov process (1), as it only uses quantities derived from block-averaged one-step transitions. Such a simplification is reasonable for clique-like communities, which exhibit a small compression gap and can be fully explored in one step. However, networks of interest sometimes possess a multi-scale, non clique-like community structure which will go unrecognized by the original map equation due to its intrinsic bias towards cliques and the ensuing field-of-view limit.
The limitations of the map equation in such scenarios can be overcome by adopting concepts from partition stability, a recently introduced dynamical framework for community detection [12, 18, 19]. The idea is to consider the time evolution of the Markov process as a means to unfolding systematically the graph structure at different scales. This Markov time sweeping, which is equivalent to considering multi-step transitions, applies a natural zooming process (from small to large scales) to the network. A key aspect of this approach is the systematic sweeping across scales provided by the dynamics, which minimizes the effects of the resolution and field-of-view limits. For an extended discussion, see [12, 18, 19, 31].

This Markov time sweeping can be used to endow the map equation with a dynamic zooming that allows it to detect multi-scale community structure, with relevant partitions characterized by a low compression gap. For simplicity, consider the continuous version of the Markov process (1) associated with a graph with adjacency matrix $A$ on $N$ nodes:

$$ \dot{p} = -p D^{-1} L, \tag{13} $$

where $p$ is a $1 \times N$ vector of probabilities, $D$ is the diagonal matrix containing the (weighted) degree of each node and $L = D - A$ is the graph Laplacian. It is easily verified that this continuous-time Markov process has the same stationary distribution as the discrete-time random walk (1) [18, 19].

The analytical solution of this system leads us to consider the discrete-time process:

$$ p_{k+1} = p_k T(t), \tag{14} $$

where $T_{ij}(t) = [e^{-t D^{-1} L}]_{ij}$ is the effective transition probability between nodes $i$ and $j$ after a (Markov) time $t$. Within this framework, it is easy to see that the original map formulation considers the linearized version of $T(t)$ evaluated at time $t = 1$: to first order, $T(t) \approx I - t D^{-1} L$, which reduces to $D^{-1} A = M$ at $t = 1$.
Consequently, the original map equation scheme is included as a particular case in our formulation and we can always recover the standard Map results under our scheme [32].

Our approach is then to use the map equation to analyze the community structure of the time-dependent weighted network $D\,T(t)$ as a function of the (Markov) time, $t$. As time grows, the transition matrix $T$ becomes less sparse and more clique-like, yet in a structured manner that reflects the community structure of the network [19]. Consequently, the leaving probabilities $q_{\alpha\curvearrowright}(t) = \sum_{i \in \alpha} \sum_{j \notin \alpha} \pi_i T_{ij}(t)$ increase with increasing time; the cost for encoding distinct communities increases too; and map tends to find coarser communities that can be better represented as cliques. More specifically:

• For $t \to 0$, the leaving probabilities go to zero and the map equation is minimized by setting each node in its own community, as can be easily verified.

• For $t \to \infty$, we approach the limit of an i.i.d. random process, i.e., $T(t) \to \mathbf{1}\pi$, where $\mathbf{1}$ is the vector of ones. In this limit, the map encoding for the “all-in-one” partition is optimal, since it results in a description length which is equivalent to the entropy rate. More precisely, it is easy to see from Eq. (6) that $\delta(t) \to 0$ as $t \to \infty$.

• For intermediate times, the Markov time acts as a natural resolution parameter and the partitions of the time-dependent weighted graph $D\,T(t)$ become increasingly coarser. By following the time evolution, we can check whether a particular partition corresponds merely to a transient or whether it is persistent for a range of times. Furthermore, the compression gap (7) can be used as an information-theoretic indicator of the reliability of the partitions found by Infomap at different Markov times.
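
The behaviour of $T(t)$ and of the leaving probabilities in the three regimes above can be explored numerically on a toy graph. The following is a minimal sketch in plain Python under stated assumptions: `mat_exp` is a hand-written scaling-and-squaring routine standing in for a library matrix exponential and is only meant for small matrices, the two-triangle graph is a hypothetical example, and all function names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(X, terms=20):
    """exp(X) by scaling and squaring with a truncated Taylor series."""
    n = len(X)
    norm = max(sum(abs(x) for x in row) for row in X)
    squarings = 0
    while norm > 0.5:           # scale X until the series converges quickly
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2.0 ** squarings for x in row] for row in X]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):  # exp(X) = exp(X / 2^s)^(2^s)
        result = mat_mul(result, result)
    return result


def markov_transition(A, t):
    """T(t) = exp(-t D^{-1} L) = exp(-t (I - M)), with M = D^{-1} A."""
    n = len(A)
    degree = [sum(row) for row in A]
    generator = [[-t * (float(i == j) - A[i][j] / degree[i]) for j in range(n)]
                 for i in range(n)]
    return mat_exp(generator)


def leaving_probability(A, T, community):
    """q_alpha(t) = sum_{i in alpha} sum_{j not in alpha} pi_i T_ij(t)."""
    degree = [sum(row) for row in A]
    total = sum(degree)         # pi_i = d_i / total for an undirected graph
    inside = set(community)
    return sum(degree[i] / total * T[i][j]
               for i in inside for j in range(len(A)) if j not in inside)


# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

times = [0.1, 1.0, 10.0, 200.0]
q_left = [leaving_probability(A, markov_transition(A, t), [0, 1, 2])
          for t in times]
for t, q in zip(times, q_left):
    print(f"t = {t:6.1f}   q_alpha(t) = {q:.4f}")
```

In this example the leaving probability of one triangle grows from nearly zero at short times towards $\pi(\alpha)(1 - \pi(\alpha)) = 0.25$, the value expected once $T(t)$ has converged to $\mathbf{1}\pi$. In a full sweep, the weighted network $D\,T(t)$ at each time would then be handed to a map equation optimizer; that step is not shown here.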
As discussed above, a low $\delta$ is expected when the partition reflects a community structure close to that of a clique of cliques, thus conforming to the assumptions underlying the map formalism. Therefore, low values of $\delta(t)$ can be used to indicate relevant map partitions and also to identify the existence of a multi-scale community structure in the network.

This Markov time sweeping brings to the map equation what the partition stability offers to modularity [18, 19]; namely, the possibility to use time as a means to scan naturally through the resolution of community detection (from fine to coarse) in a manner that is consistent with the Markov dynamics on the graph. From this dynamical viewpoint, the standard map equation corresponds to a time-snapshot of the diffusion dynamics. Furthermore, this dynamical approach is a natural framework for the map scheme, since it introduces a time-dependent but finite probability of jumping from any node to any other node at all times, in line with the formalism underpinning the map equation.

V. SOME ILLUSTRATIVE EXAMPLES

In this section, we illustrate the use of Markov sweeping map with simple examples. The procedure is as follows: for each Markov time, we construct the time-dependent network defined by $D\,T(t)$. We then optimize the (time-dependent) map cost function using the implementation of Infomap for directed graphs found online at http://www.tp.umu.se/~rosvall/, slightly modified to enable self-loops in the graphs. We only consider here undirected networks but the method can be extended easily to directed graphs when we allow for teleportation [33]. For all examples, 100 runs of the Infomap algorithm at each Markov time were used to find the optimal partition.

A. A network without community structure: the cycle graph

As a first example, we apply Markov time sweeping to the ring network discussed in Section III B. Recall that for the cycle graph with $N = 20$ nodes, our analytical arguments show that the original map scheme leads to a non-intuitive partition into 5 equal communities, instead of the expected ‘all-in-one’ partition. However, the high compression gap of the 5-way partition found by the standard Map ($\delta \approx 2.48$) confirms that this partition is far from being formed by clique-like communities. Because standard Map is being applied to a network which does not conform to the implicit assumptions about community detection in the original map framework, we see an overpartitioning in this case.

As seen in Figure 3, analyzing this network with the Markov-sweeping version of Map reveals that there is no significant community structure in this graph. Only the singleton partition (at very short times) and the global partition (at very long times) provide significant groupings of the nodes while all other partitions show high values of $\delta$.

Figure 3. (Color online) Markov sweeping map for a cycle graph with N = 20. As the Markov time increases, the map partitioning goes from the finest possible partition to the global ‘all-in-one’ partition (solid blue line). However, as indicated by the featureless decay of the compression gap $\delta$ with no clear minima (dashed green line), no other relevant community is found between those two extreme partitions, thus signaling the lack of community structure. In this case, optimizing the standard map equation finds 5 communities but a large compression gap $\delta \approx 2.48$ indicates that this partition is unreliable. Inset: analyzed graph.
[Plot axes: number of communities $c$ and compression gap $\delta$ against Markov time on a logarithmic scale from $10^{-1}$ to $10^{2}$.]

B. A simple network with multi-scale community structure

Consider now a weighted graph with a distinct hierarchical community structure: two triangles of triangles with weighted links to reinforce the hierarchy (see inset of Figure 4). In this example, the standard map equation method identifies the fine structure of six small triangles. (We note that the hierarchical map equation uncovers the two-tier hierarchy of communities in this graph.)

Our proposed Markov sweeping map also recovers the hierarchy of partitions across time-scales, as indicated by the sharp decreases in the compression gap $\delta$ when the six-fold and the two-fold partition are detected (Figure 4). Our method also indicates over which timescales the relevant partitions appear to be natural. For instance, a change in the weights would induce changes in the lengths of the plateaux corresponding to the different levels of the hierarchy. As stated above, hierarchical Map is able to resolve this clique-like community structure (while standard Map finds only the fine structure). However, if the multi-scale structure is not clique-like, hierarchical Map may fail to resolve the multi-scale structure, as shown in the network of small-world communities discussed in the next section.

Figure 4. (Color online) Markov sweeping map for a graph with a hierarchical community structure. The graph analyzed (inset) has a clear community structure given by a hierarchy of triangles: the six smaller triangles (denoted by different colors) have edges within them of weight 100; they are grouped into two larger triangles with weaker links (the edges between the 6 small triangles have weight 10); the edge between the two big triangular structures has weight 1. The compression gap (dashed green line) shows two clear minima, indicating well defined partitions into 6 and 2 communities, corresponding to the two tiers of the hierarchy. Standard Infomap finds only the 6 small triangles ($\delta \approx 0.62$).
[Plot axes: number of communities $c$ and compression gap $\delta$ against Markov time on a logarithmic scale from $10^{0}$ to $10^{4}$; the plateaux at 6 and 2 communities are marked.]

C. A ring of small-world communities

As a next scenario, we study a ring of five weakly connected small-world graphs [34] of 200 nodes each, as introduced in [12] (see Figure 5(a)). We use the CONTEST toolbox [35] to generate small-world communities by adding random connections à la Newman-Watts [36] starting from a pristine world with two nearest neighbours [37] but allowing for the possibility of multiple shortcuts at each node.

As discussed in Section III B, the standard map equation will tend to overpartition lattice-like structures, such as the pristine worlds ($k$-cycles) used as starting point for the small-world construction. As given by Eq. (11), standard Infomap partitions the pristine world with $N = 200$ and $k = 2$ into 22 equally-sized communities.

This overpartitioning persists when few random shortcuts are added, as shown in Figure 5(b). Only when the average number of added shortcuts per node, here denoted by $s$, is greater than 3.5 (and the mean distance within the small-world has become small) does standard Map obtain the right split into five communities. This is consistent with our discussion pertaining to the field-of-view, i.e., the smaller the mean path length, the more clique-like the structure. In this case, hierarchical Infomap can even give a non-intuitive partition into 4 communities, due to the non clique-like nature of the communities.
On the other hand, Figure 5(c) shows that Markov sweeping allows Map to detect the relevant partition into 5 communities over an extended time-scale with a small compression gap.

Figure 5. (Color online) Community detection in a ring of small-world communities. [Panels: (a) ring of five small-worlds; (b) number of communities c versus mean pathlength; (c) number of communities c and compression gap δ versus Markov time.] (a) Ring of 5 small-world communities with N = 200 each. The edges within the small-worlds have weight 5 while the weight of the links between them is 1. All the small-worlds have an average number of randomly added shortcuts per node, s. For s = 1 (shown), standard Map shows a strong overpartitioning leading to an average of 16 communities inside each small-world (indicated by different colors in online version). (b) Number of communities found by standard Map vs. mean pathlength inside the small-world communities. The numerics shown correspond to 10 different realizations of the network with average number of shortcuts per node, s = 1, 1.25, 1.5, ..., 3.75, 4. (c) Applying Markov sweeping map to the ring of small-worlds with s = 2.5 (mean pathlength inside the small-worlds ≈ 2.7) finds the relevant partition into 5 communities, while standard Map finds 23 communities in this case.

VI. DISCUSSION

A key insight to emerge from the map equation formalism is the fact that a coarse-grained description of a graph in terms of its communities is intimately related to finding concise descriptions of the information flow on these networks, and hence to the field of coding theory and data compression. However, the adoption of a coding or compression mechanism has important effects on the outcome of the algorithm and ultimately reflects the underlying assumptions about the concept of community.

Figure 6. (Color online) Community detection with heterogeneously sized subgraphs. [Panels: (a) ring of heterogeneous cliques; (b) ring of heterogeneous rings; each shows number of communities c, compression gap δ and δ′ versus Markov time.] (a) Ring of 6 cliques with different sizes {10, 15, 20, 25, 30, 40}. Upper panel: number of communities found by Markov sweeping map vs. Markov time. Lower panel: both the compression gap δ (dashed green line) and the alternative compression gap measure δ′ (red dashed-dotted line) clearly highlight the presence of a robust partition into 6 communities. Inset: analysed graph. (b) Ring of 6 rings with different sizes {10, 15, 20, 25, 30, 40}. In this case the alternative measure δ′ for the compression gap is better suited for the analysis, indicating the presence of the 6 rings by a relative minimum around Markov time 30. Inset: analysed graph.

Here we have shown that the original map equation formalism is inherently tuned towards a block-averaged notion of community structure as a weighted, statistical clique of cliques. This tuning stems from two inter-related simplifications: the block-averaged coding mechanism, which ignores the detailed connectivity and exhibits a large compression gap for non-clique structures, and the use of one-step quantities, which ignores the effect of multi-step flows in the communities and leads to an upper scale (field-of-view) for detection.
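The overpartitioning of the cycle graph quoted in Section V A can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' code: it assumes the usual two-level map equation of Ref. [21], restricts attention to equal contiguous blocks, and takes the "ideal limit" of the description length to be the entropy rate of the random walk (1 bit per step on a cycle). The compression gap is defined earlier in the paper, so that last choice is an assumption made here; the function and variable names are hypothetical.

```python
from math import log2


def plogp(x):
    """Return x * log2(x), with the usual convention 0 * log2(0) = 0."""
    return x * log2(x) if x > 0.0 else 0.0


def map_length_cycle(n_nodes, n_blocks):
    """Two-level map equation (bits per step) for an unweighted cycle
    split into n_blocks equal, contiguous blocks."""
    if n_nodes % n_blocks != 0:
        raise ValueError("n_blocks must divide n_nodes")
    p_node = 1.0 / n_nodes      # stationary visit rate of a node
    p_block = 1.0 / n_blocks    # total visit rate inside one block
    # A block is left from either of its two end nodes with probability 1/2
    # (a single-node block is left with probability 1), so q = 1 / n_nodes.
    q_block = 0.0 if n_blocks == 1 else 1.0 / n_nodes
    q_total = n_blocks * q_block
    return (plogp(q_total)
            - 2.0 * n_blocks * plogp(q_block)
            - n_nodes * plogp(p_node)
            + n_blocks * plogp(q_block + p_block))


N = 20
lengths = {c: map_length_cycle(N, c) for c in (1, 2, 4, 5, 10, 20)}
best_blocks = min(lengths, key=lengths.get)

# Assumption: the ideal limit is the entropy rate of the walk, which is
# log2(2) = 1 bit per step on a cycle (two equally likely moves).
gap = lengths[best_blocks] - 1.0
print(best_blocks, round(gap, 2))
```

Under these assumptions the shortest description among the equal splits is the one with five blocks, and its gap comes out close to the δ ≈ 2.48 reported for the standard Map partition, even though the 'all-in-one' partition is the intuitive answer for a featureless ring.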
This intrinsic bias of the map equation explains the excellent performance of the map equation in clique-like benchmarks but can lead to unexpected overpartitioning of networks if they differ strongly from the assumed clique-like organization.

We have shown that using the dynamical zooming provided by Markov time sweeping allows one to take into account multi-step flows and scan across all scales in a natural manner. The underlying idea is that, as time increases, the communities in the network will become more clique-like when analyzed through the time-dependent weighted transition matrix of the Markov process. Therefore, the map formalism can be used to detect long-range communities as the Markov time increases, and the relevant communities will be signaled by a low compression gap. This Markov sweeping for the map equation can enhance the performance of the method by allowing it to detect non-clique communities and the presence of multi-scale community structure in networks. Importantly, the method still recovers all the results from the original map equation.

As stated above, the dynamic zooming across all scales effected by the Markov process is an integral ingredient of the method. Rather than just looking for the 'right' scale, the community structure emerges from the integration of the information gathered systematically at all scales. This approach can help alleviate the reliance on a global scale which can affect the results when dealing with networks with communities with very heterogeneous sizes [38]. In particular, Markov-sweeping map is able to detect heterogeneous cliques as obtained through the LFR benchmark, a fact consistent with the notion that cliques are all effectively one-step and that standard Map already performs effectively on such benchmarks (see also Figure 6 for an analysis with heterogeneously sized cliques).
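The idea that multi-step flows make a graph look more clique-like as Markov time grows can be made concrete on the cycle graph of Section V A. The sketch below is a rough illustration under a stated assumption: the transition matrix of the continuous-time walk is taken to be T(t) = exp(-t (I - P)), with P the one-step random walk matrix. Eq. (14) lies outside this section, so this form is assumed here rather than quoted. With that form, T(t) is a Poisson-weighted mixture of n-step walks, which the hypothetical helper `walk_row` sums directly.

```python
from math import exp, sqrt


def one_step(dist, neighbours):
    """Apply one step of the unbiased random walk to a distribution."""
    new = [0.0] * len(dist)
    for node, mass in enumerate(dist):
        share = mass / len(neighbours[node])
        for nb in neighbours[node]:
            new[nb] += share
    return new


def walk_row(start, neighbours, t):
    """Row `start` of the assumed T(t) = exp(-t (I - P)), written as
    sum_n e^{-t} t^n / n! * P^n, i.e. a Poisson mixture of n-step walks."""
    dist = [0.0] * len(neighbours)
    dist[start] = 1.0
    weight = exp(-t)                          # Poisson weight of n = 0 steps
    row = [weight * x for x in dist]
    n_terms = int(t + 10.0 * sqrt(t) + 20.0)  # covers the bulk of the Poisson
    for n in range(1, n_terms):
        dist = one_step(dist, neighbours)
        weight *= t / n
        row = [r + weight * x for r, x in zip(row, dist)]
    return row


N = 20
cycle = [[(i - 1) % N, (i + 1) % N] for i in range(N)]
for t in (1.0, 10.0, 400.0):
    row = walk_row(0, cycle, t)
    # Ratio of weakest to strongest link leaving node 0 in the network D T(t).
    print(t, min(row) / max(row))
```

A single step only reaches the two neighbours, which is the one-step field-of-view. At any positive Markov time every node is reachable with finite probability, and at long times the row approaches the uniform value 1/N, so the time-dependent network resembles a single clique.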
Our method also performs well in detecting communities in a ring of rings with very dissimilar sizes, as illustrated in Figure 6, although when the heterogeneity of the relative ring sizes becomes very large, our approach will not identify all rings at once at the same level of the hierarchy. To improve further the applicability of the method to such problems, one can use different dynamics for the Markov process [19, 39]. This is an area of research we are currently pursuing. However, since there is no community detection algorithm that will serve all purposes for all possible applications, one should complement the analysis with other methods based on different principles (e.g., local algorithms in those cases).

Adding a dynamical dimension to Map through Markov sweeping is just one of the possible ways to enhance the map equation, and alternative approaches are worth pursuing. One direction would be the modification of the coding scheme. For instance, a more rigorous treatment would require removing the constraint of having unique codewords within each community and allowing also for the encoding of walks instead of single-step codewords. This generalization, however, would most likely lead to a breakdown of the simple coding picture that underpins the map equation.

Our work emphasizes the importance of the choice of dynamics on the network and shows that using a dynamical perspective may lead to a more natural framework for community detection, especially when the underlying system has an inherent flow. In this paper, we have used the standard (unbiased) continuous-time random walk as a neutral first choice of dynamics. However, other continuous-time or discrete-time processes are possible (see also [19] for a related discussion) in order to tune our community detection algorithm to different characteristics of the network.
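The sweeping procedure of Section V can be imitated in miniature on the N = 20 cycle. The sketch below makes several assumptions that should be read as such: the transition matrix is taken to be T(t) = exp(-t (I - P)) (Eq. (14) is not reproduced in this section), the two-level map equation of Ref. [21] is evaluated on the flow of D T(t) with self-loops counted as staying inside a block, and only three hand-picked candidate partitions are compared instead of running the Infomap optimization. All function names are hypothetical.

```python
from math import cos, exp, log2, pi


def plogp(x):
    """Return x * log2(x), with the usual convention 0 * log2(0) = 0."""
    return x * log2(x) if x > 0.0 else 0.0


def cycle_transition(n_nodes, distance, t):
    """Entry of the assumed T(t) between two nodes `distance` apart on an
    unweighted cycle, from the Fourier modes of the circulant walk matrix."""
    modes = (exp(-t * (1.0 - cos(2.0 * pi * k / n_nodes)))
             * cos(2.0 * pi * k * distance / n_nodes)
             for k in range(n_nodes))
    return sum(modes) / n_nodes


def description_length(n_nodes, n_blocks, t):
    """Two-level map equation on the flow of D T(t) for a cycle split
    into n_blocks equal, contiguous blocks."""
    size = n_nodes // n_blocks
    row = [cycle_transition(n_nodes, d, t) for d in range(size)]
    # Flow that starts and ends inside one block (self-loops included).
    stay = sum(row[abs(a - b)] for a in range(size) for b in range(size))
    q_block = max(0.0, (size - stay) / n_nodes)   # exit flow per block
    q_total = n_blocks * q_block
    return (plogp(q_total)
            - 2.0 * n_blocks * plogp(q_block)
            - n_nodes * plogp(1.0 / n_nodes)
            + n_blocks * plogp(q_block + size / n_nodes))


N = 20
candidates = (20, 5, 1)   # singletons, five blocks, all-in-one
best = {}
for t in (0.01, 0.1, 1.0, 10.0, 100.0):
    lengths = {c: description_length(N, c, t) for c in candidates}
    best[t] = min(lengths, key=lengths.get)
print(best)
```

Among these candidates the singleton partition gives the shortest description at very short Markov times and the all-in-one partition at very long times, in line with the two extremes described for Figure 3. A faithful reproduction would replace the fixed candidate list with an optimization over partitions at each Markov time.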
Code is available online [40].

ACKNOWLEDGMENTS

We thank J.-C. Delvenne for fruitful discussions, especially relevant for the quantitative analysis of the rings. We thank S.N. Yaliraki for helpful comments. R.L. acknowledges funding from the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. M.B. acknowledges funding from grant EP/I017267/1 from the EPSRC (Engineering and Physical Sciences Research Council) of the UK under the Mathematics underpinning the Digital Economy program and from the US Office of Naval Research (ONR). The scientific responsibility rests with its authors.

[1] S. Fortunato, Physics Reports 486, 75 (2010).
[2] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Physics Reports 424, 175 (2006).
[3] A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, and C. Zhou, Physics Reports 469, 93 (2008).
[4] H. A. Simon, Proceedings of the American Philosophical Society 106, 467 (1962).
[5] M. E. J. Newman and M. Girvan, Phys. Rev. E 69, 026113 (2004).
[6] M. E. J. Newman, Proceedings of the National Academy of Sciences 103, 8577 (2006).
[7] M. Fiedler, Czechoslovak Mathematical Journal 23, 298 (1973).
[8] M. Fiedler, Czechoslovak Mathematical Journal 25, 619 (1975).
[9] J. Shi and J. Malik, Pattern Analysis and Machine Intelligence, IEEE Transactions on 22, 888 (2000).
[10] R. Kannan, S. Vempala, and A. Veta, in Foundations of Computer Science, 2000. Proceedings. 41st Annual Symposium on (IEEE, 2000) pp. 367–377.
[11] M. Porter, J. Onnela, and P. Mucha, Notices of the AMS 56, 1082 (2009).
[12] M. T. Schaub, J.-C. Delvenne, S. N. Yaliraki, and M. Barahona, PLoS ONE 7, e32210 (2012).
[13] H. A. Simon and A. Ando, Econometrica 29, 111 (1961).
[14] J. Reichardt and S. Bornholdt, Phys. Rev. Lett. 93, 218701 (2004).
[15] A. Arenas, A. Fernández, and S. Gómez, New Journal of Physics 10, 053039 (2008).
[16] P. Ronhovde and Z. Nussinov, Phys. Rev. E 80, 016109 (2009).
[17] A. Lancichinetti, F. Radicchi, J. J. Ramasco, and S. Fortunato, PLoS ONE 6, e18961 (2011).
[18] J.-C. Delvenne, S. N. Yaliraki, and M. Barahona, Proceedings of the National Academy of Sciences 107, 12755 (2010).
[19] R. Lambiotte, J.-C. Delvenne, and M. Barahona, (2009), arXiv:0812.1770.
[20] E. Ziv, M. Middendorf, and C. H. Wiggins, Phys. Rev. E 71, 046117 (2005).
[21] M. Rosvall and C. T. Bergstrom, Proceedings of the National Academy of Sciences 105, 1118 (2008).
[22] M. Rosvall and C. T. Bergstrom, PLoS ONE 6, e18209 (2011).
[23] A. Raj and C. H. Wiggins, IEEE Transactions on Pattern Analysis and Machine Intelligence 32, 988 (2010).
[24] A. Lancichinetti and S. Fortunato, Phys. Rev. E 80, 056117 (2009).
[25] S. Fortunato and M. Barthélemy, Proceedings of the National Academy of Sciences 104, 36 (2007).
[26] A. Viamontes Esquivel and M. Rosvall, Phys. Rev. X 1, 021025 (2011).
[27] C. E. Shannon, The Bell System Technical Journal 27, 379 (1948).
[28] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. (Wiley-Interscience, 2006).
[29] Note that we do not have to construct a code to evaluate the map equation: one can find the optimal code-lengths from the entropy expressions or the map equation (3).
[30] B. Karrer and M. E. J. Newman, Phys. Rev. E 83, 016107 (2011).
[31] A. Delmotte, E. W. Tate, S. N. Yaliraki, and M. Barahona, Physical Biology 8, 055010 (2011).
[32] More specifically, the standard map equation is exactly recovered at time t = 1 when considering either a linearized version of the dynamics (14) or of the corresponding discrete-time dynamics (1). The results presented here consider the nonlinear transition matrix (14) and hence the results at time t = 1 shown here can display small deviations from those obtained from the standard (i.e., linearized in our case) map scheme.
[33] R. Lambiotte and M. Rosvall, Phys. Rev. E 85, 056107 (2012).
[34] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
[35] A. Taylor and D. J. Higham, ACM Trans. Math. Softw. 35, 26:1 (2009).
[36] M. E. J. Newman, C. Moore, and D. J. Watts, Phys. Rev. Lett. 84, 3201 (2000).
[37] M. Barahona and L. M. Pecora, Phys. Rev. Lett. 89, 054101 (2002).
[38] A. Lancichinetti and S. Fortunato, Phys. Rev. E 84, 066122 (2011).
[39] J.-C. Delvenne, M. T. Schaub, S. N. Yaliraki, and M. Barahona, in Time Varying Dynamical Networks, edited by N. Ganguly, A. Mukherjee, M. Choudhury, F. Peruani, and B. Mitra (Birkhäuser, Springer, Boston, 2012) to be published.
[40] http://michaelschaub.github.com/MarkovZoomingMap/.
