Statistical Signatures of Structural Organization: The case of long memory in renewal processes
Authors: Sarah E. Marzen, James P. Crutchfield
Santa Fe Institute Working Paper 15-12-XXX; arxiv.org:1512.XXXX [physics.gen-ph]

1 Department of Physics, University of California at Berkeley, Berkeley, CA 94720-5800
2 Complexity Sciences Center and Department of Physics, University of California at Davis, One Shields Avenue, Davis, CA 95616
∗ smarzen@berkeley.edu
† chaos@ucdavis.edu
(Dated: September 3, 2018)

Abstract: Identifying and quantifying memory are often critical steps in developing a mechanistic understanding of stochastic processes. These are particularly challenging and necessary when exploring processes that exhibit long-range correlations. The most common signatures employed rely on second-order temporal statistics and lead, for example, to identifying long memory in processes with power-law autocorrelation function and Hurst exponent greater than 1/2. However, most stochastic processes hide their memory in higher-order temporal correlations. Information measures—specifically, divergences in the mutual information between a process' past and future (excess entropy) and minimal predictive memory stored in a process' causal states (statistical complexity)—provide a different way to identify long memory in processes with higher-order temporal correlations. However, there are no ergodic stationary processes with infinite excess entropy for which information measures have been compared to autocorrelation functions and Hurst exponents. Here, we show that fractal renewal processes—those with interevent distribution tails ∝ t^{−α}—exhibit long memory via a phase transition at α = 1. Excess entropy diverges only there, and statistical complexity diverges there and for all α < 1. When these processes do have power-law autocorrelation function and Hurst exponent greater than 1/2, they do not have divergent excess entropy. This analysis breaks the intuitive association between these different quantifications of memory. We hope that the methods used here, based on causal states, provide some guide as to how to construct and analyze other long-memory processes.

PACS numbers: 02.50.-r 89.70.+c 05.45.Tp 02.50.Ey 02.50.Ga
Keywords: stationary renewal process, fractal renewal process, statistical complexity, excess entropy, long memory, power-law scaling, 1/f noise, Zipf's law

I. INTRODUCTION

Many time series of interest have "short memory", meaning (loosely speaking) that knowledge of the past confers exponentially diminishing returns for predicting the future. However, many other time series of interest—those with "long memory"—exhibit intrinsic timescales that grow without bound as the amount of available data increases [1–6]. Examples include the hydrological data first studied by Hurst [7] and modeled by Mandelbrot [8] and many others, e.g., see Refs. [9, 10].

These are qualitatively different processes that demand qualitatively different generative models. In other words, signatures of long memory imply a kind of structural organization of the underlying process that differs from one with short memory. This is the inverse problem of long memory: Which statistical signatures identify, uniquely or not, which intrinsic organizations? Sharp answers are critical to successful empirical analysis and often provide necessary first steps in predictive theory building.
The complementary forward problem, an open question, is to identify the kinds of memoryful process structure that lead to one or another statistical signature. Answering this question requires defining statistical signatures that quantify memory in stochastic processes.

Many existing quantifications of long memory are based on second-order statistics; e.g., on using the autocorrelation function, power spectrum, or Hurst exponent. These approaches have had notable successes in analyzing hydrological data [7, 9], music [4], spin systems [2], astrophysical flicker noise [6], language [11, 12], natural scenery [13, 14], communication system error clustering [15], financial time series, and many other seemingly complex phenomena [5, 16].

However, there are at least two reasons to look to other statistics besides the Hurst exponent. First, second-order statistics alone can be misleading, as a process can "hide" signatures of long memory in higher-order statistics. For example, Fig. 1 shows a hidden Markov model (HMM) that, on the one hand, is patently quite structured and, on the other, generates a process with a flat power spectrum [17]. Indeed, most stochastic processes seem to hide information about their temporal dependencies in higher-order statistics [18, 19]. Second, as suggested in Ref. [20], our determination of whether or not a process has long memory ideally should be invariant under invertible transformations of one's measurement values. The challenge is not only to find a new statistic that addresses these two concerns, but to find a statistic that is also easy to operationalize.

[Figure omitted: state diagram of a five-state HMM over states S, A, B, C, D.]
FIG. 1. The Random-Random-XOR (RRXOR) Process is generated by the five-state (minimal unifilar) hidden Markov model shown here. Labels p|x denote that a state-to-state transition occurs with probability p and emits symbol x. If X_t is the random variable at time t, then the generated time series is X_{t+2} = X_{t+1} XOR X_t, with X_{t+1} and X_t being Bernoulli(q) and Bernoulli(p), respectively, for t = 0, 3, 6, .... With p = q = 1/2 and starting state probabilities Pr(S) = 1/3 and Pr(A) = Pr(B) = Pr(C) = Pr(D) = 1/6, the output process is stationary white noise—a flat power spectrum [17].

References [21–23] suggested a process might be said to have long memory when the mutual information between its past and future (excess entropy) diverges, and Ref. [21] suggested that long memory is associated with divergent statistical complexity, with the effective memory architecture given by a process' ε-machine. By construction, these statistics are invariant under invertible transformations of the data; and with sufficiently clever entropy estimation techniques, these statistics are also calculable directly from time series data.

Unfortunately, there is a paucity of concrete examples upon which to build intuition as to how these higher-order statistics and the more commonly used second-order statistics relate. In part, this lack of concrete examples might owe somewhat to the fact that it is nontrivial to construct ergodic stationary processes with divergent excess entropy, though see Refs. [24, 25]. (Note that the processes considered in Ref. [22] were nonergodic [26].)
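To make Fig. 1's point concrete, here is a minimal simulation sketch (ours, not from the paper; the function names are our own) that generates the RRXOR process directly from its block definition and estimates its autocorrelation. At p = q = 1/2 the estimated autocorrelation vanishes at every lag, even though the third symbol of each block is a deterministic function of the first two:

```python
# Minimal sketch (ours, not the authors'): sample the RRXOR process from its
# block definition and check that second-order statistics see white noise.
import numpy as np

def rrxor(n_blocks, p=0.5, q=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty(3 * n_blocks, dtype=int)
    x[0::3] = rng.random(n_blocks) < p    # X_t     ~ Bernoulli(p)
    x[1::3] = rng.random(n_blocks) < q    # X_{t+1} ~ Bernoulli(q)
    x[2::3] = x[0::3] ^ x[1::3]           # X_{t+2} = X_{t+1} XOR X_t
    return x

x = rrxor(200_000)
xc = x - x.mean()
for lag in range(1, 6):
    print(lag, np.mean(xc[:-lag] * xc[lag:]) / xc.var())  # all ~ 0
```

The structure only shows up at third order: within a block, I[X_t, X_{t+1}; X_{t+2}] = 1 bit at p = q = 1/2, which no autocorrelation function or power spectrum can detect.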
To that end, we study a tractable class of processes that can have both divergent excess entropy and Hurst exponent greater than 1/2: the fractal renewal processes [27–30], in which interevent intervals are drawn independently and identically (IID) from a probability distribution with tails ∝ t^{−α}. These processes are very widely used in the physical, biological, and social sciences to model diverse long-memory phenomena, ranging from current fluctuations in electronic devices and neuronal spike trains to earthquakes and astrophysical time series [31–40]. Previous studies analyzed the second-order statistics of such processes in some detail [9, 41]. Here, we use techniques inspired by those in Refs. [25, 42] to calculate the excess entropy and statistical complexity of fractal renewal processes for the first time. We find that fractal renewal processes have divergent excess entropy only and exactly when α = 1 and divergent statistical complexity as α → 1 from above and for all 0 < α < 1. However, fractal renewal processes have power-law power spectra for all 0 < α < 2 [41] and Hurst exponents greater than 1/2 [9]—two of the conventional second-order statistical signatures of "long memory". Thus, even for these relatively straightforward processes, the excess entropy and statistical complexity encapsulate a different notion of long memory than one gleans using only second-order statistics. These results also add fractal renewal processes to a very short list of known stationary ergodic processes with divergent excess entropy [25, 42] and so, we hope, they pave the way for more general comparisons between different definitions of long memory.

Section II briefly reviews definitions of memory in stochastic processes. Section III calculates informational measures of memory for fractal renewal processes. Section IV then compares our findings to the second-order statistics calculated by Refs. [9, 41] and draws out the lessons for the above application examples. We close by reflecting on the structural organization associated with long memory.

II. BACKGROUND

There are many definitions for a stochastic process to have long memory; Ref. [20] provides a particularly helpful survey. Consider a sequence of ℓ observations x_0, x_1, ..., x_{ℓ−1}, realizations of discrete-valued random variables X_0, X_1, ..., X_{ℓ−1}. For instance, if the autocorrelation function C(τ) is asymptotically a power law multiplied by a slowly varying function g(τ), then a process can be said to have "long memory":

C(\tau) = \sigma^{-2} \sum_{j=0}^{\ell} (x_j - \mu)(x_{j+\tau} - \mu) \propto g(\tau)\, \tau^{-\gamma} ,

with 0 < γ < 1, mean µ, and variance σ². Yet other definitions are based on the decay of the spectral density:

P(f) = \ell^{-1} \left| \sum_{j=0}^{\ell} x_j e^{-ijf} \right|^2 .

The process has long memory when P(f) ∝ f^{−β} L_1(f) as f approaches 0 (where L_1(f) is a slowly varying function near f = 0) with 0 < β < 1. Other definitions still are based on how variances deviate from time-local linear extrapolation. Starting with the variance of partial sums S_j = X_1 + · · · + X_j, one uses the rescaled-range statistic:

RS(\ell) = \frac{\max_{0 \le j \le \ell} \left( S_j - \tfrac{j}{\ell} S_\ell \right) - \min_{0 \le j \le \ell} \left( S_j - \tfrac{j}{\ell} S_\ell \right)}{\sigma} \propto \ell^{H} ,

where H ∈ (0, 1) is the Hurst index. Processes with H > 1/2 are interpreted as having long memory.
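As a concrete companion to the rescaled-range definition above, the following sketch (our illustration, with hypothetical helper names) computes the plug-in RS(ℓ) statistic and fits H on iid noise, where H ≈ 1/2 is expected. It is the naive estimator, with no bias correction:

```python
# Sketch: naive plug-in rescaled-range statistic RS(l) and a Hurst-index fit.
import numpy as np

def rescaled_range(x):
    x = np.asarray(x, dtype=float)
    s = np.cumsum(x)                              # partial sums S_1, ..., S_l
    l = len(x)
    dev = s - (np.arange(1, l + 1) / l) * s[-1]   # S_j - (j/l) S_l
    return (dev.max() - dev.min()) / x.std()

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)                       # short-memory (iid) data
lengths = np.array([100, 300, 1_000, 3_000, 10_000])
rs = [rescaled_range(x[:l]) for l in lengths]
H, _ = np.polyfit(np.log(lengths), np.log(rs), 1)
print(H)                                          # ~ 0.5 for iid noise
```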
Unfortunately, even these second-order statistics are not always equivalent signatures of long memory. Section 5 of Ref. [20] provides examples of inconsistencies. In a search for general principles from ergodic theory, Sec. 4 of Ref. [20] proposed that we require a definition of long memory independent of invertible transformations of the data. That is, if an invertible transformation is applied pointwise to each observation X_i, we would hope that the resulting process has long memory if and only if the original process had long memory. This desideratum is not satisfied by definitions based on the above second-order statistics.

Since strongly mixing processes have short memory and nonergodic processes could be said to have infinite memory [26], Ref. [20] proposed that one or another type of nonmixing property is a good candidate for long memory in ergodic stationary processes. This criterion satisfies the invariance desideratum above but can be rather difficult to evaluate.

Fortunately, the information-theoretic notions of memory we consider also satisfy the transformation-invariance desideratum and have been successfully deployed as quantifications of the "complexity" of stochastic processes [22, 43]. We study two: the excess entropy $E = I[\overleftarrow{X}; \overrightarrow{X}]$, or the mutual information between a process' past $\overleftarrow{X} = \ldots X_{-3} X_{-2} X_{-1}$ and future $\overrightarrow{X} = X_0 X_1 X_2 \ldots$ [23]; and the statistical complexity $C_\mu$, or the amount of information from the past $\overleftarrow{X}$ required to exactly predict the future $\overrightarrow{X}$ [43]. When the excess entropy diverges, we are interested in the asymptotic rate of divergence of finite-length excess entropy estimates $E(\ell) = I[\overleftarrow{X}; \overrightarrow{X}^\ell]$ [22, 23]. This asymptotic rate of divergence is also invariant to temporally local convolutions and invertible transformations of the data [22].

To more precisely define and calculate the statistical complexity and the excess entropy, we need to recall the causal states of computational mechanics. Consider clustering pasts according to an equivalence relation ∼ in which two pasts are equivalent when they have the same conditional probability distribution over futures: $\overleftarrow{x} \sim \overleftarrow{x}'$ if and only if $\Pr(\overrightarrow{X} \mid \overleftarrow{X} = \overleftarrow{x}) = \Pr(\overrightarrow{X} \mid \overleftarrow{X} = \overleftarrow{x}')$. The resulting clusters are the forward-time causal states $\mathcal{S}^+$, which inherit a probability distribution from the probability distribution over pasts. The forward-time statistical complexity is the entropy of these causal states: $C^+_\mu = H[\mathcal{S}^+]$. For more detail, see Refs. [44, 45]. We can similarly define the reverse-time causal states $\mathcal{S}^-$ by clustering futures with equivalent conditional probability distributions over pasts: $\overrightarrow{x} \sim \overrightarrow{x}'$ if and only if $\Pr(\overleftarrow{X} \mid \overrightarrow{X} = \overrightarrow{x}) = \Pr(\overleftarrow{X} \mid \overrightarrow{X} = \overrightarrow{x}')$. The reverse-time statistical complexity is the entropy of those reverse-time causal states: $C^-_\mu = H[\mathcal{S}^-]$. Renewal processes are time-reversal invariant [46], or causally reversible, so throughout the following we denote the statistical complexity as $C_\mu = C^+_\mu = C^-_\mu$ without loss of precision.

Reverse-time causal states and forward-time causal states can be used to calculate the excess entropy [47, 48]:

E = I[\mathcal{S}^+; \mathcal{S}^-] .

For discrete-time processes, E is a lower bound on C_µ:

E \le C_\mu . \qquad (1)

In other words, for discrete-time processes, if statistical complexity is finite, then so is excess entropy. Conversely, if excess entropy is infinite, then statistical complexity is infinite.
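Since E(ℓ) is, by definition, a mutual information between observed blocks, it can in principle be estimated directly from data. The sketch below is our illustration, not the paper's estimator (the paper instead derives closed forms later); it truncates the past to k symbols and forms the plug-in estimate of I[past_k; future_ℓ] for a binary series, with the usual caveat that plug-in block entropies are biased for small samples:

```python
# Sketch: plug-in estimate of E(l) ~ I[k-symbol past ; l-symbol future].
# Illustrative only: plug-in block entropies are biased for small samples.
import numpy as np
from collections import Counter

def entropy_nats(symbols):
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def excess_entropy_estimate(x, k, l):
    pasts   = [tuple(x[i - k:i]) for i in range(k, len(x) - l)]
    futures = [tuple(x[i:i + l]) for i in range(k, len(x) - l)]
    joint   = list(zip(pasts, futures))
    return entropy_nats(pasts) + entropy_nats(futures) - entropy_nats(joint)

rng = np.random.default_rng(0)
x = (rng.random(200_000) < 0.5).astype(int).tolist()   # iid coin: E = 0
print(excess_entropy_estimate(x, k=4, l=4))            # ~ 0, up to bias
```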
Often continuous-time processes have an uncountable set of causal states. For them, the statistical complexity is taken to be the differential entropy:

\hat{C}_\mu = H[\mathcal{S}^+] = -\int_\Delta d\mu(\sigma^+) \log \mu(\sigma^+) ,

where ∆ is the simplex of causal states and µ(σ⁺) is their measure in ∆. In the continuous-time setting, the inequality analogous to Eq. (1) no longer necessarily holds [49]. We call the differential entropy Ĉ_µ the continuous-time statistical complexity to distinguish it from the discrete-time statistical complexity C_µ, but simply refer to it as the statistical complexity when context is clear.

One can also define finite-time reverse-time causal states, denoted $\mathcal{S}^-_\ell$, by clustering futures of finite length ℓ with the same equivalence relation as above. From these, we obtain the finite-length reverse-time statistical complexity $C^{-\ell}_\mu = H[\mathcal{S}^-_\ell]$. These can be used to calculate finite-future excess entropy estimates: $E(\ell) = I[\mathcal{S}^+; \mathcal{S}^-_\ell]$ [47, 48].

For discrete-alphabet, discrete-time processes, the statistical complexity is invariant to relabelings of the measurement alphabet. However, as just noted, when the causal states are uncountable, the statistical complexity involves a differential entropy, and differential entropies are not invariant to invertible transformations of the coordinate system of the distribution's support. A prosaic example of this is given in Ref. [50]. Modulo such factors, whether or not statistical complexity diverges, the rate of divergence of its finite-length estimates $C^\ell_\mu$ is invariant to temporally local convolutions of the data.

Realizations of a renewal process consist of sequences of events separated by epochs of quiescence, the lengths of which are drawn independently from the same interevent distribution. Throughout, when discussing a discrete-time renewal process, we use the following notation [46]: F(n) is the interevent count probability distribution function; $w(n) = \sum_{n'=n}^{\infty} F(n')$ is the survival function; and µ is the mean interevent count. We use the following notation for continuous-time renewal processes: φ(t) is the waiting-time distribution; Φ(t) is its survival function; and T is the mean interevent interval. Fractal renewal processes have survival functions with power-law tails, as introduced shortly.

III. INTRINSIC MEMORY IN FRACTAL RENEWAL PROCESSES

Fractal renewal processes—those with power-law interevent interval probability density functions—can have long memory in the sense of Ref. [51]. For instance, they can have Hurst index H > 1/2 [9] and their autocorrelation function can be (asymptotically) a power law [41]. Fractal renewal processes have been implicated in a variety of complex natural processes, to which the introduction alluded. Might these processes also have infinite statistical complexity or infinite excess entropy? To the best of our knowledge, the excess entropy and statistical complexity of fractal renewal processes have yet to be calculated.

Calculating statistical complexity and excess entropy can be challenging when going beyond finite causal-state processes [52]. To make progress with bounding the excess entropy of fractal renewal processes, we use two tools. The first tool is to coarse-grain by time-binning. The Data Processing Inequality [53] then implies that the excess entropy of a discrete-time renewal process is always upper-bounded by the excess entropy of the corresponding continuous-time renewal process. See App. A. The second tool allows us to calculate excess entropy and statistical complexity even when the mean rate of events vanishes, by conditioning on the presence of a proxy event. This tool was inspired by previous work [24] and is summarized in App. B.

Fractal renewal processes are typically considered in continuous time, with interevent intervals generated independently and identically distributed (IID) from the probability density function:

\phi(t) = \begin{cases} 0 & t < 1 \\ \alpha t^{-(\alpha+1)} & t \ge 1 \end{cases} . \qquad (2)

The probability of seeing an interevent interval of length t or larger is the survival function:

\Phi(t) = \int_t^\infty \phi(t')\, dt' = \begin{cases} 1 & t < 1 \\ t^{-\alpha} & t \ge 1 \end{cases} . \qquad (3)

Time intervals are given in units of the shortest possible interevent interval. When α > 1, the mean interevent interval T = α/(α − 1) is finite; when 0 < α ≤ 1, the mean interevent interval is infinite, but one will always eventually see an event.
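Sampling from Eqs. (2)-(3) is straightforward by inverse-CDF: if U is uniform on (0, 1), then t = U^{−1/α} has survival function t^{−α} for t ≥ 1. The sketch below (ours) generates event times and applies the time-binning coarse-graining of App. A with bin size τ, recording a 1 for each bin containing at least one event:

```python
# Sketch: sample a fractal renewal process (Pareto interevent intervals) and
# time-bin it into a binary sequence, the coarse-graining used in App. A.
import numpy as np

def event_times(n_events, alpha, seed=0):
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random(n_events)            # uniform on (0, 1]
    intervals = u ** (-1.0 / alpha)           # survival t^-alpha for t >= 1
    return np.cumsum(intervals)

def time_bin(times, tau):
    bins = np.zeros(int(times[-1] / tau) + 1, dtype=int)
    bins[(times / tau).astype(int)] = 1       # 1 iff >= one event in the bin
    return bins

t = event_times(10_000, alpha=1.5)
x = time_bin(t, tau=1.0)                      # tau = 1: discrete-time analog
print(x[:30], x.mean())  # occupied-bin fraction ~ 1/T = (alpha - 1)/alpha
```

Since intervals are at least 1, a unit bin holds at most one event, so the mean of the binned sequence estimates the event rate 1/T.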
Appendix D describes how to manipulate the continuous-time analog of Eq. (B1) to obtain:

\hat{E} = \begin{cases} \log \frac{\alpha^2}{\alpha - 1} - 1 & \alpha > 1 \\ \infty & \alpha = 1 \\ \frac{\alpha^2 + \alpha - 1}{\alpha (1 - \alpha)} + \log \frac{\alpha}{1 - \alpha} - (1 - \alpha) K_\alpha & \alpha < 1 \end{cases} , \qquad (4)

where

K_\alpha = \int_0^\infty \left( u^{-\alpha} - (1 + u)^{-\alpha} \right) \log \left( u^{-\alpha} - (1 + u)^{-\alpha} \right) du .

Note that at small values of α, K_α is difficult to evaluate numerically due to the integrand's long tails, even when Ê is quite small. For instance, when α = 1/4, Ê ≈ 0.089 nats, but using the truncated integral $\int_0^N (u^{-\alpha} - (1+u)^{-\alpha}) \log(u^{-\alpha} - (1+u)^{-\alpha})\, du$ in place of K_α does not return positive estimates for the excess entropy until N ≥ 10^{11}.
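A sketch of how one might evaluate Eq. (4) numerically (our code; scipy's general-purpose quadrature stands in for whatever the authors used). For α > 1 the closed form is immediate; for α < 1 the K_α integral is truncated at a large cutoff, and, as just noted, the truncation error decays very slowly for small α:

```python
# Sketch: numerical evaluation of the closed-form excess entropy, Eq. (4).
import numpy as np
from scipy.integrate import quad

def E_hat(alpha, cutoff=1e8):
    if alpha > 1:
        return np.log(alpha**2 / (alpha - 1)) - 1
    if alpha == 1:
        return np.inf
    def integrand(u):
        d = u**-alpha - (1 + u)**-alpha
        return d * np.log(d)
    # K_alpha truncated at `cutoff`; quad may warn near the integrable
    # u -> 0 singularity, and convergence is very slow for small alpha.
    K = quad(integrand, 0, 1, limit=200)[0]
    K += quad(integrand, 1, cutoff, limit=500)[0]
    return ((alpha**2 + alpha - 1) / (alpha * (1 - alpha))
            + np.log(alpha / (1 - alpha)) - (1 - alpha) * K)

print(E_hat(2.0))   # log(4) - 1 ~ 0.386 nats
print(E_hat(0.5))   # finite, despite the infinite mean interevent interval
```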
A more obvious benefit of Eq. (4), then, is that we can study the excess entropy's asymptotic behavior near α = 1, where Ê(ℓ) ∼ log log ℓ. This divergence is slower than any previously reported divergence [22, 24, 25], but it is a divergence nonetheless. When α > 1 but close to its critical value, the excess entropy diverges as ∼ log(1/(α − 1)). As α → ∞, Ê diverges as log α. This point is discussed more fully later on.

The discrete-time analog of fractal renewal processes has survival function:

w(n) = \begin{cases} 1 & n = 0 \\ n^{-\alpha} & n \ge 1 \end{cases} . \qquad (5)

The transient (small-n) behavior of w(n) may not match that in some applications, but only w(n)'s asymptotic behavior is relevant to E's divergence. Moreover, App. A guarantees that E is finite when α ≠ 1 and that at α = 1 its divergence is at most log log ℓ. Additional arguments in App. D, in turn, show that E(ℓ) diverges at α = 1 as log log ℓ.

The excess entropy E captures the amount of predictable randomness of a stochastic process. As a comparison, we are also interested in the statistical complexity C_µ of discrete-time and continuous-time fractal renewal processes. The statistical complexity is the number of bits required to losslessly predict (the E nats of) the process' future. Sometimes, C_µ is not much larger than E; for discrete-time periodic processes, the two are equivalent and equal to the logarithm of the period. More often than not, C_µ is infinite while E is finite; e.g., for processes generated by most (nonunifilar) hidden Markov models. Cryptic processes have large statistical complexity and small excess entropy [47]; the larger the crypticity, the more a process' true structure is "hidden" from the observer. An open question is whether or not fractal renewal processes, with their statistical signatures of complexity, are highly cryptic. So, we now focus some attention on evaluating C_µ for fractal renewal processes.

We can calculate C_µ of time-binned continuous-time renewal processes in the infinitesimal-τ limit [49]:

C_{\mu\tau} \sim \log \frac{1}{\tau} - \int_0^\infty \frac{\Phi(t)}{T} \log \frac{\Phi(t)}{T}\, dt .

As we will discuss elsewhere, the above expression is the differential entropy over continuous-time causal states—the expression given in Sec. II as the "continuous-time statistical complexity" Ĉ_µ—plus the logarithm of our time-bin resolution. Thus, C_{µτ}'s log(1/τ) divergence is an artifact of our failure to use the differential entropy when calculating memory storage requirements of continuous random variables [53]. As a result, we focus on C_{µτ}'s nondivergent component, Ĉ_µ = lim_{τ→0} (C_{µτ} + log τ), or what was earlier called the continuous-time statistical complexity. Straightforward algebra shows that:

\hat{C}_\mu = \begin{cases} \frac{1}{\alpha - 1} + \log \frac{\alpha}{\alpha - 1} & \alpha > 1 \\ \infty & \alpha \le 1 \end{cases} . \qquad (6)

Again, we can say that the (continuous-time) C_µ diverges whenever the mean interevent interval T diverges. When α ≤ 1, finite-length statistical complexity estimates, adapted to the continuous-time case from Eq. (B2), diverge as:

C^{+\ell}_\mu \sim \begin{cases} \log \ell & \alpha < 1 \\ \tfrac{1}{2} \log \ell & \alpha = 1 \end{cases} .

So, the special nature of α = 1 is also revealed as a discontinuity in the rates of divergence of the finite-length statistical complexity. In particular, the least cryptic fractal renewal process, among fractal renewal processes with divergent statistical complexity, is the process generated when α = 1.

Equations (4) and (6) are plotted in Fig. 2. The divergences in Ê and Ĉ_µ at α = 1 are apparent in the plot. If Ê and Ĉ_µ are taken to be systems-agnostic order parameters, then a fractal renewal process exhibits a nonequilibrium phase transition exactly when its mean interevent interval diverges.

[Figure omitted: Ê and Ĉ_µ plotted versus α.]
FIG. 2. Excess entropy Ê and statistical complexity Ĉ_µ of continuous-time fractal renewal processes: Process realizations are generated by drawing interevent intervals IID from the probability density function φ(t) = α t^{−(α+1)} for t ≥ 1 and 0 otherwise. Ê in nats as a function of α, evaluated using Eq. (4). The nondivergent component of statistical complexity Ĉ_µ in nats as a function of α, evaluated using Eq. (6). Note that Ĉ_µ is a differential entropy and so is not necessarily larger than the excess entropy Ê; a subtlety when working with continuous-time processes.

The behavior of Ê and Ĉ_µ as α tends to infinity also deserves special mention, as the process appears to become infinitely predictable (Ê → ∞) while requiring less memory for prediction (Ĉ_µ → 0). As α tends to ∞, φ(t) becomes more and more sharply peaked at t = 1. In other words, the process moves closer and closer to a periodic process with period 1. Periodic processes are random enough, in that the phase of the process could be any real number between 0 and the period. In the language of computational mechanics, the causal state is the phase, and its differential entropy—the continuous-time statistical complexity Ĉ_µ—is the logarithm of the process' period. As α → ∞, the mean interevent interval T = α/(α − 1) tends to 1, and the continuous-time statistical complexity correspondingly tends to log 1 = 0. However, periodic processes are also highly predictable, in that the time to the next event is determined by the time since the last event; hence, the differential entropy of the time to the next event conditioned on the time since the last event tends towards negative infinity, resulting in an infinite Ê = Ĉ_µ − H[S^−|S^+] → ∞. Similar behavior was seen in Ref. [49] as the noisiness of spike trains tended to zero. The least cryptic fractal renewal process, then, occurs in the limit that α tends to infinity.
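To accompany Eq. (6) and the finite-length growth rates above, here is a small numerical sketch (ours): it evaluates Ĉ_µ in closed form for α > 1 and, for α ≤ 1, tracks the discrete finite-length estimate from Eq. (B2), whose growth rate distinguishes α < 1 from α = 1:

```python
# Sketch: closed-form continuous-time statistical complexity, Eq. (6), and
# the discrete finite-length estimates C_mu^l from Eq. (B2) for alpha <= 1.
import numpy as np

def C_hat(alpha):
    if alpha <= 1:
        return np.inf
    return 1 / (alpha - 1) + np.log(alpha / (alpha - 1))

def C_finite(alpha, ell):
    n = np.arange(ell + 1)
    w = np.maximum(n, 1.0) ** -alpha     # Eq. (5): w(0) = 1, w(n) = n^-alpha
    p = w / w.sum()                      # Pr(S+ = n | an event has occurred)
    return float(-(p * np.log(p)).sum())

print(C_hat(2.0))                                 # 1 + log 2 nats
for ell in (10**2, 10**3, 10**4):
    print(ell, C_finite(0.5, ell) / np.log(ell),  # -> ~ 1   (alpha < 1)
          C_finite(1.0, ell) / np.log(ell))       # -> ~ 1/2 (alpha = 1)
```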
IV. CONCLUSION

We showed that a fractal renewal process's excess entropy diverges precisely when its mean interevent interval diverges. This adds a relatively easily understood process, and one of much broader applicability, to the existing list of ergodic stationary processes with divergent excess entropy [24, 25].

Notably, the expected number of events observed in a finite time interval for a fractal renewal process with divergent excess entropy is zero. This brings an interpretational challenge. A process that, on average, produces arbitrarily long silence is not often described as random. So, should not the excess entropy of a point process with infinite mean interevent interval be zero? However, the mutual information between finite-length pasts and futures, assuming that we do see an event, can diverge. And we will almost surely see an event when we view a semi-infinite past.

Our calculations revealed that fractal renewal processes flip from finite to divergent statistical complexity and exhibit divergent excess entropy exactly when the mean interevent interval diverges. These information-theoretic measures of memory point to the power-law coefficient α = 1 as a "critical" parameter in this process family. When the mean interevent interval is finite, both excess entropy and continuous-time statistical complexity are finite, though excess entropy grows unbounded as α tends to infinity. When the mean interevent interval is infinite and the power-law coefficient is not α = 1, excess entropy is finite, but continuous-time statistical complexity is infinite.

Employing signatures of long memory based on second-order statistics suggests, instead, that α = 2 is a "critical point". Specifically, the power spectrum of a fractal renewal process exhibits power-law scaling when α < 2 [41], and the Hurst index of the processes with α < 2 is greater than 1/2 and increases with decreasing α [9]. Therefore, at a minimum, drawing conclusions about a process' complex organization via such low-order statistics can be ambiguous.

Finally, our results suggest that certain previously studied experimental phenomena are poised at a critical point between finite and infinite "memory", as suggested by many others using other definitions of criticality [54]. The stochastic process of neuron membrane ion channels opening and closing has divergent excess entropy when the kinetic rate adopts the form k_eff(t) ≈ t^{−1}. This may be the case for some potassium-selective channels in cultured mouse hippocampal pyramidal cells near resting membrane voltage, V = −60 mV [55, Fig. 10, bottom right]. Similarly, the phenomenological fit of the stopping probabilities used for Wikipedia edit-revert time series has divergent statistical complexity when α = 1 and divergent excess entropy when p = 1 as well [56, 57].
This seems to suggest that increased cooperativity between editors drives Wikipedia towards increasing its social memory.

However, one lesson from our results amounts to a cautionary note on interpreting the implicated memory organization. To the extent that the estimated fractal renewal processes with divergent memory are good models, one cannot conclude that the content of that memory reflects sophisticated computational processing or highly organized storage of detailed information. Indeed, like all renewal processes, fractal renewal processes are simple: they count up to some threshold and reset. Surely these coarse statistics, while useful and even necessary as tools for a first-cut analysis, fall far short of fully describing the hierarchies of information processing in neurons and the rich social dynamics driving Wikipedia's accumulating human knowledge.

To close, let's return to our initial discussion of statistical signatures of structural organization. We drew a comparison of divergent memory in ergodic processes to that we previously identified in the so-called Bandit nonergodic processes [26]. The mechanism underlying the latter was rather straightforward: from trial to trial the process remembers the operant ergodic component subprocess and so uses an infinite memory and exhibits an excess entropy that diverges as log ℓ. The case for ergodic processes is more subtle. For renewal processes we showed that the divergence is log log ℓ. What's the associated mechanism? Renewal processes track time between events and so, in computational-model terms, it appears that the process somehow embeds a counter [21, Sec. 4.5.2]. An interesting contrast is the log ℓ excess entropy divergence seen at the onset of chaos through period-doubling, associated with a pushdown-stack mechanism [21, Sec. 4.5.1], and seen in the branching copy process [24]. At this stage, though, the possibility of unique associations between the form of information-measure divergence and mechanism is not sufficiently well explored. Nonetheless, with further extension and refinement, information measures and their divergences will become increasingly insightful diagnostics of nature's diverse forms of intrinsic computation.

ACKNOWLEDGMENTS

The authors thank W. Bialek, S. DeDeo, and P. Riechers for helpful discussions and the Santa Fe Institute and the City University of New York for hospitality during visits. This material is based upon work supported by, or in part by, the U.S. Army Research Laboratory and the U.S. Army Research Office under contracts W911NF-13-1-0390, W911NF-13-1-0340, and W911NF-12-1-0288. S.E.M. was funded by a National Science Foundation Graduate Student Research Fellowship and the U.C. Berkeley Chancellor's Fellowship.

Appendix A: Continuous- versus Discrete-time Excess Entropies

Often, integrals are easier to evaluate than the corresponding sums. One practical goal, leveraged below, is to relate the excess entropy of time-binned continuous-time processes to that of corresponding discrete-time renewal processes. Reference [46] found that the excess entropy of a discrete-time renewal process is:

E = \log(\mu + 1) - \frac{2}{\mu + 1} \sum_{n=0}^{\infty} w(n) \log w(n) + \frac{1}{\mu + 1} \sum_{n=0}^{\infty} (n + 1) F(n) \log F(n) . \qquad (A1)
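Eq. (A1) is easy to evaluate numerically by truncating the sums when µ is finite. As a sanity check, the sketch below (ours) applies it to a geometric interevent distribution, for which the renewal process is an IID Bernoulli sequence and E should vanish, and to the discrete fractal survival function of Eq. (5) with α = 3, where E is finite and positive:

```python
# Sketch: truncated numerical evaluation of Eq. (A1) for discrete-time
# renewal processes with finite mean interevent count mu.
import numpy as np

def xlogx(x):
    return np.where(x > 0, x * np.log(np.where(x > 0, x, 1.0)), 0.0)

def excess_entropy_A1(w):
    # w: survival function w(0), w(1), ..., truncated where it is ~ 0
    F = w[:-1] - w[1:]                   # F(n) = w(n) - w(n+1)
    n = np.arange(len(F))
    mu = (n * F).sum()                   # mean interevent count
    return (np.log(mu + 1)
            - 2 * xlogx(w).sum() / (mu + 1)
            + ((n + 1) * xlogx(F)).sum() / (mu + 1))

n = np.arange(10**6)
print(excess_entropy_A1(0.5 ** n))                 # geometric (IID): E ~ 0
print(excess_entropy_A1(np.maximum(n, 1.0) ** -3)) # Eq. (5), alpha = 3: E > 0
```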
Reference [49] showed that the excess entropy of a continuous-time renewal process X(t) is:

\hat{E} = I[X(t)_{t<0}; X(t)_{t \ge 0}] = \log T - \frac{2}{T} \int_0^\infty \Phi(t) \log \Phi(t)\, dt + \frac{1}{T} \int_0^\infty t\, \phi(t) \log \phi(t)\, dt , \qquad (A2)

which is in units of nats when the mean interevent interval T is finite.

Consider time-binning the continuous-time point process X(t) by asking how many events are observed in an interval [t, t + τ). If at least one event is observed, then we record a 1; if no events are observed, then we record a 0. This data-labeling technique is common; e.g., when studying neural spike trains. The probability of observing at least n counts between successive 1s is given by:

w_\tau(n) = \Phi(n\tau) .

When τ = 1, the survival function of the time-binned process is exactly that of the discrete-time renewal process with excess entropy given in Eq. (A1).

The excess entropy, or estimates thereof, for a discrete-time renewal process is upper-bounded by the excess entropy of a corresponding continuous-time renewal process, as shown shortly. This is a special case of a more general statement: coarse-graining a time series always reduces its excess entropy, due to the Data Processing Inequality. This statement can be easily generalized to other discrete-alphabet, continuous-time processes. Despite its simplicity, it proves useful for the calculations in Sec. III.

In particular, let Ê denote the excess entropy of a continuous-time renewal process X(t) with survival function Φ(t) and E the excess entropy of the discrete-time renewal process X_t with survival function w(n) = Φ(n) for all nonnegative integers n. Then, when Ê < ∞:

E \le \hat{E} .

To see this, let E_τ denote the excess entropy of the discrete-time process that comes from time-binning the continuous-time renewal process with discretization bin size τ. To obtain the above inequality, we apply the Data Processing Inequality:

E_{1/n} = I[\ldots, X(-2/n), X(-1/n); X(0), X(1/n), \ldots] \ge I[\ldots, X_{-2}, X_{-1}; X_0, X_1, \ldots] = E_1 .

If we take the limit of the left-hand side as n → ∞, we obtain:

E_{\tau=1} \le \lim_{n \to \infty} E_{1/n} = \lim_{\tau \to 0} E_\tau .

Again by the Data Processing Inequality, E_{τ=1} is lower-bounded by the mutual information between the counts since the last event and the counts to the next event, as the former is a function of the past and the latter is a function of the future: E ≤ E_{τ=1}. By definition [58], lim_{τ→0} E_τ = Ê.
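For the fractal interevent density of Eqs. (2)-(3) with α > 1, Eq. (A2) can be checked against the closed form in Eq. (4). A sketch (ours, using scipy quadrature with a finite upper limit standing in for the infinite one):

```python
# Sketch: evaluate Eq. (A2) by quadrature for phi(t) = alpha * t^-(alpha+1),
# t >= 1, and compare with the closed form of Eq. (4) for alpha > 1.
import numpy as np
from scipy.integrate import quad

def E_hat_quadrature(alpha, upper=1e7):
    T = alpha / (alpha - 1)                     # mean interevent interval
    phi = lambda t: alpha * t ** -(alpha + 1)
    Phi = lambda t: t ** -alpha                 # survival for t >= 1
    # Phi * log(Phi) vanishes on [0, 1), so both integrals start at t = 1.
    i1 = quad(lambda t: Phi(t) * np.log(Phi(t)), 1, upper, limit=200)[0]
    i2 = quad(lambda t: t * phi(t) * np.log(phi(t)), 1, upper, limit=200)[0]
    return np.log(T) - 2 * i1 / T + i2 / T

alpha = 2.0
print(E_hat_quadrature(alpha))                  # ~ 0.386 nats
print(np.log(alpha**2 / (alpha - 1)) - 1)       # Eq. (4): log 4 - 1
```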
Appendix B: Renewal Processes with Infinite Mean Interevent Intervals

When the mean interevent interval T (or µ) is infinite, the formulae for excess entropy in Eqs. (A1) and (A2) no longer apply. Causal states, however, still provide a useful framework for calculating it. Using them, we introduce an analysis method for discrete-time renewal processes in this case. The obvious extensions to continuous-time renewal processes follow when we replace F(n) with φ(t), w(n) with Φ(t), and summations with integrals.

We calculate E(ℓ) for renewal processes with infinite µ via an analysis technique inspired by Ref. [24] and then calculate E as the limit of E(ℓ) as ℓ tends to infinity, seemingly valid for ergodic processes. First, we would like to directly calculate E(ℓ) in terms of forward- and reverse-time causal states [47]: $E(\ell) = I[\overleftarrow{X}; \overrightarrow{X}^\ell] = I[\mathcal{S}^+; \mathcal{S}^-_\ell]$, where $\mathcal{S}^-_\ell$ are finite-time reverse-time causal states. Unfortunately, inspecting the corresponding joint probability distribution in App. II of Ref. [46] shows that while we can identify the joint probability distribution up to a normalization constant, this normalization constant is infinite when µ is infinite.

So, we define a "proxy" binary random variable U_ℓ, which is 1 if there has been an event sometime in $\overrightarrow{X}^\ell$ and past $\overleftarrow{X}$, and 0 otherwise. A little reflection shows that Pr(U_ℓ = 0) = lim_{N→∞} w(N + ℓ) = 0. Even so, this auxiliary random variable is a surprisingly useful construct. A standard information-theoretic decomposition gives $E(\ell) = I[\mathcal{S}^+; \mathcal{S}^-_\ell \mid U_\ell] + I[\mathcal{S}^+; \mathcal{S}^-_\ell; U_\ell]$, but since Pr(U_ℓ = 0) = 0, we have that $I[\mathcal{S}^+; \mathcal{S}^-_\ell \mid U_\ell] = I[\mathcal{S}^+; \mathcal{S}^-_\ell \mid U_\ell = 1]$ and $I[\mathcal{S}^+; \mathcal{S}^-_\ell; U_\ell] = 0$. Altogether this yields:

E(\ell) = I[\mathcal{S}^+; \mathcal{S}^-_\ell \mid U_\ell = 1] .

The conditional probability distribution $\Pr(\mathcal{S}^+, \mathcal{S}^-_\ell \mid U_\ell = 1)$ is normalizable and, as shown in App. C, leads to:

E(\ell) = \log Z(\ell) - \frac{1}{Z(\ell)} \sum_{n=0}^{\ell} w(n) \log w(n) - \frac{1}{Z(\ell)} \sum_{n=0}^{\infty} \left( w(n) - w(n + \ell + 1) \right) \log \left( w(n) - w(n + \ell + 1) \right) + \frac{1}{Z(\ell)} \sum_{n=0}^{\ell} (n + 1) F(n) \log F(n) + \frac{\ell + 1}{Z(\ell)} \sum_{n=\ell+1}^{\infty} F(n) \log F(n) , \qquad (B1)

where $Z(\ell) = \sum_{n=0}^{\ell} w(n)$. If lim_{ℓ→∞} E(ℓ) diverges, then we look for the asymptotic rate of divergence of E(ℓ). Otherwise, the process' excess entropy can be defined as E = lim_{ℓ→∞} E(ℓ). We expect E will often be finite even when µ diverges.

A similar method allows us to calculate C_µ when the mean interevent count is infinite. This time, we define U_ℓ as a proxy random variable that is 1 if there has been an event in $\overleftarrow{X}^\ell$ and 0 otherwise. Since U_ℓ is a function of $\mathcal{S}^+$, a standard information-theoretic identity implies that $C_\mu = H[\mathcal{S}^+ \mid U_\ell] + H[U_\ell]$ and, in particular:

C_\mu = \lim_{\ell \to \infty} \left( H[\mathcal{S}^+ \mid U_\ell] + H[U_\ell] \right) .

As before, lim_{ℓ→∞} Pr(U_ℓ = 0) = lim_{ℓ→∞} w(ℓ) = 0, so lim_{ℓ→∞} H[U_ℓ] = 0. Also, $H[\mathcal{S}^+ \mid U_\ell] = \Pr(U_\ell = 0)\, H[\mathcal{S}^+ \mid U_\ell = 0] + \Pr(U_\ell = 1)\, H[\mathcal{S}^+ \mid U_\ell = 1]$ by definition. Since there is only one semi-infinite past without an event, lim_{ℓ→∞} H[S^+ | U_ℓ = 0] = 0. And:

H[\mathcal{S}^+ \mid U_\ell = 1] = -\sum_{n=0}^{\ell} \frac{w(n)}{Z(\ell)} \log \frac{w(n)}{Z(\ell)} .

Altogether, this implies:

C_\mu = \lim_{\ell \to \infty} \sum_{n=0}^{\ell} \frac{w(n)}{Z(\ell)} \log \frac{Z(\ell)}{w(n)} . \qquad (B2)

One can also study the growth rate of finite-time statistical complexity estimates, which are, after a moment's reflection, the $C^\ell_\mu = -\sum_{n=0}^{\ell} \frac{w(n)}{Z(\ell)} \log \frac{w(n)}{Z(\ell)}$ estimates appearing in Eq. (B2).

One comment, perhaps obvious from Eqs. (B1) and (B2), is that whether or not E and C_µ diverge depends entirely on the asymptotic form of F(n). Another is that the sums in Eq. (B1) can be quite difficult to evaluate numerically when the renewal process has long-range temporal correlations, since then F(n) decays slowly with n.
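The sketch below (ours) evaluates the truncated sums of Eq. (B1) for the discrete fractal survival function, Eq. (5). At α = 1 the estimates should approach the asymptotic form log log ℓ − 2 derived in App. D, but, as the last comment warns, convergence is slow and the truncation must extend far past ℓ:

```python
# Sketch: truncated evaluation of E(l), Eq. (B1), for w(n) of Eq. (5).
import numpy as np

def xlogx(x):
    return np.where(x > 0, x * np.log(np.where(x > 0, x, 1.0)), 0.0)

def E_B1(alpha, ell, N=10**6):
    n = np.arange(N)
    w = np.maximum(n, 1.0) ** -alpha          # Eq. (5): w(0) = 1, n^-alpha
    F = w[:-1] - w[1:]                        # F(n) = w(n) - w(n+1)
    Z = w[: ell + 1].sum()
    diff = w[: N - ell - 1] - w[ell + 1:]     # w(n) - w(n + l + 1)
    m = np.arange(ell + 1)
    return (np.log(Z)
            - xlogx(w[: ell + 1]).sum() / Z
            - xlogx(diff).sum() / Z
            + ((m + 1) * xlogx(F[: ell + 1])).sum() / Z
            + (ell + 1) * xlogx(F[ell + 1:]).sum() / Z)

# E(l) >= 0 always; the asymptote log(log l) - 2 is negative until very
# large l, so the o(1) corrections dominate at these modest lengths.
for ell in (10, 100, 1000):
    print(ell, E_B1(1.0, ell), np.log(np.log(ell)) - 2)
```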
Appendix C: Finite-time Excess Entropy Estimates with Infinite Mean Interevent Interval

From App. II of Ref. [46]:

\Pr(\mathcal{S}^+ = \sigma^+, \mathcal{S}^-_\ell = \sigma^- \mid U_\ell = 1) = \frac{1}{Z} \begin{cases} F(\sigma^+ + \sigma^-) & \sigma^- \le \ell \\ 0 & \sigma^- = \ell + 1 \end{cases} ,

where the normalization constant is:

Z = \sum_{\sigma^-=0}^{\ell} \sum_{\sigma^+=0}^{\infty} F(\sigma^+ + \sigma^-) = \sum_{\sigma^-=0}^{\ell} w(\sigma^-) .

The marginals are easily calculated:

\Pr(\mathcal{S}^+ = \sigma^+ \mid U_\ell = 1) = \frac{1}{Z} \left( w(\sigma^+) - w(\sigma^+ + \ell + 1) \right)

and:

\Pr(\mathcal{S}^-_\ell = \sigma^- \mid U_\ell = 1) = \frac{1}{Z} \begin{cases} w(\sigma^-) & \sigma^- \le \ell \\ 0 & \sigma^- = \ell + 1 \end{cases} .

From this, we calculate the finite-length excess entropy in nats:

E(\ell) = H[\mathcal{S}^-_\ell \mid U_\ell = 1] + H[\mathcal{S}^+ \mid U_\ell = 1] - H[\mathcal{S}^+, \mathcal{S}^-_\ell \mid U_\ell = 1]
= \log Z - \frac{1}{Z} \sum_{n=0}^{\ell} w(n) \log w(n) - \frac{1}{Z} \sum_{n=0}^{\infty} (w(n) - w(n + \ell + 1)) \log (w(n) - w(n + \ell + 1)) + \frac{1}{Z} \sum_{n=0}^{\infty} \sum_{m=0}^{\ell} F(n + m) \log F(n + m) .

This simplifies to:

E(\ell) = \log Z - \frac{1}{Z} \sum_{n=0}^{\ell} w(n) \log w(n) - \frac{1}{Z} \sum_{n=0}^{\infty} (w(n) - w(n + \ell + 1)) \log (w(n) - w(n + \ell + 1)) + \frac{1}{Z} \sum_{n=0}^{\ell} (n + 1) F(n) \log F(n) + \frac{\ell + 1}{Z} \sum_{n=\ell+1}^{\infty} F(n) \log F(n) .

Similar manipulations hold for continuous-time processes. Briefly, the time since the last event t and the time to the next event t′ have a joint probability distribution proportional to φ(t + t′), since the time since the last event plus the time to the next event is an interevent interval.

Appendix D: Fractal Renewal Processes

The α > 1 case simply requires substituting φ(t) and Φ(t) from Eqs. (2)-(3) into Eq. (A2) and solving:

\hat{E} = \log T - \frac{2}{T} \int_0^\infty \Phi(t) \log \Phi(t)\, dt + \frac{1}{T} \int_0^\infty t\, \phi(t) \log \phi(t)\, dt . \qquad (D1)

After straightforward calculations, we find that:

T = \frac{\alpha}{\alpha - 1} , \qquad \frac{1}{T} \int_0^\infty \Phi(t) \log \Phi(t)\, dt = -\frac{1}{\alpha - 1} , \qquad \frac{1}{T} \int_0^\infty t\, \phi(t) \log \phi(t)\, dt = \log \alpha - \frac{\alpha + 1}{\alpha - 1} .

These together yield:

\hat{E} = \log \frac{\alpha^2}{\alpha - 1} - 1 .
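For completeness, here is a worked check of the middle quantity (our verification, using the standard integral $\int_1^\infty t^{-s} \log t \, dt = 1/(s-1)^2$ for s > 1):

```latex
% Verifying (1/T) \int_0^\infty \Phi(t) \log \Phi(t)\, dt = -1/(\alpha - 1):
\begin{aligned}
\int_0^\infty \Phi(t) \log \Phi(t)\, dt
  &= \int_1^\infty t^{-\alpha} \left( -\alpha \log t \right) dt
   && \text{($\Phi = 1$, so $\log \Phi = 0$, on $[0,1)$)} \\
  &= -\alpha \int_1^\infty t^{-\alpha} \log t \, dt
   = -\frac{\alpha}{(\alpha - 1)^2} , \\
\frac{1}{T} \int_0^\infty \Phi \log \Phi \, dt
  &= \frac{\alpha - 1}{\alpha} \cdot \left( -\frac{\alpha}{(\alpha - 1)^2} \right)
   = -\frac{1}{\alpha - 1} .
\end{aligned}
```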
Now, we turn our attention to the case of 0 < α ≤ 1. There are two possibilities for Ê when 0 < α ≤ 1. One is that Ê diverges, in which case we only care about the asymptotic rate of divergence of Ê(ℓ). The other possibility is that Ê does not diverge, in which case we only care about contributions Q to Ê(ℓ) that are not o(1); i.e., that satisfy lim_{ℓ→∞} Q ≠ 0. Our strategy in evaluating Ê(ℓ) from Eq. (D1) is to systematically find closed-form expressions for all components that are not o(1). Direct solution gives:

Z = \begin{cases} \frac{\ell^{1-\alpha}}{1-\alpha} & \alpha < 1 \\ \log \ell & \alpha = 1 \end{cases} , \qquad (D2)

plus components of o(1);

-\frac{1}{Z} \int_0^\ell \Phi(t) \log \Phi(t)\, dt = \begin{cases} -\frac{\alpha}{1-\alpha} + \alpha \log \ell & \alpha < 1 \\ \tfrac{1}{2} \log \ell & \alpha = 1 \end{cases} , \qquad (D3)

plus components of o(1); and:

\frac{1}{Z} \int_1^\ell t\, \phi(t) \log \phi(t)\, dt + \frac{\ell}{Z} \int_\ell^\infty \phi(t) \log \phi(t)\, dt = \begin{cases} -\frac{1 - \alpha - 2\alpha^2}{\alpha(1-\alpha)} + \log \alpha - (1 + \alpha) \log \ell & \alpha < 1 \\ -2 - \log \ell & \alpha = 1 \end{cases} , \qquad (D4)

plus components of o(1). Finally, we address the only component with no simple closed-form expression:

\frac{1}{Z} \int_0^\infty (\Phi(t) - \Phi(t + \ell)) \log (\Phi(t) - \Phi(t + \ell))\, dt = \frac{1}{Z} \int_1^\infty (t^{-\alpha} - (t + \ell)^{-\alpha}) \log (t^{-\alpha} - (t + \ell)^{-\alpha})\, dt + \frac{1}{Z} \int_0^1 (1 - (t + \ell)^{-\alpha}) \log (1 - (t + \ell)^{-\alpha})\, dt .

Since:

\lim_{\ell \to \infty} \frac{1}{Z} \int_0^1 (1 - (t + \ell)^{-\alpha}) \log (1 - (t + \ell)^{-\alpha})\, dt = 0 ,

we ignore that term as a correction of o(1). The case α = 1 can actually be evaluated explicitly, since $\frac{1}{t} - \frac{1}{t+\ell} = \frac{\ell}{t(t+\ell)}$:

\frac{1}{Z} \int_1^\infty \frac{\ell}{t(t + \ell)} \log \frac{\ell}{t(t + \ell)}\, dt \sim -\frac{1}{2} \log \ell \quad \text{as } \ell \to \infty .

Now, consider the case α < 1. We extract the asymptotic scaling in ℓ of the first term by the change of variables u = t/ℓ, giving:

\frac{1}{Z} \int_1^\infty (t^{-\alpha} - (t + \ell)^{-\alpha}) \log (t^{-\alpha} - (t + \ell)^{-\alpha})\, dt
= \frac{\ell^{1-\alpha}}{Z} \int_{1/\ell}^\infty (u^{-\alpha} - (1 + u)^{-\alpha}) \log \left( \ell^{-\alpha} (u^{-\alpha} - (1 + u)^{-\alpha}) \right) du
= -\frac{\alpha \ell^{1-\alpha} \log \ell}{Z} \int_{1/\ell}^\infty \left( u^{-\alpha} - (1 + u)^{-\alpha} \right) du + \frac{\ell^{1-\alpha}}{Z} \int_{1/\ell}^\infty (u^{-\alpha} - (1 + u)^{-\alpha}) \log (u^{-\alpha} - (1 + u)^{-\alpha})\, du .

The first of the two integrals can be evaluated explicitly:

\int_{1/\ell}^\infty \left( u^{-\alpha} - (1 + u)^{-\alpha} \right) du = -\frac{\ell^{\alpha-1}}{1-\alpha} + \frac{\ell^{\alpha-1} (\ell + 1)^{1-\alpha}}{1-\alpha} .

So we find the first term's asymptotic behavior to be:

-\frac{\alpha \ell^{1-\alpha} \log \ell}{Z} \int_{1/\ell}^\infty \left( u^{-\alpha} - (1 + u)^{-\alpha} \right) du \sim -\alpha \log \ell ,

plus corrections of o(1). One of the more notable corrections of o(1) is proportional to (log ℓ)/Z, which is o(1) for α < 1 and otherwise has a nonzero limiting value as ℓ → ∞. Surprisingly, the latter of the two integrals limits to a finite value for α < 1:

\lim_{\ell \to \infty} \frac{\ell^{1-\alpha}}{Z} \int_{1/\ell}^\infty (u^{-\alpha} - (1 + u)^{-\alpha}) \log (u^{-\alpha} - (1 + u)^{-\alpha})\, du = (1 - \alpha) \int_0^\infty (u^{-\alpha} - (1 + u)^{-\alpha}) \log (u^{-\alpha} - (1 + u)^{-\alpha})\, du ,

where we used lim_{ℓ→∞} ℓ^{1−α}/Z = 1 − α for α < 1. As a result, we find that:

\frac{1}{Z} \int_0^\infty (\Phi(t) - \Phi(t + \ell)) \log (\Phi(t) - \Phi(t + \ell))\, dt = \begin{cases} -\tfrac{1}{2} \log \ell & \alpha = 1 \\ -\alpha \log \ell + (1 - \alpha) \int_0^\infty (u^{-\alpha} - (1 + u)^{-\alpha}) \log (u^{-\alpha} - (1 + u)^{-\alpha})\, du & 0 < \alpha < 1 \end{cases} , \qquad (D5)

plus corrections of o(1).

Altogether, combining Eqs. (D2)-(D4) and (D5) into Eq. (D1), we recover Eq. (4) of the main text. As discussed there, we still must evaluate E(ℓ) at α = 1. We focus again on asymptotic expansions in ℓ and drop corrections to expressions that do not contribute to E. When α = 1:

Z(\ell) = 1 + \sum_{n=1}^{\ell} \frac{1}{n} = \log \ell ,

plus corrections of O(1). Next, we evaluate:

-\sum_{n=0}^{\ell} w(n) \log w(n) = \sum_{n=1}^{\ell} \frac{\log n}{n} = \sum_{n=2}^{\ell} \frac{\log n}{n} .

Since (log n)/n is eventually a monotone decreasing function of n, we lower- and upper-bound this sum using the integrals $\int_2^{\ell+1} \frac{\log n}{n}\, dn \le \sum_{n=2}^{\ell} \frac{\log n}{n} \le \frac{\log 2}{2} + \int_2^{\ell} \frac{\log n}{n}\, dn$. These are easily evaluated, giving:

-\sum_{n=0}^{\ell} w(n) \log w(n) = \frac{1}{2} \log^2 \ell ,

plus corrections of O(1). For the other sums, we need an expression for F(n):

F(n) = w(n) - w(n+1) = \begin{cases} 0 & n = 0 \\ \frac{1}{n(n+1)} & n \ge 1 \end{cases} .

Then, we evaluate:

\sum_{n=0}^{\ell} (n + 1) F(n) \log F(n) = -2 \sum_{n=1}^{\ell} \frac{\log n}{n} - \sum_{n=1}^{\ell} \frac{\log(1 + \tfrac{1}{n})}{n} = -\log^2 \ell ,

plus corrections of O(1), where we have noted that $\sum_{n=1}^{\infty} \frac{\log(1 + 1/n)}{n}$ converges since $\int_1^\infty \frac{\log(1 + 1/x)}{x}\, dx$ converges. The next term takes the form:

(\ell + 1) \sum_{n=\ell+1}^{\infty} F(n) \log F(n) = -(\ell + 1) \sum_{n=\ell+1}^{\infty} \frac{\log(n(n+1))}{n(n+1)} .

We can bound the sum using $\int_{\ell+1}^{\infty} \frac{\log(n(n+1))}{n(n+1)}\, dn \le \sum_{n=\ell+1}^{\infty} \frac{\log(n(n+1))}{n(n+1)} \le \frac{\log(\ell^2 + \ell)}{\ell^2 + \ell} + \int_{\ell+1}^{\infty} \frac{\log(n(n+1))}{n(n+1)}\, dn$. These integrals are both easily evaluated, revealing an asymptotic form of:

(\ell + 1) \sum_{n=\ell+1}^{\infty} F(n) \log F(n) = -2 \log \ell ,

plus corrections of O(1). Finally, to evaluate the last term in the sum, we note that:

w(n) - w(n + \ell + 1) = \frac{1}{n \left( 1 + \frac{n}{\ell + 1} \right)} = \frac{1/(\ell + 1)}{\frac{n}{\ell+1} \left( 1 + \frac{n}{\ell+1} \right)} ,

when n ≥ 1. We define x_n = n/(ℓ + 1) with dx_n = 1/(ℓ + 1) and write:

w(n) - w(n + \ell + 1) = \frac{dx_n}{x_n (1 + x_n)} .

Then:

\sum_{n=0}^{\infty} (w(n) - w(n + \ell + 1)) \log (w(n) - w(n + \ell + 1)) = (1 - w(\ell + 1)) \log (1 - w(\ell + 1)) + \log(dx_n) \sum_{n=1}^{\infty} \frac{dx_n}{x_n (1 + x_n)} - \sum_{n=1}^{\infty} \frac{\log(x_n (1 + x_n))}{x_n (1 + x_n)}\, dx_n .

The first term is o(1), since lim_{ℓ→∞} (1 − w(ℓ + 1)) log(1 − w(ℓ + 1)) = 0. We can view the other two sums as Riemann sums for the integrals $\int_{1/\ell}^{\infty} \frac{dx}{x(1+x)}$ and $\int_{1/\ell}^{\infty} \frac{\log(x(1+x))}{x(1+x)}\, dx$, respectively, giving:

\sum_{n=1}^{\infty} \frac{dx_n}{x_n (1 + x_n)} = \log \ell ,

plus corrections of o(1), and:

\sum_{n=1}^{\infty} \frac{\log(x_n (1 + x_n))}{x_n (1 + x_n)}\, dx_n = -\frac{1}{2} \log^2 \ell ,

plus corrections of o(1). Altogether, substituting the above expressions into Eq. (B1) yields:

E(\ell) = \log \log \ell - 2 ,

plus corrections of o(1). The various divergences of order log ℓ all cancel one another, but the divergence of log log ℓ, due to the log ℓ divergence in Z(ℓ), remains, just as for the continuous-time case.
When F(n) is monotone decreasing beyond some finite N sufficiently rapidly, manipulations similar to those above imply that divergence in Ê is a sufficient condition for divergence in E.

REFERENCES

[1] G. M. Shim and M. Y. Choi. Algebraic decay of correlations in neural networks. Phys. Rev. A, 46(8):5292–5295, 1992.
[2] J. J. Binney, N. J. Dowrick, A. J. Fisher, and M. E. J. Newman. The Theory of Critical Phenomena. Oxford University Press, Oxford, 1992.
[3] J. F. Alves, S. Luzzatto, and V. Pinheiro. Markov structures and decay of correlations for non-uniformly expanding dynamical systems. Ann. I. H. Poincare – Anal. Nonlin., 22:817–839, 2005.
[4] R. F. Voss and J. Clarke. '1/f noise' in music and speech. Nature, 258:317–318, 27 November 1975.
[5] A. d'Amico and P. Mazzetti. Noise in Physical Systems and 1/f Noise. Elsevier Science Publishers, Amsterdam, The Netherlands, 1986.
[6] W. H. Press. Flicker noises in astronomy and elsewhere. Comments on Astrophysics, 7(4):103–119, 1978.
[7] H. E. Hurst. Long-term storage capacity of reservoirs. Trans. Amer. Soc. Civil Engineers, 116:770–799, 1951.
[8] B. B. Mandelbrot and J. R. Wallis. Noah, Joseph, and operational hydrology. Water Resources Research, 4(5):909–918, 1968.
[9] D. J. Daley. The Hurst index of long-range dependent renewal processes. Ann. Prob., 27(4):2035–2041, 1999.
[10] J. Beran, Y. Feng, S. Ghosh, and R. Kulik. Long-memory Processes: Probabilistic Properties and Statistical Methods. Springer, London, United Kingdom, 2013.
[11] G. K. Zipf. The Psycho-Biology of Language: An Introduction to Dynamic Philology. Houghton Mifflin Company, Boston, Massachusetts, 1935.
[12] B. Mandelbrot. An informational theory of the statistical structure of languages. In W. Jackson, editor, Communication Theory, pages 486–502. Butterworths, London, 1953.
[13] D. C. Knill, D. Field, and D. Kersten. Human discrimination of fractal images. J. Opt. Soc. Am. A, 7(6):1113–1123, 1990.
[14] T. Kumar, P. Zhou, and D. A. Glaser. Comparison of human performance with algorithms for estimating fractal dimension. J. Opt. Soc. Am. A, 10(6):1136–46, 1993.
[15] B. B. Mandelbrot. Self-similar error clusters in communication systems and the concept of conditional stationarity. IEEE Trans. Commun. Technol., 13:71–90, 1965.
[16] B. B. Mandelbrot. Multifractals and 1/f Noise: Wild Self-Affinity in Physics (1963–1976). Springer, New York, first edition, 1999.
[17] P. Riechers. Private communication, 2015.
[18] B. D. Johnson, J. P. Crutchfield, C. J. Ellison, and C. S. McTague. Enumerating finitary processes. Submitted, 2012. SFI Working Paper 10-11-027; arxiv.org:1011.0036 [cs.FL].
[19] R. G. James, J. R. Mahoney, C. J. Ellison, and J. P. Crutchfield. Many roads to synchrony: Natural time scales and their algorithms. Phys. Rev. E, 89:042135, 2014.
[20] G. Samorodnitsky. Long range dependence. Foundations and Trends in Stochastic Systems, 1(3):163–257, 2007.
[21] J. P. Crutchfield. The calculi of emergence: Computation, dynamics, and induction. Physica D, 75:11–54, 1994.
[22] W. Bialek, I. Nemenman, and N. Tishby. Predictability, complexity, and learning. Neural Comp., 13:2409–2463, 2001.
[23] J. P. Crutchfield and D. P. Feldman. Regularities unseen, randomness observed: Levels of entropy convergence. CHAOS, 13(1):25–54, 2003.
[24] N. Travers and J. P. Crutchfield.
Infinite excess entropy processes with countable-state generators. Entropy, 16:1396–1413, 2014.
[25] L. Debowski. On hidden Markov processes with infinite excess entropy. J. Theo. Prob., pages 1–13, 2012.
[26] J. P. Crutchfield and S. Marzen. Signatures of infinity: Nonergodicity and resource scaling in prediction, complexity, and learning. Phys. Rev. E, 91(5):050106, 2015.
[27] W. L. Smith. Renewal theory and its ramifications. J. Roy. Stat. Soc. B, 20(2):243–302, 1958.
[28] W. Gerstner and W. M. Kistler. Statistics of spike trains. In Spiking Neuron Models. Cambridge University Press, Cambridge, United Kingdom, 2002.
[29] F. Beichelt. Stochastic Processes in Science, Engineering and Finance. Chapman and Hall, New York, 2006.
[30] V. S. Barbu and N. Limnios. Semi-Markov Chains and Hidden Semi-Markov Models toward Applications: Their Use in Reliability and DNA Analysis, volume 191. Springer, New York, 2008.
[31] S. B. Lowen and M. C. Teich. Fractal renewal processes. IEEE Trans. Info. Th., 39(5):1669–1671, 1993.
[32] S. Thurner, S. B. Lowen, M. C. Feurstein, and C. Heneghan. Analysis, synthesis and estimation of fractal-rate stochastic point processes. Fractals, 5(4):565–595, 1997.
[33] R. Cakir, P. Grigolini, and A. A. Krokhin. Dynamical origin of memory and renewal. Phys. Rev. E, 74:021108, 2006.
[34] S. Bianco, M. Ignaccolo, M. S. Rider, M. J. Ross, P. Winsor, and P. Grigolini. Brain, music, and non-Poisson renewal processes. Phys. Rev. E, 75:061911, 2007.
[35] C.-B. Li, H. Yang, and T. Komatsuzaki. Multiscale complex network of protein conformational fluctuations in single-molecule time series. Proc. Natl. Acad. Sci. USA, 105:536–541, 2008.
[36] T. Akimoto, T. Hasumi, and Y. Aizawa. Characterization of intermittency in renewal processes: Application to earthquakes. Phys. Rev. E, 81:031133, 2010.
[37] D. Kelly, M. Dillingham, A. Hudson, and K. Wiesner. A new method for inferring hidden Markov models from noisy time sequences. PLoS One, 7(1):e29703, 2012.
[38] M. Montero and J. Villarroel. Monotonic continuous-time random walks with drift and stochastic reset events. Phys. Rev. E, 87:012116, 2013.
[39] M. Bologna, B. J. West, and P. Grigolini. Renewal and memory origin of anomalous diffusion: A discussion of their joint action. Phys. Rev. E, 88:062106, 2013.
[40] T. Onaga and S. Shinomoto. Bursting transition in a linear self-exciting point process. Phys. Rev. E, 89:042817, 2014.
[41] S. B. Lowen and M. C. Teich. Fractal renewal processes generate 1/f noise. Phys. Rev. E, 47(2):992–1001, 1993.
[42] N. Travers and J. P. Crutchfield. Equivalence of history and generator ε-machines. 2014. SFI Working Paper 11-11-051; arxiv.org:1111.4500 [math.PR].
[43] J. P. Crutchfield and K. Young. Inferring statistical complexity. Phys. Rev. Let., 63:105–108, 1989.
[44] C. R. Shalizi and J. P. Crutchfield. Computational mechanics: Pattern and prediction, structure and simplicity. J. Stat. Phys., 104:817–879, 2001.
[45] W. Lohr. Properties of the statistical complexity functional and partially deterministic HMMs. Entropy, 11(3):385–401, 2009.
[46] S. Marzen and J. P. Crutchfield. Informational and causal architecture of discrete-time renewal processes. Entropy, 17(7):4891–4917, 2015.
[47] J. P. Crutchfield, C. J. Ellison, and J. R. Mahoney. Time's barbed arrow: Irreversibility, crypticity, and stored information. Phys. Rev. Lett., 103(9):094101, 2009.
[48] C. J. Ellison, J. R. Mahoney, and J. P. Crutchfield. Prediction, retrodiction, and the amount of information stored in the present. J. Stat. Phys., 136(6):1005–1034, 2009.
[49] S. Marzen, M. R. DeWeese, and J. P. Crutchfield. Time resolution dependence of information measures for spiking neurons: Scaling and universality. Front. Comput. Neurosci., 9:109, 2015.
[50] S. Marzen and J. P. Crutchfield. Information anatomy of stochastic equilibria. Entropy, 16(9):4713–4748, 2014.
[51] T. Graves, R. B. Gramacy, C. Franzke, and N. Watkins. A brief history of long memory. arXiv preprint arXiv:1406.6018, 2014.
[52] J. P. Crutchfield, P. Riechers, and C. J. Ellison. Exact complexity: Spectral decomposition of intrinsic computation. Submitted. Santa Fe Institute Working Paper 13-09-028; arXiv:1309.3792 [cond-mat.stat-mech].
[53] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley-Interscience, New York, second edition, 2006.
[54] T. Mora and W. Bialek. Are biological systems poised at criticality? J. Stat. Phys., 144(2):268–302, 2011.
[55] L. S. Liebovitch and J. M. Sullivan. Fractal analysis of a voltage-dependent potassium channel from cultured mouse hippocampal neurons. Biophys. J., 52(6):979–988, 1987.
[56] S. DeDeo. Collective phenomena and non-finite state computation in a human social system. PLoS One, 8(10):e75818, 2013.
[57] S. DeDeo. Group minds and the case of Wikipedia. Human Computation, 1(1):5–29, 2014.
[58] M. S. Pinsker. Information and Information Stability of Random Variables and Processes. Holden-Day, San Francisco, California, 1964.