Consistency of detrended fluctuation analysis

O. Løvsletten
Department of Mathematics and Statistics, UiT The Arctic University of Norway, Norway

The scaling function $F(s)$ in detrended fluctuation analysis (DFA) scales as $F(s)\sim s^{H}$ for stochastic processes with Hurst exponent $H$. This scaling law is proven for stationary stochastic processes with $0<H<1$ and non-stationary stochastic processes with $1<H<2$. For $H<0.5$ it is observed that, if only the asymptotic (power-law) form of the auto-correlation function (ACF) is employed, the fluctuation function scales as $\sim s^{1/2}$. It is also demonstrated that the fluctuation function in DFA is equal in expectation to (i) a weighted sum of the ACF and (ii) a weighted sum of the second-order structure function. These results enable us to compute the exact finite-size bias for signals that are scaling, and to employ DFA in a meaningful sense for signals that do not exhibit power-law statistics. The usefulness is illustrated by examples demonstrating that a previously suggested modified DFA will increase the bias for signals with Hurst exponents $H>1$. As a final application of the new developments, a new estimator $\hat{F}(s)$ is proposed which can handle missing data in regularly sampled time series without the need for interpolation schemes. Under mild regularity conditions, $\hat{F}(s)$ is equal in expectation to the fluctuation function $F(s)$ in the gap-free case.

I. INTRODUCTION

Detrended fluctuation analysis (DFA) was introduced in a study of long-range dependence in DNA sequences [1]. It has later been applied in a wide range of scientific disciplines [2]; some recent examples are found in studies of climate [3], finance [4] and medicine [5]. The most common usage of DFA is to estimate the Hurst exponent. The assumption is then that the second moment of the fluctuations of the input signal, after these have been averaged over a time scale $s$, is a power-law function of $s$.
This property is called scale invariance, or just scaling. In the context of DFA, the scaling assumption implies that the DFA fluctuation function $F(s)$ takes the form of a power law

  $E F^2(s) \sim s^{2H}$,   (1)

where $E$ denotes the expectation, i.e., the ensemble mean. Important examples of stochastic processes $X(t)$ with scaling properties are self-similar and multifractal models, see e.g. [6]. For this large class of models, the existing $q$-moments satisfy $E|X(t+t_0)-X(t_0)|^q \propto t^{\zeta(q)}$. In particular, if the variance is finite, the second moments are scaling and the Hurst exponent $H$ is defined by the relation $\zeta(2)=2H-2$. The power law of the DFA fluctuation function in this case ($1<H<2$) has been established empirically. A mathematical proof has not been published prior to this paper, except for random walks ($H=1.5$) [7, 8].

For stationary stochastic processes $X(t)$ with scaling second moments, the Hurst exponent is in the range $0<H<1$. For $H=0.5$, $X(t)$ is white noise, while $H \neq 1/2$ implies an auto-correlation function (ACF) $\rho(\tau)$ of the form [20]

  $\rho(\tau) \sim H(2H-1)\,\tau^{2H-2}$.   (2)

For $H<1/2$ the ACF is negative for all time lags $\tau \neq 0$, while for $H>1/2$ the ACF is positive. Moreover, in the persistent case ($H>1/2$), the ACF decays so slowly that the series $\sum_{\tau=-\infty}^{\infty}\rho(\tau)$ diverges.

In the case of a stationary input signal $X(t)$ with Hurst exponent $0<H<1$, Eq. (1) has been partly proven in the past. Taqqu et al. [9] constructed a proof for DFA1. DFA$m$, or DFA of order $m$, means that an $m$'th-order polynomial is applied in the DFA algorithm (Section III A). For Hurst exponents restricted to the range $0.5<H<1$, the proof has been extended to include higher-order polynomials $m \geq 1$ [8]. In this paper a new observation is made: for $0<H<0.5$, Eq. (1) is satisfied only if the exact auto-covariance function (acvf) is employed. If the asymptotic acvf is employed, then $E F^2(s) \sim s$.

For stationary signals, Höll and Kantz [8] showed that the squared DFA fluctuation function is equal in expectation to a weighted sum of the acvf $\gamma(\cdot)$:

  $E F^2(s) = \gamma(0)\, G(0,s)\, s^{-1} + 2 s^{-1} \sum_{j=1}^{s-1} G(j,s)\,\gamma(j)$,   (3)

where the weight function $G(j,s)$ will be defined in Section III. In this paper a more general result is presented:

  $E F^2(s) = -\frac{1}{s}\sum_{j=1}^{s-1} G(j,s)\, S(j)$,   (4)

where $S(t) = E[X(t+t_0)-X(t_0)]^2$, which also holds for non-stationary stochastic processes with stationary increments. The quantity $S(t)$ is known as the variogram. We note that the relationship between DFA and the power spectral density was derived, partly based on numerical calculations, by Heneghan and McDarby [10].

Eqs. (3) and (4) have applications beyond proving Eq. (1). For instance, one can compute the exact finite-size bias for scaling signals, and make meaningful use of DFA for signals that are not scaling. In Kantelhardt et al. [2], the bias of DFA for stochastic processes with Hurst exponents in the range $0.5 \leq H < 1$ was found by means of Monte Carlo simulations, using long time series of synthetically generated fractional Gaussian noises. An analytical study of the behaviour of DFA for auto-regressive processes of order one (AR(1)) was presented in Höll and Kantz [8]. In the present paper the usage of Eqs. (3) and (4) is demonstrated by simple extensions of the aforementioned examples.

An important application of the theoretical developments presented in this paper is the construction of estimators (modifications of the DFA fluctuation function) that can handle missing data in regularly sampled time series.
One simple way of handling missing data is to apply linear interpolation, random re-sampling, or mean filling. However, this will typically destroy, or add, artificial correlations in the time series under study. The effect on DFA of using these three gap-filling techniques was examined by Wilson et al. [11] for signals with Hurst exponents $0<H<1$. It was found that these interpolation schemes introduce significant deviations from the expected scaling. In contrast, the modified fluctuation functions proposed here have the property of equality in expectation to the fluctuation function in the gap-free case. For the wavelet variance, estimators that can handle missing data in a proper statistical way were presented by Mondal and Percival [12]. These wavelet variances are similar in construction to the DFA estimators presented here.

This paper is organised as follows. In Section II the definition of the Hurst exponent adopted in this paper is reviewed, and examples of stochastic processes with well-defined Hurst exponents are given. Section III presents the relationship between the DFA fluctuation function and the acvf and variogram, and the proof of Eq. (1). Examples of applications are given in Section IV: bias for scaling signals, DFA of Ornstein-Uhlenbeck processes, and modification of the DFA fluctuation function to handle missing data.

II. HURST EXPONENT

A. Definition and properties

Let $X(t)$ be a stochastic process with mean $E X(t) = 0$. If either

i) $X(t)$ is non-stationary with stationary increments and $E[X(t+t_0)-X(t_0)]^2 \propto t^{2H-2}$, or

ii) $X(t)$ is stationary and $E[Y(t+t_0)-Y(t_0)]^2 \propto t^{2H}$, where $Y(t) = \sum_{k=1}^{t} X(k)$,

then we define $H$ to be the Hurst exponent of the process $X(t)$. The Hurst exponent determines the correlation at all time scales. Assume that $X(t)$ has Hurst exponent $1<H<2$, i.e., $X(t)$ is non-stationary.
Then, from $2X(t)X(s) = X(t)^2 + X(s)^2 - \{X(t)-X(s)\}^2$ and stationarity of the increments, $E\{X(t)-X(s)\}^2 = E X(|t-s|)^2$, it follows that

  $E X(t)X(s) = \frac{\sigma^2}{2}\left( |s|^{2h} + |t|^{2h} - |t-s|^{2h} \right)$,   (5)

with $E X(1)^2 = \sigma^2$ and $h = H-1$. The increment process $\Delta X(t) = X(t) - X(t-1)$ has Hurst exponent $h$. The acvf $\gamma(\tau)$ of the increments follows from Eq. (5), and is given by

  $\gamma(\tau) = \frac{\sigma^2}{2}\left( |\tau+1|^{2h} - 2|\tau|^{2h} + |\tau-1|^{2h} \right)$.   (6)

For $h=1/2$ the increment process is white noise, while for $h \neq 1/2$ the acvf is asymptotically a power law:

  $\gamma(\tau) \sim \frac{\sigma^2}{2}\frac{d^2}{d\tau^2}\tau^{2h} = \sigma^2 h(2h-1)\,\tau^{2h-2}, \quad \tau \to \infty$.

Thus, $h \neq 1/2$ implies dependent increments. Choosing $0<h<1/2$ results in negatively correlated increments, while for $h>1/2$ the increments are persistent. Moreover, in the persistent case, the acvf decays so slowly that the series $\sum_{\tau=-\infty}^{\infty}\gamma(\tau)$ diverges.

B. Examples

A Hurst exponent in the range $1<H<2$ together with $X(t)$ Gaussian distributed defines the class of fractional Brownian motions (fBm's). The corresponding increment process is known as a fractional Gaussian noise (fGn) [13]. An fBm is an example of a self-similar process. By definition, self-similar processes $X(t)$ with self-similarity exponent $h$ satisfy the scale invariance

  $X(at) \stackrel{d}{=} M(a)\,X(t)$,   (7)

where $M(a) = a^h$ [14], and "$\stackrel{d}{=}$" denotes equality in finite-dimensional distributions. The class of log-infinitely divisible multifractal processes [15, 16] also satisfies Eq. (7), but now $M(a)$ is a random variable with an arbitrary log-infinitely divisible distribution. The scaling law Eq. (7) implies $E|X(t+t_0)-X(t_0)|^q \propto t^{\zeta(q)}$. Thus, if the second moments exist, the Hurst exponent $H$ is given by the relation $\zeta(2) = 2H-2$. These examples are summarized in Table I.
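These asymptotics are easy to check numerically. The following is a minimal sketch (Python with NumPy; not part of the original analysis, and the helper name `fgn_acvf`, the choices of $h$ and the lag ranges are ours), with $\sigma^2 = 1$:

```python
import numpy as np

def fgn_acvf(tau, h):
    """Exact increment acvf, Eq. (6), with sigma^2 = 1:
    gamma(tau) = (1/2)(|tau+1|^{2h} - 2|tau|^{2h} + |tau-1|^{2h})."""
    tau = np.abs(np.asarray(tau, dtype=float))
    return 0.5 * (np.abs(tau + 1) ** (2 * h)
                  - 2.0 * tau ** (2 * h)
                  + np.abs(tau - 1) ** (2 * h))

# persistent case: exact acvf approaches the power law h(2h-1) tau^{2h-2}
h, tau = 0.7, 1000.0
asym = h * (2 * h - 1) * tau ** (2 * h - 2)
assert abs(fgn_acvf(tau, h) / asym - 1) < 1e-3     # agreement to O(tau^{-2})

# anti-persistent case: the second differences telescope, so the partial sum
# over |tau| <= T equals (T+1)^{2h} - T^{2h}, which vanishes for h < 1/2
h, T = 0.3, 10 ** 5
assert abs(fgn_acvf(np.arange(-T, T + 1), h).sum()) < 0.01
```

The telescoping identity in the second check is the same cancellation that is exploited later in the proof of the scaling law for $0<H<1/2$.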
We emphasize that neither multifractality nor self-similarity is needed to have a process with a well-defined Hurst exponent. An example is the class of smoothly truncated Lévy flights (STLF's) [17]. For STLF's all moments exist, and the property of stationary and independent increments implies a Hurst exponent $H=1.5$. The STLF behaves like a Lévy flight on small time scales, while on long time scales the statistics are close to Brownian motion [18]. Thus it is clearly neither self-similar nor multifractal, as proven in [19].

TABLE I. Examples of stochastic processes with well-defined Hurst exponents $H$. Finite variance is assumed in all examples.

  Stochastic process            | Hurst exponent
  White noise                   | $H = 1/2$
  Random walks                  | $H = 3/2$
  Fractional Gaussian noise     | $0 < H < 1$
  Fractional Brownian motion    | $1 < H < 2$
  $h$-self-similar processes    | $H = h + 1$
  Scaling function $\zeta(q)$   | $H = \zeta(2)/2 + 1$

III. DETRENDED FLUCTUATION ANALYSIS

A. DFA algorithm

Let $X(1), X(2), \ldots, X(n)$ be the input to DFA. The first step in DFA is to construct the profile $Y(t) = \sum_{k=1}^{t} X(k)$. For a given scale $s$ one considers windows of length $s$. In each window a polynomial of degree $m$ is fitted to the profile. Subtracting the fitted polynomial from the profile gives a set of residuals, from which the variance is computed. We denote by $F_t^2(s)$ this residual variance. The squared fluctuation function $F^2$ is the average of $F_t^2$. To express the residual variance mathematically, we introduce some notation. Define the vector $\mathbf{Y}(t) = [Y(t+1), Y(t+2), \ldots, Y(t+s)]^T$. Let $B$ be the $(m+1)\times s$ design matrix in the ordinary least squares (OLS) regression; that is, row $k$ of $B$ is the vector $(1^{k-1}, 2^{k-1}, \ldots, s^{k-1})$. Define

  $Q = B^T \left( B B^T \right)^{-1} B$,   (8)

which is known as the hat matrix in statistics.
The residual variance is given by

  $F_t^2(s) = \frac{1}{s}\,\mathbf{Y}(t)^T (I - Q)\,\mathbf{Y}(t)$,   (9)

where $I$ is the $(s \times s)$ identity matrix.

B. How DFA relates to variogram and acvf

It is convenient to express the squared fluctuation function explicitly in terms of the input series. Let $\mathbf{X}(t) = [X(t+1), X(t+2), \ldots, X(t+s)]^T$. We define the $s \times s$ matrix $D$ by letting element $(i,j)$ of $D$ be unity if $i \geq j$ and zero otherwise. Left-multiplying $\mathbf{X}(t)$ by $D$ gives the vector of cumulative sums $(X(t+1),\, X(t+1)+X(t+2),\, \ldots,\, \sum_{k=1}^{s} X(t+k))$. Define $A = D^T (I-Q) D$, and let $a_{k,j}$ be element $(k,j)$ of the matrix $A$. The fluctuation function can be written

  $F_t^2(s) = \frac{1}{s}\,\mathbf{X}(t)^T A\, \mathbf{X}(t) = \frac{1}{s}\sum_{k=1}^{s}\sum_{j=1}^{s} a_{k,j}\, X(t+k) X(t+j)$.   (10)

In the definition of DFA the profile is constructed for the entire time series prior to windowing. Eq. (10) states that constructing the profile in each window gives the same residual variance (squared fluctuation function). Another form of the residual variance is

  $F_t^2(s) = -\frac{1}{2s}\sum_{k=1,j=1}^{s} a_{k,j}\, [X(t+k) - X(t+j)]^2$.   (11)

The proofs of Eqs. (10) and (11) can be found in Appendix A.

FIG. 1. The map $j \mapsto G(j,100)$ for DFA2 (top) and DFA5 (bottom). The weight function $G(j,s)$ is defined in the text.

In the sequel we make the assumption that $X(t) = T(t) + Z(t)$, where $Z(t)$ is a stochastic process with mean zero and acvf $\gamma(t,s)$. The deterministic part $T(t)$ of the input signal is a trend modeled as a polynomial of order $q$ less than the order $m$ of DFA (see Appendix A for the precise form of the trends). For simplicity we consider in this section the case $T(t) = 0$, and postpone to Appendix A the proof that the results are valid also for trends with $q < m$.
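The equivalence of Eqs. (9), (10) and (11), and the fact that $(I-Q)$ annihilates the constant profile offset between windows, can be verified directly. A minimal sketch (Python with NumPy; the helper name `dfa_matrices` and the choices $s=32$, $m=2$ are ours, for illustration only):

```python
import numpy as np

def dfa_matrices(s, m):
    """Hat matrix Q (Eq. 8) and weight matrix A = D^T (I - Q) D for
    window length s and detrending order m."""
    t = np.arange(1, s + 1, dtype=float)
    B = np.vstack([t ** k for k in range(m + 1)])      # (m+1) x s design
    Q = B.T @ np.linalg.inv(B @ B.T) @ B               # hat matrix
    D = np.tril(np.ones((s, s)))                       # cumulative-sum operator
    return Q, D.T @ (np.eye(s) - Q) @ D

s, m = 32, 2
x = np.random.default_rng(1).standard_normal(s)
Q, A = dfa_matrices(s, m)

y = np.cumsum(x)                                           # in-window profile
f9  = y @ (np.eye(s) - Q) @ y / s                          # Eq. (9)
f10 = x @ A @ x / s                                        # Eq. (10)
f11 = -np.sum(A * (x[:, None] - x[None, :]) ** 2) / (2*s)  # Eq. (11)
assert np.allclose([f9, f10], [f10, f11])

# the profile offset between windows is irrelevant: (I - Q) kills constants,
# since the all-ones vector is row 1 of B
y2 = y + 123.0
assert np.allclose(y2 @ (np.eye(s) - Q) @ y2 / s, f9)
```

The last assertion is the content of the (9) ⇔ (10) proof in Appendix A: the constant $Y(t)$ shared by all entries of $\mathbf{Y}(t)$ lies in the row space of $B$.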
By applying the expectation operator to Eq. (10), it is seen that

  $E F_t^2(s) = \frac{1}{s}\sum_{k=1}^{s}\sum_{j=1}^{s} a_{k,j}\,\gamma(t+k,\, t+j)$.   (12)

If we add the further restriction of stationarity of the process $X(t)$, Eq. (12) simplifies to Eq. (3), with $\gamma(t) = \gamma(0,t)$ and

  $G(j,s) = \sum_{k=1}^{s-j} a_{k,k+j}$.   (13)

While Eq. (12) appears to be time-dependent when $X(t)$ is non-stationary with stationary increments, this is not the case. To establish that $E F_t^2(s)$ does not depend on the window $t$, one can apply the expectation operator to Eq. (11). The result is Eq. (4), with $G(j,s)$ the weight function defined in Eq. (13). Since $E F_t^2(s)$ does not depend on the window $t$, it follows that $E F_t^2(s) = E F^2(s)$.

The weight functions $G(j,s)$ can be computed exactly; in this work this has been done by means of Mathematica. The weight functions for DFA1 and DFA2 are listed in Table II, while the maps $j \mapsto G(j,100)$ for DFA2 and DFA5 are shown in Fig. 1.

TABLE II. The weight function $G(j,s)$ for DFA1 and DFA2.

  $m=1$: $G(j,s) = \dfrac{(j-s-1)(j-s)(j-s+1)\left( 3j^2 + 9js - 2s^2 + 8 \right)}{30\, s \left( s^2 - 1 \right)}$

  $m=2$: $G(j,s) = -\dfrac{(j-s-1)(j-s)(j-s+1)\left( 10j^4 + 30j^3 s + 2j^2\left( 9s^2 + 19 \right) + 2js\left( 67 - 13s^2 \right) + 3\left( s^4 - 13s^2 + 36 \right) \right)}{70\, s \left( s^4 - 5s^2 + 4 \right)}$

TABLE III. The coefficients $\{d_q\}$, $q = 0, 1, \ldots, 2m+3$, for DFA of order $m = 1, 2, \ldots, 6$.

  $m=1$: $1/15,\ -1/2,\ 1,\ -2/3,\ 0,\ 1/10$
  $m=2$: $3/70,\ -1/2,\ 3/2,\ -3/2,\ 0,\ 3/5,\ 0,\ -1/7$
  $m=3$: $2/63,\ -1/2,\ 2,\ -8/3,\ 0,\ 2,\ 0,\ -8/7,\ 0,\ 5/18$
  $m=4$: $5/198,\ -1/2,\ 5/2,\ -25/6,\ 0,\ 5,\ 0,\ -5,\ 0,\ 25/9,\ 0,\ -7/11$
  $m=5$: $3/143,\ -1/2,\ 3,\ -6,\ 0,\ 21/2,\ 0,\ -16,\ 0,\ 15,\ 0,\ -84/11,\ 0,\ 21/13$
  $m=6$: $7/390,\ -1/2,\ 7/2,\ -49/6,\ 0,\ 98/5,\ 0,\ -42,\ 0,\ 175/3,\ 0,\ -49,\ 0,\ 294/13,\ 0,\ -22/5$

C. Proof of DFA scaling

We are now in a position to prove

  $E F^2(s) \sim \lambda_{m,H}\, s^{2H}$,   (14)

for input signals $X(t)$ with Hurst exponent $H \in (0,1) \cup (1,2)$. We assume $E X(t)^2 = 1$.
In the appendix we derive the asymptotic weight function $G_{\mathrm{asym}}(j,s) \sim G(j,s)$, which takes the form

  $G_{\mathrm{asym}}(j,s) = \begin{cases} \sum_{q=0}^{2m+3} s^{2-q} j^q\, d_q & \text{if } j > 0, \\ d_0\, s^2 & \text{if } j = 0. \end{cases}$

Expressions for the coefficients $\{d_q\}$ can be found in Appendix B. The values of $\{d_q\}$ for orders $m \leq 6$ are listed in Table III.

For $H>1$, using Eq. (4) and the asymptotic weight function yields the asserted scaling:

  $E F^2(s) \sim -\sum_{q=0}^{2m+3} s^{1-q} d_q \sum_{j=1}^{s-1} j^{q+2H-2} \sim -s^{2H}\sum_{q=0}^{2m+3}\frac{d_q}{q+2H-1}$.

In the stationary case $0<H<1$, using Eq. (3), we find

  $E F^2(s) \sim s\, d_0\, \gamma(0) + 2\sum_{q=0}^{2m+3} s^{1-q} d_q \sum_{j=1}^{s-1} j^q\, \gamma(j)$.   (15)

White noise ($H=1/2$) is trivial: $E F^2(s) \sim s\, d_0$. For $1/2<H<1$, the second term in Eq. (15) dominates the first. Coupled with the asymptotic form of the acvf, we have

  $E F^2(s) \sim 2H(2H-1)\sum_{q=0}^{2m+3} s^{1-q} d_q \sum_{j=1}^{s-1} j^{q+2H-2} \sim s^{2H}\, 2H(2H-1)\sum_{q=0}^{2m+3}\frac{d_q}{q+2H-1}$.

For $0<H<1/2$ the leading terms in Eq. (15) scale as $s$, but these leading terms cancel and we end up with the same formula as above. To see this, denote by $\rho(\tau)$ the auto-correlation function (ACF), i.e., $\rho(\tau) = \gamma(\tau)/\gamma(0)$. It is well known that in this case $\sum_{\tau=-\infty}^{\infty}\rho(\tau) = 0$ (e.g., [20]). Since $\rho(0) = 1$ and the ACF is a symmetric function, we have $-\gamma(0)/2 = \sum_{\tau=1}^{\infty}\gamma(\tau)$. Thus

  $E F^2(s) \sim -2 d_0\, s \sum_{j=s}^{\infty}\gamma(j) + 2\sum_{q=1}^{2m+3} s^{1-q} d_q \sum_{j=1}^{s-1} j^q\, \gamma(j)$
  $\quad \sim -2 d_0\, s\, H(2H-1)\sum_{j=s}^{\infty} j^{2H-2} + 2H(2H-1)\sum_{q=1}^{2m+3} s^{1-q} d_q \sum_{j=1}^{s-1} j^{q+2H-2}$
  $\quad \sim s^{2H}\, 2H(2H-1)\sum_{q=0}^{2m+3}\frac{d_q}{q+2H-1}$.

IV. APPLICATIONS

A. Bias for scaling signals

In Kantelhardt et al. [2] the bias (of the DFA fluctuation function) for Hurst exponents $H = 0.5, 0.65, 0.9$ was found by means of Monte Carlo simulations.
From this bias they proposed the modified DFA fluctuation function

  $F_{\mathrm{mod}}^2(s) = \frac{F^2(s)}{K^2(s)}$,   (16)

with

  $K^2(s) = \frac{E F^2(s)\, \tau^{2H}}{E F^2(\tau)\, s^{2H}}$.

If we assume $\tau$ is large, such that $E F^2(\tau) = \lambda_{m,H}\,\tau^{2H}$ holds (approximately), then

  $K^2(s) = \frac{E F^2(s)}{\lambda_{m,H}\, s^{2H}}$,   (17)

which implies $E F_{\mathrm{mod}}^2(s) = \lambda_{m,H}\, s^{2H}$.

Eq. (3) can be used to compute the bias, i.e., the difference between the fluctuation function and the asymptotic scaling, for signals with Hurst exponents $0<H<1$. An example is shown in Fig. 2a, where we have used $H = 0.9$. This gives similar results as provided by Kantelhardt et al. [2] (see their Fig. 2a), since the only difference between the analytical and Monte Carlo methods is the negligible error caused by finite sample length in the latter.

FIG. 2. Detrended fluctuation analysis for input signals with Hurst exponent (a) $H=0.9$ and (b) $H=1.1$. In both panels the graphs from bottom to top correspond to DFA of increasing order $m$, from $m=1$ (bottom) to $m=6$ (top). Dashed lines show the asymptotic scaling $\lambda_{m,H}^{1/2}\, s^H$ (see text). The squared fluctuation functions have been shifted by factors $10^{m-1}$.

Eq. (4) can also be used to compute the bias for signals with Hurst exponents $1<H<2$. Fluctuation functions with the corresponding asymptotic scaling for $H=1.1$ are shown in Fig. 2b. The correction functions Eq. (17) for $H=0.9$ and $H=1.1$ are shown in Fig. 3. A practical problem is that $K(s)$ depends on the (unknown) Hurst exponent. In [2] this dependence was found to be weak for $H = 0.5, 0.65, 0.9$. Based on this finding the authors suggested to use Eq. (16) with the correction function for $H = 0.5$.
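Under the assumption of an fGn input with unit variance, Eq. (3) gives $K^2(s)$ exactly, with no Monte Carlo step. A sketch (Python with NumPy; the helper names are ours, $\lambda_{m,H}$ is the sum derived in Section III C, and $\{d_q\}$ for DFA1 are taken from Table III):

```python
import numpy as np

def weights_G(s, m):
    """G(j,s) = sum of the j'th superdiagonal of A = D^T (I - Q) D, Eq. (13)."""
    t = np.arange(1, s + 1, dtype=float)
    B = np.vstack([t ** k for k in range(m + 1)])
    Q = B.T @ np.linalg.inv(B @ B.T) @ B
    D = np.tril(np.ones((s, s)))
    A = D.T @ (np.eye(s) - Q) @ D
    return np.array([np.trace(A, offset=j) for j in range(s)])

def fgn_acvf(tau, H):
    tau = np.abs(np.asarray(tau, dtype=float))
    return 0.5 * (np.abs(tau+1)**(2*H) - 2*tau**(2*H) + np.abs(tau-1)**(2*H))

def EF2(s, m, H):
    """Exact expected squared fluctuation for an fGn input, Eq. (3)."""
    G, g = weights_G(s, m), fgn_acvf(np.arange(s), H)
    return g[0] * G[0] / s + 2.0 / s * np.sum(G[1:] * g[1:])

# white noise (H = 1/2): Table II gives E F^2(s) = (s^2 - 4)/(15 s) for DFA1
assert np.isclose(EF2(50, 1, 0.5), (50**2 - 4) / (15 * 50))

H, m = 0.9, 1
d = [1/15, -1/2, 1, -2/3, 0, 1/10]                 # {d_q} for DFA1, Table III
lam = 2*H*(2*H - 1) * sum(dq / (q + 2*H - 1) for q, dq in enumerate(d))
K2 = lambda s: EF2(s, m, H) / (lam * s**(2*H))      # Eq. (17)
assert abs(K2(1000) - 1) < abs(K2(10) - 1)          # bias shrinks with s
```

The first assertion is an exact finite-size result that follows from evaluating the DFA1 weight function of Table II at $j=0$; the second illustrates that the correction function $K(s)$ approaches unity as $s$ grows.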
While using this modified DFA will improve the scaling for $H<1$, it will actually increase the bias for signals with Hurst exponents $H>1$. For $H=0.9$ and $H=1.1$ this can be seen from Figs. 2 and 3, where we observe that the bias has different signs.

FIG. 3. The correction function $K(s)$ for input signals with Hurst exponent (a) $H=0.9$ and (b) $H=1.1$. In both panels the graphs from bottom to top correspond to DFA of decreasing order $m$, from $m=6$ (bottom) to $m=1$ (top). The correction functions have been shifted by factors $1.3^{6-m}$.

B. Ornstein-Uhlenbeck processes

Another application is to study the behaviour of DFA for signals that are not scaling. Here we consider the class of Ornstein-Uhlenbeck (OU) processes. An OU process is the solution to the Langevin equation

  $dX(t) = -\frac{1}{\tau} X(t)\, dt + \sigma\, dB(t)$,   (18)

where $B(t)$ is a standard ($E B(1)^2 = 1$) Brownian motion, $\sigma > 0$ is a scale parameter, and $\tau > 0$ is the characteristic correlation time. We choose the initial condition such that $X(t)$ is stationary. This implies that the auto-covariance function takes the form

  $E X(t)X(s) = \frac{\tau\sigma^2}{2}\exp\left( -|t-s|/\tau \right)$.   (19)

Again, we can use Eq. (3) to compute the expected value of the squared DFA fluctuation. An example is shown in Fig. 4.

FIG. 4. Detrended fluctuation analysis of order 1 for an Ornstein-Uhlenbeck process with characteristic correlation time $\tau = 20$ (blue curve). The red points are DFA1 for random walks (Hurst exponent $H = 1.5$).

While OU processes do not have well-defined Hurst exponents as defined in Section II, the second moment scales asymptotically: on time scales long compared to $\tau$, $X(t)$ is white noise, while on short time scales $X(t)$ converges to a Brownian motion. Thus, for the DFA fluctuation function we should expect a scaling exponent close to $H = 0.5$ on long time scales. It is seen in Fig. 4 that this holds. On small time scales there exists bias in DFA for signals that exhibit scaling behavior; relevant here is the bias for random walks ($H = 1.5$). In Fig. 4 it is seen that the OU DFA fluctuation function, with $\tau = 20$, is consistent with random walks on small time scales. We note that an AR(1) process is a discretised OU process, and more results on the AR(1) DFA fluctuation function can be found in [8].

C. Missing data

Based on Eq. (11) we can modify DFA to handle missing data. Define $\delta(t)$ to be zero if $X(t)$ is missing and unity otherwise. We make the assumption that at least one product $X(t+k)X(t+j)$ is non-missing. A sufficient, but not necessary, condition for this to hold is that at least one window contains no gaps. Let

  $p_{k,j} = \frac{\#\text{ of windows}}{\#\text{ of non-missing products } X(t+k)X(t+j)}$.

We propose the estimator

  $\hat{F}_t^2(s) = -\frac{1}{2s}\sum_{k=1,j=1}^{s} p_{k,j}\, a_{k,j}\, [X(t+k) - X(t+j)]^2\, \delta(t+k)\,\delta(t+j)$.   (20)

We define $\hat{F}^2(s)$ to be the average of $\hat{F}_t^2(s)$ (averaging over the different windows $t$ used). Without missing data the modified fluctuation function $\hat{F}(s)$ is the same as the fluctuation function $F(s)$ in the gap-free case. For a time series with gaps, $\hat{F}(s)$ is equal in expectation to $F(s)$. The equality $E \hat{F}^2(s) = E F^2(s)$ holds if the input signal is stationary or non-stationary with stationary increments: applying the expectation operator to Eq. (20) we have

  $E \hat{F}_t^2(s) = -\frac{1}{2s}\sum_{k=1,j=1}^{s} p_{k,j}\, a_{k,j}\, S(|k-j|)\, \delta(t+k)\,\delta(t+j)$,

and since at least one $\delta(t+k)\delta(t+j)$ is assumed non-zero, the equality $E \hat{F}^2(s) = E F^2(s)$ follows. Whereas the ordinary fluctuation function is always non-negative, the modified fluctuation function can become negative. Practically, one can resolve this problem by letting the fluctuation function be undefined if negative values occur.

Examples of time series with gaps are some of the regional temperatures analyzed in Løvsletten and Rypdal [3]. We use one of these time series, from the HadCRUT4 data product [21], to demonstrate the usage of the modified DFA to handle missing data. The chosen time series is the surface temperature in the tropical Pacific, shown in Fig. 5a. The modified DFA of this series is presented in Fig. 5b. In this work we use non-overlapping windows starting from the left, and (modified) DFA of order $m = 2$. We observe that the fluctuation function is rather poorly approximated by a power law. This is an expected result, since in [3] we showed that an AR(1) model is significantly better than an fGn (power-law) model for this temperature series, due to the influence of the El Niño Southern Oscillation (see discussion in [3]).

Using the same gap-sequence as in Fig. 5a, we compare the modified DFA with the ordinary DFA by computing the fluctuation function from an ensemble of 500 fGn's with Hurst exponent $H = 0.7$ and sample size $n = 1368$ (the same sample size as in Fig. 5a). For each ensemble member, data points are omitted to construct the same gap sequence as in Fig. 5a, and then the modified DFA fluctuation function is computed. In Fig. 5c the results are presented in the form of ensemble means and 90% confidence intervals.
The ensemble means are visually indistinguishable for the modified and ordinary fluctuation functions, while the uncertainty of the former is increased due to the gaps. The increased uncertainty is also seen in the estimated Hurst exponents, see Fig. 5e. Here, the estimators are the slopes from linear regression of $\log F(s)$ (and $\log \hat{F}(s)$) against $\log s$. The same Monte Carlo experiment, using an ensemble of fBm's with Hurst exponent $H = 1.1$, yields similar results, which are presented in Figs. 5d and 5f.

FIG. 5. a) Monthly reconstructed temperature for the $5^{\circ}\times 5^{\circ}$ grid cell centered at 7.5$^{\circ}$S, 172.5$^{\circ}$W. b) Modified detrended fluctuation analysis of the time series in (a). (c-f) Results of the Monte Carlo study of DFA explained in Section IV C. In (c-d) the black points are the ensemble means of the modified and ordinary fluctuation functions, light blue is the 90% confidence interval for the ordinary fluctuation function, and light plus dark blue is the 90% confidence interval for the modified fluctuation function. (e-f) Blue plus gray is the probability density function (PDF) of the ensembles of estimated Hurst exponents according to the modified DFA, while yellow plus gray is the PDF according to the ordinary DFA.

An alternative estimator, based on Eq. (10), is

  $\tilde{F}_t^2(s) = \frac{1}{s}\sum_{k=1,j=1}^{s} p_{k,j}\, a_{k,j}\, X(t+k)X(t+j)\, \delta(t+k)\,\delta(t+j)$,   (21)

with $\tilde{F}^2(s)$ defined as the average of $\tilde{F}_t^2(s)$. It is straightforward to verify that $E \tilde{F}^2(s) = E F^2(s)$ for stationary input signals.
How ever, for input signals that are non-stationary with stationary incremen ts, ˜ F ( s ) does not ha ve the desirable prop erty of equalit y in exp ectation to the fluctuation function F ( s ) in the gap-free case. As an example, consider a input signal with Hurst exp onent 1 < H < 2. In the gap-free case, the time-dep endent part of the exp ected squared fluctuation v anishes, i.e., E F 2 t ( s ) = 1 s s X k =1 s X j =1 a k,j ( | t + k | 2 h + | t + j | 2 h ) = 0 , (see Eq. (A1) in the app endix). F or ˜ F 2 t ( s ) the time- dep enden t part will not, in general, v anish. This is due 8 to the additional m ultiplicative factors p k,j δ ( t + k ) δ ( t + j ) in Eq. (21). V. CONCLUDING REMARKS In this pap er, several new prop ositions for DF A hav e b een formulated and pro ven. These include the relation- ship b etw een the DF A fluctuation function and the acvf and v ariogram, derived from the sample forms Eqs. (10) and (11). The results w ere deriv ed under the assump- tion that the input signal in DF A is either stationary or non-stationary with stationary increments, or one of these sup erp osed on a p olynomial trend of order less than the order of DF A (results on trends not accounted for b y DF A can b e found in Hu et al. [22] and Kan telhardt et al. [2]). F or these classes of input signals the present pap er has established that the residual v ariance in dif- feren t windows are equal in exp ectation. The p ow er-la w scaling of the DF A fluctuation function has b een rigor- ously pro ven for stochastic processes with Hurst exp o- nen ts H ∈ { (0 , 1) ∪ (1 , 2) } . It has also been demonstrated for these classes of sig- nals that the new developmen ts of the DF A metho d can b e used to compute analytically the bias of the DF A- estimate. F or this purp ose we used the weigh t functions and asymptotic w eight functions. The Mathematica code for these functions are found as supplementary material to this article. 
Therein, it is demonstrated that the modified fluctuation function proposed in Kantelhardt et al. [2] degrades the scaling property for scaling input signals with Hurst exponent greater than unity.

From an applied physics point of view, the most useful result of this study may be the method for handling missing data in DFA proposed in Eq. (20). For ensemble averages, the modified DFA fluctuation function with missing data is the same as the DFA fluctuation function without missing data. Some of the theory presented in this paper is probably a suitable starting point for proving the correctness of the multifractal DFA introduced by Kantelhardt et al. [23], as well as for deriving the variance and limiting distribution of the DFA fluctuation function.

ACKNOWLEDGMENTS

This work has received support from the Norwegian Research Council under contract 229754/E10. The author thanks K. Rypdal and M. Rypdal for useful discussions and language editing.

Appendix A: Simple proofs

Recall the definitions of the weight matrix $A = D^T(I-Q)D$ and the hat matrix $Q = B^T\left( B B^T \right)^{-1} B$. Since $Q$ is a projection matrix, vectors $v$ that are in the row space of $B$ are mapped to themselves, i.e., $Qv = v$, and thus $(I-Q)v = 0$.

(9) ⇔ (10): Let $\mathbf{1}$ be an $(s\times 1)$ vector of ones. For $t>0$: $\mathbf{Y}(t) = D\,\mathbf{X}(t) + \mathbf{1}\, Y(t)$. The proof is completed by noting that $\mathbf{1}$ is in the row space of $B$.

(10) ⇔ (11): This equality holds if

  $\sum_{k=1,j=1}^{s} a_{k,j}\left( X(t+k)^2 + X(t+j)^2 \right) = 0$.   (A1)

Fix $k$ and consider the sum $\sum_{j=1}^{s} a_{k,j}\, X(t+k)^2$, which is element $k$ of the vector $A\mathbf{1}\, X(t+k)^2$. We have $D\mathbf{1} = (1, 2, \ldots, s)^T$, which is in the row space of $B$. Thus $A\mathbf{1}\, X(t+k)^2 = 0$. Since this holds for all $k = 1, 2, \ldots, s$, we can conclude that $\sum_{k=1,j=1}^{s} a_{k,j}\, X(t+k)^2 = 0$. Since $A$ is a symmetric matrix, we also have $\sum_{k=1,j=1}^{s} a_{k,j}\, X(t+j)^2 = 0$.
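The projection identities used above can be illustrated numerically. A minimal sketch (Python with NumPy; the sizes $s=20$, $m=2$ and the example polynomials are arbitrary illustration choices, and the tolerances allow for the conditioning of $BB^T$):

```python
import numpy as np

s, m = 20, 2
t = np.arange(1, s + 1, dtype=float)
B = np.vstack([t ** k for k in range(m + 1)])
Q = B.T @ np.linalg.inv(B @ B.T) @ B
D = np.tril(np.ones((s, s)))
A = D.T @ (np.eye(s) - Q) @ D

# Q is a projection onto the row space of B: Q v = v there
v = 3 * t**2 - t + 4                       # degree <= m polynomial in t
assert np.allclose(Q @ v, v, atol=1e-4)

# D 1 = (1, 2, ..., s)^T lies in the row space of B, hence A 1 = 0,
# which gives Eq. (A1): sum_{k,j} a_{k,j} (x_k^2 + x_j^2) = 0
assert np.allclose(A @ np.ones(s), 0, atol=1e-4)

# a polynomial trend of order q < m leaves the residual variance unchanged:
# the cumulative sum of a degree-q polynomial has degree q+1 <= m
x = np.random.default_rng(2).standard_normal(s)
trend = 0.5 + 0.1 * t                      # q = 1 < m = 2
f2 = lambda u: np.cumsum(u) @ (np.eye(s) - Q) @ np.cumsum(u) / s
assert np.allclose(f2(x + trend), f2(x))
```

The last check is the content of the trend argument below: the trend contribution to the profile is annihilated by $(I-Q)$, so Eqs. (12) and (4) are unaffected.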
In the sequel it is shown that the relationships between the DFA fluctuation function and the acvf and variogram, Eqs. (12) and (4), and the power-law scaling of the DFA fluctuation function, Eq. (14), remain valid when a polynomial trend is superposed on the signal. Let $Z(t)$ be a stochastic process with mean zero and acvf $\gamma(t,s)$. It is assumed that $Z(t)$ is either stationary or non-stationary with stationary increments. Define

  $T(t) = \beta_0 + \beta_1 t + \ldots + \beta_q t^q, \quad t = 1, \ldots, n,$

where $q$ is an integer in the range $0 \leq q \leq m-1$, and let $X(t) = T(t) + Z(t)$. By Eq. (10) we have

  $F_t^2(s) = \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\, Z(t+k)Z(t+j) + \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\, Z(t+k)T(t+j) + \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\, T(t+k)Z(t+j) + \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\, T(t+k)T(t+j)$.   (A2)

Since $E Z(t) = 0$, the middle terms vanish in expectation, and thus

  $E F_t^2(s) = \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\,\gamma(t+k,\, t+j) + \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\, T(t+k)T(t+j)$.   (A3)

We have

  $\sum_{k=1,j=1}^{s} a_{k,j}\, T(t+k)T(t+j) = \mathbf{T}(t)^T A\, \mathbf{T}(t)$,   (A4)

where $\mathbf{T}(t) = [T(t+1), \ldots, T(t+s)]^T$. One can use the formulas for sums of powers, e.g. [24], to verify that $D\,\mathbf{T}(t)$ is in the row space of $B$. Hence $\sum_{k=1,j=1}^{s} a_{k,j}\, T(t+k)T(t+j) = 0$, and thus

  $E F_t^2(s) = \frac{1}{s}\sum_{k=1,j=1}^{s} a_{k,j}\,\gamma(t+k,\, t+j)$.

Appendix B: Asymptotic weight function

The weight matrix can be written $A = D^T D - D^T Q D$, where $Q$ is the hat matrix defined in Eq. (8). Element $(i,j)$ of $D$ is one if $i \geq j$ and zero otherwise. Thus, element $(i,j)$ of the first matrix product is $(D^T D)_{i,j} = s + 1 - \max\{i,j\}$. Summing the $j$'th (sub-)diagonal yields

  $\sum_{k=1}^{s-j}(D^T D)_{k,k+j} = \sum_{k=1}^{s-j}\left( s + 1 - (k+j) \right) = s^2/2 - sj + s/2 + j^2/2 - j/2 \sim s^2/2 - sj + j^2/2$.   (B1)

Computing the term $D^T Q D$ is more tedious, but straightforward.
The starting point is the hat matrix $Q$. Denote by $(B B^T)^{-1}_{i,j}$ element $(i, j)$ of the inverse of $B B^T$. By observing that column $j$ of $B$ is $(j^0, j^1, \ldots, j^m)$, it is seen that

$$Q_{p,q} = \sum_{d=1,l=1}^{m+1} p^{d-1} q^{l-1} (B B^T)^{-1}_{d,l}.$$

Hence

$$[D^T Q D]_{i_1,i_2} = \sum_{k_1=i_1}^{s} \sum_{k_2=i_2}^{s} Q_{k_1,k_2} = \sum_{k_1=i_1}^{s} \sum_{k_2=i_2}^{s} \sum_{d=1,l=1}^{m+1} k_1^{d-1} k_2^{l-1} (B B^T)^{-1}_{d,l} \sim \sum_{d=1,l=1}^{m+1} \frac{(s^d - i_1^d)(s^l - i_2^l)}{dl} (B B^T)^{-1}_{d,l}. \qquad (B2)$$

Using the asymptotic expression of $B B^T$,

$$(B B^T)_{i,j} = \sum_{t=1}^{s} t^{i+j-2} \sim \frac{s^{i+j-1}}{i+j-1},$$

one can use the definition of the inverse matrix to verify that

$$(B B^T)^{-1}_{d,l} \sim \tilde{c}_{d,l} / s^{d+l-1}. \qquad (B3)$$

Inserting Eq. (B3) into Eq. (B2) yields

$$[D^T Q D]_{i_1,i_2} \sim \sum_{d=1,l=1}^{m+1} (s^d - i_1^d)(s^l - i_2^l)\, c_{d,l} / s^{d+l-1},$$

where we have defined $c_{d,l} = \tilde{c}_{d,l} / (dl)$. Summing the $j$'th (sub)-diagonal yields

$$\sum_{k=1}^{s-j} (D^T Q D)_{k,k+j} \sim \sum_{k=1}^{s-j} \sum_{d=1,l=1}^{m+1} (s^d - k^d) \bigl( s^l - (k+j)^l \bigr)\, c_{d,l} / s^{d+l-1}$$

$$\sim \sum_{d=1,l=1}^{m+1} c_{d,l} \left( s^2 - sj - \frac{s^2}{l+1} + \frac{s^{1-l} j^{l+1}}{l+1} \right) \qquad (B4)$$

$$- \sum_{d=1,l=1}^{m+1} c_{d,l}\, \frac{s^{-d+1} (s-j)^{d+1}}{d+1} \qquad (B5)$$

$$+ \sum_{d=1,l=1}^{m+1} c_{d,l} \sum_{r=0}^{l} \binom{l}{r} \frac{(s-j)^{d+l+1-r} j^r}{(d+l+1-r)\, s^{d+l-1}} \qquad (B6)$$

$$= \sum_{q=0}^{2m+3} s^{2-q} j^q\, b_q. \qquad (B7)$$

The terms (B4)–(B6) can be written

$$\sum_{q=0}^{2m+3} s^{2-q} j^q\, b_q^{(k)}, \qquad k = 1, 2, 3,$$

respectively. This implies the equality in Eq. (B7), with $b_q = b_q^{(1)} + b_q^{(2)} + b_q^{(3)}$. The coefficients $b_q^{(k)}$, found by re-organising terms, are given by:

$$b_q^{(1)} = \begin{cases} \sum_{d=1,l=1}^{m+1} c_{d,l} \left( 1 - \frac{1}{l+1} \right) & \text{if } q = 0, \\ -\sum_{d=1,l=1}^{m+1} c_{d,l} & \text{if } q = 1, \\ \frac{1}{q} \sum_{d=1}^{m+1} c_{d,q-1} & \text{if } 2 \le q \le m+2, \\ 0 & \text{if } q > m+2. \end{cases}$$

$$b_q^{(2)} = \begin{cases} -\sum_{d=1,l=1}^{m+1} \frac{c_{d,l}}{d+1} & \text{if } q = 0, \\ \sum_{d=1,l=1}^{m+1} \frac{c_{d,l}}{d+1} \binom{d+1}{d} & \text{if } q = 1, \\ (-1)^{q-1} \sum_{d=q-1,l=1}^{m+1} \frac{c_{d,l}}{d+1} \binom{d+1}{d+1-q} & \text{if } 2 \le q \le m+2, \\ 0 & \text{if } q > m+2. \end{cases}$$
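The key asymptotic step, Eq. (B3), states that $s^{d+l-1} (B B^T)^{-1}_{d,l}$ approaches a constant matrix $\tilde{c}$ as $s \to \infty$. This is simple to check numerically; the sketch below (function name and tolerances are my own, linear detrending $m = 1$ assumed) compares the rescaled inverse at two window sizes:

```python
import numpy as np

def scaled_inverse(s, m=1):
    """Return s^(d+l-1) * (B B^T)^{-1}_{d,l}, which by Eq. (B3)
    should stabilise to the constant matrix c-tilde as s grows."""
    t = np.arange(1, s + 1, dtype=float)
    B = np.vstack([t**d for d in range(m + 1)])   # rows: t^0, ..., t^m
    inv = np.linalg.inv(B @ B.T)
    d = np.arange(1, m + 2)                        # 1-based indices d, l
    return inv * s ** (d[:, None] + d[None, :] - 1)

# The rescaled inverse changes only at O(1/s) between window sizes
small, large = scaled_inverse(200), scaled_inverse(2000)
assert np.allclose(small, large, rtol=0.05)
```

For $m = 1$ the exact inverse of the $2 \times 2$ matrix $B B^T$ can be written in closed form, and the rescaled element $(1, 1)$ is $2(2s+1)/(s-1)$, so $\tilde{c}_{1,1} = 4$; the numerical limit agrees.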
$$b_q^{(3)} = \sum_{\substack{d+l \ge q-1 \\ 1 \le d,\, l \le m+1}} a_q^{(d,l)}\, c_{d,l}, \qquad a_k^{(d,l)} = \sum_{r=0}^{\min\{l,k\}} \binom{l}{r} \frac{(-1)^{k-r}}{d+l+1-r} \binom{d+l+1-r}{d+l+1-k}.$$

Using Eqs. (B1) and (B7), the coefficients $d_q$ follow:

$$d_q = \begin{cases} 1/2 - b_0 & \text{if } q = 0, \\ -1 - b_1 & \text{if } q = 1, \\ 1/2 - b_2 & \text{if } q = 2, \\ -b_q & \text{if } q > 2. \end{cases}$$

[1] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, and A. L. Goldberger, Phys. Rev. E 49, 1685 (1994).
[2] J. W. Kantelhardt, E. Koscielny-Bunde, H. H. A. Rego, S. Havlin, and A. Bunde, Physica A 295, 441 (2001).
[3] O. Løvsletten and M. Rypdal, J. Clim. 29, 4057 (2016).
[4] S. Lahmiri, Physica A 437, 130 (2015).
[5] J.-Y. Chiang, J.-W. Huang, L.-Y. Lin, C.-H. Chang, F.-Y. Chu, Y.-H. Lin, C.-K. Wu, J.-K. Lee, J.-J. Hwang, J.-L. Lin, and F.-T. Chiang, PLoS ONE 11, e0147282 (2016).
[6] O. Løvsletten and M. Rypdal, Phys. Rev. E 85, 046705 (2012).
[7] M. Höll and H. Kantz, Eur. Phys. J. B 88, 327 (2015).
[8] M. Höll and H. Kantz, Eur. Phys. J. B 88, 126 (2015).
[9] M. S. Taqqu, V. Teverovsky, and W. Willinger, Fractals 03, 785 (1995).
[10] C. Heneghan and G. McDarby, Phys. Rev. E 62, 6103 (2000).
[11] P. S. Wilson, A. C. Tomsett, and R. Toumi, Phys. Rev. E 68, 017103 (2003).
[12] D. Mondal and D. B. Percival, Ann. Inst. Stat. Math. 62, 943 (2008).
[13] B. B. Mandelbrot and J. W. Van Ness, SIAM Rev. 10, 422 (1968).
[14] P. Embrechts and M. Maejima, Selfsimilar Processes, Princeton Series in Applied Mathematics (Princeton University Press, 2002).
[15] J.-F. Muzy and E. Bacry, Phys. Rev. E 66, 056121 (2002).
[16] E. Bacry and J. F. Muzy, Commun. Math. Phys. 236, 449 (2003).
[17] I. Koponen, Phys. Rev. E 52, 1197 (1995).
[18] G. Terdik, W. A. Woyczynski, and A. Piryatinska, Phys. Lett. A 348, 94 (2006).
[19] M. Rypdal and K. Rypdal, Earth Syst. Dyn. 7, 281 (2016).
[20] J. Beran, Statistics for Long-Memory Processes, Vol. 61 (CRC Press, 1994).
[21] C. P. Morice, J. J. Kennedy, N. A. Rayner, and P. D. Jones, J. Geophys. Res. 117, D08101 (2012).
[22] K. Hu, P. C. Ivanov, Z. Chen, P. Carpena, and H. E. Stanley, Phys. Rev. E 64, 011114 (2001).
[23] J. W. Kantelhardt, S. A. Zschiegner, E. Koscielny-Bunde, S. Havlin, A. Bunde, and H. E. Stanley, Physica A 316, 87 (2002).
[24] E. W. Weisstein, "Power Sum," From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/PowerSum.html
