Fisher Information Framework for Time Series Modeling

R. C. Venkatesan (a,*), A. Plastino (b)

(a) Systems Research Corporation, Aundh, Pune 411007, India
(b) IFLP, National University La Plata & National Research (CONICET) C. C., 727 1900, La Plata, Argentina

Abstract

A robust prediction model invoking the Takens embedding theorem, whose working hypothesis is obtained via an inference procedure based on the minimum Fisher information principle, is presented. The coefficients of the ansatz, central to the working hypothesis, satisfy a time-independent Schrödinger-like equation in a vector setting. The inference of i) the probability density function of the coefficients of the working hypothesis and ii) the constraint driven pseudo-inverse condition for the modeling phase of the prediction scheme is made, for the case of normal distributions, with the aid of the quantum mechanical virial theorem. The well-known reciprocity relations and the associated Legendre transform structure for the Fisher information measure (FIM, hereafter)-based model in a vector setting (with least squares constraints) are self-consistently derived. These relations are demonstrated to yield an intriguing form of the FIM for the modeling phase, which defines the working hypothesis, solely in terms of the observed data. Prediction is demonstrated for time series obtained from: (i) the Mackey-Glass delay-differential equation, (ii) one ECG sample from the MIT-Beth Israel Deaconess Hospital (MIT-BIH) cardiac arrhythmia database, and (iii) one ECG from the Creighton University ventricular tachyarrhythmia database. The ECG samples were obtained from the Physionet online repository. These examples demonstrate the efficiency of the prediction model. Numerical examples for exemplary cases are provided.
Key words: Fisher information, time series prediction, working hypothesis inference, minimum Fisher information, Takens theorem, generalized vector Fisher-Euler theorem, Legendre transform structure, Mackey-Glass equation, ECG's.

PACS: 05.20.-y; 02.50.Tt; 03.65.-w; 05.45.Tp

* Corresponding author. Email addresses: ravi@systemsresearchcorp.com; ravicv@eth.net (R. C. Venkatesan), plastino@fisica.unlp.edu.ar (A. Plastino).

Preprint submitted to Elsevier Science, 21 February 2018

1 Introduction

Devising methods for analyzing and predicting time series is currently considered one of the most important challenges in chaotic time-series analysis (e.g., see Refs. [1-3]). In general, chaotic behavior is observed in relation with nonlinear differential equations and maps on manifolds. Time series may be construed as being the projections of manifolds onto coordinate axes. Much work in nonlinear dynamics has focused on the building of appropriate model(s) of the underlying physical process from a time series, with the objective of predicting the near-future behavior of dynamical systems. The first step in formulating predictive models is that of specifying/estimating a suitably parameterized nonlinear function of the observation. This is followed by estimating the parameters of this function.

In general, prediction models are formulated on the basis of the systematic and accurate identification of a working hypothesis [4]. This hypothesis is represented by a set of parameters that form an ansatz. This paper obtains the coefficients of such an ansatz, which possess information about the data set(s), via recourse to a Fisher information measure (FIM, hereafter) based inference procedure. The leitmotif for obtaining the working hypothesis by employing an inference procedure is to formulate a prediction model based on the famed embedding theorem of Takens [5, 6]. The conceptual sophistication underlying Takens' theorem renders the prediction problem an instance of extrapolation.

Currently, some of the prominent prediction models based on information theory (IT, hereafter) are: (i) the framework of Plastino et al. [7-10] using the maximum entropy (MaxEnt, hereafter) method of Jaynes [11], and (ii) the nonparametric models by Principe et al. (e.g., see [12, 13]). The work presented herein belongs to a class of models known as pseudo-inverse models, for reasons described in Sections 2 and 3 of this paper. Such models have been successfully employed in forecasting tasks in a number of disciplines, which include nonlinear dynamical systems [7, 8], financial data forecasting [8], prediction of tonic-clonic epileptic seizures from real-time electroencephalogram (EEG) data [9], and even fraud analysis (the London Interbank Offered Rate (LIBOR) manipulations) [10].

Generally, predictive models are of two types, viz. global and local (see for example Ref. [14]). Global models are based on training data collected from across the phase space. On the other hand, in local models, the training is accomplished by measurements providing data lying in the immediate vicinity of a specific/localized region of the phase space. Pseudo-inverse predictive models, including the one presented herein, are essentially global models which possess local characteristics [10, 15].

Time series prediction has its roots in the theory of optimal filtering by Wiener [16]. In recent times, forecasting of chaotic time series has largely utilized artificial neural networks (ANN's, hereafter) and other learning paradigms.
Commencing from the seminal radial basis function model of Casdagli [17], some of the notable attempts to study chaotic time series comprise (but are not limited to) the time delayed neural network architectures [18], recurrent ANN's [19], maximum entropy ANN's [20], and support vector machines [21]. Within the perspective of physics-based models, the works of Crutchfield and McNamara [22] and Farmer and Sidorowich [23] constitute some of the most prominent efforts. FIM-based studies have recently been acquiring prominence across a spectrum of disciplines ranging from physics and biology to economics (e.g., see [24]).

The prediction model presented in this paper comprises two phases: (i) the modeling phase and (ii) the prediction phase. The task of the modeling phase is to obtain the coefficients of the ansatz that suitably parameterizes the nonlinear function of the observed time series (see Section 2 of this paper). This phase establishes the working hypothesis, and is accomplished with the assistance of the training data. The prediction phase then generates forecasts based on the set of coefficients obtained in the modeling phase.

The leitmotif for the FIM-based model employed in this paper is two-fold. First, it provides the framework to endow the modeling phase with a quantum mechanical (QM, hereafter) connotation. This is in accordance with Wheeler's hypothesis of establishing an information-theoretical foundation for the fundamental theories of physics [25], and is accomplished by recourse to the minimum Fisher information (MFI, hereafter) principle of Hüber [26, 27]. Variational extremization of the FIM subject to least squares constraints results in a Sturm-Liouville equation in a vector setting, hereinafter referred to as the time-independent Schrödinger-like equation.
Consequently, i) the probability density function (pdf, hereafter) of the coefficients of the ansatz, and ii) the constraint driven pseudo-inverse condition (that yields the inferred estimate of the coefficients, fundamental for the working hypothesis), can be specified not only via Gaussian (Maxwell-Boltzmann) pdf's [which are equilibrium distributions], but also in terms of non-equilibrium distributions [24, 28-30], comprising Hermite-Gauss polynomials.¹ This greatly widens the scope of the works presented in Refs. [7-10], and is accomplished in this paper with the aid of the QM virial theorem [31, 32] for normal distributions. Note that in inference problems involving the FIM, the Gaussian pdf's are obtained as solutions to the lowest eigenvalue by solving the time-independent Schrödinger-like equation in Section 3 of this paper as an eigenvalue problem, and correspond to the ground state of the physical Schrödinger wave equation (SWE, hereafter). Further, the non-equilibrium pdf's correspond to the higher-order eigenvalue solutions of such SWE, and are linked to excited states of the physical SWE (e.g., see [33, 34]). From a practical perspective, this enables the performance of the modeling phase and the concomitant prediction phase to be systematically categorized in terms of an established physics-based framework.

¹ In this paper the terms Gaussian pdf and normal pdf are employed interchangeably.

Next, the reciprocity relations and the Legendre transform structure (LTS, hereafter), together with the concomitant information theoretic relations for the FIM, in a vector setting and for least squares constraints, are derived. Prior studies have derived reciprocity relations and LTS for the FIM model [35] and have analyzed such relations [36-39].
Recently, these works have been qualitatively extended to the case of the relative Fisher information (RFI, hereafter) [40-42] by Venkatesan and Plastino by deriving the reciprocity relations and LTS [43]. A connection between the celebrated Hellmann-Feynman theorem, the reciprocity relations, and LTS for the RFI has been established in [44], in addition to a unique inference procedure to obtain the energy eigenvalue without recourse to solving the time-independent Schrödinger-like equation. These prior works differ from the analysis presented in this paper in two significant aspects: (i) they treat the scalar case and (ii) the prior knowledge encoded in the observed data is introduced as constraints into the variational extremization procedure in the form of expectations of the powers of the scalar independent variable.

The reciprocity relations and the LTS for the time-independent Schrödinger-like equation derived in this paper, despite possessing a vector form and least squares constraints, mathematically resemble those derived in [35]. This augurs well with regards to the possibility of translating the entire mathematical structure of thermodynamics into the Fisher-based model presented in this paper. The distinctions in the reciprocity relations and LTS derived in this paper vis-à-vis earlier referenced works [36-39] result in the information theoretic relations derived from these relations being qualitatively different from those obtained in the scalar case. This fact evidences the distinction between the results presented in this paper and those demonstrated in Refs. [36-39], based on physics and on systems' theoretic [45] considerations. Of interest is an expression that infers the FIM of the modeling phase just in terms of the observed data, hereafter referred to as the empirical FIM.
Such a relation, which is a solution of a linear PDE derived from the reciprocity relations together with the LTS that infers the FIM without recourse to the time-independent Schrödinger-like equation, has no equivalent in the MaxEnt model.

The goals of this paper are

• (i) to provide an overview of the solution procedure. This is done in Section 2,
• (ii) to: (a) introduce the MFI principle in a vector setting and using least squares constraints, (b) derive a systematic procedure for the inference of exponential pdf's of the modeling phase with the aid of the QM virial theorem, and (c) obtain the constraint driven pseudo-inverse condition that yields the estimate of the coefficients comprising the working hypothesis (see Section 2 of this paper) by invoking the QM virial theorem. This three-fold objective is performed in Section 3. Note that for normal pdf's the solutions of the MFI and MaxEnt principles are known to coincide [24, 46]. This paper focuses on the normal distribution to demonstrate that the results of the MaxEnt model can be derived from QM considerations and interpreted within the framework of estimation theory, which is not possible within the ambit of the MaxEnt framework,
• (iii) to derive the reciprocity relations and the LTS for the FIM in a vector setting using least squares constraints, analyzing the concomitant information theoretic relations. The empirical FIM is derived, and a preliminary analysis of its properties is performed.
This is accomplished in Section 4,
• (iv) to computationally demonstrate the efficacy of the prediction framework for the Mackey-Glass (M-G, hereafter) delay differential equation (DDE, hereafter) [47], for a 5 minute electrocardiogram (ECG, hereafter) segment of Record 207 of the MIT-Beth Israel Deaconess Hospital (MIT-BIH, hereafter) arrhythmia database [48] (considered to be one of the most challenging Records in the MIT-BIH arrhythmia database) for the Modified Lead II (MLII, hereafter), and for the single ECG signal in Record cudb/cu02 of the around 8.5 minute Creighton University ventricular tachyarrhythmia (VTA, hereafter) database [49]. The ECG data are obtained from the Physionet online repository [50]. This is demonstrated in Section 5 of this paper. The leitmotif of this exercise is as follows. An obvious practical advantage of the pseudo-inverse model presented in this paper over a least squares approach in ordinary Euclidean space is that the former requires the Moore-Penrose pseudo-inverse [51] of the embedding matrix $\mathbf{W}$ (defined in Section 2 of this paper) and, therefore, the estimate of the coefficients of the ansatz comprising the working hypothesis, $\langle\mathbf{a}\rangle$ (see Sections 2 and 3 of this paper), is derived via inference from the training data. This can be achieved even when $\mathbf{W}$ is nearly singular. The fact that the estimates $\langle\mathbf{a}\rangle$ are defined even when $\mathbf{W}$ is singular (or nearly singular) can in principle result in very volatile forecasts, on account of ill-conditioning. Note that ill-conditioning could occur in the presence of a near-singular $\mathbf{W}$, which in turn might occur if many lags of the observed data $v$ are present. The leitmotif for the choice of the benchmarks on which to test the prediction model is as follows. The M-G equation with delay $\tau > 14$ secs. has a high embedding dimension [18].
Thus $\mathbf{W}$ displays more lags as compared to most prominent models describing low dimensional chaos [3, 14]. The rationale, as described in Section 2 of this paper, is that the number of lags in $\mathbf{W}$ depends upon the embedding dimension. As evidenced in Section 5 of this paper, the forecast of the M-G DDE is stable and accurate. Next, ECGs of patients suffering from serious cardiac related ailments possess artifacts which are representative of various conditions of a diseased heart. These artifacts are noted in the reference annotations as episodes (transients). It is demonstrated that even for the most challenging cases, the model presented in this paper accurately forecasts these episodes without any signs of volatility, thereby demonstrating the accuracy and robustness of the pseudo-inverse model. This is established for cases where the original signal possesses highly erratic/volatile behavior. Numerical examples for exemplary cases are provided.

To the best of the authors' knowledge, these objectives have never hitherto been accomplished.

2 Overview of the solution procedure

2.1 Basics of embedding theory

Given a signal $x$ from an unknown dynamical system $D: \Re^S \rightarrow \Re^S$, the corresponding time series consists of a sequence of $N$ stroboscopic measurements: $\{v(\tau_0), v(\tau_0+\tau_s), \ldots, v(\tau_0+N\tau_s)\}$ made at intervals $\tau_s$. The state space is reconstructed using the time delay embedding [1, 5, 6], which uses a collection of coordinates with time lag to create a vector in $d$ dimensions, on a system considered to be in a state described by $x(t) \in \Re^S$ at discrete times

$$\mathbf{v}(t_n) = \{v(t_n), v(t_n-\Delta), \ldots, v(t_n-(d-1)\Delta)\} \tag{1}$$

where $\Delta = \tau_s$ is the time lag, and $d$ is the embedding dimension of the reconstruction. It is known from Takens' theorem (e.g., see Refs.
[5, 6]) that for flows evolving to compact attracting manifolds of dimension $d_a$; if $d > 2d_a$ for the forecasting time $T \in \Re$, $T > 0$ (time samples in this paper), there exists a functional form of the type

$$v(t+T) = \Im(\mathbf{v}(t)), \tag{2}$$

where

$$\mathbf{v}(t) = [v_1(t), v_2(t), \ldots, v_d(t)], \tag{3}$$

and $v_i(t) = v(t-(i-1)\Delta)$; $i = 1, \ldots, d$. A non-unique ansatz for the mapping function of this form (employing the Einstein summation convention) is specified as [9]

$$\Im^{*}(\mathbf{v}(t)) = a_0 + a_{i_1}v_{i_1} + a_{i_1 i_2}v_{i_1}v_{i_2} + a_{i_1 i_2 i_3}v_{i_1}v_{i_2}v_{i_3} + \ldots + a_{i_1 i_2 i_3 \ldots i_{np}}v_{i_1}v_{i_2}v_{i_3}\ldots v_{i_{np}}, \tag{4}$$

where $1 \le i_k \le d$ and $np$ is the polynomial degree chosen to expand the mapping $\Im^{*}$. The number of parameters in (4) corresponding to $k$ terms (the degree) is the combination with repetitions

$$\binom{d}{k}^{*} = \frac{(d+k-1)!}{k!\,(d-1)!}. \tag{5}$$

The length of the vector of parameters, $\mathbf{a}$, is

$$N_c = \sum_{k=1}^{np}\binom{d}{k}^{*}. \tag{6}$$

Other forms of ansatz are encountered in [52]. It is important to note that specifying an ansatz of a form such as that defined in (4) has its roots in signal processing [53].

2.2 The modeling phase

As an information recovery criterion, the vector of coefficients $\mathbf{a}$ is obtained via inference by invoking the MFI principle. The objective is to achieve a model possessing high predictive ability. Computations are made on the basis of the information given by $M$ points of the time series. These constitute the training data obtained from the observed signal, whose utility is to infer the coefficients $\mathbf{a}$.

$$[\mathbf{v}(t_n), v(t_n+T)]; \quad n = 1, \ldots, M. \tag{7}$$

Given the data set (7), the parametric mapping (2) can be re-stated as

$$v(t_n+T) = \Im^{*}(\mathbf{v}(t_n)); \quad n = 1, \ldots, M. \tag{8}$$

Here, (7) can be expressed in vector-matrix form as

$$\mathbf{W}\mathbf{a} = \mathbf{v}_T, \tag{9}$$

where $(\mathbf{v}_T)_n = v(t_n+T)$ and $\mathbf{W}$ is a rectangular matrix with dimensions $M \times N_c$, whose $n$-th row is $\left[1, v_{i_1}(t_n), v_{i_1}(t_n)v_{i_2}(t_n), \ldots, v_{i_1}(t_n)v_{i_2}(t_n)\ldots v_{i_{np}}(t_n)\right]$. The working hypothesis is established in Section 3 via inference of the coefficients from the observed data by invoking the MFI principle. Here, $\mathbf{a} = \left[a_0, a_{i_1}, a_{i_1 i_2}, \ldots, a_{i_1 i_2 i_3 \ldots i_{np}}\right]$. It is assumed that the probability associated with $\mathbf{a}$ is $f(\mathbf{a})$. Note that $\mathbf{a}$ is assumed to be a continuous random variable. Alternately, $f(\mathbf{a})$ may be defined as the empirical distributions of the observations $v(t_n)$; $n = 1, \ldots, M$ [54]. The FIM is extremized subject to the constraints

$$\mathbf{W}\langle\mathbf{a}\rangle = \mathbf{v}_T, \tag{10}$$

and the normalization condition

$$\int f(\mathbf{a})\,d\mathbf{a} = 1. \tag{11}$$

Note that $d\mathbf{a} = da_1\,da_2 \ldots da_{N_c}$, where $N_c$ is the number of parameters of the model. Also, $\langle\bullet\rangle$ denotes the expectation evaluated with respect to $f(\mathbf{a})$. Section 3 derives the constraint driven pseudo-inverse condition for normal distributions by invoking the QM virial theorem as

$$\langle\mathbf{a}\rangle = \mathbf{W}^{\dagger}\mathbf{v}_T, \tag{12}$$

where $\mathbf{W}^{\dagger} = \mathbf{W}^{T}(\mathbf{W}\mathbf{W}^{T})^{-1}$ is the Moore-Penrose pseudo-inverse [51]. Note that, as stated in Sections 1 and 3, unlike the MaxEnt model the FIM-based framework presented herein also allows for $f(\mathbf{a})$ described by Hermite-Gauss solutions. Such extensions of the present model and the subsequent effects on the pseudo-inverse condition are beyond the scope of this paper, and will be presented elsewhere.

2.3 The prediction phase

The prediction phase commences once the pertinent parameters $\langle\mathbf{a}\rangle$ are determined from the $M$ training data in the modeling phase. These are employed to predict new series values

$$\hat{v}(t_n+T)_{n=1,\ldots,M_P} = \hat{\mathbf{W}}\langle\mathbf{a}\rangle, \tag{13}$$

where $\hat{\mathbf{W}}$ is a matrix of dimension $M_P \times N_c$.
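The modeling and prediction phases of Eqs. (9), (10) and (13) can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not the authors' implementation: the rows of W are built from delay vectors as in Eqs. (1) and (4), and the coefficients are obtained from the normal equations (W^T W)<a> = W^T v_T, which coincides with the Moore-Penrose solution when W has full column rank. All function names, and the logistic-map series used as a toy signal, are hypothetical choices made for this sketch.

```python
from itertools import combinations_with_replacement
from math import prod


def delay_vector(series, n, d, lag=1):
    """Eq. (1): [v(t_n), v(t_n - lag), ..., v(t_n - (d - 1) * lag)]."""
    return [series[n - i * lag] for i in range(d)]


def monomial_row(vec, degree):
    """One row of W: the constant 1, then all monomials of degree 1..degree."""
    row = [1.0]
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(vec)), k):
            row.append(prod(vec[i] for i in idx))
    return row


def embedding_matrix(series, d, degree, horizon, lag=1):
    """Build W and the target vector v_T of Eq. (9) from a scalar series."""
    start = (d - 1) * lag            # first index with a complete delay vector
    stop = len(series) - horizon     # last index with a known target value
    W = [monomial_row(delay_vector(series, n, d, lag), degree)
         for n in range(start, stop)]
    v_T = [series[n + horizon] for n in range(start, stop)]
    return W, v_T


def solve_normal_equations(W, v_T):
    """Solve (W^T W) a = W^T v_T by Gaussian elimination with partial pivoting."""
    m = len(W[0])
    A = [[sum(r[i] * r[j] for r in W) for j in range(m)] for i in range(m)]
    b = [sum(r[i] * y for r, y in zip(W, v_T)) for i in range(m)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, m):
            f = A[r][c] / A[c][c]
            for k in range(c, m):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    a = [0.0] * m
    for i in reversed(range(m)):
        a[i] = (b[i] - sum(A[i][k] * a[k] for k in range(i + 1, m))) / A[i][i]
    return a


def predict(W_hat, a):
    """Eq. (13): each forecast is a row of W_hat applied to the coefficients."""
    return [sum(w * c for w, c in zip(row, a)) for row in W_hat]


# Toy signal (an assumption of this sketch): the logistic map, which is an
# exact degree-2 polynomial of its previous value, so d = 1 and np = 2 suffice.
series = [0.3]
for _ in range(199):
    series.append(3.9 * series[-1] * (1.0 - series[-1]))

W, v_T = embedding_matrix(series, d=1, degree=2, horizon=1)
a = solve_normal_equations(W, v_T)
forecast = predict(W, a)
```

Forming W^T W explicitly squares the condition number of W, so for a nearly singular W of the kind discussed in Section 1 a pseudo-inverse computed via a singular value decomposition would be the more robust choice. The normal equations are used here only to keep the sketch free of external dependencies.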
Note that $M_P$ is such that $M_P - M$ new time series values may be evaluated after the training data has been reconstructed. The prediction phase is essentially the implementation of (10), for temporal indices $n = 1, \ldots, M_P$, where $M_P \gg M$ is the sum of both the training data and the new data to be predicted after completion of the modeling phase. It is important to note that the process of inference necessitates the re-definition of the working hypothesis to account for (10) now superseding (9). The obvious reason is that the process of inference can only evaluate $\langle\mathbf{a}\rangle$ and not $\mathbf{a}$. The value of $M_P$ should be suitably bounded to facilitate the comparison between the predicted signal, obtained from the solution of (13), and the original signal. This is done in order to judge the fidelity of the prediction through both visual inspection and analysis; viz. calculation of the mean squared error (MSE, hereafter) between the original and the predicted signal. In this paper, given the original signal represented by the column vector $\mathbf{Z}$, $M_P = \mathrm{dimension}[\mathbf{Z}] - \max\{T, d\}$. Note that this non-unique fiduciary bound does not in any way constrain the IT-based prediction model, and there is nothing that prevents the value of $M_P$ from exceeding this bound should the situation require it. To evaluate the MSE, the exact measurement from the original signal is defined as $v$, and the corresponding result of the predictive model as $\hat{v}$:

$$MSE = \frac{1}{M_P}\sum_{j=1}^{M_P}\left(v_j - \hat{v}_j\right)^{2}; \quad j = 1, \ldots, M_P. \tag{14}$$

3 Inference framework for the modeling phase

3.1 The MFI principle

Consider the probability

$$f(\mathbf{a}; \boldsymbol{\theta}), \tag{15}$$

where $\boldsymbol{\theta}$ is a vector parameter.
Specializing the focus to a class of probability distributions exhibiting translational invariance where $f(\mathbf{a}; \boldsymbol{\theta}) = f(\mathbf{a} - \boldsymbol{\theta})$, and assuming without any loss of generality that the elements of the vector $\mathbf{a}$ are a priori iid, the FI matrix with vector entries acquires the form of a diagonal matrix. The FIM [24, 55] now takes the form

$$I[f] = \int \frac{1}{f(\mathbf{a})}\left(\frac{d f(\mathbf{a})}{d\mathbf{a}}\right)^{2} d\mathbf{a} = \sum_i \int f(\mathbf{a})\left(\frac{\partial \ln f(\mathbf{a})}{\partial a_i}\right)^{2} d\mathbf{a} = \sum_i \int \frac{1}{f_i(a_i)}\left(\frac{\partial f_i}{\partial a_i}\right)^{2} da_i = \sum_i I_i. \tag{16}$$

Note that in (16), for iid entries of the vector $\mathbf{a}$, the FIM is the trace of the FI matrix, which is identical to the scalar case [24], and $I_i$ is the $ii$-th diagonal element of the diagonal FI matrix. The derivation of (16) is described in the Appendix. With the aid of real valued amplitudes defined by

$$f(\mathbf{a}) = \psi^{2}(\mathbf{a}), \tag{17}$$

the FIM (16) may be compactly expressed as

$$I[\psi] = 4\int\left(\frac{d\psi(\mathbf{a})}{d\mathbf{a}}\right)^{2} d\mathbf{a} \tag{18}$$

which is extremized subject to the constraint defined by Eq. (10) and the normalization condition Eq. (11), in Section 2. A Lagrangian can be specified of the form

$$J[\psi] = \int\left\{4\left(\frac{d\psi(\mathbf{a})}{d\mathbf{a}}\right)^{2} - \vec{\lambda}\,\mathbf{W}\mathbf{a}\,\psi^{2}(\mathbf{a}) - \lambda_0\,\psi^{2}(\mathbf{a})\right\} d\mathbf{a}, \tag{19}$$

where $\vec{\lambda}$ is the vector of Lagrange multipliers associated with the constraint (10). Eq. (19) is re-expressed with its constraint terms described in component-wise form as

$$J[\psi] = \int\left\{4\left(\frac{d\psi(\mathbf{a})}{d\mathbf{a}}\right)^{2} - \sum_{k=1}^{M}\lambda_k\sum_{i=1}^{N_c}W_{ki}\,a_i\,\psi^{2}(\mathbf{a}) - \lambda_0\,\psi^{2}(\mathbf{a})\right\} d\mathbf{a} \tag{20}$$

Variational extremization of (19) with respect to $\psi(\mathbf{a})$, and multiplying the resultant by 2 yields

$$-\frac{d^{2}\psi(\mathbf{a})}{d\mathbf{a}^{2}} \underbrace{-\frac{\vec{\lambda}}{4}\mathbf{W}\mathbf{a}}_{U(\mathbf{a})}\,\psi(\mathbf{a}) = \frac{\lambda_0}{4}\,\psi(\mathbf{a}), \tag{21}$$

where $U(\mathbf{a}) = -\frac{\vec{\lambda}}{4}\mathbf{W}\mathbf{a}$ is the empirical pseudo-potential, the factor that multiplies $\psi(\mathbf{a})$ in (21). Here, (21) bears a resemblance to the SWE in a vector setting with $\frac{\hbar^{2}}{2m} = 1$.
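One way to see how a Schrödinger-like equation of the form (21) arises from the Lagrangian (19) is through the Euler-Lagrange equation for $\psi(\mathbf{a})$. The intermediate step below is a sketch in the notation of this section. It assumes that $\psi$ vanishes at the boundaries so that no surface terms appear, and the symbol $\mathcal{L}$ for the integrand of (19) is introduced here only for illustration.

```latex
\begin{aligned}
\mathcal{L} &= 4\left(\frac{d\psi}{d\mathbf{a}}\right)^{2}
  - \vec{\lambda}\,\mathbf{W}\mathbf{a}\,\psi^{2} - \lambda_{0}\,\psi^{2},\\[4pt]
\frac{\partial\mathcal{L}}{\partial\psi}
  - \frac{d}{d\mathbf{a}}\,
    \frac{\partial\mathcal{L}}{\partial\left(d\psi/d\mathbf{a}\right)}
 &= -2\,\vec{\lambda}\,\mathbf{W}\mathbf{a}\,\psi - 2\,\lambda_{0}\,\psi
    - 8\,\frac{d^{2}\psi}{d\mathbf{a}^{2}} = 0,\\[4pt]
\Longrightarrow\quad
 -\frac{d^{2}\psi}{d\mathbf{a}^{2}}
 - \frac{\vec{\lambda}}{4}\,\mathbf{W}\mathbf{a}\,\psi
 &= \frac{\lambda_{0}}{4}\,\psi .
\end{aligned}
```

The last line follows from the second after rescaling by an overall constant, which does not alter the equation, and it has the same form as Eq. (21).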
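3.2 Inference of normal distributions and derivation of the pseudo-inverse condition

Redefining (18) in terms of the pdf $f(\mathbf{a}) = \psi^{2}(\mathbf{a})$, one finds after invoking the QM virial theorem [31]

$$\int f(\mathbf{a})\left(\frac{d\ln f(\mathbf{a})}{d\mathbf{a}}\right)^{2} d\mathbf{a} = 4\int f(\mathbf{a})\left(\mathbf{a}\,\frac{dU(\mathbf{a})}{d\mathbf{a}}\right) d\mathbf{a}. \tag{22}$$

Eq. (22) yields

$$\int f(\mathbf{a})\left[\left(\frac{d\ln f(\mathbf{a})}{d\mathbf{a}}\right)^{2} - 4\,\mathbf{a}\,\frac{dU(\mathbf{a})}{d\mathbf{a}}\right] d\mathbf{a} = 0. \tag{23}$$

Substituting the expression for $U(\mathbf{a})$ in (21) into (23) results in

$$\int f(\mathbf{a})\left[\left(\frac{d\ln f(\mathbf{a})}{d\mathbf{a}}\right)^{2} + \vec{\lambda}\,\mathbf{W}\mathbf{a}\right] d\mathbf{a} = 0. \tag{24}$$

Solving (24) yields

$$f(\mathbf{a}) = \exp\left[\mp\int\sqrt{-\vec{\lambda}\,\mathbf{W}\mathbf{a}}\;d\mathbf{a}\right]. \tag{25}$$

Setting

$$\vec{\lambda} = -\frac{\mathbf{W}\mathbf{a}}{\sigma^{4}} \tag{26}$$

results in the pdf

$$f(\mathbf{a}) = \exp\left[-\int\sqrt{\left(\frac{\mathbf{W}\mathbf{a}}{\sigma^{4}}\right)^{2}}\;d\mathbf{a}\right] = \frac{\exp\left[-\frac{\mathbf{W}\mathbf{a}^{T}\mathbf{a}}{2\sigma^{2}}\right]}{\tilde{Z}} = \frac{\exp\left[-\frac{\sum_{k=1}^{M}W_{ki}\sum_{i=1}^{N_c}a_i^{2}}{2\sigma^{2}}\right]}{\tilde{Z}}; \quad \tilde{Z} = \int\exp\left[-\frac{\mathbf{W}\mathbf{a}^{T}\mathbf{a}}{2\sigma^{2}}\right] d\mathbf{a}. \tag{27}$$

Note that $\sigma^{2}$ denotes the statistical dispersion, and $\tilde{Z}$ is the canonical partition function. The above analysis is presented in a more familiar form by invoking the translational invariance property of the FIM by specifying

$$\mathbf{r} = \mathbf{a} - \langle\mathbf{a}\rangle. \tag{28}$$

Here, (28) has the effect of transforming (21) to

$$-\frac{d^{2}\tilde{\psi}(\mathbf{r})}{d\mathbf{r}^{2}} + \tilde{U}(\mathbf{r})\,\tilde{\psi}(\mathbf{r}) = \frac{\lambda_0^{*}}{4}\,\tilde{\psi}(\mathbf{r}), \tag{29}$$

where the translated empirical pseudo-potential is defined by

$$\tilde{U}(\mathbf{r}) = -\frac{1}{4}\vec{\lambda}\,\mathbf{W}\left(\mathbf{a} - \langle\mathbf{a}\rangle\right) = -\frac{1}{4}\vec{\lambda}^{*}\,\mathbf{W}\mathbf{r}, \tag{30}$$

and the translated normalization Lagrange multiplier is now

$$\lambda_0^{*} = \lambda_0 + \vec{\lambda}\,\mathbf{W}\langle\mathbf{a}\rangle \tag{31}$$

It is noteworthy to mention that (28) is identical to the so-called zero-mean form of the SWE employed in many works on FIM-based inference [56]. Eq. (23) is now re-cast as

$$\int\tilde{f}(\mathbf{r})\left[\left(\frac{d\ln\tilde{f}(\mathbf{r})}{d\mathbf{r}}\right)^{2} - 4\,\mathbf{r}\,\frac{d\tilde{U}(\mathbf{r})}{d\mathbf{r}}\right] d\mathbf{r} = 0. \tag{32}$$

Note that

$$\int\frac{1}{f(\mathbf{a})}\left(\frac{df(\mathbf{a})}{d\mathbf{a}}\right)^{2} d\mathbf{a} = \left\langle\mathbf{a}\,\frac{dU(\mathbf{a})}{d\mathbf{a}}\right\rangle_{f(\mathbf{a})} = \left\langle\mathbf{r}\,\frac{d\tilde{U}(\mathbf{r})}{d\mathbf{r}}\right\rangle_{\tilde{f}(\mathbf{r})} = \int\frac{1}{\tilde{f}(\mathbf{r})}\left(\frac{d\tilde{f}(\mathbf{r})}{d\mathbf{r}}\right)^{2} d\mathbf{r}, \tag{33}$$

where $\langle\bullet\rangle_{f(\bullet)}$ denotes the expectation evaluated with respect to $f(\bullet)$. Substituting (30) into (32) and solving yields

$$\tilde{f}(\mathbf{r}) = \exp\left[\mp\int\sqrt{-\vec{\lambda}^{*}\,\mathbf{W}\mathbf{r}}\;d\mathbf{r}\right]. \tag{34}$$

Setting

$$\vec{\lambda}^{*} = -\frac{\mathbf{W}\mathbf{r}}{\sigma^{4}} \tag{35}$$

results in the pdf

$$\tilde{f}(\mathbf{r}) = \exp\left[-\frac{\mathbf{W}\mathbf{r}^{T}\mathbf{r}}{2\sigma^{2}}\right] \tag{36}$$

Thus

$$f(\mathbf{a}) = \frac{\exp\left[-\frac{\mathbf{W}(\mathbf{a}-\langle\mathbf{a}\rangle)^{T}(\mathbf{a}-\langle\mathbf{a}\rangle)}{2\sigma^{2}}\right]}{\tilde{Z}} = \frac{\exp\left[-\frac{\sum_{k=1}^{M}W_{ki}\sum_{i=1}^{N_c}(a_i-\langle a_i\rangle)^{T}(a_i-\langle a_i\rangle)}{2\sigma^{2}}\right]}{\tilde{Z}}; \quad \tilde{Z} = \int\exp\left[-\frac{\mathbf{W}(\mathbf{a}-\langle\mathbf{a}\rangle)^{T}(\mathbf{a}-\langle\mathbf{a}\rangle)}{2\sigma^{2}}\right] d\mathbf{a}. \tag{37}$$

Solving (37) yields

$$f(\mathbf{a}) = \frac{1}{(2\pi\sigma^{2})^{N_c/2}}\exp\left[-\frac{(\mathbf{v}-\mathbf{W}\mathbf{a})^{T}(\mathbf{v}-\mathbf{W}\mathbf{a})}{2\sigma^{2}}\right]. \tag{38}$$

From (28)

$$\left\langle\mathbf{r}^{T}\mathbf{r}\right\rangle = \left\langle(\mathbf{a}-\langle\mathbf{a}\rangle)^{2}\right\rangle = \left\langle\mathbf{a}^{T}\mathbf{a}\right\rangle - \langle\mathbf{a}\rangle^{T}\langle\mathbf{a}\rangle = \sigma^{2}; \tag{39}$$

With the aid of (28), (32), and (36), (37) yields the matrix FIM in the form [45]

$$I[f] = \frac{\mathbf{W}^{T}\mathbf{W}}{\sigma^{2}} \tag{40}$$

For normal distributions, the Cramer-Rao bound is always saturated [24, 45]. Thus, the diagonal covariance matrix is of the form

$$\mathbf{C} = \sigma^{2}\left(\mathbf{W}^{T}\mathbf{W}\right)^{-1}. \tag{41}$$

To formally establish the pseudo-inverse relation, (10), (17), and (35) yield

$$-\int\mathbf{W}^{T}\vec{\lambda}^{*}\,\psi^{2}(\mathbf{a})\,d\mathbf{a} = \mathbf{W}^{T}\mathbf{W}\langle\mathbf{a}\rangle = \mathbf{W}^{T}\mathbf{v} \tag{42}$$

Thus

$$\langle\mathbf{a}\rangle = \left(\mathbf{W}^{T}\mathbf{W}\right)^{-1}\mathbf{W}^{T}\mathbf{v} = \mathbf{W}^{\dagger}\mathbf{v}. \tag{43}$$

The pseudo-inverse condition (43) may be readily shown to be an efficient estimator of $\mathbf{a}$.

4 Reciprocity relations and the Legendre transform structure

The basic mathematical apparatus and theoretical considerations for deriving the reciprocity relations and the LTS have been established [28, 43]. Thus, only the pertinent FIM-relations in a vector setting for least squares constraints are stated. Multiplying (21) by $4\psi(\mathbf{a})$ and integrating yields, in vector form after re-arranging the terms,

$$I[\psi] = \lambda_0 + \vec{\lambda}\,\mathbf{W}\langle\mathbf{a}\rangle \tag{44}$$

To treat the component-wise case, the following definition is invoked

$$\int a_i\,f(\mathbf{a})\,d\mathbf{a} = \langle a_i\rangle, \tag{45}$$

yielding

$$I[\psi] = \lambda_0 + \sum_{k=1}^{M}\lambda_k\sum_{i=1}^{N_c}W_{ki}\langle a_i\rangle. \tag{46}$$

Taking derivatives of (46) with respect to $\lambda_k$ results in

$$\frac{\partial I[\psi]}{\partial\lambda_k} = \frac{\partial\lambda_0}{\partial\lambda_k} + \sum_{i=1}^{N_c}W_{ki}\langle a_i\rangle + \sum_{\substack{j=1\\ j\neq k}}^{M}\lambda_j\,\frac{\partial\sum_{i=1}^{N_c}W_{ji}\langle a_i\rangle}{\partial\lambda_k}. \tag{47}$$

Specifying

$$\frac{\partial\lambda_0}{\partial\lambda_k} = -\sum_{i=1}^{N_c}W_{ki}\langle a_i\rangle \tag{48}$$

in (47) yields the generalized Fisher-Euler theorem in a vector setting for least squares constraints

$$\frac{\partial I[\psi]}{\partial\lambda_k} = \sum_{\substack{j=1\\ j\neq k}}^{M}\lambda_j\,\frac{\partial\sum_{i=1}^{N_c}W_{ji}\langle a_i\rangle}{\partial\lambda_k}. \tag{49}$$

Setting

$$\Theta_k(a_i) = \sum_{i=1}^{N_c}W_{ki}\,a_i, \tag{50}$$

and with the aid of (45), (50) takes the form

$$\langle\Theta_k(a_i)\rangle = \sum_{i=1}^{N_c}W_{ki}\langle a_i\rangle. \tag{51}$$

With the aid of (21), (44), (45) and (50), the following relation is obtained

$$\begin{aligned}
\lambda_0 + \sum_{k=1}^{M}\lambda_k\langle\Theta_k(a_i)\rangle &= -\sum_{k=1}^{M}\lambda_k\left\langle a_i\,\frac{d\Theta_k(a_i)}{da_i}\right\rangle = -\sum_{k=1}^{M}\lambda_k\langle\Theta_k(a_i)\rangle\\
\Rightarrow\ \lambda_0 &= -2\sum_{k=1}^{M}\lambda_k\langle\Theta_k(a_i)\rangle\\
\Rightarrow\ I[\psi] &= -\sum_{k=1}^{M}\lambda_k\langle\Theta_k(a_i)\rangle
\end{aligned} \tag{52}$$

Taking derivatives of the third term in (52) yields

$$\frac{\partial I[\psi]}{\partial\lambda_j} = -\langle\Theta_j(a_i)\rangle - \sum_{k=1}^{M}\lambda_k\,\frac{\partial\langle\Theta_k(a_i)\rangle}{\partial\lambda_j} \tag{53}$$

Consider the relation that underlies the basis for the LTS [26, 43]

$$\lambda_0(\lambda_1, \ldots, \lambda_M) = I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right) - \sum_{k=1}^{M}\lambda_k\langle\Theta_k(a_i)\rangle. \tag{54}$$

Taking derivatives of (54) with respect to $\langle\Theta_j(a_i)\rangle$ and comparing the ensuing results with (53) yields the reciprocity relation

$$\frac{\partial I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right)}{\partial\langle\Theta_j(a_i)\rangle} = \lambda_j. \tag{55}$$

Likewise, taking derivatives of (53) with respect to $\lambda_k$ yields the reciprocity relation (48). Substituting (55) into (52) yields a linear PDE to infer the FIM without having to solve the vector time-independent Schrödinger-like equation

$$I[\psi] = -\sum_{k=1}^{M}\langle\Theta_k(a_i)\rangle\,\frac{\partial I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right)}{\partial\langle\Theta_k(a_i)\rangle}. \tag{56}$$

Specifying

$$I[\psi] = \sum_{k=1}^{M}I_k[\psi] = \sum_{k=1}^{M}\exp\left[g\left(\langle\Theta_k(a_i)\rangle\right)\right], \tag{57}$$

and substituting (57) into (56) yields

$$I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right) = \sum_{k=1}^{M}C_k\left|\langle\Theta_k(a_i)\rangle\right|^{-1}, \tag{58}$$

where $C_k$ is a constant of integration.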
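That the form (58) satisfies the linear PDE (56) can be checked numerically. The sketch below evaluates the residual $I + \sum_k \langle\Theta_k\rangle\,\partial I/\partial\langle\Theta_k\rangle$ with central finite differences; the function names and the sample values of $\langle\Theta_k\rangle$ and $C_k$ are hypothetical and serve only as an illustration.

```python
def empirical_fim(theta, C):
    """Candidate solution of Eq. (58): I = sum_k C_k / |theta_k|."""
    return sum(c / abs(t) for c, t in zip(C, theta))


def pde_residual(theta, C, eps=1e-6):
    """Residual of Eq. (56): I + sum_k theta_k * dI/dtheta_k.

    The partial derivatives are approximated by central differences,
    so the residual is expected to vanish only up to numerical error.
    """
    total = empirical_fim(theta, C)
    for k in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[k] += eps
        dn[k] -= eps
        dI = (empirical_fim(up, C) - empirical_fim(dn, C)) / (2.0 * eps)
        total += theta[k] * dI
    return total


theta = [0.8, -1.5, 2.3]   # hypothetical values of <Theta_k>, both signs
C = [1.0, 0.5, 2.0]        # hypothetical integration constants C_k
residual = pde_residual(theta, C)
print(abs(residual) < 1e-6)
```

The absolute value in (58) is what allows negative values of $\langle\Theta_k\rangle$ to be treated on the same footing as positive ones, since each term $C_k/|\langle\Theta_k\rangle|$ is homogeneous of degree $-1$ on either side of the origin.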
Invoking (10) in (58) yields a candidate empirical FIM for the modeling phase, defined solely in terms of the training data

$$I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right) = \sum_{k=1}^{M}C_k\,v_k^{-1}. \tag{59}$$

The utility and practical implementation is the task of ongoing work. Some of the potential implications of (59) are briefly discussed in Section 6. Taking the derivative of (58) with respect to $\langle\Theta_k(a_i)\rangle$ yields

$$\frac{\partial I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right)}{\partial\langle\Theta_k(a_i)\rangle} = -\frac{1}{\langle\Theta_k(a_i)\rangle}\,C_k\left|\langle\Theta_k(a_i)\rangle\right|^{-1} = -\frac{1}{\langle\Theta_k(a_i)\rangle}\,I_k. \tag{60}$$

Here, (60) describes a monotonically decreasing $I_k$ in the $\langle\Theta_k(a_i)\rangle$-direction. Differentiating (60) yields

$$\frac{\partial^{2}I\left(\langle\Theta_1(a_i)\rangle, \ldots, \langle\Theta_M(a_i)\rangle\right)}{\partial\langle\Theta_k(a_i)\rangle\,\partial\langle\Theta_l(a_i)\rangle} = 2\,C_k\left|\langle\Theta_k(a_i)\rangle\right|^{-3}\delta_{kl}, \tag{61}$$

where $\delta_{kl}$ is the Kronecker delta. Here, (61) establishes the convexity of the FIM derived in this Section, thereby guaranteeing the existence of its inverse.

5 Numerical examples

This Section demonstrates the efficacy of the prediction model with the aid of the M-G DDE and two ECG signals. The embedding dimension for all examples is evaluated using the false nearest neighbor method [57]. All numerical examples are evaluated for values of the forecasting time $T = 1$ and $5$. The largest positive Lyapunov exponent (LLE, hereafter) is one of the simplest indicators of chaotic behavior. From Refs. [17, 58, 59], it is evident that the M-G DDE, with the delay time $\tau > 14$ secs., possesses a positive LLE. Likewise, from [60, 61], it has been established that the ECG signals comprising Record 207 of the MIT-BIH arrhythmia database possess a positive LLE, while [60] demonstrates that the signal comprising cudb/cu02 also has a positive LLE.

5.1 Mackey-Glass equation

The famed M-G DDE is described by

$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^{10}(t-\tau)} - b\,x(t), \tag{62}$$

where $a = 0.2$, $b = 0.1$, $\tau = 30$ secs.
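A series of the kind used in this subsection can be generated by integrating Eq. (62) numerically. The following is a minimal sketch, not the authors' code. The initial value $x(0) = 1.2$ and the fourth-order Runge-Kutta scheme follow the description in this subsection, whereas the constant pre-history $x(t) = x(0)$ for $t \le 0$, the unit step size, and the linear interpolation of the delayed term at half steps are assumptions made here for illustration.

```python
def mackey_glass(n_points, a=0.2, b=0.1, tau=30.0, x0=1.2, h=1.0):
    """Integrate Eq. (62) with a classical RK4 step of size h.

    Returns a list of n_points samples x(0), x(h), x(2h), ...
    Assumes tau is a positive multiple of h and x(t) = x0 for t <= 0.
    """
    lag = int(round(tau / h))  # delay expressed in grid steps

    def rhs(x, x_delayed):
        return a * x_delayed / (1.0 + x_delayed ** 10) - b * x

    xs = [x0]

    def past(k):
        # Solution at grid index k; constant history before t = 0.
        return xs[k] if k >= 0 else x0

    for n in range(n_points - 1):
        xd_start = past(n - lag)           # x(t - tau)
        xd_end = past(n - lag + 1)         # x(t + h - tau)
        xd_mid = 0.5 * (xd_start + xd_end) # interpolated x(t + h/2 - tau)
        x = xs[n]
        k1 = rhs(x, xd_start)
        k2 = rhs(x + 0.5 * h * k1, xd_mid)
        k3 = rhs(x + 0.5 * h * k2, xd_mid)
        k4 = rhs(x + h * k3, xd_end)
        xs.append(x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)
    return xs


series = mackey_glass(1500)
print(len(series), min(series), max(series))
```

The resulting trajectory depends on the assumed pre-history and step size, so it should not be expected to reproduce the original solution of this paper sample by sample; it only provides a chaotic M-G series with the stated parameters.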
Here, $x(0) = 1.2$. Integration with a fourth-order Runge-Kutta routine yields the original solutions. The value of the embedding dimension is $d = 5$. The total number of points in the original solution is 1500. The procedures described in Sections 2 and 3 are then employed with $M = 300$ training samples in $[0, 300]$ secs. to obtain $\langle a \rangle$, for $M_p = 1494$, $np = 3$, $N_c = 56$. The predicted signals and the concomitant MSE values for $T = 1$ and $T = 5$ are depicted in Figs. 1 and 2, respectively. Fig. 1 clearly demonstrates that the predicted results faithfully capture the dynamics embedded in the chaotic M-G time series. Fig. 2 demonstrates, as expected, a slight distortion of the predicted signal vis-à-vis the original signal as a consequence of long-term forecasting.

It may be argued that the number of coefficients $N_c = 56$ is high and can forecast just about any signal. This argument is not only tenuous at best for the case of chaotic signals, but is also orthogonal to the very reason for choosing such a high value of $N_c$. Specifically, Section 1 explicitly discusses the possible singularity (or near-singularity) of the embedding matrix $W$. As stated therein, for a large number of lags, $W$ can result in volatile forecasts owing to ill-conditioning. The M-G DDE in this example has a higher embedding dimension than other prominent models (such as the Lorenz and Hénon models; see, e.g., Ref. [3]), and thus the resulting $W$ would be more prone to yield volatile forecasts. As is evidenced by Figs. 1 and 2, this is not the case, and the forecasts are clearly accurate and stable.

It is noteworthy that the coefficients obtained from the training data during the modeling phase, which form the basis on which further prediction is done over a much larger time period and data sample size (as compared to those in the modeling phase), are unique to the specific data set under consideration.
Specifically, coefficients obtained from different data sets, for example (i) the M-G DDE with a different value of the lag $\tau$ or (ii) another nonlinear dynamics model, yield erroneous predictions if applied to a data set which differs from the one(s) they were obtained from. This issue is the subject of ongoing studies, briefly described in Section 6 within the context of the results described in Sections 3 and 4, and will be presented elsewhere.

5.2 MIT-BIH arrhythmia database Record 207

This sub-Section employs a 300 secs. (5 min.) ECG signal to demonstrate that the model described in this paper accurately predicts episodes (transients) which are the artifacts of a diseased heart over a reasonable period of time, even for a highly erratic/volatile signal. The annotations are described in [62]. The signal is extracted from data obtained as a .mat data file from [63]. The sampling frequency of the data is $\delta_s = 360$ Hz [64], the number of samples being 108,000 for a total duration of 300 secs. The rationale for the choice of a 300 secs. sample is to ensure that the portion of the signal, both during the modeling phase and the prediction phase that follows, possesses sufficiently identifiable episodes which are documented in the reference annotations [65]. It would be desirable to conduct the study over the entire duration of the signal, spanning around 30 mins. However, this would yield simulation results that are visually incoherent, and hence the truncation of the signal length/duration. The number of training samples from which the values of $\langle a \rangle$ are obtained is $M = 18{,}000$ in $[0, 50.0]$ secs. for forecasting times of $T = 1.0$ ($M_p = 107{,}996$) and $T = 5.0$ ($M_p = 107{,}995$). In all cases, $d = 4$, $np = 3$, and $N_c = 35$. All simulation results are presented for the MLII lead.
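The coefficient counts quoted in this Section ($N_c = 56$ for $d = 5$, $np = 3$, and $N_c = 35$ for $d = 4$, $np = 3$) are consistent with a full polynomial of degree $np$ in the $d$ delay coordinates, for which the number of monomials, including the constant term, is $\binom{d + np}{np}$. That reading is an assumption inferred from the quoted numbers, and the helper names below are hypothetical. The sketch enumerates the monomials and evaluates them on a single delay vector, giving one row of a design matrix of this kind.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def monomial_exponents(d, degree):
    """All exponent tuples (e_1, ..., e_d) with e_1 + ... + e_d <= degree."""
    exponents = []
    for total in range(degree + 1):
        # Each multiset of `total` coordinate indices is one monomial.
        for combo in combinations_with_replacement(range(d), total):
            e = [0] * d
            for idx in combo:
                e[idx] += 1
            exponents.append(tuple(e))
    return exponents


def polynomial_features(delay_vector, degree):
    """Evaluate every monomial of total degree <= `degree` on one delay vector."""
    return [
        prod(x ** e for x, e in zip(delay_vector, exps))
        for exps in monomial_exponents(len(delay_vector), degree)
    ]


# Counts quoted in the text for the M-G and ECG examples.
assert len(monomial_exponents(5, 3)) == comb(5 + 3, 3) == 56
assert len(monomial_exponents(4, 3)) == comb(4 + 3, 3) == 35
```

Because the count grows combinatorially with $d$ and $np$, a modest increase in either quickly enlarges the coefficient vector, which is one reason the conditioning of $W$ matters.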
The simulation results depicting the predicted ECG signal superimposed over the original Record 207 signal for the MLII lead, together with the concomitant MSEs, are presented in Figs. 3 and 5 for $T = 1.0$ and $T = 5.0$, respectively. Both Figs. 3 and 5 demonstrate that even for a relatively short modeling phase, which comprises 1/6 of the duration of the entire prediction exercise (the modeling phase plus the prediction of new data), the prediction results are of high quality. The case corresponding to $T = 5.0$ shows greater "undershoots" and "overshoots" of the peaks of the signal as compared to the case of $T = 1.0$. This degradation of prediction performance is expected.

The MSEs for both forecasting times show divergent peaks. These have been analyzed and found to be the result of the highly erratic/volatile nature of the signal. It is important to note that they do not constitute any volatility in the prediction, since the profiles of the predicted signals are demonstrated to be very much in accord with the original signal. Further, in both cases, following every divergent peak, which can even be visually related to erratic signal quality in Figs. 3 and 5, the prediction returns to "normalcy", which is defined by a low MSE value. This would not be the case for a volatile prediction caused by ill-conditioning or any other factor (see Section 1), because the errors caused by volatile predictions tend to cascade.

Figs. 4 and 6 focus on specific regions of interest. Figs. 4(a) and 6(a) depict the overlaid original and predicted signals of the modeling phase for forecasting times $T = 1.0$ and $T = 5.0$, respectively.
To establish the accuracy of the FIM-based model, the prediction of the episodes corresponding to the various conditions of the diseased heart documented in the reference annotations [65], which can be visually determined in the period $[0, 300]$ secs., is demonstrated in Figs. 4(b)-(d) and 6(b)-(d) for $T = 1.0$ and $T = 5.0$, respectively.

On inspection of Figs. 4(b) and 6(b), the following can be easily identified: (i) the instance of ventricular tachycardia identified by "+" and defined by "(VT" in the reference annotations at 38.522 secs., immediately followed by three instances of premature ventricular contraction identified by "V", and (ii) the onset of ventricular flutter/fibrillation identified by "[" at 40.736 secs. in the reference annotations, followed by an instance of ventricular flutter identified by "+" and defined by "(VFL" at 40.803 secs., and the subsequent termination of the ventricular flutter/fibrillation identified by "]" at 50.972 secs. This region is of particular importance since it spans both the modeling phase, in which the working hypothesis is determined from the training data, and the prediction of new data values.

On inspection of Figs. 4(c) and 6(c), the following can be easily identified: (i) the onset of ventricular flutter/fibrillation identified by "[" at 54.764 secs. in the reference annotations, followed by an instance of ventricular flutter identified by "+" and defined by "(VFL" at 54.869 secs., and the subsequent termination of the ventricular flutter/fibrillation identified by "]" at 50.972 secs., and (ii) the instance of ventricular tachycardia identified by "+" and defined by "(VT" at 61.839 secs. (1:01.839 mins.), immediately followed by three instances of premature ventricular contraction identified by "V".
Finally, on inspection of Figs. 4(d) and 6(d), the following can be easily identified: the onset of ventricular flutter/fibrillation identified by "[" at 269.467 secs. (4:29.467 mins.) in the reference annotations, followed by an instance of ventricular flutter identified by "+" and defined by "(VFL" at 269.586 secs. (4:29.586 mins.), and the subsequent termination of the ventricular flutter/fibrillation identified by "]" at 280.906 secs. (4:40.906 mins.). In all cases depicted in Figs. 4 and 6, it is observed that the quality of the prediction is high, with the case $T = 5.0$ being marginally degraded vis-à-vis the case $T = 1.0$, which is expected.

5.3 Creighton University VTA database Record cudb/cu02

This sub-Section demonstrates the robustness of the model described in this paper for an extended ECG signal, even for a highly erratic/volatile signal displaying the symptoms of cardiac VTA. The signal is extracted from data obtained as a .mat data file from [66]. The sampling frequency of the data is $\delta_s = 250$ Hz, and the number of samples is 127,232, for a total duration of 8:28.928 mins. [67]. In order to study the robustness of the prediction performance of the FIM-based model, the number of training samples from which the values of $\langle a \rangle$ are obtained is chosen to be $M = 30{,}000$ in $[0, 120.0]$ secs. for $T = 1.0$ ($M_p = 127{,}228$) and $T = 5.0$ ($M_p = 127{,}227$). In all cases, $d = 4$, $np = 3$, and $N_c = 35$. The high quality of the model is attested by the fidelity of the predicted ECG profiles to the original signals, and by the MSEs. The simulation results depicting the predicted ECG signal superimposed over the original signal, together with the concomitant MSEs, are presented in Figs. 7 and 8 for $T = 1.0$ and $T = 5.0$, respectively.
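The sample counts and annotation times quoted in Sections 5.2 and 5.3 are related by simple arithmetic, which the following minimal sketch makes explicit. The helper names are illustrative, and the `m:ss.mmm` stamp format is assumed from the way the times are quoted in the text.

```python
def stamp_to_seconds(stamp):
    """Convert an annotation time stamp of the form 'm:ss.mmm' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def seconds_to_sample(t, fs):
    """Nearest sample index for time t (secs.) at sampling frequency fs (Hz)."""
    return round(t * fs)


# Record 207 (MIT-BIH): 360 Hz, 300 secs. analysed, 50 secs. of training data.
assert 360 * 300 == 108_000
assert 360 * 50 == 18_000

# Record cudb/cu02: 250 Hz, 127,232 samples, 120 secs. of training data.
assert 250 * 120 == 30_000
assert abs(127_232 / 250 - stamp_to_seconds("8:28.928")) < 1e-9

# Annotation stamps quoted for Figs. 4(c)-(d) and 6(c)-(d).
assert abs(stamp_to_seconds("1:01.839") - 61.839) < 1e-9
assert abs(stamp_to_seconds("4:29.467") - 269.467) < 1e-9
assert abs(stamp_to_seconds("4:40.906") - 280.906) < 1e-9
```

A stamp can then be mapped to a position in the sampled record, for example `seconds_to_sample(stamp_to_seconds("4:29.467"), 360)` for Record 207.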
Similar to the case described in Section 5.2, the MSEs for both forecasting times show divergent peaks. These have been analyzed and found to be the result of the nature of the signal. Again, note that these divergent peaks in the MSEs do not constitute any volatility in the prediction, since the profiles of the predicted signals are demonstrated to be very much in accord with the original signal. Further, in both cases, following every divergent peak, which can even be visually related to erratic signal quality in Figs. 7 and 8, the prediction returns to "normalcy" (defined by a low MSE value). This would not be the case for a volatile prediction caused by ill-conditioning or any other factor (see Section 1), because the errors caused by volatility of the forecasting tend to cascade.

6 Summary and conclusions

A convenient framework for the modeling and forecasting of time series has been developed within the ambit of a FIM-based inference model. The modeling phase, from which the working hypothesis is derived, has been provided with a QM connotation by the formulation of a time-independent Schrödinger-like equation in a vector setting, employing least squares constraints. This has been achieved by invoking the MFI principle. Apart from the obvious theoretical implications, this allows for the systematic derivation and categorization of the working hypothesis and the subsequent forecasting phase. The pdf and the pseudo-inverse relations have been self-consistently inferred by invoking the QM virial theorem for the case of normal distributions. The reciprocity relations and the LTS have been derived for the modeling phase. This results in an intriguing form of the FIM for the modeling phase, which defines the working hypothesis, described solely in terms of the observed data (the empirical FIM).
One possible use of this form of the FIM is to derive principled expressions and values for the statistical dispersion. This FIM expression has no equivalent in prior MaxEnt models [7-10], which treat the statistical dispersion as an ad-hoc scaling [15]. The FIM-based model has been numerically tested, and its efficacy demonstrated, for the Mackey-Glass DDE, the ECG signal for the MLII lead from Record 207 of the MIT-BIH cardiac arrhythmia database, and the ECG from Record cudb/cu02 of the Creighton University VTA database. The forecasting is consistently demonstrated to be of very high quality, and does not suffer from any signs of volatility caused by ill-conditioning of the embedding matrix $W$ or any other factor. The numerical experiments on the ECG demonstrate that the model presented in this paper is able to forecast salient episodes documented in the reference annotations for the said signals to a very high degree of accuracy.

Ongoing work is focused on a two-pronged approach. First, the empirical FIM derived in Section 4 is being investigated within the context of its relationship to the FIM employed in Section 3, and the subsequent effects on the forecasting. A fiduciary time dependence is induced into the modeling phase via a sliding window analysis. This allows the quality of prediction to be related to fundamental results governing the FIM, viz. the I-theorem (the Fisher-equivalent of the H-theorem) [24], for both normal and non-equilibrium distributions. Next, a principled comparison of the FIM-based model with existing nonparametric prediction models [12, 13] is in progress. These works will be presented elsewhere.

Appendix A: Derivation of Eq. (16)

For the Fisher information matrix $[F]$, each element is defined by

$$F_{ij} = \int f(a) \left[ \frac{\partial \ln f(a)}{\partial a_i}\, \frac{\partial \ln f(a)}{\partial a_j} \right] da.$$
(A.1)

For $a$ possessing a priori iid entries

$$f(a) = \prod_i f_i(a_i); \qquad \frac{\partial \ln f(a)}{\partial a_i} = \frac{\partial \ln f_i(a_i)}{\partial a_i}. \tag{A.2}$$

Substituting (A.2) into (A.1) yields

$$F_{ij} = \int \prod_k f_k(a_k) \left[ \frac{\partial \ln f_i(a_i)}{\partial a_i}\, \frac{\partial \ln f_j(a_j)}{\partial a_j} \right] da_k. \tag{A.3}$$

For $i \neq j$,

$$F_{ij} = F_i F_j; \qquad F_i = \int f_i(a_i)\, \frac{\partial \ln f_i(a_i)}{\partial a_i}\, da_i = \frac{\partial}{\partial a_i} \int f_i(a_i)\, da_i = \frac{\partial}{\partial a_i}\, 1 = 0, \tag{A.4}$$

since all other integrals over $da_k$ integrate to unity because of normalization. For $i = j$,

$$F_{ii} = \int f_i(a_i) \left( \frac{\partial \ln f_i(a_i)}{\partial a_i} \right)^2 da_i. \tag{A.5}$$

Thus, $[F]$ is a diagonal matrix with each element defined by (A.5). Note that $F_{ii} = I_i$, as used in Eq. (16).

References

[1] J.-P. Eckman, D. Ruelle, Rev. Mod. Phys. 15 (1985) 617-656.
[2] H. D. I. Abarbanel, R. Brown, J. J. Sidorowich, L. Sh. Ysimring, Rev. Mod. Phys. 65 (1993) 1331-1392.
[3] H. Kantz, T. Schreiber, Nonlinear Time Series Analysis, Cambridge Univ. Press, Cambridge U.K., 1999.
[4] J. Rissanen, Ann. Stat. 14 (1989) 1080-1100.
[5] F. Takens, "Detecting strange attractors in turbulence", in Dynamical Systems and Turbulence: Lecture Notes in Mathematics, Volume 898, Springer, Berlin, 1981, 366-381.
[6] T. Sauer, J. A. Yorke, M. Casdagli, J. Stat. Phys. 65 (2000) 579.
[7] L. Diambra, A. Plastino, Phys. Lett. A 216 (1996) 278-282.
[8] M. T. Martín, A. Plastino, V. Vampa, G. Judge, Physica A 405 (2014) 63-69.
[9] M. T. Martín, A. Plastino, V. Vampa, Entropy 16 (2014) 4603-4611.
[10] A. F. Bariviera, M. T. Martín, A. Plastino, V. Vampa, Physica A 449 (2016) 401-407.
[11] E. T. Jaynes, Phys. Rev. 106 (1957) 620-630.
[12] J. C. Principe, Information Theoretic Learning - Renyi's Entropy and Kernel Perspectives, Springer, New York, 2010.
[13] W. Liu, J. C. Principe, S. Haykin, Kernel Adaptive Filtering: A Comprehensive Introduction, Wiley, Hoboken NJ, 2010.
[14] H. D. I.
Abarbanel, Analysis of Observed Chaotic Data, Springer, New York, 1996.
[15] L. Diambra, Physica A 278 (2000) 140-149.
[16] N. Wiener, Extrapolation, Interpolation, and Smoothening of Stationary Time Series with Engineering Applications, Wiley, New York, 1949.
[17] M. Casdagli, Physica D 35 (1989) 335-356.
[18] J. C. Principe, A. Rathie, J.-M. Kuo, Intl. J. of Bifurcation and Chaos 2 (1992) 989-996.
[19] D. Mandic, J. Chambers, Recurrent Neural Networks for Prediction, Wiley, Chichester, 2001.
[20] L. Diambra, A. Plastino, Phys. Rev. E 52 (1995) 4557-4560.
[21] N. I. Sapankevych, R. Sankar, IEEE Comput. Intell. Mag. 4 (2009) 24-38.
[22] J. P. Crutchfield, B. S. McNamara, Complex Systems 21 (1985) 417.
[23] J. D. Farmer, J. J. Sidorowich, Phys. Rev. Lett. 59 (1987) 845.
[24] B. R. Frieden, Science from Fisher Information - A Unification, Cambridge University Press, Cambridge, 2004.
[25] J. A. Wheeler, in Zurek W. H. (Ed.): Complexity, Entropy and the Physics of Information, Addison Wesley, New York, 3-28, 1991.
[26] P. J. Hüber, Robust Statistics, Wiley, New York, 1981.
[27] B. R. Frieden, Phys. Rev. A 41 (2000) 4265-4276; Optics Lett. 14 (1989) 199-201.
[28] B. R. Frieden, A. Plastino, A. R. Plastino, B. H. Soffer, Phys. Rev. E 66 (2002) 046128.
[29] B. R. Frieden, A. Plastino, A. R. Plastino, B. H. Soffer, Phys. Lett. A 304 (2002) 73-78.
[30] J. S. Dehesa, . G. Martn, P. Sánchez-Moreno, Complex Anal. and Oper. Th. 6 (2012) 585-601.
[31] W. Greiner, Quantum Mechanics. An Introduction, Springer, Berlin, 2012.
[32] F. N. Fernandez and E. A. Castro, Hypervirial Theorems, Lecture Notes in Chemistry, Vol. 43, Springer-Verlag, Berlin, 1987.
[33] R. C. Venkatesan, "Statistical Cryptography using a Fisher-Schrödinger model", Proc.
IEEE Symposium on Foundations of Computational Intelligence (FOCI 2007), 487, IEEE Press, Piscataway, NJ, 2007.
[34] R. C. Venkatesan, "Encryption of Covert Information through a Fisher Game", in Exploratory Data Analysis using Fisher Information, Frieden, B. R. and Gatenby, R. A. (Eds.), Springer-Verlag, London, 181-216, 2006.
[35] B. R. Frieden, A. Plastino, A. R. Plastino, B. H. Soffer, Phys. Rev. E 60 (1999) 046128.
[36] S. P. Flego, A. Plastino, A. R. Plastino, Ann. Phys. 326 (2011) 2533-2543.
[37] S. P. Flego, A. Plastino, A. R. Plastino, Physica A 390 (2011) 2276-2282.
[38] S. P. Flego, A. Plastino, A. R. Plastino, Physica A 390 (2011) 4702-4712.
[39] S. P. Flego, A. Plastino, A. R. Plastino, Physica Scripta 85 (2012) 055002-055008.
[40] C. Villani, Topics in Optimal Transportation, Graduate Studies in Mathematics Vol. 58, American Mathematical Society, 2000.
[41] G. Blower, Random Matrices: High Dimensional Phenomena, London Mathematical Society Lecture Notes, Cambridge University Press, Cambridge, 2009.
[42] P. Zegers, Entropy 17 (2015) 4918-4939; B. R. Frieden, B. H. Soffer, Phys. Lett. A 374 (2010) 3895-3898.
[43] R. C. Venkatesan, A. Plastino, Phys. Lett. A 378 (2014) 1341-1345.
[44] R. C. Venkatesan, A. Plastino, Ann. Phys. 359 (2015) 300-316.
[45] S. M. Kay, Fundamentals of Statistical Signal Processing, Vol I: Estimation Theory, Prentice-Hall Signal Processing Series, 1993.
[46] M. Casas, F. Pennini, A. Plastino, Phys. Lett. A 235 (1997) 457-463.
[47] M. C. Mackey, L. Glass, Science 197 (1977) 287-289.
[48] G. B. Moody, R. G. Mark, IEEE Eng. in Med and Biol. 20 (2001) 45-50.
[49] F. M. Nolle, F. K. Badura, J. M. Catlett, R. W. Bowser, M. H. Sketch, "CREI-GARD, a new concept in computerized arrhythmia monitoring systems", Computers in Cardiology 1986 13, 515-518, IEEE Press, Piscataway, NJ, 1986.
[50] A. L. Goldberger, L. A. N.
Amaral, L. Glass, J. M. Hausdorff, P. Ch. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, H. E. Stanley, Circulation 01 (2000) e215-e220.
[51] G. H. Golub, C. L. van Loan, Matrix Computations, third ed., Johns Hopkins Univ. Press, Baltimore, 1995.
[52] L. Diambra, C. P. Malta, Phys. Rev. E 57 (1999) 929-937.
[53] B. Pompe, "Mutual Information and Relevant Variables for Predictions", in Modeling and Forecasting Financial Data Techniques for Nonlinear Dynamics, Soofi, A. S. and Cao, L. (Eds.), Springer, New York, pp. 61-92, 2002.
[54] A. B. Owen, Empirical Likelihood, Chapman & Hall/CRC, Boca Raton, 2001.
[55] D. Guo, S. Shamai (Shitz) and S. Verdú, The Interplay Between Information and Estimation Measures, Foundations and Trends in Signal Processing Ser., Now Publishers, Boston, 2012.
[56] M. Casas, A. Plastino, A. Puente, Phys. Lett. A 248 (1998) 161-166.
[57] M. Kennel, R. Brown, H. D. I. Abarbanel, Phys. Rev. A 45 (1992) 3403-3411.
[58] P. I. Grassberger, I. Procaccia, Physica D 9 (1983) 189-208.
[59] J. D. Farmer, Physica D 4 (1982) 366-393.
[60] R. B. Govindan, K. Narayanan, M. S. Gopinathan, Chaos 8 (1998) 495-502.
[61] A. Casaleggio, S. Braiotta, "Study of the Lyapunov exponents of ECG signals from MIT-BIH database", Computers in Cardiology 1995, 697-700, IEEE Press, Piscataway, NJ, 1995; A. Casaleggio, S. Braiotta, Chaos Sol. & Frac. 9 (1997) 1591-1599.
[62] https://physionet.org/physiobank/annotations.shtml
[63] https://physionet.org/atm/mitdb/207/atr/0/3600/export/matlab/207m.mat
[64] https://physionet.org/atm/mitdb/207/description/record.txt
[65] https://physionet.org/atm/mitdb/207/atr/0/3600/rdann/annotations.txt
[66] https://physionet.org/atm/cudb/cu02/atr/0/3600/export/matlab/cu02m.mat
[67] https://physionet.org/atm/cudb/cu02/description/record.txt
Fig. 1. Predicted vs. original signals and MSE for the Mackey-Glass equation, $\tau = 30$ secs., $d = 5$, $T = 1.0$, and $M_p = 1494$. Modeling phase is in $[0, 300]$ secs. for $M = 300$ training data. New data predicted in $[300, 1495]$ secs. is $M_p - M = 1195$. Panels: (a) predicted vs. original signals, $x(t)$ against $t$ (secs.); (b) MSE against $t$ (secs.).

Fig. 2. Predicted vs. original signals and MSE for the Mackey-Glass equation, $\tau = 30$ secs., $d = 5$, $T = 5.0$, and $M_p = 1494$. Modeling phase is in $[0, 300]$ secs. for $M = 300$ training data. New data predicted in $[300, 1495]$ secs. is $M_p - M = 1195$. Panels: (a) predicted vs. original signals, $x(t)$ against $t$ (secs.); (b) MSE against $t$ (secs.).

Fig. 3. Predicted vs. original signals and MSE for the ECG signal from the MIT-BIH arrhythmia database for Sample 207 and lead MLII, $d = 4$, $T = 1.0$, and $M_p = 107{,}996$. Modeling phase is in $[0, 50]$ secs. for $M = 18{,}000$ training data. New data predicted in $[50, 300]$ secs. is $M_p - M = 89{,}996$. Panels: (a) predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.); (b) MSE against $t$ (secs.).

Fig. 4. Sample segments of predicted vs. original signals and MSE for the ECG signal from the MIT-BIH arrhythmia database for Sample 207 and lead MLII, $d = 4$, $T = 1.0$, and $M_p = 107{,}996$. Modeling phase is in $[0, 50]$ secs. for $M = 18{,}000$ training data. New data predicted in $[50, 300]$ secs. is $M_p - M = 89{,}996$. Panels (a)-(d): predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.), over the windows $[0, 50]$, $[38, 50]$, $[54, 63]$ and $[270, 280]$ secs., respectively.

Fig. 5. Predicted vs. original signals and MSE for the ECG signal from the MIT-BIH arrhythmia database for Sample 207 and lead MLII, $d = 4$, $T = 5.0$, and $M_p = 107{,}995$. Modeling phase is in $[0, 50]$ secs. for $M = 18{,}000$ training data. New data predicted in $[50, 300]$ secs. is $M_p - M = 89{,}995$. Panels: (a) predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.); (b) MSE against $t$ (secs.).

Fig. 6. Sample segments of predicted vs. original signals and MSE for the ECG signal from the MIT-BIH arrhythmia database for Sample 207 and lead MLII, $d = 4$, $T = 5.0$, and $M_p = 107{,}995$. Modeling phase is in $[0, 50]$ secs. for $M = 18{,}000$ training data. New data predicted in $[50, 300]$ secs. is $M_p - M = 89{,}995$. Panels (a)-(d): predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.), over the windows $[0, 50]$, $[38, 50]$, $[54, 63]$ and $[270, 280]$ secs., respectively.

Fig. 7. Predicted vs. original signals and MSE for the ECG signal from the Creighton University ventricular tachyarrhythmia database, Sample CUDB/CU02, $d = 4$, $T = 1.0$, and $M_p = 127{,}228$. Modeling phase is in $[0, 120]$ secs. for $M = 30{,}000$ training data. New data predicted in $[120, 515]$ secs. is $M_p - M = 97{,}228$. Panels: (a) predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.); (b) MSE against $t$ (secs.).

Fig. 8. Predicted vs. original signals and MSE for the ECG signal from the Creighton University ventricular tachyarrhythmia database, Sample CUDB/CU02, $d = 4$, $T = 5.0$, and $M_p = 127{,}227$. Modeling phase is in $[0, 120]$ secs. for $M = 30{,}000$ training data. New data predicted in $[120, 515]$ secs. is $M_p - M = 97{,}227$. Panels: (a) predicted vs. original signals, ECG amplitudes (mV) against $t$ (secs.); (b) MSE against $t$ (secs.).
