Computational aspects and applications of a new transform for solving the complex exponentials approximation problem
Many real life problems can be reduced to the solution of a complex exponentials approximation problem which is usually ill posed. Recently a new transform for solving this problem, formulated as a specific moments problem in the plane, has been prop…
Authors: Piero Barone
Computational asp ects and applications of a new transform for solving th e complex exp on en tials approx imation problem Piero Barone ∗ Abstract Man y re al life problems c an be red uced to the s olution of a complex exp o- nen tials approximation problem whic h is usually ill p o sed. Recen tly a new transform for solving this problem, form ulated as a sp ecific moments prob- lem in the plane, ha s b een prop osed in a theoretical framew ork. In this work some computational issues ar e addressed to mak e this new to ol useful in practice. An algorithm is dev elop ed and used to solv e a Nuclear Magnetic Resonance spectrometry problem, tw o time se ries in terp olat io n and extrap- olation problems a nd a shap e from momen ts problem. Key wor ds: complex momen ts problem; P ade’ approxim ants; logarith- mic poten tials; random determinan ts; p encils of matrices AMS classific ation: 62M15, 30Exx ∗ Istituto per le Applicazioni del Calcolo ”M. Picone” , C.N.R., Viale del Policlinico 137, 00161 Ro me, Italy; e-ma il: barone@ iac.rm.cnr.it; fax: 39- 6-440 4306 1 Complex exponentials approximation 2 In tro ductio n Man y signal pro cessing problems (see e.g. [23]) can b e form ulated as a com- plex exponential in terp ola tion problem (CEIP): given the complex n um b ers s k , k = 0 , 1 , 2 , . . . 2 p − 1 , to find complex n umbers { c j , ξ j } , j = 1 , . . . , p suc h that s k = p X j =1 c j ξ k j , k = 0 , 1 , . . . , 2 p − 1 . (1) or, equiv alen tly [15 ], to find p oles ξ j and corresp onding residues r j = c j /ξ j of the ra t ional f unction s ( z ) whose first 2 p T a ylor co efficien ts at z = 0 are s k , k = 0 , 1 , 2 , . . . 2 p − 1 . The problem can be restated a s a generalized eigen v alue problem a s follow s. Let us consider Hank el matrices U 0 ( s ) and U 1 ( s ) giv en b y U 0 ( s ) = s 0 s 1 . . . s p − 1 s 1 s 2 . . . s p . . . . . . s p − 1 s p . . . s 2 p − 2 U 1 ( s ) = s 1 s 2 . . . s p s 2 s 3 . . . s p +1 . . . . . . s p s p +1 . . . s 2 p − 1 where s = [ s 0 , . . . , s 2 p − 1 ] . Because of (1), the fo llowing facto r izat io ns ho ld U 0 ( s ) = V C V T , U 1 ( s ) = V C Z V T where V is the V andermonde matrix based on ( ξ 1 , . . . , ξ p ), C = diag { c 1 , . . . , c p } and Z = diag { ξ 1 , . . . , ξ p } . Complex exponentials approximation 3 Therefore ( ξ j , j = 1 , . . . , p ) are the generalized eigen v alues of the p encil P = [ U 1 ( s ) , U 0 ( s )] and ( c j , j = 1 , . . . , p ) are related to the g eneralized eigen v ector u j = V − T e j of P b y c j = u T j [ s 0 , . . . , s p − 1 ] T , where e j is t he j − th column of the iden tity matrix I p of order p . A further equiv alen t fo rm ulation is based on the complex measure S ( z ) = p X j =1 c j δ ( z − ξ j ) , z , ξ j ∈ D , where D is a compact subset of I C a nd δ is the Dirac distribution. I t turns out that fo r k = 0 , 1 , 2 , . . . s k = Z D z k S ( z ) dz = Z Z D ( x + iy ) k S ( x + iy ) dxdy . Therefore s k is the k - th harmonic momen t o f the measure S and the com- plex exp onential interpolatio n problem is equiv alen t to a sp ecific moment problem in the plane consisting in retrieving the distribution S f rom s k , k = 0 , 1 , 2 , . . . 2 p − 1. Conditions for existence and unicit y of the solutio n are detU 0 ( s ) 6 = 0 , detU 1 ( s ) 6 = 0 (see e.g. [15, Th.7.2c]). More realistically , by denoting in b old all random quan tities, let us consider the discrete sto c hastic pro cess defined b y a k = s k + ν k , k = 0 , 1 , 2 , . . . , n − 1 (2 ) where n ≥ 2 p and ν k is a complex Gaussian zero-mean white noise discrete pro cess with kno wn v ariance σ 2 . W e wan t therefore to solv e the complex exp o nen tial approximation problem (CEAP) consisting in estimating p and { c j , ξ j } , j = 1 , . . . , p, from a r ealizatio n a k , k = 0 , 1 , 2 , . . . , n − 1 of a k . This is equiv alen t, when p is kno wn, to solv e a Pade ’ appro ximation problem i.e. to compute the [ p, p − 1] Pade’ approximan t of the fo rmal p o w er series Complex exponentials approximation 4 f ( z ) = P k a k z − k , or to solving a g eneralized eigen v alue pro blem f o r non- square p encils [1 7, 7 ] or a specific noisy momen ts problem in the plane. Ev en if p w ere known the problem w ould b e quite difficult and usually ill- p osed. A wide literature exists on the sub ject. W e can summarize some we ll known facts as follow s (see e.g. [19 , 14, 10]). The problem is optimally conditio ned when ξ j are equispaced on t he unit circle. In this case in fact mo del (1) reduces t o the F ourier mo del whic h is an orthogo nal one. Clusters o f ξ j are more difficult to estimate than w ell separated ones. Complex exp o nen- tials with relativ ely small | c j | are more difficult to estimate than those with relativ ely large w eigh t. Recen tly a new approa c h for solving the complex expo nen tial approxima- tion problem in a sto c hastic framew ork w as prop osed [1], whic h exploits the relation with generalized eigen v alue problems and with momen ts problems outlined ab ov e but without assuming t o know p . It mak es use of to ols from the t heory of loga rithmic p oten tial with external fields [22] and the theory of r a ndom p olynomials [5, 11] and provide s an estimate of p and p oint and in terv al estimates of { c j , ξ j } . In this work some computational and n umerical issues are addressed to mak e this new to ol useful in practice. An algorithm is dev elop ed and tested on we ll known difficult problems. The pap er is organized as fo llows. In Section 1 the metho d in tro duced in [1] is shortly summarized. In Section 2 t he prop osed algorithm is discussed. In Section 3 the algorithm is used to solv e a Nuclear Magnetic Resonance sp ectrometry problem, a time series interpola t ion and extrap olation prob- lem and a shap e from moments problem providing some comparisons with existing metho ds. Complex exponentials approximation 5 1 The new transform Starting from a k , k = 0 , 1 , 2 , . . . , n − 1, assuming n eve n, let us consider the sto c hastic CEIP (i.e. a CEIP for each realization of { a k } ) a k = n/ 2 X j =1 c j ξ k j , k = 0 , 1 , . . . , n − 1 .. (3) and the asso ciated random measure S n ( z , σ ) = n/ 2 X j =1 c j δ ( z − ξ j ) . (4) Let us also define the ra ndom Hank el n 2 × n 2 matrices U 0 ( a ) , U 1 ( a ), where a = [ a 0 , . . . , a n − 1 ]. The generalized eigen v alues ξ j , j = 1 , . . . , n/ 2 of the random p encil P = [ U 1 ( a ) , U 0 ( a )] satisfy t he equation p n/ 2 ( z ) = det [ U 1 ( a ) − z U 0 ( a )] = 0 where p n/ 2 ( z ) is a random p o lynomial. W e can then consider the exp ected v alue of the (random) normalized counting measure on the zeros ξ j , j = 1 , . . . , n/ 2 of this p olynomial (condensed densit y , [11, 5]): h n ( z ) = 2 n E n/ 2 X j =1 δ ( z − ξ j ) . In [2] it w as pro v ed that, when s = 0, in the limit for n → ∞ the condensed densit y is a distribution supp orted on the unit circle and it can b e prov ed ([1]) that in the limit for σ → ∞ the generalized eigen v alues ξ j tend to concen trate on the unit circle and, in the limit for σ → 0 , they concen trate around the tr ue ξ j , j = 1 , . . . , p . It is therefore eviden t that in order to solve CEAP , the first issue to address is the identifiabilit y one. If the Signal-to Complex exponentials approximation 6 Noise rat io (SNR) is not large enough with resp ect to the signal structure as discussed in the in tro duction, t here is no hop e to solv e CEAP . The first step of the metho d in tro duced in [1] provides a t o ol fo r assess ing if CEAP is solv able based on the prop erties of the condense d densit y of the generalized eigen v alues h n ( z ). More precisely w e giv e the followin g: Definition 1 The me asur e S ( z ) is i d entifiable fr om a k , k = 0 , . . . , n − 1 if ∃ r k > 0 , k = 1 , . . . , p such that • h n ( z ) is unimo dal in N k = { z k | z − ξ k | ≤ r k } • T p k =1 N k = ∅ The follow ing r esult, prov ed in [1 ], give s the relation b etw een S n ( z , σ ), and the unkno wn measure S ( z ) Theorem 1 If S ( z ) is identifiable fr om a then Z N h ( r h ) E [ S n ( z , σ )] dz = c h + o ( σ ) , h = 1 , . . . , p and Z A E [ S n ( z , σ )] dz = o ( σ ) , ∀ A ⊂ D − [ j N j ( r j ) . As in the limit for σ → 0, the condensed densit y tends contin uously t o a dis- tribution supp orted on the true ξ j , j = 1 , . . . , p , it do es exist σ small enough to mak e S ( z ) iden tifiable from a and in this case w e can use the random mea- sure S n ( z , σ ) to estimate S ( z ) b y using Theorem 1. T o p erform this prog ram w e need tw o steps. The first one consists in either to chec k the iden tifiability of the measure S ( z ) from a or to prop erly design the exp erimen t (i.e. to Complex exponentials approximation 7 c ho ose n and σ ) in order to get iden tifiability . The second step consists in building an estimator of S n ( z , σ ). Ab out the first step we notice t hat of course the function h n ( z ) cannot b e computed b ecause w e do not know s i.e. the mean of a . How ev er, assuming to kno w s , we can use h n ( z ) to state whether S ( z ) is identifiable from the data. Unfortunately ev en in the G aussian assumption the a nalytic computation of h n ( z ) is hard. Ho w ev er it can b e approximated ([1]) b y ˜ h n ( z ) = 1 2 π n ∆ X µ j ( z ) > 0 log( µ j ( z )) where ∆ is the Laplacian op erato r acting on z and µ j ( z ) are the eigenv alues of ( U 1 ( s ) − z U 0 ( s ))( U 1 ( s ) − z U 0 ( s )) + nσ 2 2 A ( z , z ) (5) where s = [ s 0 , . . . , s n − 1 ], A ( z , z ) = 1 + | z | 2 − z 0 . . . 0 − z 1 + | z | 2 − z 0 . . . . . . . . 0 . . . 0 − z 1 + | z | 2 ∈ I C n 2 × n 2 and ov erline denotes conjugation. Remark. F rom eq uation (5) it follo ws that n should not be as large as p ossible to get the b est estimates o f S ( z ). In fact to o man y data will con v ey to o m uc h noise which could mask the signal s k . W e ha v e therefore a to o l either to c hec k iden tifiability or to design pro p erly the exp erimen t. In most r eal problems w e hav e some prior information ab out the unkno wn measure S ( z ) . W e can then compute ˜ h n ( z ) for s ev eral candidate measures compat ible with o ur prior informatio n and choo se v alues n and σ that make the candidate measures iden tifiable. Complex exponentials approximation 8 W e no w mov e to the second step of the pro cedure consisting in estimating the random measure S n ( z , σ ) and extracting from it the required information. If we hav e R samples from the data discrete sto ch astic pro cess a w e can estimate E [ S n ( z , σ )] b y solving CEIP for each sample a ( r ) , r = 1 , . . . , R, i.e. finding ( c ( r ) j , ξ ( r ) j ) , j = 1 , . . . , n/ 2 , suc h that a ( r ) k = P n/ 2 j =1 c ( r ) j ( ξ ( r ) j ) k , k = 0 , 1 , . . . , n − 1 and then taking the sample mean 1 R R X r =1 n/ 2 X j =1 c ( r ) j δ ( z − ξ ( r ) j ) . If o nly one sample is av a ilable we can use the follow ing metho d prop osed in [1]. W e notice first tha t in order to cop e with the Dirac distribution app earing in the definition of S n ( z , σ ), it is con v enien t to use an alternativ e expression giv en b y (see [2]) S n ( z , σ ) = 1 4 π ∆ n/ 2 X j =1 c j log( | z − ξ j | 2 ) . Then we build indep enden t replications of the data pro cess (pseudosamples) b y defining a ( r ) k = a k + ν ( r ) k , k = 0 , . . . , n − 1; r = 1 , . . . , R where { ν ( r ) k } are i.i.d. zero mean complex Gaussian v ariables with v ariance σ ′ 2 and therefore a ( r ) k ha v e v ariance ˜ σ 2 = σ 2 + σ ′ 2 . W e then define the estimator, conditioned to a S c n,R ( z , ˜ σ ) = 1 2 π R ∆ R X r =1 n/ 2 X j =1 c ( r ) j log( | z − ξ ( r ) j | ) (6) where ( c ( r ) j , ξ ( r ) j ) , j = 1 , . . . , n/ 2 ar e t he solution of CEIP fo r the pseudo data a ( r ) k , k = 0 , . . . , n − 1 which are computable b y a MonteCarlo pro cedure giv en a . In [1] the fo llowing theorem is pro v ed Complex exponentials approximation 9 Theorem 2 L et M ( z ) and M c ( z ) b e the me an squar e d err or of S n ( z , σ ) and S c n,R ( z , ˜ σ ) r es p e ctively. In the limit for σ → 0 , it exi s ts σ ′ and R ( σ ′ ) such that ∀ R ≥ R ( σ ′ ) , M c ( z ) < M ( z ) ∀ z . In order to estimate ( c j , ξ j ) , j = 1 , . . . , p, we ma ke use of Theorem 1. In fact, if S ( z ) is identifiable, there exist disjoin t sets N k , k = 1 , . . . , p suc h that R N k E [ S n ( z , σ )] dz ≫ σ , and eac h o f them should include one and only one ξ k . Therefore lo oking at the sets A suc h tha t R A S n,R ( z , ˜ σ )) dz ≫ σ it is p ossible to iden tify ˆ p disjoin t sets ˆ N k whic h p o ssibly include the true ξ k . This can b e done b y computing a discrete tra nsform b y ev aluating S c n,R ( z , ˜ σ ) on a suitable la ttice L = { ( x i , y i ) , i = 1 , . . . , N } by taking a discretization of the Laplacian o p erator, g iving rise to a matr ix P ( ˜ σ ) ∈ ℜ ( N × N ) + - t he P -transform of t he v ector a - suc h that P ( i, j, ˜ σ ) = S c n,R ( x i + iy j ). The ˆ p relative maxima of the absolute v alue of the P -transform are then computed as w ell as disjoin t neigh b ors ˆ N k cen tered o n them. Estimates ( ˆ c k , ˆ ξ k ) of ( c k , ξ k ) are obtained by a v eraging the ( c ( r ) j , ξ ( r ) j ) whic h b elong to eac h ˆ N k . The name ”transform” is justified b y observing that t o the vec tor a w e asso ciate the matrix P (direct transform), and to the matrix P we asso ciate the v ector whose comp onents are ˆ a k = ˆ p X j =1 ˆ c j ˆ ξ k j → a k , whe n σ → 0 (in v erse tra nsfor m). 2 The algorithm The metho d for estimating the unkno wn parameters p , { ( c j , ξ j ) , , j = 1 , . . . , p } outlined in the previous section is quite exp ensiv e and delicate from the n u- merical p oint of view. In this section w e discuss the main issues to b e a d- Complex exponentials approximation 10 dressed to implemen t the basic metho d and suggest a new approac h whic h mimics the basic one giving r ise to a fast and reliable alg o rithm. The computation of the P -t r a nsform is the most critical part of the whole pro cedure. There are man y algo rithms to compute ( ˆ c j , ˆ ξ j ) based on differen t approac hes (see e.g. [12, 3] for short r eviews) whic h are useful in differen t applied con texts. If computationa l burden is the principal issue and the geometric structure of the unkno wn measure S ( z ) is simple, extremely fast algorithms based on the generalized o rthogonality of P ade’ p o lynomials can b e used to compute ( ˆ c j , ˆ ξ j ) ([1 5 , pg.631-632 ],[6]). If clusters of p oles can b e exp ected it is b etter t o solv e the generalized eigen v alue problem e.g. as discusse d in [1 9 ] and [1 2] where sev eral adv anced metho ds ar e presen ted or [14], where the Hanke l structure of the p encil P = [ U 1 ( a ) , U 0 ( a )] is tak en into accoun t to sp eed up the computation and QR factorizatio n and Q Z iteration are used as w ell as a suitable diagonal scaling of the p encil P , for ach ieving n umerical stabilit y . An eve n mor e expensiv e me tho d is describ ed in [24] where a total least squares approach is used taking in to accoun t the Hank el structure and the noise aff ecting the elemen ts of P . A classical approach is giv en b y Prony’s metho d [20] whic h splits the problem in three parts b y solving t w o linear least squares problems with T o eplitz and V andermonde structure resp ectiv ely and a p olynomial ro ot ing problem. F ast co des for all these sub-steps do exist [13, sect.4.6,4.7] as w ell as to tal least squares [27] and structured total least squares algorithms [17]. A further complication is due to the fact that for computing t he P -transform R generalized eigenv alue problems ha v e to b e solv ed. An effectiv e compro- mise b et we en accuracy and sp eed of computation is giv en by the fo llo wing pro cedure: • compute ( c (0) j , ξ (0) j ) , j = 1 , . . . , n/ 2 , by solving the generalized eigen- Complex exponentials approximation 11 v alue problem for the p encil P b y o ne of t he accurate metho ds quoted ab ov e. If the metho d describ ed in [1 4] is used the computational cost of this step is O (( n 2 ) 3 ) • select the generalized eigen v alues ξ (0) j , j = 1 , . . . , ˜ p corresp onding to the ˜ p largest v alues | c (0) j | where ˜ p is a n upp er b ound of p, ˜ p ≤ n 2 • fo r each pseudosample a ( r ) compute the co efficien ts of the p o lynomial p ( z ) = det[ U 1 ( a ( r ) ) − z U 0 ( a ( r ) )] b y the first step of Pron y’s metho d. This requires O (( n 2 ) 2 ) flops b ecause of the Hank el structure of U 0 ( a ( r ) ) • to compute ξ ( r ) j , j = 1 , . . . ˜ p , apply a fast iteration suc h as e.g. Laguerre metho d [30] to the p olynomial p ( z ), taking as initial v alues ξ (0) j , j = 1 , . . . , ˜ p . Usually it con v erges in few iterations, t herefore it costs O ( n 2 ˜ p ) flops o r less if the Horner sche me to compute the p olynomial deriv ativ es is implemen ted through the fa st F o urier transform [25]. • to compute c ( r ) j , j = 1 , . . . ˜ p , apply the third step of Pron y’s metho d, forming the V andermonde matrix o f ξ ( r ) j , j = 1 , . . . ˜ p and solving a least squares problem e.g. by LSQR metho d [21], whic h is a go o d compromise b etw een accuracy and computational speed. Usually a few iterat ions are sufficien t, therefore it costs O ( n 2 ˜ p ) flops. • The last step for computing the P -tra nsform consists in ev aluating the summation in (6) and then computing a discrete Laplacian. This can b e the most exp ensiv e part of the pro cedure b ecause the summation m ust b e computed fo r eac h z i = ( x i , y i ) , i = 1 , . . . , N of the lattice L . Therefore we need O ( R ˜ pN 2 ) flops. How ev er we notice that w e o nly Complex exponentials approximation 12 need an estimate of the lo cal maxima of the absolute v alue of the P - transform. These are lik ely to b e close to the cen troids of p oles clusters and their v alue is a monotonic increasing function of the corresp o nding | c ( r ) j | . A fast w a y to e stimate them consists then in applying a clustering metho d, suc h e.g. k-means, to the R ˜ p ve ctors o f I R 3 [ ℜ ξ ( r ) j , ℑ ξ ( r ) j , | c ( r ) j | ] , r = 1 , . . . , R, j = 1 , . . . , ˜ p lo oking for ˜ p clusters. The clustering algorithm can b e initialized by [ ℜ ξ (0) j , ℑ ξ (0) j , | c (0) j | ] , j = 1 , . . . , ˜ p computed in the first t w o steps. W e then compute S c n,R ( z , ˜ σ ) = 1 2 π R ∆ R X r =1 ˜ p X j =1 c ( r ) j log( | z − ξ ( r ) j | ) ! (7) for z ∈ N k , k = 1 , . . . , ˜ p where N k is a small regular mesh of p oints with size δ , cen tered o n the centroid o f the k − th cluster. Finally , after theorem 1 , we select the ˆ p ≤ ˜ p clusters suc h tha t X z h ∈ N k S c n,R ( z h , ˜ σ ) δ 2 > K σ where K > 1. Estimates ( ˆ c j , ˆ ξ j ) , j = 1 , . . . , ˆ p o f ( c j , ξ j ) , j = 1 , . . . , p are then obtained b y av eraging the ( c ( r ) j , ξ ( r ) j ) whic h b elong to the se- lected clusters. The computational cost of t he clustering alg o rithm and the computation o f S c n,R ( z , ˜ σ ) is O ( R ( ˜ p ) 2 ) flops. Summing up w e can solve the CEAP pro blem in O (( n 2 ) 3 ) + O ( R ( n 2 ) 2 ) flops. In most applications R < n is enough to get go o d results, therefore O ( n 3 ) flops is a reasonable upper b ound for solving the pro blem in most cases Complex exponentials approximation 13 (fast metho d). In a few particularly difficult problems the computation of ξ ( r ) j , j = 1 , . . . ˜ p, r = 1 , . . . , R is b etter p erformed b y the same accurate metho ds used for r = 0. In these cases the computatio nal burden b ecomes O ( n 4 ) (slow metho d). 3 Numerical exp erimen ts In order to appreciate the b eha vior of the prop osed algorit hm in practice, four examples o n real and syn thetic data are presen ted. The first one cop es with the classic problem o f quan tification of Nuclear Mag netic Resonance sp ec- tra (see e.g. [28, 29]) which is usually solv ed b y ad ho c metho ds requiring visual insp ection by the op erato r . The second example is a n interpolatio n- extrap olation problem on a syn thetic time series used in the 2004 Comp eti- tion on Artificial Time Series, o r ganized in the framework of the Europ ean Neural Netw ork So ciet y [8]. Comparisons with the results o btained by par- ticipan ts are provided. The third example is an in terp olat io n problem of a real acoustic signal with a missing frag ment. The aim here is t o reconstruct the missing part in order to mak e the reconstructed signal to sound a s the original. The last example is a shap e from momen ts reconstruction problem. It turns out that the iden tification of a p olygonal region in the plane from its complex momen ts can b e form ulated as a sp ecific CEIP [9, 14]. Syn thetic data sets are g enerated and the results are compared with those obtained in [12] when the n um b er o f the p olygon v ertices is known. Moreo v er the case when the num b er of v ertices is unkno wn is also addressed. W e notice that sev eral hy p erparameters hav e to b e chose n e.g. the upp er b ound ˜ p of p , the n um b er R of pseudosamples, the v ariance σ ′ 2 of { ν ( r ) k } and the constant K . Moreo v er o ne of the most critical hyperparameter is the Complex exponentials approximation 14 n um b er n o f data p o in ts, as noted in t he Remark in section 2. Usually w e can only cut some data in or der to reduce the noise. In order to select go o d h yp erparameters a p erformance criterion is c hosen and the metho d is applied for sev eral v alues of the h yp erpara meters in suitable in terv a ls. Then those that g ive the b est v alue of the p erformance criterion are used to compute the final results. The p erformance criterion is problem dep enden t. Ho w ev er a standard residual analysis prov ides usually a go o d basis to build up a go o d criterion. In the fo llo wing t he n umber of residuals whose absolute v alue is larger tha n σ is used as p erformance criterion. 3.1 NMR sp ectroscop y In the top part o f fig.1 a 1 H Magnetic Resonance absorption sp ectrum f r o m a diluted aqueous solution of the tr ip eptide Glutathione in it s r educed form (GSH) is show n. It is computed by taking the discrete F ourier transform (DFT) of a F ree Induction Deca y (FID ) signal of 4096 data p oints. In ideal conditions the ph ysical mo del for the FID is a linear com bination with p o s- itiv e w eigh ts of complex exp onen tials. The absorptio n sp ectrum is the real part o f the F ourier transform of the FID . It t ur ns out that it is giv en by a linear comb inatio n of Lor entian functions. The sp ectroscopist is inte rested in estimating the parameters characterizing these Loren tian lines, namely their mo des, widths and relativ e areas which are simply related resp ectiv ely to the argument of the complex exp onen tials mo deling the FID , to their ab- solute v alue and to w eigh ts associated t o them. In a real exp erimen tal setup the ideal conditions are no longer true. Standard metho ds, implemen ted on most spectrometers, fit each p eak of the absorption sp ectrum with a Loren- tian function. If the p eaks are close, a v ery ill conditioned non linear problem has to b e solv ed whic h can heav ily dep end on the in teractive ch oices of the Complex exponentials approximation 15 sp ectroscopist to initialize the pro cedure. A b etter a lternativ e is pro vided b y time-domain metho ds (see e.g.[28, 29]) whic h exploit the f act that the FID can b e mo deled by complex exp onentials. The problem can still b e very ill conditioned. Ho we ve r, if the SNR is large enough, reasonable estimates of the quan tities of in terest can b e obtained b y solving the CEAP problem for the FID by the prop osed metho d, whic h provide s a global stable solution and no longer requires critic interaction with the sp ectroscopist. The analysis is p erfo rmed in the interv al of the sp ectrum marke d by the rectangle in the to p part of fig.1. A quadruplet whose a r eas a re in the ra tios 1 : 3 : 3 : 1 is the theoretical reference. The frequencies a r e measured in parts p er million (ppm). The standard interactiv e pro cedure pro vides an estimate of the a reas of the four p eaks suc h that their ratios are 1 . 02 : 3 . 21 : 3 . 1 5 : 1. In o rder to apply t he prop osed metho d the F ID is first filtered by a pass- band F ourier filter [4 , 3]. In the middle-b ott o m part of the same figure, the absorption sp ectrum o f the filtered FID is sho wn. When the main p eaks of the sp ectrum are clustered and the clusters are w ell separated, it is in f a ct p ossible to split the analysis b y filtering out from the FID all the frequencies but those b elonging to a giv en interv al [18]. The filtered FID is giv en b y only 30 0 data p oints and the prop osed metho d was applied to solv e the CEAP for it. The results are sho wn on the middle-top part of fig.1. F our estimated Loren tian lines mark ed 1 − 4 are plo t t ed and their a reas are rep orted in the legend as w ell their mo des in ppm. The ratios o f the a reas are 1 . 07 : 2 . 98 : 2 . 92 : 1 whic h compare fa vorably with those estimated in teractiv ely . On t he b o t t om part of the figure the weigh ted sum of the four Loren tian lines is plot t ed. The ag reemen t with the zo omed absorption sp ectrum on the middle-b otto m part is quite go o d. Complex exponentials approximation 16 3.2 Time series in terp olation and extrap olation In order to apply the prop osed metho d to solv e extrap olatio n problems it is enough to solv e a CEAP for the measured data and then ev aluate the mo del on the extrap olation abscissas. T o solv e an interpolation problem w e not ice that, in the noiseless case, we can consider the segmen t s o f data b efore and after the missed segmen t as pro duced b y the same mo del (1) for a set of in- dices k displaced b y a fixed quantit y q . It is easy to sho w that the generalized eigen v alues and eigen v ectors are inv ariant for suc h a displacemen t. Therefore w e can solv e t w o separate CEAPs for the observ ed segmen ts, and apply the prop osed metho d to the p o oled generalized eigenv alues and eigenv ectors. W e need only to mo dify the V andermonde matrix for computing c ( r ) j in the last step to take in to account the gap in the observ ations. Assuming that eac h segmen t has n o bserv a tions w e ha v e c ( r ) = V † a , where V = 1 1 . . . 1 ξ ( r ) 1 ξ ( r ) 2 . . . ξ ( r ) ˜ p . . . . . . . . . . . . ( ξ ( r ) 1 ) n − 1 ( ξ ( r ) 2 ) n − 1 . . . ( ξ ( r ) ˜ p ) n − 1 ( ξ ( r ) 1 ) n + q − 1 ( ξ ( r ) 2 ) n + q − 1 . . . ( ξ ( r ) ˜ p ) n + q − 1 . . . . . . . . . . . . ( ξ ( r ) 1 ) 2 n + q − 1 ( ξ ( r ) 2 ) 2 n + q − 1 . . . ( ξ ( r ) ˜ p ) 2 n + q − 1 Complex exponentials approximation 17 The interpolated v alues are then obtained b y a int = V ˆ c , V = ˆ ξ n 1 ˆ ξ n 2 . . . ˆ ξ n ˆ p . . . . . . . . . . . . ˆ ξ n + q − 1 1 ˆ ξ n + q − 1 2 . . . ˆ ξ n + q − 1 ˆ p . The first example in this class of problems cop es with a time series of 5000 samples with 1 0 0 missing v alues at times 981 − 1000 , 1 9 81 − 2000 , 2981 − 3000 , 3981 − 4000 , 498 1 − 5000 . Therefore we w ant to solv e four in terp o la tion and one extrap olatio n problems. As the data are syn thetic the truth is kno wn and the results o btained by 17 metho ds are rep orted in [8] where the mean squared error (MSE) for the interpolatio n problems a nd the in terp olation + extrap olation pro blems are rep ort ed. It can b e argued that the MSE is not the b est discrepancy measure for this data set b ecause a fit with a smo othing cubic spline giv es results b etter tha n all of the 17 quoted metho ds for the in terp olatio n + extrap olat io n problems and b etter t ha n 15 of t hem for the extrap olation problem. Therefore w e wan t to see ho w m uc h the pro p osed metho d is able to impro v e on the solution pro vided by the cubic spline. W e then apply the metho d to the residual obtained b y subtracting the sm o othing spline fro m the data. In fig.2 top left the full time series with missing data is plotted. The other plots show the true v alues and the reconstructed ones on each missed data interv al. The M S E 100 = 270 and M S E 80 = 195 hav e to b e compared with M S E 100 = 408 a nd M S E 80 = 222 which are the b est results obtained in [8] b y t wo differen t metho ds among the 17 considered. The second example is illustrat ed in fig. 3 . An audio signal, corresp onding to a ringing b ell, made up of 50000 samples a t 11025 Hz is considered. The first 1000 0 samples are plott ed in the top left part of the figure. A fra g men t of 1 000 samples are put to zero as sho wn in the top right part of the figure. Complex exponentials approximation 18 The metho d is applied to in terp olate the missing f ragmen t. Tw o da ta sets made up of 300 samples eac h b efore and after the miss ing data are considered as sho wn in the middle left part o f the fig ure. The results are sho wn in the middle right par t of the figure where 50 missed data are plot ted sup erim- p osed to the inte rp olated v alues. Ev en if the fit is not impressiv e most o f the main sp ectral c haracteristics of the signal are w ell repro duced a s sho wn in the b ottom part of the figure where the F o urier sp ectrum of the original complete signal is plo tted on the left, and the F ourier sp ectrum o f the com- plete signal with the missing fragment replaced by the interpo la ted v alues is shown on the rig h t. The sound pro duced b y the reconstructed signal is almost undistinguishable from the original one. σ RM S E [12] RM S E Star s hape 1e-3 5.74e-2 3.68e-2 1e-4 1.74e-2 1.02e-2 1e-5 1.71e-3 1.05e-3 C shape 1e-3 4.46e-3 4.30e-3 1e-4 4.51e-4 4.27e-4 1e-5 4.59e-5 4.28e-5 T able 1: F or the star shap ed p olygon and the C shap ed p olygon, the RMSE a v eraged ov er all the v ertices obtained in [12] when p is know n and equal t o the true v alue for σ = 1 e − 3 , 1 e − 4 , 1 e − 5 is rep orted in the third column. In the fourth column the corresp onding RMSE obtained by the prop osed pro cedure is r ep orted. Complex exponentials approximation 19 3.3 Shap e from momen ts problems In [9, 14] it w as show n that the p vertice s ξ 1 , . . . , ξ p of a non degenerate p olygon P and its complex momen ts µ k , k = 0 , 1 , . . . , 2 p − 1 are r elat ed by k ( k − 1 ) µ k = k ( k − 1) Z P z k dx dy = p X j =1 c j ξ j , µ 0 = µ 1 = 0 where c j = i 2 ξ j − 1 − ξ j ξ j − 1 − ξ j − ξ j − ξ j +1 ξ j − ξ j +1 ! assuming that the v ertices are arranged in coun terclo c kwise direction in the order of increasing index and ex tending the indexing of the ξ j cyclically so that ξ 0 = ξ p , ξ 1 = ξ p +1 . Therefore to iden tify the p o lygon (i.e. its v ertices) from its complex momen ts is equiv alen t to solv e a CEIP for the data s k = k ( k − 1) µ k . In [12] sev eral metho ds fo r solving this sp ecific problem w ere compared on tw o differen t p olygons for σ = 10 − 3 , 10 − 4 , 10 − 5 b y a sim ulat io n exp eriment inv olving N = 100 indep enden t replications and n = 1 01 noisy momen ts. F or comparison, in T able 1 the results o btained b y the prop o sed metho d and the b est among those repo rted in [12, T a bles IV, VI I I, b old figures] are rep orted. The ro ot mean squared error ( R MSE) av eraged o ve r all para meters ξ j is computed by RM S E = 1 p p X k =1 v u u t 1 N N X j =1 | ξ ( j ) k − ˆ ξ ( j ) k | 2 . As the b est results were obtained in [12] by using GPOF metho d ([16 ]) but in one case, also in the prop osed pro cedure the solutio n of the generalized eigen v alue problem (step 1) w as obt a ined b y GPOF with the same setup used in [12 ]. Therefore ˜ p = p is ass umed to b e know n, as in [12], and the P -transform w as no t computed b ecause a ll the p estimated clusters were Complex exponentials approximation 20 retained. Moreov er this was the only example where GPOF w as used also for computing ξ ( r ) j , j = 1 , . . . ˜ p, r = 1 , . . . , R (slo w metho d). An improv emen t can b e noticed in all cases. In the first column o f fig. 4 the estimated ξ j for σ = 10 − 4 and for the considered p olygons are plotted. W e notice that in some ve rtices, the ξ j are so concen t r a ted that they coincide with one p oin t at the used resolution. Next w e use the f ull fast prop osed pro cedure assuming not to know p and putting ˜ p = n/ 2. The RMSE av eraged o ve r all para meters ξ j and the mean and standard deviation o f ˆ p are rep or t ed in T able 2. In the second column of fig. 4 the estimated ξ j for σ = 10 − 4 and for the considered p olygons are plott ed. σ RM S E p mea n p s.d. Star s hape 1e-3 1.07e-1 8 3 1e-4 7.62e-2 10 3 1e-5 2.98e-2 11 4 C shape 1e-3 4.13e-2 9 1 1e-4 2.94e-2 9 2 1e-5 2.72e-2 9 2 T able 2: F or the star shap ed p olygon and the C shap ed p olygon, the RMSE a v eraged ov er all the v ertices obtained in the case of p unkno wn for σ = 1 e − 3 , 1 e − 4 , 1 e − 5 is rep or ted in the third column. In the fourth and fifth columns the estimated mean and s.d. o f p are rep orted. 4 Conclus ion A new approac h for solving a classic inv erse ill-p o sed problem is discussed from the computational p oint of view. The approa c h is a p erturbativ e one, therefore it exploits the information generated by solving sev eral closed prob- Complex exponentials approximation 21 lems b y any standard metho d whic h b est suits the user’s needs such e.g. n u- merical quality a nd/or computational sp eed. The final results a re obta ined b y an ”av eraging” step, hence they a re quite stable with resp ect to noise and, provide d that some h yp erparameters are prop erly selected, sensitivit y is also preserv ed, allo wing to retriev e features of the signal whic h are maske d b y the noise. Sev eral n umerical examples are presen ted whic h confirm these practical abilities oft en improv ing on the results giv en b y kno wn metho ds. Ac kno wledg men ts I wish to thank S.Grande and L.Guidoni of the Istituto Sup eriore di Sanita’, Rome, Italy , f or providing the NMR data a nd for man y useful discussions . References [1] Barone, P. (200 8 ). A new transform fo r solving t he noisy complex exp o nen tials a pproximation problem, arXiv:0 801.1758 . [2] Barone, P . (2005). On the distribution of p oles of P ade’ approxim ants to t he Z -transform of complex Gaussian white no ise, J. Appr ox. The ory 132 224-240. [3] Barone, P. , M arch, R . (2 0 01). A nov el class of P ad ´ e based metho d in spectral analysis. J. Comput. Metho ds Sci. Eng. 1 185-211. [4] Belkic D z., Dando P.A., Main J., T a ylor H.S. (20 00). Three no v el high-resolution nonlinear metho ds for fast signal pro cessing, J.Chem. Phys. 113 (16) 6542–6556 Complex exponentials approximation 22 [5] Bharucha-Reid A.T., Sambandham M. (1986) . R andom Polyno- mials . Academic Press, New Y ork. [6] Brezinski C., Re div o-Zaglia M. (1991). Extr ap olation metho ds: the ory and pr actic e . Nor th Holland, Amsterdam. [7] Boutr y, G., Elad, M . Golub, G., Milanf ar, P. (2005). The generalized eigenv alue pro blem for no nsquare p encils using a minimal p erturbation approach. SIAM J. Sci. Comp. , 27,2 582-601 . [8] Lendasse, A., Oja, E. , Simula, O., Verleysen, M. (20 04). Time Series Pre diction Comp etition: The CA TS Benc hmark IJCNN’2004 pr o- c e e dings International Join t Confer enc e on Neur al Networks Buda p est (Hungary), 25-2 9 July 2004, I EEE , 1615-1 620. [9] Da vis, P.J. (1964) . T riangle f orm ulas in the complex plane. Math. Com- put. , 18 569 -577. [10] Donoho, D.L. (1992). Superresolution via sparsity constrain ts. SI AM J. Math. Anal. , 23,5 1309-133 1. [11] Hammersley, J.M . (1956 ). The zeros of a random p olynomial. Pr o c. Berkely Symp. Math. Stat. Pr ob abili ty, 3r d , 2 89-111. [12] Elad, M., Milanf ar, P ., Golub, G. (20 04). Shap e from Momen ts - An Estimation Theory Perspectiv e, IEEE T r ans . on Signal Pr o c essin g , 52 1814-1829 . [13] Golub G.H., V an Loan C.F. ( 1996). Matrix c o mputations , The Johns Hopkins Univ ersity Press, Baltimore. Complex exponentials approximation 23 [14] Golub, G.H. , Milanf ar, P., V arah, J. (200 4). A stable n umerical metho d f o r inv erting shap es fro m momen ts. SIAM J. Sci. Comp. , 21,4 1222–124 3. [15] Henrici, P. (1977). Applie d and c omputational c om plex analysis, vol.I , John Wiley , New Y ork. [16] Hua, Y., Sarkar, T.K. (1991). Matrix p encil metho d for estimating parameters of damp ed/undamp ed sin usoids in noise, IEEE T ASSP , 39 892-900 . [17] Lemmerling, P., V a n Huffel, S. (2002). Structured t otal least squares: analysis, algorithms and applications. In V an Huffel, S., L em- merling, P.(Eds.) T otal le ast squar e s and err ors-in-variable s mo del ling . Kluv er, Dordrech t, 79–91. [18] Neuhauser, D . (1990 ). Bound state eigenfunctions fro m w a ve pac k ets: time-energy resolution, J.Chem. Phys. , 93 2611–26 16. [19] Osborne M.R., Smyth G.K. (1995). A Mo dified Prony Algo r ithm for Exp o nen tial F unction Fitting, SIAM J. Sci. Com p ut. 16 11 9 -138. [20] Prony, R. (17 9 5). Essai exp´ erimental et analytique sur les lo is de la dilatabilit ´ e de fluides ´ elastiq ues et sur celles de la f orce expansiv e de la v ap eur de l’eau et de la v ap eur de l’a lk o ol, ` a diff ´ eren tes temp´ era t ur es, Journal de l’ ´ Ec ole Polyte chnique F lor´ eal et Plairial , I I I, v ol.1, n.22, 24- 76. [21] P aige, C.C. and Saunders, M.A. (1982). LSQR: An Algorithm for Sparse Linear Equations And Sparse Least Squares. A CM T r ans. Math. Soft. 8 43- 71. Complex exponentials approximation 24 [22] Saff, E.B ., Totik, V. (1997) . L o garithmi c p o tentials with external fields , Springer, Berlin [23] Scharf, L.L. (1991). Statistic al signal p r o c essing , Addison-W esley , Reading. [24] Schuermans M., Lemmerling P., D e La thauwer L. , V an Huf- fel S. (2006 ) . The use of tota l least squares data fitting in the shap e- from-momen ts problem. Signal Pr o c essing 86 1109 - 1115. [25] Sitton, G.A.; Burrus, C.S.; Fo x, J.W.; Treitel, S. (2003). F ac- toring v ery-high-degree p olynomials, Signal Pr o c essin g Magazine IEEE 20 27- 42. [26] Stew ar t, G .W. (2 001). Matrix algorithms, vol.2 , SIAM, P hiladelphia. [27] V an Huffel S., V ander w alle J. (199 1 ). The total le ast squar es pr oblem : c omputational asp e c ts and analysis , SIAM, Philadelphia. [28] Viti, V., Petrucci, C. and Barone, P. (1 9 97). Pron y methods in NMR sp ectroscop y , In terna tion a l Journal of Ima g ing Systems and T e chnolo gy 8 565–571. [29] Viti, V., Ragona, R., Guido ni, G., B ar one, P., Furman, E., Degani, H. (1997 ). Hormona l -induced mo dulation in the phosphate metab olites of breast cancer: analysis of in viv o 31P MRS signals with a mo dified Prony metho d, Magnetic R esonanc e in Me dicine 38 285– 295. [30] Wilkinson J.H. (1965). The Algebr aic Eige nvalue Pr oblem , Clarendon Press, Oxford. Complex exponentials approximation 25 3.885 3.772 3.659 3.546 3.432 3.319 3.206 3.092 2.979 2.866 2.753 3.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 1 2 3 4 01 ppm = 3.028, area = 0.155 02 ppm = 3.026, area = 0.430 03 ppm = 3.024, area = 0.421 04 ppm = 3.022, area = 0.144 3.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 3.051 3.044 3.036 3.029 3.022 3.014 3.007 2.999 2.992 2.984 2.977 Figure 1: Quan tification of NMR sp ectra. T op: the NMR F o urier spectrum and the ROI. Middle-top: the estimated L o ren tian lines in the R OI a nd their areas. Middle-b ott om: the F ourier sp ectrum in the R OI. Botto m: the sum of the estimated Loren tian lines. Complex exponentials approximation 26 1000 2000 3000 4000 5000 980 985 990 995 1000 MSE = 114.9005 1980 1985 1990 1995 2000 MSE = 55.9423 2980 2985 2990 2995 3000 MSE = 426.7196 3980 3985 3990 3995 4000 MSE = 182.64 4980 4985 4990 4995 5000 MSE = 570.1459 Figure 2: T op left: t ime series with fiv e missing in terv a ls. T rue v alues on each in terv al (-); inte rp olated v alues (+). T otal MSE o n the first fo ur in terv als = 195. T otal MSE o n the five interv als = 270. Complex exponentials approximation 27 4500 5000 5500 6000 6500 −0.4 −0.2 0 0.2 0.4 5000 5010 5020 5030 5040 5050 −0.4 −0.2 0 0.2 0.4 0 1 2 3 4 0 10 20 30 0 1 2 3 4 0 10 20 30 40 0 2000 4000 6000 8000 10000 −0.5 0 0.5 0 2000 4000 6000 8000 10000 −0.5 0 0.5 Figure 3: T op left: a segmen t of a n audio signal; top righ t: the part missed is sho wn; middle left: the da ta used to in terp olate; middle right: a fraction of the inte rp olated data (+) are sup erimp osed to the unkno wn ones (- ); b ottom left: F ourier sp ectrum of the missed par t ; b ottom right: F ourier sp ectrum of the reconstructed data. Complex exponentials approximation 28 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1 Figure 4: Estimates of t he v ertices of the star shaped and C shap ed p olygons obtained b y t he pro p osed metho d on N = 100 replications with σ = 1 .e − 4 . Left: the true n um b er of ve rtices is kno wn. Right: the true n um b er of v ertices is unknown.
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment