Function reconstruction as a classical moment problem: A maximum entropy approach

We present a systematic study of the reconstruction of a non-negative function via maximum entropy approach utilizing the information contained in a finite number of moments of the function. For testing the efficacy of the approach, we reconstruct a …

Authors: Parthapratim Biswas, Arun K. Bhattacharya

F unction reconstruction as a classical momen t problem: A maxim um en trop y approac h P arthapratim Bisw as Dep artment of Physics and Astr onomy, The University of Southern Mississippi, Hattiesbur g, MS 39406, USA ∗ Arun K. Bhattac harya Dep artment of Physics, The University of Bur dwan, Bur dwan, WB 713104, India W e presen t a systematic study of the reconstruction of a non-negative function via maximum en tropy approac h utilizing the information con tained in a finite num b er of momen ts of the function. F or testing the efficacy of the approach, we reconstruct a set of functions using an iterative en tropy optimization scheme, and study the conv ergence profile as the num b er of moments is increased. W e consider a wide v ariet y of functions that include a distribution with a sharp discon tinuit y , a rapidly oscillatory function, a distribution with singularities, and finally a distribution with several spikes and fine structure. The last example is imp ortant in the con text of the determination of the natural densit y of the logistic map. The con vergence of the metho d is studied by comparing the moments of the appro ximated functions with the exact ones. F urthermore, by v arying the num ber of moments and iterations, we examine to what extent the features of the functions, such as the divergence b eha vior at singular points within the interv al, is repro duced. The proximit y of the reconstructed maxim um entrop y solution to the exact solution is examined via Kullback-Leibler divergence and v ariation measures for different n umber of moments. P ACS num bers: 02.30.Zz, 05.10.-a, 02.60.Pn I. INTR ODUCTION The reconstruction of a non-negativ e distribution from its moments constitutes the so-called classical momen t problem, and is an arc het ypal example of an in verse prob- lem [1, 2] in mathematical sciences. Owing to its im- p ortance in the context of probabilit y theory and the c hallenging problems of analysis asso ciated with it, the momen t problem has attracted the atten tion of a large n umber of researchers from many div erse fields of sci- ence and engineering [3 – 13]. In the classical Hausdorff momen t problem (HMP), one addresses the problem of reconstructing a non-negativ e, real v alued function f ( x ) in a finite interv al [ a, b ] from a sequence of real num- b ers. The sequence forms a ‘moment sequence’ that satisfies the Hausdorff conditions [14]. The problem is sev erely ill-posed in the Hadamard sense [15]. F or a fi- nite num b er of momen ts, most of the existing numerical metho ds are susceptible to large instabilities but several metho ds do exist that attempt to construct a regular- ized solution by av oiding these instabilities [9, 16]. The HMP has b een addressed by using a v ariet y of metho ds (suc h as Tikhono v’s regularization method [17] and the use of Pollaczek p olynomial by Viano [18, 19]) but the information-theoretic approac h is particularly fascinat- ing to the ph ysicists. The latter is based on the maxi- mum entr opy principle (MEP) prop osed by Jaynes [20]. The MEP provides a suitable framework to reconstruct a distribution by maximizing the Shannon information en tropy [21] and at the same time ensures the matching ∗ Electronic address: Partha.Bisw as@usm.edu of the moments of the distribution. Our in terest in the momen t problem stems from the fact that inv erse prob- lems of this t yp e are frequently encountered in many ar- eas of physical, mathematical and biological sciences [5– 12, 22, 23]. A v ery simple but elegant example is the in version of the sp ecific heat data of solids. It is known that the constant v olume vibrational sp ecific heat of a solid can b e expressed as the conv olution of a known function of the frequency and the vibrational frequency distribution function (FDF) [24]. The task of extract- ing the FDF b y inv erting the exp erimen tally-measured v alues of the sp ecific heat at constan t volume as a func- tion of temperature is a w ell-kno wn example of an inv erse problem in solid state physics [25, 26]. The fo cus of our present work is to reconstruct a non- negativ e function v ery accurately within the framew ork of the MEP from the kno wledge of a finite n um b er of momen ts. Although there exists a num ber of numerical pro cedures that address this problem, most of them be- come unreliable when the num ber of moment constrain ts exceeds a problem-dep endent upp er limit. A close re- view of the metho ds and the study of the example func- tions presented therein immediately reveal the weakness of the metho ds [9, 16]. F or example, it is v ery difficult to repro duce accurately the v an Hov e singularities in the frequency distribution of (crystalline) solids or the pres- ence of a gap in the density of electronic states in a solid. While the algorithm proposed by Silv er and R¨ oder [27] do es reproduce the latter correctly and is capable of deal- ing with a large num b er of moments, we are not aw are of an y systematic study of function reconstruction by this approac h at this time. It is, therefore, worth while to ex- plore the p ossibility of developing a reliable sc heme for the entrop y optimization program and to apply it to a 2 range of non-negative functions having complex struc- ture within the interv al. The rest of the pap er is organized as follows. In Section I I w e briefly describ e a procedure that has been devel- op ed recen tly b y us to reduce the momen t problem to a discretized en tropy optimization problem (EOP) [28]. W e then test our metho dology in Section I I I b y examin- ing to what exten t it is successful in reconstructing a wide v ariety of functions on the basis of input information in the form of Chebyshev moments of the functions. The con vergence b eha vior of the maximum entrop y solution is then discussed in Section IV with particular empha- sis on the num b er of moments. The proximit y of the reconstructed solution to the exact solution for differen t distributions is also studied via Kullback-Leibler [29] di- v ergence and v ariation measures [30]. I I. MAXIMUM ENTROPY APPRO A CH TO THE HAUSDORFF MOMENT PROBLEM The classical moment problem for a finite interv al [a, b], also known as the Hausdorff moment problem [14], can b e state d as follows. Consider a set of moments µ i = Z b a x i ρ ( x ) dx i = 0 , 1 , 2 , . . . , m, i ≤ m (1) of a function ρ ( x ) integrable ov er the interv al with µ i < ∞ ∀ x ∈ [a,b]. The problem is to construct the non- negativ e function ρ ( x ) from the knowledge of its mo- men ts. The necessary and sufficien t conditions for a solu- tion to exist were given b y Hausdorff [14]. The moment problem and its v ariants ha ve been studied extensively in the literature [1, 2, 18, 22, 31]. Mead and Papanico- laou [9] hav e, in particular discussed a num ber of moment problems encountered in v arious of physics. F or a finite n umber of momen ts, the problem is underdetermined and it is not p ossible to construct the unique solution from the moment sequence unless further assumptions ab out the function are made. Within the framework of maxi- m um entrop y principle, one attempts to find a function ρ ( x ) that maximizes the information entrop y functional, S [ ρ ] = − Z b a ρ ( x ) ln[ ρ ( x )] dx (2) sub ject to the momen t constraints defined b y Eq. (1). The resulting solution is an approximate function ρ M E ( x ), which can b e obtained by functional differenti- ation of a Lagrangian with resp ect to the unkno wn func- tion ρ ( x ). The Lagrangian is given by , L ( ρ, λ ) = − S [ ρ ] + m X i =0 λ i Z b a x n ρ ( x ) dx − µ i ! . (3) No w, δ L δ ρ ( x ) = 0 = ⇒ ρ M E ( x ) = exp − m X i =0 λ i x i ! . (4) The normalized function ρ ( x ) is often referred to as probabilit y densit y since it is p ositive semidefinite and the interv al [a, b] can b e mapped onto [0,1] without an y loss of generality . F or a normalized function with µ 0 = 1, the Lagrange multiplier λ 0 is connected to the others via, e λ 0 = Z 1 0 exp − m X i =1 λ i x i ! = Z, and the maximum entrop y (ME) solution can b e written as, ρ M E ( x ) = exp − m X i =1 λ i x i ! / Z , (5) where Z is known as the partition function. A reliable scheme for handling the entrop y optimiza- tion problem sub ject to the matching of the moments was discussed b y us in Ref. [28]. The essen tial idea b ehind the approac h is to use a discretized form of the Shannon en- trop y functional and the momen t constraints using an accurate quadrature form ula. The constraint optimiza- tion problem inv olving the primal v ariables is then re- duced to an unconstrained conv ex optimization program in volving the dual v ariables of the problem. This guaran- tees the existence of a unique solution within the frame- w ork of maximum entrop y principle. The solution is le ast biase d [32] and satisfies the moment constraints defined b y Eq. (1). The pro cedure consists of: 1) rewriting the Lagrangian of the problem in Eq. (3) in terms of the dis- cretized v ariables to obtain the ME solution, 2) using the resulting ME solution in asso ciation with Eq. (1) to re- duce the EOP as an unconstrained conv ex optimization problem in dual v ariables, and finally 3) minimizing the ob jectiv e function in the dual space to obtain the optimal solution in the primal space. Using a suitable quadrature (e.g. Gaussian) with a set of w eights ω j ’s and abscissae x j ’s, the discretized La- grangian can b e written as, L ( ˜ ρ, ˜ λ ) = n X j =1 ˜ ρ j ln  ˜ ρ j ω j  − m X i =1 ˜ λ i   n X j =1 t ij ˜ ρ j − µ i   , (6) where 0 ≤ ˜ ρ ∈ R n and ˜ λ = − λ ∈ R m , respectively are the primal and the dual v ariables of the EOP . In the equation ab ov e, w e hav e used the notation ˜ ρ j = ω j ρ j and t ij = ( x j ) i . The discretized ME solution is given b y the functional v ariation with resp ect to the unknown function as b efore, ρ M E j = exp m X i =1 t ij ˜ λ i − 1 ! , j = 1 , 2 , . . . n. (7) Equations (1) and (7) can b e combined together and the EOP can b e reduced to an unconstrained conv ex op- 3 timization problem in volving the dual v ariables ˜ λ ’s: min ˜ λ ∈ R m   D ( ˜ λ ) ≡ n X j =1 ω j exp m X i =1 t ij ˜ λ i − 1 ! − m X i =1 µ i ˜ λ i   . (8) By iteratively obtaining an estimate of ˜ λ , D ( ˜ λ ) can b e minimized, and the ME solution ˜ ρ ( ˜ λ ∗ ) can be con- structed from Eq. (7). The ob jective function D ( ˜ λ ) can b e minimized b y mo difying a metho d, which is largely due to Bergman [33], and was presen ted and discussed at length in Ref. [28] b oth for the p ow er and the Cheb yshev momen ts. F or the latter, the ME solution can b e shown to b e expressed in the form of Eq.(7) with t ij = T ∗ i ( x j ), where T ∗ i ( x ) is the shifted Cheb yshev polynomials. In the follo wing, we apply our algorithm to reconstruct a v ariety of functions corresp onding to different num ber of shifted Chebyshev momen ts. I I I. APPLICA T ION TO FUNCTION RECONSTR UCTION W e now illustrate the metho d b y reconstructing a n um- b er of exact functions from a kno wledge of their mo- men ts. F or all but one of the examples studied here, the momen ts of the functions can b e obtained from an- alytical expressions. In the remaining case the moments ha ve b een calculated numerically using standard double precision arithmetic. As men tioned earlier, we map the functions on to the in terv al [0,1] and assume they are nor- malized so that the functions can b e treated as probabil- it y density functions (pdf ) without any loss of generalit y . It is well-kno wn that for a finite num b er of moments, the Hausdorff moment problem cannot b e solv ed uniquely . One needs to supply additional information to choose a suitable solution from an ensemble of solutions that sat- isfy the giv en momen t constraints. The maximum en- trop y (ME) ansatz constructs the le ast biase d solution that maximizes the entrop y asso ciated with the densit y and is consistent with the given moments. The accu- racy of the reconstructed solution can be measured b y v arying the num ber of moments. A comparison with the exact solution (if a v ailable) would reveal to what extent the ME solution matc hes with the exact solution. F or an unkno wn function with a finite set of moments, the qualit y of the ME solution may b e judged by the prox- imit y of the input (exact) moments to the output (ap- pro ximated) moments resulting from the reconstructed distribution. By increasing the n umber of moments one can systematically impro v e the qualit y of the solution. It should, how ev er, b e noted, that for a function with a complicated structure, the conv ergence of the first few momen ts do es not guaran tee its accurate repro duction. The ME solution in this case may not represent the ex- act solution, but is still correct as far as the maximum en tropy principle is concerned. It is therefore imp ortant to study the conv ergence b ehavior of the solutions with momen ts for a n um b er of functions with widely differ- en t shap es. T o this end we compare, in the follo wing, our maximum entrop y solution corresp onding to a v ari- et y of exact distribution and a distribution amenable to an accurate n umerical analysis. A. Case 1 : f ( x ) = 1 W e b egin with a step function which is unit y through- out the in terv al [0 , 1]. As mentioned earlier, w e use the shifted Chebyshev p olynomials T ∗ n ( x ), which is defined via, T ∗ n ( x ) = T n (2 x − 1) T n ( x ) = cos  n cos − 1 ( x ))  for n = 0 , 1 , . . . The moments can b e calculated analytically in this case, and are given by , µ 0 = 1; µ 1 = 0; µ n = 1 + ( − 1) n 2 − 2 n 2 n 6 = 1 . Although the function do es not ha ve an y structure, it is particularly imp ortant because of its b ehavior at the end p oin ts. Owing to the presence of discon tinuities at x = 0 and 1, the function is difficult to repro duce close to these p oin ts. The sharp discontin uities cause the reconstructed function (from a small num ber of momen ts) to exhibit spurious oscillations near the end points. The oscillations are progressively suppressed by increasing the n umber of Cheb yshev momen ts in our iterativ e method. Bey ond 100 moments the oscillations completely disapp ear. This b eha vior is seen clearly in fig.1 where w e ha v e plotted the reconstructed functions corresp onding to 40, 60 and 80 momen ts. The oscillations are particularly pronounced as one approac hes x = 1, but die down with increase in the n um b er of momen ts. The result corresponding to 100 moments is presented in fig.2. The plot clearly rev eals that the function has b een repro duced with an error, which is less than 1 part in 10 6 . B. Case 2 : f ( x ) = 3 2 x 1 2 The next example w e consider is a square-ro ot function f ( x ) = 3 2 x 1 2 , where the prefactor is c hosen to normal- ize the function. In many physical problems, we often encoun ter distributions showing a square-ro ot b ehavior. F or example, the sp ectral distribution of a free electron gas in 3-dimension is related to the energy via √ E , and the square-ro ot b eha vior p ersists in the weak interaction limit (at low energy). It is therefore imp ortant to see if such a square-ro ot function can b e reproduced with a high degree of accuracy using our maximum entrop y ansatz. The shifted Chebyshev momen ts for the presen t case are giv en by , 4 µ n = 9 − 12 n 2 9 − 40 n 2 + 16 n 4 for n ≥ 0 . The results for the function are plotted in figs. 3 to 5 for 100 moments. The reconstructed function is found to match excellently with the exact function throughout the interv al as shown in fig.3. Of particular imp ortance is the behavior of the function near x = 0 and 1. The square-ro ot b ehavior is accurately reproduced without an y deviation or oscillation near x = 0 as is evident from fig.4. Similarly , the b ehavior near x = 1 is also repro- duced with a high degree of accuracy as shown in fig.5. Since our metho d can exploit the information embedded in the higher moments, it is capable of repro ducing the function very accurately without any oscillation. C. Case 3: A double-parab ola with a gap Ha ving discussed tw o relativ ely simple examples, w e no w consider a case where the function v anishes in a fi- nite domain within the interv al. Such a function app ears frequen tly in the con text of the energy density of states of amorphous and crystalline semiconductors. It is in- structiv e to study whether our maximum entrop y (ME) metho d is capable of repro ducing a gap in the energy eigen v alue sp ectrum. Since the moments of the electronic densit y of states can b e obtained from the Hamiltonian of the system, our metho d can b e used as an alternativ e to ol to construct the densit y of states from the moments. This is particularly useful for treating a large non-crystalline system (e.g. in the amorphous or liquid state), in which case the direct diagonalization of the Hamiltonian ma- trix is computationally ov erkill and scales with the cubic p o wer of the system size. In contrast, our maxim um en tropy ansatz pro vides an efficient and accurate pro ce- dure for the determination of total (band) energy and the F ermi level sub ject to the a v ailabilit y of the energy momen ts. Here we use a toy mo del of a density of states that consists of tw o parab olae separated by a gap to il- lustrate the usefulness of our metho d. In particular, we c ho ose a normalized distribution with a gap from x 1 to x 2 , f ( x ) =      A x ( x 1 − x ) for x ≤ x 1 B ( x − x 2 ) (1 − x ) for x ≥ x 2 , where A and B are given by A = 6 x 2 1 (1 + x 1 − x 2 ) ; B = 6 (1 − x 2 ) 2 (1 + x 1 − x 2 ) . In the present case, we choose x 1 = 2 5 and x 2 = 3 5 giving the v alue of the gap ( x 2 − x 1 ) = 1 5 . The Chebyshev mo- men ts of the function can be calculated exactly , and as in the previous examples the function is reconstructed from the moments. In figs. 6 and 7 we ha ve plotted the results obtained from our ME ansatz along w ith the exact func- tional v alues at the quadrature points. It is remark able to note that the reconstructed function matc hes excellently with the exact one. F urthermore, the metho d reproduces the gap b etw een the parab olae correctly without any os- cillation in the gap. T able 1 lists the size of the gap cor- resp onding to different num b er of moments for tw o sets of Gaussian p oints. Since we are using a finite num b er of quadrature p oin ts, the accuracy of our gap size is limited b y the resolution of the (non-uniform) quadrature grid near the gap. W e hav e chosen a tolerance  = 5 . 0 × 10 − 3 for the reconstructed functional v alue to lo cate the onset of the gap (i.e. the zero of the function) [34]. It is evident from table 1 that as the num ber of moments increases, the size of the gap improv es and even tually con v erges v ery close to the exact numerical v alue. The accuracy can b e improv ed further by using more Gaussian p oints in the quadrature. T ABLE I: Numerical v alues of the gap for different num b er of momen ts from the reconstructed double-parabolic distribu- tion. Momen ts 96 p oints 192 p oints 20 0.1622 0.1676 40 0.1813 0.1875 60 0.1821 0.1902 80 0.1823 0.1909 100 – 0.1922 Exact numerical 0.1941 0.1945 D. Case 4: f ( x ) = 1 π √ ( x − x 2 ) W e now consider a function that has singularities in the range [0,1]. F or the purp ose of our discussion we re- fer to this function as ‘U-function’ hereafter. The shifted Cheb yshev moments of the function hav e the interesting prop ert y that except for the zeroth momen t, all the other momen ts are identically zero. The task of the ME algo- rithm in this case is to construct a function having all the moments zero except for the zeroth moment, which is unity by normalization. It may b e noted that the elec- tronic density of states p er atom D ( E ) of an infinite chain with a nearest neighbor interaction can b e expressed in the form, D ( E ) = 1 π 1 p 4 β 2 − ( E − α ) 2 , where α and β are the on-site and the nearest neigh- b or hopping integrals resp ectiv ely . The zeroth moment 5 is unity , which implies that there is only one state asso- ciated with each atom. F or α = 0 and β = 1 2 , the density of states can b e mapp ed onto the U-function within the in terv al [0:1], and our algorithm can b e applied to re- construct the latter. An imp ortant characteristic of the densit y of states (or distribution function) is that it di- v erges at the band edges (or at the end p oints). Since all the Chebyshev moments are zero aside from the ze- roth moment, it is imp ortant to see if the algorithm is capable of generating the densit y with the correct div erg- ing b ehavior at the (band) edges. In fig.8 we hav e plot- ted the results for the function for three different sets of momen ts M = 10, 40, and 80 to illustrate how the appro ximate solutions impro ve with the increase of the n umber of moments. The shap e of the function b egins to emerge correctly even for as few as first 10 moments but with significant oscillations and p o or divergence b e- ha vior near the end p oints. As the num b er of moments increases, the solution rapidly conv erges and the oscilla- tions b egin to disapp ear. In fig.9 we hav e plotted the re- sults for M = 120. The reconstructed function matches excellen tly throughout the in terv al with the exact one. The b eha vior of f ( x ) near the left edge at x = 0 is shown in fig.10 from x =0 to x =0.05. It is eviden t from the plot that ev en for very small v alues of x near the left edge, the reconstructed v alues agree with the exact v alues ex- cellen tly . A similar b ehavior has b een observed near the righ t edge of the band near x =1. The capability of our metho d in reconstructing a function with singularities in the interv al is thus convincingly demonstrated. E. Case 5: A function with a finite discon tinuit y The functions that we hav e discussed so far in the ex- amples ab ov e are contin uous within the in terv al. It w ould b e interesting to consider a case where the function has a finite discontin uit y within the interv al. As an example, w e choose a double-step function, f ( x ) =      1 2 for x ≤ x 1 3 2 for x ≥ x 1 , whic h has a finite discon tinuit y at x 1 = 1 2 . It is rather c hallenging to reconstruct the function from the momen ts so that the lo cal b ehavior near the discontin u- it y at x = 1 / 2 is correctly repro duced. As b efore, the momen t integrals can b e calculated analytically in this case. In fig.11 we ha ve plotted the function for 10, 20 and 50 moments. The solutions for the first t wo sets are exp ected to b e less accurate, and indeed they show sig- nifican t oscillations in the figure. F or 50 momen ts the matc h is quite impressive. On adding further moments, the solution progressively improv es. Figure 12 sho ws the remark able accuracy with which the function is repro- duced by employing the first 100 moments. An imp ortant feature of the reconstructed function is that the discon- tin uity has b een correctly repro duced with the exception of tw o p oin ts. F rom the v arious cases studied so far, we conclude that ab out 80 to 120 moments are needed for p oin t-wise matching of the exact and the reconstructed functions. F. Case 6 : An unknown density Up un til now, we hav e considered cases where the ex- act form of the function is known. In practical problems, ho wev er, it is more lik ely that the exact function is not a v ailable. W e should therefore consider a case where the analytical expression for the distribution is not known, but a direct n umerical solution is p ossible. As an exam- ple of such a distribution, w e c ho ose the natural inv ari- an t density of the logistic map g ( x ) = Γ x (1 − x ) with Γ = 3 . 6785. The inv ariant density for the map can b e obtained by calculating the momen ts from the time ev o- lution of an ensemble of initial iterates x 0 as discussed in Ref. [35]. Since the map is ergodic for this v alue of Γ, the momen ts obtained via the time ev olution of the map are identical to the momen ts of the natural inv ari- an t density [23, 35]. The task of our maximum entrop y algorithm is to reconstruct the approximate density , and to compare it with the n umerical density . The latter can b e obtained from a histogram of the iterates and av erag- ing o ver a large n umber of configurations [35]. The result from our ME ansatz using the first 80 momen ts is plot- ted in fig.13 along with the n umerical density . The plot clearly demonstrates that ev ery asp ect of the fine struc- ture of the numerical densit y is repro duced excellently in the maximum en tropy solution. Finally , we end this section considering a rapidly os- cillatory function having complex structure within the in terv al [0,1]. An example of such a function can b e con- structed as, f ( x ) = 1 4 (sin(167 x ) + cos(73 x )) + 6 ( x − 1 2 ) 2 + 1 2 (9) where the prefactors are c hosen to normalize the func- tion. In the context of studying diffusion in a rough one-dimensional p otential, Zw anzig has studied suc h a function to obtain a general expression for the effective diffusion co efficient by analyzing the me an first-passage time [36]. The maximum entrop y construction of this function is plotted in fig.14 for 90 moments along with the exact function. Once again the function is repro duced excellen tly with ev ery little details of the lo cal minima and maxima of the function. IV. CONVER GENCE BEHA VIOR OF THE RECONSTR UCTED SOLUTION The conv ergence behavior of the maximum en tropy solution has b een discussed at length in the litera- 6 ture [29, 30, 37, 38]. The analytical efforts are partic- ularly focused on constructing bounds of the pro ximity of the reconstructed density to the exact density assum- ing that a given num ber o f momen ts of the distributions are identical. In particular, giv en a target distribution f ( x ) and a reconstructed distribution f M ( x ) that ha v e iden tical first M moments µ 0 = 1 , µ 1 , . . . , µ M , the pro x- imit y of the t wo distributions can b e expressed via the Kullbac k-Leibler divergence [29] and the v ariation mea- sure [30], D K L [ f , f M ] = Z S f ( x ) ln  f ( x ) f M ( x )  dx (10a) D v [ f , f M ] = Z S | f M ( x ) − f ( x ) | dx, (10b) where S is the supp ort of the densities f ( x ) and f M ( x ). The divergence measure is also known as the relative en- trop y or information discrimination, and D K L ≥ 0 with the equalit y holding if and only if f ( x ) = f M ( x ) for all x . A low er b ound for the divergence measure D K L in terms of the v ariation measure was given by Kullback [39]: D K L ≥ D 2 v 2 + D 4 v 12 , (11) where it was assumed that the first M momen ts are iden- tical for b oth the distributions. Since the exact distribu- tions are kno wn for the examples considered here (except for the case 6), we can use these measures to examine if the reconstructed solution indeed satisfies the inequality . T o this end, we first study the con vergence of the mo- men ts of the reconstructed distributions with iteration and establish that the momen ts can b e matc hed very ac- curately so that for practical purp oses the reconstructed momen ts can be tak en as identical to the exact (input) momen ts for the calculation of measure in Eqs. (10a) and (10b). A. Con vergence with resp ect to the num ber of iteration As mentioned earlier, we hav e observed that the qual- it y of the ME solutions dep end on tw o factors: 1) the n umber of iterations and 2) the num b er of moments used for the purp ose of reconstruction. In general, for a dis- tribution with a fine structure, it is difficult to determine the minimal n umber of moments that are needed to re- construct the function accurately . Ho wev er, b y studying a num ber of distributions with v arying complexities and their conv ergence b ehavior, it is p ossible to obtain some useful information about the rate of con v ergence. W e ad- dress this issue by c ho osing the U-function (cas e 4) as an example, but the observ ation is found to b e true for other cases as well. F or a systematic study of conv ergence b e- ha vior of the reconstructed moments with iterations, one requires a measure of the go o dness of the fit. W e there- fore in tro duce ∆ 1 , the ro ot mean square (RMS) deviation of the exact moments from the moments pro vided by the ME solution, ∆ 1 ( N , M ) = v u u t 1 M M X i =1 ( µ i − ˜ µ i ( N , M )) 2 . (12) Here µ and ˜ µ ( N , M ) respectively denote the exact (or input) and the reconstructed (or output) moments, and the latter dep ends on the n umber of momen ts ( M ) and the iteration n umber ( N ). In the con text of our present study , the exact momen ts of the functions are kno wn, but in many practical cases they ma y not b e a v ailable and need to b e replaced b y the input moments av ailable for the problem. The quantit y ∆ 1 pro vides a measure of the pro ximity of the first M reconstructed moments to the exact ones, and a small v alue of ∆ 1 is indicativ e of the fact that the momen t constrain ts are satisfied with a high degree of accuracy . The v alue of ∆ 1 b ecomes as small as 10 − 14 pro vided an adequate n um b er of iterations are p er- formed to match a given set of momen ts. In fig.15, we ha ve plotted ∆ 1 for the case of U-function with the num- b er of iteration progressively increasing to N = 6 × 10 6 . The RMS deviation decreases rapidly with the num ber of iteration and even tually drops to a v alue of the order of 10 − 14 . An examination of the data suggests that ∆ 1 can b e fitted to an exp onential decay with iteration and is plotted in fig.15. This b ehavior is also observ ed for the other distributions discussed in Section 3. It thus follo ws that the algorithm conv erges quite rapidly , and that the momen t constraints can b e satisfied to a high degree of accuracy even for a v ery large moment set. B. Con vergence with resp ect to the num ber of momen ts While ∆ 1 pro vides a measure of the go o dness of the fit for the momen ts, it do es not necessarily guarantee a p oint-wise con vergence of the reconstructed function with the exact one. This is particularly true if a small n umber of moments are used to reconstruct the function that has a fine structure in it (cf. fig.13). In this case, the reconstructed moments can b e matched to a high degree of accuracy with input moments, but the solution may still miss out the characteristic feature of the distribution folded in the higher order moments. The maxim um en- trop y solution in this case may not repro duce the actual solution even though the approximate moments are very close to the exact momen ts. T o ensure that ∆ 1 indeed attains a sufficiently small v alue, w e need to study the appro ximate solution vis-a-vis the num b er of momen ts for a fixed cycle of iterations. Since the exact functions are kno wn in our cases, the simplest wa y to measure the qualit y of the ME solution is to construct the RMS devi- ations of the reconstructed functions from the exact ones 7 in the in terv al [0,1]: ∆ 2 ( N , M ) = v u u t 1 n g n g X i =1 h f i − ˜ f i ( N , M ) i 2 (13) ≈ ∆ 2 ( M ) for large N, where n g is the num b er of p oints used in the quadrature. Here w e hav e assumed that the dep endence of ∆ 2 on N can b e neglected so that ˜ f i ( N , M ) ≈ ˜ f i ( M ), which holds for large N owing to the fast decay of ∆ 1 . W e c ho ose ∆ 1 = 10 − 15 for eac h of the moment sets to study the v ariation of ∆ 2 ( M ) for different v alues of M . In practice, the exact function may not b e av ailable but the expression for ∆ 2 can still b e used by replacing the exact function f i b y ˜ f i ( M + ∆ M ) and constructing the RMS deviation for increasing v alues M and ∆ M . In fig.16, we ha ve plotted ∆ 2 for the case of U-function for different v alues of M. The plot sho ws a monotonic decrease of ∆ 2 with the increasing v alues of M. F or this function, w e see that a v alue of M =100 to 120 provides a small enough ∆ 2 to reconstruct the function accurately when ∆ 1 = 10 − 15 . The solid line in the figure is an exp onential fit to the data indicating a fast con vergence of our algorithm with resp ect to the moments for a fixed v alue of ∆ 1 . The proximit y of the reconstructed distribution to the exact one can b e quantified in terms of the divergence measure and the v ariation distance as defined in the b e- ginning of this section. A num b er of inequalities can b e found in the literature [30, 38] that provide lo wer bounds of the relative entrop y . The inequality in (11) is an ex- ample of such a b ound although still sharp er b ounds are a v ailable in the literature [30, 38]. In fig.17 we ha ve plot- ted the relative entrop y of the U-function for different n umber of moments. The reconstructed solution for each of the moment sets M corresponds to ∆ 1 = 10 − 15 so that the first M moments of the exact and the recon- structed functions are practically identical to each other. The righ t hand side of the inequality (11) is also plotted in the same figure for comparison. As the reconstructed solution approac hes the exact solution with increasing n umber of moments, the relative entrop y or information discrimination b etw een the tw o distributions decreases and even tually comes very close to the analytically pre- dicted low er b ound. V. CONCLUSION In this pap er we study the reconstruction of functions from a set of Chebyshev moments (of the functions) via maxim um en tropy optimization. The metho d consists of mapping the original constraint optimization problem in primal space onto an unconstrained con vex optimization problem in the dual space by using a discretized form of the moments and the Shannon en trop y of the function to b e reconstructed. The resulting optimization problem is then solved iteratively by obtaining the optimal set of La- grange’s parameters as prescrib ed in Ref.[28]. By virtue of its ability to deal with a larger n umber of momen ts, our present approac h is extremely robust and accurate. This makes it possible to reconstruct a v ariety of function that are difficult to handle otherwise. W e demonstrate the accuracy of this method b y ap- plying to a num ber of functions for which the ex- act momen ts are a v ailable. The metho d accurately repro duces not only smooth and con tin uous functions (suc h as square-ro ot and double-parab olic functions) but also non-smo oth and discontin uous functions (such as a double-step function with a finite discontin uit y). It also captures the fine structure in a rapidly oscillatory func- tion of known analytical form and the in v ariant densi- ties of a logistic map corresp onding to sp ecial v alues of the control parameter for which no analytical results are a v ailable. A con vergence study of the reconstructed mo- men ts suggests that the RMS deviation of the moments (from the exact ones) can be made as small as 10 − 15 indicating the accuracy with whic h the input momen ts can b e matc hed with the reconstructed ones. A direct comparison with the exact functions studied here reveals that the metho d indeed conv erges to the correct solution pro vided a sufficient num b er of moments are av ailable as input. The general trend of the conv ergence profile is similar in all the cases: the qualit y of the reconstructed function markedly improv es with the increase in the n um- b er of momen ts. The numerical calculations suggest that the conv ergence to ward the exact solution is almost of exp onen tial nature. Ac knowledgmen ts The work is partially supp orted by a fello wship from Aubrey Keith Lucas and Ella Ginn Lucas Endowmen t at the Universit y of Southern Mississippi. PB would like to thank Lawrence Mead, Da vid Drabold, and Roger Hay- do c k for useful discussions during the course of the work. [1] A. J. Shohat and J. D. T amarkin, The Pr oblem of Mo- ments (American Mathematical So ciety , 1963) [2] N. I. Akheizer, The classic al moment pr oblem and some r elate d questions in analysis (Hafner Publishing Co., New Y ork, 1963) [3] J. N. Kapur and H. K. Kesa v an, Entr opy optimization 8 and mathematic al pr o gr amming , Kluw er Academic Pub- lishers, Dordrech t (1997) [4] A. Lent in Image analysis and evaluation (Edited by R. Sha w), SPSE, W ashington, D.C (1953) [5] J. C. Wheeler and R.G. Gordon, J. Chem. Phys. 51 , 5566 (1969) [6] C. Isenberg, Ph ys. Rev. 150 , 712 (1966) [7] G. Bricogne, Acta Cryst. A 44 , 517 (1988) [8] C. J. Gilmore, Acta Cryst. A 52 , 561 (1996) [9] L.R. Mead and N. Papanicolaou, J. Math. Phys. 25 , 2404 (1984) [10] C.R. Smith and W.T. Grandy Jr., Maximum entr opy and Bayesian metho ds in inverse pr oblems , Reidel, Dordrec ht (1985) [11] D. A. Drabold and O. F. Sankey , Phys. Rev. Lett. 70 , 3631 (1993) [12] A.E. Carlsson and P .A.F edders, Phy . Rev. B 34 , 3567 (1986) [13] H. Gotov ac and B. Goto v ac, J. Comp. Phys. 228 , 9079 (2009) [14] F. Hausdorff, Math. Z. 16 , 220 (1923) [15] J. Hadamard, Le ctur es on the Cauchy problem in line ar p artial differ ential equations , Y ale Universit y Press, New Ha ven 1923 [16] I. T urek, J. Phys. C: Solid State Phys. 21 , 3251 (1988) [17] A. Tikhonov and V. Y. Arsenine, Solution of il l-p ose d pr oblems , V.H. Winston & Sons, W ashington, D. C. (1977) [18] G.A. Viano, J. Math. Anal. Appl. 156 , 410 (1991) [19] E. Scalas and G. A. Viano, J. Math. Phys. 34 , 5781 (1993) [20] E.T. Jaynes, Phys. Rev. 106 , 620 (1957) [21] C. Shannon, Bell Syst. T ech. J. 27 , 379 (1948) [22] J-H. Sch¨ ofeldt, N. Jimenez, A.R. Plastino, A. Plastino and M. Casas, Physica A 374 , 573 (2007) [23] W-H. Steeb, F. Solms and R. Stoop, J. Phys. A: Maths. Gen 27 , L399 (1994) [24] A.A.Maradudin, E.W.Mon troll, G.H.W eiss, and I.P .Ipato v a, The ory of lattic e dynamics in harmonic appr oximation (Academic Press, New Y ork, 1971) [25] W.V. Houston, Rev. Mo d. Ph ys. 20 , 161 (1948) [26] C. Domb and C. Isen b erg, Pro c. Phys. So c. London 79 , 659 (1962) [27] R.N. Silver and H. R¨ oder, Phys. Rev. E 56 , 4822 (1997) [28] K. Bandyopadh ya y , A.K. Bhattac harya, P . Bisw as and D.A. Drab old, Ph ys. Rev. E 71 , 057701 (2005) [29] S. Kullbac k, Information the ory and statistics (Do v er Publication, 1997) [30] A. T aglinai, Appl. Math. and Comput. 145 , 195 (2003) [31] J. Wimp, Pro c. Roy . So c. Edinburgh 82 A , 273 (1989) [32] The solution is le ast biase d as far as the entrop y of the densit y function is concerned, and the Hausdorff condi- tions are satisfied. There is no guarantee that the max- im um en tropy solution w ould be close to the exact so- lution, particularly when the very first few moments are used in the reconstruction pro cedure. [33] L. M. Bergman, USSR Comput. Math. and Math. Phys. 7 , 200 (1967) [34] The exact lo cation of the abscissa is computed b y linearly in terp olating b etw een the tw o ordinates that ha ve zero and non-zero v alues. The accuracy (of the lo cation of the abscissa) can b e improv ed further by using more p oints in the quadrature formula. [35] C. Beck and F. Schl¨ ogl, Thermo dynamics of chaotic sys- tems (Cambridge Universit y Press, Cambridge, United Kingdom, 1993) [36] R. Zwanzig, Pro c. Nat. Acad. Sci. USA 85 , 2029 (1988) [37] J.M. Borwein, SIAM J. Optim 1 , 191 (1991) [38] G. T. T oussaint, IEEE T rans. Inform. Theor. 21 , 99 (1975) [39] S. Kullback, IEEE T rans. Inform. Theor. IT-13, 126 (1967) FIG. 1: (Color online) The step function, f ( x ) = 1, recon- structed using the first 40, 60, and 80 Chebyshev moments. The plot shows the presence of oscillations at the right edge with deca ying amplitude as the n umber of moments increases. The data for 40, 60 and 80 moments are indicated in the figure by b oxes (blue), triangles (red) and circles (green) re- sp ectiv ely . A magnified view of the right edge is shown in the inset. 9 FIG. 2: (Color online) The step function as in fig.1 using the first 100 Chebyshev moments. The oscillations now com- pletely disapp ear and the function is reproduced with an error less than 1 part in 10 6 . 10 FIG. 3: The function, f ( x ) = 3 2 x 1 2 , and its maximum entrop y reconstruction using the first 100 Chebyshev momen ts. F or clarit y , ev ery second data p oint is plotted in the figure. The exact function is ev aluated at the quadrature p oints and is dra wn as a line for comparison. 11 FIG. 4: The b ehavior of the function f ( x ) = 3 2 x 1 2 for very small v alues of x along with the exact functional v alues ev alu- ated at the quadrature p oin ts. Note that only one p oin t is off the graph indicating an excellen t match to the exact function for 100 moments. FIG. 5: The behavior of the function f ( x ) = 3 2 x 1 2 near x = 1. The exact function is also plotted for comparison. 12 FIG. 6: The reconstruction of a function with a gap in the in terv al. The double-parab ola with a gap is reconstructed using the first 100 moments. The exact v alues are also plotted in the figure for comparison. 13 FIG. 7: The reconstructed double-parabola near the gap along with the exact function at the quadrature p oints. Ow- ing to the finite num ber of quadrature p oints, the recon- structed function has a non-zero v alue at x 1 = 0.4 and x 2 = 0.6. FIG. 8: (Color online) Reconstruction of the U-function as defined in the text. The data corresp ond to the first 10, 40 and 80 moments as indicated in the plot. 14 FIG. 9: The reconstructed U-function using the first 120 mo- men ts along with the exact function ev aluated at the quadra- ture points. The reconstructed function matches p oint-wise to the functional v alues as indicated in the figure. FIG. 10: The divergen t b ehavior of the U-function near x = 0. The metho d accurately reproduces the function for very small v alues of x . 15 FIG. 11: (Color online) The reconstructed double-step func- tion for the first 10, 20 and 50 moments. The reconstructed function improv es progressiv ely with the increase in the num- b er of momen ts. FIG. 12: (Color online) The reconstructed double-step func- tion using the first 100 momen ts. The exact function is also sho wn as a line. 16 FIG. 13: A reconstructed density with sharp p eaks obtained from the first 80 momen ts. The distribution corresp onds to the natural inv ariant density of the logistic map as discussed in the text. The line corresp onds to the n umerical density obtained via histogram metho d. 17 FIG. 14: (Color online) The reconstruction of an oscillatory function with a fine structure as discussed in the text. The ex- act functional v alues are also plotted at the quadrature p oints for comparison. The location of the lo cal minima and max- ima are excellently repro duced from the first 90 momen ts of the function. 18 FIG. 15: The semi-log plot of the RMS deviation ∆ 1 ( N ) for the U-function with iteration N expressed in unit of 10 6 . The RMS v alues decay exp onentially with iteration after an initial crossov er around N = 0 . 8. F or clarity of presentation, ev ery second data p oint is plotted in the figure. The num ber of moments is indicated in the figure. 19 FIG. 16: The v ariation of the RMS deviation ∆ 2 ( M ) with momen ts for a given v alue of ∆ 1 = 10 − 15 . The data can b e fitted to exp onential deca y as indicated by the b est fitted line in the plot. 20 FIG. 17: The v ariation of the KL divergence measure (cir- cles) of the U-function for different num ber of moments. The function of the v ariation measure from the right hand side of the inequalit y (11) is also plotted for comparison. The data corresp ond to the reconstructed solution with ∆ 1 = 10 − 15 .

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment