Bayesian estimation of regularization and PSF parameters for Wiener-Hunt deconvolution

Bayesian estimation of r egulariza tion and PSF paramet ers f or W iener -H unt decon v olution F ran c ¸ ois Or ieux, 1 , ∗ Jean-F r anc ¸ ois Giov anne l li, 2 Thomas Rodet, 1 1 Laboratoir e des Signaux et Syst ` emes ( C N R S – S U P E L E C – Univ . P aris-Sud 11), S U P E L E C , Plateau de Moulon, 3 rue J oliot-Curie, 91 192 Gif-sur-Yve tte, F rance 2 Laboratoir e d’Int ´ egration du Mat ´ eriau au Syst ` eme ( C N R S – E N S E I R B – Univ . Bor deaux 1 – E N S C P B ), 351 cours de la Libration, 33405 T alence, F rance ∗ Corr esponding autho r: orieux@lss.supelec.fr This paper tackles the p roblem of image d econ volution wi th j oint estim ation of PSF parameters and hyperparameters. W i thin a Bayesian frame work, the solu tion is inferred via a global a posteri ori law for unknown parameters and object. The estimate is chosen as the posterior mean, num erically calculated by means of a Monte-Carlo Markov chain algorithm . The estimates are efﬁciently computed in the Fourier domain and the effecti veness of the method is shown on sim ulated examples. Results show precise estimates for PSF parameters and hyperparameters as well as precise image estim ates in cluding restoration of high-frequencies and spatial details, withi n a global and coherent approach. c  2018 Opt ical Society of America OCIS codes: 100.1830 , 100.3020, 100.3190, 150.148 8 1. Intr oduction Image decon volution has been an activ e researc h ﬁeld for se veral decades and recent contrib uti ons can be found in p apers such as [1 – 3]. Examples of application are medical imagin g, astronomy , nondestructive test ing and more generally imagery prob lems. In these applications, de gradation s induced by the observation instrument limit the data resolution while th e need of p recise interpre- tation can be of m ajor imp ortance. For example, this is particularly critical for long -wa velength astronomy (see e.g., [4]). In additio n, the de velopm ent of a high quality i nstrumentatio n system must rationally be completed by an equi valent le vel of quality in the dev elop ment of data proce ss- ing methods. Moreo ver , e ven for poor performance systems, the restoration method c an be used to bypass instrument limitation s. 1 When the decon volution problem is ill-posed a possible solution relies on regularization, i.e ., in- troduction of information in addition to the data and the acquisition model [ 5, 6 ]. As a consequence of regularization, decon volution meth ods are speciﬁc to the class of image in accordance with the introduced information. From this s tandpoint, the present paper is dedicated to relatively smooth images encountered for numerous applications in imagery [ 4, 7 , 8 ]. The second order consequence of ill-posedness and regularization is the need to balance the compromi se between dif ferent sources of information. In the Bayesian approach [1, 9], informati on about unknowns is introd uced by means of proba- bilistic models. O nce t hese mo dels are designed, the ne xt step is t o build t he a post eriori law , giv en the measured data. The solution is then deﬁned as a representati ve point of this la w and the two most classical are (1) the m aximizer , and (2) the mean. From a computational standpo int, the ﬁrst leads to a nu merical optimizati on problem and the latter leads t o a numerical integration problem. Howe ver , the resulting estimate depends on two sets of v ariables in addi tion to the data. 1. Firstly , the e stimate naturally depends on the response of the instrument at work, namely the point spread function (P SF). The literature is predominantly de voted to decon volution in the case of known PSF . On the contrary , the present paper is dev oted t o the case of unknown or poorly kno wn PSF and there are two m ain strategies to tackle i ts estimati on from the a vailable data set (without extra measurements). (i) In m ost practical cases, the ins trument can be modeled using physical operating descrip- tion. It is thus possible to ﬁnd the equation for the PSF , at least in a ﬁrst approxim ation. This equation is usually driven by a relativ ely small number of parameters. It is a com- mon case in o ptical imaging where a Gaussian-shaped PSF is often used [10 ]. It is also the case in other ﬁelds: interferometry [ 11], magnetic resonance force microscopy [12], ﬂuorescence microscopy [13],. . . Ne vertheless, in real experiments, the parameter values are unknown or imperfectly known and need to be esti mated or adjusted in addition to the image of interest: the question is namely myopic decon volution. (ii) The second strategy forbears the use o f the parametric PSF deduced from the physical analysis and the PSF then naturally appears in a non-parametric form. Practically , the non-parametric PSF is unknown or imperfectly known and needs to be estim ated in addition to the image o f interest: the question is referred to as blind decon volution for example in interferometry [14 – 17]. From an inference po int of v iew , the difﬁc ulty of both myopic and blind problems lies in the possible lack of informat ion resulting in ambig uity between image and PSF , ev en in the noiseless case. In order to resolve the ambiguity , information must be added [3, 18] and it is crucial to make inquiri es based on any av ailable source of inform ation. T o t his end, th e knowledge of th e parametric PSF r epresents a precious means to structure the prob lem and possibly resolve the degeneracies. Moreover , due t o i nstrument design process, a nomi nal 2 value as well as an uncertainty are usually av ailable for the PSF parameters. In addi tion, from a practical and algo rithmic s tandpoint, th e my opic case, i.e., the case of parametric PSF , is o ften more dif ﬁcult due to the n on-linear dependence o f the observ ation model with respect to the PSF parameters. On t he contrary , the blind case, i.e ., the case of non-parametric PSF , yields a simpler practical and algorit hmic problem since the o bserva tion model remains linear w .r .t. the unkno wn elements gi ven the object. Despite the superior t echnical dif ﬁculty , the present paper is de voted t o the myopic format since it is expected to be more ef ﬁcient than the blind format from an information standpoint. Moreover , the blind case has been extensiv ely studied and a large amou nt of paper is av ailabl e [19 – 21], while the myopic case has been less in vestigated, though it is of major importance. 2. Secondly , the solution depends on t he prob ability law parameters named hyperparameters (means, variance s, parameters of correlation matrix,. . . ). These parameters adju st th e sh ape of the laws and in the same ti me they tune the comprom ise between the inform ation provided by the a priori and the information provided by the data. In real e xp eriments, their v alues are unknown and need to be estimated: the question is namely unsupervised decon volution. For both families of parameters (PSF parameters and hyperparameters), two approaches are a vailable. In the ﬁrst one, the parameter v alu es are empirically tuned or estimated in a preliminary step (with Maximum Likelihood [7] or calibration [22] for example), then the values are used in a second step de voted to i mage restoration given the parameters. In the second one, the parameters and the object are jointly estimated [2, 19]. For the my opic problem, Jalobeanu et al. [23] address the case of a symmetric Gauss ian PSF . The width parameter and the n oise variance are estimat ed in a preliminary st ep by Maximu m- Likelihood. A recent paper [24] addresses the estimation of a Gaussian blur parameter , as in ou r experiment, with an empirical method. They foun d the Gaussian blur parameter b y minimizin g the absolute deriv atives of the restored images Laplacian. The present paper addresses t he m yopic and u nsupervised decon volution problem . W e propose a new meth od that jointly esti mates the PSF parameters, the hyperparameters, and the image of interest. It is b uilt in a coherent and global framew o rk based on an extended a posteriori law for all the unknown var iables. Th e po sterior law is obt ained via the Bayes rule, founded on a priori laws: Gaussian for image and noise, uniform for PSF parameters and gamma or Jef freys for hyperparameters. Regarding the image prior law , we have paid special attention to the parametrization of the cov ariance matrix in order to facilitate law manipulation s such as in tegration, conditi oning or hy- perparameter estim ation. The possible degeneracy of the a posterior i law in som e limit cases is also studied. The estimate is chosen as the m ean of the po sterior law and is com puted using Monte-Carlo simulatio ns. T o this end, Monte-Carlo M arko v chain (MCMC) algorithms [25] enable to draw 3 samples from the posterior distribution desp ite its complexity and especially the non-linear depen- dence w .r . t. the PSF parameters. The paper is structured in the foll owing m anner . Sec. 2 presents the not ations and st ates the problem. The three following sections descr ibe our methodo logy: ﬁrstly the Bayesian probabilistic models are detailed in Sec. 3; t hen a proper posterior law is establis hed in Sec. 4; an M CMC algorithm to compute the estim ate i s described in Se c. 5. Numerical re sults are sho wn in Sec. 6. Finally , Sec. 7 is de voted to conclusion and perspectiv es. 2. Notations and con volution model Consider N pixels real square images represented i n lexicographic order by vector x ∈ R N , with generic elements x n . The forwar d model writes y = H w x + ǫ (1) where y ∈ R N is the vector of data, H w a con volution m atrix, x the image o f in terest and ǫ the modelization errors or the no ise. V ector w ∈ R P stands for th e PSF parameters, such as widt h or orientation of a Gaussian PSF . The matrix H w is block-circulant wit h circulant-block (BCCB) for computational efﬁ ciency of the conv olution in the Fourier s pace. The diagon alization [26] of H w writes Λ H = F H w F † where F is the unitary F o urier m atrix and † is the transpose conj ugate symbo l. The con volution, in the Fourier space, is then ◦ y = Λ H ◦ x + ◦ ǫ (2) where ◦ x = F x , ◦ y = F y and ◦ ǫ = F ǫ are the 2D dis crete Fourier t ransform ( D F T - 2 D ) of image, data and noise, respectiv ely . Since Λ H is di agonal, the con volution is computed with a term-wise product in the Fourier space. There is a strict equivalence between a description in spatial domai n (Eq. (1)) and in Fourier domain (Eq. (2)). Consequently , for coherent descriptio n and computation al ef ﬁciency , all t he de- velopments are equally done in the spatial space or in the F ourier space. For notational con venience, let us i ntroduce the component at null -frequency ◦ x 0 ∈ R and the vector of component at non-null frequencies ◦ x ∗ ∈ C N − 1 so that t he whole set of compo nents writes ◦ x = [ ◦ x 0 , ◦ x ∗ ] . Let us note 1 the vector of N components equal to 1 / N , so that 1 t x i s the empirical mean le vel of the image. The Fourier components are the ◦ 1 n and we have: ◦ 1 0 = 1 and ◦ 1 n = 0 for n 6 = 0 . Moreover , Λ 1 = F 11 t F † is a diagonal matrix with only one non-nul l coef ﬁcient at null frequency . 3. Bayesian pr obabilistic model This section presents the prior law for each s et of parameters. Regarding the image of int erest, in order t o account for smoothness, the law int roduces high-frequency penalization through a differ - 4 ential operator on the pixel. A conjugate la w is proposed for the hyperparameters and a uniform law is considered for the PS F parameters. Moreover , we hav e paid a special attention to the image prior law parametrization . In the ne x t section we present sev eral parametrizatio n in order to facilitate law m anipulations such as inte- gration, conditioning or hyperparameter estimation. Moreover , the corre lation matrix of the image law may b ecome singular in some limit cases resulting in a degenera ted prior la w (when p ( x ) = 0 for all x ∈ R N ). Based on this parametrization, Sec. 4 st udies the de g eneracy of t he po sterior in relation with the parameters of the prior law . 3.A. Image prior law The probability law for the image is a Gaussian ﬁeld with a give n precision matrix P parametrized by a vector γ . The pdf reads p ( x | γ ) = (2 π ) − N/ 2 det[ P ] 1 / 2 exp  − 1 2 x t P x  . (3) For comput ational efﬁcienc y , the pre cision matrix is designed (or approximated) in a toroidal m an- ner , and it is diagonal in the Fourier domain Λ P = F P F † . Thus, the law for x also writes p ( x | γ ) = (2 π ) − N/ 2 det[ F ] det[ Λ P ] 1 / 2 det[ F † ] exp  − 1 2 x t F † Λ P F x  (4) = (2 π ) − N/ 2 det[ Λ P ] 1 / 2 exp  − 1 2 ◦ x † Λ P ◦ x  (5) and it i s sometimes referred t o [27] as a Whittle approxim ation (see also [28, p.1 33]) for the Gaussian l aw . The ﬁlter obtain ed for ﬁxed hyp erparameters is also the W iener-Hunt ﬁlter [29], as described in Sec. 5.A. This paper focuses on s mooth images, thus on positive correlation between pixels. It is intro- duced b y h igh-frequencies p enalty usi ng any circulant di f ferential o perator: p -t h differences be- tween pixels, Laplacian, Sobel. . . Th e differential operator is denoted by D and its diagonalized form by Λ D = F D F † . Then, the precision matrix writes P = γ 1 D t D and its Fourier counterpart writes Λ P = γ 1 Λ † D Λ D = diag  0 , γ 1 | ◦ d 1 | 2 , . . . , γ 1 | ◦ d N − 1 | 2  (6) where γ 1 is a positiv e scale factor , diag builds a diagonal matrix from elementary components and ◦ d n is the n -th D F T - 2 D coefﬁcient of D . Under this parametrization of P , the ﬁrst eigen value is equal to zero corresponding to the ab- sence of penalty for the null frequency ◦ x 0 , i .e., no i nformation ac counted for about the empirical mean lev el of the im age. As a consequence, the determin ant vanishes det[ P ] = 0 resul ting in a degenerated prior . T o manage this difﬁculty , se veral approaches ha ve been proposed. Some a uthors [2, 30 ] still use t his prior despite its degenerac y and this approach can be analyzed in two ways. 5 1. On the one hand , it can be s een as a non-degenerated law for ◦ x ∗ , the set of non-nul l frequency components only . In this format, the prior does not af fect any probability to the null frequency component and the Bayes rule does not apply to th is compo nent. Thus , t his strate gy yields an incomplete posterior law , since the null frequenc y is no t embedded in the methodology . 2. On the o ther hand, it can be see n as a degenerated prio r f or the who le set of frequencies. The application of the Bayes rule is then som e what confusi ng due to d egenerac y . In this format, the posterior law cannot be guaranteed to remain non-de generated. Anyway , none o f th e two stand points yields a post erior law t hat i s both non-degenerated and addressing the whole set of frequencies. An alternati ve parametrization relies on the ener gy of x . An extra term γ 0 I , tuned by γ 0 > 0 , in the precision m atrix [31], introduces informatio n for all the frequencies including ◦ x 0 . The p recision matrix writes Λ P = γ 0 I + γ 1 Λ † D Λ D = diag  γ 0 , γ 0 + γ 1 | ◦ d 1 | 2 , . . . , γ 0 + γ 1 | ◦ d N − 1 | 2  (7) with a determinant det[Λ P ] = N − 1 Y n =0  γ 0 + γ 1 | ◦ d n | 2  . (8) The obtain ed Gaussian prior i s not degenerated and u ndoubtedly leads to a p roper posterior . Nev- ertheless, the determinant Eq. (8) is not separable in γ 0 and γ 1 . Consequently , the conditional posterior for t hese parameters is not a classical law and fu ture development will be more difﬁcult. Moreover , the non-null frequencies ◦ x ∗ are controlled by two parameters γ 0 and γ 1 p ( ◦ x | γ 0 , γ 1 ) = p ( ◦ x 0 | γ 0 ) p ( ◦ x ∗ | γ 0 , γ 1 ) . (9) The proposed approach to manage the degenerac y relies on the addition of a term for the null frequency only Λ 1 = diag (1 , 0 , . . . , 0) Λ P = γ 0 Λ † 1 Λ 1 + γ 1 Λ † D Λ D . (10) = diag  γ 0 , γ 1 | ◦ d 1 | 2 , . . . , γ 1 | ◦ d N − 1 | 2  . The determinant has a separable expression det[Λ P ] = γ 0 γ N − 1 1 N − 1 Y n =1 | ◦ d n | 2 , (11) i.e., the precision parameters ha ve been factorized. In addition, each parameter controls a dif ferent set of frequencies: p ( ◦ x | γ 0 , γ 1 ) = p ( ◦ x 0 | γ 0 ) p ( ◦ x ∗ | γ 1 ) , 6 γ 0 driv es t he empirical m ean l e vel of the image ◦ x 0 and γ 1 driv es t he smoothness ◦ x ∗ of the image. W ith the Fourier precision structure of Eq. (10), we have the non-degenerated prior l aw for the image that addresses separately all the frequencies wit h a factorized partition function w .r .t. ( γ 0 , γ 1 ) p ( x | γ 0 , γ 1 ) = ( 2 π ) − N/ 2 N − 1 Y n =1 | ◦ d n | γ 1 / 2 0 γ ( N − 1) / 2 1 exp  − γ 0 2 k ◦ x 0 k 2 − γ 1 2 k Λ D ∗ ◦ x ∗ k 2  . (12) where Λ D ∗ is obtained from Λ D without th e ﬁrst li ne and column . The next st ep i s to writ e the a priori law for the noise in an explicit form and the other parameters, including the law parameters γ and the instrument parameters w . 3.B. Noise and data laws From a m ethodological st andpoint, any st atistic can be included for errors (measurement and model errors). It is possi ble to account for correlations in th e error process o r to account for a non-Gaussian law , e.g . , Lapl acian law , generalized Gaus sian law , or other laws b ased on robust norm,. . . In the present paper , the noise is m odeled as zero-mean white Gaussian vector with u n- known precision parameter γ ǫ p ( ǫ | γ ǫ ) = (2 π ) − N/ 2 γ N/ 2 ǫ exp  − γ ǫ 2 k ǫ k 2  . (13) Consequently , the likelihood for the parameters giv en the observed data writes p ( y | x , γ ǫ , w ) = (2 π ) − N/ 2 γ N/ 2 ǫ exp  − γ ǫ 2 k y − H w x k 2  . (14) It naturally depends on the im age x , on the noise parameter γ ǫ and on the PSF parameters w embedded in H w . It clearly inv olves a least sq uares discrepancy that can be rewritten in t he Fourier domain: k y − H w x k 2 = k ◦ y − Λ H ◦ x k 2 . 3.C. Hyperparameters law A classical choi ce for hyperparameter law relies on conju gate prior [32]: the conditional posterior for the hyperparameters is in the same family as its prior . It results in practical and algorithmi c facilities: update of the laws amounts to update of a small number of parameters. The three parameters γ 0 , γ 1 and γ ǫ are precision parameters of Gaussian laws Eq. (12) and (14) and a conjugate law for t hese parameters is the Gamma law (see App endix B). Given parameters ( α i , β i ) , for i = 0 , 1 or ǫ , the pdf reads p ( γ i ) = 1 β α i i Γ( α i ) γ α i − 1 i exp ( − γ i /β i ) , ∀ γ i ∈ [0 , + ∞ [ (15) In addition to computational ef ﬁciency , the law allows for non-informative priors. W ith speciﬁc parameter values, one obtains two improper non-inform ativ e pri or : the Jeffre ys’ law p ( γ ) = 1 /γ 7 and t he un iform law p ( γ ) = U [0 , + ∞ [ ( γ ) wi th ( α i , β i ) s et t o ( 0 , + ∞ ) and (1 , + ∞ ) , respectively . Jef freys’ law is a classical law for the prec isions and is considered as non-informative [33]. This law is also in variant to p ower transform ations: the law of γ n [33, 34] is als o a Jeffre ys’ law . For these reasons de velopm ent is done using the Jeffre ys’ law . 3.D. PSF parameters law Regarding th e PSF parameters w , we consider th at the instrum ent design process or a physical study provides a nomi nal v alue w with uncertainty δ , that is to say w ∈ [ w − δ , w + δ ] . The ”Principle of Insufﬁc ient Reason” [33] leads to a uniform prior on this interv al p ( w ) = U w , δ ( w ) (16) where U w , δ is a uniform pdf on [ w − δ , w + δ ] . Nevertheless, within the proposed frame work, the choice is no t lim ited and other laws, such as Gaussian, are poss ible. Anyway other choices d o not allo w easier computation b ecause of the non-linear dependenc y of the observation mod el w .r . t. PSF parameters. 4. Pr oper posterior law At th is point , the prior law of each pa rameter is av ailable: t he PSF parameters, the hyperparameters and the im age. Thus , the joint l aw for all th e p arameters is built by multi plying the l ikelihood Eq. (14) and the a priori laws Eq. ( 12), (15) and (16) p ( ◦ x , γ ǫ , γ 0 , γ 1 , w , ◦ y ) = p ( ◦ y | ◦ x , γ ǫ , w ) p ( ◦ x | γ 0 , γ 1 ) p ( γ ǫ ) p ( γ 0 ) p ( γ 1 ) p ( w ) (17) and explicitly p ( ◦ x , γ ǫ , γ 0 , γ 1 , w , ◦ y ) = (2 π ) − N Q N − 1 n =1 | ◦ d n | β α ǫ ǫ Γ( α ǫ ) β α 0 0 Γ( α 0 ) β α 1 1 Γ( α 1 ) γ α ǫ + N/ 2 − 1 ǫ γ α 0 − 1 / 2 0 γ α 1 +( N − 1) / 2 − 1 1 exp " − γ ǫ β ǫ − γ 0 β 0 − γ 1 β 1 # U w , δ ( w ) exp  − γ ǫ 2 k ◦ y − Λ H ◦ x k 2 − γ 0 2 k ◦ x 0 k 2 − γ 1 2 k Λ D ◦ x k 2  . (18) According to the Bayes rule, the a posteriori law re ads p ( ◦ x , γ ǫ , γ 0 , γ 1 , w | ◦ y ) = p ( ◦ x , γ ǫ , γ 0 , γ 1 , w , ◦ y ) p ( ◦ y ) (19) where p ( ◦ y ) is a normalization constant p ( ◦ y ) = Z p ( ◦ y , ◦ x , γ , w ) d ◦ x d γ d w . (20) 8 As descr ibed before, setting γ 0 = 0 leads to de generated prior and joint la ws. Ho wev er , wh en the observ ation system pre serves the null frequency γ 0 can be considered as a nuisance parameter . In addition, only prior information on the smoothness is av ailable. In Bayesian framew o rk, a solut ion to eliminate the nuis ance parameters is to integrate them out in the a posteriori law . According to our parametrization Sec. 3.A, the integration of γ 0 is the integration of a Gamma law . Application o f Appendix B.B on γ 0 in the a posteriori la w Eq. (19 ) provides p ( ◦ x , γ ǫ , γ 1 , w | ◦ y ) = p ( ◦ x 0 ) p ( ◦ y , ◦ x ∗ , γ ǫ , γ 1 , w | ◦ x 0 ) Z p ( ◦ x 0 ) p ( ◦ y , ◦ x ∗ , γ ǫ , γ 1 , w | ◦ x 0 ) d γ ǫ d γ 1 d w d ◦ x ∗ d ◦ x 0 (21) with p ( ◦ x 0 ) = Z p ( ◦ x 0 | γ 0 ) p ( γ 0 ) d γ 0 =   1 + β 0 ◦ x 2 0 2   − α 0 − 1 / 2 . (22) Now the parameter is integrated, the parameters α 0 and β 0 are set to rem ove t he nu ll frequency penalization. Since we ha ve α 0 > 0 and β 0 > 0 we get (1 + β 0 ◦ x 2 0 / 2) − α 0 − 1 / 2 ≤ 1 and the joint law is majored   1 + β 0 ◦ x 2 0 2   − α 0 − 1 / 2 p ( ◦ y , ◦ x ∗ , γ ǫ , γ 1 , w | ◦ x 0 ) ≤ p ( ◦ y , ◦ x ∗ , γ ǫ , γ 1 , w | ◦ x 0 ) . (23) Consequently , b y the dominated conv ergence theorem [35], t he limit of the law wit h α 0 → 1 and β 0 → 0 can be placed under th e integral sig n at the denominator . Th en the nu ll-frequency penalization p ( ◦ x 0 ) from the numerator and denomin ator are remove d. It is equiva l ent with the integration of the γ 0 parameter under a Di rac (see appendix B). The equati on is simpli ﬁed and the integration with respect to ◦ x 0 in the denominator Eq. (20) Z R p ( ◦ y | ◦ x , γ ǫ , w ) p ( ◦ x ∗ | γ 1 ) p ( γ 1 , γ ǫ , w ) d ◦ x 0 ∝ Z R p ( ◦ y 0 | ◦ x 0 , γ ǫ , w ) d ◦ x 0 (24) ∝ Z R exp " − γ ǫ 2  ◦ y 0 − ◦ h 0 ◦ x 0  2 # d ◦ x 0 (25) con ver ges if and only if ◦ h 0 6 = 0 : the null frequenc y is observed. If this condition i s met, Eq. (21) with β 0 = 0 and α 0 = 1 is a proper posterior l aw for the im age, the precision parameters and th e PSF parameters. In other words, if the a verage is obs erved, the de g eneracy of the a p riori la w is not transmitted to the a posteriori law . 9 Then, the obtained a posteriori law writes p ( ◦ x , γ ǫ , γ 1 , w | ◦ y ) = p ( ◦ x , γ ǫ , γ 1 , w , ◦ y ) p ( ◦ y ) ∝ γ α ǫ + N/ 2 − 1 ǫ γ α 1 +( N − 1) / 2 − 1 1 U w , δ ( w ) exp  − γ ǫ 2 k ◦ y − Λ H ◦ x k 2 − γ 1 2 k Λ D ∗ ◦ x ∗ k 2  exp " − γ ǫ β ǫ − γ 1 β 1 # . (26) Finally , inference is done on this law Eq. (26). If the null frequency is not observed, or information must be added, the pre vious Eq. (19) can be used. 5. Posterior mean e s timator and law exploration This section presents the algorithm to e xplo re the posterior law Eq. (19) or (26) and to compute an estimate of the parameters. For this purpose, Monte Car lo Markov chain is used t o provide samples. Firstly , the obtained samples are used to compute di ff erent moment s of the l aw . Afterw ards, they are also used to approximate marginal la ws as hi stograms. These two repre sentations are helpful to analyse the a posteriori law , the structure of the a vailable information and the uncer tainty . They are used in Sec. 6.C.2 to illustrate the mark of the ambiguity in the myopic problem. Here, the samp les o f the a posterio ri law ar e obtained by a Gibbs sampler [25, 36 , 37]: it con - sists in iterativ el y sampling the condi tional posterior law for a set of p arameters given t he other parameters (obtained at previous iteration). T ypically , the sampled laws are the law of ◦ x , γ i and w . After a b urn-in time, the com plete set of samples are under the joint a posteriori l aw . The three next sections present e ach sampli ng step. 5.A. Sampling the image The conditional posterior law of the image is a Gaussian la w ◦ x ( k +1) ∼ p  ◦ x | ◦ y , γ ( k ) ǫ , γ ( k ) 0 , γ ( k ) 1 , w ( k )  (27) ∼ N  µ ( k +1) , Σ ( k +1)  . (28) The cov ariance matrix is diagonal and writes Σ ( k +1) =  γ ( k ) ǫ | Λ ( k ) H | 2 + γ ( k ) 0 | Λ 1 | 2 + γ ( k ) 1 | Λ D | 2  − 1 (29) and the mean µ ( k +1) = γ ( k ) ǫ Σ ( k +1) Λ † H ( k ) ◦ y . (30) where † is the transpose conj ugate sym bol. The vector µ ( k +1) is the regularized least square solu- tion at the current it eration (or the W iener -Hunt ﬁlter). Clearly , i f the null-frequency is not observed ◦ h 0 = 0 and if γ 0 = 0 , the covariance matrix Σ is not in vertible and the estim ate i s not deﬁned as described Sec. 4. 10 Finally , since the matrix is diagonal, the sample ◦ x ( k +1) is obtai ned by a term-wise product o f F ǫ (where ǫ is whi te Gaussian) with the standard deviation matrix  Σ ( k +1)  1 / 2 followed by t he addition of the mean µ ( k +1) also computed with term-wise products Eq. (30). Consequently , the sampling of the image is eff ectiv e e ven with high-dimensi onal object. 5.B. Sampling pr ecision parameters The conditional posterior laws of the p recisions are Gamma corresponding to their prior law with parameters updated by the likelihood γ ( k +1) i ∼ p  γ i | ◦ y , ◦ x ( k +1) , w ( k )  (31) ∼ G  γ i | α ( k +1) i , β ( k +1) i  . (32) For γ ǫ , γ 0 and γ 1 the parameters law are, respecti vely , α ( k +1) ǫ = α ǫ + N/ 2 and β ( k +1) ǫ =  β − 1 ǫ + 1 2 k ◦ y − Λ ( k ) H ◦ x ( k +1) k 2  − 1 , (33) α ( k +1) 0 = α 0 + 1 / 2 and β ( k +1) 0 =  β − 1 0 + 1 2  ◦ x ( k +1) 0  2  − 1 , (34) α ( k +1) 1 = α 1 + ( N − 1) / 2 and β ( k +1) 1 =  β − 1 1 + 1 2 k Λ D ◦ x ( k +1) k 2  − 1 . (35) In the case of Jeff reys’ prior , the parameters are α ( k +1) ǫ = N / 2 and β ( k +1) ǫ = 2 / k ◦ y − Λ ( k ) H ◦ x ( k +1) k 2 , (36) α ( k +1) 0 = 1 / 2 and β ( k +1) 0 = 2 /  ◦ x ( k +1) 0  2 , (37) α ( k +1) 1 = ( N − 1) / 2 and β ( k +1) 1 = 2 / k Λ D ◦ x ( k +1) k 2 . (38) Remark 1 — If the a posteriori law Eq . (26) withou t γ 0 is consider ed, ther e is no nee d to sample this parameter (Eq. (34) and (37) ar e not useful ) and γ ( k ) 0 = 0 in Eq. (29). 5.C. Sample PSF parameters The conditional law for P SF parameters writes w ( k +1) ∼ p  w | ◦ y , ◦ x ( k +1) , γ ( k +1) ǫ  (39) ∝ exp " − γ ( k +1) ǫ 2 k ◦ y − Λ H , w ◦ x ( k +1) k 2 # (40) where parameters w are embedded in the PSF Λ H . This law is not standard a nd intricate: no algo- rithm exists for direct sam pling and we use the Metropo lis-Hastings (M.-H.) method t o bypass this difﬁc u lty . In M.-H. algorith m, a s ample w p is propos ed and accepted with a certain p robability . This probabilit y depends on the rati o between t he likelihood o f the proposed value and the li keli- hood of the current v alue w ( k ) . In practice, in the independent form described in appendix C, with prior law as proposition law , it is divided in sev eral steps. 11 1. P RO P O S I T I O N : Sample a proposit ion w p ∼ p ( w ) = U [ a b ] ( w ) . (41) 2. P RO BA B I L I T Y O F AC C E P TA T I O N : Calculate the criterion J  w ( k ) , w p  = γ ( k +1) ǫ 2  k ◦ y − Λ H , w ( k ) ◦ x ( k +1) k 2 − k ◦ y − Λ H , w p ◦ x ( k +1) k 2  . (42) 3. U P D A T E : Sample t ∼ U [0 1] and takes w ( k +1) =    w p if log t < J w ( k ) otherwise . (43) 5.D. Empirical mean The s ampling of ◦ x , γ and w are repe ated i terativ ely u ntil t he law has been suf ﬁciently e xpl ored. These sampl es  ◦ x ( k ) , γ ( k ) , w ( k )  follow the global a posterior i law of Eq. (19). By the large num - bers law , the estimate, deﬁned as the posterior mean, is approximated by ˆ x = F † E [ ◦ x ] ≈ F † " 1 K K − 1 X k =0 ◦ x ( k ) # . (44) As described by Eq. (44), to obtain an estim ate of the image in the spatial space, all the computati on are achieved recursively in the Fourier space with a sin gle I FF T at the end. An implementation example in pseudo code is described Fig. 9. 6. Decon volution r esults This section presents numerical results obt ained by the proposed m ethod. In order to completely e valuate the method, true value of all parameters x , w , γ ǫ but also γ 1 , γ 0 is needed. In order to achie ve th is, an entirely simulated case is studied: image and noise are s imulated under their re- spectiv e prior la ws Eq. (12) and (13) with gi ven values of γ 0 , γ 1 and γ ǫ . Th anks to this protocol, all experimental conditions are controlled and the estimation method is entirely e valuated. The method has als o been applied in different conditions (lower signal to noise ratio, broader PSF , di ff erent and realist ic (non-simulated) images, . . . ) and showed s imilar b eha viour . Howe ver , in the case of realist ic images, si nce the true value of the hyperparameters γ 0 and γ 1 is unkn own, the ev aluation cannot be complete. 6.A. Practical e xperimental conditio ns Concretely , a 128 × 12 8 im age i s generated in th e Fourier space as the prod uct of a compl ex white Gaussian noi se and the a p riori standard deviation matrix Σ = ( γ 0 Λ † 1 Λ 1 + γ 1 Λ † D Λ D ) − 1 / 2 , giv en b y Eq. (10). The chosen m atrix Λ D results from the FF T -2 D of the Laplacian operator [0 1 0; 1 − 4 1 ; 0 1 0 ] / 8 and t he parameter va lues are γ 0 = 1 and γ 1 = 2 . 12 These p arameters provide t he i mage s hown in Fig. 1(a) : it is an image with sm ooth features similar t o a cloud. Pixels have numerical values between − 100 and 150 , and the proﬁle l ine 68 shows ﬂuctuations around a value of − 40 . The a p riori law for the h yperparameters are set to the non-informati ve Jef freys’ law by ﬁxi ng the ( α i , β i ) to (0 , + ∞ ) , as e xpl ained in Sec. 3.C. In additi on, the PSF is obtained in the Fourier space by discretization of a normalized Gaussian shape ◦ h ( ν α , ν β ) = exp − 2 π 2  ν 2 α ( w α cos 2 ϕ + w β sin 2 ϕ ) + ν 2 β ( w α sin 2 ϕ + w β cos 2 ϕ ) + 2 ν α ν β sin ϕ cos ϕ ( w α − w β )  ! (45) with frequencies ( ν α , ν β ) ∈ [ − 0 . 5; 0 . 5] 2 . This low-pass ﬁlter , illu strated in Fig. 2, is controlled by three parameters: • two width parameters w α and w β set to 20 and 7, respectively . Their a priori laws are uniform: p ( w α ) = U [19 21] ( w α ) and p ( w β ) = U [6 8] ( w α ) corresponding to a n uncertainty of about 5% and 15% around the nominal value (see Sec 3.D). • a rotation p arameter ϕ s et to π / 3 . T he a priori l aw is also uniform p ( ϕ ) = U [ π / 4 π / 2] ( ϕ ) corresponding to 50% uncertainty . Then, th e conv olutio n is comput ed in t he Fourier space and the data are obtained by addi ng a white Gauss ian noise w ith precisi on γ ǫ = 0 . 5 . Data are shown Fig . 1(b): they are naturally smoother than the true image and the small ﬂuctuations are less visible and corrupt ed by t he noise. The empirical mean le vel of the image is corre ctly observed ( the null frequency coef ﬁcient of H w is ◦ h 0 = 1 ) so the parameter γ 0 is considered as a nuisance parameter . Consequent ly it is inte g rated out under a Dirac (see S ec. 4). This is equi valent to ﬁx its v alue to 0 in the algorithm Fig. 9, line 4. Finally , the method is e valuated on two dif ferent situations. 1. The unsupervised and non-myo pic case: the parameters w are known. Consequently , there is no Metrop olis-Hastings step (Sec. 5.C): lines 9 to 16 ar e ignored i n the alg orithm of F ig. 9 and w is set to its t rue value. T o obtain sufﬁ cient law exploration, the algorithm is run un til the differ ence between two successive emp irical means is less than 10 − 3 . In this case, 921 samples are necessary and they are comp uted in approximately 12 seconds on a processor at 2.66 GHz with Matlab, 2. The unsupervised and m yopic case: all the parameters are estimated. T o obtain sufﬁc i ent law exploration, t he alg orithm is run until the diff erence between two successive emp irical means is less t han 5 × 10 − 5 . In thi s case, 18 715 samples are needed and they are computed in approximately 7 minutes. Remark 2 — The alg orithm has also b een run for up to 1 000 000 sa mples, in b oth cases, witho ut 13 per ceptible qualitat ive c hanges. 6.B. Estimatio n r esults 6.B.1. Images The t wo results for the image are given Figs. 1(c) and 1(d) for th e non -myopic and the myopi c cases, respectiv ely . The ef fect of decon volution is notable on the image, as well as on the shown proﬁle. The ob- ject i s correctly positioned, the orders of magni tude are respected and the mean lev el is correctly reconstructed. The image is restored, more details are v isible and the proﬁles are closer matching to the true image than data. More precisely , the pixels 20-25 of the 68-th line in Fig. 1 sho w the restoration of the original dynamic whereas it is not visible in the data. Between pixels 70 and 1 10, ﬂuctuations not visible in data are also correctly restored. In order to visualize and study the spectral contents of the images, circular a verage of empirical power spectral densit y is considered and called “spectrum” hereafter . The subjacent spectral vari- able is a radial f requency f such as f 2 = ν 2 α + ν 2 β . The spectrum of the true object, data and restored object are shown Figs. 3(a) and 3(b) in non-myopic and myopi c cases, respecti vely . It is clear that the spectrum of the true image is correctly retrie ved, in both cases, up to the radial frequency f ≈ 0 . 075 . Above this frequency , noi se is clearly dominant and information abou t the i mage is almost lost. In ot her words, the met hod produces correct spectral equali zation in the properly ob- served frequenc y band. The result is expected from a W iener-Hunt method but the achiev ement is the jo int estim ation of hyp erparameter and instrument parameters in addition to the correct spectral equalization. Concerning a com parison betwee n non-myopic and myopic c ases, there is no visual difference s . The spectrum Fig s. 3(a) and 3(b) in non-myop ic and myopic cases respectively are visually indis- tinguishabl e. This is also t he case when comparing Figs. 1(c) and 1(d) and especially 68-th line. From a more precise quantitative ev aluati on, a slight dif ference is observed and detailed belo w . In order to quantify performances, a normalized euclidean distance e = k x − x ∗ k / k x ∗ k (46) between an im age x and the true image x ∗ is consid ered. It is com puted between true image and estimate images as well as between true image and data. Results are reported in T ab . 1 and conﬁrm that the decon volution is ef fectiv e with an error of approx imately 6 % in m yopic case com pared to 11 % wit h data. Both non-myopic and myo pic decon volution reduce error by a factor 1.7 with respect to the observed data. Regarding a comparison between non-m yopic and myopi c case, the errors are almost the sam e, with a sl ightly lower value for the non-myo pic case, as expected. This di f ference is coherent with 14 the int uition: more information are injected in th e non-myo pic case through the true PSF parame- ters values. 6.B.2. Hyperpar ameters and instrument parameters Concerning the o ther parameters, their estim ates are close to the true values and are report ed in T ab . 2. The γ ǫ estimate is very close to the true value with c γ ǫ = 0 . 49 instead o f 0.5 in the two cases. The er ror for the PSF parameters are 0.35%, 2.7% and 1.9% for w α , w β and ϕ , respectiv ely . The va l ue of γ 1 is underestimated in the two cases with approximately 1.7 instead of 2. All the true values fall in the ˆ µ ± 3 ˆ σ int erv al. In order to deepen the num erical study , the paper ev aluates the capability of the method to accurately select th e best values for hyperparameters and instrument parameters. T o this end, we compute th e esti mation error Eq . (46 ) for a set of “exhausti ve” values of the parameters [ γ ǫ , γ 1 , w α , w β , ϕ ] . The protocol is the following: 1) choos e a new value for a parameter ( γ ǫ for example) and ﬁx the other parameters to the value provided by our algorithm, 2) compute the W iener-Hunt solution (Sec . 5.A) and 3) compute the error index. Results are reported in Fig. 4. In each case, smooth variation of error is observed when varying hyperparameters and instrum ent parameters and an un ique opti mum is visible. By t his way , one can ﬁnd the value of the parameters that provide the best W iener -Hunt solution when the true image x ⋆ is known. It is reported on T ab . 1 and s hows almost i mperceptible improvement: optim ization of the parameters (based on the true image x ⋆ ) a l low negligible i mprovement (smaller than 0.02 % as reported in T ab . 1). So, the main conclusion is t hat, the unsupervised and myopic proposed approach is a relev ant tool in order to tune parameters: it works (withou t the knowledge of the true image), as well as an optimal approach (based on the knowledge of the true image). 6.C. A posteriori law characteristics This s ection describes the a posterior i law usi ng hi stograms, means and variances of the parame- ters. The sample histograms, Figs. 5 a nd 6, provide an a pproximation of the mar ginal posterior law for each parameter . T abs. 1 and 2 report the va riance for th e i mage and law parameters respectiv ely and thus allow to quantify the uncertainty . 6.C.1. Hyperpar ameter characterist ics The his tograms for γ ǫ and γ 1 , Fig. 5, are concentrated around a m ean value in bot h non-myopi c and myopic cases. The variance for γ ǫ is lower than the one for γ 1 and it can be explained as follo ws . The obs erved data are directly impacted by noise (present at the system ou tput) whereas they are indirectly i mpacted by the object (present at th e system input). The con volution sys tem damages the o bject and not the noi se: as a consequence, the parameter γ ǫ (that driv es nois e law) is more 15 reliably estimated than γ 1 (that drive s object law) . A second observa tion is the smaller variance for γ 1 in the non-myopic case Fig. 5(c) than in the myopic case Fig. 5(d) . It is the consequence of the addition of information in the non-myopi c case w .r .t. th e myopic one, throu gh the value of the PS F parameters. In the myop ic case, the estimates are founded on t he knowledge of an in terva l for t he va lues of the in strument parameters, where as in t he non-myopic case, the estimat es are founded on the true values for the instrum ent parameters. 6.C.2. PSF paramet er characterist ics Fig. 6 gives his tograms for the three PSF parameters and their app earances are qui te different from the one for hyperparameters. Th e histograms for w α and w β , Figs. 6(a) and 6(b) are not as concentrated as the one of Fig. 5 for hyperparameters. Their variances are quite large with re gards to the interval of th e prior la w . On the cont rary , the histogram for the parameter ϕ , Fig. 6(c), has the smallest v ariance. It is analyzed as a consequ ence of a larger sensitivity of the data w .r .t. the parameter ϕ th an w .r .t. the parameters w α and w β . In an equi valent manner , t he observed data are more informative about th e parameter ϕ than abou t the parameters w α and w β . 6.C.3. Mark of the myopic ambigu ity Finally , a correlation between parameters ( γ 1 , w α ) and ( γ 1 , w β ) i s visible on their joint histograms Fig. 7. It can be interpreted as a consequence of the ambiguity in the primitive myo pic decon volu- tion problem, in the following manner: the parameters γ 1 and w both participate in the in terpreta- tion of the spectral cont ent of data, γ 1 as a scale factor and w as a shape factor . An in crease of w α or w β results in a decrease of the cutof f frequency of the o bservation s ystem. In order to e x plain the spectral content of a giv en data set, the spectrum of the original image must c ontain more high frequencies, i.e., a smaller γ 1 . This is also observed on the histogram illustrated Fig. 7(a). 6.D. MCMC algorit hm c haracterist ics Globally , the chains of Figs. 5 and 6, h a ve a Markov feature (correlated) and explore t he para meter space. They hav e a burn-in period foll owed by a stationary state. Th is characteristic has alwa ys been observed regardless the initi alization. For ﬁxed experimental conditio ns, th e stationary state of multiple runs was alwa ys around th e s ame value. Consid ering different initiali zations, the only visible change is on the length of the burn-in per iod. More precisely , t he chain of γ ǫ is concentrated in a sm all interval, the burn-in p eriod is very short (less than 1 0 sampl es) and its ev olu tion seems independent of the other parameters. Th e chain of γ 1 has a larger exploration, the b urn-in period is l onger (approximately 200 sampl es) and the histogram is lar ger . This is i n accordance with the analysis of Sec tion 6.C.1. About the PSF parameters, the beha viour is dif ferent for ( w α , w β ) and ϕ . The chain of th e two width parameters has a very good exploration with quasi-instantaneous b urn-in period. Con versely , 16 the chain of ϕ is more concentrated and its b urn-in period is approximat ely 4 000 samples. This is also in accordance with previous analysis (Section 6.C.2). Acceptation rates in the Metropolis-Hasti ngs algorit hm are reported i n T ab . 3 : they are qu ite small, especially for the rotation parameter . This is due to the st ructure of the i mplemented algo- rithm: an ind ependant Metropolis -Hastings algorithm with t he prior law as a propos ition law . Th e main adv ant age of t his choice is its simplicity but as a counterpart, a high rejection rate is observed due to a lar ge a priori interval for the angle parameter . A future work will be dev oted to the desi gn of more accurate proposition law . 6.E. Robustness of prior imag e model Fig. 8 ill ustrates the proposed m ethod on a m ore realis tic im age with heterogeneou s spatial struc- tures. The origi nal is the Lena im age and the data has been o btained wi th th e same Gaussian PSF and also corruption by white Gaussian noise. The Fig. 8(b) shows that the restored image is c loser to th e true one than the data. Smaller s tructures are visible and edges are sharper , for example around pixel 200 . The estimated parameters are c γ ǫ = 1 . 98 while the true value is γ ⋆ ǫ = 2 . Con- cerning t he PSF parameters, t he result s are c w α = 19 . 3 , c w β = 7 . 5 and b ϕ = 1 . 15 wh ile the true values are respectiv ely w ⋆ α = 20 , w ⋆ β = 7 and ϕ ⋆ = 1 . 05 as in the pre vio us section. Here a gain, the estimated PSF parameters are close to the true values giving a ﬁrst assessment of the capabili ty of the method in a more realistic context. 7. Conclusion and perspectiv es This paper presents a new global and coh erent method for myopic and unsupervised decon volution of relativ ely smooth images. It is b uilt within a Bayesian frame work and a proper extended a posteriori l aw for the PSF para meters, the hyperparameters and the image. The estimate, deﬁned as the posterior mean, is computed by means of an MCMC algorithm in less than a fe w minutes. Numerical assessm ent testiﬁes that the parameters of the PSF and the parameters of the prior laws are precisely estim ated. In addi tion, results also demonstrate that the myopi c and uns upervised decon volved image is closer t o the true image than the data and show true restored high-frequencies as well as spatial details. The paper focuses on linear in variant model often encount ered i n astronomy , medical imagi ng, nondestructive testi ng and especiall y in optical problems . Non-inv ariant linear models can also be cons idered in order t o a ddress other applications such as spectrometry [4] or ﬂuorescence mi- croscopy [13]. The loss o f inv ariance property prec ludes entirely F o urier -based computati ons but the meth odology remains valid and p racticable. In particular , it i s poss ible to draw samples of t he image by means of an optimization algorithm [38]. Gaussian l aw , related to L 2 penalization, is known for possible excessi ve sharp edges penaliza- tion in the restored object. The use of con vex L 2 − L 1 penalization [39 – 41] or non c on vex L 2 − L 0 17 penalization [42] can overcome th is limi tation. In these cases a difﬁculty occurs in the development of myopi c and unsupervised decon volution: the partition function of the prior law for the im age is in in tricate or ev en unknown d ependency w .r .t . the parameters [1, 7 , 43]. Ho we ver a recent pa- per [41] o vercome the di fﬁ cul ty resulting in an ef ﬁcient unsupervised decon volution and we plan to extend this w ork for t he myopic case. Regarding noise, Ga ussian likelihood limits robustness to outliers or aberrant data and it is pos- sible to appeal to rob ust law such as Huber p enalization in order to bypass the limitation. Ne ver- theless, t he partition function for th e n oise law is a gain di f ﬁcult or impo ssible to manage and it is possible to resort to the idea proposed in [41] to ove rcom e the dif ﬁcult y . Finally , estim ation of parameters of correlatio n matrix (cutoff frequency , attenuation coefﬁ- cients,. . . ) is possibl e withi n the same methodolog ical frame work. Thi s could be achiev ed for the correlation matrix o f the object or the no ise. As for the PSF parameters, the approach could rely on an extended a posteriori law , including the new parameters and a Metropolis-Hasti ngs sampl er . 8. Acknowledgmen t The authors would like to thank Professor Alain Ab er g el in IAS laboratory at Universit ´ e Paris-Sud 11, France, for fruitful discussion s and constructive suggest ions. Th e authors a re also grateful to Cornelia V acar , IMS laboratory , f or carefully reading the paper . A. Law in Fourier space For a Gaussi an vector x ∼ N ( µ , Σ ) , the law for ◦ x = F x (the FFT o f x ) is also Gaussian whose ﬁrst two moments are the following: • The mean is ◦ µ = E [ ◦ x ] = F E [ ◦ x ] = F µ . (47) • The cov ariance matrix is ◦ Σ = E [( ◦ x − ◦ µ )( ◦ x − ◦ µ ) † ] = F Σ F † . (48) Moreover , if the cov ariance matrix Σ is circulant it writes ◦ Σ = F Σ F † = Λ Σ . (49) i.e., the co va riance matrix ◦ Σ is diagonal. B. The Gamma pr obability density B.A. Deﬁnition The Gamma pdf for γ > 0 , with give n parameter α > 0 and β > 0 , is written G ( γ | α, β ) = 1 β α Γ( α ) γ α − 1 exp ( − γ /β ) . (50) 18 T ab . 4 gives three limit cases for ( α, β ) . The follo wing properties hold: • The mean is E G [ γ ] = α β • The var iance is V G [ γ ] = α β 2 • The maximiser is β ( α − 1) if and only if α > 1 B.B. Mar ginal isation First consid er a N dimensi onal zero-mean Gaussian vector with a given precision matrix γ Γ with γ > 0 . The pdf reads p ( x | γ ) = (2 π ) − N/ 2 γ N/ 2 det[ Γ ] 1 / 2 exp h − γ x t Γ x / 2 i . (51) So consider the conj ugate pdf for γ as a Gamm a law wit h parameter ( α , β ) (see previous A n- nex). The joint l aw for ( x , γ ) is the pro duct of the pdf given by E q. (50) and Eq. (51): p ( x , γ ) = p ( x | γ ) p ( γ ) . The marginalization of the joint la w i s kno wn [44]: p ( x ) = Z R + p ( x | γ ) p ( γ ) d γ = β N/ 2 det[ Γ ] 1 / 2 Γ ( α + N/ 2 ) (2 π ) N/ 2 Γ( α ) 1 + β x t Γ x 2 ! − α − N/ 2 (52) which is a N dimensional t -Student law of 2 α degrees of freedom with a β Γ precision matrix. Finally , the conditional law reads: p ( γ | x ) = (2 π ) − N/ 2 det[ Γ ] 1 / 2 β α Γ( α ) γ α + N/ 2 − 1 exp h − γ  x t Γ x / 2 + 1 /β i . (53) Thanks to conjugacy , i t is also a Gamma pdf with parameters ¯ α , ¯ β given by ¯ α = α + N / 2 and ¯ β − 1 = β − 1 + 2 / ( x t Γ x ) . C. The Metr opoli s-Hastings algorithm The Metropolis-Hasti ngs algorithm provides sampl es of a t ar g et law f ( w ) that cannot be directly sampled but can be ev aluated, at least up t o a multi plicative constant . Using t he so called “instru- ment law” q  w p | w ( t )  , samples of the target la w are obtained by the following iterations. 1. Sample a proposition w p ∼ q  w p | w ( t )  . 2. Compute the probability ρ = min    f ( w p ) f ( w ( t ) ) q  w ( t ) | w p  q ( w p | w ( t ) ) , 1    . (54) 3. T ake w ( t +1) =    w p with ρ probabili ty w ( t ) with 1 − ρ probability . (55) 19 At con vergence, the samples follow the target law f ( w ) [25, 36]. When q  w p | w ( t )  = q ( w p ) the algorithm is named independent Metropolis-Hastin gs. In addition, if the in strument law is uniform, the acceptance probability gets simpler in ρ = min ( f ( w p ) f ( w ( t ) ) , 1 ) . (56) Refer ences 1. J. Idier , ed., B ayesian Appr oach t o In verse Pr oblems (ISTE Lt d and John W i ley & Sons Inc., London, 2008). 2. R. Molina, J. Mateos, and A. K . Katsaggelo s, “Blind decon volution using a variational ap- proach to p arameter , i mage, and blur est imation, ” IEEE Trans. Image Processing 15 , 3 715– 3727 (2006). 3. P . Campis i and K. Egiazarian, eds., Blind Image Decon volution (CRC Press, 2007). 4. T . Rodet, F . Orieux, J.-F . Giov ann elli, and A. Aber gel, “Data in version for over- resolved s pec- tral imaging in astronomy , ” IEEE J. of Selec. T opics in Signal Proc. 2 , 802–8 11 (2008). 5. A. T ikho nov and V . Arsenin, Sol utions of Ill -P osed Pr oblems (W inston , W ashingt on, D C , 1977). 6. S. T womey , “On the numerical solution of Fredholm integral equations of the ﬁr st kind by the in version of the linear sy stem produced by quadrature, ” J. Assoc. Comp. Mach. 10 , 97–101 (1962). 7. A. Jalob eanu, L. Blanc-F ´ eraud, and J. Zerubia, “Hyperparameter estimati on for s atellite im- age restoration by a MCMC m aximum li kelihood meth od, ” Pattern Recognition 35 , 341–352 (2002). 8. J. A. O’Sullivan, “Roughness penalties on ﬁnite domains, ” IEEE Trans. Image Processing 4 , 1258–1268 (1995). 9. G. Demoment, “Image reconstruction and restoration: Overvie w of common estimation struc- ture and problems, ” IEEE Trans. Acoust. Speech, Signal Processing A S S P -37 , 2024–20 36 (1989). 10. P . Pankajakshani, B. Zhang, L. Blanc-F ´ eraud, Z. Kam, J.-C. Oliv o-Marin, and J. Zerubia, “Blind decon volution for thin-layered confocal imagi ng, ” Appl. Opt. 48 , 4437–444 8 (2009). 11. E. Thibaut and J.-M. Conan, “Strict a priori const raints for m aximum likelihood blind decon- volution, ” J. Opt. Soc. Am. A 12 , 485– 492 (1995). 12. N. Dobig eon, A. H ero, and J.-Y . T ourneret, “Hierarchical bayesian sparse image reconstruction with application to MRFM, ” IEEE Trans. Image Processing (2009). 13. B. Zhang, J. Zerubia, and J.-C. Oliv o -Marin, “Gaussian approxi mations of ﬂuorescence mi- croscope point-spread function models, ” Appl. Opt. 46 , 1819–1829 (2007). 20 14. L. Mugnier , T . Fusco, and J .-M. Conan, “MISTRAL: a myopic edge-preserving image restora- tion method , with app lication to astronomical adaptive-optics-corrected lon g-exposure im- ages, ” J. Opt. Soc. Amer . 21 , 1841–1 854 (2004). 15. E. Thi ´ ebaut, “MiRA: an effecti ve imagin g algorithm for opt ical int erferometry , ” in “proc. SPIE: Astrono mical T elescopes and Instrument ation, ” , v ol. 7013 (200 8), vol. 701 3, pp. 70131–I. 16. T . Fusco, J.-P . V . ran, J.-M . Conan, and L. M. Mugnier , “Myopi c decon volution method for adaptiv e optics images of stellar ﬁelds, ” Astron. Astrophys. Suppl. Ser . 134 , 193 (1999). 17. J.-M . C onan, L. Mugnier , T . Fusco, V . Michau, and R . G., “Myopic decon volution of adaptiv e optics i mages by use of object and point-spread function power spectra, ” Applied Optics 37 , 4614–4622 (1998). 18. A. C. Likas and N. P . Galatsanos, “ A variational approach for Bayesian blind image decon vo- lution, ” IEEE Trans. Image Processing 52 , 2222–2233 (2004). 19. T . Bishop, R . Molina, and J. Hopgood, “Blind restoration of bl urred photographs via AR mod- elling and MCMC, ” in “Image Processing, 2008. ICIP 2008. 15 th IEEE Int. Conference on, ” (2008). 20. E. Y . Lam and J . W . Goodman, “Iterativ e statisti cal approach to blind i mage d econ volution, ” J. Opt. Soc. Am. A 17 , 1177–1184 (2000). 21. Z. Xu and E. Y . Lam, “Maximum a posteriori blind image decon volution with Huber –Markov random-ﬁeld regularization, ” Opt. Lett. 34 , 1453–1455 (2009). 22. M . Cannon, “Blind decon volution of spatially in variant image blurs with phase, ” IEEE Trans. Acoust. Speech, Signal Processing 24 , 58–63 (1976). 23. A. Jalobeanu, L. Blanc-Feraud, and J. Zerubia, “Estimatio n of bl ur and no ise parameters in remote sensing, ” in “Proc. IEEE ICASSP , ” , vol. 4 (2002), v ol. 4, pp. 3580–3583. 24. F . Chen and J. Ma, “ An empirical identiﬁcation m ethod of Gaussian blur parameter for image deblurring, ” Signal Processing, IEEE T rans. on (2009). 25. C. P . Robert and G. Casella, Monte-Carlo Statist ical Methods , Springer T exts in Statis tics (Springer , New Y ork, N Y , 2000). 26. B. R. Hunt, “ A matrix theory proof of the discrete con volution theorem, ” IEEE Trans. Automat. Contr . A C-19 , 285–288 (1971). 27. M . Ca lder and R. A. Da vi s, “Introduction to Whittle (195 3) ‘The analysis o f multiple stationary time series’, ” Breakthroughs in Statistics 3 , 141–148 (1997). 28. P . J. Brockwell and R. A. Davis, T ime Series: Theory and Methods (Spring er -V erlag, Ne w Y ork, 1991). 29. B. R. Hunt, “Decon volution of linear systems by constrained re gression and its relationship to the Wiener theory , ” IEEE Trans. Automat. Contr . A C-17 , 703–705 (1972). 30. K. Mardi a, J. Kent, and J. Bibby , Multivariat e Analysis (San Diego : Academic Press, 1992), 21 chap. 2, pp. 36–43. 31. C. A. Bouman and K. D. Sauer , “ A generalized Gaussi an im age m odel for edge-preserving M A P estimation, ” IEEE Trans. Image Processing 2 , 296–310 (1993). 32. D. MacKay , Information Theory , Inf er ence, and Learning Algori thms (Cambridge University Press, 2003). 33. R. E. Kass and L. W ass erman, “The selection of pri or distributions by formal rul es, ” J. A mer . Statist. Assoc. 91 , 1343–1370 (1996). 34. E. T . Jaynes, Pr obabi lity Theory: The L ogic of Science (Cambridge Uni versity Press, 2003). 35. S. Lang, Real and functi onal analysis (Springer , 1 993). 36. P . Br ´ emaud, Markov Chains. Gibbs ﬁelds, Mont e Carlo Simulati on, and Queues , T exts i n Applied Mathematics 31 (Spinger , New Y ork, N Y , 1999). 37. S. Gem an and D. Geman, “Stochastic relaxation, Gibbs dist ributions, and the Bayesian restora- tion of images, ” IEEE Trans. P attern Anal. Mach. Intell. 6 , 721–741 (1984). 38. F . Orieux, O. F ´ eron, and J.-F . Giov annelli , “Stochastic sampl ing of large dim ension non- stationnary gaussian ﬁeld for image restoration, ” Subm itted to ICIP2010. 39. H. R. K ¨ unsch, “Rob ust priors for smoothing and image restoration, ” Ann. Inst. Stat. Math. 46 , 1–19 (1994). 40. P . Charbonnier , L. Blanc-F ´ eraud, G. Aubert, and M. Barlaud, “Deterministic edge-preserving regularization in computed imaging, ” IEEE Trans. Image Processing 6 , 298–311 (1997). 41. J.-F . Giov annelli, “Unsupervised Bayesian con vex decon volution based on a ﬁeld with an ex- plicit partition function, ” IEEE Trans. Image Processing 17 , 16–26 (2008). 42. D. Gem an and C. Y ang, “Nonlinear image recovery with half-quadratic regularization, ” IEEE Trans. Image Processing 4 , 932–946 (1995). 43. X. Descom bes, R. Morris, J. Zerubia, and M. Berthod, “Estimation of Markov random ﬁeld prior parameters using Markov chain Mon te Carlo m aximum likelihood, ” IEEE Trans. Image Processing 8 , 954–963 (1999). 44. G. E. P . Box and G. C. Tiao, Bayesian infer ence in statistical analysis (Addi son-W esley pub- lishing, 1972). 22 T able 1. E rror e (Eq. (46 )) and av eraged standard deviation ˆ σ of the posterior image law . The “Best” error has been obtained with the kno wledg e of the true image. Data Non-myopic Myopic Best Error ( e ) 11.092 % 6.241 % 6.253 % 6.235 % ˆ σ of x law - 3.16 3.25 - 23 T able 2. Quantitative e valuation: true and esti mated values of hyperparameters and PSF parameters. b γ ǫ ± ˆ σ b γ 1 ± ˆ σ b w α ± ˆ σ b w β ± ˆ σ b ϕ ± ˆ σ T rue value 0.5 2 20 7 1.05 ( π / 3 ) Non-my opic Estimate 0.49 ± 0 . 0056 1.78 ± 0 . 14 - - - Error 2.0 % 11 % - - - Myopic Estimate 0.49 ± 0 . 0056 1.65 ± 0 . 15 20. 07 ± 0 . 53 7.19 ± 0 . 38 1.03 ± 0 . 04 Error 2.0 % 18 % 0.35 % 2.7 % 1.9 % 24 T able 3. Acceptation rate. Parameter w α w β ϕ Acceptation rate 14.50 % 9.44 % 2.14 % 25 T able 4. Speciﬁc l aws obtained a s limit of the Gamma pdf. α β Jef freys 0 + ∞ Uniform 1 + ∞ Dirac - 0 26 0 50 100 −120 −100 −80 −60 −40 −20 0 (a) True image 0 50 100 −120 −100 −80 −60 −40 −20 0 T rue Data (b) Data 0 50 100 −120 −100 −80 −60 −40 −20 0 T rue Estimate (c) Non-m yopic 0 50 100 −120 −100 −80 −60 −40 −20 0 T rue Estimate (d) Myo pic Fig. 1 . Th e ﬁgure 1(a) represents a 128 × 128 sample of t he a prio ri law for the object wit h γ 0 = 1 and γ 1 = 2 . Fig. 1(b) is the data computed with the PSF shown in Fig. 2. Figs. 1(c) and 1(d) are the estimates with non-myopic and the myopi c estimate, respectiv ely . Proﬁ les correspond to the 68-th line. 27 −0.5 0 0.5 −0.5 0 0.5 Fig. 2. PSF with w α = 20 , w β = 7 and ϕ = π / 3 . The x-axis and y-axis are reduced frequency . 0 0.05 0.1 0.15 0.2 10 −10 10 −5 10 0 10 5 10 10 T rue Convolued image Data Estimate (a) No n-Myop ic 0 0.05 0.1 0.15 0.2 10 −10 10 −5 10 0 10 5 10 10 T ru e Convolued image Data Estimate (b) Myo pic Fig. 3. Circular a verage of the empirical power spectral density of the i mage, the con volued image, the data (con volued image corr upted by noise) and the e s timates, in radial frequency with y-axis in logarithmi c scale. The x-axis is the radial fre- quency . 28 10 −2 10 0 10 −1 (a) γ ǫ 10 −2 10 0 10 −1 (b) γ 1 19 19.5 20 20.5 21 0.0625 0.0626 0.0627 0.0628 (c) w α 6 6.5 7 7.5 8 0.0624 0.0626 0.0628 0.063 0.0632 0.0634 (d) w β 0.8 1 1.2 1.4 0.06 0.065 0.07 0.075 0.08 (e) ϕ Fig. 4. Computatio n of the best parameters in the sens e e Eq. (46). The s ymbol ’ × ’ is the minim um and the symbol ’. ’ i s the estimated va lue by our approach. The y-axis of γ ǫ and γ 1 are in logarithmic scale. 29 0 0.5 1 1.5 2 0 100 200 300 400 500 600 300 600 900 0 0.5 1 1.5 2 (a) γ ǫ for non-my opic case 0 0.5 1 1.5 2 0 2000 4000 6000 8000 10000 12000 5000 10000 15000 0 0.5 1 1.5 2 (b) γ ǫ for myop ic case 0 1 2 3 0 10 20 30 40 300 600 900 0 0.5 1 1.5 2 2.5 3 (c) γ 1 for non- myopic case 0 1 2 3 0 100 200 300 400 500 600 5000 10000 15000 0 0.5 1 1.5 2 2.5 3 (d) γ 1 for myop ic case Fig. 5 . Histog rams and chains for the non-myo pic case in Figs. 5(a)-5(c) and the myopic case in Figs. 5(b)-5(d) for γ ǫ and γ 1 , respectiv ely . The symbol × localizes the init ial v alue and the dashed line corresponds to the t rue value. The x-axis are iteration’ s index for the chains and parameter v alue for the histograms. 30 19 19.5 20 20.5 21 0 100 200 300 400 5000 10000 15000 19 19.5 20 20.5 21 (a) w α 6 6.5 7 7.5 8 0 100 200 300 400 500 600 5000 10000 15000 6 6.5 7 7.5 8 (b) w β 0.8 1 1.2 1.4 0 1000 2000 3000 4000 5000 10000 15000 0.8 1 1.2 1.4 (c) ϕ Fig. 6. Histogram a nd chain for the PSF parameters w α in Fig. 6(a), w β in Fig. 6(b) and ϕ in Fig. 6(c). The symbol × localizes th e init ial value and the dashed lin e corresponds to the true value. The x-axis for the histograms and the y-axis of the chain are limits of a priori law . 31 γ 1 w α 19.5 20 20.5 1 1.2 1.4 1.6 1.8 2 2.2 0 20 40 60 80 100 120 (a) ( γ 1 , w α ) γ 1 w β 6.5 7 7.5 1 1.2 1.4 1.6 1.8 2 2.2 0 50 100 150 200 (b) ( γ 1 , w β ) Fig. 7. Joint histograms for th e coup le ( γ 1 , w α ) and ( γ 1 , w β ) in Figs. 7(a) and 7(b) respectiv ely . The x-axis and y-axis ar e the parameter value. 0 50 100 150 200 250 0 50 100 150 200 250 (a) Data 0 50 100 150 200 250 0 50 100 150 200 250 (b) Estimated imag e Fig. 8. Observed im age Fig. 8(a) and restored image Fig . 8(b). Proﬁles correspond to the 68-th line. The solid line is the true proﬁle. Dashed line corr espond to data i n Fig. 8(a) and estimated proﬁles in Fig. 8(b). 32 1: Initialis ation of  ◦ x (0) , γ (0) , w (0) , k = 0  2: rep eat % Sample of ◦ x 3: Σ ← γ ( k ) ǫ | Λ H | 2 + γ ( k ) 0 | Λ 1 | 2 + γ ( k ) 1 | Λ D | 2 4: µ ← γ ( k ) ǫ Σ − 1 Λ ∗ H ◦ y 5: ◦ x ( k ) ← µ + Σ − 1 / 2 . ∗ randn % Sample of γ 6: γ ( k ) ǫ ← gamrnd ( α ǫ , β ǫ ) 7: γ ( k ) 1 ← gamrnd ( α 1 , β 1 ) 8: γ ( k ) 0 ← gamrnd ( α 0 , β 0 ) % Sample of w 9: w p ← rand ∗ ( a − b ) + a 10: J ← γ ǫ  k ◦ y − Λ H ◦ x k 2 − k ◦ y − Λ H , w p ◦ x k 2  / 2 11: if log( rand ) < min { J , 0 } then 12: w ( k ) ← w p 13: Λ H ← Λ H , w p 14: else 15: w ( k ) ← w ( k − 1) 16: end if % Empirical mean 17: k ← k + 1 18: ◦ ¯ x ( k ) ← P i ◦ x ( i ) /k 19: until | ¯ x ( k ) − ¯ x ( k − 1) | / | ¯ x ( k ) | ≤ criterion Fig. 9 . Pseudo-code algorithm . gamrnd , rand and randn draw samples of gamma var iable, uniform variable, and zero-mean unit-variance white com plex Gaussi an vector re spectiv el y . 33

Bayesian estimation of regularization and PSF parameters for Wiener-Hunt deconvolution

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment