Compressed sensing imaging techniques for radio interferometry

Mon. Not. R. Astron. Soc. 000 , 1–10 (2009) Printed Octobe r 12, 2021 (MN L A T E X style ﬁle v2.2) Compressed sensing imagin g techniques f or radio interferometry Y . W iaux 1 , 2 , L. Jacques 1 , 3 , G. Puy 1 , A. M. M. Scaife 4 , P . V andergheynst 1 1 Institut e of Electrical Engineerin g, Ecole P olyt echn ique F ´ ed ´ erale de Lausanne (EPFL), CH-1015 Lausanne, Switzerlan d 2 Centr e for P article Physics and Phenomenolo gy , Universit ´ e catholique de Louvain (UCL), B-1348 Louvain-la-Ne uve , Belgium 3 Communicat ions and Remote Sensing Laborato ry , Universit´ e catholi que de Louvain (UCL), B-1348 Louvain- la-Neuv e, Belgium 4 Astr ophysic s Group, Cav endish Laboratory , Univer sity of Cambridge , Cambridge CB3 0HE, Unite d Kingdom October 12, 2021 ABSTRA CT Radio interferom etry probes astro physical signals th rough inco mplete a nd n oisy F ourier mea- surements. T he theor y of co mpressed sensing demonstra tes that such m easurements m ay ac- tually sufﬁce for accu rate reconstructio n of sparse or com pressible sign als. W e p ropose new generic imaging tec hniques based o n conv ex optimization fo r g lobal m inimization p roblems deﬁned in this con text. T he versatility of the fra mew ork notably allows introd uction of spe- ciﬁc prior infor mation on the signals, wh ich of fers th e possibility of signiﬁcant improvements of re construction relative to the stand ard local matching pursuit algo rithm CLEAN used in radio astro nomy . W e illustrate the p otential of the ap proach by study ing reconstru ction per- forman ces on simulations of two different k inds of sign als o bserved with very g eneric inter- ferometr ic con ﬁgurations. T he ﬁrst kind is an intensity ﬁ eld of co mpact astrophysical objects. The seco nd kind is the imp rint o f cosmic strings in the tempe rature ﬁeld o f the c osmic mi- crowa ve backgro und radiation, o f particular interest for cosmolog y . Key words: technique s: interferom etric, tech niques: image p rocessing, cosmolo gy: cosmic microwa ve bac kgrou nd 1 INTRODUCTION Radio interferometry is a po werful techn ique for aperture syn- thesis in astronomy , dating back to more than sixty years ago (Ryle & V onber g 1946; Blythe 1957; Ryle et al. 1959; Ryle & Hewish 1960; Thompson et al. 2004). In a few word s, thanks to interferometric techniques, radio telescope arrays syn- thesize the aperture of a unique telescope of the same size as t he maximum projected distance between two t elescopes on the plane perpendicular to t he pointing direction of the instrument. This al- lo ws observ ations with otherwise inaccessible angular resolutions and sen sitivities in ra dio astronom y . The small portion o f the celes- tial sphere accessible to the instrument around the pointing direc- tion tracked during observ ation deﬁnes the original real planar sig- nal or image I t o be reco vere d. The fundamental Nyquist-Shanno n theorem r equires a signal to be sampled at a frequenc y of twice its bandwidth to be exac tly kno wn. T he signal I may therefore be express ed as a vector x ∈ R N containing the required number N of sampled values. Radio-interferometric data are acquired in t he Fourier plane. The number m of spatial frequencies probed may be much smaller than t he number N of discrete frequencies of the original band-limited signal, so that the Fourier cov erage is incom- plete. Moreo ver the spatial frequencies pro bed are not u niformly sampled. T he measurements are also obv iously affected by noise. An ill -posed inv erse problem is thus deﬁned for reconstruction of the original image. Beyon d the Nyquist-Shannon theorem, the emerging theory of compressed sensing aims at merging data acquisition a nd compres- sion (Cand ` es et al. 2006a,b; Cand ` es 2006; Donoho 2 006; Baraniuk 2007). It notably relies on t he idea that a large v ariety of signals in Nature are sparse or compressible. By deﬁnition, a signal i s sparse in so me basis if its ex pansion contains only a small number of no n- zero coef ﬁ cients. More g enerally it is compressible if its expansion only contains a small number of signiﬁcant coef ﬁ cients, i.e. if a large number of its coef ﬁcients bear a negligible valu e. Compressed sensing t heory demonstrates that a much smaller number of li near measurements is required for accurate knowled ge of such signals than is required for Nyq uist-Shannon s ampling. The sensing matrix must simply satisfy a so-called restricted isome try property . In par- ticular , a small number of random measurements in a sensing basis incoherent with the sparsity or compressibility basis will ensure this property with ov erwhelming probability , e.g. random Fourier mea- surements of a signal sparse i n real or wav elet space. Consequently , if compressed sensing had been develo ped before t he advent of ra- dio interferometry , one could probably not have though t of a much better design of measurements for sparse and compressible si gnals in an imaging perspecti v e. In this work we present results showing that the theory o f com- pressed sensing offers po werful image reconstruction techniques for radio-interferometric data. These techniques are b ased on global minimization problems, which are solved by con v ex optimization algorithms. W e also empha size on the versatility of the sche me rel- ativ e to t he inclusion of speciﬁc prior information on the signal in the minimization problems. This v ersatility allo ws the deﬁnition of c  2009 RAS 2 W iaux et al. image reconstruction tech niques wh ich are signiﬁcantly m ore p o w- erful than standard deco n v olution algorithm called CLEAN used in the contex t of radio astronomy . In Section 2, we pose the in verse problem for image recon- struction from radio-interferometric data and discuss the standard image reconstruction techniques used in radio astronomy . In Sec- tion 3, we concisely describe the central results of the theory of compressed sensing regarding the deﬁnition of a sensing basis and the accu rate reconstruction of sparse or compressible signals. In Section 4, we ﬁrstly comment on the exact co mpliance of radio interferometric measurements with compressed sensing. W e then study the reconstruction performan ces of v arious compressed sens- ing imaging techniques relative to CLEAN on simulations of two kinds of si gnals of interest for astrophysics and cosmology . W e ﬁ- nally conclude in Section 5. Notice that a ﬁrst ap plication of compressed sensing in as- tronomy (B obin et al. 2008) was very recently pro posed for non - destructi ve data compression on board t he future Herschel space observ atory 1 . The versatility of the compressed sensing frame- work to account for speciﬁc prior information on signals was already pointed out in that context. Moreo v er , the generic po - tential of compressed sensing for interferometry was pointed in the signal processing commu nity since the time when the the- ory emerg ed (Donoho 2006 ; Cand ` es et al. 2006b; Mary & Michel 2007; Lev anda & Leshem 2008). It was also very recently ac- kno wledged in radio astronom y (Corn well 2008 ). The present work nonetheless represents the ﬁrst application of compressed se ns- ing for the deﬁnition of new imaging technique s in r adio in- terferometry . A hug e amount of work may be en visaged alo ng these lines, notably for the transfer o f the proposed t echniques to optical and infrared interferometry . The extension of t hese techniques from the plane to the sphere will also be essential, notably wi th regard to forthcoming radio interferometers wi th wide ﬁ elds of view on the celestial sphere (Cornwell et al. 2008 ; McEwen & Scaife 2008), such as the future S quare Kilometer Ar- ray (SKA) 2 (Carilli & Rawlings 200 4). 2 RADIO INTERFEROMETR Y In this section, we recall the v an Cittert-Zernike theorem on the ba- sis of which we f ormulate the in verse problem posed for i mage re- construction from radio-interferometric data. W e also describe and discuss t he standard image reconstruction techniques used in ra- dio astronomy , namely a l ocal matching pursuit algo rithm called CLEAN and a global optimization algorithm called the maximum entropy method (MEM). 2.1 van Cittert-Zernike theorem In a tracking conﬁguration, all radio telescopes of an interferomet- ric array point in the same d irection. The ﬁeld of vie w observed on the celestial sphere S 2 is l imited by a so-called illumination func- tion A ( ω ) , depending on the angular position ω ∈ S 2 . The size of its angular support is essentially inv ersely proportional to the size of the dishes of the t elescopes (Thompson et al. 2004). At each in- stant of observation , eac h telescope pair iden tiﬁed by an index b measures a complex visibili ty y b ∈ C . This visibility is deﬁned as 1 http:/ /hersche l.esac.esa.int/ 2 http:/ /www .s katel escope.or g/ the correlation between incoming electric ﬁelds E at the positions of the two telescopes in the three-dimen sional space, ~ b 1 , ~ b 2 ∈ R 3 : y b = ﬁ E “ ~ b 1 , t ” E ∗ “ ~ b 2 , t ” ﬂ ∆ t . (1) In this relation, t denotes the time v ariable and the bracke ts h ·i ∆ t denote an average ov er a time ∆ t long relativ e to t he period of the radio wa ve detected . W e consider a monochromatic signal with a wa velen gth of emission λ , and made u p of in coherent sources. W e al so cons ider a standard interferometer with an illumination function whose angu- lar support is small enough so that the ﬁeld of view may be i denti- ﬁed to a planar patch of the celestial sphere: P ⊂ R 2 . The signal and the ill umination function thus r especti vely appear as functions I ( ~ p ) and A ( ~ p ) of the angular variable seen as a t wo-dimensio nal vector ~ p ∈ R 2 with an origin at the pointing direction of the array . The vector ~ B b = ~ b 2 − ~ b 1 ∈ R 3 deﬁning the relativ e position be- tween the two telescopes is c alled the baseline, and its projection on the plane perpendicular t o the pointing direction of the instrument may be denoted as ~ B ⊥ b ∈ R 2 . One also makes the additional as- sumption that t he maximum projection o f the ba selines in the point- ing direction itself is small (Cornwell et al. 2008). In this context, the so-called van Cittert-Zernike theorem states that the visibility measured identiﬁes with the two-dimensional Fourier transform of the i mage multiplied by t he i llumination function AI at the single spatial frequenc y ~ u b = ~ B ⊥ b λ , (2) i.e. y b = c AI ( ~ u b ) , (3) with c AI ( ~ u ) ≡ Z R 2 A ( ~ p ) I ( ~ p ) e − 2 iπ ~ p · ~ u d 2 ~ p, (4) for any two-dimensional vector ~ u ∈ R 2 . Interferometric arrays thus probe signals at a resolution equi v alent to that of a single tele- scope wit h a size R essentially equ iv alent to the maximum pro- jected baseline on the plane perpendicular to the pointing direction: R ≃ max b ~ B ⊥ b . This expresses the essence of aperture synthesis (Thompson et al. 2004). 2.2 Interferometric in verse problem In t he course of an observ ation, the projected baselines on the plane perpendicular to the pointing direction change thanks to the Earth’ s rotation and run ov er an ell ipse in the Fourier plane of the origi- nal image, whose parameters are li nked to the pa rameters of ob- serv ation. The total number m/ 2 of spatial frequencies probed by all pairs of telescopes of the array during the observ ation provides some Fourier cov erage characterizing the interferometer . Any in- terferometer is thus si mply identiﬁed by a binary mask in Fourier equal to 1 for each spatial frequenc y probed and 0 otherwise. The visibilities measured may be denoted as a vector of m/ 2 complex Fourier coef ﬁcients y ∈ C m/ 2 = { y b = c AI ( ~ u b ) } 1 6 b 6 m/ 2 , pos- sibly affected by complex noise va lues n ∈ C m/ 2 = { n b = n ( ~ u b ) } 1 6 b 6 m/ 2 of astroph ysical or instrumental origin. Consid- ering that the si gnal I and the illumination function A are real, a symmetry c AI ( − ~ u b ) = c AI ∗ ( ~ u b ) also holds so that indepen- dent measurements may all be localized in one half of the Fourier c  2009 RAS, MNRAS 000 , 1–10 Compr essed sensing for radio interfer ometry 3 plane. The binary mask in Fourier identifying the interferome- ter is deﬁned in this half of the plane and ren dered symmetric around the origin so that it also corresponds to the Fourier trans- form of a real function. In this context, t he measured visibilities may equi v alently be denoted as a vector of m real Fourier coef ﬁ- cients y ∈ R m = { y r } 1 6 r 6 m consisting of the real and imaginary parts of the complex measures, possibly affec ted by real noise val- ues n ∈ R m = { n r } 1 6 r 6 m . The original signal I ( ~ p ) and the il lumination function A ( ~ p ) can be approximated by band-limited functions restr icted to t he ﬁ- nite ﬁeld of vie w precisely set by the illumination function: ~ p ∈ P . In t his context, we notice that they are identiﬁed by their Nyquist- Shannon sampling o n a discrete uniform grid of N = N 1 / 2 × N 1 / 2 points ~ p i ∈ R 2 in real space with 1 6 i 6 N . The sampled signal may thus be denoted as x ∈ R N = { x i = I ( ~ p i ) } 1 6 i 6 N while the i llumination function is denoted as a ∈ R N = { a i = A ( ~ p i ) } 1 6 i 6 N , and the sampled product reads as ¯ x ∈ R N = { ¯ x i = AI ( ~ p i ) } 1 6 i 6 N . Becau se of the assumed ﬁnite ﬁeld of view , the functions may equiv alently be described by their complex Fourier coef ﬁcients on a discrete uniform grid of N = N 1 / 2 × N 1 / 2 spa- tial frequencies ~ u i with 1 6 i 6 N . T his grid is symme tric around the origin and limited at the maximum frequenc y deﬁning the band limit. In particular for t he Fourier coefﬁcients of the product AI one has: b ¯ x ∈ C N = { b ¯ x i = c AI ( ~ u i ) } 1 6 i 6 N . The functions being real, one again has the symmetry c AI ( − ~ u i ) = c AI ∗ ( ~ u i ) so that the signal is described by exactly N / 2 complex Fourier coefﬁcients in one half of the Fourier plane, or equiv alently N real Fourier coef- ﬁcients consisting of the real and imag inary parts of these complex coef ﬁcients. In t he following we only use t his decomposition wit h real coef ﬁcients in one half of the Fourier plane. Ho we ver the frequencies ~ u b probed deﬁned by (2) for 1 6 b 6 m/ 2 are continuous and do not generally belong to the set of discrete frequencies ~ u i for 1 6 i 6 N . Reconstruction schemes i n general pe rform a preliminary gridding operation on the visibilities y r with 1 6 r 6 m so that the in verse p roblem may be refor- mulated in a pure discrete setting, i .e. between the discrete Fourier and real planes (Thompson et al. 2004). The essential reason fo r the gridding resides in the subsequent use of the standard fast Fourier transform (FFT) 3 . Fo r the sa ke o f the c onsiderations that follow w e assume that the frequencies probed ~ u b belong to the discrete grid of points ~ u i so that no arti fact due to the gridding i s introduced. In this discrete setti ng the F ourier coverage is una v oidably incomplete in the sense that the number of real constraints m is alway s smaller than the number of unkno wns N : m < N . An ill-posed in verse problem is thus deﬁned for the reconstruction of the signal x from the measured visibilities y as: y = Φ ri x + n, (5) for a giv en noise n , and with a sensing matri x Φ ri for radio inter- ferometry of the form Φ ri = M F D . (6) In this relation, the m atrix D ∈ R N × N = { D ii ′ = a i δ ii ′ } 1 6 i,i ′ 6 N is the diagonal matrix i mplementing the illumina- tion function, and the matrix F ∈ R N × N = { F ii ′ } 1 6 i,i ′ 6 N im- plements the discrete Fourier transform providing the real Fourier coef ﬁcients in one half of the Fourier plane. The matrix M ∈ 3 Notice that fast algorithms have been dev elope d to compute a Fourier transform on non -equispac ed spatia l freq uencie s (NFFT) (Potts et al. 2008). This could in princi ple allo w one to av oid an expl icit gridding operation. R m × N = { M r i } 1 6 r 6 m ;1 6 i 6 N is the rectangular binary matrix implementing the mask characterizing the interferometer in one half of the F ourier plane. It contains only one non-zero v alue on each line, at the index of one of the two real Fourier coefﬁcients correspondin g t o each of the spatial frequencies probed. W e restrict our considerations to i ndepende nt Gaussian noise with variance σ 2 r = σ 2 ( y r ) . F rom a stati stical point of view , the likelihood L associated w ith a candidate reconstruction x ∗ of the signal x is deﬁned as the probability of the data y giv en the model x ∗ , or equiv alently t he probability of the noise residual n ∗ = y − Φ ri x ∗ . Under the Gaussian noise assumption it reads as L ( y | x ∗ ) ∝ exp » − 1 2 χ 2 ( x ∗ ; Φ ri , y ) – , (7) with the correspon ding negativ e logarithm χ 2 ( x ∗ ; Φ ri , y ) = m X r =1 ( n ∗ r ) 2 σ 2 r , (8) follo wing a chi-squa re distrib ution with m degrees of freedom. The χ 2 deﬁnes a noise level esti mator . The level of residual noise n ∗ should be reduced by ﬁ nding x ∗ minimizing this χ 2 , which corre- sponds to maximize the likelihood L . T ypically , the measurement constraint on the reconstruction may be deﬁned as a bound χ 2 ( x ∗ ; Φ ri , y ) 6 ǫ 2 , (9) with ǫ 2 correspondin g to some ( 100 α ) th percentile of the chi- square distribution, i.e. p ( χ 2 6 ǫ 2 ) = α for some α 6 1 . For a solution with a χ 2 = ǫ 2 , there is a probability α that pure noise gi ves a residual smaller than or equal to the observed residual n ∗ , and a probability 1 − α that n oise gi ve s a lar ger residual. T oo small an α would thus induce possible no ise ov er-ﬁtting, i.e. in clusion of part of the noise in the reconstruction. These considerations might of course be generalized to other kinds of noise distributions. The i n v erse problem being ill-posed, many signals may for- mally satisfy measurement constraints such as (9). In general, the problem may only ﬁnd a un ique solution x ∗ , as close as p ossible to the true signal x , throug h a re gularization schem e which sho uld encompass enoug h prior information on the original signal. All possible image reconstruction algorithms will essentially be dis- tinguished through the kind of regu larization considered. 2.3 Standard imaging techniques The general in v erse problem (5) is to be considered i f one wishes t o undo the multiplication by the illumination function and to re cov er the original signal x on the giv en ﬁeld of vie w . In practice, the re- construction is usually considered for the original i mage I already multiplied by the illumination function A , whose sampled values are ¯ x = Dx ∈ R N = { ¯ x i = a i x i } 1 6 i 6 N . In this setting the in v erse problem reads as y = ¯ Φ ri ¯ x + n, (10) with a sensing matrix ¯ Φ ri strictly implementing a con volution: ¯ Φ ri = M F . (11) Firstly , the most stan dard an d otherwise already v ery effec- tiv e i mage reconstruction algorithm from visibility measurements is called CL EAN. I t approaches the image reconstruction in terms of the corresponding decon volution problem in real space (H ¨ ogbom 1974; Schwarz 1978; Thompson et al. 2004). In standard vocab u- lary , the in verse transform o f the Fou rier measuremen ts with all c  2009 RAS, MNRAS 000 , 1–10 4 W iaux et al. non-observ ed visibiliti es set to zero is call ed the dirty image. Its sampled v alues ¯ x ( d ) ∈ R N = { ¯ x ( d ) i } 1 6 i 6 N are simply obtained by app lication of the adjoint sensing matrix to the observ ed visibil- ities: ¯ x ( d ) = ¯ Φ † ri y . The in verse transform of the binary mask identi- fying the i nterferometer is called the dirty be am. Its sampled v alues d ∈ R N = { d i } 1 6 i 6 N follo w from the application of the adjoint sensing matrix to a vector of unit values 1 m ∈ R m : d = ¯ Φ † ri 1 m . The in verse transform of th e no ise n with all non-observed visibili- ties set to zero deﬁnes an alternativ e ex pression of the noise in real space. Again its sampled values n ( d ) ∈ R N = { n ( d ) i } 1 6 i 6 N are simply obtained by application of the adjoint sensing matrix to the noise realization: n ( d ) = ¯ Φ † ri n . The i n v erse problem (10) can thus be rephrased by expressing the dirty image as the conv olution of the original image with the dirty beam, plus the noise: ¯ x ( d ) = d ⋆ ¯ x + n ( d ) . (12) CLEAN is a non-linear decon volu tion method relying on this re- lation and working by local iterative beam remov al. At each iter- ation, the point in real space is identiﬁed where a residual image, initialized to the dirty image, tak es its maximum absolute v alue. The beam is remo ved at that point with the correct amplitude to produce the residual image for the next iteration. Simultaneously the maximum absolute v alue observed renormalized by the central v alue of the beam is added at the same point i n t he approximation image, initialized to a null image. This procedure assumes that t he original signal is a sum of Dirac spikes. A sparsity or compressibil- ity prior on t he original signal in real space is imp licitly introduced so t hat its energy is concentrated at speciﬁc locations. On the con- trary , the Gaussian noise should be distributed e verywh ere on the image and should not signiﬁcantly affect the selection of points in the iterations. This underlying sp arsity hypoth esis serves a s a re gu- larization of the in ve rse problem. A loop gain factor γ i s generally introduced in the procedure which de ﬁnes the fraction of the beam considered at each itera- tion. V alues γ around a fe w tenths are u sually used which allo w f or a more cautious consideration of the sidelobes of the dirty beam. The overall procedure is greatly enhanced by this simple improve- ment, albeit at high computational cost. In a stati stical sense, the stopping criterion fo r the iteration procedure should be set i n terms of relation (9). Ho we ver , the proce dure is known to be slo w an d the algorithm is often stopped after an arbitrary number of iterations. V arious wei ghting schemes can be applied to the binary mask in Fourier . Natural weighting simply corresponds to r eplace the unit v alues by the inv erse variance of the noise af fecting the correspo nd- ing visibilit y measurement. T his corresponds to a standard matched ﬁltering operation allowin g the maximization of the sign al-to-noise ratio of the dirty image before dec on v olution. So-called uniform and robust weightings can notably be used to correct for the non- uniformity of the Fo urier cov erage associated wi th the measured visibilities and to reduce the sidelobes of the dirty beam in real space. Multi-scale versions of this method were also de v eloped (Cornwell 2008). CLEAN and multi-scale v ersions may actually be formu- lated in terms of the well-kno wn matching pursuit (MP) procedure (Mallat & Zhang 199 3; Mallat 19 98). The co rresponding MP algo- rithm simply uses a circulant dictionary for which the projection on atoms corresponds to the con vo lution with the dirty beam. The loop gain fac tor may also be trivially introduced in this conte xt. Secondly , another appro ach to the reconstruction of images from visibili ty measure ments is MEM. In contrast to CLEAN, MEM solves a glob al optimization problem in which the in verse problem (10) is regularized by the i ntroduction of an entropic prior on the signal (Ables 1974; Gull & Daniell 19 99; Cornwell & Evans 1985; Gull & Skilling 1999). For positiv e signals, the relativ e en- tropy function between a sampled signal ¯ x ∈ R N = { ¯ x i } 1 6 i 6 N and a model z ∈ R N = { z i } 1 6 i 6 N takes the simple form S ( ¯ x, z ) = − N X i ¯ x i ln ¯ x i z i . (13) This function is always negativ e and takes its maximum null valu e when ¯ x = z . In the ab sence of a precise kno wledge of the signal ¯ x , z is set to a vector of constant values. In such a case, maximizing the entrop y prior promotes smoo thness of the recon structed image. The MEM problem i s t he unconstrained optimization problem deﬁned as the minimization of a functiona l co rresponding t o the sum of the relativ e entrop y S and the χ 2 : min ¯ x ′ ∈ R N » 1 2 χ 2 ` ¯ x ′ ; ¯ Φ ri , y ´ − τ S ` ¯ x ′ , z ´ – , (14) for some suitably chosen regularization parameter τ > 0 . In gen- eral, the minimization t hus requires a trade-off between χ 2 mini- mization, and relativ e entrop y maximization. Notice that t he deﬁnition (13) may easily be generalized for non-positi v e signals. A multi-scale version of MEM was also de- ﬁned. It considers that the original imag e may hav e an ef ﬁcient representation in terms of its decomposition in a w av elet b asis. The entropy is then deﬁned directly on the wav elet coef ﬁcients of the signal (Maisinger et al. 1999). For completeness we ﬁnally quote t he WIP E reconstruction procedure which also solv es a global minimization problem, but in which the in verse problem (10 ) is regularized by the introduction of a smoothness prior on the part of the signal whose Fourier sup- port corresponds to the non-pro bed spatial frequencies. This corre- sponds to minimize the χ 2 after assigning a null value to all initially non-observ ed visibilities (Lannes et al. 1994, 1996). In conclusion, CLEAN is a local iterati v e deco n v olution tech- nique, while MEM and W IPE are reconstruction techniques based on global minimization problems. All three approaches are ﬂexi- ble enough to consider v arious bases (Dirac, wa velet, etc.) where a majority of natural signals ca n h av e a sparse or compressible repre- sentation. CL EAN also implicitly assumes the sparsity of the signal in the reconstruction procedure. But none of these methods explic- itly imposes the sparsity o r compressibility p rior on the reconstruc- tion. This precise gap is notably b ridged by th e imag ing techniques deﬁned in the frame work of the compressed sensing theory . 3 COMPRESSED SENSING In this section we deﬁne the ge neral frame work of the theory of compressed sensing and q uote its essential impact beyond the Nyquist-Shannon sampling theorem. W e then describe the re- stricted i sometry property that the sens ing basis n eeds to satisfy so that sparse and compressible signals may be accurately recov- ered through a global optimization problem. W e ﬁnally discuss the idea that incoherence of t he sensing and sparsity or compressibil- ity bases as well as randomness of the measurements are the ke y properties to ensure this restricted isometry . 3.1 Bey ond Nyquist-Shann on In the framework of compressed sensing the signals probed are ﬁrstly assumed to be sparse or compressible in some basis. T ech- nically , we consider a real signal iden tiﬁed by its Ny quist-Shannon c  2009 RAS, MNRAS 000 , 1–10 Compr essed sensing for radio interfer ometry 5 sampling as x ∈ R N = { x i } 1 6 i 6 N . A real basis Ψ ∈ R N × T = { Ψ iw } 1 6 i 6 N ;1 6 w 6 T is deﬁned, which may be either orthogon al, with T = N , or redundant, with T > N (Rauhut et al. 2008). The decomposition α ∈ R T = { α w } 1 6 w 6 T of the signal deﬁned by x = Ψ α, (15) is sparse or compressible in the sense t hat i t only contains a small number K ≪ N of non-zero or signiﬁcant coefﬁcients respec- tiv ely . The signal is then assumed to be probed by m r eal l inear measurements y ∈ R m = { y r } 1 6 r 6 m in some real sensing basis Φ ∈ R m × N = { Φ r i } 1 6 r 6 m ;1 6 i 6 N and possibly af fected by inde- pendent and identically distributed noise n ∈ R m = { n r } 1 6 r 6 m : y = Θ α + n with Θ = ΦΨ ∈ R m × T . (16) This number m of constraints is typically assumed to be smaller than the dimension N of t he vector deﬁning the signal, so that the in v erse problem (16) is ill-posed. In this context, the theory of compressed sensing deﬁnes the explicit r estricted iso metry property (RIP) that the matrix Θ should satisfy in order to allow an accurate recov ery of sparse or compress- ible si gnals (Cand ` es et al. 2 006a,b; Cand ` es 200 6). In that regard, the theory offers multiple ways to design su itable sensing matri- ces Φ from properties o f incoherence with Ψ and randomness of the measurements. It sho ws in particular that a small number of measurements is required relative to a nai ve Nyquist-Shannon sam- pling: m ≪ N . The framewo rk also d eﬁnes a global minimiza- tion p roblem for the signal recovery called Basis Pursuit (BP). This problem r egularizes the originally ill-posed in ve rse problem by an explicit sparsity or compressibility prior on the signal. The corre- sponding solution may be obtained through con v ex optimization. Alternativ e global minimization prob lems may also be designed. 3.2 Restricted isometry and Basis Pursuit Let us primarily recall tha t the ℓ p norm of a real vector u ∈ C Q = { u l } 1 6 l 6 Q is deﬁned for any p ∈ R + as || u || p ≡ ( P Q l =1 | u l | p ) 1 /p , whe re | u l | stands f or the absolute v alue of t he componen t u l . The well-known ℓ 2 norm is to the square-root of the sum of the absolute v alues squared of the vector compon ents. By deﬁnition the matri x Θ sati sﬁes a R IP of order K if there exists a constant δ K < 1 such that (1 − δ K ) || α K || 2 2 6 || Θ α K || 2 2 6 (1 − δ K ) || α K || 2 2 , (17) for all v ectors α K containing at ma ximum K non-zero co ef ﬁcients. The ℓ 1 norm of the vector α ∈ R T = { α w } 1 6 w 6 T is simply deﬁned as th e sum of the absolute v alues of the vector c omponents: || α || 1 ≡ T X w =1 | α w | . (18) From a Bayesian point of view , this ℓ 1 norm may be seen as t he negati v e logarithm of a Laplacian prior distribution on each inde- pendent component of α . F or comparison the square of the ℓ 2 norm may be seen as the n egati v e logarithm of a Gaussian prior distrib u- tion. It i s well-kn o wn that a Laplacian distribution is highly peak ed and bears heavy tails, relative to a Gaussian distri bution. This cor- responds to say that the signal is deﬁned by only a small n umber of signiﬁcant c oef ﬁcients, much smaller than a Gaussian signal w ould be. In other words the representation α of t he signal x in the spar- sity or compressibility basis Ψ is indeed sparse or compressible if it follows such a prior . Fi nding the α ′ that best corresponds t o this prior req uires to maximize its Laplacian probability distribution, or equi v alently to minimize the ℓ 1 norm. Notice that this conclusion also follows from a pure geometrical ar gument in R T (Cand ` es et al. 2006b; Baraniuk 2007). A constrained optimization problem exp licitly regularized by a ℓ 1 sparsity prior can be deﬁned. T his so-called Basis Pursuit de- noise (BP ǫ ) problem is the minimization of the ℓ 1 norm of α ′ under a constraint on the ℓ 2 norm of the residual noise: min α ′ ∈ R T || α ′ || 1 subject to || y − Θ α ′ || 2 6 ǫ. (19) Let us recall tha t the noise w as assumed to b e identically dis- tributed. Consequently , considering Gaussian noise, the ℓ 2 norm term i n the BP ǫ problem is identical to the cond ition (9), for ǫ 2 correspondin g to some suitable percentile of the χ 2 distribution with m deg rees of freedom gov erning the noise le vel estimator . This BP ǫ problem is solved by application of non-linear and iter- ativ e con v ex optimization algorithms (Combettes & Pesquet 2008; v an den Berg & Friedlande r 2008). In the a bsence of n oise, the BP ǫ problem is simply called Basis Pursuit (BP). If the solution of the BP ǫ problem is denoted α ∗ then the co rresponding syn thesis-based signal reconstruction reads, from (15), as x ∗ = Ψ α ∗ . Compressed sensing sh o ws that if t he matrix Θ satisﬁes a RIP of order 2 K with some suitable constant δ 2 K < √ 2 − 1 (Cand ` es 2008), then the solution x ∗ of the BP ǫ problem provides an accurate reconstruction of a signal x t hat is sparse or compressible with K signiﬁcant coefﬁcients. T he reconstruction may be said to be opti- mal in that exactly sparse signals are reco vered e xactly through BP in the absence of noise: x ∗ = x . Moreover strong stability results exist f or compres sible signals in the presen ce of noise. In that case, the ℓ 2 norm of the difference between the representation α of the signal in the sparsity o r compressibility b asis an d its reconstruction α ∗ is bounded by t he sum of two terms. The ﬁrst term is due to the noise and i s proportional to ǫ . The second term is due to the non-e xact sparsity of a compressible signal and is proportional to the ℓ 1 norm of the dif ference between α and the approximation α K deﬁned by retaining only its K largest compone nts and sending all other v alues to zero. In this context, one has || α − α ∗ || 2 6 C 1 ,K ǫ + C 2 ,K || α − α K || 1 √ K , (20) for two known constants C 1 ,K and C 2 ,K depending on δ 2 K . For instance, when δ 2 K = 0 . 2 , we hav e C 1 ,K = 8 . 5 and C 2 ,K = 4 . 2 (Cand ` es et al. 2006b; Cand ` es 2008). In an orthonormal bas is Ψ this relati on represents an explicit bound on the ℓ 2 norm of the differe nce between the signal x itself and its reconstruction x ∗ as || x − x ∗ || 2 = || α − α ∗ || 2 . Moreove r x K = Ψ α K then represents the best sparse approximation of x with K terms, in the sense that || x − x K || 2 is minimum. The constrained B P ǫ problem may also be rephrased i n terms of an unconstrained minimization problem for a functional deﬁned as t he sum of the ℓ 1 norm of α ′ and the ℓ 2 norm of t he residual noise: min α ′ ∈ R T » 1 2 || y − Θ α ′ || 2 2 + τ || α ′ || 1 – , (21) for some suitably chosen regularization parameter τ > 0 . For each value of ǫ , there exists a v alue τ such that the solutions of the constrained and unconstrained ℓ 1 sparsity problems are identi- cal (v an den Berg & Friedlander 2008). F rom a Bayesian point of vie w , this minimization is the n equi v alent to maximum a po steriori (MAP) estimation for a signal with Laplacian prior distri bution in the sparsity or compressibility basis, i n the presence of Gaussian noise. c  2009 RAS, MNRAS 000 , 1–10 6 W iaux et al. Finally , alternati ve m inimization problems may be deﬁned for the recov ery . Firstly , a ℓ p norm with 0 < p 6 1 may for example be substituted for the ℓ 1 norm in the deﬁnition of the minimization problem. From a Bayesian point of view , the ℓ p norm to the po wer p may be seen as the negati ve logarithm of a prior distribution iden- tiﬁed as a generalized Gaussian distribution (GGD). Such distri- butions are ev en more highly peaked and bear heavier tail s than a Laplacian distribution and thus promote str onger compressibility of the signals. The oretical results hold for such ℓ p norm minimization problems wh en a RIP i s satisﬁed (Foucart & Lai 2008). Such p rob- lems are non-con vex but can be solved iteratively by con ve x op- timization algorithms performing re-weighted ℓ 1 norm minimiza- tion ( Cand ` es et al. 2008; Davies & Gribon v al 2008; Foucart & Lai 2008; Chartrand & Y in 2007). Secondly , a TV norm may also be substituted f or the ℓ 1 norm in the deﬁnition of the minimization problem for signals with sparse or compressible gradients. The TV norm of a signal is simply deﬁned as the ℓ 1 norm of the magnitude of its gradient (Rudin et al. 1992 ). A theoretical result of exact re- construction holds for su ch TV norm minimization prob lems in the case of Fourier measurements of signals wi th exactly sparse gradi- ents in the absence of noise (Cand ` es et al. 2006a). But no proof of stability relative to noise and non-ex act sparsity ex ists at the moment. S uch minimization is also accessible through an iterative scheme from con vex optimization algorithms (Cand ` es & R omberg 2005). This ﬂe xibility in the deﬁnition of the o ptimization prob lem is a ﬁ rst important manifestation of the versatility of t he compressed sensing t heory , an d o f the conv e x optimization scheme. It opens the door to the deﬁnition a whole variety of po werful image reconstruc- tion techniques that may take adva ntage of some available speciﬁc prior i nformation on the signal under scrutiny beyon d its generic sparsity or compressibility . 3.3 Incoherence a nd randomness The issue of the design of t he sensing matrix Φ ensuring the RIP for Θ = ΦΨ is of course fundamental. One can actually sho w that incoherence of Φ wit h t he sparsity or compressibility basis Ψ and randomness of the measurements will en sure that the RIP is sat- isﬁed with ov erwhelming probability , provide d that the number of measurements is larg e enough relativ e to the sparsity K considered (Cand ` es et al. 2006b; Cand ` es 2006). In this context, the variety of approaches to design suitable sensing matri ces is a second form of the versatility of the com pressed sensing framework . As a ﬁ rst example, the measurements may be drawn from a Gaussian matrix Φ with purely random real entries, in which case the RIP is satisﬁed if K 6 C m ln( N/m ) , (22) for some constant C . The most recent result provides a valu e C ≃ 0 . 5 , hence sho wing that the r equired redundancy of measurements m/K is very small (Dono ho & T anner 2009). As a second example of interest for radio interferometry , the measurements may arise from a uniform rand om selection of Fourier frequencies. In this case, the precise condition for the RI P depends on the degree of incoherence be tween the Fourier basis and t he sparsity or compressibility basis. If t he unit-normed basis vectors correspondin g to the lines of F and the columns of Ψ are denoted { f e } 1 6 e 6 N and { ψ e ′ } 1 6 e ′ 6 T , the mutual coheren ce µ of the bases may be deﬁned as their maximum scalar product: µ = √ N max e,e ′ |h f e | ψ e ′ i| . (23) The RIP is then satisﬁed if K 6 C ′ m µ 2 ln 4 N , (24) for so me c onstant C ′ . As the incoherence is maximu m between the Fourier and real spaces with µ = 1 , the lowes t number of meas ure- ments would be required for a signal that is sparse in real space. Notice t hat a factor l n N i nstead of l n 4 N in c ondition (24 ) w as not prov en but conjectured, suggesting t hat a lower number of mea- surements would still ensure the RIP . In that regard, empirical re- sults (Lustig et al. 2007) suggest that ratios m/K between 3 and 5 already ensure a reconstruction quality through BP ǫ that is equiv a- lent to the quality ensured by (20). Let us also emphasize that the TV norm minimization is often used fr om Fo urier measu rements of sign als with sp arse or com- pressible gradients. As already stated no stability result such as (20) was prov en for t he reconstruction provided by this minimiza- tion sc heme. Empirical results suggest howe ver that TV norm min- imization provides the same quality of reconstruc tion as BP ǫ for the sa me typ ical ratios m/K between 3 and 5 ( Cand ` es & Romberg 2005; Lustig et al. 2007). 4 APPLICA TIONS In this sec tion, we ﬁrstl y comment on the exact compliance of r adio interferometric measurements with co mpressed sensing. W e then consider simulations of two kinds of signals for reconstruction from visibility measurements: an intensity ﬁ eld of c ompact astrophysical objects and a signal induced by cosmic strings in the temperature ﬁeld of the cosmic micro wa ve backg round (CMB) radiation. Rely- ing on the ve rsatility of the con v ex optimization scheme , enhanced minimization problems are deﬁned in the compressed sensing per- specti ve through the introduction of speciﬁc prior information on the signals. The reconstruction performance is studied in compar- ison both with the standard BP ǫ reconstructions in the absence of speciﬁc priors and with the CLEAN reconstruction . 4.1 Interferometric mea surements and compr essed sensing In the context of compressed sensing, the sensing matri x needs to satisfy the RIP . If Fourier measurements are con sidered, this re- quirement may be reached through a uniform random selection of a low number of Fourier frequencies. In the context of radio in- terferometry , realistic visibil ity distri butions are deterministic, i. e. non-rando m, superpositions of ell iptical distributions in the Fourier plane of the image t o recon struct. Howe ver , the structure of the Fourier sampling is extremely dependent on the speciﬁc conﬁgu- ration of the radio telescope array under consideration. V isi bilities from variou s interferometers may be combined , as well as visibil- ities from the same interferometer with dif ferent pointing direc- tions in the mosaicking technique (Thompson et al. 20 04). From this point of vie w t he realistic visibility distributions themselves are rather ﬂe xible. Moreov er , t he standard uniform weighting of the visibilities may be us ed to pro vide uniformity of the ef fectiv e mea- surement density i n the Fourier plane. Correctly studied realistic distributions might thus not be so far from complying exactly with the compressed sensing r equirements. Finally , it was recently sug- gested that speciﬁc deterministic distr ibutions of a l o w number of c  2009 RAS, MNRAS 000 , 1–10 Compr essed sensing for radio interfer ometry 7 linear measurements might in fact allo w accurate signal reconstruc- tion in the context of comp ressed sensing (Matei & Meyer 2008). Nonetheless, modiﬁcations of radio interferometric measure- ments might be conceiv ed in order to comply exactly with stan- dard compressed sen sing results. T o this end, on e might want to introduce randomness in the visibility distribution. Formally , ran- dom repositioning of the telescopes during observation or random integration times for the deﬁnition of indiv idual visibilities could provid e important adv ances in th at direction. Also notice that co m- pressed sen sing does not require that measurements be ide ntiﬁed to Fourier coefﬁcients of the signal. The versatility of the frame work relativ e to the des ign of suitab le sensing matrices might actually be used to deﬁne generalized radio interferometric measurements, beyo nd standard visibilities, ensuring that the RIP is explicitly sat- isﬁed. In this perspec tiv e, direct modiﬁcations of the acquisition process through a scheme similar to spread spectrum techniques (Naini et al. 2009) or coded aperture techniques ( Marcia & W illett 2008) could also provide important adv ances. In the following applications we simply con sider standard vis- ibility measurements. W e assume generic interferometric conﬁgu- rations characterized by uniform random selections of visibilities. 4.2 Experimental set up W e consider two kinds of astrophysical signals I that are sparse in some basis, and for which speciﬁc prior information is av ailable. For each kind of signal, 30 simulations are considered. Observa- tions of bo th kinds of signals are simulated for ﬁve hypothetical radio interferometers unaffected by instrumental noise, assuming that the co nditions under which relation (3) h olds are sa tisﬁed. The ﬁeld of vie w observe d o n t he celestial sphere by the interferometers is limited by a Gaussian illumination function A with a full width at half maximum (FWHM) of 40 arcminutes of angular opening. The original signals considered are deﬁned as sam pled images with N = 256 × 256 pixels on a total ﬁeld of vie w of 1 . 8 ◦ × 1 . 8 ◦ . The ﬁrst kind of signal consists of a compact object i ntensity ﬁeld in which the astr ophysical objects are represented as a super- position of elongated Gau ssians of v arious scales in some arbitrary intensity units. The important speciﬁc prior information in this case is the p ositivity of the signal. The second kind of signal is of partic- ular interest for cosmology . It consists of temperature steps in µ K induced by topological defects such as cosmic strings in the zero- mean perturbations of the CMB. The string netw ork of interest can be mapped as the magnitude of the gradient of the string signal i t- self. T he essential speciﬁc prior information in this case resides in the fact tha t the statistical distri bution of a string signal m ay be well modelled in wav elet space. One simulation of a compact object in- tensity ﬁeld and th e magnitude of the g radient of one simulation o f a string signal are represented in Figure 1, after multipli cation by the illumination function. As discussed already , we assume uniform random selections of visi bilities. T he ﬁve interferometers considered identiﬁed by an index c with 1 6 c 6 5 only differ by their Fourier coverag e. This cov erage is deﬁned by the m/ 2 r andomly distributed frequencies probed in one half of the Fourier plane, corresponding to m real Fourier coef ﬁcients as: m/ N = 5 c/ 100 . For each conﬁguration, the g eneral in verse problem is the one posed in (5) wit h the sen sing matrix Φ = Φ ri deﬁned i n (6) if one wishes to undo the multi- plication by the illumination function and to recover the origin al signal x . T he in v erse problem (10) applies with the sensing matrix Φ = ¯ Φ ri deﬁned in (11) if one wishes to recov er ¯ x . For each reconstruction, we compare the performance of the Basis Pursuit approaches en hanced by t he inclusion of sp eciﬁc prior signal information in the minimization problem, wi th both the standard BP ǫ or BP performance, and the C LEAN performance. As the signals considered are sparse or co mpressible in some basis, we do not consider an y MEM or WIPE reconstruction, which disre gard the sparsity information. T he performance of the algorithms com- pared is ev aluated t hrough the signal-to-noise ratio (SNR) of the reconstruction for the compact object intensity ﬁeld, and through the SNR of the magnitude of the gradient of the reconstruction for the string si gnal. The SNR of a reconstruc ted signal s relative to an original signal s i s technically deﬁned as SNR ( s, s ) = − 20 log 10 σ ( s − s ) σ ( s ) , (25) where σ ( s − s ) and σ ( s ) stand for the sampled stand ard d e viations of the residual signal s − s and of the original signal s , respecti vely . It is consequen tly measured in decibels (dB). As far as the computation complex ity of the algorithms is concerned, notice that both CLEAN and the variou s Basis Pursuit algorithms considered share t he same scaling with N at each it- eration. This scaling i s driv en by the comp lexity of the F FT , i.e. O ( N log N ) . The number of iterations required by each algorihm is therefore critical in a comparison of computation times. 4.3 Compact object intensity ﬁeld Each simulation of the co mpact ob ject i ntensity ﬁeld consists of 100 Gaussians with random positions and orientations, rand om am- plitudes in the r ange [0 , 1] in the chosen intensity units, and ran- dom but small scales identiﬁed by standard de viations along each basis direction in the range [1 , 4] in number of pixels. Given their structure, such signals are probably optimally modelled by sparse approximations in some wav elet basis. But as the maximum possi- ble i ncoherence wi th Fourier space is reached from real space, we chose the spa rsity or comp ressibility ba sis to be the Dirac basis, i.e. Ψ = I N 1 / 2 × N 1 / 2 . Fo r further simpliﬁcation of the pro blem we consider the inv erse problem (10) with the sensing matrix ¯ Φ ri , for reconstruction of the original signal ¯ x multiplied by the ill umina- tion function. As no noise is considered, a BP problem is co nsidered i n a standard compressed sensing approach. Howe ver , the prior kno wl- edge of the positivity of the signal also allows one to pose an en- hanced BP+ problem as: min ¯ x ′ ∈ R N || ¯ x ′ || 1 subject to y = ¯ Φ ri ¯ x ′ and ¯ x ′ > 0 . (26) Notice tha t no t heoretical recov ery result w as yet pro vided fo r such a problem in t he described framewo rk of compressed sensing. B ut the performance of this approach for the problem considered is as- sessed on the basis of the simulations. T he positivity prior is eas- ily incorporated into a con v ex optimization solver based on prox- imal operator t heory ( Moreau 1962 ). T he Douglas-Rachford split- ting method (Combettes & Pesquet 2008) guarantees that such an additional con ve x constraint is inserted naturally in an efﬁcient it - erativ e pro cedure ﬁnding the global minimum of the BP+ problem. For simplicity , the stopping criterion of the iterati ve process is here set in terms of the number of iterations: 10 4 . The BP + reconstruction of the original signal ¯ x reported in Figure 1 is also represented in the ﬁgure for the conﬁguration c = 2 . For comparison, the dirty image ¯ x ( d ) used in CLE AN and obtained by simple app lication of the adjoint sens ing matrix ¯ Φ † ri to the observed visibilit ies is also represented. The mean SNR and c  2009 RAS, MNRAS 000 , 1–10 8 W iaux et al. 0 0.05 0.1 0.15 0.2 0.25 −0.005 0 0.005 0.01 0.015 0.02 0.025 0 0.05 0.1 0.15 0.2 0.25 5 10 15 20 25 30 0 10 20 30 40 50 Coverage (%) SNR (dB) CLEAN BP BP+ 0.2 0.4 0.6 0.8 1 1.2 1.4 x10 13 1 2 3 4 5 6 7 8 0.2 0.4 0.6 0.8 1 5 10 15 20 25 30 1.5 2 2.5 3 3.5 4 4.5 Coverage (%) SNR (dB) CLEAN BP SBP Figure 1. T op panels: compact object intensity ﬁeld in some arbitrary intensi ty units. The original signal multiplie d by the illuminatio n function ¯ x is reported (left), as well as the di rty image ¯ x ( d ) (cente r left) and the BP+ reconstruc tion of ¯ x (center right), for the inte rferometri c conﬁguration c = 2 . The graph of the mean SNR with 1 σ error bars over 30 simulations is also reported for the CLEAN, BP , and BP+ reconstruct ions of ¯ x as a function of the Fourier covera ge identi fying the interferometric conﬁgurati ons (extr eme right). Bottom panels: string signal in the CMB in µ K. T he m agnitud e of the gradient of the original signal x re-multiplied by th e ill uminati on function is re ported (le ft), as well as the dirty imag e ¯ x ( d ) (cente r left) and the SBP ǫ reconstru ction of x re-multiplied by the illuminati on function (center right), for the interferometri c conﬁgurati on c = 2 . The graph of the mean SNR with 1 σ error bars over 30 simulations is also report ed for the CLEAN rec onstructi on, and for the BP ǫ and SBP ǫ reconstru ctions re-multipli ed by the illumin ation function, as a functio n of the Fourier cov erage identi fying the interfe rometric conﬁgurat ions (extre me right). correspondin g one st andard dev iation ( 1 σ ) error bars ov er the 30 simulations are reported in Figure 1 fo r the CLEAN reconstru ction of ¯ x wi th γ = 0 . 1 , and for the BP and BP+ reconstructions of ¯ x , as a function of the Fourier coverage i dentifying t he interferomet- ric conﬁgurations. A ll obviously c ompare very favo rably relative to the SNR of ¯ x ( d ) , not reported on the graph. One must a ckno wledge the fact that BP and CLEAN provide rel ativ ely similar qualities of reconstruction. Howe v er , the BP reconstruction is actually achie ved much more rapidly than the CLEAN reconstruction, both in terms of number of i terations and computation ti me. T his highlights the fact that the BP approach may in general be comp utationally much less e xpensi v e. The B P+ recon struction e xhibits a signiﬁcantly bet- ter SNR than the BP and CLEAN reconstructions. The main out- come of this analysis thus resides in the f act that t he inclusion of the positi vity prior on the signal signiﬁcantly improv es reconstruction. For completeness, let us mention that it w as sugg ested decades ago that CLEAN can be understood as some approximation o f what we called the BP+ approach (Marsh & Richardson 1987). Notice t hat the sparsity or compressibility basis is orthonor- mal and the error || ¯ x − ¯ x ∗ || 2 in the BP reconstruction ¯ x ∗ of ¯ x is theoretically bounded by ( 20) with ǫ = 0 . Assuming saturation of this bo und, the SNR of the B P recon struction allo ws the estimation of t he maximum sparsity K of the b est sparse approximation ¯ x K of ¯ x . Preliminary analysis from the mean SNR of reconstruc tions o ver the simulations considered suggests that ratios m/K ≃ 5 hold for each of the v alues of m associated with the ﬁv e interferometric con- ﬁgurations probed. This r esult appears to be i n full coherence wi th the accepted empirical ratios quoted abov e (Lustig et al. 2007). 4.4 String signal in the CMB The CMB signal as a whole is a realization of a statistical pro- cess. In our setting, the zero-mean temperature perturbations con- sidered in µ K may be mod elled as a linear superposition of the non-Gaussian string si gnal x made up of steps and of a Gaus- sian component g seen as no ise. The power spectrum of this as- trophysical noise is set by the concordance cosmolog ical model. W e only include here the so-called primary CMB anisotropies (Hammond et al. 2008). The typica l number , width and spatial d is- tribution of long strings or string loops in a gi v en ﬁeld of vie w are also all go ve rned by the concordance cosmo logical model. Our 30 si mulations of the CMB signal are built as a superposi- tion of a unique realisti c string signal simulation borro wed f rom Fraisse et al. (2008) with 30 simulations of the Gaussian correlated noise. The string tension ρ , a dimensionless number related to the mass per unit length of string, is up to some extent a free parameter of the model. This tension sets the overall amplitude of the signal and needs to be ev aluated from observ ations. For t he sake of the present analysis, we only study the string signal for one realistic v alue ρ = 3 . 2 × 10 − 8 , which technically ﬁx es the SNR of the observ ed str ing signal buried in the astrophysical noise. This value is assessed prior to any signal reconstruction, by ﬁtting the po wer spectrum of the data t o the sum of the po wer spectra of the signal and noise on the frequencies probed (Hammond et al. 2008). This estimation may be considered as very precise at the tension of in- terest and i s not t o be considered as a signiﬁcant source of error in the subsequen t reconstruction. In this context, preliminary analysis of 16 independent realis- tic simulations of a string signal, also from Fraisse et al. (2008), allows one to show that the random process f rom which the c  2009 RAS, MNRAS 000 , 1–10 Compr essed sensing for radio interfer ometry 9 string signal arises is well modelled by GGD’ s in wav elet space (Hammond et al. 2 008). W e consider a r edundan t steerable w av elet basis Ψ s with 6 scales j ( 1 6 j 6 6 ) i ncluding lo w pass and high pass axisymmetric ﬁ lters, and four intermediate scales deﬁn- ing steerable wa velets with 6 basis orientations q ( 1 6 q 6 6 ) (Simoncelli & Freeman 1995). By statisti cal isotropy , the GGD pri- ors π j for a wav elet coe fﬁcient α ′ w only depend on the scale: π j ( α w ) ∝ exp » − ˛ ˛ ˛ ˛ α w ρu j ˛ ˛ ˛ ˛ v j – , (27) where w is t o be though t of as a multi-index identifying a coefﬁ- cient at giv en scale j , position i , and orientation q . Assuming i n- dependen ce of the wav elet coefﬁcients, the total prior probability distribution of the signal is simply the product of the probability distributions for each v alue of w , which reads as π ( α ) ∝ ex p −|| α || s , (28) for a “s” norm || α || s ≡ X w ˛ ˛ ˛ ˛ α w ρu j ˛ ˛ ˛ ˛ v j . (29) The expo nent parameters v j are called GGD shape parameters and can be considered as a measure of the co mpressibility of t he und er- lying distribution . V alues close t o 0 yield very peaked probability distributions with hea vy tails relativ e to Gaussian distribu tions, i.e. very compressible distributions. The li st of these v alues at all scales reads as: { v 1 = 0 . 43 , v 2 = 0 . 39 , v 3 = 0 . 47 , v 4 = 0 . 58 , v 5 = 0 . 76 , v 6 = 1 . 86 } . T he signal is thus u nderstood as being well mod- elled by a very compressible e xpan sion in its wav elet representation and we choose the c orresponding redund ant basis as th e sp arsity or compressibility basis for the i n v erse problem: Ψ = Ψ s . The list v alues of the GGD scale parameters u j identifying the v ariances of the distributions at all scales reads as: { u 1 = 8 . 9 × 10 − 3 , u 2 = 2 . 8 × 10 − 3 , u 3 = 2 . 2 × 10 − 2 , u 4 = 0 . 15 , u 5 = 0 . 95 , u 6 = 57 } . In full generality we consider the general in verse problem (5) wi th the sensing matrix Φ ri , for reconstruction of the original signal x non-multiplied by the illumination function. Even in the absence of instrumental noise the measured visi- bilities thus follo w from (16) with a noise term n = Φ ri g , (30) representing values of the Fourier transform of the astrophysical noise g multiplied by the illumination f unction. Discarding the very local correlations in the Fourier plane introduced by the ill umina- tion function, one may consider that the mea surements are inde- pendent and affected by independent Gaussian noise realizations. The correspondin g noise variance σ 2 r on y r with 1 6 r 6 m , is thus identiﬁed from the v alues of the kno wn power spe ctrum of g . A whitening matrix W cmb ∈ R m × m = { ( W cmb ) r r ′ = σ − 1 r δ r r ′ } 1 6 r,r ′ 6 m is introduced on t he measured visibili ties y , so that the correspond ing visibili ties ˜ y = W cmb y are af fected by in- dependen t and identically distri buted noise, as required to pose a BP ǫ problem. This operation corresponds to a matched ﬁltering i n the absence of which any hope of good reconstruction is vain. A BP ǫ problem is thus con sidered after estimation of ρ . Ho we ver , the prior statist ical knowled ge on the sign al also allows one to pose an enhanced Statistical Basis Pursuit denoise (SBP ǫ ) problem. It is deﬁned as t he minimization of the negati v e logarithm of the spe- ciﬁc prior on the signal, i.e. the s norm of the vector of i ts wave let coef ﬁcients, under the measurement constraint: min α ′ ∈ R T || α ′ || s subject to || ˜ y − W cmb Φ ri Ψ s α ′ || 2 6 ǫ. (31) Notice that the s norm is similar but still more general than a sin- gle ℓ p norm and n o theoretical recov ery result wa s yet pro vided for such a problem in the framewo rk of compressed sen sing. Again, the performance of this approach for the problem considered is as- sessed on the basis of the simulations. Most shape parameters v j are smaller than 1 , which implies that t he norm deﬁned is not con vex. W e thus recon struct the signal t hrough the re-weighted ℓ 1 norm minimization described abov e (Cand ` es et al. 2008). In this r egard, we use the SPGL1 toolbox (v an den Berg & F riedlander 2008) 4 . The value of ǫ 2 in t he BP ǫ and SBP ǫ problems i s taken to be around the 99 th percentile of t he χ 2 with m deg rees of freedom gove rn- ing the noise le vel estimator . This v alue also serv es as the stopping criterion for the CLEAN reconstruction. The magnitude of the gradient of the SBP ǫ reconstruction of the original signal x reported in Figure 1 is also represented i n the ﬁgure for the conﬁguration c = 2 , after re-multiplication by the illumination function which sets the ﬁeld of view of interest. For comparison, the magnitude of the gradient of the dirt y i mage ¯ x ( d ) used in CLEAN and obtained by simple application of the adjoint sensing matrix ¯ Φ † ri to the observe d visibilities is also represented. The mean SNR and corresponding one stand ard de viation ( 1 σ ) error bars ov er the 30 simulations are reported in Figure 1 for t he CLEAN reconstruction with γ = 0 . 1 , and for the BP ǫ and SBP ǫ reconstructions re-multiplied b y the il lumination fun ction, as a function of the Fourier coverage identifying the interferometric conﬁgurations. All ob viously compare v ery fa vo rably relativ e to the S NR of ¯ x ( d ) , not reported on the graph. One must still ackno wl- edge the fact that BP ǫ and CLEAN provide relativ ely similar qual- ities of reconstruction. The BP ǫ reconstruction is ach iev ed much more rapidly than the CL EAN reconstruc tion, highlighting the fact that the BP ǫ approach may in general be computationally much less expen siv e. The S BP ǫ reconstruction exh ibits a si gniﬁcantly better SNR than the BP and CLEAN reconstruction s. Let us ackno wl edge the fact that the re-weighted ℓ 1 norm min- imization of the SBP ǫ approach proceeds by successiv e iterations of ℓ 1 norm minimization. T his unavoidab ly signiﬁcantly increases the comp utation time for reconstruction relati ve t o the sing le ℓ 1 norm minimization of the BP ǫ approach. Relying on the idea that the c oef ﬁcients of the lo w pass ﬁlter do not sign iﬁcantly participate to t he identiﬁcation of the string network itself, our implementa- tion of SBP ǫ does not p erform any re-weighting a t the scale j = 6 , where v 6 = 1 was thus assumed. T his restriction allo ws one to keep SBP ǫ computation times similar to those of CLEAN. Let us notice howe v er that an e ven better SNR is obtained by correct re- weighting at j = 6 , albeit at the cost of a prohibitiv e increase in computation time. The main o utcome of the analysis is t wofold. Firstly , the pres- ence of a whitening operation is essential when correlated noise is considered. Secondly , the inclusion of the prior statisti cal kno wl- edge on the signal also signiﬁcantly improv es reconstruction. 5 CONCLUSION Compressed sensing offers a n e w frame work for image reconstruc- tion in r adio interferometry . In this context, the in verse problem for image reconstruction from i ncomplete and noisy Fourier measure- ments is reg ularized by the deﬁnition of global minimization prob- lems in which a generic sparsity or compressibility prior i s explic- 4 http:/ /www .cs.ubc.ca/labs/scl /spgl1/ c  2009 RAS, MNRAS 000 , 1–10 10 W iaux et al. itly imposed. These problems are solved through con vex optimiza- tion. The versatility of this schem e also allo ws inclusion of spe ciﬁc prior information on the signal under scrutiny in t he minimization problems. W e studied reconstruction performances on simulations of an intensity ﬁeld of compact astrophysical objects and of a sig- nal induced by cosmic stri ngs in the CMB temperature ﬁ eld, ob- served with very generic interferometric conﬁgurations. T he BP ǫ technique provides similar r econstruction performances as the stan- dard matchin g pursuit algorithm CLEAN. The inclusion of speciﬁc prior information signiﬁcantly improves t he quality of reconstruc- tion. Further work by the authors along these lines i s in preparation. In parti cular , a more complete analysis i s being performed to esti- mate the lowest st ring tension do wn to which compressed sensing imaging techniques can r econstruct a string signal in the CMB, in more realistic noise and Fourier cov erage conditions. In this case, gi ven the compressibility of t he magnitude of the gradient of the string signal itself, TV norm minimization also rep resents an inter- esting alternativ e to the SBP ǫ problem proposed here. A CKNO WLEDGMENTS The authors wish to thank A. A. Fraisse, C. Ringev al, D. N. Spergel, and F . R. Bouchet for kindly prov iding string signal simulations, as well as M. J. Fadili for discussions on optimization by proximal methods. The authors also thank the revie wer T . J. Cornwell for his v aluable comments. The work of Y . W . was funded by the Swi ss National Science Foun dation (SNF) under contract No. 200020- 113353 . Y . W . and L. J. are P ostdoctoral Researchers of the Belgian National Science Founda tion (F .R. S.-FNRS ). References Ables, J. G., 1974, A&AS, 15, 383 Baraniuk R., 2007, IEEE Signal Proc. Magazine, 24, 118 Bobin J., S tarck J. -L., Ottensamer R., 2008, IEEE Sel. T op. Signal Proc., 2, 718 Blythe J. H., 1957, MNRAS, 117, 644 Cand ` es E. J., Romberg J., 2 005, preprint (http://www .dsp.ece.rice.edu/cs/, January 2005) Cand ` es E . J., Romber g J., T ao T ., 2006 a, IEE E T rans. Inform. Theory , 52, 489 Cand ` es E. J. , Romberg J., T ao T ., 2006b, Comm. P ure and Appl. Math., 59, 1207 Cand ` es E . J., 2006, Proc. Int. Congress Math. V ol. 3. Euro. Math. Soc., p. 1433 Cand ` es E. J., W akin M., Boyd S., 2008, preprint (arXiv:0 711.1612v1 [stat.ME]) Cand ` es E. J., 2008, Compte Rendus de l’Academie des S ciences, Paris, Series I, 346 , 589 Carilli C., Rawlings S., ed s, 200 4, Ne w Astron. Re v ., V ol. 48, Sci- ence with the Square Kilometre Array . Elsevier , Oxford Chartrand R., Y in W ., 2007, preprint (http://www .dsp.ece.rice.edu/cs/, 2007) Combettes P . L., Pesquet J. C., 2007, IEEE S el. T op. S ignal Proc., 1, 564 Cornwell T . J., Evans K. F ., 19 85, A&A, 143, 77 Cornwell T . J., Golap K., Bhatna gar S., 2008, IEEE Sel. T op. Sig- nal Proc., 2, 647 Cornwell T . J., 2008, preprint (arXiv:0806 .2228v1 [astro-ph]) Davies M . E., Gribo n v al R., 200 8, preprint (http://www .dsp.ece.rice.edu/cs/, July 2008) Donoho D. L., 2006, IEEE T rans. Inform. Theory , 52, 1289 Donoho D. L., T anner J., 2009, Journal of the AMS, 22, 1 Foucart S. , Lai M.-J., 200 8, preprint (http://www .dsp.ece.rice.edu/cs/, July 2008) Fraisse A. A., Ringev al C., Spergel D. N., B ouchet F . R. , 2008, Phys. Re v . D, 78, 043535 Gull S. F ., Daniell G. J., 1978, Nat, 272, 686 Gull S. F ., Skilling J., 1999. Quantiﬁed M aximum Entrop y , Mem- Sys5 Users’ Manua l. Maximum E ntropy Data Consu ltants Ltd. Hammond D. K., Wiaux Y ., V andergh eynst P ., 2008, preprint (arXiv:0 811.1267v1 [astro-ph]) H ¨ ogbom J. A., 1974, A&AS, 15, 417 Lannes A., Anterri eu E. , Bouyoucef K., 1994, J. Mo d. Optics., 41, 1537 Lannes A., Anterri eu E. , Bouyoucef K., 1996, J. Mo d. Optics., 43, 105 Lev anda R., Leshem A., 2008, Proc. 25th Con v . IEEE I srael. IEEE Signal Proc. Soc., p. 716 Lustig M., Donoho D., Pauly J. M., 2007, Mag. Res. Medicine, 58, 1182 Maisinger K., Hobson M. P ., Lasenb y A. N., 2004 , MNRAS, 347, 339 Mallat S. G., Zhang Z., 1993, IEEE T rans. Signal Proc., 41, 3397 Mallat S . G., 1998, A wave let tour of signal processing. Aca demic Press, San Dieg o Marcia R. F ., W illett R. M. , 2008, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Proc.. IE EE Signal Proc. Soc., p. 833 Marsh K. A., Richardson J. M., 1987, A&A, 182, 174 Mary D., Michel O. J. J., 2007, Proc. GRETSI Coll., p. 733 Matei B., Mey er Y ., preprint (http://www .dsp.ece.rice.edu /cs/, 2008) McEwen J. D., Scaife A. M. M., 2008, MNRAS, 389, 1163 Moreau J. J., 1962, A&A, 255, 2897 Naini F . M., Gribon val R ., Jacques L., V ande rghe ynst P ., 2009, Proc. IEEE Int. Conf. on Acoustics, Speec h and Signal Proc.. IEEE Signal Proc. Soc., in press Potts D., S teidl G., T asche M., 2001, in Benedetto J. J. , Ferreira P . J. S. G ., eds, M odern Sampling Theory: Mathematics and Ap - plications. Birkhuser , Boston, p. 249 Rauhut H., Schnass K., V ander ghe ynst P ., 2008, IEE E T rans. In- form. Theory , 54, 2210 Rudin L. I., Osher S., Fatemi E., 1992 , Physica D, 60, 259 Ryle M., V onber g D. D., 1946, Nat, 158, 339 Ryle M., Hewish A., Shakeshaft J. R., 1 959, IRE T rans. Antennas Propag., 7, S120 Ryle M., He wish A., 1960, MNRAS, 120, 220 Schwarz U. J., 1978, A&A, 65, 345 Simoncelli E. P ., Freeman W . T ., 1995 , Proc. IEEE Int. Conf. Sig- nal Proc. V ol. III. IEEE Signal Proc. Soc., p. 444 v an den Berg E., Friedlander M. P ., 2008, SIAM J. Sci. Comput., 31, 890 Thompson A. R., Moran J. M., Swenson Jr . G. W ., 2 004, Interfer- ometry and Synthesis in Radio Astronomy . WILEY -VCH V erlag GmbH & Co. KGaA, W einheim c  2009 RAS, MNRAS 000 , 1–10

Compressed sensing imaging techniques for radio interferometry

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment