Non-convex optimization for 3D point source localization using a rotating point spread function

Non-convex optimization fo r 3D p oint source lo calization using a rotating p oint sp read function ∗ Chao W ang † , Ra ymond Chan † , Mila Nik olova ‡ , Rob ert Plemmons § , and Sudhak ar Prasad ¶ Abstract. W e consider the high-resolution imaging problem of 3D point source image reco very from 2D data using a metho d based on point spread function (PSF) engineering. The metho d in volv es a new tec hnique, recen tly prop osed by S. Prasad, based on the use of a rotating PSF with a single lob e to obtain depth from defo cus. The amount of rotation of the PSF enco des the depth p osition of the p oin t source. Applications include high-resolution single molecule lo calization microscopy as well as the problem addressed in this pap er on lo calization of space debris using a space-based telescop e. The lo calization problem is discretized on a cubical lattice where the coordinates of nonzero entries represen t the 3D lo cations and the v alues of these entries the ﬂuxes of the p oin t sources. Finding the lo cations and ﬂuxes of the p oin t sources is a large-scale sparse 3D inv erse problem. A new non- con vex regularization method with a data-ﬁtting term based on Kullback-Leibler (KL) divergence is prop osed for 3D localization for the Poisson noise mo del. In addition, w e propose a new scheme of estimation of the source ﬂuxes from the KL data-ﬁtting term. Numerical exp eriments illustrate the eﬃciency and stability of the algorithms that are trained on a random subset of image data before b eing applied to other images. Our 3D lo calization algorithms can b e readily applied to other kinds of depth-enco ding PSFs as well. Key w o rds. Non-conv ex optimization algorithms, 3D lo calization, space debris, p oin t spread function, image rotation, image processing AMS subject classiﬁcations. 65K10, 65F22, 90C26 1. Intro duction. Imaging and lo calizing p oint sources with high accuracy in a 3D v olume is an imp ortan t and c hallenging task. An example is sup er-resolution 3D single-molecule lo calization [ 3 , 40 ], which is an area of in tense in terest in biology (cell imaging, folding, mem brane b ehavior, etc.), in chemistry (sp ectral diﬀusion, molecular distortions, etc.), and in ph ysics (structures of materials, quan tum optics, etc.). A diﬀerent problem of in terest in ∗ Dedicated to the memory of Professor Mila Nik olova who passed aw a y on June 20th, 2018. Submitted to the edito rs on April 1st, 2018. Revision submitted on Septemb er 27th, 2018. F unding: The research of the ﬁrst, fourth and ﬁfth author were supported by the US Air Fo rce Oﬃce of Scientiﬁc Resea rch under grant F A9550-15-1-0286. The wo rk of the second autho r was supp o rted b y HKRGC Grants No. CUHK14306316, HKRGC CRF Grant C1007-15G, HKRGC AoE Grant AoE/M-05/12, CUHK DA G No. 4053211, and CUHK FIS Grant No. 1907303. The w ork of the third author was partially funded b y the F rench Research Agency (ANR) under grant No ANR-14-CE27-001 (MIRIAM) and by the Isaac Newton Institute for Mathematical Sciences fo r supp o rt and hospitality during the programme Va riational Methods and Eﬀective Algorithms for Imaging and Vision, EPSRC grant no EP/K032208/1. The wo rk of the fourth autho r was also supp orted b y HKRGC Grants No. CUHK14306316. † Depa rtment of Mathematics, The Chinese Universit y of Hong Kong, Shatin, Hong Kong ( chao wang.hk@gmail.com and rchan@math.cuhk.edu.hk ). ‡ CMLA, CNRS, ENS Cachan, Universit´ e Pa ris-Saclay , 94235 Cachan Cedex, France (deceased). § Depa rtments of Computer Science and Mathematics, Wak e Fo rest University , Winston-Salem, NC 27109, USA ( plemmons@wfu.edu ). ¶ Depa rtment of Physics and Astronomy , The University of New Mexico, Albuquerque, NM 87131, USA ( sp rasad@unm.edu ). 1 2 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD the space-surveillance comm unit y is that of lo calization of space debris using a space-based telescop e. Since the optical wa v elength is muc h shorter than the radio wa v elength, optical detection and lo calization is exp ected to attain far greater precision than the more commonly emplo yed radar systems. Ho wev er, the shorter ﬁeld depth of optical imaging systems may limit their p erformance to a shorter range of distance s. An integrated system consisting of a radar system for p erforming radio detection, localization, and ranging of space debris at larger distances, whic h cues in an optical system when debris reac h shorter distances, may ultimately pro vide optimal p erformance for detecting and trac king debris at distances ranging from tens of kilometers down to hundreds of meters. A stand-alone optical system based on the use of a light-sheet illumination and scattering concept [ 8 ] for sp otting debris within meters of a spacecraft has b een prop osed. A second system can lo calize all three coordinates of an unresolv ed, scattering debris [ 11 , 41 ] b y utilizing either parallex b et ween t w o observ atories or a pulsed laser ranging system or a hybrid system. F or parallex, tw o observ atories receive debris scattered optical signal simultaneously . F or the pulsed laser, the ranging system is coupled to a single imaging observ atory . The h ybrid system utilizes both approac hes in which the laser pulse transmitted from one of the t w o observ atories is received at time-gated single-photon detectors with goo d parallax information at both the observ atories. How ever, to the b est of our knowledge there is no other prop osal for a full 3D debris localization and trac king optical or optical-radar system working in the range of tens to hundreds of meters. Prasad [ 32 ] has prop osed the use of an optical imager that exploits oﬀ-cen ter image rotation to encode in a single image snapshot both the range z and transverse ( x, y ) co ordinates of a swarm of unresolv ed sources such as small, sub-centimeter class space debris, which when actively illuminated can scatter a fraction of laser irradiance back in to the imaging sensor. Here w e develop a promising non-con v ex optimization algorithm and compare its p er- formance with other recent algorithms that can reconstruct the 3D p ositions and ﬂuxes of a random collection of many p oin t sources within the fo cal depth of a rotating-PSF-based imaging system from a single image dataset. W e shall assume the data to be corrupted only b y signal-dep endent shot noise that is well describ ed by a Poisson statistical mo del, as would b e roughly characteristic of signal acquisition b y an EM-CCD sensor in the photon counting (PC) mo de [ 7 ]. 1.1. Previous related w ork. The area of 3D ob ject lo calization and imaging is getting increased attention in recent y ears. One metho d is to scan the 2D slices of information at diﬀeren t depths and then reconstruct the 3D image [ 30 ]. Owing to the ineﬃciency and other limitations of standard PSFs, sev eral optical mo diﬁcations [ 3 , 40 ] hav e b een developed for these problems. There are three typical classes of metho ds: multifocus metho d, interferomet- ric detection and p oin t spread function (PSF) engineering. The multifocus metho d [ 14 , 31 ] uses more than one fo cal plane sim ultaneously . The interferometric metho d [ 36 ] extracts the lo cations of p oin t sources from the interference fringes of a source signal coherently propagat- ing in tw o opp osed paths in the imaging instrument. PSF engineering [ 17 , 28 , 29 , 33 , 34 , 35 ], whic h enco des the depth co ordinate of p oin t sources in a 3D scene in to a single 2D snapshot, is based on c ho osing a phase pattern that makes the defo cused image of a p oint source depth- dep enden t without blurring it excessively . F or example, this can b e achiev ed b y inserting a NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 3 cylindrical lens into an optical system to get approximately elliptical Gaussian, astigmatic images whose axial orientation and size con tain unique information ab out the source lo ca- tions relative to the plane of b est fo cus [ 13 ]. A more general wa y to create the requisite phase ab errations is to use phase masks. There are man y diﬀeren t kinds of phase masks, which giv e rise to the double-helix PSF [ 28 , 29 ], the corkscrew PSF [ 17 ], the T etrap o d PSFs [ 34 , 35 ] and the single-lob e PSF [ 15 , 33 ]. Since the amount of rotation of the PSF enco des the depth information for the double-helix PSF, the corkscrew and the single-lob e PSF, w e call them rotating PSFs. T etrap o d PSFs [ 34 , 35 ] represent another approach that is rather analogous to astigmatic imaging in that the source depth is enco ded in the shap e of the PSF. In this pap er, w e will only fo cus on Prasad’s single-lob e rotating PSF [ 33 ]. Sev eral al- gorithms ha ve b een developed for 3D p oint source lo calization, for example in the area of single molecule localization in microscop y 3D DA OSTORM [ 1 ] and 3D F ALCON [ 19 ]. Ho w- ev er these tw o m ethods cannot be applied for localization using rotating PSFs, since they are based on ﬁtting Gaussian functions, whic h w orks w ell only for astigmatism-based PSFs. Easy-DHPSF [ 18 ] is a p opular approach for solving the double-helix PSF problem but it re- quires no o v erlap for PSF images whic h is often impossible to satisfy in realit y . In [ 37 ], the authors prop osed a generalized metho d for v arious rotating PSFs by using 3D deconv olution with a regularization metho d. The optimization mo del is for an additive Poisson noise mo del that is only approximately correct under highly limited brightness conditions. Their additive P oisson noise mo del is formed by adding a noise simulated from the Poisson distribution to the image, making their noise mo del unrealistically data-indep endent. Owing to sparsity , a h ybrid algorithm with matc hing pursuit and conv ex optimization w as describ ed in [ 2 ], but no theoretical analysis w as presen ted. It is therefore imp ortant to consider an algorithm for the high-density case that makes no assumptions ab out source brightness. With this aim, we prop ose an optimization model for single-lob e rotating PSF [ 33 ] based 3D localization of p oin t sources. W e emphasize, ho wev er, that our mo del is broadly applicable to a v ariet y of PSFs. 1.2. Problem development. In image formation mo deling, the p oin t spread function (PSF) is the imaging system’s resp onse to light from a p oint source. Our purp ose is to extend the w ork of Prasad on PSF engineering [ 33 ], whic h proposed the use of depth-dep enden t image rotation, by developing a non-conv ex optimization algorithm for 3D p oint source lo calization using such an imaging system. See Section 2 for details. By imp osing spiral phase retaration with a phase winding num ber that changes in regular in teger steps from one ann ular zone to the next of an ap erture-based phase mask, one can create an image of a p oin t source that has an appro ximate rotational shap e in v ariance, pro- vided the zone radii hav e a square ro ot dep endence on their indices. Sp eciﬁcally , when the distance of the source from the ap erture of suc h an imaging system changes, the oﬀ-center, shap e-preserving PSF merely rotates by an amoun t roughly proportional to the source mis- fo cus from the plane of best fo cus. The follo wing general mo del based on the rotating PSF image describes the spatial distribution of image brightness for M p oint sources. The observ ed 2D image without noise is (1.1) I 0 ( x, y ) = M X i =1 H i ( x − x i , y − y i ) f i + b, 4 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD where b is the uniform background and H i ( x − x i , y − y i ) is the rotating PSF for the i -th p oint source of ﬂux f i and 3D position coordinates ( x i , y i , z i ) with the depth information co ded in eac h H i , and ( x, y ) is the p osition in image plane. Here, w e build a forw ard model for the problem based in part on the approach dev elop ed in [ 37 ]. In order to estimate the 3D lo cations of the p oin t sources, we discretize the distribution of p oint sources on a lattice X ∈ R m × n × d . The indices of the nonzero entries of X are the 3-dimensional lo cations of the p oin t sources and the v alues at these en tries correspond to the ﬂuxes, i.e., the energy emitted b y the illuminated point source. The 2D observ ed image G ∈ R m × n can b e approximated as G ≈ P ( T ( A ∗ X ) + b ) , where b is bac kground signal and A ∗ X is the conv olution of X with the 3D PSF A . This 3D PSF is a cub e which is constructed by a sequence of image s with resp ect to diﬀerent depths of the p oin ts. Eac h slice is the image corresp onding to a p oin t source at the origin in the ( x, y ) plane and at depth z . Here T is an operator for extracting the last slice of the cube A ∗ X since the observed information is a snapshot, and P is the Poisson noise op erator. Figure 1.1 sho ws the forward mo del for a sp eciﬁc PSF. In order to recov er X , w e need to solve a large-scale sparse 3D inv erse problem given as follows: min X kX k 0 s.t. D ( X ) <  (1.2) where kX k 0 is the counting function whic h gives the n um b er of nonzero entries in X and is also called the ` 0 pseudo-norm. Here  provides the noise lev el and D is a certain data-ﬁtting term based on the noise mo del. This problem is NP-hard [ 21 ] and there are a num b er of algorithms and metho ds [ 9 ] b eing dev elop ed to ﬁnd sparse representation or a go o d approximate solu- tion. Examples of such algorithms include greedy algorithms lik e orthogonal matc hing pursuit (OMP) [ 27 ], iterative hard thresholding algorithm (IHT) [ 4 ], which requires the knowledge of the exact num ber of sparse en tries in the ground truth, and con vex/non-con vex relaxation metho ds [ 6 , 22 , 23 , 24 , 38 , 42 ] where the ` 0 term is appro ximated and ( 1.2 ) is replaced b y a regularization mo del. Here, we develop a con v ex/non-con v ex relaxation approach for solving ( 1.2 ) for the Poisson noise mo del. In comparison to state-of-the-art metho ds, our approach is more eﬃcient ev en for high density cases. W e do not need to know the exact num ber of the ground truth p oint sources and, furthermore, we hav e observed less o verﬁtting than other metho ds. That is to say that our estimated result recov ers fewer false p oint sources. 1.3. Outline of the pap er. The rest of the pap er is organized as follows. In Section 2 , we describe the ph ysics mo del for Prasad’s single lob e rotating PSF. In Section 3 , we prop ose a non-conv ex optimization mo del to solv e the p oin t source localization problem. In Section 4 , our non-con vex optimization algorithm is developed, with a p ost-pro cessing step for eliminating clustered false p ositive p oin t sources. In addition, a new iterativ e scheme for estimating the ﬂux v alues is also prop osed in this section. Numerical experiments, including comparisons with other optimization mo dels, are discussed in Section 5 . Some concluding remarks are made in Section 6 . NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 5 Figure 1.1. Il lustration of the discr etize d forwar d model for the single lob e PSF. 2. Physics mo del fo r single lob e rotating PSFs . Here, we consider a new technique, recen tly patented by Prasad in [ 15 , 33 ], for applying a rotating PSF with a single lob e to obtain depth from defo cus. Comparing to the traditional double-helix rotating PSF [ 29 ], this single lob e PSF uses photons from targets more eﬃcien tly and yields images that are less cluttered and confounding in high-density swarm scenarios than other tec hniques. The amoun t of rotation of Prasad’s PSF encodes the depth position of the p oin t source. Denote A ζ as the PSF matrix for a p oint source with ﬂux f = 1, transverse lo cation r 0 = ( x 0 , y 0 ) = 0 , and defo cus parameter ζ . Note that ζ is prop ortional to the ob ject-distance δ z from the in-fo cus ob ject plane, according to the relation (2.1) ζ = − π δ z R 2 λl 0 ( l 0 + δ z ) , where l 0 is the distance b et ween the lens and the plane of Gaussian fo cus and R is the radius of the pupil. Here the imaging wa v elength is denoted by λ . The tw o-dimensional image data matrix G ∈ R m × n with Poisson noise is approximated as: (2.2) G ≈ P M X i =1 ( A ζ i ∗ δ ( x i , y i )) f i + b ! , where b is the spatially uniform background, ∗ is the conv olution op eration, and δ ( x i , y i ) is the transverse lo cation matrix for the i th source, with elemen ts, ( δ ( x i , y i )) uv = ( 1 , ( u, v ) = ( x i , y i ) , 0 , otherwise . 6 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD The center of rotation of the image in the transverse plane, denoted b y r I = ( x, y ), is related to the source transverse lo cation, r 0 , in the ob ject plane via linear magniﬁcation, r I = − z I r 0 l 0 + δ z , where z I is the distance b etw een the image plane and the lens. According to the F ourier optics mo del [ 10 ], the incoheren t PSF for a clear ap erture con- taining a phase mask with optical phase retardation, ψ ( s ), is given by (2.3) A ζ ( s ) = 1 π     Z P ( u )exp  ι (2 π u · s + ζ u 2 − ψ ( u ))  d u     2 , where ι = √ − 1, P ( u ) denotes the indicator function for the pupil of unit radius, and s = ( s, φ s ) is a scaled v ersion of the image-plane position vector, r , namely s = r λz I /R . Here r is measured from the center of the geometric image p oint lo cated at r I . The pupil-plane position vector ρ is normalized by the pupil radius, u = ρ R . F or the single-lob e rotating PSF, ψ ( u ) is chosen to b e the spiral phase proﬁle deﬁned as ψ ( u ) = lφ u , for r l − 1 L ≤ u ≤ r l L , l = 1 , · · · , L, in whic h L is the n umber of concen tric ann ular zones in the phase mask. W e ev aluate ( 2.3 ) b y using the fast F ourier transform. Remark: F or the discretized forward mo del (see Figure 1.1 ), w e get the dictionary A by sampling depths at regular interv als in the range, ζ i ∈ [ − π L, π L ], ov er whic h the PSF p erforms one complete rotation ab out the geometric image center b efore it b egins to break apart. The i -th slice of dictionary is denoted as A ζ i . 3. Non-convex optimization mo del . In general, for inv erse problems the ob jective func- tion includes a data-ﬁtting term and a regularization term. The regularization term, represent- ing prior information, is based on the desired prop erties of the solution, while the data-ﬁtting term measures the discrepancy betw een the estimated and the observ ed images according to the noise mo del. Here we consider the Poisson noise mo del for whic h the data ﬁtting term is the I -div ergence. The mo del with a regularization term, µ R ( X ) is min X ≥ 0 D K L ( T ( A ∗ X ) + b 1 , G ) + µ R ( X ) , where the uniform background is denoted as b . Here w e also use it to denote the matrix b 1, 1 ∈ R m × n , i.e., b 1 is the m × n matrix with all en tries b . The data-ﬁdelity term is the negativ e log-lik elihoo d of the data. F or P oisson noise, this term is the I -div ergence whic h is also known as Kullback-Leibler (KL) divergence (see [ 16 ]) D K L ( z , g ) = h g , ln g z i + h 1 , z − g i . NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 7 This is a large-scale sparse 3D inv erse problem. F or the regularization term, w e consider a non-conv ex function (see [ 22 , 23 , 24 , 42 ]), using sp eciﬁcally R ( X ) := m,n,d X i,j,k =1 θ ( X ij k ) = m,n,d X i,j,k =1 |X ij k | a + |X ij k | , where a is ﬁxed and determines the degree of non-conv exit y , see Figure 3.1 (a). Therefore, the minimization problem amounts to (3.1) min X ≥ 0    h 1 , T ( A ∗ X ) − G ln( T ( A ∗ X ) + b 1) i + µ m,n,d X i,j,k =1 |X ij k | a + |X ij k |    . Here, θ ( t ) = lim  → 1 θ  ( t ) , where θ  ( t ) = | t | a +  | t | . Since θ  ( t ) represen ts the ` 1 -norm when  = 0, w e see the pro cess of increasing non-con vexit y as  increases from 0 to 1; see Figure 3.1 (b). Moreo ver, w e easily get the deriv ative of θ ( t ): θ 0 ( t ) = a sign( t ) ( a + | t | ) 2 , when t 6 = 0 , and θ 0 (0 + ) = 1 a , where θ 0 (0 + ) is the right deriv ativ e at zero. -200 -150 -100 -50 0 50 100 150 200 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 a = 1 a = 10 a = 50 a = 100 (a) θ ( t ; a ) -200 -150 -100 -50 0 50 100 150 200 0 0.5 1 1.5 2 2.5 ǫ = 0 ǫ = 0.3 ǫ = 0.6 ǫ = 1 (b) θ  ( t ; a ) Figure 3.1. Plot of θ ( t ; a ) and θ  ( t ; a ) . In (a), we cho ose diﬀer ent values of a ; in (b) we cho ose diﬀer ent values of  while ﬁxing a = 80 . 4. Algo rithm development. The minimization problem ( 3.1 ) inv olv es non-conv ex opti- mization. Here we use an iterative reweigh ted ` 1 algorithm (IRL1) [ 26 ] to solve this opti- mization problem. It is a ma jorization-minimization method whic h solv es a series of con v ex optimization problem with a weigh ted- ` 1 regularization term. It considers the problem (see Algorithm 3, in [ 26 ]) min x ∈ X F ( x ) := F 1 ( x ) + F 2 ( G ( x )) , 8 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD where X is the constraint set. F is a lo w er semicon tinuous (lsc) function, extended, real- v alued, prop er, while F 1 is prop er, low er-semicon tinous, and con vex and F 2 is co ordinatewise nondecreasing, i.e. F 2 ( x ) ≤ F 2 ( x + te i ) with x, x + te i ∈ G ( X ) and t > 0 , where e i is the i -th canonical basis unit vector. The function F 2 is concav e on G ( X ). The IRL1 iterative scheme [ 26 , Algorithm 3] is    w l = ∂ F 2 ( y ) , y = G ( x l ) , x l +1 = argmin x ∈ X  F 1 ( x ) + h w l , G ( x ) i  , where ∂ stands for sub diﬀeren tial. F or our problem ( 3.1 ), w e can c ho ose: F 1 ( X ) = h 1 , T ( A ∗ X ) − G log( T ( A ∗ X ) + b 1) i ; F 2 ( X ) = µ m,n,d X i,j,k =1 X ij k a + X ij k ; G ( X ) = |X | ; X = {X | X ij k ≥ 0 for all i, j, k } . Therefore, w e compute the partial deriv ativ e of w l and get w l ij k = aµ ( a + ˆ X l ij k ) 2 , ∀ i, j, k . Here w l 6 = 0 is ﬁnite, since a, µ 6 = 0, and all X ij k ≥ 0 o wning to the constraint X . According to [ 12 , 24 ], these terms satisfy the requirements of the algorithm. Algorithm 4.1 giv es the IRL1 for solving ( 3.1 ). Algorithm 4.1 Iterativ e reweigh ted ` 1 algorithm (IRL1) for the rotating PSF problem Input: X 0 ∈ R m × n × d and G ∈ R m × n . Set a and µ . Output: The solution X ∗ whic h is the minimizer in the last outer iteration. 1: rep eat 2: Compute w l ij k = aµ ( a + ˆ X l ij k ) 2 , ∀ i, j, k ; 3: Giv en G , w l ij k , obtain ˆ X l b y solving (4.1) ˆ X l = argmin X ≥ 0    h 1 , T ( A ∗ X ) − G log ( T ( A ∗ X ) + b 1) i + m,n,d X i,j,k =1 w l ij k |X ij k |    . 4: un til con vergence 4.1. Subp roblem of IRL1. In Algorithm 4.1 , w e need to solv e the subproblem ( 4.1 ), whic h can b e solved b y the alternating direction metho d of m ultipliers (ADMM) [ 5 ]. F or this, we in tro duce t wo auxiliary v ariables, namely U 0 and U 1 , and transform ( 4.1 ) into min U 1 ≥ 0 , U 0 , X    h 1 , T U 0 − G log( T U 0 + b 1) i + m,n,d X i,j,k =1 w l ij k | ( U 1 ) ij k | : U 0 = A ∗ X , U 1 = X    . NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 9 The nonnegative constraint can b e rewritten as an indicator function such that the ob jective function in this subproblem b ecomes (4.2) h 1 , T U 0 − G log( T U 0 + b 1) i + m,n,d X i,j,k =1 w l ij k | ( U 1 ) ij k | + 1 + ( U 1 ) , where 1 + ( X ) is the indicator function of the nonnegative constrain t 1 + ( X ) = ( 0 , X ≥ 0 , ∞ , otherwise . Therefore, the augmen ted Lagrangrian function L ( U 0 , U 1 , X , η 0 , η 1 ) [ 25 ] for subproblem ( 4.1 ) is L ( U 0 , U 1 , X , η 0 , η 1 ) := h 1 , T U 0 − G log( T U 0 + b 1) i + m,n,d X i,j,k =1 w l ij k | ( U 1 ) ij k | + β 0 2 kU 0 − A ∗ X − η 0 k 2 F + β 1 2 kU 1 − X − η 1 k 2 F + 1 + ( U 1 ) , where η 0 , η 1 ∈ R m × n × d are the Lagrange multipliers and β 0 , β 1 > 0. Here kX k F is the F rob enius norm of X , whic h is equal to the ` 2 -norm of the vectorized X . Starting at X = X t , U 1 = U t 1 and U 2 = U t 2 , applying ADMM in [ 5 ] yields the iterative scheme U t +1 0 = argmin U 0  h 1 , T U 0 − G log( T U 0 + b 1) i + β 0 2 kU 0 − A ∗ X t − η t 0 k 2 F  (4.3a) U t +1 1 = argmin U 1 ≥ 0    m,n,d X i,j,k =1 w l ij k | ( U 1 ) ij k | + β 1 2 kU 1 − X t − η t 1 k 2 F    (4.3b) X t +1 = argmin X  β 0 2   U t +1 0 − A ∗ X − η t 0   2 F + β 1 2   U t +1 1 − X − η t 1   2 F  (4.3c) η t +1 0 = η t 0 − ρ ( U t +1 0 − A ∗ X t +1 ) (4.3d) η t +1 1 = η t 1 − ρ ( U t +1 1 − X t +1 ) (4.3e) Here ρ ∈  0 , 1+ √ 5 2  is the dual steplength. W e c ho ose ρ = 1 . 618 in our numerical tests. The U 1 -subproblem ( 4.3b ) is solved b y soft-thresholding under nonnegative constrain t. So the closed-form solution is given by  U t +1 1  ij k = max n X t + η t 1 − w l ij k /β 1 , 0 o . The X -subproblem ( 4.3c ) is a least squares problem. W e rewrite the con volution into comp o- nen twise multiplication by using the F ourier transform and then the ob jective function of the subproblem b ecomes β 0 2   F {A} · F {X } − F  U t +1 0 − η t 0    2 F + β 1 2   F {X } − F {U t +1 1 − η t 1 }   2 F . 10 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD Its closed-form solution reads as (4.4) X t +1 = F − 1 (  |F {A} | 2 + β 1 β 0  − 1  F {A} · F {U t +1 0 − η t 0 } + β 1 β 0 F {U t +1 1 − η t 1 }  ) , where |X | 2 and X are the comp onent wise op erations of the square of absolute v alue of X , and complex conjugate of X , resp ectiv ely . By assuming the b oundary condition to b e p erio dic, w e use the 3D fast F ourier transformation (3D FFT) to compute ( 4.4 ) eﬃciently . F or the solution of U 0 -subproblem ( 4.3a ), w e need Prop osition 4.1 . The range of the follo wing prop osition go es b ey ond the solution of ( 4.3a ). It gives a closed-form solution to a highly nonlinear and useful functional, namely the Thikhonov regularized KL divergence in volving high-dimensional linear op erators. It can also b e seen as the closed-form solution of the proximal op erator of this general KL divergence. Prop osition 4.1. Given β , b ∈ R 1 , η ∈ R m × n × d and G ∈ R m × n , c onsider the minimization pr oblem (4.5) min U  h 1 , T U − G log( T U + b 1) i + β 2 kU − ξ k 2 F  . Then ( 4.5 ) has a close d-form solution (4.6) U ∗ ij k = ( − (1+ β b − β ξ ij k )+ √ (1 − β b − β ξ ij k ) 2 +4 β G ij 2 β , if k = d, ξ ij k , otherwise . Pr o of. Denote J ( U ) := h 1 , T U − G log ( T U + b 1) i + β 2 kU − ξ k 2 F . The minimizer of J with resp ect to U satisﬁes ∇J ( U ∗ ) = 0. Here the minimization of J ( U ) can b e considered as a sequence sub-minimization problems separately . Then the minimizer can b e achiev ed b y considering partial deriv ativ e separately . In addition, h 1 , T U − G log( T U + b 1) i can b e rewritten as m,n P i,j =1 ( U ij d − G ij log( U ij d + b )) which do es not inv olv e U ij k with k 6 = d . Hence the partial deriv ative of h 1 , T U − G log ( T U + b 1) i with resp ect to the entries { ( i, j, k ) , k 6 = d } are zero. By ∇ β 2 kU − ξ k 2 F = β ( U − ξ ) we get (4.7) U ∗ ij k = ξ ij k if k 6 = d, F or those entries in the last slice of U ∗ , we rewrite the minimization problem in these en tries as (4.8) min U ij d  h 1 , U ij d − G ij log( U ij d + b ) i + β 2 ( U ij d − ξ ij d ) 2  . Denoting y = U ij d + b , then ( 4.8 ) b ecomes a standard one-dimensional KL divergence problem min y  D K L ( y , G ij ) + β 2 ( y − ξ ij d − b )  . NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 11 By [ 39 , Lemma 2.2], the explicit form of the minimizer is (4.9) y ∗ = − (1 − β b − β ξ ij k ) + p (1 − β b − β ξ ij k ) 2 + 4 β G ij 2 β . Com bining ( 4.7 ) and ( 4.9 ), we obtain ( 4.6 ). Based on Prop osition 4.1 , w e can obtain a closed-form solution for the U 0 -subproblem by setting ξ = A ∗ X t + η t 0 and β = β 0 Remark: The conv ergence of ( 4.1 ) is ensured b y [ 39 , Theorem 4.2], since the discrete forw ard model T ( A ∗ X ) can b e written as matrix-v ector m ultiplication. And, we can regarded L in [ 39 , Theorem 4.2] as an identify matrix. 4.2. Removing false p ositives using a centroid metho d. F or real data, p oint sources may b e not on the grid, which means the discrete mo del is not accurate. In order to av oid missing sources, the regularization parameter µ is kept small, whic h leads to the o ver-ﬁtting eﬀect. That is to say that there are more p oin t sources in our estimated result than the ground truth, which we m ust attempt to mitigate. Our optimization solution generally con tains tigh tly clustered p oin t sources, so we need to regard any such cluster of p oin t sources as a single p oint source. The same phenomenon has b een observ ed in [ 20 , 37 , 43 ]. T o this end, w e apply a post-pro cessing approac h following [ 43 ]. The metho d is based on a w ell-deﬁned tolerance distance for recognizing clustered neighbors denoted as C . The criterion is to ﬁnd nonzero en tries that are within a certain distance from the given point whose pixel v alue is higher than or equal to other nonzero p oin ts in the neigh b orho o d. F or recognizing the cluster of p oin t sources, we need to start from the entry with the highest in tensity . By the ab ov e criterion, one clustered neighborho o d is recognized. Then we compute the cen troid of each cluster. That is to say , to get the cen troid lo cation ( x, y , z ) for one cluster C as x = P ( i,j,k ) ∈ C i X ij k P ( i,j,k ) ∈ C X ij k ; y = P ( i,j,k ) ∈ C j X ij k P ( i,j,k ) ∈ C X ij k ; z = P ( i,j,k ) ∈ C k X ij k P ( i,j,k ) ∈ C X ij k . The ﬂux for this represen tativ e p oint source in cluster C is P ( i,j,k ) ∈ C X ij k . W e set the v alue of these entries in the recognized neigh borho o d as 0 and then w e pro ceed with the searching pro cess recursiv ely . An estimated p oin t source that cannot ﬁnd out its corresp onding ground truth is called false p ositive. T o some false p ositiv es which ma y b e due to the eﬀect of the p erio dic boundary condition, we set a threshold, say 5% of highest in tensit y [ 37 ]. Those en tries whose pixel v alue is low er than the threshold will b e regarded as false positives so that w e directly delete them in the searching process. The whole post-pro cessing method is summarized in Algorithm 4.2 . 4.3. Estimating the ﬂux values. In Section 4 , w e provided an algorithm for estimating the lo cations and ﬂuxes for the p oint sources. How ev er, n umerical results sho w that the ﬂux v alues are generally underestimated. In [ 20 , 37 ], least squares ﬁtting is used for impro ving the resolution as well as updating the corresp onding ﬂuxes. How ev er, our problem is not Gaussian noise or additive P oisson noise as used in their pap er. Our Poisson noise is data- dep enden t, which cannot force the regenerated image T ( A ∗ X ∗ ) to match the observ ed data 12 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD Algorithm 4.2 Remo ving cluster p oin t sources by computing cen troids (Cen troid FP). 1: Set U as a zero 3D tensor of the same size as X ; 2: Find the maximum v alue of the solution X ; 3: Recognize the neighbour C based on the ab o ve criterion and recursively chec k the other nonzero p oints; 4: Compute the centroid of C and set the v alue of this cen troid entry in U as the summation of all the pixel v alues in C ; 5: Set X ( C ) = 0; 6: If there is any nonzero entry in X , go to Step 2, otherwise, go to the next step; 7: Set pixel v alue of these entries in U whose v alue is low er than 5% of the highest intensit y as 0. with least squares, where X ∗ is the result after solving the non-conv ex optimization problem and post-pro cessing. Our aim is th us to estimate the source ﬂuxes from the KL data-ﬁtting term appropriate to the Poisson noise mo del when the source 3D p ositions hav e already b een accurately estimated. Let the PSF corresponding to the i -th source b e arranged as the column v ector h i . The stac king of the M column vectors in the same sequence as the source lab els for the M sources then deﬁnes a system PSF matrix H , with H = [ h 1 , h 2 , · · · , h M ] ∈ R K × M , where K is the total num ber of pixels in the v ectorized data arra y , so K = mn. The v ectorized observ ed image is denoted by g ∈ R K × 1 . The uniform background is denoted as the vector b 1 with 1 ∈ R K × 1 . The ﬂux vector is denoted as f ∈ R M × 1 . Here the problem is ov erdetermined meaning that the n umber of p oint sources M is muc h smaller than the n umber of av ailable data K = mn . Therefore we need to do some reﬁnement of the estimates by minimizing directly data ﬁtting term. Since the negative log-lik eliho o d function for the Poisson model, up to certain data dep endent terms, is simply the KL div ergence function, D K L ( H f + b 1 , g ) = h 1 , H f − g log ( H f + b 1 ) i , its minimization with resp ect to the ﬂux vector f , p erformed by setting the gradient of D K L with resp ect to f (see [ 12 ]) zero, yields the nonlinear relation ∇ D K L ( H f + b 1 , g ) = H T 1 − K X i =1 g i e T i ( H f + b 1 ) H T e i = K X i =1 e T i ( H f + b 1 − g ) e T i ( H f + b 1 ) H T e i = 0 , (4.10) where e i is the i -th canonical basis unit vector. Consider no w an iterative solution of ( 4.10 ). Since b is prop ortional to the vector of ones, 1 , we write ( 4.10 ) as NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 13 0 = bD K L ( H f + b 1 , g ) = K X i =1 e T i ( H f + b 1 − g ) e T i ( b 1 ) e T i ( H f + b 1 ) H T e i = K X i =1 e T i ( H f + b 1 − g ) e T i ( H f + b 1 − H f ) e T i ( H f + b 1 ) H T e i = K X i =1 e T i ( H f + b 1 − g ) H T e i − K X i =1 e T i ( H f + b 1 − g ) e T i H f e T i ( H f + b 1 ) H T e i = H T H f + H T ( b 1 − g ) − K X i =1 e T i ( H f + b 1 − g ) e T i H f e T i ( H f + b 1 ) H T e i . In this form, we easily get H T H f = H T ( g − b 1 ) + K X i =1 e T i ( H f + b 1 − g ) e T i H f e T i ( H f + b 1 ) H T e i , By multiplying by ( H T H ) − 1 the tw o sides of this equation, we directly get (4.11) f = f G + K X i =1 e T i ( H f + b 1 − g ) e T i H f e T i ( H f + b 1 ) H + e i , where H + = ( H T H ) − 1 H T and f G = H + ( g − b 1 ) is the solution corresp onding to the Gaussian noise mo del. This suggests the following iterative algorithm: (4.12) f n +1 = f G + K ( f n ) , n = 1 , 2 , · · · where K ( f ) = K X i =1 e T i ( H f + b 1 − g ) e T i H f e T i ( H f + b 1 ) H + e i . T o emphasize that our non-conv ex optimization model is based on the use of a KL data ﬁtting (KL) term and a non-conv ex (NC) regularization term, w e designate our approach as KL-NC that has b een enhanced with p ost-pro cessing and a reﬁned estimation of ﬂux. W e summarize the diﬀerent steps of our prop osed metho d for the rotating PSF problem in Algorithm 4.3 . 5. Numerical exp eriments. In this section we apply our optimization approach to solving sim ulated rotating PSF problems for point source localization and compare it to some other optimization methods. The co des of our algorithm and the others with whic h w e compared our metho d were written in MA TLAB 9 . 0 (R2016a) , and all the numerical exp eriments were conducted on a typical p ersonal computer with a standard CPU (In tel i7-6700, 3.4GHz). 14 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD Algorithm 4.3 KL-NC with p ost-pro cessing and estimation of the ﬂux. Input: X 0 ∈ R m × n × d and G ∈ R m × n . Output: The lo cations of p oint sources with the corresp onding ﬂux { ( x ∗ i , y ∗ i , z ∗ i , f ∗ i ) } . 1: Solv e the non-con vex optimization problem b y Algorithm 4.1 and get the minimizer X ∗ ; 2: Do the post-pro cessing b y Algorithm 4.2 to get the lo cations of estimated point sources { ( x ∗ i , y ∗ i , z ∗ i ) } ; 3: Estimate the ﬂuxes by the iterative scheme ( 4.12 ) and get f ∗ . The ﬁdelity of lo calization is assessed in terms of the recall rate , deﬁned as the r atio of the numb er of identiﬁe d true p ositive p oint sour c es over the numb er of true p ositive p oint sour c es, and the precision rate , deﬁned as the r atio of the numb er of identiﬁe d true p ositive p oint sour c es over the numb er of al l p oint sour c es obtaine d by the algorithm ; see [ 3 ]. T o distinguish true p ositiv es from false p ositiv es from the estimated point sources, we need to determine the minimum total distance b etw een the estimated p oint sources and true p oin t sources. Here all 2D sim ulated observ ed images are described b y 96-b y-96 matrices. W e set the num b er of zones of the spiral phase mask responsible for the rotating PSF at L = 7 and the ap erture-plane side length as 4 which sets the pixel resolution in the 2D image (FFT) plane as 1/4 in units of λz I /R . The dictionary corresp onding to our discretized 3D space contains 21 slices in the axial direction, with the corresp onding v alues of the defo cus parameter, ζ , distributed uniformly ov er the range, [ − 21 , 21]. According to the Abbe-Rayleigh resolution criterion, t wo p oint sources that are within (1 / 2) λz I /R of eac h other and lying in the same transv erse plane cannot be separated in the limit of low intensities. In view of this criterion and our choice of the ap erture-plane side length and if we assume conserv ativ ely that our algorithm do es not yield any signiﬁcant sup erresolution, we must regard tw o p oin t sources that are within 2 image pixel units of each other as a single point source. Analogously , t w o sources along the same line of sight ( i.e., with the same x, y coordinates) that are axially separated from each other within a single unit of ζ must also b e regarded as a single p oin t source. As for real problems, our simulation does not assume that the p oint sources are on the grid p oin ts. Rather, a num ber of p oin t sources are randomly generated in a 3D contin uous image space with certain ﬂuxes. W e consider a v ariet y of source densities, from 5 p oin t sources to 40 p oint sources in the same size space. F or each densit y , we randomly generate 20 observed images and use them for training the parameters in our algorithm, and then test 50 simulated images with the w ell-selected parameters. The num ber of photons emitted by eac h p oin t source follows a P oisson distribution with mean of 2000 photons. Instead of adding P oisson noise as additive noise as in [ 37 ], we apply data-dep endent Poisson noise by using the MA TLAB command G = poissrnd(I0+b) , where I0 is the 2D original image formed by adding all the images of the p oint sources, and b is the background noise whic h we set to a typical v alue 5. Here, poissrnd is the MA TLAB command whose input is the mean of the Poisson distribution. NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 15 5.1. 3D lo calizations for lo w and high densit y cases. In this subsection, we will sho w some 3D lo calizations results in low and high density cases (15 p oin t sources and 30 p oint sources); see Figure 5.1 . Note that our 3D lo calization is estimated very well for 15 p oin t sources. In this example, all the true p ositiv es are iden tiﬁed and there is only one false p ositiv e. F rom Figure 5.1 (c) and (e), we see that there are tw o true p oint sources that are close in their transverse co ordinates, but w ell-separated in the axial direction; see the arrows in Figure 5.1 (c),(e). In the 2D original image (without noise) in Figure 5.1 (c), the rotating PSF images of those tw o p oint sources are non-ov erlapping and our approach can well identify b oth tw o p oint sources. Figure 5.1 (b), (d) and (f ) show the high density (30 point sources) case with man y o verlap- ping rotating PSF images of the resp ectiv e p oint sources. Suc h image ov erlap in the presence of P oisson noise makes the problem diﬃcult. The num b er of p oin t sources is not easily ob- tained by observ ation. In this sp eciﬁc case, our algorithm still identiﬁes all the true p oin t sources correctly , but pro duces 6 false p ositives. F rom Figure 5.1 (d), we can see that these false p ositives come from rather substantial PSF o verlapping. Recall that the solution of our non-conv ex optimization problem in volv es outer and inner iterations. The stopping criteria for inner iteration is set as the relativ e error kU t +1 1 −U t 1 k F kU t 1 k F less than a toleran t small num b er. W e also set the maximum num ber of outer iterations as 2 and maxim um n um b er of inner iterations as 400. F rom Figures 5.2 and 5.3 , w e can see how and wh y w e c hose these t wo maximum iteration n umbers. Let us denote the num b er of maximum outer iterations and inner iterations as Max out and Max in resp ectiv ely . In Figure 5.2 , w e ﬁxed Max in as 400 and tried diﬀeren t Max out . As we see, the recall rate, whether with or without p ost-pro cessing, c hanges little. The precision rate increases, how ev er, when we c hange Max out from 1 to 2, but it changes considerably less when c hanging Max out from 2 to 3, while the computational time increases linearly with Max out . In Figure 5.3 , on the other hand, we ﬁxed Max out as 2 and changed Max in . Here Max in = 400 gives the best results with a relatively lo w time cost. In both cases, the precision rate can be impro ved signiﬁcantly with p ost-pro cessing. Here, we consider the practical con vergence of our algorithm. That is to chec k the diﬀer- ences b et w een the auxiliary v ariables ( U 0 and U 1 ) and their corresp onding substituted v ariables ( A ∗ X and X ), i.e., the v alue of kU 0 − A ∗ X k F and kU 1 − X k F in each iteration. Noted that the maximum num b er of inner and outer iterations are 400 and 2 resp ectively . W e plot the v alues of these t w o terms for eac h iteration in Figures 5.4 and 5.5 . Both terms decrease iterativ ely . Note that the v alues of kU 0 k F and kA ∗ X k F are ab out 2300 and the v alues of kU 1 k F and kX k F are ab out 260. Therefore, the diﬀerences b etw een auxiliary v ariables and their substituted v ariables are small, which shows the conv ergence numerically . A t the end of this subsection, w e consider the stabilit y of our algorithm under PSF errors and bias. As an example of such an error, which might arise from imp erfections of the phase- mask fabrication pro cess, w e n umerically add a Gaussian phase randomness at the level of a fraction of a wa v e to the spiral phase mask. That is to say , A ζ ( s ) in ( 2.3 ) is changed to ˆ A ζ ( s ) = 1 π     Z P ( u )exp  ι (2 π u · s + ζ u 2 − ψ ( u ) + σ N )  d u     2 , where σ N represents a Gaussian noise term with standard deviation σ chosen from 1/40th 16 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD (a) Observed image (b) Observed image (c) Estimated lo cations in 2D (d) Estimated lo cations in 2D 80 60 0 y 2 4 6 8 40 10 12 20 14 16 x 18 40 20 20 60 80 (e) Estimated lo cations in 3D 80 60 0 y 2 4 6 8 40 10 12 20 14 16 x 18 40 20 20 60 80 (f ) Estimated lo cations in 3D Figure 5.1. 3D lo calizations for low and high density c ases: (a), (c) and (e) ar e the r esults for the 15 p oint sour c es case; (b), (d) and (f ) are the r esults for the 30 p oint sour c es c ase. “o” is the gr ound truth point sour c e and “+” is the estimate d p oint sourc e. NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 17 123 Maximum outer iteration 70 75 80 85 90 95 100 105 110 Recall rate (%) with post-processing without post-processing 123 Maximum outer iteration 20 30 40 50 60 70 80 90 100 Precision (%) with post-processing without post-processing 123 Maximum outer iteration 0 5 10 15 20 25 30 35 Time (second) with post-processing without post-processing Figure 5.2. Eﬀe cts of the maximum numb er of outer iteration. to 1/10th of a w a v e in the spiral phase, i.e. σ is set b et ween 2 π 40 and 2 π 10 radians. W e apply the Gaussian noise by using the MA TLAB command “randn” and consider the case of 15 p oin t sources by calculating the a verage of recall and precision rate for 50 observed images randomly generated from the distribution of p oint sources. W e compare the results with and without any PSF error. See Figure 5.6 . The recall rate is stable for all these noise levels. When the noise level, σ , is less than 0.4 radian, ( i.e. , ab out 1/25th wa v e of spiral phase), the precision rate is within 3% of the noise-less case. But when σ is larger than 0.4 radian, the precision rate drops b y about 8%. These results sho w go o d stabilit y of our algorithm under mo dest PSF phase errors. 5.2. Compa rison of Algo rithm KL-NC with other algorithms. In this subsection we compare our approach with three other optimization algorithms: KL- ` 1 (KL data ﬁtting with ` 1 regularization mo del); ` 2 - ` 1 (least squares ﬁtting term with ` 1 regularization mo del) and ` 2 -NC (least squares ﬁtting term with non-con v ex regularization model). F or all these models, w e do the same p ost-pro cessing and estimation of ﬂux v alues after solving the corresp onding optimization problem. In KL- ` 1 , we need to solve the following optimization problem min X ≥ 0 { D K L ( T ( A ∗ X ) + b 1 , G ) + µ kX k 1 } . W e use ADMM by introducing tw o auxiliary v ariables U 0 = A ∗ X and U 1 = X . W e get a similar augmen ted Lagrangrian function with w l ij k replaced b y µ in ( 4.2 ). Therefore, w e use Prop osition 4.1 to solve this problem. 18 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD 300 400 500 Maximum inner iteration 40 50 60 70 80 90 100 110 Recall rate (%) with post-processing without post-processing 300 400 500 Maximum inner iteration 40 50 60 70 80 90 100 110 Precision rate (%) with post-processing without post-processing 300 400 500 Maximum inner iteration 9 9.5 10 10.5 11 11.5 12 12.5 13 Time (second) with post-processing without post-processing Figure 5.3. Eﬀe cts of the maximum numb er of inner iteration. 0 50 100 150 200 250 300 350 400 inner iterations 0 10 20 30 40 50 60 ||U 1 -X|| (a) In the 1st outer iteration 0 50 100 150 200 250 300 350 400 inner iterations 5 10 15 20 25 30 35 40 45 50 ||U 1 -X|| (b) In the 2nd outer iteration Figure 5.4. kU 1 − X k F in e ach iter ation. In ` 2 - ` 1 , we need to solve the following optimization problem (5.1) min X ≥ 0  1 2 kT ( A ∗ X ) + b 1 − G k 2 F + µ kX k 1  . The problem is similar to the optimization problem in [ 37 ], except that w e include the bac k- ground which is estimated easily by signal processing. W e can therefore still use ADMM for solving ( 5.1 ). NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 19 0 50 100 150 200 250 300 350 400 inner iterations 0 50 100 150 200 250 300 350 400 450 500 ||U 0 -A*X|| (a) In the 1st outer iteration 0 50 100 150 200 250 300 350 400 inner iterations 0 50 100 150 ||U 0 -A*X|| (b) In the 2nd outer iteration Figure 5.5. kU 0 − A ∗ X k F in e ach iter ation. 0 0.1 0.2 0.3 0.4 0.5 0.6 noise level 50 60 70 80 90 100 110 rate (%) recall precision Figure 5.6. R e c al l and pr e cision r ates under diﬀer ent levels of err ors in the de gr ade d PSF. In ` 2 -NC, we need to solve the following optimization problem (5.2) min X ≥ 0  1 2 kT ( A ∗ X ) + b 1 − G k 2 F + µ R ( X )  , where R ( X ) is the same as KL-NC metho d. W e see that IRL1 can also work w ell for solving ( 5.2 ) with the ` 1 w eighted optimization as a subproblem which we solve by ADMM since it is similar to ` 2 - ` 1 . 20 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD Both the initial guesses of X and U 0 are set as 0 for all these metho ds. In order to do the comparison, we plot the solutions for the four mo dels as w ell as the ground truth in the same space; see Figures 5.7 and 5.8 which corresp ond to the cases of 15 and 30 p oint sources, resp ectiv ely . 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-L1 KL-L1 (a) Without p ost-pro cessing ( ` 1 mo dels) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-NC KL-NC (b) Without p ost-pro cessing (non-conv ex mo del) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-L1 KL-L1 (c) With p ost-pro cessing ( ` 1 mo dels) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-NC KL-NC (d) With p ost-pro cessing (non-conv ex mo del) Figure 5.7. Solution X from 4 algorithms (15 p oint sour c es). F rom Figure 5.7 and Figure 5.8 , w e see the o verﬁtting of the ` 1 regularization mo dels (KL- ` 1 and ` 2 - ` 1 ). Before p ost-pro cessing, the solutions of these tw o algorithms spread out the PSFs a lot and ha v e many false p ositives. After p ost-pro cessing, b oth algorithms are impro ved, esp ecially KL- ` 1 . How ev er, in comparison to the non-conv ex regularization (KL- NC and ` 2 -NC), they still ha ve man y more false positives. Among the four algorithms, our NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 21 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-L1 KL-L1 (a) Without p ost-pro cessing ( ` 1 mo dels) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-NC KL-NC (b) Without p ost-pro cessing (non-conv ex mo del) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-L1 KL-L1 (c) With p ost-pro cessing ( ` 1 mo dels) 0 5 80 10 60 80 15 y 60 x 40 20 40 20 20 true L2-NC KL-NC (d) With p ost-pro cessing (non-conv ex mo del) Figure 5.8. Solution X from 4 algorithms (30 p oint sour c es). approac h (KL-NC) p erforms the b est in terms of the recall and precision rates. W e also tested a num b er of other v alues of p oint source densit y , namely 5, 10, 15, 20, 30 and 40 point sources, and computed the a v erage recall and precision rate of 50 images for eac h densit y and for each algorithm; see T ables 5.1 and 5.2 . The results sho w the sup erior results of our metho d in terms of b oth recall and precision rates. In T able 5.1 , the b est recall and precision rates in eac h case are lab eled by bold fonts. As in the ab ov e discussion, our non- con vex regularization tends to eliminate more false p ositives whic h increases the precision rate. The KL data-ﬁtting term, on the other hand, improv es the recall rate as we see by comparing the results of KL-NC with ` 2 -NC. Before p ost-pro cessing, we see that all the algorithms hav e 22 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD lo w precision rates, esp ecially the tw o emplo ying the ` 1 regularization mo del at less than 10%. T able 5.1 Comp arison of our KL-NC with ` 2 - ` 1 , ` 2 -NC and KL- ` 1 . Al l the r esults ar e after p erforming p ost- pro cessing . ` 2 - ` 1 ` 2 -NC KL- ` 1 KL-NC No. Sources Recall Prec. Recall Prec. Recall Prec. Recall Prec. 5 100.00 % 68.91% 97.60% 89.15% 98.93% 58.64% 100.00 % 97.52 % 10 99.60 % 55.95% 94.80% 83.51% 99.40% 65.24% 99.40% 93.69 % 15 98.67% 56.28% 92.80% 84.77% 98.93 % 58.64% 98.40% 88.60 % 20 97.70% 56.50% 95.20% 80.92% 98.10 % 57.82% 97.70% 87.49 % 30 96.00% 55.74% 93.93% 77.77% 94.00% 56.22% 96.20 % 79.75 % 40 93.80% 52.68% 95.40 % 59.34% 93.70% 54.29% 95.00% 73.35 % T able 5.2 Comp arison of our KL-NC with ` 2 - ` 1 , ` 2 -NC and KL- ` 1 . Al l the r esults ar e b efore p ost-pro cessing . ` 2 - ` 1 ` 2 -NC KL- ` 1 KL-NC No. Sources Recall Prec. Recall Prec. Recall Prec. Recall Prec. 5 100.00% 9.36% 98.80% 58.97% 100.00% 3.41% 100.00% 52.27% 10 100.00% 8.48% 96.80% 54.50% 100.00% 2.92% 100.00% 51.13% 15 100.00% 4.32% 96.47% 60.05% 100.00% 3.41% 100.00% 49.40% 20 100.00% 4.41% 97.70% 50.27% 100.00% 3.27% 99.30% 52.80% 30 100.00% 4.47% 98.13% 49.25% 99.93% 2.71% 99.40% 35.67% 40 100.00% 4.47% 98.95% 30.27% 100.00% 3.75% 99.00% 30.31% In the follo wing, w e compare the results of the estimation of the ﬂux in these four algo- rithms, considering sp eciﬁcally the case of 15 p oint sources. In Figure 5.9 , we plot the ﬂuxes of ground truth sources and the ﬂuxes of the estimated p oint sources for the true p ositive p oin t sources. F or the false p ositiv e point sources, we only sho w the estimated ﬂuxes. Both ` 1 mo dels underestimate the ﬂuxes. The rotating PSF images for false p ositiv es carry the energy aw a y from the true p ositiv e source ﬂuxes. In non-conv ex mo dels, we also hav e similar observ ations when we hav e false p ositives. F or example, in Figure 5.9 (b), w e see the ﬂux on the ﬁfth bar is underestimated more than the others. W e note that its rotating PSF image is o verlapping with that of a false positive. The more false p ositiv es an algorithm reco v ers the more they will spread out the intensit y , leading to more underestimated ﬂuxes for the true p ositiv es. W e also tested 50 diﬀerent observed images for each density , and analyzed the relative error in the estimated ﬂux v alues, which we deﬁne as error = f est − f tru f tru , NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 23 0 5 10 15 20 25 30 0 20 40 60 80 100 120 140 160 true est (a) ` 2 - ` 1 0 2 4 6 8 10 12 14 16 0 20 40 60 80 100 120 140 160 true est (b) ` 2 -NC 0 5 10 15 20 25 0 20 40 60 80 100 120 140 160 true est (c) KL- ` 1 0 2 4 6 8 10 12 14 16 0 20 40 60 80 100 120 140 160 true est (d) KL-NC Figure 5.9. Estimating ﬂux values. The b ar graph with no value gr ound truth part c orr esp onds to a false p ositive. where ( f est , f tru ) is the pair which con tains the ﬂux of an iden tiﬁed true positive and the corresp onsing ground truth ﬂux. W e plot the histogram of the relativ e errors on these four mo dels in Figures 5.10 and 5.11 . W e still see the adv an tage of KL-NC o v er other algorithms in this resp ect. The distribution of relative errors mostly lies within [0 , 0 . 1]. F or the ` 1 regularization algorithms, the distribution of the relative error spreads out and there are man y cases with error higher than 0.3. F or ` 2 -NC, we hav e the similar result for the 15 p oint sources case; ho wev er, in higher density cases, KL-NC has b etter results comparing with ` 2 -NC. 5.3. Lo w numb er of photons case. In the previous subsection, we sa w the adv antages of using our nonconcon vex regularization term. How ev er, KL-NC do es not seem to p ossess signiﬁcan t adv an tages o v er the ` 2 -NC model, esp ecially in the cases of low source densit y . This is because only when w e ha v e a suﬃcien tly large num b er of photons, as was the case in the results discussed earlier, the ` 2 data-ﬁtting term corresp onding to the maximum lik eliho o d estimator for the Gaussian noise mo del is w ell approximated by the KL data-ﬁtting term for the P oisson noise mo del assumed here. In order to v erify this, we study the case of lo w photon n umber in this subsection. In Figure 5.12 , w e see the observed image with three diﬀerent photon num bers p er source, namely 500, 1000, and 2000. F or the case of 500 photons p er 24 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD (a) ` 2 - ` 1 (b) ` 2 -NC (c) KL- ` 1 (d) KL-NC Figure 5.10. Histo gr am of the r elative err ors of the ﬂux values for the 15 p oint sour c es c ase. source, the image is v ery dim, and we can barely distinguish the rotating PSF signal from the bac kground noise, so w e do not consider this case any further. F or the case of 1000 photons b eing emitted b y eac h p oin t source, the corresp onding image follo ws a Poisson distribution with a mean of 1000 photons. W e take each image to con tain 15 p oint sources. As b efore, we ﬁrst randomly generated 20 images, whic h were used for training parameters. Then we tested 50 diﬀerent observed images by using the trained parameter v alues. W e obtained 90.00% for the recall rate and 78.66% for the precision rate in the KL-NC algorithm, but only 84.67% and 59.87%, resp ectively , for the t wo rates in the ` 2 -NC algorithm. F or ev aluation of the ﬂux estimates, w e refer to Figure 5.13 from whic h w e see that once again KL-NC gives m uch better results than ` 2 -NC since the bars corresp onding to the lo w relativ e error case are higher. W e therefore conclude that the KL data-ﬁtting term is m uch b etter than the ` 2 data-ﬁtting term when dealing with low num bers of photons. 6. Conclusions and Current Directions. In this pap er, we ha ve proposed a non-con vex optimization algorithm for 3D lo calization of a swarm of randomly spaced p oin t sources using a sp eciﬁc rotating PSF whic h has a single primary lob e in the image of eac h p oint source. Rotating PSFs with a single lob e seem to p ossess several obvious adv antages ov er those that ha ve multiple lob es, such as the double-lob e rotating PSF, e.g. [ 28 , 29 , 37 ], or spread out rapidly with increasing source defo cus, suc h as the standard Airy PSF [ 10 ], particularly when NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 25 (a) ` 2 - ` 1 (b) ` 2 -NC (c) KL- ` 1 (d) KL-NC Figure 5.11. Histo gr am of relative err ors of ﬂux values in the 30 p oint sour c es c ase. dealing with relativ ely high source densities and relativ ely small photon num b ers p er source. W e hav e emplo yed a post-pro cessing step based b oth on centroiding the locations of recov ered sources that are tigh tly clustered and thresholding the reco vered ﬂux v alues to eliminate ob vious false p ositives from our reco vered sources. In addition, we ha v e prop osed a new iterativ e sc heme for reﬁning the estimate of the source ﬂuxes after the sources ha v e been lo calized. In our numerical results, w e sim ulated noisy image data according to the physics of the space debris problem at optical w av elengths using a space-based telescop e. T o our kno wledge, our algorithm is the ﬁrst one developed so far for snapshot 3D localization of space debris and tracking of space debris via a rotating PSF approach. These tec hniques can b e applied to other rotating and depth-enco ding PSFs for accurate 3D lo calization and ﬂux reco very of p oin t sources in a scene from its image data under the P oisson noise mo del. Applications include not only 3D localization of space debris, but also sup er-resolution 3D single-molecule lo calization microscopy , e.g. [ 3 , 40 ]. A theoretical analysis of the adv an tages of our non-conv ex mo del ov er conv ex mo dels and their conv ergence prop erties for the rotating PSF problem is planned for study in a future pap er. Complete reco very based on the theory of sup er-resolution is the direction w e will use to further analyze PSF stabilit y and related questions. T ests of this algorithm based on real data collected using phase masks fabricated for b oth applications will b e studied in the 26 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD Figure 5.12. Observe d image with diﬀerent numb ers of photons in e ach point sour c e. (a) ` 2 - ` 1 (b) ` 2 -NC (c) KL- ` 1 (d) KL-NC Figure 5.13. Consider the 1000 photons p er p oint sour c e case. Histo gr am of the r elative err ors of the ﬂuxes for the 15 p oint sour c es. future. W ork inv olving snapshot multi-spectral imaging, which will p ermit accurate material c haracterization as well as higher 3D resolution and lo calization of space micro-debris via a sequence of snapshots, is currently under wa y . NON-CONVEX OPTIMIZA TION F OR 3D LOCALIZA TION 27 REFERENCES [1] H. Babcock, Y. M. Sigal, and X. Zhuang , A high-density 3D lo c alization algorithm for sto chastic optic al r e c onstruction micr osc opy , Optical Nanoscopy , 1 (2012), p. 6. [2] A. Barsic, G. Grover, and R. Piestun , Thre e-dimensional sup er-r esolution and lo c alization of dense clusters of single mole cules , Scientiﬁc reports, 4 (2014), p. 5388. [3] U. J. Birk , Sup er-R esolution Micr osc opy: A Pr actic al Guide , John Wiley & Sons, 2017. [4] T. Blumensa th and M. E. Da vies , Iter ative thr esholding for sp arse appr oximations , Journal of F ourier analysis and Applications, 14 (2008), pp. 629–654. [5] S. Boyd, N. P arikh, E. Chu, B. Pelea to, J. Eckstein, et al. , Distribute d optimization and statistic al le arning via the alternating dir e ction metho d of multipliers , F oundations and T rends R  in Mac hine Learning, 3 (2011), pp. 1–122. [6] S. S. Chen, D. L. Donoho, and M. A. Sa unders , Atomic de c omp osition by basis pursuit , SIAM review, 43 (2001), pp. 129–159. [7] O. Daigle and S. Blais-Ouellette , Photon c ounting with an EMCCD , in Proc. SPIE, v ol. 7536, 2010, pp. 7536–10, https://doi.org/10.1117/12.840047 . [8] C. R. Engler t, J. T. Ba ys, K. D. Marr, C. M. Bro wn, A. C. Nicholas, and T. T. Finne , Optic al orbital debris sp otter , Acta Astronautica, 104 (2014), pp. 99–105. [9] S. F oucar t and H. Rauhut , A mathematic al intr o duction to c ompr essive sensing , v ol. 1, Birkh¨ auser Basel, 2013. [10] J. W. Goodman , Intr o duction to F ourier Optics , F reeman, 4th ed., 2017. [11] D. Hampf, P. W agner, and W. Riede , Optic al te chnolo gies for the observation of low e arth orbit obje cts , arXiv preprint arXiv:1501.05736, (2015). [12] Z. T. Harmany, R. F. Marcia, and R. M. Willett , This is SPIRAL-T AP: Sp arse p oisson intensity r e c onstruction algorithms—the ory and pr actic e , IEEE T ransactions on Image Pro cessing, 21 (2012), pp. 1084–1096. [13] B. Huang, W. W ang, M. Ba tes, and X. Zhuang , Thr e e-dimensional sup er-r esolution imaging by sto chastic optic al r e c onstruction micr osc opy , Science, 319 (2008), pp. 810–813. [14] M. F. Juette, T. J. Gould, M. D. Lessard, M. J. Mlodzianoski, B. S. Nagpure, B. T. Bennett, S. T. Hess, and J. Bewersdorf , Thr e e-dimensional sub–100 nm r esolution ﬂuor esc enc e micr oscopy of thick samples , Nature metho ds, 5 (2008), p. 527. [15] R. Kumar and S. Prasad , PSF rotation with changing defo cus and applic ations to 3D imaging for sp ace situational awar eness , in Pro ceedings of the 2013 AMOS T ec hnical Conference, Maui, HI, 2013. [16] T. Le, R. Char trand, and T. J. Asaki , A variational appr o ach to r e c onstructing images c orrupte d by p oisson noise , Journal of mathematical imaging and vision, 27 (2007), pp. 257–263. [17] M. D. Lew, S. F. Lee, M. Badieirost ami, and W. Moerner , Corkscr ew p oint spr e ad function for far-ﬁeld thr e e-dimensional nanosc ale lo c alization of p ointlike objects , Optics letters, 36 (2011), pp. 202– 204. [18] M. D. Lew, A. R. von Diezmann, and W. Moerner , Easy-DHPSF open-sour c e softwar e for thr e e- dimensional lo c alization of single molecules with pr e cision b eyond the optic al diﬀr action limit , Proto col exc hange, 2013 (2013). [19] J. Min, S. J. Holden, L. Carlini, M. Unser, S. Manley, and J. C. Ye , 3D high-density lo c aliza- tion micr osc opy using hybrid astigmatic/biplane imaging and sp arse image r e c onstruction , Biomedical optics express, 5 (2014), pp. 3935–3948. [20] J. Min, C. V onesch, H. Kirshner, L. Carlini, N. Olivier, S. Holden, S. Manley, J. C. Ye, and M. Unser , F ALCON: fast and unbiase d r e c onstruction of high-density sup er-r esolution micr osc opy data , Scientiﬁc rep orts, 4 (2014), p. 4577. [21] B. K. Na t arajan , Sp arse approximate solutions to line ar systems , SIAM journal on computing, 24 (1995), pp. 227–234. [22] M. Nikolo v a, M. K. Ng, and C. T am , F ast nonc onvex nonsmooth minimization metho ds for image r estor ation and r e c onstruction , IEEE T ransactions on Image Pro cessing, 19 (2010), pp. 3073–3088. [23] M. Nikolo v a, M. K. Ng, and C. T am , On ` 1 data ﬁtting and conc ave r e gularization for image r e c overy , SIAM Journal on Scientiﬁc Computing, 35 (2013), pp. A397–A430. [24] M. Nikolov a, M. K. Ng, S. Zhang, and W. Ching , Eﬃcient re c onstruction of pie c ewise c onstant 28 C. W ANG, R. CHAN, M. NIKOLO V A, R. PLEMMMONS AND S. PRASAD images using nonsmo oth nonc onvex minimization , SIAM journal on Imaging Sciences, 1 (2008), pp. 2– 25. [25] J. Nocedal and S. Wright , Numeric al optimization: Springer scienc e & business me dia , New Y ork, (2006). [26] P. Ochs, A. Dosovitskiy, T. Bro x, and T. Pock , On iter atively r eweighte d algorithms for nonsmo oth nonc onvex optimization in c omputer vision , SIAM Journal on Imaging Sciences, 8 (2015), pp. 331–372. [27] Y. C. P a ti, R. Rezaiif ar, and P. S. Krishnaprasad , Ortho gonal matching pursuit: R e cursive function appr oximation with applic ations to wavelet de c omp osition , in Signals, Systems and Computers, 1993. 1993 Conference Record of The Twen ty-Sev en th Asilomar Conference on, IEEE, 1993, pp. 40–44. [28] S. R. P. P a v ani and R. Piestun , High-eﬃciency r otating p oint spr e ad functions , Optics express, 16 (2008), pp. 3484–3489. [29] S. R. P. P a v ani, M. A. Thompson, J. S. Biteen, S. J. Lord, N. Liu, R. J. Twieg, R. Piestun, and W. Moerner , Thr e e-dimensional, single-molecule ﬂuor esc enc e imaging b eyond the diﬀr action limit by using a double-helix p oint spr e ad function , Pro ceedings of the National Academy of Sciences, 106 (2009), pp. 2995–2999. [30] P. Prabha t, S. Ram, E. S. W ard, and R. J. Ober , Simultane ous imaging of diﬀer ent foc al planes in ﬂuor esc enc e micr osc opy for the study of c el lular dynamics in thr e e dimensions , IEEE transactions on nanobioscience, 3 (2004), pp. 237–242. [31] P. Prabha t, S. Ram, E. S. W ard, and R. J. Ober , Simultane ous imaging of diﬀer ent foc al planes in ﬂuor esc enc e micr osc opy for the study of c el lular dynamics in thr e e dimensions , IEEE transactions on nanobioscience, 3 (2004), pp. 237–242. [32] S. Prasad , Innovations in sp ac e-obje ct shap e r e c overy and 3D sp ac e debris lo c al- ization , in AF OSR-SSA W orkshop, Maui, 2017, Presentation slides av ailable at h ttps://communit y .apan.org/wg/afosr/m/k athy/176362/do wnload. [33] S. Prasad , R otating p oint spr e ad function via pupil-phase engine ering , Optics letters, 38 (2013), pp. 585– 587. [34] Y. Shechtman, S. J. Sahl, A. S. Backer, and W. Moerner , Optimal p oint spr e ad function design for 3D imaging , Ph ysical review letters, 113 (2014), p. 133902. [35] Y. Shechtman, L. E. Weiss, A. S. Backer, S. J. Sahl, and W. Moerner , Pr e cise thr e e-dimensional sc an-fr e e multiple-p article tr acking over lar ge axial r anges with tetr ap od p oint spr e ad functions , Nano letters, 15 (2015), pp. 4194–4199. [36] G. Shtengel, J. A. Galbraith, C. G. Galbraith, J. Lippincott-Schw ar tz, J. M. Gillette, S. Manley, R. Sougra t, C. M. W a terman, P. Kanchana wong, M. W. Da vidson, et al. , In- terfer ometric ﬂuor esc ent super-r esolution microsc opy r esolves 3D c el lular ultr astructur e , Pro ceedings of the National Academy of Sciences, 106 (2009), pp. 3125–3130. [37] B. Shuang, W. W ang, H. Shen, L. J. T auzin, C. Fla tebo, J. Chen, N. A. Moringo, L. D. Bishop, K. F. Kell y, and C. F. Landes , Gener alize d r e c overy algorithm for 3D sup er-r esolution micr osc opy using r otating p oint spr e ad functions , Scientiﬁc reports, 6 (2016). [38] E. Soubies, L. Blanc-F ´ eraud, and G. A uber t , A c ontinuous exact ` 0 p enalty (CEL0) for le ast squares r e gularize d pr oblem , SIAM Journal on Imaging Sciences, 8 (2015), pp. 1607–1639. [39] T. Teuber, G. Steidl, and R. H. Chan , Minimization and p ar ameter estimation for seminorm r e gu- larization mo dels with I-diver genc e c onstr aints , Inv erse Problems, 29 (2013), p. 035007. [40] A. von Diezmann, Y. Shechtman, and W. Moerner , Thr e e-dimensional lo calization of single mole cules for sup er-r esolution imaging and single-particle tr acking , Chemical Reviews, (2017). [41] P. W agner, D. Hampf, F. Sproll, T. Hasenohr, L. Humber t, J. Rodmann, and W. Riede , Dete c- tion and laser r anging of orbital obje cts using optic al metho ds , in Remote Sensing System Engineering VI, vol. 9977, International Society for Optics and Photonics, 2016, p. 99770D. [42] J. Xiao, M. K. Ng, and Y. Y ang , On the c onver genc e of nonc onvex minimization metho ds for image r e c overy , IEEE T ransactions on Image Pro cessing, 24 (2015), pp. 1587–1598. [43] L. Zhu, W. Zhang, D. Elna t an, and B. Huang , F aster STORM using c ompr esse d sensing , Nature metho ds, 9 (2012), p. 721.

Non-convex optimization for 3D point source localization using a rotating point spread function

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment