Bits from Photons: Oversampled Image Acquisition Using Binary Poisson Statistics

1 Bits from Photons: Ov ersampled Image Acquisition Using Binary Poisson Statisti cs Feng Y ang, Student Member , IEEE, Y ue M. Lu, M ember , IEEE, Luciano Sbaiz, Member , IEEE, and Martin V etterli, F ellow , IEEE Abstract W e study a new image sensor that is reminiscent of trad itional photo graphic ﬁlm. Each pixel in the sensor has a binary response, giving only a one-bit quantized measurement of the local light intensity . T o analyze its perfo rmance, we formulate the oversampled binar y sensing scheme as a parameter estimation problem based on quantized Poisson statistics. W e show that, with a single-photo n quantization thresho ld and large oversampling factors, the Cram ´ er-Rao lower boun d (CRLB) of the e stimation variance approac hes that of an ideal unqu antized sensor, that is, as if ther e wer e no quantization in the sensor measur ements. Furthermo re, the CRLB is shown to be asymp totically achiev able by the maximum likelihoo d estimator (MLE). By sho wing that the log-likelihood fun ction of our problem is con cav e, we guarantee the g lobal optimality of iterative algor ithms in ﬁnding the MLE. Num erical results on bo th synthetic data and image s taken by a prototyp e sensor verify our theoretical analysis and demon strate the effecti veness of our image reconstruc tion algo rithm. T hey also sugg est the p otential applicatio n o f the oversampled bina ry sensing scheme in high dynam ic rang e pho tograph y . Index T erms Computation al photog raphy , high dyn amic rang e imagin g, dig ital ﬁlm senso r , ph oton-limited im aging, Poisson statistics, qu antization, diffraction- limited imaging F . Y ang is with the School of Computer and Communication Sciences, ´ Ecole Polytechnique F ´ ed ´ erale de Lausanne (EP FL), CH-1015 Lausanne, Switzerland (e-mail: feng.yang@epﬂ.ch; WW W : http://lcav .epﬂ.ch/people/feng .yang) . Y . M. Lu is with the School of Engineering and Applied S ciences, H arv ard Uni versity , Cambridge, MA 02138, USA (e-mail: yuelu@seas.harv ard.edu; WWW: http://lu.seas.harvard.edu). L. Sbaiz i s with Goog le Zurich, C H-8002 Zurich, Switzerland (e-mail: luciano.sbaiz@gmail.com). M. V etterli is with the School of Computer and Communication S ciences, ´ Ecole Polytechnique F ´ ed ´ erale de Lausanne (E PFL), CH-1015, Lausanne, Switzerland (e-mail : martin.vetterli@epﬂ.ch; WW W : http://lcav .epﬂ.ch/people/martin.v etterli). This work was supported by the S wiss National S cience Foundation under grant 200020-132 730, and by the European Research Council under grant ERC-24700 6. 2 I . I N T R O D U C T I O N Before the a dvent of digital imag e s ensors, pho tography , for the most p art o f its history , used ﬁlm to record light information. At the heart of e very photographic ﬁ lm are a lar ge nu mber of light-sensiti ve g rains of silver- halide crystals [1]. Du ring expos ure, e ach micron-siz ed grain has a binary f ate: Either it is struck by some incident photons a nd bec omes “expos ed”, or it is missed by the photon bomba rdment and remains “unexpose d”. In the subseq uent ﬁlm development process , expos ed grains, due to their altered chemical properties, are con verted to silver meta l, c ontributi ng to opaqu e spo ts on the ﬁlm; un exposed grains are washed away in a chemical bath, le aving be hind them transparen t regions on the ﬁlm. Th us, in es senc e, photographic ﬁlm is a binary imaging medium, using local densities of o paque silver grains to encode the original light intensity information. Th anks to the small size and lar ge number of these grains, one hardly notices this qu antized nature of ﬁlm wh en viewing it at a distance, observing only a c ontinuous gray tone. In this work, we study a new digital image se nsor that is reminisc ent o f pho tographic ﬁlm. Each pixel in the sensor has a binary response , gi ving only a one-bit qu antized mea surement of the local light intensity . At the start o f the expos ure period, all pixels are se t to 0 . A pixel is the n s et to 1 if the n umber of photons rea ching it during the expo sure is at least equal to a given thres hold q . One way to build such binary sensors is to modify standard memory ch ip technology , where each memory bit ce ll is des igned to be sensitive to visible light [2]. W ith current CMOS technology , the level of integration of s uch sy stems can exceed 10 9 ∼ 10 10 ( i.e. , 1 giga to 10 giga) pixels per chip. In this case, the c orrespond ing pixel s izes (around 50 nm [3]) a re far be low the diff raction limi t of light (see Sec tion II for more details), a nd thus the image se nsor is oversampling the optical reso lution of the light ﬁeld. Intuitiv ely , on e can exploit this s patial redundan cy to compensa te for the information los s due to o ne-bit q uantizations, as is clas sic in ov ersampled analog-to-digital (A/ D) con versions [4]–[7]. Building a binary se nsor that emulates the p hotographic ﬁlm process was ﬁ rst envisioned by Fossum [8], who coined the n ame “digital ﬁ lm senso r” . The original moti vation was mainly out of tech nical nec essity . The miniaturization of ca mera systems calls for the co ntinuous shrinking of pixel sizes. At a ce rtain point, howe ver , the limited full-well capac ity ( i.e. , the maximum photon-electrons a p ixel ca n hold) of small pixels becomes a bo ttleneck, yielding very low sign al-to-noise ratios (SNRs) a nd p oor dynamic rang es. In contrast, a binary sensor whose p ixels o nly need to detect a few pho ton-electrons around a small threshold q h as much less requirement for full-well capacities, allowing pixel size s to shrink further . In this pape r , we present a theoretical ana lysis o f the performance of the bina ry image sens or , a nd propose an efﬁcient and op timal algorithm to reco nstruct images from the bina ry sens or measuremen ts. Our analysis and numerical simulations demonstrate tha t the dyna mic ranges of the binary se nsors can be orders o f ma gnitude higher than those of con ventional imag e s ensors, thu s p roviding one more motiv ation Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 3 for c onsidering this binary sensing sc heme. Since photon arriv als at each pixel can be we ll-approximated by a Poisson rand om proce ss who se rate is determined by the local light intensity , we formulate the b inary s ensing and s ubseq uent image recon struction as a parameter es timation problem b ased on q uantized Poiss on statistics. Imag e estimation from Poisson statistics ha s bee n extensively studied in the pas t, with applications in b iomedical and astrophysical imaging. Previous work in the literature has used linear mo dels [9], multiscale models [10 ], [11], an d nonlinea r piecewise smo oth models [12], [13] to des cribe the underlying images , leading to diff erent (penalized) maximum likelihood a nd/or Bayesian re construction algorithms. The main dif ference between o ur work a nd previous works is tha t we have on ly acce ss to one-bit quantized Poisson statistics. The b inary quantiza tion and spa tial oversampling in the sensing sch eme add interesting d imensions to the original problem. As we will sh ow in Section III, the performance of the binary s ensor depe nds on the intricate interplay of three parameters: the average light intens ity , the quantization thres hold q , and the oversampling factor . The binary s ensing scheme studied in this pa per also bea rs resemblance to oversampled A/D con version scheme s with quantizations (see , e.g. , [4]–[7]). Previous w ork on one -bit A/D conv ersions conside rs ban- dlimited signals or , in ge neral, signals li ving in the range space o f some overcomplete representations. The eff ect o f qua ntization is often approximated by additi ve noise, wh ich is then mitigated throu gh noise s haping [4], [6], or dithering [7], followed by linear recons truction. In our work, the binary senso r measurements are modeled as one-bit quantized versions of c orrelated Poisso n random variables (instead of deterministic signals), an d we d irectly so lve the statistical in verse problem by using max imum likelihood estimation, without a ny add iti ve noise approximation. The rest of the pap er is organized as follows. After a p recise de scription of the binary sensing model in Section II, we present three main co ntrib utions in this paper: 1 . E stimation performanc e: In Sec tion III, we analyze the performance of the proposed binary s ensor in e stimating a piecewise constant light intensity func tion. In what might be viewed as a s urprising resu lt, we show that, when the quan tization threshold q = 1 and with large oversampling factors, the Cram ´ er- Rao lower bound (CRLB) [14] of the estimation variance app roaches that of unquantized Poisson intensity estimation, that is, a s if there were no quantization in the sen sor me asuremen ts. Furthermore, the CRLB can be asymptotically a chieved b y a maximum lik elihood estimator (MLE), for lar ge oversampling factors. Combined, the se two res ults e stablish the fe asibility of trading spatial reso lutions for high er qu antization bit de pth. 2 . Adv antage over traditional sensors: W e c ompare the oversampled binary sensing sche me with tradi- tional image se nsors in Section III-C. Ou r analysis shows that, with suf ﬁciently lar ge oversampling f actors, the n ew binary sen sor can hav e higher dynamic range s, making it particularly attracti ve in a cquiring sc enes 4 containing both bright and dark regions. 3 . Image reconstruc tion: Section IV presents an MLE-based algorithm to reconstruct the light intensity ﬁeld from the binary s ensor meas urements. As an important result in this work, we show that the log- likelihood function in our problem is alw ays concave for arbitrary linear ﬁ eld mod els, thus ensuring the achievement of global optimal so lutions b y iterative algorithms. For numerica lly solving the MLE , we present a g radient method, a nd derive efﬁcient implementations ba sed on fast sign al proces sing algorithms in the polypha se domain [15], [16]. This attention to c omputational efﬁciency is important in practice, due to extremely lar ge spatial resolutions of the binary s ensors. Section V presents numerical res ults on b oth synthe tic data an d images taken by a prototype de vice [17]. These results verify our theoretical an alysis on the binary sensing s cheme, demonstrate the effecti veness of our image reco nstruction algorithm, an d showcase the beneﬁt of using the ne w binary sensor in acqu iring scene s with high dynamic ran ges. T o simplify the presentation we base our disc ussions on a one-dimens ional ( 1 -D) sensor array , but all the results can be eas ily extended to the 2 - D case. I I . I M A G I N G B Y O V E R S A M P L E D B I NA RY S E N S O R S A. Diffraction Limit and Line ar Light F ie ld Mo dels In this se ction, we d escribe the bina ry imaging scheme studied in this work. Cons ider a simpliﬁed camera model shown in Fig. 1(a). W e denote by λ 0 ( x ) the inc oming light intensity ﬁeld ( i.e. , the radianc e map). By assu ming tha t light intensities rema in constan t within a short exposure period, we model the ﬁeld as only a function of the spatial variable x . W ithout loss of generality , we assume that the dimension of the sensor a rray is of one sp atial unit, i.e. , 0 ≤ x ≤ 1 . After pass ing through the optical system, the original light ﬁeld λ 0 ( x ) g ets ﬁltered by the lens, which acts like a linear system with a g i ven impulse respons e. Due to imperfections ( e.g. , aberrations) in the lens, the impulse respons e, a.k.a. the point spread function (PSF) of the o ptical system, can not be a Dirac delta, thus, imposing a limit on the resolution of the obs ervable light ﬁeld. Howe ver , a more funda mental physical limit is d ue to light dif fraction [18]. As a resu lt, even if the lens is ideal, the PS F is s till unav oidably a small blurry spot [see, for examp le, Fig. 1(b)]. In optics , su ch diffraction-li mited spot is often ca lled the Airy dis k [18] , whose rad ius R a can be computed as R a = 1 . 2 2 wf , where w is the wavelength of the light a nd f is the F-number of the optical sy stem. Example 1: At wa velength w = 420 nm ( i.e . , for blue v isible light) and f = 2 . 8 , the radius of the Airy disk is 1 . 43 µ m. T wo objects with distance smaller than R a cannot be clearly separated b y the imaging Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 5 Sensor Lens P S f r a g r e p l a c e m e n t s λ 0 ( x ) λ ( x ) (a) P S f r a g r e p l a c e m e n t s x -5 -4 -3 -2 -1 0 1 2 3 4 5 0 0.02 0.04 0.06 (b) Fig. 1. The imaging model. (a) The simpliﬁed arch itecture of a diffraction-limited imaging system. Incid ent light ﬁeld λ 0 ( x ) passes t hrough an optical lens, which acts like a linear system with a dif fraction-limited poin t spread function (PSF ). The r esult is a smoothed light ﬁ eld λ ( x ) , which is subsequently captured by the image sensor . (b) The PSF (Airy disk) of an ideal lens with a circular aperture. system as their Airy d isks on the image sensor start blurring together . Current CMOS technology ca n already make stan dard pixels smaller than R a , reaching sizes ranging from 0 . 5 µ m to 0 . 7 µ m [19]. In the case of binary sens ors, the simplicity of each pixel allo ws the feature size to be further red uced. For example, based on stand ard memory technolog y , each memory bit-cell ( i.e. , pixel) can have s izes around 50 n m [3], making it poss ible to substantially oversample the light ﬁeld. In what follo ws, we den ote by λ ( x ) the dif fraction-limit ed ( i.e. , “ob servable”) light intensity ﬁeld, which is the outcome of pa ssing the original light ﬁeld λ 0 ( x ) through the lens. Due to the lowpass (smoothing) nature of the PSF , the resu lting λ ( x ) has a ﬁnite spatial-resolution, i.e. , it ha s a ﬁ nite nu mber of d egrees of free dom p er unit spac e. Deﬁnition 1 (Linear ﬁeld model): In this work, we model the diffraction-li mited light intensity ﬁeld as λ ( x ) = N τ N − 1 X n =0 c n ϕ ( N x − n ) , (1) where ϕ ( x ) is a nonnega ti ve interpolation kernel, N is a giv en integer , τ is the exp osure time, a nd { c n : c n ≥ 0 } is a set of free variables. Remark 1: The con stant N/τ in front of the su mmation is not essen tial, but its inclusion here leads to simpler e xpres sions in our later analysis. The func tion λ ( x ) as deﬁne d in (1) has N degrees of freedom. T o guaran tee that the resulting light ﬁ elds are p hysically mean ingful, we require bo th the interpolation kernel ϕ ( x ) an d the expans ion c oefﬁcients 6 { c n } to be n onnegative. Some exa mples of the interpolation kernels ϕ ( x ) include the box func tion, β ( x ) def =      1 , if 0 ≤ x ≤ 1; 0 , otherwise , (2) cardinal B-s plines [20 ], β k ( x ) =  β ∗ . . . ∗ β | {z } ( k +1) t imes   x + k 2  , (3) and the squared sinc function, sin 2  π ( x − 1 2 )  /  π ( x − 1 2 )  2 . B. Sa mpling the Light Intensity F ield The image s ensor in Fig. 1(a) works as a sampling device of the light intensity ﬁe ld λ ( x ) . Sup pose that the sensor co nsists of M pixels p er unit sp ace, a nd that the m th pixel covers the area between [ m M , m +1 M ] , for 0 ≤ m < M . W e de note by s m the total light expos ure acc umulated on the surface area of the m th pixel within an expos ure time pe riod [0 , τ ] . Then , s m def = Z τ 0 Z ( m +1) / M m/ M λ ( x ) dx dt = τ h λ ( x ) , β ( M x − m ) i , (4) where β ( x ) is the box function deﬁned in (2) and h· , ·i repres ents the standard L 2 -inner produ ct. Su bstitute the light ﬁeld model (1) into the above equality , s m = τ h N τ X n c n ϕ ( N x − n ) , β ( M x − m ) i = X n c n h N ϕ ( N x − n ) , β ( M x − m ) i = X n c n h ϕ ( x ) , β  M ( x + n ) N − m  i , (5) where (5) is obtained through a cha nge of vari ables ( N x − n ) → x . Deﬁnition 2: The spatial oversampling factor , denoted by K , is the ratio between the number of pixels per unit spac e and the numbe r of degrees of freed om needed to specify the li ght ﬁeld λ ( x ) in (1), i.e. , K def = M N . (6) In this work, we are interested in the “oversampled” case w here K > 1 . Furthermore, we assume that K is an integer for simplicity o f n otation. Us ing (6) , an d by introducing a discrete ﬁlter g m def = h ϕ ( x ) , β ( K x − m ) i , m ∈ Z , (7) Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 7 Photon Counting Quantization Binary sensing Light exposure P S f r a g r e p l a c e m e n t s c n K g m τ / ( K N ) s m y m b m Fig. 2. The si gnal processing block diagram of the imaging model studied in this paper . In the ﬁrst step, the light exposure value s m at the m th pixel i s related t o the e xpansion co efﬁcients c n through a conc atenation of upsampling and ﬁltering operations. Subsequently , the image sensor con verts { s m } into quantized measurements { b m } (see F ig. 3 and the discussions in Section II-C for details of this second step). we c an s implify (5) as s m = X n c n g m − K n . (8) The above eq uality s peciﬁes a simple linear map ping from the expans ion co efﬁcients { c n } of the light ﬁeld to the light exposu re v alues { s m } acc umulated by the image sensor . Re aders famili ar with multirate signal proce ssing [15], [16] will immedia tely reco gnize that the relation in (8) can b e implemented via a concate nation o f upsampling and ﬁltering, as sho wn in the left part of Fig. 2. This observation c an also be veriﬁed by expressing (8) in the z -transform domain S ( z ) = X n c n z − K n G ( z ) = C ( z K ) G ( z ) , (9) and using the fact that C ( z K ) is the z -transform o f the K -fold upsampled version of c n . In Section IV, we will further study the signal processing block dia gram in Fig. 2 to de ri ve ef ﬁcient implementations of the proposed image recons truction algorithm. Example 2: The discrete ﬁlter g m is complete ly sp eciﬁed by the interpolation kernel ϕ ( x ) a nd the oversampling f actor K . As a simple case, when the kernel ϕ ( x ) = β ( x ) , we can compute from (7) that g m =      1 /K, for 0 ≤ m < K ; 0 , otherwise . (10) C. Binar y S ensing a nd On e-Bit P oiss on Statistics Fig. 3 illustrates the binary sensor mode l. Recall from (4) that { s m } deno te the expos ure v alues ac cumu- lated by the se nsor pixels. Depending o n the local values o f { s m } , each p ixel (dep icted a s “b uckets” in the ﬁgure) c ollects a dif ferent number of photons hitting on its surface. In wh at follows, we den ote by y m the number of photons impinging on the surface of the m th pixel during a n exposu re period [0 , τ ] . The relation between s m and the pho ton count y m is stoc hastic. More speciﬁc ally , y m can be modeled as realizations 8 Binary Light intensity Photon counting measurements quantization Binary P S f r a g r e p l a c e m e n t s s 0 s 1 s 2 s m y 0 y 1 y 2 y m b 0 b 1 b 2 b m m ( q = 2) m e a s u r e m e n t s Fig. 3. The model of t he binary image sensor . The pixels (sho wn as “b uckets”) collect photons, the numb ers of which are compared against a quantization threshold q . In the ﬁgure, w e illustrate the case when q = 2. T he pixel outputs are binary: b m = 1 ( i.e. , white pix els) if there are at least two photons receiv ed by the pix el; otherwise, b m = 0 ( i.e. , gray pix els). of a Poisso n random v ariable Y m , whose intensity parameter is equal to s m , i.e . , P ( Y m = y m ; s m ) = s y m m e − s m y m ! , for y m ∈ Z + ∪ { 0 } . (11) It is a well-known property of the Poiss on process that E [ Y m ] = s m . Thus, the average n umber of photons captured by a gi ven pixel is equal to the loc al light exposure s m . As a ph otosens iti ve d evice, each pixel in the image se nsor c on verts photons to electrical signals, whos e amplitude is proportional to the number of photons impinging on that p ixel. 1 In a c on ventional s ensor design, the analog electrical signals are then quantized by an A/D con verter into 8 to 14 bits (usually the more bits the be tter). In this work, we study a new sensor des ign using the followi ng binary ( i.e. , one -bit) quantization s cheme. Deﬁnition 3 (Binary Quantization): Le t q ≥ 1 b e an integer thresh old. A binary q uantizer is a mapping Q : Z + ∪ { 0 } 7− → { 0 , 1 } , such tha t Q ( y ) =      1 , if y ≥ q ; 0 , othe rwise . In Fig. 3, we il lustrate the binary qua ntization sc heme. White pixels in the ﬁg ure s how Q ( y m ) = 1 an d gray pixels show Q ( y m ) = 0 . W e de note by b m def = Q ( y m ) , b m ∈ { 0 , 1 } , the qu antized output o f the m th pixel. Since the photon counts { y m } are drawn from random variables { Y m } , s o a re the binary s ensor ou tput 1 The ex act ratio between these two quantities is determined by the quantum ef ﬁciency of the sensor . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 9 { b m } , from the random variables n B m def = Q ( Y m ) o . Intr oduc ing two functions p 0 ( s ) def = q − 1 X k =0 s k k ! e − s , p 1 ( s ) def = 1 − q − 1 X k =0 s k k ! e − s (12) we c an write P ( B m = b m ; s m ) = p b m ( s m ) , b m ∈ { 0 , 1 } . (13) Remark 2: The n oise mo del cons idered in this paper is that of Poisson noise. In practice, the performance of image sens ors is also inﬂu enced by thermal noise , which in our case can be modeled as rando m bit- ﬂipping in the binary sensor me asuremen ts. Due to s pace c onstraints, we leave further discussion s on this additional n oise s ource a nd its impact on recons truction performanc e to a follow-up work. D. Multiple Expos ures and T emporal Oversamp ling Our pre vious discuss ions focus on the case of a cquiring a s ingle frame o f quantized measurements d uring the expo sure time [0 , τ ] . As an extension, we c an co nsider multiple exposures and a cquire J con secutive and indep enden t frames. The expo sure time for ea ch frame is set to τ /J , so that the total acquisition time remains the s ame as the single exposure case . In what follows, we ca ll J the tempo ral oversampling fac tor . As be fore, we assume tha t τ ≪ 1 an d thus light intensities λ ( x ) stay c onstant within the entire acquisition time [0 , τ ] . For the j th frame ( 0 ≤ j < J ), we denote by s j,m the light exposure at the m th pixel. Follo wing the same deriv ations as in Section II-B, we can show that s j,m = 1 J X n c n g m − K n , for all j, (14) where { c n } are the expansion co efﬁcients of the light ﬁeld λ ( x ) , and g m is the d iscrete ﬁlter deﬁn ed in (7). The only difference between (14) and (8) is the extra factor of 1 /J , due to the change of exposure time from τ to τ /J . In the z -domain, s imilar to (9), S j ( z ) = 1 J C ( z K ) G ( z ) . (15) In what follo ws, we establish an equiv alence be tween temporal o versampling and s patial oversampling. More prec isely , we will sh ow that an M -pixel sensor tak ing J indepen dent exposures ( i.e. , with J -times oversampling in time) is ma thematically e quiv alent to a single sensor co nsisting of M J pixels. First, we introduce a new seque nce e s m , 0 ≤ m < M J , cons tructed by interlacing the J exposure sequen ces { s j,m } . For example, when J = 2 , the ne w s equen ce is s 0 , 0 , s 1 , 0 , s 0 , 1 , s 1 , 1 , . . . , s 0 ,m , s 1 ,m , . . . , s 0 ,M − 1 , s 1 ,M − 1 , 10 where { s 0 ,m } a nd { s 1 ,m } alternate. In general, e s m can be obtained as e s m def = s j,n , m = J n + j, 0 ≤ j < J, 0 ≤ n < M . (16) In multir ate sign al processing, the above con struction is called the polyphase r epresentation [15 ], [16], and its a lternating subsequenc es { s j,m } J − 1 j =0 the polyphase compo nents . Pr oposition 1: Let e g m be a ﬁlter whose z -trans form e G ( z ) def = G ( z J )(1 + z − 1 + . . . + z − ( J − 1) ) /J , (17) where G ( z ) is the z -transform of the ﬁlter g m deﬁned in (7). Then, e s m = X n c n e g m − K J n . (18) Pr oof: See Appendix A. Remark 3: Proposition 1 formally estab lishes the equi valence betwee n spatial and temporal o versampling. W e note that (18) h as exactly the same form as (8), a nd thus the mapp ing from { c n } to { e s m } can be implemented by the same signal proc essing operations shown in Fig. 2—we only need to c hange the upsampling factor from K to K J an d the ﬁlter from g m to e g m . In ess ence, by taking J consecutive exposures with an M -pixel sensor , we ge t the same light exposure v alues { e s m } , as if we had us ed a more densely packed sensor with M J pixels. Remark 4: T aking multiple exposures is a very e f fectiv e way to increase the total oversampling f actor of the b inary s ensing s cheme . The key assumption in our analysis is that, d uring the J conse cutiv e exposures, the light ﬁeld remains consta nt over time. T o make s ure this assumption holds for arbitrary values of J , we set the expo sure time for each frame to τ /J , for a ﬁxed and small τ . Cons equen tly , the maximum temporal oversampling f actor we c an achie ve in p ractice will be li mited by the readou t speed o f the binary sensor . Thanks to the equivalence between sp atial a nd tempo ral oversampling, we o nly nee d to focus on the single exposure c ase in o ur followi ng discus sions o n the performance of the binary sens or and image reconstruction a lgorithms. All the results we obtain extend direc tly to the multiple exposure c ase. I I I . P E R F O R M A N C E A N A L Y S I S In this s ection, we s tudy the performanc e of the binary image s ensor in es timating light intensity information, analyze the inﬂuen ce of the quantization threshold and oversampling factors, and demonstrate the new sens or’ s advantage over traditional se nsors in terms of highe r dynamic rang es. In our analysis, we assume that the light ﬁeld is piecewise con stant, i.e. , the interpo lation kernel ϕ ( x ) in (1) is the box function β ( x ) . This simplifying assumption allo ws us to deri ve closed-form expressions for sev eral important performance measures of interest. Numerical results in Se ction V suggest that the res ults and conclusions we o btain in this section applies to the general linea r ﬁ eld mo del in (1) with diff erent interpolation kerne ls. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 11 A. Th e Cram ´ er-Rao Lower Bou nd (CRLB) of Estimation V ariances From Deﬁnition 1, reco nstructing the light intens ity ﬁ eld λ ( x ) b oils down to es timating the unk nown deterministic p arameters { c n } . Inp ut to our estimation problem is a sequ ence of b inary se nsor mea surements { b m } , which are realizations of Be rnoulli random variables { B m } . The probability distributions of { B m } depend on the light exposure values { s m } , as in (13). Finally , the exposure values { s m } are linked to the light intens ity p arameters { c n } in the form of (8). Assume that the light ﬁeld λ ( x ) is piecewise constan t. W e have c omputed in Example 2 that, under this case, the discrete ﬁlter g m used in (8) is a cons tant supported within [0 , K − 1] , as in (10). The mapping (8) be tween { c n } and { s m } can now be simpliﬁed as s m = c n /K, for nK ≤ m < ( n + 1) K. (19) W e see that the parameters { c n } have disjoint regions of inﬂuence , meaning, c 0 can on ly be sens ed by a group of pixels { s 0 , . . . , s K − 1 } , c 1 by { s K , . . . , s 2 K − 1 } , and so on. Conseque ntly , the paramete rs { c n } c an be e stimated o ne-by-one, independently of each other . In what follo ws, and without loss of g enerality , we focus on estimating c 0 from the bloc k of bina ry measureme nts b def = [ b 0 , . . . , b K − 1 ] T . For notational s implicity , we will drop the subs cript in c 0 and use c instead. T o analyz e the performance of the binary sensing sc heme, we ﬁrst compute the C RLB [14], which provides a theoretical lower bou nd on the variance of any unbiased estimator . Denote by L b ( c ) the likelihood func tion of observing K binary s ensor measurement b . Then, L b ( c ) def = P ( B m = b m , 0 ≤ m < K ; c ) , = K − 1 Y m =0 P ( B m = b m ; c ) , (20) = K − 1 Y m =0 p b m ( c/K ) , (21) where (20) is due to the indep enden ce of the ph oton c ounting processes a t d if ferent pixel loca tions, a nd (21 ) follo ws from (13) and (19). Deﬁning K 1 ( 0 ≤ K 1 < K ) to be the number of “1”s in the binary s equen ce b , we can simplif y (21) as L b ( c ) =  p 1 ( c/K )  K 1  p 0 ( c/K )  K − K 1 . (22) Pr oposition 2: The CRLB of estimating the light intens ity c from K binary sens or measurements wit h threshold q ≥ 1 is CRLB bin ( K, q ) = c   q − 1 X j =0 ( q − 1)!( c/K ) − j ( q − 1 − j )!     ∞ X j =0 ( q − 1)!( c/K ) j ( q + j )!   , for c > 0 . (23) 12 Pr oof: See Appendix B. It will be interesting to compare the performance of o ur bina ry image senso r with that of an idea l se nsor which does n ot use q uantization at all. T o that end, consider the s ame situation a s before, whe re we u se K pixels to obse rve a constan t light intensity value c . The light exposu re s m at ea ch pixel is equa l to c/K , as in (19 ). Now , unlike the b inary sens or which only takes one-bit me asuremen ts, c onsider an idea l sensor that can p erfectly record the number of photon arri vals at ea ch pixel. By referring to Fig. 3, the sensor measureme nts in this case w ill be { y m } , whose probab ility distributions are gi ven in (11). In A ppendix C, we compute the CRLB of this u nquantize d sens ing scheme a s CRLB ideal ( K ) = c, (24) which is na tural and reﬂects the fact that the variance of a Poiss on random variable is eq ual to its mean ( i.e. , c , in ou r cas e). T o be sure, we alw ays have CRLB bin ( K, q ) > CRLB ideal ( K ) , for arbitrary oversampling factor K and quantization threshold q . This is n ot surprising, as we lose information by one-bit quantizations. In practice, the ratio be tween the two CRLBs provides a me asure of performanc e degrad ations incurred by the bina ry sensors . What is su rprising is that the two qua ntities can be ma de arbitraril y close, when q = 1 and K is lar ge, as sho wn by the following prop osition. Pr oposition 3: For q = 1 , CRLB bin ( K, q ) = c + c 2 2 K + O  1 K 2  , (25) which c on ver ges to CRLB ideal ( K ) as the oversampling factor K go es to inﬁnity . F or q ≥ 2 , CRLB bin ( K, q ) / CRLB ideal ( K ) > 1 . 31 , ( 26) and lim K →∞ CRLB bin ( K, q ) / CRLB ideal ( K ) = ∞ . Pr oof: Spec ializing the expression (23) for q = 1 , we get CRLB bin ( K, 1) = c  1 + c 2 K + c 2 3! K 2 + c 3 4! K 3 + . . .  , and thus (25). The statemen ts for cases wh en q ≥ 2 are sho wn in Appendix D. Proposition 3 indicates that it is feasible to use oversampling to compensate for infor mation loss du e to binary q uantizations. It follo ws from (25) that, with large ov ersampling factors, the binary sen sor operates as if there were no quantization in its measu rements. It is also important to note that this de sirable tradeof f between s patial resolution and estimation variance only works for a sing le-photon thresho ld ( i.e. , q = 1 ). For other choices of the quantization threshold, the “gap” between CRLB bin ( K, q ) and CRLB ideal ( K ) , measured in terms of their ratio, cannot b e made arbitrarily small, as sho wn in (26). In f act, it quick ly ten ds to inﬁnity as the oversampling factor K increa ses. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 13 The results of Proposition 3 ca n be intuiti vely understood as follo ws: The expected number of photons collected by each pixel during light expos ure is equal to s m = c/K . As the oversampling factor K goes to inﬁnity , the mean value of the Poisson distrib ution tends to ze ro. Consequently , most pixels on the sens or will only g et zero or one p hoton, with the p robability of rec eiving two or more photons at a pixel close to zero. In this case, with high probability , a binary quantization scheme with threshold q = 1 does not lose information. In contrast, if q ≥ 2 , the binary sensor measureme nts will be almost uniformly zero, making it nearly impossible to differentiate betwee n diff erent light intens ities. B. As ymptotic Achievability of the CR LB In what follows, we show tha t, when q = 1 , the CRLB deri ved in (23) can be as ymptotically achieved by a simple maximum likelihood estimator (MLE). Gi ven a sequence of K binary measurements b , the MLE we s eek is the parameter that maximizes the lik elihood function L b ( c ) in (22). More spec iﬁcally , b c ML ( b ) def = arg max 0 ≤ c ≤ S L b ( c ) = arg max 0 ≤ c ≤ S  1 − p 0 ( c/K )  K 1  p 0 ( c/K )  K − K 1 , (27) where we substitute p 1 ( c/K ) in (22) by its equ i valent form 1 − p 0 ( c/K ) . Th e lo wer bound of the search domain c ≥ 0 is c hosen according to physical constraints, i.e. , the light ﬁeld can not take negative values. The upp er bound c ≤ S b ecomes ne cessa ry when K 1 = K , in which case the likelihood func tion L b ( c ) = p 1 ( c/K ) K is monotonically increasing with respec t to the light intensity level c . Lemma 1: The MLE so lution to (27) is b c ML ( b ) =      K p [ − 1] 0 (1 − K 1 /K ) , if 0 ≤ K 1 ≤ K (1 − p 0 ( S/K )) , S, otherwise , (28) where p [ − 1] 0 ( x ) is the in verse function of p 0 ( x ) . Remark 5: From the deﬁn ition in (12), we can easily verify that d dx p 0 ( x ) < 0 for all x > 0 . It follows that the function p 0 ( x ) is strictly de creasing for x > 0 and that the in verse p [ − 1] 0 ( x ) is well-deﬁne d. For example, when q = 1 , we h ave p 0 ( x ) = e − x and thus p [ − 1] 0 ( x ) = − log( x ) . In this particular case, and for K 1 ≪ K , we h ave b c ML ( b ) = − K log(1 − K 1 /K ) ≈ K 1 . It follows that we c an use the sum of the K binary me asuremen ts as a ﬁrst-order a pproximation of the light intensity estimation. Pr oof: At the two extreme cases, w hen K 1 = 0 or K 1 = K , it is easy to s ee that (28) is indeed the solution to (27). Next, we a ssume that 0 < K 1 < K . Computing the d eri vati ve of L b ( c ) an d setting it to ze ro, we can verify that the equation d dc L b ( c ) = 0 has a single solution at b c max = K p [ − 1] 0 (1 − K 1 /K ) . 14 Since L b ( c ) ≥ 0 an d L b (0) = lim c →∞ L b ( c ) = 0 , we conclude that the likelihood function L b ( c ) ac hiev es its ma ximum v alue at b c max . Finally , the MLE so lution b c ML = min { b c max , S } , and thus , we hav e (28 ). Theorem 1: When q = 1 , we have E [ b c ML ( b )] = c + ε 1 + O (1 /K ) , for c < S − 2 , (29) where | ε 1 | ≤ 2 c e 1 − c  ec S − 1  S − 1 . Meanwhile, the mea n squared error (MSE) of the estimator approac hes CRLB ideal , i.e . , E  ( b c ML ( b ) − c ) 2  = c + ε 2 + O (1 /K ) , for c < S − 2 , (30) where | ε 2 | ≤ 2 c ( c + 1) e − c  ec S − 2  S − 2 . Remark 6: It is e asy to verify tha t, for ﬁxed c , the two terms ε 1 and ε 2 con ver ge (very q uickly) to 0 as S tends to inﬁnity . It then follows from (29) and (30) that the MLE is a symptotically unbias ed and ef ﬁcient, in the sen se that lim S → ∞ lim K →∞ E [ b c ML ( b )] = c an d lim S → ∞ lim K →∞ E  ( b c ML ( b ) − c ) 2  = c. W e leave the formal proof of this theo rem to Ap pendix E. Its main idea can be summarized as follows. As K goes to inﬁnity , the area of ea ch pixel tends to z ero, s o doe s the average numbe r of ph otons arri ving at that pixel. As a resu lt, most pixels o n the se nsor will get on ly ze ro or o ne photon d uring exposu re. A single-photon binary q uantization scheme c an record perfectly the pattens of “0”s a nd “1”s o n the se nsor . It lose s information only when a pixel rec eiv es two or more p hotons, but the probability of such ev ents tends to zero as K increa ses. Suppose , now , that we us e a qu antization threshold q ≥ 2 . In this case, as K tends to inﬁnity , the binary response s of diff erent p ixels will almost always be “0” , e ssentially obfuscating the a ctual light intensity values. This problem leads to p oor performance in the MLE. As stated in the following propo sition, the asymptotic MSE for q ≥ 2 bec omes c 2 instead of c . Pr oposition 4: When q ≥ 2 , the MLE is asy mptotically biase d , that is, for any ﬁxed c and S , lim K →∞ E [ b c ML ( b )] = 0 . (31) Meanwhile, the MSE becomes lim K →∞ E  ( b c ML ( b ) − c ) 2  = c 2 . (32) Pr oof: See Appendix F. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 15 10 1 10 2 10 3 10 4 10 5 10 6 0 10 20 30 40 50 60 Light Exposure Values SNR(dB) Fig. 4. Performance comparisons of three different sensing schemes (“BIN”, “IDEAL ”, and “SA T ”) over a wide range of light expo sure v alues c (shown in logarithmic scale). The dash-dot line (in red) represents the “IDEAL ” scheme with no quantization; The solid li ne (in blue) correspond s t o the “SA T” scheme with a saturation point set at C max = 9130 [21]; The four dashed lines (in black) correspond to the “BIN” scheme with q = 1 and different ov ersampling factors (from left to right, K = 2 13 , 2 14 , 2 15 and 2 16 , respecti vely). C. Ad vantages over T raditional Senso rs In wha t follo ws, we de monstrate the advantage of the oversampled binary sensing sc heme, denoted by “BIN”, in ac hieving high er dynamic ranges. W e focus on the case where the q uantization threshold is s et to q = 1 . For comparisons , we a lso c onsider the follo wing two a lternati ve sens ing sch emes: The ﬁrst, deno ted by “IDEAL ”, uses a s ingle pixel to estimate the light exposure parameter ( i.e. , nonoversampled), b ut that pixel can rec ord pe rfectly the number of photon arriv als during exposure; The sec ond sche me, denoted b y “SA T”, is very similar to the ﬁrst, with the ad dition of a saturation po int C max , beyon d which the pixel can ho ld no more photons. Note that in our discuss ions, the “SA T” sc heme serves as an idea lized model of co n ventional image senso rs, for which the saturation is cau sed by the limited full-well capa city of the semicondu ctor device. The general trend of con ventional image se nsor design has been to pack more pixels per chip by reducing pixel s izes, lead ing to lo wer full-well cap acities and thus lower sa turation values. Fig. 4 comp ares performances of the three dif ferent s ensing sc hemes ( i.e. , “ BIN”, “ IDEAL ”, and “SA T”) over a wide range of light exposure values. W e measure the performances in terms of s ignal-to-noise ratios (SNRs), deﬁned as SNR = 10 log 10 c 2 E [( b c − c ) 2 ] , where b c is the estimation of the light exposure value we obtain from each of the sensing sch emes. W e obse rve tha t the “IDEAL ” sch eme (the red dash-dot line in the ﬁgure) represents an upper-bound of the es timation p erformance. T o s ee this, denote by y the number of photons that arri ve at the pixel during 16 exposure. Then y is a realization of a Po isson ran dom v ariable Y with intensity equal to the light exposure value c , i.e . , P ( Y = y ; c ) = c y e − c y ! . Maximizing this function over c , we can c ompute the MLE for the “ IDEAL ” sc heme as b c IDEAL ( y ) = y . It is easy to verify that this estimator is unbias ed, i.e. , E [ b c IDEAL ( Y )] = E [ Y ] = c , and that it ac hieves the ideal CRLB in (24), i.e. , var ( b c IDEAL ( Y )) = var ( Y ) = c . Accordingly , w e ca n c ompute the SNR as SNR IDEAL = 10 log 10 ( c 2 /c ) = 10 log 10 ( c ) , which a ppears a s a straight line in our ﬁgure with the light exposure v alues c shown in log arithmic s cale. The solid line in the ﬁgure correspon ds to the “SA T” scheme, with a sa turation p oint set a t C max = 91 30 , which is the full well cap acity of the image sensor reported in [21]. The sensor measurement in this cas e is y SA T def = min { y , C max } , a nd the estimator we use is b c SA T ( y SA T ) = y SA T . (33) W e c an se e tha t the “ SA T” sc heme initially has the same performanc e a s “IDEAL ”. It remains this way until the light exposure value c approache s the saturation point C max , after which there is a drastic drop 2 in SNR. Denoting by SNR min the minimum acce ptable SNR in a giv en application, we ca n then deﬁne the dynamic range of a sens or as the range of c for which the sens or achie ves at lea st SNR min . For example, if we choos e SNR min = 20 d B, then, as sh own in the ﬁgu re, the SA T sc heme has a dyna mic range from c = 10 2 to c ≈ 10 4 , or , if mea sured in terms of ratios, 100 : 1 . Finally , the three dashe d lines re present the “BIN” scheme with q = 1 and increasing oversampling factors (from left to right: K = 2 13 , 2 14 , 2 15 and 2 16 , resp ectiv ely). W e use the MLE giv en in (28) a nd plot the corresponding estimation SNRs . W e see that, within a lar ge rang e o f c , the pe rformance o f the “BIN” scheme is very close to that of the “IDEAL ” s cheme that does no t use quantization. This veriﬁes our analysis in Theorem 1 , which states that the “BIN” sche me with a single-pho ton threshold can a pproach the ideal unquantized CRLB when the ov ersampling factor is large enough. F urthermore, when compared with the “SA T” scheme, the “ BIN” scheme h as a more gradual decrease in SNR whe n the light expos ure values inc rease, and has a higher dyna mic range. For example, wh en K = 2 16 , the dyn amic range of the “BIN” scheme spans from c = 10 2 to c = 10 5 . 8 , about two orders of magnitude higher than that of “S A T”. In Section V, we will prese nt a nume rical experiment tha t p oints to a potential a pplication of the binary sensor in high dynamic range photography . 2 The estimator in (33) is biased around c = C max . Fo r a very narro w range of light intensity v alues centered around C max , the MSE of this biased estimator is lower t han the i deal CRLB. Thus, there is actually a short “spike” in S NR right before the drop. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 17 Remark 7: Note that K is the product o f the spatial oversampling factor and the temporal oversampling factor . For example, the pixel pitch of the imag e sens or reported in [21] is 1 . 6 5 µm . If the binary sen sor is buil t on me mory ch ip techn ology , with a pitch size of 50 n m [3], then the maximum sp atial oversampling factor is abou t 1089 . T o achieve K = 2 13 , 2 14 , 2 15 and 2 16 , resp ectiv ely , a s required in Fig. 4, we then need to have temp oral oversampling factors ranging from 8 to 60 . Un like traditional sensors which require multi-bit quantize rs, the binary sens ors only nee d one -bit c omparators. This s implicity in hardware c an potentially lead to f aster readout spe eds, mak ing it practical to apply temporal oversampling. I V . O P T I M A L I M AG E R E C O N S T RU C T I O N A N D E FFI C I E N T I M P L E M E N T A T I O N S In the pre vious sec tion, we studied the performance of the binary image sens or , and deriv ed the MLE for a p iecewise-constant light ﬁe ld model. Ou r analysis es tablishes the op timality of the MLE, showing that, with s ingle-photon thresholding and large oversampling factors, the MLE approac hes the performance of an ideal sensing sc heme without q uantization. In this section, we extends the MLE to the general linear ﬁeld mo del in (1), with arbitrary interpolation kernels. As a main result of this work, we s how that the log-likelihood function is always conc av e. This desirable property g uarantees the global con vergence of iterati ve n umerical a lgorithms in solving the MLE. A. Image Reconstruction by MLE Under the linear ﬁeld model introdu ced in Deﬁnition 1, reconstructing an image [ i.e. , the light ﬁeld λ ( x ) ] is equiv alent to estimating the p arameters { c n } in (1). As shown in (8), the light exposu re values { s m } at different sen sors are related to { c n } throug h a linear mapping, implemented as u psampling followed by ﬁltering a s in Fig. 2. Since it is linear , the mapping (8) can b e written as a matrix-vector multiplication s = G c , (34) where s def = [ s 0 , s 1 , . . . , s M − 1 ] T , c def = [ c 0 , c 1 , . . . , c N − 1 ] T , a nd G is an M × N matrix represe nting the combination of upsa mpling (by K ) and ﬁltering (by g m ). Ea ch element of s can then be written a s s m = e T m Gc , (35) where e m is the m th standard Euc lidean bas is vector . 3 Remark 8: In using the a bove n otations, we do not d istinguish between single exposure a nd multiple exposures, whose equiv alence has been established by Propos ition 1 in Section II-D. In the case of multiple exposures, the e ssential structure of G —upsampling follo wed by ﬁltering—remains the same. All we need 3 Here we use zero-based indexing. Thus, e 0 def = [1 , 0 , . . . , 0] T , e 1 def = [0 , 1 , . . . , 0] T , and so on. 18 to do is to replace s by the interlaced se quenc e { e s m } cons tructed in (16), the oversampling f actor K by K J , and the ﬁlter g m by e g m in (17). Similar to o ur deriv ations in (20) and (21), the likelihood function giv en M binary meas urements b def = [ b 0 , b 1 , . . . , b M − 1 ] T can be computed as L b ( c ) = M − 1 Y m =0 P ( B m = b m ; s m ) = M − 1 Y m =0 p b m ( e T m Gc ) , ( 36) where (36) follows fr om (12) and (35). In our subsequent dis cussion s, it is more con venient to work with the log-li kelihood function, deﬁne d as l b ( c ) def = log L b ( c ) = M − 1 X m =0 log p b m ( e T m Gc ) . (37) For any gi ven o bservation b , the MLE we seek is the parameter that ma ximizes L b ( c ) , or equi valently , l b ( c ) . Speciﬁcally , b c ML ( b ) def = arg max c ∈ [0 ,S ] N l b ( c ) = arg max c ∈ [0 ,S ] N M − 1 X m =0 log p b m ( e T m Gc ) . (38) The cons traint c ∈ [0 , S ] N means that every parameter c n should satisfy 0 ≤ c n ≤ S , for some prese t maximum value S . Example 3: As discu ssed in Section III, wh en the light ﬁeld is piecewise-cons tant, different light ﬁeld parameters { c n } can be estimated independen tly . In tha t case, the likelihood fun ction h as only one variable [see (22 )] a nd can b e e asily visualized. In Fig. 5, we plot L b ( c ) in (22) and the c orrespond ing log-likelihood function l b ( c ) , un der diff erent choices of the quantization thresho lds. W e obse rve from the ﬁgu res that the likelihood func tions are not concav e, but the log-li kelihood functions ind eed are. In what follo ws, we will show that this result is general, namely , the log-likeli hood functions in the form of (37) are always concave. Lemma 2: For any tw o integers i, j suc h that 0 ≤ i ≤ j < ∞ or 0 ≤ i < j = ∞ , the fun ction log j X k = i x k e − x k ! is concav e on the interval x ∈ [0 , ∞ ) . Pr oof: See Appendix G. Theorem 2: For arbitrary binary sensor measurements b , the log-lik elihood function l b ( c ) deﬁned in (37) is concav e on the doma in c ∈ [0 , S ] N . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 19 0 40 80 120 160 200 0 1 2 3 4 x 10 −3 P S f r a g r e p l a c e m e n t s c L b ( c ) ℓ b ( c ) q = 1 q = 3 q = 5 (a) 0 40 80 120 160 200 −150 −100 −50 0 P S f r a g r e p l a c e m e n t s c L b ( c ) ℓ b ( c ) q = 1 q = 3 q = 5 (b) Fig. 5. The likelihood and log-likelihood functions for piecewise-co nstant li ght ﬁelds. (a) The likelihood functions L b ( c ) , deﬁned in (22), und er different choices of the quantization thresholds q = 1 , 3 , 5 , respecti vely . (b) The corresponding log-likelihood func tions. In computing these functions, we set the parameters in (22) as follows: K = 12 , i.e. , the sensor is 12 -times ov ersampled. T he binary sensor measurem ents contain 10 “1”s, i.e. , K 1 = 10 . Pr oof: It follo ws from the deﬁn ition in (12) that, for any b m ∈ { 0 , 1 } , the function log p b m ( s ) is either log q − 1 X k =0 s k e − s k ! or log ∞ X k = q s k e − s k ! . (39) W e can a pply Lemma 2 in b oth c ases, and show t hat { log p b m ( s ) } are concave f unctions for s ≥ 0 . Since the sum of con cave func tions is still concave and the c omposition of a concave function with a linear mapping ( s m = e T m Gc ) is still c oncave, we conc lude that the log -likelihood function deﬁned in (37) is concave. In g eneral, there is no closed -form solution to the maximization problem in (38). An MLE solution ha s to b e found through nume rical algorithms. Th eorem 2 guarantees the global conv ergence o f the se iterativ e numerical me thods. B. Iterative Algorithm and Ef ﬁcient Implementations W e compu te the numerical solution of the MLE by us ing a standard gradient as cent method. Den ote by c ( k ) the estimation of the unknown parameter c at the k th step. The e stimation c ( k +1) at the next step is obtained by c ( k +1) = P D  c ( k ) + γ k ▽ l b ( c ( k ) )  , (40) where ▽ l b ( c ( k ) ) is the gradient of the log-likelihood function ev aluated at c ( k ) , γ k is the step-siz e at the current iteration, and P D is the projection onto the search do main D def = [0 , S ] N . W e ap ply P D to ensure that all estimations of c lie in the s earch d omain. 20 T aking the deri vati ve of the log-likelihood function l b ( c ) in (37), we can compute the gradient as ▽ l b ( c ( k ) ) = G T h D b 0 ( s ( k ) 0 ) , D b 1 ( s ( k ) 1 ) , . . . , D b M − 1 ( s ( k ) M − 1 ) i T , (41) where s ( k ) def = [ s ( k ) 0 , . . . , s ( k ) M − 1 ] T = Gc ( k ) is the current estimation of the light exposure v alues , and D b ( s ) def = d ds log p b ( s ) for b = 0 , 1 . For example, wh en q = 1 , we have p 0 ( s ) = e − s and p 1 ( s ) = 1 − e − s . In this c ase, D 0 ( s ) = − 1 a nd D 1 ( s ) = 1 / (1 − e − s ) , res pectively . The cho ice of the s tep size γ k has signiﬁcan t inﬂue nce over the s peed of co n ver gence of the above iterati ve algorithm. W e follo w [9] by ch oosing, a t each step , a γ k so that the gradient vectors at the current and the next iterations a re approximately orthogonal to e ach other . By assuming that the estimates s ( k +1) and s ( k ) at conse cutiv e iterations are close to each other , we can u se the follo wing ﬁrst-order approximation D b ( s ( k +1) m ) ≈ D b ( s ( k ) m ) + H b ( s ( k ) m )( s ( k +1) m − s ( k ) m ) , where H b ( s ) def = d ds D b ( s ) = d 2 ds 2 log p b ( s ) , for b = 0 , 1 . It foll ows that ▽ l b ( c ( k +1) ) = G T h D b 0 ( s ( k +1) 0 ) , D b 1 ( s ( k +1) 1 ) , . . . , D b M − 1 ( s ( k +1) M − 1 ) i ≈ ▽ l b ( c ( k ) ) + G T diag n H b 0 ( s ( k ) 0 ) , H b 1 ( s ( k ) 1 ) , . . . , H b M − 1 ( s ( k ) M − 1 ) o ( s ( k +1) − s ( k ) ) . (42) Assuming tha t the gradient upd ate c ( k ) + γ k ▽ l b ( c ( k ) ) is inside of the constraint set D , we can neglec t the projection o perator P D in (40), and write s ( k +1) − s ( k ) = G ( c ( k +1) − c ( k ) ) = γ k G ▽ l b ( c ( k ) ) . Substituting the above equality into (42), we get ▽ l b ( c ( k +1) ) ≈ ▽ l b ( c ( k ) ) + γ k G T diag n H b 0 ( s ( k ) 0 ) , H b 1 ( s ( k ) 1 ) , . . . , H b M − 1 ( s ( k ) M − 1 ) o G ▽ l b ( c ( k ) ) . Finally , by requiring that ▽ l b ( c ( k +1) ) be orthogon al to ▽ l b ( c ( k ) ) , we compute the optimal step s ize a s γ k = k ▽ l b ( c ( k ) ) k 2     diag  q − H b 0 ( s ( k ) 0 ) , . . . , q − H b M − 1 ( s ( k ) M − 1 )  G ▽ l b ( c ( k ) )     2 . (43) Remark 9: By deﬁnition, H b ( s ) (for b = 0 , 1 ) are the s econd-orde r deriv ati ves of conc ave fun ctions (see Lemma 2), a nd are thus no npositiv e. Co nseque ntly , the terms p − H b ( s ) in the denominator of (43) are well-deﬁned. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 21 P S f r a g r e p l a c e m e n t s a m u m K g m G (a) P S f r a g r e p l a c e m e n t s b m v m K g − m G T (b) P S f r a g r e p l a c e m e n t s a m u 0 ,m u 1 ,m u K − 1 ,m g 0 ,m g 1 ,m g K − 1 ,m (c) P S f r a g r e p l a c e m e n t s v m b 0 ,m b 1 ,m b K − 1 ,m g 0 , − m g 1 , − m g K − 1 , − m (d) Fig. 6. Signal processing implementations of Ga and G T b . (a) T he product Ga can be obtained by upsampling followed by ﬁltering. (b) The product G T b can be obtained by ﬁltering follo wed by do wnsampling . Note that the ﬁl ter used in (b) is g − m , i.e. , the “ﬂipped” version of g m . (c) The polyphase domain implementation of (a). (d) The polyph ase domain implementation of (b). At every iteration of the gradient a lgorithm, we need to upd ate the gradient a nd the step size γ k . W e see from (41) and (43 ) tha t the compu tations always in volv e matrix-v ector products in the form of Ga and G T b , for some vectors a , b . The matrix G is of size M × N , wh ere M is the total number of pixels. In practice, M will be in the range of 10 9 ∼ 10 10 ( i.e. , gigapixel per chip), making it impossible to directly implement the matrix op erations. Fortunately , the matrix G used in both formulae is highly structured, and can be implemented as ups ampling follo wed by ﬁltering (see ou r discussions in Section II-B and the expression (8) for details). Similarly , the transpose G T can be implemen ted by ﬁltering (by g − m ) follo wed by do wnsa mpling, esse ntially “ ﬂipping” all the operations in G . Fig. 6(a) and Fig. 6(b) s ummarizes these operations. W e no te that the implementations illustrated in Fig. 6(a) a nd F ig. 6(b) are n ot yet optimized: For example, the input to the ﬁlter g m in Fig. 6(a) is an upsampled sequence, containing mostly zero elements; I n Fig. 6(b), we compute a full ﬁltering operation (by g − m ), only to discard most of the ﬁ ltering res ults in the subsequ ent downsampling step. All these deﬁciencies c an be eliminated b y using the tool of po lyphase representations from multirate signal processing [15], [16], as follo ws. First, we split the ﬁlter g m into K non-overlapping p olyphas e compo nents g 0 ,m , g 1 ,m , . . . , g K − 1 ,m , deﬁne d as g k ,m = g K m + k , for 0 ≤ k < K. (44) Intuiti vely , the polyp hase comp onents spec iﬁed in (44) a re simply downsampled versions of the original 22 ﬁlter g m , with the s ampling locations of all these polyphas e compo nents forming a complete partition. Th e mapping b etween the ﬁlter g m and its po lyphase compo nents is one-to-one. T o rec onstruct g m , we ca n easily verify that, in the z -domain, G ( z ) = G 0 ( z K ) + z − 1 G 1 ( z K ) + . . . z − ( K − 1) G K − 1 ( z K ) . (45) Follo wing the sa me s teps a s above, we c an also sp lit the s equenc es u m and b m in Fig. 6 into their re spective polyphas e componen ts u 0 ,m , u 1 ,m , . . . , u K − 1 ,m and b 0 ,m , b 1 ,m , . . . , b K − 1 ,m . Pr oposition 5: Denote by U k ( z ) a nd B k ( z ) (for 0 ≤ k < K ) the z -transforms o f u k ,m and b k ,m , respectively . Th en, U k ( z ) = A ( z ) G k ( z ) , for 0 ≤ k < K, (46) and V ( z ) = K − 1 X k =0 B k ( z ) G k ( z − 1 ) . (47) Pr oof: See Appendix H. The results of Prop osition 5 require some further exp lanations. What equation (46 ) su ggests is an alternati ve implementation of Ga , as shown in Fig. 6(c). W e compute K parallel c on voluti ons between the input a m and the p olyphase ﬁlters { g k ,m } . The c hanne l outpu ts are the polypha se components { u k ,m } , which can be combined to form the desired output u m . Similarly , it follows from (47) that G T b can be implemented by the parallel ﬁltering scheme in Fig. 6(d). The new implemen tations in Fig. 6(c) and Fig. 6(d) a re signiﬁcan tly faster than their respective coun- terparts. T o see this, suppos e that the ﬁlter g m has L coefﬁcients. It is ea sy to see that the original implementation in Fig. 6(a) requires O ( K L ) arit hmetic operations for e very pixel in a m . In contrast, eac h indi vidual chan nel in Fig. 6(c) requires only O ( L/K ) arithmetic o perations (due to the sho rter supports of the polyph ase ﬁlters), and thus the total co st of Fig. 6(c) s tays at O ( L ) op erations per pixel. This represents a K -fold reduction in computationa l complexities. A similar a nalysis also shows that Fig. 6(d) ne eds K - times fe wer o perations than Fig. 6(b). Recall that K is the oversampling factor of o ur image se nsor . As we operate in highly o versampled regi mes ( e.g. , K = 1024 ) to compens ate for information los s due to one-bit quantizations, the above improvements ma ke our algorithms orders of magnitude faster . V . N U M E R I C A L R E S U LT S W e present s everal numerical results in this section to verify ou r theoretical analysis a nd the eff ectiveness of the propos ed image reconstruction algorithm. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 23 P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (a) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (b) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (c) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (d) Fig. 7. Binary sensing and reconstructions of 1-D light ﬁelds. (a) The original light ﬁ eld λ ( x ) , modeled as a linear combination of shifted spline kernels. (b) The reconstruction result obtain ed by the proposed MLE-based algorithm, using measurements taken by a sensor with spatial ov ersampling factor K = 256 . (c) An improved reconstruction result due to the use of a larger spatial ov ersampling factor K = 2048 . (d) An alternati ve result, obtained by keeping K = 256 but taking J = 8 consecuti ve exposures. A. 1-D Synthetic Signals Consider a 1 -D light ﬁeld λ ( x ) s hown in Fig. 7 (a). The interpolation ﬁlter ϕ ( x ) we use is the cubic B-spline function β 3 ( x ) deﬁned in (3). W e c an see that λ ( x ) is a linea r combination of the shifted kernels, with the expansion co efﬁcients { c n } shown a s b lue do ts in the ﬁgure. W e simulate a bina ry s ensor with threshold q = 1 an d oversampling factor K = 256 . Applying the proposed MLE-bas ed algorithm in Section IV, we obtain a rec onstructed light ﬁeld (the red dashed curve) shown in Fig. 7(b), tog ether w ith the original “ ground truth” (the blue solid curve). W e observe that the low- light regions are well-recons tructed but there exist large “overshoots” in the high-light regions. W e can substan tially improve the recon struction qua lity b y inc reasing the oversampling factor of the sensor . Fig. 7(c) shows the res ult obta ined by increa sing the spatial oversampling factor to K = 2048 . Alternati vely , we show in Fig. 7 (d) a dif ferent reconstruction resu lt obtained by keeping the original spatial ov ersampling f actor at K = 256 , b ut taking J = 8 cons ecutive exposu res. V isually , the two sensor conﬁgurations , i.e. , K = 2048 , J = 1 a nd K = 256 , J = 8 , lead to very simi lar recons truction performances. This observation agrees with o ur pre vious theoretical ana lysis in Sec tion II -D on the equi valence between spatial a nd temp oral o versampling sche mes. 24 (a) (b) (c) Fig. 8. High dynamic range photography using the bina ry sensor . (a) A sequen ce of i mages taken inside of a church w ith decreasing expo sure times [22]. (b) The reconstructed high dynamic range radiance map (in logarithmic scales) using our MLE reconstruction algorithm. (c) The tone-ma pped version of the recon structed radiance map. B. Ac quiring Scenes with High Dynam ic Ranges A well-known difﬁculty in photograp hy is the limited d ynamic ran ges of the image s ensors. Capturing both very bright an d very dark regions fait hfully in a single image is dif ﬁcult. For example, Fig. 8(a) s hows several images taken inside of a church with dif ferent exposure times [22]. The scene contains both sun-lit areas and s hadow regions, with the former over a thousa nd times brighter than the latter . Suc h h igh dynamic ranges a re well-beyond the capabilities of c on ventional image sen sors. As a result, thes e images a re either overexposed or underexpose d, with no single image rendering details in both areas. In light of this problem, an activ e area of research in compu tational photograp hy is to recons truct a h igh d ynamic ran ge radiance map by combining multiple imag es with dif ferent exposure se ttings (see, e.g. , [22], [23]). While produ cing succe ssful results, such multi-exposure approa ches can be time-cons uming. In Sec tion III -C , we have shown that the bina ry sensor studied in this work can achiev e highe r dynamic ranges than con ventional image s ensors . T o de monstrate this a dvantage, we use the high d ynamic range radiance map obtained in [22] as the ground truth data [ i.e. , the light ﬁe ld λ ( x ) as deﬁne d in (1)], and simulate the acquis ition of this scene by using a binary sensor with a single photon threshold. The spatial oversampling factor of the binary se nsor is set to 32 × 32 , and the tempo ral oversampling factor is 256 ( i.e. , 256 independent frames ). Similar to o ur p revious experiment on 1 -D signals, we use a cubic B-spline kernel [ i.e. , ϕ ( x ) = β 3 ( x ) ] along e ach of the spatial dimensions. Fig. 8(b) sho ws the reconstructed radiance map using our algorithm described in S ection IV. Since the radiance map has a dynamic range of 3 . 3 × 10 5 : 1 , the image is shown in logarithmic scale. T o ha ve a visu ally more pleasing result, we also s hown in Fig. 8(c) a tone-mapp ed [23] version of the reconstruction. W e ca n see from Fig. 8(b) and Fig. 8(c) tha t details in both light and shadow regions have be en faithfully preserved in the recon structed radiance map, su ggesting Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 25 Fig. 9. Image reconstruction from the binary measurements taken by a S P AD sensor [17], with a spatial resolution of 32 × 32 pixels. The ﬁnal image (lower-right corner) is obtained by incorpora ting 4096 consecutiv e frames, 50 of w hich are sho wn i n the ﬁgure. the potential application of the binary se nsor in high dyn amic ran ge p hotography . C. Re sults o n Re al Sensor Data W e have also applied our recon struction algorithm to images taken by an expe rimental se nsor based on single photon avalanche d iodes (SP ADs) [17]. The senso r has binary-valued pixels with single-photon sensitiviti es, i.e. , the quantization threshold is q = 1 . Du e to its experimental nature, the sensor has limited spatial resolution, con taining an array of only 3 2 × 32 detectors. T o emulate the ef fect of spatial oversampling, we apply temporal oversampling and acquire 4096 independ ent bina ry frames of a s tatic scene . In this case, we can estimate the light inten sity at each p ixel indep enden tly by using the closed-form MLE solution in (28). Fig. 9 shows 50 such binary images, tog ether with the ﬁnal rec onstruction result (at the lower -right corner). Th e qu ality o f rec onstruction veri ﬁes o ur theoretical model and analys is. V I . C O N C L U S I O N S W e presented a the oretical study of a new image sens or that acquires light information us ing one-bit pixels—a scheme reminiscent of traditional photograph ic ﬁlm. By formulating the b inary sensing sc heme as a parameter estimation problem based on quan tized Poisson s tatistics, we analyz ed the p erformance of the binary senso r in acqu iring light intensity information. Our analysis shows that, with a single-photon quantization threshold a nd lar ge oversampling factors, the b inary sensor pe rforms much li ke an ideal s ensor , as if there were n o quantiza tion. T o recover the light ﬁe ld from b inary s ensor measureme nts, we prop osed an MLE-based image reconstruction algorithm. W e showed tha t the corresponding log-likelihood function is alw ays c oncave, thus gu aranteeing the global con ver gence of numerical so lutions. T o solve for the MLE, we adop t a standard gradient method, an d derive efﬁcient implementations using fast signa l process ing algorithms in the polyphas e d omain. Finally , we presen ted n umerical results on both sy nthetic data an d images taken by a prototype senso r . Th ese results verify our theo retical ana lysis and demo nstrate the eff ectiveness of o ur image rec onstruction a lgorithm. They a lso point to the poten tial of the new binary sensor in high dynamic range photography applications . 26 A P P E N D I X A. Proof of Proposition 1 The sequence e s m in (16) can be written, equiv alently , as e s m = P J − 1 j =0 P M − 1 n =0 s j,n δ m − J n − j , where δ l is the Kronecker delta function. T aking z -transforms on both sides o f the equality lead s to e S ( z ) = J − 1 X j =0 z − j M − 1 X n =0 s j,n z − J n = J − 1 X j =0 z − j S j ( z J ) . (48) By substituting (15) into (48) and us ing the deﬁnition (17) , we can simplify (48) as e S ( z ) = C ( z K J ) e G ( z ) . (49) Finally , since C ( z K J ) is the z -transform of the se quenc e P n c n δ m − K J n , it follows fr om (49) that e s m = ( P n c n δ m − K J n ) ∗ e g m , and thus (18). B. Th e CRL B of Binary Sensors W e ﬁrst compute the Fisher infor mation, deﬁned a s I ( c ) = E [ − ∂ 2 ∂ c 2 log L b ( c )] . Using (22), we ge t I ( c ) = E  − ∂ 2 ∂ c 2  K 1 log p 1 ( c/K ) + ( K − K 1 ) log p 0 ( c/K )   = E " K 1  p ′′ 0 ( c/K ) p 1 ( c/K ) + p ′ 0 ( c/K ) 2  K 2 p 1 ( c/K ) 2 − ( K − K 1 )  p ′′ 0 ( c/K ) p 0 ( c/K ) − p ′ 0 ( c/K ) 2  K 2 p 0 ( c/K ) 2 # , (50) where p ′ 0 ( x ) = d dx p 0 ( x ) a nd p ′′ 0 ( x ) = d 2 dx 2 p 0 ( x ) a re the ﬁrst and second order deriv ati ve of p 0 ( x ) , respecti vely . In reaching (50) , we have also used the fact that p 1 ( x ) = 1 − p 0 ( x ) and thus p ′ 0 ( x ) = − d dx p 1 ( x ) and p ′′ 0 ( x ) = − d 2 dx 2 p 1 ( x ) . Note that K 1 = P 0 ≤ m 0 . So, for x ≥ 1 , (57) is greater tha n  Z ∞ −∞ e − 1 2 u 2 / p 4 + u 2 du  2 ≈ 1 . 3 1 . (58) For 0 ≤ x < 1 , we can ob tain the following inequa lities by keeping the ﬁrst two terms in p 0 ( x ) a nd the ﬁrst te rm in p 1 ( x ) : p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ (1 + x ) e − 2 x x q /q ! x 2 q − 1 e − 2 x / ( q − 1)! 2 = ( q − 1)!( x 1 − q + x 2 − q ) q ≥ 2( q − 1)! q . It is eas y to see that 2( q − 1)! q is a monoton ically increasing function with respect to q ≥ 2 . Therefore, 2( q − 1)! q ≥ 4 3 ≈ 1 . 3 3 f or q ≥ 3 . (59) Finally , for q = 2 , we k eep the ﬁrst two terms in p 0 ( x ) and p 1 ( x ) a nd g et p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ (1 + x ) e − 2 x  x 2 / 2 + x 3 / 6  x 3 e − 2 x = x/ 6 + 2 / 3 + 1 2 x ≥ 1 . 3 3 f or 0 ≤ x ≤ 1 . (60) Combining (58 ), (59) and (60), we reach the desired resu lt in (26). Finally , we show that, for q ≥ 2 , the “gap” betwee n CRLB bin ( K, q ) and CRLB ideal ( K ) will on ly get bigger as the oversampling factor K grows. T o that end , we notice that when K → ∞ , the variable x = c/K → 0 . Keeping the ﬁrst terms in p 0 ( x ) a nd p 1 ( x ) , we have p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ e − 2 x x q /q ! x 2 q − 1 e − 2 x / ( q − 1)! 2 = ( q − 1)! x 1 − q q . For q ≥ 2 , the a bove quantity go es to inﬁnity as x → 0 . Therefore, lim K →∞ CRLB bin ( K, q ) / CRLB ideal ( K ) = ∞ . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 29 E. Proof of Th eorem 1 When q = 1 , we ha ve p 0 ( x ) = e − x and thus p [ − 1] 0 ( x ) = − log ( x ) . In this case, the MLE solution in (28) can be rewri tten as b c ML ( b ) =      − K log (1 − K 1 /K ) , if 0 ≤ K 1 ≤ K (1 − e − S/K ) , S, otherwise . W e note that − K log(1 − K 1 /K ) = K 1 + K 2 1 2 K + K 3 1 3 K 2 + . . . a nd that lim K →∞ K (1 − e − S/K ) = S . Thu s, for s ufﬁciently lar ge K , the a bove MLE solution can be further simpliﬁed as b c ML ( b ) =      K 1 + O ( 1 K ) , if 0 ≤ K 1 < S, S, otherwise . (61) W ithout loss of ge nerality , we assu me that S is an integer in what follows. The expe cted value of the MLE then b ecomes E [ b c ML ( b )] = S − 1 X n =0 n P ( K 1 = n ) + S K X n = S P ( K 1 = n ) + O (1 /K ) . Using the followi ng identity c = P ∞ n =0 n c n e − c n ! about the mean of a Poisson rand om variable, we hav e   E [ b c ML ( b )] − c   =    S − 1 X n =0 n  P ( K 1 = n ) − c n e − c n !  + S K X n = S P ( K 1 = n ) − ∞ X n = S n c n e − c n ! + O (1 /K )    ≤ S    S − 1 X n =0  P ( K 1 = n ) − c n e − c n !     + S K X n = S P ( K 1 = n ) + ∞ X n = S c n e − c ( n − 1)! + O (1 /K ) . (62) In what follows, we de ri ve bound s for the quantities on the right-hand side of the a bove inequality . First, consider the probability P ( K 1 = n ) . Since K 1 is a binomial random variable, we ha ve P ( K 1 = n ) =  K n  (1 − p 0 ( c/K )) n p 0 ( c/K ) ( K − n ) = K ( K − 1) . . . ( K − n + 1) n ! K n ( K (1 − e − c/K )) n e − c ( K − n ) /K . (63) For e very n < S , it is easy to verify that K ( K − 1) ... ( K − n +1) K n = 1 + O (1 /K ) , ( K (1 − e − c/K )) n = c n + O (1 /K ) and e − c ( K − n ) /K = e − c + O (1 /K ) . Thus, for any n < S , we can simplify (63) as P ( K 1 = n ) = c n n ! e − c + O (1 /K ) . (64) It foll ows that S    S − 1 X n =0  P ( K 1 = n ) − c n e − c n !     = O (1 /K ) . (65) 30 Next, cons ider the second term on the right-hand s ide o f (62). S K X n = S P ( K 1 = n ) = S (1 − S − 1 X n =0 P ( K 1 = n )) = S (1 − S − 1 X n =0 c n n ! e − c ) + O (1 /K ) (66) = S ∞ X n = S c n n ! e − c + O (1 /K ) ≤ S e − c  ec S  S + O (1 /K ) , (67) for a ll c < S , where (66) follo ws from (64) a nd the inequality (67) is due to the Chernoff bound on the tail of P oisson d istrib utions [24]. Similarly , the third term on the right-hand side of (62) can be rewritt en as ∞ X n = S c n e − c ( n − 1)! = c ∞ X n = S − 1 c n e − c n ! ≤ c e − c  ec S − 1  S − 1 , (68 ) where the ine quality is a gain an application of the Chernoff bound . Finally , on subs tituting (65), (67) and (68) into (62), and after some simple manipulations, we reach (29). The proof for the mean -squared error formula (30 ) is similar . Using (61) , we have E  ( b c ML ( b ) − c ) 2  = S − 1 X n =0 ( n − c ) 2 P ( K 1 = n ) + ( S − c ) 2 K X n = S P ( K 1 = n ) + O (1 /K ) = S − 1 X n =0 ( n − c ) 2 c n e − c n ! + ( S − c ) 2 ∞ X n = S c n e − c n ! + O (1 /K ) , (69) where in reaching (69), we have used the es timation (64) of the Binomial prob abilities. W e note that the variance of a Poisson random variable is equal to its mean. Thus, c = P ∞ n =0 ( n − c ) 2 c n e − c n ! . On combining this identity with (69),    E  ( b c ML ( b ) − c ) 2  − c    =    ( S − c ) 2 ∞ X n = S c n e − c n ! − ∞ X n = S ( n − c ) 2 c n e − c n ! + O (1 /K )    ≤ 2 ∞ X n = S n 2 c n e − c n ! + O (1 /K ) , for c ≤ S ≤ 2 c ( c + 1) ∞ X n = S − 2 c n e − c n ! + O (1 /K ) , for c < S − 2 . Applying the Chernoff bound to the above ineq uality , we get (30). Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 31 F . Pr oof of Pr o position 4 W e have p 0 ( x ) = e − x P q − 1 n =0 x n n ! , and thus 1 − p 0 ( x ) = e − x P ∞ n = q x n n ! . It follo ws that K (1 − p 0 ( S/K )) = K e − S/K  S q K q q ! + S q +1 K q +1 ( q + 1)! + . . .  = e − S/K  S q K q − 1 q ! + S q +1 K q ( q + 1)! + . . .  . For q ≥ 2 and any ﬁ xed co nstant S , the above quantity c on ver ges to 0 as K tend s to inﬁnity . As a resu lt, for s ufﬁciently lar ge K , the MLE solution in (28) can be simpliﬁed as b c ML ( b ) =      0 , if K 1 = 0 , S, otherwise , (70) where we have a lso used the fact that p 0 (0) = 1 an d thus p [ − 1] 0 (1) = 0 . Us ing (70), we can c ompute the expected value of the MLE as E [ b c ML ( b )] = 0 P ( K 1 = 0) + S (1 − P ( K 1 = 0)) = S (1 − P ( K 1 = 0)) . (71) W e have K 1 = 0 when all the pix el responses are uniformly 0 . The probability of seeing such an event is P ( K 1 = 0) = p 0 ( c/K ) K = e − c  1 + c K + . . . + c q − 1 K q − 1 ( q − 1)!  K , which c on ver ges to 1 a s K tends to inﬁnity , i.e . , lim K →∞ P ( K 1 = 0) = 1 . (72) Substituting (72) into (71), we get (31). Next, we compute the MSE as E  ( b c ML ( b ) − c ) 2  = c 2 P ( K 1 = 0) + ( S − c ) 2  1 − P ( K 1 = 0)  , which, upon taking the limit as K → ∞ , leads to (32). G. Proof of Le mma 2 The function h ( x ) def = log P j k = i x k e − x k ! is continuous ly differentiable on the interval (0 , ∞ ) . The refore, to establish its concavity , we just need to show that its second deri vati ve is nonpositiv e. T o that end, we ﬁrst introduce a seque nce of functions { r k ( x ) } k ∈ Z ∪{∞} , deﬁ ned as r k ( x ) def =      x k /k ! , if 0 ≤ k < ∞ ; 0 , if k < 0 or k = ∞ . (73) 32 It is s traightforward to verify that d dx r k ( x ) = r k − 1 ( x ) f or all k ∈ Z ∪ {∞} . Now , rewr iting h ( x ) as log P j k = i r k ( x ) − x and co mputing its second deri vati ve, we get d 2 dx 2 h ( x ) = ( P j k = i r k − 2 )( P j k = i r k ) − ( P j k = i r k − 1 ) 2 ( P j k = i r k ) 2 , (74) where we have omitted the function ar gument x in r k ( x ) , r k − 1 ( x ) and r k − 2 ( x ) for notational simplicity . Recall tha t our goal is to show that d 2 dx 2 h ( x ) ≤ 0 , for x > 0 . Since the den ominator of (74) is always positiv e, we just need to focus on its numerator . Using the iden tities P i ≤ k ≤ j r k = P i ≤ k ≤ j r k − 1 + r j − r i and P i ≤ k ≤ j r k − 1 = P i ≤ k ≤ j r k − 2 + r j − 1 − r i − 1 , we can simplify the numerator of (74) as follo ws:  X i ≤ k ≤ j r k − 2  X i ≤ k ≤ j r k − 1 + r j − r i  −  X i ≤ k ≤ j r k − 1  X i ≤ k ≤ j r k − 2 + r j − 1 − r i − 1  = X i ≤ k ≤ j  ( r k − 2 r j − r k − 1 r j − 1 ) + ( r k − 1 r i − 1 − r k − 2 r i )  . (75) In w hat follo ws, we show that r k − 2 ( x ) r j ( x ) − r k − 1 ( x ) r j − 1 ( x ) ≤ 0 (76) for arbitrary choices of x ≥ 0 and i ≤ k ≤ j , where 0 ≤ i ≤ j < ∞ or 0 ≤ i < j = ∞ . Note that, when k < 2 or j = ∞ , the left-hand side of (76) becomes − r k − 1 ( x ) r j − 1 ( x ) and thus (76) automatically holds. Now , assume that k ≥ 2 and j < ∞ . From the deﬁnition in (73), the left-hand side of (76) is x k − 2 x j ( k − 2)! j ! − x k − 1 x j − 1 ( k − 1)!( j − 1)! = x k + j − 2 ( k − 2)!( j − 1)!  1 j − 1 k − 1  ≤ 0 for i ≤ k ≤ j . Using similar ar guments, we can also show that r k − 1 ( x ) r i − 1 ( x ) − r k − 2 ( x ) r i ( x ) ≤ 0 , for x ≥ 0 . (77) On s ubstituting the inequ alities (76) and (77 ) into (75), we verif y that the numerator of (74) is nonpositiv e, and therefore d 2 dx 2 h ( x ) ≤ 0 , for all x > 0 . H. Proof of Proposition 5 Expressing the signa l proce ssing operations in Fig. 6 in the z -domain, we have U ( z ) = A ( z K ) G ( z ) = A ( z K ) G 0 ( z K ) + z − 1 A ( z K ) G 1 ( z K ) + . . . + z − ( K − 1) A ( z K ) G K − 1 ( z K ) , (78) where A ( z K ) in the ﬁrst equa lity is the z -transform of the K -times upsampled version of a m , an d (78) follo ws from (45). Similar to (45), we can exp and U ( z ) in terms of the z -transforms of its polyp hase compone nts, as U ( z ) = U 0 ( z K ) + z − 1 U 1 ( z K ) + . . . + z − ( K − 1) U K − 1 ( z K ) . (79) Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 33 Comparing (78) a gainst (79) and u sing the uniqu eness of the p olyphase decompos ition, we conclude that U k ( z ) = A ( z ) G k ( z ) , for all 0 ≤ k < K . Now , consider Fig. 6 (b). W e note that the z -transform of g − m is G ( z − 1 ) . Den ote by d m the output o f the ﬁlteri ng operation. The n, its z -transform can be comp uted as D ( z ) = B ( z ) G ( z − 1 ) = K − 1 X k =0 z − k B k ( z K ) ! K − 1 X k =0 z k G k ( z − K ) ! = K − 1 X k =0 B k ( z K ) G k ( z − K ) + X 0 ≤ i 6 = j

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment