Bits from Photons: Oversampled Image Acquisition Using Binary Poisson Statistics

We study a new image sensor that is reminiscent of traditional photographic film. Each pixel in the sensor has a binary response, giving only a one-bit quantized measurement of the local light intensity. To analyze its performance, we formulate the o…

Authors: Feng Yang, Yue M. Lu, Luciano Sbaiz

Bits from Photons: Oversampled Image Acquisition Using Binary Poisson   Statistics
1 Bits from Photons: Ov ersampled Image Acquisition Using Binary Poisson Statisti cs Feng Y ang, Student Member , IEEE, Y ue M. Lu, M ember , IEEE, Luciano Sbaiz, Member , IEEE, and Martin V etterli, F ellow , IEEE Abstract W e study a new image sensor that is reminiscent of trad itional photo graphic film. Each pixel in the sensor has a binary response, giving only a one-bit quantized measurement of the local light intensity . T o analyze its perfo rmance, we formulate the oversampled binar y sensing scheme as a parameter estimation problem based on quantized Poisson statistics. W e show that, with a single-photo n quantization thresho ld and large oversampling factors, the Cram ´ er-Rao lower boun d (CRLB) of the e stimation variance approac hes that of an ideal unqu antized sensor, that is, as if ther e wer e no quantization in the sensor measur ements. Furthermo re, the CRLB is shown to be asymp totically achiev able by the maximum likelihoo d estimator (MLE). By sho wing that the log-likelihood fun ction of our problem is con cav e, we guarantee the g lobal optimality of iterative algor ithms in finding the MLE. Num erical results on bo th synthetic data and image s taken by a prototyp e sensor verify our theoretical analysis and demon strate the effecti veness of our image reconstruc tion algo rithm. T hey also sugg est the p otential applicatio n o f the oversampled bina ry sensing scheme in high dynam ic rang e pho tograph y . Index T erms Computation al photog raphy , high dyn amic rang e imagin g, dig ital film senso r , ph oton-limited im aging, Poisson statistics, qu antization, diffraction- limited imaging F . Y ang is with the School of Computer and Communication Sciences, ´ Ecole Polytechnique F ´ ed ´ erale de Lausanne (EP FL), CH-1015 Lausanne, Switzerland (e-mail: feng.yang@epfl.ch; WW W : http://lcav .epfl.ch/people/feng .yang) . Y . M. Lu is with the School of Engineering and Applied S ciences, H arv ard Uni versity , Cambridge, MA 02138, USA (e-mail: yuelu@seas.harv ard.edu; WWW: http://lu.seas.harvard.edu). L. Sbaiz i s with Goog le Zurich, C H-8002 Zurich, Switzerland (e-mail: luciano.sbaiz@gmail.com). M. V etterli is with the School of Computer and Communication S ciences, ´ Ecole Polytechnique F ´ ed ´ erale de Lausanne (E PFL), CH-1015, Lausanne, Switzerland (e-mail : martin.vetterli@epfl.ch; WW W : http://lcav .epfl.ch/people/martin.v etterli). This work was supported by the S wiss National S cience Foundation under grant 200020-132 730, and by the European Research Council under grant ERC-24700 6. 2 I . I N T R O D U C T I O N Before the a dvent of digital imag e s ensors, pho tography , for the most p art o f its history , used film to record light information. At the heart of e very photographic fi lm are a lar ge nu mber of light-sensiti ve g rains of silver- halide crystals [1]. Du ring expos ure, e ach micron-siz ed grain has a binary f ate: Either it is struck by some incident photons a nd bec omes “expos ed”, or it is missed by the photon bomba rdment and remains “unexpose d”. In the subseq uent film development process , expos ed grains, due to their altered chemical properties, are con verted to silver meta l, c ontributi ng to opaqu e spo ts on the film; un exposed grains are washed away in a chemical bath, le aving be hind them transparen t regions on the film. Th us, in es senc e, photographic film is a binary imaging medium, using local densities of o paque silver grains to encode the original light intensity information. Th anks to the small size and lar ge number of these grains, one hardly notices this qu antized nature of film wh en viewing it at a distance, observing only a c ontinuous gray tone. In this work, we study a new digital image se nsor that is reminisc ent o f pho tographic film. Each pixel in the sensor has a binary response , gi ving only a one-bit qu antized mea surement of the local light intensity . At the start o f the expos ure period, all pixels are se t to 0 . A pixel is the n s et to 1 if the n umber of photons rea ching it during the expo sure is at least equal to a given thres hold q . One way to build such binary sensors is to modify standard memory ch ip technology , where each memory bit ce ll is des igned to be sensitive to visible light [2]. W ith current CMOS technology , the level of integration of s uch sy stems can exceed 10 9 ∼ 10 10 ( i.e. , 1 giga to 10 giga) pixels per chip. In this case, the c orrespond ing pixel s izes (around 50 nm [3]) a re far be low the diff raction limi t of light (see Sec tion II for more details), a nd thus the image se nsor is oversampling the optical reso lution of the light field. Intuitiv ely , on e can exploit this s patial redundan cy to compensa te for the information los s due to o ne-bit q uantizations, as is clas sic in ov ersampled analog-to-digital (A/ D) con versions [4]–[7]. Building a binary se nsor that emulates the p hotographic film process was fi rst envisioned by Fossum [8], who coined the n ame “digital fi lm senso r” . The original moti vation was mainly out of tech nical nec essity . The miniaturization of ca mera systems calls for the co ntinuous shrinking of pixel sizes. At a ce rtain point, howe ver , the limited full-well capac ity ( i.e. , the maximum photon-electrons a p ixel ca n hold) of small pixels becomes a bo ttleneck, yielding very low sign al-to-noise ratios (SNRs) a nd p oor dynamic rang es. In contrast, a binary sensor whose p ixels o nly need to detect a few pho ton-electrons around a small threshold q h as much less requirement for full-well capacities, allowing pixel size s to shrink further . In this pape r , we present a theoretical ana lysis o f the performance of the bina ry image sens or , a nd propose an efficient and op timal algorithm to reco nstruct images from the bina ry sens or measuremen ts. Our analysis and numerical simulations demonstrate tha t the dyna mic ranges of the binary se nsors can be orders o f ma gnitude higher than those of con ventional imag e s ensors, thu s p roviding one more motiv ation Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 3 for c onsidering this binary sensing sc heme. Since photon arriv als at each pixel can be we ll-approximated by a Poisson rand om proce ss who se rate is determined by the local light intensity , we formulate the b inary s ensing and s ubseq uent image recon struction as a parameter es timation problem b ased on q uantized Poiss on statistics. Imag e estimation from Poisson statistics ha s bee n extensively studied in the pas t, with applications in b iomedical and astrophysical imaging. Previous work in the literature has used linear mo dels [9], multiscale models [10 ], [11], an d nonlinea r piecewise smo oth models [12], [13] to des cribe the underlying images , leading to diff erent (penalized) maximum likelihood a nd/or Bayesian re construction algorithms. The main dif ference between o ur work a nd previous works is tha t we have on ly acce ss to one-bit quantized Poisson statistics. The b inary quantiza tion and spa tial oversampling in the sensing sch eme add interesting d imensions to the original problem. As we will sh ow in Section III, the performance of the binary s ensor depe nds on the intricate interplay of three parameters: the average light intens ity , the quantization thres hold q , and the oversampling factor . The binary s ensing scheme studied in this pa per also bea rs resemblance to oversampled A/D con version scheme s with quantizations (see , e.g. , [4]–[7]). Previous w ork on one -bit A/D conv ersions conside rs ban- dlimited signals or , in ge neral, signals li ving in the range space o f some overcomplete representations. The eff ect o f qua ntization is often approximated by additi ve noise, wh ich is then mitigated throu gh noise s haping [4], [6], or dithering [7], followed by linear recons truction. In our work, the binary senso r measurements are modeled as one-bit quantized versions of c orrelated Poisso n random variables (instead of deterministic signals), an d we d irectly so lve the statistical in verse problem by using max imum likelihood estimation, without a ny add iti ve noise approximation. The rest of the pap er is organized as follows. After a p recise de scription of the binary sensing model in Section II, we present three main co ntrib utions in this paper: 1 . E stimation performanc e: In Sec tion III, we analyze the performance of the proposed binary s ensor in e stimating a piecewise constant light intensity func tion. In what might be viewed as a s urprising resu lt, we show that, when the quan tization threshold q = 1 and with large oversampling factors, the Cram ´ er- Rao lower bound (CRLB) [14] of the estimation variance app roaches that of unquantized Poisson intensity estimation, that is, a s if there were no quantization in the sen sor me asuremen ts. Furthermore, the CRLB can be asymptotically a chieved b y a maximum lik elihood estimator (MLE), for lar ge oversampling factors. Combined, the se two res ults e stablish the fe asibility of trading spatial reso lutions for high er qu antization bit de pth. 2 . Adv antage over traditional sensors: W e c ompare the oversampled binary sensing sche me with tradi- tional image se nsors in Section III-C. Ou r analysis shows that, with suf ficiently lar ge oversampling f actors, the n ew binary sen sor can hav e higher dynamic range s, making it particularly attracti ve in a cquiring sc enes 4 containing both bright and dark regions. 3 . Image reconstruc tion: Section IV presents an MLE-based algorithm to reconstruct the light intensity field from the binary s ensor meas urements. As an important result in this work, we show that the log- likelihood function in our problem is alw ays concave for arbitrary linear fi eld mod els, thus ensuring the achievement of global optimal so lutions b y iterative algorithms. For numerica lly solving the MLE , we present a g radient method, a nd derive efficient implementations ba sed on fast sign al proces sing algorithms in the polypha se domain [15], [16]. This attention to c omputational efficiency is important in practice, due to extremely lar ge spatial resolutions of the binary s ensors. Section V presents numerical res ults on b oth synthe tic data an d images taken by a prototype de vice [17]. These results verify our theoretical an alysis on the binary sensing s cheme, demonstrate the effecti veness of our image reco nstruction algorithm, an d showcase the benefit of using the ne w binary sensor in acqu iring scene s with high dynamic ran ges. T o simplify the presentation we base our disc ussions on a one-dimens ional ( 1 -D) sensor array , but all the results can be eas ily extended to the 2 - D case. I I . I M A G I N G B Y O V E R S A M P L E D B I NA RY S E N S O R S A. Diffraction Limit and Line ar Light F ie ld Mo dels In this se ction, we d escribe the bina ry imaging scheme studied in this work. Cons ider a simplified camera model shown in Fig. 1(a). W e denote by λ 0 ( x ) the inc oming light intensity field ( i.e. , the radianc e map). By assu ming tha t light intensities rema in constan t within a short exposure period, we model the field as only a function of the spatial variable x . W ithout loss of generality , we assume that the dimension of the sensor a rray is of one sp atial unit, i.e. , 0 ≤ x ≤ 1 . After pass ing through the optical system, the original light field λ 0 ( x ) g ets filtered by the lens, which acts like a linear system with a g i ven impulse respons e. Due to imperfections ( e.g. , aberrations) in the lens, the impulse respons e, a.k.a. the point spread function (PSF) of the o ptical system, can not be a Dirac delta, thus, imposing a limit on the resolution of the obs ervable light field. Howe ver , a more funda mental physical limit is d ue to light dif fraction [18]. As a resu lt, even if the lens is ideal, the PS F is s till unav oidably a small blurry spot [see, for examp le, Fig. 1(b)]. In optics , su ch diffraction-li mited spot is often ca lled the Airy dis k [18] , whose rad ius R a can be computed as R a = 1 . 2 2 wf , where w is the wavelength of the light a nd f is the F-number of the optical sy stem. Example 1: At wa velength w = 420 nm ( i.e . , for blue v isible light) and f = 2 . 8 , the radius of the Airy disk is 1 . 43 µ m. T wo objects with distance smaller than R a cannot be clearly separated b y the imaging Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 5 Sensor Lens P S f r a g r e p l a c e m e n t s λ 0 ( x ) λ ( x ) (a) P S f r a g r e p l a c e m e n t s x -5 -4 -3 -2 -1 0 1 2 3 4 5 0 0.02 0.04 0.06 (b) Fig. 1. The imaging model. (a) The simplified arch itecture of a diffraction-limited imaging system. Incid ent light field λ 0 ( x ) passes t hrough an optical lens, which acts like a linear system with a dif fraction-limited poin t spread function (PSF ). The r esult is a smoothed light fi eld λ ( x ) , which is subsequently captured by the image sensor . (b) The PSF (Airy disk) of an ideal lens with a circular aperture. system as their Airy d isks on the image sensor start blurring together . Current CMOS technology ca n already make stan dard pixels smaller than R a , reaching sizes ranging from 0 . 5 µ m to 0 . 7 µ m [19]. In the case of binary sens ors, the simplicity of each pixel allo ws the feature size to be further red uced. For example, based on stand ard memory technolog y , each memory bit-cell ( i.e. , pixel) can have s izes around 50 n m [3], making it poss ible to substantially oversample the light field. In what follo ws, we den ote by λ ( x ) the dif fraction-limit ed ( i.e. , “ob servable”) light intensity field, which is the outcome of pa ssing the original light field λ 0 ( x ) through the lens. Due to the lowpass (smoothing) nature of the PSF , the resu lting λ ( x ) has a finite spatial-resolution, i.e. , it ha s a fi nite nu mber of d egrees of free dom p er unit spac e. Definition 1 (Linear field model): In this work, we model the diffraction-li mited light intensity field as λ ( x ) = N τ N − 1 X n =0 c n ϕ ( N x − n ) , (1) where ϕ ( x ) is a nonnega ti ve interpolation kernel, N is a giv en integer , τ is the exp osure time, a nd { c n : c n ≥ 0 } is a set of free variables. Remark 1: The con stant N/τ in front of the su mmation is not essen tial, but its inclusion here leads to simpler e xpres sions in our later analysis. The func tion λ ( x ) as define d in (1) has N degrees of freedom. T o guaran tee that the resulting light fi elds are p hysically mean ingful, we require bo th the interpolation kernel ϕ ( x ) an d the expans ion c oefficients 6 { c n } to be n onnegative. Some exa mples of the interpolation kernels ϕ ( x ) include the box func tion, β ( x ) def =      1 , if 0 ≤ x ≤ 1; 0 , otherwise , (2) cardinal B-s plines [20 ], β k ( x ) =  β ∗ . . . ∗ β | {z } ( k +1) t imes   x + k 2  , (3) and the squared sinc function, sin 2  π ( x − 1 2 )  /  π ( x − 1 2 )  2 . B. Sa mpling the Light Intensity F ield The image s ensor in Fig. 1(a) works as a sampling device of the light intensity fie ld λ ( x ) . Sup pose that the sensor co nsists of M pixels p er unit sp ace, a nd that the m th pixel covers the area between [ m M , m +1 M ] , for 0 ≤ m < M . W e de note by s m the total light expos ure acc umulated on the surface area of the m th pixel within an expos ure time pe riod [0 , τ ] . Then , s m def = Z τ 0 Z ( m +1) / M m/ M λ ( x ) dx dt = τ h λ ( x ) , β ( M x − m ) i , (4) where β ( x ) is the box function defined in (2) and h· , ·i repres ents the standard L 2 -inner produ ct. Su bstitute the light field model (1) into the above equality , s m = τ h N τ X n c n ϕ ( N x − n ) , β ( M x − m ) i = X n c n h N ϕ ( N x − n ) , β ( M x − m ) i = X n c n h ϕ ( x ) , β  M ( x + n ) N − m  i , (5) where (5) is obtained through a cha nge of vari ables ( N x − n ) → x . Definition 2: The spatial oversampling factor , denoted by K , is the ratio between the number of pixels per unit spac e and the numbe r of degrees of freed om needed to specify the li ght field λ ( x ) in (1), i.e. , K def = M N . (6) In this work, we are interested in the “oversampled” case w here K > 1 . Furthermore, we assume that K is an integer for simplicity o f n otation. Us ing (6) , an d by introducing a discrete filter g m def = h ϕ ( x ) , β ( K x − m ) i , m ∈ Z , (7) Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 7 Photon Counting Quantization Binary sensing Light exposure P S f r a g r e p l a c e m e n t s c n K g m τ / ( K N ) s m y m b m Fig. 2. The si gnal processing block diagram of the imaging model studied in this paper . In the first step, the light exposure value s m at the m th pixel i s related t o the e xpansion co efficients c n through a conc atenation of upsampling and filtering operations. Subsequently , the image sensor con verts { s m } into quantized measurements { b m } (see F ig. 3 and the discussions in Section II-C for details of this second step). we c an s implify (5) as s m = X n c n g m − K n . (8) The above eq uality s pecifies a simple linear map ping from the expans ion co efficients { c n } of the light field to the light exposu re v alues { s m } acc umulated by the image sensor . Re aders famili ar with multirate signal proce ssing [15], [16] will immedia tely reco gnize that the relation in (8) can b e implemented via a concate nation o f upsampling and filtering, as sho wn in the left part of Fig. 2. This observation c an also be verified by expressing (8) in the z -transform domain S ( z ) = X n c n z − K n G ( z ) = C ( z K ) G ( z ) , (9) and using the fact that C ( z K ) is the z -transform o f the K -fold upsampled version of c n . In Section IV, we will further study the signal processing block dia gram in Fig. 2 to de ri ve ef ficient implementations of the proposed image recons truction algorithm. Example 2: The discrete filter g m is complete ly sp ecified by the interpolation kernel ϕ ( x ) a nd the oversampling f actor K . As a simple case, when the kernel ϕ ( x ) = β ( x ) , we can compute from (7) that g m =      1 /K, for 0 ≤ m < K ; 0 , otherwise . (10) C. Binar y S ensing a nd On e-Bit P oiss on Statistics Fig. 3 illustrates the binary sensor mode l. Recall from (4) that { s m } deno te the expos ure v alues ac cumu- lated by the se nsor pixels. Depending o n the local values o f { s m } , each p ixel (dep icted a s “b uckets” in the figure) c ollects a dif ferent number of photons hitting on its surface. In wh at follows, we den ote by y m the number of photons impinging on the surface of the m th pixel during a n exposu re period [0 , τ ] . The relation between s m and the pho ton count y m is stoc hastic. More specific ally , y m can be modeled as realizations 8 Binary Light intensity Photon counting measurements quantization Binary P S f r a g r e p l a c e m e n t s s 0 s 1 s 2 s m y 0 y 1 y 2 y m b 0 b 1 b 2 b m m ( q = 2) m e a s u r e m e n t s Fig. 3. The model of t he binary image sensor . The pixels (sho wn as “b uckets”) collect photons, the numb ers of which are compared against a quantization threshold q . In the figure, w e illustrate the case when q = 2. T he pixel outputs are binary: b m = 1 ( i.e. , white pix els) if there are at least two photons receiv ed by the pix el; otherwise, b m = 0 ( i.e. , gray pix els). of a Poisso n random v ariable Y m , whose intensity parameter is equal to s m , i.e . , P ( Y m = y m ; s m ) = s y m m e − s m y m ! , for y m ∈ Z + ∪ { 0 } . (11) It is a well-known property of the Poiss on process that E [ Y m ] = s m . Thus, the average n umber of photons captured by a gi ven pixel is equal to the loc al light exposure s m . As a ph otosens iti ve d evice, each pixel in the image se nsor c on verts photons to electrical signals, whos e amplitude is proportional to the number of photons impinging on that p ixel. 1 In a c on ventional s ensor design, the analog electrical signals are then quantized by an A/D con verter into 8 to 14 bits (usually the more bits the be tter). In this work, we study a new sensor des ign using the followi ng binary ( i.e. , one -bit) quantization s cheme. Definition 3 (Binary Quantization): Le t q ≥ 1 b e an integer thresh old. A binary q uantizer is a mapping Q : Z + ∪ { 0 } 7− → { 0 , 1 } , such tha t Q ( y ) =      1 , if y ≥ q ; 0 , othe rwise . In Fig. 3, we il lustrate the binary qua ntization sc heme. White pixels in the fig ure s how Q ( y m ) = 1 an d gray pixels show Q ( y m ) = 0 . W e de note by b m def = Q ( y m ) , b m ∈ { 0 , 1 } , the qu antized output o f the m th pixel. Since the photon counts { y m } are drawn from random variables { Y m } , s o a re the binary s ensor ou tput 1 The ex act ratio between these two quantities is determined by the quantum ef ficiency of the sensor . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 9 { b m } , from the random variables n B m def = Q ( Y m ) o . Intr oduc ing two functions p 0 ( s ) def = q − 1 X k =0 s k k ! e − s , p 1 ( s ) def = 1 − q − 1 X k =0 s k k ! e − s (12) we c an write P ( B m = b m ; s m ) = p b m ( s m ) , b m ∈ { 0 , 1 } . (13) Remark 2: The n oise mo del cons idered in this paper is that of Poisson noise. In practice, the performance of image sens ors is also influ enced by thermal noise , which in our case can be modeled as rando m bit- flipping in the binary sensor me asuremen ts. Due to s pace c onstraints, we leave further discussion s on this additional n oise s ource a nd its impact on recons truction performanc e to a follow-up work. D. Multiple Expos ures and T emporal Oversamp ling Our pre vious discuss ions focus on the case of a cquiring a s ingle frame o f quantized measurements d uring the expo sure time [0 , τ ] . As an extension, we c an co nsider multiple exposures and a cquire J con secutive and indep enden t frames. The expo sure time for ea ch frame is set to τ /J , so that the total acquisition time remains the s ame as the single exposure case . In what follows, we ca ll J the tempo ral oversampling fac tor . As be fore, we assume tha t τ ≪ 1 an d thus light intensities λ ( x ) stay c onstant within the entire acquisition time [0 , τ ] . For the j th frame ( 0 ≤ j < J ), we denote by s j,m the light exposure at the m th pixel. Follo wing the same deriv ations as in Section II-B, we can show that s j,m = 1 J X n c n g m − K n , for all j, (14) where { c n } are the expansion co efficients of the light field λ ( x ) , and g m is the d iscrete filter defin ed in (7). The only difference between (14) and (8) is the extra factor of 1 /J , due to the change of exposure time from τ to τ /J . In the z -domain, s imilar to (9), S j ( z ) = 1 J C ( z K ) G ( z ) . (15) In what follo ws, we establish an equiv alence be tween temporal o versampling and s patial oversampling. More prec isely , we will sh ow that an M -pixel sensor tak ing J indepen dent exposures ( i.e. , with J -times oversampling in time) is ma thematically e quiv alent to a single sensor co nsisting of M J pixels. First, we introduce a new seque nce e s m , 0 ≤ m < M J , cons tructed by interlacing the J exposure sequen ces { s j,m } . For example, when J = 2 , the ne w s equen ce is s 0 , 0 , s 1 , 0 , s 0 , 1 , s 1 , 1 , . . . , s 0 ,m , s 1 ,m , . . . , s 0 ,M − 1 , s 1 ,M − 1 , 10 where { s 0 ,m } a nd { s 1 ,m } alternate. In general, e s m can be obtained as e s m def = s j,n , m = J n + j, 0 ≤ j < J, 0 ≤ n < M . (16) In multir ate sign al processing, the above con struction is called the polyphase r epresentation [15 ], [16], and its a lternating subsequenc es { s j,m } J − 1 j =0 the polyphase compo nents . Pr oposition 1: Let e g m be a filter whose z -trans form e G ( z ) def = G ( z J )(1 + z − 1 + . . . + z − ( J − 1) ) /J , (17) where G ( z ) is the z -transform of the filter g m defined in (7). Then, e s m = X n c n e g m − K J n . (18) Pr oof: See Appendix A. Remark 3: Proposition 1 formally estab lishes the equi valence betwee n spatial and temporal o versampling. W e note that (18) h as exactly the same form as (8), a nd thus the mapp ing from { c n } to { e s m } can be implemented by the same signal proc essing operations shown in Fig. 2—we only need to c hange the upsampling factor from K to K J an d the filter from g m to e g m . In ess ence, by taking J consecutive exposures with an M -pixel sensor , we ge t the same light exposure v alues { e s m } , as if we had us ed a more densely packed sensor with M J pixels. Remark 4: T aking multiple exposures is a very e f fectiv e way to increase the total oversampling f actor of the b inary s ensing s cheme . The key assumption in our analysis is that, d uring the J conse cutiv e exposures, the light field remains consta nt over time. T o make s ure this assumption holds for arbitrary values of J , we set the expo sure time for each frame to τ /J , for a fixed and small τ . Cons equen tly , the maximum temporal oversampling f actor we c an achie ve in p ractice will be li mited by the readou t speed o f the binary sensor . Thanks to the equivalence between sp atial a nd tempo ral oversampling, we o nly nee d to focus on the single exposure c ase in o ur followi ng discus sions o n the performance of the binary sens or and image reconstruction a lgorithms. All the results we obtain extend direc tly to the multiple exposure c ase. I I I . P E R F O R M A N C E A N A L Y S I S In this s ection, we s tudy the performanc e of the binary image s ensor in es timating light intensity information, analyze the influen ce of the quantization threshold and oversampling factors, and demonstrate the new sens or’ s advantage over traditional se nsors in terms of highe r dynamic rang es. In our analysis, we assume that the light field is piecewise con stant, i.e. , the interpo lation kernel ϕ ( x ) in (1) is the box function β ( x ) . This simplifying assumption allo ws us to deri ve closed-form expressions for sev eral important performance measures of interest. Numerical results in Se ction V suggest that the res ults and conclusions we o btain in this section applies to the general linea r fi eld mo del in (1) with diff erent interpolation kerne ls. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 11 A. Th e Cram ´ er-Rao Lower Bou nd (CRLB) of Estimation V ariances From Definition 1, reco nstructing the light intens ity fi eld λ ( x ) b oils down to es timating the unk nown deterministic p arameters { c n } . Inp ut to our estimation problem is a sequ ence of b inary se nsor mea surements { b m } , which are realizations of Be rnoulli random variables { B m } . The probability distributions of { B m } depend on the light exposure values { s m } , as in (13). Finally , the exposure values { s m } are linked to the light intens ity p arameters { c n } in the form of (8). Assume that the light field λ ( x ) is piecewise constan t. W e have c omputed in Example 2 that, under this case, the discrete filter g m used in (8) is a cons tant supported within [0 , K − 1] , as in (10). The mapping (8) be tween { c n } and { s m } can now be simplified as s m = c n /K, for nK ≤ m < ( n + 1) K. (19) W e see that the parameters { c n } have disjoint regions of influence , meaning, c 0 can on ly be sens ed by a group of pixels { s 0 , . . . , s K − 1 } , c 1 by { s K , . . . , s 2 K − 1 } , and so on. Conseque ntly , the paramete rs { c n } c an be e stimated o ne-by-one, independently of each other . In what follo ws, and without loss of g enerality , we focus on estimating c 0 from the bloc k of bina ry measureme nts b def = [ b 0 , . . . , b K − 1 ] T . For notational s implicity , we will drop the subs cript in c 0 and use c instead. T o analyz e the performance of the binary sensing sc heme, we first compute the C RLB [14], which provides a theoretical lower bou nd on the variance of any unbiased estimator . Denote by L b ( c ) the likelihood func tion of observing K binary s ensor measurement b . Then, L b ( c ) def = P ( B m = b m , 0 ≤ m < K ; c ) , = K − 1 Y m =0 P ( B m = b m ; c ) , (20) = K − 1 Y m =0 p b m ( c/K ) , (21) where (20) is due to the indep enden ce of the ph oton c ounting processes a t d if ferent pixel loca tions, a nd (21 ) follo ws from (13) and (19). Defining K 1 ( 0 ≤ K 1 < K ) to be the number of “1”s in the binary s equen ce b , we can simplif y (21) as L b ( c ) =  p 1 ( c/K )  K 1  p 0 ( c/K )  K − K 1 . (22) Pr oposition 2: The CRLB of estimating the light intens ity c from K binary sens or measurements wit h threshold q ≥ 1 is CRLB bin ( K, q ) = c   q − 1 X j =0 ( q − 1)!( c/K ) − j ( q − 1 − j )!     ∞ X j =0 ( q − 1)!( c/K ) j ( q + j )!   , for c > 0 . (23) 12 Pr oof: See Appendix B. It will be interesting to compare the performance of o ur bina ry image senso r with that of an idea l se nsor which does n ot use q uantization at all. T o that end, consider the s ame situation a s before, whe re we u se K pixels to obse rve a constan t light intensity value c . The light exposu re s m at ea ch pixel is equa l to c/K , as in (19 ). Now , unlike the b inary sens or which only takes one-bit me asuremen ts, c onsider an idea l sensor that can p erfectly record the number of photon arri vals at ea ch pixel. By referring to Fig. 3, the sensor measureme nts in this case w ill be { y m } , whose probab ility distributions are gi ven in (11). In A ppendix C, we compute the CRLB of this u nquantize d sens ing scheme a s CRLB ideal ( K ) = c, (24) which is na tural and reflects the fact that the variance of a Poiss on random variable is eq ual to its mean ( i.e. , c , in ou r cas e). T o be sure, we alw ays have CRLB bin ( K, q ) > CRLB ideal ( K ) , for arbitrary oversampling factor K and quantization threshold q . This is n ot surprising, as we lose information by one-bit quantizations. In practice, the ratio be tween the two CRLBs provides a me asure of performanc e degrad ations incurred by the bina ry sensors . What is su rprising is that the two qua ntities can be ma de arbitraril y close, when q = 1 and K is lar ge, as sho wn by the following prop osition. Pr oposition 3: For q = 1 , CRLB bin ( K, q ) = c + c 2 2 K + O  1 K 2  , (25) which c on ver ges to CRLB ideal ( K ) as the oversampling factor K go es to infinity . F or q ≥ 2 , CRLB bin ( K, q ) / CRLB ideal ( K ) > 1 . 31 , ( 26) and lim K →∞ CRLB bin ( K, q ) / CRLB ideal ( K ) = ∞ . Pr oof: Spec ializing the expression (23) for q = 1 , we get CRLB bin ( K, 1) = c  1 + c 2 K + c 2 3! K 2 + c 3 4! K 3 + . . .  , and thus (25). The statemen ts for cases wh en q ≥ 2 are sho wn in Appendix D. Proposition 3 indicates that it is feasible to use oversampling to compensate for infor mation loss du e to binary q uantizations. It follo ws from (25) that, with large ov ersampling factors, the binary sen sor operates as if there were no quantization in its measu rements. It is also important to note that this de sirable tradeof f between s patial resolution and estimation variance only works for a sing le-photon thresho ld ( i.e. , q = 1 ). For other choices of the quantization threshold, the “gap” between CRLB bin ( K, q ) and CRLB ideal ( K ) , measured in terms of their ratio, cannot b e made arbitrarily small, as sho wn in (26). In f act, it quick ly ten ds to infinity as the oversampling factor K increa ses. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 13 The results of Proposition 3 ca n be intuiti vely understood as follo ws: The expected number of photons collected by each pixel during light expos ure is equal to s m = c/K . As the oversampling factor K goes to infinity , the mean value of the Poisson distrib ution tends to ze ro. Consequently , most pixels on the sens or will only g et zero or one p hoton, with the p robability of rec eiving two or more photons at a pixel close to zero. In this case, with high probability , a binary quantization scheme with threshold q = 1 does not lose information. In contrast, if q ≥ 2 , the binary sensor measureme nts will be almost uniformly zero, making it nearly impossible to differentiate betwee n diff erent light intens ities. B. As ymptotic Achievability of the CR LB In what follows, we show tha t, when q = 1 , the CRLB deri ved in (23) can be as ymptotically achieved by a simple maximum likelihood estimator (MLE). Gi ven a sequence of K binary measurements b , the MLE we s eek is the parameter that maximizes the lik elihood function L b ( c ) in (22). More spec ifically , b c ML ( b ) def = arg max 0 ≤ c ≤ S L b ( c ) = arg max 0 ≤ c ≤ S  1 − p 0 ( c/K )  K 1  p 0 ( c/K )  K − K 1 , (27) where we substitute p 1 ( c/K ) in (22) by its equ i valent form 1 − p 0 ( c/K ) . Th e lo wer bound of the search domain c ≥ 0 is c hosen according to physical constraints, i.e. , the light field can not take negative values. The upp er bound c ≤ S b ecomes ne cessa ry when K 1 = K , in which case the likelihood func tion L b ( c ) = p 1 ( c/K ) K is monotonically increasing with respec t to the light intensity level c . Lemma 1: The MLE so lution to (27) is b c ML ( b ) =      K p [ − 1] 0 (1 − K 1 /K ) , if 0 ≤ K 1 ≤ K (1 − p 0 ( S/K )) , S, otherwise , (28) where p [ − 1] 0 ( x ) is the in verse function of p 0 ( x ) . Remark 5: From the defin ition in (12), we can easily verify that d dx p 0 ( x ) < 0 for all x > 0 . It follows that the function p 0 ( x ) is strictly de creasing for x > 0 and that the in verse p [ − 1] 0 ( x ) is well-define d. For example, when q = 1 , we h ave p 0 ( x ) = e − x and thus p [ − 1] 0 ( x ) = − log( x ) . In this particular case, and for K 1 ≪ K , we h ave b c ML ( b ) = − K log(1 − K 1 /K ) ≈ K 1 . It follows that we c an use the sum of the K binary me asuremen ts as a first-order a pproximation of the light intensity estimation. Pr oof: At the two extreme cases, w hen K 1 = 0 or K 1 = K , it is easy to s ee that (28) is indeed the solution to (27). Next, we a ssume that 0 < K 1 < K . Computing the d eri vati ve of L b ( c ) an d setting it to ze ro, we can verify that the equation d dc L b ( c ) = 0 has a single solution at b c max = K p [ − 1] 0 (1 − K 1 /K ) . 14 Since L b ( c ) ≥ 0 an d L b (0) = lim c →∞ L b ( c ) = 0 , we conclude that the likelihood function L b ( c ) ac hiev es its ma ximum v alue at b c max . Finally , the MLE so lution b c ML = min { b c max , S } , and thus , we hav e (28 ). Theorem 1: When q = 1 , we have E [ b c ML ( b )] = c + ε 1 + O (1 /K ) , for c < S − 2 , (29) where | ε 1 | ≤ 2 c e 1 − c  ec S − 1  S − 1 . Meanwhile, the mea n squared error (MSE) of the estimator approac hes CRLB ideal , i.e . , E  ( b c ML ( b ) − c ) 2  = c + ε 2 + O (1 /K ) , for c < S − 2 , (30) where | ε 2 | ≤ 2 c ( c + 1) e − c  ec S − 2  S − 2 . Remark 6: It is e asy to verify tha t, for fixed c , the two terms ε 1 and ε 2 con ver ge (very q uickly) to 0 as S tends to infinity . It then follows from (29) and (30) that the MLE is a symptotically unbias ed and ef ficient, in the sen se that lim S → ∞ lim K →∞ E [ b c ML ( b )] = c an d lim S → ∞ lim K →∞ E  ( b c ML ( b ) − c ) 2  = c. W e leave the formal proof of this theo rem to Ap pendix E. Its main idea can be summarized as follows. As K goes to infinity , the area of ea ch pixel tends to z ero, s o doe s the average numbe r of ph otons arri ving at that pixel. As a resu lt, most pixels o n the se nsor will get on ly ze ro or o ne photon d uring exposu re. A single-photon binary q uantization scheme c an record perfectly the pattens of “0”s a nd “1”s o n the se nsor . It lose s information only when a pixel rec eiv es two or more p hotons, but the probability of such ev ents tends to zero as K increa ses. Suppose , now , that we us e a qu antization threshold q ≥ 2 . In this case, as K tends to infinity , the binary response s of diff erent p ixels will almost always be “0” , e ssentially obfuscating the a ctual light intensity values. This problem leads to p oor performance in the MLE. As stated in the following propo sition, the asymptotic MSE for q ≥ 2 bec omes c 2 instead of c . Pr oposition 4: When q ≥ 2 , the MLE is asy mptotically biase d , that is, for any fixed c and S , lim K →∞ E [ b c ML ( b )] = 0 . (31) Meanwhile, the MSE becomes lim K →∞ E  ( b c ML ( b ) − c ) 2  = c 2 . (32) Pr oof: See Appendix F. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 15 10 1 10 2 10 3 10 4 10 5 10 6 0 10 20 30 40 50 60 Light Exposure Values SNR(dB) Fig. 4. Performance comparisons of three different sensing schemes (“BIN”, “IDEAL ”, and “SA T ”) over a wide range of light expo sure v alues c (shown in logarithmic scale). The dash-dot line (in red) represents the “IDEAL ” scheme with no quantization; The solid li ne (in blue) correspond s t o the “SA T” scheme with a saturation point set at C max = 9130 [21]; The four dashed lines (in black) correspond to the “BIN” scheme with q = 1 and different ov ersampling factors (from left to right, K = 2 13 , 2 14 , 2 15 and 2 16 , respecti vely). C. Ad vantages over T raditional Senso rs In wha t follo ws, we de monstrate the advantage of the oversampled binary sensing sc heme, denoted by “BIN”, in ac hieving high er dynamic ranges. W e focus on the case where the q uantization threshold is s et to q = 1 . For comparisons , we a lso c onsider the follo wing two a lternati ve sens ing sch emes: The first, deno ted by “IDEAL ”, uses a s ingle pixel to estimate the light exposure parameter ( i.e. , nonoversampled), b ut that pixel can rec ord pe rfectly the number of photon arriv als during exposure; The sec ond sche me, denoted b y “SA T”, is very similar to the first, with the ad dition of a saturation po int C max , beyon d which the pixel can ho ld no more photons. Note that in our discuss ions, the “SA T” sc heme serves as an idea lized model of co n ventional image senso rs, for which the saturation is cau sed by the limited full-well capa city of the semicondu ctor device. The general trend of con ventional image se nsor design has been to pack more pixels per chip by reducing pixel s izes, lead ing to lo wer full-well cap acities and thus lower sa turation values. Fig. 4 comp ares performances of the three dif ferent s ensing sc hemes ( i.e. , “ BIN”, “ IDEAL ”, and “SA T”) over a wide range of light exposure values. W e measure the performances in terms of s ignal-to-noise ratios (SNRs), defined as SNR = 10 log 10 c 2 E [( b c − c ) 2 ] , where b c is the estimation of the light exposure value we obtain from each of the sensing sch emes. W e obse rve tha t the “IDEAL ” sch eme (the red dash-dot line in the figure) represents an upper-bound of the es timation p erformance. T o s ee this, denote by y the number of photons that arri ve at the pixel during 16 exposure. Then y is a realization of a Po isson ran dom v ariable Y with intensity equal to the light exposure value c , i.e . , P ( Y = y ; c ) = c y e − c y ! . Maximizing this function over c , we can c ompute the MLE for the “ IDEAL ” sc heme as b c IDEAL ( y ) = y . It is easy to verify that this estimator is unbias ed, i.e. , E [ b c IDEAL ( Y )] = E [ Y ] = c , and that it ac hieves the ideal CRLB in (24), i.e. , var ( b c IDEAL ( Y )) = var ( Y ) = c . Accordingly , w e ca n c ompute the SNR as SNR IDEAL = 10 log 10 ( c 2 /c ) = 10 log 10 ( c ) , which a ppears a s a straight line in our figure with the light exposure v alues c shown in log arithmic s cale. The solid line in the figure correspon ds to the “SA T” scheme, with a sa turation p oint set a t C max = 91 30 , which is the full well cap acity of the image sensor reported in [21]. The sensor measurement in this cas e is y SA T def = min { y , C max } , a nd the estimator we use is b c SA T ( y SA T ) = y SA T . (33) W e c an se e tha t the “ SA T” sc heme initially has the same performanc e a s “IDEAL ”. It remains this way until the light exposure value c approache s the saturation point C max , after which there is a drastic drop 2 in SNR. Denoting by SNR min the minimum acce ptable SNR in a giv en application, we ca n then define the dynamic range of a sens or as the range of c for which the sens or achie ves at lea st SNR min . For example, if we choos e SNR min = 20 d B, then, as sh own in the figu re, the SA T sc heme has a dyna mic range from c = 10 2 to c ≈ 10 4 , or , if mea sured in terms of ratios, 100 : 1 . Finally , the three dashe d lines re present the “BIN” scheme with q = 1 and increasing oversampling factors (from left to right: K = 2 13 , 2 14 , 2 15 and 2 16 , resp ectiv ely). W e use the MLE giv en in (28) a nd plot the corresponding estimation SNRs . W e see that, within a lar ge rang e o f c , the pe rformance o f the “BIN” scheme is very close to that of the “IDEAL ” s cheme that does no t use quantization. This verifies our analysis in Theorem 1 , which states that the “BIN” sche me with a single-pho ton threshold can a pproach the ideal unquantized CRLB when the ov ersampling factor is large enough. F urthermore, when compared with the “SA T” scheme, the “ BIN” scheme h as a more gradual decrease in SNR whe n the light expos ure values inc rease, and has a higher dyna mic range. For example, wh en K = 2 16 , the dyn amic range of the “BIN” scheme spans from c = 10 2 to c = 10 5 . 8 , about two orders of magnitude higher than that of “S A T”. In Section V, we will prese nt a nume rical experiment tha t p oints to a potential a pplication of the binary sensor in high dynamic range photography . 2 The estimator in (33) is biased around c = C max . Fo r a very narro w range of light intensity v alues centered around C max , the MSE of this biased estimator is lower t han the i deal CRLB. Thus, there is actually a short “spike” in S NR right before the drop. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 17 Remark 7: Note that K is the product o f the spatial oversampling factor and the temporal oversampling factor . For example, the pixel pitch of the imag e sens or reported in [21] is 1 . 6 5 µm . If the binary sen sor is buil t on me mory ch ip techn ology , with a pitch size of 50 n m [3], then the maximum sp atial oversampling factor is abou t 1089 . T o achieve K = 2 13 , 2 14 , 2 15 and 2 16 , resp ectiv ely , a s required in Fig. 4, we then need to have temp oral oversampling factors ranging from 8 to 60 . Un like traditional sensors which require multi-bit quantize rs, the binary sens ors only nee d one -bit c omparators. This s implicity in hardware c an potentially lead to f aster readout spe eds, mak ing it practical to apply temporal oversampling. I V . O P T I M A L I M AG E R E C O N S T RU C T I O N A N D E FFI C I E N T I M P L E M E N T A T I O N S In the pre vious sec tion, we studied the performance of the binary image sens or , and deriv ed the MLE for a p iecewise-constant light fie ld model. Ou r analysis es tablishes the op timality of the MLE, showing that, with s ingle-photon thresholding and large oversampling factors, the MLE approac hes the performance of an ideal sensing sc heme without q uantization. In this section, we extends the MLE to the general linear field mo del in (1), with arbitrary interpolation kernels. As a main result of this work, we s how that the log-likelihood function is always conc av e. This desirable property g uarantees the global con vergence of iterati ve n umerical a lgorithms in solving the MLE. A. Image Reconstruction by MLE Under the linear field model introdu ced in Definition 1, reconstructing an image [ i.e. , the light field λ ( x ) ] is equiv alent to estimating the p arameters { c n } in (1). As shown in (8), the light exposu re values { s m } at different sen sors are related to { c n } throug h a linear mapping, implemented as u psampling followed by filtering a s in Fig. 2. Since it is linear , the mapping (8) can b e written as a matrix-vector multiplication s = G c , (34) where s def = [ s 0 , s 1 , . . . , s M − 1 ] T , c def = [ c 0 , c 1 , . . . , c N − 1 ] T , a nd G is an M × N matrix represe nting the combination of upsa mpling (by K ) and filtering (by g m ). Ea ch element of s can then be written a s s m = e T m Gc , (35) where e m is the m th standard Euc lidean bas is vector . 3 Remark 8: In using the a bove n otations, we do not d istinguish between single exposure a nd multiple exposures, whose equiv alence has been established by Propos ition 1 in Section II-D. In the case of multiple exposures, the e ssential structure of G —upsampling follo wed by filtering—remains the same. All we need 3 Here we use zero-based indexing. Thus, e 0 def = [1 , 0 , . . . , 0] T , e 1 def = [0 , 1 , . . . , 0] T , and so on. 18 to do is to replace s by the interlaced se quenc e { e s m } cons tructed in (16), the oversampling f actor K by K J , and the filter g m by e g m in (17). Similar to o ur deriv ations in (20) and (21), the likelihood function giv en M binary meas urements b def = [ b 0 , b 1 , . . . , b M − 1 ] T can be computed as L b ( c ) = M − 1 Y m =0 P ( B m = b m ; s m ) = M − 1 Y m =0 p b m ( e T m Gc ) , ( 36) where (36) follows fr om (12) and (35). In our subsequent dis cussion s, it is more con venient to work with the log-li kelihood function, define d as l b ( c ) def = log L b ( c ) = M − 1 X m =0 log p b m ( e T m Gc ) . (37) For any gi ven o bservation b , the MLE we seek is the parameter that ma ximizes L b ( c ) , or equi valently , l b ( c ) . Specifically , b c ML ( b ) def = arg max c ∈ [0 ,S ] N l b ( c ) = arg max c ∈ [0 ,S ] N M − 1 X m =0 log p b m ( e T m Gc ) . (38) The cons traint c ∈ [0 , S ] N means that every parameter c n should satisfy 0 ≤ c n ≤ S , for some prese t maximum value S . Example 3: As discu ssed in Section III, wh en the light field is piecewise-cons tant, different light field parameters { c n } can be estimated independen tly . In tha t case, the likelihood fun ction h as only one variable [see (22 )] a nd can b e e asily visualized. In Fig. 5, we plot L b ( c ) in (22) and the c orrespond ing log-likelihood function l b ( c ) , un der diff erent choices of the quantization thresho lds. W e obse rve from the figu res that the likelihood func tions are not concav e, but the log-li kelihood functions ind eed are. In what follo ws, we will show that this result is general, namely , the log-likeli hood functions in the form of (37) are always concave. Lemma 2: For any tw o integers i, j suc h that 0 ≤ i ≤ j < ∞ or 0 ≤ i < j = ∞ , the fun ction log j X k = i x k e − x k ! is concav e on the interval x ∈ [0 , ∞ ) . Pr oof: See Appendix G. Theorem 2: For arbitrary binary sensor measurements b , the log-lik elihood function l b ( c ) defined in (37) is concav e on the doma in c ∈ [0 , S ] N . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 19 0 40 80 120 160 200 0 1 2 3 4 x 10 −3 P S f r a g r e p l a c e m e n t s c L b ( c ) ℓ b ( c ) q = 1 q = 3 q = 5 (a) 0 40 80 120 160 200 −150 −100 −50 0 P S f r a g r e p l a c e m e n t s c L b ( c ) ℓ b ( c ) q = 1 q = 3 q = 5 (b) Fig. 5. The likelihood and log-likelihood functions for piecewise-co nstant li ght fields. (a) The likelihood functions L b ( c ) , defined in (22), und er different choices of the quantization thresholds q = 1 , 3 , 5 , respecti vely . (b) The corresponding log-likelihood func tions. In computing these functions, we set the parameters in (22) as follows: K = 12 , i.e. , the sensor is 12 -times ov ersampled. T he binary sensor measurem ents contain 10 “1”s, i.e. , K 1 = 10 . Pr oof: It follo ws from the defin ition in (12) that, for any b m ∈ { 0 , 1 } , the function log p b m ( s ) is either log q − 1 X k =0 s k e − s k ! or log ∞ X k = q s k e − s k ! . (39) W e can a pply Lemma 2 in b oth c ases, and show t hat { log p b m ( s ) } are concave f unctions for s ≥ 0 . Since the sum of con cave func tions is still concave and the c omposition of a concave function with a linear mapping ( s m = e T m Gc ) is still c oncave, we conc lude that the log -likelihood function defined in (37) is concave. In g eneral, there is no closed -form solution to the maximization problem in (38). An MLE solution ha s to b e found through nume rical algorithms. Th eorem 2 guarantees the global conv ergence o f the se iterativ e numerical me thods. B. Iterative Algorithm and Ef ficient Implementations W e compu te the numerical solution of the MLE by us ing a standard gradient as cent method. Den ote by c ( k ) the estimation of the unknown parameter c at the k th step. The e stimation c ( k +1) at the next step is obtained by c ( k +1) = P D  c ( k ) + γ k ▽ l b ( c ( k ) )  , (40) where ▽ l b ( c ( k ) ) is the gradient of the log-likelihood function ev aluated at c ( k ) , γ k is the step-siz e at the current iteration, and P D is the projection onto the search do main D def = [0 , S ] N . W e ap ply P D to ensure that all estimations of c lie in the s earch d omain. 20 T aking the deri vati ve of the log-likelihood function l b ( c ) in (37), we can compute the gradient as ▽ l b ( c ( k ) ) = G T h D b 0 ( s ( k ) 0 ) , D b 1 ( s ( k ) 1 ) , . . . , D b M − 1 ( s ( k ) M − 1 ) i T , (41) where s ( k ) def = [ s ( k ) 0 , . . . , s ( k ) M − 1 ] T = Gc ( k ) is the current estimation of the light exposure v alues , and D b ( s ) def = d ds log p b ( s ) for b = 0 , 1 . For example, wh en q = 1 , we have p 0 ( s ) = e − s and p 1 ( s ) = 1 − e − s . In this c ase, D 0 ( s ) = − 1 a nd D 1 ( s ) = 1 / (1 − e − s ) , res pectively . The cho ice of the s tep size γ k has significan t influe nce over the s peed of co n ver gence of the above iterati ve algorithm. W e follo w [9] by ch oosing, a t each step , a γ k so that the gradient vectors at the current and the next iterations a re approximately orthogonal to e ach other . By assuming that the estimates s ( k +1) and s ( k ) at conse cutiv e iterations are close to each other , we can u se the follo wing first-order approximation D b ( s ( k +1) m ) ≈ D b ( s ( k ) m ) + H b ( s ( k ) m )( s ( k +1) m − s ( k ) m ) , where H b ( s ) def = d ds D b ( s ) = d 2 ds 2 log p b ( s ) , for b = 0 , 1 . It foll ows that ▽ l b ( c ( k +1) ) = G T h D b 0 ( s ( k +1) 0 ) , D b 1 ( s ( k +1) 1 ) , . . . , D b M − 1 ( s ( k +1) M − 1 ) i ≈ ▽ l b ( c ( k ) ) + G T diag n H b 0 ( s ( k ) 0 ) , H b 1 ( s ( k ) 1 ) , . . . , H b M − 1 ( s ( k ) M − 1 ) o ( s ( k +1) − s ( k ) ) . (42) Assuming tha t the gradient upd ate c ( k ) + γ k ▽ l b ( c ( k ) ) is inside of the constraint set D , we can neglec t the projection o perator P D in (40), and write s ( k +1) − s ( k ) = G ( c ( k +1) − c ( k ) ) = γ k G ▽ l b ( c ( k ) ) . Substituting the above equality into (42), we get ▽ l b ( c ( k +1) ) ≈ ▽ l b ( c ( k ) ) + γ k G T diag n H b 0 ( s ( k ) 0 ) , H b 1 ( s ( k ) 1 ) , . . . , H b M − 1 ( s ( k ) M − 1 ) o G ▽ l b ( c ( k ) ) . Finally , by requiring that ▽ l b ( c ( k +1) ) be orthogon al to ▽ l b ( c ( k ) ) , we compute the optimal step s ize a s γ k = k ▽ l b ( c ( k ) ) k 2     diag  q − H b 0 ( s ( k ) 0 ) , . . . , q − H b M − 1 ( s ( k ) M − 1 )  G ▽ l b ( c ( k ) )     2 . (43) Remark 9: By definition, H b ( s ) (for b = 0 , 1 ) are the s econd-orde r deriv ati ves of conc ave fun ctions (see Lemma 2), a nd are thus no npositiv e. Co nseque ntly , the terms p − H b ( s ) in the denominator of (43) are well-defined. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 21 P S f r a g r e p l a c e m e n t s a m u m K g m G (a) P S f r a g r e p l a c e m e n t s b m v m K g − m G T (b) P S f r a g r e p l a c e m e n t s a m u 0 ,m u 1 ,m u K − 1 ,m g 0 ,m g 1 ,m g K − 1 ,m (c) P S f r a g r e p l a c e m e n t s v m b 0 ,m b 1 ,m b K − 1 ,m g 0 , − m g 1 , − m g K − 1 , − m (d) Fig. 6. Signal processing implementations of Ga and G T b . (a) T he product Ga can be obtained by upsampling followed by filtering. (b) The product G T b can be obtained by filtering follo wed by do wnsampling . Note that the fil ter used in (b) is g − m , i.e. , the “flipped” version of g m . (c) The polyphase domain implementation of (a). (d) The polyph ase domain implementation of (b). At every iteration of the gradient a lgorithm, we need to upd ate the gradient a nd the step size γ k . W e see from (41) and (43 ) tha t the compu tations always in volv e matrix-v ector products in the form of Ga and G T b , for some vectors a , b . The matrix G is of size M × N , wh ere M is the total number of pixels. In practice, M will be in the range of 10 9 ∼ 10 10 ( i.e. , gigapixel per chip), making it impossible to directly implement the matrix op erations. Fortunately , the matrix G used in both formulae is highly structured, and can be implemented as ups ampling follo wed by filtering (see ou r discussions in Section II-B and the expression (8) for details). Similarly , the transpose G T can be implemen ted by filtering (by g − m ) follo wed by do wnsa mpling, esse ntially “ flipping” all the operations in G . Fig. 6(a) and Fig. 6(b) s ummarizes these operations. W e no te that the implementations illustrated in Fig. 6(a) a nd F ig. 6(b) are n ot yet optimized: For example, the input to the filter g m in Fig. 6(a) is an upsampled sequence, containing mostly zero elements; I n Fig. 6(b), we compute a full filtering operation (by g − m ), only to discard most of the fi ltering res ults in the subsequ ent downsampling step. All these deficiencies c an be eliminated b y using the tool of po lyphase representations from multirate signal processing [15], [16], as follo ws. First, we split the filter g m into K non-overlapping p olyphas e compo nents g 0 ,m , g 1 ,m , . . . , g K − 1 ,m , define d as g k ,m = g K m + k , for 0 ≤ k < K. (44) Intuiti vely , the polyp hase comp onents spec ified in (44) a re simply downsampled versions of the original 22 filter g m , with the s ampling locations of all these polyphas e compo nents forming a complete partition. Th e mapping b etween the filter g m and its po lyphase compo nents is one-to-one. T o rec onstruct g m , we ca n easily verify that, in the z -domain, G ( z ) = G 0 ( z K ) + z − 1 G 1 ( z K ) + . . . z − ( K − 1) G K − 1 ( z K ) . (45) Follo wing the sa me s teps a s above, we c an also sp lit the s equenc es u m and b m in Fig. 6 into their re spective polyphas e componen ts u 0 ,m , u 1 ,m , . . . , u K − 1 ,m and b 0 ,m , b 1 ,m , . . . , b K − 1 ,m . Pr oposition 5: Denote by U k ( z ) a nd B k ( z ) (for 0 ≤ k < K ) the z -transforms o f u k ,m and b k ,m , respectively . Th en, U k ( z ) = A ( z ) G k ( z ) , for 0 ≤ k < K, (46) and V ( z ) = K − 1 X k =0 B k ( z ) G k ( z − 1 ) . (47) Pr oof: See Appendix H. The results of Prop osition 5 require some further exp lanations. What equation (46 ) su ggests is an alternati ve implementation of Ga , as shown in Fig. 6(c). W e compute K parallel c on voluti ons between the input a m and the p olyphase filters { g k ,m } . The c hanne l outpu ts are the polypha se components { u k ,m } , which can be combined to form the desired output u m . Similarly , it follows from (47) that G T b can be implemented by the parallel filtering scheme in Fig. 6(d). The new implemen tations in Fig. 6(c) and Fig. 6(d) a re significan tly faster than their respective coun- terparts. T o see this, suppos e that the filter g m has L coefficients. It is ea sy to see that the original implementation in Fig. 6(a) requires O ( K L ) arit hmetic operations for e very pixel in a m . In contrast, eac h indi vidual chan nel in Fig. 6(c) requires only O ( L/K ) arithmetic o perations (due to the sho rter supports of the polyph ase filters), and thus the total co st of Fig. 6(c) s tays at O ( L ) op erations per pixel. This represents a K -fold reduction in computationa l complexities. A similar a nalysis also shows that Fig. 6(d) ne eds K - times fe wer o perations than Fig. 6(b). Recall that K is the oversampling factor of o ur image se nsor . As we operate in highly o versampled regi mes ( e.g. , K = 1024 ) to compens ate for information los s due to one-bit quantizations, the above improvements ma ke our algorithms orders of magnitude faster . V . N U M E R I C A L R E S U LT S W e present s everal numerical results in this section to verify ou r theoretical analysis a nd the eff ectiveness of the propos ed image reconstruction algorithm. Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 23 P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (a) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (b) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (c) P S f r a g r e p l a c e m e n t s 0 0.2 0.4 0.6 0.8 1 0 500 1000 1500 2000 (d) Fig. 7. Binary sensing and reconstructions of 1-D light fields. (a) The original light fi eld λ ( x ) , modeled as a linear combination of shifted spline kernels. (b) The reconstruction result obtain ed by the proposed MLE-based algorithm, using measurements taken by a sensor with spatial ov ersampling factor K = 256 . (c) An improved reconstruction result due to the use of a larger spatial ov ersampling factor K = 2048 . (d) An alternati ve result, obtained by keeping K = 256 but taking J = 8 consecuti ve exposures. A. 1-D Synthetic Signals Consider a 1 -D light field λ ( x ) s hown in Fig. 7 (a). The interpolation filter ϕ ( x ) we use is the cubic B-spline function β 3 ( x ) defined in (3). W e c an see that λ ( x ) is a linea r combination of the shifted kernels, with the expansion co efficients { c n } shown a s b lue do ts in the figure. W e simulate a bina ry s ensor with threshold q = 1 an d oversampling factor K = 256 . Applying the proposed MLE-bas ed algorithm in Section IV, we obtain a rec onstructed light field (the red dashed curve) shown in Fig. 7(b), tog ether w ith the original “ ground truth” (the blue solid curve). W e observe that the low- light regions are well-recons tructed but there exist large “overshoots” in the high-light regions. W e can substan tially improve the recon struction qua lity b y inc reasing the oversampling factor of the sensor . Fig. 7(c) shows the res ult obta ined by increa sing the spatial oversampling factor to K = 2048 . Alternati vely , we show in Fig. 7 (d) a dif ferent reconstruction resu lt obtained by keeping the original spatial ov ersampling f actor at K = 256 , b ut taking J = 8 cons ecutive exposu res. V isually , the two sensor configurations , i.e. , K = 2048 , J = 1 a nd K = 256 , J = 8 , lead to very simi lar recons truction performances. This observation agrees with o ur pre vious theoretical ana lysis in Sec tion II -D on the equi valence between spatial a nd temp oral o versampling sche mes. 24 (a) (b) (c) Fig. 8. High dynamic range photography using the bina ry sensor . (a) A sequen ce of i mages taken inside of a church w ith decreasing expo sure times [22]. (b) The reconstructed high dynamic range radiance map (in logarithmic scales) using our MLE reconstruction algorithm. (c) The tone-ma pped version of the recon structed radiance map. B. Ac quiring Scenes with High Dynam ic Ranges A well-known difficulty in photograp hy is the limited d ynamic ran ges of the image s ensors. Capturing both very bright an d very dark regions fait hfully in a single image is dif ficult. For example, Fig. 8(a) s hows several images taken inside of a church with dif ferent exposure times [22]. The scene contains both sun-lit areas and s hadow regions, with the former over a thousa nd times brighter than the latter . Suc h h igh dynamic ranges a re well-beyond the capabilities of c on ventional image sen sors. As a result, thes e images a re either overexposed or underexpose d, with no single image rendering details in both areas. In light of this problem, an activ e area of research in compu tational photograp hy is to recons truct a h igh d ynamic ran ge radiance map by combining multiple imag es with dif ferent exposure se ttings (see, e.g. , [22], [23]). While produ cing succe ssful results, such multi-exposure approa ches can be time-cons uming. In Sec tion III -C , we have shown that the bina ry sensor studied in this work can achiev e highe r dynamic ranges than con ventional image s ensors . T o de monstrate this a dvantage, we use the high d ynamic range radiance map obtained in [22] as the ground truth data [ i.e. , the light fie ld λ ( x ) as define d in (1)], and simulate the acquis ition of this scene by using a binary sensor with a single photon threshold. The spatial oversampling factor of the binary se nsor is set to 32 × 32 , and the tempo ral oversampling factor is 256 ( i.e. , 256 independent frames ). Similar to o ur p revious experiment on 1 -D signals, we use a cubic B-spline kernel [ i.e. , ϕ ( x ) = β 3 ( x ) ] along e ach of the spatial dimensions. Fig. 8(b) sho ws the reconstructed radiance map using our algorithm described in S ection IV. Since the radiance map has a dynamic range of 3 . 3 × 10 5 : 1 , the image is shown in logarithmic scale. T o ha ve a visu ally more pleasing result, we also s hown in Fig. 8(c) a tone-mapp ed [23] version of the reconstruction. W e ca n see from Fig. 8(b) and Fig. 8(c) tha t details in both light and shadow regions have be en faithfully preserved in the recon structed radiance map, su ggesting Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 25 Fig. 9. Image reconstruction from the binary measurements taken by a S P AD sensor [17], with a spatial resolution of 32 × 32 pixels. The final image (lower-right corner) is obtained by incorpora ting 4096 consecutiv e frames, 50 of w hich are sho wn i n the figure. the potential application of the binary se nsor in high dyn amic ran ge p hotography . C. Re sults o n Re al Sensor Data W e have also applied our recon struction algorithm to images taken by an expe rimental se nsor based on single photon avalanche d iodes (SP ADs) [17]. The senso r has binary-valued pixels with single-photon sensitiviti es, i.e. , the quantization threshold is q = 1 . Du e to its experimental nature, the sensor has limited spatial resolution, con taining an array of only 3 2 × 32 detectors. T o emulate the ef fect of spatial oversampling, we apply temporal oversampling and acquire 4096 independ ent bina ry frames of a s tatic scene . In this case, we can estimate the light inten sity at each p ixel indep enden tly by using the closed-form MLE solution in (28). Fig. 9 shows 50 such binary images, tog ether with the final rec onstruction result (at the lower -right corner). Th e qu ality o f rec onstruction veri fies o ur theoretical model and analys is. V I . C O N C L U S I O N S W e presented a the oretical study of a new image sens or that acquires light information us ing one-bit pixels—a scheme reminiscent of traditional photograph ic film. By formulating the b inary sensing sc heme as a parameter estimation problem based on quan tized Poisson s tatistics, we analyz ed the p erformance of the binary senso r in acqu iring light intensity information. Our analysis shows that, with a single-photon quantization threshold a nd lar ge oversampling factors, the b inary sensor pe rforms much li ke an ideal s ensor , as if there were n o quantiza tion. T o recover the light fie ld from b inary s ensor measureme nts, we prop osed an MLE-based image reconstruction algorithm. W e showed tha t the corresponding log-likelihood function is alw ays c oncave, thus gu aranteeing the global con ver gence of numerical so lutions. T o solve for the MLE, we adop t a standard gradient method, an d derive efficient implementations using fast signa l process ing algorithms in the polyphas e d omain. Finally , we presen ted n umerical results on both sy nthetic data an d images taken by a prototype senso r . Th ese results verify our theo retical ana lysis and demo nstrate the eff ectiveness of o ur image rec onstruction a lgorithm. They a lso point to the poten tial of the new binary sensor in high dynamic range photography applications . 26 A P P E N D I X A. Proof of Proposition 1 The sequence e s m in (16) can be written, equiv alently , as e s m = P J − 1 j =0 P M − 1 n =0 s j,n δ m − J n − j , where δ l is the Kronecker delta function. T aking z -transforms on both sides o f the equality lead s to e S ( z ) = J − 1 X j =0 z − j M − 1 X n =0 s j,n z − J n = J − 1 X j =0 z − j S j ( z J ) . (48) By substituting (15) into (48) and us ing the definition (17) , we can simplify (48) as e S ( z ) = C ( z K J ) e G ( z ) . (49) Finally , since C ( z K J ) is the z -transform of the se quenc e P n c n δ m − K J n , it follows fr om (49) that e s m = ( P n c n δ m − K J n ) ∗ e g m , and thus (18). B. Th e CRL B of Binary Sensors W e first compute the Fisher infor mation, defined a s I ( c ) = E [ − ∂ 2 ∂ c 2 log L b ( c )] . Using (22), we ge t I ( c ) = E  − ∂ 2 ∂ c 2  K 1 log p 1 ( c/K ) + ( K − K 1 ) log p 0 ( c/K )   = E " K 1  p ′′ 0 ( c/K ) p 1 ( c/K ) + p ′ 0 ( c/K ) 2  K 2 p 1 ( c/K ) 2 − ( K − K 1 )  p ′′ 0 ( c/K ) p 0 ( c/K ) − p ′ 0 ( c/K ) 2  K 2 p 0 ( c/K ) 2 # , (50) where p ′ 0 ( x ) = d dx p 0 ( x ) a nd p ′′ 0 ( x ) = d 2 dx 2 p 0 ( x ) a re the first and second order deriv ati ve of p 0 ( x ) , respecti vely . In reaching (50) , we have also used the fact that p 1 ( x ) = 1 − p 0 ( x ) and thus p ′ 0 ( x ) = − d dx p 1 ( x ) and p ′′ 0 ( x ) = − d 2 dx 2 p 1 ( x ) . Note that K 1 = P 0 ≤ m 0 . So, for x ≥ 1 , (57) is greater tha n  Z ∞ −∞ e − 1 2 u 2 / p 4 + u 2 du  2 ≈ 1 . 3 1 . (58) For 0 ≤ x < 1 , we can ob tain the following inequa lities by keeping the first two terms in p 0 ( x ) a nd the first te rm in p 1 ( x ) : p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ (1 + x ) e − 2 x x q /q ! x 2 q − 1 e − 2 x / ( q − 1)! 2 = ( q − 1)!( x 1 − q + x 2 − q ) q ≥ 2( q − 1)! q . It is eas y to see that 2( q − 1)! q is a monoton ically increasing function with respect to q ≥ 2 . Therefore, 2( q − 1)! q ≥ 4 3 ≈ 1 . 3 3 f or q ≥ 3 . (59) Finally , for q = 2 , we k eep the first two terms in p 0 ( x ) and p 1 ( x ) a nd g et p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ (1 + x ) e − 2 x  x 2 / 2 + x 3 / 6  x 3 e − 2 x = x/ 6 + 2 / 3 + 1 2 x ≥ 1 . 3 3 f or 0 ≤ x ≤ 1 . (60) Combining (58 ), (59) and (60), we reach the desired resu lt in (26). Finally , we show that, for q ≥ 2 , the “gap” betwee n CRLB bin ( K, q ) and CRLB ideal ( K ) will on ly get bigger as the oversampling factor K grows. T o that end , we notice that when K → ∞ , the variable x = c/K → 0 . Keeping the first terms in p 0 ( x ) a nd p 1 ( x ) , we have p 0 ( x ) p 1 ( x ) x 2 q − 1 e − 2 x / ( q − 1)! 2 ≥ e − 2 x x q /q ! x 2 q − 1 e − 2 x / ( q − 1)! 2 = ( q − 1)! x 1 − q q . For q ≥ 2 , the a bove quantity go es to infinity as x → 0 . Therefore, lim K →∞ CRLB bin ( K, q ) / CRLB ideal ( K ) = ∞ . Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 29 E. Proof of Th eorem 1 When q = 1 , we ha ve p 0 ( x ) = e − x and thus p [ − 1] 0 ( x ) = − log ( x ) . In this case, the MLE solution in (28) can be rewri tten as b c ML ( b ) =      − K log (1 − K 1 /K ) , if 0 ≤ K 1 ≤ K (1 − e − S/K ) , S, otherwise . W e note that − K log(1 − K 1 /K ) = K 1 + K 2 1 2 K + K 3 1 3 K 2 + . . . a nd that lim K →∞ K (1 − e − S/K ) = S . Thu s, for s ufficiently lar ge K , the a bove MLE solution can be further simplified as b c ML ( b ) =      K 1 + O ( 1 K ) , if 0 ≤ K 1 < S, S, otherwise . (61) W ithout loss of ge nerality , we assu me that S is an integer in what follows. The expe cted value of the MLE then b ecomes E [ b c ML ( b )] = S − 1 X n =0 n P ( K 1 = n ) + S K X n = S P ( K 1 = n ) + O (1 /K ) . Using the followi ng identity c = P ∞ n =0 n c n e − c n ! about the mean of a Poisson rand om variable, we hav e   E [ b c ML ( b )] − c   =    S − 1 X n =0 n  P ( K 1 = n ) − c n e − c n !  + S K X n = S P ( K 1 = n ) − ∞ X n = S n c n e − c n ! + O (1 /K )    ≤ S    S − 1 X n =0  P ( K 1 = n ) − c n e − c n !     + S K X n = S P ( K 1 = n ) + ∞ X n = S c n e − c ( n − 1)! + O (1 /K ) . (62) In what follows, we de ri ve bound s for the quantities on the right-hand side of the a bove inequality . First, consider the probability P ( K 1 = n ) . Since K 1 is a binomial random variable, we ha ve P ( K 1 = n ) =  K n  (1 − p 0 ( c/K )) n p 0 ( c/K ) ( K − n ) = K ( K − 1) . . . ( K − n + 1) n ! K n ( K (1 − e − c/K )) n e − c ( K − n ) /K . (63) For e very n < S , it is easy to verify that K ( K − 1) ... ( K − n +1) K n = 1 + O (1 /K ) , ( K (1 − e − c/K )) n = c n + O (1 /K ) and e − c ( K − n ) /K = e − c + O (1 /K ) . Thus, for any n < S , we can simplify (63) as P ( K 1 = n ) = c n n ! e − c + O (1 /K ) . (64) It foll ows that S    S − 1 X n =0  P ( K 1 = n ) − c n e − c n !     = O (1 /K ) . (65) 30 Next, cons ider the second term on the right-hand s ide o f (62). S K X n = S P ( K 1 = n ) = S (1 − S − 1 X n =0 P ( K 1 = n )) = S (1 − S − 1 X n =0 c n n ! e − c ) + O (1 /K ) (66) = S ∞ X n = S c n n ! e − c + O (1 /K ) ≤ S e − c  ec S  S + O (1 /K ) , (67) for a ll c < S , where (66) follo ws from (64) a nd the inequality (67) is due to the Chernoff bound on the tail of P oisson d istrib utions [24]. Similarly , the third term on the right-hand side of (62) can be rewritt en as ∞ X n = S c n e − c ( n − 1)! = c ∞ X n = S − 1 c n e − c n ! ≤ c e − c  ec S − 1  S − 1 , (68 ) where the ine quality is a gain an application of the Chernoff bound . Finally , on subs tituting (65), (67) and (68) into (62), and after some simple manipulations, we reach (29). The proof for the mean -squared error formula (30 ) is similar . Using (61) , we have E  ( b c ML ( b ) − c ) 2  = S − 1 X n =0 ( n − c ) 2 P ( K 1 = n ) + ( S − c ) 2 K X n = S P ( K 1 = n ) + O (1 /K ) = S − 1 X n =0 ( n − c ) 2 c n e − c n ! + ( S − c ) 2 ∞ X n = S c n e − c n ! + O (1 /K ) , (69) where in reaching (69), we have used the es timation (64) of the Binomial prob abilities. W e note that the variance of a Poisson random variable is equal to its mean. Thus, c = P ∞ n =0 ( n − c ) 2 c n e − c n ! . On combining this identity with (69),    E  ( b c ML ( b ) − c ) 2  − c    =    ( S − c ) 2 ∞ X n = S c n e − c n ! − ∞ X n = S ( n − c ) 2 c n e − c n ! + O (1 /K )    ≤ 2 ∞ X n = S n 2 c n e − c n ! + O (1 /K ) , for c ≤ S ≤ 2 c ( c + 1) ∞ X n = S − 2 c n e − c n ! + O (1 /K ) , for c < S − 2 . Applying the Chernoff bound to the above ineq uality , we get (30). Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 31 F . Pr oof of Pr o position 4 W e have p 0 ( x ) = e − x P q − 1 n =0 x n n ! , and thus 1 − p 0 ( x ) = e − x P ∞ n = q x n n ! . It follo ws that K (1 − p 0 ( S/K )) = K e − S/K  S q K q q ! + S q +1 K q +1 ( q + 1)! + . . .  = e − S/K  S q K q − 1 q ! + S q +1 K q ( q + 1)! + . . .  . For q ≥ 2 and any fi xed co nstant S , the above quantity c on ver ges to 0 as K tend s to infinity . As a resu lt, for s ufficiently lar ge K , the MLE solution in (28) can be simplified as b c ML ( b ) =      0 , if K 1 = 0 , S, otherwise , (70) where we have a lso used the fact that p 0 (0) = 1 an d thus p [ − 1] 0 (1) = 0 . Us ing (70), we can c ompute the expected value of the MLE as E [ b c ML ( b )] = 0 P ( K 1 = 0) + S (1 − P ( K 1 = 0)) = S (1 − P ( K 1 = 0)) . (71) W e have K 1 = 0 when all the pix el responses are uniformly 0 . The probability of seeing such an event is P ( K 1 = 0) = p 0 ( c/K ) K = e − c  1 + c K + . . . + c q − 1 K q − 1 ( q − 1)!  K , which c on ver ges to 1 a s K tends to infinity , i.e . , lim K →∞ P ( K 1 = 0) = 1 . (72) Substituting (72) into (71), we get (31). Next, we compute the MSE as E  ( b c ML ( b ) − c ) 2  = c 2 P ( K 1 = 0) + ( S − c ) 2  1 − P ( K 1 = 0)  , which, upon taking the limit as K → ∞ , leads to (32). G. Proof of Le mma 2 The function h ( x ) def = log P j k = i x k e − x k ! is continuous ly differentiable on the interval (0 , ∞ ) . The refore, to establish its concavity , we just need to show that its second deri vati ve is nonpositiv e. T o that end, we first introduce a seque nce of functions { r k ( x ) } k ∈ Z ∪{∞} , defi ned as r k ( x ) def =      x k /k ! , if 0 ≤ k < ∞ ; 0 , if k < 0 or k = ∞ . (73) 32 It is s traightforward to verify that d dx r k ( x ) = r k − 1 ( x ) f or all k ∈ Z ∪ {∞} . Now , rewr iting h ( x ) as log P j k = i r k ( x ) − x and co mputing its second deri vati ve, we get d 2 dx 2 h ( x ) = ( P j k = i r k − 2 )( P j k = i r k ) − ( P j k = i r k − 1 ) 2 ( P j k = i r k ) 2 , (74) where we have omitted the function ar gument x in r k ( x ) , r k − 1 ( x ) and r k − 2 ( x ) for notational simplicity . Recall tha t our goal is to show that d 2 dx 2 h ( x ) ≤ 0 , for x > 0 . Since the den ominator of (74) is always positiv e, we just need to focus on its numerator . Using the iden tities P i ≤ k ≤ j r k = P i ≤ k ≤ j r k − 1 + r j − r i and P i ≤ k ≤ j r k − 1 = P i ≤ k ≤ j r k − 2 + r j − 1 − r i − 1 , we can simplify the numerator of (74) as follo ws:  X i ≤ k ≤ j r k − 2  X i ≤ k ≤ j r k − 1 + r j − r i  −  X i ≤ k ≤ j r k − 1  X i ≤ k ≤ j r k − 2 + r j − 1 − r i − 1  = X i ≤ k ≤ j  ( r k − 2 r j − r k − 1 r j − 1 ) + ( r k − 1 r i − 1 − r k − 2 r i )  . (75) In w hat follo ws, we show that r k − 2 ( x ) r j ( x ) − r k − 1 ( x ) r j − 1 ( x ) ≤ 0 (76) for arbitrary choices of x ≥ 0 and i ≤ k ≤ j , where 0 ≤ i ≤ j < ∞ or 0 ≤ i < j = ∞ . Note that, when k < 2 or j = ∞ , the left-hand side of (76) becomes − r k − 1 ( x ) r j − 1 ( x ) and thus (76) automatically holds. Now , assume that k ≥ 2 and j < ∞ . From the definition in (73), the left-hand side of (76) is x k − 2 x j ( k − 2)! j ! − x k − 1 x j − 1 ( k − 1)!( j − 1)! = x k + j − 2 ( k − 2)!( j − 1)!  1 j − 1 k − 1  ≤ 0 for i ≤ k ≤ j . Using similar ar guments, we can also show that r k − 1 ( x ) r i − 1 ( x ) − r k − 2 ( x ) r i ( x ) ≤ 0 , for x ≥ 0 . (77) On s ubstituting the inequ alities (76) and (77 ) into (75), we verif y that the numerator of (74) is nonpositiv e, and therefore d 2 dx 2 h ( x ) ≤ 0 , for all x > 0 . H. Proof of Proposition 5 Expressing the signa l proce ssing operations in Fig. 6 in the z -domain, we have U ( z ) = A ( z K ) G ( z ) = A ( z K ) G 0 ( z K ) + z − 1 A ( z K ) G 1 ( z K ) + . . . + z − ( K − 1) A ( z K ) G K − 1 ( z K ) , (78) where A ( z K ) in the first equa lity is the z -transform of the K -times upsampled version of a m , an d (78) follo ws from (45). Similar to (45), we can exp and U ( z ) in terms of the z -transforms of its polyp hase compone nts, as U ( z ) = U 0 ( z K ) + z − 1 U 1 ( z K ) + . . . + z − ( K − 1) U K − 1 ( z K ) . (79) Y ANG et al. : BITS FR OM PHO TONS: O VERSAMPLED IMA GE A CQUISITION USING BIN AR Y POISSON ST A TISTICS 33 Comparing (78) a gainst (79) and u sing the uniqu eness of the p olyphase decompos ition, we conclude that U k ( z ) = A ( z ) G k ( z ) , for all 0 ≤ k < K . Now , consider Fig. 6 (b). W e note that the z -transform of g − m is G ( z − 1 ) . Den ote by d m the output o f the filteri ng operation. The n, its z -transform can be comp uted as D ( z ) = B ( z ) G ( z − 1 ) = K − 1 X k =0 z − k B k ( z K ) ! K − 1 X k =0 z k G k ( z − K ) ! = K − 1 X k =0 B k ( z K ) G k ( z − K ) + X 0 ≤ i 6 = j

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment