Optimal and Unbiased Fluxes from Up-the-Ramp Detectors under Variable Illumination

Near-infrared (NIR) detectors -- which use non-destructive readouts to measure time-series counts-per-pixel -- play a crucial role in modern astrophysics. Standard NIR flux extraction techniques were developed for space-based observations and assume …

Authors: Bowen Li, Kevin A. McKinnon, Andrew K. Saydjari

Optimal and Unbiased Fluxes from Up-the-Ramp Detectors under Variable Illumination
Draft version March 24, 2026 Typeset using L A T E X tw o column style in AAST eX7.0.1 Optimal and Un biased Fluxes from Up-the-Ramp Detectors under V ariable Illumination Bowen Li , 1 Kevin A. McKinnon , 2 Andrew K. Sa ydjari , 3 , ∗ Conor Sa yres , 4 Gwendol yn M. Eadie , 1, 2 Andrew R. Casey , 5, 6 Jon A. Hol tzman , 7 Timothy D. Brandt , 8 and Jos ´ e G. Fern ´ andez-Trincado 9, 10 1 Dep artment of Statistic al Scienc es, University of T or onto, 700 University Avenue, T or onto, ON M5G 1Z5, Canada 2 David A. Dunlap Dep artment of Astr onomy & Astrophysics, University of T or onto, 50 St Ge or ge Str e et, T or onto, ON M5S 3H4, Canada 3 Dep artment of Astr ophysic al Scienc es, Princeton University, Princ eton, NJ 08544 USA 4 Dep artment of Astr onomy, University of Washington, Box 351580, Seattle, W A 98195, USA 5 Scho ol of Physics and Astr onomy, Monash University, Clayton, VIC 3800, A ustr alia 6 Center for Computational Astr ophysics, Flatir on Institute, 162 Fifth Avenue, New Y ork, NY 10010, USA 7 Dep artment of Astr onomy, New Mexic o State University, P.O. Box 30001, MSC 4500, L as Cruc es, NM 88033, USA 8 Sp ac e T elesc op e Science Institute, 3700 San Martin Drive, Baltimor e, MD 21218, USA 9 Universidad Cat´ olic a del Norte, N´ ucle o UCN en Ar que olo g ´ ıa Gal´ actic a - Inst. de Astr onom ´ ıa, Av. Angamos 0610, A ntofagasta, Chile 10 Universidad Cat´ olic a del Norte, Dep artamento de Ingenier ´ ıa de Sistemas y Computaci´ on, A v. Angamos 0610, Antofagasta, Chile (Dated: Jan uary 2026; Received January 15, 2026; Revised Marc h 20, 2026; Accepted FUTURE D A TE) Submitted to Publications of the Astronomical Society of the Pacific ABSTRA CT Near-infrared (NIR) detectors – whic h use non-destructive readouts to measure time-series coun ts- p er-pixel – play a crucial role in mo dern astrophysics. Standard NIR flux extraction tec hniques were dev elop ed for space-based observ ations and assume that source fluxes are constan t o v er an observ ation. Ho wev er, ground-based telescop es often see short-timescale atmospheric v ariations that can dramati- cally c hange the n um b er of photons arrivin g at a pixel. This w ork presen ts a new statistical model that shares information b etw een neigh b oring spectral pixels to c haracterize time-v ariable observ ations and extract un biased fluxes with optimal uncertainties. W e generate realistic syn thetic data using a v ariety of flux and amplitude-of-time-v ariabilit y conditions to confirm that our mo del reco v ers unbiased and optimal estimates of both the true flux and the time-v ariable signal. W e find that the time-v ariable mo del should b e fav ored ov er a constant-flux model when the observed count rates c hange by more than 3.5%. Ignoring time v ariabilit y in the data can result in flux-dep endent, unknown-sign biases that are as large as ∼ 120% of the flux uncertaint y . Using real APOGEE sp ectra, we find empirical evidence for approximately wa v elength-independent, time-dependent v ariations in count rates with amplitudes muc h greater than the 3.5% threshold. Our mo del can robustly measure and remo v e the time-dep endence in real data, improving the quality of data-model comparison. W e show sev eral ex- amples where the observ ed time-dependence quantitativ ely agrees with indep endent measuremen ts of observing conditions, such as v ariable cloud cov er and seeing. Keywor ds: Astronomical detectors (84); Astronomical instrumentation (799) 1. INTR ODUCTION Extracting flux from near-infrared (NIR) observ ations is imp ortant for many areas of astrophysical research. NIR information, in particular, probes further into dust y Corresponding author: Bo wen Li bowenj.li@mail.utoron to.ca ∗ Hubble F ellow regions than optical wa v elengths, enabling the study of stars on the other side of our Galaxy or in the dust- obscured centers of neighboring galaxies. Man y mo d- ern telescop es used in ongoing astronomical surveys use NIR detectors, suc h as the James W ebb Space T ele- scop e (JWST), the twin APOGEE spectrographs ( Wil- son et al. 2019 ) at the Apac he P oin t ( Gunn et al. 2006 ) and Las Campanas Observ atories ( Bow en & V aughan 1973 ), the Prime F o cus Sp ectrograph (PFS) on the Sub- 2 Li et al. aru telescope ( Sugai et al. 2015 ), and the so on-to-launch Nancy Grace Roman Space T elescop e. A k ey difference b et ween NIR detectors and classical CCDs is that they can p erform non-destructive readouts, whic h means that eac h pixel keeps a record of coun ts as a function of time. As a result, counts versus time plots tend to mak e a “ramp” shape, where the slop e is defined by the inciden t flux. The time series asso ciated with this measurement is often referred to as “sampling up-the-ramp” (SUTR). A large amoun t of pioneering work on NIR detectors w as done in preparation for JWST, with Fixsen et al. ( 2000 ) first defining a mo del to estimate the flux and its corresponding uncertaint y from SUTR observ ations. Subsequen t work – such as, Offenberg et al. ( 2001 ) and Rausc her et al. ( 2007 ) – further refined the definitions of the exp ected fluxes and uncertainties and tested against real observ ations with NIR detectors. A k ey difference b et ween these approaches comes from how the observed coun ts at different read times are weigh ted when mea- suring the best-fit slop e/flux. Rauscher et al. ( 2007 ), for example, uses equal w eights such that the best-fit slop e is the one that minimizes the distance b etw een all mea- sured coun ts. This approach is statistically v alid if noise is independent from read to read (as for idealized read noise), but does not account for the cov ariant b eha vior of photon noise. Fixsen et al. ( 2000 ), how ev er, applies larger w eights to the first few and last few reads when photon noise dominates. This leads to higher extracted signal-to-noise ratios (SNR), esp ecially in the high flux limit. W e note that the current JWST pip elines use the Fixsen et al. ( 2000 ) approac h while the main data reduc- tion pip eline of the APOGEE sp ectrographs 11 ( Holtz- man et al. 2014 ; Nidever et al. 2015 ) use the Rauscher et al. ( 2007 ) approach. More recen tly , Brandt ( 2024 ) presented a new opti- mal fitting metho d that addresses many of the previous limitations. Fixsen et al. ( 2000 ) used appro ximate, dis- crete, tabulated weigh ts for combining the individual reads in to fluxes. This sacrificed some SNR to a v oid exp ensiv e computations asso ciated with inv erting a sep- arate cov ariance matrix for ev ery pixel. Brandt ( 2024 ), b y recasting the problem in the space of read differences, deriv ed a closed-form solution for the maximum lik eli- ho o d flux as a linear function of the individual reads. This enables un biased and sligh tly higher SNR flux mea- suremen ts at a small cost in computational performance. Ov erall, the Brandt ( 2024 ) approach is v ery effective for space-telescop e data where the flux of sources is con- stan t ov er an exposure, suc h as for JWST and Roman. 11 apogee drp can be found here https://gith ub.com/sdss/ apogee drp/tree/daily Ho wev er, the constan t-flux assumption begins to break do wn when dealing with ground-based telescopes, where atmospheric conditions can change substantially on the timescale of an exp osure (i.e., a few minutes) or betw een differen t NIR detector readouts (i.e., ∼ 10 seconds). With v ariable illumination, using differen t w eights from one pixel to another – using either Fixsen et al. ( 2000 ) or Brandt ( 2024 ) – will lead to fluxes that are biased differen tly depending on the particular time-dependent signal and each pixel’s true flux. Any mo del that treats all pixels identically , like that of Rauscher et al. ( 2007 ), will a v oid this differen tial bias but will sacrifice SNR. This pap er prop oses a new mo del to optimally ex- tract fluxes from NIR detectors when dealing with time- v ariable data. Section 2 presents the mo del and its un- derlying statistics. It explains the k ey underlying as- sumptions and cav eats ab out its use. Section 3 then tests the mo del with realistic synthetic data under a v a- riet y of conditions. These tests are expanded to real data from the northern APOGEE spectrograph in Section 4 . Finally , our conclusions are presented in Section 5 along with prop osed future w ork fo cusing on impro vemen ts and implementation in a no v el APOGEE data reduc- tion pipeline. 2. ST A TISTICAL MODEL Our goal is to measure, up to a m ultiplicativ e con- stan t, unbiased relative fluxes of all pixels on a detector asso ciated with a given spectrum read out up-the-ramp. This could b e accomplished by applying equal weigh ts to the differen t reads in a pixel (e.g., Rausc her et al. 2007 ), or b y an y mo del that applies the same read-by- read weigh ts to all pixels. Ho wev er, such an approach is not optimal. Due to the different co v ariance of read noise and photon noise, higher flux SNRs are p ossible b y weigh ting the reads differently depending on eac h pixel’s level of illumination ( Fixsen et al. 2000 ). The b est weigh ts for eac h pixel dep end b oth on the flux in that pixel and on the time-dep endence of the illumina- tion, an asp ect that is not treated in either Fixsen et al. ( 2000 ) or Brandt ( 2024 ). W e seek a generalization of the Brandt ( 2024 ) optimal approach that accoun ts for non- constan t illumination, assuming the time dep endence of the illumination is shared betw een neigh b oring pixels. In the case of Brandt ( 2024 ), this illumination function is a constant. W e b egin by defining some terms. Let n r b e the num- b er of read-out times and n p b e the num ber of pixels. F or the case where the flux in a pixel is not changing o ver time, w e exp ect that the observed counts in pixel p Optimal Fluxes fr om V ariable Illumina tion 3 at read j is described by: c j,p = δ j,p + j X i =1 ∆ r i,p (1) where j ∈ [1 , n r ], δ j,p is noise associated with the mea- suremen t, and ∆ r i,p is the true num b er of new photons that arrive at pixel p b etw een reads j − 1 and j . In this wa y , w e use i to k eep track of information b etwe en reads while j is used to trac k information at differen t read times. The measuremen t noise is describ ed by: δ j,p ∼ N  µ = 0 , σ 2 = σ 2 r,p  (2) where σ r,p is the per-pixel read noise, which we assume has b een previously measured during calibration of the detector (e.g., using the mo del of Brandt 2025 ). The ∆ r i,p is described by: ∆ r i,p ∼ Poisson ( λ = f p ) (3) where f p is the true flux of the source in pixel p . Because mo deling combinations of Poisson and Gaus- sian distributions is often computationally exp ensive, it is common to mak e a Poisson-appro ximated-as- Gaussian assumption such that: ∆ r i,p ∼ Poisson ( λ = f p ) ≈ N  µ = f p , σ 2 = f p  , (4) and we will do the same for this work. This assump- tion is very go o d in the case of large f p where Gaussian and P oisson distributions lo ok quite similar. As f p de- creases, the Poisson-to-Gaussian approximation is less go o d; ho w ever, this is often at the p oint where the im- pact on the observed counts from the read noise starts to dominate, so the final c j,p distribution is still well- appro ximated by a Gaussian for all f p v alues. After making this appro ximation, w e are left with a combi- nation of Gaussian distributions, which mak es analysis m uch easier. Brandt ( 2024 ) show ed that if we c hange to w ork with coun t differences b etw een neighboring reads instead of coun ts – that is, ∆ c j,p = c j +1 ,p − c j,p – then w e ha ve the following relationship:  ∆ c p ∼ N   µ = f p ·  1 , V V V p = V V V r,p + f p · I I I  (5) where  ∆ c p is a vector of the ( n r − 1) count differences, I I I is the ( n r − 1) × ( n r − 1) identit y matrix, and V V V r,p is a tri-diagonal matrix that describ ed the read noise as V V V r,p =          2 σ 2 r,p − σ 2 r,p 0 · · · 0 − σ 2 r,p 2 σ 2 r,p − σ 2 r,p · · · 0 0 − σ 2 r,p 2 σ 2 r,p · · · 0 . . . . . . . . . . . . . . . 0 0 0 · · · 2 σ 2 r,p          . (6) Solving the mo del in Equation 5 is difficult in the- ory b ecause the f p v alue of interest shows up in both the n umerator and denominator of the arguments in the Gaussian likelihoo d. How ev er, w e often ha v e a goo d first guess for what f p should b e based on the data – such as either the mean or median of the observ ed coun t differ- ences – esp ecially in the case of a large num b er of reads, whic h is quite typical with NIR detectors. If w e tak e this first guess, b f p , to generate c V V V p = V V V p ( b f p ), then the b est fit flux is defined to be: µ f p =   1 T · c V V V p − 1 ·  1  − 1 ·   1 T · c V V V p − 1 ·  ∆ c p  (7) with uncertain t y σ f p =   1 T · c V V V p − 1 ·  1  − 1 / 2 . (8) W e can then iterate using this new b est guess for f p to up date b f p = µ f p and c V V V p to measure an improv ed µ f p , rep eating un til con vergence. In practice, Brandt ( 2024 ) find that a single up date and recalculation is all that is needed to remov e an y bias in µ f p . While this approac h works successfully with space- based data where the fluxes of sources are truly quite stable o ver the course of an exp osure, this is not often the case for ground-based observ atories that hav e con- founding effects like changing atmospheric conditions. T o address this fact, we adjust the mo del in Equation 5 to include a time-v arying component: b i ≥ 0 f p ≥ 0 ∆ r i,p ∼ Poisson ( λ = f p · b i ) (9) where b i is the “flux m ultiplication factor” for read dif- ference i . W e define  b as a length n r − 1 vector of the m ul- tiplication factors that describ e how the fluxes change as a function of time: 0 for “no light received” and 1 for “maxim um transparency/efficiency during the observ a- tion”. With this mo del, w e assume that all pixels we analyze ha ve b een impacted by the same m ultiplicative factor at a giv en read time. W e are also assuming that these multiplicativ e factors can be approximated b y a constan t betw een adjacent reads. F uture work might ex- plore using definitions that encode/enforce contin uit y in time, for example by defining a smo oth time-dependent function, b ( t ), that can b e integrated b et ween readout times to calculate  b . A comparison of example syn thetic data generated us- ing constan t flux and a flux that is changing ov er time is sho wn in Figure 1 . F or this syn thetic pixel, the true flux is f p = 3000 e − /read and the  b for the orange line is 1 for the first six elements, and then 0.5 for subsequen t 4 Li et al. Figure 1. T op: An example “ramp” of coun ts v ersus read n umber using syn thetic data. These data are generated using a Poisson distribution to sim ulate photon arriv al with Gaus- sian read noise, as describ ed in the b eginning of Section 2 . The blue line represents the observed coun ts in a pixel with constan t flux o ver time, while the orange line shows the ob- serv ed counts for flux that c hanges during the exp osure to sim ulate the effects of, for example, a cloud mo ving in be- t ween reads six and seven. Bottom: Count differences ver- sus time using the same data as in the top panel. The true input flux for both cases is 3000 electrons per read, and the orange line’s flux is decreased to half of its maxim um after six reads. read differences. One can imagine this represents a case where clouds rolled in for the later reads. It is apparen t from b oth the counts and the count differences that a single flux will not describ e the orange data. Using similar arguments to Brandt ( 2024 ), w e write the model as pro ducts of Gaussians suc h that:  ∆ c p ∼ N  f p ·  b, V V V p = V V V r,p + V V V f ,b,p  (10) where w e no w ha v e V V V f ,b,p = f p ·       b 1 0 · · · 0 0 b 2 · · · 0 . . . . . . . . . . . . 0 0 · · · b n r − 1       . (11) Because w e assume that a collection of pixels share the same  b , we can jointly fit those pixels to constrain the time-v arying comp onent. F or sp ectroscopic data, this usually means using pixels along the same trace of light, corresp onding to a similar patch of sky (e.g., from one fiber or slit). F or imaging data, this w ould mean using pixels that are near each other on the im- age. In this work, our assumption of a shared  b betw een neigh b oring pixels means we are concerned with mo d- eling w a v elength-indep endent temp oral v ariations. In Section 4 , we will use real sp ectroscopic data to show that our achromatic mo del captures the v ast ma jorit y of the time-dep endent effects. Chromatic v ariations p o- ten tially con tribute higher order effects that could b e explored in future work. F rom the new mo del in Equation 10 , we notice that there is a degeneracy b et ween f p and  b . In particular, w e can arbitrarily m ultiply f p b y an y non-zero v alue so long as we divide  b b y the same factor. T o fix this issue, w e force one of the elements of  b to b e 1:  b ′ =       1 b 2 . . . b n r − 1       =       1 0 . . . 0       +       0 0 · · · 0 0 1 · · · 0 . . . . . . . . . . . . 0 0 · · · 1       ·  b =  10 + 01 01 01 ·  b (12) suc h that all other  b elemen ts are measured relative to this fixed one and f p is tied to that particular read time. In practice, we set the element in the first-guess estimate of  b with the highest v alue to b e 1 by rescaling  b and f p appropriately . Ho wev er, this choice do es not require or enforce that all other  b e lemen ts b e smaller than one: in the fitting pro cess, w e migh t learn that one of the other elemen ts is the true highest  b elemen t. Now, our full mo del becomes:  b ∼ N   b prior , V V V prior   ∆ c p ∼ N  f p ·  b ′ , V V V p = V V V r,p + V V V f ,b ′ ,p  (13) where  b prior =  1 and V V V − 1 prior = lim a → 0 +       1 a 0 · · · 0 0 a · · · 0 . . . . . . . . . . . . 0 0 · · · a       (14) describ e a non-informative prior that ensures our desired  b elemen t is held at 1 exactly: with this form, the fixed elemen t of  b has a prior that is infinitely narrow while the other elemen ts ha v e priors that are infinitely wide. Optimal Fluxes fr om V ariable Illumina tion 5 After w e define our flux vector as  f =       f 1 f 2 . . . f n p       , (15) w e use the standard technique of taking a first partial deriv ative of the negative log likelihoo d, setting to 0 to find extrema, then ev aluating the second deriv ativ e at that extremal v alue to determine the mean and cov ari- ance of the final p osterior distribution. This results in p osterior conditionals for  b and  f describ ed b y: ( f p |  b, data) ∼ N  µ f ,p , σ 2 f ,p  (  b |  f , data) ∼ N (  µ b , V V V b ) (16) where σ 2 f ,p = h (  10 + 01 01 01 · b b ) T · b V V V − 1 p · (  10 + 01 01 01 · b b ) i − 1 , µ f ,p = σ 2 f ,p h (  10 + 01 01 01 · b b ) T · b V V V − 1 p ·  ∆ c p i , V V V b = h X p b f 2 p ( 01 01 01 T · b V V V − 1 p · 01 01 01) + V V V − 1 prior i − 1 ,  µ b = V V V b · h X p  b f p 01 01 01 T · b V V V − 1 p · (  ∆ c p −  10 · b f p )  + V V V − 1 prior ·  b prior i . (17) Here, we hav e used a similar trick as the Brandt ( 2024 ) metho d where we assume that w e can estimate a go o d first guess for b oth b f and b b from the data to define b V V V p . By construction,  µ b will b e exactly 1 in its first elemen t (or whic hever has been chosen to b e fixed) and will hav e a corresponding uncertain t y of σ prior . Using these p os- terior conditionals, we can use a standard MCMC ap- proac h, such as Gibbs sampling, to dra w samples from the posterior (  f ,  b | data) distribution. While Gibbs sampling of the distributions describ ed in Equation 16 will give statistically robust samples of the desired p osterior, it can b e quite slow. If we instead recognize that we are almost alwa ys interested in sim- ply measuring the maximum-a-posteriori (MAP) of the p osterior, then getting the full shape of the distribution correct using Gibbs sampling is unnecessary . Instead, w e can use a small num ber of iterations to get to the MAP using a linearized version of the model described b y Equation 10 . In particular, we hav e the mo del count differences in pixel p defined as  m p = f p ·  b ′ = f p (  10 + 01 01 01 ·  b ) (18) whic h has deriv atives with resp ect to  b and f p of ∂  m p ∂  b = f p 01 01 01 ∂  m p ∂ f p =  10 + 01 01 01 ·  b. (19) W e can then define an offset v ector from some initial guess  v T 0 =   b 0 T ,  f 0 T  as  ∆ v = "  ∆ b  ∆ f # that giv es an appro ximation of the mo del coun t differ- ences as:  m p ≈ f p, 0 (  10 + 01 01 01 ·  b 0 ) + h ∂  m p ∂  b |  v =  v 0   ∂  m p ∂  f |  v =  v 0 i · "  ∆ b  ∆ f # :=  m p, 0 + U U U p ·  ∆ v , (20) where U U U p is an ( n r − 1) × ( n p + n r − 1) Jacobian ma- trix describing the first deriv atives of  b and  f . Here, δ  m p δ  f |  f =  f 0 is an ( n r − 1) × n p matrix that is zeros every- where except for column p , which is equal to the vector  10 + 01 01 01 ·  b 0 . F or clarity , an example set of U U U p matrices for n r = 4, n p = 2,  b T = (1 , b 2 , b 3 ),  f T = ( f 1 , f 2 ) would lo ok lik e: U U U 1 =    0 0 0 1 0 0 f 1 0 b 2 0 0 0 f 1 b 3 0    and U U U 2 =    0 0 0 0 1 0 f 2 0 0 b 2 0 0 f 2 0 b 3    . No w, the linear-approximated statistical mo del b e- comes:  ∆ v ∼ N   ∆ v prior , V V V  ∆ v , prior   ∆ c p ∼ N   m p, 0 + U U U p ·  ∆ v , V V V p  (21) where again w e hav e in tro duced a prior term that only serv es to fix one of the  b elemen ts at exactly 1 with  ∆ v prior =  1 (i.e., a v ector of length n r − 1 + n p , where ev ery elemen t except the  b en try we hold fixed could arbitrarily be set to an y v alue) and V V V − 1  ∆ v , prior = lim a → 0 +       1 a 0 · · · 0 0 a · · · 0 . . . . . . . . . . . . 0 0 · · · a       . (22) 6 Li et al. W e note that the mo del in Equation 21 can b e re-written to explicitly ignore the fixed  b elemen t and thereby re- mo ve the need for a prior – in fact, this is ho w w e imple- men t the parameter fitting in our co de – but we leav e it here for completeness. Combining the prior and likeli- ho o d abov e, w e arriv e at the follo wing posterior distri- bution:  ∆ v ∼ N   µ  ∆ v , V V V  ∆ v  (23) with V V V  ∆ v = h X p ( b U U U T p · b V V V − 1 p · b U U U p ) + V V V − 1  ∆ v , prior i − 1 , (24) and  µ  ∆ v = V V V  ∆ v · h X p  b U U U T p · b V V V − 1 p · (  ∆ c p −  m p, 0 )  + V V V − 1  ∆ v , prior ·  ∆ v prior i (25) In words, this new formalism sa ys that w e can use a starting guess (  b 0 ,  f 0 ), and then use gradient descent to calculate an impro ved subsequent guess, jointly fit- ting the  b and  f . W e then iterate this up dating pro cess un til the parameters are no longer changing. It t ypi- cally takes under five iterations to ha ve all  b elements c hanging by less than 1 × 10 − 10 when the flux is relative high ( ≥ 100 e − /read), with the  f elemen ts changing by a similar magnitude. Another approac h for optimizing the fitting code could make use of, for example, autodiff to n umerically estimate Jacobian vectors and Hessian ma- trices to quickly arrive at the MAP while also accounting for the V V V p dep endence on f p and  b in Equation 13 . A direct comparison b etw een the results of Gibbs sam- pling and the linearized model are compared in Ap- p endix A , with the k ey takea wa y b eing that they agree v ery well in mean estimates of all parameters but show sligh t differences in the widths and correlations of their distributions. While the posterior conditional distribu- tions of  f and  b are b oth multiv ariate Gaussians (i.e. Equation 16 ), the joint p osterior is not exactly Gaus- sian (though it is close). This is b ecause b oth  f and  b app ear in the cov ariance matrices of eac h other’s p oste- rior conditional definitions. The small width and corre- lation differences in the joint  b and  f p osterior distribu- tion are likely a consequence of the linearized metho d appro ximating a close-but-not-entirely-Gaussian distri- bution as exactly Gaussian, with the Gibbs metho d giv- ing more slightly more accurate estimates of uncertain- ties. Ho w ever, we are almost alwa ys interested in the mean estimate of  b , whic h w e can use to correct the data and then measure the resulting b est-fit flux, as dis- cussed in the following subsection. Because the mean es- timates of  b and  f for the Gibbs and linearized metho ds are effectively equiv alen t, w e are safe to choose either metho d. F or the remainder of the pap er, all analyses will use the linearized mo del, as w e ha ve a fairly fast implemen tation that is similar to what most reduction pip elines would choose to employ . W e also note that, should the Gibbs metho d b e desired, the linearized or Hessian-based results provide a go o d parameter initial- ization b ecause it will b e at the mo de of the p osterior distribution. An implemen tation of our fitting pro ce- dure can b e found at h ttps://gith ub.com/KevinMcK95/ fit time dep NIR fluxes . 2.1. Cave ats and R e c ommendations Similar to the Brandt ( 2024 ) method, we hav e as- sumed we are able to estimate an initial guess for  b and  f from the data. In practice, this is muc h easier to do when the fluxes are substantially larger than the read noise and when using a fairly large num b er of pix- els. W e find that the linearized mo del is insensitive to the first guess parameters – giv en enough iterations of the up dates – b ecause the posterior distribution is uni- mo dal. Our approach to define initializations for  f and  b is the following: 1. Estimate f p b y taking the median of the read dif- ferences in pixel p (i.e. a go o d estimate if the flux w as constan t in time,  b =  1); 2. Divide the coun t differences by this f p to get an estimate of  b for each pixel; 3. T ake an in verse-v ariance w eigh ted a v erage of those p er-pixel  b estimates to define the starting guess  b , where the weigh ts come from the read noise and observ ed coun t differences in each pixel at each read time; 4. Rescale the  b guess such that the maximum of this v ector is exactly 1; 5. Divide the observ ed coun ts by this starting guess  b and take an in v erse-v ariance weigh ted mean of the corrected count differences in eac h pixel to define a starting guess  f . T o b e consisten t with ph ysical expectations, we ensure that all the elemen ts of  f and  b are p ositive at ev ery step in the up date pro cess by replacing the up dated b est fit with 0 (or some very small non-zero threshold) if the curren t guess is negative. One could imagine w orking with log fluxes and log  b elements or using more com- plicated prior distributions (e.g. truncated m ultiv ariate Gaussians) to ensure the p osteriors alw a ys pro duce non- negativ e samples; ho w ev er, these c hoices w ould break Optimal Fluxes fr om V ariable Illumina tion 7 the p osterior Gaussianity we rely on to p erform com- putationally fast analyses. F uture work could explore searc hing for relatively fast implemen tations using these non-negativ e constrain ts. When jointly fitting  b and  f – either using the lin- earized mo del or Gibbs – w e end up with a distribu- tion of  f that is highly correlated betw een pixels. This correlation is often not desirable b ecause later steps in pro cessing a detector will typically make assumptions ab out pixel independence. Ideally , w e could sim ultane- ously fit all the pixels and keep track of all pixel cor- relations, but this w ould in volv e holding v ery large ma- trices in memory (i.e., greater than n pixels × n pixels in size). Similarly , tasks requiring the in v ersion of this large matrix would b e impractical or impossible. F or most applications, w e recommend using the linearized mo del to jointly fit  b and  f using the av ailable bright pixels. Then, thro w a w ay the join t  f and keep only the b est-fit  b . With this  b best , extract fluxes from all the de- sired pixels – including those that w ere not used in the original joint fit – using the p osterior conditional distri- bution (  f |  b =  b best , data) as defined in Equation 16 . In this wa y , the final fluxes will be uncorrelated with eac h other, preserving the desired pixel indep endence. F or all the figures presented in this w ork, the fluxes w e sho w are calculated using (  f |  b =  b best , data) and not the fluxes w e measuring during the join t fit. In this w ay , w e can think of the  f measured during joint fitting as n uisance parameters with  b being the key measurement of interest. Of course, the ultimate goal is to measure the correct (  f |  b =  b best , data). W e note that the p osterior  b will also hav e correlations b et ween the different elemen ts corresp onding to differ- en t read times. In theory , using  b to accurately correct the observ ed data requires including the effect of these correlations by marginalizing ov er the p osterior  b distri- bution, (  b | data) which we define to hav e cov ariance V V V b, post . Ho wev er, this is quite costly and again intro- duces correlations betw een the pixel fluxes. Another ap- proac h is to approximate the marginalization by adding a f 2 p V V V b, post term to the data cov ariance matrix V V V p in (  f |  b =  b best , data), and this is able to give bac k  f un- certain ties that are broadly consistent with results from the exp ensive marginalization. Ho wev er, V V V b, post is typ- ically dense, so it breaks the fast-to-in vert tri-diagonal form of V V V p and instead pro duces a slow er-to-in v ert ma- trix that is p oten tially full-rank, significan tly increasing computational costs. W e explored using the full-rank matrix as well as only including the correlations up to the tri-diagonal elements when correcting the data: the full-rank version pro duces  f uncertainties that are most consisten t with the  b marginalization results. T o sum- marize, all cases of propagating  b cov ariances only in- creased the magnitude of the (  f |  b ) uncertainties. As is sho wn in the Section 3 , tests with syn thetic data where w e do not propagate  b co v ariances – that is, only the b est fit  b is used to measure (  f |  b =  b best , data) – return un biased fluxes that agree with the input truth within their uncertainties.As a result, we recommend not prop- agating  b uncertain ties, assuming  b can b e measured to higher than ∼ 10% precision. In our mo del, w e are assuming that the nonlinear- it y of each pixel has also been sufficien tly accounted for through calibration of the detector. If non-linearity is still present in the data, then w e exp ect that our mo del will pick up an a verage nonlinearit y response from the join tly fit pixels: that is, ev en time-indep endent data will return a  b with elemen ts near 1 at the b eginning read times and < 1 at later read times. How ever, we note that this is likely an undesirable w ay to handle nonlinearit y corrections; there is no reason to believe that nearb y pixels hav e the exact same nonlinearit y re- sp onse, but our mo del forces them to share a nonlinear- it y correction. It is muc h b etter to measure and remov e nonlinearit y b efore applying our mo del. When our mo del is applied to real data, we cannot kno w if  b is solely measuring the effects of the sky . The  b we measure could be a com bination of improp erly- remo ved nonlinearit y effects, a c hanging sky , issues with telescop e guiding, strange electronic noise, or persis- tence, to name a few. That b eing said, the source of the effect is not the greatest concern; it is clear that the data are p o orly describ ed b y the Brandt ( 2024 ) con- stan t flux mo del. As a result, such a mo del will ha v e unaccoun ted-for systematic errors, underestimated un- certain ties, and a flux-dep endent bias. Our time v ariable mo del pro duces fluxes that are less biased, regardless of whether w e understand the source of the effect. If external constrain ts of  b w ere a v ailable – for exam- ple, relatively strong prior information ab out the sky v ariability during an exp osure from telescop e guider cameras – then disentangling the origin of different time v ariable signals would b ecome more feasible. As we show later in Section 4.3 , guider camera information appears fairly correlated to the time v ariable signals we measure from real science data. F uture work will fo cus on turn- ing these guider camera estimates into well-calibrated prior constraints. After using telescop e calibration data – such as a combination of darks and dome flats – to re- mo ve or mo del p ersistence and nonlinearity effects, the ma jority of the remaining time-v ariable signal could be describ ed as a combination of sky and stellar v ariabil- it y . With w ell-informed priors on the sky v ariability , the  b measuremen ts could be decomposed to understand 8 Li et al. b oth the Earth’s atmosphere as well as astrophysical v ariability in a particular star’s brightness/spectrum. Th us, the metho ds presented in this paper may present a path to w ards learning in teresting time-series stellar as- troph ysics o ver relatively short timescales ( ∼ 10 seconds p er read for APOGEE). In this w ork, w e use the w ord “bias” to describe fluxes that hav e a non-zero offset in exp ectation from the truth. In the follo wing sections, w e show that fitting time- indep enden t models to time-v ariable data produces flux- dep enden t biases, with the precise sign and magnitude dep ending on the particular  b for an observ ation (Sec- tion 3.1 , third panel of Figure 4 ). When we uniformly marginalize ov er many  b draws (Section 3.3 ), we find that the exp ectation of this bias goes to 0 (third panel of Figure 7 ), but this does not mean that the individual ob- serv ations are un biased. Instead, the o verall effect of the bias can be thought of as a systematic error that is un- accoun ted for when using the time-indep enden t model. Throughout thi s paper, the bias we men tion is implicitly flux-dep enden t and “systematic error” and “bias mag- nitude” are used interc hangeably . Measuring  b is muc h easier when the count rate dif- ferences are significantly larger than read noise. Ho w- ev er, many pixels – whether in sp ectroscopic or imag- ing data – will naturally b e quite dark. F or imaging data, w e imagine a user would either measure a single common  b for an image or calculate independent, local  b v alues across the face of the detector using different brigh t sources, and then build some form of interpola- tion b etw een those measurements to correct the data. F or sp ectroscopic observ ations, the p ositions of bright sources/traces on the detector could b e mapp ed to p osi- tions on the sky to allo w the interpolation of  b for faint sources. Another approac h would b e to fo cus on the flux near bright sky emission lines that should b e present for all spectroscopic observ ations, regardless of the target’s brigh tness. How ev er, this latter approac h has an addi- tional complication in that the night sky is exp ected to c hange in differen t wa ys than sources o ver the course of the night; for example, the sky’s large-scale trend of starting off bright, reaching a minimum, and then brigh tening tow ards the sunrise, or fast-timescale v aria- tions in h ydroxyl radicals. Using skyline pixels to correct source pixels would then require calibrating the relativ e c hanging flux b etw een a source and the sky , p oten tially harnessing prior constrain ts from indep endent measure- men ts of the sky v ariability as a function of time (e.g., from guide/w eather-trac king cameras sitting at the tele- scop e). As men tioned earlier, there is an arbitrary tradeoff b e- t ween  b and  f , so it is important to remember that all the fluxes measured using the same  b should be consis- ten t with each other up to some multiplicativ e constan t. Ho wev er, when comparing pixels corrected with differen t  b v ectors, we ha ve to b e more careful. F or sp ectroscopic data, we suggest measuring a single  b p er trace/fib er. T ypically , the difference in relative fluxes b etw een the traces is measured/calibrated at some later step in a data reduction pip eline, which should naturally handle differences b etw een the arbitrary  b scaling across the detector. How b est to deal with the cross-talk regions b et ween traces is not yet ob vious, though the most sta- tistically robust answ er likely inv olv es complete forward mo deling of sp ectral ligh t and p oint spread functions across the detector. F or imaging data, lo cally measured  b v ectors should also b e handled with care or a single  b p er image should b e used. In b oth imaging and sp ec- troscop y cases, it w ould b e wise to force all indep endent  b measurements to share the same reference read (i.e., where b i = 1) so that everything is measured relativ e to the flux transparency at the same read time. Finally , w e note that the analyses in the follo wing sec- tions regularly compare the results to the case where we assume the fluxes are constant in time, as measured by the Brandt ( 2024 ) mo del. While the results will sho w that our new model does a better job of describing our time-v ariable data, we m ust emphasize that w e are not concluding that the Brandt ( 2024 ) mo del is incorrect. These comparisons merely sho w that ignoring time v ari- abilit y when it is present in the data leads to w orse re- sults. F or data where the fluxes are truly stable ov er an observ ation – such as for space telescop es – the Brandt ( 2024 ) model is the correct c hoice to use. 3. TESTING WITH SYNTHETIC DA T A W e generate synthetic data using the Poisson and Gaussian model defined in Equation 9 giv en some choice of  f and  b . In this section, w e will fo cus on tests using a range of flux levels as well as  b amplitudes to quantify our abilit y to measure unbiased fluxes. 3.1. Fitting a Single Synthetic Observation W e first test how well we can reco ver our input fluxes and  b for a single realization of synthetic data. W e set n reads = 33 and the read noise p er pixel to b e uniform random dra ws with bounds of 14 − 24 e − , which are t ypical v alues for APOGEE science observ ations. F or this test case, we define the true  b to b e n reads − 1 uni- form draws with bounds of 0 . 5 − 1 . 0, the true fluxes to b e n pixels uniform draws with b ounds of 800 − 1200 e − /read, and n pixels = 100. W e ha v e chosen 100 pixels to ensure that our fitting code (when used on a p ersonal laptop) is able to run fast and is light w eigh t (in terms of RAM): fu- ture implementation with real data would likely b enefit Optimal Fluxes fr om V ariable Illumina tion 9 Figure 2. Comparison of the best-fit  b and (  f |  b =  b best ) and the truth for syn thetic data from 100 pixels in a photon-dominated regime. The left panels fo cus on  b , and the right panels fo cus on (  f |  b ). The top panels compare the input and outputs directly , while the bottom panels show the uncertain t y-scaled residuals. The b ottom panels display the b est-fit χ 2 scaled by the appropriate degrees of freedom, b oth of which are near 1.0 as expected for reasonable measuremen ts and uncertain ties. The red p oin ts in the  b panels mark the read difference num b er that was not used in the fitting to break degeneracies b etw een  b and  f , as discussed in Section 2 . from using all a v ailable pixels. Our curren t test rep- resen ts a “high” flux case b ecause the input fluxes are significan tly larger than the read noise, making this the easiest test of our mo del. After generating the data, we estimate the first guess of the  b and  f using the approach describ ed in Sec- tion 2.1 , whic h is necessary to define the first cov ariance matrix that describ e the data. W e then use the starting guess b V V V p matrices to up date our b est  f and  b guesses based on the linearized mo del of Equation 23 . W e then up date the b V V V p matrices and repeat un til all elements of  b are changin g less than some small tolerance (e.g., 1 × 10 − 10 ), which usually happ ens after only three or four iterations. Before comparing the input and output  f and  b , we m ust accoun t for the arbitrary scaling betw een them. F or example, if clouds in the sky meant that there w as maxim um 80% of ligh t getting through in a giv en exp o- sure, then the true  b is a vector of 0.8, but our mo del w ould return a vector of all ones for  b with the  f b e- ing a factor 0.8 low er than the true source fluxes. F or real data, this discrepancy does not matter b ecause w e only ever exp ect to measure relativ e fluxes of sources, with later calibration steps determining an absolute flux- ing if necessary . F or our comparisons with syn thetic data, how ev er, we wan t to quan tify how well our mea- suremen ts and uncertainties describ e the truth, which means we must rescale our outputs b efore we can com- pare with the input v alues. Sp ecifically , we measure the m ultiplicative factor needed that minimizes the differ- ence b etw een the true and measured  b , and then rescale the  f and  b b y that amount before comparison. The b est-fit results for one realization of sim ulated data are sho wn in Figure 2 , with the left panels focus- ing on  b and the righ t panels fo cusing on (  f |  b =  b best ). Both vectors hav e go o d agreemen t with the truth (upp er panels) and an y residuals are well-described b y their un- certain ties (b ottom panels). W e also calculate the data 10 Li et al. Figure 3. Comparison of the data χ 2 empirical cumulativ e distribution function (CDF) for the same sim ulated data as in Figure 2 . The blue line shows the χ 2 , as calculated using Equation 26 when using the b est fit  f and  b , which sho ws go od agreemen t with the exp ected χ 2 ν = n reads − 2 distribution (blac k line). The solid orange line sho ws the data χ 2 af- ter measuring the best-fit constant flux following the Brandt ( 2024 ) mo del, which is not able to describ e the data within its uncertain ties. In fact, the solid orange histogram finds that the difference b etw een the constan t flux and data requires the uncertain ties to be 3.91 times larger, as demonstated by the dashed orange histogram, whic h has muc h closer agree- men t to the expected curve. χ 2 distribution using the b est-fit fluxes and  b using χ 2 = X p h  ∆ c p − f p ·  b i T V V V − 1 p h  ∆ c p − f p ·  b i , (26) to confirm that the final b est-fit mo del prop erly de- scrib es the data, and this is demonstrated in Figure 3 . W e hav e also included the data χ 2 distribution if we as- sume that the flux is constant in time and is extracted using the Brandt ( 2024 ) approac h, whic h, as expected, is not able to describ e the data within their uncertain ties (solid orange line). In this case, the time-indep endent mo del requires the data uncertainties b e inflated by a factor of 3.91 to recov er the expected agreemen t (orange dashed histogram). W e next test for bias b etw een the fluxes measured us- ing the time-indep enden t and time-v ariable mo dels. W e do this by generating data 1000 times using the same true  b as shown in Figure 2 , but no w across a range of flux lev els that are typical for pixels in an APOGEE exp osure. F or eac h data draw, we extract fluxes using b oth (  f |  b =  1) and (  f |  b =  b best ), where  b best is the same best-fit v alue we ha v e measured previously; that is, we do not re-estimate  b best using the new data dra ws. W e also extract fluxes using the equal-w eight- p er-read approach of Rausc her et al. ( 2007 ), where the Figure 4. Comparison b etw een the time-indep endent (orange lines) versus time-v ariable (blue lines) versus equal-w eight (green lines) models for the extracted flux un- certain ties (top), flux signal-to-noise ratios (second), me- dian biases (third), and residual distribution widths (bot- tom). Blac k lines and shaded regions corresp ond to the ex- p ected v alues and scatter given independent draws from a unit Gaussian. These measuremen ts come from 1000 sim ula- tions of data using the same  b as in Figure 2 , but using a large range of flux lev els. The top panel shows that time-v ariable and time-indep endent mo dels predict extremely similar flux uncertain ties (differences m uc h smaller than 1%) while the equal-w eight mo del predicts larger uncertainties, esp ecially at the high flux end. The third-from-top panel emphasizes that neglecting to mo del time v ariability leads to a substan- tial flux-dep enden t bias when using the Brandt ( 2024 ) ap- proac h. The b ottom panel sho ws that all the mo dels ha ve residual distributions with the exp ected widths. fluxes come from their Equation 3 and the uncertainties are defined b y their Equation 1. T o b e fair in our com- parisons, we allo w all the extracted fluxes in a realization to b e scaled by a single multiplicativ e factor to hav e the b est agreement b etw een the mo del and the truth. A t eac h flux lev el, we end up with a residual distribution whose median should b e near zero and 1 2 · 68% width should b e near 1 to agree with the unit Gaussian. The result of this comparison is shown in Figure 4 , with the top panel showing the extracted flux uncertainties, the second panel showing the flux signal-to-noise ratios, the third panel showing the median of the uncertaint y-scaled residuals, and the b ottom panel showing the width of the Optimal Fluxes fr om V ariable Illumina tion 11 residual distribution. The top panel sho ws that both the time-v ariable and time-indep endent models (blue and orange lines) predict similar flux uncertainties as a func- tion of true flux, with at maximum ∼ 1% differences b e- t ween the tw o lines. The equal-weigh t mo del, ho w ev er, predicts larger flux uncertain ties, esp ecially in the high flux limit ( ∼ 8% smaller SNR at a flux of 1000 e − /read). The bottom panel shows that all models measure widths of their residual distributions that are well-described by their uncertainties; all the width lines also agree reason- ably well with the black line and shaded region show- ing the exp ected mean and scatter based on the same n umber of draws from a unit Gaussian. The blue and green lines in the third panel shows that b oth the time- v ariable and equal-weigh t models are able to recov er the exp ected agreemen t with a zero median. How ev er, the orange line shows that neglecting to account for time v ariability in the data leads to a systematic source of scatter in the constan t flux mo del that is also nefari- ously flux-dep endent: in this case, the systematic error is as large as ∼ 0 . 45 of the flux uncertaint y in this case. When we rep eat these tests with different realizations of  b – as we will explore in Section 3.3 – w e also find that the strength and sign of this bias can change sig- nifican tly . This is a particularly pernicious type of bias b ecause there is no simple wa y to correct for it after the fact. The only w ay to remov e this bias and measure optimal flux uncertain ties is to use the time-dependent mo del. 3.2. Synthetic Observations over Differ ent Flux L evels Next, we create 1000 realizations of data taken from the same configuration as in Section 3.1 – that is, using the same n pixels , n reads , read noise p er pixel, and true  b – to quantify ho w w ell the measured  b and (  f |  b ) v ec- tors and uncertain ties agree with the truth. W e repeat this test across a range of flux lev els: from read-noise- dominated cases to the photon-dominated regime. W e first inv estigate our ability to measure  b using data generated from different flux lev els. The results of this test are shown in Figure 5 . The top panel sho ws the rela- tionship b etw een the median  b uncertaint y and the input true flux. Evidently , it takes a flux of 30 to 40 e − /read to reach a 10% uncertaint y in  b when using 100 pixels with n reads = 33. W e note that these tests used 100 syn- thetic pixels for all flux cases; it is theoretically p ossible to reach any desired SNR in  b b y increasing the n umber of pixels used in the join t fit, though this would signif- ican tly increase computation time. The middle panel next sho ws the 16-th, 50-th, and 84-th p ercentiles of the χ 2 distribution of the measured  b compared to the truth, with horizontal lines sho wing the exp ectation for Figure 5. Summary of  b measurements from repeated sim- ulation across a range of flux levels using n reads = 33 and n pixels = 100. Each data p oint comes from 1000 iterations of fitting data with the same true  b as shown in Figure 2 . The top panel shows the median uncertaint y in the measured  b from all realizations, with the errorbars sho wing the 68% in- terv al from those realizations. The middle panel shows the 16-th, 50-th, and 84-th percentiles of the χ 2 distribution of measured  b compared to the truth. The b ottom panel sho ws the median of the  b uncertain ty-scaled residual distribution. This figure rev eals that flux levels > 10 e − /read are able to measure unbiased  b using 100 pixels. those summary statistics. While most flux lev els pro- duce  b distributions that agree with the expectation, a div ergence b egins to appear at low fluxes, suggest- ing that the chosen configuration requires a signal of > 10 e − /read to measure trust worth y  b vectors and un- certain ties. At the highest flux level, we find that the  b measurements slightly o verestimate the uncertainties based on the middle panel χ 2 medians falling below the exp ectation. This size of this ov erestimate is ab out 3%, but to o-large uncertainties are preferable to underesti- mated ones for most science applications. In the bottom panel, the medians of the  b z-score distributions are close to the exp ectation of zero for most flux lev els. F or flux lev els ≥ 10 e − /read, the bias is less than one p ercent of the measured  b uncertainties; how ev er, there are signifi- can t  b biases for flux lev els < 10 e − /read. These results further emphasize the previous finding that jointly fit- ting 100 pixels requires fluxes > 10 e − /read to ha ve enough information to constrain  b . 12 Li et al. Figure 6. Summary of (  f |  b ) measurements from the same sim ulations in Figure 5 . F or each realization, w e correct the observ ed data using the b est-fit  b and then measure the b est fit (  f |  b ) to simulate ho w real data w ould b e pro cessed. The top panel sho ws the 16-th, 50-th, and 84-th percentiles of the χ 2 distribution of measured fluxes compared to the truth. The middle panel shows the median of the flux uncertain t y-s- caled residual distribution, with the orange line showing an equiv alent analysis of time-indep endent data. The b ottom panel estimates the width of the flux z-score distribution. This figure reveals that unbiased fluxes can be extracted us- ing  b measured from 100 pixels with flux lev els ≥ 30 e − /read. The most imp ortant question for this w ork, how ev er, is whether the fluxes we measure are accurate. T o that end, w e take the best estimate of  b for eac h realization and then use it to correct the data b efore measuring the final b est fit flux, (  f |  b best ). W e then compare these fluxes and their uncertainties to the truth, as demon- strated by Figure 6 . The top panel shows that the flux χ 2 distribution agrees with the expectation across most flux ranges. The middle panel sho ws that the extracted fluxes b egin to b ecome more biased than exp ected when the flux level is < 30 e − /read. F or the b ottom tw o pan- els, we hav e included the results from analyzing com- parable time-indep endent data (i.e.,  b =  1) using the Brandt ( 2024 ) constant flux mo del. W e notice that b oth analyses tend to pro duce a slight negative bias in the z-score median, whic h is the result of a natural correla- tion b etw een the extracted fluxes and their uncertain- ties: b oth the extracted flux and its uncertaint y dep end on a previous iteration’s b est guess of the flux, meaning they are fundamen tally correlated suc h that w e should not exp ect p erfect agreement with the unit Gaussian. The b ottom panel sho ws that the width of the flux z- score distribution agrees with exp ectations do wn to all but the low est flux lev els. These (  f |  b ) tests rev eal that the limiting threshold is sligh tly higher than previously seen with the  b results. T o ensure unbiased (  f |  b ) in addition to trustw orthy  b , we require the information conten t pro vided by 100 pixels with flux ≥ 30 e − /read. W e m ust emphasize that the results of this section do not imply that extracting lo w fluxes will alwa ys pro duce biased results; instead, it sa ys that using the  b measured f rom 100 lo w signal pixels can pro duce biased fluxes. In the next subsection, we will inv estigate the true flux bias when we are able to measure a precise  b . 3.3. Synthetic Observations over Differ ent  b Am plitudes Finally , w e quan tify the scale of the flux bias as a func- tion of flux and amplitude in  b . That is, we define b amp suc h that a given true  b true is created b y n reads − 1 dra ws from a uniform distribution with b ounds from 1 − b amp to 1. In this notation, b amp = 0 corresp onds to the time indep enden t case. W e then dra w 1000 realizations of  b true v ectors for b amp ∈ [0 , 1] in steps of 0.1. W e then follo w the same pro cess as described in Section 3.1 : for eac h  b true realization, we generate 100 pixels worth of data using a flux lev el of 1000 e − /read and then use our linearized mo del to measure a high precision  b best . W e next generate 100 examples of data across a range of fluxes with the same  b true , and then extract fluxes from those data using (  f |  b =  1) and (  f |  b =  b best ). W e then measure the flux uncertain ties and residuals for each flux level for that  b true , giving one realization of the results shown in the top t wo panels of Figure 4 . W e then marginalize o ver the 1000  b true sim ulations for a giv en b amp , yielding the results in Figure 7 . The dif- feren t colored lines corresp ond to different b amp lev els, while dotted and solid lines represent the time v ariable and time indep endent mo dels, respectively . Red lines and shaded regions show the exp ected v alues and scat- ters after drawing indep endent measurements from unit Gaussians using the same num ber of realizations as the sim ulated data. The top panel shows that the flux uncertaint y mea- sured by both mo dels increases with b amp . This is b e- cause the n um b er of photons that arriv e at a pixel de- creases for each increase in b amp , thereby reducing the data’s constraining p ow er. There is very little difference ( < 1%) b etw een the dotted and solid lines in this panel. Optimal Fluxes fr om V ariable Illumina tion 13 Figure 7. Quantification of flux uncertaint y and bias as a function of true flux and  b amplitudes from 0 (no change in flux v ersus time) to 1 (p ossible for fluxes to be 100% dimin- ished). Dotted lines corresp ond to fluxes extracted with the time-v ariable mo del, solid lines refer to fluxes extracted from the same data assuming time independence, and the color of eac h line corresponds to the  b amplitude of those sim ulations. The top panel sho ws that the measured flux uncertaint y in- creases with b amp b ecause fewer photons reach each pixel on a verage when  b <  1. W e note that the difference b etw een the dashed and solid lines is < 1% for all cases. The sec- ond-from-top panel shows the flux signal-to-noise, which is the true flux divided by the measured flux uncertaint y . The third-from-top panel shows that the median of the uncer- tain ty-scaled residual distributions marginalized ov er  b real- izations is broadly consisten t with 0, though there is more scatter than expected (red shaded region) in the time-inde- p enden t results. The bottom panel shows excess scatter in the median of the z-score distributions: ∼ 68% of observ a- tions at a given b amp are exp ected to show a bias with a magnitude equal to or smaller than the corresp onding line. All of the dashed lines are very close to the expected red line, sho wing that the time-v ariable model produces the sto chasti- cally exp ected scatter in flux measuremen ts. The solid lines sho w that fitting a constant flux to time v arying data re- sults in typical biases of ∼ 5% of the flux uncertaint y when b amp = 0 . 3 and up to ∼ 50% when b amp = 1 . 0. The third panel sho ws the median of (  f |  b ) z-score dis- Figure 8. Measurement of the threshold for deciding the optimal model for extracting fluxes. These results are based on the bias magnitudes presen ted in the b ottom panel of Fig- ure 7 . The y-axis is the median magnitude of the bias, and the x-axis is b amp . The blue line shows the median (o v er all fluxes and b amp ) of the measured time-v ariable bias profiles, while the corresp onding shaded region shows the 5 and 95th p ercen tiles. The orange scatter p oints for the time-indep en- den t mo del sho w the per- b amp median for eac h of the bottom panel Figure 7 solid line profiles. The solid orange line is a simple 4-th order polynomial fit to the orange p oints, which has goo d agreement for b amp ∈ [0 , 1]. The black vertical line giv es the b amp = 0 . 035 transition where the bias from the time-indep enden t mo del has a > 95% probability of b eing larger than the bias from the time-v ariable model. tributions, showing that marginalizing ov er all  b realiza- tions pro duces results that are fairly consistent with the exp ected 0. Here, we see that the time-v ariable mo del has smaller scatter than the time-indep endent mo del, esp ecially as b amp increases. W e do notice that there is potentially a small bias ( ∼ − 0 . 5% of the flux uncer- tain ty) in the median z-score of the time-v ariable re- sults as we mov e to lo w er fluxes. This is likely the same sligh t negativ e bias seen in the orange line in the mid- dle panel of Figure 6 , suggesting that the new model is reac hing the same fundamental limit as exp ected from time-indep enden t data. The b ottom panel of Figure 7 is a key result of this w ork as it gives the exp ected bias as a function of true flux and b amp . The plotted lines are calculated b y taking the standard deviation of the 1000 realizations of the z- score medians for a given b amp , and then the exp ected scatter for this configuration of sim ulations is subtracted off (red line), resulting in the excess scatter in the z- score bias. All of the dotted lines for the time-v ariable mo del are very near the exp ected red line, suggesting the scatter we see in the time-v ariable results is explained b y sto chastic processes. The time-indep endent results, 14 Li et al. Figure 9. Magnitude of the flux bias at different percentile lev els for b amp ∈ [0 . 2 , 0 . 5 , 1 . 0]. The cy an lines in these panels, corresp onding to the 68-th p ercentile, are the same as the curv es displa y ed in the b ottom panel of Figure 7 . The top panel reveals that the bias magnitude can b e as large as 1 . 2 · σ f for b amp = 1 . 0. The b ottom panel suggests that we can b e ∼ 95% confiden t that the time-indep endent mo del will pro duce a < 0 . 05 · σ f magnitude bias when b amp ≤ 0 . 2. ho wev er, show that the measured scatter in the z-score bias is flux-dep endent and grows to large v alues as b amp increases. T o provide an illustrativ e example, this plot sa ys that a true flux near 100 e − /read can exp ect that, 68% of the time, the bias will b e as large as 0 . 5 · σ f in magnitude when b amp = 1 . 0. Of course, the precise bias sign and magnitude at each flux dep end on the exact shap e of  b true for that observ ation (e.g., as shown in the third panel of Figure 4 ). W e can also use the results in the b ottom panel of Figure 7 to determine when it is b etter to use the time-v ariable or time-indep endent mo del, and this is demonstrated in Figure 8 . W e hav e zo omed in on the b amp ≤ 0 . 3 cases to highlight the transition b etw een bias magnitude from the different mo dels. The solid blue line is calculated by taking a median of the time-v ariable profiles across all flux levels and b amp , with the corre- sp onding shaded regions giving the 5 and 95 percentiles. The blue line being slightly larger than 0 is evidence that the time-v ariable mo del incurs a small ( < 0 . 1% of the flux uncertaint y) cost as so ciated with fitting  b and then extracting (  f |  b =  b best ). The orange p oints are calcu- lated by taking medians ov er the flux levels for the solid profiles in the b ottom panel of Figure 7 , giving a verage bias magnitude p er b amp . The orange solid line is the result of fitting a simple 4-th order p olynomial to the orange p oints for all b amp , whic h do es a goo d job of de- scribing the measuremen ts for b amp ∈ [0 , 1]. T ests with other p olynomial orders – both higher and lo w er than 4 – did a visually w orse job of describing the orange data p oin ts in the b amp ≤ 0 . 3 region or pro duced unphysical in terp olated v alues (e.g., relatively large negativ e v al- ues in the b amp ≤ 0 . 1 region). W e find that the solid orange line b ecomes greater than the top of the blue shaded region at b amp = 0 . 035. This implies that there is a > 95% probabilit y that the time-independent model con tributes more systematic error to the measurements than the time-v ariable mo del when the time-v ariable sig- nal is > 3 . 5%. W e suggest that b amp > 3 . 5% is a use- ful threshold for deciding when the time-v ariable mo del is preferred. Using linear interpolation of the orange p oin ts instead of the 4-th order p olynomial returns an in tersection at b amp = 0 . 028, suggesting a more conser- v ative c hoice would be to prefer the time-v ariable mo del when the time-v ariability is > 2 . 8%. W e explore different p ercen tiles of the bias magnitude distributions of the time-dep endent mo del for b amp ∈ [0 . 2 , 0 . 5 , 1 . 0] in Figure 9 . The cy an lines in this figure are the same as the solid curves plotted in the b ottom panel of Figure 7 . F rom the top panel, we see that the bias magnitudes can b ecome as large as 120% of the flux uncertaint y . A key tak ea wa y comes from the b ot- tom panel, which shows that w e can b e fairly confident – 95% of the time – that the time-indep endent mo del will pro duce flux-dep endent biases that are smaller than 5% of the flux uncertaint y when b amp ≤ 0 . 2. These met- rics pro vide another p ossible threshold for determining when to transition b etw een the time-v ariable and time- dep enden t mo dels. W e emphasize that the flux levels, read noise, and n umber of reads used in our syn thetic tests are c har- acteristic of real APOGEE observ ations. As a result, Figures 7 to Figures 9 provide useful metrics to assess the impact of ignoring time-v ariabilit y in data. Using indep enden t estimates of the APOGEE flux c hange as a function of time – for example, from a guider cam- era on the telescop e – one could estimate the magnitude of the flux bias to decide whether to extract using the time-indep enden t or time-dependent model. 4. V ALID A TION WITH REAL D A T A W e also v alidate our mo del using real data from the APOGEE sp ectrograph at Apache Poin t Observ atory , using observ ations that w ere part of the Sloan Digital Sky Surv ey V (SDSS-V; Kollmeier et al. 2025 ). Optimal Fluxes fr om V ariable Illumina tion 15 Figure 10. T op: An example 33-read ramp of counts versus read time using real science data from the APOGEE spec- trograph (blue line) compared to the best-fit constant flux (orange line). Bottom: Count differences revealing time fluctuations in the data that are not well-described by a sin- gle constant flux. The right axis shows the relative count differences after dividing by the b est-fit constant flux. 4.1. Time V ariable Signals in R e al Scienc e Observations A flux ramp from an example real science pixel is sho wn in the top of Figure 10 , with the best-fit con- stan t flux shown in orange; this is a pixel on the trace of a brigh t star, a w a y from sky emission lines, at a wa ve- length of ∼ 16928 . 1 ˚ A. After taking the difference b e- t ween subsequent reads, the bottom panel reveals that the flux is v arying b y amounts larger than the uncertain- ties can explain, suggesting that a time-constan t flux is not a go o d mo del of the data. Using our time-v arying mo del, we fit the data from 100 pixels sim ultaneously . These pixels are tak en b y follo wing the maximum of the trace for one relatively brigh t star across a single detector. That is, these pixels corresp ond to the same star and p osition on the sky but co ver a wide range in wa velength ( ∼ 600 ˚ A). A com- parison betw een the resulting b est fit model and count differences is sho wn in Figure 11 , with the differen t pan- els highlighting differen t pixels (i.e., w av elengths). The uncertain ty-scaled residuals in the b ottom panel show that the new mo del do es a goo d job of describing the data. Figure 11. Comparison of the b est-fit mo del with real APOGEE data across many pixels spanning ∼ 600 Angstroms in w a velength. The pixels come from one APOGEE detector (c hip R) and all lie on the trace of the same star with the top panel sho wing the star’s 2D spectrum (red b ounding lines showing the pixels nearest to the trace cen ter). White dots near the b ottom of the top panel show the X p ositions of the 100 pixels used in measuring  b best , and colored arrows p oint to the 5 pixels highligh ted in the follo wing panels of this figure. The blac k line sho ws the mo del prediction in that pixel, while the colored p oints sho w the coun t differences in each pixel. The χ 2 comparison b et ween mo del and data is sho wn in the bottom left corner of eac h panel, scaled by the relev an t degrees of freedom. The last panel displays the uncertaint y-scaled residuals, sho wing that the mo del do es a go o d job of describing the data. 16 Li et al. Figure 12. Comparison of the data residual χ 2 distribu- tions for 2048 pixels of the same star used in the analy- sis of Figure 11 . The tw o upp er panels show the data χ 2 p er pixel as a function of pixel index (i.e. wa v elength) and flux, resp ectively , while the bottom panel shows the empir- ical CDFs. The blac k horizontal lines in the upp er panels sho w the 16-th, 50-th, and 84-th p ercentiles of the exp ected χ 2 ν distribution. Small grey p oints along the b ottom edges of the top tw o panels sho w the pixel indices and fluxes for the 100 equally-spaced pixels used in the  b fit. The orange points and distribution comes from using the b est-fit time-constant flux as defined b y the ( Brandt 2024 ) model, while the blue p oin ts and distribution show the results after accounting for time v ariation using the new mo del of this work. The time– v arying mo del almost p erfectly matc hes the expected distri- bution, while the time-independent mo del do es not. These distributions reveal that the time constant mo del requires the data uncertain ties to be inflated by a factor of 2.27 to ha ve the exp ected agreemen t, while the time v arying mo del requires multiplication by a factor of 0.99. As with the synthetic data, w e then use the best-fit  b to correct the data b efore measuring a new best-fit flux in each pixel, (  f |  b =  b best ), including pixels that were not used in the  b measuremen t. That is, we use all 2048 pixels along the trace cen ter (all pixels inside of the red Figure 13. T op: Examples of  b vectors (re-scaled b y their median) for different APOGEE chips/detectors cov ering dif- feren t w a v elength ranges for the same exp osure and star as in Figures 11 and 12 . Legends give the scaled χ 2 comparison b e- t ween eac h chip’s  b and an inv erse-v ariance weigh ted-a v erage of  b calculated from all three c hips. Bottom: Uncertain ty-s- caled residuals of the  b vectors in the top panel compared to an inv erse-v ariance weigh ted-a verage. Ev en though the data on the differen t chips were fit indep endently , the results are extremely similar, indicating that our assumption of a w av e- length-indep enden t  b is broadly supp orted by the data. b ounded region in the top panel in Figure 11 ) instead of only the 100 pixels used to measure  b best . W e then com- pare the data χ 2 w e obtain with our new measurements v ersus the constant-flux results, yielding the distribu- tions in Figure 12 . Here, we see that the time-v arying mo del has remo v ed the disagreemen t seen betw een the exp ected distribution and the time-independent mo del. The upp er t w o panels also reveal that the time-v ariable mo del do es not show an y noticeable trends in data χ 2 as a function of flux and w a velength – which is a h uge impro vemen t ov er the time-indep endent mo del – lend- ing credence to our assumption that the ma jority of the temp oral v ariation is ac hromatic. W e further test the w av elength indep endence assump- tion b y analyzing data from tw o additional detectors that were observing the same targets at the exact same times, just at different wa v elengths. W e will refer to the different chips using “red”, “green”, and “blue” – lab eled R, G, and B – based on their relativ e p ositions in w av elength. W e rep eat our analysis using the trace of the same star, but now using data from the other chips analyzed completely indep endently . The b est-fit  b v ec- tors for each of the three c hips are shown in Figure 13 . Optimal Fluxes fr om V ariable Illumina tion 17 The pixels no w span almost 2000 ˚ A in w av elength, and y et the results from three separate ∼ 600 ˚ A regions agree quite closely with each other. How ev er, we do note that the chips show some statistical differences when their uncertain ties are accoun ted for, as highlighted b y the reduced χ 2 legend v alues. Some of the differences seen b et ween the c hips in their residuals compared to their w eighted a v erage (b ottom panel) might be due to differ- en t nonlinearit y responses that ha ve not b een remo ved from the data, v arying persistence prop erties, chromatic differences in point spread functions, or a high order w av elength-dependence in  b . The APO Sloan T elescop e cov ers a three degree wide field of view. When we explore the  b that we measure from fib ers/stellar traces that correspond to p ositions that are nearby on the sky , w e find a general agree- men t. W e are still in the early stages of understanding ho w these results change across the fo cal plane. Small, spatially-coheren t differences ma y b e telling us about the individual atmospheric differences each fib er sees along its line-of-sight, or they ma y be indicativ e of a v ariation in telescop e-delivered image quality across the fo cal plane. W e are actively exploring how b est to mea- sure and correct the observed data as a function of p osi- tion on the fo cal plane, source brightness, and position on detector. Implemen ting our time-v ariable mo del us- ing an en tire detector’s/nigh t’s w orth of data will be the fo cus of future work. 4.2. Nul l T ests with T elesc op e Calibr ation Data Next, w e test our model on telescop e calibration exp o- sures that we exp ect to b e time-independent. The par- ticular exposure w e use is of quartz lamp ligh t that is re- flected off the in terior of the telescop e dome. This means the traces are bright, all fib ers are ev enly illuminated, and the measurements should not exhibit atmosphere- related time v ariabilit y . There m a y b e some subtle sources of flux v ariabilit y – such as the lamps heating up ov er the course of use, persistence, or incompletely- remo ved nonlinearit y effects – but we expect these to b e small compared to the c hanging sky seen with the real science data. After repeating the same analysis using 100 pixels across one detector for a single fib er’s trace, w e get the results in Figure 14 . As exp ected, the  b is quite flat. W e find that both the time-indep endent and time-dep enden t mo dels do a go o d job of describing the data with no obvious bias as a function of flux or wa v e- length, as demonstrated in Figure 15 . The data χ 2 dis- tributions suggest that the data uncertainties might b e sligh tly o verestimated by a factor of 2 − 3%, pro viding p oten tial evidence that the current read noise estimates are sligh tly too large. Figure 14. Same as Figure 11 , but now for APOGEE quartz lamp calibration data (chip R). The source in this case is inside of the telescope’s dome, so we do not expect to see a significan t v ariation in time from the atmosphere. Indeed, w e recov er a  b that is consistent with a time-stable source for data on this chip. W e again compare the  b measured from the differen t c hips and assess how similar they are to each other as w ell as to the exp ected  b =  1. These results are sho wn in Figure 16 . W e find that all the  b elem en ts from the dif- feren t c hips are within a few p ercent of the exp ected one v ector. The chips also all generally show a decrease in  b elemen ts as a function of time, suggesting that the time- 18 Li et al. Figure 15. Same as Figure 12 , but using the calibration data in Figure 14 . As expected, there is little difference b e- t ween the time-independent and time-dep endent results for the calibration data where time v ariation should be minimal. dep enden t signal we measure in this case is likely due to nonlinearit y effects. W e note that the reduced χ 2 v alues in the upp er panel legend are the results when compar- ing eac h chip’s  b to a weigh te d-a verage  b . As with the previous science observ ations, we see some disagreemen t b et ween the c hips when w e accoun t for the  b uncertain- ties, whic h is most likely a result of different nonlinear- it y prop erties. Statistically comparing each c hip’s  b to the exp ected  b =  1, we measure reduced χ 2 v alues of 0.78, 15.8, and 15.9 for the R, G, and B chips resp ec- tiv ely . While c hip R is consistent with a time-stable source, the other c hips are not, p otentially indicating that  b is truly capturing nonlinearity b eha vior and pick- ing up differences betw een the c hips. These nonlinearit y differences may also explain the nature of the small dis- parit y b etw een the  b measured from real science data in Figure 13 . In the future when w e hav e access to nonlin- earit y corrections for these data, w e would exp ect to see Figure 16. Same as Figure 13 , but using the calibration data in Figure 14 . The reduced χ 2 v alues presen ted in the legend again compare each chip’s  b to a weigh ted av erage  b , not to the exp ected v ector of ones. When we instead compare to the exp ected  b =  1 case, we measure reduced χ 2 v alues of 0.78, 15.8, and 15.9 for the R, G, and B chips resp ectiv ely . These results suggest the R chip is consistent with the time-independent exp ectation, but the other chips are not. F or b oth the G and B chips (and R to a lesser exten t), we see a decrease in  b ov er time, whic h is what we w ould exp ect from nonlinearit y effects. the chip differences reduce and for the  b measurements to be more consistent with a v ector of ones. 4.3. Comp arison of Time V ariable Signal with Indep endent Sky Me asur ements Finally , w e compare the  b we measure to indep enden t estimates of the sky’s prop erties as determined by the Sloan T elescop e’s guide camera system. This system consists of six CCDs equipp ed with SDSS r band filters moun ted around the periphery of the telescop e’s field of view ( Pogge et al. 2020 ). All guide cameras are ex- p osed in a con tinuous lo op during survey observ ations. P oint sources detected in guide frames are cross-matched to Gaia DR2 ( Gaia Collab oration et al. 2018 ) ob jects with G < 18 mag. Gaia astrometry is used to deter- mine telescope p ointing corrections during science exp o- sures, but the guide frames additionally trac k changes in flux (transparency) and seeing (FWHM) for each unique source detected. V ariations in transparency are generally asso ciated with changes in cloud cov er, and we exp ect transparency trends measured b y the guide system to also be present in the spectra. In addition to transparency , atmospheric Optimal Fluxes fr om V ariable Illumina tion 19 seeing can v ary on timescales shorter than a t ypical sp ectrograph exp osure. The SDSS-V guider rep orts a Gaussian FWHM for each guide exp osure. Seeing can significan tly affect the flux receiv ed at the spectrograph in fiber-fed sp ectroscopy . APOGEE fib ers subtend a 3 arcsecond diameter circle on sky at the Sloan F ounda- tion T elescop e, so when seeing is p o or (e.g. > 3 arcsec FWHM), significan t amounts of the av ailable photons are landing outside the fib er’s collecting area. Giv en measurements for transparency and seeing, we can roughly mo del the flux-in-fib er change based on n g guide camera images taken during an APOGEE exp o- sure. W e c ho ose a 2D Gaussian model for the telescop e’s PSF during guider exp osure i (with i ∈ [1 , . . . , n g ]) g ( x, y , σ i ) = 1 2 π σ 2 i exp  −  x 2 + y 2 2 σ 2 i  , (27) where the Gaussian width is informed by the guider mea- sured FWHM in arcseconds σ i = FWHM i 2 √ 2 ln 2 . (28) F or each guide image, we analytically integrate the PSF profile ov er the surface of the circular 3 arcsecond di- ameter fib er, and then multiply by the guider measured transparency t i to estimate the flux entering the fib er f i = t i Z 2 π 0 Z 1 . 5 ′′ 0 g ( r cos θ , r sin θ , σ i ) r dr dθ. (29) Finally , we normalize this flux sequence by the maxi- m um computed flux during the APOGEE exp osure f i, norm = f i max { f 1 , f 2 , . . . , f n g } . (30) T elescop e p ointing errors (or guide jitter) can also b e a source of time-v ariable flux entering the fiber. In the examples we presen ted next, the guide error is stable and small, and so should not contribute tow ard a time- v ariable flux signal in the sp ectra. W e start b y considering a case where the guider sug- gests that cloud co ver is disp ersing during the exp o- sure while the seeing remains relativ ely constant. In this case, we measure  b for ten of the brightest stars, eac h fit independently , using 100 pixels along eac h star’s trace on a single detector. The results are sho wn in Fig- ure 17 with the colored lines in the b ottom panel sho w- ing the best fit  b measurements, rescaled to ha v e the closest agreement with the guider-estimated flux change. The top panel sho ws the median guider-measured trans- parency change across all sources in all guide chips (red line, left axis) in addition to the seeing FWHM (blue Figure 17. Comparison of  b measured from the ten brigh t- est stars in an APOGEE exposure (colored lines with scatter p oin ts, left axis of b ottom panel) with guider camera mea- suremen ts of transparency (red line, left axis of top panel), seeing (blue line, right axis of top panel), and total estimated c hange in flux (dark blue line, right axis of b ottom panel). While the seeing is relatively stable ov er the APOGEE exp o- sure, the transparency increases from b eginning to end and matc hes the shape of the measured  b vectors, suggesting that a cloud mo ving out of the field of view is driving the time– v arying component of the fluxes. Figure 18. Same as Figure 17 but using a different exp osure for a case where the seeing changes while the transparency is relatively stable. As the seeing increases from ∼ 1 . 5 arcsec to ∼ 3 . 5 arcsec, less of each star’s ligh t is b eing collected b y the fib er compared to the beginning, driving the change in  b as shown by the the b ottom panel. line, right axis). The right axis of the b ottom panel displa ys the estimated flux c hange after com bining the effects of c hanging transparency and seeing. The agree- men t in shap e b etw een the transparency and the  b from the different stars suggests that the c hange in flux as a function of time is largely dominated b y transparency c hanges for this exp osure. 20 Li et al. Figure 19. Same as Figure 17 but using a different expo- sure for a case where the transparency and seeing are b oth relativ ely stable. All the  b measurements are quite close to 1.0, agreeing with a largely stable atmosphere. W e then rep eat this analysis for tw o additional APOGEE exp osures. The next example, shown in Fig- ure 18 , is a case where the seeing is changing but the transparency is stable. These results show the expected an ti-correlation betw een the  b measuremen ts and seeing: as the seeing increases, less light is entering the fib er compared to earlier in the exp osure. The final example, sho wn in Figure 19 , is where the guider predicts that the atmosphere is stable and the measured  b vectors largely agree. With these real data tests, w e hav e sho wn that the mo del is flexible enough to pro cess images under a v a- riet y of conditions: large versus small time v ariabilit y as well as seeing versus transparency driven c hanges. In addition, the coheren t time-series structure in the data in Figure 11 suggests that there truly is a time-v ariable comp onen t to the science observ ations. W e hav e further explored the origin of the  b signal through comparisons with indep endent-but-sim ultaneous measurements from a guider camera, with results confirming that our mo del is able to capture real atmospheric v ariations. Thus, the time evolution we measure is not merely the result of p o orly-calibrated or misundersto o d data, and is instead ph ysically motiv ated. 5. CONCLUSION This w ork prop oses a new statistical mo del for ex- tracting optimal time-v arying fluxes from NIR detectors (Section 2 ). This approach builds up on and generalizes previous mo dels that were dev elop ed for space-based ob- serv ations where sources are typically constant in flux during an exp osure. Now, ground-based NIR observ a- tions that experience changing atmospheric conditions can measure unbiased fluxes with optimal uncertainties. W e v alidate our mo del b y comparing to a v ariet y of realistic synthetic datasets, sho wing that we can recov er the input truth across a range of flux lev els (Section 3.2 ) and amplitudes of flux v ariabilit y (Section 3.3 ). W e find that APOGEE-lik e observ ations should use the time- dep enden t mo del ov er the time-indep endent mo del when the amplitude of flux v ariation is ≥ 3 . 5% to av oid flux- dep enden t biases with magnitudes up to 120% of the flux uncertain t y (Figures 7 to 9 ). W e next sho w that our new mo del p erforms well on real data from the northern APOGEE sp ectrograph (Section 4 ). The time-v ariable mo del is able to describ e data b oth in cases where the flux app ears to b e c hanging (Figure 11 ) as w ell as with time-stable calibration obser- v ations (Figures 14 and 16 ). W e confirm that the bulk of the time v ariability agrees with our mo del’s wa velength indep endence assumption (Figures 12 and 13 ). W e also explore the origin of these flux c hanges through compar- isons with independent guider camera measuremen ts of the atmospheric conditions, finding that the time evolu- tion w e measure is physically motiv ated by a v ariable sky (Figures 17 - 19 ). F or real APOGEE data, accoun ting for time-v ariability significan tly improv es data-mo del com- parisons and pro duces unbiased fluxes. W e plan to incorp orate this mo del into a new APOGEE reduction pipeline that is under active devel- opmen t: ApogeeReduction.jl . 12 Ongoing and future w ork will focus on tac kling a few outstanding questions b efore w e can implement this approach at scale. This includes understanding the b est wa y to measure  b from a large collection of pixels (i.e. 2048 × 2048 pixel 2 de- tectors) that are p otentially seeing differen t atmospheric conditions, how b est to measure time-v ariability in fib ers with v ery faint pixels, and optimizing the co de to pre- v ent the introduction of new b ottlenecks into the reduc- tion process. A CKNOWLEDGEMENTS The authors thank the anon ymous referee for com- men ts that help ed improv e the clarit y of this pap er. The authors also thank Adam Wheeler for helpful conv ersa- tions that improv ed the bo dy of this w ork. K.A.M. ackno wledges supp ort from the Univ ersit y of T oronto’s Eric and W endy Schmidt AI in Science Post- do ctoral F ellowship, a program of Sc hmidt Sciences. A.K.S. ackno wledges supp ort for this work was pro- vided b y NASA through the NASA Hubble F ellow- ship gran t HST-HF2-51564.001-A a w arded b y the Space 12 The GitHub repository for the pipeline can b e found at h ttps: //github.com/andrew- saydjari/ApogeeReduction.jl Optimal Fluxes fr om V ariable Illumina tion 21 T elescop e Science Institute, whic h is operated b y the Asso ciation of Univ ersities for Research in Astronomy , Inc., for NASA, under contract NAS5-26555. J.G.F-T gratefully ac knowledges the grants supp ort pro vided by ANID F ondecyt Postdoc No. 3230001 (Sp onsoring researcher), the Joint Committee ESO- Go vernmen t of Chile under the agreemen t 2023 ORP 062/2023, and the supp ort of the Do ctoral Program in Artificial In telligence, DISC-UCN. F unding for the Sloan Digital Sky Survey V has b een provided by the Alfred P . Sloan F oundation, the Heising-Simons F oundation, the National Science F oun- dation, and the Participating Institutions. SDSS ac- kno wledges supp ort and resources from the Center for High-P erformance Computing at the Universit y of Utah. SDSS telescopes are lo cated at Apache P oin t Obser- v atory , funded b y the Astrophysical Research Consor- tium and op erated b y New Mexico State Universit y , and at Las Campanas Observ atory , op erated by the Carnegie Institution for Science. The SDSS w eb site is www.sdss.org . SDSS is managed by the Astroph ysical Researc h Con- sortium for the Participating Institutions of the SDSS Collab oration, including the Carnegie Institution for Science, Chilean National Time Allo cation Commit- tee (CNT AC) ratified researc hers, Caltec h, the Gotham P articipation Group, Harv ard Universit y , Heidelb erg Univ ersity , The Flatiron Institute, The Johns Hop- kins Universit y , L’Ecole polytechnique f ´ ed ´ erale de Lau- sanne (EPFL), Leibniz-Institut f¨ ur Astroph ysik Pots- dam (AIP), Max-Planc k-Institut f ¨ ur Astronomie (MPIA Heidelb erg), Max-Planck-Institut f ¨ ur Extraterrestrisc he Ph ysik (MPE), Nanjing Universit y , National Astronom- ical Observ atories of China (NAOC), New Mexico State Univ ersity , The Ohio State Universit y , Pennsylv ania State Universit y , Smithsonian Astroph ysical Observ a- tory , Space T elescop e Science Institute (STScI), the Stellar Astrophysics Participation Group, Universidad Nacional Aut´ onoma de M´ exico, Universit y of Arizona, Univ ersity of Colorado Boulder, Universit y of Illinois at Urbana-Champaign, Univ ersit y of T oron to, Univ ersity of Utah, Universit y of Virginia, Y ale Univ ersity , and Y unnan Universit y . F acilities: Sloan (APOGEE-N), Du Pon t (APOGEE-S) Softwar e: ApogeeReduction.jl (in prep.), fitramp ( Brandt 2024 ), jupyter ( Kluyver et al. 2016 ), matplotlib ( Hun ter 2007 ), numpy ( Harris et al. 2020 ), pandas ( Wes McKinney 2010 ; pandas developmen t team 2020 ), scipy ( Virtanen et al. 2020 ) REFERENCES Bo wen, I. S., & V aughan, Jr., A. H. 1973, ApOpt, 12, 1430, doi: 10.1364/AO.12.001430 Brandt, T. D. 2024, P ASP, 136, 045004, doi: 10.1088/1538- 3873/ad38d9 —. 2025, P ASP, 137, 125006, doi: 10.1088/1538- 3873/ae1dab Fixsen, D. J., Offenberg, J. D., Hanisc h, R. J., et al. 2000, P ASP, 112, 1350, doi: 10.1086/316626 Gaia Collab oration, Brown, A. G. A., V allenari, A., et al. 2018, A&A, 616, A1, doi: 10.1051/0004- 6361/201833051 Gunn, J. E., Siegmund, W. A., Mannery, E. J., et al. 2006, AJ, 131, 2332, doi: 10.1086/500975 Harris, C. R., Millman, K. J., v an der W alt, S. J., et al. 2020, Nature, 585, 357, doi: 10.1038/s41586- 020- 2649- 2 Holtzman, J. A., Nidever, D. L., Nguyen, D., et al. 2014, in American Astronomical So ciety Meeting Abstracts, V ol. 223, American Astronomical So ciety Meeting Abstracts #223, 440.04 Hun ter, J. D. 2007, Computing in Science & Engineering, 9, 90, doi: 10.1109/MCSE.2007.55 Kluyv er, T., Ragan-Kelley , B., P ´ erez, F., et al. 2016, in P ositioning and Po wer in Academic Publishing: Play ers, Agen ts and Agendas, ed. F. Loizides & B. Schmidt, IOS Press, 87 – 90 Kollmeier, J. A., Rix, H.-W., Aerts, C., et al. 2025, arXiv e-prin ts, doi: 10.48550/arXiv.2507.06989 Nidev er, D. L., Holtzman, J. A., Allende Prieto, C., et al. 2015, AJ, 150, 173, doi: 10.1088/0004- 6256/150/6/173 Offen b erg, J. D., Fixsen, D. J., Rauscher, B. J., et al. 2001, P ASP, 113, 240, doi: 10.1086/318615 pandas developmen t team, T. 2020, pandas-dev/pandas: P andas, latest, Zeno do, doi: 10.5281/zenodo.3509134 P ogge, R. W., Derw en t, M. A., O’Brien, T. P ., et al. 2020, in Ground-based and Airb orne Instrumentation for Astronom y VI I I, ed. C. J. Ev ans, J. J. Bryan t, & K. Motohara, V ol. 11447, International So ciety for Optics and Photonics (SPIE), 1144781, doi: 10.1117/12.2561113 Rausc her, B. J., F ox, O., F erruit, P ., et al. 2007, P ASP, 119, 768, doi: 10.1086/520887 22 Li et al. Sugai, H., T amura, N., Karo ji, H., et al. 2015, Journal of Astronomical T elescop es, Instruments, and Systems, 1, 035001, doi: 10.1117/1.JA TIS.1.3.035001 Virtanen, P ., Gommers, R., Oliphant, T. E., et al. 2020, Nature Methods, 17, 261, doi: 10.1038/s41592- 019- 0686- 2 Wes McKinney . 2010, in Pro ceedings of the 9th Python in Science Conference, ed. St´ efan v an der Walt & Jarrod Millman, 56 – 61, doi: 10.25080/Ma jora- 92bf1922- 00a Wilson, J. C., Hearty, F. R., Skrutskie, M. F., et al. 2019, P ASP, 131, 055001, doi: 10.1088/1538- 3873/ab0075 Optimal Fluxes fr om V ariable Illumina tion 23 APPENDIX A. GIBBS VERSUS LINEARIZED FITTING Here we compare the results of fitting synthetic data with the Gibbs metho d using the p osterior conditionals in Equation 16 v ersus the linearized mo del defined by Equation 23 . The precise data w e fit are from the high flux case describ e d in Section 3.1 and Figure 2 . F or the linearized model, the p osterior distribution is defined directly by Equation 23 . F or the Gibbs metho d, w e estimate the full p osterior (  f ,  b | data) by drawing samples from the p osterior conditionals (  f |  b, data) and (  b |  f , data) and then taking a mean and co v ariance of the (  f ,  b | data) samples to define a multiv ariate Gaus- sian that can be easily compared to the linearized re- sults. Figure 20 compares the t wo approaches when fo- cusing on the posterior distributions of the last five ele- men ts of the  b v ector. This corner plot sho ws that the mean estimates are virtually identical, but that there is some difference in the width and correlations of the m ultiv ariate posteriors. In this case, the Gibbs metho d (blue) tends to pro duce narro w er distributions than the linearized model (orange). How ev er, as discussed in Sec- tion 2 , w e are typically only concerned with the mean or MAP  b when trying to extract unbiased fluxes; the uncertain ty and correlations of the  b are not propagated to our b est fit (  f |  b ) measurements, as discussed in Sec- tion 2.1 . The mean  b for all read differences and  f for all synthetic pixels in this analysis are directly compared in Figure 21 . The top panels show that b oth metho ds reco ver almost iden tical measuremen ts, and the b ottom panels sho w that these mean estimates are extremely similar when compared to the size of their uncertain ties. The fact that b oth metho ds agree so closely in their  b mean estimates – that is, typically within 1% of their uncertain ty – shows that neglecting the robust-but-slow Gibbs metho d in fav or of the faster linearized model is lik ely to ha v e a negligible impact on final results. When we rep eat the analysis but use  b =  1 and com- pare the extracted fluxes to the time-indep enden t fits, w e find that the flux differences betw een the Gibbs and linearized method are approximately five times smaller than the difference b et ween either method and the time- indep enden t result. This is further evidence that the c hoice of Gibbs versus linear is not a substantial source of error. 24 Li et al. Figure 20. Corner plot comparing the multidimensional relationships b etw een the last fiv e  b elements obtained from the Gibbs metho d (blue lines) and the linearized model (orange lines). The ellipses show the 68%, 95% and 99% contours defined by the p osterior distribution from each method, while the red lines sho w the input truth used to generate the synthetic data. The titles ab ov e the diagonals display the mean and 68% credible region as defined b y the Gibbs sampling metho d. Both metho ds agree quite w ell with the truth and return almost iden tical mean estimates. The k ey difference b et ween the approaches comes from the size and shap es of the p osterior distributions, though they still broadly agree with each other. Optimal Fluxes fr om V ariable Illumina tion 25 Figure 21. Comparison of the b est-fit mean  b and  f using both the linearized mo del and Gibbs metho d. Lik e Figure 2 , the left panels fo cus on  b while the right fo cus on  f . The top panels sho w the direct one-to-one comparison of the measurements of the different techniques, and the b ottom sho ws the residuals scaled by the quadrature-com bined-uncertaint y . W e see that the z-scores are all v ery close to 0 (t ypically smaller than 1% of a corresp onding uncertaint y unit), sho wing that the different tec hniques provide mean estimates that are almost identical to one another.

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment