Optimal and Unbiased Fluxes from Up-the-Ramp Detectors under Variable Illumination

Draft version March 24, 2026 Typeset using L A T E X tw o column style in AAST eX7.0.1 Optimal and Un biased Fluxes from Up-the-Ramp Detectors under V ariable Illumination Bowen Li , 1 Kevin A. McKinnon , 2 Andrew K. Sa ydjari , 3 , ∗ Conor Sa yres , 4 Gwendol yn M. Eadie , 1, 2 Andrew R. Casey , 5, 6 Jon A. Hol tzman , 7 Timothy D. Brandt , 8 and Jos ´ e G. Fern ´ andez-Trincado 9, 10 1 Dep artment of Statistic al Scienc es, University of T or onto, 700 University Avenue, T or onto, ON M5G 1Z5, Canada 2 David A. Dunlap Dep artment of Astr onomy & Astrophysics, University of T or onto, 50 St Ge or ge Str e et, T or onto, ON M5S 3H4, Canada 3 Dep artment of Astr ophysic al Scienc es, Princeton University, Princ eton, NJ 08544 USA 4 Dep artment of Astr onomy, University of Washington, Box 351580, Seattle, W A 98195, USA 5 Scho ol of Physics and Astr onomy, Monash University, Clayton, VIC 3800, A ustr alia 6 Center for Computational Astr ophysics, Flatir on Institute, 162 Fifth Avenue, New Y ork, NY 10010, USA 7 Dep artment of Astr onomy, New Mexic o State University, P.O. Box 30001, MSC 4500, L as Cruc es, NM 88033, USA 8 Sp ac e T elesc op e Science Institute, 3700 San Martin Drive, Baltimor e, MD 21218, USA 9 Universidad Cat´ olic a del Norte, N´ ucle o UCN en Ar que olo g ´ ıa Gal´ actic a - Inst. de Astr onom ´ ıa, Av. Angamos 0610, A ntofagasta, Chile 10 Universidad Cat´ olic a del Norte, Dep artamento de Ingenier ´ ıa de Sistemas y Computaci´ on, A v. Angamos 0610, Antofagasta, Chile (Dated: Jan uary 2026; Received January 15, 2026; Revised Marc h 20, 2026; Accepted FUTURE D A TE) Submitted to Publications of the Astronomical Society of the Paciﬁc ABSTRA CT Near-infrared (NIR) detectors – whic h use non-destructive readouts to measure time-series coun ts- p er-pixel – play a crucial role in mo dern astrophysics. Standard NIR ﬂux extraction tec hniques were dev elop ed for space-based observ ations and assume that source ﬂuxes are constan t o v er an observ ation. Ho wev er, ground-based telescop es often see short-timescale atmospheric v ariations that can dramati- cally c hange the n um b er of photons arrivin g at a pixel. This w ork presen ts a new statistical model that shares information b etw een neigh b oring spectral pixels to c haracterize time-v ariable observ ations and extract un biased ﬂuxes with optimal uncertainties. W e generate realistic syn thetic data using a v ariety of ﬂux and amplitude-of-time-v ariabilit y conditions to conﬁrm that our mo del reco v ers unbiased and optimal estimates of both the true ﬂux and the time-v ariable signal. W e ﬁnd that the time-v ariable mo del should b e fav ored ov er a constant-ﬂux model when the observed count rates c hange by more than 3.5%. Ignoring time v ariabilit y in the data can result in ﬂux-dep endent, unknown-sign biases that are as large as ∼ 120% of the ﬂux uncertaint y . Using real APOGEE sp ectra, we ﬁnd empirical evidence for approximately wa v elength-independent, time-dependent v ariations in count rates with amplitudes muc h greater than the 3.5% threshold. Our mo del can robustly measure and remo v e the time-dep endence in real data, improving the quality of data-model comparison. W e show sev eral ex- amples where the observ ed time-dependence quantitativ ely agrees with indep endent measuremen ts of observing conditions, such as v ariable cloud cov er and seeing. Keywor ds: Astronomical detectors (84); Astronomical instrumentation (799) 1. INTR ODUCTION Extracting ﬂux from near-infrared (NIR) observ ations is imp ortant for many areas of astrophysical research. NIR information, in particular, probes further into dust y Corresponding author: Bo wen Li bowenj.li@mail.utoron to.ca ∗ Hubble F ellow regions than optical wa v elengths, enabling the study of stars on the other side of our Galaxy or in the dust- obscured centers of neighboring galaxies. Man y mo d- ern telescop es used in ongoing astronomical surveys use NIR detectors, suc h as the James W ebb Space T ele- scop e (JWST), the twin APOGEE spectrographs ( Wil- son et al. 2019 ) at the Apac he P oin t ( Gunn et al. 2006 ) and Las Campanas Observ atories ( Bow en & V aughan 1973 ), the Prime F o cus Sp ectrograph (PFS) on the Sub- 2 Li et al. aru telescope ( Sugai et al. 2015 ), and the so on-to-launch Nancy Grace Roman Space T elescop e. A k ey diﬀerence b et ween NIR detectors and classical CCDs is that they can p erform non-destructive readouts, whic h means that eac h pixel keeps a record of coun ts as a function of time. As a result, counts versus time plots tend to mak e a “ramp” shape, where the slop e is deﬁned by the inciden t ﬂux. The time series asso ciated with this measurement is often referred to as “sampling up-the-ramp” (SUTR). A large amoun t of pioneering work on NIR detectors w as done in preparation for JWST, with Fixsen et al. ( 2000 ) ﬁrst deﬁning a mo del to estimate the ﬂux and its corresponding uncertaint y from SUTR observ ations. Subsequen t work – such as, Oﬀenberg et al. ( 2001 ) and Rausc her et al. ( 2007 ) – further reﬁned the deﬁnitions of the exp ected ﬂuxes and uncertainties and tested against real observ ations with NIR detectors. A k ey diﬀerence b et ween these approaches comes from how the observed coun ts at diﬀerent read times are weigh ted when mea- suring the best-ﬁt slop e/ﬂux. Rauscher et al. ( 2007 ), for example, uses equal w eights such that the best-ﬁt slop e is the one that minimizes the distance b etw een all mea- sured coun ts. This approach is statistically v alid if noise is independent from read to read (as for idealized read noise), but does not account for the cov ariant b eha vior of photon noise. Fixsen et al. ( 2000 ), how ev er, applies larger w eights to the ﬁrst few and last few reads when photon noise dominates. This leads to higher extracted signal-to-noise ratios (SNR), esp ecially in the high ﬂux limit. W e note that the current JWST pip elines use the Fixsen et al. ( 2000 ) approac h while the main data reduc- tion pip eline of the APOGEE sp ectrographs 11 ( Holtz- man et al. 2014 ; Nidever et al. 2015 ) use the Rauscher et al. ( 2007 ) approach. More recen tly , Brandt ( 2024 ) presented a new opti- mal ﬁtting metho d that addresses many of the previous limitations. Fixsen et al. ( 2000 ) used appro ximate, dis- crete, tabulated weigh ts for combining the individual reads in to ﬂuxes. This sacriﬁced some SNR to a v oid exp ensiv e computations asso ciated with inv erting a sep- arate cov ariance matrix for ev ery pixel. Brandt ( 2024 ), b y recasting the problem in the space of read diﬀerences, deriv ed a closed-form solution for the maximum lik eli- ho o d ﬂux as a linear function of the individual reads. This enables un biased and sligh tly higher SNR ﬂux mea- suremen ts at a small cost in computational performance. Ov erall, the Brandt ( 2024 ) approach is v ery eﬀective for space-telescop e data where the ﬂux of sources is con- stan t ov er an exposure, suc h as for JWST and Roman. 11 apogee drp can be found here https://gith ub.com/sdss/ apogee drp/tree/daily Ho wev er, the constan t-ﬂux assumption begins to break do wn when dealing with ground-based telescopes, where atmospheric conditions can change substantially on the timescale of an exp osure (i.e., a few minutes) or betw een diﬀeren t NIR detector readouts (i.e., ∼ 10 seconds). With v ariable illumination, using diﬀeren t w eights from one pixel to another – using either Fixsen et al. ( 2000 ) or Brandt ( 2024 ) – will lead to ﬂuxes that are biased diﬀeren tly depending on the particular time-dependent signal and each pixel’s true ﬂux. Any mo del that treats all pixels identically , like that of Rauscher et al. ( 2007 ), will a v oid this diﬀeren tial bias but will sacriﬁce SNR. This pap er prop oses a new mo del to optimally ex- tract ﬂuxes from NIR detectors when dealing with time- v ariable data. Section 2 presents the mo del and its un- derlying statistics. It explains the k ey underlying as- sumptions and cav eats ab out its use. Section 3 then tests the mo del with realistic synthetic data under a v a- riet y of conditions. These tests are expanded to real data from the northern APOGEE spectrograph in Section 4 . Finally , our conclusions are presented in Section 5 along with prop osed future w ork fo cusing on impro vemen ts and implementation in a no v el APOGEE data reduc- tion pipeline. 2. ST A TISTICAL MODEL Our goal is to measure, up to a m ultiplicativ e con- stan t, unbiased relative ﬂuxes of all pixels on a detector asso ciated with a given spectrum read out up-the-ramp. This could b e accomplished by applying equal weigh ts to the diﬀeren t reads in a pixel (e.g., Rausc her et al. 2007 ), or b y an y mo del that applies the same read-by- read weigh ts to all pixels. Ho wev er, such an approach is not optimal. Due to the diﬀerent co v ariance of read noise and photon noise, higher ﬂux SNRs are p ossible b y weigh ting the reads diﬀerently depending on eac h pixel’s level of illumination ( Fixsen et al. 2000 ). The b est weigh ts for eac h pixel dep end b oth on the ﬂux in that pixel and on the time-dep endence of the illumina- tion, an asp ect that is not treated in either Fixsen et al. ( 2000 ) or Brandt ( 2024 ). W e seek a generalization of the Brandt ( 2024 ) optimal approach that accoun ts for non- constan t illumination, assuming the time dep endence of the illumination is shared betw een neigh b oring pixels. In the case of Brandt ( 2024 ), this illumination function is a constant. W e b egin by deﬁning some terms. Let n r b e the num- b er of read-out times and n p b e the num ber of pixels. F or the case where the ﬂux in a pixel is not changing o ver time, w e exp ect that the observed counts in pixel p Optimal Fluxes fr om V ariable Illumina tion 3 at read j is described by: c j,p = δ j,p + j X i =1 ∆ r i,p (1) where j ∈ [1 , n r ], δ j,p is noise associated with the mea- suremen t, and ∆ r i,p is the true num b er of new photons that arrive at pixel p b etw een reads j − 1 and j . In this wa y , w e use i to k eep track of information b etwe en reads while j is used to trac k information at diﬀeren t read times. The measuremen t noise is describ ed by: δ j,p ∼ N  µ = 0 , σ 2 = σ 2 r,p  (2) where σ r,p is the per-pixel read noise, which we assume has b een previously measured during calibration of the detector (e.g., using the mo del of Brandt 2025 ). The ∆ r i,p is described by: ∆ r i,p ∼ Poisson ( λ = f p ) (3) where f p is the true ﬂux of the source in pixel p . Because mo deling combinations of Poisson and Gaus- sian distributions is often computationally exp ensive, it is common to mak e a Poisson-appro ximated-as- Gaussian assumption such that: ∆ r i,p ∼ Poisson ( λ = f p ) ≈ N  µ = f p , σ 2 = f p  , (4) and we will do the same for this work. This assump- tion is very go o d in the case of large f p where Gaussian and P oisson distributions lo ok quite similar. As f p de- creases, the Poisson-to-Gaussian approximation is less go o d; ho w ever, this is often at the p oint where the im- pact on the observed counts from the read noise starts to dominate, so the ﬁnal c j,p distribution is still well- appro ximated by a Gaussian for all f p v alues. After making this appro ximation, w e are left with a combi- nation of Gaussian distributions, which mak es analysis m uch easier. Brandt ( 2024 ) show ed that if we c hange to w ork with coun t diﬀerences b etw een neighboring reads instead of coun ts – that is, ∆ c j,p = c j +1 ,p − c j,p – then w e ha ve the following relationship:  ∆ c p ∼ N   µ = f p ·  1 , V V V p = V V V r,p + f p · I I I  (5) where  ∆ c p is a vector of the ( n r − 1) count diﬀerences, I I I is the ( n r − 1) × ( n r − 1) identit y matrix, and V V V r,p is a tri-diagonal matrix that describ ed the read noise as V V V r,p =          2 σ 2 r,p − σ 2 r,p 0 · · · 0 − σ 2 r,p 2 σ 2 r,p − σ 2 r,p · · · 0 0 − σ 2 r,p 2 σ 2 r,p · · · 0 . . . . . . . . . . . . . . . 0 0 0 · · · 2 σ 2 r,p          . (6) Solving the mo del in Equation 5 is diﬃcult in the- ory b ecause the f p v alue of interest shows up in both the n umerator and denominator of the arguments in the Gaussian likelihoo d. How ev er, w e often ha v e a goo d ﬁrst guess for what f p should b e based on the data – such as either the mean or median of the observ ed coun t diﬀer- ences – esp ecially in the case of a large num b er of reads, whic h is quite typical with NIR detectors. If w e tak e this ﬁrst guess, b f p , to generate c V V V p = V V V p ( b f p ), then the b est ﬁt ﬂux is deﬁned to be: µ f p =   1 T · c V V V p − 1 ·  1  − 1 ·   1 T · c V V V p − 1 ·  ∆ c p  (7) with uncertain t y σ f p =   1 T · c V V V p − 1 ·  1  − 1 / 2 . (8) W e can then iterate using this new b est guess for f p to up date b f p = µ f p and c V V V p to measure an improv ed µ f p , rep eating un til con vergence. In practice, Brandt ( 2024 ) ﬁnd that a single up date and recalculation is all that is needed to remov e an y bias in µ f p . While this approac h works successfully with space- based data where the ﬂuxes of sources are truly quite stable o ver the course of an exp osure, this is not often the case for ground-based observ atories that hav e con- founding eﬀects like changing atmospheric conditions. T o address this fact, we adjust the mo del in Equation 5 to include a time-v arying component: b i ≥ 0 f p ≥ 0 ∆ r i,p ∼ Poisson ( λ = f p · b i ) (9) where b i is the “ﬂux m ultiplication factor” for read dif- ference i . W e deﬁne  b as a length n r − 1 vector of the m ul- tiplication factors that describ e how the ﬂuxes change as a function of time: 0 for “no light received” and 1 for “maxim um transparency/eﬃciency during the observ a- tion”. With this mo del, w e assume that all pixels we analyze ha ve b een impacted by the same m ultiplicative factor at a giv en read time. W e are also assuming that these multiplicativ e factors can be approximated b y a constan t betw een adjacent reads. F uture work might ex- plore using deﬁnitions that encode/enforce contin uit y in time, for example by deﬁning a smo oth time-dependent function, b ( t ), that can b e integrated b et ween readout times to calculate  b . A comparison of example syn thetic data generated us- ing constan t ﬂux and a ﬂux that is changing ov er time is sho wn in Figure 1 . F or this syn thetic pixel, the true ﬂux is f p = 3000 e − /read and the  b for the orange line is 1 for the ﬁrst six elements, and then 0.5 for subsequen t 4 Li et al. Figure 1. T op: An example “ramp” of coun ts v ersus read n umber using syn thetic data. These data are generated using a Poisson distribution to sim ulate photon arriv al with Gaus- sian read noise, as describ ed in the b eginning of Section 2 . The blue line represents the observed coun ts in a pixel with constan t ﬂux o ver time, while the orange line shows the ob- serv ed counts for ﬂux that c hanges during the exp osure to sim ulate the eﬀects of, for example, a cloud mo ving in be- t ween reads six and seven. Bottom: Count diﬀerences ver- sus time using the same data as in the top panel. The true input ﬂux for both cases is 3000 electrons per read, and the orange line’s ﬂux is decreased to half of its maxim um after six reads. read diﬀerences. One can imagine this represents a case where clouds rolled in for the later reads. It is apparen t from b oth the counts and the count diﬀerences that a single ﬂux will not describ e the orange data. Using similar arguments to Brandt ( 2024 ), w e write the model as pro ducts of Gaussians suc h that:  ∆ c p ∼ N  f p ·  b, V V V p = V V V r,p + V V V f ,b,p  (10) where w e no w ha v e V V V f ,b,p = f p ·       b 1 0 · · · 0 0 b 2 · · · 0 . . . . . . . . . . . . 0 0 · · · b n r − 1       . (11) Because w e assume that a collection of pixels share the same  b , we can jointly ﬁt those pixels to constrain the time-v arying comp onent. F or sp ectroscopic data, this usually means using pixels along the same trace of light, corresp onding to a similar patch of sky (e.g., from one ﬁber or slit). F or imaging data, this w ould mean using pixels that are near each other on the im- age. In this work, our assumption of a shared  b betw een neigh b oring pixels means we are concerned with mo d- eling w a v elength-indep endent temp oral v ariations. In Section 4 , we will use real sp ectroscopic data to show that our achromatic mo del captures the v ast ma jorit y of the time-dep endent eﬀects. Chromatic v ariations p o- ten tially con tribute higher order eﬀects that could b e explored in future work. F rom the new mo del in Equation 10 , we notice that there is a degeneracy b et ween f p and  b . In particular, w e can arbitrarily m ultiply f p b y an y non-zero v alue so long as we divide  b b y the same factor. T o ﬁx this issue, w e force one of the elements of  b to b e 1:  b ′ =       1 b 2 . . . b n r − 1       =       1 0 . . . 0       +       0 0 · · · 0 0 1 · · · 0 . . . . . . . . . . . . 0 0 · · · 1       ·  b =  10 + 01 01 01 ·  b (12) suc h that all other  b elemen ts are measured relative to this ﬁxed one and f p is tied to that particular read time. In practice, we set the element in the ﬁrst-guess estimate of  b with the highest v alue to b e 1 by rescaling  b and f p appropriately . Ho wev er, this choice do es not require or enforce that all other  b e lemen ts b e smaller than one: in the ﬁtting pro cess, w e migh t learn that one of the other elemen ts is the true highest  b elemen t. Now, our full mo del becomes:  b ∼ N   b prior , V V V prior   ∆ c p ∼ N  f p ·  b ′ , V V V p = V V V r,p + V V V f ,b ′ ,p  (13) where  b prior =  1 and V V V − 1 prior = lim a → 0 +       1 a 0 · · · 0 0 a · · · 0 . . . . . . . . . . . . 0 0 · · · a       (14) describ e a non-informative prior that ensures our desired  b elemen t is held at 1 exactly: with this form, the ﬁxed elemen t of  b has a prior that is inﬁnitely narrow while the other elemen ts ha v e priors that are inﬁnitely wide. Optimal Fluxes fr om V ariable Illumina tion 5 After w e deﬁne our ﬂux vector as  f =       f 1 f 2 . . . f n p       , (15) w e use the standard technique of taking a ﬁrst partial deriv ative of the negative log likelihoo d, setting to 0 to ﬁnd extrema, then ev aluating the second deriv ativ e at that extremal v alue to determine the mean and cov ari- ance of the ﬁnal p osterior distribution. This results in p osterior conditionals for  b and  f describ ed b y: ( f p |  b, data) ∼ N  µ f ,p , σ 2 f ,p  (  b |  f , data) ∼ N (  µ b , V V V b ) (16) where σ 2 f ,p = h (  10 + 01 01 01 · b b ) T · b V V V − 1 p · (  10 + 01 01 01 · b b ) i − 1 , µ f ,p = σ 2 f ,p h (  10 + 01 01 01 · b b ) T · b V V V − 1 p ·  ∆ c p i , V V V b = h X p b f 2 p ( 01 01 01 T · b V V V − 1 p · 01 01 01) + V V V − 1 prior i − 1 ,  µ b = V V V b · h X p  b f p 01 01 01 T · b V V V − 1 p · (  ∆ c p −  10 · b f p )  + V V V − 1 prior ·  b prior i . (17) Here, we hav e used a similar trick as the Brandt ( 2024 ) metho d where we assume that w e can estimate a go o d ﬁrst guess for b oth b f and b b from the data to deﬁne b V V V p . By construction,  µ b will b e exactly 1 in its ﬁrst elemen t (or whic hever has been chosen to b e ﬁxed) and will hav e a corresponding uncertain t y of σ prior . Using these p os- terior conditionals, we can use a standard MCMC ap- proac h, such as Gibbs sampling, to dra w samples from the posterior (  f ,  b | data) distribution. While Gibbs sampling of the distributions describ ed in Equation 16 will give statistically robust samples of the desired p osterior, it can b e quite slow. If we instead recognize that we are almost alwa ys interested in sim- ply measuring the maximum-a-posteriori (MAP) of the p osterior, then getting the full shape of the distribution correct using Gibbs sampling is unnecessary . Instead, w e can use a small num ber of iterations to get to the MAP using a linearized version of the model described b y Equation 10 . In particular, we hav e the mo del count diﬀerences in pixel p deﬁned as  m p = f p ·  b ′ = f p (  10 + 01 01 01 ·  b ) (18) whic h has deriv atives with resp ect to  b and f p of ∂  m p ∂  b = f p 01 01 01 ∂  m p ∂ f p =  10 + 01 01 01 ·  b. (19) W e can then deﬁne an oﬀset v ector from some initial guess  v T 0 =   b 0 T ,  f 0 T  as  ∆ v = "  ∆ b  ∆ f # that giv es an appro ximation of the mo del coun t diﬀer- ences as:  m p ≈ f p, 0 (  10 + 01 01 01 ·  b 0 ) + h ∂  m p ∂  b |  v =  v 0   ∂  m p ∂  f |  v =  v 0 i · "  ∆ b  ∆ f # :=  m p, 0 + U U U p ·  ∆ v , (20) where U U U p is an ( n r − 1) × ( n p + n r − 1) Jacobian ma- trix describing the ﬁrst deriv atives of  b and  f . Here, δ  m p δ  f |  f =  f 0 is an ( n r − 1) × n p matrix that is zeros every- where except for column p , which is equal to the vector  10 + 01 01 01 ·  b 0 . F or clarity , an example set of U U U p matrices for n r = 4, n p = 2,  b T = (1 , b 2 , b 3 ),  f T = ( f 1 , f 2 ) would lo ok lik e: U U U 1 =    0 0 0 1 0 0 f 1 0 b 2 0 0 0 f 1 b 3 0    and U U U 2 =    0 0 0 0 1 0 f 2 0 0 b 2 0 0 f 2 0 b 3    . No w, the linear-approximated statistical mo del b e- comes:  ∆ v ∼ N   ∆ v prior , V V V  ∆ v , prior   ∆ c p ∼ N   m p, 0 + U U U p ·  ∆ v , V V V p  (21) where again w e hav e in tro duced a prior term that only serv es to ﬁx one of the  b elemen ts at exactly 1 with  ∆ v prior =  1 (i.e., a v ector of length n r − 1 + n p , where ev ery elemen t except the  b en try we hold ﬁxed could arbitrarily be set to an y v alue) and V V V − 1  ∆ v , prior = lim a → 0 +       1 a 0 · · · 0 0 a · · · 0 . . . . . . . . . . . . 0 0 · · · a       . (22) 6 Li et al. W e note that the mo del in Equation 21 can b e re-written to explicitly ignore the ﬁxed  b elemen t and thereby re- mo ve the need for a prior – in fact, this is ho w w e imple- men t the parameter ﬁtting in our co de – but we leav e it here for completeness. Combining the prior and likeli- ho o d abov e, w e arriv e at the follo wing posterior distri- bution:  ∆ v ∼ N   µ  ∆ v , V V V  ∆ v  (23) with V V V  ∆ v = h X p ( b U U U T p · b V V V − 1 p · b U U U p ) + V V V − 1  ∆ v , prior i − 1 , (24) and  µ  ∆ v = V V V  ∆ v · h X p  b U U U T p · b V V V − 1 p · (  ∆ c p −  m p, 0 )  + V V V − 1  ∆ v , prior ·  ∆ v prior i (25) In words, this new formalism sa ys that w e can use a starting guess (  b 0 ,  f 0 ), and then use gradient descent to calculate an impro ved subsequent guess, jointly ﬁt- ting the  b and  f . W e then iterate this up dating pro cess un til the parameters are no longer changing. It t ypi- cally takes under ﬁve iterations to ha ve all  b elements c hanging by less than 1 × 10 − 10 when the ﬂux is relative high ( ≥ 100 e − /read), with the  f elemen ts changing by a similar magnitude. Another approac h for optimizing the ﬁtting code could make use of, for example, autodiff to n umerically estimate Jacobian vectors and Hessian ma- trices to quickly arrive at the MAP while also accounting for the V V V p dep endence on f p and  b in Equation 13 . A direct comparison b etw een the results of Gibbs sam- pling and the linearized model are compared in Ap- p endix A , with the k ey takea wa y b eing that they agree v ery well in mean estimates of all parameters but show sligh t diﬀerences in the widths and correlations of their distributions. While the posterior conditional distribu- tions of  f and  b are b oth multiv ariate Gaussians (i.e. Equation 16 ), the joint p osterior is not exactly Gaus- sian (though it is close). This is b ecause b oth  f and  b app ear in the cov ariance matrices of eac h other’s p oste- rior conditional deﬁnitions. The small width and corre- lation diﬀerences in the joint  b and  f p osterior distribu- tion are likely a consequence of the linearized metho d appro ximating a close-but-not-entirely-Gaussian distri- bution as exactly Gaussian, with the Gibbs metho d giv- ing more slightly more accurate estimates of uncertain- ties. Ho w ever, we are almost alwa ys interested in the mean estimate of  b , whic h w e can use to correct the data and then measure the resulting b est-ﬁt ﬂux, as dis- cussed in the following subsection. Because the mean es- timates of  b and  f for the Gibbs and linearized metho ds are eﬀectively equiv alen t, w e are safe to choose either metho d. F or the remainder of the pap er, all analyses will use the linearized mo del, as w e ha ve a fairly fast implemen tation that is similar to what most reduction pip elines would choose to employ . W e also note that, should the Gibbs metho d b e desired, the linearized or Hessian-based results provide a go o d parameter initial- ization b ecause it will b e at the mo de of the p osterior distribution. An implemen tation of our ﬁtting pro ce- dure can b e found at h ttps://gith ub.com/KevinMcK95/ ﬁt time dep NIR ﬂuxes . 2.1. Cave ats and R e c ommendations Similar to the Brandt ( 2024 ) method, we hav e as- sumed we are able to estimate an initial guess for  b and  f from the data. In practice, this is muc h easier to do when the ﬂuxes are substantially larger than the read noise and when using a fairly large num b er of pix- els. W e ﬁnd that the linearized mo del is insensitive to the ﬁrst guess parameters – giv en enough iterations of the up dates – b ecause the posterior distribution is uni- mo dal. Our approach to deﬁne initializations for  f and  b is the following: 1. Estimate f p b y taking the median of the read dif- ferences in pixel p (i.e. a go o d estimate if the ﬂux w as constan t in time,  b =  1); 2. Divide the coun t diﬀerences by this f p to get an estimate of  b for each pixel; 3. T ake an in verse-v ariance w eigh ted a v erage of those p er-pixel  b estimates to deﬁne the starting guess  b , where the weigh ts come from the read noise and observ ed coun t diﬀerences in each pixel at each read time; 4. Rescale the  b guess such that the maximum of this v ector is exactly 1; 5. Divide the observ ed coun ts by this starting guess  b and take an in v erse-v ariance weigh ted mean of the corrected count diﬀerences in eac h pixel to deﬁne a starting guess  f . T o b e consisten t with ph ysical expectations, we ensure that all the elemen ts of  f and  b are p ositive at ev ery step in the up date pro cess by replacing the up dated b est ﬁt with 0 (or some very small non-zero threshold) if the curren t guess is negative. One could imagine w orking with log ﬂuxes and log  b elements or using more com- plicated prior distributions (e.g. truncated m ultiv ariate Gaussians) to ensure the p osteriors alw a ys pro duce non- negativ e samples; ho w ev er, these c hoices w ould break Optimal Fluxes fr om V ariable Illumina tion 7 the p osterior Gaussianity we rely on to p erform com- putationally fast analyses. F uture work could explore searc hing for relatively fast implemen tations using these non-negativ e constrain ts. When jointly ﬁtting  b and  f – either using the lin- earized mo del or Gibbs – w e end up with a distribu- tion of  f that is highly correlated betw een pixels. This correlation is often not desirable b ecause later steps in pro cessing a detector will typically make assumptions ab out pixel independence. Ideally , w e could sim ultane- ously ﬁt all the pixels and keep track of all pixel cor- relations, but this w ould in volv e holding v ery large ma- trices in memory (i.e., greater than n pixels × n pixels in size). Similarly , tasks requiring the in v ersion of this large matrix would b e impractical or impossible. F or most applications, w e recommend using the linearized mo del to jointly ﬁt  b and  f using the av ailable bright pixels. Then, thro w a w ay the join t  f and keep only the b est-ﬁt  b . With this  b best , extract ﬂuxes from all the de- sired pixels – including those that w ere not used in the original joint ﬁt – using the p osterior conditional distri- bution (  f |  b =  b best , data) as deﬁned in Equation 16 . In this wa y , the ﬁnal ﬂuxes will be uncorrelated with eac h other, preserving the desired pixel indep endence. F or all the ﬁgures presented in this w ork, the ﬂuxes w e sho w are calculated using (  f |  b =  b best , data) and not the ﬂuxes w e measuring during the join t ﬁt. In this w ay , w e can think of the  f measured during joint ﬁtting as n uisance parameters with  b being the key measurement of interest. Of course, the ultimate goal is to measure the correct (  f |  b =  b best , data). W e note that the p osterior  b will also hav e correlations b et ween the diﬀerent elemen ts corresp onding to diﬀer- en t read times. In theory , using  b to accurately correct the observ ed data requires including the eﬀect of these correlations by marginalizing ov er the p osterior  b distri- bution, (  b | data) which we deﬁne to hav e cov ariance V V V b, post . Ho wev er, this is quite costly and again intro- duces correlations betw een the pixel ﬂuxes. Another ap- proac h is to approximate the marginalization by adding a f 2 p V V V b, post term to the data cov ariance matrix V V V p in (  f |  b =  b best , data), and this is able to give bac k  f un- certain ties that are broadly consistent with results from the exp ensive marginalization. Ho wev er, V V V b, post is typ- ically dense, so it breaks the fast-to-in vert tri-diagonal form of V V V p and instead pro duces a slow er-to-in v ert ma- trix that is p oten tially full-rank, signiﬁcan tly increasing computational costs. W e explored using the full-rank matrix as well as only including the correlations up to the tri-diagonal elements when correcting the data: the full-rank version pro duces  f uncertainties that are most consisten t with the  b marginalization results. T o sum- marize, all cases of propagating  b cov ariances only in- creased the magnitude of the (  f |  b ) uncertainties. As is sho wn in the Section 3 , tests with syn thetic data where w e do not propagate  b co v ariances – that is, only the b est ﬁt  b is used to measure (  f |  b =  b best , data) – return un biased ﬂuxes that agree with the input truth within their uncertainties.As a result, we recommend not prop- agating  b uncertain ties, assuming  b can b e measured to higher than ∼ 10% precision. In our mo del, w e are assuming that the nonlinear- it y of each pixel has also been suﬃcien tly accounted for through calibration of the detector. If non-linearity is still present in the data, then w e exp ect that our mo del will pick up an a verage nonlinearit y response from the join tly ﬁt pixels: that is, ev en time-indep endent data will return a  b with elemen ts near 1 at the b eginning read times and < 1 at later read times. How ever, we note that this is likely an undesirable w ay to handle nonlinearit y corrections; there is no reason to believe that nearb y pixels hav e the exact same nonlinearit y re- sp onse, but our mo del forces them to share a nonlinear- it y correction. It is muc h b etter to measure and remov e nonlinearit y b efore applying our mo del. When our mo del is applied to real data, we cannot kno w if  b is solely measuring the eﬀects of the sky . The  b we measure could be a com bination of improp erly- remo ved nonlinearit y eﬀects, a c hanging sky , issues with telescop e guiding, strange electronic noise, or persis- tence, to name a few. That b eing said, the source of the eﬀect is not the greatest concern; it is clear that the data are p o orly describ ed b y the Brandt ( 2024 ) con- stan t ﬂux mo del. As a result, such a mo del will ha v e unaccoun ted-for systematic errors, underestimated un- certain ties, and a ﬂux-dep endent bias. Our time v ariable mo del pro duces ﬂuxes that are less biased, regardless of whether w e understand the source of the eﬀect. If external constrain ts of  b w ere a v ailable – for exam- ple, relatively strong prior information ab out the sky v ariability during an exp osure from telescop e guider cameras – then disentangling the origin of diﬀerent time v ariable signals would b ecome more feasible. As we show later in Section 4.3 , guider camera information appears fairly correlated to the time v ariable signals we measure from real science data. F uture work will fo cus on turn- ing these guider camera estimates into well-calibrated prior constraints. After using telescop e calibration data – such as a combination of darks and dome ﬂats – to re- mo ve or mo del p ersistence and nonlinearity eﬀects, the ma jority of the remaining time-v ariable signal could be describ ed as a combination of sky and stellar v ariabil- it y . With w ell-informed priors on the sky v ariability , the  b measuremen ts could be decomposed to understand 8 Li et al. b oth the Earth’s atmosphere as well as astrophysical v ariability in a particular star’s brightness/spectrum. Th us, the metho ds presented in this paper may present a path to w ards learning in teresting time-series stellar as- troph ysics o ver relatively short timescales ( ∼ 10 seconds p er read for APOGEE). In this w ork, w e use the w ord “bias” to describe ﬂuxes that hav e a non-zero oﬀset in exp ectation from the truth. In the follo wing sections, w e show that ﬁtting time- indep enden t models to time-v ariable data produces ﬂux- dep enden t biases, with the precise sign and magnitude dep ending on the particular  b for an observ ation (Sec- tion 3.1 , third panel of Figure 4 ). When we uniformly marginalize ov er many  b draws (Section 3.3 ), we ﬁnd that the exp ectation of this bias goes to 0 (third panel of Figure 7 ), but this does not mean that the individual ob- serv ations are un biased. Instead, the o verall eﬀect of the bias can be thought of as a systematic error that is un- accoun ted for when using the time-indep enden t model. Throughout thi s paper, the bias we men tion is implicitly ﬂux-dep enden t and “systematic error” and “bias mag- nitude” are used interc hangeably . Measuring  b is muc h easier when the count rate dif- ferences are signiﬁcantly larger than read noise. Ho w- ev er, many pixels – whether in sp ectroscopic or imag- ing data – will naturally b e quite dark. F or imaging data, w e imagine a user would either measure a single common  b for an image or calculate independent, local  b v alues across the face of the detector using diﬀerent brigh t sources, and then build some form of interpola- tion b etw een those measurements to correct the data. F or sp ectroscopic observ ations, the p ositions of bright sources/traces on the detector could b e mapp ed to p osi- tions on the sky to allo w the interpolation of  b for faint sources. Another approac h would b e to fo cus on the ﬂux near bright sky emission lines that should b e present for all spectroscopic observ ations, regardless of the target’s brigh tness. How ev er, this latter approac h has an addi- tional complication in that the night sky is exp ected to c hange in diﬀeren t wa ys than sources o ver the course of the night; for example, the sky’s large-scale trend of starting oﬀ bright, reaching a minimum, and then brigh tening tow ards the sunrise, or fast-timescale v aria- tions in h ydroxyl radicals. Using skyline pixels to correct source pixels would then require calibrating the relativ e c hanging ﬂux b etw een a source and the sky , p oten tially harnessing prior constrain ts from indep endent measure- men ts of the sky v ariability as a function of time (e.g., from guide/w eather-trac king cameras sitting at the tele- scop e). As men tioned earlier, there is an arbitrary tradeoﬀ b e- t ween  b and  f , so it is important to remember that all the ﬂuxes measured using the same  b should be consis- ten t with each other up to some multiplicativ e constan t. Ho wev er, when comparing pixels corrected with diﬀeren t  b v ectors, we ha ve to b e more careful. F or sp ectroscopic data, we suggest measuring a single  b p er trace/ﬁb er. T ypically , the diﬀerence in relative ﬂuxes b etw een the traces is measured/calibrated at some later step in a data reduction pip eline, which should naturally handle diﬀerences b etw een the arbitrary  b scaling across the detector. How b est to deal with the cross-talk regions b et ween traces is not yet ob vious, though the most sta- tistically robust answ er likely inv olv es complete forward mo deling of sp ectral ligh t and p oint spread functions across the detector. F or imaging data, lo cally measured  b v ectors should also b e handled with care or a single  b p er image should b e used. In b oth imaging and sp ec- troscop y cases, it w ould b e wise to force all indep endent  b measurements to share the same reference read (i.e., where b i = 1) so that everything is measured relativ e to the ﬂux transparency at the same read time. Finally , w e note that the analyses in the follo wing sec- tions regularly compare the results to the case where we assume the ﬂuxes are constant in time, as measured by the Brandt ( 2024 ) mo del. While the results will sho w that our new model does a better job of describing our time-v ariable data, we m ust emphasize that w e are not concluding that the Brandt ( 2024 ) mo del is incorrect. These comparisons merely sho w that ignoring time v ari- abilit y when it is present in the data leads to w orse re- sults. F or data where the ﬂuxes are truly stable ov er an observ ation – such as for space telescop es – the Brandt ( 2024 ) model is the correct c hoice to use. 3. TESTING WITH SYNTHETIC DA T A W e generate synthetic data using the Poisson and Gaussian model deﬁned in Equation 9 giv en some choice of  f and  b . In this section, w e will fo cus on tests using a range of ﬂux levels as well as  b amplitudes to quantify our abilit y to measure unbiased ﬂuxes. 3.1. Fitting a Single Synthetic Observation W e ﬁrst test how well we can reco ver our input ﬂuxes and  b for a single realization of synthetic data. W e set n reads = 33 and the read noise p er pixel to b e uniform random dra ws with bounds of 14 − 24 e − , which are t ypical v alues for APOGEE science observ ations. F or this test case, we deﬁne the true  b to b e n reads − 1 uni- form draws with bounds of 0 . 5 − 1 . 0, the true ﬂuxes to b e n pixels uniform draws with b ounds of 800 − 1200 e − /read, and n pixels = 100. W e ha v e chosen 100 pixels to ensure that our ﬁtting code (when used on a p ersonal laptop) is able to run fast and is light w eigh t (in terms of RAM): fu- ture implementation with real data would likely b eneﬁt Optimal Fluxes fr om V ariable Illumina tion 9 Figure 2. Comparison of the best-ﬁt  b and (  f |  b =  b best ) and the truth for syn thetic data from 100 pixels in a photon-dominated regime. The left panels fo cus on  b , and the right panels fo cus on (  f |  b ). The top panels compare the input and outputs directly , while the bottom panels show the uncertain t y-scaled residuals. The b ottom panels display the b est-ﬁt χ 2 scaled by the appropriate degrees of freedom, b oth of which are near 1.0 as expected for reasonable measuremen ts and uncertain ties. The red p oin ts in the  b panels mark the read diﬀerence num b er that was not used in the ﬁtting to break degeneracies b etw een  b and  f , as discussed in Section 2 . from using all a v ailable pixels. Our curren t test rep- resen ts a “high” ﬂux case b ecause the input ﬂuxes are signiﬁcan tly larger than the read noise, making this the easiest test of our mo del. After generating the data, we estimate the ﬁrst guess of the  b and  f using the approach describ ed in Sec- tion 2.1 , whic h is necessary to deﬁne the ﬁrst cov ariance matrix that describ e the data. W e then use the starting guess b V V V p matrices to up date our b est  f and  b guesses based on the linearized mo del of Equation 23 . W e then up date the b V V V p matrices and repeat un til all elements of  b are changin g less than some small tolerance (e.g., 1 × 10 − 10 ), which usually happ ens after only three or four iterations. Before comparing the input and output  f and  b , we m ust accoun t for the arbitrary scaling betw een them. F or example, if clouds in the sky meant that there w as maxim um 80% of ligh t getting through in a giv en exp o- sure, then the true  b is a vector of 0.8, but our mo del w ould return a vector of all ones for  b with the  f b e- ing a factor 0.8 low er than the true source ﬂuxes. F or real data, this discrepancy does not matter b ecause w e only ever exp ect to measure relativ e ﬂuxes of sources, with later calibration steps determining an absolute ﬂux- ing if necessary . F or our comparisons with syn thetic data, how ev er, we wan t to quan tify how well our mea- suremen ts and uncertainties describ e the truth, which means we must rescale our outputs b efore we can com- pare with the input v alues. Sp eciﬁcally , we measure the m ultiplicative factor needed that minimizes the diﬀer- ence b etw een the true and measured  b , and then rescale the  f and  b b y that amount before comparison. The b est-ﬁt results for one realization of sim ulated data are sho wn in Figure 2 , with the left panels focus- ing on  b and the righ t panels fo cusing on (  f |  b =  b best ). Both vectors hav e go o d agreemen t with the truth (upp er panels) and an y residuals are well-described b y their un- certain ties (b ottom panels). W e also calculate the data 10 Li et al. Figure 3. Comparison of the data χ 2 empirical cumulativ e distribution function (CDF) for the same sim ulated data as in Figure 2 . The blue line shows the χ 2 , as calculated using Equation 26 when using the b est ﬁt  f and  b , which sho ws go od agreemen t with the exp ected χ 2 ν = n reads − 2 distribution (blac k line). The solid orange line sho ws the data χ 2 af- ter measuring the best-ﬁt constant ﬂux following the Brandt ( 2024 ) mo del, which is not able to describ e the data within its uncertain ties. In fact, the solid orange histogram ﬁnds that the diﬀerence b etw een the constan t ﬂux and data requires the uncertain ties to be 3.91 times larger, as demonstated by the dashed orange histogram, whic h has muc h closer agree- men t to the expected curve. χ 2 distribution using the b est-ﬁt ﬂuxes and  b using χ 2 = X p h  ∆ c p − f p ·  b i T V V V − 1 p h  ∆ c p − f p ·  b i , (26) to conﬁrm that the ﬁnal b est-ﬁt mo del prop erly de- scrib es the data, and this is demonstrated in Figure 3 . W e hav e also included the data χ 2 distribution if we as- sume that the ﬂux is constant in time and is extracted using the Brandt ( 2024 ) approac h, whic h, as expected, is not able to describ e the data within their uncertain ties (solid orange line). In this case, the time-indep endent mo del requires the data uncertainties b e inﬂated by a factor of 3.91 to recov er the expected agreemen t (orange dashed histogram). W e next test for bias b etw een the ﬂuxes measured us- ing the time-indep enden t and time-v ariable mo dels. W e do this by generating data 1000 times using the same true  b as shown in Figure 2 , but no w across a range of ﬂux lev els that are typical for pixels in an APOGEE exp osure. F or eac h data draw, we extract ﬂuxes using b oth (  f |  b =  1) and (  f |  b =  b best ), where  b best is the same best-ﬁt v alue we ha v e measured previously; that is, we do not re-estimate  b best using the new data dra ws. W e also extract ﬂuxes using the equal-w eight- p er-read approach of Rausc her et al. ( 2007 ), where the Figure 4. Comparison b etw een the time-indep endent (orange lines) versus time-v ariable (blue lines) versus equal-w eight (green lines) models for the extracted ﬂux un- certain ties (top), ﬂux signal-to-noise ratios (second), me- dian biases (third), and residual distribution widths (bot- tom). Blac k lines and shaded regions corresp ond to the ex- p ected v alues and scatter given independent draws from a unit Gaussian. These measuremen ts come from 1000 sim ula- tions of data using the same  b as in Figure 2 , but using a large range of ﬂux lev els. The top panel shows that time-v ariable and time-indep endent mo dels predict extremely similar ﬂux uncertain ties (diﬀerences m uc h smaller than 1%) while the equal-w eight mo del predicts larger uncertainties, esp ecially at the high ﬂux end. The third-from-top panel emphasizes that neglecting to mo del time v ariability leads to a substan- tial ﬂux-dep enden t bias when using the Brandt ( 2024 ) ap- proac h. The b ottom panel sho ws that all the mo dels ha ve residual distributions with the exp ected widths. ﬂuxes come from their Equation 3 and the uncertainties are deﬁned b y their Equation 1. T o b e fair in our com- parisons, we allo w all the extracted ﬂuxes in a realization to b e scaled by a single multiplicativ e factor to hav e the b est agreement b etw een the mo del and the truth. A t eac h ﬂux lev el, we end up with a residual distribution whose median should b e near zero and 1 2 · 68% width should b e near 1 to agree with the unit Gaussian. The result of this comparison is shown in Figure 4 , with the top panel showing the extracted ﬂux uncertainties, the second panel showing the ﬂux signal-to-noise ratios, the third panel showing the median of the uncertaint y-scaled residuals, and the b ottom panel showing the width of the Optimal Fluxes fr om V ariable Illumina tion 11 residual distribution. The top panel sho ws that both the time-v ariable and time-indep endent models (blue and orange lines) predict similar ﬂux uncertainties as a func- tion of true ﬂux, with at maximum ∼ 1% diﬀerences b e- t ween the tw o lines. The equal-weigh t mo del, ho w ev er, predicts larger ﬂux uncertain ties, esp ecially in the high ﬂux limit ( ∼ 8% smaller SNR at a ﬂux of 1000 e − /read). The bottom panel shows that all models measure widths of their residual distributions that are well-described by their uncertainties; all the width lines also agree reason- ably well with the black line and shaded region show- ing the exp ected mean and scatter based on the same n umber of draws from a unit Gaussian. The blue and green lines in the third panel shows that b oth the time- v ariable and equal-weigh t models are able to recov er the exp ected agreemen t with a zero median. How ev er, the orange line shows that neglecting to account for time v ariability in the data leads to a systematic source of scatter in the constan t ﬂux mo del that is also nefari- ously ﬂux-dep endent: in this case, the systematic error is as large as ∼ 0 . 45 of the ﬂux uncertaint y in this case. When we rep eat these tests with diﬀerent realizations of  b – as we will explore in Section 3.3 – w e also ﬁnd that the strength and sign of this bias can change sig- niﬁcan tly . This is a particularly pernicious type of bias b ecause there is no simple wa y to correct for it after the fact. The only w ay to remov e this bias and measure optimal ﬂux uncertain ties is to use the time-dependent mo del. 3.2. Synthetic Observations over Diﬀer ent Flux L evels Next, we create 1000 realizations of data taken from the same conﬁguration as in Section 3.1 – that is, using the same n pixels , n reads , read noise p er pixel, and true  b – to quantify ho w w ell the measured  b and (  f |  b ) v ec- tors and uncertain ties agree with the truth. W e repeat this test across a range of ﬂux lev els: from read-noise- dominated cases to the photon-dominated regime. W e ﬁrst inv estigate our ability to measure  b using data generated from diﬀerent ﬂux lev els. The results of this test are shown in Figure 5 . The top panel sho ws the rela- tionship b etw een the median  b uncertaint y and the input true ﬂux. Evidently , it takes a ﬂux of 30 to 40 e − /read to reach a 10% uncertaint y in  b when using 100 pixels with n reads = 33. W e note that these tests used 100 syn- thetic pixels for all ﬂux cases; it is theoretically p ossible to reach any desired SNR in  b b y increasing the n umber of pixels used in the join t ﬁt, though this would signif- ican tly increase computation time. The middle panel next sho ws the 16-th, 50-th, and 84-th p ercentiles of the χ 2 distribution of the measured  b compared to the truth, with horizontal lines sho wing the exp ectation for Figure 5. Summary of  b measurements from repeated sim- ulation across a range of ﬂux levels using n reads = 33 and n pixels = 100. Each data p oint comes from 1000 iterations of ﬁtting data with the same true  b as shown in Figure 2 . The top panel shows the median uncertaint y in the measured  b from all realizations, with the errorbars sho wing the 68% in- terv al from those realizations. The middle panel shows the 16-th, 50-th, and 84-th percentiles of the χ 2 distribution of measured  b compared to the truth. The b ottom panel sho ws the median of the  b uncertain ty-scaled residual distribution. This ﬁgure rev eals that ﬂux levels > 10 e − /read are able to measure unbiased  b using 100 pixels. those summary statistics. While most ﬂux lev els pro- duce  b distributions that agree with the expectation, a div ergence b egins to appear at low ﬂuxes, suggest- ing that the chosen conﬁguration requires a signal of > 10 e − /read to measure trust worth y  b vectors and un- certain ties. At the highest ﬂux level, we ﬁnd that the  b measurements slightly o verestimate the uncertainties based on the middle panel χ 2 medians falling below the exp ectation. This size of this ov erestimate is ab out 3%, but to o-large uncertainties are preferable to underesti- mated ones for most science applications. In the bottom panel, the medians of the  b z-score distributions are close to the exp ectation of zero for most ﬂux lev els. F or ﬂux lev els ≥ 10 e − /read, the bias is less than one p ercent of the measured  b uncertainties; how ev er, there are signiﬁ- can t  b biases for ﬂux lev els < 10 e − /read. These results further emphasize the previous ﬁnding that jointly ﬁt- ting 100 pixels requires ﬂuxes > 10 e − /read to ha ve enough information to constrain  b . 12 Li et al. Figure 6. Summary of (  f |  b ) measurements from the same sim ulations in Figure 5 . F or each realization, w e correct the observ ed data using the b est-ﬁt  b and then measure the b est ﬁt (  f |  b ) to simulate ho w real data w ould b e pro cessed. The top panel sho ws the 16-th, 50-th, and 84-th percentiles of the χ 2 distribution of measured ﬂuxes compared to the truth. The middle panel shows the median of the ﬂux uncertain t y-s- caled residual distribution, with the orange line showing an equiv alent analysis of time-indep endent data. The b ottom panel estimates the width of the ﬂux z-score distribution. This ﬁgure reveals that unbiased ﬂuxes can be extracted us- ing  b measured from 100 pixels with ﬂux lev els ≥ 30 e − /read. The most imp ortant question for this w ork, how ev er, is whether the ﬂuxes we measure are accurate. T o that end, w e take the best estimate of  b for eac h realization and then use it to correct the data b efore measuring the ﬁnal b est ﬁt ﬂux, (  f |  b best ). W e then compare these ﬂuxes and their uncertainties to the truth, as demon- strated by Figure 6 . The top panel shows that the ﬂux χ 2 distribution agrees with the expectation across most ﬂux ranges. The middle panel sho ws that the extracted ﬂuxes b egin to b ecome more biased than exp ected when the ﬂux level is < 30 e − /read. F or the b ottom tw o pan- els, we hav e included the results from analyzing com- parable time-indep endent data (i.e.,  b =  1) using the Brandt ( 2024 ) constant ﬂux mo del. W e notice that b oth analyses tend to pro duce a slight negative bias in the z-score median, whic h is the result of a natural correla- tion b etw een the extracted ﬂuxes and their uncertain- ties: b oth the extracted ﬂux and its uncertaint y dep end on a previous iteration’s b est guess of the ﬂux, meaning they are fundamen tally correlated suc h that w e should not exp ect p erfect agreement with the unit Gaussian. The b ottom panel sho ws that the width of the ﬂux z- score distribution agrees with exp ectations do wn to all but the low est ﬂux lev els. These (  f |  b ) tests rev eal that the limiting threshold is sligh tly higher than previously seen with the  b results. T o ensure unbiased (  f |  b ) in addition to trustw orthy  b , we require the information conten t pro vided by 100 pixels with ﬂux ≥ 30 e − /read. W e m ust emphasize that the results of this section do not imply that extracting lo w ﬂuxes will alwa ys pro duce biased results; instead, it sa ys that using the  b measured f rom 100 lo w signal pixels can pro duce biased ﬂuxes. In the next subsection, we will inv estigate the true ﬂux bias when we are able to measure a precise  b . 3.3. Synthetic Observations over Diﬀer ent  b Am plitudes Finally , w e quan tify the scale of the ﬂux bias as a func- tion of ﬂux and amplitude in  b . That is, we deﬁne b amp suc h that a given true  b true is created b y n reads − 1 dra ws from a uniform distribution with b ounds from 1 − b amp to 1. In this notation, b amp = 0 corresp onds to the time indep enden t case. W e then dra w 1000 realizations of  b true v ectors for b amp ∈ [0 , 1] in steps of 0.1. W e then follo w the same pro cess as described in Section 3.1 : for eac h  b true realization, we generate 100 pixels worth of data using a ﬂux lev el of 1000 e − /read and then use our linearized mo del to measure a high precision  b best . W e next generate 100 examples of data across a range of ﬂuxes with the same  b true , and then extract ﬂuxes from those data using (  f |  b =  1) and (  f |  b =  b best ). W e then measure the ﬂux uncertain ties and residuals for each ﬂux level for that  b true , giving one realization of the results shown in the top t wo panels of Figure 4 . W e then marginalize o ver the 1000  b true sim ulations for a giv en b amp , yielding the results in Figure 7 . The dif- feren t colored lines corresp ond to diﬀerent b amp lev els, while dotted and solid lines represent the time v ariable and time indep endent mo dels, respectively . Red lines and shaded regions show the exp ected v alues and scat- ters after drawing indep endent measurements from unit Gaussians using the same num ber of realizations as the sim ulated data. The top panel shows that the ﬂux uncertaint y mea- sured by both mo dels increases with b amp . This is b e- cause the n um b er of photons that arriv e at a pixel de- creases for each increase in b amp , thereby reducing the data’s constraining p ow er. There is very little diﬀerence ( < 1%) b etw een the dotted and solid lines in this panel. Optimal Fluxes fr om V ariable Illumina tion 13 Figure 7. Quantiﬁcation of ﬂux uncertaint y and bias as a function of true ﬂux and  b amplitudes from 0 (no change in ﬂux v ersus time) to 1 (p ossible for ﬂuxes to be 100% dimin- ished). Dotted lines corresp ond to ﬂuxes extracted with the time-v ariable mo del, solid lines refer to ﬂuxes extracted from the same data assuming time independence, and the color of eac h line corresponds to the  b amplitude of those sim ulations. The top panel sho ws that the measured ﬂux uncertaint y in- creases with b amp b ecause fewer photons reach each pixel on a verage when  b <  1. W e note that the diﬀerence b etw een the dashed and solid lines is < 1% for all cases. The sec- ond-from-top panel shows the ﬂux signal-to-noise, which is the true ﬂux divided by the measured ﬂux uncertaint y . The third-from-top panel shows that the median of the uncer- tain ty-scaled residual distributions marginalized ov er  b real- izations is broadly consisten t with 0, though there is more scatter than expected (red shaded region) in the time-inde- p enden t results. The bottom panel shows excess scatter in the median of the z-score distributions: ∼ 68% of observ a- tions at a given b amp are exp ected to show a bias with a magnitude equal to or smaller than the corresp onding line. All of the dashed lines are very close to the expected red line, sho wing that the time-v ariable model produces the sto chasti- cally exp ected scatter in ﬂux measuremen ts. The solid lines sho w that ﬁtting a constant ﬂux to time v arying data re- sults in typical biases of ∼ 5% of the ﬂux uncertaint y when b amp = 0 . 3 and up to ∼ 50% when b amp = 1 . 0. The third panel sho ws the median of (  f |  b ) z-score dis- Figure 8. Measurement of the threshold for deciding the optimal model for extracting ﬂuxes. These results are based on the bias magnitudes presen ted in the b ottom panel of Fig- ure 7 . The y-axis is the median magnitude of the bias, and the x-axis is b amp . The blue line shows the median (o v er all ﬂuxes and b amp ) of the measured time-v ariable bias proﬁles, while the corresp onding shaded region shows the 5 and 95th p ercen tiles. The orange scatter p oints for the time-indep en- den t mo del sho w the per- b amp median for eac h of the bottom panel Figure 7 solid line proﬁles. The solid orange line is a simple 4-th order polynomial ﬁt to the orange p oints, which has goo d agreement for b amp ∈ [0 , 1]. The black vertical line giv es the b amp = 0 . 035 transition where the bias from the time-indep enden t mo del has a > 95% probability of b eing larger than the bias from the time-v ariable model. tributions, showing that marginalizing ov er all  b realiza- tions pro duces results that are fairly consistent with the exp ected 0. Here, we see that the time-v ariable mo del has smaller scatter than the time-indep endent mo del, esp ecially as b amp increases. W e do notice that there is potentially a small bias ( ∼ − 0 . 5% of the ﬂux uncer- tain ty) in the median z-score of the time-v ariable re- sults as we mov e to lo w er ﬂuxes. This is likely the same sligh t negativ e bias seen in the orange line in the mid- dle panel of Figure 6 , suggesting that the new model is reac hing the same fundamental limit as exp ected from time-indep enden t data. The b ottom panel of Figure 7 is a key result of this w ork as it gives the exp ected bias as a function of true ﬂux and b amp . The plotted lines are calculated b y taking the standard deviation of the 1000 realizations of the z- score medians for a given b amp , and then the exp ected scatter for this conﬁguration of sim ulations is subtracted oﬀ (red line), resulting in the excess scatter in the z- score bias. All of the dotted lines for the time-v ariable mo del are very near the exp ected red line, suggesting the scatter we see in the time-v ariable results is explained b y sto chastic processes. The time-indep endent results, 14 Li et al. Figure 9. Magnitude of the ﬂux bias at diﬀerent percentile lev els for b amp ∈ [0 . 2 , 0 . 5 , 1 . 0]. The cy an lines in these panels, corresp onding to the 68-th p ercentile, are the same as the curv es displa y ed in the b ottom panel of Figure 7 . The top panel reveals that the bias magnitude can b e as large as 1 . 2 · σ f for b amp = 1 . 0. The b ottom panel suggests that we can b e ∼ 95% conﬁden t that the time-indep endent mo del will pro duce a < 0 . 05 · σ f magnitude bias when b amp ≤ 0 . 2. ho wev er, show that the measured scatter in the z-score bias is ﬂux-dep endent and grows to large v alues as b amp increases. T o provide an illustrativ e example, this plot sa ys that a true ﬂux near 100 e − /read can exp ect that, 68% of the time, the bias will b e as large as 0 . 5 · σ f in magnitude when b amp = 1 . 0. Of course, the precise bias sign and magnitude at each ﬂux dep end on the exact shap e of  b true for that observ ation (e.g., as shown in the third panel of Figure 4 ). W e can also use the results in the b ottom panel of Figure 7 to determine when it is b etter to use the time-v ariable or time-indep endent mo del, and this is demonstrated in Figure 8 . W e hav e zo omed in on the b amp ≤ 0 . 3 cases to highlight the transition b etw een bias magnitude from the diﬀerent mo dels. The solid blue line is calculated by taking a median of the time-v ariable proﬁles across all ﬂux levels and b amp , with the corre- sp onding shaded regions giving the 5 and 95 percentiles. The blue line being slightly larger than 0 is evidence that the time-v ariable mo del incurs a small ( < 0 . 1% of the ﬂux uncertaint y) cost as so ciated with ﬁtting  b and then extracting (  f |  b =  b best ). The orange p oints are calcu- lated by taking medians ov er the ﬂux levels for the solid proﬁles in the b ottom panel of Figure 7 , giving a verage bias magnitude p er b amp . The orange solid line is the result of ﬁtting a simple 4-th order p olynomial to the orange p oints for all b amp , whic h do es a goo d job of de- scribing the measuremen ts for b amp ∈ [0 , 1]. T ests with other p olynomial orders – both higher and lo w er than 4 – did a visually w orse job of describing the orange data p oin ts in the b amp ≤ 0 . 3 region or pro duced unphysical in terp olated v alues (e.g., relatively large negativ e v al- ues in the b amp ≤ 0 . 1 region). W e ﬁnd that the solid orange line b ecomes greater than the top of the blue shaded region at b amp = 0 . 035. This implies that there is a > 95% probabilit y that the time-independent model con tributes more systematic error to the measurements than the time-v ariable mo del when the time-v ariable sig- nal is > 3 . 5%. W e suggest that b amp > 3 . 5% is a use- ful threshold for deciding when the time-v ariable mo del is preferred. Using linear interpolation of the orange p oin ts instead of the 4-th order p olynomial returns an in tersection at b amp = 0 . 028, suggesting a more conser- v ative c hoice would be to prefer the time-v ariable mo del when the time-v ariability is > 2 . 8%. W e explore diﬀerent p ercen tiles of the bias magnitude distributions of the time-dep endent mo del for b amp ∈ [0 . 2 , 0 . 5 , 1 . 0] in Figure 9 . The cy an lines in this ﬁgure are the same as the solid curves plotted in the b ottom panel of Figure 7 . F rom the top panel, we see that the bias magnitudes can b ecome as large as 120% of the ﬂux uncertaint y . A key tak ea wa y comes from the b ot- tom panel, which shows that w e can b e fairly conﬁdent – 95% of the time – that the time-indep endent mo del will pro duce ﬂux-dep endent biases that are smaller than 5% of the ﬂux uncertaint y when b amp ≤ 0 . 2. These met- rics pro vide another p ossible threshold for determining when to transition b etw een the time-v ariable and time- dep enden t mo dels. W e emphasize that the ﬂux levels, read noise, and n umber of reads used in our syn thetic tests are c har- acteristic of real APOGEE observ ations. As a result, Figures 7 to Figures 9 provide useful metrics to assess the impact of ignoring time-v ariabilit y in data. Using indep enden t estimates of the APOGEE ﬂux c hange as a function of time – for example, from a guider cam- era on the telescop e – one could estimate the magnitude of the ﬂux bias to decide whether to extract using the time-indep enden t or time-dependent model. 4. V ALID A TION WITH REAL D A T A W e also v alidate our mo del using real data from the APOGEE sp ectrograph at Apache Poin t Observ atory , using observ ations that w ere part of the Sloan Digital Sky Surv ey V (SDSS-V; Kollmeier et al. 2025 ). Optimal Fluxes fr om V ariable Illumina tion 15 Figure 10. T op: An example 33-read ramp of counts versus read time using real science data from the APOGEE spec- trograph (blue line) compared to the best-ﬁt constant ﬂux (orange line). Bottom: Count diﬀerences revealing time ﬂuctuations in the data that are not well-described by a sin- gle constant ﬂux. The right axis shows the relative count diﬀerences after dividing by the b est-ﬁt constant ﬂux. 4.1. Time V ariable Signals in R e al Scienc e Observations A ﬂux ramp from an example real science pixel is sho wn in the top of Figure 10 , with the best-ﬁt con- stan t ﬂux shown in orange; this is a pixel on the trace of a brigh t star, a w a y from sky emission lines, at a wa ve- length of ∼ 16928 . 1 ˚ A. After taking the diﬀerence b e- t ween subsequent reads, the bottom panel reveals that the ﬂux is v arying b y amounts larger than the uncertain- ties can explain, suggesting that a time-constan t ﬂux is not a go o d mo del of the data. Using our time-v arying mo del, we ﬁt the data from 100 pixels sim ultaneously . These pixels are tak en b y follo wing the maximum of the trace for one relatively brigh t star across a single detector. That is, these pixels corresp ond to the same star and p osition on the sky but co ver a wide range in wa velength ( ∼ 600 ˚ A). A com- parison betw een the resulting b est ﬁt model and count diﬀerences is sho wn in Figure 11 , with the diﬀeren t pan- els highlighting diﬀeren t pixels (i.e., w av elengths). The uncertain ty-scaled residuals in the b ottom panel show that the new mo del do es a goo d job of describing the data. Figure 11. Comparison of the b est-ﬁt mo del with real APOGEE data across many pixels spanning ∼ 600 Angstroms in w a velength. The pixels come from one APOGEE detector (c hip R) and all lie on the trace of the same star with the top panel sho wing the star’s 2D spectrum (red b ounding lines showing the pixels nearest to the trace cen ter). White dots near the b ottom of the top panel show the X p ositions of the 100 pixels used in measuring  b best , and colored arrows p oint to the 5 pixels highligh ted in the follo wing panels of this ﬁgure. The blac k line sho ws the mo del prediction in that pixel, while the colored p oints sho w the coun t diﬀerences in each pixel. The χ 2 comparison b et ween mo del and data is sho wn in the bottom left corner of eac h panel, scaled by the relev an t degrees of freedom. The last panel displays the uncertaint y-scaled residuals, sho wing that the mo del do es a go o d job of describing the data. 16 Li et al. Figure 12. Comparison of the data residual χ 2 distribu- tions for 2048 pixels of the same star used in the analy- sis of Figure 11 . The tw o upp er panels show the data χ 2 p er pixel as a function of pixel index (i.e. wa v elength) and ﬂux, resp ectively , while the bottom panel shows the empir- ical CDFs. The blac k horizontal lines in the upp er panels sho w the 16-th, 50-th, and 84-th p ercentiles of the exp ected χ 2 ν distribution. Small grey p oints along the b ottom edges of the top tw o panels sho w the pixel indices and ﬂuxes for the 100 equally-spaced pixels used in the  b ﬁt. The orange points and distribution comes from using the b est-ﬁt time-constant ﬂux as deﬁned b y the ( Brandt 2024 ) model, while the blue p oin ts and distribution show the results after accounting for time v ariation using the new mo del of this work. The time– v arying mo del almost p erfectly matc hes the expected distri- bution, while the time-independent mo del do es not. These distributions reveal that the time constant mo del requires the data uncertain ties to be inﬂated by a factor of 2.27 to ha ve the exp ected agreemen t, while the time v arying mo del requires multiplication by a factor of 0.99. As with the synthetic data, w e then use the best-ﬁt  b to correct the data b efore measuring a new best-ﬁt ﬂux in each pixel, (  f |  b =  b best ), including pixels that were not used in the  b measuremen t. That is, we use all 2048 pixels along the trace cen ter (all pixels inside of the red Figure 13. T op: Examples of  b vectors (re-scaled b y their median) for diﬀerent APOGEE chips/detectors cov ering dif- feren t w a v elength ranges for the same exp osure and star as in Figures 11 and 12 . Legends give the scaled χ 2 comparison b e- t ween eac h chip’s  b and an inv erse-v ariance weigh ted-a v erage of  b calculated from all three c hips. Bottom: Uncertain ty-s- caled residuals of the  b vectors in the top panel compared to an inv erse-v ariance weigh ted-a verage. Ev en though the data on the diﬀeren t chips were ﬁt indep endently , the results are extremely similar, indicating that our assumption of a w av e- length-indep enden t  b is broadly supp orted by the data. b ounded region in the top panel in Figure 11 ) instead of only the 100 pixels used to measure  b best . W e then com- pare the data χ 2 w e obtain with our new measurements v ersus the constant-ﬂux results, yielding the distribu- tions in Figure 12 . Here, we see that the time-v arying mo del has remo v ed the disagreemen t seen betw een the exp ected distribution and the time-independent mo del. The upp er t w o panels also reveal that the time-v ariable mo del do es not show an y noticeable trends in data χ 2 as a function of ﬂux and w a velength – which is a h uge impro vemen t ov er the time-indep endent mo del – lend- ing credence to our assumption that the ma jority of the temp oral v ariation is ac hromatic. W e further test the w av elength indep endence assump- tion b y analyzing data from tw o additional detectors that were observing the same targets at the exact same times, just at diﬀerent wa v elengths. W e will refer to the diﬀerent chips using “red”, “green”, and “blue” – lab eled R, G, and B – based on their relativ e p ositions in w av elength. W e rep eat our analysis using the trace of the same star, but now using data from the other chips analyzed completely indep endently . The b est-ﬁt  b v ec- tors for each of the three c hips are shown in Figure 13 . Optimal Fluxes fr om V ariable Illumina tion 17 The pixels no w span almost 2000 ˚ A in w av elength, and y et the results from three separate ∼ 600 ˚ A regions agree quite closely with each other. How ev er, we do note that the chips show some statistical diﬀerences when their uncertain ties are accoun ted for, as highlighted b y the reduced χ 2 legend v alues. Some of the diﬀerences seen b et ween the c hips in their residuals compared to their w eighted a v erage (b ottom panel) might be due to diﬀer- en t nonlinearit y responses that ha ve not b een remo ved from the data, v arying persistence prop erties, chromatic diﬀerences in point spread functions, or a high order w av elength-dependence in  b . The APO Sloan T elescop e cov ers a three degree wide ﬁeld of view. When we explore the  b that we measure from ﬁb ers/stellar traces that correspond to p ositions that are nearby on the sky , w e ﬁnd a general agree- men t. W e are still in the early stages of understanding ho w these results change across the fo cal plane. Small, spatially-coheren t diﬀerences ma y b e telling us about the individual atmospheric diﬀerences each ﬁb er sees along its line-of-sight, or they ma y be indicativ e of a v ariation in telescop e-delivered image quality across the fo cal plane. W e are actively exploring how b est to mea- sure and correct the observed data as a function of p osi- tion on the fo cal plane, source brightness, and position on detector. Implemen ting our time-v ariable mo del us- ing an en tire detector’s/nigh t’s w orth of data will be the fo cus of future work. 4.2. Nul l T ests with T elesc op e Calibr ation Data Next, w e test our model on telescop e calibration exp o- sures that we exp ect to b e time-independent. The par- ticular exposure w e use is of quartz lamp ligh t that is re- ﬂected oﬀ the in terior of the telescop e dome. This means the traces are bright, all ﬁb ers are ev enly illuminated, and the measurements should not exhibit atmosphere- related time v ariabilit y . There m a y b e some subtle sources of ﬂux v ariabilit y – such as the lamps heating up ov er the course of use, persistence, or incompletely- remo ved nonlinearit y eﬀects – but we expect these to b e small compared to the c hanging sky seen with the real science data. After repeating the same analysis using 100 pixels across one detector for a single ﬁb er’s trace, w e get the results in Figure 14 . As exp ected, the  b is quite ﬂat. W e ﬁnd that both the time-indep endent and time-dep enden t mo dels do a go o d job of describing the data with no obvious bias as a function of ﬂux or wa v e- length, as demonstrated in Figure 15 . The data χ 2 dis- tributions suggest that the data uncertainties might b e sligh tly o verestimated by a factor of 2 − 3%, pro viding p oten tial evidence that the current read noise estimates are sligh tly too large. Figure 14. Same as Figure 11 , but now for APOGEE quartz lamp calibration data (chip R). The source in this case is inside of the telescope’s dome, so we do not expect to see a signiﬁcan t v ariation in time from the atmosphere. Indeed, w e recov er a  b that is consistent with a time-stable source for data on this chip. W e again compare the  b measured from the diﬀeren t c hips and assess how similar they are to each other as w ell as to the exp ected  b =  1. These results are sho wn in Figure 16 . W e ﬁnd that all the  b elem en ts from the dif- feren t c hips are within a few p ercent of the exp ected one v ector. The chips also all generally show a decrease in  b elemen ts as a function of time, suggesting that the time- 18 Li et al. Figure 15. Same as Figure 12 , but using the calibration data in Figure 14 . As expected, there is little diﬀerence b e- t ween the time-independent and time-dep endent results for the calibration data where time v ariation should be minimal. dep enden t signal we measure in this case is likely due to nonlinearit y eﬀects. W e note that the reduced χ 2 v alues in the upp er panel legend are the results when compar- ing eac h chip’s  b to a weigh te d-a verage  b . As with the previous science observ ations, we see some disagreemen t b et ween the c hips when w e accoun t for the  b uncertain- ties, whic h is most likely a result of diﬀerent nonlinear- it y prop erties. Statistically comparing each c hip’s  b to the exp ected  b =  1, we measure reduced χ 2 v alues of 0.78, 15.8, and 15.9 for the R, G, and B chips resp ec- tiv ely . While c hip R is consistent with a time-stable source, the other c hips are not, p otentially indicating that  b is truly capturing nonlinearity b eha vior and pick- ing up diﬀerences betw een the c hips. These nonlinearit y diﬀerences may also explain the nature of the small dis- parit y b etw een the  b measured from real science data in Figure 13 . In the future when w e hav e access to nonlin- earit y corrections for these data, w e would exp ect to see Figure 16. Same as Figure 13 , but using the calibration data in Figure 14 . The reduced χ 2 v alues presen ted in the legend again compare each chip’s  b to a weigh ted av erage  b , not to the exp ected v ector of ones. When we instead compare to the exp ected  b =  1 case, we measure reduced χ 2 v alues of 0.78, 15.8, and 15.9 for the R, G, and B chips resp ectiv ely . These results suggest the R chip is consistent with the time-independent exp ectation, but the other chips are not. F or b oth the G and B chips (and R to a lesser exten t), we see a decrease in  b ov er time, whic h is what we w ould exp ect from nonlinearit y eﬀects. the chip diﬀerences reduce and for the  b measurements to be more consistent with a v ector of ones. 4.3. Comp arison of Time V ariable Signal with Indep endent Sky Me asur ements Finally , w e compare the  b we measure to indep enden t estimates of the sky’s prop erties as determined by the Sloan T elescop e’s guide camera system. This system consists of six CCDs equipp ed with SDSS r band ﬁlters moun ted around the periphery of the telescop e’s ﬁeld of view ( Pogge et al. 2020 ). All guide cameras are ex- p osed in a con tinuous lo op during survey observ ations. P oint sources detected in guide frames are cross-matched to Gaia DR2 ( Gaia Collab oration et al. 2018 ) ob jects with G < 18 mag. Gaia astrometry is used to deter- mine telescope p ointing corrections during science exp o- sures, but the guide frames additionally trac k changes in ﬂux (transparency) and seeing (FWHM) for each unique source detected. V ariations in transparency are generally asso ciated with changes in cloud cov er, and we exp ect transparency trends measured b y the guide system to also be present in the spectra. In addition to transparency , atmospheric Optimal Fluxes fr om V ariable Illumina tion 19 seeing can v ary on timescales shorter than a t ypical sp ectrograph exp osure. The SDSS-V guider rep orts a Gaussian FWHM for each guide exp osure. Seeing can signiﬁcan tly aﬀect the ﬂux receiv ed at the spectrograph in ﬁber-fed sp ectroscopy . APOGEE ﬁb ers subtend a 3 arcsecond diameter circle on sky at the Sloan F ounda- tion T elescop e, so when seeing is p o or (e.g. > 3 arcsec FWHM), signiﬁcan t amounts of the av ailable photons are landing outside the ﬁb er’s collecting area. Giv en measurements for transparency and seeing, we can roughly mo del the ﬂux-in-ﬁb er change based on n g guide camera images taken during an APOGEE exp o- sure. W e c ho ose a 2D Gaussian model for the telescop e’s PSF during guider exp osure i (with i ∈ [1 , . . . , n g ]) g ( x, y , σ i ) = 1 2 π σ 2 i exp  −  x 2 + y 2 2 σ 2 i  , (27) where the Gaussian width is informed by the guider mea- sured FWHM in arcseconds σ i = FWHM i 2 √ 2 ln 2 . (28) F or each guide image, we analytically integrate the PSF proﬁle ov er the surface of the circular 3 arcsecond di- ameter ﬁb er, and then multiply by the guider measured transparency t i to estimate the ﬂux entering the ﬁb er f i = t i Z 2 π 0 Z 1 . 5 ′′ 0 g ( r cos θ , r sin θ , σ i ) r dr dθ. (29) Finally , we normalize this ﬂux sequence by the maxi- m um computed ﬂux during the APOGEE exp osure f i, norm = f i max { f 1 , f 2 , . . . , f n g } . (30) T elescop e p ointing errors (or guide jitter) can also b e a source of time-v ariable ﬂux entering the ﬁber. In the examples we presen ted next, the guide error is stable and small, and so should not contribute tow ard a time- v ariable ﬂux signal in the sp ectra. W e start b y considering a case where the guider sug- gests that cloud co ver is disp ersing during the exp o- sure while the seeing remains relativ ely constant. In this case, we measure  b for ten of the brightest stars, eac h ﬁt independently , using 100 pixels along eac h star’s trace on a single detector. The results are sho wn in Fig- ure 17 with the colored lines in the b ottom panel sho w- ing the best ﬁt  b measurements, rescaled to ha v e the closest agreement with the guider-estimated ﬂux change. The top panel sho ws the median guider-measured trans- parency change across all sources in all guide chips (red line, left axis) in addition to the seeing FWHM (blue Figure 17. Comparison of  b measured from the ten brigh t- est stars in an APOGEE exposure (colored lines with scatter p oin ts, left axis of b ottom panel) with guider camera mea- suremen ts of transparency (red line, left axis of top panel), seeing (blue line, right axis of top panel), and total estimated c hange in ﬂux (dark blue line, right axis of b ottom panel). While the seeing is relatively stable ov er the APOGEE exp o- sure, the transparency increases from b eginning to end and matc hes the shape of the measured  b vectors, suggesting that a cloud mo ving out of the ﬁeld of view is driving the time– v arying component of the ﬂuxes. Figure 18. Same as Figure 17 but using a diﬀerent exp osure for a case where the seeing changes while the transparency is relatively stable. As the seeing increases from ∼ 1 . 5 arcsec to ∼ 3 . 5 arcsec, less of each star’s ligh t is b eing collected b y the ﬁb er compared to the beginning, driving the change in  b as shown by the the b ottom panel. line, right axis). The right axis of the b ottom panel displa ys the estimated ﬂux c hange after com bining the eﬀects of c hanging transparency and seeing. The agree- men t in shap e b etw een the transparency and the  b from the diﬀerent stars suggests that the c hange in ﬂux as a function of time is largely dominated b y transparency c hanges for this exp osure. 20 Li et al. Figure 19. Same as Figure 17 but using a diﬀerent expo- sure for a case where the transparency and seeing are b oth relativ ely stable. All the  b measurements are quite close to 1.0, agreeing with a largely stable atmosphere. W e then rep eat this analysis for tw o additional APOGEE exp osures. The next example, shown in Fig- ure 18 , is a case where the seeing is changing but the transparency is stable. These results show the expected an ti-correlation betw een the  b measuremen ts and seeing: as the seeing increases, less light is entering the ﬁb er compared to earlier in the exp osure. The ﬁnal example, sho wn in Figure 19 , is where the guider predicts that the atmosphere is stable and the measured  b vectors largely agree. With these real data tests, w e hav e sho wn that the mo del is ﬂexible enough to pro cess images under a v a- riet y of conditions: large versus small time v ariabilit y as well as seeing versus transparency driven c hanges. In addition, the coheren t time-series structure in the data in Figure 11 suggests that there truly is a time-v ariable comp onen t to the science observ ations. W e hav e further explored the origin of the  b signal through comparisons with indep endent-but-sim ultaneous measurements from a guider camera, with results conﬁrming that our mo del is able to capture real atmospheric v ariations. Thus, the time evolution we measure is not merely the result of p o orly-calibrated or misundersto o d data, and is instead ph ysically motiv ated. 5. CONCLUSION This w ork prop oses a new statistical mo del for ex- tracting optimal time-v arying ﬂuxes from NIR detectors (Section 2 ). This approach builds up on and generalizes previous mo dels that were dev elop ed for space-based ob- serv ations where sources are typically constant in ﬂux during an exp osure. Now, ground-based NIR observ a- tions that experience changing atmospheric conditions can measure unbiased ﬂuxes with optimal uncertainties. W e v alidate our mo del b y comparing to a v ariet y of realistic synthetic datasets, sho wing that we can recov er the input truth across a range of ﬂux lev els (Section 3.2 ) and amplitudes of ﬂux v ariabilit y (Section 3.3 ). W e ﬁnd that APOGEE-lik e observ ations should use the time- dep enden t mo del ov er the time-indep endent mo del when the amplitude of ﬂux v ariation is ≥ 3 . 5% to av oid ﬂux- dep enden t biases with magnitudes up to 120% of the ﬂux uncertain t y (Figures 7 to 9 ). W e next sho w that our new mo del p erforms well on real data from the northern APOGEE sp ectrograph (Section 4 ). The time-v ariable mo del is able to describ e data b oth in cases where the ﬂux app ears to b e c hanging (Figure 11 ) as w ell as with time-stable calibration obser- v ations (Figures 14 and 16 ). W e conﬁrm that the bulk of the time v ariability agrees with our mo del’s wa velength indep endence assumption (Figures 12 and 13 ). W e also explore the origin of these ﬂux c hanges through compar- isons with independent guider camera measuremen ts of the atmospheric conditions, ﬁnding that the time evolu- tion w e measure is physically motiv ated by a v ariable sky (Figures 17 - 19 ). F or real APOGEE data, accoun ting for time-v ariability signiﬁcan tly improv es data-mo del com- parisons and pro duces unbiased ﬂuxes. W e plan to incorp orate this mo del into a new APOGEE reduction pipeline that is under active devel- opmen t: ApogeeReduction.jl . 12 Ongoing and future w ork will focus on tac kling a few outstanding questions b efore w e can implement this approach at scale. This includes understanding the b est wa y to measure  b from a large collection of pixels (i.e. 2048 × 2048 pixel 2 de- tectors) that are p otentially seeing diﬀeren t atmospheric conditions, how b est to measure time-v ariability in ﬁb ers with v ery faint pixels, and optimizing the co de to pre- v ent the introduction of new b ottlenecks into the reduc- tion process. A CKNOWLEDGEMENTS The authors thank the anon ymous referee for com- men ts that help ed improv e the clarit y of this pap er. The authors also thank Adam Wheeler for helpful conv ersa- tions that improv ed the bo dy of this w ork. K.A.M. ackno wledges supp ort from the Univ ersit y of T oronto’s Eric and W endy Schmidt AI in Science Post- do ctoral F ellowship, a program of Sc hmidt Sciences. A.K.S. ackno wledges supp ort for this work was pro- vided b y NASA through the NASA Hubble F ellow- ship gran t HST-HF2-51564.001-A a w arded b y the Space 12 The GitHub repository for the pipeline can b e found at h ttps: //github.com/andrew- saydjari/ApogeeReduction.jl Optimal Fluxes fr om V ariable Illumina tion 21 T elescop e Science Institute, whic h is operated b y the Asso ciation of Univ ersities for Research in Astronomy , Inc., for NASA, under contract NAS5-26555. J.G.F-T gratefully ac knowledges the grants supp ort pro vided by ANID F ondecyt Postdoc No. 3230001 (Sp onsoring researcher), the Joint Committee ESO- Go vernmen t of Chile under the agreemen t 2023 ORP 062/2023, and the supp ort of the Do ctoral Program in Artiﬁcial In telligence, DISC-UCN. F unding for the Sloan Digital Sky Survey V has b een provided by the Alfred P . Sloan F oundation, the Heising-Simons F oundation, the National Science F oun- dation, and the Participating Institutions. SDSS ac- kno wledges supp ort and resources from the Center for High-P erformance Computing at the Universit y of Utah. SDSS telescopes are lo cated at Apache P oin t Obser- v atory , funded b y the Astrophysical Research Consor- tium and op erated b y New Mexico State Universit y , and at Las Campanas Observ atory , op erated by the Carnegie Institution for Science. The SDSS w eb site is www.sdss.org . SDSS is managed by the Astroph ysical Researc h Con- sortium for the Participating Institutions of the SDSS Collab oration, including the Carnegie Institution for Science, Chilean National Time Allo cation Commit- tee (CNT AC) ratiﬁed researc hers, Caltec h, the Gotham P articipation Group, Harv ard Universit y , Heidelb erg Univ ersity , The Flatiron Institute, The Johns Hop- kins Universit y , L’Ecole polytechnique f ´ ed ´ erale de Lau- sanne (EPFL), Leibniz-Institut f¨ ur Astroph ysik Pots- dam (AIP), Max-Planc k-Institut f ¨ ur Astronomie (MPIA Heidelb erg), Max-Planck-Institut f ¨ ur Extraterrestrisc he Ph ysik (MPE), Nanjing Universit y , National Astronom- ical Observ atories of China (NAOC), New Mexico State Univ ersity , The Ohio State Universit y , Pennsylv ania State Universit y , Smithsonian Astroph ysical Observ a- tory , Space T elescop e Science Institute (STScI), the Stellar Astrophysics Participation Group, Universidad Nacional Aut´ onoma de M´ exico, Universit y of Arizona, Univ ersity of Colorado Boulder, Universit y of Illinois at Urbana-Champaign, Univ ersit y of T oron to, Univ ersity of Utah, Universit y of Virginia, Y ale Univ ersity , and Y unnan Universit y . F acilities: Sloan (APOGEE-N), Du Pon t (APOGEE-S) Softwar e: ApogeeReduction.jl (in prep.), fitramp ( Brandt 2024 ), jupyter ( Kluyver et al. 2016 ), matplotlib ( Hun ter 2007 ), numpy ( Harris et al. 2020 ), pandas ( Wes McKinney 2010 ; pandas developmen t team 2020 ), scipy ( Virtanen et al. 2020 ) REFERENCES Bo wen, I. S., & V aughan, Jr., A. H. 1973, ApOpt, 12, 1430, doi: 10.1364/AO.12.001430 Brandt, T. D. 2024, P ASP, 136, 045004, doi: 10.1088/1538- 3873/ad38d9 —. 2025, P ASP, 137, 125006, doi: 10.1088/1538- 3873/ae1dab Fixsen, D. J., Oﬀenberg, J. D., Hanisc h, R. J., et al. 2000, P ASP, 112, 1350, doi: 10.1086/316626 Gaia Collab oration, Brown, A. G. A., V allenari, A., et al. 2018, A&A, 616, A1, doi: 10.1051/0004- 6361/201833051 Gunn, J. E., Siegmund, W. A., Mannery, E. J., et al. 2006, AJ, 131, 2332, doi: 10.1086/500975 Harris, C. R., Millman, K. J., v an der W alt, S. J., et al. 2020, Nature, 585, 357, doi: 10.1038/s41586- 020- 2649- 2 Holtzman, J. A., Nidever, D. L., Nguyen, D., et al. 2014, in American Astronomical So ciety Meeting Abstracts, V ol. 223, American Astronomical So ciety Meeting Abstracts #223, 440.04 Hun ter, J. D. 2007, Computing in Science & Engineering, 9, 90, doi: 10.1109/MCSE.2007.55 Kluyv er, T., Ragan-Kelley , B., P ´ erez, F., et al. 2016, in P ositioning and Po wer in Academic Publishing: Play ers, Agen ts and Agendas, ed. F. Loizides & B. Schmidt, IOS Press, 87 – 90 Kollmeier, J. A., Rix, H.-W., Aerts, C., et al. 2025, arXiv e-prin ts, doi: 10.48550/arXiv.2507.06989 Nidev er, D. L., Holtzman, J. A., Allende Prieto, C., et al. 2015, AJ, 150, 173, doi: 10.1088/0004- 6256/150/6/173 Oﬀen b erg, J. D., Fixsen, D. J., Rauscher, B. J., et al. 2001, P ASP, 113, 240, doi: 10.1086/318615 pandas developmen t team, T. 2020, pandas-dev/pandas: P andas, latest, Zeno do, doi: 10.5281/zenodo.3509134 P ogge, R. W., Derw en t, M. A., O’Brien, T. P ., et al. 2020, in Ground-based and Airb orne Instrumentation for Astronom y VI I I, ed. C. J. Ev ans, J. J. Bryan t, & K. Motohara, V ol. 11447, International So ciety for Optics and Photonics (SPIE), 1144781, doi: 10.1117/12.2561113 Rausc her, B. J., F ox, O., F erruit, P ., et al. 2007, P ASP, 119, 768, doi: 10.1086/520887 22 Li et al. Sugai, H., T amura, N., Karo ji, H., et al. 2015, Journal of Astronomical T elescop es, Instruments, and Systems, 1, 035001, doi: 10.1117/1.JA TIS.1.3.035001 Virtanen, P ., Gommers, R., Oliphant, T. E., et al. 2020, Nature Methods, 17, 261, doi: 10.1038/s41592- 019- 0686- 2 Wes McKinney . 2010, in Pro ceedings of the 9th Python in Science Conference, ed. St´ efan v an der Walt & Jarrod Millman, 56 – 61, doi: 10.25080/Ma jora- 92bf1922- 00a Wilson, J. C., Hearty, F. R., Skrutskie, M. F., et al. 2019, P ASP, 131, 055001, doi: 10.1088/1538- 3873/ab0075 Optimal Fluxes fr om V ariable Illumina tion 23 APPENDIX A. GIBBS VERSUS LINEARIZED FITTING Here we compare the results of ﬁtting synthetic data with the Gibbs metho d using the p osterior conditionals in Equation 16 v ersus the linearized mo del deﬁned by Equation 23 . The precise data w e ﬁt are from the high ﬂux case describ e d in Section 3.1 and Figure 2 . F or the linearized model, the p osterior distribution is deﬁned directly by Equation 23 . F or the Gibbs metho d, w e estimate the full p osterior (  f ,  b | data) by drawing samples from the p osterior conditionals (  f |  b, data) and (  b |  f , data) and then taking a mean and co v ariance of the (  f ,  b | data) samples to deﬁne a multiv ariate Gaus- sian that can be easily compared to the linearized re- sults. Figure 20 compares the t wo approaches when fo- cusing on the posterior distributions of the last ﬁve ele- men ts of the  b v ector. This corner plot sho ws that the mean estimates are virtually identical, but that there is some diﬀerence in the width and correlations of the m ultiv ariate posteriors. In this case, the Gibbs metho d (blue) tends to pro duce narro w er distributions than the linearized model (orange). How ev er, as discussed in Sec- tion 2 , w e are typically only concerned with the mean or MAP  b when trying to extract unbiased ﬂuxes; the uncertain ty and correlations of the  b are not propagated to our b est ﬁt (  f |  b ) measurements, as discussed in Sec- tion 2.1 . The mean  b for all read diﬀerences and  f for all synthetic pixels in this analysis are directly compared in Figure 21 . The top panels show that b oth metho ds reco ver almost iden tical measuremen ts, and the b ottom panels sho w that these mean estimates are extremely similar when compared to the size of their uncertain ties. The fact that b oth metho ds agree so closely in their  b mean estimates – that is, typically within 1% of their uncertain ty – shows that neglecting the robust-but-slow Gibbs metho d in fav or of the faster linearized model is lik ely to ha v e a negligible impact on ﬁnal results. When we rep eat the analysis but use  b =  1 and com- pare the extracted ﬂuxes to the time-indep enden t ﬁts, w e ﬁnd that the ﬂux diﬀerences betw een the Gibbs and linearized method are approximately ﬁve times smaller than the diﬀerence b et ween either method and the time- indep enden t result. This is further evidence that the c hoice of Gibbs versus linear is not a substantial source of error. 24 Li et al. Figure 20. Corner plot comparing the multidimensional relationships b etw een the last ﬁv e  b elements obtained from the Gibbs metho d (blue lines) and the linearized model (orange lines). The ellipses show the 68%, 95% and 99% contours deﬁned by the p osterior distribution from each method, while the red lines sho w the input truth used to generate the synthetic data. The titles ab ov e the diagonals display the mean and 68% credible region as deﬁned b y the Gibbs sampling metho d. Both metho ds agree quite w ell with the truth and return almost iden tical mean estimates. The k ey diﬀerence b et ween the approaches comes from the size and shap es of the p osterior distributions, though they still broadly agree with each other. Optimal Fluxes fr om V ariable Illumina tion 25 Figure 21. Comparison of the b est-ﬁt mean  b and  f using both the linearized mo del and Gibbs metho d. Lik e Figure 2 , the left panels fo cus on  b while the right fo cus on  f . The top panels sho w the direct one-to-one comparison of the measurements of the diﬀerent techniques, and the b ottom sho ws the residuals scaled by the quadrature-com bined-uncertaint y . W e see that the z-scores are all v ery close to 0 (t ypically smaller than 1% of a corresp onding uncertaint y unit), sho wing that the diﬀerent tec hniques provide mean estimates that are almost identical to one another.

Optimal and Unbiased Fluxes from Up-the-Ramp Detectors under Variable Illumination

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment