Implicit neural representations for larval zebrafish brain microscopy: a reproducible benchmark on the MapZebrain atlas

Agnieszka Pregowska
Institute of Fundamental Technological Research, Polish Academy of Sciences, Pawinskiego 5B, Warsaw, 02-106, Poland

Abstract

Background: Implicit neural representations (INRs) provide continuous, coordinate-based encodings that are attractive for atlas registration, cross-modality resampling, sparse-view completion, and compact sharing of neuroanatomical datasets. Reproducible evaluation is missing for high-resolution larval zebrafish microscopy, where accurate neuropil delineation and preservation of fine neuronal processes are critical.

New method: We introduce a reproducible INR benchmark tailored to the MapZebrain larval zebrafish brain atlas. Under a unified, seed-controlled protocol, we compare SIREN, Fourier features, Haar positional encoding, and a multi-resolution grid across 950 grayscale microscopy images, including atlas slices and single-neuron projections. Images are normalized using per-image (1, 99) percentiles, using only 10% of pixels from non-held-out columns. Spatial generalization is evaluated on a deterministic 40% column-wise hold-out along the X-axis.

Results: Haar and Fourier encodings achieve the highest macro-averaged reconstruction fidelity on held-out columns (≈ 26 dB), with the grid representation moderately behind and SIREN trailing in macro averages but remaining competitive on area-weighted micro averages in the heterogeneous all-in-one regime. SSIM and an edge-focused error (mean absolute error on Sobel edges) confirm superior boundary preservation for Haar and Fourier.
Comparison with existing methods: Compared with smooth-bias INRs such as SIREN and grid-based schemes prone to axis-aligned artefacts, explicit spectral and multiscale encodings deliver improved high-frequency fidelity and sharper inter-regional boundaries.

Conclusions: For MapZebrain workflows, Haar and Fourier encodings are preferable for boundary-sensitive tasks such as atlas registration, label transfer, and morphology-preserving sharing of single-neuron projections, while lightweight SIREN remains a useful baseline for background modelling or denoising when high-frequency detail is less critical.

Keywords: implicit neural representations, SIREN, Fourier features, multi-resolution grid, MapZebrain, larval zebrafish, brain atlas, neuroimaging

1. Introduction

Implicit Neural Representations (INRs) encode signals as a continuous function of spatial coordinates parameterized by a neural network. This representation supports alias-free resampling, super-resolution at any scale, and compact detail-preserving storage in imaging ([1, 2, 6]). INRs have reshaped graphics and view synthesis through positional encodings and radiance fields ([8]), as well as efficient multi-resolution/grid and tensor encoders ([4, 5, 6]). Despite this progress, there is little systematic evidence comparing classical INR variants on neuroanatomical images, where signals are diffuse and island-like while boundaries remain sharp.

Different formulations of INRs show varying representational properties. Periodic-activation MLPs (SIREN) mitigate spectral bias, allowing direct modeling of high-frequency structure ([1]). Fourier features embed input coordinates in sinusoidal bases, allowing conventional MLPs to capture fine detail efficiently in low-dimensional settings ([2]). A wavelet-style Haar positional encoding offers a simple piecewise-constant alternative.
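To make the two explicit encodings concrete, the sketch below maps 2D coordinates in [0, 1] to Fourier features and to a piecewise-constant Haar-style code. It is an illustration under stated assumptions rather than the benchmark's implementation: the linear frequency spacing, the fixed (non-learnable) frequencies, and the square-wave reading of "Haar positional encoding" are guesses, and the function names `fourier_features` and `haar_features` are hypothetical. Only `bands=48`, `max_freq=24`, `L=8`, and `include_input` come from the paper's hyperparameter table.

```python
import numpy as np

def fourier_features(xy, bands=48, max_freq=24.0):
    """Map (N, 2) coordinates in [0, 1] to sin/cos features.

    Assumption: frequencies are fixed and spaced linearly from 1 to max_freq.
    The paper lists bands=48, max_freq=24 and marks them learnable, but does
    not state the initial spacing.
    """
    freqs = np.linspace(1.0, max_freq, bands)             # (bands,)
    angles = 2.0 * np.pi * xy[:, :, None] * freqs         # (N, 2, bands)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(xy.shape[0], -1)                 # (N, 4 * bands)

def haar_features(xy, levels=8, include_input=True):
    """Piecewise-constant Haar-style features.

    Assumption: at level l the Haar mother wavelet is evaluated inside the
    dyadic cell of width 2**-l that contains the coordinate, giving +1 in the
    first half of the cell and -1 in the second half.
    """
    scales = 2.0 ** np.arange(levels)                     # (levels,)
    frac = np.mod(xy[:, :, None] * scales, 1.0)           # position inside each cell
    feats = np.where(frac < 0.5, 1.0, -1.0).reshape(xy.shape[0], -1)
    if include_input:
        feats = np.concatenate([xy, feats], axis=1)       # prepend raw (x, y)
    return feats

xy = np.array([[0.0, 0.0], [0.25, 0.75], [0.6, 0.1]])
ff = fourier_features(xy)   # smooth, globally supported features
hf = haar_features(xy)      # discontinuous, multiscale features
```

The contrast is the point of the sketch: the Fourier code varies smoothly with position, whereas the Haar-style code jumps at dyadic cell boundaries, which is one way to read the blocking artifacts discussed later in the paper.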
To improve computational efficiency, recent studies have explored semi-explicit encoders that map coordinates to features through multi-resolution tables. Instant Neural Graphics Primitives introduce grid encoding ([4]), Plenoxels replace MLPs with dense voxel grids and spherical harmonics ([5]), and TensoRF factorizes fields into low-rank tensors ([6]). The Local Implicit Image Function further shows that local implicit decoders preserve edges and textures for arbitrary-scale super-resolution ([3]). Although grid- and tensor-based encoders are now standard in scene modeling, their behavior on neuroanatomical textures, characterized by sparse, island-like signals with sharp inter-regional boundaries, remains insufficiently understood, motivating a head-to-head evaluation under a unified protocol ([7]).

In this paper, we present a quantitative evaluation of INRs on MapZebrain. We study four INR variants: SIREN, Fourier features, Haar positional encoding, and a multi-resolution grid (Grid). We evaluate under two regimes: regions (a curated collection of atlas slices) and all-in-one (a recursive crawl that covers atlas regions and single-neuron projections), both within a unified, seed-controlled preprocessing and validation protocol. We report both macro- and pixel-weighted micro-averaged PSNR and analyze qualitative failure modes relevant to neuroanatomical textures.

To position this paper as a practical tool for larval zebrafish imaging pipelines, we introduce a reproducible INR benchmark tailored to the MapZebrain larval zebrafish atlas, with unified preprocessing, a column-wise hold-out, and a seed-controlled train/test protocol that can be reused for future methods.
We also provide a large-scale, head-to-head comparison of four classical INR parameterizations (SIREN, Fourier features, Haar positional encoding, and a multi-resolution grid) across 950 neuroanatomical microscopy images spanning curated atlas slices and single-neuron projections. Moreover, we relate differences in reconstruction fidelity, SSIM, and edge-focused errors to biologically relevant structures, including layered neuropil, sharp region boundaries, and long, thin axonal processes in the larval zebrafish brain. We derive practical recommendations for selecting INR encodings in atlas-centric workflows (registration, cross-modality resampling, sparse-view completion, and storage-efficient sharing of single-neuron projections) and provide code and fixed splits to enable straightforward adoption.

2. Biological motivation

The larval zebrafish brain is small, optically accessible, and stereotyped across individuals, with layered neuropil (e.g. optic tectum), sharply delineated region borders (e.g. pallium-diencephalon interfaces), and long, thin axonal tracts (e.g. commissures). These structures impose competing reconstruction demands: preserving punctate cell-body texture and synaptic-like hotspots, maintaining sharp inter-regional boundaries, and avoiding over-smoothing of submicron-scale neurites.

From a biomedical imaging standpoint, an implicit neural representation that respects these constraints is directly useful for several core downstream tasks. First, atlas registration and region delineation are inherently boundary-sensitive; fidelity at pallium-diencephalon borders and within layered neuropil translates into reduced registration error and improved label transfer. Second, cross-modality resampling, such as aligning anatomy with gene-expression volumes, requires alias-free interpolation at arbitrary spatial coordinates, a natural fit for continuous coordinate-based decoders.
Third, sparse-view completion reflects practical acquisition scenarios (mosaics, partial fields of view), where spatial generalization across contiguous missing spans is critical; our column-wise hold-out explicitly probes this behavior. Finally, storage-efficient sharing of single-neuron projections benefits from compact parametric representations that preserve thin processes, commissural crossings, and local texture without inflating memory footprint.

These biological and imaging characteristics translate into concrete methodological requirements for INR models. An effective INR must preserve region borders and layering to avoid systematic registration errors, maintain thin axonal processes across held-out spans to support tracing and morphology analysis, and provide smooth, alias-free interpolation for cross-modality workflows. Positional encodings therefore become first-order design choices: multiscale or spectral schemes are expected to better preserve sharp boundaries and thin neurites, whereas models with stronger low-frequency bias risk attenuating fine morphology. Our evaluation protocol and metrics are thus chosen to reflect these practical constraints rather than generic image-denoising performance.

3. Methods

3.1. Implicit Neural Representations

Implicit representations encode a signal as a continuous function $f_\theta : \mathbb{R}^d \to \mathbb{R}^c$, implemented by a neural network with parameters $\theta$. Once trained, $f_\theta$ can be queried at arbitrary coordinates, providing resolution-independent access to the underlying signal. INRs offer smooth interpolation, parameter-efficient storage, and differentiability, facilitating gradient-based pipelines ([14, 15, 8, 16, 4]).

3.2. Database

We consider MapZebrain, a multi-modal atlas of the larval zebrafish brain integrating region-level anatomy and single-neuron projections within a common coordinate framework ([9, 10, 11, 12]).
The larval zebrafish offers a small, optically accessible, and stereotyped nervous system, enabling whole-brain imaging at cellular and gene-expression resolution with high throughput. MapZebrain integrates region-level anatomy and single-neuron projections into a common coordinate frame, yielding images with punctate textures and sharp inter-regional boundaries across heterogeneous native resolutions. These properties make MapZebrain a realistic and stringent benchmark for reconstruction, resampling, and compression, against which we test the capabilities and limitations of INR models. Our corpus comprises 950 images in total: 227 curated atlas slices (the regions regime) and 723 files from a recursive, heterogeneous crawl (the all-in-one regime).

3.3. Regimes

We consider two complementary regimes. Regions: a curated set of 227 representative atlas slices (modal size H × W), providing a controlled, repeatable substrate. All-in-one: a recursive crawl of 723 files comprising atlas-region images and single-neuron projections at native resolutions, reflecting real-world variability.

3.4. Preprocessing

All images are converted to grayscale and normalized per image to [0, 1] using the (1, 99) percentiles:

$$I_{\text{norm}} = \operatorname{clip}\!\left(\frac{I - P_1}{P_{99} - P_1 + 10^{-6}},\, 0,\, 1\right).$$

In regions, when sizes differ we centrally crop to the modal size to enable grid-based batching; all-in-one preserves native sizes. Spatial generalization is probed with a column-wise hold-out along X (α = 0.40). In regions, a single deterministic mask (global seed) is reused across images. In all-in-one, masks are deterministically seeded per file to avoid leakage and ensure reproducibility.

3.5. Evaluation

For each image, we train the INR on a random subsample of approximately 10% of the available pixels, restricted to training columns, and then reconstruct the learned field $f_\theta$ over $[0, 1]^2$ sampled at the original image resolution.
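The preprocessing and sampling steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the benchmark code: the function names are hypothetical, the held-out columns are assumed to form one contiguous block with a seeded start position (the paper states only that the split is column-wise, deterministic, and covers 40% of the X-axis), and the 10% subsample is taken relative to the pixels in training columns.

```python
import numpy as np

def normalize_percentile(img, lo=1.0, hi=99.0, eps=1e-6):
    """Per-image (1, 99) percentile normalization to [0, 1]."""
    p_lo, p_hi = np.percentile(img, [lo, hi])
    return np.clip((img - p_lo) / (p_hi - p_lo + eps), 0.0, 1.0)

def column_holdout_mask(width, frac=0.40, seed=0):
    """Boolean mask over columns; True marks a held-out (test) column.

    Assumption: one contiguous block of columns, start drawn from a seeded RNG.
    """
    rng = np.random.default_rng(seed)
    n_test = int(round(frac * width))
    start = int(rng.integers(0, width - n_test + 1))
    mask = np.zeros(width, dtype=bool)
    mask[start:start + n_test] = True
    return mask

def sample_training_pixels(height, test_cols, perc=0.10, seed=0):
    """Draw about perc of the pixels in training columns as (row, col) pairs."""
    rng = np.random.default_rng(seed)
    train_cols = np.flatnonzero(~test_cols)
    rows, cols = np.meshgrid(np.arange(height), train_cols, indexing="ij")
    candidates = np.stack([rows.ravel(), cols.ravel()], axis=1)
    n_train = max(1, int(perc * len(candidates)))
    picked = rng.choice(len(candidates), size=n_train, replace=False)
    return candidates[picked]

# Synthetic stand-in for a microscopy image (not MapZebrain data).
rng = np.random.default_rng(42)
image = rng.gamma(shape=2.0, scale=50.0, size=(64, 100))

norm = normalize_percentile(image)
test_cols = column_holdout_mask(image.shape[1], frac=0.40, seed=123)
train_px = sample_training_pixels(image.shape[0], test_cols, perc=0.10, seed=123)
```

Because the training pixels are drawn only from columns where `test_cols` is False, no held-out pixel can leak into training, which is the property the protocol relies on.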
Evaluation is performed exclusively on a column-wise hold-out along the X-axis (α = 0.4; blocked_cols_X), which probes spatial generalization across contiguous gaps rather than random-pixel denoising. This split is deterministic and fixed across methods to ensure comparability.

Let $I : \{0, \dots, H-1\} \times \{0, \dots, W-1\} \to \mathbb{R}_{\ge 0}$ denote a preprocessed grayscale image normalized to [0, 1], and let $\hat{I}$ be its INR reconstruction. The test set $\Omega_{\text{test}}$ contains all pixels from held-out columns, and the reconstruction error is quantified by the mean squared error

$$\mathrm{MSE}_{\text{test}} = \frac{1}{|\Omega_{\text{test}}|} \sum_{(x,y) \in \Omega_{\text{test}}} \bigl( I(x,y) - \hat{I}(x,y) \bigr)^2. \tag{1}$$

The primary metric is the corresponding peak signal-to-noise ratio (PSNR, dB)

$$\mathrm{PSNR}_{\text{test}} = 20 \log_{10}\!\left( \frac{R}{\sqrt{\mathrm{MSE}_{\text{test}}}} \right), \qquad R = 1.0, \tag{2}$$

where R is the dynamic range after normalization. For each method we report macro-averages (mean ± SD across images) and micro-averages (computed from a single aggregated MSE), the latter weighting larger images more heavily. Robust statistics (median, IQR, 10% trimmed mean) are reported in Appendix Table 11 to reduce the influence of near-constant outliers.

Performance distributions are summarized with box and violin plots. The comparison is based on per-image PSNR differences. Statistical significance is assessed using the paired Wilcoxon signed-rank test. Confidence intervals for mean differences are estimated through nonparametric bootstrap with B = 10,000 resamples (95% percentile). We also report the number of wins per method (highest PSNR_test per image). We emphasize both macro- and micro-averages because image sizes vary widely across the corpus. Extremely high PSNR outliers in regions correspond to near-constant fields after percentile normalization. These are reported explicitly and analyzed separately.
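The difference between macro and micro averaging is easy to see in code. The sketch below implements Eqs. (1) and (2) on held-out columns and then aggregates in both ways. The MSE floor of 1e-10, which caps PSNR at 100 dB, is an assumption inferred from the ≥ 100 dB outliers mentioned in the text, and all function names are hypothetical.

```python
import numpy as np

def mse_on_test(img, rec, test_cols):
    """Eq. (1): mean squared error over all pixels in held-out columns."""
    diff = img[:, test_cols] - rec[:, test_cols]
    return float(np.mean(diff ** 2))

def psnr(mse, data_range=1.0, floor=1e-10):
    """Eq. (2). The floor (an assumption) avoids log(0) and caps PSNR at 100 dB."""
    return float(20.0 * np.log10(data_range / np.sqrt(max(mse, floor))))

def macro_micro_psnr(images, recs, masks):
    """Macro: mean of per-image PSNR. Micro: PSNR of one pooled MSE."""
    per_image, sq_err, n_pix = [], 0.0, 0
    for img, rec, m in zip(images, recs, masks):
        mse = mse_on_test(img, rec, m)
        n = img.shape[0] * int(m.sum())     # number of test pixels in this image
        per_image.append(psnr(mse))
        sq_err += mse * n                   # back to a sum of squared errors
        n_pix += n
    return float(np.mean(per_image)), psnr(sq_err / n_pix), per_image

# Toy corpus: a small constant image reconstructed perfectly and a large
# image reconstructed with a uniform error of 0.1.
small, small_rec = np.zeros((4, 10)), np.zeros((4, 10))
large, large_rec = np.zeros((40, 100)), np.full((40, 100), 0.1)
masks = [np.arange(w) < int(0.4 * w) for w in (10, 100)]

macro, micro, per_image = macro_micro_psnr(
    [small, large], [small_rec, large_rec], masks
)
```

In this toy corpus the small near-constant image pulls the macro average to about 60 dB, while the micro average stays near 20 dB because the large image contributes almost all test pixels. The same mechanism explains why macro means in the regions regime sit far above the area-weighted values.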
We also report effect sizes (e.g., Cliff's δ with bootstrap CIs) in addition to p-values to reflect practical, not only statistical, significance.

Beyond PSNR, we compute SSIM on the held-out columns and an edge-focused error defined as the mean absolute error restricted to edge pixels identified in the ground truth by a Sobel operator. The threshold selects the top 10% of gradient magnitudes per image, computed on the test columns only.

4. Numerical Results

Table 1 summarizes macro and micro PSNR_test across 950 images. Fourier and Haar achieve the highest macro means (both ≈ 26 dB), followed by Grid (≈ 24 dB), whereas SIREN trails (≈ 14 dB). Because a subset of atlas slices becomes near-constant after (1, 99) normalization, macro means substantially exceed medians (typically 12-14 dB). We therefore report robust summaries and analyze outliers separately.

Table 1: Macro- and micro-averaged PSNR_test_mean (dB) across 950 images.

| Model | Macro-Mean | Macro-Std | Micro-Mean | Wins |
|---|---|---|---|---|
| Haar | 26.09 | 27.70 | 12.83 | 354 |
| Fourier | 25.95 | 26.59 | 13.75 | 288 |
| SIREN | 13.90 | 2.28 | 13.41 | 233 |
| Grid | 24.12 | 28.31 | 10.98 | 75 |

Note. Macro means are inflated by a small number of near-constant atlas slices after per-image (1, 99) normalization, occasionally yielding ≥ 100 dB PSNR. For robust statistics (median, IQR, 10% trimmed mean) see Table 11 and the outlier analysis in Table 9.

Area-weighted (micro) PSNR, computed from a single MSE aggregated over all images, down-weights small atlas fields and better reflects performance on large heterogeneous projections (Table 2). On the entire corpus, Fourier and SIREN are close on the micro average (≈ 13.5-13.8 dB), Haar is slightly lower (≈ 12.8 dB), and Grid remains behind (≈ 11 dB). In the all-in-one regime, Fourier, Haar, and SIREN cluster around 11-13 dB, while Grid drops below 10 dB.

Table 2: Area-weighted (micro) PSNR_test (dB) by regime.

| Regime | Haar | Fourier | SIREN | Grid |
|---|---|---|---|---|
| all-in-one | 11.23 | 12.47 | 12.71 | 9.80 |
| regions | 26.47 | 26.57 | 17.68 | 26.45 |

Figures 2-4 summarize per-image distributions (bar, box, and violin plots). These plots present the same statistics as Table 1. Haar and Fourier show nearly identical median PSNR distributions, with large variance caused by image heterogeneity (typical values 10-26 dB). Very high outliers (≥ 100 dB) occur occasionally, mostly in low-variance regions or small images. Grid performs moderately but remains below Haar and Fourier across most samples, while SIREN maintains stable but lower scores, consistent with its known preference for low-frequency structure.

Figure 1: Overall distribution of PSNR_test_mean for the four INR variants (all images). Fourier and Haar have similar medians but a broader upper tail. SIREN is the most concentrated.

Figure 1 shows the same result in a condensed form: all four models have their medians around 12-14 dB, but Fourier and Haar exhibit a much broader upper tail, corresponding to the atlas-like images that are almost constant after percentile normalization. The grid is systematically shifted downwards, and SIREN has the tightest spread, consistent with its smooth, low-frequency bias.

Representative reconstructions for selected MapZebrain atlas regions illustrate clear qualitative differences between the INR parameterizations (Figure 5). Across both projection and column-wise hold-out regimes, Fourier and Haar encodings preserve fine-grained textures, neuron-like puncta, and sharp regional borders, whereas SIREN produces globally smooth reconstructions that suppress small-scale structure. Grid exhibits strong vertical banding and loss of detail, indicating over-smoothing and poor generalization along the held-out axis.
Haar's multiscale basis successfully retains thin processes but occasionally introduces blocking artifacts at region boundaries, consistent with the discontinuities inherent in piecewise wavelet support. Fourier maintains overall coherence and edge continuity, yielding the most visually faithful reconstructions.

Figure 2: Mean PSNR_test_mean (macro) across methods. Haar and Fourier achieve the highest averages.

SIREN, relying purely on learned periodicity, captures low-frequency contrast but fails to recover localized intensity variations, while Grid's fixed positional bins lead to severe blurring. The corresponding MAE profiles (right column) confirm these observations: Fourier and Haar maintain low, structured error across the spatial domain, while SIREN and Grid exhibit higher and more erratic MAE, especially in dense or highly textured regions.

Haar and Fourier consistently reach the highest PSNR_test and lowest MAE, while SIREN performs moderately and Grid remains weakest. Paired Wilcoxon signed-rank tests run separately for the two regimes confirmed that, in the all-in-one set, both Haar and Fourier significantly outperform the grid encoder (p ≪ 10^-6, paired Cliff's δ ≈ 0.7-0.95), and Haar also outperforms SIREN (p < 10^-5). In the regions set, differences between Haar, Fourier, and the grid are no longer significant (p > 0.4), which is consistent with their almost identical distributions dominated by ≥ 100 dB outliers. Across both regimes, SIREN is the only method that never produces such extreme values.

Overall, Haar and Fourier emerge as the most robust INR encodings for MapZebrain data, providing a good balance between fidelity and computational cost, while Grid remains slightly behind in this low-epoch setting.

Table 3 reports macro-averaged PSNR_test_mean and wins per method for both regimes. In all-in-one (723 images), macro PSNR is ≈ 10-13 dB (Fourier 12.93 ± 1.79, SIREN 12.70 ± 0.82, Haar 12.41 ± 2.53, Grid 10.25 ± 2.15), with wins: Haar 291, SIREN 228, Fourier 198, Grid 6. In regions (227 images), explicit encodings reach much higher macro PSNR (Haar 69.68 ± 26.36, Grid 68.29 ± 27.84, Fourier 67.43 ± 26.22, SIREN 17.72 ± 0.64), with wins: Fourier 90, Grid 69, Haar 63, SIREN 5. This split suggests that curated atlas slices favor explicit, multiscale/spectral encodings, while the heterogeneous all-in-one set narrows gaps and increases the relative competitiveness of SIREN on pixel-weighted micro averages.

Figure 3: Box plot of PSNR_test_mean showing broad distributions with occasional outliers.

To complement PSNR averages, we also counted wins per image, i.e. how many test images were best reconstructed by a given INR (highest PSNR_test). In the heterogeneous all-in-one set (723 images), Haar won on 291 images (40.2%), SIREN on 228 (31.5%), Fourier on 198 (27.4%), and the grid on only 6 images (0.8%). In contrast, in the curated regions set (227 images), explicit encodings dominated: Fourier won on 90 images (39.6%), Grid on 69 (30.4%), and Haar on 63 (27.8%), while SIREN was best on only 5 images (2.2%). These counts indicate that on clean, atlas-like slices Fourier/Haar are clearly preferable, whereas on mixed-resolution, neuron-projection data SIREN is substantially more competitive than its macro PSNR alone would suggest.

We note that very high PSNR values (≥ 100 dB) occur only in the regions regime and correspond to nearly constant fields after per-image (1, 99) percentile normalization. Because such cases inflate macro means, we additionally report medians, IQRs, and 10% trimmed means (Appendix, Table 11). These robust statistics confirm the ranking Haar ≈ Fourier > Grid > SIREN.

Figure 4: Violin plot of PSNR_test_mean illustrating per-image variability.
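The robust summaries and win counts used above can be sketched as follows. This is an illustration with made-up numbers, not the paper's data: `robust_summary` and `count_wins` are hypothetical names, the trimmed mean is assumed to drop 10% of samples from each end, and ties in the win count are assumed to go to the first listed method.

```python
import numpy as np

def robust_summary(values, trim=0.10):
    """Mean, median, IQR bounds, and a symmetric trimmed mean."""
    v = np.sort(np.asarray(values, dtype=float))
    k = int(trim * len(v))                      # samples dropped at each end
    trimmed = v[k:len(v) - k] if k > 0 else v
    q1, med, q3 = np.percentile(v, [25, 50, 75])
    return {
        "mean": float(v.mean()),
        "median": float(med),
        "iqr": (float(q1), float(q3)),
        "trimmed_mean": float(trimmed.mean()),
    }

def count_wins(psnr_by_method):
    """Number of images on which each method has the highest PSNR."""
    names = list(psnr_by_method)
    table = np.stack([np.asarray(psnr_by_method[n], dtype=float) for n in names])
    best = table.argmax(axis=0)                 # ties go to the first listed method
    return {n: int((best == i).sum()) for i, n in enumerate(names)}

# Made-up per-image PSNR values with one near-constant outlier at 100 dB.
scores = [10, 11, 12, 12, 13, 13, 14, 14, 15, 100]
summary = robust_summary(scores)

wins = count_wins({
    "Haar":    [12.0, 100.0, 11.0],
    "Fourier": [13.0, 100.0, 10.0],
    "SIREN":   [12.5, 17.0, 12.0],
})
```

On the toy scores the plain mean is 21.4 dB while the median and trimmed mean are both 13 dB, which mirrors how a handful of near-constant slices can dominate a macro average.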
Table 4 reports complementary test-only metrics (SSIM and edge MAE). Fourier attains the highest SSIM alongside the lowest edge error, indicating a favorable trade-off between global structure and boundary sharpness. Grid shows SSIM close to Fourier yet exhibits markedly worse edge fidelity, consistent with smooth reconstructions that raise perceptual similarity but blur interfaces. SIREN has the lowest SSIM and competitive micro PSNR, but degrades on edges relative to Fourier. Haar lies between Fourier and SIREN.

We report test-only SSIM on the held-out columns using the standard 11 × 11 Gaussian window after per-image (1, 99) normalization. Edge MAE is the mean absolute error computed only over edge pixels detected in the ground truth by a Sobel operator, selecting the top 10% of gradient magnitudes within the held-out columns. Fourier attains the best combination of global structure and boundary fidelity (highest SSIM and lowest Edge MAE), which is most relevant for atlas registration. Grid's relatively high SSIM but poor Edge MAE indicates over-smoothing of interfaces, while SIREN remains competitive in area-weighted PSNR but attenuates fine detail.

Figure 5: Representative reconstructions for selected MapZebrain regions using different positional encodings. Fourier and Haar preserve morphological detail and boundaries, SIREN smooths fine structures, and Grid exhibits strong blurring and vertical artifacts. Rightmost panels show column-wise MAE profiles.

Table 3: Per-regime results: macro-averaged PSNR_test_mean (dB) and wins per method.

| Regime | Model | Macro-Mean | Macro-Std | Wins |
|---|---|---|---|---|
| all-in-one | Haar | 12.41 | 2.53 | 291 |
| all-in-one | Fourier | 12.93 | 1.79 | 198 |
| all-in-one | SIREN | 12.70 | 0.82 | 228 |
| all-in-one | Grid | 10.25 | 2.15 | 6 |
| regions | Haar | 69.68 | 26.36 | 63 |
| regions | Fourier | 67.43 | 26.22 | 90 |
| regions | Grid | 68.29 | 27.84 | 69 |
| regions | SIREN | 17.72 | 0.64 | 5 |

Table 4: Complementary test-only metrics on held-out columns (all images). Higher is better for SSIM; lower is better for Edge MAE. Micro PSNR is reported for reference. Macro averages across images.

| Model | SSIM_test (↑) | Edge MAE_test (↓) | PSNR_micro [dB] |
|---|---|---|---|
| Fourier | 0.689 | 0.183 | 13.80 |
| Grid | 0.687 | 0.246 | 10.93 |
| Haar | 0.634 | 0.213 | 12.83 |
| SIREN | 0.443 | 0.208 | 13.43 |

Note. All metrics are computed only on held-out columns; Edge MAE is restricted to ground-truth edge pixels (top 10% Sobel magnitudes per image).

5. Interpretation of results based on biological experiment

On atlas slices, Fourier and Haar best preserve neuropil layering (e.g., stratified optic tectum) and sharp inter-regional borders (e.g., pallium-diencephalon interfaces, rhombomeric boundaries), which is consistent with their higher medians and larger share of per-image wins in the regions regime (Figure 1). In single-neuron projections within the all-in-one set, both encodings maintain slender process continuity across the column-wise held-out gaps and preserve commissural crossing at the midline (Figure 5). By contrast, SIREN's smooth bias attenuates high-spatial-frequency detail, leading to loss of the thinnest axonal and dendritic branches despite competitive area-weighted (micro) PSNR. The grid encoder occasionally introduces vertical banding aligned with the held-out axis, degrading boundary fidelity and commissural continuity. Taken together, these biology-aware observations align with known larval zebrafish morphology (layered neuropil, crisp region boundaries, and long, thin tracts) and explain why explicit spectral/multiscale encodings (Fourier/Haar) are preferable for atlas registration and boundary-sensitive tasks, while SIREN remains a lightweight baseline for background modeling or denoising where fine morphology is less critical.

6. Discussion

Tasks that hinge on boundary fidelity (e.g. region delineation, atlas registration) benefit most from explicit positional bases (Haar/Fourier), which better preserve inter-regional edges. Tracing of sparse single-neuron projections favours encodings that maintain process continuity across missing spans; here Haar/Fourier were superior under the column-wise split. SIREN's smooth bias can still be attractive for background modelling or denoising in volumetric acquisitions where high-frequency detail is less critical.

Our benchmark shows consistent differences between INR parameterizations. Explicit encodings (Fourier, Haar, and the multi-resolution Grid) reach the best reconstruction quality, with Fourier/Haar leading on macro PSNR and Grid slightly behind under the short training budget. SIREN trails on macro but remains competitive on pixel-weighted micro averages in the heterogeneous all-in-one set. These trends align with prior work on frequency bias [2, 1], where explicit positional bases mitigate the low-frequency preference of standard MLPs.

Across both regimes, the relative ordering between parameterizations is stable and mirrors classic frequency-bias results: explicit encodings that inject either dense spectral coverage (Fourier, Grid) or multiscale structure (Haar) consistently outperform purely learned periodicity (SIREN) on fine morphology. The smooth bias characteristic of sinusoidal INRs is illustrated in Figure 6. This supports the view that, for neuroanatomical INRs, the positional basis is a first-order design choice, while additional depth/capacity yields diminishing returns unless it aligns with the image statistics (thin processes, sharp boundaries) ([1]).
The gap between macro- and micro-averaged scores indicates different failure modes: methods that excel in global contrast and edge sharpness (Fourier/Grid) may not maximize local smoothness, whereas SIREN, despite capturing low-frequency trends, tends to oversmooth small-scale textures.

Figure 6: Smooth bias characteristic of sinusoidal INRs (SIREN).

An example reconstruction under the column-wise hold-out is presented in Figure 7. For downstream biomedical tasks, this implies reporting both macro and micro metrics and selecting the parameterization by task: segmentation and region delineation benefit from edge fidelity, while visualization or background modeling may tolerate smooth bias ([17]).

The column-wise hold-out emphasizes interpolation across anisotropic missing spans rather than memorization. Methods with frequency-structured priors (Fourier/Grid) maintain fidelity under this split, suggesting improved out-of-grid generalization, which is relevant for mosaics, partial acquisitions, and sparsely sampled volumes commonly encountered in microscopy pipelines ([2]).

Haar's blocking artifacts at region boundaries point to a basis-data mismatch at block transitions ([2]). Haar positional encoding preserves sharp edges but introduces block boundary artifacts (Figure 8). Haar achieves high fidelity (≈ 14.6 dB PSNR) and retains fine processes, but introduces soft boundary seams between neighboring blocks, consistent with the piecewise formulation of Haar encodings. Fourier yields smoother transitions but blurs thin structures, while SIREN and Grid again show excessive smoothing. The MAE profile reveals localized oscillations aligned with block boundaries. These artifacts can be mitigated by overlapping windows or soft wavelet variants, which preserve edge faithfulness while improving continuity. Soft multiscale variants, such as overcomplete or learned wavelet dictionaries, could mitigate this while retaining edge faithfulness ([4]).

Figure 7: Example reconstruction under column-wise holdout regime illustrating directional overfitting.

Figure 8: Haar positional encoding preserves sharp edges but introduces block boundary artifacts.

For boundary-sensitive downstream tasks (atlas registration, label transfer), prefer Fourier or Haar. For background modeling or denoising under tight compute, SIREN is a viable baseline. Grid encoders may require longer training and tuning (levels, features per level) to reach their potential on microscopy textures.

7. Practical implications for atlas workflows

For readers using larval zebrafish atlases in practice, our results suggest simple guidelines. First, for boundary-sensitive tasks such as registration, label transfer, and region-of-interest analysis on MapZebrain-like data, Fourier or Haar positional encodings should be preferred, as they consistently preserve inter-regional borders and layered neuropil while maintaining high edge fidelity. Second, for sharing single-neuron projections in a storage-efficient way, both Fourier and Haar achieve a favourable trade-off between reconstruction quality and model size, retaining thin commissural processes and local punctate texture. Third, SIREN provides a useful baseline when computational resources are limited or when the primary goal is background modelling or smoothing, but it should be used with caution when fine morphology is critical. Finally, our column-wise hold-out protocol can be adopted as a routine sanity check for assessing whether new INR variants genuinely generalize across contiguous gaps typical of mosaics and sparse-view acquisitions.

8. Limitations

Our study applies a short training budget for both regimes, namely 10% pixel sampling and 10 epochs.
Longer optimization or hyperparameter tuning could change the relative ranking, particularly for Grid encoders. We train models independently for each image, without using cross-image priors or transfer. Consequently, cross-specimen generalization is not evaluated. Reported PSNR distributions are broad, reflecting heterogeneous native resolutions and sparsity. Very high outliers, equal to or higher than 100 dB, typically occur for nearly constant or small fields of view.

9. Conclusions

We introduced a two-regime, reproducible INR benchmark on MapZebrain spanning region-level atlas slices and whole-projection images under a unified protocol. Haar and Fourier positional encodings consistently achieved the best reconstruction fidelity (macro-averaged PSNR test_mean ≈ 26 dB), with Grid performing moderately and SIREN remaining competitive mainly on micro-averaged metrics. Across both regimes, multiscale (Haar) and dense-spectral (Fourier) priors preserved thin processes and sharp boundaries substantially better than architectures relying on learned smoothness (SIREN, Grid), indicating that the inductive bias of the positional basis outweighs marginal depth/capacity gains. The systematic gap between macro- and micro-averaged scores shows that methods trade off global boundary integrity against local smoothness, underscoring the value of reporting both. Our column-wise hold-out emphasizes interpolation rather than memorization, suggesting that frequency-structured priors generalize more reliably to out-of-grid coordinates typical of partially sampled biomedical imagery.

Table 5: Key hyperparameters

Setting                  Value
Intensity normalization  per-image percentiles (1, 99)
Hold-out split           blocked_cols_X, $\alpha = 0.40$
Train subsampling        10% pixels from training columns
Epochs / Optimizer       10 / AdamW ($\mathrm{lr} = 10^{-3}$, $\mathrm{wd} = 10^{-6}$)
Loss                     Smooth-$\ell_1$ ($\beta = 0.01$)
Batch size               $\min(131{,}072,\ \max(65{,}536,\ \lfloor HW \cdot 0.10 / 2 \rfloor))$
Dataloading              num_workers=0, pin_memory=True
Device                   CPU (CUDA only if available; not used here)
SIREN                    width=256, depth=6, $w_0^{\mathrm{first}} = 36$, $w_0^{\mathrm{hidden}} = 1$
Fourier                  bands=48, max_freq=24, learnable, + MLP(192, 4)
Haar                     L=8, include_input, + MLP(192, 4)
Grid                     L=8, feats=2, + MLP(192, 4)

10. Reproducibility

All experiments ran deterministically on an Intel(R) Core(TM) i7-14700F (2.10 GHz), Windows 10 (x86_64), Python 3.10.13; PyTorch 2.1.2+cpu, torchvision 0.16.2+cpu, torchaudio 2.1.2+cpu; NumPy 1.26.4 (OpenBLAS), SciPy 1.11.4, pandas 2.1.4, Matplotlib 3.8.2, scikit-image 0.22.0, PyYAML 6.0.1, tqdm 4.66.2. Global seeding uses numpy.random.default_rng(SEED_GLOBAL); per-file test masks are derived deterministically from the file path (stable hash) combined with SEED_GLOBAL. Train/test splits and pixel subsampling are fixed across methods.

All models receive coordinates $(x, y) \in [0, 1]^2$ and predict a single grayscale intensity. Each image is converted to grayscale and independently normalized by robust percentiles (1, 99):

$$I_{\mathrm{norm}} = \mathrm{clip}\!\left(\frac{I - P_1}{P_{99} - P_1 + 10^{-6}},\ 0,\ 1\right).$$

We use a column-wise hold-out along the X-axis with a fixed test fraction $\alpha = 0.40$ (HOLDOUT_FRAC_X=0.40), and randomly subsample 10% of pixels only from training columns (TRAIN_SAMPLE_PERC=0.10). No resizing is applied in the regions regime (RESIZE_TO=None). Optimization uses AdamW with a fixed learning rate $10^{-3}$ and weight decay $10^{-6}$ for 10 epochs (EPOCHS=10). The loss is Smooth-$\ell_1$ with $\beta = 0.01$. The batch size is computed adaptively as

$$\mathrm{bs} = \min\bigl(\mathrm{BATCH\_CAP} = 131{,}072,\ \max(65{,}536,\ \lfloor (H \cdot W \cdot 0.10) / 2 \rfloor)\bigr).$$

Data loaders use num_workers=0 and pin_memory=True. Training runs on CPU (CUDA is used only if available; in our experiments CUDA was not used). Key parameters are shown in Table 5.
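The preprocessing steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the benchmark's actual code: the helper names, the value of SEED_GLOBAL, the use of SHA-256 as the "stable hash", and the choice of a single contiguous block of held-out columns at a seeded offset are all hypothetical. Following the abstract, the percentiles are estimated from the subsampled training pixels only.

```python
import hashlib
import math

import numpy as np

SEED_GLOBAL = 1234          # hypothetical value; the paper does not state it
HOLDOUT_FRAC_X = 0.40       # test fraction alpha (from the paper)
TRAIN_SAMPLE_PERC = 0.10    # share of training pixels used (from the paper)
BATCH_CAP = 131_072
BATCH_FLOOR = 65_536


def file_rng(path, seed=SEED_GLOBAL):
    """Per-file generator from a stable hash of the path plus the global seed.

    SHA-256 is an assumption; the paper only says "stable hash".
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    file_seed = int.from_bytes(digest[:8], "little")
    return np.random.default_rng([seed, file_seed])


def holdout_columns(width, rng, frac=HOLDOUT_FRAC_X):
    """Boolean mask over columns; True marks held-out (test) columns.

    Assumes one contiguous block whose start offset is drawn from rng.
    """
    n_test = int(round(frac * width))
    start = int(rng.integers(0, width - n_test + 1))
    mask = np.zeros(width, dtype=bool)
    mask[start:start + n_test] = True
    return mask


def sample_train_pixels(height, test_cols, rng, perc=TRAIN_SAMPLE_PERC):
    """Draw a random subset of pixel indices from training columns only."""
    train_cols = np.flatnonzero(~test_cols)
    n_train = height * train_cols.size
    n_keep = max(1, int(perc * n_train))
    flat = rng.choice(n_train, size=n_keep, replace=False)
    rows = flat // train_cols.size
    cols = train_cols[flat % train_cols.size]
    return rows, cols


def normalize(img, rows, cols):
    """Clip-normalize with (1, 99) percentiles fitted on the sampled pixels."""
    p1, p99 = np.percentile(img[rows, cols], [1, 99])
    return np.clip((img - p1) / (p99 - p1 + 1e-6), 0.0, 1.0)


def batch_size(height, width):
    """bs = min(BATCH_CAP, max(65536, floor(H * W * 0.10 / 2)))."""
    half_sample = math.floor(height * width * TRAIN_SAMPLE_PERC / 2)
    return min(BATCH_CAP, max(BATCH_FLOOR, half_sample))
```

Calling file_rng twice with the same path yields the same mask and the same pixel subset, which is what keeps splits identical across INR variants. Note that the batch-size rule always returns a value between 65,536 and 131,072, so for small images a single batch can exceed the number of sampled pixels.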
Train/test splits and random subsampling are fully deterministic (numpy.random.default_rng(seed)) and identical across INR variants.

Appendix

Detailed summary statistics

Table 6 reports descriptive statistics of per-image test PSNR (macro view) across all 950 images. Table 7 breaks these down by regime.

Pairwise significance (macro PSNR)

We report paired Wilcoxon signed-rank tests (two-sided) on per-image PSNR with Holm-Bonferroni correction in the main paper, see Table 8. Here we additionally provide Cliff's δ effect sizes with bootstrap 95% CIs (400 resamples) and the number of paired images used.

Table 6: Per-method descriptive statistics of PSNR test_mean (dB) across all images (N=950): mean ± SD, median, and IQR.

Model    Mean ± SD      Median  IQR             N
Fourier  25.95 ± 26.59  13.16   [11.87, 18.05]  950
Grid     24.12 ± 28.31   9.99   [9.36, 16.43]   950
Haar     26.09 ± 27.70  14.10   [10.31, 16.92]  950
SIREN    13.90 ± 2.28   13.00   [12.21, 15.18]  950

Table 7: Per-method descriptive statistics by regime (macro view): mean ± SD, median, and IQR.

Regime      Model    Mean ± SD      Median  IQR              N
all-in-one  Fourier  12.93 ± 1.79   12.47   [11.73, 13.69]   723
all-in-one  Grid     10.25 ± 2.15    9.80   [9.06, 10.32]    723
all-in-one  Haar     12.41 ± 2.53   12.96   [10.18, 14.52]   723
all-in-one  SIREN    12.70 ± 0.82   12.54   [12.08, 13.23]   723
regions     Fourier  67.43 ± 26.22  55.81   [45.77, 100.00]  227
regions     Grid     68.29 ± 27.84  54.89   [44.36, 100.00]  227
regions     Haar     69.68 ± 26.36  64.10   [46.50, 100.00]  227
regions     SIREN    17.72 ± 0.64   17.74   [17.38, 18.17]   227

Outlier analysis (regions, ≥ 100 dB)

High PSNR outliers occur only in the regions set and correspond to near-constant fields after per-image percentile normalization, see Table 9.

Wins per regime

Counts and shares (percentage within regime) of images won by each method, see Table 10 and Figures 9 and 10.
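As an illustration of how such win counts and within-regime shares can be tabulated, the sketch below takes a per-image PSNR matrix and a regime label per image. It assumes that the winner of an image is the method with the highest test PSNR and that ties go to the first listed method; the paper does not state its tie rule, and the function name and toy values are hypothetical.

```python
import numpy as np

METHODS = ["Fourier", "Grid", "Haar", "SIREN"]


def wins_per_regime(psnr, regimes, methods=METHODS):
    """Count per-image winners and their share (%) within each regime.

    psnr:    array of shape (n_images, n_methods), test PSNR in dB
    regimes: array of shape (n_images,), one regime label per image
    """
    psnr = np.asarray(psnr, dtype=float)
    regimes = np.asarray(regimes)
    # Index of the best method per image; argmax keeps the first on ties.
    winners = np.argmax(psnr, axis=1)
    table = {}
    for regime in np.unique(regimes):
        in_regime = regimes == regime
        n_images = int(in_regime.sum())
        counts = np.bincount(winners[in_regime], minlength=len(methods))
        table[str(regime)] = {
            name: (int(count), round(100.0 * int(count) / n_images, 1))
            for name, count in zip(methods, counts)
        }
    return table


# Toy input with made-up PSNR values, for illustration only.
toy_psnr = [
    [12.0, 10.0, 13.0, 12.5],
    [13.0, 9.0, 12.0, 12.0],
    [50.0, 60.0, 55.0, 17.0],
    [70.0, 65.0, 72.0, 18.0],
]
toy_regimes = ["all-in-one", "all-in-one", "regions", "regions"]
toy_table = wins_per_regime(toy_psnr, toy_regimes)
```

Because shares are computed within each regime, they sum to 100% per regime up to rounding, which gives a quick consistency check on any reported table.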
Robust reporting

In addition to means ± SD, we report median, IQR, and 10% trimmed means to reduce the influence of the near-constant outliers (Table 9); see Table 11.

Table 8: Pairwise comparisons: macro PSNR.

Comparison         p-value (Wilcoxon)  Cliff's δ  95% CI            Paired N
Haar vs. SIREN     2.617e-14            0.051     [-0.008, 0.103]   950
Fourier vs. SIREN  2.940e-34            0.029     [-0.020, 0.079]   950
Haar vs. Grid      1.692e-49            0.308     [0.254, 0.359]    950
Fourier vs. Grid   5.293e-67            0.435     [0.387, 0.487]    950
Haar vs. Fourier   1.432e-08           -0.053     [-0.107, 0.001]   950

Table 9: Outliers (PSNR ≥ 100 dB) in regions.

Model    Outliers  Total (regions)  Share (%)
Fourier  73        227              32.2
Haar     67        227              29.5
Grid     94        227              41.4
SIREN    0         227              0.0

Table 10: Wins per regime with shares.

Regime      Best method  Wins  N    Share (%)
all-in-one  Fourier      198   723  27.4
all-in-one  Grid         6     723  0.8
all-in-one  Haar         291   723  40.2
all-in-one  SIREN        228   723  31.5
regions     Fourier      90    227  39.6
regions     Grid         69    227  22.5
regions     Haar         63    227  35.7
regions     SIREN        5     227  2.2

Table 11: Robust PSNR statistics across all images.

Model    Median  IQR             Trimmed Mean (10%)
Haar     14.10   [10.31, 16.92]  16.84
Fourier  13.16   [11.87, 18.05]  16.27
Grid      9.99   [9.36, 16.43]   14.22
SIREN    13.00   [12.21, 15.18]  13.62

Figure 9: Number of per-image wins in the all-in-one regime. Haar dominates, SIREN remains competitive, the grid almost never wins.

Figure 10: Number of per-image wins in the regions regime. Explicit encodings (Fourier, Haar, Grid) dominate over SIREN.

Ethics Statement

This study performs secondary analysis of publicly available microscopy data (MapZebrain) and involves no new experiments on animals or humans. No identifiable human data are present; therefore institutional ethics approval and consent were not required.

Conflict of interest

The author declares no competing interests.

Data availability

All raw data are available at [12]. Preprocessed splits, masks, and scripts to reproduce the experiments will be made available on request.

Funding

This research received no external funding.

Author contribution

AP: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Visualization, Writing - original draft, Writing - review & editing.

References

[1] Sitzmann V., Martel J., Bergman A., Lindell D., Wetzstein G. 2020. Implicit Neural Representations with Periodic Activation Functions, in: Advances in Neural Information Processing Systems, Larochelle H., Ranzato M., Hadsell R., Balcan M.F., Lin H. (Eds.), Curran Associates, Inc., vol. 33, pp. 7462–7473.

[2] Tancik M., Srinivasan P.P., Mildenhall B., Fridovich-Keil S., Raghavan N., Singhal U., Ramamoorthi R., Barron J.T., Ng R. 2020. Fourier features let networks learn high frequency functions in low dimensional domains, in: Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS '20). Curran Associates Inc., Red Hook, NY, USA, Article 632, pp. 7537–7547.

[3] Chen Y., Liu S., Wang X. 2021. Learning Continuous Image Representation with Local Implicit Image Function, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, pp. 8624–8634. https://doi.org/10.1109/CVPR46437.2021.00852.

[4] Müller T., Evans A., Schied C., Keller A. 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph. 41, 4, 102. https://doi.org/10.1145/3528223.3530127.

[5] Fridovich-Keil S., Yu A., Tancik M., Chen Q., Recht B., Kanazawa A. 2022. Plenoxels: Radiance Fields without Neural Networks, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5501–5510.

[6] Chen A., Xu Z., Geiger A., Yu J., Su H. 2022. TensoRF: Tensorial Radiance Fields, in: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXII. Springer-Verlag, Berlin, Heidelberg, pp. 333–350. https://doi.org/10.1007/978-3-031-19824-3_20.

[7] Xie Y., Takikawa T., Saito S., Litany O., Yan S., Khan N., Tombari F., Tompkin J., Sitzmann V., Sridhar S. 2022. Neural Fields in Visual Computing and Beyond. Computer Graphics Forum 41, 641–676. https://doi.org/10.1111/cgf.14505.

[8] Mildenhall B., Srinivasan P.P., Tancik M., Barron J.T., Ramamoorthi R., Ng R. 2021. NeRF: representing scenes as neural radiance fields for view synthesis, Commun. ACM 65, 1, 99–106. https://doi.org/10.1145/3503250.

[9] Randlett O., Wee C., Naumann E. et al. 2015. Whole-brain activity mapping onto a zebrafish brain atlas. Nat Methods 12, 1039–1046. https://doi.org/10.1038/nmeth.3581.

[10] Kunst M., Laurell E., Mokayes N., Kramer A., Kubo F., Fernandes A.M., Förster D., Dal Maschio M., Baier H. 2019. A Cellular-Resolution Atlas of the Larval Zebrafish Brain. Neuron 103(1), 21–38.e5. https://doi.org/10.1016/j.neuron.2019.04.034.

[11] Shainer I., Kuehn E., Laurell E., Al Kassar M., Mokayes N., Sherman S., Larsch J., Kunst M., Baier H. 2023. A single-cell resolution gene expression atlas of the larval zebrafish brain. Sci Adv. 9(8), eade9909. https://doi.org/10.1126/sciadv.ade9909.

[12] MapZebrain: A multimodal atlas of the larval zebrafish brain. [Online]. Available: https://mapzebrain.org/. Accessed: Oct. 7, 2025.

[13] Schwarz K., Sauer A., Niemeyer M., Liao Y., Geiger A. 2022. VoxGRAF: fast 3D-aware image synthesis with sparse voxel grids, in: Proceedings of the 36th International Conference on Neural Information Processing Systems (NIPS '22), Curran Associates Inc., Red Hook, NY, USA, Article 2464, pp. 33999–34011.

[14] Park J.J., Florence P., Straub J., Newcombe R., Lovegrove S. 2019. DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 165–174. https://doi.org/10.1109/CVPR.2019.00025.

[15] Mescheder L., Oechsle M., Niemeyer M., Nowozin S., Geiger A. 2019. Occupancy Networks: Learning 3D Reconstruction in Function Space, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 4455–4465. https://doi.org/10.1109/CVPR.2019.00459.

[16] Strümpler Y., Postels J., Yang R., Van Gool L., Tombari F. 2022. Implicit Neural Representations for Image Compression, in: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXVI. Springer-Verlag, Berlin, Heidelberg, pp. 74–91. https://doi.org/10.1007/978-3-031-19809-0_5.

[17] Rahaman N., Baratin A., Arpit D., Dräxler F., Lin M., Hamprecht F.A., Bengio Y., Courville A.C. 2018. On the Spectral Bias of Neural Networks. International Conference on Machine Learning.

[18] Chen W., Jaques N., Taylor S., Sano A., Fedor S., Picard R.W. 2015. Wavelet-based motion artifact removal for electrodermal activity. Annu Int Conf IEEE Eng Med Biol Soc. 2015, 6223–6. https://doi.org/10.1109/EMBC.2015.7319814.
