Physics-Aware Neural Operators for Direct Inversion in 3D Photoacoustic Tomography
Learning physics-constrained inverse operators, rather than post-processing physics-based reconstructions, is a broadly applicable strategy for problems with expensive forward models. We demonstrate this principle in three-dimensional photoacoustic computed tomography (3D PACT), where current systems demand dense transducer arrays and prolonged scans, restricting clinical translation. We introduce PANO (PACT imaging neural operator), an end-to-end physics-aware neural operator (a deep learning architecture that generalizes across input sampling densities without retraining) that directly learns the inverse mapping from raw sensor measurements to a 3D volumetric image. Unlike two-step methods that reconstruct then denoise, PANO performs direct inversion in a single pass, jointly embedding physics and data priors. It employs spherical discrete-continuous convolutions to respect hemispherical sensor geometry and Helmholtz equation constraints to ensure physical consistency. PANO reconstructs high-quality images from both simulated and real data across diverse sparse acquisition settings, achieves real-time inference, and outperforms the widely used universal back-projection (UBP) algorithm by approximately 33 percentage points in cosine similarity on simulated data and 14 percentage points on real phantom data. These results establish a pathway toward more accessible 3D PACT systems for preclinical research and motivate future in-vivo validation for clinical translation.
Research Summary
Photoacoustic computed tomography (PACT) combines optical contrast with ultrasonic resolution, making it a powerful imaging modality for deep tissue visualization. However, three‑dimensional (3D) PACT systems typically require dense hemispherical transducer arrays and long acquisition times, limiting their clinical adoption. Conventional pipelines either rely on physics‑based reconstruction such as the universal back‑projection (UBP) algorithm or adopt a two‑step approach where a physics solver first reconstructs a coarse image that is subsequently denoised by a deep neural network. The former is computationally intensive and sensitive to noise, while the latter inherits the limitations of the initial physics‑based estimate.
The authors propose PANO (PACT imaging Neural Operator), an end‑to‑end, physics‑aware neural operator that directly maps raw radio‑frequency (RF) sensor data Ψ to a 3‑D initial‑pressure distribution P̂ without intermediate reconstruction steps. PANO treats the inverse problem as a mapping between function spaces, leveraging the recent neural operator paradigm. Its architecture consists of three main components:
- Spherical DIScrete‑Continuous Convolution (DISCO) blocks: These operate on each frequency slice Ψ_k in the spherical coordinate system (θ, φ) of the hemispherical array. By mimicking spherical convolution, DISCO is agnostic to the exact sampling pattern of the transducers, enabling the model to handle uniform subsampling, clustered sampling, or limited‑angle acquisitions without retraining.
- Fourier Neural Operator (FNO): After DISCO extracts local features for all frequencies, the features are concatenated and fed to an FNO. The FNO aggregates global, cross‑frequency information, performs a coordinate transformation from spherical to Cartesian space, and learns the spectral representation of the forward operator (the Helmholtz equation). This component captures the physics of wave propagation across the entire frequency band.
- 3‑D U‑Net refinement: A lightweight 3‑D U‑Net processes the globally‑aware feature maps to recover fine spatial details and produce the final volumetric image.
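The spectral step at the heart of an FNO can be illustrated with a minimal one-dimensional sketch. This is not PANO's implementation: the function name `spectral_conv_1d`, the number of retained modes, and the random weights are all assumptions made for illustration. The sketch transforms a signal to Fourier space, reweights only the lowest modes, and transforms back. Because the weights act on Fourier coefficients rather than on grid points, the same weights can be applied to inputs sampled at different resolutions.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Fourier-layer style convolution on a real, periodic 1-D signal.

    u       : real samples on a uniform periodic grid, shape (n,)
    weights : complex multipliers for the lowest len(weights) Fourier modes
              (hypothetical stand-in for learned parameters)
    """
    n = u.shape[0]
    modes = weights.shape[0]
    if modes > n // 2:
        raise ValueError("grid too coarse for the requested number of modes")
    # norm="forward" divides by n, so coefficients do not depend on grid size
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[:modes] = weights * u_hat[:modes]   # keep and reweight low modes only
    return np.fft.irfft(out_hat, n=n, norm="forward")

def sample(n):
    """Band-limited test signal sampled at n uniform points on [0, 1)."""
    x = np.arange(n) / n
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)

rng = np.random.default_rng(0)
weights = rng.normal(size=8) + 1j * rng.normal(size=8)  # illustrative, not trained

y_coarse = spectral_conv_1d(sample(64), weights)    # 64-point grid
y_fine = spectral_conv_1d(sample(256), weights)     # 256-point grid, same weights

# The fine output, read at the coarse grid points, matches the coarse output.
print(np.allclose(y_fine[::4], y_coarse))
```

A full FNO stacks several such layers with pointwise linear maps and nonlinearities, and works in more dimensions; the sketch only shows why the parameterization is independent of the sampling grid.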
A crucial innovation is the physics‑aware loss. The reconstructed volume P̂ is re‑projected through the forward acoustic operator A (based on the Helmholtz equation) to synthesize a predicted sensor signal Ψ̂. The loss combines a data fidelity term (e.g., L2 between Ψ̂ and the original Ψ) with standard image‑space losses. This cycle‑consistency constraint forces the network to produce physically plausible reconstructions, especially under noisy or highly undersampled conditions.
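The structure of such a loss can be sketched as follows. Everything here is an assumption for illustration: the forward operator is a plain free-space Helmholtz Green's function sum at a single wavenumber `k`, without the frequency-dependent prefactors or transducer response a real system model would include, and the names `forward_operator`, `physics_aware_loss`, and the weight `lam` are hypothetical.

```python
import numpy as np

def forward_operator(p0, voxels, sensors, k, voxel_volume=1.0):
    """Toy forward model: psi[s] = sum_j G_k(|r_s - r_j|) * p0[j] * dV,
    with G_k(d) = exp(i k d) / (4 pi d), the free-space Helmholtz Green's function.

    p0      : initial pressure per voxel, shape (V,)
    voxels  : voxel centres, shape (V, 3)
    sensors : sensor positions, shape (S, 3); must not coincide with a voxel
    """
    d = np.linalg.norm(sensors[:, None, :] - voxels[None, :, :], axis=-1)  # (S, V)
    green = np.exp(1j * k * d) / (4.0 * np.pi * d)
    return (green @ p0) * voxel_volume

def physics_aware_loss(p_hat, p_true, psi_meas, voxels, sensors, k, lam=0.1):
    """Image-space MSE plus a data-fidelity term on the re-projected signal."""
    image_term = np.mean((p_hat - p_true) ** 2)
    psi_hat = forward_operator(p_hat, voxels, sensors, k)
    data_term = np.mean(np.abs(psi_hat - psi_meas) ** 2)
    return image_term + lam * data_term

# Small synthetic setup: 27 voxels in a cube, 32 sensors on a unit hemisphere.
rng = np.random.default_rng(0)
axis = np.linspace(-0.2, 0.2, 3)
voxels = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
directions = rng.normal(size=(32, 3))
directions[:, 2] = -np.abs(directions[:, 2])            # lower hemisphere (bowl)
sensors = directions / np.linalg.norm(directions, axis=1, keepdims=True)

p_true = rng.random(voxels.shape[0])
psi_meas = forward_operator(p_true, voxels, sensors, k=20.0)

loss_exact = physics_aware_loss(p_true, p_true, psi_meas, voxels, sensors, k=20.0)
loss_perturbed = physics_aware_loss(p_true + 0.1, p_true, psi_meas, voxels, sensors, k=20.0)
print(loss_exact, loss_perturbed)
```

In training, the data term would be summed over all frequency slices and differentiated through the forward operator, which requires an autodiff framework rather than plain NumPy.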
Because the model learns a mapping between continuous function spaces, it exhibits resolution‑agnostic behavior: a single trained network can be applied to sensor data with different subsampling factors (e.g., 6×, 10× uniform undersampling, or limited‑angle patterns) without any fine‑tuning. This property dramatically reduces the need for multiple models and simplifies deployment on hardware with varying transducer counts.
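The sparse acquisition patterns mentioned above can be expressed as boolean masks over sensor positions. The sketch below assumes a hypothetical layout of 360 azimuthal positions spaced one degree apart, which is not taken from the paper, and the helper names are invented for illustration.

```python
import numpy as np

def uniform_subsample_mask(n_positions, factor):
    """Keep every `factor`-th position (uniform angular undersampling)."""
    mask = np.zeros(n_positions, dtype=bool)
    mask[::factor] = True
    return mask

def limited_angle_mask(azimuth_deg, span_deg):
    """Keep only positions whose azimuth lies in [0, span_deg)."""
    return np.asarray(azimuth_deg) < span_deg

azimuth_deg = np.arange(360)                     # hypothetical: one position per degree

mask_6x = uniform_subsample_mask(azimuth_deg.size, 6)
mask_10x = uniform_subsample_mask(azimuth_deg.size, 10)
mask_third = limited_angle_mask(azimuth_deg, 120.0)   # one-third of the full scan angle

print(mask_6x.sum(), mask_10x.sum(), mask_third.sum())
```

A resolution-agnostic model would accept the measurements selected by any of these masks as input, whereas a grid-based network would typically need one trained model per pattern.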
Experimental evaluation includes both extensive simulations and real‑world phantom measurements. Simulated data were generated using a semi‑analytic frequency‑domain forward model for a homogeneous, lossless medium, with added band‑limiting and additive white Gaussian noise (AWGN) to mimic the acquisition chain. Various acquisition schemes were tested: uniform angular undersampling (6×, 10×), element‑wise subsampling within each quarter‑arc, and limited‑angle configurations in azimuth and elevation. On these benchmarks, PANO achieved a cosine similarity improvement of roughly 33 percentage points over UBP and about 6 percentage points over a state‑of‑the‑art denoising network, while maintaining inference times on the order of tens of milliseconds (≈30 fps). In real phantom experiments (adult‑breast‑like target imaged with a hemispherical array), PANO outperformed UBP by 14 percentage points in cosine similarity and delivered cleaner maximum‑amplitude‑projection (MAP) images with reduced background noise.
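For reference, cosine similarity between a reconstructed volume and the ground truth, and a gap expressed in percentage points, can be computed as in the sketch below. The volumes are synthetic toy data, not results from the paper, and the function names are assumptions.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two volumes, treated as flat vectors."""
    a = np.ravel(np.asarray(a, dtype=float))
    b = np.ravel(np.asarray(b, dtype=float))
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def percentage_point_gap(score_a, score_b):
    """Difference between two similarity scores in percentage points."""
    return 100.0 * (score_a - score_b)

# Toy data only: a random "ground truth" volume and two noisy copies of it.
rng = np.random.default_rng(0)
truth = rng.random((16, 16, 16))
recon_low_noise = truth + 0.1 * rng.normal(size=truth.shape)
recon_high_noise = truth + 1.0 * rng.normal(size=truth.shape)

cs_low = cosine_similarity(recon_low_noise, truth)
cs_high = cosine_similarity(recon_high_noise, truth)
print(cs_low, cs_high, percentage_point_gap(cs_low, cs_high))
```

Cosine similarity ignores overall intensity scaling, so it measures agreement in structure rather than in absolute pressure values.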
Ablation studies confirmed the importance of each architectural element: removing DISCO or the physics loss led to noticeable degradation, highlighting the synergy between geometry‑aware convolutions and physics‑based regularization. The authors also demonstrate sim‑to‑real transfer: despite being primarily trained on simulated data, the model generalizes well to experimental measurements, suggesting robustness to domain shift.
The paper’s contributions can be summarized as follows:
- Introduction of the first end‑to‑end physics‑aware neural operator for 3‑D PACT, achieving substantial quantitative gains over both classical and deep‑learning baselines.
- Demonstration of high‑quality reconstruction with only one‑third of the transducer coverage (≈33 % scan‑angle), enabling faster, cheaper systems.
- Validation of resolution‑agnostic performance, allowing a single model to handle diverse subsampling patterns without retraining.
- Evidence of strong sim‑to‑real generalization, paving the way for practical deployment.
Limitations and future directions include the current assumption of a homogeneous, lossless acoustic medium; extending the forward model to incorporate attenuation, heterogeneity, and non‑linear effects would broaden applicability. Moreover, acquiring more diverse in‑vivo datasets will be essential for clinical translation. The authors suggest integrating multi‑modal data (e.g., MRI or CT) and exploring adaptive sampling strategies guided by the learned operator as promising avenues.
In summary, PANO showcases how embedding physical constraints directly into a neural operator framework can overcome the computational bottlenecks of traditional wave‑based inverse problems, delivering fast, accurate, and hardware‑flexible 3‑D photoacoustic imaging.