Can NeRFs See without Cameras?


Neural Radiance Fields (NeRFs) have been remarkably successful at synthesizing novel views of 3D scenes by optimizing a volumetric scene function. This scene function models how optical rays bring color information from a 3D object to the camera pixels. Radio frequency (RF) or audio signals can also be viewed as a vehicle for delivering information about the environment to a sensor. However, unlike camera pixels, an RF/audio sensor receives a mixture of signals that contain many environmental reflections (also called “multipath”). Is it still possible to infer the environment from such multipath signals? We show that, with redesign, NeRFs can be taught to learn from multipath signals and thereby “see” the environment. As a grounding application, we aim to infer the indoor floorplan of a home from sparse WiFi measurements made at multiple locations inside the home. Although this is a difficult inverse problem, our implicitly learnt floorplans look promising and enable forward applications such as indoor signal prediction and basic ray tracing.


💡 Research Summary

The paper asks whether the core ideas behind Neural Radiance Fields (NeRF) can be transferred from optical imaging to the domain of wireless signal sensing, where measurements consist of a mixture of line‑of‑sight (LoS) and multipath components. The authors propose EchoNeRF, a redesign of NeRF that learns an implicit 2‑D floor‑plan from sparse Wi‑Fi power measurements taken at multiple receiver locations.

Key contributions:

  1. Physical signal model – Received power is decomposed into a LoS term and a first‑order reflection term. The LoS term follows the classic Friis equation, attenuated by the product of voxel transparencies along the ray. The reflection term models each voxel’s opacity, surface orientation, incident and scattering angles, and the attenuation of the two sub‑segments (Tx‑voxel and voxel‑Rx).
  2. Plausible reflection set – Since any opaque voxel could in principle act as a reflector, the problem is severely under‑determined. The authors discretize surface orientation into a small set (four cardinal directions) and exploit the fact that indoor walls are mostly orthogonal. This reduces the set of voxels that can plausibly generate a given reflection to a low‑dimensional manifold, dramatically cutting computational complexity.
  3. Two‑stage training – Direct L2 loss on total power leads to domination by the LoS component; gradients for reflection‑related voxels become vanishingly small. EchoNeRF first trains only the LoS branch to quickly identify transparent voxels along direct rays. After convergence, the LoS branch is frozen and a second stage refines voxel opacity and orientation using the reflection model, with additional smoothness regularization to compensate for sparse measurements.
  4. Implementation – A simple MLP predicts a per‑voxel opacity δ ∈ [0, 1].
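The decomposition in item 1 can be written out explicitly. The notation below is a reconstruction for illustration; symbols such as the gains, the reflection coefficient ρ, and the voxel index sets are my labels and may differ from the paper's:

$$P_{rx} = P_{\mathrm{LoS}} + P_{\mathrm{refl}},$$

$$P_{\mathrm{LoS}} = P_{tx}\, G_{tx} G_{rx} \left(\frac{\lambda}{4\pi d}\right)^{2} \prod_{i \,\in\, \mathcal{R}(tx \to rx)} (1 - \delta_i),$$

$$P_{\mathrm{refl}} = \sum_{v} \delta_v\, \rho(\mathbf{n}_v, \theta_{\mathrm{in}}, \theta_{\mathrm{out}}) \left(\frac{\lambda}{4\pi (d_1 + d_2)}\right)^{2} \prod_{i \,\in\, \mathcal{R}(tx \to v)} (1 - \delta_i) \prod_{j \,\in\, \mathcal{R}(v \to rx)} (1 - \delta_j),$$

where δᵢ ∈ [0, 1] is the opacity of voxel i (so 1 − δᵢ is its transparency), the leading factor in the LoS term is the classic Friis free‑space expression over the Tx–Rx distance d, ρ encodes each candidate reflector's surface orientation and incident/scattering angles, and d₁, d₂ are the Tx–voxel and voxel–Rx sub‑segment lengths.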
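To make the LoS branch concrete, here is a minimal NumPy sketch of Friis power attenuated by the product of voxel transparencies along the direct ray, over a 2‑D opacity grid. Function names, parameter defaults, and the sampling scheme are illustrative choices, not the paper's implementation:

```python
import math
import numpy as np

def friis_power(p_tx, wavelength, d, g_tx=1.0, g_rx=1.0):
    """Free-space received power from the Friis transmission equation."""
    return p_tx * g_tx * g_rx * (wavelength / (4 * math.pi * d)) ** 2

def los_power(opacity, tx, rx, p_tx=1.0, wavelength=0.125, n_samples=64):
    """LoS term: Friis power multiplied by the product of voxel
    transparencies (1 - delta) sampled along the Tx -> Rx segment.
    `opacity` is a 2-D grid of per-voxel opacities in [0, 1]."""
    tx, rx = np.asarray(tx, dtype=float), np.asarray(rx, dtype=float)
    d = np.linalg.norm(rx - tx)
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = tx[None, :] + ts[:, None] * (rx - tx)[None, :]
    # Map sample points to voxel indices (clamped to the grid).
    ij = np.clip(pts.astype(int), 0, np.array(opacity.shape) - 1)
    deltas = opacity[ij[:, 0], ij[:, 1]]
    transparency = np.prod(1.0 - deltas)
    return friis_power(p_tx, wavelength, d) * transparency

# Usage: an empty grid reduces to pure Friis power; a fully opaque
# wall crossing the ray drives the LoS term to zero.
opacity = np.zeros((10, 10))
p_free = los_power(opacity, (1, 1), (8, 8))
opacity[5, :] = 1.0  # opaque wall between Tx and Rx
p_blocked = los_power(opacity, (1, 1), (8, 8))
```

In EchoNeRF the per‑voxel opacities would come from the MLP rather than a fixed grid, and the gradient of this product term is what stage one of training drives toward identifying transparent voxels.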
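Item 2's restriction to four cardinal wall orientations has a simple geometric payoff: for an axis‑aligned wall, the first‑order specular bounce point can be found in closed form with the mirror‑image method, rather than searched over all opaque voxels. A hypothetical sketch for a vertical wall (the function and its parameters are mine, not the paper's):

```python
def reflection_point_on_wall(tx, rx, wall_x):
    """Specular bounce point off a vertical wall x = wall_x, via the
    mirror-image method: reflect Tx across the wall; the bounce point
    is where the segment from the mirrored Tx to Rx crosses the wall.
    Returns None when no valid first-order reflection exists."""
    tx_m = (2 * wall_x - tx[0], tx[1])  # mirror image of Tx
    denom = rx[0] - tx_m[0]
    if denom == 0:
        return None  # segment parallel to the wall
    t = (wall_x - tx_m[0]) / denom
    if not (0.0 <= t <= 1.0):
        return None  # crossing falls outside the Tx'-Rx segment
    y = tx_m[1] + t * (rx[1] - tx_m[1])
    return (wall_x, y)
```

With orientations limited to the four cardinal directions, each candidate reflection ray constrains the reflector to a one‑dimensional locus like this, which is the low‑dimensional manifold the authors exploit.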
