The trajectoRIR Database: Room Acoustic Recordings Along a Trajectory of Moving Microphones


Data availability is essential in the development of acoustic signal processing algorithms, especially for data-driven approaches that demand large and diverse training datasets. For this reason, an increasing number of databases have been published in recent years, containing either room impulse responses (RIRs) or audio recordings made during motion. In this paper, we introduce the trajectoRIR database, an extensive, multi-array collection of both dynamic and stationary acoustic recordings along a controlled trajectory in a room. Specifically, the database contains moving-microphone recordings and stationary RIRs that spatially sample the room acoustics along an L-shaped trajectory. This combination makes trajectoRIR unique and applicable to a wide range of tasks, including sound source localization and tracking, spatially dynamic sound field reconstruction, auralization, and system identification. The recording room has a reverberation time of 0.5 s, and the three microphone configurations employed comprise a dummy head with additional reference microphones located next to the ears; three first-order Ambisonics microphones; two circular arrays of 16 and 4 channels; and a 12-channel linear array. The motion of the microphones was achieved using a robotic cart traversing a 4.62 m-long rail at three speeds: [0.2, 0.4, 0.8] m/s. Audio signals were reproduced using two stationary loudspeakers. The collected database features 8648 stationary RIRs, as well as perfect sweeps, speech, music, and stationary noise recorded during motion. Python functions are provided to access the recorded audio and retrieve the associated geometric information.


💡 Research Summary

The paper presents the trajectoRIR database, a comprehensive collection of both stationary room impulse responses (RIRs) and dynamic microphone recordings captured along a precisely controlled L‑shaped trajectory inside a reverberant laboratory. The authors identify a gap in existing acoustic datasets: while many provide either static RIRs or recordings made with moving microphones, none combine the two in a way that allows direct correspondence between the static acoustic transfer functions and the dynamic recordings taken while the microphones move. To fill this gap, they built a modular rail system (4.62 m long, 16 cm wide) on which a robotic cart travels at three constant speeds (0.2, 0.4, and 0.8 m/s). The cart’s path is marked at 92 positions spaced about 5 cm apart, enabling the collection of RIRs at the exact same spatial points used for the moving recordings.

Three microphone configurations are employed, covering a broad spectrum of commonly used arrays:

  1. MC1 – a Neumann KU‑100 dummy head with two in‑ear omnidirectional microphones, two reference omnidirectional microphones placed next to the ears, a 16‑channel uniform circular array surrounding the head, and a 4‑channel circular array mounted above the head.
  2. MC2 – identical to MC1 but without the dummy head, i.e., only the ear‑adjacent and circular arrays.
  3. MC3 – three first‑order Ambisonics microphones plus a 12‑channel uniform linear array.

These configurations were chosen because they are already used in several well‑known RIR databases, facilitating cross‑dataset research and transfer learning.

The room (Alamire Interactive Laboratory, KU Leuven) measures roughly 6.4 × 6.9 × 4.7 m (volume ≈208 m³) and has a reverberation time T₂₀ = 0.5 s. Two Genelec 8030 CP loudspeakers are placed on opposite sides of the trajectory and serve as the only sound sources. For the static part of the dataset, a total of 8648 RIRs were measured across all microphone configurations and positions. For the dynamic part, five source signals were used: a piano excerpt, a drum beat, female speech, white noise, and two perfect sweeps covering low‑ and high‑frequency ranges. Each signal was recorded at the three cart speeds and with each microphone configuration, yielding 108 multi‑channel recordings (36 per configuration). In addition, the ego‑noise of the cart and rail system was recorded separately to support noise‑reduction research.

All audio files are stored as 48 kHz, 24‑bit WAV files; the entire collection amounts to 3.4 hours of audio, 7.47 GB in size. Accompanying CSV files contain exhaustive metadata: 3‑D coordinates of microphones and speakers, cart speed, timestamps, and even temperature readings. To simplify data handling, the authors provide Python scripts that load the audio, retrieve geometric information, and visualize the trajectory, enabling researchers to integrate the dataset into machine‑learning pipelines with minimal preprocessing.
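To illustrate the kind of metadata handling the accompanying Python scripts perform, here is a minimal, self-contained sketch of parsing per-position geometry from a CSV. The column names (`position_id`, `x_m`, `y_m`, `z_m`, `speed_mps`) are assumptions for illustration only; the actual trajectoRIR CSV schema may use different headers.

```python
import csv
import io

# Hypothetical metadata excerpt -- the real trajectoRIR CSV headers may differ.
EXAMPLE_CSV = """position_id,x_m,y_m,z_m,speed_mps
0,0.00,1.20,1.50,0.4
1,0.05,1.20,1.50,0.4
"""

def load_positions(csv_text):
    """Parse position metadata into a list of dicts with float 3-D coordinates."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "position_id": int(row["position_id"]),
            "xyz": (float(row["x_m"]), float(row["y_m"]), float(row["z_m"])),
            "speed": float(row["speed_mps"]),
        })
    return rows

positions = load_positions(EXAMPLE_CSV)
print(len(positions), positions[0]["xyz"])
```

In practice, one would pass the contents of the dataset's CSV files to such a loader and pair each position with the corresponding WAV segment; the released scripts additionally visualize the trajectory.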

The paper validates the usefulness of the database through a systematic evaluation of time‑variant RIR estimation. Three approaches are compared: (i) using only the sparse static RIR measurements, (ii) using only the moving‑microphone recordings, and (iii) fusing both sources. The fused method achieves the highest correlation with ground‑truth RIRs and the lowest root‑mean‑square error, demonstrating that the combination of static and dynamic data yields superior reconstruction of the acoustic field.
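The two evaluation metrics mentioned above can be sketched in a few lines. This is a generic implementation of zero-lag normalized correlation and RMSE between a ground-truth and an estimated RIR, not the authors' exact evaluation code; the toy impulse responses are invented for illustration.

```python
import math

def rmse(h_ref, h_est):
    """Root-mean-square error between two equal-length impulse responses."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h_ref, h_est)) / len(h_ref))

def correlation(h_ref, h_est):
    """Zero-lag normalized cross-correlation between two impulse responses."""
    num = sum(a * b for a, b in zip(h_ref, h_est))
    den = math.sqrt(sum(a * a for a in h_ref) * sum(b * b for b in h_est))
    return num / den if den else 0.0

# Toy example: an estimate that is a scaled copy of the reference,
# so the normalized correlation is 1 while the RMSE is nonzero.
h_ref = [1.0, 0.5, 0.25, 0.0]
h_est = [0.9, 0.45, 0.225, 0.0]
print(round(correlation(h_ref, h_est), 3), round(rmse(h_ref, h_est), 4))
```

Scale-invariant correlation and scale-sensitive RMSE are complementary: the former captures whether the estimated RIR has the right shape, the latter penalizes amplitude errors as well.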

Limitations are acknowledged: the room’s reverberation time is moderate (0.5 s), only two fixed loudspeakers are used, and the trajectory is confined to a planar L‑shape, which does not capture fully three‑dimensional motion. Future work could extend the database to rooms with longer reverberation times, multiple source positions, and more complex 3‑D trajectories.

In summary, trajectoRIR is the first publicly available dataset that simultaneously provides high‑resolution static RIRs and synchronized moving‑microphone recordings across multiple array geometries. Its rich metadata, diverse signal set, and open‑source access tools make it a valuable resource for research areas such as sound‑source localization and tracking, dynamic sound‑field reconstruction, auralization, echo cancellation, and data‑driven acoustic modeling.

