Completing Sets of Prototype Transfer Functions for Subspace-based Direction of Arrival Estimation of Multiple Speakers
To estimate the direction of arrival (DOA) of multiple speakers, subspace-based prototype transfer function matching methods such as multiple signal classification (MUSIC) or relative transfer function (RTF) vector matching are commonly employed. In general, these methods require calibrated microphone arrays, which are characterized by a known array geometry or a set of known prototype transfer functions for several directions. In this paper, we consider a partially calibrated microphone array, composed of a calibrated binaural hearing aid and a (non-calibrated) external microphone at an unknown location, for which no set of prototype transfer functions is available. We propose a procedure for completing sets of prototype transfer functions by exploiting the orthogonality of subspaces, which allows matching-based DOA estimation methods to be applied with partially calibrated microphone arrays. For the MUSIC and RTF vector matching methods, experimental results for two speakers in noisy and reverberant environments clearly demonstrate that, for all locations of the external microphone, DOAs can be estimated more accurately with completed sets of prototype transfer functions than with incomplete sets. © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
💡 Research Summary
This paper addresses a practical limitation of subspace‑based direction‑of‑arrival (DOA) estimation methods—namely MUSIC and relative‑transfer‑function (RTF) vector matching—when only a partially calibrated microphone array is available. In many hearing‑aid scenarios a calibrated binaural device (four microphones on a dummy head) is combined with one or more external microphones (eMics) whose positions are unknown and for which no anechoic prototype transfer functions (PTFs) have been measured. Traditional subspace methods require a complete set of PTFs for all microphones; without them the spatial spectrum cannot be formed and DOA estimation degrades.
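To make the dependence on a complete prototype set concrete, the following is a minimal NumPy sketch of the textbook MUSIC pseudo-spectrum, 1 / ‖Q_n^H a(θ)‖², evaluated over a grid of candidate directions. It is not the authors' implementation: the function name `music_spectrum`, the array layout, and the small numerical floor are assumptions made for illustration. Each prototype vector must have one entry per microphone, which is why a microphone without prototype transfer functions cannot simply be included.

```python
import numpy as np

def music_spectrum(Q_n, prototypes):
    """Textbook MUSIC pseudo-spectrum 1 / ||Q_n^H a(theta)||^2 per direction.

    Q_n        : (M, M - J) noise-subspace basis (M microphones, J speakers).
    prototypes : (D, M) array, one prototype vector per candidate direction,
                 expressed in the same (e.g. pre-whitened) domain as Q_n.
    """
    # Row d of `proj` holds the entries of Q_n^H a_d.
    proj = prototypes @ Q_n.conj()
    energy = np.sum(np.abs(proj) ** 2, axis=1)
    # Hypothetical floor to avoid division by zero for exact matches.
    return 1.0 / np.maximum(energy, 1e-12)
```

The J largest peaks of the returned spectrum would then be taken as the DOA estimates.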
The authors propose a procedure to “complete” the missing PTFs by exploiting the orthogonality between the signal subspace and the noise subspace of the pre-whitened covariance matrix. The noisy covariance matrix Φ_y is pre-whitened using a square-root decomposition of the undesired-component covariance matrix Φ_u. In the eigendecomposition of the pre-whitened matrix, the signal subspace (approximately) contains the pre-whitened direct-path acoustic transfer function vector a_w(θ) of each speaker, so a_w(θ) is (approximately) orthogonal to the noise subspace Q_n. The noise subspace is partitioned row-wise into the rows belonging to the calibrated binaural array (Q_n,HA) and the row belonging to the eMic, whose conjugated entries form the vector q_n,E. Likewise, a_w(θ) is split into the known part a_w,HA(θ) and the unknown eMic element A_w,E(θ). The orthogonality relation Q_n^H a_w(θ) = 0 then reads Q_n,HA^H a_w,HA(θ) + q_n,E A_w,E(θ) = 0, which is linear in the unknown A_w,E(θ). Solving the least-squares problem
α_opt = arg min_α ‖q_n,E − α Q_n,HA^H a_w,HA(θ)‖²,
with α = −1/A_w,E(θ), gives the missing pre-whitened transfer function of the eMic in closed form as A_w,E(θ) = −1/α_opt. The completed pre-whitened transfer-function vector a_w,completed(θ) is then formed by concatenating the known binaural part a_w,HA(θ) with the estimated eMic component.
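The completion step can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `complete_emic_entry` is hypothetical, the eMic is assumed to occupy the last row of Q_n, and q_n,E is taken to be the conjugated eMic row so that Q_n^H a_w(θ) = 0 becomes Q_n,HA^H a_w,HA(θ) + q_n,E A_w,E(θ) = 0. The scalar least-squares problem in α = −1/A_w,E(θ) then has a one-line closed-form solution.

```python
import numpy as np

def complete_emic_entry(Q_n, a_w_ha):
    """Estimate the missing eMic entry of one pre-whitened prototype vector.

    Q_n    : (M, M - J) noise-subspace basis; the eMic is ASSUMED to be
             the last of the M microphones (last row).
    a_w_ha : (M - 1,) known pre-whitened hearing-aid part for one direction.
    """
    Q_ha = Q_n[:-1, :]           # rows of the calibrated hearing-aid microphones
    q_e = Q_n[-1, :].conj()      # conjugated eMic row, written as a vector
    b = Q_ha.conj().T @ a_w_ha   # Q_n,HA^H a_w,HA(theta)
    # Orthogonality: b + q_e * A_E = 0, i.e. q_e = alpha * b with alpha = -1 / A_E.
    # Least-squares solution of min_alpha ||q_e - alpha * b||^2:
    alpha = np.vdot(b, q_e) / np.vdot(b, b)   # np.vdot conjugates its first argument
    return -1.0 / alpha

def complete_prototype(Q_n, a_w_ha):
    """Concatenate the known hearing-aid part with the estimated eMic entry."""
    return np.append(a_w_ha, complete_emic_entry(Q_n, a_w_ha))
```

With exact (noise-free) subspaces the estimate recovers the true eMic entry. With estimated covariance matrices the orthogonality only holds approximately, which is why the relation is solved in the least-squares sense. The sketch breaks down if the eMic row of Q_n is (near) zero, since α then tends to zero.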
Having reconstructed the full set of pre‑whitened PTFs, the authors generate the corresponding RTF vectors by de‑whitening: g_completed(θ)=Φ_u^{1/2} a_w,completed(θ)·e₁·