Circular Statistics-based low complexity DOA estimation for hearing aid application

The proposed Circular statistics-based Inter-Microphone Phase difference estimation Localizer (CIMPL) method is tailored toward binaural hearing aid systems with microphone arrays in each unit. The method utilizes the circular statistics (circular me…

Authors: Lars D. Mosgaard, David Pelegrin-Garcia, Thomas B. Elmedyb

LOCATA Challenge Workshop, a satellite event of IWAENC 2018, September 17-20, 2018, Tokyo, Japan

Circular Statistics-based low complexity DOA estimation for hearing aid application

L. D. Mosgaard, D. Pelegrin-Garcia, T. B. Elmedyb, M. J. Pihl, P. Mowlaee
Widex A/S, Nymøllevej 6, DK-3540 Lynge, Denmark
lmos@widex.com

ABSTRACT

The proposed Circular statistics-based Inter-Microphone Phase difference estimation Localizer (CIMPL) method is tailored toward binaural hearing aid systems with microphone arrays in each unit. The method utilizes the circular statistics (circular mean and circular variance) of the inter-microphone phase difference (IPD) across different microphone pairs. These IPDs are first mapped to time delays through a variance-weighted linear fit, then mapped to azimuth direction-of-arrival (DoA), and lastly the information of the different microphone pairs is combined. The variance is carried through the different transformations and acts as a reliability index of the estimated angle. Both the resulting angle and variance are fed into a wrapped Kalman filter, which provides a smoothed estimate of the DoA. The proposed method improves the accuracy of the tracked angle of a single moving source compared with the benchmark method provided by the LOCATA challenge, and it runs approximately 75 times faster.

Index Terms: Direction-of-arrival estimation, inter-microphone phase estimation, time difference of arrival, circular statistics, hearing aids.

1. INTRODUCTION

Microphone array processing is of interest for hands-free communication, hearing aids, robotics and immersive audio communication systems. It is used in a wide range of applications including noise reduction [1, 2], informed spatial filters for source separation [2, 3], source localization [4] and robust beamforming [5, 6]. The achievable performance in these applications is heavily governed by accurate information about the direction-of-arrival (DoA) of the target source(s).

Conventional methods for DoA estimation can be grouped into two classes: i) subspace methods relying on, e.g., steered-response power phase transform (SRP-PHAT) [7], MUSIC [8] and ESPRIT [9], and ii) cross-power spectrum phase (CSP) based methods [10, 11]. While the methods in the two groups differ in their DoA estimation accuracy and computational efficiency, CSP is popular due to its simplicity and reliability. Of particular importance is the so-called generalized cross correlation (GCC) method using the phase transform (PHAT) normalization [10], owing to its robustness in DoA estimation for acoustic source localization [11]. More recently, circular statistics has shown great potential in multi-channel source tracking for both subspace-based [12] and CSP-based [13] methods.

In this paper, we propose a CSP-based DoA estimator which relies on circular statistics throughout all estimation stages (Figure 1). Our proposed method, CIMPL, is particularly targeted for application in hearing aids.
Specifically, we consider a binaural hearing aid setup consisting of two microphones per hearing aid with a binaural radio connection between the two hearing aids. For DoA estimation in such a hearing aid setup, two major challenges are i) the restricted positioning of microphones with a small inter-microphone spacing on each hearing aid and ii) strict computational limitations. We demonstrate the performance of the proposed method with hearing aid recordings in the presence of a single static source (task 1), a single moving source (task 3) and a single moving source with a moving listener (task 5) as defined in the LOCATA challenge [14].

Figure 1: System diagram for the proposed method composed of three stages: i) TDoA estimation relying on Circular statistics-based Inter-Microphone Phase difference estimation (CIMP) and TDoA fit to left, right and binaural IPDs, ii) data association by integrating the monaural (left and right) and binaural TDoAs, and iii) source tracker using wrapped Kalman filter. (Diagram blocks: FFT of the front-left, front-right, rear-left and rear-right microphone signals; CIMP phase difference estimation; TDoA fit for the left, right and binaural pairs; DoA map; combine direction; wrapped Kalman filter.)

2. DOA ESTIMATION

The CIMPL method is based on three major components: i) time difference of arrival (TDoA) estimation, ii) monaural and binaural integration, and iii) source tracking. Figure 1 provides an overview of the CIMPL method. The different stages are explained in the following.

2.1. Time difference of arrival estimation

The initial step in CIMPL is to estimate the TDoA for each microphone set. The TDoA estimation is divided in two stages operating in the frequency domain. The first stage is a phase difference estimation and the second stage consists of a weighted linear fit to estimate the TDoA.

2.1.1. Circular statistics-based inter-microphone phase difference estimation (CIMP)

The instantaneous IPD at frame $l$ and frequency bin $k$, denoted by $\theta_{ab}(k,l)$, defined between two microphones $a$ and $b$, is given by the instantaneous normalized cross-spectrum

$$ e^{j\theta_{ab}(k,l)} = \frac{X_a(k,l)\, X_b^{*}(k,l)}{\left| X_a(k,l)\, X_b(k,l) \right|}, \tag{1} $$

where $X_a$ and $X_b$ are the short-time Fourier transforms of the input signals at the two microphones and $j = \sqrt{-1}$. We assume that $\theta_{ab}(k,l)$ is a particular realization of a circular random variable $\Theta$. Therefore, the statistical properties of the IPDs are governed by circular statistics and the mean is given by [15, 16]

$$ E_l\left\{ e^{j\theta_{ab}(k,l)} \right\} = R_{ab}(k,l)\, e^{j\hat{\theta}_{ab}(k,l)}, \tag{2} $$

where $E_l$ is a short-time expectation operator (moving average), $\hat{\theta}_{ab} \in [-\pi, \pi)$ is the mean IPD and $R_{ab} \in [0, 1]$ is the mean resultant length.

The mean resultant length carries information about the directional statistics of the impinging signals at the hearing aid, specifically about the spread of the IPD. For uniformly distributed $\Theta$, which corresponds to the signals at the two microphones being completely uncorrelated, the associated mean resultant length goes to 0. At the other extreme, $\Theta$ is distributed as a Dirac delta function, $\Theta \sim W\{\delta(\theta_{ab} - \theta_0)\}$, corresponding to an ideal anechoic source for a specific frequency $f$ at $\theta_0 = 2\pi f (d/c) \cos\varphi$, where $W\{\cdot\}$ denotes the transformation that maps a probability density function to its wrapped counterpart [15], $d$ is the inter-microphone spacing, $c$ is the speed of sound, and $\varphi$ is the angle of arrival relative to the rotation axis of the microphone pair. In this case, the mean resultant length converges to one.
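As an illustration of (1) and (2), the following minimal sketch computes the mean IPD and the mean resultant length per frequency bin. It is not the authors' implementation: the function name `ipd_statistics`, the guard constant `eps` and the synthetic test signals are assumptions, and the short-time expectation is replaced by a plain average over all available frames rather than a moving average.

```python
import numpy as np

def ipd_statistics(X_a, X_b, eps=1e-12):
    """Mean IPD and mean resultant length per frequency bin.

    X_a, X_b: complex STFTs of shape (bins, frames).
    The expectation in Eq. (2) is approximated by a plain mean over frames
    (an assumption; the paper uses a moving average).
    """
    cross = X_a * np.conj(X_b)
    phasor = cross / np.maximum(np.abs(cross), eps)  # Eq. (1), unit-magnitude phasor
    mean_phasor = phasor.mean(axis=1)                # left-hand side of Eq. (2)
    return np.angle(mean_phasor), np.abs(mean_phasor)

rng = np.random.default_rng(0)
bins, frames = 4, 2000

def noise():
    # Synthetic complex Gaussian "STFT" data, for illustration only.
    return rng.standard_normal((bins, frames)) + 1j * rng.standard_normal((bins, frames))

# Fully coherent pair: microphone b lags microphone a by a fixed phase of 0.5 rad.
X_a = noise()
X_b = X_a * np.exp(-1j * 0.5)
theta_hat, R = ipd_statistics(X_a, X_b)

# Uncorrelated pair: the IPD is uniformly distributed, so R collapses towards 0.
_, R_uncorrelated = ipd_statistics(X_a, noise())
```

In this toy setting the coherent pair yields a mean IPD equal to the imposed phase offset with a mean resultant length of one, whereas the uncorrelated pair yields a mean resultant length close to zero, matching the two extremes described above.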
A particularly detrimental type of interference, both for speech intelligibility and for common DoA algorithms, is late reverberation, typically modeled as diffuse noise. Diffuse noise is characterized by being a sound field with completely random incident sound waves [17]. This corresponds to the IPD having a uniform probability density $\Theta \sim W\{U(-\pi f/f_u, \pi f/f_u)\}$, where $f_u = c/(2d)$ is the upper frequency limit below which phase ambiguities, due to the $2\pi$-periodicity of the IPD, are avoided. For diffuse noise scenarios, the mean resultant length for low frequencies ($f \ll f_u$) approaches one. It gets close to zero as the frequency approaches the phase ambiguity limit. Thus, at low frequencies, both diffuse noise and localized sources have similar mean resultant lengths and it becomes difficult to statistically distinguish the two sound fields from each other.

To resolve the aforementioned limitation, we propose transforming the IPD such that the probability density for diffuse noise is mapped to a uniform distribution $\Theta \sim U[-\pi, \pi)$ for all frequencies up to $f_u$ while preserving the mean resultant length of localized sources. Under free- and far-field conditions and assuming that the inter-microphone spacing is known, the mapped mean resultant length $\tilde{R}_{ab}(k,l)$, which is the mean resultant length of the transformed IPD, takes the form

$$ \tilde{R}_{ab}(k,l) = \left| E_l\left\{ e^{j\theta_{ab}(k,l)\, k_u / k} \right\} \right|, \tag{3} $$

where $k_u = 2 K f_u / f_s$, with $f_s$ being the sampling frequency and $K$ the number of frequency bins up to the Nyquist limit. The mapped mean resultant length for diffuse noise approaches zero for all $k < k_u$, while for anechoic sources it approaches one as intended.

Commonly used methods for estimating diffuse noise (e.g., [18, 19]) are only applicable for $k > k_u$. Unlike those methods, the mapped mean resultant length works best for $k < k_u$ and is particularly suitable for arrays with very short microphone spacing such as hearing aids. In particular, by employing the proposed mapped mean resultant length instead of the mean resultant length, a weighting is applied in time-frequency that accounts for diffuse noise in low-frequency TDoA estimation for small microphone arrays such as hearing aids.

Due to the acoustical nature of hearing aid arrays, only frequencies up to $k_u$ are considered. At higher frequencies, both for the small spacing between the two microphones on one hearing aid (i.e., monaural case) and across the ears (i.e., binaural case), the assumptions of free- and far-field break down.

2.1.2. Estimating time difference in the frequency domain

Given the mean IPD and the mapped mean resultant lengths calculated so far, the TDoA corresponding to the direct path from a given source needs to be estimated. In free- and far-field conditions the TDoA of a single stationary broadband source corresponds to a constant group delay across frequency, which reduces the problem of estimating the TDoA to fitting a straight line $\theta(f) = 2\pi f \tau$. This is effectively done in the GCC method by using the inverse Fourier transform and finding the TDoA as the time lag that maximizes the GCC.

Because the IPDs are circular variables, the estimation of the TDoA requires solving a circular-linear fit [15]. For a probabilistic interpretation of the regression problem using wrapped IPDs, we refer to [13]. However, since we are only considering frequencies below $f_u$, thereby avoiding phase ambiguity, an ordinary linear fit can be used as an approximation.

In a commonly used least mean square fit, it is assumed that all data are drawn from a common distribution. However, for each mean IPD, a mapped mean resultant length is estimated, corresponding to a reliability measure of the mean IPD.
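The effect of the mapping in (3), whose output serves as this reliability measure, can be checked numerically. The sketch below is an illustration under assumptions rather than the authors' code: the function name `mapped_mean_resultant_length`, the bin indices and the sample count are hypothetical, and the expectation is again a plain sample mean. It uses $f/f_u = k/k_u$, which follows from $f_k = f_s k/(2K)$ and $k_u = 2 K f_u / f_s$.

```python
import numpy as np

def mapped_mean_resultant_length(theta, k, k_u):
    """Eq. (3): mean resultant length of the IPD scaled by k_u / k.

    theta: IPD samples in radians over frames at bin k, with 0 < k <= k_u.
    """
    return np.abs(np.mean(np.exp(1j * theta * k_u / k)))

rng = np.random.default_rng(1)
k, k_u, n = 8, 64, 5000  # hypothetical bin index, ambiguity-limit bin, frame count

# Diffuse-noise model from the text: IPD uniform on (-pi*k/k_u, pi*k/k_u).
theta_diffuse = rng.uniform(-np.pi * k / k_u, np.pi * k / k_u, n)
# Ideal anechoic source: constant IPD at this bin.
theta_source = np.full(n, 0.2)

R_plain_diffuse = np.abs(np.mean(np.exp(1j * theta_diffuse)))  # unmapped, stays near 1
R_mapped_diffuse = mapped_mean_resultant_length(theta_diffuse, k, k_u)
R_mapped_source = mapped_mean_resultant_length(theta_source, k, k_u)
```

At this low-frequency bin the unmapped mean resultant length of the diffuse field remains close to one, while the mapped value drops towards zero and the mapped value of the anechoic source stays at one, which is the separation the mapping is designed to provide.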
Due to the aforementioned small inter-microphone spacing in the hearing aid setup, we employ the mapped mean resultant length in (3) instead of the mean resultant length. Assuming for simplicity that the IPD follows a wrapped normal distribution, the variance $\sigma^2_{ab}$ is given by [15]

$$ \sigma^2_{ab}(k,l) = -2 \log\left( \tilde{R}_{ab}(k,l) \right). \tag{4} $$

For small variances a wrapped normal distribution is well approximated by a normal distribution. However, for small sample sizes, low mean resultant length values are overestimated, corresponding to an underestimation of the variance, which leads to overemphasizing uncertain data points in the fit. As one way to circumvent this problem, we empirically found that using the circular dispersion [15], defined as

$$ \delta_{ab}(k,l) = \frac{1 - \tilde{R}^4_{ab}(k,l)}{2 \tilde{R}^2_{ab}(k,l)} \tag{5} $$

for a wrapped normal distribution, deemphasizes the uncertain data points. The reason for this is that $\delta_{ab}$ penalizes low $\tilde{R}$ values more than (4) does, while providing practically the same results for higher $\tilde{R}$ values. Considering that each data point has a known variance given by the circular dispersion, and approximating the wrapped normal distribution with the normal distribution, the best least mean square fitted $\tau_{ab}$ takes the form

$$ \tau_{ab}(l) = \frac{1}{2\pi} \, \frac{\sum_{k=1}^{K'} \dfrac{\hat{\theta}_{ab}(k,l)\, f_k}{\delta_{ab}(k,l)}}{\sum_{k=1}^{K'} \dfrac{f_k^2}{\delta_{ab}(k,l)}}, \tag{6} $$

where $k$ is the frequency bin index, $\hat{\theta}_{ab}$ is the estimated mean IPD from (2) and the upper summation limit $K' < K$ denotes the number of frequency bins over which the fit is performed. The actual frequency is $f_k = f_s k / (2K)$. The variance of the estimated TDoA can, by approximating $\delta_{ab}$ as a deterministic variable, be written as

$$ \mathrm{var}(\tau_{ab}(l)) = \frac{1}{4\pi^2} \, \frac{1}{\sum_{k=1}^{K'} \dfrac{f_k^2}{\delta_{ab}(k,l)}}. \tag{7} $$

This expression contains a number of simplifications and should only be considered an approximation. However, using (7) allows for a computationally simple closed-form approximation of the variance of the estimated TDoA, which can be utilized throughout the further stages to associate data based on their variance.

2.2. Monaural and binaural information integration

From the estimated TDoA and its variance, a local DoA can be estimated for each microphone pair along with its variance. In the proposed method only azimuth DoA is considered and the look direction of the hearing aid user is defined as zero. Three microphone pairs are required in CIMPL: the two (left and right) monaural combinations ($M \in \{L, R\}$) and a binaural ($B$) pair. Additional binaural pairs can be included to improve the accuracy. Assuming far and free field and that the monaural arrays point in the look direction, the local DoAs can be estimated from the monaural TDoAs as

$$ \phi_M = \arccos\left( \frac{c}{d_M} \tau_M \right), \tag{8} $$

where $d_M$ is the inter-microphone spacing between the two microphones on one hearing aid (monaural). Note that, even though the calculations take place at each frame $l$ (i.e., $\phi_M \equiv \phi_M(l)$), here and in the rest of the paper we drop the time index for conciseness. Using the Taylor expansion of (8) around $\phi_M = 90^\circ$, the variance of the estimated monaural DoAs can be approximated from the variance of the TDoAs as

$$ \mathrm{var}(\phi_M) \approx \left( \frac{c}{d_M} \right)^2 \mathrm{var}(\tau_M), \tag{9} $$

where $\mathrm{var}(\tau_M)$ is estimated using (7).

For the binaural microphone pair, we assume far field and an ellipsoidal head model [20]. From this, the binaural DoA is well approximated by

$$ \phi_B \approx \left( \frac{c}{d_B} \tau_B \right), \tag{10} $$

where $d_B$ is the inter-microphone spacing between the two hearing aids on the head and the look direction is perpendicular to the rotation axis of the binaural microphone pair. The variance of the estimated binaural DoA can be written as

$$ \mathrm{var}(\phi_B) = \left( \frac{c}{d_B} \right)^2 \mathrm{var}(\tau_B). \tag{11} $$

The estimated DoAs are circular variables and their estimated variances are transformed to mean resultant lengths using (4), where each DoA is assumed to follow a wrapped normal distribution. We denote $R_M$ ($M \in \{L, R\}$) and $R_B$ as the monaural and the binaural mean resultant lengths associated with the angles of arrival, respectively.

The monaural DoA estimates for the left and the right pairs are defined in the interval $[0, \pi]$ due to the rotational symmetry around the line connecting the microphones. Correspondingly, the binaural DoA is defined within $[-\pi/2, \pi/2]$. In order to combine the information from the monaural pairs and the binaural pair, a common support must be established. This is accomplished by mapping all azimuth estimates onto the full circle ($\varphi \in [-\pi, \pi)$). The choice of the monaural estimate and its mean resultant length depends on which hearing aid is closer to the source. Using the binaural pair, we determine whether a given source is to the left ($\phi_B \geq 0$) or the right ($\phi_B < 0$). Based on this, if the source is located on the left, the left monaural microphone pair is chosen ($\varphi_M = \phi_L$), and similarly on the right side ($\varphi_M = -\phi_R$). Due to the head shadow effect, the monaural microphone pair closer to the source yields a more reliable estimate. From the chosen monaural pair it can be determined whether a potential source is in front of ($|\varphi_M| \leq \pi/2$) or behind ($|\varphi_M| > \pi/2$) the hearing aid user. When a source is in the front, then $\varphi_B = \phi_B$. If the source is determined to be to the right and behind the wearer, then $\varphi_B = -\pi - \phi_B$, and if it is behind and to the left, then $\varphi_B = \pi - \phi_B$. The mean resultant lengths are invariant under translations and are carried over directly.

We now have a monaural and a binaural azimuth estimate of the full-circle DoA with their mean resultant lengths.
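The chain from per-bin statistics to full-circle azimuth estimates, i.e. (5) to (11) followed by the left/right and front/back decisions, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: all function names, the speed of sound `C_SOUND`, the clipping guards, the 12 mm microphone spacing and the synthetic input are hypothetical choices made for the example.

```python
import numpy as np

C_SOUND = 343.0  # speed of sound in m/s (assumed value)

def circular_dispersion(R_tilde):
    # Eq. (5); clipping avoids division by zero (a guard not described in the paper).
    R = np.clip(R_tilde, 1e-6, 1.0 - 1e-6)
    return (1.0 - R**4) / (2.0 * R**2)

def fit_tdoa(theta_hat, R_tilde, f):
    """Dispersion-weighted line fit through the origin, Eqs. (6) and (7)."""
    w = 1.0 / circular_dispersion(R_tilde)
    denom = np.sum(w * f**2)
    tau = np.sum(w * theta_hat * f) / (2.0 * np.pi * denom)
    var_tau = 1.0 / (4.0 * np.pi**2 * denom)
    return tau, var_tau

def monaural_doa(tau, var_tau, d):
    # Eqs. (8) and (9); the clip keeps arccos defined for noisy TDoAs (assumption).
    arg = np.clip(C_SOUND * tau / d, -1.0, 1.0)
    return np.arccos(arg), (C_SOUND / d) ** 2 * var_tau

def binaural_doa(tau, var_tau, d):
    # Eqs. (10) and (11).
    return C_SOUND * tau / d, (C_SOUND / d) ** 2 * var_tau

def to_full_circle(phi_L, phi_R, phi_B):
    """Map local DoAs onto [-pi, pi) following the decisions in Section 2.2."""
    varphi_M = phi_L if phi_B >= 0 else -phi_R  # pick the pair closer to the source
    if abs(varphi_M) <= np.pi / 2:              # source in front
        varphi_B = phi_B
    elif phi_B >= 0:                            # behind and to the left
        varphi_B = np.pi - phi_B
    else:                                       # behind and to the right
        varphi_B = -np.pi - phi_B
    return varphi_M, varphi_B

# Synthetic check: noiseless IPDs generated from a known delay are fitted exactly.
f = np.linspace(100.0, 4000.0, 40)              # hypothetical bin frequencies in Hz
tau_true, d_M = 2e-5, 0.012                     # hypothetical delay (s) and spacing (m)
tau_fit, var_tau = fit_tdoa(2 * np.pi * f * tau_true, np.full(f.size, 0.9), f)
phi_M, var_phi_M = monaural_doa(tau_fit, var_tau, d_M)

behind_left = to_full_circle(phi_L=2.5, phi_R=0.7, phi_B=0.4)
front_right = to_full_circle(phi_L=2.0, phi_R=1.0, phi_B=-0.3)
```

Because the fit is a closed-form ratio of two weighted sums, its cost grows only linearly with the number of bins, which is consistent with the low-complexity goal stated for the method.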
From the monaural and binaural full-circle estimates, a statistical test is performed to assess the null hypothesis that the two estimates have a common mean [15]. The modified test statistic that we employ is

$$ Y = 2 \left[ \left( \frac{w_M}{\delta_M} + \frac{w_B}{\delta_B} \right) - \sqrt{C^2 + S^2} \right], \tag{12} $$

where $C$ and $S$ are given by

$$ C = \frac{w_M}{\delta_M} \cos(\varphi_M) + \frac{w_B}{\delta_B} \cos(\varphi_B), \qquad S = \frac{w_M}{\delta_M} \sin(\varphi_M) + \frac{w_B}{\delta_B} \sin(\varphi_B). \tag{13} $$

Here, $\delta$ is the circular dispersion known from (5), $w_M = \sin^2(\varphi_M)$ and $w_B = \cos^2(\varphi_B)$ are weighting factors for the monaural and binaural estimates, respectively, and $Y$ is the test statistic to be compared with the upper $100(1-\alpha)\%$ point of the $\chi^2_1$ distribution, with $\alpha$ as the significance level. The weighting factors are used to effectively reduce the reliability of the estimates to compensate for the approximations made in (9) and (11). If the null hypothesis is accepted with $\alpha = 0.1$, a common mean direction $\hat{\varphi}$ of the two estimates is calculated as [15]

$$ \hat{\varphi} = \angle \left\{ w_1 R_M e^{i\varphi_M} + w_2 R_B e^{i\varphi_B} \right\}, \tag{14} $$

with

$$ w_1 = \frac{w_M / (R_M \delta_M)}{w_M / (R_M \delta_M) + w_B / (R_B \delta_B)}, \qquad w_2 = \frac{w_B / (R_B \delta_B)}{w_M / (R_M \delta_M) + w_B / (R_B \delta_B)}. \tag{15} $$

Similarly, the circular dispersion of the common mean direction is

$$ \delta = 2\, \frac{w_1^2 R_M^2 \delta_M + w_2^2 R_B^2 \delta_B}{\left( w_1 R_M + w_2 R_B \right)^2}. \tag{16} $$

Subsequently, the mean resultant length of the common mean can be calculated by solving (5) for $R$ using the circular dispersion obtained by (16), yielding

$$ R = \frac{1}{\sqrt{\delta + \sqrt{1 + \delta^2}}}. \tag{17} $$

If the null hypothesis is rejected, the DoA and its mean resultant length are chosen from the estimate with the lowest circular dispersion, i.e., either the monaural or the binaural.
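The test and fusion steps in (12) to (17) can be sketched as below. This is a minimal sketch under assumptions, not the authors' code: the function names are hypothetical, the constant 2.7055 is the upper 10% point of the chi-square distribution with one degree of freedom, and the fallback used when both weighting factors vanish is an added guard that the paper does not describe.

```python
import numpy as np

CHI2_1_UPPER_10 = 2.7055  # upper 10% point of chi-square with 1 degree of freedom

def dispersion(R):
    return (1.0 - R**4) / (2.0 * R**2)  # Eq. (5)

def fuse(varphi_M, R_M, varphi_B, R_B):
    """Combine monaural and binaural full-circle estimates, Eqs. (12)-(17).

    Returns (azimuth, mean resultant length).
    """
    d_M, d_B = dispersion(R_M), dispersion(R_B)
    w_M, w_B = np.sin(varphi_M) ** 2, np.cos(varphi_B) ** 2
    C = w_M / d_M * np.cos(varphi_M) + w_B / d_B * np.cos(varphi_B)  # Eq. (13)
    S = w_M / d_M * np.sin(varphi_M) + w_B / d_B * np.sin(varphi_B)
    Y = 2.0 * ((w_M / d_M + w_B / d_B) - np.hypot(C, S))             # Eq. (12)

    a, b = w_M / (R_M * d_M), w_B / (R_B * d_B)
    if Y > CHI2_1_UPPER_10 or a + b == 0.0:
        # Null hypothesis rejected (or degenerate weights, an added guard):
        # keep the estimate with the lower circular dispersion.
        return (varphi_M, R_M) if d_M <= d_B else (varphi_B, R_B)

    w1, w2 = a / (a + b), b / (a + b)                                # Eq. (15)
    mean = np.angle(w1 * R_M * np.exp(1j * varphi_M)
                    + w2 * R_B * np.exp(1j * varphi_B))              # Eq. (14)
    delta = 2.0 * (w1**2 * R_M**2 * d_M + w2**2 * R_B**2 * d_B) \
        / (w1 * R_M + w2 * R_B) ** 2                                 # Eq. (16)
    R = 1.0 / np.sqrt(delta + np.sqrt(1.0 + delta**2))               # Eq. (17)
    return mean, R

# Agreeing estimates are merged; strongly disagreeing ones are not.
merged = fuse(0.8, 0.90, 0.8, 0.95)
selected = fuse(2.0, 0.99, 0.2, 0.95)
```

In the first call both estimates point to the same azimuth, so the test statistic is zero and the fused direction equals the common angle. In the second call the estimates differ by more than 100 degrees with small dispersions, so the hypothesis of a common mean is rejected and the more reliable monaural estimate is returned unchanged.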
From the above development, the information provided by the monaural and the binaural TDoAs and their variances is combined into a unified full-circle DoA estimate $\hat{\varphi}$ in (14), with an accompanying circular dispersion $\delta$ in (16) and mean resultant length $R$ in (17).

2.3. Source tracking

The azimuth estimate at the output of the previous stage is very noisy, but at the same time it is accompanied by an instantaneous indication of reliability in the form of the mean resultant length $R$ (17) or the circular dispersion (16). We include an angle-only wrapped Kalman filter [21] to obtain a smoother estimate. In contrast to the original method described in [21], which assumes a fixed and known variance denoted by $\sigma^2_w$ for the innovation term, we update this quantity at each frame using the circular dispersion as an approximation, i.e., $\sigma^2_{w_t} \approx \delta$. By using the circular dispersion provided in (16) instead of the variance, low $R$ values map onto higher $\sigma^2_w$ values.

3. EVALUATION

The LOCATA challenge development dataset [14] was used to assess the performance of CIMPL. More specifically, the hearing aid recordings in the presence of a single static source (task 1), a single moving source (task 3) and a single moving source with a moving listener (task 5) were considered. The standard deviation of the process noise in the wrapped Kalman filter was set to $1^\circ$.

Figure 2 illustrates the behavior of the algorithm for a recording of a single moving source. Notice that the raw azimuth estimates, shown in gray in the top panel, are very noisy. In contrast, the tracked angles, shown in red in the top panel, are smoother and more accurate thanks to the use of a wrapped Kalman filter. The input measurement variance to the wrapped Kalman filter was updated at each frame with the dispersion $\delta$, related to the reliability factor of the estimates, shown in red in the bottom panel of Figure 2.
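The idea of feeding a per-frame dispersion into the tracker can be illustrated with a strongly simplified filter. The sketch below is not the wrapped Kalman filter of [21], which accounts for wrapped replicas of the observation; it is a scalar random-walk Kalman filter whose innovation is wrapped to $[-\pi, \pi)$, with the measurement variance set to $\delta$ at each frame. The function names, the initial variance `p0` and the synthetic data are assumptions; only the process noise standard deviation of $1^\circ$ is taken from the text.

```python
import numpy as np

def wrap(angle):
    """Wrap an angle (or array of angles) to [-pi, pi)."""
    return (angle + np.pi) % (2.0 * np.pi) - np.pi

def track(angles, dispersions, q_std=np.deg2rad(1.0), p0=np.pi**2):
    """Simplified angle-only tracker with a time-varying measurement variance.

    angles:      raw azimuth estimates per frame (radians)
    dispersions: circular dispersion per frame, used as measurement variance
    q_std:       process noise standard deviation (1 degree, as in the text)
    p0:          initial state variance (assumed)
    """
    x, p = angles[0], p0
    tracked = []
    for z, delta in zip(angles, dispersions):
        p = p + q_std**2                   # predict step, random-walk model (assumed)
        gain = p / (p + delta)             # unreliable frames (large delta) get low gain
        x = wrap(x + gain * wrap(z - x))   # update with wrapped innovation
        p = (1.0 - gain) * p
        tracked.append(x)
    return np.array(tracked)

# Synthetic static source close to the +/- pi wrap-around point.
rng = np.random.default_rng(2)
true_angle, noise_std, n = 3.1, 0.2, 500
raw = wrap(true_angle + noise_std * rng.standard_normal(n))
tracked = track(raw, np.full(n, noise_std**2))

raw_error = np.mean(np.abs(wrap(raw - true_angle))[n // 2:])
tracked_error = np.mean(np.abs(wrap(tracked - true_angle))[n // 2:])
```

Wrapping the innovation keeps the filter from averaging across the discontinuity at $\pm\pi$, which an ordinary Kalman filter on the raw angle values would do for a source behind the listener.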
The mean absolute deviation from the ground truth (with standard deviation shown in parentheses), averaged across all data segments where speech was active, was $5.9^\circ$ ($10.4^\circ$) for task 1, $8.2^\circ$ ($8.2^\circ$) for task 3, and $18.7^\circ$ ($23.5^\circ$) for task 5.

As shown in Figure 3, the performance of CIMPL in task 1 is comparable to that of the tracked MUSIC algorithm provided by the LOCATA challenge [14] as the benchmark, and better in tasks 3 and 5. Moreover, CIMPL runs in 1.3% of the CPU time required by the tracked MUSIC algorithm [14] provided in the LOCATA challenge.

Figure 2: [Top] Azimuth tracking of a single moving source with CIMPL (red) and ground truth (dashed), together with raw angle estimates before the wrapped Kalman filter (gray). [Bottom] Raw audio signal (gray) and the reliability factor (red) used as input to the wrapped Kalman filter.

Figure 3: Azimuth accuracy for Tasks 1, 3 and 5 for the hearing aid recordings of the LOCATA challenge development dataset [14].

4. CONCLUDING REMARKS

In this paper we proposed a new DoA estimator targeted at tracking a single source with a binaural hearing aid setup. By estimating the angle via circular statistics, the mean resultant length is obtained, which acts as a reliability index. The mean resultant length is then carried throughout all the processing steps and is used at the tracker to improve the accuracy of the tracked angle.

Performance evaluation of the proposed method on the hearing aid recordings provided in the development dataset of the LOCATA challenge [14] revealed an improved accuracy of the tracked angle of a single moving source compared to the benchmark method (tracked MUSIC algorithm) provided by the organizers, while running approximately 75 times faster. The low computational complexity of our algorithm makes it a favorable choice for hearing aid application.
The estimated angle may be used at further stages of potential hearing aid processing, such as informed beamforming or scene classification.

5. REFERENCES

[1] A. Schwarz and W. Kellermann, "Coherent-to-Diffuse Power Ratio Estimation for Dereverberation," IEEE Transactions on Audio, Speech and Language Processing, vol. 23, no. 6, pp. 1006–1018, 2015.
[2] S. Chakrabarty and E. A. Habets, "A Bayesian approach to informed spatial filtering with robustness against DOA estimation errors," IEEE Transactions on Audio, Speech and Language Processing, vol. 26, no. 1, pp. 145–160, 2018.
[3] O. Thiergart, M. Taseska, and E. A. P. Habets, "An Informed Parametric Spatial Filter based on Instantaneous Direction-of-Arrival Estimates," IEEE Transactions on Audio, Speech and Language Processing, vol. 22, no. 12, pp. 1–15, 2014.
[4] M. Farmani, M. S. Pedersen, Z.-H. Tan, and J. Jensen, "Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 3, pp. 611–623, 2017.
[5] D. P. Jarrett, E. A. Habets, M. R. Thomas, N. D. Gaubitch, and P. A. Naylor, "Dereverberation performance of rigid and open spherical microphone arrays: Theory & simulation," 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11, no. April, pp. 145–150, 2011.
[6] S. Gannot and I. Cohen, "Adaptive beamforming and postfiltering," in Handbook of Speech Processing, J. Benesty, M. M. Sondhi, and H. Yiteng, Eds. Springer Berlin Heidelberg, 2008, ch. 10, pp. 945–978.
[7] J. H. Dibiase, "A high-accuracy, low-latency technique for talker localization in reverberant environments using microphone arrays," Ph.D. dissertation, Brown University, 2000.
[8] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Transactions on Antennas and Propagation, vol. 34, pp. 276–280, Mar. 1986.
[9] R. Roy and T. Kailath, "ESPRIT-estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984–995, 1989.
[10] C. H. Knapp and G. C. Carter, "The generalized correlation method for estimation of time delay," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-24, no. 4, pp. 320–327, 1976.
[11] M. Omologo and P. Svaizer, "Acoustic source location in noisy and reverberant environment using CSP analysis," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 2, no. October 2014, pp. 921–924 vol. 2, 1996.
[12] M. Taseska and E. A. Habets, "DOA-informed source extraction in the presence of competing talkers and background noise," EURASIP Journal on Advances in Signal Processing, vol. 2017, no. 1, 2017.
[13] J. Traa and P. Smaragdis, "Multichannel source separation and tracking with RANSAC and directional statistics," IEEE/ACM Transactions on Audio Speech and Language Processing, vol. 22, no. 12, pp. 2233–2243, 2014.
[14] H. W. Löllmann, C. Evers, A. Schmidt, H. Mellmann, H. Barfuss, P. A. Naylor, and W. Kellermann, "The LOCATA challenge data corpus for acoustic source localization and tracking," in IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), Sheffield, UK, July 2018.
[15] N. I. Fisher, Statistical Analysis of Circular Data. Cambridge University Press, 1993.
[16] K. V. Mardia and P. E. Jupp, Directional Statistics. John Wiley & Sons, 2000.
[17] R. K. Cook, R. V. Waterhouse, R. D. Berendt, S. Edelman, and M. C. Thompson, "Measurement of correlation coefficients in reverberant sound fields," The Journal of the Acoustical Society of America, vol. 27, no. 6, pp. 1072–1077, 1955.
[18] J. B. Allen, D. A. Berkley, and J. Blauert, "Multi microphone signal-processing technique to remove room reverberation from speech signals," The Journal of the Acoustical Society of America, vol. 62, no. 4, pp. 912–915, 1977.
[19] A. Westermann, J. M. Buchholz, and T. Dau, "Binaural dereverberation based on interaural coherence histograms," The Journal of the Acoustical Society of America, vol. 133, no. 5, pp. 2767–2777, 2013.
[20] R. Duda, C. Avendirno, and J. R. Algazi, "An adaptable ellipsoidal head model for the interaural time difference," in ICASSP, 1999, pp. 965–968.
[21] J. Traa and P. Smaragdis, "A wrapped Kalman filter for azimuthal speaker tracking," IEEE Signal Processing Letters, vol. 20, no. 12, pp. 1257–1260, 2013.
