Multipath-enabled private audio with noise

Anadi Chaman∗, Yu-Jeh Liu∗, Jonah Casebeer†, Ivan Dokmanić∗
Departments of ∗Electrical and Computer Engineering and †Computer Science, University of Illinois at Urbana-Champaign

ABSTRACT

We address the problem of privately communicating audio messages to multiple listeners in a reverberant room using a set of loudspeakers. We propose two methods based on emitting noise. In the first method, the loudspeakers emit noise signals that are appropriately filtered so that after echoing along multiple paths in the room, they sum up and descramble to yield distinct meaningful audio messages only at specific focusing spots, while being incoherent everywhere else. In the second method, adapted from wireless communications, we project noise signals onto the nullspace of the MIMO channel matrix between the loudspeakers and listeners. Loudspeakers reproduce a sum of the projected noise signals and intended messages. Again because of echoes, the MIMO nullspace changes across different locations in the room. Thus, the listeners at focusing spots hear intended messages, while the acoustic channel of an eavesdropper at any other location is jammed. We show, using both numerical and real experiments, that with a small number of speakers and a few impulse response measurements, audio messages can indeed be communicated to a set of listeners while ensuring negligible intelligibility elsewhere.

Index Terms: Private audio communication, speech privacy, multi-channel convolutional synthesis, speech intelligibility.

1. INTRODUCTION

Consider the problem of sending audio messages to different listeners in a reverberant room, while making sure that each message can only be understood by its intended recipient. Importantly, no eavesdropper anywhere in the room should be able to understand any of the messages.
This problem is related to personal audio zones and sound field reproduction [1–9], where the goal is to reproduce different sound streams in a few predefined zones in a room while minimizing the sound level everywhere else. In most of these approaches, however, an eavesdropper with a sensitive microphone (or a good ear) can easily understand the messages. The reason is that the loudspeakers simply reproduce linearly filtered versions of desired messages which remain highly correlated with any residual error signal.

To address the problem of private audio communication, we propose two methods. As an extension of our previous work [10], the first approach communicates audio messages to intended focusing spots by emitting appropriately filtered white Gaussian noise signals from loudspeakers. The filters are constructed such that after passing through specific sets of paths and time delays, these filtered random signals sum up coherently as they arrive at the target focusing points. On the other hand, they yield incoherent signals at locations with different sets of signal propagation paths. This solution is expected to work well when a room has high spatial diversity of acoustic channels.

Project webpage: https://swing-research.github.io/private-audio/

In our second approach, the idea is to send random noise from loudspeakers in addition to message signals, such that the noise signals add up to zero only at the intended listening points, while they continue to mask the messages everywhere else. This results in the interception of clean audio messages at the focusing spots while having low intelligibility at other locations. This technique is inspired by standard methods in wireless networking on jamming eavesdroppers [11, 12]. However, to the best of our knowledge, the prior works consider fading wireless channels without explicitly considering inter-symbol interference (echoes).
While this could be a fair assumption for networks like WiFi where sampling times are much larger than propagation delays of wireless signals, this is not the case in room acoustics. Hence, we adapt this jamming scheme to work with long convolutional channels.

Privacy in multizone reproduction systems was first studied in [13] where the authors also use noise to mask message signals in "quiet" zones to reduce intelligibility. While their method is applicable in both anechoic and reverberant conditions, the performance is degraded in the presence of echoes. On the other hand, as we elaborate later, our methods critically rely on echoes and multipath propagation. In particular, our solutions exploit the spatial diversity of room impulse responses (RIRs) across different locations in a room and the redundant degrees of freedom in signal transmission provided by multiple loudspeakers. Unlike in multizone methods, however, we can only deliver messages to a small, fixed region of space. On the other hand, we achieve good performance using a rather small number of loudspeakers and impulse response measurements (in our experiments we use only six).

The problem of jamming eavesdroppers has been studied extensively in wireless communication. The theoretical foundation was laid by Shannon [14] and later extended by [15, 16] who showed the feasibility of secrecy if the communication channel of an eavesdropper is degraded. The methods in [11, 12, 17] use artificial noise; [18] showed the possibility of secret communication as a consequence of slow wireless fading. Prior works have also looked at a related problem of eavesdropper detection [19–21].

In this paper, we empirically show that unlike traditional multizone sound field reproduction which is usually degraded in reverberant environments [22, 23], both of our proposed approaches give excellent results in the presence of echoes since echoes enhance spatial diversity.
We derive conditions needed to generate desired messages at the focusing spots, and demonstrate both numerically and through real experiments that with six speakers and the knowledge of RIRs at the intended listening points, private audio communication is effectively achievable. In addition, we compare the robustness of the two approaches to system failures and uncertainties.

2. PROBLEM FORMULATION

Consider a system with $L$ loudspeakers, each emitting an audio signal to $K$ listeners. Without loss of generality, let the desired length of the signal $y_k$ at the $k$th listener be $N$. We also assume that the room impulse response (RIR) between the $k$th listener and the $i$th speaker is a sequence $h_{ki}$ which is $L_h$ long and known a priori. The signal received by the $k$th listener is given as a sum of convolutions:

$$y_k(n) = \sum_{i=1}^{L} (h_{ki} * x_i)(n), \quad n = 0, 1, \ldots, N-1, \qquad (1)$$

where $x_i \in \mathbb{R}^{L_x}$ is the signal transmitted by the $i$th speaker with length $L_x = N - L_h + 1$, and $*$ represents linear convolution. We define the intended message vector $y_{in} \in \mathbb{R}^{NK}$ as a concatenation of all $y_k \in \mathbb{R}^{N}$: $y_{in} = [y_1^\top, y_2^\top, \ldots, y_K^\top]^\top$. Similarly, we define channel matrices $H_k$ of size $N \times L L_x$ as $[H_{k1}, H_{k2}, \ldots, H_{kL}]$, where each $H_{ki}$ is a Toeplitz convolution matrix composed using $h_{ki}$. Defining $H = [H_1^\top, H_2^\top, \ldots, H_K^\top]^\top$ and $x = [x_1^\top, x_2^\top, \ldots, x_L^\top]^\top$, (1) can be rewritten as:

$$y_{in} = H x. \qquad (2)$$

If the matrix $H$ has full row rank, we can reconstruct any desired message signals at the $K$ listeners. A well-known solution to (2) is given by $x = H^\dagger y_{in}$, where $H^\dagger$ is the pseudoinverse of $H$. Though this solution suffices for message reconstruction at the listeners, it does not enforce unintelligibility at other locations. We could, however, exploit the additional degrees of freedom provided by the nullspace of $H$ to generate a suitable $x$ that ensures signal degradation outside the target focusing spots.
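As an illustrative sketch of the model in (1) and (2), the following Python snippet builds the block-Toeplitz channel matrix for a toy setup and checks it against direct convolution. The sizes, the random stand-ins for measured RIRs, and the helper names `conv_matrix` and `channel_matrix` are assumptions made for illustration only; the explicit pseudoinverse is used here solely because the toy problem is small.

```python
import numpy as np

def conv_matrix(h, n_in):
    """Toeplitz matrix T with T @ x == np.convolve(h, x) for len(x) == n_in."""
    h = np.asarray(h, dtype=float)
    T = np.zeros((len(h) + n_in - 1, n_in))
    for j in range(n_in):
        T[j:j + len(h), j] = h  # column j holds h delayed by j samples
    return T

def channel_matrix(rirs, n_in):
    """Stack H = [H_1; ...; H_K], H_k = [H_k1, ..., H_kL]; rirs has shape (K, L, L_h)."""
    return np.vstack([np.hstack([conv_matrix(h, n_in) for h in row]) for row in rirs])

rng = np.random.default_rng(0)
K, L, L_h, L_x = 2, 6, 8, 5            # hypothetical toy sizes
N = L_x + L_h - 1                      # message length, so that L_x = N - L_h + 1
rirs = rng.standard_normal((K, L, L_h))  # random stand-ins for RIRs
H = channel_matrix(rirs, L_x)          # shape (N*K, L*L_x)

# Check (2) against the sum of convolutions in (1).
x = rng.standard_normal(L * L_x)
y_direct = np.concatenate([
    sum(np.convolve(rirs[k, i], x[i * L_x:(i + 1) * L_x]) for i in range(L))
    for k in range(K)
])
matches_eq1 = np.allclose(H @ x, y_direct)

# Minimum-norm solution x = pinv(H) @ y_in for an arbitrary target.
y_in = rng.standard_normal(N * K)
x_min = np.linalg.pinv(H) @ y_in
residual = np.linalg.norm(H @ x_min - y_in)
```

With full row rank, `residual` should be at the level of floating-point error, which is the reconstruction property the text relies on.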
We note that for typical audio sampling rates, RIR lengths and message lengths, $H$ is far too large to compute the pseudoinverse explicitly. That is why we solve all least-squares design problems in this paper by the conjugate gradient method. Since the involved matrices are all block-Toeplitz, the conjugate gradient method can be efficiently implemented using fast Fourier transforms.

3. THE TWO APPROACHES

As per (2), $x$ can be suitably chosen to ensure that the message signals outside the focusing spots remain unintelligible. In this section, we present two methods to achieve this task, each constructing $x$ in a different way: (i) multichannel convolutional synthesis (MCCS) by noise and (ii) noise in the nullspace approach.

3.1. Multichannel convolutional synthesis by noise

Recall from (1) that the signal arriving at the $k$th listener is $y_k = \sum_{i=1}^{L} h_{ki} * x_i$. In this first approach, we constrain $x_i$ to be a convolution of a filter $g_i$ of length $L_g$ with a noise signal $n_i$ of length $L_n$, drawn from a standard normal distribution. This is equivalent to

$$x_i = N_i g_i, \quad i = 1, 2, \ldots, L, \qquad (3)$$

where $N_i$ is an $L_x \times L_g$ Toeplitz convolution matrix composed using the vector $n_i$, with $L_x = L_g + L_n - 1$. We define $g = [g_1^\top, g_2^\top, \ldots, g_L^\top]^\top$ and a block diagonal matrix $N$ as $N = \mathrm{diag}([N_1, N_2, \ldots, N_L])$. Then the equations in (3) can be combined for all $i \in \{1, \ldots, L\}$ to give

$$x = N g \quad \text{and} \quad y_{in} = H N g. \qquad (4)$$

Given $HN$ and $y_{in}$, $g$ can be computed using the conjugate gradient method.

This model constrains $x$ to lie on a subspace of random vectors. To understand why, consider the signal emitted by the $i$th loudspeaker, $x_i$, which can be written as

$$x_i(n) = \sum_{p=0}^{L_n - 1} n_i(p)\, g_i(n - p), \quad n = 0, 1, \ldots, L_x - 1.$$

We can interpret $x_i$ as a sum of randomly-scaled translates of the filter $g_i$.
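The construction in (3) and (4) can be sketched numerically as follows. This is a toy illustration rather than the authors' implementation: the sizes, the random stand-in RIRs, the sinusoidal "messages", and the helper names are assumptions, and a dense least-squares solve (`np.linalg.lstsq`) replaces the FFT-based conjugate gradient solver mentioned in the text.

```python
import numpy as np

def conv_matrix(h, n_in):
    """Toeplitz matrix T with T @ x == np.convolve(h, x) for len(x) == n_in."""
    h = np.asarray(h, dtype=float)
    T = np.zeros((len(h) + n_in - 1, n_in))
    for j in range(n_in):
        T[j:j + len(h), j] = h
    return T

def channel_matrix(rirs, n_in):
    """Stack H from RIRs of shape (K, L, L_h)."""
    return np.vstack([np.hstack([conv_matrix(h, n_in) for h in row]) for row in rirs])

rng = np.random.default_rng(1)
K, L, L_h, L_g, L_n = 2, 6, 8, 8, 5    # hypothetical toy sizes
L_x = L_g + L_n - 1                    # length of each emitted signal
N = L_x + L_h - 1                      # message length at each listener
assert L * L_g >= N * K                # enough filter taps for the N*K constraints

H = channel_matrix(rng.standard_normal((K, L, L_h)), L_x)

# Block-diagonal N = diag(N_1, ..., N_L), each N_i built from noise n_i.
noise = rng.standard_normal((L, L_n))
N_mat = np.zeros((L * L_x, L * L_g))
for i in range(L):
    N_mat[i * L_x:(i + 1) * L_x, i * L_g:(i + 1) * L_g] = conv_matrix(noise[i], L_g)

# Two stand-in messages, one per listener.
t = np.arange(N)
y_in = np.concatenate([np.sin(0.3 * t), np.cos(0.7 * t)])

# Solve y_in = H N g for the filters g, then form x = N g.
g, *_ = np.linalg.lstsq(H @ N_mat, y_in, rcond=None)
x = N_mat @ g
err_target = np.linalg.norm(H @ x - y_in) / np.linalg.norm(y_in)

# The same x heard through different (random) RIRs at another location,
# compared against the first listener's message.
H_e = channel_matrix(rng.standard_normal((1, L, L_h)), L_x)
err_other = np.linalg.norm(H_e @ x - y_in[:N]) / np.linalg.norm(y_in[:N])
```

In this toy setting `err_target` should be close to machine precision, while `err_other` should be far from zero, reflecting that the filters only descramble through the RIRs they were designed for.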
For all speakers, $g_i$ are constructed such that convolutions of $x_i$ with room impulse responses sum up to yield the desired messages only at the listeners. Thus, a specific set of RIRs $\{h_{ki}\}$, corresponding to the intended listener–speaker pairs, correctly descrambles the translates. In a room with rich spatial diversity, locations other than the intended listening points will be characterized by a different set of RIRs. We thus cannot expect the descrambling to yield the correct output, and the randomness of $n_i$ then ensures non-intelligibility of the resulting signal.

3.2. Noise in the nullspace

We adapt the second approach from the wireless communications literature. Concretely, $x$ is chosen as a sum of a message-carrying vector $s \in \mathbb{R}^{L L_x}$ and a noise-like signal $w \in \mathbb{R}^{L L_x}$, i.e., $x = s + w$. We construct $s$ and $w$ to satisfy $H s = y_{in}$ and $H w = 0$, so that

$$y_{in} = H(s + w) = H s. \qquad (5)$$

This is achieved by choosing $w$ as the projection of a random noise vector on the nullspace of the channel matrix $H$, i.e., $w = P_{\mathcal{N}(H)} v$, where the entries of $v$ are i.i.d. standard Gaussian and $P_{\mathcal{N}(H)}$ is the projector on the null space of $H$. As mentioned in Section 2, $H$ is typically large, which makes the direct computation of its nullspace a prohibitively complex task. Instead, we first find the projection of $v$ on the row space of $H$ by solving

$$\hat{z} = \arg\min_{z} \| v - H^\top z \|_2^2. \qquad (6)$$

We again use the conjugate gradient method to solve (6) using fast Fourier transforms since $H$ is block-Toeplitz. Once $\hat{z}$ is found, the nullspace projection $P_{\mathcal{N}(H)} v$ is simply $v - H^\top \hat{z}$.

Fig. 1: STOI scores at 2 intended listeners and one additional location using the MCCS and nullspace approaches in (a) anechoic and (b) reverberant settings. (c)-(d) Heat maps reflecting STOI scores at 4200 locations in a simulated room of size 7 m × 8 m in the reverberant case, for (c) the MCCS approach and (d) the nullspace approach. Speakers are illustrated as S1-S6.

4. CONDITIONS FOR PERFECT RECONSTRUCTION

In this section, we present the conditions needed to ensure perfect reconstruction of any set of message signals of length $N$ at the $K$ listeners (or any $y_{in} \in \mathbb{R}^{NK}$) for both approaches.

4.1. Multi-channel convolutional synthesis by noise

From (4), perfect reconstruction can be achieved if the overall channel matrix $HN$ has full row rank, $NK$. We make the assumption that the room is drawn randomly from a continuous distribution. (For example, let the corners be chosen uniformly at random within fixed balls.) We also assume that the loudspeaker and listener positions are placed at random according to an absolutely continuous distribution. These assumptions imply that the distribution of the nullspace of $H$ is absolutely continuous with respect to the Haar measure on the Grassmannian. Then, we have the following result.

Proposition 4.1. Suppose $L L_g \geq NK$. Then $HN$ has full row rank with probability 1.

Proof. We have that $\mathrm{rank}(HN) \leq \min\{\mathrm{rank}(H), \mathrm{rank}(N)\}$ by rank inequalities. With the conditions of the proposition, this implies that $\mathrm{rank}(HN) \leq NK$. The only way to have a strict inequality is that the nullspace of $H$ intersects the range of $N$ along a subspace of dimension greater than $L L_g - NK$. On the other hand, because the nullspace of $H$ is continuously distributed and independent from $N$, it will intersect the range of $N$ exactly along a subspace of dimension $L L_g - NK$ with probability 1.
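The dimension count in the proof of Proposition 4.1 can be checked numerically on a small example. The sketch below is only an illustration of the counting argument: it uses unstructured Gaussian matrices as stand-ins for the Toeplitz-structured $H$ and $N$ (an assumption, not the paper's setting), and the function name `intersection_dim` and the sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
NK, LLx = 38, 72   # rows (N*K) and columns (L*L_x) of the stand-in H

def intersection_dim(LLg):
    """Return rank(H N) and dim(null(H) ∩ range(N)) for Gaussian stand-ins."""
    H = rng.standard_normal((NK, LLx))
    N_mat = rng.standard_normal((LLx, LLg))
    rank_HN = np.linalg.matrix_rank(H @ N_mat)
    # N_mat has full column rank here, so null(H N) maps one-to-one onto
    # null(H) ∩ range(N); its dimension is (number of columns) - rank.
    return rank_HN, LLg - rank_HN

rank_ok, dim_ok = intersection_dim(48)        # L*L_g >= N*K
rank_short, dim_short = intersection_dim(30)  # L*L_g <  N*K
```

When $L L_g \geq NK$ the product attains rank $NK$ and the intersection has dimension $L L_g - NK$; when $L L_g < NK$ full row rank is impossible, since the rank cannot exceed the number of columns.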
This result implies that for most setups in sufficiently reverberant rooms, we will be able to produce the desired messages at the listener positions.

4.2. Noise in nullspace approach

From (5), $H$ needs to have full row rank for perfect reconstruction of all $y_{in} \in \mathbb{R}^{NK}$. Similar to the previous case, since $H$ is a function of the RIRs between the speaker-listener pairs, it is not completely in the user's control to ensure that it has full rank, as it depends on room geometry and the spatial diversity of RIRs. In practice, however, if we assume a randomized setup and room as in the previous section, and the conditions of Proposition 4.2 are satisfied, then $H$ can be expected to have full row rank with probability 1.

Proposition 4.2. The following conditions are necessary for perfect reconstruction of message signals at the listeners.
(a) The number of rows of $H$ should be at least as large as the length of $y_{in}$ $\implies (L_x + L_h - 1) \geq N$.
(b) There should be at least as many columns as rows in $H$.
(c) $L_x$ needs to be greater than the highest relative time delay among each listener-speaker pair.

Proof. (a) ensures that we have sufficient samples to generate the desired message length; (b) is elementary linear algebra; (c) ensures that "silent" regions do not exist within a signal generated at a listening point.

It should be noted that both of our approaches satisfy the condition in (a) with equality. Also, (b) gives a lower bound on the number of speakers, $L$, needed for reconstruction, i.e., $L \geq \frac{NK}{L_x}$. This is lower than the number of speakers needed by the MCCS approach, as per Proposition 4.1.

5. EXPERIMENTAL RESULTS

We evaluate the performance of the two proposed techniques using both numerical and real experiments. The numerical experiments are performed with 6 loudspeakers randomly placed in a simulated convex room of size 7 m × 8 m having walls with absorption coefficient 0.35.
RIRs between the speakers and listeners are calculated based on the image source model, using the pyroomacoustics package [24].

We perform the real experiments in an office space of size 10 m × 6 m using two Genelec 8030B and four Genelec 8010A loudspeakers. The RIRs are measured using the exponential sine sweep technique [25]. In all experiments, the power of signals emitted by the loudspeakers is kept fixed. The intelligibility of the generated sounds is assessed using the Short-Time Objective Intelligibility (STOI) [26] measure.

5.1. Numerical experiments

5.1.1. Perfect reconstruction: A case for echoes

In order to provide insight into the importance of echoes in our solution, we first perform an experiment in a simulated anechoic room. We randomly place two listeners inside the room and calculate STOI scores of the signals arriving there using the two approaches. An additional location is randomly chosen to examine the signal degradation outside the target focusing spots. We then repeat the same experiment but in the presence of echoes.

Fig. 1 (a) shows that in the anechoic setting, while the signal at the first listener has high intelligibility with STOI scores close to 1 for both approaches, the second listener does not. On the other hand, Fig. 1 (b) shows that in the presence of echoes, signal intelligibility is restored at the second listener as well. This indicates that the spatial diversity provided by echoes helps in conditioning the channel matrix $H$, which in turn supports perfect reconstruction of messages at target locations.

5.1.2. Signal degradation outside focusing spots

Both Fig. 1 (a) and (b) indicate that the nullspace-based method has a greater impact on signal degradation at the location chosen outside the focusing spots. To examine this further, we calculate STOI scores at 4200 locations in a simulated reverberant room and create heat maps as shown in Fig. 1 (c) and (d).
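The jamming mechanism of the nullspace approach can be illustrated on a toy problem. The sketch below is not the experiment reported here: it uses random stand-in RIRs instead of simulated rooms, a signal-to-jamming ratio instead of STOI, and dense `np.linalg.lstsq` solves in place of the FFT-based conjugate gradient method; the helper names, the sizes, and the scaling factor `sigma` are assumptions for illustration.

```python
import numpy as np

def conv_matrix(h, n_in):
    """Toeplitz matrix T with T @ x == np.convolve(h, x) for len(x) == n_in."""
    h = np.asarray(h, dtype=float)
    T = np.zeros((len(h) + n_in - 1, n_in))
    for j in range(n_in):
        T[j:j + len(h), j] = h
    return T

def channel_matrix(rirs, n_in):
    """Stack H from RIRs of shape (K, L, L_h)."""
    return np.vstack([np.hstack([conv_matrix(h, n_in) for h in row]) for row in rirs])

rng = np.random.default_rng(3)
K, L, L_h, L_x = 2, 6, 8, 10           # hypothetical toy sizes
N = L_x + L_h - 1
H = channel_matrix(rng.standard_normal((K, L, L_h)), L_x)    # intended listeners
H_e = channel_matrix(rng.standard_normal((1, L, L_h)), L_x)  # some other location

# Message-carrying part: minimum-norm s with H s = y_in.
y_in = rng.standard_normal(N * K)
s, *_ = np.linalg.lstsq(H, y_in, rcond=None)

# Noise part: solve (6), then w = v - H^T z_hat lies in the nullspace of H.
v = rng.standard_normal(L * L_x)
z_hat, *_ = np.linalg.lstsq(H.T, v, rcond=None)
w = v - H.T @ z_hat

leak_at_targets = np.linalg.norm(H @ w)              # noise cancels at the listeners
err_at_targets = np.linalg.norm(H @ (s + w) - y_in)  # message is unaffected, as in (5)

def sjr_db(sigma):
    """Signal-to-jamming ratio (dB) at the other location for noise scaled by sigma."""
    signal = np.linalg.norm(H_e @ s) ** 2
    jam = np.linalg.norm(H_e @ (sigma * w)) ** 2
    return 10 * np.log10(signal / jam)

sjrs = [sjr_db(sigma) for sigma in (1.0, 2.0, 4.0)]
```

The projected noise vanishes at the intended listeners but not through the other location's RIRs, and the ratio in `sjrs` drops as the noise is scaled up.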
In both plots, the bright spots at the locations of intended listeners indicate high intelligibility. However, regions outside the focusing spots in Fig. 1 (d) have relatively lower STOI scores as compared to Fig. 1 (c), indicating better jamming capabilities of the nullspace approach.

Both methods perform signal degradation outside the focusing spots using noise. To understand how these random signals result in unintelligibility of sound, we first investigate the role of noise variance. For 100 randomly selected speaker-listener configurations, we check the impact of increasing noise variance on STOI values for both methods. Fig. 2 (a) shows a decline in median STOI scores as the input noise power is increased for the nullspace approach, whereas they do not change much for the MCCS method. This result is not surprising because in the nullspace approach, noise is fed into the loudspeakers with the message signals in an additive sense. Thus, a deterioration of SNR and subsequent STOI decline is expected with an increase in noise variance. However, the signal emitted by the $i$th loudspeaker is $x_i = n_i * g_i$ for the MCCS method. Here, if the variance of $n_i$ is increased, $g_i$ simply gets scaled to preserve the original $x_i$.

Fig. 2: (a) STOI vs. noise variance for the 2 methods outside focusing spots. (b) STOI vs. noise length as a proportion of overall input length (ratio of $L_n$ and $L_x$) for the MCCS approach outside focusing spots.

We now investigate the factors that impact the jamming capability of the MCCS approach. Recall that this method involves "scrambling" of message-carrying input filters $g_i$ by noise, which are then appropriately descrambled at the intended locations by the correct RIR values.
Thus, we expect that longer noise vectors would have a stronger impact on signal integrity when the RIR changes. To verify this claim, we vary the length of noise vectors $L_n$ as a proportion of a fixed length $L_x$, and calculate the STOI scores for 100 randomly chosen speaker-listener configurations. Fig. 2 (b) verifies that increasing the length of noise vectors leads to a decrease in median intelligibility scores outside the focusing spots.

These results point towards an interesting phenomenon. Given unlimited available input power at the speakers, one could arbitrarily improve jamming by increasing noise power in the nullspace method. However, in the MCCS approach, an arbitrary increase in jamming by increasing $L_n$ is not feasible, because for a fixed message length $N$ and fixed $L_h$, $L_x = L_g + L_n - 1$ is fixed, and one can only increase $L_n$ as long as $L_g \geq \frac{NK}{L}$ (from Proposition 4.1).

5.1.3. Robustness to system failures and uncertainties

We assess how the reconstruction of audio messages at the target listeners is affected by system failures and uncertainties: (i) malfunction of loudspeakers while emitting audio signals, and (ii) errors in RIR measurements. We ran simulations over 100 random speaker–listener configurations and examined the behavior of the STOI scores.

In (i), we compute the appropriate $x_i$ (to be emitted by the $i$th loudspeaker) for a system of 6 speakers. However, while measuring STOI at the listeners, not all speakers are used. Fig. 3 (a) shows that the STOI scores decline as more speakers are dropped, and the decline is more rapid for the nullspace method as compared to the MCCS approach.

Fig. 3: Robustness analysis. Impact of (a) speaker malfunction (STOI vs. number of speakers dropped) and (b) inaccuracies in RIR estimates (STOI vs. ratio of RIR noise power and true RIR average power) on STOI scores at focusing spots.

In (ii), we analyze the robustness to channel measurement errors by computing $x_i$ using RIR values with white Gaussian noise added to them. These erroneous $x_i$ are then convolved with the true RIRs to compute the signals arriving at focusing spots. Fig. 3 (b) indicates that errors in the knowledge of RIRs before signal transmission by the loudspeakers lead to reduced intelligibility at the focusing spots. Again, the MCCS approach shows more robustness to uncertainties as compared to the nullspace approach.

5.2. Experiment in a real setting

We perform an experiment to evaluate the two approaches in a real room with 6 loudspeakers and measure the STOI scores of generated sounds with microphones at 7 locations. The experimental setup is shown in Fig. 4 (a). Two microphones are chosen to be the focusing spots, and the rest are placed at increasing distances from Spot 2. Fig. 4 (b) shows the measured STOI values. The observed intelligibility at the two spots is good with high STOI scores, and the signals become considerably degraded 50 cm away from the focusing spots. As expected from simulations, the nullspace approach has a stronger impact on signal degradation outside the target listeners.

Fig. 4: (a) Experimental setup: speakers represented in green, and microphones in red boxes. (b) STOI values measured at two focusing spots, and at different distances (10 cm, 20 cm, 30 cm, 50 cm, 100 cm) from Spot 2 in a real room setting.

6. CONCLUSION

We present two approaches to address the private audio communication problem in a reverberant room.
Both approaches are based on emitting noise signals from loudspeakers and then utilizing echoes in the room to ensure that they yield intelligible messages at selected locations, while being incoherent elsewhere. Simulated and real experiments suggest that with just 6 loudspeakers and a few impulse response measurements, we can deliver clear audio messages at the desired locations while ensuring unintelligibility everywhere else.

7. REFERENCES

[1] M. Poletti, "An investigation of 2-D multizone surround sound systems," in 125th Audio Engineering Society Convention, Oct 2008.
[2] Y. J. Wu and T. D. Abhayapala, "Spatial multizone soundfield reproduction: Theory and design," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 6, pp. 1711–1720, Aug 2011.
[3] T. Betlehem, W. Zhang, M. A. Poletti, and T. D. Abhayapala, "Personal sound zones: Delivering interface-free audio to multiple listeners," IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 81–91, March 2015.
[4] S. J. Elliott, J. Cheer, J. Choi, and Y. Kim, "Robustness and regularization of personal audio systems," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 7, pp. 2123–2133, Sep. 2012.
[5] Y. Cai, M. Wu, and J. Yang, "Sound reproduction in personal audio systems using the least-squares approach with acoustic contrast control constraint," The Journal of the Acoustical Society of America, vol. 135, no. 2, pp. 734–741, 2014. [Online]. Available: https://doi.org/10.1121/1.4861341
[6] J.-W. Choi and Y.-H. Kim, "Generation of an acoustically bright zone with an illuminated region using multiple sources," The Journal of the Acoustical Society of America, vol. 111, no. 4, pp. 1695–1700, 2002.
[7] A. J. Berkhout, D. de Vries, and P. Vogel, "Acoustic control by wave field synthesis," The Journal of the Acoustical Society of America, vol. 93, no. 5, pp. 2764–2778, 1993.
[8] D. B. Ward and T. D. Abhayapala, "Reproduction of a plane-wave sound field using an array of loudspeakers," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 6, pp. 697–707, 2001.
[9] W. Jin, W. B. Kleijn, and D. Virette, "Multizone soundfield reproduction using orthogonal basis expansion," in IEEE International Conference on Acoustics, Speech and Signal Processing, May 2013, pp. 311–315.
[10] Y. Liu, J. Casebeer, and I. Dokmanić, "Cocktails, but no party: Multipath-enabled private audio," in 16th International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2018, pp. 186–190.
[11] R. Negi and S. Goel, "Secret communication using artificial noise," in 62nd IEEE Vehicular Technology Conference, vol. 3, Sep. 2005, pp. 1906–1910.
[12] S. Goel and R. Negi, "Guaranteeing secrecy using artificial noise," IEEE Transactions on Wireless Communications, vol. 7, no. 6, pp. 2180–2189, June 2008.
[13] J. Donley, C. Ritz, and W. B. Kleijn, "Improving speech privacy in personal sound zones," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2016, pp. 311–315.
[14] C. E. Shannon, "Communication theory of secrecy systems," The Bell System Technical Journal, vol. 28, no. 4, pp. 656–715, Oct 1949.
[15] I. Csiszar and J. Korner, "Broadcast channels with confidential messages," IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 339–348, May 1978.
[16] A. D. Wyner, "The wire-tap channel," Bell System Technical Journal, vol. 54, no. 8, pp. 1355–1387, 1975. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/j.1538-7305.1975.tb02040.x
[17] S. Goel and R. Negi, "Secret communication in presence of colluding eavesdroppers," in IEEE Military Communications Conference, vol. 3, Oct 2005, pp. 1501–1506.
[18] J. Barros and M. R. D. Rodrigues, "Secrecy capacity of wireless channels," in IEEE International Symposium on Information Theory, July 2006, pp. 356–360.
[19] A. Mukherjee and A. L. Swindlehurst, "Detecting passive eavesdroppers in the MIMO wiretap channel," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2012, pp. 2809–2812.
[20] A. Chaman, J. Wang, J. Sun, H. Hassanieh, and R. Roy Choudhury, "Ghostbuster: Detecting the presence of hidden eavesdroppers," in Proceedings of the 24th Annual International Conference on Mobile Computing and Networking. ACM, 2018, pp. 337–351.
[21] C. Stagner, A. Conrad, C. Osterwise, D. G. Beetner, and S. Grant, "A practical superheterodyne-receiver detector using stimulated emissions," IEEE Transactions on Instrumentation and Measurement, vol. 60, no. 4, pp. 1461–1468, April 2011.
[22] W. Jin and W. B. Kleijn, "Theory and design of multizone soundfield reproduction using sparse methods," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2343–2355, Dec 2015.
[23] T. Betlehem and T. D. Abhayapala, "Theory and design of sound field reproduction in reverberant rooms," The Journal of the Acoustical Society of America, vol. 117, no. 4, pp. 2100–2111, 2005. [Online]. Available: https://doi.org/10.1121/1.1863032
[24] R. Scheibler, E. Bezzam, and I. Dokmanić, "Pyroomacoustics: A python package for audio room simulation and array processing algorithms," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), April 2018, pp. 351–355.
[25] A. Farina, "Simultaneous measurement of impulse response and distortion with a swept-sine technique," in 108th Audio Engineering Society Convention, Feb 2000. [Online]. Available: http://www.aes.org/e-lib/browse.cfm?elib=10211
[26] C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "A short-time objective intelligibility measure for time-frequency weighted noisy speech," in IEEE International Conference on Acoustics, Speech and Signal Processing, March 2010, pp. 4214–4217.
