Close Miking Empirical Practice Verification: A Source Separation Approach

Close miking represents a widely employed practice of placing a microphone very near to the sound source in order to capture more direct sound and minimize any pickup of ambient sound, including other, concurrently active sources. It is used by the a…

Authors: Konstantinos Drossos, Stylianos Ioannis Mimilakis, Andreas Floros

Konstantinos Drossos
Audio Research Group, Dept. of Signal Processing, Tampere University of Technology, Tampere, Finland.

Stylianos Ioannis Mimilakis*
Fraunhofer IDMT, Ilmenau, Germany.

Andreas Floros
Lab of Audiovisual Signal Processing, Dept. of Audiovisual Arts, Ionian University, Corfu, Greece.

Tuomas Virtanen
Audio Research Group, Dept. of Signal Processing, Tampere University of Technology, Tampere, Finland.

Gerald Schuller
Technical University of Ilmenau, Ilmenau, Germany.

* Correspondence should be addressed to mis@idmt.fraunhofer.de

"Evaluation of close miking technique", presented at the 142nd AES Convention, Berlin, 2017.

Abstract

Close miking represents a widely employed practice of placing a microphone very near to the sound source in order to capture more direct sound and minimize any pickup of ambient sound, including other, concurrently active sources. It has been used by the audio engineering community for decades for audio recording, based on a number of empirical rules that evolved during the recording practice itself. But can this empirical knowledge and close miking practice be systematically verified? In this work we aim to address this question based on an analytic methodology that employs techniques and metrics originating from the sound source separation evaluation field. In particular, we apply a quantitative analysis of the source separation capabilities of the close miking technique. The analysis is applied on a recording dataset obtained at multiple positions of a typical musical hall, multiple distances between the microphone and the sound source, multiple microphone types and multiple level differences between the sound source and the ambient acoustic component. For all the above cases we compute the Source to Interference Ratio (SIR) metric. The results obtained clearly demonstrate an optimum close-miking performance that matches the current empirical knowledge of professional audio recording.

I. Introduction

Capturing sound through electro-acoustic transducers is one of the fundamental tasks in audio engineering. In practice, although audio recording is not restrained by particular specifications, there are applications where certain restrictions apply, for example when capturing an audio source output in the presence of other active audio sources (e.g. at a live music performance, or in a live recording session). In these cases, the presence of the latter sound sources introduces ambient noise, which is added to the ambient noise of the recording space (if any).

A widely employed technique for capturing an audio source that is simultaneously active with other sources is commonly known as close miking [1]. It defines the microphone's placement close to the sound source and in nearly all cases it is used as a rule of thumb [2]. With this specific placement of the microphone, the captured audio tends to contain more energy from the targeted source than from the surrounding ones. Hence, close miking effectively functions as a mechanical source separation method that aims to separate the signal of the targeted source from the mixture of the sound field that is created by all concurrently active sound sources. The suggested distance for the microphone placement roughly spans from 0.03 to 1 meter away from the targeted source [1], balancing the trade-off between affecting the timbre of the targeted source and the pickup of unwanted sources. Although this technique is widely used, to the best of the authors' knowledge there is no previous study that systematically verified the above microphone distance or evaluated its effect on the resulting captured audio in terms of source separation.

The field of source separation is not recent. It regards the estimation of individual signal components, denoted as sources, from their observed mixtures, and there are numerous published works focusing on this paradigm [3]. Source separation has been utilized in many applications spanning from audio signal processing, e.g. for audio upmixing [4], stereo image enhancement [5, 6], harmonic-percussive separation [7], source modeling [8] and singing voice/solo separation [9, 10], to neurological studies, for separating different electrical sources during physiological signal measurements [11], and satellite images, e.g. for detecting the actual morphology of the ground [12].

For evaluating source separation techniques, a couple of strategies have been proposed. More specifically, in [13] a set of metrics is presented that can assess the extracted information from the mixture, taking into account the produced artifacts (i.e. deformations induced by the separation algorithm, such as musical noise), noise (energy perturbations that correspond neither to the extracted source nor to the interfering ones) and interference (a deformation of unwanted sources contributing to the extracted information). Focusing on modeling and measuring the interference of unwanted sources subject to a targeted one, the notion of disjoint orthogonality is introduced in [14]. Assuming that non-interfering sources are completely orthogonal to each other in a signal domain, i.e. the short-time Fourier transform (STFT), the degree of overlap that the sources might have can be estimated, providing an intuitive estimation of the total interference [14].
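
To make the overlap idea concrete, the following minimal sketch assigns every time-frequency bin to whichever source has the larger STFT magnitude and then compares the energies of the two sources inside the bins won by the target. It is only an illustration: the function names, the Hann window, and the frame and hop sizes are assumptions chosen for readability, not the analysis parameters of the evaluation scheme in [18].

```python
import numpy as np

def stft_mag(x, frame=1024, hop=256):
    """Magnitude STFT with a Hann window (frame and hop are illustrative)."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame:
        raise ValueError("signal shorter than one analysis frame")
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame] * window for i in range(n_frames)]
    )
    # Rows are time-frames m, columns are frequency bins k.
    return np.abs(np.fft.rfft(frames, axis=1))

def mask_sir_db(target, interferer, eps=1e-12):
    """Energy ratio (dB) of target to interferer inside the target's mask."""
    S = stft_mag(target)
    N = stft_mag(interferer)
    M = (S >= N).astype(float)      # binary mask: 1 where the target dominates
    num = np.sum((M * S) ** 2)      # squared Frobenius norm of masked target
    den = np.sum((M * N) ** 2)      # squared Frobenius norm of masked interferer
    # eps only guards against division by zero or log of zero.
    return 10.0 * np.log10((num + eps) / (den + eps))
```

Because the ratio is taken only over bins where the target magnitude is at least the interferer magnitude, this particular sketch never returns a negative value; a quieter interferer simply pushes the result higher.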

Since close miking aims at separating the targeted source from the mixture of the total sound field that is created by all the active sound sources, it can be considered as a source separation technique and its effect in realistic scenarios can be evaluated by the above-mentioned strategies. In this work we evaluate the close miking technique under this perspective. We employ the aforementioned method for source separation evaluation based on the orthogonality assumption [14], and assess the effect of distance, targeted sound source sound pressure level, interfering noise sound pressure level, angle of the microphone with respect to the central axis of the targeted sound source, and different types of microphone lobes by means of the signal to interference ratio (SIR), essentially objectifying the choice of microphone placement. To that end, we conducted a series of measurements in a reverberant room, i.e. an empty theater, with two sound sources and a sound level meter for calibrating the reproduction levels.

The rest of the paper is organized as follows. Section II provides an overview of the existing literature that focuses on close miking, along with the presentation of the appropriate metrics and their computation. Section III outlines the methodology followed for the performed measurements, while Section IV contains the obtained results. Finally, Section V holds the discussion of the results and Section VI concludes the paper and proposes future work.

II. Existing work

i. Close miking technique

Close miking is largely based on empirical knowledge and a set of general guidelines that define the location and distance of the microphone from the sound source [1]. Existing studies focus particularly on two different aspects.
The first considers the varying spectral information and perceived timbre of the musical sound sources. The second regards the inspection of the close miking technique from a signal processing point of view and its relation to room acoustics.

Focusing on the first aspect, in [15] recordings of a variety of musical instruments and human voice are employed. These recordings are performed using different microphone placement distances, ranging from 0.03 to 1 meter. The recorded signals are transformed into the frequency domain and compared with the emanation patterns of each examined sound source. As an outcome, different equalization techniques are proposed depending on the placement of the recording microphone. Following the same approach, a work more centered on human voice is presented in [16]. It examines the distance of the placement of the microphone and its effect on the perceptual spectral content. Finally, in [17] an assessment of microphone placement with respect to the ambience reflections, transmitted to the recording device, and timbre is presented. Different microphone–source distances are examined alongside various angle orientations of the microphone with respect to the central axis of the sound source (i.e. [15°, 30°, 45°, 60°, 90°]).

Differentiating from the above studies, the work in [2] evaluates close miking from a different signal processing perspective. In particular, this work aims to validate the close miking technique by examining the effect of the excitation of the surrounding acoustic space. To do so, sound sources are recorded at various distances and the recorded signals are subjectively assessed for their perceptual suppression of the reverberation effect. Nonetheless, all the literature described above relies on the empirical knowledge of the relative distance between the microphone and the sound source.
A quantified answer regarding the definition of this distance range has not yet been proposed.

ii. Computation of SIR

For the evaluation of the source separation capabilities of the close miking technique, we employed the Signal to Interference Ratio (SIR) metric. Usually, this metric is used in the evaluation of the source separation task and indicates the energy ratio between a signal, separated from a mixture of signals, and the interference from the mixture that is apparent in the separated signal.

More formally, let $x$ be a vector denoting a single-channel (monaural) mixture consisting of 2 additive sources expressed as vectors $s$ and $n$. Given that each source is known beforehand, the degree of overlap between the targeted source $s$ and the interfering source $n$ can be computed, yielding the objective measure of SIR. To do so, an analysis operator $T$ is applied to each source (targeted and interfering one) as follows:

\[ S(m, k) = T(s), \tag{1} \]
\[ N(m, k) = T(n), \tag{2} \]

where $T$ corresponds to the STFT analysis operation using the parameters proposed by a standard source separation evaluation (SSE) scheme [18], and $m$, $k$ denote the time-frames and frequency bins (sub-bands), respectively. For the computation of SIR, given a pair of sources, the method presented in [14] is followed. Therefore, a time-frequency filtering operation, i.e. time-frequency masking, is derived from Eq. 3:

\[ M(m, k) = \begin{cases} 1, & \text{if } |S(m, k)| \geq |N(m, k)| \\ 0, & \text{otherwise.} \end{cases} \tag{3} \]

Then, by taking into account all the available time-frequency samples $m$, $k$ and expressing as matrices the output of Equations 1–3, the SIR is computed as follows:

\[ SIR = 10 \log_{10} \left( \frac{\| M \odot |S| \|_F^2}{\| M \odot |N| \|_F^2} \right), \tag{4} \]

where $|\cdot|$ refers to the modulus, i.e. the magnitude, of the time-frequency representation of each source, $\odot$ is an element-wise multiplication, and $\|\cdot\|_F^2$ denotes the squared Frobenius norm.

The values of SIR will approach $+\infty$ when the magnitude of the acquired signal $S(m, k)$, for each time-frame and frequency sub-band, is superior to the interfering one. On the other hand, when the values approach $-\infty$, the interfering source completely dominates over their mixture. Essentially, this leads to a straightforward assessment of how well a method acquires the targeted signal $s$ in the presence of outliers.

III. Experimental procedure

The experimental procedure of the work at hand is separated in two tasks: i) the recording of the individual signals, and ii) the computation of SIR subject to each recording of a pair of sources. The former was carried out in a municipal theater located in Lixouri, Kefalonia (Ionian Islands, Greece), before the disastrous earthquakes in the autumn of 2014, and resulted in the formation of the audio dataset employed by the SSE task. The latter was implemented by utilizing the signal model described in Section II.ii.

The aim of the first task is to provide a comprehensive set of recorded material containing the source signal, the noise signal, and the mixture of both. Each recorded waveform is characterized by: a) the distance between the microphone and the signal source, b) the type of microphone, and c) the sound pressure level (SPL) of the actual source and the noise source. Various combinations of the above factors were considered in the particular task. On the other hand, the second task involves the evaluation of the performance of close miking as a source separation method. The expected outcome is to determine the effective limits and the relations between the key factors mentioned above, subject to an objective measure. In the following sections the above tasks are presented in detail.

i. Recordings procedure

The audio recordings were performed using a musical instrument amplifier, one loudspeaker, one laptop with recording software and a digital sound card, two microphones (one dynamic and one condenser/measurement), and one Sound Level Meter (SLM). One microphone was omni-directional, while the other had a cardioid lobe. The full list of all equipment parts is provided in Table 1.

Table 1: List of the equipment used for audio recordings

| Apparatus | Model |
|---|---|
| SLM | B&K 2250 Type A SLM |
| Mic. A | Shure SM57, dynamic, cardioid |
| Mic. B | Behringer ECM8000, condenser, omni-directional |
| Laptop | Macbook Pro 15” |
| Recording software | Digidesign ProTools M-Powered 8 |
| Digital sound card | M-Audio Fast Track Ultra |
| Musical instrument amplifier | Behringer V-Tone GMX212 |
| Loudspeaker | Electrovoice SX300 |

Close miking aims at diminishing the addition of the noise in the final audio mixture. The prime element that affects the efficacy of this technique is the distance between the microphone and the sound source. But since, on one hand, the distance between the microphone and the actual sound source can result in an attenuation of the SPL and, on the other hand, different sound sources in a real-world scenario are likely to exhibit varying SPL, the question of the effect of SPL in the close miking technique is also raised.
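
As a rough feel for how strongly distance alone can attenuate the SPL, the sketch below applies the textbook inverse-distance law (about 6 dB per doubling of distance). This is an assumption that holds only for a point-like source in a free field; very close to a loudspeaker cabinet and on a reverberant stage the real attenuation will deviate from it. The function name is hypothetical.

```python
import math

def level_drop_db(d_near, d_far):
    """Level drop in dB when moving from d_near to d_far (free-field assumption)."""
    return 20.0 * math.log10(d_far / d_near)

# Distances in meters, taken from the 0.03 m to 1 m range discussed in the text.
for d in (0.03, 0.06, 0.30, 1.00):
    drop = level_drop_db(0.03, d)
    print(f"{d:.2f} m: {drop:5.1f} dB below the level at 0.03 m")
```

Under this idealized model the full 0.03 m to 1 m range spans roughly 30 dB, which is large compared with the 3 dB steps between the source and noise levels used in the measurements.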

Finally, various receiving patterns of microphones are utilized in a recording session. These affect the effective SPL recorded by the microphone and thus different microphone lobes may produce divergent results in close miking. In addition, there are references to the utilization of an angle between the central axes of the microphone and the sound source in order to achieve improved attenuation of the noise received by the microphone.

Figure 1: The set-up of the measurements (cardioid and omni-directional microphones, noise source, audio I/O, recording device).

In order to allow the investigation of the effect of the source–microphone distance, the source's SPL, the microphone's lobe and the microphone–sound source angle on the particular technique, the experimental set-up presented in Figure 1 was used: 2 audio sources, 1 laptop, 1 digital sound card and two microphones were utilized for the recording. For the angle case, the cardioid microphone was used. The details of each component are listed in Table 1. Thus, the loudspeaker served as the noise source, the musical instrument amplifier as the targeted sound source, and the other components are self-explanatory with respect to their utilization in the experimental process. All apparatuses employed are rather common in music performances, a case where the close miking technique is commonly encountered. As source signal, a clean guitar riff (repetitive tonal sound from an electric guitar, amplified through the corresponding amplifier) was employed without applying any distorting sound effects. In order to introduce ambient noise, a pink noise generator was activated. Each signal had a time length of 15 seconds.

Figure 2: $T_{20}$, $T_{30}$ and $EDT$ measured in the stage of the theater at which the recordings took place (axes: Frequency (kHz), Time (sec)).

The recording process consisted of two phases.
The first realized the reverberation measurement of the recording room, while the second included the actual recordings. Regarding the latter, different recordings were considered with a) the omni-directional microphone, b) the cardioid microphone with its central axis aligned with the central axis of the sound source and c) the cardioid microphone placed at an angle of 45 degrees relative to the sound source's central axis.

The reverberation measurement was implemented with the use of the SLM at six different positions on the stage of the theater, forming a hexagon. For all positions the $T_{20}$, $T_{30}$ and Early Decay Time ($EDT$) values were obtained. The results are illustrated in Figure 2.

Clearly, the theater stage can be considered fairly reverberant, especially in the region of 2.2 kHz. This fact allows our investigation to be performed in a rather unfavorable environment; thus it can provide results that correspond to cases where close miking would be favored in order to eliminate capturing of audio signals emerging from all noise sources, including ambient noise. Regarding the second phase of the recording process, 12 different sound source–microphone distances were employed, with index $i_d \in [1, 12]$ and ranging from 0.03 to 1 meter. The third recording type (with the cardioid microphone placed at an angle of 45 degrees relative to the sound source's central axis) included 10 additional distances, ranging from 0.03 to 0.30 meters, marked as $i_d \in [1, 10]$. For clarity, these values are summarized in Table 2.

Table 2: Sound source–microphone distances used in the first phase of the recording procedure

| Index $i_d$ | Distance (m) |
|---|---|
| 01 | 0.03 |
| 02 | 0.06 |
| 03 | 0.09 |
| 04 | 0.12 |
| 05 | 0.15 |
| 06 | 0.18 |
| 07 | 0.21 |
| 08 | 0.24 |
| 09 | 0.27 |
| 10 | 0.30 |
| 11 | 0.65 |
| 12 | 1.00 |
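
For readers who want to reproduce the measurement grid in code, the distances of Table 2 can be generated rather than typed in. The sketch below is only a convenience illustration; the function name is hypothetical and the values are simply those listed in the table.

```python
def table2_distances():
    """Return the 12 source-microphone distances of Table 2, in meters."""
    # Indices 1 to 10: multiples of 0.03 m, rounded to avoid float noise.
    close = [round(0.03 * i, 2) for i in range(1, 11)]
    # Indices 11 and 12: the two larger distances listed in the table.
    far = [0.65, 1.00]
    return close + far

distances = table2_distances()
for index, d in enumerate(distances, start=1):
    print(f"i_d = {index:02d}: {d:.2f} m")
```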

Up to 0.3 meters the distance increment step equals 0.03 meters. Above that limit, it becomes 0.35 meters. The reason for that is the apparent evidence in the existing literature that, above 0.3 meters, the close miking technique suffers from leakage and interference when the sound pressure level of unwanted sources is high relative to that of the desired source [1, 2]. Also, in the second phase the utilized distances are those with index $i_d \leq 10$.

Moreover, different SPL values were employed for both sound and noise sources and for all microphone lobe cases. For the sound source, a set of 3 different SPL values, $SPL_S[i_S]$, $i_S \in [1, 3]$, was used, whereas for the noise source a set of 5, $SPL_N[i_N]$, $i_N \in [1, 5]$. This information is listed in Table 3.

Table 3: SPL values used for the recordings procedure (SPL reference $P_{ref} = 2 \times 10^{-4}$)

| Source | Index | SPL |
|---|---|---|
| Sound source ($SPL_S$) | $SPL_S[1]$ | 100 dB SPL |
| Sound source ($SPL_S$) | $SPL_S[2]$ | 97 dB SPL |
| Sound source ($SPL_S$) | $SPL_S[3]$ | 94 dB SPL |
| Noise source ($SPL_N$) | $SPL_N[1]$ | 100 dB SPL |
| Noise source ($SPL_N$) | $SPL_N[2]$ | 97 dB SPL |
| Noise source ($SPL_N$) | $SPL_N[3]$ | 94 dB SPL |
| Noise source ($SPL_N$) | $SPL_N[4]$ | 91 dB SPL |
| Noise source ($SPL_N$) | $SPL_N[5]$ | 88 dB SPL |

The different SPLs for the sound and noise source have a variation step of 3 dB SPL, since this difference corresponds to two times the acoustic energy. Also, there are 3 different $SPL_S$: one that can be considered as high, one as medium and one as low. In conjunction with the $SPL_N$, these values allow the investigation of the different SPL effect. More specifically, each $SPL_S$ was used with every $SPL_N$, i.e. for $SPL_S[1]$ all $SPL_N$ were utilized for the noise source, and the same stands for $SPL_S[2]$ and $SPL_S[3]$. Thus, for $SPL_S[1]$ it can be seen that the selected $SPL_N$ span the dynamic range from equal SPL to $1/2^4$ times the acoustic energy (for the case of $SPL_S[1]$ and $SPL_N[5]$).
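
The relation between the 3 dB step and the energy factors quoted in the text can be checked numerically. The short sketch below converts every noise/source level pair of Table 3 into a noise-to-source energy ratio using $10^{\Delta L / 10}$; the variable and function names are illustrative only. Note that 3 dB is a factor of about 1.995 rather than exactly 2, so the 12 dB case comes out close to, but not exactly, $1/2^4$.

```python
SPL_S = [100, 97, 94]            # sound source levels, dB SPL (Table 3)
SPL_N = [100, 97, 94, 91, 88]    # noise source levels, dB SPL (Table 3)

def energy_ratio(noise_db, source_db):
    """Acoustic energy of the noise relative to the source, from levels in dB."""
    return 10.0 ** ((noise_db - source_db) / 10.0)

for s in SPL_S:
    ratios = [energy_ratio(n, s) for n in SPL_N]
    formatted = ", ".join(f"{r:.3f}" for r in ratios)
    print(f"source {s} dB SPL -> noise/source energy ratios: {formatted}")
```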

In the case of $SPL_S[2]$, the dynamic range of SPL corresponds to double acoustic energy emerging from the noise source, as well as the same, half, one quarter and one eighth acoustic energy for the noise. Regarding $SPL_S[3]$, it can be seen that the selected SPL for the noise source corresponds to quadruple, double, equal, half and one quarter acoustic energy emerging from the noise source. All SPLs were calculated in terms of $L_{eq}$, with a time length average equal to the time length of both the sound and noise source signal (i.e. 15 seconds).

The actual recordings were performed for each microphone and for all $SPL_S$, $SPL_N$ and appropriate sound source–microphone distances. In particular, if $D[i_d]$ are the different distances as presented in Table 2, $M_t$, $t \in [1, 2]$ are the different microphone lobes with $M_1$ the omni-directional and $M_2$ the cardioid lobe, $Ang[i_{ang}]$, $i_{ang} \in [1, 3]$ the angle between the microphone's and the source's central axis, with $Ang[1] = 0^\circ$, $Ang[2] = 30^\circ$ and $Ang[3] = 45^\circ$, since the effect is minimal for lower angle variations [17], and $SPL_S[i_S]$ and $SPL_N[i_N]$ the different SPLs for the sound and noise source respectively, then the following recording sets, $R_i$, were created:

\[ R_1 = \{ D[i_d], M_1, Ang[1], SPL_N[i_N] \} \tag{5} \]
\[ R_2 = \{ D[i_d], M_1, Ang[1], SPL_S[i_S] \} \tag{6} \]
\[ R_3 = \{ D[i_d], M_2, Ang[1], SPL_N[i_N] \} \tag{7} \]
\[ R_4 = \{ D[i_d], M_2, Ang[1], SPL_S[i_S] \} \tag{8} \]
\[ R_5 = \{ D[i'_d], M_2, Ang[2], SPL_N[i_N] \} \tag{9} \]
\[ R_6 = \{ D[i'_d], M_2, Ang[2], SPL_S[i_S] \} \tag{10} \]
\[ R_7 = \{ D[i'_d], M_2, Ang[3], SPL_N[i_N] \} \tag{11} \]
\[ R_8 = \{ D[i'_d], M_2, Ang[3], SPL_S[i_S] \} \tag{12} \]

where $i_d \in [1, 12]$, $i'_d \in [1, 5]$, $i_S \in [1, 3]$ and $i_N \in [1, 5]$. It must be noted that in the cases where a recording contains both $SPL_S$ and $SPL_N$, these two were physically apparent and recorded at the same time. The calibration of the SPL for each sound source ($SPL_S$ and $SPL_N$) was performed with the SLM, at the point of the recording microphone, for each source–microphone distance separately, and without any other source active.

The recordings in the overall data set were all time-trimmed to 15 seconds in order to contain exactly the produced signals from all cases. The audio data from the 15-second recordings were saved at standard CD quality, i.e. a sampling frequency of 44.1 kHz and 16 bit sample length, using the typical wave file format. The latter audio files were utilized by the SSE process presented immediately next, organized in the sets $R'_1$ to $R'_8$, in accordance with Equations 5 to 12.

ii. Source separation evaluation

For evaluation purposes, pairs of audio files from the recording sets were utilized as input to the SSE process. Each pair contains two audio files: one containing the noise-free recording (i.e. only the desired source is active; an audio file from recording sets with even index), considered as the estimated source in terms of the SSE process, and the audio file from the recording with the noise source active (i.e. an audio file from recording sets with odd index).

The SIR was computed for the recording set pairs: a) $R'_1$ and $R'_2$, b) $R'_3$ and $R'_4$, c) $R'_5$ and $R'_6$, and d) $R'_7$ and $R'_8$. As can be seen from Equations 5 to 12, the recording sets with odd indices contain recordings with the noise source active and recording sets with even indices contain recordings with the desired source active. Also, each of the pairs a) to d) contains recording sets with the same microphone type and the same angle between the microphone and the sound source. Thus, the input for the calculation of the SIR for one recording pair was audio from each of the recording sets in this pair and with the same indices $i_d$ / $i'_d$, $i_S$, $i_N$, and $i_{ang}$.

IV. Results

The results from the above experimental process are organized in 12 plots, grouped in Figures 3 to 6 with three panels each, corresponding to the different combinations of microphone types, placement angles and the produced SPL. Specifically, Figure 3 shows the results for the omni-directional microphone and Figure 4 the results for the cardioid microphone at zero degrees angle between the microphone and the source. Figure 5 shows the results for the cardioid microphone with an angle of 30° between the microphone and the source, and Figure 6 the results for the cardioid microphone with an angle of 45° between the microphone and the source.

V. Discussion

The results presented in the previous section portray the expected fact that a lower SPL of the noise results in better performance of the close miking technique. Also, a general trend from all figures and subfigures is that the SPL of the source and the SIR seem to be directly related.
This means that the higher the SPL of the source, the higher the SIR. These observations are in accordance with the general purpose and expectations of the close miking technique.

Figure 3: SIR of the omni-directional microphone over various sound pressure levels of the source, with respect to distance D and sound pressure level of the noise. Panels: (a) source SPL 94 dB, (b) source SPL 97 dB, (c) source SPL 100 dB. Axes: Distance (cm), Noise level (dB SPL), SIR (dB).

Figure 4: SIR of the cardioid microphone over various sound pressure levels of the source, with respect to distance D and sound pressure level of the noise. Panels: (a) source SPL 94 dB, (b) source SPL 97 dB, (c) source SPL 100 dB. Axes: Distance (cm), Noise level (dB SPL), SIR (dB).

Focusing on Figures 3 and 4, one can see that in all cases the cardioid microphone outperforms the omni-directional one. The SIR values obtained with the cardioid microphone are almost double the SIR values obtained with the omni-directional one.
In addition, in both cases the maximum performance of the close miking technique seems to be achieved for a 5 cm distance between the source and the microphone. After that distance, a reduction of the SIR is observed for both cases. For the omni-directional case, the reduction is between 10 and 20 centimeters (cm), while for the cardioid microphone case, the reduction is observed between 20 and 40 cm. Following that reduction, the SIR rises up to a limit reached around 70 cm.

Focusing on Figures 5 and 6, one can also observe better interference reduction (higher SIR values) for all source SPL, distances, and noise SPL when compared to the previous two cases. Additionally, in the same cases, i.e. Figures 5 and 6, there is a maximum of SIR around 12 to 14 cm. This comes in contrast with the previous two cases, where the peak was observed below 10 cm. Since for the cases of Figures 5 and 6 we did not perform measurements with distances greater than 15 cm, we cannot conclude whether the SIR curves would exhibit behavior similar to the SIR curves of Figures 3 and 4, i.e. a dip at a certain distance followed by a small increase towards a high limit of the SIR.

The SIR values obtained with the placement of the cardioid microphone at an angle are almost three times the values of the SIR that were obtained with the other two cases. This indicates that placing a cardioid microphone at an angle against the central axis of the noise results in better performance of the close miking technique. The SIR values at the corresponding peaks are likewise almost three times the peak SIR values from the other two cases of microphone types and angles of placement, so the cardioid microphone placed at an angle outperforms both the cardioid microphone placed without an angle and the omni-directional microphone.
Finally, between the two different angular placements of the cardioid microphone, there is no notable difference with the current experimental setup.

Figure 5: SIR of the cardioid microphone, with an angle of 30°, over various sound pressure levels of the source, with respect to distance D and sound pressure level of the noise. Panels: (a) source SPL 94 dB, (b) source SPL 97 dB, (c) source SPL 100 dB. Axes: Distance (cm), Noise level (dB SPL), SIR (dB).

Figure 6: SIR of the cardioid microphone, with an angle of 45°, over various sound pressure levels of the source, with respect to distance D and sound pressure level of the noise. Panels: (a) source SPL 94 dB, (b) source SPL 97 dB, (c) source SPL 100 dB. Axes: Distance (cm), Noise level (dB SPL), SIR (dB).

VI. Conclusions

The work at hand performed a quantitative analysis of the source separation capabilities of the close miking technique, treating this technique as a mechanical source separation method.
This analysis was performed with two different microphone types, three different angular placements of the microphones, 12 different distances between the microphone and the source, three different source SPL values and, finally, five different noise SPL values. The results obtained clearly indicate that the best performance of close miking is achieved when the microphone has a cardioid lobe and is placed at an angle of 30 or 45 degrees with respect to the central axis of the source, at a distance of around 12 cm.

Future measurements and studies could potentially show the effect of the height of the microphone on the close miking technique. Finally, there would be increased interest in a subjective evaluation of the quality of the source separation achieved with close miking, using different types of microphones and different angular placements of the microphones with respect to the source.

VII. Acknowledgements

The authors would like to thank the Department of Technology of Sound and Musical Instruments, Technological Educational Institute of Ionian Islands, for providing the equipment for the measurements. Part of the research leading to these results has received funding from: i) the European Research Council under the European Union's H2020 Framework Programme through ERC Grant Agreement 637422 EVERYSOUND, and ii) the European Union's H2020 Framework Programme (H2020-MSCA-ITN-2014) under grant agreement no 642685 MacSeNet.
