A Learnable Distortion Correction Module for Modulation Recognition

Modulation recognition is a challenging task while performing spectrum sensing in a cognitive radio setup. Recently, deep convolutional neural networks (CNNs) have been shown to achieve state-of-the-art accuracy for modulation recognition \cite{…

Authors: Kumar Yashashwi, Amit Sethi, Prasanna Chaporkar

Kumar Yashashwi, Amit Sethi, Prasanna Chaporkar
Department of Electrical Engineering, Indian Institute of Technology Bombay, India
{kryashashwi, asethi, chaporkar}@ee.iitb.ac.in

Abstract: Modulation recognition is a challenging task while performing spectrum sensing in a cognitive radio setup. Recently, deep convolutional neural networks (CNNs) have been shown to achieve state-of-the-art accuracy for modulation recognition [1]. However, a wireless channel distorts the signal, and CNNs are not explicitly designed to undo these artifacts. To improve the performance of CNN-based recognition schemes, we propose a signal distortion correction module (CM) and show that this CM+CNN scheme achieves better accuracy than the existing schemes. The proposed CM is also based on a neural network: it estimates the random carrier frequency and phase offset introduced by the channel and feeds them to a part that undoes this distortion right before CNN-based modulation recognition. Its output is differentiable with respect to its weights, which allows it to be trained end-to-end with the modulation recognition CNN based on the received signal. For supervision, only the modulation scheme label is used; knowledge of the true frequency or phase offset is not required.

Keywords: Cognitive radio, Deep Learning, Modulation Recognition, Signal distortion

I. INTRODUCTION

With an increasing number of users in wireless networks, congestion in the available spectrum has grown, making it a scarce asset. However, at times parts of the spectrum remain underutilized [2]. This gives rise to the need for algorithms that can dynamically share the available spectrum. In the cognitive radio scenario, spectrum sharing allows cognitive radio (secondary) users to share the spectrum bands of the licensed-band (primary) users.
A key aspect of spectrum sharing is spectrum sensing [3]. Spectrum sharing involves white space detection, based on which the secondary users (SUs) communicate. Since the primary users (PUs) opportunistically allow the secondary users to operate in an inactive frequency band originally allocated to the PUs, minimum time delay is desired [4]. Recent research efforts have been directed towards designing high-quality spectrum-sensing devices and algorithms to characterize the radio frequency (RF) environment, particularly for recognition of the modulation scheme. Distortion of the received signal due to channel fading effects makes modulation recognition a challenging task. Hence, an algorithm that models and corrects the distortion caused by the channel should improve modulation recognition.

In the past few years, deep learning techniques have achieved state-of-the-art performance in pattern recognition tasks [5]. For the purpose of spectrum sensing, different deep learning algorithms such as the multilayer perceptron (MLP) [6] and the convolutional neural network (CNN) [1] have been proposed to recognize the modulation scheme from the given signal.

In this work we introduce a module in a neural network to account for the random carrier frequency offset (CFO) and phase noise. Carrier frequency offset and phase noise are added randomly to the transmitted signal by the channel and, as a result, the recognition accuracy reduces. For example, if the frequency deviation in the transmitter is 10 ppm above the centre frequency and the same holds for the receiver, a CFO of 20 ppm is effectively induced in the received baseband signal. If the carrier frequency is 4 GHz, the CFO is up to ±80 kHz. Moreover, the Doppler effect would further worsen the frequency offset if either the transmitter or the receiver is moving.
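The arithmetic in the example above can be checked with a minimal Python sketch. The function name `worst_case_cfo_hz` and its parameters are illustrative and do not come from the paper; the sketch assumes the transmitter and receiver oscillator errors add in the worst case.

```python
def worst_case_cfo_hz(carrier_hz: float, tx_ppm: float, rx_ppm: float) -> float:
    """Worst-case CFO magnitude in Hz, assuming both oscillator errors add up."""
    total_ppm = abs(tx_ppm) + abs(rx_ppm)
    # ppm means parts per million of the carrier frequency.
    return carrier_hz * total_ppm / 1e6


# 10 ppm at the transmitter and 10 ppm at the receiver, 4 GHz carrier.
cfo = worst_case_cfo_hz(4e9, 10, 10)
print(f"{cfo / 1e3:.0f} kHz")  # 80 kHz
```

Any Doppler shift would add to this figure; it is not modelled in the sketch.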
To tackle such problems, we propose a correction module (CM) to undo the effect of random frequency and phase noise without any prior information about these factors. To be more precise, the correction of CFO and phase noise is unsupervised. This idea is inspired by spatial transformer networks used in image recognition [7]. The CM, when used with a CNN, improves the recognition accuracy for both high and low values of signal-to-noise ratio (SNR). We call this scheme CM+CNN.

The rest of the paper is organized as follows: Section II discusses the related work and the dataset generation method. In Section III we introduce our methodology. Section IV discusses our results, and conclusions are in Section V.

II. BACKGROUND AND RELATED WORK

Techniques for determining the modulation scheme have relied on increasingly complex machine learning methods. Early work by Nandi and Azzouz [6] implemented a decision-theoretic and MLP approach for modulation recognition. A hierarchical modulation recognition system was introduced in [8], which shows that the classification accuracy degrades with increased path fading. Some other efforts have utilized machine learning techniques such as support vector machines (SVM) [9]. Other techniques include feature engineering methods based on cyclostationarity [10] and the wavelet transform [11]. Extracting a proper set of features for classification also has many practical issues. For example, without prior knowledge, the instantaneous phase or frequency cannot be estimated. Work in [12] utilizes different variants of CNN architectures to improve the modulation recognition accuracy. A detailed survey of the methods for modulation recognition is presented in [1]. For a fair comparison, we evaluate our scheme on the same dataset that has been used in the prior work [1]. Correction of channel artifacts in the signal was not considered in prior works, due to which the reported accuracy was poor even for SNRs greater than 0 dB. The proposed CM+CNN framework addresses this issue by using a learnable correction module in tandem with a CNN, leading to a higher accuracy.

For the purpose of developing machine learning models for radio, an open-source, synthetically generated dataset (RadioML2016.10a) was recently introduced using GNU Radio [13]. Fig. 1 illustrates the dataset generation technique. The channel incorporates a sampling frequency offset, a carrier frequency offset and a phase noise using a random walk process. Additive white Gaussian noise (AWGN) further degrades the signal. Parameters used to model the channel are listed in Table I. The model for signal generation is a complex enough replication of real radio transmission signals, making it a quality dataset for developing algorithms and performing simulations for software-based radio.

Fig. 1: Data Generation Scheme [13].

TABLE I: Channel Model Parameters [13]

Parameter | Value
Sampling frequency | 200 kHz
Sampling rate offset standard deviation | 0.01 Hz
Maximum sampling rate offset | 50 Hz
Carrier frequency offset standard deviation | 0.01 Hz
Maximum carrier frequency offset | 500 Hz
Number of sinusoids used in frequency selective fading | 8
Maximum Doppler frequency used in fading | 1
Fading model | Rician
Rician K-factor | 4
Delays | [0.0, 0.9, 1.7]
Magnitudes corresponding to each delay time | [1, 0.8, 0.3]
Ntaps | 8
Standard deviation of the AWGN process | $10^{-\mathrm{SNR}/10}$

III. PROPOSED TECHNIQUE

As described in Section I, the addition of CFO and phase noise to the signal hampers modulation recognition. In this section we introduce a correction module (CM) to address this issue, as depicted in Fig. 2. Overall, the CM can be divided into two parts. The first part is a trainable function that estimates the phase and frequency offsets (correction parameters) from the received signal.
The second part is a static function that generates the input for the CNN by undoing the frequency and phase distortion on the received signal using the offsets estimated by the first part. The first part is trained by backpropagating the error from the modulation recognition label through the CNN and through the second part. Thus, no additional supervised information, such as the true phase or frequency offset, is needed.

A. Correction parameter estimation

For the first part of the CM, we utilize a fully connected network (FCN) to estimate the CFO $\omega$ and the phase offset $\phi$. The FCN has one hidden layer (in which 80 hidden neurons gave good validation performance) followed by a final layer with two outputs, $\omega$ and $\phi$. To allow the estimation of these two parameters from a continuous and unbounded range assuming no prior knowledge, we choose the activation function of the final layer to be linear.

Since the signal is distorted randomly, the error in the estimation of the correction parameters may vary with SNR and modulation scheme. Therefore, we also experimented with the idea of simultaneously giving multiple corrected versions of the signal to the CNN along with the original (uncorrected) signal. We assume that there are $K + 1$ pairs of correction parameters $(\omega_k, \phi_k)$ indexed by $k$, where $k = 0$ is reserved for the received signal without any estimated correction, i.e. $(\omega_0, \phi_0) = (0, 0)$, and the remaining $K$ signals are generated using the $2K$ output neurons of the FCN.

B. Generating input for CNN

The second part of the module applies the inverse phase and frequency transformations using the estimated correction factors by multiplying the received signal $x_n$ with $e^{-j\omega_k n - j\phi_k}$, where $n$ is the discrete time index of the signal. That is, the second part implements the following equations:

$Y^{(I)}_{k,n} = \Re\left( x_n \, e^{-j\omega_k n - j\phi_k} \right)$   (1)

$Y^{(Q)}_{k,n} = \Im\left( x_n \, e^{-j\omega_k n - j\phi_k} \right)$   (2)

In practice, we obtained the best results with $K = 1$, as shown in Fig. 2.
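The inverse transformation in (1) and (2) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `apply_correction`, the column ordering (I and Q of $k = 0$ first, then each corrected version), and the interpretation of $\omega_k$ in radians per sample are assumptions. In the paper the offsets come from the FCN; here they are passed in directly.

```python
import numpy as np


def apply_correction(x, omegas, phis):
    """Derotate a complex signal x of shape [N] by each (omega_k, phi_k) pair.

    Returns a real array of shape [N, 2 * (K + 1)]: the I and Q parts of the
    uncorrected signal (k = 0) followed by those of the K corrected versions.
    """
    x = np.asarray(x, dtype=np.complex128)
    n = np.arange(x.shape[0])
    # Prepend (0, 0) so that k = 0 is the received signal itself.
    omegas = np.concatenate(([0.0], np.atleast_1d(omegas)))
    phis = np.concatenate(([0.0], np.atleast_1d(phis)))
    columns = []
    for w, p in zip(omegas, phis):
        y = x * np.exp(-1j * (w * n + p))  # x_n * exp(-j*omega_k*n - j*phi_k)
        columns.extend([y.real, y.imag])   # Eq. (1) and Eq. (2)
    return np.stack(columns, axis=1)


# Toy check: a tone distorted by a known offset is restored when the same
# offset is supplied as the correction (K = 1).
n = np.arange(128)
clean = np.exp(1j * 0.3 * n)
received = clean * np.exp(1j * (0.02 * n + 0.5))
out = apply_correction(received, omegas=[0.02], phis=[0.5])
print(out.shape)  # (128, 4)
```

Because the operation is a product with a complex exponential, it is smooth in $\omega_k$ and $\phi_k$, which is what lets the recognition error propagate back to the network that estimates them.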
With $K = 1$, $k = 0$ corresponded to the original signal, while we needed to estimate only one frequency-phase pair for $k = 1$ using the FCN part, requiring it to have only two output neurons. In general, the dimension of the output of this part was $128 \times 2(K + 1)$, where the dataset had 128 samples for each signal and the factor 2 accounts for the real and imaginary parts of the signal. Thus, the output of the CM was sized $128 \times 4$.

C. End-to-end training and CNN architectures

The output of the CM, which was $K + 1$ versions of the received signal, was input into the CNN that estimated the modulation scheme. To train CM+CNN, i.e. the parameters of the FCN and the CNN, the recognition error was backpropagated through the cascade of CNN, inverse transformation, and FCN. This was possible because the sub-gradient of the outputs of the FCN and the CNN with respect to their respective inputs and parameters (weights and biases) exists everywhere by design. Additionally, a quick look at (1) and (2) is sufficient to realize that the gradient of the outputs of the inverse transformation with respect to its inputs $x_n$, $\omega_k$, and $\phi_k$ also exists. This allowed end-to-end backpropagation using only the knowledge of the modulation scheme for the training data, without additional knowledge of the actual frequency or phase offsets, so the offsets were learned without supervision.

Fig. 2: Correction module and network architecture.

To improve the modulation recognition accuracy, we trained two different CNNs, one each for SNR below and above 0 dB. Based on prior studies, we assume that whether the SNR is above or below 0 dB can be determined even without knowledge of the modulation scheme [14], [15]. We confirmed that architectures similar to the ones described previously in [1] worked well, which is satisfying as it also allows direct assessment of the effect of adding the proposed correction module (CM). We used a four-layer CNN for non-negative SNR and a three-layer CNN for negative SNR, based on a validation process. All convolutional layers of our CNNs (whether three or four) had 50 one-dimensional convolutional filters of size $c_l \times 8$, where $c_l$ is the number of feature maps or channels of the previous layer. We used 4 input channels, as described in Section III-B, for the input layer, unlike the 2 channels of [1]. The convolution was performed using the valid setting, and thus no padding was required at the signal edges. The first two convolutional layers were each followed by max-pooling of factor 2. The output of the convolutional layers was followed by a dense layer having 512 neurons. The output layer had 11 neurons. All layers used rectified linear activation, except the output layer, which used softmax.

IV. RESULTS AND DISCUSSION

Among the various experiments we conducted to determine useful combinations of neural network architectures and hyperparameters, including the number of estimated correction parameters, we describe those that led to conclusive results. The RadioML2016.10a dataset that we used has signals with 11 analog and digital modulation schemes, with SNR varying from -20 dB to +18 dB. Since every signal passes through the channel described in Table I, it gets distorted by sampling rate offset, carrier frequency offset, phase noise and AWGN. The correction module in our work accounts for phase and frequency offset.

We also verified the previous benchmark results by reimplementing CNN and CLDNN [1]. The architecture used for CNN has 3 convolutional layers, each having 50 filters with $1 \times 8$ filter size. For CLDNN, the output of the 3 convolutional layers is concatenated with the output of the first convolutional layer. A comparison of modulation recognition accuracy between the proposed method, CNN and CLDNN for different SNRs is shown in Fig. 3. For SNRs above -14 dB, a higher accuracy is observed using the proposed technique, with significant improvements for SNR greater than 0 dB. Similar performance improvement is observed for SNR less than -14 dB.

Fig. 3: Accuracy comparison between the proposed technique (CM+CNN) and previous benchmarks (CNN, CLDNN).

We also experimented with the following three cases of parameter corrections: 1) frequency only, 2) phase only, and 3) frequency and phase corrections. Accuracy gains with respect to the base CNN for the three cases are presented in Fig. 4. We observed significant gains for nearly all the cases, thus demonstrating the benefit of frequency and phase offset corrections.

Fig. 4: Accuracy gain (%) versus SNR (dB) with respect to the base CNN for frequency correction, phase correction, and both frequency and phase corrections.

Fig. 5 shows the output of the correction module. Note that the activation function of the final fully connected layer of the FCN was linear, as described in Section III-A, so the output could have been any real value. However, we observe in Fig. 5 that most of the frequency corrections lie in the range of -0.01 Hz to +0.01 Hz. The standard deviation of the frequency corrections obtained is 0.01131 Hz. This matches the standard deviation of the carrier frequency offset used to model the channel, as listed in Table I. Hence, the correction module estimates the CFO in close agreement with the actual offset values without any extra supervised data. Due to complex selective channel fading and delays, it is difficult to estimate the actual range of random phase noise. In our experiments we found the phase noise correction to vary between 150° and 270°, with the mode at 240°.

Fig. 5: Normalized histogram for frequency corrections.

Further, we plot the confusion matrices for non-negative and negative SNRs in Fig. 6. For non-negative SNRs we observe that the major confusion is between QAM16 and QAM64.
A possible reason is that the features of a signal with QAM64 modulation may not be captured by just 128 samples, due to which the deep network confuses it with QAM16. Due to increased noise, we observed more confusion for negative-SNR signals. All the techniques do no better than a random guess for signals having SNR lower than -14 dB, as shown in Fig. 3.

Fig. 6: (a) Confusion matrix for non-negative SNR, (b) Confusion matrix for negative SNR.

V. CONCLUSION

We introduced a new module in this paper to estimate the carrier frequency offset and phase noise of the received signal in order to improve modulation recognition accuracy. The proposed network outperforms the previous benchmark, achieving significant accuracy improvements for both high and low SNR signals. Since a generic CNN is not designed to deal with the effects caused by wireless channels, we addressed this issue by introducing a correction module. Further, we observe that the calculated frequency corrections correspond closely to the actual frequency offsets caused by the channel. Since there can be any number of perceptrons (correction factors) in the final layer, corrections other than phase or frequency can also be estimated. We have demonstrated that the concept of spatial transformer networks [7] can be generalized to distortion correction for signals in a cognitive radio setup. Similarly, distortion parameters for audio and speech signals can also be estimated for signal correction before a recognition task, in a similarly cascaded neural network that can be trained end-to-end.

REFERENCES

[1] N. West and T. O'Shea, "Deep architectures for modulation recognition," IEEE International Conference on Dynamic Spectrum Access Networks, 2017.
[2] S. Haykin, "Cognitive radio: brain-empowered wireless communications," IEEE Journal on Selected Areas in Communications, vol. 23, pp. 201–220, 2005.
[3] T. Yucek and H. Arslan, "A survey of spectrum sensing algorithms for cognitive radio applications," IEEE Communications Surveys & Tutorials, vol. 11, pp. 116–130, 2009.
[4] O. B. Akan, O. B. Karli, and O. Ergul, "Cognitive radio sensor networks," IEEE Network, vol. 23, 2009.
[5] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, 2015.
[6] A. K. Nandi and E. E. Azzouz, "Algorithms for automatic modulation recognition of communication signals," IEEE Transactions on Communications, vol. 46, 1998.
[7] M. Jaderberg, K. Simonyan, et al., "Spatial transformer networks," Advances in Neural Information Processing Systems, 2015.
[8] E. Like and V. Chakravarthy, "Signal classification in fading channels using cyclic spectral analysis," Journal on Wireless Communications and Networking, 2009.
[9] S. Hassanpour, A. Pezeshk, and F. Behnia, "Automatic digital modulation recognition based on novel features and support vector machine," International Conference on Signal-Image Technology & Internet-Based Systems, 2016.
[10] O. Dobre, A. Abdi, et al., "Survey of automatic modulation classification techniques: classical approaches and new trends," Communications, IET, vol. 1, pp. 137–156, 2007.
[11] P. Prakasam and M. Madheswaran, "Digital modulation identification model using wavelet transform and statistical parameters," Comp. Sys., Netw., and Comm, 2008.
[12] T. O'Shea, J. Corgan, and T. Clancy, "Convolutional radio modulation recognition networks," Engineering Applications of Neural Networks, 2016.
[13] T. O'Shea and N. West, "Radio machine learning dataset generation with GNU Radio," Proceedings of the GNU Radio Conference, 2016.
[14] D. Wu, X. Gu, and Q. Guo, "Blind signal-to-noise ratio estimation algorithm with small samples for wireless digital communications," Intelligent Computing in Signal Processing and Pattern Recognition, Springer, Berlin, Heidelberg, 2006.
[15] S. Dan and G. Lindong, "A blind SNR estimator for digital bandpass signals," Journal of Electronics, vol. 25, 2008.
