Universal Audio Steganalysis Based on Calibration and Reversed Frequency Resolution of Human Auditory System

February 23, 2026


📝 Abstract

Calibration and higher-order statistics (HOS) are standard components of many image steganalysis systems. These techniques have not yet received adequate attention in audio steganalysis. Specifically, most current works are either non-calibrated or based only on noise removal. This paper aims to fill these gaps by proposing a new set of calibrated features based on a re-embedding technique. Additionally, we show that the least significant bit (LSB) is the bit-plane most sensitive to data hiding algorithms and can therefore be employed as a universal embedding method. Furthermore, the proposed features are based on a model that has the maximum deviation from the human auditory system (HAS) and are therefore more suitable for steganalysis. The performance of the proposed method is evaluated on a wide range of data hiding algorithms in both targeted and universal paradigms. Simulation results show that the proposed method can detect the finest traces of data hiding algorithms, even at very low embedding rates. The system detects steghide at a capacity of 0.06 bits per symbol (BPS) with a sensitivity of 98.6% (music) and 78.5% (speech). These figures are, respectively, 7.1% and 27.5% higher than state-of-the-art results based on RMFCC.


📄 Content

Universal Audio Steganalysis Based on Calibration and Reversed Frequency Resolution of Human Auditory System1

Hamzeh Ghasemzadeh1*, Meisam Khalil Arjmandi2

1 Department of Communicative Sciences and Disorders, Michigan State University, MI, USA
2 Department of Communicative Sciences and Disorders, Michigan State University, MI, USA
*hamzeh_g62@yahoo.com

Abstract: Calibration and higher-order statistics (HOS) are standard components of image steganalysis. However, these techniques have not yet received adequate attention in audio steganalysis. Specifically, most current studies are either non-calibrated or based only on noise removal. The goal of this paper is to fill these gaps and to show that calibrated features based on a re-embedding technique improve the performance of audio steganalysis. Furthermore, we show that the least significant bit (LSB) is the bit-plane most sensitive to data hiding algorithms and can therefore be employed as a universal embedding method. The proposed features also benefit from an efficient model that is tailored to the needs of audio steganalysis and represents the maximum deviation from the human auditory system (HAS). The performance of the proposed method is evaluated on a wide range of data hiding algorithms in both targeted and universal paradigms. The results show the effectiveness of the proposed method in detecting the finest traces of data hiding algorithms at very low embedding rates. The system detects steghide at a capacity of 0.06 bits per symbol (BPS) with a sensitivity of 98.6% (music) and 78.5% (speech). These figures are, respectively, 7.1% and 27.5% higher than the state-of-the-art results based on RMFCC features.

Key words: Audio steganalysis, Audio steganography, Data hiding, Reversed Mel frequency cepstrum coefficients, Calibration

1- Introduction

During the past decade, information security has been revolutionized, and many new trends have emerged. Multimedia encryption systems [1, 2], multimedia secret sharing [3], steganography [4], steganalysis, and watermarking [5] are among such trends. Among them, steganography has received a lot of attention. The main purposes of steganography are communicating through a covert channel without arousing the attention of a third party and preventing traffic analysis.
1 “This paper is a preprint of a paper submitted to and accepted for publication in IET Signal Processing and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at the IET Digital Library.” DOI: iet-spr.2016.0690

The outcome of this process is a stego signal (𝑠∈𝒮), which results from hiding the intended message (𝑚∈ℳ) inside a host signal called the cover (𝑐∈𝒞). Steganography methods can be classified into the categories of text, audio, image, video, and network traffic, depending on the type of cover signal. Steganalysis is the countermeasure to steganography and aims to detect the presence of hidden messages. Likewise, steganalysis methods may be classified according to the type of cover into the categories of text, audio, image, video, and network traffic. Steganalysis in each of these categories can be further divided into targeted and universal methods. In the former, the embedding algorithm is known, whereas in the latter there is no prior assumption about the embedding algorithm [6].
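To make the notation concrete, the following is a minimal sketch, not the paper's implementation, of LSB embedding of a message 𝑚 into a cover 𝑐 to obtain a stego 𝑠, followed by the general shape of a re-embedding calibration as described in the abstract: features of the signal under inspection are compared with features of the same signal after random bits are embedded into it again. The function names, the synthetic 16-bit samples, and the two-element `toy_features` vector are all assumptions made for illustration; the paper's actual features are not reproduced here.

```python
import numpy as np

def lsb_embed(signal, bits):
    """Overwrite the LSB of the first len(bits) samples with the given bits."""
    out = signal.copy()
    bits = np.asarray(bits, dtype=signal.dtype)
    out[:bits.size] = (out[:bits.size] & ~1) | bits
    return out

def lsb_extract(signal, n_bits):
    """Read back the LSBs of the first n_bits samples."""
    return signal[:n_bits] & 1

def toy_features(signal):
    """Hypothetical placeholder features: statistics of the second-order difference."""
    d2 = np.diff(signal.astype(np.float64), n=2)
    return np.array([np.mean(np.abs(d2)), np.var(d2)])

def calibrated_features(signal, rng):
    """Features of the signal minus features of its randomly re-embedded version."""
    random_bits = rng.integers(0, 2, size=signal.size)
    reembedded = lsb_embed(signal, random_bits)
    return toy_features(signal) - toy_features(reembedded)

rng = np.random.default_rng(0)
cover = rng.integers(-2000, 2000, size=1024).astype(np.int16)  # synthetic stand-in for audio
message = rng.integers(0, 2, size=128)
stego = lsb_embed(cover, message)

print(calibrated_features(cover, rng))
print(calibrated_features(stego, rng))
```

The synthetic samples here are random, so the printed values say nothing about detectability; the sketch only shows where re-embedding enters the feature pipeline.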
One of the first audio steganalysis methods was proposed in [7], where the cover signal was estimated by denoising the signal under inspection. Audio quality metrics (AQMs) were used to quantify the discrepancies between the original signal and its estimated cover [7]. The Hausdorff distance was proposed as a solution to the inefficiency of AQMs in detecting traces of hidden data [8]. In [9], the negative effect of the high correlation between the features extracted from these denoising methods and their signals was addressed.
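The denoising-based idea can be sketched as follows, under stated assumptions: the cover is estimated by smoothing the signal under inspection, and a single distortion measure quantifies how far the signal is from that estimate. The moving-average filter and the signal-to-noise ratio used here are illustrative stand-ins chosen for brevity; they are not claimed to be the denoiser or the quality metrics used in [7].

```python
import numpy as np

def estimate_cover(signal, width=5):
    """Stand-in denoiser (assumption): moving-average smoothing of the signal."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(signal, dtype=np.float64), kernel, mode="same")

def snr_db(signal, estimate):
    """SNR in dB between a signal and its estimated cover (assumes they differ)."""
    signal = np.asarray(signal, dtype=np.float64)
    residual = signal - estimate
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(residual ** 2))

# Synthetic example: a clean tone and the same tone with small additive noise.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
clean = np.sin(2.0 * np.pi * 5.0 * t)
noisy = clean + np.random.default_rng(0).normal(0.0, 0.1, size=t.size)

print(snr_db(clean, estimate_cover(clean)))
print(snr_db(noisy, estimate_cover(noisy)))
```

In this toy setting the noisy signal sits further from its smoothed estimate than the clean one does, which is the kind of discrepancy a classifier would be trained on.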
All of these previous works are similar in that they use indirect methods to compare stegos with their estimated covers. However, conducting this comparison on the distributions of stegos and covers is more appropriate. This approach was pursued in [10], where it was shown that the degree of histogram flatness derived from the wavelet coefficients of stegos and their cover counterparts is a discriminative criterion. A Gaussian mixture model (GMM) and the generalized Gaussian distribution (GGD) were used to capture this criterion. Another improvement was obtained from the second-order derivative of the audio signal [11]. On the basis of this observation, two different approaches were proposed by incorporating Markov transition probability and Mel-frequency cepstral coefficients (MFCC) [11, 12]. Ghasemzadeh et al. suggested a new steganalysis method by arguing that, by definition, the ear should not be able to distinguish between cover and stego signals. According to this argument, MFCC and AQMs are count


This content is AI-processed based on ArXiv data.
