Audio Processing

All posts under tag "Audio Processing"

9 posts total

Sorted by date

SELEBI: Percussion-aware Time Stretching via Selective Magnitude Spectrogram Compression by Nonstationary Gabor Transform

Phase vocoder-based time-stretching is a widely used technique for the time-scale modification of audio signals. However, conventional implementations suffer from "percussion smearing," a well-known artifact that significantly degrades the quality of percussive components. We attribute this artifa

February 23, 2026

Audio Processing Electrical Engineering and Systems Science

Bottleneck Transformer-Based Approach for Improved Automatic STOI Score Prediction

In this study, we present a novel approach to predicting the Short-Time Objective Intelligibility (STOI) metric using a bottleneck transformer architecture. Traditional methods for calculating STOI typically require clean reference speech, which limits their applicability in the real world. To

February 23, 2026

Audio Processing Electrical Engineering and Systems Science

SA-SSL-MOS: Self-supervised Learning MOS Prediction with Spectral Augmentation for Generalized Multi-Rate Speech Assessment

Designing a speech quality assessment (SQA) system for estimating the mean opinion score (MOS) of multi-rate speech with varying sampling frequencies (16-48 kHz) is a challenging task. The challenge arises due to the limited availability of a MOS-labeled training dataset comprising multi-rate speech sampl

February 23, 2026

Audio Processing Learning Electrical Engineering and Systems Science

Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios

Target speech extraction (TSE) typically relies on pre-recorded high-quality enrollment speech, which disrupts user experience and limits feasibility in spontaneous interaction. In this paper, we propose Enroll-on-Wakeup (EoW), a novel framework where the wake-word segment, captured naturally during

February 23, 2026

Audio Processing Electrical Engineering and Systems Science

Phoneme-Based Persian Speech Recognition

Undoubtedly, one of the most important issues in computer science is intelligent speech recognition. In these systems, computers try to detect and respond to the speech they hear, as humans do. In this research, presenting of a suitable method for the diagnosis of Persian phonemes by AI

February 23, 2026

Audio Processing Computer Science Machine Learning Sound Electrical Engineering and Systems Science

Melody Generation using an Interactive Evolutionary Algorithm

Music generation with the aid of computers has recently attracted the attention of many scientists in the area of artificial intelligence. Deep learning techniques have evolved sequence production methods for this purpose. Yet, a challenging problem is how to evaluate generated music by a machine

February 23, 2026

Audio Processing Computer Science Neural Computing Sound Electrical Engineering and Systems Science

Music of Brain and Music on Brain: A Novel EEG Sonification approach

Can we hear the sound of our brain? Is there any technique that can enable us to hear the neuro-electrical impulses originating from the different lobes of the brain? The answer to all these questions is YES. In this paper we present a novel method with which we can sonify the Electroencephalogram (EEG

February 23, 2026

Audio Processing Computer Science Physics Quantitative Biology Sound Electrical Engineering and Systems Science

The organization of a three-manual keyboard for 53-tone tempered and other tempered systems

The aim is to explore new opportunities for the pitch organization of the musical scale. Specifically, different musical temperaments are compared numerically by how closely they approximate the Pythagorean scale, which numerically substantiates the thesis th

February 23, 2026

Audio Processing Computer Science Sound System Electrical Engineering and Systems Science

Learning spatial hearing via innate mechanisms

The acoustic cues used by humans and other animals to localise sounds are subtle, and change during and after development. This means that we need to constantly relearn or recalibrate the auditory spatial map throughout our lifetimes. This is often thought of as a 'supervised' learning process where

February 23, 2026

Audio Processing Computer Science Quantitative Biology Learning Neural Computing Electrical Engineering and Systems Science

