eess.AS
Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition
Polyphonic audio tagging with sequentially labelled data using CRNN with learnable gated linear units
Sentiment Analysis on Speaker Specific Speech Data
Masked Conditional Neural Networks for Automatic Sound Events Recognition
Generation of Infra sound to replicate a wind turbine
Melody Generation using an Interactive Evolutionary Algorithm
What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model
Enroll-on-Wakeup: A First Comparative Study of Target Speech Extraction for Seamless Interaction in Real Noisy Human-Machine Dialogue Scenarios
Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis
Learning spatial hearing via innate mechanisms
Music of Brain and Music on Brain: A Novel EEG Sonification approach
The organization of a three-manual keyboard for 53-tone tempered and other tempered systems
Research on several key technologies in practical speech emotion recognition
CC-G2PnP: Streaming Grapheme-to-Phoneme and prosody with Conformer-CTC for unsegmented languages
A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network
Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees
Deepfake Word Detection by Next-token Prediction using Fine-tuned Whisper
A Full Frequency Masking Vocoder for Legal Eavesdropping Conversation Recording