AI-Driven Cardiorespiratory Signal Processing: Separation, Clustering, and Anomaly Detection
This research applies artificial intelligence (AI) to separate, cluster, and analyze cardiorespiratory sounds. We recorded a new dataset (HLS-CMDS) and developed several AI models, including generative AI methods based on large language models (LLMs) for guided separation, explainable AI (XAI) techniques to interpret latent representations, variational autoencoders (VAEs) for waveform separation, a chemistry-inspired non-negative matrix factorization (NMF) algorithm for clustering, and a quantum convolutional neural network (QCNN) designed to detect abnormal physiological patterns. The performance of these AI models depends on the quality of the recorded signals. Therefore, this thesis also reviews the biosensing technologies used to capture biomedical data. It summarizes developments in microelectromechanical systems (MEMS) acoustic sensors and quantum biosensors, such as quantum dots and nitrogen-vacancy centers. It further outlines the transition from electronic integrated circuits (EICs) to photonic integrated circuits (PICs) and early progress toward integrated quantum photonics (IQP) for chip-based biosensing. Together, these studies show how AI and next-generation sensors can support more intelligent diagnostic systems for future healthcare.
💡 Research Summary
This dissertation presents an end‑to‑end AI‑driven framework for processing mixed heart and lung sounds, integrating cutting‑edge generative AI, deep learning, and quantum computing techniques with next‑generation biosensing hardware. A new dataset, HLS‑CMDS, was recorded from a clinical training manikin using a 22 kHz digital stethoscope, providing both normal and pathological recordings together with metadata.
For blind source separation, the author introduces LingoNMF, a large‑language‑model‑guided non‑negative matrix factorization. By prompting an LLM with periodicity cues extracted from the audio, the algorithm adaptively updates penalty terms, achieving 3–4 dB improvements in signal‑to‑distortion ratio (SDR) and signal‑to‑interference ratio (SIR) over standard NMF variants.
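The exact prompt design and penalty form are specific to LingoNMF and are not reproduced here. As an illustration only, the sketch below runs sparse NMF with standard multiplicative updates in which the L1 penalty on the activation matrix is re-weighted each iteration by a placeholder function (`llm_guided_penalty`) standing in for the LLM query; the function name, the decay schedule, and the toy data are all assumptions, not the author's implementation.

```python
import numpy as np

def llm_guided_penalty(iteration, periodicity_cue):
    # Placeholder for the LLM query: LingoNMF prompts an LLM with
    # periodicity cues to set the penalty; here a simple decaying
    # heuristic stands in for that call.
    return 0.1 * periodicity_cue / (1 + iteration)

def guided_nmf(V, rank, periodicity_cue, n_iter=200, eps=1e-9, seed=0):
    """Sparse NMF (||V - WH||_F^2 + lam * ||H||_1) whose penalty lam
    is adapted each iteration by an external 'guide'."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for it in range(n_iter):
        lam = llm_guided_penalty(it, periodicity_cue)
        # Multiplicative updates; lam only penalizes the activations H.
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonnegative mixture of two rank-1 "sources" with different periodicities
t = np.linspace(0, 1, 256)
V = np.abs(np.outer(np.sin(8 * np.pi * t), np.ones(64))) \
  + np.abs(np.outer(np.ones(256), np.sin(2 * np.pi * np.linspace(0, 1, 64))))
W, H = guided_nmf(V, rank=2, periodicity_cue=1.0)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative reconstruction error
```

Because the toy mixture is an exact sum of two nonnegative rank-1 components, a rank-2 factorization should reconstruct it closely; the interesting part of LingoNMF is how the guide chooses `lam`, which this stub only gestures at.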
A second separation model, XV‑AE‑WMT, combines a variational auto‑encoder (VAE) with a wavelet‑based masking scheme and a temporal‑consistency loss. This architecture yields SDR = 26.8 dB, SIR = 32.8 dB, and a latent‑space Silhouette score of 0.345, demonstrating superior separation and clustering of overlapping cardiac and pulmonary sounds.
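The VAE itself is too large to reproduce here; the minimal sketch below shows only the masking step and the temporal-consistency idea, under the assumption (mine, not the author's) that the model emits nonnegative per-source magnitude estimates from which Wiener-style soft masks are built. In XV-AE-WMT those estimates would come from the decoder; here they are supplied directly as toy time-frequency maps.

```python
import numpy as np

def soft_masks(est_heart, est_lung, eps=1e-8):
    """Wiener-style ratio masks from two nonnegative source estimates."""
    total = est_heart + est_lung + eps
    return est_heart / total, est_lung / total

# Toy time-frequency magnitudes (rows = frequency bins, cols = frames)
rng = np.random.default_rng(0)
heart = np.zeros((16, 32)); heart[:4] = 1.0   # low-frequency source
lung = np.zeros((16, 32)); lung[8:] = 0.5     # high-frequency source
mixture = heart + lung

# Slightly noisy estimates stand in for the VAE decoder outputs.
m_h, m_l = soft_masks(heart + 0.05 * rng.random(heart.shape),
                      lung + 0.05 * rng.random(lung.shape))
sep_heart = m_h * mixture

# A temporal-consistency penalty discourages frame-to-frame mask jitter;
# this quadratic form is an illustrative guess at such a loss term.
tc_loss = np.mean(np.diff(m_h, axis=1) ** 2)
print(np.abs(sep_heart - heart).mean(), tc_loss)
```

The masks sum to one per bin, so the two separated signals always recompose the mixture; the temporal term pushes the network toward smooth masks over time.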
Unsupervised clustering is tackled with Chem‑NMF, a multi‑layer α‑divergence NMF inspired by catalytic chemistry. The method stabilizes convergence through a “catalyst” initialization and divergence control, leading to 5–7 % gains in accuracy and normalized mutual information on benchmark image datasets, and showing promising results on the cardiorespiratory recordings.
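A single layer of α-divergence NMF can be sketched with the standard multiplicative update rule; the multi-layer stacking and the chemistry-inspired "catalyst" initialization that define Chem-NMF are omitted here, so this is the generic building block rather than the author's algorithm.

```python
import numpy as np

def alpha_nmf(V, rank, alpha=0.5, n_iter=300, eps=1e-9, seed=0):
    """Single-layer alpha-divergence NMF via multiplicative updates.
    Chem-NMF stacks several such layers and controls the divergence
    with a catalyst-style initialization; only the core update is
    shown here."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        R = (V / (W @ H + eps)) ** alpha
        H *= ((W.T @ R) / (W.T @ ones + eps)) ** (1.0 / alpha)
        R = (V / (W @ H + eps)) ** alpha
        W *= ((R @ H.T) / (ones @ H.T + eps)) ** (1.0 / alpha)
    return W, H

# Exact nonnegative rank-2 data, so a good factorization exists.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = alpha_nmf(V, rank=2)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Setting `alpha=0.5` corresponds to the Hellinger divergence; varying α trades off how strongly large versus small entries of `V` dominate the fit, which is one knob the multi-layer scheme can adjust per layer.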
Anomaly detection is performed by QuPCG, a hybrid quantum‑classical pipeline. After wavelet transformation, the signal is converted into energy‑map images, encoded into qubits, and processed by a quantum convolutional neural network (QCNN). In simulated experiments, QuPCG attains 93.33 % ± 2.9 % test accuracy for binary heart‑sound classification, indicating the potential of quantum‑enhanced feature encoding.
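The QCNN itself requires a quantum SDK, but the two preprocessing steps can be sketched classically: building an energy-map image from the signal and normalizing it into a valid amplitude-encoded qubit state. The crude block-energy map below is a stand-in for the wavelet-based images used in QuPCG, and the helper names are illustrative assumptions.

```python
import numpy as np

def energy_map(signal, n_bands=8, n_frames=8):
    """Crude band/frame energy image (a stand-in for the wavelet-based
    energy maps used in QuPCG): split the signal into blocks and sum
    squared samples per block."""
    usable = (len(signal) // (n_bands * n_frames)) * n_bands * n_frames
    blocks = signal[:usable].reshape(n_frames, n_bands, -1)
    return (blocks ** 2).sum(axis=-1)          # n_frames x n_bands energies

def amplitude_encode(image):
    """Flatten and L2-normalize so the pixels form a valid amplitude
    vector on ceil(log2(N)) qubits (zero-padded to a power of two)."""
    v = image.flatten().astype(float)
    size = 1 << int(np.ceil(np.log2(len(v))))
    v = np.pad(v, (0, size - len(v)))
    return v / np.linalg.norm(v)

# Toy PCG-like burst signal: a 40 Hz tone gated by a slow envelope
t = np.linspace(0, 1, 4096)
pcg_like = np.sin(2 * np.pi * 40 * t) * (np.sin(2 * np.pi * 1.2 * t) > 0.9)
state = amplitude_encode(energy_map(pcg_like))
print(len(state), np.sum(state ** 2))  # 64 amplitudes (6 qubits), unit norm
```

An 8×8 energy map fits into 64 amplitudes, i.e. six qubits, which is the kind of compact encoding that makes simulator-scale QCNN experiments feasible.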
The thesis also surveys the sensing layer, covering MEMS acoustic microphones, quantum‑dot and nitrogen‑vacancy (NV) center biosensors, photonic integrated circuits (PICs), and early integrated quantum photonics (IQP) platforms. The review highlights the transition from electronic to photonic integration and the promise of on‑chip quantum photonic sensors for ultra‑low‑noise biomedical acquisition.
Overall, the work demonstrates that (1) high‑quality data acquisition, (2) LLM‑guided NMF, (3) VAE‑based masked separation, (4) chemistry‑inspired NMF clustering, and (5) quantum‑convolutional anomaly detection together form a powerful pipeline for cardiorespiratory signal analysis. While the results surpass conventional baselines, the author acknowledges limitations such as the subjectivity of LLM prompts, the need for validation on real patient recordings, and the current reliance on quantum simulators. Future research directions include open‑source release of prompts and code, clinical trials with patient data, and implementation of the QCNN on physical quantum hardware.