UniCoMTE: A Universal Counterfactual Framework for Explaining Time-Series Classifiers on ECG Data

Reading time: 5 minutes

📝 Original Info

  • Title: UniCoMTE: A Universal Counterfactual Framework for Explaining Time-Series Classifiers on ECG Data
  • ArXiv ID: 2512.17100
  • Date: 2025-12-18
  • Authors: Justin Li (Boston University), Efe Sencan (Boston University), Jasper Zheng Duan (Sandia National Laboratories), Vitus J. Leung (Sandia National Laboratories), Stephen Tsaur (Boston University; Boston Medical Center), Ayse K. Coskun (Boston University)

📝 Abstract

Machine learning models, particularly deep neural networks, have demonstrated strong performance in classifying complex time series data. However, their black-box nature limits trust and adoption, especially in high-stakes domains such as healthcare. To address this challenge, we introduce UniCoMTE, a model-agnostic framework for generating counterfactual explanations for multivariate time series classifiers. The framework identifies temporal features that most heavily influence a model's prediction by modifying the input sample and assessing its impact on the model's prediction. UniCoMTE is compatible with a wide range of model architectures and operates directly on raw time series inputs. In this study, we evaluate UniCoMTE's explanations on a time series ECG classifier. We quantify explanation quality by comparing our explanations' comprehensibility to comprehensibility of established techniques (LIME and SHAP) and assessing their generalizability to similar samples. Furthermore, clinical utility is assessed through a questionnaire completed by medical experts who review counterfactual explanations presented alongside original ECG samples. Results show that our approach produces concise, stable, and human-aligned explanations that outperform existing methods in both clarity and applicability. By linking model predictions to meaningful signal patterns, the framework advances the interpretability of deep learning models for real-world time series applications.
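
The abstract describes the core idea as modifying the input sample and observing how the model's prediction changes. The sketch below illustrates that idea in a minimal, hedged form. It assumes a channel-substitution search: channels of the sample are replaced, one at a time, with the matching channels of a "distractor" sample that the model already assigns to the target class. The function name `greedy_counterfactual`, the `predict_proba` callable, and the toy classifier are all hypothetical and chosen for illustration. The excerpt does not specify UniCoMTE's actual search procedure, so this should not be read as the authors' implementation.

```python
import numpy as np


def greedy_counterfactual(predict_proba, x, distractor, target):
    """Greedily copy channels from `distractor` into `x` until the model predicts `target`.

    predict_proba: any callable mapping a (channels, timesteps) array to a vector
    of class probabilities (the only requirement, so the search is model-agnostic).
    Returns (counterfactual, swapped_channels); counterfactual is None if no flip occurs.
    """
    current = x.copy()
    swapped = []
    remaining = list(range(x.shape[0]))
    while remaining:
        if int(np.argmax(predict_proba(current))) == target:
            return current, swapped
        # Try every remaining channel; keep the swap that raises P(target) the most.
        best_ch, best_p = None, -1.0
        for ch in remaining:
            trial = current.copy()
            trial[ch] = distractor[ch]
            p = float(predict_proba(trial)[target])
            if p > best_p:
                best_ch, best_p = ch, p
        current[best_ch] = distractor[best_ch]
        swapped.append(best_ch)
        remaining.remove(best_ch)
    flipped = int(np.argmax(predict_proba(current))) == target
    return (current if flipped else None), swapped


def toy_predict_proba(sample):
    # Hypothetical stand-in classifier: class 1 when the mean of channel 2 is positive.
    p1 = 1.0 / (1.0 + np.exp(-sample[2].mean()))
    return np.array([1.0 - p1, p1])


# Toy 4-channel "recordings": x is predicted as class 0, d as class 1.
x = np.full((4, 50), -1.0)
d = np.full((4, 50), 1.0)

cf, channels = greedy_counterfactual(toy_predict_proba, x, d, target=1)
print("channels swapped to flip the prediction:", channels)
```

In this toy setting only channel 2 influences the classifier, so the search swaps that single channel and stops. The short list of swapped channels is what makes this style of explanation concise: it names the few input variables whose change would alter the prediction.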

💡 Deep Analysis

Figure 1

📄 Full Content

UniCoMTE: A Universal Counterfactual Framework for Explaining Time-Series Classifiers on ECG Data

Justin Li¹, Efe Sencan¹, Jasper Zheng Duan², Vitus J. Leung², Stephen Tsaur¹,³, Ayse K. Coskun¹

¹*Boston University, Boston, MA, USA. ²Sandia National Laboratories, Albuquerque, NM, USA. ³Boston Medical Center, Boston, MA, USA.

Contributing authors: justinli@bu.edu; esencan@bu.edu; jzduan@sandia.gov; vjleung@sandia.gov; Stephen.Tsaur@bmc.org; acoskun@bu.edu

Abstract

Machine learning models, particularly deep neural networks, have demonstrated strong performance in classifying complex time series data. However, their black-box nature limits trust and adoption, especially in high-stakes domains such as healthcare. To address this challenge, we introduce UniCoMTE, a model-agnostic framework for generating counterfactual explanations for multivariate time series classifiers. The framework identifies temporal features that most heavily influence a model’s prediction by modifying the input sample and assessing its impact on the model’s prediction. UniCoMTE is compatible with a wide range of model architectures and operates directly on raw time series inputs. In this study, we evaluate UniCoMTE’s explanations on a time series ECG classifier. We quantify explanation quality by comparing our explanations’ comprehensibility to comprehensibility of established techniques (LIME and SHAP) and assessing their generalizability to similar samples. Furthermore, clinical utility is assessed through a questionnaire completed by medical experts who review counterfactual explanations presented alongside original ECG samples. Results show that our approach produces concise, stable, and human-aligned explanations that outperform existing methods in both clarity and applicability. By linking model predictions to meaningful signal patterns, the framework advances the interpretability of deep learning models for real-world time series applications.
arXiv:2512.17100v2 [cs.LG] 22 Dec 2025

Keywords: Explainable artificial intelligence (XAI), Counterfactual explanations, ECG classification, Machine Learning

1 Introduction

Cardiovascular diseases (CVDs) remain the leading cause of death globally, accounting for an estimated 17.9 million deaths each year [1]. Early detection and diagnosis are critical for reducing morbidity and mortality, as timely interventions can significantly improve outcomes [2]. Electrocardiograms (ECGs) serve as a primary non-invasive diagnostic tool to assess cardiac function by recording the heart’s electrical activity over time. Given the complexity and sheer volume of ECG recordings, researchers have increasingly turned to deep learning methods as a means to automate ECG-based diagnosis. Recent studies have demonstrated that deep learning models can achieve high performance on ECG classification tasks and show potential for clinical application in research settings. For example, a deep neural network trained on 12-lead ECG samples can outperform cardiology residents in detecting multiple arrhythmias, with F1-scores above 80% and specificity over 99%, across six ECG abnormalities [3]. Similarly, a convolutional neural network (CNN) [4] model trained on 12-lead ECG data can perform on par with cardiologists and exhibits greater accuracy than a leading commercial ECG analysis system. Other models have achieved high performance across a range of similar classification tasks, including the classification of myocardial infarction and atrial fibrillation [5–7]. Beyond performance comparisons with clinical standards, several studies investigate the impact of architectural choices. For instance, using one-dimensional time-series models appears more effective than transforming ECG signals into image representations.
One study finds that a gated recurrent unit–based recurrent neural network [8] achieves around 80% sensitivity and 81% specificity, outperforming both two-dimensional CNN approaches and multimodal fusion of one- and two-dimensional inputs. In terms of efficiency, a lightweight 11-layer hybrid convolutional neural network–long short-term memory (CNN–LSTM) model achieves near-perfect arrhythmia classification (approximately 98% accuracy) across eight rhythm classes [9], while remaining compact enough for deployment to wearable monitors for continuous, real-time detection. Traditional feature-based ML methods also show promise: one approach combines advanced ECG signal processing, such as peak detection, with an ML classifier to achieve state-of-the-art heartbeat classification performance on a large dataset of over 10,000 patients [10]. Notably, this method maintains high accuracy across different patient cohorts, achieving around 80–90% accuracy even when evaluated on external hospital data, in contrast to the sharp performance drops observed in less generalizable models. Although these models have achieved high performance across a range of disease classification tasks in research setting
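
The sensitivity, specificity, and F1 figures quoted in the introduction are all derived from the same four confusion-matrix counts for a single class. The sketch below shows how they relate. The function name `binary_metrics` and the counts passed to it are hypothetical, chosen only to give round numbers; they are not taken from any of the cited studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, precision, and F1 for one class.

    Assumes tp + fn, tn + fp, and tp + fp are all nonzero.
    """
    sensitivity = tp / (tp + fn)   # share of true cases the model detects (recall)
    specificity = tn / (tn + fp)   # share of healthy cases correctly left unflagged
    precision = tp / (tp + fp)     # share of flagged cases that are truly positive
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: 100 abnormal and 1,000 normal recordings.
m = binary_metrics(tp=80, fp=10, tn=990, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the classifier reaches 80% sensitivity and 99% specificity. Because abnormal recordings are much rarer than normal ones in this example, specificity can stay very high even when a meaningful share of true cases is missed, which is why the two numbers are usually reported together.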


Reference

This content is AI-processed based on open access ArXiv data.
