TF-MCL: Time-frequency Fusion and Multi-domain Cross-Loss for Self-supervised Depression Detection

Reading time: 5 minutes
...

📝 Original Info

  • Title: TF-MCL: Time-frequency Fusion and Multi-domain Cross-Loss for Self-supervised Depression Detection
  • ArXiv ID: 2512.13736
  • Date: 2025-12-14
  • Authors: Li‑Xuan Zhao (Tianjin University, School of Electrical and Information Engineering; co-first author), Chen‑Yang Xu (Tianjin University, School of Electrical and Information Engineering; co-first author), Wen‑Qiang Li (Tianjin University, School of Electrical and Information Engineering), Bo Wang (Nanchang Police Dog Base of the Ministry of Public Security; Jiangxi Provincial Key Laboratory of Police Dog Breeding and Behavioral Science), Rong‑Xing Wei (same affiliations as Bo Wang), Qing‑Hao Meng (Tianjin University, School of Electrical and Information Engineering; corresponding author)

📝 Abstract

In recent years, there has been a notable increase in the use of supervised methods for detecting major depressive disorder (MDD) from electroencephalogram (EEG) signals. However, labeling MDD data remains challenging. As a self-supervised learning method, contrastive learning could address the shortcoming of supervised methods, which are unduly reliant on labels, in the context of MDD detection. However, existing contrastive learning methods are not specifically designed to characterize the time-frequency distribution of EEG signals, and their capacity to acquire low-semantic data representations is still inadequate for MDD detection tasks. To address these problems, we propose a time-frequency fusion and multi-domain cross-loss (TF-MCL) model for MDD detection. TF-MCL generates time-frequency hybrid representations through a fusion mapping head (FMH), which efficiently remaps time-frequency domain information into a fusion domain and thus enhances the model's capacity to synthesize time-frequency information. Moreover, by optimizing a multi-domain cross-loss function, the distribution of the representations in the time-frequency domain and the fusion domain is reconstructed, thereby improving the model's capacity to acquire fusion representations. We evaluated our model on the publicly available MODMA and PRED+CT datasets and show a significant improvement in accuracy, outperforming the existing state-of-the-art (SOTA) method by 5.87% and 9.96%, respectively.
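
The paper itself does not include code, but a minimal PyTorch-style sketch may help make the two ingredients concrete. Everything below (module names, dimensions, and the particular pairing of loss terms) is an illustrative assumption rather than the authors' implementation: the fusion mapping head is modeled as a small MLP over concatenated time- and frequency-domain embeddings, and the multi-domain cross-loss is read as pairwise contrastive terms across the time, frequency, and fusion domains.

```python
# Hypothetical sketch of a fusion mapping head (FMH) and a multi-domain
# cross-loss in the spirit of TF-MCL; the paper does not publish this code,
# so all module shapes and loss pairings below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionMappingHead(nn.Module):
    """Remaps concatenated time- and frequency-domain embeddings into a
    shared fusion domain (assumed here to be a small MLP)."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, z_time: torch.Tensor, z_freq: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z_time, z_freq], dim=-1))

def nt_xent(a: torch.Tensor, b: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Standard NT-Xent contrastive loss between two batches of views."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / tau                      # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

def multi_domain_cross_loss(z_time, z_freq, z_fuse):
    """One plausible reading of a 'multi-domain cross-loss': contrast every
    pair of domains so the fusion representation stays consistent with both."""
    return (nt_xent(z_time, z_freq)
            + nt_xent(z_time, z_fuse)
            + nt_xent(z_freq, z_fuse)) / 3.0

# Toy usage on random embeddings (batch of 32 EEG segments, 128-dim each).
z_t, z_f = torch.randn(32, 128), torch.randn(32, 128)
fmh = FusionMappingHead(dim=128)
loss = multi_domain_cross_loss(z_t, z_f, fmh(z_t, z_f))
```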

💡 Deep Analysis

Figure 1

📄 Full Content


Li-Xuan Zhao a,§, Chen-Yang Xu a,§, Wen-Qiang Li a, Bo Wang b,c, Rong-Xing Wei b,c, Qing-Hao Meng a,∗

a School of Electrical and Information Engineering, Tianjin University, 300072, Tianjin, China
b Nanchang Police Dog Base of the Ministry of Public Security, Nanchang 330100, China
c Jiangxi Provincial Key Laboratory of Police Dog Breeding and Behavioral Science, Nanchang 330100, China

∗ Corresponding author: Qing-Hao Meng, Email: qh_meng@tju.edu.cn. § Li-Xuan Zhao and Chen-Yang Xu contributed equally.

Abstract

In recent years, there has been a notable increase in the use of supervised methods for detecting major depressive disorder (MDD) from electroencephalogram (EEG) signals. However, labeling MDD data remains challenging. As a self-supervised learning method, contrastive learning could address the shortcoming of supervised methods, which are unduly reliant on labels, in the context of MDD detection. However, existing contrastive learning methods are not specifically designed to characterize the time-frequency distribution of EEG signals, and their capacity to acquire low-semantic data representations is still inadequate for MDD detection tasks. To address these problems, we propose a time-frequency fusion and multi-domain cross-loss (TF-MCL) model for MDD detection. TF-MCL generates time-frequency hybrid representations through a fusion mapping head (FMH), which efficiently remaps time-frequency domain information into a fusion domain and thus enhances the model’s capacity to synthesize time-frequency information. Moreover, by optimizing a multi-domain cross-loss function, the distribution of the representations in the time-frequency domain and the fusion domain is reconstructed, thereby improving the model’s capacity to acquire fusion representations. We evaluated our model on the publicly available MODMA and PRED+CT datasets and show a significant improvement in accuracy, outperforming the existing state-of-the-art (SOTA) method by 5.87% and 9.96%, respectively.

Key Words: MDD detection, Contrastive learning, Time-frequency fusion, Multi-domain cross-loss function

1. Introduction

Major depressive disorder (MDD) [1] is a prevalent emotional dysfunction that can manifest itself through a variety of physiological signals. Electroencephalogram (EEG) signals have been utilized to diagnose MDD and other psychiatric disorders because they are non-invasive, convenient, and efficient. Presently, a considerable number of researchers employ acquired EEG signals together with machine learning (ML) [2] or deep learning (DL) [3], [4] methods to distinguish MDD patients from healthy controls (HC). However, general supervised ML or DL methods are limited by their over-reliance on data labeling. Consequently, psychologists are frequently required to conduct a large number of manual diagnoses in order to categorize the data, which consumes considerable medical resources. Self-supervised methods could solve the problem of over-reliance on labeling in supervised methods.

In recent years, the applications of contrastive learning methods in the field of time-series signals have gradually increased. Eldele et al. [5] proposed an unsupervised time-series representation learning framework via temporal and contextual contrasting (TS-TCC), which designs a new cross-view prediction task to learn robust temporal representations. Yue et al. [6] put forward a universal framework for learning representations of time series at an arbitrary semantic level (TS2Vec), which implements a robust contextual representation for each timestamp by learning in a hierarchical contrastive manner over an augmented context view. Guo et al. [7] presented a modality consistency-guided contrastive learning (MoCL) method, which exploits the complementarity and redundancy between different time-series signals to construct a generalized model for personalized domain adaptation. Wu et al. [8] put forth an end-to-end auto-augmentation contrastive learning (AutoCL) method for time-series signals. AutoCL automatically learns data augmentation strategies, thereby alleviating the burden of manually designing such strategies.
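
These frameworks differ in their pretext tasks, but they share one recipe: generate two augmented views of an unlabeled segment, encode both with a shared network, and pull matching views together with a contrastive loss. The sketch below illustrates that shared recipe on raw EEG; the augmentations, encoder size, and hyperparameters are illustrative assumptions, not the exact setup of TS-TCC, TS2Vec, MoCL, or AutoCL.

```python
# Minimal sketch of the common self-supervised recipe used by time-series
# contrastive methods: two augmented views of each unlabeled EEG segment go
# through one encoder, and a contrastive loss pulls matching views together.
import torch
import torch.nn as nn
import torch.nn.functional as F

def jitter(x, sigma=0.01):          # additive Gaussian noise
    return x + sigma * torch.randn_like(x)

def scale(x, sigma=0.1):            # per-channel amplitude scaling
    return x * (1 + sigma * torch.randn(x.size(0), x.size(1), 1))

class EEGEncoder(nn.Module):
    """Small 1D-CNN mapping (batch, channels, time) to a (batch, dim) embedding."""
    def __init__(self, channels=16, dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(64, dim, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(dim, dim)   # projection head used only for the loss

    def forward(self, x):
        return self.proj(self.conv(x).squeeze(-1))

def contrastive_loss(z1, z2, tau=0.2):
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0), device=z1.device))

# One unlabeled pretraining step on a toy batch of 32 EEG segments, 16 channels.
x = torch.randn(32, 16, 1024)
enc = EEGEncoder()
loss = contrastive_loss(enc(jitter(x)), enc(scale(x)))
```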


In the context of time-series signals, frequency domain information constitutes a pivotal feature. Consequently, scholars investigate contrastive learning methods based on time-frequency feature fusion. Yang et al. [9] introduced a new unsupervised time-series representation learning method called bilinear temporal-spectral fusion (BTSF), which can obtain excellent performance through a novel iterative bilinear time-frequency fusion method to explicitly model cross-domain dependencies. However, BTSF does n
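
For intuition, a simplified, single-step version of bilinear time-frequency interaction can be written in a few lines. BTSF's actual iterative fusion is considerably more elaborate; the encoder choices and dimensions below are assumptions made only for illustration.

```python
# Simplified illustration of bilinear time-frequency fusion in the spirit of
# BTSF: spectral features come from an FFT of the raw segment, and the time
# and frequency embeddings interact through an outer product before being
# pooled into one fused vector. Not BTSF's published architecture.
import torch
import torch.nn as nn

class BilinearTFFusion(nn.Module):
    def __init__(self, in_len=1024, channels=16, dim=64):
        super().__init__()
        self.time_enc = nn.Linear(channels * in_len, dim)
        self.freq_enc = nn.Linear(channels * (in_len // 2 + 1), dim)
        self.out = nn.Linear(dim * dim, dim)

    def forward(self, x):                        # x: (batch, channels, time)
        b = x.size(0)
        spec = torch.fft.rfft(x, dim=-1).abs()   # amplitude spectrum per channel
        z_t = self.time_enc(x.reshape(b, -1))    # time-domain embedding (b, dim)
        z_f = self.freq_enc(spec.reshape(b, -1)) # frequency-domain embedding (b, dim)
        bilinear = torch.einsum('bi,bj->bij', z_t, z_f)   # cross-domain interactions
        return self.out(bilinear.reshape(b, -1))          # fused vector (b, dim)

fused = BilinearTFFusion()(torch.randn(8, 16, 1024))
```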


Reference

This content is AI-processed based on open access ArXiv data.
