StressRoBERTa: Cross-Condition Transfer Learning from Depression, Anxiety, and PTSD to Stress Detection
Reading time: 11 minutes
...
Original Info
Title: StressRoBERTa: Cross-Condition Transfer Learning from Depression, Anxiety, and PTSD to Stress Detection
ArXiv ID: 2512.23813
Date: 2025-12-29
Authors: Amal Alqahtani, Efsun Kayi, Mona Diab
Abstract
The prevalence of chronic stress represents a significant public health concern, with social media platforms like Twitter serving as important venues for individuals to share their experiences. This paper introduces StressRoBERTa, a cross-condition transfer learning approach for automatic detection of self-reported chronic stress in English tweets. The investigation examines whether continual training on clinically related conditions (depression, anxiety, PTSD), disorders with high comorbidity with chronic stress, improves stress detection compared to general language models and broad mental health models. RoBERTa is continually trained on the Stress-SMHD corpus (108M words from users with self-reported diagnoses of depression, anxiety, and PTSD) and fine-tuned on the SMM4H 2022 Task 8 dataset. StressRoBERTa achieves 82% F1-score, outperforming the best shared task system (79% F1) by 3 percentage points. The results demonstrate that focused cross-condition transfer from stress-related disorders (+1% F1 over vanilla RoBERTa) provides stronger representations than general mental health training. Evaluation on Dreaddit (81% F1) further demonstrates transfer from clinical mental health contexts to situational stress discussions.
Full Content
Chronic stress is a persistent sense of pressure that continues for an extended period (VandenBos, 2007) and represents a significant public health concern. Understanding chronic stress is crucial given its serious negative effects on both physical and mental health. Numerous studies demonstrate that chronic stress can negatively affect the immune system (Khansari et al., 1990) and may lead to other mental illnesses such as depression or suicidality (McEwen and Sapolsky, 2006). Detecting chronic stress is vital for preventing adverse health effects, implementing effective stress management techniques, addressing underlying causes, improving overall quality of life, and reducing costs related to healthcare and lost productivity.
Building on domain-adaptive training strategies, this paper investigates whether continual training on related mental health conditions improves the detection of a target condition. The approach is evaluated on the SMM4H 2022 Shared Task 8, which focuses on classifying self-reported chronic stress on Twitter (Weissenbacher et al., 2022). This task presents several challenges. First, the dataset is highly imbalanced (37% positive, 63% negative). Second, Twitter’s 280-character limit requires models to extract stress signals from short text. Third, distinguishing genuine self-disclosure from mere mentions of stress requires understanding subtle linguistic cues. The shared task attracted multiple teams, with a median F1 score of 75% and best performance of 79% F1 (Huang et al., 2022), indicating substantial task difficulty.
This work introduces a cross-condition transfer learning approach. Pretrained language models are continually trained on posts from users with depression, anxiety, and PTSD (conditions classified as stress-related disorders with high clinical comorbidity with chronic stress) and subsequently fine-tuned on stress detection. The central research question is whether continual training on a focused set of clinically related conditions (depression, anxiety, PTSD) improves stress detection compared to both general language models and broad mental health models.
The Stress-SMHD corpus (Cohan et al., 2018) provides 108M words of training data from three stress-related conditions. This enables examination of whether continual training on a focused set of related conditions suffices for effective cross-condition transfer.
The contributions of this work are as follows:
To the authors’ knowledge, no shared task systems employed cross-condition transfer learning in which a model is continually trained on related conditions (depression, anxiety, PTSD) and then fine-tuned on stress detection. Most systems used general pretrained language models (BERT, RoBERTa) or Twitter-specific models (BERTweet) without continual training on mental health data.
The use of cross-condition transfer for stress detection is supported by computational linguistics research. De Choudhury et al. (2013) demonstrated that stress-related linguistic markers appear consistently across depression, anxiety, and stress discussions. Guntuku et al. (2017) showed that models trained on depression and anxiety transfer effectively to stress detection, achieving F1 scores of 0.76 to 0.81.
The training follows standard RoBERTa protocols using the Transformers library (Wolf et al., 2020). RoBERTa builds on the BERT architecture and is trained via masked language modeling (Liu et al., 2019). RoBERTa-base serves as the foundation model, with dynamic masking applied during training.
The training procedure follows domain-adaptive continual training (Gururangan et al., 2020). The model is initialized from the original RoBERTa checkpoint and continually trained on the Stress-SMHD corpus, which contains posts from users with depression, anxiety, and PTSD. This creates a cross-condition training scenario in which models learn representations from related mental health conditions, which are subsequently fine-tuned for stress detection. Figure 1 shows the overall process.
Stress-SMHD is a subset of the Self-Reported Mental Health Diagnoses dataset (Cohan et al., 2018) containing Reddit posts from users with diagnoses of anxiety, PTSD, or depression. This ensures the continual training data reflects language from individuals with these conditions. The corpus does not contain explicit stress labels but consists of posts discussing experiences with depression, anxiety, and PTSD. This enables evaluation of whether representations learned from these related conditions transfer effectively to stress detection. Tables 1 and 2 summarize the corpus composition.
Weight Initialization. StressRoBERTa is initialized with RoBERTa-base weights (Liu et al., 2019; Wolf et al., 2020) and adapted to learn representations from depression, anxiety, and PTSD discussions.
StressRoBERTa is continually trained on Stress-SMHD using the Huggingface Transformers library.
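As a minimal sketch of this continual-training step (not the paper's exact script), the following uses the Hugging Face Transformers and Datasets libraries to continue masked language modeling on the Stress-SMHD posts with dynamic masking; the corpus file name and the hyperparameters are illustrative assumptions.

```python
# Sketch of continual MLM training: initialize from RoBERTa-base and keep
# training with masked language modeling on Stress-SMHD posts.
from datasets import load_dataset
from transformers import (
    RobertaTokenizerFast, RobertaForMaskedLM,
    DataCollatorForLanguageModeling, Trainer, TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")  # start from the original checkpoint

# Hypothetical text file with one Stress-SMHD post per line.
corpus = load_dataset("text", data_files={"train": "stress_smhd_posts.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking: tokens are re-masked on the fly each time a batch is drawn.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="stressroberta-mlm",
    per_device_train_batch_size=16,   # assumed value
    num_train_epochs=3,               # assumed value
    learning_rate=5e-5,               # assumed value
    save_strategy="epoch",
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```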
Results on SMM4H 2022 Task 8
Table 6 reports F1 and recall for the positive (stress) class, comparing the proposed approach with shared-task participants and baseline models.
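For clarity on the reported metric, a minimal sketch of computing F1 and recall restricted to the positive (stress) class with scikit-learn is shown below; the labels are toy placeholders, purely illustrative.

```python
# Positive-class F1 and recall, as reported in Table 6 (toy labels).
from sklearn.metrics import f1_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = self-reported chronic stress
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

pos_f1 = f1_score(y_true, y_pred, pos_label=1)
pos_recall = recall_score(y_true, y_pred, pos_label=1)
print(f"positive-class F1: {pos_f1:.2f}, recall: {pos_recall:.2f}")
```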
StressRoBERTa achieves 82% F1 on the Twitter stress detection task, outperforming all shared task participants and baseline models. This represents a 3 percentage point improvement over the best shared task system (Huang et al., 2022) (79% F1) and a 7 percentage point gain over the shared task median (75% F1). The results demonstrate that cross-condition transfer learning from stress-related disorders provides effective representations for stress detection in short social media text. Future work should include statistical significance testing across multiple random seeds to validate the robustness of these improvements.
The proposed approach differs from shared-task systems in several key ways. First, shared task systems primarily used general pretrained language models (BERT, RoBERTa) or Twitter-specific models (BERTweet) without domain-adaptive continual training on mental health data. The best performing system (Huang et al., 2022) used ensemble methods and pseudo-labeling post-processing rather than domain-specific continual training. Second, no shared task systems employed cross-condition transfer learning from related mental health conditions.
StressRoBERTa’s superior performance suggests that domain-adaptive continual training on stress-related disorders captures linguistic patterns relevant to stress detection that general pretrained language models miss. The focused selection of depression, anxiety, and PTSD (conditions with high clinical comorbidity with stress) provides a stronger signal than either general language models or Twitter-specific models trained on diverse content.
StressRoBERTa also outperforms domain-specific baseline models. Clinical domain models (ClinicalBERT 75% F1, BioBERT 80% F1) underperform, suggesting that clinical note language does not transfer well to social media stress detection. The comparison to general mental health models is more relevant: MentalBERT and MentalRoBERTa each achieve 81% F1, identical to vanilla RoBERTa-base. This reveals a critical insight: general mental health continual training provides no improvement for stress detection, while focused cross-condition continual training on stress-related disorders improves performance by +1% F1. Both StressRoBERTa and MentalRoBERTa use the RoBERTa-base architecture with identical fine-tuning procedures but differ in continual training corpus selection.
MentalRoBERTa is continually trained on general mental health subreddits (r/depression, r/SuicideWatch, r/Anxiety, r/offmychest, r/bipolar, r/mentalillness, r/mentalhealth), while StressRoBERTa is continually trained on stress-related disorder subreddits (r/depression, r/Anxiety, r/ptsd). MentalRoBERTa’s lack of improvement over vanilla RoBERTa-base suggests that including diverse mental health content reduces the effectiveness of transfer learning for stress detection. StressRoBERTa’s focused selection provides more substantial clinical and linguistic overlap with stress, enabling effective cross-condition transfer.
To assess generalizability, two complementary evaluations are conducted. First, cross-platform transfer from Reddit to Twitter is tested using the SMM4H Task 8 dataset. Second, cross-context transfer from clinical/diagnostic subreddits (r/depression, r/Anxiety, r/ptsd) to situational stress discussions is tested using Dreaddit, which comprises posts from diverse topical communities.
Table 7 shows performance on both benchmarks. StressRoBERTa achieves 81% F1 on Dreaddit, demonstrating that representations learned from clinical mental health discussions (depression, anxiety, PTSD) transfer effectively to situational stress contexts. The consistent performance across Twitter (82% F1, cross-platform) and Dreaddit (81% F1, cross-context) validates that cross-condition continual training captures transferable stress-related patterns rather than platform-specific or context-specific artifacts.
Why Does Cross-Condition Transfer Improve Performance?
Three factors explain StressRoBERTa’s superior performance on the stress detection task.
Clinical Comorbidity. High clinical comorbidity rates support the validity of cross-condition transfer. Studies show 60 to 80% of individuals with depression report chronic stress (Mazure, 1998; Tennant, 2002), and 50 to 70% of anxiety patients meet criteria for chronic stress (Kessler, 2013). Chronic stress increases depression risk 2.5 to 3.0 fold (Kendler et al., 1999) and anxiety risk 2.0 to 2.8 fold (Kessler, 2013). This overlap means language from individuals discussing depression, anxiety, or PTSD inherently contains stress-related content, making these conditions appropriate source domains for stress detection.
Focused Condition Selection. Two methodological factors strengthen cross-condition transfer.
Both StressRoBERTa and MentalRoBERTa include depression and anxiety subreddits. However, StressRoBERTa adds r/ptsd, a stress-related disorder in which 50 to 70% of patients meet criteria for chronic stress, while MentalRoBERTa adds broader mental health content (r/mentalillness, r/mentalhealth, r/offmychest, r/bipolar, r/SuicideWatch). Additionally, Stress-SMHD comprises posts from users who explicitly self-reported diagnoses, ensuring that the language reflects individuals who identify as having these conditions. MentalRoBERTa was trained on all posts from mental health subreddits, including general discussions and advice-seeking, without requiring self-reported diagnoses. This suggests that focused cross-condition transfer benefits from both selecting clinically related conditions and using self-reported diagnosis data.
This work demonstrates that cross-condition transfer learning from stress-related disorders improves stress detection on social media. StressRoBERTa achieves 82% F1 on SMM4H 2022 Shared Task 8, outperforming the best shared task system (79% F1) by 3 percentage points and the median (75% F1) by 7 percentage points. The primary contribution lies in demonstrating that focused cross-condition continual training on clinically related conditions (depression, anxiety, PTSD) outperforms both general pretrained language models and general mental health models. While vanilla RoBERTa-base achieves 81% F1, the general mental health model (MentalRoBERTa) shows no improvement (81% F1), and focused cross-condition continual training improves performance to 82% F1. This validates that condition selection based on clinical relationships matters for cross-condition transfer.
Clinical comorbidity and linguistic overlap between source conditions and the target condition enable effective knowledge transfer, supported by empirical results across Twitter (82% F1, cross-platform) and Dreaddit (81% F1, cross-context). These findings have important implications for mental health NLP. Domain-adaptive continual training on carefully selected related conditions provides stronger representations than either general language models or broad mental health continual training.
StressRoBERTa demonstrates strong performance for English stress detection on social media but has several limitations. First, the model is trained and evaluated exclusively on English text from social media platforms (Reddit and Twitter). Performance on other languages, different domains (e.g., clinical notes, formal writing), and longer documents remains unexplored. Second, the Stress-SMHD corpus comprises only posts from three subreddits (r/depression, r/Anxiety, r/ptsd) with self-reported diagnoses, which may not capture the full linguistic diversity of stress-related discourse across different populations and contexts. Third, no systematic evaluation was conducted of alternative condition combinations for cross-condition transfer. Fourth, training requires substantial computational resources, though the pretrained model will be released to improve accessibility. Finally, as with any automated mental health detection system, StressRoBERTa should only be deployed with appropriate ethical safeguards and clinical oversight to prevent potential misuse.
RoBERTa-base is pretrained on Wikipedia, BooksCorpus, and CC-News, among other corpora (Liu et al., 2019). Continual training is performed on Stress-SMHD using the Transformers library.
4.2 Experimental Setup
RoBERTa-base (Liu et al., 2019) is fine-tuned as the general baseline and StressRoBERTa as the cross-condition model. Comparisons are also made against domain-specific baselines, including ClinicalBERT (Alsentzer et al., 2019), BioBERT (Lee et al., 2020), MentalBERT, and MentalRoBERTa (Ji et al., 2022). All models are base-sized for fair comparison and share the configuration in Table 5.
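For concreteness, a minimal fine-tuning sketch follows, again assuming Hugging Face Transformers and Datasets; the checkpoint name, the CSV file names and columns, and the hyperparameters are illustrative assumptions rather than the paper's exact configuration. Pointing the checkpoint at "roberta-base" instead of the continually trained model reproduces the vanilla baseline setup.

```python
# Sketch of fine-tuning a (continually trained) checkpoint as a binary
# stress classifier on SMM4H 2022 Task 8 style data.
from datasets import load_dataset
from transformers import (
    AutoTokenizer, AutoModelForSequenceClassification,
    DataCollatorWithPadding, Trainer, TrainingArguments,
)

checkpoint = "stressroberta-mlm"  # hypothetical path; use "roberta-base" for the baseline
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Hypothetical CSV files with "text" and "label" (0/1) columns.
data = load_dataset("csv", data_files={"train": "smm4h_train.csv", "validation": "smm4h_dev.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

data = data.map(tokenize, batched=True, remove_columns=["text"])

# Pad dynamically per batch rather than to a fixed length.
collator = DataCollatorWithPadding(tokenizer=tokenizer)

args = TrainingArguments(
    output_dir="stressroberta-smm4h",
    per_device_train_batch_size=32,  # assumed value
    num_train_epochs=3,              # assumed value
    learning_rate=2e-5,              # assumed value
    save_strategy="epoch",
)

Trainer(
    model=model,
    args=args,
    train_dataset=data["train"],
    data_collator=collator,
).train()
```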