Enhancing Rare Codes via Probability-Biased Directed Graph Attention for Long-Tail ICD Coding
📝 Original Info
- Title: Enhancing Rare Codes via Probability-Biased Directed Graph Attention for Long-Tail ICD Coding
- ArXiv ID: 2511.09559
- Date: 2025-10-31
- Authors: Author information was not provided (the source does not include an author list).
📝 Abstract
Automated International Classification of Diseases (ICD) coding aims to assign multiple disease codes to clinical documents and plays a critical role in healthcare informatics. However, its performance is hindered by the extreme long-tail distribution of the ICD ontology, where a few common codes dominate while thousands of rare codes have very few examples. To address this issue, we propose a Probability-Biased Directed Graph Attention model (ProBias) that partitions codes into common and rare sets and allows information to flow only from common to rare codes. Edge weights are determined by conditional co-occurrence probabilities, which guide the attention mechanism to enrich rare-code representations with clinically related signals. To provide higher-quality semantic representations as model inputs, we further employ large language models to generate enriched textual descriptions for ICD codes, offering external clinical context that complements statistical co-occurrence signals. Applied to automated ICD coding, our approach significantly improves the representation and prediction of rare codes, achieving state-of-the-art performance on three benchmark datasets. In particular, we observe substantial gains in macro-averaged F1 score, a key metric for long-tail classification.
💡 Deep Analysis
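The mechanism described in the abstract (partition codes into common and rare, connect them with directed common-to-rare edges weighted by conditional co-occurrence, and bias attention with those weights) can be illustrated with a small pure-Python sketch. This is not the authors' implementation, and most specifics are assumptions because the abstract does not state them: the frequency cutoff `rare_threshold`, the direction of conditioning (here P(rare | common)), adding the log-probability to a scaled dot-product score, and the residual update. The function names `build_directed_edges` and `biased_attention`, the toy label sets, and the two-dimensional vectors are hypothetical and exist only for illustration.

```python
import math
from collections import Counter
from itertools import permutations


def build_directed_edges(label_sets, rare_threshold):
    """Split codes by document frequency and build common -> rare edges.

    Assumption: a code is "common" if it appears in more than
    `rare_threshold` documents, and edges[(c, r)] estimates P(r | c).
    """
    freq = Counter(code for labels in label_sets for code in set(labels))
    common = {code for code, n in freq.items() if n > rare_threshold}
    rare = set(freq) - common

    pair_counts = Counter()
    for labels in label_sets:
        for a, b in permutations(set(labels), 2):
            if a in common and b in rare:  # one direction only
                pair_counts[(a, b)] += 1

    edges = {(c, r): n / freq[c] for (c, r), n in pair_counts.items()}
    return common, rare, edges


def biased_attention(rare_code, query, keys, values, edges, eps=1e-9):
    """Update one rare-code vector from its common-code neighbours.

    Assumption: score = q.k / sqrt(d) + log P(rare | common), followed by
    a softmax over incoming edges and a residual addition.
    """
    neighbours = sorted(c for (c, r) in edges if r == rare_code)
    if not neighbours:
        return list(query)  # no incoming edge: representation unchanged

    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, keys[c])) / scale
        + math.log(edges[(c, rare_code)] + eps)
        for c in neighbours
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    message = [
        sum(w * values[c][i] for w, c in zip(weights, neighbours))
        for i in range(len(query))
    ]
    return [q + m for q, m in zip(query, message)]


# Toy usage: five documents, each with a set of (illustrative) ICD codes.
label_sets = [
    ["I10", "E11.9"],
    ["I10", "E11.9", "E85.4"],
    ["I10", "N18.6"],
    ["E11.9", "E85.4"],
    ["I10"],
]
common, rare, edges = build_directed_edges(label_sets, rare_threshold=2)

keys = {"I10": [1.0, 0.0], "E11.9": [0.0, 1.0]}
values = {"I10": [1.0, 0.0], "E11.9": [0.0, 1.0]}
updated = biased_attention("E85.4", [0.0, 0.0], keys, values, edges)
```

In this toy setup the rare code `E85.4` receives messages only from the common codes it co-occurs with, and the neighbour with the higher conditional probability gets the larger attention weight. Because no edge points from a rare code to a common code, common-code representations are never altered by rare ones, which matches the one-way information flow the abstract describes.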
📄 Full Content
Reference
This content is AI-processed based on open access ArXiv data.