Title: Nodule-DETR: A Novel DETR Architecture with Frequency-Channel Attention for Ultrasound Thyroid Nodule Detection
ArXiv ID: 2601.01908
Date: 2026-01-05
Authors: Jingjing Wang, Qianglin Liu, Zhuo Xiao, Xinning Yao, Bo Liu, Lu Li, Lijuan Niu, Fugen Zhou
📝 Abstract
Thyroid cancer is the most common endocrine malignancy, and its incidence is rising globally. While ultrasound is the preferred imaging modality for detecting thyroid nodules, its diagnostic accuracy is often limited by challenges such as low image contrast and blurred nodule boundaries. To address these issues, we propose Nodule-DETR, a novel detection transformer (DETR) architecture designed for robust thyroid nodule detection in ultrasound images. Nodule-DETR introduces three key innovations: a Multi-Spectral Frequency-domain Channel Attention (MSFCA) module that leverages frequency analysis to enhance features of low-contrast nodules; a Hierarchical Feature Fusion (HFF) module for efficient multi-scale integration; and Multi-Scale Deformable Attention (MSDA) to flexibly capture small and irregularly shaped nodules. We conducted extensive experiments on a clinical dataset of real-world thyroid ultrasound images. The results demonstrate that Nodule-DETR achieves state-of-the-art performance, outperforming the baseline model by 0.149 in mAP@0.5:0.95. This accuracy highlights the potential of Nodule-DETR for clinical application as an effective tool in computer-aided thyroid diagnosis. The code for this work is available at https://github.com/wjj1wjj/Nodule-DETR.
📄 Full Content
Nodule-DETR: A Novel DETR Architecture with Frequency-Channel
Attention for Ultrasound Thyroid Nodule Detection
Jingjing Wang1, Qianglin Liu2, Zhuo Xiao1, Xinning Yao1, Bo Liu1,3,*, Lu Li1, Lijuan
Niu2, Fugen Zhou1,3
1 Image Processing Center, Beihang University, Beijing 100191, People’s Republic of
China
2 Department of Ultrasound, National Cancer Center/National Clinical Research
Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and
Peking Union Medical College, Beijing, 100021, People’s Republic of China
3 Beijing Advanced Innovation Center for Biomedical Engineering, Beihang
University, Beijing 100083, People’s Republic of China
Introduction
Thyroid nodules are a common endocrine disorder [1] and are broadly classified as benign or malignant
[2]. The global incidence of thyroid cancer has risen significantly [3]. While the majority of
thyroid cancers have a favorable prognosis, some malignant nodules remain at risk of progression.
Early diagnosis and timely treatment are crucial for preventing thyroid cancer progression and
reducing mortality, making effective thyroid nodule detection methods of great clinical
importance [4].
Current diagnostic methods include ultrasound (US), computed tomography (CT), magnetic
resonance imaging (MRI), fine-needle aspiration (FNA), and physical examination. Among these,
ultrasound imaging is preferred due to its efficiency, non-invasiveness, and affordability [5].
Diagnosis via ultrasound relies on morphological features such as shape, boundary definition, and
internal echoes to distinguish benign from malignant nodules. However, radiologists typically assess
these sonographic features visually, which is relatively subjective and depends heavily on the
individual radiologist’s clinical experience. Consequently, developing accurate methods for the
automated, objective detection and classification of thyroid nodules is critically
important.
With advancements in artificial intelligence, computer-aided diagnosis (CAD) systems based
on deep learning have become a mainstream approach in thyroid nodule diagnosis, aiming to reduce
the workload of physicians and improve diagnostic accuracy [6]. These systems are built primarily
on two architecture families that present a fundamental trade-off. CNN-based methods, such as YOLO [7]
[8] [9] and Faster R-CNN [10] [11] [12], use convolutional kernels that excel at extracting local
features but are limited in their ability to perceive long-range dependencies [13] [14]. Transformer-
based methods, in contrast, are designed for global modeling but are less adept at capturing fine-
grained local details [15] [16]. To resolve this trade-off, recent work has focused on DETR
(DEtection TRansformer) [17], a hybrid architecture that combines a CNN backbone with a
Transformer. DETR reframes object detection as a direct set prediction problem, creating an end-
to-end framework that eliminates the need for hand-designed components like anchors [18] and
non-maximum suppression (NMS) [19]. The success of this architecture has led to its growing
adoption in medical image analysis [20] [21] [22] and, specifically, for thyroid nodule diagnosis
[23] [15].
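
To make the set prediction idea concrete, the following is a minimal Python sketch of one-to-one matching between predicted and ground-truth boxes. It is an illustration under stated assumptions, not the implementation used in this paper: the function names (l1_box_cost, match_predictions) and the example boxes are hypothetical, the cost is a plain L1 distance between boxes (DETR's matching cost also includes classification and generalized IoU terms), and the search is brute force over permutations rather than the Hungarian algorithm that DETR uses.

```python
from itertools import permutations


def l1_box_cost(pred, target):
    """L1 distance between two boxes in normalized (cx, cy, w, h) format."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def match_predictions(pred_boxes, target_boxes):
    """Find the one-to-one assignment of targets to predictions with minimal total cost.

    Brute-force search, suitable only for tiny illustrative inputs.
    Returns (pairs, total_cost), where pairs is a list of
    (target_index, prediction_index) tuples.
    """
    if len(pred_boxes) < len(target_boxes):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best_assignment = float("inf"), ()
    # Each permutation selects a distinct prediction index for every target.
    for pred_indices in permutations(range(len(pred_boxes)), len(target_boxes)):
        cost = sum(
            l1_box_cost(pred_boxes[p], target_boxes[t])
            for t, p in enumerate(pred_indices)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, pred_indices
    return list(enumerate(best_assignment)), best_cost


# Hypothetical example: three predicted boxes, two annotated nodules.
preds = [
    (0.50, 0.50, 0.20, 0.20),
    (0.21, 0.30, 0.10, 0.12),
    (0.80, 0.75, 0.30, 0.25),
]
targets = [
    (0.20, 0.30, 0.10, 0.10),
    (0.52, 0.50, 0.20, 0.22),
]

pairs, total_cost = match_predictions(preds, targets)
# Predictions left without a target would be supervised as "no object".
unmatched = sorted(set(range(len(preds))) - {p for _, p in pairs})
print(pairs, unmatched)
```

Because every ground-truth box is assigned to exactly one prediction and the remaining predictions are trained toward a "no object" label, duplicate detections are penalized during training, which is why a separate NMS step is not required at inference.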
Despite advancements in CAD systems for ultrasound thyroid nodule analysis, significant
challenges remain. A key one is that many detection algorithms
rely on clearly defined nodule boundaries and distinct features to achieve accurate identification.
However, thyroid nodules in ultrasound images often exhibit blurred boundaries, irregular shapes,
and characteristi