Title: Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model
ArXiv ID: 2512.18573
Date: 2025-12-21
Authors: Sumaiya Ali, Areej Alhothali, Ohoud Alzamzami, Sameera Albasri, Ahmed Abduljabbar, Muhammad Alwazzan
📝 Abstract
Placenta Accreta Spectrum (PAS) is a serious obstetric condition that can be challenging to diagnose with Magnetic Resonance Imaging (MRI) due to variability in radiologists' interpretations. To overcome this challenge, a hybrid 3D deep learning model for automated PAS detection from volumetric MRI scans is proposed in this study. The model integrates a 3D DenseNet121 to capture local features and a 3D Vision Transformer (ViT) to model global spatial context. It was developed and evaluated on a retrospective dataset of 1,133 MRI volumes. Multiple 3D deep learning architectures were also evaluated for comparison. On an independent test set, the DenseNet121-ViT model achieved the highest performance, with a five-run average accuracy of 84.3%. These results highlight the strength of hybrid CNN-Transformer models as computer-aided diagnosis tools. The model's performance demonstrates clear potential to assist radiologists by providing robust decision support that improves diagnostic consistency across interpretations and ultimately enhances the accuracy and timeliness of PAS diagnosis.
📄 Full Content
Placenta Accreta Spectrum Detection Using an
MRI-based Hybrid CNN-Transformer Model
Sumaiya Ali1*, Areej Alhothali1, Ohoud Alzamzami1, Sameera Albasri2, Ahmed
Abduljabbar3, and Muhammad Alwazzan3
1Department of Computer Science, Faculty of Computing and Information
Technology, King Abdulaziz University, Jeddah, Saudi Arabia
2Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
3Department of Radiology, King Abdulaziz University Hospital
Jeddah, Saudi Arabia
*Corresponding author: sali0174@stu.kau.edu.sa
Keywords: Placenta Accreta Spectrum, MRI, CNN, Vision Transformer
arXiv:2512.18573v1 [cs.CV] 21 Dec 2025
1 Introduction
Placenta Accreta Spectrum (PAS) encompasses life-threatening obstetric conditions defined by
abnormal placental invasion of the uterine wall. Its incidence has risen dramatically in recent
decades owing to the rise in cesarean deliveries and the resulting scar tissue [1,2]. Estimates
indicate that PAS cases have doubled over the last two decades, making PAS a growing health
challenge [3]. The condition carries high risks, including severe hemorrhage, infection, and
frequent need for peripartum hysterectomy, contributing significantly to maternal morbidity and
mortality [2]. Early and precise prenatal diagnosis is vital for reducing these risks and enabling
multidisciplinary management, which improves outcomes [1].
Current diagnostic practice combines clinical risk assessment with imaging, primarily ultrasound
(US) and magnetic resonance imaging (MRI) [4]. Although US imaging is commonly used for initial
screening, it has limitations in accurately assessing invasion depth and extent, especially with
a posterior placenta or bowel involvement. MRI offers complementary value through superior soft
tissue contrast and a larger field of view, providing detailed evaluation of invasion depth and
adjacent organ involvement [2,4]. Among the different MRI sequences, T2-weighted imaging (T2WI)
is the most common for placental assessment, while T1-weighted imaging (T1WI) reflects bleeding
conditions [5]. Key MRI signs of PAS include T2-dark intraplacental bands, focal interruption of
the myometrial border, and abnormal vascularity (Fig. 1) [6,7]. However, the interpretation of
these complex imaging features remains qualitative, requires significant radiological expertise,
and is prone to inter-observer variability [8], underscoring the need for objective diagnostic
methods.
Figure 1: MRI signs of PAS: (A) Intraplacental bands, (B) Myometrial border interruption, and
(C) Abnormal vascularity. [7]
Deep learning methods such as convolutional neural networks (CNNs) have shown notable success in
learning complex patterns from raw imaging data, often exceeding human performance in
classification and segmentation tasks [9]. Although CNNs perform well at capturing local
features, they often struggle to grasp the global context. Vision Transformers (ViTs), on the
other hand, use self-attention mechanisms to effectively extract these global relationships [10].
This has led to the development of hybrid CNN-Transformer models, which aim to combine the
complementary strengths of each approach and have shown strong performance in complex medical
imaging tasks [11–14].
While deep learning has been applied to PAS detection from MRI, systematic studies of 3D
architectures remain limited. The optimal model design for capturing both local and global 3D
PAS markers is still undefined. This study addresses this gap by proposing a tailored 3D hybrid
DenseNet121–ViT model for end-to-end volumetric PAS detection. A systematic comparison of six
modern 3D deep learning models was conducted on a novel dataset of 3D MRI volumes. The study
demonstrates that integrating dense local feature extraction with global self-attention yields
superior diagnostic performance.
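The idea of pairing local feature extraction with global self-attention can be illustrated with a small toy sketch. This is not the authors' 3D DenseNet121-ViT: it assumes a 1D signal in place of an MRI volume, a single hand-picked convolution kernel in place of a trained CNN, and single-head attention with identity query/key/value projections. All function names (local_features, self_attention, hybrid_forward) are hypothetical and chosen only for illustration.

```python
import math


def local_features(signal, kernel):
    """Valid 1D cross-correlation: each output depends only on len(kernel) neighbours."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), so nothing is learned here.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Every token is scored against every other token: global context.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out


def hybrid_forward(signal, kernel, token_size):
    """Local convolution, then tokenisation, then global attention, then mean pooling."""
    feats = local_features(signal, kernel)
    # Split the feature map into non-overlapping tokens (a stand-in for patch embedding).
    tokens = [
        feats[i:i + token_size]
        for i in range(0, len(feats) - token_size + 1, token_size)
    ]
    attended = self_attention(tokens)
    # Average over tokens to obtain one pooled feature vector.
    return [sum(t[c] for t in attended) / len(attended) for c in range(token_size)]


if __name__ == "__main__":
    toy_signal = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0, 0.0, 1.0, 2.0, 0.0]
    pooled = hybrid_forward(toy_signal, [1.0, 0.0, -1.0], token_size=4)
    print(len(pooled), pooled)
```

In this sketch the convolution output at each position sees only three neighbouring samples, whereas each attended token is a weighted mixture of all tokens. That contrast is the complementary behaviour a hybrid model aims to exploit; a real implementation would use learned 3D convolutions, learned projections, and a classification head.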
2 Related Work
Computational MRI analysis for PAS has evolved from handcrafted f