Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model

Reading time: 5 minutes
...

📝 Original Info

  • Title: Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model
  • ArXiv ID: 2512.18573
  • Date: 2025-12-21
  • Authors: Sumaiya Ali, Areej Alhothali, Ohoud Alzamzami, Sameera Albasri, Ahmed Abduljabbar, Muhammad Alwazzan

📝 Abstract

Placenta Accreta Spectrum (PAS) is a serious obstetric condition that can be challenging to diagnose with Magnetic Resonance Imaging (MRI) due to variability in radiologists' interpretations. To overcome this challenge, a hybrid 3D deep learning model for automated PAS detection from volumetric MRI scans is proposed in this study. The model integrates a 3D DenseNet121 to capture local features and a 3D Vision Transformer (ViT) to model global spatial context. It was developed and evaluated on a retrospective dataset of 1,133 MRI volumes. Multiple 3D deep learning architectures were also evaluated for comparison. On an independent test set, the DenseNet121-ViT model achieved the highest performance with a five-run average accuracy of 84.3%. These results highlight the strength of hybrid CNN-Transformer models as a computer-aided diagnosis tool. The model's performance demonstrates clear potential to assist radiologists by providing robust decision support, improving diagnostic consistency across interpretations, and ultimately enhancing the accuracy and timeliness of PAS diagnosis.
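
The abstract describes the core design: a 3D CNN captures local features and a Transformer models global spatial context over the volume. The sketch below illustrates that composition in PyTorch; it is not the authors' implementation, and the small convolutional stack, token layout, and layer sizes are assumptions standing in for the paper's 3D DenseNet121 and 3D ViT.

```python
# Minimal sketch (not the authors' code): a 3D CNN backbone whose feature map is
# flattened into tokens for a Transformer encoder, then pooled for binary PAS
# classification. The paper uses 3D DenseNet121 + a 3D ViT; a small conv stack and
# torch.nn.TransformerEncoder stand in here so the example stays self-contained.
import torch
import torch.nn as nn


class Hybrid3DClassifier(nn.Module):
    def __init__(self, in_channels=1, embed_dim=128, num_heads=4, depth=2, num_classes=2):
        super().__init__()
        # Local feature extractor (placeholder for the 3D DenseNet121 branch).
        self.cnn = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm3d(32), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm3d(64), nn.ReLU(inplace=True),
            nn.Conv3d(64, embed_dim, kernel_size=3, stride=2, padding=1),
        )
        # Global-context module (placeholder for the 3D ViT branch); a full ViT
        # would also add positional embeddings to the tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=4 * embed_dim, batch_first=True,
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                         # x: (B, C, D, H, W)
        feat = self.cnn(x)                        # (B, E, d, h, w) local features
        tokens = feat.flatten(2).transpose(1, 2)  # (B, d*h*w, E): one token per feature-map voxel
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1)
        tokens = self.transformer(tokens)         # self-attention models global spatial context
        return self.head(tokens[:, 0])            # classify from the [CLS] token


if __name__ == "__main__":
    model = Hybrid3DClassifier()
    volume = torch.randn(2, 1, 64, 64, 64)        # toy batch of 3D MRI volumes
    print(model(volume).shape)                    # torch.Size([2, 2])
```

Flattening the CNN feature map into tokens lets the self-attention layers relate distant regions of the volume, which is the complementary-strengths idea behind hybrid CNN-Transformer models.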

💡 Deep Analysis

Figure 1

📄 Full Content

Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model

Sumaiya Ali1*, Areej Alhothali1, Ohoud Alzamzami1, Sameera Albasri2, Ahmed Abduljabbar3, and Muhammad Alwazzan3

1Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
2Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
3Department of Radiology, King Abdulaziz University Hospital, Jeddah, Saudi Arabia
*Corresponding author: sali0174@stu.kau.edu.sa

Abstract

Placenta Accreta Spectrum (PAS) is a serious obstetric condition that can be challenging to diagnose with Magnetic Resonance Imaging (MRI) due to variability in radiologists' interpretations. To overcome this challenge, a hybrid 3D deep learning model for automated PAS detection from volumetric MRI scans is proposed in this study. The model integrates a 3D DenseNet121 to capture local features and a 3D Vision Transformer (ViT) to model global spatial context. It was developed and evaluated on a retrospective dataset of 1,133 MRI volumes. Multiple 3D deep learning architectures were also evaluated for comparison. On an independent test set, the DenseNet121-ViT model achieved the highest performance with a five-run average accuracy of 84.3%. These results highlight the strength of hybrid CNN-Transformer models as a computer-aided diagnosis tool. The model's performance demonstrates clear potential to assist radiologists by providing robust decision support, improving diagnostic consistency across interpretations, and ultimately enhancing the accuracy and timeliness of PAS diagnosis.

Keywords: Placenta Accreta Spectrum, MRI, CNN, Vision Transformer

1 Introduction

Placenta Accreta Spectrum (PAS) represents a group of life-threatening obstetric conditions defined by abnormal placental invasion of the uterine wall. Its incidence has risen dramatically in recent decades due to the rise in cesarean deliveries and the resulting scar tissue [1,2]. Estimates indicate that PAS cases have doubled over the last two decades, making PAS a growing health challenge [3]. The condition carries high risks, including severe hemorrhage, infection, and a frequent need for peripartum hysterectomy, contributing significantly to maternal morbidity and mortality [2]. Early and precise prenatal diagnosis is vital for reducing these risks and allowing multidisciplinary management, which improves outcomes [1].

Current diagnostic practice combines clinical risk assessment with imaging, primarily ultrasound (US) and magnetic resonance imaging (MRI) [4]. Although US imaging is commonly used for initial screening, it has limitations in accurately assessing invasion depth and extent, especially with a posterior placenta or bowel involvement. MRI offers complementary value through superior soft-tissue contrast and a larger field of view, providing detailed evaluation of invasion depth and adjacent organ involvement [2], [4]. Among the different MRI sequences, T2-weighted imaging (T2WI) is the most common for placental assessment, while T1-weighted imaging (T1WI) reflects bleeding [5]. Key MRI signs of PAS include T2-dark intraplacental bands, focal interruption of the myometrial border, and abnormal vascularity (Fig. 1) [6, 7]. However, interpretation of these complex imaging features remains qualitative, requires significant radiological expertise, and is prone to inter-observer variability [8], underscoring the need for objective diagnostic methods.
Figure 1: MRI signs of PAS: (A) intraplacental bands, (B) myometrial border interruption, and (C) abnormal vascularity [7].

Deep learning methods such as convolutional neural networks (CNNs) have shown notable success in learning complex patterns from raw imaging data, often exceeding human performance in classification and segmentation tasks [9]. Although CNNs perform well at capturing local features, they often struggle to grasp global context. Vision Transformers (ViTs), on the other hand, use self-attention mechanisms to effectively extract these global relationships [10]. This has led to hybrid CNN-Transformer models that aim to combine the complementary strengths of each approach and show strong performance in complex medical imaging tasks [11–14].

While deep learning has been applied to PAS detection from MRI, systematic studies of 3D architectures remain limited, and the optimal model design for capturing both local and global 3D PAS markers is still undefined. This study addresses this gap by proposing a tailored 3D hybrid DenseNet121–ViT model for end-to-end volumetric PAS detection. A systematic comparison of six modern 3D deep learning models was conducted on a novel 3D MRI dataset. The study demonstrates that integrating dense local feature extraction with global self-attention yields superior diagnostic performance.

2 Related Work

Computational MRI analysis for PAS has evolved from handcrafted f
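
The reported results are five-run averages on an independent test set. As a rough illustration of that protocol (not the authors' pipeline), the snippet below repeats training and evaluation across five random seeds and averages the resulting test accuracy; `train_and_evaluate` is a hypothetical placeholder that merely simulates a run so the script stays self-contained.

```python
# Sketch of a five-run evaluation protocol, as implied by the reported
# "five-run average accuracy". Not the authors' code.
import random
import statistics

import torch


def train_and_evaluate(seed: int) -> float:
    """Placeholder: in a real pipeline this would train the 3D DenseNet121-ViT
    model with the given seed and return accuracy on the held-out test set."""
    random.seed(seed)
    torch.manual_seed(seed)
    return 0.80 + random.random() * 0.08          # simulated test accuracy


seeds = [0, 1, 2, 3, 4]
accuracies = [train_and_evaluate(s) for s in seeds]
mean_acc = statistics.mean(accuracies)
std_acc = statistics.stdev(accuracies)
print("per-run accuracy:", [f"{a:.3f}" for a in accuracies])
print(f"five-run average: {mean_acc:.3f} ± {std_acc:.3f}")
```

Averaging over several seeded runs reduces the influence of initialization and data-shuffling randomness, which is why a multi-run average is a more stable summary than a single test score.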


Reference

This content is AI-processed based on open access ArXiv data.
