From Retinal Pixels to Patients: Evolution of Deep Learning Research in Diabetic Retinopathy Screening

Reading time: 5 minutes

📝 Abstract

Diabetic Retinopathy (DR) remains a leading cause of preventable blindness, with early detection critical for reducing vision loss worldwide. Over the past decade, deep learning has transformed DR screening, progressing from early convolutional neural networks trained on private datasets to advanced pipelines addressing class imbalance, label scarcity, domain shift, and interpretability. This survey provides the first systematic synthesis of DR research spanning 2016-2025, consolidating results from 50+ studies and over 20 datasets. We critically examine methodological advances, including self- and semi-supervised learning, domain generalization, federated training, and hybrid neuro-symbolic models, alongside evaluation protocols, reporting standards, and reproducibility challenges. Benchmark tables contextualize performance across datasets, while discussion highlights open gaps in multi-center validation and clinical trust. By linking technical progress with translational barriers, this work outlines a practical agenda for reproducible, privacy-preserving, and clinically deployable DR AI. Beyond DR, many of the surveyed innovations extend broadly to medical imaging at scale.

📄 Content

Muskaan Chopra†§, Lorenz Sparrenberg†§, Armin Berger‡‡†§, Sarthak Khanna†, Jan H. Terheyden∥, Rafet Sifa‡‡†§

‡‡Fraunhofer IAIS - Department of Media Engineering, Germany
†University of Bonn - Department of Computer Science, Germany
∥University Hospital Bonn - Department of Ophthalmology, Germany
§Lamarr Institute for Machine Learning and Artificial Intelligence, Germany

Index Terms: Diabetic Retinopathy, Deep Learning, Self-Supervised Learning, Domain Generalization, Medical Imaging

I. INTRODUCTION

Diabetic Retinopathy (DR) is a leading cause of preventable blindness; early detection is critical to reduce vision loss.
According to the International Diabetes Federation (IDF), approximately 537 million adults (aged 20-79) are currently living with diabetes, and this number is projected to rise to 643 million by 2030 [1]. The World Health Organization and large epidemiological studies estimate that over one-third of people with diabetes will develop some form of DR during their lifetime [2]. Despite effective treatments such as laser photocoagulation and anti-VEGF therapy, clinical outcomes remain strongly dependent on early detection and diagnosis. Regular screening is therefore essential. However, health systems worldwide face significant challenges in meeting the escalating demand for retinal examinations. In recent years, advances in artificial intelligence and the increasing availability of large-scale retinal imaging datasets have created new opportunities to address these screening gaps at scale.

This research has been funded by the Federal Ministry of Education and Research of Germany and the state of North Rhine-Westphalia as part of the Lamarr Institute for Machine Learning and Artificial Intelligence.

arXiv:2511.11065v1 [cs.CV] 14 Nov 2025

A. Clinical context and grading

Diagnosis of DR predominantly relies on retinal fundus photography, graded according to the International Clinical Diabetic Retinopathy (ICDR) Severity Scale, which includes five stages from No DR to Proliferative DR [3]. Many screening programs simplify this into a binary classification of referable versus non-referable DR to support operational workflows, though full grading remains important for research and clinical prognosis [4], [5]. Recent reviews, such as Yang et al. (2022) [6], provide a broader historical perspective on DR classification systems, outlining how current ICDR/ETDRS scales capture clinical severity yet overlook neurodegenerative changes and emerging imaging modalities. Their analysis highlights why ongoing refinement of grading standards is essential in parallel with advances in AI-based screening.

B. Promise of deep learning

Deep learning (DL) rapidly emerged as a powerful approach for DR screening. Gulshan et al. (2016) [7] showed near-expert performance on large private datasets, followed by Krause et al. (2018) [8] and Rakhlin et al. (2018) [9], who extended DL to multi-task grading and related applications. These early successes positioned CNNs as a scalable tool for population-level screening. Yet translation into practice required more than high accuracy: addressing data bottlenecks, integrating into screening programs, and obtaining regulatory approval remained critical hurdles [10].

C. Challenges in early research

Despite their promise, early models also revealed important limitations:

• Reproducibility crisis: Many pioneering studies relied on proprietary datasets and unreleased code, complicating replication and fair benchmarking. This highlighted the urgent need for transparency, open-source implementations, and publicly accessible resources [11].

[Figure text, truncated in the source: "Sec. 2 Foundational Breakthroughs and Reproducibility Gap Pr"]
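
The grading scheme described under "Clinical context and grading" (a five-stage ICDR scale that many screening programs collapse into referable versus non-referable DR) can be illustrated with a minimal Python sketch. The names `ICDRGrade`, `REFERABLE_THRESHOLD`, `is_referable`, and `binarize` are hypothetical and do not come from the survey. The cut-off at moderate non-proliferative DR is an assumption that reflects one common convention; individual programs may use a different threshold or also take diabetic macular edema into account, which this sketch omits.

```python
from enum import IntEnum


class ICDRGrade(IntEnum):
    """Five ICDR severity stages, ordered from least to most severe."""
    NO_DR = 0
    MILD_NPDR = 1
    MODERATE_NPDR = 2
    SEVERE_NPDR = 3
    PROLIFERATIVE_DR = 4


# Assumption: "referable" begins at moderate NPDR. This threshold is
# illustrative only and is not specified by the surveyed text.
REFERABLE_THRESHOLD = ICDRGrade.MODERATE_NPDR


def is_referable(grade: ICDRGrade,
                 threshold: ICDRGrade = REFERABLE_THRESHOLD) -> bool:
    """Return True if the grade is at or above the referral threshold."""
    return grade >= threshold


def binarize(grades, threshold: ICDRGrade = REFERABLE_THRESHOLD):
    """Map integer ICDR grades (0-4) to 0 (non-referable) or 1 (referable)."""
    return [int(is_referable(ICDRGrade(g), threshold)) for g in grades]
```

Keeping the threshold as a parameter makes the binarization explicit and adjustable, which matters when comparing results across studies that may define "referable" differently.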

