A Machine Vision Approach to Preliminary Skin Lesion Assessments
Early detection of malignant skin lesions is critical for improving patient outcomes in aggressive, metastatic skin cancers. This study evaluates a comprehensive system for preliminary skin lesion assessment that combines the clinically established ABCD rule of dermoscopy (analyzing Asymmetry, Borders, Color, and Dermoscopic Structures) with machine learning classification. Using a 1,000-image subset of the HAM10000 dataset, the system implements an automated, rule-based pipeline to compute a Total Dermoscopy Score (TDS) for each lesion. This handcrafted approach is compared against several machine learning approaches, including traditional classifiers (Logistic Regression, Random Forest, and SVM) and deep learning models. While the rule-based system provides high clinical interpretability, results indicate a performance bottleneck when reducing complex morphology to five numerical features. Experimental findings show that transfer learning with EfficientNet-B0 failed badly, defaulting to majority-class predictions because of the domain shift between natural and medical images. In contrast, a custom three-layer Convolutional Neural Network (CNN) trained from scratch achieved 78.5% accuracy and 86.5% recall on median-filtered images, representing a 19-point accuracy improvement over traditional methods. The results demonstrate that direct pixel-level learning captures diagnostic patterns beyond handcrafted features and that purpose-built lightweight architectures can outperform large pretrained models for small, domain-specific medical datasets.
💡 Research Summary
The paper presents a complete pipeline for preliminary skin‑lesion assessment that merges the clinically established ABCD dermoscopy rule with modern machine‑learning techniques. Using a balanced 1,000‑image subset of the HAM10000 dataset (50 % malignant, 50 % benign), the authors first construct a rule‑based computer‑vision module that extracts four quantitative descriptors—Asymmetry (A), Border irregularity (B), Color diversity (C), and Dermoscopic structures (D)—from each dermoscopic image. Lesion segmentation is performed via Otsu thresholding and morphological cleanup; asymmetry is measured by PCA‑aligned reflection IoU, border irregularity by radial intensity gradients, color diversity by LAB‑space K‑means clustering with area‑based filtering, and structures by a combination of variance, LoG blob detection, skeleton‑based pigment‑network counting, and Hough‑line detection. The four scores are combined with the clinically derived weighting formula TDS = 1.3·A + 0.1·B + 0.5·C + 0.5·D, yielding three diagnostic categories (benign, suspicious, malignant).
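The weighted combination above is straightforward to sketch. Note that the cut-offs separating the three diagnostic categories below are the commonly cited clinical values for the ABCD rule (4.75 and 5.45) and are an assumption here; the summary does not state the paper's exact thresholds.

```python
def total_dermoscopy_score(a: float, b: float, c: float, d: float) -> float:
    """Combine the four ABCD descriptors with the clinically derived weights."""
    return 1.3 * a + 0.1 * b + 0.5 * c + 0.5 * d


def categorize(tds: float,
               benign_cutoff: float = 4.75,      # assumed standard cut-off
               malignant_cutoff: float = 5.45) -> str:  # assumed standard cut-off
    """Map a TDS value to one of the three diagnostic categories."""
    if tds < benign_cutoff:
        return "benign"
    if tds <= malignant_cutoff:
        return "suspicious"
    return "malignant"
```

With these weights, asymmetry dominates the score (a maximal A contributes far more than a maximal B), which mirrors the clinical emphasis of the original ABCD rule.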
These five numeric features (A, B, C, D, TDS) are fed to three traditional classifiers—Logistic Regression, Random Forest, and Support Vector Machine—trained on 80 % of the data and tested on the remaining 20 % using three preprocessing streams (median, Gaussian, flat‑average filtering). Across all streams, the median‑filtered images produced the best results, but the highest accuracy achieved by any traditional model was only 59.5 % with a recall of 61 %, confirming that compressing rich visual information into a five‑dimensional vector creates a substantial information bottleneck.
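The traditional-classifier stage described above can be sketched with scikit-learn. The feature matrix here is synthetic stand-in data (the real inputs would be the per-lesion A, B, C, D, and TDS values), and the model hyperparameters are defaults, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))        # stand-in for (A, B, C, D, TDS) per lesion
y = (X[:, 4] > 0).astype(int)         # synthetic benign/malignant labels

# 80/20 train/test split, as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression()),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
```

In the paper this loop would be repeated once per preprocessing stream (median, Gaussian, flat-average), with the five-dimensional feature vector being the fixed bottleneck in every case.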
To explore transfer learning, the authors fine‑tuned EfficientNet‑B0 (pre‑trained on ImageNet) with a freeze‑then‑unfreeze schedule (10 epochs frozen, 30 epochs unfrozen) and extensive data augmentation. Despite achieving 80 % precision, the model’s recall collapsed to 4 % because it defaulted to the majority class, illustrating severe domain shift between natural images and dermoscopic data and the inadequacy of large‑scale pretrained weights for a tiny, specialized dataset.
In contrast, a custom three‑layer convolutional neural network (16→32→64 filters, 3×3 kernels, ReLU, 2×2 max‑pooling) trained from scratch on the same 1,000 images delivered markedly superior performance: 78.5 % accuracy, 86.5 % recall, and an ROC‑AUC of 0.833 on median‑filtered inputs, an AUC matched by the flat‑average stream. The lightweight architecture, trained for only ten epochs with the Adam optimizer and binary cross‑entropy loss, was able to learn subtle texture, color, and structural cues that the handcrafted ABCD scores missed.
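A minimal PyTorch sketch of the described architecture follows. The filter counts, kernel size, activations, and pooling match the summary; the input resolution and single-logit head are assumptions, as the exact configuration is not given.

```python
import torch
import torch.nn as nn


class LesionCNN(nn.Module):
    """Three conv blocks (16 -> 32 -> 64 filters), each: 3x3 conv, ReLU, 2x2 max-pool."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # LazyLinear infers its input width from the first forward pass.
        self.head = nn.Sequential(nn.Flatten(), nn.LazyLinear(1))  # single logit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))


model = LesionCNN()
logits = model(torch.randn(2, 3, 64, 64))  # batch of two RGB images (size assumed)
# Per the summary, training used Adam with binary cross-entropy
# (nn.BCEWithLogitsLoss with a single-logit head) for ten epochs.
```

The entire network is a few tens of thousands of parameters per conv block, which is consistent with the summary's claim that a purpose-built lightweight model suits a 1,000-image dataset better than a large pretrained backbone.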
Key conclusions are: (1) preprocessing choice matters—median filtering best preserves edge information crucial for both rule‑based and deep‑learning pipelines; (2) ABCD‑based handcrafted features provide valuable clinical interpretability but suffer from dimensional reduction that limits diagnostic accuracy; (3) for small, domain‑specific medical imaging datasets, purpose‑built lightweight CNNs outperform both traditional classifiers on handcrafted features and transfer‑learning approaches with large pretrained models. The authors suggest future work should expand the dataset, explore multi‑class diagnosis (six lesion types), and integrate explainable AI methods to bridge the gap between model performance and clinical trust.