Toxicity Assessment in Preclinical Histopathology via Class-Aware Mahalanobis Distance for Known and Novel Anomalies
Drug-induced toxicity remains a leading cause of failure in preclinical development and early clinical trials. Detecting adverse effects early is critical to reduce attrition and accelerate the development of safe medicines. Histopathological evaluation remains the gold standard for toxicity assessment, but it relies heavily on expert pathologists, creating a bottleneck for large-scale screening. To address this challenge, we introduce an AI-based anomaly detection framework for histopathological whole-slide images (WSIs) of rodent livers from toxicology studies. The system identifies healthy tissue and known pathologies (anomalies) for which training data are available. In addition, it detects rare pathologies for which no training data exist as out-of-distribution (OOD) findings. We generate a novel dataset of pixelwise annotations of healthy tissue and known pathologies and use it to fine-tune a pre-trained Vision Transformer (DINOv2) via Low-Rank Adaptation (LoRA) for tissue segmentation. Finally, we extract features for OOD detection using the Mahalanobis distance. To better account for class-dependent variability in histological data, we propose the use of class-specific thresholds. We optimize the thresholds using the mean of the false negative and false positive rates, resulting in only 0.16% of pathological tissue classified as healthy and 0.35% of healthy tissue classified as pathological. Applied to mouse liver WSIs with known toxicological findings, the framework accurately detects anomalies, including rare OOD morphologies. This work demonstrates the potential of AI-driven histopathology to support preclinical workflows, reduce late-stage failures, and improve efficiency in drug development.
💡 Research Summary
Drug‑induced toxicity is a major cause of failure in preclinical and early clinical development, and histopathology remains the gold standard for detecting tissue‑level adverse effects. However, the reliance on expert pathologists creates a bottleneck for large‑scale screening. In this work the authors present an end‑to‑end artificial‑intelligence framework that (1) segments healthy liver tissue and a set of common toxicological lesions on whole‑slide images (WSIs) of mouse liver, and (2) flags any tissue that deviates from the known classes as an out‑of‑distribution (OOD) anomaly, even when no training examples of that lesion exist.
The system is built on a Vision Transformer foundation model (DINOv2). Rather than fine‑tuning the entire network, the authors adapt the pretrained backbone with Low‑Rank Adaptation (LoRA), inserting low‑dimensional matrices into the attention weights. Only the LoRA parameters and a linear segmentation head are trained on a newly created pixel‑wise annotation set that includes healthy tissue, ballooning degeneration, inflammation, mitotic figures, necrosis and vascular voids. This approach yields a powerful semantic‑segmentation model while keeping the number of trainable parameters small, which mitigates over‑fitting on the relatively limited histopathology dataset.
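The low-rank update described above can be sketched in a few lines of NumPy. This is an illustration of the general LoRA idea only, not the authors' implementation: the class name `LoRALinear`, the rank `r`, the scaling `alpha`, and the 768-dimensional weight are all assumptions chosen for the example.

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                  # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, (r, d_in))   # trainable, small random init
        self.B = np.zeros((d_out, r))               # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        # x: (n, d_in) -> (n, d_out); the second term is the low-rank correction.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

rng = np.random.default_rng(1)
W = rng.normal(size=(768, 768))      # illustrative size, not taken from the paper
layer = LoRALinear(W, r=4)
x = rng.normal(size=(2, 768))

# B starts at zero, so the adapted layer initially reproduces the frozen layer.
print(np.allclose(layer(x), x @ W.T))
print(layer.trainable_params(), "trainable vs", W.size, "frozen")
```

With these illustrative sizes the adapter adds 6,144 trainable values next to 589,824 frozen ones, which is the sense in which the number of trainable parameters stays small.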
After the segmentation model is trained, it is reused as a feature encoder. For each pixel the encoder produces a high‑dimensional representation h. The authors model the distribution of h for each known class as a multivariate Gaussian with class‑specific mean μ_i and a shared covariance matrix Σ estimated from all classes. The squared Mahalanobis distance s_i = (h − μ_i)^T Σ⁻¹ (h − μ_i) to each class mean serves as an OOD score: the larger the distance, the less the pixel resembles class i. Crucially, instead of applying a single global threshold to the scores, the authors compute a separate threshold τ_i for each class. These thresholds are optimized by minimizing the average of the false‑negative rate (FNR) and false‑positive rate (FPR) on a validation set, thereby accounting for the fact that the variability of feature vectors differs markedly between healthy tissue and each pathology type.
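A minimal sketch of this scoring scheme on synthetic 2-D features follows. Several details are assumptions, since the summary does not specify them: each pixel is compared against the threshold of its nearest class mean, "positive" means OOD when computing FNR and FPR, and thresholds are searched over the observed validation scores. All function names are hypothetical.

```python
import numpy as np

def fit_gaussians(feats, labels, n_classes):
    """Class means and the inverse of a shared (pooled within-class) covariance."""
    means = np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])
    centered = feats - means[labels]
    cov = centered.T @ centered / len(feats)
    return means, np.linalg.pinv(cov)

def mahalanobis_sq(feats, means, cov_inv):
    """Squared distance of every feature vector to every class mean: (n, n_classes)."""
    diff = feats[:, None, :] - means[None, :, :]
    return np.einsum("ncd,de,nce->nc", diff, cov_inv, diff)

def pick_threshold(scores_id, scores_ood):
    """Threshold minimizing the mean of FNR and FPR (assumption: positive = OOD)."""
    best_tau, best_err = None, np.inf
    for tau in np.unique(np.concatenate([scores_id, scores_ood])):
        fpr = np.mean(scores_id > tau)     # in-distribution flagged as OOD
        fnr = np.mean(scores_ood <= tau)   # OOD accepted as in-distribution
        err = 0.5 * (fpr + fnr)
        if err < best_err:
            best_tau, best_err = tau, err
    return best_tau, best_err

def fit_class_thresholds(d_id, d_ood):
    """One threshold per class, fitted on validation points nearest to that class.
    Assumes every class has both in-distribution and OOD validation points."""
    near_id, near_ood = d_id.argmin(axis=1), d_ood.argmin(axis=1)
    fits = [pick_threshold(d_id[near_id == c, c], d_ood[near_ood == c, c])
            for c in range(d_id.shape[1])]
    return np.array([f[0] for f in fits]), np.array([f[1] for f in fits])

def is_ood(feats, means, cov_inv, taus):
    """Flag a point when its distance to the nearest class exceeds that class's tau."""
    d = mahalanobis_sq(feats, means, cov_inv)
    nearest = d.argmin(axis=1)
    return d[np.arange(len(d)), nearest] > taus[nearest]

# Synthetic data: a tight class 0, a wide class 1, and scattered OOD points.
rng = np.random.default_rng(0)
train = np.concatenate([rng.normal(0, 0.5, (500, 2)), rng.normal(6, 1.5, (500, 2))])
labels = np.repeat([0, 1], 500)
means, cov_inv = fit_gaussians(train, labels, 2)

val_id = np.concatenate([rng.normal(0, 0.5, (200, 2)), rng.normal(6, 1.5, (200, 2))])
val_ood = rng.uniform(-12.0, 18.0, (400, 2))
taus, errs = fit_class_thresholds(mahalanobis_sq(val_id, means, cov_inv),
                                  mahalanobis_sq(val_ood, means, cov_inv))
print("per-class thresholds:", taus, "balanced errors:", errs)
```

Because the covariance is shared, a tight class and a wide class produce distance distributions on different scales, which is the situation that motivates fitting one threshold per class rather than one global value.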
WSIs are too large to process at once, so the images are tiled. To reduce tile‑boundary artifacts, the authors employ spatial‑shift augmentation during training and, at inference time, evaluate each tile under multiple overlapping shifts. The soft‑max outputs from the different shifts are averaged for every pixel, which smooths predictions across tile borders and improves overall segmentation consistency.
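The inference-time averaging can be sketched as follows. This is a simplified illustration under stated assumptions: `predict_tile` stands in for the trained segmentation model, the tile size and shift offsets are arbitrary, and border tiles are simply cropped (a real model with a fixed input size would need padding instead).

```python
import numpy as np

def predict_with_shifts(image, predict_tile, n_classes, tile=4, shifts=(0, 2)):
    """Average per-pixel class probabilities over several shifted tile grids."""
    h, w = image.shape[:2]
    prob_sum = np.zeros((h, w, n_classes))
    count = np.zeros((h, w, 1))
    for s in shifts:
        # Starting the grid at -s moves the tile borders to different pixels.
        for y0 in range(-s, h, tile):
            for x0 in range(-s, w, tile):
                ys, xs = max(y0, 0), max(x0, 0)
                ye, xe = min(y0 + tile, h), min(x0 + tile, w)
                if ye <= ys or xe <= xs:
                    continue
                patch = image[ys:ye, xs:xe]
                prob_sum[ys:ye, xs:xe] += predict_tile(patch)  # (ph, pw, n_classes)
                count[ys:ye, xs:xe] += 1
    return prob_sum / count

def toy_predict_tile(patch):
    """Hypothetical stand-in for the model: a per-pixel two-class softmax."""
    logits = np.stack([patch, -patch], axis=-1)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
image = rng.normal(size=(10, 10))
probs = predict_with_shifts(image, toy_predict_tile, n_classes=2)
print(probs.shape, np.allclose(probs.sum(axis=-1), 1.0))
```

The toy predictor is purely pixel-wise, so averaging changes nothing here; with a real network, whose output near a tile edge depends on the limited context inside the tile, each pixel gets at least one prediction in which it lies away from a border.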
The framework was evaluated on mouse liver WSIs that contain both the annotated common lesions (in‑distribution, ID) and several rare lesions that were not present in the training set (OOD). Performance was measured with an extended confusion matrix that distinguishes healthy tissue, known lesions, and unknown lesions. Pixel‑wise segmentation accuracy exceeds 94 % for the known classes, and the class‑aware Mahalanobis detector achieves an overall FNR of 0.16 % (i.e., 0.16 % of pathological pixels are classified as healthy) and an FPR of 0.35 % (i.e., 0.35 % of healthy pixels are flagged as pathological). Compared with standard OOD methods such as Maximum Softmax Probability, Energy‑based scoring, and ODIN, the proposed approach shows markedly higher sensitivity, especially for subtle lesions where the feature distributions of healthy and diseased tissue overlap.
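To make the two reported rates concrete, the sketch below builds a small extended confusion matrix and derives FNR and FPR from it. The label encoding (0 = healthy, 1 and 2 = known lesions, 3 = unknown) and the toy labels are assumptions for illustration; the summary does not give the exact layout of the authors' matrix.

```python
import numpy as np

HEALTHY = 0   # hypothetical encoding: 0 healthy, 1-2 known lesions, 3 unknown

def extended_confusion(y_true, y_pred, n_labels):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_labels, n_labels), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def healthy_vs_pathological_rates(cm):
    """FNR: pathological pixels predicted healthy. FPR: healthy pixels predicted
    as any known or unknown lesion."""
    fnr = cm[1:, HEALTHY].sum() / cm[1:].sum()
    fpr = cm[HEALTHY, 1:].sum() / cm[HEALTHY].sum()
    return fnr, fpr

y_true = np.array([0, 0, 0, 0, 1, 1, 2, 3, 3, 3])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2, 3, 3, 2])
cm = extended_confusion(y_true, y_pred, 4)
fnr, fpr = healthy_vs_pathological_rates(cm)
print(cm)
print(f"FNR = {fnr:.3f}, FPR = {fpr:.3f}")
```

In this toy example one of six pathological pixels is predicted healthy and one of four healthy pixels is flagged, so FNR is 1/6 and FPR is 1/4. A confusion between two lesion types, such as the unknown pixel predicted as class 2, affects neither rate.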
The authors discuss three main contributions. First, they demonstrate that a large foundation model can be efficiently adapted to a specialized histopathology task with minimal labeled data by using LoRA. Second, they show that class‑specific Mahalanobis thresholds substantially improve the discrimination between normal and abnormal tissue, enabling simultaneous detection of known lesion types and previously unseen toxicological changes within a single unified map. Third, they provide a practical, scalable pipeline that can be integrated into preclinical toxicology workflows as a secondary safety layer or an early‑screening tool during efficacy studies, potentially reducing late‑stage attrition and accelerating drug development.
Future directions include validation on multi‑institutional datasets, extension to other organs (e.g., heart, kidney) and to human tissue, incorporation of explainability techniques to increase pathologist trust, and further model compression for real‑time deployment. Overall, the paper offers a compelling solution that bridges the gap between state‑of‑the‑art computer‑vision research and the concrete needs of pharmaceutical toxicology, delivering high‑precision, pixel‑level anomaly detection without requiring exhaustive annotation of every possible pathology.