Collaborative Reconstruction and Repair for Multi-class Industrial Anomaly Detection

Industrial anomaly detection is a challenging open-set task that aims to identify unknown anomalous patterns deviating from the normal data distribution. To avoid the significant memory consumption and limited generalizability brought by building separate models per class, we focus on developing a unified framework for multi-class anomaly detection. However, under this challenging setting, conventional reconstruction-based networks often suffer from an identity mapping problem, where they directly replicate input features regardless of whether they are normal or anomalous, resulting in detection failures. To address this issue, this study proposes a novel framework termed Collaborative Reconstruction and Repair (CRR), which turns reconstruction into repair. First, we optimize the decoder to reconstruct normal samples while repairing synthesized anomalies. Consequently, compared to the encoder’s output, it generates distinct representations for anomalous regions and similar representations for normal areas. Second, we implement feature-level random masking to ensure that the representations from the decoder contain sufficient local information. Finally, to minimize detection errors arising from the discrepancies between feature representations from the encoder and decoder, we train a segmentation network supervised by synthetic anomaly masks, thereby enhancing localization performance. Extensive experiments on industrial datasets demonstrate that CRR effectively mitigates the identity mapping issue and achieves state-of-the-art performance in multi-class industrial anomaly detection.

💡 Research Summary

The paper addresses the challenging problem of multi‑class industrial anomaly detection (MIAD), where a single model must detect and localize defects across many product categories using only normal training data. Existing reconstruction‑based approaches suffer from an “identity mapping” issue: the decoder learns to reproduce the encoder’s features even for anomalous inputs, thereby shrinking the feature discrepancy that is essential for anomaly detection. To overcome this, the authors propose a novel framework called Collaborative Reconstruction and Repair (CRR).

CRR reframes the learning objective from pure reconstruction to a joint reconstruction‑repair task. A pretrained encoder extracts features from normal images, which serve as a stable reference. Synthetic anomalies are generated on‑the‑fly using the DRAEM strategy (Perlin noise blended with texture patches). These anomalous images are fed to an untrained decoder that is optimized for two losses at once: (1) a reconstruction loss that forces the decoder to reproduce the encoder’s features on normal regions, and (2) a repair loss that forces the decoder to map the synthetic anomalous regions back to normal‑like features. This dual objective ensures that, regardless of whether the input contains defects, the decoder outputs normal‑style representations everywhere. Its output therefore stays close to the encoder’s features in normal regions and diverges from them in anomalous regions, which is the discrepancy that detection relies on.
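The dual objective can be illustrated with a small NumPy sketch. It is not the authors' implementation: the cosine distance, the use of the clean image's encoder features as the target, the mask being already resized to the feature resolution, and the `repair_weight` parameter are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def cosine_distance_map(a, b, eps=1e-8):
    """Per-position cosine distance between two (C, H, W) feature maps."""
    dot = (a * b).sum(axis=0)
    norm = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0)
    return 1.0 - dot / (norm + eps)

def reconstruction_repair_loss(dec_feat, clean_enc_feat, anomaly_mask,
                               repair_weight=1.0):
    """Hypothetical joint loss (assumed form, not taken from the paper).

    dec_feat:        decoder output for the synthetically corrupted image, (C, H, W)
    clean_enc_feat:  encoder features of the clean image (assumed target), (C, H, W)
    anomaly_mask:    synthetic anomaly mask at feature resolution, (H, W), 1 = anomalous
    """
    dist = cosine_distance_map(dec_feat, clean_enc_feat)
    normal = anomaly_mask == 0
    anomalous = ~normal
    # Reconstruction term: match the reference features on normal regions.
    rec = dist[normal].mean() if normal.any() else 0.0
    # Repair term: pull synthetic-anomaly regions back to normal-like features.
    rep = dist[anomalous].mean() if anomalous.any() else 0.0
    return rec + repair_weight * rep, rec, rep
```

Under these assumptions, a decoder that leaves an anomalous patch unrepaired is penalized only through the repair term, while the reconstruction term stays near zero as long as the normal regions are reproduced.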

To capture fine‑grained defect cues, CRR introduces feature‑level random masking. Randomly selected positions of the encoder’s feature maps are masked before they reach the decoder, compelling the decoder to infer the missing information from surrounding context. This encourages the decoder to learn local relationships and prevents it from simply copying the input.
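A minimal sketch of feature-level random masking follows. The masking ratio of 0.3, zero-filling of masked positions, and sharing one spatial mask across all channels are assumptions chosen for illustration; the summary does not state these details, and `random_feature_mask` is a hypothetical name.

```python
import numpy as np

def random_feature_mask(feat, mask_ratio=0.3, rng=None):
    """Zero out a random subset of spatial positions in a (C, H, W) feature map.

    Returns the masked features and the boolean keep-mask of shape (H, W).
    The ratio and the zero fill value are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = feat.shape
    n_masked = int(round(mask_ratio * h * w))
    # Pick distinct spatial positions to hide from the decoder.
    flat_idx = rng.choice(h * w, size=n_masked, replace=False)
    keep = np.ones(h * w, dtype=bool)
    keep[flat_idx] = False
    keep = keep.reshape(h, w)
    # Broadcasting applies the same spatial mask to every channel.
    return feat * keep, keep
```

Because the hidden positions carry no signal, a decoder trained on such inputs can only match its targets there by using neighbouring features, which is the behaviour the masking is meant to encourage.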

After the decoder is trained and frozen, both encoder and decoder process an input image (with synthetic anomalies during training). Their normalized outputs are element‑wise multiplied and concatenated across multiple scales, then fed into an up‑sampling segmentation network. The segmentation head is supervised by the synthetic anomaly masks, learning to translate the feature discrepancy into a pixel‑wise anomaly map. During inference, no synthetic anomalies are added; the model relies on the learned discrepancy between encoder and decoder features to highlight real defects.
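The fusion step that prepares the segmentation input can be sketched as below. This is one possible reading of the description, not the paper's code: L2 normalization along the channel axis, nearest-neighbour upsampling to the finest scale before concatenation, and integer scale factors between levels are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def l2_normalize(feat, eps=1e-8):
    """L2-normalize a (C, H, W) feature map along the channel axis."""
    return feat / (np.linalg.norm(feat, axis=0, keepdims=True) + eps)

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of (C, H, W) by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_scales(enc_feats, dec_feats):
    """Multiply normalized encoder/decoder features per scale, then concatenate.

    enc_feats, dec_feats: lists of (C_i, H_i, W_i) arrays, one pair per scale.
    Assumes square maps whose sizes divide the largest one exactly.
    """
    target_h = max(f.shape[1] for f in enc_feats)
    fused = []
    for enc, dec in zip(enc_feats, dec_feats):
        prod = l2_normalize(enc) * l2_normalize(dec)
        fused.append(upsample_nearest(prod, target_h // prod.shape[1]))
    # Channel-wise concatenation gives the input a segmentation head would consume.
    return np.concatenate(fused, axis=0)
```

With this normalization, summing one scale's product over its channels gives the cosine similarity between encoder and decoder features at each position, so the fused tensor keeps the per-channel detail behind that similarity for the segmentation head to learn from.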

The authors evaluate CRR on three widely used benchmarks—MVTec‑AD, VisA, and Real‑IAD—as well as a proprietary industrial dataset (HSS‑IAD). Across all datasets, CRR achieves state‑of‑the‑art performance on image‑level AUROC, pixel‑level AUROC, AUPRO, and IoU, outperforming recent MIAD methods such as UniAD, DiAD, MambaAD, and Dinomaly. Ablation studies confirm that (i) the repair loss is critical for breaking identity mapping, (ii) random masking improves local defect sensitivity, and (iii) the segmentation network effectively fuses multi‑level discrepancies.

In summary, CRR introduces three key innovations: (1) a collaborative reconstruction‑repair training scheme that forces the decoder to produce normal features even when presented with anomalies, (2) feature‑level random masking to preserve and exploit local information, and (3) a discrepancy‑driven segmentation head that converts encoder‑decoder feature gaps into accurate defect localization. These contributions collectively resolve the identity mapping problem and set a new performance benchmark for multi‑class industrial anomaly detection, offering a practical solution for real‑world manufacturing inspection pipelines.
