Classifier-Assisted Autoencoder Training Shows Promise for Industrial Anomaly Detection


Submitted by dwalters on Sept. 30, 2025, 3:12 p.m. to 🤖 | 1126 views

Quality control (QC) is a crucial component of manufacturing. No manufacturer wants defects to escape into the market. AI-assisted anomaly detection has the potential to relieve QC bottlenecks (increasing manufacturing capacity), reduce defect escape (improving customer satisfaction), and alert production lines immediately when defects are detected (reducing material waste). Thus, progress toward reliable AI anomaly detection is of paramount interest.

AI anomaly detection can be carried out with unsupervised training on good samples using an autoencoder framework. The trained autoencoder encodes and then decodes test images, and the reconstruction error for each image serves as a measure of the presence of anomalies. Selecting the "good samples" for training is where potential bias can slip in. Therefore, culling samples computationally while maintaining an unsupervised training protocol may provide advantages.
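
The reconstruction-error idea can be sketched minimally as follows. A linear (PCA-based) autoencoder on synthetic vectors stands in for a learned convolutional autoencoder on images; all data, dimensions, and thresholds here are illustrative assumptions, not the author's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "good" training samples: flattened vectors clustered around a common pattern.
offset = rng.normal(size=(1, 16))
train = rng.normal(0.0, 0.1, size=(200, 16)) + offset

# A linear autoencoder fit by SVD/PCA stands in for a trained conv autoencoder.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:4]  # 4-dimensional latent code

def reconstruction_error(x):
    """Encode, decode, and return per-sample mean squared reconstruction error."""
    code = (x - mean) @ components.T   # encode
    recon = code @ components + mean   # decode
    return ((x - recon) ** 2).mean(axis=1)

# Held-out samples: "bad" ones get large added noise to mimic defects.
good = rng.normal(0.0, 0.1, size=(50, 16)) + offset
bad = good + rng.normal(0.0, 1.0, size=good.shape)

errs_good = reconstruction_error(good)
errs_bad = reconstruction_error(bad)

# Flag anomalies whose error exceeds a threshold set from the training errors.
threshold = np.percentile(reconstruction_error(train), 99)
detected = errs_bad > threshold
```

Because the autoencoder only learns to reconstruct good samples well, anomalous inputs land far from the learned subspace and produce large errors, which is what the threshold exploits.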

The MVTec AD2 dataset (link below) provides a challenging opportunity to develop robust anomaly detection methods. The dataset provides samples in each of eight categories: can, fabric, fruit jelly, rice, sheet metal, vials, wallplugs, and walnuts, where the training samples are all "good" and the validation sets contain both "good" and "bad" samples.

As a former manufacturing engineer, my goal is to pass through the maximum number of good samples while identifying, ideally, all of the anomalous samples, with some subset of questionable good samples redirected to QC. This would relieve QC of examining obviously good samples while making their work more productive through defect enrichment. Conversely, if the fraction of bad samples allowed to bypass QC exceeds the fraction that currently escapes QC, the method is not useful.

To this end, a training pipeline was constructed as follows. Good samples were arbitrarily labeled 0, while an identical sample set with 10-40% added random noise was labeled 1. A classifier was trained on these two classes so that a prediction-confidence threshold could select a subset of the unadulterated good samples for autoencoder training. Finally, the validation set was passed through the autoencoder, using a reconstruction-error threshold to detect anomalies.
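
The culling step might look something like the following sketch. A toy logistic regression on squared pixel values stands in for whatever classifier architecture was actually used, and all names, sizes, and thresholds are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 16

# Toy stand-ins for flattened "good" training images.
good = rng.normal(0.0, 0.1, size=(n, d))

# Class 1: the same sample set with 10-40% random noise added, per the post.
noise_frac = rng.uniform(0.1, 0.4, size=(n, 1))
noisy = good + noise_frac * rng.normal(size=(n, d))

# Squaring makes the zero-mean added noise linearly detectable (higher variance).
X = np.vstack([good, noisy]) ** 2
y = np.concatenate([np.zeros(n), np.ones(n)])

# Minimal logistic-regression classifier trained by gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= (X.T @ (p - y)) / len(y)
    b -= (p - y).mean()

# Confidence that each unadulterated good sample belongs to class 0 ("clean").
conf_good = 1.0 - 1.0 / (1.0 + np.exp(-((good ** 2) @ w + b)))

# Cull: keep only the good samples the classifier is confident about,
# then train the autoencoder on `culled` instead of on all of `good`.
culled = good[conf_good > 0.5]
```

The intuition is that good samples the classifier cannot confidently separate from their noisy counterparts are the questionable ones, so dropping them from autoencoder training removes potentially biased examples without any manual labeling.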

Preliminary results on the can category are promising. Training the autoencoder on all good samples produced accuracies in the neighborhood of 50%. Using the same autoencoder framework with classifier-culled good samples produced accuracies approaching 60%. This research is being performed on a laptop with limited memory and a CPU rather than a GPU, so improvements may be greater with more memory (allowing less or no image compression) and the benefits of GPU acceleration.

In conclusion, sample culling is a viable means of computationally reducing potential bias in unsupervised training samples for industrial anomaly detection. Whether through the protocol described here or by some other method, the author is confident that engineered training pipelines can and will improve anomaly detection, thereby making manufacturing more efficient, bringing customers more satisfaction, and potentially reducing manufacturing and consumer costs. Godspeed.

https://www.mvtec.com/company/research/datasets...


NOTICE: Content on Silverpaul.com represents the opinions of the authors and does not necessarily reflect the opinions of Silverpaul LLC personnel. Thank you for your kindness. Silverpaul.com uses affiliate links to help keep the lights on.