How to solve the lack of industrial anomaly detection data?

Author: 3D Vision Workshop

Source: 3D Vision Workshop

Paper title: DMAD: Dual Memory Bank for Real-World Anomaly Detection

Authors: Jianlong Hu, Xu Chen, et al.

Affiliations: Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, School of Informatics, Xiamen University, and others

Paper link: https://arxiv.org/pdf/2403.12362.pdf

This article introduces DMAD, a new framework for real-world anomaly detection. Existing multi-class setups usually train on normal data only and ignore the small but valuable amount of labeled anomalous data that is available in practice. To address this, DMAD proposes dual-memory-enhanced representation learning that handles both unsupervised and semi-supervised scenarios. DMAD uses a dual memory bank to compute feature distances and feature attention with respect to normal and abnormal patterns, thereby encapsulating knowledge about normal and abnormal instances, and uses this knowledge to construct enhanced representations for anomaly score learning. Experiments on the MVTec-AD and VisA datasets show that DMAD outperforms current state-of-the-art methods, which demonstrates its ability to handle the complexity of real-world anomaly detection.

Reader's takeaway:

The DMAD framework is both novel and practical for anomaly detection in real industrial scenarios. By using a dual memory bank to construct enhanced representations, DMAD handles both unsupervised and semi-supervised scenarios and achieves significant performance gains on the MVTec-AD and VisA datasets. In particular, when a small amount of labeled anomaly data is available, DMAD learns more precise decision boundaries, which further improves detection accuracy. However, the study only simulates a simplified scenario, and it relies on pixel-level annotation of anomalies, which may not be available in practice; further research is needed to address these limitations.

The background is image anomaly detection. Traditional methods train a separate model for each object class, so storage consumption grows as the number of classes increases. To solve this problem, UniAD proposed a multi-class setup that trains one unified model on the normal data of all objects. However, current anomaly detection methods rely mainly on unsupervised learning, and without real anomaly data during training the learned decision boundary can be inaccurate. Recent research has shown that small amounts of anomalous data can realistically be collected, and that this semi-supervised approach helps models anticipate potential anomalous patterns and improves performance. The authors therefore propose a new unified semi-supervised setup that fills a research gap: it is closer to real conditions, it is unified across classes, and it can exploit similar defects to gain an additional advantage during training. The proposed DMAD framework suits both this unified semi-supervised setup and the general unified setup. DMAD uses a dual memory bank to handle both cases: it first extracts features with a patch feature encoder, then uses the dual memory bank to compute distances and cross-attention between features, and finally uses a multilayer perceptron to learn the mapping from feature representations to anomaly scores. Experiments show that DMAD significantly outperforms current state-of-the-art methods on the MVTec-AD and VisA datasets.

DMAD is designed for real-world anomaly detection. In actual industrial scenarios, training a single unified model is considered more compatible and more storage-efficient than training one model per class. Depending on whether anomalies are available, the framework faces two scenarios: the general unified setup, and a unified setup with a small number of labeled anomalies, i.e., the unified semi-supervised setup. DMAD consists of three main components: a patch feature encoder, dual-memory-bank knowledge enhancement, and an anomaly score mapper. Its goal is to train a unified neural network that assigns higher anomaly scores to anomalies than to normal instances. By making effective use of both normal data and whatever anomaly data is accessible, DMAD makes significant progress on real-world anomaly detection.

2.1 Patch Feature Encoder

This section introduces the patch feature encoder, which consists of a feature extractor F_Φ : x → q and an optional feature filtering operation, Filter. The feature extractor F_Φ extracts patch features from images and consists of a pre-trained backbone network followed by an aggregation operation. A training image is denoted x ∈ R^(3×H×W), and the extracted patch features are denoted q ∈ R^(N×C), where N = H_0 × W_0 is the number of patches, H_0 and W_0 are the height and width of the feature map, and C is the number of feature channels. In the general unified setup, only normal data is available, and for each normal image x_n the authors directly obtain its patch features q_n. Once the detection system is running, some annotated anomalies become accessible and can be incorporated into the training of DMAD. For each observed anomaly x_as, the authors apply the Filter operation to the patch features F_Φ(x_as) extracted from it in order to isolate the anomalous patches. In this way, for each defective image x_a, the authors can compute its anomalous patch features q_a. These patch features are then enhanced with the dual memory bank.
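
The shapes involved can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random array stands in for the backbone output, and the helper names `to_patch_features` and `filter_anomalous_patches`, as well as the rule "keep a patch if any pixel in its region is annotated as anomalous", are assumptions made for illustration.

```python
import numpy as np

def to_patch_features(feature_map):
    """Flatten a (C, H0, W0) feature map into N = H0 * W0 patch features of size C."""
    c, h0, w0 = feature_map.shape
    return feature_map.reshape(c, h0 * w0).T  # shape (N, C), row n = i * W0 + j

def filter_anomalous_patches(q, pixel_mask, h0, w0):
    """Hypothetical Filter: keep patches whose image region contains an anomalous pixel.

    Assumption: the mask height and width are integer multiples of h0 and w0.
    """
    height, width = pixel_mask.shape
    blocks = pixel_mask.reshape(h0, height // h0, w0, width // w0)
    patch_is_anomalous = blocks.max(axis=(1, 3)).reshape(-1).astype(bool)
    return q[patch_is_anomalous]

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 4, 4))   # stand-in for backbone output: C=8, H0=W0=4
q = to_patch_features(feature_map)          # patch features, shape (16, 8)

mask = np.zeros((16, 16), dtype=np.uint8)   # pixel-level annotation of one anomaly
mask[0:4, 4:8] = 1                          # defect covers the patch at row 0, column 1
q_a = filter_anomalous_patches(q, mask, 4, 4)
```

Here `q` plays the role of q ∈ R^(N×C) with N = 16, and `q_a` holds only the patch features that overlap the annotated defect.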

2.2 Dual Memory Bank Knowledge Enhancement

This section describes the dual-memory-bank knowledge enhancement. First, the dual memory bank is constructed; it consists of a normal memory bank M_n and an abnormal memory bank M_a. The normal memory bank stores normal patterns, while the abnormal memory bank stores potential defect patterns. In the general unified setup, the coreset sampling algorithm is applied to the patch features extracted from all normal data to construct M_n. For M_a, if no labeled anomalies are available, M_a is constructed by randomly sampling anomaly data from the DTD dataset. Once annotated anomalies become accessible, the filtered anomalous patch feature set M_as of the observed anomalies is added to M_a. To extract additional knowledge, for each patch feature the nearest-neighbor features are retrieved from M_n and M_a, and the distance and the attention matrix between the feature and these nearest neighbors are computed. Finally, the feature itself and the two kinds of knowledge are combined into an enhanced representation. This makes fuller use of all available information and improves the performance of the model.
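
A rough NumPy sketch of this enhancement step follows. Several details are assumptions rather than the paper's design: the number of neighbors `k`, scaled dot-product attention over the retrieved neighbors, and plain concatenation as the way of combining the feature with its knowledge. The function names are hypothetical.

```python
import numpy as np

def knowledge_from_bank(q, memory, k=3):
    """Distance to the nearest memory item and an attention readout over k neighbors."""
    d = np.linalg.norm(q[:, None, :] - memory[None, :, :], axis=-1)   # (N, M)
    nn_idx = np.argsort(d, axis=1)[:, :k]                              # (N, k)
    nn_dist = np.take_along_axis(d, nn_idx, axis=1)                    # (N, k), ascending
    neighbors = memory[nn_idx]                                         # (N, k, C)
    logits = np.einsum("nc,nkc->nk", q, neighbors) / np.sqrt(q.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)               # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)            # softmax over neighbors
    attended = np.einsum("nk,nkc->nc", weights, neighbors)            # (N, C)
    return nn_dist[:, :1], attended

def enhance(q, m_n, m_a, k=3):
    """Concatenate each feature with its knowledge from the normal and abnormal banks."""
    dist_n, att_n = knowledge_from_bank(q, m_n, k)
    dist_a, att_a = knowledge_from_bank(q, m_a, k)
    return np.concatenate([q, dist_n, att_n, dist_a, att_a], axis=1)  # (N, 3C + 2)

rng = np.random.default_rng(0)
m_n = rng.normal(size=(10, 4))     # toy normal memory bank
m_a = rng.normal(size=(6, 4))      # toy abnormal memory bank
q = rng.normal(size=(5, 4))        # toy patch features
q[0] = m_n[2]                      # a patch that exactly matches a stored normal pattern
o = enhance(q, m_n, m_a)
```

In this toy run the first patch has distance zero to the normal bank, which is the kind of signal the score mapper can use to separate normal from abnormal patches.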

2.3 Anomaly Score Mapper

This section describes the anomaly score mapper, which maps the enhanced representation o to the anomaly score s. A multilayer perceptron (MLP) Ψ learns this mapping, and the network is optimized with a hinge loss. In the general unified (multi-class) scenario, a feature augmentation strategy is used to generate pseudo-negative samples for training. When annotated anomalies are available, the model is optimized with a three-part hinge loss in which λ1 and λ2 are weighting hyperparameters. This component models the relationship between the enhanced representation and the anomaly score, which improves detection performance.
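
The article does not give the exact form of the three-part hinge loss, so the following is only one plausible sketch. The margin value, the sign convention (normal scores pushed down, anomalous scores pushed up), and the choice of which terms λ1 and λ2 weight are all assumptions.

```python
import numpy as np

def hinge(x):
    """Standard hinge: zero when the constraint is met, linear penalty otherwise."""
    return np.maximum(0.0, x)

def three_part_hinge_loss(s_normal, s_pseudo, s_anomaly, lam1=1.0, lam2=1.0, margin=0.5):
    """Assumed form: normal term + lam1 * pseudo-anomaly term + lam2 * real-anomaly term."""
    loss_normal = hinge(margin + s_normal).mean()   # normal scores should fall below -margin
    loss_pseudo = hinge(margin - s_pseudo).mean()   # pseudo anomalies should exceed +margin
    loss_real = hinge(margin - s_anomaly).mean()    # annotated anomalies should exceed +margin
    return loss_normal + lam1 * loss_pseudo + lam2 * loss_real

# Well-separated scores incur no loss.
separated = three_part_hinge_loss(
    np.array([-1.0, -0.8]), np.array([0.9, 1.2]), np.array([0.7])
)

# Scores that are all zero violate every margin by exactly `margin`.
unseparated = three_part_hinge_loss(np.array([0.0]), np.array([0.0]), np.array([0.0]))
```

With this convention, setting `lam2` higher than `lam1` would emphasize the real annotated anomalies over the synthetic ones.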

2.4 Anomaly Detection and Localization

This section describes how a trained DMAD model detects and localizes anomalies in test images. First, the model produces patch-level anomaly scores for the image, and the average of the top 5 patch scores is taken as the image-level score. Then, for pixel-level scores, the patch score map is upsampled with bilinear interpolation and refined with Gaussian smoothing. This yields both an image-level decision and a pixel-level localization map.
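
The scoring steps can be sketched in plain NumPy as below. The corner-aligned interpolation convention, the edge padding, and the smoothing width `sigma` are assumptions, since the article does not state them.

```python
import numpy as np

def image_level_score(patch_scores, top_k=5):
    """Mean of the top_k highest patch anomaly scores."""
    flat = np.sort(patch_scores.ravel())
    return flat[-top_k:].mean()

def bilinear_upsample(score_map, out_h, out_w):
    """Bilinear interpolation of a 2D map (corner-aligned sampling grid, an assumption)."""
    h, w = score_map.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = score_map[y0][:, x0] * (1 - wx) + score_map[y0][:, x1] * wx
    bottom = score_map[y1][:, x0] * (1 - wx) + score_map[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def gaussian_smooth(img, sigma=4.0):
    """Separable Gaussian blur with edge padding; output has the same shape as img."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-(t ** 2) / (2 * sigma ** 2))
    kernel = kernel / kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

patch_scores = np.arange(16.0).reshape(4, 4)       # toy 4x4 patch score map
image_score = image_level_score(patch_scores)      # mean of 11, 12, 13, 14, 15
pixel_map = gaussian_smooth(bilinear_upsample(patch_scores, 16, 16), sigma=1.0)
```

The image-level score decides whether the image is anomalous, while `pixel_map` is what would be thresholded or visualized for localization.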

  • The experiments evaluate DMAD on the MVTec-AD and VisA datasets. MVTec-AD contains high-resolution images from different domains, split into training and test sets, and covers both texture and object defects. VisA contains high-resolution images of multiple categories, including complex structures, multiple-instance objects, and single-instance objects. Standard evaluation metrics are used: AUROC, AP, and F1max, plus the PRO metric for anomaly localization.
  • WideResNet50 is used as the pre-trained CNN backbone; features are extracted from layers 2 and 3 and then aggregated into patch features. The projection layer is a fully connected layer that projects features and knowledge. The MLP consists of four nonlinear layers, each comprising a linear layer, batch normalization, and Leaky ReLU activation. The optimizer is AdamW, with a learning rate of 0.0001 for the linear layer and projection layer and 0.0002 for the MLP. Training runs for 48 epochs with a batch size of 32.
  • In the unsupervised case, DMAD performs comparably to UniAD on MVTec-AD and better than UniAD on VisA. When a small number of annotated anomalies are available, DMAD achieves state-of-the-art performance on both MVTec-AD and VisA by using the dual memory bank to learn more precise decision boundaries.
  • Anomaly localization was also evaluated, and DMAD achieved the best performance in all settings. In summary, DMAD performs well on anomaly detection in realistic scenarios and has high practical value.
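
For readers unfamiliar with the image-level metrics named above, here is a small NumPy sketch of AUROC and F1max. It uses the pairwise-ranking definition of AUROC and a brute-force threshold sweep for F1max; it is meant for intuition on toy data, not as the evaluation code used in the paper, and the example labels and scores are made up.

```python
import numpy as np

def auroc(labels, scores):
    """Probability that a random anomaly outscores a random normal sample (ties count 0.5)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def f1_max(labels, scores):
    """Best F1 score over all thresholds taken from the observed scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    best = 0.0
    for threshold in np.unique(scores):
        pred = scores >= threshold
        tp = np.sum(pred & labels)
        fp = np.sum(pred & ~labels)
        fn = np.sum(~pred & labels)
        if tp > 0:
            best = max(best, 2 * tp / (2 * tp + fp + fn))
    return best

labels = [0, 0, 1, 1]                  # 1 marks an anomalous image
scores = [0.1, 0.4, 0.35, 0.8]         # hypothetical image-level anomaly scores
print(auroc(labels, scores))           # 0.75
print(f1_max(labels, scores))          # 0.8
```

The pairwise AUROC is quadratic in the number of samples, so a rank-based implementation from an established library is preferable for full test sets.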

This study proposes DMAD, a new framework for anomaly detection in real-world scenarios. DMAD is a unified framework that handles both unsupervised and semi-supervised scenarios in a multi-class setup. It uses a dual memory bank to compute knowledge about normal and abnormal instances, and then uses this knowledge to construct enhanced representations for anomaly score learning. Experiments on the MVTec-AD and VisA datasets show that DMAD achieves superior anomaly detection performance. However, the study only simulates simplified scenarios, and pixel-level annotation of anomalies may not be available in practice, so further research into new methods is needed to address these issues.
