Carsten Steger’s research while affiliated with RISC Software GmbH and other places


Publications (109)


Vision-guided robot calibration using photogrammetric methods
  • Article

December 2024 · 22 Reads

ISPRS Journal of Photogrammetry and Remote Sensing

Markus Ulrich · Carsten Steger · Florian Butsch · Maurice Liebe

Qualitative results of the Student–Teacher (S–T) method and a variational autoencoder (VAE) on a simple toy dataset. Anomaly maps are shown for an anomaly-free image, an image containing a structural anomaly (a color defect), and a logical anomaly (two circles being present instead of one). S–T inspects local image regions and therefore only detects the color defect. The VAE captures the global context of images in its bottleneck. It finds both anomalies, but also produces many false positives due to its inaccurate reconstructions
Difference between structural (left) and logical anomalies (right). While the former introduce novel local structures (i.e., the metal piece on the left), the latter violate logical constraints of the training data (i.e., the additional pushpin in the top right compartment). Our proposed method successfully localizes the anomaly in both images
Example images of the MVTec LOCO AD dataset for each of the five dataset categories. Each category contains anomaly-free train, validation, and test images. Additional test images contain various structural and logical anomalies. Pixel-precise ground truth annotations are provided for all anomalies
Schematic illustration of the introduced sPRO evaluation metric. For an annotated anomaly A, a saturation threshold s is selected. Once the overlap of the predicted region with the ground truth A exceeds s, we consider the anomaly segmentation task solved
Schematic overview of our approach. A global feature encoder $E_\mathrm{glo}$ is trained against descriptors from a pretrained local feature encoder $E_\mathrm{loc}$ through a bottleneck to capture the global context of the anomaly-free training data. Each encoder is assigned a high-capacity regression network, $R_\mathrm{glo}$ and $R_\mathrm{loc}$, respectively, that matches the output of its respective feature encoder. The joint training of $E_\mathrm{glo}$ and $R_\mathrm{glo}$ facilitates the accurate matching of higher-dimensional features through a low-dimensional bottleneck


Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization
  • Article
  • Full-text available

April 2022 · 2,325 Reads · 173 Citations

International Journal of Computer Vision

Michael Fauser · [...] · Carsten Steger

The unsupervised detection and localization of anomalies in natural images is an intriguing and challenging problem. Anomalies manifest themselves in very different ways and an ideal benchmark dataset for this task should contain representative examples for all of them. We find that existing datasets are biased towards local structural anomalies such as scratches, dents, or contaminations. In particular, they lack anomalies in the form of violations of logical constraints, e.g., permissible objects occurring in invalid locations. We contribute a new dataset based on industrial inspection scenarios that evenly covers both types of anomalies. We provide pixel-precise ground truth data for each anomalous region and define a generalized evaluation metric that addresses localization ambiguities that can arise for logical anomalies. Furthermore, we propose a novel algorithm that improves over the state of the art in the joint detection of structural and logical anomalies. It consists of a local and a global network branch. The first one inspects confined regions independent of their spatial locations in the input image and is primarily responsible for the detection of entirely new local structures. The second one learns a globally consistent representation of the training data through a bottleneck that enables the detection of violations of long-range dependencies, a key characteristic of many logical anomalies. We perform extensive evaluations on our new dataset to corroborate our claims.
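The saturation idea behind the generalized metric (the sPRO caption above describes it: once the overlap with an annotated anomaly exceeds a threshold s, the segmentation task counts as solved) can be sketched in a few lines. This is a minimal illustration, not the authors' reference implementation. It assumes that overlap is counted in pixels, that credit grows linearly below the threshold, and that scores are averaged over annotated regions; the function names `saturated_overlap` and `spro` are hypothetical.

```python
import numpy as np

def saturated_overlap(pred_mask, gt_mask, saturation):
    """Overlap (in pixels) of a prediction with one annotated anomaly,
    divided by the saturation threshold and capped at 1.
    Assumption: linear credit below the threshold."""
    overlap = np.logical_and(pred_mask, gt_mask).sum()
    return float(min(overlap / saturation, 1.0))

def spro(pred_mask, gt_masks, saturations):
    """Mean saturated overlap over all annotated anomalies (assumed aggregation)."""
    scores = [saturated_overlap(pred_mask, g, s)
              for g, s in zip(gt_masks, saturations)]
    return float(np.mean(scores))

# Toy example: a 16-pixel annotated region with saturation threshold 8.
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:4, 2:4] = True                # covers 4 of the 16 annotated pixels
print(spro(pred, [gt], [8]))         # 0.5: half of the required overlap
```

With a threshold equal to the full region area, this reduces to ordinary per-region overlap; a smaller threshold expresses that marking only part of an ambiguous logical anomaly is already sufficient.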


A Multi-view Camera Model for Line-Scan Cameras with Telecentric Lenses

February 2022 · 950 Reads · 10 Citations

Journal of Mathematical Imaging and Vision

We propose a novel multi-view camera model for line-scan cameras with telecentric lenses. The camera model supports an arbitrary number of cameras and assumes a linear relative motion with constant velocity between the cameras and the object. We distinguish two motion configurations. In the first configuration, all cameras move with independent motion vectors. In the second configuration, the cameras are mounted rigidly with respect to each other and therefore share a common motion vector. The camera model can model arbitrary lens distortions by supporting arbitrary positions of the line sensor with respect to the optical axis. We propose an algorithm to calibrate a multi-view telecentric line-scan camera setup. To facilitate a 3D reconstruction, we prove that an image pair acquired with two telecentric line-scan cameras can always be rectified to the epipolar standard configuration, in contrast to line-scan cameras with entocentric lenses, for which this is possible only under very restricted conditions. The rectification allows an arbitrary stereo algorithm to be used to calculate disparity images. We propose an efficient algorithm to compute 3D coordinates from these disparities. Experiments on real images show the validity of the proposed multi-view telecentric line-scan camera model.



The MVTec 3D-AD Dataset for Unsupervised 3D Anomaly Detection and Localization

December 2021 · 61 Reads

We introduce the first comprehensive 3D dataset for the task of unsupervised anomaly detection and localization. It is inspired by real-world visual inspection scenarios in which a model has to detect various types of defects on manufactured products, even if it is trained only on anomaly-free data. There are defects that manifest themselves as anomalies in the geometric structure of an object. These cause significant deviations in a 3D representation of the data. We employed a high-resolution industrial 3D sensor to acquire depth scans of 10 different object categories. For all object categories, we present a training and validation set, each of which solely consists of scans of anomaly-free samples. The corresponding test sets contain samples showing various defects such as scratches, dents, holes, contaminations, or deformations. Precise ground-truth annotations are provided for every anomalous test sample. An initial benchmark of 3D anomaly detection methods on our dataset indicates a considerable room for improvement.


Accurate and robust tracking of rigid objects in real time

June 2021 · 280 Reads · 7 Citations

Journal of Real-Time Image Processing

We present the shape model object tracker, which is accurate, robust, and real-time capable on a standard CPU. The tracker has a failure mode detection, is robust to nonlinear illumination changes, and can cope with occlusions. It uses subpixel-precise image edges to track roughly rigid objects with high accuracy and is virtually drift-free even for long sequences. Furthermore, it is inherently capable of object re-detection when tracking fails. To evaluate the accuracy, robustness, and efficiency of the tracker precisely, we present a challenging new tracking dataset with pixel-precise ground truth. The precise ground-truth labels are created automatically from the photo-realistic synthetic VIPER dataset. The tracker is thoroughly evaluated against the state of the art through a number of qualitative and quantitative experiments. It is able to perform on par with the current state-of-the-art deep-learning trackers, but is at least 45 times faster, even without using a GPU. The efficiency and low memory consumption of the tracker are validated in further experiments that are conducted on an embedded device.


The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

April 2021 · 1,918 Reads · 412 Citations

International Journal of Computer Vision

The detection of anomalous structures in natural image data is of utmost importance for numerous tasks in the field of computer vision. The development of methods for unsupervised anomaly detection requires data on which to train and evaluate new approaches and ideas. We introduce the MVTec anomaly detection dataset containing 5354 high-resolution color images of different object and texture categories. It contains normal, i.e., defect-free images intended for training and images with anomalies intended for testing. The anomalies manifest themselves in the form of over 70 different types of defects such as scratches, dents, contaminations, and various structural changes. In addition, we provide pixel-precise ground truth annotations for all anomalies. We conduct a thorough evaluation of current state-of-the-art unsupervised anomaly detection methods based on deep architectures such as convolutional autoencoders, generative adversarial networks, and feature descriptors using pretrained convolutional neural networks, as well as classical computer vision methods. We highlight the advantages and disadvantages of multiple performance metrics as well as threshold estimation techniques. This benchmark indicates that methods that leverage descriptors of pretrained networks outperform all other approaches and deep-learning-based generative models show considerable room for improvement.


A Camera Model for Line-Scan Cameras with Telecentric Lenses

January 2021 · 922 Reads · 23 Citations

International Journal of Computer Vision

We propose a camera model for line-scan cameras with telecentric lenses. The camera model assumes a linear relative motion with constant velocity between the camera and the object. It allows lens distortions to be modeled while supporting arbitrary positions of the line sensor with respect to the optical axis. We comprehensively examine the degeneracies of the camera model and propose methods to handle them. Furthermore, we examine the relation of the proposed camera model to affine cameras. In addition, we propose an algorithm to calibrate telecentric line-scan cameras using a planar calibration object. We perform an extensive evaluation of the proposed camera model that establishes the validity and accuracy of the proposed model. We also show that even for lenses with very small lens distortions, the distortions are statistically highly significant. Therefore, they cannot be omitted in real-world applications.


Uninformed Students: Student-Teacher Anomaly Detection With Discriminative Latent Embeddings

June 2020 · 391 Reads · 866 Citations

We introduce a powerful student-teacher framework for the challenging problem of unsupervised anomaly detection and pixel-precise anomaly segmentation in high-resolution images. Student networks are trained to regress the output of a descriptive teacher network that was pretrained on a large dataset of patches from natural images. This circumvents the need for prior data annotation. Anomalies are detected when the outputs of the student networks differ from that of the teacher network. This happens when they fail to generalize outside the manifold of anomaly-free training data. The intrinsic uncertainty in the student networks is used as an additional scoring function that indicates anomalies. We compare our method to a large number of existing deep learning based methods for unsupervised anomaly detection. Our experiments demonstrate improvements over state-of-the-art methods on a number of real-world datasets, including the recently introduced MVTec Anomaly Detection dataset that was specifically designed to benchmark anomaly segmentation algorithms.
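The two scoring signals named in the abstract, disagreement between students and teacher and the students' own uncertainty, can be illustrated with a small NumPy sketch. It is a sketch under stated assumptions rather than the published implementation: descriptors are assumed to be dense arrays of shape (H, W, d), disagreement is measured as the squared distance between the mean student output and the teacher output, and uncertainty as the variance across students. The function name `student_teacher_scores` is hypothetical, and any normalization or combination of the two maps is left out.

```python
import numpy as np

def student_teacher_scores(teacher, students):
    """Per-pixel anomaly signals from a student ensemble.

    teacher:  (H, W, d) descriptors from the teacher network
    students: (M, H, W, d) descriptors from M student networks
    Returns (regression_error, predictive_variance), each of shape (H, W).
    Assumption: squared Euclidean distances are used for both signals.
    """
    mean_pred = students.mean(axis=0)
    # Disagreement between the student ensemble mean and the teacher.
    regression_error = ((mean_pred - teacher) ** 2).sum(axis=-1)
    # Spread of the students around their own mean.
    predictive_variance = ((students ** 2).sum(axis=-1).mean(axis=0)
                           - (mean_pred ** 2).sum(axis=-1))
    return regression_error, predictive_variance

# Toy usage with random descriptors (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 4, 8))
students = teacher + 0.01 * rng.normal(size=(3, 4, 4, 8))
err, var = student_teacher_scores(teacher, students)
print(err.shape, var.shape)
```

On anomaly-free regions both maps should stay small because the students were trained to imitate the teacher there; large values in either map point to regions outside the training distribution.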


Figure 1: Qualitative results of our anomaly detection method on the MVTec Anomaly Detection dataset. Top row: Defective input images. Center row: Ground truth regions of defects in red. Bottom row: Anomaly scores for each image pixel predicted by our algorithm.
Figure 5: Anomaly detection at multiple scales: Architectures with receptive field of size p = 17 manage to accurately segment the small scratch on the capsule (top row). However, defects at a larger scale such as the missing imprint (bottom row) become problematic. For increasingly larger receptive fields, the segmentation performance for the larger anomaly increases while it decreases for the smaller one. Our multiscale architecture mitigates this problem by combining multiple receptive fields.
Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings

March 2020 · 1,127 Reads · 1 Citation



Citations (70)


... With the advancements in UAD using deep learning, reconstruction-based methods [16]-[18] have emerged as prominent strategies by learning to accurately reconstruct normal samples while minimizing reconstruction loss, thereby enabling the identification of anomalies as deviations from expected reconstructions. Intuitively, integrating reconstruction loss functions to locally optimize the model, followed by aggregating the model weights through federated learning approaches, presents a promising avenue for enabling federated learning in UAD. ...

Reference:

FedDyMem: Efficient Federated Learning with Dynamic Memory and Memory-Reduce for Unsupervised Image Anomaly Detection
Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders
  • Citing Conference Paper
  • January 2019

... Under the MUAD setting, current common datasets include MVTecAD [4], Real-IAD [5], and MVTec LOCO [6], which consist of various categories, such as fruits, snacks, pills, wine bottles, and fabrics, as shown in Fig. 2. ...

Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization

International Journal of Computer Vision

... Recognizing these challenges, several multimodal datasets that integrate 2D and 3D point-cloud data have been developed to more effectively capture the complexity of real-world industrial environments. For instance, MVTec 3D-AD [4] is designed for unsupervised 3D anomaly detection and localization, targeting geometric anomalies such as scratches, dents, and contaminations across 10 object categories. Another noteworthy dataset, Eyecandies [5], introduces synthetic images of 10 candy-like objects with precise 2D, depth, and normal map annotations, offering automated and unbiased labeling using synthetic data. ...

The MVTec 3D-AD Dataset for Unsupervised 3D Anomaly Detection and Localization
  • Citing Conference Paper
  • January 2022

... In 2017, Steger presented a new camera model based on the relationship between projected camera matrices, but the accuracy was too low [16]. Subsequently, he used a scanning camera model considering lens aberration, but it could only be applied to the telecenter [17,18]. In 2019, Yin X Q proposed an aberration-based model, but it was difficult to be applied to the reconstruction of 3D measurements [19]. ...

A Multi-view Camera Model for Line-Scan Cameras with Telecentric Lenses

Journal of Mathematical Imaging and Vision

... For many machine learning applications relevant for manufacturing, established ways and datasets to benchmark the performance of algorithms exist, like for causal discovery in quality data (Göbler et al., 2024), anomaly detection in images from optical inspections (Bergmann et al., 2021), or reinforcement learning in continuous control tasks (Duan et al., 2016). However, for process drift detection, such a framework is yet missing to the best of our knowledge. ...

The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

International Journal of Computer Vision

... There are very few CPP solutions for line scanning robotic systems [21]. Compared to area scan sensors, a line scan sensor is more suitable for defect inspection in industrial/manufacturing applications due to higher spatial resolution and lower production costs [22], [23]. Unlike conventional area cameras or optical sensors that operate at discrete positions, a line scanner employs a single beam of scanning light to detect 3D objects, requiring continuous movement along a coverage path via a robotic manipulator. ...

A Camera Model for Line-Scan Cameras with Telecentric Lenses

International Journal of Computer Vision

... To address the limitations of reconstruction-based methods in feature boundary representation and background interference, feature contrast-based methods have further enhanced detection robustness by comparing the feature differences between normal and abnormal samples. The Student-Teacher (S-T) anomaly-detection framework [18] employs a student-teacher architecture, utilizing knowledge distillation to learn the feature distribution deviations of normal samples for anomaly detection. Although this method exhibits strong robustness, its performance degrades in scenarios with limited data or class imbalance between normal and abnormal samples. ...

Uninformed Students: Student-Teacher Anomaly Detection With Discriminative Latent Embeddings
  • Citing Conference Paper
  • June 2020

... Furthermore, Liu et al. [14] have proposed that refining specific algorithms tailored for object detection holds significant promise in the realm of transmission line inspections. This is particularly significant as conventional models often encounter difficulties when categorizing images with diverse background interferences [15,16]. To facilitate efficient inspections using Unmanned Aerial Vehicles (UAVs), which are increasingly popular, a thorough analysis and specialized model training are essential [17]. ...

Accurate and robust tracking of rigid objects in real time

Journal of Real-Time Image Processing

... By learning the normal features after the recovery mask in a self-supervised way, MCA can effectively inhibit the excessive reconstruction of abnormal features by FR in the inference stage. • Extensive experiments have been conducted on two challenging public datasets, MVTec AD [30] and BTAD [31]. The results show that FTR achieves state-of-the-art performance in image-level anomaly detection. ...

MVTec AD - A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

... This type of special optical arrangement utilizes a curved image plane rather than a conventional flat field like the F-Theta lens. In the case of a common hypercentric system, the principal rays of the objective corresponding to individual object points intersect in an external convergence point in front of the lens [28,29]. The diameter of hypercentric lenses is larger than the size of the object under examination. ...

A camera model for cameras with hypercentric lenses and some example applications

Machine Vision and Applications