Alexander Binder

  • Ph.D. (Dr. rer. nat.)
  • Professor (Assistant) at Singapore University of Technology and Design

About

99
Publications
66,116
Reads
15,706
Citations
Current institution
Singapore University of Technology and Design
Current position
  • Professor (Assistant)
Additional affiliations
August 2015 - present
Singapore University of Technology and Design
Position
  • Professor (Assistant)
Description
  • Interpretation of Predictions made by deep learning models
May 2013 - present
Fraunhofer Institute for Open Communication Systems
Position
  • Senior Researcher
Description
  • real-time multi-sensor fusion for car localization in the absence of GPS signals: video detection and tracking algorithms
May 2013 - present
Technische Universität Berlin
Position
  • Senior Researcher
Description
  • structure detection in histopathology

Publications

Technical Report
Full-text available
Deep Neural Networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multi-layer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision given a ne...
Article
Full-text available
We propose a localized approach to multiple kernel learning that, in contrast to prevalent approaches, can be formulated as a convex optimization problem over a given cluster structure. From this we obtain the first generalization error bounds for localized multiple kernel learning and derive an efficient optimization algorithm based on the Fenche...
Article
Full-text available
Understanding and interpreting classification decisions of automated image classification systems is of high value in many applications as it allows to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods are solving very successfully a plethora of tasks, they have in most cas...
Preprint
Full-text available
In this paper, we present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors that utilizes explainability, specifically Layer-wise Relevance Propagation (LRP), to assign rewards to individual connections based on their respective contributions to solving a given task. This differs from traditional gra...
Preprint
Explainable AI (XAI) is slowly becoming a key component for many AI applications. Rule-based and modified backpropagation XAI approaches however often face challenges when being applied to modern model architectures including innovative layer building blocks, which is caused by two reasons. Firstly, the high flexibility of rule-based XAI methods le...
Preprint
Full-text available
While the evaluation of explanations is an important step towards trustworthy models, it needs to be done carefully, and the employed metrics need to be well-understood. Specifically model randomization testing is often overestimated and regarded as a sole criterion for selecting or discarding certain explanation methods. To address shortcomings of...
Article
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex and opaque machine learning (ML) models. Despite the development of a multitude of methods to explain the decisions of black-box classifiers in recent years, these tools are seldom used beyond visualization purposes. Only recently, rese...
Article
Full-text available
There is an increasing number of medical use cases where classification algorithms based on deep neural networks reach performance levels that are competitive with human medical experts. To alleviate the challenges of small dataset sizes, these systems often rely on pretraining. In this work, we aim to assess the broader implications of these appro...
Chapter
Visual counterfeits (CNN-generated images) are increasingly causing an existential conundrum in mainstream media with the rapid evolution of neural image synthesis methods. Though detection of such counterfeits has been a taxing problem in the image forensics community, a recent class of forensic detec...
Preprint
Full-text available
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex and opaque machine learning (ML) models. Despite the development of a multitude of methods to explain the decisions of black-box classifiers in recent years, these tools are seldom used beyond visualization purposes. Only recently, rese...
Article
When neural networks are employed for high-stakes decision-making, it is desirable that they provide explanations for their prediction in order for us to understand the features that have contributed to the decision. At the same time, it is important to flag potential outliers for in-depth verification by domain experts. In this work we propose to...
Preprint
Few-shot classifiers have been shown to exhibit promising results in use cases where user-provided labels are scarce. These models are able to learn to predict novel classes simply by training on a non-overlapping set of classes. This can be largely attributed to the differences in their mechanisms as compared to conventional deep networks. However...
Article
This paper analyzes the predictions of image captioning models with attention mechanisms beyond visualizing the attention itself. We develop variants of layer-wise relevance propagation (LRP) and gradient-based explanation methods, tailored to image captioning models with attention mechanisms. We compare the interpretability of attention heatmaps s...
Preprint
There is an increasing number of medical use-cases where classification algorithms based on deep neural networks reach performance levels that are competitive with human medical experts. To alleviate the challenges of small dataset sizes, these systems often rely on pretraining. In this work, we aim to assess the broader implications of these appro...
Article
Full-text available
Recent advances in cancer research and diagnostics largely rely on new developments in microscopic or molecular profiling techniques, offering high levels of detail with respect to either spatial or molecular features, but usually not both. Here, we present an explainable machine-learning approach for the integrated profiling of morphological, mole...
Article
Full-text available
The success of convolutional neural networks (CNNs) in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the weights of various layers while at the same time aiming to not sacrifice performance. In this paper, we propose...
Chapter
We investigate the robustness of stochastic ANNs to adversarial attacks. We perform experiments on three known datasets. Our experiments reveal similar susceptibility of stochastic ANNs compared to conventional ANNs when confronted with simple iterative gradient based attacks in the white-box settings. We observe, however, that in black-box setting...
Preprint
Few-shot classifiers excel under limited training samples, making them useful in real-world applications. However, the advent of adversarial samples threatens the efficacy of such classifiers. For them to remain reliable, defences against such attacks must be explored. However, closer examination of prior literature reveals a big gap in this domain....
Preprint
Understanding the features that contributed to a prediction is important for high-stake tasks. In this work, we revisit the idea of a student network to provide an example-based explanation for its prediction in two forms: i) identify top-k most relevant prototype examples and ii) show evidence of similarity between the prediction sample and each o...
Preprint
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets. This setup faces challenges originating from the limited labeled data in each class and, additionally, from the domain shift between training and test sets. In this paper, we introduce a nove...
Chapter
The eligibility for hormone therapy to treat breast cancer largely depends on the tumor’s estrogen receptor (ER) status. Recent studies show that the ER status correlates with morphological features found in Haematoxylin-Eosin (HE) slides. Thus, HE analysis might be sufficient for patients for whom the classifier confidently predicts the ER status...
Preprint
Integrated gradients as an attribution method for deep neural network models offers simple implementability. However, it also suffers from noisiness of explanations, which affects the ease of interpretability. In this paper, we present Smooth Integrated Gradients as a statistically improved attribution method inspired by Taylor's theorem, which doe...
Article
Full-text available
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, many explanation methods have emerged. This work shows how heatmaps generated by these explanatio...
Conference Paper
Full-text available
Deep approaches to anomaly detection have recently shown promising results over shallow methods on large and complex datasets. Typically, anomaly detection is treated as an unsupervised learning problem. In practice, however, one may have, in addition to a large set of unlabeled samples, access to a small pool of labeled samples, e.g. a subset veri...
Preprint
Anomaly detection algorithms find extensive use in various fields. This area of research has recently made great advances thanks to deep learning. A recent method, the deep Support Vector Data Description (deep SVDD), which is inspired by the classic kernel-based Support Vector Data Description (SVDD), is capable of simultaneously learning a featur...
Preprint
This paper explains predictions of image captioning models with attention mechanisms beyond visualizing the attention itself. In this paper, we develop variants of layer-wise relevance backpropagation (LRP) and gradient backpropagation, tailored to image captioning with attention. The result provides simultaneously pixel-wise image explanation and...
Preprint
Full-text available
Within the last decade, neural network based predictors have demonstrated impressive - and at times super-human - capabilities. This performance is often paid for with an intransparent prediction process and thus has sparked numerous contributions in the novel field of explainable artificial intelligence (XAI). In this paper, we focus on a popular...
Chapter
Full-text available
For a machine learning model to generalize well, one needs to ensure that its decisions are supported by meaningful patterns in the input data. A prerequisite is however for the model to be able to explain itself, e.g. by highlighting which input features it uses to support its prediction. Layer-wise Relevance Propagation (LRP) is a technique that...
Preprint
Full-text available
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, explanation methods have emerged, which are so far still rarely used in medicine. This work shows...
Preprint
Full-text available
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, explanation methods have emerged, which are so far still rarely used in medicine. This work shows...
Conference Paper
Full-text available
Deep approaches to anomaly detection have recently shown promising results over shallow detectors on large and high-dimensional data. Most of these approaches view this task as an unsupervised learning problem. In practice, however, one may have, in addition to a large set of unlabeled samples, access to a small pool of labeled samples, e.g. sampl...
Preprint
Full-text available
Deep approaches to anomaly detection have recently shown promising results over shallow approaches on high-dimensional data. Typically, anomaly detection is treated as an unsupervised learning problem. In practice, however, one may have, in addition to a large set of unlabeled samples, access to a small pool of labeled samples, e.g. a subset verifi...
Article
Full-text available
Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-sol...
Preprint
Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-sol...
Conference Paper
Full-text available
Despite the great advances made by deep learning in many machine learning problems, there is a relative dearth of deep learning approaches for anomaly detection. Those approaches which do exist involve networks trained to perform a task other than anomaly detection, namely generative models or compression, which are in turn adapted for use in anoma...
Article
Full-text available
Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a ne...
Conference Paper
Full-text available
Recently, deep neural networks have demonstrated excellent performances in recognizing the age and gender on human face images. However, these models were applied in a black-box manner with no information provided about which facial features are actually used for prediction and how these features depend on image preprocessing, model initialization...
Conference Paper
Full-text available
Semantic boundary and edge detection aims at simultaneously detecting object edge pixels in images and assigning class labels to them. Systematic training of predictors for this task requires the labeling of edges in images which is a particularly tedious task. We propose a novel strategy for solving this task, when pixel-level annotations are not...
Preprint
Recently, deep neural networks have demonstrated excellent performances in recognizing the age and gender on human face images. However, these models were applied in a black-box manner with no information provided about which facial features are actually used for prediction and how these features depend on image preprocessing, model initialization...
Article
Full-text available
Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems, e.g., image classification, natural language processing or human action recognition. Although these methods perform impressively well, they have a significant disadvantage, the lack of transparency, limiting the interpretab...
Conference Paper
Full-text available
Complex nonlinear models such as deep neural networks (DNNs) have become an important tool for image classification, speech recognition, natural language processing, and many other fields of application. These models, however, lack transparency due to their complex nonlinear structure and to the complex data distributions to which they typically apply...
Article
Full-text available
Semantic boundary and edge detection aims at simultaneously detecting object edge pixels in images and assigning class labels to them. Systematic training of predictors for this task requires the labeling of edges in images which is a particularly tedious task. We propose a novel strategy for solving this task in an almost zero-shot manner by relyi...
Conference Paper
Full-text available
We summarize the main concepts behind a recently proposed method for explaining neural network predictions called deep Taylor decomposition. For conciseness, we only present the case of simple neural networks of ReLU neurons organized in a directed acyclic graph. More structured networks with special layers are discussed in the original paper (Mont...
Conference Paper
Full-text available
We state some key properties of the recently proposed Layer-wise Relevance Propagation (LRP) method, that make it particularly suitable for model analysis and validation. We also review the capabilities and advantages of the LRP method on empirical data, that we have observed in several previous works.
Article
Full-text available
The Layer-wise Relevance Propagation (LRP) algorithm explains a classifier's prediction specific to a given data point by attributing relevance scores to important components of the input by using the topology of the learned model itself. With the LRP Toolbox we provide platform-agnostic implementations for explaining the predictions of pre-trained...
Article
Full-text available
The Layer-wise Relevance Propagation (LRP) algorithm explains a classifier's prediction specific to a given data point by attributing relevance scores to important components of the input by using the topology of the learned model itself. With the LRP Toolbox we provide platform-agnostic implementations for explaining the predictions of pre-trained...
Conference Paper
Full-text available
Fisher vector (FV) classifiers and Deep Neural Networks (DNNs) are popular and successful algorithms for solving image classification problems. However, both are generally considered 'black box' predictors as the non-linear transformations involved have so far prevented transparent and interpretable reasoning. Recently, a principled technique, Laye...
Conference Paper
Full-text available
Layer-wise relevance propagation is a framework which allows to decompose the prediction of a deep neural network computed over a sample, e.g. an image, down to relevance scores for the single input dimensions of the sample such as subpixels of an image. While this approach can be applied directly to generalized linear mappings, product type non-li...
Preprint
Layer-wise relevance propagation is a framework which allows to decompose the prediction of a deep neural network computed over a sample, e.g. an image, down to relevance scores for the single input dimensions of the sample such as subpixels of an image. While this approach can be applied directly to generalized linear mappings, product type non-li...
Conference Paper
Full-text available
We present an application of the Layer-wise Relevance Propagation (LRP) algorithm to state of the art deep convolutional neural networks and Fisher Vector classifiers to compare the image perception and prediction strategies of both classifiers with the use of visualized heatmaps. Layer-wise Relevance Propagation (LRP) is a method to compute scores...
Preprint
We present an application of the Layer-wise Relevance Propagation (LRP) algorithm to state of the art deep convolutional neural networks and Fisher Vector classifiers to compare the image perception and prediction strategies of both classifiers with the use of visualized heatmaps. Layer-wise Relevance Propagation (LRP) is a method to compute scores...
Conference Paper
Full-text available
We present the application of layer-wise relevance propagation to several deep neural networks such as the BVLC reference neural net and GoogLeNet trained on ImageNet and MIT Places datasets. Layer-wise relevance propagation is a method to compute scores for image pixels and image regions denoting the impact of the particular image region on the pr...
Article
Full-text available
Fisher Vector classifiers and Deep Neural Networks (DNNs) are popular and successful algorithms for solving image classification problems. However, both are generally considered 'black box' predictors as the non-linear transformations involved have so far prevented transparent and interpretable reasoning. Recently, a principled technique, Layer-wis...
Technical Report
Full-text available
Deep Neural Networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multi-layer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision given a ne...
Article
Full-text available
This paper studies the generalization performance of multi-class classification algorithms, for which we obtain, for the first time, a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis. The t...
Article
Neuroscientific data is typically analyzed based on the behavioral response of the participant. However, the errors made may or may not be in line with the neural processing. In particular in experiments with time pressure or studies where the threshold of perception is measured, the error distribution deviates from uniformity due to the structure...
Chapter
Recognition of a large set of generic visual concepts in images and ranking of images based on visual semantics is one of the unsolved tasks for future multimedia and scientific applications based on image collections. From that perspective, improvements in the quality of semantic annotations for image data are well matched to the goals of the THESE...
Conference Paper
Full-text available
In many real-world applications, the simplified assumption of independent and identically distributed noise breaks down, and labels can have structured, systematic noise. For example, in brain-computer interface applications, training data is often the result of lengthy experimental sessions, where the attention levels of participants can change...
Conference Paper
Full-text available
Conventionally, neuroscientific data is analyzed based on the behavioral response of the participant. This approach assumes that behavioral errors of participants are in line with the neural processing. However, this may not be the case, in particular in experiments with time pressure or studies investigating the threshold of perception. In these c...
Conference Paper
Full-text available
Vehicular positioning technologies enable a broad range of applications and services such as navigation systems, driver assistance systems and self-driving vehicles. However, Global Navigation Satellite Systems (GNSS) do not work in enclosed areas such as parking garages. For these scenarios, a wide range of indoor positioning technologies are avai...
Conference Paper
Full-text available
Combining information from different sources is a common way to improve classification accuracy in Brain-Computer Interfacing (BCI). For instance, in small sample settings it is useful to integrate data from other subjects or sessions in order to improve the estimation quality of the spatial filters or the classifier. Since data from different subj...
Article
In this paper we propose a novel biased random sampling strategy for image representation in Bag-of-Words models. We evaluate its impact on the feature properties and the ranking quality for a set of semantic concepts and show that it improves performance of classifiers in image annotation tasks and increases the correlation between kernels and lab...
Thesis
This dissertation addresses the recognition of visual concepts in images using methods from statistical machine learning. The goal of recognition within this dissertation is to assign to an image, for each visual concept, a real value that corresponds to a measure of (non-probabilistic) confidence in the presence of the conce...
Article
Full-text available
We study the problem of classifying images into a given, pre-determined taxonomy. This task can be elegantly translated into the structured learning framework. However, despite its power, structured learning has known limits in scalability due to its high memory requirements and slow training process. We propose an efficient approximation of the st...
Article
Full-text available
Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow to determine an optimal linear combination of such similarity matrices. Classi...
Data
Full-text available
The file Table S1 contains AP scores on ImageCLEF2010 test data with fixed -norm for each of the 93 concept classes listed separately. (PDF)
Preprint
Full-text available
Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow to determine an optimal linear combination of such similarity matrices. Classi...
Conference Paper
Full-text available
In object classification tasks from digital photographs, multiple categories are considered for annotation. Some of these visual concepts may have semantic relations and can appear simultaneously in images. Although taxonomical relations and co-occurrence structures between object categories have been studied, it is not easy to use such information...
Conference Paper
Many modern applications from the domain of image classification, such as natural photo categorization, come with highly variable concepts; to this end, state-of-the-art solutions employ a large number of heterogeneous image features, leaving a demand for combining information across many descriptors. In the paradigm of kernel-based learning, the m...
Conference Paper
Full-text available
Vocabulary generation is the essential step in the bag-of-words image representation for visual concept recognition, because its quality affects classification performance substantially. In this paper, we propose a hybrid method for visual word generation which combines unsupervised density-based clustering with the discriminative power of fast sup...
Conference Paper
Full-text available
In this paper we present details on the joint submission of TU Berlin and Fraunhofer FIRST to the ImageCLEF 2011 Photo Annotation Task. We sought to experiment with extensions of Bag-of-Words (BoW) models at several levels and to apply several kernel-based learning methods recently developed in our group. For classifier training we used non-sparse m...
Conference Paper
Full-text available
Automatic annotation of images is a challenging task in computer vision because of the “semantic gap” between high-level visual concepts and image appearances. Therefore, user tags attached to images can provide further information to bridge the gap, even though they are partially uninformative and misleading. In this work, we investigate multi-modal v...
Conference Paper
Full-text available
In order to achieve good performance in image annotation tasks, it is necessary to combine information from various image features. In recent competitions on photo annotation, many groups employed the bag-of-words (BoW) representations based on the SIFT descriptors over various color channels. In fact, it has been observed that adding other less in...
Conference Paper
Full-text available
The quality of visual vocabularies is crucial for the performance of bag-of-words image classification methods. Several approaches have been developed for codebook construction, the most popular method is to cluster a set of image features (e.g. SIFT) by k-means. In this paper, we propose a two-step procedure which incorporates label information in...
Conference Paper
Full-text available
In recent years bag-of-visual-words representations have gained increasing popularity in the field of image classification. Their performance relies heavily on creating a good visual vocabulary from a set of image features (e.g. SIFT). For real-world photo archives such as Flickr, codebooks with more than a few thousand words are desirable, which...
Article
Full-text available
We have developed a machine learning toolbox, called SHOGUN, which is designed for unified large-scale learning for a broad range of feature types and learning settings. It offers a considerable number of machine learning models such as support vector machines for classification and regression, hidden Markov models, multiple kernel learning, linear...
Conference Paper
Full-text available
We study the problem of classifying images into a given, pre-determined taxonomy. The task can be elegantly translated into the structured learning framework. Structured learning, however, is known for its memory consuming and slow training processes. The contribution of our paper is twofold: Firstly, we propose an efficient decomposition of the st...
Conference Paper
In order to achieve good performance in object classification problems, it is necessary to combine information from various image features. Because the large margin classifiers are constructed based on similarity measures between samples called kernels, finding appropriate feature combinations boils down to designing good kernels among a set of can...
Conference Paper
Full-text available
Combining information from various image descriptors has become a standard technique for image classification tasks. Multiple kernel learning (MKL) approaches allow to determine the optimal combination of such similarity matrices and the optimal classifier simultaneously. Most MKL approaches employ an l1-regularization on the mixing coefficients to...
Article
Apoptosis mediated via CD95 (Fas/Apo-1) is a key regulator for the biology of normal and malignant lymphocytes. Although the function of CD95 on B-cell chronic lymphocytic leukemia cells (B-CLL cells) has been studied intensively, the clinical importance of CD95 expression on normal T cells in B-CLL has not been clarified. This study aimed to inves...
Article
Full-text available
We have developed R-bindings for our machine learning toolbox SHOGUN, which features algorithms for hidden Markov models, regression and classification problems. SHOGUN's focus is on Support Vector Machines, but it also implements a number of linear methods like Linear Discriminant Analysis, Linear Programming Machines and Perceptrons. It provides a...
Article
In order to achieve good performance in image annotation tasks, it is necessary to combine information from various image features. In our submission, we applied the non-sparse multiple kernel learning for feature combination proposed by Kloft et al. (2009) to the ImageCLEF2009 photo annotation data. Since some of the concepts of the ImageCLEF...
Article
Full-text available
Recent research has shown that combining various image features significantly improves the object classification performance. Multiple kernel learning (MKL) approaches, where the mixing weights at the kernel level are optimized simultaneously with the classifier parameters, give a well-founded framework to control the importance of each feature. A...
