Conference Paper

Towards Best Practice in Explaining Neural Network Decisions with LRP

... While LRP builds the general framework for the backpropagation, several LRP rules optimizing different properties have been introduced. Best practices on the selection of rules have been developed for the most common model architectures [24,31]. ...
... Profound evaluations on configuring the LRP rule composites and initializing the backward pass have been developed for certain standard classification models [24,29,31]. However, there is no best practice for applying LRP to object detection models. ...
... Concept Attribution Localization As glCA can be used for explaining the detection of a certain concept within an input, we want to measure the localization capabilities in the explanations for Option (a) of our approach. We, therefore, use an adapted version of the attribution localization metric [24], which is implemented in the Quantus toolbox [20], that measures the ratio of positively attributed relevance within a binary class mask to the overall positive relevance. This metric can also be applied on the concept level when concept mask annotations are provided, as in the Broden [8] dataset. ...
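For orientation, the attribution localization metric described in this excerpt reduces to the share of positive relevance that falls inside a binary mask; the sketch below is a minimal NumPy illustration with hypothetical names, not the Quantus implementation.

```python
import numpy as np

def attribution_localization(relevance, mask):
    """Ratio of positive relevance inside a binary mask to all positive relevance.

    relevance: 2D array of attribution scores for one input.
    mask:      2D boolean array marking the class (or concept) region.
    """
    positive = np.clip(relevance, 0, None)   # keep positive attributions only
    total = positive.sum()
    if total == 0:
        return 0.0                           # no positive relevance to localize
    return float(positive[mask].sum() / total)
```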
Preprint
Full-text available
Ensuring the quality of black-box Deep Neural Networks (DNNs) has become ever more significant, especially in safety-critical domains such as automated driving. While global concept encodings generally enable a user to test a model for a specific concept, linking global concept encodings to the local processing of single network inputs reveals their strengths and limitations. Our proposed framework global-to-local Concept Attribution (glCA) uses approaches from local (why a specific prediction originates) and global (how a model works generally) eXplainable Artificial Intelligence (xAI) to test DNNs for a predefined semantical concept locally. The approach allows for conditioning local, post-hoc explanations on predefined semantic concepts encoded as linear directions in the model's latent space. Pixel-exact scoring concerning the global concept usage assists the tester in further understanding the model processing of single data points for the selected concept. Our approach has the advantage of fully covering the model-internal encoding of the semantic concept and allowing the localization of relevant concept-related information. The results show major differences in the local perception and usage of individual global concept encodings and call for further investigation into obtaining thorough semantic concept encodings.
... Application-grounded perspectives may consider the needs of explanation recipients (explainees) [39,42] and an increase in task performance for applied human-AI decision making [16] or the completeness and soundness of an explanation [21], e.g., with respect to given metrics such as the coverage of relevant image regions [19]. Facial expression recognition, which is the application of this work, is usually a multi-class problem. ...
... While the ResNet-18 from [31] is trained on the CK+ data set as well as on the Actor Study data set, the VGG-16 is trained on a variety of different data sets from vastly different settings (e.g., in-the-wild and in-the-lab): Actor Study [37] (excluding subjects 11-21), Aff-Wild2 [20], BP4D [41], CK+ [25], the manually annotated subset of EmotioNet [5], and UNBC [26]. We use the same training procedure as in [29] to retrain the VGG-16 without the Actor Study subjects 11-21, which is then our testing data. ...
... This metric is beneficial if there is an imbalanced ratio of displayed and non-displayed classes, which is the case for AUs [29]. The ResNet-18 is evaluated with a leave-one-out cross validation on the Actor Study data set, and the performance of the VGG-16 is evaluated on the validation data set, and additionally on the testing part of the Actor Study (subjects 11-21). ...
Chapter
Full-text available
Research in the field of explainable artificial intelligence has produced a vast amount of visual explanation methods for deep learning-based image classification in various domains of application. However, there is still a lack of domain-specific evaluation methods to assess an explanation’s quality and a classifier’s performance with respect to domain-specific requirements. In particular, evaluation methods could benefit from integrating human expertise into quality criteria and metrics. Such domain-specific evaluation methods can help to assess the robustness of deep learning models more precisely. In this paper, we present an approach for domain-specific evaluation of visual explanation methods in order to enhance the transparency of deep learning models and estimate their robustness accordingly. As an example use case, we apply our framework to facial expression recognition. We can show that the domain-specific evaluation is especially beneficial for challenging use cases such as facial expression recognition and provides application-grounded quality criteria that are not covered by standard evaluation methods. Our comparison of the domain-specific evaluation method with standard approaches thus shows that the quality of the expert knowledge is of great importance for assessing a model’s performance precisely.
... For an overview of the different rules that have been proposed to define the relevance contributions R_{i←j}^{(l−1,l)}, see Bach et al. [44], Kohlbrenner et al. [98], and Samek et al. [15]. Note that in a linear network f(x) = Σ_i x_i w_{ij}, in which R_j = f(x), the relevance contributions R_{i←j} are directly given by R_{ij} = x_i w_{ij}. ...
... 3D-DeepLight represents a fully-convolutional neural network, in which the convolution kernels are activated through ReLU functions (see Eq. (2)). Based on recent empirical work in computer vision [98], which has shown that class discriminability and object localization of the LRP technique can be increased for these types of networks, we define the relevance contributions R_{i←j}^{(l−1,l)}. To satisfy the local conservation property (see Eq. (18)), α and β are restricted to α + β = 1 (we set α = 2, in line with Kohlbrenner et al. [98], Samek et al. [15]). ...
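For reference, the LRP-αβ rule discussed in the excerpts above redistributes relevance according to the following standard formulation (notation assumed here, with z_ij = x_i w_ij; it is not quoted from the cited work):

```latex
R_{i \leftarrow j}^{(l-1,l)}
  = \left( \alpha \frac{z_{ij}^{+}}{\sum_{i'} z_{i'j}^{+}}
         - \beta  \frac{z_{ij}^{-}}{\sum_{i'} z_{i'j}^{-}} \right) R_{j}^{(l)},
\qquad z_{ij} = x_i w_{ij}, \quad \alpha - \beta = 1 .
```

An equivalent formulation writes the second term with a plus sign and requires α + β = 1 with β taken to be negative. Setting α = 2 (β = 1 in the form above) gives the α2β1 configuration referenced in the excerpt, while α = 1, β = 0 is the z+ rule.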
... Other methods (e.g., Integrated Gradients [16]) require a baseline (as a starting point from which the integral is computed), sliding window shapes (Occlusion [17]), or a decision whether to multiply the result by inputs or not (e.g., Integrated Gradients, DeepLIFT [18]). In Layer-wise Relevance Propagation (LRP) [19,20], one has to choose a rule for each layer. The results depend on this choice, as illustrated by Figure 4. ...
... We investigated the following post-hoc explainability methods: Gradients [31], Input x Gradients [31], Integrated Gradients [16], Guided Backpropagation [32], Deconvolution [17], DeepLIFT [18], Guided Grad-CAM [33], Occlusion [17], LIME [14], SHAP [15] and three variants of Layer-wise Relevance Propagation (LRP) [19,20] composites: EpsilonPlusFlat (the LRP-ε rule for dense layers, LRP-αβ (α = 1, β = 0), also called the ZPlus rule, for convolutional layers, and the flat rule for the first linear layer), EpsilonGammaBox (the LRP-ε rule for dense layers, the LRP-γ rule (γ = 0.25) for convolutional layers, and the LRP-ZB rule (or box rule) for the first layer) and EpsilonAlpha2Beta1Flat (the LRP-ε rule for dense layers, LRP-αβ (α = 2, β = 1) for convolutional layers and the flat rule for the first linear layer) [34]. Moreover, one has to make sure that the model architecture is compatible with the rules used in LRP. ...
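The three composites named in this excerpt correspond to ready-made classes in the Zennit library (also referenced later on this page); the following sketch, which assumes a torchvision VGG-16 and the Zennit attributor API, shows how one of them is attached for a single attribution pass.

```python
import torch
from torchvision.models import vgg16
from zennit.composites import EpsilonGammaBox  # EpsilonPlusFlat, EpsilonAlpha2Beta1Flat work the same way
from zennit.attribution import Gradient

model = vgg16().eval()                                   # randomly initialized stand-in model
data = torch.randn(1, 3, 224, 224, requires_grad=True)   # stand-in for a preprocessed image
target = torch.eye(1000)[[0]]                            # one-hot mask selecting output class 0

# EpsilonGammaBox needs the valid input range for the ZBox rule applied to the first layer.
composite = EpsilonGammaBox(low=-3.0, high=3.0)

with Gradient(model=model, composite=composite) as attributor:
    output, relevance = attributor(data, target)         # relevance has the same shape as the input
```

Swapping the composite class is all that is needed to reproduce the other two rule combinations listed above.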
Preprint
Full-text available
Much machine learning research progress is based on developing models and evaluating them on a benchmark dataset (e.g., ImageNet for images). However, applying such benchmark-successful methods to real-world data often does not work as expected. This is particularly the case for biological data where we expect variability at multiple time and spatial scales. In this work, we are using grain data and the goal is to detect diseases and damages. Pink fusarium, skinned grains, and other diseases and damages are key factors in setting the price of grains or excluding dangerous grains from food production. Apart from challenges stemming from differences of the data from the standard toy datasets, we also present challenges that need to be overcome when explaining deep learning models. For example, explainability methods have many hyperparameters that can give different results, and the ones published in the papers do not work on dissimilar images. Other challenges are more general: problems with visualization of the explanations and their comparison since the magnitudes of their values differ from method to method. An open fundamental question also is: How to evaluate explanations? It is a non-trivial task because the "ground truth" is usually missing or ill-defined. Also, human annotators may create what they think is an explanation of the task at hand, yet the machine learning model might solve it in a different and perhaps counter-intuitive way. We discuss several of these challenges and evaluate various post-hoc explainability methods on grain data. We focus on robustness, quality of explanations, and similarity to particular "ground truth" annotations made by experts. The goal is to find the methods that overall perform well and could be used in this challenging task. We hope the proposed pipeline will be used as a framework for evaluating explainability methods in specific use cases.
... LRP has been successfully used to generate intuitions and measurable values describing the processing of variables in Neural Networks because its redistribution strategy follows relevance conservation and proportional decomposition principles, which preserve a strong connection with the model output [195]. ...
... The study of Lapuschkin et al. [106] illustrates a clear example of how explainability tools can assist data scientists in discovering hidden biases in learning models. Montavon et al. [191] and Kohlbrenner et al. [195] conducted reviews evaluating LRP approaches applied to Neural Networks. ...
Article
Full-text available
Intelligent applications supported by Machine Learning have achieved remarkable performance rates for a wide range of tasks in many domains. However, understanding why a trained algorithm makes a particular decision remains problematic. Given the growing interest in the application of learning-based models, some concerns arise in the dealing with sensible environments, which may impact users’ lives. The complex nature of those models’ decision mechanisms makes them the so-called “black boxes,” in which the understanding of the logic behind automated decision-making processes by humans is not trivial. Furthermore, the reasoning that leads a model to provide a specific prediction can be more important than performance metrics, which introduces a trade-off between interpretability and model accuracy. Explaining intelligent computer decisions can be regarded as a way to justify their reliability and establish trust. In this sense, explanations are critical tools that verify predictions to discover errors and biases previously hidden within the models’ complex structures, opening up vast possibilities for more responsible applications. In this review, we provide theoretical foundations of Explainable Artificial Intelligence (XAI), clarifying diffuse definitions and identifying research objectives, challenges, and future research lines related to turning opaque machine learning outputs into more transparent decisions. We also present a careful overview of the state-of-the-art explainability approaches, with a particular analysis of methods based on feature importance, such as the well-known LIME and SHAP. As a result, we highlight practical applications of the successful use of XAI.
... Relevance maps highlight predictive brain regions in individuals with dementia Based on the classifiers with the highest AUCs in the validation sets, we built an explainable pipeline for dementia prediction, LRP dementia , using composite LRP 43 , and a strategy to prioritize regions of the brain that contributed positively towards a prediction of dementia in the explanations. Using this pipeline, we computed out-of-sample relevance maps for all participants by applying the model for which the participant was unseen. ...
... where w_mn denotes the weight between a_m and a_n. We controlled the influence of different aspects of the explanations using a composite LRP strategy 43, combining different formulations of the LRP formula for the different layers in the model to enhance specific aspects of the relevance maps. Specifically, we employed a combination of alpha-beta and epsilon rules that have previously been shown to produce meaningful results for dementia classifiers 41,42. ...
Article
Full-text available
Deep learning approaches for clinical predictions based on magnetic resonance imaging data have shown great promise as a translational technology for diagnosis and prognosis in neurological disorders, but its clinical impact has been limited. This is partially attributed to the opaqueness of deep learning models, causing insufficient understanding of what underlies their decisions. To overcome this, we trained convolutional neural networks on structural brain scans to differentiate dementia patients from healthy controls, and applied layerwise relevance propagation to procure individual-level explanations of the model predictions. Through extensive validations we demonstrate that deviations recognized by the model corroborate existing knowledge of structural brain aberrations in dementia. By employing the explainable dementia classifier in a longitudinal dataset of patients with mild cognitive impairment, we show that the spatially rich explanations complement the model prediction when forecasting transition to dementia and help characterize the biological manifestation of disease in the individual brain. Overall, our work exemplifies the clinical potential of explainable artificial intelligence in precision medicine.
... Neurons with a low association score do not significantly contribute to the predictions, while those with a high magnitude do. These rules serve to elucidate the contribution of each input element to the output probability by systematically back-propagating the output probability through the neural network using localized propagation rules [26]. LRP differs from other techniques such as SA and LIME because it calculates explanations by considering the weights and inputs of each neural network layer [26]. ...
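As a concrete example of such a localized propagation rule, the sketch below implements the basic LRP-ε rule for a single fully connected layer; it is a generic, hypothetical helper for illustration, not code from the cited work.

```python
import torch

def lrp_epsilon_linear(a, weight, bias, relevance_out, eps=1e-6):
    """Minimal LRP-epsilon rule for one linear layer.

    a:             layer inputs, shape (batch, in_features)
    weight, bias:  layer parameters, shapes (out_features, in_features) and (out_features,)
    relevance_out: relevance of the layer outputs, shape (batch, out_features)
    """
    z = a @ weight.t() + bias                       # pre-activations z_j
    stab = eps * torch.where(z >= 0, torch.ones_like(z), -torch.ones_like(z))
    s = relevance_out / (z + stab)                  # stabilized relevance ratio
    c = s @ weight                                  # redistribute towards the inputs
    return a * c                                    # relevance assigned to the inputs
```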
Preprint
Full-text available
Accurately predicting the severity of traffic accidents is crucial for preventing them and safeguarding traffic safety. Practitioners need to understand the underlying predictive mechanisms to identify associated risk factors and develop appropriate interventions effectively. Unfortunately, existing research often falls short in predicting diverse outcomes, with some studies neglecting the latter entirely. Moreover, unlike traditional models, deep neural networks (DNNs) are difficult to design in an explainable way, which makes it hard to achieve explainability for approaches that incorporate neural networks. We propose a multi-task deep neural network framework designed to predict different types of injury severity, including injury, fatality, and property damage. Our proposed approach offers a thorough and precise method for analyzing crash injury severity. Unlike black-box models, our framework can pinpoint the critical factors contributing to injury severity by employing improved layer-wise relevance propagation. Experiments on Chinese traffic accidents demonstrate that our model accurately predicts the factors associated with injury severity and surpasses existing methods. Moreover, our experiments reveal that the critical factors identified by our approach are more logical and informative compared to those provided by baseline models. Additionally, our findings can assist policymakers in making more informed decisions when devising and implementing improvements in traffic safety.
... A number of LRP methods were tested with the three DCNN models. All the models returned the same results; therefore, we report the result from a single DCNN to identify the best-performing backpropagation method implemented in the iNNvestigate GitHub repository: Deep Taylor [27], Deep Taylor bounded [60], deconvnet (deconvolution) [61], guided backprop (guided backpropagation) [62], and LRP sequential preset a flat (LRP-SPF) [60]. The DCNN classification accuracy is evaluated for each of the above LRP methods separately, by performing a sequence of perturbation steps, as described in Section 3.3. ...
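The perturbation-based evaluation mentioned in this excerpt (often called pixel flipping or region perturbation) can be summarized in a few lines; the sketch below is a generic illustration with hypothetical helper names and does not reproduce the protocol of the cited study.

```python
import numpy as np

def perturbation_curve(model_predict, image, relevance, n_steps=20, patch=8):
    """Occlude patches in order of decreasing relevance and track the class score.

    model_predict: callable mapping an image to the target-class score.
    relevance:     attribution map with the same spatial size as `image`.
    A faster-dropping curve indicates a more faithful attribution method.
    """
    h, w = relevance.shape
    patches = [(relevance[y:y + patch, x:x + patch].mean(), y, x)
               for y in range(0, h, patch) for x in range(0, w, patch)]
    patches.sort(reverse=True)                      # most relevant regions first

    perturbed = image.copy()
    scores = [model_predict(perturbed)]
    for _, y, x in patches[:n_steps]:
        perturbed[y:y + patch, x:x + patch] = 0.0   # replace the region, e.g. with zeros
        scores.append(model_predict(perturbed))
    return np.array(scores)
```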
Preprint
Full-text available
Gait analysis, an expanding research area, employs non-invasive sensors and machine learning techniques for a range of applications. In this study, we concentrate on gait analysis for detecting cognitive decline in Parkinson's disease (PD) and under dual task conditions. Using convolutional neural networks (CNNs) and explainable machine learning, we objectively analyze gait data and associate findings with clinically relevant biomarkers. This is accomplished by connecting machine learning outputs to decisions based on human visual observations or derived quantitative gait parameters, which are tested and routinely implemented in current healthcare practice. Our analysis of gait deterioration due to cognitive decline in PD enables robust results using the proposed methods for assessing PD severity from ground reaction force (GRF) data. We achieved classification accuracies of 98% F1 scores for each PhysioNet.org dataset and 95.5% F1 scores for the combined PhysioNet dataset. By linking clinically observable features to the model outputs, we demonstrate the impact of PD severity on gait. Furthermore, we explore the significance of cognitive load in healthy gait analysis, resulting in robust classification accuracies of 100% F1 scores for subject identity verification. We also identify weaker features crucial for model predictions using Layer-Wise Relevance Propagation. A notable finding of this study reveals that cognitive deterioration's effect on gait influences body balance and foot landing/lifting dynamics in both classification cases: cognitive load in healthy gait and cognitive decline in PD gait.
... This process is repeated recursively from the final layer to the input layer, generating a relevancy heatmap that can be overlaid on the input image. Further properties of LRP and details of its theoretical basis are given in (Montavon et al., 2017 [171]), and a comparison of LRP to other interpretation methods can be found in ( [172][173][174]). ...
Article
Full-text available
The combination of medical imaging and deep learning has significantly improved diagnostic and prognostic capabilities in the healthcare domain. Nevertheless, the inherent complexity of deep learning models poses challenges in understanding their decision-making processes. Interpretability and visualization techniques have emerged as crucial tools to unravel the black-box nature of these models, providing insights into their inner workings and enhancing trust in their predictions. This survey paper comprehensively examines various interpretation and visualization techniques applied to deep learning models in medical imaging. The paper reviews methodologies, discusses their applications, and evaluates their effectiveness in enhancing the interpretability, reliability, and clinical relevance of deep learning models in medical image analysis.
... Since we make use of ImageNet-S50, we use, as ground truth, the segmentation maps proposed in the dataset. We consider two metrics: (a) Pointing Game (PG) [13], which measures whether the max pixel in the saliency maps falls within the segmentation map, and (b) Attribution Localization (AL) [14], which computes the precision of the saliency map w.r.t. the segmentation map. ...
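A minimal sketch of the Pointing Game check described here (illustrative only; toolboxes such as Quantus ship tested implementations of both metrics):

```python
import numpy as np

def pointing_game_hit(saliency, segmentation_mask):
    """True if the maximum of the saliency map falls inside the binary segmentation mask."""
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    return bool(segmentation_mask[y, x])
```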
... There is no method of choice for calculating these scores, so the modeller is free to determine what "relevance" means in the given context. The αβ rule, among others, tries to balance relevance between positive and negative contributions (Kohlbrenner et al., 2020). ...
Preprint
Full-text available
Deep Reinforcement Learning (DRL) is a frequently employed technique to solve scheduling problems. Although DRL agents ace at delivering viable results in short computing times, their reasoning remains opaque. We conduct a case study where we systematically apply two explainable AI (xAI) frameworks, namely SHAP (DeepSHAP) and Captum (Input x Gradient), to describe the reasoning behind scheduling decisions of a specialized DRL agent in a flow production. We find that methods in the xAI literature lack falsifiability and consistent terminology, do not adequately consider domain-knowledge, the target audience or real-world scenarios, and typically provide simple input-output explanations rather than causal interpretations. To resolve this issue, we introduce a hypotheses-based workflow. This approach enables us to inspect whether explanations align with domain knowledge and match the reward hypotheses of the agent. We furthermore tackle the challenge of communicating these insights to third parties by tailoring hypotheses to the target audience, which can serve as interpretations of the agent's behavior after verification. Our proposed workflow emphasizes the repeated verification of explanations and may be applicable to various DRL-based scheduling use cases.
... Figure 7 shows an example of a generated binary segmentation mask. As we employed a binary mask, the results of RM-A [9] are comparable to A.L. [30], which we present in Table 5. The relative robustness (RIS/ROS) results are tabulated in Table 6. ...
Preprint
Full-text available
Although input-gradients techniques have evolved to mitigate and tackle the challenges associated with gradients, modern gradient-weighted CAM approaches still rely on vanilla gradients, which are inherently susceptible to the saturation phenomena. While recent enhancements have incorporated counterfactual gradient strategies as a mitigating measure, these local explanation techniques still exhibit a lack of sensitivity to their baseline parameter. Our work proposes a gradient-weighted CAM augmentation that tackles both the saturation and sensitivity problem by reshaping the gradient computation, incorporating two well-established approaches with provable properties: Expected Gradients and kernel smoothing. By revisiting the original formulation as the smoothed expectation of the perturbed integrated gradients, one can concurrently construct more faithful, localized and robust explanations which minimize infidelity. Through fine modulation of the perturbation distribution it is possible to regulate the complexity characteristic of the explanation, selectively discriminating stable features. Our technique, Expected Grad-CAM, unlike recent works, exclusively optimizes the gradient computation, purposefully designed as an enhanced substitute of the foundational Grad-CAM algorithm and any method built therefrom. Quantitative and qualitative evaluations have been conducted to assess the effectiveness of our method.
... It has been shown that this approach not only yields measurably more representative attribution maps, but also provides a solution against gradient shattering affecting previous approaches, and improves properties related to object localization and class discrimination via attribution. Common among these works is the utilization of LRPε with ε ≪ 1 (or just LRPz) to decompose fully connected layers close to the model output, followed by an application of LRPαβ to the underlying convolutional layers [14]. We have used LRPz for the fully connected layers and LRPαβ for the underlying convolutional layers. ...
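The composite described in this excerpt (an ε/z-style rule for dense layers near the output and αβ for convolutions) can be written as a per-layer-type mapping; the sketch below assumes Zennit's LayerMapComposite and rule classes.

```python
import torch
from zennit.composites import LayerMapComposite
from zennit.rules import Epsilon, AlphaBeta

# LRP-epsilon (close to LRP-z for very small epsilon) for dense layers,
# LRP-alpha-beta with alpha=2, beta=1 for convolutional layers.
composite = LayerMapComposite(layer_map=[
    (torch.nn.Linear, Epsilon(epsilon=1e-6)),
    (torch.nn.Conv2d, AlphaBeta(alpha=2.0, beta=1.0)),
])
```

Such a composite can then be used with an attributor exactly as in the earlier composite example.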
Preprint
Full-text available
We present a new technique that explains the output of a CNN-based model using a combination of GradCAM and LRP methods. Both of these methods produce visual explanations by highlighting input regions that are important for predictions. In the new method, the explanation produced by GradCAM is first processed to remove noise. The processed output is then multiplied elementwise with the output of LRP. Finally, a Gaussian blur is applied to the product. We compared the proposed method with GradCAM and LRP on the metrics of Faithfulness, Robustness, Complexity, Localisation and Randomisation. It was observed that this method performs better on Complexity than both GradCAM and LRP and is better than at least one of them in the other metrics.
... Reportedly, this is because the α1β0 rule considers only the positive preactivations in the network ([14,17]). Recent studies have suggested LRP modifications that combine various LRP rules to produce both faithful and smooth results ([19]). As a second example, SHapley Additive exPlanations (SHAP; [20]), an attribution method based on game theory principles, computes relevance scores called Shapley values for input features, for a given prediction. ...
Preprint
Full-text available
The accelerated progress of artificial intelligence (AI) has popularized deep learning models across domains, yet their inherent opacity poses challenges, notably in critical fields like healthcare, medicine and the geosciences. Explainable AI (XAI) has emerged to shed light on these "black box" models, helping decipher their decision making process. Nevertheless, different XAI methods yield highly different explanations. This inter-method variability increases uncertainty and lowers trust in deep networks' predictions. In this study, for the first time, we propose a novel framework designed to enhance the explainability of deep networks, by maximizing both the accuracy and the comprehensibility of the explanations. Our framework integrates various explanations from established XAI methods and employs a non-linear "explanation optimizer" to construct a unique and optimal explanation. Through experiments on multi-class and binary classification tasks in 2D object and 3D neuroscience imaging, we validate the efficacy of our approach. Our explanation optimizer achieved superior faithfulness scores, averaging 155% and 63% higher than the best performing XAI method in the 3D and 2D applications, respectively. Additionally, our approach yielded lower complexity, increasing comprehensibility. Our results suggest that optimal explanations based on specific criteria are derivable and address the issue of inter-method variability in the current XAI literature.
... We use this method to explain the decision-making of the model on the classification output, specifically on the neuron that classifies whether the initial conditions are a precursor of a strong eastern Pacific El Niño event, such that the sum of the LRP relevance values corresponding to each of the input values equals the output of this neuron. We used the composite version of LRP 66 , which yields the most consistent results among the different versions 36 . The relevance score highlights spatial features that strengthen or weaken the model classification score. ...
Article
Full-text available
Global and regional impacts of El Niño-Southern Oscillation (ENSO) are sensitive to the details of the pattern of anomalous ocean warming and cooling, such as the contrasts between the eastern and central Pacific. However, skillful prediction of such ENSO diversity remains a challenge even a few months in advance. Here, we present an experimental forecast with a deep learning model (IGP-UHM AI model v1.0) for the E (eastern Pacific) and C (central Pacific) ENSO diversity indices, specialized on the onset of strong eastern Pacific El Niño events by including a classification output. We find that higher ENSO nonlinearity is associated with better skill, with potential implications for ENSO predictability in a warming climate. When initialized in May 2023, our model predicts the persistence of El Niño conditions in the eastern Pacific into 2024, but with decreasing strength, similar to 2015–2016 but much weaker than 1997–1998. In contrast to the more typical El Niño development in 1997 and 2015, in addition to the ongoing eastern Pacific warming, an eXplainable Artificial Intelligence analysis for 2023 identifies weak warm surface, increased sea level and westerly wind anomalies in the western Pacific as precursors, countered by warm surface and southerly wind anomalies in the northern Atlantic.
... Note that the sum is not particularly necessary in equation (7), but serves as a means to compare all possible c_l for identity to the current j. In practice, CRP can be implemented efficiently as a single backpropagation step by binary masking of relevance tensors, and is compatible with the recommended rule composites for relevance backpropagation 49,50. We provide an efficient implementation of CRP based on Zennit 51 at https://github.com/rachtibat/zennit-crp. ...
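To give an intuition for the masking step described above, the sketch below restricts the backward signal at a chosen layer to a single channel. It uses plain gradients for brevity; CRP itself applies this conditioning inside an LRP backward pass (e.g., via Zennit rule composites), so this is an assumption-laden illustration rather than the zennit-crp implementation.

```python
import torch

def concept_conditioned_heatmap(model, layer, x, class_idx, channel):
    """Backpropagate the class score while keeping only one channel's signal at `layer`."""
    x = x.clone().requires_grad_(True)

    def forward_hook(module, inputs, output):
        def mask_grad(grad):
            mask = torch.zeros_like(grad)
            mask[:, channel] = 1.0              # binary mask keeping the chosen concept channel
            return grad * mask
        output.register_hook(mask_grad)

    handle = layer.register_forward_hook(forward_hook)
    model(x)[:, class_idx].sum().backward()
    handle.remove()
    return (x * x.grad).sum(dim=1)              # simple input-times-gradient heatmap
```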
Article
Full-text available
The field of explainable artificial intelligence (XAI) aims to bring transparency to today’s powerful but opaque deep learning models. While local XAI methods explain individual predictions in the form of attribution maps, thereby identifying ‘where’ important features occur (but not providing information about ‘what’ they represent), global explanation techniques visualize what concepts a model has generally learned to encode. Both types of method thus provide only partial insights and leave the burden of interpreting the model’s reasoning to the user. Here we introduce the Concept Relevance Propagation (CRP) approach, which combines the local and global perspectives and thus allows answering both the ‘where’ and ‘what’ questions for individual predictions. We demonstrate the capability of our method in various settings, showcasing that CRP leads to more human interpretable explanations and provides deep insights into the model’s representation and reasoning through concept atlases, concept-composition analyses, and quantitative investigations of concept subspaces and their role in fine-grained decision-making.
... Some studies use a heatmap approach [10,12,17] to explain the behavior learned by the model in image classification by illustrating the areas where the model focused to make its prediction. However, these methods require expert knowledge to understand the learned behavior and are not applicable to tabular data, which is the scope of this work. ...
Chapter
This paper proposes a novel approach that combines an association rule algorithm with a deep learning model to enhance the interpretability of prediction outcomes. The study aims to gain insights into the patterns that were learned correctly or incorrectly by the model. To identify these scenarios, an association rule algorithm is applied to extract the patterns learned by the deep learning model. The rules are then analyzed and classified based on specific metrics to draw conclusions about the behavior of the model. We applied this approach to a well-known dataset in various scenarios, such as underfitting and overfitting. The results demonstrate that the combination of the two techniques is highly effective in identifying the patterns learned by the model and analyzing its performance in different scenarios, through error analysis. We suggest that this methodology can enhance the transparency and interpretability of black-box models, thus improving their reliability for real-world applications. Keywords: association rules, Apriori, deep learning, interpretability, explainable AI
... All of the above rules can be applied in a layer-wise fashion and combined into composites [35], meaning that different rules are applied to different layers or types of layers. A description of the composites utilized in the following experiments can be found in Appendix A.6.2. ...
Preprint
Full-text available
In this paper, we present Layer-wise Feedback Propagation (LFP), a novel training approach for neural-network-like predictors that utilizes explainability, specifically Layer-wise Relevance Propagation (LRP), to assign rewards to individual connections based on their respective contributions to solving a given task. This differs from traditional gradient descent, which updates parameters towards an estimated loss minimum. LFP distributes a reward signal throughout the model without the need for gradient computations. It then strengthens structures that receive positive feedback while reducing the influence of structures that receive negative feedback. We establish the convergence of LFP theoretically and empirically, and demonstrate its effectiveness in achieving comparable performance to gradient descent on various models and datasets. Notably, LFP overcomes certain limitations associated with gradient-based methods, such as reliance on meaningful derivatives. We further investigate how the different LRP-rules can be extended to LFP, what their effects are on training, as well as potential applications, such as training models with no meaningful derivatives, e.g., step-function activated Spiking Neural Networks (SNNs), or for transfer learning, to efficiently utilize existing knowledge.
... Moreover, some rules are specifically designed for the input layer (Montavon et al. 2017). Due to the rule independence of how the lower-layer relevances are computed from the relevance messages in Equation 2, the rules can also be set individually for each layer, called a composite rule (Kohlbrenner et al. 2020). ...
Preprint
Full-text available
The R package innsight offers a general toolbox for revealing variable-wise interpretations of deep neural networks' predictions with so-called feature attribution methods. Aside from the unified and user-friendly framework, the package stands out in three ways: It is generally the first R package implementing feature attribution methods for neural networks. Secondly, it operates independently of the deep learning library allowing the interpretation of models from any R package, including keras, torch, neuralnet, and even custom models. Despite its flexibility, innsight benefits internally from the torch package's fast and efficient array calculations, which builds on LibTorch, PyTorch's C++ backend, without a Python dependency. Finally, it offers a variety of visualization tools for tabular, signal, image data or a combination of these. Additionally, the plots can be rendered interactively using the plotly package.
... In future work, we will consider applying our framework to the explanation task based on the technique of layer-wise relevance propagation [89] in graph neural networks, following in the spirit of [90,91]. ...
Article
Full-text available
Considering the worst-case scenario, the junction-tree algorithm remains the most general solution for exact MAP inference with polynomial run-time guarantees. Unfortunately, its main tractability assumption requires the treewidth of a corresponding MRF to be bounded, strongly limiting the range of admissible applications. In fact, many practical problems in the area of structured prediction require modeling global dependencies by either directly introducing global factors or enforcing global constraints on the prediction variables. However, this always results in a fully-connected graph, making exact inferences by means of this algorithm intractable. Previous works focusing on the problem of loss-augmented inference have demonstrated how efficient inference can be performed on models with specific global factors representing non-decomposable loss functions within the training regime of SSVMs. Making the observation that the same fundamental idea can be applied to solve a broader class of computational problems, in this paper, we adjust the framework for efficient exact inference proposed in previous work to allow much finer interactions between the energy of the core model and the sufficient statistics of the global terms. As a result, we greatly increase the range of admissible applications and strongly improve upon the theoretical guarantees of computational efficiency. We illustrate the applicability of our method in several use cases, including one that is not covered by the previous problem formulation. Furthermore, we propose a new graph transformation technique via node cloning, which ensures a polynomial run-time for solving our target problem. In particular, the overall computational complexity of our constrained message-passing algorithm depends only on form-independent quantities such as the treewidth of a corresponding graph (without global connections) and image size of the sufficient statistics of the global terms.
... In this case, DTD indeed ensures that no relevance gets attributed to suppressor features in a linear setting. Notably, it has also been shown that in more complex learning scenarios and depending on root points, DTD can generally yield almost any explanation (Kohlbrenner et al., 2020; Montavon et al., 2018; Sixt & Landgraf, 2022). ...
Preprint
Full-text available
In recent years, the community of 'explainable artificial intelligence' (XAI) has created a vast body of methods to bridge a perceived gap between model 'complexity' and 'interpretability'. However, a concrete problem to be solved by XAI methods has not yet been formally stated. As a result, XAI methods are lacking theoretical and empirical evidence for the 'correctness' of their explanations, limiting their potential use for quality-control and transparency purposes. At the same time, Haufe et al. (2014) showed, using simple toy examples, that even standard interpretations of linear models can be highly misleading. Specifically, high importance may be attributed to so-called suppressor variables lacking any statistical relation to the prediction target. This behavior has been confirmed empirically for a large array of XAI methods in Wilming et al. (2022). Here, we go one step further by deriving analytical expressions for the behavior of a variety of popular XAI methods on a simple two-dimensional binary classification problem involving Gaussian class-conditional distributions. We show that the majority of the studied approaches will attribute non-zero importance to a non-class-related suppressor feature in the presence of correlated noise. This poses important limitations on the interpretations and conclusions that the outputs of these XAI methods can afford.
... For instance, in an indirect manner, differentiation between classes is essential for classification or detection [3]. On the other hand, directly identifying the differences between classes is of particular relevance in the field of Explainable AI (XAI) [25,17,5] as it can be used to explain the results of neural network algorithms. When labeled information that clearly defines the distinction between classes is available, the task becomes relatively straightforward and has been extensively studied [36,12,28,10]. ...
Preprint
Full-text available
We present a new model, training procedure and architecture to create precise maps of distinction between two classes of images. The objective is to comprehend, in pixel-wise resolution, the unique characteristics of a class. These maps can facilitate self-supervised segmentation and object detection in addition to new capabilities in explainable AI (XAI). Our proposed architecture is based on image decomposition, where the output is the sum of multiple generative networks (branched-GANs). The distinction between classes is isolated in a dedicated branch. This approach allows clear, precise and interpretable visualization of the unique characteristics of each class. We show how our generic method can be used in several modalities for various tasks, such as MRI brain tumor extraction, isolating cars in aerial photography and obtaining feminine and masculine face features. This is a preliminary report of our initial findings and results.
Preprint
Neurodegenerative diseases such as Alzheimer's disease (AD) or frontotemporal lobar degeneration (FTLD) involve specific loss of brain volume, detectable in vivo using T1-weighted MRI scans. Supervised machine learning approaches classifying neurodegenerative diseases require diagnostic labels for each sample. However, it can be difficult to obtain expert labels for a large amount of data. Self-supervised learning (SSL) offers an alternative for training machine learning models without data labels. We investigated whether SSL models can be applied to distinguish between different neurodegenerative disorders in an interpretable manner. Our method comprises a feature extractor and a downstream classification head. A deep convolutional neural network trained in a contrastive self-supervised way serves as the feature extractor, learning a latent representation, while the classifier head is a single-layer perceptron. We used N=2694 T1-weighted MRI scans from four data cohorts: two ADNI datasets, AIBL and FTLDNI, including cognitively normal controls (CN), cases with prodromal and clinical AD, as well as FTLD cases differentiated into its sub-types. Our results showed that the feature extractor trained in a self-supervised way provides generalizable and robust representations for the downstream classification. For AD vs. CN, our model achieves 82% balanced accuracy on the test subset and 80% on an independent holdout dataset. Similarly, the behavioral variant of frontotemporal dementia (BV) vs. CN model attains an 88% balanced accuracy on the test subset. The average feature attribution heatmaps obtained by the Integrated Gradient method highlighted hallmark regions, i.e., temporal gray matter atrophy for AD, and insular atrophy for BV. In conclusion, our models perform comparably to state-of-the-art supervised deep learning approaches. This suggests that the SSL methodology can successfully make use of unannotated neuroimaging datasets as training data while remaining robust and interpretable.
Article
Full-text available
Of great relevance to climate engineering is the systematic relationship between the radiative forcing to the climate system and the response of the system, a relationship often represented by the linear response function (LRF) of the system. However, estimating the LRF often becomes an ill-posed inverse problem due to high-dimensionality and nonunique relationships between the forcing and response. Recent advances in machine learning make it possible to address the ill-posed inverse problem through regularization and sparse system fitting. Here, we develop a convolutional neural network (CNN) for regularized inversion. The CNN is trained using the surface temperature responses from a set of Green's function perturbation experiments as imagery input data together with data sample densification. The resulting CNN model can infer the forcing pattern responsible for the temperature response from out-of-sample forcing scenarios. This promising proof of concept suggests a possible strategy for estimating the optimal forcing to negate certain undesirable effects of climate change. The limited success of this effort underscores the challenges of solving an inverse problem for a climate system with inherent nonlinearity. Significance Statement: Predicting the climate response for a given climate forcing is a direct problem, while inferring the forcing for a given desired climate response is often an inverse, ill-posed problem, posing a new challenge to the climate community. This study makes the first attempt to infer the radiative forcing for a given target pattern of global surface temperature response using a deep learning approach. The resulting deeply trained convolutional neural network inversion model shows promise in capturing the forcing pattern corresponding to a given surface temperature response, with a significant implication on the design of an optimal solar radiation management strategy for curbing global warming. This study also highlights the technical challenges that future research should prioritize in seeking feasible solutions to the inverse climate problem.
Article
Deep neural networks have shown remarkable effectiveness in SAR target recognition. However, the explainability problem for deep neural networks remains insufficiently addressed. One approach to tackle this challenge is the SHAP method. It enhances the explainability of deep neural networks in SAR target recognition by observing how the target, shadow, and clutter regions play their own distinct roles. The masked regions are typically filled with Zero, Mean, or Random values in optical images. But if the same operation is performed on SAR images, it will affect the distribution of clutter, thus introducing a new out-of-distribution challenge. In this paper, we propose a novel masking method to enhance the reliability and efficiency of the SHAP method in SAR-ATR applications. Experimental results on the MSTAR and OpenSARShip-1.0 datasets demonstrate that our proposed method provides a more faithful representation to show the importance of every single region in SAR target recognition. Compared to methods using Zero values, Mean values, and Random baselines, our proposed method significantly enhances the reliability of explainability.
Article
Full-text available
Seasons are known to have a major influence on groundwater recharge and therefore groundwater levels; however, underlying relationships are complex and partly unknown. The goal of this study is to investigate the influence of the seasons on groundwater levels (GWLs), especially during low-water periods. For this purpose, we train artificial neural networks on data from 24 locations spread throughout Germany. We exclusively focus on precipitation and temperature as input data and apply layer-wise relevance propagation to understand the relationships learned by the models to simulate GWLs. We find that the learned relationships are plausible and thus consistent with our understanding of the major physical processes. Our results show that for the investigated locations, the models learn that summer is the key season for periods of low GWLs in fall, with a connection to the preceding winter usually only being subordinate. Specifically, dry summers exhibit a strong influence on low-water periods and generate a water deficit that (preceding) wet winters cannot compensate for. Temperature is thus an important proxy for evapotranspiration in summer and is generally identified as more important than precipitation, albeit only on average. Single precipitation events show by far the largest influences on GWLs, and summer precipitation seems to mainly control the severeness of low-GWL periods in fall, while higher summer temperatures do not systematically cause more severe low-water periods.
Article
Recent advancements in deep neural network performance have led to the development of new state-of-the-art approaches in numerous areas. However, the black-box nature of neural networks often prohibits their use in areas where model explainability and model transparency are crucial. Over the years, researchers proposed many algorithms to aid neural network understanding and provide additional information to the human expert. One of the most popular methods is Layer-Wise Relevance Propagation (LRP). This method assigns local relevance based on the pixel-wise decomposition of nonlinear classifiers. With the rise of attribution method research, there has emerged a pressing need to assess and evaluate their performance. Numerous metrics have been proposed, each assessing an individual property of attribution methods such as faithfulness, robustness or localization. Unfortunately, no single metric is deemed optimal for every case, and researchers often use several metrics to test the quality of the attribution maps. In this work, we address the shortcomings of the current LRP formulations and introduce a novel method for determining the relevance of input neurons through layer-wise relevance propagation. Furthermore, we apply this approach to the recently developed Vision Transformer architecture and evaluate its performance against existing methods on two image classification datasets, namely ImageNet and PascalVOC. Our results clearly demonstrate the advantage of our proposed method. Furthermore, we discuss the insufficiencies of current evaluation metrics for attribution-based explainability and propose a new evaluation metric that combines the notions of faithfulness, robustness and contrastiveness. We utilize this new metric to evaluate the performance of various attribution-based methods. Our code is available at: https://github.com/davor10105/relative-absolute-magnitude-propagation
Chapter
The increasing complexity of machine learning models used in environmental studies necessitates robust tools for transparency and interpretability. This paper systematically explores the transformative potential of Explainable Artificial Intelligence (XAI) techniques within the field of air quality research. A range of XAI methodologies, including Permutation Feature Importance (PFI), Partial Dependence Plot (PDP), SHapley Additive exPlanations (SHAP), and Local Interpretable Model-Agnostic Explanations (LIME), have been effectively investigated to achieve robust, comprehensible outcomes in modeling air pollutant concentrations worldwide. The integration of advanced feature engineering, visual analytics, and methodologies like DeepLIFT and Layer-Wise Relevance Propagation further enhance the interpretability and reliability of deep learning models. Despite these advancements, a significant proportion of air quality research still overlooks the implementation of XAI techniques, resulting in biases and redundancies within datasets. This review highlights the pivotal role of XAI techniques in facing these challenges, thus promoting precision, transparency, and trust in complex models. Furthermore, it underscores the necessity for a continued commitment to the integration and development of XAI techniques, pushing the boundaries of our understanding and usability of Artificial Intelligence in environmental science. The comprehensive insights offered by XAI can significantly aid in decision-making processes and lead to transformative strides within the fields of Internet of Things and air quality research.
Chapter
The increasing use of AI in modern smart cities calls for explainable artificial intelligence (XAI) systems that can improve the efficiency and effectiveness of city operations while being transparent, interpretable, and trustworthy. Developing a unified framework for XAI that can handle the heterogeneity of data and systems in smart cities is the first challenge, considering the need to incorporate human factors and preferences in AI systems. The second challenge is developing new XAI methods that can handle the complexity and scale of smart city data. Addressing ethical and legal aspects is also critical, including ensuring that AI systems are fair and unbiased, protecting citizens' privacy and security, and establishing legal frameworks. Evaluating the effectiveness and usability of XAI systems is also crucial for improving city operations and stakeholder trust. Beyond these challenges, XAI research for smart cities should pursue improved visualization, human feedback, and integration.
Chapter
This chapter provides an in-depth examination of the current use of artificial intelligence (AI) in military training applications, with a specific focus on the importance of explainability in these systems. The chapter begins by introducing the concept of AI in military training and discussing the challenges that come with building complex and efficient systems that can explain their decision-making processes. The chapter emphasizes the significance of explainability in military training applications, explaining how it enhances trust, transparency, and accountability. Furthermore, the chapter discusses the use of explainable AI in military simulations and presents a case study that demonstrates how it can be used to improve military training simulations and enhance decision-making in real-life scenarios.
Chapter
As we enter the era of Industrial Revolution 5.0 (IR 5.0), the role of artificial intelligence (AI) in various domains such as manufacturing, military, healthcare, education, and entertainment is becoming increasingly vital. However, the growing complexity and opacity of AI systems have led to a problem known as the “black box,” which hinders trust and accountability. This is where explainable AI (XAI) comes in, providing a set of processes and methods that enable human users to understand and trust the results and output produced by machine learning algorithms. By describing AI models, their expected impact, and potential biases, XAI helps ensure accuracy, fairness, transparency, and accountability in AI-powered decision making. In this chapter, the authors argue that XAI is indispensable for IR 5.0, as it enables humans to collaborate with AI systems effectively and responsibly. The authors reviewed the current state of XAI research and practice and highlighted the challenges and opportunities for XAI in IR 5.0.
Article
With the power of parallel processing, large datasets, and fast computational resources, deep neural networks (DNNs) have outperformed highly trained and experienced human experts in medical applications. However, the large global community of healthcare professionals, many of whom routinely face potentially life-or-death outcomes with complex medicolegal consequences, have yet to embrace this powerful technology. The major problem is that most current AI solutions function as a metaphorical black-box positioned between input data and output decisions without a rigorous explanation for their internal processes. With the goal of enhancing trust and improving acceptance of AI-based technology in clinical medicine, there is a large and growing effort to address this challenge using eXplainable AI (XAI), a set of techniques, strategies, and algorithms with an explicit focus on explaining the “hows and whys” of DNNs. Here, we provide a comprehensive review of the state-of-the-art XAI techniques concerning healthcare applications and discuss current challenges and future directions. We emphasize the strengths and limitations of each category, including image, tabular, and textual explanations, and explore a range of evaluation metrics for assessing the effectiveness of XAI solutions. Finally, we highlight promising opportunities for XAI research to enhance the acceptance of DNNs by the healthcare community.
Article
Full-text available
In recent years, the use of deep learning methods has rapidly increased in many research fields. Similarly, they have become a powerful tool within the climate scientific community. Deep learning methods have been successfully applied to different tasks, such as the identification of atmospheric patterns, weather extreme classification, or weather forecasting. However, due to the inherent complexity of atmospheric processes, the ability of deep learning models to simulate natural processes, particularly in the case of weather extremes, remains challenging. Therefore, a thorough evaluation of their performance and robustness in predicting precipitation fields is still needed, especially for extreme precipitation events, which can have devastating consequences in terms of infrastructure damage, economic losses, and even loss of life. In this study, we present a comprehensive evaluation of a set of deep learning architectures to simulate precipitation, including heavy precipitation events (>95th percentile) and extreme events (>99th percentile) over the European domain. Among the architectures analyzed here, the U‐Net was found to be superior and outperformed the other networks in simulating precipitation events. In particular, we found that a simplified version of the original U‐Net with two encoder‐decoder levels generally achieved skill scores similar to deeper versions for predicting precipitation extremes, while significantly reducing the overall complexity and computing resources. We further assess how the model arrives at its predictions using attribution heatmaps from the Layer‐wise Relevance Propagation (LRP) explainability method.
Chapter
Pattern recognition systems implemented using deep neural networks achieve better results than linear models, but their drawback is the black-box property: a user with no experience of nonlinear systems may struggle to understand how a decision was reached. Such a solution is unacceptable to the user responsible for the final decision, who must not only believe the decision but also understand it. Recognisers must therefore have an architecture whose findings can be interpreted. The idea of post-hoc explainable classifiers is to design an interpretable classifier in parallel to the black-box classifier that yields the same decisions. This paper shows that the explainable classifier matches the classification decisions of the black-box classifier on the MNIST and FashionMNIST databases when Zadeh’s fuzzy logic function forms the classifier and DeconvNet importance supplies the truth values. Since the other tested importance measures achieved lower performance than DeconvNet, it is the optimal transformation of feature values to truth values as inputs to the fuzzy logic function for the databases and recogniser architecture used. Keywords: Explainable classification, Deep neural networks, Fuzzy logic functions, Feature importance, Post-hoc explanation
Article
Full-text available
Deep learning has recently gained popularity in digital pathology due to its high prediction quality. However, the medical domain requires explanation and insight for a better understanding beyond standard quantitative performance evaluation. Recently, many explanation methods have emerged. This work shows how heatmaps generated by these explanation methods help resolve common challenges encountered in deep learning-based digital histopathology analyses. We elaborate on biases which are typically inherent in histopathological image data. In the binary classification task of tumour tissue discrimination in publicly available haematoxylin-eosin-stained images of various tumour entities, we investigate three types of biases: (1) biases which affect the entire dataset, (2) biases which are by chance correlated with class labels and (3) sampling biases. While standard analyses focus on patch-level evaluation, we advocate pixel-wise heatmaps, which offer a more precise and versatile diagnostic instrument. Heatmaps are shown not only to help detect but also to remove the effects of common hidden biases, which improves generalisation within and across datasets. For example, we observed a trend towards a 5% improvement in the area under the receiver operating characteristic (ROC) curve when reducing a labelling bias. Explanation techniques are thus demonstrated to be a helpful and highly relevant tool for the development and deployment phases within the life cycle of real-world applications in digital pathology.
Article
Full-text available
The application of deep learning (DL) models to neuroimaging data poses several challenges, due to the high dimensionality, low sample size, and complex temporo-spatial dependency structure of these data. Even further, DL models often act as black boxes, impeding insight into the association of cognitive state and brain activity. To approach these challenges, we introduce the DeepLight framework, which utilizes long short-term memory (LSTM) based DL models to analyze whole-brain functional Magnetic Resonance Imaging (fMRI) data. To decode a cognitive state (e.g., seeing the image of a house), DeepLight separates an fMRI volume into a sequence of axial brain slices, which is then sequentially processed by an LSTM. To maintain interpretability, DeepLight adapts the layer-wise relevance propagation (LRP) technique, thereby decomposing its decoding decision into the contributions of the single input voxels to this decision. Importantly, the decomposition is performed on the level of single fMRI volumes, enabling DeepLight to study the associations between cognitive state and brain activity on several levels of data granularity, from the level of the group down to the level of single time points. To demonstrate the versatility of DeepLight, we apply it to a large fMRI dataset of the Human Connectome Project. We show that DeepLight outperforms conventional approaches of uni- and multivariate fMRI analysis in decoding the cognitive states and in identifying the physiologically appropriate brain regions associated with these states. We further demonstrate DeepLight's ability to study the fine-grained temporo-spatial variability of brain activity over sequences of single fMRI samples.
Chapter
Full-text available
For a machine learning model to generalize well, one needs to ensure that its decisions are supported by meaningful patterns in the input data. A prerequisite is however for the model to be able to explain itself, e.g. by highlighting which input features it uses to support its prediction. Layer-wise Relevance Propagation (LRP) is a technique that brings such explainability and scales to potentially highly complex deep neural networks. It operates by propagating the prediction backward in the neural network, using a set of purposely designed propagation rules. In this chapter, we give a concise introduction to LRP with a discussion of (1) how to implement propagation rules easily and efficiently, (2) how the propagation procedure can be theoretically justified as a ‘deep Taylor decomposition’, (3) how to choose the propagation rules at each layer to deliver high explanation quality, and (4) how LRP can be extended to handle a variety of machine learning scenarios beyond deep neural networks.
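To make the propagation idea concrete, the following minimal numpy sketch shows a single LRP backward step through one fully connected layer using the ε-rule; the function name, the layer shapes and the epsilon value are illustrative assumptions, not the chapter's reference implementation.

```python
import numpy as np

def lrp_epsilon_dense(a, W, b, R_out, eps=1e-6):
    """One LRP backward step through a dense layer z = W a + b (illustrative sketch).

    a     : (in,)  activations entering the layer
    W, b  : weights of shape (out, in) and biases of shape (out,)
    R_out : (out,) relevance arriving at the layer output
    Returns the relevance (in,) redistributed onto the layer input.
    """
    z = W @ a + b                                   # pre-activations of the forward pass
    z = z + eps * np.where(z >= 0, 1.0, -1.0)       # epsilon stabiliser keeps denominators away from zero
    s = R_out / z                                   # relevance per unit of pre-activation
    return a * (W.T @ s)                            # redistribute proportionally to each input's contribution
```

In practice, as the chapter discusses, different rules (for example LRP-γ or LRP-αβ in lower layers) are composed per layer type rather than applying a single rule uniformly.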
Chapter
Full-text available
A number of backpropagation-based approaches such as DeConvNets, vanilla Gradient Visualization and Guided Backpropagation have been proposed to better understand individual decisions of deep convolutional neural networks. The saliency maps produced by them are proven to be non-discriminative. Recently, the Layer-wise Relevance Propagation (LRP) approach was proposed to explain the classification decisions of rectifier neural networks. In this work, we evaluate the discriminativeness of the generated explanations and analyze the theoretical foundation of LRP, i.e. Deep Taylor Decomposition. The experiments and analysis conclude that the explanations generated by LRP are not class-discriminative. Based on LRP, we propose Contrastive Layer-wise Relevance Propagation (CLRP), which is capable of producing instance-specific, class-discriminative, pixel-wise explanations. In the experiments, we use the CLRP to explain the decisions and understand the difference between neurons in individual classification decisions. We also evaluate the explanations quantitatively with a Pointing Game and an ablation study. Both qualitative and quantitative evaluations show that the CLRP generates better explanations than the LRP.
Article
Full-text available
Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.
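The clustering step of such a relevance analysis can be sketched roughly as follows, assuming per-sample relevance heatmaps have already been computed with an attribution method; the use of scikit-learn's SpectralClustering and all parameter values are illustrative assumptions rather than the authors' exact pipeline.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_heatmaps(heatmaps, n_clusters=4):
    """Group per-sample relevance heatmaps to surface recurring prediction strategies (sketch).

    heatmaps : array of shape (n_samples, H, W), e.g. relevance maps for one class.
    Returns one cluster label per sample; small or unusual clusters often point
    to 'Clever Hans' strategies such as watermarks or copyright tags.
    """
    X = heatmaps.reshape(len(heatmaps), -1)                        # flatten each heatmap to a vector
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)     # normalise per sample
    model = SpectralClustering(n_clusters=n_clusters,
                               affinity="nearest_neighbors",
                               n_neighbors=10,
                               assign_labels="kmeans",
                               random_state=0)
    return model.fit_predict(X)
```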
Article
Full-text available
Machine learning (ML) techniques such as (deep) artificial neural networks (DNN) are very successfully solving a plethora of tasks and provide new predictive models for complex physical, chemical, biological and social systems. However, in most cases this comes with the disadvantage of acting as a black box, rarely providing information about what made them arrive at a particular prediction. This black-box aspect of ML techniques can be problematic especially in medical diagnoses, so far hampering clinical acceptance. The present paper studies the uniqueness of individual gait patterns in clinical biomechanics using DNNs. By attributing portions of the model predictions back to the input variables (ground reaction forces and full-body joint angles), the Layer-Wise Relevance Propagation (LRP) technique reliably demonstrates which variables at what time windows of the gait cycle are most relevant for the characterisation of gait patterns from a certain individual. By measuring the time-resolved contribution of each input variable to the prediction of ML techniques such as DNNs, our method describes the first general framework for understanding and interpreting non-linear ML methods in (biomechanical) gait analysis and thereby supplies a powerful tool for analysis, diagnosis and treatment of human gait.
Article
Full-text available
We aim to model the top-down attention of a convolutional neural network (CNN) classifier for generating task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, to pass along top-down signals downwards in the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. We show a theoretic connection between the proposed contrastive attention formulation and the Class Activation Map computation. Efficient implementation of Excitation Backprop for common neural network layers is also presented. In experiments, we visualize the evidence of a model’s classification decision by computing the proposed top-down attention maps. For quantitative evaluation, we report the accuracy of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images. Finally, we demonstrate applications of our method in model interpretation and data annotation assistance for facial expression analysis and medical imaging tasks.
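For a single fully connected layer, the probabilistic Winner-Take-All backward step can be sketched as below; the function name and shapes are assumptions for illustration, and only the core rule (excitatory, i.e. positive, weights redistribute the output probabilities) is shown.

```python
import numpy as np

def excitation_backprop_dense(a, W, P_out):
    """One Excitation Backprop step through a dense layer (sketch).

    a     : (in,)  non-negative input activations
    W     : (out, in) layer weights; only their positive parts participate
    P_out : (out,) marginal winning probabilities at the layer output
    Returns the (in,) marginal winning probabilities at the layer input.
    """
    Wp = np.maximum(W, 0.0)                 # keep excitatory connections only
    z = Wp @ a + 1e-12                      # normalisation constant per output neuron
    return a * (Wp.T @ (P_out / z))         # each output's probability is shared among its inputs
```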
Conference Paper
Full-text available
Recently, deep neural networks have demonstrated excellent performances in recognizing the age and gender on human face images. However, these models were applied in a black-box manner with no information provided about which facial features are actually used for prediction and how these features depend on image preprocessing, model initialization and architecture choice. We present a study investigating these different effects. In detail, our work compares four popular neural network architectures, studies the effect of pretraining, evaluates the robustness of the considered alignment preprocessings via cross-method test set swapping and intuitively visualizes the model's prediction strategies in given preprocessing conditions using the recent Layer-wise Relevance Propagation (LRP) algorithm. Our evaluations on the challenging Adience benchmark show that suitable parameter initialization leads to a holistic perception of the input, compensating artefactual data representations. With a combination of simple preprocessing steps, we reach state of the art performance in gender recognition.
Article
Full-text available
DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, the linear models. Based on our analysis of linear models we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
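For the linear case discussed above, the signal direction ("pattern") can be estimated directly from data; the sketch below assumes a single linear neuron y = wᵀx and uses a covariance-based estimator, with variable names chosen for illustration.

```python
import numpy as np

def linear_pattern(X, w):
    """Estimate the signal direction ('pattern') for one linear neuron y = X @ w (sketch).

    X : (n_samples, n_features) input data
    w : (n_features,) learned filter of the neuron
    Returns a pattern a with w @ a close to 1; in general the filter w and the pattern a differ.
    """
    y = X @ w
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    cov_xy = Xc.T @ yc / len(y)                    # covariance of each input dimension with the output
    return cov_xy / (yc @ yc / len(y) + 1e-12)     # divide by the output variance
```

PatternAttribution then, roughly speaking, replaces the filter weights by the element-wise product of filter and pattern before running an LRP-style backward pass.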
Article
Full-text available
A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. The problem has largely been overcome through the introduction of carefully constructed initializations and batch normalization. Nevertheless, architectures incorporating skip-connections such as resnets perform much better than standard feedforward architectures despite well-chosen initialization and batch normalization. In this paper, we identify the shattered gradients problem. Specifically, we show that the correlation between gradients in standard feedforward networks decays exponentially with depth, resulting in gradients that resemble white noise. In contrast, the gradients in architectures with skip-connections are far more resistant to shattering, decaying only sublinearly. Detailed empirical evidence is presented in support of the analysis, on both fully-connected networks and convnets. Finally, we present a new "looks linear" (LL) initialization that prevents shattering. Preliminary experiments show the new initialization allows training very deep networks without the addition of skip-connections.
Conference Paper
Full-text available
Fisher vector (FV) classifiers and Deep Neural Networks (DNNs) are popular and successful algorithms for solving image classification problems. However, both are generally considered 'black box' predictors as the non-linear transformations involved have so far prevented transparent and interpretable reasoning. Recently, a principled technique, Layer-wise Relevance Propagation (LRP), has been developed in order to better comprehend the inherent structured reasoning of complex nonlinear classification models such as Bag of Feature models or DNNs. In this paper we (1) extend the LRP framework also for Fisher vector classifiers and then use it as analysis tool to (2) quantify the importance of context for classification, (3) qualitatively compare DNNs against FV classifiers in terms of important image regions and (4) detect potential flaws and biases in data. All experiments are performed on the PASCAL VOC 2007 and ILSVRC 2012 data sets.
Article
Full-text available
The Layer-wise Relevance Propagation (LRP) algorithm explains a classifier's prediction specific to a given data point by attributing relevance scores to important components of the input, using the topology of the learned model itself. With the LRP Toolbox we provide platform-agnostic implementations for explaining the predictions of pre-trained state-of-the-art Caffe networks and stand-alone implementations for fully connected neural network models. The implementations for Matlab and Python shall serve as a playing field to familiarize oneself with the LRP algorithm and are implemented with readability and transparency in mind. Models and data can be imported and exported using raw text formats, Matlab's .mat files, the .npy format for numpy, or plain text.
Article
Full-text available
Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
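A simplified version of this region-perturbation protocol can be sketched as follows; the patch size, the most-relevant-first ordering, the uniform-noise replacement, and the assumption that the image sides are divisible by the patch size are all illustrative choices.

```python
import numpy as np

def perturbation_scores(model_predict, image, heatmap, patch=8, steps=50, seed=0):
    """Track the target-class score while destroying the most relevant patches first (sketch).

    model_predict : callable mapping an (H, W, C) image to the target-class score
    image, heatmap: the image and its relevance map (heatmap of shape (H, W))
    A faster score drop indicates a more faithful heatmap (cf. the AOPC measure).
    """
    img = image.copy()
    H, W = heatmap.shape
    patch_rel = heatmap.reshape(H // patch, patch, W // patch, patch).sum(axis=(1, 3))
    order = np.argsort(patch_rel, axis=None)[::-1]            # most relevant patch first
    rng = np.random.default_rng(seed)
    scores = [model_predict(img)]
    for flat_idx in order[:steps]:
        i, j = np.unravel_index(flat_idx, patch_rel.shape)
        ys, xs = i * patch, j * patch
        region = img[ys:ys + patch, xs:xs + patch]
        img[ys:ys + patch, xs:xs + patch] = rng.uniform(image.min(), image.max(), region.shape)
        scores.append(model_predict(img))
    return np.array(scores)
```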
Conference Paper
Full-text available
We present an application of the Layer-wise Relevance Propagation (LRP) algorithm to state of the art deep convolutional neural networks and Fisher Vector classifiers to compare the image perception and prediction strategies of both classifiers with the use of visualized heatmaps. Layer-wise Relevance Propagation (LRP) is a method to compute scores for individual components of an input image, denoting their contribution to the prediction of the classifier for one particular test point. We demonstrate the impact of different choices of decomposition cut-off points during the LRP-process, controlling the resolution and semantics of the heatmap on test images from the PASCAL VOC 2007 test data set.
Article
Full-text available
Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems, e.g., image classification, natural language processing or human action recognition. Although these methods perform impressively well, they have a significant disadvantage, the lack of transparency, limiting the interpretability of the solution and thus the scope of application in practice. Especially DNNs act as black boxes due to their multilayer nonlinear structure. In this paper we introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network's classification decision into contributions of its input elements. Although our focus is on image classification, the method is applicable to any type of input data, learning task and network architecture. Our method is based on deep Taylor decomposition and efficiently utilizes the structure of the network by backpropagating the explanations from the output to the input layer. We evaluate the proposed method empirically on the MNIST and ILSVRC data sets.
Article
Full-text available
Understanding and interpreting classification decisions of automated image classification systems is of high value in many applications, as it allows to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods are solving a plethora of tasks very successfully, they have in most cases the disadvantage of acting as a black box, not providing any information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that allows to visualize the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks. These pixel contributions can be visualized as heatmaps and are provided to a human expert who can intuitively not only verify the validity of the classification decision, but also focus further analysis on regions of potential interest. We evaluate our method for classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, the MNIST handwritten digits data set and for the pre-trained ImageNet model available as part of the Caffe open source package.
Article
Full-text available
The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the five years of the challenge, and propose future directions and improvements.
Article
Full-text available
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (approx 2 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.
Article
Full-text available
This report presents the results of the 2006 PASCAL Visual Object Classes Challenge (VOC2006). Details of the challenge, data, and evaluation are presented. Participants in the challenge submitted descriptions of their methods, and these have been included verbatim. This document should be considered preliminary, and subject to change.
Article
Full-text available
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.
Chapter
The problem of explaining complex machine learning models, including Deep Neural Networks, has gained increasing attention over the last few years. While several methods have been proposed to explain network predictions, the definition itself of explanation is still debated. Moreover, only a few attempts to compare explanation methods from a theoretical perspective have been made. In this chapter, we discuss the theoretical properties of several attribution methods and show how they share the same idea of using the gradient information as a descriptive factor for the functioning of a model. Finally, we discuss the strengths and limitations of these methods and compare them with available alternatives.
Chapter
Layer-wise relevance propagation (LRP) has shown potential for explaining neural network classifier decisions. In this paper, we investigate how LRP is to be applied to deep neural networks which make use of batch normalization (BatchNorm), and show that despite the functional simplicity of BatchNorm, several intuitive choices of published LRP rules perform poorly for a number of frequently used state-of-the-art networks. We also show that by using the ε-rule for BatchNorm layers we are able to detect training artifacts for MobileNet and layer design artifacts for ResNet. The causes of such failures are analyzed thoroughly. We observe that some assumptions on the LRP decomposition rules are broken for specific networks, and propose a novel LRP rule tailored for BatchNorm layers. Our quantitative evaluation shows the advantage of our novel LRP rule for BatchNorm layers and its wide applicability to common deep neural network architectures. As an aside, we demonstrate that one observation made by LRP analysis serves to modify a ResNet for faster initial training convergence.
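A common practical workaround for BatchNorm in LRP pipelines, which is not necessarily the novel rule proposed in this chapter, is to canonize the model by folding the inference-mode BatchNorm into the adjacent linear or convolutional layer, so that the backward pass only ever sees a single affine map; the sketch below assumes a dense layer with weights of shape (out, in) and per-output BatchNorm statistics.

```python
import numpy as np

def fold_batchnorm(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold an inference-mode BatchNorm into the preceding affine layer (sketch).

    y = gamma * (W x + b - mean) / sqrt(var + eps) + beta  is rewritten as  y = W' x + b',
    so no separate propagation rule for BatchNorm is needed during LRP.
    """
    scale = gamma / np.sqrt(var + eps)      # per-output-channel scaling factor
    W_folded = W * scale[:, None]           # rescale each output row of the weight matrix
    b_folded = (b - mean) * scale + beta
    return W_folded, b_folded
```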
Technical Report
TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Conference Paper
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means for regularizing training, we find that it actually builds a generic localizable deep representation that can be applied to a variety of tasks. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014, which is remarkably close to the 34.2% top-5 error achieved by a fully supervised CNN approach. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
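The localization map itself is straightforward to reproduce for a network that ends in global average pooling followed by a single linear classifier; the sketch below assumes the last-layer feature maps and classifier weights have already been extracted, and the variable names are illustrative.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Compute a Class Activation Map for a GAP + linear-classifier network (sketch).

    feature_maps : (C, H, W) activations of the last convolutional layer for one image
    fc_weights   : (n_classes, C) weights of the final linear layer
    Returns an (H, W) map highlighting regions supporting class_idx; upsample it
    to the input resolution for visualisation.
    """
    w = fc_weights[class_idx]                          # importance of each feature map for the class
    cam = np.tensordot(w, feature_maps, axes=1)        # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0.0)                         # keep positive evidence only
    return cam / (cam.max() + 1e-12)                   # normalise to [0, 1]
```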
Article
Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision. A starting point for this strategy is the gradient of the class score function with respect to the input image. This gradient can be interpreted as a sensitivity map, and there are several techniques that elaborate on this basic idea. This paper makes two contributions: it introduces SmoothGrad, a simple method that can help visually sharpen gradient-based sensitivity maps, and it discusses lessons in the visualization of these maps. We publish the code for our experiments and a website with our results.
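The method reduces to a short loop; the sketch below is a PyTorch version under the assumption of a classifier that returns logits for a single input tensor, with the noise level and sample count as illustrative defaults.

```python
import torch

def smoothgrad(model, x, target, n_samples=25, noise_level=0.15):
    """SmoothGrad sketch: average input gradients over noisy copies of the input.

    model  : classifier returning logits of shape (1, n_classes)
    x      : input tensor of shape (1, C, H, W)
    target : index of the class score being explained
    """
    sigma = noise_level * (x.max() - x.min())
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
        model(noisy)[0, target].backward()      # gradient of the class score w.r.t. the noisy input
        grads += noisy.grad
    return grads / n_samples                    # typically visually sharper than the raw gradient
```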
Article
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy. We show that they are not satisfied by most known attribution methods, which we consider to be a fundamental weakness of those methods. We use the axioms to guide the design of a new attribution method called Integrated Gradients. Our method requires no modification to the original network and is extremely simple to implement; it just needs a few calls to the standard gradient operator. We apply this method to a couple of image models, a couple of text models and a chemistry model, demonstrating its ability to debug networks, to extract rules from a deep network, and to enable users to engage with models better.
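The accumulation itself can be approximated with a Riemann sum; the following PyTorch sketch assumes a classifier returning logits and uses a black-image baseline and 50 steps as illustrative defaults.

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """Integrated Gradients sketch: accumulate gradients along a straight path
    from a baseline to the input and scale by the input difference.

    model  : classifier returning logits of shape (1, n_classes)
    x      : input tensor of shape (1, C, H, W)
    """
    baseline = torch.zeros_like(x) if baseline is None else baseline
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        model(point)[0, target].backward()      # gradient of the class score at this path point
        total += point.grad
    return (x - baseline) * total / steps       # approximately satisfies the completeness axiom
```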
Conference Paper
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we explore ways to scale up networks that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and with using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.
Article
We summarize the potential impact that the European Union's new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which "significantly affect" users. The law will also create a "right to explanation," whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination.
Article
Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: Alternating convolution and max-pooling layers followed by a small number of fully connected layers. We re-evaluate the state of the art for object recognition from small images with convolutional networks, questioning the necessity of different components in the pipeline. We find that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks. Following this finding -- and building on other recent work for finding simple network structures -- we propose a new architecture that consists solely of convolutional layers and yields competitive or state of the art performance on several object recognition datasets (CIFAR-10, CIFAR-100, ImageNet). To analyze the network we introduce a new variant of the "deconvolution approach" for visualizing features learned by CNNs, which can be applied to a broader range of network structures than existing approaches.
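The core architectural substitution can be illustrated with two drop-in downsampling blocks; the channel count and kernel sizes below are illustrative and do not reproduce the paper's exact configurations.

```python
import torch.nn as nn

# Conventional block: convolution followed by max-pooling for downsampling.
pool_block = nn.Sequential(
    nn.Conv2d(96, 96, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

# All-convolutional block: the max-pooling layer is replaced by a strided convolution,
# so downsampling is learned inside the convolution itself.
all_conv_block = nn.Sequential(
    nn.Conv2d(96, 96, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True),
)
```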
Article
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively.
Analyzing ImageNet with spectral relevance analysis: Towards ImageNet un-Hans’ed
  • C J Anders
  • T Marinč
  • D Neumann
  • W Samek
  • K.-R Müller
  • S Lapuschkin
iNNvestigate neural networks!
  • M Alber
  • S Lapuschkin
  • P Seegerer
  • M Hägele
  • K T Schütt
  • G Montavon