Figure 1: Adversarial images can have manipulated explanations. Image A: adversarial image whose explanation reveals the target class. Image B: adversarial image whose manipulated explanation hides the target class. Manipulation based on Dombrowski et al. (2019).
Source publication
Including human analysis has the potential to positively affect the robustness of Deep Neural Networks, yet it remains relatively unexplored in the Adversarial Machine Learning literature. Neural network visual explanation maps have been shown to be prone to adversarial attacks. Further research is needed to select robust visualizations of explanat...
Context in source publication
Context 1
... maps that are expected to reveal adversarial images (Ye et al., 2020) may be manipulated to disguise adversarial attacks and model biases. This poses obstacles not only for model trust but also for HITL evaluation (Figure 1). While solutions have been put forward that provide robustness against manipulation (Dombrowski et al., 2022), the issue remains when analysts are looking at unfamiliar classes or using traditional explanatory techniques. ...
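To make the threat concrete, below is a minimal sketch in the spirit of the manipulation attack of Dombrowski et al. (2019): gradient descent perturbs an image so that its explanation map approaches an attacker-chosen target map while the model's output stays approximately fixed. The gradient-saliency explanation, the VGG16 architecture, the softplus smoothing, and all hyperparameters here are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torchvision.models as models

def soften_relus(module, beta=10.0):
    """Replace ReLUs with softplus so the explanation is twice
    differentiable (ReLU's second derivative vanishes a.e.)."""
    for name, child in module.named_children():
        if isinstance(child, torch.nn.ReLU):
            setattr(module, name, torch.nn.Softplus(beta=beta))
        else:
            soften_relus(child, beta)

def saliency(model, x, class_idx):
    """Differentiable gradient (saliency) explanation |d logit_c / d x|."""
    logit = model(x)[0, class_idx]
    grad, = torch.autograd.grad(logit, x, create_graph=True)
    return grad.abs().sum(dim=1)  # aggregate over colour channels

def manipulate_explanation(model, x, target_expl, steps=100, lr=1e-3, gamma=1e-1):
    """Perturb x so its saliency map approaches target_expl while the
    model's output stays close to the original prediction.
    steps, lr, and the loss weight gamma are illustrative values."""
    with torch.no_grad():
        orig_out = model(x)
        class_idx = orig_out.argmax(dim=1).item()
    x_adv = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        expl = saliency(model, x_adv, class_idx)
        # Match the target explanation; penalise output drift
        loss = ((expl - target_expl) ** 2).mean() \
             + gamma * ((model(x_adv) - orig_out) ** 2).mean()
        loss.backward()
        opt.step()
    return x_adv.detach()

# Example usage (illustrative shapes; pretrained weights would be used in practice)
model = models.vgg16(weights=None).eval()
soften_relus(model)
x = torch.rand(1, 3, 224, 224)      # stand-in for a preprocessed image
target = torch.zeros(1, 224, 224)   # e.g. an "empty" map that hides evidence
x_adv = manipulate_explanation(model, x, target)
```

The softplus substitution matters: with plain ReLUs the network is piecewise linear, the explanation's second derivatives vanish almost everywhere, and the attack loses its gradient signal; this geometric observation is central to Dombrowski et al.'s analysis.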
Similar publications
Open Knowledge Maps is a software application that generates a visual representation of knowledge areas and concepts from a given topic. It is very useful for identifying relevant concepts or classifications of knowledge, especially when one wants to obtain a general overview of how a topic has been addressed in the literature, and mo...