Conference Paper

Do Datapoints Argue?: Argumentation for Hierarchical Agreement in Datasets

Abstract

This work aims to utilize quantitative bipolar argumentation to detect deception in machine learning models. We explore the concept of deception in the context of interactions of a party developing a machine learning model with potentially malformed data sources. The objective is to identify deceptive or adversarial data and assess the effectiveness of comparative analysis during different stages of model training. By modeling disagreement and agreement between data points as arguments and utilizing quantitative measures, this work proposes techniques for detecting outliers in data. We discuss further applications in clustering and uncertainty modelling.
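
As an illustrative sketch of the general idea (not the authors' exact formulation), the following Python snippet treats each data point as an argument in a quantitative bipolar argumentation frame: mutually similar points support each other, dissimilar points attack each other, and an iterative gradual evaluation assigns each point a strength. Points with low final strength are candidate outliers. The similarity thresholds and the aggregation rule are assumptions chosen for illustration.

    import numpy as np

    def qbaf_outlier_scores(X, sim_thresh=0.7, att_thresh=0.3, iters=50):
        """Treat each data point as an argument in a quantitative bipolar
        argumentation frame (QBAF): high pairwise similarity -> support edge,
        low similarity -> attack edge. Strengths are computed with a simple
        iterative aggregation (an assumption; many gradual semantics exist)."""
        n = len(X)
        # Pairwise similarity in (0, 1] from Euclidean distances.
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        sim = 1.0 / (1.0 + d)
        np.fill_diagonal(sim, 0.0)

        support = sim > sim_thresh           # supporters: close neighbours
        attack = sim < att_thresh            # attackers: distant points
        np.fill_diagonal(attack, False)      # no self-attacks

        strength = np.full(n, 0.5)           # base score for every argument
        for _ in range(iters):
            # Aggregate supporter and attacker strengths for each argument.
            sup = np.array([strength[support[i]].sum() for i in range(n)])
            att = np.array([strength[attack[i]].sum() for i in range(n)])
            # Squashing aggregation keeps strengths in (0, 1).
            strength = 1.0 / (1.0 + np.exp(-(sup - att) / n))
        return strength                      # low strength ~ likely outlier

    X = np.vstack([np.random.randn(50, 2), [[8.0, 8.0]]])  # one planted outlier
    scores = qbaf_outlier_scores(X)
    print("most suspicious point:", int(np.argmin(scores)))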

References
Article
Full-text available
Using machine learning (ML) to automate camera trap (CT) image processing is advantageous for time-sensitive applications. However, little is currently known about the factors influencing such processing. Here, we evaluate the influence of occlusion, distance, vegetation type, size class, height, subject orientation towards the CT, species, time-of-day, colour, and analyst performance on wildlife/human detection and classification in CT images from western Tanzania. Additionally, we compared the detection probability (DP) and correct classification (CC) performance of analyst and ML approaches. We obtained wildlife data through pre-existing CT images and human data using voluntary participants for CT experiments. We evaluated the analyst and ML approaches at both the detection and classification levels. Factors such as distance and occlusion, coupled with increased vegetation density, had the most significant effect on DP and CC. Overall, the results indicate a significantly higher DP (81.1%) and CC (76.6%) for the analyst approach compared to ML, which detected 41.1% and correctly classified 47.5% of wildlife within CT images. However, both methods presented similar probabilities when detecting humans in daylight CT images, 69.4% (ML) and 71.8% (analysts), and in dusk CT images, 17.6% (ML) and 16.2% (analysts). Provided that users carefully follow the recommendations given, we expect DP and CC to increase. In turn, the ML approach to CT image processing would be an excellent provision to support time-sensitive threat monitoring for biodiversity conservation.
Article
Full-text available
Quality is the key issue for judging the usability of crowdsourced geographic data. However, owing to the lack of professional training among volunteers and the phenomenon of malicious labeling, crowdsourced data contain many abnormal or poor-quality objects. Based on this observation, an abnormal crowdsourced data detection method based on image features is proposed in this paper. The approach includes three main steps. 1) The crowdsourced vector data are used to segment the corresponding remote sensing imagery, producing image objects with a priori information (e.g., shape and category) from the vector data and spectral information from the images. A sampling method is then designed that considers the spatial distribution and topographic properties of the objects, yielding initial samples, although some samples are abnormal or of poor quality. 2) A feature contribution index (FCI) based on information gain is defined to select the optimal features, and a feature space outlier index (FSOI) is presented to automatically identify outlier samples and changed objects. The initial samples are refined by an iterative procedure; after the iteration, the optimal features can be determined, the refined samples with categories obtained, and the imagery feature space established using the optimal features for each category. 3) Abnormal objects are identified from the refined samples by calculating the FSOI values of image objects. To validate the effectiveness of the method, an abnormal crowdsourced data detection prototype was developed in C# with Visual Studio 2013, implementing the above algorithms, and verified using the water and vegetation categories as examples, with OSM (OpenStreetMap) data and corresponding imagery of Changsha city as experimental data. The angular second moment (ASM), contrast, inverse difference moment (IDM), mean, variance, difference entropy, and normalized difference green index (NDGI) of vegetation, and the IDM, difference entropy, correlation, and maximum band value of water, are used to detect abnormal data after selecting the optimal image features. Experimental results show that abnormal water and vegetation data in OSM can be effectively detected with this method: the missed detection rates for vegetation and water are both near zero, and the positive detection rates reach 90.4% and 83.8%, respectively.
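
The paper defines the FCI via information gain and the FSOI over the imagery feature space; the exact formulas are not reproduced on this page. As a hedged stand-in that captures the spirit of a feature-space outlier index, the sketch below scores each object by its Mahalanobis distance to its own category's centroid (the original FSOI definition may differ in detail).

    import numpy as np

    def feature_space_outlier_index(features, labels):
        """Hedged stand-in for an FSOI-style score: for each object, the
        Mahalanobis distance to its own category's centroid in feature space.
        (The original paper's FSOI definition may differ in detail.)"""
        scores = np.empty(len(features))
        for c in np.unique(labels):
            idx = np.where(labels == c)[0]
            Xc = features[idx]
            mu = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(Xc.shape[1])
            inv = np.linalg.inv(cov)
            diff = Xc - mu
            scores[idx] = np.sqrt(np.einsum("ij,jk,ik->i", diff, inv, diff))
        return scores  # large score ~ abnormal object within its category

    # Toy usage: two categories with one corrupted object.
    feats = np.random.rand(40, 5)
    labs = np.array([0] * 20 + [1] * 20)
    feats[3] += 5.0  # simulate a mislabeled/abnormal object
    print(np.argsort(-feature_space_outlier_index(feats, labs))[:3])
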
Conference Paper
Full-text available
Explaining the inner workings of deep neural network models has received considerable attention in recent years. Researchers have attempted to provide human-parseable explanations justifying why a model performed a specific classification. Although many of these toolkits are available for use, it is unclear which style of explanation is preferred by end-users, thereby demanding investigation. We performed a cross-analysis Amazon Mechanical Turk study comparing popular state-of-the-art explanation methods to empirically determine which are better at explaining model decisions. The participants were asked to compare explanation methods across applications spanning image, text, audio, and sensory domains. Among the surveyed methods, explanation-by-example was preferred in all domains except text sentiment classification, where LIME's method of annotating input text was preferred. We highlight qualitative aspects of employing the studied explainability methods and conclude with implications for researchers and engineers that seek to incorporate explanations into user-facing deployments.
Chapter
Full-text available
Labeling is a cornerstone of supervised machine learning. However, in industrial applications, data is often not labeled, which complicates using this data for machine learning. Although there are well-established labeling techniques such as crowdsourcing, active learning, and semi-supervised learning, these still do not provide accurate and reliable labels for every machine learning use case in the industry. In this context, the industry still relies heavily on manually annotating and labeling their data. This study investigates the challenges that companies experience when annotating and labeling their data. We performed a case study using a semi-structured interview with data scientists at two companies to explore their problems when labeling and annotating their data. This paper provides two contributions. We identify industry challenges in the labeling process, and then we propose mitigation strategies for these challenges.
Conference Paper
Full-text available
The development of machines that can tell stories in order to interact with humans or other artificial agents has significant implications in the area of trust and AI. Even more so if we expect such machines to be transparent and explain their reasoning when we interrogate them to see if they should be held accountable. One of these implications is the ability of machines to use stories in order to deceive others, thus undermining the relation of trust between humans and machines. In this paper we explore from the perspective of an argumentation-based dialogue game what it means for a machine to deceive by telling stories.
Article
Full-text available
A bounded-reasoning agent may face two dimensions of uncertainty: firstly, the uncertainty arising from partial information and conflicting reasons, and secondly, the uncertainty arising from the stochastic nature of its actions and the environment. This paper attempts to address both dimensions within a single unified framework, by bringing together probabilistic argumentation and reinforcement learning. We show how a probabilistic rule-based argumentation framework can capture Markov decision processes and reinforcement learning agents; and how the framework allows us to characterise agents and their argument-based motivations from both a logic-based perspective and a probabilistic perspective. We advocate and illustrate the use of our approach to capture models of agency and norms, and argue that, in addition to providing a novel method for investigating agent types, the unified framework offers a sound basis for taking a mentalistic approach to agent profiles.
Article
Full-text available
We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
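
One concrete instance of the relaxation described in the abstract: with an absolute loss and a norm-induced Wasserstein ball, the DRO problem reduces to l1-loss regression with a dual-norm penalty on the coefficients. The hedged sketch below solves such a regularized form with cvxpy; the l2 norm pairing and the radius value are assumptions for illustration, since the paper studies several norm choices.

    import cvxpy as cp
    import numpy as np

    def dro_regression(X, y, radius=0.1):
        """Regularized form equivalent to a Wasserstein-DRO linear regression
        with absolute loss: mean |y - X beta| + radius * ||beta||. The l2
        norm here is an assumption; the paper considers several pairings."""
        n, d = X.shape
        beta = cp.Variable(d)
        obj = cp.sum(cp.abs(y - X @ beta)) / n + radius * cp.norm(beta, 2)
        cp.Problem(cp.Minimize(obj)).solve()
        return beta.value

    X = np.random.randn(100, 3)
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * np.random.randn(100)
    y[:5] += 20.0  # adversarially corrupted outliers
    print(dro_regression(X, y))  # coefficients stay close to the true plane
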
Article
Full-text available
As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically-grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We extensively evaluate our attacks and defenses on three realistic datasets from the health care, loan assessment, and real estate domains.
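
The authors' optimization framework is not reproduced here; as a simplified, hedged illustration of gradient-based poisoning, the sketch below perturbs a handful of injected training points by finite-difference ascent on the validation loss of a retrained ridge-regression model. All names and hyperparameters are placeholders, not the paper's exact attack.

    import numpy as np

    def fit_ridge(X, y, lam=0.1):
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    def val_loss(Xp, yp, Xtr, ytr, Xval, yval):
        # Retrain on clean + poison data, measure validation error.
        theta = fit_ridge(np.vstack([Xtr, Xp]), np.concatenate([ytr, yp]))
        return np.mean((Xval @ theta - yval) ** 2)

    def poison(Xtr, ytr, Xval, yval, n_poison=3, steps=100, lr=0.5, eps=1e-4):
        """Finite-difference gradient ascent: nudge the poison points so
        that the retrained model's validation error increases."""
        rng = np.random.default_rng(0)
        Xp = rng.normal(size=(n_poison, Xtr.shape[1]))
        yp = rng.normal(size=n_poison)
        for _ in range(steps):
            grad = np.zeros_like(Xp)
            base = val_loss(Xp, yp, Xtr, ytr, Xval, yval)
            for i in range(n_poison):
                for j in range(Xtr.shape[1]):
                    Xd = Xp.copy()
                    Xd[i, j] += eps
                    grad[i, j] = (val_loss(Xd, yp, Xtr, ytr, Xval, yval) - base) / eps
            Xp += lr * grad  # ascend: make the trained model worse
        return Xp, yp

    theta_true = np.array([1.0, -1.0, 2.0])
    Xtr, Xval = np.random.randn(80, 3), np.random.randn(40, 3)
    Xp, yp = poison(Xtr, Xtr @ theta_true, Xval, Xval @ theta_true)
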
Article
Full-text available
Background: Deep learning models are typically trained using stochastic gradient descent or one of its variants. These methods update the weights using their gradient, estimated from a small fraction of the training data. It has been observed that when using large batch sizes there is a persistent degradation in generalization performance, known as the "generalization gap" phenomenon. Identifying the origin of this gap and closing it has remained an open problem. Contributions: We examine the initial high-learning-rate training phase. We find that the weight distance from its initialization grows logarithmically with the number of weight updates. We therefore propose a "random walk on a random landscape" statistical model, which is known to exhibit similar "ultra-slow" diffusion behavior. Following this hypothesis, we conducted experiments to show empirically that the "generalization gap" stems from the relatively small number of updates rather than the batch size, and can be completely eliminated by adapting the training regime used. We further investigate different techniques to train models in the large-batch regime and present a novel algorithm named "Ghost Batch Normalization" which enables a significant decrease in the generalization gap without increasing the number of updates. To validate our findings we conduct several additional experiments on MNIST, CIFAR-10, CIFAR-100 and ImageNet. Finally, we reassess common practices and beliefs concerning training of deep models and suggest they may not be optimal to achieve good generalization.
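
A hedged PyTorch sketch of the Ghost Batch Normalization idea as described in the abstract: compute batch-norm statistics over small "ghost" slices of a large batch rather than over the full batch. Learned affine parameters and running statistics are omitted for brevity, so this is a sketch of the normalization step only.

    import torch

    def ghost_batch_norm(x, ghost_size=32, eps=1e-5):
        """Normalize each ghost batch (a small slice of the large batch)
        with its own mean/variance, instead of full-batch statistics."""
        out = []
        for c in x.split(ghost_size, dim=0):
            mu = c.mean(dim=0, keepdim=True)
            var = c.var(dim=0, unbiased=False, keepdim=True)
            out.append((c - mu) / torch.sqrt(var + eps))
        return torch.cat(out, dim=0)

    x = torch.randn(4096, 128)               # one large batch of features
    y = ghost_batch_norm(x, ghost_size=32)   # stats per 32-sample slice
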
Article
Full-text available
Online argumentation systems enable stakeholders to post their problems under consideration and solution alternatives, and to exchange arguments over the alternatives posted in an argumentation tree. In an argumentation process, stakeholders have their own opinions, which very often contrast and conflict with the opinions of others. Some of these opinions may be outliers with respect to the mean group opinion. This paper presents a method for identifying stakeholders with outlier opinions in an argumentation process. It detects outlier opinions on the basis of individual stakeholders' opinions, as well as collective opinions on them from other stakeholders. Decision makers and other participants in an argumentation process therefore have an opportunity to explore the outlier opinions within their groups from both individual and group perspectives. In a large argumentation tree, it is often difficult to identify stakeholders with outlier opinions manually; the system presented in this paper identifies them automatically. Experiments are presented to evaluate the proposed method. Their results show that the method detects outlier opinions in an online argumentation process effectively.
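
As a much-simplified, hedged illustration of outlier-opinion detection (not the paper's method, which also incorporates collective opinions), the sketch below flags stakeholders whose mean stance across an argumentation tree deviates from the group mean by more than a chosen z-score threshold; the stance encoding in [-1, 1] and the threshold are assumptions.

    import numpy as np

    def outlier_stakeholders(stances, z_thresh=2.0):
        """stances: matrix (stakeholders x issues) with entries in [-1, 1]
        (-1 = strongly against, +1 = strongly for). A stakeholder is
        flagged when the z-score of their mean stance exceeds z_thresh."""
        means = stances.mean(axis=1)
        z = (means - means.mean()) / (means.std() + 1e-12)
        return np.where(np.abs(z) > z_thresh)[0]

    stances = np.clip(np.random.randn(30, 10) * 0.3, -1, 1)
    stances[7] = 1.0  # one stakeholder consistently contrary to the group
    print(outlier_stakeholders(stances))  # expected to flag stakeholder 7
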
Article
Full-text available
We give an introductory tutorial to assumption-based argumentation (referred to as ABA) – a form of argumentation where arguments and attacks are notions derived from primitive notions of rules in a deductive system, assumptions and contraries thereof. ABA is equipped with different semantics for determining ‘winning’ sets of assumptions and – interchangeably and equivalently – ‘winning’ sets of arguments. It is also equipped with a catalogue of computational techniques to determine whether given conclusions can be supported by a ‘winning’ set of arguments. These are in the form of disputes between (fictional) proponent and opponent players, provably correct w.r.t. the semantics. Albeit simple, ABA is powerful in that it can be used to represent and reason with a number of problems in AI and beyond: non-monotonic reasoning, preferences, decisions. While doing so, it encompasses the expressive and computational needs of these problems while affording the transparency and explanatory power of argumentation.
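
A minimal sketch of the ABA ingredients just described, under simplifying assumptions (a finite, acyclic rule set and a toy language): claims are derived from sets of assumptions via the rules, and one assumption set attacks another when it derives the contrary of one of the other's assumptions. Computing "winning" sets under a semantics is omitted.

    # Rules: conclusion <- body (bodies may mention assumptions).
    rules = {"p": [["a"]], "q": [["b"]], "not_a": [["q"]]}
    assumptions = {"a", "b"}
    contrary = {"a": "not_a", "b": "not_b"}

    def derivable(claim, asm):
        """Can `claim` be derived from assumption set `asm` via the rules?"""
        if claim in asm:
            return True
        return any(all(derivable(lit, asm) for lit in body)
                   for body in rules.get(claim, []))

    def attacks(asm1, asm2):
        """asm1 attacks asm2 iff asm1 derives the contrary of some
        assumption in asm2 (the ABA notion of attack)."""
        return any(derivable(contrary[a], asm1) for a in asm2)

    print(attacks({"b"}, {"a"}))  # True: {b} derives q, hence not_a
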
Chapter
This chapter reviews diagnostic and robust procedures for detecting outliers and other interesting observations in linear regression. First, we present statistics for detecting single outliers and influential observations and show their limitations for multiple outliers in high-leverage situations. Second, we discuss diagnostic procedures designed to avoid masking by finding first a clean subset for estimating the parameters and then increasing its size by incorporating, one by one, new homogeneous observations until a heterogeneous observation is found. We also discuss procedures based on sensitive observations for detecting high-leverage outliers in large data sets using the eigenvectors of a sensitivity matrix. We briefly review robust estimation methods and its relationship with diagnostic procedures. Next, we consider large high-dimensional data sets where the application of iterative procedures can be slow and show that the joint use of simple univariate statistics, as predictive residuals, Cook’s distances, and Peña’s sensitivity statistic, can be a useful diagnostic tool. We also comment on other recent procedures based on regularization and sparse estimation and conclude with a brief analysis of the relationship of outlier detection and cluster analysis. A real data and a simulated example are presented to illustrate the procedures presented in the chapter.KeywordsCook’s distanceInfluential observationsInfluence matrixLeverageMaskingPredictive residualsSensitivity matrixRobust estimation
Article
We show that an interesting class of feed-forward neural networks can be understood as quantitative argumentation frameworks. This connection creates a bridge between research in Formal Argumentation and Machine Learning. We generalize the semantics of feed-forward neural networks to acyclic graphs and study the resulting computational and semantical properties in argumentation graphs. As it turns out, the semantics gives stronger guarantees than existing semantics that have been tailor-made for the argumentation setting. From a machine-learning perspective, the connection does not seem immediately helpful. While it gives intuitive meaning to some feed-forward neural networks, they remain difficult to understand due to their size and density. However, the connection seems helpful for combining background knowledge in the form of sparse argumentation networks with dense neural networks that have been trained for complementary purposes, and for learning the parameters of quantitative argumentation frameworks in an end-to-end fashion from data.
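
As a hedged illustration of the correspondence (not the paper's formal construction), the sketch below reads a tiny feed-forward network as a quantitative argumentation graph: each neuron is an argument, positive weights act as supports, negative weights as attacks, biases as base scores, and the forward pass plays the role of a gradual-semantics evaluation.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A tiny 2-3-1 MLP, viewed as an acyclic argumentation graph.
    W1 = np.array([[1.5, -2.0], [0.8, 0.3], [-1.0, 1.2]])  # layer-1 edges
    b1 = np.array([0.1, -0.2, 0.0])                        # base scores
    W2 = np.array([[2.0, -1.5, 0.7]])
    b2 = np.array([0.0])

    def strengths(x):
        """Forward pass = gradual evaluation: each neuron/argument
        aggregates weighted strengths of its supporters (w > 0) and
        attackers (w < 0), then squashes into (0, 1)."""
        h = sigmoid(W1 @ x + b1)      # strengths of hidden arguments
        out = sigmoid(W2 @ h + b2)    # strength of the output argument
        return h, out

    h, out = strengths(np.array([0.9, 0.1]))
    print("hidden strengths:", h, "output strength:", out)
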
Chapter
Understanding and assessing the vulnerability of crowdsourcing learning against data poisoning attacks is the key to ensure the quality of classifiers trained from crowdsourced labeled data. Existing studies on data poisoning attacks only focus on exploring the vulnerability of crowdsourced label collection. In fact, instead of the quality of labels themselves, the performance of the trained classifier is a main concern in crowdsourcing learning. Nonetheless, the impact of data poisoning attacks on the final classifiers remains underexplored to date. We aim to bridge this gap. First, we formalize the problem of poisoning attacks, where the objective is to sabotage the trained classifier maximally. Second, we transform the problem into a bilevel min-max optimization problem for the typical learning-from-crowds model and design an efficient adversarial strategy. Extensive validation on real-world datasets demonstrates that our attack can significantly decrease the test accuracy of trained classifiers. We verified that the labels generated with our strategy can be transferred to attack a broad family of crowdsourcing learning models in a black-box setting, indicating its applicability and potential of being extended to the physical world.
Chapter
Machine learning has been widely applied in practice, for example in disease diagnosis and target detection. Commonly, a good model relies on massive training data collected from different sources. However, the collected data might expose sensitive information. To solve this problem, researchers have proposed many excellent methods that combine machine learning with privacy protection technologies, such as secure multiparty computation (MPC), homomorphic encryption (HE), and differential privacy. Meanwhile, other researchers have proposed distributed machine learning, which allows clients to store their data locally but train a model collaboratively. The first kind of method focuses on security, but its performance and accuracy remain to be improved; the second provides higher accuracy and better performance but weaker security: for instance, an adversary can launch membership attacks from the gradient updates exchanged in plaintext.
Chapter
Argumentation, in the field of Artificial Intelligence, is a formalism for reasoning with contradictory information and for modelling an exchange of arguments between one or several agents. For this purpose, many semantics have been defined, among them gradual semantics, which aim to assign an acceptability degree to each argument. Although the number of these semantics continues to increase, there is currently no method for explaining the results they return. In this paper, we study the interpretability of these semantics by measuring, for each argument, the impact of the other arguments on its acceptability degree. We define a new property and show that the score of an argument returned by a gradual semantics satisfying this property can also be computed by aggregating the impact of the other arguments on it. This result makes it possible to provide, for each argument in an argumentation framework, a ranking of the other arguments from the most to the least impacting w.r.t. a given gradual semantics.
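
A hedged sketch of the impact idea, assuming the weighted h-categorizer gradual semantics (one common choice, not necessarily the one studied in the paper): compute acceptability degrees by fixed-point iteration, then measure the impact of argument b on argument a as the change in a's degree when b is removed from the framework.

    def degrees(args, attackers, weights, iters=200):
        """Weighted h-categorizer: deg(a) = w(a) / (1 + sum of attackers'
        degrees), computed by fixed-point iteration."""
        deg = dict(weights)
        for _ in range(iters):
            deg = {a: weights[a] / (1.0 + sum(deg[b] for b in attackers.get(a, [])))
                   for a in args}
        return deg

    def impact(target, other, args, attackers, weights):
        """Impact of `other` on `target`: degree shift when `other` is removed."""
        sub_args = [a for a in args if a != other]
        sub_atts = {a: [b for b in bs if b != other]
                    for a, bs in attackers.items() if a != other}
        full = degrees(args, attackers, weights)
        reduced = degrees(sub_args, sub_atts, weights)
        return reduced[target] - full[target]

    args = ["a", "b", "c"]
    attackers = {"a": ["b"], "b": ["c"]}   # c attacks b, b attacks a
    weights = {x: 1.0 for x in args}
    # Removing c strengthens b and thus weakens a: negative impact on a.
    print(impact("a", "c", args, attackers, weights))
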
Article
Knowledge integration in distributed data mining, which aims to integrate inconsistent information located on distributed sites, has received widespread attention. Traditional integration methods are ineffective because they are unable to generate global knowledge, support advanced integration strategies, or make predictions without the individual classifiers. In this paper, we propose an argumentation-based reinforcement learning method to handle this problem. Inspired by meta-learning, we integrate distributed knowledge and extract meta-knowledge, i.e., agreed-upon knowledge consistent across all the agents. Specifically, two learning stages are introduced: an argumentation-based learning stage integrates and extracts meta-knowledge, and a reinforcement learning stage evaluates and refines it. The two stages run alternately to extract a global meta-knowledge base, which can be used to make predictions directly. Results from extensive experiments demonstrate that our method extracts refined meta-knowledge with very satisfactory performance.
Article
High-quality annotated collections are a key element in constructing systems that use machine learning. In most cases, these collections are created through manual labeling, which is expensive and tedious for annotators. To optimize data labeling, a number of methods using active learning and crowdsourcing were proposed. This paper provides a survey of currently available approaches, discusses their combined use, and describes existing software systems designed to facilitate the data labeling process.
Conference Paper
Outlier detection is an important data mining technique to identify interesting and novel patterns, trends and anomalies from data. Density-based methods are among the most popular class of methods used in outlier detection. However, these methods suffer from the low density patterns problem that could lead to poor performance. In this paper, a novel relative density-based outlier detection algorithm is proposed, which utilizes a new measure of an object's neighborhood density. This approach takes into account an important factor for density: relative neighborhood. Experiments on both simulated and real data demonstrate that the proposed algorithm achieves better performance than other alternatives.
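
As a hedged sketch in the spirit of relative density-based detection (simpler than the paper's algorithm): estimate each point's local density from its k nearest neighbours and score it by the ratio of its neighbours' average density to its own, so points sitting in a much denser neighbourhood than their own score high. The value of k and the density estimate are assumptions.

    import numpy as np

    def relative_density_scores(X, k=5):
        """Local density = inverse of mean distance to the k nearest
        neighbours; outlier score = average neighbour density divided by
        own density (>> 1 suggests an outlier). Brute-force sketch."""
        D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
        np.fill_diagonal(D, np.inf)
        knn = np.argsort(D, axis=1)[:, :k]
        dens = 1.0 / (np.take_along_axis(D, knn, axis=1).mean(axis=1) + 1e-12)
        return dens[knn].mean(axis=1) / dens

    X = np.vstack([np.random.randn(100, 2), [[6.0, 6.0]]])
    print(np.argmax(relative_density_scores(X)))  # index of planted outlier
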
Article
The study of properties of gradual evaluation methods in argumentation has received increasing attention in recent years, with studies devoted to various classes of frameworks/methods leading to conceptually similar but formally distinct properties in different contexts. In this paper we provide a novel systematic analysis for this research landscape by making three main contributions. First, we identify groups of conceptually related properties in the literature, which can be regarded as based on common patterns and, using these patterns, we evidence that many further novel properties can be considered. Then, we provide a simplifying and unifying perspective for these groups of properties by showing that they are all implied by novel parametric principles of (either strict or non-strict) balance and monotonicity. Finally, we show that (instances of) these principles (and thus the group, literature and novel properties that they imply) are satisfied by several quantitative argumentation formalisms in the literature, thus confirming the principles' general validity and utility to support a compact, yet comprehensive, analysis of properties of gradual argumentation.
Article
It is common practice to decay the learning rate. Here we show one can usually obtain the same learning curve on both training and test sets by instead increasing the batch size during training. This procedure is successful for stochastic gradient descent (SGD), SGD with momentum, Nesterov momentum, and Adam. It reaches equivalent test accuracies after the same number of training epochs, but with fewer parameter updates, leading to greater parallelism and shorter training times. We can further reduce the number of parameter updates by increasing the learning rate ε and scaling the batch size B ∝ ε. Finally, one can increase the momentum coefficient m and scale B ∝ 1/(1−m), although this tends to slightly reduce the test accuracy. Crucially, our techniques allow us to repurpose existing training schedules for large-batch training with no hyper-parameter tuning. We train Inception-ResNet-V2 on ImageNet to 77% validation accuracy in under 2500 parameter updates, efficiently utilizing training batches of 65536 images.
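
A hedged PyTorch sketch of the schedule described in the abstract: at each milestone where one would normally divide the learning rate by some factor, multiply the batch size by that factor instead, keeping B ∝ ε. The dataset, model, and milestone values are placeholders.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Placeholder data and model; the schedule is the point of the sketch.
    data = TensorDataset(torch.randn(10000, 20), torch.randint(0, 2, (10000,)))
    model = torch.nn.Linear(20, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()

    batch_size, factor, milestones = 128, 5, {30, 60}
    for epoch in range(80):
        if epoch in milestones:
            batch_size *= factor  # grow B where one would decay lr by 1/factor
        loader = DataLoader(data, batch_size=batch_size, shuffle=True)
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
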
Conference Paper
This paper formalizes a dialogue that includes dishonest arguments in persuasion. We propose a dialogue model that uses a predicted opponent model and define a protocol using this prediction with an abstract argumentation framework. We focus on deception as dishonesty; that is, the case in which an agent hides her knowledge. We define the concepts of dishonest argument and suspicious argument by means of the acceptance of arguments in this model. We show how a dialogue including dishonest arguments proceeds according to the protocol and discuss a condition for a dishonest argument to be accepted without being revealed.
Article
Optimal planning in environments shared with other interacting agents often involves recognizing the intent of others and their plans. This is because others' actions may impact the state of the environment and, consequently, the efficacy of the agent's plan. Planning becomes further complicated in the presence of uncertainty, which may manifest due to the state being partially observable, nondeterminism in how the dynamic state changes, and imperfect sensors. A framework for decision-theoretic planning in this space is the interactive partially observable Markov decision process (I-POMDP), which generalizes the well-known POMDP to multiagent settings. This chapter describes the general I-POMDP framework and a particular approximation that facilitates its usage. Because I-POMDPs elegantly integrate beliefs and the modeling of others in the subject agent's decision process, they apply to modeling human behavioral data obtained from strategic games. We explore the effectiveness of models based on simplified I-POMDPs in fitting experimental data onto theory-of-mind-based recursive reasoning.
Book
Recent decades have seen rapid advances in automatization processes, supported by modern machines and computers. The result is significant increases in system complexity and state changes, information sources, the need for faster data handling and the integration of environmental influences. Intelligent systems, equipped with a taxonomy of data-driven system identification and machine learning algorithms, can handle these problems partially. Conventional learning algorithms in a batch off-line setting fail whenever dynamic changes of the process appear due to non-stationary environments and external influences. Learning in Non-Stationary Environments: Methods and Applications offers a wide-ranging, comprehensive review of recent developments and important methodologies in the field. The coverage focuses on dynamic learning in unsupervised problems, dynamic learning in supervised classification and dynamic learning in supervised regression problems. A later section is dedicated to applications in which dynamic learning methods serve as keystones for achieving models with high accuracy. Rather than rely on a mathematical theorem/proof style, the editors highlight numerous figures, tables, examples and applications, together with their explanations. This approach offers a useful basis for further investigation and fresh ideas and motivates and inspires newcomers to explore this promising and still emerging field of research.
Article
A large number of statistics are used in the literature to detect outliers and influential observations in the linear regression model. In this paper, comparative studies are conducted to determine which statistic performs better than the others. This includes: (i) a detailed simulation study, and (ii) analyses of several data sets studied by different authors. Different choices of the design matrix of the regression model are considered. Design A studies the performance of the various statistics for detecting scale-shift outliers, while designs B and C provide information on their performance in identifying influential observations. We use cutoff points based on the exact distributions and Bonferroni's inequality for each statistic. The results show that the studentized residual, which is used to detect mean-shift outliers, is also appropriate for detecting scale-shift outliers, and that Welsch's statistic and Cook's distance are appropriate for detecting influential observations.
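
For reference, a short numpy sketch of the two statistics the study singles out, using their standard textbook definitions: internally studentized residuals and Cook's distance, both computed from the hat matrix of an ordinary least squares fit.

    import numpy as np

    def regression_diagnostics(X, y):
        """OLS hat matrix, internally studentized residuals, Cook's distance."""
        n, p = X.shape
        H = X @ np.linalg.solve(X.T @ X, X.T)     # hat (leverage) matrix
        e = y - H @ y                             # residuals
        h = np.diag(H)
        s2 = e @ e / (n - p)                      # residual variance estimate
        r = e / np.sqrt(s2 * (1.0 - h))           # studentized residuals
        cooks = r**2 * h / (p * (1.0 - h))        # Cook's distance
        return r, cooks

    X = np.column_stack([np.ones(50), np.random.randn(50)])
    y = 2.0 + 3.0 * X[:, 1] + np.random.randn(50)
    y[10] += 8.0                                  # planted mean-shift outlier
    r, cooks = regression_diagnostics(X, y)
    print(np.argmax(np.abs(r)), np.argmax(cooks))
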
P. Lison. An introduction to machine learning.
M. Littman. Markov decision processes.
C. Molnar. Interpretable machine learning: a guide for making black box models explainable.
C. Sakama. A formal account of deception.
J. Spieler, N. Potyka, S. Staab. Learning gradual argumentation frameworks using genetic algorithms.