Conference Paper

Toward User-adaptive Interactive Labeling on Crowdsourcing Platforms

Authors:

Abstract

As machine learning continues to grow in popularity, so does the need for labeled training data. Crowd workers often have to tag, label, or annotate these datasets in a labour-intensive, monotonous, and error-prone process that can even be frustrating. However, current task and system designs typically disregard worker-centric issues. In this vision statement, we argue that, given the rising human-AI interaction in crowd work, further attention needs to be paid to the design of labeling systems in this regard. Specifically, we see the need for platforms to adapt dynamically to the affective-cognitive states of crowd workers based on different types of data (i.e., physiological, behavioral, or self-reported). A platform that is considerate of its crowd workers should be able to adapt to such states on an individual level, for instance, by suggesting currently fitting tasks. In conclusion, we call for interdisciplinary research to make this vision a reality.
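
One way to picture the kind of adaptation the authors envision is a simple loop that fuses physiological, behavioral, and self-reported signals into a state estimate and then suggests a currently fitting task. The sketch below is purely illustrative; the signal names, fusion rule, and thresholds are our own assumptions, not part of the paper.

```python
# Hypothetical sketch of the adaptation loop described in the abstract: estimate a
# worker's affective-cognitive state from available signals and suggest a task whose
# demands fit that state. All names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class WorkerState:
    fatigue: float      # 0 (rested) .. 1 (exhausted), e.g. from physiological data
    frustration: float  # 0 .. 1, e.g. from behavioral traces or self-reports

@dataclass
class Task:
    name: str
    cognitive_load: float  # 0 (light) .. 1 (demanding)

def estimate_state(physiological, behavioral, self_report) -> WorkerState:
    """Fuse the three signal types into a single state estimate (toy average)."""
    fatigue = (physiological["hrv_drop"] + behavioral["slowdown"] + self_report["tiredness"]) / 3
    frustration = (behavioral["error_rate"] + self_report["frustration"]) / 2
    return WorkerState(fatigue=min(fatigue, 1.0), frustration=min(frustration, 1.0))

def suggest_task(state: WorkerState, tasks: list[Task]) -> Task:
    """Prefer lighter tasks when the worker appears fatigued or frustrated."""
    budget = 1.0 - max(state.fatigue, state.frustration)
    feasible = [t for t in tasks if t.cognitive_load <= budget] or tasks
    return max(feasible, key=lambda t: t.cognitive_load)  # hardest task that still fits

tasks = [Task("image tagging", 0.3), Task("text annotation", 0.6), Task("audio transcription", 0.9)]
state = estimate_state({"hrv_drop": 0.7}, {"slowdown": 0.5, "error_rate": 0.4},
                       {"tiredness": 0.6, "frustration": 0.2})
print(suggest_task(state, tasks).name)  # -> "image tagging" for this fatigued worker
```
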


References
Article
Full-text available
Crowdsourcing markets provide workers with a centralized place to find paid work. What may not be obvious at first glance is that, in addition to the work they do for pay, crowd workers also have to shoulder a variety of unpaid invisible labor in these markets, which ultimately reduces workers' hourly wages. Invisible labor includes finding good tasks, messaging requesters, or managing payments. However, we currently know little about how much time crowd workers actually spend on invisible labor or how much it costs them economically. To ensure a fair and equitable future for crowd work, we need to be certain that workers are being paid fairly for all of the work they do. In this paper, we conduct a field study to quantify the invisible labor in crowd work. We build a plugin to record the amount of time that 100 workers on Amazon Mechanical Turk dedicate to invisible labor while completing 40,903 tasks. If we ignore the time workers spent on invisible labor, workers' median hourly wage was $3.76. But, we estimated that crowd workers in our study spent 33% of their time daily on invisible labor, dropping their median hourly wage to $2.83. We found that the invisible labor differentially impacts workers depending on their skill level and workers' demographics. The invisible labor category that took the most time and that was also the most common revolved around workers having to manage their payments. The second most time-consuming invisible labor category involved hyper-vigilance, where workers vigilantly watched over requesters' profiles for newly posted work or vigilantly searched for labor. We hope that through our paper, the invisible labor in crowdsourcing becomes more visible, and our results help to reveal the larger implications of the continuing invisibility of labor in crowdsourcing.
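
The reported wage drop can be reproduced, under one plausible reading of the figures, by treating invisible labor as unpaid time added on top of paid task time. The snippet below is only a back-of-the-envelope check, not the authors' exact computation.

```python
# Back-of-the-envelope check of the reported wage drop (not the authors' exact method).
# Assumption: invisible labor adds roughly 33% extra unpaid time on top of paid task time,
# so the same earnings are spread over 1.33x as many hours.
paid_hourly_wage = 3.76      # median wage counting only paid task time
invisible_overhead = 0.33    # extra unpaid time relative to paid time (assumed reading)

effective_wage = paid_hourly_wage / (1 + invisible_overhead)
print(f"effective median hourly wage ≈ ${effective_wage:.2f}")  # ≈ $2.83
```
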
Article
Full-text available
We analyze how advice from an AI affects complementarities between humans and AI, in particular what humans know that an AI does not: “unique human knowledge.” In a multi-method study consisting of an analytical model, experimental studies, and a simulation study, our main finding is that human choices converge toward similar responses, improving individual accuracy. However, as the overall individual accuracy of the group of humans improves, individual unique human knowledge decreases. Based on this finding, we claim that humans interacting with AI behave like “Borgs,” that is, cyborg creatures with strong individual performance but no human individuality. We argue that the loss of unique human knowledge may lead to several undesirable outcomes in a host of human–AI decision environments. We demonstrate this harmful impact on the “wisdom of crowds.” Simulation results based on our experimental data suggest that groups of humans interacting with AI are far less effective compared to human groups without AI assistance. We suggest mitigation techniques to create environments that can provide the best of both worlds (e.g., by personalizing AI advice). We show that such interventions perform well individually as well as in wisdom-of-crowds settings.
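
The core mechanism can be illustrated with a toy simulation of our own (not the paper's): when everyone anchors on the same AI advice, individuals become slightly more accurate, but their errors become correlated, so averaging across the crowd cancels less noise. All distributions and weights below are arbitrary assumptions.

```python
# Toy illustration of why converging on shared AI advice can hurt the "wisdom of crowds".
import random
import statistics

random.seed(0)
TRUTH, N_WORKERS, N_TRIALS = 0.0, 50, 2000

def crowd_error(with_ai: bool) -> float:
    errors = []
    for _ in range(N_TRIALS):
        ai_error = random.gauss(0, 1)  # one shared AI estimate per trial
        if with_ai:
            # workers anchor on the AI and keep only a little individual judgment
            guesses = [TRUTH + 0.8 * ai_error + random.gauss(0, 0.3) for _ in range(N_WORKERS)]
        else:
            guesses = [TRUTH + random.gauss(0, 1) for _ in range(N_WORKERS)]
        errors.append(abs(statistics.mean(guesses) - TRUTH))
    return statistics.mean(errors)

print("crowd error without AI advice:", round(crowd_error(False), 3))  # small: errors cancel out
print("crowd error with shared AI advice:", round(crowd_error(True), 3))  # larger: errors correlated
```
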
Article
Full-text available
The emergence of big data combined with the technical developments in Artificial Intelligence has enabled novel opportunities for autonomous and continuous decision support. While initial work has begun to explore how human morality can inform the decision making of future Artificial Intelligence applications, these approaches typically consider human morals as static and immutable. In this work, we present an initial exploration of the effect of context on human morality from a Utilitarian perspective. Through an online narrative transportation study, in which participants are primed with either a positive story, a negative story or a control condition (N = 82), we collect participants' perceptions on technology that has to deal with moral judgment in changing contexts. Based on an in-depth qualitative analysis of participant responses, we contrast participant perceptions to related work on Fairness, Accountability and Transparency. Our work highlights the importance of contextual morality for Artificial Intelligence and identifies opportunities for future work through a FACT-based (Fairness, Accountability, Context and Transparency) perspective.
Article
Full-text available
While crowd workers typically complete a variety of tasks on crowdsourcing platforms, there is no widely accepted method to successfully match workers to different types of tasks. Researchers have considered using worker demographics, behavioural traces, and prior task completion records to optimise task assignment. However, optimum task assignment remains a challenging research problem due to the limitations of proposed approaches, which in turn can have a significant impact on the future of crowdsourcing. We present 'CrowdCog', an online dynamic system that performs both task assignment and task recommendation by relying on fast-paced online cognitive tests to estimate worker performance across a variety of tasks. Our work extends prior work that highlights the effect of workers' cognitive ability on crowdsourcing task performance. Our study, deployed on Amazon Mechanical Turk, involved 574 workers and 983 HITs spanning four typical crowd tasks (Classification, Counting, Transcription, and Sentiment Analysis). Our results show that both our assignment method and our recommendation method yield a significant performance increase (5% to 20%) compared to generic or random task assignment. Our findings pave the way for the use of quick cognitive tests to provide robust recommendations and assignments to crowd workers.
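
A cognitive-test-based recommendation could, in the simplest case, score each task type against a worker's test profile and pick the best match. The sketch below is our own illustration, not the CrowdCog implementation; the test names and weights are invented for the example.

```python
# Illustrative sketch (not the CrowdCog implementation): recommend the task type
# for which a worker's quick cognitive-test profile predicts the best performance.
COGNITIVE_WEIGHTS = {
    # how much each (hypothetical) test is assumed to matter per task type
    "classification":     {"attention": 0.5, "working_memory": 0.2, "verbal": 0.3},
    "counting":           {"attention": 0.7, "working_memory": 0.3, "verbal": 0.0},
    "transcription":      {"attention": 0.3, "working_memory": 0.3, "verbal": 0.4},
    "sentiment_analysis": {"attention": 0.2, "working_memory": 0.2, "verbal": 0.6},
}

def recommend_task(test_scores: dict[str, float]) -> str:
    """Return the task type with the highest predicted score for this worker."""
    predicted = {
        task: sum(w * test_scores.get(test, 0.0) for test, w in weights.items())
        for task, weights in COGNITIVE_WEIGHTS.items()
    }
    return max(predicted, key=predicted.get)

worker = {"attention": 0.9, "working_memory": 0.6, "verbal": 0.4}
print(recommend_task(worker))  # -> "counting" for this attention-strong worker
```
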
Conference Paper
Full-text available
Inspired by the increasing prevalence of digital voice assistants, we demonstrate the feasibility of using voice interfaces to deploy and complete crowd tasks. We have developed Crowd Tasker, a novel system that delivers crowd tasks through a digital voice assistant. In a lab study, we validate our proof-of-concept and show that crowd task performance through a voice assistant is comparable to that of a web interface for voice-compatible and voice-based crowd tasks for native English speakers. We also report on a field study where participants used our system in their homes. We find that crowdsourcing through voice can provide greater flexibility to crowd workers by allowing them to work in brief sessions, enabling multi-tasking, and reducing the time and effort required to initiate tasks. We conclude by proposing a set of design guidelines for the creation of crowd tasks for voice and the development of future voice-based crowdsourcing systems.
Chapter
Full-text available
Machine learning is steadily growing in popularity – as is its demand for labeled training data. However, these datasets often need to be labeled by human domain experts in a labor-intensive process. Recently, a new area of research has formed around this process, called interactive labeling. While much research exists in this young and rapidly growing area, it lacks a systematic overview. In this paper, we strive to provide such an overview, along with a cluster analysis and an outlook on five avenues for future research. We identified 57 relevant articles, most of them investigating approaches for labeling images or text. Further, our findings indicate that there are two competing views of how the user can be treated: (a) as an oracle, where users are queried whether a label is right or wrong, versus (b) as a teacher, where users can offer deeper explanations in the interactive labeling process.
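
The oracle-versus-teacher distinction can be made concrete as two different user interfaces to the labeling loop. The sketch below is our own illustration of that contrast, not a design prescribed by the chapter.

```python
# Minimal sketch of the two competing views of the user identified in the review.
from typing import Protocol

class OracleUser(Protocol):
    def confirm(self, item: str, proposed_label: str) -> bool:
        """Oracle view: the user only says whether a proposed label is right or wrong."""

class TeacherUser(Protocol):
    def teach(self, item: str, proposed_label: str) -> tuple[str, str]:
        """Teacher view: the user returns a corrected label plus a free-text explanation."""

def oracle_round(user: OracleUser, item: str, proposed: str) -> str:
    # binary feedback: accept the proposal or flag it for relabeling
    return proposed if user.confirm(item, proposed) else "needs_relabeling"

def teacher_round(user: TeacherUser, item: str, proposed: str) -> tuple[str, str]:
    # richer feedback: the explanation can feed deeper model updates
    corrected, explanation = user.teach(item, proposed)
    return corrected, explanation
```
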
Conference Paper
Full-text available
Content moderation is an important element of social computing systems that facilitates positive social interaction on online platforms. Current solutions for moderation, including human moderation via commercial teams, are not effective and have failed to meet the demands of growing volumes of online user-generated content. Through a study in which we ask crowd workers to moderate tweets, we demonstrate that crowdsourcing is a promising solution for content moderation. We also report a strong relationship between the sentiment of a tweet and its appropriateness to appear in public media. Our analysis of worker responses further reveals several key factors that affect the judgement of crowd moderators when deciding on the suitability of text content. Our findings contribute towards the development of future robust moderation systems that utilise crowdsourcing.
Article
Full-text available
The suitability of crowdsourcing to solve a variety of problems has been investigated widely. Yet, there is still a lack of understanding about the distinct behavior and performance of workers within microtasks. In this paper, we first introduce a fine-grained, data-driven worker typology based on different dimensions and derived from behavioral traces of workers. Next, we propose and evaluate novel models of crowd worker behavior and show the benefits of behavior-based worker pre-selection using machine learning models. We also study the effect of task complexity on worker behavior. Finally, we evaluate our novel typology-based worker pre-selection method in image transcription and information finding tasks involving crowd workers completing 1,800 HITs. Our proposed method for worker pre-selection leads to a higher quality of results when compared to the standard practice of using qualification or pre-screening tests. Our method resulted in an accuracy increase of nearly 7% over the baseline for image transcription tasks and of almost 10% for information finding tasks, without a significant difference in task completion time. Our findings have important implications for crowdsourcing systems where a worker’s behavioral type is unknown prior to participation in a task. We highlight the potential of leveraging worker types to identify and aid those workers who require further training to improve their performance. Having proposed a powerful automated mechanism to detect worker types, we reflect on promoting fairness, trust and transparency in microtask crowdsourcing platforms.
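
In general terms, behavior-based pre-selection means training a model on behavioral-trace features from past tasks and admitting only workers predicted to deliver high-quality work. The sketch below is a generic illustration under invented features and toy data, not the authors' models or typology.

```python
# Illustrative sketch of behavior-based worker pre-selection (not the authors' method).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# toy features per worker: [mean time per item (s), scroll events per item, label changes per item]
X_train = np.array([[12.0, 5, 0.4], [3.0, 1, 0.1], [15.0, 7, 0.6], [2.5, 0, 0.0], [10.0, 4, 0.3]])
y_train = np.array([1, 0, 1, 0, 1])  # 1 = produced high-quality results previously

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

candidates = {"workerA": [11.0, 6, 0.5], "workerB": [2.0, 1, 0.0]}
selected = [w for w, feats in candidates.items() if model.predict([feats])[0] == 1]
print(selected)  # e.g. ['workerA'] — the more deliberate behavioral profile
```
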
Article
Quality improvement methods are essential to gathering high-quality crowdsourced data, both for research and industry applications. A popular and broadly applicable method is task assignment, which dynamically adjusts crowd workflow parameters. In this survey, we review task assignment methods that address heterogeneous task assignment, question assignment, and plurality problems in crowdsourcing. We discuss and contrast how these methods estimate worker performance, and highlight potential challenges in their implementation. Finally, we discuss future research directions for task assignment methods, and how crowdsourcing platforms and other stakeholders can benefit from them.
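
The "plurality problem" the survey mentions is the question of how many answers to collect per item. A common generic heuristic, shown below as our own textbook-style illustration rather than a method from the survey, is to keep requesting answers until the leading label is sufficiently far ahead.

```python
# Small illustration of an adaptive plurality rule for crowdsourced answers.
from collections import Counter

def collect_until_confident(get_next_answer, min_votes=3, max_votes=7, margin=2):
    """Request answers one by one; stop once the leading label is `margin` votes ahead."""
    votes = Counter()
    for n in range(1, max_votes + 1):
        votes[get_next_answer()] += 1
        if n >= min_votes:
            (top, top_count), *rest = votes.most_common()
            runner_up = rest[0][1] if rest else 0
            if top_count - runner_up >= margin:
                return top, n
    return votes.most_common(1)[0][0], max_votes

# usage with a fake answer stream
answers = iter(["cat", "cat", "dog", "cat"])
label, used = collect_until_confident(lambda: next(answers))
print(label, used)  # -> 'cat' after 4 votes
```
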
Conference Paper
Physio-adaptive systems are a class of information systems in which interaction is achieved by monitoring, analyzing, and responding to hidden psychophysiological user activity in real time. However, despite a strong interest of scholars and practitioners in physio-adaptive systems, there is no structured and systematic way to classify physio-adaptive systems research. Against this backdrop, this article showcases the current state of the art of physio-adaptive systems research along three stages, namely (1) collection of physiological data, (2) state determination, and (3) system adaptation. Analyzing 44 articles published between 1994 and 2019, our main contribution resides in a synopsis of the physio-adaptive systems literature along these stages. For instance, we illustrate that there exist three categories of adaptive responses: state display (20% of the analyzed studies), assistance offering (18%), and challenge adaptation (61%). On the grounds of our review, we propose seven promising avenues that will support scholars in pursuing future research in the field of physio-adaptive systems.
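
The three stages map naturally onto a small pipeline: collect signals, determine a state, adapt the system. The sketch below follows that structure and the three adaptation categories named in the review; the specific signals, thresholds, and responses are illustrative assumptions of ours.

```python
# Sketch of the three stages identified in the review: (1) collect physiological data,
# (2) determine the user's state, (3) adapt the system (state display, assistance,
# challenge adaptation). All signal names, thresholds, and responses are illustrative.
def collect_signals() -> dict:
    # stage 1: in a real system this would read from sensors (e.g., heart rate, EDA)
    return {"heart_rate": 95, "skin_conductance": 7.2}

def determine_state(signals: dict) -> str:
    # stage 2: map raw signals to a coarse state estimate
    if signals["heart_rate"] > 90 and signals["skin_conductance"] > 6:
        return "high_arousal"
    return "calm"

def adapt(state: str) -> dict:
    # stage 3: the three adaptation categories from the review
    return {
        "state_display": f"Current state: {state}",
        "assistance": "offer a short break or hint" if state == "high_arousal" else None,
        "challenge": "reduce task difficulty" if state == "high_arousal" else "keep difficulty",
    }

print(adapt(determine_state(collect_signals())))
```
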
Article
Labeling is the process of attaching information to some object. In machine learning, it is required as ground truth to leverage the potential of supervised techniques. A key challenge in labeling is that users are not necessarily eager to behave as simple oracles, that is, to repeatedly answer questions about whether a label is right or wrong. In this respect, scholars acknowledge designing interactivity in labeling systems as a promising area for further improvement. In recent years, a considerable number of articles focusing on interactive labeling systems have been published. However, there is a lack of consolidated principles for how to design these systems. In this article, we identify and discuss five design principles for interactive labeling systems based on a literature review and offer a frame for detecting common ground in the implementation of corresponding solutions. With these guidelines, we strive to contribute design knowledge for the increasingly important class of interactive labeling systems.
Conference Paper
We study the task of interactive semantic labeling of a segmentation hierarchy. To this end, we propose a framework interleaving two components: an automatic labeling step, based on a Conditional Random Field whose dependencies are defined by the inclusion tree of the segmentation hierarchy, and an interaction step that integrates incremental input from a human user. Evaluated on two distinct datasets, the proposed interactive approach efficiently integrates human interventions and illustrates the advantages of structured prediction in an interactive framework.
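
The interleaving itself can be sketched as a loop that alternates automatic inference with user corrections treated as hard constraints for the next pass. The code below is a schematic of that control flow with trivial stand-ins, not the authors' CRF implementation.

```python
# Schematic of interleaved automatic labeling and user correction (not the paper's code).
def interactive_labeling(segments, auto_label, ask_user, rounds=3):
    """auto_label(segments, fixed) -> {segment: label}; ask_user(labels) -> corrections dict."""
    fixed = {}  # user-provided labels treated as hard constraints
    labels = auto_label(segments, fixed)
    for _ in range(rounds):
        corrections = ask_user(labels)        # incremental human input
        if not corrections:
            break                             # user is satisfied
        fixed.update(corrections)
        labels = auto_label(segments, fixed)  # re-run structured prediction with constraints
        labels.update(fixed)
    return labels

# usage with trivial stand-ins for the automatic model and the user
segments = ["root", "region_1", "region_2"]
auto = lambda segs, fixed: {s: fixed.get(s, "background") for s in segs}
user_inputs = iter([{"region_1": "building"}, {}])
print(interactive_labeling(segments, auto, lambda labels: next(user_inputs)))
```
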
Article
Systems that can learn interactively from their end-users are quickly becoming widespread. Until recently, this progress has been fueled mostly by advances in machine learning; however, more and more researchers are realizing the importance of studying users of these systems. In this article we promote this approach and demonstrate how it can result in better user experiences and more effective learning systems. We present a number of case studies that demonstrate how interactivity results in a tight coupling between the system and the user, exemplify ways in which some existing systems fail to account for the user, and explore new ways for learning systems to interact with their users. After giving a glimpse of the progress that has been made thus far, we discuss some of the challenges we face in moving the field forward.
Article
Flow is a state of peak enjoyment, energetic focus, and creative concentration experienced by people engaged in adult play, which has become the basis of a highly creative approach to living.