Chapter

Suggestions for Online User Studies: Sharing Experiences from the Use of Four Platforms

Authors:

Abstract

During exceptional times when researchers do not have physical access to users of technology, the importance of remote user studies increases. We provide recommendations based on lessons learned from conducting online user studies utilizing four online research platforms (Appen, MTurk, Prolific, and Upwork). Our recommendations aim to help those inexperienced with online user studies. They are also useful for those who want to become more proficient in this increasingly important research methodology for studying people’s interactions with technology and information.
... The pre-screening of study subjects is often regarded as a best practice when conducting online studies [41]. Hence, we used Prolific's built-in qualification features to recruit participants based on their gender (i.e., to ensure balance within study groups), social media usage (Instagram), and language skills (German). In addition, we targeted users who already took part in at least 10 other studies as these are usually more committed and less likely to drop from experiments [41]. Last but not least, we also tried to keep the length of both the individual surveys and the full experiment as short as possible to reduce participants' fatigue and minimize the chances of attrition. ...
Preprint
Online self-disclosure is perhaps one of the last decade's most studied communication processes, thanks to the introduction of Online Social Networks (OSNs) like Facebook. Self-disclosure research has contributed significantly to the design of preventative nudges seeking to support and guide users when revealing private information in OSNs. Still, assessing the effectiveness of these solutions is often challenging since changing or modifying the choice architecture of OSN platforms is practically unfeasible. In turn, the effectiveness of numerous nudging designs is supported primarily by self-reported data instead of actual behavioral information. This work presents ENAGRAM, an app for evaluating preventative nudges, and reports the first results of an empirical study conducted with it. Such a study aims to showcase how the app (and the data collected with it) can be leveraged to assess the effectiveness of a particular nudging approach. We used ENAGRAM as a vehicle to test a risk-based strategy for nudging the self-disclosure decisions of Instagram users. For this, we created two variations of the same nudge and tested it in a between-subjects experimental setting. Study participants (N=22) were recruited via Prolific and asked to use the app regularly for 7 days. An online survey was distributed at the end of the experiment to measure some privacy-related constructs. From the data collected with ENAGRAM, we observed lower (though non-significant) self-disclosure levels when applying risk-based interventions. The constructs measured with the survey were not significant either, except for participants' External Information Privacy Concerns. Our results suggest that (i) ENAGRAM is a suitable alternative for conducting longitudinal experiments in a privacy-friendly way, and (ii) it provides a flexible framework for the evaluation of a broad spectrum of nudging solutions.
... They had to be at least 18 years old to join the study and were rewarded with GBP 2.00 (paid through Prolific) for a completed survey that took 15 minutes on average. As a standard quality control, we targeted users who already took part in at least 10 other studies and had a minimum approval rate of 98% [30]. Two attention questions were also included in the survey to identify and discard answers from unengaged participants. ...
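The quality controls named in the excerpt above (at least 10 prior studies, an approval rate of at least 98%, and two attention questions) can be sketched as a simple filter. This is only an illustration: the `Response` record and its field names are hypothetical, Prolific normally applies the first two criteria as screeners before recruitment rather than afterwards, and the rule that both attention questions must be passed is an assumption, since the excerpt does not state the discard rule.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record layout, not an export format of any platform.
    participant_id: str
    prior_studies: int                   # studies completed before this one
    approval_rate: float                 # percentage, 0-100
    attention_passed: tuple[bool, bool]  # outcome of the two attention questions


def is_eligible(r: Response, min_studies: int = 10, min_approval: float = 98.0) -> bool:
    """Pre-screening thresholds quoted in the excerpt."""
    return r.prior_studies >= min_studies and r.approval_rate >= min_approval


def is_engaged(r: Response) -> bool:
    """Assumption: keep a response only if both attention questions were passed."""
    return all(r.attention_passed)


def keep_valid(responses: list[Response]) -> list[Response]:
    """Return the responses that satisfy every quality control."""
    return [r for r in responses if is_eligible(r) and is_engaged(r)]


sample = [
    Response("p1", 25, 99.0, (True, True)),
    Response("p2", 8, 100.0, (True, True)),   # too few prior studies
    Response("p3", 40, 97.5, (True, True)),   # approval rate below 98%
    Response("p4", 12, 98.0, (True, False)),  # failed one attention question
]
valid_ids = [r.participant_id for r in keep_valid(sample)]
print(valid_ids)
```

Keeping eligibility and engagement as separate functions makes it easy to report how many responses each control removed.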
Conference Paper
Social Coding Platforms (SCPs) like GitHub have become central to modern software engineering thanks to their collaborative and version-control features. Like in mainstream Online Social Networks (OSNs) such as Facebook, users of SCPs are subjected to privacy attacks and threats given the high amounts of personal and project-related data available in their profiles and software repositories. However, unlike in OSNs, the privacy concerns and practices of SCP users have been neither extensively explored nor documented in the current literature. In this work, we present the preliminary results of an online survey (N=105) addressing developers' concerns and perceptions about privacy threats stemming from SCPs. Our results suggest that, although users express concern about social and organisational privacy threats, they often feel safe sharing personal and project-related information on these platforms. Moreover, attacks targeting the inference of sensitive attributes are considered more likely than those seeking to re-identify source-code contributors. Based on these findings, we propose a set of recommendations for future investigations addressing privacy and identity management in SCPs.
Chapter
The artificial intelligence industry has been essential in creating new jobs for the deployment of real-world solutions. As a result, the implementation of these new jobs involves the execution of multiple human intelligence micro-tasks, such as data labeling tasks for training machine learning models. The workers who perform those tasks, also known as crowd workers, usually are independent workers within crowdsourcing platforms. These platforms are subject to the free market, where the forces of supply and demand produce various power dynamics among stakeholders. As a result, disassociation between stakeholders often generates unbalanced power dynamics where workers are paid below minimum wage and are intimidated to keep their reputation or face termination. Within this chapter, we introduce computational techniques to audit the workplace conditions of crowd workers and design tools to address these power imbalances, as a positive and more efficient alternative for the labor conditions of crowd workers. This chapter develops these objectives through the design and evaluation of three tools in digital labor platforms: “Invisible Labor Tracker,” “Reputation Agent,” and “CultureFit,” which we describe below. We will demonstrate the sustainability of systems that point to a future where AI can be used to audit and address power imbalances in the workplace.
Conference Paper
Online self-disclosure is perhaps one of the last decade’s most studied communication processes, thanks to the introduction of Online Social Networks (OSNs) like Facebook. Self-disclosure research has contributed significantly to the design of preventative nudges seeking to support and guide users when revealing private information in OSNs. Still, assessing the effectiveness of these solutions is often challenging since changing or modifying the choice architecture of OSN platforms is practically unfeasible. In turn, the effectiveness of numerous nudging designs is supported primarily by self-reported data instead of actual behavioral information. Objective: This work presents ENAGRAM, an app for evaluating preventative nudges, and reports the first results of an empirical study conducted with it. Such a study aims to showcase how the app (and the data collected with it) can be leveraged to assess the effectiveness of a particular nudging approach. Method: We used ENAGRAM as a vehicle to test a risk-based strategy for nudging the self-disclosure decisions of Instagram users. For this, we created two variations of the same nudge (i.e., with and without risk information) and tested it in a between-subjects experimental setting. Study participants (N=22) were recruited via Prolific and asked to use the app regularly for 7 days. An online survey was distributed at the end of the experiment to measure some privacy-related constructs. Results: From the data collected with ENAGRAM, we observed lower (though non-significant) self-disclosure levels when applying risk-based interventions. The constructs measured with the survey were not significant either, except for participants’ External Information Privacy Concerns (EIPC). Implications: Our results suggest that (i) ENAGRAM is a suitable alternative for conducting longitudinal experiments in a privacy-friendly way, and (ii) it provides a flexible framework for the evaluation of a broad spectrum of nudging solutions.
Article
Faculty who engage students as participants in their qualitative research often encounter methodological and ethical problems. Ethical issues arise from the fiduciary relationship between faculty and their students, and violations of that relationship occur when the educator has a dual role as researcher with those students. Methodological issues arise from research designs to address these ethical issues. This conflict is particularly evident in faculty research on pedagogy in their own disciplines, for which students are necessary as participants but are captive in the relationship. In this article, the authors explore the issues of double agency when faculty involve students as participants in their research.
Conference Paper
Investigating users’ engagement with interactive persona systems can yield crucial insights for the design of such systems. Using eye-tracking, researchers can address the scarcity of behavioral user studies, even during times when physical user studies are difficult or impossible to carry out. In this research, we implement a webcam-based eye-tracking module into an interactive persona system, facilitating remote user studies. Findings from the implementation can show what information users pay attention to in persona profiles.
Conference Paper
Though photographs of real people are typically used to portray personas, there is little research into the potential advantages or disadvantages of using such images, relative to other image styles. We conducted an experiment with 149 participants, testing the effects of six different image styles on user perceptions and personality traits that are attributed to personas by the participants. Results show that perceptions of clarity, completeness, consistency, credibility, and empathy for a persona increase with picture realism. Personas with more realistic pictures are also perceived as more agreeable, open, and emotionally stable, with higher confidence in these assessments. We also find evidence of the uncanny valley effect, with realistic cartoon personas experiencing a decrease in the user perception scores.
Article
Background: The COVID-19 pandemic necessitated "going remote" with the delivery, support, and assessment of a study intervention targeting older adults enrolled in a clinical trial. While remotely delivering and assessing technology is not new, there are few methods available in the literature that are proven to be effective with diverse populations, and none for older adults specifically. Older adults comprise a very diverse population, including in terms of their experience with and access to technology, making this a challenging endeavor. Objective: Our objective was to remotely deliver and conduct usability testing for a mobile health technology intervention for older adult participants enrolled in a clinical trial of the technology. This paper describes the methodology used, its successes, and its limitations. Methods: We developed a conceptual model for remote operations, called the Framework for Agile and Remote Operations (FAR Ops), that combined the general requirements for spaceflight operations with Agile project management processes to quickly respond to this challenge. Using this framework, we iteratively created "care packages" that differed in their contents based on participant needs and were sent to study participants in order to deliver the study intervention (a medication management app) and assess its usability. Usability data was collected using the System Usability Scale (SUS) and a novel usability questionnaire developed to collect more in-depth data. Results: In the first 6 months of the project, we successfully delivered 21 care packages. We succeeded in designing and deploying a minimum viable product in less than 6 weeks, generally maintained a 2-week sprint cycle, and achieved a 40-50% return rate for both usability assessment instruments. We hypothesize that lack of engagement due to the pandemic and our use of asynchronous communication channels contributed to the return rate of usability assessments being lower than desired. 
We also provide general recommendations for performing remote usability testing with diverse populations based on the results of our work, including implementing screen-sharing capabilities when possible, and determining participant preference for phone or email communications. Conclusions: The FAR Ops model allowed our team to adopt remote operations for our mHealth trial, in response to interruptions from COVID-19. This approach can be useful for other research or practical projects, under similar circumstances or to improve efficiency, cost, effectiveness, and participant diversity in general. In addition to offering a replicable approach, this paper tells the often-untold story of practical challenges faced by mobile health projects and practical strategies used to address them.
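For readers unfamiliar with the System Usability Scale mentioned in this abstract, the sketch below shows the standard SUS scoring rule: ten items rated 1 to 5, odd-numbered items contributing (answer - 1), even-numbered items contributing (5 - answer), and the sum multiplied by 2.5 to give a score from 0 to 100. It is not the authors' code, and the function name `sus_score` is chosen here for illustration.

```python
def sus_score(answers: list[int]) -> float:
    """Standard SUS score for ten items rated 1 (strongly disagree) to 5 (strongly agree)."""
    if len(answers) != 10 or any(a < 1 or a > 5 for a in answers):
        raise ValueError("SUS needs ten answers, each between 1 and 5")
    total = 0
    for index, answer in enumerate(answers):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - answer
    return total * 2.5


neutral = sus_score([3] * 10)
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
print(neutral, best)
```

Because a returned questionnaire with a skipped item cannot be scored this way, the function raises an error instead of guessing a value for the missing answer.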
Article
Data-driven personas are a significant advancement in the fields of human-centered informatics and human-computer interaction. Data-driven personas enhance user understanding by combining the empathy inherent in personas with the rationality inherent in analytics using computational methods. Via the employment of these computational methods, the data-driven persona method permits the use of large-scale user data, which is a novel advancement in persona creation. A common approach for increasing stakeholder engagement about audiences, customers, or users, persona creation remained relatively unchanged for several decades. However, the availability of digital user data, data science algorithms, and easy access to analytics platforms provide avenues and opportunities to enhance personas from often sketchy representations of user segments to precise, actionable, interactive decision-making tools: data-driven personas! Using the data-driven approach, the persona profile can serve as an interface to a fully functional analytics system that can present user representation at various levels of information granularity for more task-aligned user insights. We trace the techniques that have enabled the development of data-driven personas and then conceptually frame how one can leverage data-driven personas as tools for both empathizing with and understanding users. Presenting a conceptual framework consisting of (a) persona benefits, (b) analytics benefits, and (c) decision-making outcomes, we illustrate the framework with practical use cases in system design, digital marketing, and content creation. We then present an overview of a fully functional data-driven persona system as an example of multi-level information aggregation needed for decision making about users.
We demonstrate that data-driven persona systems can provide critical, empathetic, and user understanding functionalities for anyone needing such insights.
Article
This paper describes an approach to improving the reliability of a crowdsourced labeling task for which there is no objective right answer. Our approach focuses on three contingent elements of the labeling task: data quality, worker reliability, and task design. We describe how we developed and applied this framework to the task of labeling tweets according to their interestingness. We use in-task CAPTCHAs to identify unreliable workers, and measure inter-rater agreement to decide whether subtasks have objective or merely subjective answers.
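The abstract above does not say which inter-rater agreement statistic is used. As an illustration only, the sketch below computes Cohen's kappa, one common chance-corrected measure for two raters; with more than two crowd workers per item, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be needed instead. The labels and the function name `cohens_kappa` are made up for the example.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters who labeled the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater assigned labels independently
    # according to their own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels for four tweets.
worker_1 = ["interesting", "interesting", "boring", "boring"]
worker_2 = ["interesting", "boring", "boring", "boring"]
kappa = cohens_kappa(worker_1, worker_2)
print(round(kappa, 3))
```

A kappa near 1 would suggest a subtask has an answer that workers converge on, while values near 0 indicate agreement no better than chance, which is the kind of distinction the abstract draws between objective and subjective subtasks.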
Article
The COVID-19 pandemic has shaken the world to its core and has provoked an overnight exodus of developers who normally worked in an office setting to working from home. The magnitude of this shift and the factors that have accompanied this new unplanned work setting go beyond what the software engineering community has previously understood to be remote work. To find out how developers and their productivity were affected, we distributed two surveys (with a combined total of 3,634 responses that answered all required questions) weeks apart to understand the presence and prevalence of the benefits, challenges, and opportunities to improve this special circumstance of remote work. From our thematic qualitative analysis and statistical quantitative analysis, we find that there is a dichotomy of developer experiences influenced by many different factors (that for some are a benefit, while for others a challenge). For example, being close to family members was a benefit for some, but for others, having family members share their working space and interrupt their focus was a challenge. Our surveys led to powerful narratives from respondents and revealed the scale at which these experiences exist to provide insights as to how the future of (pandemic) remote work can evolve.
Article
AI image captioning challenges encourage broad participation in designing algorithms that automatically create captions for a variety of images and users. To create large datasets necessary for these challenges, researchers typically employ a shared crowdsourcing task design for image captioning. This paper discusses findings from our thematic analysis of 1,064 comments left by Amazon Mechanical Turk workers using this task design to create captions for images taken by people who are blind. Workers discussed difficulties in understanding how to complete this task, provided suggestions of how to improve the task, gave explanations or clarifications about their work, and described why they found this particular task rewarding or interesting. Our analysis provides insights both into this particular genre of task as well as broader considerations for how to employ crowdsourcing to generate large datasets for developing AI algorithms.