Jahna Otterbacher
Open University of Cyprus · Social Information Systems
Ph.D., University of Michigan at Ann Arbor
109 Publications
15,769 Reads
2,457 Citations
Publications (109)
Evaluating the algorithmic behavior of interactive systems is complex and time-consuming. Developers increasingly recognize the importance of accountability for their algorithmic creations’ unanticipated behavior and resulting implications. To mitigate this phenomenon, developers not only need to concentrate on the observable inaccuracies that can...
With the surge in data-centric AI and its increasing capabilities, AI applications have become a part of our everyday lives. However, misunderstandings regarding their capabilities, limitations, and associated advantages and disadvantages are widespread. Consequently, in the university setting, there is a crucial need to educate not only computer s...
Artificial Intelligence (AI) is now everywhere, including in the classroom. Thus, it is crucial for teachers not only to be able to use AI, but also to understand it, as they bear the responsibility of teaching the next generation with and about AI. Although there have been extensive discussions about the importance of digital skills and knowledge...
In today’s world, children are increasingly interacting with AI technologies as part of their daily routines. However, many children may not fully grasp how these technologies function or their implications. Misconceptions about AI capabilities, risks, and benefits abound, underlining the importance of early education on the subject. Designing educ...
Crowdsourcing plays an important role in Web and social media research, from data annotation, to online experiments and user surveys. With the emergence of Generative AI (GenAI), researchers are considering how models and tools such as GPT might replace crowdwork. Many have already evaluated GPT on annotation tasks. However, it is less clear how Ge...
Algorithms have greatly advanced and become integrated into our everyday lives. Although they support humans in daily functions, they often exhibit unwanted behaviors perpetuating social stereotypes, discrimination, and other forms of biases. Regardless of their accuracy on task, many algorithms do not get scrutinized for unintended behaviors in a...
With the rise of data-driven AI and its "democratization", we are all interacting with AI-enabled technologies in everyday life. However, not everyone understands how these technologies work. There are many misconceptions surrounding what they can and cannot do, and what the risks and benefits are. Thus, there is a need to educate the general publi...
Data-driven algorithms are becoming more prevalent in our everyday lives, automating various decisions that can impact access to opportunities and resources. For this reason, much research concerns the ethical and social consequences of algorithms. Algorithmic digital marketing represents a key means by which people encounter and/or are affected by...
Mitigating bias in algorithmic systems is a critical issue drawing attention across communities within the information and computer sciences. Given the complexity of the problem and the involvement of multiple stakeholders—including developers, end users, and third-parties—there is a need to understand the landscape of the sources of bias, and the...
Much attention has been on the behaviors of computer vision services when describing images of people. Audits have revealed rampant biases that could lead to harm, when services are used by developers and researchers. We focus on temporal auditing, replicating experiments originally conducted three years ago. We document the changes observed over t...
Digital marketing, and specifically targeted online marketing, has flourished in recent years and is becoming ever more precise and easy to implement, given the rise of big data and algorithmic processes. This study assesses users’ perceptions of fairness in algorithmic targeted marketing under conditions of scarcity. This is increasingly...
The unprecedented events of the COVID-19 pandemic have generated an enormous amount of information and populated the Web with new content relevant to the pandemic and its implications. Visual information such as images has been shown to be crucial in the context of scientific communication. Images are often interpreted as being closer to the truth...
In this work, we investigate how students in fields adjacent to algorithms development perceive fairness, accountability, transparency, and ethics in algorithmic decision-making. Participants (N = 99) were asked to rate their agreement with statements regarding six constructs that are related to facets of fairness and justice in algorithmic decisio...
Professionals are increasingly relying on algorithmic systems for decision making; however, algorithmic decisions are occasionally perceived as biased or unjust. Prior work has provided evidence that education can make a difference in young developers' perceptions of algorithmic fairness. In this paper, we investigate computer science students'...
Image tagging APIs, offered as Cognitive Services in the movement to democratize AI, have become popular in applications that need to provide a personalized user experience. Developers can easily incorporate these services into their applications; however, little is known concerning their behavior under specific circumstances. We consider how two s...
During times of crisis, information access is crucial. Given the opaque processes behind modern search engines, it is important to understand the extent to which the “picture” of the Covid-19 pandemic accessed by users differs. We explore variations in what users “see” concerning the pandemic through Google image search, using a two-step approach....
The first FATE Winter School, organized by the Cyprus Center for Algorithmic Transparency (CyCAT), provided a forum for both students and senior researchers to examine the complex topic of Fairness, Accountability, Transparency and Ethics (FATE). Through a program that included two invited keynotes, as well as sessions led by CyCAT partners a...
As the role of algorithmic systems and processes increases in society, so does the risk of bias, which can result in discrimination against individuals and social groups. Research on algorithmic bias has exploded in recent years, highlighting both the problems of bias, and the potential solutions, in terms of algorithmic transparency (AT). Transpar...
Mitigating bias in algorithmic systems is a critical issue drawing attention across communities within the information and computer sciences. Given the complexity of the problem and the involvement of multiple stakeholders, including developers, end-users and third-parties, there is a need to understand the landscape of the sources of bias, and the...
While professionals are increasingly relying on algorithmic systems for making a decision, on some occasions, algorithmic decisions may be perceived as biased or not just. Prior work has looked into the perception of algorithmic decision-making from the user's point of view. In this work, we investigate how students in fields adjacent to algorithm...
Machine-learned computer vision algorithms for tagging images are increasingly used by developers and researchers, having become popularized as easy-to-use "cognitive services." Yet these tools struggle with gender recognition, particularly when processing images of women, people of color and non-binary individuals. Socio-technical researchers have...
Image analysis algorithms have become an indispensable tool in our information ecosystem, facilitating new forms of visual communication and information sharing. At the same time, they enable large-scale socio-technical research which would otherwise be difficult to carry out. However, their outputs may exhibit social bias, especially when analyzin...
There are increasing expectations that algorithms should behave in a manner that is socially just. We consider the case of image tagging APIs and their interpretations of people images. Image taggers have become indispensable in our information ecosystem, facilitating new modes of visual communication and sharing. Recently, they have become widely...
Crowdsourcing plays a key role in developing algorithms for image recognition or captioning. Major datasets, such as MS COCO or Flickr30K, have been built by eliciting natural language descriptions of images from workers. Yet such elicitation tasks are susceptible to human biases, including stereotyping people depicted in images. Given the growing...
Image recognition algorithms that automatically tag or moderate content are crucial in many applications but are increasingly opaque. Given transparency concerns, we focus on understanding how algorithms tag people images and their inferences on attractiveness. Theoretically, attractiveness has an evolutionary basis, guiding mating behaviors, altho...
Image analysis algorithms have been a boon to personalization in digital systems and are now widely available via easy-to-use APIs. However, it is important to ensure that they behave fairly in applications that involve processing images of people, such as dating apps. We conduct an experiment to shed light on the factors influencing the perception...
It is our great pleasure to welcome you to the Second FairUMAP workshop at UMAP 2019. This full-day workshop brings together researchers working at the intersection of user modeling, adaptation, and personalization on one hand, and bias, fairness and transparency in algorithmic systems on the other hand. The workshop was motivated by the observatio...
When modelling for the social we need to consider more than one medium. Little is known as to how platform community characteristics shape the discussion and how communicators could best engage each community, taking into consideration these characteristics. We consider comments on TED videos featuring roboticists, shared at TED.com and YouTube. We...
Journalists and researchers alike have claimed that IR systems are socially biased, returning results to users that perpetuate gender and racial stereotypes. In this position paper, I argue that IR researchers and in particular, evaluation communities such as CLEF, can and should address such concerns. Using as a guide the Principles for Algorithmi...
Many advocate for artificial agents to be empathic. Crowdsourcing could help, by facilitating human-in-the-loop approaches and data set creation for visual emotion recognition algorithms. Although crowdsourcing has been employed successfully for a range of tasks, it is not clear how effective crowdsourcing is when the task involves subjective ratin...
Over the last several years, research on our elected officials’ use of social media as a political communication platform has greatly increased. While the bulk of social media-related research focuses on elections, social media-traditional media connections, or the effect of politicians’ social media communications on people’s attitudes and opinion...
There is growing evidence that search engines produce results that are socially biased, reinforcing a view of the world that aligns with prevalent social stereotypes. One means to promote greater transparency of search algorithms - which are typically complex and proprietary - is to raise user awareness of biased result sets. However, to date, litt...
Although social media can fuel jealousy between romantic partners by providing a convenient and socially acceptable means of monitoring one another’s online behavior, little has been written about the possible role of social media in dating violence. We examine if and how social media behaviors fuel victimization during physical interactions....
In this paper, a novel content relatedness algorithm for social media content is proposed, based on the Explicit Semantic Analysis (ESA) technique. The proposed scheme takes into consideration social interactions. In particular, starting from the vector space representation model, similarity is expressed by a summation of term weight products. In thi...
Social media like Facebook or Twitter have become an entry point to news for many readers. In that scenario, the headline is the most prominent – and often the only visible – part of the news article. We propose a novel task of using only headlines to predict the popularity of news articles. The prediction model is evaluated on headlines from two m...
A large proportion of audiences read news online, often accessing news articles through social media like Facebook or Twitter. A distinguishing characteristic of news on social media is that the most prominent (and often the only visible) part of the news article is the headline. We investigate the impact of headline characteristics, including jour...
Gender stereotypes are strong influences on human behavior. Given our tendency to anthropomorphize, incorporating gender cues into a robot's design can influence acceptance by humans. However, little is known about the interaction between human and robot gender. We focus on the role of gender in eliciting negative, "uncanny" reactions from observe...
There is much concern about algorithms that underlie information services and the view of the world they present. We develop a novel method for examining the content and strength of gender stereotypes in image search, inspired by the trait adjective checklist method. We compare the gender distribution in photos retrieved by Bing for the query “pers...
In the November 2014 Midterm Elections, social media was used by more members of Congress and with greater frequency than ever before. We employ an established metric for interpreting the short but influential posts made by members of Congress via Twitter to determine how they position themselves relative to other politicians, candidates, and issue...
Linguistic mimicry, the adoption of another’s language patterns, is a subconscious behavior with pro-social benefits. However, some professions advocate its conscious use in empathic communication. This involves mutual mimicry; effective communicators mimic their interlocutors, who also mimic them back. Since mimicry has often been studied in face-...
Games with a Purpose (GWAP) is a popular approach for metadata creation, enabling institutions to collect descriptions of digital artifacts on a mass scale. Creating metadata is challenging not only because one must recognize the artifact; the description must then be encoded into natural language. Language behaviors are influenced by many social f...
Our behaviors often converge with those of others, and language within social media is no exception. We consider reviews of tourist attractions at TripAdvisor (TA), the world's largest resource for travel information. Unlike social networking sites, TA review forums do not facilitate direct interaction between participants. Nonetheless, theory sugg...
Social media holds the potential to facilitate vertical political communication by giving citizens the opportunity to interact directly with their representatives. However, skeptics claim that even when politicians use "interactive media," they avoid direct engagement with constituents, using technology to present a façade of interactivity instead...
As Twitter becomes a more common means for officials to communicate with their constituents, it becomes more important that we understand how officials use these communication tools. Using data from 380 members of Congress' Twitter activity during the winter of 2012, we find that officials frequently use Twitter to advertise their political positio...
Enthusiasts propose that social media promotes vertical political communication, giving citizens the opportunity to interact directly with their representatives. However, skeptics claim that politicians avoid direct engagement with constituents, using technology to present a façade of interactivity instead. This study explores if and how elected of...
Online review forums provide consumers with essential information about goods and services by facilitating word-of-mouth communication. Although preferences are correlated with demographic characteristics, reviewer gender is not often provided on user profiles. We consider the case of the Internet Movie Database (IMDb), where users exchange views...
Women and men communicate differently in both face-to-face and computer-mediated environments. We study linguistic patterns considered gendered in reviews contributed to the Internet Movie Database. IMDb has been described as a male-majority community, in which females contribute fewer reviews and enjoy less prestige than males. Analyzing reviews p...
Classic news summarization plays an important role given the exponential growth of documents on the Web. Many approaches have been proposed to generate summaries, but they seldom consider the evolutionary characteristics of news in addition to traditional summary elements. Therefore, we present a novel framework for the web mining problem named Evolutionary Tim...
Review communities typically display contributions in list format, using participant feedback in determining presentation order. Given the volume of contributions, which are likely to be seen? While previous work has focused on content, we examine the relationship between communication tactics and prominence. We study three communities, comparing f...
Social voting plays a key role in the organization of user-contributed content; readers are asked to indicate what they “like” or find “helpful,” with collected votes then used to prioritize valued content. Despite the popularity of these mechanisms, little is known as to how users employ and interpret this feedback. We conducted a study in which p...
How public officials communicate with their constituents and the public as a whole has been explored in terms of the traditional media, but research on their communication via social media has been, at best, largely descriptive and, at worst, unreliable for its lack of rigor. This paper assesses connections between Members of Congress’s communicati...
Despite differences in the way that men and women experience goods and communicate their perspectives, online review communities typically do not provide participants' gender. We propose to infer author gender, given a set of reviews of a particular item, and experiment on reviews posted at the Internet Movie Database (IMDb). Using logistic regress...
Online communities displaying textual postings require measures to combat information overload. One popular approach is to ask participants whether or not messages are helpful in order to then guide others to interesting content. Adopting a well-established framework for assessing data quality, we examine the nature of "helpfulness." We study consum...
We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user’s natural language question. We th...
Web 2.0 has allowed online shoppers to become not just information seekers but also information providers. Many e-commerce venues allow users to share their experiences in the form of textual product reviews. While textual product reviews represent a wealth of information for candidate buyers, finding pertinent information becomes difficult, as the...
Purpose: Automated sentence-level relevance and novelty detection would be of direct benefit to many information retrieval systems. However, the low level of agreement between human judges performing the task is an issue of concern. In previous approaches, annotators were asked to identify sentences in a document set that are relevant to a given top...
When an important event happens, such as a terrorist attack or natural disaster, many people turn to the World Wide Web to keep track of the most current information. Because large numbers of online agencies report on such events, and continually update their stories, the Web provides timely access to a variety of perspectives. However, following f...
Access to information via handheld devices supports decision making away from one's computer. However, limitations include small screens and constrained wireless bandwidth. We present a summarization method that transforms online content for delivery to small devices. Unlike previous algorithms, ours assumes nothing about document formatting, and...
News articles about the same event published over time have properties that challenge NLP and IR applications. A cluster of such texts typically exhibits instances of paraphrase and contradiction, as sources update the facts surrounding the story, often due to an ongoing investigation. The current hypothesis is that the stories "evolve" over time,...
Virtual communities often suffer from a number of problems, including questionable information quality and information overload, which threaten their utility and stability. To address this, social filtering techniques may be used, in which users rate the postings, guiding others to the important ones. This method is contrasted to information retrie...
Using the Web, consumers not only find product characteristics from manufacturers and sellers; they can also exchange opinions with other third parties. Learning about such "experience attributes" builds confidence in purchasing decisions and establishes trust between parties in transactions. However, little is known about the search process for th...
We consider the problem of tracking information over time and across sources in clusters of news stories about emergency events. While previous approaches have focused on finding new information at the document and sentence levels (e.g. TDT FSD and the TREC Novelty track, respectively), we are interested in following information at the factual l...
We study the adoption of translation support technologies by professors at a multilingual university, using the framework of the Technology Adoption Model (TAM). TAM states that a user’s perceived usefulness and ease of use for the technology ultimately determines her actual use of it. Through a survey and a set of interviews with our subjects, we...
We present an evaluation of a novel hierarchical text summarization method that allows users to view summaries of Web documents from small, mobile devices. Unlike previous approaches, ours does not require the documents to be in HTML since it infers a hierarchical structure automatically. Currently, the method is used to summarize news articl...
Methods for detecting sentences in an input document set, which are both relevant and novel with respect to an information need, would be of direct benefit to many systems, such as extractive text summarizers. However, satisfactory levels of agreement between judges performing this task manually have yet to be demonstrated, leaving researchers to conc...
We consider the problem of question-focused sentence retrieval from complex news articles describing multi-event stories published over time. Annotators generated a list of questions central to understanding each story in our corpus. Because of the dynamic nature of the stories, many questions are time-sensitive (e.g. "How many victims have been fo...
NewsInEssence (NIE) is a new delivery and summarization system under development at the University of Michigan, which gathers and recaps news items based on the specifications and interests of online news readers. NIE searches across dozens of news sites to collect a group, or cluster, of related stories and then generates a summary of the entire clust...
We present WAP MEAD, a WAP-enabled text summarization system. It incorporates a state-of-the-art text summarizer enhanced to produce hierarchical summaries that are appropriate for various types of mobile devices, including cellular phones.
Paraphrases and other semantically related sentences present a challenge to NLP and IR applications such as multi-document summarization and question answering systems. While it is generally agreed that paraphrases contain approximately equivalent ideas, they often differ from one another in subtle, yet non-trivial, ways. In this paper, we examine...