Kristina Lerman

University of Southern California | USC · Information Sciences Institute


372
Publications
81,159
Reads
11,611
Citations
Citations since 2016
198 Research Items
6737 Citations
[Chart: citations per year, 2016 to 2022; vertical axis 0 to 1,500]
Introduction
I am a Project Leader at the Information Sciences Institute and hold a joint appointment as a Research Associate Professor in the USC Viterbi School of Engineering's Computer Science Department. My research focuses on applying network- and machine learning-based methods to problems in social computing.
Additional affiliations
June 1998 - present
University of Southern California
Position
  • Project Manager


Publications (372)
Preprint
Full-text available
The need for emotional inference from text continues to diversify as more and more disciplines integrate emotions into their theories and applications. These needs include inferring different emotion types, handling multiple languages, and different annotation formats. A shared model between different configurations would enable the sharing of know...
Preprint
Full-text available
Detecting emotions expressed in text has become critical to a range of fields. In this work, we investigate ways to exploit label correlations in multi-label emotion recognition models to improve emotion detection. First, we develop two modeling approaches to the problem in order to capture word associations of the emotion words themselves, by eith...
Preprint
Full-text available
Morality plays an important role in culture, identity, and emotion. Recent advances in natural language processing have shown that it is possible to classify moral values expressed in text at scale. Morality classification relies on human annotators to label the moral expressions in text, which provides training data to achieve state-of-the-art per...
Article
Full-text available
Diversity in science is necessary to improve innovation and increase the capacity of the scientific workforce. Despite decades-long efforts to increase gender diversity, however, women remain a small minority in many fields, especially in senior positions. The dearth of elite women scientists, in turn, leaves fewer women to serve as mentors and rol...
Preprint
Full-text available
The scaling relations between city attributes and population are emergent and ubiquitous aspects of urban growth. Quantifying these relations and understanding their theoretical foundation, however, is difficult due to the challenge of defining city boundaries and a lack of historical data to study city dynamics over time and space. To address this...
Article
Full-text available
Change point detection has many practical applications, from anomaly detection in data to scene changes in robotics; however, finding changes in high dimensional data is an ongoing challenge. We describe a self-training model-agnostic framework to detect changes in arbitrarily complex data. The method consists of two steps. First, it labels data as...
Article
Full-text available
Measurement(s) Stress • Burnout • Affect • Depression • Sleep • Physical Activity Measurement • Alcohol Use History • Frequency Any Tobacco Use • Personality • Social Support • Intragroup Conflict • Challenge and Hindrance Stressors • Demographics • Context and Atypical Events • Daily Stressors • Most Stressful Event • Work Context • Job Performanc...
Article
Road networks represent a key component of human settlements, such as cities, towns, and villages, that mediate pollution and congestion, as well as economic development. However, little is known about the long-term development trajectories of road networks in rural and urban settings. We leverage novel spatial data sources to reconstruct and analy...
Article
Full-text available
As part of the DARPA SocialSim challenge, we address the problem of predicting behavioral phenomena including information spread involving hundreds of thousands of users across three major linked social networks: Twitter, Reddit and GitHub. Our approach develops a framework for data-driven agent simulation that begins with a discrete-event simulati...
Preprint
Full-text available
Stance detection infers a text author's attitude towards a target. This is challenging when the model lacks background knowledge about the target. Here, we show how background knowledge from Wikipedia can help enhance the performance on stance detection. We introduce Wikipedia Stance Detection BERT (WS-BERT) that infuses the knowledge into stance e...
Article
Preferential attachment, homophily, and their consequences such as scale-free (i.e. power-law) degree distributions, the glass ceiling effect (the unseen, yet unbreakable barrier that keeps minorities and women from rising to the upper rungs of the corporate ladder, regardless of their qualifications or achievements) and perception bias are well-st...
Preprint
Full-text available
While developments in machine learning led to impressive performance gains on big data, many human subjects data are, in actuality, small and sparsely labeled. Existing methods applied to such data often do not easily generalize to out-of-sample subjects. Instead, models must make predictions on test data that may be drawn from a different distribu...
Preprint
The COVID-19 pandemic has upended daily life around the globe, posing a threat to public health. Intuitively, we expect that surging cases and deaths would lead to fear, distress and other negative emotions. However, using state-of-the-art methods to measure emotions and moral concerns in social media messages posted in the early stage of the pande...
Preprint
Full-text available
Algorithms that aid human tasks, such as recommendation systems, are ubiquitous. They appear in everything from social media to streaming videos to online shopping. However, the feedback loop between people and algorithms is poorly understood and can amplify cognitive and social biases (algorithmic confounding), leading to unexpected outcomes. In t...
Article
The COVID-19 pandemic has posed unprecedented challenges to public health world-wide. To make decisions about mitigation strategies and to understand the disease dynamics, policy makers and epidemiologists must know how the disease is spreading in their communities. Here we analyse confirmed infections and deaths over multiple geographic scales to...
Preprint
Full-text available
The popularity of online gaming has grown dramatically, driven in part by streaming and the billion-dollar e-sports industry. Online games regularly update their software to fix bugs, add functionality that improves the game's look and feel, and change the game mechanics to keep the games fun and challenging. An open question, however, is the impact...
Article
Full-text available
Social networks are very important carriers of information. For instance, the political leaning of our friends can serve as a proxy to identify our own political preferences. This explanatory power is leveraged in many scenarios ranging from business decision-making to scientific research to infer missing attributes using machine learning. However,...
Preprint
Full-text available
Dialogue Act (DA) classification is the task of classifying utterances with respect to the function they serve in a dialogue. Existing approaches to DA classification model utterances without incorporating the turn changes among speakers throughout the dialogue, therefore treating it no different than non-interactive written text. In this paper, we...
Article
Full-text available
Research institutions provide the infrastructure for scientific discovery, yet their role in the production of knowledge is not well characterized. To address this gap, we analyze interactions of researchers within and between institutions from millions of scientific papers. Our analysis reveals that collaborations densify as each institution grows...
Preprint
Full-text available
We examine a key component of human settlements mediating pollution and congestion, as well as economic development: roads and their expansion in cities, towns and villages. Our analysis of road networks in more than 850 US cities and rural counties since 1900 reveals significant variations in the structure of roads both within cities and across th...
Preprint
Quantitative analysis of large-scale data is often complicated by the presence of diverse subgroups, which reduce the accuracy of inferences they make on held-out data. To address the challenge of heterogeneous data analysis, we introduce DoGR, a method that discovers latent confounders by simultaneously partitioning the data into overlapping clust...
Preprint
Full-text available
Background Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks via a process called IBD mapping. Clustering algorithms play an important role in finding these groups. We set out to analyze the fitness of commonly used, fa...
Article
The social connections people form online affect the quality of information they receive and their online experience. Although a host of socioeconomic and cognitive factors were implicated in the formation of offline social ties, few of them have been empirically validated, particularly in an online setting. In this study, we analyze a large corpus...
Article
What will social media sites of tomorrow look like? What behaviors will their interfaces enable? A major challenge for designing new sites that allow a broader range of user actions is the difficulty of extrapolating from experience with current sites without first distinguishing correlations from underlying causal mechanisms. The growing availabil...
Chapter
Life events can dramatically affect our psychological state and work performance. Stress, for example, has been linked to professional dissatisfaction, increased anxiety, and workplace burnout. We therefore explore the impact of atypical positive and negative events on a number of psychological constructs through a longitudinal study of hospital an...
Chapter
Full-text available
The complex, ever-shifting landscape of social media can obscure important changes in conversations involving smaller groups. Discovering these subtle shifts in attention to topics can be challenging for algorithms attuned to global topic popularity. We present a novel unsupervised method to identify shifts in high-dimensional textual data. By util...
Article
With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in designing and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these...
Conference Paper
Full-text available
Digital media platforms are reshaping our habits, how we access information, and how we interact with others. As a result, algorithms used by platforms, for example, to recommend content, play an increasingly important role in our access to information. Due to practical difficulties of accessing how platforms present content to their users, relativ...
Article
Full-text available
Successful responses to societal challenges require sustained behavioral change. However, as responses to the COVID-19 pandemic in the US showed, political partisanship and mistrust of science can reduce public willingness to adopt recommended behaviors such as wearing a mask or receiving a vaccination. To better understand this phenomenon, we expl...
Preprint
The growing popularity of wearable sensors has generated large quantities of temporal physiological and activity data. Ability to analyze this data offers new opportunities for real-time health monitoring and forecasting. However, temporal physiological data presents many analytic challenges: the data is noisy, contains many missing values, and eac...
Preprint
Full-text available
Assessing the credibility of research claims is a central, continuous, and laborious part of the scientific process. Credibility assessment strategies range from expert judgment to aggregating existing evidence to systematic replication efforts. Such assessments can require substantial time and effort. Research progress could be accelerated if ther...
Preprint
Full-text available
The viral video documenting the killing of George Floyd by Minneapolis police officer Derek Chauvin inspired nation-wide protests that brought national attention to widespread racial injustice and biased policing practices towards black communities in the United States. The use of social media by the Black Lives Matter movement was a primary route...
Preprint
Full-text available
Growing polarization of the news media has been blamed for fanning disagreement, controversy and even violence. Early identification of polarized topics is thus an urgent matter that can help mitigate conflict. However, accurate measurement of polarization is still an open research challenge. To address this gap, we propose Partisanship-aware Conte...
Article
Full-text available
Background: Gender imbalances in academia have been evident historically and persist today. For the past 60 years, we have witnessed the increase of participation of women in biomedical disciplines, showing that the gender gap is shrinking. However, preliminary evidence suggests that women, including female researchers, are disproportionately affec...
Preprint
Full-text available
Tagging facilitates information retrieval in social media and other online communities by allowing users to organize and describe online content. Researchers found that the efficiency of tagging systems steadily decreases over time, because tags become less precise in identifying specific documents, i.e., they lose their descriptiveness. However, p...
Preprint
Preferential attachment, homophily, and their consequences, such as the glass ceiling effect, have been well studied in the context of undirected networks. However, the lack of an intuitive, theoretically tractable model of a directed, bi-populated (i.e., containing two groups) network with variable levels of preferential attachment, homophily and gr...
Preprint
Structural inequalities persist in society, conferring systematic advantages to one group of people, for example, by giving them substantially more influence and opportunities than others. Using bibliometric data about authors of scientific publications from six different disciplines, we first present evidence for the existence of two types of cita...
Article
Voting is the defining act for a democracy. However, voting is only meaningful if public deliberation is grounded in veritable and equitable information. This essay investigates the politicization of public health practices during the Democratic primaries in the context of the 2020 U.S. presidential election, using a dataset of more than 67 million...
Preprint
Individual behavior and decisions are substantially influenced by their contexts, such as location, environment, and time. Changes along these dimensions can be readily observed in Multiplayer Online Battle Arena games (MOBA), where players face different in-game settings for each match and are subject to frequent game patches. Existing methods uti...
Conference Paper
Full-text available
Although player performance in online games has been widely studied, few studies have considered the behavioral preferences of players and how that impacts performance. In a competitive setting where players must cooperate with temporary teammates, it is even more crucial to understand how differences in playing style contribute to teamwork. Drawin...
Preprint
Full-text available
Research collaborations provide the foundation for scientific advances, but we have only recently begun to understand how they form and grow on a global scale. Here we analyze a model of the growth of research collaboration networks to explain the empirical observations that the number of collaborations scales superlinearly with institution size, t...
Article
Full-text available
Background: The novel coronavirus pandemic continues to ravage communities across the US. Opinion surveys identified the importance of political ideology in shaping perceptions of the pandemic and compliance with preventive measures. Objective: The aim of this study was to measure political partisanship and anti-science attitudes in the discussi...
Article
Mobile health systems predict health conditions based on multimodal signals. Users are often reluctant to provide their health status over privacy concerns. It is challenging to make health predictions without sufficient historical data from the users. In this paper, we propose a user-based collaborative filtering mobile health system the system re...
Article
Full-text available
Applications from finance to epidemiology and cyber-security require accurate forecasts of dynamic phenomena, which are often only partially observed. We demonstrate that a system’s predictability degrades as a function of temporal sampling, regardless of the adopted forecasting model. We quantify the loss of predictability due to sampling, and sho...
Preprint
Full-text available
BACKGROUND The novel coronavirus pandemic continues to ravage communities across the US. Opinion surveys identified the importance of political ideology in shaping perceptions of the pandemic and compliance with preventive measures. OBJECTIVE The aim of this study was to measure political partisanship and anti-science attitudes in the discussions abou...
Preprint
Full-text available
Explicit and implicit bias clouds human judgement, leading to discriminatory treatment of minority groups. A fundamental goal of algorithmic fairness is to avoid the pitfalls in human judgement by learning policies that improve the overall outcomes while providing fair treatment to protected classes. In this paper, we propose a causal framework tha...
Article
Full-text available
Background: Gender imbalances in academia have been evident historically and persist today. For the past 60 years, we have witnessed the increase of participation of women in biomedical disciplines, showing that the gender gap is shrinking. However, preliminary evidence suggests that women, including female researchers, are disproportionately affe...
Preprint
BACKGROUND Gender imbalances in academia have been evident historically and persist today. For the past 60 years, we have witnessed the increase of participation of women in biomedical disciplines, showing that the gender gap is shrinking. However, preliminary evidence suggests that women, including female researchers, are disproportionately affect...
Preprint
Crowdsourcing systems aggregate decisions of many people to help users quickly identify high-quality options, such as the best answers to questions or interesting news stories. A long-standing issue in crowdsourcing is how option quality and human judgement heuristics interact to affect collective outcomes, such as the perceived popularity of optio...
Article
Diachronic word embeddings—vector representations of words over time—offer remarkable insights into the evolution of language and provide a tool for quantifying sociocultural change from text documents. Prior work has used such embeddings to identify shifts in the meaning of individual words. However, simply knowing that a word has changed in meani...
Preprint
Full-text available
Women make up a shrinking portion of physics faculty in senior positions, a phenomenon known as a "leaky pipeline." While fixing this problem has been a priority in academic institutions, efforts have been stymied by the diverse sources of leaks. In this paper we identify a bias potentially contributing to the leaky pipeline. We analyze bibliograph...
Article
Full-text available
Measurement(s) Overall Sleep Quality Rating • Step Unit of Distance • Speech • Mean Heart Rate • Proximity • Electrocardiogram Sequence • heart rate variability measurement • Respiratory Rate • physical activity measurement • light • door motion • Changes in Ambient Temperature in Medical Device Environment • humidity • Overall Emotional Well-Being...
Article
Full-text available
Crowdsourcing systems aggregate decisions of many people to help users quickly identify high-quality options, such as the best answers to questions or interesting news stories. A long-standing issue in crowdsourcing is how option quality and human judgement heuristics interact to affect collective outcomes, such as the perceived popularity of optio...
Chapter
Full-text available
News outlets are a primary source for many people to learn what is going on in the world. However, outlets with different political slants, when talking about the same news story, usually emphasize various aspects and choose their language framing differently. This framing implicitly shows their biases and also affects the reader’s opinion and unde...
Chapter
Full-text available
Continuous collection of physiological data from wearable sensors enables temporal characterization of individual behaviors. Understanding the relation between an individual’s behavioral patterns and psychological states can help identify strategies to improve quality of life. One challenge in analyzing physiological data is extracting the underlyi...