Sarah Levitan

  • Columbia University

About

Publications: 39
Reads: 7,771
Citations: 721
Current institution
Columbia University


Publications (39)
Preprint
Full-text available
In this paper, we investigate the efficacy of large language models (LLMs) in obfuscating authorship by paraphrasing and altering writing styles. Rather than adopting a holistic approach that evaluates performance across the entire dataset, we focus on user-wise performance to analyze how obfuscation effectiveness varies across individual authors....
Article
Full-text available
News outlets are well known to have political associations, and many national outlets cultivate political biases to cater to different audiences. Journalists working for these news outlets have a big impact on the stories they cover. In this work, we present a methodology to analyze the role of journalists, affiliated with popular news outlets, in...
Article
NLP approaches to automatic deception detection have gained popularity over the past few years, especially with the proliferation of fake reviews and fake news online. However, most previous studies of deception detection have focused on single domains. We currently lack information about how these single-domain models of deception may or may not g...
Conference Paper
Full-text available
We address the problem of predicting psychiatric hospitalizations using linguistic features drawn from social media posts. We formulate this novel task and develop an approach to automatically extract time spans of self-reported psychiatric hospitalizations. Using this dataset, we build predictive models of psychiatric hospitalization, comparing fe...
Conference Paper
Full-text available
We address the problem of automatic detection of psychiatric disorders from the linguistic content of social media posts. We build a large scale dataset of Reddit posts from users with eight disorders and a control user group. We extract and analyze linguistic characteristics of posts and identify differences between diagnostic groups. We build str...
Article
Full-text available
Humans rarely perform better than chance at lie detection. To better understand human perception of deception, we created a game framework, LieCatcher, to collect ratings of perceived deception using a large corpus of deceptive and truthful interviews. We analyzed the acoustic-prosodic and linguistic characteristics of language trusted and mistrust...
Article
Full-text available
The tendency of conversation partners to adjust to each other to become similar, known as entrainment, has been studied for many years. Several studies have linked differences in this behavior to gender, but with inconsistent results. We analyze individual differences in two forms of local, acoustic-prosodic entrainment in two large corpora between...
Conference Paper
Full-text available
Improving methods of automatic deception detection is an important goal of many researchers from a variety of disciplines, including psychology, computational linguistics, and criminology. We present a system to automatically identify deceptive utterances using acoustic-prosodic, lexical, syntactic, and phonotactic features. We train and test our...
Conference Paper
Full-text available
Detecting deception from different dimensions of human behavior has been a major goal of research in psychology and computational linguistics for some years and is currently of considerable interest to military and law enforcement agencies. However, relatively little work has been done to develop automatic methods to detect deception from spoken la...
Article
A major goal of the Cognitive Infocommunication approach is to develop applications in which human and artificial cognitive systems are made to work more effectively. A critical step in this process is improving our understanding of human–human interaction so that it may be modeled more closely. Our work addresses this task by examining the role of...
Article
Tandem repeats in DNA sequences are extremely relevant in biological phenomena and diagnostic tools. Computational programs that discover these tandem repeats generate a huge volume of data, which is often difficult to decipher without further organization. In this paper, the authors describe a new method for post-processing tandem repeats through...
Article
RFID technology can successfully be used to reduce medical errors. This technology can aid in the accurate matching of patients with their medications and treatments. The enthusiasm for using RFID technology in medical settings has been tempered by privacy concerns. We discuss new encryption methods that address these concerns.
