Sara Rosenthal

IBM Research · Thomas J. Watson Research Center

PhD

51
Publications
15,467
Reads
3,818
Citations
Citations since 2016
32 Research Items
3,460 Citations
[Chart: citations per year, 2016 to 2022; y-axis from 0 to 600]
Education
January 2009 - July 2015
Columbia University
Field of study
  • Computer Science

Publications (51)
Preprint
Full-text available
Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types. We present a multilingual machine reading comprehension system and front-end demo that handles boolean questions by providing both a YES/NO answer and highlighting supportin...
Preprint
Pretrained language models have shown success in various areas of natural language processing, including reading comprehension tasks. However, when applying machine learning methods to new domains, labeled data may not always be available. To address this, we use supervised pretraining on source-domain data to reduce sample complexity on domain-spe...
Preprint
Full-text available
Existing datasets that contain boolean questions, such as BoolQ and TyDi QA, provide the user with a YES/NO response to the question. However, a one-word response is not sufficient for an explainable system. We promote explainability by releasing a new set of annotations marking the evidence in existing TyDi QA and BoolQ datasets. We show that our...
Preprint
Full-text available
Understanding tables is an important and relevant task that involves understanding table structure as well as being able to compare and contrast information within cells. In this paper, we address this challenge by presenting a new dataset and tasks that address this goal in a shared task in SemEval 2020 Task 9: Fact Verification and Evidence Fin...
Article
Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in...
Article
Thousands of scientific publications discuss evidence on the efficacy of non-cancer generic drugs being tested for cancer. However, trying to manually identify and extract such evidence is intractable at scale. We introduce a natural language processing pipeline to automate the identification of relevant studies and facilitate the extraction of the...
Preprint
Full-text available
Recent approaches have exploited weaknesses in monolingual question answering (QA) models by adding adversarial statements to the passage. These attacks caused a reduction in state-of-the-art performance by almost 50%. In this paper, we are the first to explore and successfully attack a multilingual QA (MLQA) system pre-trained on multilingual BERT...
Preprint
Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in...
Preprint
Full-text available
We present the results and main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020). The task involves three subtasks corresponding to the hierarchical taxonomy of the OLID schema (Zampieri et al., 2019a) from OffensEval 2019. The task featured five languages: English, Arabic, Danish,...
Preprint
Full-text available
The use of offensive language is a major problem in social media, which has led to an abundance of research in detecting content such as hate speech, cyberbullying, and cyber-aggression. There have been several attempts to consolidate and categorize these efforts. Recently, the OLID dataset used at SemEval-2019 proposed a hierarchical three-level ann...
Article
Full-text available
Objective: To improve efficient goal attainment of patients by analyzing the unstructured text in care manager (CM) notes (CMNs). Our task is to determine whether the goal assigned by the CM can be achieved in a timely manner. Materials and Methods: Our data consists of CM structured and unstructured records from a private firm in Orlando, FL. The C...
Preprint
Full-text available
In recent years, sentiment analysis in social media has attracted a lot of research interest and has been used for a number of applications. Unfortunately, research has been hindered by the lack of suitable datasets, complicating the comparison between approaches. To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Tw...
Preprint
Full-text available
We describe the Sentiment Analysis in Twitter task, run as part of SemEval-2014. It is a continuation of last year's task, which ran successfully as part of SemEval-2013. As in 2013, this was the most popular SemEval task; a total of 46 teams contributed 27 submissions for subtask A (21 teams) and 50 submissions for subtask B (44 teams). This yea...
Preprint
Full-text available
In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter. This was the most popular sentiment analysis shared task to date with more than 40 teams participating in each of the last three years. This year's shared task competition consisted of five sentiment prediction subtasks. Two were reruns from p...
Preprint
Full-text available
This paper discusses the fourth year of the "Sentiment Analysis in Twitter Task". SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. The first two subtasks are reruns from prior years and ask to predict the overall sentiment, and the sentiment towards a topic in a tweet. The three...
Preprint
Full-text available
This paper describes the fifth year of the Sentiment Analysis in Twitter task. SemEval-2017 Task 4 continues with a rerun of the subtasks of SemEval-2016 Task 4, which include identifying the overall sentiment of the tweet, sentiment towards a topic with classification on a two-point and on a five-point ordinal scale, and quantification of the dist...
Preprint
Full-text available
This paper presents the results and main findings of the shared task on Identifying and Categorizing Offensive Language in Social Media (OffensEval). SemEval-2019 Task 6 provided participants with an annotated dataset containing English tweets. The competition was divided into three sub-tasks. In sub-task A systems were trained to discriminate betw...
Preprint
Full-text available
As offensive content has become pervasive in social media, there has been much research on identifying potentially offensive messages. Previous work in this area, however, did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contra...
Article
Full-text available
Social media has become very popular and mainstream, leading to an abundance of content. This wealth of content contains many interactions and conversations that can be analyzed for a variety of information. One such type of information is analyzing the roles people take in a conversation. Detecting influencers, one such role, can be useful for pol...
Article
Full-text available
We present the development and evaluation of a semantic analysis task that lies at the intersection of two very trendy lines of research in contemporary computational linguistics: (1) sentiment analysis, and (2) natural language processing of social media text. The task was part of SemEval, the International Workshop on Semantic Evaluation, a seman...
Conference Paper
Full-text available
Determining when conversational participants agree or disagree is instrumental for broader conversational analysis; it is necessary, for example, in deciding when a group has reached consensus. In this paper, we describe three main contributions. We show how different aspects of conversational structure can be used to detect agreement and disagreem...
Conference Paper
Recent business studies have shown that social technologies can significantly improve productivity within enterprises by improving access to information, ideas, and collaborators. A manifestation of the growing adoption of enterprise social technologies is the increasing use of enterprise virtual discussions to engage customers and employees. In th...
Article
Knowing who's influential can help when planning political campaigns, advertising strategies, or even combating terrorism; and now research into influence detection promises to automate such detection.
Conference Paper
Full-text available
We describe the Sentiment Analysis in Twitter task, run as part of SemEval-2014. It is a continuation of last year’s task, which ran successfully as part of SemEval-2013. As in 2013, this was the most popular SemEval task; a total of 46 teams contributed 27 submissions for subtask A (21 teams) and 50 submissions for subtask B (44 teams). This year...
Conference Paper
Full-text available
We present two supervised sentiment detection systems which were used to compete in SemEval-2014 Task 9: Sentiment Analysis in Twitter. The first system (Rosenthal and McKeown, 2013) classifies the polarity of subjective phrases as positive, negative, or neutral. It is tailored towards online genres, specifically Twitter, through the inclusion of d...
Conference Paper
Full-text available
This paper explores the automatic detection of sentences that are opinionated claims, in which the author expresses a belief. We use a machine learning based approach, investigating the impact of features such as sentiment and the output of a system that determines committed belief. We train and test our approach on social media, where people often...
Conference Paper
It has long been established that there is a correlation between the dialog behavior of a participant and how influential he or she is perceived to be by other discourse participants. In this paper we explore the characteristics of communication that make someone an opinion leader and develop a machine learning based approach for the automatic iden...
Conference Paper
Full-text available
We investigate whether wording, stylistic choices, and online behavior can be used to predict the age category of blog authors. Our hypothesis is that significant changes in writing style distinguish pre-social media bloggers from post-social media bloggers. Through experimentation with a range of years, we found that the birth dates of students in...
Conference Paper
Full-text available
This paper investigates whether high-quality annotations for tasks involving semantic disambiguation can be obtained without a major investment in time or expense. We examine the use of untrained human volunteers from Amazon's Mechanical Turk in disambiguating prepositional phrase (PP) attachment over sentences drawn from the Wall Street Journal cor...
Conference Paper
Full-text available
Sentence fusion enables summarization and question-answering systems to produce output by combining fully formed phrases from different sentences. Yet there is little data that can be used to develop and evaluate fusion techniques. In this paper, we present a methodology for collecting fusions of similar sentence pairs using Amazon's Mechanical Tur...
Conference Paper
This paper explores the task of building an accurate prepositional phrase attachment corpus for new genres while avoiding a large investment in terms of time and money by crowd-sourcing judgments. We develop and present a system to extract prepositional phrases and their potential attachments from ungrammatical and informal sentences and pose the s...
Conference Paper
Full-text available
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT). In this paper, we present an error analysis of a new cross-lingual task: the 5W task, a sentence-level understanding task which seeks to return the English 5W's (Who, What, When, Where and Why) correspo...
Conference Paper
Full-text available
Cross-lingual tasks are especially difficult due to the compounding effect of errors in language processing and errors in machine translation (MT). In this paper, we present an error analysis of a new cross-lingual task: the 5W task, a sentence-level understanding task which seeks to return the English 5W's (Who, What, When, Where and Why) correspo...
Article
Full-text available
I study active learning in general pool-based active learning models as well as noisy active learning algorithms and then compare them for the class of linear separators under the uniform distribution.
Article
There are many different kinds of analysis and applications that can be created based upon social networks. I have created a software module that will extract social networks from emails and give the user information about them.
Article
Full-text available
In this paper we discuss our experiments and the results that we have obtained on Boosting with Noise. A boosting algorithm is one that takes a weak PAC learning algorithm and "boosts" it to achieve high accuracy. We examine the Kalai and Servedio paper (1) as our focal point for boosting. We will analyze in depth the algorithm mentioned in (1) known...
Article
We introduce a new corpus of sentence-level agreement and disagreement annotations over LiveJournal and Wikipedia threads. This is the first agreement corpus to offer full-document annotations for threaded discussions. We provide a methodology for coding responses as well as an implemented tool with an interface that facilitates annotation of a spe...
