Rashmi Prasad

University of Wisconsin–Milwaukee | UWM · Department of Health Informatics and Administration

PhD

About

71
Publications
17,476
Reads
3,689
Citations
Additional affiliations
August 2011 - August 2017
University of Wisconsin–Milwaukee
Position
  • Professor (Assistant)

Publications

Chapter
Understanding discourse relies to a great extent on correctly interpreting relations holding between the eventualities and facts mentioned in discourse. These discourse relations, such as causal, contrastive and temporal relations, can be expressed explicitly or implicitly in the discourse, and are the subject of annotation in the Penn Discourse Tr...
Conference Paper
Full-text available
Literature based discovery (LBD) is a well-known paradigm to discover hidden knowledge in scientific literature. By identifying and utilizing reported findings in literature, LBD hypothesizes novel discoveries. Most often, LBD systems generate a long list of potential discoveries and it would be time consuming and expensive to validate all of those...
Conference Paper
New drug development costs between 500 million and 2 billion dollars and takes 10-15 years, with a success rate of less than 10%. Drug repurposing is the process of discovering new indications for existing drugs and is becoming an important component of drug development as success rates for novel drugs in clinical trials decrease and costs increase...
Conference Paper
Full-text available
The CoNLL-2015 Shared Task is on Shallow Discourse Parsing, a task focusing on identifying individual discourse relations that are present in a natural language text. A discourse relation can be expressed explicitly or implicitly, and takes two arguments realized as sentences, clauses, or in some rare cases, phrases. Sixteen teams from three contin...
Article
Full-text available
The Penn Discourse Treebank (PDTB) was released to the public in 2008. It remains the largest manually annotated corpus of discourse relations to date. Its focus on discourse relations that are either lexically-grounded in explicit discourse connectives or associated with sentential adjacency has not only facilitated its use in language technology...
Article
Full-text available
To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest. Using random samples from a shared natural language processing challenge...
Conference Paper
In this paper we present our experiments with different annotation workflows for annotating discourse relations in the Hindi Discourse Relation Bank (HDRB). In view of the growing interest in the development of discourse data-banks based on the PDTB framework and the complexities associated with the discourse annotation, it is important to study and...
Article
Full-text available
Relation extraction in biomedical text mining systems has largely focused on identifying clause-level relations, but increasing sophistication demands the recognition of relations at discourse level. A first step in identifying discourse relations involves the detection of discourse connectives: words or phrases used in text to express discourse re...
Article
Full-text available
Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has...
Article
Full-text available
Part-of-speech (POS) tagging is a fundamental step required by various NLP systems. The training of a POS tagger relies on sufficient quality annotations. However, the annotation process is both knowledge-intensive and time-consuming in the clinical domain. A promising solution appears to be for institutions to share their annotation efforts, and y...
Conference Paper
Full-text available
Studies of discourse relations have not, in the past, attempted to characterize what serves as evidence for them, beyond lists of frozen expressions, or markers, drawn from a few well-defined syntactic classes. In this paper, we describe how the lexicalized discourse relation annotations of the Penn Discourse Treebank (PDTB) led to the discovery of...
Conference Paper
Full-text available
We report results on predicting the sense of implicit discourse relations between adjacent sentences in text. Our investigation concentrates on the association between discourse relations and properties of the referring expressions that appear in the related sentences. The properties of interest include coreference information, grammatical...
Article
Full-text available
This paper describes the question generation system developed at UPenn for QGSTEC, 2010. The system uses predicate argument structures of sentences along with semantic roles for the question generation task from paragraphs. The semantic role labels are used to identify relevant parts of text before forming questions over them. The generated quest...
Conference Paper
Full-text available
In this paper, we make a qualitative and quantitative analysis of discourse relations within the LUNA conversational spoken dialog corpus. In particular, we describe the adaptation of the Penn Discourse Treebank (PDTB) annotation scheme to the LUNA dialogs. We discuss similarities and differences between our approach and the PDTB paradigm and point...
Conference Paper
Full-text available
We present an approach to automatically identifying the arguments of discourse connectives based on data from the Penn Discourse Treebank. Of the two arguments of connectives, called Arg1 and Arg2, we focus on Arg1, which has proven more challenging to identify. Our approach employs a sentence-based representation of arguments, and distinguishes in...
Conference Paper
Full-text available
We describe the Hindi Discourse Relation Bank project, aimed at developing a large corpus annotated with discourse relations. We adopt the lexically grounded approach of the Penn Discourse Treebank, and describe our classification of Hindi discourse connectives, our modifications to the sense classification of discourse relations, and some cross-li...
Article
Full-text available
In the Hindi Discourse Relation Bank (HDRB) project, we are developing a large corpus annotated with discourse relations, such as causal, temporal, contrastive and conjunctive relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we annotate the argument structure of both explicit and implicit discourse relat...
Article
Full-text available
The goal of understanding how discourse is more than a sequence of sentences has engaged researchers for many years. Researchers in the 1970s attempted to gain such understanding by identifying and classifying the phenomena involved in discourse. This was followed by attempts in the 1980s and early 1990s to explain discourse phenomena in terms of...
Article
Full-text available
While many advances have been made in Natural Language Generation (NLG), the scope of the field has been somewhat restricted because of the lack of annotated corpora from which properties of texts can be automatically acquired and applied towards the development of generation systems. In this paper, we describe how the Penn Discourse TreeBank (PDT...
Article
Full-text available
The goal of the Penn Discourse Treebank (PDTB) project is to develop a large-scale corpus, annotated with coherence relations marked by discourse connectives. Currently, the primary application of the PDTB annotation has been to news articles. In this study, we tested whether the PDTB guidelines can be adapted to a different genre. We annotated d...
Conference Paper
Full-text available
We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense...
Article
Full-text available
The term discourse structure is used to denote any structure of a text above that of the sentence. Trees have often been posited as a good abstraction when discourse is taken to have a hierarchical structure (Mann and Thompson 1987; Webber et al. 2003; Marcu 2000; Egg and Redeker 2008). Nevertheless, periodically researchers have commented on the n...
Article
Full-text available
We address the subtask of generating why-questions from texts and propose the use of causal relations annotated in the Penn Discourse TreeBank for evaluating content-selection methods for why-question generation. Our initial experiments show that 71% of an independently developed data set of why-questions can be correlated with causal relations a...
Article
Full-text available
We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence...
Article
Full-text available
Spoken language generation for dialogue systems requires a dictionary of mappings between the semantic representations of concepts that the system wants to express and the realizations of those concepts. Dictionary creation is a costly process; it is currently done by hand for each dialogue domain. We propose a novel unsupervised method for learnin...
Article
Full-text available
One of the biggest challenges in the development and deployment of spoken dialogue systems is the design of the spoken language generation module. This challenge arises from the need for the generator to adapt to many features of the dialogue domain, user population, and dialogue context. A promising approach is trainable generation, which uses gen...
Article
Full-text available
This report contains the guidelines for the annotation of discourse relations in the Penn Discourse Treebank (http://www.seas.upenn.edu/~pdtb), PDTB. Discourse relations in the PDTB are annotated in a bottom up fashion, and capture both lexically realized relations as well as implicit relations. Guidelines in this report are provided for all aspect...
Article
Full-text available
We describe the problem of anaphora resolution and discuss approaches to modeling this problem. Centering Theory (CT), which is an approach to modeling certain aspects of local coherence in discourse, includes within it the component that models anaphora resolution. However, CT itself is not a theory of anaphora resolution. It was developed as part...
Conference Paper
Full-text available
Spoken language generation for dialogue systems requires a dictionary of mappings between semantic representations of concepts the system wants to express and realizations of those concepts. Dictionary creation is a costly process; it is currently done by hand for each dialogue domain. We propose a novel unsupervised method for learning such ma...
Article
Full-text available
An emerging task in text understanding and generation is to categorize information as fact or opinion and to further attribute it to the appropriate source. Corpus annotation schemes aim to encode such distinctions for NLP applications concerned with such tasks, such as information extraction, question answering, summarization, and generation. We d...
Article
Full-text available
The annotations of the Penn Discourse Treebank (PDTB) include (1) discourse connectives and their arguments, and (2) attribution of each argument of each connective and of the relation it denotes. Because the PDTB covers the same text as the Penn TreeBank WSJ corpus, syntactic and discourse annotation can be compared. This has revealed signific...
Article
Full-text available
In this paper, we describe an annotation scheme for the attribution of abstract objects (propositions, facts, and eventualities) associated with discourse relations and their arguments annotated in the Penn Discourse TreeBank. The scheme aims to capture both the source and degrees of factuality of the abstract objects through the annotation of text...
Article
Full-text available
Discourse connectives can be analysed as encoding predicate-argument relations whose arguments derive from the interpretation of discourse units. These arguments can be anaphoric or structural. Although structural arguments can be encoded in a parse tree, anaphoric arguments must be resolved by other means. A study of nine connectives, annotating t...
Conference Paper
Full-text available
Compared to the variation in utterances that users may exhibit in conversation with spoken dialogue systems, system utterances can be very rigid with little variation. One recent approach to dealing with this problem is a trainable sentence planner, which uses natural language generation techniques to create a large number of alternative utterances...
Conference Paper
Full-text available
The Penn Discourse TreeBank (PDTB) is a new resource built on top of the Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of standoff annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), which adds...
Conference Paper
Full-text available
A challenging problem for spoken dialog systems is the design of utterance generation modules that are fast, flexible and general, yet produce high quality output in particular domains. A promising approach is trainable generation, which uses general-purpose linguistic knowledge automatically adapted to the application domain. This paper presents a...
Article
A challenging problem for spoken dialog systems is the design of utterance generation modules that are fast, flexible and general, yet produce high quality output in particular domains. A promising approach is trainable generation, which uses general-purpose linguistic knowledge automatically adapted to the application domain. This paper presents a...
Article
Full-text available
While dialogue acts provide a useful schema for characterizing dialogue behaviors in human-computer and human-human dialogues, their utility is limited by the huge effort involved in hand-labelling dialogues with a dialogue act labelling scheme. In this work, we examine whether it is possible to fully automate the tagging task with the goal of enabli...
Article
Full-text available
The Penn Discourse TreeBank (PDTB) is a new resource built on top of the complete Penn Wall Street Journal corpus, in which discourse connectives are annotated along with their arguments. Its use of stand-off annotation allows integration with a stand-off version of the Penn TreeBank (syntactic structure) and PropBank (verbs and their arguments), w...
Article
Full-text available
Discourse connectives can be analyzed as encoding predicate-argument relations whose arguments derive from the interpretation of discourse units. These arguments can be anaphoric or structural. Although structural arguments can be encoded in a parse tree, anaphoric arguments must be resolved by other means. A study of nine connectives, annotating t...
Article
Full-text available
This paper describes a new, large scale discourse-level annotation project -- the Penn Discourse TreeBank (PDTB). We present an approach to annotating a level of discourse structure that is based on identifying discourse connectives and their arguments. The PDTB is being built directly on top of the Penn TreeBank and Propbank, thus supporting the e...
Article
Full-text available
This paper describes a new discourse-level annotation project -- the Penn Discourse Treebank (PDTB) -- that aims to produce a large-scale corpus in which discourse connectives are annotated, along with their arguments, thus exposing a clearly defined level of discourse structure.
Article
Full-text available
We present an implementation of a discourse parsing system for a lexicalized Tree-Adjoining Grammar for discourse, specifying the integration of sentence and discourse level processing. Our system is based on the assumption that the compositional aspects of semantics at the discourse-level parallel those at the sentence-level. This coupling is achie...
Conference Paper
Full-text available
As the complexity of spoken dialogue systems has increased, there has been increasing interest in spoken language generation (SLG). SLG promises portability across application domains and dialogue situations through the development of application-independent linguistic modules. However, in practice, rule-based SLGs often have to be tuned to the appli...
Article
Full-text available
This dissertation makes progress towards the generation of referring expressions in Hindi. We first make a proposal to exploit a combination of Gricean implicatures (Grice, 1975) and Centering theory constraints (Grosz et al., 1995) to formulate a generation algorithm for referring expressions whose domain of application is defined in terms of th...
Article
Full-text available
The objective of the DARPA Communicator project is to support rapid, cost-effective development of multi-modal speech-enabled dialogue systems with advanced conversational capabilities. During the course of the Communicator program, we have been involved in developing methods for measuring progress towards the program goals and assessing advances i...
Article
Full-text available
Discourse connectives can be analyzed as encoding predicate-argument relations whose arguments derive from the interpretation of discourse units. These arguments can be anaphoric or structural. Although structural arguments can be encoded in a parse tree, anaphoric arguments must be resolved by other means. A study of nine connectives, annotating t...
Article
Full-text available
This paper describes the evaluation methodology and results of the DARPA Communicator spoken dialog system evaluation experiments in 2000 and 2001. Nine spoken dialog systems in the travel planning domain participated in the experiments resulting in a total corpus of 1904 dialogs. We describe and compare the experimental design of the 2000 and 2001...
Conference Paper
Full-text available
Spoken dialogue systems promise efficient and natural access to information services from any phone. Recently, spoken dialogue systems for widely used applications such as email, travel information, and customer care have moved from research labs into commercial use. These applications can receive millions of calls a month. This huge amount of sp...
Conference Paper
Full-text available
This paper describes the evaluation methodology and results of the 2001 DARPA Communicator evaluation. The experiment spanned 6 months of 2001 and involved eight DARPA Communicator systems in the travel planning domain. It resulted in a corpus of 1242 dialogs which include many more dialogues for complex tasks than the 2000 evaluation. We describ...
Article
made to determine the ranking of the $C_f$ in Hindi: • If there is a single pronoun in an utterance $U_{n+1}$, then it must be the $C_b$. • The $C_b(U_{n+1})$ is the highest ranked entity of $C_f(U_n)$ which is realized in $U_{n+1}$. • If the highest ranked entity of $C_f(U_{n+1})$ is not realized in $U_{n+1}$, then the $C_b$ is the next highest ranked 1 entit...
Article
Full-text available
In this paper we present our experiences in the evaluation of a wide-coverage grammar of English: the XTAG English grammar. We give a brief history of previous evaluations done using the XTAG grammar and then describe a pair of new evaluations done on a corpus of weather reports and the CSLI LKB test suite. Based on these experiments, we discuss th...
Article
Full-text available
This paper traces the behavior of Middle English V2 from 1350 to 1500. Unlike previous studies, this paper is based on a larger number of texts (31 texts). The analysis of the data presented here shows that with respect to the implementation of the V2 constraint, there are notably distinct patterns observed in the southern region as a whole, thus n...
Article
Full-text available
This paper presents a corpus-based investigation of the use of zero pronouns in Hindi. After establishing that the antecedents of these null arguments cannot be recovered syntactically (Rizzi, 1982; Jaeggli & Safir, 1989), I propose an account in terms of "Centering Theory" (Grosz et al., 1995). Given the Hindi-specific centering constraints propos...
Article
Full-text available
Spoken dialogue systems promise efficient and natural access to information services from any phone. Recently, spoken dialogue systems for widely used applications such as email, travel information, and customer care have moved from research labs into commercial use. These applications can receive millions of calls a month. This huge amount of spok...
Article
Full-text available
This paper investigates anaphoric reference in Hindi, with particular focus on the use and interpretation of third person personal pronouns to realize anaphoric relationships between noun phrases. We have two specific goals. The first is inspired by the central idea of Centering theory (Grosz et al. 1995), namely, that each utterance in a discourse...
Article
Full-text available
Large scale annotated corpora have played a critical role in speech and natural language research. However, while existing annotated corpora such as the Penn Treebank have been highly successful at the sentence-level, we also need large-scale annotated resources that reliably encode key aspects of discourse. In this paper, we detail (1) our plan...
Article
Full-text available
Discourse connectives can be analyzed as discourse level predicates which project predicate-argument structure on a par with verbs at the sentence level. The Penn Discourse Treebank (PDTB) reflects this view in its design providing annotation of the discourse connectives and their arguments. Like verbs, discourse connectives have multiple senses. W...
Article
Full-text available
This paper investigates the complexity of dependencies at the discourse level, in particular the dependencies between discourse connectives and their arguments. Our study is based on data from the Penn Discourse Treebank (PDTB) and is therefore an exploration into the ways treebanks can inform linguistic issues. We observe that, unlike in syntax,...
Article
Full-text available
Taking discourse connectives to be the predicates of binary discourse relations, the goal of Penn Discourse Treebank (PDTB) is to annotate the million word WSJ corpus in the Penn TreeBank with each of its discourse connectives and their arguments. The paper describes the linguistic observations and ideas that led to the PDTB, the decisions that s...
Article
Full-text available
This paper describes initial studies in the context of a new effort within ISO to design an international standard for the annotation of discourse with semantic relations that are important for its coherence, "discourse relations". This effort takes the Penn Discourse Treebank (PDTB) as its starting point, and applies a methodology for defining sem...
