Conference Paper

Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

Conference: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics


In this paper, we introduce a corpus of consumer reviews from the rateitall and the eopinions websites annotated with opinion-related information. We present a two-level annotation scheme. In the first stage, the reviews are analyzed at the sentence level for (i) relevancy to a given topic, and (ii) expressing an evaluation about the topic. In the second stage, on-topic sentences containing evaluations about the topic are further investigated at the expression level for pinpointing the properties (semantic orientation, intensity), and the functional components of the evaluations (opinion terms, targets and holders). We discuss the annotation scheme, the inter-annotator agreement for different subtasks and our observations.
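
The two-level scheme described in the abstract can be pictured as a pair of nested records. The sketch below is only an illustration: the class names, field names, and default label values are hypothetical and do not come from the paper, which the abstract says distinguishes sentence-level judgments (topic relevancy, presence of an evaluation) from expression-level ones (semantic orientation, intensity, opinion terms, targets, holders).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OpinionExpression:
    """Expression-level annotation (second stage). All names are hypothetical."""
    opinion_term: str                 # the evaluative word or phrase
    target: Optional[str] = None      # what the evaluation is about
    holder: Optional[str] = None      # who holds the opinion
    orientation: str = "neutral"      # semantic orientation; label set is assumed
    intensity: str = "average"        # intensity; label set is assumed


@dataclass
class SentenceAnnotation:
    """Sentence-level annotation (first stage). All names are hypothetical."""
    text: str
    on_topic: bool                    # (i) relevant to the given topic
    has_evaluation: bool              # (ii) expresses an evaluation about the topic
    expressions: List[OpinionExpression] = field(default_factory=list)

    def needs_expression_level(self) -> bool:
        # Only on-topic sentences containing an evaluation reach the second stage.
        return self.on_topic and self.has_evaluation


def second_stage_candidates(sentences: List[SentenceAnnotation]) -> List[SentenceAnnotation]:
    """Return the sentences that would be passed on to expression-level annotation."""
    return [s for s in sentences if s.needs_expression_level()]
```

Keeping the stage-two records inside the stage-one record mirrors the filtering order in the abstract: a sentence carries expression-level annotations only if it passed both sentence-level checks.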


Available from: Niklas Jakob,
  • Source
    • "In our work, we only consider noun phrase entities, and we consider the noun phrase itself as an entity. Other fine-grained annotation studies include that of Toprak et al. (2010) who enrich target and holder annotations in consumer reviews with measures such as relevancy and intensity, and Somasundaran et al. (2008) who perform discourse-level annotation of opinion frames, which consist of opinions whose targets are described by similar or contrasting relations."
    ABSTRACT: We present a method for annotating targets of opinions in Arabic in a two-stage process using the crowdsourcing tool Amazon Mechanical Turk. The first stage consists of identifying candidate targets "entities" in a given text. The second stage consists of identifying the opinion polarity (positive, negative, or neutral) expressed about a specific entity. We annotate a corpus of Arabic text using this method, selecting our data from online commentaries in different domains. Despite the complexity of the task, we find high agreement. We present detailed analysis.
    ACL 2015 Workshop on Arabic Natural Language Processing; 07/2015
  • Source
    • "Their main objective is to establish relations between targets and only few discourse-level structures of polarity shifting are identified in their annotated corpus. Toprak et al. (2010) present a corpus which considers the opinion expression at sentence-level from different aspects, such as polarity, strength, modifier, holder, and target. The annotation of polarity shifting is included yet very limited. "
    ABSTRACT: Cross-lingual sentiment classification aims to perform sentiment classification in a language (named as the target language) with the help of the resources from another language (named as the source language). Previous studies are prone to using all available data in the source language, while using all data is observed to perform no better or even worse than using a portion of good data. In this paper, we propose a novel task called data quality controlling in the source language to select high quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements: intra- and extra-quality measurements which are implemented with the certainty and similarity measurements respectively. The empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.
    Asian Language Processing (IALP), 2013 International Conference on; 01/2013
  • Source
    • "In the last years, several corpora have been annotated with information related to modality and polarity, which have made it possible to develop machine learning systems. Annotation has been performed at different levels: word (Hassan and Radev, 2010), expression (Baker et al., 2010; Toprak et al., 2010), sentence (Medlock and Briscoe, 2007), event (Saurí and Pustejovsky, 2009), discourse relation (Prasad et al., 2006), text (Amancio et al., 2010), and scope of negation and modality cues (Vincze et al., 2008). Thanks to the existence of the BioScope corpus, the scope processing task was recently introduced. "
    ABSTRACT: In this paper we summarize existing work on the recently introduced task of processing the scope of negation and modality cues; we analyse the scope model that existing systems can process, which is mainly the model reflected in the annotations of the biomedical corpus on which the systems have been trained; and we point out aspects of the scope finding task that would be different based on observations from a corpus from a different domain and nature.