Conference Paper

Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

Conference: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics


In this paper, we introduce a corpus of consumer reviews from the Rateitall and Epinions websites annotated with opinion-related information. We present a two-level annotation scheme. In the first stage, the reviews are analyzed at the sentence level for (i) relevancy to a given topic, and (ii) whether they express an evaluation of the topic. In the second stage, on-topic sentences containing evaluations of the topic are further investigated at the expression level to pinpoint the properties of the evaluations (semantic orientation, intensity) and their functional components (opinion terms, targets, and holders). We discuss the annotation scheme, the inter-annotator agreement for the different subtasks, and our observations.

Available from: Niklas Jakob
    • "Many available annotations do not show details of opinion expressions [55], [74], [107]. Recently, [108] proposed a scheme for annotating a corpus of customer reviews with the help of an annotation tool. As a result, the annotation contains the opinion expression, opinion target, holder, modifier, and anaphoric expression. "
    ABSTRACT: Software review text fragments contain considerably valuable information about users’ experience, covering a broad set of properties including software quality. Opinion mining, or sentiment analysis, is concerned with analyzing textual user judgments. Applying sentiment analysis to software reviews can yield a quantitative value that represents software quality. Although many software quality methods have been proposed, they are considered difficult to customize and many of them are limited. This article investigates the application of opinion mining as an approach to extracting software quality properties. We found that the major issues in mining software reviews with sentiment analysis stem from the software lifecycle and the diversity of users and teams.
    Full-text · Article · Jan 2016
    • "In our work, we only consider noun phrase entities, and we consider the noun phrase itself as an entity. Other fine-grained annotation studies include that of Toprak et al. (2010), who enrich target and holder annotations in consumer reviews with measures such as relevancy and intensity, and Somasundaran et al. (2008), who perform discourse-level annotation of opinion frames, which consist of opinions whose targets are described by similar or contrasting relations. "
    ABSTRACT: We present a method for annotating targets of opinions in Arabic in a two-stage process using the crowdsourcing tool Amazon Mechanical Turk. The first stage consists of identifying candidate target "entities" in a given text. The second stage consists of identifying the opinion polarity (positive, negative, or neutral) expressed about a specific entity. We annotate a corpus of Arabic text using this method, selecting our data from online commentaries in different domains. Despite the complexity of the task, we find high agreement. We present a detailed analysis.
    Full-text · Conference Paper · Jul 2015
    • "Their main objective is to establish relations between targets, and only a few discourse-level structures of polarity shifting are identified in their annotated corpus. Toprak et al. (2010) present a corpus that considers opinion expressions at the sentence level from different aspects, such as polarity, strength, modifier, holder, and target. The annotation of polarity shifting is included but very limited. "
    ABSTRACT: Cross-lingual sentiment classification aims to perform sentiment classification in one language (the target language) with the help of resources from another language (the source language). Previous studies tend to use all available data in the source language, although using all the data has been observed to perform no better, or even worse, than using a portion of good data. In this paper, we propose a novel task called data quality controlling in the source language, which selects high-quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements: intra- and extra-quality measurements, implemented with certainty and similarity measurements respectively. Empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.
    Full-text · Conference Paper · Aug 2013