Conference Paper

Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

Conference: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

ABSTRACT In this paper, we introduce a corpus of consumer reviews from the rateitall and the eopinions websites annotated with opinion-related information. We present a two-level annotation scheme. In the first stage, the reviews are analyzed at the sentence level for (i) relevance to a given topic and (ii) whether they express an evaluation of the topic. In the second stage, on-topic sentences containing evaluations of the topic are further analyzed at the expression level to pinpoint the properties (semantic orientation, intensity) and the functional components (opinion terms, targets, and holders) of the evaluations. We discuss the annotation scheme, the inter-annotator agreement for the different subtasks, and our observations.

  • ABSTRACT: In this paper, we compare three generalization methods for in-domain and cross-domain opinion holder extraction: simple unsupervised word clustering, an induction method inspired by distant supervision, and the use of lexical resources. The generalization methods are incorporated into diverse classifiers. We show that generalization yields significant improvements and that the size of the improvement depends on the type of classifier and on how much the training and test data differ from each other. We also address the less common case of opinion holders realized in patient position and suggest approaches, including a novel linguistically informed extraction method, for detecting those opinion holders without labeled training data, since standard datasets contain too few instances of this type.
    Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics; 04/2012
  • ABSTRACT: The aim of this paper is twofold: to measure the effect of discourse structure when assessing the overall opinion of a document, and to analyze to what extent these effects depend on the corpus genre. Using Segmented Discourse Representation Theory as our formal framework, we propose several strategies to compute the overall rating. Our results show that discourse-based strategies lead to better scores in terms of accuracy and Pearson's correlation than state-of-the-art approaches.
    Proceedings of the 14th International Conference on Computational Linguistics and Intelligent Text Processing - Volume 2; 03/2013
  • ABSTRACT: Current work on sentiment analysis is characterized by approaches with a pragmatic focus, which use shallow techniques in the interest of robustness but often rely on ad-hoc creation of data sets and methods. We argue that progress towards deep analysis depends on a) enriching shallow representations with linguistically motivated, rich information, and b) focusing different branches of research and combining resources to create synergies with related work in NLP. In this paper, we propose SentiFrameNet, an extension to FrameNet, as a novel representation for sentiment analysis that is tailored to these aims.
    Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis; 07/2012
