Conference Paper

Aspect and sentiment unification model for online review analysis.

DOI: 10.1145/1935826.1935932 Conference: Proceedings of the Fourth International Conference on Web Search and Web Data Mining, WSDM 2011, Hong Kong, China, February 9-12, 2011
Source: DBLP

ABSTRACT User-generated reviews on the Web contain sentiments about detailed aspects of products and services. However, most of the reviews are plain text and thus require much effort to obtain information about relevant details. In this paper, we tackle the problem of automatically discovering what aspects are evaluated in reviews and how sentiments for different aspects are expressed. We first propose Sentence-LDA (SLDA), a probabilistic generative model that assumes all words in a single sentence are generated from one aspect. We then extend SLDA to Aspect and Sentiment Unification Model (ASUM), which incorporates aspect and sentiment together to model sentiments toward different aspects. ASUM discovers pairs of {aspect, sentiment} which we call senti-aspects. We applied SLDA and ASUM to reviews of electronic devices and restaurants. The results show that the aspects discovered by SLDA match evaluative details of the reviews, and the senti-aspects found by ASUM capture important aspects that are closely coupled with a sentiment. The results of sentiment classification show that ASUM outperforms other generative models and comes close to supervised classification methods. One important advantage of ASUM is that it does not require any sentiment labels of the reviews, which are often expensive to obtain.
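The one-aspect-per-sentence assumption behind SLDA can be illustrated with a small generative sketch. This is not the authors' implementation: the symmetric Dirichlet prior over aspects, the fixed sentence length, and the names `sample_dirichlet` and `generate_review` are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_review(num_sentences, words_per_sentence, aspect_word_dists,
                    vocab, alpha, rng):
    """Generate one synthetic review under a sentence-level aspect assumption.

    aspect_word_dists[k] is the word distribution of aspect k over `vocab`.
    Assumed (hypothetical) setup: a symmetric Dirichlet prior with parameter
    `alpha` over the review's aspect proportions and a fixed sentence length.
    """
    num_aspects = len(aspect_word_dists)
    # Per-review aspect proportions.
    theta = sample_dirichlet(alpha, num_aspects, rng)
    review = []
    for _ in range(num_sentences):
        # A single aspect is drawn for the whole sentence ...
        aspect = rng.choices(range(num_aspects), weights=theta, k=1)[0]
        # ... and every word in that sentence is drawn from that aspect.
        words = rng.choices(vocab, weights=aspect_word_dists[aspect],
                            k=words_per_sentence)
        review.append((aspect, words))
    return review


if __name__ == "__main__":
    vocab = ["battery", "screen", "pasta", "waiter"]
    # Two toy aspects with disjoint vocabularies (illustrative values only).
    dists = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
    for aspect, words in generate_review(3, 4, dists, vocab, 1.0, random.Random(0)):
        print(aspect, " ".join(words))
```

In standard LDA the aspect would be redrawn for every word; here it is drawn once per sentence, which is the constraint the abstract describes. ASUM additionally pairs each aspect with a sentiment, which this sketch does not model.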

  • ABSTRACT: Influence maximization (IM) is the problem of finding a small subset of nodes (seed nodes) in a social network that could maximize the spread of influence. Despite the progress achieved by state-of-the-art greedy IM techniques, they suffer from two key limitations. First, they are inefficient, as they can take days to find seeds in very large real-world networks. Second, although extensive research in social psychology suggests that humans readily conform to the wishes or beliefs of others, existing IM techniques are conformity-unaware. That is, they utilize only an individual's ability to influence another and ignore the conformity (a person's inclination to be influenced) of individuals. In this paper, we propose a novel conformity-aware cascade (\(\textsc{c}^2\)) model, which leverages the interplay between influence and conformity to obtain the influence probabilities of nodes from underlying data for estimating influence spreads. We also propose a variant of this model, the \(\textsc{c}^3\) model, which supports context-specific influence and conformity of nodes. A salient feature of these models is that they are aligned with the popular social forces principle in social psychology. Based on these models, we propose a novel greedy algorithm called CINEMA that generates a high-quality seed set for the IM problem. It first partitions the network into a set of non-overlapping subnetworks and, for each subnetwork, computes the influence and conformity indices of nodes by analyzing the sentiments expressed by individuals. Each subnetwork is then associated with a cog-sublist, which stores the marginal gains of the nodes in the subnetwork in descending order. The node with the maximum marginal gain in each cog-sublist is stored in a data structure called the mag-list. CINEMA manipulates these structures to find the seed set efficiently. A key feature of this partitioning-based strategy is that each node's influence computation and updates can be limited to the subnetwork in which it resides instead of the entire network. This paves the way for seamless adoption of CINEMA on a distributed platform. Our empirical study with real-world social networks comprising millions of nodes demonstrates that CINEMA, as well as its context-aware and distributed variants, generates seed sets of superior quality compared to state-of-the-art IM approaches.
    The VLDB Journal 02/2014; 24(1):117-141. · 1.70 Impact Factor
  • ABSTRACT: The diffusion of user-generated content (UGC) on social networks has attracted considerable attention in various disciplines. Within tourism, social media has grown rapidly in recent years, both through general social network sites (SNS) and through thematic ones such as TripAdvisor. The present study aims to provide deeper insight into this matter. Its starting point is that clients post good or bad reviews about different aspects of their experience, and that a client who has a good experience in a restaurant tends to revisit it and recommend it to friends, whereas a client who has a bad experience tells friends about it and recommends that they not visit. To assess customers' reviews of restaurants, data were gathered from TripAdvisor on the top 10 restaurants in two island contexts, the Azores and Hawaii. All the comments were studied carefully and categorized into a set of dimensions that measured how the entirety of a meal was perceived: sight, hearing, smell, taste and touch. The results show that food is the most decisive variable in the UGC. Additionally, our findings support the notion that the overall quality of the meal reflects a lot more than the flavor or taste of the food. To these elements we need to add visual effect, freshness of the ingredients, and healthiness of the meal, among others, as the main content spread on SNS. Thus, the results reinforce the literature on social media and add to the knowledge of the content created and shared by tourists about the restaurant experience as a whole.
    10/2015; 175:162-169.
  • ABSTRACT: As online resources such as weblogs, product reviews, and news reviews keep growing, it is difficult to read them and extract useful information, especially emotion information. Emotion analysis of online information has received much attention from the natural language processing field in recent years. Most existing work studies single-label emotion analysis, which often ignores the complexity of human feelings. This paper constructs a multi-label emotion topic model for recognizing the complicated emotions of weblog sentences, based on the Chinese emotion corpus Ren-CECps. We employ latent topic variables and emotion variables to find the complex emotions of a sentence. Experimental results indicate that the model is reasonable and effective in recognizing the mixed emotions of weblog sentences.
    2013 IEEE/SICE International Symposium on System Integration (SII); 12/2013
