Conference Paper

Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification


Abstract

In this paper, we present a novel weakly-supervised method for cross-lingual sentiment analysis. Specifically, we propose a latent sentiment model (LSM) based on latent Dirichlet allocation in which sentiment labels are treated as topics. Prior information extracted from English sentiment lexicons through machine translation is incorporated into LSM model learning, where preferences on the expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. An efficient parameter estimation procedure using variational Bayes is presented. Experimental results on Chinese product reviews show that the weakly-supervised LSM model performs comparably to supervised classifiers such as Support Vector Machines, with an average accuracy of 81% achieved over a total of 5484 review documents. Moreover, starting with a generic sentiment lexicon, the LSM model is able to extract highly domain-specific polarity words from text.

Keywords: Latent sentiment model (LSM); cross-lingual sentiment analysis; generalized expectation; latent Dirichlet allocation
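
As a rough illustration of how the lexicon prior can enter the objective, one common way to write a generalized-expectation penalty added to the variational lower bound is sketched below; the weight $\lambda$, the reference distribution and the notation are illustrative and not necessarily the paper's exact formulation:

$$\mathcal{O}(\Theta) = \mathcal{L}_{\mathrm{VB}}(\Theta) - \lambda \sum_{w \in \mathcal{W}_{\mathrm{lex}}} \mathrm{KL}\!\left( \hat{p}(s \mid w) \,\Big\|\, p_{\Theta}(s \mid w) \right),$$

where $\mathcal{L}_{\mathrm{VB}}$ is the variational lower bound of the LDA-style model with sentiment labels $s$ (e.g. positive, negative, neutral) playing the role of topics, $\mathcal{W}_{\mathrm{lex}}$ is the machine-translated lexicon, $\hat{p}(s \mid w)$ encodes the lexicon's sentiment preference for word $w$, and $p_{\Theta}(s \mid w)$ is the corresponding model expectation.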

... In order to reduce the influence of MT quality on CLSA, a number of studies have tried to improve on the traditional MT-based pipeline. The improvements mainly include: learning from the translation of source-language sentiment lexicons [12], refining the training samples [13], finding the optimal baseline model for sentiment classification [14], using annotated data from multiple source languages [15], incorporating unlabeled texts from the target language [16] and incorporating Distortion Tolerance to balance the sentiment reversals caused by the MT system [5]. Table 1 shows representative early CLSA studies; those marked with * are CLSA approaches based on MT and its improved variants, mainly published between 2011 and 2017. ...
... CLSA based on MT is a supervised method that suffers from a generalization problem, especially when there is a domain mismatch between the source and target languages. To address this problem, He et al. (2011) proposed a weakly supervised model, the latent sentiment model (LSM), which is based on the latent Dirichlet allocation (LDA) model combined with prior sentiment knowledge learned from sentiment lexicons [12]. Compared with other LDA-based models, the LSM model incorporates prior sentiment knowledge into the computation of sentiment preferences using generalized expectation criteria. ...
Article
Full-text available
Cross-lingual sentiment analysis (CLSA) leverages one or several source languages to help low-resource languages perform sentiment analysis. Therefore, the problem of lacking annotated corpora in many non-English languages can be alleviated. Along with the development of economic globalization, CLSA has attracted much attention in the field of sentiment analysis, and the last decade has seen a surge of research in this area. Numerous methods, datasets and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing state-of-the-art CLSA approaches from 2004 to the present. It teases out the research context of cross-lingual sentiment analysis and elaborates the following methods in detail: (1) the early main methods of CLSA, including those based on machine translation and its improved variants, parallel corpora or bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embedding; (3) CLSA based on multi-BERT and other pre-trained models. We further analyze their main ideas, methodologies, shortcomings, etc., and attempt to reach a conclusion on the coverage of languages, datasets and their performance. Finally, we look into the future development of CLSA and the challenges facing the research area.
... Aiming to discover users' interests in different sentiment topics in texts, Almars et al. [24] presented a hierarchical user sentiment topic model, HUSTM, where each word in a document is associated with three latent variables: a user, a topic, and a sentiment. He et al. [25] proposed a weakly-supervised latent sentiment model, which replaces the topic layer in LDA with a sentiment layer and acquires sentiment prior information from English sentiment lexicons via machine translation. Hai et al. [26] presented a supervised joint topic model, SJASM, which leverages the inter-dependency between aspect-based sentiment and overall sentiment, and estimates the overall sentiments of reviews via a normal linear model. ...
Article
Full-text available
Most existing unsupervised approaches to detecting topic sentiment in social texts consider only the text sequences in the corpus and set aside social dynamics, which prevents such algorithms from discovering the true sentiment of social users. To address this issue, a probabilistic graphical model, LDTSM (Long-term Dependence Topic-Sentiment Mixture), is proposed, which introduces dependency distance and uses the dynamics of social media to combine the inheritance of historical topic sentiment with fitting the topic-sentiment distribution underlying current social texts. Extensive experiments on real-world SinaWeibo datasets show that LDTSM significantly outperforms JST, TUS-LDA and dNJST in terms of sentiment classification accuracy, with better inference convergence, and the topic and sentiment evolution analysis results demonstrate that our approach is promising.
... The Latent Sentiment Model (LSM) was proposed by He (2011). In the LSM model, topics are restricted to three special sentiment-bearing topics in order to perform sentiment analysis of documents. However, it is a semi-supervised approach, and some manual intervention is required. ...
Article
Full-text available
The Latent Dirichlet Allocation (LDA) topic model is a popular research topic in the field of text mining. In this paper, a Sentiment Word Co-occurrence and Knowledge Pair Feature Extraction based LDA Short Text Clustering Algorithm (SKP-LDA) is proposed. A definition of a word bag based on sentiment word co-occurrence is proposed. The co-occurrence of emotional words takes full account of different short texts. Then, the short texts of a microblog are endowed with emotional polarity. Furthermore, the knowledge pairs of topic special words and topic relation words are extracted and inserted into the LDA model for clustering. Thus, semantic information can be found more accurately. Then, the hidden n topics and the Top30 special word set of each topic are extracted from the knowledge pair set. Finally, via primary clustering with the LDA topic model, a Top30 topic special word set is obtained, which is then clustered by K-means secondary clustering. The clustering center is optimized iteratively. Compared with JST, LSM, LTM and ELDA, SKP-LDA performs better in terms of Accuracy, Precision, Recall and F-measure. The experimental results show that SKP-LDA reveals better semantic analysis ability and emotional topic clustering effect. It can be applied to the field of micro-blog to improve the accuracy of network public opinion analysis effectively.
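
To make the primary/secondary clustering idea concrete, the following minimal sketch fits an LDA topic model, collects each topic's top words, and re-clusters those words with K-means; it deliberately omits SKP-LDA's sentiment word co-occurrence and knowledge-pair features, and the toy corpus, topic count and cluster count are placeholder assumptions:

```python
# Sketch of "LDA primary clustering + K-means secondary clustering" (not SKP-LDA itself).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import KMeans

corpus = [                                   # placeholder short texts
    "battery life is great and charging is fast",
    "screen is dim and the colours look poor",
    "camera quality is excellent in daylight",
    "the speaker sounds terrible at high volume",
    "great value phone with a decent screen",
    "battery drains quickly which is disappointing",
]

vec = CountVectorizer()
X = vec.fit_transform(corpus)

# Primary clustering: fit LDA and gather each topic's top words.
n_topics, top_k = 3, 5
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(X)
vocab = np.array(vec.get_feature_names_out())
top_words = sorted({w for topic in lda.components_
                    for w in vocab[topic.argsort()[-top_k:]]})

# Secondary clustering: represent each top word by its topic-weight column
# and group the words with K-means.
word_vectors = lda.components_[:, [vec.vocabulary_[w] for w in top_words]].T
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(word_vectors)
print(dict(zip(top_words, labels)))
```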
... A considerable amount of research emerged in this area in 2011. He [24] presents a weakly-supervised technique that uses a latent sentiment model and considers sentiment labels as topics. The experiments are conducted on Chinese reviews and the accuracy is found to be comparable to that of supervised classification methods. ...
Article
Full-text available
The user-generated database of the internet is expanding at a swift rate with the dramatic growth of social media. It includes information as well as personal opinions about products, ideas, news, politics, etc. These online opinions and reviews act as a word-of-mouth medium for enhancing or diminishing the popularity of a product, item or concept. Thus, automated analysis of the tone of online opinions helps customers and business personnel significantly to take decisions and develop strategies efficiently. This task, known as sentiment analysis, is an area of active research that relies heavily on the text processing methodology called word embedding. Word embedding is a process of representing text in numeric format, to enable mathematical operations on it. The present study proposes a method of enhancing the performance of word embedding approaches, by integrating sentiment-based information, to render them more suitable for sentiment analysis. Sentiment-based information is incorporated through a self-organizing map, where similarity is calculated based on the scores of sentiment-based words. The similarity is further tuned using the particle swarm optimization method. Experimentally, the performance of the proposed method is justified for the sentiment analysis task using various classifiers. Various performance measurement indexes are used to validate the superiority of the proposed method compared to existing approaches.
... As for more retrieval-oriented tasks, the ranking of products and reviews benefits from sentiment detection [10]: by identifying categories important to the users from sentiments expressed on Twitter, products can be re-ranked accordingly. Moreover, cross-language retrieval and ranking can incorporate sentiments and their respective translations [19]. Finally, annotating search results with the expressed general sentiment can be helpful as a facet in result presentation [11]. ...
Conference Paper
Full-text available
We reproduce three classification approaches with diverse feature sets for the task of classifying the sentiment expressed in a given tweet as either positive, neutral, or negative. The reproduced approaches are also combined in an ensemble, averaging the individual classifiers’ confidence scores for the three classes and deciding sentiment polarity based on these averages. Our experimental evaluation on SemEval data shows our re-implementations to slightly outperform their respective originals. Moreover, in the SemEval Twitter sentiment detection tasks of 2013 and 2014, the ensemble of reproduced approaches would have been ranked in the top-5 among 50 participants. An error analysis shows that the ensemble classifier makes few severe misclassifications, such as identifying a positive sentiment in a negative tweet or vice versa. Instead, it tends to misclassify tweets as neutral that are not, which can be viewed as the safest option.
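
The ensemble rule described above, averaging the per-class confidence scores of the base classifiers and taking the most probable class, can be sketched as follows; the class ordering and example scores are hypothetical:

```python
# Confidence-averaging ensemble for three-way tweet sentiment (illustrative sketch).
import numpy as np

CLASSES = ["negative", "neutral", "positive"]

def ensemble_predict(confidences):
    """confidences: list of arrays, one per classifier, each of shape (n_tweets, 3)."""
    avg = np.mean(np.stack(confidences, axis=0), axis=0)   # average over classifiers
    return [CLASSES[i] for i in avg.argmax(axis=1)]        # pick the most probable class

# Hypothetical scores from three base classifiers for two tweets.
clf_a = np.array([[0.1, 0.3, 0.6], [0.7, 0.2, 0.1]])
clf_b = np.array([[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]])
clf_c = np.array([[0.1, 0.2, 0.7], [0.5, 0.4, 0.1]])
print(ensemble_predict([clf_a, clf_b, clf_c]))             # ['positive', 'negative']
```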
... For example, Wan (2009) applies a supervised cotraining framework to iteratively adapt knowledge learned from the two languages by transferring translated texts to each other. Other similar work includes (Wei and Pal, 2010) and (He, 2011b). All these approaches rely on MT to build language connection. ...
Conference Paper
Full-text available
Cross-lingual sentiment analysis is the task of identifying the sentiment polarities of texts in a low-resource language by using sentiment knowledge in a resource-abundant language. While most existing approaches are driven by transfer learning, their performance does not reach a promising level due to transferred errors. In this paper, we propose to integrate into knowledge transfer a knowledge validation model, which aims to prevent the negative influence of wrong knowledge by distinguishing highly credible knowledge. Experimental results demonstrate the necessity and effectiveness of the model.
Conference Paper
Sentiment connection is the basis of cross-lingual sentiment analysis (CLSA) solutions. Most existing work mainly focuses on general semantic connections, where misleading information caused by non-sentimental semantics can lead to relatively low efficiency. In this paper, we propose to capture the document-level sentiment connection across languages (called cross-lingual sentiment relation) for CLSA in a joint two-view convolutional neural network, namely the Bi-View CNN (BiVCNN). Inspired by relation embedding learning, we first project the extracted parallel sentiments into a bilingual sentiment relation space, then capture the relation by subtracting them with an error tolerance. The bilingual sentiment relation considered in this paper is the shared sentiment polarity between two parallel texts. Experiments conducted on public datasets demonstrate the effectiveness and efficiency of the proposed approach.
Conference Paper
With special symbols such as “@”, “#” and “//”, and with many emoticons, pictures, etc., tweets differ from other user-generated texts such as blogs, forums and reviews. Considering the structural features and content of tweets, we present a semi-supervised Aspect and Sentiment Unification Model (PL-SASU). By using more information than the text alone, this model can model tweets better. Experiments on sentiment classification and aspect identification on real Twitter data show that PL-SASU outperforms the JST, ASUM and UTSU models.
This article presents two novel approaches for incorporating sentiment prior knowledge into the topic model for weakly supervised sentiment analysis where sentiment labels are considered as topics. One is by modifying the Dirichlet prior for topic-word distribution (LDA-DP), the other is by augmenting the model objective function through adding terms that express preferences on expectations of sentiment labels of the lexicon words using generalized expectation criteria (LDA-GE). We conducted extensive experiments on English movie review data and multi-domain sentiment dataset as well as Chinese product reviews about mobile phones, digital cameras, MP3 players, and monitors. The results show that while both LDA-DP and LDA-GE perform comparably to existing weakly supervised sentiment classification algorithms, they are much simpler and computationally efficient, rendering them more suitable for online and real-time sentiment classification on the Web. We observed that LDA-GE is more effective than LDA-DP, suggesting that it should be preferred when considering employing the topic model for sentiment analysis. Moreover, both models are able to extract highly domain-salient polarity words from text.
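
For intuition, one simple way to realize the LDA-DP idea of modifying the Dirichlet prior for the topic-word distribution is sketched below; the specific scaling scheme is an assumption, not necessarily the article's exact choice:

$$\beta_{s,w} = \beta_0 \,\gamma_{s,w}, \qquad \gamma_{s,w} = \begin{cases} 1, & w \notin \mathcal{W}_{\mathrm{lex}} \text{ or } \mathrm{pol}(w) = s, \\ \epsilon \ll 1, & w \in \mathcal{W}_{\mathrm{lex}} \text{ and } \mathrm{pol}(w) \neq s, \end{cases}$$

so a lexicon word is strongly discouraged from being generated under a conflicting sentiment label, whereas LDA-GE leaves the prior symmetric and instead adds generalized-expectation terms to the objective, in the spirit of the penalty sketched after the abstract at the top of this page.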
Conference Paper
Full-text available
We address the problem of sentiment and objectivity classification of product reviews in Chinese. Our approach is distinctive in that it treats both positive/negative sentiment and subjectivity/objectivity not as distinct classes but rather as a continuum; we argue that this is desirable from the perspective of would-be customers who read the reviews. We use novel unsupervised techniques, including a one-word 'seed' vocabulary and iterative retraining for sentiment processing, and a criterion of 'sentiment density' for determining the extent to which a document is opinionated. The classifier achieves up to 87% F-measure for sentiment polarity detection.
Conference Paper
Full-text available
Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood or Bayesian methods. In this paper, we discuss an alternative: a discriminative framework in which we assume that supervised side information is present, and in which we wish to take that side information into account in finding a reduced-dimensionality representation. Specifically, we present DiscLDA, a discriminative variation on Latent Dirichlet Allocation (LDA) in which a class-dependent linear transformation is introduced on the topic mixture proportions. This parameter is estimated by maximizing the conditional likelihood. By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification. We compare the predictive power of the latent structure of DiscLDA with unsupervised LDA on the 20 Newsgroups document classification task and show how our model can identify shared topics across classes as well as class-dependent topics.
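
The class-dependent transformation at the heart of DiscLDA can be summarized as follows (a sketch; the notation here is illustrative):

$$z_{d,n} \sim \mathrm{Mult}\!\left( T^{y_d} \theta_d \right),$$

where $\theta_d$ is the document's topic mixture, $y_d$ its class label, and $T^{y}$ a class-specific linear transformation whose parameters are estimated by maximizing the conditional likelihood $p(y_d \mid \mathbf{w}_d)$; the transformed proportions then serve as the reduced-dimensionality representation used for classification.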
Conference Paper
Full-text available
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages have multiple tags, but the tags do not always apply with equal specificity across the whole document. Solving the credit attribution problem requires associating each word in a document with the most appropriate tags and vice versa. This paper introduces Labeled LDA, a topic model that constrains Latent Dirichlet Allocation by defining a one-to-one correspondence between LDA's latent topics and user tags. This allows Labeled LDA to directly learn word-tag correspondences. We demonstrate Labeled LDA's improved expressiveness over traditional LDA with visualizations of a corpus of tagged web pages from del.icio.us. Labeled LDA outperforms SVMs by more than 3 to 1 when extracting tag-specific document snippets. As a multi-label text classifier, our model is competitive with a discriminative baseline on a variety of datasets.
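
The one-to-one topic-tag correspondence is enforced by restricting each document's topic distribution to its observed label set, roughly (illustrative notation):

$$\theta^{(d)} \sim \mathrm{Dir}\!\left( \alpha^{(d)} \right), \qquad \alpha^{(d)}_k = \begin{cases} \alpha, & k \in \Lambda^{(d)}, \\ 0, & \text{otherwise}, \end{cases}$$

where $\Lambda^{(d)}$ is the set of tags observed for document $d$, so that words in $d$ can only be assigned to topics corresponding to its own tags.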
Conference Paper
Full-text available
This paper presents the SELC Model (SElf-Supervised, Lexicon-based and Corpus-based Model) for sentiment classification. The SELC Model includes two phases. The first phase is a lexicon-based iterative process. In this phase, some reviews are initially classified based on a sentiment dictionary. Then more reviews are classified through an iterative process with a negative/positive ratio control. In the second phase, a supervised classifier is learned by taking some reviews classified in the first phase as training data. The supervised classifier is then applied to other reviews to revise the results produced in the first phase. Experiments show the effectiveness of the proposed model. SELC achieves a total F1-score improvement of 6.63% over the best result in previous studies on the same data (from 82.72% to 89.35%). The first phase of the SELC Model independently achieves a 5.90% improvement (from 82.72% to 88.62%). Moreover, the standard deviation of F1-scores is reduced, which shows that the SELC Model could be more suitable for domain-independent sentiment classification.
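
A minimal sketch of the two-phase idea, lexicon-based labeling followed by a supervised classifier trained on the phase-one labels, is shown below; the toy lexicons, reviews and scoring rule are illustrative assumptions and omit SELC's negative/positive ratio control and iterative refinement:

```python
# Two-phase "lexicon first, then supervised classifier" sketch (not the SELC code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

POS = {"good", "great", "excellent"}      # toy lexicons
NEG = {"bad", "poor", "terrible"}

def lexicon_label(text):
    """Phase 1: label a review by counting lexicon hits; None if undecided."""
    toks = text.lower().split()
    score = sum(t in POS for t in toks) - sum(t in NEG for t in toks)
    return 1 if score > 0 else 0 if score < 0 else None

reviews = [
    "great phone with an excellent screen",
    "terrible battery and poor build",
    "it arrived on monday in a plain box",
    "good camera but poor speaker",
    "bad value for the price",
]

labeled = [(r, y) for r in reviews if (y := lexicon_label(r)) is not None]
unlabeled = [r for r in reviews if lexicon_label(r) is None]

# Phase 2: train a supervised classifier on the phase-1 labels and use it to
# classify (or revise) the remaining reviews.
vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform([r for r, _ in labeled]),
                               [y for _, y in labeled])
if unlabeled:
    print(list(zip(unlabeled, clf.predict(vec.transform(unlabeled)))))
```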
Conference Paper
Full-text available
We describe and evaluate a new method of automatic seed word selection for unsupervised sentiment classification of product reviews in Chinese. The whole method is unsupervised and does not require any annotated training data; it only requires information about commonly occurring negations and adverbials. Unsupervised techniques are promising for this task since they avoid problems of domain-dependency typically associated with supervised methods. The results obtained are close to those of supervised classifiers and sometimes better, up to an F1 of 92%.
Article
Full-text available
This paper presents a comparative study of three closely related Bayesian models for unsupervised document-level sentiment classification, namely, the latent sentiment model (LSM), the joint sentiment-topic (JST) model, and the Reverse-JST model. Extensive experiments have been conducted on two corpora, the movie review dataset and the multi-domain sentiment dataset. It has been found that while all three models achieve either better or comparable performance on these two corpora when compared to existing unsupervised sentiment classification approaches, both JST and Reverse-JST are able to extract sentiment-oriented topics. In addition, Reverse-JST always performs worse than JST, suggesting that the JST model is more appropriate for joint sentiment-topic detection.
Article
Full-text available
We describe a modification to the AdaBoost algorithm that permits the incorporation of prior human knowledge as a means of compensating for a shortage of training data. We give a convergence result for the algorithm.
Article
This note describes generalized expectation (GE) criteria, a framework for incorporating preferences about model expectations into parameter estimation objective functions. We discuss relations to other methods, various learning paradigms it supports, and applications that can leverage its flexibility.
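
In its general form, a GE criterion scores the discrepancy between a model expectation and a preferred target value, for example (illustrative notation):

$$G(\theta) = \Delta\!\left( \tilde{\mathbf{f}},\; E_{p_\theta}\!\left[ \mathbf{f}(x, y) \right] \right),$$

which is added, with a weight, to a likelihood or conditional-likelihood objective; common choices for $\Delta$ include squared error and the KL divergence from a reference distribution.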
Article
Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
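
The DMR prior couples document metadata to topic proportions through a log-linear function (a sketch in illustrative notation):

$$\theta_d \sim \mathrm{Dirichlet}(\alpha_d), \qquad \alpha_{d,t} = \exp\!\left( \mathbf{x}_d^{\top} \boldsymbol{\lambda}_t \right),$$

where $\mathbf{x}_d$ is the observed feature vector of document $d$ (author, venue, dates, etc.) and $\boldsymbol{\lambda}_t$ are learned per-topic regression coefficients.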
Conference Paper
It is difficult to apply machine learning to new domains because often we lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domain knowledge in the form of affinities between input features and classes. For example, in a baseball vs. hockey text classification problem, even without any labeled data, we know that the presence of the word puck is a strong indicator of hockey. We refer to this type of domain knowledge as a labeled feature. In this paper, we propose a method for training discriminative probabilistic models with labeled features and unlabeled instances. Unlike previous approaches that use labeled features to create labeled pseudo-instances, we use labeled features directly to constrain the model's predictions on unlabeled instances. We express these soft constraints using generalized expectation (GE) criteria: terms in a parameter estimation objective function that express preferences on values of a model expectation. In this paper we train multinomial logistic regression models using GE criteria, but the method we develop is applicable to other discriminative probabilistic models. The complete objective function also includes a Gaussian prior on parameters, which encourages generalization by spreading parameter weight to unlabeled features. Experimental results on text classification data sets show that this method outperforms heuristic approaches to training classifiers with labeled features. Experiments with human annotators show that it is more beneficial to spend limited annotation time labeling features rather than labeling instances. For example, after only one minute of labeling features, we can achieve 80% accuracy on the ibm vs. mac text classification problem using GE-FL, whereas ten minutes of labeling documents results in an accuracy of only 77%.
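
The GE-FL objective described above can be sketched as follows (illustrative notation; the weighting and exact form may differ from the paper):

$$\mathcal{O}(\theta) = -\sum_{k \in K} \mathrm{KL}\!\left( \hat{p}_k(y) \,\Big\|\, \tilde{p}_{\theta,k}(y) \right) - \sum_{j} \frac{\theta_j^2}{2\sigma^2},$$

where $K$ is the set of labeled features, $\hat{p}_k(y)$ is a reference label distribution for feature $k$ (e.g. mostly hockey for the word puck), $\tilde{p}_{\theta,k}(y)$ is the model's predicted label distribution averaged over the unlabeled instances containing feature $k$, and the second term is the Gaussian prior on parameters.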
Conference Paper
In this work, we propose a novel scheme for sentiment classification (without labeled examples) which combines the strengths of both "learn-based" and "lexicon-based" approaches as follows: we first use a lexicon-based technique to label a portion of informative examples from given task (or domain); then learn a new supervised classifier based on these labeled ones; finally apply this classifier to the task. The experimental results indicate that proposed scheme could dramatically outperform "learn-based" and "lexicon-based" techniques.
Conference Paper
There is a growing interest in mining opinions using sentiment analysis methods from sources such as news, blogs and product reviews. Most of these methods have been developed for English and are difficult to generalize to other languages. We explore an approach utilizing state-of-the-art machine translation technology and perform sentiment analysis on the English translation of a foreign language text. Our experiments indicate that (a) entity sentiment scores obtained by our method are statistically significantly correlated across nine languages of news sources and five languages of a parallel corpus; (b) the quality of our sentiment analysis method is largely translator independent; (c) after applying certain normalization techniques, our entity sentiment scores can be used to perform meaningful cross-cultural comparisons.
Conference Paper
It is a challenging task to identify sentiment polarity of Chinese reviews because the resources for Chinese sentiment analysis are limited. Instead of leveraging only monolingual Chinese knowledge, this study proposes a novel approach to leverage reliable English resources to improve Chinese sentiment analysis. Rather than simply projecting English resources onto Chinese resources, our approach first translates Chinese reviews into English reviews by machine translation services, and then identifies the sentiment polarity of English reviews by directly leveraging English resources. Furthermore, our approach performs sentiment analysis for both Chinese reviews and English reviews, and then uses ensemble methods to combine the individual analysis results. Experimental results on a dataset of 886 Chinese product reviews demonstrate the effectiveness of the proposed approach. The individual analysis of the translated English reviews outperforms the individual analysis of the original Chinese reviews, and the combination of the individual analysis results further improves the performance.
Conference Paper
Although research in other languages is increasing, much of the work in subjectivity analysis has been applied to English data, mainly due to the large body of electronic resources and tools that are available for this language. In this paper, we propose and evaluate methods that can be employed to transfer a repository of subjectivity resources across languages. Specifically, we attempt to leverage the resources available for English and, by employing machine translation, generate resources for subjectivity analysis in other languages. Through comparative evaluations on two different languages (Romanian and Spanish), we show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.
Conference Paper
The lack of Chinese sentiment corpora limits the research progress on Chinese sentiment classification. However, there are many freely available English sentiment corpora on the Web. This paper focuses on the problem of cross-lingual sentiment classification, which leverages an available English corpus for Chinese sentiment classification by using the English corpus as training data. Machine translation services are used for eliminating the language gap between the training set and test set, and English features and Chinese features are considered as two independent views of the classification problem. We propose a co-training approach to making use of unlabeled Chinese data. Experimental results show the effectiveness of the proposed approach, which can outperform the standard inductive classifiers and the transductive classifiers.
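
A generic co-training loop of this kind, with two feature views of the same reviews and each view's classifier labeling its most confident unlabeled examples for both views, can be sketched as follows; the function, its parameters and the use of logistic regression are illustrative assumptions rather than the authors' implementation:

```python
# Generic two-view co-training sketch (English view Xe_*, Chinese view Xc_*).
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(Xe_l, Xc_l, y_l, Xe_u, Xc_u, rounds=5, per_round=2):
    """Xe_*/Xc_*: dense feature matrices of labeled (_l) and unlabeled (_u) reviews
    in the English and Chinese views; y_l: labels of the labeled reviews."""
    Xe_l, Xc_l, y_l = Xe_l.copy(), Xc_l.copy(), list(y_l)
    pool = list(range(len(Xe_u)))                       # indices of unlabeled reviews
    clf_e, clf_c = LogisticRegression(), LogisticRegression()
    for _ in range(rounds):
        if not pool:
            break
        clf_e.fit(Xe_l, y_l)
        clf_c.fit(Xc_l, y_l)
        for clf, X_view in ((clf_e, Xe_u), (clf_c, Xc_u)):
            if not pool:
                break
            conf = clf.predict_proba(X_view[pool]).max(axis=1)
            picks = [pool[i] for i in np.argsort(conf)[-per_round:]]
            for i in picks:                             # add confident examples to BOTH views
                y_l.append(int(clf.predict(X_view[i:i + 1])[0]))
                Xe_l = np.vstack([Xe_l, Xe_u[i:i + 1]])
                Xc_l = np.vstack([Xc_l, Xc_u[i:i + 1]])
                pool.remove(i)
    return clf_e, clf_c
```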
Conference Paper
This paper explores methods for generating subjectivity analysis resources in a new language by leveraging the tools and resources available in English. Given a bridge between English and the selected target language (e.g., a bilingual dictionary or a parallel corpus), the methods can be used to rapidly create tools for subjectivity analysis in the new language.
Article
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called the joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification, which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify review sentiment polarity, and minimal prior information has also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
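
In JST the word distribution is conditioned on a per-word sentiment label as well as a topic, so the per-document word probability factorizes roughly as (illustrative notation):

$$p(w \mid d) = \sum_{l} \sum_{z} p(w \mid z, l)\, p(z \mid l, d)\, p(l \mid d),$$

with the sentiment label $l$ drawn from a document-level sentiment distribution, the topic $z$ drawn conditioned on $l$, and the word $w$ drawn from a sentiment-and-topic-specific word distribution.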
We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive an approximate maximum-likelihood procedure for parameter estimation, which relies on variational methods to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and the political tone of amendments in the U.S. Senate based on the amendment text. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression.
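
In sLDA the response of each document is drawn from a generalized linear model of its empirical topic frequencies; for the movie-rating case this reduces to a Gaussian response (illustrative notation):

$$y_d \mid z_{d,1:N}, \eta, \sigma^2 \sim \mathcal{N}\!\left( \eta^{\top} \bar{z}_d,\; \sigma^2 \right), \qquad \bar{z}_d = \frac{1}{N} \sum_{n=1}^{N} z_{d,n},$$

where $\bar{z}_d$ is the empirical topic-proportion vector of document $d$ and $\eta$ the regression coefficients estimated jointly with the topics.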