Chapter

Comparison, Classification and Survey of Aspect Based Sentiment Analysis: Second International Conference, ICAICR 2018, Shimla, India, July 14–15, 2018, Revised Selected Papers, Part I


Abstract

Sentiment Analysis is the study of sentiments expressed by people. Aspect based Sentiment Analysis is the study of sentiments expressed by people regarding the aspects of an entity. It is becoming an important task in capturing the finer sentiments about objects as described by people in their opinions. In the present paper we describe several techniques which have come up in recent years involving aspect term extraction and/or aspect sentiment prediction. The paper describes the taxonomy of aspect based sentiment analysis with a detailed explanation of the recent methods used. It also gives the pros and cons of the research papers discussed and compares all the papers in tabular form.


... The task was executed using supervised and unsupervised methods [12]. T2 assigns the polarity of the sentiment analysis to the extracted aspect [13]. T3 identifies the category using a multilabel classifier that classifies each entity into multiple labels, where the label consists of entities and aspects. ...
Article
Educational institutions typically gather feedback from beneficiaries through formal surveys. Offering open-ended questions allows students to express their opinions about matters that may not have been measured directly in closed-ended questions. However, responses to open-ended questions are typically overlooked due to the time and effort required. Aspect-based sentiment analysis is used to automate the process of extracting fine-grained information from texts. This study aims to 1) examine the performance of different BERT-based models for aspect term extraction for Arabic text sourced from educational institution surveys; 2) develop a system that automates the ABSA process in a way that will automatically label survey responses. An end-to-end system was developed as a case study to extract aspect terms, identify their polarity, map extracted aspects to their respective categories, and aggregate category polarity. To accomplish this, the models were evaluated using an in-house dataset. The result showed that FAST-LCF-ATEPC, a multilingual checkpoint, outperformed other models including AraBERT, MARBERT, and QARiB, in the aspect-term extraction task, with an F1 score of 0.58. Hence, it was used for aspect-term polarity classification, showing an F1 score of 0.86. Mapping aspects to their respective categories using a predefined list yielded an average F1 score of 0.98. Furthermore, the polarities of the categories were aggregated to summarize the overall polarity for each category. The developed system can support Arabic educational institutions in harnessing valuable information in responses to open-ended survey questions, allowing decision-makers to better allocate resources, and improve facilities, services, and students’ learning experiences.
Article
In recent years, aspect category detection has become popular due to the rapid growth in customer reviews data on e-commerce and other online platforms. Aspect Category Detection, a sub-task of Aspect-Based Sentiment Analysis, categorizes the reviews based on the features of a product such as a laptop’s display, or an aspect of an entity such as the restaurant’s ambiance. Various methods have been proposed to deal with such a problem. In this paper, we first introduce several datasets in the community that deal with this task and take a closer look at them by providing some exploratory analysis. Then, we review a number of representative methods for aspect category detection and classify them into two main groups: 1) supervised learning, and 2) unsupervised learning. Next, we discuss the strengths and weaknesses of different kinds of methods, which are expected to benefit both practical applications and future research. Finally, we discuss the challenges, open problems, and future research directions.
Article
Sentiment analysis or opinion mining has come forth as an attractive research field in the past few years. Sentiment analysis extracts sentiments from the text for analysis and aggregation at different levels of detail. In aspect-level sentiment analysis, we aggregate sentiment for different aspects of entities. The bulk of the research work executed so far focuses on detecting explicit aspects but ignored implicit aspects, which are insinuated by other existing words and articulates of the sentence. Since a significant percentage of sentences contain implicit aspects, detection of implicit aspects becomes vital for sentiment analysis. This survey concentrates on implicit aspect detection, and a detailed discussion about state of the art is provided. The available methods are categorized depending on the algorithm applied. Quantitative evaluation for different methods as stated by authors is included for comparison purpose. Discussion about terminology, issues, and scope in the detection of implicit aspects is also included. The fine-grained sentiment information collected may be used in many applications in various domains. This survey aims to advocate the need for implicit aspect detection, determine existing efficient solutions, identify complications in implicit aspect detection, and suggest measures to improve performance, which comprise future research trends in implicit aspect detection.
Article
Hotel booking websites use online ratings and customer feedback to help the customer’s decision-making process. Reviews provide better insight about the hotel, but most travellers don’t have the time or patience to read all of them. This study analyzes hotel reviews and gives information that ratings might overlook. The reviews and metadata are crawled from a website and classified into predefined classes according to some common aspects. Then a topic modelling technique (LDA) is applied to identify hidden information and aspects, followed by sentiment analysis on the classified sentences and summarization. Finally we discuss results and future work, ultimately building towards a Hotel Recommender System.
Article
We study the problem of automatic extraction of aspects from code-mixed social media data in the form of topic clusters. To address this, we present the background and propose a code-mixed probabilistic topic model. Unlike the standard Latent Dirichlet Allocation (LDA) model, it updates the distribution of words to a distribution of cross-lingual sets. This enhances LDA to process code-mixed data and generate topic clusters by i) improving the relevance of aspect clusters by restricting insignificant words from inclusion in the clusters and ii) encouraging inclusion of coherent words which are semantically related to each other. This becomes possible by leveraging cross-lingual semantic information from a multilingual dictionary called BabelNet. We call our proposed model the code-mixed semantic LDA (cms-LDA) model. Our results indicate that cms-LDA substantially improves the coherence of aspects in topic clusters as compared to the standard topic modeling counterparts. In our experiments we compared the performance of our model using three forms of data: i) monolingual, where data is written in a single language and the language is known; ii) code-mixed data with automatic language identification and monolingual cluster representations of the same.
Conference Paper
This paper presents Eatery, a multi-aspect restaurant rating system that identifies rating values for different aspects of a restaurant by means of aspect-level sentiment analysis. Eatery uses a hierarchical taxonomy that represents relationships between various aspects of the restaurant domain that enables finding the sentiment score of an aspect as a composite sentiment score of its sub-aspects. The system consists of a word co-occurrence based technique to identify multiple implicit aspects appearing in a sentence of a review. An improved version of Analytic Hierarchy Process (AHP) is used to obtain weights specific to a restaurant by utilizing the relationships between aspects, which allows finding the composite sentiment score for each aspect in the taxonomy. The system also has the ability to rate individual food items and food categories. An improved version of Single Pass Partition Method (SPPM) is used to categorise food names to obtain food categories.
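As a rough illustration of the composite scoring idea described above, the following Python sketch computes an aspect's score as the weight-normalised sum of its sub-aspects' scores. The taxonomy, the weights and the leaf scores are made-up values, and the weights merely stand in for whatever an AHP step would produce; this is a sketch under those assumptions, not the Eatery implementation.

```python
# Hypothetical taxonomy: parent aspect -> list of sub-aspects.
CHILDREN = {
    "restaurant": ["food", "service"],
    "food": ["taste", "price"],
}
# Hypothetical importance weights (placeholders for AHP-derived weights).
WEIGHTS = {"food": 2.0, "service": 1.0, "taste": 3.0, "price": 1.0}
# Hypothetical sentiment scores for aspects without sub-aspects.
LEAF_SCORES = {"taste": 0.8, "price": 0.2, "service": 0.5}


def composite_score(aspect, children=CHILDREN, weights=WEIGHTS,
                    leaf_scores=LEAF_SCORES):
    """Return the score of `aspect` as a weighted average of its sub-aspects."""
    subs = children.get(aspect, [])
    if not subs:
        # Leaf aspect: use its directly observed sentiment score.
        return leaf_scores[aspect]
    total_weight = sum(weights[s] for s in subs)
    weighted_sum = sum(
        weights[s] * composite_score(s, children, weights, leaf_scores)
        for s in subs
    )
    return weighted_sum / total_weight


print(round(composite_score("food"), 2))        # 0.65
print(round(composite_score("restaurant"), 2))  # 0.6
```

Normalising by the sum of the sibling weights keeps every composite score in the same range as the leaf scores, whatever scale the weights use.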
Conference Paper
New opportunities and challenges arise with the growing availability of online Arabic reviews. Sentiment analysis of these reviews can help the beneficiary by summarizing the opinions of others about entities or events. Also, for opinions to be comprehensive, analysis should be provided for each aspect or feature of the entity. In this paper, we propose a generic approach that extracts the entity aspects and their attitudes for reviews written in modern standard Arabic. The proposed approach does not exploit predefined sets of features, nor domain ontology hierarchy. Instead we add sentiment tags on the patterns and roots of an Arabic lexicon and used these tags to extract the opinion bearing words and their polarities. The proposed system is evaluated on the entity-level using two datasets of 500 movie reviews with accuracy 96% and 1000 restaurant reviews with accuracy 86.7%. Then the system is evaluated on the aspect-level using 500 Arabic reviews in different domains (Novels, Products, Movies, Football game events and Hotels). It extracted aspects, at 80.8% recall and 77.5% precision with respect to the aspects defined by domain experts.
Conference Paper
The World Wide Web holds a wealth of information in the form of unstructured texts such as customer reviews for products, events and more. By extracting and analyzing the expressed opinions in customer reviews in a fine-grained way, valuable opportunities and insights for customers and businesses can be gained. We propose a neural network based system to address the task of Aspect-Based Sentiment Analysis to compete in Task 2 of the ESWC-2016 Challenge on Semantic Sentiment Analysis. Our proposed architecture divides the task in two subtasks: aspect term extraction and aspect-specific sentiment extraction. This approach is flexible in that it allows to address each subtask independently. As a first step, a recurrent neural network is used to extract aspects from a text by framing the problem as a sequence labeling task. In a second step, a recurrent network processes each extracted aspect with respect to its context and predicts a sentiment label. The system uses pretrained semantic word embedding features which we experimentally enhance with semantic knowledge extracted from WordNet. Further features extracted from SenticNet prove to be beneficial for the extraction of sentiment labels. As the best performing system in its category, our proposed system proves to be an effective approach for Aspect-Based Sentiment Analysis.
Conference Paper
This paper describes the polarity classification system designed for participation in SemEval-2016 Task 5-ABSA. The aim is to determine the sentiment polarity expressed towards a certain aspect within a consumer review. Our system is based on supervised learning using a Support Vector Machine (SVM). We use standard features for the basic classification model. On top of this, we include rules to check the precedent polarity sequence. This approach is experimental.
Article
Sentiment analysis or opinion mining is becoming an important task from both academic and commercial standpoints. In recent years text mining has become a most promising area for research. There has been exponential growth in the World Wide Web, mobile technologies, Internet usage and electronic commerce applications. As a result, web opinion sources such as online shopping portals, discussion forums, peer-to-peer networks, groups, blogs, micro blogs and social networking applications are extensively used to share information, experience and opinions. In sentiment analysis, an opinion is evaluated for its positivity, negativity or neutrality with respect to the complete document or object. But this level of analysis does not provide the necessary detailed information for many applications. To obtain a more fine-grained analysis we need to go to Aspect Based Sentiment Analysis (ABSA). Aspect Based Sentiment Analysis introduces a suite of problems which require deeper NLP capabilities and also produces a rich set of results.
Conference Paper
This paper reports the IIT-TUDA participation in the SemEval 2016 shared Task 5 of Aspect Based Sentiment Analysis (ABSA) for sub-task 1. We describe our system incorporating domain dependency graph features, distributional thesaurus and unsupervised lexical induction using an unlabeled external corpus for aspect based sentiment analysis. Overall, we submitted 29 runs, covering 7 languages and 4 different domains. Our system is placed first in sentiment polarity classification for the English laptop domain, Spanish and Turkish restaurant reviews, and opinion target expression for Dutch and French in restaurant domain, and scores in medium ranks for aspect category identification and opinion target extraction.
Conference Paper
This paper presents our contribution to the SemEval 2016 task 5: Aspect-Based Sentiment Analysis. We have addressed Subtask 1 for the restaurant domain, in English and French, which implies opinion target expression detection, aspect category and polarity classification. We describe the different components of the system, based on composite models combining sophisticated linguistic features with Machine Learning algorithms, and report the results obtained for both languages.
Article
Aspect based Sentiment Analysis (ABSA) is a subarea of opinion mining which enables one to gain deeper insights into the features of items which interest the users by mining reviews. In this paper we attempt to perform ABSA on movie review data. Unlike other domains such as cameras, laptops, restaurants etc., a major chunk of movie reviews is devoted to describing the plot and contains no information about user interests. The presence of such narrative content may potentially mislead the review analysis. The contribution of this paper is two-fold. First, a two-class classification scheme for plots and reviews without the need for labeled data is proposed. The overhead of constructing manually labeled data to build the classifier is avoided and the resulting classifier is shown to be effective using a small manually built test set. Secondly, we propose a scheme to detect aspects and the corresponding opinions using a set of hand-crafted rules and aspect clue words. Three schemes for selection of aspect clue words are explored: manual labeling (M), clustering (C) and review guided clustering (RC). The aspect and sentiment detection using all three schemes is empirically evaluated against a manually constructed test set. The experiments establish the effectiveness of manual labeling over cluster based approaches, but among the cluster based approaches, the ones utilizing the review guided clue words performed better.
Conference Paper
Exploiting sentiment relations to capture opinion targets has recently caught the interest of many researchers. In many cases target entities are themselves part of the sentiment lexicon creating a loop from which it is difficult to infer the overall sentiment to the target entities. In the present work we propose to detect opinion targets by extracting syntactic patterns from short-texts. Experiments show that our method was able to successfully extract 1,879 opinion targets from a total of 2,052 opinion targets. Furthermore, the proposed method obtains comparable results to SemEval 2015 opinion target models in which we observed the syntactic structure relation that exists between sentiment words and their target.
Article
Aspect extraction is an important step in opinion mining, used to generate a list of objects, aspects and their opinions. Previous studies still leave room to find aspect patterns and to improve aspect extraction performance. This paper proposes syntactic patterns based on feature observation to extract aspects from unstructured reviews, accompanied by a comprehensive analysis of varied patterns. This paper also reports some technical issues that arise, based on performance evaluation and analysis using syntactic pattern extraction. The experimental results showed that the syntactic pattern approach had several weaknesses that need to be improved.
Article
The popularity and availability of opinion-rich resources on e-commerce platforms is growing rapidly. Before buying any product, one is interested to know the opinion of other people about that product. For any product, there are hundreds of reviews available online, so it becomes very difficult for customers to read all the reviews. Also, one cannot make up one's mind based on reading only some of the reviews, since this gives a biased view of the product. So we need to automate this process. There are many opinion words present in the sentences of a review which indicate the polarity of opinions about that product. Some opinion words behave in the same manner, meaning they have the same polarity in all contexts, but some words are context dependent, meaning they have different polarity in different contexts. In this paper, we propose an Aspect Based Sentiment Analysis and Summarization (ASAS) System, which handles the context dependent opinion words that have been the cause of major difficulties. For finding the opinion polarity, first, we used an online dictionary for classifying the context independent opinion words. Second, we used natural linguistic rules for assigning the polarity to the maximum possible number of context dependent words. These steps create the training data set. Third, for classification of the remaining opinion words, we used opinion words and features together rather than opinion words alone, because the same opinion word can have different polarity in the same domain. Then we used our Interaction Information method to classify the feature-opinion pairs. Fourth, as negation plays a very crucial role, we found negation words and flipped the polarity of the corresponding opinion word. Finally, after classifying each opinion word, the system generated a short summary for that particular product based on each feature.
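The dictionary lookup and negation-flipping steps mentioned in this abstract can be sketched in a few lines of Python. The tiny lexicon, the negation list and the two-token look-back window are all assumptions made for illustration; the abstract does not specify how negation scope is determined, so this should not be read as the ASAS method itself.

```python
# Hypothetical context-independent opinion lexicon: word -> polarity.
LEXICON = {"good": 1, "great": 1, "bad": -1, "poor": -1}
# Hypothetical list of negation words.
NEGATIONS = {"not", "no", "never"}


def opinion_polarities(tokens, window=2):
    """Return (opinion word, polarity) pairs, flipping polarity after a negation.

    A negation word found within `window` tokens before the opinion word
    inverts its dictionary polarity.
    """
    results = []
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        polarity = LEXICON[token]
        preceding = tokens[max(0, i - window):i]
        if any(word in NEGATIONS for word in preceding):
            polarity = -polarity
        results.append((token, polarity))
    return results


print(opinion_polarities("the battery is not good".split()))  # [('good', -1)]
print(opinion_polarities("the screen is great".split()))      # [('great', 1)]
```

A fixed window is the simplest possible scope rule; a dependency parse would give a more reliable negation scope at the cost of an extra parsing step.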
Conference Paper
Aspect-level sentiment classification is a fine-grained sentiment analysis task, which aims to predict the sentiment of a text in different aspects. One key point of this task is to allocate the appropriate sentiment words for the given aspect. Recent work exploits attention neural networks to allocate sentiment words and achieves state-of-the-art performance. However, the prior work only attends to the sentiment information and ignores the aspect-related information in the text, which may cause mismatching between the sentiment words and the aspects when an unrelated sentiment word is semantically meaningful for the given aspect. To solve this problem, we propose a HiErarchical ATtention (HEAT) network for aspect-level sentiment classification. The HEAT network contains a hierarchical attention module, consisting of aspect attention and sentiment attention. The aspect attention extracts the aspect-related information to guide the sentiment attention to better allocate aspect-specific sentiment words of the text. Moreover, the HEAT network supports extracting the aspect terms together with aspect-level sentiment classification by introducing the Bernoulli attention mechanism. To verify the proposed method, we conduct experiments on restaurant and laptop review data sets from SemEval at both the sentence level and the review level. The experimental results show that our model better allocates appropriate sentiment expressions for a given aspect, benefiting from the guidance of aspect terms. Moreover, our method achieves better performance on aspect-level sentiment classification than state-of-the-art models.
Conference Paper
Aspect opinion mining is one of the most important issues in the field of Natural Language Processing. This paper proposes an algorithm for aspect opinion mining that combines word embedding and dependency parsing. First, word embeddings are trained and used to construct sentiment and aspect lexicons. Second, a dependency parser is used to discover the phrases that have dependencies. Third, these phrases are filtered according to the sentiment and aspect lexicons to obtain language patterns of aspects. Last, these language patterns are used to discover all emotion words of every aspect and to compute the sentiment orientation of every aspect. The experimental results on a review corpus for a video software product show that the precision, recall and F-score of our algorithm reach 73.17%, 76.60%, and 74.85% respectively.
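The three figures reported above are mutually consistent if the F-score is taken to be the balanced F1, that is, the harmonic mean of precision and recall. That interpretation is an assumption, since the abstract does not name the variant; the short Python check below works it through.

```python
def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
print(round(f_score(73.17, 76.60), 2))  # 74.85
```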
Conference Paper
This paper presents a method for gathering and evaluating user attitudes towards previously released video games. A three-part video game franchise was selected, and all user reviews of these games were collected. The most frequently mentioned words of the game were derived from this dataset through word frequency analysis. The words, called "aspects", were then further analyzed through a manual aspect based sentiment analysis. The final analysis showed that the ratings of user reviews correlate to a high degree with the sentiment of the aspect in question. This knowledge is valuable for a developer who wishes to learn more about previous games' success or failure factors.
Article
Opinion mining refers to extracting subjective information from text data using tools such as natural language processing (NLP), text analysis and computational linguistics. Micro-blogging and social network services, like Twitter and Facebook, are the most popular Web 2.0 applications and were developed for sharing opinions about different topics. This kind of application becomes a rich data source for opinion mining and sentiment analysis. This information is crucial for managers, who should improve the quality of a product based on customers’ opinions. For a product such as a mobile phone, it is particularly difficult to identify the features being commented on (e.g., camera quality, battery life, price, etc.). In our work, we present a new method that is able to extract customers’ opinions on product features from social networks using text analysis techniques. This task identifies customers’ opinions regarding product features. We develop a system for retrieving tweets about a product from Twitter and detecting opinions on product features and their polarity. To validate the effectiveness of this approach, we used a dataset published by Bing Liu’s group in our experiments. This dataset contains many annotated customer reviews of five products such as Canon G3 and Nokia 6610. Next, we test this method with tweets retrieved from Twitter about the features of Nokia, Samsung and iPhone products.
Article
Sentiment Analysis is the task of automatically discovering the exact sentimental ideas about a product (or service, social event, etc.) from customer textual comments (i.e. reviews) crawled from various social media resources. Recently, we can see the rising demand of aspect-based sentiment analysis, in which we need to determine sentiment ratings and importance degrees of product aspects. In this paper we propose a novel multi-layer architecture for representing customer reviews. We observe that the overall sentiment for a product is composed from sentiments of its aspects, and in turn each aspect has its sentiments expressed in related sentences which are also the compositions from their words. This observation motivates us to design a multiple layer architecture of knowledge representation for representing the different sentiment levels for an input text. This representation is then integrated into a neural network to form a model for prediction of product overall ratings. We will use the representation learning techniques including word embeddings and compositional vector models, and apply a back-propagation algorithm based on gradient descent to learn the model. This model consequently generates the aspect ratings as well as aspect weights (i.e. aspect importance degrees). Our experiment is conducted on a data set of reviews from hotel domain, and the obtained results show that our model outperforms the well-known methods in previous studies.
Conference Paper
Current approaches in aspect-based sentiment analysis ignore or neutralize unhandled issues emerging from lexicon-based scoring (i.e., SentiWordNet), whereby lexical sentiment analysis, which only classifies text based on affect word presence and word count, is limited to these surface features. This is coupled with a considerably low detection rate for implicit concepts in the text. To address these issues, this paper proposes the use of an ontology to i) enhance the aspect extraction process by identifying features pertaining to implicit entities, and ii) eliminate lexicon-based sentiment scoring issues, which in turn improves sentiment analysis and summarization accuracy. Concept-level sentiment analysis aims to go beyond word-level analysis by employing ontologies which act as a semantic knowledge base rather than the lexicon. The outcome is an Ontology-Based Product Sentiment Summarization (OBPSS) framework which outperformed other existing summarization systems in terms of aspect extraction and sentiment scoring. The improved performance is supported by the sentence-level linguistic rules applied by OBPSS in providing a more accurate sentiment analysis.
Conference Paper
Opinion mining, also known as sentiment analysis, refers to the method of identifying and extracting subjective information from source materials by the use of natural language processing, text analysis and computational linguistics. It has gained immense importance in recent times due to the growing number of blogs, forums and other social networks which contain a huge amount of opinions. Feature extraction is an important factor in opinion mining which refers to the method of extracting those properties on which the opinions are based. There is a method for identifying features based on Intrinsic and Extrinsic Domain Relevance (IEDR) which exploits the difference in an opinion feature's statistics across two corpora. The approach includes syntactic rules to process the review sentences and a well known and generalized weight equation with a numerical statistic known as Term Frequency-Inverse Document Frequency to calculate domain relevance, which often fails to identify many of the legitimate features. So in this paper, we propose an effective extension of this IEDR technique for the purpose of feature extraction. Our proposed approach includes a handful of extended syntactic rules to process review sentences. It also includes optimization in the calculation of domain relevance through modification of the weight equation. To verify our proposed approach, we have applied it on two real-world review corpora along with the existing IEDR approach. Our proposed approach exhibits a remarkable improvement in performance for finding opinion features, outperforming the currently existing IEDR method.
Conference Paper
This paper describes our deep learning-based approach to multilingual aspect-based sentiment analysis as part of SemEval 2016 Task 5. We use a convolutional neural network (CNN) for both aspect extraction and aspect-based sentiment analysis. We cast aspect extraction as a multi-label classification problem, outputting probabilities over aspects parameterized by a threshold. To determine the sentiment towards an aspect, we concatenate an aspect vector with every word embedding and apply a convolution over it. Our constrained system (unconstrained for English) achieves competitive results across all languages and domains, placing first or second in 5 and 7 out of 11 language-domain pairs for aspect category detection (slot 1) and sentiment polarity (slot 3) respectively, thereby demonstrating the viability of a deep learning-based approach for multilingual aspect-based sentiment analysis.
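Casting aspect extraction as multi-label classification means each aspect category receives its own probability, and every category whose probability reaches a threshold is emitted. The Python sketch below shows only that final selection step; the probabilities, the category names and the 0.5 default threshold are illustrative assumptions, not values taken from the system described above.

```python
def select_aspects(probabilities, threshold=0.5):
    """Return every aspect label whose probability meets the threshold.

    `probabilities` maps an aspect label to an independent probability,
    such as a per-label sigmoid output of a classifier.
    """
    return sorted(
        label for label, prob in probabilities.items() if prob >= threshold
    )


# Hypothetical classifier outputs for one restaurant review sentence.
scores = {"FOOD#QUALITY": 0.91, "SERVICE#GENERAL": 0.62, "AMBIENCE#GENERAL": 0.08}

print(select_aspects(scores))                 # ['FOOD#QUALITY', 'SERVICE#GENERAL']
print(select_aspects(scores, threshold=0.8))  # ['FOOD#QUALITY']
```

Raising the threshold trades recall for precision, which is why it is usually tuned on held-out data rather than fixed in advance.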
Article
In this paper, we present the first deep learning approach to aspect extraction in opinion mining. Aspect extraction is a subtask of sentiment analysis that consists in identifying opinion targets in opinionated text, i.e., in detecting the specific aspects of a product or service the opinion holder is either praising or complaining about. We used a 7-layer deep convolutional neural network to tag each word in opinionated sentences as either aspect or non-aspect word. We also developed a set of linguistic patterns for the same purpose and combined them with the neural network. The resulting ensemble classifier, coupled with a word-embedding model for sentiment analysis, allowed our approach to obtain significantly better accuracy than state-of-the-art methods.
Conference Paper
Descriptions and reviews for products abound on the web and characterise the corresponding products through their aspects. Extracting these aspects is essential to better understand these descriptions, e.g., for comparing or recommending products. Current pattern-based aspect extraction approaches focus on flat patterns extracting flat sets of adjective-noun pairs. Aspects also have crucial importance on sentiment classification in which sentiments are matched with aspect-level expressions. A preliminary step in both aspect extraction and aspect based sentiment analysis is to detect aspect terms and opinion targets. In this paper, we propose a sequential learning approach to extract aspect terms and opinion targets from opinionated documents. For the first time, we use semi-markov conditional random fields for this task and we incorporate word embeddings as features into the learning process. We get comparative results on the benchmark datasets for the subtask of aspect term extraction in SemEval-2014 Task 4 and the subtask of opinion target extraction in SemEval-2015 Task 12. Our results show that word embeddings improve the detection accuracy for aspect terms and opinion targets.
Book
Sentiment analysis is the computational study of people's opinions, sentiments, emotions, and attitudes. This fascinating problem is increasingly important in business and society. It offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis. This book gives a comprehensive introduction to the topic from a primarily natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs that are commonly used to express opinions and sentiments. It covers all core areas of sentiment analysis, includes many emerging themes, such as debate analysis, intention mining, and fake-opinion detection, and presents computational methods to analyze and summarize opinions. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.
Conference Paper
The popularity of internet along with the huge number of reviews posted daily via social media, blogs and review sites invokes the research challenges on topic or aspect based analysis. In the recent years, it also has become a challenging task to mine opinions with respect to the aspects from the available unstructured and noisy data. In this paper, we present a novel approach to identify the key terms and its sentiments from the reviews of Restaurants and Laptops with the help of different features and Conditional Random Field based machine learning algorithm. The supervised method achieves F-score of 0.7493380 and 0.6858054 for aspect term identification whereas 0.68982 and 0.6041 of accuracy for aspect based sentiment classification on Restaurant and Laptop reviews, respectively.
Article
Sentiment analysis of a movie review plays an important role in understanding the sentiment conveyed by the user towards the movie. In the current work we focus on aspect based sentiment analysis of movie reviews in order to find the aspect-specific driving factors. These factors are the scores given to the various movie aspects; in general, aspects with high driving factors influence the polarity of the review the most. The experiment showed that giving high driving factors to the Movie, Acting and Plot aspects of a movie yielded the highest accuracy in the analysis of movie reviews.
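One way to read the driving factors described above is as per-aspect weights in a weighted vote over aspect-level sentiment scores. The sketch below is only an illustration under that assumption: the function name `review_polarity`, the aspect scores, and the weight values are all hypothetical and are not taken from the paper.

```python
def review_polarity(aspect_scores, driving_factors):
    """Combine per-aspect sentiment scores into one review-level polarity.

    aspect_scores: aspect name -> sentiment score (+1.0 positive, -1.0 negative).
    driving_factors: aspect name -> weight; a missing aspect gets weight 0.
    Both the scoring scheme and the weights are assumptions for illustration.
    """
    total = sum(driving_factors.get(aspect, 0.0) * score
                for aspect, score in aspect_scores.items())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Hypothetical review: praised overall and for acting, criticised elsewhere.
aspect_scores = {"movie": 1.0, "acting": 1.0, "plot": -1.0,
                 "music": -1.0, "direction": -1.0}

# Uniform weights let the three negative aspects outvote the two positive ones.
uniform = {aspect: 1.0 for aspect in aspect_scores}

# Higher (assumed) driving factors on Movie, Acting and Plot shift the outcome.
weighted = {"movie": 3.0, "acting": 2.0, "plot": 2.0,
            "music": 1.0, "direction": 1.0}

uniform_result = review_polarity(aspect_scores, uniform)
weighted_result = review_polarity(aspect_scores, weighted)
```

With these made-up numbers the uniform vote sums to -1 and the weighted vote sums to +1, which shows how aspects with high driving factors can dominate the review-level polarity.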
Article
An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. We demonstrate the effectiveness of both approaches over the WMT translation tasks between English and German in both directions. With local attention, we achieve a significant gain of 5.0 BLEU points over non-attentional systems which already incorporate known techniques such as dropout. Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.
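The difference between the two classes of attention can be sketched in a few lines. This is a simplified illustration, not the paper's model: it assumes a plain dot-product score and a fixed window around a given source position for the local variant, and all function names and vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(target, sources):
    """Score each source state against the target state, normalise the
    scores, and return (weights, context vector)."""
    weights = softmax([dot(target, s) for s in sources])
    context = [sum(w * s[i] for w, s in zip(weights, sources))
               for i in range(len(target))]
    return weights, context


def global_attention(target, sources):
    """Global variant: attend to every source position."""
    return attend(target, sources)


def local_attention(target, sources, center, half_width):
    """Local variant (simplified): attend only to a window of source
    positions around `center`. How `center` is chosen is left open here."""
    lo = max(0, center - half_width)
    hi = min(len(sources), center + half_width + 1)
    weights, context = attend(target, sources[lo:hi])
    return (lo, hi), weights, context


# Hypothetical 2-dimensional hidden states for a 4-word source sentence.
sources = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
target = [1.0, 0.0]

g_weights, g_context = global_attention(target, sources)
span, l_weights, l_context = local_attention(target, sources,
                                             center=2, half_width=1)
```

The global call produces one weight per source word, while the local call produces weights only for the window `span`, which is what keeps its cost independent of the source sentence length.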
Conference Paper
The Internet and Web 2.0 social media have emerged as an important medium for expressing sentiments, opinions, evaluations, and reviews. Sentiment analysis or opinion mining is becoming an open research domain due to the abundance of discussion forums, Weblogs, e-commerce portals, social networking and content sharing sites where people tend to express their opinions. Sentiment Analysis involves classifying text documents based on the opinion expressed being positive or negative about a given topic. This paper proposes a sentiment classification model using back-propagation artificial neural network (BPANN). Information Gain and three popular sentiment lexicons are used to extract sentiment representing features that are then used to train and test the BPANN. This novel approach combines the strength of BPANN in classification accuracy with utilizing intrinsic domain knowledge available in the sentiment lexicons. The results obtained on the movie-review corpora have shown that the proposed approach has been able to reduce dimensionality, while producing accurate sentiment based classification of text.
Article
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
Article
The vast majority of existing approaches to opinion feature extraction rely on mining patterns only from a single review corpus, ignoring the nontrivial disparities in word distributional characteristics of opinion features across different corpora. In this paper, we propose a novel method to identify opinion features from online reviews by exploiting the difference in opinion feature statistics across two corpora, one domain-specific corpus (i.e., the given review corpus) and one domain-independent corpus (i.e., the contrasting corpus). We capture this disparity via a measure called domain relevance (DR), which characterizes the relevance of a term to a text collection. We first extract a list of candidate opinion features from the domain review corpus by defining a set of syntactic dependence rules. For each extracted candidate feature, we then estimate its intrinsic-domain relevance (IDR) and extrinsic-domain relevance (EDR) scores on the domain-dependent and domain-independent corpora, respectively. Candidate features that are less generic (EDR score less than a threshold) and more domain-specific (IDR score greater than another threshold) are then confirmed as opinion features. We call this interval thresholding approach the intrinsic and extrinsic domain relevance (IEDR) criterion. Experimental results on two real-world review domains show the proposed IEDR approach to outperform several other well-established methods in identifying opinion features.
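The interval thresholding step of the IEDR criterion can be sketched as a simple filter. The sketch assumes the IDR and EDR scores have already been computed; how domain relevance itself is calculated is not given in the abstract, so the scores, threshold values, and the name `iedr_filter` below are hypothetical.

```python
def iedr_filter(candidates, idr, edr, idr_min, edr_max):
    """Keep candidates that are domain-specific (IDR above idr_min)
    and not generic (EDR below edr_max)."""
    return [term for term in candidates
            if idr[term] > idr_min and edr[term] < edr_max]


# Hypothetical candidate features and relevance scores for a phone-review corpus.
candidates = ["battery", "screen", "thing", "price"]
idr = {"battery": 0.82, "screen": 0.75, "thing": 0.40, "price": 0.30}
edr = {"battery": 0.10, "screen": 0.15, "thing": 0.90, "price": 0.20}

features = iedr_filter(candidates, idr, edr, idr_min=0.5, edr_max=0.3)
```

With these made-up scores, "thing" is rejected on both thresholds and "price" is rejected for low intrinsic relevance, so only "battery" and "screen" are confirmed as opinion features.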