Conference Paper

A feature dependent method for opinion mining and classification

DLSI, Univ. Alicante, Alicante
DOI: 10.1109/NLPKE.2008.4906796 Conference: Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Source: IEEE Xplore

ABSTRACT Mining the web for customer opinion on different products is both a useful and a challenging task. Previous approaches to customer review classification included document-level, sentence-level and clause-level sentiment analysis and feature-based opinion summarization. In this paper, we present a feature-driven opinion summarization method, where the term "driven" is employed to describe the concept-to-detail (product class to product-specific characteristics) approach we took. For each product class we first automatically extract general features (characteristics describing any product, such as price, size, design); for each product we then extract specific features (such as picture resolution in the case of a digital camera) and feature attributes (adjectives grading the characteristics, for example high or low for price, small or big for size and modern or faddy for design). Further on, we assign a polarity (positive or negative) to each of the feature attributes using a previously annotated corpus and Support Vector Machines Sequential Minimal Optimization machine learning with the Normalized Google Distance. We show how the method presented is employed to build a feature-driven opinion summarization system that is presently working in English and Spanish. In order to detect the product category, we use a modified system for person name classification. The raw review text is split into sentences and, depending on the product class detected, only the phrases containing the specific product features are selected for further processing. The extracted phrases undergo a process of anaphora resolution, Named Entity Recognition and syntactic parsing. Applying syntactic dependency and part-of-speech patterns, we extract pairs containing the feature and the polarity of the feature attribute the customer associates with the feature in the review.
Finally, we statistically summarize the polarity of the opinions different customers expressed about the product on the web as percentages of positive and negative opinions about each of the product features. We show the results and improvements over the baseline, together with a discussion of the strong and weak points of the method and the directions for future work.
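
As a rough illustration of two steps mentioned in the abstract, the Python sketch below computes the Normalized Google Distance from page-hit counts, using the standard published formula, and then summarizes extracted (feature, polarity) pairs as percentages of positive and negative opinions per feature. The function names, the input format and the example counts are assumptions made for illustration only; the abstract does not describe the authors' actual implementation.

```python
import math
from collections import Counter, defaultdict


def normalized_google_distance(fx, fy, fxy, n):
    """NGD from hit counts.

    fx, fy: number of pages containing each term alone
    fxy:    number of pages containing both terms
    n:      total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Unseen terms or no co-occurrence: treat as maximally distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


def summarize(pairs):
    """Map (feature, polarity) pairs to per-feature percentages.

    polarity is assumed to be the string "positive" or "negative";
    any other label is ignored.
    """
    counts = defaultdict(Counter)
    for feature, polarity in pairs:
        if polarity in ("positive", "negative"):
            counts[feature][polarity] += 1
    summary = {}
    for feature, c in counts.items():
        total = c["positive"] + c["negative"]
        summary[feature] = {
            "positive": 100.0 * c["positive"] / total,
            "negative": 100.0 * c["negative"] / total,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical pairs, as they might come out of the extraction step.
    pairs = [
        ("price", "positive"),
        ("price", "negative"),
        ("price", "positive"),
        ("size", "negative"),
    ]
    for feature, shares in summarize(pairs).items():
        print(feature, shares)
```

In a pipeline like the one described, the distance between a feature attribute and annotated seed words could serve as an input feature to the classifier, while the summary function corresponds to the final reporting step.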

  • Source
    ABSTRACT: Social media constitutes a major component of Web 2.0 and includes social networks, blogs, forum discussions, micro-blogs, etc. Users of social media generate a huge volume of reviews and comments on a daily basis. These reviews and comments reflect the opinions of users about different issues, such as products, news, entertainment, or sports. Therefore, different establishments may need to analyze these reviews and comments. For example, it is essential for companies to know the pros and cons of their products or services in the eyes of customers. Governments may also want to know the attitude of people towards certain decisions, services, etc. Although manual analysis of textual reviews and comments can be more accurate than automatic methods, it is time consuming, expensive, and can also be subjective. In addition, the huge amount of data contained in social networks can make it impractical to perform analysis manually. This paper focuses on evaluating social content in the Arabic language and contexts. Currently, the Middle East is an area rich in major political and social reforms. Social media can be a rich source of information to evaluate such contexts. In this research we developed an opinion mining and analysis tool to collect different forms of Arabic language (i.e. Standard or MSA, and colloquial). The tool accepts comments or opinions as input and generates polarity-based outputs related to the comments. For example, the output can indicate whether the comment or review is subjective or objective, positive or negative, and strong or weak. The evaluation of the performance of the developed tool showed that it yields more accurate results when it is applied to domain-based Arabic reviews than to general-based Arabic reviews.
    International Journal of Advanced Computer Science and Applications 05/2014; 5(5):181-195. · 1.32 Impact Factor
  • Source
    ABSTRACT: Social networks and users' interactions are distinctive features of the current Web. They constitute a fundamental part of Web 2.0, where people produce, disseminate, and consume information in new interactive forms and users are no longer only passive information receivers. Social media has succeeded in attracting a large portion of online users, which explains its explosive growth in terms of comments, reviews, blogs, microblogs, tweets, and postings on social network sites. In this scope, the sentiment analysis research field refers to the analysis of people's sentiments, opinions, attitudes, and emotions towards events, products, companies, individuals, issues, sports teams, etc. Facebook and YouTube are among the top three sites used in many Middle Eastern (ME) countries and in the world. Therefore, a huge volume of Arabic comments and reviews is generated daily about different aspects of life in this part of the world. Modern Standard Arabic (MSA) is used mainly in media (newspapers, journals, TV and radio), academic institutions, and to some extent in social media, while colloquial Arabic is used by the public in their conversations, chatting, etc. Analysis of social networks in ME countries shows that both MSA and colloquial or slang languages are used. The aim of this study is to build a novel sentiment analysis tool called Colloquial Non-Standard Arabic - Modern Standard Arabic - Sentiment Analysis Tool (CNSA-MSA-SAT), dedicated to both colloquial Arabic and MSA. A large number of Arabic comments and reviews collected from social media were tokenized and analyzed to build polarity lexicons, which constitute an essential part of CNSA-MSA-SAT. Each collected Arabic comment and review is manually assigned to one of three polarity values: positive, negative, or neutral. Further, each collected review or comment is added to CNSA-MSA-SAT and is then assigned to one of the three polarity values based on algorithms developed for this purpose.
    The fourth International Conference on Information and Communication Systems (ICICS 2013); 04/2013
  • Source
    ABSTRACT: To deal with the ever-growing information overload on the Internet, Recommender Systems are widely used online to suggest to potential customers items they may like or find useful. Collaborative Filtering is the most popular technique for Recommender Systems; it collects opinions from customers in the form of ratings on items, services or service providers. In addition to customer ratings of a service provider, a good deal of online customer feedback is available on the Internet as customer reviews, comments, newsgroup posts, discussion forums or blogs, which are collectively called user-generated content. This information can be used to generate the public reputation of the service providers. To do this, data mining techniques, especially the recently emerged field of opinion mining, could be a useful tool. In this paper we present a state-of-the-art review of Opinion Mining from online customer feedback. We critically evaluate the existing work and expose cutting-edge areas of interest in opinion mining. We also classify the approaches taken by different researchers into several categories and sub-categories. Each of those steps is analyzed with its strengths and limitations in this paper.