Article

A new hierarchy framework for feature engineering through multi-objective evolutionary algorithm in text classification

Wiley
Concurrency and Computation: Practice and Experience
Authors:
  • Islamic Azad University Khorasgan (Isfahan) Branch Isfahan Iran
  • Islamic Azad University Isfahan(Khorasgan) Branch
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Sentiment classification is a field of sentiment analysis concerned with analyzing opinions, emotions, evaluations, and attitudes regarding a specific topic such as a product, an organization, a person, or an incident. With the growth of user-generated content on the Web, this field has gained great importance in online reviews. Given the wide range of reviews, customers cannot read them all. Considering the increasing rate of electronic documents, and because manually mining keywords is hard and time-consuming, doing the same automatically is in high demand. A new framework is proposed here to mine and classify users’ comments based on mining keywords by applying sequence pattern mining through the Separation-Power concept, a multi-objective evolutionary algorithm based on decomposition with four objectives, and a neural network as the final classifier. Some modifications are made to the multi-objective evolutionary algorithm based on decomposition and to the Apriori algorithm to improve text classification efficiency. To evaluate the proposed framework, three datasets are applied, and the framework is compared with two methods in terms of accuracy, precision, recall, and error-index. The results indicate that this framework provides a better outcome than its counterparts, with 99.45% precision, 99.34% accuracy, 99.48% recall, and 99.28% f-measure.


... Feature selection plays a significant role in sentiment classification and affects the overall accuracy of classification [11][12][13]. Due to the highly unstructured content of social media, a compelling feature selection improves and optimizes the results. The traditional feature selection methods such as Chi-Square, Information Gain (IG), and mutual information (MI) [14] select capable features to decrease the size of the data but not the accuracy. ...
... The FANS' highest precision, accuracy, and recall were achieved at 96.05%, 96.83%, and 96.54%, respectively. Figures 11, 12, and 13 show the precision, accuracy, and recall values compared with MOGWOKB and CSO-LST-MNN in tenfold cross-validation, respectively. The x-axis shows the mentioned work in tenfold cross-validation, whereas the y-axis shows the evaluation metrics. ...
Article
Full-text available
Sentiment classification is a prevalent task in text mining in which a text is classified into positive, negative, or neutral classes. Sentiment classification is an essential issue in decision-making for people, companies, etc. Feature selection is the most influential stage in sentiment classification. Due to the NP-hard nature of the problem and the huge volume of existing texts, the traditional feature selection techniques, such as statistical techniques, generate sub-optimal solutions. Swarm intelligence algorithms are extensively applied to optimization problems. These algorithms produce features by increasing the classification performance and decreasing the computational complexity and feature set size. In this study, the authors proposed a framework using the modified multi-objective Firefly algorithm, namely FANS (Firefly Algorithm Naïve Bayes Sentiment). The two objectives are decreasing the classification error of the naïve Bayes and the k-nearest neighbor classifiers. A neural network is used as the final classifier. Three datasets from the movie review and Twitter domains are applied to evaluate FANS. FANS outperforms its counterparts regarding precision, accuracy, and recall. FANS yields 96.88% precision, 97.65% accuracy, and 96.54% recall.
... Authors applied decision boundaries in cross-domain sentiment analysis (Fu & Liu, 2022). Authors mine keywords and applied feature engineering techniques to explore patterns (Asgarnezhad, Monadjemi & Aghaei, 2022). Aspect-level sentiment analysis has been performed using an adaptive SVM model and Twitter dataset. ...
Article
The outbreak of the COVID-19 pandemic has also triggered a tsunami of news, instructions, and precautionary measures related to the disease on social media platforms. Despite the considerable support on social media, a large amount of fake propaganda and many conspiracies are also circulated. People also reacted to COVID-19 vaccination on social media and expressed their opinions, perceptions, and conceptions. The present research work aims to explore the opinion dynamics of the general public about COVID-19 vaccination to help the administration authorities to devise policies to increase vaccination acceptance. For this purpose, a framework is proposed to perform sentiment analysis of COVID-19 vaccination-related tweets. The influence of term frequency-inverse document frequency (TF-IDF), bag of words (BoW), Word2Vec, and the combination of TF-IDF and BoW is explored with classifiers including random forest, gradient boosting machine, extra tree classifier (ETC), logistic regression, Naïve Bayes, stochastic gradient descent, multilayer perceptron, convolutional neural network (CNN), bidirectional encoder representations from transformers (BERT), long short-term memory (LSTM), and recurrent neural network (RNN). Results reveal that ETC using BoW performs best, with 92% accuracy, and is the most suitable approach for sentiment analysis of COVID-19-related tweets. Opinion dynamics show that sentiments in favor of vaccination have increased over time.
... For statistical evaluations, the mean absolute error (MAE) and root absolute error (RAE) are obtained as shown in Eqs. 55 and 56, respectively [52,53] (see Table 4). ...
Article
Full-text available
A wireless sensor network consists of many wireless sensors in a specific area to collect information from the environment and send the collected data to the base station. In this type of network, a mobile sink node is applied to improve data aggregation. Many methods have been proposed for the use of mobile sinks, but a detailed evaluation of the performance of these methods has not been provided. In this paper, the current authors present an effective and new method by combining three data collection methods and mobile sinks. Results reveal that the proposed method has a better performance in terms of parameters than other methods. A main difference is that in addition to the mobile sink, it uses other nodes called advanced nodes that direct data from the header nodes to the sink path, which ultimately results in better performance. The results show that the proposed method is significantly superior to its comparative techniques, particularly on energy consumption, network lifetime, delay, and missing data.
... Consequently, we believe that the present proposed method by stratified sampling achieved excellent performance (96.67% accuracy). There are some works that show the current study with the applied methods has good enough results [25][26][27][28][29][30][31][32][33][34]. Several experiments were applied and the accuracy of 96.67% was achieved, an improvement of approximately 2%. ...
Article
Diagnosis of diabetes is a classification problem that has attracted growing attention in recent years. Diabetes mellitus happens when the body cannot provide an adequate quantity of insulin to adjust glucose levels. When the insulin level is low, food products are turned into glucose, raising the sugar to a higher than average level. Existing works show that many techniques, notably from Artificial Intelligence, are successful for this disease. There exist many classification models that aim at the prediction of diabetes. We introduce a novel model to investigate the role of pre-processing and data reduction for classification problems in the diagnosis of diabetes. The model has four steps consisting of Pre-processing, Feature sub-selection, Classification, and Performance. In the classification technique, we apply the voting technique with three classifiers. Many experiments were conducted to reveal the performance of the proposed work for the diagnosis of diabetes. The results confirmed the superiority of our model over its counterparts, and the best accuracy, precision, recall, and F1 were achieved at 96.67, 100, 100, and 94.01%, respectively.
... They used the Rapid Miner tool and applied ensemble operations including bagging, boosting, voting, and stacking [4]. We applied these techniques without pre-processing through Decision Tree (DT), Random Forest (RF), and K-nearest neighbor (KNN) 34 algorithms [5]. The accuracies of bagging, boosting, voting, and stacking reached 97.12, 97.12, 97.12, and 96.15%, respectively. ...
... For example, for people who carry complications of anemia, blood and heart disease is one of the most necessary problems that this disease faces. There is no specific treatment to limit the aggravation of this disease [1]. This disease is treated by regular chronic kidney dialysis, and this is not enough to prompt us. ...
... The highest results were achieved through our optimized model for the boosting method on the datasets. Experiments showed that our independent-domain approach can improve the classification performance and outperform the existing traditional techniques [2], [3], [8], [9], [11], and [12]. The rest of this article is organized as follows: Our paper contains a summary of the related works. ...
Article
With extensive Internet applications, review sentiment classification has attracted increasing interest among text mining experts. Traditional bag-of-words approaches do not capture multiple relationships connecting words, whereas emphasizing the pre-processing phase and data reduction techniques makes a huge performance difference in classification. This study proposes an efficient model for multi-class sentiment classification using sampling techniques, feature selection methods, and ensemble supervised classification to increase the performance of text classification. The feature selection phase of our model applies n-grams, a computational method that optimizes the feature selection procedure by extracting features based on the relationships of the words to improve the candidate selection of features. The proposed model classifies the sentiment of tweets and online reviews through ensemble methods, including boosting, bagging, stacking, and voting in conjunction with supervised methods. Besides, two sampling techniques were applied in the pre-processing phase. In the experimental study, a comprehensive range of comparative experiments was conducted to assess the effectiveness of our model against the best existing works in the literature on well-known movie review and Twitter datasets. The highest accuracy and f-measure obtained by our model were 92.95% and 92.65% on the movie dataset and 90.61% and 87.73% on the Twitter dataset, respectively.
... Recently, several technologies have been developed to succeed the problems. A KNN method can consolidate a confidence portion scale into the traditional KNN [41]-[43]. See KNN Classification in Fig. 4. ...
Article
Background and Objectives: Autism is a well-known condition that can occur in people of any age. There is increasing interest in applying machine learning techniques to diagnose such incurable conditions. However, the poor quality of most datasets constrains the production of efficient models for the prediction of autism. The lack of suitable pre-processing methods leads to inaccurate and unstable results. For diagnosing the disease, techniques that improve the classification performance yielded better results, and other computerized technologies were applied. Methods: An effective and high-performance model was introduced to address pre-processing problems such as missing values and outliers. Several base classifiers were applied to a well-known autism data set in the classification stage. Among many alternatives, we observed that combining replacement with the mean and improvement selection with Random Forest and Decision Tree techniques provides our highest results. Results: The best-obtained accuracy, precision, recall, and F-Measure values of the suggested MVO-Autism model were all equal to 100%, outperforming its counterparts. Conclusion: The obtained results reveal that the suggested model can increase classification performance in terms of evaluation metrics. The results are evidence that the MVO-Autism model outperforms its counterparts. The reason is that this model overcomes both problems.
Chapter
In the process of stock price forecasting, there are the following problems: how to find the more effective factors for stock price forecasting, and how to calculate the weight of the constructed stock correlation factor sets. To solve the above problems, this paper proposes a method of factor construction in the field of stock price prediction based on genetic programming. The method can automatically construct the factor by reading the original data set of the stock, and calculate the weight of each factor. In addition, this paper also proposes a new crossover operator, which can dynamically adjust the selection of crossover nodes by using the information in the execution process of genetic programming algorithm, so as to improve the quality of the constructed factor set. A lot of experiments have been carried out with this method. The results show that the factors constructed by this method can improve the accuracy of the stock price prediction algorithm in most cases.
Article
Full-text available
Sentiment analysis on video lectures on YouTube that discuss the haram of music is an exciting topic for finding out public opinion. This study aims to find what factors affect the model's accuracy in sentiment analysis, especially on video lecture content on YouTube. The data used is comment data on three video lectures that discuss the haram of music, which has been labelled into two categories: positive and negative. The data is divided into two categories: primary data, consisting of 2099 items that have not been normalized, and secondary data, consisting of 1001 items that have been normalized. The experiment shows that the validity of the data, the labelling of the data, the amount of data, and preprocessing are essential points in forming a good sentiment analysis classification model. The test results show that imbalance techniques such as SMOTE, the word embeddings Word2Vec and FastText, and the SVM and KNN classification algorithms do not provide maximum accuracy when primary data is used. However, data imbalance handling such as oversampling, together with the SVM and KNN classification algorithms, provides better model accuracy when used with secondary data. Based on the trial results, when using the SVM algorithm, primary data produces the highest accuracy at 58.35%, while secondary data reaches 72.23%. When using KNN, the primary data provides the highest model accuracy at 53.54%, while the secondary data has the highest accuracy at 72.81%. Based on these results, the data must be valid, appropriate, and related to the case raised, and the data must be labelled carefully. Preprocessing must also be done correctly: if it is done inappropriately, data imbalance techniques such as oversampling do not have enough influence on increasing accuracy, whereas if it is done correctly, accuracy increases. The selection of the right word embedding also affects accuracy. It is necessary to run many experiments to select the correct algorithm for the data at hand, because selecting the correct algorithm will provide maximum model accuracy.
Article
Due to the importance of automatic identification of brain conditions, many researchers concentrate on epilepsy with the aim of detecting eye states and building classification systems. Eye state recognition has a vital role in biomedical informatics applications such as controlling smart home devices, driving detection, etc. This problem is addressed using electroencephalogram (EEG) signals. There are many works in this context in which traditional techniques and manually extracted features are used. The extraction of effective features and the selection of proper classifiers are challenging issues. In this study, a classification system named PEML-E was proposed in which a different pre-processing stage is used. The ensemble methods in the classification stage are compared to the base classifiers and the most important works in this context. For evaluation, a freely available public EEG eye state dataset from UCI is applied. The highest accuracy, precision, recall, F1, specificity, and sensitivity obtained are 95.88, 95.39, 96.25, 96.18, 96.25, and 95.44%, respectively.
Article
Full-text available
Due to extensive web applications, sentiment classification (SC) has become a relevant issue of interest among text mining experts. The extensive online reviews prevent the application of effective models to be used in companies and in the decision making of individuals. Pre-processing greatly contributes to sentiment classification. The traditional bag-of-words approaches do not record multiple relationships among words. In this study, emphasis is on the pre-processing stage and data reduction techniques, which would make a big difference in sentiment classification efficiency. To classify opinions, a multi-objective grey wolf optimization algorithm is proposed where the two objectives aim for decreasing the error of Naïve Bayes and K-nearest neighbour classifiers and a neural network as the final classifier. In evaluating this proposed framework, three datasets are applied. By obtaining 95.76% precision, 95.75% accuracy, 95.99% recall, and 95.82% f-measure, it is evident that this framework outperforms its counterparts.
Article
Full-text available
The evaluation of feature selection methods for text classification with small sample datasets must consider classification performance, stability, and efficiency. It is, thus, a multiple criteria decision-making (MCDM) problem. Yet there has been little research on feature selection evaluation using MCDM methods that consider multiple criteria. Therefore, we use MCDM-based methods for evaluating feature selection methods for text classification with small sample datasets. An experimental study is designed to compare five MCDM methods to validate the proposed approach with 10 feature selection methods, nine evaluation measures for binary classification, seven evaluation measures for multi-class classification, and three classifiers with 10 small datasets. Based on the ranked results of the five MCDM methods, we make recommendations concerning feature selection methods. The results demonstrate the effectiveness of the used MCDM-based method in evaluating feature selection methods.
Article
Full-text available
In recent years, the growth of social network has increased the interest of people in analyzing reviews and opinions for products before they buy them. Consequently, this has given rise to the domain adaptation as a prominent area of research in sentiment analysis. A classifier trained from one domain often gives poor results on data from another domain. Expression of sentiment is different in every domain. The labeling cost of each domain separately is very high as well as time consuming. Therefore, this study has proposed an approach that extracts and classifies opinion words from one domain called source domain and predicts opinion words of another domain called target domain using a semi-supervised approach, which combines modified maximum entropy and bipartite graph clustering. A comparison of opinion classification on reviews on four different product domains is presented. The results demonstrate that the proposed method performs relatively well in comparison to the other methods. Comparison of SentiWordNet of domain-specific and domain-independent words reveals that on an average 72.6% and 88.4% words, respectively, are correctly classified.
Article
Full-text available
This paper proposes a new stopping criterion for decomposition-based multi-objective evolutionary algorithms (MOEA/Ds) to reduce the unnecessary usage of computational resource. In MOEA/D, a multi-objective problem is decomposed into a number of single-objective subproblems using a Tchebycheff decomposition approach. Then, optimal Pareto front (PF) is obtained by optimizing the Tchebycheff objective of all the subproblems. The proposed stopping criterion monitors the variations of Tchebycheff objective at every generation using maximum Tchebycheff objective error (MTOE) of all the subproblems and stops the algorithm, when there is no significant improvement in MTOE. A χ² test is used for statistically verifying the significant changes of MTOE for every γ generations. The proposed stopping criterion is implemented in a recently constrained MOEA/D variant, namely CMOEA/D-CDP, and a simulation study is conducted with the constrained test instances for choosing a suitable tolerance value for the MTOE stopping criterion. A comparison with the recent stopping methods demonstrates that the proposed MTOE stopping criterion is simple and has minimum computational complexity. Moreover, the MTOE stopping criterion is tested on real-world application, namely multi-objective H∞ loop shaping PID controller design. Simulation results revealed that the MTOE stopping criterion reduces the unnecessary usage of computational resource significantly when solving the constrained test instances and multi-objective H∞ loop shaping PID controller design problems.
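As a rough illustration of the ideas in this abstract, the following Python sketch computes a Tchebycheff scalarizing value for one subproblem and a simple "maximum change" measure across subproblems. The function names, the exact form of the error measure, and the plain tolerance check are assumptions made for illustration; the abstract describes a χ² test over every γ generations, which is not reproduced here.

```python
from typing import Sequence


def tchebycheff(objectives: Sequence[float],
                weights: Sequence[float],
                ideal: Sequence[float]) -> float:
    """Tchebycheff value of one subproblem: max_i w_i * |f_i - z*_i|."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


def max_tchebycheff_error(prev_values: Sequence[float],
                          curr_values: Sequence[float]) -> float:
    """Largest per-subproblem change in Tchebycheff value between two
    generations (an assumed stand-in for the MTOE of the abstract)."""
    return max(abs(p - c) for p, c in zip(prev_values, curr_values))


def should_stop(prev_values: Sequence[float],
                curr_values: Sequence[float],
                tol: float) -> bool:
    """Hypothetical stopping rule: stop when the change falls below tol.
    A plain tolerance replaces the statistical test used in the paper."""
    return max_tchebycheff_error(prev_values, curr_values) < tol
```

In an actual run, `prev_values` and `curr_values` would hold the best Tchebycheff value of every subproblem at two consecutive checkpoints.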
Article
Full-text available
Sentiment classification is one of the important tasks in text mining, which is to classify documents according to their opinion or sentiment. Documents in sentiment classification can be represented in the form of feature vectors, which are employed by machine learning algorithms to perform classification. For the feature vectors, the feature selection process is necessary. In this paper, we will propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection domain. F-BPSO is a modification of BPSO and can overcome the problems of traditional BPSO including unreasonable update formula of velocity and lack of evaluation on every single feature. Then, some detailed changes are made on the original F-BPSO including using fitness sum instead of average fitness in the fitness proportionate selection step. The modified method is, thus, called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). Moreover, further modifications are made on the FS-BPSO method to make it more suitable for sentiment classification-oriented feature selection domain. The modified method is named as SCO-FS-BPSO where SCO stands for “sentiment classification-oriented”. Experimental results show that in benchmark datasets original F-BPSO is superior to traditional BPSO in feature selection performance and FS-BPSO outperforms original F-BPSO. Besides, in sentiment classification domain, SCO-FS-BPSO which is modified specially for sentiment classification is superior to traditional feature selection methods on subjective consumer review datasets.
Article
Full-text available
With the rapid development of the World Wide Web, electronic word-of-mouth interaction has made consumers active participants. Nowadays, a large number of reviews posted by the consumers on the Web provide valuable information to other consumers. Such information is highly essential for decision making and hence popular among the internet users. This information is very valuable not only for prospective consumers to make decisions but also for businesses in predicting the success and sustainability. In this paper, a Gini Index based feature selection method with Support Vector Machine (SVM) classifier is proposed for sentiment classification for large movie review data set. The results show that our Gini Index method has better classification performance in terms of reduced error rate and accuracy.
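To make the idea of Gini-based term scoring concrete, here is a minimal Python sketch. It uses the standard Gini impurity and scores a term by how much splitting the documents on its presence reduces that impurity. This is one common formulation and is an assumption here; the abstract does not state the exact Gini Index variant the authors use, and the names `gini_impurity` and `gini_gain` are illustrative.

```python
from collections import Counter
from typing import Sequence, Set


def gini_impurity(labels: Sequence[str]) -> float:
    """Standard Gini impurity: 1 - sum over classes of p_c squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(docs: Sequence[Set[str]], labels: Sequence[str], term: str) -> float:
    """Impurity reduction obtained by splitting documents on whether they
    contain `term`. Higher values suggest a more discriminative term."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    weighted = (len(with_term) / n) * gini_impurity(with_term) \
        + (len(without_term) / n) * gini_impurity(without_term)
    return gini_impurity(labels) - weighted


# Toy reviews represented as sets of tokens (made-up data).
docs = [{"great", "film"}, {"great", "plot"}, {"dull", "film"}, {"dull", "plot"}]
labels = ["pos", "pos", "neg", "neg"]
ranked = sorted({"great", "dull", "film", "plot"},
                key=lambda t: gini_gain(docs, labels, t), reverse=True)
```

Terms at the front of `ranked` would be kept as features and passed to a classifier such as an SVM.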
Conference Paper
Full-text available
We present a new feature type named rating-based feature and evaluate the contribution of this feature to the task of document-level sentiment analysis. We achieve state-of-the-art results on two publicly available standard polarity movie datasets: on the dataset consisting of 2000 reviews produced by Pang and Lee (2004) we obtain an accuracy of 91.6% while it is 89.87% evaluated on the dataset of 50000 reviews created by Maas et al. (2011). We also get a performance at 93.24% on our own dataset consisting of 233600 movie reviews, and we aim to share this dataset for further research in sentiment polarity analysis task.
Conference Paper
Full-text available
Sequential pattern mining algorithms using a vertical representation are the most efficient for mining sequential patterns in dense or long sequences, and have excellent overall performance. The vertical representation allows generating patterns and calculating their supports without performing costly database scans. However, a crucial performance bottleneck of vertical algorithms is that they use a generate-candidate-and-test approach that can generate a large amount of infrequent candidates. To address this issue, we propose pruning candidates based on the study of item co-occurrences. We present a new structure named CMAP (Co-occurrence MAP) for storing co-occurrence information. We explain how CMAP can be used to prune candidates in three state-of-the-art vertical algorithms, namely SPADE, SPAM and ClaSP. An extensive experimental study with six real-life datasets shows that (1) co-occurrence-based pruning is effective, (2) CMAP is very compact and that (3) the resulting algorithms outperform state-of-the-art algorithms for mining sequential patterns (GSP, PrefixSpan, SPADE and SPAM) and closed sequential patterns (ClaSP and CloSpan).
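The co-occurrence pruning idea can be sketched in Python as follows. This is a simplified illustration, not the structure from the paper: it assumes every sequence is a plain list of single items (no multi-item itemsets), and the names `build_cmap` and `can_extend` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set


def build_cmap(sequences: Sequence[Sequence[str]], minsup: int) -> Dict[str, Set[str]]:
    """Map each item a to the items b that occur after a in at least
    `minsup` sequences (each sequence counts a pair at most once)."""
    counts: Dict[tuple, int] = defaultdict(int)
    for seq in sequences:
        pairs = set()
        for i, a in enumerate(seq):
            for b in seq[i + 1:]:
                pairs.add((a, b))
        for pair in pairs:
            counts[pair] += 1
    cmap: Dict[str, Set[str]] = defaultdict(set)
    for (a, b), c in counts.items():
        if c >= minsup:
            cmap[a].add(b)
    return dict(cmap)


def can_extend(pattern: List[str], item: str, cmap: Dict[str, Set[str]]) -> bool:
    """A candidate pattern + [item] can only be frequent if `item` frequently
    follows every item already in the pattern; otherwise it is pruned
    without computing its support."""
    return all(item in cmap.get(a, set()) for a in pattern)


# Toy sequence database (made-up data).
sequences = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["a", "b"]]
cmap = build_cmap(sequences, minsup=2)
```

The check is safe because any frequent pattern ending in `item` implies that every two-item subsequence ending in `item` is also frequent, so a failed lookup proves the candidate infrequent.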
Article
Full-text available
Twitter is a microblogging site in which users can post updates (tweets) to friends (followers). It has become an immense dataset of the so-called sentiments. In this paper, we introduce an approach that automatically classifies the sentiment of tweets by using classifier ensembles and lexicons. Tweets are classified as either positive or negative concerning a query term. This approach is useful for consumers who can use sentiment analysis to search for products, for companies that aim at monitoring the public sentiment of their brands, and for many other applications. Indeed, sentiment classification in microblogging services (e.g., Twitter) through classifier ensembles and lexicons has not been well explored in the literature. Our experiments on a variety of public tweet sentiment datasets show that classifier ensembles formed by Multinomial Naive Bayes, SVM, Random Forest, and Logistic Regression can improve classification accuracy.
Article
Full-text available
Twitter has recently become one of the most popular micro-blogging platforms. Millions of users can share their thoughts and opinions about different aspects and events on the micro-blogging platform. Therefore, Twitter is considered as a rich source of information for decision making and sentiment analysis. Sentiment analysis refers to a classification problem where the main focus is to predict the polarity of words and then classify them into positive and negative feelings with the aim of identifying attitude and opinions that are expressed in any form or language. Sentiment analysis over Twitter offers organisations a fast and effective way to monitor the public's feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers for Twitter datasets have been researched in recent years with varying results. The primary issues in previous techniques are classification accuracy, data sparsity and sarcasm, as they incorrectly classify most of the tweets with a very high percentage of tweets incorrectly classified as neutral. This research paper focuses on these problems and presents an algorithm for Twitter feed classification based on a hybrid approach. The proposed method includes various pre-processing steps before feeding the text to the classifier. Experimental results show that the proposed technique overcomes the previous limitations and achieves higher accuracy when compared to similar techniques.
Article
Full-text available
This article reports our study of the role of social content (i.e., user-generated content in social networking environment) in online consumers’ decision process when they search for an inexperienced product to buy. Through close observation of users’ objective behavior and interview of their reflective thoughts during an initial exploratory user study, we have first derived a set of system implications and integrated these implications into a three-stage system architecture. Furthermore, driven by the specific implication regarding the impact of user reviews in influencing users’ decision stages, we have presented a linear-chain conditional random-field-based social-opinion-mining algorithm, and have identified its higher effectiveness against related algorithms in an experiment. Finally, we present our system’s user interfaces and emphasize how to display the opinion-mining results in the form of both quantitative presentation and qualitative visualization.
Keywords: Users’ information needs; Social content; Complex decision making; Inexperienced products; Decision system; Opinion mining
Conference Paper
Full-text available
Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
Conference Paper
Full-text available
Sentiment analysis on Twitter data has attracted much attention recently. In this paper, we focus on target-dependent Twitter sentiment classification; namely, given a query, we classify the sentiments of the tweets as positive, negative or neutral according to whether they contain positive, negative or neutral sentiments about that query. Here the query serves as the target of the sentiments. The state-of-the-art approaches for solving this problem always adopt the target-independent strategy, which may assign irrelevant sentiments to the given target. Moreover, the state-of-the-art approaches only take the tweet to be classified into consideration when classifying the sentiment; they ignore its context (i.e., related tweets). However, because tweets are usually short and more ambiguous, sometimes it is not enough to consider only the current tweet for sentiment classification. In this paper, we propose to improve target-dependent Twitter sentiment classification by 1) incorporating target-dependent features; and 2) taking related tweets into consideration. According to the experimental results, our approach greatly improves the performance of target-dependent sentiment classification.
Conference Paper
Web content mining is intended to help people discover valuable information from large amounts of unstructured data on the web. Movie review mining classifies movie reviews into two polarities: positive and negative. As a type of sentiment-based classification, movie review mining is different from other topic-based classifications. Few empirical studies have been conducted in this domain. This paper investigates movie review mining using two approaches: machine learning and semantic orientation. The approaches are adapted to the movie review domain for comparison. Our results are comparable to or even better than previous findings. We also find that movie review mining is a more challenging application than many other types of review mining. The challenges of movie review mining lie in that factual information is always mixed with real-life review data and ironic words are used in writing movie reviews. Future work for improving existing approaches is also suggested.
Article
Decomposition is a basic strategy in traditional multiobjective optimization. However, it has not yet been widely used in multiobjective evolutionary optimization. This paper proposes a multiobjective evolutionary algorithm based on decomposition (MOEA/D). It decomposes a multiobjective optimization problem into a number of scalar optimization subproblems and optimizes them simultaneously. Each subproblem is optimized by only using information from its several neighboring subproblems, which makes MOEA/D have lower computational complexity at each generation than MOGLS and nondominated sorting genetic algorithm II (NSGA-II). Experimental results have demonstrated that MOEA/D with simple decomposition methods outperforms or performs similarly to MOGLS and NSGA-II on multiobjective 0-1 knapsack problems and continuous multiobjective optimization problems. It has been shown that MOEA/D using objective normalization can deal with disparately-scaled objectives, and MOEA/D with an advanced decomposition method can generate a set of very evenly distributed solutions for 3-objective test instances. The ability of MOEA/D with small population, the scalability and sensitivity of MOEA/D have also been experimentally investigated in this paper.
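As a rough illustration of the decomposition idea in this abstract, the Python sketch below splits a two-objective minimization problem into scalar subproblems, one per weight vector, and finds each subproblem's neighbors by distance between weight vectors. The Tchebycheff scalarization is one commonly used decomposition and is an assumption here. The function names (`uniform_weights`, `tchebycheff`, `neighbors`) are hypothetical and not taken from the paper.

```python
import math

def uniform_weights(n):
    """Return n evenly spaced weight vectors for a two-objective problem."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]

def tchebycheff(f, weight, ideal):
    """Scalar value (to be minimized) of objective vector f for one subproblem."""
    return max(w * abs(fi - zi) for w, fi, zi in zip(weight, f, ideal))

def neighbors(weights, t):
    """For each subproblem, indices of the t closest weight vectors (itself included)."""
    result = []
    for w in weights:
        order = sorted(range(len(weights)), key=lambda j: math.dist(w, weights[j]))
        result.append(order[:t])
    return result

weights = uniform_weights(5)       # five scalar subproblems
hood = neighbors(weights, 3)       # each subproblem exchanges solutions with 3 neighbors
ideal = (0.0, 0.0)                 # assumed ideal point for the illustration
g = tchebycheff((0.2, 0.8), weights[2], ideal)
```

In a full algorithm, each subproblem would keep one candidate solution and update it only from offspring produced within its neighborhood, which is what keeps the per-generation cost low.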
Article
The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). We begin by identifying the unique properties of this problem and develop a method for automatically distinguishing between positive and negative reviews. Our classifier draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation. The best methods work as well as or better than traditional machine learning. When operating on individual sentences collected from web searches, performance is limited due to noise and ambiguity. But in the context of a complete web-based tool and aided by a simple method for grouping sentences into attributes, the results are qualitatively quite useful.
Article
Background and Objectives: With the spread of web applications, review sentiment classification has attracted increasing interest among text mining works. Traditional approaches did not capture the multiple relationships connecting words while emphasizing the preprocessing phase and data reduction techniques, which make a large difference in classification performance. Methods: This study proposes an efficient model for sentiment classification that combines preprocessing techniques, sampling methods, feature selection methods, and ensemble supervised classification to increase classification performance. In the feature selection phase of the proposed model, we applied n-grams, a computational method, to extract features based on the relationships of the words. Then, the best features are selected through the particle swarm optimization algorithm, which iteratively tries to improve the feature selection. Results: In the experimental study, a comprehensive range of comparative experiments was conducted to assess the effectiveness of the proposed model against the best methods in the literature on Twitter datasets. At its best, the proposed model obtains 97.33%, 92.61%, 97.16%, and 96.23% in terms of precision, accuracy, recall, and f-measure, respectively. Conclusion: The proposed model classifies the sentiment of tweets and online reviews through ensemble methods. In addition, two sampling techniques were applied in the preprocessing phase. The results confirmed the superiority of the proposed model over state-of-the-art systems.
Article
With the availability of websites and the growth of comments, reviews of user-generated content are published on the Internet. Sentiment Classification is one of the most common problems in text mining; it categorizes reviews into positive and negative classes. Pre-processing has an important role when these textual contexts are employed by machine learning techniques. Without efficient pre-processing methods, unreliable results will be achieved. This research investigates the performance of pre-processing for the Sentiment Classification problem on three popular datasets. We suggest a high-performance framework to enhance classification performance. First, features of users' opinions are extracted based on three methods: (1) Backward Feature Selection; (2) High Correlation Filter; and (3) Low Variance Filter. Second, the error rate of the primary classification for each method is calculated through the perceptron. Finally, the best method is selected through the fuzzy analytic hierarchy process. This framework is beneficial for companies to observe people's comments about their brands and for many other applications. The current authors have provided further evidence to confirm the superiority of the proposed framework. The obtained results indicate that on average this proposed framework outperformed its counterparts. This framework yields 90.63% precision, 90.89% accuracy, 91.27% recall, and 91.05% f-measure.
Article
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world intensively use social networks like Twitter. It lets users publish tweets to say what they think about topics. Numerous websites on the Internet present Twitter content. The user can enter a sentiment target and search for tweets containing positive, negative, or neutral opinions. This is useful for consumers who want to investigate products automatically before purchase. Methods: This paper suggests a model for sentiment classification. The goal of this model is to investigate the role of n-grams and sampling techniques in Sentiment Classification using an ensemble method on Twitter datasets. It also examines both binary and multiple classification, in which datasets are classified into positive, negative, or neutral classes. Results: Twitter classification is an outstanding problem for which very few free resources exist, and some are not available due to modified authorization status. Moreover, Twitter datasets are generally not both labeled and free, except for our applied dataset. We reveal that the combination of ensemble methods, sampling techniques, and n-grams can improve the accuracy of Twitter Sentiment Classification. Conclusion: The results confirmed the superiority of the proposed model over state-of-the-art systems. The highest results were obtained in terms of accuracy, precision, recall, and f-measure.
Article
Wireless sensor networks (WSNs) find application in various fields like environmental monitoring, health-care, land security, and many more. To ease our day-to-day activity, WSNs have become an integral tool for complex data gathering tasks. Monitoring a phenomenon by a WSN depends on the collective data provided by the sensor nodes. To ensure reliable operation of WSNs, it is important to quantify the performance of such networks in terms of network reliability measures. This article studies the reliability of WSNs with multistate nodes and proposes an approach to evaluate the flow-oriented network reliability of WSNs consisting of multistate sensor nodes. The proposed method takes into account the dynamic state of the network due to multistate sensor nodes. The proposed approach includes enumeration of shortest minimal paths from application-specific flow satisfying sensor nodes (source nodes) to the sink node. It then proposes a modified sum-of-disjoint products approach to evaluate WSN reliability in the presence of multistate nodes from the enumerated shortest minimal paths. Simulations are performed on WSNs of various sizes to show the applicability of the proposed approach on arbitrary WSNs.
Book
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. Opinion Mining and Sentiment Analysis covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. The focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. The survey includes an enumeration of the various applications, a look at general challenges, and a discussion of categorization, extraction and summarization. Finally, it moves beyond just the technical issues, devoting significant attention to the broader implications that the development of opinion-oriented information-access services has: questions of privacy, vulnerability to manipulation, and whether or not reviews can have measurable economic impact. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided. Opinion Mining and Sentiment Analysis is the first such comprehensive survey of this vibrant and important research area and will be of interest to anyone with an interest in opinion-oriented information-seeking systems.
Article
Social media are generating an enormous amount of sentiment data in the form of companies getting their customers' opinions on their products, political sentiment analysis, movie reviews, etc. In this scenario, Twitter sentiment analysis is undertaken for classifying and identifying sentiments or opinions expressed by people in their tweets. Usually, raw tweets contain noise such as URLs, stop-words, and positive and negative emojis, which is reduced in pre-processing. After pre-processing, an effective topic modelling methodology, Latent Dirichlet Allocation (LDA), is implemented for extracting the keywords and identifying the concerned topics. The extracted keywords are utilized for Twitter sentiment analysis using the Possibilistic fuzzy c-means (PFCM) approach. The proposed clustering method finds the optimal clustering heads from the sentimental contents of the twitter-sandersapple2 database. The results are obtained in two forms: positive and negative. Finally, the experimental outcome shows that the proposed approach improved accuracy in Twitter sentiment analysis by up to 3-3.5% compared to the existing methods: the pattern based approach and the ensemble method.
Article
The multi-objective evolutionary algorithm based on decomposition (MOEA/D) has been recognized as a promising method for solving multi-objective optimization problems (MOPs), receiving a lot of attention from researchers in recent years. However, its performance in handling MOPs with complicated Pareto fronts (PFs) is still limited, especially for real-world applications whose PFs are often complex featuring, e.g., a long tail or a sharp peak. To deal with this problem, an improved MOEA/D (named iMOEA/D) that mainly focuses on bi-objective optimization problems (BOPs) is therefore proposed in this paper. To demonstrate the capabilities of iMOEA/D, it is applied to design optimization problems of truss structures. In iMOEA/D, the set of the weight vectors defined in MOEA/D is numbered and divided into two subsets: one set with odd-weight vectors and the other with even-weight vectors. Then, a two-phase search strategy based on the MOEA/D framework is proposed to optimize their corresponding populations. Furthermore, in order to enhance the total performance of iMOEA/D, some recent developments for MOEA/D, including an adaptive replacement strategy and a stopping criterion, are also incorporated. The reliability, efficiency and applicability of iMOEA/D are investigated through seven existing benchmark test functions with complex PFs and three optimal design problems of truss structures. The obtained results reveal that iMOEA/D generally outperforms MOEA/D and NSGA-II in both benchmark test functions and real-world applications.
Chapter
Selecting and extracting features is a vital step in sentiment analysis. The statistical techniques of feature selection like document frequency thresholding produce sub-optimal feature subsets because of the NP-hard character of the problem. Swarm intelligence algorithms are used extensively in optimization problems. Swarm optimization renders feature subset selection by improving the classification accuracy and reducing the computational complexity and feature set size. In this work, we propose the firefly algorithm for feature subset selection optimization. An SVM classifier is used for the classification task. Four different datasets are used for the classification, of which two are in Hindi and two in English. The proposed method is compared with feature selection using a genetic algorithm. This method, therefore, is successful in optimizing the feature set and improving the performance of the system in terms of accuracy.
Article
Sentiment analysis is a critical task of extracting subjective information from online text documents. Ensemble learning can be employed to obtain more robust classification schemes. However, most approaches in the field incorporated feature engineering to build efficient sentiment classifiers. The purpose of our research is to establish an effective sentiment classification scheme by pursuing the paradigm of ensemble pruning. Ensemble pruning is a crucial method to build classifier ensembles with high predictive accuracy and efficiency. Previous studies employed exponential search, randomized search, sequential search, ranking based pruning and clustering based pruning. However, there are tradeoffs in selecting the ensemble pruning methods. In this regard, hybrid ensemble pruning schemes can be more promising. In this study, we propose a hybrid ensemble pruning scheme based on clustering and randomized search for text sentiment classification. Furthermore, a consensus clustering scheme is presented to deal with the instability of clustering results. The classifiers of the ensemble are initially clustered into groups according to their predictive characteristics. Then, two classifiers from each cluster are selected as candidate classifiers based on their pairwise diversity. The search space of candidate classifiers is explored by the elitist Pareto-based multi-objective evolutionary algorithm. For the evaluation task, the proposed scheme is tested on twelve balanced and unbalanced benchmark text classification tasks. In addition, the proposed approach is experimentally compared with three ensemble methods (AdaBoost, Bagging and Random Subspace) and three ensemble pruning algorithms (ensemble selection from libraries of models, Bagging ensemble selection and LibD3C algorithm). Results demonstrate that the consensus clustering and the elitist Pareto-based multi-objective evolutionary algorithm can be effectively used in ensemble pruning. The experimental analysis with conventional ensemble methods and pruning algorithms indicates the validity and effectiveness of the proposed scheme.
Article
Sentiment analysis is one of the prominent fields of data mining that deals with the identification and analysis of sentimental contents generally available on social media. Twitter is one such social medium, used by many users to post about topics in the form of tweets. These tweets can be analyzed to find the viewpoints and sentiments of the users by using clustering-based methods. However, due to the subjective nature of the Twitter datasets, metaheuristic-based clustering methods outperform the traditional methods for sentiment analysis. Therefore, this paper proposes a novel metaheuristic method (CSK) which is based on K-means and cuckoo search. The proposed method has been used to find the optimum cluster-heads from the sentimental contents of a Twitter dataset. The efficacy of the proposed method has been tested on different Twitter datasets and compared with particle swarm optimization, differential evolution, cuckoo search, improved cuckoo search, gauss-based cuckoo search, and two n-grams methods. Experimental results and statistical analysis validate that the proposed method outperforms the existing methods. The proposed method has theoretical implications for future research on analyzing the data generated through social networks and media. This method also has very general practical implications for designing a system that can provide conclusive reviews on any social issue.
Article
With the widespread usage of social networks, forums and blogs, customer reviews emerged as a critical factor for the customers’ purchase decisions. Since the beginning of the 2000s, researchers started to focus on these reviews to automatically categorize them into polarity levels such as positive, negative, and neutral. This research problem is known as sentiment classification. The objective of this study is to investigate the potential benefit of the multiple classifier systems concept on the Turkish sentiment classification problem and propose a novel classification technique. The Vote algorithm has been used in conjunction with three classifiers, namely Naive Bayes, Support Vector Machine (SVM), and Bagging. Parameters of the SVM have been optimized when it was used as an individual classifier. Experimental results showed that multiple classifier systems increase the performance of individual classifiers on Turkish sentiment classification datasets and meta classifiers contribute to the power of these multiple classifier systems. The proposed approach achieved better performance than Naive Bayes, which was reported as the best individual classifier for these datasets, and Support Vector Machines. Multiple classifier systems (MCS) are a good approach for sentiment classification, and parameter optimization of individual classifiers must be taken into account while developing MCS-based prediction systems.
Article
Social media has become the largest data source of public opinion. The application of sentiment analysis to social media texts has great potential, but faces great challenges because of domain heterogeneity. Sentiment orientation of words varies by content domain, but learning context-specific sentiment in social media domains continues to be a major challenge. The language domain poses another challenge since the language used in social media today differs significantly from that used in traditional media. To address these challenges, we propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification using an unannotated corpus and a dictionary. We evaluate our method using two large developing corpora, containing 743,069 tweets related to the stock market and one million tweets related to political topics, respectively, and five existing sentiment lexicons as seeds and baselines. The results demonstrate the usefulness of our method, showing significant improvement in sentiment classification performance.
Article
In this paper, we present the first deep learning approach to aspect extraction in opinion mining. Aspect extraction is a subtask of sentiment analysis that consists in identifying opinion targets in opinionated text, i.e., in detecting the specific aspects of a product or service the opinion holder is either praising or complaining about. We used a 7-layer deep convolutional neural network to tag each word in opinionated sentences as either aspect or non-aspect word. We also developed a set of linguistic patterns for the same purpose and combined them with the neural network. The resulting ensemble classifier, coupled with a word-embedding model for sentiment analysis, allowed our approach to obtain significantly better accuracy than state-of-the-art methods.
Book
Sequential data from Web server logs, online transaction logs, and performance measurements is collected each day. This sequential data is a valuable source of information, as it allows individuals to search for a particular value or event and also facilitates analysis of the frequency of certain events or sets of related events. Finding patterns in sequences is of utmost importance in many areas of science, engineering, and business scenarios. Pattern Discovery Using Sequence Data Mining: Applications and Studies provides a comprehensive view of sequence mining techniques and presents current research and case studies in pattern discovery in sequential data by researchers and practitioners. This research identifies industry applications introduced by various sequence mining approaches.
Article
Distance or similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various distance/similarity measures that are applicable to compare two probability density functions, pdf in short, are reviewed and categorized in both syntactic and semantic relationships. A correlation coefficient and a hierarchical clustering technique are adopted to reveal similarities among numerous distance/similarity measures.
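To make the idea of comparing two probability distributions concrete, the Python sketch below implements three well-known measures on discrete distributions: Euclidean distance, Hellinger distance, and Kullback-Leibler divergence. The selection is illustrative only and is not claimed to match the survey's own list or categorization. The function names are hypothetical, and inputs are assumed to be equal-length lists that each sum to 1.

```python
import math

def euclidean(p, q):
    """Euclidean (L2) distance between two discrete distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def hellinger(p, q):
    """Hellinger distance, bounded in [0, 1]."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); asymmetric, infinite if q is 0 where p is not."""
    total = 0.0
    for a, b in zip(p, q):
        if a == 0:
            continue              # a term with p_i = 0 contributes nothing
        if b == 0:
            return math.inf       # p puts mass where q has none
        total += a * math.log(a / b)
    return total

p = [0.5, 0.5]
q = [0.25, 0.75]
d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)        # differs from d_pq because KL is not symmetric
```

Euclidean and Hellinger are symmetric, while KL divergence is not, which is one reason such measures are usually grouped into families rather than treated as interchangeable.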
Article
In order to successfully apply opinion mining (OM) to the large amounts of user-generated content produced every day, we need robust models that can handle the noisy input well yet can easily be adapted to a new domain or language. We here focus on opinion mining for YouTube by (i) modeling classifiers that predict the type of a comment and its polarity, while distinguishing whether the polarity is directed towards the product or video; (ii) proposing a robust shallow syntactic structure (STRUCT) that adapts well when tested across domains; and (iii) evaluating the effectiveness of the proposed structure on two languages, English and Italian. We rely on tree kernels to automatically extract and learn features with better generalization power than traditionally used bag-of-word models. Our extensive empirical evaluation shows that (i) STRUCT outperforms the bag-of-words model both within the same domain (up to 2.6% and 3% of absolute improvement for Italian and English, respectively); (ii) it is particularly useful when tested across domains (up to more than 4% absolute improvement for both languages), especially when little training data is available (up to 10% absolute improvement) and (iii) the proposed structure is also effective in a lower-resource language scenario, where only less accurate linguistic processing tools are available.
Conference Paper
We present a method that learns word embedding for Twitter sentiment classification in this paper. Most existing algorithms for learning continuous word representations typically only model the syntactic context of words but ignore the sentiment of text. This is problematic for sentiment analysis as they usually map words with similar syntactic context but opposite sentiment polarity, such as good and bad, to neighboring word vectors. We address this issue by learning sentiment-specific word embedding (SSWE), which encodes sentiment information in the continuous representation of words. Specifically, we develop three neural networks to effectively incorporate the supervision from sentiment polarity of text (e.g. sentences or tweets) in their loss functions. To obtain large scale training corpora, we learn the sentiment-specific word embedding from massive distant-supervised tweets collected by positive and negative emoticons. Experiments on applying SSWE to a benchmark Twitter sentiment classification dataset in SemEval 2013 show that (1) the SSWE feature performs comparably with hand-crafted features in the top-performing system; (2) the performance is further improved by concatenating SSWE with the existing feature set.
Article
Penalty functions are frequently employed for handling constraints in constrained optimization problems (COPs). In penalty function methods, penalty coefficients balance objective and penalty functions. However, finding appropriate penalty coefficients to strike the right balance is often very hard, because they are problem-dependent. Stochastic ranking (SR) and constraint-domination principle (CDP) are two promising penalty-function-based constraint handling techniques that avoid penalty coefficients. In this paper, the extended/modified versions of SR and CDP are implemented for the first time in the multiobjective evolutionary algorithm based on decomposition (MOEA/D) framework. This led to two new algorithms, CMOEA/D-DE-SR and CMOEA/D-DE-CDP. The performance of these new algorithms is tested on CTP-series and CF-series test instances in terms of the HV-metric, IGD-metric, and SC-metric. The experimental results are compared with NSGA-II, IDEA, and the three best performers of the CEC 2009 MOEA competition, and show better and competitive performance of the proposed algorithms on most test instances of the two test suites. The sensitivity of the performance of the proposed algorithms to parameters is also investigated. The experimental results reveal that CDP works better than SR in the MOEA/D framework.
Article
This paper presents a novel approach to Sentiment Polarity Classification in Twitter posts, by extracting a vector of weighted nodes from the graph of WordNet. These weights are used in SentiWordNet to compute a final estimation of the polarity. Therefore, the method proposes a non-supervised solution that is domain-independent. The evaluation of a generated corpus of tweets shows that this technique is promising.
Conference Paper
With the flourish of the Web, online review is becoming a more and more useful and important information resource for people. As a result, automatic review mining and summarizing has become a hot research topic recently. Different from traditional text summarization, review mining and summarizing aims at extracting the features on which the reviewers express their opinions and determining whether the opinions are positive or negative. In this paper, we focus on a specific domain: movie review. A multi-knowledge based approach is proposed, which integrates WordNet, statistical analysis and movie knowledge. The experimental results show the effectiveness of the proposed approach in movie review mining and summarizing.
Article
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.
Article
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify the review sentiment polarity, and minimum prior information has also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
Article
Multi-objective evolutionary algorithms (MOEAs) that use non-dominated sorting and sharing have been criticized mainly for: (1) their O(MN^3) computational complexity (where M is the number of objectives and N is the population size); (2) their non-elitism approach; and (3) the need to specify a sharing parameter. In this paper, we suggest a non-dominated sorting-based MOEA, called NSGA-II (Non-dominated Sorting Genetic Algorithm II), which alleviates all of the above three difficulties. Specifically, a fast non-dominated sorting approach with O(MN^2) computational complexity is presented. Also, a selection operator is presented that creates a mating pool by combining the parent and offspring populations and selecting the best N solutions (with respect to fitness and spread). Simulation results on difficult test problems show that NSGA-II is able, for most problems, to find a much better spread of solutions and better convergence near the true Pareto-optimal front compared to the Pareto-archived evolution strategy and the strength-Pareto evolutionary algorithm - two other elitist MOEAs that pay special attention to creating a diverse Pareto-optimal front. Moreover, we modify the definition of dominance in order to solve constrained multi-objective problems efficiently. Simulation results of the constrained NSGA-II on a number of test problems, including a five-objective, seven-constraint nonlinear problem, are compared with another constrained multi-objective optimizer, and the much better performance of NSGA-II is observed.
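The quadratic sorting step mentioned in this abstract can be sketched as follows. This is a minimal Python illustration of non-dominated sorting for minimization problems, written from the general description of the technique rather than from the authors' code. The function names and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(points):
    """Group point indices into successive Pareto fronts using pairwise comparisons."""
    n = len(points)
    dominated_by = [[] for _ in range(n)]  # indices that point i dominates
    counts = [0] * n                       # how many points dominate point i
    for i in range(n):
        for j in range(n):
            if dominates(points[i], points[j]):
                dominated_by[i].append(j)
            elif dominates(points[j], points[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        next_front = []
        for i in fronts[-1]:
            for j in dominated_by[i]:
                counts[j] -= 1             # remove the influence of the current front
                if counts[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
    return fronts[:-1]                     # drop the trailing empty front

# Hypothetical two-objective sample, both objectives minimized.
sample = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = fast_non_dominated_sort(sample)
```

Each pair of solutions is compared once per ordering across M objectives, which is where the O(MN^2) cost comes from.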
Article
We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
Article
Multiple, often conflicting objectives arise naturally in most real-world optimization scenarios. As evolutionary algorithms possess several characteristics due to which they are well suited to this type of problem, evolution-based methods have been used for multiobjective optimization for more than a decade. Meanwhile evolutionary multiobjective optimization has become established as a separate subdiscipline combining the fields of evolutionary computation and classical multiple criteria decision making. In this paper, the basic principles of evolutionary multiobjective optimization are discussed from an algorithm design perspective. The focus is on the major issues such as fitness assignment, diversity preservation, and elitism in general rather than on particular algorithms. Different techniques to implement these strongly related concepts will be discussed, and further important aspects such as constraint handling and preference articulation are treated as well. Finally, two applications will be presented and some recent trends in the field will be outlined. Key words: evolutionary algorithms, multiobjective optimization
Vo DT, Zhang Y. Target-dependent twitter sentiment classification with rich automatic features. Proceedings of the 24th International Joint Conference on Artificial Intelligence; 2015:1347-1353.
Coello CAC, Lamont GB, Van Veldhuizen DA. Evolutionary Algorithms for Solving Multi-objective Problems. Vol 5. Berlin: Springer; 2007.