With the growing popularity of social media, a wide variety of digital platforms have emerged that make it easy to share opinions, experiences, stances and sentiments about the sectors or entities that interest us. Twitter is one of the most popular microblogging platforms in the world. It generates around 250 million tweets every day, which can be analyzed with the objective of gaining valuable knowledge that can help to solve a social need.
The goal of this research is to model sentiment analysis on Twitter in order to predict patterns in data behavior, using supervised machine learning approaches to automatically classify the sentiment of Spanish-language tweets into two classes, {positive, negative}. The tool consists of four main modules: tweet search, merging of multiple CSV files, data pre-processing, filtering and sentiment prediction.
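The merging and pre-processing modules can be illustrated with a minimal Python sketch. The abstract does not list the actual cleaning or filtering steps, so everything here is an assumption made for illustration: the helper names (`merge_csv`, `clean_tweet`, `preprocess`), the `text` column name, and the choice to lowercase, strip URLs and mentions, and drop empty or duplicate tweets.

```python
import csv
import io
import re

# Assumed cleaning rules; the abstract does not specify them.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-záéíóúüñ\s]")  # keeps Spanish letters only


def merge_csv(sources):
    """Concatenate the rows of several CSV file-like objects that share a header."""
    rows = []
    for src in sources:
        rows.extend(csv.DictReader(src))
    return rows


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, mentions, punctuation and digits."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    # Also removes the '#' sign while keeping the hashtag word itself.
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())


def preprocess(rows, text_column="text"):
    """Clean every tweet, then filter out empty results and exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        text = clean_tweet(row[text_column])
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    # In-memory stand-ins for CSV files; real files would be opened with
    # open(path, newline="", encoding="utf-8").
    part_a = io.StringIO("text\nHola mundo\n")
    part_b = io.StringIO("text\nhola MUNDO!\nAdiós @ana https://t.co/x\n")
    print(preprocess(merge_csv([part_a, part_b])))
```

Deduplicating after cleaning, rather than before, means that retweets differing only in case, punctuation or links collapse into a single training example.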
To build the classifier, the data were labeled manually and six popular sentiment classification algorithms were evaluated with 10-fold cross-validation: random forest, logistic regression, naive Bayes, gradient boosting, decision trees and support vector machines. During the experimental phase, the naive Bayes algorithm had the best performance, with 92% precision, recall and F1-score and 91.90% accuracy in sentiment classification using unigram vectorization.
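To make the best-performing configuration concrete, the following is a minimal sketch of a multinomial naive Bayes classifier over unigram counts, evaluated with k-fold cross-validation. It is not the authors' implementation: the function names, add-one smoothing, whitespace tokenization and the toy data are assumptions, and it reports only accuracy rather than the full set of metrics above.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels):
    """Fit multinomial naive Bayes on unigram (single-word) counts."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in doc.split():
            if word in vocab:  # words unseen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=10, seed=0):
    """Estimate accuracy with k-fold cross-validation over shuffled indices."""
    indices = list(range(len(docs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in fold)
    return correct / len(docs)


if __name__ == "__main__":
    # Toy, hand-labeled examples (hypothetical data, not from the study).
    docs = ["me encanta este dia", "que buen servicio",
            "odio este trafico", "que mal servicio"] * 5
    labels = ["positive", "positive", "negative", "negative"] * 5
    print(cross_validate(docs, labels, k=10))
```

Each fold is held out exactly once, so every labeled tweet contributes to the accuracy estimate while never being scored by a model that saw it during training.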