Presentation

SASTuit: sentiment analysis software using machine learning

Authors:
  • Totalplay Telecomunicaciones
  • Institute Technological of Misantla

Abstract and Figures

SASTuit is a tool for analyzing Spanish-language tweets. It uses machine learning to classify the predominant sentiment into two classes: positive and negative. It has four main functions: 1) tweet search, 2) merging of several datasets, 3) data pre-processing and filtering, and 4) prediction of Spanish tweets as positive or negative.
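As an illustration of functions 2) and 3), the following is a minimal sketch of dataset merging and tweet pre-processing. It is not SASTuit's actual implementation: the function names, the `text` column name, and the specific cleaning rules (lowercasing, removing URLs, mentions, punctuation and emojis) are assumptions chosen for the example.

```python
import csv
import io
import re

# Hypothetical cleaning rules; the real tool may use different ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^\w\s]")  # punctuation, '#', emojis


def merge_csv_texts(csv_texts):
    """Merge several CSV datasets (given as strings) into one list of rows.

    Assumes every CSV has a 'text' column; exact duplicate tweets are dropped.
    """
    seen, merged = set(), []
    for csv_text in csv_texts:
        for row in csv.DictReader(io.StringIO(csv_text)):
            if row["text"] not in seen:
                seen.add(row["text"])
                merged.append(row)
    return merged


def preprocess(tweet):
    """Lowercase a tweet and strip URLs, mentions, punctuation and emojis."""
    tweet = tweet.lower()
    tweet = URL_RE.sub(" ", tweet)
    tweet = MENTION_RE.sub(" ", tweet)
    tweet = NON_WORD_RE.sub(" ", tweet)
    return " ".join(tweet.split())  # collapse repeated whitespace


def filter_rows(rows):
    """Pre-process every row and discard tweets that end up empty."""
    cleaned = [dict(row, text=preprocess(row["text"])) for row in rows]
    return [row for row in cleaned if row["text"]]
```

Accented Spanish letters survive the cleaning step because `\w` matches Unicode word characters in Python 3 string patterns.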
Year XIII, Vol. III. September - December 2021
... As for Lingmotif, developed by the Tecnolengua research group at the Universidad de Málaga (ltl.uma.es), it decides whether a text is positive or negative and how intense that sentiment is (unlike, for example, the SASTuit software of Herrera Contreras et al., 2021). It provides quantitative data, a sentiment profile, and even some qualitative analysis of the detected text elements with a syntactic-semantic orientation. It also takes emojis into account, without which many tweets would be neutral, since, according to Moreno-Ortiz (2019), negativity is expressed more explicitly. The following figures highlight the parameters considered and the analysis it produces for the tweet: «Eu las faltas de ortografia son intencionales no me crean tan estúpida». ...
Article
Full-text available
Orthography as an ideological issue on Twitter. Abstract. The objective of this work is to study the metalinguistic awareness of Twitter users regarding Spanish orthography from a political-ideological point of view. To this end, a sociolinguistic study was carried out on a corpus of 30,225 tweets. The results indicate that Twitter users mostly use orthography as a weapon to throw at others and as a marker of prestige, while a growing number of users consider that using and preserving orthographic rules is a sign of classism, if not of outright social elitism or "colonial" racism. Keywords: synchronic sociolinguistics; corpus linguistics; ideologies about orthography.
Thesis
Full-text available
With the growing popularity of social media, a wide variety of digital media has been developed that makes it easy to share opinions, experiences, positions and sentiments about different sectors or entities of interest. Twitter is one of the most popular microblogging platforms in the world. It generates around 250 million tweets every day, from which valuable knowledge can be extracted to help address social needs. The goal of this research is to model sentiment analysis on Twitter to predict patterns in data behavior, using supervised machine learning approaches to automatically detect sentiment in two classes, {positive, negative}, in Spanish. The tool consists of four main modules: tweet search, merging of multiple CSV files, data pre-processing and filtering, and sentiment prediction. To build the classifier, the data was reviewed and labeled manually, and six popular sentiment classification algorithms were evaluated with 10-fold cross-validation: random forest, logistic regression, naive Bayes, gradient boosting, decision trees and support vector machines. During the experimental phase, the naive Bayes algorithm performed best, with 92% precision, recall and F1-score and 91.90% accuracy in sentiment classification using unigram vectorization models.
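The best-performing setup described above (naive Bayes on unigram features, evaluated with cross-validation) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the thesis code: the class and function names are hypothetical, Laplace (add-one) smoothing and whitespace tokenization are assumed, and the folds are built by simple striding rather than shuffling or stratification.

```python
import math
from collections import Counter, defaultdict


def unigrams(text):
    """Whitespace tokenization into lowercase unigrams (an assumption)."""
    return text.lower().split()


class UnigramNaiveBayes:
    """Multinomial naive Bayes over unigram counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(unigrams(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in unigrams(text):
                if word in self.vocab:  # ignore unseen words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_val_accuracy(texts, labels, k=10):
    """Mean accuracy over k folds; fold i holds items i, i+k, i+2k, ..."""
    n = len(texts)
    fold_scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        if not test_idx:
            continue  # fewer samples than folds
        train_idx = [i for i in range(n) if i not in test_idx]
        model = UnigramNaiveBayes().fit(
            [texts[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        hits = sum(model.predict(texts[i]) == labels[i] for i in test_idx)
        fold_scores.append(hits / len(test_idx))
    return sum(fold_scores) / len(fold_scores)
```

With a labeled list of cleaned tweets, `cross_val_accuracy(texts, labels, k=10)` returns the mean accuracy across folds. The figures reported in the abstract come from the authors' own data and tooling and would not be reproduced by this toy version.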
Article
Full-text available
Twitter has become an ideal social network for analyzing audiences and studying opinions about events around the world in real time. The aim of this research is to analyze the sentiment (positive, negative and neutral) of the offers posted on Twitter under the hashtag #BlackFriday. The methodology is a machine learning sentiment analysis that identifies the sentiment of the offers posted on the social network.
Article
Modern companies generate value by digitalizing their services and products. Knowing what customers are saying about the firm in social media reviews is a key factor for success in the big data era. However, social media data analysis is a complex discipline because of the subjectivity of review text and the additional features in raw data. Some frameworks proposed in the existing literature involve many steps, which increases their complexity. A two-stage framework is proposed to tackle this problem. The first stage focuses on data preparation and on finding an optimal machine learning model for the data. The second stage relies on established layers of big data architectures and focuses on obtaining results from the data by making the most of the machine learning model from stage one. Thus, the first stage analyzes big and small datasets in a non-big-data environment, whereas the second stage analyzes big datasets by applying the machine learning model from the first stage. A case study is then presented for the first stage of the framework, analyzing reviews of hotel-related businesses. Several machine learning algorithms were trained for two, three and five classes, with the best results found for binary classification.
Article
User generated content (UGC) is providing new broad information datasets about airport service quality (ASQ) that are more easily available to researchers than information gathered using traditional techniques, such as surveys conducted with passengers. Research in the field is characterized by UGC provided on specialized blogs and websites. This study utilizes London Heathrow airport's Twitter account dataset and applies the sentiment analysis (SA) technique to measure ASQ. The aim of this research is to explore how SA techniques can identify new insights beyond those provided by more traditional methods. The dataset includes 4392 tweets and the SA identifies 23 attributes that can be used for comparison with other ASQ scales. Findings indicate that the frequency of passenger references to the attributes of the scale differs significantly in some cases and that the discernment of these differences can provide actionable insights for airport management when improving airport service quality.
Article
With the rapid growth of social media, sentiment analysis, also called opinion mining, has become one of the most active research areas in natural language processing. Its application is also widespread, from business services to political campaigns. This article gives an introduction to this important area and presents some recent developments.
Global Digital Report
  • We Are Social
We Are Social (2020). Global Digital Report 2020. Retrieved March 1, 2020, from https://wearesocial.com/digital-2020.