Prediction metrics for cleaned data post stop words removal and stemming

Source publication
Article
In an age of social media, online forums, and chats, cyberbullying is a prevalent issue. On Twitter (now X), approximately 500 million tweets are shared per day (Antonakaki et al., 2021). It is the job of moderators to ensure these tweets follow standard community guidelines. However, the sheer number of tweets makes it difficult to sort manual...

Context in source publication

Context 1
... for ML Classification after Data Cleanup. Table 2 shows the classification report after removing stop words from the data and applying stemming. Classification accuracy improved marginally, by 1% to 2%, for most of the models. ...
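
The cleanup step described above (stop-word removal followed by stemming) can be illustrated with a minimal Python sketch. It is not the source publication's pipeline: the stop-word list, the suffix rules, and the function names `crude_stem` and `clean` are all hypothetical, and the stemmer is a crude suffix stripper standing in for whatever stemming algorithm the authors actually used.

```python
import re

# Hypothetical, deliberately tiny stop-word list. Real pipelines usually rely on
# a library-provided list; the source does not say which one was used.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "on", "you", "i", "it"}

# Suffixes removed by the toy stemmer, checked in this order (longest first).
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token: str) -> str:
    """Strip at most one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean(text: str) -> list[str]:
    """Lowercase the text, keep alphabetic tokens, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(clean("You are bullying the kids"))  # ['bully', 'kid']
```

Both steps shrink the vocabulary a classifier has to learn from, which may explain why a modest accuracy change is reported; the size of the effect depends on the data and the model.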

Similar publications

Article
Hate speech constitutes a major problem on microblogging platforms, with automatic detection being a growing research area. Most existing works focus on analyzing the content of social media posts. Our study shifts focus to predicting which users are likely to become targets of hate speech. This paper proposes a novel Hate-speech Target Prediction...
Preprint
Social bots (automated accounts that generate and spread content on social media) are exploiting vulnerabilities in these platforms to manipulate public perception and disseminate disinformation. This has prompted the development of public bot detection services; however, most of these services focus primarily on Twitter, leaving niche platforms vuln...
Preprint
To tackle the global challenge of online hate speech, a large body of research has developed detection models to flag hate speech in the sea of online content. Yet, due to systematic biases in evaluation datasets, detection performance in real-world settings remains unclear, let alone across geographies. To address this issue, we introduce HateDay,...
Article
Social media is a communication tool that helps users interact socially using technology. One of the most popular social media platforms is Twitter. However, the virtual police consider the platform one of the main sources of hate speech on social media. In this final project research, the authors conducted a...
Article
Social media platforms have become essential for communication but have also created spaces where harmful content, including cyberbullying, racism, and other abusive behaviors, thrives. This study employs AI-driven sentiment analysis to classify social media posts into three categories: Abusive, Neutral, and Harmless. A dataset of Twitter posts sou...

Citations

Article
Abstract: The rapid development of Artificial Intelligence (AI) technology has become a widely debated topic, especially on social media. The purpose of this study is to analyze public sentiment about AI using tweet data collected by scraping comment data. A total of seven data records, gathered using AI-related keywords such as chatgpt, openai, and deepseek, were processed with natural language processing (NLP) techniques. Preprocessing included text cleaning, lemmatization, and stop-word removal. Sentiment analysis was performed using the VADER algorithm. The results showed that 47% of tweets were positive, 32% were neutral, and 21% were negative. The data visualization also shows the most frequently used words in AI-related discussions and the most active users. This study provides a general picture of public perception of AI, opening up opportunities for further studies on the dynamics of public discourse in the digital age.