A graphical representation of common misclassification errors made on a noisy user (e.g., a journalist). On the left, we show the error commonly made by content-based systems on users who post or interact with apparently risky content, even though they are linked with several safe users. On the right, we show the error commonly made by topology-based systems on users who establish more relationships with risky users than with safe users, even though they post only safe content

Source publication
Article
The massive adoption of social networks has increased the need to analyze users’ data and interactions in order to detect and block the spread of propaganda and harassment behaviors, as well as to prevent actions influencing people towards illegal or immoral activities. In this paper, we propose HURI, a method for social network analysis that accurately classif...
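The figure above contrasts the failure modes of content-only and topology-only systems; the natural remedy, which the paper pursues, is to consider both views jointly. Below is a minimal sketch of that intuition, assuming hypothetical feature sets and a generic off-the-shelf classifier; it is not HURI's actual algorithm:

```python
# Minimal sketch of the hybrid idea: classify users from BOTH content and
# topology features, so that neither signal alone dominates the decision.
# NOT the HURI algorithm; features, labels, and model are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_users = 200

# Hypothetical content features, e.g. TF-IDF scores of risky keywords.
content_feats = rng.random((n_users, 50))
# Hypothetical topology features, e.g. fraction of risky neighbours,
# degree, clustering coefficient.
topology_feats = rng.random((n_users, 10))
labels = rng.integers(0, 2, n_users)  # 0 = safe, 1 = risky (toy labels)

# Joint representation: a simple concatenation of both views.
X = np.hstack([content_feats, topology_feats])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)
print(clf.predict(X[:5]))  # toy predictions for the first five users
```

Concatenation is the simplest possible fusion; the point is only that a user who looks risky in one view can be rescued by evidence from the other, which neither single-view system in the figure can do.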

Similar publications

Article
Twitter boasts 319 million daily active users, making it an invaluable asset for public figures and businesses looking to cultivate a positive public image. Businesses can leverage sentiment analysis for real-time polling on various social media platforms, allowing them to gauge public sentiment and opinion accurately. Recently, academic researchers f...
Article
Bolsonaro’s supporters used social media to spread content during key events related to the Brasília attack. An unprecedented analysis of more than 15,000 public WhatsApp groups showed that these political actors tried to manufacture consensus in preparation for and after the attack. A cross-platform time series analysis showed that the spread of c...
Article
The rise of right-wing governments around the globe has revealed common strategies that are replicated across several countries. To investigate this phenomenon, the article presents cases involving Jair Bolsonaro in order to identify characteristics common to the profile of autocratic populist leaders, such as the spectacularization of politics through statements...
Conference Paper
Text classification is one of the most important tasks in Natural Language Processing. As text data grows rapidly, classifying text in large datasets demands increasing computational power. The task is especially difficult for characteristic-rich languages like Bangla. Having good-quality text data significantly affects the outcome of the model that has be...

Citations

... The work in [5] performs multivariate data fusion via Independent Vector Analysis (IVA) with sparse inverse covariance estimation and demonstrates its usefulness in detecting misinformation during high-impact events. The issue of safe vs. risky user classification in social networks is addressed in [6], where a hybrid method is devised to consider user network topology and text contents in a joint manner. The authors in [7] integrate novelty detection, sentiment prediction, and misinformation detection within the same deep multi-task learning architecture. ...
Conference Paper
Social media content can present a number of threats, including misinformation and hate speech towards specific demographic groups. One challenge is to effectively discriminate between benign and malicious posts, given the massive amount of available content. In this context, predictive models for malicious content detection can be extremely valuable, leading to the automatic removal of posts and user accounts or content being flagged for subsequent moderation. However, some of the existing detection models are limited to the analysis of a single data modality. At the same time, most multi-modal approaches operate in a fully supervised learning setting that assumes the availability of labeled data for both benign and malicious content. In this paper, we fill this gap by proposing a multimodal one-class learning approach for malicious online content detection. Our approach leverages feature extraction, dimensionality reduction, and one-class learning models to analyze text and image data in online posts simultaneously. Models learn their decision function in the challenging scenario where only benign online content is used as training data, overcoming the limitations of a fully supervised setting. Our experiments with two real-world datasets containing misinformation and hate speech posts reveal the effectiveness of different combinations of one-class learning models and dimensionality reduction techniques.
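The pipeline this abstract names, feature extraction, then dimensionality reduction, then a one-class model trained on benign content only, can be sketched quite compactly. In the sketch below, scikit-learn's PCA and one-class SVM stand in for the paper's unspecified model combinations, and random matrices stand in for extracted text and image embeddings; all dimensions and hyperparameters are assumptions:

```python
# Sketch of multimodal one-class learning: train only on benign posts,
# flag anything the model considers out-of-distribution as malicious.
# Embeddings, dimensions, and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)

# Stand-ins for pre-extracted embeddings of benign training posts.
text_emb = rng.normal(size=(500, 768))    # e.g. transformer text features
image_emb = rng.normal(size=(500, 512))   # e.g. CNN image features
X_benign = np.hstack([text_emb, image_emb])  # simple multimodal fusion

# Dimensionality reduction followed by a one-class decision function.
model = make_pipeline(PCA(n_components=32), OneClassSVM(nu=0.05, kernel="rbf"))
model.fit(X_benign)  # note: only benign data is seen during training

# At test time, -1 marks posts that deviate from the benign distribution.
X_test = rng.normal(size=(10, 768 + 512))
print(model.predict(X_test))  # +1 = benign-like, -1 = flagged as malicious
```

Training on benign content alone sidesteps the need for labeled malicious examples, which is exactly the limitation of fully supervised settings that the abstract points out.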
... These state-of-the-art approaches can be categorized as neural network-based methods [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] and feature-based methods, which often rely on linguistic features, including text-based [22] [23], repetitiveness [24] [25] [26], and emotional semantics [27]. However, existing research works present some limitations, including the inability to effectively combine information from multiple modalities, the exclusive use of English as the primary language for detection, and a focus on addressing specific topics, e.g. ...
... Authors in [9] adopt Doc2Vec embeddings of emails and RF/SVM models to classify emails as phishing or legitimate. A similar approach is used in [10] to predict COVID-19 vaccine hesitancy, and in [11] to classify social network users as risky or safe. The authors in [12] leverage BERT embeddings and a CNN classifier to detect spam tweets, combining topic-based features with contextual BERT embeddings. ...
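The [9]-style pipeline, Doc2Vec document embeddings fed to a Random Forest (or SVM), translates almost directly into gensim and scikit-learn. The toy corpus, vector size, and hyperparameters below are illustrative assumptions, not values from the cited work:

```python
# Sketch of a Doc2Vec + Random Forest text classifier, in the spirit of
# the pipeline described above. Corpus and parameters are toy assumptions.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.ensemble import RandomForestClassifier

docs = [
    ("verify your account now click this link", 1),   # phishing-like
    ("meeting moved to 3pm see agenda attached", 0),  # legitimate
    ("your password expires reset it here today", 1),
    ("quarterly report draft ready for review", 0),
]

tagged = [TaggedDocument(words=text.split(), tags=[i])
          for i, (text, _) in enumerate(docs)]

# Learn fixed-length document embeddings from the (toy) corpus.
d2v = Doc2Vec(tagged, vector_size=32, min_count=1, epochs=50, seed=1)

X = [d2v.dv[i] for i in range(len(docs))]   # one embedding per document
y = [label for _, label in docs]

clf = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

# Embed an unseen message with infer_vector, then classify it.
new_vec = d2v.infer_vector("click here to verify your password".split())
print(clf.predict([new_vec]))  # 1 = phishing-like, 0 = legitimate (toy)
```

Since infer_vector embeds unseen documents against the frozen Doc2Vec model, the downstream classifier needs no retraining when new messages arrive.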