Conference Paper

Application of Data Mining Clustering for Patterns Analysis of Cyberbullying Surveys

Abstract

In recent years, harassment and abuse through mobile devices and the Internet have been on the rise. This issue, better known as cyberbullying, is crueler and more dangerous than traditional forms of bullying, largely because the anonymity of social media and the Internet can produce consequences that last across a person's lifetime. Different approaches have therefore been developed to warn about, inform about, and prevent cyberbullying situations, such as creating regulations, issuing laws, and promoting technological solutions. This paper aims to find relevant patterns by applying clustering techniques. To accomplish this goal, it uses the survey titled Scale of Victimization among Adolescents through Mobile Phones and Internet (CYBVIC), which measures behaviors of harassment, aggression, and social exclusion. The clustering results can be used to combat this social problem because the analysis highlights the seven most important questions and the hidden patterns among the survey responses.

... Cyberbullying takes many different forms, including verbal abuse, sexual misbehavior, identity theft, and the creation of groups meant to attack people [11,43,44]. It can be classified into two types: direct cyberbullying, in which the bully interacts directly with the victim, and indirect cyberbullying, in which the attacker publicly posts negative material about the victim on social media sites [8,13,17,45]. ...
Article
Full-text available
In modern life, easy access to the internet and social media has become almost essential. Although online social media channels are good means of communication, the frequency of cyberbullying and harassment on them also creates problems. On these sites, cyberbullying takes many forms that cause targeted victims to experience anxiety, guilt, and despair. Focusing on demographic factors and personal experiences of victimization, this study explores the extent and perceptions of cyberbullying within a Thai sample community. A thorough questionnaire survey yielded new insights into respondents' awareness of cyberbullying, its underlying causes, and coping mechanisms for such events. Analyzing demographic data together with cyberbullying experiences revealed a clear correlation between the length of social media use and the likelihood of encountering cyberbullying. Moreover, the work proposes a customized natural language processing system specifically designed to identify Thai-language cyberbullying events, which attains an accuracy rate of 84.23%. The study also points out possible market opportunities for commercializing this method, implying that it may be a profitable investment. These results provide a fresh understanding of language processing strategies for efficient content analysis on several internet platforms.
Received: 16 September 2024 | Revised: 12 November 2024 | Accepted: 20 December 2024
Conflicts of Interest: The authors declare that they have no conflicts of interest to this work.
Data Availability Statement: Data available from the corresponding author upon reasonable request.
Author Contribution Statement: Bello Musa Yakubu: Conceptualization, Validation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Visualization, Funding acquisition. Worapong Bumrungsri: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Visualization. Pattarasinee Bhattarakosol: Conceptualization, Methodology, Formal analysis, Resources, Data curation, Writing – review & editing, Supervision, Project administration, Funding acquisition.
Article
Full-text available
The primary objective of this study was to investigate the effects of cyberbullying through social exclusion and verbal harassment on emotional, stress, and coping responses. Twenty-nine undergraduate students (16 females aged 18.25 ± 0.58 years and 13 males aged 18.46 ± 1.13 years) volunteered for the study. All volunteers participated in two experiments that simulated cyberbullying through social exclusion or verbal harassment. In the first experiment, the effects of cyberbullying through social exclusion were investigated using a virtual ball-tossing game known as Cyberball. In the second experiment, the influence of cyberbullying through verbal harassment was tested using a hypothetical scenario together with the reading of online comments. Emotional, stress, and coping responses were measured via the Positive Affect and Negative Affect Scale, the Dundee Stress State Questionnaire, and the Coping Inventory for Task Stress, respectively. The results demonstrated that social exclusion and verbal harassment induced a negative emotional state. We also found that verbal harassment through the use of impolite language increased engagement and worry compared with social exclusion.
Article
Full-text available
The k-means algorithm is generally the best-known and most widely used clustering method, and various extensions of k-means have been proposed in the literature. Although k-means is regarded as an unsupervised learning approach to clustering in pattern recognition and machine learning, the algorithm and its extensions are always influenced by initialization and require the number of clusters to be given a priori. In that sense, the k-means algorithm is not exactly an unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it is free of initializations and parameter selection and can simultaneously find an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without any initialization or parameter selection. The computational complexity of the proposed U-k-means clustering algorithm is also analyzed, and comparisons between the proposed U-k-means and other existing methods are made. Experimental results and comparisons demonstrate these advantages of the proposed U-k-means clustering algorithm.
Article
Full-text available
Among the many clustering algorithms, K-means is widely used because it is simple and converges quickly. However, the K-value must be given in advance, and the choice of K-value directly affects the convergence result. To address this problem, we analyze four K-value selection algorithms, namely the Elbow Method, Gap Statistic, Silhouette Coefficient, and Canopy; give the pseudocode of each algorithm; and use the standard Iris data set for experimental verification. Finally, the verification results are evaluated, the advantages and disadvantages of the four algorithms for K-value selection are given, and the clustering range of the data set is pointed out.
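The elbow method named in this abstract can be illustrated with a minimal sketch. It is not the cited paper's pseudocode: the toy points, the function names, and the deterministic farthest-first initialization (used here in place of the usual random initialization so that the run is reproducible) are all assumptions made for illustration. The sketch runs plain Lloyd-style k-means for each candidate K and records the within-cluster sum of squares, whose sharp flattening suggests a reasonable K.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Assumed deterministic init: start at the first point, then repeatedly
    add the point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def mean_point(members):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(axis) / len(members) for axis in zip(*members))


def kmeans_inertia(points, k, iterations=100):
    """Run Lloyd's iterations and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean_point(m) if m else centroids[i] for i, m in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_curve(points, max_k):
    """Map each K in 1..max_k to the inertia reached by k-means."""
    return {k: kmeans_inertia(points, k) for k in range(1, max_k + 1)}


# Hypothetical toy data: three well-separated groups of three points each.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]

if __name__ == "__main__":
    for k, inertia in elbow_curve(POINTS, 5).items():
        print(k, round(inertia, 3))
```

On this toy data the inertia falls steeply up to K = 3 and only marginally afterwards, which is the "elbow" the method looks for. On real survey data the bend is usually less clear-cut, which is one reason the abstract compares the elbow method with other selection criteria.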
Article
Full-text available
Software testing plays an indispensable part in the software development process. A huge number of test cases must be run to improve software quality, which is a tedious and time-consuming process. In this paper, we aim to minimize the number of test cases by eliminating redundant ones, thereby reducing the time spent on testing. We use the popular k-means data mining algorithm along with the elbow method to reduce the number of test cases that need to be run. Experimental results show better clustering accuracy and significant elimination of redundant test cases with the proposed approach.
Conference Paper
Full-text available
With the increasing use of social media, cyberbullying behaviour has received more and more attention. Cyberbullying may cause many serious and negative impacts on a person's life and even lead to teen suicide. To reduce and stop cyberbullying, one effective solution is to automatically detect bullying content based on appropriate machine learning and natural language processing techniques. However, many existing approaches in the literature are just normal text classification models without considering bullying characteristics. In this paper, we propose a representation learning framework specific to cyberbullying detection. Based on word embeddings, we expand a list of pre-defined insulting words and assign different weights to obtain bullying features, which are then concatenated with Bag-of-Words and latent semantic features to form the final representation before feeding them into a linear SVM classifier. Experimental study on a Twitter dataset is conducted, and our method is compared with several baseline text representation learning models and cyberbullying detection methods. The superior performance achieved by our method has been observed in this study.
Article
Full-text available
The aim of this study was to evaluate the psychometric properties of the Scale of Victimization among Adolescents through Mobile Phone and Internet (CYBVIC) in Chilean students. A sample of 1,533 adolescents of both sexes, aged between 13 and 18 years, from 14 educational establishments in the Araucanía Region, Chile, was analyzed. A robust analysis procedure was used for both the linguistic adaptation and the empirical study. The results of the exploratory and confirmatory factor analyses, obtained through a cross-sample procedure, show that the CYBVIC can be used as a two-dimensional measure of aggression through the internet and mobile phone. The internal-consistency reliability analysis was satisfactory, as was its relationship with other constructs. It is concluded that the CYBVIC, despite the significant reduction in its number of items, provides sufficient psychometric evidence for its use in the Chilean population.
Article
Full-text available
Cyberbullying is a growing problem among adolescents and adults alike. To date, research concerning cyberbullying has focused on Europe and the Anglophone countries. This study contributes to understanding of cyberbullying by adding the case of adolescents in Japan. Participants were 899 high school students who completed a self-report questionnaire on technology use habits, cyberbullying and cybervictimization experiences. Logistic regression analyses were used to measure the relationship between cyberbullying, cybervictimization and several independent variables, including gender, age and technology use. Results showed that 22% of the participants had experienced cybervictimization, while 7.8% admitted to cyberbullying others. Most cyberbullying cases involved classmates and the victims knew the identities of their tormentors. Multiple logistic regression analyses revealed that cybervictimization is the biggest significant predictor of cyberbullying and vice versa. Having more online friends was significantly associated with cyberbullying and cybervictimization.
Article
Full-text available
Cyberbullying has been identified as an important problem amongst youth in the last decade. This paper reviews some recent findings and discusses general concepts within the area. The review covers definitional issues such as repetition and power imbalance, types of cyberbullying, age and gender differences, overlap with traditional bullying and sequence of events, differences between cyberbullying and traditional bullying, motives for and impact of cyber victimization, coping strategies, and prevention/intervention possibilities. These issues will be illustrated by reference to recent and current literature, and also by in-depth interviews with nine Swedish students aged 13–15 years, who had some first-hand experience of one or more cyberbullying episodes. We conclude by discussing the evidence for different coping, intervention and prevention strategies.
Article
Full-text available
This study evaluates the success of the evidence-based ConRed program, which addresses cyberbullying and other emerging problems linked with the use of the internet and seeks to promote a positive use of this new environment. The main aims of the ConRed program are a) to improve perceived control over information on the internet, b) to reduce the time dedicated to digital device usage, and c) to prevent and reduce cyberbullying. The impact of the program was evaluated with a quasi-experimental design with a sample of 893 students (595 experimental and 298 control). The results of the mixed repeated measures ANOVAs demonstrate that ConRed contributes to reducing cyberbullying and cyber-dependence, to adjusting the perception of information control, and to increasing the perception of safety at school.
Article
Full-text available
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
Chapter
The objective of this systematic literature review is to find the scientific contributions that involve the topics of cyberbullying, data science techniques and microlearning. The proposed research question is how data science engages with data treatment for microlearning about fighting bullying and cyberbullying. To answer it, four important digital libraries were selected, and the search was performed using an established search string. Two hundred eighty-three studies were initially found; after a first review, 69 were chosen to be part of the process. This literature review follows the guidelines established by Kitchenham and Charters. After applying the review protocol, which guarantees the rigour and replicability of the study, it was found that not many studies involve the initial terms; therefore, this scientific article is an initial step toward an in-depth analysis of this area of study.
Keywords: Data science, Cyberbullying, Microlearning
Article
In this article, by applying k-means clustering, cut-off points are obtained for the recoding of raw scale scores into a fixed number of groupings that preserve the original scoring. The method is demonstrated on a Likert scale measuring xenophobia that was used in a large-scale sample survey conducted in Northern Greece by the National Centre for Social Research. Applying split-half samples and fuzzy c-means clustering, the stability of the proposed solution is validated empirically. Testing its performance against three single indicators of xenophobia shows that it differentiates well between non-xenophobic and xenophobic respondents. The proposed method may be easily applied to facilitate interpretation by providing a more concise and meaningful “profile” of Likert scale (or subscale) raw scores, especially the negative and positive ends of the scale, for evaluation and social policy purposes.
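The idea of deriving cut-off points from k-means can be sketched in one dimension. This is only an illustration under stated assumptions, not the cited article's procedure: the raw scores are made up, the function names are hypothetical, centroids are initialized at evenly spaced positions in the sorted data, and each cut-off is taken as the midpoint between two adjacent centroids.

```python
from bisect import bisect_right


def kmeans_1d(scores, k, iterations=100):
    """Cluster raw scale scores into k groups; return the sorted centroids."""
    ordered = sorted(scores)
    # Assumed deterministic init: evenly spaced positions in the sorted scores.
    centroids = [float(ordered[(2 * i + 1) * len(ordered) // (2 * k)]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for s in scores:
            nearest = min(range(k), key=lambda i: abs(s - centroids[i]))
            groups[nearest].append(s)
        # An empty group keeps its previous centroid.
        updated = [sum(g) / len(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def cut_offs(centroids):
    """Assumed rule: a cut-off lies halfway between adjacent centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def recode(score, cuts):
    """Return the 0-based group index of a raw score given sorted cut-offs."""
    return bisect_right(cuts, score)


# Hypothetical raw scores from a summed Likert scale.
SCORES = [10, 11, 12, 24, 25, 26, 38, 39, 40]

if __name__ == "__main__":
    cuts = cut_offs(kmeans_1d(SCORES, 3))
    print(cuts)
    print([recode(s, cuts) for s in SCORES])
```

Because the groups are defined by thresholds on the raw score, the recoding keeps the ordering of the original scale, which is the property the abstract describes as preserving the original scoring.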
Conference Paper
Clustering is a technique in which a given data set is divided into groups called clusters in such a manner that the data points that are similar lie together in one cluster. Clustering plays an important role in the field of data mining due to the large amount of data sets. This paper reviews the various clustering algorithms available for data mining and provides a comparative analysis of clustering algorithms such as DBSCAN, CLARA, CURE, CLARANS, and KMeans.
Conference Paper
Cyberbullying has become an intensive field of research due to its major impact on society. Most researchers analyze the causes and consequences of cyberbullying; however, only a few try to improve software to reduce or stop cyberbullying and make the Internet a safer place. In this article, a current review of efforts in cyberbullying detection using web content mining techniques is presented.
Article
Bullying is a social problem. The proliferation of electronic technology has provided a new forum for bullies to harm victims. That is, bullies can transmit harmful text messages, photos, or video over the Internet and other digital communication devices to victims. This technology-enabled malpractice, known as cyberbullying, has become a social problem. College students who have been cyberbullied have committed suicide, dropped out, or endured torment while in school. This article provides an overview of cyberbullying among adults in higher education and an examination of the current status of state and federal laws that may serve as deterrents to cyberbullying.
Article
School educators play an important role in cyberbullying management. Since scarce earlier research indicated low perceived competence of school educators in handling cyberbullying, more insight is needed in what determines their actions and how to improve these practices. This study assessed school educator practices, their perceptions and context factors from a behavior change theoretical framework, and investigated educator clusters related to this. An online survey was conducted among 451 secondary school educators (teachers, principals, school counselors). School educators mostly used recommended actions (i.e. conversations with pupils, enlisting professionals for support, parental involvement, providing supportive victim advice). Four educator clusters were identified: 'referrers' (65%), 'disengaged' educators (14%), 'concerned' educators (12%) and 'use all means' educators (9%). The first two clusters were less adept at handling cyberbullying and comprised mostly teachers, particularly indicating a need for training teachers. Our findings show a need for tailored educator training, e.g. by job position, gender, school size and grade. The behavior change theoretical framework can help target educators' particular needs.
Article
Background: Bullying is common among young students, and cyberbullying has increased due to the use of technology. This study investigates the prevalence of bullying and cyberbullying among high school students and the emotional effects of bullying on students. Methods: Students at East Chapel Hill High School, Chapel Hill, North Carolina completed the Gatehouse Bullying Scale and the Peer Relations Questionnaire. They answered questions regarding how often they had experienced certain types of bullying in school and the emotional effects the bullying had on them. Results: The combined results from both surveys indicated that the prevalence of bullying was 55%, with 18% of respondents reporting cyberbullying. Teasing and name-calling were the most common types of bullying, as 40% of students reported having been teased or called names. The most serious type of bullying, being threatened with harm, hit, or kicked, occurred in 20% of boys and 8% of girls, with 25% of respondents reporting being "quite upset" by the experience. The majority (79%) of students who had been bullied did not tell anyone about being bullied, and of those who did, only 50% were taken seriously. Conclusions: Bullying is still prevalent among high school students, and cyberbullying is becoming more widespread. Most victims do not share their bullying experience, and of those who do, only half believe they are taken seriously. Both bullying among students in school and cyberbullying deserve attention due to their potentially devastating effects on victims.
Article
Often, categorical ordinal data are clustered using a similarity measure that is well defined for this kind of data together with a clustering algorithm not specifically developed for them. The aim of this article is to introduce a new clustering method designed specifically for ordinal data. Objects are grouped using a multinomial model, a cluster tree and a pruning strategy. Two types of pruning are analyzed through simulations. The proposed method makes it possible to overcome two typical problems of cluster analysis: the choice of the number of groups and the scale invariance.
Revista panamericana de salud pública
  • S Buelga
  • M J Cava
  • G Musitu
Buelga S, Cava MJ, Musitu G (2012) [validation of the adolescent victimization through mobile phone and internet scale]. Revista panamericana de salud pública = Pan Am J Publ Health 32:36-42. doi: https://doi.org/10.1590/S1020-49892012000700006
1-cyberbullying: definition, consequences, prevalence
  • M Campbell
  • S Bauman
Campbell M, Bauman S (2018) 1-cyberbullying: definition, consequences, prevalence. In: Campbell M, Bauman S (eds) Reducing cyberbullying in schools. Academic Press, pp 3-16. https://doi.org/10.1016/B978-0-12-811423-0.00001-8. https://www.sciencedirect.com/science/article/pii/B9780128114230000018
Recommender systems handbook
  • F Ricci
  • L Rokach
  • B Shapira
  • P B Kantor
Ricci F, Rokach L, Shapira B, Kantor PB (2011) Recommender systems handbook. Springer. https://doi.org/10.1088/1751-8113/44/8/085201
Prevalencia y consecuencias del cyberbullying: una revisión
  • M Garaigordobil
Garaigordobil M (2011) Prevalencia y consecuencias del cyberbullying: una revisión. Int J Psychol Psychol Therapy. https://www.redalyc.org/articulo.oa?id=56019292003
Handbook of cluster analysis
  • C Hennig
  • M Meila
  • F Murtagh
  • R Rocci
Hennig C, Meila M, Murtagh F, Rocci R (2015) Handbook of cluster analysis. CRC Press
Cyberbullying: a virtual offense with real consequences
  • T S Rao
  • D Bansal
  • S Chandran
Rao TS, Bansal D, Chandran S (2018) Cyberbullying: a virtual offense with real consequences. Indian J Psychiatry 60(1):3
Clustering techniques in data mining: a comparison
  • Garima
  • H Gulati
  • P K Singh