Matthew L. Williams's research while affiliated with Cardiff University and other places

Publications (55)

Article
Full-text available
Hateful individuals and groups have increasingly been using the Internet to express their ideas, spread their beliefs, and recruit new members. Understanding the network characteristics of these hateful groups could help understand individuals’ exposure to hate and derive intervention strategies to mitigate the dangers of such networks by disrupt...
Article
Twitter has emerged as one of the most popular platforms to get updates on entertainment and current events. However, due to its 280-character restriction and automatic shortening of URLs, it is continuously targeted by cybercriminals to carry out drive-by download attacks, where a user’s system is infected by merely visiting a Web page. Popular ev...
Article
Full-text available
In this article, we conduct a comprehensive study of online antagonistic content related to Jewish identity posted on Twitter between October 2015 and October 2016 by UK-based users. We trained a scalable supervised machine learning classifier to identify antisemitic content to reveal patterns of online antisemitism perpetration at the source. We b...
Article
Forensic science is constantly evolving and transforming, reflecting the numerous technological innovations of recent decades. There are, however, continuing issues with the use of digital data, such as the difficulty of handling large-scale collections of text data. As one way of dealing with this problem, we used machine-learning techniques, part...
Conference Paper
Full-text available
This paper presents a system developed during our participation (team name: scmhl5) in the TRAC-2 Shared Task on aggression identification. In particular, we participated in English Sub-task A on three-class classification ('Overtly Aggressive', 'Covertly Aggressive' and 'Non-aggressive') and English Sub-task B on binary classification for Misogyni...
Article
Full-text available
Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyberhate) has been frequently posted and widely circulated via the World Wide Web. This can be considered as a key risk factor for individual and societal tension surrounding regional instability....
Article
Full-text available
National governments now recognize online hate speech as a pernicious social problem. In the wake of political votes and terror attacks, hate incidents online and offline are known to peak in tandem. This article examines whether an association exists between both forms of hate, independent of ‘trigger’ events. Using Computational Criminology that...
Article
Full-text available
Linked survey and Twitter data present an unprecedented opportunity for social scientific analysis, but the ethical implications for such work are complex—requiring a deeper understanding of the nature and composition of Twitter data to fully appreciate the risks of disclosure and harm to participants. In this article, we draw on our experience of...
Article
Full-text available
To explore possible distinctive features of online memorials for youth suicides, amid concerns about glorification, we compared public Facebook memorials for suicides and road traffic accident deaths, using Linguistic Inquiry and Word Count software. People who posted on memorial sites wrote at greater length about suicides, using longer words and...
Conference Paper
Full-text available
In traditional machine learning, classifier training is typically undertaken in the setting of single-task learning, so the trained classifier can discriminate between different classes. However, this must be based on the assumption that different classes are mutually exclusive. In real applications, the above assumption does not always hold. Fo...
Article
Full-text available
Sentiment analysis is a very popular application area of text mining and machine learning. The popular methods include Support Vector Machine, Naive Bayes, Decision Trees and Deep Neural Networks. However, these methods generally belong to discriminative learning, which aims to distinguish one class from others with a clear-cut outcome, under the p...
Article
Full-text available
In light of issues such as increasing unit nonresponse in surveys, several studies argue that social media sources such as Twitter can be used as a viable alternative. However, there are also a number of shortcomings with Twitter data such as questions about its representativeness of the wider population and the inability to validate whose data you...
Preprint
Full-text available
Sentiment analysis is a very popular application area of text mining and machine learning. The popular methods include Support Vector Machine, Naive Bayes, Decision Trees and Deep Neural Networks. However, these methods generally belong to discriminative learning, which aims to distinguish one class from others with a clear-cut outcome, under the p...
Conference Paper
Full-text available
In this paper we present a proposal to address the problem of costly and unreliable human annotation, which is important for the detection of hate speech in web content. In particular, we propose to use the text produced by suspended accounts in the aftermath of a hateful event as a subtle and reliable source for hate speech pre...
Article
Full-text available
Objectives: In the light of concern about the harmful effects of media reporting of suicides and a lack of comparative research, this study compares the number and characteristics of reports on young people’s suicides and road traffic accidents (RTAs) in newspapers and Twitter. Methods: Comparison of newspaper and Twitter reporting of deaths by sui...
Article
Full-text available
Cybercrime is recognized as one of the top threats to UK economic security. On a daily basis, the computer networks of businesses suffer security breaches. A less explored dimension of this problem is cybercrimes committed by insiders. This paper provides a criminological analysis of corporate insider victimization. It begins by presenting reviews...
Preprint
Full-text available
Hateful and offensive language (also known as hate speech or cyber hate) posted and widely circulated via the World Wide Web can be considered as a key risk factor for individual and societal tension linked to regional instability. Automated Web-based hate speech detection is important for observing and understanding trends of societal tensio...
Article
Full-text available
The increasing popularity of social media platforms creates new digital social networks in which individuals can interact and share information, news, and opinion. The use of these technologies appears to have the capacity to transform current social configurations and relations, not least within the public and civic spheres. Within the social scie...
Conference Paper
Full-text available
Empirical research involving the analysis of Internet-based data raises a number of ethical challenges. One instance of this is the analysis of Twitter data, in particular when specific tweets are reproduced for the purposes of dissemination. Although Twitter is an open platform it is possible to question whether this provides a sufficient ethical...
Article
Full-text available
New and emerging forms of data, including posts harvested from social media sites such as Twitter, have become part of the sociologist’s data diet. In particular, some researchers see an advantage in the perceived ‘public’ nature of Twitter posts, representing them in publications without seeking informed consent. While such practice may not be at...
Conference Paper
Full-text available
The deliberate misuse of technical infrastructure (including the Web and social media) for cyber deviant and cybercriminal behaviour, ranging from the spreading of extremist and terrorism-related material to online fraud and cyber security attacks, is on the rise. This workshop aims to better understand such phenomena and develop methods for tackli...
Article
Full-text available
The nature of the risk or threat posed by ‘cyberfraud’ – fraud with a cyber dimension – is examined empirically based on data reported by the public and business to Action Fraud. These are used to examine the implications for a more effective risk-based response, both by category of fraud and also responding to cyberfraud generally, not just in the...
Article
Full-text available
Hateful and antagonistic content published and propagated via the World Wide Web has the potential to cause harm and suffering on an individual basis, and lead to social tension and disorder beyond cyber space. Despite new legislation aimed at prosecuting those who misuse new forms of communication to post threatening, harassing, or grossly offensi...
Article
Full-text available
Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emerg...
Article
Full-text available
Social media platforms provide an increasingly popular means for individuals to share content online. Whilst this produces undoubted societal benefits, the ability for content to be spontaneously posted and reposted creates an ideal environment for rumour and false/malicious information to spread rapidly. When this occurs it can cause significant h...
Article
Background: Concern has been expressed about the potentially contagious effect of television soap opera suicides and suicidal language in social media. Aims: Twitter content was analyzed during the week in which a fictional assisted suicide was broadcast on a British television soap opera, "Coronation Street." Method: Tweets were collected if...
Article
Full-text available
This paper critically examines the affordances and limitations of big data for the study of crime and disorder. We hypothesize that disorder-related posts on Twitter are associated with actual police crime rates. Our results provide evidence that naturally occurring social media data may provide an alternative information source on the crime proble...
Article
Full-text available
The last 5-10 years have seen a massive rise in the popularity of social media platforms such as Twitter, Facebook, Tumblr etc. These platforms enable users to post and share their own content instantly, meaning that material can be seen by multiple others in a short period of time. The growing use of social media has been accompanied by concerns t...
Technical Report
Full-text available
The use of the internet and technology to commit economic crime has been escalating sharply in recent years, bringing new challenges in preventing and tackling such crime. This research, commissioned by the City of London Corporation, with the support of the City of London Police, and prepared by Cardiff University, explores the nature of economic...
Conference Paper
Full-text available
The increasing popularity of social media platforms such as Facebook, Twitter, Instagram and Tumblr has been accompanied by concerns over the growing prevalence of 'harmful' online interactions. The term 'digital wildfire' has been coined to characterise the capacity for provocative content on social media to propagate rapidly and cause offline har...
Article
Full-text available
This paper presents the first criminological analysis of an online social reaction to a crime event of national significance, in particular the detection and propagation of cyberhate on social media following a terrorist attack. We take the Woolwich, London terrorist attack in 2013 as our event of interest and draw on Cohen’s process of warning, im...
Article
The election forecasting 'industry' is a growing one, both in the volume of scholars producing forecasts and methodological diversity. In recent years a new approach has emerged that relies on social media and particularly Twitter data to predict election outcomes. While some studies have shown the method to hold a surprising degree of accuracy the...
Article
Full-text available
Online fraud is the most prevalent acquisitive crime in Europe. This study applies routine activities theory to a subset of online fraud, online identity theft, by exploring country-level mechanisms, in addition to individual determinants via a multi-level analysis of Eurobarometer survey data. This paper adds to the theory of cybercrime and policy...
Article
The use of “Big Data” in policy and decision making is a current topic of debate. The 2013 murder of Drummer Lee Rigby in Woolwich, London, UK led to an extensive public reaction on social media, providing the opportunity to study the spread of online hate speech (cyber hate) on Twitter. Human annotated Twitter data was collected in the immediate a...
Article
Full-text available
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.ne...
Article
Full-text available
The growing number of people using social media to publish their opinions, share expertise, make social connections and promote their ideas to an international audience is creating data on an epic scale. This enables social scientists to conduct research into ethnography, discourse analysis and analysis of social interactions, providing insight int...
Article
This paper presents findings from the All Wales Hate Crime Project. Most hate crime research has focused on discrete victim types in isolation. For the first time, internationally, this paper examines the psychological and physical impacts of hate crime across seven victim types drawing on quantitative and qualitative data. It contributes to the ha...
Article
Full-text available
In this paper, we reflect on the disciplinary contours of contemporary sociology, and social science more generally, in the age of ‘big and broad’ social data. Our aim is to suggest how sociology and social sciences may respond to the challenges and opportunities presented by this ‘data deluge’ in ways that are innovative yet sensitive to the socia...
Article
Full-text available
Little is currently known about the factors that promote the propagation of information in online social networks following terrorist events. In this paper we took the case of the terrorist event in Woolwich, London in 2013 and built models to predict information flow size and survival using data derived from the popular social networking site Twi...
Article
Twenty years ago Mark Burke's pioneering research into homosexuality and policing evidenced widespread prejudice and hostility toward lesbian, gay and bisexual police officers in nine forces across England and Wales. These serving officers were felt to represent the most serious kind of contamination and threat to the integrity of the British Polic...
Article
We propose that late modern policing practices, that rely on neighbourhood intelligence, the monitoring of tensions, surveillance and policing by accommodation, need to be augmented in light of emerging ‘cyber-neighbourhoods’, namely social media networks. The 2011 riots in England were the first to evidence the widespread use of social media platf...
Article
Purpose ‐ This paper aims to map out multi-agency partnerships in the UK information assurance (UKIA) network in the UK. Design/methodology/approach ‐ The paper surveyed members of the UKIA community and achieved a 52 percent response rate (n=104). The paper used a multi-dimensional scaling (MDS) technique to map the multi-agency cooperation space...
Article
A perennial criticism regarding the use of social media in social science research is the lack of demographic information associated with naturally occurring mediated data such as that produced by Twitter. However the fact that demographics information is not explicit does not mean that it is not implicitly present. Utilising the Cardiff Online Soc...
Article
Full-text available
Technological innovation in digital communications, epitomised in the shift from the informational web (Web1.0) to the interactional web (Web2.0), provokes new opportunities and challenges for social research. Web2.0 technologies, particularly the new social media (e.g. social networking, blogging and micro-blogging) as well as the increased access...
Article
eCrime is now the typical volume property crime in the United Kingdom impacting more of the public than traditional acquisitive crimes such as burglary and car theft (Anderson et al, 2012). It has become increasingly central to the National Security Strategy of several countries; in the United Kingdom becoming a Tier One threat. While it is apparen...
Article
This paper examines public perceptions of three sexual grooming types: computer-mediated sexual grooming (CMSG), familial sexual grooming (FSG) and localised sexual grooming (LSG). Using data from a national survey of 557 respondents from the United Kingdom, we tested models that predicted perceptions of the prevalence of CMSG, FSG and LSG and the...

Citations

... Twitter is one of the most popular entertainment and news updating source. However, owing to its 280-character cap and automatic shortening of URLs, computer attackers are constantly targeting for drive-by-download assaults where a user's system is compromised by visiting a web page [65]. Cybercriminals utilize regular processes to recruit large quantities of people to hack and distribute malware using common hashtags to generate misleading messages to attract fraudulent websites. ...
... Social media has always been a useful source of information for the management of natural disasters and crises, such as earthquakes [3], floods [4], pandemic Zika and Ebola [5,6] or terrorist attacks [7,8], to name a few. Traditional hard sensors only offer quantitative information that is not valid in many scenarios where human-interactions are still mandatory. ...
... The output of these is then combined in the next stage to determine the class of the text input. This is performed either through The problem of detecting aggressive language that uses out-of-vocabulary words has also been approached by several researchers, Madisetty & Sankar Desarkar (2018); Raiyani et al. (2018); Liu et al. (2020), distinguishing the Overtly Aggressive hateful textual content from the Covertly Aggressive. Nevertheless, no results exist so far to demonstrate any performance gain over the classic approaches, nor are there many differences between the strategies used to detect direct and indirect aggression. ...
... Antisemitic hate crimes in the United States rose to the highest level since 2008 in 2019 (Levin, 2020). Social media platforms are particularly popular vessels for antisemitic discourse (Ozalp et al., 2020; Zannettou et al., 2020). Despite this, a recent survey from the American Jewish Committee reports that almost half of Americans are unfamiliar with the term antisemitism (Mayer, 2020) and nearly 3 out of 10 Americans say, "they are not sure how many Jews died during the Holocaust" (Alper et al., 2020: 6). ...
... Users consent to have their data made available to third parties including academics when they sign up to Reddit. Existing ethical guidelines state that in this situation explicit consent is not required from each user (Procter et al., 2019). We obfuscate user names as User_A or User_B to reduce the possibility of identifying users. ...
... Numerous scholars have also linked hate crimes to hateful speech and extremist ideas (Chan et al., 2016; Foxman & Wolf, 2013; Freilich et al., 2011; Singh & Singh, 2012; The New America Foundation International Security Program, n.d.). Indeed, online hatred is often the precursor of offline crimes (Chan et al., 2016; Williams et al., 2020). For example, Awan and Zempi (2016) found that the 2015 terrorist attacks in Paris and Tunisia and the activities of Islamic State militants triggered a significant increase in anti-Muslim attacks both online and offline, and victims feared that online hatred would materialize in actual violence against them in the offline world (Awan & Zempi, 2016; Zempi, 2014). ...
... E. Chen & Wojcik, 2016; K. Chen et al., 2021; Fiesler & Proferes, 2018; Murphy, 2017; Samuel & Buchanan, 2020; Sloan et al., 2020). ...
... However, because of the wide coverage of sampling designs, crime estimates from victimisation surveys are generally only reliable at larger regional levels (Rosenbaum & Lavrakas, 1995). When studies do attempt to offer a more fine-grained spatial focus, they are normally restricted to a specific time and place (Akpinar et al., 2021; Buil-Gil et al., 2022; Hunter et al., 2021), or have relied on alternatively generated crime measures, such as acoustic gun-detection technology to measure gun crime (Mazeika, 2022) or social media data to measure hate crime (Williams et al., 2020). ...
... Most records included in this review were conducted by researchers based in North America (n = 51; 57%), Europe (n = 22; 25%) [20, 46, 59–77], and Oceania (n = 8; 9%) [78–84] (Table 1). All included records were published between 2000 and 2021, and each record reported on a distinct study. ...
... As with offline hate crime, online hate speech (or cyberhate) posted on social media has become a growing social problem. In 2016 and 2017, the UK's decision to leave the European Union, and a string of terror attacks, was followed by noticeable and unprecedented increases in cyberhate, with a rhetoric of invasion, threat and otherness (Alorainy et al. 2019). Some research suggests that the perpetrators of cyberhate have similar motivations to those who resort to violence offline (Awan 2014; Chan et al. 2016; Awan and Zempi 2017). ...