Bennett Kleinberg

Tilburg University | UVT · Department of Methodology and Statistics

PhD

About

Publications: 66
Reads: 104,465
Citations: 5,592
Additional affiliations
September 2018 - present
University College London
Position
  • Professor (Assistant)
November 2015 - August 2018
University of Amsterdam
Position
  • PhD Student


Publications (66)
Chapter
This chapter seeks to understand lone-actor terrorists through their use of language. Studies examining terrorist and extremist online postings, pre-attack threats, and manifestos are described. The authors specifically focus on efforts in which linguistic analysis is performed automatically, for example, to measure potential warning signs of viole...
Preprint
Anonymously written threats constitute a special form of worrying behavior, in which the author of a threat decides to hide their identity. Importantly, anonymous threats are an increasingly common issue exacerbated by online communication. Anonymity raises additional challenges for threat assessors, but little is known about how practitioners appr...
Preprint
The problem of online threats and abuse directed at public figures could potentially be mitigated with a computational approach, where sources of abusive language are better understood or identified through author profiling. However, abusive language constitutes a specific domain of language that is untested on whether differences emerge based on p...
Article
Full-text available
Background: Cryptocurrency fraud has become a growing global concern, with various governments reporting an increase in the frequency of and losses from cryptocurrency scams. Despite increasing fraudulent activity involving cryptocurrencies, research on the potential of cryptocurrencies for fraud has not been examined in a systematic study. This rev...
Article
Full-text available
The introduction of COVID-19 lockdown measures and an outlook on return to normality are demanding societal changes. Among the most pressing questions is how individuals adjust to the pandemic. This paper examines the emotional responses to the pandemic in a repeated-measures design. Data (n = 1698) were collected in April 2020 (during strict lock...
Preprint
Full-text available
Research shows that natural language processing models are generally considered to be vulnerable to adversarial attacks; but recent work has drawn attention to the issue of validating these adversarial inputs against certain criteria (e.g., the preservation of semantics and grammaticality). Enforcing constraints to uphold such criteria may render a...
Preprint
Full-text available
The introduction of COVID-19 lockdown measures and an outlook on return to normality are demanding societal changes. Among the most pressing questions is how individuals adjust to the pandemic. This paper examines the emotional responses to the pandemic in a repeated-measures design. Data (n=1698) were collected in April 2020 (during strict lockdow...
Article
Full-text available
In this crowdsourced initiative, independent analysts used the same dataset to test two hypotheses regarding the effects of scientists' gender and professional status on verbosity during group meetings. Not only the analytic approach but also the operationalizations of key variables were left unconstrained and up to individual analysts. For instanc...
Article
Full-text available
The increased threat of right-wing extremist violence necessitates a better understanding of online extremism. Radical message boards, small-scale social media platforms, and other internet fringes have been reported to fuel hatred. The current paper examines data from the right-wing forum Stormfront between 2001 and 2015. We specifically aim to un...
Article
Full-text available
The media frequently describes the 2017 Charlottesville ‘Unite the Right’ rally as a turning point for the alt-right and white supremacist movements. Social movement theory suggests that the media attention and public discourse concerning the rally may have engendered changes in social identity performance and visibility of the alt-right, but this...
Article
Full-text available
This paper introduces the Grievance Dictionary, a psycholinguistic dictionary that can be used to automatically understand language use in the context of grievance-fueled violence threat assessment. We describe the development of the dictionary, which was informed by suggestions from experienced threat assessment practitioners. These suggestions an...
Preprint
Full-text available
For sensitive text data to be shared among NLP researchers and practitioners, shared documents need to comply with data protection and privacy laws. There is hence a growing interest in automated approaches for text anonymization. However, measuring such methods' performance is challenging: missing a single identifying attribute can reveal an indiv...
Article
Full-text available
Background: Deception detection is a prevalent problem for security practitioners. With a need for more large-scale approaches, automated methods using machine learning have gained traction. However, detection performance still implies considerable error rates. Findings from different domains suggest that hybrid human-machine integrations could offe...
Conference Paper
Research shows that natural language processing models are generally considered to be vulnerable to adversarial attacks; but recent work has drawn attention to the issue of validating these adversarial inputs against certain criteria (e.g., the preservation of semantics and grammaticality). Enforcing constraints to uphold such criteria may render a...
Chapter
Among the critical challenges around the COVID-19 pandemic is dealing with the potentially detrimental effects on people’s mental health. Designing appropriate interventions and identifying the concerns of those most at risk requires methods that can extract worries, concerns and emotional responses from text data. We examine gender differences and...
Conference Paper
Among the critical challenges around the COVID-19 pandemic is dealing with the potentially detrimental effects on people’s mental health. Designing appropriate interventions and identifying the concerns of those most at risk requires methods that can extract worries, concerns and emotional responses from text data. We examine gender differences and...
Preprint
This paper introduces the Grievance Dictionary, a psycholinguistic dictionary which can be used to automatically understand language use in the context of grievance-fuelled violence threat assessment. We describe the development of the dictionary, which was informed by suggestions from experienced threat assessment practitioners. These suggestions and...
Preprint
The problem of online threats and abuse could potentially be mitigated with a computational approach, where sources of abuse are better understood or identified through author profiling. However, abusive language constitutes a specific domain of language for which it has not yet been tested whether differences emerge based on a text author's person...
Preprint
Full-text available
Text data are being used as a lens through which human cognition can be studied at a large scale. Methods like emotion analysis are now in the standard toolkit of computational social scientists but typically rely on third-person annotation with unknown validity. As an alternative, this paper introduces online emotion induction techniques from expe...
Preprint
Among the critical challenges around the COVID-19 pandemic is dealing with potentially detrimental effects on people's mental health. Designing appropriate interventions and identifying the concerns of those most at risk requires methods that can extract worries, concerns and emotional responses from text data. We examine gender differences and the...
Preprint
Full-text available
While recent efforts have shown that neural text processing models are vulnerable to adversarial examples, comparatively little attention has been paid to explicitly characterize their effectiveness. To overcome this, we present analytical insights into the word frequency characteristics of word-level adversarial examples for neural text classifica...
Preprint
Full-text available
The current policy of removing drill music videos from social media platforms such as YouTube remains controversial because it risks conflating the co-occurrence of drill rap and violence with a causal chain of the two. Empirically, we revisit the question of whether there is evidence to support the conjecture that drill music and gang violence are...
Preprint
Full-text available
The COVID-19 pandemic is having a dramatic impact on societies and economies around the world. With various measures of lockdowns and social distancing in place, it becomes important to understand emotional responses on a large scale. In this paper, we present the first ground truth dataset of emotional responses to COVID-19. We asked participants...
Preprint
Full-text available
Background: Deception detection is a prevalent problem for security practitioners. With a need for more large-scale approaches, automated methods using machine learning have gained traction. However, detection performance still implies considerable error rates. Findings from other domains suggest that hybrid human-machine integrations could offer a...
Article
Full-text available
Despite considerable concern about how human trafficking offenders may use the Internet to recruit their victims, arrange logistics or advertise services, the Internet-trafficking nexus remains unclear. This study explored the prevalence and correlates of a set of commonly-used indicators of labour trafficking in online job advertisements. Taking a...
Preprint
The Response Time-Based Concealed Information Test (RT-CIT) can reveal when a person recognizes a relevant (probe) item among other, irrelevant items, based on comparatively slower responding to the probe item. Thereby, if a person is concealing the knowledge about the relevance of this item (e.g., recognizing it as a murder weapon), this deception...
Article
Full-text available
The Response Time-Based Concealed Information Test (RT-CIT) can reveal when a person recognizes a relevant ('probe') item among other, irrelevant items, based on comparatively slower responding to the probe item. Thereby, if a person is concealing the knowledge about the relevance of this item (e.g., recognizing it as a murder weapon), this decepti...
Preprint
Full-text available
This paper presents how techniques from natural language processing can be used to examine the sentiment trajectories of gang-related drill music in the United Kingdom (UK). This work is important because key public figures are loosely making controversial linkages between drill music and recent escalations in youth violence in London. Thus, this p...
Preprint
The media frequently describes the 2017 Charlottesville 'Unite the Right' rally as a turning point for the alt-right and white supremacist movements. Related research into social movements also suggests that the media attention and public discourse concerning the rally may have influenced the alt-right. Empirical evidence for these claims is largel...
Article
Full-text available
Purpose: Verbal credibility assessments examine language differences to tell truthful from deceptive statements (e.g., of allegations of child sexual abuse). The dominant approach in psycholegal deception research to date (used in 81% of recent studies that report on accuracy) to estimate the accuracy of a method is to find the optimal statistical...
Research
Full-text available
Social media and tech companies face the challenge of identifying and removing terrorist and extremist content from their platforms. This paper presents the findings of a series of interviews with Global Internet Forum to Counter Terrorism (GIFCT) partner companies and law enforcement Internet Referral Units (IRUs). It offers a unique view on curre...
Conference Paper
Full-text available
News consumption exhibits an increasing shift towards online sources, which bring platforms such as YouTube more into focus. Thus, the distribution of politically loaded news is easier, receives more attention, but also raises the concern of forming isolated ideological communities. Understanding how such news is communicated and received is becom...
Preprint
Full-text available
Purpose: Verbal credibility assessments examine language to discern lie from the truth. These tests are used for the scientific study of the language of lies in US Presidential candidates and fraudulent scientists, but also in criminal proceedings for evaluating allegations of child sexual abuse. The dominant approach in psycholegal deception resea...
Preprint
Full-text available
Several research lines attempted to tell truthful from deceptive texts by looking at the concreteness in language as an indicator of truthfulness. We identified eight different operationalizations of concreteness for computer-automated analysis and validated these operationalizations on six diverse datasets containing truthful and deceptive texts (...
Conference Paper
The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content. In this paper, we focus on the aut...
Article
Full-text available
Pump-and-dump schemes are fraudulent price manipulations through the spread of misinformation and have been around in economic settings since at least the 1700s. With new technologies around cryptocurrency trading, the problem has intensified to a shorter time scale and broader scope. The scientific literature on cryptocurrency pump-and-du...
Preprint
Full-text available
Vlogs provide a rich public source of data in a novel setting. This paper examined the continuous sentiment styles employed in 27,333 vlogs using a dynamic intra-textual approach to sentiment analysis. Using unsupervised clustering, we identified seven distinct continuous sentiment trajectories characterized by fluctuations of sentiment throughout...
Article
Full-text available
Verbal deception detection has gained momentum as a technique to tell truth‐tellers from liars. At the same time, researchers' degrees of freedom make it hard to assess the robustness of effects. Replication research can help evaluate how reproducible an effect is. We present the first replication in verbal deception research whereby ferry passenge...
Article
Full-text available
Recently, verbal credibility assessment has been extended to the detection of deceptive intentions, the use of a model statement, and predictive modeling. The current investigation combines these 3 elements to detect deceptive intentions on a large scale. Participants read a model statement and wrote a truthful or deceptive statement about their p...
Chapter
Full-text available
There is an increasing demand for deception detection at scale. In situations in which larger numbers of people need to be tested, traditional deception-detection methods are limited because they often require extensive testing sessions or are limited in their flexibility to novel contexts. The aim of this chapter is to discuss the potential for la...
Article
When embedded among a number of plausible irrelevant options, the presentation of critical (e.g., crime-related or autobiographical) information is associated with a marked increase in response time (RT). This RT effect crucially depends on the inclusion of a target/non-target discrimination task with targets being a dedicated set of items that req...
Article
Full-text available
Background: Academic research on deception detection has largely focused on the detection of past events. For many applied purposes, however, the detection of false reports about someone’s intention merits attention. Based on the verbal deception detection paradigm, we explored whether true statements on intentions were more detailed and more specif...
Article
There is an increasing demand for automated verbal deception detection systems. We propose named entity recognition (NER; i.e., the automatic identification and extraction of information from text) to model three established theoretical principles: (i) truth tellers provide accounts that are richer in detail, (ii) contain more contextual references...
Article
Full-text available
The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content. In this paper, we focus on the aut...
Preprint
Full-text available
Background: The shift towards open science implies that researchers should share their data. Often there is a dilemma between publicly sharing data and protecting their subjects' confidentiality. Moreover, the case of unstructured text data (e.g. stories) poses an additional dilemma: anonymizing texts without deteriorating their content for second...
Article
Full-text available
The reaction time (RT)-based Concealed Information Test (CIT) allows for the detection of concealed knowledge (e.g., one's true identity) when the questions are presented randomly (multiple-probe protocol), but its performance is much weaker when questions are presented in blocks (e.g., first question about surname, then about birthday; single-prob...
Preprint
Full-text available
There is an increasing demand for automated verbal deception detection systems. We propose named entity recognition (NER; i.e., the automatic identification and extraction of information from text) based on three established theoretical principles: (i) truth-tellers provide accounts that are richer in detail, (ii) contain more contextual references...
Preprint
Full-text available
Background: Academic research on deception detection has largely focused on the detection of past events. For many applied purposes, however, the detection of false reports about someone's intention merits attention. Based on the verbal deception detection paradigm, we explored whether true statements on intentions were more detailed and more speci...
Preprint
The reaction time (RT)-based Concealed Information Test (CIT) allows for the detection of concealed knowledge (e.g., one’s true identity) when the questions are presented randomly (multiple-probe protocol), but its performance is much weaker when questions are presented in blocks (e.g., first question about surname, then about birthday; single-prob...
Article
Full-text available
The Verifiability Approach (VA) is a promising new approach for deception detection. It extends existing verbal credibility assessment tools by asking interviewees to provide statements rich in verifiable detail. Details that i) have been experienced with an identifiable person, ii) have been witnessed by an identifiable person, or iii) have been...
Article
Full-text available
By assessing the association strength with TRUE and FALSE, the autobiographical Implicit Association Test (aIAT) [Sartori, G., Agosta, S., Zogmaister, C., Ferrara, S. D., & Castiello, U. (2008). How to accurately detect autobiographical events. Psychological Science, 19, 772–780. doi:10.1111/j.1467-9280.2008.02156.x] aims to determine which of two...
Article
Full-text available
Do motivated liars lie more successfully? The motivational effort hypothesis predicts that higher motivation effectively diminishes the chance of being detected, whereas the motivational impairment hypothesis predicts that the higher the motivation to go undetected, the greater the chance of being detected. We manipulated motivation in two online r...
Conference Paper
Full-text available
The Verifiability Approach (VA) is a promising new approach for deception detection. It extends existing verbal credibility assessment tools by asking interviewees to provide statements rich in verifiable detail. Details that i) have been experienced with an identifiable person, ii) have been witnessed by an identifiable person, or iii) have been r...
Article
Full-text available
Background: The Internet has already changed people’s lives considerably and is likely to drastically change forensic research. We developed a web-based test to reveal concealed autobiographical information. Initial studies identified a number of conditions that affect diagnostic efficiency. By combining these moderators, the present study investig...
Article
Full-text available
Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown. We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of origin...
Article
Full-text available
There is accumulating evidence that reaction times (RTs) can be used to detect recognition of critical (e.g., crime) information. A limitation of this research base is its reliance upon small samples (average n = 24), and indications of publication bias. To advance RT-based memory detection, we report upon the development of the first web-based mem...
Article
RT-based memory detection may provide an efficient means to assess recognition of concealed information. There is, however, considerable heterogeneity in detection rates, and we explored two potential moderators: Item Saliency and Test Protocol. Participants tried to conceal low salient (e.g., favourite colour) and high salient items (e.g., first n...
