Conference Paper

Identifying Biases in Politically Biased Wikis through Word Embeddings

Authors: Knoche, Popović, Lemmerich, and Strohmaier

Abstract

With the increase of biased information available online, the importance of analyzing and detecting such content has also risen significantly. In this paper, we aim to quantify different kinds of social biases using word embeddings. Towards this goal, we train such embeddings on two politically biased MediaWiki instances, namely RationalWiki and Conservapedia. Additionally, we include Wikipedia as an online encyclopedia that is broadly accepted by the general public. Utilizing and combining state-of-the-art word embedding models with WEAT and WEFAT, we show to what extent biases exist in the above-mentioned corpora. By comparing the embeddings, we observe interesting differences between the different kinds of wikis.
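The abstract does not specify a toolchain; below is a minimal sketch of how such per-wiki embeddings could be trained, assuming gensim, plain-text dumps of the three wikis (file names are hypothetical), and a simple regex tokenizer.

```python
# Minimal sketch (not the authors' actual pipeline): one word2vec model per
# corpus, so word associations can later be compared across wikis.
import re
from gensim.models import Word2Vec

def iter_sentences(path):
    """Yield lowercased, tokenized sentences from a plain-text wiki dump."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = re.findall(r"[a-z]+", line.lower())
            if tokens:
                yield tokens

corpora = {                               # hypothetical extracted text dumps
    "wikipedia": "wikipedia.txt",
    "rationalwiki": "rationalwiki.txt",
    "conservapedia": "conservapedia.txt",
}

models = {
    name: Word2Vec(sentences=list(iter_sentences(path)),
                   vector_size=300, window=5, min_count=10,
                   sg=1, workers=4)       # skip-gram; settings are illustrative
    for name, path in corpora.items()
}
```

Training one model per corpus keeps the later bias comparisons straightforward, since each model reflects only the language of its own wiki.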


... This impacts generalization performance negatively (Shah et al., 2020) and may have harmful consequences in practical applications (Bender et al., 2021; Joseph and Morgan, 2020). So far, one hurdle to mitigating these problems is the limited reliability of common measures of social bias present in a corpus (Spliethöver and Wachsmuth, 2021), stemming from embedding training algorithms not tailored to low-resource situations (Knoche et al., 2019; Spinde et al., 2021). ...
... WEAT's main idea is to calculate the cumulative distance between groups of word vectors that describe a social group and attributes. Similar measures exist, such as ECT (Dev and Phillips, 2019), RNSB (Sweeney and Najafian, 2019), MAC (Manzini et al., 2019), RIPA (Ethayarajh et al., 2019), WEATVEC (Knoche et al., 2019), the Smoothed First-Order Co-occurrence (Rekabsaz et al., 2021), and SAME (Schröder et al., 2021), but our goal is not to find the best measure. Rather, we seek to learn how measures like WEAT behave for different embedding algorithms. ...
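For readers unfamiliar with the measure discussed in this excerpt, here is a sketch of the WEAT effect size as defined by Caliskan et al. (2017), written against gensim KeyedVectors; the word lists are tiny illustrative stand-ins, not any paper's actual test sets.

```python
# Sketch of the WEAT effect size (Caliskan et al., 2017) over gensim KeyedVectors.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B, kv):
    """s(w, A, B): mean similarity of w to attribute set A minus to set B."""
    return (np.mean([cosine(kv[w], kv[a]) for a in A])
            - np.mean([cosine(kv[w], kv[b]) for b in B]))

def weat_effect_size(X, Y, A, B, kv):
    """Effect size over target sets X and Y (implementations differ on ddof)."""
    s_X = [association(x, A, B, kv) for x in X]
    s_Y = [association(y, A, B, kv) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y)

# Illustrative call, assuming `kv = models["wikipedia"].wv` from the sketch above:
# d = weat_effect_size(["flower", "rose"], ["spider", "wasp"],
#                      ["pleasant", "love"], ["awful", "hate"], kv)
```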
... Closest to our work is the research of Knoche et al. (2019) and Spinde et al. (2021). The former use WEAT to compare social biases present in word embeddings trained on different ideological online wikis. ...
Preprint
Full-text available
News articles both shape and reflect public opinion across the political spectrum. Analyzing them for social bias can thus provide valuable insights, such as prevailing stereotypes in society and the media, which are often adopted by NLP models trained on respective data. Recent work has relied on word embedding bias measures, such as WEAT. However, several representation issues of embeddings can harm the measures' accuracy, including low-resource settings and token frequency differences. In this work, we study what kind of embedding algorithm serves best to accurately measure types of social bias known to exist in US online news articles. To cover the whole spectrum of political bias in the US, we collect 500k articles and review psychology literature with respect to expected social bias. We then quantify social bias using WEAT along with embedding algorithms that account for the aforementioned issues. We compare how models trained with the algorithms on news articles represent the expected social bias. Our results suggest that the standard way to quantify bias does not align well with knowledge from psychology. While the proposed algorithms reduce the gap, they still do not fully match the literature.
... Other authors have previously researched how algorithms can be investigated for journalistic purposes (Diakopoulos, 2015), described how algorithms involved in newswork could be made transparent (Diakopoulos & Koliska, 2017), and provided descriptions of how automation can help reduce bias in reporting (Fischer-Hwang, Grosz, Hu, Karthik, & Yang, 2020). Similarly, some technical works have investigated methods for identifying bias in non-journalistic contexts (e.g., Caliskan, Bryson, & Narayanan, 2017; Knoche, Popović, Lemmerich, & Strohmaier, 2019). In this article, we synthesize how these methods and ideas apply to diagnosing automated journalism itself for bias. ...
... Second, the tendency of word embeddings to internalize biases also presents an opportunity. Previous works (e.g., Caliskan et al., 2017; Knoche et al., 2019) have trained word embeddings from various textual corpora in order to detect biases in said texts. For example, given a word embedding model trained on a newspaper corpus, it is possible to inspect whether keywords indicating either a positive or negative affect are, on average, closer to the word 'white' than to the word 'black.' ...
Article
Full-text available
In this article we consider automated journalism from the perspective of bias in news text. We describe how systems for automated journalism could be biased in terms of both the information content and the lexical choices in the text, and what mechanisms allow human biases to affect automated journalism even if the data the system operates on is considered neutral. Hence, we sketch out three distinct scenarios differentiated by the technical transparency of the systems and the level of cooperation of the system operator, affecting the choice of methods for investigating bias. We identify methods for diagnostics in each of the scenarios and note that one of the scenarios is largely identical to investigating bias in non-automatically produced texts. As a solution to this last scenario, we suggest the construction of a simple news generation system, which could enable a type of analysis-by-proxy. Instead of analyzing the system, to which the access is limited, one would generate an approximation of the system which can be accessed and analyzed freely. If successful, this method could also be applied to analysis of human-written texts. This would make automated journalism not only a target of bias diagnostics, but also a diagnostic device for identifying bias in human-written news.
... Datasets: Extending the experimental design from [8], we apply debiasing simultaneously on the following target sets/subclasses: (male, female) for gender, (islam, christianity, atheism) for religion, and (black and white names) for race, with seven distinct attribute set pairs. We collected target sets, attribute sets, and class definitional sets from the literature [11,16,15,3,10,8]; see our online appendix for a complete list. As in previous studies [7], evaluation was done on three pretrained word embedding models with a vector dimension of 300: FastText (English webcrawl and Wikipedia, 2 million words), GloVe (Common Crawl, Wikipedia and Gigaword, 2.2 million words) and Word2Vec (trained on Google News, 3 million words). ...
Preprint
Bias in Word Embeddings has been a subject of recent interest, along with efforts for its reduction. Current approaches show promising progress towards debiasing single bias dimensions such as gender or race. In this paper, we present a joint multiclass debiasing approach that is capable of debiasing multiple bias dimensions simultaneously. In that direction, we present two approaches, HardWEAT and SoftWEAT, that aim to reduce biases by minimizing the scores of the Word Embeddings Association Test (WEAT). We demonstrate the viability of our methods by debiasing Word Embeddings on three classes of biases (religion, gender and race) in three different publicly available word embeddings and show that our concepts can both reduce or even completely eliminate bias, while maintaining meaningful relationships between vectors in word embeddings. Our work strengthens the foundation for more unbiased neural representations of textual data.
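The HardWEAT/SoftWEAT algorithms themselves are not reproduced here; the sketch below only illustrates the kind of before/after evaluation the abstract describes, reusing the `weat_effect_size` helper sketched earlier. The bias test word lists are placeholders, not the paper's test sets.

```python
# Not HardWEAT/SoftWEAT themselves: just the kind of before/after check the
# abstract describes, reusing weat_effect_size from the earlier sketch.
# The word lists are illustrative placeholders, not the paper's test sets.
BIAS_TESTS = {
    "gender":   (["he", "man"], ["she", "woman"],
                 ["career", "salary"], ["home", "family"]),
    "religion": (["church", "christian"], ["mosque", "muslim"],
                 ["peaceful", "good"], ["violent", "bad"]),
}

def compare_debiasing(kv_original, kv_debiased):
    """Print the WEAT effect size before and after some debiasing transform."""
    for name, (X, Y, A, B) in BIAS_TESTS.items():
        before = weat_effect_size(X, Y, A, B, kv_original)
        after = weat_effect_size(X, Y, A, B, kv_debiased)
        print(f"{name}: effect size {before:.3f} -> {after:.3f}")
```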
... As far as we know, only one study from 2019 used this approach. Though it found some evidence of biases among Wikis (far-left RationalWiki, far-right Conservapedia, and ostensibly neutral Wikipedia), this application of the method is out of date and in need of a reevaluation (Knoche et al., 2019). ...
Article
Full-text available
We scraped user pages of 7,739 Wikipedia editors. Of these, 224 users positioned themselves politically using the semi-standardized "userboxes". Based on this sample, Wikipedia editors' views had a strong tilt towards the left. The results are congruent with the political leanings of related occupations, such as journalists and academics.
... User generated content: "A Characterization of Political Communities on Reddit" (Soliman et al., 2019), "Constructing the Visual Online Political Self: An Analysis of Instagram Use by the Scottish Electorate" (Mahoney et al., 2016), "microblogging information diffusion activity during the 2011 Egyptian political uprisings" (Starbird and Palen, 2012), "characterize users who adversarially interact with political figures on Twitter [--] in the two months leading up to the 2018 midterm elections" (Hua et al., 2020), and "Political Hashtags & the Lost Art of Democratic Discourse" (Le et al., 2017) (Kling, 1987; Starbird and Palen, 2012; Vigil-Hayes et al., 2017; Soliman et al., 2019; Gorkovenko and Taylor, 2019; Hua et al., 2020; Kou and Nardi, 2018; Mahoney et al., 2016; Borge-Holthoefer et al., 2015; Graells-Garrido et al., 2016; Brooker et al., 2015; Semaan et al., 2015b; Seering et al., 2019; Wang and Mark, 2017; Dosono et al., 2017; Matias et al., 2017; Zhang and Counts, 2016; Booten, 2016; Maruyama et al., 2014; Knoche et al., 2019; Dosono and Semaan, 2018; Morgan et al., 2013; Grevet et al., 2014; Zhang and Counts, 2015; Choudhury et al., 2014; Semaan et al., 2015a; Rho and Mazmanian, 2020; Salehi-Abari and Boutilier, 2015; Le et al., 2017; Nelimarkka et al., 2018; Semaan et al., 2014; Al-Ani et al., 2012; Park et al., 2011; Trevisan et al., 2019; Hemphill and Roback, 2014; Hemphill et al., 2013; Agarwal et al., 2020; Li et al., 2018) (Mahoney et al., 2016; Zubiaga et al., 2013; Maruyama et al., 2014; Kriplean et al., 2014; Pierson, 2015; Rho and Mazmanian, 2020; Le et al., 2017; Semaan et al., 2014), "[--] the outcome of a concerted effort to develop responsive and impactful direct democracy platforms. We offer a sociotechnical genealogy of the process, informed by theory of deliberative democracy" (Feltwell et al., 2019), and "Strong representative democracies rely on educated, informed, and active citizenry to provide oversight of the government. ...
Preprint
Full-text available
Human–computer interaction scholars are increasingly touching on topics related to politics or democracy. As these concepts are ambiguous, an examination of the concepts' invoked meanings aids in the self-reflection of our research efforts. We conduct a thematic analysis of all papers with the word 'politics' in the abstract, title or keywords (n=378) and likewise 152 papers with the word 'democracy.' We observe that these words are increasingly being used in human–computer interaction, both in absolute and relative terms. At the same time, we show that researchers invoke these words with diverse levels of analysis in mind: the early research focused on the mezzo level (i.e., small groups), but more recently the work has begun to include macro-level analysis (i.e., society and politics as played out in the public sphere). After the increasing focus on the macro level, we see a transition towards more normative and activist research, which in some areas replaces observational and empirical research. These differences indicate semantic differences, which, in the worst case, may limit scientific progress. We make these differences visible to help further exchanges of ideas and to help the human–computer interaction community explore how it orients itself to politics and democracy.
... This may lead to fewer serendipitous information encounters in the short term, and narrower views, informational blind spots, or radical polarisation in the longer term [4,28]. Awareness of bias in news and media has gained substantial attention on its own [5,19,34] as well as in relation to personalised search and news services [10,12,15,23,32]. ...
Chapter
Full-text available
Personalisation in search has improved performance, focus, and user experience to a great extent; however, it also arguably polarises informational perspectives. This paper seeks to illustrate an experimental methodology to quantify how three situational user variables affect personalisation across two search engines: Google and DuckDuckGo. We find that the presence of cookies and prior search history markedly affect the first page of search results on both platforms, but that prior (shallow) browsing history has no observable effect. We also find that there is very little in common between the results of both search engines. We argue that these results advocate for more consideration of how personalisation fosters filter biases.
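The chapter does not state its exact comparison metric here; as one simple, hypothetical way to quantify how much two first result pages differ, one could compute the Jaccard overlap of their URL sets.

```python
# Hypothetical overlap metric for two first result pages (lists of URLs);
# not necessarily the measure used in the chapter.
def jaccard_overlap(results_a, results_b):
    a, b = set(results_a), set(results_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

# e.g., the same query issued with and without cookies:
# overlap = jaccard_overlap(urls_with_cookies, urls_without_cookies)
```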
... Other highprofile papers such as Garg et al. (2018) and Lewis and Lupyan (2020) have used the WEAT to study cultural biases across time and place. Importantly, the method is now being used to evaluate the political biases of websites (Knoche et al. 2019), detect the purposeful spread of misinformation on social media by state-sponsored actors (Toney et al. 2021), uncover biases present in and proliferated through popular song lyrics (Barman, Awekar, and Kothari 2019), and even to measure how much gender bias US judges display in their judicial opinions (Ash, Chen, and Galletta 2021). ...
Article
The word embedding association test (WEAT) is an important method for measuring linguistic biases against social groups such as ethnic minorities in large text corpora. It does so by comparing the semantic relatedness of words prototypical of the groups (e.g., names unique to those groups) and attribute words (e.g., ‘pleasant’ and ‘unpleasant’ words). We show that anti-Black WEAT estimates from geo-tagged social media data at the level of metropolitan statistical areas strongly correlate with several measures of racial animus—even when controlling for sociodemographic covariates. However, we also show that every one of these correlations is explained by a third variable: the frequency of Black names in the underlying corpora relative to White names. This occurs because word embeddings tend to group positive (negative) words and frequent (rare) words together in the estimated semantic space. As the frequency of Black names on social media is strongly correlated with Black Americans’ prevalence in the population, this results in spuriously high anti-Black WEAT estimates wherever few Black Americans live. This suggests that research using the WEAT to measure bias should consider term frequency, and also demonstrates the potential consequences of using black-box models like word embeddings to study human cognition and behavior.
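A hedged sketch of the frequency check this article motivates: count how often each name list appears in the corpus underlying an embedding, so relative name frequency can be inspected alongside the WEAT estimate. The name lists and corpus are placeholders.

```python
# Sketch of a term-frequency check to accompany a WEAT estimate; the name
# lists below are placeholders, not the study's actual lists.
from collections import Counter

def relative_name_frequency(tokens, black_names, white_names):
    """Share of Black-name tokens among all name tokens in the corpus."""
    counts = Counter(t.lower() for t in tokens)
    black = sum(counts[n] for n in black_names)
    white = sum(counts[n] for n in white_names)
    total = black + white
    return black / total if total else float("nan")

# tokens could come from the same corpus the embedding was trained on, e.g.:
# tokens = [t for sent in iter_sentences("corpus.txt") for t in sent]
```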
... One of the additional lexicons tested, the WEAT lexicon, deserves special consideration since previous works have used this small lexicon (N = 50) when testing for bias in word embedding models [8,15,16]. Although results of projecting WEAT sentiment words onto the cultural axes analyzed roughly agree with the HGI lexicon projection tests, some cultural axes show divergent results. ...
Article
Full-text available
Concerns about gender bias in word embedding models have captured substantial attention in the algorithmic bias research literature. Other bias types however have received lesser amounts of scrutiny. This work describes a large-scale analysis of sentiment associations in popular word embedding models along the lines of gender and ethnicity but also along the less frequently studied dimensions of socioeconomic status, age, physical appearance, sexual orientation, religious sentiment and political leanings. Consistent with previous scholarly literature, this work has found systemic bias against given names popular among African-Americans in most embedding models examined. Gender bias in embedding models however appears to be multifaceted and often reversed in polarity to what has been regularly reported. Interestingly, using the common operationalization of the term bias in the fairness literature, novel types of so far unreported bias types in word embedding models have also been identified. Specifically, the popular embedding models analyzed here display negative biases against middle and working-class socioeconomic status, male children, senior citizens, plain physical appearance and intellectual phenomena such as Islamic religious faith, non-religiosity and conservative political orientation. Reasons for the paradoxical underreporting of these bias types in the relevant literature are probably manifold but widely held blind spots when searching for algorithmic bias and a lack of widespread technical jargon to unambiguously describe a variety of algorithmic associations could conceivably be playing a role. The causal origins for the multiplicity of loaded associations attached to distinct demographic groups within embedding models are often unclear but the heterogeneity of said associations and their potential multifactorial roots raises doubts about the validity of grouping them all under the umbrella term bias. Richer and more fine-grained terminology as well as a more comprehensive exploration of the bias landscape could help the fairness epistemic community to characterize and neutralize algorithmic discrimination more efficiently.
Article
Full-text available
Machine learning is a means to derive artificial intelligence by discovering patterns in existing data. Here we show that applying machine learning to ordinary human language results in human-like semantic biases. We replicate a spectrum of known biases, as measured by the Implicit Association Test, using a widely used, purely statistical machine-learning model trained on a standard corpus of text from the Web. Our results indicate that text corpora contain recoverable and accurate imprints of our historic biases, whether morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names. Our methods hold promise for identifying and addressing sources of bias in culture, including technology.
Article
Full-text available
Machines learn what people know implicitly. AlphaGo has demonstrated that a machine can learn how to do things that people spend many years of concentrated study learning, and it can rapidly learn how to do them better than any human can. Caliskan et al. now show that machines can learn word associations from written texts and that these associations mirror those learned by humans, as measured by the Implicit Association Test (IAT) (see the Perspective by Greenwald). Why does this matter? Because the IAT has predictive value in uncovering the association between concepts, such as pleasantness and flowers or unpleasantness and insects. It can also tease out attitudes and beliefs, for example, associations between female names and family or male names and career. Such biases may not be expressed explicitly, yet they can prove influential in behavior. Science, this issue p. 183; see also p. 133.
Article
Full-text available
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. We define metrics to quantify both direct and indirect gender biases in embeddings, and develop algorithms to "debias" the embedding. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
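As a rough illustration of the geometric idea in this abstract, the sketch below estimates a gender direction from a few definitional pairs and removes ("neutralizes") that component from a word vector. It is a simplification of the full hard-debiasing procedure, which additionally uses PCA over several definitional sets and equalizes word pairs.

```python
# Simplified sketch of the projection idea: estimate a gender direction from a
# few definitional pairs and remove that component from a vector. The full
# method additionally uses PCA over definitional sets and equalizes pairs.
import numpy as np

def gender_direction(kv, pairs=(("she", "he"), ("woman", "man"), ("her", "him"))):
    diffs = [kv[a] - kv[b] for a, b in pairs]
    d = np.mean(diffs, axis=0)
    return d / np.linalg.norm(d)

def neutralize(vec, direction):
    """Remove the component of vec that lies along the (unit) direction."""
    return vec - np.dot(vec, direction) * direction

# e.g., v = neutralize(kv["receptionist"], gender_direction(kv))
```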
Conference Paper
Full-text available
State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available. In this paper, we introduce two new neural architectures: one based on bidirectional LSTMs and conditional random fields, and another that constructs and labels segments using a transition-based approach inspired by shift-reduce parsers. Our models rely on two sources of information about words: character-based word representations learned from the supervised corpus and unsupervised word representations learned from unannotated corpora. Our models obtain state-of-the-art performance in NER in four languages without resorting to any language-specific knowledge or resources such as gazetteers.
Article
Full-text available
Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) has been shown to be very effective for modeling and predicting sequential data, e.g. speech utterances or handwritten documents. In this study, we propose to use BLSTM-RNN for a unified tagging solution that can be applied to various tagging tasks including part-of-speech tagging, chunking and named entity recognition. Instead of exploiting specific features carefully optimized for each task, our solution only uses one set of task-independent features and internal representations learnt from unlabeled text for all tasks. Requiring no task-specific knowledge or sophisticated feature engineering, our approach gets nearly state-of-the-art performance in all three tagging tasks.
Article
Full-text available
Contributing to history has never been as easy as it is today. Anyone with access to the Web is able to play a part on Wikipedia, an open and free encyclopedia. Wikipedia, available in many languages, is one of the most visited websites in the world and arguably one of the primary sources of knowledge on the Web. However, not everyone is contributing to Wikipedia from a diversity point of view; several groups are severely underrepresented. One of those groups is women, who make up approximately 16% of the current contributor community, meaning that most of the content is written by men. In addition, although there are specific guidelines of verifiability, notability, and neutral point of view that must be adhered to by Wikipedia content, these guidelines are supervised and enforced by men. In this paper, we propose that gender bias is not about participation and representation only, but also about characterization of women. We approach the analysis of gender bias by defining a methodology for comparing the characterizations of men and women in biographies. In particular we refer to three dimensions of biographies: meta-data, language usage, and structure of the network built from links between articles. Our results show that, indeed, there are differences in characterization and structure. Some of these differences are reflected from the offline world documented by Wikipedia, but other differences can be attributed to gender bias in Wikipedia content. We contextualize these differences in feminist theory and discuss their implications for Wikipedia policy.
Article
Full-text available
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
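A sketch of the two ideas highlighted in this abstract, using gensim as an assumed implementation: detect frequent phrases with a collocation model, then train a skip-gram model with negative sampling and frequent-word subsampling. The corpus path and hyperparameters are illustrative, and `iter_sentences` is the tokenizer sketched earlier.

```python
# Phrase detection plus skip-gram with negative sampling and subsampling,
# using gensim; corpus path and settings are illustrative.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

sentences = list(iter_sentences("corpus.txt"))
bigram = Phraser(Phrases(sentences, min_count=5, threshold=10.0))
phrased = [bigram[s] for s in sentences]          # e.g. "air", "canada" -> "air_canada"

model = Word2Vec(phrased, vector_size=300, sg=1,  # skip-gram
                 negative=5,                      # negative sampling
                 sample=1e-5,                     # subsample frequent words
                 window=5, min_count=10)
```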
Article
Full-text available
This study empirically examines whether Wikipedia has a neutral point of view. It develops a method for measuring the slant of 28 thousand articles about US politics. In its earliest years, Wikipedia's political entries lean Democrat on average. The slant diminishes during Wikipedia's decade of experience. This change does not arise primarily from revision of existing articles. Most articles arrive with a slant, and most articles change only mildly from their initial slant. The overall slant changes due to the entry of articles with opposite slants, leading toward neutrality for many topics, not necessarily within specific articles.
Article
Full-text available
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.
Conference Paper
Full-text available
A goal of statistical language modeling is to learn the joint probability function of sequences of words. This is intrinsically difficult because of the curse of dimensionality: we propose to fight it with its own weapons. In the proposed approach one learns simultaneously (1) a distributed representation for each word (i.e. a similarity between words) along with (2) the probability function for word sequences, expressed with these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar to words forming an already seen sentence. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach very significantly improves on a state-of-the-art trigram model.
Article
Full-text available
An implicit association test (IAT) measures differential association of 2 target concepts with an attribute. The 2 concepts appear in a 2-choice task (e.g., flower vs. insect names), and the attribute in a 2nd task (e.g., pleasant vs. unpleasant words for an evaluation attribute). When instructions oblige highly associated categories (e.g., flower + pleasant) to share a response key, performance is faster than when less associated categories (e.g., insect + pleasant) share a key. This performance difference implicitly measures differential association of the 2 concepts with the attribute. In 3 experiments, the IAT was sensitive to (a) near-universal evaluative differences (e.g., flower vs. insect), (b) expected individual differences in evaluative associations (Japanese + pleasant vs. Korean + pleasant for Japanese vs. Korean subjects), and (c) consciously disavowed evaluative differences (Black + pleasant vs. White + pleasant for self-described unprejudiced White subjects).
Article
Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora.
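A hedged sketch of the recommendation above: train several models on bootstrap samples of the documents and average a nearest-neighbour statistic over them rather than trusting a single embedding. The corpus format and parameters are assumptions.

```python
# Average a nearest-neighbour similarity over embeddings trained on bootstrap
# samples of the documents (each document here is a list of tokenized sentences).
import random
from gensim.models import Word2Vec

def bootstrap_neighbor_similarity(documents, query, neighbor, n_samples=10, seed=0):
    rng = random.Random(seed)
    sims = []
    for _ in range(n_samples):
        sample = [rng.choice(documents) for _ in range(len(documents))]
        sentences = [sent for doc in sample for sent in doc]
        kv = Word2Vec(sentences, vector_size=100, min_count=5, sg=1).wv
        if query in kv and neighbor in kv:
            sims.append(float(kv.similarity(query, neighbor)))
    return sum(sims) / len(sims) if sims else float("nan")
```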
Article
Significance: Word embeddings are a popular machine-learning method that represents each English word by a vector, such that the geometry between these vectors captures semantic relations between the corresponding words. We demonstrate that word embeddings can be used as a powerful tool to quantify historical trends and social change. As specific applications, we develop metrics based on word embeddings to characterize how gender stereotypes and attitudes toward ethnic minorities in the United States evolved during the 20th and 21st centuries starting from 1910. Our framework opens up a fruitful intersection between machine learning and quantitative social science.
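One concrete metric in the spirit of this work is a relative norm distance between a set of attribute words and the average vectors of two group word lists; the sketch below is an illustrative implementation, with word lists and normalization choices left as assumptions.

```python
# Illustrative relative-norm-distance style metric: negative values indicate
# the attribute words sit closer to group 1's average vector than to group 2's.
import numpy as np

def relative_norm_distance(kv, group1, group2, attributes):
    unit = lambda w: kv[w] / np.linalg.norm(kv[w])
    v1 = np.mean([unit(w) for w in group1], axis=0)
    v2 = np.mean([unit(w) for w in group2], axis=0)
    return float(sum(np.linalg.norm(unit(a) - v1) - np.linalg.norm(unit(a) - v2)
                     for a in attributes))
```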
Chapter
On the surface, cognitive biases appear to be puzzling when viewed through an evolutionary lens. Because they depart from standards of logic and accuracy, they appear to be design flaws instead of examples of good evolutionary engineering. Biases are often ascribed to cognitive "constraints" or flaws in the design of the mind that were somehow not overcome by evolution. To the evolutionary psychologist, however, evolved psychological mechanisms are expected to solve particular problems well, in ways that contributed to fitness ancestrally. Viewed in this way, cognitive biases could have evolved because they positively impacted fitness. They are, then, not necessarily design flaws; instead, they could be design features. This chapter describes research documenting adaptive biases across many domains. These include inferences about danger, the cooperativeness of others, and the sexual and romantic interests of prospective mates. This chapter also addresses the question of why biases often seem to be implemented at the cognitive level, producing genuine misperceptions, rather than merely biases in enacted behavior.
Article
Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Many popular models that learn such representations ignore the morphology of words by assigning a distinct vector to each word. This is a limitation, especially for morphologically rich languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams. A vector representation is associated with each character n-gram, and words are represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly. We evaluate the obtained word representations on five different languages, on word similarity and analogy tasks.
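A sketch of the subword idea using gensim's FastText implementation: because words are built from character n-grams, even out-of-vocabulary words receive a vector. The corpus path and parameters are illustrative, and `iter_sentences` is the tokenizer sketched earlier.

```python
# FastText in gensim: vectors are composed from character n-grams, so even
# unseen words get a representation. Corpus path and settings are illustrative.
from gensim.models import FastText

sentences = list(iter_sentences("corpus.txt"))
ft = FastText(sentences, vector_size=300, sg=1,
              min_n=3, max_n=6,                   # character n-gram lengths
              min_count=10, window=5)

# Out-of-vocabulary words still receive a vector built from their n-grams:
# vec = ft.wv["unseenmorphologicalword"]
```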
Article
Wikipedia is a community-created encyclopedia that contains information about notable people from different countries, epochs and disciplines and aims to document the world's knowledge from a neutral point of view. However, the narrow diversity of the Wikipedia editor community has the potential to introduce systemic biases such as gender biases into the content of Wikipedia. In this paper we aim to tackle a subproblem of this larger challenge by presenting and applying a computational method for assessing gender bias on Wikipedia along multiple dimensions. We find that while women on Wikipedia are covered and featured well in many Wikipedia language editions, the way women are portrayed starkly differs from the way men are portrayed. We hope our work contributes to increasing awareness about gender biases online, and in particular to raising attention to the different levels in which gender biases can manifest themselves on the web.
Conference Paper
Unbiased language is a requirement for reference sources like encyclopedias and scientific texts. Bias is, nonetheless, ubiquitous, making it crucial to understand its nature and linguistic realization and hence detect bias automatically. To this end we analyze real instances of human edits designed to remove bias from Wikipedia articles. The analysis uncovers two classes of bias: framing bias, such as praising or perspective-specific words, which we link to the literature on subjectivity; and epistemological bias, related to whether propositions that are presupposed or entailed in the text are uncontroversially accepted as true. We identify common linguistic cues for these classes, including factive verbs, implicatives, hedges, and subjective intensifiers. These insights help us develop features for a model to solve a new prediction task of practical importance: given a biased sentence, identify the bias-inducing word. Our linguistically-informed model performs almost as well as humans tested on the same task.
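The paper's model and lexicons are not reproduced here; the sketch below only illustrates the kind of linguistic-cue features described, using tiny hypothetical word lists for factives, hedges, and intensifiers.

```python
# Tiny illustrative cue lexicons (not the paper's); flag tokens that match
# factive verbs, hedges, or subjective intensifiers as candidate bias cues.
FACTIVES = {"reveal", "realize", "admit"}
HEDGES = {"apparently", "possibly", "likely"}
INTENSIFIERS = {"outrageous", "brilliant", "fanatical"}

def cue_features(tokens):
    """Return, per token, which cue lists it matches."""
    return [{"token": tok,
             "factive": tok.lower() in FACTIVES,
             "hedge": tok.lower() in HEDGES,
             "intensifier": tok.lower() in INTENSIFIERS}
            for tok in tokens]
```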
Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. arXiv:1603.01360 (2016).
Peilu Wang, Yao Qian, Frank K Soong, Lei He, and Hai Zhao. 2015. A unified tagging solution: Bidirectional lstm recurrent neural network with word embedding. arXiv:1511.00215 (2015).
Martie G. Haselton. The Evolution of Cognitive Bias. John Wiley & Sons, Inc.