
Abstract and Figures

People’s perceptions about the size of minority groups in social networks can be biased, often showing systematic over- or underestimation. These social perception biases are often attributed to biased cognitive or motivational processes. Here we show that both over- and underestimation of the size of a minority group can emerge solely from structural properties of social networks. Using a generative network model, we show that these biases depend on the level of homophily, its asymmetric nature and on the size of the minority group. Our model predictions correspond well with empirical data from a cross-cultural survey and with numerical calculations from six real-world networks. We also identify circumstances under which individuals can reduce their biases by relying on perceptions of their neighbours. This work advances our understanding of the impact of network structure on social perception biases and offers a quantitative approach for addressing related issues in society.
Individual- and group-level social perception bias. a,b, Individuals belong to one of two groups: the majority (blue) or the minority (orange). The minority fraction is 1/3 (fm ≈ 0.33) in both the homophilic network (a) and the heterophilic network (b). We studied social perception biases originating at both the individual and the group level. At the individual level, individual i perceives the size of the minority group in the overall network based on their personal network, denoted by dashed circles. In the homophilic network, i perceives the size of the minority to be approximately 1/6 ≈ 17%, while in the heterophilic network, i perceives the size of the minority to be approximately 4/6 ≈ 67%. Therefore, in the homophilic network, individual i underestimates the minority-group size by a factor of 0.5, and in the heterophilic network i overestimates the minority-group size by a factor of 2 (see equation (1)). At the group level, the majority group perceives the size of the minority group to be (1/3 + 1/6 + 2/3)/8 = 7/48 ≈ 0.15 in the homophilic network, and (1/2 + 1/3 + 2/3 + 2/3 + 1 + 1)/8 = 25/48 ≈ 0.52 in the heterophilic network.
Thus, the majority group underestimates the size of the minority group by a factor of 0.45 in the homophilic network (group-level perception bias = $\frac{0.15}{f_m} = \frac{0.15}{0.33} = 0.45$) and overestimates the minority-group size by a factor of 1.6 in the heterophilic network ($\frac{0.52}{0.33} = 1.6$; see equation (2)). In sum, depending on the structure of the network, individuals’ and groups’ perceptions about their own and other groups’ sizes can be distorted.
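The arithmetic in this caption can be checked with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the majority group has eight members and that the members not listed in the caption's sums perceive no minority neighbours (which would explain why each sum is divided by 8). Bias is taken to be the perceived fraction divided by the true fraction, as in the ratios above.

```python
from fractions import Fraction


def individual_bias(perceived, true_fraction):
    """Perceived minority fraction divided by the true minority fraction."""
    return perceived / true_fraction


def group_bias(perceptions, true_fraction):
    """Mean of the members' perceived fractions, divided by the true fraction."""
    mean_perception = sum(perceptions, Fraction(0)) / len(perceptions)
    return mean_perception / true_fraction


f_m = Fraction(1, 3)  # true minority fraction in both example networks

# Perceptions of the eight majority members. The non-zero values come from the
# caption; the zeros are an assumption made to fill the group up to eight.
homophilic = [Fraction(1, 3), Fraction(1, 6), Fraction(2, 3)] + [Fraction(0)] * 5
heterophilic = [Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(2, 3),
                Fraction(1), Fraction(1), Fraction(0), Fraction(0)]

print(individual_bias(Fraction(1, 6), f_m))  # individual i, homophilic network
print(individual_bias(Fraction(4, 6), f_m))  # individual i, heterophilic network
print(group_bias(homophilic, f_m), group_bias(heterophilic, f_m))
```

With exact fractions the group-level ratios are 7/16 and 25/16. The caption's 0.45 and 1.6 are obtained from the rounded values 0.15, 0.52 and 0.33, so small differences in the second decimal place are expected.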
Survey results: bias in perception of minority-group size for participants whose personal networks exhibit different levels of homophily (h) and for attributes held by a small, medium or large minority group in a given country. a–f, Each row shows results from a different country: Germany (a,b, n = 99), South Korea (c,d, n = 100) and the United States (e,f, n = 101). Columns show perception biases of the minority (left) and the majority (right) group for each attribute. Different colours distinguish perception biases for attributes that in a given country are held by a small (fm < 0.2; yellow), medium (0.2 ≤ fm < 0.4; purple) or large (0.4 ≤ fm < 0.5; red) minority group. The value of the individual perception bias indicates the accuracy with which each participant (each point in the plot) perceived the size of the minority group in the overall population. Group-level perception biases are calculated by averaging individual participants’ perception biases for each homophily bin (0.02 increments), and they are denoted by fitted lines. A perception bias of 1 suggests perfect accuracy (horizontal line in each panel), values >1 indicate overestimation of the minority-group size and values <1 indicate underestimation of the minority-group size. The insets show fitted trends on a log scale for easier comparison of the sizes of underestimation and overestimation. These trends also approximate the results for the simple difference measure of perception biases. Homophily (h) is estimated from participants’ reports about the frequency of people with each attribute in their personal networks (see Methods).
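The binning step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the data layout and the example values are all assumptions, and neither the fitted lines nor the homophily estimation from the Methods is reproduced.

```python
import math
from collections import defaultdict


def binned_group_bias(homophily, biases, width=0.02):
    """Average individual perception biases within homophily bins of the given width.

    Returns a dict mapping each bin's lower edge to the mean bias in that bin.
    """
    bins = defaultdict(list)
    for h, bias in zip(homophily, biases):
        # The small offset guards against floating-point error at bin edges.
        index = math.floor(h / width + 1e-9)
        bins[index].append(bias)
    return {round(index * width, 10): sum(values) / len(values)
            for index, values in sorted(bins.items())}


def log_bias(bias):
    """Log-scale bias: over- and underestimation by the same factor are symmetric around 0."""
    return math.log(bias)


# Hypothetical participants: (homophily, individual perception bias).
example_h = [0.50, 0.51, 0.53]
example_bias = [1.0, 2.0, 0.5]
print(binned_group_bias(example_h, example_bias))
print(log_bias(2.0), log_bias(0.5))
```

On the log scale, overestimating by a factor of 2 and underestimating by a factor of 0.5 lie at equal distances from zero, which is presumably why the insets use it to compare the two directions of bias.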
1Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. 2Department of Computational Social Science, GESIS,
Cologne, Germany. 3Institute for Web Science and Technologies, University of Koblenz-Landau, Koblenz, Germany. 4Asia Pacific Center for Theoretical
Physics, Pohang, Republic of Korea. 5Department of Physics, Pohang University of Science and Technology, Pohang, Republic of Korea. 6Department of
Computer Science, Aalto University, Espoo, Finland. 7Department for Society, Technology and Human Factors & Department of Computer Science, RWTH
Aachen University, Aachen, Germany. 8Santa Fe Institute, Santa Fe, NM, USA. 9Complexity Science Hub Vienna, Vienna, Austria. 10Harding Center for Risk
Literacy, Max Planck Institute for Human Development, Berlin, Germany. 11These authors contributed equally: Eun Lee, Fariba Karimi.
People’s perceptions of their social worlds determine their own personal aspirations1 and willingness to engage in different behaviours, from voting2 and energy conservation3 to health behaviour4, drinking5 and smoking6. Yet, when forming these perceptions, people seldom have an opportunity to draw representative samples from the overall social network, or from the general population. Instead, their samples are constrained by the local structure of their personal networks, which can bias their perception of the relative frequency of different attributes in the general population. For example, supporters of different candidates in the 2016 US presidential election formed relatively isolated Twitter communities7. Such insular communities can overestimate the relative frequency of their own attributes in the overall society. This has been documented in the literature on overestimation effects including false consensus, looking-glass perception and, more generally, social projection8–12. In an apparent contradiction, it has also been documented that people holding a particular view sometimes underestimate the frequency of that view, as described in the literature on false uniqueness13,14, pluralistic ignorance15,16 and majority illusion17. These over- and underestimation errors, which we call social perception biases, affect people’s judgements of minority- and majority-group sizes18.
It has been observed that social perception biases can be related to the structural properties of personal networks19,20, which can strongly affect the samples of information on which individuals rely when forming their social perceptions21,22. However, the impact of different network properties on social perception biases has not yet been systematically explored. Here we explore three such properties. The first is the level of homophily, or how likely one is to be connected to similar others, which is known as a fundamental structural property of many social networks23. The second property is the asymmetry of homophily, or whether homophily is larger in some subgroups than in others. For example, it has been observed that in scientific collaborations, homophily among women is stronger than homophily among men24. The third property is the relative size of minority and majority groups in the society. Many social networks are characterized by a large majority group and a much smaller minority group. Examples are the proportions of different genders in science, technology, engineering and maths, of people with different levels of income and of people who smoke or not.
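To make these three properties concrete, the sketch below builds a toy two-group network in which they appear as explicit parameters and then measures how the majority perceives the minority. It is a deliberately simplified stand-in and not the generative model used in the paper: the link-formation rule (each node proposes a fixed number of links and accepts same-group candidates with probability h, other-group candidates with probability 1 - h), the function names and the parameter values are all assumptions made for illustration.

```python
import random


def build_network(n, f_m, h_min, h_maj, links_per_node=5, seed=0):
    """Toy two-group network (an illustrative assumption, not the paper's model).

    f_m is the minority fraction; h_min and h_maj are the homophily levels of
    the minority and majority groups (homophily is asymmetric when they differ).
    """
    rng = random.Random(seed)
    n_min = round(n * f_m)
    group = [1] * n_min + [0] * (n - n_min)  # 1 = minority, 0 = majority
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        h = h_min if group[i] == 1 else h_maj
        for _ in range(links_per_node):
            for _attempt in range(1000):  # bounded rejection sampling
                j = rng.randrange(n)
                if j == i:
                    continue
                # Same-group candidates are accepted with probability h,
                # other-group candidates with probability 1 - h.
                accept = h if group[j] == group[i] else 1.0 - h
                if rng.random() < accept:
                    neighbours[i].add(j)
                    neighbours[j].add(i)
                    break
    return group, neighbours


def group_perception_bias(group, neighbours, observers):
    """Mean minority fraction seen by the observer group (0 or 1), over the true fraction."""
    true_fraction = sum(group) / len(group)
    perceived = [
        sum(group[j] for j in nbrs) / len(nbrs)
        for i, nbrs in enumerate(neighbours)
        if group[i] == observers and nbrs
    ]
    return (sum(perceived) / len(perceived)) / true_fraction


for h in (0.9, 0.1):  # homophilic, then heterophilic
    group, neighbours = build_network(n=300, f_m=0.2, h_min=h, h_maj=h, seed=1)
    print(h, round(group_perception_bias(group, neighbours, observers=0), 2))
```

In this toy setting a value below 1 means the majority underestimates the minority and a value above 1 means it overestimates it. Setting `h_min` and `h_maj` to different values gives a simple way to experiment with asymmetric homophily.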
Most existing explanations of social perception biases invoke motivational and cognitive processes rather than social network structure. For example, processes that explain overestimation of the frequency of one’s own attributes (for example, false consensus) include wishful thinking25, easier recall of the reasons for having one’s own view9, rational inference of population frequencies based on one’s own attributes26, feeling good when others share one’s own view27, and justifying one’s undesirable behaviours by overestimating their frequency in society28. However, these processes cannot explain the opposite effect, underestimating the frequency of our own view (for example, false uniqueness). Instead, this opposite bias is typically explained by a different set of cognitive or motivational processes, such as differential attention to one’s own and other groups13 and bolstering perceived self-competence14. Ideally, both overestimation and underestimation biases would be explained by a single mechanism18.
Here we show empirically, analytically and numerically that a simple network model can explain both over- and underestimation in social perceptions, without assuming biased motivational or cognitive processes. Results from a cross-cultural survey show that the level of homophily and size of the minority group influence people’s social perception biases. Analytical results from a generative network model with tunable homophily and minority-group size align well with the empirical findings. Numerical investigations show that
Homophily and minority-group size explain
perception biases in social networks
Eun Lee 1,11*, Fariba Karimi 2,3,11*, Claudia Wagner 2,3, Hang-Hyun Jo 4,5,6, Markus Strohmaier 2,7
and Mirta Galesic 8,9,10
... Information transmission in the context of Information and Communication Technologies is a great opportunity to create a better-informed society, but in practice, these technologies are also promoting phenomena such as the viral spreading of fake news 1-3, echo chambers 4-6, perception biases like false consensus or majority illusions 7 and social polarization 5,6,8. We understand by echo chambers situations in which the transmission of information among individuals belonging to the same opinion group is dominant, while transmission among individuals with different opinions is hindered. ...
... The real social impact of echo chambers and their causal link with misinformation cascades are debated topics [9][10][11], but data-driven and computational approaches confirm that the structural properties of social networks are tied to the emergence of echo chambers [4][5][6]. In particular, the homophily of the network (that is, the tendency of nodes to be connected to other nodes of the same group) seems to be a key ingredient to generate echo chambers 5 and perception biases 7. ...
... Changing the degrees of the source and target nodes does not cause novel behaviors, so they can be disregarded as control parameters (not shown). Importantly, this simplified model does not account for the observed strong biases in the presence of homophily 4,7. ...
We study how information transmission biases arise by the interplay between the structural properties of the network and the dynamics of the information in synthetic scale-free homophilic/heterophilic networks. We provide simple mathematical tools to quantify these biases. Both Simple and Complex Contagion models are insufficient to predict significant biases. In contrast, a Hybrid Contagion model—in which both Simple and Complex Contagion occur—gives rise to three different homophily-dependent biases: emissivity and receptivity biases, and echo chambers. Simulations in an empirical network with high homophily confirm our findings. Our results shed light on the mechanisms that cause inequalities in the visibility of information sources, reduced access to information, and lack of communication among distinct groups.
... Recent research demonstrates that network structure can generate misperceptions in people's expectation about what is "normal" behavior in the population (Lerman et al., 2016; Lee et al., 2019; Stewart et al., 2019). However, at the core of these results lie two crucial assumptions. ...
... agents misperceive the degree distribution, while in Lee et al. (2019) and Stewart et al. (2019) they ignore the underlying sorting in the network. One direct implication of the assumption of naivety is that people do not fully use available network information, for example, about network neighbors to update expectations about population behavior. ...
We investigate how individuals form expectations about population behavior using statistical inference based on observations of their social relations. Misperceptions about others' connectedness and behavior arise from sampling bias stemming from the friendship paradox and uncertainty from small samples. In a game where actions are strategic complements, we characterize the equilibrium and analyze equilibrium behavior. We allow for agent sophistication to account for the sampling bias and demonstrate how sophistication affects the equilibrium. We show how population behavior depends on both sources of misperceptions and illustrate when sampling uncertainty plays a critical role compared to sampling bias.
... In academia specifically, gender homophily has been demonstrated in the editorial process [15] and in co-authorship [16]. Additionally, academics tend to cite individuals nearby in their co-authorship networks, and these networks influence an individual's inferences about the field as a whole [17]. Whether these homophilic trends arise through choices or are imposed by institutions [13,18], the trend toward sameness tends to mitigate efforts to rectify disparities by recruitment, and might contribute to why some fields show increasing disparities despite increasing diversity [19,20]. ...
... The parameters α and β are drawn from different distributions depending on the gender of the agent. In alignment with literature showing homophily in co-author and citation networks [16,17], women are given higher values of α and β compared to men, leading them to know and cite more women authors. After each meeting or citation, this history is updated. ...
In multiple academic disciplines, having a perceived gender of 'woman' is associated with a lower than expected rate of citations. In some fields, that disparity is driven primarily by the citations of men and is increasing over time despite increasing diversification of the profession. It is likely that complex social interactions and individual ideologies shape these disparities. Computational models of select factors that reproduce empirical observations can help us understand some of the minimal driving forces behind these complex phenomena and therefore aid in their mitigation. Here, we present a simple agent-based model of citation practices within academia, in which academics generate citations based on three factors: their estimate of the collaborative network of the field, how they sample that estimate, and how open they are to learning about their field from other academics. We show that increasing homophily (the tendency of people to interact with others more like themselves) in these three domains is sufficient to reproduce observed biases in citation practices. We find that homophily in sampling an estimate of the field influences total citation rates, and openness to learning from new and unfamiliar authors influences the change in those citations over time. We next model a real-world intervention, the citation diversity statement, which has the potential to influence both of these parameters. We determine a parameterization of our model that matches the citation practices of academics who use the citation diversity statement. This parameterization paired with an openness to learning from many new authors can result in citation practices that are equitable and stable over time. Ultimately, our work underscores the importance of homophily in shaping citation practices and provides evidence that specific actions may mitigate biased citation practices in academia.
... Many studies have documented biases in these social perceptions, including both overestimation and underestimation of the size of minority groups. We investigate whether such seemingly contradictory biases, namely false consensus (overestimation of the frequency of one's attributes) and false uniqueness (underestimating the frequency of our view), can be explained merely by the structure of the social networks [36]. In other words, how does our estimate about minorities relate to our social networks? ...
In this chapter, we provide an overview of recent advances in data-driven and theory-informed complex models of social networks and their potential in understanding societal inequalities and marginalization. We focus on inequalities arising from networks and network-based algorithms and how they affect minorities. In particular, we examine how homophily and mixing biases shape large and small social networks, influence perception of minorities, and affect collaboration patterns. We also discuss dynamical processes on and of networks and the formation of norms and health inequalities. Additionally, we argue that network modeling is paramount for unveiling the effect of ranking and social recommendation algorithms on the visibility of minorities. Finally, we highlight the key challenges and future opportunities in this emerging research topic.
... Simulations consist of updating users' opinions and/or network connections (rewiring) based on the opinion of neighbor users in the network. The opinion update scheme depends upon the specific model that may involve (i) the fundamental mechanism of homophily, by which individuals tend to interact and create connections with others sharing similar features [39,40], (ii) social contagion [41,42], i.e. the tendency of individuals to become similar to each other over time; ...
Several studies pointed out that users seek the information they like the most, filter out dissenting information, and join groups of like-minded users around shared narratives. Feed algorithms may burst such a configuration toward polarization, thus influencing how information (and misinformation) spreads online. However, despite the extensive evidence and data about polarized opinion spaces and echo chambers, the interplay between human and algorithmic factors in shaping these phenomena remains unclear. In this work, we propose an opinion dynamic model mimicking human attitudes and algorithmic features. We quantitatively assess the adherence of the model's prediction to empirical data and compare the model performances with other state-of-the-art models. We finally provide a synthetic description of social media platforms regarding the model's parameters space that may be used to fine-tune feed algorithms to eventually smooth extreme polarization.
... For instance, the smallest group (i.e., the minority group) in a network can have a systemic disadvantage of being less connected than larger groups, depending on the group mixing 26. Having a lower number of connections poses several disadvantages to individuals, such as low social capital 27, health issues 28, and perception biases 29. Yet, the mechanisms underlying group dynamics and their relation to degree inequality in social gatherings are still unexplored. ...
Uncovering how inequality emerges from human interaction is imperative for just societies. Here we show that the way social groups interact in face-to-face situations can enable the emergence of disparities in the visibility of social groups. These disparities translate into members of specific social groups having fewer social ties than the average (i.e., degree inequality). We characterize group degree inequality in sensor-based data sets and present a mechanism that explains these disparities as the result of group mixing and group-size imbalance. We investigate how group sizes affect this inequality, thereby uncovering the critical size and mixing conditions in which a critical minority group emerges. If a minority group is larger than this critical size, it can be a well-connected, cohesive group; if it is smaller, minority cohesion widens inequality. Finally, we expose group under-representation in degree rankings due to mixing dynamics and propose a way to reduce such biases. The emergence of inequality in social interactions can depend on a number of factors, among which are the intrinsic attractiveness of individuals, but also group size and the presence of pre-formed social ties. Here, the authors propose “social attractiveness” as a mechanism to account for the emergence of inequality in face-to-face social dynamics and show this reproduces real-world gathering data, predicting the existence of a critical group size for the minority group below which higher cohesion among its members leads to higher inequality.
... Social networks are the infrastructure of our social and professional life. They impact, among others, our cooperation [18], our health [8], and our social perceptions [24]. The structure of modern online social networks is however not only shaped by well-studied social mechanisms (such as homophily or preferential attachment), but it is also affected by people recommender systems, complex algorithms that suggest new connections among social network users. ...
Network-based people recommendation algorithms are widely employed on the Web to suggest new connections in social media or professional platforms. While such recommendations bring people together, the feedback loop between the algorithms and the changes in network structure may exacerbate social biases. These biases include rich-get-richer effects, filter bubbles, and polarization. However, social networks are diverse complex systems and recommendations may affect them differently, depending on their structural properties. In this work, we explore five people recommendation algorithms by systematically applying them over time to different synthetic networks. In particular, we measure to what extent these recommendations change the structure of bi-populated networks and show how these changes affect the minority group. Our systematic experimentation helps to better understand when link recommendation algorithms are beneficial or harmful to minority groups in social networks. In particular, our findings suggest that, while all algorithms tend to close triangles and increase cohesion, all algorithms except Node2Vec are prone to favor and suggest nodes with high in-degree. Furthermore, we found that, especially when both classes are heterophilic, recommendation algorithms can reduce the visibility of minorities.
... There is also a natural tendency for people to connect to individuals similar to them, the so-called homophily (see, e.g. McPherson et al. [2001]), which adds to the potential for a social network to create information bubbles and is amplified even further in modern social media networks (Lee et al. [2019]). The current vaccination debate has brought to the fore the dramatic effects that misperception can have in people's lives [Johnson et al., 2020] and made it clear how important it is to design social networks where participants receive the most unbiased information possible. ...
Majority illusion occurs in a social network when the majority of the network nodes belong to a certain type but each node's neighbours mostly belong to a different type, therefore creating the wrong perception, i.e., the illusion, that the majority type is different from the actual one. From a system engineering point of view, we want to devise algorithms to detect and, crucially, correct this undesirable phenomenon. In this paper we initiate the computational study of majority illusion in social networks, providing complexity results for its occurrence and avoidance. Namely, we show that identifying whether a network can be labelled such that majority illusion is present, as well as the problem of removing an illusion by adding or deleting edges of the network, are NP-complete problems.
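The definition in this abstract can be turned into a small check on a labelled graph. The sketch below is only an illustration under stated assumptions: it handles exactly two node types, requires a strict global majority, treats "mostly" as a strict majority of a node's neighbours, and counts a node without neighbours as not being under the illusion. The function name and the example graph are hypothetical, and nothing here addresses the complexity results of the paper.

```python
from collections import Counter


def has_majority_illusion(labels, edges):
    """Return True if every node sees mostly the globally less common type.

    labels maps node -> type (two types assumed); edges is a list of
    undirected (u, v) pairs.
    """
    counts = Counter(labels.values())
    majority_type, majority_count = counts.most_common(1)[0]
    if 2 * majority_count <= len(labels):
        return False  # no strict global majority
    neighbours = {node: set() for node in labels}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    for nbrs in neighbours.values():
        seen_majority = sum(1 for n in nbrs if labels[n] == majority_type)
        # The node must have neighbours, and fewer than half of them may be
        # of the global majority type.
        if not nbrs or 2 * seen_majority >= len(nbrs):
            return False
    return True


# Hypothetical example: 4 blue nodes (0-3) and 5 red nodes (4-8).
labels = {i: "blue" for i in range(4)}
labels.update({i: "red" for i in range(4, 9)})
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # blue nodes fully connected
pendants = [(0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]        # each red node hangs off one blue node
print(has_majority_illusion(labels, clique + pendants))
print(has_majority_illusion(labels, pendants))
```

In the first graph red is the global majority, yet every red node sees only a blue neighbour and every blue node sees more blue than red neighbours. Dropping the edges among the blue nodes removes the illusion for the blue nodes.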
Inter‐city association patterns can be embodied in many aspects, such as transportation, immigration, and the spread of diseases. Among these aspects, culture, as an important content of human society, is also a manifestation of inter‐city association. The recognition of inter‐city cultural association patterns plays an important role in understanding the spatial distribution pattern of culture. This article defines cultural eigenvectors to represent city cultural characteristics by mining the semantics of place names. On this basis, a cultural semantic similarity network (CSSN) is constructed to recognize inter‐city cultural association patterns. Meanwhile, the related algorithm is designed to discover the cultural spatial structures using China as a case study. Finally, four types of cultural hubs, four typical cultural belts, and 13 cultural circles are identified. This article not only recognizes the cultural importance and associations of Chinese cities, but also provides a reference for other city association studies.
Homophily can put minority groups at a disadvantage by restricting their ability to establish links with a majority group or to access novel information. Here, we show how this phenomenon can influence the ranking of minorities in examples of real-world networks with various levels of heterophily and homophily ranging from sexual contacts, dating contacts, scientific collaborations, and scientific citations. We devise a social network model with tunable homophily and group sizes, and demonstrate how the degree ranking of nodes from the minority group in a network is a function of (i) relative group sizes and (ii) the presence or absence of homophilic behaviour. We provide analytical insights on how the ranking of the minority can be improved to ensure the representativeness of the group and correct for potential biases. Our work presents a foundation for assessing the impact of homophilic and heterophilic behaviour on minorities in social networks.
Theoretical models of critical mass have shown how minority groups can initiate social change dynamics in the emergence of new social conventions. Here, we study an artificial system of social conventions in which human subjects interact to establish a new coordination equilibrium. The findings provide direct empirical demonstration of the existence of a tipping point in the dynamics of changing social conventions. When minority groups reached the critical mass—that is, the critical group size for initiating social change—they were consistently able to overturn the established behavior. The size of the required critical mass is expected to vary based on theoretically identifiable features of a social setting. Our results show that the theoretically predicted dynamics of critical mass do in fact emerge as expected within an empirical system of social coordination.
A longstanding problem in the social, biological, and computational sciences is to determine how groups of distributed individuals can form intelligent collective judgments. Since Galton's discovery of the "wisdom of crowds" [Galton F (1907) Nature 75:450-451], theories of collective intelligence have suggested that the accuracy of group judgments requires individuals to be either independent, with uncorrelated beliefs, or diverse, with negatively correlated beliefs [Page S (2008) The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies]. Previous experimental studies have supported this view by arguing that social influence undermines the wisdom of crowds. These results showed that individuals' estimates became more similar when subjects observed each other's beliefs, thereby reducing diversity without a corresponding increase in group accuracy [Lorenz J, Rauhut H, Schweitzer F, Helbing D (2011) Proc Natl Acad Sci USA 108:9020-9025]. By contrast, we show general network conditions under which social influence improves the accuracy of group estimates, even as individual beliefs become more similar. We present theoretical predictions and experimental results showing that, in decentralized communication networks, group estimates become reliably more accurate as a result of information exchange. We further show that the dynamics of group accuracy change with network structure. In centralized networks, where the influence of central individuals dominates the collective estimation process, group estimates become more likely to increase in error.
Mature B cells coexpress both IgM and IgD B-cell antigen receptor (BCR) classes, which are organized on the cell surface in distinct protein islands. The specific role of the IgD-BCR is still enigmatic, but it is colocalized with several other receptors on the B-cell surface, including the coreceptor CD19. Here, we report that the chemokine receptor CXCR4 is also found in proximity to the IgD-BCR. Furthermore, B cells from IgD-deficient mice show defects in CXCL12-mediated CXCR4 signaling and B-cell migration, whereas B cells from IgM-deficient mice are normal in this respect. CXCR4 activation results in actin cytoskeleton remodeling and PI3K/Akt and Erk signaling in an IgD-BCR-dependent manner. The defects in CXCR4 signaling in IgD-deficient B cells can be overcome by anti-CD19 antibody stimulation that also increases CXCL12-mediated B-cell migration of normal B cells. These results show that the IgD-BCR, CD19, and CXCR4 are not only colocalized at nanometer distances but are also functionally connected, thus providing a unique paradigm of receptor signaling cross talk and function.
Studies of social judgments have demonstrated a number of diverse phenomena that were so far difficult to explain within a single theoretical framework. Prominent examples are false consensus and false uniqueness, as well as self-enhancement and self-depreciation. Here we show that these seemingly complex phenomena can be a product of an interplay between basic cognitive processes and the structure of social and task environments. We propose and test a new process model of social judgment, the social sampling model (SSM), which provides a parsimonious quantitative account of different types of social judgments. In the SSM, judgments about characteristics of broader social environments are based on sampling of social instances from memory, where instances receive activation if they belong to a target reference class and have a particular characteristic. These sampling processes interact with the properties of social and task environments, including homophily, shapes of frequency distributions, and question formats. For example, in line with the model’s predictions we found that whether false consensus or false uniqueness will occur depends on the level of homophily in people’s social circles and on the way questions are asked. The model also explains some previously unaccounted-for patterns of self-enhancement and self-depreciation. People seem to be well informed about many characteristics of their immediate social circles, which in turn influence how they evaluate broader social environments and their position within them.
Scientific collaborations shape novel ideas and new discoveries and help scientists to advance their scientific career through publishing high impact publications and grant proposals. Recent studies however show that gender inequality is still present in many scientific practices ranging from hiring to peer review processes and grant applications. While empirical findings highlight that collaborations impact success and gender inequality is present in science, we know little about gender-specific differences in collaboration patterns, how they change over time and how they impact scientific success. In this paper we close this gap by studying gender differences in dropout rates, productivity and collaboration patterns of more than one million computer scientists over the course of 47 years. We investigate which collaboration patterns are related to scientific success and if these patterns are similar for male and female scientists. Our results highlight that while subtle gender disparities in dropout rates, productivity and collaboration patterns exist, successful male and female scientists reveal the same collaboration patterns: compared with scientists of the same career age, they tend to collaborate with more colleagues than other scientists, establish more long lasting and repetitive collaborations, bring people together that have not been collaborating before and collaborate more with other successful scientists.
Computational social scientists often harness the Web as a "societal observatory" where data about human social behavior is collected. This data enables novel investigations of psychological, anthropological and sociological research questions. However, in the absence of demographic information, such as gender, many relevant research questions cannot be addressed. To tackle this problem, researchers often rely on automated methods to infer gender from name information provided on the web. However, little is known about the accuracy of existing gender-detection methods and how biased they are against certain sub-populations. In this paper, we address this question by systematically comparing several gender detection methods on a random sample of scientists for whom we know their full name, their gender and the country of their workplace. We further suggest a novel method that employs web-based image retrieval and gender recognition in facial images in order to augment name-based approaches. Our findings show that the performance of name-based gender detection approaches can be biased towards countries of origin and such biases can be reduced by combining name-based and image-based gender detection methods.