All content in this area was uploaded by Jan Skopek on May 09, 2023
Chapter 12: Studying mate choice using digital trace data
from online dating
Preprint version of
Skopek, J. 2023. Studying mate choice using digital trace data from online dating. In:
Skopek, J. (Ed.), Research Handbook on Digital Sociology (pp. 211–238). Cheltenham, UK:
Edward Elgar.
DOI: https://doi.org/10.4337/9781789906769.00020
Jan Skopek
1 Introduction
Online and mobile dating has grown into a significant social-digital technology (see also
Coyle & Alexopoulos in this volume). Survey evidence indicates that finding partners online
is now one of the most important ways of couple formation (Potarca, 2020;
Rosenfeld, Thomas, & Hausen, 2019). From a business perspective, online dating has been one
of the fastest growing digital markets over the past decade, reaching an annual revenue level of
four billion US dollars worldwide and rising (Statista, 2022). During the lockdowns and social
distancing measures imposed under Covid-19 health policies, online and mobile dating apps
became important channels for reconfiguring traditional dating venues into ‘date
from home’ or ‘virtual’ dating venues (Gibson, 2021). Hence, it seems that searching for intimate
relationships is becoming part of people’s routine digital repertoire, like searching for
information or jobs. Trying to keep up with the pace of technological change and its social
consequences, social research has addressed the social profile of online dating users (e.g.,
Anzani, Di Sarno, & Prunas, 2018) or the way online dating itself may be shaping and
changing experience, practice, patterns, or stability of intimate relationships and marriage
(Alexopoulos, Timmermans, & McNallie, 2020; Potarca, 2017).
This chapter takes a somewhat different perspective, namely the study of mating
and dating through online dating as a research laboratory. As socio-technological tools for
partner search, online dating applications generate data of theoretical interest for social
sciences that are interested in studying relationship formation and assortative mating
(Blossfeld, 2009; Lichter & Qian, 2019; Schwartz, 2013). Family research, demography and
social stratification scholarship have a longstanding interest in the trends, causes and
consequences of social homogamy (the observation that people with similar social
characteristics such as education tend to marry) and racial endo- or exogamy (Bavel, 2021;
Schwartz, 2013). However, the bulk of empirical research works with census or survey data on
actual unions or marriages. In other words, by looking at marriages, those studies inspect the
(preliminary) end point of a complex social process rather than the actual process itself. In
contrast, user-generated digital trace data from online dating offer social research a non-reactive
research environment to study the early steps of relationship formation, especially
the contact choices people make and the social interactions that take place at the first stage of romantic
encounter (Kreager, Cavanagh, Yen, & Yu, 2014; Skopek, Schulz, & Blossfeld, 2011).
Employing non-reactive research methods is by no means new in mating research. For
example, the literature on partner- and partnership preferences is full of studies that examined
traditional personal ads in newspapers to learn about mating (e.g., Hirschman, 1987). Social
psychologists have used ‘speed-dating’ experiments to study dynamic aspects of romantic
attraction (Finkel, Eastwick, & Matthews, 2007). As the chapter aims to show, however, large-scale data
from online dating provide a whole new level of possibilities and opportunities for
social research. This ranges from aspects of self-presentation, stated and revealed partner
preferences, to dynamic aspects of two-sided matching. Data generated by social interaction
on online dating services is therefore valuable for studying ‘old’ but also ‘new’ questions about
dating. In particular, digital trace data can be a precious data source for studying the dynamics
at the earliest steps of relationship formation. In sharp contrast to mass social media sites like
Twitter or YouTube and for apparent reasons, user activity on online dating sites is not public;
discretion and privacy are a central concern of platforms and their users (for a discussion of
open and closed access models see Posegga in this volume). Thus, methods of collecting digital
trace data via web scraping or “API-harvesting” access as commonly used to retrieve data from
YouTube and the likes (see Breuer et al. in this volume) are usually inapplicable in the context
of online dating. Hence, accessing digital trace data from online dating is challenging and
typically requires trusting cooperation between provider companies and academic
researchers. Nonetheless, such business-research collaborations have been established in the past,
and we have thus seen a series of studies that draw on online dating to gain insights into dating
behaviour.
The aim of this chapter is to review social research that employed digital trace data
from online dating to study dating and mating, in particular micro-mechanisms of assortative
mating. The overview starts with a short methodological analysis of the nature of digital trace
data in the context of online dating and the associated opportunities and limitations for social
research. Subsequently, I survey research over the past 15 years that employed digital trace
data from online dating. This review is necessarily selective and prioritises research published
in peer-reviewed journals. Additionally, I present some first-hand findings on male-female
interaction patterns in online dating and discuss how they may contribute to the sociological
and demographic study of assortative mating. Finally, the chapter briefly addresses some
hypotheses and empirical evidence regarding the impact online dating may have on union
formation and assortative mating. The chapter concludes with an outlook on what future
research working with data from online dating could look like.
2 Studying dating using digital trace data
2.1 Non-reactive and unobtrusive measurement
Reactivity of research is a widespread concern in social science research methodology,
in particular, when it comes to the study of private and delicate issues such as romantic
preferences and choices, attitudes, and values. In the sensitive and very personal context of
dating and mating, social desirability bias is just one example of the common criticisms
that survey methods face. In such situations, research may leverage non-reactive
methods to obtain unobtrusive measures of the social processes under study (Webb, Campbell,
Schwartz, & Sechrest, 1999).
The use of non-reactive methods such as ‘trace data’ is by no means new in research
on dating and mating. In the pre-digital era of the 1970s, 1980s and 1990s, personal ads in newspapers
were one of the major sources of non-reactive data for examining men’s and women’s
relationship goals and dating preferences (Butler-Smith, Cameron, & Collins, 1998; Harrison
& Saeed, 1977; Hirschman, 1987). Yet, it is fair to say that the advent of online dating as a
social application now used on a mass scale promises entirely new research opportunities
to social scientists concerned with studying mate search based on non-reactive data. Over the past
years, sociological research has begun to recognise the potential of big data and to discuss
associated methodological challenges for research (Cesare, Lee, McCormick, Spiro, &
Zagheni, 2018; Edelmann, Wolff, Montagne, & Bail, 2020; Hampton, 2017; Lazer & Radford,
2017), yet more specific discussions of the promises and pitfalls associated with digital trace
data gathered from online dating sites are hardly available. We will explore some of these in the
following.
By digital trace data from online dating, I refer to data that is being recorded by the
software architecture underlying online dating applications. Sometimes this kind of data is
referred to as ‘log-data’ since it is passively collected. Such data is recorded (typically by web-
based applications) in relational SQL databases as a necessary by-product of the operation of
dating apps; it is usually not recorded for research purposes. That implies that data recording
follows the specific exigencies of application and software engineering rather than social
science ideas or principles of research design (for a deeper discussion on that issue see Posegga
in this volume). At a very basic level, such data contains profile data and activity and
messaging data, the latter typically involving (internal) email- or chat-like messages taking
place between users. User profiles include basic sociodemographic descriptors such as gender,
date of birth, or education level but also imagery (profile pictures) or textual bio descriptions
for the purpose of users’ self-presentation (for a detailed discussion see Coyle & Alexopoulos
in this volume). Commonly, messaging data consist of sender, recipient, time stamps and
message content (also other activities such as profile visits, ‘likes’ etc. may be recorded).
Together, user profile and messaging data allow researchers to reconstruct social interaction
on those dating apps (such as who contacted whom first, who responded to whom, etc.). Those
data represent digital traces people using dating apps have left behind while using those sites
for their romantic pursuit. Therefore, digital trace data from online dating is theoretically
relevant process-generated data; namely, data being generated by the actual process of men
and women seeking potential partners online.
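To make the reconstruction step concrete, the following minimal sketch shows how first contacts and replies might be derived from a time-stamped message log. It is an illustration only: the `Message` record, its field names, and the user IDs are hypothetical and do not reflect the schema of any actual dating platform, and the rule that any later message from the target counts as a reply is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Message:
    """One row of a hypothetical message log (field names are assumptions)."""
    sender: str
    recipient: str
    sent_at: datetime


def reconstruct_dyads(messages):
    """Return, per unordered user pair, who initiated and whether the target replied."""
    dyads = {}
    for msg in sorted(messages, key=lambda m: m.sent_at):
        pair = frozenset((msg.sender, msg.recipient))
        if pair not in dyads:
            # The earliest message within a pair is treated as the first contact.
            dyads[pair] = {
                "initiator": msg.sender,
                "target": msg.recipient,
                "replied": False,
            }
        elif msg.sender == dyads[pair]["target"]:
            # Simplifying assumption: any later message from the target is a reply.
            dyads[pair]["replied"] = True
    return dyads


# Invented toy log with three users.
log = [
    Message("u1", "u2", datetime(2023, 1, 1, 10, 0)),
    Message("u2", "u1", datetime(2023, 1, 1, 12, 30)),
    Message("u3", "u2", datetime(2023, 1, 2, 9, 15)),
    Message("u1", "u2", datetime(2023, 1, 2, 18, 0)),
]
dyads = reconstruct_dyads(log)
for pair, info in dyads.items():
    print(sorted(pair), info)
```

Keying dyads by unordered pairs keeps the initiator and target roles fixed by the first message, so that later messages in either direction do not create a second ‘first contact’.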
Figure 12.1 illustrates building blocks of non-reactive measurement that are common
in online dating research of the sort discussed in this chapter. Nearly all studies reviewed below
adopt at least some of those elements. This basic schema also underpins the development of
more refined statistical frameworks that are rooted in cognitive sciences and decision theory
and that aim to model choices people make in online dating at multiple stages (Bruch, Feinberg,
& Lee, 2016). In an ideal-typical fashion the figure illustrates events as they occur in online
dating services (including swipe-based apps). First, users (represented through their profiles)
screen other users’ profiles. A common feature of online dating services is search filters,
which allow users to restrict the user base to a subset of profiles meeting certain filter criteria
(e.g., show only users in a certain age range and within a radius of 50 kilometres of the user’s
location). In the example, User A ‘views’ (e.g., by screening search results and/or visiting other
user profiles) potential ‘targets’ C, B, and D. We might call these ‘consideration ties’, as C, B,
and D, but not F or E, represent the effective choice set for A. Some studies measure those
consideration ties by the set of profiles an online dater has browsed through (e.g., Hitsch,
Hortaçsu, & Ariely, 2010b; Skopek, Schulz, et al., 2011), other studies construct consideration
ties by any possible dyads between (opposite sex) users within a geographical region (e.g.,
Lewis, 2013, 2016; Lin & Lundquist, 2013). User A decided to approach C and B and sent
initial or first contact messages. The fact that A chose C and B for contacting but not D reveals
information about A’s underlying dating preferences. An initial contact is just the first stage of
a ‘romantic’ tie in online dating, thereby expressing ‘romantic interest’ on the initiator’s side
only. Reply by the target is the second stage indicating (potentially) romantic reciprocity.
While A got no response from C, he or she received a reply from B. Replies mark reciprocal
events, thereby measuring (to some degree) mutual interest between target and initiator. B got
a request from E too, but did not reply, which tells us something about B’s underlying preferences
(based on selective reciprocity conditional on received requests). Furthermore, A is a rather
active user who initiates contacts, whereas B is more passive, waiting to be contacted.
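The example just described can be written down as three sets of directed events, from which the two conditional quantities follow directly: the share of viewed profiles a user contacts, and the share of received first contacts a user answers. The sketch below is only an illustration of that logic; the function names are hypothetical, and E's profile views are not part of the example, so E has no consideration set here.

```python
# Events from the example, encoded as (actor, target) pairs.
views = {("A", "C"), ("A", "B"), ("A", "D")}            # consideration ties
first_contacts = {("A", "C"), ("A", "B"), ("E", "B")}   # initial messages
replies = {("B", "A")}                                  # B answered A


def contact_rate(user):
    """Share of viewed profiles that the user contacted (None if no views recorded)."""
    viewed = {t for (u, t) in views if u == user}
    contacted = {t for (u, t) in first_contacts if u == user and t in viewed}
    return len(contacted) / len(viewed) if viewed else None


def reply_rate(user):
    """Share of received first contacts that the user answered (None if none received)."""
    received = {s for (s, t) in first_contacts if t == user}
    answered = {s for s in received if (user, s) in replies}
    return len(answered) / len(received) if received else None


print(contact_rate("A"))  # A contacted two of the three profiles viewed
print(reply_rate("B"))    # B answered one of two requests
print(reply_rate("C"))    # C left A's request unanswered
```

Conditioning first contacts on views and replies on received contacts is what separates a user's revealed preferences from the opportunities the platform happened to present.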
All those events are software-recorded with time stamps allowing an exact
reconstruction of how events unfolded over time and sequence. Hence, trace data from online
dating is genuinely longitudinal in nature and represents highly reliable and non-reactive
measurements. Limitations clearly arise regarding the subjective meaning of certain
events, which is usually opaque (e.g., was a message reply motivated by romantic reciprocity
or just by the felt obligation to give an answer?) unless the actual message content is analysed
(which is ethically problematic). Therefore, a central feature of trace data from online dating (but also
of digital trace data in general) is that meaning must be inferred from theoretical considerations
that combine both inductive insights (how do people use a particular platform) as well as
deductive knowledge (what are ideal-typical aspects of online dating).
[Figure 12.1: diagram of users (profiles) A to F linked by consideration ties (e.g., visited profile, choice set) and messages]
Figure 12.1. Building blocks of measurement in research with digital trace data from online
dating.
2.2 Opportunities and promises
Securing such digital trace data can be of considerable value for social research
on dating, relationship formation, and assortative mating. First, digital
trace data embody non-reactive and unobtrusive measures on mate seekers’ search and
interaction behaviour. Therefore, the data allows an ex-post reconstruction of social interaction
in a dating context which is not distorted by the presence of a researcher. Furthermore, the
absence of observer effects renders concerns about social desirability bias – which plague
reactive methods like surveys or interviews – unwarranted.
Second, the software-driven data recording ensures highly reliable quasi-observation
without the observer errors or biases that might occur in traditional observation methods.
Moreover, compared to surveys, reporting or recall bias (e.g., interviewed people might not
remember all the events that happened and when they happened) is of no concern, since the
software records all events with timestamps in databases. Thus, digital trace data from online
dating enable high-precision and large-scale measurement of social
interaction in real-life dating contexts at an unprecedented level.
Third, by revealing social interaction in dating, such digital trace data allow researchers to
formulate and empirically address new research questions that would be out of scope with
standard survey data. Perhaps most important from a substantive and theoretical perspective,
online-dating trace data extend the analytical perspective from individuals (the subjects of
surveys) to social interaction and the outcomes of social interaction. Hence, trace data allows
new insights into dynamic aspects of the social fabric and the effective inner workings of dating
markets.
The review of empirical studies further below will portray the kind of research
questions that have been pursued by social scientists using online dating as a research site. At
this point, it suffices to say that online dating data have been used to examine inter-gender
dynamics in the early steps of relationship development, which are opaque in standard studies
of union formation, assortative mating, and marriage. In particular, that involves (1) male and
female mate preferences as revealed by contact choices, (2) the interplay of preferences and
two-sided choice in creating reciprocated online relationships (matches), and (3) mate seeking
strategies in terms of contact behaviour or aspects of self-presentation. As we will see later,
online dating sites have been used as ‘research labs’ to study those micro-mechanisms through
both observational and experimental research designs.
2.3 Limitations and pitfalls
With all those promises, there also come limitations and challenges. Many of those
relate to known issues of big data and their use in social research in general (Cesare et al., 2018;
Lazer & Radford, 2017). I will briefly discuss a few of those issues which are specifically
related to online dating.
First, limitations relate to external validity. To what extent are data from online dating
representative, and of what or whom? One may argue that the representativity of
data determines the inferential power of a study. Clearly, dating website or
app users do not constitute a random probability sample of a discernible and clearly specified
population, which is the usual benchmark for statistical representativity in standard
survey methodology. While trace data may embody a ‘census’ of all users of a particular dating
platform (and their communication acts), those users are obviously self-selected. Therefore,
dating sites are selective research sites, and it cannot be ruled out that users of dating apps and
sites are recruited from a sub-population that holds rather specific relationship ideas and
preferences, which make them use those apps in the first place. Conversely, people with more
traditional attitudes and preferences might be deterred from the modern-day practice of online
dating. What is more, the reality of online dating offers a variety of dating applications for
different ends such as more general conventional dating (focus on long-term and intimate
relationships) versus ‘adult’ dating (explicit focus on short-term sexual contacts) as well as
different apps for different target groups (such as niche sites for elderly, handicapped, obese,
LGBTQ, or ‘dog-owners’). Hence, challenges arise as to ‘whose’ dating behaviour is being
studied and to what extent quasi-observed behaviour on a specific dating site can be generalized
to the population (of adult persons) at large. Concerns as to the general representativity of
evidence gathered in online dating have certainly been alleviated by the immense rise of online
dating which has widely become the standard mode of dating (Potarca, 2020; Rosenfeld et al.,
2019). Nonetheless, evidence from (single) online dating sites must be considered as
‘exemplary’ ethnographic evidence rather than representative evidence on mate search
behaviour. This is by no means only a limitation; indeed, it is a strength at the same time. Were
online dating users to be sampled in statistically representative ways, that is, independently
drawn from each other, all the valuable interactional data between them would be lacking.
Thus, the ethnographic, non-representative aspect of digital trace data from online dating is
countered by the enormous theoretical benefits such data can offer.
A second challenge associated with big data scenarios in general but also with digital
trace data from online dating has been referred to as the ‘ideal-user assumption’ (Lazer &
Radford, 2017). Researchers assume that the users they are analysing are authentic persons
who use those sites in the expected ways (like having honest intentions, presenting themselves
in realistic ways, looking for a partner). However, that assumption may not – indeed does not
– always hold, therefore possibly affecting measurement validity. Even though events such as
user messages are recorded in highly reliable ways, we do not know the exact meaning of those
messages or user’s motivations behind sending a message. Very much like in conventional
observation settings, quasi-observation of online dating behaviour based on digital trace data
cannot easily reveal users’ subjective experiences, motivations, or meaning in relation to their
self-presentation and interactions. Deception such as misrepresentation and ‘strategic’ cheating or
lying in profiles and messaging may be common (Schmitz, Zillmann, Bamberg, &
Blossfeld, 2013); users may have (multiple) ‘fake’ accounts and may engage in ‘lurking’ or
other forms of anti-social behaviour (Duncan & March, 2019; Murphy, 2017; Rege, 2009);
spam bots (Schmitz, Yanenko, & Hebing, 2012) and ‘catfishing’ (Simmons & Lee, 2020) may
be part of the online dating reality. While non-ideal user behaviour can be a relevant research
subject in its own right, researchers employing digital trace data for studying dating must be aware
of it (Lazer & Radford, 2017) and, as far as possible, should employ methods to identify and
limit the influence of artefactual behaviour (e.g., Schmitz et al., 2012).
A third challenge concerns the theory-data fit. Recording of trace data is predominantly
a technical process that follows the technical expediencies of running a dating application.
Most likely, data collection is not informed by social theory and social science standards. What
is not recorded by the software cannot be measured. Therefore, trace data, even though
potentially massive in volume, represent ‘thin descriptions’ of social phenomena (Janetzko,
2008, p. 163). As we will see later, some research has tried to remedy this issue by combining
trace and survey data, the latter ensuring some ‘thickness’ in the data. Moreover, different types
of dating apps generate different types of data as users interact with the specific information
system different platforms provide (see also the discussion by Posegga in this volume). Trace
data generated by a matchmaking site (users are presented with ‘matches’) will differ from data
generated on a standard online dating site (users can actively screen the profile database) or a
‘swipe-based’ mobile dating app (see Coyle & Alexopoulos in this volume). At the same time,
digital trace data is recorded at a level of precision that social science theory usually cannot
match. Frequently, when it comes to non-reactive data gathered from the internet, theories
are not only underspecified but missing entirely (Janetzko, 2008). That requires researchers to
develop and specify theoretical concepts to interpret these data in meaningful ways.
Importantly, researchers will have to familiarize themselves in detail with the functionality,
use, and data structure of the applications. Qualitative methods such as expert interviews with
industry specialists and staff of dating service firms (e.g., software developers), interviews with
site or app users, or ethnographic approaches may be of great value in that regard. Such
research would be best informed not only by sociology but also by computer and information
sciences as well as communication and behavioural sciences (see Yan in this volume).
Finally, there are obvious limitations in scope when it comes to the conceptual breadth
of mating. Using trace data, we can study the first encounter between mate seekers, how they
select and approach each other, and how interactions may arise through initial contacts.
However, these data can hardly tell us anything about whether and how those online
relationships convert into real-life, meaningful romantic relationships. In this regard, research
collaborations with companies running dating sites could be established that would enable
follow-up interviews with couples who met online, which has rarely been achieved so far.
One exception is Lee (2016) who, while not having collected actual digital trace
data, obtained follow-up data on married couples who met through the service of a Korean
dating site to study the effect of online dating on assortative mating. In addition to conceptual
limitations, ethical aspects need to be considered, too, as online dating users might not be aware
that their online behaviour is being analysed. While research use of the data is commonly part
of the terms and conditions of platform use and most research works with de-facto anonymised
data, ethical grey zones might arise when data is extracted from photographs or romantic
messages (as a few studies in the past did), which could identify individuals and reveal very sensitive
and private information. Employing machine learning techniques for automated text and image
analysis (see Schwemmer et al. in this volume) to code imagery and textual material in digital
trace data could be a way to minimize ethical constraints in future online dating research.
3 A survey of online dating research
In this section, I take stock of findings obtained by research using digital trace data
from online dating. To my knowledge, the first studies to systematically employ data from
online dating came from the United States and Korea (Fiore & Donath, 2005; Hitsch, Hortaçsu,
& Ariely, 2010a; Hitsch et al., 2010b). In Europe, the first social scientific studies to use data
from online dating arose from a research project carried out in collaboration with an online
dating company (Schulz, Skopek, & Blossfeld, 2010; Skopek, 2012; Skopek, Schulz, &
Blossfeld, 2009; Skopek, Schulz, et al., 2011). Recent years have seen a growing number of
studies using trace data from online dating.
I limit my survey of studies as follows. First, I focus on research that has been published
in peer-reviewed journals within the social sciences. Hence, I do not cover information or
computer science related literature on online dating, for example, research that aims to develop
matching or recommendation algorithms for dating platforms (e.g., Luo et al., 2022; Xia, Liu,
Sun, & Chen, 2015). Second, in line with the chapter’s goal, I prioritise works that analyse
dynamic data measuring how social interaction evolves between dating site users through
activity and communication (e.g., browsing profiles, messaging). Many studies distinguish
between stages of interaction, most significantly the stages of initial contacting (User A emails
another User B for the first time) and reciprocating (User B responds to User A’s request). Several
other studies use profile data from online dating sites to examine stated preferences. Even
though these are not strictly interactional data, the review includes a few of them, too. Trace
data studies differ in regional coverage (country, sub-country regions) and the length of
observation, and typically follow time and event sampling approaches (i.e., they sample
activity and messaging events of users within a limited observation period). Moreover, I
include a couple of field experiments carried out in online dating, which are noteworthy as
they can provide stricter causal evidence than observational studies. Third, I restrict my survey
to studies on heterosexual dating relationships. Fourth, the review is organised around five
central themes: studies that are interested in general principles of interaction such as (1) general
partner preferences and (2) the emergence of hidden ‘communities’ on digital dating markets,
and studies that examine interactions along the specific socio-structural dimensions of (3)
education, (4) age, and (5) race. Table 12.A1 in the appendix provides an overview of the 25
studies included in the survey. While reviewing those studies, I also present some additional
empirical illustrations of selected topics based on elaborations of digital trace data retrieved
from a German online dating website, which colleagues and I have previously analysed in a
series of publications (e.g., Schulz et al., 2010; Skopek, Schmitz, & Blossfeld, 2011; Skopek
et al., 2009; Skopek, Schulz, et al., 2011).
3.1 What daters seek for in partners – similarity or hierarchy?
Most studies use messaging data from online dating to explore general interaction
behaviour, the degree of similarity or dissimilarity in contacting, and preferences male and
female users ‘reveal’ by their contact choices. Fiore and Donath (2005), one of the first and
most cited studies on online dating, examined dyadic interactions of about 65,000 heterosexual
users of a US dating website and found evidence for substantial homophily in interactions (i.e.,
users sought others with similar features more often than chance would predict) along
demographics, physical appearance, and lifestyle attributes (e.g., drinking and smoking habits,
pets). Interestingly, levels of ‘homophily’ were even larger for reciprocated contacts, even
though reciprocity was generally low (nearly 80% of initial contacts remained unanswered).
For example, about 52% of users’ contacts (56% of reciprocated contacts) were directed to
other users having the same marital status, whereas only 32% were expected under statistical
independence. Those findings are interesting as they showed for the first time that, despite
online dating sites offering an ‘open setting’ of mate search, interactions evolve in a socially
structured fashion, much as in the more ‘closed’ offline settings of mate search.
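The comparison of an observed share of same-category contacts with the share expected under statistical independence can be illustrated with a small calculation. The sketch below is one common way of forming such a benchmark from the marginal distributions of senders and recipients; it is not necessarily the exact procedure used by Fiore and Donath (2005), and the toy data are invented for illustration rather than taken from any study.

```python
from collections import Counter


def homophily_shares(contacts):
    """Return (observed, expected) shares of same-category contacts.

    contacts: list of (sender_category, recipient_category) pairs.
    The expected share assumes sender and recipient categories are independent,
    i.e. it sums the products of the two marginal proportions per category.
    """
    n = len(contacts)
    senders = Counter(s for s, _ in contacts)
    recipients = Counter(r for _, r in contacts)
    observed = sum(1 for s, r in contacts if s == r) / n
    expected = sum((senders[c] / n) * (recipients[c] / n) for c in senders)
    return observed, expected


# Invented toy data: 100 first contacts classified by marital status.
contacts = (
    [("single", "single")] * 40
    + [("single", "divorced")] * 20
    + [("divorced", "single")] * 10
    + [("divorced", "divorced")] * 30
)
observed, expected = homophily_shares(contacts)
print(observed, expected)
```

In this toy example, 70% of contacts stay within the same category although only 50% would be expected if senders chose recipients without regard to marital status; the gap between the two figures is the evidence of homophily.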
In two complementary studies, Hitsch and colleagues (2010a, 2010b) employed trace
data from a US online dating site to (a) reveal male and female preferences in terms of the
desirability of partner attributes and (b) examine whether sorting patterns found in online dating
can account for sorting patterns found in marriages. As the authors argue, in contrast to
previous studies on stated mate preferences (e.g., by means of surveys, see South, 1991), data
from online dating “provides us with a near-ideal market environment that allows us to observe
the participants’ choice sets and their actual mate choices” (Hitsch et al., 2010b, p. 395). In
other words, online dating research can ‘reveal’ the very preferences of men and women that
are effectively shaping early interactions in the process of relationship formation.
A crucial analytical distinction regards the directionality of those preferences: the
operation of ‘vertical’ preferences indicates a hierarchical order of the sort “the more, the better”
in a sought trait (e.g., individuals prefer partners with higher income), whereas the operation of
‘horizontal’ or homophilic preferences indicates a desire for similarity (individuals seek
partners with similar levels of a trait, e.g., similar educational level or smoking habits). The
distinction is of theoretical importance since matching outcomes such as homogamy in
marriages can be explained in principle by both types of preferences. In other words, do we
observe homogamy in couples because ‘like likes like’ or because individuals seek
the most desirable features and partner similarity is just the sheer result of a competitive
matching process?
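A stylised simulation can show why the two types of preferences are hard to tell apart from matched couples alone. The sketch below, which is not taken from any of the reviewed studies, pairs three hypothetical men and three hypothetical women using the Gale-Shapley deferred acceptance procedure, once under purely vertical and once under purely horizontal preferences. All names, trait values, and the assumption of equally sized sides with a single trait are illustrative.

```python
def vertical(own_trait, other_trait):
    """'The more, the better': utility depends only on the partner's trait level."""
    return other_trait


def horizontal(own_trait, other_trait):
    """Similarity preference: utility falls with distance from one's own trait level."""
    return -abs(own_trait - other_trait)


def deferred_acceptance(proposers, receivers, utility):
    """Gale-Shapley matching for two equally sized sides sharing one utility rule.

    proposers, receivers: dicts mapping a hypothetical user name to a trait value.
    Returns a dict mapping each proposer to the matched receiver.
    """
    # Each proposer ranks receivers from most to least preferred.
    prefs = {
        p: sorted(receivers, key=lambda r: utility(proposers[p], receivers[r]), reverse=True)
        for p in proposers
    }
    next_choice = {p: 0 for p in proposers}
    engaged = {}  # receiver -> proposer currently held
    free = list(proposers)
    while free:
        p = free.pop()
        r = prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p
        elif utility(receivers[r], proposers[p]) > utility(receivers[r], proposers[current]):
            engaged[r] = p        # receiver trades up
            free.append(current)
        else:
            free.append(p)        # proposal rejected, try the next choice later
    return {p: r for r, p in engaged.items()}


men = {"m1": 1, "m2": 2, "m3": 3}      # invented trait levels
women = {"w1": 1, "w2": 2, "w3": 3}

vertical_match = deferred_acceptance(men, women, vertical)
horizontal_match = deferred_acceptance(men, women, horizontal)
print(vertical_match)
print(horizontal_match)
```

Both runs end with each man paired with the woman at his own trait level. Under horizontal preferences this happens because everyone seeks a similar partner; under vertical preferences it happens because everyone competes for the top-ranked partner and is sorted by rank. Only data on the contact choices that precede matching can distinguish the two mechanisms.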
In a multi-study setup, Taylor et al. (2011) employed digital trace data from online
dating to revisit the ‘matching’ preference hypothesis, namely the sociopsychological idea that
mate seekers opt for others “in their league” (having similar mate value) as similarity would
maximise the prospect of reciprocity. Based on physical attractiveness (measures obtained
through ratings of profile pictures), the study found that online daters, in general and
irrespective of gender, contacted targets who were physically more attractive than themselves,
a result at odds with matching and instead in line with vertical preferences. However,
matching by appearance emerged once conditioning on reciprocated contacts, suggesting that
similarity in looks, whilst not always sought, does result from the process of reciprocity.
Employing data on profile browsing and first-contact behaviour from a dating site in
the US, Hitsch et al. (2010b) used discrete choice models to estimate how vertical (absolute)
and horizontal (relative) mate preferences shape first-contact behaviour of men and women in
online dating along a whole range of socio-economic and demographic (e.g., education, age,
income, occupation, race), physical (e.g., looks rating, body-mass-index, height), cultural (e.g.,
political views, religion), or lifestyle (e.g., smoking, drinking) attributes. The choice set was
reconstructed from the profiles users browsed through. Actual choices (first contact
messages) were evaluated against the backdrop of those choice sets (the probability of sending an
initial contact to a target conditional on having ‘seen’ the latter’s profile). Hitsch and
colleagues’ findings indicate that both vertical and horizontal preferences shape initial
contacting behaviour. For example, they found verticality to be dominant in relation to income
and physical attractiveness, but horizontality to be dominant for age, education, and race
(similarity preferences were especially strong for women) as well as cultural and lifestyle traits.
Gender differences in preferences were evident too. Women showed stronger same-education
and same-race preferences than men. Furthermore, compared to men women revealed a
stronger preference for partner’s income relative to aspects of physique. A field experiment
conducted on a Chinese dating site corroborates gender differences in preferences for partner
income (Ong & Wang, 2015): while male dating subjects of all income levels visited (fictitious)
female profiles indiscriminately regarding income, female dating subjects visited male profiles
with higher incomes at higher rates.
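The logic of evaluating choices against a browsed choice set can be illustrated with raw conditional contact rates. The following is a minimal sketch on made-up toy data; it is not the discrete choice model estimated by Hitsch et al. (2010b), and the function name `conditional_contact_rates`, the user identifiers, and the education codes are assumptions introduced for illustration only.

```python
from collections import defaultdict

def conditional_contact_rates(browsed, contacted, relation):
    """Share of browsed profiles that received a first contact, by relation label.

    browsed:   iterable of (user, target) pairs, one per viewed profile
    contacted: set of (user, target) pairs with a first-contact message
    relation:  function (user, target) -> label, e.g. 'same' or 'different'
    """
    seen = defaultdict(int)
    sent = defaultdict(int)
    for pair in browsed:
        label = relation(*pair)
        seen[label] += 1
        if pair in contacted:
            sent[label] += 1
    # Contact probability conditional on having seen the profile.
    return {label: sent[label] / seen[label] for label in seen}

# Hypothetical toy data: one user browsing four targets.
education = {"u1": 3, "t1": 3, "t2": 1, "t3": 3, "t4": 2}

def same_education(user, target):
    return "same" if education[user] == education[target] else "different"

browsed = [("u1", "t1"), ("u1", "t2"), ("u1", "t3"), ("u1", "t4")]
contacted = {("u1", "t1"), ("u1", "t3")}
rates = conditional_contact_rates(browsed, contacted, same_education)
```

A full discrete choice model would additionally estimate the separate contribution of each attribute while holding the others constant, which simple rates of this kind cannot do.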
As one of the first studies in Europe, Skopek, Schulz, et al. (2011) employed data
gathered through a collaboration with a popular German online dating site to explore
homophilic preferences along education, age, and physical attractiveness as revealed by initial
contacting behaviour (conditional on browsed profiles as consideration ties) as well as response
behaviour (conditional on being contacted) for male and female online daters. In line with
Hitsch et al. (2010b), findings obtained from discrete choice models indicated horizontal
preferences to be dominant for education and age (stronger for women) whereas vertical
preferences to be dominant with respect to physical attractiveness.
For illustration, Table 12.1 shows some statistics on similarity and dissimilarity
matching along education, age, and BMI (body-mass-index) class based on a reanalysis of the
dataset from Skopek, Schulz, et al. (2011). The table displays the average fraction of attribute
constellations in initial contacts of initiative users (users who are sending out contacts)
separating between male and female initiators and between initial contact ties and reciprocal
contact ties (target replied). Observed fractions are compared with expected fractions (based
on the attribute distribution in the opposite sex) and a ratio is calculated to assess the amount
of ‘bias’ in contacting (i.e., the degree of non-randomness). Examining education and age, we
note that similarity contacting (I = T) occurred at higher rates than expected. For example,
females contacted similar aged males approximately 2.8 times more often than expected under
a scenario of random contacting. Males strongly under-selected older females for contacts (just
20% of the structural expectation) and so did females with respect to younger males.
Educationally similar partners were generally over-selected as well, especially by females who
structurally avoided contacting lower educated males. Selection bias partly changes when
conditioning on reciprocated contacts. For example, males’ contact similarity in age is higher
in reciprocated ties as many of the younger female targets did not respond. While not being a
good indicator for physical attractiveness, BMI does matter as well. For example, males over-
selected ‘thinner’ and under-selected ‘bigger’ targets even though the latter were more likely
to reply. Females under-selected ‘thinner’ male targets and if contacted those were also less
likely to give a reply as we can see from the even larger under-selection bias in reciprocal
contacts.
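The comparison of observed and expected fractions described above can be sketched as follows. This is a simplified illustration on made-up toy data, not the code behind Table 12.1: the function name `selection_ratio` and the category labels are assumptions, and contacts are pooled here, whereas the table averages user-level fractions.

```python
from collections import Counter

def selection_ratio(contacts, target_pool):
    """Observed share, expected share, and ratio for similarity contacts (I = T).

    contacts:    list of (initiator_category, target_category) pairs
    target_pool: list with the category of every potential target (opposite sex)
    """
    pool_counts = Counter(target_pool)
    pool_size = len(target_pool)
    # Observed: share of contacts in which initiator and target are similar.
    observed = sum(i == t for i, t in contacts) / len(contacts)
    # Expected under random contacting: for each contact, the chance that a
    # randomly drawn target falls into the initiator's own category.
    expected = sum(pool_counts[i] / pool_size for i, _ in contacts) / len(contacts)
    return observed, expected, observed / expected

# Hypothetical toy data with three education levels.
pool = ["low"] * 5 + ["mid"] * 3 + ["high"] * 2
contacts = [("high", "high"), ("high", "high"), ("high", "mid"), ("mid", "mid")]
obs, exp, ratio = selection_ratio(contacts, pool)
```

A ratio above 1 indicates over-selection of similar targets relative to random contacting, and a ratio below 1 indicates under-selection.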
< INSERT TABLE 12.1 IN LANDSCAPE FORMAT PAGE HERE
(SEE END OF FILE!) >
Lewis (2016) employed a regional sub-sample (users in New York City) from a larger
online dating dataset gathered in the US to examine horizontal and vertical partner preferences
as well as gender differences therein. In contrast to the previous studies (Hitsch et al., 2010b;
Skopek, Schulz, et al., 2011), the study’s methodological approach to ‘reveal’ male and female
users’ preferences built on exponential random graph models from social network analysis
which, roughly speaking, estimate the probability of ties between any two (opposite sex) users
(potential ties therefore are the consideration or choice set). These models are suitable to
identify ‘tie generating’ mechanisms while accounting for the structural opportunities arising
from attribute distributions among site users. Findings on race, income, education, and religion
showed evidence for similarity preferences but also status competition (i.e., vertical
preferences) both of which may even be at work simultaneously. Furthermore, the findings
indicate that partner preferences are a product of interactional dynamics since users were
generally more open to deviate from horizontal or vertical patterns in the reply stage.
In line with findings from observational data, a more recent study by Egebark et al. (2021)
adds experimental evidence for the verticality of preferences in terms of physical
attractiveness. A randomized field experiment conducted on a Dutch online dating site
demonstrated that both men and women preferred to respond to physically more attractive
senders irrespective of their own attractiveness. Surprisingly, patterns were quantitatively
identical across genders, suggesting attractiveness preferences to be universal. Such a lack of gender
differences in preferences for partners’ physique is an indication that stereotypical findings
from the past of ‘men seeking beauty’ in exchange for ‘women seeking status’ (Allen, 1976;
Udry, 1977) may no longer be valid in today’s gender-equal societies.
The studies reviewed so far examined preferences and matching along certain socio-
demographic or socio-cultural attributes as separate domains. Men and women, however, may
evaluate potential partners as ‘packages’ the desirability of which might be determined by
multiple attributes to different weights. In that regard, theories of social exchange conceive
courtship and mating as a system of quid-pro-quo in which men and women exchange mutually
rewarding resources, yet the perceived value of resources such as physical attractivity or
socioeconomic status may essentially depend on gender (Edwards, 1968; Rosenfeld, 2005).
Hence, attribute-based studies may underestimate the actual degree to which vertical
preferences operate if mate value or partner desirability on the market is a gender-specific
construct.
The study by Kreager et al. (2014) accounts for this possibility by examining daters’
interactions along the concept of ‘social desirability’ which they operationalise based on daters’
ratings of opposite-sex profiles. Indeed, their findings suggest that both male and female daters
revealed strong vertical preferences in terms of (gender-specific) social desirability scores.
Second, like Taylor et al. (2011), they found that homophily in terms of social desirability is
not an individual preference but rather a social process that filters out desirability mismatches
over reciprocal and repeated social interaction (message exchange). Third, they found evidence
for an ‘initiator’ advantage: Daters who initiated contacts benefited from interactions with more
desirable partners compared to those who waited for being approached. Mechanisms
explaining such advantages of high-aiming initiators could be either asymmetric anchoring
effects (initiators aim deliberately high in terms of desirability, passive receivers’ choice sets
are consequently restricted to generally lower desirability) or just ‘luck’ as repeatedly aiming
high may occasionally “pay off” (Kreager et al., 2014, p. 406). However, any initiator
advantages are substantially differently distributed across genders as it is mostly men who
make the first step. Hence, as the authors conclude, “women often forego the promise of online
dating and are left wondering where all the good men have gone” (Kreager et al., 2014, p. 406).
Bruch and Newman (2018) studied online dating interactions in four large metropolitan
areas in the United States. In contrast to previous studies that defined users’ desirability
by either ratings or popularity, they calculated users’ desirability based on the PageRank algorithm,
by which a user’s desirability is measured as a function not only of incoming requests but also of
the desirability values of those users who are sending those requests.
3
Simply put, the
desirability of a woman (man) on the dating market is defined by the desirability of the men
(women) who approach her (him). Based on that measure, Bruch and Newman (2018) provide
some fascinating insights into mate seeking strategies. When initiating contacts, most daters
aim for slightly more desirable partners: on average they pursued targets who were about 25%
more desirable than themselves. However, they had a substantially lower probability of getting
replies from more desirable targets. The authors also counted the number of words in a message
to have a measure of the ‘quality’ of a particular approach. Consistently, mate seekers invested
in writing longer messages when targeting more desirable partners. Taken together the study
suggests that mate seekers have a good sense of their own market value, in other words they
know their “league” (Bruch & Newman, 2018, p. 4), and use that as an aspirational benchmark
for initiating and entering online interactions.
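The idea that desirability depends on both the number of incoming requests and the desirability of their senders can be sketched with a generic PageRank power iteration. This is a minimal illustration on made-up toy data and not Bruch and Newman's implementation: the damping factor of 0.85, the number of iterations, the treatment of users who send no messages, and the user identifiers are all assumptions.

```python
def pagerank(messages, damping=0.85, iterations=100):
    """Generic PageRank on a directed message network.

    messages: list of (sender, receiver) first-contact pairs
    Returns a dict mapping each user to a score; scores sum to 1.
    """
    users = sorted({u for pair in messages for u in pair})
    out_links = {u: [] for u in users}
    for sender, receiver in messages:
        out_links[sender].append(receiver)
    n = len(users)
    score = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = out_links[u]
            if targets:
                # A sender passes its own score on to the users it contacts.
                share = damping * score[u] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Assumption: users who send nothing spread their score evenly.
                share = damping * score[u] / n
                for v in users:
                    new[v] += share
        score = new
    return score

# Hypothetical toy network: three men (m1-m3) and two women (w1, w2).
messages = [("m1", "w1"), ("m2", "w1"), ("m3", "w2"), ("w1", "m1")]
desirability = pagerank(messages)
```

In this toy network w1 scores higher than w2 because she receives more requests, one of which comes from m1, who is himself approached by w1.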
General online dating sites are quite simple catalogue-like applications. They provide
users with a search function to screen the database of profiles and send messages. A common
alternative to catalogue-sites are matching sites offering matchmaking services. On those sites
users fill out lengthy questionnaires about their social identity, habits, and relationship- and
partner preferences based on which a ‘secretive’ and ‘wondrous’ platform algorithm generates
‘matches’, that is, suggested profiles for contact. While revealing partner preferences from
messaging behaviour on those sites is somewhat problematic due to artifacts created by the
algorithm, data from the matching questionnaire can be an interesting source for research.
Through research collaboration, Dinh et al. (2021) collected data from a large online
matchmaking site (in the UK) over nearly 12 years, making it a remarkably long-term
study using data from online dating. Among the various findings, the study
demonstrated that a partner’s income and education are more important for women than for men.
However, over the observed time window (2007-2018) the importance of income and education as
criteria for partner selection declined substantially for both genders. Likewise, the study
by Potarca & Mills (2015) used preference data from an international matchmaking site to
study cross-national variation in men’s and women’s racial dating preferences.
3.2 The social structure of online dating – do ‘hidden’ communities emerge in online
dating?
A few studies employed a social network analysis approach to digital trace data from
online dating to uncover features of the social structure created in online dating. Rather than
studying preferences by examining contact choices directly, Felmlee and Kreager (2017) used
clustering methods from social network analysis to identify ‘invisible communities’ in an
online dating market which are created as a meso-level outcome of the communication
structure. Importantly, such communities might be entirely opaque for the involved individuals
but nonetheless shape individuals’ contact opportunities. Based on a restricted trace data set
from a metropolitan online dating market in the US, the study found indeed evidence for the
existence of dating clusters created through reciprocated message ties. Based on the principle
of homophily, physical attractiveness and age were found to be the strongest factors in
clustering that network. In other words, more attractive users were in reciprocated ties with
other more attractive users and likewise for age. In comparison, race and education were
weaker correlates of cluster membership.
In a similar fashion, Bruch and Newman (2019) employed community detection
algorithms from social network analysis to explore the structure of heterosexual dating
markets. The study authors examined millions of (reciprocal) message exchanges among users
in four large metropolitan areas in the US. Modelling four ‘submarket’ communities within each
city, the study found that about 75% of all reciprocal ties fall into the same submarket (and just
25% between submarkets). Age was the most stratifying characteristic of geographically
localised submarkets. Interestingly, while in the younger age submarkets men outnumbered
women, women outnumbered men in the older age submarkets. That points to an asymmetry
in male and female preferences for partner age, an aspect we will return to further below.
Race interacted with age in generating submarkets: minority women (especially black women)
were substantially younger than white women in older age submarkets. Finally, the study
showed that submarkets were constituted by first contacting patterns but reinforced through
reciprocation.
Taken together, those community detection studies demonstrate the value of digital
trace data from online dating markets to explore the ‘hidden’ social structure of dating as it
emerges from male-female interaction.
3.3 Gender and education
A couple of studies employed digital trace data from online dating to better understand
well-documented patterns and trends in social, and especially educational, homogamy in marriage
outcomes (Blossfeld, 2009; Schwartz, 2013). Those studies set out to tap into a theoretically
important but empirically hard to capture aspect of social stratification of marriage. Are
patterns of educational homogamy, so frequently found in research on assortative mating, the
sheer result of structural partner market opportunities, that is, meeting opportunities between
men and women? Or are those patterns the result of deliberate choices men and women make
about partners? Studying trace data from online dating can shed new light on those questions
since micro-structural opportunities on the dating site as a local ‘digital partner market’ can be
perfectly controlled.
In the first European study using online dating trace data, Skopek et al. (2009) studied
education-specific mechanisms of online mate seeking. They examined trace data covering
more than 116,000 first-contact messages sent by more than 12,000 users over a six-month
period in 2007 on a major German online dating site. They found that, indeed, educational similarity in initial
contacts was substantially higher than under random conditions. Hence, first contacting was
driven by horizontal preferences (homophilic preferences for education), and that held true
especially for women. In line with hypotheses derived from social exchange theory, horizontal
education preferences were stronger for (initiating) daters at higher levels of educational attainment.
Gender-specific mechanisms guided deviations from similarity patterns: Women strongly avoided
contacting ‘down’ in terms of education whereas this was not or less true for men (see also
Table 12.1).
Follow-up studies found that similarity in education (but also age and physical
appearance) promoted both initial contacts and reciprocity in form of getting replies from
targets (Schulz et al., 2010; Skopek, Schulz, et al., 2011). The study by Skopek, Schulz, et al.
(2011) employed conditional probability models that estimated (1) the probability of initial
contacts (conditional on the choice set of browsed target profiles) and (2) the probability of
replies (conditional on the choice set of received initial requests) while accounting for the
effects of user specific idiosyncrasies in (repeated) messaging behaviour. The results
demonstrated that, irrespective of gender, educational similarity (between initiator and target)
favours first contacts over educational dissimilarity. This was primarily explained by daters
avoiding contacting ‘downwards’ in terms of education, while there was no penalty for
‘upward’ contacting over ‘horizontal’ contacting. In other words, the own educational
background shapes daters’ aspiration level in screening potential romantic partners as both
male and female daters preferred targets who had at least the same level of education.
Educational similarity (over dissimilarity) in turn favoured replies to initial contacts, thus,
promoted reciprocity in online dating. Once again, this was predominantly due to first-contacted
daters’ avoidance of replying ‘downward’. Quite surprising from the view of
traditional gender role theories was the finding that men were most likely to reply to female
initiators who were better educated than themselves (even though those constellations
happened relatively rarely). In a second set of analyses, Skopek, Schulz, et al. (2011) modelled
the probability of educational similarity in contacts (initial contacts and the subset of
reciprocated ones) as a function of education and gender, controlling for the opportunity
structure on the platform (determining the random chance to have same-education contact) and
age. Results revealed that same-education contacts (initial as well as reciprocated) were more
likely to occur the higher the education level of the user, and that was true especially for
women. Thus, daters (especially women) with higher educational resources choose to
associate more frequently with others at the same ‘level’.
Similarity in education, therefore, not only encourages initial contacts between users,
but also supports the formation of further interaction. Once again, this evidence supports the
idea of homophily being a social process: the logic of two-sided choice reinforces patterns of
similarity in early steps of relationship formation. Importantly, these principles held
symmetrically for both men and women.
While not focusing exclusively on education, the study by Lewis (2016), who studied
messages between US-American online dating users in New York, found evidence for
horizontal and vertical preferences being simultaneously at work but in a very gendered way:
while men at all education levels prefer women who have a similar education level, highly
educated men receive more requests from all women (and particularly women with higher
education); in contrast, higher educated women were sought after only by higher educated men
but not by lower educated men. Interestingly, the results indicated an inverted u-shape effect
of women’s education on their market desirability: men (irrespective of their education)
targeted women in the ‘middle’ (bachelor’s degree) while avoiding women at the top (master’s
or PhD) and bottom (just high school or two-year college). Those findings from New York are
only partly consistent with the findings by Skopek, Schulz, et al. (2011) on Germany, who did
not detect a general penalty for women at the higher end of education. This discrepancy in
the findings might be the result of methodological differences or actual differences in the
respective populations of daters.
Egebark et al. (2021) identified an interesting gender asymmetry with respect to
education based on an experimental study carried out in online dating in the Netherlands. While
higher educated women preferred to respond to higher (rather than lower) educated male senders,
the reverse held true for higher educated men who were more likely to reply to lower rather
than higher educated female senders. The evidence speaks for the presence of structural
disadvantages higher educated women might face on the dating market.
To sum up, findings on educational homophily in online dating are of extraordinary
theoretical significance for social stratification research on assortative mating and social
homogamy which is usually unable to gauge the role of micro-level processes (such as partner
preferences and aspects of choice) for observed patterns of union and marriage formation
(Blossfeld, 2009; Schwartz, 2013). In essence those findings on educational homophily in
online dating suggest that educational stratification operates from the very early steps of
relationship formation: ‘who contacts whom’ and ‘who replies to whom’ is, hence, not random
but socially structured from the outset. Furthermore, these results from online dating suggest
that it is unlikely that the new digital partner markets will tear down traditional social barriers in mating;
rather, traditional social barriers are erected on dating sites by the very mate
seeking behaviour of individuals.
3.4 Gender, age and age preferences
Age is one of the most apparent factors structuring mating yet at the same time
paradoxically one of the most disregarded factors in sociological research on mating. One
explanation for that might be the historically rather stable average age difference in couples
(husbands are on average 3 years older). A few demographers have questioned if this observed
age difference in couples is representative of men’s and women’s actual partner age
preferences (England & McClintock, 2009). Furthermore, theories on the role of age and
relative partner age for male and female partner choices are by no means unanimous in their
predictions. To simplify matters here, one could see ‘age’ as a cultural aspect of everyday life,
so that matching on age, that is, age similarity, is rewarding in forming and sustaining
intimate relationships. In addition, social norms may prescribe age similarity and, thus, larger
age discrepancies may be seen as deviant (Bytheway, 1981). On the other hand, age might be
a personal resource linked to social desirability and attraction, thereby defining the (relative)
partner value individuals enjoy on the market. In that regard, cultural theories relating to beauty
standards, which discuss a ‘gendered double standard of aging’ (Sontag, 1979), as well as evolutionary
theories pointing to gender-specific reproductive strategies in mate choice (Buss, 1989), would
conclude that age ‘works’ differently for genders in terms of determining mate value. Whereas
previous research on age preferences was mainly restricted to preferences people express in
personal ads (e.g., Campos, Otta, & Siqueira, 2002) or state in direct survey questions (e.g.,
South, 1991), trace data from online dating can reveal how age and age preferences operate for
men and women in a real partner market.
Although age was not their primary analytical focus, Skopek, Schulz, et al. (2011) examined
the role of age similarity and dissimilarity for the probability of first contacts and replies.
Compared to females of similar age, men strongly avoided first contacting females older than
themselves but favoured contacting younger females. In contrast, women favoured similar aged
men for first contacts over both older and especially younger men. Interestingly, age similarity
clearly favoured replies (reciprocity) as both men and women were less likely to reply to older
and younger initiators. Dissimilarity penalties were, however, asymmetric across genders:
Among male targets, the penalty for older initiators was larger than for younger initiators, while
the reverse held true for female targets. Similarly, the study by Hitsch et al. (2010b) detected
horizontal preferences for age, especially for women.
Those studies, however, did not examine the possibility that mate seekers’ preferences
for partner age may vary by mate seeker’s own age. Skopek, Schmitz, & Blossfeld (2011) used
a sample of German online daters to test various theories on male and female preferences on
partner age. They put to test contradicting theoretical arguments about age similarity (directed
similarity, i.e., the man is a bit older than woman) as opposed to age preferences being a
function of gender and age itself. Using an innovative data design that integrated trace data (first
contact messaging data from online dating) and survey data (collected via a panel survey on
the same dating site), the study could contrast preferences as ‘revealed’ by patterns of contact
choices and behaviour with preferences as ‘stated’ by users in the survey. The
theoretical expectation that daters are looking for (roughly) similar aged partners was
clearly not borne out empirically. Instead, the study found that men increasingly
prefer and target younger women as they get older while women, although preferring older
males when younger, become increasingly mixed and undirected in terms of partner age as
they get older.
Figure 12.2. Fraction of younger, older, and same aged targets first contacted by gender.
Note: Panels show men contacting women and women contacting men; average fraction (0 to 1) by age
group (< 20, 20-29, 30-39, 40-49, 50-59, > 60). Fractions calculated for users and, subsequently,
aggregated by gender and age group; obs. = as observed;
exp. = as expected statistically if users contacted targets randomly by age. Similar age = age discrepancy is
two years or less; older = target is 3 or more years older than initiator; younger = target is 3 or more years younger
than initiator. n=10,427 users and m=115,909 first contact messages. Data retrieved from a German online dating
site in 2007 (for detailed description of the data see Skopek, Schmitz, et al., 2011).
Figure 12.2 illustrates those findings by showing observed and expected fractions of
younger, older, and same-age targets contacted by males and females across different age
groups. For younger men, the modal female target was roughly of same age. However, older
men increasingly went for younger targets, crowding out any other age constellations. For men
at age of 30 or older the modal female target was younger. Furthermore, men clearly avoided
older women which is visible in the very low fraction of older female targets in absolute terms
but also relative to the statistical expectation (dashed line). In contrast, women at younger age
groups predominantly sought older targets. The ‘older men’ as females’ modal target was
shifting, however, across age too. Patterns became more and more mixed for older women
(e.g., in age group 50-59 fractions of older, similar aged and younger targets are nearly equal).
Hence, women displayed a larger variance in sought partner age at older age groups.
Recently, using digital trace data from a Czech mobile dating app, Šetinová and
Topinková (2021) replicated the German findings from Skopek, Schmitz, et al. (2011). In line
with the male German daters, male Czech daters increasingly targeted younger women and
avoided targeting older women as a function of their own age. In contrast, female daters, while
generally contacting older men, became more accepting in terms of contacting younger men and
more restrictive in contacting older men as a function of age. The replication study – with data
being collected roughly 10 years later and in a different country – represents crucial evidence
that adds to the generality of findings on the dynamics of male and female age preferences.
The findings obtained by the two studies carry two key implications for theories on
assortative mating along age: first, they highlight that normative or cultural expectations about
horizontal age preferences are too simplistic to serve as useful explanations for observed
behaviour; second, the findings demonstrate the significance of age in shaping opportunities
on the dating market but, importantly, with different implications for men and women.
What exactly are those implications of age-related preferences? Using the same data
from online dating, we can look at the flip side of male and female contacting preferences and
behaviour, namely examining the opportunity structure manifesting on the digital partner
market: who is approached by others (through receiving contact requests) and, therefore,
accumulates opportunities for ‘romance’? To explore that question, I constructed an index of
incoming requests that measures the relative contact popularity of mate seekers of a certain age
group and that is standardized by gender.
4
For example, a score of 100 in the relative popularity
index indicates average popularity implying that a certain age group of men receives an average
number of requests that equals the average of all men. Values larger than 100 indicate above
average popularity (get more requests than average) and values smaller than 100 indicate below
average popularity.
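One plausible way to compute such an index is sketched below. The exact construction used for Figure 12.3 is not reproduced here, so this is an assumption-laden illustration on made-up toy data: the function name `relative_popularity`, the dictionary keys, and the choice of the gender-specific mean as the reference value are all hypothetical.

```python
from collections import defaultdict

def relative_popularity(users):
    """Relative popularity index by (gender, age); 100 = gender average.

    users: list of dicts with keys 'gender', 'age', 'requests_received'
    """
    by_gender = defaultdict(lambda: [0, 0])  # gender -> [requests, users]
    by_group = defaultdict(lambda: [0, 0])   # (gender, age) -> [requests, users]
    for u in users:
        for key, store in ((u["gender"], by_gender),
                           ((u["gender"], u["age"]), by_group)):
            store[key][0] += u["requests_received"]
            store[key][1] += 1
    index = {}
    for (gender, age), (requests, count) in by_group.items():
        # Assumes every gender received at least one request overall.
        gender_mean = by_gender[gender][0] / by_gender[gender][1]
        index[(gender, age)] = 100 * (requests / count) / gender_mean
    return index

# Hypothetical toy data.
toy_users = [
    {"gender": "m", "age": 20, "requests_received": 1},
    {"gender": "m", "age": 20, "requests_received": 1},
    {"gender": "m", "age": 50, "requests_received": 3},
    {"gender": "m", "age": 50, "requests_received": 3},
    {"gender": "f", "age": 30, "requests_received": 9},
    {"gender": "f", "age": 55, "requests_received": 3},
]
popularity = relative_popularity(toy_users)
```

Because the index is standardized within gender, a score of 150 for 50-year-old men and a score of 150 for 30-year-old women both mean 50% more requests than the average member of the same gender, even though women receive far more requests in absolute terms.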
Panel A in Figure 12.3 shows index scores for German male and female daters by age
(from 18 to 60). Age-related popularity patterns diverge substantially by gender. Inspecting
men first, we notice the astoundingly low popularity of young men around age 20 – they were
contacted by women at a rate which is barely 10% of the average man’s rate. Despite increases,
male popularity scores remain below average until age 30 and at average until the late 30s. Past
40, men steadily enjoy higher popularity: men in their late 40s are roughly 50%
more popular, and men in their mid to late 50s roughly 100% more popular, than the average
man. Taken together, men’s popularity is monotonically increasing with age. In contrast,
women’s popularity profile is highly non-monotonic: women below 40 enjoy mostly over-par
popularity, with women from the mid-20s and early 30s being the most sought-after groups.
Past 40, female popularity is steadily declining. In relative terms the popularity of women close to
60 years old is roughly comparable with the popularity of men younger than 25. Taken
together, this simple analysis demonstrates that younger men (below 30) and more mature
women (especially past 40) face structural disadvantages on the dating market; conversely,
women in their later 20s and early 30s as well as men beyond age 40 enjoy structural
advantages.
Another implication regards dating strategies of men and women. Online daters have
at hand two generic approaches to elicit contact opportunities: either they choose to be ‘active’
by contacting targets or they choose to be ‘passive’ by waiting for contact requests to arrive
and, possibly, send a reply. Theories on gender and social norms in dating argue that a more
‘aggressive’, initiative behaviour at early stages of intimate relationships is consistent with
masculine role conceptions but not with feminine ones which rather relate to diffident
behaviour (Clark, Shaver, & Abrahams, 1999; Green & Sandos, 1983). Unsurprisingly, all
studies reviewed here provide overwhelming evidence for a ‘male initiator’ schema – it is men
who are much more active in terms of making the first step – and it is women who receive
many more requests and, therefore, can afford to be more ‘passive’. However, is this traditional
‘dating game’ consistent across age and, accordingly, changing market opportunities?
To explore that further, I examined how the active-passive
strategy mix is distributed across gender and age. I constructed a very simple
measure of initiative, which is just the fraction of self-initiated ties among all message ties a user has.
At the extremes this ‘initiation score’ is 1, implying a perfectly active strategy (all ties are
self-initiated), and 0, implying a perfectly passive strategy (all ties are initiated by others). All
individual scores have been averaged by gender and age. Panel B in Figure 12.3 plots the result.
Two important observations can be made. First and very apparent, men took the initiative more often
than women, but that was most pronounced for the younger age groups. Around the early 20s in age,
more than 80% of males’ message ties were self-initiated contacts. That is contrasted by little
more than 20% self-initiated ties for females of the same age group and up to the early 30s.
Yet, those figures converge across older age groups as males become less active and more
passive while females become more active and less passive. Indeed, towards end of 50s, dating
strategies of men and women were nearly indistinguishable. Finally, the scatter plot in Panel C
24
of Figure 12.3 shows that relative popularity and initiative behaviour are strongly negatively
correlated at the aggregated level of age groups.
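As an illustration, the initiation score described above could be computed along the following lines. This is a minimal sketch and not the code used for this chapter: the function names, the representation of a message tie as an (initiator, recipient) pair that appears once per dyad, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def initiation_scores(ties):
    """Return {user: share of the user's message ties that the user initiated}.

    `ties` is an iterable of (initiator, recipient) pairs, one per message tie
    (hypothetical input format).
    """
    initiated = defaultdict(int)
    total = defaultdict(int)
    for initiator, recipient in ties:
        initiated[initiator] += 1  # tie started by this user
        total[initiator] += 1      # the tie counts for both users involved
        total[recipient] += 1
    return {user: initiated[user] / total[user] for user in total}


def mean_score_by_group(scores, groups):
    """Average individual scores within groups, e.g. (gender, age) cells."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for user, score in scores.items():
        key = groups[user]
        sums[key] += score
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy data: four users and four message ties (invented for illustration).
ties = [("m1", "w1"), ("m1", "w2"), ("m2", "w1"), ("w2", "m2")]
groups = {
    "m1": ("male", 22),
    "m2": ("male", 22),
    "w1": ("female", 22),
    "w2": ("female", 22),
}

scores = initiation_scores(ties)
by_group = mean_score_by_group(scores, groups)
```

In this toy example, user m1 initiated both of his ties (score 1), user w1 initiated none of hers (score 0), and the group averages are 0.75 for men and 0.25 for women aged 22.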
Figure 12.3. Popularity and contact initiation by age.
[Figure: Panel A plots the relative popularity index score (0 to 250) against age (20 to 60). Panel B plots the initiation score (0 to 1) against age (20 to 60). Panel C plots the initiation score (0 to 1) against the relative popularity score (0 to 250), with data points labelled by age group (18 to 60). Legend: Men, Women.]
Note: Panel A) Age-related index of incoming contact requests (relative popularity) by gender. Panel B) Age-
related initiative scores by gender; Initiation score = fraction of initiated contacts of all contact ties. Panel C)
Correlation between relative popularity and initiation score; data points are age groups from 18 to 60 (Pearson
correlation: -0.96 for men, -0.84 for women). Triangles indicate male, circles female index scores. Dashed lines
represent smoothed averages. n=12,602 users, data retrieved from a German online dating site in 2007 (for detailed
description of the data see Skopek, Schmitz, et al., 2011; Skopek, Schulz, et al., 2011).
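The relative popularity index shown in Panel A (defined in note 4) can be sketched in a few lines. The sketch assumes my reading of the index as the share of messages received by an age group divided by that group's share of users, multiplied by 100; the function name, the dictionary inputs, and the toy numbers are hypothetical. As in the chapter, the function would be applied separately to men and women.

```python
def relative_popularity(received_by_age, users_by_age):
    """Relative popularity index per age group for ONE gender.

    I_a = 100 * (c_a / c) * (n / n_a), where c_a is the number of messages
    received by users of age a, c the number received by all users, n_a the
    number of users of age a, and n the number of all users.
    A value of 100 means the age group receives exactly its proportional share.
    """
    c = sum(received_by_age.values())  # messages received by all users
    n = sum(users_by_age.values())     # number of all users
    return {
        age: 100 * (received_by_age.get(age, 0) / c) * (n / n_a)
        for age, n_a in users_by_age.items()
    }


# Toy numbers for one gender (invented for illustration).
received = {25: 300, 40: 100}  # messages received per age group
users = {25: 50, 40: 50}       # users per age group

index = relative_popularity(received, users)
```

With these toy numbers, the 25-year-olds make up half of the users but receive three quarters of the messages, giving an index of 150, whereas the 40-year-olds score 50.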
3.5 In-group versus out-group: Gender, race and racial hierarchies in dating
A series of studies from the US employed digital trace data from online dating to study
aspects of race and racial discrimination in dating, especially the role of dating in racial
segregation. Racial segregation and racialized exclusion have traditionally been a major
concern in US literature on social stratification, and studies on interracial marriage and intimate
relationships can add valuable insights to racial boundaries and inter-ethnic distance in modern
society (Qian & Lichter, 2007). Yet, analysing racial homogamy as found in marriage data
cannot reveal the extent to which racial segregation in marriages is caused by constrained
interethnic meeting opportunities or by racial biases in dating preferences and practices. For
studying the latter, trace data from online dating has proven to be of immense theoretical
and empirical value.
Earlier studies on racial preferences in online dating (Feliciano, Robnett, & Komaie,
2009; Robnett & Feliciano, 2011) collected data on stated race preferences (standardized tick
boxes) from internet dating profiles on Yahoo! Personals, which was one of the major online
dating services in the early 2000s. Regarding white majority users, Feliciano et al. (2009)
demonstrated, first, that race was a fundamental criterion for daters, as preferences on race were
expressed more frequently than preferences on other traits (e.g., education or religion). Second, they found
decisive gender differences: (white) women were generally less open to non-white
relationships than men and, compared to men, were much more likely to exclude Latino and
Asian contacts. Robnett and Feliciano (2011) conducted a follow-up study based on the same
stated preference data but including minority racial groups as well. In support of exchange and
race position theories, they found that, if race preferences were stated, white daters stated much
stronger in-group dating preferences than non-white daters. Gender and race-group interacted
substantially in terms of creating exclusionary preference patterns which disadvantaged Asian
men and black females much more than their opposite-sex counterparts. Taken together, the
evidence garnered from stated preferences in online dating profiles suggests considerable racial
hierarchies in dating but with different implications for race-gender groups due to substantive
gender differences.
Subsequent research extended previous research on stated race preferences by
examining the degree to which actual messaging behaviour in online dating is racially
segregated. Lewis (2013) analysed messaging data obtained from a large online dating site in
the US to reveal racial preferences in dating. The study found a high degree of racial
self-segregation as users’ first contact messaging was disproportionately directed towards their own
ethnic background even after controlling for regional differences in meeting opportunities.
Compared to contact initiation, however, users were more likely to cross racial boundaries
when replying to initial contacts. Interestingly, Lewis found in-group racial biases in initial
contacting and crossing of racial boundaries in replies to be inversely related (e.g., Asians
showed one of the strongest in-group biases when contacting others but showed also one of the
largest reversals of that bias when replying to cross-race contactors). Finally, employing a
quasi-experimental approach the study demonstrated that having received cross-race messages
made users more likely to seek cross-race contacts in subsequent interaction, which held true
especially for non-white users. Lewis explained that finding by a pre-emptive discrimination
effect. According to this interpretation, users (especially minority users) do not approach
cross-race contacts at first because they anticipate (racist) rejection; yet once this belief is invalidated
by a cross-race contact, they become more open-minded, which affects future choices. A follow-up study
replicated those findings on race (Lewis, 2016).
The study by Lin & Lundquist (2013) examined how race, gender, and education
interact in shaping mate searching behaviour online. They obtained digital trace data from one
of the largest US dating websites (covering the period 2003 to 2010). After restricting the data
to user interaction within the 20 largest metropolitan areas in the United States, their dataset
contained nearly one million users and nearly four million initial messages (including
information on replies). They ran two sets of analyses: a) the probability of a contact between
any possible dyad of users within areas (which is comparable to the exponential random graph
approach chosen by Lewis, 2013; 2016) and b) the probability of a reply given a first contact.
First, and in line with Lewis (2013), their results demonstrated that the initial contact step is
driven by rather strong racial ingroup bias: daters most frequently approach other daters from
the same racial identity group. The reply step showed much less in-group bias, but at the same
time revealed evidence for the presence of racial hierarchies: White daters’ messages were
reciprocated by all other groups, black (especially female) daters’ requests to non-black groups
were least likely to be reciprocated, and Asian and Hispanic daters’ requests received medium
level response. Gender mattered as well: compared to white men and non-white women, white
women revealed the strongest in-group racial bias in both initial contacting and replying.
Refined analyses showed that educational differences could not account for that: white,
college-educated daters were more likely to contact and reply to white daters without a college education
than to college-educated black daters. Taken together, these findings highlight the
profound racial stratification that operates in early relationship formation, when men and
women make choices about ‘who to go ahead with’ and ‘who to dismiss’. Furthermore, these
findings from online dating demonstrate that this racial stratification is highly gendered, and it
is black women and black men who seem to be most disadvantaged in the dating market.
Overall, those studies demonstrate the valuable insights digital trace data from online dating
can add to the literature on racial segregation in romantic networks.
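To make the second analytical step (the probability of a reply given a first contact) more concrete, the following sketch tabulates raw reply rates by sender and receiver group. It is a purely descriptive illustration and not the statistical model estimated in the studies reviewed above; the function name, the triple-based input format, the abstract group labels ‘A’ and ‘B’, and the toy records are assumptions made for the example.

```python
from collections import defaultdict


def reply_rates(first_contacts):
    """Return {(sender_group, receiver_group): share of first contacts replied to}.

    `first_contacts` is an iterable of (sender_group, receiver_group, replied)
    triples, one per initial message (hypothetical input format).
    """
    sent = defaultdict(int)
    replied_count = defaultdict(int)
    for sender_group, receiver_group, replied in first_contacts:
        pair = (sender_group, receiver_group)
        sent[pair] += 1
        if replied:
            replied_count[pair] += 1
    return {pair: replied_count[pair] / sent[pair] for pair in sent}


# Toy records with abstract group labels (invented for illustration).
contacts = [
    ("A", "A", True), ("A", "A", False),
    ("A", "B", True),
    ("B", "A", False), ("B", "A", False), ("B", "A", True), ("B", "A", False),
]

rates = reply_rates(contacts)
```

Comparing the entries for (A, B) and (B, A) in such a table is one simple way to see asymmetric reciprocation between groups. Any substantive conclusion would additionally require controls for opportunity structure and user traits, as in the regression-based analyses discussed above.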
Is there a correspondence between racial dating preferences people state and racial
dating preferences people reveal by their search and contact behaviour? The studies by
Anderson et al. (2014) and Mendelsohn et al. (2014) are rare pieces that aim to address this
question. Mendelsohn and colleagues jointly examined stated and revealed racial preferences
of white and black daters, drawing on data from more than a million users of a nationwide dating
site in the US. Stated preferences were extracted from profile data. Findings indicated that
white more than black users and female more than male users stated same-race partner
preferences explicitly. Examining contact patterns, they found that the large majority of whites
contacted in-group with only a tiny 3% fraction of contacts targeted to black users. Importantly,
that pattern held even for users who stated no racial preferences in their profiles. Anderson et
al. (2014) examined profile and activity data (no messaging but users viewing other users’
profiles) from more than 250,000 users of a popular US online dating site. Stated preferences
were measured by race preferences users specified in their profiles, whereas revealed
preferences were measured by the probability ratio of ‘viewing’ (i.e., visiting) same race versus
non-same race profiles. The study yielded three fascinating findings regarding gender, race,
and political ideology (which users could specify in their profile). First, even users who
explicitly stated to be open about race (i.e., explicitly stating no same-race preferences) showed
considerable same-race preferences as revealed by their platform activities – irrespective of
race and gender; even politically ‘very liberal’ users who stated explicitly no same-race
preferences were roughly two to three times more likely to view same race rather than other
race profiles. Second, at the level of stated preferences, women (both white and black) were
much more likely to state same-race preferences than men; however, that was not the case for
revealed preferences, for which no gender differences were found. As the study demonstrated,
this gap in stated versus revealed preferences between genders was due to men (women) acting
more (less) consistently upon their preferential statements (or, in other words, stated preferences
predicted revealed preferences more strongly for men than for women). Third, political ideology
mattered: both stated and revealed same-race preferences increased consistently with political
conservatism.
For our discussion, those two studies provide a remarkable insight: what people say they
seek and what people actually seek in romantic partners do not always fully align.
Analysing trace data from online dating therefore can be a powerful research tool to uncover
perhaps frequently unconscious preferences or ‘biases’ that drive individual choices,
relationship opportunities, and finally outcomes.
Whereas online dating can be a fascinating research lab for understanding the
micro-foundations of racial segregation in intimate relationships, nearly all studies are situated in the
specific ethnic-historical context of the United States. Do those findings generalize to other
countries as well? Is there cross-cultural variation in racial preferences? The study by Potarca
and Mills (2015) provides some answers. They analysed a sample of nearly sixty thousand
user profiles from nine European countries, obtained from a major multinational
matchmaking dating site. The study is remarkable as it is the first and, to my knowledge, so far
the only cross-national piece of research on racial dating preferences based on online dating data.
Although the data did not include messaging, it included specifications on users’ preferred
ethnicity for partners (broad categories such as European, African, Arab, Asian, Hispanic).
Importantly, this information was hidden from other users (thereby not distorted by potential
social desirability bias) but served the site’s matchmaking algorithm to present suitable partners
to users. Overall, users across Europe indicated a considerable same-race preference but there
was a racial hierarchy with Europeans being the most preferred ethnic group (among Europeans
but all non-European groups as well) and Arabs and Africans being the least preferred group.
In fact, in minority groups across Europe, out-group preferences towards Europeans
outweighed in-group racial preferences, a finding that contrasts with previous findings from the US.
Nonetheless, across European countries considerable variation in in-group and out-group racial
preferences was detected (for example, same-race preferences of natives were most
pronounced in Italy, France, and Austria, and least pronounced in Sweden), which was partly
explained by concurrent cross-national variation in immigrant population sizes (as a measure
for intergroup contact but also potential intergroup conflict), general attitudinal climate towards
immigrants (as a measure of tolerance levels), and relative sizes of ethnic groups (as a measure
of minority in-group opportunity). Overall, the study makes a significant contribution to the
research on interracial and interethnic marriage by demonstrating the connection between
racial dating preferences of majorities and minorities and the specific racial or ethnic context
of nations.
3.6 Summary
Overall, the survey has demonstrated the value of employing digital trace data from
online dating for the study of union formation and assortative mating. One central
methodological contribution of online dating studies lies in the fact that online dating can
provide social researchers with quasi-observational data on mate seeking behaviour in
real-life partner market contexts. Many studies using such data have focused on ‘revealing’ male
and female mate preferences by reconstructing and analysing observed mutual contacting
behaviour. In that respect a central theme has been the contrast between horizontality and
verticality of preferences in respect to certain partner traits or general partner desirability.
Studies could also uncover the dynamic nature of those preferences being put in practice; for
example, mate seekers may adjust their preferences based on the feedback they receive from
the market. Moreover, going beyond inspecting one-sided preferences and choices of
individuals, many studies scrutinised the role of two-sided choices for the establishment of
reciprocal contact relationships. As we have seen, similarity in dating relations is frequently
reinforced by reciprocity, thus, ‘homophily’ – as observed by traditional research on intimate
relationships – seems to be a social process of filtering that results from actors mutually
agreeing to continue a relationship. A couple of studies examined behavioural aspects and their
effects and could for example demonstrate the presence of an ‘initiator advantage’. Other,
community-detecting studies showed that ‘hidden’ submarkets exist in online dating, which
can profoundly shape dating opportunities of men and women. Apart from those more general
aspects of dating, we reviewed a series of studies that examined ‘who dates whom?’ from a
social stratification perspective. Regarding assortative dating on education, we found
homophily to be the dominant mechanism, even though there were gender differences in the
details. While frequently neglected in sociological research, age is a prevailing dimension that
stratifies dating. The evidence from online dating has demonstrated how widely partner age
preferences vary by individuals’ own age and how different the implications are for the genders.
Finally, studies have shown the highly racialised nature of dating, which is reflected in the
very private choices men and, especially, women make on online dating services.
Besides studying the process of partner search through and in online dating, important
questions arise as to the actual impact online dating has on union formation and marriage
outcomes. Hence, before drawing a general conclusion, we will discuss some empirical
research on the consequences of online dating for union formation and assortative mating in
the 21st century.
4 The impact of online dating: social integration or social closure?
Actual marriage outcomes are not driven by mate preferences only but also by
opportunities to meet certain individuals on the marriage market (Kalmijn, 1998). Hence, to
what extent can mate preferences estimated from online dating account for actual matching
outcomes in marriages? To address that question, Hitsch et al. (2010a) combined preference
estimates with economic theories of two-sided matching to predict matching outcomes. Their
predictions yielded assortative mating outcomes that resembled closely – in qualitative and
quantitative terms – actual marriage outcomes in a comparable population. Hence, men’s and
women’s preferences for partners when operating in a two-sided fashion seem to be a major
mechanism in producing observed marriage outcomes. On the flipside, those findings suggest
that online dating may produce socially selective dating outcomes much like ‘offline’
assortative mating.
Such findings raise the question as to how online dating may affect actual couple
formation: Does online dating act as an ‘amplifier’ or ‘equaliser’ in terms of assortative mating
and social homogamy? A handful of empirical studies tackled this question but found mixed
evidence. Exploiting data collected from real couples who met through a South Korean dating
site, Lee (2016) aimed to identify the causal effect online dating may have on assortative
mating and union formation. Lee’s findings are striking. While online dating had weakened
occupational and geographical sorting, it strengthened couples’ sorting along educational level,
age, and relationship history, which lends support to the hypothesis of online dating
‘amplifying’ social homogamy. Evidence rather in favour of the equaliser idea, however, is
reported by Potarca (2017) who employed representative survey data from the United States and
Germany: couples who had met ‘online’ exhibited a similar and frequently lower degree of
homogamy in terms of education, race, and religiosity compared to couples who had met via
traditional ‘offline’ venues (e.g., schools and social networks). Moreover, findings from a
recent study on the US by Thomas (2020) add support to the equaliser hypothesis: compared
to couples who met offline, couples who met first online were more different in terms of race
and ethnicity, religion, and educational attainment albeit they were more similar in age.
Nonetheless, Thomas’ (2020) analysis also concluded that the overall impact of online dating
on population-level patterns of heterogamy and exogamy might be rather limited.
Representative data from Switzerland analysed by Potarca (2020) suggests that couples who
met through dating apps (as opposed to traditional offline settings) revealed more exogamous
patterns in terms of educational attainment and geographical distance. Taken together, most of
the evidence seems to suggest that the ‘digitalised’ partner market of today operates towards a
mitigation rather than a cementation of social barriers in the formation of romantic relationships.
Finally, the recent study by Potarca (2021) demonstrated that men and women who met
partners online as opposed to traditional settings were more likely to convert their relationship
to a committed marital union. This ‘online dating’ premium in the transition to marriage was
especially pronounced for higher educated women, who seem to benefit the most from the enlarged
options online dating might offer.
5 Conclusion
Once a niche phenomenon around the turn of the century, digitally mediated dating –
online dating – has developed into one of the most important social tools of seeking and finding
intimate relationships in the 2020s. While research has just started to understand the
consequences of this dating revolution on relationship formation in the 21st century, online and
mobile dating sites are huge generators of non-reactive, quasi-observational data on the process
of mate search itself. This chapter provided a critical discussion of the very nature of digital
trace data from online dating and surveyed research from the past one and a half decades that
employed such data. The review of 25 studies revealed how trace data from online dating
facilitates our scientific understanding of mate seeking strategies men and women apply,
gender-specific preferences regarding partner attributes along various socioeconomic,
sociodemographic, and sociocultural characteristics, the role of two-sided choices for the
emergence of dating relationships, as well as the implicit social and racial segregation as
created within the digital dating market. Arguably, the rise of online dating seems to make
evidence gathered from online dating ever more relevant for future research on intimate
relationships, marriage and family, and social stratification.
And still, the extant research leaves much to be desired. The concrete wish
list is long, so I will limit myself to some major strategic points. First, most of the studies draw
on data collected in the 2000s to early 2010s, the ‘early’ age of online dating. However, online
dating services have developed substantially over the past decade and especially the rise of
app-based mobile dating is hardly reflected in the study landscape. Clearly, future research
needs a data update. Second, the lion’s share of studies is situated in the context of the United
States whereas just a few studies relate to European or Asian countries. Despite the presence
of internationally operating online dating companies, there is nearly no cross-national research
that would compare dating behaviour in various cultural and societal contexts, apart from the
eDarling study by Potarca and Mills (2015). Third, quasi-observational trace data on user-user
interactions represent rather ‘thin’ descriptions of the dating process. Linking trace data with
synchronized survey data (e.g., via user follow-up surveys) could generate a much richer
account of the personal, biographical, and social circumstances of male and female daters, their
identities, motivations, and aspirations, as well as, prospectively, their dating and relationship
outcomes. Indeed, just one study in the review went for such an integration.
Online dating sites and apps are, in stark contrast to Twitter or YouTube, closed-access
applications and data collection for research purposes can only succeed through collaborative
ties between research and the industry. Previous collaborations have been rather local and
short-term (e.g., in the form of a data ‘hand over’), but the future could lie in long-term
business-research collaborations. The online dating market has been strongly consolidating
over the past decade; major market players have emerged and are probably here to stay for a
while. Big companies like eHarmony or the Match Group dominate the market with various
products like Match.com or Tinder used by people all around the globe. Long-term institutional
collaborations could be of great benefit for both academic researchers, who could build an
unprecedentedly integrated and cross-national data infrastructure on dating and mating, and
business players, who want to improve their products in a research-based fashion and provide
services that customers can trust. Rather than single academics with limited resources
and funding, social science infrastructure projects may pioneer such future research
collaborations. To be sure, such longer-term research-business collaborations will require
investments on all sides, but the scientific prospects will be worth it.
NOTES
1 In this chapter I use online dating and mobile dating synonymously.
2 The assumption that replies to initial contacts reciprocate romantic interest may not always be valid. Even
though non-responding in an anonymous, non-face-to-face environment is the most cost-effective option (in
terms of time needed to write a response) of expressing non-reciprocation, targeted users may send a ‘polite’
reply to reject the initiator. Nonetheless, the typically low rates of reply that have been found in many studies may
suggest that ‘polite’ rejections occur rather rarely.
3 The PageRank algorithm was developed by Google founder Larry Page and, in the past, was used by Google
to determine the hyperlink popularity of websites.
4 The relative popularity index I is calculated separately by gender via
$I = 100 \cdot \left( \frac{c_a}{c} \cdot \frac{n}{n_a} \right)$
with $c_a$ being the number of messages received by users of age $a$, $c$ the number of messages received by all users, $n_a$ the number
of users of age $a$, and $n$ the number of all users. An important feature of the index is its standardization by
gender, which is important as women had received on average many more contacts than men over the
observation period (women on average 11.8, men on average 2.9).
References
Alexopoulos, C., Timmermans, E., & McNallie, J. (2020). Swiping more, committing less:
Unraveling the links among dating app use, dating app success, and intention to commit
infidelity. Computers in Human Behavior, 102(August 2019), 172–180.
Allen, B. P. (1976). Race and Physical Attractiveness As Criteria for White Subjects’ Dating
Choices. Social Behavior and Personality: An International Journal, 4(2), 289–296.
Anderson, A., Goel, S., Huber, G., Malhotra, N., & Watts, D. J. (2014). Political ideology and
racial preferences in online dating. Sociological Science, 1(February), 28–40.
Anzani, A., Di Sarno, M., & Prunas, A. (2018). Using smartphone apps to find sexual partners:
A review of the literature. Sexologies, 27(3), e61–e65.
Bavel, J. Van. (2021). Partner choice and partner markets. In N. F. Schneider & M. Kreyenfeld
(Eds.), Research Handbook on the Sociology of the Family (pp. 219–231). Cheltenham,
UK: Edward Elgar Publishing.
Blossfeld, H.-P. (2009). Educational Assortative Marriage in Comparative Perspective. Annual
Review of Sociology, 35(1), 513–530.
Bruch, E. E., Feinberg, F., & Lee, K. Y. (2016). Extracting multistage screening rules from
online dating activity data. Proceedings of the National Academy of Sciences of the United
States of America, 113(38), 10530–10535.
Bruch, E. E., & Newman, M. E. J. (2018). Aspirational pursuit of mates in online dating
markets. Science Advances, 4(8), 4–10.
Bruch, E. E., & Newman, M. E. J. (2019). Structure of online dating markets in US cities.
Sociological Science, 219–234.
Buss, D. M. (1989). Sex differences in human mate preferences: Evolutionary hypotheses
tested in 37 cultures. Behavioral and Brain Sciences, 12, 1–49.
Butler-Smith, P., Cameron, S., & Collins, A. (1998). Gender differences in mate search effort:
An exploratory economic analysis of personal advertisements. Applied Economics,
30(10), 1277–1285.
Bytheway, W. R. (1981). The Variation with Age of Age Differences in Marriage. Journal of
Marriage and the Family, 43, 923–927.
Campos, L. D. S., Otta, E., & Siqueira, J. D. O. (2002). Sex differences in mate selection
strategies: Content analyses and responses to personal advertisements in Brazil. Evolution
and Human Behavior, 23, 395–406.
Cesare, N., Lee, H., McCormick, T., Spiro, E., & Zagheni, E. (2018). Promises and Pitfalls of
Using Digital Traces for Demographic Research. Demography, 55(5), 1979–1999.
Clark, C. L., Shaver, P. R., & Abrahams, M. F. (1999). Strategic Behaviors in Romantic
Relationship Initiation. Personality and Social Psychology Bulletin, 25, 709–722.
Dinh, R., Gildersleve, P., Blex, C., & Yasseri, T. (2021). Computational courtship:
Understanding the evolution of online dating through large-scale data analysis. Journal of
Computational Social Science.
Duncan, Z., & March, E. (2019). Using Tinder® to start a fire: Predicting antisocial use of
Tinder® with gender and the Dark Tetrad. Personality and Individual Differences,
145(March), 9–14.
Edelmann, A., Wolff, T., Montagne, D., & Bail, C. A. (2020). Computational social science
and sociology. Annual Review of Sociology, 46, 61–81.
Edwards, J. (1968). Familial behavior as social exchange. Journal of Marriage and the Family,
3, 518–526.
Egebark, J., Ekström, M., Plug, E., & van Praag, M. (2021). Brains or beauty? Causal evidence
on the returns to education and attractiveness in the online dating market. Journal of
Public Economics, 196, 104372.
England, P., & McClintock, E. A. (2009). The gendered double standard of aging in US
marriage markets. Population and Development Review, 35(December), 797–816.
Feliciano, C., Robnett, B., & Komaie, G. (2009). Gendered racial exclusion among white
internet daters. Social Science Research, 38(1), 39–54.
Felmlee, D. H., & Kreager, D. A. (2017). The invisible contours of online dating communities:
A social network perspective. Journal of Social Structure, 18.
Finkel, E. J., Eastwick, P. W., & Matthews, J. (2007). Speed dating as an invaluable tool for
studying romantic attraction: A methodological primer. Personal Relationships, 14(2007),
149–166.
Fiore, A., & Donath, J. (2005). Homophily in online dating: when do you like someone like
yourself? CHI’05 Extended Abstracts on Human Factors in Computing Systems, 1371–
1374.
Gibson, A. F. (2021). Exploring the impact of COVID-19 on mobile dating: Critical avenues
for research. Social and Personality Psychology Compass, 15(11), 1–9.
Green, S. K., & Sandos, P. (1983). Perceptions of male and female initiators of relationships.
Sex Roles, 9, 849–852.
Hampton, K. N. (2017). Studying the Digital: Directions and Challenges for Digital Methods.
Annual Review of Sociology.
Harrison, A. A., & Saeed, L. (1977). Let’s make a deal: An analysis of revelations and
stipulations in lonely hearts advertisements. Journal of Personality and Social
Psychology, 35(4), 257–264.
Hirschman, E. C. (1987). People as Products: Analysis of a Complex Marketing Exchange.
Journal of Marketing, 51(1), 98.
Hitsch, G. J., Hortaçsu, A., & Ariely, D. (2010a). Matching and sorting in online dating.
American Economic Review, 100, 130–163.
Hitsch, G. J., Hortaçsu, A., & Ariely, D. (2010b). What makes you click?—Mate preferences
in online dating. Quantitative Marketing and Economics, 8(4), 393–427.
Janetzko, D. (2008). Nonreactive data collection. In N. Fielding, R. M. Lee, & G. Blank (Eds.),
The SAGE handbook of online research methods (pp. 161–173). London: SAGE
Publications.
Kalmijn, M. (1998). Intermarriage and homogamy: causes, patterns, trends. Annual Review of
Sociology, 395–421.
Kreager, D. A., Cavanagh, S. E., Yen, J., & Yu, M. (2014). “Where have all the good men
gone?” Gendered interactions in online dating. Journal of Marriage and Family,
76(April), 387–410.
Lazer, D., & Radford, J. (2017). Data Ex Machina: Introduction to Big Data. Annual Review
of Sociology, 43, 19–39.
Lee, S. (2016). Effect of Online Dating on Assortative Mating: Evidence from South Korea.
Journal of Applied Econometrics, 31(6), 1120–1139.
Lewis, K. (2013). The limits of racial prejudice. Proceedings of the National Academy of
Sciences of the United States of America, 110(47), 18814–18819.
Lewis, K. (2016). Preferences in the early stages of mate choice. Social Forces, 95(1), 283–
320. https://doi.org/10.1093/sf/sow036
Lichter, D. T., & Qian, Z. (2019). The Study of Assortative Mating: Theory, Data, and
Analysis. In R. Schoen (Ed.), Analytical Family Demography (pp. 303–337). Cham:
Springer.
Lin, K. H., & Lundquist, J. (2013). Mate selection in cyberspace: The intersection of race,
gender, and education. American Journal of Sociology, 119(1), 183–215.
Luo, L., Zhang, X., Chen, X., Liu, K., Peng, D., & Yang, X. (2022). DCRS: a deep contrast
reciprocal recommender system to simultaneously capture user interest and attractiveness
for online dating. Neural Computing and Applications, 34(8), 6413–6425.
Mendelsohn, G. A., Taylor, L. S., Fiore, A. T., & Cheshire, C. (2014). Black/White dating
online: Interracial courtship in the 21st century. Psychology of Popular Media Culture,
3(1), 2–18.
Murphy, A. (2017). Dating dangerously: Risks lurking within mobile dating apps. Catholic
University Journal of Law and Technology, 26, 100. Retrieved from
https://scholarship.law.edu/jlt/vol26/iss1/7
Ong, D., & Wang, J. (2015). Income attraction: An online dating field experiment. Journal of
Economic Behavior and Organization, 111, 13–22.
Potarca, G. (2017). Does the internet affect assortative mating? Evidence from the U.S. and
Germany. Social Science Research, 61, 278–297.
Potarca, G. (2020). The demography of swiping right. An overview of couples who met
through dating apps in Switzerland. PLoS ONE, 15(12), 1–22.
Potarca, G. (2021). Online dating is shifting educational inequalities in marriage formation in
Germany. Demography, 58(5), 1977–2007.
Potarca, G., & Mills, M. (2015). Racial preferences in online dating across European countries.
European Sociological Review, 31(3), 326–341.
Qian, Z., & Lichter, D. T. (2007). Social boundaries and marital assimilation: Interpreting
trends in racial and ethnic intermarriage. American Sociological Review, 72(1), 68–94.
Rege, A. (2009). What’s Love Got to Do with It? Exploring Online Dating Scams and Identity
Fraud. International Journal of Cyber Criminology, 3(2), 494–512.
Robnett, B., & Feliciano, C. (2011). Patterns of Racial-Ethnic Exclusion by Internet Daters.
Social Forces, 89(3), 807–828.
Rosenfeld, M. J. (2005). A Critique of Exchange Theory in Mate Selection. American Journal
of Sociology, 110(5), 1284–1325.
Rosenfeld, M. J., Thomas, R. J., & Hausen, S. (2019). Disintermediating your friends: How
online dating in the United States displaces other ways of meeting. Proceedings of the
National Academy of Sciences of the United States of America, 116(36), 17753–17758.
Schmitz, A., Yanenko, O., & Hebing, M. (2012). Identifying Artificial Actors in E-Dating: A
Probabilistic Segmentation Based on Interactional Pattern Analysis. In W. A. Gaul, A.
Geyer-Schulz, L. Schmidt-Thieme, & J. Kunze (Eds.), Challenges at the Interface of Data
Analysis, Computer Science, and Optimization (pp. 319–327). Heidelberg: Springer.
Schmitz, A., Zillmann, D., & Blossfeld, H.-P. (2013). Do Women
Pick Up Lies before Men? The Association between Gender, Deception Patterns, and
Detection Modes in Online Dating. Online Journal of Communication and Media
Technologies, 3(3), 52–73.
Schulz, F., Skopek, J., & Blossfeld, H.-P. (2010). Partnerwahl als konsensuelle Entscheidung.
Das Antwortverhalten bei Erstkontakten im Online-Dating. Kölner Zeitschrift Für
Soziologie Und Sozialpsychologie, 62(3), 485–514.
Schwartz, C. R. (2013). Trends and Variation in Assortative Mating: Causes and
Consequences. Annual Review of Sociology, 39(1), 451–470.
Šetinová, M., & Topinková, R. (2021). Partner preference and age: User’s mating behavior in
online dating. Journal of Family Research, 33(3), 566–591.
Simmons, M., & Lee, J. S. (2020). Catfishing: A Look into Online Dating and Impersonation.
In G. Meiselwitz (Ed.), Social Computing and Social Media. Design, Ethics, User
Behavior, and Social Network Analysis (pp. 349–358). Cham: Springer.
Skopek, J. (2012). Partnerwahl im Internet. Eine quantitative Analyse von Strukturen und
Prozessen der Online-Partnersuche. Wiesbaden: VS Verlag für Sozialwissenschaften.
Skopek, J., Schmitz, A., & Blossfeld, H.-P. (2011). The gendered dynamics of age preferences
- Empirical evidence from online dating. Zeitschrift Für Familienforschung, 23, 267–290.
Skopek, J., Schulz, F., & Blossfeld, H.-P. (2009). Partnersuche im Internet. Kölner Zeitschrift
Für Soziologie Und Sozialpsychologie, 61(2), 183–210.
Skopek, J., Schulz, F., & Blossfeld, H.-P. (2011). Who Contacts Whom? Educational
Homophily in Online Mate Selection. European Sociological Review, 27(2), 180–195.
Sontag, S. (1979). The double standard of aging. In J. Williams (Ed.), Psychology of women
(pp. 462–478). New York: Academic Press.
South, S. J. (1991). Sociodemographic Differentials in Mate Selection Preferences. Journal of
Marriage and Family, 53(4), 928.
Statista. (2022). Dating Worldwide. Statista Market Forecast. Retrieved from
https://www.statista.com/outlook/dmo/eservices/dating-services/online-dating/worldwide
Taylor, L. S., Fiore, A. T., Mendelsohn, G. A., & Cheshire, C. (2011). “Out of My League”: A
Real-World Test of the Matching Hypothesis. Personality and Social Psychology
Bulletin, 37(7), 942–954.
Thomas, R. J. (2020). Online Exogamy Reconsidered: Estimating the Internet’s Effects on
Racial, Educational, Religious, Political and Age Assortative Mating. Social Forces,
98(3), 1257–1286.
Udry, J. R. (1977). The Importance of Being Beautiful: A Reexamination and Racial
Comparison. American Journal of Sociology, 83, 154.
Webb, E. J., Campbell, D. T., Schwartz, R. D., & Sechrest, L. (1999). Unobtrusive measures.
Thousand Oaks: Sage Publications.
Xia, P., Liu, B., Sun, Y., & Chen, C. (2015). Reciprocal recommendation system for online
dating. Proceedings of the 2015 IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining, ASONAM 2015, 234–241.
https://doi.org/10.1145/2808797.2809282
Appendix
Table 12.A1. Social science studies published using trace data from online dating (chronological sorting).

| Study | Data (national context, type, period) | Research topics and themes |
| --- | --- | --- |
| Fiore & Donath (2005) (not peer-reviewed, but the most cited study) | OD site in the United States, about 65,000 users and nearly 240,000 messages, period 2002-2003 | Degree of homophily in user exchanges along various demographic and lifestyle characteristics |
| Feliciano et al. (2009) | United States, subsample of 1,558 white online daters from a sample of 6,070 online dating profiles from four major cities on Yahoo Personals, period 2004-2005 | Interracial dating preferences (preference stated in user profiles, white daters only), gender differences |
| Skopek et al. (2009) | OD site in Germany, 12,608 users, 116,138 initial messages (first contacts), period 2007 | Educational stratification in online dating (initial contacting), gender differences |
| Schulz et al. (2010) | OD site in Germany, 10,440 initiators and 20,708 targets, 116,138 initial messages (first contacts), period 2007 | Educational stratification in online dating and exchange-theoretical trade-offs (reciprocated contacts), gender differences |
| Hitsch et al. (2010b) | OD site in the United States, 6,485 users and their messages from two US metropolitan areas (Boston and San Diego), period 2003 | Vertical versus horizontal preferences, gender differences |
| Hitsch et al. (2010a) | United States, same digital trace data as Hitsch et al. (2010b), representative CPS data on Boston and San Diego | Vertical versus horizontal preferences, predicted sorting outcomes based on matching algorithms (Gale-Shapley) |
| Robnett & Feliciano (2011) | United States, 6,070 user profiles from Yahoo Personals, period 2004-2005 | Interracial dating preferences stated in user profiles |
| Skopek, Schulz, et al. (2011) | OD site in Germany, 12,608 users, 116,138 initial messages (first contacts), period 2007 | Vertical versus horizontal preferences (education, age, physical attractiveness), gender differences in preferences |
| Skopek, Schmitz, et al. (2011) | OD site in Germany, digital trace data (10,427 users and 115,909 initial contact events, period 2007) and survey data from a web survey conducted on the same site 2 years later (2,271 users, period 2009-2010) | Gendered preferences for partner age based on revealed and stated preferences |
| Taylor et al. (2011) | OD site in the United States, 4 sub-studies including lab settings and actual data from an online dating site (several sub-samples from < 1000 to > 1 million users), period 2009-2010 | Test of the ‘matching’ hypothesis (preference for similar desirability) along lines of physical attractiveness, self-worth, platform popularity |
| Lewis (2013) | OD site in the United States (OkCupid), more than 126,000 users and their messages, period 2010 | Interracial dating preferences, racial segregation in online dating, causal effect of cross-race contacting on preferences |
| Lin & Lundquist (2013) | OD site in the United States, data restricted to 20 largest metropolitan areas, more than 1 million users and nearly 4 million initial messages, period 2003-2010 | Interracial dating preferences, racial segregation in online dating |
| Anderson et al. (2014) | OD site in the United States, more than 251,000 users and their activities, period 2009 | Link/gap between stated and revealed race preferences and the role of sex, race, and political ideology |
| Mendelsohn et al. (2014) | OD site in the United States, nationwide sample of users (> 1 million users) of a major dating site, period 2009-2010 | Interracial dating preferences of black and white OD users (stated and revealed preferences), racial segregation in online dating |
| Kreager et al. (2014) | OD site in the United States, 6-month observation restricted to a mid-sized city, 14,533 users (177,404 first contact messages), period 2010-2011 | Gendered contact preferences, matching versus competition (vertical preferences), homophily as a process, initiator advantage |
| Potarca & Mills (2015) | International online dating/matchmaking site (eDarling), data from European countries, 58,880 user profiles and indicated matching preferences, period 2011 | Racial preferences, in-group/out-group partner preferences of ethnic groups across Europe, country-level determinants of racial preferences |
| Ong & Wang (2015) | OD site in China, field experiment, 360 (180 per gender) baseline profiles with varying income levels, randomly released, click volume counted, period 2013 | Estimating ‘click’ returns to income in online dating |
| Bruch et al. (2016) | OD site in the United States, sub-sample of 1,855 users and their activities from the New York City metropolitan area, period 2014 | Development of a framework to model online dating choices in a multistage fashion (browsing, messaging) |
| Lewis (2016) | OD site in the United States (OkCupid), sub-sample of the data used in Lewis (2013), 7,671 users from New York City and their messaging activities, period 2010 | ‘Matching’ (preference for similarity) versus ‘competition’ (preference for status) across various socioeconomic, socio-demographic, and cultural attributes, gender differences |
| Felmlee & Kreager (2017) | OD site in the United States, data extract from one metropolitan region of a national online dating site, 3,521 users and their (reciprocated) messages, 1-month observation period (year not specified) | Exploration of meso-level network cluster structures in online dating |
| Bruch & Newman (2018) | OD site in the United States, restriction to the four large metropolitan areas of New York City, Boston, Chicago, and Seattle, 186,935 users and their messaging activity, period 2014 | Mate seeking and messaging strategies (contacting in terms of user desirability) |
| Bruch & Newman (2019) | OD site in the United States, restriction to four large metropolitan areas: New York City, Boston, Chicago, and Seattle; > 4 million users and > 15 million reciprocal interactions, period 2014 | Social structure of US dating markets, identification of sub-market structures as revealed by messaging |
| Egebark et al. (2021) | OD site in the Netherlands, field experiment, 12 (6 by gender) fictitious user profiles sending initial contacts (signalling long-term relationship goals and invitation to meet) to a randomized sample of 2,667 (real) site users; period 2016 | Estimating returns (e.g., attention, replies, or positive replies) to attractiveness and education in online dating |
| Dinh et al. (2021) | International online dating/matchmaking site (eHarmony), dataset obtained from eHarmony (United Kingdom), subsample of users registered during the month of March, 149,440 user profiles and messaging, period 2007-2018 | Gender differences in mate preferences, gender differences in online dating practices, long-term change over time |
| Šetinová & Topinková (2021) | Mobile dating app in the Czech Republic, 10,528 users and 196,206 initial messages (standardized invitations to chat), period 2016-2019 | Gendered preferences for partner age |
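Table 12.1. Similarity and dissimilarity along education, age, and physique in first and reciprocated contacts of men and women (German online dating site).

| Attribute | Direction | Relation | Expectation | Initial contacts: Observed | Initial contacts: Bias (Obs./Exp.) | Reciprocated contacts: Observed | Reciprocated contacts: Bias (Obs./Exp.) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Education level (a) | Male ⇒ Female | I = T | 30.4% | 35.0% | 1.15 | 36.7% | 1.21 |
| Education level (a) | Male ⇒ Female | I < T | 31.6% | 29.2% | 0.92 | 26.6% | 0.84 |
| Education level (a) | Male ⇒ Female | I > T | 38.1% | 35.8% | 0.94 | 36.6% | 0.96 |
| Education level (a) | Female ⇒ Male | I = T | 30.2% | 40.9% | 1.35 | 40.4% | 1.34 |
| Education level (a) | Female ⇒ Male | I < T | 38.4% | 40.7% | 1.06 | 38.8% | 1.01 |
| Education level (a) | Female ⇒ Male | I > T | 31.4% | 18.5% | 0.59 | 20.9% | 0.67 |
| Age (b) | Male ⇒ Female | I = T | 12.6% | 27.8% | 2.20 | 29.8% | 2.36 |
| Age (b) | Male ⇒ Female | I < T | 42.9% | 8.5% | 0.20 | 11.1% | 0.26 |
| Age (b) | Male ⇒ Female | I > T | 44.5% | 63.8% | 1.43 | 59.1% | 1.33 |
| Age (b) | Female ⇒ Male | I = T | 12.6% | 35.2% | 2.79 | 34.4% | 2.72 |
| Age (b) | Female ⇒ Male | I < T | 44.5% | 50.4% | 1.13 | 52.3% | 1.18 |
| Age (b) | Female ⇒ Male | I > T | 42.9% | 14.4% | 0.34 | 13.3% | 0.31 |
| BMI class (c) | Male ⇒ Female | I = T | 46.5% | 49.8% | 1.07 | 50.5% | 1.09 |
| BMI class (c) | Male ⇒ Female | I < T | 22.4% | 12.0% | 0.53 | 14.3% | 0.64 |
| BMI class (c) | Male ⇒ Female | I > T | 31.1% | 38.2% | 1.23 | 35.2% | 1.13 |
| BMI class (c) | Female ⇒ Male | I = T | 46.5% | 51.9% | 1.12 | 51.5% | 1.11 |
| BMI class (c) | Female ⇒ Male | I < T | 30.9% | 28.9% | 0.94 | 30.8% | 0.99 |
| BMI class (c) | Female ⇒ Male | I > T | 22.6% | 19.2% | 0.85 | 17.8% | 0.79 |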
Note: Shares of similar and dissimilar contact dyads for the average user, among initial contacts and among the subset of reciprocated contacts. I = initiator, T = target. (a) Four education levels: 1 = lower secondary, 2 = intermediate secondary, 3 = upper secondary, 4 = tertiary. (b) Age similarity (I = T): age difference of at most 2 years; the target is older (I < T) if at least 3 years older and younger (I > T) if at least 3 years younger. (c) Body mass index (BMI), used to assign five weight classes following the WHO: 1 = underweight, 2 = normal weight, 3 = overweight, 4 = obesity, 5 = severe obesity. n = 10,440 initiating users (effective sample sizes are lower due to missing values). Data were retrieved from a German online dating site in 2007; contact events are weighted by the inverse of the number of events per user (for a detailed description of the data, see Skopek, Schulz, et al., 2011).
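
The two computations behind the table, the bias ratio (observed share divided by expected share) and the weighting of contact events by the inverse number of events per user, can be illustrated with a small Python sketch. It is not the original analysis code. The function names and the toy event list are hypothetical, and the sketch assumes that "events per user" refers to the events sent by each initiating user. How the expected shares themselves were derived is not shown here.

```python
from collections import Counter


def bias(observed_pct, expected_pct):
    """Bias ratio as reported in Table 12.1: observed share / expected share."""
    return observed_pct / expected_pct


def weighted_shares(events):
    """Share of each dyad category across users.

    Each contact event is weighted by 1 / (number of events sent by its
    initiator), so every initiator contributes a total weight of 1
    (assumption: 'events per user' means events per initiating user).
    """
    n_events = Counter(user for user, _ in events)
    totals = Counter()
    for user, category in events:
        totals[category] += 1.0 / n_events[user]
    n_users = len(n_events)
    return {category: weight / n_users for category, weight in totals.items()}


# Education, women contacting men, initial contacts (values from Table 12.1).
print(round(bias(40.9, 30.2), 2))  # I = T: prints 1.35
print(round(bias(18.5, 31.4), 2))  # I > T: prints 0.59

# Hypothetical toy data: (initiator id, dyad category).
toy_events = [
    ("u1", "I=T"), ("u1", "I=T"), ("u1", "I=T"), ("u1", "I>T"),
    ("u2", "I<T"),
]
shares = weighted_shares(toy_events)
print(shares)
```

In the toy data, user u1 sends four of the five events, but because each user carries the same total weight, the single event of u2 accounts for half of the weighted distribution. This is the sense in which the table describes the average user rather than the average contact event.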