Unpublished draft, has not yet undergone peer review.
Submitted to the American Journal of Bioethics on Nov 12th 2020
DNA DATA FROM ROMA IN FORENSIC GENETIC STUDIES AND DATABASES:
RISKS AND CHALLENGES
Veronika Lipphardt
Mihai Surdu
Abstract
Since the early 1990s, DNA data has been collected in Roma communities in international collaborations from forensic and medical genetics, yielding 45 forensic genetics publications drawing on DNA data from Roma. No other minority or population group from Europe has received such a high level of attention.
This paper examines data collections represented in forensic genetic journals and databases relative to internationally respected ethical standards. It demonstrates that ethical requirements are rarely met by studies and data sets. For some data, consent may have been obtained for purposes other than forensics. Several studies list co-authors affiliated with investigative or military forces.
What we have observed can be described as a generalised lack of transparency and as deficiency in awareness of the ethical sensitivity of DNA data from Roma in forensic contexts. In some cases, data sharing practices and non-transparent reporting may be concealing “data laundering”.
I. Introduction
This paper examines DNA data from Roma in publicly available forensic genetic databases and in studies published in scientific journals. The databases are used in the context of criminal investigations and for research; the studies have varying research goals and often feed into those databases. The main aim of this paper is to evaluate forensic genetic studies and DNA databases in relation to expected ethical standards.
The paper argues that, for DNA data from Roma obtained or processed in forensic contexts, ethical requirements for genetic research, as laid down in international agreements and protocols, are rarely considered. What we have observed in the available sources may generally be described as wide-ranging non-transparency, pointing to a lack of awareness of the ethical sensitivity of DNA data from Roma. Furthermore, this paper will explore some of the methodological and conceptual challenges connected to DNA data and bio-samples from Roma.
We begin with our research design, followed by an overview of ethical standards and agreements, continue with an assessment of ethical practices, contextualize our findings in various ways, and discuss and conclude in the final section.
1. Limitations
This paper focuses on DNA data from Roma in forensic contexts today. Yet some contextualization and a consideration of limitations are necessary. Firstly, many of the points raised concern not only DNA data from Roma, but also DNA data from other social or ethnic minorities. However, in any regard, Roma are the most strongly affected group in Europe (for Uighurs, see Moreau 2019; for other non-European groups, see D’Amato et al., 2020). Secondly, many of the points raised are of concern not only for the DNA data in forensic databases and journals, but also for data in medical and population genetics (although most severely in forensic genetics), as well as for DNA data that crosses the boundaries between these subfields of human genetics. Thirdly, this paper focuses on ethical, social and privacy risks while, for the sake of brevity, it treats the methodological and conceptual challenges of the research publications less thoroughly (cf. Lipphardt, Rappold and Surdu, under review).
2. Research questions and hypotheses
Before we introduce our research questions and hypotheses, we wish to contextualize our findings
within two different yet related fields of scholarship: science and technology studies (STS) on the
one hand, and the ethical, legal and social implications (ELSI) or aspects (ELSA) of genomics on
the other hand.
A. STS research on “suspect populations” in forensic genetics
The argument of this paper is developed against the backdrop of a widespread increased trust in the reliability of forensic DNA technologies¹, and of exaggerated hopes for their potential to ensure security (Buchanan and Weitz, 2017; Staubach et al., 2017). This reliance can be observed both in the public sphere and in experts’ and professionals’ statements (Lipphardt 2018). DNA is widely seen as a “truth machine” for forensics, beyond the possibility of error (Lynch et al., 2010:xi).
While the best practice of DNA fingerprinting technology, using a high number of autosomal STR markers, has reached an unparalleled level of reliability when it comes to identification, other methods, such as calculating random match probabilities for uniparental haplotypes or predicting externally visible characteristics (EVC) and biogeographical ancestry (BGA), are far less reliable and technically not so well established (Pfaffelhuber et al., 2019) and depend much more on interpretation (Lipphardt 2018).
Furthermore, in contrast to DNA fingerprinting, a technology that zooms in on the individual level of identification, prediction of EVC and BGA zooms in on the population level, thereby risking the creation of “suspect populations” (Cole and Lynch, 2006; M’charek, Toom and Jong 2019). As Toom puts it, “genetic information regarding ‘sex’ and ‘race’ is not so much used as evidence but is considered as an important lead which feeds into the process of criminal investigation. It enables detectives to focus their investigations on one group that shares a particular trait, like ‘race’ and/or ‘sex’” (Toom 2010:186).
¹ For a much more comprehensive overview of current STS literature on forensic DNA technologies, see Machado and Granja (2020); Ellebrecht and Weber (forthcoming); M’charek, Toom and Jong (2019); Williams and Wienroth (2017) and M’charek and Wade (2020:321): “Themes that have been studied in the literature include the architecture of DNA databases, what goes into them and how this is regulated, how DNA databases can be used by the state and the private sector for these purposes, and how data can creep and bleed into other databases and functions; privacy, democratic transparency and accountability; and questions of evidence and expertise – how DNA came to be seen as authoritative in forensics, criminal investigation and crime management, and how it operates in practice. […] Included in this area is the way forensic genetics may provide mechanisms to discriminate against racialised categories and, in so doing, rematerialise and authorise ‘race’ itself in diverse forms (genetic, phenotypic, cultural) […]”.
Accordingly, some of the social and ethical issues raised by these forensic technologies are the risk of discriminatory effects, privacy risks, possible violations of the ‘right not to know’, and “the vision of a slippery slope leading, ultimately, to eugenics” (Koops and Schellekens 2008:160). STS scholars have emphasized that, although EVC and BGA technologies could have the potential to exclude ethnic minorities from the pool of suspects, their use will also, and perhaps more significantly, contribute to shaping “suspect populations” (Cole and Lynch, 2006; Cole, 2020; M’charek, Toom and Prainsack, 2012; Toom et al., 2016; Wienroth, 2020). STS scholars argue that EVC and BGA prediction “increases the risk of stigmatization of individuals and groups, reinforces racial categories and clusters an entire population group into a ‘suspect population’” (M’charek, Toom and Prainsack, 2012: e16; see also Oorshot and M’charek 2021).
EVC and BGA prediction can be most useful in criminal investigations when applied to minorities, because in most cases, a result pointing to the majority of the population will not help with focusing investigations. Hence the discriminatory effects generated by these technologies affect minorities only (Buchanan et al., 2018; Kayser and De Knijff, 2011:183; Lipphardt et al., 2016; Lipphardt 2018; Lynch et al., 2008; M’charek, Schramm and Skinner, 2014; Toom et al. 2016). Gannett argues that technologies for predicting biogeographical ancestry, along with the ancestry informative markers (AIMs) used for them, produce a re-enactment of racial thinking and racial taxonomies on a global scale (Gannett, 2014). According to Gannett (2014:183), BGA and AIM based technologies contribute to the “criminalization of racial minorities”. As STS scholars have demonstrated, the use of EVC and BGA technologies in investigations can lead to DNA dragnets of certain communities or minorities in which informed consent was not always obtained (McCartney, 2006; EUROFORGEN 2017; Williams and Wienroth, 2014, 2017).
The research question following from this strand of research is whether the management of DNA data from Roma in forensic genetics contributes to creating a “suspect population” and/or contributes to reinforcing stereotypes.
B. ELSA/ELSI research on “ethics dumping”
For a contextualization of our findings in the field of ELSA (Borck et al. 2018), the concept of ethics dumping is relevant. As Schroeder et al. (2018) write, the concept of ethics dumping was coined by the Science with and for Society Unit of the European Commission (EC):
“Due to the progressive globalisation of research activities, the risk is higher that research with sensitive ethical issues is conducted by European organisations outside the EU in a way that would not be accepted in Europe from an ethical point of view. This exportation of these non-compliant research practices is called ethics dumping.” (European Commission (nd) Ethics. Horizon 2020, cited in Schroeder et al., 2018: 2)
According to Schroeder et al. (2018), ethics dumping points to exploitative practices in international collaborations between developed and low or middle-income countries (LMIC). Exploitative practices occur when “research can be undertaken in an LMIC that would be prohibited in a high-income country” and when there is “insufficient ethics awareness on the part of the researcher, or low research governance capacity in the host nation” (Schroeder et al., 2018: 2).
Schroeder’s conceptualization does not claim that ethical standards are followed in the EU but not elsewhere; rather, the concept of ethics dumping as developed by the EC points to the use of double standards: EU countries may apply high ethical standards and ethical oversight at home, while allowing lower ethical standards or even no ethical oversight in non-EU countries in international partnerships.
The research question following from this strand of research is whether the treatment of DNA data
from Roma in forensic genetics and in other subdisciplines of genetics can be understood as
“ethics dumping”.
C. Further questions and hypotheses
We believe our research can contribute new questions and hypotheses to both fields. Firstly, as it
seems, forensic genetics is a field of applied research which is subject to two different ethical
regimes: one of investigation and one of academia. This might create tensions in forensic genetics
due to diverging notions of professional ethos.
Secondly, we argue that the boundaries between forensic genetics and other genetic subdisciplines are rather permeable, but as ethical standards are followed differently in these fields, boundary crossings might lead to the violation of ethical standards.
Thirdly, in the context of data sharing, for some of the genetics studies examined, we claim to have observed an ethically problematic data management practice we call “data laundering”, a concept we introduce below.
Fourthly, we argue that ethical and epistemic challenges are often closely intertwined. But this is not a major focus in this paper, as we provide a larger overview of methodological, conceptual and epistemological aspects of DNA studies on Roma in a different publication (Lipphardt, Rappold and Surdu, under revision). The findings of both our papers need to be seen against the historical backdrop of ca. 447 genetic studies on Roma that have appeared since 1921. These are medical genetic studies (271), population history genetic studies (131) and forensic genetic studies (45). Most of these studies (75%) appeared after 1990, when genetic studies were increasingly drawing on DNA methods instead of serology; but in spite of the methodological turn to molecular genetics, the studies continued a long-standing research tradition that began in 1921, addressing genetic topics by focussing on Roma.
3. Material
For this article, we have scanned the websites of YHRD and EMPOP and compared information
retrieved there with the information available in publications in forensic genetic journals and other
genetic journals. A systematic screening for population studies published in FSI:G reveals that
Roma have been a constant target of forensic geneticists since the inception of the journal. No
other minority or population group from Europe received a similar level of attention by ISFG.
Most of the 45 forensic genetic papers on Roma retrieved via standard search methods were published in peer-reviewed forensic journals, flagship publications of the field. Thirty-two of them appeared before 2011, that is, prior to the introduction of ethical requirements.²
Overall, there are 28 full papers, 7 announcements of population data³, 4 conference presentations, 3 short communications, one letter to the editor, one review of literature and one case report. 32 out of 45 publications are based on primary data (i.e. the research teams of the authors were involved in the collection of DNA samples). Out of 45 forensic genetic studies based on DNA samples collected from Roma, 34 were based on DNA samples collected in countries in Central and Eastern Europe (CEE). At least 10 sample collections took place in Hungary⁴; 5 studies were based on samples collected in Western Europe (3 studies in Spain, 2 in Portugal) and 1 in Greece; and two studies are based on samples collected in Turkey.
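The genre breakdown reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the study's own method: the dictionary keys and variable names are illustrative labels chosen here, and only the counts are taken from the text.

```python
# Genre counts as reported in the text for the forensic genetic publications.
# The key names are illustrative labels, not categories from a dataset.
genre_counts = {
    "full paper": 28,
    "announcement of population data": 7,
    "conference presentation": 4,
    "short communication": 3,
    "letter to the editor": 1,
    "review of literature": 1,
    "case report": 1,
}

# The genres should add up to the 45 publications named in the text.
total = sum(genre_counts.values())

# Share of publications based on primary data (32 out of 45, per the text).
primary_share = 32 / total

print(f"total publications: {total}")
print(f"primary data share: {primary_share:.1%}")
```

The sum confirms that the seven genres account for all 45 publications, and that roughly seven in ten publications rest on primary data.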
In addition to the forensic genetic publications, a number of population history genetic and biomedical genetic publications were taken into consideration for this paper, as well as conference and workshop proceedings.
For the comparison of research practices with field-immanent ethical requirements, we have reviewed ethical standards and guidelines, along with statements from academics and institutions.
4. Methods
For this analysis, we have examined the meta-information of DNA data of Roma in forensic genetic
databases and publications with a focus on
a) methodological criteria such as how Roma are represented in the database (metapopulation
assignment, proportion, geographical balance, sample size)
b) ethical criteria such as ethical approval, informed consent, publication transparency, data
transparency, privacy, transgression of disciplinary boundaries
c) sociological criteria such as international collaborations, institutional affiliations, and funding.
Drawing on online repositories and online search tools, we follow the DNA data from the YHRD
and EMPOP databases backwards: from each data set’s information to the referenced publication
where the data was first published, and from there to laboratories and actors involved in genetic
studies of Roma and other minority groups.
² The impact of the journals in which most of the papers appeared is rather high: FSI Genetics is first in the category “legal medicine/forensic science”, IJLM is second and FSI fifth.
³ About this genre, Carracedo et al. (2010: 145) write: “Announcements of population data consisted in short communications under a fixed format, avoiding the repetition of superfluous information (i.e., materials and methods) and concentrating the message on the key information needed for the use of genetic data for forensic and population genetics.”
⁴ We corrected for some datasets that were re-used for multiple studies (e.g. there are 15 datasets from Hungary, of which 10 are primary data and 5 secondary data). The primary data for forensic studies in all European countries comes from at least 3133 Romani DNA donors.
5. Ethical standards for genetic research with vulnerable populations
One central approach of this paper is to assess forensic genetic studies and DNA databases relative to the ethical standards internal to the field of genetics.⁵ What follows is a rough overview of standards, agreements and considerations published in the past decades and years. As these standards demonstrate, the relevant ethical risks have been well known and demands for action have been clearly articulated since the mid-20th century. However, as subsequent initiatives to improve ethical conduct show, these standards seem not to have been consistently implemented or enforced, especially for vulnerable, isolated groups such as the Roma.
A. International standards, protocols and agreements
In addition to national legislation regulating research, privacy issues and biobanking, there is a broad range of public documents from international and professional organizations (e.g. conventions, protocols, codes of conduct, declarations, policy documents, statements) guiding ethical conduct in genetic and biomedical research. It is beyond the scope of this article to do a thorough review of these documents; instead we will focus on those documents that are widely viewed as significant and, within those documents, on the aspects relevant for the purpose of this article. The main points are informed consent, the quality of the procedure for achieving the informed consent, as well as privacy and confidentiality.
Most relevant for our case study is the “World Medical Association Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects” (hereafter Helsinki Declaration), which was adopted in 1964 and amended several times, most recently in 2013. The Helsinki Declaration considers vulnerable groups and individuals as a special case for the enforcement of ethical standards in biomedical research. Particularly relevant for this paper is Article 20 which prioritizes the health needs and interests of the vulnerable populations over the scientific interests of the researchers.
The Helsinki Declaration extends the requirement of respecting ethical principles to the publications resulting from the research and makes the sponsors, editors and publishers of the research results co-responsible for the observance of ethical standards (art. 36).
The UNESCO 2003 declaration “International Declaration on Human Genetic Data” (IDHG) is particularly relevant in this context because Roma have long been stigmatized and discriminated against in Europe. Article 7 of IDHG is titled “Non-discrimination and non-stigmatization” and stipulates clearly that the interpretation of findings should not be conducive to stigmatization of individuals, families, communities or groups.
Article 15 of IDHG stresses that data should be interpreted with caution and sensitivity in order to
avoid negative social consequences.
⁵ There is a large body of academic literature, especially from the field of Science and Technology Studies but also from within human genetics, which deals critically with genetic research on minorities and vulnerable populations. Some studies have highlighted that cultural, political and social pre-assumptions of human groups inform the drafting of research designs and the collection of DNA data, and also reinforce existing stereotypical group notions. This criticism is mirrored and occasionally taken over by geneticists, as reflected in some of the ethical standards reviewed below. Moreover, for the sake of medical progress, some human geneticists call for more genetic studies on these populations, yet with utmost ethical sensitivity (Ben-Eghan et al., 2020; Curtis & Balloux, 2020).
Article 16 states that re-using data collected for a different purpose is not permissible except for
“prior, free, informed and express consent of the person concerned” or for “important public
interest reason” (p.44). In regard to the sharing of benefits, Article 19 of the IDHG mentions, among
others: “(i) special assistance to the persons and groups that have taken part in the research; (ii)
access to medical care; (iii) provision of new diagnostics, facilities for new treatments or drugs
stemming from the research; (iv) support for health services.” (p.42)
DNA data from Roma is typically shared within and across international research collaborations. The country of data collection is often not the country of data processing; while much of the data is collected in Eastern European countries, much of the processing happens in Western European countries. Article 6c of the IDHG states that reviews by ethics committees are necessary for all countries involved in the “collection, processing, use and storage of human genetic data”. In our case, this would apply to all countries in which the involved researchers have institutional affiliations, but also to countries where a database is maintained.
The Nagoya Protocol on Access and Benefit-sharing⁶ adopted in 2010 (hereafter Nagoya Protocol) was issued by the Convention on Biological Diversity of the United Nations and is informative for its focus on “fair and equitable sharing of the benefits arising from the utilization of genetic resources” from “indigenous and local communities”. The issue of benefit sharing is controversial among researchers and ethicists: while some argue that no incentives should be offered for the donation of DNA, others hold that donors should expect some benefits in exchange, yet not necessarily monetary or material ones.
The annex of the Nagoya Protocol lists a series of possible non-monetary and monetary benefits: (a) collaboration, cooperation and contribution in education and training, contributions to the local economy, food and livelihood security benefits, health and food security related research, social recognition; (b) payments for samples, royalties, joint ownership of property rights.
European Society of Human Genetics: As all published forensic genetic studies on Roma draw on DNA data collected in Europe, the positions on ethical standards taken by the European Society of Human Genetics (ESHG) are relevant in this context. Within the general ethical standards for DNA biobanking in biomedical research, the ESHG (2003a) stresses the need for special protection for vulnerable individuals and populations. Of specific interest for this article is the ESHG’s recommendation of additional collective consent in the case of minorities (ESHG 2003a:907).
The ESHG recommendations also provide a careful evaluation of benefits and harms of population genetic screening programmes. Among the potential harms that need to be considered, the ESHG focuses especially on the risks of stigmatization and discrimination (ESHG 2003b: S5). The ESHG took a more explicit position when it condemned “genetic testing to establish racial origins for political purposes” on scientific, social and moral grounds (ESHG, 13 June 2012).⁷
⁶ The full name of the protocol is Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization to the Convention on Biological Diversity. The document was adopted in 2010, in Nagoya, Japan.
⁷ This statement came as a response to a case in which a member of the extreme right Hungarian party, Jobbik, requested a certificate attesting that he did not have Jewish or Roma ancestors. The medical diagnostic company Nagy Gén Diagnostic and Research tested his genome for 18 genetic markers they considered to be specific for Roma and Jews. The resulting certificate concluded the member of parliament had neither Jewish nor Romany ancestry (Abott, 2012). It was issued just a few days ahead of local elections in Hungary. In the weeks before, Jobbik had successfully based its Parliament election campaign on strong anti-Semitic, anti-Roma and nationalistic sentiments (Good, 2012).
Following from these standards, an ethical approval procedure for a human genetic research project requires the approval by an independent ethics committee and the written informed consent of each participant. Furthermore, researchers are required to pay special attention to vulnerable groups and minorities. The informed consent needs to state clearly what research purposes the data may be used for. Ideally, for the use of their data for a new research project, a participant would each time have to be re-approached to give individual consent (e.g. for two different conditions within medical genetics). Broader consent options are under discussion; however, their practicality and ethical implications remain controversial.
B. Statements of scientists, scientific associations and scientific journals
It seems that the ethical standards listed above express noble ideals but have proven hard to maintain in practice. Hence, a growing number of geneticists and interdisciplinary teams emphasize that genetic research on humans must be conducted with utmost reflexivity and sensitivity, must provide appropriate sample information, definitions and ethical procedures, and must not disregard problematic implications of employing racial and ethnic labels (Curtis and Balloux, 2020; D’Amato et al., 2020; Ben-Eghan et al., 2020; Claw et al., 2018; Hindorff et al., 2018; Takezawa et al., 2014; Barbujani, Ghirotto and Tassi, 2013; Ali Khan, 2011; Royal et al., 2010; Lee et al., 2008). Most of these texts do not go beyond the biomedical field to consider forensic genetics, or beyond general conditions in English-speaking countries to consider the specific, concrete situation of vulnerable minorities.
As an exception to the former, Bard (2010) compares biomedical and forensic biobank regulation in EU member states, concentrating on Central and Eastern Europe. Bard (2010) states that in these countries widely differing legislations prevail, and makes recommendations for the ethical regulation of forensic genetic databases.
Regarding the latter, critical scholars have reported on what they consider exploitative research practices in genetic research involving subjects from various minorities, indigenous communities and vulnerable populations (see also Radin, Kowal and Reardon, 2013): for example, for San people in southern Africa (Chennells and Steenkamp, 2018); the Havasupai Tribe (Garrison, 2013); the Navajo Nation (Reardon, 2017); Native Americans in the U.S. (TallBear, 2013); and Australia’s indigenous people (Kowal, 2013). Actors from the communities concerned, from academia or NGOs have brought ethics dumping to the attention of the public. In some cases, they have been able to induce a change in the relationships between researchers and vulnerable populations, aiming for new, fair ethical arrangements.
For example, in the context of genetic research in Africa characterized as “‘helicopter’ research, in which foreign scientists take samples and data from communities and then return to their home institutions” (Nordling, 18 April 2018), a Community Engagement Working Group (CEWG) was established in 2015 in order to promote community engagement as a guide for developing best practice in genomic research. The community engagement as described by the CEWG (H3Africa Consortium, 2014:5) includes “consulting with gatekeepers of the community, soliciting the views and inputs of community members prior to, during and after research, feeding back research findings, to building partnerships with the community”. The aim of this engagement, involving a broad range of stakeholders, is to offer protection against stigmatization, exploitation and discrimination while considering the needs of the researched communities. In the U.S., for genetic research with indigenous groups, an ethical framework was adopted in 2018, asking the researchers to develop an appropriate understanding of community structures and of sovereignty, to engage with the communities during the whole research process, to build cultural competency in the research team, to improve transparency through communication, to build up capacities on a local level and to use culturally appropriate means for disseminating the findings (Claw et al., 2018).
Taken together, in these texts, a set of new, additional ethical standards emerges that is of paramount importance for the study of minorities and vulnerable populations:
Racialized/ethnic populations or indigenous groups in general:
• Minorities, indigenous groups and vulnerable populations require utmost sensitivity. Researchers should strive for an enhanced dialogue with, and participation of, these communities and their representatives during the entire research process, and consider forms of collective consent in addition to individual consent (D’Amato et al., 2020; Takezawa et al., 2014).
• Samples should be used exclusively for the declared purposes unless there is an explicit informed consent and the approval of an ethical committee for an extended use of data (D’Amato et al., 2020:2).
• The rationale for selecting a certain population, the criteria for assigning population labels to individuals, and the population labels themselves need to be adequately described, in order to avoid reinforcing racial and ethnic stereotypes (Ali Khan et al., 2011; Lee et al., 2008; Takezawa et al., 2014). Population labels should respect the cultural sensitivities of the populations concerned (Takezawa et al., 2014: 4). The use of racial and ethnic categories should be minimized in clinical genetics, opting instead for individual level data (Lee et al., 2008).
• Genetic research should provide benefits to disadvantaged groups (Curtis and Balloux, 2020).
• Genetic research should account for the inequities arising from unbalanced sampling, particularly with regard to minorities (Curtis & Balloux, 2020: 314).
• A broad range of expertise, including that of social scientists, needs to be included in genetic studies of human variation (Lee et al., 2008: 404.3).
Isolated populations specifically:
Genetic studies of isolated populations are ethically far more challenging than studies in the ge-
neral population and require heightened ethical awareness (Mascalzoni et al., 2010):
Privacy risks and the risk of de-anonymisation are higher for isolated populations than for
other populations (Mascalzoni et al., 2010). Information provided by individual participants
could also affect non-participants, as the genetic information extends to their families, com-
munities and population group.
Defining and framing a population as genetically isolated can be stigmatizing. Accordingly,
researchers should discuss with participants from the very beginning how research findings
will be framed (particularly if they intend to frame it in terms of inbreeding and consanguinity).
Transparency: The potential individual and collective risk resulting from participation should
be diligently explained, including the challenges in ensuring complete privacy for participants
from isolated populations. The researchers should specify with whom they will share the data
and ensure that data will not be shared with police or courts, or explain under what conditions
this might happen (e.g. by a court order).
Social pressure: In small communities, pressure from the family or community may interfere
with free consent.
Benefit sharing: Decisions about sharing potential benefits derived from the research should
be taken together with consideration of the communities under study. Researchers need to
seek appropriate ways of avoiding false expectations, but they should also not understate
potentially expectable benefits.
Substructure: When examining isolated populations, researchers risk tapping into, or
misinterpreting, substructure (Ehler & Vanek 2017). This can threaten the veracity of the
interpretation of forensic genetic casework, as it can introduce bias into forensic statistics and
analyses.
In court cases, “using appropriate population databases is very important especially when a
genetically isolated population might be involved” (Dotto et al., 2020: 102168.1).
6. Ethical standards in forensic genetics
A. Forensic genetics journals
The above ethical standards have been adopted late in forensic genetics in comparison to the
biomedical field. This is reflected by the late but welcome adoption of ethical requirements in 2010
by the main journal of the forensic genetics community, Forensic Science International: Genetics
(FSIG), the journal of the International Society for Forensic Genetics (ISFG).⁸ A requirement for
authors to explicitly mention the “informed consent and/or specific approval of a recognized ethical
committee” was introduced by FSIG in 2010 (Carracedo et al., 2010: 145). A similar requirement
was published by the editors of the International Journal of Legal Medicine (IJLM) in the same
year. At the same time, the IJLM asked authors to use “the correct ethonym [sic!]”, for example
“Roma” instead of “Gypsies” (Parson and Roewer, 2010).
In 2017, the EUROFORGEN Group, an EU funded platform assembling forensic geneticists and
bioethicists from 9 European countries, published a guide for the general public. Therein, the
group acknowledges the risk of overrepresentation and criminalization of minorities in forensic
databases: “There are also concerns that certain minority groups are disproportionately
represented on national DNA databases. Some argue that such inequalities could stoke feelings that
certain groups are being unfairly criminalized or discriminated against.” (EUROFORGEN 2017: 24).
In 2020, FSIG published a series of ethical requirements for authors, concerning the acquisition
of biological material including human DNA (D’Amato et al. 2020). These guidelines include the
observance of several protocols, declarations and conventions such as the Helsinki Declaration
and the UNESCO 2003 declaration. The guidelines ask prospective authors to present a template
of the consent form used for obtaining samples of human DNA. They prohibit the secondary use
of DNA samples and DNA data for other purposes than specified in the consent form, unless an
ethical committee has approved the secondary use. Consent given verbally or by video
recording is not considered an acceptable form of consent (D’Amato et al. 2020).
⁸ The International Society for Forensic Genetics (ISFG) is an international organization assembling more than 1200
forensic geneticists from 60 countries. ISFG emerged from the German Gesellschaft für forensische Blutgruppenkunde,
an organization founded in 1968 in Mainz (https://www.isfg.org/About/History) (accessed 12.10.2020).
In theory, the ethical procedures applied to publications should also lead to ethical conduct in the
collecting of DNA data in the future, in the transferring of DNA data to forensic genetic research
databases YHRD and EMPOP and in the curation of these databases. This is because data up-
loaded to both databases is understood to be published in a scientific journal before or shortly
after the upload.
To summarise, as D’Amato et al. (2020) suggest, in terms of ethical conduct, forensic genetics has
tried to catch up with the biomedical field. However, most data uploaded to forensic genetic
research databases, or else shared internationally across laboratories and disciplines, was
collected between the 1990s and 2010. Against this backdrop, an ethical assessment of
these databases and of the data-sharing practice in this field is a mandatory next step.
B. Forensic genetic research databases: YHRD and EMPOP⁹
Forensic genetic databases exist in two different varieties. Firstly, there are national DNA
databases (NDNAD), maintained by investigative authorities, containing autosomal STR data of suspects
or convicted perpetrators. The technical procedures are well established; these databases are not
hosted at a university, and extraordinary intervention by experienced scientists who are well
established in the scientific community is rarely necessary. A number of studies
show that NDNADs, which are also used in
court, can be biased by ethnicity, race, social class and geography (see e.g. Lynch et al., 2008;
Skinner 2013; Skinner 2020). But as these are subject to laws for criminal investigation, the above
ethical standards developed for research in human genetics do not easily apply (cf. Bard 2010).
NDNADs are not the focus of this article.
Secondly, there are forensically used DNA databases that do need constant academic oversight
by experienced and internationally acknowledged researchers: databases that hold populational
information about the frequency of STR profiles around the globe. They come in two kinds: Firstly,
there are databases for autosomal STR frequency information, used for the calculation of likeli-
hood ratios for random matches relevant in court (yet if 16 STRs are considered, this is hardly
necessary). Secondly, there are lineage-specific databases, e.g. YHRD and EMPOP. Y-chromo-
somal DNA (yDNA) information is passed on in a lineage from fathers to sons, and mitochondrial
DNA (mtDNA) is passed on in a lineage from mothers to all of their children. Because lineage
markers do not allow for individual identification (all members of a lineage share the same haplotype),
they should not be used for likelihood ratio calculations unless put into the context of lineage matches.
For example, a hair found at a crime scene typically contains only mtDNA. If a suspect’s
mtDNA matches the hair’s mtDNA, that does not identify the suspect as the perpetrator, because
the exact same mtDNA haplotype belongs to many individuals in the same mtDNA lineage. But a
query in EMPOP allows one to roughly assess how frequent such a haplotype would be in a given
country or region. If it is frequent, this would be information in favour of the suspect. However, as
one can imagine, if populations are not represented well or not at all in the databases, this can
lead to misinterpretations of results that can only be controlled by outstanding academic expertise.
⁹ Y-Chromosome Haplotype Reference Database (YHRD); EDNAP mtDNA Population Database (EMPOP). We have
not examined the database ALFRED, though it does contain a relatively small number of datasets from Roma; and we
have not examined STRIDER, which, to our knowledge, does not contain data from Roma.
So, these lineage databases provide for applications within criminal investigation: to assess the
frequency of matches in a population; and, though not legally in Germany, to predict
biogeographical ancestry.¹⁰ In addition to these applications, for academic forensic geneticists, the
databases also have research functions. But what is much more important, these databases, in order
to fulfill their investigative function in the most reliable way, need to fully live up to the standards
of research: Investigators draw on their services because the databases promise authoritative
scientific expertise in an intelligence gathering operation that is anything but simple to understand.
From the prosecutor’s perspective, any DNA data involved in criminalistic matchmaking processes
and likelihood calculations might be seen as subject to laws for criminal investigation but not as
subject to ethical research standards. However, as these databases are publicly available for
research purposes, and as they are meant to guarantee the scientificity of their applications in investigation
and in court, they should arguably observe research ethical standards, such as free written
informed consent and an ethics committee’s approval. And indeed, for guidance on ethical issues,
both databases point to the guidelines of the two leading scientific journals in the field.
The DNA data in the YHRD and the EMPOP is thus subject to two different ethical regimes: as
elements of research databases, it is no different than data in any other genetic database for
research purposes, hence the same ethical standards should apply. As a data basis for criminal
investigation and court procedures, its use is not understood to be subject to these ethical stan-
dards (although data collection, and information about its population representativeness, should
be). This makes the ethical status of DNA data in YHRD and EMPOP worthy of discussion.
In what follows, this article will focus particularly on YHRD as a database hosted in Germany. The
YHRD (accessible online at https://yhrd.org/ ) is a Y-chromosomal STR database intended “to
be used in the quantitative assessment of matches in forensic and kinship casework”, and as
“support of forensic caseworkers in their routine decision-making process” (Roewer et al., 2001:
107). Launched in 2000 by Lutz Roewer, the database is hosted at Charité, Berlin, Germany (Roewer
et al. 2001) and has become the largest and most widely used Y-chromosomal STR-profile
database. According to the website, it is publicly available because “it is not restricted to registered
users”. Moreover, the database aims to be “representative of the geographical and ethnical
structure of the populations of interest” (Roewer et al., 2001: 107).¹¹
¹⁰ The database can be used in two different ways (Ellebrecht and Weber, 2021, forthcoming): Firstly, if no atDNA
STRs of a perpetrator can be obtained, but Y-STRs can, and if a match between the unknown DNA trace and a Y-STR
profile in an NDNAD has been achieved, the prosecution still has to demonstrate in court a low random match probability
(RMP). The RMP is higher if the Y-STR profile is more frequent in the suspect’s population. Secondly, in the case of
trace-trace matches or if there is no match at all in the NDNAD, investigators use the YHRD to predict biogeographical
ancestry (BGA) from unknown DNA traces. In Germany, the prediction of BGA for investigative purposes is forbidden.
However, as Ellebrecht and Weber (2021, forthcoming) demonstrate, the database has nevertheless been used in the
past by criminal investigators for this purpose.
The EMPOP (accessible online at https://empop.online/), launched in 2006 by Walter Parson at
University of Innsbruck, Austria (Parson and Dür, 2007), has become the largest database for
mtDNA haplotypes and provides complementary services to the YHRD. In spite of their international
importance, the databases seem to be underfunded, making quality maintenance difficult.
The keepers of both forensic genetic research databases claim to impose strict quality controls on
contributors before they can upload DNA data to the database. For contributing to YHRD, labora-
tories have to perform “a quality assurance exercise prior to the inclusion of any data” (Roewer et
al., 2001: 107). The overall quality control procedure of YHRD refers mainly to technical quality
assurance (https://yhrd.org/pages/help/contribute), which needs to be certified or proven through
the submissions of specific certificates or technical files. Three points in the guidelines for contri-
bution refer to matters other than technical quality criteria: “You or your lab has to analyze the
population samples and have the right to submit/publish the data. It is not possible to submit data
of another lab or collected data found somewhere on the internet.” And more specifically, “Infor-
med consent and/or specific approval of a recognized ethical committee are required for all data.”
(https://yhrd.org/pages/disclaimer). Thirdly, contributors shall not keep any personal information
that would allow re-identification of individuals from the data sets.
However, while for the technical quality assurance and data accuracy, substantial evidence is
mandatory, the two non-technical requirements can be met by simple statements. Likewise, for
the protection of personal information, a simple statement is enough.
The keeper of EMPOP aims for a “quality control that meets forensic standards” (Parson and Dür,
2007). The EMPOP database has a multi-stage quality control aimed at increasing data accuracy
and diminishing errors (https://empop.online/methods). However, as with YHRD, ethical standards
are left to the responsibility of data contributors. In the questionnaire, a contributor needs to state
that the ethical approval by an ethical committee has been obtained, provide a blank informed
consent sheet, the ethics committee’s name, and the application number. In its terms of use the
EMPOP database specifies that the entire liability for the contents rests with the contributors:
“The service provider is not obliged to supervise information that is allocated by third parties or
to search for circumstances that refer to illegal agitation. The commitments to the removal or the
inhibition of use of information according to general law remain thereof unaffected. Contents will
be instantly removed when pertinent infringement becomes known.”
(https://empop.online/terms_of_use)
Hence, for ethical quality, both databases need to rely on trust only. To assess the ethical quality
of contributed data, specific expertise and additional resources would be required. Considering
that both databases are underfunded, it is hardly surprising that they cannot enforce ethical quality
standards, as will be shown in the following analysis.
¹¹ This is noteworthy because the collection of ethnic data is generally viewed as highly problematic in Germany. Genetic
data of Roma, collected in Europe and ethnically labelled, may undermine both data protection policies and guidelines
for handling ethnic data (Goodwin Morag, 2017, presentation given at the Workshop Race and ‘the Roma’, University
of Amsterdam, 17 November 2017).
II. Results
Three main findings are highlighted here: overrepresentation, ethical non-transparency, and
breaching of disciplinary boundaries.
1. Representation of Roma in databases and journals
Our analysis shows that forensic geneticists take a pronounced interest in Roma; they are the most
researched European population in forensic genetics. We identified 45 forensic genetics papers
published after 1990 which focused on Roma. This interest is paralleled by an overrepresentation
of Roma in forensic genetic databases. Furthermore, Roma are not ascribed to European
populations in the databases.
Metapopulation assignment of Roma: In YHRD, any sample obtained from Roma, no matter from
what European country, is viewed as belonging to the metapopulation “Indo-Iranian”, not
“European”. In EMPOP, such samples are placed in the category “Westeurasians”.
Proportion/Overrepresentation: Roma are overrepresented in some national databases in YHRD
in comparison to other citizens. For example, the “national database” of Bulgaria in YHRD has
311 data sets categorized as “Romani”, 218 as Bulgarian and 61 as Turks (available at:
https://yhrd.org/tools/national_database/Bulgaria, accessed 19 Oct 2020). The national database
“Hungary” comprises a total of 619 data sets labeled “Romani” and 962 categorized as Hungarian
and other minorities (available at: https://yhrd.org/tools/national_database/Hungary, accessed 19
Oct 2020). Roma are overrepresented in the EMPOP when considering the total number of
datasets in the “Westeurasian” category, the number of populations ascribed to this label, and
also with regard to national populations. In all of these cases, the proportion of Roma in the overall
national population is much smaller than in the database. This may have implications when inter-
preting matches.
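The database shares behind the YHRD examples can be recomputed from the counts quoted above. The following is a minimal sketch in Python; the helper names `database_share` and `overrepresentation_ratio` are our own illustrative choices, and no population figures beyond the counts quoted in the text are assumed.

```python
# Counts as quoted in the text (YHRD "national databases", accessed 19 Oct 2020).
BULGARIA = {"Romani": 311, "Bulgarian": 218, "Turks": 61}
HUNGARY = {"Romani": 619, "Hungarian and other minorities": 962}


def database_share(counts: dict, label: str) -> float:
    """Fraction of all data sets in a national database that carry `label`."""
    total = sum(counts.values())
    return counts[label] / total


def overrepresentation_ratio(db_share: float, population_share: float) -> float:
    """Hypothetical helper: how many times larger the database share is than
    the group's share of the reference population (to be supplied by the user)."""
    if not 0 < population_share <= 1:
        raise ValueError("population_share must be in (0, 1]")
    return db_share / population_share


if __name__ == "__main__":
    for name, counts in (("Bulgaria", BULGARIA), ("Hungary", HUNGARY)):
        share = database_share(counts, "Romani")
        print(f"{name}: {share:.1%} of data sets labelled Romani")
```

On these counts, roughly 53% of the data sets in the Bulgarian and 39% of those in the Hungarian YHRD national database are labelled “Romani”. The ratio helper is left uncalled on purpose, since the national population shares are not stated here and would have to be taken from census sources.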
Geographical balance: Most studies on Roma come from CEE countries, and within the CEE re-
gion, Hungary was by far the main location for sample collection. For Hungary, most samples used
in forensic publications were collected in a single county, i.e. Baranya county (where, according
to the 2011 census, 4.66% of 386,411 inhabitants are Roma). Some data sets from these studies
in Hungary have been shared many times within and beyond forensic genetics teams. While Hun-
garian datasets make up a large part in YHRD, Hungary and Bulgaria are equally strongly repre-
sented in EMPOP.
Sample Size: Compared to sample sizes of Roma in human population genetics, the samples
used for forensic genetics studies are relatively small, ranging from 22 individuals to a maximum
of 208 individuals. Rules regarding the minimum sample size were first issued by FSIG in 2013
(Carracedo et al., 2013), i.e. some years after most of the studies had already been published,
which is why most of the studies published before 2013 do not meet the current standard.¹² The
editors of FSIG, however, allow for multiple exceptions. In this regard, any paper which includes
“forensically relevant information” or “small but important populations or ethnic groups” could be
exempted from the minimum sample size standards (Gusmão et al., 2017). This can easily be
stretched to apply to Roma groups. There are no other quality standards in place that would have
to be applied in those cases, although methodological and conceptual quality is of utmost
importance, particularly with small and ethnic groups (Ehler and Vanek, 2017).
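As a rough illustration of the gap, the minimum sample sizes reported for FSIG (the 2017 values given in the footnote on sample size requirements) can be set against the smallest and largest Roma samples mentioned above. This is only a sketch: the dictionary keys are our own labels, and since the text does not say which marker type each study used, each sample size is simply checked against every threshold.

```python
# Minimum sample sizes as reported in the text for FSIG (Gusmão et al., 2017).
# The keys are illustrative labels, not terms taken from the journal guidelines.
MIN_SAMPLE_SIZE_2017 = {
    "autosomal_or_X_STR": 500,
    "Y_STR": 400,
    "Y_STR_plus_Y_SNP": 300,
    "mtDNA": 200,
    "mtDNA_full_genome": 100,
}


def thresholds_met(sample_size: int) -> list:
    """Return the marker categories whose minimum size the sample reaches."""
    return [
        marker
        for marker, minimum in MIN_SAMPLE_SIZE_2017.items()
        if sample_size >= minimum
    ]


if __name__ == "__main__":
    # Smallest and largest Roma sample sizes mentioned in the text.
    for n in (22, 208):
        print(n, thresholds_met(n))
```

Under these assumptions, the smallest sample (22) reaches none of the thresholds, while the largest (208) reaches only the two mtDNA thresholds.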
Discussion: There is no simple answer to the questions why, and to what effect, Roma are over-
represented in these databases. First of all, for the technology of BGA prediction by lineages, used
when there is no atDNA match in the NDNAD, the problem has hitherto been largely neglected by
academics. It might be expected that in BGA prediction, overrepresentation implies a concrete
risk of wrong assignments, overfocussing and misrepresenting the group. In order to better cha-
racterize this risk, a transdisciplinary examination would be necessary.
Secondly, in the case of calculating RMP, an assessment is even more complex: an overrepre-
sentation of small and isolated groups might pose hitherto unrecognized problems for forensic
statistics and its use in court, but it could also bring about effects that are advantageous for Roma
suspects. On the one hand, overrepresentation of a population could result in a higher RMP and
thus in exoneration, because the match proves to be of limited evidential weight. On the other
hand, if the reference population comes from one small isolated community that cannot represent
any other Roma communities, this could perhaps yield distorted results. The finding of a haplotype
linked to a particular sub-population within a country dataset can be misinterpreted, which could
exacerbate the problem. However, whether this poses a real and unfair threat for Roma suspects
seems to depend on the technicalities of how one can search and use the database. This might
be different for EMPOP and YHRD. Here, a transdisciplinary examination would be necessary too.
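To make the direction of this effect concrete, the sketch below compares a pooled, unweighted haplotype frequency with one weighted by population share. All numbers are invented for illustration, the naive counting estimator is an assumption of this sketch and not the procedure used by YHRD or EMPOP, and the function names are hypothetical.

```python
def counting_frequency(matches: int, sample_size: int) -> float:
    """Naive estimate: share of reference profiles that carry the haplotype."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return matches / sample_size


def pooled_frequency(groups) -> float:
    """Pool all subgroups without weighting, as if the reference data
    mirrored the real population."""
    total_matches = sum(matches for matches, _ in groups)
    total_size = sum(size for _, size in groups)
    return counting_frequency(total_matches, total_size)


def weighted_frequency(groups, population_shares) -> float:
    """Weight each subgroup's frequency by its share of the real population."""
    if abs(sum(population_shares) - 1.0) > 1e-9:
        raise ValueError("population_shares must sum to 1")
    return sum(
        counting_frequency(matches, size) * share
        for (matches, size), share in zip(groups, population_shares)
    )


# Invented reference data: (profiles carrying the haplotype, profiles sampled).
minority = (30, 300)   # overrepresented subgroup in which the haplotype is common
majority = (2, 200)    # rest of the national reference data
groups = [minority, majority]
shares = [0.05, 0.95]  # invented population shares of the two subgroups

if __name__ == "__main__":
    print(f"pooled:   {pooled_frequency(groups):.4f}")
    print(f"weighted: {weighted_frequency(groups, shares):.4f}")
```

With these invented counts the pooled estimate (0.064) is several times the weighted one (about 0.0145), so a reference set in which a small group is overrepresented would make the haplotype look more common than it is in the population as a whole. Which direction the bias takes in real casework depends on how the database is searched, as noted above.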
There are also societal issues: firstly, privacy risks are even higher for isolated communities if the
sample number for their population is higher; and according to the new EU privacy law, privacy
risks do not only apply to individuals and their individual information but also to ethnic groups and
their ethnicity information. Secondly, police forces appear to collect a lot of DNA from Roma, but
probably not with the motivation to accomplish the exoneration of Roma suspects in court; rather,
arguably, to gain more information about what they perceive as group criminality.¹³ Forensic
geneticists, for their part, collect and use data from Roma because Roma are seen as a scientifically
interesting population. For both professional groups, Roma are what Roewer calls a “population
of interest”. Whether Roma are overrepresented in NDNADs (for example in Germany),
and if so, with what consequences, remains an open question.
¹² Currently the minimum sample size requirement for publication in FSI Genetics is 500 samples for
autosomal and X-chromosome short tandem repeats (STRs), a prescription which has been maintained since 2013
(Carracedo et al., 2013; Carracedo et al., 2014; Gusmão et al., 2017). For the Y-chromosome and mtDNA studies, the
minimum size was decreased from 250 samples in 2013 (Carracedo et al., 2013) to 200 samples in 2014 (Carracedo
et al., 2014). In 2017 for Y-chromosome studies, the standard was raised to a minimum of 400 samples for Y-STR analyses
and to 300 samples when both Y-STRs and Y-SNPs (single nucleotide polymorphisms) are analysed; for
mtDNA the number of 200 samples is currently in use except for full genome analyses where only 100 cases are
required (Gusmão et al., 2017).
¹³ Police forces focus on what they call “criminal networks of Roma ethnicity” (EUROPOL 2016: 18). However, both in
data collection and in police practices, the focus extends to “ethnic Roma” and is not limited to “criminal networks”.
EUROPOL views Roma as an ethnic group and ascribes criminal activities, especially trafficking in human beings, to
“ethnic Roma”. “Bulgarian and Romanian (mostly of Roma ethnicity), Nigerian and Chinese groups”, the report claims,
“are probably the most threatening to society as a whole.” (EUROPOL 2011: 26). In 2010, the European Union Agency
for Fundamental Rights (FRA) reported that Roma, amongst other ethnic minorities in Europe, are more frequently
stopped and searched by the police when compared to majority populations (FRA, 2010).
Forensic genetic events
In accordance with the publication record, the overrepresentation of Roma in forensic genetics
can also be found in forensic genetics’ conference and workshop programmes:
DNA data of Roma have been a frequent topic of discussion at ISFG congresses since 2000
(e.g. see ISFG program and abstract books 21, 22 and 26 for the congresses from 2005,
2007 and 2015; available at https://www.isfg.org/Meeting, accessed 19 Oct 2020).
Talks and presentations regarding DNA data of Roma also recur at the Y-Chromosome User
Workshop, convened every other year since 1996, run by the main forensic organizations of
Western Europe (e.g. program and abstracts books for the workshops no. 4, 5, 6, 7; available
at: https://yhrd.org/pages/help/workshops, accessed 19 Oct 2020).
2. Informed consent / ethical approval
Our main interest here is whether the ethical quality safeguards of forensic genetic databases and
journals work in practice if their accomplishment is based on trust alone.
A. Publications
Since the leading forensic genetics journals only introduced ethical approval and informed
consent requirements in 2010, it is hardly a surprise that only few publications published before 2010 make
statements about ethical approval and informed consent.
Twenty-eight of the 32 forensic genetics papers reporting on primary data do not report on a
review-and-approval procedure by an ethical committee. Twenty of 32 studies reporting primary data
do not specify whether informed consent had been obtained from their subjects. A requirement for
authors to explicitly mention the “informed consent and/or specific approval of a recognized ethical
committee” was only introduced by FSIG in 2010 (Carracedo et al., 2010: 145). A similar
requirement was made by the editors of IJLM the same year. However, this does not seem to apply
retrospectively when data collected before 2010 is shared. Only 7 out of 32 publications based on
primary data were published after 2010. All of them acknowledge either informed consent or
ethical approval of an ethical committee: 3 papers provide information on both informed consent and
ethical approval; 2 provide information on informed consent but not on ethical approval and 2
provide information on ethical approval but not about informed consent.
B. Publication references and unpublished data in databases
Not all data uploaded to YHRD and EMPOP comes with a publication reference; this seems to be
the case more often in YHRD than in EMPOP. In a database like YHRD that makes genetic data
publicly available to anyone without registration, a scientific publication is an important step for
demonstrating that one has the right to make data publicly available. This might be different for
EMPOP, where one has to register for using the database.
Different reasons could account for unreferenced data: both databases accept data uploads prior
to the proper publication of a paper. FSIG and IJLM have made it a requirement for authors to
send their data through the quality check of YHRD/EMPOP before publication can be granted.¹⁴
This makes the upload, after a successful quality check and before the predictable publication,
an obvious, but not a mandatory step. In other cases, contributors who upload data before
publication seem to simply forget to add a reference once the publication is out. This was the case
with DNA data from Roma coming from a non-forensic publication (Martinez-Cruz et al., 2015:
data upload is dated June 2015, publication Sept. 2015.) A publication reference was missing on
the YHRD website until Oct 2020 and was added after we made an inquiry to the keeper of the
database. Currently, all data from Roma in YHRD and EMPOP comes with a publication reference.
Theoretically, if all quality assessments for uploads were as reliable as they are supposed to be
in the journals, the upload to the database could be counted as publication.
However, there are other cases, not related to Romani samples but instructive for the YHRD pro-
cedures, where no publication reference can be provided and no similar quality control is in place.
This is the case, for example, with two samples uploaded in 2017 that the contributors claim to
have collected in Germany: one with 32 individuals labelled “Afghan”, the other with 42 individuals
labelled “Romanian”. Furthermore, there are several samples collected in various German federal
states without a publication reference. The senior contributor of these datasets is Gerhard Baessler
at the criminal investigation authority of the federal state Baden-Württemberg (Landeskriminalamt,
LKA BaWü). Baessler was unavailable via phone or email due to his recent retirement. In a
phone conversation with a person responsible for the department, we were told that publication of
data in a scientific journal is not required, but that uploading data to YHRD is a publication process
in itself. After we insisted that this was not the required procedure, we were provided with a
publication that publishes data from Germany along with data from many places around the world (Purps
et al., 2014); however, this publication does not include the data sets mentioned above. Our
request for a publication reference for these data sets received no response. Employees from
three other Landeskriminalämter have uploaded several hundred genetic data sets each to YHRD
as contributors without providing a publication reference. The LKA of Baden-Württemberg has
contributed most; no other LKA has contributed genetic data from minorities. Criminal investigative
authorities in Germany do not report to an ethics committee.
While we have only looked at data in the national database “Germany”, we assume that (a) more
such unpublished data sets from other countries are in YHRD and (b) investigative authorities are
sometimes the providers of such datasets. At least some unpublished data sets have obviously
undergone no quality check for ethical procedures; criminal investigative authorities may not have
the same awareness of the ethical principles for treating genetic data appropriately that should
prevail in scientific circles. Moreover, these are publicly available data for the use of the scientific
community which should strictly follow ethical standards.
¹⁴ “For mtDNA papers, previous acceptance of the dataset in EMPOP is required; for YSTR and YSNP data, previous
inclusion of the data in the YSTR/YSNP database is required.” (https://www.springer.com/journal/414/submission-guidelines.
Accessed 31.10.2020.)
We hold that the collection of genetic data without consent and ethical approval, and the upload
of such data to a publicly available database such as YHRD, is untenable and not justified by the
ethical requirements of the database. Even though the data is anonymized, the de-anonymization
risk is high, in particular for isolated groups and minorities, as is the danger of stigmatization and
discrimination arising from such data.
C. Informed consent / ethical approval in databases
Even though YHRD requires contributors to demonstrate they have undergone appropriate ethical
approval and informed consent procedures, the data in the database, even if a publication is re-
ferenced, does not always come with a transparent demonstration; in many cases it does not
come with any demonstration at all. This is mostly due to the fact that ethical procedures have not
been required in forensic genetic journals until 10 years ago. Thus, publications either simply do
not state any informed consent or ethical approval, although the authors may have applied them
(albeit perhaps not for a genetic but a medical purpose), or there has never been any attempt to
apply these procedures.
Legally, the authors can point to the fact that ethical approval and informed consent was not re-
quired from them by the journals until 2010. Seen from an ethical perspective, however, these
topics should have been dealt with much earlier by the overall community of forensic genetics, as
it is still part of the overall scientific community of genetics, with enough networks and overlap. But
even if in a legal perspective, the collection of genetic data did not require ethical procedures at
the time and the place where it happened, that data cannot simply be re-used for all kinds of
purposes from an ethical point of view.
And if an informed consent has been obtained, there is still the question what it was obtained for:
for medical purposes; for research into the history of a population; or for uploading data to a da-
tabase used by criminal investigators? For a clearly defined purpose, or for any kind of research?
In the case of isolated communities, were DNA donors made aware of the privacy risks during the
informed consent procedure? This is crucial because the re-use of data is subject to specific ethical
standards that do not allow for it unless so stated in the informed consent sheet. While it is not
unlikely that somebody donating to a medical genetic study on a certain genetic disease would
not mind the data being used to study other medical conditions, it is unlikely that the transfer of data
between medical and forensic databases would be acceptable for most donors; especially since
around the globe, the willingness to donate DNA is low for health-related purposes (Middleton et
al., 2020) and probably even lower for forensic uses.
Also, not all informed consent stated in a manuscript up for publication might be reliable. Recently,
authors had to retract papers from scientific journals because the genetic data had been collected
from Uighurs in China under ethically untenable conditions and with participation of Chinese in-
vestigative authorities (Moreau, 2019; Zhang et al., 2019 – retracted). While those genetic publi-
cations had stated ethical procedures and were published on that basis, the retraction obviously
followed the argument that under the current conditions in China, such a statement cannot be
trusted. Hence trust might not be enough when it comes to ethical safeguards in publishing
genetic data from minority populations, especially in the case of Roma, a minority long
discriminated against by the police and state authorities.
Against this backdrop, the data from Roma uploaded to YHRD and EMPOP deserves closer at-
tention. Firstly, both databases contain data from published forensic studies that do not explicitly
state informed consent. Secondly, both databases also contain data from studies with research
goals other than forensic genetics, as, for example, population history genetics or medical gene-
tics. In these cases, informed consent cannot simply be trusted either. And thirdly, for some data
in publications and/or in databases, the consent may have been obtained for purposes other than
forensic research and application. In many cases, it is hard to imagine that it was a voluntary
“donation” to a forensic database or for a forensic genetic publication. Here are some examples
of problematic issues relating to informed consent for published studies and forensic databases:
Gusmão et al. (2008a) in Annals of Human Genetics reports a sample of DNA from 126
“Portuguese Gypsies” collected with informed consent in a population history genetic study.
Yet, the same sample was also used for a forensic paper (Gusmão et al., 2008b) without an
acknowledgement of informed consent. The original consent could have been obtained without
an explicit disclosure of a potential forensic use of the sample.
Salihovic et al. (2011) in American Journal of Physical Anthropology states several research
purposes, but not any forensic purpose.[15] Yet the primary data (from Croatian Rom*nja)
published in that paper was included in the EMPOP database. The authors also re-used data
from Roma published elsewhere: 204 samples from Hungary, published in Irwin et al. (2007),
with no statement on ethical procedures (see below); 147 samples from Bulgaria and Lithuania:
Gresham et al. (2001), stating ethical procedures (see below); an unclear number of samples
from Croatia: Klaric et al. (2009), with no statement on ethical procedures; 69 samples from
Poland: Malyarchuk et al. (2006); 138 from Portugal and 76 from Spain: Mendizabal et al.
(2011), both with statements on ethical procedures. These re-used data sets were not uploaded
to EMPOP under the reference to Salihovic et al. (2011).
With regard to data-sharing transparency, however, we followed up the two publications of primary
data referenced by Salihovic et al. (2011) that have been most widely shared for diverse purposes
since their publication: Gresham et al. (2001) and Irwin et al. (2007). From both of these
papers, data was uploaded to EMPOP without transparent procedures, as we will show below.
Irwin et al., 2007: This paper does not state any informed consent or ethical approval, although
the main author is affiliated with a U.S. institution. The paper is based on samples from Hungarian
Roma, collected in the 1990s by co-author Miklos Angyal, affiliated with the Forensic Science
Unit (Baranya County Police Department). As the authors state, some of the data is the same
as in three other publications, one of which had already been published in 1997, that is, 10 years
earlier, and none of which states informed consent or ethical approval: Füredi et al. 1997, Egyed
et al. 2000, Egyed et al. 2006.
[15] The aim of the study is framed as follows: “[this] study is a part of the ongoing multidisciplinary
anthropological, molecular-genetic, and epidemiological investigations of the Roma population in
Croatia” (Salihovic et al., 2011: 263).
Gresham et al. (2001): This study provides the first population-history account of genetic data of
Roma merged from three different national contexts: Bulgaria, Lithuania and Spain. The Bulgarian
data was collected in the framework of a medical screening for genetic diseases described
in Tournev 2016.[16] The authors state: “Informed consent for both aspects of the study
[medical genetics and population history, V.L.] has been obtained from all individuals involved.
This study complies with the ethical guidelines of the participating institutions.”
Gresham’s PhD thesis, where the data collection is described in greater detail than in the pub-
lication, can be downloaded from the Internet:
https://ro.ecu.edu.au/cgi/viewcontent.cgi?article=2517&context=theses. In this manuscript,
Gresham states that consent was “informed oral consent” (p. 82), notably obtained in the late
1990s in post-communist countries. However, as Gresham writes, the data collection was not
done by himself, and a forensic purpose was not intended. Furthermore, the thesis provides
the exact names of the villages where samples were collected.
In accordance with the ethical standards listed above, one would expect that this data cannot be
uploaded to genetic databases like EMPOP or YHRD. In the next case, we will demonstrate in
detail that non-transparent and impermissible uploads happen nonetheless:
Martinez-Cruz et al. (2015): The data from this population history study has been uploaded to
both EMPOP and YHRD. The authors do not mention a forensic purpose for data collection.
They state: “One thousand seven hundred and thirty-seven unrelated individuals self-identified
as Roma (N=753) or non-Roma (N=984) were collected after the corresponding ethical approval
and informed consent.”
However, the authors do not specify what portion of the data is primary data they have collected
and analysed themselves. Of seven national data sets, four (Ukraine, Slovakia, Romania and
Greece) are not referenced to another publication, which implies that they are primary data.
According to the authors, however, the sample from Greece comes from four named villages,
and thus seems to be identical with a sample collected in the very same villages, as noted in
an earlier publication by the same authors (Mendizabal et al., 2012 – see below); there, however, a
reference to the original publication is given (Deligiannidis et al., 2006). These 57 Greek
samples were initially published in a forensic genetic journal without informed consent or ethical
approval.
Three other data sets in this study are referenced to four other publications (Bulgaria: Morar et
al. 2004 and Gresham et al. 2001; Hungary: Pamjav et al. 2011; Spain: Mendizabal et al. 2011,
see below.) Pamjav et al. 2011 do not provide any information on informed consent or ethical
approval; for Gresham et al. 2001, see above.
Mendizabal et al. 2011 (whose author panel is overlapping with that of Mendizabal et al. 2012,
see below) states that written informed consent and ethical approval for the samples collected
on the Iberian Peninsula were obtained. This might be correct.
However, in a recent publication, the same senior author together with some of the co-authors
from Mendizabal et al. 2011 has reported on informed consent in other studies in an incorrect way:
[16] Tournev describes the sampling strategy as follows: “A neurological screening of hereditary neuromuscular disorders
using the “door to door” method was performed in 2500 towns and villages (with a predominantly Roma population) in
the country. Those towns and villages where pedigrees with hereditary neuromuscular disorders resided were visited
between 2-10 times with the aim of collecting pedigree information, blood samples for genetic studies and a neurological
examination of the patients. The field work studies covered a period of 20 years (1994–2014). 97% of the Roma
population living in compact Gypsy quarters was encompassed.” (Tournev 2016: 99)
Font-Porterias et al. 2019 states: “Written informed consent was obtained from all the
volunteers and the present project has the corresponding IRB approval (CEIC-Parc de Salut
Mar 2016/6723/I).” This might be correct for the samples the team has collected themselves.
But in addition, they use the merged dataset from Mendizabal et al. (2012) which contains re-
used datasets published without acknowledging informed consent or ethical approval (see
below).
Mendizabal et al. 2012: In this publication, the authors claim to have obtained informed consent
from participants. But the study reportedly re-uses the same genetic data from Greece which,
as demonstrated above, does not come with written informed consent. For all other portions
of data in Mendizabal et al. 2012, no publication is referenced, but this is most likely not primary
data: much more likely, it is shared data from Gresham et al. 2001, where no written informed
consent was obtained. Whether the team has really collected primary data, or whether the
publication reference is simply missing, remains unclear. Informed consent information seems
to be completely unreliable in these cases.
Ignoring privacy risks, Martinez-Cruz et al. (2015) and Gresham’s PhD thesis provide very precise
information on some villages where Roma were sampled; on that basis, it might be quite easy to
de-anonymize and re-identify the donors.
No matter whether these inconsistencies are caused by purposeful, unintentional or negligent
behaviour: this non-transparent data handling is clearly not in accordance with ethical standards and
requires action on the part of database keepers, journal editors and the forensic genetic community.
The data sets and the studies concerned should be retracted, except for cases for which the
authors and database maintainers can present proof of sound informed consent.
III. Contextualizations
In this chapter, we contextualize our findings in the larger context of biomedical research in inter-
national research collaborations, spanning academia and industry as well as police forces.
Some of the interdisciplinary and international collaborations involve authors from several fields,
including medical, forensic, or general departments of biology or genetics at universities, or from
clinics, health administrations, police forces and private entities such as Applied Biosystems,
which is part of the Thermo Fisher Scientific corporation (Martinez-Cruz et al., 2011), and the
biopharmaceutical company Pfizer (Maria Saiz et al., 2014).
Furthermore, because collaborations for genetic studies of Roma have a long and rich history, we
contextualize our findings in the history of genetic research.
1. DNA Data crossing boundaries between Forensic Genetics and Medical Genetics
For informed consent to be fair and truly informed, researchers need to inform participants
about all intended uses, or ask for broad consent for all kinds of purposes. Even then,
participants need to be informed about the scope of potential uses and risks. When donors are
approached for DNA samples, a medical purpose will be met with much more openness than a forensic
purpose, in particular if, in the latter case, the research makes an isolated group more vulnerable
to ethnic profiling and police interventions or if it threatens to violate privacy. Moreover, given
that the prediction of medical conditions from DNA for criminal investigation is illegal in many
nation states, it is clear that there need to be boundaries between these applications of genetic
data. Interdisciplinary collaborations between forensic and medical genetics point to data transfers
between fields which may follow ethical standards to different degrees.
We hence start from the premise that it can be ethically problematic if medical and forensic genetic
research are carried out by the same team, or if data is transferred from one field to the other, or
if researchers from one field publish in the other. Yet this is often the case on various levels. In
some instances, forensic geneticists were part of larger teams with affiliations to medical centres,
community hospitals and universities; or an author panel involved authors affiliated with medical
institutions and authors affiliated with forensic organizations. Some medical and population
geneticists who have a track record of studying Roma, or who have significantly contributed to the
constitution of DNA datasets labelled “Roma” in databases, have also co-authored forensic papers
(e.g. Melegh as co-author of a paper by Cruciani et al., 2011; Bernasovský and Bernasovská as
co-authors of two papers by Soták et al., 2008, 2011 and of a paper by Petrejčíková et al., 2011).
In other cases, biological specimens and/or DNA data were transferred from a clinical to a forensic
context, most likely without the explicit consent of the research subjects.
Three forensic studies explicitly mention that their samples were collected by medical doctors
(Nagy et al. 2007: 25, published in Forensic Science International; Saiz et al., 2014 in Forensic
Research, re-using the data from Novokmet and Pavčec, 2007 in Forensic Science International).
The ethnic categorization of samples by the collectors indicates that systematic ethnic labelling
for Roma is in place and routine in medical institutions in some countries.
2. Collaborations with police forces
Collaborations with police forces raise the question of whether samples were donated voluntarily, or
whether one must assume coercive collection practices. In this respect, two papers from China
have recently been retracted by Springer, on the grounds that informed consent cannot be
considered freely given if Chinese police forces are involved, even if the authors have stated ethical
approval in their study (Zhang et al., 2019 - retracted article). The study by Maria Saiz et al. (2014)
indicates that the involvement of police forces is not an isolated event: for the Latin American
samples they have analysed themselves, the authors thank medical units and police forces for “kindly
providing the DNA samples and for their great effort in collecting information about the origin and
ethnicity of each individual in the study.”
Several forensic genetic papers using DNA data from Roma list co-authors affiliated with police,
investigative or military forces.[17] This aligns with the observation that Roma are a population of
interest for police forces in different countries. Two of the studies explicitly mention the Hungarian
police as the contributors of DNA samples: “Miklos Angyal (Forensic Science Unit, Baranya County
Police Department, Hungary) for carefully [sic] collecting the Romani population samples” (Irwin
et al. 2007: 382; see also Egyed et al. 2007: 162).
[17] For more detail, see annex 2. Several German Landeskriminalämter; Forensic Science Unit, Baranya County Police
Department, Pecs, Hungary; Armed Forces DNA Identification Laboratory, Rockville, MD, USA; criminalistic laboratory
of Lodz police, Poland; Forensic Centre “Ivan Vucetić”, General Police Directorate, Ministry of the Interior of the
Republic of Croatia; Department of Biology, Institute of Forensic Science of Police Corps, Bratislava, Slovak Republic;
Illinois State Police, Research & Development Laboratory, Springfield, USA; Subdivision of Biological and Biochemical
Examinations and Analyses F.S.D., Hellenic Police, Athens, Greece.
3. International Collaborations
The topic of collaborations is important for an assessment of ethical standards. With regard to
international collaborations, according to the 2003 IHDG declaration of UNESCO, the “collection,
processing, use and storage of human genetic data, human proteomic data or biological samples”
that occurs in two or more states should be approved by the ethical committees in all the states
concerned. A lack of ethical committee approval in many of the forensic studies thus signals a
breach of ethical standards not only in the countries of sample collection, but also in the countries
where co-authors have institutional affiliations.
Regarding genetic data from Roma, most of the international collaborations are carried out between
researchers from CEE countries on the one hand, whose main role seems to be the collection
of DNA samples from Romani individuals, and on the other hand researchers from Western
(European) countries who contribute different types of expertise in the processing of DNA samples
and data. Most often, international collaborations connect researchers from Hungary and their
peers from Western countries (e.g. Germany, Austria, Portugal, Italy, Spain, Argentina, Belgium,
Poland, Norway, France, Japan, United Kingdom, USA), sometimes in large international
teams including medical units and police forces. Of all countries, Hungary contributed the largest
number of individual DNA data sets from Romani individuals to the forensic community.
A striking example is the study of Morar et al. (2004). Data from this study has been used in
Martinez-Cruz et al. (2015) and, under that reference, uploaded to YHRD and EMPOP by the team of
senior author David Comas, but without acknowledging where the data came from. Morar et al. (2004),
an international author panel from Bulgaria, Germany (LMU Munich), Australia, Lithuania, Slovakia,
Hungary, India, Canada, Great Britain, France and Spain, report on primary data from two Eastern
European Roma groups (“Balkan” and “Vlax”) as well as data from “Western European” Roma. For the
latter, as the authors state (p. 598), data comes from “Hungary, Slovenia, Czech Republic,
Lithuania, Germany, France, Italy, Spain and Portugal”. These datasets were lumped together
as “Western European” because, the authors state, “information on Gypsy group identity was
unavailable, partial or contradictory”. Data from Bulgaria was most probably the same as in
Gresham et al. (2001), but this remains unstated. Morar et al. (2004) claim that informed consent
has been obtained from all participants, which is misleading, and that the study complies
with the ethical guidelines of the institutions involved. It is unlikely, however, that the ethics
committee of a German university (LMU Munich) approved a study on “Gypsies” in the
early 2000s.
Such international collaborations are problematic, particularly if data is uploaded to forensic
databases, as the countries might have different national legislation regarding data protection and
the oversight of genetic data. Also, as can be seen from the example above, international ethical
standards can be undermined more easily if authors from specific countries classify their primary
data in a different national category and skip ethical approval procedures.
4. Funding
When specified at all, sources of funding for this research can sometimes be surprising. Institutions
with organizational missions very different from forensics are mentioned as sponsors of forensic
studies, for example the ministries of education and science in Bulgaria and Croatia
(mentioned in several publications), or the Wellcome Trust (mentioned in Ploski et al., 2002), an
organization whose mission is to improve health. In some cases, projects with aims very different
from forensics were mentioned as sources of support, for example the project “Molecular-genetic
portrait of the Roma—an isolated founder population model”, mentioned in Pokupčić et al.
(2008), an article published in FSI:G. In the summary presentation, the project webpage
(https://inantro.hr/en/national-projects/) specifies that the focus of the project is on health and
social inclusion, namely “to assess the population and genetic structure and the burden of the
population with monogenetically determined disorders” and to “contribute to a fuller understanding
of the Roma population in Croatia and improve their living conditions, health and inclusion into the
wider social community” (our translation). The same forensic publication by Pokupčić et al. (2008)
acknowledges a Wenner-Gren Foundation grant project, “Population Structure and Genetic History of
Western Balkan Roma”, awarded to one of the researchers. However, the Wenner-Gren Foundation
credits this researcher with a publication in a non-forensic journal and not with the co-authored
forensic publication from this project.[18]
A DNA biobank titled “European Biological Archive” was developed between 1995 and 1997 with EU
funding (CORDIS). As some of the project members from CEE countries have been involved in
extensive sample collecting and sharing efforts, this EU-funded project has contributed significantly
to the collection and accessibility of data and biological specimens from Roma.
The study by Morar et al. (2004) above has mainly drawn on data from CEE countries. Funding
came from Australia (Australian Research Council), UK (Medical Research Council / Wellcome
Trust), Germany (German Research Foundation, DFG), and Hungarian-German cooperations
(OM-BMBF; MTA-DGF). However, the DFG reports that it has no files about this funding in its
archives. With regard to ethical procedures, the DFG trusts funding recipients to have applied to
the ethical committee of their institution (which is probably not the case for the German LMU
co-authors of Morar et al. 2004).
Thus, some funding institutions might neither know nor approve of their contribution to the collec-
tion and sharing of data without ethical approval, or to the publication and upload to forensic da-
tabases.
5. Calibration and validation in academic and commercial contexts
Some ethnic minorities and isolated groups are in the focus of investigative authorities in their
country, which might explain the disproportionate attention they receive in forensic genetics. Yet
ethnic minorities and isolated groups also play a role in calibrating new technologies in Forensic
Genetics, developed either by academics or by tech companies. Data from Tibetans and Uighurs, for
example, was used for calibrating and validating the Huaxia Platinum Platform of Thermo Fisher
(Wang et al. 2018). Data from the isolated group of Karitiana (Brazil) was used to demonstrate the
effectiveness of Forensic Genetics methods from the beginning of its career in the courts
(Munsterhjelm 2015). Data from Roma is used in García-Magariños et al. (2015) to develop a
“parametric approach to kinship hypothesis testing using identity-by-descent parameters”; the 27
samples from diseased Romani children were collected in a “meningococcal disease association
study” (Davila et al. 2010) that does not mention any Roma sample donors. Although the authors
state that informed parental consent has been obtained, it is unclear whether the consent procedure
informed parents about a potential forensic usage of the data of their children. DNA data from Roma
has been used as a tool for the calibration of forensic databases (e.g. for YHRD, see Roewer et al.
2001, using data from Hungarian forensic geneticists (Füredi et al. 1999) that has no sampling
information) and for new sequencing technologies, e.g. testing massively parallel sequencing with
Illumina ForenSeq on European populations (Casals et al. 2017). Data from Roma has thus also been
used repeatedly to increase the power of DNA databases and related technologies used in court.
[18] http://www.wennergren.org/grantees/martinovic-klaric-irena; last accessed 31.10.2020.
6. The context of interaction between Roma and police forces
The focus of Forensic Genetics on Roma, and the overrepresentation of Roma in forensic genetic
databases, must be viewed against the backdrop of a strong focus of EUROPOL on Roma. What
police forces claim to focus on are so-called “criminal networks of Roma ethnicity” (EUROPOL
2016: 18). However, both in data collection and in police practices, the focus extends to “ethnic
Roma” and is not limited to “criminal networks”. EUROPOL views Roma as an ethnic group and
ascribes criminal activities, especially trafficking in human beings, to “ethnic Roma”. “Bulgarian
and Romanian (mostly of Roma ethnicity), Nigerian and Chinese groups”, the report claims, “are
probably the most threatening to society as a whole.” (EUROPOL 2011: 26). In 2010, the European
Union Agency for Fundamental Rights (FRA) reported that Roma, amongst other ethnic
minorities in Europe, are stopped and searched by the police more frequently than
majority populations (FRA, 2010).
7. Historical contextualization
While most of our findings point to a problematic lack of ethical awareness, both on the structural
and the individual level, we think that the full extent of the problem can only be grasped when
viewed against the historical backdrop. A brief historical overview might begin with police practices
of identifying, labelling and counting “Gypsies” starting in the 19th century.[19] Later, state
authorities in several countries devised specific “Gypsy” registers of Roma and Sinti for a
range of purposes. During the National Socialist regime, both professional groups, police forces
and scientists, contributed to the repression, persecution and genocide of Roma and Sinti. Terrible
memories of that time still reverberate in their communities and families. While the genocide of
Jewish citizens received attention in late 20th-century German society, the genocide of Roma
and Sinti has received public attention only much more recently. Roma and Sinti thus remained
exposed to discrimination and persecution for a very long time after the end of WWII. While there
is some public awareness of this history (and some recognition of what it did to those affected) in
Germany today, awareness in other countries might be lower.
More specifically, the more than 445 genetic publications on Roma published between
1921 and 2020 reveal clear continuities and patterns that one needs to keep in mind when reviewing
genetic data today. To name but some aspects:
[19] For the history of categorization and surveillance of “Gypsies” in police work, see Lucassen (1991, 1997) and Willems
(1997). For the history of identification and counting of “Gypsies” in police-led censuses, see Surdu (2016).
For about six decades, many blood group genetic studies of Roma made use of the infrastructure
of law enforcement, police data and police support for the recruitment of subjects.[20]
Notably, some studies conducted by forensic geneticists, as well as some studies that make
use of data collected in prisons, were not published in forensic genetics journals but in other
scientific journals.
Prisons were among the most popular recruitment sites at least until the beginning of the
1980s, and authors were explicit about their sampling practices in their publications.[21] Since
then, forensic geneticists no longer account for sampling sites.
Forensic genetics was only established as a field in the 1970s/1980s. Geneticists from legal
medicine and population genetics, however, have worked with data collected by police forces
since long before this time.
During the 1970s and 1980s, Hungarian researchers who collected data in forensic contexts
built up strong collaborations with German forensic researchers (see, for example, Füredi et
al., 1998; Goedde et al., 1995; Walter et al., 1992). Later on, Bulgarian and Czechoslovakian
researchers also established collaborations with German researchers, mostly with a biome-
dical focus. None of these collaborative studies raised or discussed questions of representa-
tivity, methodology, discrimination or research ethics.
Distinguishing between “Gypsies” and “non-Gypsies” in Hungarian prisons included an inspection
of external appearances, indicating continuities in the racial framing of the group: “to
avoid examination of gypsies of unclear origin we investigated only those who could be considered
real gypsies on account of their external somatic features” (Rex-Kiss et al., 1972a:
358). Moreover, the same study acknowledges that the participation of “Gypsies” in genetic
research was not voluntary.
Stigmatizing and insulting language was prevalent in most genetic accounts of Roma
throughout the second half of the 20th century, and continued until the early 2000s. However, since
2010, institutions of Forensic Genetics no longer use “Gypsies” as a label (exception:
Bembea et al. 2011), while in medical journals and databases this can still be the case: as
population labels; and in combinations such as “Gypsy disorders”, “Gypsy mutations” or “Gypsy
chromosomes”. In some studies, the framing of Roma bears no conceptual difference to historical
conceptions of a so-called “Gypsy race” under the National Socialist regime.
Instrumentalizing approaches have historically dominated biomedical genetic studies of Roma;
obviously, this population has much to offer to geneticists in terms of research and innovation
opportunities. In numerous biomedical studies, the authors praise isolated populations
and particularly the Roma (or rather their “unique genetic heritage”) as “our special
‘research tool’”[22] or state that “the unique genetic heritage of the Roma provides a powerful tool
for the positional cloning of monogenic disease genes” (Gresham, PhD manuscript, p. iii). The
“usefulness” of DNA data from Roma for genetic research has also been recognized in forensic
studies.
[20] Already in the interwar period, in several European countries, prisons were used as recruitment sites for genetic
investigations of Roma employing a racial biology approach; most infamously: Rassenhygienische und
Bevölkerungsbiologische Forschungsstelle (e.g. Ritter, 1941) and the Institute of Racial Biology in Sweden (see Ministry
of Culture Sweden, 2015: 57). As early as 1932, a genetics team collaborated with police forces in order to obtain blood
samples from Roma (Gärtner, 1932).
[21] For example, in Hungary, Roma prisoners were continuously examined in blood group research (see e.g. Rex-Kiss,
Szabó L and Szabó, 1972a, 1972b; Rex-Kiss et al., 1973 and Rex-Kiss and Szabó, 1981).
[22] Tom Wahlig Stiftung website. Unraveling the molecular basis of hereditary spastic paraplegias – Contribution from
European Gypsies. Prof. Albena Jordanova, PhD. Available at:
https://www.hsp-info.de/en/researchers/project-reports/project-30.html (accessed 31.10.2020).
Different regimes exist in Europe regarding the collection and processing of ethnic data (e.g.
CEE countries collect statistics on ethnicity in censuses whereas many Western European
countries do not).
In communist countries, informed consent procedures did not exist and according to experts,
it would have been inconsistent with the traditional doctor-patient-relationship to obtain infor-
med consent (Kalaydjeva and Kremensky, 1992). Blood samples, however, were most likely
taken before 1989 and informed consent probably did not immediately become a routine prac-
tice after 1989.
In some cases, sampling practices were designed in a way that pressured individuals to comply.
In historical cases, community leaders were approached to gather samples of “Gypsies”
in Greece (Bartsocas et al., 1979), or to obtain sampling-related information about Romani
groups in France (Ely, 1961). Until 2010, Roma community leaders in Serbia were asked to
give their consent for the whole community (instead of individual consent) to contribute DNA
(Beljić Živković et al., 2010).
The recruitment of donors was often dependent on the authority of medical staff or senior
community members (see also Szamosi, 2010). This may have complicated individual rights
to refuse or withdraw participation. It may have also led to significant social community pres-
sure on individuals to conform with recruitment demands. Our interview with a senior geneti-
cist confirms and adds to this observation.
As many of the research subjects were illiterate, and many of them were children, obtaining
informed consent in accordance with international standards was considerably more difficult.
No explanation about the efforts taken can be found in the studies.
IV. Discussion and conclusions
Minorities are especially at risk in forensic genetics and in DNA technologies involved in criminal
investigation: their privacy risk is higher, as well as the risk for stigmatization; they are more affec-
ted by biases in data collection, representation errors and misinterpretations; and ensuring ethi-
cally acceptable procedures for them seems not to have been a priority for researchers and in-
vestigators. In this paper, we have demonstrated that ethical safeguards are not sufficient for vul-
nerable groups such as the Roma: Some of the most important ethical standards agreed upon in
international science policy documents are not met in many genetic studies on Roma. While it
might come as no surprise that Chinese forensic authorities unethically collect data from Uighurs
(Moreau 2019, Zhang et al., 2019 - retracted), and while we expect to find more such infringements
of ethical standards in studies of minorities and isolated groups from other parts of the world, for
Europe, the case of the Roma is unparalleled.
We hope we have made clear that these problems are by no means confined to forensic genetics:
the data has circulated widely in other genetic subdisciplines, too. It is also not a problem confined
to Eastern Europe: Western European researchers are equally involved in the studies, and bene-
fits seem to be distributed on both sides. The problem is also not confined to Roma: other vulne-
rable groups are affected too.
With regard to our research questions, we have demonstrated that, in the eyes of investigators
and forensic geneticists, Roma are viewed as a population of specific interest, albeit for different
reasons: while an interestingly isolated population for the geneticists, they are deemed a suspect
population by many investigators. With the upload and sharing of DNA data of Roma collected by
investigative forces, this latter attitude time and again seems to spill over into forensic genetics,
an academic field so heavily dependent on its applicatory value for investigation.
This resonates with our observation that forensic genetics, as both an academic field and a field
of application, is subject to two different normative regimes. As we have shown, this can lead to
ethically problematic constellations, for example, when investigators upload DNA data from mino-
rity populations to a publicly available database without following ethical guidelines. Another prob-
lematic constellation might arise when investigators make use of EMPOP/YHRD services without
drawing on the appropriate expertise to interpret the results correctly.
Another important observation is that the boundary between forensic genetics and neighboring
genetic subdisciplines is also permeable: Data is shared between medical and forensic genetics
in non-transparent ways, with ethically problematic consequences.
Ethics dumping. In the introduction, we asked whether what we have observed can be described
as ethics dumping in the sense of Schroeder (2017). Indeed, this is the case. The EC had
determined that ethics dumping happens outside of the EU. Notably, however, with the collection
of DNA data from Roma, such ethics dumping happens even within the EU, with a relatively clear
East-West dynamic. Most of the samples in Roma-related forensic studies and the forensic
databases were collected before 2010 in Central and Eastern Europe (CEE) countries. Some of
these data were collected in partnerships between researchers from Western European EU
member states and CEE countries not yet members of the EU at the time of sample collection
(Poland, Czech Republic, Slovakia, Hungary, Slovenia: members in 2004, Romania and Bulgaria:
in 2007, Croatia in 2013). Some other countries of data collection are not members of the EU:
Turkey, Macedonia, Albania, countries of former Yugoslavia. The few Western European
countries of data collection were EU members at the time of data collection, most notably
Spain and Portugal. As this paper demonstrates, ethics dumping can be observed in many of
these data collections and analyses.
Yet as our paper shows, Roma are affected by ethics dumping even in Germany: Not only did
German researchers contribute to international collaborations without considering ethical stan-
dards; some biobanks or data collections might even contain samples or data from Roma collected
in Germany, as most likely in Morar et al. (2004), but these samples seem to have been merged
into large international datasets under non-transparent data curation. This amounts to ethics dumping,
that is, exploitative practices of collecting DNA samples. However, in this case, it is not about
researchers from economically developed EU countries profiting from data collection in less de-
veloped non-EU countries, but about privileged researchers from a country’s majority population
profiting from data collection in underprivileged minorities of the same country.
In order to conceal ethics dumping, we argue, some of these international author teams are
involved in what we call “data laundering” in genetic studies of Roma: whether and how ethical
standards have been implemented is obscured in their studies all the way down: by reporting
misleading or wrong information about informed consent procedures; by extending statements
on implemented ethical standards from a small, correctly sampled dataset onto a large, merged
dataset, consisting of samples collected in the past under totally different conditions; by taking and
situating data out of context in databases and studies.
This is not the first case of ethics dumping reported on in academia, and fortunately there are
some cases of “lessons learnt” in the ethical field and in genetics. In some of these, an empirical
examination of “ethics in practice” has led to a revision of standards and/or their implementation,
mainly for taking into account the perspectives of minority populations. Most importantly, though,
the discussion of such cases seems to have heightened the awareness of researchers for the
needs and rights of vulnerable populations. Without doubt, today, many in human genetics take
ethics very seriously and invest much effort into meeting the expectations and standards.
However, some of their colleagues do not approach vulnerable populations in the way the majority
of human geneticists considers appropriate; unfortunately, this behaviour nevertheless risks
shaking the overall trust of the public in human genetics.
Best practices already exist for other populations, namely, to involve minority communities in
the studies as partners from the very beginning. But this has obviously not happened in the studies
on Roma, with one noteworthy exception: Font-Porterias et al. (2019) seem to have done exactly
this, involving a Roma organization in the study in a fair way, with written informed consent.
But the same study also uses the merged and highly problematic data set from Mendizabal et al.
(2012). The authors of Font-Porterias et al. (2019) claim they have obtained “written consent”
from all “volunteers”. This is either wrong, or it suggests that the shared data from Mendizabal et
al. (2012) was not obtained from volunteers; but why, then, use it in a study in 2019 that has
allegedly been approved by an ethical committee? This, we argue, amounts to data laundering.
Benefits and burdens. The question of whether the Roma benefit from this research strand should
have priority in this discussion; so far, any such benefit is highly questionable. We argue that most of the medical genetic
studies carried out on Roma show no potential benefits for the researched subjects, and no
benefit-sharing mechanisms with researched subjects and communities have been planned, as
far as we can tell from the sources. Moreover, the people sampled from poor neighbourhoods
might have other medical needs and priorities, namely access to basic health care. Furthermore,
poor people are unlikely to gain access to potential drugs or therapies developed as a result of
discoveries made on Roma-related genetic data, but patients from economically developed
countries and well-off segments of the population will benefit.
With regard to forensic genetic studies, benefit sharing is even more unrealistic, even if a strict
application of best practices in database keeping would help to improve the situation. After all,
much depends on the interpretation of lab results by investigators, even if the technical validity
were high and ethical requirements for data collection were met. We know of at least one example
in which the interpretation of BGA results has had negative consequences for Roma. In the
Heilbronn police officer murder case (Lipphardt A, 2019), the unknown DNA trace was compared with
data held on EMPOP; newspapers reported the analysis had indicated “Eastern European”
ancestry (even though an EMPOP query would not allow for such a conclusion). Despite this general
result, implying large parts of the German population, investigators focused on Roma and took
thousands of blood samples, accompanied by a public campaign assuming the perpetrator was
to be found in the “Gypsy milieu”. In another example illustrative of the stigmatizing power of
genetic testing, a Hungarian right-wing politician had his DNA analysed by a private company in
order to rule out “Roma” or “Jewish” ancestry.
One overall effect of the developments in several applicatory fields of genetics is an increased
stigmatization and criminalization of Roma. The representation of Roma in papers published by
geneticists of various genetic subdisciplines, as pointed out in this paper, risks reinforcing and
deepening stigmatization and societal exclusion on several levels: in social relationships, accep-
tance into communities, workplace, health care systems, administrative contexts and law enfor-
cement. This is by no means only a kind of abuse of scientific results: Adding to the discursive
exclusion of Roma from Europe, in popularizing (non-scientific) publications and in media covera-
ge, scientists represent Roma in an exoticizing, romanticizing way, so that the “genetic isolation”
by “cultural traditions of endogamy” would seem more plausible to audiences.
What the future might bring. Our analysis resonates with current debates on various ethical
challenges, particularly in biomedicine: this paper has been pulled together at a time of heated
international debates regarding scientific and informed consent standards in biomedicine,
particularly in genetics. We argue that Roma and other vulnerable isolated groups are more
heavily affected by the downside of data-related issues. If, in these current debates, loose stan-
dards come to be favoured over strict ones, we predict negative consequences for marginalized
and vulnerable populations: simply trusting in the self-control of stakeholders who have a strong
interest in owning and exploiting data of “interesting communities” is not an adequate answer.
We also wish to highlight the risks of future applications in state administration: the genetics com-
munity risks a biologistic, deterministic understanding of who is a member of the Roma group and
who is not, based on biological data. Biologically determined group ascription may lead to biolo-
gically assigned citizenship. This may lead to dangerous future uses of DNA analysis, e.g. some
kind of ethnic/racial diagnosis for the purpose of exclusion, or for granting and denying access to
rights and citizenship.
We wish to emphasize that since the 1990s, awareness of ethical challenges has certainly
increased among the geneticists who study Roma. This can be seen from the publications and
the databases. However, we argue that for the protection of a vulnerable group, it is still insuffi-
cient, and definitely not in accordance with the ethical standards of genetics. We are aware of the
fact that researchers working in this field may provide a different perspective. For example, re-
searchers might consider themselves justified to work around restrictions or circumstantial bar-
riers, and would sometimes also be ready to justify this ethically. At one end of the awareness-
spectrum there might be researchers with little awareness of how data can be problematic; in the
middle, researchers who are aware, but see themselves in a position that does not allow for chan-
ges; at the other end there might be researchers who are fully aware and yet take advantage of
discriminatory and problematic structures. So far, no visible effort has been made to raise and
solve the problems outlined above. ‘Continue to muddle through’ is neither a solution nor does it
help the Roma. Our hope is that, once awareness increases, the many geneticists who are not in
agreement with the described practices will express their concerns publicly.
We request that those researchers who collect or work with Roma samples, or with samples from
other vulnerable populations, who review proposals, project reports or publications, as well as
funding institutions and journals, take on the challenges that come with a scientifically and ethically
sound investigation of those they call “Roma”. This includes thorough and serious considerations
of critical arguments from the social sciences and humanities, but most importantly, an engage-
ment with those addressed as “Roma”. This could lead to having certain studies funded and pub-
lished, while others remain unfunded or unpublished. If geneticists, the pharma industry and in-
vestigative bodies seek the benefits of researching this data, they need to make an effort to both
avoid negative effects for the communities and to ensure such research actively benefits the com-
munities; currently, the communities carry severe social burdens on behalf of these studies.
It is in the interest of all stakeholders involved – the minority members, the geneticists, the members
of investigative authorities, and the taxpayers – that the criminal investigators can do their
work in accordance with the highest standards: technical as well as ethical standards. It is in no one’s
interest that criminal investigation risks violating these standards or inflicting harm on minorities.
In Germany, a dialogue between German forensic geneticists (in investigative authorities and at
universities) and the Zentralrat der Deutschen Sinti und Roma would be a first step towards
developing a shared understanding of the situation around DNA data in forensic contexts and the
participation of Roma in an institutional setup of discussions about ethical issues. As part of the quality
assessment, systematic and continued efforts to implement ethical assessments of these data-
bases and of the data sharing practices in the field of forensic genetics would be the next step. As
for the main publications in forensic genetics journals, the involvement of reviewers from the social
sciences and humanities, retraction of papers failing ethical standards, and publication of Informed
Consent sheets (the blank form) with the published study are some of the possibilities to improve
the transparency and integrity for the whole branch of forensic genetics.
Acknowledgements
We are grateful for a DFG grant for our former project The Genetic Construction of Roma Groupness
and its Interdisciplinary Entanglements (2016-2019). It was during this project that we built
up a large part of our database of genetic studies on Roma and developed contextual information
relevant to our analysis. Many thanks to Anja Reuss and her colleagues from the Zentralrat
der Deutschen Sinti und Roma, and to Frank Reuter from the Forschungsstelle für Antiziganismus.
We thank Yves Moreau for the insightful conversations on ethical aspects related to genetic data
collection. We thank Leon Kokkoliadis and Eric Llaveria Caselles, student assistants at Max
Planck Institute for the History of Science, who contributed by collecting genetic papers focusing
on Roma at the beginning of our project (2014-2015). We thank Cedric Bradbury and Sarah Weitz,
student assistants at University College Freiburg, who helped with collecting, sorting and organi-
zing the database of genetic studies of Roma (2016-2020). We are grateful to Silvia Stößer who
helped us with various administrative tasks. We had inspiring discussions about genetic studies
on Roma with Anna Lipphardt, Nicholas Buchanan, Amade M'charek, Huub van Baar and Ildikó
Plájás. Our understanding of the genetic studies benefitted from FRIAS funding during 2018/2019,
and particularly from very fruitful and insightful encounters with our colleagues from the interdisci-
plinary project group at Freiburg Institute for Advanced Studies (FRIAS): Anna Köttgen, Anne-
Christine Mupepele, Peter Pfaffelhuber and Fabian Staubach. We are grateful for numerous con-
versations with Matthias Wienroth, Sabine Lutz-Bonengel, Gudrun Rappold, Till Andlauer, Tino
Plümecke and Thomas Schulze on this topic. We have received helpful comments on earlier versions
of this text from Denise Syndercombe Court, Nils Ellebrecht, Sabine Lutz-Bonengel and Anna
Lipphardt. We are grateful for advice from the DFG-Senatskommission für Grundsatzfragen in der
Genforschung, in particular from Brigitte Schlegelberger.
Bibliography
Abbott A (2012) Genome test slammed for assessing ‘racial purity’: Hungarian far-right politician
certified as ‘free of Jewish and Roma’ genes. Nature, vol. 486, 14 June 2012.
Ádány R and Balázs M (2016) Hungary: The Faculty of Public Health, Debrecen. In Foldspang
A et al. (eds) 50 years of professional public health workforce development. ASPHER’s
50th Anniversary Book. The Association of Schools of Public Health in the European
Region (ASPHER).
Ali-Khan SE et al. (2011) The use of race, ethnicity and ancestry in human genetic research.
The HUGO Journal 5(1-4): 47-63.
Balanovska EV et al. (2016) Population biobanks: organizational models and prospects of
application in gene geography and personalized medicine. Russian Journal of Genetics
52(12): 1227-1243.
Barbujani G, Ghirotto S and Tassi F (2013) Nine things to remember about human genome
diversity. Tissue Antigens 82: 155-164.
Bárd P (2010) The force of law: genetic data protection in Central and Eastern Europe. JAHR
1(1): 95-112.
Bartsocas CS et al. (1979) Genetic structure of the Greek gypsies. Clinical Genetics 15(1): 5-
10.
Beljić Živković T et al. (2010) Screening for diabetes among Roma people living in Serbia.
Croatian Medical Journal 51(2): 144-150.
Bembea M et al. (2011) Y-chromosome STR haplotype diversity in three ethnically isolated
population from North-Western Romania. Forensic Science International: Genetics 5(3): e99-
e100.
Ben-Eghan et al. (2020) Don’t ignore genetic data from minority populations. Nature 585:
184-186.
Borck C, Lipphardt V, Maasen S, Müller R and Penkler M (2018) Responsible Research? Dilemmata
der Integration gesellschaftlicher und kultureller Perspektiven in naturwissenschaftliche
Forschungsprogramme. Berichte zur Wissenschaftsgeschichte 41(3): 215-221.
Brandstätter A et al. (2008) Mitochondrial DNA control region variation in Ashkenazi Jews from
Hungary. Forensic Science International: Genetics 2(1): e4-e6.
Buchanan N and Weitz S (2017) Eine Technologie der Angstkultur. Freispruch, Heft 11,
September 2017.
Carracedo Á et al. (2014) Update of the guidelines for the publication of genetic population
data. Forensic Science International: Genetics 10: A1-A2.
Carracedo Á et al. (2013) New guidelines for the publication of genetic population data.
Forensic Science International: Genetics 7(2): 217-220.
Carracedo Á et al. (2010) Publication of population data for forensic purposes. Forensic
Science International: Genetics 4(3): 145-147.
Casals F et al. (2017) Length and repeat-sequence variation in 58 STRs and 94 SNPs in two
Spanish populations. Forensic Science International: Genetics 30: 66-70.
Cazacu C et al. (2013) Personalized medicine for whom? The situation of Romani people.
Revista Romana de Bioetica 11(3): 84-91.
Claw KG et al. (2018) A framework for enhancing ethical genomic research with Indigenous
communities. Nature Communications 9(1): 1-7.
Cole SA and Lynch M (2006) The social and legal construction of suspects. Annual Review of Law
and Social Science 2: 39-60.
Cole S (2020) Individual and collective identification in contemporary Forensics. BioSocieties
15: 350-375.
Cruciani F et al. (2011) Strong intra-and inter-continental differentiation revealed by Y chromo-
some SNPs M269, U106 and U152. Forensic Science International: Genetics 5(3): e49-e52.
Curtis D and Balloux F (2020) Topical ethical issues in the publication of human genetics
research. Annals of Human Genetics 84(4): 313-314.
CORDIS website. Biological history of European populations. Project ID: CIPD940038.
Available at https://cordis.europa.eu/project/rcn/29025/factsheet/en (accessed
31.10.2020).
D’Amato ME et al. (2020) Ethical publication of research on genetics and genomics of
biological material: guidelines and recommendations. Forensic Science International:
Genetics 48: 102299.
Davila et al. (2010) Genome-wide association study identifies variants in the CFH region asso-
ciated with host susceptibility to meningococcal disease. Nature Genetics 42(9): 772-776.
Deligiannidis P et al. (2006) Forensic evaluation of 13 STR loci in the Roma population (Gyp-
sies) of Greece. Forensic Science International 157(2-3): 198-200.
Diószegi J et al. (2017) Distribution characteristics and combined effect of polymorphisms af-
fecting alcohol consumption behaviour in the Hungarian general and Roma populations.
Alcohol and Alcoholism 52(1): 104-111.
Dotto F et al. (2020) Analysis of a DNA mixture involving Romani reference populations. Fo-
rensic Science International: Genetics 44: 102168.
Egyed B et al. (2007) Mitochondrial control region sequence variations in the Hungarian popu-
lation: analysis of population samples from Hungary and from Transylvania (Romania).
Forensic Science International: Genetics 1(2): 158-162.
Egyed B et al. (2006) Analysis of the population heterogeneity in Hungary using fifteen foren-
sically informative STR markers. Forensic Science International 158(2-3): 244-249.
Egyed B et al. (2003) Population genetic analysis in Hungarian populations using the Power-
plex™ 16 system. In International Congress Series (Vol. 1239, No. C, pp. 121-122).
Elsevier Science
Egyed B et al. (2000) Analysis of eight STR loci in two Hungarian populations. International
Journal of Legal Medicine 113(5): 272-275.
Ehler E and Vanek D (2017) Forensic genetic analyses in isolated populations with examples of
central European Valachs and Roma. Journal of Forensic and Legal Medicine 48: 46-52.
Ellebrecht N and Weber D (forthcoming, 2021) Verbotener function creep. Genetische
Herkunftsbestimmung im Spannungsfeld forensischer DNA-Analysen, polizeilicher Ermittlung
und rechtlicher Vorgaben. Kriminologisches Journal (1).
Ely B (1961) Les groupes sanguins de 47 Tsiganes de la région Parisienne. Bulletins et Mé-
moires de la Société d'Anthropologie de Paris 2(2): 233-237.
Fiatal S (2011) Characterization of the Hungarian Reference DNA Biobank. The role of ACE
I/D polymorphism in susceptibility to metabolic syndrome among Hungarians. Short thesis
for the degree of Doctor of Philosophy (Ph.D.), University of Debrecen.
Font-Porterias N et al. (2019) European Roma groups show complex West Eurasian admixture
footprints and a common South Asian genetic origin. PLoS Genetics 15(9): e1008417.
Füredi S et al. (1999) Y-STR haplotyping in two Hungarian populations. International Journal
of Legal Medicine 113(1): 38-42.
Füredi S et al. (1998) Population genetic data on four STR loci in a Hungarian Romany
population. International Journal of Legal Medicine 112(1): 72-74.
Füredi S et al. (1997) Semi-automatic DNA profiling in a Hungarian Romany population using
the STR loci HumVWFA31, HumTH01, HumTPOX, and HumCSF1PO. International
Journal of Legal Medicine 110(4): 184-187.
García-Magariños et al. (2015) A parametric approach to kinship hypothesis testing using
identity-by-descent parameters. Statistical Applications in Genetics and Molecular Biology
14(5): 465-479.
Gärtner S (1932) Serologische Untersuchungen an Wanderzigeunern: Agglutination,
Wassermannsche Reaktion und Blutgruppenbestimmungen. Zeitschrift für Hygiene und
Infektionskrankheiten 113: 741-750.
Goedde HW et al. (1995) Serum protein and erythrocyte enzyme polymorphisms in twelve
population groups of Hungary. Anthropologischer Anzeiger 53(2): 97-124.
Good A (2012) Genetic testing of far-right Hungarian politician provokes an uproar. Foreign
Policy, 15 June 2012. Available at:
https://foreignpolicy.com/2012/06/15/genetic-testing-of-far-right-hungarian-politician-provokes-an-uproar/
(accessed 21.10.2020).
Gresham D et al. (2001) Origins and divergence of the Roma (gypsies). The American Journal
of Human Genetics 69(6): 1314-1331.
Gusmão L et al. (2017) Revised guidelines for the publication of genetic population data.
Forensic Science International: Genetics 30: 160-163.
Gusmão A et al. (2008a) A Perspective on the History of the Iberian Gypsies Provided by
Phylogeographic Analysis of Y-Chromosome Lineages. Annals of Human Genetics 72(2):
215-227.
Gusmão A et al. (2008b) Y-chromosomal STR haplotypes in a Gypsy population from Portugal.
Forensic Science International: Genetics Supplement Series 1(1): 212-213.
Hindorff LA et al. (2018) Prioritizing diversity in human genomics research. Nature Reviews
Genetics 19(3): 175.
Irwin J et al. (2007) Hungarian mtDNA population databases from Budapest and the Baranya
county Roma. International Journal of Legal Medicine 121(5): 377-383.
Janicsek I et al. (2015) Significant interethnic differences in functional variants of PON1 and
P2RY12 genes in Roma and Hungarian population samples. Molecular Biology Reports
42(1): 227-232.
Kalaydjieva L and Kremensky I (1992) Screening for phenylketonuria in a totalitarian state.
Journal of Medical Genetics 29(9): 656-658.
Kalaydjieva L, Gresham D and Calafell F (2001) Genetic studies of the Roma (Gypsies): a
review. BMC Medical Genetics 2(1): 5.
Kaseniit KE et al. (2020) Genetic ancestry analysis on >93,000 individuals undergoing
expanded carrier screening reveals limitations of ethnicity-based medical guidelines.
Genetics in Medicine 22(10): 1694-1702.
Klarić IM et al. (2009) Dissecting the molecular architecture and origin of Bayash Romani
patrilineages: genetic influences from South-Asia and the Balkans. American Journal of
Physical Anthropology 138(3): 333-342.
Koops BJ and Schellekens M (2008) Forensic DNA phenotyping: regulatory issues. The
Columbia Science & Technology Law Review 9: 158-202.
Law I and Kovats M (2018) Rethinking Roma: Identities, Politicisation and New Agendas.
Springer.
Lee SSJ et al. (2008) The ethics of characterizing difference: guiding principles on using racial
categories in human genetics. Genome Biology 9(7): 404-404.4
Lipphardt V, Rappold G and Surdu M (forthcoming, 2021) Representing vulnerable populations
in genetic studies: The case of Roma. Science in Context.
Lipphardt A (2019) Die Erfindung des „Heilbronner Phantoms“: Kulturanthropologische
Annäherungen an den NSU-Komplex. Zeitschrift für Volkskunde 115(1): 50-167.
Lipphardt V (2018) Vertane Chancen? Die aktuelle politische Debatte um Erweiterte DNA-Ana-
lysen in Ermittlungsverfahren. Berichte zur Wissenschaftsgeschichte 41(3): 279-301.
Lucassen L (1991) The power of definition. Stigmatisation, minoritisation and ethnicity illustra-
ted by the history of Gypsies in the Netherlands. Netherlands Journal of Social Sciences
27: 80-91.
Lucassen L (1997) «Harmful tramps» Police professionalization and Gypsies in Germany,
1700-1945. Crime, Histoire & Sociétés/Crime, History & Societies 1(1): 29-50.
Lynch M et al. (2010) Truth machine: The contentious history of DNA fingerprinting. University
of Chicago Press
Machado H and Granja R (2020) Forensic genetics in the governance of crime. Singapore: Palgrave
Macmillan.
Magyari L et al. (2014) Marked differences of haplotype tagging SNP distribution, linkage, and
haplotype profile of IL23 receptor gene in Roma and Hungarian population samples.
Cytokine 65(2): 148-152.
Malyarchuk BA et al. (2006) Mitochondrial DNA diversity in the Polish Roma. Annals of Human
Genetics 70(2): 195-206.
Maria Saiz MS et al. (2014) Action protocols in DNA identification of isolated populations. J
Forensic Res 5(218): 2.
Martínez-Cruz B et al. (2015) Origins, admixture and founder lineages in European Roma.
European Journal of Human Genetics 24(6): 937-943.
Martínez-Cruz B et al. (2011) Multiplex single-nucleotide polymorphism typing of the human Y
chromosome using TaqMan probes. Investigative Genetics 2(1): 13.
Mascalzoni D et al. (2010) Comparison of participant information and informed consent forms
of five European studies in genetic isolated populations. European Journal of Human
Genetics 18(3): 296-302.
McCartney C (2006) Forensic Identification and Criminal Justice. Cullompton: Willan.
M’charek A and Wade P (2020) Doing the individual and the collective in forensic genetics:
governance, race and restitution. BioSocieties 15(3): 317-328.
M’charek A, Toom V and Jong L (2019) The trouble with race in forensic identification. Science,
Technology & Human Values 45(5): 804-828.
M’charek A, Toom V and Prainsack B (2012) Bracketing Off Population Does Not Advance
Ethical Reflection on EVCs: A Reply to Kayser and Schneider. Forensic Science
International: Genetics 6(1): e16-e17; author reply e18-e19.
Mendizabal I et al. (2012) Reconstructing the population history of European Romani from
genome-wide data. Current Biology 22(24): 2342-2349.
Mendizabal I et al. (2011) Reconstructing the Indian origin and dispersal of the European
Roma: a maternal genetic perspective. PLoS ONE 6(1): e15988.
Middleton A et al. (2020) Global public perceptions of genomic data sharing: what shapes the
willingness to donate DNA and health data? The American Journal of Human Genetics
107(4): 743-752.
Morar B et al. (2004) Mutation history of the Roma/Gypsies. The American Journal of Human
Genetics 75(4): 596-609.
Moreau Y (2019) Crack down on genomic surveillance. Nature 576: 36-38.
Munsterhjelm M (2015) Beyond the Line: Violence and the Objectification of the Karitiana
Indigenous People as Extreme Other in Forensic Genetics. International Journal for the
Semiotics of Law-Revue internationale de Sémiotique Juridique 28(2): 289-316.
Myers M (2020) An inheritance of exclusion: Roma education, genetics and the turn to biosocial
solutions. Research in Education 107(1): 55-71.
Nagy K et al. (2017) Distinct Penetrance of Obesity-Associated Susceptibility Alleles in the
Hungarian General and Roma Populations. Obesity Facts 10(5): 444-457.
Nagy A et al. (2015a) Extreme differences in SLCO1B3 functional polymorphisms in Roma and
Hungarian populations. Environmental Toxicology and Pharmacology 39(3): 1246-1251.
Nagy A et al. (2015b) Marked differences in frequencies of statin therapy relevant SLCO1B1
variants and haplotypes between Roma and Hungarian populations. BMC Genetics 16(1):
108.
Nagy M et al. (2007) Searching for the origin of Romanies: Slovakian Romani, Jats of Haryana
and Jat Sikhs Y-STR data in comparison with different Romani populations. Forensic
Science International 169(1): 19-26.
Novokmet N and Pavčec Z (2007) Genetic polymorphisms of 15 AmpFlSTR Identifiler loci in
Romani population from Northwestern Croatia. Forensic Science International 168(2-3):
e43-e46.
Oorschot I and M’charek A (2021) Un/Doing Race: On Technology, Individuals, and Collectives
in Forensic Practice. Subsequent version to be published in M. Hojer Bruun and C. Hasse
(eds.) The Handbook for the Anthropology of Technology, Palgrave 2021. Pre-publication
draft, not proof corrected. Available at:
https://www.researchgate.net/publication/345310060_UnDoing_Race_On_Technology_Individuals_and_Collectives_in_Forensic_Practice [accessed Nov 11 2020].
Pamjav H et al. (2011) Genetic structure of the paternal lineage of the Roma people. American
Journal of Physical Anthropology 145(1): 21-29.
Parson W and Dür A (2007) EMPOP—a forensic mtDNA database. Forensic Science
International: Genetics 1(2): 88-92.
Parson W and Roewer L (2010) Publication of population data of linearly inherited DNA markers
in the International Journal of Legal Medicine. International Journal of Legal Medicine
124(5): 505-509.
Petrejčíková E et al. (2011) Allele frequencies and population data for 11 Y-chromosome STRs
in samples from Eastern Slovakia. Forensic Science International: Genetics 5(3): e53-e62.
Ploski R et al. (2002) Homogeneity and distinctiveness of Polish paternal lineages revealed by
Y chromosome microsatellite haplotype analysis. Human Genetics 110(6): 592-600.
Pokupčić K et al. (2008) Y-STR genetic diversity of Croatian (Bayash) Roma. Forensic Science
International: Genetics 2(2): e11-e13.
Purps J et al. (2014) A global analysis of Y-chromosomal haplotype diversity for 23 STR loci.
Forensic Science International: Genetics 12: 12-23.
Rex-Kiss B, Szabó L and Szabó S (1972a) Blood group investigations among the Gypsy
population of Hungary. I. Examination of ABO, MN and Rh blood groups. Annales
Immunologiae Hungaricae 16: 355-370.
Rex-Kiss B, Szabó L and Szabó S (1972b) Blood group investigations among the Gypsy
population of Hungary. II. Examinations of haptoglobin types and level and GM (1) factor.
Annales Immunologiae Hungaricae 16: 371-376.
Rex-Kiss B et al. (1973) ABO, MN, Rh blood groups, Hp types and Hp level, Gm (1) factor
investigations on the Gypsy population of Hungary. Human Biology 45(1): 41-61.
Rex-Kiss B and Szabó S (1981) AB0-Blutgruppenbestimmungen an Strafgefangenen.
Zeitschrift für Rechtsmedizin 86(4): 295-301.
Roewer L et al. (2001) Online reference database of European Y-chromosomal short tandem
repeat (STR) haplotypes. Forensic Science International 118(2-3): 106-113.
Royal CD et al. (2010) Inferring genetic ancestry: opportunities, challenges, and implications.
The American Journal of Human Genetics 86(5): 661-673.
Ritter R (1941) Die Bestandsaufnahme der Zigeuner und Zigeunermischlinge in Deutschland.
Der Öffentliche Gesundheitsdienst 6(21): 477-489.
Sandor JM and Bárd P (2009) The Legal Regulation of Biobanks. National Report: Hungary.
Center for Ethics and Law in Biomedicine (CELAB), Budapest.
Salihović M P et al. (2011) The role of the Vlax Roma in shaping the European Romani maternal
genetic history. American Journal of Physical Anthropology 146(2): 262-270.
Schroeder D et al. (eds) (2018) Ethics Dumping. Case Studies from North-South Research
Collaborations. SpringerBriefs in Research and Innovation Governance.
Sipeky C et al. (2015) Interethnic variability of CYP4F2 (V433M) in admixed population of Roma
and Hungarians. Environmental Toxicology and Pharmacology 40(1): 280-283.
Sipeky C et al. (2014) Lower carrier rate of GJB2 W24X ancestral Indian mutation in Roma
samples from Hungary: implication for public health intervention. Molecular Biology
Reports 41(9): 6105-6110.
Sipeky C et al. (2013) High prevalence of CYP2C19*2 allele in Roma samples: study on Roma
and Hungarian population samples with review of the literature. Molecular Biology Reports
40(8): 4727-4735.
Sipeky C et al. (2011) Genetic variability and haplotype profile of MDR1 (ABCB1) in Roma and
Hungarian population samples with a review of the literature. Drug Metabolism and
Pharmacokinetics 26(2): 206-215.
Sipeky C et al. (2010) Population pharmacogenomics and personalized medicine research in
Hungary: Achievements and lessons learned. Current Pharmacogenomics and
Personalized Medicine 8(3): 194-201.
Skinner D (2013) The NDNAD has no Ability in Itself to be Discriminatory: Ethnicity and the
Governance of the National Forensic DNA Database. Sociology 47(5): 976-992.
Skinner D (2020) Race, racism and identification in the era of technosecurity. Science as Culture
29(1): 77-99.
Soták M et al. (2008) Genetic variation analysis of 15 autosomal STR loci in Eastern Slovak
Caucasian and Romany (Gypsy) population. Forensic Science International: Genetics 3(1):
e21-e25.
Soták M et al. (2011) Population database of 17 autosomal STR loci from the four predominant
Eastern Slovakia regions. Forensic Science International: Genetics 5(3): 262-263.
Staubach F et al. (2017) Note limitations of DNA legislation. Nature 545(7652): 30-30.
Sumegi K et al. (2015) Functional variants of lipid level modifier MLXIPL, GCKR, GALNT2,
CILP2, ANGPTL3 and TRIB1 genes in healthy Roma and Hungarian populations.
Pathology & Oncology Research 21(3): 743-749.
Surdu M (2016) Those Who Count. Budapest, New York: Central European University Press.
Szalai R et al. (2014a) Genetic polymorphisms in promoter and intronic regions of CYP1A2
gene in Roma and Hungarian population samples. Environmental Toxicology and
Pharmacology 38(3): 814-820.
Szalai R et al. (2014b) Admixture of beneficial and unfavourable variants of GLCCI1 and FCER2 in
Roma samples can implicate different clinical response to corticosteroids. Molecular
Biology Reports 41(11): 7665-7669.
Szalai R et al. (2015) Interethnic differences of cytochrome P450 gene polymorphisms may
influence outcome of taxane therapy in Roma and Hungarian populations. Drug
Metabolism and Pharmacokinetics 30(6): 453-456.
Szamosi B (2010) Genetic Studies of Romani Populations in Hungary. An Intersectional
Analysis. Bachelor thesis, Central European University, Budapest.
Takezawa Y et al. (2014) Human genetic research, race, ethnicity and the labeling of
populations: recommendations based on an interdisciplinary workshop in Japan. BMC Medical
Ethics 15(1): 33.
Toom V (2010) Inquisitorial forensic DNA profiling in the Netherlands and the expansion of the
forensic genetic body. In Hindmarsh, R. and Prainsack, B. (eds.), Genetic Suspects. Global
Governance of Forensic DNA Profiling and Databasing, pp. 175-196. Cambridge:
Cambridge University Press.
Toom V, Wienroth M, M'charek A, Prainsack B, Williams R, Duster T et al. (2016) Approaching
ethical, legal and social issues of emerging forensic DNA phenotyping (FDP) technologies
comprehensively: Reply to 'Forensic DNA phenotyping: Predicting human appearance
from crime scene material for investigative purposes' by Manfred Kayser. Forensic
Science International: Genetics 22: e1-e4.
Tournev I (2016) The Meryon Lecture at the 18th Annual Meeting of the Meryon Society,
Wolfson College, Oxford, UK, 12th September 2014: Neuromuscular disorders in Roma
(Gypsies)–collaborative studies, epidemiology, community-based carrier testing program
and social activities. Neuromuscular Disorders 26(1): 94-103.
Varszegi D et al. (2014) Hodgkin disease therapy induced second malignancy susceptibility
6q21 functional variants in Roma and Hungarian population samples. Pathology &
Oncology Research 20: 529-533.
Walter H et al. (1992) Investigations on the variability of four genetic serum protein markers in
Poland. Zeitschrift für Morphologie und Anthropologie 79(2): 203-214.
Wang M et al. (2018) Genetic characteristics and phylogenetic analysis of three Chinese ethnic
groups using the Huaxia Platinum System. Scientific Reports 8(1): 1-8.
Weber A et al. (2015) Increased prevalence of functional minor allele variants of drug
metabolizing CYP2B6 and CYP2D6 genes in Roma population samples. Pharmacological Reports
67(3): 460-464.
Wienroth M (2020) Value beyond scientific Validity: Let’s RULE (Reliability, Utility, LEgitimacy).
Journal of Responsible Innovation.
Williams R and Wienroth M (2017) Social and Ethical Aspects of Forensic Genetics: A Critical
Review. Forensic Science Review 29(2): 145-169.
Willems W (1997) In Search of the True Gypsy: From Enlightenment to Final Solution. London:
Frank Cass.
Zaharova B et al. (2001) Y-chromosomal STR haplotypes in three major population groups in
Bulgaria. Forensic Science International 124(2-3): 182-186.
Zhang D et al. (2019) RETRACTED ARTICLE: Y Chromosomal STR haplotypes in Chinese
Uyghur, Kazakh and Hui ethnic groups and genetic features of DYS448 null allele and
DYS19 duplicated allele. International Journal of Legal Medicine.
Policy documents
EUROFORGEN (2017) Making sense of forensic genetics. What can DNA tell you about a
crime? Accessed 22.10.2020 at
https://senseaboutscience.org/wp-content/uploads/2017/01/making-sense-of-forensic-genetics.pdf
Europol (2016) Situation report: Trafficking in human beings in the EU. Europol Public
Information. Accessed 25.10.2020 at
https://www.europol.europa.eu/publications-documents/trafficking-in-human-beings-in-eu
Europol, O.C.T.A. (2011) EU organised crime threat assessment. European Police Office.
Accessed 25.10.2020 at
https://www.europol.europa.eu/activities-services/main-reports/octa-2011-eu-organised-crime-threat-assessment
European Society for Human Genetics ESHG (2012) ESHG condemns use of testing to
establish 'racial purity', Wednesday, June 13, 2012. Accessed 31.10.2020 at
https://secure.eshg.org/477.0.html
European Society for Human Genetics ESHG (2003a) Data storage and DNA banking for
biomedical research: technical, social and ethical issues. European Journal of Human
Genetics 11(12): 906-908.
European Society for Human Genetics ESHG (2003b) Population genetic screening
programmes: technical, social and ethical issues. European Journal of Human Genetics 11(12):
S5-7.
FRA, European Union Agency for Fundamental Rights (2010) Data in focus report: police stops
and minorities. Accessed 25.10.2020 at
https://fra.europa.eu/sites/default/files/fra_uploads/1132-EU-MIDIS-police.pdf
H3Africa-Consortium (2017) H3Africa guidelines for community engagement (Version
Two). Consortium HA, editor. Accessed 11.11.2020 at
https://h3africa.org/wp-content/uploads/2018/05/CE%20Revised%20Guidelines_Final_September%202017%20(1).pdf
Ministry of Culture Sweden (2015) The Dark Unknown History. White Paper on Abuses and
Rights Violations Against Roma in the 20th Century. Accessed 31.10.2020 at
https://www.government.se/49b72f/contentassets/eab06c1ac82b476586f928931cfc8238/the-dark-unknown-history---white-paper-on-abuses-and-rights-violations-against-roma-in-the-20th-century-ds-20148
Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of
Benefits Arising from their Utilization to the Convention on Biological Diversity. Accessed
21.10.2020 at https://www.cbd.int/abs/doc/protocol/nagoya-protocol-en.pdf
World Medical Association Declaration of Helsinki - Ethical Principles for Medical Research
Involving Human Subjects. Accessed 21.10.2020 at
https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/
UNESCO (2003) International Declaration on Human Genetic Data. Accessed 21.10.2020 at
http://portal.unesco.org/en/ev.php-URL_ID=17720&URL_DO=DO_TOPIC&URL_SECTION=201.html