As early as 2002, the International Ethics Committee of
the Human Genome Organization (HUGO) stated that
human genomic databases should be considered as global
public goods. In this statement, global public goods
were defined as goods ‘whose scope extends worldwide,
are enjoyable by all with no groups excluded, and when
consumed by one individual, are not depleted for others’.
Buttressed by the Bermuda Principles of 1996 and
mirrored in the Fort Lauderdale rules of 2003, the
common philosophy of sharing resources was reaffirmed
in the 2008 International Summit on Proteomics Data
Release and Sharing Policy in Amsterdam and in the
Toronto International Data Release Workshop of 2009.
Finally, in January 2011, 17 major health funding
agencies signed a joint statement on sharing research
data to promote and improve public health. However,
the challenge is to take these fundamental values of
sharing and access and to develop guiding principles
and procedures that can be used as a basis for
To begin, we consider data sharing as a form of data
processing as defined by the EU Directive 95/46/EC on
data protection. In this directive, data processing
refers to: ‘any operation or set of operations which is
performed upon personal data, whether or not by
automatic means, such as […] retrieval, consultation, use,
disclosure by transmission, dissemination or otherwise
making available […]’. Data can include raw data,
genotype/phenotype data and data included within
governmental health administrative databases. Theoretically,
personal medical records could be subsumed under
this term, but we have not specifically addressed such
data because their regulation is jurisdiction-specific. The
code’s principles, however, remain pertinent to such data.
For the terms ‘coded’ and ‘anonymized’, we use the
definitions provided by the 2007 International Conference
on Harmonization.
Data sharing is regarded as essential for enabling and
promoting genomic research in a way that will maximize
the benefits to public health and society. All
countries, funders and investigators are aware of the need
for research ethics and governance mechanisms in
research, but currently there is little policy guidance that
is specific to the international sharing of genomic
research data. In view of the recent calls for the development
of common principles applying to data access and
use [7,10], the Public Population Project in Genomics (P3G),
the European Network for Genetic and Genomic
Epidemiology (ENGAGE) and the Centre for Health,
Law and Emerging Technologies (HeLEX) are working
on an international data sharing Code of Conduct
(Box 1). This has a dual purpose: to elucidate shared
values and to provide guidance on the basic obligations
Data sharing is increasingly regarded as an ethical and
scientific imperative that advances knowledge and
thereby respects the contributions of the participants.
Because of this and the ever-increasing number of
data access requests currently filed around the world,
three groups have decided to develop data sharing
principles specific to the context of collaborative
international genomics research. These groups are: the
international Public Population Project in Genomics
(P3G), an international consortium of projects partaking
in large-scale genetic epidemiological studies
and biobanks; the European Network for Genetic
and Genomic Epidemiology (ENGAGE), a research
project aiming to translate data from large-scale
epidemiological research initiatives into relevant
clinical information; and the Centre for Health, Law and
Emerging Technologies (HeLEX). We propose seven
different principles and a preliminary international data
sharing Code of Conduct for ongoing discussion.
© 2010 BioMed Central Ltd
Towards a data sharing Code of Conduct for
international genomic research
Bartha Maria Knoppers*, Jennifer R Harris, Anne Marie Tassé, Jane Kaye, Mylène Deschênes and Ma’n H Zawati
Department of Human Genetics, McGill University, 740 Dr Penfield Avenue,
Montreal, Quebec H3A 1A4, Canada
Full list of author information is available at the end of the article
Knoppers et al. Genome Medicine 2011, 3:46
© 2011 BioMed Central Ltd
flowing from it. Given the varied disciplinary backgrounds
of researchers working in genomic research, it
can no longer be presumed that all the scientists engaged
in data sharing are bound by the same medical or other
professional deontological frameworks or can be subject
to disciplinary action for a breach. Therefore, the proposed
international Code of Conduct for data sharing in
genomic research seeks to provide common guidance on
the basis of two fundamental values: (i) mutual respect
and trust between scientists, stakeholders and participants;
and (ii) a commitment to safeguarding public
trust, participation and investment. The elaboration and
eventual implementation of such a code should be the
object of ongoing discussion and will begin with a series
Box 1. International Data Sharing Code of Conduct
This proposed international data sharing Code of Conduct seeks to promote greater access to and use of data in ways that are (as
proposed by the joint statement by funders of health research):
‘Equitable: any approach to the sharing of data should recognize and balance the needs of researchers who generate and use data, other
analysts who might want to reuse those data, and communities and funders who expect health benefits to arise from research.
Ethical: all data sharing should protect the privacy of individuals and the dignity of communities, while simultaneously respecting the
imperative to improve public health through the most productive use of data.
Efficient: any approach to data sharing should improve the quality and value of research and increase its contribution to improving public
health. Approaches should be proportionate and build on existing practice and reduce unnecessary duplication and competition.’
Principles and Procedures
Irrespective of the discipline, scientists involved in data sharing should be bona fide researchers.
Proof of academic or other recognized peer reviewed standing is essential.
Harmonization of data collection and archiving methods and tools ensures validation of scientific quality.
Collaboration promotes efficiency, sustainability and comparability.
Facilitation of both the deposit of data and secure access to data are the foundations of data sharing.
Curators of databases should promote sharing to generate maximum value.
Harmonization of deposit, access procedures and use promotes accessibility, equity and transparency.
Responsible governance should be shared between funders, generators and users of data.
Investments in databases require coordination, strategy and long-term core funding.
Mechanisms for building interoperability should be encouraged and appropriate management anticipated.
Capacity building and recognition of all the data generators contribute to best practices.
Trust and the promotion of data sharing rely on data management and security mechanisms and also on oversight of their functioning.
Mechanisms for identifying and tracking data generators and users should be international.
Key policies on publications, intellectual property, and industry involvement should be public.
Websites that are accessible to the general public serve to provide feedback on progress and general results.
Inter-agency co-operation and funding foster streamlined and efficient monitoring and good governance.
Provisions should be made for ongoing public engagement that is tailored to the nature of the database and local cultures.
Mutual respect between all stakeholders is founded on personal and professional integrity.
Prevention of harms and anticipation of public concerns and scientific needs through foresight mechanisms encourage the development
of common, prospective policies.
Irresponsible research practices should be reported.
Sanctions for breach of this Code or of other legal or ethical obligations must be clear.
of consultative discussions at international, European
and national fora.
Principles and procedures: background and
Although we are not attempting to prioritize or in any
way create a hierarchy among various principles in the
field of data sharing, they all derive from a shared belief
in maximizing both scientific quality (Box 1, point 1) and
public benefit through rapid release and public accessibility
to data (Box 1, point 2).
The assurance of quality is a sine qua non for ethical
science. Making it an explicit requirement reiterates its
importance and mandates comparison, validation and
replication, thereby ensuring appropriate and common
standard operating procedures and the use of accredited
facilities. Prospectively harmonizing procedures to facilitate
interoperability and comparability is likely to promote
such quality and accessibility.
There is no doubt that maximizing public benefit,
investment and participation is facilitated through data
sharing. Not only should access be equitable for researchers
in both the public and private sectors, but ethics
reviewers should have the proper training and tools to
evaluate international requests. The datasets themselves
may be derived from the contributions of multiple
sources from different countries and projects. The
current legal and ethical constraints and bottlenecks to
access are obvious. Indeed, multiplicity of ethics review
may well be the Achilles heel for efficient sharing.
The tripartite responsibility of the data producers,
users and funders lays the foundation for data sharing
(Box 1, point 3). We see data sharing, which is often a
condition of funding, as part of the efficient and proper
stewardship of public funds. It also binds eventual users
in the recognition of a just return on public investment
and participation. This responsibility is chiefly expressed
both in the security mechanisms that translate the
principle into the construction of information technology
tools and firewalls and in the governance framework.
Security mechanisms (Box 1, point 4) go well beyond the
application of firewalls or de-identification techniques,
such as coding or anonymization. Indeed, unique, digital
identifiers (IDs) for biobanks [15,16] and for researchers
have been proposed not only for security purposes
but to facilitate access. Such IDs would enable verification
and validation of the identities and credentials of
researchers by institutions and would become a mechanism
for allowing, tracking and auditing access, as well as
Digital identifier systems allow data tracing and prospectively
limit the potential for malicious activities
involving re-identification of participants. This transparency
of data flow, access and use also curtails the
possibility of pre-publication scooping between producers
and users (Box 1, point 5). Pre-publication data release
depends on the respect by users and journals of
publication moratoria that allow data producers to share
data openly but provide them with a period of time to analyze and
publish their own data before secondary users do so.
Proper acknowledgement of the use of data resources
also allows funders to track their ‘investments’. It allows
the public to see that their altruistic participation has led
to fruitful scientific endeavors. Most importantly, data
users agree not to use intellectual property protection in
ways that would prevent or block access to, or use of, any
element of the dataset or any conclusion drawn directly
from it. This does not prevent further research with
attendant intellectual property rights in downstream
discoveries provided that the best practices for licensing
policies for genomic inventions are followed.
Good governance underpins a system of data sharing that
depends on trust. Approaches to governance necessarily
vary between contexts and countries. Irrespective of
these differences, governance should be flexible in the
oversight and monitoring systems put in place. This is
crucial because public trust, which is increasingly translated
through broad consents, is counterbalanced by both
security systems and governance. It could be asked
whether in considering the longevity of large international
datasets, including samples, separate governance
models should be developed as distinct from local institutional
mechanisms or those applicable to the oversight
of clinical trials.
Good governance assures the public and funders of
proper accountability and ethics review (Box 1, point 6).
Although local laws and ethics review systems vary, the
ethics norms and biobank policies applicable to large
data repositories are beginning to emerge [19,20]. These
common norms are increasingly mirrored in model
material transfer and access agreements. Contractual
in nature, they serve to bind researchers and their institutions.
Implicit in such agreements are the very principles
under discussion here. Making them explicit through
such contracts gives researchers, policymakers and ethics
committees more transparent tools to work with.
For scientific integrity (Box 1, point 7) to be
viable, discussion on the nature of such principles and
their procedural translation in different contexts will
necessarily vary. Nevertheless, mutual respect between
all stakeholders and participants can be built on these
fundamental principles and procedures. Integrity also
entails the prevention of harms, anticipation of public
concerns and scientific needs as well as the reporting of
irresponsible research practices and the creation of
appropriate sanctions.
Most importantly, ongoing communication with the
public on the ‘reality’ of data sharing principles and
procedures is essential. Thus, lay summaries of the
research proposals accessing and using data repositories
should be publicly posted. Although there is no personal
benefit to participants, such a public registry of research
uses ultimately allows participants to withdraw if they
disagree with the direction of the research. There are also
other mechanisms of communication, such as bulletins
and websites. Population studies recontact their participants
for updates, or to take new measurements, thereby
keeping ongoing consent alive and valid.
The most telling aspect of the developments described
above, however, is that the underlying values presented
here come from the current approaches promoted and
used by the scientists and funders themselves. Concern
for scientific integrity and mutual respect are then not
imposed by legislative or professional fiat but rather
reveal an already existing shared ethos on the proper
foundations for international science in the 21st century.
This augurs well for the future viability of the preliminary
version of our proposed international data sharing Code
of Conduct in genomic research (Box 1).
Addressing the issue of data sharing in the context of
international genomic research requires not only a
holistic approach, but also the fair balancing of the
interests, rights and duties of various stakeholders involved
in collaborative endeavors. We have highlighted the need
for equitable, ethical and efficient access to data and
proposed a Code of Conduct (Box 1) that incorporates
seven principles: quality, accessibility, responsibility,
security, transparency, accountability and integrity. We
trust that this code will foster broader discussion
involving multiple stakeholders.
ENGAGE, European Network for Genetic and Genomic Epidemiology; HeLEX,
Centre for Health, Law and Emerging Technologies; HUGO, Human Genome
Organization; P3G, Public Population Project in Genomics.
The authors declare that they have no competing interests.
BMK wrote the first draft of the manuscript; JRH, AMT, IBL, JK, MD and MHZ
contributed equally to the manuscript. All authors read and approved the final
The authors would like to thank Michael Le Huynh for his assistance in editing
1Department of Human Genetics, McGill University, 740 Dr Penfield Avenue,
Montreal, Quebec H3A 1A4, Canada. 2Norwegian Institute of Public Health,
PO Box 4404, Nydalen, N-0403 Oslo, Norway. 3Department of Public Health,
University of Oxford, Richards Building, Old Road Campus, Headington, Oxford
OX3 7LF, UK. 4P3G, Suite 590, 3333 Queen-Mary Road, Montreal, Quebec
Published: 14 July 2011
1. Human Genome Organisation (HUGO), Ethics Committee: Statement on
human genomic databases, December 2002. J Int Bioethique 2003,
2. Human Genome Organisation (HUGO): Principles Agreed at the First
International Strategy Meeting on Human Genome Sequencing: 25-28 February
1996; Bermuda. HUGO; 1996. [http://www.gene.ucl.ac.uk/hugo/bermuda.htm]
3. Sharing Data from Large-scale Biological Research Projects: A System of
Tripartite Responsibility. Report of a meeting organized by the Wellcome
Trust and held on 14-15 January 2003 at Fort Lauderdale, USA
4. Rodriguez H, Snyder M, Uhlén M, Andrews P, Beavis R, Borchers C, Chalkley RJ,
Cho SY, Cottingham K, Dunn M, Dylag T, Edgar R, Hare P, Heck AJ, Hirsch RF,
Kennedy K, Kolar P, Kraus HJ, Mallick P, Nesvizhskii A, Ping P, Pontén F, Yang L,
Yates JR, Stein SE, Hermjakob H, Kinsinger CR, Apweiler R: Recommendations
from the 2008 International Summit on Proteomics Data Release and
Sharing Policy: The Amsterdam principles. J Proteome Res 2009, 8:3689-3692.
5. Birney E, Hudson TJ, Green ED, Gunter C, Eddy S, Rogers J, Harris JR, Ehrlich
SD, Apweiler R, Austin CP, Berglund L, Bobrow M, Bountra C, Brookes AJ,
Cambon-Thomsen A, Carter NP, Chisholm RL, Contreras JL, Cooke RM, Crosby
WL, Dewar K, Durbin R, Dyke SO, Ecker JR, El Emam K, Feuk L, Gabriel SB,
Gallacher J, Gelbart WM, Granell A, et al.: Prepublication data sharing. Nature
6. Sharing research data to improve public health: full joint statement by
funders of health research [http://www.wellcome.ac.uk/About-us/Policy/
7. EU Directive 95/46/EC - The Data Protection Directive
8. International Conference on Harmonisation of Technical Requirements for
Registration of Pharmaceuticals for Human Use: Definitions for Genomic
Biomarkers, Pharmacogenomics, Pharmacogenetics, Genomic Data and
Sample Coding Categories E15 [http://www.ich.org/fileadmin/Public_Web_
9. Data storage and DNA banking for biomedical research: technical, social
and ethical issues. Eur J Hum Genet 2003, 11 Suppl 2:S8-S10.
10. O’Brien SJ: Stewardship of human biospecimens, DNA, genotype, and
clinical data in the GWAS era. Annu Rev Genomics Hum Genet 2009, 10:193-209.
11. Public Population Project in Genomics [http://www.p3g.org]
12. European Network for Genetic and Genomic Epidemiology
13. Centre for Health, Law and Emerging Technologies
14. Laurie G, Mallia P, Frenkel DA, Krajewska A, Moniz H, Nordal S, Pitz C, Sandor J:
Managing Access to Biobanks: how can we reconcile individual privacy
and public interests in genetic research? Med Law Int 2010, 10:315-337.
15. Cambon-Thomsen A: Assessing the impact of biobanks. Nat Genet 2003,
16. Kaufmann F, Cambon-Thomsen A: Tracing biological collections: between
books and clinical trials. JAMA 2008, 299:2316-2318.
17. GEN2PHEN Knowledge Centre [http://www.gen2phen.org]
18. International Cancer Genome Consortium [http://www.icgc.org]
19. OECD principles and guidelines for access to research data from public
20. Guidelines on human biobanks and genetic research databases (HBGRDs)
21. Singapore Statement on Research Integrity
Cite this article as: Knoppers BM, et al.: Towards a data sharing Code of
Conduct for international genomic research. Genome Medicine 2011, 3:46.