Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA.
PLoS Computational Biology (Impact Factor: 4.83). 12/2011; 7(12):e1002278. DOI: 10.1371/journal.pcbi.1002278
Source: PubMed

ABSTRACT Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches: for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curricula emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums.

  • Source
    ABSTRACT: The use of whole genome sequencing in translational research not only holds promise for finding new targeted therapies but also raises several ethical and legal questions. The four main ethical and legal challenges are as follows: (1) the handling of additional or incidental findings stemming from whole genome sequencing in research contexts; (2) the compatibility and balancing of data protection and research that is based on broad data sharing; (3) the responsibility of researchers, particularly of non-physician researchers, working in the field of genome sequencing; and (4) the process of informing and asking patients or research subjects for informed consent to the sequencing of their genome. In this paper, first, these four challenges are illustrated and, second, concrete solutions are proposed, as elaborated by the interdisciplinary Heidelberg EURAT project group, as guidelines for the use of genome sequencing in translational research and therapy in Heidelberg.
    Journal of Laboratory and Clinical Medicine 07/2014; 38(4):211-20. DOI:10.1515/labmed-2014-0027 · 2.80 Impact Factor
  • Source
    ABSTRACT: As research laboratories and clinics collaborate to achieve precision medicine, both communities are required to understand mandated electronic health/medical record (EHR/EMR) initiatives that will be fully implemented in all clinics in the United States by 2015. Stakeholders will need to evaluate current record keeping practices and optimize and standardize methodologies to capture nearly all information in digital format. Collaborative efforts from academic and industry sectors are crucial to achieving higher efficacy in patient care while minimizing costs. Currently existing digitized data and information are present in multiple formats and are largely unstructured. In the absence of a universally accepted management system, departments and institutions continue to generate silos of information. As a result, invaluable and newly discovered knowledge is difficult to access. To accelerate biomedical research and reduce healthcare costs, clinical and bioinformatics systems must employ common data elements to create structured annotation forms enabling laboratories and clinics to capture sharable data in real time. Conversion of these datasets to knowable information should be a routine institutionalized process. New scientific knowledge and clinical discoveries can be shared via integrated knowledge environments defined by flexible data models and extensive use of standards, ontologies, vocabularies, and thesauri. In the clinical setting, aggregated knowledge must be displayed in user-friendly formats so that physicians, non-technical laboratory personnel, nurses, data/research coordinators, and end-users can enter data, access information, and understand the output. The effort to connect astronomical numbers of data points, including '-omics'-based molecular data, individual genome sequences, experimental data, patient clinical phenotypes, and follow-up data is a monumental task. Roadblocks to this vision of integration and interoperability include ethical, legal, and logistical concerns. Ensuring data security and protection of patient rights while simultaneously facilitating standardization is paramount to maintaining public support. The capabilities of supercomputing need to be applied strategically. A standardized, methodological implementation must be applied to developed artificial intelligence systems with the ability to integrate data and information into clinically relevant knowledge. Ultimately, the integration of bioinformatics and clinical data in a clinical decision support system promises precision medicine and cost effective and personalized patient care.
    01/2015; 5:4. DOI:10.1186/s13336-015-0019-3
  • Source
    ABSTRACT: The risks and benefits of research using large databases of personal information are evolving in an era of ubiquitous, internet-based data exchange. In addition, information technology has facilitated a shift in the relationship between individuals and their personal data, enabling increased individual control over how (and how much) personal data are used in research, and by whom. This shift in control has created new opportunities to engage members of the public as partners in the research enterprise on more equal and transparent terms. Here, we consider how some of the technological advances driving and paralleling developments in genomics can also be used to supplement the practice of informed consent with other strategies to ensure that the research process as a whole honors the notion of respect for persons upon which human research subjects protections are premised. Further, we suggest that technological advances can help the research enterprise achieve a more thoroughgoing respect for persons than was possible when current policies governing human subject research were developed. Questions remain about the best way to revise policy to accommodate these changes.
    03/2014; 5(1):1-12. DOI:10.3390/genes5010001
