Helen Parkinson

European Molecular Biology Laboratory | EMBL · EMBL Hinxton (EBI)

257
Publications
57,244
Reads
28,473
Citations

Publications (257)
Article
The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7 000 publications for >15 000 traits, from which more than 625 000 lead associations have been curated. Additionally, 85 000 full genome-wide summary st...
Preprint
Full-text available
The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7,000 publications for more than 15,000 traits, from which more than 625,000 lead associations have been curated. Additionally, 85,000 full genome-wide s...
Article
Full-text available
The International Mouse Phenotyping Consortium (IMPC) systematically produces and phenotypes mouse lines with presumptive null mutations to provide insight into gene function. The IMPC now uses the programmable RNA-guided nuclease Cas9 for its increased capacity and flexibility to efficiently generate null alleles in the C57BL/6N strain. In additio...
Preprint
Phenotypic data are critical for understanding biological mechanisms and consequences of genomic variation, and are pivotal for clinical use cases such as disease diagnostics and treatment development. For over a century, vast quantities of phenotype data have been collected in many different contexts covering a variety of organisms. The emerging f...
Preprint
Full-text available
Splicing quantitative trait loci (QTLs) have been implicated as a common mechanism underlying complex trait associations. However, utilising splicing QTLs in target discovery and prioritisation has been challenging due to extensive data normalisation which often renders the direction of the genetic effect as well as its magnitude difficult to inter...
Article
Full-text available
The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics¹. Here we examine a large cohort (the INTERVAL study²; n = 50,000 part...
Article
Full-text available
Background: Microphthalmia, anophthalmia, and coloboma (MAC) spectrum disease encompasses a group of eye malformations which play a role in childhood visual impairment. Although the predominant cause of eye malformations is known to be heritable in nature, with 80% of cases displaying loss-of-function mutations in the ocular developmental genes OT...
Article
Full-text available
As a model organism, Drosophila is uniquely placed to contribute to our understanding of how brains control complex behavior. Not only does it have complex adaptive behaviors, but also a uniquely powerful genetic toolkit, increasingly complete dense connectomic maps of the central nervous system and a rapidly growing set of transcriptomic profiles...
Article
Full-text available
Motivation Since early 2020, the COVID-19 pandemic has confronted the biomedical community with an unprecedented challenge. The rapid spread of COVID-19 and ease of transmission seen worldwide is due to increased population flow and international trade. Front-line medical care, treatment research and vaccine development also require rapid and infor...
Article
Full-text available
PDCM Finder (www.cancermodels.org) is a cancer research platform that aggregates clinical, genomic and functional data from patient-derived xenografts, organoids and cell lines. It was launched in April 2022 as a successor of the PDX Finder portal, which focused solely on patient-derived xenograft models. Currently the portal has over 6200 models a...
Article
Full-text available
The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to >200 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for >45 000 published GWA...
Article
The GWAS Catalog is a comprehensive resource of data from genome wide association studies. Top associations and detailed metadata are made available in a standard format alongside full p-value summary statistics. These are re-used by the genomics community, e.g. in meta-analyses, generation of polygenic scores, identification of new drug targets. A...
Article
Full-text available
The International Mouse Phenotyping Consortium (IMPC; https://www.mousephenotype.org/) web portal makes available curated, integrated and analysed knockout mouse phenotyping data generated by the IMPC project consisting of 85M data points and over 95,000 statistically significant phenotype hits mapped to human diseases. The IMPC portal delivers a s...
Article
Full-text available
Background The diagnostic rate of Mendelian disorders in sequencing studies continues to increase, along with the pace of novel disease gene discovery. However, variant interpretation in novel genes not currently associated with disease is particularly challenging and strategies combining gene functional evidence with approaches that evaluate the p...
Preprint
Summary statistics from genome-wide association studies (GWAS) represent a huge potential for research. A challenge for researchers in this field is the access and sharing of summary statistics data due to a lack of standards for the data content and file format. For this reason, the GWAS Catalog hosted a series of meetings in 2021 with summary statistics...
Article
PDCM Finder (https://cancermodels.org) is a new portal that aggregates patient-derived models (xenografts, cell lines and organoids) from 27 academic and commercial providers, enables users to search and compare over 6000 models and associated molecular data, and connects users with model providers to facilitate collaboration among researchers. Use...
Preprint
Full-text available
There are thousands of distinct disease entities and concepts, each of which are known by different and sometimes contradictory names. The lack of a unified system for managing these entities poses a major challenge for both machines and humans that need to harmonize information to better predict causes and treatments for disease. The Mondo Disease...
Preprint
Full-text available
Background: The FAIR Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify features of FAIR vocabularies, and provide assessment approaches against the features can guide the development of vocabularies. Results: We differentiate...
Article
Full-text available
Background Patient-derived xenografts (PDX) mice models play an important role in preclinical trials and personalized medicine. Sharing data on the models is highly valuable for numerous reasons – ethical, economical, research cross validation etc. The EurOPDX Consortium was established 8 years ago to share such information and avoid duplicating ef...
Preprint
Full-text available
The diagnostic rate of Mendelian disorders in sequencing studies continues to increase, along with the pace of novel disease gene discovery. However, variant interpretation in novel genes not currently associated with disease is particularly challenging and strategies combining gene functional evidence with approaches that evaluate the phenotypic s...
Article
Full-text available
The European Genome-phenome Archive (EGA - https://ega-archive.org/) is a resource for long term secure archiving of all types of potentially identifiable genetic, phenotypic, and clinical data resulting from biomedical research projects. Its mission is to foster hosted data reuse, enable reproducibility, and accelerate biomedical and translational...
Article
Full-text available
Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data condition...
Article
We present the Polygenic Score (PGS) Catalog (https://www.PGSCatalog.org), an open resource of published scores (including variants, alleles and weights) and consistently curated metadata required for reproducibility and independent applications. The PGS Catalog has capabilities for user deposition, expert curation and programmatic access, thus pro...
Article
Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translati...
Article
Full-text available
Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Regular challenges are scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely availabl...
Article
Full-text available
The genetic landscape of diseases associated with changes in bone mineral density (BMD), such as osteoporosis, is only partially understood. Here, we explored data from 3,823 mutant mouse strains for BMD, a measure that is frequently altered in a range of bone pathologies, including osteoporosis. A total of 200 genes were found to significantly aff...
Article
Full-text available
Open Targets Genetics (https://genetics.opentargets.org) is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variant...
Conference Paper
Patient-derived tumor xenograft (PDX) models are a critical oncology platform for cancer research, drug development and personalized medicine. Because of the heterogeneous nature of PDX repositories, finding models of interest is a challenge. The Jackson Laboratory and EMBL-EBI are developing PDX Finder, the world's largest open PDX database contai...
Preprint
Full-text available
Polygenic [risk] scores (PGS) can enhance prediction and understanding of common diseases and traits. However, the reproducibility of PGS and their subsequent applications in biological and clinical research have been hindered by several factors, including: inadequate and incomplete reporting of PGS development, heterogeneity in evaluation techniqu...
Preprint
Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Regular challenges are scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely availabl...
Article
Full-text available
Background: Gene Ontology (GO) is a major bioinformatic resource used for analysis of large biomedical datasets, for example from genome-wide association studies, applied universally across biological fields, including Alzheimer's disease (AD) research. Objective: We aim to demonstrate the applicability of GO for interpretation of AD datasets to...
Article
Full-text available
The identification of causal variants in sequencing studies remains a considerable challenge that can be partially addressed by new gene-specific knowledge. Here, we integrate measures of how essential a gene is to supporting life, as inferred from viability and phenotyping screens performed on knockout mice by the International Mouse Phenotyping C...
Preprint
Full-text available
An increasing number of gene expression quantitative trait locus (QTL) studies have made summary statistics publicly available, which can be used to gain insight into human complex traits by downstream analyses such as fine-mapping and colocalisation. However, differences between these datasets in their variants tested, allele codings, and in the t...
Article
Full-text available
Motivation: High-throughput phenomic projects generate complex data from small treatment and large control groups that increase the power of the analyses but introduce variation over time. A method is needed to utilize a set of temporally local controls that maximises analytic power while minimising noise from unspecified environmental factors. Re...
Conference Paper
Patient-derived tumor xenograft (PDX) mouse models are an important oncology platform for cancer research, drug development and personalized medicine that are available from academic labs, large research consortia and contract research organizations (CROs). Because of the distributed and heterogeneous nature of repositories, finding models of inter...
Conference Paper
Patient-derived tumor xenograft (PDX) mouse models are an important oncology platform for cancer research, drug development and personalized medicine that are available from academic labs, large research consortia and contract research organizations (CROs). Because of the distributed and heterogeneous nature of repositories, finding models of inter...
Preprint
Full-text available
Although genomic sequencing has been transformative in the study of rare genetic diseases, identifying causal variants remains a considerable challenge that can be addressed in part by new gene-specific knowledge. Here, we integrate measures of how essential a gene is to supporting life, as inferred from the comprehensive viability and phenotyping...
Preprint
Motivation High-throughput phenomic projects generate complex data from small treatment and large control groups that increase the power of the analyses but introduce variation over time. A method is needed to utilize a set of temporally local controls that maximises analytic power while minimising noise from unspecified environmental factors. Resu...
Article
Full-text available
[This corrects the article DOI: 10.1038/s42003-018-0226-0.] Despite advances in next generation sequencing technologies, determining the genetic basis of ocular disease remains a major challenge due to the limited access and prohibitive cost of human forward genetics. Thus, less than 4,000 genes currently have available phenotype information for a...
Article
Full-text available
Patient-derived tumor xenograft (PDX) mouse models are a versatile oncology research platform for studying tumor biology and for testing chemotherapeutic approaches tailored to genomic characteristics of individual patients’ tumors. PDX models are generated and distributed by a diverse group of academic labs, multi-institution consortia and contrac...
Article
Full-text available
Despite advances in next generation sequencing technologies, determining the genetic basis of ocular disease remains a major challenge due to the limited access and prohibitive cost of human forward genetics. Thus, less than 4,000 genes currently have available phenotype information for any organ system. Here we report the ophthalmic findings from...
Article
Full-text available
Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of...
Article
Full-text available
The accurate description of ancestry is essential to interpret, access, and integrate human genomics data, and to ensure that these benefit individuals from all ancestral backgrounds. However, there are no established guidelines for the representation of ancestry information. Here we describe a framework for the accurate and standardized descriptio...
Article
Full-text available
The analysis and interpretation of high-throughput datasets relies on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer’s Research United Kingdom (ARUK) foundation...
Article
Full-text available
The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these...
Article
Full-text available
The BioSamples database at EMBL-EBI provides a central hub for sample metadata storage and linkage to other EMBL-EBI resources. BioSamples has recently undergone major changes, both in terms of data content and supporting infrastructure. The data content has more than doubled from around 2 million samples in 2014 to just over 5 million samples in 2...
Article
Patient-derived tumor xenograft (PDX) mouse models have emerged as an important oncology research platform to study tumor evolution, drug response and for tailoring chemotherapeutic approaches to individual patients. PDX models are produced and made available in repositories managed by small academic labs, large research consortia and contract rese...
Preprint
Full-text available
Patient-derived tumor xenograft (PDX) mouse models are a versatile oncology research platform for studying tumor biology and for testing chemotherapeutic approaches tailored to genomic characteristics of individual patient’s tumors. PDX models are generated and distributed by a diverse group of academic labs, research organizations, multi-instituti...
Article
Full-text available
Background: The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to g...
Article
Full-text available
Unambiguous cell line authentication is essential to avoid loss of association between data and cells. The risk for loss of references increases with the rapidity that new human pluripotent stem cell (hPSC) lines are generated, exchanged, and implemented. Ideally, a single name should be used as a generally applied reference for each cell line to a...
Article
Full-text available
Background: The Experimental Factor Ontology (EFO) is an application ontology driven by experimental variables including cell lines to organize and describe the diverse experimental variables and data resided in the EMBL-EBI resources. The Cell Line Ontology (CLO) is an OBO community-based ontology that contains information of immortalized cell li...
Article
Full-text available
Patient-derived tumor xenograft (PDX) mouse models have emerged as an important oncology research platform to study tumor evolution, mechanisms of drug response and resistance, and tailoring chemotherapeutic approaches for individual patients. The lack of robust standards for reporting on PDX models has hampered the ability of researchers to find...
Article
Full-text available
The arrival of high-throughput technologies in cancer science and medicine has made the possibility for knowledge generation greater than ever before. However, this has brought with it real challenges as researchers struggle to analyse the avalanche of information available to them. A unique U.K.-based initiative has been established to promote dat...
Preprint
Background The accurate description of ancestry is essential to interpret and integrate human genomics data, and to ensure that advances in the field of genomics benefit individuals from all ancestral backgrounds. However, there are no established guidelines for the consistent, unambiguous and standardized description of ancestry. To fill this gap,...