Matthew E Hurles

Chinese PLA General Hospital, Beijing, Beijing Shi, China

Publications (60) · 1051.87 Total impact

  • Article: Genome-wide SNP and CNV analysis identifies common and low-frequency variants associated with severe early-onset obesity.
    ABSTRACT: Common and rare variants associated with body mass index (BMI) and obesity account for <5% of the variance in BMI. We performed SNP and copy number variation (CNV) association analyses in 1,509 children with obesity at the extreme tail (>3 s.d. from the mean) of the BMI distribution and 5,380 controls. Evaluation of 29 SNPs (P < 1 × 10⁻⁵) in an additional 971 severely obese children and 1,990 controls identified 4 new loci associated with severe obesity (LEPR, PRKCH, PACS1 and RMST). A previously reported 43-kb deletion at the NEGR1 locus was significantly associated with severe obesity (P = 6.6 × 10⁻⁷). However, this signal was entirely driven by a flanking 8-kb deletion; absence of this deletion increased risk for obesity (P = 6.1 × 10⁻¹¹). We found a significant burden of rare, single CNVs in severely obese cases (P < 0.0001). Integrative gene network pathway analysis of rare deletions indicated enrichment of genes affecting G protein-coupled receptors (GPCRs) involved in the neuronal regulation of energy homeostasis.
    Nature Genetics 04/2013; · 35.53 Impact Factor
  • Article: Genetic Basis of Y-Linked Hearing Impairment.
    ABSTRACT: A single Mendelian trait has been mapped to the human Y chromosome: Y-linked hearing impairment. The molecular basis of this disorder is unknown. Here, we report the detailed characterization of the DFNY1 Y chromosome and its comparison with a closely related Y chromosome from an unaffected branch of the family. The DFNY1 chromosome carries a complex rearrangement, including duplication of several noncontiguous segments of the Y chromosome and insertion of ∼160 kb of DNA from chromosome 1, in the pericentric region of Yp. This segment of chromosome 1 is derived entirely from within a known hearing impairment locus, DFNA49. We suggest that a third copy of one or more genes from the shared segment of chromosome 1 might be responsible for the hearing-loss phenotype.
    The American Journal of Human Genetics 01/2013; · 10.60 Impact Factor
  • Article: Origins of the domestic horse.
    Proceedings of the National Academy of Sciences 10/2012; · 9.68 Impact Factor
  • Article: Validity of the Family-Based Association Test for Copy Number Variant Data in the Case of Non-Linear Intensity-Genotype Relationship.
    Genetic Epidemiology 09/2012; · 3.44 Impact Factor
  • Article: DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders.
    ABSTRACT: Patients with developmental disorders often harbour sub-microscopic deletions or duplications that lead to a disruption of normal gene expression or perturbation in the copy number of dosage-sensitive genes. Clinical interpretation for such patients in isolation is hindered by the rarity and novelty of such disorders. The DECIPHER project (https://decipher.sanger.ac.uk) was established in 2004 as an accessible online repository of genomic and associated phenotypic data with the primary goal of aiding the clinical interpretation of rare copy-number variants (CNVs). DECIPHER integrates information from a variety of bioinformatics resources and uses visualization tools to identify potential disease genes within a CNV. A two-tier access system permits clinicians and clinical scientists to maintain confidential linked anonymous records of phenotypes and CNVs for their patients that, with informed consent, can subsequently be shared with the wider clinical genetics and research communities. Advances in next-generation sequencing technologies are making it practical and affordable to sequence the whole exome/genome of patients who display features suggestive of a genetic disorder. This approach enables the identification of smaller intragenic mutations including single-nucleotide variants that are not accessible even with high-resolution genomic array analysis. This article briefly summarizes the current status and achievements of the DECIPHER project and looks ahead to the opportunities and challenges of jointly analysing structural and sequence variation in the human genome.
    Human Molecular Genetics 09/2012; 21(R1):R37-44. · 7.64 Impact Factor
  • Article: A systematic survey of loss-of-function variants in human protein-coding genes.
    ABSTRACT: Genome-sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in nonessential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.
    Science 02/2012; 335(6070):823-8. · 31.20 Impact Factor
  • Article: Mapping copy number variation by population-scale genome sequencing.
    ABSTRACT: Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serves as a resource for sequencing-based association studies.
    Nature 02/2011; 470(7332):59-65. · 36.28 Impact Factor
  • Article: Variation in genome-wide mutation rates within and between human families.
    ABSTRACT: J.B.S. Haldane proposed in 1947 that the male germline may be more mutagenic than the female germline. Diverse studies have supported Haldane's contention of a higher average mutation rate in the male germline in a variety of mammals, including humans. Here we present, to our knowledge, the first direct comparative analysis of male and female germline mutation rates from the complete genome sequences of two parent-offspring trios. Through extensive validation, we identified 49 and 35 germline de novo mutations (DNMs) in two trio offspring, as well as 1,586 non-germline DNMs arising either somatically or in the cell lines from which the DNA was derived. Most strikingly, in one family, we observed that 92% of germline DNMs were from the paternal germline, whereas, in contrast, in the other family, 64% of DNMs were from the maternal germline. These observations suggest considerable variation in mutation rates within and between families.
    Nature Genetics 01/2011; 43(7):712-4. · 35.53 Impact Factor
  • Article: Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants.
    ABSTRACT: We have systematically compared copy number variant (CNV) detection on eleven microarrays to evaluate data quality and CNV calling, reproducibility, concordance across array platforms and laboratory sites, breakpoint accuracy and analysis tool variability. Different analytic tools applied to the same raw data typically yield CNV calls with <50% concordance. Moreover, reproducibility in replicate experiments is <70% for most platforms. Nevertheless, these findings should not preclude detection of large CNVs for clinical diagnostic purposes because large CNVs with poor reproducibility are found primarily in complex genomic regions and would typically be removed by standard clinical data curation. The striking differences between CNV calls from different platforms and analytic tools highlight the importance of careful assessment of experimental design in discovery and association studies and of strict data curation and filtering in diagnostics. The CNV resource presented here allows independent data evaluation and provides a means to benchmark new algorithms.
    Nature Biotechnology 01/2011; 29(6):512-20. · 29.50 Impact Factor
  • Article: Independent and population-specific association of risk variants at the IRGM locus with Crohn's disease.
    ABSTRACT: DNA polymorphisms in a region on chromosome 5q33.1 which contains two genes, immunity related GTPase related family, M (IRGM) and zinc finger protein 300 (ZNF300), are associated with Crohn's disease (CD). The deleted allele of a 20 kb copy number variation (CNV) upstream of IRGM was recently shown to be in strong linkage disequilibrium (LD) with the CD-associated single nucleotide polymorphisms and is itself associated with CD (P < 0.01). The deletion was correlated with increased or reduced expression of IRGM in transformed cells in a cell line-dependent manner, and has been proposed as a likely causal variant. We report here that small insertion/deletion polymorphisms in the promoter and 5' untranslated region of IRGM are, together with the CNV, strongly associated with CD (P = 1.37 × 10⁻⁵ to 1.40 × 10⁻⁹), and that the CNV and the 5'-untranslated region variant -308(GTTT)₅ contribute independently to CD susceptibility (P = 2.6 × 10⁻⁷ and P = 2 × 10⁻⁵, respectively). We also show that the CD risk haplotype is associated with a significant decrease in IRGM expression (P < 10⁻¹²) in untransformed lymphocytes from CD patients. Further analysis of these variants in a Japanese CD case-control sample and of IRGM expression in HapMap populations revealed that neither the IRGM insertion/deletion polymorphisms nor the CNV was associated with CD or with altered IRGM expression in the Asian population. This suggests that the involvement of the IRGM risk haplotype in the pathogenesis of CD requires gene-gene or gene-environment interactions which are absent in Asian populations, or that none of the variants analysed are causal, and that the true causal variants arose after the European-Asian split.
    Human Molecular Genetics 05/2010; 19(9):1828-39. · 7.64 Impact Factor
  • Article: Discovery of common Asian copy number variants using integrated high-resolution array CGH and massively parallel DNA sequencing.
    ABSTRACT: Copy number variants (CNVs) account for the majority of human genomic diversity in terms of base coverage. Here, we have developed and applied a new method to combine high-resolution array comparative genomic hybridization (CGH) data with whole-genome DNA sequencing data to obtain a comprehensive catalog of common CNVs in Asian individuals. The genomes of 30 individuals from three Asian populations (Korean, Chinese and Japanese) were interrogated with an ultra-high-resolution array CGH platform containing 24 million probes. Whole-genome sequencing data from a reference genome (NA10851, with 28.3x coverage) and two Asian genomes (AK1, with 27.8x coverage and AK2, with 32.0x coverage) were used to transform the relative copy number information obtained from array CGH experiments into absolute copy number values. We discovered 5,177 CNVs, of which 3,547 were putative Asian-specific CNVs. These common CNVs in Asian populations will be a useful resource for subsequent genetic studies in these populations, and the new method of calling absolute CNVs will be essential for applying CNV data to personalized medicine.
    Nature Genetics 04/2010; 42(5):400-5. · 35.53 Impact Factor
  • Article: Mutation spectrum revealed by breakpoint sequencing of human germline CNVs.
    ABSTRACT: Precisely characterizing the breakpoints of copy number variants (CNVs) is crucial for assessing their functional impact. However, fewer than 10% of known germline CNVs have been mapped to the single-nucleotide level. We characterized the sequence breakpoints from a dataset of all CNVs detected in three unrelated individuals in previous array-based CNV discovery experiments. We used targeted hybridization-based DNA capture and 454 sequencing to sequence 324 CNV breakpoints, including 315 deletions. We observed two major breakpoint signatures: 70% of the deletion breakpoints have 1-30 bp of microhomology, whereas 33% of deletion breakpoints contain 1-367 bp of inserted sequence. The co-occurrence of microhomology and inserted sequence is low (10%), suggesting that there are at least two different mutational mechanisms. Approximately 5% of the breakpoints represent more complex rearrangements, including local microinversions, suggesting a replication-based strand switching mechanism. Despite a rich literature on DNA repair processes, reconstruction of the molecular events generating each of these mutations is not yet possible.
    Nature Genetics 04/2010; 42(5):385-91. · 35.53 Impact Factor
  • Article: Towards a comprehensive structural variation map of an individual human genome.
    ABSTRACT: Several genomes have now been sequenced, with millions of genetic variants annotated. While significant progress has been made in mapping single nucleotide polymorphisms (SNPs) and small (<10 bp) insertion/deletions (indels), the annotation of larger structural variants has been less comprehensive. It is still unclear to what extent a typical genome differs from the reference assembly, and the analysis of the genomes sequenced to date have shown varying results for copy number variation (CNV) and inversions. We have combined computational re-analysis of existing whole genome sequence data with novel microarray-based analysis, and detect 12,178 structural variants covering 40.6 Mb that were not reported in the initial sequencing of the first published personal genome. We estimate a total non-SNP variation content of 48.8 Mb in a single genome. Our results indicate that this genome differs from the consensus reference sequence by approximately 1.2% when considering indels/CNVs, 0.1% by SNPs and approximately 0.3% by inversions. The structural variants impact 4,867 genes, and >24% of structural variants would not be imputed by SNP-association. Our results indicate that a large number of structural variants have been unreported in the individual genomes published to date. This significant extent and complexity of structural variants, as well as the growing recognition of their medical relevance, necessitate they be actively studied in health-related analyses of personal genomes. The new catalogue of structural variants generated for this genome provides a crucial resource for future comparison studies.
    Genome biology 01/2010; 11(5):R52. · 6.63 Impact Factor
  • Article: Characterising and predicting haploinsufficiency in the human genome.
    Ni Huang, Insuk Lee, Edward M Marcotte, Matthew E Hurles
    ABSTRACT: Haploinsufficiency, wherein a single functional copy of a gene is insufficient to maintain normal function, is a major cause of dominant disease. Human disease studies have identified several hundred haploinsufficient (HI) genes. We have compiled a map of 1,079 haplosufficient (HS) genes by systematic identification of genes unambiguously and repeatedly compromised by copy number variation among 8,458 apparently healthy individuals and contrasted the genomic, evolutionary, functional, and network properties between these HS genes and known HI genes. We found that HI genes are typically longer and have more conserved coding sequences and promoters than HS genes. HI genes exhibit higher levels of expression during early development and greater tissue specificity. Moreover, within a probabilistic human functional interaction network HI genes have more interaction partners and greater network proximity to other known HI genes. We built a predictive model on the basis of these differences and annotated 12,443 genes with their predicted probability of being haploinsufficient. We validated these predictions of haploinsufficiency by demonstrating that genes with a high predicted probability of exhibiting haploinsufficiency are enriched among genes implicated in human dominant diseases and among genes causing abnormal phenotypes in heterozygous knockout mice. We have transformed these gene-based haploinsufficiency predictions into haploinsufficiency scores for genic deletions, which we demonstrate to better discriminate between pathogenic and benign deletions than consideration of the deletion size or numbers of genes deleted. These robust predictions of haploinsufficiency support clinical interpretation of novel loss-of-function variants and prioritization of variants and genes for follow-up studies.
    PLoS Genetics 01/2010; 6(10):e1001154. · 8.69 Impact Factor
  • Article: Large, rare chromosomal deletions associated with severe early-onset obesity.
    ABSTRACT: Obesity is a highly heritable and genetically heterogeneous disorder. Here we investigated the contribution of copy number variation to obesity in 300 Caucasian patients with severe early-onset obesity, 143 of whom also had developmental delay. Large (>500 kilobases), rare (<1%) deletions were significantly enriched in patients compared to 7,366 controls (P < 0.001). We identified several rare copy number variants that were recurrent in patients but absent or at much lower prevalence in controls. We identified five patients with overlapping deletions on chromosome 16p11.2 that were found in 2 out of 7,366 controls (P < 5 × 10⁻⁵). In three patients the deletion co-segregated with severe obesity. Two patients harboured a larger de novo 16p11.2 deletion, extending through a 593-kilobase region previously associated with autism and mental retardation; both of these patients had mild developmental delay in addition to severe obesity. In an independent sample of 1,062 patients with severe obesity alone, the smaller 16p11.2 deletion was found in an additional two patients. All 16p11.2 deletions encompass several genes but include SH2B1, which is known to be involved in leptin and insulin signalling. Deletion carriers exhibited hyperphagia and severe insulin resistance disproportionate for the degree of obesity. We show that copy number variation contributes significantly to the genetic architecture of human obesity.
    Nature 12/2009; 463(7281):666-70. · 36.28 Impact Factor
  • Article: Origins and functional impact of copy number variation in the human genome.
    ABSTRACT: Structural variations of DNA greater than 1 kilobase in size account for most bases that vary among human genomes, but are still relatively under-ascertained. Here we use tiling oligonucleotide microarrays, comprising 42 million probes, to generate a comprehensive map of 11,700 copy number variations (CNVs) greater than 443 base pairs, of which most (8,599) have been validated independently. For 4,978 of these CNVs, we generated reference genotypes from 450 individuals of European, African or East Asian ancestry. The predominant mutational mechanisms differ among CNV size classes. Retrotransposition has duplicated and inserted some coding and non-coding DNA segments randomly around the genome. Furthermore, by correlation with known trait-associated single nucleotide polymorphisms (SNPs), we identified 30 loci with CNVs that are candidates for influencing disease susceptibility. Despite this, having assessed the completeness of our map and the patterns of linkage disequilibrium between CNVs and SNPs, we conclude that, for complex traits, the heritability void left by genome-wide association studies will not be accounted for by common CNVs.
    Nature 10/2009; 464(7289):704-12. · 36.28 Impact Factor
  • Article: High-throughput haplotype determination over long distances by haplotype fusion PCR and ligation haplotyping.
    Daniel J Turner, Matthew E Hurles
    ABSTRACT: When combined with haplotype fusion PCR (HF-PCR), ligation haplotyping is a robust, high-throughput method for empirical determination of haplotypes, which can be applied to assaying both sequence and structural variation over long distances. Unlike alternative approaches to haplotype determination, such as allele-specific PCR and long PCR, HF-PCR and ligation haplotyping do not suffer from mispriming or template-switching errors. In this method, HF-PCR is used to juxtapose DNA sequences from single-molecule templates, which contain single-nucleotide polymorphisms (SNPs) or paralogous sequence variants (PSVs) separated by several kilobases. HF-PCR uses an emulsion-based fusion PCR, which can be performed rapidly and in a 96-well format. Subsequently, a ligation-based assay is performed on the HF-PCR products to determine haplotypes. Products are resolved by capillary electrophoresis. Once optimized, the procedure can be performed quickly, taking a day and a half to generate phased haplotypes from genomic DNA.
    Nature Protocols 01/2009; 4(12):1771-83. · 8.36 Impact Factor
  • Article: Accurate whole human genome sequencing using reversible terminator chemistry.
    ABSTRACT: DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.
    Nature 12/2008; 456(7218):53-9. · 36.28 Impact Factor
  • Article: A robust statistical method for case-control association testing with copy number variation.
    ABSTRACT: Copy number variation (CNV) is pervasive in the human genome and can play a causal role in genetic diseases. The functional impact of CNV cannot be fully captured through linkage disequilibrium with SNPs. These observations motivate the development of statistical methods for performing direct CNV association studies. We show through simulation that current tests for CNV association are prone to false-positive associations in the presence of differential errors between cases and controls, especially if quantitative CNV measurements are noisy. We present a statistical framework for performing case-control CNV association studies that applies likelihood ratio testing of quantitative CNV measurements in cases and controls. We show that our methods are robust to differential errors and noisy data and can achieve maximal theoretical power. We illustrate the power of these methods for testing for association with binary and quantitative traits, and have made this software available as the R package CNVtools.
    Nature Genetics 10/2008; 40(10):1245-52. · 35.53 Impact Factor
  • Article: Adaptive evolution of UGT2B17 copy-number variation.
    ABSTRACT: The human UGT2B17 gene varies in copy number from zero to two per individual and also differs in mean number between populations from Africa, Europe, and East Asia. We show that such a high degree of geographical variation is unusual and investigate its evolutionary history. This required first reinterpreting the reference sequence in this region of the genome, which is misassembled from the two different alleles separated by an artifactual gap. A corrected assembly identifies the polymorphism as a 117 kb deletion arising by nonallelic homologous recombination between approximately 4.9 kb segmental duplications and allows the deletion breakpoint to be identified. We resequenced approximately 12 kb of DNA spanning the breakpoint in 91 humans from three HapMap and one extended HapMap populations and one chimpanzee. Diversity was unusually high and the time to the most recent common ancestor was estimated at approximately 2.4 or approximately 3.0 million years by two different methods, with evidence of balancing selection in Europe. In contrast, diversity was low in East Asia where a single haplotype predominated, suggesting positive selection for the deletion in this part of the world.
    The American Journal of Human Genetics 09/2008; 83(3):337-46. · 10.60 Impact Factor

Institutions

  • 2013
    • Chinese PLA General Hospital
      Beijing, Beijing Shi, China
  • 2006–2010
    • Wellcome Trust Sanger Institute
      Cambridge, ENG, United Kingdom
  • 2007
    • University of Chicago
      • Department of Human Genetics
      Chicago, IL, USA
  • 2002–2007
    • University of Cambridge
      • Department of Applied Mathematics and Theoretical Physics
      • McDonald Institute for Archaeological Research
      Cambridge, ENG, United Kingdom
  • 2001–2004
    • University of Leicester
      • Department of Genetics
      Leicester, ENG, United Kingdom
  • 2003
    • University of Oxford
      • Department of Biochemistry
      Oxford, ENG, United Kingdom