ABSTRACT: As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of
the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs
from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a
benchmarking exercise within the context of the International Cancer Genome Consortium. We compare
sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods
and increasing sequencing depth to ~100× shows benefits, as long as the tumour:control coverage
ratio remains balanced. We observe widely varying mutation call rates and low concordance among
analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing
with the artefacts. However, we show that, using the benchmark mutation set we have created, many
issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.
Full-text · Article · Dec 2015 · Nature Communications
ABSTRACT: BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases
spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently
maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking
it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources
without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several
complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving
sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array
of tools makes Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal
and show example queries to demonstrate its capabilities.
Database URL: http://central.biomart.org.
Full-text · Article · Sep 2011 · Database: The Journal of Biological Databases and Curation
ABSTRACT: Catalogue of Somatic Mutations in Cancer (COSMIC) (http://www.sanger.ac.uk/cosmic) is a publicly available resource providing information on somatic mutations implicated in human cancer. Release v51 (January 2011) includes data from just over 19 000 genes, 161 787 coding mutations and 5573 gene fusions, described in more than 577 000 tumour samples. COSMICMart (COSMIC BioMart) provides a flexible way to mine these data and combine somatic mutations with other biologically relevant data sets. This article describes the data available in COSMIC along with examples of how to successfully mine and integrate data sets using COSMICMart.
Database URL: http://www.sanger.ac.uk/genetics/CGP/cosmic/biomart/martview/
Preview · Article · Jan 2011 · Database: The Journal of Biological Databases and Curation
ABSTRACT: COSMIC, the Catalogue Of Somatic Mutations In Cancer (http://www.sanger.ac.uk/cosmic), is designed to store and display somatic mutation information relating to human cancers, combining detailed information on publications, samples and mutation types. The information is curated both from the primary literature and the laboratories at the Cancer Genome Project, Sanger Institute, UK, and then semi-automatically entered into the COSMIC database. The v47 release (May 2010) contained the curation of 9202 papers describing 116,977 mutations across 466,851 samples. In order to provide consistent annotation of the data, COSMIC has developed a classification system for cancer histology and tissue ontology, and adapted HGVS mutation nomenclature recommendations to describe the multiple mutation types involved in cancer. Cancer genetics is moving from systematic screens of candidate gene sets to whole genome sequencing analyses, and COSMIC displays and navigates this new data; we have recently included systematic gene screens and whole genome sequencing studies. COSMIC will annotate and display somatic mutation data that will be emerging from the International Cancer Genome Consortium (ICGC; http://www.icgc.org/) and The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) projects. New tools are being developed to interpret this genomic data with coding mutation annotations. In addition COSMIC will be expanded to curate and display data from mouse insertional mutagenesis screening and mouse cancer model exome/genome sequencing in the future. The data within COSMIC is freely available without restriction via a website, in datasheets on the FTP site (ftp://ftp.sanger.ac.uk/pub/CGP/cosmic) and through the COSMIC Biomart (http://www.sanger.ac.uk/genetics/CGP/cosmic/biomart/martview/), available from the COSMIC homepage (http://www.sanger.ac.uk/cosmic).
ABSTRACT: COSMIC (http://www.sanger.ac.uk/cosmic) curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136 000 coding
mutations in almost 542 000 tumour samples; of the 18 490 genes documented, 4803 (26%) have one or more mutations. Full scientific
literature curations are available on 83 major cancer genes and 49 fusion gene pairs (19 new cancer genes and 30 new fusion
pairs this year) and this number is continually increasing. Key amongst these is TP53, now available through a collaboration
with the IARC p53 database. In addition to data from the Cancer Genome Project (CGP) at the Sanger Institute, UK, and The
Cancer Genome Atlas project (TCGA), large systematic screens are also now curated. Major website upgrades now make these data
much more mineable, with many new selection filters and graphics. A Biomart is now available allowing more automated data
mining and integration with other biological databases. Annotation of genomic features has become a significant focus; COSMIC
has begun curating full-genome resequencing experiments, developing new web pages, export formats and graphics styles. With
all genomic information recently updated to GRCh37, COSMIC integrates many diverse types of mutation information and is making
much closer links with Ensembl and other data resources.
Preview · Article · Oct 2010 · Nucleic Acids Research
ABSTRACT: Clear cell renal cell carcinoma (ccRCC) is the most common form of adult kidney cancer, characterized by the presence of inactivating mutations in the VHL gene in most cases, and by infrequent somatic mutations in known cancer genes. To determine further the genetics of ccRCC, we have sequenced 101 cases through 3,544 protein-coding genes. Here we report the identification of inactivating mutations in two genes encoding enzymes involved in histone modification, SETD2 (a histone H3 lysine 36 methyltransferase) and JARID1C (also known as KDM5C; a histone H3 lysine 4 demethylase), as well as mutations in the histone H3 lysine 27 demethylase UTX (KDM6A) that we recently reported. The results highlight the role of mutations in components of the chromatin modification machinery in human cancer. Furthermore, NF2 mutations were found in non-VHL mutated ccRCC, and several other probable cancer genes were identified. These results indicate that substantial genetic heterogeneity exists in a cancer type dominated by mutations in a single gene, and that systematic screens will be key to fully determining the somatic genetic architecture of cancer.
ABSTRACT: Large-scale systematic resequencing has been proposed as the key future strategy for the discovery of rare, disease-causing sequence variants across the spectrum of human complex disease. We have sequenced the coding exons of the X chromosome in 208 families with X-linked mental retardation (XLMR), the largest direct screen for constitutional disease-causing mutations thus far reported. The screen has discovered nine genes implicated in XLMR, including SYP, ZNF711 and CASK reported here, confirming the power of this strategy. The study has, however, also highlighted issues confronting whole-genome sequencing screens, including the observation that loss of function of 1% or more of X-chromosome genes is compatible with apparently normal existence.
ABSTRACT: Somatically acquired epigenetic changes are present in many cancers. Epigenetic regulation is maintained via post-translational modifications of core histones. Here, we describe inactivating somatic mutations in the histone lysine demethylase gene UTX, pointing to histone H3 lysine methylation deregulation in multiple tumor types. UTX reintroduction into cancer cells with inactivating UTX mutations resulted in slowing of proliferation and marked transcriptional changes. These data identify UTX as a new human cancer gene.
ABSTRACT: Epilepsy and mental retardation limited to females (EFMR) is a disorder with an X-linked mode of inheritance and an unusual expression pattern. Disorders arising from mutations on the X chromosome are typically characterized by affected males and unaffected carrier females. In contrast, EFMR spares transmitting males and affects only carrier females. Aided by systematic resequencing of 737 X chromosome genes, we identified different protocadherin 19 (PCDH19) gene mutations in seven families with EFMR. Five mutations resulted in the introduction of a premature termination codon. Study of two of these demonstrated nonsense-mediated decay of PCDH19 mRNA. The two missense mutations were predicted to affect adhesiveness of PCDH19 through impaired calcium binding. PCDH19 is expressed in developing brains of human and mouse and is the first member of the cadherin superfamily to be directly implicated in epilepsy or mental retardation.
ABSTRACT: Nonsense-mediated mRNA decay (NMD) is of universal biological significance. It has emerged as an important global RNA, DNA and translation regulatory pathway. By systematically sequencing 737 genes (annotated in the Vertebrate Genome Annotation database) on the human X chromosome in 250 families with X-linked mental retardation, we identified mutations in the UPF3 regulator of nonsense transcripts homolog B (yeast) (UPF3B) leading to protein truncations in three families: two with the Lujan-Fryns phenotype and one with the FG phenotype. We also identified a missense mutation in another family with nonsyndromic mental retardation. Three mutations lead to the introduction of a premature termination codon and subsequent NMD of mutant UPF3B mRNA. Protein blot analysis using lymphoblastoid cell lines from affected individuals showed an absence of the UPF3B protein in two families. The UPF3B protein is an important component of the NMD surveillance machinery. Our results directly implicate abnormalities of NMD in human disease and suggest at least partial redundancy of NMD pathways.
ABSTRACT: In the course of systematic screening of the X-chromosome coding sequences in 250 families with nonsyndromic X-linked mental retardation (XLMR), two families were identified with truncating mutations in BRWD3, a gene encoding a bromodomain and WD-repeat domain-containing protein. In both families, the mutation segregates with the phenotype in affected males. Affected males have macrocephaly with a prominent forehead, large cupped ears, and mild-to-moderate intellectual disability. No truncating variants were found in 520 control X chromosomes. BRWD3 is therefore a new gene implicated in the etiology of XLMR associated with macrocephaly and may cause disease by altering intracellular signaling pathways affecting cellular proliferation.
Full-text · Article · Sep 2007 · The American Journal of Human Genetics
ABSTRACT: The undertaking of large-scale DNA sequencing screens for somatic variants in human cancers requires accurate and rapid processing
of traces for variants. Due to their often aneuploid nature and admixed normal tissue, heterozygous variants found in primary
cancers are often subtle and difficult to detect. To address these issues, we have developed a mutation detection algorithm,
AutoCSA, specifically optimized for the high throughput screening of cancer samples.
ABSTRACT: We have identified one frameshift mutation, one splice-site mutation, and two missense mutations in highly conserved residues in ZDHHC9 at Xq26.1 in 4 of 250 families with X-linked mental retardation (XLMR). In three of the families, the mental retardation phenotype is associated with a Marfanoid habitus, although none of the affected individuals meets the Ghent criteria for Marfan syndrome. ZDHHC9 is a palmitoyltransferase that catalyzes the posttranslational modification of NRAS and HRAS. The degree of palmitoylation determines the temporal and spatial location of these proteins in the plasma membrane and Golgi complex. The finding of mutations in ZDHHC9 suggests that alterations in the concentrations and cellular distribution of target proteins are sufficient to cause disease. This is the first XLMR gene to be reported that encodes a posttranslational modification enzyme, palmitoyltransferase. Furthermore, now that the first palmitoyltransferase that causes mental retardation has been identified, defects in other palmitoylation transferases become good candidates for causing other mental retardation syndromes.
Full-text · Article · Jun 2007 · The American Journal of Human Genetics
ABSTRACT: Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be 'passengers' that do not contribute to oncogenesis. However, there was evidence for 'driver' mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.
ABSTRACT: We have identified three truncating, two splice-site, and three missense variants at conserved amino acids in the CUL4B gene on Xq24 in 8 of 250 families with X-linked mental retardation (XLMR). During affected subjects' adolescence, a syndrome emerged with delayed puberty, hypogonadism, relative macrocephaly, moderate short stature, central obesity, unprovoked aggressive outbursts, fine intention tremor, pes cavus, and abnormalities of the toes. This syndrome was first described by Cazebas et al., in a family that was included in our study and that carried a CUL4B missense variant. CUL4B is a ubiquitin E3 ligase subunit implicated in the regulation of several biological processes, and CUL4B is the first XLMR gene that encodes an E3 ubiquitin ligase. The relatively high frequency of CUL4B mutations in this series indicates that it is one of the most commonly mutated genes underlying XLMR and suggests that its introduction into clinical diagnostics should be a high priority.
Full-text · Article · Mar 2007 · The American Journal of Human Genetics
ABSTRACT: In a systematic sequencing screen of the coding exons of the X chromosome in 250 families with X-linked mental retardation (XLMR), we identified two nonsense mutations and one consensus splice-site mutation in the AP1S2 gene on Xp22 in three families. Affected individuals in these families showed mild-to-profound mental retardation. Other features included hypotonia early in life and delay in walking. AP1S2 encodes an adaptin protein that constitutes part of the adaptor protein complex found at the cytoplasmic face of coated vesicles located at the Golgi complex. The complex mediates the recruitment of clathrin to the vesicle membrane. Aberrant endocytic processing through disruption of adaptor protein complexes is likely to result from the AP1S2 mutations identified in the three XLMR-affected families, and such defects may plausibly cause abnormal synaptic development and function. AP1S2 is the first reported XLMR gene that encodes a protein directly involved in the assembly of endocytic vesicles.
Full-text · Article · Jan 2007 · The American Journal of Human Genetics