-
D Bell,
A Berchuck,
M Birrer,
J Chien,
D W Cramer,
F Dao,
R Dhir,
P Disaia,
H Gabra,
P Glenn, [......],
R Myles,
C Schaefer,
K R Mills Shaw,
J Vaught,
J B Vockley,
P J Good,
M S Guyer,
B Ozenberger,
J Peterson,
E Thomson
ABSTRACT: A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology.
Nature 01/2011; 474(7353):609-615. · 36.28 Impact Factor
-
The International Cancer Genome Consortium,
T J Hudson,
W. Anderson,
A. Artez,
A. D. Barker,
C. Bell,
R. R. Bernabe,
M. K. Bhan,
F. Calvo,
I. Eerola, [......],
G. D. Bader,
P. C. Boutros,
P. Flicek,
G. Getz,
R. Guigo,
G. Guo,
D Haussler,
S Heath,
H. Lehrach,
Pablo Landgraf
ABSTRACT: The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.
Nature 04/2010; 464:993-998. · 36.28 Impact Factor
-
T. J. Hudson,
W. Anderson,
A. Artez,
A. D. Barker,
C. Bell,
R. R. Bernabe,
M. K. Bhan,
F. Calvo,
I. Eerola,
D. S. Gerhard, [......],
F. S. Collins,
C. C. Compton,
E. S. Lander,
W. Burke,
A. R. Green,
S. R. Hamilton,
O. P. Kallioniemi,
T. J. Ley,
E. T. Liu,
B. J. Wainwright
ABSTRACT: The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.
Nature 04/2010; 464:993-8. · 36.28 Impact Factor
-
K. E. Nelson,
C. Weinel,
I. T. Paulsen,
R. J. Dodson,
H. Hilbert,
V. A. P. Martins dos Santos,
D. E. Fouts,
S. R. Gill,
M. Pop,
M. Holmes, [......],
D. Stjepandic,
J. Hoheisel,
M. Straetz,
S. Heim,
C. Kiewitz,
J. A. Eisen,
K. N. Timmis,
A. Düsterhöft,
B. Tümmler,
C. M. Fraser
Environmental Microbiology 06/2003; 5(7):630. · 5.84 Impact Factor
-
T D Read,
G S A Myers,
R C Brunham,
W C Nelson,
I T Paulsen,
J Heidelberg,
E Holtzapple,
H Khouri,
N B Federova,
H A Carty, [......],
D H Haft,
J Peterson,
M J Beanan,
O White,
S L Salzberg,
R-c Hsia,
G McClarty,
R G Rank,
P M Bavoil,
C M Fraser
ABSTRACT: The genome of Chlamydophila caviae (formerly Chlamydia psittaci, GPIC isolate) (1 173 390 nt with a plasmid of 7966 nt) was determined, representing the fourth species with a complete genome sequence from the Chlamydiaceae family of obligate intracellular bacterial pathogens. Of 1009 annotated genes, 798 were conserved in all three other completed Chlamydiaceae genomes. The C.caviae genome contains 68 genes that lack orthologs in any other completed chlamydial genomes, including tryptophan and thiamine biosynthesis determinants and a ribose-phosphate pyrophosphokinase, the product of the prsA gene. Notable amongst these was a novel member of the virulence-associated invasin/intimin family (IIF) of Gram-negative bacteria. Intriguingly, two authentic frameshift mutations in the ORF indicate that this gene is not functional. Many of the unique genes are found in the replication termination region (RTR or plasticity zone), an area of frequent symmetrical inversion events around the replication terminus shown to be a hotspot for genome variation in previous genome sequencing studies. In C.caviae, the RTR includes several loci of particular interest including a large toxin gene and evidence of ancestral insertion(s) of a bacteriophage. This toxin gene, not present in Chlamydia pneumoniae, is a member of the YopT effector family of type III-secreted cysteine proteases. One gene cluster (guaBA-add) in the RTR is much more similar to orthologs in Chlamydia muridarum than those in the phylogenetically closest species C.pneumoniae, suggesting the possibility of horizontal transfer of genes between the rodent-associated Chlamydiae. With most genes observed in the other chlamydial genomes represented, C.caviae provides a good model for the Chlamydiaceae and a point of comparison against the human atherosclerosis-associated C.pneumoniae. 
This crucial addition to the set of completed Chlamydiaceae genome sequences is enabling dissection of the roles played by niche-specific genes in these important bacterial pathogens.
Nucleic Acids Research 05/2003; 31(8):2134-47. · 8.03 Impact Factor
-
K E Nelson,
C Weinel,
I T Paulsen,
R J Dodson,
H Hilbert,
V A P Martins dos Santos,
D E Fouts,
S R Gill,
M Pop,
M Holmes, [......],
D Stjepandic,
J Hoheisel,
M Straetz,
S Heim,
C Kiewitz,
J A Eisen,
K N Timmis,
A Düsterhöft,
B Tümmler,
C M Fraser
ABSTRACT: Pseudomonas putida is a metabolically versatile saprophytic soil bacterium that has been certified as a biosafety host for the cloning of foreign genes. The bacterium also has considerable potential for biotechnological applications. Sequence analysis of the 6.18 Mb genome of strain KT2440 reveals diverse transport and metabolic systems. Although there is a high level of genome conservation with the pathogenic Pseudomonad Pseudomonas aeruginosa (85% of the predicted coding regions are shared), key virulence factors including exotoxin A and type III secretion systems are absent. Analysis of the genome gives insight into the non-pathogenic nature of P. putida and points to potential new applications in agriculture, biocatalysis, bioremediation and bioplastic production.
Environmental Microbiology 01/2003; 4(12):799-808. · 5.84 Impact Factor
-
K. E. Nelson,
C. Weinel,
I. T. Paulsen,
R. J. Dodson,
H. Hilbert,
V. A. P. Martins dos Santos,
D. E. Fouts,
S. R. Gill,
M. Pop,
M. Holmes, [......],
D. Stjepandic,
J. Hoheisel,
M. Straetz,
S. Heim,
C. Kiewitz,
J. Eisen,
K. N. Timmis,
A. Düsterhöft,
B. Tümmler,
C. M. Fraser
ABSTRACT: Pseudomonas putida is a metabolically versatile saprophytic soil bacterium that has been certified as a biosafety host for the cloning of foreign genes. The bacterium also has considerable potential for biotechnological applications. Sequence analysis of the 6.18 Mb genome of strain KT2440 reveals diverse transport and metabolic systems. Although there is a high level of genome conservation with the pathogenic Pseudomonad Pseudomonas aeruginosa (85% of the predicted coding regions are shared), key virulence factors including exotoxin A and type III secretion systems are absent. Analysis of the genome gives insight into the non-pathogenic nature of P. putida and points to potential new applications in agriculture, biocatalysis, bioremediation and bioplastic production.
Environmental Microbiology 11/2002; 4(12):799-808. · 5.84 Impact Factor
-
R D Fleischmann,
D Alland,
J A Eisen,
L Carpenter,
O White,
J Peterson,
R DeBoy,
R Dodson,
M Gwinn,
D Haft, [......],
A Delcher,
T Utterback,
J Weidman,
H Khouri,
J Gill,
A Mikula,
W Bishai,
W R Jacobs Jr,
J C Venter,
C M Fraser
ABSTRACT: Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies compared to the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.
Journal of Bacteriology 11/2002; 184(19):5479-90. · 3.83 Impact Factor
-
ABSTRACT: The human genome was analyzed for evidence that genes had been laterally transferred into the genome from prokaryotic organisms. Protein sequence comparisons of the proteomes of human, fruit fly, nematode worm, yeast, mustard weed, eukaryotic parasites, and all completed prokaryote genomes were performed, and all genes shared between human and each of the other groups of organisms were collected. About 40 genes were found to be exclusively shared by humans and bacteria and are candidate examples of horizontal transfer from bacteria to vertebrates. Gene loss combined with sample size effects and evolutionary rate variation provide an alternative, more biologically plausible explanation.
Science 07/2001; 292(5523):1903-6. · 31.20 Impact Factor
-
E S Lander,
L M Linton,
B Birren,
C Nusbaum,
M C Zody,
J Baldwin,
K Devon,
K Dewar,
M Doyle,
W FitzHugh, [......],
K A Wetterstrand,
A Patrinos,
M J Morgan,
P de Jong,
J J Catanese,
K Osoegawa,
H Shizuya,
S Choi,
Y J Chen,
J Szustakowki
ABSTRACT: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Nature 03/2001; 409(6822):860-921. · 36.28 Impact Factor
-
A Theologis,
J R Ecker,
C J Palm,
N A Federspiel,
S Kaul,
O White,
J Alonso,
H Altafi,
R Araujo,
C L Bowman, [......],
T Utterback,
S Van Aken,
M Vaysberg,
V S Vysotskaia,
M Walker,
D Wu,
G Yu,
C M Fraser,
J C Venter,
R W Davis
ABSTRACT: The genome of the flowering plant Arabidopsis thaliana has five chromosomes. Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.
Nature 01/2001; 408(6814):816-20. · 36.28 Impact Factor
-
M Salanoubat,
K Lemcke,
M Rieger,
W Ansorge,
M Unseld,
B Fartmann,
G Valle,
H Blöcker,
M Perez-Alonso,
B Obermaier, [......],
A Muraki,
S Nakayama,
N Nakazaki,
S Shinpo,
C Takeuchi,
T Wada,
A Watanabe,
M Yamada,
M Yasuda,
S Tabata
ABSTRACT: Arabidopsis thaliana is an important model system for plant biologists. In 1996 an international collaboration (the Arabidopsis Genome Initiative) was formed to sequence the whole genome of Arabidopsis and in 1999 the sequence of the first two chromosomes was reported. The sequence of the last three chromosomes and an analysis of the whole genome are reported in this issue. Here we present the sequence of chromosome 3, organized into four sequence segments (contigs). The two largest (13.5 and 9.2 Mb) correspond to the top (long) and the bottom (short) arms of chromosome 3, and the two small contigs are located in the genetically defined centromere. This chromosome encodes 5,220 of the roughly 25,500 predicted protein-coding genes in the genome. About 20% of the predicted proteins have significant homology to proteins in eukaryotic genomes for which the complete sequence is available, pointing to important conserved cellular functions among eukaryotes.
Nature 01/2001; 408(6814):820-2. · 36.28 Impact Factor
-
T D Read,
R C Brunham,
C Shen,
S R Gill,
J F Heidelberg,
O White,
E K Hickey,
J Peterson,
T Utterback,
K Berry, [......],
C Bowman,
R Dodson,
M Gwinn,
W Nelson,
R DeBoy,
J Kolonay,
G McClarty,
S L Salzberg,
J Eisen,
C M Fraser
ABSTRACT: The genome sequences of Chlamydia trachomatis mouse pneumonitis (MoPn) strain Nigg (1 069 412 nt) and Chlamydia pneumoniae strain AR39 (1 229 853 nt) were determined using a random shotgun strategy. The MoPn genome exhibited a general conservation of gene order and content with the previously sequenced C.trachomatis serovar D. Differences between C.trachomatis strains were focused on an approximately 50 kb 'plasticity zone' near the termination origins. In this region MoPn contained three copies of a novel gene encoding a >3000 amino acid toxin homologous to a predicted toxin from Escherichia coli O157:H7 but had apparently lost the tryptophan biosynthesis genes found in serovar D in this region. The C. pneumoniae AR39 chromosome was >99.9% identical to the previously sequenced C.pneumoniae CWL029 genome; however, comparative analysis identified an invertible DNA segment upstream of the uridine kinase gene which was in different orientations in the two genomes. AR39 also contained a novel 4524 nt circular single-stranded (ss)DNA bacteriophage, the first time a virus has been reported infecting C. pneumoniae. Although the chlamydial genomes were highly conserved, there were intriguing differences in key nucleotide salvage pathways: C.pneumoniae has a uridine kinase gene for dUTP production, MoPn has a uracil phosphoribosyl transferase, while C.trachomatis serovar D contains neither gene. Chromosomal comparison revealed that there had been multiple large inversion events since the species divergence of C.trachomatis and C.pneumoniae, apparently oriented around the axis of the origin of replication and the termination region. The striking synteny of the Chlamydia genomes and prevalence of tandemly duplicated genes are evidence of minimal chromosome rearrangement and foreign gene uptake, presumably owing to the ecological isolation of the obligate intracellular parasites.
In the absence of genetic analysis, comparative genomics will continue to provide insight into the virulence mechanisms of these important human pathogens.
Nucleic Acids Research 03/2000; 28(6):1397-406. · 8.03 Impact Factor
-
S Casjens,
N Palmer,
R van Vugt,
W M Huang,
B Stevenson,
P Rosa,
R Lathigra,
G Sutton,
J Peterson,
R J Dodson,
D Haft,
E Hickey,
M Gwinn,
O White,
C M Fraser
ABSTRACT: We have determined that Borrelia burgdorferi strain B31 MI carries 21 extrachromosomal DNA elements, the largest number known for any bacterium. Among these are 12 linear and nine circular plasmids, whose sequences total 610 694 bp. We report here the nucleotide sequence of three linear and seven circular plasmids (comprising 290 546 bp) in this infectious isolate. This completes the genome sequencing project for this organism; its genome size is 1 521 419 bp (plus about 2000 bp of undetermined telomeric sequences). Analysis of the sequence implies that there has been extensive and sometimes rather recent DNA rearrangement among a number of the linear plasmids. Many of these events appear to have been mediated by recombinational processes that formed duplications. These many regions of similarity are reflected in the fact that most plasmid genes are members of one of the genome's 161 paralogous gene families; 107 of these gene families, which vary in size from two to 41 members, contain at least one plasmid gene. These rearrangements appear to have contributed to a surprisingly large number of apparently non-functional pseudogenes, a very unusual feature for a prokaryotic genome. The presence of these damaged genes suggests that some of the plasmids may be in a period of rapid evolution. The sequence predicts 535 plasmid genes ≥300 bp in length that may be intact and 167 apparently mutationally damaged and/or unexpressed genes (pseudogenes). The large majority, over 90%, of genes on these plasmids have no convincing similarity to genes outside Borrelia, suggesting that they perform specialized functions.
Molecular Microbiology 02/2000; 35(3):490-516. · 5.01 Impact Factor
-
ABSTRACT: A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycobacterium tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications.
Nucleic Acids Research 07/1999; 27(11):2369-76. · 8.03 Impact Factor
-
Genome Research 02/1999; 9(1):1-4. · 13.61 Impact Factor
-
C M Fraser,
S J Norris,
G M Weinstock,
O White,
G G Sutton,
R Dodson,
M Gwinn,
E K Hickey,
R Clayton,
K A Ketchum, [......],
M D Cotton,
C Fujii,
S Garland,
B Hatch,
K Horst,
K Roberts,
M Sandusky,
J Weidman,
H O Smith,
J C Venter
ABSTRACT: The complete genome sequence of Treponema pallidum was determined and shown to be 1,138,006 base pairs containing 1041 predicted coding sequences (open reading frames). Systems for DNA replication, transcription, translation, and repair are intact, but catabolic and biosynthetic activities are minimized. The number of identifiable transporters is small, and no phosphoenolpyruvate:phosphotransferase carbohydrate transporters were found. Potential virulence factors include a family of 12 potential membrane proteins and several putative hemolysins. Comparison of the T. pallidum genome sequence with that of another pathogenic spirochete, Borrelia burgdorferi, the agent of Lyme disease, identified unique and common genes and substantiates the considerable diversity observed among pathogenic spirochetes.
Science 08/1998; 281(5375):375-88. · 31.20 Impact Factor
-
C M Fraser,
S Casjens,
W M Huang,
G G Sutton,
R Clayton,
R Lathigra,
O White,
K A Ketchum,
R Dodson,
E K Hickey, [......],
P Artiach,
C Bowman,
S Garland,
C Fuji,
M D Cotton,
K Horst,
K Roberts,
B Hatch,
H O Smith,
J C Venter
ABSTRACT: The genome of the bacterium Borrelia burgdorferi B31, the aetiologic agent of Lyme disease, contains a linear chromosome of 910,725 base pairs and at least 17 linear and circular plasmids with a combined size of more than 533,000 base pairs. The chromosome contains 853 genes encoding a basic set of proteins for DNA replication, transcription, translation, solute transport and energy metabolism, but, like Mycoplasma genitalium, it contains no genes for cellular biosynthetic reactions. Because B. burgdorferi and M. genitalium are distantly related eubacteria, we suggest that their limited metabolic capacities reflect convergent evolution by gene loss from more metabolically competent progenitors. Of 430 genes on 11 plasmids, most have no known biological function; 39% of plasmid genes are paralogues that form 47 gene families. The biological significance of the multiple plasmid-encoded genes is not clear, although they may be involved in antigenic variation or immune evasion.
Nature 01/1998; 390(6660):580-6. · 36.28 Impact Factor
-
ABSTRACT: Lawrence Berkeley Laboratory (LBL) is contracted by the US Department of Energy to provide an auxiliary modeling effort for the Stripa Project. Within this effort, we are making calculations of inflow to the Simulated Drift Experiment (SDE), i.e. inflow to six parallel, closely spaced D-holes, using a preliminary set of data collected in five other holes, the N- and W-holes during Stages 1 and 2 of the Site Characterization and Validation (SCV) project. Our approach has been to focus on the fracture zones rather than the general set of ubiquitous fractures. Approximately 90% of all the water flowing in the rock is flowing in fracture zones which are neither uniformly conductive nor are they infinitely extensive. Our approach has been to adopt the fracture zone locations as they have been identified with geophysics. We use geologic sense and the original geophysical data to add one zone where significant water inflow has been observed that can not be explained with the other geophysical zones. This report covers LBL's preliminary prediction of flow into the D-holes. Care should be taken in interpreting the results given in this report. As explained below, the approach that LBL has designed for developing a fracture hydrology model requires cross-hole hydrologic data. Cross-hole tests are planned for Stage 3 but were unavailable in Stage 1. As such, we have inferred from available data what a cross-hole test might show and used this synthetic data to make a preliminary calculation of the inflow into the D-holes. Then using all the Stage 3 data we will calculate flow into the Validation Drift itself. The report mainly demonstrates the use of our methodology and the simulated results should be considered preliminary.
01/1990;
-
R. H. Waterston,
K. Lindblad-Toh,
E. Birney,
J. Rogers,
J. F. Abril,
P. Agarwal,
R. Agarwala,
R. Ainscough,
M. Alexandersson,
P. An, [......],
S. Williams,
R. K. Wilson,
E. Winter,
K. C. Worley,
D. Wyman,
S. Yang,
S. P. Yang,
E. M. Zdobnov,
M. C. Zody,
E. S. Lander
ABSTRACT: The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.
Nature 420(6915):520-62.