ABSTRACT: Rationale: Amplification of distal 3q is the most common genomic aberration in squamous lung cancer (SQC). SQC develops in a multistage progression from normal bronchial epithelium through dysplasia to invasive disease. Identifying the key driver events in the early pathogenesis of SQC will facilitate the search for predictive molecular biomarkers and the identification of novel molecular targets for chemoprevention and therapeutic strategies. For technical reasons, previous attempts to analyze 3q amplification in preinvasive lesions have focused on small numbers of predetermined candidate loci rather than an unbiased survey of copy-number variation.
Objectives: To perform a detailed analysis of the 3q amplicon in bronchial dysplasia of different histological grades.
Methods: We use molecular copy-number counting (MCC) to analyze the structure of chromosome 3 in 19 preinvasive bronchial biopsy specimens from 15 patients and sequential biopsy specimens from 3 individuals.
Measurements and Main Results: We demonstrate that no low-grade lesions, but all high-grade lesions, have 3q amplification. None of seven low-grade lesions progressed clinically, whereas 8 of 10 patients with high-grade disease progressed to cancer. We identify a minimum commonly amplified region on chromosome 3 consisting of 17 genes, including 2 known oncogenes, SOX2 and PIK3CA. We confirm that both genes are amplified in all high-grade dysplastic lesions tested. We further demonstrate, in three individuals, that the clinical progression of high-grade preinvasive disease is associated with incremental amplification of SOX2, suggesting this promotes malignant progression.
Conclusions: These findings demonstrate progressive 3q amplification in the evolution of preinvasive SQC and implicate SOX2 as a key target of this dynamic process.
American Journal of Respiratory and Critical Care Medicine 03/2010; 182(1):83-91. · 11.04 Impact Factor
ABSTRACT: Most cancer genomes are characterized by the gain or loss of copies of some sequences through deletion, amplification or unbalanced translocations. Delineating and quantifying these changes is important in understanding the initiation and progression of cancer, in identifying novel therapeutic targets, and in the diagnosis and prognosis of individual patients. Conventional methods for measuring copy-number are limited in their ability to analyse large numbers of loci, in their dynamic range and accuracy, or in their ability to analyse small or degraded samples. This latter limitation makes it difficult to access the wealth of fixed, archived material present in clinical collections, and also impairs our ability to analyse small numbers of selected cells from biopsies. Molecular copy-number counting (MCC), a digital PCR technique, has been used to delineate a non-reciprocal translocation using good quality DNA from a renal carcinoma cell line. We now demonstrate microMCC, an adaptation of MCC which allows the precise assessment of copy number variation over a significant dynamic range, in template DNA extracted from formalin-fixed paraffin-embedded clinical biopsies. Further, microMCC can accurately measure copy number variation at multiple loci, even when applied to picogram quantities of grossly degraded DNA extracted after laser capture microdissection of fixed specimens. Finally, we demonstrate the power of microMCC to precisely interrogate cancer genomes, in a way not currently feasible with other methodologies, by defining the position of a junction between an amplified and non-amplified genomic segment in a bronchial carcinoma. This has tremendous potential for the exploitation of archived resources for high-resolution targeted cancer genomics and in the future for interrogating multiple loci in cancer diagnostics or prognostics.
The Journal of Pathology 07/2008; 216(3):307-16. · 7.59 Impact Factor
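The counting principle behind a digital PCR assay such as MCC can be sketched roughly as follows. This is a generic Poisson-corrected estimate for limiting-dilution data, not the authors' published analysis; the function names, the well counts, and the assumption of a two-copy reference locus are all hypothetical.

```python
import math


def molecules_per_well(positive_wells: int, total_wells: int) -> float:
    """Poisson-corrected mean number of template molecules per well.

    A well scores positive if it holds one or more molecules, so the
    fraction of negative wells estimates exp(-mean).
    """
    if total_wells <= 0 or not 0 <= positive_wells < total_wells:
        raise ValueError("need 0 <= positive_wells < total_wells")
    fraction_positive = positive_wells / total_wells
    return -math.log(1.0 - fraction_positive)


def relative_copy_number(test_positive: int, ref_positive: int,
                         total_wells: int, ref_copies: float = 2.0) -> float:
    """Copy number of a test locus relative to a reference locus.

    ref_copies is an assumption: the reference is taken to be present
    at two copies per diploid genome.
    """
    test = molecules_per_well(test_positive, total_wells)
    ref = molecules_per_well(ref_positive, total_wells)
    if ref == 0.0:
        raise ValueError("reference locus was not detected in any well")
    return ref_copies * test / ref
```

With made-up counts of 72 positive wells out of 96 for the test locus and 48 out of 96 for the reference, `relative_copy_number(72, 48, 96)` gives an estimate of about four copies, which would be consistent with amplification of the test locus.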
ABSTRACT: Salmonella Genomic Island-1 (SGI-1) harbors a cluster of genes encoding multidrug resistance (MDR). SGI-1 is horizontally transmissible and is therefore of significant public health concern. This study presents two novel real-time PCRs detecting three SGI-1 protein-coding genes and an SGI-1 fingerprinting assay. These assays were applied to 445 European enterobacterial isolates. Results from real-time PCRs were comparable to those obtained from gel-based PCRs used for the detection of SGI-1, but were rapid to perform and suitable for large-scale screening. Furthermore, real-time PCRs also detected SGI-1 even when only part of the island was present in bacterial isolates. No trace of SGI-1 was detected in isolates other than Salmonella enterica. The fingerprints showed that regions of SGI-1 outside the MDR region exhibited genomic variations between isolates. In conclusion, the real-time PCRs described here are suitable for the detection of SGI-1 in bacterial isolates. Further studies are necessary to elucidate divergence in its non-MDR region.
Microbial Drug Resistance 02/2008; 14(2):79-92. · 2.36 Impact Factor
ABSTRACT: Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome--which appears to be representative of the genome--is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.
Genome Research 04/2007; 17(3):311-9. · 14.40 Impact Factor
ABSTRACT: Many methods exist for genotyping--revealing which alleles an individual carries at different genetic loci. A harder problem is haplotyping--determining which alleles lie on each of the two homologous chromosomes in a diploid individual. Conventional approaches to haplotyping require the use of several generations to reconstruct haplotypes within a pedigree, or use statistical methods to estimate the prevalence of different haplotypes in a population. Several molecular haplotyping methods have been proposed, but have been limited to small numbers of loci, usually over short distances. Here we demonstrate a method which allows rapid molecular haplotyping of many loci over long distances. The method requires no more genotypings than pedigree methods, but requires no family material. It relies on a procedure to identify and genotype single DNA molecules, and reconstruction of long haplotypes by a 'tiling' approach. We demonstrate this by resolving haplotypes in two regions of the human genome, harbouring 20 and 105 single-nucleotide polymorphisms, respectively. The method can be extended to reconstruct haplotypes of arbitrary complexity and length, and can make use of a variety of genotyping platforms. We also argue that this method is applicable in situations which are intractable to conventional approaches.
Nucleic Acids Research 02/2007; 35(1):e6. · 8.28 Impact Factor
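The abstract above does not spell out the 'tiling' procedure, so the following is only one simplified reading of the idea: short phased segments (tiles) are chained into longer haplotypes wherever two tiles share a heterozygous locus. The data layout, the function names, and the SNP labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A tile maps each locus to the pair of alleles seen on the two homologues.
# Which homologue comes first is arbitrary within a tile.
Tile = Dict[str, Tuple[str, str]]


def flip(tile: Tile) -> Tile:
    """Swap the two homologues of a tile."""
    return {locus: (b, a) for locus, (a, b) in tile.items()}


def merge_tiles(tiles: List[Tile]) -> Tile:
    """Chain tiles in the order given into one pair of long haplotypes.

    Each new tile must share a heterozygous locus with the haplotypes
    built so far; that locus fixes the orientation of the new tile.
    Only the first such locus is checked, which is a simplification.
    """
    if not tiles:
        return {}
    merged: Tile = dict(tiles[0])
    for tile in tiles[1:]:
        anchors = [locus for locus in tile
                   if locus in merged and merged[locus][0] != merged[locus][1]]
        if not anchors:
            raise ValueError("tile shares no heterozygous locus with earlier tiles")
        anchor = anchors[0]
        if tile[anchor] == merged[anchor]:
            oriented = tile
        elif flip(tile)[anchor] == merged[anchor]:
            oriented = flip(tile)
        else:
            raise ValueError(f"conflicting genotypes at {anchor}")
        merged.update(oriented)
    return merged
```

A homozygous locus cannot serve as an anchor because it carries no information about which homologue is which, which is why the sketch insists on a shared heterozygous locus between consecutive tiles.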
ABSTRACT: The social amoebae are exceptional in their ability to alternate between unicellular and multicellular forms. Here we describe the genome of the best-studied member of this group, Dictyostelium discoideum. The gene-dense chromosomes of this organism encode approximately 12,500 predicted proteins, a high proportion of which have long, repetitive amino acid tracts. There are many genes for polyketide synthases and ABC transporters, suggesting an extensive secondary metabolism for producing and exporting small molecules. The genome is rich in complex repeats, one class of which is clustered and may serve as centromeres. Partial copies of the extrachromosomal ribosomal DNA (rDNA) element are found at the ends of each chromosome, suggesting a novel telomere structure and the use of a common mechanism to maintain both the rDNA and chromosomal termini. A proteome-based phylogeny shows that the amoebozoa diverged from the animal-fungal lineage after the plant-animal split, but Dictyostelium seems to have retained more of the diversity of the ancestral genome than have plants, animals or fungi.
ABSTRACT: Cryptosporidium species cause acute gastroenteritis and diarrhoea worldwide. They are members of the Apicomplexa--protozoan pathogens that invade host cells by using a specialized apical complex and are usually transmitted by an invertebrate vector or intermediate host. In contrast to other Apicomplexans, Cryptosporidium is transmitted by ingestion of oocysts and completes its life cycle in a single host. No therapy is available, and control focuses on eliminating oocysts in water supplies. Two species, C. hominis and C. parvum, which differ in host range, genotype and pathogenicity, are most relevant to humans. C. hominis is restricted to humans, whereas C. parvum also infects other mammals. Here we describe the eight-chromosome approximately 9.2-million-base genome of C. hominis. The complement of C. hominis protein-coding genes shows a striking concordance with the requirements imposed by the environmental niches the parasite inhabits. Energy metabolism is largely from glycolysis. Both aerobic and anaerobic metabolisms are available, the former requiring an alternative electron transport system in a simplified mitochondrion. Biosynthesis capabilities are limited, explaining an extensive array of transporters. Evidence of an apicoplast is absent, but genes associated with apical complex organelles are present. C. hominis and C. parvum exhibit very similar gene complements, and phenotypic differences between these parasites must be due to subtle sequence divergence.
ABSTRACT: The apicomplexan Cryptosporidium parvum is an intestinal parasite that affects healthy humans and animals, and causes an unrelenting infection in immunocompromised individuals such as AIDS patients. We report the complete genome sequence of C. parvum, type II isolate. Genome analysis identifies extremely streamlined metabolic pathways and a reliance on the host for nutrients. In contrast to Plasmodium and Toxoplasma, the parasite lacks an apicoplast and its genome, and possesses a degenerate mitochondrion that has lost its genome. Several novel classes of cell-surface and secreted proteins with a potential role in host interactions and pathogenesis were also detected. Elucidation of the core metabolism, including enzymes with high similarities to bacterial and plant counterparts, opens new avenues for drug development.
ABSTRACT: The apicomplexan Cryptosporidium parvum is one of the most prevalent protozoan parasites of humans. We report the physical mapping of the genome of the Iowa isolate, sequencing and analysis of chromosome 6, and approximately 0.9 Mbp of sequence sampled from the remainder of the genome. To construct a robust physical map, we devised a novel and general strategy, enabling accurate placement of clones regardless of clone artefacts. Analysis reveals a compact genome, unusually rich in membrane proteins. As in Plasmodium falciparum, the mean size of the predicted proteins is larger than that in other sequenced eukaryotes. We find several predicted proteins of interest as potential therapeutic targets, including one exhibiting similarity to the chloroquine resistance protein of Plasmodium. Coding sequence analysis argues against the conventional phylogenetic position of Cryptosporidium and supports an earlier suggestion that this genus arose from an early branching within the Apicomplexa. In agreement with this, we find no significant synteny and surprisingly little protein similarity with Plasmodium. Finally, we find two unusual and abundant repeats throughout the genome. Among sequenced genomes, one motif is abundant only in C. parvum, whereas the other is shared with (but has previously gone unnoticed in) all known genomes of the Coccidia and Haemosporida. These motifs appear to be unique in their structure, distribution and sequences.
Genome Research 09/2003; 13(8):1787-99. · 14.40 Impact Factor
ABSTRACT: HAPPY mapping is an in vitro approach for defining the order and spacing of DNA markers directly on native genomic DNA. This cloning-free technique is based on analysing the segregation of markers amplified from high molecular weight genomic DNA which has been broken randomly and 'segregated' by limiting dilution into subhaploid samples. It is a uniquely versatile tool, allowing for the construction of genome maps with flexible ranges and resolutions. Moreover, it is applicable to plant genomes, for which many of the techniques pioneered in animal genomes are inapplicable or inappropriate. We report here its demonstration in a plant genome by reconstructing the physical map of a 1.9 Mbp region around the FCA locus of Arabidopsis thaliana. The resulting map, spanning around 10% of chromosome 4, is in excellent agreement with the DNA sequence and has a mean marker spacing of 16 kbp. We argue that HAPPY maps of any required resolution can be made immediately and with relatively little effort for most plant species and, furthermore, that such maps can greatly aid the construction of regional or genome-wide physical maps.
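The segregation analysis described above rests on the idea that markers lying close together tend to sit on the same DNA fragment and therefore turn up in the same subhaploid samples more often than chance predicts. A minimal sketch of one way to score that signal is given below; the scoring function and its name are illustrative assumptions, not the statistic used in the HAPPY mapping publications.

```python
from typing import Sequence


def cosegregation_excess(marker_a: Sequence[bool],
                         marker_b: Sequence[bool]) -> float:
    """Observed minus chance-expected fraction of samples holding both markers.

    Each sequence records, sample by sample, whether the marker was
    amplified. A value near zero suggests the markers are unlinked;
    a clearly positive value suggests they lie close together.
    """
    if len(marker_a) != len(marker_b) or len(marker_a) == 0:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(marker_a)
    both = sum(1 for a, b in zip(marker_a, marker_b) if a and b) / n
    freq_a = sum(1 for a in marker_a if a) / n
    freq_b = sum(1 for b in marker_b if b) / n
    return both - freq_a * freq_b
```

Computing this score for every pair of markers gives a matrix from which an order and relative spacing could in principle be inferred, in the same spirit as a linkage analysis.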
ABSTRACT: The genome of the lower eukaryote Dictyostelium discoideum comprises six chromosomes. Here we report the sequence of the largest, chromosome 2, which at 8 megabases (Mb) represents about 25% of the genome. Despite an A + T content of nearly 80%, the chromosome codes for 2,799 predicted protein coding genes and 73 transfer RNA genes. This gene density, about 1 gene per 2.6 kilobases (kb), is surpassed only by Saccharomyces cerevisiae (one per 2 kb) and is similar to that of Schizosaccharomyces pombe (one per 2.5 kb). If we assume that the other chromosomes have a similar gene density, we can expect around 11,000 genes in the D. discoideum genome. A significant number of the genes show higher similarities to genes of vertebrates than to those of other fully sequenced eukaryotes. This analysis strengthens the view that the evolutionary position of D. discoideum is located before the branching of metazoa and fungi but after the divergence of the plant kingdom, placing it close to the base of metazoan evolution.
ABSTRACT: We have made a high-resolution HAPPY map of chromosome 6 of Dictyostelium discoideum consisting of 300 sequence-tagged sites with an average spacing of 14 kb along the approximately 4-Mb chromosome. The majority of the marker sequences were derived from randomly chosen clones from four different chromosome 6-enriched plasmid libraries or from subclones of YACs previously mapped to chromosome 6. The map appears to span the entire chromosome, although marker density is greater in some regions than in others and is lowest within the telomeric region. Our map largely supports previous gene-based maps of this chromosome but reveals a number of errors in the physical map. In addition, we find that a high proportion of the plasmid sequences derived from gel-enriched chromosome 6 (and that form the basis of a chromosome-specific sequencing project) originates from other chromosomes.
Genome Research 12/2000; 10(11):1737-42. · 14.40 Impact Factor
ABSTRACT: We have constructed a HAPPY map of the apicomplexan parasite Cryptosporidium parvum. We have placed 204 markers on the 10.4-Mb genome, giving an average marker spacing of approximately 50 kb, with an effective resolution of approximately 40 kb. HAPPY mapping (an in vitro linkage technique based on screening approximately haploid amounts of DNA by the polymerase chain reaction) is fast and accurate and is not subject to the distortions inherent in cloning, meiotic recombination, or hybrid cell formation. In addition, little genomic DNA is needed as a substrate, and the AT content of the genome is largely immaterial, making it an ideal method for mapping otherwise intractable parasite genomes. The map, covering all eight chromosomes, consists of 10 linkage groups, each of which has been chromosomally assigned. We have verified the accuracy of the map by several methods, including the construction of a >140-kb PAC contig on chromosome VI. Less than 1% of our markers detect non-rDNA duplicated sequences.
Genome Research 12/1998; 8(12):1299-307. · 14.40 Impact Factor
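The average marker spacing quoted in the abstract above can be checked with simple arithmetic. The helper below is a hypothetical illustration that assumes markers are spread over the whole genome and ignores uneven marker density.

```python
def mean_marker_spacing_kb(genome_size_mb: float, marker_count: int) -> float:
    """Mean spacing in kb, taken as genome size divided by marker count."""
    if marker_count <= 0:
        raise ValueError("marker_count must be positive")
    return genome_size_mb * 1000.0 / marker_count


# 204 markers on a 10.4-Mb genome, the figures given in the abstract.
spacing = mean_marker_spacing_kb(10.4, 204)
```

For these figures the result is close to 51 kb, in line with the "approximately 50 kb" reported.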
ABSTRACT: A translocation involving chromosomes 12 and 14 [t(12;14)(q15;q24.1)] is commonly seen in benign smooth muscle tumors such as uterine leiomyoma (UL). A contig of P1-derived artificial chromosome and bacterial artificial chromosome clones on chromosome 14, encompassing a t(12;14) breakpoint cluster region (BCR) in UL, was generated principally using the recently developed HAPPY map of chromosome 14 as a framework (P. H. Dear et al., 1998, Genomics 48: 232-241). Three UL t(12;14) breakpoints have been localized within this contig, showing that a BCR of at least 400 kb exists on chromosome 14. Other studies of tumors with t(12;14) rearrangements similarly show breakpoints within a 475-kb multiple aberration region on chromosome 12. Thus t(12;14) is an example of a translocation in which the breakpoints are located within a BCR on both chromosome 12 and chromosome 14, justifying the identification of expressed sequences that are altered in these BCRs. A total of four expressed sequences were identified in the BCR on chromosome 14. Two of these were novel cDNAs (D14S1460E and D14S1461E). The chromosome 14 cDNAs were expressed in multiple adult tissues. The identification of a large breakpoint cluster region on chromosome 14 suggests that translocations in this region mediate their effects at a distance and also that elements that predispose this region to recurrent chromosomal translocation may be widely distributed.
ABSTRACT: We have localized the gene encoding human RNase k6 to within approximately 120 kb on the long (q) arm of chromosome 14 by HAPPY mapping. With this information, the relative positions of the six human RNase A ribonucleases that have been mapped to this locus can be inferred. To further our understanding of the individual lineages comprising the RNase A superfamily, we have isolated and characterized 10 novel genes orthologous to that encoding human RNase k6 from Great Ape, Old World, and New World monkey genomes. Each gene encodes a complete ORF with no less than 86% amino acid sequence identity to human RNase k6 with the eight cysteines and catalytic histidines (H15 and H123) and lysine (K38) typically observed among members of the RNase A superfamily. Interesting trends include an unusually low number of synonymous substitutions (Ks) observed among the New World monkey RNase k6 genes. When considering nonsilent mutations, RNase k6 is a relatively stable lineage, with a nonsynonymous substitution rate of 0.40 × 10⁻⁹ nonsynonymous substitutions/nonsynonymous site/year (ns/ns/yr). These results stand in contrast to those determined for the primate orthologs of the two closely related ribonucleases, the eosinophil-derived neurotoxin (EDN) and eosinophil cationic protein (ECP), which have incorporated nonsilent mutations at very rapid rates (1.9 × 10⁻⁹ and 2.0 × 10⁻⁹ ns/ns/yr, respectively). The uneventful trends observed for RNase k6 serve to spotlight the unique nature of EDN and ECP and the unusual evolutionary constraints to which these two ribonuclease genes must be responding. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF037081-AF037090.]
Genome Research 07/1998; 8(6):599-607. · 14.40 Impact Factor
ABSTRACT: We have mapped 1001 novel sequence-tagged sites on human chromosome 14. The mean spacing between markers is approximately 90 kb, most markers are mapped with a resolution of better than 100 kb, and physical distances are determined. The map was produced using HAPPY mapping, a simple and widely applicable in vitro approach that is analogous to linkage or to radiation hybrid mapping, but that circumvents many of the difficulties and potential artifacts associated with these methods. We show also that the map serves as a robust scaffold for building physical maps using large-insert clones.