Comparing Genomes in terms of Protein Structure: Surveys of a Finite Parts List

Department of Molecular Biophysics and Biochemistry, 266 Whitney Avenue, Yale University, PO Box 208114, New Haven, CT 06520, USA
FEMS Microbiology Reviews (Impact Factor: 13.81). 08/1998; DOI: 10.1016/S0168-6445(98)00019-9
Source: CiteSeer

ABSTRACT We give an overview of the emerging field of structural genomics, describing how genomes can be compared in terms of protein structure. As the number of genes in a genome and the total number of protein folds are both quite limited, these comparisons take the form of surveys of a finite parts list, similar in some respects to demographic censuses. Fold surveys have many similarities with other whole-genome characterizations, e.g. analyses of motifs or pathways. However, structure has a number of aspects that make it particularly suitable for comparing genomes, namely the way it allows for the precise definition of a basic protein module and the fact that it has a better defined relationship to sequence similarity than does protein function. An essential requirement for a structure survey is a library of folds, which groups the known structures into "fold families." This library can be built up automatically using a structure-comparison program, and we describe how important objective stat...

  • ABSTRACT: Single nucleotide polymorphisms (SNPs) are useful for genome-wide mapping and study of disease genes. Previous studies have focused on specific genes or SNPs pooled from a variety of different sources. Here, we present a systematic approach to the analysis of SNPs in relation to various features on a genome-wide scale. We have performed a comprehensive analysis of 39,408 SNPs on human chromosomes 21 and 22 from The SNP Consortium (TSC) database, where SNPs are obtained by random sequencing using consistent and uniform methods. Our study indicates that the occurrence of SNPs is lowest in exons and higher in repeats, introns and pseudogenes. Moreover, in comparing genes and pseudogenes, we find that the SNP density is higher in pseudogenes and the ratio of nonsynonymous to synonymous changes is much higher as well. These observations may be explained by the increased rate of SNP accumulation in pseudogenes, which presumably are not under selective pressure. We have also performed secondary structure prediction on all coding regions and found that there is no preferential distribution of SNPs in α-helices, β-sheets or coils. This could imply that protein structures, in general, can tolerate a wide degree of substitutions. Tables relating
  • ABSTRACT: Genomics, the study of the properties of genes and gene products on a whole-organism scale, is revolutionizing all aspects of biology. So powerful has knowledge of the complete nucleotide sequences of the genomes of whole organisms proven to be that it has spawned a large family of progeny, each shifting the emphasis of its discipline to discovery-driven (as opposed to hypothesis-driven) research: high-throughput, genome-scale data acquisition. Among the fields that have jumped onto the genomics bandwagon most rapidly is structural biology. The painstaking determination of structures of individual proteins by laboratories that then spent years following up that work by looking at structures of ligand complexes or mutants is being augmented by assembly-line production of structures for all of the proteins in a pathway or even a whole microbe, as rapidly as possible, with any follow-up work to be left to others. Structural genomics, as this effort is called, has as its stated goals the filling-in of the catalog of known protein folds and the assignment of function to gene products whose functions are not known (these may make up 40% of the gene products in a typical genome), by structural similarity to proteins of known function. How realistic are these expectations? What will be the impact on drug discovery and development? And what other tools are needed to realize the promise inherent in this richness of data?

Available from: Hedi Hegyi, May 26, 2014