Article

The fine details of evolution

Abstract

Charles Darwin's theory of evolution was based on studies of biology at the species level. In the time since his death, studies at the molecular level have confirmed his ideas about the kinship of all life on Earth and have provided a wealth of detail about the evolutionary relationships between different species and a deeper understanding of the finer workings of natural selection. We now have a wealth of data, including the genome sequences of a wide range of organisms, an even larger number of protein sequences, a significant knowledge of the three-dimensional structures of proteins, DNA and other biological molecules, and a huge body of information about the operation of these molecules as systems in the molecular machinery of all living things. This issue of Biochemical Society Transactions contains papers from oral presentations given at a Biochemical Society Focused Meeting to commemorate the 200th Anniversary of Charles Darwin's birth, held on 26-27 January 2009 at the Wellcome Trust Conference Centre, Cambridge. The talks reported on some of the insights into evolution which have been obtained from the study of protein sequences, structures and systems.

... The better the classifier maps the training data and separates the two classes, the more accurately it will predict the class of new unlabeled items. However, protein families are heterogeneous, as they represent a long evolutionary history of a wide range of organisms (Laskowski et al. 2009; Ohta 2008; Das et al. 2015). ...
Article
The aim of this paper is to evaluate improvement in the classification of protein sequence data by introducing clustering as a preprocessing step. Clustering analysis was introduced to discover any possible sub-clusters that might have different patterns within the same protein class. A classification learning algorithm is then applied to each cluster to enhance the classification accuracy. Two standard benchmark datasets, caspase 3 human substrates that include cleaved and non-cleaved peptides, and the membrane proteins inner and α-helical proteins, were used to examine the proposed approach. Different descriptors based on the physicochemical properties of amino acids were extracted from the protein sequence data, and two encoding methods were used to represent the protein sequences using the descriptors. The results show that applying a clustering process prior to classification gives higher prediction accuracy than using classification alone. In addition, the timing results show that the proposed approach significantly reduced the training time of the classification process while maintaining prediction accuracy.
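The cluster-then-classify idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the plain k-means routine, the nearest-centroid classifier, the function names and the toy two-dimensional points are all hypothetical choices made to keep the example self-contained. The paper's actual descriptors, encodings and learning algorithm are not reproduced here.

```python
from collections import defaultdict
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [mean(groups[i]) if groups[i] else centroids[i]
                     for i in range(k)]
    return centroids


def fit(points, labels, k):
    """Cluster first, then build one nearest-centroid classifier per cluster."""
    centroids = kmeans(points, k)
    per_cluster = defaultdict(lambda: defaultdict(list))
    for p, y in zip(points, labels):
        c = min(range(k), key=lambda i: dist(p, centroids[i]))
        per_cluster[c][y].append(p)
    # For each cluster, store one centroid per class label seen in that cluster.
    models = {c: {y: mean(ps) for y, ps in by_label.items()}
              for c, by_label in per_cluster.items()}
    return centroids, models


def predict(model, p):
    """Route p to its nearest trained cluster, then to the nearest class there."""
    centroids, models = model
    c = min(models, key=lambda i: dist(p, centroids[i]))
    class_centroids = models[c]
    return min(class_centroids, key=lambda y: dist(p, class_centroids[y]))


# Toy data (hypothetical): two sub-clusters in which the class pattern differs.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1), (11, 10), (11, 11), (10, 11)]
labels = ["cleaved", "non-cleaved", "cleaved", "non-cleaved",
          "non-cleaved", "cleaved", "cleaved", "non-cleaved"]
model = fit(points, labels, k=2)
```

In the toy data the "cleaved" class sits at low x in one sub-cluster and at high x in the other, so a classifier trained per cluster can use a different decision rule in each region, which is the motivation the abstract gives for clustering first.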
... The first category of duplicated genes is highly specialized, and their functions can only be altered after significant modifications. The second category is less specialized and includes enzymes such as esterases, cytochrome P450 and glutathione S-transferases, which interact with a wide range of substrates and become better adapted to a single substrate over time [49]. There is no constant rate of protein evolution, and the process occurs under specific environmental conditions at any given time. ...
Article
Self-assembled peptides and proteins possess tremendous potential as targeted drug delivery systems and key applications of these well-defined nanostructures reside in anti-cancer therapy. Peptides and proteins can self-assemble into nanostructures of diverse sizes and shapes in response to changing environmental conditions such as pH, temperature, ionic strength, as well as host and guest molecular interactions; their countless benefits include good biocompatibility and high loading capacity for hydrophobic and hydrophilic drugs. These self-assembled nanomaterials can be adorned with functional moieties to specifically target tumor cells. Stimuli-responsive features can also be incorporated with respect to the tumor microenvironment. This review sheds light on the growing interest in self-assembled peptides and proteins and their burgeoning applications in cancer treatment and immunotherapy.
Article
Molecular gelators are currently receiving a great deal of attention. These are small molecules which, under the appropriate conditions, assemble in solution to, in the majority of cases, give long fibrillar structures which entangle to form a three-dimensional network. This immobilises the solvent, resulting in a gel. Such gelators have potential application in a number of important areas from drug delivery to tissue engineering. Recently, the use of peptide-conjugates has become prevalent with oligopeptides (from as short as two amino acids in length) conjugated to a polymer, alkyl chain or aromatic group such as naphthalene or fluorenylmethoxycarbonyl (Fmoc) being shown to be effective molecular gelators. The field of gelation is extremely large; here we focus our attention on the use of these peptide-conjugates as molecular hydrogelators.
Article
The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information that is essential for modern biological research. UniProt is produced by the UniProt Consortium which consists of groups from the European Bioinformatics Institute, the Protein Information Resource and the Swiss Institute of Bioinformatics. The core activities include manual curation of protein sequences assisted by computational analysis, sequence archiving, a user-friendly UniProt website and the provision of additional value-added information through cross-references to other databases. UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. One of the key achievements of the UniProt consortium in 2008 is the completion of the first draft of the complete human proteome in UniProtKB/Swiss-Prot. This manually annotated representation of all currently known human protein-coding genes was made available in UniProt release 14.0 with 20 325 entries. UniProt is updated and distributed every three weeks and can be accessed online for searches or downloaded at www.uniprot.org.
Article
HGT (horizontal gene transfer) is recognized as an important force in bacterial evolution. Now that many eukaryotic genomes have been sequenced, it has become possible to carry out studies of HGT in eukaryotes. The present review compares the different approaches that exist for identifying HGT genes and assesses them in the context of studying eukaryotic evolution. The metabolic evolution resource metaTIGER is then described, with discussion of its application in identification of HGT in eukaryotes.
Article
Genomes contain a large number of genes that do not have recognizable homologues in other species. These genes, found in only one or a few closely related species, are known as orphan genes. Their limited distribution implies that many of them are probably involved in lineage-specific adaptive processes. One important question that has remained elusive to date is how orphan genes originate. It has been proposed that they might have arisen by gene duplication followed by a period of very rapid sequence divergence, which would have erased any traces of similarity to other evolutionarily related genes. However, this explanation does not seem plausible for genes lacking homologues in very closely related species. In the present article, we review recent efforts to identify the mechanisms of formation of primate orphan genes. These studies reveal an unexpected important role of transposable elements in the formation of novel protein-coding genes in the genomes of primates.
Article
Catalase/peroxidases (KatGs) are bifunctional haem b-containing (Class I) peroxidases with overwhelming catalase activity and substantial peroxidase activity with various one-electron donors. These unique oxidoreductases evolved in ancestral bacteria revealing a complex gene-duplicated structure. Besides being found in numerous bacteria of all phyla, katG genes were also detected in genomes of lower eukaryotes, most prominently of sac and club fungi. Phylogenetic analysis demonstrates the occurrence of two distinct groups of fungal KatGs that differ in localization, structural and functional properties. Analysis of lateral gene transfer of bacterial katGs into fungal genomes reveals that the most probable progenitor was a katG from a bacteroidetes predecessor. The putative physiological role(s) of both fungal KatG groups is discussed with respect to known structure-function relationships in bacterial KatGs and is related with the acquisition of (phyto)pathogenicity in fungi.
Article
Molecular function is the result of proteins working together, mediated by highly specific interactions. Maintenance and change of protein interactions can thus be considered one of the main links between molecular function and mutation. As a consequence, protein interaction datasets can be used to study functional evolution directly. In terms of constraining change, the co-evolution of interacting molecules is a very subtle process. This has implications for the signal being used to predict protein-protein interactions. In terms of functional change, the 'rewiring' of interaction networks, gene duplication is critically important. Interestingly, once duplication has occurred, the genes involved have different probabilities of being retained related to how they were generated. In the present paper, we discuss some of our recent work in this area.
Article
The evolution of proteins is inseparably linked to their function. Because most biological processes involve a number of different proteins, it may become impossible to study the evolutionary properties of proteins in isolation. In the present article, we show how simple mechanistic models of biological processes can complement conventional comparative analyses of biological traits. We use the specific example of the phage-shock stress response, which has been well characterized in Escherichia coli, to elucidate patterns of gene sharing and sequence conservation across bacterial species.
Article
Protein domains are the common currency of protein structure and function. Over 10,000 such protein families have now been collected in the Pfam database. Using these data along with animal gene phylogenies from TreeFam allowed us to investigate the gain and loss of protein domains. Most gains and losses of domains occur at protein termini. We show that the nature of changes is similar after speciation or duplication events. However, changes in domain architecture happen at a higher frequency after gene duplication. We suggest that the bias towards protein termini is largely because insertion and deletion of domains at most positions in a protein are likely to disrupt the structure of existing domains. We can also use Pfam to trace the evolution of specific families. For example, the immunoglobulin superfamily can be traced over 500 million years during its expansion into one of the largest families in the human genome. It can be shown that this protein family has its origins in basic animals such as the poriferan sponges where it is found in cell-surface-receptor proteins. We can trace how the structure and sequence of this family diverged during vertebrate evolution into constant and variable domains that are found in the antibodies of our immune system as well as in neural and muscle proteins.
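The observation that most domain gains and losses occur at protein termini can be made concrete with a small sketch. The function below is a hypothetical illustration, not the method used in the study: it assumes each protein is represented as a list of domain family names ordered from N- to C-terminus, and it only handles a single contiguous gain or loss between an ancestral and a descendant architecture. The domain names in the usage lines are placeholders.

```python
def classify_change(ancestor, descendant):
    """Classify one contiguous domain gain or loss between two architectures.

    Both arguments are lists of domain names ordered N- to C-terminus.
    Returns (kind, position), where kind is "none", "gain", "loss" or
    "complex" and position is "N-terminal", "C-terminal", "either terminus",
    "internal" or None.
    """
    if ancestor == descendant:
        return ("none", None)
    if len(ancestor) == len(descendant):
        # Same length but different content: not a simple gain or loss.
        return ("complex", None)

    kind = "gain" if len(descendant) > len(ancestor) else "loss"
    shorter, longer = sorted((ancestor, descendant), key=len)
    extra = len(longer) - len(shorter)

    at_n = longer[extra:] == shorter          # extra domains precede the rest
    at_c = longer[:len(shorter)] == shorter   # extra domains follow the rest
    if at_n and at_c:
        # Repeats such as [A] -> [A, A] cannot be assigned to one end.
        return (kind, "either terminus")
    if at_n:
        return (kind, "N-terminal")
    if at_c:
        return (kind, "C-terminal")

    # Look for a single block inserted or deleted between two existing domains.
    for i in range(1, len(shorter)):
        if longer[:i] == shorter[:i] and longer[i + extra:] == shorter[i:]:
            return (kind, "internal")
    return ("complex", None)


# Placeholder architectures (hypothetical domain names).
print(classify_change(["A", "B"], ["A", "B", "C"]))
print(classify_change(["X", "A", "B"], ["A", "B"]))
print(classify_change(["A", "C"], ["A", "B", "C"]))
```

Counting the positions returned over many ancestor/descendant pairs would give the kind of terminal-versus-internal tally the abstract describes, although inferring the ancestral architectures themselves requires the gene phylogenies and is outside this sketch.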
Article
To take full advantage of the mouse as a model organism, it is essential to distinguish lineage-specific biology from what is shared between human and mouse. Investigations into shared genetic elements common to both have been well served by the draft human and mouse genome sequences. More recently, the virtually complete euchromatic sequences of the two reference genomes have been finished. These reveal a high (approximately 5%) level of sequence duplications that had previously been recalcitrant to sequencing and assembly. Within these duplications lie large numbers of rodent- or primate-specific genes. In the present paper, we review the sequence properties of the two genomes, dwelling most on the duplications, deletions and insertions that separate each of them from their most recent common ancestor, approx. 90 million years ago. We consider the differences in gene numbers and repertoires between the two species, and speculate on their contributions to lineage-specific biology. Losses of ancient single-copy genes are rare, as are gains of new functional genes through retrotransposition. Instead, most changes to the gene repertoire have occurred in large multicopy families. It has been proposed that numbers of such 'environmental genes' rise and fall, and their sequences change, as adaptive responses to infection and other environmental pressures, including conspecific competition. Nevertheless, many such genes may be under little or no selection.
Article
Divergent evolution of proteins reflects both selectively advantageous and neutral amino acid substitutions. In the present article, we examine restraints on sequence, which arise from selectively advantageous roles for structure and function and which lead to the conservation of local sequences and structures in families and superfamilies. We analyse structurally aligned members of protein families and superfamilies in order to investigate the importance of the local structural environment of amino acid residues in the acceptance of amino acid substitutions during protein evolution. We show that solvent accessibility is the most important determinant, followed by the existence of hydrogen bonds from the side-chain to main-chain functions and the nature of the element of secondary structure to which the amino acid contributes. Polar side chains whose hydrogen-bonding potential is satisfied tend to be more conserved than their unsatisfied or non-hydrogen-bonded counterparts, and buried and satisfied polar residues tend to be significantly more conserved than buried hydrophobic residues. Finally, we discuss the importance of functional restraints in the form of interactions of proteins with other macromolecules in assemblies or with substrates, ligands or allosteric regulators. We show that residues involved in such functional interactions are significantly more conserved and have differing amino acid substitution patterns.
Article
Gene duplication provides raw material for functional innovation. Recent advances have shed light on two fundamental questions regarding gene duplication: which genes tend to undergo duplication? And how does natural selection subsequently act on them? Genomic data suggest that different gene classes tend to be retained after single-gene and whole-genome duplications. We also know that functional differences between duplicate genes can originate in several different ways, including mutations that directly impart new functions, subdivision of ancestral functions and selection for changes in gene dosage. Interestingly, in many cases the 'new' function of one copy is a secondary property that was always present, but that has been co-opted to a primary role after the duplication.
Article
In systems biology, biologically relevant quantitative modelling of physiological processes requires the integration of experimental data from diverse sources. Recent developments in high-throughput methodologies enable the analysis of the transcriptome, proteome, interactome, metabolome and phenome on an unprecedented scale, thus contributing to the deluge of experimental data held in numerous public databases. In this review, we describe some of the databases and simulation tools that are relevant to systems biology and discuss a number of key issues affecting data integration and the challenges these pose to systems-level research.
Article
The worldwide Protein Data Bank (wwPDB) is the international collaboration that manages the deposition, processing and distribution of the PDB archive. The online PDB archive is a repository for the coordinates and related information for more than 38 000 structures, including proteins, nucleic acids and large macromolecular complexes that have been determined using X-ray crystallography, NMR and electron microscopy techniques. The founding members of the wwPDB are RCSB PDB (USA), MSD-EBI (Europe) and PDBj (Japan) [H.M. Berman, K. Henrick and H. Nakamura (2003) Nature Struct. Biol., 10, 980]. The BMRB group (USA) joined the wwPDB in 2006. The mission of the wwPDB is to maintain a single archive of macromolecular structural data that are freely and publicly available to the global community. Additionally, the wwPDB provides a variety of services to a broad community of users. The wwPDB website at http://www.wwpdb.org/ provides information about services provided by the individual member organizations and about projects undertaken by the wwPDB.
Article
MetaCyc (MetaCyc.org) is a universal database of metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are curated from the primary scientific literature, and are experimentally determined small-molecule metabolic pathways. Each reaction in a MetaCyc pathway is annotated with one or more well-characterized enzymes. Because MetaCyc contains only experimentally elucidated knowledge, it provides a uniquely high-quality resource for metabolic pathways and enzymes. BioCyc (BioCyc.org) is a collection of more than 350 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the predicted metabolic network of one organism, including metabolic pathways, enzymes, metabolites and reactions predicted by the Pathway Tools software using MetaCyc as a reference database. BioCyc PGDBs also contain predicted operons and predicted pathway hole fillers—predictions of which enzymes may catalyze pathway reactions that have not been assigned to an enzyme. The BioCyc website offers many tools for computational analysis of PGDBs, including comparative analysis and analysis of omics data in a pathway context. The BioCyc PGDBs generated by SRI are offered for adoption by any interested party for the ongoing integration of metabolic and genome-related information about an organism.
Article
The Genomes On Line Database (GOLD) is a comprehensive resource that provides information on genome and metagenome projects worldwide. Complete and ongoing projects and their associated metadata can be accessed in GOLD through pre-computed lists and a search page. As of September 2007, GOLD contains information on more than 2900 sequencing projects, out of which 639 have been completed and their sequence data deposited in the public databases. GOLD continues to expand with the goal of providing metadata information related to the projects and the organisms/environments towards the 'Minimum Information about a Genome Sequence' (MIGS) guideline. GOLD is available at http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece at http://gold.imbb.forth.gr/
Article
In considering the Origin of Species, it is quite conceivable that a naturalist, reflecting on the mutual affinities of organic beings, on their embryological relations, their geographical distribution, geological succession, and other such facts, might come to the conclusion that each species had not been independently created, but had descended, like varieties, from other species. Nevertheless, such a conclusion, even if well founded, would be unsatisfactory, until it could be shown how the innumerable species inhabiting this world have been modified, so as to acquire that perfection of structure and coadaptation which most justly excites our admiration. Naturalists continually refer to external conditions, such as climate, food, &c, as the only possible cause of variation. In one very limited sense, as we shall hereafter see, this may be true; but it is preposterous to attribute to mere external conditions, the structure, for instance, of the woodpecker, with its feet, tail, beak, and tongue, so admirably adapted to catch insects under the bark of trees. In the case of the misseltoe, which draws its nourishment from certain trees, which has seeds that must be transported by certain birds, and which has flowers with separate sexes absolutely requiring the agency of certain insects to bring pollen from one flower to the other, it is equally preposterous to account for the structure of this parasite, with its relations to several distinct organic beings, by the effects of external conditions, or of habit, or of the volition of the plant itself.
Article
Spectrin is a cytoskeletal protein thought to have descended from an α-actinin-like ancestor. It emerged during evolution of animals to promote integration of cells into tissues by assembling signalling and cell adhesion complexes, by enhancing the mechanical stability of membranes and by promoting assembly of specialized membrane domains. Spectrin functions as an (αβ[H])₂ tetramer that cross-links transmembrane proteins, membrane lipids and the actin cytoskeleton, either directly or via adaptor proteins such as ankyrin and 4.1. In the present paper, I review recent findings on the origins and adaptations in this system. (i) The genome of the choanoflagellate Monosiga brevicollis encodes α-, β- and β(Heavy)-spectrin, indicating that spectrins evolved in the immediate unicellular precursors of animals. (ii) Ankyrin and 4.1 are not encoded in that genome, indicating that spectrin gained function during subsequent animal evolution. (iii) Protein 4.1 gained a spectrin-binding activity in the evolution of vertebrates. (iv) Interaction of chicken or mammal β-spectrin with PtdInsP₂ can be regulated by differential mRNA splicing, which can eliminate the PH (pleckstrin homology) domain in βI- or βII-spectrins; in the case of mammalian βII-spectrin, the alternative C-terminal region encodes a phosphorylation site that regulates interaction with α-spectrin. (v) In mammalian evolution, the single pre-existing α-spectrin gene was duplicated, and one of the resulting pair (αI) neo-functionalized for rapid make-and-break of tetramers. I hypothesize that the elasticity of mammalian non-nucleated erythrocytes depends on the dynamic rearrangement of spectrin dimers/tetramers under the shearing forces experienced in circulation.
Article
Proteins of the SNARE (soluble N-ethylmaleimide-sensitive factor-attachment protein receptor) family are key factors in all vesicle-fusion steps in the endocytic and secretory pathways. SNAREs can assemble into a tight four-helix bundle complex between opposing membranes, a process that is thought to pull the two membranes into close proximity. The complex-forming domains are highly conserved, not only between different species, but also between different vesicular trafficking steps. SNARE protein sequences can be classified into four main types (Qa, Qb, Qc and R), each reflecting their position in the four-helix bundle. Further refinement of these main types resulted in the identification of 20 distinct conserved groups, which probably reflect the original repertoire of a proto-eukaryotic cell. We analysed the evolution of the SNARE repertoires in metazoa and fungi and unveiled remarkable differences in both lineages. In metazoa, the SNARE repertoire appears to have undergone a substantial expansion, particularly in the endosomal pathways. This expansion probably occurred during the transition from a unicellular to a multicellular lifestyle. We also observed another expansion that led to a major increase of the secretory SNAREs in the vertebrate lineage. Interestingly, fungi developed multicellularity independently, but in contrast with plants and metazoa, this change was not accompanied by an expansion of the SNARE set. Our findings suggest that the rise of multicellularity is not generally linked to an expansion of the SNARE set. The structural and functional diversity that exists between fungi and metazoa might offer a simple explanation for the distinct evolutionary history of their SNARE repertoires.
Article
The evolution of protein function appears to involve alternating periods of conservative evolution and of relatively rapid change. Evidence for such episodic evolution, consistent with some theoretical expectations, comes from the application of increasingly sophisticated models of evolution to large sequence datasets. We present here some of the recent methods to detect functional shifts, using amino acid or codon models. Both provide evidence for punctual shifts in patterns of amino acid conservation, including the fixation of key changes by positive selection. Although a link to gene duplication, a presumed source of functional changes, has been difficult to establish, this episodic model appears to apply to a wide variety of proteins and organisms.
Article
There is considerable variation in the rate at which different proteins evolve. Why is this? Classically, it has been considered that the density of functionally important sites must predict rates of protein evolution. Likewise, amino acid choice is usually assumed to reflect optimal protein function. In the present article, we briefly review evidence suggesting that this protein function-centred view is too simplistic. In particular, we concentrate on how selection acting during the protein's production history can also affect protein evolutionary rates and amino acid choice. Exploring the role of selection at the DNA and RNA level, we specifically address how the need (i) to specify exonic splice enhancer motifs in pre-mRNA, and (ii) to ensure nucleosome positioning on DNA have an impact on amino acid choice and rates of evolution. For both, we review evidence that sequence affected by more than one coding demand is particularly constrained. Strikingly, in mammals, splicing-related constraints are quantitatively as important as expression parameters in predicting rates of protein evolution. These results indicate that there is substantially more to protein evolution than protein functional constraints.
Article
The study of superfamilies of protein domains using a combination of structure, sequence and function data provides insights into deep evolutionary history. In the present paper, analyses of functional diversity within such superfamilies as defined in the CATH-Gene3D resource are described. These analyses focus on structure-function relationships in very large and diverse superfamilies, and on the evolution of domain superfamily members in protein-protein complexes.
Article
A functional enzyme displays activity with at least one substrate and can be represented by a vector in substrate-activity space. Many enzymes, including GSTs (glutathione transferases), are promiscuous in the sense that they act on alternative substrates, and the corresponding vectors operate in multidimensional space. The direction of the vector is governed by the relative activities of the diverse substrates. Stochastic mutations of already existing enzymes generate populations of variants, and clusters of functionally similar mutants can serve as parents for subsequent generations of enzymes. The proper evolving unit is a functional quasi-species, which may not be identical with the 'best' variant in its generation. The manifestation of the quasi-species is dependent on the substrate matrix used to explore catalytic activities. Multivariate analysis is an approach to identifying quasi-species and investigating evolutionary trajectories in the directed evolution of enzymes for novel functions.
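The vector picture in this abstract can be illustrated with a short sketch. It assumes, purely for illustration, that each enzyme variant is described by a list of non-negative activities measured against the same ordered panel of substrates; the function names and the numbers are hypothetical and are not taken from the paper. The direction of a variant is its activity profile scaled to unit length, and the cosine of the angle between two directions is one simple way to ask whether two variants share a substrate profile regardless of overall activity.

```python
from math import sqrt


def direction(activities):
    """Unit vector of an activity profile (one entry per substrate)."""
    norm = sqrt(sum(a * a for a in activities))
    if norm == 0:
        raise ValueError("an enzyme with no measured activity has no direction")
    return [a / norm for a in activities]


def cosine_similarity(a, b):
    """Cosine of the angle between two activity profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same substrate panel")
    return sum(x * y for x, y in zip(direction(a), direction(b)))


# Hypothetical activities of three variants against three substrates.
parent = [1.0, 2.0, 0.0]
faster_variant = [2.0, 4.0, 0.0]    # same relative activities, higher magnitude
shifted_variant = [0.0, 0.0, 5.0]   # activity moved to a different substrate

print(cosine_similarity(parent, faster_variant))
print(cosine_similarity(parent, shifted_variant))
```

Variants whose profiles point in nearly the same direction would fall into the same functional cluster in this simplified view; the multivariate analysis mentioned in the abstract is a more general approach than this pairwise comparison.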
Article
Phosphorylation plays essential roles in nearly every aspect of cell life. Protein kinases regulate signalling pathways and cellular processes that mediate metabolism, transcription, cell-cycle progression, differentiation, cytoskeleton arrangement and cell movement, apoptosis, intercellular communication, and neuronal and immunological functions. Protein kinases share a conserved catalytic domain, which catalyses the transfer of the gamma-phosphate of ATP to a serine, threonine or tyrosine residue in protein substrates. The kinase can exist in an active or inactive state regulated by a variety of mechanisms in different kinases that include control by phosphorylation, regulation by additional domains that may target other molecules, binding and regulation by additional subunits, and control by protein-protein association. This Novartis Medal Lecture was delivered at a meeting on protein evolution celebrating the 200th anniversary of Charles Darwin's birth. I begin with a summary of current observations from protein sequences of kinase phylogeny. I then review the structural consequences of protein phosphorylation using our work on glycogen phosphorylase to illustrate one of the more dramatic consequences of phosphorylation. Regulation of protein phosphorylation is frequently disrupted in the diseased state, and protein kinases have become high-profile targets for drug development. Finally, I consider recent advances on protein kinases as drug targets and describe some of our recent work with CDK9 (cyclin-dependent kinase 9)-cyclin T, a regulator of transcription.