A cast of hundreds, if not quite thousands, of researchers worldwide have published their work on the pilot phase of the 1000 Genomes Project, building directly on the success of previous efforts of the Human Genome Project and the International HapMap Project. The 1000 Genomes Project's stated goal is quite specific: “.
Rice, one of the most important food crops for humans, is the first crop plant to have its genome sequenced. Rice whole-genome microarrays, genome tiling arrays and genome-wide gene-indexed mutant collections have recently been generated. With the availability of these resources, discovering the function of the estimated 41,000 rice genes is now within reach. Such discoveries have broad practical implications for understanding the biological processes of rice and other economically important grasses such as cereals and bioenergy crops.
Subtelomeres are extraordinarily dynamic and variable regions near the ends of chromosomes. They are defined by their unusual structure: patchworks of blocks that are duplicated near the ends of multiple chromosomes. Duplications among subtelomeres have spawned small gene families, making inter-individual variation in subtelomeres a potential source of phenotypic diversity. The ectopic recombination that occurs between subtelomeres might also have a role in reconstituting telomeres in the absence of telomerase. However, the propensity for subtelomeres to interchange is a double-edged sword, as extensive subtelomeric homology can mediate deleterious rearrangements of the ends of chromosomes to cause human disease.
Since the discovery in 1993 of the first small silencing RNA, a dizzying number of small RNA classes have been identified, including microRNAs (miRNAs), small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs). These classes differ in their biogenesis, their modes of target regulation and the biological pathways they regulate. There is a growing realization that, despite their differences, these distinct small RNA pathways are interconnected, and that small RNA pathways compete and collaborate as they regulate genes and protect the genome from external and internal threats.
Susceptibility to the growing global public health problem of cardiovascular disease is associated with levels of plasma lipids and lipoproteins. Several experimental strategies have helped us to clarify the genetic architecture of these complex traits, including classical studies of monogenic dyslipidaemias, resequencing, phenomic analysis and, more recently, genome-wide association studies and analysis of metabolic networks. The genetic basis of plasma lipoprotein levels can now be modelled as a mosaic of contributions from multiple DNA sequence variants, both rare and common, with varying effect sizes. In addition to filling gaps in our understanding of plasma lipoprotein metabolism, the recent genetic advances will improve our ability to classify, diagnose and treat dyslipidaemias.
To fully understand the allelic variation that underlies common diseases, complete genome sequencing for many individuals with and without disease is required. This is still not technically feasible. However, recently it has become possible to carry out partial surveys of the genome by genotyping large numbers of common SNPs in genome-wide association studies. Here, we outline the main factors - including models of the allelic architecture of common diseases, sample size, map density and sample-collection biases - that need to be taken into account in order to optimize the cost efficiency of identifying genuine disease-susceptibility loci.
Digital genetics, or the genetics of digital organisms, is a new field of research that has become possible as a result of the remarkable power of evolution experiments that use computers. Self-replicating strands of computer code that inhabit specially prepared computers can mutate, evolve and adapt to their environment. Digital organisms make it easy to conduct repeatable, controlled experiments, which have a perfect genetic 'fossil record'. This allows researchers to address fundamental questions about the genetic basis of the evolution of complexity, genome organization, robustness and evolvability, and to test the consequences of mutations, including their interaction and recombination, on the fate of populations and lineages.
MicroRNAs (miRNAs) are a large family of post-transcriptional regulators of gene expression that are approximately 21 nucleotides in length and control many developmental and cellular processes in eukaryotic organisms. Research during the past decade has identified major factors participating in miRNA biogenesis and has established basic principles of miRNA function. More recently, it has become apparent that miRNA regulators themselves are subject to sophisticated control. Many studies over the past few years have reported the regulation of miRNA metabolism and function by a range of mechanisms involving numerous protein-protein and protein-RNA interactions. Such regulation has an important role in the context-specific functions of miRNAs.
Repeat expansion mutations cause at least 22 inherited neurological diseases. The
complexity of repeat disease genetics and pathobiology has revealed unexpected shared themes
and mechanistic pathways among the diseases, such as RNA toxicity. Also, investigation of
the polyglutamine diseases has identified post-translational modification as a key step in
the pathogenic cascade and has shown that the autophagy pathway has an important role in the
degradation of misfolded proteins — two themes that are likely to be relevant to the entire
neurodegeneration field. Insights from repeat disease research are catalysing new lines of
study that should not only elucidate molecular mechanisms of disease but also highlight
opportunities for therapeutic intervention for these currently untreatable disorders.
Integrating results from diverse experiments is an essential process in our effort to understand the logic of complex systems, such as development, homeostasis and responses to the environment. With the advent of high-throughput methods--including genome-wide association (GWA) studies, chromatin immunoprecipitation followed by sequencing (ChIP-seq) and RNA sequencing (RNA-seq)--acquisition of genome-scale data has never been easier. Epigenomics, transcriptomics, proteomics and genomics each provide an insightful, and yet one-dimensional, view of genome function; integrative analysis promises a unified, global view. However, the large amount of information and diverse technology platforms pose multiple challenges for data access and processing. This Review discusses emerging issues and strategies related to data integration in the era of next-generation genomics.
Despite efforts from a range of disciplines, our ability to predict and combat the
evolution of antibiotic resistance in pathogenic bacteria is limited. This is because
resistance evolution involves a complex interplay between the specific drug, bacterial
genetics and both natural and treatment ecology. Incorporating details of the molecular
mechanisms of drug resistance and ecology into evolutionary models has proved useful in
predicting the dynamics of resistance evolution. However, putting these models to practical
use will require extensive collaboration between mathematicians, molecular biologists,
evolutionary ecologists and clinicians.
All plant and animal species arise by speciation - the evolutionary splitting of one species into two reproductively incompatible species. But until recently our understanding of the molecular genetic details of speciation was slow in coming and largely limited to Drosophila species. Here, I review progress in determining the molecular identities and evolutionary histories of several new 'speciation genes' that cause hybrid dysfunction between species of yeast, flies, mice and plants. The new work suggests that, surprisingly, the first steps in the evolution of hybrid dysfunction are not necessarily adaptive.
Genome-wide association (GWA) studies for pharmacogenomics-related traits are increasingly
being performed to identify loci that affect either drug response or susceptibility to
adverse drug reactions. Until now, only the largest effects have been detected, partly
because of the challenges of obtaining large numbers of cases for pharmacogenomic studies.
Since 2007, a range of pharmacogenomics GWA studies have been published that have identified
several interesting and novel associations between drug responses or reactions and
clinically relevant loci, showing the value of this approach.
Until recently, it was impracticable to identify the genes that are responsible for variation in continuous traits, or to directly observe the effects of their different alleles. Now, the abundance of genetic markers has made it possible to identify quantitative trait loci (QTL)--the regions of a chromosome or, ideally, individual sequence variants that are responsible for trait variation. What kind of QTL do we expect to find and what can our observations of QTL tell us about how organisms evolve? The key to understanding the evolutionary significance of QTL is to understand the nature of inherited variation, not in the immediate mechanistic sense of how genes influence phenotype, but, rather, to know what evolutionary forces maintain genetic variability.
DNA methylation has recently moved to centre stage in the aetiology of
human neurodevelopmental syndromes such as the fragile X, ICF and Rett syndromes.
These diseases result from the misregulation of genes that occurs with the
loss of appropriate epigenetic controls during neuronal development. Recent
advances have connected DNA methylation to chromatin-remodelling enzymes,
and understanding this link will be central to the design of new therapeutic
Host-adapted bacteria include mutualists and pathogens of animals, plants and insects. Their study is therefore important for biotechnology, biodiversity and human health. The recent rapid expansion in bacterial genome data has provided insights into the adaptive, diversifying and reductive evolutionary processes that occur during host adaptation. The results have challenged many pre-existing concepts built from studies of laboratory bacterial strains. Furthermore, recent studies have revealed genetic changes associated with transitions from parasitism to mutualism and opened new research avenues to understand the functional reshaping of bacteria as they adapt to growth in the cytoplasm of a eukaryotic host.
Pathogens have always been a major cause of human mortality, so they impose strong selective pressure on the human genome. Data from population genetic studies, including genome-wide scans for selection, are providing important insights into how natural selection has shaped immunity and host defence genes in specific human populations and in the human species as a whole. These findings are helping to delineate genes that are important for host defence and to increase our understanding of how past selection has had an impact on disease susceptibility in modern populations. A tighter integration between population genetic studies and immunological phenotype studies is now necessary to reveal the mechanisms that have been crucial for our past and present survival against infection.
The primary cilium has recently stepped into the spotlight, as a flood of data show that this organelle has crucial roles in vertebrate development and human genetic diseases. Cilia are required for the response to developmental signals, and evidence is accumulating that the primary cilium is specialized for hedgehog signal transduction. The formation of cilia, in turn, is regulated by other signalling pathways, possibly including the planar cell polarity pathway. The cilium therefore represents a nexus for signalling pathways during development. The connections between cilia and developmental signalling have begun to clarify the basis of human diseases associated with ciliary dysfunction.
Unlimited cellular proliferation depends on counteracting the telomere attrition that
accompanies DNA replication. In human cancers this usually occurs through upregulation of
telomerase activity, but in 10–15% of cancers — including some with particularly poor
outcome — it is achieved through a mechanism known as alternative lengthening of telomeres
(ALT). ALT, which is dependent on homologous recombination, is therefore an important target
for cancer therapy. Although dissection of the mechanism or mechanisms of ALT has been
challenging, recent advances have led to the identification of several genes that are
required for ALT and the elucidation of the biological significance of some phenotypic
markers of ALT. This has enabled development of a rapid assay of ALT activity levels and the
construction of molecular models of ALT.
Phenotypic variation for quantitative traits results from the simultaneous segregation of alleles at multiple quantitative trait loci. Understanding the genetic architecture of quantitative traits begins with mapping quantitative trait loci to broad genomic regions and ends with the molecular definition of quantitative trait loci alleles. This has been accomplished for some quantitative trait loci in Drosophila. Drosophila quantitative trait loci have sex-, environment- and genotype-specific effects, and are often associated with molecular polymorphisms in non-coding regions of candidate genes. These observations offer valuable lessons to those seeking to understand quantitative traits in other organisms, including humans.
Cancers are caused by the accumulation of genomic alterations. Therefore, analyses of cancer genome sequences and structures provide insights for understanding cancer biology, diagnosis and therapy. The application of second-generation DNA sequencing technologies (also known as next-generation sequencing) - through whole-genome, whole-exome and whole-transcriptome approaches - is allowing substantial advances in cancer genomics. These methods are facilitating an increase in the efficiency and resolution of detection of each of the principal types of somatic cancer genome alterations, including nucleotide substitutions, small insertions and deletions, copy number alterations, chromosomal rearrangements and microbial infections. This Review focuses on the methodological considerations for characterizing somatic genome alterations in cancer and the future prospects for these approaches.
In recent years views of eukaryotic gene expression have been transformed by the finding that enormous diversity can be generated at the RNA level. Advances in technologies for characterizing RNA populations are revealing increasingly complete descriptions of RNA regulation and complexity; for example, through alternative splicing, alternative polyadenylation and RNA editing. New biochemical strategies to map protein-RNA interactions in vivo are yielding transcriptome-wide insights into mechanisms of RNA processing. These advances, combined with bioinformatics and genetic validation, are leading to the generation of functional RNA maps that reveal the rules underlying RNA regulation and networks of biologically coherent transcripts. Together these are providing new insights into molecular cell biology and disease.
Sequence-directed genetic interference pathways control gene expression and preserve genome
integrity in all kingdoms of life. The importance of such pathways is highlighted by the
extensive study of RNA interference (RNAi) and related processes in eukaryotes. In many
bacteria and most archaea, clustered, regularly interspaced short palindromic repeats
(CRISPRs) are involved in a more recently discovered interference pathway that protects
cells from bacteriophages and conjugative plasmids. CRISPR sequences provide an adaptive,
heritable record of past infections and express CRISPR RNAs — small RNAs that target
invasive nucleic acids. Here, we review the mechanisms of CRISPR interference and its roles
in microbial physiology and evolution. We also discuss potential applications of this novel
During their dispersal from Africa, our ancestors were exposed to new environments and diseases. Those who were better adapted to local conditions passed on their genes, including those conferring these benefits, with greater frequency. This process of natural selection left signatures in our genome that can be used to identify genes that might underlie variation in disease resistance or drug metabolism. These signatures are, however, confounded by population history and by variation in local recombination rates. Although this complexity makes finding adaptive polymorphisms a challenge, recent discoveries are instructing us how and where to look for the signatures of selection.
Chemical genetics is the study of gene-product function in a cellular
or organismal context using exogenous ligands. In this approach, small molecules
that bind directly to proteins are used to alter protein function, enabling
a kinetic analysis of the in vivo consequences of these changes. Recent
advances have strongly enhanced the power of exogenous ligands such that they
can resemble genetic mutations in terms of their general applicability and
target specificity. The growing sophistication of this approach raises the
possibility of its application to any biological process.
Theoretical studies of adaptation have exploded over the past decade. This work has been inspired by recent, surprising findings in the experimental study of adaptation. For example, morphological evolution sometimes involves a modest number of genetic changes, with some individual changes having a large effect on the phenotype or fitness. Here I survey the history of adaptation theory, focusing on the rise and fall of various views over the past century and the reasons for the slow development of a mature theory of adaptation. I also discuss the challenges that face contemporary theories of adaptation.
Glioblastoma multiforme is the most malignant of the primary brain tumours
and is almost always fatal. The treatment strategies for this disease have
not changed appreciably for many years and most are based on a limited understanding
of the biology of the disease. However, in the past decade, characteristic
genetic alterations have been identified in gliomas that might underlie the
initiation or progression of the disease. Recent modelling experiments in
mice are helping to delineate the molecular aetiology of this disease and
are providing systems to identify and test novel and rational therapeutic
Genome sequences reveal that a deluge of DNA from organelles has constantly been bombarding the nucleus since the origin of organelles. Recent experiments have shown that DNA is transferred from organelles to the nucleus at frequencies that were previously unimaginable. Endosymbiotic gene transfer is a ubiquitous, continuing and natural process that pervades nuclear DNA dynamics. This relentless influx of organelle DNA has abolished organelle autonomy and increased nuclear complexity.
The genomes of multicellular eukaryotes provide information that determines the phenotype. However, not all sequences in the genome are required for this purpose. Other sequences are often selfish in their actions and interact in complex ways. Here, an analogy is developed between the components of the genome, including mobile DNA elements, and an ecological community. Unlike ecological communities, however, the slow rates at which genomes change allow us to reconstruct patterns of interaction that stretch back tens or hundreds of millions of years.
Genome-wide association studies have greatly improved our understanding of the genetic basis of disease risk. The fact that they tend not to identify more than a fraction of the specific causal loci has led to divergence of opinion over whether most of the variance is hidden as numerous rare variants of large effect or as common variants of very small effect. Here I review 20 arguments for and against each of these models of the genetic basis of complex traits and conclude that both classes of effect can be readily reconciled.
Organisms require an appropriate balance of stability and reversibility in gene expression programmes to maintain cell identity or to enable responses to stimuli; epigenetic regulation is integral to this dynamic control. Post-translational modification of histones by methylation is an important and widespread type of chromatin modification that is known to influence biological processes in the context of development and cellular responses. To evaluate how histone methylation contributes to stable or reversible control, we provide a broad overview of how histone methylation is regulated and leads to biological outcomes. The importance of appropriately maintaining or reprogramming histone methylation is illustrated by its links to disease and ageing and possibly to transmission of traits across generations.
Interest in the role of the microbiome in human health has burgeoned over the past decade with the advent of new technologies for interrogating complex microbial communities. The large-scale dynamics of the microbiome can be described by many of the tools and observations used in the study of population ecology. Deciphering the metagenome and its aggregate genetic information can also be used to understand the functional properties of the microbial community. Both the microbiome and metagenome probably have important functions in health and disease; their exploration is a frontier in human genetics.
Knowledge of epigenetic alterations in disease is rapidly increasing owing to the development of genome-wide techniques for their identification. The ever-growing number of genes that show epigenetic alterations in disease emphasizes the crucial role of these epigenetic alterations - particularly DNA methylation - for future diagnosis, prognosis and prediction of response to therapies. This Review focuses on epigenetic profiling, which has started to be of clinical value in cancer and may in the future be extended to other diseases, such as neurological and autoimmune disorders.
Recent studies have uncovered myriad viral sequences that are integrated or 'endogenized' in the genomes of various eukaryotes. Surprisingly, it appears that not just retroviruses but almost all types of viruses can become endogenous. We review how these genomic 'fossils' offer fresh insights into the origin, evolutionary dynamics and structural evolution of viruses, which are giving rise to the burgeoning field of palaeovirology. We also examine the multitude of ways through which endogenous viruses have influenced, for better or worse, the biology of their hosts. We argue that the conflict between hosts and viruses has led to the invention and diversification of molecular arsenals, which, in turn, promote the cellular co-option of endogenous viruses.
Farm animal populations harbour rich collections of mutations with phenotypic effects that have been purposefully enriched by breeding. Most of these mutations do not have pathological phenotypic consequences, in contrast to the collections of deleterious mutations in model organisms or those causing inherited disorders in humans. Farm animals are of particular interest for identifying genes that control growth, energy metabolism, development, appetite, reproduction and behaviour, as well as other traits that have been manipulated by breeding. Genome research in farm animals will add to our basic understanding of the genetic control of these traits and the results will be applied in breeding programmes to reduce the incidence of disease and to improve product quality and production efficiency.
Genomic DNA is often thought of as the stable template of heredity,
largely dormant and unchanging, apart from perhaps the occasional point mutation.
But it has become increasingly clear that DNA is dynamic rather than static,
being subjected to rearrangements, insertions and deletions. Much of this
plasticity can be attributed to transposable elements and their genomic relatives.
Many genes that mediate sexual reproduction, such as those involved in gamete recognition, diverge rapidly, often as a result of adaptive evolution. This widespread phenomenon might have important consequences, such as the establishment of barriers to fertilization that might lead to speciation. Sequence comparisons and functional studies are beginning to show the extent to which the rapid divergence of reproductive proteins is involved in the speciation process.
Small-RNA-guided gene regulation is a recurring theme in biology. Animal germ cells are characterized by an intriguing small-RNA-mediated gene-silencing mechanism known as the PIWI pathway. For a long time, both the biogenesis of PIWI-interacting RNAs (piRNAs) and their mode of gene silencing have remained elusive. A recent body of work is shedding more light on both aspects and implicates PIWI in the establishment of transgenerational epigenetic states. In fact, the epigenetic states imposed by PIWI on targets may actually drive piRNA production itself. These findings start to couple small RNA biogenesis with small-RNA-mediated epigenetics.
Small-RNA-guided gene regulation has emerged as one of the fundamental principles in cell function, and the major protein players in this process are members of the Argonaute protein family. Argonaute proteins are highly specialized binding modules that accommodate the small RNA component - such as microRNAs (miRNAs), short interfering RNAs (siRNAs) or PIWI-associated RNAs (piRNAs) - and coordinate downstream gene-silencing events by interacting with other protein factors. Recent work has made progress in our understanding of classical Argonaute-mediated gene-silencing principles, such as the effects on mRNA translation and decay, but has also implicated Argonaute proteins in several other cellular processes, such as transcriptional regulation and splicing.
Meta-analysis of genome-wide association studies (GWASs) has become a popular method for discovering genetic risk variants. Here, we provide an overview of both widely applied and newer statistical methods for GWAS meta-analysis, including issues of interpretation and assessment of sources of heterogeneity. We also discuss extensions of these meta-analysis methods to complex data. Where possible, we provide guidelines for researchers who are planning to use these methods. Furthermore, we address special issues that may arise for meta-analysis of sequencing data and rare variants. Finally, we discuss challenges and solutions surrounding the goals of making meta-analysis data publicly available and building powerful consortia.
In mammals and other eukaryotes most of the genome is transcribed in a developmentally regulated manner to produce large numbers of long non-coding RNAs (ncRNAs). Here we review the rapidly advancing field of long ncRNAs, describing their conservation, their organization in the genome and their roles in gene regulation. We also consider the medical implications, and the emerging recognition that any transcript, regardless of coding potential, can have an intrinsic function as an RNA.
Knowing the precise locations of nucleosomes in a genome is key to understanding how genes are regulated. Recent 'next generation' ChIP-chip and ChIP-Seq technologies have accelerated our understanding of the basic principles of chromatin organization. Here we discuss what high-resolution genome-wide maps of nucleosome positions have taught us about how nucleosome positioning demarcates promoter regions and transcriptional start sites, and how the composition and structure of promoter nucleosomes facilitate or inhibit transcription. A detailed picture is starting to emerge of how diverse factors, including underlying DNA sequences and chromatin remodelling complexes, influence nucleosome positioning.
Family history is an important independent risk factor for coronary artery disease (CAD), and identification of susceptibility genes for this common, complex disease is a vital goal. Although there has been considerable success in identifying genetic variants that influence well-known risk factors, such as cholesterol levels, progress in unearthing novel CAD genes has been slow. However, advances are now being made through the application of large-scale, systematic, genome-wide approaches. Recent findings particularly highlight the link between CAD and inflammation and immunity, and underscore the biological insights to be gained from a genetic understanding of the world's biggest killer.
Since the first description of RNA interference (RNAi) in animals less than a decade ago, there has been rapid progress towards its use as a therapeutic modality against human diseases. Advances in our understanding of the mechanisms of RNAi and studies of RNAi in vivo indicate that RNAi-based therapies might soon provide a powerful new arsenal against pathogens and diseases for which treatment options are currently limited. Recent findings have highlighted both promise and challenges in using RNAi for therapeutic applications. Design and delivery strategies for RNAi effector molecules must be carefully considered to address safety concerns and to ensure effective, successful treatment of human diseases.
The success of Drosophila melanogaster as a model organism is largely due to the power of forward genetic screens to identify the genes that are involved in a biological process. Traditional screens, such as the Nobel-prize-winning screen for embryonic-patterning mutants, can only identify the earliest phenotype of a mutation. This review describes the ingenious approaches that have been devised to circumvent this problem: modifier screens, for example, have been invaluable for elucidating signal-transduction pathways, whereas clonal screens now make it possible to screen for almost any phenotype in any cell at any stage of development.
Over two metres of DNA is packaged into each nucleus in the human body in a manner that still allows for gene regulation. This remarkable feat is accomplished by the wrapping of DNA around histone proteins in repeating units of nucleosomes to form a structure known as chromatin. This chromatin structure is subject to various modifications that have profound influences on gene expression. Recently developed techniques to study chromatin modifications at a genome-wide scale are now allowing researchers to probe the complex components that make up epigenomes. Here we review genome-wide approaches to studying epigenomic structure and the exciting findings that have been obtained using these technologies.
If invertebrate neurons are injured by hostile environments or aberrant proteins they die much like human neurons, indicating that the powerful advantages of invertebrate molecular genetics might be successfully used for testing specific hypotheses about human neurological diseases, for drug discovery and for non-biased screens for suppressors and enhancers of neurodegeneration. Recent molecular dissection of the genetic requirements for hypoxia, excitotoxicity and death in models of Alzheimer disease, polyglutamine-expansion disorders, Parkinson disease and more, is providing mechanistic insights into neurotoxicity and suggesting new therapeutic interventions. An emerging theme is that neuronal crises of distinct origins might converge to disrupt common cellular functions, such as protein folding and turnover.
Implantation involves an intricate discourse between the embryo and uterus and is a gateway to further embryonic development. Synchronizing embryonic development until the blastocyst stage with the uterine differentiation that takes place to produce the receptive state is crucial to successful implantation, and therefore to pregnancy outcome. Although implantation involves the interplay of numerous signalling molecules, the hierarchical instructions that coordinate the embryo-uterine dialogue are not well understood. This review highlights our knowledge about the molecular development of preimplantation and implantation and the future challenges of the field. A better understanding of peri-implantation biology could alleviate female infertility and help to develop novel contraceptives.
There has been a long history of innovation and development of tools for gene discovery and genetic analysis in Drosophila melanogaster. This includes methods to induce mutations and to screen for those mutations that disrupt specific processes, methods to map mutations genetically and physically, and methods to clone and characterize genes at the molecular level. Modern genetics also requires techniques to do the reverse: to disrupt the functions of specific genes, the sequences of which are already known. This is the process referred to as reverse genetics. During recent years, some valuable new methods for conducting reverse genetics in Drosophila have been developed.