-
D Altshuler,
R M Durbin,
G R Abecasis,
D R Bentley,
A Chakravarti,
A G Clark,
F S Collins,
F M De La Vega,
P Donnelly,
M Egholm, [......],
J E McEwen,
A Abdallah,
C R Juenger,
N C Clemm,
A Duncanson,
E D Green,
M S Guyer,
J L Peterson,
Y Xue,
R A Cartwright
Nature 10/2010; 467:1061-1073. · 36.28 Impact Factor
-
S G Gregory,
K F Barlow,
K E McLay,
R Kaul,
D Swarbreck,
A Dunham,
C E Scott,
K L Howe,
K Woodfine,
C C A Spencer, [......],
W D H Burrill,
S M Clegg,
P Dhami,
O Dovey,
L M Faulkner,
S M Gribble,
C F Langford,
R D Pandian,
K M Porter,
E Prigmore
ABSTRACT: The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.
Nature 06/2006; 441(7091):315-321. · 36.28 Impact Factor
-
S. G. Gregory,
K. F. Barlow,
K. E. McLay,
R. Kaul,
D. Swarbreck,
A. Dunham,
C. E. Scott,
K. L. Howe,
K. Woodfine,
C. C. A. Spencer, [......],
R. Wooster,
I. Dunham,
N. P. Carter,
G. McVean,
M. T. Ross,
J. Harrow,
M. V. Olson,
S. Beck,
J. Rogers,
D. R. Bentley
Nature 05/2006; 441(7091):315-321. · 36.28 Impact Factor
-
Journal of Thrombosis and Haemostasis 12/2005; 3(11):2600-2601. · 5.73 Impact Factor
-
M. T. Ross,
D. V. Grafham,
A. J. Coffey,
S. Scherer,
K. McLay,
D. Muzny,
M. Platzer,
G. R. Howell,
C. Burrows,
C. P. Bird, [......],
A. Coulson,
D. L. Nelson,
G. Weinstock,
J. E. Sulston,
R. Durbin,
T. Hubbard,
R. A. Gibbs,
S. Beck,
J. Rogers,
D. R. Bentley
ABSTRACT: The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
Nature 03/2005; 434:325-337. · 36.28 Impact Factor
-
S. J. Humphray,
K. Oliver,
A. R. Hunt,
R. W. Plumb,
J. E. Loveland,
K. L. Howe,
T. D. Andrews,
S. Searle,
S. E. Hunt,
C. E. Scott, [......],
A. Coulson,
H. Blöcker,
R. Durbin,
J. E. Sulston,
T. Hubbard,
M. J. Jackson,
D. R. Bentley,
S. Beck,
J. Rogers,
I. Dunham
ABSTRACT: Chromosome 9 is highly structurally polymorphic. It contains the largest autosomal block of heterochromatin, which is heteromorphic in 6–8% of humans, whereas pericentric inversions occur in more than 1% of the population. The finished euchromatic sequence of chromosome 9 comprises 109,044,351 base pairs and represents >99.6% of the region. Analysis of the sequence reveals many intra- and interchromosomal duplications, including segmental duplications adjacent to both the centromere and the large heterochromatic block. We have annotated 1,149 genes, including genes implicated in male-to-female sex reversal, cancer and neurodegenerative disease, and 426 pseudogenes. The chromosome contains the largest interferon gene cluster in the human genome. There is also a region of exceptionally high gene and G + C content including genes paralogous to those in the major histocompatibility complex. We have also detected recently duplicated genes that exhibit different rates of sequence divergence, presumably reflecting natural selection.
Nature 05/2004; 429(6990):369-374. · 36.28 Impact Factor
-
A Dunham,
L H Matthews,
J Burton,
J L Ashurst,
K L Howe,
K J Ashcroft,
D M Beare,
D C Burford,
S E Hunt,
S Griffiths-Jones, [......],
M W Wright,
L Young,
A Coulson,
R Durbin,
T Hubbard,
J E Sulston,
S Beck,
D R Bentley,
J Rogers,
M T Ross
ABSTRACT: Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the protein-coding genes of this chromosome have been identified, on the basis of comparison with other vertebrate genome sequences. Additionally, 105 putative non-coding RNA genes were found. Chromosome 13 has one of the lowest gene densities (6.5 genes per Mb) among human chromosomes, and contains a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb.
Nature 05/2004; 428(6982):522-528. · 36.28 Impact Factor
-
A. Dunham,
L.H. Matthews,
J. Burton,
J.L. Ashurst,
K.L. Howe,
K.J. Ashcroft,
D.M. Beare,
D.C. Burford,
S.E. Hunt,
S.J. Griffiths-Jones, [......],
M.W. Wright,
L. Young,
A. Coulson,
R. Durbin,
T. Hubbard,
J.E. Sulston,
S. Beck,
D.R. Bentley,
J. Rogers,
M.T. Ross
Nature 04/2004; 428(6982):522-528. · 36.28 Impact Factor
-
A. Dunham,
L. H. Matthews,
J. Burton,
J. L. Ashurst,
K. L. Howe,
K. J. Ashcroft,
D. M. Beare,
D. C. Burford,
S. E. Hunt,
S. Griffiths-Jones, [......],
M. W. Wright,
L. Young,
A. Coulson,
R. Durbin,
T. Hubbard,
J. E. Sulston,
S. Beck,
D. R. Bentley,
J. Rogers,
M. T. Ross
Nature 03/2004; 428(6982):522-528. · 36.28 Impact Factor
-
A J Mungall,
S A Palmer,
S K Sims,
C A Edwards,
J L Ashurst,
L Wilming,
M C Jones,
R Horton,
S E Hunt,
C E Scott, [......],
L Young,
R M Younger,
D R Bentley,
A Coulson,
R Durbin,
T Hubbard,
J E Sulston,
I Dunham,
J Rogers,
S Beck
ABSTRACT: Chromosome 6 is a metacentric chromosome that constitutes about 6% of the human genome. The finished sequence comprises 166,880,988 base pairs, representing the largest chromosome sequenced so far. The entire sequence has been subjected to high-quality manual annotation, resulting in the evidence-supported identification of 1,557 genes and 633 pseudogenes. Here we report that at least 96% of the protein-coding genes have been identified, as assessed by multi-species comparative sequence analysis, and provide evidence for the presence of further, otherwise unsupported exons/genes. Among these are genes directly implicated in cancer, schizophrenia, autoimmunity and many other diseases. Chromosome 6 harbours the largest transfer RNA gene cluster in the genome; we show that this cluster co-localizes with a region of high transcriptional activity. Within the essential immune loci of the major histocompatibility complex, we find HLA-B to be the most polymorphic gene on chromosome 6 and in the human genome.
Nature 11/2003; 425(6960):805-811. · 36.28 Impact Factor
-
A. J. Mungall,
S. A. Palmer,
S. K. Sims,
C. A. Edwards,
J. L. Ashurst,
L. Wilming,
M. C. Jones,
R. Horton,
S. E. Hunt,
C. E. Scott, [......],
L. Young,
R. M. Younger,
D. R. Bentley,
A. Coulson,
R. Durbin,
T. Hubbard,
J. E. Sulston,
I. Dunham,
J. Rogers,
S. Beck
Nature 10/2003; 425(6960):805-811. · 36.28 Impact Factor
-
P. Deloukas,
L. H. Matthews,
J. Ashurst,
J. Burton,
J. G. R. Gilbert,
M. Jones,
G. Stavrides,
J. P. Almeida,
A. K. Babbage,
C. L. Bagguley, [......],
D. L. Willey,
L. Williams,
S. A. Williams,
L. Wilming,
P. W. Wray,
T. Hubbard,
R. M. Durbin,
D. R. Bentley,
S. Beck,
J. Rogers
ABSTRACT: The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5' and a 3' untranslated region and a complete open reading frame. Comparative analysis of the sequence of chromosome 20 to whole-genome shotgun-sequence data of two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that this analysis may account for more than 95% of all coding exons and almost all genes.
Nature 12/2001; 414(6866):865-871. · 36.28 Impact Factor
-
R Sachidanandam,
D Weissman,
S C Schmidt,
J M Kakol,
L D Stein,
G Marth,
S Sherry,
J C Mullikin,
B J Mortimore,
D L Willey, [......],
D Reich,
J Higgins,
M J Daly,
B Blumenstiel,
J Baldwin,
N Stange-Thomann,
M C Zody,
L Linton,
E S Lander,
D Altshuler
ABSTRACT: We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exon (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
Nature 03/2001; 409(6822):928-933. · 36.28 Impact Factor
-
K L Evans,
S Le Hellard,
S W Morris,
D Lawson,
C Whitton,
C A Semple,
J A Fantes,
H S Torrance,
M P Malloy,
J C Maule,
S J Humphray,
M T Ross,
D R Bentley,
W J Muir,
D H Blackwood,
D J Porteous
ABSTRACT: Bipolar affective disorder (BPAD) is a complex disease with a significant genetic component and a population lifetime risk of 1%. Our previous work identified a region of human chromosome 4p that showed significant linkage to BPAD in a large pedigree. Here, we report the construction of an accurate, high-resolution physical map of 6.9 Mb of human chromosome 4p15.3-p16.1, which includes an 11-cM (5.8 Mb) critical region for BPAD. The map consists of 460 PAC and BAC clones ordered by a combination of STS content analysis and restriction fragment fingerprinting, with a single approximately 300-kb gap remaining. A total of 289 new and existing markers from a wide range of sources have been localized on the contig, giving an average marker resolution of 1 marker/23 kb. The STSs include 57 ESTs, 9 of which represent known genes. This contig is an essential preliminary to the identification of candidate genes that predispose to bipolar affective disorder, to the completion of the sequence of the region, and to the development of a high-density SNP map.
Genomics 03/2001; 71(3):315-323. · 3.02 Impact Factor
-
D R Bentley,
P Deloukas,
A Dunham,
L French,
S G Gregory,
S J Humphray,
A J Mungall,
M T Ross,
N P Carter,
I Dunham, [......],
R G Taylor,
A A Thorpe,
E Tinsley,
G L Warry,
A Whittaker,
P Whittaker,
S H Williams,
T E Wilmer,
R Wooster,
C L Wright
ABSTRACT: We constructed maps for eight chromosomes (1, 6, 9, 10, 13, 20, X and (previously) 22), representing one-third of the genome, by building landmark maps, isolating bacterial clones and assembling contigs. By this approach, we could establish the long-range organization of the maps early in the project, and all contig extension, gap closure and problem-solving was simplified by containment within local regions. The maps currently represent more than 94% of the euchromatic (gene-containing) regions of these chromosomes in 176 contigs, and contain 96% of the chromosome-specific markers in the human gene map. By measuring the remaining gaps, we can assess chromosome length and coverage in sequenced clones.
Nature 03/2001; 409(6822):942-943. · 36.28 Impact Factor
-
J D McPherson,
M Marra,
L Hillier,
R H Waterston,
A Chinwalla,
J Wallis,
M Sekhon,
K Wylie,
E R Mardis,
R K Wilson, [......],
A Shimizu,
K Shibuya,
J Kudoh,
S Minoshima,
J Ramser,
P Seranski,
C Hoff,
A Poustka,
R Reinhardt,
H Lehrach
ABSTRACT: The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.
Nature 03/2001; 409(6822):934-941. · 36.28 Impact Factor
-
ABSTRACT: Long-range comparative sequence analysis provides a powerful strategy for identifying conserved regulatory elements. The stem cell leukemia (SCL) gene encodes a bHLH transcription factor with a pivotal role in hemopoiesis and vasculogenesis, and it displays a highly conserved expression pattern. We present here a detailed sequence comparison of 193 kb of the human SCL locus to 234 kb of the mouse SCL locus. Four new genes have been identified together with an ancient mitochondrial insertion in the human locus. The SCL gene is flanked upstream by the SIL gene and downstream by the MAP17 gene in both species, but the gene order is not collinear downstream from MAP17. To facilitate rapid identification of candidate regulatory elements, we have developed a new sequence analysis tool (SynPlot) that automates the graphical display of large-scale sequence alignments. Unlike existing programs, SynPlot can display the locus features of more than one sequence, thereby indicating the position of homology peaks relative to the structure of all sequences in the alignment. In addition, high-resolution analysis of the chromatin structure of the mouse SCL gene permitted the accurate positioning of localized zones accessible to restriction endonucleases. Zones known to be associated with functional regulatory regions were found to correspond precisely with peaks of human/mouse homology, thus demonstrating that long-range human/mouse sequence comparisons allow accurate prediction of the extent of accessible DNA associated with active regulatory regions.
Genome Research 02/2001; 11(1):87-97. · 13.61 Impact Factor
-
J C Mullikin,
S E Hunt,
C G Cole,
B J Mortimore,
C M Rice,
J Burton,
L H Matthews,
R Pavitt,
R W Plumb,
S K Sims, [......],
C A Steward,
J E Sulston,
E J Tinsley,
K J Turney,
D L Willey,
G D Wilson,
A A McMurray,
I Dunham,
J Rogers,
D R Bentley
ABSTRACT: The human genome sequence will provide a reference for measuring DNA sequence variation in human populations. Sequence variants are responsible for the genetic component of individuality, including complex characteristics such as disease susceptibility and drug response. Most sequence variants are single nucleotide polymorphisms (SNPs), where two alternate bases occur at one position. Comparison of any two genomes reveals around 1 SNP per kilobase. A sufficiently dense map of SNPs would allow the detection of sequence variants responsible for particular characteristics on the basis that they are associated with a specific SNP allele. Here we have evaluated large-scale sequencing approaches to obtaining SNPs, and have constructed a map of 2,730 SNPs on human chromosome 22. Most of the SNPs are within 25 kilobases of a transcribed exon, and are valuable for association studies. We have scaled up the process, detecting over 65,000 SNPs in the genome as part of The SNP Consortium programme, which is on target to build a map of 1 SNP every 5 kilobases that is integrated with the human genome sequence and that is freely available in the public domain.
Nature 10/2000; 407(6803):516-520. · 36.28 Impact Factor
-
A J Bench,
E P Nacheva,
T L Hood,
J L Holden,
L French,
S Swanton,
K M Champion,
J Li,
P Whittaker,
G Stavrides,
A R Hunt,
B J Huntly,
L J Campbell,
D R Bentley,
P Deloukas,
A R Green
ABSTRACT: Deletion of the long arm of chromosome 20 represents the most common chromosomal abnormality associated with the myeloproliferative disorders (MPDs) and is also found in other myeloid malignancies including myelodysplastic syndromes (MDS) and acute myeloid leukaemia (AML). Previous studies have identified a common deleted region (CDR) spanning approximately 8 Mb. We have now used G-banding, FISH or microsatellite PCR to analyse 113 patients with a 20q deletion associated with a myeloid malignancy. Our results define a new MPD CDR of 2.7 Mb, an MDS/AML CDR of 2.6 Mb and a combined 'myeloid' CDR of 1.7 Mb. We have also constructed the most detailed physical map of this region to date--a bacterial clone map spanning 5 Mb of the chromosome which contains 456 bacterial clones and 202 DNA markers. Fifty-one expressed sequences were localized within this contig of which 37 lie within the MPD CDR and 20 within the MDS/AML CDR. Of the 16 expressed sequences (six genes and 10 unique ESTs) within the 'myeloid' CDR, five were expressed in both normal bone marrow and purified CD34 positive cells. These data identify a set of genes which are both positional and expression candidates for the target gene(s) on 20q.
Oncogene 09/2000; 19(34):3902-3913. · 6.37 Impact Factor
-
ABSTRACT: In cDNA indexing, differentially expressed genes are identified by the display of specific, corresponding subsets of cDNA. Subdivision of the cDNA population is achieved by the sequence-specific ligation of adapters to the overhangs created by class IIS restriction enzymes. However, inadequate specificity of ligation leads to redundancy between different adapter subsets. We evaluate the incidence of mismatches between adapters and class IIS restriction fragments during ligation and describe a modified set of conditions that improves ligation specificity. The improved protocol reduces redundancy between amplified cDNA subsets, which leads to a lower number of bands per lane of the differential display gel, and therefore simplifies analysis. We confirm the validity of this revised protocol by identifying five differentially expressed genes in mouse duodenum and ileum.
BioTechniques 06/2000; 28(5):958-964. · 2.67 Impact Factor