-
M. T. Ross,
D. V. Grafham,
A. J. Coffey,
S. Scherer,
K. McLay,
D. Muzny,
M. Platzer,
G. R. Howell,
C. Burrows,
C. P. Bird, [......],
A. Coulson,
D. L. Nelson,
G. Weinstock,
J. E. Sulston,
R. Durbin,
T. Hubbard,
R. A. Gibbs,
S. Beck,
J. Rogers,
D. R. Bentley
ABSTRACT: The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
Nature 03/2005; 434:325-37. · 36.28 Impact Factor
-
P Deloukas,
M E Earthrowl,
D V Grafham,
M Rubenfield,
L French,
C A Steward,
S K Sims,
M C Jones,
S Searle,
C Scott, [......],
N K Moschonas,
R Siebert,
K Fechtel,
D Bentley,
R Durbin,
T Hubbard,
L Doucette-Stamm,
S Beck,
D R Smith,
J Rogers
ABSTRACT: The finished sequence of human chromosome 10 comprises a total of 131,666,441 base pairs. It represents 99.4% of the euchromatic DNA and includes one megabase of heterochromatic sequence within the pericentromeric region of the short and long arm of the chromosome. Sequence annotation revealed 1,357 genes, of which 816 are protein coding, and 430 are pseudogenes. We observed widespread occurrence of overlapping coding genes (either strand) and identified 67 antisense transcripts. Our analysis suggests that both inter- and intrachromosomal segmental duplications have impacted on the gene count on chromosome 10. Multispecies comparative analysis indicated that we can readily annotate the protein-coding genes with current resources. We estimate that over 95% of all coding exons were identified in this study. Assessment of single base changes between the human chromosome 10 and chimpanzee sequence revealed nonsense mutations in only 21 coding genes with respect to the human sequence.
Nature 06/2004; 429(6990):375-81. · 36.28 Impact Factor
-
A Dunham,
L H Matthews,
J Burton,
J L Ashurst,
K L Howe,
K J Ashcroft,
D M Beare,
D C Burford,
S E Hunt,
S Griffiths-Jones, [......],
M W Wright,
L Young,
A Coulson,
R Durbin,
T Hubbard,
J E Sulston,
S Beck,
D R Bentley,
J Rogers,
M T Ross
ABSTRACT: Chromosome 13 is the largest acrocentric human chromosome. It carries genes involved in cancer including the breast cancer type 2 (BRCA2) and retinoblastoma (RB1) genes, is frequently rearranged in B-cell chronic lymphocytic leukaemia, and contains the DAOA locus associated with bipolar disorder and schizophrenia. We describe completion and analysis of 95.5 megabases (Mb) of sequence from chromosome 13, which contains 633 genes and 296 pseudogenes. We estimate that more than 95.4% of the protein-coding genes of this chromosome have been identified, on the basis of comparison with other vertebrate genome sequences. Additionally, 105 putative non-coding RNA genes were found. Chromosome 13 has one of the lowest gene densities (6.5 genes per Mb) among human chromosomes, and contains a central region of 38 Mb where the gene density drops to only 3.1 genes per Mb.
Nature 05/2004; 428(6982):522-8. · 36.28 Impact Factor
-
A J Mungall,
S A Palmer,
S K Sims,
C A Edwards,
J L Ashurst,
L Wilming,
M C Jones,
R Horton,
S E Hunt,
C E Scott, [......],
L Young,
R M Younger,
D R Bentley,
A Coulson,
R Durbin,
T Hubbard,
J E Sulston,
I Dunham,
J Rogers,
S Beck
ABSTRACT: Chromosome 6 is a metacentric chromosome that constitutes about 6% of the human genome. The finished sequence comprises 166,880,988 base pairs, representing the largest chromosome sequenced so far. The entire sequence has been subjected to high-quality manual annotation, resulting in the evidence-supported identification of 1,557 genes and 633 pseudogenes. Here we report that at least 96% of the protein-coding genes have been identified, as assessed by multi-species comparative sequence analysis, and provide evidence for the presence of further, otherwise unsupported exons/genes. Among these are genes directly implicated in cancer, schizophrenia, autoimmunity and many other diseases. Chromosome 6 harbours the largest transfer RNA gene cluster in the genome; we show that this cluster co-localizes with a region of high transcriptional activity. Within the essential immune loci of the major histocompatibility complex, we find HLA-B to be the most polymorphic gene on chromosome 6 and in the human genome.
Nature 11/2003; 425(6960):805-11. · 36.28 Impact Factor
-
P. Deloukas,
L. H. Matthews,
J. Ashurst,
J. Burton,
J. G. R. Gilbert,
M. Jones,
G. Stavrides,
J. P. Almeida,
A. K. Babbage,
C. L. Bagguley, [......],
D. L. Willey,
L. Williams,
S. A. Williams,
L. Wilming,
P. W. Wray,
T. Hubbard,
R. M. Durbin,
D. R. Bentley,
S. Beck,
J. Rogers
ABSTRACT: The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5' and a 3' untranslated region and a complete open reading frame. Comparative analysis of the sequence of chromosome 20 to whole-genome shotgun-sequence data of two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that this analysis may account for more than 95% of all coding exons and almost all genes.
Nature 12/2001; 414(6866):865-871. · 36.28 Impact Factor
-
R Sachidanandam,
D Weissman,
S C Schmidt,
J M Kakol,
L D Stein,
G Marth,
S Sherry,
J C Mullikin,
B J Mortimore,
D L Willey, [......],
D Reich,
J Higgins,
M J Daly,
B Blumenstiel,
J Baldwin,
N Stange-Thomann,
M C Zody,
L Linton,
E S Lander,
D Altshuler
ABSTRACT: We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exon (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
Nature 03/2001; 409(6822):928-33. · 36.28 Impact Factor
-
D R Bentley,
P Deloukas,
A Dunham,
L French,
S G Gregory,
S J Humphray,
A J Mungall,
M T Ross,
N P Carter,
I Dunham, [......],
R G Taylor,
A A Thorpe,
E Tinsley,
G L Warry,
A Whittaker,
P Whittaker,
S H Williams,
T E Wilmer,
R Wooster,
C L Wright
ABSTRACT: We constructed maps for eight chromosomes (1, 6, 9, 10, 13, 20, X and (previously) 22), representing one-third of the genome, by building landmark maps, isolating bacterial clones and assembling contigs. By this approach, we could establish the long-range organization of the maps early in the project, and all contig extension, gap closure and problem-solving was simplified by containment within local regions. The maps currently represent more than 94% of the euchromatic (gene-containing) regions of these chromosomes in 176 contigs, and contain 96% of the chromosome-specific markers in the human gene map. By measuring the remaining gaps, we can assess chromosome length and coverage in sequenced clones.
Nature 03/2001; 409(6822):942-3. · 36.28 Impact Factor
-
J C Mullikin,
S E Hunt,
C G Cole,
B J Mortimore,
C M Rice,
J Burton,
L H Matthews,
R Pavitt,
R W Plumb,
S K Sims, [......],
C A Steward,
J E Sulston,
E J Tinsley,
K J Turney,
D L Willey,
G D Wilson,
A A McMurray,
I Dunham,
J Rogers,
D R Bentley
ABSTRACT: The human genome sequence will provide a reference for measuring DNA sequence variation in human populations. Sequence variants are responsible for the genetic component of individuality, including complex characteristics such as disease susceptibility and drug response. Most sequence variants are single nucleotide polymorphisms (SNPs), where two alternate bases occur at one position. Comparison of any two genomes reveals around 1 SNP per kilobase. A sufficiently dense map of SNPs would allow the detection of sequence variants responsible for particular characteristics on the basis that they are associated with a specific SNP allele. Here we have evaluated large-scale sequencing approaches to obtaining SNPs, and have constructed a map of 2,730 SNPs on human chromosome 22. Most of the SNPs are within 25 kilobases of a transcribed exon, and are valuable for association studies. We have scaled up the process, detecting over 65,000 SNPs in the genome as part of The SNP Consortium programme, which is on target to build a map of 1 SNP every 5 kilobases that is integrated with the human genome sequence and that is freely available in the public domain.
Nature 10/2000; 407(6803):516-20. · 36.28 Impact Factor
-
A J Mungall,
S J Humphray,
S A Ranby,
C A Edwards,
R W Heathcott,
C M Clee,
E Holloway,
A I Peck,
P Harrison,
L D Green, [......],
A Smith,
M A Leversha,
Y H Ramsey,
S M Clegg,
C M Rice,
G L Maslen,
S E Hunt,
C E Scott,
C A Soderlund,
I Dunham
ABSTRACT: Our aim is to construct physical clone maps covering those regions of chromosome 6 that are not currently extensively mapped, and use these to determine the DNA sequence of the whole chromosome. The strategy we are following involves establishing a high density framework map of the order of 15 markers per Megabase using radiation hybrid (RH) mapping. The markers are then used to identify large-insert genomic bacterial clones covering the chromosome, which are assembled into sequence-ready contigs by restriction enzyme fingerprinting and sequence tagged site (STS) content analysis. Contig gap closure is performed by walking experiments using STSs developed from the end sequences of the clone inserts.
DNA Sequence 02/1997; 8(3):151-4. · 0.75 Impact Factor
-
ABSTRACT: The Sanger Centre Chromosome 6 Database (6ace) has been developed as the primary means of release of annotated sequencing and mapping information for human chromosome 6 from the Sanger Centre. It is also being used to curate global data from published and unpublished external sources. The rationale behind the development of 6ace is described, together with information as to how to access the database.
DNA Sequence 02/1997; 8(3):167-71. · 0.75 Impact Factor