Article
The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts.
Japan Biological Information Research Center, Japan Biological Informatics Consortium, Japan.
Nucleic Acids Research (impact factor: 8.03). 02/2008; 36(Database issue):D793-9. DOI:10.1093/nar/gkm999
Source: PubMed
Citations (20)
Article: Origin of phenotypes: genes and transcripts.
ABSTRACT: While the concept of a gene has been helpful in defining the relationship of a portion of a genome to a phenotype, this traditional term may not be as useful as it once was. Currently, "gene" has come to refer principally to a genomic region producing a polyadenylated mRNA that encodes a protein. However, the recent emergence of a large collection of unannotated transcripts with apparently little protein coding capacity, collectively called transcripts of unknown function (TUFs), has begun to blur the physical boundaries and genomic organization of genic regions with noncoding transcripts often overlapping protein-coding genes on the same (sense) and opposite strand (antisense). Moreover, they are often located in intergenic regions, making the genic portions of the human genome an interleaved network of both annotated polyadenylated and nonpolyadenylated transcripts, including splice variants with novel 5' ends extending hundreds of kilobases. This complex transcriptional organization and other recently observed features of genomes argue for the reconsideration of the term "gene" and suggests that transcripts may be used to define the operational unit of a genome.
Genome Research 07/2007; 17(6):682-90. · 13.61 Impact Factor
Article: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
ABSTRACT: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.
Nucleic Acids Research 12/1994; 22(22):4673-80. · 8.03 Impact Factor
Article: Prediction of complete gene structures in human genomic DNA.
ABSTRACT: We introduce a general probabilistic model of the gene structure of human genomic sequences which incorporates descriptions of the basic transcriptional, translational and splicing signals, as well as length distributions and compositional features of exons, introns and intergenic regions. Distinct sets of model parameters are derived to account for the many substantial differences in gene density and structure observed in distinct C + G compositional regions of the human genome. In addition, new models of the donor and acceptor splice signals are described which capture potentially important dependencies between signal positions. The model is applied to the problem of gene identification in a computer program, GENSCAN, which identifies complete exon/intron structures of genes in genomic DNA. Novel features of the program include the capacity to predict multiple genes in a sequence, to deal with partial as well as complete genes, and to predict consistent sets of genes occurring on either or both DNA strands. GENSCAN is shown to have substantially higher accuracy than existing methods when tested on standardized sets of human and vertebrate genes, with 75 to 80% of exons identified exactly. The program is also capable of indicating fairly accurately the reliability of each predicted exon. Consistently high levels of accuracy are observed for sequences of differing C + G content and for distinct groups of vertebrates.
Journal of Molecular Biology 05/1997; 268(1):78-94. · 4.00 Impact Factor
Keywords
34 699 human gene clusters
alternative splicing variants
Clustering Viewer
current H-InvDB annotation resources
DiseaseInfo Viewer
functional domains
functional non-protein-coding RNAs
gene expression profiles
Gene family/group
gene structures
human genes
human genome sequences
International Nucleotide Sequence Databases
latest release H-InvDB_4.6
new features
protein 3D structure
protein-protein interactions
subcellular localizations
sub-databases
TOPO Viewer