About
21 Publications
2,234 Reads
589 Citations
Publications (21)
The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequen...
Raw data file: amino acid frequency calculations
Raw data file: KLD calculations for phage genomes
Raw data file: KLD calculations for bacterial genomes
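The metric described in the abstract above can be illustrated with a minimal sketch. It assumes each genome is given as a list of protein sequences, that the "mean amino acid content" is the unweighted average of the per-genome frequency profiles, and that the logarithm is base 2; none of these details are confirmed by the truncated abstract, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_frequencies(proteins):
    """Relative frequency of each standard amino acid across a genome's proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_frequencies(profiles):
    """Unweighted mean profile over genomes (an assumption, see lead-in)."""
    n = len(profiles)
    return {aa: sum(p[aa] for p in profiles) / n for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """D_KL(p || q) in bits; terms with p[aa] == 0 contribute nothing.

    q[aa] must be > 0 wherever p[aa] > 0, which holds when p is one of
    the profiles averaged to obtain q.
    """
    return sum(p[aa] * math.log2(p[aa] / q[aa]) for aa in AMINO_ACIDS if p[aa] > 0)


# Toy input: two "genomes", each with a single short protein.
genomes = {"genome_a": ["MKV"], "genome_b": ["MKK"]}
profiles = {name: amino_acid_frequencies(seqs) for name, seqs in genomes.items()}
mean_profile = mean_frequencies(list(profiles.values()))
kld = {name: kl_divergence(p, mean_profile) for name, p in profiles.items()}
```

A genome whose composition matches the mean exactly scores 0, and the score grows as the composition departs from the mean.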
Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergen...
All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined....
Supplementary Table
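As a rough illustration of the Shannon measurement mentioned in the abstract above, the sketch below computes the Shannon entropy of the word distribution of a sequence. The function name, the word length parameter `k`, and the use of overlapping words and base-2 logarithms are assumptions made for illustration, not details taken from the publication.

```python
import math
from collections import Counter


def shannon_index(sequence, k=1):
    """Shannon entropy (bits) of the overlapping k-mer distribution of a sequence."""
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    if not words:
        raise ValueError("sequence is shorter than k")
    counts = Counter(words)
    total = len(words)
    # H = -sum(p * log2(p)) over the observed words
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


uniform = shannon_index("ACGT")     # four equally frequent symbols
repetitive = shannon_index("AAAA")  # a single repeated symbol
```

A sequence that uses its alphabet evenly scores higher than a repetitive one, which is the property that would let such a value rank data sets before more expensive analysis.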
Oxygen minimum zones (OMZs) are oceanographic features that affect ocean productivity and biodiversity, and contribute to ocean nitrogen loss and greenhouse gas emissions. Here we describe the viral communities associated with the Eastern Tropical South Pacific (ETSP) OMZ off Iquique, Chile for the first time through abundance estimates and viral m...
Next-generation sequencing technologies are rapidly transforming molecular systematic studies of non-model animal taxa. The arachnid order Opiliones (commonly known as "harvestmen") includes more than 6,400 described species placed into four well-supported lineages (suborders). Fossil plus molecular clock evidence indicates that these lineages were...
Individual gene matrices. (DOCX)
Results of analyses of unpartitioned nucleotide matrices with third position sites removed. (TIF)
Results from gene ontology analysis. (DOCX)
Results of alternative clock analyses. (DOCX)
Summary information for individual gene matrices (e.g., gene identities, nucleotide alignment lengths, PI values, average pairwise divergence values, taxon composition, etc.). (XLSX)
Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have tremendous impact on their bacterial hosts’ genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microb...
Example code snippets. The additional file contains example code in Perl, Python, and Java that demonstrates how to access the SEED using SOAP.
The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annot...
Projects
Project (1)