Publications (16)
Article: Identification and localisation of the NB-LRR gene family within the potato genome.
Florian Jupe, Leighton Pritchard, Graham J Etherington, Katrin Mackenzie, Peter J A Cock, Frank Wright, Sanjeev Kumar Sharma, Dan Bolser, Glenn J Bryan, Jonathan D G Jones, Ingo Hein
ABSTRACT: The potato genome sequence derived from the Solanum tuberosum Group Phureja clone DM1-3 516 R44 provides unparalleled insight into the genome composition and organisation of this important crop. A key class of genes that comprises the vast majority of plant resistance (R) genes contains a nucleotide-binding and leucine-rich repeat domain, and is collectively known as NB-LRRs. As part of an effort to accelerate the process of functional R gene isolation, we performed an amino acid motif based search of the annotated potato genome and identified 438 NB-LRR type genes among the ~39,000 potato gene models. Of the predicted genes, 77 contain an N-terminal toll/interleukin 1 receptor (TIR)-like domain, and 107 of the remaining 361 non-TIR genes contain an N-terminal coiled-coil (CC) domain. Physical map positions were established for 370 predicted NB-LRR genes across all 12 potato chromosomes. The majority of NB-LRRs are physically organised within 63 identified clusters, of which 50 are homogeneous in that they contain NB-LRRs derived from a recent common ancestor. By establishing the phylogenetic and positional relationship of potato NB-LRRs, our analysis offers significant insight into the evolution of potato R genes. Furthermore, the data provide a blueprint for future efforts to identify and more rapidly clone functional NB-LRR genes from Solanum species.
BMC Genomics 01/2012; 13:75. · 4.07 Impact Factor
Article: MetaBase--the wiki-database of biological databases.
Dan M Bolser, Pierre-Yves Chibon, Nicolas Palopoli, Sungsam Gong, Daniel Jacob, Victoria Dominguez Del Angel, Dan Swan, Sebastian Bassi, Virginia González, Prashanth Suravajhala, [......], Paolo Romano, Rob Edwards, Bryan Bishop, John Eargle, Timur Shtatland, Nicholas J Provart, Dave Clements, Daniel P Renfro, Daeui Bhak, Jong Bhak
ABSTRACT: Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project.
Nucleic Acids Research 12/2011; 40(Database issue):D1250-4. · 8.03 Impact Factor
Article: The SEQanswers wiki: a wiki database of tools for high-throughput sequencing analysis.
Jing-Woei Li, Keith Robison, Marcel Martin, Andreas Sjödin, Björn Usadel, Matthew Young, Eric C Olivares, Dan M Bolser
ABSTRACT: Recent advances in sequencing technology have created unprecedented opportunities for biological research. However, the increasing throughput of these technologies has created many challenges for data management and analysis. As the demand for sophisticated analyses increases, the development time of software and algorithms is outpacing the speed of traditional publication. As technologies continue to be developed, methods change rapidly, making publications less relevant for users. The SEQanswers wiki (SEQwiki) is a wiki database that is actively edited and updated by the members of the SEQanswers community (http://SEQanswers.com/). The wiki provides an extensive catalogue of tools, technologies and tutorials for high-throughput sequencing (HTS), including information about HTS service providers. It has been implemented in MediaWiki with the Semantic MediaWiki and Semantic Forms extensions to collect structured data, providing powerful navigation and reporting features. Within 2 years, the community has created pages for over 500 tools, with approximately 400 literature references and 600 web links. This collaborative effort has made SEQwiki the most comprehensive database of HTS tools anywhere on the web. The wiki includes task-focused mini-reviews of commonly used tools, and a growing collection of more than 100 HTS service providers. SEQwiki is available at: http://wiki.SEQanswers.com/.
Nucleic Acids Research 11/2011; 40(Database issue):D1313-7. · 8.03 Impact Factor
Article: PDBWiki: added value through community annotation of the Protein Data Bank.
ABSTRACT: The success of community projects such as Wikipedia has recently prompted a discussion about the applicability of such tools in the life sciences. Currently, there are several such 'science-wikis' that aim to collect specialist knowledge from the community into centralized resources. However, there is no consensus about how to achieve this goal. For example, it is not clear how to best integrate data from established, centralized databases with that provided by 'community annotation'. We created PDBWiki, a scientific wiki for the community annotation of protein structures. The wiki consists of one structured page for each entry in the Protein Data Bank (PDB) and allows the user to attach categorized comments to the entries. Additionally, each page includes a user-editable list of cross-references to external resources. As in a database, it is possible to produce tabular reports and 'structure galleries' based on user-defined queries or lists of entries. PDBWiki runs in parallel to the PDB, separating original database content from user annotations. PDBWiki demonstrates how collaboration features can be integrated with primary data from a biological database. It can be used as a system for better understanding how to capture community knowledge in the biological sciences. For users of the PDB, PDBWiki provides a bug-tracker, discussion forum and community annotation system. To date, user participation has been modest, but is increasing. The user-editable cross-references section has proven popular, with the number of linked resources more than doubling from 17 originally to 39 today. Database URL: http://www.pdbwiki.org.
Database: The Journal of Biological Databases and Curation 01/2010; 2010:baq009. · 2.07 Impact Factor
Article: Residue contact-count potentials are as effective as residue-residue contact-type potentials for ranking protein decoys.
ABSTRACT: For over 30 years potentials of mean force have been used to evaluate the relative energy of protein structures. The most commonly used potentials define the energy of residue-residue interactions and are derived from the empirical analysis of the known protein structures. However, single-body residue 'environment' potentials, although widely used in protein structure analysis, have not been rigorously compared to these classical two-body residue-residue interaction potentials. Here we do not try to combine the two different types of residue interaction potential, but rather to assess their independent contribution to scoring protein structures. A data set of nearly three thousand monomers was used to compare pairwise residue-residue 'contact-type' propensities to single-body residue 'contact-count' propensities. Using a large and standard set of protein decoys we performed an in-depth comparison of these two types of residue interaction propensities. The scores derived from the contact-type and contact-count propensities were assessed using two different performance metrics and were compared using 90 different definitions of residue-residue contact. Our findings show that both types of score perform equally well on the task of discriminating between near-native protein decoys. However, in a statistical sense, the contact-count based scores were found to carry more information than the contact-type based scores. Our analysis has shown that the performance of either type of score is very similar on a range of different decoys. This similarity suggests a common underlying biophysical principle for both types of residue interaction propensity. However, several features of the contact-count based propensity suggest that it should be used in preference to the contact-type based propensity. Specifically, it has been shown that contact-counts can be predicted from sequence information alone. In addition, the use of a single-body term allows for efficient alignment strategies using dynamic programming, which is useful for fold recognition, for example. These facts, combined with the relative simplicity of the contact-count propensity, suggest that contact-counts should be studied in more detail in the future.
BMC Structural Biology 01/2009; 8:53. · 2.48 Impact Factor