Sig2BioPAX: Java tool for converting flat files to BioPAX Level 3 format

Department of Pharmacology and Systems Therapeutics, Systems Biology Center New York (SBCNY), Mount Sinai School of Medicine, New York NY 10029, USA.
Source Code for Biology and Medicine 03/2011; 6(1):5. DOI: 10.1186/1751-0473-6-5
Source: PubMed


The World Wide Web plays a critical role in enabling molecular, cell, systems and computational biologists to exchange, search, visualize, integrate, and analyze experimental data. Such efforts can be further enhanced through the development of semantic web concepts. The semantic web idea is to enable machines to understand data through the development of protocol-free data exchange formats such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL). These standards provide formal descriptors of objects, object properties and their relationships within a specific knowledge domain. However, converting datasets that are typically stored in data tables, such as Excel, text or PDF files, into RDF or OWL formats is not trivial for non-specialists, and this overhead creates a barrier to seamless data exchange between researchers, databases and analysis tools. This problem is particularly important in the field of network systems biology, where biochemical interactions between genes and their protein products are abstracted to networks.
To convert biochemical interactions into the BioPAX format, the leading standard developed by the computational systems biology community, we developed an open-source command-line tool that takes as input tabular data describing different types of molecular biochemical interactions. The tool converts such interactions into the BioPAX Level 3 OWL format. We used the tool to convert several existing and new mammalian networks of protein interactions, signalling pathways, and transcriptional regulatory networks into BioPAX. Some of these networks were deposited into PathwayCommons, a repository for consolidating and organizing biochemical networks.
The software tool Sig2BioPAX is a resource that enables experimental and computational systems biologists to contribute their identified networks and pathways of molecular interactions for integration and reuse with the rest of the research community.
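
The conversion described above, from tabular interaction data to BioPAX Level 3 OWL, can be illustrated with a minimal Java sketch. This is not the Sig2BioPAX implementation and does not reproduce its input format or options. It assumes a hypothetical three-column, tab-delimited layout (source, interaction type, target), treats every entity as a bp:Protein, and uses a placeholder base URI. The class name, method names and sample rows are illustrative only. A complete BioPAX file would normally carry further elements, such as an ontology header and cross-references to external databases, and the output here has not been checked against a BioPAX validator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlatFileToBioPaxSketch {

    // Namespaces: BioPAX Level 3 ontology, RDF syntax, XML Schema string datatype.
    static final String BP = "http://www.biopax.org/release/biopax-level3.owl#";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";
    // Hypothetical base URI for the generated identifiers (an assumption).
    static final String BASE = "http://example.org/flatfile-sketch";

    /** Escapes the characters that are special in XML text and attributes. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Returns the identifier for a protein name, creating one on first use. */
    static String idFor(Map<String, String> ids, String name) {
        String id = ids.get(name);
        if (id == null) {
            id = "protein_" + (ids.size() + 1);
            ids.put(name, id);
        }
        return id;
    }

    /** Builds a bp:displayName element holding the given text. */
    static String displayName(String text) {
        return "    <bp:displayName rdf:datatype=\"" + XSD_STRING + "\">"
                + esc(text) + "</bp:displayName>\n";
    }

    /** Converts rows of "source TAB type TAB target" into an RDF/XML string. */
    public static String convert(List<String> rows) {
        Map<String, String> proteinIds = new LinkedHashMap<>();
        StringBuilder interactions = new StringBuilder();
        int count = 0;
        for (String row : rows) {
            String[] cols = row.split("\t");
            if (cols.length != 3) {
                continue; // skip rows that do not have exactly three columns
            }
            String source = cols[0].trim();
            String type = cols[1].trim();
            String target = cols[2].trim();
            count++;
            interactions
                .append("  <bp:MolecularInteraction rdf:ID=\"interaction_")
                .append(count).append("\">\n")
                .append(displayName(source + " " + type + " " + target))
                .append("    <bp:participant rdf:resource=\"#")
                .append(idFor(proteinIds, source)).append("\"/>\n")
                .append("    <bp:participant rdf:resource=\"#")
                .append(idFor(proteinIds, target)).append("\"/>\n")
                .append("  </bp:MolecularInteraction>\n");
        }
        StringBuilder out = new StringBuilder();
        out.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
           .append("<rdf:RDF xmlns:rdf=\"").append(RDF)
           .append("\" xmlns:bp=\"").append(BP)
           .append("\" xml:base=\"").append(BASE).append("\">\n");
        // One bp:Protein element per distinct name, in order of first appearance.
        for (Map.Entry<String, String> e : proteinIds.entrySet()) {
            out.append("  <bp:Protein rdf:ID=\"").append(e.getValue()).append("\">\n")
               .append(displayName(e.getKey()))
               .append("  </bp:Protein>\n");
        }
        return out.append(interactions).append("</rdf:RDF>\n").toString();
    }

    public static void main(String[] args) {
        // Illustrative rows only; not taken from the networks mentioned above.
        List<String> rows = List.of("EGFR\tbinds\tGRB2", "GRB2\tbinds\tSOS1");
        System.out.print(convert(rows));
    }
}
```

Each distinct name in the first or third column becomes one bp:Protein element, and each well-formed row becomes one bp:MolecularInteraction whose two bp:participant properties point at those proteins. A production converter would more likely build the model with a dedicated BioPAX library than by concatenating strings.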


Available from: Avi Ma'ayan
  • Source
    • "In addition, SMPDB’s data downloads now include all of its pathway information in BioPAX format. BioPAX is an RDF/OWL-based standard exchange language designed to compactly represent biological pathways at the molecular and cellular level (12). The availability of SMPDB’s data in BioPAX format and the impending availability of the same pathway data in SBML (expected in late 2013) should greatly expand SMPDB’s appeal to systems biologists as well as its potential utility in a variety of pathway, biochemical and metabolomic analysis packages. "
    ABSTRACT: The Small Molecule Pathway Database (SMPDB) is a comprehensive, colorful, fully searchable and highly interactive database for visualizing human metabolic, drug action, drug metabolism, physiological activity and metabolic disease pathways. SMPDB contains >600 pathways with nearly 75% of its pathways not found in any other database. All SMPDB pathway diagrams are extensively hyperlinked and include detailed information on the relevant tissues, organs, organelles, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Since its last release in 2010, SMPDB has undergone substantial upgrades and significant expansion. In particular, the total number of pathways in SMPDB has grown by >70%. Additionally, every previously entered pathway has been completely redrawn, standardized, corrected, updated and enhanced with additional molecular or cellular information. Many SMPDB pathways now include transporter proteins as well as much more physiological, tissue, target organ and reaction compartment data. Thanks to the development of a standardized pathway drawing tool (called PathWhiz) all SMPDB pathways are now much more easily drawn and far more rapidly updated. PathWhiz has also allowed all SMPDB pathways to be saved in a BioPAX format. Significant improvements to SMPDB's visualization interface now make the browsing, selection, recoloring and zooming of pathways far easier and far more intuitive. Because of its utility and breadth of coverage, SMPDB is now integrated into several other databases including HMDB and DrugBank.
    Nucleic Acids Research 11/2013; 42(Database issue). DOI:10.1093/nar/gkt1067 · 9.11 Impact Factor
  • Source
    ABSTRACT: Dynamic visualizations and expressive representations are needed in systems biology to handle the multiple interactions that occur during the biological processes represented by biopathways. Dynamic visualizations allow users to interact easily with pathway models. At the same time, representations of biopathways should express how interactions take place. Although diverse databases provide users with pathways, their information and representations frequently differ from each other and show restricted interactions because of their static visualization. An adopted solution is to merge diverse representations to obtain a richer one. However, due to the different formats and the multiple links involved in the pathway representations, the merge frequently results in erroneous models and in a tangled web of relations that is very hard to manipulate. Instead, this work introduces a concurrent dynamic visualization (CDV) of the same pathway, which is retrieved from different sites and then transformed into Petri net representations so that users can understand the underlying biological processes by interacting with them. We applied this approach to the analysis of the Notch signaling pathway, associated with cervical cancer; we obtained it from different sources, which we compared and manipulated simultaneously by interacting with the provided CDV until the user generated a personalized pathway.
    Journal of Applied Research and Technology 10/2012; 10(5):766-782. · 0.45 Impact Factor
  • Source
    ABSTRACT: Putting new findings into the context of available literature knowledge is one approach to deal with the surge of high-throughput data results. Furthermore, prior knowledge can increase the performance and stability of bioinformatic algorithms, for example, methods for network reconstruction. In this review, we examine software packages for the statistical computing framework R, which enable the integration of pathway data for further bioinformatic analyses. Different approaches to integrate and visualize pathway data are identified and packages are stratified concerning their features according to a number of different aspects: data import strategies, the extent of available data, dependencies on external tools, integration with further analysis steps and visualization options are considered. A total of 12 packages integrating pathway data are reviewed in this manuscript. These are supplemented by five R-specific packages for visualization and six connector packages, which provide access to external tools.
    Biology 03/2014; 3(1):85-100. DOI:10.3390/biology3010085