Project

JPhyloIO - A Java library for event-based I/O of phylogenetic file formats through a common interface

Goal: JPhyloIO is a Java library for reading and writing phylogenetic (alignment and tree) file formats. Its aim is to give developers of bioinformatics applications access to various formats through a single interface, independent of the concrete application data model. The library supports event-based I/O of NeXML, Nexus, PhyloXML, FASTA, Newick, Phylip, Extended Phylip, MEGA, PDE and XTG, including all metadata modeled by these formats.
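To make the idea of event-based I/O that stays independent of the application data model more concrete, here is a minimal, self-contained sketch. It does not use JPhyloIO itself: ToyEventSketch, ToyEventType, ToyEvent and ToyFastaEventReader are hypothetical names invented for this illustration, and the reader only handles very simple FASTA input. The point is that the reader emits generic events while the consumer alone decides how to store the data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; this is NOT the JPhyloIO API.
final class ToyEventSketch {
    enum ToyEventType { SEQUENCE_START, SEQUENCE_TOKENS, SEQUENCE_END }

    static final class ToyEvent {
        final ToyEventType type;
        final String text;  // sequence name or residues; null for SEQUENCE_END

        ToyEvent(ToyEventType type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Pull-style reader: each call to next() reads only as many lines as are
    // needed to produce one event.
    static final class ToyFastaEventReader {
        private final BufferedReader in;
        private boolean insideSequence = false;
        private String pendingName = null;  // header read while closing the previous sequence

        ToyFastaEventReader(Reader source) {
            this.in = new BufferedReader(source);
        }

        // Returns the next event, or null when the input is exhausted.
        ToyEvent next() {
            try {
                if (pendingName != null) {
                    String name = pendingName;
                    pendingName = null;
                    insideSequence = true;
                    return new ToyEvent(ToyEventType.SEQUENCE_START, name);
                }
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (line.isEmpty()) {
                        continue;
                    }
                    if (line.startsWith(">")) {
                        String name = line.substring(1).trim();
                        if (insideSequence) {  // close the previous sequence first
                            insideSequence = false;
                            pendingName = name;
                            return new ToyEvent(ToyEventType.SEQUENCE_END, null);
                        }
                        insideSequence = true;
                        return new ToyEvent(ToyEventType.SEQUENCE_START, name);
                    }
                    if (insideSequence) {
                        return new ToyEvent(ToyEventType.SEQUENCE_TOKENS, line);
                    }
                    // Residues before the first header are ignored in this toy reader.
                }
                if (insideSequence) {
                    insideSequence = false;
                    return new ToyEvent(ToyEventType.SEQUENCE_END, null);
                }
                return null;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // The consumer chooses the data model; here a plain map stands in for it.
    static Map<String, String> load(String fasta) {
        ToyFastaEventReader reader = new ToyFastaEventReader(new StringReader(fasta));
        Map<String, String> model = new LinkedHashMap<>();
        String currentName = null;
        StringBuilder residues = new StringBuilder();
        for (ToyEvent event = reader.next(); event != null; event = reader.next()) {
            switch (event.type) {
                case SEQUENCE_START:
                    currentName = event.text;
                    residues.setLength(0);
                    break;
                case SEQUENCE_TOKENS:
                    residues.append(event.text);
                    break;
                case SEQUENCE_END:
                    model.put(currentName, residues.toString());
                    break;
            }
        }
        return model;
    }

    public static void main(String[] args) {
        System.out.println(load(">A\nACGT\nAC\n>B\nGG\n"));  // prints {A=ACGTAC, B=GG}
    }
}
```

A different application could consume the same event stream and fill an entirely different structure, which is the kind of decoupling the goal statement describes. For the real reader and event classes, the JavaDocs linked below are the authoritative source.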

This project keeps you up to date on the development of JPhyloIO, new releases and publications.

The documentation, including full JavaDocs and several example applications showing how to use JPhyloIO, is available here: http://bioinfweb.info/JPhyloIO/Documentation. If you have further questions, don't hesitate to ask them in this project. The project is open source under the GNU LGPL. Suggestions and contributions (e.g. via GitHub) are always welcome.

http://bioinfweb.info/JPhyloIO/
https://twitter.com/bioinfweb
https://github.com/bioinfweb/JPhyloIO
Legal notice: http://bioinfweb.info/JPhyloIO/About
Privacy Policy: http://bioinfweb.info/Privacy


Project log

Ben C Stöver
added an update
Bug fix release 0.5.4 for JPhyloIO is out.
It contains adjustments to be compatible with Java 11.
 
Ben C Stöver
added an update
Bug fix release 0.5.3 for JPhyloIO is out. The correct number of sequence end events is now generated when reading MEGA files with line breaks within sequences. http://bioinfweb.info/JPhyloIO/Download
 
Ben C Stöver
added an update
The JPhyloIO event lister is now available. It makes it easy to inspect how the Java library translates phylogenetic documents with OTU lists, MSAs, trees or networks into a sequence of event objects.
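As a rough idea of what listing a document as a sequence of events can look like, the following sketch turns a simple Newick string into an indented event list. It is not the JPhyloIO event lister and does not use the library: the class ToyNewickEventLister and the event names TREE_START, SUBTREE_START, NODE, SUBTREE_END and TREE_END are made up for this example. Only plain labels are handled, with no branch lengths, quoted names or comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration only; this is NOT the JPhyloIO event lister.
final class ToyNewickEventLister {
    // Returns one line per event, indented by two spaces per nesting level.
    static List<String> listEvents(String newick) {
        List<String> lines = new ArrayList<>();
        lines.add("TREE_START");
        int depth = 1;
        int i = 0;
        while (i < newick.length()) {
            char c = newick.charAt(i);
            if (c == '(') {
                lines.add(indent(depth) + "SUBTREE_START");
                depth++;
                i++;
            } else if (c == ')') {
                depth--;
                lines.add(indent(depth) + "SUBTREE_END");
                i++;
            } else if (c == ',' || Character.isWhitespace(c)) {
                i++;  // separators carry no information of their own here
            } else if (c == ';') {
                break;  // end of the tree definition
            } else {
                // Read a label up to the next structural character.
                int start = i;
                while (i < newick.length() && "(),;".indexOf(newick.charAt(i)) < 0) {
                    i++;
                }
                lines.add(indent(depth) + "NODE(" + newick.substring(start, i).trim() + ")");
            }
        }
        lines.add("TREE_END");
        return lines;
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < depth; k++) {  // a non-positive depth yields no indentation
            sb.append("  ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String line : listEvents("(A,(B,C)D);")) {
            System.out.println(line);
        }
    }
}
```

In this toy version a label that follows a closing parenthesis, such as D above, is listed right after the SUBTREE_END it belongs to, because Newick places internal node names after their children. How JPhyloIO actually orders and names its events is described in its documentation.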
 
Ben C Stöver
added an update
We switched to MantisBT as our bug tracking system, which is now open to all users. Check out http://bioinfweb.info/JPhyloIO/Bugs if you want to report a bug or share a feature request with us.
 
Ben C Stöver
added an update
Bug fix release 0.5.2 for JPhyloIO is out. Writing multiple sequences in a row with the same writer no longer generates unnecessary namespace declarations in XML formats.
 
Ben C Stöver
added an update
The bug fix release 0.5.1 for JPhyloIO is out. Long sequences are now read correctly from NeXML. Version 0.5.0 wrote a wrong JPhyloIO version number to output files, which is now also fixed.
 
Ben C Stöver
added an update
JPhyloIO 0.5.0 is available now. It offers new object translators to convert between Java objects and their text/XML representations, support for new TreeGraph 2 data in its XTG format and bug fixes for reading and writing NeXML.
 
Ben C Stöver
added 4 research items
Specimens form the falsifiable evidence used in plant systematics. Derivatives of specimens (including the specimen as the organism in the field) such as tissue and DNA samples play an increasing role in research. The EDIT Platform for Cybertaxonomy is a specialist's tool for documenting and sustainably storing all data used in the taxonomic work process, from field data to DNA sequences. The types of data stored can be very heterogeneous, consisting of specimens, images, text data, primary data files, taxon assignments, etc. The EDIT Platform organizes the linking between such data by using a generic data model for representing the research process. Each step in the process is regarded as a derivation step and generates a derivative of the previous step. This could be a field unit having a specimen as its derivative or a specimen having a tissue sample as its derivative. Each derivation step also produces metadata recording who performed the derivation, when and how. The Platform's Common Data Model (CDM) and the applications built on the CDM library thus represent the first comprehensive implementation of the largely theoretical models developed in the late 1990s (Berendsohn et al. 1999). In a pilot project, research data about the genus Campanula (Kilian et al. 2015, FUB, BGBM 2012) were gathered and used to create a hierarchy of derivatives reaching from field data to DNA sequences. Additionally, the open source library for multiple sequence alignments LibrAlign (Stöver and Müller 2015) was used to integrate an alignment editor into the EDIT Platform that allows consensus sequences to be generated as derivatives of DNA sequences. The persistent storage of each link in the derivation process and the degree of detail with which the data and metadata are stored will speed up the research process, ease the reproducibility of research results and enhance the sustainability of collections.
We present the model and implementation of a workflow that blazes a trail in systematic biology for the re-usability of character data (data on any kind of characters of pheno- and genotypes of organisms) and their additivity from specimen to taxon level. We take into account that any taxon characterization is based on a limited set of sampled individuals and characters, and that consequently any new individual and any new character may affect the recognition of biological entities and/or the subsequent delimitation and characterization of a taxon. Taxon concepts thus frequently change during the knowledge generation process in systematic biology. Structured character data are therefore not only needed for the knowledge generation process but also for easily adapting characterizations of taxa. We aim to facilitate the construction and reproducibility of taxon characterizations from structured character data of changing sample sets by establishing a stable and unambiguous association between each sampled individual and the data processed from it. Our workflow implementation uses the European Distributed Institute of Taxonomy Platform, a comprehensive taxonomic data management and publication environment to: (i) establish a reproducible connection between sampled individuals and all samples derived from them; (ii) stably link sample-based character data with the metadata of the respective samples; (iii) record and store structured specimen-based character data in formats allowing data exchange; (iv) reversibly assign sample metadata and character datasets to taxa in an editable classification and display them and (v) organize data exchange via standard exchange formats and enable the link between the character datasets and samples in research collections, ensuring high visibility and instant re-usability of the data. The workflow implemented will contribute to organizing the interface between phylogenetic analysis and revisionary taxonomic or monographic work. 
Database URL : http://campanula.e-taxonomy.net/
Today a variety of alignment and tree file formats exist, some of which are well established but limited in their data model, while others, more recently proposed, offer advanced future-oriented features for metadata representation. Most phylogenetic and other bioinformatic software currently supports only one or a few formats, while supporting many widely used standards simultaneously would be desirable to achieve optimal interoperability and prevent data loss through external conversions. We developed JPhyloIO, which allows reading and writing of alignment and tree formats (NeXML, PhyloXML, Nexus, Newick, FASTA, Phylip, MEGA, XTG, PDE) using a common interface. It is the only currently available Java library that generalizes between the different data and metadata concepts of all formats, while still allowing access to their individual features. By simply implementing a single JPhyloIO-based reader and writer, application developers can easily support all formats in one step, and the event-based architecture allows the library to be combined with any application business model design, while still being memory efficient for large datasets. We provide JPhyloIO as a service to the scientific community, which will benefit from simplified development of software that supports various standards simultaneously. Our aims are to increase the interoperability between different (phylogenetic) software tools and to foster usage of more recently proposed formats providing a powerful metadata concept. It is currently integrated in a number of applications and is fully interoperable with our Java library LibrAlign, which offers powerful components for multiple sequence alignments and attached raw and metadata. Download and documentation: http://bioinfweb.info/JPhyloIO/ .
Ben C Stöver
added a project goal
JPhyloIO is a Java library for reading and writing phylogenetic (alignment and tree) file formats. Its aim is to give developers of bioinformatics applications access to various formats through a single interface, independent of the concrete application data model. The library supports event-based I/O of NeXML, Nexus, PhyloXML, FASTA, Newick, Phylip, Extended Phylip, MEGA, PDE and XTG, including all metadata modeled by these formats.
This project keeps you up to date on the development of JPhyloIO, new releases and publications.
The documentation, including full JavaDocs and several example applications showing how to use JPhyloIO, is available here: http://bioinfweb.info/JPhyloIO/Documentation. If you have further questions, don't hesitate to ask them in this project. The project is open source under the GNU LGPL. Suggestions and contributions (e.g. via GitHub) are always welcome.