RESEARCH ARTICLE
GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components
Hugo López-Fernández 1,2,3,4,5*, Miguel Reboiro-Jato 1,2,3, Daniel Glez-Peña 1,2,3, Rosalía Laza 1,2,3, Reyes Pavón 1,2,3, Florentino Fdez-Riverola 1,2,3
1 ESEI (Escuela Superior de Ingeniería Informática), Universidad de Vigo, Ourense, Spain
2 CINBIO (Centro de Investigaciones Biomédicas), Universidad de Vigo, Vigo, Spain
3 SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Spain
4 Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Porto, Portugal
5 Instituto de Biologia Molecular e Celular (IBMC), Porto, Portugal
* hlfernandez@uvigo.es
Abstract
Modern bioinformatics and computational biology are fields of study driven by the availability of effective software required for conducting appropriate research tasks. Apart from providing reliable and fast implementations of different data analysis algorithms, these software applications should also be clear and easy to use through proper user interfaces, providing appropriate data management and visualization capabilities. In this regard, the user experience obtained by interacting with these applications via their Graphical User Interfaces (GUI) is a key factor for their final success and real utility for researchers. Despite the existence of different packages and applications focused on advanced data visualization, there is a lack of specific libraries providing pertinent GUI components able to help scientific bioinformatics software developers. To that end, this paper introduces GC4S, a bioinformatics-oriented collection of high-level, extensible, and reusable Java GUI elements specifically designed to speed up bioinformatics software development. Within GC4S, developers of new applications can focus on the specific GUI requirements of their projects, relying on GC4S for generalities and abstractions. GC4S is free software distributed under the terms of GNU Lesser General Public License and both source code and documentation are publicly available at http://www.sing-group.org/gc4s
PLOS ONE | https://doi.org/10.1371/journal.pone.0204474 September 20, 2018
OPEN ACCESS
Citation: López-Fernández H, Reboiro-Jato M, Glez-Peña D, Laza R, Pavón R, Fdez-Riverola F (2018) GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components. PLoS ONE 13(9): e0204474. https://doi.org/10.1371/journal.pone.0204474
Editor: Frederique Lisacek, Swiss Institute of Bioinformatics, SWITZERLAND
Received: August 6, 2018
Accepted: September 7, 2018
Published: September 20, 2018
Copyright: © 2018 López-Fernández et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: The source code of GC4S is available in the following GitHub repository: https://github.com/sing-group/GC4S. Usage instructions and documentation are both available through the official GC4S project website: http://www.sing-group.org/gc4s/.
Funding: Hugo López-Fernández is supported by a post-doctoral fellowship from Xunta de Galicia (ED481B 2016/068-0).
Competing interests: The authors have declared that no competing interests exist.
Introduction
The importance of bioinformatics and computational biology in research today is leading to the development of a growing number of complex bioinformatics tools and software packages. In this context, these applications must meet two important needs: reliable implementations of data analysis algorithms, and convenient interfaces enabling their effective use for conducting appropriate research [1]. Indeed, several studies highlighting the importance of applying proper software engineering practices and methodologies to systematically develop high-quality bioinformatics software have been published [2,3]. Similarly, some authors have contributed valuable and helpful recommendations for the appropriate development of effective scientific software [4,5]. The concept of user experience when developing such bioinformatics applications is becoming increasingly important, and some studies also provide useful guidelines for improving the development of user-friendly bioinformatics software [1,6]. All the approaches discussed above highlight how the usability of different types of bioinformatics applications can be improved by adopting well-established practices in usability and software engineering.
Taking these facts into account, several frameworks, platforms and libraries for different programming languages (e.g. Java, R, Python, C++) and complementary bioinformatics areas (e.g. proteomics, genomics) were recently released with the goal of facilitating the development of successful bioinformatics applications [7–11]. This plethora of alternatives provides a reliable basis for developing novel algorithms and data analysis functionalities, allowing developers to extend them and reuse existing code. Several libraries and applications focused on scientific and biological data visualization [7,12–14] are also available. A notable example is the JSparklines library [15], which visualizes numbers in Java Swing tables by means of sparklines. These data visualization libraries certainly help developers create new applications and provide excellent features to end-users. However, other aspects of an effective user experience (e.g. the creation of input dialogs, configuration panels and other user interaction steps) are inadequately supported, because there are no specific resources providing reusable Graphical User Interface (GUI) components that help developers in the same way as many of the aforementioned frameworks and libraries.
In line with these approaches, we also contributed to this collection with AIBench [16], a Java application framework for scientific software development. Since its first release, it has been successfully used to create different bioinformatics applications such as LA-iMageS [17], Mass-Up [18], S2P [19,20], BioAnnote [21], BEW [22], @Note [23], ADOPS [24], DPD [25] or MLibrary [26], among others, covering a wide range of application areas including, but not limited to, proteomics, genomics and text mining. By providing the most common functionalities present in typical scientific applications, such as user parameter collection, logging, multithreaded execution, experiment repeatability, workflow management, and automatic GUI generation, AIBench allows developers to focus on the specific application logic. The wide range of successfully released bioinformatics software based on AIBench demonstrates that our framework has provided an effective alternative to increase productivity while developing high-quality source code [27].
The development of AIBench applications always relies on the straightforward implementation of three different but complementary types of programmable components: operations, data types and views. However, our long-term experience in the field has shown that GUI support (a key feature in any deployment) should be improved further. As previously stated, AIBench automatically constructs the user interface by generating (i) application menus based on declared operations and (ii) input dialogs for obtaining required operation parameters. The view components, in contrast, must be developed in Java Swing and are the responsibility of the programmer. In this regard, we noticed that these kinds of components were being copied and pasted between code bases, including non-AIBench based applications. In doing so, developers were missing an opportunity to reuse GUI elements by sharing them between code bases and to save time in future projects.
Bearing all that in mind, and with the specific goal of overcoming these shortcomings, we initially designed, and later developed and tested, GC4S (GUI Components for Swing), an open source software library that aims to help programmers develop Java Swing based scientific applications. GC4S is a collection of high-level, extensible, bioinformatics-oriented and reusable Java GUI elements created by using Swing and SwingX low-level components. With this library published, developers of new applications can focus exclusively on the specific GUI requirements of their projects, relying on GC4S for generalities. This way, they can also benefit from bug fixes, updates, and improvements in GC4S. To the best of our knowledge, there is currently no other open-source, free library offering such functionalities.
The remainder of the paper is structured as follows. Section 2 introduces our novel GC4S library, discussing implementation details and essential concepts. Section 3 illustrates the usefulness of GC4S for scientific software development by presenting three complementary case studies showing how GC4S can be easily integrated with the AIBench framework or used independently to speed up the development of standalone applications. Section 4 discusses the main contributions of this work as well as the lessons we learned in the last few years. Finally, Section 5 summarizes the main conclusions extracted from this contribution and outlines future research work.
The GC4S library: Implementation and components
Implementation
GC4S 1.2 is implemented in Java 8 and organized as an Apache Maven project, which automates the build of the library. The source code of the project is freely available at https://github.com/sing-group/GC4S under a GNU LGPL 3.0 License (http://www.gnu.org/copyleft/lgpl.html). The project contains six main Maven modules: (i) gc4s, containing the general-purpose GC4S components; (ii) gc4s-genomebrowser, containing a component to visualize genomes and biological data; (iii) gc4s-heatmap, containing a component to create heat map representations of numerical data; (iv) gc4s-jsparklines-factory, containing classes to facilitate the creation of JSparkLines renderers [14]; (v) gc4s-multiple-sequence-alignment-viewer, containing a component to visualize multiple sequence alignments; and (vi) gc4s-statistics-tests-table, containing a component to display datasets in customizable tables and automatically compute p-values and q-values to compare the conditions in them. Moreover, for each of these six modules, there is a -demo module containing demos and examples.
At the implementation level, components in the main modules (hereafter, GC4S components) can be grouped into three main categories: (i) extensions, that is, components that enhance existing components in order to add functionalities; (ii) high-level components, that is, components that provide new functionalities by grouping other existing components; and (iii) utilities, that is, components that provide service methods (e.g. builders) or resources (e.g. icons).
Components
From the programmer’s perspective, GC4S components serve three main purposes: (i) retrieve user input; (ii) display data or results; and (iii) provide different programming utilities. According to these functionalities, they are classified as Input components, Output (or Visualization) components or Utility components, respectively. Regarding the structure of the source code of the library, Table 1 shows the top-level GC4S packages, reflecting how components are grouped according to their function. In addition, Supporting Information S1 Table provides a comprehensive list of the entire GC4S library, with a brief description of each component. Extensive Javadoc documentation is also available at http://www.sing-group.org/gc4s/javadoc. Supporting Information S1 Document provides basic documentation about the usage of the library in Maven-based projects as well as examples of some components of the gc4s module.
To demonstrate the utility and purpose of the library, this section illustrates, by way of example, how different components work and how they can be used. As mentioned above, the main purpose of GC4S is to provide a reliable set of generic GUI components that can be efficiently reused in scientific application development, allowing programmers to focus on concrete problem details.
Table 1. GC4S library structure. Packages are prefixed by org.sing_group.gc4s, which is omitted to avoid redundancy.
Package                   Description
.dialog                   Provides components related to the creation of different types of dialogs.
.dialog.wizard            Provides components related to the creation of Wizard dialogs or assistants.
.event                    Provides extensions of interfaces and classes related to different types of events fired by AWT (Abstract Window Toolkit) components.
.input                    Provides components related to the retrieval of different types of user inputs.
.input.combobox           Provides components related to the retrieval of user input using combo boxes.
.input.csv                Provides components related to the retrieval of CSV (comma-separated values) configurations.
.input.filechooser        Provides components related to file selection.
.input.list               Provides components related to the retrieval of user input using lists.
.input.text               Provides components related to the retrieval of user input using text fields.
.jsparklines              Provides components related to the creation of different types of JSparkLines renderers.
.msaviewer                Provides components related to the visualization of multiple sequence alignments.
.statistics               Provides components related to the visualization of statistical tests tables.
.ui                       Provides components related to the creation of user interfaces and layouts.
.ui.icons                 Provides icons and utilities related to icons.
.ui.menu                  Provides components related to the creation of menus.
.ui.tabbedpane            Provides components related to the creation of tabbed panes.
.ui.text                  Provides components related to the creation of text components.
.utilities                Provides classes offering different functionalities that cannot be grouped in other packages (dialog, event, input, ui or visualization).
.utilities.builder        Provides builder classes to facilitate the creation of components.
.visualization            Provides components related to data visualization.
.visualization.heatmap    Provides components related to heat map visualization.
.visualization.table      Provides components related to table visualization.
.visualization.table.csv  Provides components related to table visualization of CSV data.
https://doi.org/10.1371/journal.pone.0204474.t001
Genome browser
The gc4s-genomebrowser module provides a graphical component called GenomeBrowser to visualize genomes and different types of biological data, in a manner similar to the web Human Genome Browser at UCSC [28]. This component relies on the PileLine library [29] for parsing the genomic files.
The GenomeBrowser component in Fig 1 shows the visualization of a reference genome with two tracks. As the code snippet in Fig 2 shows, this component must be instantiated by providing an object of a class implementing the GenomeIndex PileLine interface. In this example, the genome index is created using the GenomeIndexBuilder provided by PileLine and then instantiated by creating a PileLineGenomeIndex object. Alternatively, the genome index may be created using samtools [30] and then instantiated by an object of the FaiGenomeIndex class. The tracks containing biological data are read from genomic position files including standard bam (http://samtools.github.io/hts-specs/SAMv1.pdf), pileup (http://samtools.sourceforge.net/pileup.shtml), BED (http://genome.ucsc.edu/FAQ/FAQformat#format1), GFF (http://genome.ucsc.edu/FAQ/FAQformat#format3) or VCF (http://www.internationalgenome.org/wiki/Analysis/vcf4.0/) formats. In the case of bam files, the corresponding .bai index must exist in the same directory as the provided bam file.
The toolbar in the upper area allows users to interactively browse their data by specifying the genome range and chromosome of interest, zooming in or out, and so on. Additionally, settings such as colour, background colour and maximum number of depth levels, among other options, can be customized for each track.
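The requirement that a bam track is accompanied by its .bai index can be checked before a file is offered to the viewer. The following is a minimal sketch that uses only the Java standard library. The class and method names (BamIndexCheck, hasIndex, demo) are hypothetical and are not part of GC4S or PileLine, and the two accepted index names (reads.bam.bai and reads.bai) are an assumption based on common samtools naming rather than something stated in the text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of GC4S): verifies that a bam file has an
// index file next to it before the file is used as a track.
class BamIndexCheck {

    // Returns true if "<name>.bam.bai" or "<name>.bai" exists in the same
    // directory as the given bam file. Both naming schemes are assumptions.
    static boolean hasIndex(Path bamFile) {
        Path absolute = bamFile.toAbsolutePath();
        if (absolute.getFileName() == null || absolute.getParent() == null) {
            return false; // not a regular file path (e.g. a file system root)
        }
        String name = absolute.getFileName().toString();
        Path dir = absolute.getParent();
        String base = name.endsWith(".bam") ? name.substring(0, name.length() - 4) : name;
        return Files.isRegularFile(dir.resolve(name + ".bai"))
            || Files.isRegularFile(dir.resolve(base + ".bai"));
    }

    // Creates a temporary bam file and reports the check result before and
    // after an (empty) index file is placed next to it.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("tracks");
            Path bam = Files.createFile(dir.resolve("reads.bam"));
            boolean before = hasIndex(bam);
            Files.createFile(dir.resolve("reads.bam.bai"));
            boolean after = hasIndex(bam);
            return new boolean[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("index found before: " + result[0] + ", after: " + result[1]);
    }
}
```

Such a check only confirms that an index file is present; it does not validate its contents.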
Fig 1. The GenomeBrowser component showing two tracks.
https://doi.org/10.1371/journal.pone.0204474.g001
Fig 2. Code snippet showing the instantiation of a GenomeBrowser. The range and chromosome of interest and two tracks are also set programmatically.
https://doi.org/10.1371/journal.pone.0204474.g002
Heat map
The gc4s-heatmap module provides a graphical component called JHeatMap to visualize numerical data matrices as heat maps, graphical representations where the individual values are represented as colours.
The JHeatMap component in Fig 3 shows the visualization of a data matrix with three rows and five columns. As the code snippet in Fig 4 shows, this component must be instantiated by providing a data matrix and two arrays specifying row and column names. Alternatively, it can be instantiated by providing a JHeatMapModel object. The colours used to create the gradient are also set programmatically in this example. This component has mouse wheel zooming enabled by default, but it can be disabled programmatically.
Fig 3. The JHeatMap component.
https://doi.org/10.1371/journal.pone.0204474.g003
Fig 4. Code snippet showing the instantiation of a JHeatMap. The colours used to create the colour gradient are also set programmatically.
https://doi.org/10.1371/journal.pone.0204474.g004
Built on this core component, the JHeatMapPanel component presents a JHeatMap along with a toolbar for controlling different visualization options. These options are: (i) changing the colours of the colour gradient, (ii) changing the minimum and maximum values used to compute the colour gradient, (iii) changing the font settings, (iv) transforming the data matrix (e.g. applying row centring or a log transform), (v) establishing the visible rows and columns, and (vi) exporting the heat map as an image. This component must be instantiated by providing a previously created JHeatMap component.
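Two of the ideas mentioned above, mapping values to a colour gradient bounded by minimum and maximum values and centring the rows of the data matrix, can be illustrated without the library. The sketch below uses only java.awt.Color. It is not the JHeatMap implementation: the class GradientSketch, its method names, the linear interpolation and the clamping of out-of-range values are all assumptions made for illustration.

```java
import java.awt.Color;
import java.util.Arrays;

// Illustrative sketch only; it does not reproduce the JHeatMap implementation.
class GradientSketch {

    // Linearly interpolates between two colours. Values outside [min, max]
    // are clamped to the nearest end of the gradient.
    static Color colourFor(double value, double min, double max, Color low, Color high) {
        double t = (max > min) ? (value - min) / (max - min) : 0.0;
        t = Math.max(0.0, Math.min(1.0, t));
        int r = (int) Math.round(low.getRed() + t * (high.getRed() - low.getRed()));
        int g = (int) Math.round(low.getGreen() + t * (high.getGreen() - low.getGreen()));
        int b = (int) Math.round(low.getBlue() + t * (high.getBlue() - low.getBlue()));
        return new Color(r, g, b);
    }

    // Row centring: subtracts the mean of each row from its values and
    // returns a new matrix, leaving the input untouched.
    static double[][] centreRows(double[][] data) {
        double[][] centred = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            double sum = 0.0;
            for (double v : data[i]) {
                sum += v;
            }
            double mean = data[i].length == 0 ? 0.0 : sum / data[i].length;
            centred[i] = new double[data[i].length];
            for (int j = 0; j < data[i].length; j++) {
                centred[i][j] = data[i][j] - mean;
            }
        }
        return centred;
    }

    public static void main(String[] args) {
        double[][] centred = centreRows(new double[][] {{1, 2, 3}, {10, 20, 30}});
        System.out.println(Arrays.deepToString(centred));
        System.out.println(colourFor(5, 0, 10, Color.GREEN, Color.RED));
    }
}
```

Changing the minimum and maximum passed to colourFor has the same visual effect as option (ii) above: values beyond the chosen bounds saturate at the end colours, so outliers do not compress the rest of the gradient.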
Multiple Sequence Alignments Viewer
The gc4s-multiple-sequence-alignment-viewer module provides a graphical component called MultipleSequenceAlignmentViewerPanel to display multiple sequence alignment data.
The MultipleSequenceAlignmentViewerPanel component in Fig 5 shows the visualization of two sample sequences. As the code snippet in Fig 6 shows, this component must be instantiated by providing a list of objects that implement the Sequence interface. A configuration object can also be passed to set the initial visualization settings. If this object is not provided, the default settings are used.
When this visualization is used, two features are usually required: (i) the possibility of highlighting specific sequence positions (e.g. to point out a specific amino acid or nucleotide as a result of a biological analysis), and (ii) the possibility of adding further information in the form of tracks that are placed before or after the sequences (e.g. to add some kind of score associated with each sequence position). In GC4S, these can be achieved by using SequenceAlignmentRenderer and MultipleSequenceAlignmentTracksModel objects, respectively. These objects can be passed to the viewer constructor so that they are used to show additional information. Fig 7 shows an example of this component configured with a renderer and a model that adds a track named Scores.
Built on this core component, the MultipleSequenceAlignmentViewerControl component presents a MultipleSequenceAlignmentViewerPanel along with a toolbar with different options for the end-users of the visualization. These options are: (i) selecting the tracks model and renderer that are being shown, (ii) changing the viewer configuration through a GUI dialog, and (iii) exporting the view as an image or HTML document. Additional advanced examples of these two components are provided in the corresponding demo module.
Fig 5. The MultipleSequenceAlignmentViewerPanel component.
https://doi.org/10.1371/journal.pone.0204474.g005
Fig 6. Code snippet showing the instantiation of a MultipleSequenceAlignmentViewerPanel. The initial visualization settings are also passed to the constructor.
https://doi.org/10.1371/journal.pone.0204474.g006
Fig 7. The MultipleSequenceAlignmentViewerPanel component configured with a renderer and a model that adds a track named Scores.
https://doi.org/10.1371/journal.pone.0204474.g007
Statistics Tests Table
The gc4s-statistics-tests-table module provides a graphical component called StatisticsTestTable to visualize tabular datasets and automatically compute p-values and q-values for each row in order to compare the conditions in them.
The table is generic, meaning that it can be used to display any type of data. The table retrieves the data from generic datasets implementing the Dataset interface, which also has a default implementation called DefaultDataset. The statistical tests used to compute p-values in each row to compare the conditions in the dataset must be compatible with the type of data under consideration. This means that a table created with a Dataset<Boolean> should receive a Test<Boolean>. The package org.sing_group.gc4s.statistics.data.tests of this module provides test implementations for Boolean (Chi-squared test, Chi-squared test with Yates’ correction, Randomization test, Fisher’s Exact test), Number (several t-tests and ANOVA), and String (Chi-squared test) data. Developers using the library can develop their own tests by simply implementing the Test interface.
The StatisticsTestTable component in Fig 8 shows the visualization of a tiny dataset of Boolean values with two conditions, ten samples and ten features (the size is deliberately small because it is meant to be a minimal example; advanced examples with bigger datasets can be found in the corresponding demo module). As the code snippet in Fig 9 shows, this component must be instantiated by providing a Dataset and a Test object for the same type. Optionally, an object that implements the Correction interface can be provided in order to have one additional column displaying the corrected p-values obtained from the test, as is the case in this example. After its instantiation, the table is also configured to: (i) set a cell renderer that displays the text “YES” or “NO” instead of true and false; (ii) set a header cell renderer that displays sample names using different colours depending on their associated conditions and draws a left border at the first column of each condition to enhance the visual distinction between conditions; (iii) set a highlighter that uses different colours for the cell values (true in red and false in green); and (iv) set a highlighter that draws a left border at the first sample of each condition to enhance the visual distinction between conditions. These classes are also included in this module, so programmers using the library can reuse them in the same way that we use them in the demos.
Fig 8. The StatisticsTestTable component.
https://doi.org/10.1371/journal.pone.0204474.g008
Fig 9. Code snippet showing the instantiation and configuration of a StatisticsTestTable.
https://doi.org/10.1371/journal.pone.0204474.g009
The time needed to compute the p-values and q-values can vary depending on the dataset size and the statistical tests used, so a ProgressEventListener can be added to the table in order to receive progress events from the component. Built on this core component, the StatisticsTestTablePanel component presents a StatisticsTestTable along with a progress bar to monitor the progress of the computation. This component is instantiated in the same way as the basic table, and the table used internally can be obtained programmatically so that it can be fully configured and controlled.
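The per-row computation described in this section can be illustrated in plain Java. The sketch below computes a two-sided Fisher's exact test for one row of Boolean values split into two conditions, and then adjusts a set of row p-values with the Benjamini-Hochberg procedure. It is a sketch under stated assumptions rather than the GC4S code: the class RowStatisticsSketch and its methods are hypothetical, they do not reproduce the Test or Correction interfaces of the library, and the choice of Benjamini-Hochberg as the correction is an assumption, since the text above does not say which correction produces the q-values.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only: these names are hypothetical and do not reproduce
// the Test or Correction interfaces shipped with GC4S.
class RowStatisticsSketch {

    // Natural logarithm of n!, computed by summation (adequate for small sample counts).
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    // Log-probability of the 2x2 table [[a, b], [c, d]] under fixed margins
    // (hypergeometric distribution).
    static double logTableProbability(int a, int b, int c, int d) {
        return logFactorial(a + b) + logFactorial(c + d)
            + logFactorial(a + c) + logFactorial(b + d)
            - logFactorial(a + b + c + d)
            - logFactorial(a) - logFactorial(b) - logFactorial(c) - logFactorial(d);
    }

    static int countTrue(boolean[] values) {
        int count = 0;
        for (boolean v : values) {
            if (v) {
                count++;
            }
        }
        return count;
    }

    // Two-sided Fisher's exact test for one row: sums the probabilities of all
    // tables with the same margins that are at most as likely as the observed one.
    static double fisherExact(boolean[] conditionA, boolean[] conditionB) {
        int a = countTrue(conditionA), b = conditionA.length - a;
        int c = countTrue(conditionB), d = conditionB.length - c;
        int rowA = a + b, colTrue = a + c, n = a + b + c + d;
        double observed = logTableProbability(a, b, c, d);
        double p = 0.0;
        for (int x = Math.max(0, colTrue - (n - rowA)); x <= Math.min(rowA, colTrue); x++) {
            double lp = logTableProbability(x, rowA - x, colTrue - x, n - rowA - colTrue + x);
            if (lp <= observed + 1e-9) { // small tolerance for ties
                p += Math.exp(lp);
            }
        }
        return Math.min(1.0, p);
    }

    // Benjamini-Hochberg adjustment: returns q-values in the input order.
    static double[] benjaminiHochberg(double[] pValues) {
        int m = pValues.length;
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> pValues[i]));
        double[] q = new double[m];
        double running = 1.0;
        // Walk from the largest p-value down, enforcing monotonic q-values.
        for (int rank = m; rank >= 1; rank--) {
            int index = order[rank - 1];
            running = Math.min(running, pValues[index] * m / rank);
            q[index] = running;
        }
        return q;
    }

    public static void main(String[] args) {
        boolean[] conditionA = {true, true, true, false};
        boolean[] conditionB = {true, false, false, false};
        System.out.println("p-value: " + fisherExact(conditionA, conditionB));
        System.out.println("q-values: "
            + Arrays.toString(benjaminiHochberg(new double[] {0.01, 0.04, 0.03, 0.005})));
    }
}
```

In a table such as the one in Fig 8, a test of this kind would run once per feature (row), and the correction would then be applied to the whole column of resulting p-values.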
Case studies
Two AIBench-based applications and a regular standalone application were selected as case
studies to illustrate how GC4S helped in the development of actual bioinformatics software.
AIBench-based applications: S2P and DEWE
S2P (http://www.sing-group.org/s2p) and DEWE (http://www.sing-group.org/dewe) were constructed using the AIBench framework following the straightforward architecture depicted in Fig 10. The gui module of both applications was entirely constructed with GC4S, allowing us to focus on the specific needs of those developments while relying on the general purpose components offered by GC4S.
Fig 10. AIBench-based applications architecture (S2P and DEWE).
https://doi.org/10.1371/journal.pone.0204474.g010
S2P is an open source application for fast and accurate processing of 2D-gel and MALDI-based mass spectrometry protein data. Since this application allows users to manage and visualize different types of available data, GC4S was used to provide a rich and effective user experience. For instance, GC4S made it possible to enhance tables by adding a popup summary to each column (Fig 11) and to create interactive and customizable heat map visualizations (Fig 12). Tables are also enhanced by adding spark lines from the JSparklines library that are created using the gc4s-jsparklines-factory module (Fig 13).
Additionally, S2P makes intensive use of the previously mentioned InputParametersPanel component to retrieve user inputs for different operations. Regarding this functionality, there is a noteworthy case: S2P deals with comma-separated values (CSV) files, and it must retrieve the CSV format configuration from the user. As this format is application-independent, GC4S offers a component called CsvPanel, which allows users to either customize or choose a predefined CSV format (Fig 14).
DEWE is a novel open source application for executing RNA-Seq analysis focused on supporting differential expression experiments. As in the case of S2P, this application also benefits from GC4S components to provide a compelling user experience, using the table enhancements previously discussed together with the InputParametersPanel.
The DEWE project has the specific goal of facilitating the configuration and execution of differential expression analysis workflows. As these workflows require different inputs and configurations, a configuration assistant or wizard seemed a good way of supporting this functionality. Consequently, the Wizard component was included in the GC4S library to facilitate the creation of such configuration assistants. This particular component extends JDialog and accepts a list of WizardStep objects. Fig 15 shows the configuration assistant of one of the built-in workflows included in DEWE. While DEWE only needs to implement the specific WizardStep components needed to configure the workflow (i.e. the configuration steps), the Wizard component implemented in GC4S constructs the assistant dialog and manages the wizard buttons as well as the left sidebar that contains the step labels.
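The division of labour described above, where the application supplies the steps and a generic component manages navigation, can be sketched as a small state model. The code below is a hypothetical illustration in plain Java: WizardSketch and its nested Step interface are invented names, they are not the GC4S Wizard or WizardStep classes, and the rule that a step must be complete before moving forward is an assumption about how such assistants typically behave.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of wizard navigation logic; it is not the GC4S Wizard class.
class WizardSketch {

    // A configuration step: it has a title (e.g. for a sidebar label) and
    // reports whether the user has provided everything it needs.
    interface Step {
        String title();
        boolean isComplete();
    }

    private final List<Step> steps;
    private int current = 0;

    WizardSketch(List<Step> steps) {
        if (steps.isEmpty()) {
            throw new IllegalArgumentException("At least one step is required");
        }
        this.steps = steps;
    }

    Step currentStep() {
        return steps.get(current);
    }

    // "Previous" is enabled on every step except the first one.
    boolean canGoBack() {
        return current > 0;
    }

    // "Next" is enabled when the current step is complete and is not the last one.
    boolean canGoNext() {
        return current < steps.size() - 1 && currentStep().isComplete();
    }

    // "Finish" is enabled on the last step once it is complete.
    boolean canFinish() {
        return current == steps.size() - 1 && currentStep().isComplete();
    }

    void next() {
        if (canGoNext()) {
            current++;
        }
    }

    void back() {
        if (canGoBack()) {
            current--;
        }
    }

    // Convenience factory for a step with a fixed completion state.
    static Step step(String title, boolean complete) {
        return new Step() {
            @Override public String title() { return title; }
            @Override public boolean isComplete() { return complete; }
        };
    }

    public static void main(String[] args) {
        WizardSketch wizard = new WizardSketch(
            Arrays.asList(step("Select input files", true), step("Configure analysis", false)));
        wizard.next();
        System.out.println(wizard.currentStep().title() + ", finish enabled: " + wizard.canFinish());
    }
}
```

With a model of this kind, the dialog only has to enable or disable its buttons from canGoBack, canGoNext and canFinish whenever a step changes, which keeps application-specific code confined to the steps.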
Fig 11. A popup summary added to each column of a table using the ColumnSummaryTableCellRenderer
component.
https://doi.org/10.1371/journal.pone.0204474.g011
GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components
PLOS ONE | https://doi.org/10.1371/journal.pone.0204474 September 20, 2018 12 / 19
Fig 12. Heat map visualization of spots data in S2P using the JHeatMap component.
https://doi.org/10.1371/journal.pone.0204474.g012
Fig 13. Screenshot of the S2P application showing the table that displays a Mascot identifications report. This and
other tables are enhanced with spark lines from the JSparklines library that are created using the
gc4s-jsparklines-factory module.
https://doi.org/10.1371/journal.pone.0204474.g013
Standalone developments
As noted earlier, the components of the GC4S library can be used either separately or
in conjunction with the AIBench framework to implement the view components. An illustrative
example of the former case is the development of SEDA (SEquence DAtaset Builder). SEDA
(http://www.sing-group.org/seda) is a Java Swing application for efficient and flexible processing
of FASTA files, offering different functionalities. Apart from using some of the components
discussed above, two other notable examples illustrate the broad scope of our GC4S library.
The first example involves the use of some of the icons provided in the Icons class. For
instance, Fig 16 shows how a different icon from GC4S is used depending on the validity of the
selected output directory, which is chosen using a JFileChooserPanel component.
The second example involves a very common situation in which a different panel needs to
be shown to the user depending on a previous selection in a combo box. This can be easily
achieved by using a CardLayout, switching the visible panel (or card) when the user selection
changes. The CardsPanel component provides this functionality, and SEDA uses it
to display different configuration panels depending on the selection made in a combo box (Fig 17).
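The underlying pattern can be sketched with plain Swing. The example below uses only standard CardLayout and JComboBox calls; it is not the GC4S CardsPanel API, and the class name and card names are hypothetical.

```java
import java.awt.BorderLayout;
import java.awt.CardLayout;
import java.awt.Component;
import javax.swing.JComboBox;
import javax.swing.JLabel;
import javax.swing.JPanel;

/** Plain-Swing sketch: a combo box that selects which card is visible. */
final class CardSwitchSketch extends JPanel {
    private final JComboBox<String> selector = new JComboBox<>();
    private final CardLayout layout = new CardLayout();
    private final JPanel cards = new JPanel(layout);

    CardSwitchSketch(String... cardNames) {
        super(new BorderLayout());
        for (String name : cardNames) {
            JPanel card = new JPanel();
            card.setName(name);                  // remember which card this is
            card.add(new JLabel("Options for: " + name));
            cards.add(card, name);               // register the card under its name
            selector.addItem(name);
        }
        // Switch the visible card whenever the selection changes.
        selector.addActionListener(e ->
            layout.show(cards, (String) selector.getSelectedItem()));
        add(selector, BorderLayout.NORTH);
        add(cards, BorderLayout.CENTER);
    }

    JComboBox<String> selector() { return selector; }

    /** Name of the card that the layout currently shows, or null if there is none. */
    String visibleCard() {
        for (Component c : cards.getComponents()) {
            if (c.isVisible()) {
                return c.getName();
            }
        }
        return null;
    }
}
```

In a real application such a panel would be created on the event dispatch thread and added to a JFrame or JDialog; a reusable component saves each project from repeating this wiring.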
Fig 14. The CsvPanel component allows users to configure their own CSV format.
https://doi.org/10.1371/journal.pone.0204474.g014
Fig 15. Example of use of a Wizard in DEWE to create a workflow configuration assistant.
https://doi.org/10.1371/journal.pone.0204474.g015
Discussion and lessons learned
Two of the main forms of providing reusable software are libraries and frameworks.
While libraries can be seen as a set of reusable data structures and functions, frameworks
incorporate the concept of dependency inversion, whereby the framework shapes the
architecture of the application and the user code is plugged into the framework to build the
final application. Libraries and frameworks are complementary but distinct concepts:
frameworks usually incorporate libraries with helper functions.
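The distinction can be made concrete with a small Java sketch. All names used here (Stats, Operation, MiniFramework, MeanOperation) are hypothetical and chosen for illustration only; they do not correspond to GC4S or AIBench classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Library style: a reusable function that user code calls directly. */
final class Stats {
    private Stats() {}

    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return values.length == 0 ? Double.NaN : sum / values.length;
    }
}

/** Framework style: user code implements this interface and is called by the framework. */
interface Operation {
    String name();

    double apply(double[] input);
}

/** The framework owns the control flow and decides when the plugged-in code runs. */
final class MiniFramework {
    private final List<Operation> operations = new ArrayList<>();

    void register(Operation op) { operations.add(op); }

    /** Runs every registered operation and returns one report line per operation. */
    List<String> run(double[] input) {
        List<String> report = new ArrayList<>();
        for (Operation op : operations) {
            report.add(op.name() + " = " + op.apply(input));
        }
        return report;
    }
}

/** User code plugged into the framework; internally it reuses the library. */
final class MeanOperation implements Operation {
    public String name() { return "mean"; }

    public double apply(double[] input) { return Stats.mean(input); }
}
```

With the library, the caller decides when Stats.mean runs. With the framework, the caller only registers MeanOperation and the framework decides when to invoke it, which is the inversion referred to above.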
Java is a good ecosystem for developing reusable software, not only because of its object-oriented
nature, which eases reusability by taking advantage of encapsulation and polymorphism,
but also because of (i) Swing, which includes a JComponent base class for every visual
component and a delegation event model that allows programmers to create and reuse custom
interactive components easily, and (ii) Maven, which speeds up the incorporation of libraries
and frameworks into new projects through its dependency management system.
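As a minimal illustration of the Swing side of this argument, the sketch below defines a custom component by extending JComponent and notifies registered listeners through the delegation event model. ToggleSquare is a hypothetical component written for this example and is not part of GC4S; only standard Swing and AWT classes are used.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import javax.swing.JComponent;

/** Hypothetical custom component: a square that toggles on click and notifies listeners. */
final class ToggleSquare extends JComponent {
    private boolean on;

    ToggleSquare() {
        setPreferredSize(new Dimension(24, 24));
        // Low-level mouse events are translated into a semantic state change.
        addMouseListener(new MouseAdapter() {
            @Override
            public void mouseClicked(MouseEvent e) {
                toggle();
            }
        });
    }

    boolean isOn() { return on; }

    /** Flips the state, requests a repaint, and notifies every registered listener. */
    void toggle() {
        on = !on;
        repaint();
        ActionEvent event = new ActionEvent(this, ActionEvent.ACTION_PERFORMED, "toggle");
        for (ActionListener l : listenerList.getListeners(ActionListener.class)) {
            l.actionPerformed(event);
        }
    }

    // Listeners are stored in the EventListenerList inherited from JComponent.
    void addActionListener(ActionListener l) {
        listenerList.add(ActionListener.class, l);
    }

    void removeActionListener(ActionListener l) {
        listenerList.remove(ActionListener.class, l);
    }

    @Override
    protected void paintComponent(Graphics g) {
        g.setColor(on ? Color.GREEN.darker() : Color.LIGHT_GRAY);
        g.fillRect(0, 0, getWidth(), getHeight());
    }
}
```

Client code reuses the component exactly as it would a built-in one, for example with square.addActionListener(e -> ...), without knowing how the component detects clicks or paints itself.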
In 2008, the Java ecosystem incorporated the JavaFX toolkit for the creation of GUI desktop
applications as well as Rich Internet Applications. It was meant to be the successor of Java
Swing because of its modern features (such as built-in UI controls, CSS support, the WebView
component, or smooth animations, among others), but it seems that it has not met expectations,
as it has not achieved the same adoption and support levels (in terms of community
forums and experts, for instance) as Java Swing. Furthermore, the sophisticated GUI features
included in JavaFX may not be essential for the development of scientific applications
that are easy to use for end-users without advanced informatics skills, while the relative simplicity
of Swing probably makes it more appropriate for academia. Although it is hard to estimate
the actual usage of both libraries, some numbers can be obtained in this regard. As of July
2018, we found that: (i) a Google Scholar query shows 42 400 entries for “Java Swing” and
4 950 for “JavaFX”, (ii) a generic Google search shows about 26 600 000 results for “Java Swing”
and about 4 170 000 for “JavaFX”, and (iii) a Scopus query returns only 40 results for “JavaFX”
but 258 for “Java Swing”. Based on these numbers and the maturity of the Java Swing technology,
for which there are many additional third-party libraries and frameworks, Java Swing is still an
appropriate choice for the development of scientific and bioinformatics software.
In this paper we have presented GC4S, a library of reusable Java Swing components, which
is a very useful complement to our AIBench framework for biomedical desktop applications.
While AIBench provides applications with a reusable main architecture and structure, based
on the input-process-output (IPO) model, the GC4S library provides a set of components that
are incorporated as needed for each specific application.
We provided an overview of the GC4S library and presented three real-world use cases to
illustrate its capabilities and usefulness. When developing the GUI of these three software
applications, Java Swing, SwingX and GC4S were used. As seen in Fig 18, Java Swing classes
are the most referenced by GUI classes in the three projects. Nevertheless, the percentage of
references to GC4S is notable, especially in DEWE and SEDA. This demonstrates the impact
that the presented library had on the development of these three applications.
Fig 16. Example of use of different icons provided by the Icons class.
https://doi.org/10.1371/journal.pone.0204474.g016
Fig 17. Use of CardsPanel in SEDA to display a different configuration panel (A, B, C or D) to the user depending on the choice made in the Rename type
combo box.
https://doi.org/10.1371/journal.pone.0204474.g017
Fig 18. Percentages of GUI classes in S2P, DEWE and SEDA with references to Java Swing, GC4S and SwingX classes.
https://doi.org/10.1371/journal.pone.0204474.g018
Conclusions
GC4S (http://www.sing-group.org/GC4S) is open-source software freely distributed under the
LGPLv3 license, providing a set of reusable graphical user interface components for Java Swing. The
real utility of GC4S has been demonstrated by different case studies where its use has led to a
more efficient software development process. GC4S is open to further extensions and we will
keep improving it with fresh and innovative ideas as existing projects evolve or novel developments
reach the marketplace.
Supporting information
S1 Table. Complete list of GC4S components by package and type. Packages and compo-
nents are listed alphabetically.
(DOCX)
S1 Document. Basic examples of GC4S.
(DOCX)
Acknowledgments
H. López-Fernández is supported by a post-doctoral fellowship from Xunta de Galicia
(ED481B 2016/068-0). SING group thanks CITI (Centro de Investigación, Transferencia e Innovación)
from University of Vigo for hosting its IT infrastructure.
Author Contributions
Conceptualization: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña, Reyes Pavón, Florentino Fdez-Riverola.
Funding acquisition: Hugo López-Fernández, Florentino Fdez-Riverola.
Investigation: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña, Reyes Pavón.
Methodology: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña, Rosalía Laza, Reyes Pavón, Florentino Fdez-Riverola.
Project administration: Hugo López-Fernández, Rosalía Laza, Reyes Pavón.
Resources: Miguel Reboiro-Jato.
Software: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña.
Supervision: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña, Florentino Fdez-Riverola.
Writing – original draft: Hugo López-Fernández, Rosalía Laza, Reyes Pavón, Florentino Fdez-Riverola.
Writing – review & editing: Hugo López-Fernández, Miguel Reboiro-Jato, Daniel Glez-Peña, Rosalía Laza, Reyes Pavón, Florentino Fdez-Riverola.
References
1. Bolchini D, Finkelstein A, Perrone V, Nagl S. Better bioinformatics through usability analysis. Bioinfor-
matics. 2009; 25:406–12. https://doi.org/10.1093/bioinformatics/btn633 PMID: 19073592
2. Rother K, Potrzebowski W, Puton T, Rother M, Wywial E, Bujnicki JM. A toolbox for developing bioinfor-
matics software. Brief Bioinform. 2012; 13:244–57. https://doi.org/10.1093/bib/bbr035 PMID: 21803787
GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components
PLOS ONE | https://doi.org/10.1371/journal.pone.0204474 September 20, 2018 17 / 19
3. Leprevost F da V, Barbosa VC, Francisco EL, Perez-Riverol Y, Carvalho PC. On best practices in the
development of bioinformatics software. Front Genet [Internet]. 2014 [cited 2017 Dec 6];5. Available
from: http://journal.frontiersin.org/article/10.3389/fgene.2014.00199/abstract
4. PrlićA, Procter JB. Ten Simple Rules for the Open Development of Scientific Software. PLoS Comput
Biol. 2012; 8:e1002802. https://doi.org/10.1371/journal.pcbi.1002802 PMID: 23236269
5. Seemann T. Ten recommendations for creating usable bioinformatics command line software. Giga-
Science [Internet]. 2013 [cited 2017 Dec 6];2. Available from: https://academic.oup.com/gigascience/
article-lookup/doi/10.1186/2047-217X-2-15
6. Pavelin K, Cham JA, de Matos P, Brooksbank C, Cameron G, Steinbeck C. Bioinformatics Meets User-
Centred Design: A Perspective. Bourne PE, editor. PLoS Comput Biol. 2012; 8:e1002554. https://doi.
org/10.1371/journal.pcbi.1002554 PMID: 22807660
7. Perez-Riverol Y, Wang R, Hermjakob H, Mu¨ller M, Vesada V, Vizcaı
´no JA. Open source libraries and
frameworks for mass spectrometry based proteomics: A developer’s perspective. Biochim Biophys
Acta BBA—Proteins Proteomics [Internet]. [cited 2013 Nov 18]; Available from: http://www.
sciencedirect.com/science/article/pii/S1570963913001039
8. Stajich JE. The Bioperl Toolkit: Perl Modules for the Life Sciences. Genome Res. 2002; 12:1611–8.
https://doi.org/10.1101/gr.361602 PMID: 12368254
9. Giardine B. Galaxy: A platform for interactive large-scale genome analysis. Genome Res. 2005;
15:1451–5. https://doi.org/10.1101/gr.4086505 PMID: 16169926
10. Prlic A, Yates A, Bliven SE, Rose PW, Jacobsen J, Troshin PV, et al. BioJava: an open-source frame-
work for bioinformatics in 2012. Bioinformatics. 2012; 28:2693–5. https://doi.org/10.1093/
bioinformatics/bts494 PMID: 22877863
11. Chang W, Cheng J, Allaire JJ, Xie Y, McPherson J, RStudio, et al. shiny: WebApplication Framework
for R [Internet]. 2018 [cited 2018 Jul 27]. Available from: https://CRAN.R-project.org/package=shiny
12. Wang R, Perez-Riverol Y, Hermjakob H, Vizcaı
´no JA. Open source libraries and frameworks for biologi-
cal data visualisation: A guide for developers. PROTEOMICS. 2015; 15:1356–74. https://doi.org/10.
1002/pmic.201400377 PMID: 25475079
13. Fang X, Miller JA, Arnold J. J3DV: A Java-based 3D database visualization tool. Softw Pract Exp. 2002;
32:443–63.
14. Gansner ER, North SC. An open graph visualization system and its applications to software engineer-
ing. Softw Pract Exp. 2000; 30:1203–33.
15. Barsnes H, Vaudel M, Martens L. JSparklines: Making tabular proteomics data come alive. PROTEO-
MICS. 2015; 15:1428–31. https://doi.org/10.1002/pmic.201400356 PMID: 25422159
16. Fdez-Riverola F, Glez-Peña D, Lo
´pez-Ferna
´ndez H, Reboiro-Jato M, Me
´ndez JR. A JAVA application
framework for scientific software development. Softw—Pract Exp. 2012; 42:1015–36.
17. Lo
´pez-Ferna
´ndez H, de S. Pesso
ˆa G, Arruda MAZ, Capelo-Martı
´nez JL, Fdez-Riverola F, Glez-Peña
D, et al. LA-iMageS: a software for elemental distribution bioimaging using LA–ICP–MS data. J Chemin-
formatics [Internet]. 2016 [cited 2017 Dec 6];8. Available from: http://jcheminf.springeropen.com/
articles/10.1186/s13321-016-0178-7
18. Lo
´pez-Ferna
´ndez H, Santos HM, Capelo JL, Fdez-Riverola F, Glez-Peña D, Reboiro-Jato M. Mass-Up:
an all-in-one open software application for MALDI-TOF mass spectrometry knowledge discovery. BMC
Bioinformatics [Internet]. 2015 [cited 2015 Dec 15];16. Available from: http://www.biomedcentral.com/
1471-2105/16/318
19. Lo
´pez-Ferna
´ndez H, Arau
´jo JE, Jorge S, Glez-Peña D, Reboiro-Jato M, Santos HM, et al. S2P: A soft-
ware tool to quickly carry out reproducible biomedical research projects involving 2D-gel and MALDI-
TOF MS protein data. Comput Methods Programs Biomed. 2018; 155:1–9. https://doi.org/10.1016/j.
cmpb.2017.11.024 PMID: 29512488
20. Lo
´pez-Ferna
´ndez H, Arau
´jo JE, Glez-Peña D, Reboiro-Jato M, Fdez-Riverola F, Capelo-Martı
´nez JL.
S2P: A Desktop Application for Fast and Easy Processing of 2D-Gel and MALDI-Based Mass Spec-
trometry Protein Data. In: Fdez-Riverola F, Mohamad MS, Rocha M, De Paz JF, Pinto T, editors. 11th
Int Conf Pract Appl Comput Biol Bioinforma [Internet]. Cham: Springer International Publishing; 2017
[cited 2017 Dec 6]. p. 1–8. Available from: http://link.springer.com/10.1007/978-3-319-60816-7_1
21. Lo
´pez-Ferna
´ndez H, Reboiro-Jato M, Glez-Peña D, Aparicio F, Gachet D, Buenaga M, et al. BioAnnote:
A software platform for annotating biomedical documents with application in medical learning environ-
ments. Comput Methods Programs Biomed. 2013; 111:139–47. https://doi.org/10.1016/j.cmpb.2013.
03.007 PMID: 23562645
22. Pe
´rez-Rodrı
´guez G, Glez-Peña D, Azevedo NF, Pereira MO, Fdez-Riverola F, Lourenc¸o A. Enabling
systematic, harmonised and large-scale biofilms data computation: The Biofilms Experiment
GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components
PLOS ONE | https://doi.org/10.1371/journal.pone.0204474 September 20, 2018 18 / 19
Workbench. Comput Methods Programs Biomed. 2015; 118:309–21. https://doi.org/10.1016/j.cmpb.
2014.12.005 PMID: 25600941
23. Lourenc¸o A, Carreira R, Carneiro S, Maia P, Glez-Peña D, Fdez-Riverola F, et al. @Note: a workbench
for biomedical text mining. J Biomed Inform. 2009; 42:710–20. https://doi.org/10.1016/j.jbi.2009.04.002
PMID: 19393341
24. Reboiro-Jato D, Reboiro-Jato M, Fdez-Riverola F, Vieira CP, Fonseca NA, Vieira J. ADOPS—Auto-
matic Detection Of Positively Selected Sites. J Integr Bioinforma. 2012; 9:200.
25. Santos HM, Reboiro-Jato M, Glez-Peña D, Nunes-Miranda JD, Fdez-Riverola F, Carvallo R, et al. Deci-
sion peptide-driven: a free software tool for accurate protein quantification using gel electrophoresis
and matrix assisted laser desorption ionization time of flight mass spectrometry. Talanta. 2010;
82:1412–20. https://doi.org/10.1016/j.talanta.2010.07.007 PMID: 20801349
26. Galesio M, Lo
´pez-Fdez H, Reboiro-Jato M, Go
´mez-Meire S, Glez-Peña D, Fdez-Riverola F, et al.
Speeding up the screening of steroids in urine: Development of a user-friendly library. Steroids. 2013;
78:1226–32. https://doi.org/10.1016/j.steroids.2013.08.014 PMID: 24036418
27. Lo
´pez-Ferna
´ndez H, Reboiro-Jato M, Pe
´rez Rodrı
´guez JA, Fdez-Riverola F, Glez-Peña D. The Artificial
Intelligence Workbench: a retrospective review. ADCAIJ Adv Distrib Comput Artif Intell J. 2016; 5:73.
28. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The Human Genome
Browser at UCSC. Genome Res. 2002; 12:996–1006. https://doi.org/10.1101/gr.229102 PMID:
12045153
29. Glez-Peña D, Go
´mez-Lo
´pez G, Reboiro-Jato M, Fdez-Riverola F, Pisano DG. PileLine: a toolbox to
handle genome position information in next-generation sequencing studies. BMC Bioinformatics. 2011;
12:31. https://doi.org/10.1186/1471-2105-12-31 PMID: 21261974
30. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format
and SAMtools. Bioinformatics. 2009; 25:2078–9. https://doi.org/10.1093/bioinformatics/btp352 PMID:
19505943
GC4S: A bioinformatics-oriented Java software library of reusable graphical user interface components
PLOS ONE | https://doi.org/10.1371/journal.pone.0204474 September 20, 2018 19 / 19
... SEDA is implemented in Java 8 using the GC4S library for GUI development [12] and a custom framework for CLI development. ...
Article
Full-text available
Background The initial version of SEDA assists life science researchers without programming skills with the preparation of DNA and protein sequence FASTA files for multiple bioinformatics applications. However, the initial version of SEDA lacks a command-line interface for more advanced users and does not allow the creation of automated analysis pipelines. Results The present paper discusses the updates of the new SEDA release, including the addition of a complete command-line interface, new functionalities like gene annotation, a framework for automated pipelines, and improved integration in Linux environments. Conclusion SEDA is an open-source Java application and can be installed using the different distributions available (https://www.sing-group.org/seda/download.html) as well as through a Docker image (https://hub.docker.com/r/pegi3s/seda). It is released under a GPL-3.0 license, and its source code is publicly accessible on GitHub (https://github.com/sing-group/seda). The software version at the time of submission is archived at Zenodo (version v1.6.0, http://doi.org/10.5281/zenodo.10201605).
... SEDA is implemented in Java 8 using the GC4S library for GUI development [22]. The project has a plugin-based architecture and comprises a core module, which provides the SEDA main operations and data structures (classes for representing and managing sequences and files), and several modules that provide additional operations. ...
Article
SEDA (SEquence DAtaset builder) is a multiplatform desktop application for the manipulation of FASTA files containing DNA or protein sequences. The convenient graphical user interface gives access to a collection of simple (filtering, sorting, or file reformatting, among others) and advanced (BLAST searching, protein domain annotation, gene annotation, and sequence alignment) utilities not present in similar applications, which eases the work of life science researchers working with DNA and/or protein sequences, especially those who have no programming skills. This paper presents general guidelines on how to build efficient data handling protocols using SEDA, as well as practical examples on how to prepare high-quality datasets for single gene phylogenetic studies, the characterization of protein families, or phylogenomic studies. The user-friendliness of SEDA also relies on two important features: (i) the availability of easy-to-install distributable versions and installers of SEDA, including a Docker image for Linux, and (ii) the facility with which users can manage large datasets. SEDA is open-source, with GNU General Public License v3.0 license, and publicly available at GitHub (https://github.com/sing-group/seda). SEDA installers and documentation are available at https://www.sing-group.org/seda/
... In contrast to R, Java, MOG's platform, has been used to develop numerous software with interfaces that are interactive and user-friendly (e.g. (78,(104)(105)(106)), and MOG provides the researcher with specialized GUIs and methods for exploratory data analysis. MOG's GUI allows direct interactivity with the data through interactive tables, trees and visualizations, so that a researcher can easily explore data from different perspectives. ...
Article
Full-text available
The diverse and growing omics data in public domains provide researchers with tremendous opportunity to extract hidden, yet undiscovered, knowledge. However, the vast majority of archived data remain unused. Here, we present MetaOmGraph (MOG), a free, open-source, standalone software for exploratory analysis of massive datasets. Researchers , without coding, can interactively visualize and evaluate data in the context of its metadata, honing-in on groups of samples or genes based on attributes such as expression values, statistical associations, metadata terms and ontology annotations. Interaction with data is easy via interactive visualizations such as line charts, box plots, scatter plots, histograms and volcano plots. Statistical analyses include co-expression analysis, differential expression analysis and differential correlation analysis, with significance tests. Researchers can send data subsets to R for additional analyses. Multithreading and indexing enable efficient big data analysis. A researcher can create new MOG projects from any numerical data; or explore an existing MOG project. MOG projects, with history of explorations, can be saved and shared. We illustrate MOG by case studies of large cu-rated datasets from human cancer RNA-Seq, where we identify novel putative biomarker genes in different tumors, and microarray and metabolomics data from Arabidopsis thaliana. MOG executable and code: http://metnetweb.gdcb.iastate.edu/ and https: //github.com/urmi-21/MetaOmGraph/.
... Because they are written in R (18)(19)(20), these tools must rely on R's somewhat limited capabilities for interactive applications. In contrast to R, Java, MOG's platform, has been used to develop numerous software with interfaces that are interactive and userfriendly (102,(130)(131)(132)(133), and MOG provides the researcher with specialized GUIs and methods for exploratory data analysis. MOG's GUI allows direct interactivity with the data and the visualizations, so that a researcher can easily explore data from different perspectives. ...
Preprint
Full-text available
The diverse and growing omics data in public domains provide researchers with a tremendous opportunity to extract hidden knowledge. However, the challenge of providing domain experts with easy access to these big data has resulted in the vast majority of archived data remaining unused. Here, we present MetaOmGraph (MOG), a free, open-source, standalone software for exploratory data analysis of massive datasets by scientific researchers. Using MOG, a researcher can interactively visualize and statistically analyze the data, in the context of its metadata. Researchers can interactively hone-in on groups of experiments or genes based on attributes such as expression values, statistical results, metadata terms, and ontology annotations. MOG's statistical tools include coexpression, differential expression, and differential correlation analysis, with permutation test-based options for significance assessments. Multithreading and indexing enable efficient data analysis on a personal computer, with no need for writing code. Data can be visualized as line charts, box plots, scatter plots, and volcano plots. A researcher can create new MOG projects from any data or analyze an existing one. An R-wrapper lets a researcher select and send smaller data subsets to R for additional analyses. A researcher can save MOG projects with a history of the exploratory progress and later reopen or share them. We illustrate MOG by case studies of large curated datasets from human cancer RNA-Seq, in which we assembled a list of novel putative biomarker genes in different tumors, and microarray and metabolomics from A. thaliana.
Chapter
Full-text available
Research and development (R&D) activities are employed to find new theories commonly derived in software products; but the gap between the proof-of-concept software and the final implementation is commonly hard to narrow. In a previous work, a combination of the COMET (collaborative object modeling and architectural design method) and OCEP (Open Community engagement model) methodologies was successfully applied to the development of a person re-identification system. In this paper, the COMET-OCEP software process is conceptually related to other processes, and its guidelines are detailed. Additionally, the advantages of the COMET-OCEP process are highlighted through the analysis of a test case.
Preprint
Full-text available
Research and development (R&D) activities are employed to find new theories commonly derived in software products; but the gap between the proof-of-concept software and the final implementation is commonly hard to narrow. In a previous work, a combination of the COMET (collaborative object modeling and architectural design method) and OCEP (Open Community engagement model) methodologies was successfully applied to the development of a person re-identification system. In this paper, the COMET-OCEP software process is conceptually related to other processes, and its guidelines are detailed. Additionally, the advantages of the COMET-OCEP process are highlighted through the analysis of a test case.
Article
Background: Transcriptomics profiling aims to identify and quantify all transcripts present within a cell type or tissue at a particular state, and thus provide information on the genes expressed in specific experimental settings, differentiation or disease conditions. RNA-Seq technology is becoming the standard approach for such studies, but available analysis tools are often hard to install, configure and use by users without advanced bioinformatics skills. Methods: Within reason, DEWE aims to make RNA-Seq analysis as easy for non-proficient users as for experienced bioinformaticians. DEWE supports two well-established and widely used differential expression analysis workflows: using Bowtie2 or HISAT2 for sequence alignment; and, both applying StringTie for quantification, and Ballgown and edgeR for differential expression analysis. Also, it enables the tailored execution of individual tools as well as helps with the management and visualisation of differential expression results. Results: DEWE provides a user-friendly interface designed to reduce the learning curve of less knowledgeable users while enabling analysis customisation and software extension by advanced users. Docker technology helps overcome installation and configuration hurdles. In addition, DEWE produces high quality and publication-ready outputs in the form of tab-delimited files and figures, as well as helps researchers with further analyses, such as pathway enrichment analysis. Conclusions: The abilities of DEWE are exemplified here by practical application to a comparative analysis of monocytes and monocyte-derived dendritic cells, a study of clinical relevance. DEWE installers and documentation are freely available at https://www.sing-group.org/dewe.
Article
Full-text available
Maximum-likelihood methods based on models of codon substitution have been widely used to infer positively selected amino acid sites that are responsible for adaptive changes. Nevertheless, in order to use such an approach, software applications are required to align protein and DNA sequences, infer a phylogenetic tree and run the maximum-likelihood models. Therefore, a significant effort is made in order to prepare input files for the different software applications and in the analysis of the output of every analysis. In this paper we present the ADOPS (Automatic Detection Of Positively Selected Sites) software. It was developed with the goal of providing an automatic and flexible tool for detecting positively selected sites given a set of unaligned nucleotide sequence data. An example of the usefulness of such a pipeline is given by showing, under different conditions, positively selected amino acid sites in a set of 54 Coffea putative S-RNase sequences. ADOPS software is freely available and can be downloaded from http://sing.ei.uvigo.es/ADOPS.
Article
Full-text available
The spatial distribution of chemical elements in different types of samples is an important field in several research areas such as biology, paleontology or biomedicine, among others. Elemental distribution imaging by laser ablation inductively coupled plasma mass spectrometry (LA–ICP–MS) is an effective technique for qualitative and quantitative imaging due to its high spatial resolution and sensitivity. By applying this technique, vast amounts of raw data are generated to obtain high-quality images, essentially making the use of specific LA–ICP–MS imaging software that can process such data absolutely mandatory. Since existing solutions are usually commercial or hard-to-use for average users, this work introduces LA-iMageS, an open-source, free-to-use multiplatform application for fast and automatic generation of high-quality elemental distribution bioimages from LA–ICP–MS data in the PerkinElmer Elan XL format, whose results can be directly exported to external applications for further analysis. A key strength of LA-iMageS is its substantial added value for users, with particular regard to the customization of the elemental distribution bioimages, which allows, among other features, the ability to change color maps, increase image resolution or toggle between 2D and 3D visualizations. Electronic supplementary material The online version of this article (doi:10.1186/s13321-016-0178-7) contains supplementary material, which is available to authorized users.
Article
Full-text available
Last decade, biomedical and bioinformatics researchers have been demanding advanced and user-friendly applications for real use in practice. In this context, the Artificial Intelligence Workbench, an open-source Java desktop application framework for scientific software development, emerged with the goal of provid-ing support to both fundamental and applied research in the domain of transla-tional biomedicine and bioinformatics. AIBench automatically provides function-alities that are common to scientific applications, such as user parameter defini-tion, logging facilities, multi-threading execution, experiment repeatability, work-flow management, and fast user interface development, among others. Moreover, AIBench promotes a reusable component based architecture, which also allows assembling new applications by the reuse of libraries from existing projects or third-party software. Ten years have passed since the first release of AIBench, so it is time to look back and check if it has fulfilled the purposes for which it was conceived to and how it evolved over time.
Article
Full-text available
Background Mass spectrometry is one of the most important techniques in the field of proteomics. MALDI-TOF mass spectrometry has become popular during the last decade due to its high speed and sensitivity for detecting proteins and peptides. MALDI-TOF-MS can be also used in combination with Machine Learning techniques and statistical methods for knowledge discovery. Although there are many software libraries and tools that can be combined for these kind of analysis, there is still a need for all-in-one solutions with graphical user-friendly interfaces and avoiding the need of programming skills. Results Mass-Up, an open software multiplatform application for MALDI-TOF-MS knowledge discovery is herein presented. Mass-Up software allows data preprocessing, as well as subsequent analysis including (i) biomarker discovery, (ii) clustering, (iii) biclustering, (iv) three-dimensional PCA visualization and (v) classification of large sets of spectra data. Conclusions Mass-Up brings knowledge discovery within reach of MALDI-TOF-MS researchers. Mass-Up is distributed under license GPLv3 and it is open and free to all users at http://sing.ei.uvigo.es/mass-up. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0752-4) contains supplementary material, which is available to authorized users.
Article
Full-text available
Recent advances in high-throughput experimental techniques have led to an exponential increase in both the size and the complexity of the datasets commonly studied in biology. Data visualisation is increasingly used as the key to unlock this data, going from hypothesis generation to model evaluation and tool implementation. It is becoming more and more the heart of bioinformatics workflows, enabling scientists to reason and communicate more effectively. In parallel, there has been a corresponding trend towards the development of related software, which has triggered the maturation of different visualisation libraries and frameworks. For bioinformaticians, scientific programmers and software developers, the main challenge is to pick out the most fitting one(s) to create clear, meaningful and integrated data visualisation for their particular use cases. In this review, we introduce a collection of open source or free to use libraries and frameworks for creating data visualisation, covering the generation of a wide variety of charts and graphs. We will focus on software written in Java, JavaScript or Python. We truly believe this software offers the potential to turn tedious data into exciting visual stories. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.
Article
Bioinformatics is one of the major areas of study in modern biology. Medium- and large-scale quantitative biology studies have created a demand for professionals with proficiency in multiple disciplines, including computer science and statistical inference besides biology. Bioinformatics has now become a cornerstone in biology, and yet the formal training of new professionals (Perez-Riverol et al., 2013; Via et al., 2013), the availability of good services for data deposition, and the development of new standards and software coding rules (Sandve et al., 2013; Seemann, 2013) are still major concerns. Good programming practices range from documentation and code readability through design patterns and testing (Via et al., 2013; Wilson et al., 2014). Here, we highlight some points for best practices and raise important issues to be discussed by the community.
Article
Background and objective: 2D-gel electrophoresis is widely used in combination with MALDI-TOF mass spectrometry in order to analyze the proteome of biological samples. For instance, it can be used to discover proteins that are differentially expressed between two groups (e.g. two disease conditions, case vs. control, etc.), thus obtaining a set of potential biomarkers. This procedure requires a great deal of data processing in order to prepare data for analysis or to merge and integrate data from different sources. This kind of work is usually done manually (e.g. copying and pasting data into spreadsheet files), which is highly time consuming and distracts the researcher from other important, core tasks. Moreover, engaging in a repetitive process in a non-automated, handling-based manner is prone to error, thus threatening reliability and reproducibility. The objective of this paper is to present S2P, an open source software to overcome these drawbacks. Methods: S2P is implemented in Java on top of the AIBench framework, and relies on well-established open source libraries to accomplish different tasks. Results: S2P is an AIBench-based desktop multiplatform application, specifically aimed to process 2D-gel and MALDI-mass spectrometry protein identification-based data in a computer-aided, reproducible manner. Different case studies are presented in order to show the usefulness of S2P. Conclusions: S2P is open source and free to all users at http://www.sing-group.org/s2p. Through its user-friendly GUI, S2P dramatically reduces the time that researchers need to invest in order to prepare data for analysis.
Conference Paper
2D-gel electrophoresis is widely used in combination with MALDI-TOF mass spectrometry in order to analyse the proteome of biological samples. It can be used to discover proteins that are differentially expressed between two groups (e.g. two disease conditions), thus obtaining a set of potential biomarkers. Biomarker discovery requires a lot of data processing in order to prepare data for analysis or to merge data from different sources. This kind of work is usually done manually, which is highly time consuming and distracts the operator or researcher from other important tasks. Moreover, doing this repetitive process in a non-automated, handling-based manner is error-prone, affecting reliability and reproducibility. To overcome these drawbacks, S2P, an AIBench-based desktop multiplatform application, has been specifically created to process 2D-gel and MALDI-mass spectrometry protein identification-based data in a computer-aided manner. S2P is open source and free to all users at http://www.sing-group.org/s2p.
Article
Background and objective: Biofilms are receiving increasing attention from the biomedical community. Biofilm-like growth within the human body is considered one of the key microbial strategies to augment resistance and persistence during infectious processes. The Biofilms Experiment Workbench is a novel software workbench for the operation and analysis of biofilms experimental data. The goal is to promote the interchange and comparison of data among laboratories, providing systematic, harmonised and large-scale data computation. Methods: The workbench was developed with AIBench, an open-source Java desktop application framework for scientific software development in the domain of translational biomedicine. The implementation favours free and open-source third-party software, such as the R statistical package, and uses the Web services of the BiofOmics database to enable public experiment deposition. Results: First, we summarise the novel, free, open, XML-based interchange format for encoding biofilms experimental data. Then, we describe the execution of common scenarios of operation with the new workbench, such as the creation of new experiments, the importation of data from Excel spreadsheets, the computation of analytical results, the on-demand and highly customised construction of Web publishable reports, and the comparison of results between laboratories. Conclusions: A considerable and varied amount of biofilms data is being generated, and there is a critical need to develop bioinformatics tools that expedite the interchange and comparison of microbiological and clinical results among laboratories. We propose a simple, open-source software infrastructure which is effective, extensible and easy to understand. The workbench is freely available for non-commercial use at http://sing.ei.uvigo.es/bew under LGPL license.
Article
Perhaps the most common way of presenting proteomics data, and indeed life sciences data in general, is in some form of table. And while tables can be very informative and contain lots of information, the format can be challenging to interpret visually. An elegant and efficient solution is to extend the textual and numerical information with an additional visual layer, referred to as sparklines, making it intuitive to draw inferences about the properties of the underlying data. We here present a free and open source Java library called JSparklines (http://jsparklines.googlecode.com) that allows straightforward addition of a substantial list of customizable sparklines to tabular data representations, and we show examples of how these sparklines greatly simplify the interpretation of the tabular data. This article is protected by copyright. All rights reserved.