BIOINFORMATICS
Friend, an Integrated Analytical Front-end Application for Bioinformatics
Alexej Abyzov, Mounir Errami, Chesley M. Leslin, and Valentin A. Ilyin*
Department of Biology, Northeastern University, Boston, Massachusetts, USA
Received on ; revised on ; accepted on
Advance Access publication . . .
ABSTRACT
Friend is a bioinformatics application designed for simultaneous
analysis and visualization of multiple structures and sequences of
proteins and/or DNA/RNA. The application provides basic
functionalities, such as structure visualization with different
rendering and coloring, sequence alignment, and simple phylogeny
analysis, along with a number of extended features for more complex
analyses of sequence-structure relationships, including structural
alignment of proteins, investigation of specific interaction motifs,
and studies of protein-protein and protein-DNA interactions and of
protein super-families. It is also useful for functional annotation of
proteins, protein modeling, and protein folding studies. Friend
provides three levels of usage: 1) an extensive GUI for a scientist
with no programming experience, 2) a command line interface for
scripting for a scientist with some programming experience, and 3) the
ability to extend Friend with user-written libraries for an
experienced programmer. The application is linked to, and communicates
with, local and remote sequence and structure databases.
Availability: http://mozart.bio.neu.edu/friend
Contact: ilyin@neu.edu, abyzov@mozart.bio.neu.edu
Many areas of research in biology, biomedicine and bioinformatics
require tools to perform comparative analytical studies of DNA and
protein sequence and structure families. There are numerous examples
where sequence-structure-function research is essential, including the
analysis of conserved positions, active and binding site residues,
variations in orthologous and paralogous sequences and structures,
sequence alignments, structural alignments and classification,
identification of the key residues for protein functionality and
formation of macromolecular complexes, phylogenetic tree analysis, and
simple visual inspection. Therefore, extended applications that
integrate data from different sources, facilitate visualization and
assist in analytical studies are an everyday need in biological,
biochemical, molecular biology and related research areas.
Integration of different types of data is challenging and not always
straightforward. Despite the significant variety of applications in
each particular research area (sequence analysis, structure studies,
and phylogeny), only a few applications integrate the data, for
example Jalview (Clamp et al., 2004), Cn3D (Wang et al., 2000), Pfaat
(Johnson et al., 2003), and DeepView (Schwede et al., 2003). One more
attempt to integrate sequence and structure analysis, ModView, a
Netscape plug-in, has been recently reported (Ilyin et al., 2003).
While the integrative analytical ability of the application has been
found useful for protein analysis and the plug-in is downloaded
constantly by Linux/Netscape version 4 users, ModView has a clear
downside: it depends strongly on the evolution of the Netscape
browser. In particular, Netscape does not provide backward
compatibility in its new versions and has restricted data handling
abilities. To avoid dependence on any third-party software, a new
stand-alone program, Friend, has been developed, which itself
communicates with internet resources and provides data integration
between structure and sequence modules and local user files (see
Figure 1). The majority of ModView's functionalities have been
re-implemented in Friend, and many new options have been added,
including a user's ability to extend Friend with new functionality
through dynamic libraries, i.e. its own plug-ins.
* To whom correspondence should be addressed: ilyin@neu.edu.
Figure 1. General architecture of Friend.
[Figure 1 diagram labels: Friend; C++ Structure Module; Java Sequence
Module; C++ main; J-main; JSQL-C; C plugin; J plugin; Text bio-DBs; SQL
bio-DBs; Bioinformatics Applications Bridge; Linux/Unix; Windows; Mac.]
© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
Bioinformatics Advance Access published August 2, 2005
Friend is a stand-alone, multi-module, web-related application,
designed to be a front-end user interface for visual and analytical
studies on multiple sequences and multiple structures in real time on
a personal computer (see Figure 2). Both structure and sequence data
are linked to local and remote databases, providing researchers with a
comprehensive picture of related proteins. Friend allows a user to
visualize and manipulate hundreds of spatial protein or DNA/RNA
structures and sequences, and provides an easy-to-use interface.
There are more than 200 basic commands, including various coloring and
rendering of the protein structure, exhaustive selection of subsets of
atoms and residues, stereo viewing of structures, interactive
atom/residue identification and labeling, alignment editing, phylogeny
clustering and tree viewing, pairwise sequence alignment, cross
coloring of sequence and structures, changing the representation of a
residue in the structure by simply clicking on it in the sequence,
saving the results in a variety of text and graphical formats, and
producing publication-quality pictures. Several popular alignment and
structure representation formats are supported, including PDB, FASTA,
PIR, CLUSTAL, and SKY. The SKY format was created to meet the demand
for saving and linking sequence and structure data together.
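The link between an alignment column and a residue in a structure,
which underlies cross coloring and selecting a residue by clicking on
its position in the sequence, can be pictured as an index mapping. The
following is a minimal sketch of that idea, not Friend's own
implementation: the function names, the 0-based indexing, and the set
of gap characters are assumptions made for illustration.

```python
from typing import List, Optional

GAP_CHARS = "-."  # assumed gap symbols; alignment formats differ


def column_to_residue_index(aligned: str) -> List[Optional[int]]:
    """For each alignment column, give the 0-based index of the residue
    in the ungapped sequence, or None where the column holds a gap."""
    mapping: List[Optional[int]] = []
    residue = 0
    for symbol in aligned:
        if symbol in GAP_CHARS:
            mapping.append(None)
        else:
            mapping.append(residue)
            residue += 1
    return mapping


def residues_in_columns(aligned: str, first: int, last: int) -> List[int]:
    """Residue indices covered by alignment columns first..last (inclusive)."""
    mapping = column_to_residue_index(aligned)
    return [i for i in mapping[first:last + 1] if i is not None]
```

With such a mapping, a selection made over alignment columns can be
translated into the residues to recolor or re-render in the structure
view, and gap columns simply select nothing.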
Figure 2. Friend displays structures, sequences and phylogeny tree for 137
lysozymes.
The advantage of Friend is its three levels of usage: 1) for a
scientist with no programming experience, Friend provides an extensive
GUI (described in the program manual); 2) for a scientist with some
programming experience, it provides the possibility to create
user-defined menus in an XML file in order to use and combine any of
the more than 200 commands; and 3) for a programmer, it provides the
ability to extend Friend with user-written libraries. This is
accomplished through abstract classes that provide functions to access
and manipulate internal application data. Using the abstract
interfaces, one can write code and compile it into a dynamically
loaded library, which is loaded and executed at run time. The absence
or presence of the library does not affect the functionality of the
Friend core. The opportunity to add user-written libraries allows
users to perform more complex and specific analyses of the studied
object in a time-efficient manner, since there is no need to develop
basic routine functions; it also ensures that code developed by
different people does not interfere. As an example, the TOPOFIT method
for the structural alignment of proteins (Ilyin et al., 2004) has been
implemented as a separate dynamically loaded library.
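The extension mechanism described above (abstract interfaces to the
application data, plus libraries loaded at run time whose absence
leaves the core unaffected) can be sketched in a few lines. Friend
itself uses compiled dynamic libraries; the sketch below substitutes
Python's run-time module import for that mechanism, and every name in
it (`StructureData`, `Plugin`, `create_plugin`, `load_plugin`) is
hypothetical rather than part of Friend's interface.

```python
import importlib
from abc import ABC, abstractmethod
from typing import List, Optional


class StructureData(ABC):
    """Hypothetical abstract view of the host application's internal data."""

    @abstractmethod
    def residue_names(self) -> List[str]:
        ...


class Plugin(ABC):
    """Hypothetical interface that every extension must implement."""

    name: str = "unnamed"

    @abstractmethod
    def run(self, data: StructureData) -> str:
        ...


def load_plugin(module_name: str) -> Optional[Plugin]:
    """Import a plug-in module at run time.

    Returns None when the module is absent or does not follow the
    assumed convention, so the core keeps working without it.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    factory = getattr(module, "create_plugin", None)  # assumed entry point
    if factory is None:
        return None
    plugin = factory()
    return plugin if isinstance(plugin, Plugin) else None
```

Because a plug-in sees the application only through the abstract
classes, it can reuse the host's basic routines without touching the
core, and a missing or malformed plug-in degrades to "feature not
available" rather than to a failure of the application.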
Another powerful feature of the Friend application is its ability to
provide an interface to various sequence and structure databases and
other bioinformatics applications. Friend is used as a visual
front-end interface to the Structural Exon Database, SEDB (Leslin et
al., 2004), and to a database for mapping non-synonymous SNPs, StSNP
(Uzun et al., 2004). Internal integrated client modules allow a user
to perform similarity searches using BLAST and to load protein
structures from the PDB on the fly. Friend also provides an interface
to the homology modeling software MODELLER (Sali and Blundell, 1993)
and to the multiple sequence alignment program ClustalW (Jeanmougin et
al., 1998). The QHULL library (Barber et al., 1996) for fast Delaunay
tessellation (Delaunay, 1934) is integrated in Friend to analyze
protein and DNA/RNA atom-atom, residue-residue, residue-base, and
base-base interactions, along with visualization of the tessellation
in different views.
Friend is an ongoing project; future directions include development of
several additional modules to broaden the number of database
interfaces (to NCBI databases, SWISS-PROT, UniProt, etc.), porting it
to the Mac OS, and supplying the application as a plug-in for a number
of popular browsers (Internet Explorer, Netscape, Mozilla, and
Firefox). Friend has been used extensively in a Bioinformatics course
and has had a constant rate of outside installations of ~25-30 per
month since January 2003.
REFERENCES
Barber,C.B., Dobkin, D. P. and Huhdanpaa, H. T. (1996). The Quickhull
Algorithm for Convex Hulls. ACM Trans Math Softw 22, 469-483.
http://www.qhull.org
Clamp,M., Cuff, J., Searle, S. M. and Barton, G. J. (2004). The Jalview
Java Alignment Editor. Bioinformatics 20, 426-427.
Delaunay,B. (1934). Sur La Sphere Vide. Bull Acad Science USSR VII:
Class Sci Mat Nat 793-800.
Ilyin,V.A., Abyzov, A. and Leslin, C. M. (2004). Structural Alignment
of Proteins by a Novel TOPOFIT Method, As a Superimposition of Common
Volumes at a Topomax Point. Protein Sci 13, 1865-1874.
Jeanmougin,F., Thompson, J. D., Gouy, M., Higgins, D. G. and Gibson, T.
J. (1998). Multiple Sequence Alignment With Clustal X. Trends Biochem
Sci 23, 403-405.
Johnson,J.M., Mason, K., Moallemi, C., Xi, H., Somaroo, S. and Huang,
E. S. (2003). Protein Family Annotation in a Multiple Alignment
Viewer. Bioinformatics 19, 544-545.
Leslin,C.M., Abyzov, A. and Ilyin, V. A. (2004). Structural Exon
Database, SEDB, Mapping Exon Boundaries on Multiple Protein
Structures. Bioinformatics.
Sali,A. and Blundell, T. L. (1993). Comparative Protein Modelling by
Satisfaction of Spatial Restraints. J Mol Biol 234, 779-815.
Schwede,T., Kopp, J., Guex, N. and Peitsch, M. C. (2003). SWISS-MODEL:
An Automated Protein Homology-Modeling Server. Nucleic Acids Res 31,
3381-3385.
Uzun,A., Leslin, C. and Ilyin, V. A. (2004). StSNP: Structure SNP, a
Database for Mapping Nonsynonymous SNPs Onto Spatial Structures of
Proteins. ASBMB Annual Meeting and 8th IUBMB Conference, 12-6-2004.
Wang,Y., Geer, L. Y., Chappey, C., Kans, J. A. and Bryant, S. H.
(2000). Cn3D: Sequence and Structure Views for Entrez. Trends Biochem
Sci 25, 300-302.