iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources

Center for Computational Biology, University of California, Los Angeles, Los Angeles, California, United States of America.
PLoS ONE (Impact Factor: 3.23). 02/2008; 3(5):e2265. DOI: 10.1371/journal.pone.0002265
Source: PubMed


The advancement of computational biology hinges on progress in three fundamental directions--the development of new computational algorithms, the availability of informatics resource management infrastructures and the capability of tools to interoperate and synergize. The rapid proliferation of algorithms and tools for computational biology makes it difficult for biologists to find, compare and integrate such resources. We describe a new infrastructure, iTools, for managing the query, traversal and comparison of diverse computational biology resources. Specifically, iTools stores information about three types of resources--data, software tools and web-services. The iTools design, implementation and resource meta-data content reflect the broad research, computational, applied and scientific expertise available at the seven National Centers for Biomedical Computing. iTools provides a system for classification, categorization and integration of different computational biology resources across space and time scales, biomedical problems, computational infrastructures and mathematical foundations. A large number of resources are already iTools-accessible to the community, and this collection is growing rapidly. iTools includes human and machine interfaces to its resource meta-data repository; investigators or computer programs may use these interfaces to search, compare, expand, revise and mine meta-data descriptions of existing computational biology resources. We propose two ways to browse and display the iTools dynamic collection of resources: the first is based on an ontology of computational biology resources, and the second is derived from hyperbolic projections of manifolds or complex structures onto planar discs. iTools is an open-source project in terms of both its source code and its meta-data content, and it employs a decentralized, portable, scalable and lightweight framework for long-term resource management. We demonstrate several applications of iTools as a framework for integrated bioinformatics. iTools and the complete details about its specifications, usage and interfaces are available at the iTools web page.
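
The hyperbolic-disc browsing mode mentioned above follows the classic hyperbolic tree-browser idea: a resource ontology is laid out in the hyperbolic plane and projected onto the unit (Poincaré) disc, so the category in focus occupies most of the view while the rest of the collection stays visible near the rim. The sketch below is illustrative only, not iTools code; the toy ontology and the compression constant k are assumptions.

```python
import math

# Toy ontology: each key maps a resource category to its sub-categories.
# This tree is a made-up example, not the actual iTools ontology.
ONTOLOGY = {
    "resources": ["data", "software tools", "web-services"],
    "data": ["genomic", "imaging"],
    "software tools": ["alignment", "visualization"],
    "web-services": [],
    "genomic": [], "imaging": [], "alignment": [], "visualization": [],
}

def hyperbolic_layout(node, depth=0, lo=0.0, hi=2 * math.pi, k=0.6, pos=None):
    """Place each node at radius tanh(k * depth) inside the unit disc.

    tanh() compresses unbounded hyperbolic distance into [0, 1), which is
    what keeps deep sub-trees visible near the rim of the disc.
    """
    if pos is None:
        pos = {}
    r = math.tanh(k * depth)          # radial coordinate in the Poincaré disc
    theta = (lo + hi) / 2.0           # centre of this node's angular wedge
    pos[node] = (r * math.cos(theta), r * math.sin(theta))
    children = ONTOLOGY.get(node, [])
    if children:
        step = (hi - lo) / len(children)   # split the wedge among children
        for i, child in enumerate(children):
            hyperbolic_layout(child, depth + 1, lo + i * step,
                              lo + (i + 1) * step, k, pos)
    return pos

if __name__ == "__main__":
    for name, (x, y) in hyperbolic_layout("resources").items():
        print(f"{name:15s} -> ({x:+.3f}, {y:+.3f})")
```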

    • "These can only be processed using efficient and structured systems. Efficient and effective tool interoperability is critical in many scientific endeavors as it enables new types of analyses, facilitates new applications, and promotes interdisciplinary collaborations (Dinov et al., 2008). The Pipeline Environment (Rex et al., 2003; Dinov et al., 2009) is a visual programming language and execution environment that enables the construction of complete study designs and management of data provenance in the form of complex graphical workflows. "
    ABSTRACT: Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate and disseminate novel scientific methods, computational resources and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval and aggregation. Computational processing involves the software, hardware and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical and phenotypic data and meta-data. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at the University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. A unique feature of this architecture is the Pipeline environment, which integrates data management, processing, transfer and visualization. Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects that transfer the execution instructions and user specifications from the client machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications built on this infrastructure.
    Frontiers in Neuroinformatics 04/2014; 8:41. DOI:10.3389/fninf.2014.00041 · 3.26 Impact Factor
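
    The portable XML workflow objects described in this abstract can be pictured as a serialized graph of processing modules plus the edges that bind one module's output to another's input. The snippet below builds and serializes such a graph with Python's standard library; every element and attribute name here is invented for illustration and does not follow the actual LONI Pipeline file format.

    ```python
    import xml.etree.ElementTree as ET

    # Build a toy two-step workflow: skull stripping feeding tissue segmentation.
    # All tag and attribute names are hypothetical, chosen for readability;
    # the real LONI Pipeline XML schema differs.
    workflow = ET.Element("workflow", name="t1_preprocessing")

    strip = ET.SubElement(workflow, "module", id="m1", executable="skull_strip")
    ET.SubElement(strip, "input", name="t1_volume", value="subject001_T1.nii")

    ET.SubElement(workflow, "module", id="m2", executable="tissue_segment")
    # An edge records that module m2 consumes the output of module m1.
    ET.SubElement(workflow, "edge", source="m1", target="m2")

    # In a client-server setting, this serialized string is what travels from
    # the user's machine to a remote pipeline server for distributed execution.
    print(ET.tostring(workflow, encoding="unicode"))
    ```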
    • "Table 1 summarizes various efforts to develop environments for tool integration, interoperability and meta-analysis [13]. There is a clear need to establish tool interoperability as it enables new types of analyses, facilitates new applications, and promotes interdisciplinary collaborations [14]. Compared to other environments, the LONI Pipeline offers several advantages, including a distributed grid-enabled client-server infrastructure and efficient deployment of new resources to the community: new tools need not be recompiled, migrated or altered to be made functionally available to the community. "
    ABSTRACT: Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges--management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at
    PLoS ONE 09/2010; 5(9). DOI:10.1371/journal.pone.0013070 · 3.23 Impact Factor
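
    Of the features listed in this abstract, data provenance has the simplest mechanical core: each derived file carries a record of the inputs, tool, version and parameters that produced it, so any result can be traced back and the study replicated. The dataclass below is a minimal, hypothetical sketch of such a record, not the Pipeline environment's actual provenance format.

    ```python
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class ProvenanceRecord:
        """One step in a derivation chain; all field names are illustrative."""
        output_file: str
        tool: str
        tool_version: str
        inputs: list
        parameters: dict = field(default_factory=dict)
        timestamp: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat())

    # Chaining records by file name reconstructs the full history of a result.
    step1 = ProvenanceRecord("sub01_stripped.nii", "skull_strip", "2.1",
                             inputs=["sub01_T1.nii"])
    step2 = ProvenanceRecord("sub01_seg.nii", "tissue_segment", "1.4",
                             inputs=[step1.output_file],
                             parameters={"tissue_classes": 3})

    for step in (step1, step2):
        print(f"{step.tool} {step.tool_version}: "
              f"{', '.join(step.inputs)} -> {step.output_file}")
    ```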
    • "A consortium composed of the seven US National Centers for Biomedical Computing [16] has recently developed another index of bioinformatics resources called iTools [17,18]. Web services are used here to provide access to resources that are annotated according to their functionality. "
    ABSTRACT: The rapid evolution of Internet technologies and the collaborative approaches that dominate the field have stimulated the development of numerous bioinformatics resources. In response, several initiatives have tried to organize these services and resources. In this paper, we present the BioInformatics Resource Inventory (BIRI), a new approach for automatically discovering and indexing available public bioinformatics resources using information extracted from the scientific literature. The generated index can be updated automatically by processing additional manuscripts that describe new resources. We have developed web services and applications to test and validate our approach. BIRI has not been designed to replace current indexes but to extend their capabilities with richer functionalities. We developed a web service that provides a set of high-level query primitives to access the index; it can be used by third-party web services or web-based applications. To test the web service, we created a pilot web application to access a preliminary knowledge base of resources. We tested our tool using an initial set of 400 abstracts. Almost 90% of the resources described in the abstracts were correctly classified, and more than 500 descriptions of functionalities were extracted. These experiments suggest the feasibility of our approach for automatically discovering and indexing current and future bioinformatics resources. Given the domain-independent characteristics of this tool, the authors are currently applying it in other areas, such as medical nanoinformatics. BIRI is available at
    BMC Bioinformatics 10/2009; 10(1):320. DOI:10.1186/1471-2105-10-320 · 2.58 Impact Factor
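
    The roughly 90% accuracy reported above comes from assigning each abstract to a resource category based on its text. The sketch below shows the general shape of such a classifier using scikit-learn's bag-of-words features and multinomial naive Bayes; it is an independent illustration of the approach, not BIRI's actual extraction method, and the tiny training set is fabricated.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy training data: (abstract fragment, resource category).
    # These examples are fabricated for illustration only.
    train_texts = [
        "a database of curated protein interactions with a web query interface",
        "we present a standalone tool for multiple sequence alignment",
        "a SOAP web service exposing gene annotation retrieval operations",
        "an online repository storing microarray expression datasets",
    ]
    train_labels = ["database", "tool", "web service", "database"]

    # Bag-of-words features feeding a multinomial naive Bayes classifier.
    classifier = make_pipeline(CountVectorizer(stop_words="english"),
                               MultinomialNB())
    classifier.fit(train_texts, train_labels)

    new_abstract = ["we describe a command line tool that aligns short reads"]
    print(classifier.predict(new_abstract))   # expected: ['tool'] on this toy data
    ```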