Applications of the pipeline environment for visual informatics and genomics computations

Laboratory of Neuro Imaging (LONI), University of California, Los Angeles, Los Angeles, CA 90095, USA.
BMC Bioinformatics (Impact Factor: 2.58). 07/2011; 12(1):304. DOI: 10.1186/1471-2105-12-304
Source: PubMed


Contemporary informatics and genomics research requires efficient, flexible and robust management of large heterogeneous data, advanced computational tools, powerful visualization, reliable hardware infrastructure, interoperability of computational resources, and detailed data and analysis-protocol provenance. The Pipeline is a client-server distributed computational environment that facilitates the graphical construction, execution, monitoring, validation and dissemination of advanced data analysis protocols.
This paper reports on applications of the LONI Pipeline environment to two informatics challenges: graphical management of diverse genomics tools, and the interoperability of informatics software. Specifically, this manuscript presents the concrete details of deploying general informatics suites and individual software tools to new hardware infrastructures, the design, validation and execution of new visual analysis protocols via the Pipeline graphical interface, and the integration of diverse informatics tools via the Pipeline eXtensible Markup Language syntax. We demonstrate each of these processes using several established informatics packages (e.g., miBLAST, EMBOSS, mrFAST, GWASS, MAQ, SAMtools, Bowtie) for basic local sequence alignment and search, molecular biology data analysis, and genome-wide association studies. These examples demonstrate the power of the Pipeline graphical workflow environment to enable integration of bioinformatics resources that provide a well-defined syntax for dynamic specification of the input/output parameters and the run-time execution controls.
The LONI Pipeline environment provides a flexible graphical infrastructure for efficient biomedical computing and distributed informatics research. The interactive Pipeline resource manager enables the utilization and interoperability of diverse types of informatics resources. The Pipeline client-server model provides computational power to a broad spectrum of informatics investigators: experienced developers and novice users, users with or without access to advanced computational resources (e.g., Grid, data), as well as basic and translational scientists. The open development, validation and dissemination of computational networks (pipeline workflows) facilitate the sharing of knowledge, tools, protocols and best practices, and enable the unbiased validation and replication of scientific findings by the entire community.
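
The idea of a workflow as a computational network, in which each tool is a node that runs once its upstream results are available, can be illustrated with a minimal Python sketch. This is not the LONI Pipeline API or its XML syntax. The function `run_workflow`, the module names (`align`, `sort_hits`, `summarize`), and the toy reference string are all hypothetical, chosen only to show dependency-ordered execution using the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(modules, dependencies, inputs):
    """Run each module after all of its upstream nodes have produced results.

    modules:      node name -> function taking a dict of upstream results
    dependencies: node name -> set of upstream node names
    inputs:       node name -> value supplied by the user (workflow inputs)
    """
    results = dict(inputs)
    executed = []
    # static_order() yields every node only after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:
            continue  # a workflow input, nothing to execute
        upstream = {dep: results[dep] for dep in dependencies[name]}
        results[name] = modules[name](upstream)
        executed.append(name)
    return results, executed


# Toy stand-ins for real tools (hypothetical; no external software is called).
REFERENCE = "ACGTTGCAACGT"


def align(up):
    # Exact-match "alignment": position of each read in the reference, -1 if absent.
    return [(read, REFERENCE.find(read)) for read in up["reads"]]


def sort_hits(up):
    # Keep mapped reads only, ordered by reference position.
    return sorted((hit for hit in up["align"] if hit[1] >= 0), key=lambda hit: hit[1])


def summarize(up):
    return {"mapped": len(up["sort_hits"]), "total": len(up["reads"])}


modules = {"align": align, "sort_hits": sort_hits, "summarize": summarize}
dependencies = {
    "align": {"reads"},
    "sort_hits": {"align"},
    "summarize": {"sort_hits", "reads"},
}

results, executed = run_workflow(
    modules, dependencies, {"reads": ["TTGC", "ACGT", "GGGG"]}
)
print(executed)
print(results["summarize"])
```

Separating the dependency graph from the module implementations mirrors the general design of graphical workflow environments: the graph can be validated and reordered without touching the tools themselves. A production system would add remote execution, provenance tracking, and error handling, none of which is attempted here.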

    • "The flexibility of the pipeline facilitates the implementation of new analytical strategies directly from the interface. Other groups have independently developed a genomic pipeline using LONI, supporting the utility of this resource for sequencing data (Dinov et al., 2011; Torri et al., 2012; Figure 6). "
    ABSTRACT: Next Generation Sequencing studies generate a large quantity of genetic data in a relatively cost- and time-efficient manner and provide an unprecedented opportunity to identify candidate causative variants that lead to disease phenotypes. A challenge to these studies is the generation of sequencing artifacts by current technologies. To identify and characterize the properties that distinguish false positive variants from true variants, we sequenced a child and both parents (trio) using DNA isolated from three sources (blood, buccal cells, and saliva). The trio strategy allowed us to identify variants in the proband that could not have been inherited from the parents (Mendelian errors) and would most likely indicate sequencing artifacts. Quality control measurements were examined and three measurements were found to identify the greatest number of Mendelian errors. These included read depth, genotype quality score, and alternate allele ratio. Filtering the variants on these measurements removed ~95% of the Mendelian errors while retaining 80% of the called variants. These filters were applied independently. After filtering, the concordance between identical samples isolated from different sources was 99.99% as compared to 87% before filtering. This high concordance suggests that different sources of DNA can be used in trio studies without affecting the ability to identify causative polymorphisms. To facilitate analysis of next generation sequencing data, we developed the Cincinnati Analytical Suite for Sequencing Informatics (CASSI) to store sequencing files and metadata (e.g. relatedness information), to handle file versioning, data filtering and variant annotation, and to identify candidate causative polymorphisms that follow either de novo, rare recessive homozygous or compound heterozygous inheritance models. We conclude that the data cleaning process improves the signal-to-noise ratio in terms of variants and facilitates the identification of candidate disease causative polymorphisms.
    Full-text · Article · Feb 2014 · Frontiers in Genetics
    • "In the current study, we used the Laboratory of Neuroimaging (LONI) Pipeline [19], [20] for image preprocessing, volumetric analysis and cortical thickness (CT) analysis. We focused on differences of local morphologic brain alterations between UC and healthy control subjects (HCs), and compared them to findings in IBS subjects. "
    ABSTRACT: Regional cortical thickness alterations have been reported in many chronic inflammatory and painful conditions, including inflammatory bowel diseases (IBD) and irritable bowel syndrome (IBS), even though the mechanisms underlying such neuroplastic changes remain poorly understood. In order to better understand the mechanisms contributing to grey matter changes, the current study sought to identify the differences in regional alterations in cortical thickness between healthy controls and two chronic visceral pain syndromes, with and without chronic gut inflammation. 41 healthy controls, 11 IBS subjects with diarrhea, and 16 subjects with ulcerative colitis (UC) underwent high-resolution T1-weighted magnetization-prepared rapid acquisition gradient echo scans. Structural image preprocessing and cortical thickness analysis within the regions of interest were performed using the Laboratory of Neuroimaging Pipeline. Group differences were determined using the general linear model and linear contrast analysis. The two disease groups differed significantly in several cortical regions. UC subjects showed greater cortical thickness in anterior cingulate cortical subregions, and in primary somatosensory cortex compared with both IBS and healthy subjects. Compared with healthy subjects, UC subjects showed lower cortical thickness in orbitofrontal cortex and in mid and posterior insula, while IBS subjects showed lower cortical thickness in the anterior insula. Large effects of correlations between symptom duration and thickness in the orbitofrontal cortex and postcentral gyrus were only observed in UC subjects. The findings suggest that the mechanisms underlying the observed gray matter changes in UC subjects represent a consequence of peripheral inflammation, while in IBS subjects central mechanisms may play a primary role.
    Full-text · Article · Jan 2014 · PLoS ONE
    • "We applied network analysis to obtain new insights about large-scale regional connectivity and to compare morphological brain architectures and network properties between groups of subjects with IBS and HCs. To assist in our large-scale analyses we employed the Laboratory of Neuro Imaging (LONI) pipeline [39] [41] [135], a graphical workflow environment that allows users to describe executable tools in a graphical user interface and create processing modules as nodes in a graph representing the complete computational protocol [40] [42]. We provide evidence for regional alterations in GM volume as well as differences in the regional properties of large-scale structural brain networks in subjects with IBS compared to HCs. "
    ABSTRACT: Alterations in gray matter (GM) density/volume and cortical thickness (CT) have been demonstrated in small and heterogeneous samples of subjects with different chronic pain syndromes, including irritable bowel syndrome (IBS). Aggregating across 7 structural neuroimaging studies conducted at UCLA between August 2006 and April 2011, we examined group differences in regional GM volume in 201 predominantly premenopausal female subjects (82 IBS, mean age: 32 ± 10 SD, 119 Healthy Controls [HCs], 30 ± 10 SD). Applying graph theoretical methods and controlling for total brain volume, global and regional properties of large-scale structural brain networks were compared between IBS and HC groups. Relative to HCs, the IBS group had lower volumes in bilateral superior frontal gyrus, bilateral insula, bilateral amygdala, bilateral hippocampus, bilateral middle orbital frontal gyrus, left cingulate, left gyrus rectus, brainstem, and left putamen. Higher volume was found for the left postcentral gyrus. Group differences were no longer significant for most regions when controlling for Early Trauma Inventory global score with the exception of the right amygdala and the left postcentral gyrus. No group differences were found for measures of global and local network organization. Compared to HCs, the right cingulate gyrus and right thalamus were identified as significantly more critical for information flow. Regions involved in endogenous pain modulation and central sensory amplification were identified as network hubs in IBS. Overall, evidence for central alterations in IBS was found in the form of regional GM volume differences and altered global and regional properties of brain volumetric networks.
    Full-text · Article · Sep 2013 · Pain