ABSTRACT: Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays
has the potential to provide a systems-level approach to predicting treatment response and disease progression, and to developing
precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources;
however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay
genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population
genomics, regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that
are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for
accessing publicly available data and tools to support development of needed integrative methods.
Full-text · Article · Oct 2015 · Briefings in Bioinformatics
ABSTRACT: Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors.
ABSTRACT: This paper reviews strategies for solving problems encountered when analyzing
large genomic data sets and describes the implementation of those strategies in
R by packages from the Bioconductor project. We treat the scalable processing,
summarization and visualization of big genomic data. The general ideas are well
established and include restrictive queries, compression, iteration and
parallel computing. We demonstrate the strategies by applying Bioconductor
packages to the detection and analysis of genetic variants from a whole genome
ABSTRACT: VariantAnnotation is an R / Bioconductor (Gentleman et al., 2004) package for the exploration and annotation of genetic variants. Capabilities exist for reading, writing and filtering Variant Call Format (VCF) files. VariantAnnotation allows ready access to additional R / Bioconductor facilities for advanced statistical analysis, data transformation, visualization, and integration with diverse genomic resources.
This package is implemented in R and available for download at the Bioconductor web site. The package contains extensive help pages for individual functions, and a 'vignette' outlining typical work flows; it is made available under the open source 'Artistic-2.0' license. Version 1.9.38 was used in this manuscript.
ABSTRACT: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources.
Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows.
ABSTRACT: This paper reviews the central concepts and implementation of data structures and methods for studying genetics of gene expression
with the GGtools package of Bioconductor. Illustration with a HapMap+expression dataset is provided.
Availability: Package GGtools is part of Bioconductor 1.9 (http://bioconductor.org). Open source with Artistic License.