ABSTRACT: Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors.
ABSTRACT: This paper reviews strategies for solving problems encountered when analyzing large genomic data sets and describes the implementation of those strategies in R by packages from the Bioconductor project. We treat the scalable processing, summarization and visualization of big genomic data. The general ideas are well established and include restrictive queries, compression, iteration and parallel computing. We demonstrate the strategies by applying Bioconductor packages to the detection and analysis of genetic variants from a whole genome
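Two of the strategies named in this abstract, restrictive queries and iteration, can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use any Bioconductor package; the helper names `iter_chunks` and `count_in_region`, the chunk size, and the toy records are all hypothetical and exist only for illustration.

```python
from itertools import islice


def iter_chunks(records, chunk_size):
    """Yield successive lists of at most chunk_size records (iteration)."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def count_in_region(records, chrom, start, end, chunk_size=2):
    """Count records inside one region, holding only one chunk in memory."""
    total = 0
    for chunk in iter_chunks(records, chunk_size):
        # Restrictive query: keep only records on chrom within [start, end].
        total += sum(1 for c, pos in chunk if c == chrom and start <= pos <= end)
    return total


# Hypothetical (chromosome, position) records standing in for a large file.
toy_records = [
    ("chr1", 100),
    ("chr1", 250),
    ("chr2", 120),
    ("chr1", 900),
    ("chr2", 300),
]
n_hits = count_in_region(toy_records, "chr1", 50, 500)
print(n_hits)
```

Because each chunk is summarized and then discarded, memory use depends on the chunk size rather than on the size of the input, which is the point of the iteration strategy.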
ABSTRACT: VariantAnnotation is an R / Bioconductor (Gentleman et al., 2004) package for the exploration and annotation of genetic variants. Capabilities exist for reading, writing and filtering Variant Call Format (VCF) files. VariantAnnotation allows ready access to additional R / Bioconductor facilities for advanced statistical analysis, data transformation, visualization, and integration with diverse genomic resources.
This package is implemented in R and available for download at the Bioconductor(1) web site. The package contains extensive help pages for individual functions, and a 'vignette' outlining typical work flows; it is made available under the open source 'Artistic-2.0' license. Version 1.9.38 was used in this manuscript.
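To make concrete what filtering a VCF file involves, here is a minimal sketch in plain Python. It is not the VariantAnnotation API; the function name `filter_vcf`, the quality threshold, and the toy records are assumptions made for illustration. The only fact relied on is the VCF layout itself: header lines start with `#`, and QUAL is the sixth tab-separated column of each record.

```python
import io


def filter_vcf(lines, min_qual):
    """Yield header lines unchanged and records with QUAL >= min_qual."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Meta-information and column header lines pass through untouched.
            yield line
            continue
        fields = line.split("\t")
        qual = fields[5]  # QUAL is the sixth fixed VCF column
        # A "." means the quality is missing; such records are dropped here.
        if qual != "." and float(qual) >= min_qual:
            yield line


# Hypothetical in-memory VCF standing in for a file on disk.
toy_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t200\t.\tC\tT\t10\tPASS\t.\n"
    "chr2\t300\t.\tG\tA\t.\tPASS\t.\n"
)
kept = list(filter_vcf(toy_vcf, min_qual=30))
print("\n".join(kept))
```

The generator reads one line at a time, so the same function would work on an open file handle without loading the whole file.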
ABSTRACT: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources.
Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows.
ABSTRACT: This paper reviews the central concepts and implementation of data structures and methods for studying genetics of gene expression with the GGtools package of Bioconductor. Illustration with a HapMap+expression dataset is provided. Availability: Package GGtools is part of Bioconductor 1.9 (http://bioconductor.org). Open source with Artistic License.