All content in this area was uploaded by Michael T. Wolfinger on Feb 21, 2015
ViennaNGS Executive Summary
ViennaNGS is a Perl distribution for building efficient next-generation
sequencing (NGS) data analysis pipelines, integrating high-level
routines and wrapper functions for common NGS processing tasks.
• Project started in summer 2014
• Not an established pipeline per se; rather, it provides tools and functionality
for developing custom NGS pipelines in Perl
• Provides modular and reusable code for NGS processing
ViennaNGS implements thematically
related functionality in different Perl
modules and classes under the Bio
namespace, partly building on BioPerl
and the Moose object framework.
ViennaNGS Components
ViennaNGS Module Overview 1/3
Bio::ViennaNGS::AnnoC
Lightweight interface for conversion of sequence annotation data
Bio::ViennaNGS::Bam
High-level manipulation of BAM files
Bio::ViennaNGS::BamStat
Moose-based class for collecting mapping statistics
Bio::ViennaNGS::BamStatSummary
Interface for processing BamStat objects on multiple BAM files
Bio::ViennaNGS::Util
Wrapper routines for common third-party NGS utilities and auxiliary functions
ViennaNGS Module Overview 2/3
Bio::ViennaNGS::Expression
Compute normalized expression based on read counts
Bio::ViennaNGS::Fasta
Moose wrapper for Bio::DB::Fasta
Bio::ViennaNGS::Bed
Convenience class for handling genomic interval data in BED format
Bio::ViennaNGS::SpliceJunc
Identification and characterization of splice junctions
Bio::ViennaNGS::UCSC
Automatic generation of UCSC Assembly and Track Hubs
ViennaNGS Module Overview 3/3
Bio::ViennaNGS::MinimalFeature
Base class for handling genomic interval data
Bio::ViennaNGS::Feature
Interface for simple genomic intervals representing BED6 entries
Bio::ViennaNGS::ExtFeature
Extends BED6 elements
Bio::ViennaNGS::FeatureChain
Bundles individual Feature objects
Bio::ViennaNGS::FeatureLine
Abstract representation of transcripts, pools FeatureChain objects
Moose In 30 Seconds
use Point;
use Point3D;

my $pt2D = Point->new(x => 2,      # x:2
                      y => 4,      # y:4
                     );
$pt2D->clear();                    # x:0 y:0

my $pt3D = Point3D->new(x => 10,   # x:10
                        y => 20,   # y:20
                        z => 30    # z:30
                       );
$pt3D->clear();                    # x:0 y:0 z:0
A postmodern object system for Perl 5 that makes Object Oriented
programming easier, more consistent, and less tedious
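package Point;
use Moose;

# Read-write integer attributes with generated accessors
has 'x' => (is => 'rw', isa => 'Int');
has 'y' => (is => 'rw', isa => 'Int');

# Reset both coordinates to zero
sub clear {
    my $self = shift;
    $self->x(0);
    $self->y(0);
}

package Point3D;
use Moose;

extends 'Point';    # inherit x, y and clear() from Point

has 'z' => (is => 'rw', isa => 'Int');

# Method modifier: runs after the inherited clear() and resets z as well
after 'clear' => sub {
    my $self = shift;
    $self->z(0);
};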
The BED Annotation Format
[Figure: genome browser view of transcript AT1G04350.1 (assembly araThaTAIR10), window chr1:1,165,129-1,166,810 (1,682 bp)]
chr1 1165164 1166768 AT1G04350.1 0 + 1165295 1166538 0 3 637,322,485, 0,708,1119,
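The BED12 line above encodes the exon structure in its last three columns: the block count, the comma-separated block sizes, and the block starts relative to the start coordinate in column 2. BED coordinates are 0-based and half-open. The following is a minimal sketch in plain Perl of how these columns translate into genomic exon intervals; the helper name `bed12_blocks` is an assumption made for illustration and is not part of the ViennaNGS API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::ViennaNGS): split one BED12 line
# into its blocks (exons), returned as [start, end] pairs in 0-based,
# half-open genomic coordinates.
sub bed12_blocks {
    my ($line) = @_;
    chomp $line;
    my @f = split ' ', $line;    # split on any whitespace (tabs or spaces)
    die "expected 12 BED fields, got " . scalar(@f) . "\n" unless @f == 12;

    my ($chrom_start, $block_count) = @f[1, 9];
    my @sizes  = split /,/, $f[10];    # trailing comma yields no extra field
    my @starts = split /,/, $f[11];
    die "block count does not match block lists\n"
        unless @sizes == $block_count && @starts == $block_count;

    # Block starts are offsets relative to chromStart
    return map {
        my $start = $chrom_start + $starts[$_];
        [ $start, $start + $sizes[$_] ]
    } 0 .. $block_count - 1;
}

# The example entry from the slide
my $bed12 = 'chr1 1165164 1166768 AT1G04350.1 0 + 1165295 1166538 0 3 637,322,485, 0,708,1119,';

for my $block (bed12_blocks($bed12)) {
    my ($start, $end) = @$block;
    printf "exon %d-%d (%d nt)\n", $start, $end, $end - $start;
}
```

For this entry the last block ends at 1165164 + 1119 + 485 = 1166768, which matches the end coordinate in column 3, as it should for a well-formed BED12 line.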
Generic Feature Annotation 1/2
Bio::ViennaNGS::MinimalFeature
  has 'chromosome' => (isa => 'Str')
  has 'start' => (isa => 'Int')
  has 'end' => (isa => 'Int')
  has 'strand' => (isa => 'PlusOrMinus') # +/-/.
Bio::ViennaNGS::Feature
  extends Bio::ViennaNGS::MinimalFeature
  has 'name' => (isa => 'Str')
  has 'score' => (isa => 'Value')
Bio::ViennaNGS::ExtFeature
  extends Bio::ViennaNGS::Feature
  has 'extension' => (isa => 'Str')
Generic Feature Annotation 2/2
Bio::ViennaNGS::FeatureChain
  has 'type' => (isa => 'Str')
  has 'chain' => (isa => 'ArrayRef')
Bio::ViennaNGS::FeatureLine
  extends Bio::ViennaNGS::MinimalFeature
  has 'id' => (isa => 'Str')
  has 'fc' => (isa => 'HashRef')
★ Feature extends MinimalFeature by two attributes (name and score),
thereby representing a BED6 entry
★ FeatureChain bundles Feature elements, creating individual annotation
chains for, e.g., exons, introns, and UTRs
★ FeatureLine combines a set of individual FeatureChain objects, thereby
providing a convenient means of representing transcripts
ViennaNGS Interval Classes
ViennaNGS Documentation and Tutorials
★ ViennaNGS comes with extensive documentation based on Perl’s
POD system, thereby providing a single documentation base
★ ViennaNGS::Tutorial guides prospective users through the
development of basic NGS analysis pipelines
★ The tutorial is split into different chapters, each covering a common
use case in NGS analysis and describing a possible solution step by
step
ViennaNGS Utilities
ViennaNGS comes with a collection of complementary executable Perl scripts for
accomplishing routine tasks often required in NGS data processing.
These CLI utilities serve as reference implementations of the library routines and
can readily be used for atomic tasks in NGS data processing.
ViennaNGS Availability
The ViennaNGS Perl distribution is available from GitHub & CPAN
https://github.com/mtw/Bio-ViennaNGS
http://search.cpan.org/dist/Bio-ViennaNGS
“ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines”
M.T. Wolfinger, J. Fallmann, F. Eggenhofer, F. Amman
bioRxiv preprint DOI:10.1101/013011