Christopher Maximilian Pockrandt

Freie Universität Berlin | FUB · Institute of Computer Science

About

18 Publications
1,943 Reads
365 Citations
Citations since 2017: 363 citations across 16 research items

Publications (18)
Article
Full-text available
Kraken and KrakenUniq are widely used tools for classifying metagenomics sequences. A key requirement for these systems is a database containing all k-mers from all genomes that the users want to be able to detect, where k = 31 by default. This database can be very large, easily exceeding 100 gigabytes (GB) and sometimes 400 GB. Previo...
Preprint
Full-text available
The original CHESS database of human genes was assembled from nearly 10,000 RNA sequencing experiments in 53 human body sites produced by the Genotype-Tissue Expression (GTEx) project, and then augmented with genes from other databases to yield a comprehensive collection of protein-coding and noncoding transcripts. The construction of the new CHESS...
Article
Metagenomic experiments expose the wide range of microscopic organisms in any microbial environment through high-throughput DNA sequencing. The computational analysis of the sequencing data is critical for the accurate and complete characterization of the microbial community. To facilitate efficient and reproducible metagenomic analysis, we introdu...
Preprint
Kraken and KrakenUniq are widely used tools for classifying metagenomics sequences. A key requirement for these systems is a database containing all k-mers from all genomes that the users want to be able to detect, where k = 31 by default. This database can be very large, easily exceeding 100 gigabytes (GB) and sometimes 400 GB. Previously, Kraken an...
Article
PhyloCSF++ is an efficient and parallelized C++ implementation of the popular PhyloCSF method to distinguish protein-coding and non-coding regions in a genome based on multiple sequence alignments. It can score alignments or produce browser tracks for entire genomes in the wig file format. Additionally, PhyloCSF++ annotates coding sequences in G...
Article
Full-text available
Background: Metagenomic sequencing has the potential to identify a wide range of pathogens in human tissue samples. Sarcoidosis is a complex disorder whose etiology remains unknown and for which a variety of infectious causes have been hypothesized. We sought to conduct metagenomic sequencing on cases of ocular and periocular sarcoidosis, none of...
Article
The ability to detect recombination in pathogen genomes is crucial to the accuracy of phylogenetic analysis and consequently to forecasting the spread of infectious diseases and to developing therapeutics and public health policies. However, in the case of SARS-CoV-2, the low divergence of near-identical genomes sequenced over a short period of tim...
Article
Although the ability to programmatically summarize and visually inspect sequencing data is an integral part of genome analysis, currently available methods are not capable of handling large numbers of samples. In particular, making a visual comparison of transcriptional landscapes between two sets of thousands of RNA-seq samples is limited by avail...
Preprint
Full-text available
The ability to detect recombination in pathogen genomes is crucial to the accuracy of phylogenetic analysis and consequently to forecasting the spread of infectious diseases and to developing therapeutics and public health policies. However, previous methods for detecting recombination and reassortment events cannot handle the computational require...
Article
Full-text available
Motivation: Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as...
Article
Full-text available
Background: Natural variations in a genome can drastically alter the CRISPR-Cas9 off-target landscape by creating or removing sites. Despite the potential side effects resulting from such unaccounted-for sites, current off-target detection pipelines are not equipped to include variant information. To address this, we developed VARiant-aware detect...
Preprint
We present a fast and exact algorithm to compute the (k,e)-mappability. Its inverse, the (k,e)-frequency, counts the number of occurrences of each k-mer with up to e errors in a sequence. The algorithm we present is an order of magnitude faster than the algorithm in the widely used GEM suite while not relying on heuristics, and can even compute the mappabilit...
Preprint
Full-text available
Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics and has been extensively researched. Bidirectional indices have opened new possibilities in this regard allowing the search to start from anywhere within the pattern and extend in both directions. In particular, use of search scheme...
Article
Full-text available
Finding approximate occurrences of a pattern in a text using a full-text index is a central problem in bioinformatics. Bidirectional indices have opened new possibilities for solving the problem as they allow the search to be started from anywhere within the pattern and extended in both directions. In particular, use of search schemes (partitioning...
Article
Background: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome (Venter et al., 2001) would not have been possible without advanced assembly algorithms, and the development of practical BWT-based read mappers has been instrumental for NGS analysis. However, ow...
Conference Paper
The unidirectional FM index was introduced by Ferragina and Manzini in 2000 and allows searching for a pattern in the index in one direction. The bidirectional FM index (2FM) was introduced by Lam et al. in 2009. It allows searching for a pattern by extending an infix of the pattern arbitrarily to the left or right. If \(\sigma\) is the size of the al...
Article
Full-text available
We introduce a new method for conducting an exact search in a uni- and bidirectional FM index in $\mathcal{O}(1)$ time per step while using $\mathcal{O}(\log \sigma \cdot n) + o(\log \sigma \cdot \sigma \cdot n)$ bits of space. This is done by replacing the wavelet tree by a new data structure, the Enhanced Prefixsum Rank dictionary (EPR-dic...
Article
Abstract: This work gives an introduction to current processor architectures for mobile internet devices (MIDs), such as smartphones or tablet PCs. It sets out the requirements for processor architectures, compares the architecture of the ARM Cortex-A8 with that of the Intel Atom Z510, and shows to what extent they meet the architectural goals...
