Y. Pirola

Università degli Studi di Milano-Bicocca, Monza, Lombardy, Italy

Publications (2) · 1.54 Total impact

  • Article: An Efficient Algorithm for Haplotype Inference on Pedigrees with Recombinations and Mutations
    Y. Pirola, P. Bonizzoni, Tao Jiang
    ABSTRACT: Haplotype Inference (HI) is a computational challenge of crucial importance in a range of genetic studies. Pedigrees allow haplotypes to be inferred from genotypes more accurately than population data does, since Mendelian inheritance restricts the set of possible solutions. In this work, we define a new HI problem on pedigrees, called the Minimum-Change Haplotype Configuration (MCHC) problem, which allows two types of genetic variation events: recombinations and mutations. Our new formulation extends the Minimum-Recombinant Haplotype Configuration (MRHC) problem, which has been proposed in the literature to overcome the limitations of classic statistical haplotyping methods. Our contribution is twofold. First, we prove that the MCHC problem is APX-hard under several restrictions. Second, we propose an efficient and accurate heuristic algorithm for MCHC based on an L-reduction to a well-known coding problem. Our heuristic can also be used to solve the original MRHC problem and can take advantage of additional knowledge about the input genotypes. Moreover, the L-reduction proves for the first time that MCHC and MRHC are O(nm/log nm)-approximable on general pedigrees, where n is the pedigree size and m is the genotype length. Finally, we present an extensive experimental evaluation and comparison of our heuristic algorithm with several other state-of-the-art methods for HI on pedigrees.
    IEEE/ACM Transactions on Computational Biology and Bioinformatics 03/2012 · 1.54 Impact Factor
  • Conference Proceeding: PIntron: A fast method for gene structure prediction via maximal pairings of a pattern and a text
    ABSTRACT: A challenging issue in designing computational methods for predicting the structure of a gene, that is, its exons and introns, from a cluster of transcript (EST, mRNA) sequences is guaranteeing both accuracy and efficiency in time and space when large clusters of more than 20,000 ESTs and genes longer than 1 Mb are processed. Traditionally, the problem has been addressed by combining different tools that were not specifically designed for this task. We propose a fast method based on ad hoc procedures for solving the problem. Our method combines two ideas: a novel algorithm with provably small time complexity for computing spliced alignments of a transcript against a genome, and an efficient algorithm that exploits the inherent redundancy of information in a cluster of transcripts to select, among all possible factorizations of EST sequences, those that allow the inference of splice site junctions largely confirmed by the input data. The EST alignment procedure is based on the construction of maximal embeddings, which are sequences obtained from paths of a graph structure, called the embedding graph, whose vertices are the maximal pairings of a genomic sequence T and an EST P. The procedure runs in time linear in the length of P and T and in the size of the output. PIntron, the software tool implementing our methodology, is available at http://www.algolab.eu/PIntron and is able to process in a few seconds some critical genes that are not manageable by other gene structure prediction tools. At the same time, PIntron exhibits high accuracy (sensitivity and specificity) when compared with ENCODE data.
    Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on; 03/2011