Tzvika Hartman

Bar Ilan University, Ramat Gan, Tel Aviv, Israel


Publications (20)
11.81 Total Impact

  • Source
    Jens Gramm · Tzvika Hartman · Till Nierhoff · Roded Sharan · Till Tantau
    ABSTRACT: Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect path phylogeny haplotyping, is solvable in polynomial time.
    Preview · Article · Sep 2009 · Discrete Mathematics
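The haplotyping setting above uses the standard 0/1/2 genotype encoding, where 2 marks a heterozygous SNP site. As a minimal illustrative sketch (not code from the paper), the following checks whether a pair of haplotypes resolves a genotype under that encoding:

```python
def resolves(h1, h2, genotype):
    """Check whether haplotypes h1 and h2 resolve a genotype.

    Encoding: each haplotype site is 0 or 1; a genotype site equals
    the shared value where both haplotypes agree, and is 2 where
    they differ (heterozygous).
    """
    for a, b, g in zip(h1, h2, genotype):
        if a == b:
            if g != a:
                return False
        elif g != 2:
            return False
    return True

# The genotype (0, 2, 1, 2) is resolved by haplotypes that agree on
# sites 0 and 2 and differ on the heterozygous sites 1 and 3.
print(resolves((0, 0, 1, 1), (0, 1, 1, 0), (0, 2, 1, 2)))  # True
print(resolves((0, 0, 1, 1), (0, 0, 1, 0), (0, 2, 1, 2)))  # False
```

Resolving a whole genotype set under the perfect phylogeny model additionally constrains which haplotype pairs may be chosen, which is where the partitioning problem's hardness arises.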
  • Source
    Amihood Amir · Tzvika Hartman · Oren Kapah · Avivit Levy · Ely Porat
    ABSTRACT: Consider the following optimization problem: given two strings over the same alphabet, transform one into another by a succession of interchanges of two elements. In each interchange the two participating elements exchange positions. An interchange is given a weight that depends on the distance in the string between the two exchanged elements. The object is to minimize the total weight of the interchanges. This problem is a generalization of a classical problem on permutations (where every element appears once). The generalization considers general strings with possibly repeating elements, and a function assigning weights to the interchanges. The generalization to general strings (with unit weights) was mentioned by Cayley in the 19th century, and its complexity has been an open question since. We solve this open problem and consider various weight functions as well.
    Full-text · Article · Jan 2009 · SIAM Journal on Computing
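For the unit-weight permutation baseline that this work generalizes, the classical answer (attributed to Cayley) is that the minimum number of interchanges equals n minus the number of cycles in the permutation's cycle decomposition. A minimal sketch of that baseline computation (illustrative only, not the paper's algorithm for general strings):

```python
def min_interchanges(perm):
    """Minimum number of interchanges (swaps of two elements)
    needed to sort a permutation of 0..n-1: n minus the number
    of cycles in its cycle decomposition.
    """
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:  # walk one full cycle
                seen[j] = True
                j = perm[j]
    return n - cycles

print(min_interchanges([0, 1, 2, 3]))  # 0: already sorted
print(min_interchanges([1, 0, 3, 2]))  # 2: two 2-cycles
print(min_interchanges([1, 2, 3, 0]))  # 3: one 4-cycle
```

The open question the paper resolves concerns the harder setting where elements may repeat and interchanges carry distance-dependent weights.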
  • Source
    ABSTRACT: The haplotype inference problem (HIP) asks to find a set of haplotypes which resolve a given set of genotypes. This problem is important in practical fields such as the investigation of diseases or other types of genetic mutations. In order to find the haplotypes which are as close as possible to the real set of haplotypes that comprise the genotypes, two models have been suggested which are by now well-studied: the perfect phylogeny model and the pure parsimony model. All previously known algorithms for haplotype inference may find haplotypes that are not necessarily plausible, i.e., very rare haplotypes or haplotypes that were never observed in the population. In order to overcome this disadvantage, we study in this paper a new constrained version of HIP under the above-mentioned models. In this new version, a pool of plausible haplotypes H is given together with the set of genotypes G, and the goal is to find a subset H′ ⊆ H that resolves G. For constrained perfect phylogeny haplotyping (CPPH), we provide initial insights and polynomial-time algorithms for some restricted cases of the problem. For constrained parsimony haplotyping (CPH), we show that the problem is fixed parameter tractable when parameterized by the size of the solution set of haplotypes.
    Full-text · Article · Jan 2009 · IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Source
    Amihood Amir · Tzvika Hartman · Oren Kapah · B. Riva Shalom · Dekel Tsur
    ABSTRACT: The Longest Common Subsequence (LCS) is a well studied problem, having a wide range of implementations. Its motivation is in comparing strings. It has long been of interest to devise a similar measure for comparing higher dimensional objects, and more complex structures. In this paper we study the Longest Common Substructure of two matrices and show that this problem is NP-hard. We also study the Longest Common Subforest problem for multiple trees including a constrained version, as well. We show NP-hardness for k>2 unordered trees in the constrained LCS. We also give polynomial time algorithms for ordered trees and prove a lower bound for any decomposition strategy for k trees.
    Full-text · Article · Dec 2008 · Theoretical Computer Science
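As background for the matrix and forest generalizations above, the baseline string LCS is the standard quadratic dynamic program. A minimal sketch (illustrative only, not from the paper):

```python
def lcs_length(s, t):
    """Length of the longest common subsequence of two strings,
    via the classic O(len(s) * len(t)) dynamic program."""
    m, n = len(s), len(t)
    # dp[i][j] = LCS length of s[:i] and t[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCAB")
```

The paper's point is that this tractability does not survive the move to matrices (NP-hard) or to multiple unordered trees under constraints.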
  • Source
    Amihood Amir · Tzvika Hartman · Oren Kapah · B. Riva Shalom · Dekel Tsur
    ABSTRACT: The Longest Common Subsequence (LCS) is a well studied problem, having a wide range of implementations. Its motivation is in comparing strings. It has long been of interest to devise a similar measure for comparing higher dimensional objects, and more complex structures. In this paper we give, what is to our knowledge, the first inherently multi-dimensional definition of LCS. We discuss the Longest Common Substructure of two matrices and the Longest Common Subtree problem for multiple trees including a constrained version. Both problems cannot be solved by a natural extension of the original LCS solution. We investigate the tractability of the above problems. For the first we prove NP-completeness. For the latter NP-hardness holds for two general unordered trees and for k trees in the constrained LCS.
    Full-text · Chapter · Sep 2007
  • Source
    Tzvika Hartman
    ABSTRACT: An important problem in genome rearrangements is sorting permutations by transpositions. Its complexity is still open, and two rather complicated 1.5-approximation algorithms for sorting linear permutations are known (Bafna and Pevzner, 96 and Christie, 98). In this paper, we observe that the problem of sorting circular permutations by transpositions is equivalent to the problem of sorting linear permutations by transpositions. Hence, all algorithms for sorting linear permutations by transpositions can be used to sort circular permutations. Our main result is a new 1.5-approximation algorithm, which is considerably simpler than the previous ones, and achieves running time which is equal to the best known. Moreover, the analysis of the algorithm is significantly less involved, and provides a good starting point for studying related open problems.
    Preview · Chapter · Mar 2007
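A transposition, as used above, cuts a contiguous segment out of the permutation and pastes it at a different position. A minimal illustrative sketch of the operation itself (the index convention here is an assumption for illustration, not the paper's notation):

```python
def transpose(perm, i, j, k):
    """Apply a transposition: cut segment perm[i:j] and reinsert it
    so that it ends up after perm[j:k] (requires i < j <= k).
    Returns a new list; the input is not modified."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

p = [0, 3, 4, 1, 2, 5]
# Moving the segment [3, 4] past [1, 2] sorts p in a single step.
print(transpose(p, 1, 3, 5))  # [0, 1, 2, 3, 4, 5]
```

Sorting by transpositions asks for the fewest such operations; the approximation algorithms discussed above bound how far a polynomial-time solution can be from that minimum.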
  • Source
    Conference Paper: Generalized LCS.
    Amihood Amir · Tzvika Hartman · Oren Kapah · B. Riva Shalom · Dekel Tsur

    Full-text · Conference Paper · Jan 2007
  • Source
    Amihood Amir · Tzvika Hartman · Oren Kapah · Avivit Levy · Ely Porat
    ABSTRACT: An underlying assumption in the classical sorting problem is that the sorter does not know the index of every element in the sorted array. Thus, comparisons are used to determine the order of elements, while the sorting is done by interchanging elements. In the closely related interchange rearrangement problem, final positions of elements are already given, and the cost of the rearrangement is the cost of the interchanges. This problem was studied only for the limited case of permutation strings, where every element appears once. This paper studies a generalization of the classical and well-studied problem on permutations by considering general strings input, thus solving an open problem of Cayley from 1849, and examining various cost models.
    Full-text · Conference Paper · Jan 2007
  • Tzvika Hartman · Elad Verbin
    ABSTRACT: We study the problems of sorting signed permutations by reversals (SBR) and sorting unsigned permutations by transpositions (SBT), which are central problems in computational molecular biology. While a polynomial-time solution for SBR is known, the computational complexity of SBT has been open for more than a decade and is considered a major open problem. In the first efficient solution of SBR, Hannenhalli and Pevzner [HP99] used a graph-theoretic model for representing permutations, called the interleaving graph. This model was crucial to their solution. Here, we define a new model for SBT, which is analogous to the interleaving graph. Our model has some desirable properties that were lacking in earlier models for SBT. These properties make it extremely useful for studying SBT. Using this model, we give a linear-algebraic framework in which SBT can be studied. Specifically, for matrices over any algebraic ring, we define a class of matrices called tight matrices. We show that an efficient algorithm which recognizes tight matrices over a certain ring, \(\mathbb{M}\), implies an efficient algorithm that solves SBT on an important class of permutations, called simple permutations. Such an algorithm is likely to lead to an efficient algorithm for SBT that works on all permutations. The problem of recognizing tight matrices is also a generalization of SBR and of a large class of other "sorting by rearrangements" problems, and seems interesting in its own right. We give an efficient algorithm for recognizing tight symmetric matrices over any field of characteristic 2. We leave as an open problem to find an efficient algorithm for recognizing tight matrices over the ring \(\mathbb{M}\).
    No preview · Conference Paper · Oct 2006
  • Source
    Isaac Elias · Tzvika Hartman
    ABSTRACT: Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a 10-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper, we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: we improve the lower bound for the transposition diameter of the symmetric group and determine the exact transposition diameter of simple permutations.
    Preview · Article · Oct 2006 · IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Source
    Jens Gramm · Tzvika Hartman · Till Nierhoff · Roded Sharan · Till Tantau
    ABSTRACT: Recent technologies for typing single nucleotide polymorphisms (SNPs) across a population are producing genome-wide genotype data for tens of thousands of SNP sites. The emergence of such large data sets underscores the importance of algorithms for large-scale haplotyping. Common haplotyping approaches first partition the SNPs into blocks of high linkage-disequilibrium, and then infer haplotypes for each block separately. We investigate an integrated haplotyping approach where a partition of the SNPs into a minimum number of non-contiguous subsets is sought, such that each subset can be haplotyped under the perfect phylogeny model. We show that finding an optimum partition is NP-hard even if we are guaranteed that two subsets suffice. On the positive side, we show that a variant of the problem, in which each subset is required to admit a perfect path phylogeny haplotyping, is solvable in polynomial time.
    Preview · Chapter · Sep 2006
  • Source
    Tzvika Hartman · Roded Sharan
    ABSTRACT: One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions and transreversals, improving on a five-year-old 1.75 ratio for this problem. Our algorithm is also faster than current approaches and requires O(n^{3/2} √log n) time for n genes.
    Preview · Article · May 2005 · Journal of Computer and System Sciences
  • Source
    Tzvika Hartman · Roded Sharan
    ABSTRACT: One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions and transreversals, improving on a five-year-old 1.75 ratio for this problem. Our algorithm is also faster than current approaches and requires O(n^{3/2} √log n) time for n genes.
    Preview · Article · Jan 2005 · Lecture Notes in Computer Science
  • Source
    Tzvika Hartman · Roded Sharan
    ABSTRACT: One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions and transreversals, improving on a five-year-old 1.75 ratio for this problem. Our algorithm is also faster than current approaches and requires O(n^{3/2} √log n) time for n genes.
    Preview · Conference Paper · Sep 2004
  • Source
    Tzvika Hartman · Ron Shamir
    ABSTRACT: An important problem in genome rearrangements is sorting permutations by transpositions. The complexity of the problem is still open, and two rather complicated 1.5-approximation algorithms for sorting linear permutations are known (Bafna and Pevzner, 98 and Christie, 99). The fastest known algorithm is the quadratic algorithm of Bafna and Pevzner. In this paper, we observe that the problem of sorting circular permutations by transpositions is equivalent to the problem of sorting linear permutations by transpositions. Hence, all algorithms for sorting linear permutations by transpositions can be used to sort circular permutations. Our main result is a new 1.5-approximation algorithm, which is considerably simpler than the previous ones, and whose analysis is significantly less involved.
    Full-text · Article · May 2004 · Information and Computation
  • Source
    ABSTRACT: We study a design and optimization problem that occurs, for example, when single nucleotide polymorphisms (SNPs) are to be genotyped using a universal DNA tag array. The problem of optimizing the universal array to avoid disruptive cross-hybridization between universal components of the system was addressed in previous work. Cross-hybridization can, however, also occur assay specifically, due to unwanted complementarity involving assay-specific components. Here we examine the problem of identifying the most economic experimental configuration of the assay-specific components that avoids cross-hybridization. Our formalization translates this problem into the problem of covering the vertices of one side of a bipartite graph by a minimum number of balanced subgraphs of maximum degree 1. We show that the general problem is NP-complete. However, in the real biological setting, the vertices that need to be covered have degrees bounded by d. We exploit this restriction and develop an O(d)-approximation algorithm for the problem. We also give an O(d)-approximation for a variant of the problem in which the covering subgraphs are required to be vertex disjoint. In addition, we propose a stochastic model for the input data and use it to prove a lower bound on the cover size. We complement our theoretical analysis by implementing two heuristic approaches and testing their performance on synthetic data as well as on simulated SNP data.
    Full-text · Article · Feb 2004 · Journal of Computational Biology
  • Source
    ABSTRACT: We study a design and optimization problem that occurs, for example, when single nucleotide polymorphisms (SNPs) are to be genotyped using a universal DNA tag array. The problem of optimizing the universal array to avoid disruptive cross-hybridization between universal components of the system was addressed in a previous work. However, cross-hybridization can also occur assay-specifically, due to unwanted complementarity involving assay-specific components. Here we examine the problem of identifying the most economic experimental configuration of the assay-specific components that avoids cross-hybridization. Our formalization translates this problem into the problem of covering the vertices of one side of a bipartite graph by a minimum number of balanced subgraphs of maximum degree 1. We show that the general problem is NP-complete. However, in the real biological setting the vertices that need to be covered have degrees bounded by d. We exploit this restriction and develop an O(d)-approximation algorithm for the problem. We also give an O(d)-approximation for a variant of the problem in which the covering subgraphs are required to be vertex-disjoint. In addition, we propose a stochastic model for the input data and use it to prove a lower bound on the cover size. We complement our theoretical analysis by implementing two heuristic approaches and testing their performance on simulated and real SNP data.
    Full-text · Conference Paper · Apr 2003
  • Source
    Eran Halperin · Shay Halperin · Tzvika Hartman · Ron Shamir
    ABSTRACT: Sequencing by hybridization (SBH) is a DNA sequencing technique, in which the sequence is reconstructed using its k-mer content. This content, which is called the spectrum of the sequence, is obtained by hybridization to a universal DNA array. Standard universal arrays contain all k-mers for some fixed k, typically 8 to 10. Currently, in spite of its promise and elegance, SBH is not competitive with standard gel-based sequencing methods. This is due to two main reasons: lack of tools to handle realistic levels of hybridization errors and an inherent limitation on the length of uniquely reconstructible sequence by standard universal arrays. In this paper, we deal with both problems. We introduce a simple polynomial reconstruction algorithm which can be applied to spectra from standard arrays and has provable performance in the presence of both false negative and false positive errors. We also propose a novel design of chips containing universal bases that differs from the one proposed by Preparata et al. (1999). We give a simple algorithm that uses spectra from such chips to reconstruct with high probability random sequences of length lower only by a squared log factor compared to the information theoretic bound. Our algorithm is very robust to errors and has a provable performance even if there are both false negative and false positive errors. Simulations indicate that its sensitivity to errors is also very small in practice.
    Full-text · Article · Feb 2003 · Journal of Computational Biology
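The spectrum used in SBH above is simply the set of all length-k substrings of the sequence. A minimal sketch of computing it (illustrative only; this is the input to reconstruction, not the paper's reconstruction algorithm):

```python
def spectrum(seq, k):
    """The k-mer spectrum of a sequence: the set of all its
    length-k substrings. SBH attempts to reconstruct seq from
    this set alone."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

print(sorted(spectrum("ACGTACG", 3)))
# ['ACG', 'CGT', 'GTA', 'TAC']
```

Note that the repeated "ACG" contributes only one spectrum element, which is one reason distinct sequences can share a spectrum and reconstruction length is information-theoretically limited.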
  • T. Hartman
    ABSTRACT: One of the most promising ways to determine evolutionary distance between two organisms is to compare the order of appearance of orthologous genes in their genomes. The resulting genome rearrangement problem calls for finding a shortest sequence of rearrangement operations that sorts one genome into the other. In this paper we provide a 1.5-approximation algorithm for the problem of sorting by transpositions and transreversals, improving on a five-year-old 1.75 ratio for this problem. Our...
    No preview · Article ·
  • Source
    Tzvika Hartman
    ABSTRACT: This thesis concerns applications of combinatorial and algorithmic techniques in molecular biology. We focus on two main areas. In the first part we consider algorithms for tracing genome rearrangement events. Genome rearrangements are large scale DNA mutations, in which large segments are rearranged in order and orientation. Since these mutations are very rare, they are useful in evolutionary studies. In particular, we consider rearrangements in which a segment of DNA is cut out and inserted in a different location in the same chromosome. If the segment is inserted in the same orientation it is called a transposition; otherwise it is a transreversal. We first study the problem of sorting permutations by transpositions only. We develop a novel 1.5-approximation algorithm, that is considerably simpler than the extant ones. Moreover, the analysis of the algorithm is significantly less involved, and provides a good starting point for studying related open problems. Next, we consider the problem of sorting permutations by transpositions, transreversals and revrevs (an operation reversing each of two consecutive segments). Our main result is a quadratic 1.5-approximation algorithm for sorting by these operations, improving over the extant best known approximation ratio of 1.75. We present an implementation of both algorithms that runs in O(n^{3/2} √log n) time.
    Preview · Article ·

Publication Stats

438 Citations
11.81 Total Impact Points

Institutions

  • 2006-2009
    • Bar Ilan University
      • Department of Computer Science
      Ramat Gan, Tel Aviv, Israel
  • 2003-2007
    • Weizmann Institute of Science
      • Department of Computer Science and Applied Mathematics
      • Department of Molecular Genetics
      Rehovot, Central District, Israel