Diego P. Rubert
Federal University of Mato Grosso do Sul · Faculty of Computing
Dr.
About
26
Publications
1,246
Reads
69
Citations
Introduction
Diego P. Rubert currently works at the Faculty of Computing, Universidade Federal de Mato Grosso do Sul. Diego does research in Computational Biology, Theory of Computation and Algorithms. Their current project is 'New practical measure of local similarity in comparative genomics and applications'.
Publications (26)
The comparison of large-scale genome structures across distinct species offers valuable insights into the species’ phylogeny, genome organization, and gene associations. In this chapter, we review the family-free genome comparison tool FFGC that, relying on built-in interfaces with a sequence comparison tool (either BLAST+ or DIAMOND) and with an I...
Background
Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the simi...
Ancestral reconstruction is a classic task in comparative genomics. Here, we study the genome median problem, a related computational problem which, given a set of three or more genomes, asks to find a new genome that minimizes the sum of pairwise distances between it and the given genomes. The distance stands for the amount of evolution observed at th...
Background: Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the sim...
Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes our method first computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal gene matchings, which minimize, by taking the similarities in...
In the median problem we are given a set of three or more genomes and want to find a new genome minimizing the sum of pairwise distances between it and the given genomes. For almost all rearrangement operations the median problem is NP-hard. We study the median problem under a restricted rearrangement measure called c4-distance, which is closely re...
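The median objective stated in this abstract can be written compactly. The following is only a notational sketch: the symbols G_1, ..., G_k (the given genomes, k ≥ 3), M (a candidate median genome) and d (the chosen rearrangement distance) are assumed here for illustration and are not taken from the paper.

```latex
% Median problem: choose M minimizing the total distance to the inputs.
M^{*} \;=\; \operatorname*{arg\,min}_{M} \; \sum_{i=1}^{k} d(M, G_i), \qquad k \ge 3 .
```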
Sequence alignment supports numerous tasks in bioinformatics, natural language processing, pattern recognition, social sciences, and other fields. While the alignment of two sequences may be performed swiftly in many applications, the simultaneous alignment of multiple sequences proved to be naturally more intricate. Although most multiple sequence...
Recently, we proposed an efficient ILP formulation [Rubert DP, Martinez FV, Braga MDV, Natural family-free genomic distance, Algorithms Mol Biol 16:4, 2021] for exactly computing the rearrangement distance of two genomes in a family-free setting. In such a setting, neither prior classification of genes into families, nor further restrictions on the...
Sequence alignment supports numerous tasks in bioinformatics, natural language processing, pattern recognition, social sciences, and other fields. While the alignment of two sequences may be performed swiftly in many applications, the simultaneous alignment of multiple sequences proved to be naturally more intricate. Although most multiple sequenc...
Background
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. The traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into fa...
Background: A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome.
The traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into fa...
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into fam...
The analysis of biological networks allows the understanding of many biological processes, including the structure, function, interaction and evolutionary relationships of their components. One of the most important concepts in biological network analysis is that of network motifs, which are patterns of interconnections that occur in a given networ...
A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments into families, more rece...
Research in comparative genomics supports the investigation of important questions in molecular biology, genetics and biomedicine. A central question in this field is the elucidation of similarities and differences between genomes by means of different measures. This summary, submitted to CTD 2020, briefly describes the main contributions, original...
One of the most important concepts in biological network analysis is that of network motifs, which are patterns of interconnections that occur in a given network at a frequency higher than expected in a random network. In this work we are interested in searching and inferring network motifs in a class of biological networks that can be represented...
Background:
Computationally inferred ancestral genomes play an important role in many areas of genome research. We present an improved workflow for the reconstruction from highly diverged genomes such as those of plants.
Results:
Our work relies on an established workflow in the reconstruction of ancestral plants, but improves several steps of t...
Since bipartite convex graphs emerged from industrial applications, algorithms for this class of graphs have been devised in several research areas such as scheduling, DNA analysis, and constraint programming. A bipartite graph G=(V,W,E) is convex if there exists an ordering of the vertices of W such that, for each v∈V, the neighbors of v are conse...
This is the additional file 1 of "Computing the family-free DCJ similarity" and contains the APX-hardness proof of the ffdcj-similarity problem.
Background:
The genomic similarity is a large-scale measure for comparing two given genomes. In this work we study the (NP-hard) problem of computing the genomic similarity under the DCJ model in a setting that does not assume that the genes of the compared genomes are grouped into gene families. This problem is called family-free DCJ similarity....
The genomic similarity is a large-scale measure for comparing two given genomes. In this work we study the (NP-hard) problem of computing the genomic similarity under the DCJ model in a setting that does not assume that the genes of the compared genomes are grouped into gene families. This problem is called family-free DCJ similarity. Here we propo...
Background: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each...
Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two genomes with the same content, so that we have exactly the same number of copies of each gene in each ge...
A bipartite graph G = (V, W, E) is convex if there exists an ordering of the vertices of W such that, for each v ∈ V, the neighbors of v are consecutive in W. In this work, we address the Hamiltonian Circuit Problem, a well-known problem in Combinatorial Optimization. We present a novel sequential linear-time algorithm for determining a Hamiltonia...
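The convexity definition above can be illustrated with a small check. This is a minimal sketch, not the algorithm from the paper: it only verifies whether one given ordering of W makes every neighborhood consecutive, and it does not search for such an ordering. The function name `is_convex_under` and the adjacency-dictionary representation are assumptions made for this example.

```python
from typing import Dict, Hashable, List, Set


def is_convex_under(order: List[Hashable],
                    adj: Dict[Hashable, Set[Hashable]]) -> bool:
    """Check the convexity condition for one fixed ordering of W.

    order: the vertices of W in the candidate ordering (hypothetical input).
    adj:   maps each v in V to its set of neighbors in W.
    Returns True if every neighborhood is a contiguous block of `order`.
    """
    # Position of each W-vertex in the candidate ordering.
    pos = {w: i for i, w in enumerate(order)}
    for neighbors in adj.values():
        if not neighbors:
            continue  # an empty neighborhood is trivially consecutive
        idx = [pos[w] for w in neighbors]
        # Consecutive means the covered span equals the number of neighbors.
        if max(idx) - min(idx) + 1 != len(idx):
            return False
    return True


# Usage: neighborhoods {1,2} and {2,3} are consecutive under 1,2,3,
# while {1,3} skips vertex 2 and violates the condition for that ordering.
ok = is_convex_under([1, 2, 3], {"a": {1, 2}, "b": {2, 3}})
bad = is_convex_under([1, 2, 3], {"a": {1, 2}, "c": {1, 3}})
```

A graph is convex if at least one ordering passes this check; a failing ordering alone does not show that the graph is non-convex.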