Evgeni Tsivtsivadze

TNO, 's-Gravenhage, South Holland, Netherlands

Publications (34) · 36.74 Total impact

  • Sultan Imangaliyev · Bart Keijser · Wim Crielaard · Evgeni Tsivtsivadze
    ABSTRACT: We use the Human Microbiome Project (HMP) cohort [1] to infer personalized oral microbial networks of healthy individuals. To determine clustering of individuals with similar microbial profiles, a co-regularized spectral clustering algorithm is applied to the dataset. For each cluster we discovered, we compute co-occurrence relationships among the microbial species that determine microbial network per cluster of individuals. The results of our study suggest that there are several differences in microbial interactions on personalized network level in healthy oral samples acquired from various niches. Based on the results of co-regularized spectral clustering we discover two groups of individuals with different topology of their microbial interaction network. The results of microbial network inference suggest that niche-wise interactions are different in these two groups. Our study shows that healthy individuals have different microbial clusters according to their oral microbiota. Such personalized microbial networks open a better understanding of the microbial ecology of healthy oral cavities and new possibilities for future targeted medication. The scripts written in scientific Python and in Matlab, which were used for network visualization, are provided for download on the website http://learning-machines.com/. Copyright © 2015. Published by Elsevier Inc.
    Methods 04/2015; 83. DOI:10.1016/j.ymeth.2015.03.017 · 3.65 Impact Factor
  • ABSTRACT: A cross-sectional observational study was conducted to evaluate interindividual biochemical variation in unstimulated whole saliva in a population of 268 systemically healthy young students, 18-30 yr of age, with no apparent caries lesions or periodontal disease. Salivary flow rate, protein content, pH, buffering capacity, mucins MUC5B and MUC7, albumin, secretory IgA, cystatin S, lactoferrin, chitinase, amylase, lysozyme, and proteases were measured using ELISAs and enzymatic activity assays. Significant differences were found between male and female subjects. Salivary pH, buffering capacity, protein content, MUC5B, secretory IgA, and chitinase activity were all lower in female subjects compared with male subjects, whereas MUC7 and lysozyme activity were higher in female subjects. There was no significant difference between sexes in salivary flow rate, albumin, cystatin S, amylase, and protease activity. Principal component analysis (PCA) and spectral clustering (SC) were used to assess intervariable relationships within the data set and to identify subgroups. Spectral clustering identified two clusters of participants, which were subsequently described. This study provides a comprehensive overview of the distribution and inter-relations of a set of important salivary biochemical variables in a systemically healthy young adult population, free of apparent caries lesions and periodontal disease. It highlights significant gender differences in salivary biochemistry. © 2015 Eur J Oral Sci.
    European Journal Of Oral Sciences 03/2015; 123(3). DOI:10.1111/eos.12182 · 1.49 Impact Factor
  • ABSTRACT: Rationale. Many bacterial pathogens causing respiratory infections in children are common residents of the respiratory tract. Insight into bacterial colonization patterns and microbiota stability at young age might elucidate healthy or susceptible conditions for development of respiratory disease. Objective. To study bacterial succession of the respiratory microbiota in the first two years of life and its relation to respiratory health characteristics. Methods. Upper respiratory microbiota profiles of 60 healthy children at the ages of 1.5, 6, 12 and 24 months were characterized by 16S-based pyrosequencing. We determined consecutive microbiota profiles by machine-learning algorithms and validated the findings cross-sectionally in an additional cohort of 140 children per age group. Measurements and main results. Overall, we identified 8 distinct microbiota profiles in the upper respiratory tract of healthy infants. Profiles could already be identified at 1.5 months of age, and were associated with microbiota stability and change over the first two years of life. More stable patterns were marked by early presence and high abundance of Moraxella and Corynebacterium/Dolosigranulum, were positively associated with breastfeeding in the first period of life and with lower rates of parental-reported respiratory infections in the consecutive periods. Less stable profiles were marked by high abundance of Haemophilus or Streptococcus. Conclusions. These findings provide novel insights into microbial succession in the respiratory tract in infancy and link early-life profiles to microbiota stability and respiratory health characteristics. New prospective studies should elucidate potential implications of our findings for early diagnosis and prevention of respiratory infections. Clinical trial registration available at www.clinicaltrials.gov, ID NCT00189020.
    American Journal of Respiratory and Critical Care Medicine 10/2014; 190(11). DOI:10.1164/rccm.201407-1240OC · 13.00 Impact Factor
  • ABSTRACT: Cardiovascular disease risk increases when lipoprotein metabolism is dysfunctional. We have developed a computational model able to derive indicators of lipoprotein production, lipolysis, and uptake processes from a single lipoprotein profile measurement. This is the first study to investigate whether lipoprotein metabolism indicators can improve cardiovascular risk prediction and therapy management. We calculated lipoprotein metabolism indicators for 1981 subjects (145 cases, 1836 controls) from the Framingham Heart Study offspring cohort in which NMR lipoprotein profiles were measured. We applied a statistical learning algorithm using a support vector machine to select conventional risk factors and lipoprotein metabolism indicators that contributed to predicting risk for general cardiovascular disease. Risk prediction was quantified by the change in the Area-Under-the-ROC-Curve (ΔAUC) and by risk reclassification (Net Reclassification Improvement (NRI) and Integrated Discrimination Improvement (IDI)). Two VLDL lipoprotein metabolism indicators (VLDLE and VLDLH) improved cardiovascular risk prediction. We added these indicators to a multivariate model with the best performing conventional risk markers. Our method significantly improved both CVD prediction and risk reclassification. Two calculated VLDL metabolism indicators significantly improved cardiovascular risk prediction. These indicators may help to reduce prescription of unnecessary cholesterol-lowering medication, reducing costs and possible side-effects. For clinical application, further validation is required.
    PLoS ONE 03/2014; 9(3):e92840. DOI:10.1371/journal.pone.0092840 · 3.23 Impact Factor
  • Sultan Imangaliyev · Bart Keijser · Wim Crielaard · Evgeni Tsivtsivadze
    ABSTRACT: As the amount of metagenomic data grows rapidly, online statistical learning algorithms are poised to play a key role in metagenome analysis tasks. Frequently, data are only partially labeled, namely, the dataset contains partial information about the problem of interest. This work presents an algorithm and a learning framework that is naturally suitable for the analysis of large-scale, partially labeled metagenome datasets. We propose an online multi-output algorithm that learns by sequentially co-regularizing prediction functions on unlabeled data points and provides improved performance in comparison to several supervised methods. We evaluate predictive performance of the proposed methods on the NIH Human Microbiome Project dataset. In particular, we address the task of predicting relative abundance of Porphyromonas species in the oral cavity. In our empirical evaluation the proposed method outperforms several supervised regression techniques and also leads to notable computational benefits when training the predictive model.
    2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 12/2013
  • Evgeni Tsivtsivadze · Tom Heskes
    ABSTRACT: We propose a novel sparse preference learning/ranking algorithm. Our algorithm approximates the true utility function by a weighted sum of basis functions using the squared loss on pairs of data points, and is a generalization of the kernel matching pursuit method. It can operate both in a supervised and a semi-supervised setting and allows efficient search for multiple, near-optimal solutions. Furthermore, we describe the extension of the algorithm suitable for combined ranking and regression tasks. In our experiments we demonstrate that the proposed algorithm outperforms several state-of-the-art learning methods when taking into account unlabeled data and performs comparably in a supervised learning scenario, while providing sparser solutions.
  • Daniel Kühlwein · Twan van Laarhoven · Evgeni Tsivtsivadze · Josef Urban · Tom Heskes
    ABSTRACT: In this paper, an overview of state-of-the-art techniques for premise selection in large theory mathematics is provided, and new premise selection techniques are introduced. Several evaluation metrics are introduced, compared and their appropriateness is discussed in the context of automated reasoning in large theory mathematics. The methods are evaluated on the MPTP2078 benchmark, a subset of the Mizar library, and a 10% improvement is obtained over the best method so far.
    Proceedings of the 6th international joint conference on Automated Reasoning; 06/2012
  • ABSTRACT: Activity regulated neurotransmission shapes the computational properties of a neuron and involves the concerted action of many proteins. Classical, intuitive working models often assign specific proteins to specific steps in such complex cellular processes, whereas modern systems theories emphasize more integrated functions of proteins. To test how often synaptic proteins participate in multiple steps in neurotransmission we present a novel probabilistic method to analyze complex functional data from genetic perturbation studies on neuronal secretion. Our method uses a mixture of probabilistic principal component analyzers to cluster genetic perturbations on two distinct steps in synaptic secretion, vesicle priming and fusion, and accounts for the poor standardization between different studies. Clustering data from 121 perturbations revealed that different perturbations of a given protein are often assigned to different steps in the release process. Furthermore, vesicle priming and fusion are inversely correlated for most of those perturbations where a specific protein domain was mutated to create a gain-of-function variant. Finally, two different modes of vesicle release, spontaneous and action potential evoked release, were affected similarly by most perturbations. This data suggests that the presynaptic protein network has evolved as a highly integrated supramolecular machine, which is responsible for both spontaneous and activity induced release, with a group of core proteins using different domains to act on multiple steps in the release process.
    PLoS Computational Biology 04/2012; 8(4):e1002450. DOI:10.1371/journal.pcbi.1002450 · 4.62 Impact Factor
  • Jesse Alama · Tom Heskes · Daniel Kühlwein · Evgeni Tsivtsivadze · Josef Urban
    ABSTRACT: Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.
    Journal of Automated Reasoning 08/2011; 52(2). DOI:10.1007/s10817-013-9286-5 · 0.88 Impact Factor
  • Evgeni Tsivtsivadze · Tapio Pahikkala · Jorma Boberg · Tapio Salakoski · Tom Heskes
    ABSTRACT: Situations when only a limited amount of labeled data and a large amount of unlabeled data are available to the learning algorithm are typical for many real-world problems. To make use of unlabeled data in preference learning problems, we propose a semisupervised algorithm that is based on the multiview approach. Our algorithm, which we call Sparse Co-RankRLS, minimizes a least-squares approximation of the ranking error and is formulated within the co-regularization framework. It operates by constructing a ranker for each view and by choosing such ranking prediction functions that minimize the disagreement among all of the rankers on the unlabeled data. Our experiments, conducted on a real-world dataset, show that the inclusion of unlabeled data can improve the prediction performance significantly. Moreover, our semisupervised preference learning algorithm has a linear complexity in the number of unlabeled data items, making it applicable to large datasets.
  • D. A. Kühlwein · J. Urban · E. Tsivtsivadze · H. Geuvers · T. Heskes
    ABSTRACT: Premise selection and ranking is a pressing problem for applications of automated reasoning to large formal theories and knowledge bases. Smart selection of premises has a significant impact on the efficiency of automated proof assistant systems in large theories. Despite this, machine-learning methods for this domain are underdeveloped. In this paper we propose a general learning algorithm to address the premise selection problem. Our approach consists of simultaneous training of multiple predictors that learn to rank a set of premises in order to estimate their expected usefulness when proving a new conjecture. The proposed algorithm efficiently constructs prediction functions and can take correlations among multiple tasks into account. The experiments demonstrate that the proposed method significantly outperforms algorithms previously applied to the task.
  • Conference Paper: Learning2Reason.
    Daniel Kühlwein · Josef Urban · Evgeni Tsivtsivadze · Herman Geuvers · Tom Heskes
    ABSTRACT: In recent years, large corpora of formally expressed knowledge have become available in the fields of formal mathematics, software verification, and real-world ontologies. The Learning2Reason project aims to develop novel machine learning methods for computer-assisted reasoning on such corpora. Our global research goals are to provide good methods for selecting relevant knowledge from large formal knowledge bases, and to combine them with automated reasoning methods.
    Intelligent Computer Mathematics - 18th Symposium, Calculemus 2011, and 10th International Conference, MKM 2011, Bertinoro, Italy, July 18-23, 2011. Proceedings; 01/2011
  • Evgeni Tsivtsivadze · Josef Urban · Herman Geuvers · Tom Heskes
    ABSTRACT: Learning reasoning techniques from previous knowledge is a largely underdeveloped area of automated reasoning. As large bodies of formal knowledge are becoming available to automated reasoners, state-of-the-art machine learning methods can provide powerful heuristics for problem-specific detection of relevant knowledge contained in the libraries. In this paper we develop a semantic graph kernel suitable for learning in structured mathematical domains. Our kernel incorporates contextual information about the features and unlike "random walk"-based graph kernels it is also applicable to sparse graphs. We evaluate the proposed semantic graph kernel on a subset of the large formal Mizar mathematical library. Our empirical evaluation demonstrates that graph kernels in general are particularly suitable for the automated reasoning domain and that in many cases our semantic graph kernel leads to improvement in performance compared to linear, Gaussian, latent semantic, and geometric graph kernels.
    Proceedings of the Eleventh SIAM International Conference on Data Mining, SDM 2011, April 28-30, 2011, Mesa, Arizona, USA; 01/2011
  • ABSTRACT: In different fields like decision making, psychology, game theory and biology, it has been observed that paired-comparison data like preference relations defined by humans and animals can be intransitive. Intransitive relations cannot be modeled with existing machine learning methods like ranking models, because these models exhibit strong transitivity properties. More specifically, in a stochastic context, where often the reciprocity property characterizes probabilistic relations such as choice probabilities, it has been formally shown that ranking models always satisfy the well-known strong stochastic transitivity property. Given this limitation of ranking models, we present a new kernel function that together with the regularized least-squares algorithm is capable of inferring intransitive reciprocal relations in problems where transitivity violations cannot be considered as noise. In this approach it is the kernel function that defines the transition from learning transitive to learning intransitive relations, and the Kronecker-product is introduced for representing the latter type of relations. In addition, we empirically demonstrate on two benchmark problems, one in game theory and one in theoretical biology, that our algorithm outperforms methods not capable of learning intransitive reciprocal relations.
    European Journal of Operational Research 11/2010; 206(3):676-685. DOI:10.1016/j.ejor.2010.03.018 · 2.36 Impact Factor
  • Antolin Janssen · Evgeni Tsivtsivadze · Jorma Boberg · Tjeerd M.H. Dijkstra · Tom Heskes
  • Tjeerd M. H. Dijkstra · Evgeni Tsivtsivadze · Elena Marchiori · Tom Heskes
    Lecture Notes in Computer Science 01/2010; DOI:10.1007/978-3-642-16001-1 · 0.51 Impact Factor
  • Evgeni Tsivtsivadze · Tapio Pahikkala · Jorma Boberg · Tapio Salakoski
    ABSTRACT: We propose a framework for constructing kernels that take advantage of local correlations in sequential data. The kernels designed using the proposed framework measure parse similarities locally, within a small window constructed around each matching feature. Furthermore, we propose to incorporate positional information inside the window and consider different ways to do this. We applied the kernels together with the regularized least-squares (RLS) algorithm to the task of dependency parse ranking using a dataset containing parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with kernels incorporating positional information performs better than RLS with the baseline kernel functions. This performance gain is statistically significant.
    Applied Intelligence 12/2009; 31(3). DOI:10.1007/s10489-008-0114-2 · 1.85 Impact Factor
  • Tapio Pahikkala · Evgeni Tsivtsivadze · Antti Airola · Jouni Järvinen · Jorma Boberg
    ABSTRACT: In this paper, we introduce a framework for regularized least-squares (RLS) type of ranking cost functions and we propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to the ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and the standard RLS regression as the baselines. RankRLS outperforms the standard RLS regression and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.
    Machine Learning 04/2009; 75(1). DOI:10.1007/s10994-008-5097-z · 1.89 Impact Factor
  • Frontiers in Neuroinformatics 01/2009; DOI:10.3389/conf.neuro.11.2009.08.035 · 3.26 Impact Factor
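
The RankRLS and Sparse Co-RankRLS abstracts in the list above describe ranking by minimizing a regularized least-squares cost on pairs of data points. A minimal sketch of that idea is given here, assuming a linear scoring function and plain gradient descent; this is not the kernel-based method from the papers, and the names `pairwise_ls_cost`, `fit_pairwise_ls`, and the parameters `lam`, `lr`, `steps` are illustrative assumptions only.

```python
from itertools import combinations


def pairwise_ls_cost(w, X, y, lam):
    """Squared error on pairwise score differences plus an L2 penalty."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
    cost = lam * sum(wk * wk for wk in w)
    for i, j in combinations(range(len(X)), 2):
        cost += ((y[i] - y[j]) - (scores[i] - scores[j])) ** 2
    return cost


def fit_pairwise_ls(X, y, lam=0.1, lr=0.01, steps=500):
    """Gradient descent on the pairwise least-squares cost (linear model).

    Assumption: a linear scorer stands in for the kernel-based predictors
    described in the abstracts.
    """
    d = len(X[0])
    w = [0.0] * d
    pairs = list(combinations(range(len(X)), 2))
    for _ in range(steps):
        scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
        grad = [2.0 * lam * wk for wk in w]  # gradient of the L2 penalty
        for i, j in pairs:
            resid = (y[i] - y[j]) - (scores[i] - scores[j])
            for k in range(d):
                grad[k] -= 2.0 * resid * (X[i][k] - X[j][k])
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data (made up for illustration): utility grows with the first feature.
X = [[1.0, 0.5], [2.0, 0.1], [3.0, 0.9], [4.0, 0.3]]
y = [1.0, 2.0, 3.0, 4.0]
w = fit_pairwise_ls(X, y)

# Item indices ordered from highest to lowest predicted score.
ranking = sorted(range(len(X)),
                 key=lambda i: -sum(a * b for a, b in zip(w, X[i])))
```

On this toy data the learned weights order the items by their first feature, so `ranking` starts with item 3 and ends with item 0. The cost depends only on score differences, which is why it targets the ordering of items rather than their absolute scores.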

Publication Stats

218 Citations
36.74 Total Impact Points


  • 2014
    • TNO
      's-Gravenhage, South Holland, Netherlands
  • 2012
    • Technische Universiteit Eindhoven
      • Department of Electrical Engineering
      Eindhoven, North Brabant, Netherlands
  • 2010-2012
    • Radboud University Nijmegen
      • Institute for Computing and Information Sciences
      • Department of Intelligent Systems
      Nijmegen, Gelderland, Netherlands
  • 2005-2009
    • University of Turku
      • Department of Information Technology
      • Turku Centre for Computing Science, TUCS
      Turku, Province of Western Finland, Finland