Evgeni Tsivtsivadze

TNO, Delft, South Holland, Netherlands


Publications (29) · 11.31 Total Impact

  • Evgeni Tsivtsivadze, Tom Heskes
    ABSTRACT: We propose a novel sparse preference learning/ranking algorithm. Our algorithm approximates the true utility function by a weighted sum of basis functions using the squared loss on pairs of data points, and is a generalization of the kernel matching pursuit method. It can operate both in a supervised and a semi-supervised setting and allows efficient search for multiple, near-optimal solutions. Furthermore, we describe the extension of the algorithm suitable for combined ranking and regression tasks. In our experiments we demonstrate that the proposed algorithm outperforms several state-of-the-art learning methods when taking into account unlabeled data and performs comparably in a supervised learning scenario, while providing sparser solutions.
    07/2013;
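    A minimal sketch of the greedy selection idea in this abstract (matching pursuit over kernel basis functions with a squared loss on pairs); the function name and the simple residual-based selection rule are illustrative assumptions, not the paper's exact procedure:
    ```python
    import numpy as np

    def sparse_rank_mp(K, y, pairs, n_basis=10):
        """Greedily pick kernel basis functions that most reduce the
        squared loss on pairwise score differences (matching pursuit)."""
        n = K.shape[0]
        # incidence matrix: one row per preference pair (i preferred over j)
        P = np.zeros((len(pairs), n))
        for r, (i, j) in enumerate(pairs):
            P[r, i], P[r, j] = 1.0, -1.0
        residual = P @ y                        # target pairwise differences
        selected, coefs = [], []
        for _ in range(n_basis):
            best = (None, None, np.inf)
            for c in range(n):
                if c in selected:
                    continue
                g = P @ K[:, c]                 # pairwise response of basis c
                denom = g @ g
                if denom == 0:
                    continue
                alpha = (g @ residual) / denom  # one-dimensional least squares
                err = residual - alpha * g
                if err @ err < best[2]:
                    best = (c, alpha, err @ err)
            c, alpha, _ = best
            selected.append(c)
            coefs.append(alpha)
            residual = residual - alpha * (P @ K[:, c])
        return selected, np.array(coefs)        # f(x) = sum_c coefs_c * k(x, x_c)
    ```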
  • ABSTRACT: In this paper, an overview of state-of-the-art techniques for premise selection in large-theory mathematics is provided, and new premise selection techniques are introduced. Several evaluation metrics are introduced and compared, and their appropriateness is discussed in the context of automated reasoning in large-theory mathematics. The methods are evaluated on the MPTP2078 benchmark, a subset of the Mizar library, and a 10% improvement is obtained over the best method so far.
    Proceedings of the 6th international joint conference on Automated Reasoning; 06/2012
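    As an illustration of the kind of evaluation metric used in this setting (the paper's own metrics are not reproduced here), recall of the needed premises among the top-k ranked candidates is a common choice:
    ```python
    def recall_at_k(ranked_premises, needed_premises, k):
        """Fraction of the premises actually needed for the proof that
        appear among the top-k premises proposed by the selector."""
        top_k = set(ranked_premises[:k])
        needed = set(needed_premises)
        return len(top_k & needed) / len(needed)

    # Example: 2 of the 3 needed premises appear in the top 5.
    print(recall_at_k(["p1", "p7", "p3", "p9", "p2", "p4"],
                      ["p2", "p3", "p5"], k=5))  # -> 0.666...
    ```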
  • ABSTRACT: Activity-regulated neurotransmission shapes the computational properties of a neuron and involves the concerted action of many proteins. Classical, intuitive working models often assign specific proteins to specific steps in such complex cellular processes, whereas modern systems theories emphasize more integrated functions of proteins. To test how often synaptic proteins participate in multiple steps in neurotransmission, we present a novel probabilistic method to analyze complex functional data from genetic perturbation studies on neuronal secretion. Our method uses a mixture of probabilistic principal component analyzers to cluster genetic perturbations on two distinct steps in synaptic secretion, vesicle priming and fusion, and accounts for the poor standardization between different studies. Clustering data from 121 perturbations revealed that different perturbations of a given protein are often assigned to different steps in the release process. Furthermore, vesicle priming and fusion are inversely correlated for most of those perturbations where a specific protein domain was mutated to create a gain-of-function variant. Finally, two different modes of vesicle release, spontaneous and action potential-evoked release, were affected similarly by most perturbations. These data suggest that the presynaptic protein network has evolved as a highly integrated supramolecular machine, which is responsible for both spontaneous and activity-induced release, with a group of core proteins using different domains to act on multiple steps in the release process.
    PLoS Computational Biology 04/2012; 8(4):e1002450. · 4.87 Impact Factor
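    A much-simplified analogue of the clustering step, using a plain Gaussian mixture (scikit-learn has no mixture-of-PPCA estimator) on hypothetical two-dimensional perturbation effects (priming, fusion); the actual method additionally models study-level noise:
    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # hypothetical effect sizes per genetic perturbation: (priming, fusion)
    X = np.vstack([rng.normal([-1.0, 0.5], 0.3, size=(60, 2)),
                   rng.normal([0.8, -0.7], 0.3, size=(61, 2))])

    gm = GaussianMixture(n_components=2, covariance_type="full",
                         random_state=0).fit(X)
    labels = gm.predict(X)      # hard assignment of perturbations to steps
    resp = gm.predict_proba(X)  # soft responsibilities, as in a mixture model
    ```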
  • ABSTRACT: Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.
    Journal of Automated Reasoning 08/2011; · 0.57 Impact Factor
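    A schematic, kernel-weighted relevance score in the spirit of learning premise selection from proof dependencies; the linear kernel on symbol-occurrence vectors and the scoring rule are illustrative simplifications, not the paper's algorithm:
    ```python
    import numpy as np

    def premise_scores(conjecture_vec, train_vecs, train_deps, premises):
        """Score each candidate premise for a new conjecture by kernel
        similarity to training conjectures whose proofs used it."""
        sims = train_vecs @ conjecture_vec        # linear kernel k(c, c_i)
        scores = {}
        for p in premises:
            used = np.array([p in deps for deps in train_deps], dtype=float)
            scores[p] = float(sims @ used)        # sum_i k(c, c_i) [p in deps_i]
        return sorted(scores, key=scores.get, reverse=True)

    # toy symbol-occurrence features for three known proofs
    train_vecs = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]], dtype=float)
    train_deps = [{"p1", "p2"}, {"p2"}, {"p3"}]
    print(premise_scores(np.array([1.0, 0.0, 1.0]),
                         train_vecs, train_deps, ["p1", "p2", "p3"]))
    ```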
  • Conference Paper: Learning2Reason.
    ABSTRACT: In recent years, large corpora of formally expressed knowledge have become available in the fields of formal mathematics, software verification, and real-world ontologies. The Learning2Reason project aims to develop novel machine learning methods for computer-assisted reasoning on such corpora. Our global research goals are to provide good methods for selecting relevant knowledge from large formal knowledge bases, and to combine them with automated reasoning methods.
    Intelligent Computer Mathematics - 18th Symposium, Calculemus 2011, and 10th International Conference, MKM 2011, Bertinoro, Italy, July 18-23, 2011. Proceedings; 01/2011
  • ABSTRACT: Situations in which only a limited amount of labeled data and a large amount of unlabeled data are available to the learning algorithm are typical for many real-world problems. To make use of unlabeled data in preference learning problems, we propose a semisupervised algorithm that is based on the multiview approach. Our algorithm, which we call Sparse Co-RankRLS, minimizes a least-squares approximation of the ranking error and is formulated within the co-regularization framework. It operates by constructing a ranker for each view and by choosing ranking prediction functions that minimize the disagreement among all of the rankers on the unlabeled data. Our experiments, conducted on a real-world dataset, show that the inclusion of unlabeled data can improve prediction performance significantly. Moreover, our semisupervised preference learning algorithm has linear complexity in the number of unlabeled data items, making it applicable to large datasets.
    01/2011;
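    A schematic of the co-regularized objective for two views, assuming an all-pairs preference graph on the labeled data; the variable names and weighting are assumptions, and a real solver would minimize this in closed form rather than merely evaluate it:
    ```python
    import numpy as np

    def corank_objective(a1, a2, K1, K2, y, lam, mu, n_lab):
        """Co-regularized ranking loss: per-view pairwise least-squares
        ranking error on labeled data, RLS regularizers, and a penalty
        on the rankers' disagreement over the unlabeled data."""
        L = n_lab * np.eye(n_lab) - np.ones((n_lab, n_lab))  # all-pairs Laplacian
        loss = 0.0
        f1, f2 = K1 @ a1, K2 @ a2
        for f, K, a in ((f1, K1, a1), (f2, K2, a2)):
            d = f[:n_lab] - y
            loss += d @ L @ d          # pairwise ranking error on labeled part
            loss += lam * (a @ K @ a)  # RLS regularizer for this view
        du = f1[n_lab:] - f2[n_lab:]
        loss += mu * (du @ du)         # disagreement on unlabeled data
        return loss
    ```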
  • Evgeni Tsivtsivadze, Josef Urban, Herman Geuvers, Tom Heskes
    Proceedings of the Eleventh SIAM International Conference on Data Mining, SDM 2011, April 28-30, 2011, Mesa, Arizona, USA; 01/2011
  • Lecture Notes in Computer Science 01/2010; · 0.51 Impact Factor
  • ABSTRACT: In different fields like decision making, psychology, game theory, and biology, it has been observed that paired-comparison data like preference relations defined by humans and animals can be intransitive. Intransitive relations cannot be modeled with existing machine learning methods like ranking models, because these models exhibit strong transitivity properties. More specifically, in a stochastic context, where the reciprocity property often characterizes probabilistic relations such as choice probabilities, it has been formally shown that ranking models always satisfy the well-known strong stochastic transitivity property. Given this limitation of ranking models, we present a new kernel function that, together with the regularized least-squares algorithm, is capable of inferring intransitive reciprocal relations in problems where transitivity violations cannot be considered as noise. In this approach it is the kernel function that defines the transition from learning transitive to learning intransitive relations, and the Kronecker product is introduced for representing the latter type of relations. In addition, we empirically demonstrate on two benchmark problems, one in game theory and one in theoretical biology, that our algorithm outperforms methods not capable of learning intransitive reciprocal relations.
    European Journal of Operational Research 01/2010; · 2.04 Impact Factor
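    A sketch of a pairwise kernel built from Kronecker products, the device named above for representing relations between object pairs; the anti-symmetric combination shown is one natural choice for reciprocal relations, and the details are assumptions rather than the paper's exact construction:
    ```python
    import numpy as np

    def rbf(x, z, gamma=1.0):
        return np.exp(-gamma * np.sum((x - z) ** 2))

    def pair_kernel(a, b, c, d, antisymmetric=True):
        """Kronecker-product kernel between object pairs (a,b) and (c,d)."""
        if antisymmetric:  # suits reciprocal relations Q(a,b) = 1 - Q(b,a)
            return 0.5 * (rbf(a, c) * rbf(b, d) - rbf(a, d) * rbf(b, c))
        return rbf(a, c) * rbf(b, d)

    # kernel matrix over a toy set of compared pairs, usable with RLS
    objs = [np.array([0.0]), np.array([1.0]), np.array([2.0])]
    pairs = [(0, 1), (1, 2), (2, 0)]
    G = np.array([[pair_kernel(objs[i], objs[j], objs[k], objs[l])
                   for (k, l) in pairs] for (i, j) in pairs])
    ```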
  • ABSTRACT: In this paper, we introduce a framework for regularized least-squares (RLS) type ranking cost functions and propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and standard RLS regression as the baselines. RankRLS outperforms standard RLS regression and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.
    Machine Learning 01/2009; · 1.47 Impact Factor
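    A minimal sketch of the closed-form training behind RankRLS's computational advantages, assuming all object pairs are compared (which gives the complete-graph Laplacian below); regularization and names are illustrative:
    ```python
    import numpy as np

    def train_rankrls(K, y, lam=1.0):
        """Dual RankRLS: minimize (Ka - y)^T L (Ka - y) + lam * a^T K a,
        where L is the Laplacian of the (here complete) preference graph."""
        n = len(y)
        L = n * np.eye(n) - np.ones((n, n))
        # setting the gradient to zero gives (L K + lam I) a = L y
        return np.linalg.solve(L @ K + lam * np.eye(n), L @ y)

    def rank_scores(K_test_train, a):
        return K_test_train @ a  # rank test points by predicted utility
    ```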
  • Evgeni Tsivtsivadze, Botond Cseke, Tom Heskes
    ABSTRACT: We propose the kernel principal component ranking algorithm (KPCRank) for learning preference relations. The algorithm can be considered as an extension of nonlinear principal component regression applicable to the preference learning task. It is particularly suitable for learning from noisy datasets where a lower dimensional data representation preserves the most expressive features. In many cases near-linear dependence of regressors (multicollinearity) can notably decrease the performance of the learning algorithm; however, KPCRank can effectively deal with this situation. This is accomplished by projecting the data onto the p principal components in the feature space defined by a positive definite kernel and consecutive learning of the ranking function. Despite the fact that the number of pairwise preferences is quadratic, the training time of KPCRank scales linearly with the number of data points in the training set and is equal to that of principal component regression. We compare the algorithm to several ranking and regression methods, including probabilistic regression on pairwise comparison data. Our experiments demonstrate that the performance of KPCRank is better than that of the baseline methods when learning to rank from data corrupted by noise.
    Solid State Communications 01/2009;
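    A rough sketch of the two stages described above (kernel PCA projection, then least-squares ranking on the components), assuming an all-pairs preference graph; the small ridge term and variable names are assumptions:
    ```python
    import numpy as np

    def kpcrank(K, y, p=5):
        """Project onto the top-p kernel principal components, then fit
        a least-squares ranker on the projected data."""
        n = len(y)
        H = np.eye(n) - np.ones((n, n)) / n     # centering matrix
        vals, vecs = np.linalg.eigh(H @ K @ H)  # kernel PCA
        top = np.argsort(vals)[::-1][:p]
        Z = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
        L = n * np.eye(n) - np.ones((n, n))     # all-pairs ranking Laplacian
        w = np.linalg.solve(Z.T @ L @ Z + 1e-9 * np.eye(p), Z.T @ L @ y)
        return w, Z                             # training scores: Z @ w
    ```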
  • ABSTRACT: We propose a framework for constructing kernels that take advantage of local correlations in sequential data. The kernels designed using the proposed framework measure parse similarities locally, within a small window constructed around each matching feature. Furthermore, we propose to incorporate positional information inside the window and consider different ways to do this. We applied the kernels together with the regularized least-squares (RLS) algorithm to the task of dependency parse ranking, using a dataset containing parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with kernels incorporating positional information performs better than RLS with the baseline kernel functions. This performance gain is statistically significant.
    Applied Intelligence 01/2009; · 1.85 Impact Factor
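    Two of the ways positional information inside a window might be incorporated, shown as interchangeable weighting functions; these are illustrative readings of the idea, not the paper's exact definitions:
    ```python
    def decay_weight(offset, decay=0.5):
        """Emphasize matches near the window center."""
        return decay ** abs(offset)

    def uniform_weight(offset):
        """Treat every position in the window equally."""
        return 1.0

    def window_similarity(wx, wy, center, weight=decay_weight):
        """Position-by-position comparison of two feature windows."""
        return sum(weight(i - center)
                   for i, (fx, fy) in enumerate(zip(wx, wy)) if fx == fy)

    # same matches, different positional emphasis
    print(window_similarity(list("abcde"), list("abxde"), center=2))  # 1.5
    print(window_similarity(list("abcde"), list("abxde"), center=2,
                            weight=uniform_weight))                   # 4.0
    ```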
  • Frontiers in Neuroinformatics 01/2009;
  • ABSTRACT: During the past decade, kernel methods have proved to be successful in different text analysis tasks. There are several reasons that make kernel-based methods applicable to many real-world problems, especially in domains where data is not naturally represented in vector form. Firstly, instead of manual construction of the feature space for the learning task, kernel functions provide an alternative way to design useful features automatically, therefore allowing very rich representations. Secondly, kernels can be designed to incorporate a priori knowledge about the domain. This property allows one to notably improve the performance of general learning methods and to adapt them simply to the specific problem. Finally, kernel methods are naturally applicable in situations where the data representation is not in a vectorial form, thus avoiding an extensive preprocessing step. In this chapter, we present the main ideas behind kernel methods in general and kernels for text analysis in particular, and provide an example of designing a feature space for the parse ranking problem with different kernel functions.
    01/2008;
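    A toy example of the point about non-vectorial data: a kernel evaluated directly on raw strings, with no explicit feature vectors ever materialized (a minimal bag-of-words kernel; real text kernels are far richer):
    ```python
    from collections import Counter

    def bow_kernel(s, t):
        """Inner product in bag-of-words feature space, computed without
        constructing the (potentially huge) feature vectors."""
        cs, ct = Counter(s.lower().split()), Counter(t.lower().split())
        return sum(count * ct[word] for word, count in cs.items())

    print(bow_kernel("the cat sat on the mat", "the dog sat"))  # 2*1 + 1*1 = 3
    ```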
  • Evgeni Tsivtsivadze, Jorma Boberg, Tapio Salakoski
    ABSTRACT: We propose kernels that take advantage of local correlations in sequential data and present their application to the protein classification problem. Our locality kernels measure protein sequence similarities within a small window constructed around matching amino acids. The kernels incorporate positional information of the amino acids inside the window and allow a range of position-dependent similarity evaluations. We use these kernels with the regularized least-squares (RLS) algorithm for protein classification on the SCOP database. Our experiments demonstrate that the locality kernels perform significantly better than the spectrum and the mismatch kernels. When used together with RLS, the performance of the locality kernels is comparable with some state-of-the-art methods of protein classification and remote homology detection.
    01/2007;
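    For contrast with the locality kernels, a minimal version of the spectrum kernel used as a baseline above: the inner product of k-mer counts, which ignores where in the sequence a k-mer occurs:
    ```python
    from collections import Counter

    def spectrum_kernel(s, t, k=3):
        """Inner product of the k-mer count vectors of two sequences."""
        cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
        ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
        return sum(count * ct[kmer] for kmer, count in cs.items())

    print(spectrum_kernel("MKVLAA", "KVLAAG", k=3))  # shared: KVL, VLA, LAA -> 3
    ```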
  • ABSTRACT: We propose a Locality-Convolution (LC) kernel in application to dependency parse ranking. The LC kernel measures parse similarities locally, within a small window constructed around each matching feature. Inside the window it makes use of a position-sensitive function to take into account the order of the feature appearance. The similarity between two windows is calculated by computing the product of their common attributes, and the kernel value is the sum of the window similarities. We applied the introduced kernel together with the Regularized Least-Squares (RLS) algorithm to a dataset containing dependency parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with the LC kernel performs better than the baseline method. The results outline the importance of local correlations and the order of feature appearance within the parse. Final validation demonstrates a statistically significant increase in parse ranking performance.
    01/2006;
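    A direct reading of the computation described above (a window around each matching feature, a position-sensitive weight inside the window, and the kernel value as the sum of window similarities); the window size and decay are assumed parameters, not the paper's settings:
    ```python
    def lc_kernel(x, y, w=2, decay=0.5):
        """Locality-Convolution-style kernel over two feature sequences:
        for every matching feature, compare the surrounding windows with
        a position-sensitive weight, then sum the window similarities."""
        total = 0.0
        for i, fx in enumerate(x):
            for j, fy in enumerate(y):
                if fx != fy:
                    continue  # windows are built around matching features only
                for d in range(-w, w + 1):
                    if (0 <= i + d < len(x) and 0 <= j + d < len(y)
                            and x[i + d] == y[j + d]):
                        total += decay ** abs(d)  # order-sensitive weighting
        return total

    print(lc_kernel(list("abcd"), list("abed")))  # 4.5
    ```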

Publication Stats

131 Citations
11.31 Total Impact Points

Institutions

  • 2012
    • TNO
      Delft, South Holland, Netherlands
  • 2010–2012
    • Radboud University Nijmegen
      • Institute for Computing and Information Sciences
      • Department of Intelligent Systems
      Nijmegen, Gelderland, Netherlands
  • 2005–2009
    • University of Turku
      • Department of Information Technology
      Turku, Province of Western Finland, Finland