
Yuri Kalnishkan - Royal Holloway, University of London
About
47 Publications
2,195 Reads
239 Citations
Publications (47)
Financial organisations such as brokers face a significant challenge in servicing the investment needs of thousands of their traders worldwide. This task is further compounded since individual traders will have their own risk appetite and investment goals. Traders may look to capture short-term trends in the market which last only seconds to minute...
In this paper we apply methods of prediction with expert advice to real-world foreign exchange trading data with the aim of finding effective investment strategies. We start with the framework of the long-short game, introduced by Vovk and Watkins (1998), and then propose modifications aimed at improving the performance with respect to standard por...
This paper proposes a method for determining the resolution for the processing of irregularly-sampled time series data to provide a balanced perspective of agents’ behaviour. The behaviour is described as a collection of prolonged events, which are characterised by start/open and end/close times in addition to other useful attributes. We propose th...
In this paper, prediction with expert advice is surveyed focusing on Vovk’s Aggregating Algorithm. The established theory as well as extensions developed in the recent decade are considered. The paper is aimed at practitioners and covers important application scenarios.
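The Aggregating Algorithm surveyed here can be illustrated, for the mixable log-loss game with a finite pool of experts and binary outcomes, by a short exponential-weights sketch (the function name and the binary setting are illustrative choices, not taken from the paper):

```python
import math

def aggregating_algorithm(expert_preds, outcomes, eta=1.0):
    """Merge expert probability forecasts with exponentially decaying weights.

    A minimal sketch for the log-loss game: each expert's weight is
    multiplied by exp(-eta * loss) after every round, and the weighted
    average of the experts' forecasts serves as the merged prediction.
    expert_preds: per-round lists of per-expert probabilities in (0, 1)
    outcomes: binary outcomes in {0, 1}
    """
    n = len(expert_preds[0])
    log_w = [0.0] * n  # log-weights; uniform prior over experts
    master_preds = []
    for preds, y in zip(expert_preds, outcomes):
        shift = max(log_w)  # subtract the max for numerical stability
        w = [math.exp(lw - shift) for lw in log_w]
        s = sum(w)
        p = sum(wi * pi for wi, pi in zip(w, preds)) / s
        master_preds.append(p)
        for i, pi in enumerate(preds):
            loss = -math.log(pi if y == 1 else 1 - pi)  # log loss
            log_w[i] -= eta * loss
    return master_preds
```

For log loss with eta = 1 the weighted average of the experts' forecasts is a valid substitution prediction, so the merged forecaster's cumulative loss exceeds the best expert's by at most ln N.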
We obtain a lower bound for an algorithm predicting finite-dimensional distributions (i.e., points from a simplex) under Kullback-Leibler loss. The bound holds w.r.t. the class of softmax linear predictors. We then show that the bound is asymptotically matched by the Bayesian universal algorithm.
Interval prediction often provides more useful information compared to a simple point forecast. For example, in renewable energy forecasting, while the initial focus has been on deterministic predictions, the uncertainty observed in energy generation raises an interest in producing probabilistic forecasts. One aims to provide prediction intervals s...
Here we introduce a data staging algorithm designed to reconstruct multiple time series databases into a partitioned and regularised database. The Data Aggregation Partition Reduction Algorithm, or DAPRA for short, was designed to solve the practical issue of effective and meaningful visualisation of irregularly sampled time series data. This paper...
We consider the framework of competitive prediction, where one provides guarantees compared to other predictive models that are called experts. We propose a universal algorithm predicting finite-dimensional distributions, i.e. points from a simplex, under Kullback–Leibler game. In the standard framework for prediction with expert advice, the perfor...
We consider the framework of competitive prediction when one provides guarantees compared to other predictive models that are called experts. We propose the algorithm that combines point predictions of an infinite pool of linear experts and outputs probability forecasts in the form of cumulative distribution functions. We evaluate the quality of pr...
This paper formulates a protocol for prediction of packs, which is a special case of on-line prediction under delayed feedback. Under the prediction of packs protocol, the learner must make a few predictions without seeing the respective outcomes and then the outcomes are revealed in one go. The paper develops the theory of prediction with expert a...
This paper formulates the protocol for prediction of packs, which is a special case of prediction under delayed feedback. Under this protocol, the learner must make a few predictions without seeing the outcomes and then the outcomes are revealed. We develop the theory of prediction with expert advice for packs. By applying Vovk's Aggregating Algorithm...
The paper presents a competitive prediction-style upper bound on the square loss of the Aggregating Algorithm for Regression with Changing Dependencies in the linear case. The algorithm is able to compete with a sequence of linear predictors provided the sum of squared Euclidean norms of differences of regression coefficient vectors grows at a subl...
Learning with expert advice as a scheme of on-line learning has been very successfully applied to various learning problems due to its strong theoretical basis. In this paper, for the purpose of time series prediction, we investigate the application of the Aggregating Algorithm, which is a generalisation of the famous weighted majority algorithm. The resu...
Predictive complexity is a generalization of Kolmogorov complexity motivated by an on-line prediction scenario. It quantifies the "unpredictability" of a sequence in a particular prediction environment. This chapter surveys key results on predictive complexity for games with finitely many outcomes. The issues of existence, non-existence, uniqueness...
The paper explores connections between asymptotic complexity and generalised entropy. Asymptotic complexity of a language (a language is a set of finite or infinite strings) is a way of formalising the complexity of predicting the next element in a sequence: it is the loss per element of a strategy asymptotically optimal for that language. Generali...
In this paper we describe an unsupervised, deterministic algorithm for segmenting DJ-mixed Electronic Dance Music streams (for example: podcasts, radio shows, live events) into their respective tracks. We attempt to reconstruct boundaries as close as possible to what a human domain expert would engender. The goal of DJ-mixing is to render track bou...
The paper describes an application of the Aggregating Algorithm to the problem of regression. It generalizes earlier results concerned with plain linear regression to kernel techniques and presents an on-line algorithm which performs nearly as well as any oblivious kernel predictor. The paper contains the derivation of an estimate on the performance of...
This paper derives an identity connecting the square loss of ridge regression in on-line mode with the loss of the retrospectively best regressor. Some corollaries about the properties of the cumulative loss of on-line ridge regression are also obtained.
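The on-line mode referred to here means predicting each outcome with the ridge solution fitted to the pairs seen so far, then observing the true outcome. A minimal sketch of that protocol (function name and the choice of a Sherman-Morrison update are mine, not from the paper):

```python
import numpy as np

def online_ridge(xs, ys, a=1.0):
    """On-line ridge regression: at step t, predict y_t with the ridge
    solution fitted to the first t-1 pairs, then observe y_t.

    `a` is the ridge parameter. The Sherman-Morrison formula keeps the
    inverse Gram matrix up to date in O(d^2) per step.
    """
    d = xs.shape[1]
    A_inv = np.eye(d) / a      # inverse of aI + sum of x x^T seen so far
    b = np.zeros(d)            # sum of y x seen so far
    preds = []
    for x, y in zip(xs, ys):
        preds.append(float(x @ A_inv @ b))  # predict before seeing y
        Ax = A_inv @ x
        A_inv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)  # Sherman-Morrison
        b += y * x
    return np.array(preds)
```

Each prediction agrees with the batch ridge solution computed from scratch on the prefix, which is what makes identities between the on-line cumulative loss and the retrospectively best regressor's loss meaningful.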
We apply the method of defensive forecasting, based on the use of game-theoretic supermartingales, to prediction with expert advice. In the traditional setting of a countable number of experts and a finite number of outcomes, the Defensive Forecasting Algorithm is very close to the well-known Aggregating Algorithm. Not only the performance guarante...
The paper deals with on-line regression settings with signals belonging to a Banach lattice. Our algorithms work in a semi-online setting where all the inputs are known in advance and outcomes are unknown and given step by step. We apply the Aggregating Algorithm to construct a prediction method whose cumulative loss over all the input vectors is c...
Multi-class classification is one of the most important tasks in machine learning. In this paper we consider two online multi-class classification problems: classification by a linear model and by a kernelized model. The quality of predictions is measured by the Brier loss function. We suggest two computationally efficient algorithms to work with t...
This paper resolves the problem of predicting as well as the best expert up to an additive term of the order o(n), where n is the length of a sequence of letters from a finite alphabet. We call the games that permit this weakly mixable and give a geometrical characterisation of the class of weakly mixable games. Weak mixability turns out to be equi...
This paper deals with the problem of making predictions in the online mode of learning where the dependence of the outcome y_t on the signal x_t can change with time. The Aggregating Algorithm (AA) is a technique that optimally merges experts from a pool, so that the resulting strategy suffers a cumulative loss that is almost as good as that of the...
Consider the online regression problem where the dependence of the outcome y_t on the signal x_t changes with time. Standard regression techniques, like Ridge Regression, do not perform well in tasks of this type. We propose two methods to handle this problem: WeCKAAR, a simple modification of an existing regression technique, and KAARCh, an applica...
In this paper the concept of asymptotic complexity of languages is introduced. This concept formalises the notion of learnability in a particular environment and generalises Lutz and Fortnow’s concepts of predictability and dimension. Then asymptotic complexities in different prediction environments are compared by describing the set of all pairs o...
Kernel Ridge Regression (KRR) and the recently developed Kernel Aggregating Algorithm for Regression (KAAR) are regression methods based on Least Squares. KAAR has theoretical advantages over KRR since a bound on its square loss for the worst case is known that does not hold for KRR. This bound does not make any assumptions about the underlying p...
Kernel Ridge Regression (KRR) and the Kernel Aggregating Algorithm for Regression (KAAR) are existing regression methods based on Least Squares. KRR is a well established regression technique, while KAAR is the result of relatively recent work. KAAR is similar to KRR but with some extra regularisation that makes it predict better when the...
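The relationship between the two methods can be sketched compactly (the RBF kernel is an arbitrary illustrative choice, and KAAR is implemented via the known device of appending the test point to the training set with target 0; function names are mine):

```python
import numpy as np

def rbf(A, B):
    """Gaussian RBF kernel matrix (illustrative choice of kernel)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)
    return np.exp(-0.5 * d2)

def krr_predict(X, y, x, a=1.0):
    """Kernel Ridge Regression: f(x) = k(x)^T (K + aI)^{-1} y."""
    K = rbf(X, X)
    alpha = np.linalg.solve(K + a * np.eye(len(y)), y)
    return float((rbf(x[None, :], X) @ alpha)[0])

def kaar_predict(X, y, x, a=1.0):
    """KAAR's extra regularisation: include the test point in the Gram
    matrix with target 0, then run KRR and predict at x."""
    X_aug = np.vstack([X, x[None, :]])
    y_aug = np.append(y, 0.0)
    K = rbf(X_aug, X_aug)
    alpha = np.linalg.solve(K + a * np.eye(len(y_aug)), y_aug)
    return float((rbf(x[None, :], X_aug) @ alpha)[0])
```

The extra regularisation shows up as shrinkage: KAAR's forecast never exceeds KRR's in absolute value, which is what makes it more conservative when the data are noisy.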
This paper discusses learning style theories with a focus on the VARK model and Honey-Mumford questionnaires. The traditional views are described and a restatement in terms of abilities or skills is proposed. Then the learning style methodology is applied to teaching computer science. A hypothesis is made that most of computer science students sha...
The paper introduces a way of re-constructing a loss function from predictive complexity. We show that a loss function and expectations of the corresponding predictive complexity w.r.t. the Bernoulli distribution are related through the Legendre transformation. It is shown that if two loss functions specify the same complexity then they are equival...
It is well known that there exists a universal (i.e., optimal to within an additive constant if allowed to work infinitely long) algorithm for lossless data compression (Kolmogorov, Levin). The game of lossless compression is an example of an on-line prediction game; for some other on-line prediction games (such as the simple prediction game) a uni...
It is well known in the theory of Kolmogorov complexity that most strings cannot be compressed; more precisely, only exponentially few (Θ(2^(n−m))) strings of length n can be compressed by m bits. This paper extends the 'incompressibility' property of Kolmogorov complexity to the 'unpredictability' property of predictive complexity. The 'unpredict...
It is well known in the theory of Kolmogorov complexity that most strings cannot be compressed; more precisely, only exponentially few (Θ(2^(n−m))) binary strings of length n can be compressed by m bits. This paper extends the 'incompressibility' property of Kolmogorov complexity to the 'unpredictability' property of predictive complexity. The 'unpre...
This paper shows that if the curvature of the boundary of the set of superpredictions for a game vanishes in a nontrivial way, then there is no predictive complexity for the game. This is the first result concerning the absence of complexity for games with convex sets of superpredictions. The proof is further employed to show that for some games th...
This paper investigates the behaviour of the constant c(β) from the Aggregating Algorithm. Some conditions for mixability are derived and it is shown that for many non-mixable games c(β) still converges to 1. The condition c(β) → 1 is shown to imply the existence of weak predictive complexity and it is proved that many games specify complexity up to...
In this paper we introduce a general method of establishing tight linear inequalities between different types of predictive complexity. Predictive complexity is a generalisation of Kolmogorov complexity and it bounds the ability of an algorithm to predict elements of a sequence. Our method relies upon probabilistic considerations and allows us to d...
The paper introduces a way of re-constructing a loss function from predictive complexity. We show that a loss function and expectations of the corresponding predictive complexity w.r.t. the Bernoulli distribution are related through the Legendre transformation. It is shown that if two loss functions specify the same complexity then they are equival...
Predictive complexity is a generalisation of Kolmogorov complexity. In this paper we point out some properties of predictive complexity connected with the Legendre (--Young--Fenchel) transformation. Our main result is that mixability is necessary for the existence of conditional predictive complexity (it is known to be sufficient under very mild as...
In this paper an application of the Complexity Approximation Principle to non-linear regression is suggested. We combine this principle with the approximation of the complexity of a real-valued vector parameter proposed by Rissanen and thus derive a method for the choice of parameters in non-linear regression.
In this paper we introduce a general method that allows us to prove tight linear inequalities between different types of predictive complexity, thus generalising our previous results. The method relies upon probabilistic considerations and allows us to describe (using geometrical terms) the sets of coefficients which correspond to true inequalities....
In this paper we consider and compare different types of predictive complexity, which bounds the ability of any algorithm to predict elements of a sequence. Particular types of predictive complexity are specified by loss functions we use to measure the deviations between predictions and actual outcomes. We consider the logarithmic and square loss f...
In this paper, we describe audio "texture" features based on the Short Time Fourier Transform (STFT). We use these features in combination with three popular learning machine algorithms to classify spoken voice segments of a popular Electronic Dance Music radio show "A State of Trance", which is produced by the current world number 1 DJ, Armin van...
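STFT-based "texture" features of this kind can be sketched briefly (the frame and hop sizes and the per-bin mean/std summary are illustrative assumptions, not the paper's exact feature set):

```python
import numpy as np

def stft_texture_features(signal, frame=1024, hop=512):
    """Compute an STFT magnitude spectrogram and simple 'texture'
    statistics: the per-frequency-bin mean and standard deviation of the
    magnitudes across frames. An illustrative sketch only."""
    window = np.hanning(frame)
    n_frames = 1 + (len(signal) - frame) // hop
    mags = np.empty((n_frames, frame // 2 + 1))
    for i in range(n_frames):
        seg = signal[i * hop : i * hop + frame] * window
        mags[i] = np.abs(np.fft.rfft(seg))  # magnitude spectrum per frame
    # summarise the spectrogram's texture over time
    return np.concatenate([mags.mean(axis=0), mags.std(axis=0)])
```

Fixed-length vectors like this can then be fed to standard classifiers to separate spoken-voice segments from music.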