# Argimiro Arratia

Universitat Politècnica de Catalunya | UPC · Department of Computer Science

PhD Mathematics, MSc Comp. Sci. from U. Wisconsin

## About

Publications: 84

Reads: 16,190


Citations: 346


Introduction

My scientific career comprises two stages (so far): I began doing research and publishing papers on Finite Model Theory and Computational Complexity (i.e. Descriptive Complexity), but since 2010 I have focused my research and publication efforts on Financial Time Series Analysis, Optimization Heuristics and Mathematics of Finance, with an emphasis on their algorithmic and numerical aspects (i.e. Computational Finance).

Additional affiliations

September 2009 - present

January 2009 - September 2009

September 2003 - September 2008

## Publications

Publications (84)

Neural Ordinary Differential Equations (NODE) have emerged as a novel approach to deep learning in which, instead of specifying a discrete sequence of hidden layers, one parameterizes the derivative of the hidden state using a neural network [1]. The solution to the underlying dynamical system is a flow, and various works have explored the universality...

We present clustAnalytics, an R package available now on CRAN, which provides methods to validate the results of clustering algorithms on unweighted and weighted networks, particularly for the cases where the existence of a community structure is unknown. clustAnalytics comprises a set of criteria for assessing the significance and stability of a c...

In [1], Newman et al. introduced the Reduced Mutual Information (RMI), a measure of the similarity between two partitions of a set that is useful in clustering and community detection. The computation of RMI requires counting the number of contingency tables with fixed row and column sums, a #P-complete problem, for which the authors suggest using analyti...

Neural Ordinary Differential Equations (NODE) have emerged as a novel approach to deep learning in which, instead of specifying a discrete sequence of hidden layers, one parameterizes the derivative of the hidden state using a neural network. The solution to the underlying dynamical system is a flow, and various papers have explored the universality of...

In this work we address the following problem: having chosen a well-diversified portfolio, we show how to improve on its return while maintaining the diversification. To achieve this boost in return, we construct a neighborhood of the well-diversified portfolio and find a portfolio that maximizes the return in that neighborhood. For that we use...

We study potential biases of popular cluster quality metrics, such as conductance or modularity. We propose a method that uses both stochastic and preferential attachment block model constructions to generate networks with preset community structures, to which quality metrics will be applied. These models also allow us to generate multi-level struc...

Background: The main goal of this work is to estimate the actual number of cases of Covid-19 in Spain in the period 01-31-2020 to 06-01-2020 by Autonomous Communities. Based on these estimates, this work allows us to accurately re-estimate the lethality of the disease in Spain, taking into account unreported cases. Methods: A hierarchical Bayesian mode...

We provide a systematic approach to validate the results of clustering methods on weighted networks, in particular for the cases where the existence of a community structure is unknown. Our validation of clustering comprises a set of criteria for assessing their significance and stability. To test for cluster significance, we introduce a set of com...

Although models for count data with over-dispersion have been widely considered in the literature, models for under-dispersion (the opposite phenomenon) have received less attention, as under-dispersion is relatively common only in particular research fields such as biodosimetry and ecology. The Good distribution is a flexible alternative for modelling count...

The problem of dealing with misreported data is very common in a wide range of contexts for different reasons. The current situation caused by the Covid-19 worldwide pandemic is a clear example, where the data provided by official sources were not always reliable due to data collection issues and to the high proportion of asymptomatic cases. In thi...

In early 2018 Bitcoin prices peaked at US$20,000 and, almost two years later, we are still debating whether cryptocurrencies can actually become a currency for everyday life. From the economic point of view, and playing in the field of behavioral finance, this paper analyses the relation between Bitcoin price and the search interest on...

This chapter describes the basic mechanics for building a forecasting model that uses as input sentiment indicators derived from textual data. In addition, as we focus our target of predictions on financial time series, we present a set of stylized empirical facts describing the statistical properties of lexicon-based sentiment indicators extracted...

A solution to a portfolio optimization problem is always conditioned by constraints on the initial capital and the price of the available market assets. If a risk neutral measure is known, then the price of each asset is the discounted expected value of the asset’s price under this measure. But if the market is incomplete, the risk neutral measure...

The present paper introduces a new model used to study and analyse the reported epidemic data for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from Spain. This is a Hidden Markov Model whose hidden layer is a regeneration process with Poisson immigration, Po-INAR(1), together with a mechanism that allows the estimation of the under-repor...

It has been recently shown that a deep neural network with i.i.d. random parameters is equivalent to a Gaussian process in the limit of infinite network width. The Gaussian process associated to the neural network is fully described by a recursive covariance kernel determined by the architecture of the network, and which is expressed in terms of ex...

This paper introduces and extensively explores a forecasting procedure based on multivariate dynamic kernels to re-examine—under a non-linear, kernel methods framework—the experimental tests reported by Welch and Goyal (Rev Financ Stud 21(4):1455–1508, 2008) showing that several variables proposed in the finance literature are of no use as exogenou...

The present paper introduces a new model used to study and analyse the reported epidemic data for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from Spain. This is a Hidden Markov Model whose hidden layer is a regeneration process with Poisson immigration, Po-INAR(1), together with a mechanism that allows the estimation of the under-repor...

The main goal of this work is to estimate the actual number of cases of Covid-19 in Spain in the period 01-31-2020 / 06-01-2020 by Autonomous Communities. Based on these estimates, this work allows us to accurately re-estimate the lethality of the disease in Spain, taking into account unreported cases. A hierarchical Bayesian model recently propose...

We present an attempt to quantify the magnitude of under-reporting in the daily cases of CoVID-19 with data from Catalunya and Uruguay.

Convolutional Neural Networks (CNN) are best known as good image classifiers. This model has recently been used for financial forecasting. The purpose of this work is to show that converting financial information into images and feeding these financial-image representations to the CNN yields an improvement in classification.

A stop-loss rule is a risk management tool whereby the investor predefines some condition that, upon being triggered by market dynamics, implies the liquidation of her outstanding position. Such a tool is widely used by practitioners in financial markets with the hope of improving their investment performance by cutting losses and consolidating gai...

In a market with frictions, bid and ask prices are described by sublinear pricing functionals, which can be defined recursively using coherent risk measures. We prove the convergence of bid and ask prices for various European and American possibly path-dependent options, in particular plain vanilla, Asian, lookback and barrier options in a binomia...

It is shown in this paper how a solution for a combinatorial problem obtained from applying the greedy algorithm is guaranteed to be optimal for those instances of the problem that, under an appropriate algebraic representation, satisfy the Cohen-Macaulay property known for rings and modules in Commutative Algebra. The choice of representation for...

Given any stationary time series $\{X_n : n \in \mathbb{Z}\}$ satisfying an ARMA(p, q) model for arbitrary p and q with infinitely divisible innovations, we construct a continuous time stationary process $\{x_t : t \in \mathbb{R}\}$ such that the distribution of $\{x_n : n \in \mathbb{Z}\}$, the process sampled at discrete time, coincides with the distribution of $\{X_n\}$. In particular the aut...

This work presents a content-based recommender system for machine learning classifier algorithms. Given a new data set, a recommendation of what classifier is likely to perform best is made based on classifier performance over similar known data sets. This similarity is measured according to a data set characterization that includes several state-o...

This paper introduces a forecasting procedure based on multivariate dynamic kernels to re-examine, under a non-linear framework, the experimental tests reported by Welch and Goyal showing that several variables proposed in the academic literature are of no use to predict the equity premium under linear regressions. For this approach kernel function...

We present a new construction of continuous ARMA processes based on iterating an Ornstein–Uhlenbeck operator \(\mathcal{O}\mathcal{U}_{\kappa }\) that maps a random variable y(t) onto \(\mathcal{O}\mathcal{U}_{\kappa }y(t) =\int _{ -\infty }^{t}\mathrm{e}^{-\kappa (t-s)}dy(s)\). This construction resembles the procedure to build an AR(p) from an A...

An Ornstein-Uhlenbeck (OU) process can be considered as a continuous time interpolation of the discrete time AR(1) process. Departing from this fact, we analyse in this work the effect of iterating OU treated as a linear operator that maps a Wiener process onto an Ornstein-Uhlenbeck process, so as to build a family of higher order Ornstein-Uhlenbeck...

We present an improvement of an estimator of causality in financial time series via transfer entropy, which includes the side information that may affect the cause-effect relation in the system, i.e. a conditional information-transfer based causality. We show that for weakly stationary time series the conditional transfer entropy measure is non-neg...

This paper proposes an improvement to the method for clustering exchange rates given by D. J. Fenn et al. in Quantitative Finance, 12 (10) 2012, pp. 1493-1520. To deal with the potentially non-linear nature of currency time series dependence, we propose two alternative similarity metrics to use instead of the one used in the aforementioned paper bas...

We propose a forecasting procedure based on multivariate dynamic kernels, with the capability of integrating information measured at different frequencies and at irregular time intervals in financial markets. A data compression process redefines the original financial time series into temporal data blocks, analyzing the temporal information of mult...

We present a combinatorial study on the rearrangement of links in the structure of directed networks for the purpose of improving the valuation of a vertex or group of vertices as established by an eigenvector-based centrality measure. We build our topological classification starting from unidirectional rooted trees and up to more complex hierarchi...

We present an efficient algorithm for the confluent hypergeometric functions when the imaginary part of b and z is large. The algorithm is based on the steepest descent method, applied to a suitable representation of the confluent hypergeometric functions as a highly oscillatory integral, which is then integrated by using various quadrature methods...

We present a construction of a family of continuous-time ARMA processes based on p iterations of the linear operator that maps a Lévy process onto an Ornstein-Uhlenbeck process. The construction resembles the procedure to build an AR(p) from an AR(1). We show that this family is in fact a subfamily of the well-known CARMA(p, q) processes, with seve...

Separations among the first order logic \(\mathcal{R}ing(0,+,*)\) of finite residue class rings, its extensions with generalized quantifiers, and in the presence of a built-in order are shown, using algebraic methods from class field theory. These methods include classification of spectra of sentences over finite residue classes as systems of congruence...

We present GeoSRS, a hybrid recommender system for a popular location-based social network (LBSN), in which users are able to write short reviews on the places of interest they visit. Using state-of-the-art text mining techniques, our system recommends locations to users using as source the whole set of text reviews in addition to their geographica...

The Ornstein-Uhlenbeck (OU) process is a well known continuous-time interpolation of the discrete-time autoregressive process of order one, the AR(1). We propose a generalization of the OU process that resembles the construction of autoregressive processes of higher order p > 1 from the AR(1). The higher order OU processes thus obtained are called...

The book covers a wide range of essential topics in Computational Finance (CF), understood as a mix of Finance, Computational Statistics, and Mathematics of Finance. In that regard it is unique of its kind, for it touches upon the basic principles of all three main components of CF, with hands-on examples for programming models in R. Thus, th...

Harry Markowitz presented in 1952 the basic tenet of portfolio selection: to find a combination of assets that in a given period of time produces the highest possible return at the least possible risk.

The price of a stock as a function of time constitutes a financial time series, and as such it contains an element of uncertainty which demands the use of statistical methods for its analysis.

The two most popular approaches to investment, although considered as opposite paradigms of financial engineering, are Technical Analysis and Fundamental Analysis.

It has been frequently observed that US markets lead other developed markets in Europe or Asia, and that at times the leader becomes the follower. Within a market, or even across different markets, some assets’ returns are observed to behave like other assets’ returns, or completely opposite, and thus may serve as pairs for a trading strategy or p...

This chapter presents Brownian motion, also known as Wiener process. This is the most fundamental continuous-time model in finance.

Many financial problems require making decisions based on current knowledge and under uncertainty about the future.

In this book, optimization heuristics refers to algorithmic methods for finding approximate solutions to the basic optimization problem of minimizing (or maximizing) a function subject to certain constraints. The problems of interest are typically large, or admit several local optima, for which exact deterministic approaches are...

This chapter presents some basic discrete-time models for financial time series.

This chapter is intended to give the reader the minimum background on the fundamentals of finance: an outlook on the most common financial instruments and the places where these are traded, and an introduction to investment strategies, portfolio management and basic asset pricing. In short, we give a succinct review of the what, how, and when of fin...

The dramatic rise in the use of social network platforms such as Facebook or Twitter has resulted in the availability of vast and growing user-contributed repositories of data. Exploiting this data by extracting useful information from it has become a great challenge in data mining and knowledge discovery. A recently popular way of extracting usefu...

We propose a methodology for clustering financial time series of stocks' returns, and a graphical set-up to quantify and visualise the evolution of these clusters through time. The proposed graphical representation allows for the application of well known algorithms for solving classical combinatorial graph problems, which can be interpreted as pro...

The first order logic \(\mathcal{R}ing(0,+,*,<)\) for finite residue class rings with order is presented, and extensions of this logic with generalized quantifiers are given. It is shown that this logic and its extensions capture DLOGTIME-uniform circuit complexity classes ranging from \(\mathrm{AC}^0\) to \(\mathrm{TC}^0\). Separability results are obtained for the hierarc...

This is a report on an implementation of a spectral clustering algorithm for classifying very large internet sites, with special emphasis on the practical problems encountered in developing such a data mining system. Remarkably, some of these technical difficulties are due to fundamental issues pertaining to the mathematics involved, and are not t...

We propose a methodology for clustering financial time series of stocks' returns, and a graphical set-up to quantify and visualise the evolution of these clusters through time. The proposed graphical representation allows for the application of well known algorithms for solving classical combinatorial graph problems, which can be interpreted as pro...

This paper presents our studies on the rearrangement of links from the structure of websites for the purpose of improving the valuation of a page or group of pages as established by a ranking function such as Google's PageRank. We build our topological taxonomy starting from unidirectional and bidirectional rooted trees, and up to more complex hierarchi...

We present a formal syntax of approximate formulas suited for the logic with counting quantifiers SOLP. This logic was studied by us in [1] where, among other properties, we showed: (i) In the presence of a built-in (linear) order, SOLP can describe NP-complete problems and fragments of it capture classes like P and NL; (ii) weakening the ordering...

A correlation based hierarchical clustering is performed at different time periods in order to study the evolution of clusters among the components of the Spanish stock market IBEX35. This model can be used to design portfolios of companies with similar or dissimilar historical returns behaviour.
Some conclusions: Our experiments confirmed a popul...

Stock Analizer is an expert system under development in PHP, SQL and Java that recognizes candlestick patterns in the price history of any stock and issues recommendations based on the interpretation of these patterns.
Stock Analizer is based on an algebraic relational system that formally describes the candlestick patterns that...

Inspired by recent work of Meduna on deep pushdown automata, we consider the computational power of a class of basic program schemes, $\mbox{NPSDS}_s$, based around assignments, while-loops and non-deterministic guessing but with access to a deep pushdown stack which, apart from having the usual push and pop instructions, also has deep-push instructions whi...

This paper presents a syntax of approximate formulae suited for the logic with counting quantifiers 𝒮𝒪ℒ𝒫. This logic was formalised by us in [1] where, among other properties, we showed the following facts: (i) In the presence of a built-in (linear) order, 𝒮𝒪ℒ𝒫 can describe NP-complete problems and some of its fragments capture the classes P and NL...

Inspired by recent work of Meduna on deep pushdown automata, we consider the computational power of a class of basic program schemes, $\mbox{NPSDS}_s$, based around assignments, while-loops and non-deterministic guessing but with access to a deep pushdown stack which, apart from having the usual push and pop instructions, also has deep-push instruc...

We present a second-order logic of proportional quantifiers, SOLP, which is essentially a first-order language extended with quantifiers that act upon second-order variables of a given arity r and count the fraction of elements in a subset of r-tuples of a model that satisfy a formula. Our logic is capable of expressing proportional versions of differe...

Abstract. We prove that the PageRank (PR) of the root of a tree depends only on the partition of its nodes into levels (and not on the connections between levels). We prove that removing the last level, or all possible nodes (preserving the height) in the upper half of the tree, increases its PR. We introduce the concept of a tail tree as the subtree...

We present a second order logic of proportional quantifiers, SOLP, which is essentially a first order language extended with quantifiers that act upon second order variables of a given arity r, and count the fraction of elements in a subset of r-tuples of a model that satisfy a formula. Our logic is capable of expressing proportional versions of...

We formulate a formal syntax of approximate formulas for the logic with counting quantifiers, SOLP, studied by us in (1), where we showed the following facts: (i) In the presence of a built-in (linear) order, SOLP can describe NP-complete problems and fragments of it capture classes like P and NL; (ii) weakening the ordering relation to an almo...

We present a probability logic (essentially a first order language extended with quantifiers that count the fraction of elements in a model that satisfy a first order formula) which, on the one hand, captures uniform circuit classes such as AC0 and TC0 over arithmetic models, namely, finite structures with linear order and arithmetic relations, and...

The Theory of Descriptive Computational Complexity deals with Computational Complexity from the perspective of Logic. Among its main goals is the logical characterization of computational complexity classes, traditionally defined in terms of resource bounded Turing machines. The presentation often found of this theory in the current literature...

The game Whex is here defined, which is similar to Generalized Hex but the players are restricted to colour vertices adjacent to the vertex last coloured by one of the players. It is shown that the problem of deciding existence of winning strategies for one of the players in this game is complete for PSPACE, via quantifier free projections, and tha...

The Theory of Descriptive Computational Complexity deals with Computational Complexity from the perspective of Logic. Among its main goals is the logical characterization of computational complexity classes, traditionally defined in terms of resource bounded Turing machines. The presentation often found of this theory in the current literature,...

We show how the fact that there is a first-order projection from the problem transitive closure (TC) to some other problem Ω enables us to automatically deduce that a natural game problem whose instances are labelled instances of Ω is complete for PSPACE (via log-space reductions). Our analysis is strongly dependent upon the reduction from TC t...

We outline a plan to show, using Non-standard Analysis, that the query "Is the number of distinct prime factors of the cardinality of a finite set even?" is not definable in a first-order language with built-in linear order and built-in operations of addition, multiplication and base-2 exponentiation.

Thirteenth Venezuelan School of Mathematics (Decimotercera Escuela Venezolana de Matemáticas). Includes bibliography and index.

A class of languages is defined from the operations of union, intersection, complement, concatenation and a new operation which, for two given languages A and B, and a fixed language V, called the context, builds the set of words whose number of possible factorizations as three factors, the prefix in A, the suffix in B, and the middle in the cont...

We begin by proving that the class of problems accepted by the program schemes of NPS is exactly the class of problems defined by the sentences of transitive closure logic (program schemes of NPS are obtained by generalizing basic non-deterministic while-programs whose tests within while instructions are quantifier-free first-order formulae). We th...

We extend a logical characterization of PSPACE due to Makowsky and Pnueli by showing that their logic has a particular normal form which implies that the Generalized Hex problem is complete for PSPACE via very restricted logical reductions. We also show that this normal form result fails in the absence of a built-in successor relation.

## Projects

Projects (2)