
Iñigo Urteaga · Basque Center for Applied Mathematics · Machine Learning
Electrical Engineering, PhD
Tenure-track Ikerbasque Research Fellow in the Machine Learning group at the Basque Center for Applied Mathematics
39 Publications
7,552 Reads
347 Citations
Introduction
I am a tenure-track Ikerbasque Research Fellow in the Machine Learning group at the Basque Center for Applied Mathematics (BCAM).
I study statistical models and algorithms that extract information from data, so that computer systems can effectively learn to perform descriptive, predictive, and prescriptive tasks.
I have specialized in statistical machine learning, computational Bayesian statistics, approximate inference methods, and sequential decision processes.
Additional affiliations
April 2018 - January 2023
September 2016 - April 2018
August 2011 - August 2016
Education
August 2011 - August 2016
Publications (39)
There are many practical signal processing settings where a state-space model consists of a state described by an ARMA process that is observed via non-linear functions of the state. In this paper, we propose a particle filtering method for sequentially estimating the ARMA process in the presence of unknown parameters. In the considered problem, we...
This paper considers inference on the widely used state-space models described by hidden ARMA state processes of unknown order observed via non-linear functions of the states. We propose a particle filtering method for sequentially inferring the unknown ARMA time-series by Rao-Blackwellization of all the static unknowns. Our method does not rely ei...
In the past decades, Sequential Monte Carlo (SMC) sampling has proven to be a method of choice in many applications where the dynamics of the studied system are described by nonlinear equations and/or non-Gaussian noises. In this paper, we study the application of SMC sampling to nonlinear state-space models where the state is a fractional Gaussian...
In this paper we consider a set of time-series that are coupled by latent fractional Gaussian processes. Specifically, we address time-series that combine idiosyncratic short-term and shared long-term features. The long-memory is modeled by fractional Gaussian processes, whereas the short-memory properties are captured by linear models of past data...
Increased interest in Wireless Sensor Networks (WSNs) by scientists and engineers is forcing WSN research to focus on application requirements. Data is available as never before in many fields of study; practitioners are now burdened with the challenge of doing data-rich research rather than being data-starved. In-situ sensors can be prone to error...
We characterize short-term and long-term user engagement patterns in a self-tracking, mobile health app. We introduce and define engagement metrics to capture the quantity, duration, and density of participant engagement according to different domains of self-tracking. We focus our study on Phendo, a research app designed for participants to self-t...
Transformer-based language models (TLMs) provide state-of-the-art performance in many modern natural language processing applications. TLM training is conducted in two phases. First, the model is pre-trained over large volumes of text to minimize a generic objective function, such as the Masked Language Model (MLM). Second, the model is fine-tuned...
Objective: The study sought to build predictive models of next menstrual cycle start date based on mobile health self-tracked cycle data. Because app users may skip tracking, disentangling physiological patterns of menstruation from tracking behaviors is necessary for the development of predictive models. Materials and Methods: We use data from a po...
We explore how to quantify uncertainty when designing predictive models for healthcare to provide well-calibrated results. Uncertainty quantification and calibration are critical in medicine, as one must not only accommodate the variability of the underlying physiology, but adjust to the uncertain data collection and reporting process. This occurs...
Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for statistical modeling. However, such data streams are notoriously unreliable since they hinge on user adherence to the app. Thus, it is crucial for machine learning models to account for self-tracking artifacts...
Endometriosis is a systemic and chronic condition in women of childbearing age, yet a highly enigmatic disease with unresolved questions: there are no known biomarkers, nor established clinical stages. We here investigate the use of patient-generated health data and data-driven phenotyping to characterize endometriosis patient subtypes, based on th...
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, menstruation was primarily studied through survey results; however, as menstrual tracking mobile apps become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behaviors over time. By expl...
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, the study of women's menstruation was done primarily through survey results; however, as mobile apps for menstrual tracking become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behavi...
We present an end-to-end statistical framework for personalized, accurate, and minimally invasive modeling of female reproductive hormonal patterns. Reconstructing and forecasting the evolution of hormonal dynamics is a challenging task, but a critical one to improve general understanding of the menstrual cycle and personalized detection of potenti...
We investigate the use of self-tracking data and unsupervised mixed-membership models to phenotype endometriosis. Endometriosis is a systemic, chronic condition of women in reproductive age and, at the same time, a highly enigmatic condition with no known biomarkers to monitor its progression and no established staging. We leverage data collected t...
The multi-armed bandit is a sequential allocation task where an agent must learn a policy that maximizes long term payoff, where only the reward of the played arm is observed at each iteration. In the stochastic setting, the reward for each action is generated from an unknown distribution, which depends on a given 'context', available at each inter...
The multi-armed bandit (MAB) problem is a sequential allocation task where the goal is to learn a policy that maximizes long term payoff, where only the reward of the executed action is observed; i.e., sequential optimal decisions are made, while simultaneously learning how the world operates. In the stochastic setting, the reward for each action i...
We consider the problem of sequential inference of latent time-series with innovations correlated in time and observed via nonlinear functions. We accommodate time-varying phenomena with diverse properties by means of a flexible mathematical representation of the data. We characterize statistically such time-series by a Bayesian analysis of their d...
In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals and for those with disorders of the reproductive system. Because there are individual differe...
In many biomedical, science, and engineering problems, one must sequentially decide which action to take next so as to maximize rewards. Reinforcement learning is an area of machine learning that studies how this maximization balances exploration and exploitation, optimizing interactions with the world while simultaneously learning how the world op...
Reinforcement learning studies how to balance exploration and exploitation in real-world systems, optimizing interactions with the world while simultaneously learning how the world works. One general class of algorithms for such learning is the multi-armed bandit setting (in which sequential interactions are independent and identically distributed)...
This paper is Part I of a series of two papers where we address sequential estimation of wide-sense stationary autoregressive moving average (ARMA) state processes by particle filtering. In Part I, we present estimation methods for ARMA processes of known model order, where the parameters are first known and then unknown. The driving noise of the A...
This is Part II of a series of two papers where we address sequential estimation of wide-sense stationary autoregressive moving average (ARMA) state processes by particle filtering. In Part I, we considered a state-space model where the state was an ARMA process of known order and where the parameters of the process could be known or unknown. In th...
We propose a Sequential Monte Carlo (SMC) method for filtering and prediction of time-varying signals under model uncertainty. Instead of resorting to model selection, we fuse the information from the considered models within the proposed SMC method. We achieve our goal by dynamically adjusting the resampling step according to the posterior predict...
This paper proposes the HURRy (HUman Routines used for Routing) protocol, which infers and benefits from the social behaviour of nodes in disruptive networking environments. HURRy incorporates the contact duration to the information retrieved from historical encounters among neighbours, so that smarter routing decisions can be made. The specificati...
The characterization of human interaction at different levels has been a matter of interest in many disciplines. So far, social networking through the Internet has been the main source to infer human beings’ relations. Nevertheless, due to the irruption of wearable devices with wireless communication capabilities, initiatives that use them to measu...
This paper highlights the challenges to be taken into consideration when Bluetooth is used as a radio technology to capture proximity traces between people. Our study analyzes the limitations of Bluetooth-based trace acquisition initiatives carried out until now in terms of granularity and reliability. We then propose an optimal configuration for t...
In this paper, we propose a novel approach for decomposing hedge fund returns onto observable risk factors. We utilize a vector stochastic-volatility model to extract the time-varying exposure of low frequency hedge fund returns on high frequency market data. We implement the estimation by using particle filtering and the concept of Rao-Blackwelliz...
Energy efficiency and high data relevancy are crucial for wireless sensor network applications; challenges usually tackled by network clustering or event-driven techniques focused only on the performance of clusterheads or too restricted to specific applications. In contrast, this paper formalizes the combined NP-Complete problem of event-driven ne...
Both energy efficiency and high data relevancy are crucial for wireless sensor network applications. Network clustering and event-driven protocols are two main approaches typically used to fulfill those requirements. Existing techniques are either focused only on performance of clusterheads or too restricted to specific applications; however, few o...
The emerging technology of wireless sensor networks (WSNs) is an integrated, distributed, wireless network of sensing devices. It has the potential to monitor dynamic hydrological and environmental processes more effectively than traditional monitoring and data acquisition techniques by providing environmental information at greater spatial and tem...
In this position paper we present the design of an end-to-end scalable content streaming system that optimizes the quality of experience of the end-user by allowing each client to retrieve a customized multimedia stream, based on both network and client states. By taking advantage of multimedia scalability, our proposed receiver-driven architecture...
Increased interest in wireless sensor networks by scientists and engineers is forcing wireless sensor networking research to focus on application requirements. Data is available as never before in many fields of study; practitioners are now burdened with the challenge of doing data-rich research rather than being data-starved. However, in situ sens...
A Wireless Sensor Network (WSN), an emerging technology, is an integrated, distributed, and wireless network of many sensing devices. It is capable of more effectively monitoring dynamic hydrological and environmental processes by providing environmental information at spatial and temporal resolutions much greater than traditional techniques. Furth...
Groundwater transport modeling is intended to aid in remediation processes by providing prediction of plume location and by helping to bridge data gaps in the typically undersampled subsurface environment. Increased availability of computer resources has made computer-based transport models almost ubiquitous in calculating health risks, determining...