Iñigo Urteaga

Basque Center for Applied Mathematics · Machine Learning

Electrical Engineering, PhD
Tenure-track Ikerbasque Research Fellow in the Machine Learning group at the Basque Center for Applied Mathematics


Publications: 39
Reads: 7,552
Citations: 347
Introduction
I am a tenure-track Ikerbasque Research Fellow in the Machine Learning group at the Basque Center for Applied Mathematics (BCAM). I study statistical models and algorithms that extract information from data, so that computer systems can effectively learn to perform descriptive, predictive, and prescriptive tasks. I specialize in statistical machine learning, computational Bayesian statistics, approximate inference methods, and sequential decision processes.
Additional affiliations
April 2018 - January 2023
Columbia University
Position
  • Associate Research Scientist
September 2016 - April 2018
Columbia University
Position
  • PostDoc Position
August 2011 - August 2016
Stony Brook University
Position
  • PhD in Electrical Engineering
Education
August 2011 - August 2016
Stony Brook University
Field of study
  • Statistical Signal Processing


Publications (39)
Conference Paper
There are many practical signal processing settings where a state-space model consists of a state described by an ARMA process that is observed via non-linear functions of the state. In this paper, we propose a particle filtering method for sequentially estimating the ARMA process in the presence of unknown parameters. In the considered problem, we...
Conference Paper
This paper considers inference on the widely used state-space models described by hidden ARMA state processes of unknown order observed via non-linear functions of the states. We propose a particle filtering method for sequentially inferring the unknown ARMA time-series by Rao-Blackwellization of all the static unknowns. Our method does not rely ei...
Conference Paper
In the past decades, Sequential Monte Carlo (SMC) sampling has proven to be a method of choice in many applications where the dynamics of the studied system are described by nonlinear equations and/or non-Gaussian noises. In this paper, we study the application of SMC sampling to nonlinear state-space models where the state is a fractional Gaussian...
Conference Paper
In this paper we consider a set of time-series that are coupled by latent fractional Gaussian processes. Specifically, we address time-series that combine idiosyncratic short-term and shared long-term features. The long-memory is modeled by fractional Gaussian processes, whereas the short-memory properties are captured by linear models of past data...
Conference Paper
Increased interest in Wireless Sensor Networks (WSNs) by scientists and engineers is forcing WSN research to focus on application requirements. Data is available as never before in many fields of study; practitioners are now burdened with the challenge of doing data-rich research rather than being data-starved. In-situ sensors can be prone to error...
Preprint
We characterize short-term and long-term user engagement patterns in a self-tracking, mobile health app. We introduce and define engagement metrics to capture the quantity, duration, and density of participant engagement according to different domains of self-tracking. We focus our study on Phendo, a research app designed for participants to self-t...
Preprint
Transformer-based language models (TLMs) provide state-of-the-art performance in many modern natural language processing applications. TLM training is conducted in two phases. First, the model is pre-trained over large volumes of text to minimize a generic objective function, such as the Masked Language Model (MLM). Second, the model is fine-tuned...
Article
Objective The study sought to build predictive models of next menstrual cycle start date based on mobile health self-tracked cycle data. Because app users may skip tracking, disentangling physiological patterns of menstruation from tracking behaviors is necessary for the development of predictive models. Materials and Methods We use data from a po...
Article
We explore how to quantify uncertainty when designing predictive models for healthcare to provide well-calibrated results. Uncertainty quantification and calibration are critical in medicine, as one must not only accommodate the variability of the underlying physiology, but adjust to the uncertain data collection and reporting process. This occurs...
Preprint
Mobile health (mHealth) apps such as menstrual trackers provide a rich source of self-tracked health observations that can be leveraged for statistical modeling. However, such data streams are notoriously unreliable since they hinge on user adherence to the app. Thus, it is crucial for machine learning models to account for self-tracking artifacts...
Article
Endometriosis is a systemic and chronic condition in women of childbearing age, yet a highly enigmatic disease with unresolved questions: there are no known biomarkers, nor established clinical stages. We here investigate the use of patient-generated health data and data-driven phenotyping to characterize endometriosis patient subtypes, based on th...
Article
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, menstruation was primarily studied through survey results; however, as menstrual tracking mobile apps become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behaviors over time. By expl...
Preprint
The menstrual cycle is a key indicator of overall health for women of reproductive age. Previously, the study of women's menstruation was done primarily through survey results; however, as mobile apps for menstrual tracking become more widely adopted, they provide an increasingly large, content-rich source of menstrual health experiences and behavi...
Preprint
We present an end-to-end statistical framework for personalized, accurate, and minimally invasive modeling of female reproductive hormonal patterns. Reconstructing and forecasting the evolution of hormonal dynamics is a challenging task, but a critical one to improve general understanding of the menstrual cycle and personalized detection of potenti...
Preprint
We investigate the use of self-tracking data and unsupervised mixed-membership models to phenotype endometriosis. Endometriosis is a systemic, chronic condition of women in reproductive age and, at the same time, a highly enigmatic condition with no known biomarkers to monitor its progression and no established staging. We leverage data collected t...
Preprint
The multi-armed bandit is a sequential allocation task where an agent must learn a policy that maximizes long term payoff, where only the reward of the played arm is observed at each iteration. In the stochastic setting, the reward for each action is generated from an unknown distribution, which depends on a given 'context', available at each inter...
Preprint
The multi-armed bandit (MAB) problem is a sequential allocation task where the goal is to learn a policy that maximizes long term payoff, where only the reward of the executed action is observed; i.e., sequential optimal decisions are made, while simultaneously learning how the world operates. In the stochastic setting, the reward for each action i...
Article
We consider the problem of sequential inference of latent time-series with innovations correlated in time and observed via nonlinear functions. We accommodate time-varying phenomena with diverse properties by means of a flexible mathematical representation of the data. We characterize statistically such time-series by a Bayesian analysis of their d...
Article
In this paper, we introduce a novel task for machine learning in healthcare, namely personalized modeling of the female hormonal cycle. The motivation for this work is to model the hormonal cycle and predict its phases in time, both for healthy individuals and for those with disorders of the reproductive system. Because there are individual differe...
Article
In many biomedical, science, and engineering problems, one must sequentially decide which action to take next so as to maximize rewards. Reinforcement learning is an area of machine learning that studies how this maximization balances exploration and exploitation, optimizing interactions with the world while simultaneously learning how the world op...
Article
Reinforcement learning studies how to balance exploration and exploitation in real-world systems, optimizing interactions with the world while simultaneously learning how the world works. One general class of algorithms for such learning is the multi-armed bandit setting (in which sequential interactions are independent and identically distributed)...
Article
This paper is Part I of a series of two papers where we address sequential estimation of wide-sense stationary autoregressive moving average (ARMA) state processes by particle filtering. In Part I, we present estimation methods for ARMA processes of known model order, where the parameters are first known and then unknown. The driving noise of the A...
Article
This is Part II of a series of two papers where we address sequential estimation of wide-sense stationary autoregressive moving average (ARMA) state processes by particle filtering. In Part I, we considered a state-space model where the state was an ARMA process of known order and where the parameters of the process could be known or unknown. In th...
Conference Paper
We propose a Sequential Monte Carlo (SMC) method for filtering and prediction of time-varying signals under model uncertainty. Instead of resorting to model selection, we fuse the information from the considered models within the proposed SMC method. We achieve our goal by dynamically adjusting the resampling step according to the posterior predict...
Conference Paper
This paper proposes the HURRy (HUman Routines used for Routing) protocol, which infers and benefits from the social behaviour of nodes in disruptive networking environments. HURRy incorporates the contact duration to the information retrieved from historical encounters among neighbours, so that smarter routing decisions can be made. The specificati...
Article
The characterization of human interaction at different levels has been a matter of interest in many disciplines. So far, social networking through the Internet has been the main source to infer human beings’ relations. Nevertheless, due to the irruption of wearable devices with wireless communication capabilities, initiatives that use them to measu...
Article
This paper highlights the challenges to be taken into consideration when Bluetooth is used as a radio technology to capture proximity traces between people. Our study analyzes the limitations of Bluetooth-based trace acquisition initiatives carried out until now in terms of granularity and reliability. We then propose an optimal configuration for t...
Conference Paper
In this paper, we propose a novel approach for decomposing hedge fund returns onto observable risk factors. We utilize a vector stochastic-volatility model to extract the time-varying exposure of low frequency hedge fund returns on high frequency market data. We implement the estimation by using particle filtering and the concept of Rao-Blackwelliz...
Article
Energy efficiency and high data relevancy are crucial for wireless sensor network applications; challenges usually tackled by network clustering or event-driven techniques focused only on the performance of clusterheads or too restricted to specific applications. In contrast, this paper formalizes the combined NP-Complete problem of event-driven ne...
Conference Paper
Both energy efficiency and high data relevancy are crucial for wireless sensor network applications. Network clustering and event-driven protocols are two main approaches typically used to fulfill those requirements. Existing techniques are either focused only on performance of clusterheads or too restricted to specific applications; however, few o...
Article
The emerging technology of wireless sensor networks (WSNs) is an integrated, distributed, wireless network of sensing devices. It has the potential to monitor dynamic hydrological and environmental processes more effectively than traditional monitoring and data acquisition techniques by providing environmental information at greater spatial and tem...
Conference Paper
In this position paper we present the design of an end-to-end scalable content streaming system that optimizes the quality of experience of the end-user by allowing each client to retrieve a customized multimedia stream, based on both network and client states. By taking advantage of multimedia scalability, our proposed receiver-driven architecture...
Article
Increased interest in wireless sensor networks by scientists and engineers is forcing wireless sensor networking research to focus on application requirements. Data is available as never before in many fields of study; practitioners are now burdened with the challenge of doing data-rich research rather than being data-starved. However, in situ sens...
Article
A Wireless Sensor Network (WSN), an emerging technology, is an integrated, distributed, and wireless network of many sensing devices. It is capable of more effectively monitoring dynamic hydrological and environmental processes by providing environmental information at spatial and temporal resolutions much greater than traditional techniques. Furth...
Article
Groundwater transport modeling is intended to aid in remediation processes by providing prediction of plume location and by helping to bridge data gaps in the typically undersampled subsurface environment. Increased availability of computer resources has made computer-based transport models almost ubiquitous in calculating health risks, determining...
