Stefan Vlaski

Stefan Vlaski
  • Doctor of Philosophy
  • Lecturer at Imperial College London

Lecturer (Assistant Professor)

About

122
Publications
6,431
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,002
Citations
Current institution
Imperial College London
Current position
  • Lecturer
Additional affiliations
October 2019 - August 2021
Swiss Federal Institute of Technology in Lausanne
Position
  • PostDoc Position
Education
July 2014 - September 2019
University of California, Los Angeles
Field of study
  • Electrical and Computer Engineering

Publications

Publications (122)
Preprint
Full-text available
Hidden Markov Models (HMMs) provide a rigorous framework for inference in dynamic environments. In this work, we study the alpha-HMM algorithm motivated by the optimal online filtering formulation in settings where the true state evolves as a Markov chain with equal exit probabilities. We quantify the dynamics of the algorithm in stationary environ...
Preprint
In multi-objective optimization, minimizing the worst objective can be preferable to minimizing the average objective, as this ensures improved fairness across objectives. Due to the non-smooth nature of the resultant min-max optimization problem, classical subgradient-based approaches typically exhibit slow convergence. Motivated by primal-dual co...
Preprint
Full-text available
Social learning strategies enable agents to infer the underlying true state of nature in a distributed manner by receiving private environmental signals and exchanging beliefs with their neighbors. Previous studies have extensively focused on static environments, where the underlying true state remains unchanged over time. In this paper, we conside...
Preprint
The performance of algorithms for decentralized optimization is affected by both the optimization error and the consensus error, the latter of which arises from the variation between agents' local models. Classically, algorithms employ averaging and gradient-tracking mechanisms with constant combination matrices to drive the collection of agents to...
Preprint
Decentralized learning strategies allow a collection of agents to learn efficiently from local data sets without the need for central aggregation or orchestration. Current decentralized learning paradigms typically rely on an averaging mechanism to encourage agreement in the parameter space. We argue that in the context of deep neural networks, whi...
Article
Communication-constrained algorithms for decentralized learning and optimization rely on local updates coupled with the exchange of compressed signals. In this context, differential quantization is an effective technique to mitigate the negative impact of compression by leveraging correlations between successive iterates. In addition, the use of er...
Preprint
In distributed learning agents aim at collaboratively solving a global learning problem. It becomes more and more likely that individual agents are malicious or faulty with an increasing size of the network. This leads to a degeneration or complete breakdown of the learning process. Classical aggregation schemes are prone to breakdown at small cont...
Conference Paper
Non-Bayesian social learning enables multiple agents to conduct networked signal and information processing through observing environmental signals and information aggregating. Traditional non-Bayesian social learning models only consider single signals, limiting their applications in scenarios where multiple viewpoints of information are available...
Preprint
Non-Bayesian social learning enables multiple agents to conduct networked signal and information processing through observing environmental signals and information aggregating. Traditional non-Bayesian social learning models only consider single signals, limiting their applications in scenarios where multiple viewpoints of information are available...
Preprint
Full-text available
Communication-constrained algorithms for decentralized learning and optimization rely on local updates coupled with the exchange of compressed signals. In this context, differential quantization is an effective technique to mitigate the negative impact of compression by leveraging correlations between successive iterates. In addition, the use of er...
Article
Adaptive social learning is a useful tool for studying distributed decision-making problems over graphs. This paper investigates the effect of combination policies on the performance of adaptive social learning strategies. Using large-deviation analysis, it first derives a bound on the steady-state error probability and characterizes the optimal se...
Article
Full-text available
Federated learning is a semi-distributed algorithm, where a server communicates with multiple dispersed clients to learn a global model. The federated architecture is not robust and is sensitive to communication and computational overloads due to its one-master multi-client structure. It can also be subject to privacy attacks targeting personal inf...
Article
In this paper we study the problem of social learning under multiple true hypotheses and self-interested agents that exchange information over a graph. In this setup, each agent receives data that might be generated from a different hypothesis (or state) than the data received by the other agents. In contrast to the related literature on social l...
Article
This article reviews significant advances in networked signal and information processing (SIP), which have enabled in the last 25 years extending decision making and inference, optimization, control, and learning to the increasingly ubiquitous environments of distributed agents. As these interacting agents cooperate, new collective behaviors emerge...
Conference Paper
Full-text available
Distributed networks are widely used in industrial and consumer applications. As the communication capabilities of such networks are usually limited, it is important to develop algorithms which are capable of handling the vast amount of data processing locally and only communicate some aggregated value. Additionally, these algorithms have to be rob...
Conference Paper
Full-text available
Distributed networks are widely used in industrial and consumer applications. As the communication capabilities of such networks are usually limited, it is important to develop algorithms which are capable of handling the vast amount of data processing locally and only communicate some aggregated value. Additionally, these algorithms have to be rob...
Preprint
Distributed learning paradigms, such as federated or decentralized learning, allow a collection of agents to solve global learning and optimization problems through limited local interactions. Most such strategies rely on a mixture of local adaptation and aggregation steps, either among peers or at a central fusion center. Classically, aggregation...
Preprint
Full-text available
Classical paradigms for distributed learning, such as federated or decentralized gradient descent, employ consensus mechanisms to enforce homogeneity among agents. While these strategies have proven effective in i.i.d. scenarios, they can result in significant performance degradation when agents follow heterogeneous objectives or data. Distributed...
Preprint
Full-text available
The vulnerability of machine learning models to adversarial attacks has been attracting considerable attention in recent years. Most existing studies focus on the behavior of stand-alone single-agent learners. In comparison, this work studies adversarial training over graphs, where individual agents are subjected to perturbations of varied strength...
Preprint
Full-text available
This work focuses on adversarial learning over graphs. We propose a general adversarial training framework for multi-agent systems using diffusion learning. We analyze the convergence properties of the proposed scheme for convex optimization problems, and illustrate its enhanced robustness to adversarial attacks.
Preprint
We study the privatization of distributed learning and optimization strategies. We focus on differential privacy schemes and study their effect on performance. We show that the popular additive random perturbation scheme degrades performance because it is not well-tuned to the graph structure. For this reason, we exploit two alternative graph-homom...
Article
Full-text available
In this paper, we consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints that require the minimizers across the network to lie in low-dimensional subspaces. This constrained formulation includes consensus or single-task optimization as special cases, and allows for more g...
Preprint
Full-text available
This work studies networked agents cooperating to track a dynamical state of nature under partial information. The proposed algorithm is a distributed Bayesian filtering algorithm for finite-state hidden Markov models (HMMs). It can be used for sequential state estimation tasks, as well as for modeling opinion formation over social networks under d...
Preprint
We study the generation of dependent random numbers in a distributed fashion in order to enable privatized distributed learning by networked agents. We propose a method that we refer to as local graph-homomorphic processing; it relies on the construction of particular noises over the edges to ensure a certain level of differential privacy. We show...
Preprint
Full-text available
The article reviews significant advances in networked signal and information processing, which have enabled in the last 25 years extending decision making and inference, optimization, control, and learning to the increasingly ubiquitous environments of distributed agents. As these interacting agents cooperate, new collective behaviors emerge from l...
Preprint
Full-text available
In this paper, we consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints that require the minimizers across the network to lie in low-dimensional subspaces. This constrained formulation includes consensus or single-task optimization as special cases, and allows for more g...
Conference Paper
Full-text available
Distributed learning paradigms, such as federated and decentralized learning, allow for the coordination of models across a collection of agents, and without the need to exchange raw data. Instead, agents compute model updates locally based on their available data, and subsequently share the update model with a parameter server or their peers. This...
Preprint
Full-text available
Distributed learning paradigms, such as federated and decentralized learning, allow for the coordination of models across a collection of agents, and without the need to exchange raw data. Instead, agents compute model updates locally based on their available data, and subsequently share the update model with a parameter server or their peers. This...
Preprint
Full-text available
Observations collected by agents in a network may be unreliable due to observation noise or interference. This paper proposes a distributed algorithm that allows each node to improve the reliability of its own observation by relying solely on local computations and interactions with immediate neighbors, assuming that the field (graph signal) monito...
Preprint
Full-text available
Adaptive social learning is a useful tool for studying distributed decision-making problems over graphs. This paper investigates the effect of combination policies on the performance of adaptive social learning strategies. Using large-deviation analysis, it first derives a bound on the probability of error and characterizes the optimal selection fo...
Preprint
Full-text available
Federated learning is a semi-distributed algorithm, where a server communicates with multiple dispersed clients to learn a global model. The federated architecture is not robust and is sensitive to communication and computational overloads due to its one-master multi-client structure. It can also be subject to privacy attacks targeting personal inf...
Preprint
Full-text available
Social learning algorithms provide models for the formation of opinions over social networks resulting from local reasoning and peer-to-peer exchanges. Interactions occur over an underlying graph topology, which describes the flow of information among the agents. In this work, we propose a technique that addresses questions of explainability and in...
Preprint
Full-text available
Social learning algorithms provide models for the formation of opinions over social networks resulting from local reasoning and peer-to-peer exchanges. Interactions occur over an underlying graph topology, which describes the flow of information and relative influence between pairs of agents. For a given graph topology, these algorithms allow for t...
Article
Full-text available
The objective of meta-learning is to exploit knowledge obtained from observed tasks to improve adaptation to unseen tasks. Meta-learners are able to generalize better when they are trained with a larger number of observed tasks and with a larger amount of data per task. Given the amount of resources that are needed, it is generally difficult to exp...
Article
Federated learning encapsulates distributed learning strategies that are managed by a central unit. Since it relies on using a selected number of agents at each iteration, and since each agent, in turn, taps into its local data, it is only natural to study optimal sampling policies for selecting agents and their data in federated learning implement...
Article
Social learning algorithms provide models for the formation of opinions over social networks resulting from local reasoning and peer-to-peer exchanges. Interactions occur over an underlying graph topology, which describes the flow of information among the agents. To account for drifting conditions in the environment, this work adopts an adaptive so...
Article
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions and arising from possibly different distributions. In the context of social learning, several useful strategies have been developed, which solve decision making problems through...
Preprint
Full-text available
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions and arising from possibly different distributions. In the context of social learning, several useful strategies have been developed, which solve decision making problems through...
Article
Recent years have seen increased interest in performance guarantees of gradient descent algorithms for non-convex optimization. A number of works have uncovered that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle-points and reach second-order stationary points. Most available works li...
Preprint
Full-text available
This work proposes a multi-agent filtering algorithm over graphs for finite-state hidden Markov models (HMMs), which can be used for sequential state estimation or for tracking opinion formation over dynamic social networks. We show that the difference from the optimal centralized Bayesian solution is asymptotically bounded for geometrically ergodi...
Preprint
Full-text available
In this paper we study the problem of social learning under multiple true hypotheses and self-interested agents which exchange information over a graph. In this setup, each agent receives data that might be generated from a different hypothesis (or state) than the data other agents receive. In contrast to the related literature in social learning,...
Article
The purpose of this work is to develop and study a decentralized strategy for Pareto optimization of an aggregate cost consisting of regularized risks. Each risk is modeled as the expectation of some loss function with unknown probability distribution while the regularizers are assumed deterministic, but are not required to be differentiable or eve...
Preprint
Full-text available
Adaptive networks have the capability to pursue solutions of global stochastic optimization problems by relying only on local interactions within neighborhoods. The diffusion of information through repeated interactions allows for globally optimal behavior, without the need for central coordination. Most existing strategies are developed for cooper...
Preprint
Full-text available
A common assumption in the social learning literature is that agents exchange information in an unselfish manner. In this work, we consider the scenario where a subset of agents aims at deceiving the network, meaning they aim at driving the network beliefs to the wrong hypothesis. The adversaries are unaware of the true hypothesis. However, they wi...
Article
The diffusion strategy for distributed learning from streaming data employs local stochastic gradient updates along with exchange of iterates over neighborhoods. In Part I [2] of this work we established that agents cluster around a network centroid and proceeded to study the dynamics of this point. We established expected descent in non-convex env...
Article
Driven by the need to solve increasingly complex optimization problems in signal processing and machine learning, there has been increasing interest in understanding the behavior of gradient-descent algorithms in non-convex environments. Most available works on distributed non-convex optimization problems focus on the deterministic setting where ex...
Preprint
Full-text available
Federated learning encapsulates distributed learning strategies that are managed by a central unit. Since it relies on using a selected number of agents at each iteration, and since each agent, in turn, taps into its local data, it is only natural to study optimal sampling policies for selecting agents and their data in federated learning implement...
Preprint
Full-text available
Federated learning is a useful framework for centralized learning from distributed data under practical considerations of heterogeneity, asynchrony, and privacy. Federated architectures are frequently deployed in deep learning settings, which generally give rise to non-convex optimization problems. Nevertheless, most existing analysis are either li...
Preprint
Full-text available
Federated learning involves a mixture of centralized and decentralized processing tasks, where a server regularly selects a sample of the agents and these in turn sample their local data to compute stochastic gradients for their learning updates. This process runs continually. The sampling of both agents and data is generally uniform; however, in t...
Preprint
Full-text available
A common assumption in the social learning literature is that agents exchange information in an unselfish manner. In this work, we consider the scenario where a subset of agents aims at driving the network beliefs to the wrong hypothesis. The adversaries are unaware of the true hypothesis. However, they will "blend in" by behaving similarly to the...
Preprint
Full-text available
This paper presents an adaptive combination strategy for distributed learning over diffusion networks. Since learning relies on the collaborative processing of the stochastic information at the dispersed agents, the overall performance can be improved by designing combination policies that adjust the weights according to the quality of the data. Su...
Preprint
Full-text available
Decentralized algorithms for stochastic optimization and learning rely on the diffusion of information as a result of repeated local exchanges of intermediate estimates. Such structures are particularly appealing in situations where agents may be hesitant to share raw data due to privacy concerns. Nevertheless, in the absence of additional privacy-...
Preprint
Full-text available
This work proposes a new way of combining independently trained classifiers over space and time. Combination over space means that the outputs of spatially distributed classifiers are aggregated. Combination over time means that the classifiers respond to streaming data during testing and continue to improve their performance even during this phase...
Preprint
Full-text available
The objective of meta-learning is to exploit the knowledge obtained from observed tasks to improve adaptation to unseen tasks. As such, meta-learners are able to generalize better when they are trained with a larger number of observed tasks and with a larger amount of data per task. Given the amount of resources that are needed, it is generally dif...
Article
The utilization of online stochastic algorithms is popular in large-scale learning settings due to their ability to compute updates on the fly, without the need to store and process data in large batches. When a constant step-size is used, these algorithms also have the ability to adapt to drifts in problem parameters, such as data or model propert...
Article
Full-text available
The problem of simultaneously learning several related tasks has received considerable attention in several domains, especially in machine learning, with the so-called multitask learning (MTL) problem, or learning to learn problem [1], [2]. MTL is an approach to inductive transfer learning (using what is learned for one problem to assist with anoth...
Article
Full-text available
This paper formulates a multitask optimization problem where agents in the network have individual objectives to meet, or individual parameter vectors to estimate, subject to a smoothness condition over the graph. The smoothness condition softens the transition in the tasks among adjacent nodes and allows incorporating information about the graph s...
Article
Full-text available
Part I of this paper formulated a multitask optimization problem where agents in the network have individual objectives to meet, or individual parameter vectors to estimate, subject to a smoothness condition over the graph. A diffusion strategy was devised that responds to streaming data and employs stochastic approximations in place of actual grad...
Article
Full-text available
Part I of this paper considered optimization problems over networks where agents have individual objectives to meet, or individual parameter vectors to estimate, subject to subspace constraints that require the objectives across the network to lie in low-dimensional subspaces. Starting from the centralized projected gradient descent, an iterative a...
Preprint
Full-text available
The utilization of online stochastic algorithms is popular in large-scale learning settings due to their ability to compute updates on the fly, without the need to store and process data in large batches. When a constant step-size is used, these algorithms also have the ability to adapt to drifts in problem parameters, such as data or model propert...
Preprint
Full-text available
Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in general, in the sense that even simply verifying that a given point is a local minimum can be NP-hard [1]. Stil...
Preprint
Full-text available
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments. While many federated learning architectures process data in an online manner, and are hence adaptive by nature, most performance analyses assume static optimization problems and offer no guarantees in the presence of drifts in the...
Article
Full-text available
This paper considers optimization problems over networks where agents have individual objectives to meet, or individual parameter vectors to estimate, subject to subspace constraints that require the objectives across the network to lie in low-dimensional subspaces. This constrained formulation includes consensus optimization as a special case, and...
Preprint
Full-text available
The problem of learning simultaneously several related tasks has received considerable attention in several domains, especially in machine learning with the so-called multitask learning problem or learning to learn problem [1], [2]. Multitask learning is an approach to inductive transfer learning (using what is learned for one problem to assist in...
Preprint
Full-text available
Under appropriate cooperation protocols and parameter choices, fully decentralized solutions for stochastic optimization have been shown to match the performance of centralized solutions and result in linear speedup (in the number of agents) relative to non-cooperative approaches in the strongly-convex setting. More recently, these results have bee...
Preprint
Full-text available
The purpose of this work is to develop and study a distributed strategy for Pareto optimization of an aggregate cost consisting of regularized risks. Each risk is modeled as the expectation of some loss function with unknown probability distribution while the regularizers are assumed deterministic, but are not required to be differentiable or even...

Network

Cited By