# Ayfer Ozgur's research while affiliated with Stanford University and other places

**What is this page?**

This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.

It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.

If you're a ResearchGate member, you can follow this page to keep up with this author's work.

If you are this author, and you don't want us to display this page anymore, please let us know.

## Publications (120)

The third call for papers of the Series on Machine Learning in Communications and Networks has continued to receive a great number of high-quality papers covering various aspects of intelligent communications, from which we have included 26 original contributions in this issue. In the following, we provide a brief review of key contributions of pap...

Generative Adversarial Networks are a popular method for learning distributions from data by modeling the target distribution as a function of a known distribution. The function, often referred to as the generator, is optimized to minimize a chosen distance measure between the generated and target distributions. One commonly used measure for this p...

We introduce a novel anytime Batched Thompson sampling policy for multi-armed bandits where the agent observes the rewards of her actions and adjusts her policy only at the end of a small number of batches. We show that this policy simultaneously achieves a problem dependent regret of order $O(\log(T))$ and a minimax regret of order $O(\sqrt{T\log(...

We study the asymptotic performance of the Thompson sampling algorithm in the batched multi-armed bandit setting where the time horizon $T$ is divided into batches, and the agent is not able to observe the rewards of her actions until the end of each batch. We show that in this batched setting, Thompson sampling achieves the same asymptotic perform...
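The batched setting described above can be illustrated with a minimal sketch. It assumes Bernoulli rewards, a uniform Beta(1, 1) prior, and a fixed batch size; none of these choices come from the abstract, and the function name `batched_thompson` is hypothetical. The only point it shows is that the posterior used for sampling is frozen within a batch and refreshed when the batch ends.

```python
import random


def batched_thompson(means, horizon, batch_size, seed=0):
    """Beta-Bernoulli Thompson sampling with posterior updates only at batch ends.

    Assumptions (not from the abstract): Bernoulli arms, Beta(1, 1) priors,
    and equally sized batches.
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1] * k  # 1 + observed successes per arm
    beta = [1] * k   # 1 + observed failures per arm
    pulls = [0] * k
    pending = []     # (arm, reward) pairs the agent has not yet seen

    for t in range(horizon):
        # Sample from the posterior as of the last batch boundary.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < means[arm] else 0
        pulls[arm] += 1
        pending.append((arm, reward))

        # Rewards are revealed only when the batch closes.
        if (t + 1) % batch_size == 0 or t == horizon - 1:
            for a, r in pending:
                alpha[a] += r
                beta[a] += 1 - r
            pending.clear()

    return pulls


pulls = batched_thompson([0.2, 0.8], horizon=2000, batch_size=50)
print(pulls)
```

Setting `batch_size=1` recovers ordinary Thompson sampling, which gives a simple baseline for comparing the effect of delayed feedback.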

The Second Call for Papers of the Series on Machine Learning in Communications and Networks has continued to receive a great number of high-quality papers covering various aspects of intelligent communication systems. In addition to those already published, we include in this issue 27 articles that have been submitted to the call. In the following,...

The Second Call for Papers of the Series on Machine Learning in Communications and Networks has continued to receive a great number of high-quality papers covering various aspects of intelligent communication systems. In addition to 23 original contributions in response to the first call for papers, we include in this issue 5 articles submitted to...

We study schemes and lower bounds for distributed minimax statistical estimation over a Gaussian multiple-access channel (MAC) under squared error loss, in a framework combining statistical estimation and wireless communication. First, we develop "analog" joint estimation-communication schemes that exploit the superposition property of the Gaussian...

We consider the processing of statistical samples $X\sim P_\theta$ by a channel $p(y|x)$, and characterize how the statistical information from the samples for estimating the parameter $\theta\in\mathbb{R}^d$ can scale with the mutual information or capacity of the channel. We show that if the statistical model has a sub-Gaussian score function, th...

Since the inception of the group testing problem in World War II, the prevailing assumption in the probabilistic variant of the problem has been that individuals in the population are infected by a disease independently. However, this assumption rarely holds in practice, as diseases typically spread through connections between individuals. We intro...

In the era of the new generation of communication systems, data traffic is expected to continuously strain the capacity of future communication networks. Along with the remarkable growth in data traffic, new applications, such as wearable devices, autonomous systems, and the Internet of Things (IoT), continue to emerge and generate even more data t...

In the domains of dataset construction and crowdsourcing, a notable challenge is to aggregate labels from a heterogeneous set of labelers, each of whom is potentially an expert in some subset of tasks (and less reliable in others). To reduce costs of hiring human labelers or training automated labeling systems, it is of interest to minimize the num...

The large communication cost for exchanging gradients between different nodes significantly limits the scalability of distributed training for large-scale learning models. Motivated by this observation, there has been significant recent interest in techniques that reduce the communication cost of distributed Stochastic Gradient Descent (SGD), with...

The classical problem of supervised learning is to infer an accurate estimate of a target variable $Y$ from a measured variable $X$ using a set of labeled training samples. Motivated by the increasingly distributed nature of data and decision making, this paper considers a variation of this classical problem in which the inference is distribute...

We consider optimization using random projections as a statistical estimation problem, where the squared distance between the predictions from the estimator and the true solution is the error metric. In approximately solving a large-scale least squares problem using Gaussian sketches, we show that the sketched solution has a conditional Gaussian di...

We develop data processing inequalities that describe how Fisher information from statistical samples can scale with the privacy parameter $\varepsilon$ under local differential privacy constraints. These bounds are valid under general conditions on the distribution of the score of the statistical model, and they elucidate under which conditions...

Thompson sampling has been shown to be an effective policy across a variety of online learning tasks. Many works have analyzed the finite time performance of Thompson sampling, and proved that it achieves a sub-linear regret under a broad range of probabilistic settings. However, its asymptotic behavior remains mostly underexplored. In this paper, w...

The optimal transport problem studies how to transport one measure to another in the most cost-effective way and has a wide range of applications from economics to machine learning. In this paper, we introduce and study an information constrained variation of this problem. Our study yields a strengthening and generalization of Talagrand's celebrated...

We revisit the primitive relay channel, introduced by Cover in 1987. Recent work derived upper bounds on the capacity of this channel that are tighter than the classical cutset bound using the concentration of measure. In this paper, we recover, generalize, and improve upon some of these upper bounds with simpler proofs using reverse hypercontracti...

We develop data processing inequalities that describe how Fisher information from statistical samples can scale with the privacy parameter $\varepsilon$ under local differential privacy constraints. These bounds are valid under general conditions on the distribution of the score of the statistical model, and they elucidate under which conditions th...
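To make the local differential privacy constraint concrete, here is a minimal sketch using binary randomized response, a standard $\varepsilon$-locally-private mechanism. It is not taken from the abstract, which concerns general Fisher information bounds; the function names and the Bernoulli model with parameter `theta` are illustrative assumptions.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit


def estimate_mean(reports, epsilon):
    """Debias the average of privatized bits to estimate the Bernoulli mean."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    avg = sum(reports) / len(reports)
    # E[report] = (1 - keep) + theta * (2 * keep - 1), so invert that map.
    return (avg - (1.0 - keep)) / (2.0 * keep - 1.0)


rng = random.Random(0)
theta = 0.3      # assumed true parameter, for illustration only
epsilon = 1.0    # privacy parameter
bits = [1 if rng.random() < theta else 0 for _ in range(20000)]
reports = [randomized_response(b, epsilon, rng) for b in bits]
theta_hat = estimate_mean(reports, epsilon)
print(round(theta_hat, 3))
```

The debiasing factor $1/(2\,\text{keep}-1)$ grows as $\varepsilon$ shrinks, so the estimator's variance scales roughly like $1/\varepsilon^2$ for small $\varepsilon$. This is one simple instance of statistical information degrading with the privacy parameter.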

Multiclass classification problems are most often solved by either training a single centralized classifier on all $K$ classes, or by reducing the problem to multiple binary classification tasks. This paper explores the uncharted region between these two extremes: How can we solve the $K$-class classification problem by combining the predictions of...

The large communication cost for exchanging gradients between different nodes significantly limits the scalability of distributed training for large-scale learning models. Motivated by this observation, there has been significant recent interest in techniques that reduce the communication cost of distributed Stochastic Gradient Descent (SGD), with...

In combinatorial group testing, the primary objective is to fully identify the set of at most d defective items from a pool of n items using as few tests as possible. The celebrated result for the combinatorial group testing problem is that the number of tests, denoted by t, can be made logarithmic in n when d = O(poly(log n)). However, state-of-th...

We consider a distributed logistic regression problem where labeled data pairs $(X_i,Y_i)\in \mathbb{R}^d\times\{-1,1\}$ for $i=1,\ldots,n$ are distributed across multiple machines in a network and must be communicated to a centralized estimator using at most $k$ bits per labeled pair. We assume that the data $X_i$ come independently from some dist...

The capacity of the semi-deterministic relay channel (SD-RC) with non-causal channel state information (CSI) only at the encoder and decoder is characterized. The capacity is achieved by a scheme based on cooperative-bin-forward. This scheme allows cooperation between the transmitter and the relay without the need of the latter to decode a part of t...

We consider the probabilistic group testing problem where $d$ random defective items in a large population of $N$ items are identified with high probability by applying binary tests. It is known that $\Theta(d \log N)$ tests are necessary and sufficient to recover the defective set with vanishing probability of error when $d = O(N^{\alp...

We consider the problem of learning high-dimensional, nonparametric and structured (e.g. Gaussian) distributions in distributed networks, where each node in the network observes an independent sample from the underlying distribution and can use $k$ bits to communicate its sample to a central processor. We consider three different models for communi...

The primitive relay channel, introduced by Cover in 1987, is the simplest single-source single-destination network model that captures some of the most essential features and challenges of relaying in wireless networks. Recently, Wu and Ozgur developed upper bounds on the capacity of this channel that are tighter than the cutset bound. In this pape...

We consider an extremal problem for subsets of high-dimensional spheres that can be thought of as an extension of the classical isoperimetric problem on the sphere. Let $A$ be a subset of the $(m-1)$-dimensional sphere $\mathbb{S}^{m-1}$, and let $\mathbf{y}\in \mathbb{S}^{m-1}$ be a randomly chosen point on the sphere. What is the measure of the i...

Consider a memoryless relay channel, where the relay is connected to the destination with an isolated bit pipe of capacity $C_{0}$. Let $C(C_{0})$ denote the capacity of this channel as a function of $C_{0}$. What is the critical value of $C_{0}$, such that $C(C_{0})$ first equals $C(\infty)$? This is a long-standing open problem posed b...

We consider the probabilistic group testing problem where $d$ random defective items in a large population of $N$ items are identified with high probability by applying binary tests. It is known that $\Theta(d \log N)$ tests are necessary and sufficient to recover the defective set with vanishing probability of error. However, to the best of our kn...
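As a rough illustration of the probabilistic group testing setup, the sketch below uses a random Bernoulli test design with the standard COMP decoder (every item that appears in a negative test is declared non-defective). This is a textbook baseline, not the scheme of the paper; the inclusion probability `1 / (d + 1)` and all names are assumptions made for the example.

```python
import random


def comp_decode(n_items, defectives, n_tests, seed=0):
    """Random pooling plus COMP decoding for noiseless binary group tests.

    Returns the set of items that were never cleared by a negative test.
    """
    rng = random.Random(seed)
    p = 1.0 / (len(defectives) + 1)  # assumed per-test inclusion probability
    candidates = set(range(n_items))

    for _ in range(n_tests):
        pool = {i for i in range(n_items) if rng.random() < p}
        positive = bool(pool & defectives)  # test outcome: any defective in pool?
        if not positive:
            # A negative test proves every pooled item is non-defective.
            candidates -= pool

    return candidates


defectives = {3, 41, 77}
estimate = comp_decode(n_items=200, defectives=defectives, n_tests=120)
print(sorted(estimate))
```

COMP never discards a true defective in the noiseless model, so the returned set always contains the defective set; with enough tests, the false positives vanish as well.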

The classical problem of supervised learning is to infer an accurate predictor of a target variable $Y$ from a measured variable $X$ by using a finite number of labeled training samples. Motivated by the increasingly distributed nature of data and decision making, in this paper we consider a variation of this classical problem in which the predicti...

We consider a communication channel where there is no common clock between the transmitter and the receiver. This is motivated by the recent interest in building system-on-chip radios for Internet of Things applications, which cannot rely on crystal oscillators for accurate timing. We identify two types of clock uncertainty in such systems: timing j...

We present random access schemes for machine-type communication where a massive number of low-energy wireless devices want to occasionally transmit short information packets. We focus on the device discovery problem, with extensions to joint discovery and data transmission as well as data transmission without communicating device identities. We for...

This paper investigates the achievable rates of an additive white Gaussian noise energy-harvesting (EH) channel with an infinite battery. The EH process is characterized by a sequence of blocks of harvested energy, which is known causally at the source. The harvested energy remains constant within a block while the harvested energy across different...

Motivated by the recent emergence of energy harvesting and wirelessly powered transceivers, we study communication over a memoryless channel with a transmitter whose battery is recharged at random or deterministic times known to the receiver. We characterize the capacity of this channel as the limit of an n-letter maximum mutual information rate un...

Consider a memoryless relay channel, where the channel from the relay to the destination is an isolated bit pipe of capacity $C_0$. Let $C(C_0)$ denote the capacity of this channel as a function of $C_0$. What is the critical value of $C_0$ such that $C(C_0)$ first equals $C(\infty)$? This is a long-standing open problem posed by Cover and named "T...

The data center network (DCN), wired or wireless, features large amounts of Many-to-One (M2O) sessions. Each M2O session is currently established using Point-to-Point (P2P) communications and Store-and-Forward (SAF) relays, and is generally followed by a certain computation at the destination, typically a weighted summation of the received informat...

We consider online power control for an energy harvesting system with random i.i.d. energy arrivals and a finite size battery. We propose a simple online power control policy for this channel that requires minimal information regarding the distribution of the energy arrivals and prove that it is universally near-optimal for all parameter values. In...

We consider an energy harvesting multiple access channel (MAC) where the transmitters are powered by an exogenous stochastic energy harvesting process and equipped with finite batteries. We characterize the capacity region of this channel as an n-letter mutual information rate and develop inner and outer bounds that differ by a constant gap. An intere...

The cut-set bound developed by Cover and El Gamal in 1979 has since remained the best known upper bound on the capacity of the Gaussian relay channel. We develop a new upper bound on the capacity of the Gaussian primitive relay channel which is tighter than the cut-set bound. Our proof is based on typicality arguments and concentration of Gaussian...

We consider an $n$-relay Gaussian diamond network where a source communicates to a destination with the help of $n$ half-duplex relays. Achieving rates close to the capacity of this network requires employing all $n$ relays under an optimal transmit/receive schedule. Even for moderate values of $n$, this can have significant operational c...

Motivated by recent developments in wireless power transfer, we study communication with a remotely powered transmitter. We propose an information-theoretic model where a charger can dynamically decide on how much power to transfer to the transmitter based on its side information regarding the communication, while the transmitter needs to dynamical...

We consider the discrete memoryless symmetric primitive relay channel, where a source $X$ wants to send information to a destination $Y$ with the help of a relay $Z$ and the relay can communicate to the destination via an error-free digital link of rate $R_0$, while $Y$ and $Z$ are conditionally independent and identically distributed given $X$. W...

The cut-set bound developed by Cover and El Gamal in 1979 has since remained the best known upper bound on the capacity of the Gaussian relay channel. We develop a new upper bound on the capacity of the Gaussian primitive relay channel which is tighter than the cut-set bound. Our proof is based on typicality arguments and concentration of Gaussian...

The data center network (DCN), wired or wireless, features large amounts of Many-to-One (M2O) sessions. Each M2O session is currently operated based on Point-to-Point (P2P) communications and Store-and-Forward (SAF) relays, and is generally followed by certain further computation at the destination, typically a weighted summation of the received d...

We consider an energy harvesting system where the fixed-size battery of the transmitter is recharged with a certain probability at each channel use. For this setup, we explicitly characterize the optimal online energy management strategy for maximizing the long-term throughput under different assumptions on the availability of channel state informati...

We consider communication over the AWGN channel with a transmitter whose battery is recharged with RF energy transfer at random times known to the receiver. We assume that the recharging process is i.i.d. Bernoulli. We characterize the capacity of this channel as the limit of an $n$-letter maximum mutual information rate under both causal and nonca...

We consider an energy harvesting channel, in which the transmitter is powered by an exogenous stochastic energy harvesting process $E_t$, such that $0\leq E_t\leq\bar{E}$, which can be stored in a battery of finite size $\bar{B}$. We provide a simple and insightful formula for the approximate capacity of this channel with bounded guarantee on the a...

We investigate if feedback can increase the capacity of an energy harvesting communication channel where a transmitter powered by an exogenous energy arrival process and equipped with a finite battery communicates to a receiver over a memoryless channel. For a simple special case where the energy arrival process is deterministic and the channel is...

We consider the problem of finding the largest capacity subnetwork of a given size of a layered Gaussian relay network. While the exact capacity of Gaussian relay networks is unknown in general, motivated by recent capacity approximations we use the information-theoretic cutset bound as a proxy for the true capacity of such networks. There are two...

We consider communication over heterogeneous parallel channels, where a transmitter is connected to two users via two parallel channels: a MIMO broadcast channel (BC) and a noiseless rate-limited multicast channel. We characterize the optimal degrees of freedom (DoF) region of this setting when the transmitter has delayed channel state information...

In this paper we examine the value of feedback that comes from overhearing, without dedicated feedback resources. We focus on a simple model for this purpose: a deterministic two-hop interference channel, where feedback comes from overhearing the forward-links. A new aspect brought by this setup is the dual-role of the relay signal. While the relay...

We consider communication over heterogeneous parallel channels, where a transmitter is connected to two users via two parallel channels: (1) a MISO broadcast channel (BC), and (2) a noiseless rate-limited multicast channel. We characterize the optimal degrees of freedom (DoF) region of this setting when the transmitter has delayed channel state inf...

We consider the problem of communicating over the state-dependent Z-interference channel (S-D Z-IC), when the state is known noncausally only to the interfering transmitter. We present an achievability scheme and show that it is optimal for the injective deterministic S-D Z-IC. This scheme is simple in the sense that it does not involve rate-splitt...

We consider an energy-harvesting communication system where a transmitter powered by an exogenous energy arrival process and equipped with a finite battery communicates over a discrete-time AWGN channel. Assuming that the energy arrival process is i.i.d. Bernoulli, we provide a simple approximation to the capacity of this channel and bound the appr...