Denis Charles
Microsoft · Microsoft Research

Ph.D. Computer Science

51 publications · 6,281 reads · 1,228 citations

Publications (51)
Article
Full-text available
In classical causal inference, inferring cause-effect relations from data relies on the assumption that units are independent and identically distributed. This assumption is violated in settings where units are related through a network of dependencies. An example of such a setting is ad placement in sponsored search advertising, where the likeliho...
Preprint
Full-text available
Today, many web advertising data flows involve passive cross-site tracking of users. Enabling such a mechanism through the usage of third party tracking cookies (3PC) exposes sensitive user data to a large number of parties, with little oversight on how that data can be used. Thus, most browsers are moving towards removal of 3PC in subsequent brows...
Preprint
Full-text available
It is often critical for prediction models to be robust to distributional shifts between training and testing data. Viewed from a causal perspective, the challenge is to distinguish the stable causal relationships from the unstable spurious correlations across shifts. We describe a causal transfer random forest (CTRF) that combines existing trainin...
Preprint
Full-text available
In classical causal inference, inferring cause-effect relations from data relies on the assumption that units are independent and identically distributed. This assumption is violated in settings where units are related through a network of dependencies. An example of such a setting is ad placement in sponsored search advertising, where the clickabi...
Preprint
Full-text available
Contextual bandits are a common problem faced by machine learning practitioners in domains as diverse as hypothesis testing and product recommendations. Many approaches exploit rich data representations for contextual bandit problems, with varying degrees of success. Self-supervised learning is a promising approach to find ri...
Preprint
Full-text available
Self-supervision is key to extending the use of deep learning to label-scarce domains. For most self-supervised approaches, data transformations play an important role. However, until now the impact of transformations has not been studied. Furthermore, different transformations may have different impacts on the system. We provide novel insights i...
Preprint
We present a unified framework for Batch Online Learning (OL) for Click Prediction in Search Advertisement. Machine learning models, once deployed, show non-trivial accuracy and calibration degradation over time due to model staleness. It is therefore necessary to update models regularly and automatically. This paper presents two paradigms of...
Article
Full-text available
In real world systems, the predictions of deployed Machine Learned models affect the training data available to build subsequent models. This introduces a bias in the training data that needs to be addressed. Existing solutions to this problem attempt to resolve the problem by either casting this in the reinforcement learning framework or by quanti...
Conference Paper
Full-text available
In this paper we propose a general family of position auctions used in paid search, which we call multi-score position auctions. These auctions contain the GSP auction and the GSP auction with squashing as special cases. We show experimentally that these auctions contain special cases that perform better than the GSP auction with squashing, in term...
Patent
Full-text available
Various embodiments provide techniques for graph clustering. In one or more embodiments, a participation graph is obtained that represents relationships between entities. An auxiliary graph is constructed based on the participation graph. The auxiliary graph may be constructed such that the auxiliary graph is less dense than the participation graph...
Article
Full-text available
Quick interaction between a human teacher and a learning machine presents numerous benefits and challenges when working with web-scale data. The human teacher guides the machine towards accomplishing the task of interest. The learning machine leverages big data to find examples that maximize the training value of its interaction with the teacher. W...
Conference Paper
Full-text available
Labeling data is a seemingly simple task required for training many machine learning systems, but is actually fraught with problems. This paper introduces the notion of concept evolution, the changing nature of a person's underlying concept (the abstract notion of the target class a person is labeling for, e.g., spam email, travel related web pages...
Conference Paper
Full-text available
The Shapley value provides a fair method for the division of value in coalitional games. Motivated by the application of crowdsourcing for the collection of suitable labels and features for regression and classification tasks, we develop a method to approximate the Shapley value by identifying a suitable decomposition into multiple issues, with the...
Article
Full-text available
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experim...
Conference Paper
Full-text available
In Internet ad auctions, search engines often throttle budget constrained advertisers so as to spread their spends across the specified time period. Such policies are known as budget smoothing policies. In this paper, we perform a principled, game-theoretic study of what the outcome of an ideal budget smoothing algorithm should be. In particular, w...
Conference Paper
Full-text available
In Internet ad auctions, search engines often throttle budget constrained advertisers so as to spread their spends across the specified time period. Such policies are known as budget smoothing policies. In this paper, we perform a principled, game-theoretic study of what the outcome of an ideal budget smoothing algorithm should be. In particular, w...
Article
Full-text available
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select changes that improve both the short-term and long-term performance of such systems. This work is...
Conference Paper
Full-text available
We derive efficient algorithms for both detecting and representing matchings in lopsided bipartite graphs; such graphs have so many nodes on one side that it is infeasible to represent them in memory or to identify matchings using standard approaches. Detecting and representing matchings in lopsided bipartite graphs is important for allocating and...
Conference Paper
Full-text available
This paper presents the design and theoretical analysis of a P2P resource exchange market, a novel application of markets to the domain of P2P backup. While the long-term goal is an open market using real money, here we consider a system where monetary transfers are prohibited. We first describe the design of the market and the user interface we...
Article
Full-text available
The work in this article was motivated by the development of an experimental Peer-to-Peer backup system called Storage Exchange System at Microsoft Live Labs. The data from the peers is encoded using Hierarchical Erasure codes and stored on other peers. Such codes have also been independently described by others. We give a new efficient algorithm t...
Article
Full-text available
Using quaternion algebras over a totally real field one can construct families of Ramanujan graphs as quotients of a Bruhat-Tits tree by a discrete subgroup coming from the quaternion algebra. Such constructions were made by Lubotzky-Phillips-Sarnak and Pizer for rational quaternion algebras and by Jordan-Livné for quaternion algebras over totally...
Conference Paper
Full-text available
Peer-to-peer (P2P) backup systems are an attractive alternative to server-based systems because the immense costs of large data centers can be saved by using idle resources on millions of private computers instead. This paper presents the design and theoretical analysis of a market for a P2P backup system. While our long-term goal is an open resour...
Article
Full-text available
Market-based intelligent systems are becoming increasingly important in our everyday lives. However, when the goal of these systems is not the sale or purchase of items, traditional interfaces allowing users to specify bid or ask prices are no longer appropriate. Thus, there is a need for different kinds of user interfaces to interact with these...
Article
Full-text available
We propose constructing provable collision resistant hash functions from expander graphs in which finding cycles is hard. As examples, we investigate two specific families of optimal expander graphs for provable collision resistant hash function constructions: the families of Ramanujan graphs constructed by Lubotzky-Phillips-Sarnak and Pizer respec...
Conference Paper
Full-text available
A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of false-positives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a simple construction of a Bloomier filter. The construction i...
Conference Paper
Full-text available
We present a new method to evaluate large degree isogenies between elliptic curves over finite fields. Previous approaches all have exponential running time in the logarithm of the degree. If the endomorphism ring of the elliptic curve is ‘small’ we can do much better, and we present an algorithm with a running time that is polynomial in the logari...
Article
Full-text available
A Bloom filter is a space efficient structure for storing static sets, where the space efficiency is gained at the expense of a small probability of false-positives. A Bloomier filter generalizes a Bloom filter to compactly store a function with a static support. In this article we give a simple construction of a Bloomier filter. The construction i...
Article
Full-text available
In this article, we give evidence that computing Fourier coefficients of the Hecke eigenforms for composite indices is no easier than factoring integers. In particular, we show that the existence of a polynomial time algorithm that, given n, computes the n-th Fourier coefficient of a (fixed) Hecke eigenform implies that we can factor most RSA modul...
Article
Full-text available
We consider the problem of counting the number of lattice vectors of a given length. We show that the problem is ♯P-complete, resolving an open problem. Furthermore, we show that the problem is at least as hard as integer factorization even for lattices of bounded rank or lattices generated by vectors of bounded norm. Next, we discuss a deterministic al...
Conference Paper
Full-text available
A minimal perfect hash function maps a static set of n keys onto the range of integers {0, 1, 2, …, n − 1}. We present a scalable high performance algorithm based on random graphs for constructing minimal perfect hash functions (MPHFs). For a set of n keys, our algorithm outputs a description of h in expected time O(n). The evaluation of h(x) requires three...
Article
Full-text available
Distortion maps allow one to solve the Decision Diffie-Hellman problem on subgroups of points on the elliptic curve. In the case of ordinary elliptic curves over finite fields, it is known that in most cases there are no distortion maps. In this article we characterize the existence of distortion maps in the remaining cases.
Article
Full-text available
This paper presents a practical digital signature scheme to be used in conjunction with network coding. Our scheme simultaneously provides authentication and detects malicious nodes that intentionally corrupt content on the network.
Article
Full-text available
We consider the problem of checking whether an elliptic curve defined over a given number field has complex multiplication. We study two polynomial time algorithms for this problem, one randomized and the other deterministic. The randomized algorithm can be adapted to yield the discriminant of the endomorphism ring of the curve.
Article
Full-text available
Introduction. The ℓ-th modular polynomial, φ_ℓ(x, y), parameterizes pairs of elliptic curves with an isogeny of degree ℓ between them. Modular polynomials provide the defining equations for modular curves, and are useful in many different aspects of computational number theory and cryptography. For example, computations with modular polynomials have be...
Article
We study higher Arthur-Merlin classes defined via several natural probabilistic operators BP, R and co-R. We investigate the complexity classes they define, and a number of interactions between these operators and the standard polynomial time hierarchy. We prove a hierarchy theorem for these higher Arthur-Merlin classes involving interleaving operat...
Article
Full-text available
We present a new probabilistic algorithm to compute modular polynomials modulo a prime. Modular polynomials parameterize pairs of isogenous elliptic curves and are useful in many aspects of computational number theory and cryptography. Our algorithm has the distinguishing feature that it does not involve the computation of Fourier coefficients of m...
Article
Full-text available
this article and for his insightful comments. I thank Rohit for all those coffee house brainstorming sessions, and Tal for conducting his enthusiastic weekly seminar. Thanks to Madhulika for her support in things great and small. I would like to add that this article contains no original material or content, except possibly in the errors it contains...
Article
Full-text available
We show that the Ramanujan tau function τ(n) can be computed by a randomized algorithm in time O(n^{1/2+ε}) for every ε > 0. The same method also yields a deterministic algorithm that runs in time O(n^{3/4+ε}) for every ε > 0 to compute τ(n). Previous algorithms to compute τ(n) require Ω(n) time.
Conference Paper
Full-text available
We study higher Arthur-Merlin classes defined via several natural probabilistic operators BP, R and coR. We investigate the complexity classes they define, and a number of interactions between these operators and the standard polynomial time hierarchy. We prove a hierarchy theorem for these higher Arthur-Merlin classes involving interleaving operat...
Article
Full-text available
We show that for every ε > 0 and δ > 0 there are squarefree integers that are free of prime factors > X^δ in the interval [X, X + X^{1/2+ε}] for all large enough X. The approach used is a simple variant of the methods used by Balog [Ba87] and by Harman [Har91] in their study of smooth integers in short intervals.
Article
We show that under the abc-conjecture the largest prime factor of 2^n + 1 is
Article
Full-text available
We show that if we pick logd N integers in the interval (1··· N) then the probability that there is a subset of them whose product yields a perfect square is exp c log N log log N for any c < 1 2d. The methods used to prove these results are elementary.
Article
Full-text available
called undeniable signature schemes require prime numbers of the form 2p + 1 such that p is also prime. Sieve methods can yield valuable clues about these distributions and hence allow us to bound the running times of these algorithms. In this treatise we survey the major sieve methods and their important applications in number theory. We apply sieve...
Article
Full-text available
INTRODUCTION Primality Testing is a fundamental problem of Number Theory, for which despite centuries of study no provably efficient algorithms have been devised. Further it has several applications especially in Cryptography. In this treatise we shall survey this beautiful and interesting area. 2. PRIME NUMBERS In this section we shall enumerate t...
Article
Full-text available
We show that there are infinitely many primes p such that the Subgroup Membership Problem for PSL(2, p) belongs to NP ∩ coNP.
Article
We exhibit classes of polynomials whose sets of kth partial derivatives form Gröbner bases for all k, with respect to all term orders. The classes are defined by syntactic constraints on arithmetical formulas defining the polynomials. Read-once formulas without constants have this property for all k, while those with constants have a weaker "Gröbn...
Article
Full-text available
We introduce branching programs augmented with the ability to write to and read from variables other than the inputs. This is a substantial strengthening of the model. We show, however, that Ajtai's size lower bounds for linear-time multi-way branching programs solving a Hamming distance problem [Ajt99a] carry over to the stronger model. This indic...
Article
Full-text available
We exhibit classes of polynomials whose sets of first partial derivatives form Gröbner bases, with respect to all term orders. The classes are defined by syntactic constraints on arithmetical formulas defining the polynomials. Read-once formulas without constants also have this property, while those with constants have a weaker "Gröbner-bounding" prope...
