Miroslav Dudík’s research while affiliated with Microsoft and other places


Publications (77)


SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation
  • Preprint

November 2024

Mikhail Khodak · Alexandra Chouldechova · Miroslav Dudík

Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems. A key challenge is that evaluation data is scarce, and subpopulations arising from intersections of attributes (e.g., race, sex, age) are often tiny. Today, it is common for multiple clients to procure the same AI model from a model developer, and the task of disaggregated evaluation is faced by each customer individually. This gives rise to what we call the multi-task disaggregated evaluation problem, wherein multiple clients seek to conduct a disaggregated evaluation of a given model in their own data setting (task). In this work we develop a disaggregated evaluation method called SureMap that has high estimation accuracy for both multi-task and single-task disaggregated evaluations of blackbox models. SureMap's efficiency gains come from (1) transforming the problem into structured simultaneous Gaussian mean estimation and (2) incorporating external data, e.g., from the AI system creator or from their other clients. Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior together with cross-validation-free tuning via Stein's unbiased risk estimate (SURE). We evaluate SureMap on disaggregated evaluation tasks in multiple domains, observing significant accuracy improvements over several strong competitors.
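As a rough illustration of the two ingredients named in this abstract (MAP estimation under a Gaussian prior, with the prior tuned by SURE instead of cross-validation), here is a minimal Python sketch. It is not the SureMap method itself. It assumes independent per-group estimates with known variances, a fixed prior mean taken from hypothetical external data, and a single prior variance `tau2` chosen over a grid. All function names and numbers are illustrative assumptions.

```python
def map_estimate(y, var, prior_mean, tau2):
    """Posterior mean (= MAP) of each group mean under a N(prior_mean, tau2) prior.

    y[i] is the observed group metric, var[i] > 0 its known sampling variance.
    """
    return [yi - (vi / (vi + tau2)) * (yi - prior_mean) for yi, vi in zip(y, var)]


def sure(y, var, prior_mean, tau2):
    """Stein's unbiased risk estimate of the total squared error of map_estimate.

    Valid when prior_mean is fixed in advance (not computed from y).
    """
    total = 0.0
    for yi, vi in zip(y, var):
        b = vi / (vi + tau2)  # shrinkage weight toward the prior mean
        total += vi + (b * (yi - prior_mean)) ** 2 - 2.0 * vi * b
    return total


def tune_tau2(y, var, prior_mean, grid):
    """Pick the prior variance on the grid that minimizes SURE."""
    return min(grid, key=lambda t: sure(y, var, prior_mean, t))


# Hypothetical per-group accuracies and their sampling variances;
# small groups have large variance and are shrunk more strongly.
y = [0.91, 0.78, 0.60, 0.85]
var = [0.0004, 0.002, 0.02, 0.001]
prior_mean = 0.84  # assumed to come from external data
grid = [i * 0.0005 for i in range(0, 201)]

best_tau2 = tune_tau2(y, var, prior_mean, grid)
estimates = map_estimate(y, var, prior_mean, best_tau2)
```

In this simplified setting the noisiest group estimate moves furthest toward the prior mean, which is the intuition behind pooling information across tiny subpopulations.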



A Unified Model and Dimension for Interactive Estimation

June 2023 · 5 Reads

We study an abstract framework for interactive learning called interactive estimation in which the goal is to estimate a target from its "similarity" to points queried by the learner. We introduce a combinatorial measure called dissimilarity dimension which largely captures learnability in our model. We present a simple, general, and broadly applicable algorithm, for which we obtain both regret and PAC generalization bounds that are polynomial in the new dimension. We show that our framework subsumes and thereby unifies two classic learning models: statistical-query learning and structured bandits. We also delineate how the dissimilarity dimension is related to well-known parameters for both frameworks, in some cases yielding significantly improved analyses.


Fairlearn: Assessing and Improving Fairness of AI Systems
  • Preprint
  • File available

March 2023 · 387 Reads · 6 Citations

Miroslav Dudík · Richard Edgar · [...]

Fairlearn is an open source project to help practitioners assess and improve fairness of artificial intelligence (AI) systems. The associated Python library, also named fairlearn, supports evaluation of a model's output across affected populations and includes several algorithms for mitigating fairness issues. Grounded in the understanding that fairness is a sociotechnical challenge, the project integrates learning resources that aid practitioners in considering a system's broader societal context.



Convex Analysis at Infinity: An Introduction to Astral Space

May 2022 · 86 Reads

Not all convex functions on $\mathbb{R}^n$ have finite minimizers; some can only be minimized by a sequence as it heads to infinity. In this work, we aim to develop a theory for understanding such minimizers at infinity. We study astral space, a compact extension of $\mathbb{R}^n$ to which such points at infinity have been added. Astral space is constructed to be as small as possible while still ensuring that all linear functions can be continuously extended to the new space. Although astral space includes all of $\mathbb{R}^n$, it is not a vector space, nor even a metric space. However, it is sufficiently well-structured to allow useful and meaningful extensions of concepts of convexity, conjugacy, and subdifferentials. We develop these concepts and analyze various properties of convex functions on astral space, including the detailed structure of their minimizers, exact characterizations of continuity, and convergence of descent algorithms.
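A standard one-dimensional example (not taken from the abstract) of a convex function with no finite minimizer is the exponential. Its infimum is never attained and is approached only along a sequence heading to $-\infty$:

```latex
f(x) = e^{x}, \qquad
\inf_{x \in \mathbb{R}} f(x) = 0, \qquad
f(x) > 0 \ \text{for all } x \in \mathbb{R}, \qquad
\lim_{t \to \infty} f(-t) = 0 .
```

Informally, adjoining a point at $-\infty$ to the domain would give this function a minimizer, which is the kind of point at infinity the abstract refers to.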


Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization

February 2022 · 23 Reads

Large-scale machine learning systems often involve data distributed across a collection of users. Federated optimization algorithms leverage this structure by communicating model updates to a central server, rather than entire datasets. In this paper, we study stochastic optimization algorithms for a personalized federated learning setting involving local and global models subject to user-level (joint) differential privacy. While learning a private global model induces a cost of privacy, local learning is perfectly private. We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy. We illustrate our theoretical results with experiments on synthetic and real-world datasets.


Network Structure, Gender Diversity, and Interdisciplinarity Predict the Centrality of AI Organizations

August 2021 · 9 Reads

Artificial intelligence (AI) research plays an increasingly important role in society, impacting key aspects of human life. From face recognition algorithms aiding national security in airports, to software that advises judges in criminal cases, and medical staff in healthcare, AI research is shaping critical facets of our experience in the world. But who are the people and institutional bodies behind this influential research? What are the predictors of influence of AI researchers and research organizations? We study this question using social network analysis, in an exploration of the structural characteristics, i.e., network topology, of research organizations that shape modern AI. In a sample of 158 organizations with 16,385 affiliated authors of published papers in prominent AI conferences (e.g., NeurIPS, FAccT, AIES), we find that both industry and academic research organizations with influential authors are more interdisciplinary, more hierarchical, more gender diverse, and less clustered. Here, authors’ betweenness centrality in co-authorship networks was used as a measure of their influence. We also find that gender minorities (e.g., women) have less influence in the AI community, determined as lower betweenness centrality in co-authorship networks. These results suggest that while diversity adds significant value to AI-based organizations, the individuals contributing to the increased diversity are marginalized in the AI field. We discuss these results in the context of current events with important societal implications.
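For readers unfamiliar with the influence measure mentioned in this abstract, the sketch below computes betweenness centrality on a small undirected, unweighted graph using Brandes' algorithm. It is an illustrative implementation, not the authors' analysis pipeline. The co-authorship graph and author labels are hypothetical, and the scores are unnormalized.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph.

    graph maps each node to an iterable of its neighbours; every
    neighbour must itself be a key of graph.
    """
    centrality = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in centrality.items()}


# Hypothetical co-authorship graph: author "b" bridges two groups.
coauthors = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
}
scores = betweenness_centrality(coauthors)
```

An author who sits on many shortest paths between other authors receives a high score, which is why the measure is used as a proxy for brokerage or influence in collaboration networks.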


Bayesian decision-making under misspecified priors with applications to meta-learning

July 2021 · 13 Reads

Thompson sampling and other Bayesian sequential decision-making algorithms are among the most popular approaches to tackle explore/exploit trade-offs in (contextual) bandits. The choice of prior in these algorithms offers flexibility to encode domain knowledge but can also lead to poor performance when misspecified. In this paper, we demonstrate that performance degrades gracefully with misspecification. We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well-specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon. Our bound does not require the prior to have any parametric form. For priors with bounded support, our bound is independent of the cardinality or structure of the action space, and we show that it is tight up to universal constants in the worst case. Building on our sensitivity analysis, we establish generic PAC guarantees for algorithms in the recently studied Bayesian meta-learning setting and derive corollaries for various families of priors. Our results generalize along two axes: (1) they apply to a broader family of Bayesian decision-making algorithms, including a Monte-Carlo implementation of the knowledge gradient algorithm (KG), and (2) they apply to Bayesian POMDPs, the most general Bayesian decision-making setting, encompassing contextual bandits as a special case. Through numerical simulations, we illustrate how prior misspecification and the deployment of one-step look-ahead (as in KG) can impact the convergence of meta-learning in multi-armed and contextual bandits with structured and correlated priors.
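To make the setting of this abstract concrete, here is a minimal Python sketch of Thompson sampling on a Bernoulli bandit with Beta priors, run once with a uniform prior and once with a deliberately skewed one. It is a simplification and not the paper's experimental setup: the arm means are fixed instead of drawn from a prior, the priors are independent Beta distributions, and all names and numbers are hypothetical.

```python
import random


def thompson_sampling(true_means, prior, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return the total reward.

    true_means[k] is the success probability of arm k (unknown to the learner).
    prior[k] = (alpha, beta) are the Beta prior parameters for arm k.
    """
    rng = random.Random(seed)
    alpha = [a for a, _ in prior]
    beta = [b for _, b in prior]
    total_reward = 0
    for _ in range(horizon):
        # Sample a plausible mean for each arm from its current posterior.
        samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        arm = max(range(len(samples)), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate update of the chosen arm's posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward


true_means = [0.3, 0.7]  # hypothetical arms
horizon = 500
uniform_prior = [(1, 1), (1, 1)]
skewed_prior = [(5, 1), (1, 5)]  # wrongly favours the worse arm

reward_uniform = thompson_sampling(true_means, uniform_prior, horizon, seed=1)
reward_skewed = thompson_sampling(true_means, skewed_prior, horizon, seed=1)
```

Averaging the difference between the two totals over many seeds gives an empirical view of how sensitive the accrued reward is to the choice of prior.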



Citations (54)


... Regarding intersectionality, Herlihy et al. (2024) introduces a regression approach using confidence intervals to measure intersectional groups' performance and demonstrate that strong performance can be achieved even with small samples. Also, several approaches have evaluated fairness on non-binary protected attributes, posing similar challenges. ...

Reference:

Intersectional Divergence: Measuring Fairness in Regression
A structured regression approach for evaluating model performance across intersectional subgroups
  • Citing Conference Paper
  • June 2024

... To mitigate any difference in performance metrics between subgroups, fairness aware loss functions were investigated. This involved using the Fairlearn wrapper for XGBoost (Weerts et al., 2023), which defines a custom loss function that looks to balance model performance across demographic groups of interest. However, this led to a global drop in performance meaning more opportunities for intervention would be missed across the cohort. ...

Fairlearn: Assessing and Improving Fairness of AI Systems

... We examine coauthorship homophily in Section 5.2. Recent work suggests that women in computer science working on artificial intelligence tend to occupy less central positions in collaboration networks [49], but no studies have considered the broader question of how race and/or gender relate to centrality in computer science at large, or whether centrality in coauthorship networks correlates with prestige. ...

Interdisciplinarity, Gender Diversity, and Network Structure Predict the Centrality of AI Organizations
  • Citing Conference Paper
  • June 2022

... On the one hand, the public acceptance and individual use of CTA may strengthen the factual forms of social cohesion that exist in a collective (67) or strengthen factors that are elementary for them (72,82), it may prevent negative consequences for forms of social cohesion that would have happened if CTA were not accepted and used (66), or open up possibilities for reimagining old and establishing new relationships (92). On the other hand, the acceptance and use of CTA may also have a negative impact on a society's forms of social cohesion: by threatening or undermining the forms of social cohesion that exist in a collective (83), by weakening factors that are elementary for them (75,77,80,81,82,86), by reinforcing existing discriminations and worsening the situation for specific groups of people (38,75,90,91), or by using resources for the development and implementation of CTA that would have had more positive effects on the community if used in an alternative way (88,89). ...

Responsible computing during COVID-19 and beyond
  • Citing Article
  • July 2021

Communications of the ACM

... Surowiecki claims that the best way to bring this diversity together is usually through anonymous trading on a market, thereby avoiding some of the herding dynamics that occur when humans encounter one another. Whether or not prices in prediction markets represent real probabilities is vigorously debated among the community and in academic literature (e.g. Brown et al. 2019; Buckley 2017; Pathak et al. 2015). Nevertheless, the idea that they approximate real probabilities is crucial for the format's utility in the crypto community. ...

A comparison of forecasting methods: fundamentals, polling, prediction markets, and experts
  • Citing Article
  • October 2015

The Journal of Prediction Markets

... In order to achieve a balance between real-time performance and the complexity of the optimization problem [20,21], we propose an innovative model that combines the improved MDP with convex programming theory. Specifically, we apply a dyadic space decomposition method based on the Z-transform to reconstruct the MDP problem into a solvable linear programming form, which solves the instability in the traditional model due to initial condition uncertainty and non-smooth state transfer. ...

Reinforcement Learning with Convex Constraints

... Expert guidance has been widely explored to address the sample efficiency and exploration challenges inherent in RL. Imitation learning-based methods, like DAgger (Ross et al., 2011), Hg-DAgger (Le et al., 2018), Soft DAgger (Nazeer et al., 2023) iteratively collect expert feedback on states visited by the learning policy to address distribution shift. Instead of direct action imitation, IRL-based methods select a reward function from the set of possible solutions that best explains expert behavior. ...

Hierarchical Imitation and Reinforcement Learning
  • Citing Conference Paper
  • March 2018

... The healthcare field is undergoing a data-driven revolution, where advanced technologies and data analytics are reshaping our approach to disease understanding, diagnosis, and treatment. Precision medicine, powered by omics data and digital biomarkers, is at the forefront of this transformation, aiming to provide more effective and personalized healthcare solutions [11]. ...

A multifactorial model of T cell expansion and durable clinical benefit in response to a PD-L1 inhibitor

... Dutch census dataset: The Dutch census dataset represents aggregated groups of people in the Netherlands for the year 2001. The primary goal of the Dutch dataset is to predict a person's profession, which can be classified as high level (prestigious) or low level (low earning) [62]. 12. Bank marketing dataset: This dataset consists of 45,211 samples in the finance area. ...

A Reductions Approach to Fair Classification
  • Citing Article
  • March 2018