Xiaoyi Cheng’s scientific contributions


Publications (2)


Figure 2: The stages in an ML lifecycle from problem formulation to deployment and monitoring, with their respective fairness considerations in orange. Clarify can be used at dataset construction and at model testing and monitoring to investigate them.

$AD$, Accuracy Difference: we compare the accuracy, i.e. the fraction of examples for which the prediction equals the label, across groups:

$AD = (TP_a + TN_a)/n_a - (TP_d + TN_d)/n_d$

$RD$, Recall Difference: we compare the recall (the fraction of positive examples that receive a positive prediction) across groups:

$RD = TP_a/n_a^{(1)} - TP_d/n_d^{(1)}$

where subscripts $a$ and $d$ denote the two groups, $n$ counts all examples in a group, and $n^{(1)}$ counts its positively labeled examples.
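For concreteness, here is a minimal Python sketch of these two metrics computed from per-group confusion counts. This is our own illustration, not Clarify's implementation; the function names and example counts are hypothetical.

```python
def accuracy_difference(tp_a, tn_a, n_a, tp_d, tn_d, n_d):
    """AD = (TP_a + TN_a)/n_a - (TP_d + TN_d)/n_d."""
    return (tp_a + tn_a) / n_a - (tp_d + tn_d) / n_d

def recall_difference(tp_a, pos_a, tp_d, pos_d):
    """RD = TP_a/n_a^(1) - TP_d/n_d^(1), where pos_* = n^(1) counts
    the positively labeled examples in each group."""
    return tp_a / pos_a - tp_d / pos_d

# Hypothetical counts for groups a and d (100 examples each).
print(accuracy_difference(tp_a=80, tn_a=10, n_a=100,
                          tp_d=60, tn_d=10, n_d=100))   # 0.2
print(recall_difference(tp_a=80, pos_a=85,
                        tp_d=60, pos_d=80))             # ~0.19
```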
Figure 3: Pre-training bias results in Data Wrangler.
Figure 4: Post-training bias results in Studio Trials.
Figure 5: SHAP feature importance.
Figure 6: Scaling of SHAP on 100,000 examples, measuring time (left) and cost (right) for a varying number of instances.

Table 2: Processing time and cost of bias metrics computation for 1 million examples.

method             | processing time (minutes) | cost (dollars)
pre-training bias  | 1                         | $0.03
post-training bias | 14                        | $0.13
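The feature importances in Figure 5 are SHAP values. As an illustrative sketch, the open-source shap package can produce the same kind of global importance (mean absolute SHAP value per feature) with Kernel SHAP; the toy model and data below are our own assumptions, not the paper's setup.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

# Toy data and model, purely for illustration.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 1] > 1).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Kernel SHAP: explain the positive-class probability against a
# background sample that serves as the baseline distribution.
background = shap.sample(X, 50)
explainer = shap.KernelExplainer(lambda x: model.predict_proba(x)[:, 1],
                                 background)
shap_values = explainer.shap_values(X[:10])   # local attributions, shape (10, 4)

# Global importance as in Figure 5: mean |SHAP value| per feature.
print(np.abs(shap_values).mean(axis=0))
```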


Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud
  • Preprint

September 2021 · 313 Reads

Michaela Hardt · Xiaoguang Chen · Xiaoyi Cheng · [...] · Krishnaram Kenthapadi

Understanding the predictions made by machine learning (ML) models and their potential biases remains a challenging and labor-intensive task that depends on the application, the dataset, and the specific model. We present Amazon SageMaker Clarify, an explainability feature for Amazon SageMaker that launched in December 2020, providing insights into data and ML models by identifying biases and explaining predictions. It is deeply integrated into Amazon SageMaker, a fully managed service that enables data scientists and developers to build, train, and deploy ML models at any scale. Clarify supports bias detection and feature importance computation across the ML lifecycle, during data preparation, model evaluation, and post-deployment monitoring. We outline the desiderata derived from customer input, the modular architecture, and the methodology for bias and explanation computations. Further, we describe the technical challenges encountered and the tradeoffs we had to make. For illustration, we discuss two customer use cases. We present our deployment results including qualitative customer feedback and a quantitative evaluation. Finally, we summarize lessons learned, and discuss best practices for the successful adoption of fairness and explanation tools in practice.
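As a usage sketch, the bias analyses described in the abstract can be launched through the sagemaker Python SDK's clarify module. The IAM role, bucket paths, column names, group encodings, and metric selection below are placeholders we chose for illustration, not values from the paper.

```python
from sagemaker import Session, clarify

session = Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # hypothetical role

# Managed processing job that runs the Clarify analysis container.
processor = clarify.SageMakerClarifyProcessor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://my-bucket/train.csv",   # placeholder path
    s3_output_path="s3://my-bucket/clarify-output",  # placeholder path
    label="label",
    headers=["age", "income", "sex", "label"],       # placeholder schema
    dataset_type="text/csv",
)

bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],  # positive outcome
    facet_name="sex",               # sensitive attribute
    facet_values_or_threshold=[0],  # group to compare against the rest
)

# Pre-training bias on the dataset alone, e.g. class imbalance (CI)
# and difference in positive proportions in labels (DPL).
processor.run_pre_training_bias(
    data_config=data_config,
    data_bias_config=bias_config,
    methods=["CI", "DPL"],
)
```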


Citations (1)


... Quantifying bias in the data is the first step toward correcting disparities and avoiding unfair models. Starting from the definition of what biases are and how they can arise in the data, it becomes essential to understand the context and where the different bias-analysis measures can be applied [Hardt et al. 2021]. For example, consider the attribute sex and, for simplicity, assume two demographic groups (or two classes): men and women. ...

Reference:

Construindo Modelos Justos: Fundamentos, Estratégias e Desafios para uma IA Ética e Equitativa na Saúde [Building Fair Models: Foundations, Strategies, and Challenges for Ethical and Equitable AI in Healthcare]
Amazon SageMaker Clarify: Machine Learning Bias Detection and Explainability in the Cloud
  • Citing Conference Paper
  • August 2021