January 2025 · 16 Reads
September 2024 · 9 Reads · 1 Citation
International Journal of Forecasting
June 2024 · 12 Reads
May 2024 · 14 Reads · 1 Citation
May 2024 · 31 Reads · 23 Citations
Science Advances
Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.
January 2024 · 1 Read
SSRN Electronic Journal
August 2023 · 78 Reads
Science is facing a reproducibility crisis. Previous work has proposed incorporating data analysis replications into classrooms as a potential solution. However, despite the potential benefits, it is unclear whether this approach is feasible, and if so, what the involved stakeholders (students, educators, and scientists) should expect from it. Can students perform a data analysis replication over the course of a class? What are the costs and benefits for educators? And how can this solution help benchmark and improve the state of science? In the present study, we incorporated data analysis replications in the project component of the Applied Data Analysis course (CS-401) taught at EPFL (N=354 students). Here we report pre-registered findings based on surveys administered throughout the course. First, we demonstrate that students can replicate previously published scientific papers, most of them qualitatively and some exactly. We find discrepancies between what students expect of data analysis replications and what they experience by doing them, along with changes in expectations about reproducibility, which together serve as evidence of attitude shifts that foster students' critical thinking. Second, we provide information for educators about how much overhead is needed to incorporate replications into the classroom and identify concerns that replications bring as compared to more traditional assignments. Third, we identify tangible benefits of the in-class data analysis replications for scientific communities, such as a collection of replication reports and insights about replication barriers in scientific work that should be avoided going forward. Overall, we demonstrate that incorporating replication tasks into a large data science class can increase the reproducibility of scientific work as a by-product of data science instruction, thus benefiting both science and students.
August 2023 · 215 Reads · 4 Citations
Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear reporting standards for ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (Reporting Standards For Machine Learning Based Science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed based on a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.
August 2023 · 91 Reads · 17 Citations
Proceedings of the National Academy of Sciences
Traditionally, scientists have placed more emphasis on communicating inferential uncertainty (i.e., the precision of statistical estimates) compared to outcome variability (i.e., the predictability of individual outcomes). Here, we show that this can lead to sizable misperceptions about the implications of scientific results. Specifically, we present three preregistered, randomized experiments where participants saw the same scientific findings visualized as showing only inferential uncertainty, only outcome variability, or both and answered questions about the size and importance of findings they were shown. Our results, composed of responses from medical professionals, professional data scientists, and tenure-track faculty, show that the prevalent form of visualizing only inferential uncertainty can lead to significant overestimates of treatment effects, even among highly trained experts. In contrast, we find that depicting both inferential uncertainty and outcome variability leads to more accurate perceptions of results while appearing to leave other subjective impressions of the results unchanged, on average.
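To make the distinction above concrete, the following minimal Python sketch contrasts inferential uncertainty (the standard error of an estimated treatment effect) with outcome variability (the spread of individual outcomes). The simulated data, effect size, and variable names are illustrative assumptions, not the stimuli used in the experiments.

```python
# Minimal sketch of the distinction drawn above: inferential uncertainty
# (precision of the estimated mean, e.g. a 95% CI) vs. outcome variability
# (spread of individual outcomes). Simulated data; not the paper's stimuli.
import numpy as np

rng = np.random.default_rng(0)
treatment = rng.normal(loc=5.0, scale=10.0, size=500)  # individual outcomes
control = rng.normal(loc=0.0, scale=10.0, size=500)

diff_means = treatment.mean() - control.mean()

# Inferential uncertainty: standard error of the difference in means.
se = np.sqrt(treatment.var(ddof=1) / len(treatment) +
             control.var(ddof=1) / len(control))

# Outcome variability: spread of individual outcomes around each group mean.
sd_pooled = np.sqrt((treatment.var(ddof=1) + control.var(ddof=1)) / 2)

print(f"estimated effect: {diff_means:.2f}")
print(f"95% CI (inferential uncertainty): "
      f"[{diff_means - 1.96 * se:.2f}, {diff_means + 1.96 * se:.2f}]")
print(f"typical individual spread (outcome variability): +/- {sd_pooled:.2f}")
# The CI is narrow, yet individuals vary far more than the average effect,
# which is exactly what an error-bars-only visualization can obscure.
```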
August 2023 · 29 Reads
Numerical perspectives help people understand extreme and unfamiliar numbers (e.g., $330 billion is about $1,000 per person in the United States). While research shows perspectives to be helpful, generating them at scale is challenging both because it is difficult to identify what makes some analogies more helpful than others, and because what is most helpful can vary based on the context in which a given number appears. Here we present and compare three policies for large-scale perspective generation: a rule-based approach, a crowdsourced system, and a model that uses Wikipedia data and semantic similarity (via BERT embeddings) to generate context-specific perspectives. We find that the combination of these three approaches dominates any single method, with different approaches excelling in different settings and users displaying heterogeneous preferences across approaches. We conclude by discussing our deployment of perspectives in a widely-used online word processor.
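As a rough illustration of the embedding-based policy described above, the sketch below ranks candidate perspectives for a number by semantic similarity to the sentence in which it appears. The model name, context sentence, and candidate analogies are placeholder assumptions for illustration; this is not the system deployed in the paper.

```python
# A minimal sketch of context-aware perspective selection: embed the sentence
# containing the target number and a set of candidate analogies, then pick the
# candidate most semantically similar to the context. Model choice and the
# candidate strings are illustrative placeholders, not the paper's data.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

context = "The relief bill allocates $330 billion in new federal spending."
candidates = [
    "about $1,000 per person in the United States",
    "roughly the yearly revenue of a large tech company",   # placeholder analogy
    "about the cost of building several major bridges",     # placeholder analogy
]

# Score each candidate by cosine similarity to the context sentence.
ctx_emb = model.encode(context, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(ctx_emb, cand_embs)[0]

best_idx = int(scores.argmax())
print(f"best perspective ({scores[best_idx].item():.2f}): {candidates[best_idx]}")
```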
... Data analogies link abstract data to familiar concepts to improve understanding. Researchers evaluate these analogies using controlled experiments and assess effectiveness through subjective ratings like helpfulness [10,35,43,64,66], estimation errors [35,64], and correlations between model and human ratings [66]. Analogies also facilitate scientific discovery and design. ...
May 2024
... for checklists). For our description of machine learning-based modelling, we adhered to the REFORMS checklist (Kapoor et al., 2024; Supplement 1). The reporting of the final publication will similarly adhere to these guidelines as well as the full PRISMA guidelines (Page et al., 2021). Regarding open science practices, all supplementary materials for this protocol are available via the Open Science Framework (OSF), at https://osf.io/yuhp8/. ...
May 2024
Science Advances
... In working papers, results have been mixed. Some studies have found that interacting with AI tools improves skill on subsequent tests in which AI tools are not available (20,21), while others have shown null or even negative effects (21)(22)(23). Notably, these studies examine AI tutors, chatbots, or explanations explicitly designed to support learning, rather than simply providing solutions as is typical in real-world use. ...
January 2023
SSRN Electronic Journal
... Hofman et al. [27] introduced a sports metaphor to conceptualize the spectrum of the impact of Generative AI on human cognition. They describe three distinct roles that AI can play: steroids, sneakers, and coach. ...
January 2023
SSRN Electronic Journal
... However, these approaches often fall short when applied to the complexity and scale of contemporary datasets, which frequently involve multifaceted social phenomena and diverse data sources (Lundberg et al. 2022). Machine learning offers transformative opportunities to address these challenges by enabling researchers to uncover complex patterns in large datasets, thereby enhancing the reliability and depth of theory validation (Wang et al. 2023; Kapoor et al. 2023). ...
August 2023
... In this way, averages-in-themselves are not necessarily problematic, rather their de-contextualized use is (e.g., an inattention to surrounding variation and the constructed nature of the target being measured). For instance, while analyses of distributions are thought to buffer against essentialist inferences by framing average "group" properties as probabilities and not certainties (Lockhart, 2023), research practices often reduce distributions down to point estimate (i.e., average) comparisons and interpret them as such (Zhang et al., 2023) or interpret race distributions in a way that obscures the race realism or racist causality that underlies them (Holland, 2008;James, 2008;Winston, 2020a;Zuberi, 2000). Put simply, scientific practices powerfully shape how race(s) can be reified (K. ...
August 2023
Proceedings of the National Academy of Sciences
... An AI model should possess four features to be called an LLM: deep comprehension of natural language for tasks such as translation; ability to create human-like text; contextual awareness, especially in knowledge-intensive domains; and strong problem-solving and decision-making using text-based information [7]. They offer a wide range of applications and services across various domains such as healthcare [8], customer support [9], code generation and evaluation [10], finance [11], and education [10,11]. They fall into two categories: Encoder-Decoder or Encoder-Only (BERT-style LLMs) and Decoder-Only (GPT-style LLMs). ...
July 2023
... The Rashomon effect refers to the observation that, for a given task, there tend to be many disparate, equally good models [6]. This phenomenon presents both challenges and opportunities: the Rashomon effect leads to the related phenomenon of predictive multiplicity [19,25,31,49,50], wherein equally good models may yield different predictions for any individual, but has also led to theoretical insights into model simplicity [4,40,41] and been applied to produce robust measures of variable importance [9,11,13,43]. A more thorough discussion of the implications of the Rashomon effect can be found in [36]. ...
June 2023
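As a toy illustration of the predictive multiplicity idea in the Rashomon-effect snippet above, the following sketch fits two different model classes to the same synthetic task; they reach near-identical held-out accuracy yet disagree on a nontrivial share of individual predictions. The dataset and model choices are illustrative assumptions, not taken from the cited papers.

```python
# Minimal sketch of predictive multiplicity: two models with near-identical
# held-out accuracy can still assign different labels to many individuals.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

m1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
m2 = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

p1, p2 = m1.predict(X_te), m2.predict(X_te)
print(f"accuracy m1: {m1.score(X_te, y_te):.3f}, m2: {m2.score(X_te, y_te):.3f}")
# "Equally good" models in the Rashomon-set sense, yet their individual-level
# predictions differ for a nontrivial fraction of cases:
print(f"disagreement rate: {(p1 != p2).mean():.3f}")
```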
... While positive feelings persisted over time, negative feelings had greater saliency but faded away or evolved through self-reflection [68,71] (but see also [57] and [72]). Neuroscience academics should reduce the narration of 'success stories' [11,40] to emphasise, instead, how things can be learnt from negative experiences. Medical education studies can analyse personal and career-related trajectories with current and former academics to see if the 'four trajectories of academic identity development (one of stable academic identity and three of lost academic identity) and four narratives of attrition (disillusionment, a search for new purpose, refusal to sacrifice personal life and academic inadequacy)' [16] can be replicated in neuroscience education. ...
January 2023
Judgment and Decision Making
... In real-estate markets, for example, precise, non-round values for offers appear to trigger more fine-grained pricing scales among market participants, which leads to lower counteroffers compared to round offers (Leib et al., 2021). In some cases, rounding may simply occur due to the aesthetic appeal of round numbers or the fact that they are easier to cognitively process (Nguyen et al., 2022); but often, and especially in estimation tasks, the choice of round numbers is clearly related to uncertainty. This will become evident in the subsequent analyses but is also apparent from past research: for instance, accounting research finds that analysts who round earnings-per-share forecasts are less informed, make less effort, and have fewer resources (Herrmann and Thomas, 2005). ...
April 2022