Michael Muller

IBM Research · Artificial Intelligence

PhD

About

326
Publications
93,719
Reads
9,172
Citations
Additional affiliations
June 1998 - January 2019
IBM Research
Position
  • Research Staff Member


Publications (326)
Article
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations. Inspired by human documentation practices learned from 80 highly-voted Kaggle notebooks, we...
Preprint
Full-text available
HCI engages with data science through many topics and themes. Researchers have addressed biased dataset problems, arguing that bad data can cause innocent software to produce bad outcomes. But what if our software is not so innocent? What if the human decisions that shape our data-processing software, inadvertently contribute their own sources of b...
Book
Best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of large datasets. Human-centered data science is a new interdisciplinary field that draws from human-computer interaction, social science, statistics, and computational techniques. This book, written by founders of the f...
Preprint
Full-text available
Generative machine learning models have recently been applied to source code, for use cases including translating code between programming languages, creating documentation from code, and auto-completing methods. Yet, state-of-the-art models often produce code that is erroneous or incomplete. In a controlled study with 32 software engineers, we exa...
Preprint
Full-text available
What does it mean for a generative AI model to be explainable? The emergent discipline of explainable AI (XAI) has made great strides in helping people understand discriminative models. Less attention has been paid to generative models that produce artifacts, rather than decisions, as output. Meanwhile, generative AI (GenAI) technologies are maturi...
Preprint
Full-text available
Human-Centered AI (HCAI) is an emerging discipline that aims to create AI systems that amplify [56, 55] and augment [58] human abilities and preserve human control in order to make AI partnerships more productive, enjoyable, and fair [25]. Our workshop aims to bring together researchers and practitioners from the NeurIPS and HCI communities and oth...
Preprint
Full-text available
Translating source code from one programming language to another is a critical, time-consuming task in modernizing legacy applications and codebases. Recent work in this space has drawn inspiration from the software naturalness hypothesis by applying natural language processing techniques towards automating the code translation task. However, due t...
Preprint
Geographically dispersed teams often face challenges in coordination and collaboration, lowering their productivity. Understanding the relationship between team dispersion and productivity is critical for supporting such teams. Extensive prior research has studied these relations in lab settings or using qualitative measures. This paper extends pri...
Article
Full-text available
Explainability of AI systems is critical for users to take informed actions and hold systems accountable. While "opening the opaque box" is important, understanding who opens the box can govern if the Human-AI interaction is effective. In this paper, we conduct a mixed-methods study of how two different groups of whos: people with and without a back...
Preprint
Explainability of AI systems is critical for users to take informed actions and hold systems accountable. While "opening the opaque box" is important, understanding who opens the box can govern if the Human-AI interaction is effective. In this paper, we conduct a mixed-methods study of how two different groups of whos: people with and without a bac...
Conference Paper
Full-text available
Ground-truth labeling is an important activity in machine learning. Many studies have examined how crowdworkers apply labels to records in machine learning datasets. However, there have been few studies that have examined the work of domain experts when their knowledge and expertise are needed to apply labels. We provide a grounded account of the w...
Conference Paper
Full-text available
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algo...
Preprint
Full-text available
Generative models have become adept at producing artifacts such as images, videos, and prose at human-like levels of proficiency. New generative techniques, such as unsupervised neural machine translation (NMT), have recently been applied to the task of generating source code, translating it from one programming language to another. The artifacts p...
Preprint
Full-text available
Labeling data is an important step in the supervised machine learning lifecycle. It is a laborious human activity comprised of repeated decision making: the human labeler decides which of several potential labels to apply to each example. Prior work has shown that providing AI assistance can improve the accuracy of binary decision tasks. However, t...
Preprint
Full-text available
Computational notebooks allow data scientists to express their ideas through a combination of code and documentation. However, data scientists often pay attention only to the code, and neglect creating or updating their documentation during quick iterations, which leads to challenges in sharing their notebooks with others and future selves. Inspire...
Preprint
The development of AI applications is a multidisciplinary effort, involving multiple roles collaborating with the AI developers, an umbrella term we use to include data scientists and other AI-adjacent roles on the same team. During these collaborations, there is a knowledge mismatch between AI developers, who are skilled in data science, and exter...
Preprint
Full-text available
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algo...
Preprint
Data science and machine learning (DS/ML) are at the heart of the recent advancements of many Artificial Intelligence (AI) applications. There is an active research thread in AI, AutoAI, that aims to develop systems for automating the DS/ML lifecycle end to end. However, do DS and ML workers really want to automate their DS/ML workflow? To answer...
Preprint
Full-text available
Recently, the automated translation of source code from one programming language to another by using automatic approaches inspired by Neural Machine Translation (NMT) methods for natural languages has come under study. However, such approaches suffer from the same problem as previous NMT approaches on natural languages, viz. the lack of an ability...
Chapter
Full-text available
Recent progress in machine learning has given rise to a plethora of tools and applications that rely on conversational interactions, from chatbots and speech-controlled devices to robots and virtual agents. Conversational interfaces are becoming widely accepted for utility tools, where a common function is to serve users’ information needs. Albeit wit...
Article
Full-text available
Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online surve...
Preprint
Generative AI is a class of machine learning technology that learns to generate new data from training data. While deep fakes and media- and art-related generative AI breakthroughs have recently caught people's attention and imagination, the overall area is in its infancy for business use. Further, little is known about generative AI's potential for...
Preprint
Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online surve...
Preprint
Full-text available
We explore trust in a relatively new area of data science: Automated Machine Learning (AutoML). In AutoML, AI methods are used to generate and optimize machine learning models by automatically engineering features, selecting models, and optimizing hyperparameters. In this paper, we seek to understand what kinds of information influence data scienti...
Conference Paper
Social media platforms and social network sites generate a multitude of digital trace behavioral data, the scale of which often necessitates the use of computational data science methods. On the other hand, the socio-behavioral and often relational nature of social media data requires attention to the context of user activity traditionally asso...
Preprint
Artificial Intelligence (AI) can now automate the algorithm selection, feature engineering, and hyperparameter tuning steps in a machine learning workflow. Commonly known as AutoML or AutoAI, these technologies aim to relieve data scientists from the tedious manual work. However, today's AutoAI systems often present only limited to no information a...
Preprint
Two general routes have been followed to develop artificial agents that are sensitive to human values: a top-down approach to encode values into the agents, and a bottom-up approach to learn from human actions, whether from real-world interactions or stories. Although both approaches have made exciting scientific progress, they may face challenges...
Article
In society today, people experiencing disability can face discrimination. As artificial intelligence solutions take on increasingly important roles in decision-making and interaction, they have the potential to impact fair treatment of people with disabilities in society both positively and negatively. We describe some of the opportunities and risk...
Article
In recent years there has been an increasing trend in which data scientists and domain experts work together to tackle complex scientific questions. However, such collaborations often face challenges. In this paper, we aim to decipher this collaboration complexity through a semi-structured interview study with 22 interviewees from teams of bio-medi...
Conference Paper
Full-text available
This paper reflects on the expectations of museum guides regarding companion AI-powered robots in a science museum space. We employed Design Fiction as a technique to explore machine teaching of future technologies in public spaces. The fiction is illustrated by an open-ended “imaginary abstract” which showcases the dilemma of buying AI robots to w...
Conference Paper
Qualitative methods have long been an important component of CSCW research. However, it can be challenging to make qualitative work legible to a broader set of researchers, which is critical as mixed methods research becomes more common. Moreover, the shift towards larger scales of data and increasing calls for open data and more transparency pose...
Conference Paper
Full-text available
The rapid advancement of artificial intelligence (AI) is changing our lives in many ways. One application domain is data science. New techniques in automating the creation of AI, known as AutoAI or AutoML, aim to automate the work practices of data scientists. AutoAI systems are capable of autonomously ingesting and pre-processing data, engineering...
Preprint
Full-text available
This paper reflects on the expectations of museum guides regarding companion AI-powered robots in a science museum space. We employed Design Fiction as a technique to explore machine teaching of future technologies in public spaces. The fiction is illustrated by an open-ended "imaginary abstract" which showcases the dilemma of buying AI robots to w...
Preprint
Full-text available
In recent years there has been an increasing trend in which data scientists and domain experts work together to tackle complex scientific questions. However, such collaborations often face challenges. In this paper, we aim to decipher this collaboration complexity through a semi-structured interview study with 22 interviewees from teams of bio-medi...
Preprint
The rapid advancement of artificial intelligence (AI) is changing our lives in many ways. One application domain is data science. New techniques in automating the creation of AI, known as AutoAI or AutoML, aim to automate the work practices of data scientists. AutoAI systems are capable of autonomously ingesting and pre-processing data, engineering...
Conference Paper
Full-text available
By 2019, diversity is an established fact in most workplaces, teams, and work-groups, presenting both old and new challenges to CSCW in terms of team structure and technological supports for increasingly diverse teams. The research literature on diversity and teams has examined many definitions and attributes of diversity, and has described differe...
Conference Paper
Full-text available
As Machine Learning (ML) systems become increasingly ubiquitous, capable and autonomous, it has become essential to take a human-centered view to consider how people's interactions with ML systems, including the effort to develop and evolve ML systems, impact their work practices, wellbeing and the social-organizational environment. Built on our...
Preprint
Full-text available
HCI has a growing body of work regarding important social and community issues, as well as various grassroots communities working to make CHI more international and inclusive. In this workshop, we will build on this work: first reflecting on the contemporary CHI climate, and then developing an actionable plan towards making CHI 2019 and subsequent S...
Conference Paper
Full-text available
With the rise of big data, there has been an increasing need for practitioners in this space and an increasing opportunity for researchers to understand their workflows and design new tools to improve them. Data science is often described as data-driven, comprising unambiguous data and proceeding through regularized steps of analysis. However, this v...
Conference Paper
With the rise of big data, there has been an increasing need to understand who is working in data science and how they are doing their work. HCI and CSCW researchers have begun to examine these questions. In this workshop, we invite researchers to share their observations, experiences, hypotheses, and insights, in the hopes of developing a taxonomy...
Conference Paper
An ongoing challenge within the diverse HCI and social computing research communities is understanding research ethics in the face of evolving technology and methods. Building upon successful town hall meetings at CHI 2018, GROUP 2018 and CSCW 2018, this panel will be structured to facilitate audience discussion and to collect input about current c...
Article
StackExchange is a network of Question & Answer (Q&A) sites that support collaborative knowledge exchange on a variety of topics. Prior research found a significant imbalance between those who contribute content to Q&A sites (predominantly people from Western countries) and those who passively use the site (the so-called "lurkers"). One possible ex...
Conference Paper
An ongoing challenge within the diverse HCI and social computing research communities is understanding research ethics in the face of evolving technology and methods. Building upon successful town hall meetings at ACM conferences including CSCW, CHI, GROUP, and IDC, this panel will be structured to facilitate audience discussion and to collect inpu...
Conference Paper
Full-text available
Many conversational agents (CAs) are developed to answer users' questions in a specialized domain. In everyday use of CAs, user experience may extend beyond satisfying information needs to the enjoyment of conversations with CAs, some of which represent playful interactions. By studying a field deployment of a Human Resource chatbot, we report on u...
Conference Paper
An ongoing challenge within the HCI research community is the development of community norms for research ethics in the face of evolving technology and methods. Building upon a successful town hall meeting at CHI 2017, this panel will include members of the SIGCHI Research Ethics Committee, but will be structured to facilitate a roundtable discussi...
Conference Paper
This one-day workshop will help early career researchers/academics develop their careers in HCI through intensive interaction with senior mentors from academia and industry who are experienced in research and professional service. Application to the workshop is open to all members of the HCI community who have received their PhDs in the past five y...
Conference Paper
The ACM SIGCHI community has been at the forefront of addressing issues of equity and inclusivity in the design and use of technology, accounting for various aspects of users' identities such as gender, ethnicity, and sexuality. With this panel, we wish to explore how we, as SIGCHI, might better target similar goals of equity and inclusivity - acro...
Conference Paper
Full-text available
The story "In the Data Kitchen" appeared online in 2017, and went viral, receiving an astonishing degree of attention for an unattributed work with obscure origins. We review this provocative fiction, discussing its evident resonance with societal concerns and ongoing discussions of big-data ethics.
Conference Paper
Full-text available
When employees participate in organizational crowdfunding, they seek partial funding from their existing social networks. Among proposers of projects, teams with larger social networks tend to be more successful in reaching their funding goals. However, little is known about the consequences of participation on employees' social networks, during an...
Conference Paper
As technology and data access continue to evolve, research ethics in the areas of Human-Computer Interaction and social computing are becoming increasingly complex. Despite increasing interest among researchers, there is still a lack of consistent community norms around ethical gray areas. One charge of the SIGCHI ethics committee is to help develo...