
Anne Elizabeth Carpenter, PhD
Broad Institute of MIT and Harvard · Imaging Platform
About
Publications: 331
Reads: 79,512
Citations: 25,909
Introduction
Dr. Anne Carpenter is an Institute Scientist directing the Imaging Platform at the Broad Institute of MIT and Harvard. Her research group develops algorithms and data analysis methods for large-scale experiments involving images. The team’s open-source CellProfiler software is used by thousands of biologists worldwide (www.cellprofiler.org). Carpenter is a pioneer in image-based profiling, the extraction of rich, unbiased information from images for a number of important applications in drug discovery and functional genomics.
Additional affiliations
January 2007 - present
June 2003 - December 2006
June 1997 - May 2003
Education
September 1997 - May 2003
September 1995 - May 1997
September 1994 - May 1995
Wheaton College, Illinois
Field of study
- Biology
Publications (331)
Robust, generalizable approaches to identify compounds efficiently with undesirable mechanisms of action in complex cellular assays remain elusive. Such a process would be useful for hit triage during high-throughput screening and, ultimately, predictive toxicology during drug development. Here we generate cell painting and cellular health profiles...
Image-based profiling quantitatively assesses the effects of perturbations on cells by capturing a breadth of changes via microscopy. Here, we provide two complementary protocols to help explore and interpret data from image-based profiling experiments. In the first protocol, we examine the similarity among perturbed cell samples using data from co...
Cellular exposure to free fatty acids (FFA) is implicated in the pathogenesis of obesity-associated diseases. However, studies to date have assumed that a few select FFAs are representative of broad structural categories, and there are no scalable approaches to comprehensively assess the biological processes induced by exposure to dive...
Quantitative microscopy is a powerful method for performing phenotypic screens, from which image-based profiling can extract a wealth of information, termed profiles. These profiles can be used to elucidate the changes in cellular phenotypes across cell populations from different patient samples or following genetic or chemical perturbations. One s...
The morphology of cells is dynamic and mediated by genetic and environmental factors. Characterizing how genetic variation impacts cell morphology can provide an important link between disease association and cellular function. Here, we combined genomic and high-content imaging approaches on iPSCs from 297 unique donors to investigate the relations...
Cells can be perturbed by various chemical and genetic treatments and the impact on gene expression and morphology can be measured via transcriptomic profiling and image-based assays, respectively. The patterns observed in these high-dimensional profile data can power a dozen applications in drug discovery and basic biology research, but both types...
Imaging flow cytometry combines the high-event-rate nature of flow cytometry with the advantages of single-cell image acquisition associated with microscopy. The measurement of large numbers of features from the resulting images provides rich data sets that have resulted in a wide range of novel biomedical applications. In this Primer, we discuss t...
Background: Different asthma phenotypes are driven by molecular endotypes. A Th1-high phenotype is linked to severe, therapy-refractory asthma, subclinical infections and neutrophil inflammation. Previously, we found neutrophil granulocytes (NGs) from asthmatics exhibit decreased chemotaxis towards leukotriene B4 (LTB4), a chemoattractant involve...
Morphological and gene expression profiling can cost-effectively capture thousands of features in thousands of samples across perturbations by disease, mutation, or drug treatments, but it is unclear to what extent the two modalities capture overlapping versus complementary information. Here, using both the L1000 and Cell Painting assays to profile...
A medical diagnosis sets a principal investigator on a new path.
Identifying the chemical regulators of biological pathways is a time-consuming bottleneck in developing therapeutics and research compounds. Typically, thousands to millions of candidate small molecules are tested in target-based biochemical screens or phenotypic cell-based screens, both expensive experiments customized to each disease. Here, our u...
A primary obstacle in translating genetics and genomics data into therapeutic strategies is elucidating the cellular programs affected by genetic variants and genes associated with human diseases. Broadly applicable high-throughput, unbiased assays offer a path to rapidly characterize gene and variant function and thus illuminate disease mechanisms...
Measuring the phenotypic effect of treatments on cells through imaging assays is an efficient and powerful way of studying cell biology, and requires computational methods for transforming images into quantitative data that highlights phenotypic outcomes. Here, we present an optimized strategy for learning representations of treatment effects from...
Robust, generalizable approaches to identify compounds efficiently with undesirable mechanisms of action in complex cellular assays remain elusive. Such a process would be useful for hit triage during high-throughput screening and, ultimately, predictive toxicology during drug development. We generated cell painting and cellular health profiles for...
In image-based profiling, software extracts thousands of morphological features of cells from multi-channel fluorescence microscopy images, yielding single-cell profiles that can be used for basic research and drug discovery. Powerful applications have been proven, including clustering chemical and genetic perturbations based on their similar morph...
Successful mapping of cancer dependencies requires conducting genetic and pharmacological screens in a diversity of cell models. However, existing model development approaches require long periods of culture time during which evolutionary pressures reduce heterogeneity. It also remains difficult to create long-term models of many cancers, greatly l...
Intestinal fibrosis is a common complication of several enteropathies with inflammatory bowel disease being the major cause. The progression of intestinal fibrosis may lead to intestinal stenosis and obstruction. Even with an increased understanding of tissue fibrogenesis, there are no approved treatments for intestinal fibrosis. Historically, drug...
Resolving fundamental molecular and functional processes underlying human synaptic development is crucial for understanding normal brain function as well as dysfunction in disease. Based upon increasing evidence of species divergent features of brain cell types, coupled with emerging studies of complex human disease genetics, we developed the first...
Purification is essential before differentiating human induced pluripotent stem cells (hiPSCs) into cells that fully express particular differentiation marker genes. High-quality iPSC clones are typically purified through gene expression profiling or visual inspection of the cell morphology; however, the relationship between the two methods remains...
Most variants in most genes across most organisms have an unknown impact on the function of the corresponding gene. This gap in knowledge is especially acute in cancer, where clinical sequencing of tumors now routinely reveals patient-specific variants whose functional impact on the corresponding gene is unknown, impeding clinical utility. Transcri...
A variational autoencoder (VAE) is a machine learning algorithm, useful for generating a compressed and interpretable latent space. These representations have been generated from various biomedical data types and can be used to produce realistic-looking simulated data. However, standard vanilla VAEs suffer from entangled and uninformative latent sp...
Quantitative optical microscopy—an emerging, transformative approach to single-cell biology—has seen dramatic methodological advancements over the past few years. However, its impact has been hampered by challenges in the areas of data generation, management, and analysis. Here we outline these technical and cultural challenges and provide our pers...
We present a new, carefully designed and well-annotated dataset of images and image-based profiles of cells that have been treated with chemical compounds and genetic perturbations. Each gene that is perturbed is a known target of at least two compounds in the dataset. The dataset can thus serve as a benchmark to evaluate methods for predicting sim...
Software has provided cell biologists the power to quantify specific cellular features in cell images at scale. Before long, these biologists also recognized the potential to extract much more biological information from the same images. From here, the field of image-based profiling, the process of extracting unbiased representations that capture m...
Patient stem cell-derived models enable imaging of complex disease phenotypes and the development of scalable drug discovery platforms. Current preclinical methods for assessing cellular activity do not, however, capture the full intricacies of disease-induced disturbances, and instead typically focus on a single parameter, which impairs both the u...
In this paper, we summarize a global survey of 484 participants of the imaging community, conducted in 2020 through the NIH Center for Open BioImage Analysis (COBA). This 23-question survey covered experience with image analysis, scientific background and demographics, and views and requests from different members of the imaging community. Through...
Deep profiling of cell states can provide a broad picture of biological changes that occur in disease, mutation, or in response to drug or chemical treatments. Morphological and gene expression profiling, for example, can cost-effectively capture thousands of features in thousands of samples across perturbations, but it is unclear to what extent th...
Evolving in sync with the computation revolution over the past 30 years, computational biology has emerged as a mature scientific field. While the field has made major contributions toward improving scientific knowledge and human health, individual computational biology practitioners at various institutions often languish in career development. As...
Background: Imaging data contains a substantial amount of information which can be difficult to evaluate by eye. With the expansion of high-throughput microscopy methodologies producing increasingly large datasets, automated and objective analysis of the resulting images is essential to effectively extract biological information from this data. Cell...
Populations of cells can be perturbed by various chemical and genetic treatments and the impact on the cells' gene expression (transcription, i.e. mRNA levels) and morphology (in an image-based assay) can be measured in high dimensions. The patterns observed in this profile data can be used for more than a dozen applications in drug discovery and ba...
Obesity and its associated metabolic syndrome are a leading cause of morbidity and mortality in the United States. Given the disease’s heavy burden on patients and the healthcare system, there has been increased interest in identifying pharmacological targets for the treatment and prevention of obesity. Towards this end, genome-wide association stu...
Image-based experiments can yield many thousands of individual measurements describing each object of interest, such as cells in microscopy screens. CellProfiler Analyst is a free, open-source software package designed for the exploration of quantitative image-derived data and the training of machine learning classifiers with an intuitive user inte...
The in vitro micronucleus assay is a globally significant method for DNA damage quantification used for regulatory compound safety testing in addition to inter-individual monitoring of environmental, lifestyle and occupational factors. However, it relies on time-consuming and user-subjective manual scoring. Here we show that imaging flow cytometry...
Identifying chemical regulators of biological pathways is a time-consuming bottleneck in developing therapeutics and research compounds. Typically, thousands to millions of candidate small molecules are tested in target-based biochemical screens or phenotypic cell-based screens, both expensive experiments customized to each disease. Here, our broad...
Human induced pluripotent stem cell-derived (iPSC) neural cultures offer clinically relevant models of human diseases, including Amyotrophic Lateral Sclerosis, Alzheimer’s, and Autism Spectrum Disorder. In situ characterization of the spatial-temporal evolution of cell state in 3D culture and subsequent 2D dissociated culture models based on protei...
Deep learning offers the potential to extract more than meets the eye from images captured by imaging flow cytometry. This protocol describes the application of deep learning to single-cell images to perform supervised cell classification and weakly supervised learning, using example data from an experiment exploring red blood cell morphology. We d...
The in vitro micronucleus assay is a globally significant method for DNA damage quantification used for regulatory compound safety testing in addition to inter-individual monitoring of environmental, lifestyle and occupational factors. However, it relies on time-consuming and user-subjective manual scoring. Here we show that imaging flow cytometry a...
ImageJ and CellProfiler have long been leading open‐source platforms in the field of bioimage analysis. ImageJ's traditional strength is in single‐image processing and investigation, while CellProfiler is designed for building large‐scale, modular analysis pipelines. Although many image analysis problems can be well solved with one or the other, us...
Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline...
Fast-paced innovations in imaging have resulted in single systems producing exponential amounts of data to be analyzed. Computational methods developed in computer science labs have proven to be crucial for analyzing these data in an unbiased and efficient manner, reaching a prominent role in most microscopy studies. Still, their use usually requir...
Microscopy images are rich in information about the dynamic relationships among biological structures. However, extracting this complex information can be challenging, especially when biological structures are closely packed, distinguished by texture rather than intensity, and/or low intensity relative to the background. By learning from large amou...
Successful mapping of cancer dependencies requires conducting genetic and drug screens on a diversity of models. However, the difficulty in generating long-term models of many cancers limits the share of patient samples that can be studied. Such long-term models have likely also lost the cellular heterogeneity present in the original tumor due to i...
Genetic and chemical perturbations impact diverse cellular phenotypes, including multiple indicators of cell health. These readouts reveal toxicity and antitumorigenic effects relevant to drug discovery and personalized medicine. We developed two customized microscopy assays, one using four targeted reagents and the other three targeted reagents, t...
Human induced pluripotent stem cell-derived (iPSC) neural cultures offer clinically relevant models of human diseases, including Amyotrophic Lateral Sclerosis, Alzheimer’s, and Autism Spectrum Disorder. In situ characterization of the spatial-temporal evolution of cell state in 2D and 3D culture models and organoids based on protein expression leve...
Image-based profiling is a maturing strategy by which the rich information present in biological images is reduced to a multidimensional profile, a collection of extracted image-based features. These profiles can be mined for relevant patterns, revealing unexpected biological activity that is useful for many steps in the drug discovery process. Suc...
Neuronal synapses contain hundreds of different protein species important for regulating signal transmission. Characterizing differential expression profiles of proteins within synapses in distinct regions of the brain has revealed a high degree of synaptic diversity defined by unique molecular organization. Multiplexed imaging of in vitro rat prim...
Recent advances in deep learning enable using chemical structures and phenotypic profiles to accurately predict assay results for compounds virtually, reducing the time and cost of screens in the drug discovery process. The relative strength of high-throughput data sources - chemical structures, images (Cell Painting), and gene expression profiles...
Background: Identified as an Alzheimer's disease (AD) susceptibility gene by genome-wide association studies, BIN1 has 10 isoforms that are expressed in the Central Nervous System (CNS). The distribution of these isoforms in different cell types, as well as their role in AD pathology, remains unclear. Methods: Utilizing antibodies targeting...
Winning the American Society for Cell Biology’s Women in Cell Biology Mid-career Award is incredibly meaningful to me, as it validates that someone focusing on engineering and applications can be a “real” cell biologist, too. Single-minded devotion to studying a particular biological process is not a prerequisite for a career in science and academi...
Significance: We developed a strategy to avoid human subjectivity by assessing the quality of red blood cells using imaging flow cytometry and deep learning. We successfully automated traditional expert assessment by training a computer with example images of healthy and unhealthy morphologies. However, we noticed that experts disagree on ∼18% of ce...
Background: The mechanisms that regulate platelet biogenesis remain unclear; factors that trigger megakaryocytes (MKs) to initiate platelet production are poorly understood. Platelet formation begins with proplatelets, which are cellular extensions originating from the MK cell body. Objectives: Proplatelet formation is an asynchronous and dynamic pro...
Background: A common yet still manual task in basic biology research, high-throughput drug screening and digital pathology is identifying the number, location, and type of individual cells in images. Object detection methods can be useful for identifying individual cells as well as their phenotype in one step. State-of-the-art deep learning for ob...
Genetic and chemical perturbations impact diverse cellular phenotypes, including multiple indicators of cell health. These readouts reveal toxicity and antitumorigenic effects relevant to drug discovery and personalized medicine. We developed two customized microscopy assays that use seven reagents to measure 70 specific cell health phenotypes incl...
Dr. Anne Carpenter addresses her career path from cell biology toward computation. Why would a researcher move outside their comfort zone into a different field, from a domain into data science? What is the best way to bridge domain and data? What is challenging about moving from domain toward data? What is amazing about bridging domain and data?
Neuronal synapses contain hundreds of different protein species important for regulating signal transmission. Characterizing differential expression profiles of proteins within synapses in distinct regions of the brain has revealed a high degree of synaptic diversity defined by unique molecular organization. Multiplexed imaging of in vitro neuronal...
Learning rules by which cell shape impacts cell function would enable control of cell physiology and fate in medical applications, particularly, on the interface of cells and material of the implants. We defined the phenotypic response of human bone marrow-derived mesenchymal stem cells (hMSCs) to 2176 randomly generated surface topographies by pro...
Single-cell segmentation is typically a crucial task of image-based cellular analysis. We present nucleAIzer, a deep-learning approach aiming toward a truly general method for localizing 2D cell nuclei across a diverse range of assays and light microscopy modalities. We outperform the 739 methods submitted to the 2018 Data Science Bowl on images re...
Filtered through the analytical power of artificial intelligence, the wealth of available biomedical data promises to revolutionize cancer research, diagnosis and care. In this Viewpoint, six experts discuss some of the challenges, exciting developments and future questions arising at the interface of machine learning and oncology.
Acute lymphoblastic leukemia (ALL) is the most common childhood cancer. While there are a number of well‐recognized prognostic biomarkers at diagnosis, the most powerful independent prognostic factor is the response of the leukemia to induction chemotherapy (Campana and Pui: Blood 129 (2017) 1913–1918). Given the potential for machine learning to i...