Robert F Murphy

Carnegie Mellon University | CMU · Computational Biology Department

Ph.D.


292
Publications
24,899
Reads
12,850
Citations
Citations since 2017
32 Research Items
3023 Citations
[Chart: citations per year, 2017-2023; vertical axis 0-500]
Introduction
Robert F. Murphy is the Ray and Stephanie Lane Professor of Computational Biology and Professor of Biological Sciences, Biomedical Engineering, and Machine Learning at Carnegie Mellon, and Honorary Professor of Biology at the University of Freiburg. His group pioneered the application of machine vision to subcellular patterns. His current interests include image-derived models of cell organization, changes in protein location during oncogenesis, and active machine learning approaches to biology.
Additional affiliations
May 2007 - January 2015
Carnegie Mellon University
Position
  • Director, Lane Center for Computational Biology (now Computational Biology Department)
May 1983 - December 2012
Carnegie Mellon University
Description
  • Departments of Biological Sciences, Biomedical Engineering and Machine Learning
June 1974 - October 1979
California Institute of Technology
Position
  • Research Assistant
Education
June 1974 - September 1979
California Institute of Technology
Field of study
  • Biochemistry
September 1971 - May 1974
Columbia University
Field of study
  • Biochemistry


Publications (292)
Preprint
Full-text available
Motivation Multiplexed protein imaging methods provide valuable information on complex tissue structure and cellular heterogeneity. However, costs increase and image quality decreases with the number of biomarkers imaged, and the number of markers that can be measured in the same tissue sample is limited. Results In this work, we propose an effici...
Article
Tetrahymena thermophila possesses arrays of motile cilia that promote fluid flow for cell motility. These consist of intricately organized basal bodies (BBs) that nucleate and position cilia at the cell cortex. Tetrahymena cell geometry and spatial organization of BBs play important roles in cell size, swimming, feeding, and division. How cell geom...
Article
Cell segmentation is a cornerstone of many bioimage informatics studies and inaccurate segmentation introduces error in downstream analysis. Evaluating segmentation results is thus a necessary step for developing segmentation methods as well as for choosing the most appropriate method for a particular type of sample. The evaluation process has typi...
Article
Full-text available
Motivation: Cells contain dozens of major organelles and thousands of other structures, many of which vary extensively in their number, size, shape and spatial distribution. This complexity and variation dramatically complicates the use of both traditional and deep learning methods to build accurate models of cell organization. Most cellular organ...
Preprint
Full-text available
Motivation Cells contain dozens of major organelles and thousands of other structures, many of which vary extensively in their number, size, shape and spatial distribution. This complexity and variation dramatically complicates the use of both traditional and deep learning methods to build accurate models of cell organization. Most cellular organel...
Preprint
Full-text available
Cell segmentation is a cornerstone of many bioimage informatics studies. Inaccurate segmentation introduces computational error in downstream cellular analysis. Evaluating the segmentation results is thus a necessary step for developing the segmentation methods as well as choosing the most appropriate one for a certain kind of tissue or image. The...
Article
Full-text available
Motivation High throughput and high content screening are extensively used to determine the effect of small molecule compounds and other potential therapeutics upon particular targets as part of the early drug development process. However, screening is typically used to find compounds that have a desired effect but not to identify potential undesir...
Article
A major challenge for protein databases is reconciling information from diverse sources. This is especially difficult when some information consists of secondary, human-interpreted rather than primary data. For example, the Swiss-Prot database contains curated annotations of subcellular location that are based on predictions from protein sequence,...
Article
Most of the fascinating phenomena studied in cell biology emerge from interactions among highly organized multi-molecular structures embedded into complex and frequently dynamic cellular morphologies. For the exploration of such systems, computer simulation has proved to be an invaluable tool, and many researchers in this field have developed sophi...
Article
The killing of tumor cells by CD8+ T cells is suppressed by the tumor microenvironment, and increased expression of inhibitory receptors, including programmed cell death protein-1 (PD-1), is associated with tumor-mediated suppression of T cells. To find cellular defects triggered by tumor exposure and associated PD-1 signaling, we established an ex...
Preprint
Full-text available
Most of the fascinating phenomena studied in cell biology emerge from interactions among highly organized multi-molecular structures and rapidly propagating molecular signals embedded into complex and frequently dynamic cellular morphologies. For the exploration of such systems, computational simulation has proved to be an invaluable tool, and many...
Preprint
Full-text available
A key step in understanding the spatial organization of cells and tissues is the ability to construct generative models that accurately reflect that organization. In this paper, we focus on building generative models of electron microscope (EM) images in which the positions of cell membranes and mitochondria have been densely annotated, and propose...
Article
Full-text available
PC12 cells are a popular model system to study changes driving and accompanying neuronal differentiation. While attention has been paid to changes in transcriptional regulation and protein signaling, much less is known about the changes in organization that accompany PC12 differentiation. Fluorescence microscopy can provide extensive information ab...
Article
Full-text available
Motivation: Systematic and comprehensive analysis of protein subcellular location as a critical part of proteomics ("location proteomics") has been studied for many years, but annotating protein subcellular locations and understanding variation of the location patterns across various cell types and states is still challenging. Results: In this w...
Article
Full-text available
Supramolecular signaling assemblies are of interest for their unique signaling properties. A µm scale signaling assembly, the central supramolecular signaling cluster (cSMAC), forms at the center of the interface of T cells activated by antigen-presenting cells. We have determined that it is composed of multiple complexes of a supramolecular volume...
Article
Full-text available
Transformative technologies are enabling the construction of three-dimensional maps of tissues with unprecedented spatial and molecular resolution. Over the next seven years, the NIH Common Fund Human Biomolecular Atlas Program (HuBMAP) intends to develop a widely accessible framework for comprehensively mapping the human body at single-cell resolu...
Conference Paper
Protein complexes play a significant role in the core functionality of cells. These complexes are typically identified by detecting densely connected subgraphs in protein-protein interaction (PPI) networks. Recently, multiple large-scale mass spectrometry-based experiments have significantly increased the availability of PPI data in order to furthe...
Chapter
This chapter describes the procedures necessary to create generative models of the spatial organization of cells directly from microscope images and use them to automatically provide geometries for spatial simulations of cell processes and behaviors. Such models capture the statistical variation in the overall cell architecture as well as the numbe...
Preprint
Full-text available
Supramolecular signaling assemblies are of interest for their unique signaling properties. A µm scale signaling assembly, the central supramolecular signaling cluster (cSMAC), forms at the center of the interface of T cells activated by antigen presenting cells. We have determined that it is composed of multiple complexes of a supramolecular volume...
Article
Full-text available
Within influenza virus infected cells, viral genomic RNA are selectively packed into progeny virions, which predominantly contain a single copy of 8 viral RNA segments. Intersegmental RNA-RNA interactions are thought to mediate selective packaging of each viral ribonucleoprotein complex (vRNP). Clear evidence of a specific interaction network culmi...
Preprint
Full-text available
Cellular differentiation is a complex process requiring the coordination of many cellular components. PC12 cells are a popular model system to study changes driving and accompanying neuronal differentiation. While significant attention has been paid to changes in transcriptional regulation and protein signaling, much less is known about the changes...
Article
Full-text available
Motivation: Cell shape provides both geometry for, and a reflection of, cell function. Numerous methods for describing and modeling cell shape have been described, but previous evaluation of these methods in terms of the accuracy of generative models has been limited. Results: Here we compare traditional methods and deep autoencoders to build ge...
Preprint
Full-text available
Within influenza virus infected cells, viral genomic RNA are selectively packed into progeny virions, which predominantly contain a single copy of 8 viral RNA segments. Intersegmental RNA-RNA interactions are thought to mediate selective packaging of each viral ribonucleoprotein complex (vRNP). Clear evidence of a specific interaction network culmi...
Conference Paper
A key step in understanding the spatial organization of cells and tissues is the ability to construct generative models that accurately reflect that organization. In this paper, we focus on building generative models of electron microscope (EM) images in which the positions of cell membranes and mitochondria have been densely annotated, and propose...
Article
Full-text available
Upstream open reading frames (uORFs), located in transcript leaders (5' UTRs), are potent cis-acting regulators of translation and mRNA turnover. Recent genome-wide ribosome profiling studies suggest that thousands of uORFs initiate with non-AUG start codons. While intriguing, these non-AUG uORF predictions have been made without statistical contro...
Article
Full-text available
Motivation: Efforts to model how signaling and regulatory networks work in cells have largely either not considered spatial organization or have used compartmental models with minimal spatial resolution. Fluorescence microscopy provides the ability to monitor the spatiotemporal distribution of many molecules during signaling events, but as of yet...
Article
Macroautophagy is regarded as a nonspecific bulk degradation process of cytoplasmic material within the lysosome. However, the process has mainly been studied by nonspecific bulk degradation assays using radiolabeling. In the present study we monitor protein turnover and degradation by global, unbiased approaches relying on quantitative mass spectr...
Chapter
Full-text available
Three-dimensional live cell imaging of the interaction of T cells with antigen-presenting cells (APCs) visualizes the subcellular distributions of signaling intermediates during T cell activation at thousands of resolved positions within a cell. These information-rich maps of local protein concentrations are a valuable resource in understanding T c...
Article
Quantitative image analysis procedures are necessary for the automated discovery of effects of drug treatment in large collections of fluorescent micrographs. When compared to their mammalian counterparts, the effects of drug conditions on protein localization in plant species are poorly understood and underexplored. To investigate this relationshi...
Article
Full-text available
As a central element within the RAS/ERK pathway, the serine/threonine kinase BRAF plays a key role in development and homeostasis and represents the most frequently mutated kinase in tumors. Consequently, it has emerged as an important therapeutic target in various malignancies. Nevertheless, the BRAF activation cycle still raises many mechanistic...
Article
Full-text available
Accurate representations of cellular organization for multiple eukaryotic cell types are required for creating predictive models of dynamic cellular function. To this end, we have previously developed the CellOrganizer platform, an open source system for generative modeling of cellular components from microscopy images. CellOrganizer models capture...
Article
Full-text available
Fluorescence microscopy is one of the most important tools in cell biology research because it provides spatial and temporal information to investigate regulatory systems inside cells. This technique can generate data in the form of signal intensities at thousands of positions resolved inside individual live cells. However, given extensive cell-t...
Article
Full-text available
The long-term goal of connecting scales in biological simulation can be facilitated by scale-agnostic methods. We demonstrate that the weighted ensemble (WE) strategy, initially developed for molecular simulations, applies effectively to spatially resolved cell-scale simulations. The WE approach runs an ensemble of parallel trajectories with assign...
Data
This spreadsheet contains the RandTag clone names, tagged gene and subcellular location annotations for the clones used in this work. DOI: http://dx.doi.org/10.7554/eLife.10047.011
Data
The spreadsheet contains the subcellular pattern labels assigned to each group of phenotypes in Figure 4. An image classifier was trained using the pattern labels of Figure 5 and applied to those images in each group in Figure 4. The label assigned with the highest frequency is shown, and any additional labels that were assigned with a probability...
Data
The spreadsheet contains the average feature values for the images for all experiments that passed the post-hoc image quality filtering process, the class each clone was assigned to by visual inspection (for the unperturbed condition), and the distance values that give rise to Figure 5. DOI: http://dx.doi.org/10.7554/eLife.10047.014
Data
The spreadsheet contains the average feature values for all measured experiments, the round each was measured, and the cluster number they were assigned to in each round. DOI: http://dx.doi.org/10.7554/eLife.10047.012
Article
High throughput screening determines the effects of many conditions on a given biological target. Currently, to estimate the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for m...
Article
Full-text available
Characterizing the spatial distribution of proteins directly from microscopy images is a difficult problem with numerous applications in cell biology (e.g. identifying motor-related proteins) and clinical research (e.g. identification of cancer biomarkers). Here we describe the design of a system that provides automated analysis of punctate protein...
Data
Comparison of average distance of puncta from microtubules measured empirically and in our fitted model across proteins, cell types, and proteins and cell types. Each symbol represents a cell type; square for A-431, diamond for U-2 OS and circle for U-251 MG. The lines represent confidence intervals using Tukey’s range test for the empirical data (...
Data
Quality of fitted distributions for punctate proteins. P-P plots comparing the CDFs of the probability of vesicle given distance from microtubule for the fitted model and the empirical distribution are shown for the median cell of each pattern (the same cells as shown in Fig 5 and S4 Fig). (TIF)
Data
Determination of annotation threshold. Receiver operating characteristic curves for the accuracy statistic for determining the in-class threshold are shown for the three cell types. The accuracy corresponding to the optimal threshold is shown as a black circle (see Methods). (TIF)
Data
Representative images from seven patterns and corresponding synthesized protein pattern in U-2OS cells. The left column shows cell images closest to the median of parameter space for cells of that pattern, and the right column shows synthesized protein patterns from the generative model of protein pattern conditional on cell geometry and microtubul...
Data
Results for comparison of HPA proteins to the eleven punctate subpattern classes. The values in the columns for each subpattern are the separability measures for all cells of a given protein with the cells of the founder protein(s) for that subpattern. (XLS)
Data
Updated protein annotations for the UniProt database. The information in S2 Dataset is reformatted and includes Gene Ontology terms to be assigned to each protein. (XML)
Data
Generative model parameters. Radial position is defined as r = L1/(L1+L2) where L1 is the distance between the center of each punctum and the nuclear membrane, and L2 is the distance from the center of each punctum to the cell membrane. Therefore, r is positive if the punctum is outside of the nucleus and negative inside. α is the angle between the...
Data
Updated protein annotations resulting from this work. The file is in XML format appropriate for incorporation into protein databases. These entries are only for those proteins assigned to a single pattern using the thresholds determined in S2 Fig. (XML)
Article
The use of fluorescence microscopy has undergone a major revolution over the past twenty years, both with the development of dramatic new technologies and with the widespread adoption of image analysis and machine learning methods. Many open source software tools provide the ability to use these methods in a wide range of studies, and many molecula...
Article
Full-text available
Modeling cell shape variation is critical to our understanding of cell biology. Previous work has demonstrated the utility of non-rigid image registration methods for the construction of non-parametric nuclear shape models where pairwise deformation distances are measured between all shapes and are embedded into a low dimensional shape space. Using...
Article
Full-text available
Active learning is a powerful tool for guiding an experimentation process. Instead of doing all possible experiments in a given domain, active learning can be used to pick the experiments that will add the most knowledge to the current model. In drug discovery and development especially, active learning has been shown to reduce the number of expe...
Conference Paper
Understanding the dynamics of biochemical networks is a major goal of systems biology. Due to the heterogeneity of cells and the low copy numbers of key molecules, spatially resolved approaches are required to fully understand and model these systems. Until recently, most spatial modeling was performed using geometries obtained either through manua...
Article
Full-text available
Active learning has been shown to reduce the number of experiments needed to obtain high-confidence drug-target predictions. However, in order to actually save experiments using active learning, it is crucial to have a method to evaluate the quality of the current prediction and decide when to stop the experimentation process. Only by applying reliable...
Article
Significance Changes in the expression of proteins are often associated with oncogenesis, and are frequently used as cancer biomarkers. Changes in the subcellular location of proteins have been less frequently investigated. In this paper, we describe a robust pipeline for identifying those proteins whose subcellular location undergoes statistically...
Article
Full-text available
Background Drug discovery and development has been aided by high throughput screening methods that detect compound effects on a single target. However, when using focused initial screening, undesirable secondary effects are often detected late in the development process after significant investment has been made. An alternative approach would be to...
Article
Full-text available
High throughput and high content screening involve determination of the effect of many compounds on a given target. As currently practiced, screening for each new target typically makes little use of information from screens of prior targets. Further, choices of compounds to advance to drug development are made without significant screening against...
Article
T cells are activated through interaction with antigen-presenting cells (APCs). During activation, receptors and signaling intermediates accumulate in diverse spatiotemporal distributions. These distributions control the probability of signaling interactions and thus govern information flow through the signaling system. Spatiotemporally resolved sy...
Article
Full-text available
Evaluation of previous systems for automated determination of subcellular location from microscope images has been done using datasets in which each location class consisted of multiple images of the same representative protein. Here, we frame a more challenging and useful problem where previously unseen proteins are to be classified. Using CD-tagg...
Article
Imaging techniques such as immunofluorescence (IF) and the expression of fluorescent protein (FP) fusions are widely used to investigate the subcellular distribution of proteins. Here we report a systematic analysis of >500 human proteins comparing the localizations obtained in live versus fixed cells using FPs and IF, respectively. We identify sys...
Article
Full-text available
Detection of neuronal cell differentiation is essential to study cell fate decisions under various stimuli and/or environmental conditions. Many tools exist that quantify differentiation by neurite length measurements of single cells. However, quantification of differentiation in whole cell populations remains elusive so far. Because such populatio...
Article
This chapter describes approaches for learning models of subcellular organization from images. The primary utility of these models is expected to be from incorporation into complex simulations of cell behaviors. Most current cell simulations do not consider spatial organization of proteins at all, or treat each organelle type as a single, idealized...
Article
Full-text available
The Human Protein Atlas contains immunofluorescence images showing subcellular locations for thousands of proteins. These are currently annotated by visual inspection. In this paper, we describe automated approaches to analyze the images and their use to improve annotation. We began by training classifiers to recognize the annotated patterns. By ra...
Data