Daphne Koller, Stanford University | SU
421 publications, 66,744 reads, 71,190 citations
Publications (421)
Studies in mouse have shed important light on human hematopoietic differentiation and disease. However, substantial differences between the two species often limit the translation of findings from mouse to human. Here, we compare previously defined modules of co-expressed genes in human and mouse immune cells based on compendia of genome-wide profi...
Introduction:
A major goal of neonatal medicine is to identify neonates at highest risk for morbidity and mortality. Previously, we developed PhysiScore (Saria et al., 2010), a novel tool for preterm morbidity risk prediction. We now further define links between overall individual morbidity risk, specific neonatal morbidities, and placental pathol...
Aging is one of the most important biological processes and is a known risk factor for many age-related diseases in humans. Studying age-related transcriptomic changes in tissues across the whole body can provide valuable information for a holistic understanding of this fundamental process. In this work, we catalogue age-related gene expression chan...
We consider the problem of parameter estimation and energy minimization for a region-based semantic segmentation model. The model divides the pixels of an image into non-overlapping connected regions, each of which is assigned to a semantic class. In the context of energy minimization, the main problem we face is the large number of putative pixel-to-region...
To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, a small number of samples are...
Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysis of RNA sequencing data from 1641 samples across 43 tissues from 175 individuals, generated as part of the pilot phase of the Genotype-Tissue Expression...
To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, a small number of samples are ava...
Ribosome profiling data report on the distribution of translating ribosomes, at steady-state, with codon-level resolution. We present a robust method to extract codon translation rates and protein synthesis rates from these data, and identify causal features associated with elongation and translation efficiency in physiological conditions in yeast....
New high-frequency, automated data collection and analysis algorithms could offer new insights into complex learning processes, especially for tasks in which students have opportunities to generate unique open-ended artifacts such as computer programs. These approaches should be particularly useful because the need for scalable project-based and st...
B cells produce a diverse antibody repertoire by undergoing gene rearrangements. Pathogen exposure induces the clonal expansion of B cells expressing antibodies that can bind the infectious agent. To assess human B cell responses to trivalent seasonal influenza and monovalent pandemic H1N1 vaccination, we sequenced gene rearrangements encoding the...
To extend our understanding of the genetic basis of human immune function and dysfunction, we performed an expression quantitative trait locus (eQTL) study of purified CD4+ T cells and monocytes, representing adaptive and innate immunity, in a multi-ethnic cohort of 461 healthy individuals. Context-specific cis- and trans-eQTLs were identified, and...
Massive open online courses (MOOCs) enable the delivery of high-quality educational experiences to large groups of students. Coursera, one of the largest MOOC providers, developed a program to provide students with verified credentials as a record of their MOOC performance. Such credentials help students convey achievements in MOOCs to future emplo...
Significance
The immune system must constantly adapt to combat infections and other challenges. This is accomplished by continuously evolving the antibody repertoire, and by maintaining memory of prior challenges. By using next-generation DNA sequencing technology, we have examined the sheer amount of antibody made by individuals during a flu vacci...
Elderly humans show decreased humoral immunity to pathogens and vaccines, yet the effects of aging on B cells are not fully known. Chronic viral infection by CMV is implicated as a driver of clonal T cell proliferations in some aging humans, but whether CMV or EBV infection contributes to alterations in the B cell repertoire with age is unclear. We...
Peer and self-assessment offer an opportunity to scale both assessment and learning to global classrooms. This article reports our experiences with two iterations of the first large online class to use peer and self-assessment. In this class, peer grades correlated highly with staff-assigned grades. The second iteration had 42.9% of students...
In this paper, we tackle the problem of combining features extracted from video for complex event recognition. Feature combination is an especially relevant task in video data, as there are many features we can extract, ranging from image features computed from individual frames to video features that take temporal information into account. To comb...
Broadly neutralizing HIV antibodies (bnAbs) are typically highly somatically mutated, raising doubts as to whether they can be elicited by vaccination. We used 454 sequencing and designed a novel phylogenetic method to model lineage evolution of the bnAbs PGT121-134 and found a positive correlation between the level of somatic hypermutation (SHM) a...
Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by seq...
In 2011, Stanford University offered three online courses, which anyone in the world could enroll in and take for free. Together, these three courses had enrollments of around 350,000 students, making this one of the largest experiments in online education ever performed. Since the beginning of 2012, we have transitioned this effort into a new vent...
Systems and methods can mine structured clinical event data in an electronic health record (EHR) system to determine patient outcomes. Mining the structured clinical event data instead of or in addition to mining discharge summaries can increase the accuracy of patient outcome identification. Sophisticated language models can be used to extract out...
Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. How...
In massive open online courses (MOOCs), peer grading serves as a critical
tool for scaling the grading of complex, open-ended assignments to courses with
tens or hundreds of thousands of students. But despite promising initial
trials, it does not always deliver accurate results compared to human experts.
In this paper, we develop algorithms for est...
Despite the importance of the immune system in many diseases, there are currently no objective benchmarks of immunological health. In an effort to identify such markers, we used influenza vaccination in 30 young (20-30 years) and 59 older subjects (60 to >89 years) as models for strong and weak immune responses, respectively, and assayed their s...
In this paper, we tackle the problem of understanding the temporal structure of complex events in highly varying videos obtained from the Internet. Towards this goal, we utilize a conditional model trained in a max-margin framework that is able to automatically discover discriminative and interesting segments of video, while simultaneously achievin...
Dengue is the most prevalent mosquito-borne viral disease in humans, and the lack of early prognostics, vaccines, and therapeutics contributes to immense disease burden. To identify patterns that could be used for sequence-based monitoring of the antibody response to dengue, we examined antibody heavy-chain gene rearrangements in longitudinal perip...
Genome-wide association studies have identified thousands of loci for common diseases, but, for the majority of these, the mechanisms underlying disease susceptibility remain unknown. Most associated variants are not correlated with protein-coding changes, suggesting that polymorphisms in regulatory regions probably contribute to many disease pheno...
The differentiation of αβT cells from thymic precursors is a complex process essential for adaptive immunity. Here we exploited the breadth of expression data sets from the Immunological Genome Project to analyze how the differentiation of thymic precursors gives rise to mature T cell transcriptomes. We found that early T cell commitment was driven...
The differentiation of hematopoietic stem cells into cells of the immune system has been studied extensively in mammals, but the transcriptional circuitry that controls it is still only partially understood. Here, the Immunological Genome Project gene-expression profiles across mouse immune lineages allowed us to systematically analyze these circui...
Metadata, exposure history and antibody titer pre- and post-vaccine
The differentiation of hematopoietic stem cells into immune cells has been extensively studied in mammals, but the transcriptional circuitry controlling it is still only partially understood. Here, the Immunological Genome Project gene expression profiles across mouse immune lineages allowed us to systematically analyze these circuits. Using a comp...
In this paper, we consider one aspect of the problem of applying decision theory to the design of agents that learn how to make decisions under uncertainty. This aspect concerns how an agent can estimate probabilities for the possible states of the world, given that it only makes limited observations before committing to a decision. We show that th...
In previous work [BGHK92, BGHK93], we have studied the random-worlds approach
-- a particular (and quite powerful) method for generating degrees of belief
(i.e., subjective probabilities) from a knowledge base consisting of objective
(first-order, statistical, and default) information. But allowing a knowledge
base to contain only objective informa...
Much of the knowledge about cell differentiation and function in the immune system has come from studies in mice, but the relevance to human immunology, diseases, and therapy has been challenged, perhaps more from anecdotal than comprehensive evidence. To this end, we compare two large compendia of transcriptional profiles of human and mouse immune...
After infection, many factors coordinate the population expansion and differentiation of CD8(+) effector and memory T cells. Using data of unparalleled breadth from the Immunological Genome Project, we analyzed the CD8(+) T cell transcriptome throughout infection to establish gene-expression signatures and identify putative transcriptional regulato...
This paper re-examines the problem of parameter estimation in Bayesian
networks with missing values and hidden variables from the perspective of
recent work in on-line learning [Kivinen & Warmuth, 1994]. We provide a unified
framework for parameter estimation that encompasses both on-line learning,
where the model is continuously adapted to new dat...
Bayesian networks provide a modeling language and associated inference
algorithm for stochastic domains. They have been successfully applied in a
variety of medium-scale applications. However, when faced with a large complex
domain, the task of modeling using Bayesian networks begins to resemble the
task of programming using logical circuits. In th...
We consider probabilistic inference in general hybrid networks, which include
continuous and discrete variables in an arbitrary topology. We reexamine the
question of variable discretization in a hybrid network aiming at minimizing
the information loss induced by the discretization. We show that a nonuniform
partition across all variables as oppose...
The monitoring and control of any dynamic system depends crucially on the
ability to reason about its current status and its future trajectory. In the
case of a stochastic system, these tasks typically involve the use of a belief
state, a probability distribution over the state of the process at a given
point in time. Unfortunately, the state space...
Dynamic Bayesian networks provide a compact and natural representation for
complex dynamic systems. However, in many cases, there is no expert available
from whom a model can be elicited. Learning provides an alternative approach
for constructing models of dynamic systems. In this paper, we address some of
the crucial computational aspects of learn...
The clique tree algorithm is the standard method for doing inference in
Bayesian networks. It works by manipulating clique potentials - distributions
over the variables in a clique. While this approach works well for many
networks, it is limited by the need to maintain an exact representation of the
clique potentials. This paper presents a new unif...
In previous work, we pointed out the limitations of standard Bayesian
networks as a modeling framework for large, complex domains. We proposed a new,
richly structured modeling language, Object-oriented Bayesian Networks,
that we argued would be able to deal with such domains. However, it turns out
that OOBNs are not expressive enough to model...