RchyOptimyx: Cellular hierarchy optimization for flow cytometry

Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada.
Cytometry Part A (Impact Factor: 2.93). 12/2012; 81A(12). DOI: 10.1002/cyto.a.22209
Source: PubMed


Analysis of high-dimensional flow cytometry datasets can reveal novel cell populations with poorly understood biology. Following discovery, characterizing these populations in terms of the critical markers involved is an important step: it helps both to better understand their biology and to design simpler marker panels that identify them on simpler instruments and with fewer reagents (e.g., in resource-poor or highly regulated clinical settings). However, current panel-design tools work exclusively from technical parameters (e.g., instrument configurations, spectral overlap, and reagent availability) and do not take the biological characteristics of the target cell populations into account. To address this shortcoming, we developed RchyOptimyx (cellular hieraRCHY OPTIMization), a computational tool that constructs cellular hierarchies by combining automated gating with dynamic programming and graph theory. It provides the best gating strategies for identifying a target population to a desired level of purity or correlation with a clinical outcome, using the simplest possible marker panels. RchyOptimyx can assess and graphically present the trade-offs between marker choice and population specificity in high-dimensional flow or mass cytometry datasets. We present three proof-of-concept use cases for RchyOptimyx: 1) designing a panel of surface markers to identify rare populations that are primarily characterized by their intracellular signature; 2) simplifying the gating strategy for a target cell population; and 3) finding a non-redundant marker set that identifies a target cell population. © 2012 International Society for Advancement of Cytometry.
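The abstract does not spell out the algorithm, so the following is only a minimal sketch of how dynamic programming over a marker hierarchy could rank gating strategies. It is not the RchyOptimyx implementation. It assumes that every subset of the target's markers has already been scored (for example by purity or by correlation with an outcome), and that a gating strategy is judged by the sum of the scores of the intermediate populations it passes through. The function name `best_gating_path`, the `score` callback, and the toy numbers are all hypothetical.

```python
def best_gating_path(markers, score):
    """Find the order of adding markers that maximizes the summed score.

    markers: list of marker names that define the target population.
    score:   hypothetical callback mapping a frozenset of marker names
             to a number (e.g. purity of the population gated on them).
    Returns (total_score, markers_in_gating_order).
    """
    n = len(markers)
    full = (1 << n) - 1
    # best[mask] = (best summed score of any path reaching mask, previous mask)
    best = {0: (0.0, None)}
    # Removing one bit always gives a smaller integer, so iterating masks in
    # increasing order guarantees every parent is solved before its children.
    for mask in range(1, full + 1):
        subset = frozenset(m for i, m in enumerate(markers) if (mask >> i) & 1)
        s = score(subset)
        candidates = []
        for i in range(n):
            if (mask >> i) & 1:
                prev = mask & ~(1 << i)  # parent population without marker i
                candidates.append((best[prev][0] + s, prev))
        best[mask] = max(candidates)
    # Walk back from the full phenotype to recover the marker order.
    order = []
    mask = full
    while mask:
        prev = best[mask][1]
        added = mask ^ prev  # the single bit that was added at this step
        order.append(markers[added.bit_length() - 1])
        mask = prev
    order.reverse()
    return best[full][0], order


# Toy scores for a three-marker target (invented for illustration only).
toy_scores = {
    frozenset(): 0,
    frozenset({"CD3"}): 1,
    frozenset({"CD4"}): 5,
    frozenset({"CD8"}): 2,
    frozenset({"CD3", "CD4"}): 6,
    frozenset({"CD3", "CD8"}): 3,
    frozenset({"CD4", "CD8"}): 9,
    frozenset({"CD3", "CD4", "CD8"}): 10,
}
total, order = best_gating_path(["CD3", "CD4", "CD8"], toy_scores.__getitem__)
print(total, order)
```

With these toy scores, the early markers on the returned path are the ones that contribute most, which is the kind of trade-off between marker choice and specificity the abstract describes. The table has one entry per marker subset, so this sketch scales as 2^n in the number of markers and is only practical for modest panel sizes.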



Available from: Ryan R. Brinkman, Nov 12, 2014
  • Source
    • "The flowType algorithm builds on this framework by allowing the exclusion of the markers [49]. This enables the development of statistical tools (e.g., RchyOptimyx) that can investigate the importance of each marker and exclude high-dimensional redundancies [50]. "
    ABSTRACT: Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. 
Open standards, data, and software are also key parts of flow cytometry bioinformatics. Data standards include the widely adopted Flow Cytometry Standard (FCS) defining how data from cytometers should be stored, but also several new standards under development by the International Society for Advancement of Cytometry (ISAC) to aid in storing more detailed information about experimental design and analytical steps. Open data is slowly growing with the opening of the CytoBank database in 2010 and FlowRepository in 2012, both of which allow users to freely distribute their data, and the latter of which has been recommended as the preferred repository for MIFlowCyt-compliant data by ISAC. Open software is most widely available in the form of a suite of Bioconductor packages, but is also available for web execution on the GenePattern platform.
    PLoS Computational Biology 12/2013; 9(12):e1003365. DOI:10.1371/journal.pcbi.1003365 · 4.62 Impact Factor
  • Source
    ABSTRACT: Introduction: Flow cytometry has been around for over 40 years, but only recently has the opportunity arisen to move into the high-throughput domain. The technology is now available and is highly competitive with imaging tools under the right conditions. Flow cytometry has, however, been a technology focused on its unique ability to study single cells, and appropriate analytical tools are readily available to handle this traditional role. Areas covered: Expansion of flow cytometry to a high-throughput (HT) and high-content technology requires advances in both hardware and analytical tools. The historical perspective of flow cytometry operation is discussed, as well as how the field has changed and what the key changes have been. The authors provide a background and compelling arguments for moving toward HT flow, where there are many innovative opportunities. With alternative approaches now available for flow cytometry, there will be a considerable number of new applications. These opportunities show strong capability for drug screening and functional studies with cells in suspension. Expert opinion: There is no doubt that HT flow is a rich technology awaiting acceptance by the pharmaceutical community. It can provide a powerful phenotypic analytical toolset that has the capacity to change many current approaches to HT screening. The previous restrictions on the technology, based on its reduced capacity for sample throughput, are no longer a major issue. Overcoming this barrier has transformed a mature technology into one that can focus on systems biology questions not previously considered possible.
    Expert Opinion on Drug Discovery 06/2012; 7(8):679-93. DOI:10.1517/17460441.2012.693475 · 3.54 Impact Factor
  • Source
    ABSTRACT: Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
    Nature Methods 02/2013; 10(3). DOI:10.1038/nmeth.2365 · 32.07 Impact Factor