Jean-Eudes Dazard

Case Western Reserve University School of Medicine, Cleveland, Ohio, United States

Publications (25) · 86.89 Total impact

  • Jean-Eudes Dazard · Hua Xu · J Sunil Rao
    ABSTRACT: We present an implementation in the R language for statistical computing of our recent non-parametric joint adaptive mean-variance regularization and variance stabilization procedure. The method is specifically suited for handling difficult problems posed by high-dimensional multivariate datasets (p ≫ n paradigm), such as 'omics'-type data, among which are that the variance is often a function of the mean, variable-specific estimators of variances are not reliable, and test statistics have low power due to a lack of degrees of freedom. The implementation offers a complete set of features including: (i) a normalization and/or variance stabilization function, (ii) computation of mean-variance-regularized t and F statistics, (iii) generation of diverse diagnostic plots, (iv) synthetic and real 'omics' test datasets, (v) a computationally efficient implementation, using C interfacing, with an option for parallel computing, (vi) a manual and documentation on how to set up a cluster. To make each feature as user-friendly as possible, only one subroutine per functionality is to be handled by the end-user. It is available as an R package, called MVR ('Mean-Variance Regularization'), downloadable from CRAN.
    No preview · Article · Jan 2016
  • Jean-Eudes Dazard · Michael Choe · Michael LeBlanc · J Sunil Rao
    ABSTRACT: PRIMsrc is a novel implementation of a non-parametric bump hunting procedure, based on the Patient Rule Induction Method (PRIM), offering a unified treatment of outcome variables, including censored time-to-event (Survival), continuous (Regression) and discrete (Classification) responses. To fit the model, it uses a recursive peeling procedure with specific peeling criteria and stopping rules depending on the response. To validate the model, it provides an objective function based on prediction error or another specific statistic, as well as two alternative cross-validation techniques, adapted to the task of decision-rule making and estimation in the three types of settings. PRIMsrc comes as an open source R package, including at this point: (i) a main function for fitting a Survival Bump Hunting model with various options allowing cross-validated model selection to control model size (#covariates) and model complexity (#peeling steps) and generation of cross-validated end-point estimates; (ii) parallel computing; (iii) various S3-generic and specific plotting functions for data visualization, diagnostic, prediction, summary and display of results. It is available on CRAN and GitHub.
    No preview · Article · Jan 2016
  • Jean-Eudes Dazard · Michael Choe · Michael LeBlanc · J. Sunil Rao
    ABSTRACT: We introduce a survival/risk bump hunting framework to build a bump hunting model with a censored time-to-event response. Our method, called Survival Bump Hunting, relies on a rule-induction method based on recursive peeling that uses specific survival peeling criteria such as hazard ratio or log-rank test statistics. To validate our model and improve survival prediction accuracy, we describe two alternative cross-validation techniques adapted to the joint task of decision-rule making by recursive peeling (i.e. decision-box) and survival estimation. One is commonly done by averaging test results and the other by combining test samples over the cross-validation loops. In the process, we introduce an objective function based on survival endpoints or prediction-error statistics, such as the log-rank test and the concordance error rate, to optimize the tuning parameter of the model and do model selection in a survival setting. Numerical analyses show the importance of replicated cross-validation and the differences between criteria and techniques. Although other non-parametric survival models exist, none directly addresses the problem of identifying local extrema. We compared our method to regression survival trees and their ensemble version, Cox regression, and survival semi-supervised versions of clustering and PCA. Numerical analyses show how Survival Bump Hunting can come up with different estimates and extreme subgroups, unlike other methods. It provides an insight into the behavior of commonly used models and suggests alternatives to be adopted in practice. Finally, our Survival Bump Hunting framework was applied to a time-to-event HIV clinical dataset. In it, we identified subsets of patients characterized by clinical and demographic covariates with a distinct extreme survival outcome, for which tailored medical interventions could be made. An R package called PrimSRC will be released.
    Preview · Article · Jan 2015 · Statistical Analysis and Data Mining
  • Daniel A Díaz-Pachón · Jean-Eudes Dazard · J. Sunil Rao
    ABSTRACT: Principal Components Analysis is a widely used technique for dimension reduction and characterization of variability in multivariate populations. Our interest lies in studying when and why the rotation to principal components can be used effectively within a response-predictor set relationship in the context of mode hunting. Specifically focusing on the Patient Rule Induction Method (PRIM), we first develop a fast version of this algorithm (fastPRIM) under normality which facilitates the theoretical studies to follow. Using basic geometrical arguments, we then demonstrate how the PC rotation of the predictor space alone can in fact generate improved mode estimators. Simulation results are used to illustrate our findings.
    Preview · Article · Sep 2014
  • ABSTRACT: Background: To determine how diets high in saturated fat could increase polyp formation in the mouse model of intestinal neoplasia, Apc Min/+, we conducted large-scale metabolome analysis and association study of colon and small intestine polyp formation from plasma and liver samples of Apc Min/+ vs. wild-type littermates, kept on low vs. high-fat diet. Label-free mass spectrometry was used to quantify untargeted plasma and acyl-CoA liver compounds, respectively. Differences in contrasts of interest were analyzed statistically by unsupervised and supervised modeling approaches, namely Principal Component Analysis and Linear Model of analysis of variance. Correlation between plasma metabolite concentrations and polyp numbers was analyzed with a zero-inflated Generalized Linear Model. Results: Plasma metabolome in parallel to promotion of tumor development comprises a clearly distinct profile in Apc Min/+ mice vs. wild-type littermates, which is further altered by high-fat diet. Further, functional metabolomics pathway and network analyses in Apc Min/+ mice on high-fat diet revealed associations between polyp formation and plasma metabolic compounds including those involved in amino acid metabolism as well as nicotinamide and hippuric acid metabolic pathways. Finally, we also show changes in liver acyl-CoA profiles, which may result from a combination of Apc Min/+-mediated tumor progression and high-fat diet. The biological significance of these findings is discussed in the context of intestinal cancer progression. Conclusions: These studies show that high-throughput metabolomics combined with appropriate statistical modeling and large-scale functional approaches can be used to monitor and infer changes and interactions in the metabolome and genome of the host under controlled experimental conditions. Further, these studies demonstrate the impact of diet on metabolic pathways and its relation to intestinal cancer progression. Based on our results, metabolic signatures and metabolic pathways of polyposis and intestinal carcinoma have been identified, which may serve as useful targets for the development of therapeutic interventions.
    Preview · Article · Jun 2014 · BMC Systems Biology
  • Daniel A. Diaz-Pachon · J. Sunil Rao · Jean-Eudes Dazard
    ABSTRACT: We show that if we have an orthogonal base ($u_1,\ldots,u_p$) in a $p$-dimensional vector space, and select $p+1$ vectors $v_1,\ldots, v_p$ and $w$ such that the vectors traverse the origin, then the probability of $w$ being closer to all the vectors in the base than to $v_1,\ldots, v_p$ is at least 1/2 and converges as $p$ increases to infinity to a normal distribution on the interval [-1,1]; i.e., $\Phi(1)-\Phi(-1)\approx0.6826$. This result has relevant consequences for Principal Components Analysis in the context of regression and other learning settings, if we take the orthogonal base as the direction of the principal components.
    Preview · Article · Apr 2014
  • ABSTRACT: DEFB4/103A encoding β-defensin 2 and 3, respectively, inhibit CXCR4-tropic (X4) viruses in vitro. We determined whether DEFB4/103A Copy Number Variation (CNV) influences time-to-X4 and time-to-AIDS outcomes. We utilized samples from a previously published Multicenter AIDS Cohort Study (MACS), which provides a longitudinal account of viral tropism in relation to the full spectrum of rates of disease progression. Using traditional models for time-to-event analysis, we investigated association between DEFB4/103A CNV and the two outcomes, and interaction between DEFB4/103A CNV and disease progression groups, Fast and Slow. Time-to-X4 and time-to-AIDS were weakly correlated. There was a stronger relationship between these two outcomes for the fast progressors. DEFB4/103A CNV was associated with time-to-AIDS, but not time-to-X4. The association between higher DEFB4/103A CNV and time-to-AIDS was more pronounced for the slow progressors. DEFB4/103A CNV was associated with time-to-AIDS in a disease progression group-specific manner in the MACS cohort. Our findings may contribute to enhancing current understanding of how genetic predisposition influences AIDS progression.
    Full-text · Article · Dec 2012 · Journal of AIDS & Clinical Research
  • Sudipto Saha · Jean-Eudes Dazard · Hua Xu · Rob M Ewing
    ABSTRACT: Large-scale protein-protein interaction data sets have been generated for several species including yeast and human and have enabled the identification, quantification, and prediction of cellular molecular networks. Affinity purification-mass spectrometry (AP-MS) is the preeminent methodology for large-scale analysis of protein complexes, performed by immunopurifying a specific "bait" protein and its associated "prey" proteins. The analysis and interpretation of AP-MS data sets is, however, not straightforward. In addition, although yeast AP-MS data sets are relatively comprehensive, current human AP-MS data sets only sparsely cover the human interactome. Here we develop a framework for analysis of AP-MS data sets that addresses the issues of noise, missing data, and sparsity of coverage in the context of a current, real world human AP-MS data set. Our goal is to extend and increase the density of the known human interactome by integrating bait-prey and cocomplexed preys (prey-prey associations) into networks. Our framework incorporates a score for each identified protein, as well as elements of signal processing to improve the confidence of identified protein-protein interactions. We identify many protein networks enriched in known biological processes and functions. In addition, we show that integrated bait-prey and prey-prey interactions can be used to refine network topology and extend known protein networks.
    Full-text · Article · Jul 2012 · Journal of Proteome Research
  • Jean-Eudes Dazard · J Sunil Rao
    ABSTRACT: The paper addresses a common problem in the analysis of high-dimensional high-throughput "omics" data, which is parameter estimation across multiple variables in a set of data where the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise test statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that: (i) it employs a novel "similarity statistic"-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters, (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that the usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derive regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, or regular common value-shrinkage estimators, or when the information contained in the sample mean is simply ignored. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called 'MVR' ('Mean-Variance Regularization'), downloadable from the CRAN website.
    No preview · Article · Jul 2012 · Computational Statistics & Data Analysis
  • Jean-Eudes J Dazard · Sudipto Saha · Rob M Ewing
    ABSTRACT: Background: Affinity-Purification Mass-Spectrometry (AP-MS) provides a powerful means of identifying protein complexes and interactions. Several important challenges exist in interpreting the results of AP-MS experiments. First, the reproducibility of AP-MS experimental replicates can be low, due both to technical variability and the dynamic nature of protein interactions in the cell. Second, the identification of true protein-protein interactions in AP-MS experiments is subject to inaccuracy due to high false negative and false positive rates. Several experimental approaches can be used to mitigate these drawbacks, including the use of replicated and control experiments and relative quantification to sensitively distinguish true interacting proteins from false ones. Methods: To address the issues of reproducibility and accuracy of protein-protein interactions, we introduce a two-step method, called ROCS, which makes use of Indicator Prey Proteins to select reproducible AP-MS experiments, and of Confidence Scores to select specific protein-protein interactions. The Indicator Prey Proteins account for measures of protein identifiability as well as protein reproducibility, effectively allowing removal of outlier experiments that contribute noise and affect downstream inferences. The filtered set of experiments is then used in the Protein-Protein Interaction (PPI) scoring step. Prey protein scoring is done by computing a Confidence Score, which accounts for the probability of occurrence of prey proteins in the bait experiments relative to the control experiment, where the significance cutoff parameter is estimated by simultaneously controlling false positives and false negatives against metrics of false discovery rate and biological coherence respectively. In summary, the ROCS method relies on automatic objective criteria for parameter estimation and error-controlled procedures. Results: We illustrate the performance of our method by applying it to five previously published AP-MS experiments, each containing well characterized protein interactions, allowing for systematic benchmarking of ROCS. We show that our method may be used on its own to make accurate identification of specific, biologically relevant protein-protein interactions, or in combination with other AP-MS scoring methods to significantly improve inferences. Conclusions: Our method addresses important issues encountered in AP-MS datasets, making ROCS a very promising tool for this purpose, either on its own or in conjunction with other methods. We anticipate that our methodology may be used more generally in proteomics studies and databases, where experimental reproducibility issues arise. The method is implemented in the R language, and is available as an R package called “ROCS”, freely available from the CRAN repository http://cran.r-project.org/.
    Full-text · Article · Jun 2012 · BMC Bioinformatics
  • Jean-Eudes Dazard · J Sunil Rao · Sanford Markowitz
    ABSTRACT: The question of molecular heterogeneity and of tumoral phenotype in cancer remains unresolved. To understand the underlying molecular basis of this phenomenon, we analyzed genome-wide expression data of colon cancer metastasis samples, as these tumors are the most advanced and hence would be anticipated to be the most likely heterogeneous group of tumors, potentially exhibiting the maximum amount of genetic heterogeneity. Casting a statistical net around such a complex problem proves difficult because of the high dimensionality and multicollinearity of the gene expression space, combined with the fact that genes act in concert with one another and that not all genes surveyed might be involved. We devise a strategy to identify distinct subgroups of samples and determine the genetic/molecular signature that defines them. This involves use of the local sparse bump hunting algorithm, which provides a much more optimal and biologically faithful transformed space within which to search for bumps. In addition, thanks to the variable selection feature of the algorithm, we derived a novel sparse gene expression signature, which appears to divide all colon cancer patients into two populations: a population whose expression pattern can be molecularly encompassed within the bump and an outlier population that cannot be. Although all patients within any given stage of the disease, including the metastatic group, appear clinically homogeneous, our procedure revealed two subgroups in each stage with distinct genetic/molecular profiles. We also discuss implications of such a finding in terms of early detection, diagnosis and prognosis.
    Preview · Article · May 2012 · Statistics in Medicine
  • ABSTRACT: To define a panel of novel protein biomarkers of renal disease. Adults with type 1 diabetes in the Coronary Artery Calcification in Type 1 Diabetes study who were initially free of renal complications (n = 465) were followed for development of micro- or macroalbuminuria (MA) and early renal function decline (ERFD, annual decline in estimated glomerular filtration rate of ≥3.3%). The label-free proteomic discovery phase was conducted in 13 patients who progressed to MA by the 6-year visit and 11 control subjects, and four proteins (Tamm-Horsfall glycoprotein, α-1 acid glycoprotein, clusterin, and progranulin) identified in the discovery phase were measured by enzyme-linked immunosorbent assay in 74 subjects: group A, normal renal function (n = 35); group B, ERFD without MA (n = 15); group C, MA without ERFD (n = 16); and group D, both ERFD and MA (n = 8). In the label-free analysis, a model of progression to MA was built using 252 peptides, yielding an area under the curve (AUC) of 84.7 ± 5.3%. In the validation study, ordinal logistic regression was used to predict development of ERFD, MA, or both. A panel including Tamm-Horsfall glycoprotein (odds ratio 2.9, 95% CI 1.3-6.2, P = 0.008), progranulin (1.9, 0.8-4.5, P = 0.16), clusterin (0.6, 0.3-1.1, P = 0.09), and α-1 acid glycoprotein (1.6, 0.7-3.7, P = 0.27) improved the AUC from 0.841 to 0.889. A panel of four novel protein biomarkers predicted early renal damage in type 1 diabetes. These findings require further validation in other populations for prediction of renal complications and treatment monitoring.
    Full-text · Article · Mar 2012 · Diabetes Care
  • ABSTRACT: Allogeneic hematopoietic stem cell transplantation (SCT) is the only curative therapy for many malignant and nonmalignant conditions. Idiopathic pneumonia syndrome (IPS) is a frequently fatal complication that limits successful outcomes. Preclinical models suggest that IPS represents an immune-mediated attack on the lung involving elements of both the adaptive and the innate immune system. However, the etiology of IPS in humans is less well understood. To explore the disease pathway and uncover potential biomarkers of disease, we performed two separate label-free proteomics experiments defining the plasma protein profiles of allogeneic SCT patients with IPS. Samples obtained from SCT recipients without complications served as controls. The initial discovery study, intended to explore the disease pathway in humans, identified a set of 81 IPS-associated proteins. These data revealed similarities between the known IPS pathways in mice and the condition in humans, in particular in the acute phase response. In addition, pattern recognition pathways were judged to be significant as a function of development of IPS, and from this pathway we chose the lipopolysaccharide-binding protein (LBP) as a candidate molecular diagnostic for IPS, and verified its increase as a function of disease using an ELISA. In a separately designed study, we identified protein-based classifiers that could predict, at day 0 of SCT, patients who: 1) progress to IPS and 2) respond to cytokine neutralization therapy. Using cross-validation strategies, we built highly predictive classifier models of both disease progression and therapeutic response. In sum, data generated in this report confirm previous clinical and experimental findings, provide new insights into the pathophysiology of IPS, identify potential molecular classifiers of the condition, and uncover a set of markers potentially of interest for patient stratification as a basis for individualized therapy.
    No preview · Article · Feb 2012 · Molecular & Cellular Proteomics
  • ABSTRACT: Dendritic cells (DC) direct the magnitude, polarity and effector function of the adaptive immune response. DC express toll-like receptors (TLR), antigen capturing and processing machinery, and costimulatory molecules, which facilitate innate sensing and T cell activation. Once activated, DC can efficiently migrate to lymphoid tissue and prime T cell responses. Therefore, DC play an integral role as mediators of the immune response to multiple pathogens. Elucidating the molecular mechanisms involved in DC activation is therefore central in gaining an understanding of host response to infection. Unfortunately, technical constraints have limited system-wide 'omic' analysis of human DC subsets collected ex vivo. Here we have applied novel proteomic approaches to human myeloid dendritic cells (mDCs) purified from 100 mL of peripheral blood to characterize specific molecular networks of cell activation at the individual patient level, and have successfully quantified over 700 proteins from individual samples containing as little as 200,000 mDCs. The proteomic and network readouts after ex vivo stimulation of mDCs with TLR3 agonists are measured and verified using flow cytometry.
    Full-text · Article · Sep 2011 · Journal of Immunological Methods
  • ABSTRACT: Adenoviruses force quiescent cells to re-enter the cell cycle to replicate their DNA, and for the most part, this is accomplished after they express the E1A protein immediately after infection. In this context, E1A is believed to inactivate cellular proteins (e.g., p130) that are known to be involved in the silencing of E2F-dependent genes that are required for cell cycle entry. However, the potential perturbation of these types of genes by E1A relative to their functions in regulatory networks and canonical pathways remains poorly understood. We have used DNA microarrays analyzed with Bayesian ANOVA for microarray (BAM) to assess changes in gene expression after E1A alone was introduced into quiescent cells from a regulated promoter. Approximately 2,401 genes were significantly modulated by E1A, and of these, 385 and 1033 met the criteria for generating networks and functional and canonical pathway analysis respectively, as determined by using Ingenuity Pathway Analysis software. After focusing on the highest-ranking cellular processes and regulatory networks that were responsive to E1A in quiescent cells, we observed that many of the up-regulated genes were associated with DNA replication, the cell cycle and cellular compromise. We also identified a cadre of up-regulated genes with no previous connection to E1A, including genes that encode components of global DNA repair systems and DNA damage checkpoints. Among the down-regulated genes, we found that many were involved in cell signalling, cell movement, and cellular proliferation. Remarkably, a subset of these was also associated with p53-independent apoptosis, and the putative suppression of this pathway may be necessary in the viral life cycle until sufficient progeny have been produced. These studies have identified for the first time a large number of genes that are relevant to E1A's activities in promoting quiescent cells to re-enter the cell cycle in order to create an optimum environment for adenoviral replication.
    Full-text · Article · May 2011 · BMC Research Notes
  • Jean-Eudes Dazard · J Sunil Rao
    ABSTRACT: The search for structures in real datasets e.g. in the form of bumps, components, classes or clusters is important as these often reveal underlying phenomena leading to scientific discoveries. One of these tasks, known as bump hunting, is to locate domains of a multidimensional input space where the target function assumes local maxima without pre-specifying their total number. A number of related methods already exist, yet are challenged in the context of high dimensional data. We introduce a novel supervised and multivariate bump hunting strategy for exploring modes or classes of a target function of many continuous variables. This addresses the issues of correlation, interpretability, and high-dimensionality (p ≫ n case), while making minimal assumptions. The method is based upon a divide and conquer strategy, combining a tree-based method, a dimension reduction technique, and the Patient Rule Induction Method (PRIM). Important to this task, we show how to estimate the PRIM meta-parameters. Using accuracy evaluation procedures such as cross-validation and ROC analysis, we show empirically how the method outperforms a naive PRIM as well as competitive non-parametric supervised and unsupervised methods in the problem of class discovery. The method has practical application especially in the case of noisy high-throughput data. It is applied to a class discovery problem in a colon cancer micro-array dataset aimed at identifying tumor subtypes in the metastatic stage. Supplemental Materials are available online.
    Preview · Article · Dec 2010 · Journal of Computational and Graphical Statistics
  • ABSTRACT: Crooked tail (Cd) mice bear a gain-of-function mutation in Lrp6, a co-receptor for canonical WNT signaling, and are a model of neural tube defects (NTDs), preventable with dietary folic acid (FA) supplementation. Whether the FA response reflects a direct influence of FA on LRP6 function was tested with prenatal supplementation in LRP6-deficient embryos. The enriched FA (10 ppm) diet reduced the occurrence of birth defects among all litters compared with the control (2 ppm FA) diet, but did so by increasing early lethality of Lrp6−/− embryos while actually increasing NTDs among nulls alive at embryonic days 10–13 (E10–13). Proliferation in cranial neural folds was reduced in homozygous Lrp6−/− mutants versus wild-type embryos at E10, and FA supplementation increased proliferation in wild-type but not mutant neuroepithelia. Canonical WNT activity was reduced in LRP6-deficient midbrain–hindbrain at E9.5, demonstrated in vivo by a TCF/LEF-reporter transgene. FA levels in media modulated the canonical WNT response in NIH3T3 cells, suggesting that although FA was required for optimal WNT signaling, even modest FA elevations attenuated LRP5/6-dependent canonical WNT responses. Gene expression analysis in embryos and adults showed striking interactions between targeted Lrp6 deficiency and FA supplementation, especially for mitochondrial function, folate and methionine metabolism, WNT signaling and cytoskeletal regulation that together implicate relevant signaling and metabolic pathways supporting cell proliferation, morphology and differentiation. We propose that FA supplementation rescues Lrp6Cd/Cd fetuses by normalizing hyperactive WNT activity, whereas in LRP6-deficient embryos, added FA further attenuates reduced WNT activity, thereby compromising development.
    Full-text · Article · Dec 2010 · Human Molecular Genetics
  • ABSTRACT: Diabetes mellitus is estimated to affect approximately 24 million people in the United States and more than 150 million people worldwide. There are numerous end organ complications of diabetes, the onset of which can be delayed by early diagnosis and treatment. Although assays for diabetes are well founded, tests for its complications lack sufficient specificity and sensitivity to adequately guide these treatment options. In our study, we employed a streptozotocin-induced rat model of diabetes to determine changes in urinary protein profiles that occur during the initial response to the attendant hyperglycemia (e.g. the first two months) with the goal of developing a reliable and reproducible method of analyzing multiple urine samples as well as providing clues to early markers of disease progression. After filtration and buffer exchange, urinary proteins were digested with a specific protease, and the relative amounts of several thousand peptides were compared across rat urine samples representing various times after administration of drug or sham control. Extensive data analysis, including imputation of missing values and normalization of all data, was followed by ANOVA analysis to discover peptides that were significantly changing as a function of time, treatment and interaction of the two variables. The data demonstrated significant differences in protein abundance in urine before observable pathophysiological changes occur in this animal model and as a function of the measured variables. These included decreases in relative abundance of major urinary protein precursor and increases in pro-alpha collagen, the expression of which is known to be regulated by circulating levels of insulin and/or glucose. Peptides from these proteins represent potential biomarkers, which can be used to stage urogenital complications from diabetes. The expression change of a pro-alpha 1 collagen peptide was also confirmed via selected reaction monitoring.
    Preview · Article · Jul 2009 · Molecular & Cellular Proteomics
  • ABSTRACT: Standard genetic mapping techniques scan chromosomal segments for location of genetic linkage and association signals. The majority of these methods consider only correlations at single markers and/or phenotypes with explicit detailing of the genetic structure. These methods tend to be limited by their inability to consider the effect of large numbers of model variables jointly. In contrast, we propose a Bayesian analysis of variance (ANOVA) method to categorize individuals based on similarity of multidimensional profiles and attempt to analyze all variables simultaneously. Using Problem 1 of the Genetic Analysis Workshop 15 data set, we demonstrate the method's utility for joint analysis of gene expression levels and single-nucleotide polymorphism genotypes. We show that the method extracts similar information to that of previous genetic mapping analyses, and suggest extensions of the method for mining unique information not previously found.
    Full-text · Article · Feb 2007 · BMC Proceedings
  • ABSTRACT: Human embryonic stem cells (ESC) are undifferentiated and are endowed with the capacities of self-renewal and pluripotential differentiation. Adult stem cells renew their own tissue, but whether they can transdifferentiate to other tissues is still controversial. To understand the genetic program that underlies the pluripotency of stem cells, we compared the transcription profile of ESC with that of progenitor/stem cells of human hematopoietic and keratinocytic origins, along with their mature cells to be viewed as snapshots along tissue differentiation. ESC gene profiles show higher complexity with significantly more highly expressed genes than adult cells. We hypothesize that ESC use a strategy of expressing genes that represent various differentiation pathways and selection of only a few for continuous expression upon differentiation to a particular target. Such a strategy may be necessary for the pluripotency of ESC. The progenitors of either hematopoietic or keratinocytic cells also follow the same design principle. Using advanced clustering, we show that many of the ESC expressed genes are turned off in the progenitors/stem cells followed by a further down-regulation in adult tissues. Concomitantly, genes specific to the target tissue are up-regulated toward mature cells of skin or blood.
    Full-text · Article · Feb 2005 · The FASEB Journal

Publication Stats

367 Citations
86.89 Total Impact Points

Institutions

  • 2012-2014
    • Case Western Reserve University School of Medicine
      • Center for Proteomics and Bioinformatics
      Cleveland, Ohio, United States
  • 2005-2014
    • Case Western Reserve University
      • Center for Proteomics and Bioinformatics
      • Department of Epidemiology and Biostatistics
      • School of Medicine
      Cleveland, Ohio, United States
  • 2004
    • Bar Ilan University
      • Faculty of Life Sciences
      Ramat Gan, Tel Aviv, Israel
  • 2003
    • Weizmann Institute of Science
      • Department of Molecular Cell Biology
      Rehovot, Israel