Using chemometrics for navigating in the large data sets of genomics, proteomics, and metabonomics (GPM)

Umeå University, Umeå, Västerbotten, Sweden
Analytical and Bioanalytical Chemistry (Impact Factor: 3.44). 11/2004; 380(3):419-29. DOI: 10.1007/s00216-004-2783-y
Source: PubMed


This article describes the applicability of multivariate projection techniques, such as principal-component analysis (PCA) and partial least-squares (PLS) projections to latent structures, to the large-volume, high-density data structures obtained within genomics, proteomics, and metabonomics. PCA and PLS, and their extensions, derive their usefulness from their ability to analyze data with many noisy, collinear, and even incomplete variables in both the predictor matrix X and the response matrix Y. Three examples serve as illustrations. The first is a genomics data set and involves modeling of microarray data of cell cycle-regulated genes in the microorganism Saccharomyces cerevisiae. The second contains NMR-metabonomics data, measured on urine samples of male rats treated with either chloroquine or amiodarone. The third and last data set describes sequence-function classification studies in a set of G-protein-coupled receptors using hierarchical PCA.
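As a rough illustration of why projection methods suit data with far more variables than observations, the following minimal sketch computes a PCA through the singular value decomposition. It is not the article's own procedure: the function name `pca_scores`, the synthetic data, and the choice of two components are assumptions made purely for illustration, and it covers neither missing values nor PLS.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return scores, loadings and explained-variance fractions for X."""
    Xc = X - X.mean(axis=0)                        # mean-center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]    # T = U * S
    loadings = Vt[:n_components].T                     # P, one column per component
    explained = s**2 / np.sum(s**2)                    # variance fraction per component
    return scores, loadings, explained[:n_components]

# Hypothetical data: 20 observations, 500 noisy, collinear variables
# that are all driven by just two hidden factors.
rng = np.random.default_rng(0)
n_obs, n_vars = 20, 500
latent = rng.normal(size=(n_obs, 2))
mixing = rng.normal(size=(2, n_vars))
X = latent @ mixing + 0.1 * rng.normal(size=(n_obs, n_vars))

scores, loadings, explained = pca_scores(X, n_components=2)
print(scores.shape, loadings.shape)
print("variance explained by two components:", explained.sum())
```

Because the 500 variables are generated from only two hidden factors, two components should capture almost all of the variance, and the 20 observations can then be inspected in a two-dimensional score plot.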

    • "This approach provides a systemic and objective way of analysing the HPTLC plate. In addition, these classification algorithms have been applied to classify different descriptors in the field of food chemistry [11], forensic chemistry [12], environment chemistry [13], bioinformatics [14], pharmaceutics [15], metabonomics [16], proteomics [17], genomics [18] and so on. Recently, three comprehensive review articles [6] [7] [19] were published which summarised the application of various classification algorithms for differentiating medicinal herbs of different species [20] [21], age [22] and geographical origins [23]. "
    ABSTRACT: Puerariae Lobatae Radix (PLR), the root of Pueraria lobata, is a traditional Chinese medicine for treating diabetes and cardiovascular diseases. Puerariae Thomsonii Radix (PTR), the root of Pueraria thomsonii, comes from a closely related species and has been used as a PLR substitute in clinical practice. The aim of this study was to compare the classification accuracy of high performance thin-layer chromatography (HPTLC) with that of ultra-performance liquid chromatography (UPLC) in differentiating PLR from PTR. Matlab functions were used to facilitate the digitalisation and pre-processing of the HPTLC plates. Seven multivariate classification methods were evaluated for the two chromatographic methods. The results demonstrated that the HPTLC classification models were comparable to the UPLC classification models. In particular, k-nearest neighbours, partial least squares-discriminant analysis, principal component analysis-discriminant analysis and support vector machine-discriminant analysis showed the highest rates of correct species classification, whilst the lowest classification rate was obtained from soft independent modelling of class analogy. In conclusion, HPTLC combined with multivariate analysis is a promising technique for the quality control and differentiation of PLR and PTR.
    Full-text · Article · Feb 2014 · Journal of pharmaceutical and biomedical analysis
    • "Graph Pad Prism 5.0 software was used (GraphPad, San Diego, CA). The multivariate analysis 'Orthogonal partial least squares projections to latent structures' (O-PLS) was used to assess the influence of clinical parameters on vaccine responses with the aid of the SIMCA-P statistical programme (Umetrics, Umeå, Sweden) [22]. Data derived from the O-PLS models were further analyzed using the Spearman rank correlation test and the Mann–Whitney U test as appropriate. "
    ABSTRACT: Background: Vaccination with the 23-valent pneumococcal polysaccharide vaccine (PPV) has been recommended for elderly patients with B cell malignancies and dysfunctions because of their enhanced susceptibility to pneumococcal infections. More recent recommendations advocate the use of conjugate pneumococcal vaccines. Methods: We compared responses to single dose vaccination with either PPV or the 7-valent pneumococcal conjugate vaccine (PCV7) in fifty-six patients ⩾60 years with a diagnosis of multiple myeloma (n = 24), Waldenstrom’s macroglobulinemia (n = 15) and the non-malignant B cell disorder monoclonal gammopathy of undetermined significance (MGUS) (n = 17), and 20 age-matched controls. Serum was collected prior to vaccination and 4–8 weeks later, and analyzed for IgG antibody levels to pneumococcal serotypes 4, 6B, 9V, 14, 18C, 19F, and 23F by ELISA. Functional antibody activity towards pneumococcal serotypes 4 and 14 was measured using an opsonophagocytic killing assay (OPA). Results: All patient groups had lower pre-vaccination IgG antibody and OPA titers to the investigated serotypes compared to the healthy controls. Following vaccination, myeloma patients responded with significant IgG titer increases to 1/7 serotypes and OPA titer increases to 1/2 serotypes. Corresponding IgG and OPA vaccine responses were 3/7 and 0/2 for Waldenstrom patients, 4/7 and 1/2 for MGUS patients, and 4/7 and 2/2 for the healthy controls, respectively. Notably high antibody levels without corresponding OPA titers were seen among a few myeloma patients indicating the presence of non-functional antibodies. Neither of the two vaccines elicited significantly higher serotype-specific IgG concentrations, OPA titers or antibody fold increases in any of the study groups. Hypogammaglobulinemia and ongoing chemotherapy were associated with poor vaccine responses in a multivariate analysis.
    Conclusion: Our findings confirm that B cell malignancies and disorders among elderly patients are associated with suboptimal responses to pneumococcal vaccination. Single-dose PCV7 was not shown to be superior to PPV.
    Full-text · Article · Dec 2013 · Trials in Vaccinology
    • "A range of different univariate and multivariate data analysis methods and different software have been used for analysing SELDI spectral data [11,16,17,19,25,27-30]. We believe that multivariate methods based on latent variables are better suited, as these methods can handle data with more variables than observations and data which are noisy and highly collinear [22,31,32]. They provide a good tool for visualization of the data, detection of patterns and object classification. "
    ABSTRACT: Classical scrapie in sheep is a fatal neurodegenerative disease associated with the conversion of PrPC to PrPSc. Much is known about genetic susceptibility, uptake and dissemination of PrPSc in the body, but many aspects of prion diseases are still unknown. Different proteomic techniques have been used during the last decade to investigate differences in protein profiles between affected animals and healthy controls. We have investigated the protein profiles in serum of sheep with scrapie and healthy controls by SELDI-TOF-MS and LC-MS/MS. Latent variable methods such as Principal Component Analysis, Partial Least Squares-Discriminant Analysis and Target Projection methods were used to describe the MS data. The serum proteomic profiles showed variable differences between the groups both throughout the incubation period and at the clinical end stage of scrapie. At the end stage, the target projection model separated the two groups with a sensitivity of 97.8%, and serum amyloid A was identified as one of the protein peaks that differed significantly between the groups. At the clinical end stage of classical scrapie, ten SELDI peaks significantly discriminated the scrapie group from the healthy controls. During the non-clinical incubation period, individual SELDI peaks were differentially expressed between the groups at different time points. Investigations of differences in -omic profiles can contribute to new insights into the underlying disease processes and pathways, and advance our understanding of prion diseases, but comparison and validation across laboratories is difficult and challenging.
    Full-text · Article · Nov 2013 · BMC Research Notes