Clustering methods for the analysis of DNA microarray data

Source: CiteSeer

ABSTRACT: It is now possible to simultaneously measure the expression of thousands of genes during cellular differentiation and response, through the use of DNA microarrays. A major statistical task is to understand the structure in the data that arise from this technology. In this paper we review various methods of clustering, and illustrate how they can be used to arrange both the genes and cell lines from a set of DNA microarray experiments. The methods discussed are global clustering techniques including hierarchical, K-means, and block clustering, and tree-structured vector quantization. Finally, we propose a new method for identifying structure in subsets of both genes and cell lines that are potentially obscured by the global clustering approaches.

1 Introduction

DNA microarrays and other high-throughput methods for analyzing complex nucleic acid samples make it now possible to measure rapidly, efficiently and accurately the levels of virtually all genes expressed in a biologi...
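As a rough illustration of how a global method such as K-means can arrange both the genes and the cell lines of an expression matrix, here is a minimal sketch in plain Python. It is not the paper's implementation: the toy matrix `expr`, the helper `kmeans`, and its evenly spaced initialisation are all assumptions made for this example. Genes are clustered by treating rows as points; cell lines are clustered by running the same routine on the transposed matrix.

```python
import math


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per row.

    Hypothetical helper for illustration. Centroids are initialised
    from evenly spaced rows, which is deterministic but naive.
    """
    step = len(rows) / k
    centroids = [list(rows[int(i * step)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up log-ratio expression values: 6 genes (rows) x 4 cell lines (columns).
expr = [
    [2.1, 1.9, -1.8, -2.0],
    [1.8, 2.2, -2.1, -1.7],
    [2.0, 2.0, -1.9, -2.2],
    [-1.9, -2.1, 2.0, 1.8],
    [-2.2, -1.8, 1.7, 2.1],
    [-2.0, -2.0, 2.2, 1.9],
]

gene_labels = kmeans(expr, k=2)
# Transpose so that each cell line becomes a point in "gene space".
sample_labels = kmeans([list(col) for col in zip(*expr)], k=2)

print(gene_labels)    # [0, 0, 0, 1, 1, 1]
print(sample_labels)  # [0, 0, 1, 1]
```

Clustering rows and columns separately in this way treats every gene and every cell line globally; it would not, on its own, reveal structure confined to a subset of genes and cell lines.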

  • ABSTRACT: Drug resistance was first identified in cancer cells that express multidrug resistance proteins, which extrude therapeutic agents from the cells and thereby alter the pharmacokinetics, tissue distribution, and pharmacodynamics of drugs. To this end, studies were carried out to investigate the role of pharmacological inhibitors and pharmaceutical excipients, with a primary focus on P-glycoprotein (P-gp). The aim of this study was to investigate holistic changes in transporter gene expression during permeability upon formulation of indomethacin as a solid dispersion. Initial characterization of the indomethacin solid dispersion showed that the drug was dispersed within the carrier in amorphous form. Analysis of permeability data across Caco-2 monolayers revealed that drug absorption increased 4-fold when the drug was reformulated as a solid dispersion. The last phase of the work investigated changes in the expression of transporter genes during permeability. The results showed significant differences in the expression of both ATP-binding cassette (ABC) transporter genes and solute carrier (SLC) transporter genes, suggesting that the inclusion of polyethylene glycol, as well as the change in the molecular form of the drug from crystalline to amorphous, has a significant bearing on the expression of transporter network genes, resulting in differences in drug permeability.
    Journal of Drug Targeting 11/2010; 19(8):615-23. · 2.77 Impact Factor
  • ABSTRACT: Reinforcement-Learning is learning how to best-react to situations, through trial and error. In the Machine-Learning community Reinforcement-Learning is researched with respect to artificial (machine) decision-makers, referred to as agents. The agents are assumed to be situated within an environment which behaves as a Markov Decision Process. This chapter provides a brief introduction to Reinforcement-Learning, and establishes its relation to Data-Mining. Specifically, the Reinforcement-Learning problem is defined; a few key ideas for solving it are described; the relevance to Data-Mining is explained; and an instructive example is presented. Key words: Reinforcement-Learning
    12/2009: pages 401-417
  • ABSTRACT: The idea of ensemble methodology is to build a predictive model by integrating multiple models. It is well-known that ensemble methods can be used for improving prediction performance. In this chapter we provide an overview of ensemble methods in classification tasks. We present all important types of ensemble methods including boosting and bagging. Combining methods and modeling issues such as ensemble diversity and ensemble size are discussed. Key words: Ensemble, Boosting, AdaBoost, Windowing, Bagging, Grading, Arbiter Tree, Combiner Tree
    12/2009: pages 959-979