What is bioinformatics? An introduction and overview

Source: CiteSeer

ABSTRACT A flood of data means that many of the challenges in biology are now challenges in computing. Bioinformatics, the application of computational techniques to analyse the information associated with biomolecules on a large scale, has now firmly established itself as a discipline in molecular biology, and encompasses a wide range of subject areas, from structural biology and genomics to gene expression studies. In this review we provide an introduction and overview of the current state of the field. We discuss the main principles that underpin bioinformatics analyses, look at the types of biological information and databases that are commonly used, and finally examine some of the studies that are being conducted, particularly with reference to transcription regulatory systems.

2. Introduction

Biological data are flooding in at an unprecedented rate (1). For example, as of August 2000, the GenBank repository of nucleic acid sequences contained 8,214,000 entries (2) and the SWISS-PROT databas...

  • Source
    ABSTRACT: Pharmacoinformatics is a new, emerging information technology. Related technologies such as neuroinformatics, immunoinformatics, bioinformatics, metabolomics, chemoinformatics, toxicoinformatics, cancer informatics, genome informatics, proteome informatics, and biomedical informatics are basic tools for drug discovery. There is increasing recognition that information technology can be used effectively for drug discovery. Work in pharmacoinformatics can be broadly divided into two categories: scientific aspects and service aspects. The scientific component deals with drug discovery and development activities, whereas the service-oriented aspects are more patient-centric. The compelling drivers for the pharmaceutical industry are minimizing the time between a drug's discovery and its delivery to the marketplace and maintaining high productivity in the manufacturing processes. During a product's lifecycle many complex decisions must be made to achieve these goals. To better support the development and manufacturing processes at each stage, we have proposed a new epitome to facilitate the management and transfer of data, information, and knowledge. In the future these information technology efforts are expected to grow in both reliability and scope. Thus, this emerging technology (pharmacoinformatics) is becoming an essential component of the pharmaceutical sciences.
    American Journal of PharmTech Research. 06/2012;
  • Source
    ABSTRACT: Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log-space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the proposed variants of the original GTM and GTM-FS worked successfully with data of more than 2000 dimensions. We compare the results with other linear and nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and Gaussian Process Latent Variable Model (GPLVM).
  • Source
    ABSTRACT: The availability of huge amounts of data has created a great need for data mining techniques that can generate useful knowledge. In the present study we provide detailed information about data mining techniques, with a focus on classification as an important supervised learning technique. We also discuss the WEKA software as a tool of choice for performing classification analysis on different kinds of available data. A detailed methodology is provided to make the software usable by a wide range of users. The main features of WEKA are 49 data preprocessing tools, 76 classification/regression algorithms, 8 clustering algorithms, 3 algorithms for finding association rules, and 15 attribute/subset evaluators plus 10 search algorithms for feature selection. WEKA extracts useful information from data and helps identify a suitable algorithm for generating an accurate predictive model from it. Moreover, medical bioinformatics analyses have been performed to illustrate the use of WEKA in the diagnosis of leukemia.
    Computer Engineering and Intelligent Systems. 12/2013; 4(13):28-38.

