Molecular Genetics Information System (MOLGENIS): alternatives in developing local experimental genomics databases

Groningen Bioinformatics Center, Faculty of Medical Sciences and Faculty of Mathematics and Natural Sciences, University of Groningen, Groningen, The Netherlands.
Bioinformatics (Impact Factor: 4.62). 10/2004; 20(13):2075-83. DOI: 10.1093/bioinformatics/bth206
Source: DBLP

ABSTRACT Genomic research laboratories need adequate infrastructure to support management of their data production and research workflow. But what makes infrastructure adequate? A lack of appropriate criteria makes any decision on buying or developing a system difficult. Here, we report on the decision process for the case of a molecular genetics group establishing a microarray laboratory.
Five typical requirements for experimental genomics database systems were identified: (i) the ability to evolve in step with the fast-developing genomics field; (ii) a suitable data model to deal with local diversity; (iii) suitable storage of data files in the system; (iv) easy exchange with other software; and (v) low maintenance costs. The computer scientists and the researchers of the local microarray laboratory considered alternative solutions for these five requirements and chose the following options: (i) use of automatic code generation; (ii) a customized data model based on standards; (iii) storage of datasets as black boxes instead of decomposing them into database tables; (iv) loose linking to other programs for improved flexibility; and (v) a low-maintenance web-based user interface. Our team evaluated existing microarray databases and then decided to build a new system, the Molecular Genetics Information System (MOLGENIS), which was implemented using code generation within three months. This case can provide valuable insights and lessons to both software developers and user communities embarking on large-scale genomic projects.
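As an illustration of options (i) and (iii) above, the following minimal sketch generates a table definition from a small declarative model and keeps a dataset as a black box by storing only a reference to the file. It is not MOLGENIS's actual generator or data model: the names `Field`, `Entity`, `generate_ddl`, `HYBRIDIZATION` and all column names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column of an entity in the (hypothetical) data model."""
    name: str
    sql_type: str
    required: bool = True


@dataclass(frozen=True)
class Entity:
    """A named entity with an ordered tuple of fields."""
    name: str
    fields: tuple


def generate_ddl(entity):
    """Generate a CREATE TABLE statement from an entity description."""
    columns = ["  id INTEGER PRIMARY KEY"]
    for field in entity.fields:
        null = " NOT NULL" if field.required else ""
        columns.append(f"  {field.name} {field.sql_type}{null}")
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(columns) + "\n);"


# Hypothetical model of a microarray hybridization record.
HYBRIDIZATION = Entity(
    "hybridization",
    (
        Field("sample_name", "TEXT"),
        Field("protocol", "TEXT", required=False),
        # Black-box storage: the raw dataset stays a file; only its
        # location is recorded instead of decomposing it into tables.
        Field("raw_data_path", "TEXT"),
    ),
)

if __name__ == "__main__":
    print(generate_ddl(HYBRIDIZATION))
```

In a setup like this, a change to the data model means editing the model description and regenerating the code, which is one way a system could keep up with a fast-changing field at low maintenance cost.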


Available from: Ritsert C Jansen, Jun 03, 2015
  • ABSTRACT: Microvillus inclusion disease (MVID) is one of the most severe congenital intestinal disorders and is characterized by neonatal secretory diarrhea and the inability to absorb nutrients from the intestinal lumen. MVID is associated with patient-, family- and ancestry-unique mutations in the MYO5B gene, encoding the actin-based motor protein myosin Vb. Here we review the MYO5B gene and all currently known MYO5B mutations and for the first time methodologically categorize these with regard to functional protein domains and recurrence in MYO7A associated with Usher syndrome and other myosins. We also review animal models for MVID and the latest data on functional studies related to the myosin Vb protein. To collect existing and future information on MVID geno-/phenotypes and facilitate its quick and easy sharing among clinicians and researchers, we have constructed an online MOLGENIS-based international patient registry. This easily accessible database currently contains detailed information on 137 MVID patients together with reported clinical/phenotypic details and 41 unique MYO5B mutations, several of which are unpublished. The future expansion and prospective nature of this registry are expected to improve disease diagnosis, prognosis and genetic counseling.
    Human Mutation 12/2013; 34(12). DOI:10.1002/humu.22440 · 5.05 Impact Factor
  • ABSTRACT: The future of biology will be increasingly driven by the fundamental paradigm shift from hypothesis-driven research to data-driven discovery research employing the massive amounts of available biological data. We identify key technological developments needed to enable this paradigm shift involving (1) the ability to store and manage extremely large datasets which are dispersed over a wide geographical area, (2) development of novel analysis and visualization tools which are capable of operating on enormous data resources without overwhelming researchers with unusable information, and (3) formalisms for integrating mathematical models of biosystems from the molecular level to the organism population level. This will require the development of tools which efficiently utilize high-performance compute power, large storage infrastructures and large aggregate memory architectures. The end result will be the ability of a researcher to integrate complex data from many different sources with simulations to analyze a given system at a wide range of temporal and spatial scales in a single conceptual model.
    Journal of Biological Systems 06/2006; 14(02). DOI:10.1142/S0218339006001805 · 0.96 Impact Factor
  • ABSTRACT: Information systems on the web are becoming important resources for those studying biomedical sciences. One area of study in these sciences focuses on the oral cavity and on the proteins that reside in it. Several online platforms provide specific knowledge on multiple microorganisms and associated proteins, but these are generic and are not designed for specific case studies. This work aims to develop a strategy and a prototype for the storage of information related to the oral cavity, intended for use in research. It will integrate data collected from experimental results with existing references on the web that are maintained by other entities. The prototype allows researchers in the biomedical sciences, without particular expertise in databases, to search for proteins, genes and diseases, and to integrate new test results into the existing database.
    07/2011, Degree: M.Sc., Supervisor: José Luís Oliveira