Maurice HT Ling
Research skills
-
IT: Python programming, relational database, graph database
-
Statistical: ANOVA, Bootstrapping, Randomization
-
Other: Simulation
Research interests
-
Interests: Python programming, Evolutionary Ecology, biomedical text mining, Data Warehousing
Research experience
-
Jul 2009
Research: HygDAS - Hypergraph Database Management System
http://hygdas.sf.net -
Jul 2009
Research: BeSSY - Formally Specifying the Behaviours of Software Components
http://bessy.sf.net -
Jan 2009–Dec 2011
Research: Adaptive Evolution (Examining Evolution in the Laboratory)
Singapore Polytechnic · School of Chemical and Life Sciences · Singapore Polytechnic http://maurice.vodien.com/portfolio/RprojLUCAexpevo.html -
Nov 2008
Research: Which Genes do not Change in Expression?
Singapore Polytechnic · School of Chemical and Life Sciences · Singapore Polytechnic http://maurice.vodien.com/portfolio/RprojLUCAinvariant.html -
Sep 2008
Research: CyNote - Cyber Laboratory Notebook for Biologists and Bioinformaticists
e-notebook, 21 CFR Part 11 http://cynote.sf.net -
Jan 2007
Research: COPADS - Collection of Python Algorithms and Data Structures
http://copads.sf.net -
Jul 2006
Research: Muscorian - Mining Biomedical Literature for Protein-Protein Interactions
http://muscorian.sf.net -
Jul 2003
Research: OpenDWS - Open Data Warehousing Suite
http://opendws.sf.net
Education
-
Jul 2004–May 2009
University of Melbourne
Bioinformatics · Doctor of Philosophy · Australia · Melbourne -
Jul 2003–Jul 2004
University of Melbourne
Molecular and Cell Biology · Bachelor of Science (Degree with Honours) · Australia · Melbourne
Other
-
Languages: English, Chinese
-
Scientific Memberships:
Member, Association for Computing Machinery
Member, Institute of Mathematical Statistics
Member, Association of Medical and Bioinformatics (Singapore)
Vice President, Python Users Group (Singapore)
Senior Fellow, International Fitness Association -
Journal Referee:
Recent Patents in Biotechnology
International Journal of Parallel and Distributed Systems -
Other Interests:
Co-Editor-in-Chief, The Python Papers Anthology
Chief Editor, Computational and Mathematical Biology
Honorary Fellow, The University of Melbourne
Publications
-
Mapping Relational Operations onto Hypergraph Model
05/2011;
The relational model is the most commonly used data model for storing large datasets, perhaps due to the simplicity of the tabular format, which revolutionized database management systems. However, many real-world objects are recursive and associative in nature, which makes storage in the relational model difficult. The hypergraph model is a generalization of the graph model in which each hypernode can be made up of other nodes or graphs and each hyperedge can be made up of one or more edges. It may address the recursive and associative limitations of the relational model. However, the hypergraph model is non-tabular and thus loses the simplicity of the relational model. In this study, we consider a means to convert a relational model into a hypergraph model in two layers. At the bottom layer, each relational tuple is considered as a star graph in which the primary key node at the centre is surrounded by non-primary key attributes. At the top layer, each tuple is a hypernode, and a relation is a set of hypernodes. We present a reference implementation of relational operators (project, rename, select, inner join, natural join, left join, right join, outer join and Cartesian join) on a hypergraph model. Using a simple example, we demonstrate that a relation and relational operators can be implemented on this hypergraph model.
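The two-layer mapping described in this abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the function names (`tuple_to_star`, `relation_to_hypernodes`, `select`, `project`), the dictionary layout of a hypernode, and the sample `staff` relation are all hypothetical.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not the reference implementation described in the paper.

def tuple_to_star(row, key):
    """Bottom layer: turn one relational tuple into a star graph.

    The primary key value is the centre node; each non-key attribute
    becomes an edge (attribute name -> value) attached to the centre.
    """
    return {"centre": row[key],
            "edges": {k: v for k, v in row.items() if k != key}}


def relation_to_hypernodes(rows, key):
    """Top layer: a relation is a collection of hypernodes, one per tuple."""
    return [tuple_to_star(row, key) for row in rows]


def select(hypernodes, predicate):
    """Relational select: keep hypernodes whose attributes satisfy predicate."""
    return [h for h in hypernodes if predicate(h["edges"])]


def project(hypernodes, attributes):
    """Relational project: keep only the named non-key attributes."""
    return [{"centre": h["centre"],
             "edges": {a: h["edges"][a] for a in attributes if a in h["edges"]}}
            for h in hypernodes]


# Hypothetical example relation with primary key "id".
staff = [
    {"id": 1, "name": "Ann", "dept": "Bio"},
    {"id": 2, "name": "Bob", "dept": "IT"},
]
nodes = relation_to_hypernodes(staff, key="id")
bio = select(nodes, lambda edges: edges["dept"] == "Bio")
names = project(nodes, ["name"])
```

Joins would follow the same pattern, pairing hypernodes from two relations whose shared attributes match; they are omitted here to keep the sketch short.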
-
Bactome II: Analyzing Gene List for Gene Ontology Over-Representation
The Python Papers Source Codes. 01/2011; 3:3.
Microarray is an experimental tool that allows several thousand genes to be screened in a single experiment, and its analysis often requires mapping genes onto biological processes. This allows for the examination of processes that are over-represented. A number of tools have been developed, but they differ in the organisms that can be analyzed. The Gene Ontology website has a list of up-to-date annotation files for different organisms that can be used for over-representation analysis. Each file maps each gene of the organism to its ontological terms. Bactome II is a simple tool that allows users to use the up-to-date annotation files to generate the expected and observed counts for each GO identifier (GO ID) from a given gene list for further statistical analyses.
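The observed and expected counts mentioned in this abstract can be illustrated with a short Python sketch. It is not the published Bactome II code: the function name `go_counts`, the in-memory `annotation` dictionary (standing in for a parsed annotation file), the placeholder GO IDs, and the definition of the expected count as the genome-wide frequency scaled to the size of the gene list are all assumptions made for illustration.

```python
from collections import Counter


def go_counts(annotation, gene_list):
    """Return {GO ID: (observed, expected)} for a gene list.

    annotation: dict mapping every gene of the organism to a set of GO IDs
                (assumed to have been parsed from an annotation file).
    observed:   number of listed genes annotated with the GO ID.
    expected:   genome-wide count of the GO ID scaled to the list size
                (an assumed definition, used here for illustration).
    """
    genome = Counter(go for terms in annotation.values() for go in terms)
    listed = [g for g in gene_list if g in annotation]
    observed = Counter(go for g in listed for go in annotation[g])
    total = len(annotation)
    return {go: (observed[go], genome[go] * len(listed) / total)
            for go in genome}


# Hypothetical annotation with placeholder GO IDs.
annotation = {
    "g1": {"GO:A"},
    "g2": {"GO:A", "GO:B"},
    "g3": {"GO:B"},
    "g4": {"GO:C"},
}
counts = go_counts(annotation, ["g1", "g2"])
```

The resulting pairs could then be passed to a statistical test of the user's choice, for example a chi-square test, which is the "further statistical analyses" step the abstract leaves to the user.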
-
Impact points: 3.43
Soft tagging of overlapping high confidence gene mention variants for cross-species full-text gene normalization.
BMC bioinformatics. 01/2011; 12 Suppl 8:S6.
Previously, gene normalization (GN) systems have mostly focused on disambiguation using contextual information. An effective gene mention tagger was deemed unnecessary because the subsequent steps would filter out false positives, so high recall was considered sufficient. However, unlike similar tasks in past BioCreative challenges, the BioCreative III GN task is particularly challenging because it is not species-specific. When full-length articles must be processed, an ineffective gene mention tagger may produce a huge number of ambiguous false positives that overwhelm subsequent filtering steps while still missing many true positives. We present our GN system, which participated in the BioCreative III GN task. Our system applies a typical 2-stage approach to GN but features a soft tagging gene mention tagger that generates a set of overlapping gene mention variants with nearly perfect recall. The overlapping gene mention variants increase the chance of a precise match in the dictionary and alleviate the need for disambiguation. Our GN system achieved a precision of 0.9 (F-score 0.63) on the BioCreative III GN test corpus with the silver annotation of 507 articles. Its TAP-k scores are competitive with the best results among all participants. We show that despite the lack of clever disambiguation in our gene normalization system, effective soft tagging of gene mention variants can indeed contribute to performance in cross-species and full-text gene normalization.
-
Impact points: 3.58
High expression stability of microtubule affinity regulating kinase 3 (MARK3) makes it a reliable reference gene.
IUBMB life. 03/2010; 62(3):200-3.
Differences in gene expression are characteristic of the function of different cell types, and genes with low expression variance can be used as standards for quantitative gene expression studies. Microarray technology is used to study global gene expression within a cell and hence represents a suitable source of data to mine for genes with low expression variance. The coefficient of variation (COV) of each gene was determined, and a threshold of less than 0.1 COV was used to select stably expressed genes in each data set. Our results showed that microtubule affinity-regulating kinase 3 (MARK3) has the lowest COV in eight microarray datasets. In addition, the expression of housekeeping genes, which would be expected to be stable, tends to fluctuate highly under different conditions, marking them as less reliable for use as reference genes.
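The COV filter described in this abstract can be sketched as follows. This is an illustrative sketch rather than the authors' analysis code: the function name `stable_genes`, the gene names, and the expression values are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
from statistics import mean, stdev


def stable_genes(expression, threshold=0.1):
    """Return {gene: COV} for genes whose COV is below the threshold.

    expression: dict mapping a gene name to its expression values
                across the samples of one data set.
    COV is computed as sample standard deviation / mean (assumed form).
    """
    result = {}
    for gene, values in expression.items():
        m = mean(values)
        if m == 0:
            continue  # COV is undefined when the mean is zero
        cov = stdev(values) / abs(m)
        if cov < threshold:
            result[gene] = cov
    return result


# Hypothetical expression values for two made-up genes.
expression = {
    "gene_a": [100, 102, 98, 101],   # nearly constant across samples
    "gene_b": [10, 50, 90, 30],      # highly variable across samples
}
stable = stable_genes(expression)
```

Running the same filter on each data set and intersecting the resulting gene sets would identify genes that are stable across all data sets, which is the kind of comparison the abstract reports for MARK3.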
-
Escherichia coli adapts to food additives within 180 generations
Singapore Society of Biochemistry and Molecular Biology Young Scientists' Symposium 2010, Singapore Science Centre; 01/2010
-
Specifying the behaviour of Python programs: language and basic examples
The Python Papers. 01/2010; 5:4.
-
Evolution Characterization of Escherichia coli using RFLP DNA Fingerprinting
01/2010; Singapore Polytechnic.
ISBN: DBTBTech0902
-
Russel and Rao coefficient is a suitable substitute for Dice coefficient in studying restriction mapped genetic distances of Escherichia coli
Computational and Mathematical Biology. 01/2010; 1:1.
-
COPADS, II: Chi-square test, F-test and t-test routines from Gopal Kanji's 100 statistical tests
The Python Papers Source Codes. 01/2010; 2:3.
-
Mining protein-protein interactions from published abstracts with MontyLingua
01/2010; iConcept Press Pty Ltd.
-
Filtering Microarray Correlations by Statistical Literature Analysis Yields Potential Hypotheses for Lactation Research
02/2009;
Our results demonstrated that a previously reported protein name co-occurrence method (5-mention PubGene), which was not based on a hypothesis testing framework, is generally statistically more significant than the 99th percentile of the Poisson distribution-based method of calculating co-occurrence. It agrees with previous methods using natural language processing to extract protein-protein interactions from text, as more than 96% of the interactions found by natural language processing methods overlap with the results from the 5-mention PubGene method. However, less than 2% of the gene co-expressions analyzed by microarray were found from direct co-occurrence or interaction information extraction from the literature. At the same time, by combining microarray and literature analyses, we derive a novel set of 7 potential functional protein-protein interactions that had not been previously described in the literature.
-
Ten Z-test Routines from Gopal Kanji's 100 Statistical Tests.
The Python Papers Source Codes. 01/2009; 1:5.
This manuscript presents the implementation and testing of 10 Z-test routines from Gopal Kanji’s book entitled “100 Statistical Tests”.
-
Biomedical literature analysis: current state and challenges
01/2009; Nova Science Publishers, Inc.
-
Identification of Transcriptional Invariant Genes in Mouse Liver from Microarray Data
15th Youth Science Conference, Singapore; 01/2009
-
Applying lazy local learning in BCII.5 article categorization task. BioCreative II.5 Workshop Special Session on Digital Annotations
BioCreative II.5, Centro Nacional de Investigaciones Oncologicas, Spain; 01/2009