The Mouse Genome Database (MGD): from genes to mice—a community resource for mouse biology
The Mouse Genome Database (MGD) forms the core of the Mouse Genome Informatics (MGI) system (http://www.informatics.jax.org), a model organism database resource for the laboratory mouse. MGD provides essential integration of experimental knowledge for the mouse system with information annotated from both literature and online sources. MGD curates and presents consensus and experimental data representations of genotype (sequence) through phenotype information, including highly detailed reports about genes and gene products. Primary foci of integration are through representations of relationships among genes, sequences and phenotypes. MGD collaborates with other bioinformatics groups to curate a definitive set of information about the laboratory mouse and to build and implement the data and semantic standards that are essential for comparative genome analysis. Recent improvements in MGD discussed here include the enhancement of phenotype resources, the re-development of the International Mouse Strain Resource, IMSR, the update of mammalian orthology datasets and the electronic publication of classic books in mouse genetics.
Janan T. Eppig, Carol J. Bult, James A. Kadin, Joel E. Richardson, Judith A. Blake*
and the Mouse Genome Database Group
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
Received October 12, 2004; Revised and Accepted October 20, 2004
The Mouse Genome Database (MGD) provides a comprehens-
ive and integrated view of genetic, genomic and biological
information for the laboratory mouse (1,2). MGD contains
information on mouse genes, genetic markers and genomic
features as well as the associations of these features with
sequence sets, reagents, alleles and mutant phenotypes.
MGD integrates sequence with biology through the curated
association of genome, transcript and protein sequence sets
with mouse genes—work done in collaboration with other
large genome informatics resources.
MGD is updated daily and there are weekly data exchanges
with other major genomics resources such as NCBI and Swiss-
Prot. A recent snapshot of MGD content is listed in Table 1.
Since the first release of this database in 1994, MGD has
continued to evolve, expanding its data coverage, improving
data access and providing new data query, analysis and display tools.
MGD is the core component of the Mouse Genome Infor-
matics (MGI) database resource (http://www.informatics.jax.org)
hosted at The Jackson Laboratory (http://www.jax.org). Other
projects and resources that are part of the MGI system include
the Gene Expression Database (GXD) (3) and the Mouse
Tumor Biology Database (MTB) (4) (http://tumor.informatics.
jax.org). All MGI component groups participate actively in the
development and application of the Gene Ontology (GO) (5)
IMPROVEMENTS DURING 2004
Hosting of the International Mouse Strain Resource
The International Mouse Strain Resource (IMSR) (http://
www.informatics.jax.org/imsr/) has as its goal to provide and
maintain a worldwide catalog of resources for mouse strains
and stocks. The IMSR has developed a searchable database
with a web front-end to assist researchers in locating and
obtaining the mouse resources they need (Figure 1).
An initial version of the IMSR was developed in 1999 (6) as
a collaborative effort with the Medical Research Council
(MRC) Mammalian Genetics Unit (Harwell, UK) and
contained a searchable resource for mouse stocks and strains
*To whom correspondence should be addressed. Tel: +1 207 288 6248; Fax: +1 207 288 6132; Email: email@example.com
The Mouse Genome Database Group: A. Anagnostopoulos, R. M. Baldarelli, M. Baya, J. S. Beal, S. M. Bello, W. J. Boddy, D. W. Bradt, D. L. Burkart, N. E. Butler,
J. Campbell, M. A. Cassell, L. E. Corbani, S. L. Cousins, D. J. Dahmen, H. Dene, A. D. Diehl, H. J. Drabkin, K. S. Frazer, P. Frost, L. H. Glass, C. W. Goldsmith,
P. L. Grant, M. Lennon-Pierce, J. Lewis, I. Lu, L. J. Maltais, M. McAndrews-Hill, L. McClellan, D. B. Miers, L. A. Miller, L. Ni, J. E. Ormsby, D. Qi, T. B. K. Reddy,
D. J. Reed, B. Richards-Smith, D. R. Shaw, R. Sinclair, C. L. Smith, P. Szauter, M. B. Walker, D. O. Walton, L. L. Washburn, I. T. Witham and Y. Zhu
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access
version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press
are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but
only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact firstname.lastname@example.org.
© 2005, the authors
Nucleic Acids Research, Vol. 33, Database issue © Oxford University Press 2005; all rights reserved
Nucleic Acids Research, 2005, Vol. 33, Database issue D471–D475
held at The Jackson Laboratory (JAX) and at the MRC Har-
well sites. While this proved to be a useful resource, it was
severely limited in containing only information for these two
major mouse laboratories.
With the establishment of multiple mutagenesis centers and
gene trap centers, and the increasing use of genetic
engineering technologies, the number of mouse stocks and
strains, and the specialization of their genotypes and
characterization, have grown explosively. A number of new
repositories and distribution centers have been established
worldwide to cope with the exponential increase in specialized
mouse stocks. The resulting need for a central catalog of
stocks and strains prompted us to re-develop the IMSR in a
more robust fashion, so that it could easily accommodate data
from multiple sites, provide a better search interface for
users, and enable links to phenotype searching and to specific
stock data from each site that distributes mouse resources.
Users can search IMSR by strain, gene or allele designa-
tions, strain state(s) and strain classes. For each strain satisfy-
ing the search criteria, IMSR provides users with data on
where a strain is available from, in what state(s) the strain
exists (e.g. live, cryopreserved embryos or gametes, ES cell
lines), the class of strain and mutant alleles carried by the
strain. Hypertext links are provided (i) from each strain des-
ignation to its strain information page at the holding site, (ii) to
an auto-generated email form to the holder’s designated
representative for obtaining additional information or ordering
the mouse resource and (iii) from each mutant phenotypic
allele carried by a strain to the detailed characterization of
that allele in the MGI.
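The search behavior described above can be illustrated with a minimal sketch. It assumes a simple in-memory record per strain; the class name, field names, function name and example entries below are hypothetical and do not reflect the actual IMSR schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrainRecord:
    """Hypothetical strain record; fields mirror the data IMSR is said to report."""
    designation: str                   # strain designation
    holder: str                        # site from which the strain is available
    states: frozenset                  # e.g. {"live", "cryopreserved embryos"}
    strain_class: str                  # e.g. "inbred", "mutant"
    alleles: frozenset = frozenset()   # mutant alleles carried by the strain


def search_strains(records, name=None, allele=None, states=None, strain_class=None):
    """Return the records that satisfy every criterion that was supplied."""
    wanted_states = set(states or [])
    hits = []
    for rec in records:
        if name and name.lower() not in rec.designation.lower():
            continue
        if allele and allele not in rec.alleles:
            continue
        # A strain matches if it exists in at least one of the requested states.
        if wanted_states and not (wanted_states & rec.states):
            continue
        if strain_class and rec.strain_class != strain_class:
            continue
        hits.append(rec)
    return hits


# Invented placeholder entries, for illustration only.
CATALOG = [
    StrainRecord("EXAMPLE-A", "SITE-1", frozenset({"live"}), "inbred"),
    StrainRecord("EXAMPLE-B", "SITE-2", frozenset({"cryopreserved embryos"}),
                 "mutant", frozenset({"Abc1<tm1>"})),
    StrainRecord("EXAMPLE-C", "SITE-1", frozenset({"live", "ES cell lines"}),
                 "mutant", frozenset({"Abc1<tm1>"})),
]

live_carriers = search_strains(CATALOG, allele="Abc1<tm1>", states=["live"])
```

In this sketch the criteria are combined with a logical AND, while multiple requested states are treated as alternatives; whether the IMSR interface combines criteria in exactly this way is an assumption.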
Additional links from the IMSR homepage provide instruc-
tions for participating in IMSR by listing one’s mouse
resources, for searching MGI for additional mouse genetic,
genomic and biological information, and for checking the
official mouse nomenclature guidelines from the International
Committee on Genetic Nomenclature for Mice.
Current centers with mouse resources included in IMSR are
as follows: The Jackson Laboratory (JAX), the Mouse Mutant
Regional Resources Centers (MMRRC), the Center for
Animal Resources and Development (CARD), the Oak Ridge
Table 1. Snapshot of data content in MGD: October 7, 2004

MGD data statistics                                     October 7, 2004
Number of genes with sequence data                      28 287
Number of genes (including unmapped mutants)            33 207
Number of markers (including genes)                     57 521
Number of markers mapped                                53 082
Number of genes with links to Swiss-Prot                7769
Number of genes with GO annotations                     15 309
Number of mouse/human curated orthologies               14 893
Number of mouse/rat curated orthologies                 12 679
Number of genes with one or more phenotypic alleles     4996
Number of cataloged phenotypic alleles                  10 949
Number of references                                    87 527
Number of mouse nucleotide sequences integrated into
  the MGI system (includes ESTs)                        >7 600 000

Figure 1. The new IMSR is a searchable online database of mouse strains and stocks available worldwide, including inbred, mutant and genetically engineered mice.
The goal of the IMSR is to assist the international scientific community in locating and obtaining mouse resources for research. The data content found in the IMSR is
as it was supplied by data provider sites.
National Laboratory (ORNL), the European Mouse Mutant
Archive (EMMA) and the BayGenomics Gene Trap Resource.
In progress is the incorporation of stocks from the MRC
Genetics Unit, Harwell (Har), the Beta Cell Biology Consor-
tium (BCBC), Neuromice (NMICE) and the Mouse Models of
Human Cancer Consortium (MMHCC). Interest has been
expressed by several other sites, including additional mouse
mutagenesis centers, additional gene trap resources, and other
distribution centers. IMSR also accepts stock listings from
individuals. All strains and stocks listed in IMSR should be
available to the research community and regular updating from
sites is required to keep the IMSR current.
Enhanced orthology resources
MGD provides a curated set of mammalian orthologs for the
research community. Although MGD supports orthology
annotations to over 20 mammalian genomes, the priority effort
focuses on the creation of orthology sets among mouse, human
and rat. This set is constructed through an iterative process
using both computational and manual approaches. This year,
we worked with the HomoloGene resource at the NCBI (7) to
reciprocally incorporate some of the HomoloGene computa-
tional three-way reciprocal best-hit sets into the MGI system.
HomoloGene incorporates MGD-curated mammalian ortho-
logy sets in their resources. In addition, we continue to work
with the research community to carefully curate gene family
sets, usually at the instigation of the research community (8,9).
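For readers unfamiliar with the term, a three-way reciprocal best-hit set can be sketched as follows. This illustrates the general technique only, not the HomoloGene or MGD pipeline; the table layout, function names and the placeholder genes GeneX, GENEY and GeneZ are assumptions made for the example.

```python
# best_hit[(src, dst)][gene] is the best-scoring match of `gene`
# (a gene of species `src`) among the genes of species `dst`.


def is_reciprocal(best_hit, sp_a, gene_a, sp_b, gene_b):
    """True if gene_a and gene_b are each other's best hit."""
    return (best_hit.get((sp_a, sp_b), {}).get(gene_a) == gene_b
            and best_hit.get((sp_b, sp_a), {}).get(gene_b) == gene_a)


def three_way_rbh(best_hit, species=("mouse", "human", "rat")):
    """Return (gene_a, gene_b, gene_c) triples in which all three
    pairwise comparisons are reciprocal best hits."""
    a, b, c = species
    triples = []
    for gene_a, gene_b in sorted(best_hit.get((a, b), {}).items()):
        gene_c = best_hit.get((a, c), {}).get(gene_a)
        if gene_c is None:
            continue
        if (is_reciprocal(best_hit, a, gene_a, b, gene_b)
                and is_reciprocal(best_hit, a, gene_a, c, gene_c)
                and is_reciprocal(best_hit, b, gene_b, c, gene_c)):
            triples.append((gene_a, gene_b, gene_c))
    return triples


# Toy best-hit tables. GeneX, GENEY and GeneZ are invented placeholders.
TOY_HITS = {
    ("mouse", "human"): {"Wt1": "WT1", "GeneX": "GENEY"},
    ("human", "mouse"): {"WT1": "Wt1", "GENEY": "GeneZ"},  # GENEY does not point back to GeneX
    ("mouse", "rat"): {"Wt1": "Wt1", "GeneX": "GeneX"},
    ("rat", "mouse"): {"Wt1": "Wt1", "GeneX": "GeneX"},
    ("human", "rat"): {"WT1": "Wt1", "GENEY": "GeneX"},
    ("rat", "human"): {"Wt1": "WT1", "GeneX": "GENEY"},
}

triples = three_way_rbh(TOY_HITS)
```

Here only the Wt1 triple survives, because the placeholder GeneX fails the mouse/human reciprocity check. Computational sets of this kind are candidates only; as noted above, the MGD orthology set is constructed using both computational and manual approaches.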
The Orthology Detail Page in MGD (Figure 2) for the gene
Wt1 illustrates the paradigm for orthology data. All assertions
of orthology are supported by a statement of evidence and
a citation. There are links to comparative mapping visualiza-
tions and links to genomics resources for the other species
Electronic publication of classic books in mouse genetics
MGD has responded to user requests in making electronic
copies of popular out-of-print books available online. Two
such books have been developed as online versions, Mouse
Genetics by Lee Silver (Oxford University Press, 1995) at
http://www.informatics.jax.org/silver/ and The Coat Colors
of Mice by Willys K. Silvers (Springer Verlag, 1979) at
http://www.informatics.jax.org/wksilvers/ (Figure 3). To
develop these online books, publisher and author copyrights
were obtained, text was re-developed and hypertext links
placed within the text for cross-referencing and to provide
direct access to MGI for enhanced gene and reference data.
Photographs and graphics were scanned into electronic form or
in some cases, redrawn. Both books have been welcomed by
MGD users and permission to include additional out-of-print
books is being sought.
MGD encourages user input into its gene and allele annotation
efforts. On each gene detail and allele detail page, a clickable
button (‘Your Input Welcome’) brings the user to a web-based
form for submitting updates to the information being viewed.
Figure 2. Mammalian Orthology Detail Page. The Mammalian Orthology Query Results page presents a table of results from MGI orthology curation. The table
includes species, symbol, chromosome, external and internal accession IDs and criteria for the assertions. The criteria include both a statement of evidence and a
citation. Hypertext links are incorporated as appropriate. Comparative chromosome map visualizations between any two of the species can be accessed from this
page. These data are updated nightly.
Mouse gene nomenclature
The MGD gene annotation group assigns unique symbols and
names to mouse genes under the guidelines set by the Inter-
national Committee on Standardized Genetic Nomenclature
for mouse (http://www.informatics.jax.org/mgihome/nomen/
index.shtml). Through curation of shared links between
MGI and other bioinformatics resources, the official
nomenclature for mouse genes is becoming widely disseminated. The
MGI nomenclature group works closely with human (http://
and rat (http://rgd.mcw.edu) nomenclature specialists to pro-
vide consistent nomenclature for mammalian species. Scientists
can reserve symbols prior to publication using the electronic
nomenclature submission form (http://www.informatics.jax.
org/mgihome/nomen/nomen_submit_form.shtml) or by
contacting the MGD nomenclature coordinator by email
(email@example.com). The MGD nomenclature
coordinator can also assist with other nomenclature issues,
such as revision of gene family designations.
Electronic data submission
Any type of data that MGD maintains can be submitted as an
electronic contribution. Over the last year, the most frequent
submissions have been of mutant and phenotypic allele infor-
mation originating from the large mouse mutagenesis centers.
Other common types of submission include gene and strain
nomenclature, mutant and QTL mapping data, polymorphisms
and mammalian homologies. Each electronic submission
receives a permanent database accession ID. All datasets
are associated with either an electronic submission reference
or a published paper. These reference pages provide links
to associated datasets. Online information about data submis-
sion procedure is found at http://www.informatics.jax.org/
Community outreach and user support
MGD provides extensive user support through online docu-
mentation and easy email or phone access to User Support
Staff. User Support WWW access: http://www.informatics.
Email access: firstname.lastname@example.org
Telephone access: +1 207 288 6445
FAX access: +1 207 288 6132
Other outreach. MGI-LIST (http://www.informatics.jax.org/
mgihome/lists/lists.shtml) is a moderated and active email
Figure 3. Electronic publication of classic books in mouse genetics. MGI offers electronic versions of key out-of-print books in mouse genetics, Mouse Genetics by
Lee M. Silver (http://www.informatics.jax.org/silver/) and The Coat Colors of Mice by Willys K. Silvers (http://www.informatics.jax.org/wksilvers/). Gene
symbols in both books link to MGI gene detail pages where readers can access all the information MGI has assembled for that gene, including phenotypic
alleles, nucleotide and protein sequence, mapping and expression data and GO annotations. References cited in the books are linked to PubMed or MGI reference
detail pages, which, in turn are linked to additional curated information in MGI.
bulletin board supported by the MGI User Support group.
Other outreach includes Online Tutorials and answers to
Frequently Asked Questions, available at http://www.
MGD is implemented in the Sybase relational database system,
version 12.5. A large set of CGI scripts and Java Servlets
mediates the user's interaction with the database. For
computational users, direct SQL access can be requested through
User Support. User-requested database reports and a number of
widely used data files (generated daily) are available on the
ftp site (ftp://ftp.informatics.jax.org/pub/reports/index.html).
The following citation format is suggested when referring to
datasets specific to the MGD component of MGI: Mouse
Genome Database (MGD), Mouse Genome Informatics,
The Jackson Laboratory, Bar Harbor, Maine (http://www.
informatics.jax.org). [Type in date (month, year) when you
retrieve the data cited.] For general citation of the MGI
resource please cite this article.
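For those who generate citations programmatically, the suggested format can be assembled as in the short sketch below. The function name and the bracketed placement of the retrieval date are assumptions; only the wording of the citation itself comes from the text above.

```python
from datetime import date

# English month names are spelled out to avoid locale-dependent formatting.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def mgd_citation(retrieved: date) -> str:
    """Build the suggested MGD citation with the retrieval month and year."""
    stamp = f"{MONTHS[retrieved.month - 1]}, {retrieved.year}"
    return ("Mouse Genome Database (MGD), Mouse Genome Informatics, "
            "The Jackson Laboratory, Bar Harbor, Maine "
            "(http://www.informatics.jax.org). "
            f"[{stamp}]")


example = mgd_citation(date(2004, 10, 7))
```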
The Mouse Genome Database is supported by NIH/NHGRI
1. Bult,C.J., Blake,J.A., Richardson,J.E., Kadin,J.A., Eppig,J.T. and the
Mouse Genome Database Group (2004) The Mouse Genome Database
(MGD): integrating biology with the genome. Nucleic Acids Res.,
2. Blake,J.A., Richardson,J.E., Bult,C.J., Kadin,J.A., Eppig,J.T. and the
Mouse Genome Database Group (2003) MGD: the Mouse Genome
Database. Nucleic Acids Res., 31, 193–195.
3. Hill,D.P., Begley,D.A., Finger,J.H., Hayamizu,T.F., McCright,I.J.,
Smith,C.M., Beal,J.S., Corbani,L.E., Blake,J.A., Eppig,J.T., Kadin,J.A.,
Richardson,J.E. and Ringwald,M. (2004) The Mouse Gene Expression
Database (GXD): updates and enhancements. Nucleic Acids Res., 32,
4. Näf,D., Krupke,D.M., Sundberg,J.P., Eppig,J.T. and Bult,C.J. (2002)
The mouse tumor biology database: a public resource for cancer genetics
and pathology of the mouse. Cancer Res., 62, 1235–1240.
5. The Gene Ontology Consortium (2004) The Gene Ontology (GO)
Database and Informatics Resource. Nucleic Acids Res.,
6. Eppig,J.T. and Strivens,M. (1999) Finding a mouse: the International
Mouse Strain Resource (IMSR). Trends Genet., 15, 81–82.
7. Wheeler,D.L., Church,D.M., Edgar,R., Federhen,S., Helmberg,W.,
Madden,T.L., Pontius,J.U., Schuler,G.D., Schriml,L.M., Sequeira,E.
et al. (2004) Database resources of the National Center for
Biotechnology Information: update. Nucleic Acids Res.,
8. Mashek,D.G., Bornfeldt,K.E., Coleman,R.A., Berger,J., Bernlohr,D.A.,
Black,P., DiRusso,C.C., Farber,S.A., Guo,W., Hashimoto,N. et al.
(2004) Revised nomenclature for the mammalian long-chain acyl-CoA
synthetase gene family. J. Lipid Res., 45, 1958–1961.
9. Nelson,D.R., Zeldin,D.C., Hoffman,S.M.G., Maltais,L.J., Wain,H.M.
and Nebert,D.W. (2004) Comparison of cytochrome P450 (CYP) genes
from the mouse and human genomes, including nomenclature
recommendations for genes, pseudogenes and alternative-splice variants.
Pharmacogenetics, 14, 1–18.