Dynamic instability of the Major Urinary Protein gene family revealed by genomic and phenotypic comparisons between C57 and 129 strain mice

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
Genome Biology (Impact Factor: 10.81). 02/2008; 9(5):R91. DOI: 10.1186/gb-2008-9-5-r91
Source: PubMed


The major urinary proteins (MUPs) of Mus musculus domesticus are deposited in urine in large quantities, where they bind and release pheromones and also provide an individual 'recognition signal' via their phenotypic polymorphism. Whilst important information about MUP functionality has been gained in recent years, the gene cluster is poorly studied in terms of structure, genic polymorphism and evolution.
We combine targeted sequencing, manual genome annotation and phylogenetic analysis to compare the Mup clusters of C57BL/6J and 129 strains of mice. We describe organizational heterogeneity within both clusters: a central array of cassettes containing Mup genes highly similar at the protein level, flanked by regions containing Mup genes displaying significantly elevated divergence. Observed genomic rearrangements in all regions have likely been mediated by endogenous retroviral elements. Mup loci with coding sequences that differ between the strains are identified, including a gene/pseudogene pair, suggesting that these inbred lineages exhibit variation that exists in wild populations. We have characterized the distinct MUP profiles in the urine of both strains by mass spectrometry. The total MUP phenotype data are reconciled with our genomic sequence data, matching all proteins identified in urine to annotated genes.
Our observations indicate that the MUP phenotypic polymorphism observed in wild populations results from a combination of Mup gene turnover coupled with currently unidentified mechanisms regulating gene expression patterns. We propose that the structural heterogeneity described within the cluster reflects functional divergence within the Mup gene family.


Available from: Rob Beynon
    • "Blixem and Dotter are used extensively by the HAVANA group at the Wellcome Trust Sanger Institute and are essential to the manual annotation process. Examples of work published by the HAVANA group that has involved the use of Blixem and Dotter includes [5-10]. Belvu is used in the curation of high-quality "seed" alignments for the Pfam database [11]."
    ABSTRACT: Background: Manual annotation is essential to create high-quality reference alignments and annotation. Annotators need to be able to view sequence alignments in detail. The SeqTools package provides three tools for viewing different types of sequence alignment: Blixem is a many-to-one browser of pairwise alignments, displaying multiple match sequences aligned against a single reference sequence; Dotter provides a graphical dot-plot view of a single pairwise alignment; and Belvu is a multiple sequence alignment viewer, editor, and phylogenetic tool. These tools were originally part of the AceDB genome database system but have been completely rewritten to make them generally available as a standalone package of greatly improved function. Findings: Blixem is used by annotators to give a detailed view of the evidence for particular gene models. Blixem displays the gene model positions and the match sequences aligned against the genomic reference sequence. Annotators use this for many reasons, including to check the quality of an alignment, to find missing/misaligned sequence and to identify splice sites and polyA sites and signals. Dotter is used to give a dot-plot representation of a particular pairwise alignment. This is used to identify sequence that is not represented (or is misrepresented) and to quickly compare annotated gene models with transcriptional and protein evidence that putatively supports them. Belvu is used to analyse conservation patterns in multiple sequence alignments and to perform a combination of manual and automatic processing of the alignment. High-quality reference alignments are essential if they are to be used as a starting point for further automatic alignment generation. Conclusions: While there are many different alignment tools available, the SeqTools package provides unique functionality that annotators have found to be essential for analysing sequence alignments as part of the manual annotation process.
    Article · Dec 2016 · BMC Research Notes
    • "All are synthesised as preproteins, but the signal peptide is removed precisely to reveal a conserved N-terminus. Mouse MUPs contain a single disulphide bond but only two have consensus sites for N-linked glycosylation (MUP3 and MUP21) of which one has been proven to be glycosylated [3]. Of these 21 genes, the central 15 show very high sequence homology, and the peripheral six are more variable in their sequence."
    ABSTRACT: The Major Urinary Proteins (MUPs) of the house mouse, Mus musculus domesticus, are 18–19 kDa beta-barrel lipocalins that are involved in chemical communication between individuals. Many of them are excreted in urine where they play multiple roles, including coding of owner identity and transport, and slow release of bound volatile pheromones. One of them, darcin, is a pheromone in its own right and induces long-term memory for the identity and location of the scent mark owner. We have shown that mass spectrometric analysis of intact proteins, and their ion mobility behaviour, is capable of dissecting subtle structural differences between the members of this class of proteins. Moreover, mass spectrometric analysis of the intact proteins can contribute towards molecular phenotyping of MUPs. However, whilst allowing relative quantification, the ionisation propensity or gas phase properties of the individual MUPs may compromise absolute quantification. To solve the challenge of absolute quantification of MUP expression, we have designed and constructed a QconCAT built from endopeptidase LysC peptides that is capable of quantifying MUPs found in laboratory animal strains and some MUPs from wild caught individuals.
    Article · Aug 2015
    • "Initially, four chromosomes sequenced at the Wellcome Trust Sanger Institute (2, 4, 11, and X) were systematically annotated on a clone-by-clone basis during the assembly phase. Secondly, numerous genomic regions and gene families considered of particular interest to the wider community had their annotation prioritized, for example, the major histocompatibility complex on chr17 (unpublished), the Major Urinary Proteins gene cluster on chr 4 (Mudge et al. 2008) and the large complement of immunoglobulin loci found at several sites across the genome (unpublished). The HAVANA group has also been involved in several collaborative projects over the years that have required annotation on a gene-by-gene basis."
    ABSTRACT: Annotation on the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species. Electronic supplementary material: The online version of this article (doi:10.1007/s00335-015-9583-x) contains supplementary material, which is available to authorized users.
    Article · Jul 2015 · Mammalian Genome