Yuna Blum

University of California, Los Angeles, Los Angeles, California, United States

Publications (8) · 20.58 Total impact

  • Source
    ABSTRACT: Very few causal genes have been identified by quantitative trait loci (QTL) mapping because of the large size of QTLs, and most of them were identified thanks to functional links already known with the targeted phenotype. Here we propose to combine selection signature detection, coding SNP annotation, and cis-expression QTL analyses to identify potential causal genes underlying QTLs identified in divergent line designs. As a model, we chose experimental chicken lines divergently selected for only one trait, the abdominal fat weight, in which several QTLs were previously mapped. Using a new haplotype-based statistic exploiting the very high SNP density generated through whole genome re-sequencing, we found 129 significant selective sweeps. Most of the QTLs co-localized with at least one sweep, which markedly narrowed candidate region size. Some of those sweeps contained only one gene, therefore making them strong positional causal candidates with no presupposed function. We then focused on two of these QTLs/sweeps. The absence of non-synonymous SNPs in their coding regions strongly suggests the existence of causal mutations acting in cis on their expression, confirmed by cis-eQTL identification using either allele-specific expression or genetic mapping analyses. Additional expression analyses on those two genes in chickens and mice contrasted for adiposity reinforce their link with this phenotype. This study shows for the first time the value of combining selective sweep mapping, coding SNP annotation and cis-eQTL analyses for identifying causative genes for a complex trait, in the context of divergent lines selected for this specific trait. Moreover, it highlights two genes, JAG2 and PARK2, as new potential negative and positive key regulators of adiposity in chickens and mice.
    G3-Genes Genomes Genetics 02/2015; 5(4). DOI:10.1534/g3.115.016865 · 2.51 Impact Factor
  • Source
    ABSTRACT: Microarray analysis was used to identify genes whose expression in the mammary gland of Holstein-Friesian dairy cows was affected by the nonconservative Ala to Lys amino acid substitution at position 232 in exon VIII of the diacylglycerol-O-transferase 1 (DGAT1) gene. Mammary gland biopsies of 9 homozygous Ala cows, 13 heterozygous cows (Ala/Lys), and 4 homozygous Lys cows in midlactation were taken. Microarray ANOVA and factor analysis for multiple testing methods were used as statistical methods to associate the expression level of the genes present on Affymetrix bovine genome arrays (Affymetrix Inc., Santa Clara, CA) with the DGAT1 gene polymorphism. The data were also analyzed at the level of functional modules by gene set enrichment analysis. In this small-scale experimental setting, DGAT1 gene polymorphism did not modify milk yield and composition significantly, although expected changes occurred in the yields of C14:0, cis-9 C16:1, and long-chain fatty acids. Diacylglycerol-O-transferase 1 gene polymorphism affected the expression of 30 annotated genes related to cell growth, proliferation, and development, remodeling of the tissue, cell signaling, and immune system response. Furthermore, the main affected functional modules were related to energy metabolism (lipid biosynthesis, oxidative phosphorylation, electron transport chain, citrate cycle, and propanoate metabolism), protein degradation (proteasome-ubiquitin pathways), and the immune system. We hypothesize that the observed differences in transcriptional activity reflect counter mechanisms of mammary gland tissue to respond to changes in milk fatty acid concentration or composition, or both.
    Journal of Dairy Science 09/2012; 95(9):4989-5000. DOI:10.3168/jds.2012-5348 · 2.55 Impact Factor
  • Source
    ABSTRACT: Integrative genomics approaches that combine genotyping and transcriptome profiling in segregating populations have been developed to dissect complex traits. The most common approach is to identify genes whose eQTL colocalize with QTL of interest, providing new functional hypotheses about the causative mutation. Another approach includes defining subtypes for a complex trait using transcriptome profiles and then performing QTL mapping using some of these subtypes. This approach can refine some QTL and reveal new ones. In this paper we introduce Factor Analysis for Multiple Testing (FAMT) to define subtypes more accurately and reveal interaction between QTL affecting the same trait. The data used concern hepatic transcriptome profiles for 45 half-sib male chickens of a sire known to be heterozygous for a QTL affecting abdominal fatness (AF) in the distal region of chromosome 5, around 168 cM. Using this methodology, which accounts for hidden dependence structure among phenotypes, we identified 688 genes that are significantly correlated to the AF trait and we distinguished 5 subtypes for the AF trait, which are not observed with gene lists obtained by classical approaches. After exclusion of one of the two lean bird subtypes, linkage analysis revealed a previously undetected QTL on chromosome 5 around 100 cM. Interestingly, the animals of this subtype presented the same q paternal haplotype at the 168 cM QTL. This result strongly suggests that the two QTL are in interaction. In other words, the "q configuration" at the 168 cM QTL could hide the QTL existence in the proximal region at 100 cM. We further show that the proximal QTL interacts with the previous one detected in the distal region of chromosome 5. Our results demonstrate that stratifying a genetic population by molecular phenotypes, followed by QTL analysis on various subtypes, can lead to identification of novel and interacting QTL.
    BMC Genomics 11/2011; 12:567. DOI:10.1186/1471-2164-12-567 · 4.04 Impact Factor
  • ABSTRACT: The final steps of genetic mapping research programs require close analysis of several QTL regions to select candidate genes for further studies. Despite several websites (NCBI genome browser, Ensembl Browser, UCSC Genome Browser) or web tools (Biomart, Galaxy) developed to achieve this task, the selection of candidate genes remains a laborious process. The information made available on the more prominent websites differs slightly in terms of gene prediction and functional annotation, while other websites provide extra information that researchers may want to use (HGNC approved gene symbols, Gene Ontology Annotation or functional data, conservation of synteny with other species, etc.). It is possible to manually merge and compare this information for one QTL containing few genes, but not for many different QTL regions containing dozens of genes. Here, we propose a web tool that, for a given region of interest, merges the list of genes available in NCBI and Ensembl, removes redundancy, adds functional annotations from different prominent web sites, and highlights the genes for which functional annotation fits the biological function or diseases of interest. The tool is dedicated to sequenced species of livestock including cattle, pig, chicken, and horse as well as dog, i.e. species that have been extensively studied (with over 8000 QTLs detected). Nevertheless, because of the family designs and the low number of animals used in these species, most of the studies use linkage analysis, and the QTL regions identified remain large (containing dozens of genes). Conversely, in human and model species, most analyses now draw heavily on association studies involving large cohorts, thus providing more power and accuracy, and the web tools already available focus on these species through functional annotation of SNPs in association with the trait [1-3]. Most of these tools focus on the SNP annotation itself, describing whether the SNP is located in a gene, whether it lies in a coding sequence, whether it could have a functional effect, etc. While these web tools are highly efficient in providing a good annotation for specific SNPs, they clearly cannot be used to collect information on the large regions obtained in livestock species. AnnotQTL is a web tool designed to gather the functional annotation of different prominent websites while minimizing redundant information. Using all known information substantially accelerates the gene analysis of QTL regions for livestock species traits and improves the selection of candidate genes. The AnnotQTL web tool is available at
    JOBIM 2011, Institut Pasteur, Paris; 06/2011
  • Source
    ABSTRACT: AnnotQTL is a web tool designed to aggregate functional annotations from different prominent web sites by minimizing the redundancy of information. Although thousands of QTL regions have been identified in livestock species, most of them are large and contain many genes. This tool was therefore designed to assist the characterization of genes in a QTL interval region as a step towards selecting the best candidate genes. It localizes the gene to a specific region (using NCBI and Ensembl data) and adds the functional annotations available from other databases (Gene Ontology, Mammalian Phenotype, HGNC and Pubmed). Both human genome and mouse genome can be aligned with the studied region to detect synteny and segment conservation, which is useful for running inter-species comparisons of QTL locations. Finally, custom marker lists can be included in the results display to select the genes that are closest to your most significant markers. We use examples to demonstrate that in just a couple of hours, AnnotQTL is able to identify all the genes located in regions identified by a full genome scan, with some highlighted based on both location and function, thus considerably increasing the chances of finding good candidate genes. AnnotQTL is available at
    Nucleic Acids Research 05/2011; 39(Web Server issue):W328-33. DOI:10.1093/nar/gkr361 · 8.81 Impact Factor
  • Source
    ABSTRACT: Microarray technology allows the simultaneous analysis of thousands of genes within a single experiment. Significance analyses of transcriptomic data ignore the gene dependence structure. This leads to correlation among test statistics, which affects strong control of the false discovery proportion. A recent method called FAMT captures the gene dependence in factors in order to improve high-dimensional multiple testing procedures. In the subsequent analyses aiming at a functional characterization of the differentially expressed genes, our study shows how these factors can be used both to identify the components of expression heterogeneity and to give more insight into the underlying biological processes. The use of factors to characterize simple patterns of heterogeneity is first demonstrated on illustrative gene expression data sets. An expression data set primarily generated to map QTL for fatness in chickens is then analyzed. Contrary to the analysis based on the raw data, relevant functional information about a QTL region is revealed by factor-adjustment of the gene expressions. Additionally, the interpretation of the independent factors regarding known information about both experimental design and genes shows that some factors may have different and complex origins. As biological information and technological biases are identified in what was before simply considered as statistical noise, analyzing heterogeneity in gene expression yields a new point of view on transcriptomic data.
    BMC Bioinformatics 07/2010; 11:368. DOI:10.1186/1471-2105-11-368 · 2.67 Impact Factor
  • ABSTRACT: The availability of genome-wide expression data to complement the measurements of a phenotypic trait opens new opportunities for identifying biological processes and genes that are involved in trait expression. Usually, differential analysis is a preliminary step to identify the key biological processes involved in the variability of the trait of interest. However, this variability should be viewed as resulting from a complex combination of genes' individual contributions. In other words, exploring the interactions between genes, viewed as a network structure whose vertices are genes and whose edges stand for inhibition or activation connections, gives much more insight into the internal structure of expression profiles. Many solutions for network analysis based on the Gaussian Graphical Model are currently available, but efficient estimation of the network from high-dimensional data is still an open issue. Extending the idea introduced for differential analysis by Friguet et al. 2009 and Blum et al. 2010, we propose to take advantage of a factor model structure to infer gene regulatory networks. This method shows good inferential properties and also allows an efficient testing strategy for the significance of direct gene interactions (partial correlations). We use the method in a study that aims at identifying the genes involved in fatness variability in chickens. We model the networks of genes controlled by genome regions known to be related to the fatness variability and analyze the modular structure using the available annotation, giving us new hypotheses about the causal mutations and underlying biological processes.
    International Plant and Animal Genome Conference XX 2012;
  • Source
    ABSTRACT: DNA microarray technology allows the simultaneous analysis of the expression level of several thousand genes. One of the challenges in analyzing this type of data is to understand the dependence structure, which reflects the biological relationships between genes. In particular, we focus here on modeling the regulatory network of genes involved in the control of a phenotypic trait. First, we define a general framework for taking dependence into account by identifying latent factors that model the variation common to all genes. We show that introducing these factors into differential analysis procedures improves their power as well as the stability of the error rates. Furthermore, in the context of Gaussian graphical models for modeling gene interaction networks, we present a method for estimating partial correlations that relies on reducing the dimension of the data through latent variables. The method is illustrated by its application to a study aimed at identifying the genes involved in lipid metabolism in chickens (UMR INRA Génétique Animale de Rennes).

Publication Stats

18 Citations
20.58 Total Impact Points


  • 2015
    • University of California, Los Angeles
      Los Angeles, California, United States
  • 2010–2012
    • Agrocampus Ouest
      Rennes, Brittany, France
  • 2011
    • French National Institute for Agricultural Research
      • Laboratoire de Génétique Cellulaire
      Paris, Île-de-France, France