Julie Park

Stanford University, Stanford, CA, USA

Publications (18) · 100.75 Total impact

  • Article: Annotation of functional variation in personal genomes using RegulomeDB.
    ABSTRACT: As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation provides interpretation for individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to a better localization of truly functional variants and interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput, experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations to identify putative regulatory potential and identify functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 fully sequenced genomes as well as that of a personal genome, where thousands of functionally associated variants were identified. Moreover, we demonstrate a GWAS where the database is able to quickly identify the known associated functional variant and provide a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.
    Genome Research 09/2012; 22(9):1790-7. · 13.61 Impact Factor
  • Article: YeastMine--an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit.
    ABSTRACT: The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) provides high-quality curated genomic, genetic, and molecular information on the genes and their products of the budding yeast Saccharomyces cerevisiae. To accommodate the increasingly complex, diverse needs of researchers for searching and comparing data, SGD has implemented InterMine (http://www.InterMine.org), an open source data warehouse system with a sophisticated querying interface, to create YeastMine (http://yeastmine.yeastgenome.org). YeastMine is a multifaceted search and retrieval environment that provides access to diverse data types. Searches can be initiated with a list of genes, a list of Gene Ontology terms, or lists of many other data types. The results from queries can be combined for further analysis and saved or downloaded in customizable file formats. Queries themselves can be customized by modifying predefined templates or by creating a new template to access a combination of specific data types. YeastMine offers multiple scenarios in which it can be used such as a powerful search interface, a discovery tool, a curation aid and also a complex database presentation format. DATABASE URL: http://yeastmine.yeastgenome.org.
    Database The Journal of Biological Databases and Curation 01/2012; 2012:bar062. · 2.07 Impact Factor
  • Article: CvManGO, a method for leveraging computational predictions to improve literature-based Gene Ontology annotations.
    ABSTRACT: The set of annotations at the Saccharomyces Genome Database (SGD) that classifies the cellular function of S. cerevisiae gene products using Gene Ontology (GO) terms has become an important resource for facilitating experimental analysis. In addition to capturing and summarizing experimental results, the structured nature of GO annotations allows for functional comparison across organisms as well as propagation of functional predictions between related gene products. Due to their relevance to many areas of research, ensuring the accuracy and quality of these annotations is a priority at SGD. GO annotations are assigned either manually, by biocurators extracting experimental evidence from the scientific literature, or through automated methods that leverage computational algorithms to predict functional information. Here, we discuss the relationship between literature-based and computationally predicted GO annotations in SGD and extend a strategy whereby comparison of these two types of annotation identifies genes whose annotations need review. Our method, CvManGO (Computational versus Manual GO annotations), pairs literature-based GO annotations with computational GO predictions and evaluates the relationship of the two terms within GO, looking for instances of discrepancy. We found that this method will identify genes that require annotation updates, taking an important step towards finding ways to prioritize literature review. Additionally, we explored factors that may influence the effectiveness of CvManGO in identifying relevant gene targets to find in particular those genes that are missing literature-supported annotations, but our survey found that there are no immediately identifiable criteria by which one could enrich for these under-annotated genes. Finally, we discuss possible ways to improve this strategy, and the applicability of this method to other projects that use the GO for curation. DATABASE URL: http://www.yeastgenome.org.
    Database The Journal of Biological Databases and Curation 01/2012; 2012:bas001. · 2.07 Impact Factor
  • Article: Saccharomyces Genome Database: the genomics resource of budding yeast.
    ABSTRACT: The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use.
    Nucleic Acids Research 11/2011; 40(Database issue):D700-5. · 8.03 Impact Factor
  • Article: Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study.
    ABSTRACT: Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We present a strategy that employs a comparison of literature-based annotations with computational predictions to identify and prioritize genes whose annotations need review. Using this method, we show that comparison of manually assigned 'unknown' annotations in the Saccharomyces Genome Database (SGD) with InterPro-based predictions can identify annotations that need to be updated. A survey of literature-based annotations and computational predictions made by the Gene Ontology Annotation (GOA) project at the European Bioinformatics Institute (EBI) across several other databases shows that this comparison strategy could be used to maintain and improve the quality of GO annotations for other organisms besides yeast. The survey also shows that although GOA-assigned predictions are the most comprehensive source of functional information for many genomes, a large proportion of genes in a variety of different organisms entirely lack these predictions but do have manual annotations. This underscores the critical need for manually performed, literature-based curation to provide functional information about genes that are outside the scope of widely used computational methods. Thus, the combination of manual and computational methods is essential to provide the most accurate and complete functional annotation of a genome. Database URL: http://www.yeastgenome.org.
    Database The Journal of Biological Databases and Curation 01/2011; 2011:bar004. · 2.07 Impact Factor
  • Article: Saccharomyces Genome Database provides mutant phenotype data
    Nucleic Acids Research 01/2010; 38:433-436. · 8.03 Impact Factor
  • Article: Saccharomyces Genome Database provides mutant phenotype data.
    ABSTRACT: The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is a scientific database for the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker's or budding yeast. The information in SGD includes functional annotations, mapping and sequence information, protein domains and structure, expression data, mutant phenotypes, physical and genetic interactions and the primary literature from which these data are derived. Here we describe how published phenotypes and genetic interaction data are annotated and displayed in SGD.
    Nucleic Acids Research 11/2009; 38(Database issue):D433-6. · 8.03 Impact Factor
  • Article: Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function.
    ABSTRACT: Learning the function of genes is a major goal of computational genomics. Methods for inferring gene function have typically fallen into two categories: 'guilt-by-profiling', which exploits correlation between function and other gene characteristics; and 'guilt-by-association', which transfers function from one gene to another via biological relationships. We have developed a strategy ('Funckenstein') that performs guilt-by-profiling and guilt-by-association and combines the results. Using a benchmark set of functional categories and input data for protein-coding genes in Saccharomyces cerevisiae, Funckenstein was compared with a previous combined strategy. Subsequently, we applied Funckenstein to 2,455 Gene Ontology terms. In the process, we developed 2,455 guilt-by-profiling classifiers based on 8,848 gene characteristics and 12 functional linkage graphs based on 23 biological relationships. Funckenstein outperforms a previous combined strategy using a common benchmark dataset. The combination of 'guilt-by-profiling' and 'guilt-by-association' gave significant improvement over the component classifiers, showing the greatest synergy for the most specific functions. Performance was evaluated by cross-validation and by literature examination of the top-scoring novel predictions. These quantitative predictions should help prioritize experimental study of yeast gene functions.
    Genome biology 02/2008; 9 Suppl 1:S7. · 6.63 Impact Factor
  • Article: Gene Ontology annotations at SGD: new data sources and annotation methods.
    ABSTRACT: The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) collects and organizes biological information about the chromosomal features and gene products of the budding yeast Saccharomyces cerevisiae. Although published data from traditional experimental methods are the primary sources of evidence supporting Gene Ontology (GO) annotations for a gene product, high-throughput experiments and computational predictions can also provide valuable insights in the absence of an extensive body of literature. Therefore, GO annotations available at SGD now include high-throughput data as well as computational predictions provided by the GO Annotation Project (GOA UniProt; http://www.ebi.ac.uk/GOA/). Because the annotation method used to assign GO annotations varies by data source, GO resources at SGD have been modified to distinguish data sources and annotation methods. In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current.
    Nucleic Acids Research 02/2008; 36(Database issue):D577-81. · 8.03 Impact Factor
  • Article: A two-hybrid screen identifies cathepsins B and L as uncoating factors for adeno-associated virus 2 and 8.
    ABSTRACT: Vectors based on different serotypes of adeno-associated virus hold great promise for human gene therapy, based on their unique tissue tropisms and distinct immunological profiles. A particularly interesting candidate is AAV8, which can efficiently and rapidly transduce a wide range of tissues in vivo. To further unravel the mechanisms behind AAV8 transduction, we used yeast two-hybrid analyses to screen a mouse liver complementary DNA library for cellular proteins capable of interacting with the viral capsid proteins. In total, we recovered approximately 700 clones, comprising over 300 independent genes. Sequence analyses revealed multiple hits for over 100 genes, including two encoding the endosomal cysteine proteases cathepsins B and L. Notably, these two proteases also physically interacted with the corresponding portion of the AAV2 capsid in yeast, but not with AAV5. We demonstrate that cathepsins B and L are essential for efficient AAV2- and AAV8-mediated transduction of mammalian cells, and document the ability of purified cathepsin B and L proteins to bind and cleave intact AAV2 and AAV8 particles in vitro. These data suggest that cathepsin-mediated cleavage could prime AAV capsids for subsequent nuclear uncoating, and indicate that analysis of additional genes recovered in our screen may help to further elucidate the mechanisms behind transduction by AAV8 and related serotypes.
    Molecular Therapy 03/2007; 15(2):330-9. · 6.87 Impact Factor
  • Article: Expanded protein information at SGD: new pages and proteome browser.
    ABSTRACT: The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82,000 manually-curated interactions.
    Nucleic Acids Research 02/2007; 35(Database issue):D468-71. · 8.03 Impact Factor
  • Article: Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome.
    ABSTRACT: Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) (http://www.yeastgenome.org/) has created the Genome Snapshot (http://db.yeastgenome.org/cgi-bin/genomeSnapShot.pl). The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S.cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool (http://db.yeastgenome.org/cgi-bin/search/featureSearch), and all the data presented on this page are available from the SGD ftp site (ftp://ftp.yeastgenome.org/yeast/).
    Nucleic Acids Research 02/2006; 34(Database issue):D442-5. · 8.03 Impact Factor
  • Article: 191. Identification of a Novel Functional Domain in the Sleeping Beauty Transposase: Towards Alleviating the Restriction of SB Overproduction Inhibition
    ABSTRACT: Molecular Therapy (2005) 11, S75–S75; doi: 10.1016/j.ymthe.2005.06.194. 191. Identification of a Novel Functional Domain in the Sleeping Beauty Transposase: Towards Alleviating the Restriction of SB Overproduction Inhibition. Julie Park, Stephen R. Yant and Mark A. Kay. Genetics, Stanford University, Stanford, CA.
    Molecular Therapy 04/2005; · 6.87 Impact Factor
  • Article: Mutational analysis of the N-terminal DNA-binding domain of sleeping beauty transposase: critical residues for DNA binding and hyperactivity in mammalian cells.
    ABSTRACT: The N-terminal domain of the Sleeping Beauty (SB) transposase mediates transposon DNA binding, subunit multimerization, and nuclear translocation in vertebrate cells. For this report, we studied the relative contributions of 95 different residues within this multifunctional domain by large-scale mutational analysis. We found that each of four amino acids (leucine 25, arginine 36, isoleucine 42, and glycine 59) contributes to DNA binding in the context of the N-terminal 123 amino acids of SB transposase, as indicated by electrophoretic mobility shift analysis, and to functional activity of the full-length transposase, as determined by a quantitative HeLa cell-based transposition assay. Moreover, we show that amino acid substitutions within either the putative oligomerization domain (L11A, L18A, L25A, and L32A) or the nuclear localization signal (K104A and R105A) severely impair its ability to mediate DNA transposition in mammalian cells. In contrast, each of 10 single amino acid changes within the bipartite DNA-binding domain is shown to greatly enhance SB's transpositional activity in mammalian cells. These hyperactive mutations functioned synergistically when combined and are shown to significantly improve transposase affinity for transposon end sequences. Finally, we show that enhanced DNA-binding activity results in improved cleavage kinetics, increased SB element mobilization from host cell chromosomes, and dramatically improved gene transfer capabilities of SB in vivo in mice. These studies provide important insights into vertebrate transposon biology and indicate that Sleeping Beauty can be readily improved for enhanced genetic research applications in mammals.
    Molecular and Cellular Biology 11/2004; 24(20):9239-47. · 5.53 Impact Factor
  • Article: 819. Hyperactive Transposase Mutants of the Sleeping Beauty Transposon
    ABSTRACT: Molecular Therapy (2004) 9, S310–S310; doi: 10.1016/j.ymthe.2004.06.717. 819. Hyperactive Transposase Mutants of the Sleeping Beauty Transposon. Stephen R. Yant, Julie Park, Yong Huang, Jacob Giehm Mikkelsen and Mark A. Kay. Pediatrics and Genetics, Stanford University School of Medicine, Stanford, CA.
    Molecular Therapy 04/2004; · 6.87 Impact Factor
  • Article: 148. The altered binding properties of sleeping beauty transposase hyperactive mutants may explain their enhanced efficacy
    Julie Park, Stephen R Yant, Mark A Kay
  • Article: Comparison of computationally- and manually-assigned Gene Ontology annotations to improve functional characterization of gene products.
    ABSTRACT: The Gene Ontology (GO) describes molecular functions, biological processes, and cellular components of gene products using controlled-vocabulary terms that are related to each other in a structure that facilitates computing on GO annotations within and across species. Experimentally-based GO annotations that are manually curated from the literature are often used to predict the functions of related uncharacterized proteins. The accuracy of such annotations is thus critically important, particularly for a well-studied model organism such as _Saccharomyces cerevisiae_ which is frequently used as the source of the experimental data. Comparison of experimentally-based annotations with those predicted by computational methods for the same gene products may reveal inaccuracies in curation of the experimental data, and could additionally be used to evaluate and improve the computational methods. We will present the results of an analysis at SGD that identified four major reasons for discrepancies between the two kinds of annotation. Some discrepancies revealed cases in which human error led to errors or omissions in the manual curation, prompting prioritization for review and correction. In another category, the computational annotations were not supported or were refuted by the literature, thereby suggesting ways in which the accuracy of the prediction methods could be improved. Yet another type of discrepancy resulted from issues with the GO structure, such as missing parentage for certain terms, leading to reexamination and improvement of the ontology. Finally, some discrepancies arose because the computational predictions were entirely novel, and no relevant experimental evidence was available. These cases highlight potential interesting new avenues for experimentation.
    Nature Precedings.