Sol Katzman

University of California, Santa Cruz, Santa Cruz, California, United States

Publications (24) · 317.87 Total impact

  •
    ABSTRACT: Throughout evolution primate genomes have been modified by waves of retrotransposon insertions. For each wave, the host eventually finds a way to repress retrotransposon transcription and prevent further insertions. In mouse embryonic stem cells, transcriptional silencing of retrotransposons requires KAP1 (also known as TRIM28) and its repressive complex, which can be recruited to target sites by KRAB zinc-finger (KZNF) proteins such as murine-specific ZFP809 which binds to integrated murine leukaemia virus DNA elements and recruits KAP1 to repress them. KZNF genes are one of the fastest growing gene families in primates and this expansion is hypothesized to enable primates to respond to newly emerged retrotransposons. However, the identity of KZNF genes battling retrotransposons currently active in the human genome, such as SINE-VNTR-Alu (SVA) and long interspersed nuclear element 1 (L1), is unknown. Here we show that two primate-specific KZNF genes rapidly evolved to repress these two distinct retrotransposon families shortly after they began to spread in our ancestral genome. ZNF91 underwent a series of structural changes 8-12 million years ago that enabled it to repress SVA elements. ZNF93 evolved earlier to repress the primate L1 lineage until ∼12.5 million years ago when the L1PA3-subfamily of retrotransposons escaped ZNF93's restriction through the removal of the ZNF93-binding site. Our data support a model where KZNF gene expansion limits the activity of newly emerged retrotransposon classes, and this is followed by mutations in these retrotransposons to evade repression, a cycle of events that could explain the rapid expansion of lineage-specific KZNF genes.
    Nature 09/2014; · 38.60 Impact Factor
  • Source
    ABSTRACT: The genetic programs required for development of the cerebral cortex are under intense investigation. However, non-coding DNA elements that control the expression of developmentally important genes remain poorly defined. Here we investigate the regulation of Fezf2, a transcription factor that is necessary for the generation of deep-layer cortical projection neurons. Using a combination of chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq) we mapped the binding of four deep-layer-enriched transcription factors previously shown to be important for cortical development. Building upon this we characterized the activity of three regulatory regions around the Fezf2 locus at multiple stages throughout corticogenesis. We identified a promoter that was sufficient for expression in the cerebral cortex, and enhancers that drove reporter gene expression in distinct forebrain domains, including progenitor cells and cortical projection neurons. These results provide insight into the regulatory logic controlling Fezf2 expression and further the understanding of how multiple non-coding regulatory domains can collaborate to control gene expression in vivo.
    Neural Development 03/2014; 9(1):6. · 3.55 Impact Factor
  •
    ABSTRACT: DNA sequencing offers a powerful tool in oncology based on the precise definition of structural rearrangements and copy number in tumor genomes. Here we describe the development of methods to compute copy number and detect structural variants with data synthesis to locally reconstruct highly rearranged regions of the tumor genome with high precision from standard short read, paired-end sequencing datasets. We find that circular assemblies are the most parsimonious explanation for a set of highly amplified tumor regions in a subset of glioblastoma multiforme (GBM) samples sequenced by The Cancer Genome Atlas (TCGA) consortium, revealing evidence for double minute chromosomes (DM) in these tumors. Further, we find that some samples harbor multiple circular amplicons and in some cases further rearrangements occurred after the initial amplicon-generating event. Fluorescence in situ hybridization (FISH) analysis offered an initial confirmation of the presence of DMs. Gene content in these assemblies helps identify likely driver oncogenes for these amplicons. RNA-seq data available for one DM offered additional support for our local tumor genome assemblies, identifying the birth of a novel exon made possible through rearranged sequences present in the DM. Consistent with previous estimates, our method was also useful for analysis of a larger set of GBM tumors for which exome sequencing data is available, finding evidence for oncogenic DMs in over 20% of clinical specimens examined.
    Cancer Research 08/2013; · 9.28 Impact Factor
  •
    ABSTRACT: During meiosis in yeast, global splicing efficiency increases and then decreases. Here we provide evidence that splicing improves due to reduced competition for the splicing machinery. The timing of this regulation corresponds to repression and reactivation of ribosomal protein genes (RPGs) during meiosis. In vegetative cells, RPG repression by rapamycin treatment also increases splicing efficiency. Downregulation of the RPG-dedicated transcription factor gene IFH1 genetically suppresses two spliceosome mutations, prp11-1 and prp4-1, and globally restores splicing efficiency in prp4-1 cells. We conclude that the splicing apparatus is limiting and that pre-messenger RNAs compete. Splicing efficiency of a pre-mRNA therefore depends not just on its own concentration and affinity for limiting splicing factor(s), but also on those of competing pre-mRNAs. Competition between RNAs for limiting processing factors appears to be a general condition in eukaryotes for a variety of posttranscriptional control mechanisms including microRNA (miRNA) repression, polyadenylation, and splicing.
    Molecular Cell 07/2013; · 14.61 Impact Factor
  • Source
    ABSTRACT: Pre-mRNA splicing is required for the accurate expression of virtually all human protein coding genes. However, splicing also plays important roles in coordinating subsequent steps of pre-mRNA processing such as polyadenylation and mRNA export. Here we test the hypothesis that nuclear pre-mRNA processing influences the polyribosome association of alternative mRNA isoforms. By comparing isoform ratios in cytoplasmic and polyribosomal extracts we determined that the alternative products of approximately 30% (597/1,954) of mRNA processing events are differentially partitioned between these subcellular fractions. Many of the events exhibiting isoform-specific polyribosome association are highly conserved across mammalian genomes, underscoring their possible biological importance. We find that differences in polyribosome association may be explained, at least in part, by the observation that alternative splicing alters the cis-regulatory landscape of mRNA isoforms. For example, inclusion or exclusion of upstream open reading frames (uORFs) in the 5' UTR as well as Alu-elements and microRNA target sites in the 3' UTR have a strong influence on polyribosome association of alternative mRNA isoforms. Taken together, our data demonstrate for the first time the potential link between alternative splicing and translational control of the resultant mRNA isoforms.
    Genome Research 06/2013; · 14.40 Impact Factor
  • Source
    ABSTRACT: During development of the cerebral cortex, neural stem cells divide to expand the progenitor pool and generate basal progenitors, outer radial glia and cortical neurons. As these newly born neurons differentiate, they must properly migrate toward their final destination in the cortical plate, project axons to appropriate targets, and develop dendrites. However, a complete understanding of the precise genetic mechanisms regulating these steps is lacking. Here we show that a member of the nuclear factor one (NFI) family of transcription factors, NFIB, is essential for many of these processes in mice. We performed a detailed analysis of NFIB expression during cortical development, and investigated defects in cortical neurogenesis, neuronal migration and differentiation in NfiB(-/-) brains. We found that NFIB is strongly expressed in radial glia and corticofugal neurons throughout cortical development. However, in NfiB(-/-) cortices, radial glia failed to generate outer radial glia, subsequently resulting in a loss of late basal progenitors. In addition, corticofugal neurons showed a severe loss of axonal projections, while late-born cortical neurons displayed defects in migration and ectopically expressed the early-born neuronal marker, CTIP2. Furthermore, gene expression analysis, by RNA-sequencing, revealed a misexpression of genes that regulate the cell cycle, neuronal differentiation and migration in NfiB(-/-) brains. Together these results demonstrate the critical functions of NFIB in regulating cortical development. J. Comp. Neurol., 2013. © 2013 Wiley Periodicals, Inc.
    The Journal of Comparative Neurology 06/2013; · 3.66 Impact Factor
  • Source
    ABSTRACT: The specification of neuronal subtypes in the cerebral cortex proceeds in a temporal manner; however, the regulation of the transitions between the sequentially generated subtypes is poorly understood. Here, we report that the forkhead box transcription factor Foxg1 coordinates the production of neocortical projection neurons through the global repression of a default gene program. The delayed activation of Foxg1 was necessary and sufficient to induce deep-layer neurogenesis, followed by a sequential wave of upper-layer neurogenesis. A genome-wide analysis revealed that Foxg1 binds to mammalian-specific noncoding sequences to repress over 12 transcription factors expressed in early progenitors, including Ebf2/3, Dmrt3, Dmrta1, and Eya2. These findings reveal an unexpected prolonged competence of progenitors to initiate corticogenesis at a progressed stage during development and identify Foxg1 as a critical initiator of neocorticogenesis through spatiotemporal repression, a system that balances the production of nonradially and radially migrating glutamatergic subtypes during mammalian cortical expansion.
    Cell Reports 03/2013; · 7.21 Impact Factor
  • Source
    Nature 01/2012; 491(7422):56-65. · 38.60 Impact Factor
  • Source
    ABSTRACT: Enhancers and antisense RNAs play key roles in transcriptional regulation through differing mechanisms. Recent studies have demonstrated that enhancers are often associated with non-coding RNAs (ncRNAs), yet the functional role of these enhancer:ncRNA associations is unclear. Using RNA-Sequencing to interrogate the transcriptomes of undifferentiated mouse embryonic stem cells (mESCs) and their derived neural precursor cells (NPs), we identified two novel enhancer-associated antisense transcripts that appear to control isoform-specific expression of their overlapping protein-coding genes. In each case, an enhancer internal to a protein-coding gene drives an antisense RNA in mESCs but not in NPs. Expression of the antisense RNA is correlated with expression of a shorter isoform of the associated sense gene that is not present when the antisense RNA is not expressed. We demonstrate that expression of the antisense transcripts as well as expression of the short sense isoforms correlates with enhancer activity at these two loci. Further, overexpression and knockdown experiments suggest the antisense transcripts regulate expression of their associated sense genes via cis-acting mechanisms. Interestingly, the protein-coding genes involved in these two examples, Zmynd8 and Brd1, share many functional domains, yet their antisense ncRNAs show no homology to each other and are not present in non-murine mammalian lineages, such as the primate lineage. The lack of homology in the antisense ncRNAs indicates they have evolved independently of each other and suggests that this mode of lineage-specific transcriptional regulation may be more widespread in other cell types and organisms. Our findings present a new view of enhancer action wherein enhancers may direct isoform-specific expression of genes through ncRNA intermediates.
    PLoS ONE 01/2012; 7(8):e43511. · 3.53 Impact Factor
  • Source
    ABSTRACT: Ciliated protozoans possess two types of nuclei; a transcriptionally silent micronucleus, which serves as the germ line nucleus, and a transcriptionally active macronucleus, which serves as the somatic nucleus. The macronucleus is derived from a new diploid micronucleus after mating, with epigenetic information contributed by the parental macronucleus serving to guide the formation of the new macronucleus. In the stichotrichous ciliate Oxytricha trifallax, the macronuclear DNA is highly processed to yield gene-sized nanochromosomes with telomeres at each end. Here we report that soon after mating of Oxytricha trifallax, abundant 27 nt small RNAs are produced that are not present prior to mating. We performed next generation sequencing of Oxytricha small RNAs from vegetative and mating cells. Using sequence comparisons between macronuclear and micronuclear versions of genes, we found that the 27 nt RNA class derives from the parental macronucleus, not the developing macronucleus. These small RNAs are produced equally from both strands of macronuclear nanochromosomes, but in a highly non-uniform distribution along the length of the nanochromosome, and with a particular depletion in the 30 nt telomere-proximal positions. This production of small RNAs from the parental macronucleus during macronuclear development stands in contrast to the mechanism of epigenetic control in the distantly related ciliate Tetrahymena. In that species, 28-29 nt scanRNAs are produced from the micronucleus and these micronuclear-derived RNAs serve as epigenetic controllers of macronuclear development. Unlike the Tetrahymena scanRNAs, the Oxytricha macronuclear-derived 27 mers are not modified by 2'O-methylation at their 3' ends. We propose models for the role of these "27macRNAs" in macronuclear development.
    PLoS ONE 01/2012; 7(8):e42371. · 3.53 Impact Factor
  • Source
    ABSTRACT: Insulators help separate active chromatin domains from silenced ones. In yeast, gene promoters act as insulators to block the spread of Sir and HP1 mediated silencing while in metazoans most insulators are multipartite autonomous entities. tDNAs are repetitive sequences dispersed throughout the human genome and we now show that some of these tDNAs can function as insulators in human cells. Using computational methods, we identified putative human tDNA insulators. Using silencer blocking, transgene protection and repressor blocking assays we show that some of these tDNA-containing fragments can function as barrier insulators in human cells. We find that these elements also have the ability to block enhancers from activating RNA pol II transcribed promoters. Characterization of a putative tDNA insulator in human cells reveals that the site possesses chromatin signatures similar to those observed at other better-characterized eukaryotic insulators. Enhanced 4C analysis demonstrates that the tDNA insulator makes long-range chromatin contacts with other tDNAs and ETC sites but not with intervening or flanking RNA pol II transcribed genes.
    The EMBO Journal 11/2011; 31(2):330-50. · 9.82 Impact Factor
  • Source
    ABSTRACT: Fast evolving regions of many metazoan genomes show a bias toward substitutions that change weak (A,T) into strong (G,C) base pairs. Single-nucleotide polymorphisms (SNPs) do not share this pattern, suggesting that it results from biased fixation rather than biased mutation. Supporting this hypothesis, analyses of polymorphism in specific regions of the human genome have identified a positive correlation between weak to strong (W→S) SNPs and derived allele frequency (DAF), suggesting that SNPs become increasingly GC biased over time, especially in regions of high recombination. Using polymorphism data generated by the 1000 Genomes Project from 179 individuals from 4 human populations, we evaluated the extent and distribution of ongoing GC-biased evolution in the human genome. We quantified GC fixation bias by comparing the DAFs of W→S mutations and S→W mutations using a Mann-Whitney U test. Genome-wide, W→S SNPs have significantly higher DAFs than S→W SNPs. This pattern is widespread across the human genome but varies in magnitude along the chromosomes. We found extreme GC-biased evolution in neighborhoods of recombination hot spots, a significant correlation between GC bias and recombination rate, and an inverse correlation between GC bias and chromosome arm length. These findings demonstrate the presence of ongoing fixation bias favoring G and C alleles throughout the human genome and suggest that the bias is caused by a recombination-associated process, such as GC-biased gene conversion.
    Genome Biology and Evolution 06/2011; 3:614-26. · 4.76 Impact Factor
  • Source
    ABSTRACT: Classical approaches to determine structures of noncoding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we developed fragmentation sequencing (FragSeq), a high-throughput RNA structure probing method that uses high-throughput RNA sequencing of fragments generated by digestion with nuclease P1, which specifically cleaves single-stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we accurately and simultaneously mapped single-stranded RNA regions in multiple ncRNAs with known structure. We probed in two cell types to verify reproducibility. We also identified and experimentally validated structured regions in ncRNAs with, to our knowledge, no previously reported probing data.
    Nature Methods 11/2010; 7(12):995-1001. · 23.57 Impact Factor
  • Source
    Nature 10/2010; 467:1061-1073. · 38.60 Impact Factor
  • Source
    ABSTRACT: Regions of the genome that have been the target of positive selection specifically along the human lineage are of special importance in human biology. We used high throughput sequencing combined with methods to enrich human genomic samples for particular targets to obtain the sequence of 22 chromosomal samples at high depth in 40 kb neighborhoods of 49 previously identified 100-400 bp elements that show evidence for human accelerated evolution. In addition to selection, the pattern of nucleotide substitutions in several of these elements suggested an historical bias favoring the conversion of weak (A or T) alleles into strong (G or C) alleles. Here we found strong evidence in the derived allele frequency spectra of many of these 40 kb regions for ongoing weak-to-strong fixation bias. Comparison of the nucleotide composition at polymorphic loci to the composition at sites of fixed substitutions additionally reveals the signature of historical weak-to-strong fixation bias in a subset of these regions. Most of the regions with evidence for historical bias do not also have signatures of ongoing bias, suggesting that the evolutionary forces generating weak-to-strong bias are not constant over time. To investigate the role of selection in shaping these regions, we analyzed the spatial pattern of polymorphism in our samples. We found no significant evidence for selective sweeps, possibly because the signal of such sweeps has decayed beyond the power of our tests to detect them. Together, these results do not rule out functional roles for the observed changes in these regions (indeed, there is good evidence that the first two are functional elements in humans), but they suggest that a fixation process (such as biased gene conversion) that is biased at the nucleotide level, but is otherwise selectively neutral, could be an important evolutionary force at play in them, both historically and at present.
    PLoS Genetics 05/2010; 6(5):e1000960. · 8.52 Impact Factor
  • Article: Predict-2nd
    ABSTRACT: Motivation: Predictions of protein local structure, derived from sequence alignment information alone, provide visualization tools for biologists to evaluate the importance of amino acid residue positions of interest in the absence of X-ray crystal/NMR structures or homology models. They are also useful as inputs to sequence analysis and modeling tools, such as hidden Markov models (HMMs), which can be used to search for homology in databases of known protein structure. In addition, local structure predictions can be used as a component of cost functions in genetic algorithms that predict protein tertiary structure. We have developed a program (predict-2nd) that trains multilayer neural networks and have applied it to numerous local structure alphabets, tuning network parameters such as the number of layers, the number of units in each layer and the window sizes of each layer. We have had the most success with four-layer networks, with gradually increasing window sizes at each layer. Results: Because the four-layer neural nets occasionally get trapped in poor local optima, our training protocol now uses many different random starts, with short training runs, followed by more training on the best performing networks from the short runs. One recent addition to the program is the option to add a guide sequence to the profile inputs, increasing the number of inputs per position by 20. We find that use of a guide sequence provides a small but consistent improvement in the predictions for several different local-structure alphabets. Availability: Local structure prediction with the methods described here is available for use online at http://www.soe.ucsc.edu/compbio/SAM_T08/T08-query.html. 
    The source code and example networks for PREDICT-2ND are available at http://www.soe.ucsc.edu/~karplus/predict-2nd/. A required C++ library is available at http://www.soe.ucsc.edu/~karplus/ultimate/. Contact: karplus@soe.ucsc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
    Bioinformatics. 11/2008; 24(21):2453-2459.
  • Source
    ABSTRACT: Predictions of protein local structure, derived from sequence alignment information alone, provide visualization tools for biologists to evaluate the importance of amino acid residue positions of interest in the absence of X-ray crystal/NMR structures or homology models. They are also useful as inputs to sequence analysis and modeling tools, such as hidden Markov models (HMMs), which can be used to search for homology in databases of known protein structure. In addition, local structure predictions can be used as a component of cost functions in genetic algorithms that predict protein tertiary structure. We have developed a program (predict-2nd) that trains multilayer neural networks and have applied it to numerous local structure alphabets, tuning network parameters such as the number of layers, the number of units in each layer and the window sizes of each layer. We have had the most success with four-layer networks, with gradually increasing window sizes at each layer. Because the four-layer neural nets occasionally get trapped in poor local optima, our training protocol now uses many different random starts, with short training runs, followed by more training on the best performing networks from the short runs. One recent addition to the program is the option to add a guide sequence to the profile inputs, increasing the number of inputs per position by 20. We find that use of a guide sequence provides a small but consistent improvement in the predictions for several different local-structure alphabets. Local structure prediction with the methods described here is available for use online at http://www.soe.ucsc.edu/compbio/SAM_T08/T08-query.html. The source code and example networks for PREDICT-2ND are available at http://www.soe.ucsc.edu/~karplus/predict-2nd/ A required C++ library is available at http://www.soe.ucsc.edu/~karplus/ultimate/
    Bioinformatics 09/2008; 24(21):2453-9. · 5.47 Impact Factor
  • Source
    ABSTRACT: Ultraconserved elements in the human genome are defined as stretches of at least 200 base pairs of DNA that match identically with corresponding regions in the mouse and rat genomes. Most ultraconserved elements are noncoding and have been evolutionarily conserved since mammal and bird ancestors diverged over 300 million years ago. The reason for this extreme conservation remains a mystery. It has been speculated that they are mutational cold spots or regions where every site is under weak but still detectable negative selection. However, analysis of the derived allele frequency spectrum shows that these regions are in fact under negative selection that is much stronger than that in protein coding genes.
    Science 09/2007; 317(5840):915. · 31.20 Impact Factor
  • Source
    ABSTRACT: Comparative genomics allow us to search the human genome for segments that were extensively changed in the last approximately 5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated regions. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.
    PLoS Genetics 11/2006; 2(10):e168. · 8.52 Impact Factor

Publication Stats

1k Citations
317.87 Total Impact Points

Institutions

  • 2006–2014
    • University of California, Santa Cruz
      • Center for Biomolecular Science and Engineering
      • Department of Molecular Cell & Developmental Biology
      • Department of Biomolecular Engineering
      Santa Cruz, California, United States