-
Kunal Gangwal,
Savita Sankar,
Peter C Hollenhorst,
Michelle Kinsey,
Stephen C Haroldsen,
Atul A Shah,
Kenneth M Boucher,
W Scott Watkins,
Lynn B Jorde,
Barbara J Graves,
Stephen L Lessnick
ABSTRACT: The ETS gene family is frequently involved in chromosome translocations that cause human cancer, including prostate cancer, leukemia, and sarcoma. However, the mechanisms by which oncogenic ETS proteins, which are DNA-binding transcription factors, target genes necessary for tumorigenesis are not well understood. Ewing's sarcoma serves as a paradigm for the entire class of ETS-associated tumors because nearly all cases harbor recurrent chromosomal translocations involving ETS genes. The most common translocation in Ewing's sarcoma encodes the EWS/FLI oncogenic transcription factor. We used whole genome localization (ChIP-chip) to identify target genes that are directly bound by EWS/FLI. Analysis of the promoters of these genes demonstrated a significant over-representation of highly repetitive GGAA-containing elements (microsatellites). In a parallel approach, we found that EWS/FLI uses GGAA microsatellites to regulate the expression of some of its target genes, including NR0B1, a gene required for Ewing's sarcoma oncogenesis. The microsatellite in the NR0B1 promoter bound EWS/FLI in vitro and in vivo and was both necessary and sufficient to confer EWS/FLI regulation to a reporter gene. Genome-wide computational studies demonstrated that GGAA microsatellites were enriched close to EWS/FLI-up-regulated genes but not down-regulated genes. Mechanistic studies demonstrated that the ability of EWS/FLI to bind DNA and modulate gene expression through these repetitive elements depended on the number of consecutive GGAA motifs. These findings illustrate an unprecedented route to specificity for ETS proteins and use of microsatellites in tumorigenesis.
Proceedings of the National Academy of Sciences 08/2008; 105(29):10149-54. · 9.68 Impact Factor
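The abstract above notes that EWS/FLI binding depended on the number of consecutive GGAA motifs. As a rough illustration only (not the authors' pipeline), a promoter sequence could be scanned for such runs as follows. The function name and the `min_repeats=4` threshold are assumptions made for this sketch, and only the forward strand is searched.

```python
import re

def find_ggaa_microsatellites(seq, min_repeats=4):
    """Return (start, end, n_repeats) for each run of consecutive GGAA motifs.

    min_repeats is a hypothetical cutoff chosen for illustration; the source
    states only that the number of consecutive motifs matters.
    """
    pattern = re.compile(r"(?:GGAA){%d,}" % min_repeats)
    hits = []
    for match in pattern.finditer(seq.upper()):
        n_repeats = (match.end() - match.start()) // 4
        hits.append((match.start(), match.end(), n_repeats))
    return hits

# Toy sequence: one run of six repeats, then a run of two (below the cutoff).
promoter = "ACGT" + "GGAA" * 6 + "TTCAGGAAGGAACT"
print(find_ggaa_microsatellites(promoter))
```

A fuller analysis would also scan the reverse complement (TTCC runs) and relate repeat positions to transcription start sites.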
-
ABSTRACT: The conservation of in vitro DNA-binding properties within families of transcription factors presents a challenge for achieving in vivo specificity. To uncover the mechanisms regulating specificity within the ETS gene family, we have used chromatin immunoprecipitation coupled with genome-wide promoter microarrays to query the occupancy of three ETS proteins in a human T-cell line. Unexpectedly, redundant occupancy was frequently detected, while specific occupancy was less likely. Redundant binding correlated with housekeeping classes of genes, whereas specific binding examples represented more specialized genes. Bioinformatics approaches demonstrated that redundant binding correlated with consensus ETS-binding sequences near transcription start sites. In contrast, specific binding sites diverged dramatically from the consensus and were found further from transcription start sites. One route to specificity was found: a highly divergent binding site that facilitates ETS1 and RUNX1 cooperative DNA binding. The specific and redundant DNA-binding modes suggest two distinct roles for members of the ETS transcription factor family.
Genes & Development 09/2007; 21(15):1882-94. · 11.66 Impact Factor
-
ABSTRACT: We have implemented a method that identifies the genomic origins of sample proteins by scanning their peptide-mass fingerprint against the theoretical translation and proteolytic digest of an entire genome. Unlike previously reported techniques, this method requires no predefined ORF or protein annotations. Fixed-size windows along the genome sequence are scored by an equation accounting for the number of matching peptides, the number of missed enzymatic cleavages in each peptide, the number of in-frame stop codons within a window, the adjacency between peptides, and duplicate peptide matches. Statistical significance of matching regions is assessed by comparing their scores to scores from windows matching randomly generated mass data. Tests with samples from Saccharomyces cerevisiae mitochondria and Escherichia coli have demonstrated the ability to produce statistically significant identifications, agreeing with two commonly used programs, PeptIdent and Mascot, in 86% of samples analyzed. This genome fingerprint scanning method has the potential to aid in genome annotation, identify proteins for which annotation is incorrect or missing, and handle cases where sequencing errors have caused framing mistakes in the databases. It might also aid in the identification of proteins in which recoding events such as frameshifting or stop-codon read-through have occurred, elucidating alternative translation mechanisms. The prototype is implemented as a client-server pair, allowing the distribution, among a set of cluster nodes, of a single or multiple genomes for concurrent analysis.
Proceedings of the National Academy of Sciences 02/2003; 100(1):20-5. · 9.68 Impact Factor
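The window-based scan described in the abstract above can be pictured with a minimal sketch. It translates one genomic window in a single reading frame, performs a tryptic digest with no missed cleavages, and counts peptides whose monoisotopic mass falls within a tolerance of an observed mass. The abstract does not give the scoring equation, so this sketch returns only two of its raw ingredients (matching peptides and in-frame stop codons); the function names, the choice of trypsin, and the `tol=0.5` Da tolerance are assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Monoisotopic residue masses in daltons; a peptide adds one water.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER = 18.01056

def translate(dna):
    """Translate frame 0 of a DNA string; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR" and protein[i + 1:i + 2] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

def peptide_mass(peptide):
    return sum(MONO[aa] for aa in peptide) + WATER

def score_window(dna_window, observed_masses, tol=0.5):
    """Return (matching peptide count, in-frame stop codon count) for a window."""
    protein = translate(dna_window)
    stops = protein.count("*")
    matches = 0
    for segment in protein.split("*"):
        for peptide in tryptic_peptides(segment):
            if "X" in peptide:
                continue  # skip peptides containing untranslatable codons
            mass = peptide_mass(peptide)
            if any(abs(mass - obs) <= tol for obs in observed_masses):
                matches += 1
    return matches, stops

# Toy window encoding G-K-A-R-stop; one observed mass near that of peptide GK.
print(score_window("GGTAAAGCTCGTTAA", [203.127]))
```

A scan in the spirit of the abstract would slide such windows along all six reading frames and compare scores against those obtained from randomly generated mass lists.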
-
ABSTRACT: An mRNA transcript contains many potential antisense oligodeoxynucleotide target sites. Identification of the most efficacious targets remains an important and challenging problem. Building on separate work that revealed a strong correlation between the inclusion of short sequence motifs and the activity level of an oligo, we have developed a predictive artificial neural network system for mapping tetranucleotide motif content to antisense oligo activity. Trained for high-specificity prediction, the system has been cross-validated against a database of 348 oligos from the literature and a larger proprietary database of 908 oligos. In cross-validation tests the system identified effective oligos (i.e. oligos capable of reducing target mRNA expression to <25% that of the control) with 53% accuracy, in contrast to the <10% success rates commonly reported for trial-and-error oligo selection, suggesting a possible 5-fold reduction in the in vivo screening required to find an active oligo. We have implemented a web interface to a trained neural network. Given an RNA transcript as input, the system identifies the most likely oligo targets and provides estimates of the probabilities that oligos targeted against these sites will be effective.
Nucleic Acids Research 10/2002; 30(19):4295-304. · 8.03 Impact Factor
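The abstract above describes mapping tetranucleotide motif content to oligo activity. One plausible input encoding, shown here purely as an assumption, is a 256-element vector of overlapping tetranucleotide counts per oligo. The sketch covers only this feature extraction step, not the neural network or its training, and the names `TETRAMERS` and `tetramer_counts` are hypothetical.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order, so every oligo maps to a
# vector with the same layout.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {t: i for i, t in enumerate(TETRAMERS)}

def tetramer_counts(oligo):
    """Count overlapping tetranucleotides in an oligo (U is treated as T)."""
    counts = [0] * len(TETRAMERS)
    seq = oligo.upper().replace("U", "T")
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # ignore windows containing ambiguous bases
            counts[idx] += 1
    return counts

features = tetramer_counts("ACGTACGT")
print(features[INDEX["ACGT"]], sum(features))
```

Each such vector could then be paired with a measured activity value as one training example for a classifier.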
-
ABSTRACT: In an effort to identify potential programmed frameshift sites by statistical analysis, we explore the hypothesis that selective pressure would have rendered such sites underabundant and underrepresented in protein-coding sequences. We developed a computer program to compare the frequencies of k-length subsequences of nucleotides with the frequencies predicted by a zero order Markov chain determined by the codon bias of the same set of sequences. The program was used to calculate and evaluate the distribution of 7-base oligonucleotides in the 6000+ putative protein-coding sequences of S. cerevisiae preliminary to the laboratory testing of the most highly underrepresented oligos for frameshifting efficiency.
Among the most significant results is the finding that the heptanucleotides CUU-AGG-C and CUU-AGU-U, sites of the programmed +1 translational frameshifts required for the production in yeast of actin filament-binding protein ABP140 and telomerase subunit EST3, respectively, rank among the least represented of phase I heptanucleotides in the coding sequences of S. cerevisiae. Laboratory experiments demonstrated that other underrepresented heptanucleotides identified by the program, for example GGU-CAG-A, are also prone to significant translational frameshifting, suggesting the possibility that genes containing other underrepresented heptamers may also encode transframe products.
The program is available for download, together with complete results from the analysis of S. cerevisiae, at http://www.gesteland.genetics.utah.edu/freqAnalysis
Bioinformatics 09/2002; 18(8):1046-53. · 5.47 Impact Factor
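The comparison described in the abstract above, between observed heptanucleotide frequencies and those predicted from codon bias alone, can be sketched in simplified form. This is an illustrative reading rather than the published program: it considers only in-frame heptamers made of two full codons plus the first base of the next codon, and takes the expected count to be the product of the independent codon frequencies. The function names are hypothetical.

```python
from collections import Counter

def codons(seq):
    """Split a coding sequence into complete in-frame codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def heptamer_representation(coding_seqs):
    """Return observed/expected ratios for in-frame heptamers.

    Expected counts assume codons are drawn independently according to
    their overall frequency in the same sequences (zero-order model).
    Heptamers that never occur are not listed.
    """
    codon_counts = Counter()
    observed = Counter()
    for seq in coding_seqs:
        cs = codons(seq.upper())
        codon_counts.update(cs)
        for first, second, third in zip(cs, cs[1:], cs[2:]):
            observed[first + second + third[0]] += 1

    total_codons = sum(codon_counts.values())
    total_heptamers = sum(observed.values())
    p_codon = {c: n / total_codons for c, n in codon_counts.items()}

    # Probability that a codon starts with a given base.
    p_first = Counter()
    for codon, p in p_codon.items():
        p_first[codon[0]] += p

    ratios = {}
    for heptamer, n_obs in observed.items():
        expected = (total_heptamers * p_codon[heptamer[:3]]
                    * p_codon[heptamer[3:6]] * p_first[heptamer[6]])
        ratios[heptamer] = n_obs / expected
    return ratios

# Toy input; real use would pass the full set of coding sequences.
ratios = heptamer_representation(["AAACCCAAACCC", "AAAAAACCC"])
for heptamer, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(heptamer, round(ratio, 2))
```

Ratios well below 1 flag underrepresented heptamers; a complete analysis would also enumerate heptamers with zero occurrences and assess statistical significance.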