ABSTRACT: Critical for human gene therapy is the availability of small promoter tools to drive gene expression in a highly specific and reproducible manner. We tackled this challenge by developing human DNA MiniPromoters using computational biology and phylogenetic conservation. MiniPromoters were tested in mouse as single-copy knock-ins at the Hprt locus on the X Chromosome, and evaluated for lacZ reporter expression in CNS and non-CNS tissue. Eighteen novel MiniPromoters driving expression in mouse brain were identified, two MiniPromoters for driving pan-neuronal expression, and 17 MiniPromoters for the mouse eye. Key areas of therapeutic interest were represented in this set: the cerebral cortex, embryonic hypothalamus, spinal cord, bipolar and ganglion cells of the retina, and skeletal muscle. We also demonstrated that three retinal ganglion cell MiniPromoters exhibit similar cell-type specificity when delivered via adeno-associated virus (AAV) vectors intravitreally. We conclude that our methodology and characterization have resulted in desirable expression characteristics that are intrinsic to the MiniPromoter, not dictated by copy number effects or genomic location, and result in constructs predisposed to success in AAV. These MiniPromoters are immediately applicable for pre-clinical studies towards gene therapy in humans, and are publicly available to facilitate basic and clinical research, and human gene therapy.
ABSTRACT: The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.
ABSTRACT: In this design study, we present an analysis and abstraction of the data and tasks related to the domain of epigenomics, and the design and implementation of an interactive tool to facilitate data analysis and visualization in this domain. Epigenomic data can be grouped into subsets either by k-means clustering or by querying for combinations of presence or absence of signal (on/off) in different epigenomic experiments. These steps can easily be interleaved and the comparison of different workflows is explicitly supported. We took special care to contain the exponential expansion of possible on/off combinations by creating a novel querying interface. An interactive heat map facilitates the exploration and comparison of different clusters. We validated our iterative design by working closely with two groups of biologists on different biological problems. Both groups quickly found new insights into their data and reported that our tool would save them several hours or days of work compared with existing tools.
Computer Graphics Forum 07/2013; 32(3):91-100. · 1.64 Impact Factor
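The on/off querying described in the epigenomics design study above can be illustrated with a minimal sketch. This is not the tool's implementation: the function names, the threshold of 1.0, and the toy signal values are all hypothetical. It only shows why the number of possible patterns grows as 2^n with n experiments, and how a partial query (some experiments left as "don't care") avoids enumerating every combination.

```python
from collections import defaultdict

# Hypothetical toy data: region name -> signal in three experiments.
REGIONS = {
    "r1": [2.0, 0.1, 3.0],
    "r2": [0.2, 0.1, 2.5],
    "r3": [1.5, 0.0, 4.0],
    "r4": [0.0, 2.2, 0.3],
}


def on_off_pattern(signals, threshold=1.0):
    """Binarize signals: True (on) if the value reaches the assumed threshold."""
    return tuple(value >= threshold for value in signals)


def group_by_pattern(regions, threshold=1.0):
    """Group region names by their full on/off pattern across all experiments."""
    groups = defaultdict(list)
    for name, signals in regions.items():
        groups[on_off_pattern(signals, threshold)].append(name)
    return dict(groups)


def query(regions, required, threshold=1.0):
    """Return regions matching a partial pattern.

    `required` maps an experiment index to True (on) or False (off);
    experiments that are not listed are treated as "don't care".
    """
    return [
        name
        for name, signals in regions.items()
        if all((signals[i] >= threshold) == state for i, state in required.items())
    ]


if __name__ == "__main__":
    n_experiments = 3
    print("possible patterns:", 2 ** n_experiments)
    print("observed groups:", group_by_pattern(REGIONS))
    print("on in exp 0, off in exp 1:", query(REGIONS, {0: True, 1: False}))
```

Grouping by the patterns that actually occur in the data, rather than by all 2^n possible ones, is one simple way to keep the set of subsets manageable.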
ABSTRACT: White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. However, sequencing and assembling the large and highly repetitive spruce genome pushes the boundaries of current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pair draft genome in 4.9 million scaffolds, with a scaffold N50 of 20 356 bp. We demonstrate how recent improvements in sequencing technology, especially increasing read lengths and paired-end reads from longer fragments, have a major impact on assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies. AVAILABILITY: The Picea glauca genome sequencing and assembly data are available through NCBI (Accession#: ALWZ0100000000 PID: PRJNA83435). http://www.ncbi.nlm.nih.gov/bioproject/83435. CONTACT: email@example.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
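For readers unfamiliar with the scaffold N50 statistic quoted in the white spruce abstract above, the following is a minimal sketch of how it is commonly computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical toy assembly: total 300 bp, so half is 150 bp.
    toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
    print(n50(toy_scaffolds))  # 80 + 70 = 150 reaches half, so N50 is 70
```

A higher N50 for the same total assembly size indicates that more of the genome sits in long scaffolds, which is why the statistic is used as a contiguity measure.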
ABSTRACT: Background
The mountain pine beetle, Dendroctonus ponderosae Hopkins, is the most serious insect pest of western North American pine forests. A recent outbreak has destroyed more than 15 million hectares of pine forests with major environmental impacts on forest health, and economic impacts on the forest industry. The outbreak has in part been driven by climate change and will contribute to increased carbon emissions through decaying forests.
We developed a mountain pine beetle genome sequence resource to better understand the unique aspects of this beetle’s biology. A draft de novo genome sequence was assembled from paired-end, short-read sequences from an individual field-collected male pupa and scaffolded using mate-paired, short-read genomic sequences from pooled field-collected pupae, paired-end short-insert whole transcriptome shotgun sequencing reads of mRNA from adult beetle tissues, and paired-end Sanger EST sequences from various life stages. We describe the cytochrome P450, glutathione S-transferase, and plant cell wall degrading enzyme gene families important to mountain pine beetle survival in their harsh and nutrient-poor host environment, and examine genome-wide SNP variation. A horizontally transferred bacterial sucrose-6-phosphate hydrolase was evident in the genome and its tissue-specific transcription suggests a functional role for the species.
Despite Coleoptera being the largest insect order with over 400,000 described species, including many agricultural and forest pest species, this is only the second genome sequence reported in Coleoptera and will provide an important resource for the Curculionoidea and other insects.
ABSTRACT: Diffuse large B-cell lymphoma (DLBCL) accounts for 30% to 40% of newly diagnosed lymphomas and has an overall cure rate of approximately 60%. Previously, we observed FOXO1 mutations in non-Hodgkin lymphoma (NHL) patient samples. To explore the effects of FOXO1 mutations, we assessed FOXO1 status in 279 DLBCL patient samples and 22 DLBCL-derived cell lines. FOXO1 mutations were found in 8.6% (24/279) of DLBCL cases. 92.3% (24/26) of mutations were in the first exon, 46.2% (12/26) were recurrent mutations affecting the N-terminal region and another 38.5% (10/26) affected the Forkhead DNA binding domain. Recurrent mutations in the N-terminal region resulted in diminished T24 phosphorylation, loss of interaction with 14-3-3, and nuclear retention. FOXO1 mutation was associated with decreased overall survival in patients treated with R-CHOP (P = 0.037), independent of cell-of-origin (COO) and the Revised International Prognostic Index (R-IPI). This association was particularly evident (P = 0.003) in patients in the low-risk R-IPI categories. The independent relationship of mutations in FOXO1 to survival, transcending the prognostic influence of the R-IPI and COO, indicates that FOXO1 mutation is a novel prognostic factor that plays an important role in DLBCL pathogenesis.
ABSTRACT: Triple-negative breast cancers (TNBC) are notoriously difficult to treat because they lack hormone receptors and have limited targeted therapies. Recently, we demonstrated that p90 ribosomal S6 kinase (RSK) is essential for TNBC growth and survival, making it a target for therapeutic development. RSK phosphorylates Y-box binding protein-1 (YB-1), an oncogenic transcription/translation factor that is highly expressed in TNBC (~70% of cases) and associated with poor prognosis, drug resistance and tumor initiation. YB-1 regulates the tumor-initiating cell (TIC) markers CD44 and CD49f; however, its role in Notch signaling has not been explored. We sought to identify novel chemical entities with RSK inhibitory activity. The Prestwick Chemical Library of 1120 off-patent drugs was screened for RSK inhibitors using both in vitro kinase assays and molecular docking. The lead candidate, luteolin, inhibited RSK1 and RSK2 kinase activity and suppressed growth in TNBC, including TIC-enriched populations. Combining luteolin with paclitaxel increased cell death and, unlike chemotherapy alone, did not enrich for CD44(+) cells. Luteolin's efficacy against drug-resistant cells was further indicated in the primary x43 cell line, where it suppressed monolayer growth and mammosphere formation. We next endeavored to understand how the inhibition of RSK/YB-1 signaling by luteolin elicited an effect on TIC-enriched populations. ChIP-on-ChIP experiments in SUM149 cells revealed a 12-fold enrichment of YB-1 binding to the Notch4 promoter. We chose to pursue this because there are several reports indicating that Notch4 maintains cells in an undifferentiated, TIC state. Herein we report that silencing YB-1 with siRNA decreased Notch4 mRNA. Conversely, transient expression of Flag:YB-1(WT) or the constitutively active mutant Flag:YB-1(D102) increased Notch4 mRNA.
The levels of Notch4 transcript and the abundance of the Notch4 intracellular domain (N4ICD) correlated with activation of P-RSK(S221/7) and P-YB-1(S102) in a panel of TNBC cell lines. Silencing YB-1 or RSK reduced Notch4 mRNA and this corresponded with loss of N4ICD. Likewise, the RSK inhibitors, luteolin and BI-D1870, suppressed P-YB-1(S102) and thereby reduced Notch4. In conclusion, inhibiting the RSK/YB-1 pathway with luteolin is a novel approach to blocking Notch4 signaling and as such provides a means of inhibiting TICs.
ABSTRACT: Neuroblastoma is a malignancy of the developing sympathetic nervous system that often presents with widespread metastatic disease, resulting in survival rates of less than 50%. To determine the spectrum of somatic mutation in high-risk neuroblastoma, we studied 240 affected individuals (cases) using a combination of whole-exome, genome and transcriptome sequencing as part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative. Here we report a low median exonic mutation frequency of 0.60 per Mb (0.48 nonsilent) and notably few recurrently mutated genes in these tumors. Genes with significant somatic mutation frequencies included ALK (9.2% of cases), PTPN11 (2.9%), ATRX (2.5%, and an additional 7.1% had focal deletions), MYCN (1.7%, causing a recurrent p.Pro44Leu alteration) and NRAS (0.83%). Rare, potentially pathogenic germline variants were significantly enriched in ALK, CHEK2, PINK1 and BARD1. The relative paucity of recurrent somatic mutations in neuroblastoma challenges current therapeutic strategies that rely on frequently altered oncogenic drivers.
ABSTRACT: MicroRNAs (miRNAs) are recently discovered small RNA molecules that regulate developmental processes, such as proliferation, differentiation and apoptosis; however, the identity of miRNAs and their functions during liver development are largely unknown. Here, we investigated the miRNA and gene expression profiles for E8.5 endoderm, E14.5 Dlk1(+) liver cells (hepatoblasts), and adult liver by employing Illumina sequencing. We found that miRNAs were abundantly expressed at all three stages. Using K-means clustering analysis, thirteen miRNA clusters with distinct temporal expression patterns were identified. Mir302b, an endoderm-enriched miRNA, was identified as a miRNA whose predicted targets are expressed at high levels in E14.5 hepatoblasts but at low levels in the endoderm. We validated the expression of mir302b in the endoderm by whole-mount in situ hybridization. Interestingly, mir20a, the most highly expressed miRNA in the endoderm library, was also predicted to regulate some of the same targets as mir302b. We found that through targeting Tgfbr2, mir302b and mir20a are able to regulate TGFβ signal transduction. Moreover, mir302b can repress liver markers in an embryonic stem cell differentiation model. Collectively, we have uncovered dynamic patterns of individual miRNAs during liver development, as well as miRNA networks that could be essential for the specification and differentiation of liver progenitors. (HEPATOLOGY 2013.).
ABSTRACT: Background
The genetics of congenital heart disease remain incompletely understood. Exome sequencing has been successfully used to identify disease-causing mutations in familial disorders where candidate gene analyses and linkage mapping have failed.
We studied a large family characterized by autosomal dominant isolated secundum atrial septal defect (ASD, MIM #612794). Candidate gene resequencing and linkage analysis were uninformative.
Whole-exome sequencing of two affected family members identified 44 rare, shared variants including a non-synonymous mutation (c.532A>T, p.M178L, NM_005159.4) in alpha-cardiac actin (ACTC1). This mutation was absent from 1834 internal controls as well as from the 1000 Genomes and the Exome Sequencing Project (ESP) databases, but predictions regarding its effect on protein function were divergent. However, p.M178L was the only rare mutation segregating with disease in our family.
Our results provide further evidence supporting a causative role for ACTC1 mutations in ASD. Massively-parallel sequencing of the exome allows for the detection of novel rare variants causing congenital heart disease without the limitations of a candidate gene approach. When mutation prediction algorithms are not helpful, studies of familial disease can help distinguish rare pathologic mutations from benign variants. Consideration of the family history can lead to genetic insights into congenital heart disease.
The Canadian Journal of Cardiology 01/2013; · 3.12 Impact Factor
ABSTRACT: Recent sequencing efforts have described the mutational landscape of the pediatric brain tumor medulloblastoma (MB). Although MLL2 is among the genes most frequently affected by somatic single nucleotide variants (SNVs), the clinical and biological significance of these mutations remains uncharacterized. Through targeted re-sequencing, we identified mutations of MLL2 in 8% (14/175) of MBs, the majority of which were loss of function. Notably, we also report mutations affecting the MLL2-binding partner KDM6A in 4% (7/175) of tumors. While MLL2 mutations were independent of age, gender, histological subtype, M-stage or molecular subgroup, KDM6A mutations were most commonly identified in Group 4 MBs, and were mutually exclusive with MLL2 mutations. Immunohistochemical staining for H3K4me3 and H3K27me3, the chromatin effectors of MLL2 and KDM6A activity, respectively, demonstrated alterations of the histone code in 24% (53/220) of MBs across all subgroups. Correlating these MLL2- and KDM6A-driven histone marks with prognosis, we identified populations of MB with improved (K4+/K27-) and dismal (K4-/K27-) outcomes, observed primarily within Group 3 and 4 MBs. Group 3 and 4 MBs demonstrate somatic copy number aberrations and transcriptional profiles that converge on modifiers of H3K27-methylation (EZH2, KDM6A, KDM6B), leading to silencing of PRC2-target genes. As PRC2-mediated aberrant methylation of H3K27 has recently been targeted for therapy in other diseases, it represents an actionable target for a substantial percentage of medulloblastoma patients with aggressive forms of the disease.
ABSTRACT: Somatic hypermutation (SHM) in the variable region of immunoglobulin genes (IGV) naturally occurs in a narrow window of B cell development to provide high-affinity antibodies. However, SHM can also aberrantly target proto-oncogenes and cause genome instability. The role of aberrant SHM (aSHM) has been widely studied in various non-Hodgkin's lymphomas, particularly in diffuse large B-cell lymphoma (DLBCL). Although it has been speculated that aSHM targets a wide range of genomic loci, so far only twelve genes have been identified as targets of aSHM through the targeted sequencing of selected genes. A genome-wide study aiming at identifying a comprehensive set of aSHM targets recurrently occurring in DLBCL has not been previously undertaken. Here, we present a comprehensive assessment of the somatic hypermutated genes in DLBCL identified through an analysis of genomic and transcriptome data derived from 40 DLBCL patients. Our analysis verifies that there are indeed many genes that are recurrently affected by aSHM. In particular, we have identified 32 novel targets that show the same or a higher level of aSHM activity than genes previously reported. Amongst these novel targets, 22 genes showed a significant correlation between mRNA abundance and aSHM.
ABSTRACT: Biologists possess the detailed knowledge critical for extracting biological insight from genome-wide data resources, and yet they are increasingly faced with nontrivial computational analysis challenges posed by genome-scale methodologies. To lower this computational barrier, particularly in the early data exploration phases, we have developed an interactive pattern discovery and visualization approach, Spark, designed with epigenomic data in mind. Here we demonstrate Spark's ability to reveal both known and novel epigenetic signatures, including a previously unappreciated binding association between the YY1 transcription factor and the corepressor CTBP2 in human embryonic stem cells.
ABSTRACT: Medulloblastoma, the most common malignant paediatric brain tumour, is currently treated with nonspecific cytotoxic therapies including surgery, whole-brain radiation, and aggressive chemotherapy. As medulloblastoma exhibits marked intertumoural heterogeneity, with at least four distinct molecular variants, previous attempts to identify targets for therapy have been underpowered because of small sample sizes. Here we report somatic copy number aberrations (SCNAs) in 1,087 unique medulloblastomas. SCNAs are common in medulloblastoma, and are predominantly subgroup-enriched. The most common region of focal copy number gain is a tandem duplication of SNCAIP, a gene associated with Parkinson's disease, which is exquisitely restricted to Group 4α. Recurrent translocations of PVT1, including PVT1-MYC and PVT1-NDRG1, that arise through chromothripsis are restricted to Group 3. Numerous targetable SCNAs, including recurrent events targeting TGF-β signalling in Group 3, and NF-κB signalling in Group 4, suggest future avenues for rational, targeted therapy.
ABSTRACT: Autosomal-recessive inheritance, severe to profound sensorineural hearing loss, and partial agenesis of the corpus callosum are hallmarks of the clinically well-established Chudley-McCullough syndrome (CMS). Although not always reported in the literature, frontal polymicrogyria and gray matter heterotopia are uniformly present, whereas cerebellar dysplasia, ventriculomegaly, and arachnoid cysts are nearly invariant. Despite these striking brain malformations, individuals with CMS generally do not present with significant neurodevelopmental abnormalities, except for hearing loss. Homozygosity mapping and whole-exome sequencing of DNA from affected individuals in eight families (including the family in the first report of CMS) revealed four molecular variations (two single-base deletions, a nonsense mutation, and a canonical splice-site mutation) in the G protein-signaling modulator 2 gene, GPSM2, that underlie CMS. Mutations in GPSM2 have been previously identified in people with profound congenital nonsyndromic hearing loss (NSHL). Subsequent brain imaging of these individuals revealed frontal polymicrogyria, abnormal corpus callosum, and gray matter heterotopia, consistent with a CMS diagnosis, but no ventriculomegaly. The gene product, GPSM2, is required for orienting the mitotic spindle during cell division in multiple tissues, suggesting that the sensorineural hearing loss and characteristic brain malformations of CMS are due to defects in asymmetric cell divisions during development.
The American Journal of Human Genetics 05/2012; 90(6):1088-93. · 11.20 Impact Factor