Niranjan Nagarajan

Genome Institute of Singapore, Tumasik, Singapore

Are you Niranjan Nagarajan?

Claim your profile

Publications (45)340.15 Total impact

  • [Show abstract] [Hide abstract]
    ABSTRACT: Chromosomal structural variations play an important role in determining the transcriptional landscape of human breast cancers. To assess the nature of these structural variations, we analyzed eight breast tumor samples with a focus on regions of gene amplification using mate-pair sequencing of long-insert genomic DNA with matched transcriptome profiling. We found that tandem duplications appear to be early events in tumor evolution, especially in the genesis of amplicons. In a detailed reconstruction of events on chromosome 17, we found large unpaired inversions and deletions connect a tandemly duplicated ERBB2 with neighboring 17q21.3 amplicons while simultaneously deleting the intervening BRCA1 tumor suppressor locus. This series of events appeared to be unusually common when examined in larger genomic data sets of breast cancers albeit using approaches with lesser resolution. Using siRNAs in breast cancer cell lines, we showed that the 17q21.3 amplicon harbored a significant number of weak oncogenes that appeared consistently coamplified in primary tumors. Down-regulation of BRCA1 expression augmented the cell proliferation in ERBB2-transfected human normal mammary epithelial cells. Coamplification of other functionally tested oncogenic elements in other breast tumors examined, such as RIPK2 and MYC on chromosome 8, also parallel these findings. Our analyses suggest that structural variations efficiently orchestrate the gain and loss of cancer gene cassettes that engage many oncogenic pathways simultaneously and that such oncogenic cassettes are favored during the evolution of a cancer.
    Genome research. 09/2014;
  • [Show abstract] [Hide abstract]
    ABSTRACT: Fastidious anaerobic bacteria play critical roles in environmental bioremediation of halogenated compounds. However, their characterization and application have been largely impeded by difficulties in growing them in pure culture. Thus far, no pure culture has been reported to respire on the notorious polychlorinated biphenyls (PCBs), and functional genes responsible for PCB detoxification remain unknown due to the extremely slow growth of PCB-respiring bacteria. Here we report the successful isolation and characterization of three Dehalococcoides mccartyi strains that respire on commercial PCBs. Using high-throughput metagenomic analysis, combined with traditional culture techniques, tetrachloroethene (PCE) was identified as a feasible alternative to PCBs to isolate PCB-respiring Dehalococcoides from PCB-enriched cultures. With PCE as an alternative electron acceptor, the PCB-respiring Dehalococcoides were boosted to a higher cell density (1.2 × 10(8) to 1.3 × 10(8) cells per mL on PCE vs. 5.9 × 10(6) to 10.4 × 10(6) cells per mL on PCBs) with a shorter culturing time (30 d on PCE vs. 150 d on PCBs). The transcriptomic profiles illustrated that the distinct PCB dechlorination profile of each strain was predominantly mediated by a single, novel reductive dehalogenase (RDase) catalyzing chlorine removal from both PCBs and PCE. The transcription levels of PCB-RDase genes are 5-60 times higher than the genome-wide average. The cultivation of PCB-respiring Dehalococcoides in pure culture and the identification of PCB-RDase genes deepen our understanding of organohalide respiration of PCBs and shed light on in situ PCB bioremediation.
    Proceedings of the National Academy of Sciences of the United States of America. 07/2014;
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Opisthorchiasis is a neglected, tropical disease caused by the carcinogenic Asian liver fluke, Opisthorchis viverrini. This hepatobiliary disease is linked to malignant cancer (cholangiocarcinoma, CCA) and affects millions of people in Asia. No vaccine is available, and only one drug (praziquantel) is used against the parasite. Little is known about O. viverrini biology and the diseases that it causes. Here we characterize the draft genome (634.5 Mb) and transcriptomes of O. viverrini, elucidate how this fluke survives in the hostile environment within the bile duct and show that metabolic pathways in the parasite are highly adapted to a lipid-rich diet from bile and/or cholangiocytes. We also provide additional evidence that O. viverrini and other flukes secrete proteins that directly modulate host cell proliferation. Our molecular resources now underpin profound explorations of opisthorchiasis/CCA and the design of new interventions.
    Nature Communications 01/2014; 5:4378. · 10.02 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: RNA viruses are notorious for their ability to quickly adapt to selective pressure from the host immune system and/or antivirals. This adaptability is likely due to the error-prone characteristics of their RNA-dependent, RNA polymerase [1, 2]. Dengue virus, a member of the Flaviviridae family of positive-strand RNA viruses, is also known to share these error-prone characteristics [3]. Utilizing high-throughput, massively parallel sequencing methodologies, or next-generation sequencing (NGS), we can now accurately quantify these populations of viruses and track the changes to these populations over the course of a single infection. The aim of this chapter is twofold: to describe the methodologies required for sample preparation prior to sequencing and to describe the bioinformatics analyses required for the resulting data.
    Methods in molecular biology (Clifton, N.J.) 01/2014; 1138:175-95. · 1.29 Impact Factor
  • Niranjan Nagarajan, Mihai Pop
    [Show abstract] [Hide abstract]
    ABSTRACT: Advances in sequencing technologies and increased access to sequencing services have led to renewed interest in sequence and genome assembly. Concurrently, new applications for sequencing have emerged, including gene expression analysis, discovery of genomic variants and metagenomics, and each of these has different needs and challenges in terms of assembly. We survey the theoretical foundations that underlie modern assembly and highlight the options and practical trade-offs that need to be considered, focusing on how individual features address the needs of specific applications. We also review key software and the interplay between experimental design and efficacy of assembly.
    Nature Reviews Genetics 01/2013; · 41.06 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to amplify more than 90% of sequences in the Greengenes database and with the ability to distinguish nearly twice as many species-level OTUs compared to existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the constituents of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization.
    PLoS ONE 01/2013; 8(4):e60811. · 3.53 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Rickettsia prowazekii is a notable intracellular pathogen, the agent of epidemic typhus, and a potential biothreat agent. We present here whole-genome sequence data for four strains of R. prowazekii, including one from a flying squirrel.
    Genome announcements. 01/2013; 1(3).
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: BACKGROUND: Gastric cancer is the second highest cause of global cancer mortality. To explore the complete repertoire of somatic alterations in gastric cancer, we combined massively-parallel short read and DNA-PET sequencing to present the first whole-genome analysis of two gastric adenocarcinomas, one with chromosomal instability and the other with microsatellite instability. RESULTS: Integrative analysis and de novo assemblies revealed the architecture of a wild-type KRAS amplification, a common driver event in gastric cancer. We discovered three distinct mutational signatures in gastric cancer - against a genome-wide backdrop of oxidative and microsatellite instability-related mutational signatures, we identified the first exome-specific mutational signature. Further characterization of the impact of these signatures by combining sequencing data from 40 complete gastric cancer exomes and targeted screening of an additional 94 independent gastric tumours uncovered ACVR2A, RPL22 and LMAN1 as recurrently mutated genes in microsatellite instability-positive gastric cancer and PAPPA as a recurrently mutated gene in TP53 wild-type gastric cancer. CONCLUSIONS: These results highlight how whole-genome cancer sequencing can uncover information relevant to tissue-specific carcinogenesis that would otherwise be missed from exome-sequencing data.
    Genome biology 12/2012; 13(12):R115. · 10.30 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Oranges are an important nutritional source for human health and have immense economic value. Here we present a comprehensive analysis of the draft genome of sweet orange (Citrus sinensis). The assembled sequence covers 87.3% of the estimated orange genome, which is relatively compact, as 20% is composed of repetitive elements. We predicted 29,445 protein-coding genes, half of which are in the heterozygous state. With additional sequencing of two more citrus species and comparative analyses of seven citrus genomes, we present evidence to suggest that sweet orange originated from a backcross hybrid between pummelo and mandarin. Focused analysis on genes involved in vitamin C metabolism showed that GalUR, encoding the rate-limiting enzyme of the galacturonate pathway, is significantly upregulated in orange fruit, and the recent expansion of this gene family may provide a genomic basis. This draft genome represents a valuable resource for understanding and improving many important citrus traits in the future.
    Nature Genetics 11/2012; · 35.21 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The study of cell-population heterogeneity in a range of biological systems, from viruses to bacterial isolates to tumor samples, has been transformed by recent advances in sequencing throughput. While the high-coverage afforded can be used, in principle, to identify very rare variants in a population, existing ad hoc approaches frequently fail to distinguish true variants from sequencing errors. We report a method (LoFreq) that models sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population. Using simulated and real datasets (viral, bacterial and human), we show that LoFreq has near-perfect specificity, with significantly improved sensitivity compared with existing methods and can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics. We also present experimental validation for LoFreq on two different platforms (Fluidigm and Sequenom) and its application to call rare somatic variants from exome sequencing datasets for gastric cancer. Source code and executables for LoFreq are freely available at http://sourceforge.net/projects/lofreq/.
    Nucleic Acids Research 10/2012; · 8.81 Impact Factor
  • Song Gao, Denis Bertrand, Niranjan Nagarajan
    [Show abstract] [Hide abstract]
    ABSTRACT: With the increased democratization of sequencing, the reliance of sequence assembly programs on heuristics is at odds with the need for black-box assembly solutions that can be used reliably by non-specialists. In this work, we present a formal definition for in silico assembly validation and finishing and explore the feasibility of an exact solution for this problem using quadratic programming (FinIS). Based on results for several real and simulated datasets, we demonstrate that FinIS validates the correctness of a larger fraction of the assembly than existing ad hoc tools. Using a test for unique optimal solutions, we show that FinIS can improve on both precision and recall values for the correctness of assembled sequences, when compared to competing programs. Source code and executables for FinIS are freely available at http://sourceforge.net/projects/finis/.
    Proceedings of the 12th international conference on Algorithms in Bioinformatics; 09/2012
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Background. Dengue is the most common arboviral infection of humans. There are currently no specific treatments for dengue. Balapiravir is a prodrug of a nucleoside analogue (called R1479) and an inhibitor of hepatitis C virus replication in vivo.Methods. We conducted in vitro experiments to determine the potency of balapiravir against dengue viruses and then an exploratory, dose-escalating, randomized placebo-controlled trial in adult male patients with dengue with <48 hours of fever.Results. The clinical and laboratory adverse event profile in patients receiving balapiravir at doses of 1500 mg (n = 10) or 3000 mg (n = 22) orally for 5 days was similar to that of patients receiving placebo (n = 32), indicating balapiravir was well tolerated. However, twice daily assessment of viremia and daily assessment of NS1 antigenemia indicated balapiravir did not measurably alter the kinetics of these virological markers, nor did it reduce the fever clearance time. The kinetics of plasma cytokine concentrations and the whole blood transcriptional profile were also not attenuated by balapiravir treatment.Conclusions. Although this trial, the first of its kind in dengue, does not support balapiravir as a candidate drug, it does establish a framework for antiviral treatment trials in dengue and provides the field with a clinically evaluated benchmark molecule.Clinical Trials Registration. NCT01096576.
    The Journal of Infectious Diseases 07/2012; · 5.85 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: We describe genome mapping on nanochannel arrays. In this approach, specific sequence motifs in single DNA molecules are fluorescently labeled, and the DNA molecules are uniformly stretched in thousands of silicon channels on a nanofluidic device. Fluorescence imaging allows the construction of maps of the physical distances between occurrences of the sequence motifs. We demonstrate the analysis, individually and as mixtures, of 95 bacterial artificial chromosome (BAC) clones that cover the 4.7-Mb human major histocompatibility complex region. We obtain accurate, haplotype-resolved, sequence motif maps hundreds of kilobases in length, resulting in a median coverage of 114× for the BACs. The final sequence motif map assembly contains three contigs. With an average distance of 9 kb between labels, we detect 22 haplotype differences. We also use the sequence motif maps to provide scaffolds for de novo assembly of sequencing data. Nanochannel genome mapping should facilitate de novo assembly of sequencing reads from complex regions in diploid organisms, haplotype and structural variation analysis and comparative genomics.
    Nature Biotechnology 07/2012; 30(8):771-6. · 32.44 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Gastric cancer is a major cause of global cancer mortality. We surveyed the spectrum of somatic alterations in gastric cancer by sequencing the exomes of 15 gastric adenocarcinomas and their matched normal DNAs. Frequently mutated genes in the adenocarcinomas included TP53 (11/15 tumors), PIK3CA (3/15) and ARID1A (3/15). Cell adhesion was the most enriched biological pathway among the frequently mutated genes. A prevalence screening confirmed mutations in FAT4, a cadherin family gene, in 5% of gastric cancers (6/110) and FAT4 genomic deletions in 4% (3/83) of gastric tumors. Frequent mutations in chromatin remodeling genes (ARID1A, MLL3 and MLL) also occurred in 47% of the gastric cancers. We detected ARID1A mutations in 8% of tumors (9/110), which were associated with concurrent PIK3CA mutations and microsatellite instability. In functional assays, we observed both FAT4 and ARID1A to exert tumor-suppressor activity. Somatic inactivation of FAT4 and ARID1A may thus be key tumorigenic events in a subset of gastric cancers.
    Nature Genetics 04/2012; 44(5):570-4. · 35.21 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Tyrosine kinase inhibitors (TKIs) elicit high response rates among individuals with kinase-driven malignancies, including chronic myeloid leukemia (CML) and epidermal growth factor receptor-mutated non-small-cell lung cancer (EGFR NSCLC). However, the extent and duration of these responses are heterogeneous, suggesting the existence of genetic modifiers affecting an individual's response to TKIs. Using paired-end DNA sequencing, we discovered a common intronic deletion polymorphism in the gene encoding BCL2-like 11 (BIM). BIM is a pro-apoptotic member of the B-cell CLL/lymphoma 2 (BCL2) family of proteins, and its upregulation is required for TKIs to induce apoptosis in kinase-driven cancers. The polymorphism switched BIM splicing from exon 4 to exon 3, which resulted in expression of BIM isoforms lacking the pro-apoptotic BCL2-homology domain 3 (BH3). The polymorphism was sufficient to confer intrinsic TKI resistance in CML and EGFR NSCLC cell lines, but this resistance could be overcome with BH3-mimetic drugs. Notably, individuals with CML and EGFR NSCLC harboring the polymorphism experienced significantly inferior responses to TKIs than did individuals without the polymorphism (P = 0.02 for CML and P = 0.027 for EGFR NSCLC). Our results offer an explanation for the heterogeneity of TKI responses across individuals and suggest the possibility of personalizing therapy with BH3 mimetics to overcome BIM-polymorphism-associated TKI resistance.
    Nature medicine 03/2012; 18(4):521-8. · 27.14 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Structural variations (SVs) contribute significantly to the variability of the human genome and extensive genomic rearrangements are a hallmark of cancer. While genomic DNA paired-end-tag (DNA-PET) sequencing is an attractive approach to identify genomic SVs, the current application of PET sequencing with short insert size DNA can be insufficient for the comprehensive mapping of SVs in low complexity and repeat-rich genomic regions. We employed a recently developed procedure to generate PET sequencing data using large DNA inserts of 10-20 kb and compared their characteristics with short insert (1 kb) libraries for their ability to identify SVs. Our results suggest that although short insert libraries bear an advantage in identifying small deletions, they do not provide significantly better breakpoint resolution. In contrast, large inserts are superior to short inserts in providing higher physical genome coverage for the same sequencing cost and achieve greater sensitivity, in practice, for the identification of several classes of SVs, such as copy number neutral and complex events. Furthermore, our results confirm that large insert libraries allow for the identification of SVs within repetitive sequences, which cannot be spanned by short inserts. This provides a key advantage in studying rearrangements in cancer, and we show how it can be used in a fusion-point-guided-concatenation algorithm to study focally amplified regions in cancer.
    PLoS ONE 01/2012; 7(9):e46152. · 3.53 Impact Factor
  • Source
    Song Gao, Wing-Kin Sung, Niranjan Nagarajan
    [Show abstract] [Hide abstract]
    ABSTRACT: Scaffolding, the problem of ordering and orienting contigs, typically using paired-end reads, is a crucial step in the assembly of high-quality draft genomes. Even as sequencing technologies and mate-pair protocols have improved significantly, scaffolding programs still rely on heuristics, with no guarantees on the quality of the solution. In this work, we explored the feasibility of an exact solution for scaffolding and present a first tractable solution for this problem (Opera). We also describe a graph contraction procedure that allows the solution to scale to large scaffolding problems and demonstrate this by scaffolding several large real and synthetic datasets. In comparisons with existing scaffolders, Opera simultaneously produced longer and more accurate scaffolds demonstrating the utility of an exact approach. Opera also incorporates an exact quadratic programming formulation to precisely compute gap sizes (Availability: http://sourceforge.net/projects/operasf/ ).
    Journal of computational biology: a journal of computational molecular cell biology 09/2011; 18(11):1681-91. · 1.69 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA-PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers.
    Genome Research 04/2011; 21(5):665-75. · 14.40 Impact Factor
  • Source
    Niranjan Nagarajan, Carl Kingsford
    [Show abstract] [Hide abstract]
    ABSTRACT: Reassortments in the influenza virus--a process where strains exchange genetic segments--have been implicated in two out of three pandemics of the 20th century as well as the 2009 H1N1 outbreak. While advances in sequencing have led to an explosion in the number of whole-genome sequences that are available, an understanding of the rate and distribution of reassortments and their role in viral evolution is still lacking. An important factor in this is the paucity of automated tools for confident identification of reassortments from sequence data due to the challenges of analyzing large, uncertain viral phylogenies. We describe here a novel computational method, called GiRaF (Graph-incompatibility-based Reassortment Finder), that robustly identifies reassortments in a fully automated fashion while accounting for uncertainties in the inferred phylogenies. The algorithms behind GiRaF search large collections of Markov chain Monte Carlo (MCMC)-sampled trees for groups of incompatible splits using a fast biclique enumeration algorithm coupled with several statistical tests to identify sets of taxa with differential phylogenetic placement. GiRaF correctly finds known reassortments in human, avian, and swine influenza populations, including the evolutionary events that led to the recent 'swine flu' outbreak. GiRaF also identifies several previously unreported reassortments via whole-genome studies to catalog events in H5N1 and swine influenza isolates.
    Nucleic Acids Research 12/2010; 39(6):e34. · 8.81 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Molecular studies of microbial diversity have provided many insights into the bacterial communities inhabiting the human body and the environment. A common first step in such studies is a survey of conserved marker genes (primarily 16S rRNA) to characterize the taxonomic composition and diversity of these communities. To date, however, there exists significant variability in analysis methods employed in these studies. Here we provide a critical assessment of current analysis methodologies that cluster sequences into operational taxonomic units (OTUs) and demonstrate that small changes in algorithm parameters can lead to significantly varying results. Our analysis provides strong evidence that the species-level diversity estimates produced using common OTU methodologies are inflated due to overly stringent parameter choices. We further describe an example of how semi-supervised clustering can produce OTUs that are more robust to changes in algorithm parameters. Our results highlight the need for systematic and open evaluation of data analysis methodologies, especially as targeted 16S rRNA diversity studies are increasingly relying on high-throughput sequencing technologies. All data and results from our study are available through the JGI FAMeS website http://fames.jgi-psf.org/.
    BMC Bioinformatics 03/2010; 11:152. · 3.02 Impact Factor

Publication Stats

1k Citations
340.15 Total Impact Points

Institutions

  • 2010–2014
    • Genome Institute of Singapore
      • Computational and Systems Biology Group
      Tumasik, Singapore
  • 2008–2013
    • University of Maryland, College Park
      • • Department of Computer Science
      • • Center for Bioinformatics and Computational Biology
      Maryland, United States
  • 2012
    • Huazhong Agricultural University
      Wu-han-shih, Hubei, China
  • 2011
    • National University of Singapore
      • NUS Graduate School for Integrative Sciences and Engineering
      Singapore, Singapore
  • 2003–2009
    • Cornell University
      • Computer Science
      Ithaca, NY, United States