Carl F Schaefer

Bar Ilan University, Gan, Tel Aviv, Israel

Publications (33) · 277.38 Total impact

  • Cancer genomics & proteomics 01/2014; 11(1):1-12. · 1.86 Impact Factor
  • ABSTRACT: The development and progression of cancer is associated with disruption of biological networks. Historically, studies have identified sets of signature genes involved in events ultimately leading to the development of cancer. Identification of such sets does not indicate which biologic processes are oncogenic drivers and makes it difficult to identify key networks to target for interventions. Using a comprehensive, integrated computational approach, the authors identify the sonic hedgehog (SHH) pathway as the gene network that most significantly distinguishes tumour and tumour-adjacent samples in human hepatocellular carcinoma (HCC). The analysis reveals that the SHH pathway is commonly activated in the tumour samples and its activity most significantly differentiates tumour from the non-tumour samples. The authors experimentally validate these in silico findings in the same biologic material using Western blot analysis. This analysis reveals that the expression levels of SHH, phosphorylated cyclin B1, and CDK7 are much higher in most tumour tissues as compared to normal tissue. It is also shown that siRNA-mediated silencing of SHH gene expression resulted in a significant reduction of cell proliferation in a liver cancer cell line, SNU449, indicating that SHH plays a major role in promoting cell proliferation in liver cancer. The SHH pathway is a key network underpinning HCC aetiology, which may guide the development of interventions for this most common form of human liver cancer.
    IET Systems Biology 12/2013; 7(6):243-51. · 1.67 Impact Factor
  • Nature Biotechnology 04/2012; 30(4):365. · 39.08 Impact Factor
  • ABSTRACT: High resolution, system-wide characterizations have demonstrated the capacity to identify genomic regions that undergo genomic aberrations. Such research efforts often aim at associating these regions with disease etiology and outcome. Identifying the corresponding biologic processes that are responsible for disease and its outcome remains challenging. Using novel analytic methods that utilize the structure of biologic networks, we are able to identify the specific networks that are highly significantly, nonrandomly altered by regions of copy number amplification observed in a systems-wide analysis. We demonstrate this method in breast cancer, where the state of a subset of the pathways identified through these regions is shown to be highly associated with disease survival and recurrence.
    PLoS ONE 01/2011; 6(1):e14437. · 3.53 Impact Factor
  • ABSTRACT: The PathOlogist is a new tool designed to transform large sets of gene expression data into quantitative descriptors of pathway-level behavior. The tool aims to provide a robust alternative to the search for single-gene-to-phenotype associations by accounting for the complexity of molecular interactions. Molecular abundance data is used to calculate two metrics, 'activity' and 'consistency', for each pathway in a set of more than 500 canonical molecular pathways (source: Pathway Interaction Database, http://pid.nci.nih.gov). The tool then allows a detailed exploration of these metrics through integrated visualization of pathway components and structure, hierarchical clustering of pathways and samples, and statistical analyses designed to detect associations between pathway behavior and clinical features. The PathOlogist provides a straightforward means to identify the functional processes, rather than individual molecules, that are altered in disease. The statistical power and biologic significance of this approach are made easily accessible to laboratory researchers and informatics analysts alike. Here we show, as an example, how the PathOlogist can be used to establish pathway signatures that robustly differentiate breast cancer cell lines based on response to treatment.
    BMC Bioinformatics 01/2011; 12:133. · 2.67 Impact Factor
  • ABSTRACT: Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.
    Nature Biotechnology 09/2010; 28(9):935-42. · 39.08 Impact Factor
  • NCI Nature Pathway Interaction Database 03/2010;
  • ABSTRACT: Recent publications have described and applied a novel metric that quantifies the genetic distance of an individual with respect to two population samples, and have suggested that the metric makes it possible to infer the presence of an individual of known genotype in a sample for which only the marginal allele frequencies are known. However, the assumptions, limitations, and utility of this metric remained incompletely characterized. Here we present empirical tests of the method using publicly accessible genotypes, as well as analytical investigations of the method's strengths and limitations. The results reveal that the null distribution is sensitive to the underlying assumptions, making it difficult to accurately calibrate thresholds for classifying an individual as a member of the population samples. As a result, the false-positive rates obtained in practice are considerably higher than previously believed. However, despite the metric's inadequacies for identifying the presence of an individual in a sample, our results suggest potential avenues for future research on tuning this method to problems of ancestry inference or disease prediction. By revealing both the strengths and limitations of the proposed method, we hope to elucidate situations in which this distance metric may be used in an appropriate manner. We also discuss the implications of our findings in forensics applications and in the protection of GWAS participant privacy.
    PLoS Genetics 10/2009; 5(10):e1000668. · 8.17 Impact Factor
  • ABSTRACT: Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.
    Genome Research 09/2009; 19(12):2324-33. · 13.85 Impact Factor
  • ABSTRACT: Microorganisms have been associated with many types of human diseases; however, a significant number of clinically important microbial pathogens remain to be discovered. We have developed a genome-wide approach, called Digital Karyotyping Microbe Identification (DK-MICROBE), to identify genomic DNA of bacteria and viruses in human disease tissues. This method involves the generation of an experimental DNA tag library through Digital Karyotyping (DK) followed by analysis of the tag sequences for the presence of microbial DNA content using a compiled microbial DNA virtual tag library. To validate this technology and to identify pathogens that may be associated with human cancer pathogenesis, we used DK-MICROBE to determine the presence of microbial DNA in 58 human tumor samples, including brain, ovarian, and colorectal cancers. We detected DNA from Human herpesvirus 6 (HHV-6) in a DK library of a colorectal cancer liver metastasis and in normal tissue from the same patient. DK-MICROBE can identify previously unknown infectious agents in human tumors, and is now available for further applications for the identification of pathogen DNA in human cancer and other diseases.
    BMC Medical Genomics 02/2009; 2:22. · 3.91 Impact Factor
  • ABSTRACT: Background: Microorganisms have been associated with many types of human diseases; however, a significant number of clinically important microbial pathogens remain to be discovered. Methods: We have developed a genome-wide approach, called Digital Karyotyping Microbe Identification (DK-MICROBE), to identify genomic DNA of bacteria and viruses in human disease tissues. This method involves the generation of an experimental DNA tag library through Digital Karyotyping (DK) followed by analysis of the tag sequences for the presence of microbial DNA content using a compiled microbial DNA virtual tag library. Results: To validate this technology and to identify pathogens that may be associated with human cancer pathogenesis, we used DK-MICROBE to determine the presence of microbial DNA in 58 human tumor samples, including brain, ovarian, and colorectal cancers. We detected DNA from Human herpesvirus 6 (HHV-6) in a DK library of a colorectal cancer liver metastasis and in normal tissue from the same patient. Conclusion: DK-MICROBE can identify previously unknown infectious agents in human tumors, and is now available for further applications for the identification of pathogen DNA in human cancer and other diseases.
    BMC Medical Genomics 01/2009; · 3.91 Impact Factor
  • ABSTRACT: The Pathway Interaction Database (PID, http://pid.nci.nih.gov) is a freely available collection of curated and peer-reviewed pathways composed of human molecular signaling and regulatory events and key cellular processes. Created in a collaboration between the US National Cancer Institute and Nature Publishing Group, the database serves as a research tool for the cancer research community and others interested in cellular pathways, such as neuroscientists, developmental biologists and immunologists. PID offers a range of search features to facilitate pathway exploration. Users can browse the predefined set of pathways or create interaction network maps centered on a single molecule or cellular process of interest. In addition, the batch query tool allows users to upload long list(s) of molecules, such as those derived from microarray experiments, and either overlay these molecules onto predefined pathways or visualize the complete molecular connectivity map. Users can also download molecule lists, citation lists and complete database content in extensible markup language (XML) and Biological Pathway Exchange (BioPAX) Level 2 format. The database is updated with new pathway content every month and supplemented by specially commissioned articles on the practical uses of other relevant online tools.
    Nucleic Acids Research 11/2008; 37(Database issue):D674-9. · 8.81 Impact Factor
  • ABSTRACT: Human cancer cells typically harbour multiple chromosomal aberrations, nucleotide substitutions and epigenetic modifications that drive malignant transformation. The Cancer Genome Atlas (TCGA) pilot project aims to assess the value of large-scale multi-dimensional analysis of these molecular characteristics in human cancer and to provide the data rapidly to the research community. Here we report the interim integrative analysis of DNA copy number, gene expression and DNA methylation aberrations in 206 glioblastomas (the most common type of adult brain cancer) and nucleotide sequence aberrations in 91 of the 206 glioblastomas. This analysis provides new insights into the roles of ERBB2, NF1 and TP53, uncovers frequent mutations of the phosphatidylinositol-3-OH kinase regulatory subunit gene PIK3R1, and provides a network view of the pathways altered in the development of glioblastoma. Furthermore, integration of mutation, DNA methylation and clinical treatment data reveals a link between MGMT promoter methylation and a hypermutator phenotype consequent to mismatch repair deficiency in treated glioblastomas, an observation with potential clinical implications. Together, these findings establish the feasibility and power of TCGA, demonstrating that it can rapidly expand knowledge of the molecular basis of cancer.
    Nature 10/2008; · 42.35 Impact Factor
  • ABSTRACT: The Pathway Interaction Database (PID, http://pid.nci.nih.gov) is a freely available collection of curated and peer-reviewed signaling pathways composed of human biomolecular interactions and cellular processes. Created in a collaboration between the U.S. National Cancer Institute and Nature Publishing Group, the database is a research tool for cell biologists, biochemists, computational biologists and bioinformaticians. The PID offers a range of tools to facilitate pathway exploration. Users can browse the pre-defined set of pathways and also create interaction network maps centered on a single molecule of interest or an extensive list of molecules. In addition, users can download complete data sets in extensible markup language (XML) and Biological Pathway Exchange (BioPAX) Level 2 formats. The database is updated every month and supplemented by a concise editorial section that provides synopses of recent noteworthy papers in cell signaling and specially commissioned articles on the practical uses of other relevant online tools. Users can sign up for free email alerts or RSS feeds to receive database updates.
    Cancer Research 11/2007; · 9.28 Impact Factor
  • Sol Efroni, Carl F Schaefer, Kenneth H Buetow
    ABSTRACT: Cancer is recognized to be a family of gene-based diseases whose causes are to be found in disruptions of basic biologic processes. An increasingly deep catalogue of canonical networks details the specific molecular interaction of genes and their products. However, mapping of disease phenotypes to alterations of these networks of interactions is accomplished indirectly and non-systematically. Here we objectively identify pathways associated with malignancy, staging, and outcome in cancer through application of an analytic approach that systematically evaluates differences in the activity and consistency of interactions within canonical biologic processes. Using large collections of publicly accessible genome-wide gene expression, we identify small, common sets of pathways (TrkA Receptor, Apoptosis response to DNA Damage, Ceramide, Telomerase, CD40L and Calcineurin) whose differences robustly distinguish diverse tumor types from corresponding normal samples, predict tumor grade, and distinguish phenotypes such as estrogen receptor status and p53 mutation state. Pathways identified through this analysis perform as well as or better than phenotypes used in the original studies in predicting cancer outcome. This approach provides a means to use genome-wide characterizations to map key biological processes to important clinical features in disease.
    PLoS ONE 02/2007; 2(5):e425. · 3.53 Impact Factor
  • ABSTRACT: Cancers have been described as wounds that do not heal, suggesting that the two share common features. By comparing microarray data from a model of renal regeneration and repair (RRR) with reported gene expression in renal cell carcinoma (RCC), we asked whether those two processes do, in fact, share molecular features and regulatory mechanisms. The majority (77%) of the genes expressed in RRR and RCC were concordantly regulated, whereas only 23% were discordant (i.e., changed in opposite directions). The orchestrated processes of regeneration, involving cell proliferation and immune response, were reflected in the concordant genes. The discordant gene signature revealed processes (e.g., morphogenesis and glycolysis) and pathways (e.g., hypoxia-inducible factor and insulin-like growth factor-I) that reflect the intrinsic pathologic nature of RCC. This is the first study that compares gene expression patterns in RCC and RRR. It does so, in particular, with relation to the hypothesis that RCC resembles the wound healing processes seen in RRR. However, careful attention to the genes that are regulated in the discordant direction provides new insights into the critical differences between renal carcinogenesis and wound healing. The observations reported here provide a conceptual framework for further efforts to understand the biology and to develop more effective diagnostic biomarkers and therapeutic strategies for renal tumors and renal ischemia.
    Cancer Research 08/2006; 66(14):7216-24. · 9.28 Impact Factor
  • ABSTRACT: Membrane proteins are responsible for many critical cellular functions and identifying cell surface proteins on different keratinocyte populations by proteomic approaches would improve our understanding of their biological function. The ability to characterize membrane proteins, however, has lagged behind that of soluble proteins both in terms of throughput and protein coverage. In this study, a membrane proteomic investigation of keratinocytes using a two-dimensional liquid chromatography (LC) tandem-mass spectrometry (MS/MS) approach that relies on a buffered methanol-based solubilization, and tryptic digestion of purified plasma membrane is described. A highly enriched plasma membrane fraction was prepared from newborn foreskins using sucrose gradient centrifugation, followed by a single-tube solubilization and tryptic digestion of membrane proteins. This digestate was fractionated by strong cation-exchange chromatography and analyzed using microcapillary reversed-phase LC-MS/MS. In a set of 1306 identified proteins, 866 had a gene ontology (GO) annotation for cellular component, and 496 of these annotated proteins (57.3%) were assigned as known integral membrane proteins or membrane-associated proteins. Included in the identification of a large number of aqueous insoluble integral membrane proteins were many known intercellular adhesion proteins and gap junction proteins. Furthermore, 121 proteins from cholesterol-rich plasma membrane domains (caveolar and lipid rafts) were identified.
    Journal of Investigative Dermatology 11/2004; 123(4):691-9. · 6.37 Impact Factor
  • ABSTRACT: The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
    Genome Research 11/2004; 14(10B):2121-7. · 13.85 Impact Factor
  • Carl F Schaefer
    ABSTRACT: Network representations of biological pathways offer a functional view of molecular biology that is different from and complementary to sequence, expression, and structure databases. There is currently available a wide range of digital collections of pathway data, differing in organisms included, functional area covered (e.g., metabolism vs. signaling), detail of modeling, and support for dynamic pathway construction. While it is currently impossible for these databases to communicate with each other, there are several efforts at standardizing a data exchange language for pathway data. Databases that represent pathway data at the level of individual interactions make it possible to combine data from different predefined pathways and to query by network connectivity. Computable representations of pathways provide a basis for various analyses, including detection of broad network patterns, comparison with mRNA or protein abundance, and simulation.
    Annals of the New York Academy of Sciences 06/2004; 1020:77-91. · 4.31 Impact Factor
  • ABSTRACT: A combined, detergent- and organic solvent-based proteomic method for the analysis of detergent-resistant membrane rafts (DRMR) is described. These specialized domains of the plasma membrane contain a distinctive and dynamic protein and/or lipid complement, which can be isolated from most mammalian cells. Lipid rafts are predominantly involved in signal transduction and adapted to mediate and produce different cellular responses. To facilitate a better understanding of their biology and role, DRMR were isolated from Vero cells as a Triton X-100 insoluble fraction. After detergent removal, sonication in 60% buffered methanol was used to extract, solubilize and tryptically digest the resulting protein complement. The peptide digestate was analyzed by microcapillary reversed-phase liquid chromatography-tandem mass spectrometry. Gas-phase fractionation in the mass-to-charge range was employed to broaden the selection of precursor ions and increase the number of identifications in an effort to detect less abundant proteins. A total of 380 proteins were identified including all known lipid raft markers. A total of 91 (24%) proteins were classified as integral alpha-helical membrane proteins, of which 51 (56%) were predicted to have multiple transmembrane domains.
    Electrophoresis 06/2004; 25(9):1307-18. · 3.16 Impact Factor

Publication Stats

4k Citations
277.38 Total Impact Points

Institutions

  • 2011
    • Bar Ilan University
      Gan, Tel Aviv, Israel
  • 2004–2011
    • National Institutes of Health
      • Laboratory of Genetics (LG)
      Bethesda, MD, United States
  • 2001–2009
    • National Cancer Institute (USA)
      • Division of Cancer Prevention
      Maryland, United States