Ho-Lim Fung’s research while affiliated with University of California, San Diego and other places


Publications (27)


Supplementary Material
  • Data

April 2017 · 17 Reads · Shicheng Guo · [...]

  • Supplementary Table 1: A complete list of MHBs
  • Supplementary Table 2: Gene ontology of MHBs that loss methylation linkage in cancers.
  • Supplementary Table 3: Tissue specific MHBs for classification of normal tissues.
  • Supplementary Table 4: Layer specific MHBs with group specificity index.
  • Supplementary Table 5: Gene ontology analysis of TFs bound MHBs.
  • Supplementary Table 6: Cancer-associated High-Methylation-Haplotype (caHMH) regions.
  • Supplementary Table 7: Deconvolution of plasma samples by 10 normal tissues, LCT, and CCT.
  • Supplementary Table 8: Differential MHLs between cancer plasma and normal plasma.
  • Supplementary Table 9: Group II MHB regions for estimation of cancer DNA proportion.
  • Supplementary Table 10: Relationship between average cancer fraction and cell-free DNA extraction yield.
  • Supplementary Table 11: Predictors for colon cancer, lung cancer, and normal plasma.
  • Supplementary Table 12: Prediction accuracy based on tsMHB counting with 5-fold cross-validation.
  • Supplementary Table 13: Information of all samples used in this study.


Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA

March 2017 · 1,084 Reads · 402 Citations · Nature Genetics

Adjacent CpG sites in mammalian genomes can be co-methylated owing to the processivity of methyltransferases or demethylases, yet discordant methylation patterns have also been observed, which are related to stochastic or uncoordinated molecular processes. We focused on a systematic search and investigation of regions in the full human genome that show highly coordinated methylation. We defined 147,888 blocks of tightly coupled CpG sites, called methylation haplotype blocks, after analysis of 61 whole-genome bisulfite sequencing data sets and validation with 101 reduced-representation bisulfite sequencing data sets and 637 methylation array data sets. Using a metric called methylation haplotype load, we performed tissue-specific methylation analysis at the block level. Subsets of informative blocks were further identified for deconvolution of heterogeneous samples. Finally, using methylation haplotypes we demonstrated quantitative estimation of tumor load and tissue-of-origin mapping in the circulating cell-free DNA of 59 patients with lung or colorectal cancer.
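The abstract above names a block-level metric, methylation haplotype load, without giving its formula. The sketch below is one plausible reading, assuming the metric is a length-weighted average of the fraction of fully methylated runs of each length within a block. The function name, the `'1'`/`'0'` string encoding of reads, and the weights `w_i = i` are assumptions made for illustration, not details taken from this page.

```python
from typing import Sequence


def methylation_haplotype_load(haplotypes: Sequence[str]) -> float:
    """Length-weighted fraction of fully methylated runs in one block.

    Each haplotype is a string over {'1', '0'} (methylated / unmethylated),
    one character per CpG site, one string per sequencing read.
    Assumed definition: sum_i(i * P_i) / sum_i(i), where P_i is the fraction
    of length-i substrings that are fully methylated.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")

    weighted_sum = 0.0
    weight_total = 0
    for run in range(1, length + 1):
        total = 0       # all substrings of this length across reads
        methylated = 0  # substrings that are methylated at every site
        for hap in haplotypes:
            for start in range(length - run + 1):
                total += 1
                if hap[start:start + run] == "1" * run:
                    methylated += 1
        weighted_sum += run * (methylated / total)
        weight_total += run
    return weighted_sum / weight_total
```

Under this assumed definition, two reads `"1111"` and `"0000"` score 0.5, while `"1010"` and `"0101"` score 0.05 even though both pairs have the same average methylation. That difference is what lets a haplotype-level metric separate coordinated from discordant methylation.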


Fig. 1. SNS identified 16 neuronal subtypes over six neocortical regions. (A) Overview of SNS pipeline. Postmortem tissue from BAs 8, 10, 17, 21, 22, and 41/42 were dissociated to single nuclei for NeuN + and 4′,6-diamidino-2-phenylindole + (DAPI + ) sorting and capture on C1 chips. Resultant libraries were sequenced, mapped to the reference genome (pie chart showing averaged proportions), and screened for doublet removal before clustering and classification. BA proportions are shown. FC, frontal cortex; TC, temporal cortex; VC, visual cortex. (B) Neuronal subtypes (Ex and In) shown with multidimensional plotting by using 10-fold or greater differentially expressed genes (table S3). NoN (no nomenclature), low-expression outlier cluster. (C) Heatmap showing distinct marker gene expression (table S5).  
Fig. 2. SNS reveals distinct interneuron subtypes. (A) Pie charts display relative proportions of subtypes among BAs and FOP heatmaps for In and Ex marker genes. (B) Diagram of subpallial origins of interneurons from either the LGE or MGE with FOP heatmaps [scale as in (A)] for marker genes associated with cortical layer (L) (top), subpallial origin (middle), and interneuron classification (bottom). Potential interneuron subtypes are indicated below. SOM,  
Fig. 4. Neuronal subtypes reveal heterogeneity among BAs. (A) Multidimensional plot showing projection neuron subtypes distributed according to their predicted cortical layer (L) identity. Layer 4 Ex2 and Ex3 subtypes are indicated. (B) Clusters shown in (A) colored by BA and with BA41/42 and BA17 subpopulations of Ex3 indicated. (C) Violin plots showing differentially expressed genes between Ex2 and Ex3 subtypes (table S8). (D) Heatmap showing genes differentially expressed between BA17 and BA41/42 within the Ex3 subtype (table S10).  
Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain
  • Article
  • Full-text available

June 2016 · 1,038 Reads · 860 Citations · Science

Single-nucleus gene expression

Identifying the genes expressed at the level of a single cell nucleus can help us better understand the human brain. Lake et al. developed a single-nuclei sequencing technique, which they applied to cells in classically defined Brodmann areas from a postmortem brain. Clustering of gene expression showed concordance with the area of origin and defined 16 neuronal subtypes. Both excitatory and inhibitory neuronal subtypes show regional variations that define distinct cortical areas and exhibit how gene expression clusters may distinguish between distinct cortical areas. This method opens the door to widespread sampling of the genes expressed in a diseased brain and other tissues of interest. Science, this issue p. 1586

Characterization of chromatin accessibility with a transposome hypersensitive sites sequencing (THS-seq) assay

February 2016 · 743 Reads · 55 Citations · Genome Biology

Chromatin accessibility captures in vivo protein-chromosome binding status, and is considered an informative proxy for protein-DNA interactions. DNase I and Tn5 transposase assays require thousands to millions of fresh cells for comprehensive chromatin mapping. Applying Tn5 tagmentation to hundreds of cells results in sparse chromatin maps. We present a transposome hypersensitive sites sequencing assay for highly sensitive characterization of chromatin accessibility. Linear amplification of accessible DNA ends with in vitro transcription, coupled with an engineered Tn5 super-mutant, demonstrates improved sensitivity on limited input materials, and accessibility of small regions near distal enhancers, compared with ATAC-seq.

Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-0882-7) contains supplementary material, which is available to authorized users.


Characterization of Genome-Methylome Interactions in 22 Nuclear Pedigrees

July 2014 · 450 Reads · 14 Citations

Genetic polymorphisms can shape the global landscape of DNA methylation, by either changing substrates for DNA methyltransferases or altering the DNA binding affinity of cis-regulatory proteins. The interactions between CpG methylation and genetic polymorphisms have been previously investigated by methylation quantitative trait loci (mQTL) and allele-specific methylation (ASM) analysis. However, it remains unclear whether these approaches can effectively and comprehensively identify all genetic variants that contribute to the inter-individual variation of DNA methylation levels. Here we used three independent approaches to systematically investigate the influence of genetic polymorphisms on variability in DNA methylation by characterizing the methylation state of 96 whole blood samples in 52 parent-child trios from 22 nuclear pedigrees. We performed targeted bisulfite sequencing with padlock probes to quantify the absolute DNA methylation levels at a set of 411,800 CpG sites in the human genome. With mid-parent offspring analysis (MPO), we identified 10,593 CpG sites that exhibited heritable methylation patterns, among which 70.1% were SNPs directly present in methylated CpG dinucleotides. We determined that mQTL analysis identified 49.9% of heritable CpG sites for which regulation occurred in a distal cis-regulatory manner, and that ASM analysis was only able to identify 5%. Finally, we identified hundreds of clusters in the human genome for which the degree of variation of CpG methylation, as opposed to whether or not CpG sites were methylated, was associated with genetic polymorphisms, supporting a recent hypothesis on the genetic influence of phenotypic plasticity. These results show that cis-regulatory SNPs identified by mQTL do not comprise the full extent of heritable CpG methylation, and that ASM appears overall unreliable.
Overall, the extent of genome-methylome interactions is well beyond what is detectable with the commonly used mQTL and ASM approaches, and is likely to include effects on plasticity.
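The mid-parent offspring (MPO) analysis mentioned in the abstract above is not spelled out on this page. As a rough illustration, the sketch below uses the textbook form of the idea: regress each child's methylation level at one CpG site on the average of its two parents, and read the slope as a heritability estimate. The function name and inputs are hypothetical, and the authors' actual statistical model may differ.

```python
from typing import Sequence


def midparent_offspring_slope(
    father: Sequence[float],
    mother: Sequence[float],
    child: Sequence[float],
) -> float:
    """Least-squares slope of child methylation on mid-parent methylation.

    Each position i across the three sequences is one trio, measured at the
    same CpG site. A slope near 1 suggests strongly heritable methylation;
    a slope near 0 suggests little parent-offspring resemblance.
    """
    if not (len(father) == len(mother) == len(child)) or len(child) < 2:
        raise ValueError("need at least two complete trios")

    # Mid-parent value: the mean of the two parental methylation levels.
    mid = [(f + m) / 2 for f, m in zip(father, mother)]
    mean_x = sum(mid) / len(mid)
    mean_y = sum(child) / len(child)

    sxx = sum((x - mean_x) ** 2 for x in mid)
    if sxx == 0:
        raise ValueError("mid-parent values have no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mid, child))
    return sxy / sxx
```

In practice one such slope would be computed per CpG site and followed by a significance test and multiple-testing correction, neither of which is shown here.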




Figure 1: MIDAS. (a) Each slide contains 16 arrays of 255 microwells each. Cells, lysis solution, denaturing buffer, neutralization buffer and MDA master mix were each added to the microwells with a single pipette pump. Amplicon growth was then visualized with a fluorescent microscope using a real-time MDA system. Microwells showing increasing fluorescence over time were positive amplicons. The amplicons were extracted with fine glass pipettes attached to a micromanipulation system. (b) Scanning electron microscopy of a single E. coli cell displayed at different magnifications. This particular well contains only one cell, and most wells observed also contained no more than one cell. (c) A custom microscope incubation chamber was used for real time MDA. The chamber was temperature and humidity controlled to mitigate evaporation of reagents. Additionally, it prevented contamination during amplicon extraction because the micromanipulation system was self-contained. An image of the entire microwell array is also shown, as well as a micropipette probing a well. (d) Complex three-dimensional MDA amplicons were reduced to linear DNA using DNA polymerase I and Ampligase. This process substantially improved the complexity of the library during sequencing.
Figure 2: Depth of coverage of assembled contigs aligned to the reference E. coli genome. Three single E. coli cells were analyzed using MIDAS. Between 88% and 94% of the genome was assembled from 2–8 million paired-end 100-bp reads. Each colored circle is a histogram of the log2 of average depth of coverage across each assembled contig for one cell. Gaps are represented by blank whitespace in between colored contigs.
Figure 3: Genomic coverage of single cells amplified by MDA in a tube and by MIDAS. The observed multipeak profile for the MDA reactions implies that certain regions may have been amplified with exponentially greater bias compared to the majority of the genome. (a) Comparison of single E. coli cells amplified in a PCR tube for 10 h (top), 2 h (middle) and in a microwell (MIDAS) for 10 h (bottom). Genomic positions were consolidated into 1-kb bins (x axis), and were plotted against the log10 ratio (y axis) of genomic coverage (normalized to the mean). (b) Distribution of coverage of amplified single bacterial cells. The x axis shows the log10 ratio of genomic coverage normalized to the mean. (c) Comparison of single human cells amplified using traditional MDA in a PCR tube for 10 h (top) or in a microwell (MIDAS) for 10 h (middle) to a pool of unamplified human cells (bottom). Genomic positions were consolidated into variable bins of ~60 kb in size, previously determined to contain a similar read count28, and were plotted against the log10 ratio (y axis) of genomic coverage (normalized to the mean). (d) Distribution of coverage of amplified single mammalian cells. The x axis shows the log10 ratio of genomic coverage normalized to the mean.
Figure 4: Detection of CNVs. Genomic positions were consolidated into bins of ~60 kb in size which were previously determined to contain a similar read count28. Estimated copy numbers below were rounded to the nearest whole number. (a) CNVs in a Down syndrome single cell analyzed with MIDAS. The x axis shows genomic position. The y axis shows (on a log2 scale) the estimated copy number as a red line. The arrow indicates trisomy 21, which is clearly visible in this single cell. (b) CNVs in a Down syndrome single cell analyzed with traditional in-tube MDA. The x axis shows genomic position. The y axis shows (on a log2 scale) the estimated copy number as a red line. The arrow marks the expected region of trisomy 21, which is not detectable in these data. (c) CNVs in a Down syndrome single cell with trisomy 21 'spike-ins'. The x axis shows genomic position. The y axis shows (on a log2 scale) the estimated copy number as a red line. At each arrow, before CNV calling, data from a randomly determined 2 Mb section of trisomy chromosome 21 were computationally inserted into the genome, simulating a small gain-of-single-copy event. At each location, a CNV was called, showing that MIDAS can detect 2-Mb CNV accurately. (d) CNV in a Down syndrome single cell with trisomy 21 spike-ins. The x axis shows genomic position. The y axis shows (on a log2 scale) the estimated copy number as a red line. At each arrow, before CNV calling, data from a randomly determined 2 Mb section of trisomy chromosome 21 was computationally inserted into the genome, simulating a small gain-of-single-copy event.
Figure 5: Comparison of MIDAS to previously published data for in-tube MDA32, microfluidic MDA10 and MALBAC33 for diploid regions of pools of two sperm cells and diploid regions of a single SW480 cancer cell processed using MALBAC18. Genomic positions were consolidated into variable bins of ~60 kb in size previously determined to contain a similar read count28 and were plotted against the log10 ratio (y axis) of genomic coverage (normalized to the mean). For the cancer cell data, nondiploid regions have been masked out (white gaps between pink) to remove the bias generated by comparing a highly aneuploid cell to a primarily diploid cell.
Massively parallel polymerase cloning and genome sequencing of single cells using nanoliter microwells

November 2013 · 355 Reads · 230 Citations · Nature Biotechnology

Genome sequencing of single cells has a variety of applications, including characterizing difficult-to-culture microorganisms and identifying somatic mutations in single cells from mammalian tissues. A major hurdle in this process is the bias in amplifying the genetic material from a single cell, a procedure known as polymerase cloning. Here we describe the microwell displacement amplification system (MIDAS), a massively parallel polymerase cloning method in which single cells are randomly distributed into hundreds to thousands of nanoliter wells and their genetic material is simultaneously amplified for shotgun sequencing. MIDAS reduces amplification bias because polymerase cloning occurs in physically separated, nanoliter-scale reactors, facilitating the de novo assembly of near-complete microbial genomes from single Escherichia coli cells. In addition, MIDAS allowed us to detect single-copy number changes in primary human adult neurons at 1- to 2-Mb resolution. MIDAS can potentially further the characterization of genomic diversity in many heterogeneous cell populations.
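The figure captions for this entry describe consolidating genomic positions into bins and plotting the log10 ratio of coverage normalized to the mean, a common way to visualize amplification bias. The sketch below shows that calculation for fixed-size bins. The function name, the use of read start positions as the coverage measure, and the handling of empty bins are assumptions for illustration; the variable-size ~60-kb bins used for the human data are not reproduced.

```python
import math
from typing import List, Optional, Sequence


def binned_log10_coverage(
    read_starts: Sequence[int],
    genome_length: int,
    bin_size: int = 1000,
) -> List[Optional[float]]:
    """Per-bin log10(read count / mean read count per bin).

    Values near 0 indicate even coverage; large positive or negative values
    indicate over- or under-amplified regions. Bins with no reads return
    None because the log ratio is undefined there.
    """
    if genome_length <= 0 or bin_size <= 0:
        raise ValueError("genome_length and bin_size must be positive")

    n_bins = math.ceil(genome_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        counts[pos // bin_size] += 1

    mean = sum(counts) / n_bins
    if mean == 0:
        raise ValueError("no reads supplied")
    return [math.log10(c / mean) if c > 0 else None for c in counts]
```

A histogram of the returned values corresponds to the coverage distributions described in panels (b) and (d) of Figure 3: a narrow single peak around 0 would indicate more uniform amplification.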


Genome-wide Analysis Reveals TET- and TDG-Dependent 5-Methylcytosine Oxidation Dynamics

April 2013 · 79 Reads · 466 Citations · Cell

TET dioxygenases successively oxidize 5-methylcytosine (5mC) in mammalian genomes to 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC). 5fC/5caC can be excised and repaired to regenerate unmodified cytosines by thymine-DNA glycosylase (TDG) and base excision repair (BER) pathway, but it is unclear to what extent and at which part of the genome this active demethylation process takes place. Here, we have generated genome-wide distribution maps of 5hmC/5fC/5caC using modification-specific antibodies in wild-type and Tdg-deficient mouse embryonic stem cells (ESCs). In wild-type mouse ESCs, 5fC/5caC accumulates to detectable levels at major satellite repeats but not at nonrepetitive loci. In contrast, Tdg depletion in mouse ESCs causes marked accumulation of 5fC and 5caC at a large number of proximal and distal gene regulatory elements. Thus, these results reveal the genome-wide view of iterative 5mC oxidation dynamics and indicate that TET/TDG-dependent active DNA demethylation process occurs extensively in the mammalian genome.


Figure 1: Mutated alleles are expressed in hiPSC lines. Sanger chromatograms showing the results of RNA Sequencing analysis performed on the indicated genes found mutated in the indicated hiPSC lines. Dashed lines highlight the point-mutated nucleotide. Note the expression of both reference and mutated alleles in all cases analyzed.
Figure 2: Evaluation of the functional effect of hiPSC mutations on reprogramming efficiency. (a,b) Human BJ fibroblasts were infected with retroviruses encoding OSKC, and either lentiviruses encoding shRNAs against the indicated proteins (a) or retroviruses encoding the wild type or mutated proteins (b). Relative reprogramming efficiencies (evaluated as percentage of Nanog+ colonies) are shown as fold change normalized to the averaged efficiency observed in (a) pLVTHM or (b) pMX–GFP-infected fibroblasts. In a, lentiviruses encoding shRNAs against CycE1 or p53 were used as controls of reduced or increased reprogramming efficiency, respectively. In b, retroviruses encoding p16 or the pair CDK4/CycD1 were used as controls of reduced or increased reprogramming efficiency, respectively. For a, 20,000 infected cells were plated when shRNAs against POLR1C and p53 were used, and 70,000 infected cells were plated under all other conditions. For b, a total of 25,000 infected cells were plated under all conditions. Two independent experiments with two biological replicates were carried out. All error bars depict the s.d.
Table 2
Figure 3: Retroviral silencing or wild-type/mutant gene ratio do not alter reprogramming efficiency. (a) HUVEC cells were infected with retroviruses encoding OSKC, and a similar total amount of retroviruses encoding only the wild-type form or both, the wild-type and mutant forms of the protein in an equal proportion. (b) HUVEC cells were infected with retroviruses encoding OSKC, RFP and the wild-type or mutated forms of the genes indicated. Relative reprogramming efficiencies (evaluated as percentage of Tra-1-60+ colonies) are shown as fold change normalized to the averaged efficiency observed in green fluorescent protein-infected HUVECs. Ten thousand infected cells were plated under all the conditions. Two independent experiments with two biological replicates were carried out. All error bars depict the s.d.
Analysis of protein-coding mutations in hiPSCs and their possible role during somatic cell reprogramming

January 2013 · 77 Reads · 61 Citations

Recent studies indicate that human-induced pluripotent stem cells contain genomic structural variations and point mutations in coding regions. However, these studies have focused on fibroblast-derived human induced pluripotent stem cells, and it is currently unknown whether the use of alternative somatic cell sources with varying reprogramming efficiencies would result in different levels of genetic alterations. Here we characterize the genomic integrity of eight human induced pluripotent stem cell lines derived from five different non-fibroblast somatic cell types. We show that protein-coding mutations are a general feature of the human induced pluripotent stem cell state and are independent of somatic cell source. Furthermore, we analyse a total of 17 point mutations found in human induced pluripotent stem cells and demonstrate that they do not generally facilitate the acquisition of pluripotency and thus are not likely to provide a selective advantage for reprogramming.


Citations (13)


... Genome-wide analysis reveals that methylation levels at loop anchors show a strong correlation, with most anchors showing methylation levels below 0.6 and a Pearson correlation coefficient of 0.51 between anchor pairs (Extended Data Figure 1). Spatially proximal DNA sequences tend to display similar methylation levels, consistent with previous studies [36][37][38][39]. Building on this observation, the module calculates the number of methylated CpG sites within each anchor pair and evaluates the correlation of methylation levels between the left and right anchors. ...

Reference:

NanoLoop: A deep learning framework leveraging Nanopore sequencing for chromatin loop prediction
Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA
  • Citing Article
  • March 2017

Nature Genetics

... In addition to brain entropy measures derived from neuroimaging techniques like resting-state fMRI (rsfMRI), examining the spatial distribution of neurotransmitter systems and cell types is essential for a comprehensive understanding of brain function [5,6,7,8]. Neurotransmitter maps, for instance, provide valuable insights into the density and distribution of critical neurochemical systems that underpin brain function and behavior. These maps not only help characterize normal brain function but also reveal how disruptions in neurotransmitter signaling may contribute to the pathophysiology of psychiatric disorders, such as schizophrenia (SZ) and bipolar disorder (BP) [9,10]. ...

Neuronal subtypes and diversity revealed by single-nucleus RNA sequencing of the human brain

Science

... Such long fragments may indicate accessibility of two independent regulatory regions (on both ends), but it is unclear whether the region in between these loci is also accessible. This issue is particularly acute for specialized technologies like single-cell transposome hypersensitive sites sequencing (scTHS-seq) 12,13 and scNanoATAC-seq 14. Consequently, current fragment-based counting methods may lead to false positives counts when insertions are distantly outside the peak/bin 15,16. ...

Characterization of chromatin accessibility with a transposome hypersensitive sites sequencing (THS-seq) assay

Genome Biology

... In pursuit of preliminary validation, we sought out existing datasets that could corroborate our discoveries. Our search identified a selection of epigenetic investigations on family trios that are complemented by GWAS data 39,46,57. Predominantly, these studies have used the Illumina Human Methylation 450 K BeadChip assay. ...

Characterization of Genome-Methylome Interactions in 22 Nuclear Pedigrees

... Several approaches have been developed to fulfill these requirements based on diverse immobilizing technologies, including active optical, acoustic, and electrical fields, [4][5][6][7][8][9] as well as passive hydrodynamic/mechanical constrictions and interface microengineering. [10][11][12][13][14][15] The combination of most tools above with microfluidics enhances the controllability and throughput of microscale cell capture during the manipulation process and has great potential in searching for novel and pioneering insights into single-cell omics. [16,17] Despite significant advances, the current approaches still have several challenges to overcome. ...

Massively parallel polymerase cloning and genome sequencing of single cells using nanoliter microwells

Nature Biotechnology

... TET2 catalyzes the oxidation of 5-methylcytosine to 5-hydroxymethylcytosine (5hmC). 5hmC and its oxidized derivatives may eventually be replaced with an unmodified cytosine by base excision repair, resulting in demethylation [11]. Emerging evidence suggests that the TET family is important for cellular development, differentiation, and reprogramming [12][13][14]. ...

Genome-wide Analysis Reveals TET- and TDG-Dependent 5-Methylcytosine Oxidation Dynamics
  • Citing Article
  • April 2013

Cell

... Somatic coding variants in human iPSCs have been reported to be enriched in genes mutated or having causative effects in cancers [53,54] or in active promoters [55] or depleted from genic regions [56]. However, another in-depth analysis showed that variants in human iPSCs are generally benign and fall within intergenic or intronic regions [57]. ...

Analysis of protein-coding mutations in hiPSCs and their possible role during somatic cell reprogramming

... This type of large-scale chromatin memory is so strong that it can even be retained upon major changes in cell state. For example, during iPSC reprogramming, cells that transition from a fully committed cell type into a pluripotent cell state retain memory of their cell of origin [72][73][74][75][76][77][78]. This memory is retained in the form of heterochromatin signatures, such as H3K9me3, lamin-B1, and CpH methylation [79], which are known to participate in self-sustaining positive feedback loops (Table 1). ...

Identification of a specific reprogramming-associated epigenetic signature in human induced pluripotent stem cells

Proceedings of the National Academy of Sciences

... Additionally, including unique molecular identifiers (UMIs) counteracts the bias introduced by PCR, making it possible to distinguish single-capture events, creating the so-called single-molecule MIPs (smMIPs) [38]. So far, Diep et al. has applied an smMIP approach for the first epigenomic application using bisulfite-converted DNA [39]. The authors successfully typed >300,000 CpGs of interest, but their large-scale investigation still included large amounts of DNA; hence, the applicability of this approach to smaller-scale CpG panels and to compromised DNA still remains unknown. ...

Library-free Methylation Sequencing with Bisulfite Padlock Probes

Nature Methods

... However, progress toward this goal has been slowed by legal and social considerations. 15 Despite these advances, significant gaps remain in our understanding of oocyte reprogramming mechanisms. For instance, the molecular events driving epigenetic remodeling, such as DNA demethylation, histone modifications, and chromatin reorganization, are not yet fully elucidated. ...

Human oocytes reprogram somatic cells to a pluripotent state

Nature