Hugo Y. K. Lam's research while affiliated with Roche and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (180)
While many quantum computing (QC) methods promise theoretical advantages over classical counterparts, quantum hardware remains limited. Exploiting near-term QC in computer-aided drug design (CADD) thus requires judicious partitioning between classical and quantum calculations. We present HypaCADD, a hybrid classical-quantum workflow for finding lig...
Purpose:
Among patients with vasodilatory shock, gene expression scores may identify different immune states. We aimed to test whether such scores are robust in identifying patients' immune state and predicting response to hydrocortisone treatment in vasodilatory shock.
Materials and methods:
We selected genes to generate continuous scores to de...
The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor–normal genomic DNA (gDNA) samples derived from a breast cancer cell line—w...
Tumor Mutational Burden (TMB) is a measure of the abundance of somatic mutations in a tumor, which has been shown to be an emerging biomarker for both anti-PD-(L)1 treatment and prognosis; however, multiple challenges still hinder the adoption of TMB as a biomarker. The key challenges are the inconsistency of tumor mutational burden measurement amo...
Tumor Mutational Burden (TMB) is a measure of the abundance of somatic mutations in a tumor, which has been shown to be an emerging biomarker for anti-PD-(L)1 treatment. Nevertheless, multiple challenges still hinder the adoption of TMB for clinical decision-making. The current standard for TMB measurement requires counting the non-synonymous somat...
Accurate detection of somatic mutations is still a challenge in cancer analysis. Here we present NeuSomatic, the first convolutional neural network approach for somatic mutation detection, which significantly outperforms previous methods on different sequencing platforms, sequencing strategies, and tumor purities. NeuSomatic summarizes sequence ali...
We present DeepSomatic, the first convolutional neural network approach for somatic variant calling, which significantly outperforms previous techniques on different sequencing platforms, sequencing strategies, and tumor purities. DeepSomatic summarizes sequence alignments into small matrices and can incorporate more than a hundred features to capt...
Tumor-specific molecular profiling by next-generation sequencing (NGS) from tissue or blood (liquid biopsy) can have a tremendous potential to guide personalized healthcare for cancer treatments. However, based on the current clinical practices¹, often tumor or blood may be sequenced without the matched germline normal tissue or buffy coat. Since t...
Over a decade ago, the Atacama humanoid skeleton (Ata) was discovered in the Atacama region of Chile. The Ata specimen carried a strange phenotype-6-in stature, fewer than expected ribs, elongated cranium, and accelerated bone age-leading to speculation that this was a preserved nonhuman primate, human fetus harboring genetic mutations, or even an...
The human genome is generally organized into stable chromosomes, and only tumor cells are known to accumulate kilobase (kb)-sized extrachromosomal circular DNA elements (eccDNAs). However, it must be expected that kb eccDNAs exist in normal cells as a result of mutations. Here, we purify and sequence eccDNAs from muscle and blood samples from 16 he...
RNA-sequencing (RNA-seq) is an essential technique for transcriptome studies, hundreds of analysis tools have been developed since it was debuted. Although recent efforts have attempted to assess the latest available tools, they have not evaluated the analysis workflows comprehensively to unleash the power within RNA-seq. Here we conduct an extensi...
Mobile element insertions (MEIs) represent ~25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms of genetic variation in whole genome sequencing (WGS) projects involving population genetics, human diseases,...
The 1000 Genomes Project produced more than 100 trillion basepairs of short read sequence from more than 2600 samples in 26 populations over a period of five years. In its final phase, the project released over 85 million genotyped and phased variants on human reference genome assembly GRCh37. An updated reference assembly, GRCh38, was released in...
OnkoInsight is a pipeline designed to detect cancer driver genes from large sequencing datasets. It includes the somatic mutation detection module SomaticSeq, and the novel driver gene detection module GSMuta. SomaticSeq leverages an ensemble approach and machine learning to accurately detect somatic mutations. In Stage 5 of the ICGC-TCGA DREAM Som...
LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multipass sequencing with P5 and P6 chemistries, producing data i...
We report the sequence sof �244 human Y chromosomes randomly ascertained from 26 worldwide populations by the �1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants con...
Background:
The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, bu...
Background: The human genome contains variants ranging in size from small single nucleotide polymorphisms (SNPs) to large structural variants (SVs). High-quality benchmark small variant calls for the pilot National Institute of Standards and Technology (NIST) Reference Material (NA12878) have been developed by the Genome in a Bottle Consortium, but...
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-covera...
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-covera...
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks...
A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-Throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus...
Investigating genomic structural variants at basepair resolution is crucial for understanding their formation mechanisms. We identify and analyse 8,943 deletion breakpoints in 1,092 samples from the 1000 Genomes Project. We find breakpoints have more nearby SNPs and indels than the genomic average, likely a consequence of relaxed selection. By inve...
Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous wo...
VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously repo...
Background / Purpose:
Currently there is a lack of comprehensive simulation validation framework for next generation sequencing (NGS) analysis. Multiple agreed-upon validation datasets are critical for development of new secondary analysis methods, and read simulation is a bottleneck when simulating high coverage data. The genome in a bottle cons...
Background / Purpose:
Structural variations (SVs) are large genomic rearrangements, including deletion, insertion, inversion, duplication and translocation. SV detection is a key challenge with next-generation sequencing reads since SVs are generally much larger than read length. Accuracy of SV detection varies significantly by type, region and s...
The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign bioc...