Jian Li's research while affiliated with Roche Sequencing and Life Science and other places

Publications (8)

Article
Over a decade ago, the Atacama humanoid skeleton (Ata) was discovered in the Atacama region of Chile. The Ata specimen carried a strange phenotype (6-in stature, fewer than expected ribs, elongated cranium, and accelerated bone age), leading to speculation that this was a preserved nonhuman primate, human fetus harboring genetic mutations, or even an...
Article
A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus...
Article
Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous wo...
Article
VarSim is a framework for assessing alignment and variant calling accuracy in high-throughput genome sequencing through simulation or real data. In contrast to simulating a random mutation spectrum, it synthesizes diploid genomes with germline and somatic mutations based on a realistic model. This model leverages information such as previously repo...
Conference Paper
Background / Purpose: Currently, a comprehensive simulation validation framework for next-generation sequencing (NGS) analysis is lacking. Multiple agreed-upon validation datasets are critical for the development of new secondary analysis methods, and read simulation is a bottleneck when simulating high-coverage data. The Genome in a Bottle cons...
Conference Paper
Background / Purpose: Structural variations (SVs) are large genomic rearrangements, including deletion, insertion, inversion, duplication and translocation. SV detection is a key challenge with next-generation sequencing reads since SVs are generally much larger than read length. Accuracy of SV detection varies significantly by type, region and s...

Citations

... This increased interest also stems from the rise in destructive and invasive sampling of archeological human and faunal remains to provide more nuanced interpretations of the past (Pálsdóttir et al., 2019) and for the purpose of aDNA, isotope, histomorphological, and other analyses (Advisory Panel on the Archeology of Burials in England, 2013). This has been further intensified by poorly informed studies failing to incorporate basic osteological data (Bhattacharya et al., 2018). If publications do not include transparent consideration of ethical issues when undertaking our research, the media and the wider public may, quite rightly, feel misinformed about our research processes. ...
... A subset of CNVs from NA12878 was confirmed and further refined to those with support from multiple technologies using SVClassify [29]. The unique collection of Sanger sequencing from the HuRef sample has also been used to characterize SVs [30,31]. Long reads were used to broadly characterize SVs in a haploid hydatidiform mole cell line [32]. ...
... Compared with many previous successful CNV calling methods based on bulk tissue sequencing data [223-233], CNV detection from scRNA-seq is challenging due to several technical limitations, including low and non-uniform genome coverage, amplification biases [234,235] and prevalent monoallelic detection due to transcriptional stochasticity [234,236-238]. The monoallelic bias is more pronounced for lowly expressed genes than highly expressed genes. ...
... Over the past few years, multiple methods have been proposed and implemented to simulate SV (e.g., VarSim [67], SURVIVOR [68], etc.) and simulate read data (e.g., Nano-Sim [69], PBsim [70], etc.) [71]. While these methods are helpful for rapid, early understanding of the utility of an SV caller, they often under-represent the complexity of SVs either at the level of the allele itself or in the regions they tend to occur (e.g., repetitive). ...