Jialiang Yang
Geneis Beijing Co. Ltd. · Department of Sciences

Ph.D.

265
Publications
22,169
Reads
5,711
Citations
Introduction
Jialiang Yang currently works at Geneis Beijing Co. Ltd.

Publications (265)
Article
Full-text available
Introduction: Large genomes are full of repeated DNA sequences. It was estimated that over half of the human DNA consists of repeated sequences (Baltimore 2001; Eichler 2001; Leem et al. 2002). Tandem duplication is one of the important evolutionary mechanisms for producing repeated DNA sequences, in which the copies that may or may not contain genes are...
Article
Full-text available
Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, while distance-based methods cannot guarantee reconstruction accuracy since pairwise genetic distanc...
Article
Full-text available
Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysis of RNA sequencing data from 1641 samples across 43 tissues from 175 individuals, generated as part of the pilot phase of the Genotype-Tissue Expression...
Article
Full-text available
COVID-19 has spread globally to over 200 countries with more than 40 million confirmed cases and one million deaths as of November 1, 2020. The SARS-CoV-2 virus, leading to COVID-19, shows extremely high rates of infectivity and replication, and can result in pneumonia, acute respiratory distress, or even mortality. SARS-CoV-2 has been found to be...
Article
Full-text available
Microsatellite instability (MSI), an important biomarker for immunotherapy and the diagnosis of Lynch syndrome, refers to the change of microsatellite (MS) sequence length caused by insertion or deletion during DNA replication. However, traditional wet-lab experiment-based MSI detection is time-consuming and relies on experimental conditions. In ad...
Article
Recent advances in single-cell RNA sequencing (scRNA-seq) provide exciting opportunities for transcriptome analysis at single-cell resolution. Clustering individual cells is a key step to reveal cell subtypes and infer cell lineage in scRNA-seq analysis. Although many dedicated algorithms have been proposed, clustering quality remains a computation...
Article
Full-text available
Amid the COVID‐19 crisis, we made a sizeable effort to collect a large number of experimentally validated drug–virus association entries from the literature by text mining and built a human drug–virus association database. To the best of our knowledge, it is the largest publicly available drug–virus database so far. Next, we develop a novel weight regular...
Article
About 30%–40% breast cancer patients suffer from recurrence and metastasis, even after targeted therapy like trastuzumab. Since breast cancer recurrence and metastasis are intrinsically related to mortality, it is critical to predict the recurrence and metastasis risk of an individual patient, which is essential for adjuvant therapy and early inter...
Article
Full-text available
As one of the most common cancers of the digestive system, colon cancer is a predominant cause of cancer-related deaths worldwide. To investigate prognostic genes in the tumor microenvironment of colon cancer, we collected 461 colon adenocarcinoma (COAD) and 172 rectal adenocarcinoma (READ) samples from The Cancer Genome Atlas (TCGA) database, and...
Article
Full-text available
Background: Colorectal cancer (CRC), the third most common cancer globally, accounts for approximately 10% of newly diagnosed cancer cases each year. Identifying biomarkers associated with CRC survival and predicting the survival of CRC patients are critical for personalized therapy. Existing studies on CRC survival are mainly based on single...
Preprint
Carcinoma of unknown primary (CUP) is a type of metastatic cancer with tissue-of-origin (TOO) unidentifiable by traditional methods. Most CUP patients have poor prognosis since no therapy targeting TOO is allowed. Thus, it’s critical to develop accurate computational methods to infer TOO. While qPCR or microarray-based methods are effective in pred...
Article
Full-text available
Since the outbreak of SARS-CoV-2 in 2019, Chinese horseshoe bats have been considered a potential original host of SARS-CoV-2. In addition, cats, tigers, lions, minks, and ferrets were naturally or experimentally infected with SARS-CoV-2. For the surveillance and control of this highly infectious disease, it is critical to trace susceptible anima...
Article
Background: Evaluating the risk of metastasis and recurrence of a cervical cancer patient is critical for appropriate adjuvant therapy. However, current risk assessment models usually involve the testing of tens to thousands of genes from patients’ tissue samples, which is expensive and time-consuming. Therefore, computer-aided diagnosis and prognos...
Article
Full-text available
Drug repositioning is an efficient and promising strategy for traditional drug discovery and development. Many research efforts are focused on utilizing deep-learning approaches based on a heterogeneous network for modeling complex drug-disease associations. Similar to traditional latent factor models, which directly factorize drug-disease associat...
Article
Full-text available
HER2-positive breast cancer is a highly heterogeneous tumor, and about 30% of patients still suffer from recurrence and metastasis after trastuzumab targeted therapy. Predicting individual prognosis is of great significance for the further development of precise therapy. With the continuous development of computer technology, more and more attentio...
Article
Invasion and migration are major characteristics of malignant cancers. Inhibiting the migration of cancer cells, rather than simply removing the primary tumor, is essential for improving the survival rates. Two-dimensional (2D) materials have important application prospects in the field of biomedicine (especially in cancer therapy) due to their uni...
Article
Full-text available
Artificial Intelligence (AI) coupled with promising machine learning (ML) techniques well known from computer science is broadly affecting many aspects of various fields including science and technology, industry, and even our day to day life. The ML techniques have been developed to analyze high-throughput data with a view to obtaining useful insi...
Article
Full-text available
Complex diseases, such as breast cancer, are often caused by mutations of multiple functional genes. Identifying disease-related genes is a critical and challenging task for unveiling the biological mechanisms behind these diseases. In this study, we develop a novel computational framework to analyze the network properties of the known breast cance...
Article
Full-text available
Background: Non-small cell lung cancer (NSCLC) is one of the most prevalent causes of cancer-related death worldwide. Recently, there have been many important medical advances in NSCLC, such as therapies based on tyrosine kinase inhibitors and immune checkpoint inhibitors. Most of these therapies require tumor molecular testing for selecting patients...
Article
Full-text available
Carcinoma of unknown primary (CUP) is a type of metastatic cancer, the primary tumor site of which cannot be identified. CUP accounts for approximately 5% of cancer cases in the United States, usually with an unfavorable prognosis, making it a major threat to public health. Traditional methods to identify the tissue-of-origin (TOO) of CUP like immunohis...
Article
Full-text available
COVID-19 has spread globally, with over 90,000,000 cases and 1,930,000 deaths by Jan 11, 2021, which poses a major threat to public health. It is urgent to distinguish COVID-19 from common pneumonia. In this study, we reported multiple clinical feature analyses on COVID-19 in Inner Mongolia for the first time. We dynamically monitored multiple cl...
Article
Full-text available
The high dimension, high redundancy and class imbalance of cancer multiple omics data are the main challenges for cancer diagnosis. Existing studies have neglected the role of functional proteomics in the occurrence and development of cancer. In this study, a novel hybrid feature selection and ensemble learning framework, referred to as the three-s...
Article
Full-text available
Elevated plasma cholesterol and type 2 diabetes (T2D) are associated with coronary artery disease (CAD). Individuals treated with cholesterol-lowering statins have increased T2D risk, while individuals with hypercholesterolemia have reduced T2D risk. We explore the relationship between lipid and glucose control by constructing network models from t...
Article
Full-text available
The outbreak of a novel febrile respiratory disease called COVID-19, caused by the newly discovered coronavirus SARS-CoV-2, has drawn worldwide attention. Prioritizing approved drugs is critical for quick clinical trials against COVID-19. In this study, we first manually curated three Virus-Drug Association (VDA) datasets. By incorporating VDAs with the...
Article
Full-text available
Cancer immunotherapy, as a novel treatment against cancer metastasis and recurrence, has brought a significantly promising and effective therapy for cancer treatments. At present, programmed death 1 (PD-1) and programmed cell death-Ligand 1 (PD-L1) treatment for lung cancer is primarily recognized as an immune checkpoint inhibitor (ICI) to play an...
Article
Full-text available
A novel coronavirus, named COVID-19, has become one of the most prevalent and severe infectious diseases in human history. Currently, there are only very few vaccines and therapeutic drugs against COVID-19, and their efficacies are yet to be tested. Drug repurposing aims to explore new applications of approved drugs, which can significantly reduce...
Article
One advantage of single-cell RNA sequencing is its ability in revealing cell heterogeneity by cell clustering. However, cell clustering based on single-cell RNA sequencing is challenging due to the high transcript amplification noise, sparsity and outlier cell populations. In this study, we propose a novel sparse subspace clustering method called S...
Article
Full-text available
Lonicera japonica Thunb is a traditional Chinese herbal medicine for treating intestinal inflammation. The extraction method of Lonicera japonica Thunb polysaccharide (LJP) has been developed previously by our research group. In this study, a Fourier transform infrared spectrometer (FT-IR) was used to perform a qualitative analysis of LJP and a pre...
Article
Full-text available
The discovery of cancer of unknown primary (CUP) is of great significance in designing more effective treatments and improving the diagnostic efficiency in cancer patients. In the study, we develop an appropriate machine learning model for tracing the tissue of origin of CUP with high accuracy after feature engineering and model evaluation. Based o...
Article
Full-text available
Some carcinomas show that one or more metastatic sites appear with unknown origins. The identification of primary or metastatic tumor tissues is crucial for physicians to develop precise treatment plans for patients. With unknown primary origin sites, it is challenging to design specific plans for patients. Usually, those patients receive broad-spe...
Chapter
As the function of lncRNAs is gradually understood, they have been found to regulate the expression of target genes at the post-transcriptional level, and their abnormal functions may lead to many diseases. Identifying lncRNA-disease associations (LDA) can therefore help to better understand disease pathogenesis, promote the search for biomarkers of d...
Article
Full-text available
Studying transcriptome chronological change from tissues across the whole body can provide valuable information for understanding aging and longevity. Although there has been research on the effect of single-tissue transcriptomes on human aging or aging in mice across multiple tissues, the study of human body-wide multi-tissue transcriptomes on agi...
Article
Full-text available
A new coronavirus called SARS-CoV-2 is rapidly spreading around the world. Over 16,558,289 infected cases with 656,093 deaths have been reported by July 29th, 2020, and it is urgent to identify effective antiviral treatment. In this study, potential antiviral drugs against SARS-CoV-2 were identified by drug repositioning through Virus-Drug Associat...
Article
Full-text available
Sequencing-based identification of tumor tissue-of-origin (TOO) is critical for patients with cancer of unknown primary lesions. Even if the TOO of a tumor can be diagnosed by clinicopathological observation, reevaluations by computational methods can help avoid misdiagnosis. In this study, we developed a neural network (NN) framework using the exp...
Article
Full-text available
Cancer of unknown primary site (CUPS) is a type of metastatic tumor for which the sites of tumor origin cannot be determined. Precise diagnosis of the tissue origin for metastatic CUPS is crucial for developing treatment schemes to improve patient prognosis. Recently, there have been many studies using various cancer biomarkers to predict the tissu...
Article
Carcinoma of unknown primary (CUP), defined as metastatic cancers with unknown cancer origin, occurs in 3–5 per 100 cancer patients in the United States. Heterogeneity and metastasis of cancer bring great difficulties to the follow-up diagnosis and treatment for CUP. To find the tissue-of-origin (TOO) of the CUP, multiple methods have been proposed....
Chapter
Cancers whose primary site cannot be found after clinical and pathological evaluation are called cancers of unknown primary origin (CUP). CUPs may resemble a specific primary tumor site which shares common clinicopathological characteristics and prognosis. However, it may be present as a distinct disease entity with undifferentiated pathological featu...
Chapter
The status of the T cell receptor (TCR) repertoire is associated with the occurrence and progress of various diseases and can be used in monitoring immune responses, predicting the prognosis of disease, and other medical fields. High-throughput sequencing promotes the study of the TCR repertoire. The chapter focuses on the whole process of TCR prof...
Article
Full-text available
In this study, we proposed an ensemble learning method simultaneously integrating a low-rank matrix completion model and a ridge regression model to predict anticancer drug response on cancer cell lines. The model was applied to two benchmark datasets including the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer...
Article
Full-text available
Data quality control and preprocessing are often the first step in processing next-generation sequencing (NGS) data of tumors. Not only can it help us evaluate the quality of sequencing data, but it can also help us obtain high-quality data for downstream data analysis. However, by comparing data analysis results of preprocessing with Cutadapt, Fas...
Article
It is urgent to find an effective antiviral drug against SARS-CoV-2. In this study, 96 virus-drug associations (VDAs) from 12 viruses including SARS-CoV-2 and similar viruses and 78 small molecules are selected. Complete genomic sequence similarity of viruses and chemical structure similarity of drugs are then computed. A KATZ-based VDA prediction...
Article
Background: Identification of genomic markers using NGS (next generation sequencing) technology would be valuable for guiding precision medicine treatments for pancreatic cancers. Traditional somatic mutation methods require both tumor and matched non-tumor samples. However, only tumor samples are available in most cases, especially in retrospective...
Article
Full-text available
During the carcinogenesis of cervical cancer, the DNA of human papillomavirus (HPV) is frequently integrated into the human genome, which might be a biomarker for the early diagnosis of cervical cancer. Although the detection sensitivity of virus infection status increased significantly through the Illumina sequencing platform, there were still dis...
Article
Full-text available
With the development of high throughput technologies, there are more and more protein–protein interaction (PPI) networks available, which provide a need for efficient computational tools for network alignment. Network alignment is widely used to predict functions of certain proteins, identify conserved network modules, and study the evolutionary re...
Article
Full-text available
For resectable cancer patients, a method that could precisely predict the risk of postoperative recurrence would be crucial for guiding adjuvant treatment. Since T Cell Receptors (TCR) repertoires had been shown to be closely related to the dynamics of cancers, here we enrolled a cohort of patients to evaluate the potential of TCR repertoires in pr...
Article
Background: Metastatic cancers require further diagnosis to determine their primary tumor sites. However, the tissue-of-origin for around 5% of tumors could not be identified by routine medical diagnosis. With the development of machine learning techniques and the accumulation of big cancer data from TCGA and GEO, it is now feasible to...
Article
Background: Thymidylate Synthase (TS) is an important target for folic acid inhibitors such as pemetrexed, which has considerable effects on the first-line treatment, second-line treatment and maintenance therapy for patients with late-stage Non-Small Cell Lung Cancer (NSCLC). Therefore, detecting mutations in the TYMS gene encoding TS is critical i...
Article
Full-text available
Metastatic cancers require further diagnosis to determine their primary tumor sites. However, the tissue-of-origin for around 5% of tumors could not be identified by routine medical diagnosis, according to statistics from the United States. With the development of machine learning techniques and the accumulation of big cancer data from The Cancer Genom...
Article
Full-text available
Identifying disease-related microRNAs (miRNAs) is crucial to understanding the etiology and pathogenesis of many diseases. However, existing computational methods are facing a few dilemmas such as lacking “negative samples” (i.e. confirmed unrelated miRNA-disease pairs). In this study, we proposed LRMCMDA, a low-rank matrix completion-based method...
Article
Full-text available
Recently, many studies have demonstrated that microRNAs (miRNAs) are new small molecule drug targets. Identifying small molecule-miRNA associations (SMiRs) plays an important role in finding new clues for various human disease therapy. Wet experiments can discover credible SMiR associations; however, this is a costly and time-consuming process. Com...
Article
Background: Many forms of variation exist in the genome, which are the main causes of individual phenotypic differences. The detection of variants, especially those located in the tumor genome, still faces many challenges due to the complexity of genome structure. Thus, the performance assessment of variation detection tools using next-generation se...
Article
Full-text available
A key goal of aging research is to understand mechanisms underlying healthy aging and develop methods to promote the human healthspan. One approach is to identify gene regulations unique to healthy aging compared with aging in the general population (i.e., "common" aging). Here, we leveraged Genotype-Tissue Expression (GTEx) project data to invest...
Article
Motivation: Single-cell RNA sequencing (scRNA-seq) technology provides a powerful tool for investigating cell heterogeneity and cell subpopulations by allowing the quantification of gene expression at single cell level. However, scRNA-seq data analysis remains challenging because of various technical noises such as dropout events (i.e., excessive...