Hui Liu’s research while affiliated with Nanjing Tech University and other places


Publications (72)


Generalize Drug Response Prediction by Latent Independent Projection for Asymmetric Constrained Domain Generalization
  • Preprint

February 2025

Ran Song

·

Yinpu Bai

·

Hui Liu

The accurate prediction of drug responses remains a formidable challenge, particularly at the single-cell level and in clinical treatment contexts. Some studies employ transfer learning techniques to predict drug responses in individual cells and patients, but they require access to target-domain data during training, which is often unavailable or only obtainable in the future. In this study, we propose a novel domain generalization framework, termed panCancerDR, to address this challenge. We conceptualize each cancer type as a distinct source domain, with its cell lines serving as domain-specific samples. Our primary objective is to extract domain-invariant features from the expression profiles of cell lines across diverse cancer types, thereby generalizing the predictive capacity to out-of-distribution samples. To enhance robustness, we introduce a latent independence projection (LIP) module that encourages the encoder to extract informative yet non-redundant features. We also propose an asymmetric adaptive clustering constraint, which clusters drug-sensitive samples into a compact group while driving resistant samples to disperse across separate clusters in the latent space. Our empirical experiments demonstrate that panCancerDR effectively learns task-relevant features from diverse source domains and achieves accurate predictions of drug response for cancer types unseen during training. Furthermore, when evaluated on single-cell and patient-level prediction tasks, our model, trained solely on in vitro cell line data without access to target-domain information, consistently outperforms or matches current state-of-the-art methods. These findings highlight the potential of our method for real-world clinical applications.


Illustrative diagram of UnifyImmun framework and two-phase training strategy, as well as the sequence frequency distributions of the benchmark datasets
a, Architecture of UnifyImmun based on the cross-attention mechanism. b, Two-stage progressive training strategy. c,d, Frequency of antigen sequences (c) and TCR CDR3 sequences (d) included in our created benchmark datasets with respect to lengths. FC, fully connected.
Performance evaluation on predicting peptide–HLA binding specificity
a, Performance comparison with 12 existing methods on independent (left) and external (right) test datasets, respectively. b, ROC curves and AUC values achieved by UnifyImmun and eight competing methods on the hold-out independent test set. c, UMAP feature visualization of peptide–HLA pairs. d, PPV for the top 100, top 1,000 and top 5,000 predicted pHLA samples. MCC, Matthews correlation coefficient; ROC, receiver operating characteristic curve.
Performance evaluation on predicting peptide–TCR binding specificity
a–c, Performance comparison with four methods on independent (a), external (b) and COVID-19 (c) test sets, respectively. d–f, PPV for the top 100 (d), top 1,000 (e) and top 5,000 (f) predicted samples on independent, external and COVID-19 test sets, respectively. g,h, ROC curves and AUC values on independent (g) and external (h) test datasets, respectively.
Two-phase progressive training improved performance for both pHLA and pTCR binding prediction tasks
a,b, AUROC (a) and AUPR (b) values increased with two-phase training rounds on the pHLA independent test set. c,d, AUROC (c) and AUPR (d) values increased with two-phase training rounds on the pTCR independent test set. In all the boxplots, the sample size is 10 for each round, and each point represents the result of a technical replicate by random data partitioning. The horizontal line within each box represents the median, the box boundaries denote the interquartile range (Q25 to Q75) and whiskers extend to the most extreme data point no more than 1.5 times the interquartile range. The one-sided F-test was performed to assess the statistical differences between the two groups connected by a line, with the corresponding P values displayed above the respective lines.
Heatmaps generated from cross-attention scores and integrated gradients
a,b, Heatmaps of cross-attention scores (a) and integrated gradients (b) of the amino acid type at each position of 9-mer peptide binding to HLA molecules. c, Cumulative attention scores across peptide length of each amino acid type of peptide binding to HLA molecules. d,e, Heatmaps of cross-attention scores (d) and integrated gradients (e) of the amino acid type at each position of 9-mer peptide binding to TCR molecules. f, Cumulative attention scores across peptide length of each amino acid type of peptide binding to TCR molecules. g, Heatmaps of cross-attention scores for the top five HLA alleles with most 9-mer binding peptides. h,i, Attention score-based heatmap (h) and three-dimensional structure (i) for the TCR complex with HLA-B35:01/HPVG (PDB 3MV7).


A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules
  • Article

January 2025

·

8 Reads

Nature Machine Intelligence

Immune checkpoint inhibitors have demonstrated promising clinical efficacy across various tumour types, yet the percentage of patients who benefit from them remains low. The binding between tumour antigens and human leukocyte antigen class I/T cell receptor molecules determines antigen presentation and T cell activation, thereby playing an important role in the immunotherapy response. In this paper, we propose UnifyImmun, a unified cross-attention transformer model designed to simultaneously predict the binding of peptides to both receptors, providing a more comprehensive evaluation of antigen immunogenicity. We devise a two-phase strategy using virtual adversarial training that enables these two tasks to reinforce each other mutually, by compelling the encoders to extract more expressive features. Our method demonstrates superior performance in predicting both peptide-HLA and peptide-TCR binding on multiple independent and external test sets. Notably, on a large-scale COVID-19 peptide-TCR binding test set without any seen peptide in the training set, our method outperforms the current state-of-the-art methods by more than 10%. The predicted binding scores significantly correlate with the immunotherapy response and clinical outcomes in two clinical cohorts. Furthermore, the cross-attention scores and integrated gradients reveal the amino acid sites critical for peptide binding to receptors. In essence, our approach marks an essential step towards comprehensive evaluation of antigen immunogenicity.


Learning Cross-Domain Representations for Transferable Drug Perturbations on Single-Cell Transcriptional Responses

December 2024

·

1 Read

Phenotypic drug discovery has attracted widespread attention because of its potential to identify bioactive molecules. Transcriptomic profiling provides a comprehensive reflection of phenotypic changes in cellular responses to external perturbations. In this paper, we propose XTransferCDR, a novel generative framework designed for feature decoupling and transferable representation learning across domains. Given a pair of perturbed expression profiles, our approach decouples the perturbation representations from basal states through domain separation encoders and then cross-transfers them in the latent space. The transferred representations are then used to reconstruct the corresponding perturbed expression profiles via a shared decoder. This cross-transfer constraint effectively promotes the learning of transferable drug perturbation representations. We conducted extensive evaluations of our model on multiple datasets, including single-cell transcriptional responses to drugs and single- and combinatorial genetic perturbations. The experimental results show that XTransferCDR achieved better performance than current state-of-the-art methods, showcasing its potential to advance phenotypic drug discovery.



Single-cell and spatial transcriptome characterize coinhibitory cell-cell communications during histological progression of lung adenocarcinoma

October 2024

·

6 Reads

·

1 Citation

Judong Luo

·

Qianman Gao

·

Meihua Wang

·

[...]

·

Hong Zhu

Introduction: Lung adenocarcinoma, a prevalent and lethal malignancy globally, is characterized by significant tumor heterogeneity and a complex tumor immune microenvironment during its histologic pattern progression. Understanding the intricate interplay between tumor and immune cells is of paramount importance, as it could pave the way for the development of effective therapeutic strategies for lung adenocarcinoma. Methods: In this study, we ran a comparative analysis of the single-cell transcriptomic data derived from tumor tissues exhibiting four distinct histologic patterns (lepidic, papillary, acinar and solid) in lung adenocarcinoma. Furthermore, we conducted immunofluorescence assays and spatial transcriptomic sequencing to validate the spatial co-localization of typical co-inhibitory factors. Results and Discussion: Our analysis unveiled several co-inhibitory receptor-ligand interactions, including PD1-PDL1, PVR-TIGIT and TIGIT-NECTIN2, that potentially play a pivotal role in recruiting immunosuppressive cells such as M2 macrophages and Tregs into the LUAD tumor, thereby establishing an immunosuppressive microenvironment and driving T cells into an exhausted state. Furthermore, the expression levels of these co-inhibitory factors, such as NECTIN2 and PVR, were strongly correlated with low immune infiltration, unfavorable patient clinical outcomes and limited efficacy of immunotherapy. We believe this study provides valuable insights into the heterogeneity of the molecular and cellular interactions leading to an immunosuppressive microenvironment during the histological progression of lung adenocarcinoma. The findings could facilitate the development of novel immunotherapies for lung cancer.


Single-cell and Spatial Transcriptomic Analyses Implicate Formation of the Immunosuppressive Microenvironment during Breast Tumor Progression

September 2024

·

5 Reads

The Journal of Immunology

Ductal carcinoma in situ and invasive ductal carcinoma represent two stages of breast cancer progression. A multitude of studies have shown that genomic instability increases during tumor development, as manifested by higher mutation and copy number variation rates. The advent of single-cell and spatial transcriptomics has enabled the investigation of subtle differences in cellular states during tumor progression at the single-cell level, thereby providing a more nuanced understanding of the intercellular interactions within the solid tumor. However, the evolutionary trajectory of tumor cells and the establishment of the immunosuppressive microenvironment during breast cancer progression remain unclear. In this study, we performed an exploratory analysis of the single-cell sequencing dataset of 13 ductal carcinoma in situ and invasive ductal carcinoma samples. We revealed that tumor cells became more malignant and aggressive during their progression, and T cells transitioned to an exhausted state. The tumor cells expressed various coinhibitory ligands that interacted with the receptors of immune cells to create an immunosuppressive tumor microenvironment. Furthermore, spatial transcriptomics data confirmed the spatial colocalization of tumor and immune cells, as well as the expression of the coinhibitory ligand–receptor pairs. Our analysis provides insights into the cellular and molecular mechanisms underlying the formation of the immunosuppressive landscape during two typical stages of breast cancer progression.


A Comparative Study on the Modification of Polyphenolic, Volatile, and Sensory Profiles of Merlot Wine by Indigenous Lactiplantibacillus plantarum and Oenococcus oeni

September 2024

·

26 Reads

Australian Journal of Grape and Wine Research

Background and Aims. The variation of malolactic fermentation (MLF) behavior between Oenococcus oeni and Lactiplantibacillus plantarum is crucial for wine quality. This work aimed to evaluate the fermentation kinetics of preacclimatized indigenous O. oeni (strains of SD-2a and 144-46) and Lp. plantarum (strains of XJ25 and XJA2), and their effect on color, polyphenols, volatile components, and sensory characteristics of wine. Methods and Results. The changes of polyphenols, color, volatile components, and sensory properties of a low-pH Merlot wine (pH < 3.3) after MLF with indigenous O. oeni and Lp. plantarum strains were investigated and compared. O. oeni gave wine more floral and balsamic aromas and increased the proportion of co-pigmented anthocyanins. Only strain 144-46 of O. oeni could complete fermentation during 6-day MLF (final concentration of L-malic acid <0.1 g/L). However, both Lp. plantarum strains effectively and thoroughly metabolized L-malic acid within 3 days for XJA2 and 5 days for XJ25 while maintaining a substantial population of viable cells. Lp. plantarum significantly enhanced the proportion of polymerized anthocyanins and the fruity, caramel, fatty, roasted, and herbaceous notes in wine through the modification of volatile profiles. Conclusions. Lp. plantarum after preacclimation shows more robust MLF performance compared to O. oeni. O. oeni prefers to give wine more floral and balsamic aromas, and a high proportion of co-pigmented anthocyanins. Lp. plantarum prefers to give wine more herbal and chemical aromas and a high proportion of polymerized anthocyanins. Significance of the Study. Lp. plantarum after preacclimation exhibits more favorable effects on wine compared to O. oeni, suggesting its potential for industrial application.



Ensemble Machine Learning and Predicted Properties Promote Antimicrobial Peptide Identification

July 2024

·

8 Reads

·

2 Citations

Interdisciplinary Sciences Computational Life Sciences

The emergence of antibiotic-resistant microbes raises a pressing demand for novel alternative treatments. One promising alternative is antimicrobial peptides (AMPs), a class of innate immunity mediators within the therapeutic peptide realm. AMPs offer salient advantages such as high specificity, cost-effective synthesis, and reduced toxicity. Although some computational methodologies have been proposed to identify potential AMPs with the rapid development of artificial intelligence techniques, there is still ample room to improve their performance. This study proposes a predictive framework that ensembles deep learning and statistical learning methods to screen peptides with antimicrobial activity. We integrate multiple LightGBM classifiers and convolutional neural networks that leverage various predicted sequential, structural and physicochemical properties of the residue sequences, extracted by diverse machine learning paradigms. Comparative experiments show that our method outperforms other state-of-the-art approaches on an independent test dataset in terms of representative capability measures. In addition, we analyse the discrimination quality under different varieties of attribute information, which reveals that a combination of multiple features can improve prediction. A case study is also carried out to illustrate the favorable identification effect. We establish a web application at http://amp.denglab.org to provide convenient usage of our proposal and make the predictive framework, source code, and datasets publicly accessible at https://github.com/researchprotein/amp.


Predicting single-cell cellular responses to perturbations using cycle consistency learning

June 2024

·

7 Reads

·

2 Citations

Bioinformatics

Phenotype-based drug screening has emerged as a powerful approach for identifying compounds that actively interact with cells. Transcriptional and proteomic profiling of cell lines and individual cells provides insights into the cellular state alterations that occur at the molecular level in response to external perturbations, such as drugs or genetic manipulations. In this paper, we propose cycleCDR, a novel deep learning framework to predict cellular response to external perturbations. We leverage the autoencoder to map the unperturbed cellular states to a latent space, in which we postulate that the effects of drug perturbations on cellular states follow a linear additive model. Next, we introduce cycle consistency constraints to ensure that an unperturbed cellular state subjected to drug perturbation in the latent space produces the perturbed cellular state through the decoder. Conversely, removal of perturbations from the perturbed cellular states can restore the unperturbed cellular state. The cycle consistency constraints and linear modeling in the latent space enable the model to learn transferable representations of external perturbations, so that it generalizes well to drugs unseen during training. We validate our model on four different types of datasets, including bulk transcriptional responses, bulk proteomic responses, and single-cell transcriptional responses to drug/gene perturbations. The experimental results demonstrate that our model consistently outperforms existing state-of-the-art methods, indicating that our method is highly versatile and applicable to a wide range of scenarios. Availability and implementation: The source code is available at https://github.com/hliulab/cycleCDR.
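
The linear additive latent model and the cycle consistency idea in the abstract above can be illustrated with a small toy sketch. This is not the cycleCDR implementation: the fixed 2×2 `ENC`/`DEC` matrices stand in for a trained autoencoder, and the names `encode`, `decode`, `perturb`, `unperturb` and `cycle_loss` are hypothetical, chosen only for this example.

```python
# Toy sketch of a linear additive perturbation model with a cycle check.
# Assumption: the encoder/decoder are fixed, exactly invertible linear maps;
# the paper instead learns them as autoencoders.

ENC = [[2.0, 0.0], [0.0, 4.0]]    # hypothetical encoder weights
DEC = [[0.5, 0.0], [0.0, 0.25]]   # hypothetical decoder weights (inverse of ENC)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def encode(x):
    """Map an expression profile to the latent space."""
    return matvec(ENC, x)


def decode(z):
    """Map a latent state back to an expression profile."""
    return matvec(DEC, z)


def perturb(x, drug):
    """Add the drug embedding to the latent state, then decode."""
    z = encode(x)
    return decode([a + b for a, b in zip(z, drug)])


def unperturb(x_perturbed, drug):
    """Subtract the drug embedding from the latent state, then decode."""
    z = encode(x_perturbed)
    return decode([a - b for a, b in zip(z, drug)])


def cycle_loss(x, drug):
    """Squared error between x and its perturb-then-unperturb reconstruction."""
    x_back = unperturb(perturb(x, drug), drug)
    return sum((a - b) ** 2 for a, b in zip(x, x_back))


if __name__ == "__main__":
    control = [1.0, 2.0]    # unperturbed profile (toy values)
    drug = [0.5, -1.0]      # drug embedding in latent space (toy values)
    print(perturb(control, drug))
    print(cycle_loss(control, drug))
```

In a trained model the encoder, decoder and drug embeddings would be learned, and a term like `cycle_loss` would be one part of the training objective rather than exactly zero by construction.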


Citations (43)


... Fortunately, the remarkable progress in single-cell transcriptome sequencing has led to a significant increase in single-cell RNA sequencing data of T cells, greatly facilitating the acquisition of CDR3 sequences. By leveraging the power of large language models for pretraining, we can extract more expressive and meaningful features from the massive sequences 60 . This would greatly enhance the predictive capabilities of our model to accurately assess the immunogenicity of antigens. ...

Reference:

A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules
A large language model for predicting T cell receptor-antigen binding specificity
  • Citing Conference Paper
  • December 2024

... The ability of cancer to escape the immune system requires the presence of an immunosuppressive microenvironment frequently represented by the aberrant activation of immune checkpoints such as CTLA4, PD-1, and its ligand PD-L1 [63][64][65][66]. ...

Single-cell and spatial transcriptome characterize coinhibitory cell-cell communications during histological progression of lung adenocarcinoma

... When applied to mice suffering from bacterial pneumonia, AMP aerosolized formulations demonstrated remarkable therapeutic efficacy comparable to penicillin, negligible toxicity, and a relatively low tendency to induce the development of resistance. Zhong et al. proposed a prediction framework that combined deep learning and statistical learning methods to screen potential AMPs [86]. This method integrated multiple LightGBM classifiers and convolutional neural networks (CNNs) that utilized a variety of predicted sequence, structural, and physicochemical properties from residue sequences extracted through various machine learning paradigms (Fig. 3b). ...

Ensemble Machine Learning and Predicted Properties Promote Antimicrobial Peptide Identification
  • Citing Article
  • July 2024

Interdisciplinary Sciences Computational Life Sciences

... The linear additive model in the latent space is widely used in deep learning for interpretability, such as latent additive neural models (Nguyen, Vasilaki, and Martínez 2023) and latent linear additivity models (Lotfollahi et al. 2021;Hetzel et al. 2022;Huang and Liu 2024). CPA (Lotfollahi et al. 2021) and cycleCDR (Huang and Liu 2024) are most related to our work, as they combine the interpretability of linear models with the power of deep learning to model single-cell transcriptional responses. ...

Predicting single-cell cellular responses to perturbations using cycle consistency learning
  • Citing Article
  • June 2024

Bioinformatics

... Fifty-five out of the 71 articles in this special issue studied flavor chemistry in processed food products, including marinated and stewed beef (Liu, Deng, et al., 2024;, cooked pork (Cheng et al., 2023), sausages (Shao et al., 2024;Sui et al., 2024;, chicken soup (Wang, Wu, et al., 2024), porridge , plant-based meat , sauerkraut Wang, Sui, Lu, et al., 2023), coffee , yogurt (Fan et al., 2024), tea Long et al., 2024;Ma et al., 2024;Ouyang et al., 2024;Qingyang et al., 2024;Xu et al., 2024;Yan et al., 2024;Zhao et al., 2024), wine Gao et al., 2024;Jiang et al., 2024;Qin et al., 2024;Wang, Yin, Shao, et al., 2023;Xi et al., 2024;Zhang, Liu, et al., 2024), soda (Barba et al., 2024), a fruit by-product (Luo et al., 2024), and processed oil Lee et al., 2024;Lin et al., 2024;. ...

Impact of indigenous Oenococcus oeni and Lactiplantibacillus plantarum species co-culture on Cabernet Sauvignon wine malolactic fermentation: Kinetic parameters, color and aroma

Food Chemistry X

... The above study illustrates that the expression level of ER stress related proteins is associated with the occurrence and development of cervical cancer. ER stress activation might have a high value in the treatment and prognosis of cervical cancer and has good clinical prospects [71,72]. ...

Endoplasmic Reticulum Stress Could Predict the Prognosis of Cervical Cancer and Regulate the Occurrence of Radiation Mucositis

Dose-Response

... Therefore, it is necessary to randomly sample the negative samples equal to the number of positive samples before training models. The sampling strategy that ranks samples according to the similarity matrix and selects the least similar ones as negative samples is more reliable than the random sampling strategy (Deng, Fan, et al., 2022). ...

Dual-Channel Heterogeneous Graph Neural Network for Predicting microRNA-Mediated Drug Sensitivity
  • Citing Article
  • November 2022

Journal of Chemical Information and Modeling

... With the rapid development of genomic technologies, there are gene expression data available in public databases for many biological and medical research questions. Integrating multisource data can overcome the problem of insufficient representativeness of target data, providing us with opportunities to explore the mechanisms of gene regulation and to understand the occurrence and progression of diseases [1,2]. However, due to the heterogeneity of biological samples from different sources, how to integrate the useful information from source data to improve estimation and prediction performance is a key challenge for the analysis and application of gene expression data [3,4]. ...

MSPCD: predicting circRNA-disease associations via integrating multi-source data and hierarchical neural network

BMC Bioinformatics

... Since the 1930s, bi- and/or trimetallic Co(Ni)Mo(W) catalysts supported on porous materials like alumina have been used [1][2][3]. However, the existing trends towards increasing the depth of oil refining, which require introduction of heavier secondary fractions in the processing along with straight-run fractions, make it necessary to improve the catalytic systems or tighten the process conditions. ...

Active phase morphology engineering of NiMo/Al2O3 through La introduction for boosting hydrodesulfurization of 4,6-DMDBT
  • Citing Article
  • September 2022

Petroleum Science

... However, these experimental assays are often time-consuming, technically complex and costly. To address these challenges, some computational methods have emerged as viable alternatives to predict peptide-receptor bindings 25 35 and ATMTCR 36 . The ImmRep 2022 TCR-epitope specificity workshop released a dataset to benchmark the performance of more than ten predictive methods for pTCR bindings 37 . ...

Attention-aware contrastive learning for predicting T cell receptor–antigen binding specificity
  • Citing Article
  • September 2022

Briefings in Bioinformatics