Xin-Hui Xing’s research while affiliated with Tsinghua University and other places


Publications (267)


Health Engineering as a leading platform for engineering health and deconstructing disease
  • Article

May 2025

·

2 Reads

Xin-Hui Xing

·

Peter Lobie

·

Xitao Li



Understanding genetic factors that confer advantages to bacteria colonizing a non-native host

July 2024

·

76 Reads

Animals have a strong, reciprocal relationship with the microbes in their gut, but little is known about what traits allow potentially symbiotic bacteria to colonize new hosts. Given the high degree of selective pressure exerted on microbes in a digestive tract, some studies have used serial passage to examine what traits enable microbes to associate with hosts. To raise the mutation rate, a recent study combined serial passage with atmospheric and room temperature plasma (ARTP) mutagenesis. The researchers generated mutant libraries of Snodgrassella from bumblebees (Bombus terrestris) using ARTP, and then passed the libraries through a non-native host, gnotobiotic honeybees (Apis mellifera). Strains with mutations in the mutual gliding locus had a competitive advantage in the honeybees but not bumblebees. Further tests suggested that these mutations promoted colonization by altering the cell’s Type IV pili-dependent motility. While more research is needed, this study demonstrates that ARTP mutagenesis can be used to rapidly identify key microbial mutations, and it paves the way for broader application in evolutionary microbiota research.


Colonization of Snodgrassella communis mutants obtained from ARTP treatment in non-native host species. A Schematic diagram of the input mutant library generation using the ARTP biological breeding system. B Effect of ARTP treatment on the mortality rate of S. communis. C Schematic of in vivo colonization assays. Age-controlled bees were inoculated with different amounts of WT bacteria and input mutant library (10⁶, 10⁷, 10⁹ CFU). D Changes in the bacterial loads of bee individuals (n = 6) with different inoculation levels. Statistical analysis was performed using Student’s t-test. E Evaluation of the bacterial colonization for bees fed with 10⁹ CFU bacterial mutants. Each dot represents a bee lineage. Statistical analysis was performed using two-way ANOVA. F Schematic of the serial passage experiment. The bees were first colonized with the input mutant library, and the gut homogenates were used for subsequent serial transfer. G Evaluation of the bacterial colonization for bees during the passage experiment. Each dot represents a bee lineage. Statistical analysis was performed using two-way ANOVA
Characterization of the mutations during in vivo passage. A The total number of mutations (y-axis, bars) and frequencies of mutations (z-axis, dots) for all samples; each sample is a mixture of ~ 300 bacterial colonies from the gut homogenate of 10–15 bee individuals. B The non-metric multidimensional scaling analysis (NMDS) using the Bray–Curtis dissimilarity index considering the frequency of the mutations in each sample. The differences were analyzed for statistical significance using permutational multivariate ANOVA (PERMANOVA). C, D Overview of mutation genotype (C) and the predicted effect of the mutations (D). Each bar presents situations of the mutations in every lineage
The point mutation occurring in mglB may confer competitive advantages. A The positions and frequencies of the detected mutations. Each data point represents a mutation, and colors indicate different groups. Dashed circles indicate overlapping data points. B Mutation frequency of mglB G78R(GGA → AGA) in lineage 2 samples. C Verification of the presence and percentage of the variants by PCR from isolated strains targeting the polymorphism in mglB. D Protein architecture of MglB from S. communis B10998. Specific mutations in the evolved isolates are labeled. E The Missense3D analysis of the variant potentially causing structural alteration. The structure of MglB was predicted by modeling against the solved PDB crystal structure (PDB number 7CY1, resolution 2.19 Å)
The mgl operon in Snodgrassella. A Distribution patterns of the mgl operon in different bacterial groups. Bacteria were grouped according to the phylogeny of the MglA sequences [44]. B Maximum-likelihood phylogeny of the MglB from Snodgrassella strains (see also Additional file 3: Fig. S4). C Genome-wide phylogenetic tree of a subset of bacterial strains containing the group 3 mgl and the gene arrangement of the loci
The mutation identified in mglB affects cell motility and in vivo competitive advantage. A–F Colony expansion assay of the WT (A, B), the background mutant SA01065’ (C, D), and mutant SA01065 (E, F) on 0.5% agar. G–J In vivo competition assays in the non-native Apis mellifera (G, H) and the native host Bombus terrestris (I, J). Two sets of experiments were conducted, including WT versus SA01065 and SA01065’ versus SA01065. Single-colony PCR was performed to identify the different variant types of S. communis B10998 in the population
Identification of the mutual gliding locus as a factor for gut colonization in non-native bee hosts using the ARTP mutagenesis
  • Article
  • Full-text available

May 2024

·

94 Reads

·

5 Citations

Background The gut microbiota and their hosts profoundly affect each other’s physiology and evolution. Identifying host-selected traits is crucial to understanding the processes that govern the evolving interactions between animals and symbiotic microbes. Current experimental approaches mainly focus on model bacteria, such as hypermutating Escherichia coli, or on the evolutionary changes of wild strains through host transmission. A method called atmospheric and room temperature plasma (ARTP) may overcome the bottleneck of low spontaneous mutation rates while maintaining mild conditions for the gut bacteria. Results We established an experimental symbiotic system with gnotobiotic bee models to unravel the molecular mechanisms promoting host colonization. By in vivo serial passage, we tracked the genetic changes of ARTP-treated Snodgrassella strains from Bombus terrestris in the non-native honeybee host. We observed that passaged isolates showing genetic changes in the mutual gliding locus have a competitive advantage in the non-native host. Specifically, alleles in the orphan mglB, the GTPase activating protein, promoted colonization potentially by altering the type IV pili-dependent motility of the cells. Finally, competition assays confirmed that the mutations out-competed the ancestral strain in the non-native honeybee gut but not in the native host. Conclusions Using the ARTP mutagenesis to generate a mutation library of gut symbionts, we explored the potential genetic mechanisms for improved gut colonization in non-native hosts. Our findings implicate cell mutual-gliding motility in host association and provide an experimental system for future study on host-microbe interactions.


GLiDe: a web-based genome-scale CRISPRi sgRNA design tool for prokaryotes

March 2024

·

54 Reads

Background CRISPRi screening has become a powerful approach for functional genomic research. However, the off-target effects resulting from the mismatch tolerance between sgRNAs and their intended targets are a primary concern in CRISPRi applications. Results We introduce Guide Library Designer (GLiDe), a web-based tool specifically created for the genome-scale design of sgRNA libraries tailored for CRISPRi screening in prokaryotic organisms. GLiDe incorporates a robust quality control framework, rooted in prior experimental knowledge, ensuring the accurate identification of off-target hits. It boasts an extensive built-in database, encompassing 1,397 common prokaryotic species as a comprehensive design resource. Conclusions GLiDe provides the capability to design sgRNAs for newly discovered organisms. We further demonstrated that GLiDe exhibits enhanced precision in identifying off-target binding sites for the CRISPRi system.


Pooled CRISPR interference screening identifies crucial transcription factors in gas-fermenting Clostridium ljungdahlii

February 2024

·

66 Reads

Gas-fermenting Clostridium species hold tremendous promise for one-carbon biomanufacturing. To unlock their full potential, it is crucial to unravel and optimize the intricate regulatory networks that govern these organisms; however, this aspect is currently underexplored. In this study, we employed pooled CRISPR interference (CRISPRi) screening to uncover a wide range of functional transcription factors (TFs) in Clostridium ljungdahlii, a representative species of gas-fermenting Clostridium, with a special focus on the TFs associated with the utilization of carbon resources. Among the 425 TF candidates, we identified 75 and 68 TF genes affecting the heterotrophic and autotrophic growth of C. ljungdahlii, respectively. We focused on two of the screened TFs, NrdR and DeoR, and revealed their pivotal roles in the regulation of deoxyribonucleotides (dNTPs) supply, carbon fixation, and product synthesis in C. ljungdahlii, thereby influencing the strain performance in gas fermentation. Based on this, we proceeded to optimize the expression of deoR in C. ljungdahlii by adjusting its promoter strength, leading to improved growth rate and ethanol synthesis of C. ljungdahlii when utilizing syngas. This study highlights the effectiveness of pooled CRISPRi screening in gas-fermenting Clostridium species, expanding the horizons for functional genomic research in these industrially important bacteria.


"Two-Birds-One-Stone" Oral Nanotherapeutic Designed to Target Intestinal Integrins and Regulate Redox Homeostasis for Ulcerative Colitis Treatment

January 2024

·

20 Reads

Designing highly efficient orally administered nanotherapeutics with specific inflammatory site-targeting functions in the gastrointestinal (GI) tract for ulcerative colitis (UC) management is a significant challenge. Straightforward and adaptable modular multifunctional nanotherapeutics represent groundbreaking advancements and are crucial to promoting broad application in both academic research and clinical practice. In this study, we focused on exploring a specific targeting modular and functional oral nanotherapy, serving as "one stone", for the directed localization of inflammation and the regulation of redox homeostasis, thereby achieving effects against "two birds" for UC treatment. The designed nanotherapeutic agent OPNs@LMWH, which has a core-shell structure composed of oxidation-sensitive epsilon-polylysine nanoparticles (OPNs) in the core and low-molecular-weight heparin (LMWH) in the shell, exhibited specific active targeting effects and therapeutic efficacy simultaneously. We qualitatively and quantitatively confirmed that OPNs@LMWH possessed high integrin alpha M-mediated immune cellular uptake efficiency and preferentially accumulated in inflamed lesions. Compared with bare OPNs, OPNs@LMWH exhibited enhanced intracellular reactive oxygen species (ROS) scavenging and anti-inflammatory effects. After oral administration of OPNs@LMWH to mice with dextran sulfate sodium (DSS)-induced colitis, robust resilience was observed. OPNs@LMWH effectively ameliorated oxidative stress and inhibited the activation of inflammation-associated signalling pathways while simultaneously bolstering the protective mechanisms of the colonic epithelium. Overall, these findings underscore the compelling dual functionalities of OPNs@LMWH, which enable effective oral delivery to inflamed sites, thereby facilitating precise UC management.


Base editor-mediated large-scale screening of functional mutations in bacteria for industrial phenotypes

January 2024

·

45 Reads

·

1 Citation

Science China. Life sciences

Base editing, the targeted introduction of point mutations into cellular DNA, holds promise for improving genome-scale functional genome screening to single-nucleotide resolution. Current efforts in prokaryotes, however, remain confined to loss-of-function screens using premature stop codon-mediated gene inactivation libraries, which fall far short of fully releasing the potential of base editors. Here, we developed a base editor-mediated functional single nucleotide variant screening pipeline in Escherichia coli. We constructed a library with 31,123 sgRNAs targeting 462 stress response-related genes in E. coli, and screened for adaptive mutations under isobutanol and furfural selective conditions. Guided by the screening results, we successfully identified several known and novel functional mutations. Our pipeline might be expanded to the optimization of other phenotypes or the strain engineering in other microorganisms.


Fig. 1. Schematic overview of the dSort-Seq data workflow. (A) During Sort-Seq, a library with different expression patterns is sorted into customized bins based on the fluorescence intensity value. (B) The mixing coefficients are quantified via NGS. (C) The overall fluorescence density is measured by FCM, and the sorting boundaries are specified on the basis of the overall fluorescence intensity density. (D) The read count number across all bins as quantified by NGS reveals the binned distribution of each variant in the library. (E) Through parameter learning, the mean, expression noise, and their relationships can be precisely identified. μ, mean; σ, SD.
Fig. 2. Framework and performance of dSort-Seq. (A to C) Two-component log mixture of Gaussians better represents the gene expression distribution than conventional methods. (A) Gene expression controlled by the LmrA repressor (36). The histogram denotes the cytometry data of the unrepressed state. YFP, yellow fluorescent protein; a.u., arbitrary units. (B) Gene expression under the control of the tnaC variant K11R_CGC (28). The data were measured under 100 μM Ala-Trp. eGFP, enhanced green fluorescent protein. (C) Gene expression driven by the promoter yebVp2 (this study). In (A) to (C), the gamma and log-normal distributions were matched using MLE, and the LGMM was fitted via the expectation-maximization algorithm. The red, cyan, and brown lines represent the fitting result of the two-component log mixture of Gaussian, log-normal, and gamma distributions, respectively. (D) Graphical representation of the model. (E) Theoretical fraction of the probability density within the corresponding boundaries. (F) Matching the mixture of two-component Gaussian mixture models to the overall fluorescence intensity distribution. The real data are sampled from experimental cytometry data; the fake data are generated from the LGMM. A fully connected neural network is used as a discriminator. (G) Example (V8A_GCC, 0 μM Ala-Trp, replicate 1) illustrating the superior performance of dSort-Seq in matching the binned distribution compared to the log-normal-based method. Kullback-Leibler divergence shows the performance of each fit. (H) Example (100 μM Ala-Trp, replicate 1) illustrating the superior performance of dSort-Seq in matching the overall fluorescence distribution compared to the log-normal-based method. In (G) and (H), the red and cyan distributions refer to the results derived from dSort-Seq and the log-normal-based method, respectively. The gray distribution refers to the real data. 
(I to L) Individually analyzed expression characteristics of reconstructed tnaC variants by cytometry highly correlated with those estimated via dSort-Seq in terms of means [(I) 0 μM Ala-Trp, n = 26; (K) 100 μM Ala-Trp, n = 30] and SDs [(J) 0 μM Ala-Trp; (L) 100 μM Ala-Trp].
Fig. 3. The dSort-Seq profiling of FapR-fapO-based malonyl-CoA-dependent gene expression. (A) Sort-Seq characterization of the malonyl-CoA biosensor library under six different cerulenin concentrations (0, 1, 2, 3, 5, and 8 mg/liter). Cells were sorted into eight bins according to their responses to ligand. Two biological replicates were examined for each Sort-Seq experiment. (B) Schematic diagram of the machine learning process. Gradient boosting regression was used here to interpret the relationship between features and expression strengths. The hyperparameters were optimized through fivefold cross-validation; then, the whole training dataset was used to train the model parameters, and the test dataset was used to evaluate the generalization capacity of the model. Last, the model was trained on the entire observed dataset to obtain predictions for unobserved data. (C) The model performance in the test dataset showed a good generalization capacity (n = 3077). (D) Gini importance that contributes to the gradient boosting regression tree. (E) Dose-response curves of 10 combinations with substantial dynamic ranges. Data points represent the mean values of YPet/mCherry under different cerulenin concentrations, where red dots represent individual characterization data, cyan stars represent data from dSort-Seq characterizations, and blue stars denote data from machine learning predictions. The dashed lines represent response curves fitted by the Hill equation (see Materials and Methods). The dynamic ranges of dSort-Seq calculations and individual characterizations are labeled on each panel.
Fig. 4. The dSort-Seq profiling of transcriptional and translational effects on noise production in E. coli K12 MG1655. (A) Design schemes of the promoter and the combination libraries. TSS, transcriptional start site. (B) According to the translational bursting mechanism, steady-state protein production follows a gamma distribution (3). In this distribution, the parameter "a" represents the transcription rate and "b" represents the translation rate. As a corollary, the burst size, denoted by the Fano factor, is linearly correlated with the translation rate and independent of the transcription rate. (C) According to the hierarchical Bayesian model, the intercept in the relationship between noise strength and the mean expression level is proportional to translational strength, indicating that translational bursting still dominates noise production at low expression levels (31). In this figure, b₁ and b₂ (b₁ < b₂) denote the translation rate, and C₁ and C₂ are constants. (D) The noise strength is linearly correlated with the mean expression level when only transcriptional strength varies. The gray line exhibits the linear regression result, which is shaded to show the 95% confidence interval. (E) The relationships between noise strength and mean expression level are similar when the translation module varies. The gray, orange, and green lines represent the regressions of all combinations and combinations with RBS apFAB864 and apFAB820, respectively. [RBS strength: apFAB820 (1.57) > apFAB864 (0.36); see Materials and Methods]. (F) The linear regression slopes (red dots) and intercepts (blue dots), obtained through the least squares regression method, do not exhibit a positive correlation with RBS strength.
Fig. 5. Overlapping RpoD-binding sites result in high expression noise. (A and B) Correlation of the expression noise with expression strength in (A) promoter and (B) combination libraries. At low mean expression levels, the noise decreases as the expression strength increases; at high mean expression levels, the noise converges to a constant value. The blue lines show the regression results (see Materials and Methods). Twenty promoters and 25 combinations exhibiting high expression noise are marked as red dots. (C) Twenty sequences from the promoter library and (D) 25 sequences from the combination library showing high expression noise were constructed and assayed through FCM. Their expression noise (red dots) is higher than that of randomly selected variants (black dots) at their corresponding mean expression levels. (E) Promoters with higher levels of expression noise residuals were enriched in thymine. (F) Promoters with more RpoD-binding sites exhibited increased expression noise residuals. The respective box plots were annotated with the median residuals atop each group. (G) Design scheme of 25 tandem promoters, each containing two overlapping RpoD-binding sites. (H) Design scheme of five constitutive promoters with the same length as the tandem promoter, each with only one RpoD-binding site. (I) Compared to promoters with a single RpoD-binding site, the tandem promoters exhibited significantly higher expression noise (P = 2.40 × 10⁻⁴, one-tailed t test), especially when the stronger promoter was positioned upstream of the weaker promoter. Red dots represent tandem promoters where the stronger promoter was located upstream of the weaker promoter, blue dots represent tandem promoters where the stronger promoter was located downstream of the weaker promoter, and gray dots denote promoters with a single RpoD-binding site.
Data within the dashed box were included in the analysis because the promoters in this region had similar mean expression levels but exhibited varying levels of expression noise.
Deep-learning–assisted Sort-Seq enables high-throughput profiling of gene expression characteristics with high precision

November 2023

·

98 Reads

·

4 Citations

Science Advances

Owing to the nondeterministic and nonlinear nature of gene expression, the steady-state intracellular protein abundance of a clonal population forms a distribution. The characteristics of this distribution, including expression strength and noise, are closely related to cellular behavior. However, quantitative description of these characteristics has so far relied on arrayed methods, which are time-consuming and labor-intensive. To address this issue, we propose a deep-learning–assisted Sort-Seq approach (dSort-Seq) in this work, enabling high-throughput profiling of expression properties with high precision. We demonstrated the validity of dSort-Seq for large-scale assaying of the dose-response relationships of biosensors. In addition, we comprehensively investigated the contribution of transcription and translation to noise production in Escherichia coli, from which we found that the expression noise is strongly coupled with the mean expression level. We also found that the transcriptional interference caused by overlapping RpoD-binding sites contributes to noise production, which suggested the existence of a simple and feasible noise control strategy in E. coli.
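The figure captions for this article mention fitting a two-component log mixture of Gaussians to fluorescence data with the expectation-maximization algorithm. The block below is a minimal sketch of that general technique on synthetic data, using only the Python standard library. It is not the authors' dSort-Seq implementation: the function names (`normal_pdf`, `fit_log_gmm2`), the quartile-based initialisation, the iteration cap, and the simulated intensities are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def fit_log_gmm2(intensities, n_iter=100, tol=1e-9):
    """Fit a two-component Gaussian mixture to log10(intensity) by EM.

    Hypothetical helper: returns (weights, means, sds), all in log10 units.
    Intensities must be strictly positive.
    """
    x = sorted(math.log10(v) for v in intensities)
    n = len(x)
    # Assumed initialisation: quartiles as means, overall spread as both SDs.
    mu = [x[n // 4], x[(3 * n) // 4]]
    mean = sum(x) / n
    spread = math.sqrt(sum((v - mean) ** 2 for v in x) / n) or 1e-6
    sigma = [spread, spread]
    w = [0.5, 0.5]
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each point.
        resp = []
        ll = 0.0
        for v in x:
            p0 = w[0] * normal_pdf(v, mu[0], sigma[0])
            p1 = w[1] * normal_pdf(v, mu[1], sigma[1])
            total = max(p0 + p1, 1e-300)  # guard against underflow
            resp.append(p0 / total)
            ll += math.log(total)
        # M-step: re-estimate weights, means and SDs from responsibilities.
        n0 = max(sum(resp), 1e-12)
        n1 = max(n - n0, 1e-12)
        w = [n0 / n, n1 / n]
        mu = [
            sum(r * v for r, v in zip(resp, x)) / n0,
            sum((1.0 - r) * v for r, v in zip(resp, x)) / n1,
        ]
        sigma = [
            math.sqrt(sum(r * (v - mu[0]) ** 2 for r, v in zip(resp, x)) / n0) + 1e-9,
            math.sqrt(sum((1.0 - r) * (v - mu[1]) ** 2 for r, v in zip(resp, x)) / n1) + 1e-9,
        ]
        # Stop once the log-likelihood no longer changes.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return w, mu, sigma


# Synthetic example: two populations in log10 space (values are made up).
random.seed(0)
logs = [random.gauss(2.0, 0.2) for _ in range(600)]
logs += [random.gauss(3.5, 0.3) for _ in range(1400)]
weights, means, sds = fit_log_gmm2([10 ** v for v in logs])
print("weights:", weights)
print("means (log10):", means)
print("sds (log10):", sds)
```

Working in log10 space keeps the two populations roughly symmetric, which is why a plain Gaussian mixture is a reasonable stand-in here; real cytometry data would need the gating and binning steps the paper describes.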


Citations (66)


... The preparation of protein hydrolysates was performed according to our previous methods (48). Briefly, hemp seed oil was removed with ethanol, protein was extracted with 0.8 mol/L NaCl aqueous solution (pH 7.0) following the enzymatic hydrolysis of hemp seed protein, and the details of the enzymatic hydrolysis conditions are displayed in Table S2. ...

Reference:

A novel bifunctional peptide VAMP mined from hemp seed protein hydrolysates improves glucose homeostasis by inhibiting intestinal DPP-IV and increasing the abundance of Akkermansia muciniphila
Comparison of physicochemical characteristics, functional properties and biological activities of hemp seed proteins by different extraction methods
  • Citing Article
  • August 2024

... These findings underline the importance of bacterial mobility in symbiosis and provide new avenues for investigating host-microbe interactions using novel experimental paradigms. 126 Based on the findings presented, it is evident that numerous variables affect microbiota composition, complicating the interpretation of results. The overall composition of the microbiota is influenced by various doi: 10.36922/ejmo.8318 ...

Identification of the mutual gliding locus as a factor for gut colonization in non-native bee hosts using the ARTP mutagenesis

... In the cytometry analysis, the fluorescence intensity distribution was log10-transformed and fitted to a two-component Gaussian mixture model [34] with parameters ( , µ₁, µ₂, σ₁, σ₂) through the expectation-maximization algorithm. Here, and 1 − represent the mixing coefficients of the two Gaussian components, µ₁, µ₂, σ₁ and σ₂ represent the mean and standard deviation of the first and second Gaussian component, respectively (Eq. 1). ...

Deep-learning–assisted Sort-Seq enables high-throughput profiling of gene expression characteristics with high precision

Science Advances

... Because the PISs represented the results of the LC-MS/MS characterization, the results above indicated the high quality of the data [55]. After a systematic literature search, only WVL has been previously reported to have DPP-IV inhibitory activity [56]. The 17 potentially novel DPP-IV inhibitory peptides were used for subsequent analysis. ...

Exploration of DPP-IV Inhibitory Peptide Design Rules Assisted by the Deep Learning Pipeline That Identifies the Restriction Enzyme Cutting Site

ACS Omega

... Also, a method of co-cultivation of microorganisms was used for selection of droplets: one developed a FADS co-culture pipeline for improving erythritol production in Y. lipolytica: the picoinjection of fluorescence-based erythritol-biosensing E. coli was used to make fluorescence droplets [199]. ...

Establishment of picodroplet-based co-culture system to improve erythritol production in Yarrowia lipolytica
  • Citing Article
  • July 2023

Biochemical Engineering Journal

... Hemp-seed-derived inhibitors of DPP-IV demonstrates potential as novel therapeutics for diabetes. Sixteen DPP-IV inhibitory peptides are screened from HSP by molecular docking, and INS-1 cells experiments further validate their bioactivity on inhibiting cellular DPP-IV, enhancing glucagon-like peptide-1 (GLP-1) levels, and improving insulin secretion [100]. A tetrapeptide VAMP, mined by molecular docking and machine learning methods, could strongly inhibit DPP-IV (IC₅₀ = 1.00 µM in vitro) and improve glucose metabolism in obese mice by increasing GLP-1 secretion and promoting the growth of gut microbial Akkermansia muciniphila [14]. ...

Mining and Validation of Novel Hemp Seed-Derived DPP-IV-Inhibiting Peptides Using a Combination of Multi-omics and Molecular Docking
  • Citing Article
  • April 2023

Journal of Agricultural and Food Chemistry

... combinations of two or more regulatory targets do not always result in strains with improved performance 8,9 . Henceforth, it is imperative to develop an efficient rational strategy for combining synergistic multitarget in strain improvement. ...

CRISPRi-microfluidics screening enables genome-scale target identification for high-titer protein production and secretion
  • Citing Article
  • December 2022

Metabolic Engineering

... Modular design tools, data management systems, and models have been integrated into the DBTL cycle to support the initial design phase [25,26]. The build and testing phases, which involve DNA assembly, molecular cloning, and strain analysis, are becoming increasingly automated with advanced genetic engineering tools [25,27–31]. Finally, the learning phase incorporates both traditional statistical evaluations and model-guided assessments, including machine learning techniques, to refine strain performance [32,33]. ...

Single‐cell microliter‐droplet screening system (MISS Cell): An integrated platform for automated high‐throughput microbial monoclonal cultivation and picking

... By treating the combination process as a Markov chain model, we can quantitatively calculate the thermodynamics of the sgRNA/DNA binding process based on fundamental thermodynamic parameters (nearest-neighbor parameters) [27]. We have further gathered nearest-neighbor parameters for the remaining 12 RNA/DNA single internal mismatches [55] and incorporated these data into the quantitative CRISPRi design tool we previously introduced (https:// www. thu-big. ...

Thermodynamic Parameters Contributions of Single Internal Mismatches In RNA/DNA Hybrid Duplexes

... Our results also suggest that future metabolic engineering efforts should focus on enhancing the supply of reducing power within the strain to meet the high demand for reducing equivalents in pleuromutilin biosynthesis, with an aim of achieving even higher titers. Additionally, nutritional supplementation may contribute to NAD(P)H regulation [27,28]. Overall, this study is the first to apply metabolic engineering strategies for increased pleuromutilin production in a native strain, guided by an integrated transcriptional and metabolite analysis approach. ...

Sodium formate redirects carbon flux and enhances heterologous mevalonate production in Methylobacterium extorquens AM1