William Rostain’s research while affiliated with Université Paris Cité and other places


Publications (16)


Experimental validation of chimeras between SpyCas9 and the PAM-interacting domain (PID) of other natural variants
a: Genetic circuit used to test Cas9 PID variants. The dCas9 gene is guided to silence an operon that consists of a mCherry reporter and the SacB counter-selection marker. The gene construct was designed to enable the easy exchange of PID domains (see Methods). b: Serial dilution and spotting of E. coli MG1655 carrying the wild-type SpydCas9 or dCas9 without a functional PID in the absence (left) or presence (right) of sucrose. c: The activity of dCas9 chimeras is reported as the normalized repression of the mCherry fluorescence. Chimeras were tested against targets with a PAM recognized by the PID they carry [37]. This score is normalized such that 1 corresponds to the activity of the WT and 0 to the negative control. The percentage of identity to the SpyCas9 PID is reported. We flagged with a # the PIDs which did not recognize the TGG PAM and were therefore tested with another PAM.
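The normalization described in panel c can be sketched as a linear rescaling. This is only an illustration: the caption states that 1 corresponds to the WT and 0 to the negative control, but the exact formula is an assumption here, and the function name `normalized_repression` and the numbers are hypothetical.

```python
def normalized_repression(fluo, fluo_wt, fluo_neg):
    """Rescale a fluorescence reading so that WT maps to 1 and the negative control to 0.

    Assumes repression lowers mCherry fluorescence, so the WT reading is the low one.
    """
    span = fluo_neg - fluo_wt
    if span == 0:
        raise ValueError("WT and negative control readings must differ")
    return (fluo_neg - fluo) / span


# Illustrative numbers only (arbitrary units), not data from the study.
wt, neg = 100.0, 1000.0
readings = {"WT": 100.0, "negative control": 1000.0, "chimera": 550.0}
scores = {name: normalized_repression(f, wt, neg) for name, f in readings.items()}
```

With this convention a chimera halfway between the two controls scores 0.5, and a chimera that represses more strongly than the WT scores above 1.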
Schematic representation of a standard Restricted Boltzmann Machine
The RBM is a probabilistic graphical model with two layers: the visible layer carries protein sequences x and the hidden layer encodes the latent vector h. The graph is undirected, allowing one (i) to sample from the visible layer to the representation layer using the conditional distribution p(h|x), which depends on x, on the weight matrix W and on the potentials U on the hidden units, and (ii) to sample from the hidden layer to the visible layer using the conditional distribution p(x|h), which depends on h, on the weight matrix W and on the potentials g representing the prevalence of the amino acids at each position.
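The two conditional sampling directions can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' model: hidden units are taken to be binary with a bias `b` standing in for the potentials U, the visible layer is one categorical variable per position, and all sizes and parameter values are made up.

```python
import math
import random


def sample_hidden(x, W, b, rng):
    """Sample h ~ p(h|x) for binary hidden units (an assumption of this sketch)."""
    h = []
    for j in range(len(b)):
        # Input to hidden unit j: bias plus the weights selected by the residue at each position.
        field = b[j] + sum(W[i][a][j] for i, a in enumerate(x))
        p_on = 1.0 / (1.0 + math.exp(-field))
        h.append(1 if rng.random() < p_on else 0)
    return h


def sample_visible(h, W, g, rng):
    """Sample x ~ p(x|h): one categorical draw per sequence position."""
    x = []
    for i in range(len(g)):
        logits = [g[i][a] + sum(W[i][a][j] * hj for j, hj in enumerate(h))
                  for a in range(len(g[i]))]
        m = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - m) for l in logits]
        x.append(rng.choices(range(len(logits)), weights=weights)[0])
    return x


rng = random.Random(0)
n_pos, n_sym, n_hidden = 6, 4, 3  # toy sizes, far smaller than a real protein alignment
W = [[[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_sym)] for _ in range(n_pos)]
g = [[0.0] * n_sym for _ in range(n_pos)]
b = [0.0] * n_hidden
x0 = [rng.randrange(n_sym) for _ in range(n_pos)]
h = sample_hidden(x0, W, b, rng)   # visible -> hidden
x1 = sample_visible(h, W, g, rng)  # hidden -> visible
```

Alternating the two calls gives one step of block Gibbs sampling, which is possible because the undirected bipartite graph makes the units within a layer conditionally independent given the other layer.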
Semi-supervision helps the RBM learn useful representations of Cas9 PID
a: Our Semi-Supervised Learning RBM with a one-layer classifier. This classifier takes as input the representation of the sequence in the hidden layer, and outputs the predicted PAM. b: Area under the ROC curve for the prediction of the PAM on the validation set. The curve is smoothed by averaging over 20 consecutive values. The shaded area shows the standard deviation over these 20 consecutive values of γ.
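The smoothing in panel b can be sketched as a sliding window over consecutive values of γ. The window alignment and the use of the population standard deviation are assumptions of this sketch, and the AUC values below are synthetic.

```python
import statistics


def rolling_mean_std(values, window=20):
    """Mean and standard deviation over each run of `window` consecutive values."""
    means, stds = [], []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        means.append(statistics.fmean(chunk))
        stds.append(statistics.pstdev(chunk))
    return means, stds


# Synthetic AUC values standing in for one value per tested γ.
auc = [0.5 + 0.004 * k for k in range(50)]
means, stds = rolling_mean_std(auc)
```

The `means` list would be plotted as the curve and `stds` as the half-width of the shaded band around it.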
Constrained Langevin Dynamics as a sampling method
a: The Constrained Langevin Dynamics in the representation space consists of two steps. The first step is a round of sampling to evaluate the gradient of the main criterion and the gradient of the control criterion through the expectation formula (see Methods); the second is a random step following the direction given by these two gradients (orthogonal projection of the main direction vector with respect to the control direction). Brownian noise is added to create randomness and diversity in the samples. b: Typical random walks obtained starting from the WT SpyCas9 PID and progressively targeting an RBM energy between -0.267 and -0.257 and a Hamming distance to the WT from 50 to 55 amino acids. The black lines represent the time-dependent target intervals along the random walk.
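The second step of panel a, a move along the main gradient with its component along the control gradient removed plus Brownian noise, can be sketched as follows. This is a simplified illustration: the gradients are passed in directly rather than estimated by sampling, descent on the main criterion is assumed, and the names `project_out`, `langevin_step`, `step` and `temperature` are hypothetical.

```python
import math
import random


def project_out(main, control):
    """Remove from `main` its component along `control` (orthogonal projection)."""
    norm2 = sum(c * c for c in control)
    if norm2 == 0.0:
        return list(main)
    coeff = sum(m * c for m, c in zip(main, control)) / norm2
    return [m - coeff * c for m, c in zip(main, control)]


def langevin_step(h, grad_main, grad_control, step, temperature, rng):
    """One noisy step in representation space that leaves the control criterion unchanged to first order."""
    direction = project_out(grad_main, grad_control)
    noise_scale = math.sqrt(2.0 * step * temperature)
    return [hi - step * d + noise_scale * rng.gauss(0.0, 1.0)
            for hi, d in zip(h, direction)]


rng = random.Random(1)
d = project_out([1.0, 2.0, 3.0], [0.0, 1.0, 0.0])
h_new = langevin_step([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.0, 1.0, 0.0],
                      step=0.1, temperature=0.0, rng=rng)
```

Setting `temperature` to zero switches the noise off, which makes the projection easy to check by hand; a positive value adds the random component that diversifies the samples.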
Generative capacities of the SSL-RBM
a: We tested the generative capacities of our models by generating 120 sequences with Constrained Langevin Dynamics using the trained SSL-RBM. We then used FoldX to compute the energy (displayed as ΔΔG, change in stability with respect to the wild-type) of the protein-DNA complex for the generated sequences and evaluate their quality. b: Distributions of FoldX energies of sequences generated with increasing values of γ. Distributions are drawn in gray using Gaussian kernel density estimation; quantiles are also displayed for each distribution. These quantiles and distributions show that, overall, SSL-RBMs trained with intermediate values of γ tend to generate sequences with better (lower) FoldX energies.


Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment
  • Article
  • Full-text available

November 2023 · 42 Reads · 4 Citations · William Rostain · Florence Depardieu · [...]

We present here an approach to protein design that combines (i) scarce functional information such as experimental data, (ii) evolutionary information learned from natural sequence variants and (iii) physics-grounded modeling. Using a Restricted Boltzmann Machine (RBM), we learn a sequence model of a protein family. We use semi-supervision to leverage available functional information during the RBM training. We then propose a strategy to explore the protein representation space that can be informed by external models such as an empirical force-field method (FoldX). Our approach is applied to a domain of the Cas9 protein responsible for recognition of a short DNA motif. We experimentally assess the functionality of 71 variants generated to explore a range of RBM and FoldX energies. Sequences with as many as 50 differences (20% of the protein domain) to the wild-type retained functionality. Overall, 21/71 sequences designed with our method were functional. Interestingly, 6/71 sequences showed an improved activity in comparison with the original wild-type protein sequence. These results demonstrate the interest in further exploring the synergies between machine-learning of protein sequence representations and physics-grounded modeling strategies informed by structural information.
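The number of differences to the wild-type quoted in the abstract is a Hamming distance between aligned sequences. A minimal sketch, using made-up toy sequences rather than real PID sequences and a hypothetical function name:

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Toy sequences for illustration only.
wild_type = "MKTAYIAKQR"
variant = "MKSAYLAKQR"
n_diff = hamming_distance(wild_type, variant)
fraction = n_diff / len(wild_type)
```

By the same arithmetic, 50 differences amounting to 20% of the domain implies a domain of roughly 250 aligned positions.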


Fig 1. Experimental validation of chimeras between SpyCas9 and the PAM-interacting domain of other natural variants a: Genetic circuit used to test Cas9 PID variants. b: Spot assay of the wild-type SpyCas9 and a negative control PID variant in the circuit, on plates containing the inducer DAPG (left) or DAPG and 0.125% sucrose (right). c: Maximum-likelihood tree of the alignment of PID sequences of the tested Cas9 PID chimeras. We also display the normalized repression of each Cas9 chimera (SpydCas9 + variant PID) targeting a motif recognized by the PID. The experimentally determined PAMs are taken from the review by Collias and Beisel (2021) [28], and we flagged with a # the PIDs which did not recognize TGG and were therefore tested with another PAM. Variants that are phylogenetically closer to the SpyCas9 PID are functional (in green), while the more distant ones are not.
Fig 3. Constrained Langevin Dynamics as a sampling method for Restricted Boltzmann Machine a: The Constrained Langevin Dynamics in the representation space consists of two steps. The first step is a round of sampling to evaluate the gradient of the main criterion and the gradient of the control criterion through the expectation formula (see Methods); the second is a random step following the direction given by these two gradients (projection of the main direction vector on the orthogonal of the control direction). Brownian noise is added to create randomness and diversity in the samples. b and c: Typical random walks for designing sequences with different RBM energy and similarity targets.
Fig 4. Semi-supervision helps the RBM learn useful representations of the Cas9 PID a: Our Semi-Supervised Learning RBM with a one-layer classifier. This classifier is directed and takes as input the representation of the sequence in the hidden layer. b: Area under the ROC curve for the prediction of the PAM on the validation set. The curve was smoothed, and we display the standard deviation over 20 consecutive values of γ. A balance between the classifier and the RBM seems to both stabilize the training and improve the accuracy of the classifier.
Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment

March 2023 · 70 Reads · 1 Citation

We present here an approach to protein design that combines evolutionary and physics-grounded modeling. Using a Restricted Boltzmann Machine, we learned a sequence model of a protein family and propose a strategy to explore the protein representation space that can be informed by external models such as an empirical force-field method (FoldX). This method was applied to a domain of the Cas9 protein responsible for recognition of a short DNA motif. We experimentally assessed the functionality of 71 variants that were generated to explore a range of RBM and FoldX energies. We show how a combination of structural and evolutionary information can identify functional variants with high accuracy. Sequences with as many as 50 differences (20% of the protein domain) to the wild-type retained functionality. Interestingly, some sequences (6/71) produced by our method showed an improved activity in comparison with the original wild-type protein sequence. These results demonstrate the interest of further exploring the synergies between machine-learning of protein sequence representations and physics-grounded modeling strategies informed by structural information.


Figure 1. A random guide RNA library identifies toxic seed sequences. (A) Architecture of pTG34, which contains a DAPG-inducible dCas9 and a library of random guide RNAs. (B) Diagram of CRISPRi screen to identify unexpected guide toxicity. (C) Violin plots of log2 FC for guide RNAs carrying different PAM-proximal sequences. The median log2 FC of guide RNAs grouped by (D) their 5-nt PAM-proximal sequence or (E) their 4-nt PAM-proximal sequence is plotted against the −log10(P-values) of a Bonferroni-adjusted Mann-Whitney U test.
Figure 2. The bad-seed toxicity phenomenon is caused by off-target binding in the promoter of essential genes. Differential expression of genes after 3 h expression of dCas9 in the presence of a guide RNA with (A) an AGGAA seed or (B) an ACCCA seed sequence, measured by RNA-seq. (C) Diagram of the glyQS and rpmH promoters, indicating the location of potential 5-nt off-target sites. The genotype of the PAM mut strain is indicated. (D) Serial dilution and spotting of strain AV01 (WT) or TG01 (PAM mut ) expressing dCas9 under the control of a ptet promoter in the presence of various guide RNAs: LacZ (negative control), AGGAA, CGGAA, GGGAA, TGGAA and ACCCA guides with random sequences in the first 15 nt of the guide. (E) Expression of the glyQ gene measured by RT-qPCR in strain AV01 (WT) in the presence of all four NGGAA guide RNAs after 2 h of dCas9 induction. Points show three biological replicates. (F) Violin plot of log2 FC values for guides sharing each of the four NGGAA seed sequences. (G) Serial dilution and spotting of strain AV01 carrying a plasmid expressing rpmH (pTG40) or a control empty vector (pSEVA271), in the presence of the ACCCA guide RNA.
Figure 3. Identification of off-target sites responsible for the toxicity of a few bad-seed sequences. (A) The promoters of alaS, folA, glnS and rpmB are shown and the off-target positions identified for the TGACT, TATAG, GGGAC and AAAGG bad seeds are underlined. Mutations made to the PAM site are shown below the promoter sequence in red. (B) Serial dilution and spotting of strain AV01 (WT) or of the PAM site mutants depicted in panel (A), in the presence of the cognate guide RNA and in the absence or presence of aTc.
Figure 4. ChIP-seq analysis of dCas9 binding to the chromosome of E. coli in the presence of the toxic AGGAA or ACCCA guide RNAs. Normalized read counts showing the binding of dCas9 to (A) the off-target position in the promoter of glyQS in the presence of the AGGAA guide RNA or (B) the off-target position in the promoter of rpmH in the presence of the ACCCA guide RNA. (C) Distribution of peak scores for all peaks over AGGAANGG positions in the chromosome of E. coli. The score of the glyQS off-target is shown with a red bar. (D) Fraction of genome locations for which a ChIP peak is detected. The data are shown for locations with an increasing complementarity to the seed sequence of the AGGAA n1 guide. Points show three biological replicates. (E) Distribution of peak scores for all peaks over ACCCANGG positions in the chromosome of E. coli. The score of the rpmH off-target is shown with a red bar. (F) Fraction of genome locations for which a ChIP peak is detected. The data are shown for locations with an increasing complementarity to the seed sequence of the ACCCA n1 guide. Points show three biological replicates.
Figure 6. A screen performed in 12 Enterobacteriaceae identifies changes in toxic seed sequences. (A) Median log2 FC of most toxic 5-nt seed sequences (dendrogram of hierarchical clustering shown on the left). (B) Sequence of alaS promoter in three strains. (C) Distribution of log2 FC of all guide RNAs with a TGACT or a TTACT seed. (D) Sequence of RpmHp2 in three strains. (E) Distribution of log2 FC of guide RNAs with an ACCCA seed. (F) Sequence of folA promoter in three strains. (G) Distribution of log2 FC of guide RNAs with a TATAG seed.
Cas9 off-target binding to the promoter of bacterial genes leads to silencing and toxicity

March 2023 · 235 Reads · 40 Citations

Nucleic Acids Research

Genetic tools derived from the Cas9 RNA-guided nuclease are providing essential capabilities to study and engineer bacteria. While the importance of off-target effects was noted early in Cas9’s application to mammalian cells, off-target cleavage by Cas9 in bacterial genomes is easily avoided due to their smaller size. Despite this, several studies have reported experimental setups in which Cas9 expression was toxic, even when using the catalytic dead variant of Cas9 (dCas9). Specifically, dCas9 was shown to be toxic when in complex with guide RNAs sharing specific PAM (protospacer adjacent motif)-proximal sequence motifs. Here, we demonstrate that this toxicity is caused by off-target binding of Cas9 to the promoter of essential genes, with silencing of off-target genes occurring with as little as 4 nt of identity in the PAM-proximal sequence. Screens performed in various strains of Escherichia coli and other enterobacteria show that the nature of toxic guide RNAs changes together with the evolution of sequences at off-target positions. These results highlight the potential for Cas9 to bind to hundreds of off-target positions in bacterial genomes, leading to undesired effects. This phenomenon must be considered in the design and interpretation of CRISPR–Cas experiments in bacteria.


Tuning of Gene Expression in Clostridium phytofermentans Using Synthetic Promoters and CRISPRi

November 2022 · 60 Reads · 4 Citations

ACS Synthetic Biology

Control of gene expression is fundamental to cell engineering. Here we demonstrate a set of approaches to tune gene expression in Clostridia using the model Clostridium phytofermentans. Initially, we develop a simple benchtop electroporation method that we use to identify a set of replicating plasmids and resistance markers that can be cotransformed into C. phytofermentans. We define a series of promoters spanning a >100-fold expression range by testing a promoter library driving the expression of a luminescent reporter. By insertion of tet operator sites upstream of the reporter, its expression can be quantitatively altered using the Tet repressor and anhydrotetracycline (aTc). We integrate these methods into an aTc-regulated dCas12a system with which we show in vivo CRISPRi-mediated repression of reporter and fermentation genes in C. phytofermentans. Together, these approaches advance genetic transformation and experimental control of gene expression in Clostridia.



Figure 2. Cas12a gRNA switches enable RNA-triggered DNA targeting in TXTL and in E. coli. (A) Assessing gRNA switch activity in TXTL. DNA encoding dFnCas12a, a gRNA or gRNA switch, an RNA trigger and the deGFP reporter were combined in the reaction, and fluorescence is tracked over time. DNA targeting by dFnCas12a would result in reduced fluorescence. (B) Fluorescence time courses from TXTL. The central line represents the average while the surrounding band represents the standard error from three independently mixed reactions. (C) End-point measurements from the TXTL assays in B as assessed at 16 h. Fluorescence values were normalized so the targeting and non-targeting gRNAs yield 0% and 100% GFP fluorescence, respectively. (D) Assessing gRNA switches in E. coli. E. coli cells harboring plasmids encoding the same components from the TXTL assay were monitored for fluorescence and turbidity over time. All components were constitutively expressed. See B for the legend. The central line represents average while the surrounding band represents the standard error from three independent experiments starting from separate transformants.
Figure 3. An expanded set of gRNA switches respond to their synthetic RNA triggers without crosstalk. (A) Assessing DNA targeting activity of an expanded set of gRNA switches that respond to synthetic RNA triggers in TXTL. See Figure 2C for details. (B) NUPACK predictions for cross-talk between gRNA switches and their artificial RNA triggers. Predictions are based on co-folding each gRNA switch and RNA trigger each at a simulated concentration of 1 M. (C) Evaluating cross-talk between the subset of gRNA switches and RNA triggers in TXTL. Fold-increase in normalized deGFP fluorescence is reported as the average from three independently mixed reactions. See Figure 3C for details on the normalization.
Figure 4. Switch optimization allows robust sensing of the Hfq-binding small RNA RyhB in TXTL and in E. coli. (A) Activation of gRNA switches via induction of RyhB through iron starvation. RyhB is naturally induced in E. coli following depletion of iron in the culture, such as through the addition of the iron chelator 2,2′-bipyridyl to the growth medium. (B) Modifying gRNA switches to enhance sensing of RyhB in TXTL. Left: RyhB sequence (top) and RyhB switch (Sw-R1 to Sw-R19) sensor domain structures, displayed 3′ to 5′. The predicted secondary structure of RyhB is represented as dot-bracket annotation and matches that shown in Supplementary Figure S7. The toehold and loop, as well as complementary regions of the clamp, are aligned with their binding locations along the RyhB sequence. The linker does not bind RyhB. All gRNA switches targeted the same location in the promoter controlling deGFP expression. Right: end-point measurements of normalized deGFP fluorescence in the presence of RyhB or the randomized RNA trigger (+Rdm). See Figure 2C for details. (C) Assessing gRNA switch activity following induction of RyhB expression in E. coli. Different concentrations of 2,2′-bipyridyl were added to the growth medium to induce RyhB expression. (D) Impact of gRNA switches on deGFP expression following induction of RyhB. deGFP expression was measured by flow cytometry analysis. deGFP fluorescence was normalized to that of the targeting gRNA (0%) and the non-targeting gRNA (100%) subjected to the same growth conditions. Values represent the average and standard error of three independent experiments initiated from separate colonies.
Figure 5. gRNA switches can be triggered by the araB mRNA in TXTL and in E. coli. (A) Sensing the mRNA encoded by araBAD to drive deGFP silencing with gRNA switches. (B) Assessing gRNA switches designed to sense the araB mRNA in TXTL. See Figure 2C for details. Locations of the trigger RNAs within the araB mRNA are shown in Supplementary Figure S8. Some of the gRNA switches exhibited increased fluorescence with the araB mRNA trigger, possibly due to differences in the size and sequence of the expressed transcripts (70). (C) Assessing activation of gRNA switches in E. coli when constitutively expressing the araB mRNA or a lacZ mRNA control. The endogenous araBAD operon was deleted to prevent switch activation from this locus. Growth and fluorescence measurements were made on a microtiter plate reader. GFP expression rates were calculated as differentials of fluorescence over ABS600 and normalized to the non-targeting (0% GFP expression rate) and targeting gRNA (100% GFP expression rate). Sw-1 with its cognate trigger (Tr-1) or a randomized RNA trigger (Rdm) served as controls. Values represent the average and standard deviation of three independent experiments initiated from separate colonies. (D) Assessing activation of gRNA switches when inducing expression of the endogenous araBAD operon. E. coli cells were induced with the addition of L-arabinose or repressed with the addition of D-glucose to the medium. See C for details.
Sequence-independent RNA sensing and DNA targeting by a split domain CRISPR-Cas12a gRNA switch

February 2021 · 224 Reads · 38 Citations

Nucleic Acids Research

CRISPR technologies increasingly require spatiotemporal and dosage control of nuclease activity. One promising strategy involves linking nuclease activity to a cell's transcriptional state by engineering guide RNAs (gRNAs) to function only after complexing with a 'trigger' RNA. However, standard gRNA switch designs do not allow independent selection of trigger and guide sequences, limiting gRNA switch application. Here, we demonstrate the modular design of Cas12a gRNA switches that decouples selection of these sequences. The 5' end of the Cas12a gRNA is fused to two distinct and non-overlapping domains: one base pairs with the gRNA repeat, blocking formation of a hairpin required for Cas12a recognition; the other hybridizes to the RNA trigger, stimulating refolding of the gRNA repeat and subsequent gRNA-dependent Cas12a activity. Using a cell-free transcription-translation system and Escherichia coli, we show that designed gRNA switches can respond to different triggers and target different DNA sequences. Modulating the length and composition of the sensory domain altered gRNA switch performance. Finally, gRNA switches could be designed to sense endogenous RNAs expressed only under specific growth conditions, rendering Cas12a targeting activity dependent on cellular metabolism and stress. Our design framework thus further enables tethering of CRISPR activities to cellular states.


FIG. 2. (A) Schematic representation of igRNA in its inactive state, trigger RNA (trRNA) and activated igRNA after hybridization with the trRNA. The igRNA differs from the unmodified sgRNA (represented in Fig. 1A) by the addition of the toehold, clamp, and variable loop sequences. The toehold variable region 1 (VR1, blue) and variable loop region 2 (VR2, green) are complementary to the trRNA 3′ (light blue) and 5′ sequences (light green), respectively. The igRNA design envisions no binding between the sRNA and the constrained clamp sequence (complementary to the targeting sequence). Therefore, the toehold, clamp, and targeting sequences of the igRNA are fully modifiable. Hybridization of the sRNA 5′ and 3′ ends with the igRNA toehold and loop sequences exposes the targeting region and reactivates binding to its DNA target. (B) Schematic representation of the E. coli genetic activation reporter. The reporter represented in Figure 1C was modified by adding the trRNA component that regulates CRISPR-igRNA guided derepression of sfGFP. (C) Average ratio ± standard deviation (SD) of fluorescence intensity to OD600 measured from E. coli MG1655Z1 cells containing the plasmids carrying the LacI-repressed sfGFP reporter in all samples, the igRNA carrying the 3A switch (lanes 1, 2, 3, and 5), dCas9 (lanes 1, 2, 3, and 4), and the 3A trRNA (lanes 3, 4, and 5). The 3A trRNA was replaced with the 2A trRNA to test the specificity of igRNA switching in lane 2. Error bars indicate the SD value from all the biological and technical replicates measured.
FIG. 3. (A) Schematic representation of the E. coli-based genetic repression reporter where the RNP complex formed by dCas9, igRNA, and trRNA binds and therefore represses the newly designed hybrid promoter used to drive expression of the sfGFP reporter (pHyb-sfGFP plasmid). (B) Average ratio ± SD of fluorescence intensity to OD600 measured from the E. coli MG1655Z1 cells carrying the pHyb-sfGFP reporter plasmid, the 3A igRNA, and a plasmid expressing both the dCas9 and 3A trRNA (lane 1) or only the dCas9 expression cassette (lane 2).
FIG. 4. (A) Average ratio ± SD of fluorescence intensity to OD600 measured from the E. coli genetic activation reporter (described in Figure 2B) carrying the LacI-repressed sfGFP reporter, the dCas9 plasmid, and the 3AV9-igRNA in the absence (lane 1) or presence (lane 3) of the mKate2 mRNA trigger. The latter condition was also tested with a variant of the trRNA-expressing construct where the mKate2 RBS was omitted to obliterate concomitant translation of the trRNA transcript (lane 2). (B) The same analysis was performed on E. coli cells carrying the HIV-responsive igRNA (HIV3-igRNA) without (lane 1) or with the addition (lane 2) of the actively translating HIV VIF mRNA trigger.
FIG. 5. (A) Schematic representation of the cell-free genetic repression reporter where the deGFP fluorescent marker carries the same target site used for the E. coli experiments inserted in either the template or non-template strand of the deGFP coding sequence. (B) Cell-free testing of 3A igRNAs with the nuclease-deactivated S. pyogenes dCas9 (left) and the active Cas9 (right) targeting the non-template reporter. Bar graphs show the average fluorescence intensity (in arbitrary units ± SD) emitted by the fluorescent reporter construct. Repression activity of the 3A igRNA with the corresponding 3A trRNA (first lane) was compared to the 3A trRNA with the noncomplementary 2A trRNA (second lane), a standard gRNA targeting the same sequence (+C gRNA) as a positive control (third lane), and a non-targeting gRNA (-C gRNA) as a negative control (fourth lane). A total of 12 replicates were produced for each reaction. The full set of data for 3A and 2A igRNAs targeting the template or non-template strand with dCas9 or Cas9 is summarized in Supplementary Figure S3. (C) Schematic representation of our platform for high-throughput testing of igRNA designs involving (left to right): (1) computational design, (2) automated cloning of DNA plasmids and assembly of cell-free reactions, (3) acquisition and analysis of CRISPR activity using fully customizable reporter constructs, and (4) selection of igRNA and trRNA designs for in vivo applications.
Engineered RNA-Interacting CRISPR Guide RNAs for Genetic Sensing and Diagnostics

October 2020 · 260 Reads · 14 Citations

The CRISPR Journal

CRISPR guide RNAs (gRNAs) can be programmed with relative ease to allow the genetic editing of nearly any DNA or RNA sequence. Here, we propose novel molecular architectures to achieve RNA-dependent modulation of CRISPR activity in response to specific RNA molecules. We designed and tested, in both living Escherichia coli cells and cell-free assays for rapid prototyping, cis-repressed RNA-interacting guide RNAs (igRNAs) that switch to their active state only upon interaction with small RNA fragments or long RNA transcripts, including pathogen-derived mRNAs of medical relevance such as the human immunodeficiency virus infectivity factor. The proposed CRISPR-igRNAs are fully customizable and easily adaptable to the majority, if not all, of the available CRISPR-Cas variants to modulate a variety of genetic functions in response to specific cellular conditions, providing orthogonal activation and increased specificity. We thereby foresee a large scope for therapeutic, diagnostic, and biotech applications in both prokaryotic and eukaryotic systems.


Engineering a Circular Riboregulator in Escherichia coli

September 2020 · 58 Reads · 8 Citations

RNAs of different shapes and sizes, natural or synthetic, can regulate gene expression in prokaryotes and eukaryotes. Circular RNAs have recently appeared to be more widespread than previously thought, but their role in prokaryotes remains elusive. Here, by inserting a riboregulatory sequence within a group I permuted intron-exon ribozyme, we created a small noncoding RNA that self-splices to produce a circular riboregulator in Escherichia coli. We showed that the resulting riboregulator can trans-activate gene expression by interacting with a cis-repressed messenger RNA. We characterized the system with a fluorescent reporter and with an antibiotic resistance marker, and we modeled this novel posttranscriptional mechanism. This first reported example of a circular RNA regulating gene expression in E. coli adds to an increasing repertoire of RNA synthetic biology parts, and it highlights that topological molecules can play a role in the case of prokaryotic regulation.


FIG 1 Overview of genomic deletion and insertion in C. phytofermentans. (A) pQlox71 is introduced for genomic insertion of a lox71 (L71) site using the LtrA protein encoded by the targetron. (B) pQlox71 is cured and pQlox66 is introduced for genomic insertion of a lox66 (L66) site. (C) pQlox66 is cured, and pQcre1 is introduced for Cre-mediated recombination to delete the sequence between the lox66 and lox71 sites. (D) In the resulting strain, the deletion and lox72 site are confirmed by PCR (arrows show primers). (E) pQadd1 is introduced for genomic delivery of a lox511/71 (L5-71) and loxFAS/66 (LF-66) cassette into the genome. (F) pQadd1 is cured and pQadd2 is introduced, bearing the desired insertion sequence flanked by lox511/66 (L5-66) and loxFAS/71 (LF-71) sites. (G) pQcre2 is introduced for Cre-mediated RMCE. (H) The resulting strain has a genomic copy of the insert sequence flanked by lox511/72 (L5-72) and loxFAS/72 (LF-72) sites in the genome, which is confirmed by PCR (arrows show primers).
FIG 2 Construction of a C. phytofermentans strain with targetron-mediated insertion of a lox71 site in cphy2944 and a lox66 site in cphy2993. (A) Genome region with the lox-containing targetron insertions in cphy2944 and cphy2993. Positions of primers used in panels B and E are shown. (B) PCR confirmation of lox insertions into cphy2944 (primers 2944_1/2) and cphy2993 (primers 2993_1/2) in 3 DI-AS isolates (DI1 to DI3). DNA chromatograms from DI1 of the lox71 site in cphy2944 (C) and the lox66 site in cphy2993 (D) with the 8-bp central spacer outlined and arm mutations relative to loxP shown in red. (E) Inverse PCR (primers int_1/2) shows the 3 DI-AS isolates (DI1 to DI3) contain only the 2 expected genomic targetron insertions. The 3.5-kb band corresponds to the targetron insertion in cphy2944 and the 1.3-kb band to the insertion in cphy2993.
FIG 4 Growth of C. phytofermentans WT, DI-SS, and Del-SS strains on glucose (A), cellobiose (B), xylose (C), and galactan (D). Data points are means from 4 cultures; shaded areas show standard deviations (SDs). mRNA expression measured by RNA-seq of cphy2944-cphy2993 in C. phytofermentans WT on glucose (E), cellobiose (F), xylose (G), and galactan (H). Bars show mean log2(RPKM) ± SD from duplicate cultures; stars show genes differentially expressed on other carbon sources relative to glucose. RNA-seq measurements and differential expression statistics are based on a previous study (3).
A Targetron-Recombinase System for Large-Scale Genome Engineering of Clostridia

December 2019 · 132 Reads · 9 Citations

Clostridia are a group of Gram-positive anaerobic bacteria of medical and industrial importance for which limited genetic methods are available. Here, we demonstrate an approach to make large genomic deletions and insertions in the model Clostridium phytofermentans by combining designed group II introns (targetrons) and Cre recombinase. We apply these methods to delete a 50-gene prophage island by programming targetrons to position markerless lox66 and lox71 sites, which mediate deletion of the intervening 39-kb DNA region using Cre recombinase. Gene expression and growth of the deletion strain showed that the prophage genes contribute to fitness on nonpreferred carbon sources. We also inserted an inducible fluorescent reporter gene into a neutral genomic site by recombination-mediated cassette exchange (RMCE) between genomic and plasmid-based tandem lox sites bearing heterospecific spacers to prevent intracassette recombination. These approaches generally enable facile markerless genome engineering in clostridia to study their genome structure and regulation. IMPORTANCE Clostridia are anaerobic bacteria with important roles in intestinal and soil microbiomes. The inability to experimentally modify the genomes of clostridia has limited their study and application in biotechnology. Here, we developed a targetron-recombinase system to efficiently make large targeted genomic deletions and insertions using the model Clostridium phytofermentans. We applied this approach to reveal the importance of a prophage to host fitness and introduce an inducible reporter by recombination-mediated cassette exchange.


ABC Transporters Required for Hexose Uptake by Clostridium phytofermentans

July 2019

·

233 Reads

·

12 Citations

Plant-fermenting Clostridia are anaerobic bacteria that recycle plant matter in soil and promote human health by fermenting dietary fiber in the intestine. Clostridia degrade plant biomass using extracellular enzymes and then take up the liberated sugars for fermentation. The main sugars in plant biomass are hexoses, and here, we identify how hexoses are taken into the cell by the model organism Clostridium phytofermentans. We show that this bacterium takes up hexoses using a set of highly specific, nonredundant ABC transporters. Once in the cell, the hexoses are phosphorylated by intracellular hexokinases. This study provides insight into the functioning of abundant members of soil and intestinal microbiomes and identifies gene targets to engineer strains for industrial lignocellulosic fermentation.


Citations (12)


... The most favorable single-amino acid substitutions (=11 mutations spread over 6 KRI3 positions) were chosen while also considering ∆∆G values lower than −0.3 kcal/mol and taking into account low energy penalties due to Van der Waals clashes (i.e., ∆VdW ≤ 0.8 kcal/mol) (Table S1 and Figure 2, Box 2). Next, these selected single mutations were combined to generate additional KRI3 mutant peptides by implementing the "BuildModel" macro of FoldX [36,37] (Figure 2, Box 3). Overall, 348 KRI3 variants containing multiple amino acid substitutions (from 2 to 6 mutations) were built and analyzed within this second step (Figure 2, Box 3). ...

Reference:

Exploring a Potential Optimization Route for Peptide Ligands of the Sam Domain from the Lipid Phosphatase Ship2
Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment
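The two-step selection described in the snippet above (filter single substitutions by energy cutoffs, then combine the survivors into multi-mutation variants) can be sketched as follows. This is only an illustrative sketch: the mutation records and their energy values are invented, the function names are hypothetical, and the code does not call FoldX or reproduce the cited study's pipeline. Only the two cutoffs (∆∆G < −0.3 kcal/mol, ∆VdW ≤ 0.8 kcal/mol) and the 2-to-6 mutation range come from the quoted text.

```python
from itertools import combinations, product

# Hypothetical records: (position, new_residue, ddG, dVdW), energies in kcal/mol.
# All values are invented for illustration; none come from the cited study.
SINGLE_MUTATIONS = [
    (3, "W", -0.9, 0.2),
    (3, "F", -0.5, 0.4),
    (5, "R", -0.4, 0.9),   # fails the Van der Waals clash cutoff
    (7, "L", -0.2, 0.1),   # fails the ddG cutoff
    (8, "Y", -0.6, 0.8),
    (10, "K", -1.1, 0.0),
]

DDG_CUTOFF = -0.3   # keep substitutions with ddG strictly below this value
VDW_CUTOFF = 0.8    # keep substitutions with a clash penalty at or below this value


def select_favorable(mutations):
    """Step 1: keep single substitutions that pass both energy cutoffs."""
    return [m for m in mutations if m[2] < DDG_CUTOFF and m[3] <= VDW_CUTOFF]


def enumerate_variants(selected, min_size=2, max_size=6):
    """Step 2: combine selected substitutions, at most one per position."""
    by_position = {}
    for pos, residue, _, _ in selected:
        by_position.setdefault(pos, []).append(residue)
    positions = sorted(by_position)
    variants = []
    for size in range(min_size, min(max_size, len(positions)) + 1):
        for chosen in combinations(positions, size):
            # Every choice of one residue per chosen position is one variant.
            for residues in product(*(by_position[p] for p in chosen)):
                variants.append(tuple(zip(chosen, residues)))
    return variants


selected = select_favorable(SINGLE_MUTATIONS)
variants = enumerate_variants(selected)
print(len(selected), "single substitutions pass the cutoffs")
print(len(variants), "multi-substitution variants to model")
```

In a real workflow each enumerated variant would then be passed to a structure-modelling step for scoring; that step is deliberately left out here.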

... With dReaMGE, simultaneous insertions and deletions of multiple kilobase-scale sequences were achieved in E. coli and two different Pseudomonad hosts 25. CRISPR/Cas9 is efficient and versatile for genome editing in various organisms, but its application for multiplex genome targeting in bacteria is limited by several factors, including the cytotoxicity of Cas9 expression 26, difficulties in simultaneously co-expressing multiple guide RNAs 27, and the lethality of unrepaired double-strand breaks because most bacteria lack the capacity for non-homologous end joining 28. Consequently, in bacteria, Cas9 was first applied as a transiently expressed counterselection tool after the recombineering step to eliminate unmutated genomes by cleavage of the wild-type target 11,[29][30][31]. ...

Cas9 off-target binding to the promoter of bacterial genes leads to silencing and toxicity

Nucleic Acids Research

... This nuclease-deficient version of spCas9 binds to target sequences without cutting, effectively blocking gene expression [72]. Tested in various Clostridium species [73,74], CRISPRi offers adjustable gene downregulation, making it a valuable tool for the partial repression of essential genes [75], and the concurrent repression of multiple genes [76]. CRISPRa, on the other hand, involves a nuclease-deficient Cas protein fused with an activation domain for gene upregulation [77]. ...

Tuning of Gene Expression in Clostridium phytofermentans Using Synthetic Promoters and CRISPRi

ACS Synthetic Biology

... [35][36][37] SDR-based strategies have also found applications in CRISPR-based nucleic acid diagnostics. [38][39] In this regard, CRISPR systems controlled by SDRs have been reported in the context of CRISPR-Cas Type V or VI-based biosensing applications. [40][41] In these systems, the formation of a ribonucleoprotein (RNP) complex between the Cas enzyme and crRNA triggers both site-specific (cis-cleavage) and nonspecific (trans-cleavage) nuclease activities, responsible for DNA/RNA reporter digestion and catalytic signal generation. ...

Engineered RNA-Interacting CRISPR Guide RNAs for Genetic Sensing and Diagnostics

The CRISPR Journal

... However, this research field started relatively late, and many ribonuclease circularization mechanisms are still not clear, which poses significant challenges in the field of production application. The group I intron was used to circularize RNA in vitro or in vivo, whether in cells or E. coli (BL21) [20,26,29,31,32]. However, most of the research on in vivo circularization using this intron focused on sequences between 70 and 1000 nt in length; exceeding this range significantly ...

Engineering a Circular Riboregulator in Escherichia coli

... Designed group II introns called targetrons enabled gene inactivation by targeted chromosome insertion in various Lachnospiraceae with efficiencies ranging from 12.5% to 100% (Tolonen et al., 2009; Tolonen et al., 2015a; Cerisy et al., 2019a; Jin et al., 2022) (Figure 4D). Multi-gene fragments can be excised and inserted by modifying targetrons to deliver lox sites into the genome that act as anchor points for Cre-mediated recombination, which has been applied to delete a 39 kb prophage in L. phytofermentans (Cerisy et al., 2019b). ...

A Targetron-Recombinase System for Large-Scale Genome Engineering of Clostridia

... The peptidase domain of ClbP (ClbP-pep), as defined in (13), was PCR amplified with a C-terminal His tag using primers clbP_F (5′-AAAGAAGGAGATAGGATCATGACAATAATGGAACACGTTAG-3′) and clbP_R (5′-GTGTAATGGATAGTGATCTTAATGGTGATGGTGATGATGATATTTGCCAATGCGCAGA-3′). The PCR product was cloned by ligation-independent cloning into pET-22b(+) (61,62) and the forward and reverse sequences were confirmed by sequencing. Plasmids were transformed into E. coli BL21(DE3) (Novagen 70235). ...

ABC Transporters Required for Hexose Uptake by Clostridium phytofermentans

... It was also reported that 15% of BC patients expressed low levels of this protein compared to healthy breast tissue and also displayed higher-grade cancer than patients in which the protein is overexpressed [40,41]. c-MYC: The c-MYC gene is found on chromosome 8q24 and is reported to be involved in many tumors; higher expression of this gene is associated with TNBC and aggressive human prostate cancer [1,13,14,42]. ...

Using RNA as Molecular Code for Programming Cellular Function

ACS Synthetic Biology

... The structure of the targeted GGA cassette ruled the composition of the GG reaction mixture. The Golden Gate reaction conditions were designed based on the previously published protocols (Engler et al., 2008; Pauthenier et al., 2012; Agmon et al., 2015). The reaction mixture contained a precalculated equimolar amount of each GGF and the destination vector (50 pmoles of ends), 2 µl of T4 DNA ligase buffer (NEB), 5 U of BsaI, 200 U of T4 and ddH2O up to 20 µl. ...

The GoldenBricks assembly: A standardized one-shot cloning technique for complete cassette assembly