Christine Heitsch’s research while affiliated with Georgia Institute of Technology and other places


Publications (58)


Figure 1: Representative tRNA secondary structure and tertiary interactions. From left to right: Table lists the structural components, including the genomic tag, which are then labeled on the secondary structure. The four-armed cloverleaf is closed by the acceptor (A) stem, and contains three hairpin stem-loop structures: the D, anticodon (AN), and T arms. Paired nucleotides (green) as well as the anticodon (blue) are highlighted. Most pairings are Watson-Crick (GC/AU, dash) but two wobble ones (GU, dot) are present. Various tertiary interactions (dashed lines) stabilize the 3D structure. The overall L-shape is critical to ribosome binding, and hence protein biosynthesis.
Figure 3: Comparison of rate matrix parameterizations for Markov substitution model: general Markov (GM), strand-symmetric (sym), transition-asymmetric (tran), and TAM-biased (TAM) along with general time-reversible (GTR). The Akaike and Bayesian information criteria (AIC, BIC) are reported as the difference from TAM's score, as this was the lowest, i.e. best.
Figure 5: Distribution of observed MFE-preserving substitutions from M (blue) compared to all related neutral neighbors in N (red). Components of the tRNA secondary structure are labeled as in Figure 1 with paired regions marked by *. Note the enhancement of paired domains in M, particularly the acceptor stem, while mutations in hairpin loops, especially the anticodon one, are depressed. Conclusions about the variable arm (M3) are confounded since it can contain both kinds of sites depending on the specific tRNA.
Figure 6: Comparison of substitution frequency over 531 C. elegans tRNA alleles with mean site-fitness (Li et al., 2016) across all one-point mutants for the S. cerevisiae arginine-CCU tRNA. Substructure components are labeled as in Figure 1 along with their average length, which was used for the site normalization.
How much is Transcription-associated Mutagenesis Driving tRNA Microevolution?
  • Preprint

March 2025 · 3 Reads · Hector Banos · Ling Wang · Corinne Simonti · [...] · Christine E Heitsch

Transfer RNAs (tRNAs) are among the most highly conserved and frequently transcribed genes. Recent studies have identified transcription-associated mutagenesis (TAM) as a significant contributor to sequence variation around tRNA loci. However, the extent to which TAM drives allelic variation in tRNAs remains unclear, largely due to the confounding effects of strong selection pressures to maintain their structural integrity. This complexity arises because TAM-induced mutations primarily involve nucleotide transitions, which tend to preserve base-pairing stability. To address this dichotomy at the population level, we analyzed tRNA allelic variation in contemporary Caenorhabditis elegans strains. We propose a model of tRNA microevolution driven by TAM and demonstrate that the observed secondary structure characteristics align with our predicted TAM-biased patterns. Furthermore, we developed a continuous Markov substitution model that incorporates TAM-specific mutational biases. This TAM-biased model fits the C. elegans tRNA data more effectively than standard models, such as the general time-reversible (GTR) model. Based on these results, we conclude that TAM plays a significant role in shaping tRNA allelic variation within populations. This finding is consistent with recent experimental studies on tRNA fitness in yeast, but challenges prior theoretical and computational analyses that emphasize RNA base-pairing as a primary determinant in genotype-phenotype systems.
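
The model comparison summarized in the Figure 3 caption (AIC and BIC reported as differences from the best score) can be illustrated with a minimal sketch. It uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The model names, log-likelihoods, and parameter counts below are hypothetical placeholders, not values from the preprint.

```python
import math

def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * num_params - 2 * log_likelihood

def bic(log_likelihood: float, num_params: int, num_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return num_params * math.log(num_obs) - 2 * log_likelihood

def deltas_from_best(scores: dict) -> dict:
    """Report each score as its difference from the lowest (best) score."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}

# Hypothetical fits: name -> (maximized log-likelihood, free parameters).
fits = {"model_A": (-1000.0, 3), "model_B": (-998.0, 8)}
num_obs = 500  # hypothetical number of observations

aic_scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
bic_scores = {name: bic(ll, k, num_obs) for name, (ll, k) in fits.items()}
aic_deltas = deltas_from_best(aic_scores)
bic_deltas = deltas_from_best(bic_scores)
```

In this toy case the model with more parameters has the higher likelihood but still scores worse, because both criteria penalize the extra parameters.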


Can geometric combinatorics improve RNA branching predictions?

March 2025 · 3 Reads

Prior results for tRNA and 5S rRNA demonstrated that secondary structure prediction accuracy can be significantly improved by modifying the parameters in the multibranch loop entropic penalty function. However, for reasons not well understood at the time, the scale of improvement possible across both families was well below the level for each family when considered separately. We resolve this dichotomy here by showing that each family has a characteristic target region geometry, which is distinct from the other and significantly different from their own dinucleotide shuffles. This required a much more efficient approach to computing the necessary information from the branching parameter space, and a new theoretical characterization of the region geometries. The insights gained point strongly to considering multiple possible secondary structures generated by varying the multiloop parameters. We provide proof-of-principle results that this significantly improves prediction accuracy across all 8 additional families in the Archive II benchmarking dataset.
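
As a rough illustration of what varying the multiloop parameters means, the sketch below assumes a linear multibranch loop penalty with three parameters (a, b, c), so that a structure with signature (x, y, z, w) has energy a·x + b·y + c·z + w. Here x counts multiloops, y their unpaired nucleotides, z their branching helices, and w is the remaining energy. The candidate signatures are toy values and the grid sweep is a brute-force stand-in, not the efficient method the abstract refers to.

```python
from itertools import product

def energy(signature, a, b, c):
    """Energy of a structure with signature (x, y, z, w) under parameters (a, b, c)."""
    x, y, z, w = signature
    return a * x + b * y + c * z + w

def optimal_over_grid(candidates, a_vals, b_vals, c_vals):
    """Map each candidate to the grid points (a, b, c) where it has minimum energy.

    Ties go to the candidate listed first.
    """
    winners = {}
    for a, b, c in product(a_vals, b_vals, c_vals):
        best = min(candidates, key=lambda name: energy(candidates[name], a, b, c))
        winners.setdefault(best, []).append((a, b, c))
    return winners

# Toy candidates (hypothetical signatures, not real structures).
candidates = {
    "unbranched": (0, 0, 0, -20.0),
    "one_multiloop": (1, 4, 3, -30.0),
}
winners = optimal_over_grid(candidates, [0.0, 5.0, 10.0], [0.0], [0.0, 1.0, 2.0])
```

Each candidate wins on a region of the parameter grid, which is the discrete analogue of the parameter-space regions discussed in the abstract.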


How Parameters Influence SHAPE-Directed Predictions

May 2024 · 14 Reads

Methods in molecular biology (Clifton, N.J.)

The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters which define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
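
One widely used conversion from SHAPE reactivity to a pseudoenergy is the two-parameter form ΔG(i) = m · ln(reactivity(i) + 1) + b, with slope m and intercept b (Deigan et al. 2009). The sketch below assumes that form, with the commonly quoted defaults of 1.8 and −0.6 kcal/mol. Its handling of missing or negative reactivities is an assumption of this sketch, not a description of any particular software.

```python
import math

def shape_pseudoenergy(reactivity, slope=1.8, intercept=-0.6):
    """Pseudoenergy (kcal/mol) for one nucleotide: slope * ln(reactivity + 1) + intercept.

    Assumptions of this sketch: None means "no data" and contributes 0,
    and negative reactivities are clamped to 0.
    """
    if reactivity is None:
        return 0.0
    return slope * math.log(max(reactivity, 0.0) + 1.0) + intercept

def neutral_reactivity(slope=1.8, intercept=-0.6):
    """Reactivity at which the pseudoenergy changes sign (bonus below, penalty above)."""
    return math.exp(-intercept / slope) - 1.0

# Hypothetical reactivity profile for a short sequence.
profile = [0.0, 0.1, None, 1.5, 0.4]
pseudoenergies = [shape_pseudoenergy(r) for r in profile]
```

Changing the slope or intercept moves the sign-change threshold, which is one simple way to see why predictions can be sensitive to these two parameters.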


Figure 1: Example of Pv2 output webpage. Left column displays an interactive summary profile graph, by default a decision tree. Nodes are clickable and labeled with corresponding number of sampled structures. Center column has a dynamic panel (top) displaying features in arc diagram format for chosen (grey) node from profile graph. Decisions on incoming edge are emphasized; positive ones in bold above the sequence line, and negative ones, denoted with ¬ in tree, in red below. Features are labeled according to the FSC table (center, middle) which lists SC regions [i, j; k, l] with fuzzy frequencies and SHC contained. Nontrivial FSC are denoted by letters, and trivial ones by their SHC index. All SHC are listed (center, bottom) with maximal (i, j, k) triplet and integer indexed in decreasing exact frequency as given. Selected profiles, or groups thereof, are denoted by rectangular leaves in decision tree and labeled by roman numerals. More than one selected profile is represented if the incoming edge is a contingency (dashed). In this case, the contingency table is given below the feature display when the leaf (dashed rectangle) is chosen. Right column shows most frequent secondary structure for each leaf in radial (or arc) diagram format. Users can download all structures corresponding to the chosen node, or just the most frequent, for further analysis. See Section 2.4, and Supplemental Material, for further information.
RNAprofiling 2.0: Enhanced cluster analysis of structural ensembles

March 2023 · 38 Reads

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.
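
The idea of features and profiles partitioning a sample can be sketched in a few lines. This is only a toy illustration: each sampled structure is a set of helix labels, a helix counts as a feature when its sample frequency reaches a fixed cutoff, and a structure's profile is the set of features it contains. The fixed cutoff `min_freq` and the helix labels are placeholders invented for this sketch. They do not reflect how RNAprofiling itself chooses features or selects profiles.

```python
from collections import Counter

def profile_partition(sample, min_freq):
    """Return (features, profile counts) for a sample of structures.

    sample:   list of sets of helix labels, one set per sampled structure
    min_freq: fraction of the sample a helix must appear in to be a feature
    """
    n = len(sample)
    helix_counts = Counter(h for structure in sample for h in set(structure))
    features = {h for h, c in helix_counts.items() if c / n >= min_freq}
    # Each structure maps to exactly one profile, so the profiles partition the sample.
    profiles = Counter(frozenset(set(structure) & features) for structure in sample)
    return features, profiles

# Hypothetical sample of four structures over three helix labels.
sample = [{"h1", "h2"}, {"h1", "h3"}, {"h1", "h2"}, {"h1"}]
features, profiles = profile_partition(sample, min_freq=0.5)
```

Here the rare helix `h3` is not a feature, so the structure containing it falls into the same profile as the structure with `h1` alone.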


RNAprofiling 2.0: Enhanced cluster analysis of structural ensembles

March 2023 · 1 Read

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.


Counting orbits under Kreweras complementation

March 2023 · 28 Reads

The Kreweras complementation map is an anti-isomorphism on the lattice of noncrossing partitions. We consider an analogous operation for plane trees motivated by the molecular biology problem of RNA folding. In this context, we explicitly count the orbits of Kreweras' map according to their length as the number of appropriate symmetry classes of trees in the plane. These enumeration results are consolidated into a single implicit formula under the cyclic sieving phenomenon.
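
For small n, the orbit structure of the classical Kreweras map on noncrossing partitions can be checked by brute force. The sketch below uses the standard permutation description: view a noncrossing partition as the permutation π that cycles each block in increasing order, and take the complement to be the cycles of x ↦ π⁻¹(c(x)), where c = (1 2 … n). This covers only the partition version of the map. The analogous operation on plane trees and the closed-form counts from the abstract are not reproduced here, and all function names are illustrative.

```python
from collections import Counter

def set_partitions(n):
    """Yield every set partition of {1, ..., n} as a list of blocks (lists)."""
    def rec(i, blocks):
        if i > n:
            yield [list(b) for b in blocks]
            return
        for b in blocks:            # put i into an existing block
            b.append(i)
            yield from rec(i + 1, blocks)
            b.pop()
        blocks.append([i])          # or start a new block with i
        yield from rec(i + 1, blocks)
        blocks.pop()
    yield from rec(1, [])

def is_noncrossing(blocks):
    """True if no a < b < c < d has a, c in one block and b, d in another."""
    for A in blocks:
        for B in blocks:
            if A is B:
                continue
            for a in A:
                for c in A:
                    if a < c and any(a < b < c for b in B) and any(d > c for d in B):
                        return False
    return True

def kreweras(partition, n):
    """Kreweras complement: blocks are the cycles of x -> pi^{-1}(c(x))."""
    pi = {}
    for block in partition:
        s = sorted(block)
        for i, x in enumerate(s):
            pi[x] = s[(i + 1) % len(s)]
    pi_inv = {v: k for k, v in pi.items()}
    comp = {x: pi_inv[x % n + 1] for x in range(1, n + 1)}  # c(x) = x % n + 1
    seen, result = set(), []
    for x in range(1, n + 1):
        cycle = []
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = comp[x]
        if cycle:
            result.append(frozenset(cycle))
    return frozenset(result)

def orbit_length_counts(n):
    """Count orbits of the Kreweras map on noncrossing partitions of [n], by length."""
    ncs = {frozenset(frozenset(b) for b in p)
           for p in set_partitions(n) if is_noncrossing(p)}
    counts, seen = Counter(), set()
    for p in ncs:
        length, q = 0, p
        while q not in seen:
            seen.add(q)
            q = kreweras(q, n)
            length += 1
        if length:
            counts[length] += 1
    return counts
```

For n = 3 this finds one orbit of length 2 (the one-block partition and the all-singletons partition) and one orbit of length 3 (the three partitions with a single two-element block).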


Figure 1: A pairing exchange on unobstructed edges with indices 1 ≤ a < b < c < d ≤ 2n converts siblings (a, b), (c, d) into parent/child (a, d), (b, c), and vice versa. Note how the incident vertices split and merge. However, edges in the four subtrees with indices exclusively in A = [1, a − 1] ∪ [d + 1, 2n], B = [a + 1, b − 1], C = [b + 1, c − 1], or D = [c + 1, d − 1] are unaltered.
Figure 2: The graph G₃ with L₃ on left, U₃ on right, and the three plane trees T ∈ T₃ with c(T) = 2 in the middle. Dashed lines are pairing exchanges. The number of odd edges increases by 1 moving left to right.
On a barrier height problem for RNA branching

March 2023 · 16 Reads

The branching of an RNA molecule is an important structural characteristic yet difficult to predict correctly, especially for longer sequences. Using plane trees as a combinatorial model for RNA folding, we consider the thermodynamic cost, known as the barrier height, of transitioning between branching configurations. Using branching skew as a coarse energy approximation, we characterize various types of paths in the discrete configuration landscape. In particular, we give sufficient conditions for a path to have both minimal length and minimal branching skew. The proofs offer some biological insights, notably the potential importance of both hairpin stability and domain architecture to higher resolution RNA barrier height analyses.
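
The pairing exchange described in the Figure 1 caption, where siblings (a, b), (c, d) become parent/child (a, d), (b, c) and vice versa, can be sketched on a set of arcs. Here a plane tree with n edges is encoded as a noncrossing perfect matching on [1, 2n]. One assumption is specific to this sketch: an exchange is treated as obstructed exactly when the resulting arc set would contain a crossing. The function names are illustrative.

```python
def is_noncrossing_matching(arcs):
    """True if no two arcs (i, j), (k, l) satisfy i < k < j < l."""
    for (i, j) in arcs:
        for (k, l) in arcs:
            if i < k < j < l:
                return False
    return True

def pairing_exchange(arcs, e1, e2):
    """Exchange siblings (a, b), (c, d) with parent/child (a, d), (b, c), or the reverse.

    Raises ValueError if either arc is absent, or if the result would cross
    another arc (taken here to mean the exchange is obstructed).
    """
    arcs = set(arcs)
    if e1 not in arcs or e2 not in arcs:
        raise ValueError("both arcs must belong to the matching")
    a, b, c, d = sorted(e1 + e2)
    if {e1, e2} == {(a, b), (c, d)}:
        new = {(a, d), (b, c)}      # siblings -> parent/child
    elif {e1, e2} == {(a, d), (b, c)}:
        new = {(a, b), (c, d)}      # parent/child -> siblings
    else:
        raise ValueError("the two arcs cross each other")
    result = (arcs - {e1, e2}) | new
    if not is_noncrossing_matching(result):
        raise ValueError("exchange is obstructed by another arc")
    return result

# Two sibling edges become a parent/child pair, and back again.
nested = pairing_exchange({(1, 2), (3, 4)}, (1, 2), (3, 4))
siblings = pairing_exchange(nested, (1, 4), (2, 3))
```

Arcs other than the two being exchanged are carried over unchanged, matching the caption's remark that edges in the four surrounding subtrees are unaltered.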


On a barrier height problem for RNA branching

March 2023 · 16 Reads · 1 Citation

The branching of an RNA molecule is an important structural characteristic yet difficult to predict correctly, especially for longer sequences. Using plane trees as a combinatorial model for RNA folding, we consider the thermodynamic cost, known as the barrier height, of transitioning between branching configurations. Using branching skew as a coarse energy approximation, we characterize various types of paths in the discrete configuration landscape. In particular, we give sufficient conditions for a path to have both minimal length and minimal branching skew. The proofs offer some biological insights, notably the potential importance of both hairpin stability and domain architecture to higher resolution RNA barrier height analyses.


RNAprofiling 2.0: Enhanced Cluster Analysis of Structural Ensembles

March 2023 · 20 Reads · 1 Citation

Journal of Molecular Biology

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.



Citations (30)


... Let Tₙ denote the set of plane trees with n edges, and T ∈ Tₙ. Motivated by the molecular biology problem of RNA folding [7], label the boundary of T with [1, 2n] in increasing order counter-clockwise from the root. Let (i, j) denote the edge in T with indices i < j on the left and right sides. ...

Reference:

Counting orbits under Kreweras complementation
On a barrier height problem for RNA branching

... SHAPE reactivity values were used as constraints to model the vRNA secondary structure using RNAStructure (version 6.4) [36] with the default values of −0.6 kcal/mol and 1.8 kcal/mol [37] for intercept and slope, respectively, and secondary structure were using VARNA (version 3.93) [38]. Secondary structures predicted with RNAStructure v6.4 were compared using RNAStructViz v2.14.18 [39]. ...

RNAStructViz: Graphical base pairing analysis
  • Citing Article
  • April 2021

Bioinformatics

... We provide here a mathematical motivation for generating alternative predictions based on a parametric analysis of RNA branching (Drellich et al., 2017; Barrera-Cruz et al., 2018; Poznanović et al., 2020, 2021). Using methods from geometric combinatorics (Drellich et al., 2017), it is possible to identify all optimal predictions under any parameterization of the entropic branching penalty. ...

Improving RNA Branching Predictions: Advances and Limitations

... Our DMS-MaP (Dimethyl Sulfate-Mutational Profiling) experiments on this 87nt construct in Figure 1C indicate low (black), medium (yellow), and high (red) nucleotides based on their reactivity against DMS (Mustoe et al. 2019; Dey et al. 2021). Both DMS and SHAPE reactivities can be used as a pseudofree energy term in thermodynamic folding algorithms to significantly improve structure prediction (Deigan et al. 2009; Greenwood and Heitsch 2020). Our two DMS-MaP biological replicates in supplementary Figure S1 are quantitatively reproducible (R² > 0.85). When used for structure prediction with SHAPEknots (Hajdin et al. 2013), we obtained three conformations, shown in Figures 1C, 1D and 1E. ...

On the Problem of Reconstructing a Mixture of RNA Structures
  • Citing Article
  • October 2020

Bulletin of Mathematical Biology

... We provide here a mathematical motivation for generating alternative predictions based on a parametric analysis of RNA branching (Drellich et al., 2017; Barrera-Cruz et al., 2018; Poznanović et al., 2020, 2021). Using methods from geometric combinatorics (Drellich et al., 2017), it is possible to identify all optimal predictions under any parameterization of the entropic branching penalty. ...

The challenge of RNA branching prediction: a parametric analysis of multiloop initiation under thermodynamic optimization
  • Citing Article
  • February 2020

Journal of Structural Biology

... We provide here a mathematical motivation for generating alternative predictions based on a parametric analysis of RNA branching (Drellich et al., 2017; Barrera-Cruz et al., 2018; Poznanović et al., 2020, 2021). Using methods from geometric combinatorics (Drellich et al., 2017), it is possible to identify all optimal predictions under any parameterization of the entropic branching penalty. ...

On the Structure of RNA Branching Polytopes
  • Citing Article
  • August 2017

SIAM Journal on Applied Algebra and Geometry

... 1 several implementations of NNTM including RNAstructure [5], the ViennaRNA Package [6], UNAFold [7], and GTfold [8]. Although very useful, the physics-rooted NNTM approach is prone to accuracy errors in its predictions [9,10]. Accuracy can be improved by incorporating additional information, such as high-throughput chemical probing data that further constrain the thermodynamic model [11,12,13]. ...

Conditioning and Robustness of RNA Boltzmann Sampling under Thermodynamic Parameter Perturbations
  • Citing Article
  • June 2017

Biophysical Journal

... We provide here a mathematical motivation for generating alternative predictions based on a parametric analysis of RNA branching (Drellich et al., 2017; Barrera-Cruz et al., 2018; Poznanović et al., 2020, 2021). Using methods from geometric combinatorics (Drellich et al., 2017), it is possible to identify all optimal predictions under any parameterization of the entropic branching penalty. ...

Geometric combinatorics and computational molecular biology: Branching polytopes for RNA sequences
  • Citing Chapter
  • January 2017

... Under the energy minimization hypothesis, an RNA sequence will fold into its configuration only when the loop region energies are minimized and their stacked pairs are maximized. Various algorithms and combinatorial models are developed to solve the RNA secondary structure design problem [96,97,174]. An example of RNA secondary structure is demonstrated in Figure 2 (C). ...

Combinatorial Insights into RNA Secondary Structure
  • Citing Chapter
  • October 2014

Natural Computing Series

... A secondary structure for the given RNA sequence is called suboptimal if its ∆G value is close to the MFE score under the current NNTM evaluation. It has long been recognized that prediction accuracy improves when these alternative structures are also considered (Mathews, 2006; Rogers and Heitsch, 2016). For instance, the three different NNTM parameterizations (Mathews et al., 1999, 2004) have always reported a "best suboptimal" accuracy which can be understood as an indicator of the approximation quality. ...

New insights from cluster analysis methods for RNA secondary structure prediction
  • Citing Article
  • March 2016

WIREs RNA