JOURNAL OF LATEX CLASS FILES, VOL. 18, NO. 9, SEPTEMBER 2020
Joint graphical model estimation using Stein-type
shrinkage for fast large scale network inference in
scRNAseq data
Duong H.T. Vo, Nelofer Syed and Thomas Thorne
Abstract—Graphical modeling is a widely used tool for analyz-
ing conditional dependencies between variables and traditional
methods may struggle to capture shared and distinct structures in
multi-group or multi-condition settings. Joint graphical modeling
(JGM) extends this framework by simultaneously estimating
network structures across multiple related datasets, allowing
for a deeper understanding of commonalities and differences.
This capability is particularly valuable in fields such as ge-
nomics and neuroscience, where identifying variations in network
topology can provide critical biological insights. Existing JGM
methodologies largely fall into two categories: regularization-
based approaches, which introduce additional penalties to enforce
structured sparsity, and Bayesian frameworks, which incorporate
prior knowledge to improve network inference. In this study, we
explore an alternative method based on two-target linear covariance
matrix shrinkage. A formula for the optimal shrinkage intensities
is proposed, which leads to the development of the JointStein
framework. The performance of the JointStein framework is assessed
through simulation benchmarking, which demonstrates its
effectiveness for large-scale single-cell RNA sequencing
(scRNA-seq) data analysis. Finally, we apply our approach to
glioblastoma scRNA-seq data from [20], uncovering dynamic shifts in
T cell network structures across disease progression stages. The
results highlight the potential of the JointStein framework for
extracting biologically meaningful insights from high-dimensional
data.
Index Terms—gene network, joint graphical model, single-cell
RNA-seq analysis
I. INTRODUCTION
The introduction of single-cell RNA sequencing (scR-
NAseq) using high-throughput next-generation DNA sequenc-
ing (NGS) transformed transcriptomic research by providing
data at a higher resolution with information on cellular states
and the molecular interactions of individual cells [27]. The
development of scRNAseq has allowed researchers to collect
transcriptomic information from more than 20,000 genes from
more than 10,000 cells in one experiment [8]. This requires
computational methods that support big data analysis while
maintaining performance in inference processes such as net-
work estimation.
Gene network inference is an active research field in which
computational biologists aim to extract information on interactions
between genes or their products. These networks can include
DNA-protein interaction, protein-protein interaction or gene
regulatory networks. In such studies, Pearson or Spearman
correlation tests are often used to build correlation matrices,
from which gene co-expression networks are constructed. However,
these tests fail to exclude indirect correlations between variables
that arise from third-party effects [29]. Graphical modeling
instead records conditional independencies between random variables
in a connectivity graph [1]. To extract direct connections between
variables in graphical modeling, the partial correlation matrix is
often used. Multiple methods have been developed for graphical
modeling under the assumption of high sparsity in the final network
[18], [2], [3].
Duong H.T. Vo and Thomas Thorne are with the Computer Science
Research Centre, University of Surrey, United Kingdom. Nelofer Syed is with
the John Fulcher Neuro-Oncology Laboratory, Department of Brain Sciences,
Faculty of Medicine, Imperial College London, London W12 0NN, UK.
Corresponding author email: tom.thorne@surrey.ac.uk
Joint graphical modeling, an extension to standard graphical
models, collates a common network structure from other re-
lated networks to improve the overall performance of network
inference compared to separate estimation [5]. Joint graphical
modeling acts on multiple groups of observations for a com-
mon set of variables, learning a specific network for each set of
observations. In the case of biological network inference, joint
network estimation supports the identification of conserved
networks between different biological groups, whilst also
highlighting specific edges in each group. Its application in
biology has been highlighted in a protein signaling pathway
study [30]. Multiple approaches have been developed for
modeling joint network structures, including regularization and
Bayesian methods. In this study, the application of two-target
linear covariance matrix shrinkage (TTLS) as a new approach
to joint graphical modeling is explored. The TTLS method extends
Stein-type linear shrinkage, an approach in standard graphical
modeling that does not incorporate joint estimation. Instead,
standard Stein-type shrinkage separately estimates the covariance
matrix for each group, which ensures independent shrinkage within
distinct subsets of data. Stein-type covariance shrinkage has been
widely used to shrink the sample covariance matrix towards the
identity matrix [7]. In TTLS, rather than shrinking the sample covariance
to the identity matrix, the sample covariance matrix is shrunk
towards a shared covariance matrix containing information on
the common network structure shared between all groups.
With the rapid pace of development in single-cell RNA
sequencing technology, inference methods able to perform
joint graphical modeling on large-scale datasets are becoming
increasingly important. In this study, TTLS is implemented
in a framework we refer to as JointStein, which is designed
for large-scale joint graphical model inference. JointStein is
benchmarked against current joint graphical modeling ap-
proaches in terms of performance and computational time.
We also apply JointStein to joint network inference in experimental
scRNAseq data from glioblastoma and Plasmodium falciparum [20],
[19].
arXiv:2503.05448v1 [stat.ME] 7 Mar 2025
II. METHODS
A. Overview of joint graphical model estimation
Graphical modeling allows us to extract conditional dependence and
independence relationships between random variables [1].
Conditional independence is defined as the case where the joint
probability of two variables $X$ and $Y$ conditional on a third
variable $Z$ can be rewritten as:
$$P(X, Y \mid Z) = P(X \mid Z)\,P(Y \mid Z) \tag{1}$$
This states that given knowledge of $Z$, the two variables $X$ and
$Y$ are independent of one another.
Gaussian Graphical Models (GGMs) are used to de-
scribe the conditional dependence structure among multiple
Gaussian-distributed variables [1]. In these models, edges
between nodes represent direct interactions, as captured by
the nonzero elements of the precision matrix, after adjusting
for the influence of all other variables [1]. One popular approach
to Gaussian graphical modeling is to estimate the partial
correlation matrix by inverting the unknown and nonsingular
population covariance matrix $\Sigma$ [1]. The population
covariance matrix $\Sigma$ is often estimated from the sample
covariance matrix $S$. The implementation of graphical models for
real-world network inference has been proposed in various fields
such as biology, economics and finance [1], and in most cases, the
underlying network is assumed to be sparse.
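As a minimal illustration of how partial correlations follow from a precision matrix, the sketch below applies the standard identity $\rho_{ij} = -\theta_{ij} / \sqrt{\theta_{ii}\theta_{jj}}$. The function name and the toy 3 x 3 precision matrix are illustrative assumptions and are not taken from the JointStein implementation.

```python
import math


def partial_correlations(theta):
    """Convert a precision matrix (list of lists) into partial correlations.

    Uses rho_ij = -theta_ij / sqrt(theta_ii * theta_jj) off the diagonal
    and sets the diagonal to 1.
    """
    p = len(theta)
    rho = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                rho[i][j] = -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
    return rho


# Hypothetical chain graph 0 - 1 - 2: variables 0 and 2 are not directly linked.
theta = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
rho = partial_correlations(theta)
```

In this toy example the zero entry of the precision matrix between variables 0 and 2 gives a zero partial correlation, which is what an absent edge in a Gaussian graphical model corresponds to.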
To induce sparsity in the network and overcome the large $p$ small
$n$ challenge, in which the sample covariance matrix $S$ becomes
singular, various methods of graphical model inference have been
developed. These include minimizing the negative log-likelihood
with $L_1$ regularization, and shrinking the sample covariance
matrix $S$ towards the identity matrix [2], [3], [4].
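The effect of shrinking towards the identity can be seen on a tiny example. The sketch below is only an illustration under stated assumptions: the shrinkage intensity `lam = 0.2` is an arbitrary fixed value chosen for the example, whereas the methods cited above estimate the intensity from the data, and the helper names are hypothetical.

```python
def shrink_to_identity(S, lam):
    """Return (1 - lam) * S + lam * I for a square matrix S (list of lists)."""
    p = len(S)
    return [[(1.0 - lam) * S[r][c] + (lam if r == c else 0.0)
             for c in range(p)] for r in range(p)]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# A rank-one sample covariance matrix: singular, so it cannot be inverted.
S = [[1.0, 1.0],
     [1.0, 1.0]]
shrunk = shrink_to_identity(S, 0.2)  # lam = 0.2 is an arbitrary assumption

det_before = det2(S)
det_after = det2(shrunk)
```

Here the determinant moves from zero to a strictly positive value, so the shrunk matrix can be inverted to obtain a precision matrix.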
An extension to standard graphical modeling is joint graphical
modeling. The presence of a partially shared network structure
allows information to be shared between related data sets, which
has been shown to boost the power of network inference, especially
in high-dimensional data analysis [5]. The idea of joint graphical
modeling is illustrated in Figure 1. Specifically, in heterogeneous
data where multiple related groups are present, joint estimation
aims to borrow and extract a common network structure between all
of the groups. Individual graphs containing some of the shared
structure, but also unique group-specific edges, can then be
inferred. This allows us to identify network edges that are shared
across multiple groups, while also allowing distinct connections in
individual groups. In the case of gene network inference, joint
estimation can reveal both core and distinct genetic network
structures in different environments or cell types [6].
To implement joint graphical modeling on real-world data, there are
several considerations. Firstly, when the number of features
exceeds the number of samples, the sample covariance matrix becomes
singular and noisy [7]. This same challenge is also faced in
standard graphical modeling. Secondly, with advances in single-cell
sequencing technology, gene expression data can be collected from
many thousands of genes and potentially millions of cells [8].
Hence, network inference methods suited to large-scale
applications, for instance methods with shorter computing times,
are essential to reduce the computational resources required.
Lastly, for joint estimation, approaches to borrow and incorporate
shared network information in the network inference process are of
paramount importance, especially in a biological context where
there is likely to be some shared network structure between
different groups, for example between different cell types or
experimental conditions.
Fig. 1. Joint estimation of related networks. The shared network between
groups is highlighted as black edges, whilst edges unique to each group are
coloured in blue.
In recent work, there are two main approaches to joint graphical
modeling: regularization-based and Bayesian methods [5]. As in the
graphical lasso, sparsity in regularization-based approaches is
induced by $L_1$ regularization on entries of the inverse
covariance matrix, also referred to as the precision matrix [9]:
$$\hat{\Theta} = \underset{\Theta \in \mathbb{R}^{p \times p}}{\arg\max} \left\{ \log(\det(\Theta)) - \mathrm{tr}(S\Theta) - \lambda_1 \|\Theta\|_1 \right\} \tag{2}$$
In regularization-based joint graphical modeling, a second penalty
function is often added to the estimator, which acts as a
similarity constraint between groups [5]. For example, the Fused
Graphical Lasso (FGL) and Group Graphical Lasso (GGL) add
$$P_{FGL}(\Omega) = \lambda_2 \sum_{k < k'} \sum_{i,j} \left| w_{i,j}^{(k)} - w_{i,j}^{(k')} \right|$$
and
$$P_{GGL}(\Omega) = \lambda_2 \sum_{i \neq j} \left\{ \sum_{k=1}^{K} \left( w_{i,j}^{(k)} \right)^2 \right\}^{1/2}$$
to their objective functions, respectively [10]. Both FGL and GGL
fall under the broader joint graphical lasso (JGL) framework
introduced in [10].
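To make the two similarity penalties concrete, the sketch below evaluates them for a small set of precision matrices. It computes only the second penalty terms as written above, not the full JGL objective, and the function names and toy matrices are illustrative assumptions.

```python
import math
from itertools import combinations


def fgl_penalty(thetas, lam2):
    """Fused penalty: lam2 * sum over group pairs of |w_ij^(k) - w_ij^(k')|."""
    total = 0.0
    for a, b in combinations(thetas, 2):
        p = len(a)
        for i in range(p):
            for j in range(p):
                total += abs(a[i][j] - b[i][j])
    return lam2 * total


def ggl_penalty(thetas, lam2):
    """Group penalty: lam2 * sum over i != j of sqrt(sum_k (w_ij^(k))^2)."""
    p = len(thetas[0])
    total = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                total += math.sqrt(sum(t[i][j] ** 2 for t in thetas))
    return lam2 * total


# Two hypothetical 2 x 2 precision matrices that differ in one edge.
theta_1 = [[1.0, 0.5], [0.5, 1.0]]
theta_2 = [[1.0, 0.0], [0.0, 1.0]]
fgl_value = fgl_penalty([theta_1, theta_2], lam2=0.1)
ggl_value = ggl_penalty([theta_1, theta_2], lam2=0.1)
```

The fused penalty grows with the entrywise differences between groups, so it favours identical edge values, whereas the group penalty acts on the magnitude of each edge across groups and so favours a shared sparsity pattern.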
These added penalties encourage shared network patterns between
groups, while the conventional penalty induces sparsity in the
final graphical model. However, in practice, the selection of
appropriate values for the regularization parameters and the
computational time required by the algorithms are the two main
hindrances.
On the other hand, Bayesian models encode the properties of the
group of graphical models in their prior. For instance, [11]
designed priors specific to temporal and spatial models,
encouraging edges to be shared between networks that are temporally
or spatially close to one another. However, this algorithm is
complex to implement and can be computationally demanding,
especially when using Markov Chain Monte Carlo algorithms to
estimate the posterior distribution.
In this study, we explore a different approach to joint graphical
modeling using Stein-type shrinkage with two target matrices, and
benchmark two-target Stein-type linear covariance matrix shrinkage
against existing methods on simulated data.
B. Two-target linear covariance matrix shrinkage
The new approach to joint graphical modeling that we explore in
this work is called two-target linear shrinkage (TTLS). The TTLS
formula is based on the double shrinkage estimator from [12] and
the multi-target shrinkage estimation from [13]. To incorporate
information on the common network structure while shrinking entries
of the sample covariance matrix towards zero, we apply TTLS,
balancing between a shared covariance matrix $S_{shared}$ and the
identity matrix $I$ using shrinkage regulators $\gamma_1$ and
$\gamma_2$ as follows:
$$\Sigma = (1 - \gamma_1 - \gamma_2) S + \gamma_1 S_{shared}^{(i)} + \gamma_2 I \tag{3}$$
The estimator in TTLS is then as follows, for each group
$i \in \{1, \dots, G\}$:
$$\hat{\Sigma}_i = (1 - \hat{\gamma}_1 - \hat{\gamma}_2) S_i + \hat{\gamma}_1 \hat{S}_{shared}^{(i)} + \hat{\gamma}_2 I \tag{4}$$
The shared covariance matrix $S_{shared}$ contains information on
the underlying common network structure in our TTLS approach. As an
estimator of $S_{shared}$, the average of the sample or shrunk
covariance matrices of the other groups can be used. Shrinking the
sample covariance matrices before taking the average can
potentially reduce errors and ensures that the estimator is
non-singular in the case of high-dimensional data:
$$\hat{S}_{shared}^{(i)} = \frac{1}{G - 1} \sum_{j \neq i}^{G} S_j \tag{5}$$
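Equations 4 and 5 can be combined in a few lines. The sketch below is a minimal illustration that takes the shrinkage intensities as given inputs rather than estimating them; the function names, the toy covariance matrices and the values `g1 = g2 = 0.25` are assumptions made for the example.

```python
def shared_target(covs, i):
    """Equation 5: average the covariance matrices of all groups except i."""
    others = [c for j, c in enumerate(covs) if j != i]
    p = len(covs[0])
    return [[sum(c[r][s] for c in others) / len(others) for s in range(p)]
            for r in range(p)]


def ttls_estimate(covs, i, g1, g2):
    """Equation 4: blend S_i, the shared target and the identity matrix."""
    S = covs[i]
    T = shared_target(covs, i)
    p = len(S)
    return [[(1.0 - g1 - g2) * S[r][s] + g1 * T[r][s]
             + (g2 if r == s else 0.0)
             for s in range(p)] for r in range(p)]


# Three hypothetical groups, each with a 2 x 2 sample covariance matrix.
covs = [
    [[2.0, 1.0], [1.0, 2.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 1.0], [1.0, 3.0]],
]
target_0 = shared_target(covs, 0)
sigma_0 = ttls_estimate(covs, 0, g1=0.25, g2=0.25)  # intensities assumed, not estimated
```

Because the target for group `i` excludes `S_i` itself, the off-diagonal entry of the estimate is pulled towards the value seen in the other groups while the identity term shrinks it towards zero.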
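The risk function of $\hat{\Sigma}_i$ is based on the Frobenius loss:
$$\begin{aligned} R(\hat{\Sigma}_i, \Sigma_i) &= E\{\|\hat{\Sigma}_i - \Sigma_i\|_F^2\} \\ &= E\{\|S_i - \Sigma_i\|_F^2\} + \gamma^T A \gamma - 2\gamma^T b \end{aligned} \tag{6}$$
where
$$\gamma = \begin{pmatrix} \hat{\gamma}_1 \\ \hat{\gamma}_2 \end{pmatrix} \tag{7}$$
$$A = \begin{pmatrix} E\{\|\hat{S}_{shared}^{(i)} - S_i\|_F^2\} & E\{\langle \hat{S}_{shared}^{(i)} - S_i,\; I - S_i \rangle\} \\ E\{\langle I - S_i,\; \hat{S}_{shared}^{(i)} - S_i \rangle\} & E\{\|I - S_i\|_F^2\} \end{pmatrix} \tag{8}$$
$$b = \begin{pmatrix} E\{\langle \hat{S}_{shared}^{(i)} - S_i,\; \Sigma_i - S_i \rangle\} \\ E\{\langle I - S_i,\; \Sigma_i - S_i \rangle\} \end{pmatrix} \tag{9}$$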
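The quadratic form in equation 6 follows from expanding the squared Frobenius norm of the estimation error. The step is sketched below under the assumption that the intensities are treated as fixed constants; the shorthand $D_1$ and $D_2$ is introduced here only for this illustration.

```latex
\begin{aligned}
\hat{\Sigma}_i - \Sigma_i
  &= (S_i - \Sigma_i) + \hat{\gamma}_1 D_1 + \hat{\gamma}_2 D_2,
  \qquad D_1 = \hat{S}_{shared}^{(i)} - S_i, \quad D_2 = I - S_i, \\
\|\hat{\Sigma}_i - \Sigma_i\|_F^2
  &= \|S_i - \Sigma_i\|_F^2
   + \sum_{k,l=1}^{2} \hat{\gamma}_k \hat{\gamma}_l \langle D_k, D_l \rangle
   - 2 \sum_{k=1}^{2} \hat{\gamma}_k \langle D_k, \Sigma_i - S_i \rangle .
\end{aligned}
```

Taking expectations term by term, the double sum becomes $\gamma^T A \gamma$ with the entries of $A$ in equation 8, and the last sum becomes $2\gamma^T b$ with the entries of $b$ in equation 9.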
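From [14], $E\{\|S_i - \Sigma_i\|_F^2\}$ equals
$\frac{1}{n} a_2 + \frac{p}{n} a_1^2$, where
$a_i = \frac{1}{p}\mathrm{tr}(\Sigma^i)$, and details of the
calculation of $a_1$ and $a_2$ are in [14]. Most importantly, the
value of $E\{\|S_i - \Sigma_i\|_F^2\}$ does not depend on $\gamma$.
Therefore, the optimization problem can be rewritten as:
$$\arg\min_{\gamma} F(\gamma) = \gamma^T A \gamma - 2\gamma^T b \tag{10}$$
with
$$\hat{\gamma}_1 \geq 0, \qquad \hat{\gamma}_2 \geq 0, \qquad \hat{\gamma}_1 + \hat{\gamma}_2 \leq 1.$$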
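The matrix $A$ is estimated using its sample-based counterparts,
and the estimate of the vector $b$ is based on equation 30 in [13]:
$$A = \begin{pmatrix} \|\bar{S}_{shared}^{(i)} - S_i\|_F^2 & \langle \bar{S}_{shared}^{(i)} - S_i,\; I - S_i \rangle \\ \langle I - S_i,\; \bar{S}_{shared}^{(i)} - S_i \rangle & \|I - S_i\|_F^2 \end{pmatrix} \tag{11}$$
$$b = \begin{pmatrix} \hat{V}(S_i) - \hat{V}(\hat{S}_{shared}^{(i)}) \\ \hat{V}(S_i) - \hat{V}(I) \end{pmatrix} \tag{12}$$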
$\hat{V}(S_i)$, $\hat{V}(\hat{S}_{shared}^{(i)})$ and $\hat{V}(I)$
are calculated as follows, for a square symmetric matrix $M$:
$$\hat{V}(M) = \frac{n}{(n-1)^2 (n-2)} \sum_{i=1}^{n} \left\| M_i - \frac{n-1}{n} M \right\|_F^2 \tag{13}$$
To solve the optimization problem in equation 10, we use Lagrange
multipliers with slack variables to derive formulas for the optimal
shrinkage intensities $\gamma_1$ and $\gamma_2$. Details of the
calculation are in Supplementary material 1. Another option for
solving the objective function is the Nelder-Mead optimization
algorithm with inequality constraints, provided by the constrOptim
function in R.
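For intuition about the constrained problem in equation 10, the sketch below solves the two-variable case directly. It is not the closed-form Lagrange solution from Supplementary material 1 and not the constrOptim route; it swaps in a simple enumeration, checking the unconstrained stationary point and then each edge of the feasible triangle. It assumes `A` is symmetric and positive semi-definite, and all names and numerical values are hypothetical.

```python
def objective(A, b, g):
    """F(g) = g^T A g - 2 g^T b for a 2 x 2 matrix A and 2-vectors b, g."""
    quad = (A[0][0] * g[0] * g[0] + (A[0][1] + A[1][0]) * g[0] * g[1]
            + A[1][1] * g[1] * g[1])
    return quad - 2.0 * (b[0] * g[0] + b[1] * g[1])


def best_on_segment(A, b, start, end):
    """Minimize F along the segment start -> end, where F is quadratic in t."""
    d = (end[0] - start[0], end[1] - start[1])
    f0, f1 = objective(A, b, start), objective(A, b, end)
    # F(start + t d) = f0 + c1 t + c2 t^2 with c2 = d^T A d and F(1) = f1
    c2 = (A[0][0] * d[0] * d[0] + (A[0][1] + A[1][0]) * d[0] * d[1]
          + A[1][1] * d[1] * d[1])
    c1 = f1 - f0 - c2
    points = [start, end]
    if c2 > 0:
        t = -c1 / (2.0 * c2)
        if 0.0 < t < 1.0:
            points.append((start[0] + t * d[0], start[1] + t * d[1]))
    return min(points, key=lambda g: objective(A, b, g))


def solve_intensities(A, b):
    """Minimize F subject to g1 >= 0, g2 >= 0 and g1 + g2 <= 1."""
    candidates = []
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det > 0:
        # Unconstrained stationary point solves A g = b (A assumed symmetric).
        g1 = (A[1][1] * b[0] - A[0][1] * b[1]) / det
        g2 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
        if g1 >= 0 and g2 >= 0 and g1 + g2 <= 1:
            candidates.append((g1, g2))
    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for k in range(3):
        candidates.append(best_on_segment(A, b, corners[k], corners[(k + 1) % 3]))
    return min(candidates, key=lambda g: objective(A, b, g))


A = [[2.0, 0.0], [0.0, 2.0]]
interior = solve_intensities(A, [1.0, 0.5])  # stationary point is feasible
boundary = solve_intensities(A, [2.0, 2.0])  # stationary point (1, 1) is not
```

In the second call the unconstrained optimum violates $\hat{\gamma}_1 + \hat{\gamma}_2 \leq 1$, so the solution lands on that edge of the triangle. With a convex objective, checking the stationary point and the three edges is enough to find the constrained minimum.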
C. Higher criticism in significance testing of sparse partial
correlation coefficients when number of features is high
Higher criticism (HC) is defined as a second-level significance
test used when a very small fraction of the independent tests do
not follow the null hypothesis [15]. It is widely applicable in
cases where signals are sparse and standard approaches to
significance testing, such as a 0.05 threshold on adjusted
p-values, cannot decrease the false positive rate.
In the case of estimating sparse partial correlation matrices in
graphical modeling, significance testing is required to select
significantly non-zero entries in the matrix. This is often done
using t-statistics in a Pearson correlation test of significance.
However, as the number of features ($p$) increases, the number of
coefficients to test, $\frac{1}{2}p(p-1)$, grows quadratically. If
on average there are 2 edges per node, the proportion of nonzero
entries in the population partial correlation matrix is then
$\frac{4}{p-1}$, and as $p$ goes towards infinity, this fraction
becomes extremely small and goes towards zero. Hence, a second
level of testing to reduce false positive rates is necessary and
important in this sparse signal case.
Higher criticism was first generalized by [15] as a test on
p-values sorted in increasing order. The objective function of the
HC statistic proposed by [15] is:
$$HC = \max_{1 \leq i \leq \alpha_0 N} \sqrt{N}\, [i/N - p_{(i)}] \Big/ \sqrt{p_{(i)}(1 - p_{(i)})}. \tag{14}$$
Fig. 2. JointStein workflow. JointStein is a Stein-type joint estimation
framework which implements two-target linear covariance matrix shrinkage
and higher criticism when the number of features is equal to or larger than
1000.
From the first generalized HC statistic, multiple objective
functions have been proposed, which aim to address different
problems such as $p > n$ high-dimensional data [16]. In this study,
two HC objective functions, from [16] ($HC_{DJ}$) and [17]
($HC_{LS}$), are implemented on false discovery rate (FDR) adjusted
p-values. Denoting by $p_{(i)}$ the $i$-th p-value in the list of
p-values arranged in ascending order, and by $n$ the total number
of p-values or hypothesis tests in the list:
$$HC_{DJ} = \max_{1 \leq i \leq \alpha_0 n} \sqrt{n}\, [i/n - p_{(i)}] \Big/ \sqrt{p_{(i)}(1 - p_{(i)})} \tag{15}$$
$$HC_{LS} = \max_{k_0 \leq k \leq k_1} \sqrt{2n}\; I\{p_{(k)} < k/n\}\, \sqrt{(k/n)\log\!\left(k/(n p_{(k)})\right) - (k/n - p_{(k)})} \tag{16}$$
The HC formula determines a position $t$ in the list of ordered
p-values, giving a cut-off point at which all p-values with
positions $i \leq t$ are considered to belong to the alternative
hypothesis. Partial correlation coefficients with HC-significant
p-values are considered to be non-zero.
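The cut-off selection can be sketched for the $HC_{DJ}$ statistic of equation 15. This is a minimal illustration, not the JointStein code: the default `alpha0 = 0.5`, the rule of scanning at least one position, the skipping of p-values equal to 0 or 1, and the function name are all assumptions made here.

```python
import math


def higher_criticism_dj(pvalues, alpha0=0.5):
    """Return (HC statistic, position t) for equation 15.

    p-values are sorted in ascending order and positions 1..floor(alpha0 * n)
    are scanned; the position with the largest score is the cut-off t.
    """
    p_sorted = sorted(pvalues)
    n = len(p_sorted)
    upper = max(1, int(alpha0 * n))  # assumption: always scan at least one position
    best_hc, best_pos = -math.inf, None
    for i in range(1, upper + 1):
        p_i = p_sorted[i - 1]
        if not 0.0 < p_i < 1.0:
            continue  # the denominator vanishes at p = 0 or p = 1
        hc = math.sqrt(n) * (i / n - p_i) / math.sqrt(p_i * (1.0 - p_i))
        if hc > best_hc:
            best_hc, best_pos = hc, i
    return best_hc, best_pos


# Hypothetical p-values: one strong signal among otherwise unremarkable tests.
hc_value, cutoff = higher_criticism_dj([0.5, 0.001, 0.8, 0.2])
```

With these inputs the largest score occurs at the first ordered p-value, so only that test would be assigned to the alternative hypothesis.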
III. RESULTS AND DISCUSSION
A. JointStein workflow and benchmarking process
The JointStein approach is a joint graphical modeling workflow
based on Stein-type linear covariance matrix shrinkage. The
workflow comprises three main steps: data processing, two-target
linear shrinkage and significance testing. In the first step, when
working with scRNA-seq data, heterogeneous count data containing
samples from multiple unknown groups can be classified into
discrete clusters using existing clustering methods [31], [32].
However, when groups or cell types are already known, this
clustering step can be skipped and normalization can be performed
directly on the count data. After the data processing step,
two-target linear covariance matrix shrinkage is applied, which
produces an estimated partial correlation matrix. Standard
significance tests with p-value adjustment are then applied, and
when the number of features is above 1000, higher criticism is
recommended for estimating significant partial correlation
coefficients.
The JointStein workflow is benchmarked against standard linear
shrinkage (GeneNet [18]), regularization-based joint estimation
(JGL [10]) and Bayesian joint estimation (Bayes [11]). The
benchmarking process is carried out on simulated data and on
experimental single-cell RNA sequencing (scRNAseq) data. Simulation
benchmarking is conducted by simulating data from a multivariate
normal distribution, with precision matrices simulated using the
model from Figure 1, in which each group comprises 40% shared edges
and 60% randomly generated edges. The Matthews correlation
coefficient (MCC), which is a correlation coefficient between a set
of predicted and reference values, is chosen as the performance
metric. Further details of the benchmarking are provided in
Supplementary material 2.
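As a small illustration of the performance metric, the sketch below scores a predicted network against a reference network by counting edges over the upper triangle of the adjacency matrices and applying the usual MCC formula. The helper names, the toy networks and the convention of returning 0 when the denominator is zero are assumptions for this example.

```python
import math


def adjacency(p, edges):
    """Build a symmetric 0/1 adjacency matrix from a list of (i, j) edges."""
    adj = [[0] * p for _ in range(p)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


def edge_confusion(true_adj, pred_adj):
    """Count TP, FP, FN, TN over the upper triangle (each pair counted once)."""
    p = len(true_adj)
    tp = fp = fn = tn = 0
    for i in range(p):
        for j in range(i + 1, p):
            truth, pred = bool(true_adj[i][j]), bool(pred_adj[i][j])
            if truth and pred:
                tp += 1
            elif pred:
                fp += 1
            elif truth:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 if undefined (assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical 4-node networks: two edges recovered, one missed, one spurious.
truth = adjacency(4, [(0, 1), (1, 2), (2, 3)])
predicted = adjacency(4, [(0, 1), (1, 2), (0, 3)])
counts = edge_confusion(truth, predicted)
score = mcc(*counts)
```

Because true negatives enter the formula, MCC remains informative for sparse networks, where absent edges vastly outnumber present ones.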
To demonstrate the applicability of the method and the potential of
joint network estimation in real-world data, scRNAseq data from
classified brain cells during tumor progression and from the
malaria parasite (Plasmodium falciparum) at different development
stages are used [20], [19].
B. Performance comparison to existing joint graph estimation
approaches
Simulation benchmarking was conducted using multivariate normally
distributed data with joint network simulation. The performance
assessment was conducted under various conditions by adjusting the
sample size, the proportion of shared edges among networks, and the
number of groups. To examine the effect of sample size, the number
of observations ($n$) was varied between 200 and 2000, while the
number of features ($p$) remained constant at 2000 across 5 groups
($G$), with 40% of edges being shared across networks. Furthermore,
the impact of the shared edge proportion was analyzed by fixing
$p = 2000$, $n = 200$ and $G = 5$, while modifying the proportion
of shared edges from 10% to 90%. Lastly, the influence of the
number of groups was evaluated by adjusting $G$ from 2 to 8,
keeping $p = 2000$ and $n = 1000$, and maintaining 40% of edges as
shared across networks. The results are summarized in Figure 3. In
all cases, the Bayesian method outperforms the other methods, with
the JointStein approach second. Notably, the performance of the
$L_1$ regularization based joint estimation method, the fused
graphical lasso (FGL), does not increase as the number of samples
rises. A potential explanation for the performance of FGL is the
use of fixed regularization parameters ($\lambda_1 = 0.3$ and
$\lambda_2 = 0.1$), as current regularization parameter selection
methods were developed for standard graphical modeling [21] and are
not applicable to joint approaches. Hence, choosing the optimal
regularization parameters for FGL remains a subject for research.
In contrast to FGL, JointStein is nonparametric: its shrinkage
intensities are obtained by minimizing an objective function. On
the other hand, GeneNet performance remains constant when the
number of groups or the percentage of common edges increases. This
is to be expected, as the method is designed for inference in a
single group and does not exploit the shared network structure
present in the data. This simulation benchmarking illustrates the
potential of incorporating shared-network information to improve
the performance of graphical modeling.
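The shared-edge design used in the benchmark can be pictured with a short sketch that draws edge sets for several groups with a fixed shared fraction. This is only an illustration of the idea and not the simulation procedure of Supplementary material 2: it produces edge sets rather than precision matrices, and the function name, the seed and the sizes used are assumptions.

```python
import random
from itertools import combinations


def simulate_edge_sets(p, n_edges, n_groups, shared_fraction, seed=0):
    """Draw one shared edge set and per-group edge sets of size n_edges.

    Each group contains all shared edges plus randomly drawn group edges.
    Group edges are drawn independently, so two groups may by chance share
    a few additional edges.
    """
    rng = random.Random(seed)
    all_pairs = list(combinations(range(p), 2))
    rng.shuffle(all_pairs)
    n_shared = round(shared_fraction * n_edges)
    shared = set(all_pairs[:n_shared])
    remaining = all_pairs[n_shared:]
    groups = []
    for _ in range(n_groups):
        specific = set(rng.sample(remaining, n_edges - n_shared))
        groups.append(shared | specific)
    return shared, groups


# Hypothetical small setting: 30 nodes, 20 edges per group, 40% shared, 5 groups.
shared_edges, group_edges = simulate_edge_sets(30, 20, 5, 0.4)
```

Varying `shared_fraction` or `n_groups` in such a generator corresponds to the shared-edge and group-number scenarios described above.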
To explore the practicality of joint graphical modeling approaches
in large-scale data analysis, the computational times of the Bayes,
FGL, JointStein and GeneNet approaches were measured and are
reported in Table I. JointStein has by far the shortest
computational time of the three joint estimation methods (Bayes,
FGL and JointStein). Importantly, for large count matrix data
($p = 2000$, $n = 1000$, $G = 5$), Bayes and FGL take approximately
7 and 6 hours, respectively, to complete the estimation, whilst
JointStein takes under 3 minutes. This emphasizes the potential of
JointStein for joint estimation analysis when there is a large
number of genes. Importantly, the advantage of JointStein becomes
more pronounced as the number of cell groups increases. For
instance, in [20], 36 distinct cell clusters were annotated from
glioblastoma samples, and joint graphical modeling methods like
Bayes and FGL would be computationally prohibitive in such
large-scale multi-group settings. In contrast, the faster
computation of JointStein suggests its suitability for modeling
gene networks across numerous cell types, making it a promising
tool for large-scale single-cell transcriptomics analysis.
Fig. 3. Performance of JointStein against existing graphical modeling
methods on multivariate normally distributed data. The JointStein framework
is compared with other methods under variation in the number of samples
($p = 2000$, $G = 5$, proportion of common edges is 40%) (A), the percentage
of common edges ($p = 2000$, $n = 200$, $G = 5$) (B) and the number of groups
($p = 2000$, $n = 1000$, proportion of common edges is 40%) (C), with 10
iterations.
TABLE I
COMPUTATIONAL TIME. ALL ANALYSES WERE PERFORMED ON UBUNTU WITH INTEL XEON
GOLD 5215L CPU (10 ITERATIONS, P - NUMBER OF FEATURES/GENES, N - NUMBER OF
OBSERVATIONS/CELLS IN EACH GROUP, G - NUMBER OF GROUPS, S - SECONDS,
M - MINUTES, H - HOURS).
p      n      G    Bayes       FGL        GeneNet      JointStein
200    100    5    6.17(m)     2.09(m)    0.03(s)      0.17(s)
1000   500    5    52.86(m)    1.17(h)    3.220(s)     16.742(s)
2000   1000   5    7.02(h)     6.06(h)    29.722(s)    2.48(m)
C. Implementation in experimental scRNAseq data
Visualization and interpretation of gene expression data in
scRNAseq experiments are of paramount importance. Multiple methods
have been proposed, including heatmaps and clustering to identify
biologically related genes, gene set enrichment analysis (GSEA) for
functional annotation, and network analysis to demonstrate how
different pathways interact [22]. In this experimental
implementation, we use the JointStein workflow for network
inference, and combine the network-based results with heatmaps,
clustering and GSEA to explore the scRNAseq data from different
angles. First, gene expression data from scRNAseq studies are
collected after quality control and normalization. Cell clustering
and cluster identification then allow distinct groups to be
identified for use in our joint estimation approach. The JointStein
workflow is then applied to the scRNAseq data of each group to
infer gene networks. Following this, simplified networks, whose
nodes are gene clusters with their corresponding annotated
functions, are used to analyze interactions between different
pathways.
Fig. 4. Cell cluster plots of the glioblastoma study [20]. A, UMAP plot of
all cells with cell type annotation reproduced from the GBM study [20]. B,
Second-level clustering to identify subclusters of T cells.
Glioblastoma (GBM) is a malignant brain cancer with a highly
immunosuppressive and protumorigenic immune microenvironment [20].
scRNAseq technology was used to characterise changes in the immune
landscape during glioblastoma progression in a de novo orthotopic
mouse model [20]. GBM single cells were harvested at early and late
stages of GBM and classified by cell markers [20]. Normal brains
were included as controls [20]. The cell clustering UMAP plot was
reproduced from the study (Figure 4A). In this study, we focused on
T cells within the scRNAseq dataset to demonstrate the relevance of
the JointStein framework to joint network inference for single-cell
data analysis. T cells play a critical role in the
immune response to GBM and have been extensively studied
in cancer research, providing a well-characterized reference for
evaluating network inference results [54]. Their biological, ge-
nomic, and gene regulatory data in GBM are well-documented,
allowing for direct comparison with existing literature. This
enables us to assess the consistency of our inferred networks
with known gene interactions, further validating the robustness
of the proposed framework.
Further stratification was carried out since T cells are
shown to consist of multiple subpopulations, each with distinct
biological functions and roles in immune responses [33].
The original GBM study annotated this cluster broadly as
T cells without distinguishing finer subpopulations. Further
clustering was applied using the kNN algorithm to refine the
classification and identify subclusters that may better represent
biologically distinct T cell populations. The final clustering
result identified 20 subclusters of T cells and is presented in
Figure 4B.
In this implementation, quality-controlled scRNAseq data of T cells
from the three stages are used. Results are shown in Figure 5. In
this analysis, a total of 3,381 differentially expressed genes were
included to construct gene networks. To facilitate visualization
and interpretation, a gene clustering step was performed with the
objective of reducing the number of nodes and dimensions, enabling
a clearer representation of gene relationships while integrating
gene set annotations that reflect biological functions. Genes were
clustered based on their expression profiles in T cells, allowing
functionally related genes to be grouped together. Hierarchical
clustering was employed to identify gene clusters, where Euclidean
distance was used to compute the distance matrix, and the complete
agglomeration method was applied for hierarchical clustering. To
refine the clustering, we implemented the hybrid dynamic
tree-cutting method using the cutreeDynamic function in R, with the
deepSplit parameter set to 4 and a minimum of four genes per
cluster. This approach resulted in the identification of 62 gene
sets, which were annotated functionally using PANTHER
overrepresentation enrichment analysis [39]. The final gene sets
and their annotations were subsequently integrated into the network
plots to enhance biological interpretation.
Fig. 5. Joint gene network inference of the glioblastoma study [20]. A, The
UMAP of harvested T cells from 3 different stages was reproduced from the
main study, and smaller clusters were classified using the kNN algorithm.
The result is in Figure 4B. Clusters 5 and 7, which form clearly distinct
clusters with high percentages of cells from early and late stages of GBM,
are the subjects of interest. B, The shared network between all T cell
clusters in the UMAP is shown, with edge thickness corresponding to the
number of gene interactions between clusters. C & D, Networks specific to
the early-GBM and late-GBM clusters, respectively, are illustrated with the
average gene expression in each cluster as heatmap background.
Noticeably, clusters 5 and 7 have high proportions of early
and late stages of GBM cells respectively. Hence, these clus-
ters are of interest for joint network inference and annotated
as early-GBM and late-GBM clusters. The shared network
structure among all clusters is visualized in Figure 5B while
the specific networks of early-GBM and late-GBM clusters are
in Figure 5C and D, respectively. From the network inference
results, gene sets 24 and 34 are highly expressed in both of
these clusters. Gene set 24 comprises genes primarily involved
in antigen processing and immune response. This includes
C1qa, C1qb, and C1qc, which encode complement component
C1q, a key protein in both adaptive and innate immunity
[35]. C1q recognizes and binds to various immune system
activators, playing a crucial role in initiating complement
activation, clearing cellular debris, and regulating immune
cell interactions [35]. Structurally, it consists of six identical
subunits forming a collagenous stalk with globular heads,
which facilitate its binding functions [34]. Additionally, this
cluster contains H2-Aa, H2-Ab1, and H2-Eb1, encoding major
histocompatibility complex (MHC) class II proteins. These
proteins are essential for presenting processed antigens to
T cells, a critical step in triggering antigen-specific immune
responses [36]. Beyond their role in antigen presentation,
MHC class II molecules are also involved in intracellular
signaling pathways that can lead to apoptosis [36]. Further-
more, Tuba1b and Tubb5 are present in this cluster, encoding
tubulin proteins that provide structural integrity to the cell,
supporting cytoskeletal organization and stability [37]. Gene set
34 is enriched with genes encoding actin proteins, including
Actg1 and Actb. Actin filaments serve multiple essential
functions, including providing mechanical stability, facilitating
intracellular transport, and enabling cell movement [38].
In the late-GBM cell cluster, gene sets 46, 47, 48, 52, and 57 are
highly activated. Gene set 46 contains genes involved in mitochondrial
respiration, including mt-Nd1, mt-Nd4, mt-Cytb,
and mt-Co3. mt-Nd1 and mt-Nd4 encode subunits of NADH
dehydrogenase, an essential enzyme in the electron transport
chain responsible for transferring electrons from NADH to
ubiquinone, contributing to ATP production [40]. mt-Cytb
encodes cytochrome b, a key component of respiratory chain
complex III [41]. mt-Co3 encodes a subunit of cytochrome c oxidase, the
terminal enzyme in the electron transport chain, which catalyzes
the reduction of oxygen to water, driving oxidative phosphorylation
[42]. Together, these proteins play a fundamental
role in mitochondrial metabolism by enabling efficient cellular
respiration. Gene set 47 consists of Ccl5, Ctsb, and Ctsd,
which are involved in the regulation of macrophage apoptosis
[23], [24], [25]. These genes have been implicated in immune
system modulation and reported to contribute to the poor
prognosis of glioblastoma (GBM) [23], [24], [25]. Ccl5 (C-
C motif chemokine ligand 5) plays a role in inflammatory
responses and immune cell recruitment, while Ctsb and Ctsd
encode cathepsins, lysosomal proteases that regulate protein
degradation and apoptosis [43], [44]. Their involvement in
GBM progression highlights their potential role in tumor-associated
inflammation and immune evasion. Gene set 48 includes
Lyz1 and Lyz2, which encode lysozymes. Lysozymes
play a crucial role in host defense by breaking down bacterial
cell walls, contributing to antibacterial immunity [45]. These
genes have been reported to be particularly responsive to
bacterial infections, underscoring their importance in pathogen
defense and immune regulation [45].
Edges between gene groups represent connections between
individual genes belonging to each group. For example, the
presence of an edge between group 24 and group 48 indicates
that at least one gene in group 24 is linked to one or more
genes in group 48. The thickness of each edge corresponds to
the number of such connections, with thicker edges represent-
ing a higher number of gene-to-gene interactions between the
two groups. The connection between groups 24 and 48 is present in
the networks of all cell clusters, whereas the connection between groups
24 and 47 appears only in the networks of the early-GBM and late-GBM
cell clusters. In the context of the T cell gene network from
glioblastoma samples, the interaction between gene groups 24 and
48 may reflect coordinated immune responses within the tumor
microenvironment. Gene group 24 includes genes involved
in antigen processing and complement activation, such as
C1qa, C1qb, and C1qc, which play a role in recognizing
pathogens and modulating immune cell activity. Meanwhile,
gene group 48 contains Lyz1 and Lyz2, encoding lysozymes
that contribute to antibacterial defense. The presence of these
gene groups within the T cell network suggests a potential link
between complement-mediated immune signaling and immune
clearance responses. Given that complement activation can
influence macrophage function and enhance phagocytosis, it is
possible that interactions between these clusters contribute to
shaping the immune landscape in glioblastoma by regulating
immune clearance mechanisms and inflammatory responses
[46]. This interaction may be particularly relevant in the
context of glioblastoma, where immune modulation plays a
critical role in shaping tumor progression and immune evasion.
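The rule described above for drawing edges between gene groups (an edge exists if at least one gene-to-gene edge links the two groups, and its thickness reflects the number of such edges) can be sketched as follows. The function name is hypothetical, the group membership follows the text, and the gene-level edges are invented purely for illustration; they are not results from the study.

```python
from collections import Counter

def group_edge_weights(gene_edges, gene_to_group):
    """Count gene-to-gene edges between each pair of distinct gene groups."""
    weights = Counter()
    for gene_a, gene_b in gene_edges:
        group_a = gene_to_group[gene_a]
        group_b = gene_to_group[gene_b]
        if group_a != group_b:  # edges inside one group do not link two groups
            weights[frozenset((group_a, group_b))] += 1
    return weights

# Group membership as described in the text.
gene_to_group = {
    "C1qa": 24, "C1qb": 24, "C1qc": 24,
    "Lyz1": 48, "Lyz2": 48,
    "Ccl5": 47,
}
# Hypothetical gene-level edges, for illustration only.
gene_edges = [
    ("C1qa", "Lyz1"),
    ("C1qb", "Lyz2"),
    ("C1qa", "C1qb"),
    ("C1qc", "Ccl5"),
]

weights = group_edge_weights(gene_edges, gene_to_group)
for pair, count in weights.items():
    print(sorted(pair), count)
```

In a plot, each count would be mapped to an edge width, so the pair with two gene-level edges would be drawn thicker than the pair with one.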
In both early-GBM and late-GBM networks, gene group
50, which consists of pseudogenes (Gm15464, Gm10689,
Gm17786, Gm8618, Gm10054) in Mus musculus, remains
isolated, lacking connections to other gene groups. While
little is known about the functions of these pseudogenes, their
identification in differential gene expression analysis suggests
a potential role in immune modulation within T cells in the
glioblastoma microenvironment. In contrast, gene group 17
exhibits interactions exclusively in the early-GBM cluster,
forming connections with gene groups 4, 9, and 39. Notably,
the connection between gene group 17 and 9 includes edges
between Tmsb10 and Calm1. Tmsb10 encodes thymosin beta 10,
an actin-binding protein involved in cytoskeletal organization, and has
been associated with immune cell activity, including that of T cells
(gamma delta, CD4+ Th1 and Th2, CD8+) and macrophages (M1
and M2) [47], [48]. Meanwhile, Calm1 encodes calmodulin,
a calcium sensor involved in intracellular Ca2+ signaling,
playing a fundamental role in cellular processes and indirectly
contributing to T-cell and B-cell activation [49]. Gene group
17 connects to gene group 39 through edges between Cfl1 and
Arhgdib. Cfl1 encodes the cofilin protein, which is essential
for actin cytoskeletal remodeling. It has been demonstrated
that cofilin plays a critical role in early αβ T-cell development,
while also exhibiting differential involvement in αβ versus γδ
T-cell maturation [50]. Similarly, Arhgdib, which encodes mi-
nor histocompatibility antigens (MiHA), has been implicated
in antigen presentation and immune cell recognition [51]. A
study has shown that T cells can recognize LB-ARHGDIB-1R
(MiHA) on primary leukemic cells [51]. Together, the specific
connections between gene group 17 and gene groups 9 and
39 highlight key processes involved in T-cell activation and
antigen presentation during the early stages of glioblastoma
progression. The absence of these interactions in late-GBM
may indicate a shift in immune dynamics, potentially reflect-
ing changes in the tumor microenvironment that alter T-cell
function and immune surveillance.
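Comparisons of this kind (connections present in one network but absent in another, and groups without any connection) amount to set operations on edge lists. The sketch below is a minimal illustration with hypothetical helper names; the two toy edge lists encode only the group-level connections explicitly named in the text, not the full inferred networks.

```python
def edge_set(pairs):
    """Represent undirected edges as a set of unordered pairs."""
    return {frozenset(pair) for pair in pairs}

def compare_networks(early, late):
    """Split edges into shared, early-only, and late-only sets."""
    return {
        "shared": early & late,
        "early_only": early - late,
        "late_only": late - early,
    }

def isolated_nodes(nodes, edges):
    """Return nodes that do not appear in any edge."""
    connected = set().union(*edges) if edges else set()
    return set(nodes) - connected

# Only the connections named in the text are encoded here.
nodes = [4, 9, 17, 24, 39, 47, 48, 50]
early = edge_set([(24, 48), (24, 47), (17, 4), (17, 9), (17, 39)])
late = edge_set([(24, 48), (24, 47), (17, 24)])

result = compare_networks(early, late)
print("early only:", sorted(sorted(e) for e in result["early_only"]))
print("late only:", sorted(sorted(e) for e in result["late_only"]))
print("isolated:", isolated_nodes(nodes, early | late))
```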
In the late-GBM network, gene group 17 forms a new
connection with gene group 24 through edges between Apoe
and Ftl1. Apoe encodes apolipoprotein E, a key component of
plasma lipoproteins that not only regulates lipid metabolism
but also promotes tumor progression [52]. Notably, Apoe deficiency
has been shown to support antitumor immunity by
preventing T cell exhaustion, suggesting its role in immune
suppression within the glioblastoma microenvironment [52].
Ftl1, encoding ferritin light polypeptide 1, is part of the
ferritin complex, which plays a crucial role in iron storage
and immune regulation [53]. Higher expression levels of
ferritin components, including FTL and FTH, have been observed
in regulatory T cells (Tregs) compared to conventional
CD4+CD127+CD25− T cells, indicating a potential role
in maintaining the function of Tregs [53]. The presence of this
connection in late-stage glioblastoma suggests a shift toward
an immunosuppressive microenvironment, where the connection
between Apoe and Ftl1 may contribute to tumor progression
by promoting Treg-mediated immune evasion and T-cell
exhaustion. This newly established interaction in late-GBM
highlights a possible mechanism by which glioblastoma ad-
vances to a more immunosuppressive state, potentially limiting
effective anti-tumor immune responses.
IV. CONCLUSIONS
Joint gene network inference algorithms support the anal-
ysis of scRNAseq data by inferring and comparing networks
between groups of cells. In this study, we proposed a new
approach to joint graphical modeling using two-target linear
shrinkage (TTLS). TTLS shows better performance than individual
network inference with GeneNet and than the fused graphical
lasso, and has potential for large-scale analysis because it requires less
computational time than the Bayesian approach considered.
Furthermore, higher criticism is highlighted as a
second-level significance testing approach for
high-dimensional partial correlation matrix estimation under high
sparsity. Importantly, JointStein does not require prior knowledge,
which broadens its application to species
with less available data.
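For readers unfamiliar with two-target linear shrinkage, the following sketch shows its general form: the sample covariance matrix of one group is combined linearly with two target matrices, and partial correlations are then read off the inverse of the shrunk estimate. Everything specific here is an assumption made for illustration: the choice of targets (a scaled identity and a simple pooled covariance), the fixed intensities of 0.3, and the function names. In particular, the intensities are arbitrary placeholders and are not computed with the optimal formula proposed in this work.

```python
import numpy as np

def two_target_shrinkage(sample_cov, target_1, target_2, lam_1, lam_2):
    """Linear combination of a sample covariance matrix and two targets."""
    if lam_1 < 0 or lam_2 < 0 or lam_1 + lam_2 > 1:
        raise ValueError("intensities must be non-negative and sum to at most 1")
    return (1.0 - lam_1 - lam_2) * sample_cov + lam_1 * target_1 + lam_2 * target_2

def partial_correlations(cov):
    """Partial correlations from the inverse of a covariance matrix."""
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated data: fewer cells than genes, so the sample covariance is singular.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(5, 8))  # 5 cells, 8 genes
group_b = rng.normal(size=(6, 8))  # 6 cells, 8 genes

cov_a = np.cov(group_a, rowvar=False)
# Assumed targets: a scaled identity and a covariance pooled over both groups.
identity_like = np.mean(np.diag(cov_a)) * np.eye(8)
pooled = np.cov(np.vstack([group_a, group_b]), rowvar=False)

# Placeholder intensities, not the optimal values derived in this work.
shrunk = two_target_shrinkage(cov_a, identity_like, pooled, lam_1=0.3, lam_2=0.3)
pcor = partial_correlations(shrunk)
print(np.round(pcor[:3, :3], 3))
```

Because the scaled-identity target is positive definite, any positive weight on it makes the shrunk estimate invertible even when the sample covariance is not, which is what allows partial correlations to be computed in the high-dimensional setting.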
In the analysis of experimental data from T cell clusters in a
glioblastoma study, the potential of the JointStein framework
in learning local network structures in heterogeneous data is
demonstrated. In future work, we aim to extend this approach to
heterogeneous samples with less clearly defined groups, where
JointStein could be used to estimate local networks by sharing
data from nearby samples. This will allow the approach to be
applied to single cell data representing trajectories of cells, or
to time series data.
REFERENCES
[1] C. Giraud, Introduction to High-Dimensional Statistics. Chapman and
Hall/CRC, 2021.
[2] J. Friedman, T. Hastie, and R. Tibshirani, ”Sparse inverse covariance
estimation with the graphical lasso,” Biostatistics, vol. 9, no. 3, pp. 432–
441, 2008.
[3] N. Meinshausen and P. Bühlmann, ”High-dimensional graphs and variable
selection with the lasso,” 2006.
[4] O. Ledoit and M. Wolf, ”A well-conditioned estimator for large-
dimensional covariance matrices,” Journal of Multivariate Analysis, vol.
88, no. 2, pp. 365–411, 2004.
[5] K. Tsai, O. Koyejo, and M. Kolar, ”Joint Gaussian graphical model
estimation: A survey,” Wiley Interdisciplinary Reviews: Computational
Statistics, vol. 14, no. 6, p. e1582, 2022.
[6] J. Guo, E. Levina, G. Michailidis, and J. Zhu, ”Joint estimation of multiple
graphical models,” Biometrika, vol. 98, no. 1, pp. 1–15, 2011.
[7] O. Ledoit and M. Wolf, ”Honey, I shrunk the sample covariance matrix,”
UPF Economics and Business Working Paper, no. 691, 2003.
[8] D. Jovic, X. Liang, H. Zeng, L. Lin, F. Xu, and Y. Luo, ”Single-cell RNA
sequencing technologies and applications: A brief overview,” Clinical and
Translational Medicine, vol. 12, no. 3, p. e694, 2022.
[9] R. Mazumder and T. Hastie, ”The graphical lasso: New insights and
alternatives,” Electronic Journal of Statistics, vol. 6, p. 2125, 2012.
[10] P. Danaher, P. Wang, and D. M. Witten, ”The joint graphical lasso
for inverse covariance estimation across multiple classes,” Journal of the
Royal Statistical Society Series B: Statistical Methodology, vol. 76, no.
2, pp. 373–397, 2014.
[11] B. Jia, F. Liang, and the TEDDY Study Group, ”Fast hybrid Bayesian
integrative learning of multiple gene regulatory networks for type 1
diabetes,” Biostatistics, vol. 22, no. 2, pp. 233–249, 2021.
[12] Y. Ikeda, T. Kubokawa, and M. S. Srivastava, ”Comparison of linear
shrinkage estimators of a large covariance matrix in normal and non-
normal distributions,” Computational Statistics & Data Analysis, vol. 95,
pp. 95–108, 2016.
[13] T. Lancewicki and M. Aladjem, ”Multi-target shrinkage estimation for
covariance matrices,” IEEE Transactions on Signal Processing, vol. 62,
no. 24, pp. 6380–6390, 2014.
[14] T. J. Fisher and X. Sun, ”Improved Stein-type shrinkage estimators for
the high-dimensional multivariate normal covariance matrix,” Computational
Statistics & Data Analysis, vol. 55, no. 5, pp. 1909–1918, 2011.
[15] D. Donoho and J. Jin, ”Higher criticism for detecting sparse heteroge-
neous mixtures,” 2004.
[16] D. Donoho and J. Jin, ”Feature selection by higher criticism thresholding
achieves the optimal phase diagram,” Philosophical Transactions of the
Royal Society A: Mathematical, Physical and Engineering Sciences, vol.
367, no. 1906, pp. 4449–4470, 2009.
[17] J. Li and D. Siegmund, ”Higher criticism: p-values and criticism,” 2015.
[18] J. Schäfer and K. Strimmer, ”A shrinkage approach to large-scale
covariance matrix estimation and implications for functional genomics,”
Statistical Applications in Genetics and Molecular Biology, vol. 4, no. 1,
2005.
[19] A. Poran, C. Nötzel, O. Aly, N. Mencia-Trinchant, C. T. Harris, M. L.
Guzman, D. C. Hassane, O. Elemento, and B. F. C. Kafsack, ”Single-cell
RNA sequencing reveals a signature of sexual commitment in malaria
parasites,” Nature, vol. 551, no. 7678, pp. 95–99, 2017.
[20] A. T. Yeo, S. Rawal, B. Delcuze, A. Christofides, A. Atayde, L. Strauss,
L. Balaj, V. A. Rogers, E. J. Uhlmann, H. Varma, et al., ”Single-cell RNA
sequencing reveals evolution of immune landscape during glioblastoma
progression,” Nature Immunology, vol. 23, no. 6, pp. 971–984, 2022.
[21] T. Zhao, H. Liu, K. Roeder, J. Lafferty, and L. Wasserman, ”The huge
package for high-dimensional undirected graph estimation in R,” The
Journal of Machine Learning Research, vol. 13, no. 1, pp. 1059–1062,
2012.
[22] ”Biological interpretation of gene expression
data,” https://www.ebi.ac.uk/training/online/courses/functional-genomics-ii-common-technologies-and-data-analysis-methods/biological-interpretation-of-gene-expression-data-2/,
accessed: Nov. 19, 2024.
[23] M. Novak, M. K. Krajnc, B. Hrastar, B. Breznik, B. Majc, M. Mlinar, A.
Rotter, A. Porčnik, J. Mlakar, K. Stare, et al., ”CCR5-mediated signaling
is involved in invasion of glioblastoma cells in its microenvironment,”
International Journal of Molecular Sciences, vol. 21, no. 12, p. 4199,
2020.
[24] M. K. Kranjc, M. Novak, R. G. Pestell, and T. T. Lah, ”Cytokine CCL5
and receptor CCR5 axis in glioblastoma multiforme,” Radiology and
Oncology, vol. 53, no. 4, pp. 397–406, 2019.
[25] X. Ding, C. Zhang, H. Chen, M. Ren, and X. Liu, ”Cathepsins trigger
cell death and regulate radioresistance in glioblastoma,” Cells, vol. 11,
no. 24, p. 4108, 2022.
[26] C. E. Griguer, A. B. Cantor, H. M. Fathallah-Shaykh, G. Y. Gillespie,
A. S. Gordon, J. M. Markert, I. Radovanovic, V. Clement-Schatlo, C.
N. Shannon, and C. R. Oliva, ”Prognostic relevance of cytochrome C
oxidase in primary glioblastoma multiforme,” PloS One, vol. 8, no. 4, p.
e61035, 2013.
[27] C. Ziegenhain, B. Vieth, S. Parekh, B. Reinius, A. Guillaumet-Adkins,
M. Smets, H. Leonhardt, H. Heyn, I. Hellmann, and W. Enard, ”Comparative
analysis of single-cell RNA sequencing methods,” Molecular Cell,
vol. 65, no. 4, pp. 631–643, 2017.
[28] V. Svensson, R. Vento-Tormo, and S. A. Teichmann, ”Exponential
scaling of single-cell RNA-seq in the past decade,” Nature Protocols,
vol. 13, no. 4, pp. 599–604, 2018.
[29] S. Huang, J. Li, L. Sun, J. Ye, A. Fleisher, T. Wu, K. Chen, E.
Reiman, and the Alzheimer’s Disease NeuroImaging Initiative, ”Learning
brain connectivity of Alzheimer’s disease by sparse inverse covariance
estimation,” NeuroImage, vol. 50, no. 3, pp. 935–949, 2010.
[30] C. J. Oates, J. Korkola, J. W. Gray, and S. Mukherjee, ”Joint estimation
of multiple related biological networks,” The Annals of Applied Statistics,
vol. 8, no. 3, pp. 1892–1919, 2014.
[31] L. Van der Maaten and G. Hinton, ”Visualizing data using t-SNE,”
Journal of Machine Learning Research, vol. 9, no. 11, 2008.
[32] L. McInnes, J. Healy, and J. Melville, ”UMAP: Uniform manifold
approximation and projection for dimension reduction,” arXiv preprint
arXiv:1802.03426, 2018.
[33] C.-H. Koh, S. Lee, M. Kwak, B.-S. Kim, and Y. Chung, ”CD8 T-cell
subsets: heterogeneity, functions, and therapeutic potential,” Experimental
& Molecular Medicine, vol. 55, no. 11, pp. 2287–2299, 2023.
[34] O. Middleton, H. Wheadon, and A. M. Michie, ”Classical complement
pathway,” in Encyclopedia of Immunobiology, M. J. H. Ratcliffe, Ed.
Oxford: Academic Press, 2016, pp. 318–324. DOI:
https://doi.org/10.1016/B978-0-12-374279-7.02014-2.
[35] K. B. M. Reid, ”Complement component C1q: historical perspective of a
functionally versatile, and structurally unusual, serum protein,” Frontiers
in Immunology, vol. 9, p. 764, 2018.
[36] T. M. Holling, E. Schooten, and P. J. van Den Elsen, ”Function and
regulation of MHC class II molecules in T-lymphocytes: of mice and
men,” Human Immunology, vol. 65, no. 4, pp. 282–290, 2004.
[37] P. Binarová and J. Tuszynski, ”Tubulin: structure, functions and roles in
disease,” Cells, vol. 8, no. 10, p. 1294, 2019.
[38] T. D. Pollard and J. A. Cooper, ”Actin, a central player in cell shape
and movement,” Science, vol. 326, no. 5957, pp. 1208–1212, 2009.
[39] H. Mi, X. Huang, A. Muruganujan, H. Tang, C. Mills, D. Kang, and P.
D. Thomas, ”PANTHER version 11: expanded annotation data from Gene
Ontology and Reactome pathways, and data analysis tool enhancements,”
Nucleic Acids Research, vol. 45, no. D1, pp. D183–D189, 2017.
[40] T. Yagi, B. B. Seo, E. Nakamaru-Ogiso, M. Marella, J. Barber-Singh,
T. Yamashita, and A. Matsuno-Yagi, ”Possibility of transkingdom gene
therapy for complex I diseases,” Biochimica et Biophysica Acta (BBA)-
Bioenergetics, vol. 1757, no. 5-6, pp. 708–714, 2006.
[41] J. Everse, ”Heme proteins,” Elsevier, 2013.
[42] I. Ishigami, R. G. Sierra, Z. Su, A. Peck, C. Wang, F. Poitevin, S.
Lisova, B. Hayes, F. R. Moss III, S. Boutet, et al., ”Structural insights
into functional properties of the oxidized form of cytochrome c oxidase,”
Nature Communications, vol. 14, no. 1, p. 5752, 2023.
[43] R. E. Marques, R. Guabiraba, R. C. Russo, and M. M. Teixeira, ”Tar-
geting CCL5 in inflammation,” Expert Opinion on Therapeutic Targets,
vol. 17, no. 12, pp. 1439–1460, 2013.
[44] D. Dheer, J. Nicolas, and R. Shankar, ”Cathepsin-sensitive nanoscale
drug delivery systems for cancer therapy and other diseases,” Advanced
Drug Delivery Reviews, vol. 151, pp. 130–151, 2019.
[45] S. A. Ragland and A. K. Criss, ”From bacterial killing to immune
modulation: Recent insights into the functions of lysozyme,” PLoS
Pathogens, vol. 13, no. 9, p. e1006512, 2017.
[46] S. S. Bohlson, S. D. O’Conner, H. J. Hulsebus, M.-M. Ho, and
D. A. Fraser, ”Complement, C1q, and C1q-related molecules regulate
macrophage polarization,” Frontiers in Immunology, vol. 5, p. 402, 2014.
[47] R. Xiao, S. Shen, Y. Yu, Q. Pan, R. Kuang, and H. Huang, ”TMSB10
promotes migration and invasion of cancer cells and is a novel prognostic
marker for renal cell carcinoma,” International Journal of Clinical and
Experimental Pathology, vol. 12, no. 1, p. 305, 2019.
[48] Z. Li, Y. Li, Y. Tian, N. Li, L. Shen, and Y. Zhao, ”Pan-cancer analysis
identifies the correlations of Thymosin Beta 10 with predicting prognosis
and immunotherapy response,” Frontiers in Immunology, vol. 14, p.
1170539, 2023.
[49] S. Hasterok, B. Nyesiga, and A. G. Wingren, ”CALM1,” Atlas of Genetics
and Cytogenetics in Oncology and Haematology, 2022. Available:
https://atlasgeneticsoncology.org/gene/208988/calm1.
[50] I. Seeland, Y. Xiong, C. Orlik, D. Deibel, S. Prokosch, G. Küblbeck,
B. Jahraus, D. De Stefano, S. Moos, F. C. Kurschus, et al., ”The actin
remodeling protein cofilin is crucial for thymic αβ but not γδ T-cell
development,” PLoS Biology, vol. 16, no. 7, p. e2005380, 2018.
[51] M. J. Pont, W. Hobo, M. W. Honders, S. A. P. van Luxemburg-Heijs, M.
G. D. Kester, A. M. van Oeveren-Rietdijk, N. Schaap, H. C. de Boer, C.
A. M. van Bergen, and H. Dolstra, ”LB-ARHGDIB-1R as a novel minor
histocompatibility antigen for therapeutic application,” Haematologica,
vol. 100, no. 10, p. e419, 2015.
[52] B. Zhao, W. Wang, Z. Chen, and S. Jin, ”Apolipoprotein E deficiency
enhances the anti-tumor immunity via repressing T cell exhaustion,” All
Life, vol. 14, no. 1, pp. 1054–1062, 2021.
[53] Q. Wu, A. R. Carlos, F. Braza, M.-L. Bergman, J. Z. Kitoko, P. Bastos-
Amador, E. Cuadrado, R. Martins, B. S. Oliveira, V. C. Martins, et al.,
”Ferritin heavy chain supports stability and function of the regulatory T
cell lineage,” The EMBO Journal, vol. 43, no. 8, pp. 1445–1483, 2024.
[54] H. Wang, H. Zhou, J. Xu, Y. Lu, X. Ji, Y. Yao, H. Chao, J. Zhang,
X. Zhang, S. Yao, et al., ”Different T-cell subsets in glioblastoma
multiforme and targeted immunotherapy,” Cancer Letters, vol. 496,
pp. 134–143, 2021.