Int J Comput Intell Syst (2025) 18:64 https://doi.org/10.1007/s44196-025-00792-w
RESEARCH ARTICLE
TraitBertGCN: Personality Trait Prediction Using BertGCN with Data Fusion Technique
Muhammad Waqas1 · Fengli Zhang1,2 · Asif Ali Laghari3 · Ahmad Almadhor4 · Filip Petrinec5 · Asif Iqbal1 · Mian Muhammad Yasir Khalil1
Received: 20 November 2024 / Revised: 21 February 2025 / Accepted: 10 March 2025
© The Author(s) 2025
Abstract
Personality prediction via different techniques is an established and trending topic in psychology. The advancement of machine learning algorithms across multiple fields has also drawn attention to Automatic Personality Prediction (APP). This research proposes a novel TraitBertGCN method with a data fusion technique for predicting personality traits. This work integrates a pre-trained language model, Bidirectional Encoder Representations from Transformers (BERT), with a three-layer Graph Convolutional Network (GCN) to leverage large-scale language understanding and graph-based learning for personality prediction. To reduce bias and generalize the model across different domains, this study fuses two datasets (essays and myPersonality). We fine-tuned our TraitBertGCN model on the fused dataset and then evaluated it on each dataset individually to assess its adaptability and accuracy in varied contexts. Compared with previous studies, our model achieved better performance in personality trait prediction across multiple datasets, with an average accuracy of 77.42% on the essays dataset and 87.59% on the myPersonality dataset.
Keywords Automatic personality prediction · Bidirectional encoder representations from transformers · Graph convolutional network · TraitBertGCN · Data fusion
1 Introduction
Personality defines a person's actual identity. It comprises a distinctive combination of behaviors, thoughts, emotions, and ideas that define a person and influence their perception of the world. Recent psychology research shows that personality traits affect mental health, professional performance, interpersonal relationships, well-being, and consumer behavior. Automatic personality prediction (APP) is a developing field that predicts a person's personality using observable data. Advancements in technology make it possible to predict personality traits from social media, textual material, photos, and videos [1]. The most widely used personality trait classification models were developed in personality psychology. The best known of these is the Five Factor Model (FFM), sometimes called the Big Five model [2, 3], which divides personality traits into five main classifications: openness, conscientiousness, extroversion, agreeableness, and neuroticism (OCEAN). Each characteristic lies on a continuum, with different people exhibiting different amounts of it. For example, extroversion can show up in two extremes: people who are very social and open, and people who are more shy and introspective. Psychology researchers generally agree on the Big Five model, because it can explain complex aspects of human behavior and can be used in a wide range of cultural contexts.
The idea behind predicting a person's personality is to see where they fall within the Big Five model categories. Previously, psychologists would do this by administering extensive questionnaires, such as the Revised NEO (Neuroticism, Extroversion, Openness) Personality Inventory (NEO-PI-R) [4], and assessing the person based on their responses. This method has its own problems: people may not always be honest about how they really are, and administering so many questions is difficult in large-scale studies. Therefore, researchers are exploring automatic ways to predict a person's personality. They examine sources such as a person's emails, social media posts, or other writings with the help of available artificial intelligence approaches.
Being able to predict a person's personality accurately is really useful. It may help in various tasks of daily life and real-world domains, including marketing, psychology, politics, and even Human Resource Management (HRM). These automated techniques can help in understanding the preferences, likes, dislikes, and tendencies of humans without interacting with them directly. An automatic personality prediction system can recognize human behaviors by simply analyzing their writing style, pictures, and behavior on social media. Moreover, implementing these APP systems can be beneficial in various applications, like boosting social media participation, recommendation systems, and even supporting mental health treatment [1].
Recently, the APP field has attracted the attention of computer scientists and researchers, who have implemented machine learning algorithms that can analyze huge amounts of textual data. Textual data may take the form of emails, social media posts, and other digital communication used to infer personality traits. The exponential growth in digital content generation has produced an explosion of data that may reveal personality traits. The huge amount of online data generated by users has made text analysis an effective method for personality trait prediction. Natural Language Processing (NLP) is crucial to textual data analysis [5, 6], particularly with advanced models like BERT (Bidirectional Encoder Representations from Transformers) [7]. According to previous studies, BERT has performed well in classification tasks, like text classification, sentiment analysis, and personality prediction in the field of NLP. Its ability to recognize contextual meaning has improved language understanding as well as interpretation in different applications [6].
Earlier text-based personality prediction studies inspected the words and sentences related to personality traits using lexical approaches to language. In such a vocabulary, any sequence of words or sentences that represents social interaction or positive emotions can denote extroverts, whereas a sequence of words showing negative emotions can be recognized as neurotic. To capture the nuances of a language, a manually maintained lexicon method known as Linguistic Inquiry and Word Count (LIWC) [8] builds relationships between words based on psychological classifications. This procedure also has some challenges alongside the advantages it offers. One major issue with this manual technique is that the predetermined word lists limit the ability to capture the full complexity and detail of a language. The procedure provides beneficial information but, due to these defined limitations, cannot completely recognize human expression and personality.
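To make the lexicon idea concrete, the following minimal Python sketch counts trait-indicative words; the word lists are illustrative stand-ins for demonstration, not the actual LIWC categories, which are proprietary.

```python
# Minimal sketch of lexicon-based trait signaling (illustrative word lists,
# not the proprietary LIWC categories).
POSITIVE_SOCIAL = {"party", "friends", "talk", "fun", "together"}
NEGATIVE_EMOTION = {"worried", "sad", "afraid", "angry", "alone"}

def lexicon_scores(text: str) -> dict:
    """Return the fraction of tokens matching each illustrative category."""
    tokens = text.lower().split()
    total = max(len(tokens), 1)
    return {
        "extroversion_signal": sum(t in POSITIVE_SOCIAL for t in tokens) / total,
        "neuroticism_signal": sum(t in NEGATIVE_EMOTION for t in tokens) / total,
    }

print(lexicon_scores("I love to talk and party with friends"))
```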
Technological advancements in machine learning and deep learning algorithms have also played a vital role in predicting text-based personality traits. These algorithms can extract complex features from text data without human intervention. Another advantage, which makes them more suitable for tasks like personality prediction, is that they can also capture the relationships between words, phrases, and sentences [1]. In recent years, pre-trained language models like BERT have performed well across many fields, especially natural language processing (NLP), so they can also be favorable for predicting text-based personality traits. These models learn contextualized word embeddings, so the understanding of a word in the text can be improved based on its context.
Even with all the current advancements, personality trait prediction remains challenging due to the absence of large amounts of labeled data. This is the major issue in training models for this type of task: because of confidentiality concerns, collecting personality data is often hard, time-consuming, and tedious. Moreover, previous studies have some deficiencies because they ignore relationships between data points when evaluating words or documents.
Graph Convolutional Networks (GCNs) [9] can be constructive for these types of tasks [10]. Modeling the connections between numerous data points helps to improve personality trait analysis and overcome some of deep learning's limitations. GCNs are excellent tools for evaluating structured data, including text corpora, social networks, and citation networks. These networks describe interconnections between elements like words and documents and propagate information via these connections. GCNs excel at analyzing data points and their relationships; they can express connections across words and sentences, improving personality prediction through a deeper understanding of semantic structure.
In recent studies, hybrid models that combine BERT with different deep learning algorithms have been investigated for personality trait prediction and have achieved better accuracy and performance [11, 12]. This research proposes a hybrid model for personality trait prediction that integrates two BERT variants (BERT-base and RoBERTa-base) with the strengths of GCN, in which we utilize GCN's graph-based word and document associations together with BERT's contextualized word embeddings. BERT excels at extracting complex semantic representations from textual data, whereas GCNs improve classification accuracy using word–document structural connections [13]. This hybrid technique has achieved optimum performance in NLP applications like text classification [14]. However, personality prediction using this hybrid model is still in its infancy. Based on our literature review, TraitBertGCN is the first hybrid architecture to improve textual personality trait prediction by fine-tuning the BERT and GCN components simultaneously to leverage structural connections and text semantics. This research aims to build a hybrid model that generalizes across different datasets and linguistic styles to develop personality trait prediction systems, especially with sparse or noisy data. Our proposed model with a data fusion approach outperforms current personality prediction approaches in precision and accuracy across two benchmark datasets. The specific contributions of our research are as follows:
• This research proposes a hybrid model called TraitBertGCN with a data fusion technique to predict personality traits. The initial step of this study is to create BERT embeddings from the fused text datasets. Due to memory limitations, we process the data in batches when computing the embeddings. Finally, we save these embeddings, which lets the model retain contextually rich representations for further analysis.
• After creating the embeddings, a graph structure is generated using cosine similarity between document embeddings. Based on embedding similarity, nodes and edges are built for each document, implying that closely connected documents are likely to share similar personality traits. The graph records document relationships using K-Nearest Neighbors (KNN), which improves the model's ability to predict personality traits from local and global contexts.
• The next step is to fine-tune the BERT model on the fused dataset for multi-label classification. This dataset is divided into training and validation sets according to the 80/20 rule to evaluate the model accurately. The fine-tuned BERT model optimizes accuracy, recall, and F-measure by associating textual information with personality traits. Class weights applied during fine-tuning rectify the dataset's imbalanced classes and ensure all traits are appropriately represented.
• Finally, we integrate the fine-tuned BERT with a three-layer GCN so that the model benefits from both the text embeddings and the graph's structural relationships; a condensed sketch of this composition follows the list. The GCN layers use the interconnectivity of document representations to improve predictions. The model is then evaluated on different test datasets, where its performance is assessed through various metrics, including accuracy and F-measure, providing a comprehensive overview of its effectiveness in predicting personality traits. This multifaceted approach reveals the power of deep learning and graph-based approaches to improve prediction accuracy.
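As a preview of the detailed methodology, the following minimal sketch shows how a BERT encoder could feed a three-layer GCN head for multi-label trait prediction. It assumes a Hugging Face `transformers` BERT and a dense normalized adjacency matrix `a_hat`; the class name and layer sizes are illustrative, not the exact implementation.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class TraitBertGCNSketch(nn.Module):
    """Illustrative composition of a BERT encoder with a 3-layer GCN head."""
    def __init__(self, hidden=768, gcn_hidden=256, num_traits=5):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.w1 = nn.Linear(hidden, gcn_hidden)
        self.w2 = nn.Linear(gcn_hidden, gcn_hidden)
        self.w3 = nn.Linear(gcn_hidden, num_traits)

    def gcn_layer(self, a_hat, x, w):
        # One graph convolution: propagate neighbor features, then transform.
        return torch.relu(w(a_hat @ x))

    def forward(self, input_ids, attention_mask, a_hat):
        # [CLS] embeddings become document node features.
        x = self.bert(input_ids=input_ids,
                      attention_mask=attention_mask).last_hidden_state[:, 0]
        x = self.gcn_layer(a_hat, x, self.w1)
        x = self.gcn_layer(a_hat, x, self.w2)
        # Sigmoid gives one independent probability per Big Five trait.
        return torch.sigmoid(a_hat @ self.w3(x))
```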
The remainder of this study delves into the background of APP and the advantages of using hybrid models. We begin by reviewing the major personality trait models in Sect. 2. Section 3 describes the methodology used in personality prediction from text, followed by exploring the proposed model, which combines BERT with GCN for personality classification. In Sect. 4, we compare our proposed model's results with previous studies and discuss them. We conclude our work in Sect. 5.
2 Related Work
This section reviews personality prediction research using textual data, deep learning, GCNs, and hybrid models like BERT and GCN. Previous investigations provide a solid foundation for our personality trait prediction research. Automatic prediction of personality traits from text is an emerging field in NLP and affective computing. This multidisciplinary problem involves deep learning, graph-based neural networks, and computational linguistics.
2.1 Personality Trait Prediction from Text Data
Affective computing and sentiment analysis are the primary domains of artificial intelligence that seek to investigate human emotions, thoughts, and personality using multiple modalities, including textual content, audio, and video data [1]. The prediction of personality is mainly grounded in psychology, which defines traits as psychological factors that shape a person's emotions, attitudes, feelings, and actions [15]. With the rise of social media and textual communication, text data analysis has become a popular technique for predicting personality traits.
Text data is an immediate expression of people's thoughts, ideas, feelings, and emotions, making it a valuable resource for personality prediction research. Previous studies on personality detection relied primarily on linguistic features, including the Linguistic Inquiry and Word Count (LIWC) [8], which helped researchers pick out linguistic markers related to personality traits. The prominent psychological Big Five model [2], which has been extensively utilized in computational personality prediction techniques, permits researchers to associate personality traits with textual data. The transition from traditional feature engineering to automatic feature extraction in NLP has revolutionized these prediction models. Early researchers deployed different machine learning algorithms and classical classifiers [including support vector machines (SVM) and decision trees (C4.5)] to retrieve lexical, semantic, and syntactic information from text [16]. The popular Essays dataset compiled by Pennebaker et al. [17] set the "gold standard" for personality prediction studies, identifying many important features and integrating attributes that had a big impact on trait prediction.
2.2 Deep Learning Approaches for Personality Prediction
As deep learning becomes increasingly important, researchers have studied more complicated neural network models to predict personality traits from text [1]. Convolutional Neural Networks (CNNs) became a popular model that used text structure evaluation to predict personality traits [18]. Sun et al. [19] combined the power of CNN and Bidirectional Long Short-Term Memory (Bi-LSTM) networks to enhance personality prediction via text data. They also presented the idea of the Latent Sentence Group (LSG) in their work, which combines textual structural content consisting of word- or sentence-level correlations to further optimize performance for personality trait labeling. Li et al. [20] designed a distinctive multi-task learning framework based on CNN, in which they presented the concept of an information-sharing gate named Softmax Gate (SoG) to capture emotions and personality traits, highlighting the crucial relationship between emotional behavior and personality traits. In their work, SoG manages the flow of information between two CNNs, avoids interference between multiple tasks, and enables effective feature sharing. They tested on three different datasets, attaining a 2.09% enhancement in accuracy on the personality prediction dataset and a 1.32% increase on the emotion detection dataset. Tandera et al. [21] proposed an integrated LSTM + CNN 1D architecture to predict personality from the textual data of the myPersonality dataset and achieved an average accuracy of 74.17% using deep learning techniques.
2.3 Transfer Learning and Pre-trained Models in NLP
Transfer learning is an indispensable building block in NLP research, with pre-trained language models like BERT [7], GPT [22], and XLNet [23] serving as strong baselines for improving downstream tasks. Training on large datasets enables these models to capture broad and complex linguistic patterns that may be tuned for particular purposes. Pre-trained models like BERT have excelled in predicting personality traits across different datasets. The outcomes of transfer learning for personality trait prediction are impressive. When applied to the Essays dataset, the Universal Language Model Fine-tuning (ULMFiT) [24] and Embeddings from Language Models (ELMo) [25] methods improved predictions for all five personality traits [11]. BERT has been fine-tuned using the Essays and myPersonality datasets and outperformed prior machine learning and deep learning approaches [11, 12].
According to previous studies on text classification [14], integrating GCNs with BERT embeddings improves accuracy in different transfer learning tasks. When employing BERT to produce superior embeddings and GCNs to examine their connections, researchers obtained better results than with either model alone [14, 27]. This hybrid methodology allows for more accurate personality trait prediction, especially in big and complex datasets like social media interactions or psychological surveys.
2.4 Hybrid Models: BERT and GCN for Personality Prediction
Graph neural networks (GNNs) have proven effective for determining connections and relationships in graph-structured data [28]. They have been applied effectively in NLP, recommendation systems, and traffic prediction [29]. GNNs are deployed for many tasks, including answering questions, classifying semantic roles, and extracting relationships. GNNs are excellent at capturing the connections between nodes in a network through message-passing techniques. GCNs are a subset of GNNs that excel at modeling interconnections between data points, so they are ideal for tasks such as text classification that rely on contextual and semantic relationships between words or sentences [9]. The rise of pre-trained Large Language Models (LLMs) like BERT has been a great milestone in Natural Language Processing (NLP), enabling models to extract deep semantic and syntactic features without extensive feature engineering. Contextualized BERT embeddings were utilized in these prediction studies, outperforming standard psycholinguistic methods [30]. Previous studies [11, 12, 31] have demonstrated higher accuracy in predicting these traits by fine-tuning BERT on the best available datasets, including Essays [17] and myPersonality [26, 32].
Some recent studies have examined the integration of GNNs with pre-trained language models. For example, Yao et al. [33] used GCNs, and Zhang et al. [34], Ma et al. [35], and Pan et al. [36] combined BERT and GNNs to represent the links between tokens in text. These models, however, are primarily concerned with document-level categorization, while the GNNs capture relations within a single document. Lin et al. [14] improved on this work by applying graphs to represent connections between texts, showing that GCNs can benefit from large-scale pre-training, especially when merged with robust language models such as BERT. They utilized a state-of-the-art model named BertGCN [27], which they tested on five different datasets, comparing their results with the best available baseline models.
Li et al. [37] proposed a novel Heterogeneous Graph Attention Joint (HGAJ) method to effectively capture the relationships between nodes in heterogeneous graphs. They used a graph attention neural network to assign attention weights to relational subgraphs based on the features extracted by different nodes and edges from the graph structure. They incorporated the Galactica large language model into their proposed model to learn the semantic features in subgraphs and improve the overall performance on three datasets. Wang et al. [38] introduced a model called the Semiotic Signal Integration Network (SSIN) to deal with the problem of integrating syntactic and semantic features in Aspect-Based Sentiment Analysis (ABSA) on the publicly available SemEval dataset. Their model contains two modules: a Syntax-based Isomorphic Convolutional Network (SynGIN) to learn the relations between words within linguistic trees, and a Semantic Attention Network (SemGSAT) to capture the semantic information between words and aspects using a multi-head attention mechanism with a graph to optimize performance.
BERT and GCN have been shown to outperform conventional text classification models in various NLP tasks [14]. BERT creates nuanced contextual embeddings that learn hidden semantic relationships in documents, while GCNs use graph structures to model complex data point relationships at the word, phrase, or document level. BERT embeddings are fed into the GCNs, allowing message transmission between graph nodes to capture long-range dependencies between words. Understanding local (word-level) and global (document-level) textual connections for optimum classification makes this approach ideal for personality trait prediction.
Recent studies show that these hybrid models can perform better. He et al. [27] proposed an approach to enhance the generalization of BERT in Natural Language Inference (NLI) tasks without compromising performance on benchmark datasets. They integrated BERT with a GCN, which enabled the model to learn more complex syntactic relationships between sentences. They also offered the concept of a co-attention mechanism to improve sentence-matching capabilities. This integrated strategy can find specific and overarching word correlations for personality trait prediction, and it excels at analyzing complex text data like essays and social media posts.
2.5 Gaps in Existing Research
BERT and GCNs provide valuable insights, but present research has significant limitations. Most of the literature focuses on exploring connections at either the word level or the document level, ignoring the power of hybrid models for tasks like personality trait prediction [11, 12]. BERT has been extensively used for text classification [14, 27], but its integration with GCNs for personality prediction is still in its infancy. Previous research lacks investigations using GCNs to analyze correlations across multiple datasets, such as essays or social media postings from different individuals. Most of the research explores connections inside a single document; however, examining relations across diverse documents may reveal how personality traits can be predicted. Pre-trained language models for personality prediction have progressed, but more research is needed to integrate them with GCNs. This integration helps models learn relational and linguistic features in textual data, improving the accuracy and depth of personality trait prediction.
3 Methodology
This section explains our research methods, including dataset selection, data preparation, and hybrid model implementation. The datasets serve as a basic building block for this research, affecting its reliability and accuracy. Initially, we describe the sources of the datasets, their sizes, and essential attributes. We next discuss data preprocessing approaches, including missing value handling, normalization, and feature selection, to ensure data quality for analysis. Finally, we examine model selection criteria and explain our hybrid model, which combines two different modeling approaches to improve personality trait prediction performance. Figure 1 and Algorithm 1 describe the overall workflow of our proposed methodology. This systematic approach optimizes prediction accuracy and provides adaptability and generalizability in real-world applications.
Fig. 1 Workflow of the proposed personality trait prediction methodology
Algorithm 1 Proposed methodology of TraitBertGCN
3.1 Datasets
This research exploits the potential of fusion techniques on two publicly available benchmark personality trait prediction datasets. Both datasets convey valuable insights into personality trait distribution throughout the text. First, James Pennebaker and colleagues created the Essays dataset for personality trait prediction based on textual data [17]. Volunteers wrote 2467 anonymous, authentic, stream-of-consciousness essays in a controlled environment for this dataset. Second, the myPersonality dataset comprises Facebook status updates of 250 anonymous users from the myPersonality project and contains 9917 records [26]. This application operated from 2007 to 2012, allowing users to share status postings for psychological research and accumulating a large amount of data that, due to confidentiality, could be accessed only upon request [32].
Both datasets are labeled with the author personality traits Extroversion (EXT), Neuroticism (NEU), Agreeableness (AGR), Conscientiousness (CON), and Openness (OPN). Table 1 presents the details of both datasets, with their traits given as binary labels (yes/no) following the Big Five personality model. The labels are aligned with a self-administered questionnaire, which may be referred to as self-assessment. The myPersonality dataset has shorter texts than the Essays dataset.
Figures 2 and 3 show the distribution of each of the five personality traits in the essays and myPersonality datasets, where the "yes" and "no" labels represent the presence ("1"/"0") of each trait. In Fig. 2, the dataset looks balanced, since the number of essays labeled "yes" and "no" for each trait differs only slightly. In Fig. 3, the two classes are imbalanced. Balance is important because it assures that the dataset is suitable for training different learning models, such as the APP model. A balanced dataset prevents bias, allowing the model to predict personality traits more robustly and generally. To handle the class imbalance problem, we applied the inverse frequency technique (instead of a more complex method like SMOTE), which pays more attention to the weights of minority classes during the fine-tuning process and enhances classification performance on imbalanced classes.
calculate the class weights for each personality trait described in Eq.1
Equation1 calculates the class weight for each trait based on the ratio where NTotal represents the total number
of samples in the dataset. Meanwhile, Psc shows a positive sample count for each class. This ensures that the model
assigns higher weights to minority classes to deal with the class imbalance problem during the fine-tuning process.
Figures4 and 5 examine the five personality traits of essays and the myPersonality dataset in a correlation
matrix. The matrix is symmetric, meaning that the value connecting trait 1 with trait 2 is the same as the value
(1)
Weight
POS =
N
Total
−P
SC
P
SC
.
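A minimal sketch of the inverse-frequency weighting of Eq. 1, assuming the trait labels are stored as 0/1 columns of a pandas DataFrame; the column names and toy data are illustrative.

```python
import pandas as pd

def inverse_frequency_weights(df: pd.DataFrame, trait_cols) -> dict:
    """Eq. 1: Weight_pos = (N_Total - P_SC) / P_SC for each trait column."""
    n_total = len(df)
    return {c: (n_total - df[c].sum()) / df[c].sum() for c in trait_cols}

# Toy example: four samples, one heavily imbalanced trait.
df = pd.DataFrame({"EXT": [1, 0, 1, 1], "NEU": [1, 0, 0, 0]})
print(inverse_frequency_weights(df, ["EXT", "NEU"]))  # NEU gets the larger weight
```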
Table 1 Dataset details

Dataset             Data resource     Number of traits   Number of entries   Word count
Essays [17]         Students          5 (Yes/No)         2467 essays         1,900,000
myPersonality [26]  Facebook status   5 (Yes/No)         9917 posts          143,600
Fig. 2 Distribution of the five personality traits in the essays dataset
Fig. 3 Distribution of the five personality traits in the myPersonality dataset
Figures 4 and 5 examine the five personality traits of the essays and myPersonality datasets in a correlation matrix. The matrix is symmetric, meaning that the value correlating trait 1 with trait 2 is the same as the value correlating trait 2 with trait 1. Since each trait is fully correlated with itself, every value on the main diagonal of the matrix is 1.0.
The correlation coefficients range from −1 to +1. A coefficient around +1 reveals a significant positive connection between two traits, indicating that knowing one trait's label might assist in predicting the other more easily. For example, a positive correlation coefficient between two traits means that an essay or text labeled true for one trait is likely true for the other. A coefficient close to −1 indicates a significant negative relationship, where knowing one trait is true makes the other more likely to be false.
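Such a matrix can be computed directly from the binary trait labels; a short illustration with pandas follows, where the labels are made up for demonstration.

```python
import pandas as pd

# Hypothetical binary trait labels (1 = trait present) for a handful of texts.
labels = pd.DataFrame({
    "EXT": [1, 0, 1, 1, 0],
    "NEU": [0, 1, 0, 0, 1],
    "AGR": [1, 0, 1, 0, 0],
})
corr = labels.corr()  # symmetric matrix with 1.0 on the main diagonal
print(corr.round(2))
```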
Finally, Tables 2 and 3 show the personality trait distributions in both datasets. Table 2 shows that the Essays dataset represents all personality traits evenly. Table 3 reveals class imbalance, most notably for Neuroticism (NEU) and Openness (OPN). Maintaining the original data integrity improves deep learning algorithms for personality trait detection, especially in binary classification problems. Both datasets have been concatenated to create multi-source input data for our model to improve the prediction of the Big Five personality traits.
3.2 Datasets Processing
This step refines and converts input texts to make them simpler and more understandable for the system to process. This procedure is standard in NLP tasks and includes a range of task-specific operations. This section discusses the preparation processes in this phase.
Fig. 4 Correlation matrix of the essays dataset over the five personality traits

In pre-processing, the emphasis was on "tokenization," or splitting text into tokens, which is usually based on words but not necessarily so. Tokens are the prominent semantic units for analysis. This task was performed with BertTokenizer. BERT has a maximum limit of 512 tokens, while the essays dataset contains roughly 650–660 words per document (divided into sub-documents) and a myPersonality status rarely exceeds 60 words. Hence, to avoid bias arising from different writing styles, contexts, and text lengths, we applied text truncation and padding techniques during the model's fine-tuning process. Eliminating undesired characters that
might inhibit analysis was necessary to improve the text. To clean the text datasets, we applied the stop_words library to remove stop words, because this reduces noise and improves performance. Normalization followed to standardize tokens. For textual feature normalization and representation, we applied pre-trained BERT embeddings to transform the text into a numerical representation. We utilized BERT because it learns longer contextual representations instead of just word frequencies, as in Term Frequency-Inverse Document Frequency (TF-IDF). This ensures that related concepts can be identified despite slight character sequence differences. Lowercasing and lemmatization simplified the data, enabling the model to concentrate on essential meanings rather than morphological distinctions.
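A minimal sketch of this tokenization-and-embedding step, assuming the Hugging Face `transformers` tokenizer and model; the batch size, maximum length, file name, and example texts are illustrative, not the exact training configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed_in_batches(texts, batch_size=4, max_length=128):
    """Return one [CLS] embedding per text, computed batch by batch."""
    chunks = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], truncation=True,
                        padding="max_length", max_length=max_length,
                        return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)
        chunks.append(out.last_hidden_state[:, 0])  # [CLS] token per document
    return torch.cat(chunks)

embeddings = embed_in_batches(["I like to help others.", "I enjoy parties."])
torch.save(embeddings, "doc_embeddings.pt")  # saved for graph construction
```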
Next, we used pandas and scikit-learn to prepare the datasets for further analysis. We split the text data into training, validation, and test sets and labeled each split after eliminating stop words. We then reset the dataset index and transformed the categorical columns ('y' and 'n') into numerical values (1.0 and 0.0). After these modifications, we fused both datasets; the resulting data frame and column labels are returned for analysis, modeling, and further operations.
Fig. 5 Correlation matrix of the myPersonality dataset over the five personality traits
Table 2 Distribution of the five personality traits in the essays dataset

Value   EXT    NEU    AGR    CON    OPN
Yes     1275   1234   1309   1254   1271
No      1192   1233   1158   1213   1196
Table 3 Distribution of the five personality traits in the myPersonality dataset

Value   EXT    NEU    AGR    CON    OPN
Yes     4210   3717   5268   4556   7370
No      5707   6200   4649   5361   2547
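A condensed sketch of this preparation and fusion step with pandas and scikit-learn; the file names and column names are assumptions for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

trait_cols = ["EXT", "NEU", "AGR", "CON", "OPN"]

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Reset the index and map the 'y'/'n' trait labels to 1.0/0.0."""
    df = df.reset_index(drop=True)
    df[trait_cols] = df[trait_cols].replace({"y": 1.0, "n": 0.0})
    return df

essays = prepare(pd.read_csv("essays.csv"))
mypers = prepare(pd.read_csv("mypersonality.csv"))
fused = pd.concat([essays, mypers], ignore_index=True)  # data-level fusion

# 80/20 split of the fused data for fine-tuning and validation.
train_df, val_df = train_test_split(fused, test_size=0.2, random_state=42)
```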
3.3 Proposed Model and Implementation
Rapid developments in NLP and graph-based learning demand careful model selection and implementation to improve personality trait prediction performance. The novel TraitBertGCN model combines the contextual depth of BERT-style embeddings with the structural knowledge of a GCN.
Compared to traditional deep learning algorithms, BERT can capture long-range, dynamic, and context-aware embeddings by analyzing the complete sentence, instead of generating the static word embeddings produced by earlier techniques like Word2Vec, TF-IDF, and GloVe. A better understanding of the surrounding words enables the model to learn the sense in which a particular word is used. In the APP context, capturing contextual embeddings from the given sentence helps the model understand language nuances, which is difficult for traditional methods because their static embeddings can miss the nuances of the language. For example, a sentence like "I like to help others" shows agreeableness, because no teamwork is involved, while "I like to help my team to achieve success" represents conscientiousness, because "help" is used in a different context.
Graph Convolutional Networks (GCNs) [9] are designed to analyze structured datasets like text and social networks. GCN graphs represent data points like words and documents as nodes. The edges connecting these nodes show relationships between them, including semantic similarity. GCNs enable the model to grasp local and global data point relationships by propagating information across the network using convolutional operations on this graph structure. By capturing the relationships between words and documents, GCNs can be valuable for accurately predicting personality traits, letting the model better understand the complex structure of the semantic data. GCNs might discover patterns in words that are used together in identical contexts and in documents with similar topics. GCNs are ideal for tasks that need structural relations, such as personality trait prediction, due to their strong performance in modeling data point connections.
Fig. 6 TraitBertGCN model architecture
Yao et al. [33] proposed the TextGCN model, in which node features are denoted by an identity matrix; in our work, we instead use BERT embeddings for document nodes and treat them as input, while the word nodes are initialized as zero matrices, as in Eq. 2. In Eq. 2, the word nodes start at zero because they carry no information; their features are learned during the fine-tuning process through interaction with document nodes. We did not initialize them with random or pre-trained word embeddings, because doing so might introduce bias and accelerate or restrict the learning process, whereas initialization from zero permits the model to learn better word representations from scratch. Based on our literature review, personality traits can be best predicted through the combined capabilities of BERT and GCNs. The TraitBertGCN structure utilizes detailed, contextualized word embeddings from BERT, as shown in Eq. 3, and a graphical representation of word–document relationships from the GCN. As shown in Fig. 6, the hybrid TraitBertGCN model is fine-tuned using graph structure information together with text semantics to obtain better predictive performance for the traits.
$$X = \begin{bmatrix} X_{\mathrm{doc}} \\ 0 \end{bmatrix} \in \mathbb{R}^{(n_{\mathrm{doc}} + n_{\mathrm{word}}) \times d} \tag{2}$$

In Eq. 2, $X_{\mathrm{doc}}$ represents the BERT embeddings, of size $n_{\mathrm{doc}} \times d$ (number of document nodes $n_{\mathrm{doc}}$ and embedding dimensionality $d$), while $0$ represents the zero matrix of size $n_{\mathrm{word}} \times d$ (number of word nodes $n_{\mathrm{word}}$ and embedding dimensionality $d$).

$$E_{\mathrm{Hidden}} = B(T) \tag{3}$$

In Eq. 3, $E_{\mathrm{Hidden}}$ is the BERT hidden representation and $B(T)$ denotes BERT applied to the input text $T$ of each document after tokenization into word pieces.

Fig. 7 TraitBertGCN (BERT-base) confusion matrix for each trait on the essays dataset
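A minimal sketch of the node feature construction in Eq. 2; the tensor sizes here are illustrative placeholders.

```python
import torch

def build_node_features(doc_embeddings: torch.Tensor, n_word: int) -> torch.Tensor:
    """Eq. 2: stack BERT document embeddings over a zero block for word nodes."""
    d = doc_embeddings.size(1)
    word_block = torch.zeros(n_word, d)  # word nodes start with no features
    return torch.cat([doc_embeddings, word_block], dim=0)  # (n_doc + n_word) x d

X = build_node_features(torch.randn(100, 768), n_word=5000)
print(X.shape)  # torch.Size([5100, 768])
```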
We performed all implementations in Google Colab with a Tesla T4 GPU (2560 CUDA cores, 16 GB GDDR6 RAM). First, BERT (BERT-base or RoBERTa-base) produces embeddings from the textual data, which are forwarded to the GCN. The GCN arranges words and documents as nodes and their cosine similarities as edges, as shown in Eq. 4:

$$S_{ij} = \frac{E_i \cdot E_j}{\|E_i\| \, \|E_j\|} \tag{4}$$

Then, the GCN performs convolutional operations to propagate information across the network and capture local and global word and document relationships, as defined in Eq. 5. This hybrid technique integrates pre-trained models with graph-based approaches to provide more accurate and contextually aware predictions. In our implementation, we experimented with different parameter settings and finally selected a maximum length of 128, a batch size of 4, 5 epochs, a learning rate of $10^{-5}$, and three GCN layers with the ReLU activation function for fine-tuning. To avoid overfitting, we saved the best weights at the third epoch and evaluated the model with these weights on the test data. BERT and GCN insights combine to predict personality traits as expressed in Eqs. 6 and 7, providing an in-depth analysis.
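A brief sketch of building the similarity graph of Eq. 4 with scikit-learn; the number of neighbors and the random stand-in embeddings are placeholders, not the exact configuration.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.neighbors import kneighbors_graph

emb = np.random.randn(100, 768)  # stand-in document embeddings

# Eq. 4 for every pair of documents.
S = cosine_similarity(emb)

# Keep only each document's k nearest neighbours as graph edges.
A = kneighbors_graph(emb, n_neighbors=5, metric="cosine", mode="connectivity")
A = A.maximum(A.T)  # symmetrize the adjacency matrix
```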
Fig. 8 TraitBertGCN (BERT-base) confusion matrix for each trait on myPersonality dataset
In Eq. 4, $S_{ij}$ is the cosine similarity of documents $i$ and $j$, which lies in the range −1 to 1, while $E_i$ and $E_j$ are the document embeddings.

$$E^{l+1} = \sigma\!\left( M_D^{-1/2} M_A M_D^{-1/2} E^{l} M_W^{l} \right) \tag{5}$$

In Eq. 5, $\sigma$ is the ReLU activation function, $M_D$ is the diagonal degree matrix, $M_A$ is the adjacency matrix, $E^{l}$ is the embedding at intermediate layer $l$, and $M_W^{l}$ is the weight matrix.

$$Y' = \sigma\!\left( E^{L} M_W^{l} + b \right) \tag{6}$$

In Eq. 6, $\sigma$ is the sigmoid activation, $E^{L}$ represents the last-layer (final) embeddings, $M_W^{l}$ is the weight matrix, and $b$ is the bias.

$$L_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log\left(y'_i\right) + \left(1 - y_i\right) \log\left(1 - y'_i\right) \right) \tag{7}$$

In Eq. 7, $L_{\mathrm{BCE}}$ is the binary cross-entropy loss, $N$ is the total number of classes, $y_i$ are the ground-truth values, and $y'_i$ are the predicted values.
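A compact PyTorch sketch of the three-layer GCN described by Eqs. 5–7; the hidden size is an illustrative choice, and the dense adjacency normalization is a simplification of what a full implementation would use.

```python
import torch
import torch.nn as nn

def normalize_adjacency(A: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization M_D^{-1/2} M_A M_D^{-1/2} from Eq. 5."""
    A = A + torch.eye(A.size(0))  # add self-loops
    d_inv_sqrt = A.sum(dim=1).pow(-0.5)
    return torch.diag(d_inv_sqrt) @ A @ torch.diag(d_inv_sqrt)

class ThreeLayerGCN(nn.Module):
    """Three graph convolutions: ReLU on the first two (Eq. 5), sigmoid output (Eq. 6)."""
    def __init__(self, d_in=768, d_hid=256, n_traits=5):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(d_in, d_hid),
                                     nn.Linear(d_hid, d_hid),
                                     nn.Linear(d_hid, n_traits)])

    def forward(self, X, A_hat):
        E = X
        for layer in self.layers[:-1]:
            E = torch.relu(layer(A_hat @ E))              # Eq. 5
        return torch.sigmoid(self.layers[-1](A_hat @ E))  # Eq. 6

X = torch.randn(6, 768)  # toy node features
A_hat = normalize_adjacency((torch.rand(6, 6) > 0.5).float())
probs = ThreeLayerGCN()(X, A_hat)
loss = nn.BCELoss()(probs, torch.randint(0, 2, (6, 5)).float())  # Eq. 7
```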
4 Results and Discussion
In this section, we present the evaluation metrics and the experimental results on the datasets defined in Sect. 3 for the prediction of personality traits.
Fig. 9 TraitBertGCN (RoBERTa-base) confusion matrix for each trait on essays dataset
4.1 Evaluation Metrics
Most classification models are assessed using precision, recall, F-measure, and accuracy. The "actual labels" or "gold standard," which indicate the true data classification, and the "system-predicted labels," which the model assigns, are key to these metrics. Figures 7 and 8 show the confusion matrices for TraitBertGCN (BERT-base), and Figs. 9 and 10 present the confusion matrices for TraitBertGCN (RoBERTa-base) on the two datasets. For each personality trait there are four outcomes based on the model prediction: True Positive (TP), when the actual and predicted labels are both true; True Negative (TN), the inverse of TP; False Positive (FP), when the actual label is false but the predicted label is true; and False Negative (FN), the inverse of FP. TP and TN are essential classification results, since they reflect valid predictions. The system's accuracy is the ratio of accurate predictions (TP + TN) to total predictions, as presented in Eq. 8:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$
Precision and recall are important measures of model performance and accuracy. Precision is the percentage of the system's positive predictions that are correct, as shown in Eq. 9. Recall is the percentage of gold-standard true positives the system detected correctly, as presented in Eq. 10. These measures are significant, but they might be misleading if viewed separately: a model may have great precision but poor recall, or vice versa. The weighted harmonic mean of the two, which balances them, is the F-measure in Eq. 11:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{10}$$

$$F\text{-measure} = \frac{2 \times P \times R}{P + R} \tag{11}$$
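These four metrics can be computed directly with scikit-learn, as in the following sketch; the labels are made up for a single trait.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical gold labels and predictions for one trait.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

print("Accuracy :", accuracy_score(y_true, y_pred))   # Eq. 8
print("Precision:", precision_score(y_true, y_pred))  # Eq. 9
print("Recall   :", recall_score(y_true, y_pred))     # Eq. 10
print("F-measure:", f1_score(y_true, y_pred))         # Eq. 11
```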
Fig. 10 TraitBertGCN (RoBERTa-base) confusion matrix for each trait on myPersonality dataset
Table 4 F-measure and accuracy of fine-tuned TraitBertGCN tested on the essays dataset
Bold values in the original represent the highest scores, achieved by our proposed model

F-measure
Model                                O      C      E      A      N      Avg
SEPRNN [39]                          67.84  63.46  71.50  71.92  62.36  67.416
personalityGCN [40]                  67     68     67     69     69     68
EnsembleModelingWithHAN [43]         57.37  59.74  65.80  61.62  60.69  61.04
LibSVM + SMO [45]                    61.9   56     55.6   55.7   58.3   57.7
SVM (SMO) [46]                       56     54     53     50     54     53.4
FiveFeatures + SMO [47]              66.1   63.3   63.4   61.5   63.7   63.6
SVM [48]                             60.57  56.46  56.28  53.9   58.15  57.072
CNN [49]                             70.35  68.92  68.05  66.29  69.53  68.63
RNN [49]                             69.23  66.95  70.22  67.37  64.4   67.64
LSTM [49]                            73.15  73.02  72.05  72.79  65.15  71.23
BiLSTM [49]                          73.64  75.68  77.72  71.78  68.34  73.43
TraitBertGCN (RoBERTa-base)          54.65  70.23  67.84  65.56  69.05  65.47
TraitBertGCN (BERT-base)             80.11  70.77  72.15  74.40  72.27  73.94

Accuracy
Model                                O      C      E      A      N      Avg
Fusion (ELMo + ULMFiT + BERT) [11]   65.6   59.52  61.15  60.8   62.2   61.85
BERT-base + MLP [12]                 64.6   59.2   60     58.8   60.5   60.62
SEPRNN [39]                          63.16  57.49  58.91  57.49  59.51  59.312
personalityGCN [40]                  64.80  59.10  60     57.70  63     60.92
RoBERTa [41]                         65.86  58.55  60.62  59.72  61.04  61.16
BB-SVM [42]                          62.09  57.84  59.30  56.52  59.39  59.03
EnsembleModelingWithHAN [43]         56.30  59.18  64.25  60.31  61.14  60.24
CNN + Mairesse [44]                  62.68  57.30  58.09  56.71  59.38  58.83
LibSVM + SMO [45]                    61.95  56.04  55.75  57.54  58.31  57.92
CNN [49]                             67.34  68.36  65.52  63.49  66.94  66.33
RNN [49]                             69.17  68.56  69.37  68.76  63.9   67.95
LSTM [49]                            69.78  68.97  69.57  69.98  65.72  68.44
BiLSTM [49]                          71.4   72.62  73.83  70.18  69.37  71.48
CNNessays [50]                       62     57     58     56     59     58.4
ULMFiT [51]                          63.30  57.97  58.85  58.25  59.88  59.85
TraitBertGCN (RoBERTa-base)          65.02  63.15  63.80  62.87  62.02  63.37
TraitBertGCN (BERT-base)             79.77  67.49  69.80  70.13  71.63  71.76
Table 5 F-measure of fine-tuned TraitBertGCN tested on the myPersonality dataset
Bold values in the original represent the highest scores, achieved by our proposed model

Model                          O      C      E      A      N      Avg
PersonalityGCN [40]            76.8   75     85     70     79     77.16
PMC + LIWC + unigram [52]      65     62     71     68     70     67.2
SVM [53]                       61     54     56     45     49     53
TraitBertGCN (RoBERTa-base)    89.25  77.26  71.71  75.00  74.50  77.54
TraitBertGCN (BERT-base)       93.86  79.17  79.77  82.68  78.49  82.79
Table 6 Accuracy of fine-tuned TraitBertGCN tested on the myPersonality dataset
Bold values in the original represent the highest scores, achieved by our proposed model

Model                                O      C      E      A      N      Avg
Fusion (ELMo + ULMFiT + BERT) [11]   81.00  65.31  80.55  63.69  79.00  73.91
PersonalityGCN [40]                  80     76     80     68     79     76.6
TraitBertGCN (RoBERTa-base)          84.35  79.29  77.46  76.05  81.48  79.73
TraitBertGCN (BERT-base)             90.76  80.69  83.48  82.35  83.65  84.19
Unfortunately, the F-measure neglects True Negatives, which may be crucial for many systems, especially those that must accurately detect negative classes. This limitation favors accuracy, which incorporates both positive and negative outcomes, over the F-measure in such classification systems.
4.2 Experimental Results
This research presents the TraitBertGCN model for an automated personality prediction system and tests its accuracy. It uses semantic and structural knowledge representations to improve personality trait prediction by merging the pre-trained language model BERT with graph convolutional networks.
After the fusion of BERT with GCN, as discussed in Sect. 3.3, we fine-tuned and tested our TraitBertGCN model on each dataset individually. Table 4 reports the results for all five personality traits in terms of different evaluation metrics on the essays dataset, while Tables 5 and 6 present the results for the myPersonality dataset. El-Demerdash et al. [11] utilized pre-trained large language models like BERT, ULMFiT, and ELMo to extract features from text data. They performed classifier-level and data-level fusion to obtain better results. Their model achieved an average accuracy of 61.85% on the essays dataset,
while they acquired an average accuracy of 73.91% on the myPersonality dataset. Mehta et al. [12] integrated BERT embeddings with traditional psycholinguistic features. After the integration, they extracted features from the data and fed them into a deep learning model, a Multi-Layer Perceptron (MLP). Their model attained an average accuracy of 60.62% on the essays dataset. Ramezani et al. [49] proposed a model in which they first preprocessed the data and then generated a matrix representation by embedding the knowledge graph in vector space using RDF2vec. Finally, they fed the graph structure matrix into four different classifiers: a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (Bi-LSTM). Bi-LSTM gained the best average accuracy of 71.48%. Further, Figs. 11 and 12 show the results on the essays dataset, and Figs. 13 and 14 present the results on the myPersonality dataset. All the models performed well, but the highest average F-measure and average accuracy were achieved by our proposed TraitBertGCN (BERT-base) model.

Fig. 11 Accuracy of TraitBertGCN on the essays dataset
Finally, Tables 7 and 8 present the results of our model when we performed fusion at both levels (classifier and data level). We fine-tuned our final TraitBertGCN model on the fused data and then tested it on each dataset individually. Using the data fusion technique, our proposed model combines the contextualized word embeddings of BERT with the GCN's graphical representation of word–document relationships. This improves feature extraction and helps the model understand complex relationships between traits. TraitBertGCN (BERT-base) achieved the best average accuracy of 77.42% on the essays dataset and 87.59% on the myPersonality dataset, as shown in Figs. 15 and 16. In light of these results, we can observe that our TraitBertGCN model with the data fusion technique surpasses the results of previous studies.
Fig. 12 F-measure of TraitBertGCN on essays dataset
Table 7 Fine-tuned TraitBertGCN on fused data, tested on the Essays dataset
Bold values in the original represent the highest scores, achieved by our proposed model

Model                          O      C      E      A      N      Avg
ELMo (E)                       63.18  57.72  59.23  58.61  61.5   60
ULMFiT (U)                     63.26  58.47  59.59  59.25  60.29  60.17
BERT (B)                       63.5   58.9   60.85  58.9   60.75  61.1
Fusion (E + U + B)             65.6   59.52  61.15  60.8   62.2   61.85
BERT + MLP                     64.6   59.2   60     58.8   60.5   60.6
TraitBertGCN (RoBERTa-base)    66.48  62.75  63.19  61.78  64.00  63.64
TraitBertGCN (BERT-base)       80.95  76.90  76.98  76.37  76.90  77.42
Table 8 Fine-tuned TraitBertGCN on fused data, tested on the myPersonality dataset
Bold values in the original represent the highest scores, achieved by our proposed model

Model                          O      C      E      A      N      Avg
ELMo (E)                       79.60  63.75  76.59  63.30  78.00  72.25
ULMFiT (U)                     78.65  64.75  77.31  62.80  76.45  72.00
BERT (B)                       80.40  62.25  79.95  61.50  78.35  72.50
Fusion (E + U + B)             81.00  65.31  80.55  63.69  79.00  73.91
LSTM + CNN                     79.31  59.62  78.95  56.52  79.49  74.17
TraitBertGCN (RoBERTa-base)    85.90  78.67  78.27  78.72  80.72  80.46
TraitBertGCN (BERT-base)       92.54  86.27  85.45  85.92  87.75  87.59
Fig. 15 Accuracy of TraitBertGCN fine-tuned on fused data and tested on Essays datasets
Fig. 16 Accuracy of TraitBertGCN fine-tuned on fused data and tested on the myPersonality dataset
5 Conclusion
The TraitBertGCN model represents a promising new approach for personality prediction from text. By combining the strengths of BERT and GCN with the data fusion technique, the model can capture both the semantic content of the text and the structural relationships between words and documents, leading to improved prediction accuracy. Our proposed model's results demonstrate that this hybrid approach outperforms existing models, with an average accuracy of 77.42% on the essays dataset and 87.59% on the myPersonality dataset, and offers a robust solution for automatic personality prediction in various contexts. Future work will explore the application of this model to personality models beyond the Big Five and its potential use in other NLP tasks.
Author Contributions All authors contributed equally.
Data Availability No datasets were generated or analysed during the current study.
Declarations
Conflict of Interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Mehta, Y., Majumder, N., Gelbukh, A., Cambria, E.: Recent trends in deep learning based personality detection. Artif. Intell. Rev. 53(4), 2313–2339 (2020)
2. McCrae, R.R., John, O.P.: An introduction to the five-factor model and its applications. J. Pers. 60(2), 175–215 (1992)
3. Digman, J.M.: Personality structure: emergence of the five-factor model. Annu. Rev. Psychol. 41(1), 417–440 (1990)
4. Costa, P.T., McCrae, R.R.: The Revised NEO Personality Inventory (NEO-PI-R). The SAGE Handbook of Personality Theory and Assessment 2(2), 179–198 (2008)
5. Christian, H., Suhartono, D., Chowanda, A., Zamli, K.Z.: Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging. J. Big Data 8(1), 68 (2021)
6. Koroteev, M.V.: BERT: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)
7. Devlin, J.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
8. Pennebaker, J.W.: Linguistic Inquiry and Word Count: LIWC 2001. Lawrence Erlbaum Associates (2001)
9. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
10. Zhang, S., Tong, H., Xu, J., Maciejewski, R.: Graph convolutional networks: a comprehensive review. Comput. Soc. Netw. 6(1), 1–23 (2019)
11. El-Demerdash, K., El-Khoribi, R.A., Shoman, M.A.I., Abdou, S.: Deep learning based fusion strategies for personality prediction. Egypt. Inform. J. 23(1), 47–53 (2022)
12. Mehta, Y., Fatehi, S., Kazameini, A., Stachl, C., Cambria, E., Eetemadi, S.: Bottom-up and top-down: predicting personality with psycholinguistic and language model features. In: 2020 IEEE International Conference on Data Mining (ICDM), pp. 1184–1189 (2020). IEEE
13. Plenz, M., Frank, A.: Graph language models. arXiv preprint arXiv:2401.07105 (2024)
14. Lin, Y., Meng, Y., Sun, X., Han, Q., Kuang, K., Li, J., Wu, F.: BertGCN: transductive text classification by combining GCN and BERT. arXiv preprint arXiv:2105.05727 (2021)
15. Al Marouf, A., Hasan, M.K., Mahmud, H.: Comparative analysis of feature selection algorithms for computational personality prediction from social media. IEEE Trans. Comput. Soc. Syst. 7(3), 587–599 (2020)
16. Chittaranjan, G., Blom, J., Gatica-Perez, D.: Who's who with Big-Five: analyzing and classifying personality traits with smartphones. In: 2011 15th Annual International Symposium on Wearable Computers, pp. 29–36 (2011). IEEE
17. Pennebaker, J.W., King, L.A.: Linguistic styles: language use as an individual difference. J. Pers. Soc. Psychol. 77(6), 1296 (1999)
18. Rahman, M.A., Al Faisal, A., Khanam, T., Amjad, M., Siddik, M.S.: Personality detection from text using convolutional neural network. In: 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1–6 (2019). IEEE
19. Sun, X., Liu, B., Cao, J., Luo, J., Shen, X.: Who am I? Personality detection based on deep learning for texts. In: 2018 IEEE International Conference on Communications (ICC), pp. 1–6 (2018). IEEE
20. Li, Y., Kazemeini, A., Mehta, Y., Cambria, E.: Multitask learning for emotion and personality traits detection. Neurocomputing 493, 340–350 (2022)
21. Tandera, T., Suhartono, D., Wongso, R., Prasetio, Y.L., et al.: Personality prediction system from Facebook users. Proc. Comput. Sci. 116, 604–611 (2017)
22. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pretraining (2018). https://www.mikecaptain.com/resources/pdf/GPT-1.pdf
23. Yang, Z.: XLNet: generalized autoregressive pre-training for language understanding. arXiv preprint arXiv:1906.08237 (2019)
24. Howard, J., Ruder, S.: Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146 (2018)
25. Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)
26. Kosinski, M., Matz, S.C., Gosling, S.D., Popov, V., Stillwell, D.: Facebook as a research tool for the social sciences: opportunities, challenges, ethical considerations, and practical guidelines. Am. Psychol. 70(6), 543 (2015)
27. He, Q., Wang, H., Zhang, Y.: Enhancing generalization in natural language inference by syntax. In: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 4973–4978 (2020)
28. Liu, B., Wu, L.: Graph neural networks in natural language processing. In: Graph Neural Networks: Foundations, Frontiers, and Applications, pp. 463–481 (2022). https://doi.org/10.1007/978-981-16-6054-2_21
29. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32(1), 4–24 (2020)
30. Akber, A., Ferdousi, T., Ahmed, R., Asfara, R., Rab, R., Zakia, U.: Personality and emotion—a comprehensive analysis using contextual text embeddings. Nat. Lang. Process. J. 9, 100105 (2024)
31. Salahat, M., Ali, L., Ghazal, T.M., Alzoubi, H.M.: Personality assessment based on natural stream of thoughts empowered with machine learning. Comput. Mater. Continua 76(1), 1–17 (2023)
32. Kosinski, M., Stillwell, D., Graepel, T.: Private traits and attributes are predictable from digital records of human behavior. Proc. Natl. Acad. Sci. 110(15), 5802–5805 (2013)
33. Yao, L., Mao, C., Luo, Y.: Graph convolutional networks for text classification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7370–7377 (2019)
34. Zhang, H., Zhang, J.: Text graph transformer for document classification. In: Conference on Empirical Methods in Natural Language Processing (EMNLP) (2020)
35. Ma, J., Liu, B., Li, K., Li, C., Zhang, F., Luo, X., Qiao, Y.: A review of graph neural networks and pre-trained language models for knowledge graph reasoning. Neurocomputing 609, 128490 (2024)
36. Pan, S., Zheng, Y., Liu, Y.: Integrating graphs with large language models: methods and prospects. IEEE Intell. Syst. 39(1), 64–68 (2024)
37. Li, B., Wang, H., Tan, X., Li, Q., Chen, J., Qiu, X.: Adaptive heterogeneous graph reasoning for relational understanding in interconnected systems. J. Supercomput. 81(1), 112 (2025)
38. Wang, H., Qiu, X., Tan, X.: Multivariate graph neural networks on enhancing syntactic and semantic for aspect-based sentiment analysis. Appl. Intell. 54(22), 11672–11689 (2024)
39. Xue, X., Feng, J., Sun, X.: Semantic-enhanced sequential modeling for personality trait recognition from texts. Appl. Intell. 51, 1–13 (2021)
40. Wang, Z., Wu, C.-H., Li, Q.-B., Yan, B., Zheng, K.-F.: Encoding text information with graph convolutional networks for personality recognition. Appl. Sci. 10(12), 4081 (2020)
41. Jiang, H., Zhang, X., Choi, J.D.: Automatic text-based personality recognition on monologues and multiparty dialogues using attentive networks and contextual embeddings (student abstract). In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 13821–13822 (2020)
42. Kazameini, A., Fatehi, S., Mehta, Y., Eetemadi, S., Cambria, E.: Personality trait detection using bagged SVM over BERT word embedding ensembles. arXiv preprint arXiv:2010.01309 (2020)
43. Ramezani, M., Feizi-Derakhshi, M.-R., Balafar, M.-A., Asgari-Chenaghlu, M., Feizi-Derakhshi, A.-R., Nikzad-Khasmakhi, N., Ranjbar-Khadivi, M., Jahanbakhsh-Nagadeh, Z., Zafarani-Moattar, E., Akan, T.: Automatic personality prediction: an enhanced method using ensemble modeling. Neural Comput. Appl. 34(21), 18369–18389 (2022)
44. Majumder, N., Poria, S., Gelbukh, A., Cambria, E.: Deep learning-based document modeling for personality detection from text. IEEE Intell. Syst. 32(2), 74–79 (2017)
45. Tighe, E.P., Ureta, J.C., Pollo, B.A.L., Cheng, C.K., Dios Bulos, R.: Personality trait classification of essays with the application of feature reduction. In: SAAIP@IJCAI, pp. 22–28 (2016)
46. Verhoeven, B., Daelemans, W., De Smedt, T.: Ensemble methods for personality recognition. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 7, pp. 35–38 (2013)
47. Poria, S., Gelbukh, A., Agarwal, B., Cambria, E., Howard, N.: Common sense knowledge based personality recognition from text. In: Advances in Soft Computing and Its Applications: 12th Mexican International Conference on Artificial Intelligence, MICAI 2013, Mexico City, Mexico, November 24–30, 2013, Proceedings, Part II 12, pp. 484–496 (2013). Springer
48. Mohammad, S., Kiritchenko, S.: Using nuances of emotion to identify personality. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 7, pp. 27–30 (2013)
49. Ramezani, M., Feizi-Derakhshi, M.-R., Balafar, M.-A.: Knowledge graph-enabled text-based automatic personality prediction. Comput. Intell. Neurosci. 2022(1), 3732351 (2022)
50. Yuan, C., Wu, J., Li, H., Wang, L.: Personality recognition based on user generated content. In: 2018 15th International Conference on Service Systems and Service Management (ICSSSM), pp. 1–6 (2018). IEEE
51. El-Demerdash, K., El-Khoribi, R.A., Shoman, M.A.I., Abdou, S.: Psychological human traits detection based on universal language modeling. Egypt. Inform. J. 22(3), 239–244 (2021)
52. Zheng, H., Wu, C.: Predicting personality using Facebook status based on semi-supervised learning. In: Proceedings of the 2019 11th International Conference on Machine Learning and Computing, pp. 59–64 (2019)
53. Farnadi, G., Zoghbi, S., Moens, M.-F., De Cock, M.: Recognising personality traits using Facebook status updates. In: Proceedings of the International AAAI Conference on Web and Social Media, vol. 7, pp. 14–18 (2013)
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional
affiliations.
Authors and Affiliations
Muhammad Waqas1 · Fengli Zhang1,2 · Asif Ali Laghari3 · Ahmad Almadhor4 · Filip Petrinec5 · Asif Iqbal1 · Mian Muhammad Yasir Khalil1
* Fengli Zhang
fzhang@uestc.edu.cn
Muhammad Waqas
m.waqas@std.uestc.edu.cn
Asif Ali Laghari
asiflaghari@synu.edu.cn
Ahmad Almadhor
aaalmadhor@ju.edu.sa
Filip Petrinec
petrinec3@uniba.sk
Asif Iqbal
asif.iqbal@std.uestc.edu.cn
Mian Muhammad Yasir Khalil
myasirkhalil@yahoo.com
1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
2 Network and Data Security Key Laboratory of Sichuan Province, Chengdu, China
3 Software College, Shenyang Normal University, Shenyang, China
4 Department of Computer Engineering and Networks, College of Computer and Information Sciences, Jouf University, 72388 Sakaka, Saudi Arabia
5 Department of Information Management and Business Systems, Faculty of Management, Comenius University Bratislava, Odbojárov 10, 820 05 Bratislava 25, Slovakia