Citation: Essameldin, R.; Ismail, A.A.; Darwish, S.M. Quantifying Opinion Strength: A Neutrosophic Inference System for Smart Sentiment Analysis of Social Media Network. Appl. Sci. 2022, 12, 7697. https://doi.org/10.3390/app12157697
Academic Editors: Adegboyega Ojo
and Rizun Nina
Received: 30 May 2022
Accepted: 29 July 2022
Published: 30 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
applied
sciences
Article
Quantifying Opinion Strength: A Neutrosophic Inference
System for Smart Sentiment Analysis of Social Media Network
Reem Essameldin 1, Ahmed A. Ismail 2 and Saad M. Darwish 1,*
1 Department of Information Technology, Institute of Graduate Studies and Research, Alexandria University, 163 Horreya Avenue, El-Shatby, Alexandria 21526, Egypt; igsr.reemessameldin@alexu.edu.eg
2 The Higher Institute of Computers and Information Systems, Abo Qir, Alexandria 21913, Egypt; gisapp13@gmail.com
* Correspondence: saad.darwish@alexu.edu.eg
Abstract: The speed at which opinions move on social media makes them an undeniable force in the field of opinion mining (OM). This may cause the OM challenge to become more social than technical. This is when the process can determinately represent everyone to the degree they are worth. Nevertheless, considering perspectivism can result in opinion dynamicity. Accounting for opinion dynamicity and uncertainty can enable smart OM on social media. This study proposes a neutrosophic-based OM approach for Twitter that handles perspectivism, its consequences, and indeterminacy. For perspectivism, a social network analysis (SNA) was conducted using popular SNA tools (e.g., Graphistry). An influence weighting of users was performed using an artificial neural network (ANN) based on the SNA output and people’s reactions to the analyzed texts. Neutrosophic logic (NL) is adopted to integrate users’ influence with their OM scores in order to deal with both opinion dynamicity and indeterminacy. Thus, it provides new uncertainty-aware OM scores that can reflect everyone. The OM scores needed for integration were generated using TextBlob. The results show the ability of NL to improve the OM process and accurately consider the innumerable degrees of polarity. This will eventually aid in a better understanding of people’s opinions, helping OM in social media to become a real pillar of many applications, especially business marketing.
Keywords: social network analysis; artificial neural networks; neutrosophic logic; opinion mining
1. Introduction
Nowadays, social media is firmly established as the main source of digital marketing and market research information (e.g., news and reviews) [1,2]. With billions of active users on different online social networks (OSNs), people have started to eliminate global barriers by sharing their thoughts, opinions, and feelings toward everything [3,4]. The core strength of these OSNs is their influence; companies can communicate with their customers with positive content to boost motivation towards their products or services. Interestingly, people rather than companies have proved to be the leading engine of influence on OSNs; they can drive public orientation towards anything in a matter of seconds [5]. In either case, opinion mining (OM) or sentiment analysis (SA) is essential to analyze opinions, so that companies can obtain insight into how their marketing strategies performed and how people on OSNs talk about them, and then take corrective actions to improve their business [6].
Opinions are expressed by humans (i.e., opinion holders) about an entity or features of an entity at a given time [6]; business owners, who aim at improving their market image, pay more attention to searching for opinion holders with a good impact on OSNs to spread opinions of positive orientation about their business entities and their features [6,7]. Social network analysis (SNA) has been applied to OSNs to highlight influencers for businesses to spread positive opinions of their business [7]. Using graph and network theory, SNA is one way to study the importance of users in their social networks. Centrality is one primary SNA measure for detecting influential nodes (i.e., users) in a given graph network [4]. However, this sort of action can increase the diffusion of positive opinions with no effect on the main process; OM deals with opinion texts only [1].
The subjective nature of opinions makes them subject to many types of uncertainty when being automatically analyzed. For example, opinions have gradient polarity; two positive texts are not positive to the same degree. Moreover, undecided polarity can happen when text polarity is at an equal distance between two polarity classes. At that instant, the machine makes the approximate or dominant choice (e.g., neutral class). Likewise, many opinions classified as neutral are in fact undecided (e.g., having equal degrees of opposite polarity), but such a class is not considered in OM classification [8]. These innumerable degrees reflect reality and are found in social media texts, so they need to be considered for more accurate and realistic OM results. Moreover, when individuals rate texts, different ratings may be generated for the same text depending on the observers’ perspectives. Opinion dynamics on OSNs is one possible result of perspectivism and is thus correlated with SNA in determining users’ importance and trust degree [7,9,10].
All the above-mentioned cases can be easily detected and solved by humans, but are hardly processed by machines [4]. Leaving aside its unfamiliar implementation in OM, the nature of neutrosophic logic (NL) is closely aligned with how humans think, where an indefinite environment is a human’s ordinary space to draw a conclusion or make a judgment due to incomplete knowledge. Furthermore, NL is malleable in terms of accepting different values from observers. Thus, NL can lend some mental ability to the OM process so that it becomes smart. In NL, each object in a certain universe has a degree of truth ($T$), a degree of indeterminacy ($I$), and a degree of falsity ($F$), where $T$, $I$, $F$ are standard or non-standard real subsets of $]^{-}0, 1^{+}[$. Such a classification might help in determining the percentage of indeterminacy in a text’s polarity [9].
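To make the (T, I, F) representation concrete, the following minimal sketch stores one neutrosophic value as three independent degrees. It assumes the common single-valued simplification in which each component is a real number in the standard interval [0, 1] rather than a subset of the non-standard interval; the class name `NeutrosophicValue` and the example numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicValue:
    """Single-valued neutrosophic triple (T, I, F), each component in [0, 1].

    Assumption: this is the single-valued simplification, not the general
    non-standard-interval definition.
    """
    truth: float
    indeterminacy: float
    falsity: float

    def __post_init__(self):
        # Validate every component independently of the others.
        for name in ("truth", "indeterminacy", "falsity"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


# Hypothetical reading of a tweet: mostly positive, but fairly ambiguous.
tweet_polarity = NeutrosophicValue(truth=0.7, indeterminacy=0.4, falsity=0.2)

# The three degrees are independent, so their sum need not equal 1.
component_sum = (tweet_polarity.truth
                 + tweet_polarity.indeterminacy
                 + tweet_polarity.falsity)
```

Unlike a probability distribution, the three components are not forced to sum to one, which is what lets the indeterminacy degree be reported separately from the positive and negative evidence.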
The main objectives of this study are to reduce social bias through algorithmically
certain and more accurate OM classification results befitting trust. The proposed model can
improve the process of OM in OSNs by highlighting the most irritating technical problems
that are caused by the social tendency of the application domain.
The remainder of this paper is organized as follows. In Section 2, social media-based research is classified and briefly discussed in a literature-based approach. Section 3 highlights the research problem, and Section 4 defines how the proposed model is intended to solve it. The required validations and their resulting analysis are presented and elucidated in Section 5. Conclusions are drawn in Section 6.
2. Related Work
Research based on analyzing social media data is gaining massive attention [4,6]. In the literature, this type of research serves many applications (e.g., marketing, health care, etc.), and it is mostly classified into structural-based and content-based analyses [4]. The structural-based analysis deals with the main social network structure and studies the interactions of nodes (i.e., users) and the network topology. The content-based analysis deals with content created by nodes and shared on social media [4]. Figure 1 shows the classification of social media data-based research.
In marketing applications, business owners and researchers are concerned with structural-based analyses that include SNA to find influencers [10]. In line with that, in 2017, most research found OSNs’ influencers based on graph theory by measuring network topology (e.g., degree centrality, betweenness, etc.) [4,11]. Jianqiang et al. [12] proposed measuring influence by considering not only the structural-based analysis of social media data (i.e., graph theory), but also the influence of users’ tweets through retweets, replies, and favorite counts. They believed that the content of tweets itself can influence people as much as their authors. They tested their proposed model, and it was found to outperform the compared methods. In 2020, concerning community detection under the same analysis type, Oueslati et al. [7] highlighted the problem of detecting opinion leaders on dynamic social graphs. They attempted to consider the dynamic nature of OSNs while detecting opinion leaders on social media. They proposed a new model that collected Facebook posts and then filtered them based on the included opinion words and the text type (i.e., status). They considered posts important when they had more opinion words and social reactions. Accordingly, influencers were the ones whose posts had the highest combined score. Experiments on real data collected from Facebook were performed. The model was validated against some previous work, and the results showed good performance when considering the dynamic nature of the network.
Figure 1. Social media data-based research classification in the literature.
It was found that the isolated application of one analysis, whether structural or content-based, can cause information incompleteness and the loss of essential patterns and knowledge [4]. This led a few researchers to combine both analyses for different application purposes. In 2021, Jin et al. [13] attempted to combine the content-based analysis (i.e., SA) with the structural-based analysis (i.e., SNA) by designing a new sentiment link analysis using a graph network to predict users’ attitudes towards an entity. They considered the problem of determining users’ attitudes towards entities without the normal analysis of text sentiment. They tried to solve the problem of the undecided polarity of tweets, caused by short text or unclear wording, to retrieve users’ real attitudes. They also tried to predict the hidden attitude when users did not share their opinions on social media. They modeled the information network to retrieve unknown sentiment links based on users’ social relationships, user attributes, and movie attributes. Experiments were conducted using movies reviewed from Weibo gaming, demonstrating the effectiveness of this method over some state-of-the-art methods.
In the same year, Chauhan et al. [14] highlighted the importance of predicting election results based on three approaches by surveying 38 papers on election prediction. The three approaches were: the volumetric approach (counts of posts, likes, etc.); the content-based analysis in the form of the SA approach (polarity of posts); and the structural-based analysis in the form of SNA (measuring the centrality of supporters). They found that most research depended on SA, either alone or combined with a volumetric approach. Only one paper combined the three approaches. They concluded that more effort is needed in this field to produce a more valid and acceptable model for election result prediction (e.g., to detect spam accounts, sarcasm, etc.).
Regarding the implementation of the NL technique in the literature, it was found to be applied in many fields (e.g., image processing and disease diagnosis), with no real application to SA until recently [8]. Some researchers have investigated the significance of NL by conducting theoretical comparisons with other existing logics (e.g., intuitionistic logic and fuzzy logic) [15–17].
Due to the low implementation rate of NL, there is a lack of dedicated software tools for NL. Consequently, a few researchers have started to suggest implementation methods for applying NL in practice [9]. In 2012, Ansaria et al. [9] mentioned the problem of not having any available software for NL. They presented fully detailed steps for generating an NL classifier based on the fuzzy toolbox of MATLAB. It followed the idea of constructing three fuzzy inference systems (FIS), one for each component of NL. They also illustrated the ambiguity cases in detail for the Iris dataset when applying fuzzy and NL systems, showing how well NL can deal with ambiguity cases. In the same direction, in 2016, Basha et al. [18] suggested the utilization of a knowledge-based rule-based system due to the close resemblance between human thinking and rule-based approaches. They proposed a neutrosophic rule-based system to handle uncertainty, in particular, ambiguity due to the overlap areas in classification. The proposed model designates rules and non-overlapping membership functions for each NL output component. Rules were trained and tested on three datasets (e.g., the Iris dataset). A comparison with a fuzzy system was conducted to show the ability of the NL system to handle ambiguity with better accuracy.
In 2018, Bhutani et al. [19] highlighted the importance of rule-based classification using fuzzy logic for its ability to handle interpretation and its weakness in handling uncertainty. The authors in [19] followed the same concept and steps as in [18]. They constructed an NL classifier along the same lines as their fuzzy classifier. The difference was in designing the membership functions. In NL, the output must be represented as a triple of components. For each component, they designed non-overlapping membership functions for both input and output, together with the component rules. The constructed NL classification model was applied to an appendicitis dataset and then compared with the ordinary fuzzy classifier as a performance test. The comparison demonstrated the superiority of the NL classifier in handling ambiguity in the data classification.
In dealing with uncertainty and OM, in 2017, Smarandache et al. [20] attempted to express the importance of NL in dealing with uncertainties and how it can handle idea dynamicity. They applied NL in an election process where blind and null votes might represent a high percentage and needed to be processed. They proposed a refining process based on minimizing the rate of indeterminacy and increasing the truth and falsity rates in return, thus minimizing uncertainties. Using NL, they succeeded in finding and effectively refining these types of votes. In 2019, Smarandache et al. [21] addressed the problem of word similarity. They attempted to design a new word similarity measure based on sentiment results obtained from SentiWordNet 3.0. This considered the challenge of multi-meaning words by measuring the semantic distance from a seed word using a neutrosophic approach. They proposed an NL method for classifying the sentiment of words by measuring distances to seed words representing each standard polarity class. They applied Hamming distance, Euclidean distance, and Intuitionistic Euclidean distance to evaluate the accuracy of the method, and promising results were obtained.
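As a rough illustration of distance-based sentiment classification against seed words, the sketch below computes normalized Hamming and Euclidean distances between (T, I, F) triples and picks the nearest seed. The word triple, the seed values, and the normalization by 3 are illustrative assumptions; the exact measures and the SentiWordNet-derived scores used in [21] are not reproduced here.

```python
import math


def hamming_distance(a, b):
    """Normalized Hamming distance between two (T, I, F) triples."""
    return sum(abs(x - y) for x, y in zip(a, b)) / 3


def euclidean_distance(a, b):
    """Normalized Euclidean distance between two (T, I, F) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)


# Hypothetical seed triples standing in for each polarity class.
SEEDS = {
    "positive": (1.0, 0.0, 0.0),
    "negative": (0.0, 0.0, 1.0),
}


def nearest_seed(word_triple, distance=hamming_distance):
    """Return the label of the seed closest to the given word triple."""
    return min(SEEDS, key=lambda label: distance(word_triple, SEEDS[label]))


# Hypothetical word: mostly positive with some indeterminacy.
word = (0.6, 0.3, 0.1)
label = nearest_seed(word)
```

Swapping `distance=euclidean_distance` into `nearest_seed` gives the Euclidean variant without changing the rest of the sketch.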
In 2020, Kandasamya et al. [8] suggested a refinement process for SA polarity results from TextBlob to handle the problem of indeterminacy in polarity. They proposed a multi-refined neutrosophic set where the resulting polarity scores are refined into seven polarity classes (i.e., strong positive, positive, indeterminate positive, indeterminate, indeterminate negative, negative, and strong negative). Another two methods were applied, where polarity scores were classified into five and three classes, respectively. A comparative study was conducted between their model and the two other methods. Tests were performed on 10 different datasets to see how well the three methods worked at evaluating them. The results showed that the proposed refined method was the best.
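The idea of refining a single polarity score into seven classes can be sketched as a simple threshold mapping. The sketch assumes a polarity score in [-1, 1], which is the range TextBlob reports; the cut-off values below are hypothetical and chosen only for illustration, since the thresholds used in [8] are not given in this section.

```python
def refine_polarity(score: float) -> str:
    """Map a polarity score in [-1, 1] to one of seven refined classes.

    The thresholds are illustrative assumptions, not the values from [8].
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"polarity score must lie in [-1, 1], got {score}")
    if score >= 0.6:
        return "strong positive"
    if score >= 0.3:
        return "positive"
    if score > 0.05:
        return "indeterminate positive"
    if score >= -0.05:
        return "indeterminate"
    if score > -0.3:
        return "indeterminate negative"
    if score > -0.6:
        return "negative"
    return "strong negative"


# Hypothetical scores, one per class, from most positive to most negative.
example_scores = [0.8, 0.4, 0.1, 0.0, -0.1, -0.4, -0.9]
example_classes = [refine_polarity(s) for s in example_scores]
```

Coarser five-class or three-class schemes, like the two baselines mentioned above, would follow the same pattern with fewer thresholds.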
In conjunction with the continuous production of social media-based research, this work combines both structure- and content-based methods by utilizing a neutrosophic-based OM approach that can provide each author with a polarity score for his text that depends on his influence level. A new element representing the influence level is added to the basic five opinion elements. This model applies a novel classification of the opinion holders into four types based on their centrality measures and the reactions to their texts using an artificial neural network (ANN). SNA was performed using several software tools that are available online to provide the required information about authors’ topologies in their OSNs. People’s opinions and text polarity scores are combined for the first time in this model. It also takes into account text indeterminacy and dynamicity with NL.
3. Problem Definition
On OSNs, each of the basic opinion elements ($t_l$, $h_k$, $e_i$, $a_{ij}$) spearheads the orientation of the opinions ($oo_{ijkl}$) to which they belong. At a given time ($t_l$), billions of opinion holders ($h_k$) share their opinions at tremendous speed about any entity ($e_i$) and/or its features ($a_{ij}$); OM must automatically deal with the frequent increase in opinions and the change of their orientation ($oo_{ijkl}$) over time [22]. One reason for the orientation change is the influence of $h_k$; people on OSNs can establish credibility in areas of wide audience interest due to their personality, knowledge, or simply for sharing relatable content. They can have different observations of the same entity, leaving the audience to decide whom they trust more and agree with. Nevertheless, OM treats opinions as pure text with the same importance, whether they are expressed by a professional, expert, influencer, spammer, or ordinary person. Contrary to the norm, opinion holders are treated as equals in the eyes of OM, which threatens the credibility of opinions and the validity of results. Regarding the content being processed, OM can process features of an entity when more detailed opinions are important, and only the entity when detailed opinions are less important. However, writing opinions on OSNs in a spoken language increases the risk of content uncertainty as well as the difficulty of determining its main scope.
These issues cast doubt on the ability of OM on OSNs to reflect reality, as it provides low-quality information-based results for OM beneficiaries and decision makers. Accordingly, improving the process of OM on OSNs has become a necessity. To achieve this, the OM process should take into consideration the influence of users and their perspective power as well as the uncertainty of texts and their impact on result efficiency. A summary of the problems to be solved in this study is as follows:
- The compulsory dependency on low-quality information for the OM process and its reflection on providing invalid results for decision making;
- The scarcity of parallel processing of perspectivism (i.e., users’ credibility) with the main OM process;
- The need for effective ways to determine and numerically represent users’ credibility;
- The inefficient consideration of opinion dynamics in generating a sensitive consensus opinion;
- The continuous need for decreasing ambiguity in opinion texts;
- The absence of a methodical integration between polarity scores and text perspectivism in the process of OM to provide a polarity score that reflects real intentions.
4. Proposed Model
The above-mentioned problems call into question the ability of OM, in its current state, to be applied to OSNs. The main shortcoming is the insistence on using the same opinion elements when OM is applied to OSNs and their environmental uncertainty. The proposed model aims to improve the effectiveness of applying OM to Twitter. To solve the mentioned problems, we should readapt the opinion elements to accommodate the properties of the application domain. Accordingly, we need to introduce new elements to qualify the OM process on OSNs that are specifically related to opinion holders’ hallmarks and social reactions to their actions. Other than that, we must solve common uncertainty problems to improve the opinion classification process. The proposed model addresses all the previously mentioned problems and makes the following contributions:
- The proposed model suggests adding a new element (assume: $w_{kl}$) to the five elements of opinion, describing the importance of opinion holders and the reactions to their opinions. This element takes into consideration the impact of the opinion holder and of the reactions to his opinion on the assigned opinion orientation. Importance weighting should be applied to opinion holders so that they are given weights based on their authority over the audience. As such, perspectivism can be practically represented in the main OM process;
- A users’ weighting method is proposed in this model, which depends on applying SNA and ANN. A comparative analysis of three famous SNA tools is conducted to adopt the most applicable one. ANN is applied to rank users based on their centrality measures produced by the SNA tool as well as the reaction-based features of users’ texts (i.e., likes, shares, etc.). Weights are assigned based on users’ ranks; the top-ranked user receives the highest weight and vice versa. ANN was chosen for being efficient in dealing with the complicated behavior of humans that cannot be mathematically represented;
- NL is innovatively adopted to integrate the new element with the other traditional elements of opinion. Moreover, NL adoption hybridizes the process of OM, providing more accurate polarity scores by combining a lexicon with machine learning (ML) (i.e., TextBlob and NL). Finally, NL has been shown to deal effectively with uncertainty, especially opinion ambiguity and dynamicity.
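One way to picture the extended opinion representation is as a record holding the five traditional elements plus the proposed weight. The sketch below is only an assumed data layout: the class name `WeightedOpinion`, the field names, the example holders, and the weight scale are hypothetical and are not specified by the model description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class WeightedOpinion:
    """The five opinion elements plus the proposed weight element."""
    entity: str         # e_i: the entity the opinion is about
    aspect: str         # a_ij: the feature of the entity being discussed
    orientation: float  # oo_ijkl: polarity of the opinion (assumed in [-1, 1])
    holder: str         # h_k: the opinion holder
    time: datetime      # t_l: when the opinion was expressed
    weight: float       # w_kl: proposed influence weight (scale assumed)


# Hypothetical opinions about the same aspect from two different holders.
opinions = [
    WeightedOpinion("World Cup 2018", "VAR", 0.5, "user_a",
                    datetime(2018, 6, 20), weight=0.2),
    WeightedOpinion("World Cup 2018", "VAR", -0.3, "user_b",
                    datetime(2018, 6, 21), weight=0.9),
]

# With the extra element, opinions can be compared by holder influence
# instead of being treated as equally important text.
most_influential = max(opinions, key=lambda opinion: opinion.weight)
```

Keeping the weight next to the orientation, rather than in a separate user table, mirrors the idea that the same holder may carry different weights for different opinions.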
Figure 2 presents the proposed model and its components. The phases of the proposed model are described as follows:
Figure 2. The proposed NL-based OM model.
4.1. Data Collection Phase
Any model requires data to work on and to test its validity. For OM, this phase is essential to identify most of the opinion elements (i.e., $h_k$, $e_i$, $a_{ij}$, and $t_l$). In this proposed model, Twitter is the chosen OSN, and thus, the Twitter API was used to collect the necessary data. In addition to the traditional data required in any OM process, the proposed model requires data about opinion holders and their tweets’ reaction-based features. The previously collected data in [10] were used for this work due to the lack of available data that include both aspects of opinion holders and their tweets’ features along with the main opinion texts. Accordingly, for this model, the data required to represent opinion elements were as follows:
- Opinion text in the English language, per user;
- $e_i$, $a_{ij}$: the opinion entity is the World Football Cup 2018 and all its possible aspects (e.g., video assistant referee (VAR));
- $h_k$: indicates the username of the opinion holder;
- $t_l$: opinions were collected during the period from 14 June 2018 till 15 July 2018; and
- The contribution support data, which include the follower/following list of $h_k$ and the like, retweet, and reply counts for their collected opinion tweets.
4.2. Users Weighting Process
This contribution phase is divided into two main steps: SNA then ANN ranking and
weighting processes. Data collected from the first phase about the newly added elements is
mainly required to execute this phase. Follower/following lists and tweets’ features are
individually used as follows:
(1) Social Network Analysis (SNA)
This step works with follower/following lists of opinion holders. Using SNA, users in a given social network can be assigned centrality measures representing their importance in this network. Higher centrality measures indicate more central, influential, and powerful users [4,23]. The centrality measures considered in this work are degree, betweenness, and closeness centrality. They measure the popularity of nodes through counting connections, how users control information diffusion in the network, and how quickly information spreads from users, respectively [23]. By applying available SNA tools to the collected follower/following lists, a graph of users can be constructed. In such a graph, a user is represented as a node, symbolized as “$u$”, and his relations with other users are edges, each symbolized as “$e$”, with directions, forming a so-called directed graph “$G$”. Moreover, centrality measures of nodes can be obtained. The directed graph and centrality measures are defined as [24]:

$$G = (V, E) \tag{1}$$

where $G$ is the graphical representation of the social network, as shown in Figure 3. $V$ is the set of nodes representing users in the OSN: $V = \{u_1, u_2, u_3, \ldots, u_n\}$, and $E$ is the set of edges representing directed relations between the nodes in the OSN: $E = \{e_1, e_2, e_3, \ldots, e_m\}$. If two users (e.g., $u_1$ and $u_2$) follow each other, then a directed edge between the two users is constructed: $e_{u_1 u_2} \in E$.

$$C_{d^-}(u) = \left|\overrightarrow{N_i}\right| \tag{2}$$

where $C_{d^-}(u)$ is the degree centrality (out-degree) of node $u$, representing the number of outgoing edges from the node, and $\overrightarrow{N_i} = \{j \in V : (i, j) \in E\}$ is the set of nodes that node $i$ is connected to [12].
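The graph construction and the out-degree measure can be sketched in a few lines of plain Python. The sketch assumes that a directed edge (i, j) means "user i follows user j"; the edge direction, the function names, and the toy follower data are assumptions made for illustration, and the paper itself relies on an SNA tool rather than hand-written code.

```python
def build_directed_graph(following):
    """Build an adjacency mapping {user: set of users that user points to}.

    `following` maps each user to the users they follow (assumed direction).
    """
    graph = {}
    for user, followed_users in following.items():
        graph.setdefault(user, set())
        for other in followed_users:
            graph[user].add(other)
            # Make sure users that only appear as targets are also nodes.
            graph.setdefault(other, set())
    return graph


def out_degree(graph, node):
    """Out-degree centrality: number of outgoing edges of `node`."""
    return len(graph[node])


# Hypothetical following lists for three users.
following = {
    "u1": ["u2", "u3"],
    "u2": ["u1"],
}
graph = build_directed_graph(following)
degrees = {node: out_degree(graph, node) for node in graph}
```

Here `u3` appears only as a followed account, so it becomes a node with no outgoing edges.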

$$Bet(u) = \sum_{r \neq u \neq w \in V} \frac{|\{g_{rw}(u)\}|}{|\{g_{rw}\}|} \tag{3}$$

where $Bet(u)$ is the betweenness centrality of node $u$, $|\{g_{rw}\}|$ denotes the number of shortest paths from node $r$ to node $w$, and $|\{g_{rw}(u)\}|$ denotes the number of shortest paths from node $r$ to node $w$ through node $u$.
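For small graphs, the betweenness definition can be evaluated directly by counting shortest paths with breadth-first search. The sketch below is an unoptimized illustration, not the algorithm used by any particular SNA tool; it assumes an unweighted directed graph stored as an adjacency mapping in which every node appears as a key, treats r, u, and w as pairwise distinct, and skips pairs with no connecting path.

```python
from collections import deque


def shortest_path_counts(graph, source):
    """Return (distance, number of shortest paths) from `source` via BFS."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                count[neighbor] = 0
                queue.append(neighbor)
            # Accumulate path counts along edges that lie on shortest paths.
            if dist[neighbor] == dist[node] + 1:
                count[neighbor] += count[node]
    return dist, count


def betweenness(graph, u):
    """Sum over ordered pairs (r, w) of the share of shortest paths via u."""
    info = {node: shortest_path_counts(graph, node) for node in graph}
    dist_u, count_u = info[u]
    total = 0.0
    for r in graph:
        if r == u:
            continue
        dist_r, count_r = info[r]
        if u not in dist_r:
            continue  # u is unreachable from r
        for w in graph:
            if w in (u, r) or w not in dist_r or w not in dist_u:
                continue
            # u lies on a shortest r-w path only if the distances add up.
            if dist_r[u] + dist_u[w] == dist_r[w]:
                total += count_r[u] * count_u[w] / count_r[w]
    return total


# Hypothetical diamond-shaped network: a -> b -> d and a -> c -> d.
diamond = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
bet_b = betweenness(diamond, "b")
```

In the diamond example, half of the shortest paths from `a` to `d` pass through `b`, so `b` acts as a partial bridge between the two.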

$$Col(u) = \frac{1}{N+1} \sum_{w \in V,\, w \neq u} \frac{1}{g_{u,w}} \tag{4}$$

where $Col(u)$ is the closeness centrality of node $u$, $N$ is the number of nodes in $V$, and $g_{u,w}$ is the length of the shortest path between node $u$ and each other node $w$ in the network.
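The closeness measure can likewise be sketched with breadth-first search. The sketch follows the formula as printed in this section, including the division by N + 1, and assumes an unweighted directed graph in which unreachable nodes simply contribute nothing to the sum; both the handling of unreachable nodes and the toy graph are assumptions made for illustration.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in hops) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(graph, u):
    """Sum of inverse shortest-path lengths from u, divided by N + 1."""
    distances = bfs_distances(graph, u)
    n = len(graph)
    # Unreachable nodes are absent from `distances` and add nothing (assumed).
    total = sum(1.0 / d for node, d in distances.items() if node != u)
    return total / (n + 1)


# Hypothetical chain of three users: a -> b -> c.
chain = {"a": {"b"}, "b": {"c"}, "c": set()}
closeness_a = closeness(chain, "a")
```

Because inverse distances are summed, a user who reaches many others in few hops scores higher, while a user with no outgoing paths scores zero.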
Figure 3. The constructed directed graph using different SNA tools.
A comparative analysis of three well-known SNA software tools (i.e., Graphistry, Cytoscape, and the University of California at Irvine network (UCINET)) was conducted to choose the most suitable tool for this work. In our view, and based on the evaluation process conducted in [4], Graphistry is the most suitable SNA software tool to adopt: compared with the other tools, it was found to be the best in terms of visualization, scalability, and ease of use. Figure 3 shows the different representations of our directed graph using the three mentioned SNA software tools. As illustrated in Figure 3, bridges and popular users can be easily identified in Graphistry and Cytoscape, while nothing can be identified using UCINET. In Graphistry, circle size reflects degree centrality: the larger the degree centrality of a node, the larger its circle, and vice versa. Moreover, Graphistry made it easier to load the data and build the directed graph.
(2) Artificial Neural Network Ranking Process
In this step, the collected data about tweets’ reaction features (e.g., likes and counts) is combined with the centrality measures of their opinion holders obtained from the SNA process to rank opinion holders based on their tweets’ influence as well as their topological influence in the social network. Opinion holders obtain scores from 1 to 100. When their centrality measures are high and/or their tweets obtain high reactions, opinion holders obtain a high score and are highly ranked. An ANN is used for this step; Figure 4 shows the steps of the ANN scoring process. An ANN was adopted because this behavioral data is too complex to be represented mathematically. Figure 5 shows, in practice, the correlation between a sample of action-based records (X-axis) and the resultant users’ scores (Y-axis) when plotted. It can be noticed that, for example, authors of tweets with low action-based counts can score highly and vice versa. Thus, there is no definite rule for scoring users based on their tweets’ action-based counts.
Figure 4. The training and scoring processes of the constructed ANN.
Figure 5. Correlation between users’ scores and their tweets reactions.
A trial-and-error approach was followed and resulted in a feed-forward ANN with 10 neurons in its hidden layer. In the training process, samples of 100 to 10,000 records per input were applied to improve training performance and decrease scoring errors.
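The shape of such a network can be sketched in a few lines of Python. This is only a structural illustration: the five input features, the random untrained weights, the tanh and sigmoid activations, and the rescaling to the 1 to 100 range are all assumptions, since the text reports only the hidden-layer size and the score range.

```python
import math
import random

HIDDEN_NEURONS = 10  # hidden-layer size reported in the text

def init_network(n_inputs, n_hidden=HIDDEN_NEURONS, seed=0):
    """Create random (untrained) weights for a one-hidden-layer feed-forward net."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)
    return w1, b1, w2, b2

def score_user(features, network):
    """Forward pass: inputs -> tanh hidden layer -> sigmoid output mapped to [1, 100]."""
    w1, b1, w2, b2 = network
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    sigmoid = 1.0 / (1.0 + math.exp(-z))
    return 1.0 + 99.0 * sigmoid

# Hypothetical normalized features: three centrality measures and two reaction counts.
network = init_network(n_inputs=5)
score = score_user([0.4, 0.1, 0.3, 0.8, 0.2], network)
print(round(score, 2))
```

With trained weights in place of the random ones, the same forward pass would produce the user score that is later normalized into an importance weight.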
Figure 6 illustrates the impact of applying 100 and 10,000 training samples on the ANN process error, where the error seemed to decrease as the number of training samples increased. Another observation is the dependency on the out-degree measure rather than on the other centrality measures. The output of the constructed ANN is a user score out of 100; a user with a score of 100 is a highly ranked user, while one with a score of 1 is a lowly ranked one. These output scores are normalized to represent users’ importance weighting; high-scoring users are highly ranked with a higher importance weight than others in the same social network. Users’ importance weighting results in classifying opinion holders into four main classes or types: Micro, Macro, Mega, and A-Listers Influencers [25–28]. Figure 7 summarizes how opinion holders are classified in this proposed work and their influence levels.
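The normalization and classification step can be sketched as follows. Dividing the score by 100 is an assumed normalization, and the three cut-offs are hypothetical: the text leaves the class boundaries to Figure 7, so the values below were chosen only so that they agree with the sample users listed in Table 5.

```python
def importance_weight(score, max_score=100.0):
    """Assumed normalization: map an ANN score in [1, 100] to a weight in (0, 1]."""
    return score / max_score

# Hypothetical upper bounds per class (illustrative, not the paper's boundaries).
CUTOFFS = [(0.30, "Micro"), (0.60, "Macro"), (0.75, "Mega")]

def influencer_type(weight):
    """Return the first class whose upper bound exceeds the weight, else A-Lister."""
    for upper, label in CUTOFFS:
        if weight < upper:
            return label
    return "A-Lister"

for w in (0.10, 0.44, 0.68, 0.92):
    print(w, influencer_type(w))
```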
Figure 6. (a) A training performance of 9.6461 with 100 Samples, (b) a training performance of 4.1030
with 10,000 Samples.
Figure 7. Opinion holders’ classification based on their influence level.
- Micro influencers: This category has a low influence level and may include personal users, secondary actors, retweeters, or silent followers. These users exhibit poor behavior in their social networks. Neither active nor sparing, they use their accounts when they like and for their own purposes (e.g., entertainment, news, and learning). Thus, Micro influencers can possess low reactions to their writings and low centrality measures, as shown in Figure 7;
- Macro influencers: Their influence level lies between low and moderate. This category includes actors with important content, builders, or even trolls. They work on building and growing their relationships and increasing their social network engagement by creating interesting content or even inflammatory conversation. Thus, they can possess high reactions to their writings, as shown in Figure 7. They can become Mega influencers with time, based on their ability to gain audience trust;
- Mega influencers: This category’s influence level lies between moderate and high. It may include business users, brokers, and newscasters. They are followed by a large sector because they are a source of information. Thus, they possess high centrality measures, as shown in Figure 7. They record an active presence on social media, providing service, marketing, advertising, etc.;
- A-list influencers or potential influencers: They have the highest influence level. They are extremely popular and have between thousands and millions of followers. They include celebrities, the most recognizable people on earth, who tend to act, sing, play football, etc., or professionals who have grown a strong brand for themselves by sharing useful information about topics of professional interest, fostering interaction, and being followed by many. Thus, they possess both high reactions and high centrality measures, as shown in Figure 7.
4.3. Opinion Mining Process
This phase deals with the opinion texts of the data collected in the first phase. It includes two main steps: text preprocessing and opinion classification. A lexicon technique (i.e., TextBlob) is applied to the collected tweets. TextBlob is a natural language processing (NLP) Python library that returns two values for SA: polarity and subjectivity. Polarity has a standard range of [−1.0, 1.0], where negative values refer to negative statements and positive ones to positive statements; subjectivity ranges over [0.0, 1.0]. In our work, TextBlob was utilized for the two mentioned steps of this phase. A lexicon classifier, TextBlob in particular, was chosen owing to the intention to perform a hybrid classification of tweets by combining a lexicon (in this case, TextBlob) with ML, which is NL in the final phase. TextBlob was also chosen for its proven ability to deal with informal texts from preprocessing to polarity classification [8]. The preprocessing step was performed using TextBlob and includes:
- The removal of the uniform resource locators (URLs), @username, stop words, etc.;
- The substitution of slang, emoticons, etc.
Example of one tweet before preprocessing: RT @FIFAWorldCup: #FRA #FRA #FRA “This is amazing, it’s pinnacle: France are on top of the world!” @FIFIAWorldCupFRA heard from #WorldCu. After preprocessing, this tweet becomes: amazing pinnacle France top world heard. Afterward, polarity classification of the preprocessed tweet is performed using TextBlob. The polarity classification of the above-mentioned example is sentiment (polarity = 0.55, subjectivity = 0.7).
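The preprocessing of the example tweet can be approximated with the standard library alone. The sketch below is not the TextBlob pipeline used in this work: the regular expressions, the removal of hashtags, and the deliberately tiny stop-word list are assumptions chosen to reproduce the example above.

```python
import re

# Hypothetical, deliberately small stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"rt", "this", "is", "it", "s", "are", "on", "of", "the", "from"}

def preprocess(tweet):
    """Strip URLs, @usernames, #hashtags, punctuation, and stop words from a tweet."""
    text = re.sub(r"https?://\S+", " ", tweet)   # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)         # drop @usernames and #hashtags
    tokens = re.findall(r"[A-Za-z]+", text)      # keep alphabetic tokens only
    kept = [t for t in tokens if t.lower() not in STOP_WORDS]
    return " ".join(kept)

tweet = ('RT @FIFAWorldCup: #FRA #FRA #FRA "This is amazing, it\'s pinnacle: '
         'France are on top of the world!" @FIFIAWorldCupFRA heard from #WorldCu')
cleaned = preprocess(tweet)
print(cleaned)
```

The cleaned string matches the preprocessed tweet quoted above and is what would then be passed to the polarity classifier.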
4.4. Neutrosophic-Based OM Classification
This is the final and most important phase of the proposed model, for two reasons. First, it integrates the new element of opinion into the traditional opinion elements. Second, NL can deal with the different classification uncertainties that exist in the case of OSNs:
- Opinion dynamicity: Where opinions about the same entity can vary from one person to the other. NL can deal with opinion dynamicity; its properties accept this difference and can handle it to achieve a consensus opinion about the entity [9,21]. To achieve consensus in neutrosophy, opinions with different observations ($n$ polarity scores) should have only one score of triple representation $(t, i, f)$. In this case, we implement a weighted average formula for each $t$, $i$, $f$ component inspired by the work performed in [21] that is defined as:
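$$t_s = \frac{t_{s_1} + \frac{1}{2} t_{s_2} + \frac{1}{3} t_{s_3} + \ldots + \frac{1}{n} t_{s_n}}{1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}} \tag{5}$$
$$i_s = \frac{i_{s_1} + \frac{1}{2} i_{s_2} + \frac{1}{3} i_{s_3} + \ldots + \frac{1}{n} i_{s_n}}{1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}} \tag{6}$$
$$f_s = \frac{f_{s_1} + \frac{1}{2} f_{s_2} + \frac{1}{3} f_{s_3} + \ldots + \frac{1}{n} f_{s_n}}{1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}} \tag{7}$$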
where $t_s$, $i_s$, and $f_s$ are the overall true, indeterminate, and falsity scores of the opinion text $s$, respectively. $s_1$ and $s_n$ represent the polarity scores assigned to the same opinion sentence $s$ by the observers of both the highest and the lowest influence level $n$, respectively. For example, $t_{s_1}$ is the true component of the opinion sentence $s$ assigned by the first highest influence observer.
- Opinion indeterminacy: This is when humans cannot be certain; it is the case of being neither true nor false. In the field of OM, this can appear when an opinion is classified as neutral simply because its real intended polarity cannot be determined, or because its positive and negative words have a zero resultant sum. This type of polarity classification was not considered before NL, in which each opinion can have a degree of truth, indeterminacy, and falsity. NL can deal with the failure to determine the polarity of a text by assigning it a high degree of indeterminacy, which is more accurate than considering it neutral or assigning it any other wrong polarity class that badly affects the accuracy of the results.
- Classification ambiguity: It is a part of indeterminacy where the classified output lies in the common area between two classes. NL can effectively deal with this case by setting a confidence value for the truth component ($t$). Figure 8 shows an example of a graphically represented neutrosophic set and how a confidence value can be set on it. Using this value, one can determine the significance of the $i$, $f$ components for a given opinion’s score. If $t$ is greater than the confidence value (i.e., 0.5 based on [9]), then the corresponding $i$, $f$ components can be considered insignificant [9], as stated in Equation (8). All the above-mentioned purposes can be achieved using the three steps described below: neutrosophication, rule evaluation, and deneutrosophication.
$$i/f = \begin{cases} \text{insignificant}, & t \geq 0.5 \\ \text{significant}, & t < 0.5 \end{cases} \tag{8}$$
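The consensus formulas of Equations (5) to (7) and the significance rule of Equation (8) can be sketched together. The two observations used below are hypothetical values picked for easy arithmetic, and the function names are illustrative.

```python
def consensus(observations):
    """Weighted average of (t, i, f) triples, ordered from highest to lowest influence.

    The k-th observation (k = 1..n) receives weight 1/k, as in Equations (5)-(7).
    """
    weights = [1.0 / k for k in range(1, len(observations) + 1)]
    norm = sum(weights)
    return tuple(
        sum(w * obs[c] for w, obs in zip(weights, observations)) / norm
        for c in range(3)
    )

def i_f_significance(t, confidence=0.5):
    """Equation (8): i and f are insignificant once t reaches the confidence value."""
    return "insignificant" if t >= confidence else "significant"

# Hypothetical observations of one text by two observers of decreasing influence.
t_s, i_s, f_s = consensus([(0.9, 0.0, 0.0), (0.6, 0.3, 0.0)])
print(round(t_s, 3), round(i_s, 3), round(f_s, 3), i_f_significance(t_s))
```

Because the weights fall off as 1/k, the highest-influence observer dominates the consensus: here the truth score is (0.9 + 0.3) / 1.5 = 0.8.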
Figure 8. A confidence value set on a neutrosophic set.
(1) Neutrosophication
In this step, the crisp inputs are converted to neutrosophic-based inputs using three membership functions per input, representing truth, indeterminacy, and falsity membership [20]:
- Input 1: Represents the user’s influence (UI), the newly added opinion element. This input ranges from 0 to 1 with three linguistic variables, low, moderate, and high influence (LI, MI, and HI), used to build the truth, indeterminacy, and falsity membership functions. The membership functions are trapezoidal, inspired by [10].
- Input 2: Represents the opinion orientation, the traditional opinion element, namely the polarity score (PS). This input’s range is [−1, 1] with seven linguistic variables: strong negative (SN), negative (NEG), weak negative (WN), neutral (N), weak positive (WP), positive (P), and strong positive (SP), with three triangular-shaped membership functions, also inspired by [10].
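Trapezoidal and triangular membership functions of the kind described for the two inputs can be sketched as below. Only the truth membership of UI is shown, and every breakpoint is hypothetical, since the text does not list them; the indeterminacy and falsity functions would be defined analogously with their own breakpoints.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], linear on (a, b) and (c, d), 0 elsewhere."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

def triangle(x, a, b, c):
    """Triangular membership, written as a trapezoid with a single peak at b."""
    return trapezoid(x, a, b, b, c)

# Hypothetical breakpoints for the truth membership of user's influence (UI).
UI_TRUTH = {
    "LI": (0.0, 0.0, 0.2, 0.4),
    "MI": (0.2, 0.4, 0.6, 0.8),
    "HI": (0.6, 0.8, 1.0, 1.0),
}

def neutrosophicate_ui(ui):
    """Truth degree of each UI linguistic variable for a crisp influence weight."""
    return {label: trapezoid(ui, *params) for label, params in UI_TRUTH.items()}

print(neutrosophicate_ui(0.3))
print(triangle(0.125, 0.0, 0.25, 0.5))  # hypothetical 'weak positive' triangle for PS
```

A weight of 0.3 falls in the overlap between low and moderate influence, so it belongs partly to both sets and not at all to the high-influence set.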
(2) Inference Engine/Rule Evaluation
In this step, the neutrosophic inputs are converted into neutrosophic outputs using IF-THEN rules. The rules were designed to cover all the truth, indeterminacy, and falsity cases for the inputs and their corresponding outputs. The above-mentioned inputs are the antecedents, joined by the minimum operator “AND”. The designed rules are based on the rules in [9,10] and total 45 rules. A sample of the designed NL rules for the truth component ($t$) is listed in Table 1. MI-t, for example, means true moderate influence.
(3) Deneutrosophication
In this final NL step, the output of the inference engine (i.e., the neutrosophic outputs) is converted into a crisp output using the corresponding three membership functions and a suitable modulation technique. In this work, the center of gravity (COG) was chosen to determine the defuzzified output ($z^*$) from the accumulated output of the rules ($\mu_c(z)$), which is defined as:
$$z^* = \frac{\int \mu_c(z) \cdot z \, dz}{\int \mu_c(z) \, dz} \tag{9}$$
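Equation (9) can be approximated numerically. The sketch below uses a midpoint Riemann sum over the output range; the step count, the example membership function, and the choice to return `None` when no rule fires are assumptions of this illustration.

```python
def center_of_gravity(mu, z_min, z_max, steps=1000):
    """Approximate z* = (integral of mu(z) * z) / (integral of mu(z)) by a midpoint sum."""
    dz = (z_max - z_min) / steps
    numerator = denominator = 0.0
    for k in range(steps):
        z = z_min + (k + 0.5) * dz   # midpoint of the k-th slice
        m = mu(z)
        numerator += m * z
        denominator += m
    if denominator == 0.0:
        return None                  # assumption: no rule fired, output undefined
    return numerator / denominator

# Hypothetical accumulated output: a triangle peaking at 0.5 on [0.25, 0.75].
def example_mu(z):
    return max(0.0, 1.0 - abs(z - 0.5) / 0.25)

z_star = center_of_gravity(example_mu, -1.0, 1.0)
print(round(z_star, 4))
```

For a symmetric output set the centroid coincides with the peak, so the crisp output here is approximately 0.5.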
Table 1. Sample of the designed NL rules.
Rule # | Rule
1 | IF UI is ‘LI-t’ and PS is ‘SN-t’, THEN PS is ‘NEG-t’
2 | IF UI is ‘LI-t’ and PS is ‘SP-t’, THEN PS is ‘P-t’
3 | IF UI is ‘LI-t’ and PS is ‘NEG-t’, THEN PS is ‘WN-t’
4 | IF UI is ‘LI-t’ and PS is ‘P-t’, THEN PS is ‘WP-t’
5 | IF UI is ‘LI-t’ and PS is ‘WN-t’, THEN PS is ‘N-t’
6 | IF UI is ‘LI-t’ and PS is ‘WP-t’, THEN PS is ‘N-t’
7 | IF UI is ‘LI-t’ and PS is ‘N-t’, THEN PS is ‘N-t’
8 | IF UI is ‘MI-t’ and PS is ‘N-t’, THEN PS is ‘N-t’
9 | IF UI is ‘HI-t’ and PS is ‘N-t’, THEN PS is ‘N-t’
10 | IF UI is ‘MI-t’ and PS is ‘SN-t’, THEN PS is ‘SN-t’
11 | IF UI is ‘MI-t’ and PS is ‘NEG-t’, THEN PS is ‘NEG-t’
12 | IF UI is ‘MI-t’ and PS is ‘WN-t’, THEN PS is ‘WN-t’
13 | IF UI is ‘MI-t’ and PS is ‘WP-t’, THEN PS is ‘WP-t’
14 | IF UI is ‘MI-t’ and PS is ‘P-t’, THEN PS is ‘P-t’
15 | IF UI is ‘MI-t’ and PS is ‘SP-t’, THEN PS is ‘SP-t’
16 | IF UI is ‘HI-t’ and PS is ‘SN-t’, THEN PS is ‘SN-t’
17 | IF UI is ‘HI-t’ and PS is ‘SP-t’, THEN PS is ‘SP-t’
18 | IF UI is ‘HI-t’ and PS is ‘NEG-t’, THEN PS is ‘SN-t’
19 | IF UI is ‘HI-t’ and PS is ‘P-t’, THEN PS is ‘SP-t’
20 | IF UI is ‘HI-t’ and PS is ‘WN-t’, THEN PS is ‘NEG-t’
21 | IF UI is ‘HI-t’ and PS is ‘WP-t’, THEN PS is ‘P-t’
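The truth-component rules of Table 1 can be encoded as a lookup table and evaluated with the minimum operator for AND. Combining rules that share an output label with the maximum is an assumption borrowed from common fuzzy practice, and the membership degrees in the usage example are hypothetical.

```python
# Table 1, truth component: (UI label, PS label) -> output PS label.
RULES_T = {
    ("LI", "SN"): "NEG", ("LI", "SP"): "P",   ("LI", "NEG"): "WN",
    ("LI", "P"): "WP",   ("LI", "WN"): "N",   ("LI", "WP"): "N",
    ("LI", "N"): "N",    ("MI", "N"): "N",    ("HI", "N"): "N",
    ("MI", "SN"): "SN",  ("MI", "NEG"): "NEG", ("MI", "WN"): "WN",
    ("MI", "WP"): "WP",  ("MI", "P"): "P",    ("MI", "SP"): "SP",
    ("HI", "SN"): "SN",  ("HI", "SP"): "SP",  ("HI", "NEG"): "SN",
    ("HI", "P"): "SP",   ("HI", "WN"): "NEG", ("HI", "WP"): "P",
}

def fire_rules(ui_truth, ps_truth):
    """Firing strength per output label: AND is min; same-label rules are merged with max."""
    out = {}
    for (ui_label, ps_label), out_label in RULES_T.items():
        strength = min(ui_truth.get(ui_label, 0.0), ps_truth.get(ps_label, 0.0))
        out[out_label] = max(out.get(out_label, 0.0), strength)
    return out

# Hypothetical truth degrees for one tweet and its author.
fired = fire_rules({"LI": 0.5, "MI": 0.5}, {"SP": 0.8, "P": 0.2})
print({label: s for label, s in fired.items() if s > 0})
```

The table shows the intended effect of influence: a low-influence author pulls a strong-positive text down to positive, a moderate one leaves it unchanged, and a high-influence author pushes a positive text up to strong positive.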
4.5. Desired Opinion Polarity Class
In this step, a final polarity class is determined, inspired by the polarity classes applied in [10]. This work applied seven polarity classes: strong positive, positive, weak positive, neutral or indeterminate, weak negative, negative, and strong negative. Table 2 shows the seven polarity classes and their corresponding ranges. We added a new polarity class alongside neutral, namely “indeterminate”, for a text that is neither neutral nor any of the existing classes. This is a bonus of using NL: we can say that the class of a certain text is undecided. The obtained final class of polarity and the indeterminate class are determined, following Equation (8), as shown below:
$$Polarity = \begin{cases} \text{truth component}, & t \geq 0.5 \\ \text{indeterminate}, & t < 0.5 \end{cases} \tag{10}$$
Table 2. The seven polarity levels [10].
Score | Polarity
n > 0.75 | Strong Positive
0.25 < n ≤ 0.75 | Positive
0 < n ≤ 0.25 | Weak Positive
0/NULL | Neutral/undecided
−0.25 ≤ n < 0 | Weak Negative
−0.75 ≤ n < −0.25 | Negative
n < −0.75 | Strong Negative
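The ranges of Table 2 translate directly into a small mapping function. Representing the NULL case with Python's `None` is an assumption of this sketch.

```python
def polarity_class(n):
    """Map a polarity score n in [-1, 1] (or None) to its Table 2 class."""
    if n is None or n == 0:
        return "Neutral/undecided"
    if n > 0.75:
        return "Strong Positive"
    if n > 0.25:
        return "Positive"
    if n > 0:
        return "Weak Positive"
    if n >= -0.25:
        return "Weak Negative"
    if n >= -0.75:
        return "Negative"
    return "Strong Negative"

for score in (0.79, 0.26, 0.125, None, -0.25, -0.8):
    print(score, polarity_class(score))
```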
5. Experimental Results
To effectively validate the proposed model, a desktop program was developed using Python, for implementing TextBlob, and the matrix laboratory (MATLAB) libraries, for implementing the ANN and NL. Toolboxes for ANN and fuzzy logic were used to build the proposed ANN and NL classifiers. The fuzzy toolbox was used because no toolbox for NL was available at the time of this work and because implementing NL with the fuzzy toolbox has been shown to be possible, as mentioned in [9]. The dataset already collected and filtered in [10] was used, owing to the absence of a benchmark dataset that combines the opinion text, its writer’s following/followers list, and its tweet reaction counts. Table 3 documents the polarity classification of the 1080 collected tweets using TextBlob.
Table 3. Polarity distributions of the dataset using TextBlob.
Overall items: 1080
Polarity class | Count (share) | Sub-class counts
Positive | 559 (51.76%) | SP 150, P 232, WP 177
Neutral | 456 (42.22%) | -
Negative | 65 (6.02%) | SN 1, NEG 32, WN 32
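The shares in Table 3 follow from the sub-class counts, which can be checked with a few lines of Python. The dictionary layout and the label `N` for the neutral count are conventions of this sketch; the counts themselves are those reported in the table.

```python
# Sub-class counts as reported in Table 3.
counts = {
    "Positive": {"SP": 150, "P": 232, "WP": 177},
    "Neutral": {"N": 456},
    "Negative": {"SN": 1, "NEG": 32, "WN": 32},
}

class_totals = {cls: sum(sub.values()) for cls, sub in counts.items()}
total = sum(class_totals.values())
# Percentage share of each polarity class, rounded to two decimals as in the table.
shares = {cls: round(100 * n / total, 2) for cls, n in class_totals.items()}
print(total, class_totals, shares)
```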
Regarding the contribution of adding a new element to the basic opinion elements, the constructed ANN performed efficiently in classifying the opinion holders of the dataset into the four mentioned types. Table 4 reports the resultant ANN classification errors for each of the opinion holder types. The constructed ANN evidently trained more on the low-level influencers than on the high-level influencers. This can be due to the abundance of low influencers in training and in real life compared with high influencers. Figure 9 emphasizes this observation and shows the results of the ANN classification of the opinion holders in the used dataset. The lower influencers have the highest percentage, which cautions decision makers against relying on traditional OM results when dealing with OSNs without classifying their opinion holders, as there would be spammers and people of low impact whose opinions are not that worthwhile.
Table 4. Average errors in the ANN classification.
Overall items: 1080
Average error in opinion holders’ ANN classification, by influence level:
Low Influence | Moderate | High Influence
0.099 | 0.239 | 0.251
Average error by influencer type:
Micro | Macro | Mega | A-Listers
0.089 | 0.234 | 0.248 | 0.208
Figure 9. Opinion holders’ classification in the dataset.
According to our proposed model, opinion holders on OSNs may share opinions of the same polarity as others, but their obtained OM polarity scores should vary according to their influence on the audience. Figure 10 shows the polarity change between a regular OM process with its basic opinion elements using TextBlob and the process after integrating the new element of opinion (i.e., user’s weight) into them using NL. The difference in the resultant polarity classes when considering users’ importance and influence on others is evident. Due to the large percentage of low influencers in the considered dataset, the percentage of the neutral class increases for most texts because of the low effect of their users.
Figure 10. Polarity classes differences before and after considering user’s weight.
In the OM process, a tool or technique is usually used to obtain a polarity score for a given text. Different tools may give the same text different polarity scores due to many factors, such as the tool’s accuracy. In contrast, in our model, opinion holders can share the same opinion text, but because we consider the user’s weight, these same opinion texts should have different polarity scores based on their sharer’s importance in the social network. Dealing with such a case is similar to applying different OM techniques to the same text. NL supports this: it allows different observations for the same text and can obtain a single polarity score for the text in question. Table 5 shows a sample of texts retweeted by users of different influence levels. They were given different scores, and we report how NL dealt with such cases.
Table 5. Samples of opinion dynamicity in the dataset.
User # | User’s Weight (ANN) | User’s Influence Type | Polarity Score (TextBlob) | Polarity Score (Using Our Proposed NL Model $(t, i, f)$)
273 | 0.10 | Micro | 0.90 | (0.5, 0, 0)
518 | 0.44 | Macro | 0.90 | (0.91, 0, 0)
570 | 0.68 | Mega | 0.90 | (0.89, 0, 0)
Overall score for ($n = 16$) is (0.79, 0.02, 0.03) with a final polarity score of 0.79 (SP)
147 | 0.20 | Micro | 0.125 | (0, 0, 0)
212 | 0.31 | Macro | 0.125 | (0.16, 0, 0)
123 | 0.68 | Mega | 0.125 | (0.16, 0, 0)
283 | 0.75 | A-Lister | 0.125 | (0.50, 0, 0)
Overall score for ($n = 20$) is (0.26, 0, 0) with a final polarity score of 0.26 (P)
608 | 0.26 | Micro | 0.80 | (0.5, 0, 0)
473 | 0.44 | Macro | 0.80 | (0.89, 0, 0)
328 | 0.72 | Mega | 0.80 | (0.93, 0.83, 0.84)
285 | 0.92 | A-Lister | 0.80 | (0.93, 0, 0)
Overall score for ($n = 40$) is (0.82, 0.15, 0.16) with a final polarity score of 0.82
To address the indeterminacy class, the proposed model flags the existence of undecided
polarity and attempts to identify it in the dataset. Figure 11 shows the effect of this
detection on the neutral polarity: the neutral percentage in the dataset decreases from
69% to 51% once the undecided polarities in the texts are identified. Compared with
cases #1 and #2, the neutral class normally increases when the proposed model is applied,
owing to low influencers whose tweets can be perceived as neutral. However, after the
three components of NL are considered (case #3), the neutral class increases because
undecided polarities are automatically counted as neutral. Excluding these undecided
cases leaves a final neutral class that accurately represents that class in the dataset.
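The effect of separating undecided cases from the neutral class can be sketched as a simple relabeling followed by a share computation. The labels and flags below are toy data invented for illustration and are not drawn from the paper's dataset; the rule that decides which items are undecided belongs to the NL model and is deliberately not reproduced, so the flags are supplied as an input. The function names are hypothetical.

```python
from collections import Counter


def class_share(labels, target):
    """Percentage of items whose label equals `target`."""
    if not labels:
        raise ValueError("labels must not be empty")
    return 100.0 * Counter(labels)[target] / len(labels)


def split_undecided(labels, undecided_flags):
    """Relabel neutral items that were flagged as undecided.

    How the flags are produced is outside this sketch.
    """
    return [
        "undecided" if label == "neutral" and flag else label
        for label, flag in zip(labels, undecided_flags)
    ]


# Toy data (assumption): five texts, one neutral text flagged as undecided.
labels = ["neutral", "neutral", "neutral", "positive", "negative"]
flags = [False, True, False, False, False]

before = class_share(labels, "neutral")
after = class_share(split_undecided(labels, flags), "neutral")
```

In this toy example the neutral share drops once the flagged item is moved out of the neutral class, which mirrors the direction of the change reported for Figure 11.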
Figure 11. Neutral class change cases.
According to [8], small changes can shift feelings from SP to P. Some opinions remain SP
regardless of any changes, while others may hold different polarities at the same time.
Undecided cases can occur and must be identified efficiently, without approximation or
defaulting to a dominant choice. Table 6 shows some of the undecided cases identified in
the dataset. Equation (10) was then applied to resolve these undecided polarities. If a
polarity remained undecided after applying Equation (10), it was classified as
indeterminate. Figure 12 presents the final polarity scores produced by the proposed
model after detecting and handling the indeterminacy and dynamicity of opinions.
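The two-stage handling described here (try to resolve an undecided case, then fall back to the indeterminate class) can be sketched as follows. Equation (10) is not reproduced: the resolver is passed in as a parameter, and `toy_resolver` is a placeholder rule with an arbitrary 0.5 threshold, invented purely so the control flow can be run. All names are hypothetical.

```python
from typing import Callable, Optional, Tuple

Triple = Tuple[float, float, float]  # (t, i, f)


def classify(case: Triple, resolver: Callable[[Triple], Optional[str]]) -> str:
    """Apply the resolver once; label the case indeterminate if it cannot decide."""
    label = resolver(case)
    return "indeterminate" if label is None else label


def toy_resolver(case: Triple) -> Optional[str]:
    """Placeholder rule, NOT Equation (10).

    Decides only when indeterminacy is below an arbitrary threshold
    and truth and falsity differ; otherwise returns None.
    """
    t, i, f = case
    if i >= 0.5 or t == f:
        return None
    return "positive" if t > f else "negative"


# Two triples taken from the sample rows of Table 5.
resolved = classify((0.93, 0.0, 0.0), toy_resolver)
unresolved = classify((0.93, 0.83, 0.84), toy_resolver)
```

Passing the resolver as an argument keeps the fallback logic independent of whichever resolution rule is actually used.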