Sentiment Analysis in English Texts
Arwa Alshamsi1, Reem Bayari1, Said Salloum2,3,*
1Faculty of Engineering & IT, The British University, Dubai, 345015, UAE
2Research Institute of Sciences & Engineering, University of Sharjah, Sharjah, 27272, UAE
3School of Science, Engineering, and Environment, University of Salford, Manchester, M5 4WT, UK
ARTICLE INFO
Article history:
Received: 23 September, 2020
Accepted: 24 December, 2020
Online: 28 December, 2020
Keywords:
Sentiment Analysis
Balanced Dataset
Unbalanced Dataset
Classification
ABSTRACT
The growing popularity of social media sites has generated a massive amount of data that
has attracted researchers, decision-makers, and companies to investigate people's opinions and
thoughts in various fields. Sentiment analysis has recently emerged as an active research topic.
Decision-makers, companies, and service providers likewise consider sentiment analysis
a valuable tool for improvement. This research paper aims to obtain a dataset of tweets
and apply different machine learning algorithms to analyze and classify texts. The
paper explores text classification accuracy when different classifiers are used to classify
balanced and unbalanced datasets. It was found that the performance of the classifiers
varied depending on the size of the dataset. The results also revealed that Naïve Bayes
and ID3 gave a better accuracy level than the other classifiers, and that their performance was better
with the balanced datasets. The other classifiers (K-NN, Decision Tree, Random Forest,
and Random Tree) performed better with the unbalanced datasets.
1. Introduction
The recent expansion of social media has changed how people
communicate, share, and obtain information [1–4]. In
addition, many companies use social media to evaluate their
business performance by analysing the content of conversations [5].
This includes collecting customers' opinions about services,
facilities, and products. Exploring this data plays a vital role in
consumer retention by improving the quality of services [6, 7].
Social media sites such as Instagram, Facebook, and Twitter offer
valuable data that business owners can use to track and analyse
not only customers' opinions about their own businesses but also
opinions about their competitors [8–11]. Moreover, these valuable data
have attracted decision-makers who seek to improve the services
they provide [8, 9, 12, 13].
In this research paper, several studies that examined
the classification and analysis of Twitter data for different purposes
were surveyed to investigate the methodologies and approaches
used for text classification. The authors of this research paper
aimed to obtain open-source datasets and then conduct text classification
experiments using machine learning approaches by applying
different classification algorithms, i.e., classifiers. The authors
used several classifiers to classify the texts of two versions of
the datasets: the first version is unbalanced, and the second
is balanced. The authors then compared the classification
accuracy of each classifier on the texts of both
versions.
2. Literature Review
As social media websites have attracted millions of users,
they store a massive number of texts generated by their
users [14–21]. Researchers have been interested in
investigating these metadata for search purposes [17, 18, 22–25].
In this section, a number of research papers that explored the
analysis and classification of Twitter metadata were surveyed to
investigate different text classification approaches [26] and the text
classification results.
The researchers in [27] investigated the gender of Twitter users.
The authors noticed that many Twitter users use the URL section of
their profile to point to their blogs, and that these blogs provide valuable
demographic information about the users. Using this method, the
authors created a corpus of about 184,000 Twitter users labeled
with their gender. The authors then arranged the dataset for
ASTESJ
ISSN: 2415-6698
*Corresponding Author: Said Salloum, University of Sharjah, UAE. Tel:
+971507679647 Email: ssalloum@sharjah.ac.ae
Advances in Science, Technology and Engineering Systems Journal Vol. 5, No. 6, 1683-1689 (2020)
www.astesj.com
Special Issue on Multidisciplinary Sciences and Engineering
https://dx.doi.org/10.25046/aj0506200
experiments as follows: for each user, they specified four fields.
The first field contains the text of the tweets, and the remaining three
fields come from the user's Twitter profile, i.e., full name, screen
name, and description. The authors then conducted the
experiments and found that using all of the dataset fields to
classify a Twitter user's gender provides the best accuracy,
92%, whereas using the tweet text alone
provides an accuracy of 76%. In [28], the authors used Machine
Learning approaches for Sentiment Analysis. They constructed
a dataset of more than 151,000 Arabic tweets, labeled as
75,774 positive tweets and 75,774 negative tweets. Several
Machine Learning algorithms were applied, such as Naïve Bayes
(NB), AdaBoost, Support Vector Machine (SVM), Maximum Entropy (ME), and Round
Robin (RR). The authors found that RR provided the most accurate
results when classifying texts, while the AdaBoost classifier's results were
the least accurate. The study in [29] was likewise interested in
Sentiment Analysis of Arabic texts. The authors constructed the
Arabic Sentiment Tweets Dataset (ASTD), which consists of 84,000
Arabic tweets; around 10,000 tweets remained after
annotation. The authors applied machine learning
classifiers to the collected dataset and reported
the following: (1) the best classifier on the dataset is SVM, and
(2) classifying a balanced set is challenging compared to the
unbalanced set. The balanced set has fewer tweets than the
unbalanced set, which may negatively affect the classification's
reliability. In [30], the authors investigated the effects of applying
preprocessing methods before the sentiment classification of
text. The authors used classifiers and five datasets to evaluate the
effects of the preprocessing methods on classification. Experiments
were conducted, and the researchers reported the following findings:
removing URLs has little effect; removing stop words has a
slight effect; removing numbers has no effect; expanding
acronyms improves classification performance; the same
preprocessing methods have the same effects on the classifiers'
performance; and the NB and RF classifiers showed more sensitivity than
the LR and SVM classifiers. In conclusion, classifier
performance for sentiment analysis improved after applying
preprocessing methods. A study by [31] investigated Twitter
geotagged data to construct a national database of people's health
behavior. The authors compared indicators generated by machine
learning algorithms with indicators generated by humans. The
authors collected around 80 million geotagged tweets. Spatial
join procedures were then applied, and 99.8% of tweets were
successfully linked. The tweets were then processed, after which
machine learning approaches were successfully applied
to classify tweets as happy or not happy with high accuracy.
The authors of [32] explored classifying sentiments in movie reviews. They
constructed a dataset of 21,000 tweets of movie reviews, which
was split into a training set and a test set. Preprocessing methods
were applied, and then two classifiers, i.e., NB and SVM, were used to
classify the tweet text as positive or negative sentiment. The authors
found that SVM achieved the better accuracy, 75%, while NB
reached 65%. Researchers of [33] used Machine Learning
methods and Semantic Analysis to analyze the sentiment of tweets.
The authors labeled the tweets in a dataset of 19,340 sentences
as positive or negative. They applied preprocessing methods,
extracted features, and then applied Machine
Learning classifiers, i.e., Naïve Bayes, Maximum Entropy, and
Support Vector Machine (SVM), followed by Semantic
Analysis. The authors found that Naïve Bayes
provided the best accuracy, 88.2%, followed by SVM at 85.5% and
Maximum Entropy at 83.8%. The authors also reported that
applying Semantic Analysis increased the accuracy to
89.9%. In [34], the authors analyzed sentiment by means of a game.
They introduced TSentiment, a web-based game
used for emotion identification in Italian tweets.
In TSentiment, users compete to
classify the tweets in a dataset of 59,446 tweets. Users
must first evaluate a tweet's polarity, i.e., positive, negative, or
neutral, and then select the tweet's sentiment from a pre-defined
list of 9 sentiments, of which 3 are identified for
positive polarity and 3 for negative polarity.
Neutral polarity is used for tweets that have no sentiment
expressions. This approach to classifying tweets was reported to be effective.
A study by [35] examined the possibility of enhancing the
accuracy of predictions of stock market indicators using sentiment
analysis of Twitter data. The authors used a lexicon-based
approach to determine eight specific emotions in over 755 million
tweets. They applied Support Vector Machine (SVM) and
Neural Network (NN) algorithms to predict the DJIA and
S&P 500 indicators. The best average precision rate,
64.10 percent, was achieved using the SVM algorithm for the DJIA
indicator. The authors indicated that the accuracy could be
increased by lengthening the training period and by improving the
algorithms for sentiment analysis. The authors concluded that adding
Twitter details does not improve accuracy significantly. In [36],
the authors applied sentiment analysis to around 4,432 tweets to
collect opinions on Oman tourism. They built a domain-specific
ontology for Oman tourism using ConceptNet. The researchers
constructed a sentiment lexicon based on three existing lexicons:
SentiStrength, SentiWordNet, and Opinion Lexicon. The authors
randomly assigned 80% of the data to the training set and 20% to the
test set. The researchers used two types of semantic sentiment analysis:
Contextual Semantic Sentiment Analysis and Conceptual
Semantic Sentiment Analysis. The authors applied a Naïve Bayes
supervised machine learning classifier and found that using
conceptual semantic sentiment analysis considerably improves
sentiment analysis performance. A study by [37] used sentiment
analysis and subjectivity analysis methods to analyze French
tweets and predict the French CAC40 stock market. The author
used a French dataset that consists of 1000 positive and negative
book reviews. The author trained a neural network using three
input features on 3/4 of the data and tested it on the remaining
quarter. The achieved accuracy was 80%, with a mean absolute
percentage error (MAPE) of 2.97%, which is lower than that
reported by Johan Bollen. The author suggested adding more
features as input to improve the performance. In [38], the authors
examined the relationship between social emotion on Twitter and
the stock market. The researchers collected millions of tweets through
the Twitter API and retrieved the NASDAQ market closing
price for the same period. The authors applied the correlation
coefficient and concluded that emotion-related terms have
some degree of influence on the overall stock market trend, but
not enough to serve as a guide to stock market
prediction. At the same time, there was a fairly close
association between positive, negative, and angry mood words.
Sad language in particular tends to have a far greater influence on
the stock market than other groups. In [39], the authors
investigated conversations about telecommunications companies on
Twitter (Indihome, in Indonesia). The authors
collected 10,839 raw data items for segmentation,
over five periods of time in the same year. They found that
most of the tweets (7,253) do not contain customers' perceptions
of Indihome; only 3,586 tweets do.
Most of the tweets that contain a perception
reveal a negative perception of Indihome
(3,119), and only 467 tweets contain positive
perceptions. The largest number of negative perceptions relates
first to product, second to process, third to
people, and fourth to pricing. Researchers of [40] examined
the prevalence and geographic variation of opinion polarities about
e-cigarettes on Twitter. The researchers collected data from Twitter using
seven pre-defined keywords. They classified the tweets into four
categories: irrelevant to e-cigarettes, commercial tweets, organic
tweets with attitudes toward the use of e-cigarettes (supporting,
against, or neutral), and geographic location information (city and
state). The researchers selected six socio-economic variables from
2014 Census data that are associated with smoking and health
disparities. They classified the tweets based on a
combination of human judgment and machine learning
algorithms, and two coders classified a random sample of 2000
tweets into five categories. The researchers applied a multilabel
Naïve Bayes classifier; the model achieved an accuracy
of 93.6% on the training data. They then applied the
machine learning algorithm to the full set of collected tweets and
found that the accuracy on the validation data was 83.4%. To evaluate
the socio-economic impact related to public perception of
e-cigarette use in the USA, the researchers calculated the Pearson
correlation between the prevalence and percentage of opinion
polarities and the selected ACS variables for 50 states and the District
of Columbia. In [41], the authors investigated the link between any
updates on certain brands and the reaction to them. The researchers gathered
geographic locations from the data to see the consumer
distribution. They collected Twitter data using the REST
API, 3,200 items in total from ten different profiles, then used
sentiment analysis to differentiate between clustered data
expressed positively or negatively, and then resampled the result in an
object model and cluster. For every answer, the textual sentiment
was evaluated from the object model.
The researchers used an AFINN-based word list and Sentiments of
Emojis to run a comprehensive sentiment analysis; for data
not present in the word list, they added a separate layer
of emoji analysis on top of the sentiment analysis.
The authors did not see any difference in the level of accuracy
when applying this extra layer. The researchers found some
Sentiment Analysis weaknesses related to the misuse of emoji, the
use of abbreviated words or slang terms, and the use of sarcasm.
In [42], the authors proposed an application that can classify
Twitter content as spam or legitimate. The authors used an integrated
approach combining URL analysis, Natural Language Processing, and
Machine Learning techniques. They analyzed the URLs
derived from the tweets, converted the URLs to their long form,
compared them with blacklisted URLs, and then compared them
with a pre-defined list of expressions regarded as spam; the presence of any
of these expressions leads to the conclusion that the URL is spam. After
cleaning the data, the stemmed keywords are compared with the pre-set
list of identified spam words, and if any of the pre-defined expressions
are found in the tweet, the user is classified as spam. Six
features were used for classification. The training set has 100
instances with six features and a label. The authors used the Naïve
Bayes algorithm. They manually examined 100 users and found
that 60 were legitimate and 40 were spam; the sample was then checked
by the application, and the results showed that 98 were classified
correctly.
3. Proposed Approach
In this work, the authors implemented and evaluated different
classifiers for classifying the sentiment of tweets, using
RapidMiner software. The classifiers were applied to both
balanced and unbalanced datasets. The classifiers used are Decision
Tree, Naïve Bayes, Random Forest, K-NN, ID3, and Random
Tree.
4. Experiment Setup
This section describes the dataset as well as the settings
and evaluation techniques used in the experiments.
The prediction of the tweet category is tested twice,
first on an unbalanced dataset and then on a
balanced dataset, as described below.
Experiments on the unbalanced dataset: Decision Tree,
Naïve Bayes, Random Forest, K-NN, ID3, and Random Tree
classifiers were applied to six unbalanced datasets.
Experiments on the balanced dataset: In this experiment,
the challenges related to unbalanced datasets were tackled by
manual procedures to avoid biased predictions and misleading
accuracy. The majority class in each dataset was almost equalized
with the minority classes, i.e., the numbers of positive, negative, and
neutral tweets are practically the same in the balanced dataset, as
shown in Table 3.
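The paper does not say which manual procedure was used to balance the classes. As a rough illustration only, the sketch below shows random oversampling, a common balancing technique that is swapped in here as an assumption and is not the authors' method. The function name `oversample` and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(rows, label_of, seed=0):
    """Randomly duplicate minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies (with replacement) to reach the target size.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy data only: 2 positive, 5 negative, and 3 neutral rows.
tweets = (
    [("pos tweet", "positive")] * 2
    + [("neg tweet", "negative")] * 5
    + [("neu tweet", "neutral")] * 3
)
balanced = oversample(tweets, label_of=lambda row: row[1])
counts = Counter(label for _, label in balanced)
print(counts)
```

With this approach every class ends up with as many rows as the original majority class, so the balanced set is always at least as large as the unbalanced one.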
4.1. Dataset Description
In this work, we obtained a dataset from Kaggle, one of the largest online
data science communities. It consists of more than
14,000 tweets, each labeled as positive, negative, or neutral. The
dataset was also split into six datasets; each dataset includes tweets
about one of six American airline companies (United, Delta,
Southwest, Virgin America, US Airways, and American). First,
we summarized the details of the obtained datasets, as
shown in Table 1 below.
Table 1: Summary of obtained Dataset
American Airline Companies
                   Virgin America   United   Delta   Southwest   US Airways   American
Number of Tweets   504              3822     2222    2420        2913         2759
Positive Tweets    152              492      544     570         269          336
Negative Tweets    181              2633     955     1186        2263         1960
Neutral Tweets     171              697      723     664         381          463
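To make the degree of imbalance in Table 1 easier to see, the following sketch converts the per-class counts into percentage shares. The counts are copied from Table 1; the names `TABLE_1` and `class_shares` are illustrative and not part of the original work.

```python
# Counts copied from Table 1: (positive, negative, neutral) per airline.
TABLE_1 = {
    "Virgin America": (152, 181, 171),
    "United": (492, 2633, 697),
    "Delta": (544, 955, 723),
    "Southwest": (570, 1186, 664),
    "US Airways": (269, 2263, 381),
    "American": (336, 1960, 463),
}


def class_shares(counts):
    """Return each class's share of the dataset as a percentage (one decimal)."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


shares = {airline: class_shares(c) for airline, c in TABLE_1.items()}
for airline, (pos, neg, neu) in shares.items():
    print(f"{airline:15s} positive={pos:5.1f}%  negative={neg:5.1f}%  neutral={neu:5.1f}%")
```

The shares show that Virgin America is close to evenly split across the three classes, while in the US Airways and United datasets the negative class clearly dominates.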
4.2. Dataset Cleansing
In this section, the authors describe the procedure followed
in preparing the dataset. The authors used RapidMiner
software for tweet classification and followed the steps
described below:
1) Splitting the dataset into a training set and a test set.
2) Loading the dataset, i.e., an Excel file, into RapidMiner
using the Read Excel operator.
3) Applying preprocessing using the operators below:
a) Transform Cases operator: to transform text to lowercase.
b) Tokenize operator: to split the text into a sequence of tokens.
c) Filter Stop words operator: to remove stop words such as is,
the, at, etc.
d) Filter Tokens (by length) operator: to remove tokens based on
their length; in this model, the minimum is 3 characters and the
maximum is 20 characters, and any token that does not match
this rule is removed.
e) Stem operator: to convert words into their base form.
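The preprocessing chain above can be approximated outside RapidMiner. The sketch below is a plain-Python stand-in under stated assumptions: the stop-word list is a tiny illustrative sample, the tokenizer keeps letters only, and `naive_stem` is a hypothetical, very rough suffix stripper rather than the stemmer RapidMiner uses. Only the order of the steps and the 3 to 20 character length filter come from the text.

```python
import re

# Tiny illustrative stop-word list; a real list would be much longer.
STOP_WORDS = {"is", "the", "at", "a", "an", "and", "to", "of", "in", "on", "for"}


def naive_stem(token):
    """Very rough suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, min_len=3, max_len=20):
    text = text.lower()                                            # Transform Cases
    tokens = re.findall(r"[a-z]+", text)                           # Tokenize (letters only)
    tokens = [t for t in tokens if t not in STOP_WORDS]            # Filter Stop words
    tokens = [t for t in tokens if min_len <= len(t) <= max_len]   # Filter Tokens (by length)
    return [naive_stem(t) for t in tokens]                         # Stem


result = preprocess("The flight is DELAYED at the gate, waiting for 2 hours")
print(result)
```

The resulting token lists are the kind of input a classifier would be trained on in the later steps.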
4.3. Dataset Training
Each of the datasets was divided into two parts. The first part
contains 66% of the total number of tweets in the dataset and is
used to train the model to classify the data under one attribute,
the sentiment, whose value is positive, negative,
or neutral. The remaining 34% of the tweets form the test set, on which
the trained model predicts this attribute (positive, negative, or neutral).
Figure 1: Summarization of the Process Model
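As a quick check of the 66%/34% split, the sketch below applies it to the dataset sizes from Table 1. Rounding 66% of each size to the nearest integer gives the training and test sizes listed in Table 2. Whether RapidMiner shuffles or stratifies the rows is not stated in the text, so the shuffle and the seed here are assumptions, and `split_66_34` is a hypothetical helper.

```python
import random


def split_66_34(rows, train_fraction=0.66, seed=0):
    """Shuffle a copy of the rows, then cut it into a training part and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Dataset sizes from Table 1; the rows are placeholder integers.
sizes = {"Virgin America": 504, "United": 3822, "Delta": 2222,
         "Southwest": 2420, "US Airways": 2913, "American": 2759}
split_sizes = {}
for airline, n in sizes.items():
    train, test = split_66_34(range(n))
    split_sizes[airline] = (len(train), len(test))
    print(airline, len(train), len(test))
```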
4.4. Dataset Classifying
In this section, the authors describe the steps of the tweet
classification technique.
1) The Set Role operator is used to allow the system to identify
sentiment as the target variable.
2) The Select Attributes operator is used to remove any attribute
that has missing values.
3) In the validation operator, the dataset is divided into two
parts (training and test). Two-thirds of the dataset is used to
train the model and the remaining one-third to evaluate it.
4) Different machine learning algorithms are used for training
on the dataset (Decision Tree, Naïve Bayes, Random Forest, K-NN,
ID3, and Random Tree).
5) For testing the model, the Performance operator is used to
measure the performance of the model.
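To illustrate the train-then-measure loop described above, here is a minimal sketch of one of the listed classifiers, Naïve Bayes, together with an accuracy function that plays the role of the Performance operator. It is a generic textbook multinomial variant with add-one smoothing, not RapidMiner's implementation, and the class name `MultinomialNB`, the function `accuracy`, and the toy token lists are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token lists, with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)          # log prior
            for t in tokens:
                if t in self.vocab:                        # skip words never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label equals the true label."""
    hits = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return hits / len(labels)


train_docs = [["great", "flight"], ["love", "crew"], ["delay", "again"],
              ["lost", "bag"], ["flight", "tomorrow"], ["gate", "change"]]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
model = MultinomialNB().fit(train_docs, train_labels)

test_docs = [["love", "flight"], ["lost", "bag", "again"], ["gate", "tomorrow"]]
test_labels = ["positive", "negative", "neutral"]
print(accuracy(model, test_docs, test_labels))
```

Accuracy here is simply the share of test tweets whose predicted sentiment matches the label, which is the quantity reported in the result tables.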
5. Experiment Results and Discussion
This section presents the experiment results in terms of the
accuracy of prediction of each classifier on both types of
datasets (balanced and unbalanced), and a comparison between the two
experiments.
5.1. Experiment results for an unbalanced dataset
Figure 2 and Table 2 present the accuracy results of the
utilized classifiers on the datasets.
Table 2: Accuracy results on unbalanced dataset
Accuracy        Virgin America   United   Delta    Southwest   US Airways   American
Dataset         504              3822     2222     2420        2913         2759
Training set    333              2523     1467     1597        1923         1821
Test set        171              1299     755      823         990          938
Decision Tree   31.86%           72.03%   42.08%   50.46%      82.72%       68.98%
Naïve Bayes     32.74%           72.38%   42.28%   51.01%      82.72%       72.21%
Random Forest   31.86%           72.03%   42.08%   50.46%      82.72%       68.98%
K-NN            39.82%           11.66%   35.27%   50.46%      82.72%       69.43%
ID3             32.74%           72.38%   42.28%   51.01%      82.72%       72.21%
Random Tree     31.86%           72.03%   42.08%   50.46%      82.72%       68.98%
Figure 2: Accuracy results on unbalanced airline datasets using different
classifiers
On some datasets, the classifiers' accuracy was very
high, while on others it was low. The classifiers performed best on the
US Airways and United datasets, which the authors attribute
to these being the largest datasets. The Naïve Bayes,
Decision Tree, and ID3 classifiers were mostly better than the other
classifiers and gave almost the same accuracy level. The
classifiers reported the lowest accuracy on the Virgin America dataset,
which the authors attribute to its very small size.
5.2. Experiment results for a balanced dataset
Decision Tree, Naïve Bayes, Random Forest, K-NN, ID3, and
Random Tree classifiers were applied to the five obtained
balanced datasets (United, Delta, Southwest, US Airways, and American).
Each dataset was divided into two parts. The first part
contains 66% of the total number of tweets in the dataset and is
used to train the model to classify the data under one attribute,
the sentiment, whose value is positive, negative, or
neutral. The remaining 34% of the tweets form the test set, on which
the trained model predicts this attribute (positive, negative, or neutral).
Table 3: Number of tweets before and after balancing.
             Number of instances                  Percentage
             Total tweets       Total tweets
             before balancing   after balancing   Positive   Negative   Neutral
United       3822               8276              33%        33%        34%
Delta        2222               2635              33%        33%        34%
Southwest    2420               5518              33%        33%        33%
US Airways   2913               6608              33%        33%        33%
American     2759               5924              34%        34%        33%
After applying different algorithms on the five balanced
datasets, the performance, i.e., accuracy results, were reported in
Table 4 and Figure 3 below:
Table 4: Accuracy results on the balanced dataset
Accuracy        Virgin America   United   Delta    Southwest   US Airways   American
Dataset         8276             2635     5518     6608        5924         8276
Training set    5464             1743     3642     4363        3911         5464
Test set        2812             892      1876     2245        2013         2812
Decision Tree   35.06%           34.63%   34.35%   35.06%      33.98%       35.06%
Naïve Bayes     97.65%           36.99%   65.48%   97.65%      61.20%       97.65%
Random Forest   35.06%           34.63%   34.35%   35.06%      33.98%       35.06%
K-NN            38.79%           32.77%   35.32%   38.79%      39.47%       38.79%
ID3             97.65%           36.99%   65.48%   97.65%      61.20%       97.65%
Random Tree     35.06%           34.63%   34.35%   35.06%      33.98%       35.06%
Figure 3: Accuracy results on balanced airline datasets using different classifiers
5.3. Comparison between two experiments results for each
classifier
Comparing the performance of the
classifiers on the balanced and unbalanced datasets revealed the
following, as seen in Figure 4 below:
5.3.1 Naïve Bayes and ID3
These two classifiers gave better accuracy than the other classifiers in the two
experiments. Their accuracy with the balanced datasets was higher
than with the unbalanced ones. In the unbalanced datasets, the maximum
accuracy for both classifiers was 82.7%. In the balanced datasets,
the accuracy reached 97.6%. These results confirm that these two
classifiers are the best of the selected classifiers in
the two experiments.
5.3.2 K-NN and Decision Tree
These classifiers show better performance with the unbalanced datasets, and
the difference is clear. The maximum accuracy with the
balanced datasets is 39.4%, while it reached 82.7% with the
unbalanced datasets.
5.3.3 Random Forest and Random Tree
These classifiers also show better performance with the unbalanced datasets, and
the difference is clear. The maximum accuracy with the
balanced datasets is around 35%, while it reached 82.7% with the
unbalanced datasets.
In conclusion, Naïve Bayes and ID3 gave a better accuracy
level than the other classifiers, and their performance was better with
the balanced datasets. The other classifiers (K-NN, Decision
Tree, Random Forest, and Random Tree) gave a better
performance with the unbalanced datasets.
Figure 4: Accuracy results of classifiers on balanced and unbalanced datasets
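The maxima quoted in this comparison can be re-derived from the accuracy tables. The sketch below copies the values from Table 2 and Table 4, in the column order in which the tables list them, and reports each classifier's best accuracy in both experiments. The dictionary names are illustrative.

```python
# Accuracy values (%) copied from Table 2 (unbalanced) and Table 4 (balanced),
# in the column order the tables list them.
UNBALANCED = {
    "Decision Tree": [31.86, 72.03, 42.08, 50.46, 82.72, 68.98],
    "Naive Bayes":   [32.74, 72.38, 42.28, 51.01, 82.72, 72.21],
    "Random Forest": [31.86, 72.03, 42.08, 50.46, 82.72, 68.98],
    "K-NN":          [39.82, 11.66, 35.27, 50.46, 82.72, 69.43],
    "ID3":           [32.74, 72.38, 42.28, 51.01, 82.72, 72.21],
    "Random Tree":   [31.86, 72.03, 42.08, 50.46, 82.72, 68.98],
}
BALANCED = {
    "Decision Tree": [35.06, 34.63, 34.35, 35.06, 33.98, 35.06],
    "Naive Bayes":   [97.65, 36.99, 65.48, 97.65, 61.20, 97.65],
    "Random Forest": [35.06, 34.63, 34.35, 35.06, 33.98, 35.06],
    "K-NN":          [38.79, 32.77, 35.32, 38.79, 39.47, 38.79],
    "ID3":           [97.65, 36.99, 65.48, 97.65, 61.20, 97.65],
    "Random Tree":   [35.06, 34.63, 34.35, 35.06, 33.98, 35.06],
}

best = {name: (max(UNBALANCED[name]), max(BALANCED[name])) for name in UNBALANCED}
for name, (unbal, bal) in best.items():
    better = "balanced" if bal > unbal else "unbalanced"
    print(f"{name:14s} unbalanced max={unbal:5.2f}  balanced max={bal:5.2f}  -> {better}")
```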
6. Conclusions
Social media websites are becoming increasingly popular among
people of different ages. Platforms such as Twitter, Facebook,
Instagram, and Snapchat allow people to express their ideas,
opinions, comments, and thoughts. As a result, a huge amount of
data is generated daily, and written text is one of the most
common forms of the generated data. Business owners, decision-makers,
and researchers are increasingly attracted by the valuable
and massive amounts of data generated and stored on social media
websites. Sentiment Analysis is a Natural Language Processing
field that has increasingly attracted researchers, government
authorities, business owners, service providers, and companies seeking to
improve products, services, and research. In this research paper,
the authors aimed to survey sentiment analysis approaches.
Therefore, 16 research papers that studied Twitter's text
classification and analysis were surveyed. The authors also aimed
to evaluate different machine learning algorithms used to classify
sentiment as positive, negative, or neutral. The experiment
compared the efficiency and performance of different
classifiers that have been used in the sixteen surveyed
papers. These classifiers are Decision Tree, Naïve Bayes,
Random Forest, K-NN, ID3, and Random Tree. In addition, the
authors investigated the effect of dataset balance by applying the
same classifiers twice, once on the unbalanced dataset and
once after balancing it. The targeted dataset comprised
six datasets about six American airline companies (United, Delta,
Southwest, Virgin America, US Airways, and American) and
consists of about 14,000 tweets. The authors reported that the
classifiers' accuracy was very high on some datasets and
low on others, and indicated that dataset size was the
reason. On the unbalanced dataset, the Naïve Bayes,
Decision Tree, and ID3 classifiers were mostly better than the other classifiers
and gave almost the same level of accuracy. The classifiers
reported the lowest level of accuracy on the Virgin America dataset
due to its small size. On the balanced dataset, the results show that
Naïve Bayes and ID3 gave a better level of accuracy than the other
classifiers, whereas K-NN, Decision Tree, Random Forest, and
Random Tree gave a better performance with the unbalanced
datasets.
Conflict of Interest
The authors declare no conflict of interest
Acknowledgment
This work is part of a project carried out at the British University in Dubai.
References
[1] S.A. Salloum, C. Mhamdi, B. Al Kurdi, K. Shaalan, “Factors affecting the
Adoption and Meaningful Use of Social Media: A Structural Equation
Modeling Approach,” International Journal of Information Technology and
Language Studies, 2(3), 2018.
[2] M. Alghizzawi, S.A. Salloum, M. Habes, “The role of social media in
tourism marketing in Jordan,” International Journal of Information
Technology and Language Studies, 2(3), 2018.
[3] S.A. Salloum, W. Maqableh, C. Mhamdi, B. Al Kurdi, K. Shaalan, “Studying
the Social Media Adoption by university students in the United Arab
Emirates,” International Journal of Information Technology and Language
Studies, 2(3), 2018.
[4] S.A. Salloum, M. Al-Emran, S. Abdallah, K. Shaalan, Analyzing the arab
gulf newspapers using text mining techniques, 2018,
doi:10.1007/978-3-319-64861-3_37.
[5] F.A. Almazrouei, M. Alshurideh, B. Al Kurdi, S.A. Salloum, Social Media
Impact on Business: A Systematic Review, 2021,
doi:10.1007/978-3-030-58669-0_62.
[6] Alshurideh et al., “Understanding the Quality Determinants that Influence
the Intention to Use the Mobile Learning Platforms: A Practical Study,”
International Journal of Interactive Mobile Technologies (IJIM), 13(11),
157–183, 2019.
[7] S.A. Salloum, K. Shaalan, Adoption of E-Book for University Students,
2019, doi:10.1007/978-3-319-99010-1_44.
[8] S.A. Salloum, M. Alshurideh, A. Elnagar, K. Shaalan, “Mining in
Educational Data: Review and Future Directions,” in Joint European-US
Workshop on Applications of Invariance in Computer Vision, Springer:
92–102, 2020.
[9] S.A. Salloum, M. Alshurideh, A. Elnagar, K. Shaalan, “Machine Learning
and Deep Learning Techniques for Cybersecurity: A Review,” in Joint
European-US Workshop on Applications of Invariance in Computer Vision,
Springer: 50–57, 2020.
[10] S.A. Salloum, R. Khan, K. Shaalan, “A Survey of Semantic Analysis
Approaches,” in Joint European-US Workshop on Applications of
Invariance in Computer Vision, Springer: 61–70, 2020.
[11] K.M. Alomari, A.Q. AlHamad, S. Salloum, “Prediction of the Digital Game
Rating Systems based on the ESRB.”
[12] S.A. Salloum, M. Al-Emran, A.A. Monem, K. Shaalan, “A survey of text
mining in social media: Facebook and Twitter perspectives,” Advances in
Science, Technology and Engineering Systems, 2(1), 2017,
doi:10.25046/aj020115.
[13] S.A. Salloum, A.Q. AlHamad, M. Al-Emran, K. Shaalan, A survey of Arabic
text mining, Springer, Cham: 417–431, 2018,
doi:10.1007/978-3-319-67056-0_20.
[14] C. Mhamdi, M. Al-Emran, S.A. Salloum, Text mining and analytics: A case
study from news channels posts on Facebook, 2018,
doi:10.1007/978-3-319-67056-0_19.
[15] A.S. Alnaser, M. Habes, M. Alghizzawi, S. Ali, “The Relation among
Marketing ads, via Digital Media and mitigate (COVID-19) pandemic in
Jordan,” Dspace.Urbe.University, (July), 2020.
[16] M. Alshurideh, B. Al Kurdi, S. Salloum, “Examining the Main Mobile
Learning System Drivers’ Effects: A Mix Empirical Examination of Both
the Expectation-Confirmation Model (ECM) and the Technology
Acceptance Model (TAM),” in International Conference on Advanced
Intelligent Systems and Informatics, Springer: 406417, 2019.
[17] M. Alghizzawi, M. Habes, S.A. Salloum, M.A. Ghani, C. Mhamdi, K.
Shaalan, “The effect of social media usage on students’e-learning acceptance
in higher education: A case study from the United Arab Emirates,”
International Journal of Information Technology and Language Studies, 3(3),
2019.
[18] M. Habes, S.A. Salloum, M. Alghizzawi, C. Mhamdi, “The Relation
Between Social Media and Students’ Academic Performance in Jordan:
YouTube Perspective,” in International Conference on Advanced Intelligent
Systems and Informatics, Springer: 382392, 2019.
[19] M. Habes, S.A. Salloum, M. Alghizzawi, M.S. Alshibly, “The role of
modern media technology in improving collaborative learning of students in
Jordanian universities,” International Journal of Information Technology
and Language Studies, 2(3), 2018.
[20] B.A. Kurdi, M. Alshurideh, S.A. Salloum, Z.M. Obeidat, R.M. Al-dweeri,
“An empirical investigation into examination of factors influencing
university students’ behavior towards elearning acceptance using SEM
approach,” International Journal of Interactive Mobile Technologies, 14(2),
2020, doi:10.3991/ijim.v14i02.11115.
[21] S.A. Salloum, M. Al-Emran, M. Habes, M. Alghizzawi, M.A. Ghani, K.
Shaalan, “Understanding the Impact of Social Media Practices on E-
Learning Systems Acceptance,” in International Conference on Advanced
Intelligent Systems and Informatics, Springer: 360369, 2019.
[22] M. Alghizzawi, M.A. Ghani, A.P.M. Som, M.F. Ahmad, A. Amin, N.A.
Bakar, S.A. Salloum, M. Habes, “The Impact of Smartphone Adoption on
Marketing Therapeutic Tourist Sites in Jordan,” International Journal of
Engineering & Technology, 7(4.34), 9196, 2018.
[23] S.F.S. Alhashmi, S.A. Salloum, S. Abdallah, “Critical Success Factors for
Implementing Artificial Intelligence (AI) Projects in Dubai Government
United Arab Emirates (UAE) Health Sector: Applying the Extended
Technology Acceptance Model (TAM),” in International Conference on
Advanced Intelligent Systems and Informatics, Springer: 393405, 2019.
[24] M. Alghizzawi, M. Habes, S.A. Salloum, The Relationship Between Digital
Media and Marketing Medical Tourism Destinations in Jordan: Facebook
Perspective, 2020, doi:10.1007/978-3-030-31129-2_40.
[25] R.S. Al-Maroof, S.A. Salloum, A.Q.M. AlHamadand, K. Shaalan, A Unified
Model for the Use and Acceptance of Stickers in Social Media Messaging,
2020, doi:10.1007/978-3-030-31129-2_34.
[26] K.S.A. Wahdan, S. Hantoobi, S.A. Salloum, K. Shaalan, “A systematic
review of text classification research based ondeep learning models in Arabic
language,” Int. J. Electr. Comput. Eng, 10(6), 66296643, 2020.
[27] J.D. Burger, J. Henderson, G. Kim, G. Zarrella, “Discriminating gender on
Twitter,” in Proceedings of the 2011 Conference on Empirical Methods in
Natural Language Processing, 13011309, 2011.
[28] D. Gamal, M. Alfonse, E.-S.M. El-Horbaty, A.-B.M. Salem, “Twitter
Benchmark Dataset for Arabic Sentiment Analysis,” International Journal of
Modern Education and Computer Science, 11(1), 33, 2019.
[29] M. Nabil, M. Aly, A. Atiya, “Astd: Arabic sentiment tweets dataset, in
Proceedings of the 2015 conference on empirical methods in natural
language processing, 25152519, 2015.
[30] Z.J. and G. Xiaolini, “Comparison Research on Text Pre-processing
Methods on Twitter Sentiment Analysis,” Digital Object Identifier, 2017,
doi:10.1109/ACCESS. 2017. 2672677.
H. Tariq et al. / Advances in Science, Technology and Engineering Systems Journal Vol. 5, No. 6, 1683-1689 (2020)
www.astesj.com 1689
[31] Q.C. Nguyen, D. Li, H.-W. Meng, S. Kath, E. Nsoesie, F. Li, M. Wen,
“Building a national neighborhood dataset from geotagged Twitter data for
indicators of happiness, diet, and physical activity,” JMIR Public Health and
Surveillance, 2(2), e158, 2016.
[32] A. Amolik, N. Jivane, M. Bhandari, M. Venkatesan, “Twitter sentiment
analysis of movie reviews using machine learning techniques,” International
Journal of Engineering and Technology, 7(6), 17, 2016.
[33] G. Gautam, D. Yadav, “Sentiment analysis of twitter data using machine
learning approaches and semantic analysis,” in 2014 Seventh International
Conference on Contemporary Computing (IC3), IEEE: 437442, 2014.
[34] M. Furini, M. Montangero, “TSentiment: On gamifying Twitter sentiment
analysis,” in 2016 IEEE Symposium on Computers and Communication
(ISCC), IEEE: 9196, 2016.
[35] A. Porshnev, I. Redkin, A. Shevchenko, “Machine learning in prediction of
stock market indicators based on historical data and data from Twitter
sentiment analysis .,” 2013 IEEE 13th International Conference on Data
Mining Workshops, 440444, 2013, doi:10.1109/ICDMW.2013.111.
[36] V. Ramanathan, “Twitter Text Mining for Sentiment Analysis on People ’ s
Feedback about Oman Tourism,” 2019 4th MEC International Conference
on Big Data and Smart City (ICBDSC), 15, 2019.
[37] V. Martin, “Predicting the french stock market using social media analysis,”
Proceedings - 8th International Workshop on Semantic and Social Media
Adaptation and Personalization, SMAP 2013, 37, 2013,
doi:10.1109/SMAP.2013.22.
[38] Q. Li, B. Zhou, Q. Liu, “Can twitter posts predict stock behavior?: A study
of stock market with twitter social emotion,” Proceedings of 2016 IEEE
International Conference on Cloud Computing and Big Data Analysis,
ICCCBDA 2016, 359364, 2016, doi:10.1109/ICCCBDA.2016.7529584.
[39] Indrawati, A. Alamsyah, “Social network data analytics for market
segmentation in Indonesian telecommunications industry,” 2017 5th
International Conference on Information and Communication Technology,
ICoIC7 2017, 0(c), 2017, doi:10.1109/ICoICT.2017.8074677.
[40] H. Dai, J. Hao, “Mining social media data for opinion polarities about
electronic cigarettes,” Tobacco Control, 26(2), 175180, 2017,
doi:10.1136/tobaccocontrol-2015-052818.
[41] A. Husnain, S.M.U. Din, G. Hussain, Y. Ghayor, “Estimating market trends
by clustering social media reviews,” Proceedings - 2017 13th International
Conference on Emerging Technologies, ICET2017, 2018-Janua, 16, 2018,
doi:10.1109/ICET.2017.8281716.
[42] K. Kandasamy, P. Koroth, “An integrated approach to spam classification
on Twitter using URL analysis, natural language processing and machine
learning techniques,” 2014 IEEE Students’ Conference on Electrical,
Electronics and Computer Science, SCEECS 2014, 15, 2014,
doi:10.1109/SCEECS.2014.6804508.
Article
Full-text available
Classifying or categorizing texts is the process by which documents are classified into groups by subject, title, author, etc. This paper undertakes a systematic review of the latest research in the field of the classification of Arabic texts. Several machine learning techniques can be used for text classification, but we have focused only on the recent trend of neural network algorithms. In this paper, the concept of classifying texts and classification processes are reviewed. Deep learning techniques in classification and its type are discussed in this paper as well. Neural networks of various types, namely, RNN, CNN, FFNN, and LSTM, are identified as the subject of study. Through systematic study, 12 research papers related to the field of the classification of Arabic texts using neural networks are obtained: for each paper the methodology for each type of neural network and the accuracy ration for each type is determined. The evaluation criteria used in the algorithms of different neural network types and how they play a large role in the highly accurate classification of Arabic texts are discussed. Our results provide some findings regarding how deep learning models can be used to improve text classification research in Arabic language.
Article
Full-text available
There are several reasons why most of the universities implement E-learning. The extent of E-learning programs is being offered by the higher educational institutes in the UAE are evidently expanding. However, very few studies have been carried out to validate the process of how E-learning is being accepted and employed by university students. The study involved a sample of 365 university students. To describe the acceptance process, the Structural Equation Modeling (SEM) method was used. On the basis of the technology acceptance model (TAM), the standard structural model that in
Article
Full-text available
This study investigates the influence of student social media usage on the acceptance of e-learning platforms at the British University in Dubai. A modified Technology Acceptance Model was developed and validated for the quantitative study, which comprised data collected from 410 graduate and postgraduate students via an electronic questionnaire. The findings showed that knowledge sharing, social media features and motivation to use social media systems, including Facebook YouTube and Twitter, positively affected the perceived usefulness and perceived ease-of-use of e-learning platforms, which, in turn, led to increased e-learning platform acceptance by students. The research model can be adapted to similar studies to assist in further research regarding how higher-education institutions in the UAE can maximize the benefits and uptake of e-learning platforms.
Conference Paper
Full-text available
This study aimed to analyze and discover the relation of using digital media sites (Facebook) on promoting medical tourism destinations in Jordan, and its impact on the behavior of tourists through the technologies provided by these means. Away from the traditional methods in marketing, the researchers used the survey methodology for a sample of 560 tourists distributed at central of Jordan in Dead Sea area to realize the study objective, a new framework was suggested to show the impact of Facebook on the behavior of tourists through: demographic variables, Facebook features, advertising, by using the TAM model in adoption of social media technology in tourism marketing for tourist destinations in Jordan. The proposed data were analyzed using the Smart PLS system by modeling structural equations (SEM). The outcome of the study showed that the advantages of Facebook, advertising and demographic variables have a favorable effect on the (PEOU) of the tourist and the PU in the adoption of tourism behavior, in addition to the (PU) and (PEOU) (ATT), which led to the adoption of behavior around therapeutic tourism destinations in Jordan. By determining the impact of Facebook in marketing tourism in Jordan, it would be useful to conduct further research to provide better proposals for marketing tourist therapeutic destinations in Jordan.
Conference Paper
Full-text available
This study aims mainly at analyzing the relationship between social media and students’ academic performance in Jordan in the context of higher education from a YouTube perspective. It intends to explore the benefits this relationship may have in enhancing students; leaning and improving their academic performance. To successfully reach its aims, this study proposes a new model aiming at verifying the relationship of social Bookmarking, YouTube Features, Perceived Usefulness, Use of Social Media, on Jordanian students’ academic performance. To verify the validity of the proposed model, data were analyzed using Smart PLS using structural equations modeling (SEM). Data were collected from Yarmouk University in Jordan covering all the levels of study at the university. An electronic questionnaire was conducted for a target of 360 students who participated in this study. The findings of the study revealed that Social Bookmarking, YouTube Features, Perceived Usefulness, Use of Social Media are important factors to predict students’ academic performance in relation to using social networking media for e-learning purposes in Jordan.
Conference Paper
Full-text available
There have been several longitudinal studies concerning the learners’ acceptance of e-learning systems using the higher educational institutes (HEIs) platforms. Nonetheless, little is known regarding the investigation of the determinants affecting the e-learning acceptance through social media applications in HEIs. In keeping with this, the present study attempts to understand the influence of social media practices (i.e., knowledge sharing, social media features, and motivation and uses) on students’ acceptance of e-learning systems by extending the technology acceptance model (TAM) with these determinants. A total of 410 graduate and undergraduate students enrolled at the British University in Dubai, UAE took part in the study by the medium of questionnaire surveys. The partial least squares-structural equation modeling (PLS-SEM) is employed to analyze the extended model. The empirical data analysis triggered out that social media practices including knowledge sharing, social media features, and motivation and uses have significant positive impacts on both perceived usefulness (PU) and perceived ease of use (PEOU). It is also imperative to report that the acceptance of e-learning systems is significantly influenced by both PU and PEOU. In summary, social media practices play an effective positive role in influencing the acceptance of e-learning systems by students.
Article
Full-text available
This study tries to find out the best model for prediction video game rate categories. A representation from four rating categories (everyone, everyone 10+, teen, mature) was used for the analysis. The paper follows CRISP-DM approach under Rapid Miner software to business and data understanding, Data preparation, model building and evaluation. The researchers compared prediction among six model and the results showed that the Generalized Linear Models (GLMs) achieved a best accuracy (0.9027), also results highlighted eight important content descriptions to have the highest influence on prediction.
Conference Paper
Social media is a multifaceted phenomenon that significantly affects business competence mainly because of spearheading the evolutionary process. The primary purpose of the systematic review is to encompass the evaluation of social media as a model that influences business enterprises in the local and international levels. The systematic review utilized four primary hypotheses to determine the influence of social media on businesses. These hypotheses are Social media (SM) that significantly influences the sales (SL) in business, Social media (SM) which have a strong relationship with businesses loyalty (LO), Social media (SM) that influences business by awareness (AW), and Social media (SM) significantly influences the level of business performance (BP). Different research studies established that social media significantly contributes to the competence of firms mainly because of the global effect. Examples of these social media facets include social media knowledge and various platforms such as Facebook, Instagram, Twitter, YouTube, and LinkedIn. In this case, social media fostered the emergence of various business capabilities. Examples of these capabilities encompass brand awareness, brand loyalty, and sales. Social media is a platform that profoundly influences the level of business competence through the advancement of business capabilities.
Conference Paper
This study investigates the drivers of students' intention to use, and actual use of, a Mobile Learning System (MLS) within the UAE higher education setting. The factors chosen for study are social influence, expectation-confirmation, perceived ease of use, perceived usefulness, satisfaction, continuous intention, and the actual use of the MLS. The study sheds more light on the MLS context because it combines two models: the Information Technology Acceptance Model (TAM) and the Expectation-Confirmation Model (ECM). A set of hypotheses was developed from this theoretical combination. Primary data were collected from 448 students and analyzed using Structural Equation Modeling (SEM), specifically SmartPLS, to evaluate the study model and test the hypotheses. The study found that social influence and expectation-confirmation both positively influence perceived ease of use, perceived usefulness, and satisfaction, and that these three drivers positively influence students’ intention to use the MLS. Based on the proposed links, the study confirms that intention to use such mobile educational tools strongly and positively affects actual use. Scholars and practitioners should investigate learners’ intention to use and actual use of MLS, and their determinants, in more depth, especially social influence and reference groups within the educational setting. Limitations and avenues for future research are also discussed in detail.
Conference Paper
Combining two models, the Technology Acceptance Model (TAM) and Uses and Gratifications Theory (U&G), into one integrated model is the first step in predicting the importance of using emotional icons and the level of satisfaction behind this usage. The two theories are integrated because U&G provides specific information and a complete understanding of usage, whereas TAM has proved its effectiveness with a variety of technological applications. A self-administered survey was conducted with college students at the University of Fujairah to identify the social and cognitive factors that affect the usage of stickers in WhatsApp in the United Arab Emirates. The hypothesized model was validated empirically using responses from an online survey of 372 respondents, which were analyzed using structural equation modeling (SEM-PLS). The results show that ease of use, perceived usefulness, cognition, hedonic, and social integrative factors significantly affected college students' intention to use stickers. Moreover, the personal integrative factor had a significant influence on the intention to use stickers in the UAE.