Spatial Sentiment and Perception Analysis of BBC News Articles Using Twitter Posts Mining


Farah Younas1 and Dr. Majdi Owda2
1 Department of Computer Science, Shaheed Zulfikar Ali Bhutto Institute of Science and Technology, Islamabad, Pakistan.
2 Department of Computing and Mathematics, Manchester Metropolitan University, Chester Street, Manchester, M1 5GD,
Abstract. Over the past few decades, social media, online resources and microblogging websites such as Twitter
have grown exponentially. There has been a surge of user-generated content, and the huge amount of data produced
through news and event sharing on these sites is no exception. Data generated by these resources
is a rich source of information for data mining. Sentiment analysis is a current and important research area that
attempts to determine the polarity of text. Determining the sentiment towards events happening around the globe has
become extremely important. In this paper, a subjective lexicon-based approach is proposed to mine
unstructured data from a popular microblogging website, Twitter, into meaningful information in order to
determine the semantic orientation of real-time reactions and opinions. The main focus is to extract the audience’s
sentiment related to BBC news articles being shared on Twitter. Firstly, our approach extracts all comments
on shared articles and determines their polarity. Secondly, it categorizes the extracted users based upon their
location and shows the collective opinion of users in different regions. Thirdly, a visualization tool has been
developed for viewing the obtained results.
Keywords: Microblogging, Big Data, Sentiment Analysis, Subjective Lexicon-based approach, Twitter, Spring
MVC, Hibernate.
1. Introduction
The process of identifying and mining subjective information from raw data is termed sentiment analysis. It
is closely related to the field of Natural Language Processing (NLP), which tries to narrow the gap between machine
and human by extracting and analysing useful information from natural language messages. In the past few years, with
the growth of web technology, there has been enormous growth in the use of microblogging platforms like Twitter. People
are not only using these web resources but also giving their feedback, consequently producing further useful
information. Social networking sites produce up to terabytes of data per week. Spurred by this growth in the amount of
user feedback, views and opinions, it is becoming essential to mine this data. Data collected on a daily basis is
wasted if it is not put to use. Companies are seeking ways to perform this task for better decision
making. The large and complex datasets generated from day to day present many
challenges to analysts who want to extract meaningful information from them. Traditional data processing
applications are inadequate for processing this data, and analysing such a large amount of data is a challenge.
Sentiment Analysis, or Opinion Mining, is a Natural Language Processing and Information Extraction (IE) task that
has received growing attention over the past few decades. It is also called emotion analysis or mood
extraction, and its basic task is the classification of text polarity as positive, negative or neutral. Social media is an
area where sentiment analysis has been extensively applied. The aim of our approach is to use Twitter data
to perform sentiment analysis and to develop a tool for visualizing the results. The targeted domain is the comments
posted on news articles from the BBC website which people share on Twitter; the purpose is to analyse the impact of a
particular article on various regions around the globe. People from various parts of the world comment on the shared news
article and express their thoughts and opinions about it, which can be positive or negative. Due to the large amount of data,
it is nearly impossible to analyse it manually; hence an automated tool is required for its analysis.
There are three different approaches to performing sentiment analysis. (1) Subjective Lexicon: each word in a list is
allocated a score that specifies the word’s nature as positive (good), negative (bad) or neutral. (2) N-Gram Modelling:
different types of models (uni-gram, bi-gram, tri-gram or their combination) are used to build an N-gram model, which is
then used for classification based on training data. (3) Machine Learning: supervised or semi-supervised learning is
performed by extracting features from text and learning a model [1]. We use the first approach for this project,
i.e. Subjective Lexicon.
Sentiment analysis can be done at different levels. (1) Document-Level Classification: the analysis is
performed on an entire review, and the whole review is categorized on the basis of its overall opinion. (2) Sentence-Level
Classification: the process is carried out in two steps. (i) A sentence is classified into one of two classes,
objective or subjective; this is known as subjectivity classification. (ii) A subjective sentence is classified into one of
two classes, positive or negative; this is referred to as sentiment classification and is the approach adopted by our study. The
approach used is based on a Subjective Lexicon, which is an unsupervised technique as the data is not labelled.
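The two-step sentence-level procedure can be illustrated with a minimal sketch. It is written in Python purely for illustration (the system described in this paper is a Java application), and the toy lexicon, the function names and the rule that a sentence is subjective when it contains at least one lexicon word are all assumptions made for this example rather than details of the proposed approach.

```python
# Hypothetical toy lexicon: word -> polarity score (+1 positive, -1 negative).
LEXICON = {"good": 1, "great": 1, "hopeful": 1,
           "bad": -1, "terrible": -1, "worrying": -1}


def tokenize(sentence):
    """Lowercase the sentence, split on whitespace and trim edge punctuation."""
    return [word.strip("!?,.\":;'") for word in sentence.lower().split()]


def is_subjective(sentence):
    """Step (i): treat a sentence as subjective if it contains a lexicon word."""
    return any(word in LEXICON for word in tokenize(sentence))


def classify_sentence(sentence):
    """Step (ii): label a subjective sentence by the sign of its summed scores."""
    if not is_subjective(sentence):
        return "objective"
    score = sum(LEXICON.get(word, 0) for word in tokenize(sentence))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because no labelled training data is involved, the quality of such a classifier depends entirely on the coverage of the lexicon.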
The remaining sections of this paper are organized as follows. Section 2 reviews related work. Sections 3 and 4 present
the research methodology and implementation details respectively. Results and analysis are presented in Section 5. Section
6 lists the limitations and future research challenges. Finally, the conclusion is given in Section 7.
2. Related Work
“Sentiment analysis or opinion mining refers to the application of natural language
processing, computational linguistics and text analytics to identify and extract subjective information in source
materials” [2]. A support vector machine (SVM) was used in a study of emotion classification to investigate the
emotion grouping of web blog corpora. The study took the sentence context into account when performing the emotion
classification. It concluded that the emotion in a document’s last sentence has the greatest
significance in determining the polarity of the document [3]. Read used emoticons for
building the training set for text classification, with texts containing emoticons from Usenet newsgroups
as sources. A training set depends both on the topic of the domain and on the time
when the data was collected. Therefore, a training set from one domain, say domain A, can be applied to another
domain B only if both domains share the domain-specific vocabulary. Training sets labelled with emoticons
yield datasets that are independent of domain, time and topic, and
emoticon-trained classifiers obtained satisfactory results [4].
Pang et al. used movie reviews as data for sentiment classification. They did not classify documents by topic but by
overall sentiment, categorizing the reviews as positive or negative. They found that machine learning techniques
outperform human-produced baselines. They applied three machine learning techniques (Naïve Bayes,
maximum entropy classification and SVM) and found that these perform better on traditional topic-based categorization than
on sentiment classification [5]. An empirical method for building a sentiment lexicon of adjectives was first developed by
Hatzivassiloglou and McKeown [6]. They used a large corpus to identify and validate constraints from conjunctions on the semantic
orientation, i.e. positive or negative, of the conjoined adjectives. The nature of the conjunction linking the adjectives is the key
point. These constraints were used by a log-linear regression model to determine whether conjoined adjectives have the same
or different orientation, and 82% accuracy was obtained on each independent conjunction. Evaluation on real data and
simulation experiments showed a high level of performance, with more than 90% classification precision for
adjectives [6]. For classifying reviews as positive or negative, Peter Turney presented a simple unsupervised algorithm
that classifies reviews as recommended (thumbs up) or not recommended (thumbs
down). Phrases in a review that contain adjectives or adverbs are extracted, and their average semantic orientation
is used to predict the classification of the review. Experiments were conducted on a movie review corpus. A review is
placed in the recommended (thumbs up) category if the average semantic orientation of its phrases is positive; otherwise it is
classified as not recommended. An automated system was necessary for better formalization of the problem [7].
Turney's algorithm uses Pointwise Mutual Information (PMI) between consecutive words to estimate their
polarity [5].
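For reference, the PMI-based semantic orientation measure mentioned above is usually written as follows. The notation, including the reference words "excellent" and "poor", follows Turney's paper [7] rather than anything defined in the present text, so it should be read as background and not as part of the approach proposed here.

```latex
\begin{align}
\mathrm{PMI}(w_1, w_2) &= \log_2 \frac{p(w_1 \,\&\, w_2)}{p(w_1)\, p(w_2)} \\
\mathrm{SO}(\mathit{phrase}) &= \mathrm{PMI}(\mathit{phrase}, \text{``excellent''})
                              - \mathrm{PMI}(\mathit{phrase}, \text{``poor''})
\end{align}
```

A review is then labelled as recommended when the average of SO over its extracted phrases is positive.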
In the last few years, tools such as tweetfeel, Twitratr,
Twitter Sentiment Analysis Tool and Social Mention
have become available, and there has been a reasonable volume of research on how sentiments are
conveyed in genres like news articles and online reviews. Sentic Corner, built on Sentic Computing, is a newly
developed model [8]. This model is an intelligent user interface in which the design and content are in harmony
with the user's current frame of mind, so that users do not have to deal with continuously blasted ads and user-unfriendly interfaces. Their
research is based on emotion representation and common sense. It dynamically collects audio, video and images related to
the user’s current activities and feelings in order to infer emotional state over the web [9].
Users are interested in aspectual sentiment classification rather than a binary output of positive or negative.
Therefore, the sentiment analysis covered so far is insufficient to satisfy the end user.
As an example, consider a social worker who has developed a scheme and wants to see the change in society
before and after the operation of the scheme. The system used for sentiment analysis should therefore be
able to recognise and classify the aspectual sentiment that is present in the text. As a solution to this problem,
Das proposed a sentiment structuration technique based on the 5Ws: Who, What, When, Where and Why. The label
bias problem may occur due to some drawbacks of this technique [10]. To rectify this problem, a hidden Maximum
Entropy Model (MEMM) was introduced. Another theory, called Appraisal theory, was described by
Bloom; it characterises opinion into three categories: affect, appreciation and judgment [11]. Aggregation of
data is one of the foremost needs of the end user. Sentiment Summarization-Visualization-Tracking can be done in two ways:
Polarity Wise: An overall polarity-wise summary can be shown in the form of a Gantt chart produced by the system.
A user can find out more details by looking into the summary text [12].
Topic Wise: Sentiment summaries based on a customized topic about the 5Ws can be generated by users. A user can
choose any dimension or combination of multiple dimensions. Pang and Lee performed topic-wise sentiment
summarization [5].
A similar study was conducted on news articles using a lexicon-based approach to perform sentiment
analysis. The experiments were performed on a BBC news dataset, which demonstrates the applicability and validity
of the adopted approach. Opinion mining was performed on articles from 2004 and 2005 to analyse which categories of
news have more positive articles than others [13]. Wang et al. examined the association between crime statistics
and drug-related tweets. Social media, including Twitter, has been shown to be a feasible tool for monitoring
and predicting public health events such as disease outbreaks. According to their study, Twitter can also be used as a
tool for monitoring crime [14]. In our previous study, Experiment for Analysing the Impact of Financial Events
on Twitter, we analysed the problem of detecting irregularities in the financial market in real time
according to the volume (as a sign of the importance of the irregularity) and to other features (as signs
of the potential origin causing the irregularity) [15]. Furthermore, another of our studies used Twitter as a
decision-making tool and inspected the permeability of Twitter to financial events, providing evidence
that Twitter can be used as a real-time social sensor for the economy and the stock market [16].
3. Research Methodology
The proposed system has a 3-tier architecture and performs sentiment analysis on a news article about which the user
wants to find public opinion. The three layers of the architecture are the Presentation, Business and Data Access
layers. Figure 1 shows the main modules of the system architecture. Figure 2 shows all the components and
subcomponents, together with the interactions between them. The input to the system is the URL of the BBC
news article under analysis. A search query is generated from the user input by the Query Builder with the help of its
subcomponent, the filtering engine. Twitter is then queried by the Social Media Retrieval Engine, which gathers the required
information about the people who shared that article on Twitter and extracts the comments on that article. The obtained
results are stored in the data store for analysis. The collected data might contain a lot of noise, such as
unnecessary words, punctuation marks and emoticons, which are not helpful in analysis, so the data needs to be
pre-processed prior to analysis. Finally, the processed data is used to carry out the analysis, and the results are presented in
the client’s browser. The results contain the extracted comments, information about the users, a map indicating their
geographical location, and pie and bar graphs presenting the percentage results of the sentiment analysis. Collecting data
for the project manually from Twitter would be a tedious job, so twitter4j, a Java library for the Twitter API which
allows data to be collected from Twitter, was used for data collection. The following sub-sections describe
the components of the proposed system.
Fig. 1. Modules of System
Fig. 2. System Architecture
3.1 Query Builder
This component generates a query based upon the user input, i.e. what the user wants to search for. Hence, the input stream to
this component is the input URL and the output stream is the generated query according to which the social media retrieval
engine will collect data for analysis. Its subcomponent is the filtering engine.
(a) Filtering Engine. It filters out the tweets
which we do not want to process and provides the required tweets and their associated comments. The
tweets to be filtered out are as follows:
a. Tweets written in any language other than English.
b. Tweets which do not contain the URL of the searched article.
c. Tweets posted outside the specified time period.
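The filtering criteria can be sketched as a single predicate. This is an illustrative Python sketch, not the authors' Java implementation: the `Tweet` record, its field names and the `keep_tweet` function are hypothetical, and the time criterion is assumed to mean that only tweets posted inside a chosen window are kept.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    """Hypothetical minimal tweet record used only for this sketch."""
    text: str
    lang: str
    created_at: datetime


def keep_tweet(tweet, article_url, start, end):
    """Return True if the tweet passes all three filtering criteria."""
    is_english = tweet.lang == "en"                   # criterion a
    has_article_url = article_url in tweet.text       # criterion b
    in_time_window = start <= tweet.created_at <= end  # criterion c (assumed)
    return is_english and has_article_url and in_time_window


def filter_tweets(tweets, article_url, start, end):
    """Keep only the tweets that should be passed on for analysis."""
    return [t for t in tweets if keep_tweet(t, article_url, start, end)]
```

In practice, shared links are often shortened, so a real filter would need to compare expanded URLs rather than raw tweet text.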
3.2 Social Media Retrieval Engine
The responsibility of this module is to gather information about the users who have commented on or shared the
searched article. To carry out this task, the twitter4j library is used to query the Twitter API, as mentioned in the introduction to Section 3,
generating tweet objects as the output stream of the component. Once a tweet and its relevant information are retrieved,
it is added to the outgoing stream and stored in the database. The Twitter API imposes several regulations and an
unavoidable rate limit on the number of tweets which can be extracted per hour, so it is very likely that
some tweets will be overlooked.
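One common way to cope with a rate limit is to pause until the current window has elapsed before issuing further requests. The Python sketch below illustrates that idea only; the paper does not state whether the system does this. The `fetch_page` callable, the `max_calls` quota and the `window_seconds` length are hypothetical stand-ins and do not correspond to any twitter4j or Twitter API call.

```python
import time


def collect_with_rate_limit(fetch_page, max_calls, window_seconds,
                            sleep=time.sleep, clock=time.monotonic):
    """Collect items page by page without exceeding max_calls per window.

    fetch_page(page_index) is a hypothetical callable that returns a list of
    items, or an empty list when there is nothing more to fetch.
    """
    results = []
    calls_in_window = 0
    window_start = clock()
    page = 0
    while True:
        if calls_in_window >= max_calls:
            # Quota used up: wait for the rest of the window, then reset.
            remaining = window_seconds - (clock() - window_start)
            if remaining > 0:
                sleep(remaining)
            calls_in_window = 0
            window_start = clock()
        batch = fetch_page(page)
        calls_in_window += 1
        if not batch:
            return results
        results.extend(batch)
        page += 1
```

The `sleep` and `clock` parameters are injected so that the waiting behaviour can be exercised without real delays.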
3.3 Data Pre-Processing Engine
To prepare the data for analysis, it is pre-processed. This process makes the gathered data noise free. In the first step, every
extracted tweet is split into tokens. In the second step, words which are not meaningful for analysis, termed stop
words, are removed from the extracted tweet sentences. This step helps in decreasing the meaningless vocabulary in order
to obtain optimum results. There is no universal list of such words. Table 1 lists the unwanted content removed from the
collected tweets.
Table 1. Action performed on unwanted content
Unwanted Content                 Action Performed
Punctuation (! ? , . ” : ; )     Removed
Uppercase characters             Lowercase all content
Stop words                       Removed
BBC News                         Removed
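The pre-processing steps can be sketched in a few lines of Python. This is an illustration under stated assumptions: the stop-word list below is a small hypothetical sample (the text notes that no universal list exists), the phrase "BBC News" is assumed to be dropped as a whole, and the function name is invented for this example.

```python
import string

# Hypothetical sample of stop words; the real list is project specific.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on", "it", "this"}
# Phrases treated as noise, following the "BBC News" entry of Table 1.
UNWANTED_PHRASES = ("bbc news",)
# ASCII punctuation plus typographic quotation marks.
PUNCTUATION = string.punctuation + "“”‘’"


def preprocess(text):
    """Lowercase, drop unwanted phrases and punctuation, then remove stop words."""
    text = text.lower()
    for phrase in UNWANTED_PHRASES:
        text = text.replace(phrase, " ")
    text = text.translate(str.maketrans("", "", PUNCTUATION))
    return [token for token in text.split() if token not in STOP_WORDS]
```

For example, `preprocess("BBC News: This is a GREAT result!")` yields the tokens `great` and `result`.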
3.4 Data Store
A MySQL database was used in the development of the project for storing and managing the information about users and
tweets, along with the comments extracted by the retrieval engine. It also contains the list of sentiment words used for performing the
sentiment analysis on the collected information.
3.5 Reasoning Engine
The main task of this engine is to perform the core sentiment analysis, making it the most important part of the whole
architecture. The four subcomponents of this engine, shown in Figure 3, are as follows:
Sentiment Analysis. This subcomponent is responsible for comparing the obtained comments with the list of sentiment
words stored in the database to categorize the impact of the article as positive or negative. Once it finds a word,
the weight of the corresponding sentiment, either positive or negative, is incremented.
Aggregation Function. After obtaining the positive and negative counts from the sentiment analysis component, an
aggregation function $\%\mathrm{Aggregate} = \left(\sum \mathrm{count} / \mathrm{Total\ count}\right) \times 100$ is applied to them to calculate the final
percentage for both the positive and negative counts.
Graph Generation. Values obtained from the aggregation function are used to plot a pie chart and a bar chart for
visualizing the results of the experiment.
Map Plotting. This component plots the results on a map using Google Maps, based on the user locations extracted in the data gathering step.
Fig. 3. Reasoning Engine
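The counting and aggregation steps of the reasoning engine can be sketched as follows. The sketch is in Python for brevity and is not the authors' Java code; the two word sets are hypothetical stand-ins for the sentiment word list held in the database, and the function names are invented for this example. The `aggregate` function applies the percentage formula given above, %Aggregate = (count / total count) * 100.

```python
# Hypothetical stand-ins for the sentiment word list stored in the database.
POSITIVE_WORDS = {"good", "great", "hope", "win"}
NEGATIVE_WORDS = {"bad", "poor", "loss", "worry"}


def count_sentiment(tokenized_comments):
    """Count positive and negative lexicon hits across all comments."""
    positive = negative = 0
    for tokens in tokenized_comments:
        for token in tokens:
            if token in POSITIVE_WORDS:
                positive += 1
            elif token in NEGATIVE_WORDS:
                negative += 1
    return positive, negative


def aggregate(count, total_count):
    """Percentage share of one sentiment count in the total count."""
    if total_count == 0:
        return 0.0  # no sentiment words found at all
    return 100.0 * count / total_count
```

With three positive and two negative hits, `aggregate` gives 60% positive and 40% negative, which are the kinds of values passed on to the chart and map components.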
4. Implementation Details
To make the system as responsive as possible, the program was developed using Spring MVC integrated with Hibernate,
running on a Tomcat 7 web server. A MySQL database is used for data management and retrieval. The Twitter data model
used in this project is shown in Figure 4.
Fig. 4. Twitter Data Model
4.1. Tools and Environment
The developed web application runs on the following technology stack:
Java EE IDE - Eclipse Mars Release 4.5.0
Tomcat v7.0
Spring MVC 3.0
Hibernate 3.0
MySQL Workbench 6.3
Platform - All the processing was carried out on a machine with an AMD A10 processor and 8 GB RAM,
running a 64-bit Windows operating system.
4.2. User Interface
The designed interface allows the user to view the searched article. The extracted tweets, their comments and information related
to the users are displayed on the user interface. It also provides a facility to view the results in various forms, such as
the overall percentages of the obtained results categorized as positive or negative. Furthermore, the different regions and
their associated percentages are also displayed for comparison.
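The per-region percentages shown in the interface can be derived by grouping classified comments by user location. The Python sketch below is illustrative only: the input format of (region, label) pairs and the function name are assumptions, and labels are assumed to be exactly "positive" or "negative".

```python
from collections import defaultdict


def regional_percentages(labelled_comments):
    """Compute positive/negative percentages per region.

    labelled_comments is an iterable of (region, label) pairs, where label
    is assumed to be either "positive" or "negative".
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for region, label in labelled_comments:
        counts[region][label] += 1

    percentages = {}
    for region, region_counts in counts.items():
        total = region_counts["positive"] + region_counts["negative"]
        percentages[region] = {
            label: round(100.0 * value / total, 2)
            for label, value in region_counts.items()
        }
    return percentages
```

The resulting dictionary maps each region to its two percentages, which is the shape of data a map or regional table would need.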
5. Results and Analysis
For illustration purposes, a news article and a tweet with its comments are presented as a case study. For example, we
search for the following article: BBC News - General election 2019: Labour facing long haul, warns
A few of the obtained tweets are shown in Figure 5.
Fig. 5. Extracted comments on Queried Tweet
The approach was applied to a set of tweets and comments on the article. At the time of query execution, a total of
590 people had shared this article on their Twitter profiles. The results showed that 71.88% of people in the United States
of America perceived it as positive news, whereas 28.57% of people considered it negative. Figures 6 and
7 show the results in the form of a map, indicating the percentages of positive and negative perception for the locations from which
people responded to the news article. Table 2 shows the obtained percentages of perception for the extracted regions. An
overall ratio was also obtained as part of the analysis and displayed in the form of a pie chart and a bar chart.
According to this ratio, 69.57% of people around the globe considered the news article positive and 30.43% considered
it negative, as shown in Figures 8 and 9.
Table 2. Percentages of perceptions in different regions.
Percentage of Positive Perception
Percentage of Negative Perception
United States
United Kingdom
Fig. 6. Locations’ Percentage of Positive Perception
Fig. 7. Locations’ Percentage of Negative Perception
Fig. 8. Pie Chart of Perceptions
Fig. 9. Bar Chart of Perceptions
6. Research Limitation and Challenges
The sentiment analysis performed was limited to comments written in English. Another limitation and
challenge is the continual emergence of new vocabulary, which creates the need to keep our dictionary
updated. Furthermore, the limit imposed by the Twitter API on the number of tweets that can be extracted in an hour results
in some tweets being missed, which in turn affects the analysis. Since the analysis is performed on real-time data,
determining the accuracy of the results is a challenging task.
7. Conclusion
People around the globe consume more news than ever before. Mining the polarity of important
events happening around us is a useful way to get an idea of the impact of a certain event or news item on different
regions of the world; providing a spatial view of sentiment is therefore essential. This paper explored this
direction of sentiment analysis, out of the many directions possible: opinion mining was
performed on the comments posted by various users on BBC news articles shared on Twitter accounts, since opinions
differ according to the context of the news and the location of the audience. A BBC news article which was shared on Twitter
was used to demonstrate the experiment and its results. Further work will be based on extracting data from various
social media sources such as Facebook and integrating it with the results obtained from the proposed approach. This
will result in an improved spatial sentiment analysis, as data will be collected and fused from diverse sources.
References
[1] L. Zhang, R. Ghosh, M. Dekhil, M. Hsu and B. Liu, "Combining Lexicon-based and Learning-based Methods," June 21, 2011.
[2] R. Tejwani, "Sentiment Analysis: A Survey," 2014.
[3] C. Yang, K. Hsin-Yih and H.-H. Chen, "Emotion Classification Using Web Blog Corpora," in
IEEE/WIC/ACM International Conference on Web Intelligence, National Taiwan University,
Taipei, Taiwan, 2007.
[4] J. Read, "Using emoticons to reduce dependency in machine learning techniques for sentiment
classification," in The Association for Computer Linguistics, 2005.
[5] B. Pang, L. Lee and S. Vaithyanathan, "Thumbs up? Sentiment Classification using Machine
Learning," in EMNLP, USA, 2002.
[6] V. Hatzivassiloglou and K. McKeown, "Predicting the Semantic Orientation of Adjectives,"
May 2002.
[7] P. D. Turney, "Thumbs Up or Thumbs Down? Semantic Orientation Applied to," 40th Annual
Meeting of the Association for Computational Linguistics (ACL), pp. 417-424, July 2002.
[8] E. Cambria and A. Hussain, Sentic Computing, Techniques, Tools, and Applications, Springer,
[9] E. Cambria, A. Hussain and C. Eckl, "Taking Refuge in Your Personal Sentic Corner," 2011.
[10] A. Das, S. Bandyopadhyay and B. Gambäck, "The 5W Structure for Sentiment Summarization-Visualization-Tracking,"
in Proceedings of the 13th International Conference on Computational
Linguistics and Intelligent Text Processing, March 2012.
[11] J. Martin and P. White, "The language of evaluation: Appraisal in English," London, 2005.
[12] M. A. Karim, Technical Challenges and Design Issues in Bangla Language Processing, IGI
Global, 2013.
[13] S. Taj and A. F. Meghji, "Sentiment Analysis of News Articles: A Lexicon Based Approach," in
2nd International Conference on Computing Mathematics & Engineering Technologies-2019
(iCoMET), February 2019.
[14] W. Yan, W. Yu, S. Liu and S. Young, "The Relationship Between Social Media Data and Crime
Rates in the United States," Social Media + Society, vol. 5, no. 1, March 2019.
[15] M. Owda, K. Crockett and A. F. Vilas, "Experiment for Analysing the Impact of Financial
Events on Twitter," August 2017.
[16] A. F. Vilas, R. P. D. Redondo, K. Crockett, M. Owda and L. Evans, "Twitter permeability to
financial events: an experiment towards a model for sensing irregularities," Multimedia Tools
and Applications, vol. 78, no. 7, pp. 9217-9245, April 2019.
... These positive changes in news articles could affect the reader's perception. 63 In other words, it could influence the public reading the article to have a positive perception of telemedicine. Also, a positive change in the tone of a news article could affect public preference for telemedicine. ...
Full-text available
Telemedicine is rapidly growing to meet the increased needs for high-quality health care during the COVID-19 pandemic. However, telemedicine is still a sensitive issue as it is related to medical privatization. The use of telemedicine after the COVID-19 outbreak might be influenced by public opinion, and this may be an important key in implementing telemedicine. In this study, we aimed to assess if telemedicine-related newspaper articles and comments changed positively during the COVID-19 pandemic. From January 1, 2019, to March 1, 2020 (before COVID-19), a total of 1073 telemedicine-related articles were found in the Korean news network. Although the post-COVID-19 article collection period (from March 2, 2020, to September 30, 2020) was about half that of the pre-COVID-19, about twice the number (1934) of telemedicine-related articles were collected. And telemedicine-related news articles had a more positive tone post-COVID-19 than pre-COVID-19 (52.9% after vs 40.4% before). In conclusion, this study presented the association between the COVID-19 outbreak and changes in the media’s perception of telemedicine in Korea. This study presented that, as telemedicine begins to be utilized due to COVID-19, news media and readers who embrace it are beginning to view telemedicine positively, suggesting that COVID-19 has a positive foundation for the spread of telemedicine.
... Younas and Owda show in their work [26] a lexicon-based method for analysing the sentiment of tweets regarding BBC news articles. Another important work includes the works of Dutot and Castellano [27] where they suggested the need to analyse a brand considering "social media", as well as other components such as "brand characteristic" and the quality of the services or of the website of that brand. ...
We present a novel framework for brand monitoring and analysis, based on the available data of Twitter in Romanian. The framework uses a sentiment analysis text classifier for distinguishing Twitter posts between positive or negative, which was trained and tested using a novel dataset of tweets in Romanian, labelled by the authors. We created and compared four adapted preprocessing pipelines, that generated four sets of data, on which we trained several machine learning models. Based on the evaluation metrics, results show that a neural network using fastText has the best F1-score and accuracy, thus this model was further used for our proposed framework for brand monitoring and analysis. Our application creates various reputation scores, based on which it generates three kind of reports: reputation report of a single brand, reputation report of an industry and comparative reputation report of two companies, in a desired time frame.
Full-text available
Crime monitoring tools are needed for public health and law enforcement officials to deploy appropriate resources and develop targeted interventions. Social media, such as Twitter, has been shown to be a feasible tool for monitoring and predicting public health events such as disease outbreaks. Social media might also serve as a feasible tool for crime surveillance. In this study, we collected Twitter data between May and December 2012 and crime data for the years 2012 and 2013 in the United States. We examined the association between crime data and drug-related tweets. We found that tweets from 2012 were strongly associated with county-level crime data in both 2012 and 2013. This study presents preliminary evidence that social media data can be used to help predict future crimes. We discuss how future research can build upon this initial study to further examine the feasibility and effectiveness of this approach.
Conference Paper
Full-text available
Modern technological era has reshaped traditional lifestyle in several domains. The medium of publishing news and events has become faster with the advancement of Information Technology (IT). IT has also been flooded with immense amounts of data, which is being published every minute of every day, by millions of users, in the shape of comments, blogs, news sharing through blogs, social media micro-blogging websites and many more. Manual traversal of such huge data is a challenging job; thus, sophisticated methods are acquired to perform this task automatically and efficiently. News reports events that comprise of emotions-good, bad, neutral. Sentiment analysis is utilized to investigate human emotions (i.e., sentiments) present in textual information. This paper presents a lexicon-based approach for sentiment analysis of news articles. The experiments have been performed on BBC news dataset, which expresses the applicability and validation of the adopted approach.
Full-text available
There is a general consensus on the good sensing and novelty characteristics of Twitter as an information medium for the complex financial market. This paper investigates the permeability of the Twittersphere, the total universe of Twitter users and their habits, towards relevant events in the financial market. Analysis shows that a general-purpose social media platform is permeable to financial-specific events and establishes Twitter as a relevant feeder for taking decisions regarding the financial market and even fraudulent activities in that market. However, the provenance of contributions, their different levels of credibility and quality, and even the purpose or intention behind them should be considered and carefully contemplated if Twitter is used as a single source for decision making. With the overall aim of this research being to deploy an architecture for real-time monitoring of irregularities in the financial market, this paper conducts a series of experiments on the level of permeability and the permeable features of Twitter in the event of one of these irregularities. To be precise, Twitter data is collected concerning an event comprising a specific financial action on the 27th January 2017: the announcement of the merger of two companies, Tesco PLC and Booker Group PLC, listed in the main market of the London Stock Exchange (LSE), to create the UK’s Leading Food Business. The experiment attempts to answer two research questions which aim to characterize the features of Twitter permeability to the financial market. The experimental results confirm that a far-reaching financial event, such as the merger considered, caused apparent disturbances in all the features considered, that is, information volume, content and sentiment as well as geographical provenance. Analysis shows that although Twitter is not a specific financial forum, it is permeable to financial events. Therefore it should be considered within the architecture for real-time monitoring of irregularities in the financial market.
Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. Mining opinions expressed in user-generated content is a challenging yet practically very useful problem. This survey covers various approaches and methodologies used in Sentiment Analysis and Opinion Mining in general. The focus is on Internet text such as product reviews, tweets and other social media.
Conference Paper
In this paper we address the Sentiment Analysis problem from the end user's perspective. An end user might desire an automated at-a-glance presentation of the main points made in a single review, or of how opinion changes over time across multiple documents. To meet this requirement we propose a relatively generic opinion 5Ws structurization, further used for textual and visual summary and tracking. The 5W task seeks to extract the semantic constituents in a natural language sentence by distilling it into the answers to the 5W questions: Who, What, When, Where and Why. The visualization system lets users generate sentiment tracking with a textual summary and a sentiment-polarity-wise graph based on any dimension or combination of dimensions they want, i.e. "Who" are the actors and "What" are their sentiments regarding any topic, changes in sentiment during "When" and "Where", and the reasons for change in sentiment as "Why".
Conference Paper
Twitter, as the heart of publicly accessible Social Media, is one of the currently used platforms to share financial information and is a valuable source of information for different roles in the financial market. For all these roles, the quality analysis of Twitter as a source of financial information is essential to take decisions. The work in this paper is aligned with the ongoing work of the authors towards a solution for irregularity monitoring in the financial market by harnessing data in online social media. To do so, the permeability of a variety of social media data feeders to financial irregularities should be analysed. That is the case of the experiment in this paper, which focuses on the Twitter microblogging platform and checks whether this general-purpose social media platform is permeable to a specific financial event. For this, we detail the analysis of Twitter permeability to a specific event in the past few months: the announcement of the merger of Tesco and Booker to create the UK’s Leading Food Business on the 27th January 2017. Both companies, Tesco PLC and Booker Group PLC, are listed in the main market of the LSE (London Stock Exchange). Our findings provide promising evidence to address the problem of real-time detection of irregularities in the financial market via Twitter according to the volume (as a sign of the importance of the irregularity) and to other features (as signs of the potential origin causing the irregularity).
This is the first comprehensive account of the Appraisal Framework. The underlying linguistic theory is explained and justified, and the application of this flexible tool, which has been applied to a wide variety of text and discourse analysis issues, is demonstrated throughout by sample text analyses from a range of registers, genres and fields.
The title Cognitive Agent-based Computing reflects a unified framework combining two key modeling paradigms for developing cognition/understanding of a special type of system, namely Complex Adaptive Systems (CAS).
Conference Paper
In a world in which web users are continuously blasted by ads and often compelled to deal with user-unfriendly interfaces, we sometimes feel like we want to escape from the sensory overload of standard web pages and take refuge in a safe web corner, in which contents and design are in harmony with our current frame of mind. Sentic Corner is an intelligent user interface that dynamically collects audio, video, images and text related to the user’s current feelings and activities as an interconnected knowledge base, which is browsable through a multi-faceted classification website.