All content in this area was uploaded by Belfin R V on Oct 26, 2020
Trim Size: 170mm x 244mm Single Column Souravde551591 c04.tex V1 - 02/20/2020 1:49pm
Author Queries
AQ1 Please provide Table 4.1 citation.
4
Application of Machine Learning in the Social Network
Belfin R. V.1, E. Grace Mary Kanaga1, and Suman Kundu2,3
1Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences,
Coimbatore, India
2Department of Computer Science and Engineering, Indian Institute of Technology, Jodhpur, India
3Department of Computational Intelligence, Wroclaw University of Science and Technology, Wroclaw, Poland
4.1 Introduction
Social media platforms have become an integral part of day-to-day life for a majority of
the world's internet users. People increasingly turn to social media for information and
learning. Apart from consuming information, people can create content for social media to
showcase their skills. An example is the video resume, which professionals create and
publish on social media to establish their presence. Content can take different forms such
as images, text, emoticons, and videos. Since there are few limits on content creation on
social media, users generate a massive amount of data that shows all the characteristics
of big data. This data can be used for different analytical and predictive applications
for business. Selling data through APIs for business and educational purposes is also a
line of business for many data giants. Structured Query Language (SQL) alone is not
sufficient to mine information from big data; complex statistical and machine learning
(ML) approaches are needed to glean information from this massive data. This chapter
provides a survey of different metaheuristic machine learning algorithms used for various
interesting research problems in the domain of social networks and big data.
4.1.1 Social Media
A critical entity of the World Wide Web is social media, which comes in different forms
including social blogs, forums, professional networks, picture sharing applications, social
gaming sites, chatting applications, and most importantly social networks. The reach of
social media is considerable: estimates predict 3.02 billion monthly active social media
users by 2021. A forecast by Statista.com (2018) shows that China alone will have 750
million users by 2022 and India will have one-third of a billion users. On average, internet
users worldwide spend 135 minutes per day surfing social media. This user density has
resulted in marketers promoting their products on social media, in a new field named social
media marketing or social digital media advertising. Recently, there has been a complete
transformation in the usage of social networking sites, from being used on personal
computers to being used more often on mobile devices. Social networking giants like
Facebook, Twitter, and many others give away their mobile applications to customers. There
are even location-based microblogging and many other services offered to customers through
mobile applications.
Recent Advances in Hybrid Metaheuristics for Data Clustering, First Edition.
Edited by Sourav De, Sandip Dey, and Siddhartha Bhattacharyya.
© 2020 John Wiley & Sons Ltd. Published 2020 by John Wiley & Sons Ltd.
4.1.2 Big Data
The amount of data generated by social networks and social media is enormous. It covers
all four significant features of big data, the so-called 4V's: volume, velocity, variety,
and veracity. When these are present in generated social media data, analysis of the data
becomes complex. Leaving the complex data as it is is not a wise decision for the technology
giants. These social media organizations have started analyzing this generated data to give
better prospects to their users. The users of these features are happy and excited to see
applications built on their data. Application users can personalize the content and share
it with their friends on social media. To leverage the content generated on social media,
branding and advertising departments of the top companies create marketing plans and
budgets accordingly. These companies also need to understand the outcome of their
advertisements, the preferences of their customers, and even the negative reviews. Since
the amount of data is enormous, it is impossible to do the analysis manually. Information
from historical transactions and social media data alone is not enough for top officials
to decide on their future goals, and organizations have to stay ahead of their competitors.
Machine learning models come to the rescue to help top management make decisions.
4.1.3 Machine Learning
Machine learning and AI are important concepts in the current scenario. Much human work
may be replaced by machines. For example, in the future, bots may replace most of the
humans in the armed forces of a country. Restaurants can replace waiters with AI bots, and
such bots are already in use in a few restaurants. There are machine learning approaches
that can teach the bots to understand the environment and act accordingly. Classification,
clustering, regression, and deep learning are some of the models in machine learning.
As shown in Figure 4.1, machine learning algorithms can be divided into four types,
namely, supervised learning, unsupervised learning, semisupervised learning, and
reinforcement learning. Supervised learning algorithms are used when the target variable
is continuous or categorical. Some use cases for supervised learning are regression
analysis for housing price prediction and the classification of medical images.
Unsupervised learning algorithms are used when there is no target variable. Clustering of
marketing data for customer segmentation and market basket analysis (association rule
mining) of supermarket transaction data are use cases of unsupervised learning algorithms.
Semisupervised algorithms can be used when the target variable is categorical but labeled
for only part of the data. The text classification of news data and lane finding in GPS
data using clustering are some of the use cases of semisupervised learning algorithms.
Reinforcement learning is an advanced level of learning algorithm that learns the
environment and acts accordingly. Reinforcement learning can be applied when the target
variable in the data is categorical or there is no target variable. The use cases for
reinforcement learning are driverless cars and optimizing the marketing cost of a business.
[Figure 4.1: tree diagram. Machine Learning branches into Supervised Learning (Regression,
Classification), Unsupervised Learning (Clustering, Association), Semi-Supervised Learning
(Classification, Clustering), and Reinforcement Learning (Classification, Control).]
Figure 4.1 Classification of machine learning algorithms
4.1.4 Natural Language Processing (NLP)
The amount of content generated by the users of social media is increasing exponentially.
Unlike other formats of data, text cannot be processed efficiently by a machine as is.
A machine needs to understand human slang and language to analyze the text content.
Natural language processing (NLP) helps machines understand human slang and language
in the text content generated on social media. The flow of content from social media to a
big data storage system and the analysis by ML and NLP are illustrated in Figure 4.2.
In recent times, machine learning and artificial intelligence have come to play a vital
role in engaging millions of social media users. Recent studies show that customers are
more loyal to companies that respond to them promptly. Bots or machine learning programs
automatically understand the customers' queries using NLP and respond to them then and
there. This advancement helps companies retain their customers and build stronger
relationships with them. The basic model of a social media chatbot is illustrated in
Figure 4.3.
[Figure 4.2: diagram with three blocks. Social Media Platforms; Big Data Storage ("Big data
generated from social media platforms is stored in the cloud for further processing.");
Machine Learning and Natural Language Processing Engine ("Data scientists use ML and NLP
to tap useful information from the big data.").]
Figure 4.2 Workflow of big data, machine learning, and social media
[Figure 4.3: diagram with the labels NLP, Machine Learning, Bot Logic, Messaging Platforms'
APIs, and Social Media Content.]
Figure 4.3 Chatbot schematic diagram
4.1.5 Social Network Analysis
Social network analysis (SNA) is a method of analyzing social relationships, usually with
the concepts of networks and graph theory. In SNA the social actors are usually denoted
by nodes, and the relationships are denoted by edges of the graph. There are different
variants of these networks, such as directed, undirected, and weighted networks. In recent
times there have been multilayer representations to represent complex social structures.
Although graph theories were at the forefront of social network analysis (Belfin et al.,
2018; Belfin and Grace Mary Kanaga, 2018), there were attempts to use other theories like
game theory (Narayanam and Narahari, 2011) and granular computing (Kundu and Pal, 2015a;
Pal and Kundu, 2017; Kundu and Pal, 2018) to solve social network issues. This chapter is a
summary of the various applications and machine learning methods available in the social
network and big data literature.
This chapter compiles classification methods and applications in Section 4.2, followed
by the clustering methods and applications in Section 4.3. The regression-based concepts
and their application in social networks are discussed in Section 4.4. Finally, the
application of evolutionary algorithms and deep learning methods is discussed in Section 4.5.
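The node-and-edge representation described above can be sketched in a few lines of Python.
This is only an illustrative sketch: the actors, ties, and weights are made-up toy data, and
the helper name build_adjacency is a hypothetical choice, not something taken from the
cited works.

```python
# Hypothetical toy data: (source, target, weight) ties between made-up actors.
edges = [
    ("ana", "ben", 3.0),
    ("ben", "cara", 1.0),
    ("cara", "ana", 2.0),
    ("dev", "ana", 1.0),
]


def build_adjacency(edge_list, directed=True):
    """Return a weighted adjacency map: node -> {neighbor: weight}."""
    adj = {}
    for src, dst, weight in edge_list:
        adj.setdefault(src, {})[dst] = weight
        adj.setdefault(dst, {})  # keep nodes that have no outgoing ties
        if not directed:
            adj[dst][src] = weight  # an undirected tie is stored in both directions
    return adj


directed = build_adjacency(edges, directed=True)
undirected = build_adjacency(edges, directed=False)

# Out-degree counts outgoing ties in the directed variant.
out_degree = {node: len(nbrs) for node, nbrs in directed.items()}
# Strength sums the tie weights around a node in the undirected, weighted variant.
strength = {node: sum(nbrs.values()) for node, nbrs in undirected.items()}

print(out_degree)
print(strength)
```

The same edge list yields the directed or the undirected variant depending on one flag,
which is why the choice of variant is usually made when the network is constructed rather
than during analysis.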
4.2 Application of Classification Models in Social Networks
Classification divides content into chunks of related content. Machine learning
classification is done on data that has labels associated with it. For instance, say a
user has a massive number of emails in an inbox. Classifying those emails based on topics
like work, promotions, and social might help the user to prioritize their work. In this
example, work, social, and promotions are the labels. This process is similar to placing
colored balls in the right baskets with similarly colored balls. In social networks, there
are several applications where classification concepts are instrumental. The following
sections cover several applications reported in the literature, such as spam content
classification, labeling data available in an online social network, medical data
classification, human behavior analysis, and sentiment analysis.
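The email example above can be made concrete with a small text classifier. The sketch below
uses multinomial naive Bayes with add-one smoothing, which is one common choice and not a
method prescribed by this chapter; the training messages and the function names train and
predict are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled emails: (text, label).
TRAINING = [
    ("meeting agenda for project review", "work"),
    ("project deadline moved to friday", "work"),
    ("huge discount sale this weekend", "promotions"),
    ("exclusive sale offer just for you", "promotions"),
    ("your friend tagged you in a photo", "social"),
    ("new friend request and photo comment", "social"),
]


def train(samples):
    """Collect the label and word counts that naive Bayes needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(predict(model, "sale discount offer"))
print(predict(model, "friend photo"))
```

Working in log-probabilities avoids numerical underflow when many small word probabilities
are multiplied together.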
4.2.1 Spam Content Detection
The digital age has resulted in many strategies for businesses to market their products and
pump large amounts of money into their digital marketing. These marketing strategies have
generated large volumes of promotional content dispersed across social media. Most of the
content that reaches users is irrelevant to them. Separating relevant user information
from irrelevant information is called spam content classification. Benevenuto et al.
(2009) identify spam users who spread impure information on YouTube, using real YouTube
users and content. Zhu et al. (2012) proposed a method for spam content classification to
solve the problems in content- and topology-based classification models. The data for the
experiment was taken from China's largest social network, Renren.com. Ahmed and Abulaish
(2013) proposed a statistical method to analyze and filter spam content in Facebook and
Twitter data. The proposed algorithm generates 14 generic statistical features to detect a
user who spreads spam content.
Gender classification in social media data is an important aspect for law enforcement,
targeted advertising, and other social problems. Alowibdi et al. (2013) proposed an
algorithm for classifying profiles as male or female. The algorithm used five features to
classify the gender. The features may be the color of the profile background picture or
the set of text used to post the content on social media. Li and Xu (2014) introduced a
rule-based classification system based on sociology concepts to identify and label emotions
in microblog posts. They used Chinese microblog post data for the experiment. SPADE
(Wang et al., 2014) is a social media classifier that separates spam from useful messages
across a social network. The proposed method is a generic solution for multiple social
networks using cross-domain and associative classification. Bots in a social network
create unrealistic text and spread false information. A classifier that separates human
accounts from bot accounts was designed by Igawa et al. (2016). They use random forest and
multilayer perceptron classifiers to test their model on a set of scraped data related to
the 2014 FIFA World Cup. Tacchini et al. (2017) focus on misinformation detection in social
networks. They used Facebook posts as the data for their experiment. This method uses
logistic regression and a Boolean crowdsourcing algorithm to build the classifier model.
4.2.2 Topic Modeling and Labeling
Topic modeling (TM) is one of the crucial areas of research in big data analytics. TM is a
process where the text content in extensive data is summarized into specific groups. An
example of this method is the grouping of news content into sports, economics, and
politics. This section contains a brief discussion of the literature available on topic
modeling and labeling in social network data. Tuulos and Tirri (2004) used social network
chat room data. They tried to break up the dynamic nature of the chat data and model it
into topics. Location annotation is a critical method to group locations. Ye et al. (2011)
use a support vector machine (SVM) classifier to annotate and tag locations, and finally
assign each location to one of various categories. SocioDim by Tang and Liu (2011) is a
classification model for social media that considers the heterogeneity of the social
network. Wanichayapong et al. (2011) worked on topic modeling with traffic congestion data
from social media and broke it into two categories, point and link. McAuley and Leskovec
(2012) tried to find the interdependencies between images by considering the metadata of
the images on social media. They also considered the social community around each image:
the data for the work came from the comments section below the image, the person who
uploaded the image, and the uploader's friend network.
In addition, social media users add their dining, shopping, and other preferences on social
media. This generated content can help marketing experts recommend products to the
users. The approach in Song et al. (2013) is an iterative learning-based classifier that
learns each user's content and classifies users into different buckets. The algorithm also
takes the content of the user's friends into account and provides a personalized
recommendation. Customer churn prediction is another important aspect in business. Churn
analysis forecasts the loyalty of customers. Verbeke et al. (2014) used real
telecommunication datasets to predict customer churn. The algorithm uses a combination of
relational and nonrelational classification models to predict the churn. Emails are an
important part of everyone's professional life. Classification in emails can be done to
separate spam emails from critical emails and to classify the subject of the mail content.
Alsmadi and Alhami (2015) proposed a method using n-grams to classify spam emails in
English and Arabic. Nowadays many internet users have accounts on multiple online social
network sites. Peled et al. (2016) developed a classifier to match entities between online
social network accounts. They used data collected from Facebook and Xing to experiment with
their classifier. Himelboim et al. (2017) classify Twitter tweets by using the information
in the text and the patterns visible in the network. The authors used the density,
modularity, centrality, and the fraction of independent users in the network to build the
classification model. Earlier works were centered around the users and not on the entire
network structure. Adverse drug reactions are considered to be one of the determinants of
mortality in the medical field. The work in Yang et al. (2015) classifies the experiences
shared by doctors and affected patients on social media, microblogging sites, and forums.
Finally, the data is classified to form a drug reaction database.
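As a small illustration of the n-gram features mentioned above for email classification, the
sketch below extracts word n-grams from a message and counts them. It shows only the feature
extraction step under simple assumptions (whitespace tokenization, lowercasing); it is not
the pipeline of Alsmadi and Alhami (2015), and the function name and sample text are
hypothetical.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams (as tuples) in reading order."""
    tokens = text.lower().split()
    # A text shorter than n tokens yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical message; repeated bigrams become high-count features.
bigram_counts = Counter(word_ngrams("win a free prize win a free trip", 2))
print(bigram_counts.most_common(2))
```

The resulting counts can be fed to any classifier that accepts bag-of-features input.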
4.2.3 Human Behavior Analysis
This type of classification method analyzes the data and groups users according to their
behavior in online social networks. An example of human behavior analysis is inferring a
user's gender from their behavior in online social networks. This section summarizes
state-of-the-art literature that classifies human behavior in social networks. Eleta and
Golbeck (2014) classify patterns of communication on Twitter while considering its
multilingual nature. The work resulted in understanding the global reach of social media
and the flow of multilingual communication in social networks. This work also studied how
multilingual users of Twitter mediate information sharing from a different language. User
personality classification is an essential aspect for a criminology department and also
for business. The work on user personality classification by Lima and de Castro (2014)
takes the group of text shared by the user and learns from it using different machine
learning approaches like naïve Bayes, support vector machine, and multilayer perceptron
neural network. Bayot and Gonçalves (2018) classify the age and gender of the users in a
social network using deep convolutional neural networks (CNNs). Epilepsy is a brain
disorder commonly correlated with abnormal cortical and subcortical functional networks.
Zhang et al. (2011) use functional MRI data that is classified with the help of social
network analysis theories to detect epilepsy.
4.2.4 Sentiment Analysis
Sentiment analysis classifies the users' emotions from the text they share on social media
and microblogging sites. An example of this might be classifying happy, neutral, and
unhappy customers from feedback data. This method aims to understand the content
generated by the user and decide its emotion with computational or statistical methods.
Web technology is among the most significant technological advancements of the past decade.
It changed the way people think and the way they purchase items. Lo and Potdar (2009)
discussed opinion mining and sentiment analysis from the feedback data generated by
users for e-commerce products. Batool et al. (2013) analyzed Twitter data to understand
the emotion of each tweet. The proposed algorithm includes a synonym binder module
and a knowledge enhancement module to classify and summarize the tweets. Sentiment
analysis and classification on Facebook status data was done by Akaichi (2013). This
algorithm builds sentiment lexicons based on emoticons, interjections, and acronyms to
classify the sentiments in the status text. Vázquez et al. (2014) explain the recent trend
among e-commerce customers of looking at the feedback of other people to help them decide
whether to buy a product. This work classifies microblog posts based on the reviews
posted by users. Burnap et al. (2015) experimented with suicide-related communication
using machine learning classification methods on Twitter data. This work classifies text
that refers to suicidal content using the lexical, structural, emotive, and psychological
features extracted from Twitter posts.
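The lexicon idea described above (scoring words and emoticons, then mapping the total to
happy, neutral, or unhappy) can be sketched as follows. The lexicon entries, the scores, and
the one-token negation rule are all illustrative assumptions; this is not the lexicon or
algorithm of any of the cited works.

```python
# Hypothetical mini-lexicon mixing words and emoticons; scores are made up.
LEXICON = {
    "good": 1, "great": 2, "love": 2, "happy": 1, ":)": 1,
    "bad": -1, "terrible": -2, "hate": -2, "sad": -1, ":(": -1,
}
NEGATIONS = {"not", "never", "no"}


def classify_sentiment(text):
    """Return (score, label) where label is happy, neutral, or unhappy."""
    score = 0
    negate_next = False
    for raw in text.lower().split():
        token = raw.strip(".,!?")  # trim trailing punctuation, keep emoticons
        if token in NEGATIONS:
            negate_next = True  # flip the polarity of the next token only
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate_next else value
        negate_next = False
    if score > 0:
        label = "happy"
    elif score < 0:
        label = "unhappy"
    else:
        label = "neutral"
    return score, label


for review in ["I love this phone :)",
               "The delivery was not good :(",
               "It arrived on Monday"]:
    print(review, "->", classify_sentiment(review))
```

A rule this simple misses sarcasm and longer-range negation, which is one reason the
literature surveyed here also turns to learned classifiers.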
4.3 Application of Clustering Models in Social Networks
Clustering is the concept of automatically finding subgroups in massive data. In a social
network the same idea can be called community detection. There are many related works
on community detection in social networks (Belfin et al., 2018; Belfin and Grace Mary
Kanaga, 2018). Grouping methods can be utilized for many applications. One example,
clustering on a real word adjacency graph (Fortunato, 2010), is depicted in Figure 4.4.
The dataset is the adjacency network of popular adjectives and nouns in the book David
Copperfield by Charles Dickens. Some of the other applications of clustering mentioned in
the literature are discussed next.
Figure 4.4 Clustering in the network data using a word adjacency dataset
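One widely used way to judge how well a network has been split into communities is the
modularity score Q, which compares the share of edges inside each community with the share
expected from the node degrees alone. The sketch below computes Q for a given partition of
an undirected, unweighted graph without self-loops. The toy graph and the function name are
hypothetical, and the source does not state that Figure 4.4 was produced with modularity.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Q = sum over communities c of  L_c / m  -  (d_c / (2 m)) ** 2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c."""
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {node: "A" for node in range(1, 7)}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

Splitting the graph at the bridge scores higher than lumping all nodes together, which
matches the intuition that the two triangles form separate communities.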
4.3.1 Recommender Systems
More people are traveling today than ever before, and they often use recommendations
from blogs and forums. Since one person generates the content on a microblogging site, the
recommendations might not be the best for each traveler. The recommender system needs
a learning engine that provides the best recommendation aggregated from the content of
multiple travelers. Cenamor et al. (2017) designed a system that takes previous data,
clusters it into a daily travel plan, and makes personalized recommendations to the user.
Chen et al. (2017) proposed a new recommender system that suggests clustered urban
functional areas with the help of collected building-level social media data. The proposed
work was implemented in the Yuexiu District, Guangzhou, China with the K-values 2 and 4.
These recommender systems can be used for urban planning in smart city projects. Feng et
al. (2015) proposed a personalized movie recommender system that uses the community to
recommend a movie. The community detection in the proposed work is based on association
rule mining. This recommender system was tested with the MovieLens and Netflix datasets.
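Association rule mining, mentioned above, rests on two basic measures: the support of an
itemset and the confidence of a rule. The sketch below computes both on a handful of made-up
viewing histories. It illustrates only these textbook measures and is not the community
detection procedure of Feng et al. (2015); the data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent).
    Assumes the antecedent occurs in at least one transaction."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical viewing histories, one set of movie titles per user.
histories = [
    {"alien", "blade runner", "dune"},
    {"alien", "blade runner"},
    {"alien", "dune"},
    {"blade runner", "dune"},
    {"alien", "blade runner", "dune"},
]

print(support(histories, {"alien", "blade runner"}))
print(confidence(histories, {"alien"}, {"blade runner"}))
```

Rules with high support and confidence link items, and hence the users who share them, which
is one way such rules can hint at groups of like-minded users.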
Table 4.1 Summary of classification applications
Reference | Problem | Dataset | Data type | Classification type
Ye et al. (2011) | Text annotation | Facebook check-in data | Human/Social | Binary support vector machine (SVM) classifier
Song et al. (2013) | Personalized recommendation | Sina Weibo | Human/Social | Gradient descent learning
Batool et al. (2013) | Synonym and knowledge enhancement | Twitter data | Human/Social | Domain-specific learning
Ahmed and Abulaish (2013) | Spam filtering | Facebook and Twitter data | Human/Social | Naive Bayes, JRip, and J48
Akaichi (2013) | Complexities in conveyed texts | Facebook status data | Human/Social | Support vector machine (SVM) and naive Bayes
Li et al. (2014) | All traditional models use statistical methods | Chinese micro-blog posts | Human/Social | Rule based
Vázquez et al. (2014) | Costly sentiment analysis | English and Spanish social media data | Human/Social | Rule based
Lima and de Castro (2014) | Omission of social media metadata | Twitter data | Human/Social | Naive Bayes, SVM, multilayer perceptron neural network
Yang et al. (2015) | Adverse drug reactions (ADRs) | Medhelp website | Web text data | Latent Dirichlet allocation modeling
Igawa et al. (2016) | Text bots | 2014 FIFA World Cup data | Web text data | Random forests and multilayer perceptrons
Tacchini et al. (2017) | Misinformation classifier | Facebook data | Human/Social | Logistic regression
Bayot and Gonçalves (2018) | Gender classification | Adience for age and gender | Human/Social | Deep CNN
4.3.2 Sentiment Analysis
Community detection plays a vital role in analyzing the effect of real-world happenings.
Ou et al. (2017) examined the emotion around an event that occurred in the real world. The
proposed algorithm finds a community, detects the community emotion, aggregates the
community emotion, and detects any community emotion burst. In other cases, most people
are comfortable with the brand they use for a given product. Companies promote their
brands on social media to build their customer base. A brand community enables customers
to know more about products and creates a strong relationship with the customer base.
Habibi et al. (2014) proposed a model for an overlapping brand community that creates
a positive influence and brand trust among the customers.
4.3.3 Information Spreading or Promotion
Information dispersion is one of the significant areas in social network analysis
according to Shaji et al. (2018). Information dispersion studies the flow of information
on a social network. Social network information spreading is used in product promotion.
Target marketing is an area where the marketing is targeted at a group of individuals or a
community. Johnston (2017) proposed a theoretical model in which statutory agencies
leverage social media to communicate with the community. Sitter and Curnew (2016) proposed
an innovative model and described how social media can be used by social workers to share
YouTube videos with community members. Croitoru et al. (2015) studied how to use the big
data generated from social media after an event. Their experiment was carried out with two
real-world datasets from social media. The data used includes the user-generated content
and propagation data after the events Occupy Wall Street in November 2011 and the Boston
Marathon bombing in April 2013. Alsmadi and Alhami (2015) proposed a method that clusters
events on Twitter. The clustered events will be spread across communities. Schirr (2013)
proposed a method for community-based learning and sharing of educational information and
curriculum development for classroom training. Zhou et al. (2012) claimed that social
network communication is community specific and not individual specific. They proposed a
method named COCOMP that shares a message with a community that is similar. Lakkaraju and
Ajmera (2011) introduced a community-based application that predicts the future reach of a
brand or content. Conover et al. (2011) experimented with the political affiliation of
Twitter users using a hidden community structure. Ang (2011) proposed a model of a
community with customer relationship management (CRM) data to use customers to build
products. The model suggests the CRM phases connect, converse, create, and collaborate.
Ebner and Reinhardt (2009) proposed a method to build a scientific community using the
Twitter community.
4.3.4 Geolocation-Specific Applications
Epidemiology is the study of disease outbreaks and how diseases spread. Community studies
can help the health department to quickly find an epidemic and the path of a disease.
Hossain et al. (2016) studied the 2014 Ebola outbreak in Africa and experimented with how a
social media community study can help to contain the spread proactively. Social media is
an essential tool nowadays to report disasters, and social networks also support rescue
teams in locating affected people and areas. Bakillah et al. (2015) proposed a method for
geospatial clustering to spread information during a disaster. The case study used for the
work was Typhoon Haiyan in the Philippines, with data from Twitter. Geolocation
applications are good to work on because they are location specific. Atzmanstorfer et al.
(2014) proposed a citizen-oriented and location-aware spatial planning system. Social
media users from the location-aware community can participate in the discussion,
brainstorm, and implement various planning and functional activities in and around their
location. The case study used for this experiment was the participatory land-zoning
process in the Capital District of Quito, Ecuador.
4.4 Application of Regression Models in Social Networks
Regression is a well-known machine learning technique used for finding relationships
between independent and dependent variables in data. With people's lives intertwined
with social networks, it is natural that human emotions, behaviors, and sentiments
depend on their personal and organizational social networks. Over the past few years
scientists have been trying to figure out how one's social network affects personal
behaviors, emotions, performance, and other human attributes in relation to different
life activities. Regression analysis has been at the forefront of these scientific
explorations. Positional analysis of social networks started in the late 20th century and
intensified in the last decades due to the availability of technology that made data
collection an easy task for researchers. Different interesting problems have been
investigated by scholars, with regression analysis being used as the major instrument for
studying social network data. In this section, we provide a few examples of studies and
show how regression analysis facilitated the understanding of correlations between
different aspects of human nature and social network properties.
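To make the idea of relating one independent variable to one dependent variable concrete,
the sketch below fits a straight line by ordinary least squares and reports the coefficient
of determination. The variable interpretation (a network score as x, a performance rating as
y) and the numbers are hypothetical and are not taken from any study cited in this section.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # share of variance in y explained by x
    return slope, intercept, r_squared


# Hypothetical data: x could be a network score, y a performance rating.
network_score = [1, 2, 3, 4, 5]
performance = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(network_score, performance)
print(slope, intercept, r_squared)
```

A positive slope indicates that higher network scores go with higher ratings in this toy
sample; the studies below use multiple and hierarchical regression, which extend the same
idea to several independent variables.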
4.4.1 Social Network and Human Behavior
Human behavior is a complex output of their psychological and physiological states within
the individual and social contexts. Sometime one’s social network can aect their perfor-
mance in jobs whether individual performance or a group performance. A eld study was
conducted in 2001 by Sparrowe et al. (2001) with 190 employee in 38 dierent groups. These
190 employees are from 5 dierent organizations. The study was conducted over two social
networks between these members on an organization basis. One is an advice network, and
the other is a hindrance network. Using regression analysis, they showed that the indi-
vidual performance is positively and negatively related to the in-degree centrality score of
an individual in the advice network and hindrance network, respectively. Group perfor-
mance was also studied, and they found that the inuence of hindrance network density
is highly negatively signicant for group performance. Both of these networks were con-
structed from the informal relationships between two individuals in a group, and the data
was collected by interviewing all 190 participants. While the advice network was comprised
of the relationships through which employees share resources and information, the hin-
drance network was formed from the negative relationships such as interference, threat,
Trim Size: 170mm x 244mm Single Column Souravde551591 c04.tex V1 - 02/20/2020 1:49pm Page 72
k
k k
k
72 4 Application of Machine Learning in the Social Network
sabotage, and rejections. In a similar way, Collins and Clark (2003) worked with the top
management networks of technology firms to study their effect on firm performance in terms
of sales growth and stock returns. In this case, the social network was formed with
the top management and their internal and external contacts. Instead of a person-to-person
network, this network used weighted links between the members of the top management
team and different internal departments such as sales and marketing, research and
development, etc., and external parties such as suppliers, financial institutions, customers,
etc. The weight of the links in these person-to-department networks was based on the number
of contacts, time span of interactions, and intensity of their relations as reported by the
management team members through a survey. Hierarchical regression was performed to
find the relationship between these networks and the firm’s growth.
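The kind of analysis described for the advice network can be sketched in a few lines: compute each person’s in-degree centrality, then regress a performance score on it. The example below is a minimal sketch, not the method of Sparrowe et al. (2001); the five-person network, the performance ratings, and the helper names in_degree_centrality and simple_ols are all assumptions made for illustration.

```python
# Toy advice network: an edge (a, b) means "a goes to b for advice".
# Names, edges, and ratings are invented for illustration only.
advice_edges = [("B", "A"), ("C", "A"), ("D", "A"), ("E", "A"),
                ("C", "B"), ("D", "B"), ("E", "C")]
performance = {"A": 5.0, "B": 4.0, "C": 3.5, "D": 3.0, "E": 3.0}


def in_degree_centrality(edges, nodes):
    """In-degree of each node divided by the number of other nodes (n - 1)."""
    counts = {node: 0 for node in nodes}
    for _source, target in edges:
        counts[target] += 1
    return {node: counts[node] / (len(nodes) - 1) for node in nodes}


def simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


nodes = sorted(performance)
centrality = in_degree_centrality(advice_edges, nodes)
intercept, slope = simple_ols([centrality[node] for node in nodes],
                              [performance[node] for node in nodes])
print(centrality)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A positive slope in this toy data mirrors the direction of the reported advice-network result; a hindrance network would be analyzed the same way with its own edge list.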
Network measures used as the independent variables included network size, network
range (Powell and Brantley, 1992; Scott, 2000), and the strength of ties (Granovetter,
1973). The regression results showed that the range and strength of an external network were
significantly related to a firm’s sales growth and stock returns, but the size of the external
network had no significant effect. By contrast, the size of an internal network
was significant for sales growth but not for stock returns, while the range
of an internal network was significantly related to stock returns but not to sales
growth. In an interesting research work, Cimenler et al. (2014) tried to find a correlation
between researchers’ social network metrics and their citation performance.
In this work, they collected four different social networks of 100 researchers from the
College of Engineering at the University of South Florida. These networks included a personal
communication network, a joint grant network, a co-authorship network, and a joint patent
network. The H-index was taken as the dependent variable characterizing citation
performance. Seven network measures, specifically degree centrality, closeness
centrality, betweenness centrality, eigenvector centrality, average tie strength, Burt’s
efficiency coefficient, and local clustering coefficient, were taken as independent variables.
In addition, researchers’ demographic attributes such as gender, race, and department
were taken as input variables. With this large attribute set, they ran a separate
bivariate Poisson regression model for each attribute obtained from the four social
networks. They found that degree, closeness, eigenvector, and betweenness centrality, average
tie strength, and local clustering coefficient of the co-authorship network have a statistically
significant effect on citation performance. Degree, closeness, and betweenness centrality,
average tie strength, and efficiency coefficient of the patent network, and only degree,
closeness, eigenvector centrality, and local clustering coefficient of the grant proposal
network, have a significant positive effect on citation performance. Interestingly, for the
communication network, only closeness
and eigenvector centrality had a statistically significant effect on citation performance.
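Poisson regression suits count outcomes such as the H-index because it models the logarithm of the expected count as a linear function of the predictor. The sketch below fits a one-predictor model by Newton’s method. It is only an illustration of the model family named above: the degree and H-index values are invented, and fit_poisson is a hypothetical helper, not code from Cimenler et al. (2014).

```python
import math


def fit_poisson(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton's method; return (b0, b1)."""
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Entries of the negated Hessian (a symmetric 2x2 matrix).
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: co-authorship degree and H-index of eight researchers.
degree = [1, 2, 3, 4, 5, 6, 8, 10]
h_index = [2, 3, 3, 5, 6, 8, 12, 20]
b0, b1 = fit_poisson(degree, h_index)
print(f"b0={b0:.3f}, b1={b1:.3f}")
print(f"each extra co-author multiplies the expected H-index by {math.exp(b1):.3f}")
```

In practice a statistics package would also report standard errors and significance tests, which this sketch omits.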
The preceding paragraphs show how a person’s performance can be enhanced or
reduced by their social position (centrality) in their personal and work social networks.
Now we will see how perception within the social network can change a person’s attitude toward
different events. Tucker (2014) studied an interesting phenomenon of human behavior
using regression. Tucker reported that when people think that the social network platform
honors their privacy by providing some software configuration, they are more
prone to accept personalized content even though every other parameter of the
personalized content remains the same. The study was conducted over the social network platform
Facebook to see users’ responses to personalized advertisements and media from a few
NGOs. Fortunately for the study, in the middle of the campaign, Facebook introduced a privacy
control on the platform. Regression analysis of the ad click-through before and after the
introduction of the privacy control showed a different pattern. It showed that people are more
responsive to personalized content after the introduction of a privacy control. Hence, it
provides evidence for the idea that perception can really affect our responses in a network.
In another experiment, Paluck et al. (2016) worked with 24,191 students in 56
schools to support the theory of human behavior that states that one’s behavior adjusts
toward societal norms. In this experiment, the students’ social networks were constructed
by surveying the 56 schools. Then they selected a few students from randomly
chosen schools and trained them as an anti-conflict squad. Linear regression was
performed over the data collected during one year of study. The result of the regression
analysis shows a more than 30% reduction in per-student conflicts in the schools where
seed students played the role of anti-conflict agents. What is more interesting is that
where the seed sets were chosen from the social referents, a greater reduction in conflict
was visible. Another interesting problem of peer influence on human behavior was studied
by Bapna and Umyarov (2015). This experiment was conducted over the large-scale online
music social network Last.fm. The network contains more than 23 million friendship links
and 3.8 million users. They scanned several snapshots and extracted the user subscription
data. In addition, demographic information and social activity reports were
collected from the website. Logistic regression was performed with this massive data. The
results show that once a person subscribes to a premium service, the chance of subscription
increases in that person’s neighborhood. Thus, peer influence has a statistically and
economically significant causal effect. In addition, the regression revealed that the strength
of the peer influence is inversely related to the size of the friendship circle.
4.4.2 Emotion Contagion through Social Networks
Emotion contagion is an interesting research problem built on the idea that human emotions such
as happiness, loneliness, and depression can be transferred from person to person. Evidence
has been found that two socially connected individuals have similar emotions. However,
the cause of this similarity may be either contagion or homophily. Coviello et al.
(2014) conducted experiments with a massive amount of Facebook data to see whether
emotions diffused through the friendship links in online social networks. Regression with
an instrumental variable was used to detect emotional contagion in the network. They
chose rainfall as the instrument, and two different regressions were used to establish the
hypotheses that (i) rainfall is correlated with negative emotions in human beings and (ii) these
negative emotions diffuse to geographically distant friends through online social
networks. Although they found evidence of social contagion of emotions, the ratio of the indirect
to the direct effect of rainfall was quite low compared to economic or political contagion.
A similar experiment with a massive amount of Facebook data was conducted by Kramer
et al. (2014). This experiment was conducted with 689,003 Facebook users. The controlled
experiments were done by reducing friends’ positive and negative emotional posts in
the users’ news feeds. Poisson regression was performed with the percentage of
reduction as a regression weight. An interesting finding of this regression analysis was that
omitting emotional content reduced the number of words the person subsequently
produced, irrespective of the type of emotion. Later they performed weighted linear
regression to show that when positive content was omitted, negative emotional posts
increased whereas positive posts decreased. The reverse was found to be true when negative
content was omitted. Thus, they found that human emotion is contagious over the online
platform Facebook.
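The role of an instrumental variable in separating contagion from homophily can be illustrated with a small worked example. In the sketch below, rainfall in a friend’s city is the instrument: it shifts the friend’s negative posts but is unrelated to the shared, unobserved traits that make friends alike. All numbers are invented, the true contagion effect is set to 2 by construction, and the simple ratio-of-covariances estimator stands in for the more elaborate models of Coviello et al. (2014).

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


# Hypothetical data for four friend pairs living in different cities.
rain = [0, 0, 1, 1]      # instrument: rain in the friend's city
shared = [0, 1, 0, 1]    # unobserved homophily, uncorrelated with rain
friend_neg = [r + s for r, s in zip(rain, shared)]
# True contagion effect is 2; the shared trait adds a confounding term.
user_neg = [2 * f + 3 * s for f, s in zip(friend_neg, shared)]

# Naive regression of the user's posts on the friend's posts is biased
# because the shared trait drives both variables.
ols_slope = covariance(friend_neg, user_neg) / covariance(friend_neg, friend_neg)

# Instrumental-variable estimate: only the variation caused by rain is used.
iv_slope = covariance(rain, user_neg) / covariance(rain, friend_neg)

print(f"naive slope = {ols_slope}, IV slope = {iv_slope}")
```

On this constructed data the naive slope overstates the effect, while the instrumental-variable estimate recovers the value built into the data.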
4.4.3 Recommender Systems in Social Networks
Recommender systems try to predict a user’s affinity to a product or service based on either
the user’s or similar users’ past experiences (collaborative filtering) or the attributes of
similar products (content-based filtering) in a social network. With the growth of social media,
network recommender systems have become more relevant in recent times.
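Collaborative filtering as described above can be sketched with a user-based scheme: find users whose past ratings resemble the target user’s, then score unseen items by similarity-weighted ratings. The ratings table and the helper names cosine and recommend are assumptions for illustration; production systems use far larger data and more refined models.

```python
import math

# Hypothetical ratings (1 to 5) given by three users to four items.
ratings = {
    "alice": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 4, "c": 5},
    "carol": {"a": 1, "b": 2, "d": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(u) & set(v)
    dot = sum(u[item] * v[item] for item in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice", ratings))
```

Because alice’s ratings resemble bob’s more than carol’s, the item that only bob rated is ranked ahead of the item that only carol rated.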
Collaborative topic regression (CTR) (Wang and Blei, 2011) combines both
techniques to recommend the topics most relevant to a user. Purushotham et al. (2012)
went a bit further and integrated CTR with the social matrix factorization model, which
takes the social relations of users into account. The main motivation for the
idea came from the fact that social relations form between two users because they have
similarities. Thus, incorporating social correlations can improve the accuracy of the
recommendations. They experimented with two real-world online social networks: the online
music station Last.fm and the online bookmark sharing platform Delicious. One of the
challenges for accurate recommender systems is identifying the geographical location of the user.
McGee et al. (2013) worked in this direction to predict a user’s location based on tie
strength. The study was conducted on Twitter, and decision tree regression was used to improve
prediction.
Very recently, Tacchini et al. (2017) worked on an interesting project where they
tried to answer the question “Can a hoax be identified based on the users who liked it?”.
The authors proposed a logistic regression-based technique to classify a post as a hoax from
user activities (likes) on that post. The experiment was conducted on a large amount of
Facebook data collected during 2016. A very interesting fact about the user activities is
that on average hoax posts have more likes than nonhoax posts.
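The idea of classifying a post from the users who liked it can be sketched with a small logistic regression in which each user is one binary feature. This is a toy illustration under stated assumptions, not the pipeline of Tacchini et al. (2017): the four users, the six labeled posts, and the functions features, train, and hoax_probability are all hypothetical.

```python
import math

USERS = ["u0", "u1", "u2", "u3"]

# Hypothetical training posts: (set of users who liked the post, label),
# where label 1 marks a hoax and label 0 a non-hoax.
posts = [({"u0", "u1"}, 1), ({"u0"}, 1), ({"u1"}, 1),
         ({"u2", "u3"}, 0), ({"u2"}, 0), ({"u3"}, 0)]


def features(likers):
    """One binary feature per user: 1.0 if that user liked the post."""
    return [1.0 if user in likers else 0.0 for user in USERS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(posts, epochs=500, learning_rate=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    weights = [0.0] * len(USERS)
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(USERS)
        grad_b = 0.0
        for likers, label in posts:
            x = features(likers)
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = sigmoid(z) - label
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(posts)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(posts)
    return weights, bias


def hoax_probability(likers, weights, bias):
    x = features(likers)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


weights, bias = train(posts)
print(hoax_probability({"u0", "u1"}, weights, bias))
print(hoax_probability({"u2"}, weights, bias))
```

Users who mostly like hoax posts receive positive weights, so a new post liked by them is scored as a likely hoax even though its content is never inspected.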
4.5 Application of Evolutionary Computing and Deep
Learning in Social Networks
Deep learning is a growing machine learning technique. It is a hierarchical learning
technique that learns the structures inside the data. At its core, deep learning is a feed-forward
artificial neural network with many hidden layers. Evolutionary computing, on the other hand,
is a family of global optimization techniques inspired by biological evolution. Both
of these tools have been used to learn from and optimize over social network data. In this
section, we provide a few examples where deep learning and evolutionary computing have been
used to solve research issues in social networks.
4.5.1 Evolutionary Computing and Social Network
One of the first attempts to use genetic algorithms with social network analysis was by
Wilson and Banzhaf (2009). The research was conducted over the large email
communication dataset of Enron Corporation. The main objective of their study was to find
the key players among the 151 employees of the organization. The social network was
integrated into the genetic algorithm through its fitness function, which was derived
from social network measures such as degree, density,
and proximity prestige.
Community detection (Kundu and Pal, 2015b) is one of the most important problems of
social networks where evolutionary algorithms have been effectively used. One such study
was conducted by Gong et al. (2012). In this work, a multi-objective evolutionary algorithm
was used to optimize two important properties of the communities. They simultaneously
maximized the internal link density and minimized the density of links between
communities. They used a modified version of the multi-objective evolutionary algorithm based
on decomposition proposed by Qingfu Zhang and Hui Li (2007). Liu et al. (2014) used a
multi-objective evolutionary algorithm to detect communities in a signed network. A signed
social network is a network where both friend and foe relationships are present. This
algorithm tried to optimize two contradictory objectives of a community. The algorithm
maximizes positive links within a community while minimizing the negative links from
it. Very recently, Rizman Žalik (2019) used a multi-objective genetic algorithm to detect
communities. Here, both objective functions were minimized to get the end results.
These objective functions were based on the node’s centrality measure and ratio of edges.
To use the genetic algorithm in community detection, they modified different steps such as
initialization, mutation, and crossover of the genetic algorithm.
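The two competing objectives that such multi-objective algorithms trade off can be made concrete by scoring a candidate partition. The sketch below uses plain link densities as a simplified stand-in; the cited papers define their own objective functions, and the seven-edge graph and the function link_densities are assumptions for illustration.

```python
def link_densities(edges, communities):
    """Return (internal density, between-community density) of a partition.

    Each density is the number of edges of that kind divided by the number
    of node pairs of that kind. The partition must contain at least one
    community with two or more nodes and at least two communities.
    """
    membership = {node: index
                  for index, members in enumerate(communities)
                  for node in members}
    internal = sum(1 for u, v in edges if membership[u] == membership[v])
    external = len(edges) - internal
    node_count = len(membership)
    internal_pairs = sum(len(c) * (len(c) - 1) // 2 for c in communities)
    external_pairs = node_count * (node_count - 1) // 2 - internal_pairs
    return internal / internal_pairs, external / external_pairs


# Hypothetical graph: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]

good = link_densities(edges, [{1, 2, 3}, {4, 5, 6}])
bad = link_densities(edges, [{1, 2, 4}, {3, 5, 6}])
print("good partition:", good)
print("bad partition:", bad)
```

An evolutionary algorithm would evaluate many candidate partitions with scores of this kind and keep those that are not dominated on both objectives.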
4.5.2 Deep Learning and Social Networks
Deep learning was first used in social networks by Perozzi et al. (2014). In this work, the
authors used deep learning to represent social graphs with a latent representation in a
continuous vector space. This allows other well-known statistical and machine learning models
to be used easily with social network data. To learn the social representation, they used a
stream of short random walks. Nikfarjam et al. (2015) used deep learning
techniques to analyze user posts in social networks. Their objective was to learn about adverse
drug reactions. Deep learning tools were mainly used to interpret natural language and
automatically classify unlabeled user posts. Li et al. (2014) used the conditional temporal
restricted Boltzmann machine to predict future links in dynamic social networks. The
conditional temporal restricted Boltzmann machine is derived from the original restricted
Boltzmann machine (Hinton and Salakhutdinov, 2006). Multiple snapshots of the network
at different timestamps were used as the input to the model, and the nodes’ transitional
patterns and the influence of local neighbors were used as the conditional and temporal
properties of the model.
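The stream of short random walks mentioned above can be generated with very little code. The sketch below covers only the walk-generation step; in DeepWalk the walks are then treated like sentences and passed to a word-embedding model, which is not shown here. The small friendship graph, the parameter values, and the function random_walks are assumptions for illustration.

```python
import random


def random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate fixed-length random walks starting from every node."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical undirected friendship graph stored as adjacency lists.
adjacency = {
    "ann": ["bob", "cid"],
    "bob": ["ann", "cid"],
    "cid": ["ann", "bob", "dee"],
    "dee": ["cid", "eve"],
    "eve": ["dee"],
}

for walk in random_walks(adjacency, walks_per_node=2, walk_length=5)[:3]:
    print(" -> ".join(walk))
```

Nodes that appear in similar walk contexts end up with similar vectors once an embedding model is trained on the walks, which is what makes the representation useful for downstream statistical models.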
Table 4.2 Summary of Clustering Applications
Reference | Problem | Dataset | Data Type | Clustering Type
(Chen et al. 2017) | Clustering urban functional areas | Building-level social media data, Yuexiu District, Guangzhou, China | Human/social | K-Medoids
(Feng et al. 2015) | Dynamic user interests | MovieLens and Netflix datasets | Movie data | Time-weighted association rule mining
(Alsaedi and Burnap 2015) | Events clustering | Twitter | Social | Online clustering method
(Bakillah, Li, and Liang 2015) | Geolocated communities | Twitter: typhoon Haiyan in the Philippines | Social | Spatial clustering
(Habibi, Laroche, and Richard 2014) | Influence brand trust | Ecommerce social media data | Social | Overlapping community
(Atzmanstorfer et al. 2014) | GeoCitizen platform | Case study: Capital District of Quito, Ecuador | Geo Social | Spatial clustering
(Conover et al. 2011) | Cluster political affiliation | Twitter | Human/social | Latent semantic analysis
(Ebner and Reinhardt 2009) | Scientific community | Twitter | Human/social | Online communities

Hate speech classification is one field where deep learning is known to perform well. However,
a study conducted by Aroyehun and Gelbukh (2018) concluded the opposite. The study
objective was to compare different deep learning techniques to identify aggression or hate
speech against the baseline support vector machine with naive Bayes (Wang and Manning,
2012). The other goal of the study was to see the performance of different deep neural
networks in the presence of varying sizes of data. They found that on average the deep learning
techniques needed more data points to perform better than the baseline SVM algorithm. In
another study, an interesting problem related to influence maximization (Pal et al., 2014)
was attempted by Qiu et al. (2018). Here the deep learning technique was used to predict
user actions for neighbors in the network, which in turn provided a way to predict a user’s
influence in the network. The experiment was conducted over a large-scale online social
network and demonstrated its applicability in profiling the influence of a node in a social
network.
4.6 Summary
In recent times, social networks have been changing the way people operate. As discussed
in the chapter, social network usage happens in almost all areas of life. Some of the
applications discussed will be an eye-opener for many researchers and may bring in many
interdisciplinary applications in the future. The literature discussed was summarized in
tables. The classification applications were summarized in Table 4.1, the clustering
applications in social media were summarized in Table 4.2, and finally, the application of
regression in social networks was summarized in Table 4.3.

Table 4.3 Summary of Regression Applications
Reference | Problem | Dataset | Data Type | Regression Type
Sparrowe et al. (2001) | Individual performance and group performance in an employee advice and hindrance network | 190 employees, 38 groups, 5 organizations | Human/Social | Simple
Collins and Clark (2003) | Firms’ performance based on top management social network | 73 companies, avg. empl. 1,742 | Human/Social | Hierarchical
Tucker (2014) | Personalized advertising and privacy controls | 1.2 million Facebook users | Online | Logistic
Purushotham et al. (2012) | Recommendation systems | Lastfm: 1,892 users; Delicious: 1,867 users | Online | Collaborative topic regression
Cimenler et al. (2014) | Researchers’ citation performance based on their social network | 100 researchers, 4 different social networks | Human/Social | Poisson
Coviello et al. (2014) | Emotion contagion | Massive Facebook data | Online | With instrumental variables
Kramer et al. (2014) | Emotion contagion | Facebook with 689,003 users | Online | Poisson
Bapna and Umyarov (2015) | Peer influence in a music website | Last.fm: 3.8m users, 23m edges | Online | Logistic
Paluck et al. (2016) | Reducing the conflict between students using SNA | 24,191 students, 56 schools | Human/Social | Linear & least-square
Tacchini et al. (2017) | Hoax post identification | Facebook | Online | Logistic
Acknowledgments
Suman Kundu acknowledges the National Science Center, Poland, for the grant
2016/23/B/ST6/01735.
References
Ahmed, F. and Abulaish, M. (2013) A generic statistical approach for spam detection in Online
Social Networks. Computer Communications, 36 (10-11), 1120–1129,
doi:10.1016/j.comcom.2013.04.004.
Akaichi, J. (2013) Social networks’ Facebook’ statutes updates mining for sentiment
classification, in Proceedings - SocialCom/PASSAT/BigData/EconCom/BioMedCom 2013, pp.
886–891, doi:10.1109/SocialCom.2013.135.
Alowibdi, J.S., Buy, U.A., and Yu, P. (2013) Language independent gender classification on
Twitter, in Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social
Networks Analysis and Mining - ASONAM ’13, pp. 739–743, doi:10.1145/2492517.2492632.
Alsmadi, I. and Alhami, I. (2015) Clustering and classification of email contents. Journal of
King Saud University - Computer and Information Sciences, 27 (1), 46–57,
doi:10.1016/j.jksuci.2014.03.014.
Ang, L. (2011) Community relationship management and social media. Journal of Database
Marketing and Customer Strategy Management, 18 (1), 31–38, doi:10.1057/dbm.2011.3.
Aroyehun, S.T. and Gelbukh, A. (2018) Aggression Detection in Social Media: Using Deep
Neural Networks, Data Augmentation, and Pseudo Labeling, in Proceedings of the First
Workshop on Trolling, Aggression and Cyberbullying, pp. 90–97.
Atzmanstorfer, K., Resl, R., Eitzinger, A., and Izurieta, X. (2014) The GeoCitizen-approach:
Community-based spatial planning - An Ecuadorian case study. Cartography and
Geographic Information Science, 41 (3), 248–259, doi:10.1080/15230406.2014.890546.
Bakillah, M., Li, R.Y., and Liang, S.H. (2015) Geo-located community detection in Twitter with
enhanced fast-greedy optimization of modularity: the case study of typhoon Haiyan.
International Journal of Geographical Information Science, 29 (2), 258–279,
doi:10.1080/13658816.2014.964247.
Bapna, R. and Umyarov, A. (2015) Do Your Online Friends Make You Pay? A Randomized
Field Experiment on Peer Influence in Online Social Networks. Management Science, 61 (8),
1902–1920, doi:10.1287/mnsc.2014.2081.
Batool, R., Khattak, A.M., Maqbool, J., and Lee, S. (2013) Precise tweet classification and
sentiment analysis, in 2013 IEEE/ACIS 12th International Conference on Computer and
Information Science, ICIS 2013 - Proceedings, pp. 461–466, doi:10.1109/ICIS.2013.6607883.
Bayot, R.K. and Gonçalves, T. (2018) Age and gender classification of tweets using
convolutional neural networks, in Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
vol. 10710 LNCS, pp. 337–348, doi:10.1007/978-3-319-72926-8_28.
Belfin, R.V., Grace Mary Kanaga, E., and Bródka, P. (2018) Overlapping community detection using
superior seed set selection in social networks. Computers and Electrical Engineering,
doi:10.1016/j.compeleceng.2018.03.012.
Belfin, R.V. and Grace Mary Kanaga, E. (2018) Parallel seed selection method for overlapping
community detection in social network. Scalable Computing, doi:10.12694/scpe.v19i4.1429.
Benevenuto, F., Rodrigues, T., Almeida, J., Gonçalves, M., and Almeida, V. (2009) Detecting
spammers and content promoters in online video social networks, in Proceedings - IEEE
INFOCOM, doi:10.1109/INFCOMW.2009.5072127.
Burnap, P., Colombo, W., and Scourfield, J. (2015) Machine Classification and Analysis of
Suicide-Related Communication on Twitter, in Proceedings of the 26th ACM Conference on
Hypertext & Social Media - HT ’15, pp. 75–84, doi:10.1145/2700171.2791023. 0305058.
Cenamor, I., de la Rosa, T., Núñez, S., and Borrajo, D. (2017) Planning for tourism routes using
social networks. Expert Systems with Applications, 69, 1–9, doi:10.1016/j.eswa.2016.10.030.
Chen, Y., Liu, X., Li, X., Liu, X., Yao, Y., Hu, G., Xu, X., and Pei, F. (2017) Delineating urban
functional areas with building-level social media data: A dynamic time warping (DTW)
distance based k-medoids method. Landscape and Urban Planning, 160, 48–60.
Cimenler, O., Reeves, K.A., and Skvoretz, J. (2014) A regression analysis of researchers’ social
network metrics on their citation performance in a college of engineering. Journal of
Informetrics, 8 (3), 667–682, doi:10.1016/j.joi.2014.06.004.
Collins, C.J. and Clark, K.D. (2003) Strategic human resource practices, top management team
social networks, and firm performance: The role of human resource practices in creating
organizational competitive advantage, doi:10.2307/30040665.
Conover, M.D., Gonçalves, B., Ratkiewicz, J., Flammini, A., and Menczer, F. (2011) Predicting
the political alignment of twitter users, in Proceedings - 2011 IEEE International Conference
on Privacy, Security, Risk and Trust and IEEE International Conference on Social Computing,
PASSAT/SocialCom 2011, doi:10.1109/PASSAT/SocialCom.2011.34.
Coviello, L., Sohn, Y., Kramer, A.D., Marlow, C., Franceschetti, M., Christakis, N.A., and
Fowler, J.H. (2014) Detecting emotional contagion in massive social networks. PLoS ONE,
9 (3), e90315, doi:10.1371/journal.pone.0090315.
Croitoru, A., Wayant, N., Crooks, A., Radzikowski, J., and Stefanidis, A. (2015) Linking cyber
and physical spaces through community detection and clustering in social media feeds.
Computers, Environment and Urban Systems, 53, 47–64,
doi:10.1016/j.compenvurbsys.2014.11.002.
Ebner, M. and Reinhardt, W. (2009) Social networking in scientific conferences Twitter as tool
for strengthen a scientific community, in telearnnoekaleidoscopeorg, vol. 2, pp. 1–8.
Eleta, I. and Golbeck, J. (2014) Multilingual use of Twitter: Social networks at the language
frontier. Computers in Human Behavior, 41, 424–432, doi:10.1016/j.chb.2014.05.005.
Feng, H., Tian, J., Wang, H.J., and Li, M. (2015) Personalized recommendations based on
time-weighted overlapping community detection. Information and Management, 52 (7),
789–800, doi:10.1016/j.im.2015.02.004.
Fortunato, S. (2010) Community detection in graphs. Physics Reports, 486 (3-5), 75–174,
doi:10.1016/j.physrep.2009.11.002.
Gong, M., Ma, L., Zhang, Q., and Jiao, L. (2012) Community detection in networks by using
multiobjective evolutionary algorithm with decomposition. Physica A: Statistical Mechanics
and its Applications, 391 (15), 4050–4060, doi:10.1016/j.physa.2012.03.021.
Granovetter, M.S. (1973) The strength of weak ties. American Journal of Sociology, 78 (6),
1360–1380.
Habibi, M.R., Laroche, M., and Richard, M.O. (2014) The roles of brand community and
community engagement in building brand trust on social media. Computers in Human
Behavior, 37, 152–161, doi:10.1016/j.chb.2014.04.016.
Himelboim, I., Smith, M.A., Rainie, L., Shneiderman, B., and Espina, C. (2017) Classifying
Twitter Topic-Networks Using Social Network Analysis. Social Media + Society, 3 (1),
205630511769154, doi:10.1177/2056305117691545.
Hinton, G.E. and Salakhutdinov, R.R. (2006) Reducing the dimensionality of data with neural
networks. Science (New York, N.Y.), 313 (5786), 504–7, doi:10.1126/science.1127647.
Hossain, L., Kam, D., Kong, F., Wigand, R.T., and Bossomaier, T. (2016) Social media in Ebola
outbreak. Epidemiology and Infection, 144 (10), 2136–2143, doi:10.1017/S095026881600039X.
Igawa, R.A., Barbon, S., Paulo, K.C.S., Kido, G.S., Guido, R.C., Júnior, M.L.P., and da Silva, I.N.
(2016) Account classification in online social networks with LBCA and wavelets.
Information Sciences, 332, 72–83, doi:10.1016/j.ins.2015.10.039.
Johnston, J. (2017) Courts’ use of social media: A community of practice model. International
Journal of Communication, 11, 669–683, doi:10.1021/am504320h.
Kramer, A.D.I., Guillory, J.E., and Hancock, J.T. (2014) Experimental evidence of massive-scale
emotional contagion through social networks. Proceedings of the National Academy of
Sciences, 111 (24), 8788–8790, doi:10.1073/pnas.1320040111.
Kundu, S. and Pal, S.K. (2015a) FGSN: Fuzzy Granular Social Networks - Model and
applications. Information Sciences, 314, 100–117, doi:10.1016/j.ins.2015.03.065.
Kundu, S. and Pal, S.K. (2015b) Fuzzy-rough community in social networks. Pattern
Recognition Letters, 67, 145–152, doi:10.1016/j.patrec.2015.02.005.
Kundu, S. and Pal, S.K. (2018) Double bounded rough set, tension measure, and social link
prediction. IEEE Transactions on Computational Social Systems, 5 (3), 841–853,
doi:10.1109/TCSS.2018.2861215.
Lakkaraju, H. and Ajmera, J. (2011) Attention prediction on social media brand pages, in
Proceedings of the 20th ACM international conference on Information and knowledge
management - CIKM ’11, p. 2157, doi:10.1145/2063576.2063915.
Li, W. and Xu, H. (2014) Text-based emotion classification using emotion cause extraction.
Expert Systems with Applications, 41 (4 PART 2), 1742–1749, doi:10.1016/j.eswa.2013.08.073.
Li, X., Du, N., Li, H., Li, K., Gao, J., and Zhang, A. (2014) A Deep Learning Approach to Link
Prediction in Dynamic Networks, in Proceedings of the 2014 SIAM International Conference
on Data Mining, Society for Industrial and Applied Mathematics, Philadelphia, PA, pp.
289–297, doi:10.1137/1.9781611973440.33.
Lima, A.C.E. and de Castro, L.N. (2014) A multi-label, semi-supervised classification approach
applied to personality prediction in social media. Neural Networks, 58, 122–130,
doi:10.1016/j.neunet.2014.05.020.
Liu, C., Liu, J., and Jiang, Z. (2014) A multiobjective evolutionary algorithm based on similarity
for community detection from signed social networks. IEEE Transactions on Cybernetics,
44 (12), 2274–2287, doi:10.1109/TCYB.2014.2305974.
Lo, Y.W. and Potdar, V. (2009) A review of opinion mining and sentiment classification
framework in social networks, in 2009 3rd IEEE International Conference on Digital
Ecosystems and Technologies, DEST ’09, pp. 396–401, doi:10.1109/DEST.2009.5276705.
McAuley, J. and Leskovec, J. (2012) Image labeling on a network: Using social-network
metadata for image classification, in Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
vol. 7575 LNCS, pp. 828–841, doi:10.1007/978-3-642-33765-9_59. 1207.3809.
McGee, J., Caverlee, J., and Cheng, Z. (2013) Location prediction in social media based on tie
strength, in Proceedings of the 22nd ACM international conference on
information & knowledge management - CIKM ’13, pp. 459–468,
doi:10.1145/2505515.2505544. 1111.2904.
Narayanam, R. and Narahari, Y. (2011) A Shapley value-based approach to discover inuential
nodes in social networks. IEEE Transactions on Automation Science and Engineering,8(1),
130–147.
Nikfarjam, A., Sarker, A., O’Connor, K., Ginn, R., and Gonzalez, G. (2015) Pharmacovigilance
from social media: mining adverse drug reaction mentions using sequence labeling with
word embedding cluster features. Journal of the American Medical Informatics Association,
22 (3), 671–681, doi:10.1093/jamia/ocu041.
Ou, G., Chen, W., Wang, T., Wei, Z., Li, B., Yang, D., and Wong, K.F. (2017) Exploiting
Community Emotion for Microblog Event Detection, in Social Media Content Analysis,pp.
439–456, doi:10.1142/9789813223615_0027.
Pal, S.K. and Kundu, S. (2017) Granular Social Network: Model and Applications, in Handbook
of Big Data Technologies (eds A.Y. Zomaya and S. Sakr), Springer International Publishing,
Cham, pp. 617–651, doi:10.1007/978-3-319-49340-4_18.
Pal, S.K., Kundu, S., and Murthy, C.A. (2014) Centrality measures, upper bound, and inuence
maximization in large scale directed social networks. Fundamenta Informaticae,130 (3),
317–342.
Paluck, E.L., Shepherd, H., and Aronow, P.M. (2016) Changing climates of conict: A social
network experiment in 56 schools. Proceedings of the National Academy of Sciences of the
United States of America,113 (3), 566–71, doi:10.1073/pnas.1514483113.
Peled, O., Fire, M., Rokach, L., and Elovici, Y. (2016) Matching entities across online social
networks. Neurocomputing,210, 91–106, doi:10.1016/j.neucom.2016.03.089.
Perozzi, B., Al-Rfou,R., and Skiena, S. (2014) DeepWalk: Online Learning of Social
Representations. Proceedings of the 20th ACM SIGKDD international conference on
Knowledge discovery and data mining - KDD ’14, pp. 701–710, doi:10.1145/2623330.2623732.
Powell, W.W. and Brantley, P. (1992) Competitive Cooperation in Biotechnology: Learning
through Networks?, in Networks and Organizations: Structure, Form, and Action,Harvard
Business School Press, Boston, pp. 366–394.
Purushotham, S., Liu, Y., and Kuo, C.C.J. (2012) Collaborative Topic Regression with Social
Matrix Factorization for Recommendation Systems, in Proceedings of the 29th International
Confer- ence on Machine Learning, Edinburgh, pp. 759–766,
doi:10.1016/j.jhydrol.2004.11.010. 1206.4684.
Qingfu Zhang and Hui Li (2007) MOEA/D: A Multiobjective Evolutionary Algorithm Based on
Decomposition. IEEE Transactions on Evolutionary Computation,11 (6), 712–731,
doi:10.1109/TEVC.2007.892759.
Qiu, J., Tang, J., Ma, H., Dong, Y., Wang, K., and Tang, J. (2018) DeepInf: Social Inuence
Prediction with Deep Learning, in Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining - KDD ’18, ACM Press, New York, New
York, USA, pp. 2110–2119, doi:10.1145/3219819.3220077.
Rizman Žalik, K. (2019) Evolution Algorithm for Community Detection in Social Networks
Using Node Centrality, pp. 73–87, doi:10.1007/978-3-319-77604-0_6.
Schirr, G.R. (2013) Community-Sourcing a New Marketing Course: Collaboration in Social
Media. Marketing Education Review, 23 (3), 225–240, doi:10.2753/MER1052-8008230302.
Scott, J. (2000) Social network analysis: a handbook, SAGE Publications.
Shaji, A., Belfin, R., and Grace Mary Kanaga, E. (2018) An innovated SIRS model for
information spreading, vol. 645, doi:10.1007/978-981-10-7200-0_37.
Sitter, K.C. and Curnew, A.H. (2016) The application of social media in social work community
practice. Social Work Education, 35 (3), 271–283, doi:10.1080/02615479.2015.1131257.
Song, Y., Lu, Z., Leung, C.W.k., and Yang, Q. (2013) Collaborative boosting for activity
classication in microblogs, in Proceedings of the 19th ACM SIGKDD international conference
on Knowledge discovery and data mining - KDD ’13, p. 482, doi:10.1145/2487575.2487661.
Sparrowe, R.T., Liden, R.C., Wayne, S.J., and Kraimer, M.L. (2001) Social networks and the
performance of individuals and groups. Academy of Management Journal, 44 (2), 316–325,
doi:10.2307/3069458.
Statista.com (2018) Social Media Statistics & Facts | Statista. URL https://www.statista.com/
topics/1164/social-networks/.
Tacchini, E., Ballarin, G., Della Vedova, M.L., Moret, S., and de Alfaro, L. (2017) Some Like it
Hoax: Automated Fake News Detection in Social Networks. arXiv:1704.07506.
Tang, L. and Liu, H. (2011) Leveraging social media networks for classification. Data Mining
and Knowledge Discovery, 23 (3), 447–478, doi:10.1007/s10618-010-0210-x.
Tucker, C.E. (2014) Social networks, personalized advertising, and privacy controls. Journal of
Marketing Research, 51 (5), 546–562, doi:10.1509/jmr.10.0355.
Tuulos, V.H. and Tirri, H. (2004) Combining topic models and social networks for chat data
mining, in Proceedings - IEEE/WIC/ACM International Conference on Web Intelligence, WI
2004, pp. 206–213, doi:10.1109/WI.2004.10025.
Vázquez, S., Muñoz-García, Ó., Campanella, I., Poch, M., Fisas, B., Bel, N., and Andreu, G.
(2014) A classification of user-generated content into consumer decision journey stages.
Neural Networks, 58, 68–81, doi:10.1016/j.neunet.2014.05.026.
Verbeke, W., Martens, D., and Baesens, B. (2014) Social network analysis for customer churn
prediction. Applied Soft Computing Journal, 14 (PART C), 431–446,
doi:10.1016/j.asoc.2013.09.017.
Wang, C. and Blei, D.M. (2011) Collaborative topic modeling for recommending scientific
articles, in Proceedings of the 17th ACM SIGKDD international conference on Knowledge
discovery and data mining - KDD ’11, p. 448, doi:10.1145/2020408.2020480.
arXiv:1411.2581v1.
Wang, D., Irani, D., and Pu, C. (2014) SPADE: a social-spam analytics and detection framework.
Social Network Analysis and Mining, 4 (1), 1–18, doi:10.1007/s13278-014-0189-1.
Wang, S. and Manning, C.D. (2012) Baselines and bigrams: Simple, good sentiment and topic
classification, in Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics: Short Papers - Volume 2, Association for Computational Linguistics, Stroudsburg,
PA, USA, ACL ’12, pp. 90–94.
Wanichayapong, N., Pruthipunyaskul, W., Pattara-Atikom, W., and Chaovalit, P. (2011)
Social-based traffic information extraction and classification, in 2011 11th International
Conference on ITS Telecommunications, ITST 2011, pp. 107–112,
doi:10.1109/ITST.2011.6060036.
Wilson, G. and Banzhaf, W. (2009) Discovery of email communication networks from the
Enron corpus with a genetic algorithm using social network analysis, in 2009 IEEE Congress
on Evolutionary Computation, CEC 2009, pp. 3256–3263, doi:10.1109/CEC.2009.4983357.
Yang, M., Kiang, M., and Shang, W. (2015) Filtering big data from social media - Building an
early warning system for adverse drug reactions. Journal of Biomedical Informatics, 54,
230–240, doi:10.1016/j.jbi.2015.01.011.
Ye, M., Shou, D., Lee, W.C., Yin, P., and Janowicz, K. (2011) On the semantic annotation of
places in location-based social networks, in Proceedings of the 17th ACM SIGKDD
international conference on Knowledge discovery and data mining - KDD ’11, p. 520,
doi:10.1145/2020408.2020491.
Zhang, X., Tokoglu, F., Negishi, M., Arora, J., Winstanley, S., Spencer, D.D., and Constable, R.T.
(2011) Social network theory applied to resting-state fMRI connectivity data in the
identification of epilepsy networks with iterative feature selection. Journal of Neuroscience
Methods, 199 (1), 129–139, doi:10.1016/j.jneumeth.2011.04.020.
Zhou, W., Jin, H., and Liu, Y. (2012) Community discovery and profiling with social messages,
in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and
data mining - KDD ’12, p. 388, doi:10.1145/2339530.2339593.
Zhu, Y., Wang, X., Zhong, E., Liu, N., Li, H., and Yang, Q. (2012) Discovering spammers in
social networks, in Association for the Advancement of Artificial Intelligence, pp. 171–177.