Towards Predictive Policing: Knowledge-based Monitoring of Social Networks
Michael Spranger, Florian Heinke, Steffen Grunert and Dirk Labudde
University of Applied Sciences Mittweida
Mittweida, Germany
Email: {name.surname}@hs-mittweida.de
Abstract—Increasing the resilience of society against disorders, such as disasters, attacks or threatening groups, is a major challenge. Recent events highlight the importance of a resilient society and of the steps that resilience engineering requires. A priori, the best way to handle such adverse events is to prevent them, or at least to prepare for them appropriately. The essential requirement for any kind of preparation is information about relevant upcoming events. Such information can be gained, for example, from social networks and can form the basis for long-term and short-term strategic planning by security forces. For that purpose, we propose an application framework for knowledge-based social network monitoring, which aims at predicting short-term activities as well as the long-term development of potentially dangerous groups. This work gives and discusses a theoretical outline of the approach.
Keywords-forensic; text processing; resilience engineering
I. INTRODUCTION
Presenting oneself and communicating via the Internet, especially in social networks, has become standard not only for individuals, companies and organizations but also for political groups and gangs, which use these platforms to plan, arrange and carry out criminal offences [1], [2]. Large events with a relatively high degree of group dynamics, such as sports events, demonstrations or festivals, demand substantial staffing from security forces because the associated dynamics are unpredictable and uncertain. For example, securing the soccer events in Germany in 2014 required approximately two million working hours of police officers [3]. In order to support decision makers, we outline an application framework for monitoring cliques and groups in social networks, which can be key elements in the emergence of critical events. The monitoring process relies on general domain-specific endangerer profiles. Such a profile can be deduced from a set of social network sites of known endangerers or perpetrators (in the strict sense). Suspicious activities are identified by group recommendation classifiers.
The following section is structured according to the steps
required to generate the proposed framework. First, aspects of
ontology definition are outlined, followed by discussions on
endangerer profile generation and classifier training. Finally,
monitoring strategies are proposed.
II. PROPOSAL
The proposed application framework enables decision makers of security forces to identify threat hot-spots and, in this way, to allocate their human resources. A second aim, in support of long-term resource planning, is to predict the long-term development of groups that pose a threat. The process pipeline consists of three parts:
1) modelling the threat ontology
2) training the general domain-specific endangerer profile
3) monitoring all matching social network sites and calculating a long-term and a short-term threat score
A. Threat Ontology
In the common understanding, an ontology is a formal and explicit specification of a shared conceptualization. In particular, it is defined as a set of commonly classified terms and symbols with an associated syntax, together with a network of associative relations [4]. Similar to the crime ontology we proposed in recent work [5], an ontology can be used for modelling a complex threat assessment. In this way, the knowledge of decision makers is introduced and can be used for extracting semantic information from the posts and comments of social network profiles. In particular, the works of Wimalasuriya and Dou [6], Embley [7] and Maedche et al. [8] show that ontologies are well suited to assisting the extraction of semantic units and their visualization, and to structuring such processes.
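To make the idea of ontology-guided extraction more concrete, the following minimal Python sketch models an ontology as concepts with trigger terms and named relations, and tags a post with the concepts it mentions. The concept names, terms and relations are purely hypothetical and are not taken from the crime ontology in [5]; a real system would use a proper ontology language and linguistic preprocessing rather than plain token matching.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology class with its trigger terms and outgoing relations."""
    name: str
    terms: set[str]
    # relation name -> name of the target concept
    relations: dict[str, str] = field(default_factory=dict)


# Hypothetical toy threat ontology (illustration only).
ONTOLOGY = {
    "Event": Concept("Event", {"match", "derby", "demo"},
                     {"takes_place_at": "Location"}),
    "Location": Concept("Location", {"stadium", "station", "square"}),
    "Violence": Concept("Violence", {"fight", "brawl", "attack"},
                        {"targets": "Event"}),
}


def extract_concepts(post: str,
                     ontology: dict[str, Concept] = ONTOLOGY) -> dict[str, list[str]]:
    """Return, per concept, the tokens of the post that match its terms."""
    tokens = re.findall(r"[a-z]+", post.lower())
    hits: dict[str, list[str]] = {}
    for concept in ontology.values():
        matched = [t for t in tokens if t in concept.terms]
        if matched:
            hits[concept.name] = matched
    return hits


if __name__ == "__main__":
    print(extract_concepts("Fight at the station before the derby!"))
```

The matched concepts could then be linked through the declared relations (for example, a Violence mention targeting an Event that takes place at a Location) to instantiate the ontology for a given post.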
B. Endangerer Profile
In order to distinguish profiles of interest with regard to a certain threat, a general profile needs to be modelled. Recent work [9], [10] has shown that feature vectors derived from social network profiles are suitable for generating group recommendations. In a similar way, a general classifier can be trained on the social network profiles of known persons associated with a specific threat. For example, Facebook profiles of known hooligans of a specific soccer club can be used to train classifiers that are able to identify the social activity of hooligans and their peers in social networks.
The generation process is divided into three parts depicted in
Figure 1.
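The three steps (profile selection, social feature selection, classifier training) can be sketched as follows. The paper does not prescribe a particular classifier or feature set; this illustration assumes bag-of-words features and a small multinomial Naive Bayes model with Laplace smoothing, and the training texts and labels are invented for the example.

```python
import math
import re
from collections import Counter


def features(profile_text):
    """Social feature selection (assumed here): bag-of-words token counts."""
    return Counter(re.findall(r"[a-z]+", profile_text.lower()))


class NaiveBayes:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)   # documents per label
        self.word_counts = {}               # label -> token counts
        for text, label in zip(texts, labels):
            self.word_counts.setdefault(label, Counter()).update(features(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word, n in features(text).items():
                if word in self.vocab:      # ignore tokens never seen in training
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Profile selection: hypothetical texts from known profiles and from others.
TRAIN_TEXTS = [
    "ultras meet before the derby fight",
    "away trip ultras pyro fight",
    "great recipe for pasta tonight",
    "holiday photos from the beach",
]
TRAIN_LABELS = ["of_interest", "of_interest", "other", "other"]

model = NaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)

if __name__ == "__main__":
    print(model.predict("ultras planning a fight"))
```

In practice the feature vectors would also cover non-textual signals such as group memberships or tags, as in [9] and [10], and the model would need a far larger, carefully curated training set.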
Figure 1. The process of deriving a threat-specific general profile: Profile Selection, Social Feature Selection, Classifier Training.
C. Monitoring Activities
Once a profile has been generated and the threat-specific ontology defined, social network monitoring can be conducted. At this point, a multi-level information extraction process aims at instantiating the ontology using textual information such as posts and comments. An example of how such a process can be structured is given by Spranger and Labudde [5]. Further text analysis steps, such as sentiment analysis (see the discussions in [11] and [12] for details), can complete the instantiated model in different ways. As a short-term benefit, a score can be computed for various time points, signalling whether a threatening event matching the specific profile and ontology is pointing directly to a specific location and time frame. These results can be plotted on a map to localize short-term security hot-spots and their dynamics, as discussed by Davies and Bishop [13].
Copyright (c) IARIA, 2015. ISBN: 978-1-61208-415-2. IMMM 2015: The Fifth International Conference on Advances in Information Mining and Management.
Figure 2. The proposed system. The central, expert-modelled threat-specific
ontology describes the environment of a special threat. A general endangerer
profile completes the model. In the process the model is used to extract textual
information from social network activities. Different scoring functions allow
the identification of threat hot-spots or can show the long-term evolution of
groups and cliques.
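One possible short-term scoring function is sketched below. The paper leaves the scoring functions open, so everything here is an assumption: each extracted observation carries a location, a time, a classifier confidence (`profile_match`) and a sentiment-derived `negativity` value, and the score of a location and time window is simply the sum of their products.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Observation:
    """One instantiated mention extracted from a post (hypothetical schema)."""
    location: str
    time: datetime
    profile_match: float   # classifier confidence in [0, 1]
    negativity: float      # sentiment-derived weight in [0, 1]


def hotspot_scores(observations, window_hours=6):
    """Sum profile_match * negativity per (location, time window).

    window_hours is assumed to divide 24 so that windows align with days.
    """
    scores = defaultdict(float)
    for obs in observations:
        # Round the timestamp down to the start of its window.
        start_hour = obs.time.hour - obs.time.hour % window_hours
        bucket = obs.time.replace(hour=start_hour, minute=0,
                                  second=0, microsecond=0)
        scores[(obs.location, bucket)] += obs.profile_match * obs.negativity
    return dict(scores)


if __name__ == "__main__":
    sample = [
        Observation("station", datetime(2015, 5, 2, 13, 10), 0.9, 0.8),
        Observation("station", datetime(2015, 5, 2, 16, 45), 0.5, 0.4),
        Observation("stadium", datetime(2015, 5, 2, 19, 0), 1.0, 0.5),
    ]
    for key, value in sorted(hotspot_scores(sample).items()):
        print(key, round(value, 2))
```

Scores keyed by location and window can be placed on a map directly; a threshold on the score would then mark candidate hot-spots, with the threshold itself left to domain experts.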
In the age of Big Data and of algorithms capable of handling such amounts of information, deducing the long-term development of such groups and their dynamics is still at an early stage. Methodological concepts widely used in modelling complex relations (for instance in systems biology) can be transferred directly to the field of resilience engineering. In particular, applying generic mathematical models to social networks has become computationally feasible, but requires further research. For example, epidemiological models can be applied efficiently to study the long-term evolution of groups and sub-networks (see [14]) and the information transfer between them. Thus, generating valid models and deriving predictions from them can be of great value, for instance in planning personnel and staff demands.
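As an illustration of the epidemiological view, the sketch below integrates the classic SIR model with a forward Euler scheme, reading "susceptible" as potential members, "infected" as active members and "recovered" as former members of a group. This is the textbook SIR model, not the modified variant studied in [14], and the parameter values are arbitrary assumptions chosen only to produce a rise and decline.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Forward-Euler integration of the classic SIR model.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        joined = beta * s * i / n * dt   # potential members becoming active
        left = gamma * i * dt            # active members leaving the group
        s -= joined
        i += joined - left
        r += left
        trajectory.append((k * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters: 990 potential members, 10 active members.
    run = simulate_sir(990, 10, 0, beta=0.5, gamma=0.1, days=60)
    peak_time, _, peak_active, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_active:.0f} active members around day {peak_time:.1f}")
```

Fitting `beta` and `gamma` to observed membership or activity counts would turn such a model into a rough forecast of group growth and decline, which is the kind of input long-term staff planning would need.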
REFERENCES
[1] ITU. Number of worldwide internet users from 2000 to 2014 (in millions). statista. [Online]. Available: http://www.statista.com/statistics/273018/number-of-internet-users-worldwide/ (2015)
[2] eMarketer & American Marketing Association. Number of social network users worldwide from 2010 to 2018 (in billions). statista. [Online]. Available: http://www.statista.com/statistics/278414/number-of-worldwide-social-network-users (2015)
[3] ZIS. Jahresbericht 2013/14. Zentrale Informationsstelle Sporteinsätze. [Online]. Available: http://www.polizei-nrw.de/media/Dokumente/Behoerden/LZPD/ZIS Jahresbericht 2013 14.pdf (2014)
[4] T. R. Gruber, “Toward principles for the design of ontologies used for knowledge sharing,” in Formal Ontology in Conceptual Analysis and Knowledge Representation, N. Guarino and R. Poli, Eds. Kluwer Academic Publishers, 1993.
[5] M. Spranger and D. Labudde, “Towards establishing an expert system for forensic text analysis,” International Journal on Advances in Intelligent Systems, vol. 7, no. 1/2, 2014, pp. 247–256.
[6] D. C. Wimalasuriya and D. Dou, “Ontology-based information extraction: An introduction and a survey of current approaches,” Journal of Information Science, vol. 36, no. 3, 2010, pp. 306–323.
[7] D. W. Embley, “Toward semantic understanding: an approach based on information extraction ontologies,” in Proceedings of the 15th Australasian database conference - Volume 27, ser. ADC ’04. Darlinghurst, Australia: Australian Computer Society, Inc., 2004, pp. 3–12.
[8] A. Maedche, G. Neumann, and S. Staab, “Bootstrapping an ontology-based information extraction system,” Studies In Fuzziness And Soft Computing, vol. 111, 2003, pp. 345–362.
[9] M. Manca, L. Boratto, and S. Carta, “Producing friend recommendations in a social bookmarking system by mining users content,” in Proc. 3rd International Conference on Advances in Information Mining and Management, IARIA. ThinkMind Library, 2013, pp. 59–64.
[10] M. Cheung and J. She, “Bag-of-features tagging approach for a better recommendation with social big data,” in Proc. 4th International Conference on Advances in Information Mining and Management, IARIA. ThinkMind Library, 2014, pp. 83–88.
[11] S. M. Mohammad, S. Kiritchenko, and X. Zhu, “NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets,” in Proceedings of the Second Joint Conference on Lexical and Computational Semantics (SEMSTAR’13), 2013.
[12] X. Wan, “Co-training for cross-lingual sentiment classification,” in Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1. Association for Computational Linguistics, 2009, pp. 235–243.
[13] T. Davies and S. Bishop, “Modelling patterns of burglary on street networks,” Crime Science, vol. 2, no. 1, 2013, p. 10.
[14] J. Cannarella and J. A. Spechler, “Epidemiological modeling of online social network dynamics,” CoRR, vol. abs/1401.4208, 2014.