OSINT in the Context of Cyber-Security


The impact of cyber-crime has compelled intelligence and law enforcement agencies across the world to tackle cyber threats. All sectors are now facing similar dilemmas of how best to mitigate cyber-crime and how to promote security effectively to people and organizations. Extracting unique and high value intelligence by harvesting public records to create a comprehensive profile of certain targets is emerging rapidly as an important means for the intelligence community. As the amount of available open sources rapidly increases, countering cyber-crime increasingly depends upon advanced software tools and techniques to collect and process the information in an effective and efficient manner. This chapter reviews current efforts to employ open source data in cyber-criminal investigations and develops an integrative OSINT Cybercrime Investigation Framework.
14.1 Introduction
During the 21st century, the digital world has acted as a double-edged sword (Gregory and Glance 2013; Yuan and Chen 2012). Through the revolution of publicly accessible sources (i.e., open sources), the digital world has provided modern society with enormous advantages, whilst at the same time, issues of information insecurity have brought to light vulnerabilities and weaknesses (Hobbs et al. 2014; Yuan and Chen 2012). The shared infrastructure of the internet creates the potential for interwoven vulnerabilities across all users (Appel 2011): "The viruses, hackers, leakage of secure and private information, system failures, and interruption of services" appeared in an abysmal stream (Yuan and Chen 2012).
Wall (2005, 2007) and Nykodym et al. (2005) argued that cyberspace possesses four unique features, called "transformative keys", that enable criminals to commit crimes:
1. Globalization, which provides offenders with new opportunities to exceed conventional boundaries
2. Distributed networks, which create new opportunities for victimization
3. Synopticism and Panopticism, which enable surveillance capability on victims
4. Data trails, which may allow new opportunities for criminals to commit identity theft
In addition to the above, Hobbs et al. (2014) claim that one of the main trends of recent years' internet development is that connection to the Internet may be a very risky endeavour.
Alongside the widespread use and advancement of mobile communication technology, the use of open sources has spread through the fields of intelligence, politics and business (Hobbs et al. 2014). Whilst traditional sources and information channels (news outlets, databases, encyclopedias, etc.) have been forced to adapt to the new virtual space to maintain their presence, many "new" media sources (especially from social media) disseminate large amounts of user-generated content that has subsequently reshaped the information landscape. Examples of the scale of user-generated information include the 500 million Tweets per day on Twitter and the 98 million daily blog posts on Tumblr (Hobbs et al. 2014), as well as millions of individual personal Facebook pages. With the evolution of the information landscape, it has become essential for law enforcement agencies to harvest relevant content through investigations and regulated surveillance in order to prevent and detect terrorist activities (Koops et al. 2013).
As has been considered in earlier chapters, the term Open Source Intelligence (OSINT) emanates from national security services and law enforcement agencies (Kapow Software 2013). For our purposes here, OSINT is predominantly defined as "the scanning, finding, collecting, extracting, utilizing, validation, analysis, and sharing intelligence with intelligence-seeking consumers of open sources and publicly available data from unclassified, non-secret sources" (Fleisher 2008; Koops et al. 2013). OSINT encompasses various public sources such as academic publications (research papers, conference publications, etc.), media sources (newspapers, radio channels, television, etc.), web content (websites, social media, etc.), and public data (open government documents, public companies' announcements, etc.) (Chauhan and Panda 2015a, b).
OSINT was traditionally described as searching publicly available published sources (Burwell 2004) such as books, journals, magazines, pamphlets, reports and the like. This is often referred to as literature intelligence or LITINT (Clark 2004). However, the rapid growth of digital media sources throughout the web and public communication airwaves has enlarged the scope of Open Source activities (Boncella 2003). Since there are diverse public online sources from which we can collect intelligence, this type of OSINT is described as WEBINT by many authors. Indeed, the terms WEBINT and OSINT are often used interchangeably (Chauhan and Panda 2015a, b). Social media such as social networks, media sharing communities and collaborative projects are areas where the majority of user-generated content is produced. Social Media Intelligence or SOCMINT refers to the intelligence that is collected from social media sites. Some of this information may be openly accessible without any kind of authentication required prior to investigation (Omand et al. 2014, p. 36; Chauhan and Panda 2015a, b).
Many law enforcement and security agencies are turning towards OSINT for the additional breadth and depth of information to reinforce and help validate contextual knowledge (see for instance Chap. 13). Unlike typical IT systems, which can adopt only a limited range of input, OSINT data sources are as varied as the internet itself and will continue to evolve as technology standards expand (Kapow Software 2013): "OSINT can provide a background, fill epistemic gaps and create links between seemingly unrelated sources, resulting in an altogether more complete intelligence picture" (Hobbs et al. 2014, p. 2).
OSINT increasingly depends on the assimilation of all-source collection and analysis. Such intelligence is an essential part of "national security, competitive intelligence, benchmarking, and even data mining within the enterprise" (Appel 2011, p. xvii). The process of OSINT is shown in Fig. 14.1. OSINT has been used for a long time by governments, the military and the corporate world to keep an eye on the competition and to gain a competitive advantage (Chauhan and Panda 2015a, b). In addition, a great number of internet users "enjoy legal activities from communications and commerce to games, dating, and blogging" (Appel 2011, p. 6), and OSINT plays a critical role in this context.
Fig. 14.1 The OSINT process
The current chapter aims to present an in-depth review of the role of OSINT in the cyber security context. Cybercrime and its related applications are explored, such as the concepts of the Deep and Dark Web, anonymity and cyber-attacks. Further, the chapter reviews OSINT collection and analysis tools and techniques, with a glance at related works as main parts of its contribution. Finally, these related works are articulated alongside the cyber threat domain and its open sources to establish a "big picture" of this topic.
14.2 The Importance of OSINT with a View on Cyber Security
Increases in the quantity and type of challenges for contemporary national security, intelligence, law enforcement and security practitioners have sped up the use of open sources on the internet to help draw out a more cohesive picture of people, entities and activities (Appel 2011; also Chaps. 2, 3, 12 and 13). A recent PWC American Survey (2015) entitled "Key findings from the 2015 US State of Cybercrime Survey", covering more than 500 executives of US businesses, law enforcement services and government agencies, reports that "cybercrime continues to make headlines and cause headaches among business executives." 76 % of cyber-security leaders said they are more concerned about cyber threats this year: "Cybersecurity incidents are not only increasing in number, they are also becoming progressively destructive and target a broadening array of information and attack vectors" (PWC 2015).
In a report of the U.S. Office of Homeland Security, critical mission areas wherein the adoption of OSINT is vital include general intelligence, advanced warnings, domestic counter-terrorism, protecting critical infrastructure (including cyberspace), defending against catastrophic terrorism, and emergency preparedness and response (Chen et al. 2012). Therefore, intelligence, security and public safety agencies are gathering large volumes of data from multiple sources, including the criminal records of terrorism incidents and from cyber security threats (Chen et al. 2012).
Glassman and Kang (2012) discussed OSINT as the output of changing human information relationships resulting from the emergence and growing dominance of the World Wide Web in everyday life. Socially inappropriate behaviour has been detected in Web sites, blogs and online communities of all kinds, from child exploitation to fraud, extremism, radicalisation, harassment, identity theft, and private-information leaks. "Identity theft and the distribution of illegally copied films, TV shows, music, software, and hardware designs are good examples of how the Internet has magnified the impact of crime" (Hobbs et al. 2014).
The globalization, speed of dissemination, anonymity and cross-border nature of the internet, together with the lack of appropriate legislation or international agreements, have made some of these behaviours very widespread and very difficult to litigate (Kim et al. 2011). There exist different types of dark sides of the internet, but also applications to shed light on the dark sides, comprising both technology-centric and non-technology-centric ones. Technology-centric dark sides include spam, malware, hacking, Denial of Service (DoS) attacks, phishing, click fraud and violation of digital property rights. Non-technology-centric dark sides include online scams and frauds, physical harm, cyber-bullying, spreading false or private information and illegal online gambling. Non-technology responses include legislation, law enforcement, litigation, international collaboration, civic actions, education, and awareness and caution by people (Kim et al. 2011).
Computer crime and digital evidence are growing by orders of magnitude that are as yet unmeasured except by occasional surveys (Hobbs et al. 2014). To an intelligence analyst, the internet is pivotal owing to the capabilities of browsers, search engines, web sites, databases, indexing, searching and analytical applications (Appel 2011). However, key issues can divert OSINT projects from their intended direction, such as harvesting data from large open records on the internet and integrating that data to extend the capability of OSINT projects (Kapow Software 2013).
14.3 Cyber Threats: Terminology and Classification
Cyber-crime is any illegal activity arising from one or more internet components such as Web sites, chat rooms or e-mail (Govil and Govil 2007) and is commonly defined as "criminal offenses committed using the internet or another computer network as a component of the crime" (Agrawal et al. 2014). In 2007, the European Commission (EC) identified three different types of cyber-crime: traditional forms of crime using cyber relating to, for example, forgery, web shops and e-market types of fraud; illegal content such as child pornography; and "crimes unique to electronic networks" (e.g., hacking and Denial of Service attacks). Burden and Palmer (2003) distinguished "true" cybercrime (i.e., dishonest or malicious acts which would not exist outside of an online environment) from crimes which are simply "e-enabled". They presented "true" cyber-crimes as hacking, dissemination of viruses, cyber-vandalism, domain name hijacking and Denial of Service attacks (DoS/DDoS), in contrast to "e-enabled" crimes such as misuse of credit cards, information theft, defamation, blackmail, cyber-pornography, hate sites, money laundering, copyright infringements, cyber-terrorism and encryption. Evidently, crime has infiltrated the Web 2.0 along with all other types of human activities (Hobbs et al. 2014).
In this chapter, the terms computer crime, internet crime, online crimes, hi-tech crimes, information technology crime and cyber-crimes are used interchangeably.
Cyber-attacks are increasingly being considered to be of the utmost severity for national security. Such attacks disrupt legitimate network operations and include deliberate detrimental effects on network devices, overloading a network and denying network services to legitimate users. An attacker may also exploit loopholes, bugs and misconfigurations in software services to disrupt normal network activities (Hoque et al. 2014).
The attacker's goal is to perform reconnaissance by harnessing the power of freely available information extracted using different intelligence gathering methods before executing a targeted attack (Enbody and Sood 2014). Meanwhile, secrecy is a key part of any organized cyber-attack. Actions can be hidden behind a mask of anonymity, varying from the use of ubiquitous cyber-cafes to sophisticated efforts at covert internet routing (Govil and Govil 2007). Cyber-criminals exploit opportunities for anonymity and disguise over web-based communication to conduct malicious activities such as phishing, spamming, blackmail, identity theft and drug trafficking (Gottschalk et al. 2011; Iqbal et al. 2012). Network security tools help network attackers as well as network defenders to recognize network vulnerabilities and collect site statistics. Network attackers attempt to identify security breaches based on common services open on a host, gathering relevant information for launching a successful attack.
Kshetri (2005) classified cyber-attacks into two types: targeted and opportunistic attacks. In targeted attacks, specific tools are applied against specific cyber targets, which makes this type more dangerous than the other. Opportunistic attacks entail disseminating worms and viruses that spread indiscriminately across the internet (Hoque et al. 2014). Figure 14.2 provides a taxonomy of cyber-crime types (what) with their motives (why) and the tools to commit them (how).
To counter the ability of organized cyber-crime to operate remotely through untraceable accounts and compromised computers, and to fight online crime gangs, it is essential to supply law enforcement agencies (LEAs) and actors in national security with tools for the detection and classification of, and defence from, various types of attacks (Simmons et al. 2014).
14.4 Cyber-Crime Investigations
14.4.1 Approaches, Methods and Techniques
Current information professionals draw from a variety of methods for organizing
open sources including but not limited to web-link analysis, metrics, scanning
methods, source mapping, text mining, ontology creation, blog analysis and pattern
recognition methods. Algorithms are developed using computational topology,
hyper-graphs, social network analysis (SNA), Knowledge Discovery and Data
Mining (KDD), agent based simulations, dynamic information systems analysis,
amongst others (Brantingham 2011).
Fig. 14.2 Cyber Crime types: Which-Why-How (Type, Motives, Committing Tools and techs)
Table 14.1 Tools for the collection, storage and classification of open source data

Tool purpose: Data encoding
Application/description of tool(s): The term "encoding" refers to the process of putting a sequence of characters into a special format for transmission or storage purposes. In a web environment, relevant datasets are recovered from data services available either locally or globally on the internet. Depending on the service and the type of information, data can be presented in different formats. Modelling platforms are required to interact with a mixture of data formats including plain text, markup languages and binary files (Vitolo et al. 2015; n.d.).
Examples: The Geoinformatics for Geochemistry System (database web services adopting plain text format), Base64 online encoder, XML encoder

Tool purpose: Data acquisition
Application/description of tool(s): The automatic collection of data from various sources (e.g., sensors and readers in a factory, laboratory, medical or scientific environment). Data acquisition has usually been conducted via data access points and web links such as http or ftp pages, but required periodical updates. Using a catalogue allows a screening of available data sources before their acquisition (Ames et al. 2012; Vitolo et al. 2015).
Examples: Meta-data catalogues

Tool purpose: Data provenance
Application/description of tool(s): This term is used to refer to the process of tracing and recording the origins of data and its movement between databases. Behind the concept of provenance is the dynamic nature of data. Instead of creating different copies of the same dataset, it is important to keep track of changes and store a record of the process that led to the current state. Data provenance can, in this way, guarantee reliability of data and reproducibility of results. Provenance is now an increasingly important issue in scientific databases, where it is central to the validation of data for inspecting and verifying quality, usability and reliability of data (particularly in Semantic Web Services) (Buneman et al. 2000; Szomszor and Moreau 2003; Tilmes et al. 2010; Vitolo et al. 2015).
Examples: Distributed version control systems such as Git

Tool purpose: Data storage
Application/description of tool(s): This term refers to the practice of storing electronic data with a third party service accessed via the internet. It is an alternative to traditional local storage (e.g., disk or tape drives) and portable storage (e.g., optical media or flash drives). It can also be called "hosted storage", "internet storage" or "cloud storage". Relational databases (DB) are currently the best choice in storing and sharing data (Vitolo et al. 2015; n.d.).
Examples: PostgreSQL, MySQL, Oracle, NoSQL
OSINT analytic tools provide frameworks for data mining techniques to analyse data, visualize patterns and offer analytical models to recognize and react to identified patterns. These tools should combine indispensable features and contain integrated algorithms and methods supporting the typical data mining techniques, including (but not limited to) classification, regression, association and item-set mining, similarity and correlation, as well as neural networks (Harvey 2012). Such analytics tools are software products which provide predictive and prescriptive analytics applications, some running on big open source computing platforms, commonly parallel processing systems based on clusters of commodity servers, scalable distributed storage and technologies such as Hadoop and NoSQL databases. The tools are designed to empower users to analyse large amounts of data rapidly (Loshin 2015). The most predominant tools and techniques for OSINT collection and storage are summarized in Table 14.1.
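As one illustration of the data provenance entry in Table 14.1, the following minimal sketch keeps an append-only log of the steps applied to a collected item and chains the entries with SHA-256 hashes, so that a later change to an earlier entry becomes detectable. It is a sketch under stated assumptions rather than the behaviour of any tool named in the table: the function names (`append_record`, `verify_chain`), the record fields and the example URL are hypothetical.

```python
import hashlib
import json


def _digest(payload: dict) -> str:
    """Return a SHA-256 hex digest of a JSON-serialised payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(log: list, source: str, action: str, content: str) -> dict:
    """Append a provenance entry that is chained to the previous entry."""
    previous_hash = log[-1]["entry_hash"] if log else ""
    body = {
        "source": source,    # where the data came from (hypothetical field)
        "action": action,    # what was done to it (hypothetical field)
        "content_hash": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "previous_hash": previous_hash,
    }
    entry = dict(body, entry_hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Check every entry hash and every link to the preceding entry."""
    previous_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["previous_hash"] != previous_hash:
            return False
        if _digest(body) != entry["entry_hash"]:
            return False
        previous_hash = entry["entry_hash"]
    return True


log = []
append_record(log, "https://example.org/forum", "collected", "raw page text")
append_record(log, "https://example.org/forum", "cleaned", "page text")
print(verify_chain(log))
```

Storing only content hashes keeps the log small while still allowing an analyst to confirm that a stored copy matches what was originally collected.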
14.4.2 Detection and Prevention of Cyber Threats
Techniques to make use of open sources involve a number of specific disciplines including statistics, data mining, machine learning, neural networks, social network analysis, signal processing, pattern recognition, optimization methods and visualization approaches (Chen and Zhang 2014; also Chapters in Part 2 of this book).

Table 14.1 (continued)

Tool purpose: Data curation
Application/description of tool(s): Data curation is aimed at data discovery and retrieval, data quality assurance, value addition, reuse and preservation over time. It involves selection and appraisal by creators and archivists; evolving provision of intellectual access; redundant storage; data transformations. Data curation is critical for scientific data digitization, sharing, integration, and use (Dou et al. 2012; n.d.).
Examples: Data warehouses, data marts, Data Management Plan tools (DMPTool)

Tool purpose: Data visualization
Application/description of tool(s): This term refers to the presentation of data in a pictorial or graphical format (e.g., creating tables, images, diagrams and other intuitive ways to understand data). Interactive data visualization goes a step further: moving beyond the display of static graphics and spreadsheets to using computers and mobile devices to drill down into charts and graphs for more details, and interactively (and immediately) changing what data you see and how it is processed (Vitolo et al. 2015; n.d.).
Examples: Polymaps, NodeBox, FF Chartwell, SAS Visual Analytics, Google Maps

Notes to Table 14.1:
Distributed version control systems have been designed to ease the traceability of changes in documents, code, plain text data sets and, more recently, geospatial content.
DMP tools create ready-to-use data management plans that meet the requirements of specific funding agencies, provide step-by-step instructions and guidance, and point to resources and services available at the researcher's institution to help fulfil the data management requirements of a grant.
Gottschalk et al. (2011) presented a four-stage growth model for Knowledge Discovery to support investigations and the prevention of white-collar crime in business organizations (Gottschalk 2010). The four stages are labelled:
1. Investigator-to-technology
2. Investigator-to-investigator
3. Investigator-to-information
4. Investigator-to-application
Through the proper exercise of knowledge, such processes can assist in problem solving. This four-part system attempts to validate conclusions by finding evidence to support them. In law enforcement this is an important system feature, as evidence determines whether a person is charged or not for a crime (Gottschalk et al. 2011) and the extent to which proceedings against them will succeed (see Chaps. 17 and 18).
Lindelauf et al. (2011) investigated the structural position of covert criminal networks, using the secrecy versus information trade-off characterization of covert networks to identify criminal network topologies. They applied this technique to evidence from the investigation of Jemaah Islamiyah's Bali bombing as well as heroin distribution networks in New York. Danowski (2011) developed a methodology combining text analysis and social network analysis for locating individuals in discussion forums who have highly similar semantic networks, based on watch-list members' observed message content or on other standards such as radical content extracted from messages they disseminate on the internet. In the domain of countering cyber terrorism and incitement to violence, Danowski used a Pakistani discussion forum with diverse content to extract intelligence on illegal behaviour. Iqbal et al. (2013) presented a unified data mining solution to address the problem of authorship analysis in anonymous textual communications such as spamming and spreading malware, and to model the writing style of suspects in the context of cyber-criminal behaviour.
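The idea of locating forum users whose message content resembles reference content can be illustrated with a far simpler technique than the semantic network analysis cited in this section: a bag-of-words cosine similarity. The sketch below is not Danowski's methodology or the authorship model of Iqbal et al.; the sample messages, user names and helper functions (`term_vector`, `cosine_similarity`) are invented for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Lower-case the text and count word occurrences."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reference content and forum messages.
reference = term_vector("send the package through the usual courier tonight")
users = {
    "user_a": "the courier will send the package tonight",
    "user_b": "great match yesterday, the second goal was superb",
}

scores = {
    name: cosine_similarity(reference, term_vector(message))
    for name, message in users.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In practice, raw term counts would usually be replaced by weighted features and stop-word handling; the ranking step stays the same.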
White-collar crime is financial crime committed by upper class members of society for personal or organizational gain. White-collar criminals are individuals who tend to be wealthy, highly educated, and socially connected, and they are typically employed by and in legitimate organizations.

F. Tabatabaei and D. Wells

Table 14.2 Categorization of methods using open source data for cyber-criminal investigations

Technique: Data mining

Domain (Which): Criminal networks
Author (Who): Iqbal et al.
Methodology description (How): Proposing a framework that consists of three components (1. ... miner, 2. topic miner and 3. information visualizer). It is a unified framework of data mining and natural language processing techniques to collect data from chat logs for intuitive and interpretable evidence that facilitates the investigative process for crime investigation.
Available from: Online messages (chat logs) extracted from Social

Domain (Which): Activity boom in cyber cafes, and anomaly
Author (Who): Ansari et al.
Methodology description (How): Describing a typical fuzzy intrusion detection scenario for information mining application in real time that investigates vulnerabilities of computer networks
Available from: Data available via ISPs

Domain (Which): Malware activities detection using fast-flux service networks (FFSN)
Author (Who): Wu et al.
Methodology description (How): Investigating detection solutions of fast-flux domains by using data mining techniques (linear regression) to detect the FFSN and analysing the feature
Available from: Data in two classes: white and black lists. The white list includes more than 60 thousand benign domain names; the black list has about 100 FFSN domain names detected by

Domain (Which): Cyber terrorism resilience
Author (Who): Koester and
Methodology description (How): Providing a supporting framework via FCA (Formal Concept Analysis) to find and fill information gaps in Web Information Retrieval and Web Intelligence for cyberterrorism resilience
Available from: Small terrorist data sets based on 2002, 2005, London,

Technique: Text mining

Domain (Which): Counter cyber terrorism
Author (Who): Srihari (2009)
Methodology description (How): Using the Unapparent Information Revelation (UIR) method to propose a new framework for different interpretation. A generalization of this task involves query terms representing general concepts (e.g. indictment, foreign policy)

Domain (Which): Intrusion detection
Author (Who): Adeva and Atxa (2007)
Methodology description (How): Proposing the detection of attempts at either gaining unauthorised access or misusing a web application, and introducing an intrusion detection software component based on text-mining techniques using the "Arnas" system

Technique: Social network analysis

Domain (Which): Cyber terrorism (detecting terrorist networks)
Author (Who): Chen et al.
Methodology description (How): Providing a novel graph-based algorithm that generates networks to identify hidden links between nodes in a network with current information available to

Domain (Which): Terrorist network fighting
Author (Who): Kock Wiil et al. (2011)
Methodology description (How): Offering a novel method to analyse the importance of links and to identify key entities in terrorist (covert) networks using Crime Fighter Assistant
Available from: Open sources: 9/11 attacks (2001), Bali night club bombing (2002), Madrid bombings (2004), and 7/7 London bombings (2005)

Domain (Which): Network attacks (intrusion
Author (Who): He and
Methodology description (How): Using an Automatic Semantic Network with two layers: first mode and second mode networks. The first mode network identifies relevant attacks based on similarity measures; the second mode network is modified based on the first mode and adjusts it by adding domain expertise
Available from: Selected data from the KDD CUP 99 data set made available at the Third International Knowledge Discovery and Data Mining Tools

Technique: Optimization methods (based on game theory)

Domain (Which): Preventing DDoS attacks
Author (Who): Spyridopoulos et al. (2013)
Methodology description (How): Making a two-player, one-shot, non-cooperative, zero-sum game in which the attacker's purpose is to find the optimal configuration parameters for the attack in order to cause maximum service disruption with the minimum cost. This model attempts to explore the interaction between an attacker and a defender during a DDoS attack scenario
Available from: A series of experiments based on the Network Simulator (ns-2) using the dumbbell network topology

Domain (Which): Trust management and DoS attacks
Author (Who): Li et al. (2009)
Methodology description (How): Proposing a defence technique using two trust management systems (KeyNote and TrustBuilder) and credential caching. In their two-player zero-sum game model, the attacker tries to deprive as many resources as possible, while the defender tries to identify the attacker as quickly as possible
Available from: KeyNote (open-source library for the KeyNote trust management system) as an example to demonstrate that a DoS attack can easily paralyze a trust management server

Domain (Which): Cyber terrorism
Author (Who): Matusitz
Methodology description (How): A model combining game theory and social network theory to model how cyber-terrorism works and to analyse the battle between computer security experts and cyberterrorists; all players wish the outcome to be as positive or rewarding as possible

Technique: Related works for

Domain (Which): Cyber-crime investigation
Author (Who): Katos and Bendar (2008)
Methodology description (How): Presenting an information system to capture the information provided by the different members during a cyber-crime investigation, adopting elements of the Strategic Systems Thinking Framework (SST). SST consists of three main aspects: 1. intra-analysis, 2. inter-analysis and 3. value-analysis

Domain (Which): Computer hacking
Author (Who): Kshetri (2005)
Methodology description (How): Proposing a conceptual framework based on factors and motivations which encourage and energize the cyber offenders' behaviour: 1. Characteristics of the source nation; 2. Motivation of attack; 3. Profile of target organization (types of attack)

Domain (Which): Preventing white collar
Methodology description (How): Developing an organizing framework for knowledge management systems in policing financial crime containing four stages for the investigation and prevention of financial crimes: 1. Officer to technology systems; 2. Officer to officer systems; 3. Officer to information systems; 4. Officer to application systems

Domain (Which): Detecting cyber-crime in financial sector
Author (Who): Lagazio et al.
Methodology description (How): Proposing a multi-level approach that aims at mapping the interaction of both interdependent and differentiated factors, focusing on system dynamics theory in the financial sector. The factors together can facilitate or prevent cyber-crime, while increasing and/or decreasing its economic and social costs.

Domain (Which): Capturing and analysing military intelligence to prevent crises
Author (Who): Song (2011)
Methodology description (How): Proposing a military intelligence early warning mechanism based on open sources with four modules (1. collection module, 2. early-warning intelligence processing, 3. early-warning intelligence analysis, 4. preventive actions) to help the collection, tracking, monitoring and analysis of crisis signals used by operation commanders and intelligence personnel to support preventive

Notes to Table 14.2:
Fast-flux service network (FFSN): creates a fully qualified domain name to have hundreds (or thousands) of IP addresses assigned to it.
Crime Fighter Assistant: a knowledge management tool for terrorist network analysis.
KDD CUP 99 data set: this training dataset was originally prepared and managed by MIT Lincoln Labs.

Brantingham (2011) offered a comprehensive computational framework for co-offending network mining, which combines formal data modelling with data mining of large crime and terrorism data sets, aimed towards identifying common and useful patterns. Petersen et al. (2011) proposed a node removal algorithm in the context of cyber-terrorism to remove key nodes of a terrorism network. Fallah (2010) proposed a puzzle-based strategy of game theory using the solution concept of the Nash Equilibrium to handle sophisticated DoS attack scenarios. Chonka et al. (2011) offered a solution through Cloud TraceBack (CTB) to find the source of DoS attacks and introduced the use of a back propagation neural network, called Cloud Protector, which was trained to detect and filter such attack traffic. Mukhopadhyay et al. (2013) suggested a Copula-aided Bayesian Belief Network (CBBN) to assess and quantify cyber-risk and to support cyber vulnerability assessment.
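The node removal approach cited in this section can be illustrated with a generic sketch. It is not the algorithm of Petersen et al. (2011): it simply removes the node with the highest degree from a toy undirected network and compares the size of the largest connected component before and after removal. The network, the node labels and the function names are hypothetical.

```python
from collections import deque


def largest_component_size(graph: dict) -> int:
    """Size of the largest connected component of an undirected graph."""
    seen, best = set(), 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def remove_node(graph: dict, target: str) -> dict:
    """Return a copy of the graph without the target node and its edges."""
    return {n: nbrs - {target} for n, nbrs in graph.items() if n != target}


def highest_degree_node(graph: dict) -> str:
    """Pick the node with the most neighbours (ties broken alphabetically)."""
    return min(graph, key=lambda n: (-len(graph[n]), n))


# Toy network: A is the hub, D links the hub to E.
network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

key_node = highest_degree_node(network)
reduced = remove_node(network, key_node)
print(key_node, largest_component_size(network), largest_component_size(reduced))
```

Degree is only one possible notion of a key node; betweenness or other centrality measures could be substituted in `highest_degree_node` without changing the rest of the sketch.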
In summary, the field of computational criminology includes a wide range of
computational techniques to identify:
1. Patterns and emerging trends
2. Crime generators and crime attractors
3. Terrorist, organized crime and gang social and spatial networks
4. Co-offending networks
Current models and methods are summarized in Table 14.2 according to cyber-crime type (which), author (who), methodology (how) and the open sources used for testing.
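The two-player zero-sum games listed under optimization methods in Table 14.2 can be illustrated with a toy payoff matrix. The sketch below finds the defender's maximin and the attacker's minimax pure strategies and checks whether they meet in a saddle point. The strategy names and payoff values are invented for illustration and are not drawn from Spyridopoulos et al. (2013) or Li et al. (2009); games without a saddle point would need mixed strategies, which this sketch does not cover.

```python
# Rows: defender strategies; columns: attacker strategies.
# Each value is a hypothetical share of service availability kept by the
# defender, which the attacker wants to minimise (zero-sum).
payoff = {
    "rate_limit": {"flood": 0.6, "slow_drip": 0.4},
    "filter": {"flood": 0.7, "slow_drip": 0.5},
    "no_defence": {"flood": 0.1, "slow_drip": 0.3},
}


def defender_maximin(matrix: dict) -> tuple:
    """Defender picks the row whose worst-case payoff is largest."""
    best = max(matrix, key=lambda row: min(matrix[row].values()))
    return best, min(matrix[best].values())


def attacker_minimax(matrix: dict) -> tuple:
    """Attacker picks the column whose best-case defender payoff is smallest."""
    columns = next(iter(matrix.values())).keys()
    worst = {c: max(matrix[r][c] for r in matrix) for c in columns}
    best = min(worst, key=worst.get)
    return best, worst[best]


defence, lower_value = defender_maximin(payoff)
attack, upper_value = attacker_minimax(payoff)
has_saddle_point = lower_value == upper_value
print(defence, attack, lower_value, upper_value, has_saddle_point)
```

When the two values coincide, neither player can improve by deviating unilaterally, which is the pure-strategy form of the equilibrium idea used in the game-theoretic models surveyed here.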
While many approaches seem to be helpful for cyber-crime investigation, existing literature suggests that social network analysis (SNA), data mining, text analysis, correlational studies and optimization methods, specifically with a focus on big data analysis of open sources, are the most practical techniques to aid practitioners and security and forensic agencies. Currently available techniques can be categorized in a schematic diagram such as Fig. 14.3.
Fig. 14.3 Categorization of cyber-crime investigation methods and models (labels recoverable from the figure: data mining; text mining; information extraction; game theory; web mining; link analysis; machine learning; social network analysis; node removal; network extraction; semantic networks; statistical methods; regression models; conceptual knowledge-based frameworks; cloud computing)
14.5 Conclusions
The impact of cyber-crime has compelled intelligence and law enforcement agencies across the world to tackle cyber threats. All sectors are now facing similar dilemmas of how best to mitigate cyber-crime and how to promote security effectively to people and organizations (Jahankhani et al. 2014; Staniforth 2014). Extracting unique and high value intelligence by harvesting public records to create a comprehensive profile of certain targets is emerging rapidly as an important means for the intelligence community (Bradbury 2011; Steele 2006). As the amount of available open sources rapidly increases, countering cyber-crime increasingly depends upon advanced software tools and techniques to collect and process the information in an effective and efficient manner (Kock Wiil et al. 2011).
This chapter reviewed current efforts to employ open source data for cyber-criminal investigations. Figure 14.4 provides a summary of the findings in the form of an integrative Cybercrime Investigation Framework.
