
Trajectory Intrusion Evidence Data Analysis

Authors:
  • Edo State University Iyamho

Abstract

Location-aware devices are used extensively in many networking systems, such as car navigation and IP traceback. The collected spatio-temporal data capture the detected movement information of the tagged objects, offering tremendous opportunities for mining useful knowledge.
Procedia Computer Science 00 (2015) 1–5
www.elsevier.com/locate/procedia
Omaji Samuel^a, Akogwu Blessing O.^b
^a COMSATS Institute of Information Technology, Islamabad, Pakistan
^b SHESTCO, Sheda, Kwali, Abuja, Nigeria
© 2015 Published by Elsevier Ltd.
Keywords:
Intrusion Evidence, Trajectory, Network Forensics, Relative Identifier.
1. Introduction
Location-aware devices are used extensively in many networking systems, such as car navigation and IP
traceback. The collected spatio-temporal data capture the detected movement information of the tagged
objects, offering tremendous opportunities for mining useful knowledge. Mining the raw data would, in
turn, reveal information about the intruder. In this article, we analyze how evidence can be obtained from
trajectory logs. In Fig. 1, the double arrows indicate forward and backward movement of the attacker.
Fig. 1. A simple path visited by an attacker
There are many possible paths an attacker can visit. Suppose the attacker decides to launch an attack
on the database at R7. The attacker might first perform surveillance to identify the administrator's
device, either through social engineering or by other means. Suppose s(he) learns that the admin is at R1
and performs packet sniffing to obtain the session token used by the administrator; if successful, s(he)
acts as the admin. Alternatively, s(he) might brute-force the credentials and authenticate as the
administrator. Acting as the admin, the attacker can learn where the web server, file server and other
services and resources are located in the network. Determining the exact first device or point the
attacker visited depends on the timestamps.
2. Motivation
In recent years, there has been an explosive growth of location-aware devices such as RFID tags,
GPS-based devices, cell phones and PDAs. The use of these devices facilitates new and exciting
location-based applications that consequently generate huge collections of trajectory data. Recent
research reveals that these trajectory data can be used for various data analysis purposes to improve
current systems such as city traffic control, mobility management, urban planning and location-based
service advertisements. We apply this motivation to network forensics, tracing intruders' activities and
paths from trajectory evidence obtained from a honeynet, HIDS and NIDS.
3. Analysis
Attack Model on Trajectory Evidence Data
3.1. Preliminaries
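Definition 1 (Relative Identifier). Let RI denote the relative identifier.
Given a table $U$, the evidence table $T(C_1, \ldots, C_j)$, $F_1 : U \to T$ and $F_2 : T \to U'$, where
$U \subseteq U'$, the relative identifier $RI_T$ of $T$ is the multiset of attributes
$\{C_1, \ldots, C_i\} \subseteq \{C_1, \ldots, C_j\}$ for which there exists $t_i \in U$ such that
$F_2(F_1(t_i)[RI_T]) = t_i$.
Here, $RI = (\mathit{DetectTime}, \mathit{AuxiliaryValue})$. Hence RI is a set of attributes that could potentially correlate tables.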
Fig. 2. Evidence Knowledge Base (EKB) (subject to removal)
Definition 2 (Raw Alert Generation, $R(n)$). Let $P(A_1, \ldots, A_n)$ and $Q(B_1, \ldots, B_m)$ be two
evidence tables, and let $T = P \cup Q$. The union $T$ holds if
1. $t_i.P = t_i.Q$, where $t_i \in RI$ and $i > 0$. Therefore, at time $n \geq 1$, the raw alerts $R(n)$
contain all the tuples in $T$ at timestamps $1, \ldots, n$.
2. $R(n) = \bigcup_{j=1}^{n} T(j)$
The raw alerts $R(n)$ are sent to the MySQL database.
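As a minimal sketch of Definition 2, the following Python fragment assumes that the condition on RI means "rows of P and Q are merged when they agree on every RI attribute", and that T(j) is the set of merged tuples whose DetectTime equals j. The column names other than DetectTime and AuxiliaryValue, the sample rows, and the helper names (`ri_key`, `correlate`, `raw_alerts`) are hypothetical; the storage step in MySQL is not shown.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

# RI attributes named in Definition 1; every other column name below is hypothetical.
RI = ("DetectTime", "AuxiliaryValue")


def ri_key(row: Row) -> Tuple[object, ...]:
    """Project a row onto the RI attributes."""
    return tuple(row[attr] for attr in RI)


def correlate(p: List[Row], q: List[Row]) -> List[Row]:
    """Build T from the rows of P and Q that agree on every RI attribute (assumed reading)."""
    by_key: Dict[Tuple[object, ...], List[Row]] = {}
    for row in q:
        by_key.setdefault(ri_key(row), []).append(row)
    return [{**left, **right} for left in p for right in by_key.get(ri_key(left), [])]


def raw_alerts(t: List[Row], n: int) -> List[Row]:
    """R(n): every tuple of T whose DetectTime lies in 1..n."""
    return [row for j in range(1, n + 1) for row in t if row["DetectTime"] == j]


# Hypothetical rows, e.g. P from a NIDS and Q from a HIDS.
p = [
    {"DetectTime": 1, "AuxiliaryValue": "x", "Classification": "scan"},
    {"DetectTime": 2, "AuxiliaryValue": "y", "Classification": "login"},
]
q = [
    {"DetectTime": 1, "AuxiliaryValue": "x", "Directory": "/etc"},
    {"DetectTime": 2, "AuxiliaryValue": "y", "Directory": "/var"},
    {"DetectTime": 3, "AuxiliaryValue": "z", "Directory": "/tmp"},
]
t = correlate(p, q)
print(len(raw_alerts(t, 1)), len(raw_alerts(t, 2)))
```

Because R(n) accumulates T(1) through T(n), it can only grow as n increases; the row of Q at DetectTime 3 never enters T here because no row of P shares its RI values.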
Definition 3 (Evidence Knowledge Base, EKB). At time $n$, the intrusion table includes an evidence
knowledge base $EKB(n)$, which has a column $A_h$ named "HyperID", a column $A_l$ named "lifespan", and
all attributes of $R(n)$. For every tuple $t \in R(n)$, there is a row $b \in EKB(n)$ such that $b = t$.
Equivalently, $EKB(n)$ incorporates everything in $R(n)$ and the lifespan of each tuple in $EKB(n)$.
Definition 4 (Consequence). The consequence is a multiset of attributes in $R(n)$ that contains
(Classification, Auxiliary attributes, Directory).
We use an example to illustrate how an intrusion can be seen through trajectory analysis.
ID   Source IP         Paths                            Consequences
1    172.xxx.xxx.xxx   <O2 → G3 → D4 → K6 → D7 → S8>    types of damages
2    172.xxx.xxx.xxx   <K6 → D7 → S8>                   types of damages
3    172.xxx.xxx.xxx   <G3 → D4 → K6 → S8>              types of damages
4    172.xxx.xxx.xxx   <O2 → D5 → D7 → S8>              types of damages
5    172.xxx.xxx.xxx   <G3 → G7 → S8>                   types of damages
6    172.xxx.xxx.xxx   <D5 → K6 → S8>                   types of damages
7    172.xxx.xxx.xxx   <O2 → K6 → D7 → S8>              types of damages
8    172.xxx.xxx.xxx   <O2 → D5 → K6 → D7 → S8>         types of damages
9    172.xxx.xxx.xxx   <G7 → K6 → S8>                   types of damages
10   172.xxx.xxx.xxx   <O3 → G3 → K3 → S3>              types of damages
Table 1. Trajectory evidence data obtained from honeynet and IDS
We denote the sources of the evidence data obtained from the honeynet, NIDS and HIDS as O (OSSEC),
D (Dionaea), K (Kippo), G (Glastopf) and S (Snort).
Example
Suppose a trajectory evidence table contains all information about the visited locations (the source IP,
paths and consequences), where a path is a sequence of pairs $(loc_i t_i)$ indicating the intruder's
visited location $loc_i$ at time $t_i$.
For example, a path in Table 1 has the form $\langle DSTIP_1 a \to DSTIP_2 b \to \ldots \to DSTIP_n z \rangle$,
meaning that the intruder has visited $DSTIP_1$, $DSTIP_2$ and $DSTIP_n$ at times $a$, $b$ and $z$ respectively.
Without loss of generality, we assume that each record contains only one subsequence attribute. This
example can be used to illustrate the following:
1. Evidence Record Linkage: If a path in the evidence table is so specific that not many intruders'
records match it, then we can link the intruder with that trajectory data.
2. Evidence Attribute Linkage: If a subsequence of evidence occurs frequently together with some
sequence of pairs, then the subsequence information can be inferred from that sequence even though
the exact record of the intruder cannot be identified.
3. Evidence Data Sparseness: An intruder in a network or honeypot usually visits only a few locations
compared to all available locations, so the trajectory path is relatively short.
4. High Intrusion Dimensionality: Consider a network or honeypot with 10 services functioning 24 hours
a day. There are 10 × 24 = 240 possible combinations (dimensions) of locations and timestamps. Each
dimension could be a potential RI attribute, as in Definition 1, used for record and attribute
linkages.
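Evidence record linkage can be sketched on the paths of Table 1. The fragment below assumes that each letter-digit pair in a path is a (source, timestamp) pair and that a known sequence matches a record when its pairs appear in the record's path in the same order, with gaps allowed. The helper names (`parse_path`, `contains`, `matching_records`) are hypothetical.

```python
import re
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, int]  # (source letter, timestamp), an assumed reading of Table 1

# Paths of Table 1, keyed by record ID.
TABLE_1: Dict[int, str] = {
    1: "O2G3D4K6D7S8", 2: "K6D7S8", 3: "G3D4K6S8", 4: "O2D5D7S8",
    5: "G3G7S8", 6: "D5K6S8", 7: "O2K6D7S8", 8: "O2D5K6D7S8",
    9: "G7K6S8", 10: "O3G3K3S3",
}


def parse_path(text: str) -> List[Pair]:
    """Split 'O2G3D4' into [('O', 2), ('G', 3), ('D', 4)]."""
    return [(loc, int(t)) for loc, t in re.findall(r"([A-Z])(\d+)", text)]


def contains(path: Sequence[Pair], q: Sequence[Pair]) -> bool:
    """True if the pairs of q occur in path in the same order (gaps allowed)."""
    remaining = iter(path)
    return all(pair in remaining for pair in q)


def matching_records(q: Sequence[Pair]) -> List[int]:
    """IDs of the records whose path contains the known pairs q."""
    return [rid for rid, text in TABLE_1.items() if contains(parse_path(text), q)]


print(matching_records([("O", 2), ("D", 5)]))  # [4, 8]
print(matching_records([("G", 7), ("K", 6)]))  # [9]
```

The second query matches a single record, which is the situation item 1 describes: the known pairs are specific enough to link them to one trajectory.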
The trajectory evidence data can be used as the evidence knowledge base, as defined in Definition 3, to
perform evidence record or evidence attribute linkage. However, in real life it is difficult to reveal
and acquire all the visited locations and timestamps of the intruder, because it requires non-trivial
effort to gather each piece of evidence from scattered logs generated by different devices, systems or
sensors on the network at different times. Thus, it is reasonable to assume that the evidence gathered
about an intruder is bounded by at most $L$ pairs $(loc_i, t_i)$ that the intruder has visited.
Based on this assumption, the high dimensionality and the spatio-temporal nature of the data, the general
intuition is to ensure that every sequence $q$ with a maximum length $L$ of any path in the trajectory
evidence table (TET) is not shared by at least $k$ records in TET.
Definition 5 (Trajectory Evidence Table). A Trajectory Evidence Table TET is a collection of records
of the form $\langle loc_1 t_1 \to loc_2 t_2 \to \ldots \to loc_i t_i \rangle : s_1, s_2, \ldots, s_i : d_1, d_2, \ldots, d_m$,
where $\langle loc_1 t_1 \to loc_2 t_2 \to \ldots \to loc_i t_i \rangle$ is the path, $s_i \in S_i$ are the
consequence values (as in Definition 4), and $d_i \in D_i$ are the RI values (as in Definition 1) of an
intruder.
A pair $(loc_i t_i)$ represents the visited location $loc_i$ of an intruder at time $t_i$. An intruder
may revisit the same location at different times. At any one time, an intruder can be at only one
location, so $\langle a1 \to b1 \rangle$ is not a valid sequence, and timestamps in a path increase
monotonically.
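The validity constraint on a path can be sketched as a check that timestamps strictly increase. The fragment assumes that a path is a list of (location, timestamp) pairs and that the digit in each entry of Table 1 is the timestamp; `is_valid_path` is a hypothetical helper name.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, int]  # (location, timestamp)


def is_valid_path(path: Sequence[Pair]) -> bool:
    """A path is valid when every timestamp is strictly later than the previous one."""
    times = [t for _, t in path]
    return all(earlier < later for earlier, later in zip(times, times[1:]))


examples: Dict[str, List[Pair]] = {
    "a1 -> b1": [("a", 1), ("b", 1)],  # two locations at the same time
    "record 5": [("G", 3), ("G", 7), ("S", 8)],  # same location revisited later
    "record 9": [("G", 7), ("K", 6), ("S", 8)],
    "record 10": [("O", 3), ("G", 3), ("K", 3), ("S", 3)],
}
for name, path in examples.items():
    print(name, is_valid_path(path))
```

Under this reading, revisiting a location at a later time (record 5) is accepted, while the paths of records 9 and 10 as printed in Table 1 would not pass the check and would need to be reviewed before analysis.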
3.1.1. Usefulness of Trajectory Evidence Data
The measure of usefulness varies depending on the data mining task to be performed on the evidence
database. In this section, we aim at conserving the maximal frequent occurrences (MFO). An occurrence
$q = \langle loc_1 t_1 \to loc_2 t_2 \to \ldots \to loc_i t_i \rangle$ is an ordered set of locations. An
occurrence $q$ is frequent in a trajectory evidence table TET if $|TET(q)| \geq k'$, where $TET(q)$ is the
set of records containing $q$ and $k'$ is a minimum support threshold. Frequent occurrences (FO) capture
the major paths of the moving intruder and often form the information basis for different primitive data
mining tasks on the data (e.g., association rule mining). In the context of trajectories, association
rules can be used to determine the subsequent locations of the moving intruder given the previously
visited locations. This knowledge is important for traffic packet analysis.
FO are useful, yet mining all FO is a computationally expensive operation. When the data volume is large
and the FO are long, it is infeasible to identify all FO, because all subsequences of an FO are also
frequent. Since trajectory evidence data is high-dimensional and large in volume, a more feasible
solution is to conserve only the maximal frequent occurrences (MFO).
Definition 6 (Maximal Frequent Evidence, MFE). For a given minimum support threshold $k' > 0$, an
occurrence of evidence $x$ is maximal frequent in a trajectory evidence table TET if $x$ is frequent and
no longer occurrence of evidence that contains $x$ is frequent in TET.
The set of MFE, denoted by M(t), is much smaller than the set of FO in TET given the same $k'$. MFE
contains essential information for different kinds of evidence analysis. MFE captures the longest
frequently visited paths, as described earlier. Any subsequence of an MFE is also an FO. Once the MFE
are determined, the support counts of any particular FO can be computed by scanning the evidence data
table once.
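The relation between FO and MFE can be sketched with a brute-force enumeration. This is only an illustration under stated assumptions, not an efficient miner: it lists every ordered subsequence of every path, which is exponential in the path length. The three paths are those of records 4, 7 and 8 of Table 1 read as (source, timestamp) pairs; the threshold value and the helper names (`frequent_occurrences`, `is_subsequence`, `maximal`) are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[str, int]
Occurrence = Tuple[Pair, ...]


def frequent_occurrences(paths: List[List[Pair]], k_min: int) -> Dict[Occurrence, int]:
    """Support of every non-empty ordered subsequence that reaches k_min records."""
    support: Dict[Occurrence, int] = {}
    for path in paths:
        seen: Set[Occurrence] = set()
        for size in range(1, len(path) + 1):
            seen.update(combinations(path, size))  # order of the path is preserved
        for q in seen:  # each record counts at most once per occurrence
            support[q] = support.get(q, 0) + 1
    return {q: n for q, n in support.items() if n >= k_min}


def is_subsequence(q: Sequence[Pair], x: Sequence[Pair]) -> bool:
    """True if the pairs of q occur in x in the same order (gaps allowed)."""
    remaining = iter(x)
    return all(pair in remaining for pair in q)


def maximal(frequent: Dict[Occurrence, int]) -> Set[Occurrence]:
    """Frequent occurrences that are not contained in a longer frequent occurrence."""
    return {x for x in frequent
            if not any(x != y and is_subsequence(x, y) for y in frequent)}


paths = [
    [("O", 2), ("D", 5), ("D", 7), ("S", 8)],            # record 4
    [("O", 2), ("K", 6), ("D", 7), ("S", 8)],            # record 7
    [("O", 2), ("D", 5), ("K", 6), ("D", 7), ("S", 8)],  # record 8
]
frequent = frequent_occurrences(paths, k_min=2)
for occurrence in sorted(maximal(frequent)):
    print(occurrence, frequent[occurrence])
```

With a threshold of 2, the sketch finds 23 frequent occurrences but only two maximal ones (the full paths of records 4 and 7), and every other frequent occurrence is a subsequence of one of them.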
3.1.2. Identifying Violating Evidence Occurrence (IVEO)
We use occurrences of length not greater than $L$ as our evidence knowledge base to link attacks.
Thus, any non-empty occurrence $q$ with $|q| \leq L$ in TET is a violating evidence occurrence if its
group $T(q)$ does not satisfy the two definitions below.
Definition 7 (L-S-RevealIntrusion). Let $L$ be the maximum length of the evidence knowledge base. Let
$S$ be the set of Consequence (see Definition 4) values. A trajectory evidence table TET satisfies
L-S-RevealIntrusion if and only if, for any occurrence $q$ with $|q| \leq L$,
1. $|T(q)| \leq k$, where $k > 0$ is an integer reveal evidence threshold, and
2. $P(c \mid q) \geq C$ for any $c \in S$, where $0 \leq C \leq 1$ is a real-number confidence threshold.
Definition 8 (Violating Evidence Occurrence). Let $q$ be an occurrence of a path in TET with
$|q| \leq L$. Then $q$ is a violating evidence occurrence with respect to an L-S-RevealIntrusion
requirement if
1. $q$ is non-empty and
2. $|T(q)| > k$ or $P(c \mid q) < C$ for any consequence value $c \in S$.
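The two quantities that Definitions 7 and 8 compare against their thresholds, the group size |T(q)| and the confidence of a consequence value given q, can be sketched as follows. The fragment assumes that T(q) is the set of records whose path contains q in order and that the confidence is the fraction of T(q) carrying the consequence value c. The consequence labels are hypothetical, since Table 1 lists only "types of damages", and the helper names are hypothetical as well.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, int]
Record = Dict[str, object]


def contains(path: Sequence[Pair], q: Sequence[Pair]) -> bool:
    """True if the pairs of q occur in path in the same order (gaps allowed)."""
    remaining = iter(path)
    return all(pair in remaining for pair in q)


def group(records: List[Record], q: Sequence[Pair]) -> List[Record]:
    """T(q): the records whose path contains the occurrence q."""
    return [r for r in records if contains(r["path"], q)]


def support(records: List[Record], q: Sequence[Pair]) -> int:
    """|T(q)|."""
    return len(group(records, q))


def confidence(records: List[Record], q: Sequence[Pair], c: str) -> float:
    """Fraction of T(q) whose consequence equals c (0.0 when T(q) is empty)."""
    t_q = group(records, q)
    if not t_q:
        return 0.0
    return sum(1 for r in t_q if r["consequence"] == c) / len(t_q)


# Paths of records 4, 7 and 8 of Table 1 with hypothetical consequence labels.
records: List[Record] = [
    {"id": 4, "path": [("O", 2), ("D", 5), ("D", 7), ("S", 8)], "consequence": "data theft"},
    {"id": 7, "path": [("O", 2), ("K", 6), ("D", 7), ("S", 8)], "consequence": "defacement"},
    {"id": 8, "path": [("O", 2), ("D", 5), ("K", 6), ("D", 7), ("S", 8)], "consequence": "data theft"},
]
for q in ([("O", 2), ("D", 7)], [("O", 2), ("D", 5)]):
    print(q, support(records, q), confidence(records, q, "data theft"))
```

For the occurrence (O2, D7) the group holds all three records and two of them share the label, while the more specific occurrence (O2, D5) narrows the group to two records with a single label; these are the values a caller would compare against k and the confidence threshold.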
Example
show some example from the trajectory table
A trajectory evidence data table satisfies the L-S-RevealIntrusion requirement if all violating
evidence occurrences with respect to the reveal requirement are not removed (i.e., all possible channels
for record and attribute linkages are not removed).
The simplest approach is to list all possible violating evidence sequences and keep them constant.
Thus, the monotonic property with respect to $L$ holds in L-S-RevealIntrusion.
Definition 9 (Co-Localization of Trajectory Evidence). Two trajectories $r_1$ and $r_2$ defined in the
time interval $[t_1, t_n]$ co-locate, written $CoLoc_{t_1,t_n}(r_1, r_2)$, with respect to an uncertainty
threshold $\beta$ if and only if, for each point $(x_1, y_1, t)$ in $r_1$ and $(x_2, y_2, t)$ in $r_2$
with $t \in [t_1, t_n]$, it holds that $DST((x_1, y_1), (x_2, y_2)) \leq \beta$.
The distance can be any function that measures the distance between points, such as the Euclidean distance:
$DST((x_1, y_1), (x_2, y_2)) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
Intuitively, whether a trajectory is revealed depends on whether or not it shares a similar path with
other trajectories.
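Definition 9 can be sketched directly with the Euclidean distance. The fragment assumes that each trajectory is a list of (x, y, t) points and that both trajectories are sampled at the same timestamps; the coordinates and the function names (`dst`, `co_located`) are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, int]  # (x, y, t)


def dst(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def co_located(r1: List[Point], r2: List[Point], beta: float) -> bool:
    """True if r1 and r2 stay within distance beta of each other at every shared timestamp."""
    if [t for _, _, t in r1] != [t for _, _, t in r2]:
        raise ValueError("trajectories must be sampled at the same timestamps")
    return all(dst((x1, y1), (x2, y2)) <= beta
               for (x1, y1, _), (x2, y2, _) in zip(r1, r2))


r1 = [(0.0, 0.0, 1), (1.0, 0.0, 2)]
r2 = [(0.0, 1.0, 1), (1.0, 1.0, 2)]
print(co_located(r1, r2, beta=1.0), co_located(r1, r2, beta=0.5))
```

A single point farther apart than the threshold is enough for two trajectories not to co-locate, so the check stops at the first such timestamp.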