Proceedings of the 26th International Workshop on Principles of Diagnosis
Diagnosing Advanced Persistent Threats: A Position Paper
Rui Abreu and Danny Bobrow and Hoda Eldardiry and Alexander Feldman and
John Hanley and Tomonori Honda and Johan de Kleer and Alexandre Perez
Palo Alto Research Center
3333 Coyote Hill Rd
Palo Alto, CA 94304, USA
{rui,bobrow,hoda.eldardiry,afeldman,john.hanley,tomo.honda,dekleer,aperez}@parc.com
Dave Archer and David Burke
Galois, Inc.
421 SW 6th Avenue, Suite 300
Portland, OR 97204, USA
{dwa,davidb}@galois.com
Abstract
When a computer system is hacked, analyz-
ing the root-cause (for example entry-point
of penetration) is a diagnostic process. An
audit trail, as defined in the National Infor-
mation Assurance Glossary, is a security-
relevant chronological (set of) record(s),
and/or destination and source of records that
provide evidence of the sequence of activi-
ties that have affected, at any time, a specific
operation, procedure, or event. After detect-
ing an intrusion, system administrators man-
ually analyze audit trails to both isolate the
root-cause and perform damage impact as-
sessment of the attack. Due to the sheer vol-
ume of information and low-level activities
in the audit trails, this task is rather cum-
bersome and time intensive. In this posi-
tion paper, we discuss our ideas to automate
the analysis of audit trails using machine
learning and model-based reasoning tech-
niques. Our approach classifies audit trails
into the high-level activities they represent,
and then reasons about those activities and
their threat potential in real-time and foren-
sically. We argue that, by using the outcome
of this reasoning to explain complex evi-
dence of malicious behavior, we are equip-
ping system administrators with the proper
tools to promptly react to, stop, and mitigate
attacks.
1 Introduction
Today, enterprise system and network behaviors are
typically “opaque”: stakeholders lack the ability to as-
sert causal linkages in running code, except in very
simple cases. At best, event logs and audit trails can
offer some partial information on temporally and spa-
tially localized events as seen from the viewpoint of
individual applications. Thus current techniques give
operators little system-wide situational awareness and
no viewpoint informed by a long-term perspective.
Adversaries have taken advantage of this opacity by
adopting a strategy of persistent, low-observability
operation from inside the system, hiding effectively
through the use of long causal chains of system and
application code. We call such adversaries advanced
persistent threats, or APTs.
To address current limitations, this position pa-
per discusses a technique that aims to track causal-
ity across the enterprise and over extended periods of
time, identify subtle causal chains that represent ma-
licious behavior, localize the code at the roots of such
behavior, trace the effects of other malicious actions
descended from those roots, and make recommenda-
tions on how to mitigate those effects. By doing so,
the proposed approach aims to enable stakeholders to
understand and manage the activities going on in their
networks. The technique exploits both current and
novel forms of local causality to construct higher-level
observations of long-term causality in system informa-
tion flow. We propose a machine learning approach
to classify segments of low-level events by the activities
they represent, and to reason over these activities,
prioritizing candidates for investigation. The diagnostic
engine examines these candidates, looking for patterns
that may represent the pres-
ence of APTs. Using pre-defined security policies and
related mitigations, the approach explains discovered
APTs and recommends appropriate mitigations to op-
erators. We plan to leverage models of APT and nor-
mal business logic behavior to diagnose such threats.
Note that the technique is not constrained by the
availability of human analysts, but can benefit from
human-on-the-loop assistance.
The approach discussed in the paper will offer un-
precedented capability for observation of long-term,
subtle system-wide activity by automatically con-
structing such global, long-term causality observations.
The ability to automatically classify causal
chains of events in terms of abstractions such as ac-
tivities, will provide operators with a unique capabil-
ity to orient to long-term, system-wide evidence of
possible threats. The diagnostic engine will provide
a unique capability to identify whether groups of such
activities likely represent active threats, making it eas-
ier for operators to decide whether long-term threats
are active, and where they originate, even before those
threats are identified by other means. Thus, the ap-
proach will pave the way for the first automated, long-
horizon, continuously operating system-wide support
for an effective defender Observe, Orient, Decide, and
Act (OODA) loop.
2 Running Example
The methods proposed in this article are illustrated on
a realistic running example. The attackers in this ex-
ample use sophisticated and recently discovered ex-
ploits to gain access to the victim’s resources. The at-
tack is remote and does not require social engineering
or opening a malicious email attachment. The meth-
ods that we propose, however, are not limited to this
class of attacks.
Figure 1: Network topology for the attack. [Diagram: a
hacker on the Internet reaches, through a router, the
victim’s local network, which contains the web server
front-end, the data storage back end, and a system
administrator.]
The network topology used for our running example
is shown in figure 1. The attack is executed over sev-
eral days. It starts by (1) compromising the web server
front-end, followed by (2) a reconnaissance phase and
(3) compromising the data storage back end and ulti-
mately extracting and modifying sensitive information
belonging to the victim.
Both the front-end and the back end in this example
run an unpatched UBUNTU 13.1 LINUX OS on an
INTEL SANDY BRIDGE architecture.
What follows is a detailed chronology of the events:
1. The attacker uses the APACHE httpd server, a
cgi-bin script, and the SHELLSHOCK vulnerability
(a GNU bash exploit registered in the Common
Vulnerabilities and Exposures database as
CVE 2014-6271; see https://nvd.nist.gov/)
to gain remote shell access to the victim’s
front-end. It is now possible for the attacker to
execute processes on the front-end as the non-
privileged user www-data.
2. The attacker notices that the front-end is run-
ning an unpatched UBUNTU LINUX OS version
13.1. The attacker uses the nc Linux utility to
copy an exploit for obtaining root privileges. The
particular exploit that the attacker uses utilizes
the x32 recvmmsg() kernel vulnerability reg-
istered in the Common Vulnerabilities and Expo-
sures (CVE) database as CVE 2014-0038. After
running the copied binary for a few minutes the
attacker gains root access to the front-end host.
3. The attacker installs a root-kit utility that inter-
cepts all input to ssh;
4. A system administrator uses the compromised
ssh to connect to the back-end revealing his back-
end password to the attacker;
5. The attacker uses the compromised front-end to
bypass firewalls and uses the newly acquired
back-end administrator’s password to access the
back-end;
6. The attacker uses a file-tree traversing utility on
the back-end that collects sensitive data and con-
solidates it in an archive file;
7. The attacker sends the archive file to a third-party
hijacked computer for analysis.
3 Auditing and Instrumentation
Almost all sufficiently high-level computing systems
(with the exception of some embedded systems)
leave detailed logs of all system and application activ-
ities. Many UNIX variants such as LINUX log via the
syslog daemon, while WINDOWS uses the event
log service. In addition to the usual logging mecha-
nisms, there is a multitude of projects related to se-
cure and detailed auditing. An audit log is a more de-
tailed trail of any security or computation-related ac-
tivity, such as file or RAM access, system calls, etc.
Depending on the level of security we would like
to provide, there are several methods for collecting in-
put security-related information. On one extreme, it is
possible to use the existing log files. On the other ex-
treme there are applications for collecting detailed in-
formation about the application execution. One such
approach [1] runs the processes of interest through a
debugger and logs every memory read and write ac-
cess.
It is also possible to automatically inject logging
calls in the source files before compiling them, allow-
ing us to have static or dynamic logging or a combi-
nation of the two. Log and audit information can be
signed, encrypted and sent in real-time to a remote
server to make system tampering and activity-hiding
more difficult. All these configuration decisions im-
pose different trade-offs in security versus computa-
tional and RAM load [2] and depend on the organiza-
tional context.
...
front_end.secure_access_log: 11.239.64.213 - [22/Apr/2014
06:30:24 +0200] "GET /cgi-bin/test.cgi HTTP/1.1" 401 381
...
front_end.rsyslogd.log: recvmsg(3, msg_name(0) = NULL,
msg_iov(1) = ["29/Apr/2014 22:15:49 ...", 8096],
msg_controllen = 0, msg_flags = MSG_CTRUNC,
MSG_DONTWAIT) = 29
...
back_end:auditctl: type=SYSCALL msg=audit(1310392408.506:36):
arch=c000003e syscall=2 success=yes exit=3 a0=7fff2ce9471d
a1=0 a2=61f768 a3=7fff2ce92a20 items=1 ppid=20478 pid=21013
auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0
sgid=0 fsgid=0 ses=1 comm="grep" exe="/bin/grep"
...

Figure 2: Part of log files related to the attack from the
running example
Figure 2 shows part of the logs collected for our run-
ning example. The first entry is when the attacker ex-
ploits the SHELLSHOCK vulnerability through a CGI
script of the web server. The second entry shows a
syslog strace-like message resulting from the kernel
escalation. Finally, the attacker uses the grep com-
mand on the back-end server to search for sensitive
information and the call is recorded by the audit sys-
tem.
It is often the case that the raw system and secu-
rity log files are preprocessed and initial causal links
are computed. If we trace the exec, fork, and
join POSIX system calls, for example, it is possi-
ble to add a graph-like structure to the log files,
computing provenance graphs. Another method for comput-
ing local causal links is to consider shared resources,
e.g., two threads reading and writing the same memory
address [1].
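For illustration, the fork/exec provenance graph described above could be built as follows. This is a hedged sketch, not part of the proposed system: the event tuples, the process names, and the `ancestors` helper are hypothetical simplifications of what a log preprocessor might emit.

```python
# Sketch: building a provenance graph from fork/exec events extracted
# from audit trails (hypothetical simplified event tuples, stdlib only).
from collections import defaultdict

def build_provenance(events):
    """events: iterable of (syscall, parent, child) tuples."""
    graph = defaultdict(set)  # parent -> set of (syscall, child)
    for syscall, parent, child in events:
        if syscall in ("fork", "exec"):
            graph[parent].add((syscall, child))
    return graph

def ancestors(graph, node):
    """All processes upstream of `node` -- candidate root causes."""
    result = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for parent, edges in graph.items():
            for _, child in edges:
                if child == current and parent not in result:
                    result.add(parent)
                    frontier.append(parent)
    return result

# Hypothetical events loosely mirroring the running example:
events = [
    ("fork", "httpd", "bash"),
    ("exec", "bash", "nc"),
    ("fork", "nc", "exploit"),
]
g = build_provenance(events)
```

Walking `ancestors(g, "exploit")` back to `httpd` illustrates how a provenance graph lets the analyst trace a suspicious process to its entry point.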
4 Activity Classification
The Activity Classifier continuously annotates audit
trails with semantic tags describing the higher-order
activity they represent. For example, ‘remote shell ac-
cess’, ‘remote file overwrite’, and ‘intra-network data
query’ are possible activity tags. These tags are used
by the APT Diagnostics Engine to enable higher-order
reasoning about related activities, and to prioritize ac-
tivities for possible investigation.
4.1 Hierarchical semantic annotation of
audit trails
A key challenge in abstracting low-level events into
higher-order activity patterns that can be reasoned
about efficiently is that such patterns can be described
at multiple levels of semantic abstraction, all of which
may be useful in threat analysis. Indeed, higher-order
abstractions may be composed of lower-order abstrac-
tions that are in turn abstractions of low-level events.
For example, a sequential set of logged events such as
‘browser forking bash’, ‘bash initiating Netcat’, and
‘Netcat listening to new port’, might be abstracted as
the activity ‘remote shell access’. The set of activities,
‘remote shell access’, and ‘escalation of privilege’ can
be abstracted as the activity ‘remote root shell access’.
We approach activity annotation as a supervised
learning problem that uses classification techniques to
generate activity tags for audit trails. Table 1 shows
multiple levels of activity classifications for the above
APT example.
Table 1 represents one possible classification-
enriched audit trail for such an APT. There can be
many relatively small variations. For example, ob-
scuring the password file could be done using other
programs. A single classifier only allows for a single
level of abstraction, and a single leap from low-level
events to very abstract activities (for example, from
‘bash execute perl’ level to ‘extracting modified file’
level) will have higher error caused by these additional
variations.
To obtain several layers of abstraction for reason-
ing over, and thus reduce overall error in classifica-
tion, we use a multi-level learning strategy that models
information at multiple levels of semantic abstraction
using multiple classifiers. Each classifier solves the
problem at one abstraction level, by mapping from a
lower-level (fine) feature space to the next higher-level
conceptual (coarse) feature space.
The activity classifier relies on both a vocabulary of
activities and a library of patterns describing these ac-
tivities, both initially defined manually. This vocabulary
and pattern set reside in a Knowledge Base.
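To make the vocabulary-and-pattern idea concrete, the following sketch (an illustrative assumption, not the actual Knowledge Base or classifier) tags event sequences at two levels of abstraction by ordered-subsequence matching; the pattern strings paraphrase the 'remote shell access' example above.

```python
# Sketch: a two-level pattern library, as might be stored in a
# Knowledge Base, applied by ordered-subsequence matching.
LEVEL1_PATTERNS = {
    ("browser fork bash", "bash fork netcat", "netcat listen"):
        "remote shell access",
    ("netcat receive binary", "binary overwrites library"):
        "remote file overwrite",
}
LEVEL2_PATTERNS = {
    ("remote shell access", "escalation of privilege"):
        "remote root shell access",
}

def annotate(sequence, patterns):
    """Tag a sequence with every pattern it contains as an
    ordered subsequence."""
    tags = []
    for pattern, tag in patterns.items():
        it = iter(sequence)  # shared iterator enforces ordering
        if all(any(p == e for e in it) for p in pattern):
            tags.append(tag)
    return tags

low = ["browser fork bash", "bash fork netcat", "netcat listen"]
activities = annotate(low, LEVEL1_PATTERNS)      # level-1 tags
activities.append("escalation of privilege")     # e.g. from another segment
high = annotate(activities, LEVEL2_PATTERNS)     # level-2 tags
```

In the proposed system each level would be a learned classifier rather than an exact matcher; the sketch only shows how level-1 outputs become level-2 inputs.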
In our training approach, results from training lower
level classifiers are used as training data for higher
level classifiers. In this way, we coherently train all
classifiers by preventing higher-level classifiers from
being trained with patterns that will never be gener-
ated by their lower-level precursors. We use an ensem-
ble learning approach to achieve accurate classifica-
tion. This involves stacking together both bagged and
boosted models to reduce both variance and bias er-
ror components [3]. The classification algorithm will
be trained using an online-learning technique and in-
tegrated within an Active Learning Framework to im-
prove classification of atypical behaviors.
Generating Training Data for Classification To
build the initial classifier, training data is generated
using two methods. First, an actual deployed sys-
tem is used to collect normal behavior data, and a
Subject Matter Expert manually labels it. Second,
a testing platform is used to generate data in a con-
trolled environment, particularly platform dependent
vulnerability-related behavior. In addition, to gener-
ate new training data of previously unknown behavior,
we use an Active Learning framework as described in
Section 5.
Table 1: Sample classification problem for running example

Activity 1                  | Activity 2                      | Activity 3
Remote Shell Access         | Remote File Overwrite           | Modified File Download
Shell Shock                 | Trojan Installation             | Password Exfiltration
Browser (Port 80) fork bash | Netcat listen to Port 8443      | Netcat listen to Port 8443
bash fork Netcat            | Port 8443 receive binary file   | Port 8443 fork bash
Netcat listen to port 8080  | binary file overwrites libns.so | bash execute perl
                            |                                 | Perl overwrite /tmp/stolen_pw
                            |                                 | Port 8443 send /tmp/stolen_pw
5 Prioritizer
As the Activity Classifier annotates audit trails with
activity descriptors, the two (parallel) next steps in our
workflow are to 1) prioritize potential threats to be re-
ferred to the Diagnostic Engine (see Section 6) for in-
vestigation, and 2) prioritize emergent activities that
(after suitable review and labeling) are added to the ac-
tivity classifier training data. This module prioritizes
activities by threat severity and confidence level. This
prioritization process presents three key challenges.
5.1 Threat-based rank-annotation of
activities
One challenge in ranking activities according to their
threat potential is the complex (and dynamic) notion of
what constitutes a threat. Rankings based on matching
to known prior threats are necessary, but not sufficient.
An ideal ranking approach should take known threats
into account, while also proactively considering the
unknown threat potential of new kinds of activities.
Another such challenge is that risk may be assessed
at various levels of activity abstraction, requiring that
overall ranking must be computed by aggregating risk
assessments at multiple abstraction levels.
We implement two ranking approaches: a super-
vised ranker based on previously known threats and an
unsupervised ranker that considers unknown potential
threats.
Supervised ranking using APT classification to
catch known threats. The goal of APT classifica-
tion is to provide the diagnostic engine with critical
APT related information such as APT Phase, severity
of attack, and confidence level associated with APT
tagging for threat prioritization. Since the audit trails
are annotated hierarchically into different granularity
of actions, multiple classifiers will be built to consider
each hierarchical level separately. APT classifiers are
used to identify entities that are likely to be instances
of known threats or phases of an APT attack. Two
types of classifiers are used. The first classifier is
hand-coded and the second classifier is learned from
training data.
The hand-coded classifier is designed to have high
precision, using hand-coded rules, mirroring SIEM
and IDS systems. Entities tagged by this classifier are
given the highest priority for investigation. The second
classifier, which is learned from training data, will pro-
vide higher recall at the cost of precision. Activities
are ranked according to their threat level by aggregat-
ing a severity measure (determined by classified threat
type) and a confidence measure. We complement the
initial set of training data to calibrate our classifiers by
using an Active Learning Framework, which focuses
on improving the classification algorithm through oc-
casional manual labeling of the most critical activities
in the audit trails.
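As a minimal illustration of aggregating a severity measure and a confidence measure into a single threat ranking, consider the sketch below; the product weighting and the example scores are assumptions, not the calibrated scheme envisioned above.

```python
# Sketch: ranking activities by aggregated (severity x confidence) risk.
# The weighting is a placeholder assumption, not a calibrated policy.
def prioritize(activities):
    """activities: list of (name, severity in [0,1], confidence in [0,1]).
    Returns the list sorted from highest to lowest priority."""
    return sorted(activities, key=lambda a: a[1] * a[2], reverse=True)

# Hypothetical scores for activity tags from the running example:
queue = prioritize([
    ("intra-network data query", 0.4, 0.9),
    ("remote root shell access", 0.9, 0.7),
    ("remote file overwrite", 0.8, 0.3),
])
```

A hand-coded rule hit could simply be assigned confidence 1.0 so that it sorts ahead of learned-classifier hits, matching the stated precedence of the high-precision classifier.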
Unsupervised ranking using normalcy charac-
terization to catch unknown threats. The second
component of the prioritizer is a set of unsupervised
normalcy rankers, which rank entities based on their
statistical “normalcy”. Activities identified as un-
usual will be fed to the Active Learning framework
to check if any of them are “unknown” APT activities.
This provides a mechanism for detecting “unknown”
threats while also providing feedback to improve the
APT classifier.
5.2 Combining Multiple Rankings
One of the key issues with combining the outputs of
multiple risk rankings is dealing with two-dimensional
risk (severity, confidence) scores that may be on very
different scales. A diverse set of score normalization
techniques has been proposed [4; 5; 6] to deal with
this issue, but no single technique has been found to
be superior over all the others. An alternative to com-
bining scores is to combine rankings [7]. Although
converting scores to rankings does lose information, it
remains an open question whether the loss of information is
compensated for by the convenience of working with
the common scale of rankings.
We will develop combination techniques for
weighted risk rankings based on probabilistic rank ag-
gregation methods. This approach builds on our own
work [8] that shows the robustness of the weighted
ranking approach. We also build on principled meth-
ods for combining ranking data found in the statistics
and information retrieval literature.
Traditionally, the goal of rank aggregation [9; 10]
is to combine a set of rankings of the same candi-
dates into a single consensus ranking that is “better”
than the individual rankings. We extend the tradi-
tional approach to accommodate the specific context
of weighted risk ranking. First, unreliable rankers will
be identified and either ignored or down-weighted,
lest their rankings decrease the quality of the over-
all consensus [7; 10]. Second, we will discount ex-
cessive correlation among rankers, so that a set of
highly redundant rankers does not completely outweigh
the contribution of other alternative rankings. To ad-
dress these two issues, we will associate a probabilis-
tic latent variable Z_i with the i-th entity of interest,
which indicates whether the entity is anomalous or
normal. Then, we will build a probabilistic model
that allows us to infer the posterior distribution over
the Z_i based on the observed rankings produced by
each of the input weighted risk rankings. This poste-
rior probability of Z_i being normal will then be used
as the weighted risk rank. Our model will make the
following assumptions to account for both unreliable
and correlated rankers: 1) Anomalies are ranked lower
than all normal instances and these ranks tend to be
concentrated near the lower rankings of the provided
weighted risk rankings, and 2) Normal data instances
tend to be uniformly distributed near the higher rank-
ings of the weighted risk rankings.
There are various ways to build a probabilistic
model that reflects the above assumptions and al-
lows for the inference of the Z_i variables through
Expectation-Maximization [11]. In addition to these
assumptions, we will explore allowing other factors to
influence the latent Z_i variables, such as features of
the entities as well as feedback provided by expert
analysts.
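Before committing to the full probabilistic model, a simple reliability-weighted rank-averaging baseline conveys the intent of down-weighting unreliable rankers. The sketch below is only that baseline, not the Expectation-Maximization inference over the latent Z_i; the ranker weights and entity names are hypothetical.

```python
# Sketch: reliability-weighted consensus of risk rankings (a baseline
# that the proposed probabilistic rank-aggregation model would refine).
def consensus(rankings, weights):
    """rankings: list of lists, each ordering entity ids from most risky
    to least risky; weights: per-ranker reliability in [0, 1]."""
    entities = set().union(*map(set, rankings))
    score = {e: 0.0 for e in entities}
    for ranking, w in zip(rankings, weights):
        for position, e in enumerate(ranking):
            score[e] += w * position  # lower total position = riskier
    return sorted(entities, key=lambda e: score[e])

rankers = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
order = consensus(rankers, weights=[1.0, 1.0, 0.2])  # third ranker down-weighted
```

Down-weighting the third, disagreeing ranker keeps the two reliable rankers in control of the consensus, which is exactly the behavior the identified-unreliable-ranker requirement asks for.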
6 Diagnosis
We view the problem of detecting, isolating, and ex-
plaining complex APT campaign behavior from rich
activity data as a diagnostic problem. We will use
AI-based diagnostic reasoning to guide the global
search for possible vulnerabilities that enabled the
breach. Model-based diagnosis (MBD) [12] is a par-
ticularly compelling approach as it supports reasoning
over complex causal networks (for example, having
multiple conjunctions, disjunctions, and negations)
and identifies often subtle combinations of root causes
of the symptoms (the breach).
6.1 An MBD approach for APT detection
and isolation: Motivation
Attack detection and isolation are two distinct chal-
lenges. Often diagnostic approaches use separate
models for detection and isolation [13]. MBD, how-
ever, uses a single model to combine the two kinds of
reasoning. The security model contains both part of
the security policy (e.g., that communicating with certain
blacklisted hosts may indicate an information leak)
and information about the possible locations and con-
sequence of a vulnerability (a privilege escalation may
lead to an information leak). The security model also
contains abstract security constraints such as if a pro-
cess requires authentication, a password must be read
and compared against.
The diagnostic approach takes into consideration
the bootstrapping of an APT which we consider the
root-cause of the attack. What enables a successful
APT is either a combination of software component
vulnerabilities or the combined use of social engineer-
ing and insufficiency of the organizational security
policies. We use MBD for computing the set of si-
multaneously exploited vulnerabilities that allowed the
deployment of the APT. Computing such explanations
is possible because MBD reasons in terms of multiple-
faults [14]. In our running example this set would in-
clude both the fact that the web server has been ex-
ploited due to the Shellshock vulnerability and that
the attacker gained privileged access on the front-end
due to the use of the x32 escalation vulnerability.
The abstract security model is used to gather infor-
mation about types of attacks the system is vulnerable
to, and to aid deciding the set of actions required to
stop an APT campaign (policy enforcement). Various
heuristics exist to find the set of meaningful diagnosis
candidates. As an example, one might be interested
in the minimal set of actions to stop the attack [15;
16] or select those candidates that capture significant
probability mass [17]. In the rest of this section, for
illustration purposes, we use minimality as the heuris-
tic of interest. MBD is the right tool for dealing with
computation of diagnosis candidates as it offers sev-
eral ways to address the modeling and computational
complexity [18; 19].
6.2 Detection and Isolation of Attacks from
Abstract Security Model and Sensor
Data
The abstract security model provides an abstraction
mechanism that is originally missing in the audit trails.
More precisely, what is not in the audit trails but is
in the security model is how to connect (possibly
disconnected) activities for the purpose of global rea-
soning. The abstract security model and the sensor
data collected from the audit trails are provided as in-
puts to an MBD algorithm that performs the high-
level reasoning about possible vulnerabilities and at-
tacks similar to what a human security officer would
do.
The information in the “raw” audit trails is of too
high fidelity [2] and low abstraction to be used by a
“crude” security model. That is the reason the diag-
nostic engine needs the machine learning module to
temporally and spatially group nodes in the audit trails
and to provide semantically rich variable/value sensor
data about actions, suitable for MBD. Notice that in
this process, the audit trail structure is translated to se-
mantic categories, i.e., the diagnostic engine receives
as observations time-series of sensed actions.
The listing that follows next shows an abstract se-
curity model for the running example in the LYDIA
language [20]. This bears some resemblance to PROLOG,
except that LYDIA is a language for model-
based diagnosis of logical circuits while PROLOG is
for Horn-style reasoning. The use of LYDIA is for
illustration purposes only; in reality, computer sys-
tems can be much more easily modeled as state ma-
chines. There is a significant body of literature deal-
ing with diagnosis of discrete-event systems [21; 22;
23], to name just a few.
system front_end(bool know_root_password)
{
    bool httpd_shell_vuln;       // vulnerability
    bool buffer_overflow_vuln;   // vulnerability
    bool escalation_vuln;        // vulnerability

    bool httpd_shell;
    bool root_shell;
    bool leak_passwd;

    // weak fault models
    if (!httpd_shell_vuln) {     // if healthy
        !httpd_shell;            // forbid shells via httpd
    }

    if (!escalation_vuln) {      // if healthy
        !root_shell;             // no root shell is possible
    }

    if (!buffer_overflow_vuln) { // if healthy
        !leak_passwd;            // passwords don't leak
    }

    bool access_passwd;
    attribute observable(access_passwd) = true;

    !access_passwd => !leak_passwd;

    /**
     * Knowing the root password can be explained
     * by a root shell (for example there is a
     * password sniffer).
     */
    know_root_password =>
        ((httpd_shell || leak_passwd) && root_shell);
}

system back_end(bool know_root_password)
{
    bool comm;
    attribute observable(comm) = true;

    /**
     * Normal users can only communicate with a
     * list of permitted hosts.
     */
    if (!know_root_password) {
        comm == true;
    }
}

system main()
{
    bool know_root_password;

    system front_end fe(know_root_password);
    system back_end be(know_root_password);
}
LYDIA translates the model to an internal proposi-
tional logic formula. Part of this internal representa-
tion is shown in figure 3, which uses the standard VLSI
[24] notation to denote AND-gates, OR-gates, and
NOT-gates. Wires are labeled with variable names.
Boolean circuits (matching propositional logic), how-
ever, have limited expressiveness, and modeling secu-
rity constraints in them is notoriously difficult; hence we
plan to create or use a specialized modal logic similar to
the one proposed in [25].

Figure 3: Part of the abstract security model for the
running example. [Circuit diagram: AND/OR gates
connecting wires labeled know_root_password, root_shell,
httpd_shell, leak_pw, and buffer_overflow_vuln, with
internal variables p, q, and r.]
Notice that the format of the Boolean circuit shown
in figure 3 is very close to the one used in Truth Main-
tenance System (TMS) [26]. The only assumable vari-
able in figure 3 is buffer_overflow_vuln and its
default value is false (i.e., there is no buffer overflow
vulnerability in the web server process).
We next show how a reasoning engine can discover
a conflict through forward and backward propagation.
Looking at figure 3, it is clear that r must be true be-
cause it is an input to an AND-gate whose output is set
to true. Therefore either p or q (or both) must be true.
This means that either buffer_overflow_vuln or
leak_pw must be false. If we say that leak_pw is
assumed to be true (measured or otherwise inferred),
then leak_pw and buffer_overflow_vuln are to-
gether part of a conflict. It means that the reasoning
engine has to change one of them to resolve the con-
tradiction.
Based on the observation from our running exam-
ple and a TMS constructed from the security model
shown in figure 3, the hitting set algorithm computes
two possible diagnostic hypotheses: (1) the attacker
gained a shell access through a web-server vulnerabil-
ity and the attacker performed privilege escalation or
(2) the attacker injected binary code through a buffer
overflow and the attacker performed privilege escala-
tion.
If we use LYDIA to compute the set of diagnoses for
the running example, we get the following two (am-
biguous) diagnoses for the root-cause of the penetra-
tion:
$ lydia example.lm example.obs
d1 = { fe.escalation_vuln,
fe.httpd_shell_vuln }
d2 = { fe.buffer_overflow_vuln,
fe.escalation_vuln }
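The hitting-set computation behind these two diagnoses can be sketched by brute force. The conflict sets below are hand-written to mirror the running example (a shell-or-password-leak conflict and an escalation conflict); they are assumptions for illustration, not output of the actual TMS or LYDIA.

```python
# Sketch: Reiter-style diagnosis as minimal hitting sets over conflict
# sets, computed by brute-force enumeration (illustration only).
from itertools import chain, combinations

def minimal_hitting_sets(conflicts):
    """Each conflict is a set of components that cannot all be healthy;
    a diagnosis is a minimal set intersecting every conflict."""
    universe = sorted(set(chain.from_iterable(conflicts)))
    hits = []
    for r in range(1, len(universe) + 1):
        for cand in combinations(universe, r):
            s = set(cand)
            if all(s & c for c in conflicts) and \
               not any(h <= s for h in hits):  # keep only minimal sets
                hits.append(s)
    return hits

# Hand-written conflicts mirroring the running example:
conflicts = [
    {"httpd_shell_vuln", "buffer_overflow_vuln"},  # shell or password leak
    {"escalation_vuln"},                           # root access needed
]
diagnoses = minimal_hitting_sets(conflicts)
```

Under these assumed conflicts the two minimal hitting sets correspond to the two ambiguous diagnoses d1 and d2 above.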
MBD uses probabilities to compute a sequence of
possible diagnoses ordered by likelihood. These proba-
bilities can be used for many purposes: to decide which
diagnosis is more likely to be the true fault explanation,
to decide whether further evidence from the logs should
be considered, or to limit the number of diagnoses that
need to be identified. Many policies exist to compute
these probabilities [27; 28].
For illustration purposes we consider that the diagnoses
for the running example are ambiguous. Before
we discuss methods for dealing with this ambiguity,
we address the major research challenge of model gen-
eration.
6.3 Model Generation
The abstract vulnerability model can be constructed either manually or semi-automatically. The challenge with modeling is that an APT campaign generally exploits unknown vulnerabilities. Our approach to this issue is therefore to construct a model that captures the expected behavior (known goods) of the system. Starting from generic parameterized vulnerability models and security objectives, the abstract vulnerability model can be extended with information related to known vulnerabilities (known bads).
We will explore avenues to generate this model manually, which requires significant knowledge about potential security vulnerabilities, is error prone, and may not yield sufficient detail. Beyond company-specific requirements, we envisage the abstract vulnerability model capturing the most common attacks that target software systems, as described in the Common Attack Pattern Enumeration and Classification (CAPEC1). This comprehensive list of known attacks has been designed to better understand the perspective of an attacker exploiting the vulnerabilities and, from this knowledge, to devise appropriate defenses.
As modeling is challenging, we propose to explore semi-automatic approaches to constructing models. A semi-automatic method is well suited to this modeling task because security, like diagnosis, involves both (1) component models and (2) structure. While it is difficult to automate the building of component models (this may even require natural-language parsing of databases such as CAPEC), it is feasible to capture diagnosis-oriented information from structure (physical networking or network communication).
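As a minimal illustration of exploiting structure, the sketch below derives a model skeleton from a hypothetical network topology. The topology, the host names, and the emitted component/connect pseudo-syntax are all assumptions for illustration; they are not LYDIA syntax and not part of the paper:

```python
# Hypothetical topology: observed (source -> targets) communication edges.
topology = {
    "web-server": ["app-server"],
    "app-server": ["database"],
}

def model_skeleton(topology):
    # Emit one weak-fault component per host and one connection per
    # observed communication edge; component models would be filled in
    # manually or from a database such as CAPEC.
    lines = []
    hosts = set(topology) | {h for tgts in topology.values() for h in tgts}
    for h in sorted(hosts):
        lines.append(f"component {h} {{ bool healthy; }}")
    for src, targets in sorted(topology.items()):
        for dst in targets:
            lines.append(f"connect {src} -> {dst};")
    return "\n".join(lines)

print(model_skeleton(topology))
```

The point of the sketch is the division of labor: structure comes cheaply from observed communication, while the per-component behavior remains the hard, knowledge-intensive part.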
Yet another approach to semi-automatically generating the model is to learn it from executions of the system (e.g., during regression testing, just before deployment). This approach to system modeling is inspired by work in automatic software debugging [29], where program behavior is modeled in terms of abstractions of program traces, known as spectra [30], abstracting away from modeling specific components and data dependencies.
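A minimal sketch of the spectrum idea: given hit spectra of executions (which components each execution touched) and a per-execution flag indicating whether a security objective was violated, a similarity coefficient such as Ochiai [29] ranks components by their correlation with failing executions. The matrix below is illustrative:

```python
import math

# Hit-spectra matrix: rows are executions, columns are components;
# errs[i] flags whether execution i violated a security objective.
spectra = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
errs = [1, 0, 0, 1]

def ochiai(spectra, errs):
    scores = []
    total_fail = sum(errs)
    for j in range(len(spectra[0])):
        # n11: component j active in failing runs; n10: in passing runs.
        n11 = sum(1 for i, row in enumerate(spectra) if row[j] and errs[i])
        n10 = sum(1 for i, row in enumerate(spectra) if row[j] and not errs[i])
        denom = math.sqrt(total_fail * (n11 + n10))
        scores.append(n11 / denom if denom else 0.0)
    return scores

print(ochiai(spectra, errs))
```

Components whose activity coincides with violations score highest, giving a learned, model-free ranking that can seed the abstract vulnerability model.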
The outlined approaches to constructing the abstract vulnerability model entail different costs and diagnostic accuracies. As expected, manually building the model is the most expensive: it is a time-consuming and error-prone task. The two semi-automatic approaches also differ in cost: one exploits the available static information, while the other requires the system to be executed to compute a meaningful set of executions. We will investigate the trade-offs between modeling approaches and their diagnostic accuracy in the context of transparent computing.

1 http://capec.mitre.org/
7 Conclusions
Identifying the root-cause of advanced persistent threats and performing damage impact assessment can be framed as a diagnostic problem. In this paper, we discuss an approach that leverages machine learning and model-based diagnosis techniques to reason about potential attacks.
Our approach classifies audit trails into high-level activities, and then reasons about those activities and their threat potential in real-time and forensically. By using the outcome of this reasoning to explain complex evidence of malicious behavior, system administrators are provided with the proper tools to promptly react to, stop, and mitigate attacks.
References
[1] Kyu Hyung Lee, Xiangyu Zhang, and Dongyan Xu. High accuracy attack provenance via binary-based execution partition. In Proceedings of the 20th Annual Network and Distributed System Security Symposium, San Diego, CA, February 2013.
[2] Kyu Hyung Lee, Xiangyu Zhang, and Dongyan Xu. LogGC: Garbage collecting audit log. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, pages 1005–1016, Berlin, Germany, 2013.
[3] M. Galar, A. Fernández, E. Barrenechea, H. Bustince, and F. Herrera. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 42(4):463–484, July 2012.
[4] Charu C. Aggarwal. Outlier ensembles: Position paper. ACM SIGKDD Explorations Newsletter, 14(2):49–58, 2013.
[5] Jing Gao and Pang-Ning Tan. Converting output scores from outlier detection algorithms into probability estimates. In Proceedings of the Sixth International Conference on Data Mining, pages 212–221. IEEE, December 2006.
[6] Hans-Peter Kriegel, Peer Kröger, Erich Schubert, and Arthur Zimek. Interpreting and unifying outlier scores. In Proceedings of the Eleventh SIAM International Conference on Data Mining, pages 13–24, April 2011.
[7] Erich Schubert, Remigius Wojdanowski, Arthur Zimek, and Hans-Peter Kriegel. On evaluation of outlier rankings and outlier scores. In Proceedings of the Twelfth SIAM International Conference on Data Mining, pages 1047–1058, April 2012.
[8] Hoda Eldardiry, Kumar Sricharan, Juan Liu, John Hanley, Bob Price, Oliver Brdiczka, and Eugene Bart. Multi-source fusion for anomaly detection: using across-domain and across-time peer-group consistency checks. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA), 5(2):39–58, 2014.
[9] Yoav Freund, Raj D. Iyer, Robert E. Schapire, and Yoram Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4(Nov):933–969, 2003.
[10] Ke Deng, Simeng Han, Kate J. Li, and Jun S. Liu. Bayesian aggregation of order-based rank data. Journal of the American Statistical Association, 109(507):1023–1039, 2014.
[11] Arthur P. Dempster, Nan M. Laird, and Donald B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.
[12] Johan de Kleer, Olivier Raiman, and Mark Shirley. One step lookahead is pretty good. In Readings in Model-Based Diagnosis, pages 138–142. Morgan Kaufmann Publishers, San Francisco, CA, 1992.
[13] Alexander Feldman, Tolga Kurtoglu, Sriram Narasimhan, Scott Poll, David Garcia, Johan de Kleer, Lukas Kuhn, and Arjan van Gemund. Empirical evaluation of diagnostic algorithm performance using a generic framework. International Journal of Prognostics and Health Management, pages 1–28, 2010.
[14] Johan de Kleer and Brian Williams. Diagnosing multiple faults. Artificial Intelligence, 32(1):97–130, 1987.
[15] Oleg Sheyner, Joshua Haines, Somesh Jha, Richard Lippmann, and Jeannette M. Wing. Automated generation and analysis of attack graphs. In Proceedings of the 2002 IEEE Symposium on Security and Privacy, pages 273–284. IEEE, May 2002.
[16] Seyit Ahmet Camtepe and Bülent Yener. Modeling and detection of complex attacks. In Proceedings of the Third International Conference on Security and Privacy in Communications Networks, pages 234–243, September 2007.
[17] Rui Abreu and Arjan J. C. van Gemund. A low-cost approximate minimal hitting set algorithm and its application to model-based diagnosis. In Proceedings of the Eighth Symposium on Abstraction, Reformulation and Approximation, pages 2–9, July 2009.
[18] Alexander Feldman, Gregory Provan, and Arjan van Gemund. Approximate model-based diagnosis using greedy stochastic search. Journal of Artificial Intelligence Research, 38:371–413, 2010.
[19] Nuno Cardoso and Rui Abreu. A distributed approach to diagnosis candidate generation. In Progress in Artificial Intelligence, pages 175–186. Springer, 2013.
[20] Alexander Feldman, Jurryt Pietersma, and Arjan van Gemund. All roads lead to fault diagnosis: Model-based reasoning with LYDIA. In Proceedings of the Eighteenth Belgium-Netherlands Conference on Artificial Intelligence (BNAIC'06), Namur, Belgium, October 2006.
[21] Meera Sampath, Raja Sengupta, Stephane Lafortune, Kasim Sinnamohideen, and Demosthenis C. Teneketzis. Failure diagnosis using discrete-event models. IEEE Transactions on Control Systems Technology, 4(2):105–124, 1996.
[22] Alban Grastien, Marie-Odile Cordier, and Christine Largouët. Incremental diagnosis of discrete-event systems. In DX, 2005.
[23] Alban Grastien, Patrik Haslum, and Sylvie Thiébaux. Conflict-based diagnosis of discrete event systems: theory and practice. 2012.
[24] Behrooz Parhami. Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press, Inc., New York, NY, USA, 2nd edition, 2009.
[25] Janice Glasgow, Glenn Macewen, and Prakash Panangaden. A logic for reasoning about security. ACM Transactions on Computer Systems, 10(3):226–264, August 1992.
[26] Kenneth Forbus and Johan de Kleer. Building Problem Solvers. MIT Press, 1993.
[27] Johan de Kleer. Diagnosing multiple persistent and intermittent faults. In Proceedings of the 2009 International Joint Conference on Artificial Intelligence, pages 733–738, July 2009.
[28] Rui Abreu, Peter Zoeteweij, and Arjan J. C. van Gemund. A new Bayesian approach to multiple intermittent fault diagnosis. In Proceedings of the 2009 International Joint Conference on Artificial Intelligence, pages 653–658, July 2009.
[29] Rui Abreu, Peter Zoeteweij, and Arjan J. C. van Gemund. Spectrum-based multiple fault localization. In Proceedings of the 24th IEEE/ACM International Conference on Automated Software Engineering, pages 88–99, November 2009.
[30] Mary Jean Harrold, Gregg Rothermel, Kent Sayre, Rui Wu, and Liu Yi. An empirical investigation of the relationship between spectra differences and regression faults. Software Testing, Verification and Reliability, 10(3):171–194, 2000.