On Permutation Masks in Hamming
Negative Selection
Thomas Stibor¹, Jonathan Timmis², and Claudia Eckert¹
¹ Department of Computer Science
Darmstadt University of Technology
{stibor, eckert}@sec.informatik.tu-darmstadt.de
² Departments of Electronics and Computer Science
University of York, Heslington, York
jtimmis@cs.york.ac.uk
Abstract. Permutation masks were proposed for reducing the number of holes in Hamming negative selection when applying the r-contiguous or r-chunk matching rule. Here, we show that (randomly determined) permutation masks re-arrange the semantic representation of the underlying data and therefore shatter self regions. As a consequence, detectors do not cover areas around the self regions; instead, they cover randomly distributed elements across the space. In addition, we observe that the resulting holes occur in regions where actually no self regions should occur.
1 Introduction
Applying negative selection to anomaly detection problems has been undertaken extensively [1,2,3,4]. Anomaly detection problems, also termed one-class classification, can be considered as a type of pattern classification problem where one tries to describe a single class of objects and distinguish it from all other possible objects. More formally, one-class classification is the problem of generating decision boundaries that can successfully distinguish between the normal and the anomalous class. Hamming negative selection is an immune-inspired technique for one-class classification problems. Recent results, however, have revealed several problems concerning the algorithmic complexity of generating detectors [5,6,7] and the determination of the proper matching threshold to allow for the generation of correct generalization regions [8]. In this paper we investigate an extended technique for Hamming negative selection: permutation masks. Permutation masks are immunologically motivated by lymphocyte diversity. Lymphocyte diversity is an important property of the immune system, as it enables a lymphocyte to react to many substances, i.e. it induces diversity and generalization. This kind of generalization process inspired Hofmeyr [3,9] to propose a similar counterpart for use in Hamming negative selection. Hofmeyr introduced permutation masks in order to reduce the number of undetectable elements. It was argued that permutation masks could be useful for covering the non-self space efficiently by varying the representation by means of permutation masks (see Fig. 1).
Fig. 1. Visualized concept of varying representations by means of permutation masks
to reduce the number of undetectable elements. The light gray shaded area in the
middle represents the self regions (normal class in terms of anomaly detection). The
dark gray shaded shapes represent areas which are covered by detectors with varying
representations. The white area represents the non-self space (anomalous class in terms
of anomaly detection). This figure is taken from [9].
In the following two sections we briefly introduce the standard negative selection inspired anomaly detection technique.
2 Artificial Immune System
An artificial immune system (AIS) [10] is a paradigm inspired by the immune system and is used for solving computational and information processing problems. An AIS can be described, and developed, using a framework [10] which contains the following basic elements:
– A representation for the artificial immune elements.
– A set of functions which quantifies the interactions of the artificial immune elements (affinity).
– A set of algorithms which are based on observed immune principles and methods.
This 3-step abstraction (representation, affinity, algorithm) for using the AIS framework is discussed in the following sections.
2.1 Hamming Shape-Space
The notion of shape-space was introduced by Perelson and Oster [11] and allows a quantitative description of the affinity between immune components known as antibodies and antigens. More precisely, a shape-space is a metric space with an associated distance (affinity) function.
The Hamming shape-space $U_{\Sigma}^{l}$ is built from all elements of length $l$ over a finite alphabet $\Sigma$.
Example 1.
$$
\Sigma = \{0,1\}:\;
\underbrace{\begin{matrix} 000\ldots000 \\ 000\ldots001 \\ \vdots \\ 111\ldots111 \end{matrix}}_{l}
\qquad\qquad
\Sigma = \{A,C,G,T\}:\;
\underbrace{\begin{matrix} AAA\ldots AAA \\ AAA\ldots AAC \\ \vdots \\ TTT\ldots TTT \end{matrix}}_{l}
$$
In Example 1, two Hamming shape-spaces for different alphabets and alphabet sizes are presented. On the left, a Hamming shape-space defined over the binary alphabet of length $l$ is shown. On the right, a Hamming shape-space defined over the DNA base alphabet (Adenine, Cytosine, Guanine, Thymine) is presented.
2.2 R-Contiguous and R-Chunk Matching
A formal description of antigen-antibody interactions not only requires a representation (encoding), but also appropriate affinity functions. Percus et al. [12] proposed the r-contiguous matching rule for abstracting the affinity an antibody needs to recognize an antigen.
Definition 1. An element $e \in U_{\Sigma}^{l}$ with $e = e_1 e_2 \ldots e_l$ and a detector $d \in U_{\Sigma}^{l}$ with $d = d_1 d_2 \ldots d_l$ match under the r-contiguous rule if a position $p$ exists where $e_i = d_i$ for $i = p, \ldots, p + r - 1$, $p \le l - r + 1$.
Informally, two elements of the same length match if at least $r$ contiguous characters are identical.
An additional rule, which subsumes (i.e. includes within a larger entity) the r-contiguous rule, is the r-chunk matching rule [13].
Definition 2. An element $e \in U_{\Sigma}^{l}$ with $e = e_1 e_2 \ldots e_l$ and a detector $d \in \mathbb{N} \times D_{\Sigma}^{r}$ with $d = (p \,|\, d_1 d_2 \ldots d_r)$, for $r \le l$, $p \le l - r + 1$, match under the r-chunk rule if a position $p$ exists where $e_i = d_i$ for $i = p, \ldots, p + r - 1$.
Informally, element $e$ and detector $d$ match if a position $p$ exists where all characters of $e$ and $d$ are identical over a sequence of length $r$.
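To make the two matching rules concrete, the following minimal Python sketch (not part of the original paper; the function names and the 0-based position convention are ours) implements both definitions over binary strings.

```python
def r_contiguous_match(e: str, d: str, r: int) -> bool:
    """Element e and detector d (same length) match if they agree on at
    least r contiguous positions (positions are 0-based here)."""
    assert len(e) == len(d)
    return any(e[p:p + r] == d[p:p + r] for p in range(len(e) - r + 1))

def r_chunk_match(e: str, detector: tuple[int, str]) -> bool:
    """Element e matches an r-chunk detector (p, chunk) if e agrees with
    chunk at positions p, ..., p + len(chunk) - 1."""
    p, chunk = detector
    return e[p:p + len(chunk)] == chunk

# '0001' and '1001' share the three contiguous characters '001':
assert r_contiguous_match('0001', '1001', r=3)
assert r_chunk_match('0001', (1, '001'))
```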
We use the term subsume as any r-contiguous detector can be represented as a set of r-chunk detectors. This implies that any set of elements from $U_{\Sigma}^{l}$ that can be recognized with a set of r-contiguous detectors can also be recognized with some set of r-chunk detectors. The converse statement is, surprisingly, not true, i.e. there exists a set of elements from $U_{\Sigma}^{l}$ that can be recognized with a set
of r-chunk detectors but not with any set of r-contiguous detectors. We demonstrate this converse statement with an example; a formal treatment is provided in [14].
Example 2. Given the Hamming shape-space $U_{\{0,1\}}^{5}$, a set of self elements $S = \{01011, 01100, 01110, 10010, 10100, 11100\}$ and a detector length $r = 3$.

All generable r-contiguous detectors for the complementary space $U_{\{0,1\}}^{5} \setminus S$ are $D_{r\text{-contiguous}} = \{00000, 00001, 00111, 11000, 11001\}$.

All generable r-chunk detectors are $D_{r\text{-chunk}} = \{0|000, 0|001, 0|110, 1|000, 1|011, 1|100, 2|000, 2|001, 2|101, 2|111\}$.

The set $D_{r\text{-contiguous}}$ recognizes the elements $P_1 = U_{\{0,1\}}^{5} \setminus (S \cup \{01010, 01101, 10011, 10101, 11101, 11110\})$, whereas the set $D_{r\text{-chunk}}$ recognizes the elements $P_2 = U_{\{0,1\}}^{5} \setminus (S \cup \{10011, 01010, 11110\})$. Hence $|P_1| \le |P_2|$.
Example 2 shows that the set of r-chunk detectors $D_{r\text{-chunk}}$ recognizes more elements of $U_{\{0,1\}}^{5}$ than the set of r-contiguous detectors $D_{r\text{-contiguous}}$, and therefore the r-chunk matching rule subsumes the r-contiguous rule.
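Example 2 can also be reproduced mechanically. The following sketch (ours, not from the paper) enumerates all r-contiguous and r-chunk detectors for the self set S and compares the sets of recognized non-self elements; it assumes the same l = 5, r = 3 and binary alphabet as the example.

```python
from itertools import product

l, r = 5, 3
S = {'01011', '01100', '01110', '10010', '10100', '11100'}
U = {''.join(bits) for bits in product('01', repeat=l)}

def r_cont(e, d):
    return any(e[p:p + r] == d[p:p + r] for p in range(l - r + 1))

# r-contiguous detectors: full-length strings that match no self element
D_cont = {d for d in U if not any(r_cont(s, d) for s in S)}

# r-chunk detectors: (position, chunk) pairs that occur in no self element
D_chunk = {(p, ''.join(c))
           for p in range(l - r + 1) for c in product('01', repeat=r)
           if all(s[p:p + r] != ''.join(c) for s in S)}

P1 = {e for e in U - S if any(r_cont(e, d) for d in D_cont)}
P2 = {e for e in U - S if any(e[p:p + r] == c for (p, c) in D_chunk)}

print(sorted(D_cont))          # the five r-contiguous detectors of Example 2
print(len(D_chunk))            # the ten r-chunk detectors of Example 2
print(len(P1), len(P2))        # the r-chunk set recognizes more non-self elements
print(sorted((U - S) - P2))    # holes left by the r-chunk detector set
```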
3 Hamming Negative Selection
Forrest et al. [1] proposed a generic (i.e. applicable to arbitrary shape-spaces) negative selection algorithm for detecting changes in data streams. The shape-space $U = S_{seen} \cup S_{unseen} \cup N$ is partitioned into training data $S_{seen}$ and testing data $(S_{seen} \cup S_{unseen} \cup N)$. The basic idea is to generate a number of detectors for the complementary space $U \setminus S_{seen}$ and then to apply these detectors to classify new (unseen) data as self (no data manipulation) or non-self (data manipulation).
Algorithm 1. Generic Negative Selection Algorithm
input: $S_{seen}$ = set of seen self elements
output: $D$ = set of generated detectors
begin
1. Define self as a set $S_{seen}$ of elements in shape-space $U$.
2. Generate a set $D$ of detectors, such that each fails to match any element in $S_{seen}$.
3. Monitor (seen and unseen) data $\delta \subseteq U$ by continually matching the detectors in $D$ against $\delta$.
end
The generic negative selection algorithm can be used with arbitrary shape-
spaces and affinity functions. In this paper, we focus on Hamming negative
selection, i.e. the negative selection algorithm which operates on Hamming shape-
space and employs the r-chunk matching rule and permutation masks.
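As an illustration of Algorithm 1, a compact sketch of the generic scheme instantiated with the r-chunk matching rule might look as follows; the function names and the small four-bit example are ours and are not taken from the paper.

```python
from itertools import product

def generate_rchunk_detectors(self_seen, l, r):
    """Step 2 of Algorithm 1: keep every (position, chunk) candidate that
    fails to match all seen self elements."""
    return {(p, ''.join(c))
            for p in range(l - r + 1) for c in product('01', repeat=r)
            if all(s[p:p + r] != ''.join(c) for s in self_seen)}

def classify(element, detectors):
    """Step 3 of Algorithm 1: anything matched by some detector is non-self."""
    if any(element[p:p + len(chunk)] == chunk for p, chunk in detectors):
        return 'non-self'
    return 'self'

S_seen = {'0001', '1000'}
D = generate_rchunk_detectors(S_seen, l=4, r=3)
print(classify('0001', D))   # 'self'      (a training element)
print(classify('1111', D))   # 'non-self'
print(classify('0000', D))   # 'self'      (a hole, cf. Fig. 2 below)
```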
3.1 Holes as Generalization Regions
The r-contiguous and r-chunk matching rules induce undetectable elements, termed holes (see Fig. 2). In general, all matching rules which match over a certain element length induce holes. This statement is theoretically investigated in [15,14] and empirically explored in [16] for the Hamming, r-contiguous, r-chunk and Rogers & Tanimoto matching rules. Holes are elements from $U \setminus S_{seen}$, i.e. elements not seen during the training phase (their number is controlled by the matching threshold $r$). For these elements, no detectors can be generated and therefore they cannot be recognized and classified as non-self elements. However, the term holes is not an accurate expression, as holes are necessary to generalize beyond the training set. A detector set which generalizes well ensures that seen and unseen self elements are not recognized by any detector, whereas all other elements are recognized by detectors and classified as non-self. Hence, holes must represent unseen self elements; in other words, holes must represent generalization regions in the shape-space $U_{\Sigma}^{l}$.
[Fig. 2 diagram: every length-$r$ window of $h_1 = 1001$ and $h_2 = 0000$ also occurs, at the same position, in $s_1 = 0001$ or $s_2 = 1000$, i.e. $\{0001, 1001\} = \{s_1, h_1\}$ and $\{1000, 0000\} = \{s_2, h_2\}$ share their windows.]

Fig. 2. Self elements $s_1 = 0001$ and $s_2 = 1000$ induce holes $h_1, h_2$, i.e. elements which are not detectable with the r-contiguous and r-chunk matching rules for $r = 3$.
4 Permutation Masks
Permutation masks were proposed by Hofmeyr [3,9] for reducing the number of holes. A permutation mask is a bijective mapping $\pi$ that specifies a reordering for all elements $a_i \in U_{\Sigma}^{l}$, i.e. $a_1 \to \pi(a_1),\, a_2 \to \pi(a_2),\, \ldots,\, a_{|\Sigma|^l} \to \pi(a_{|\Sigma|^l})$. More formally, a permutation $\pi \in S_n$, where $n \in \mathbb{N}$, can be written as a $2 \times n$ matrix, where the first row contains the elements $a_1, a_2, \ldots, a_n$ and the second row the new arrangement $\pi(a_1), \pi(a_2), \ldots, \pi(a_n)$, i.e.
$$
\begin{pmatrix} a_1 & a_2 & \ldots & a_n \\ \pi(a_1) & \pi(a_2) & \ldots & \pi(a_n) \end{pmatrix}
$$
For the sake of simplicity we will use the equivalent cycle notation [17] to specify a permutation. A permutation in cycle notation can be written as $(b_1\, b_2 \ldots b_n)$ and means "$b_1$ becomes $b_2$, ..., $b_{n-1}$ becomes $b_n$, $b_n$ becomes $b_1$". In addition, this notation allows the identity and non-cyclic mappings; for instance $(b_1)\,(b_2\, b_3)\,(b_4)$ means: $b_1 \to b_1$, $b_2 \to b_3$, $b_3 \to b_2$ and $b_4 \to b_4$.
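A small sketch (ours) of how a permutation given in cycle notation can be turned into a position mapping and applied to a bit string; it assumes that the bit at position i is moved to position π(i), which is consistent with the example π0 = (1 2 4 3) used in Section 4.1 below.

```python
def cycles_to_mapping(cycles, n):
    """Build the position mapping i -> pi(i) (1-based) from cycle notation,
    e.g. [(2, 3)] on n = 4 positions means 2 -> 3, 3 -> 2, all others fixed."""
    pi = {i: i for i in range(1, n + 1)}            # identity by default
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            pi[a] = b
    return pi

def apply_permutation_mask(bits, pi):
    """Move the bit at position i to position pi(i) (1-based positions)."""
    out = [None] * len(bits)
    for i, bit in enumerate(bits, start=1):
        out[pi[i] - 1] = bit
    return ''.join(out)

pi = cycles_to_mapping([(1,), (2, 3), (4,)], n=4)   # (b1)(b2 b3)(b4)
print(apply_permutation_mask('0100', pi))           # '0010': positions 2 and 3 swapped
```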
4.1 Permutation Masks for Inducing Other Holes
As explained above, a permutation mask is a bijective mapping and therefore can increase or reduce the number of holes; there also exist permutation masks which result in self elements that neither increase nor reduce the number of holes. The simplest example is the identity permutation mask.
For reducing the number of holes, $\pi$ must be chosen appropriately, and a certain number of detectors must be generable.
Reconsider the self elements $s_1 = 0001$, $s_2 = 1000$ in Fig. 2. One can see that the elements $h_1 = 1001$ and $h_2 = 0000$ are not detectable by the r-contiguous and r-chunk matching rules. However, after applying the permutation mask $\pi_0 = (1\,2\,4\,3)$, i.e.
$$\pi_0(s_1) = 0010, \qquad \pi_0(s_2) = 0100,$$
one can verify (see Fig. 3) that the holes $h_1, h_2$ are eliminated.
[Fig. 3 diagram: the permuted self elements decompose into the windows 001, 010 (for $\pi_0(s_1) = 0010$) and 010, 100 (for $\pi_0(s_2) = 0100$); no further element of the space shares all of its length-$r$ windows with them.]

Fig. 3. The permuted self elements $\pi_0(s_1)$ and $\pi_0(s_2)$ induce no holes under the r-contiguous and r-chunk matching rules.
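The hole elimination shown in Fig. 3 can be checked directly. The following sketch (ours) uses the r-chunk view of holes, i.e. an element is a hole if every length-r window of it also occurs at the same position in some self element, and computes the holes before and after applying π0.

```python
from itertools import product

def holes(self_set, l, r):
    """Elements outside the self set whose every length-r window also occurs,
    at the same position, in some self element (r-chunk view of holes)."""
    universe = {''.join(b) for b in product('01', repeat=l)}
    self_windows = {(p, s[p:p + r]) for s in self_set for p in range(l - r + 1)}
    return {e for e in universe - self_set
            if all((p, e[p:p + r]) in self_windows for p in range(l - r + 1))}

def permute(bits, pi):                 # move the bit at position i to position pi[i]
    out = [None] * len(bits)
    for i, bit in enumerate(bits, start=1):
        out[pi[i] - 1] = bit
    return ''.join(out)

S = {'0001', '1000'}                   # the self elements of Fig. 2
pi0 = {1: 2, 2: 4, 3: 1, 4: 3}         # the cycle (1 2 4 3) as a position mapping

print(holes(S, l=4, r=3))                              # {'0000', '1001'} = {h2, h1}
print(holes({permute(s, pi0) for s in S}, l=4, r=3))   # set(): no holes remain
```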
However, it is also clear that $(1\,2\,4\,3)$, $(2\,4\,3\,1)$, $(4\,3\,1\,2)$ and $(3\,1\,2\,4)$ represent the same permutation, namely the cycle permutation of $\pi_0 = (1\,2\,4\,3)$. Specifically, all cycle permutations of an arbitrarily selected $\pi$ lead, in terms of r-chunk and r-contiguous matching, to the same holes.
On the other hand, there also exist permutation masks which do not reduce holes, i.e. $\pi(s_i) = s_j$ for $i \neq j$ and self elements $s_1, s_2, \ldots, s_{|S|}$. An example is the permutation $\pi_1 = (1\,4)(2)(3)$, as $\pi_1(s_1) = s_2$ and $\pi_1(s_2) = s_1$.
Furthermore, as mentioned above, a permutation mask can also increase the number of holes. In the experiments presented below this is illustrated, for instance, in Figures 5(c) and 5(d) (without and with a permutation mask).
5 Permutation Masks Experiments in Hamming Negative
Selection
In [18,8], results were presented which demonstrated the coherence between the matching threshold $r$ and the generalization regions when the r-chunk matching rule is applied in Hamming negative selection. Recall that, as holes are not detectable by any detector, holes must represent unseen self elements, or in other words, generalization regions. In the following experiment we will investigate how randomly determined permutation masks influence the occurrence
of holes (generalization regions). More specifically, we will empirically explore
if holes occur in suitable generalization regions when a randomly determined
permutation mask is applied. Finally, we explore empirically whether randomly
determined permutation masks reduce the number of holes.
Stibor et al. [8] have shown in prior experiments that the matching threshold $r$ is a crucial parameter and is inextricably linked to the input data being
analyzed. However, permutation masks were not considered in [8]. In order to
study the impact of permutation masks on generalization regions, and to obtain
comparable results to previously performed experiments [8], we will utilize the
same mapping function and data set. Furthermore, we will explore the impact
of permutation masks on an additional data set (see Fig. 4).
5.1 Experiment Settings
The first self data set contains 1000 points $p = (x, y) \in [0,1]^2$ generated from a Gaussian distribution ($\mu = 0.5$, $\sigma = 0.1$). Each point $p$ is mapped to a binary string
$$\underbrace{b_1, b_2, \ldots, b_8}_{b_x},\; \underbrace{b_9, b_{10}, \ldots, b_{16}}_{b_y},$$
where the first 8 bits encode the integer x-value $i_x := \lfloor 255 \cdot x + 0.5 \rfloor$ and the last 8 bits the integer y-value $i_y := \lfloor 255 \cdot y + 0.5 \rfloor$, i.e.
$$[0,1]^2 \to (i_x, i_y) \in [1, \ldots, 256] \times [1, \ldots, 256] \to (b_x, b_y) \in U_{\{0,1\}}^{8} \times U_{\{0,1\}}^{8}.$$
This mapping was proposed in [18] and also utilized in [8]; it allows a straightforward visualization of real-valued encoded points in Hamming negative selection.
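A sketch of this mapping (ours; it assumes the integer values lie in 0..255 and are written as unsigned 8-bit strings) may clarify how a point is encoded.

```python
def point_to_bitstring(x: float, y: float) -> str:
    """Map a point (x, y) in [0,1]^2 to a 16-bit string: the first 8 bits
    encode round(255 * x), the last 8 bits encode round(255 * y)."""
    ix = int(255 * x + 0.5)
    iy = int(255 * y + 0.5)
    return format(ix, '08b') + format(iy, '08b')

print(point_to_bitstring(0.5, 0.1))   # '1000000000011010'
```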
The second data set (termed the banana data set) is depicted in Fig. 4 and is a commonly used benchmark for anomaly detection problems [19]. The banana data set is taken from [20] and consists of 5300 points in total. These points are partitioned into two classes: $C_+$, which contains the points inside the "banana shape", and $C_-$, which contains the points outside of the "banana shape". In this experiment we have taken points from $C_+$ only, to simulate one self region (similar to Fig. 1). More specifically, we have normalized all points from $C_+$ with the min-max method to the unit square $[0,1]^2$. We then sampled 1000 random points from $C_+$ and mapped those sampled points to bit-strings of length 16.
As the r-chunk matching rule subsumes the r-contiguous rule, i.e. it recognizes at least as many elements as the r-contiguous matching rule (see Section 2.2), we have performed all experiments with the r-chunk matching rule. Furthermore, as proposed in [3,9], we have used randomly determined permutation masks $\pi \in S_{16}$.
5.2 Experimental Results
The experimental results are presented in Figures 5-8. The black points represent the 1000 sampled self elements, the white points are holes, and the grey points represent areas which are covered by r-chunk detectors.
Fig. 4. Banana data set (points from class $C_+$), min-max normalized to $[0,1]^2$. In a perfect case (error-free detection), the r-chunk detectors should cover the regions outside the "banana" shape. The region within the "banana" shape is the generalization region and should consist of undetectable elements, i.e. holes and self elements.
It is not surprising that, for both data sets, holes occur as they should in generalization regions when $8 \le r \le 10$. This phenomenon is discussed and explained in [8]. To summarize the results from [8]: a detector matching length which is not at least as long as the semantic representation of the underlying data, in this case 8 bits for the x and y coordinates, results in incorrect generalization regions.
What is more interesting, though, is the observation that a (randomly determined) permutation mask shatters the semantic representation of the underlying data (see Fig. 5-8 (b,d,f,h,j,l,n,p,r,t)) and therefore holes are randomly distributed across the space instead of being concentrated inside or close to the self regions. This observation also means that detectors do not cover areas around the self regions; instead they recognize elements which are likewise randomly distributed across the space. Furthermore, one can see that the number of holes when applying permutation masks (see Fig. 5-8 (b,d,f,h,j,l,n,p,r,t)) is in some cases significantly higher than without permutation masks (see Fig. 5-8 (a,c,e,g,i,k,m,o,q,s)). This observation can be explained by the previous one: permutation masks distort the underlying data and therefore shatter the self regions. As a consequence, the underlying data is transformed into a collection of random chunks. For randomly determined self elements, Stibor et al. [6] showed that the number of holes increases exponentially for $r := l \to 0$, i.e. as the matching length decreases from $l$ toward $0$. Of course this shattering effect is linked very strongly to the mapping function employed. However, it is clear that each permutation mask, except the identity permutation, distorts the data semantically to a greater or lesser degree. Furthermore, we believe that finding a permutation mask which does not significantly distort the semantic representation of the data may be computationally intractable (in the worst case, one has to check all $n!$ permutations of $S_n$).
[Figure 5, panels (a)-(t): results for r = 2, 3, ..., 11, each value of r shown without and with the permutation mask π.]

Fig. 5. A visualized simulation run with 1000 random (self) points generated by a Gaussian distribution with mean µ = 0.5 and variance σ = 0.1. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.
[Figure 6, panels (a)-(t): results for r = 2, 3, ..., 11, each value of r shown without and with the permutation mask π.]

Fig. 6. An additional visualized simulation run with 1000 random (self) points generated by a Gaussian distribution with mean µ = 0.5 and variance σ = 0.1. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.
[Figure 7, panels (a)-(t): results for r = 2, 3, ..., 11, each value of r shown without and with the permutation mask π.]

Fig. 7. A visualized simulation run with 1000 randomly sampled (self) points from the banana data set. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.
[Figure 8, panels (a)-(t): results for r = 2, 3, ..., 11, each value of r shown without and with the permutation mask π.]

Fig. 8. An additional visualized simulation run with 1000 randomly sampled (self) points from the banana data set. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.
In order to obtain representative results, we performed 50 simulation runs for each data set, each run with a randomly determined permutation mask. Due to the lack of space to present all 50 simulation runs, we have selected two simulation results at random for each data set (see Figures 5-8). The remaining simulation results are closely comparable to the results in Figures 5-8.
6 Conclusion
Lymphocyte diversity is an important property of the immune system for recognizing a huge number of diverse substances. This property has been abstracted in terms of permutation masks in the Hamming negative selection detection technique. In this paper we have shown that (randomly determined) permutation masks in Hamming negative selection distort the semantic meaning of the underlying data (the shape of the distribution) and, as a consequence, shatter self regions. Furthermore, the distorted data is transformed into a collection of random chunks. Hence, detectors do not cover areas around the self regions; instead they are randomly distributed across the space. Moreover, the resulting holes (the generalization) occur in regions where actually no self regions should occur. Additionally, we believe that it is computationally infeasible to find permutation masks which correctly capture the semantic representation of the data, if one exists at all. We conclude that these results cast doubt on the appropriateness of permutation masks as an abstraction of diversity in Hamming negative selection.
References
1. Forrest, S., Perelson, A.S., Allen, L., Cherukuri, R.: Self-nonself discrimination in
a computer. In: Proceedings of the 1994 IEEE Symposium on Research in Security
and Privacy, IEEE Computer Society Press (1994)
2. Dasgupta, D., Forrest, S.: Novelty detection in time series data using ideas from
immunology. In: Proceedings of the 5th International Conference on Intelligent
Systems. (1996)
3. Hofmeyr, S.A.: An Immunological Model of Distributed Detection and its Appli-
cation to Computer Security. PhD thesis, University of New Mexico (1999)
4. Singh, S.: Anomaly detection using negative selection based on the r-contiguous matching rule. In: Proceedings of the 1st International Conference on Artificial Immune Systems (ICARIS), University of Kent at Canterbury Printing Unit (2002) 99–106
5. Kim, J., Bentley, P.J.: An evaluation of negative selection in an artificial immune system for network intrusion detection. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO-2001. (2001) 1330–1337
6. Stibor, T., Timmis, J., Eckert, C.: On the appropriateness of negative selection
defined over hamming shape-space as a network intrusion detection system. In:
Congress On Evolutionary Computation – CEC 2005, IEEE Press (2005) 995–1002
7. Stibor, T., Timmis, J., Eckert, C.: The link between r-contiguous detectors and k-CNF satisfiability. In: Congress On Evolutionary Computation – CEC 2006, IEEE Press (2006, to appear)
8. Stibor, T., Timmis, J., Eckert, C.: Generalization regions in hamming negative
selection. In: Intelligent Information Processing and Web Mining. Advances in
Soft Computing, Springer-Verlag (2006) 447–456
9. Hofmeyr, S., Forrest, S.: Architecture for an artificial immune system. Evolutionary Computation 8 (2000) 443–473
10. de Castro, L.N., Timmis, J.: Artificial Immune Systems: A New Computational
Intelligence Approach. Springer Verlag (2002)
11. Perelson, A.S., Oster, G.: Theoretical studies of clonal selection: minimal antibody
repertoire size and reliability of self-nonself discrimination. In: J. Theor. Biol.
Volume 81. (1979) 645–670
12. Percus, J.K., Percus, O.E., Perelson, A.S.: Predicting the size of the T-cell receptor
and antibody combining region from consideration of efficient self-nonself discrim-
ination. Proceedings of National Academy of Sciences USA 90 (1993) 1691–1695
13. Balthrop, J., Esponda, F., Forrest, S., Glickman, M.: Coverage and generalization
in an artificial immune system. In: GECCO 2002: Proceedings of the Genetic and
Evolutionary Computation Conference, New York, Morgan Kaufmann Publishers
(2002) 3–10
14. Esponda, F., Forrest, S., Helman, P.: A formal framework for positive and negative
detection schemes. IEEE Transactions on Systems, Man and Cybernetics Part B:
Cybernetics 34 (2004) 357–373
15. D’haeseleer, P., Forrest, S., Helman, P.: An immunological approach to change
detection: algorithms, analysis, and implications. In: Proceedings of the 1996 IEEE
Symposium on Research in Security and Privacy, IEEE Computer Society, IEEE
Computer Society Press (1996) 110–119
16. González, F., Dasgupta, D., Niño, L.F.: A randomized real-valued negative selection algorithm. In: Proceedings of the 2nd International Conference on Artificial Immune Systems (ICARIS). Volume 2787 of Lecture Notes in Computer Science., Edinburgh, UK, Springer-Verlag (2003) 261–272
17. Knuth, D.E.: The Art of Computer Programming. third edn. Volume 1. Addison-
Wesley (2002)
18. González, F., Dasgupta, D., Gómez, J.: The effect of binary matching rules in negative selection. In: Genetic and Evolutionary Computation – GECCO-2003. Volume 2723 of Lecture Notes in Computer Science., Chicago, Springer-Verlag (2003) 195–206
19. Tax, D.M.J.: One-class classification. PhD thesis, Technische Universiteit Delft
(2001)
20. Rätsch, G.: Benchmark repository (1998). http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm