
On Permutation Masks in Hamming Negative Selection

Thomas Stibor¹, Jonathan Timmis², and Claudia Eckert¹

¹ Department of Computer Science, Darmstadt University of Technology
{stibor, eckert}@sec.informatik.tu-darmstadt.de

² Departments of Electronics and Computer Science, University of York, Heslington, York
jtimmis@cs.york.ac.uk

Abstract. Permutation masks were proposed for reducing the number of holes in Hamming negative selection when applying the r-contiguous or r-chunk matching rule. Here, we show that (randomly determined) permutation masks re-arrange the semantic representation of the underlying data and therefore shatter self regions. As a consequence, detectors do not cover areas around self regions; instead they cover randomly distributed elements across the space. In addition, we observe that the resulting holes occur in regions where actually no self regions should occur.

1 Introduction

Applying negative selection to anomaly detection problems has been undertaken extensively [1,2,3,4]. Anomaly detection, also termed one-class classification, can be considered a type of pattern classification problem in which one tries to describe a single class of objects and distinguish it from all other possible objects. More formally, one-class classification is the problem of generating decision boundaries that can successfully distinguish between the normal and the anomalous class. Hamming negative selection is an immune-inspired technique for one-class classification problems. Recent results, however, have revealed several problems concerning the algorithmic complexity of generating detectors [5,6,7] and the determination of a proper matching threshold that allows for the generation of correct generalization regions [8]. In this paper we investigate an extended technique for Hamming negative selection: permutation masks. Permutation masks are immunologically motivated by lymphocyte diversity. Lymphocyte diversity is an important property of the immune system, as it enables a lymphocyte to react to many substances, i.e. it induces diversity and generalization. This kind of generalization process inspired Hofmeyr [3,9] to propose a similar counterpart for use in Hamming negative selection. Hofmeyr introduced permutation masks in order to reduce the number of undetectable elements. It was argued that permutation masks could be useful for covering the non-self space efficiently by varying the representation (see Fig. 1).

H. Bersini and J. Carneiro (Eds.): ICARIS 2006, LNCS 4163, pp. 122–135, 2006.

© Springer-Verlag Berlin Heidelberg 2006

Fig. 1. Visualized concept of varying representations by means of permutation masks to reduce the number of undetectable elements. The light gray shaded area in the middle represents the self regions (the normal class in terms of anomaly detection). The dark gray shaded shapes represent areas which are covered by detectors with varying representations. The white area represents the non-self space (the anomalous class in terms of anomaly detection). This figure is taken from [9].

In the following two sections we briefly introduce the standard negative-selection-inspired anomaly detection technique.

2 Artiﬁcial Immune System

An artificial immune system (AIS) [10] is a paradigm inspired by the immune system and is used for solving computational and information processing problems. An AIS can be described, and developed, using a framework [10] which contains the following basic elements:

– A representation for the artificial immune elements.
– A set of functions which quantify the interactions of the artificial immune elements (affinity).
– A set of algorithms which are based on observed immune principles and methods.

This 3-step abstraction (representation, affinity, algorithm) for using the AIS framework is discussed in the following sections.

2.1 Hamming Shape-Space

The notion of shape-space was introduced by Perelson and Oster [11] and allows a quantitative affinity description between immune components known as antibodies and antigens. More precisely, a shape-space is a metric space with an associated distance (affinity) function.


The Hamming shape-space U_Σ^l is built from all elements of length l over a finite alphabet Σ.

Example 1. For the binary alphabet Σ = {0, 1}, the shape-space consists of all bit strings of length l:

000...000, 000...001, ..., 111...111

For the DNA-bases alphabet Σ = {A, C, G, T}, it consists of all strings of length l:

AAA...AAA, AAA...AAC, ..., TTT...TTT

Example 1 presents two Hamming shape-spaces for different alphabets and alphabet sizes. The first is defined over the binary alphabet; the second is defined over the DNA-bases alphabet (Adenine, Cytosine, Guanine, Thymine). Both contain all strings of length l.

2.2 R-Contiguous and R-Chunk Matching

A formal description of antigen-antibody interactions requires not only a representation (encoding), but also appropriate affinity functions. Percus et al. [12] proposed the r-contiguous matching rule for abstracting the affinity an antibody needs to recognize an antigen.

Definition 1. An element e ∈ U_Σ^l with e = e_1 e_2 ... e_l and a detector d ∈ U_Σ^l with d = d_1 d_2 ... d_l match under the r-contiguous rule if a position p exists where e_i = d_i for i = p, ..., p + r − 1 and p ≤ l − r + 1.

Informally, two elements of the same length match if they are identical in at least r contiguous characters.

An additional rule, which subsumes¹ the r-contiguous rule, is the r-chunk matching rule [13].

Definition 2. An element e ∈ U_Σ^l with e = e_1 e_2 ... e_l and a detector d ∈ N × D_Σ^r with d = (p | d_1 d_2 ... d_r), for r ≤ l and p ≤ l − r + 1, match under the r-chunk rule if a position p exists where e_i = d_i for i = p, ..., p + r − 1.

Informally, element e and detector d match if a position p exists where all characters of e and d are identical over a sequence of length r.

We use the term subsume because any r-contiguous detector can be represented as a set of r-chunk detectors. This implies that any set of elements from U_Σ^l that can be recognized with a set of r-contiguous detectors can also be recognized with some set of r-chunk detectors. The converse statement is, surprisingly, not true: there exists a set of elements from U_Σ^l that can be recognized with a set of r-chunk detectors, but not with any set of r-contiguous detectors. We demonstrate this converse statement with an example; a formal treatment is provided in [14].

¹ Include within a larger entity.

Example 2. Given the Hamming shape-space U_{0,1}^5, a set S = {01011, 01100, 01110, 10010, 10100, 11100} of self elements and a detector length r = 3.

All generable r-contiguous detectors for the complementary space U_{0,1}^5 \ S are
D_r-contiguous = {00000, 00001, 00111, 11000, 11001}.

All generable r-chunk detectors are
D_r-chunk = {0|000, 0|001, 0|110, 1|000, 1|011, 1|100, 2|000, 2|001, 2|101, 2|111}.

The set D_r-contiguous recognizes the elements
P_1 = U_{0,1}^5 \ (S ∪ {01010, 01101, 10011, 10101, 11101, 11110}),
whereas the set D_r-chunk recognizes the elements
P_2 = U_{0,1}^5 \ (S ∪ {10011, 01010, 11110}). Hence |P_1| ≤ |P_2|.

Example 2 shows that the set of r-chunk detectors D_r-chunk recognizes more elements of U_{0,1}^5 than the set of r-contiguous detectors D_r-contiguous, and therefore the r-chunk matching rule subsumes the r-contiguous rule.
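Example 2 can be checked mechanically. The following Python sketch (our own illustration, not part of the paper) enumerates both detector sets for S and r = 3 and reproduces the sets D_r-contiguous, D_r-chunk, P_1 and P_2 given above; positions p are 0-indexed, as in the detector notation p|d_1 d_2 d_3.

```python
from itertools import product

l, r = 5, 3
S = {"01011", "01100", "01110", "10010", "10100", "11100"}
U = {"".join(bits) for bits in product("01", repeat=l)}

# r-chunk detectors: pairs (p, w) such that no self element carries window w at position p
D_chunk = {(p, "".join(w))
           for p in range(l - r + 1)
           for w in product("01", repeat=r)
           if all(s[p:p + r] != "".join(w) for s in S)}

# r-contiguous detectors: full-length strings sharing no r-window (at equal positions)
# with any self element, i.e. every window of d is itself a valid r-chunk detector
D_contig = {d for d in U
            if all((p, d[p:p + r]) in D_chunk for p in range(l - r + 1))}

# elements recognized by each detector set
P1 = {e for e in U
      if any(e[p:p + r] == d[p:p + r] for d in D_contig for p in range(l - r + 1))}
P2 = {e for e in U if any(e[p:p + r] == w for (p, w) in D_chunk)}
```

Running this reproduces |P_1| = 20 < |P_2| = 23; the three elements 01101, 10101 and 11101 are recognized only by r-chunk detectors.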

3 Hamming Negative Selection

Forrest et al. [1] proposed a (generic2) negative selection algorithm for detecting

changes in data streams. Given a shape-space U=Sseen ∪Sunseen ∪Nwhich

is partitioned into training data Sseen and testing data (Sseen ∪Sunseen ∪N).

The basic idea is to generate a number of detectors for the complementary space

U\Sseen and then to apply these detectors to classify new (unseen) data as self

(no data manipulation) or non-self (data manipulation).

Algorithm 1. Generic Negative Selection Algorithm

input: S_seen = set of seen self elements
output: D = set of generated detectors

begin
1. Define self as a set S_seen of elements in shape-space U.
2. Generate a set D of detectors, such that each fails to match any element in S_seen.
3. Monitor (seen and unseen) data δ ⊆ U by continually matching the detectors in D against δ.
end
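The three steps above can be sketched in a few lines of Python (our own illustration; it instantiates the generic algorithm with the Hamming shape-space and the r-chunk matching rule used in the remainder of the paper):

```python
from itertools import product

def generate_detectors(S_seen, l, r):
    """Step 2: every r-chunk detector (p, w) that fails to match any seen self element."""
    return {(p, "".join(w))
            for p in range(l - r + 1)
            for w in product("01", repeat=r)
            if all(s[p:p + r] != "".join(w) for s in S_seen)}

def classify(e, detectors, r):
    """Step 3: an element matched by any detector is classified as non-self."""
    matched = any(e[p:p + r] == w for (p, w) in detectors)
    return "non-self" if matched else "self"

S_seen = {"0001", "1000"}          # the self set used in Fig. 2
D = generate_detectors(S_seen, l=4, r=3)
```

For instance, 0111 is classified as non-self, while the undetectable element 0000 is classified as self; whether the latter is an error or desirable generalization is exactly the question discussed in Sect. 3.1.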

The generic negative selection algorithm can be used with arbitrary shape-spaces and affinity functions. In this paper, we focus on Hamming negative selection, i.e. the negative selection algorithm which operates on the Hamming shape-space and employs the r-chunk matching rule and permutation masks.

² Applicable to arbitrary shape-spaces.

3.1 Holes as Generalization Regions

The r-contiguous and r-chunk matching rules induce undetectable elements, termed holes (see Fig. 2). In general, all matching rules which match over a certain element length induce holes. This statement is theoretically investigated in [15,14] and empirically explored³ in [16]. Holes are some⁴ of the elements from U \ S_seen, i.e. elements not seen during the training phase. For these elements, no detectors can be generated and therefore they cannot be recognized and classified as non-self elements. However, the term holes is not an accurate expression, as holes are necessary to generalize beyond the training set. A detector set which generalizes well ensures that seen and unseen self elements are not recognized by any detector, whereas all other elements are recognized by detectors and classified as non-self. Hence, holes must represent unseen self elements; in other words, holes must represent generalization regions in the shape-space U_Σ^l.

Fig. 2. The self elements s1 = 0001 and s2 = 1000 induce the holes h1 = 1001 and h2 = 0000, i.e. elements which are not detectable with the r-contiguous and r-chunk matching rules for r = 3: each hole shares every length-3 window, at the same position, with s1 or s2.

4 Permutation Masks

Permutation masks were proposed by Hofmeyr [3,9] for reducing the number of holes. A permutation mask is a bijective mapping π that specifies a reordering for all elements a_i ∈ U_Σ^l, i.e. a_1 → π(a_1), a_2 → π(a_2), ..., a_{|Σ|^l} → π(a_{|Σ|^l}). More formally, a permutation π ∈ S_n, where n ∈ N, can be written as a 2 × n matrix, where the first row contains the elements a_1, a_2, ..., a_n and the second row the new arrangement π(a_1), π(a_2), ..., π(a_n):

( a_1     a_2     ...  a_n    )
( π(a_1)  π(a_2)  ...  π(a_n) )

For the sake of simplicity we will use the equivalent cycle notation [17] to specify a permutation. A permutation in cycle notation is written as (b_1 b_2 ... b_n) and means "b_1 becomes b_2, ..., b_{n−1} becomes b_n, b_n becomes b_1". In addition, this notation allows the identity and non-cyclic mappings; for instance (b_1)(b_2 b_3)(b_4) means b_1 → b_1, b_2 → b_3, b_3 → b_2 and b_4 → b_4.

³ Hamming, r-contiguous, r-chunk and Rogers & Tanimoto matching rules.
⁴ The number of holes is controlled by the matching threshold r.
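A cycle expression can be expanded into an explicit position mapping. The sketch below (our own illustration) does this for the example (b1)(b2 b3)(b4) and for the permutation π0 = (1 2 4 3) used in the next subsection; the convention is that the character at position i moves to position π(i).

```python
def cycles_to_mapping(cycles, n):
    """Expand cycle notation into a dict i -> pi(i) over positions 1..n."""
    mapping = {i: i for i in range(1, n + 1)}     # fixed points default to the identity
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):  # a "becomes" b; the last wraps to the first
            mapping[a] = b
    return mapping

def permute(bits, mapping):
    """Apply the mask: the character at position i is moved to position mapping[i]."""
    out = [None] * len(bits)
    for i, c in enumerate(bits, start=1):
        out[mapping[i] - 1] = c
    return "".join(out)

swap23 = cycles_to_mapping([(1,), (2, 3), (4,)], n=4)   # (b1)(b2 b3)(b4)
pi0 = cycles_to_mapping([(1, 2, 4, 3)], n=4)            # pi0 = (1 2 4 3)
```

Applied to the self elements of Fig. 2, pi0 reproduces the permuted strings of Sect. 4.1: permute("0001", pi0) gives 0010 and permute("1000", pi0) gives 0100.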


4.1 Permutation Masks for Inducing Other Holes

As explained above, a permutation mask is a bijective mapping and therefore can increase or reduce the number of holes; there also exist permutation masks which neither increase nor reduce the number of holes. The simplest example is the identity permutation mask. To reduce the number of holes, π must be chosen appropriately, and a certain number of detectors must be generable.

Reconsider the self elements s1 = 0001 and s2 = 1000 in figure (2). One can see that the elements h1 = 1001 and h2 = 0000 are not detectable by the r-contiguous and r-chunk matching rules. However, after applying the permutation mask π0 = (1 2 4 3), i.e.

π0(s1) = 0010, π0(s2) = 0100,

one can verify (see Fig. 3) that the holes h1, h2 are eliminated.

Fig. 3. The permuted self elements π0(s1) = 0010 and π0(s2) = 0100 induce no holes under the r-contiguous and r-chunk matching rules.
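This elimination can be verified by enumeration. The sketch below (our own illustration) lists the non-self elements whose every length-r window also occurs, at the same position, in some self element (exactly the holes of the r-chunk, and hence r-contiguous, rule), before and after applying π0.

```python
from itertools import product

def holes(S, l, r):
    """Non-self elements for which no r-chunk detector can be generated."""
    U = {"".join(b) for b in product("01", repeat=l)}
    self_windows = {(p, s[p:p + r]) for s in S for p in range(l - r + 1)}
    return {e for e in U - S
            if all((p, e[p:p + r]) in self_windows for p in range(l - r + 1))}

def permute(bits, mapping):
    out = [None] * len(bits)
    for i, c in enumerate(bits, start=1):
        out[mapping[i] - 1] = c      # position i moves to position mapping[i]
    return "".join(out)

S = {"0001", "1000"}                 # s1, s2 from Fig. 2
pi0 = {1: 2, 2: 4, 4: 3, 3: 1}       # pi0 = (1 2 4 3) as an explicit position mapping
S_perm = {permute(s, pi0) for s in S}
```

For the original self set the holes are exactly {0000, 1001} = {h2, h1}; for the permuted set {0010, 0100} the hole set is empty, as Fig. 3 shows.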

However, it is also clear that (2 4 3 1), (4 3 1 2) and (3 1 2 4) represent the same permutation as π0 = (1 2 4 3), namely its cyclic rotations. Specifically, all cyclic rotations of an arbitrarily selected π lead, in terms of r-chunk and r-contiguous matching, to the same holes.

On the other hand, there exist permutation masks which do not reduce holes, i.e. π(s_i) = s_j for i ≠ j and self elements s_1, s_2, ..., s_{|S|}. An example is the permutation π1 = (1 4)(2)(3), as π1(s1) = s2 and π1(s2) = s1.

Furthermore, as mentioned above, a permutation mask can also increase the number of holes. In our subsequently presented experiments this is illustrated, for instance, in figures⁵ 5(c) and 5(d).

5 Permutation Mask Experiments in Hamming Negative Selection

In [18,8], results were presented which demonstrated the coherence between the matching threshold r and the generalization regions when the r-chunk matching rule is applied in Hamming negative selection. Recall that, as holes are not detectable by any detector, holes must represent unseen self elements, or in other words, generalization regions. In the following experiments we investigate how randomly determined permutation masks influence the occurrence of holes (generalization regions). More specifically, we empirically explore whether holes occur in suitable generalization regions when a randomly determined permutation mask is applied. Finally, we explore empirically whether randomly determined permutation masks reduce the number of holes.

⁵ With and without permutation mask.

Stibor et al. [8] have shown in prior experiments that the matching threshold r is a crucial parameter and is inextricably linked to the input data being analyzed. However, permutation masks were not considered in [8]. In order to study the impact of permutation masks on generalization regions, and to obtain results comparable to the previously performed experiments [8], we utilize the same mapping function and data set. Furthermore, we explore the impact of permutation masks on an additional data set (see Fig. 4).

5.1 Experiment Settings

The first self data set contains 1000 Gaussian (µ = 0.5, σ = 0.1) generated points p = (x, y) ∈ [0,1]². Each point p is mapped to a binary string b_1 b_2 ... b_8 b_9 b_10 ... b_16, where the first 8 bits b_x encode the integer x-value i_x := ⌊255 · x + 0.5⌋ and the last 8 bits b_y the integer y-value i_y := ⌊255 · y + 0.5⌋, i.e.

[0,1]² → (i_x, i_y) ∈ {1, ..., 256} × {1, ..., 256} → (b_x, b_y) ∈ U_{0,1}^8 × U_{0,1}^8

This mapping was proposed in [18] and also utilized in [8]; it allows a straightforward visualization of real-valued encoded points in Hamming negative selection.
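As a sketch (our own code, not from [18]; the rounding brackets in i_x were lost in extraction, and we read the expression as i_x = ⌊255 · x + 0.5⌋, i.e. rounding to the nearest integer), the mapping can be written as:

```python
import math

def encode(x, y):
    """Map a point (x, y) in [0,1]^2 to a 16-bit string b_x b_y."""
    ix = math.floor(255 * x + 0.5)   # assumed rounding; brackets lost in the text
    iy = math.floor(255 * y + 0.5)
    return format(ix, "08b") + format(iy, "08b")
```

For instance, the distribution mean (0.5, 0.5) is encoded as 1000000010000000.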

The second data set (termed the banana data set) is depicted in figure (4) and is a commonly used benchmark for anomaly detection problems [19]. The banana data set is taken from [20] and consists of 5300 points in total. These points are partitioned into two different classes: C+, which represents points inside the "banana shape", and C−, which contains points outside of the "banana shape". In this experiment we have taken points from C+ only, to simulate one self region (similar to figure 1). More specifically, we have normalized all points from C+ with the min-max method to the unit square [0,1]². We then sampled 1000 random points from C+ and mapped those sampled points to bit strings of length 16.

As the r-chunk matching rule subsumes the r-contiguous rule, i.e. recognizes at least as many elements as the r-contiguous matching rule (see section 2.2), we have performed all experiments with the r-chunk matching rule. Furthermore, as proposed in [3,9], we have randomly determined permutation masks π ∈ S16.

5.2 Experimental Results

Experimental results are presented in figures (5,6,7,8). The black points represent the 1000 sampled self elements, the white points are holes, and the grey points represent areas which are covered by r-chunk detectors. It is not surprising that

Fig. 4. Banana data set (points from class C+), min-max normalized to [0,1]². In a perfect case (error-free detection), the r-chunk detectors should cover regions outside the "banana" shape. The region within the "banana" shape is the generalization region and should consist of undetectable elements, i.e. holes and self elements.

for both data sets, holes occur as they should in generalization regions when 8 ≤ r ≤ 10. This phenomenon is discussed and explained in [8]. To summarize the results from [8]: a detector matching length which is not at least as long as the semantic representation of the underlying data (in this case 8 bits for the x and y coordinates) results in incorrect generalization regions.

What is more interesting, though, is the observation that a (randomly determined) permutation mask shatters the semantic representation of the underlying data (see Fig. 5-8 (b,d,f,h,j,l,n,p,r,t)); therefore, holes are randomly distributed across the space instead of being concentrated inside or close to self regions. This observation also means that detectors do not cover areas around the self regions; instead they recognize elements which are likewise randomly distributed across the space. Furthermore, one can see that the number of holes when applying permutation masks (see Fig. 5-8 (b,d,f,h,j,l,n,p,r,t)) is in some cases significantly higher than without permutation masks (see Fig. 5-8 (a,c,e,g,i,k,m,o,q,s)). This can be explained by the previous observation that permutation masks distort the underlying data and therefore shatter self regions. As a consequence, the underlying data is transformed into a collection of random chunks. For randomly determined self elements, Stibor et al. [6] showed that the number of holes increases exponentially as r is reduced from l toward 0.

Of course, this shattering effect is strongly linked to the mapping function employed. However, it is clear that every permutation mask, except the identity permutation, distorts the data semantically to some degree. Furthermore, we believe that finding a permutation mask which does not significantly distort the semantic representation of the data may be computationally intractable⁶.

⁶ In the worst case, one has to check all n! permutations of S_n.
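The shattering effect can be made concrete with the 8-bit coordinate encoding of Sect. 5.1: a permutation mask preserves the Hamming distance between bit strings, but not the distance between the decoded points. In the sketch below (our own illustration, using bit reversal as an arbitrary stand-in permutation), two almost identical self coordinates end up far apart after permutation.

```python
import math

def encode8(x):
    """8-bit coordinate encoding from Sect. 5.1 (rounding assumed)."""
    return format(math.floor(255 * x + 0.5), "08b")

def decode8(b):
    return int(b, 2) / 255.0

def hamming(a, b):
    return sum(c != d for c, d in zip(a, b))

b1, b2 = encode8(0.50), encode8(0.51)   # two nearby self coordinates
p1, p2 = b1[::-1], b2[::-1]             # bit reversal as an example permutation mask
```

The Hamming distance between the two strings is unchanged (1 bit), yet the decoded points move from less than 0.01 apart to roughly 0.25 apart; a self region that was contiguous in [0,1]² is torn into distant fragments.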

Fig. 5. A visualized simulation run with 1000 random (self) points generated by a Gaussian distribution with mean µ = 0.5 and variance σ = 0.1. Panels (a)-(t) show r = 2, ..., 11, each without and with a permutation mask. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.

Fig. 6. An additional visualized simulation run with 1000 random (self) points generated by a Gaussian distribution with mean µ = 0.5 and variance σ = 0.1. Panels (a)-(t) show r = 2, ..., 11, each without and with a permutation mask. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.

Fig. 7. A visualized simulation run with 1000 randomly sampled (self) points from the banana data set. Panels (a)-(t) show r = 2, ..., 11, each without and with a permutation mask. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.

Fig. 8. An additional visualized simulation run with 1000 randomly sampled (self) points from the banana data set. Panels (a)-(t) show r = 2, ..., 11, each without and with a permutation mask. The grey shaded area is covered by the generated r-chunk detectors, the white areas are holes. The black points are self elements. The captions which include a "π" are simulation results with the randomly determined permutation mask π ∈ S16.

In order to obtain representative results, we performed 50 simulation runs for each data set, each with a randomly determined permutation mask. Due to lack of space, we cannot present all 50 simulation runs; we have therefore selected two simulation results at random for each data set (see Fig. 5,6,7,8). The remaining simulation results are closely comparable to the results in figures (5,6,7,8).

6 Conclusion

Lymphocyte diversity is an important property of the immune system for recognizing a huge number of diverse substances. This property has been abstracted in terms of permutation masks in the Hamming negative selection detection technique. In this paper we have shown that (randomly determined) permutation masks in Hamming negative selection distort the semantic meaning of the underlying data (the shape of the distribution) and as a consequence shatter self regions. Furthermore, the distorted data is transformed into a collection of random chunks. Hence, detectors do not cover areas around the self regions; instead they are randomly distributed across the space. Moreover, the resulting holes (the generalization) occur in regions where actually no self regions should occur. Additionally, we believe that it is computationally infeasible to find a permutation mask which correctly captures the semantic representation of the data, if one exists at all. We conclude that the use of permutation masks casts doubt on the appropriateness of this abstraction of diversity in Hamming negative selection.

References

1. Forrest, S., Perelson, A.S., Allen, L., Cherukuri, R.: Self-nonself discrimination in a computer. In: Proceedings of the 1994 IEEE Symposium on Research in Security and Privacy, IEEE Computer Society Press (1994)
2. Dasgupta, D., Forrest, S.: Novelty detection in time series data using ideas from immunology. In: Proceedings of the 5th International Conference on Intelligent Systems (1996)
3. Hofmeyr, S.A.: An Immunological Model of Distributed Detection and its Application to Computer Security. PhD thesis, University of New Mexico (1999)
4. Singh, S.: Anomaly detection using negative selection based on the r-contiguous matching rule. In: Proceedings of the 1st International Conference on Artificial Immune Systems (ICARIS), University of Kent at Canterbury Printing Unit (2002) 99–106
5. Kim, J., Bentley, P.J.: An evaluation of negative selection in an artificial immune system for network intrusion detection. In: Proceedings of the Genetic and Evolutionary Computation Conference, GECCO-2001 (2001) 1330–1337
6. Stibor, T., Timmis, J., Eckert, C.: On the appropriateness of negative selection defined over Hamming shape-space as a network intrusion detection system. In: Congress on Evolutionary Computation – CEC 2005, IEEE Press (2005) 995–1002
7. Stibor, T., Timmis, J., Eckert, C.: The link between r-contiguous detectors and k-CNF satisfiability. In: Congress on Evolutionary Computation – CEC 2006, IEEE Press (2006, to appear)
8. Stibor, T., Timmis, J., Eckert, C.: Generalization regions in Hamming negative selection. In: Intelligent Information Processing and Web Mining. Advances in Soft Computing, Springer-Verlag (2006) 447–456
9. Hofmeyr, S., Forrest, S.: Architecture for an artificial immune system. Evolutionary Computation 8 (2000) 443–473
10. de Castro, L.N., Timmis, J.: Artificial Immune Systems: A New Computational Intelligence Approach. Springer-Verlag (2002)
11. Perelson, A.S., Oster, G.: Theoretical studies of clonal selection: minimal antibody repertoire size and reliability of self-nonself discrimination. J. Theor. Biol. 81 (1979) 645–670
12. Percus, J.K., Percus, O.E., Perelson, A.S.: Predicting the size of the T-cell receptor and antibody combining region from consideration of efficient self-nonself discrimination. Proceedings of the National Academy of Sciences USA 90 (1993) 1691–1695
13. Balthrop, J., Esponda, F., Forrest, S., Glickman, M.: Coverage and generalization in an artificial immune system. In: GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, New York, Morgan Kaufmann Publishers (2002) 3–10
14. Esponda, F., Forrest, S., Helman, P.: A formal framework for positive and negative detection schemes. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 34 (2004) 357–373
15. D'haeseleer, P., Forrest, S., Helman, P.: An immunological approach to change detection: algorithms, analysis, and implications. In: Proceedings of the 1996 IEEE Symposium on Research in Security and Privacy, IEEE Computer Society Press (1996) 110–119
16. González, F., Dasgupta, D., Niño, L.F.: A randomized real-valued negative selection algorithm. In: Proceedings of the 2nd International Conference on Artificial Immune Systems (ICARIS). Volume 2787 of Lecture Notes in Computer Science, Edinburgh, UK, Springer-Verlag (2003) 261–272
17. Knuth, D.E.: The Art of Computer Programming, third edn. Volume 1. Addison-Wesley (2002)
18. González, F., Dasgupta, D., Gómez, J.: The effect of binary matching rules in negative selection. In: Genetic and Evolutionary Computation – GECCO-2003. Volume 2723 of Lecture Notes in Computer Science, Chicago, Springer-Verlag (2003) 195–206
19. Tax, D.M.J.: One-class classification. PhD thesis, Technische Universiteit Delft (2001)
20. Rätsch, G.: Benchmark repository (1998) http://ida.first.fraunhofer.de/projects/bench/benchmarks.htm