Flexible Grid-Based Clustering

DOI: 10.1007/978-3-540-74976-9_33
Source: DBLP

ABSTRACT Grid-based clustering is particularly appropriate to deal with massive datasets. The principle is to first summarize the dataset
with a grid representation, and then to merge grid cells in order to obtain clusters. All previous methods use grids with
hyper-rectangular cells. In this paper we propose a flexible grid built from arbitrarily shaped polyhedra for the data summary.
For the clustering step, a graph is then extracted from this representation. Its edges are weighted by combining density and
spatial information. The clusters are identified as the main connected components of this graph. We present experiments indicating
that our grid often leads to better results than an adaptive rectangular grid method.
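
The general grid-based pipeline described in the abstract (summarize the data into cells, link cells into a graph, read clusters off as connected components) can be sketched as follows. This is a minimal illustration and not the paper's method: it assumes a plain regular rectangular grid instead of the proposed polyhedral one, and it replaces the density-and-spatial edge weighting with a simple per-cell count threshold. The function name `grid_cluster` and the parameters `cell_size` and `min_count` are hypothetical.

```python
from collections import Counter, deque
from itertools import product


def cell_of(point, cell_size):
    """Map a point to the integer index of the grid cell that contains it."""
    return tuple(int(c // cell_size) for c in point)


def grid_cluster(points, cell_size=1.0, min_count=2):
    """Label points by connected components of adjacent dense grid cells.

    Assumptions (not from the paper): regular rectangular cells, and an
    unweighted graph that links any two dense cells touching by a face,
    edge, or corner. Points in sparse cells get the label -1.
    """
    # Step 1: summarize the dataset as per-cell point counts.
    counts = Counter(cell_of(p, cell_size) for p in points)
    dense = {cell for cell, n in counts.items() if n >= min_count}

    # Offsets to all neighbouring cells (every non-zero shift in {-1, 0, 1}^d).
    dim = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    # Step 2: breadth-first search over the dense-cell adjacency graph.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for o in offsets:
                neighbour = tuple(c + d for c, d in zip(cell, o))
                if neighbour in dense and neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    # Step 3: each point inherits the label of its cell.
    return [labels.get(cell_of(p, cell_size), -1) for p in points]
```

Because only the occupied cells are stored, the cost of the merging step depends on the number of non-empty cells rather than on the number of points, which is the usual reason grid summaries suit massive datasets.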

  • ABSTRACT: In recent years, many measurement studies have shown the ubiquity of scanning activities in the Internet and the growing sophistication of probing techniques, which have become stealthier by stretching slowly over time or by using spoofed source IP addresses. Scans are mainly generated by attackers trying to map the configuration of a target network and by computer worms trying to spread over the Internet. Although the problem of scan detection has received a lot of attention from network security researchers, current state-of-the-art methods still suffer from a high percentage of false alarms or a low scan detection rate. In this paper, we propose to detect changes in scanning patterns by monitoring variation in the distribution of scan features in a space spanned by IP source address, IP destination address, source port number, and destination port number. This gives insight into the characteristics of scanning activities and exposes the presence of emerging scanning attacks and worms. To this end, we propose an information-theoretic approach to detecting changes in distributions.
    NETWORKING 2009, 8th International IFIP-TC 6 Networking Conference, Aachen, Germany, May 11-15, 2009. Proceedings; 01/2009
  • ABSTRACT: The ensemble clustering algorithm ECCA (Ensemble of Combined Clustering Algorithms) for processing large datasets is proposed and theoretically substantiated. Results of an experimental study of the algorithm on simulated and real data, demonstrating its effectiveness, are presented.
    Optoelectronics Instrumentation and Data Processing 06/2011; 47(3):245-252. DOI:10.3103/S8756699011030071