Automating the Analysis of Honeypot Data (Extended Abstract).
ABSTRACT We describe ongoing work towards further automating the analysis of data generated by large honeynet architectures called Leurre.com and SGNET. The underlying motivation is to help us integrate the use of honeypot data into daily network security monitoring. We propose a system based on two automated steps: i) the detection of relevant attack events within a large honeynet traffic data set, and ii) the extraction of highly similar events based on temporal correlation.
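The two steps can be illustrated with a minimal sketch. The abstract does not specify the detection rule or the correlation measure, so everything below is an assumption: the function names, the median-based threshold in `detect_events`, the use of Pearson correlation, and the sample sensor data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def detect_events(counts, factor=3.0):
    """Step i (assumed rule): indices of time bins whose count exceeds factor * median."""
    ordered = sorted(counts)
    median = ordered[len(ordered) // 2]
    return [i for i, c in enumerate(counts) if c > factor * max(median, 1)]


def correlated_pairs(series, threshold=0.9):
    """Step ii (assumed measure): pairs of series with Pearson correlation >= threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(series), 2)
        if pearson(series[a], series[b]) >= threshold
    ]


# Hypothetical attack counts per time bin for three sensors.
sample = {
    "sensor_a": [1, 1, 2, 10, 12, 1, 1],
    "sensor_b": [0, 1, 1, 9, 11, 1, 0],
    "sensor_c": [5, 4, 6, 5, 4, 6, 5],
}
events_a = detect_events(sample["sensor_a"])
similar = correlated_pairs(sample)
```

In this toy data, `sensor_a` and `sensor_b` show a burst in the same time bins and are grouped together, while `sensor_c` fluctuates independently and is left out.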
ABSTRACT: Rigorously characterizing the statistical properties of cyber attacks is an important problem. In this paper, we propose the first statistical framework for rigorously analyzing honeypot-captured cyber attack data. The framework is built on the novel concept of a stochastic cyber attack process, a new kind of mathematical object for describing cyber attacks. To demonstrate the use of the framework, we apply it to a low-interaction honeypot dataset, while noting that the framework can equally be applied to high-interaction honeypot data, which contains richer information about the attacks. The case study finds, for the first time, that long-range dependence (LRD) is exhibited by honeypot-captured cyber attacks. The case study also confirms that, by exploiting the statistical properties (LRD in this case), it is feasible to predict cyber attacks (at least in terms of attack rate) with good accuracy. This kind of prediction capability would provide sufficient early-warning time for defenders to adjust their defense configurations or resource allocations. The idea of “gray-box” (rather than “black-box”) prediction is central to the utility of the statistical framework, and represents a significant step towards ultimately understanding (the degree of) the predictability of cyber attacks.
IEEE Transactions on Information Forensics and Security 11/2013; 8(11):1775-1789. · 1.90 Impact Factor
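As a rough illustration of what testing for LRD can involve, the sketch below estimates the Hurst exponent H of an attack-rate series with the aggregated-variance method. This is a standard textbook estimator chosen here for brevity, not necessarily the one used in the paper; the function name and the default block sizes are assumptions.

```python
import math
import random


def hurst_aggregated_variance(series, block_sizes=(2, 4, 8, 16, 32, 64)):
    """Estimate the Hurst exponent H via the aggregated-variance method.

    For each block size m, the series is averaged over non-overlapping blocks
    and the sample variance of those block means is computed. Under long-range
    dependence that variance scales like m ** (2H - 2), so H is recovered from
    the least-squares slope of log(variance) against log(m).
    """
    xs, ys = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # not enough blocks to compute a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / (n_blocks - 1)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("series too short for the chosen block sizes")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return 1.0 + slope / 2.0


# Synthetic stand-in for an attack-rate series: independent Gaussian noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_aggregated_variance(noise)
```

For independent noise the estimate should fall near 0.5; values of H between 0.5 and 1 are the usual indication of long-range dependence.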