An adaptive approach to network resilience: Evolving challenge detection and mitigation.
ABSTRACT It is widely agreed that computer networks need to become more resilient to a range of challenges that can seriously impact their normal operation. Challenges include malicious attacks, misconfigurations, accidental faults and operational overloads. As part of an overall strategy for network resilience, a crucial requirement is the identification of challenges in real time, followed by the application of appropriate remedial action. In this paper, we motivate and describe a novel solution that enables the progressive multi-stage deployment of resilience strategies, based on incomplete challenge and context information. Policies are used to orchestrate the interactions between various resilience mechanisms, which incrementally identify the nature of a challenge and deploy appropriate remediation mechanisms. We demonstrate the benefits of this approach via simulation of a resource starvation attack on an Internet Service Provider infrastructure. A key contribution of our work is that, by initially using lightweight detection and then progressively applying more heavyweight analysis, a challenge can be mitigated as early as possible and its root cause rapidly detected. The approach we propose in this paper has the flexibility, reproducibility and extensibility needed to assist in the identification and remediation of various network challenges in the future.
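The progressive, multi-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Flow` and `Network` types, the three stage functions, the thresholds and the action strings are all hypothetical, chosen only to show a cheap check triggering an early, coarse mitigation, followed by heavier analysis that narrows the remediation.

```python
from collections import Counter
from dataclasses import dataclass, field

# All names and thresholds below are illustrative assumptions, not from the paper.
LINK_CAPACITY = 1000        # hypothetical traffic units per monitoring interval
UTILISATION_ALARM = 0.8     # hypothetical link utilisation that raises an alarm
SOURCE_SHARE_ALARM = 0.3    # hypothetical share of victim traffic marking a source


@dataclass
class Flow:
    src: str
    dst: str
    size: int  # traffic volume observed in the interval


@dataclass
class Network:
    """Toy stand-in for the managed network; it only records remediation actions."""
    actions: list = field(default_factory=list)

    def apply(self, action: str) -> None:
        self.actions.append(action)


def stage1_link_monitor(flows, net):
    """Lightweight check: aggregate utilisation only. Mitigate coarsely if high."""
    total = sum(f.size for f in flows)
    if total / LINK_CAPACITY < UTILISATION_ALARM:
        return False
    net.apply("rate-limit link")
    return True


def stage2_find_victim(flows, net):
    """Heavier analysis: per-destination volumes. Narrow the rate limit."""
    per_dst = Counter()
    for f in flows:
        per_dst[f.dst] += f.size
    victim, _ = per_dst.most_common(1)[0]
    net.apply(f"rate-limit traffic to {victim}")
    return victim


def stage3_find_sources(flows, victim, net):
    """Heaviest analysis: per-source volumes towards the victim. Block culprits."""
    to_victim = [f for f in flows if f.dst == victim]
    total = sum(f.size for f in to_victim)
    per_src = Counter()
    for f in to_victim:
        per_src[f.src] += f.size
    culprits = sorted(s for s, b in per_src.items() if b / total >= SOURCE_SHARE_ALARM)
    for s in culprits:
        net.apply(f"block {s} -> {victim}")
    return culprits


def handle_interval(flows, net):
    """Policy-like orchestration: escalate only when the previous stage fires."""
    if not stage1_link_monitor(flows, net):
        return []
    victim = stage2_find_victim(flows, net)
    return stage3_find_sources(flows, victim, net)
```

In this sketch the coarse link-level rate limit is applied before the root cause is known, which mirrors the abstract's aim of mitigating early on incomplete information; the later stages then replace guesswork with more targeted actions.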
- Source available from: Alberto Egon Schaeffer-Filho
ABSTRACT: Network resilience strategies aim to maintain acceptable levels of network operation in the face of challenges, such as malicious attacks, operational overload or equipment failures. Often the nature of these challenges requires resilience strategies comprising mechanisms across multiple protocol layers and in disparate locations of the network. In this paper, we address the problem of resilience management and advocate that a new approach is needed for the design and evaluation of resilience strategies. To support the realisation of this approach, we propose a framework that enables (1) the offline evaluation of resilience strategies to combat several types of challenges, (2) the generalisation of successful solutions into reusable patterns of mechanisms, and (3) the rapid deployment of appropriate patterns when challenges are observed at run-time. The evaluation platform permits the simulation of a range of challenge scenarios and the resilience strategies used to combat these challenges. Strategies that can successfully address a particular type of challenge can be promoted to become resilience patterns. Patterns can thus be used to rapidly deploy resilience configurations of mechanisms when similar challenges are detected in the live network.
01/2012