Fault-Tolerant Software Reliability Engineering

Available from: David Franklin McAllister, Apr 02, 2015
  • Source
    • "If all versions do not agree, there may be an issue. Anomaly detection techniques that work by comparing output vectors of diverse, but functionally equivalent, software have been in active use in the high-assurance software industry for a long time [3] [4]. Recently these techniques have been revisited in the context of cybersecurity for Web-based systems [5] [6]. "

    Preview · Article · Jan 2016
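
The comparison described in the excerpt above can be sketched in a few lines. This is only an illustrative sketch, not the cited authors' implementation: the function name `compare_versions` and the strict-majority rule are assumptions made here, and each version's output vector is assumed to be hashable (for example, a tuple).

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def compare_versions(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], bool]:
    """Compare the outputs of N functionally equivalent versions for one run.

    Returns (majority_output, anomaly):
      - anomaly is True whenever the versions do not all agree;
      - majority_output is the value produced by a strict majority of
        versions, or None if no such majority exists.
    """
    if not outputs:
        raise ValueError("need at least one version output")
    counts = Counter(outputs)
    value, votes = counts.most_common(1)[0]
    anomaly = len(counts) > 1                      # any disagreement at all
    majority = value if votes > len(outputs) / 2 else None
    return majority, anomaly


# Hypothetical usage: three versions, one of which disagrees.
result, anomaly = compare_versions([(1, 2), (1, 2), (9, 9)])
```

Flagging every disagreement while still returning the majority value keeps detection separate from masking; a real system would also need a tolerance rule for outputs that are numerically close but not identical.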
  • Source
    • "The MRP approach can be used for modeling fault-tolerant software systems. Per-run failure probability and run's execution-time distribution for a particular fault-tolerant technique can be derived using a variety of existing models (see [10], [20], [30] and references therein). Thus, in addition to the interversion failure correlation on a single run considered in related work, our approach can account for the correlation among successive failures. "
    ABSTRACT: Perhaps the most stringent restriction in most software reliability models is the assumption of statistical independence among successive software failures. The authors' research was motivated by the fact that although there are practical situations in which this assumption could be easily violated, much of the published literature on software reliability modeling does not seriously address this issue. The research work in this paper is devoted to developing a software reliability modeling framework that can consider the phenomenon of failure correlation and to studying its effects on the software reliability measures. The important property of the developed Markov renewal modeling approach is its flexibility. It allows construction of the software reliability model in both discrete time and continuous time, and (depending on the goals) allows the analysis to be based either on Markov chain theory or on renewal process theory. Thus, their modeling approach is an important step toward more consistent and realistic modeling of software reliability. It can be related to existing software reliability growth models. Many input-domain and time-domain models can be derived as special cases under the assumption of failure s-independence. This paper aims at showing that the classical software reliability theory can be extended to consider a sequence of possibly s-dependent software runs, viz, failure correlation. It does not deal with inference or with predictions, per se. For the model to be fully specified and applied to estimations and predictions in real software development projects, we need to address many research issues, e.g., the detailed assumptions about the nature of the overall reliability growth, and the way modeling parameters change as a result of the fault-removal attempts.
    Full-text · Article · Apr 2000 · IEEE Transactions on Reliability
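
One simple way to picture correlation among successive runs is a two-state discrete-time Markov chain over run outcomes (success or failure). The sketch below is an illustration under that assumption and is not taken from the paper; the parameter names `p_ss` (probability of success after a success) and `p_ff` (probability of failure after a failure) are chosen here. When `p_ss + p_ff == 1`, the outcome of a run no longer depends on the previous one, which corresponds to the s-independence case mentioned in the abstract.

```python
from typing import List


def failure_probabilities(p_ss: float, p_ff: float, n_runs: int,
                          start_failed: bool = False) -> List[float]:
    """Probability that each of n_runs fails, given the outcome before run 1.

    p_ss: P(run succeeds | previous run succeeded)
    p_ff: P(run fails    | previous run failed)
    """
    probs = []
    fail = 1.0 if start_failed else 0.0
    for _ in range(n_runs):
        # Total probability over the previous run's outcome.
        fail = fail * p_ff + (1.0 - fail) * (1.0 - p_ss)
        probs.append(fail)
    return probs


def stationary_failure_probability(p_ss: float, p_ff: float) -> float:
    """Long-run fraction of failed runs for the two-state chain."""
    denom = (1.0 - p_ss) + (1.0 - p_ff)
    if denom == 0.0:
        raise ValueError("chain never changes state; no unique stationary value")
    return (1.0 - p_ss) / denom
```

With `p_ff` larger than `1 - p_ss`, failures tend to cluster, and the long-run failure probability rises above the per-run value `1 - p_ss` that an independence assumption would suggest.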
  • Source
    • "The effectiveness of any fault-tolerant architecture depends crucially on the probability of common failure between its redundant parts. Substantial research effort has been spent on studying this factor in the case of N-version software (summaries are in references [40] [11] [41]). Empirical studies ranged between the two extremes of case studies attempting to approximate realistic development processes, producing few program versions, and statistically controlled experiments made affordable by using student programmers. "
    ABSTRACT: This chapter surveys techniques for tolerating the effects of design defects in computer systems, paying special attention to software. Design faults are a major cause of failure in modern computer systems, and their relative importance is growing as techniques for tolerating physical faults gain wider acceptance. Although design faults could in principle be eliminated, in practice they are inevitable in many categories of systems, and designers need to apply fault tolerance for mitigating their effects. Limited degrees of fault tolerance in software – "defensive programming" – are common, but systematic application of fault tolerance for design faults is still rare and mostly limited to highly critical systems. However, the increasing dependence of system designers on off-the-shelf components often makes fault tolerance a necessary, feasible and probably cost-effective solution for achieving modest dependability improvements at affordable cost. This chapter introduces techniques and principles, outlines similarities and differences with fault tolerance against physical faults, provides a structured description of the space of design solutions, and discusses some design issues and trade-offs.
    Preview · Article ·
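
To see why the probability of common failure matters so much, consider a small numerical sketch. It assumes a deliberately simplified model that does not come from the chapter: an odd number `n` of versions with majority voting, each version failing independently with probability `p`, plus a hypothetical common-mode event with probability `common` that makes all versions fail together. The system is counted as failed whenever a majority of versions fail.

```python
from math import comb


def majority_failure_probability(n: int, p: float, common: float = 0.0) -> float:
    """P(a majority of n versions fail) under the simplified model above.

    n:      number of versions (assumed odd, so there are no ties)
    p:      per-version independent failure probability
    common: probability of a common-mode failure hitting all versions
    """
    need = n // 2 + 1  # smallest number of failures that forms a majority
    independent = sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(need, n + 1)
    )
    # Either the common-mode event occurs, or it does not and a majority
    # of versions fail independently.
    return common + (1.0 - common) * independent


# Hypothetical numbers: three versions, each failing 10% of the time.
independent_only = majority_failure_probability(3, 0.1)
with_common_mode = majority_failure_probability(3, 0.1, common=0.01)
```

Under these assumed numbers, even a 1% common-mode probability adds noticeably to the system failure probability, which is why the degree of failure dependence between versions dominates the benefit of redundancy.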