Conference Paper

Tradeoff Exploration between Reliability, Power Consumption, and Execution Time

DOI: 10.1007/s10009-012-0263-9
Conference: Computer Safety, Reliability, and Security - 30th International Conference, SAFECOMP 2011, Naples, Italy, September 19-22, 2011. Proceedings
Source: DBLP


For autonomous critical real-time embedded systems (e.g., satellites), guaranteeing a very high level of reliability is as important as keeping the power consumption as low as possible. We propose an off-line scheduling heuristic which, given a software application graph and a multiprocessor architecture (homogeneous and fully connected), produces a static multiprocessor schedule that optimizes three criteria: its length (crucial for real-time systems), its reliability (crucial for dependable systems), and its power consumption (crucial for autonomous systems). Our tricriteria scheduling heuristic, called TSH, uses active replication of the operations and the data-dependencies to increase reliability, and dynamic voltage and frequency scaling (DVFS) to lower power consumption. We demonstrate the soundness of TSH. We also provide extensive simulation results to show how TSH behaves in practice: first, we run TSH on a single instance to produce the whole Pareto front in 3D; second, we compare TSH against the ECS heuristic (Energy-Conscious Scheduling) from the literature; and third, we compare TSH against an optimal Mixed Integer Linear Program.
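Since TSH trades off three criteria at once, candidate schedules are naturally compared by Pareto dominance over the triple (length, power, 1 - reliability). The sketch below illustrates that comparison and the filtering step that yields a 3D Pareto front; it is a minimal illustration, not code from the paper, and the Schedule record and its field names are hypothetical.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Schedule:
        # Hypothetical record for one candidate schedule (not from the paper).
        length: float        # schedule length, to minimize
        power: float         # power consumption, to minimize
        reliability: float   # probability of correct execution, to maximize

    def criteria(s: Schedule) -> tuple:
        # Express all three criteria as "smaller is better".
        return (s.length, s.power, 1.0 - s.reliability)

    def dominates(a: Schedule, b: Schedule) -> bool:
        # a dominates b iff a is no worse on every criterion
        # and strictly better on at least one.
        ca, cb = criteria(a), criteria(b)
        return all(x <= y for x, y in zip(ca, cb)) and ca != cb

    def pareto_front(schedules: list) -> list:
        # Keep only the non-dominated schedules: the 3D Pareto front.
        return [s for s in schedules
                if not any(dominates(t, s) for t in schedules if t is not s)]

Enumerating schedules under a grid of power and reliability constraints and filtering them this way is one straightforward way to visualize the kind of 3D Pareto front the paper reports.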

  • Source
    • "Zhao et al. [22] used multi-objective ACO for reliability optimization of series-parallel systems. Assayad et al. [27] presented an offline scheduling heuristic to optimize reliability , power consumption, and performance of realtime embedded systems. A recent paper by Etemaadi and Chaudron [8] proposed two new generic architectural DoFs in metaheuristic optimization of component-based embedded systems: topology of hardware platform and load balancing of software components. "
    ABSTRACT: In this paper, we present a novel Multi-Objective Ant Colony System algorithm to optimize Cost, Performance, and Reliability (MOACS-CoPeR) in the cloud. The proposed algorithm provides a metaheuristic-based approach for the multi-objective cloud-based software component deployment problem. MOACS-CoPeR explores the search space of architecture design alternatives with respect to several architectural degrees of freedom and produces a set of Pareto-optimal deployment configurations. Two salient features of the proposed approach are that it is not dependent on a particular modeling language and that it does not require an initial architecture configuration. Moreover, it eliminates undesired and infeasible configurations at an early stage by using the performance and reliability requirements of individual software components as heuristic information to guide the search process. We also present a Java-based implementation of our proposed algorithm and compare its results with Non-dominated Sorting Genetic Algorithm II (NSGA-II). We evaluate the two algorithms against a cloud-based storage service, which is loosely based on a real system. The results show that MOACS-CoPeR outperforms NSGA-II in terms of the number and quality of Pareto-optimal configurations found.
    Report number: 1142, Affiliation: Turku Centre for Computer Science
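As a companion to the abstract above, here is a minimal skeleton of the general multi-objective ant-colony pattern for assigning software components to hosts. It is a generic sketch under stated assumptions, not the MOACS-CoPeR implementation: the pheromone and heuristic tables, the dominance-based archive, the update rule, and the parameter values (ALPHA, BETA, RHO) are all assumptions, and evaluate is a user-supplied stub returning a (cost, performance, 1 - reliability) triple.

    import random

    ALPHA, BETA, RHO = 1.0, 2.0, 0.1  # assumed pheromone weight, heuristic weight, evaporation rate

    def construct(components, hosts, pheromone, heuristic):
        # One ant builds a deployment: each component is assigned a host with
        # probability proportional to pheromone^ALPHA * heuristic^BETA.
        deployment = {}
        for c in components:
            weights = [(pheromone[c, h] ** ALPHA) * (heuristic[c, h] ** BETA)
                       for h in hosts]
            deployment[c] = random.choices(hosts, weights=weights)[0]
        return deployment

    def dominates(a, b):
        # a, b are objective triples, all expressed as "smaller is better".
        return all(x <= y for x, y in zip(a, b)) and a != b

    def moacs(components, hosts, evaluate, n_ants=20, n_iters=100):
        pheromone = {(c, h): 1.0 for c in components for h in hosts}
        # Heuristic values could encode per-component performance/reliability
        # requirements to prune undesirable hosts early; uniform here.
        heuristic = {(c, h): 1.0 for c in components for h in hosts}
        archive = []  # non-dominated (objectives, deployment) pairs
        for _ in range(n_iters):
            for _ in range(n_ants):
                d = construct(components, hosts, pheromone, heuristic)
                obj = evaluate(d)
                if not any(dominates(o, obj) for o, _ in archive):
                    archive = [(o, dd) for o, dd in archive
                               if not dominates(obj, o)]
                    archive.append((obj, d))
            # Evaporate everywhere, then reinforce the assignments used by
            # the archived (non-dominated) deployments.
            for key in pheromone:
                pheromone[key] *= (1.0 - RHO)
            for _, d in archive:
                for c, h in d.items():
                    pheromone[c, h] += RHO
        return archive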
  • Source
    • "In the simulations, the reference speed is set to be s ref = 0.6 with an error rate of λ F ref = 10 −5 for fail-stop errors, and the sensitivity parameter is set to be d = 3. These parameters represent realistic settings reported in the literature [1] [3] [34], and they correspond to 0.83 ∼ 129 errors over the entire chain of computation depending on the processing speed chosen. For silent errors, we assume that its error rate is related to that of the fail-stop errors as λ S (s) = η · λ F (s), where η > 0 is the relative parameter. "
    ABSTRACT: In this paper, we combine the traditional checkpointing and rollback recovery strategies with verification mechanisms to address both fail-stop and silent errors. The objective is to minimize either makespan or energy consumption. While DVFS is a popular approach for reducing the energy consumption, using lower speeds/voltages can increase the number of errors, thereby complicating the problem. We consider an application workflow whose dependence graph is a chain of tasks, and we study three execution scenarios: (i) a single speed is used during the whole execution; (ii) a second, possibly higher speed is used for any potential re-execution; (iii) different pairs of speeds can be used throughout the execution. For each scenario, we determine the optimal checkpointing and verification locations (and the optimal speeds for the third scenario) to minimize either objective. The different execution scenarios are then assessed and compared through an extensive set of experiments.
    • "The reliability of a task T i executed once at speed f is R i (f ) = e −λ(f )×Exe(w i ,f ) . Because the fault rate is usually very small, of the order of 10 −6 per time unit in [9] [43], 10 −5 in [5], we can use the first order approximation of R i (f ) as "
