Conference Paper

Bus Access Optimization for Predictable Implementation of Real-Time Applications on Multiprocessor Systems-on-Chip

DOI: 10.1109/RTSS.2007.24
Conference: 28th IEEE International Real-Time Systems Symposium (RTSS 2007)
Source: IEEE Xplore

ABSTRACT In multiprocessor systems, the traffic on the bus does not solely originate from data transfers due to data dependencies between tasks, but is also affected by memory transfers as a result of cache misses. This has a huge impact on worst-case execution time (WCET) analysis and, in general, on the predictability of real-time applications implemented on such systems. As opposed to the WCET analysis performed for a single processor system, where the cache miss penalty is considered constant, in a multiprocessor system each cache miss has a variable penalty, depending on the bus contention. This affects the tasks' WCET which, however, is needed in order to perform system scheduling. At the same time, the WCET depends on the system schedule due to the bus interference. In this paper we present an approach to worst-case execution time analysis and system scheduling for real-time applications implemented on multiprocessor SoC architectures. The emphasis of this paper is on the bus scheduling policy and its optimization, which is of huge importance for the performance of such a predictable multiprocessor application.
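The variable miss penalty described in the abstract can be illustrated with a small sketch. The abstract does not name a bus policy, so this example assumes a simple TDMA bus with equal-length slots assigned to cores in round-robin order; the function `miss_penalty` and all of its parameters are hypothetical names chosen for illustration, not taken from the paper.

```python
def miss_penalty(issue_time, core, num_cores, slot_len, transfer_time):
    """Cycles from issuing a cache miss until its bus transfer completes.

    Assumption (not from the paper): a TDMA bus where core c owns the
    window [c * slot_len, (c + 1) * slot_len) of every period, and a
    transfer must fit entirely inside one slot of the issuing core.
    """
    if transfer_time > slot_len:
        raise ValueError("transfer must fit inside a single slot")
    period = num_cores * slot_len
    slot_start = core * slot_len      # offset of this core's slot in a period
    slot_end = slot_start + slot_len
    offset = issue_time % period      # position of the request in the period

    if slot_start <= offset and offset + transfer_time <= slot_end:
        wait = 0                      # served immediately in the current slot
    elif offset < slot_start:
        wait = slot_start - offset    # own slot is still ahead in this period
    else:
        wait = period - offset + slot_start  # wait for the slot in the next period
    return wait + transfer_time


# Same miss, different issue times: the penalty is not a constant.
best = min(miss_penalty(t, 0, 2, 10, 4) for t in range(20))
worst = max(miss_penalty(t, 0, 2, 10, 4) for t in range(20))
```

Under these assumptions the penalty for core 0 ranges from the bare transfer time to more than a full slot of waiting, depending only on when the miss is issued relative to the bus schedule. This is the circular dependency the abstract points out: the WCET depends on the schedule, and the schedule needs the WCET.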

  • ABSTRACT: Given that power is one of the biggest concerns of embedded systems, many devices have replaced DRAM with non-volatile Phase Change Memories (PCM). Some applications need to adhere to strict timing constraints and thus their temporal behavior must be analyzed before deploying them. Moreover, modern systems typically contain multiple cores, causing an application to incur significant delays due to the contention for the shared bus and shared main memory (PCM in this work). One of the challenges in the timing analysis for PCM main memories is the high discrepancy between read and write latencies and the high contention among cores. Finding an upper bound on these delays is non-trivial mainly because (i) memory requests may be issued by co-executing applications at random times, (ii) it is difficult to determine a priori which applications will be concurrently executing, and (iii) the type of requests that applications will issue is not known in advance. This work proposes a method to derive upper bounds on the increase in execution time of applications executing on such PCM-based multicores. It considers the contention on the shared memory and focuses on dealing with the asymmetric read and write latencies of PCM-based memories, while taking into account the specific policy applied to schedule requests by the memory controller.
    19th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2013), Taipei, Taiwan; 08/2013
  • ABSTRACT: Multicore processors are an effective solution to cope with the performance requirements of real-time embedded systems due to their good performance-per-watt ratio and high performance capabilities. Unfortunately, their use in integrated architectures such as IMA or AUTOSAR is limited by the fact that multicores do not guarantee a time composable behavior for the applications: the WCET of a task depends on inter-task interferences introduced by other tasks running simultaneously. This article focuses on the off-chip memory system: the hardware shared resource with the highest impact on the WCET and hence the main impediment for the use of multicores in integrated architectures. We present an analytical model that computes the worst-case delay, also known as Upper Bound Delay (UBD), that a memory request can suffer due to memory interferences generated by other co-running tasks. By considering the UBD in the WCET analysis, the resulting WCET estimation is independent from the other tasks, hence ensuring the time composability property and enabling the use of multicores in integrated architectures. We propose a memory controller for hard real-time multicores compliant with the analytical model that implements extra hardware features to deal with refresh operations and interferences generated by co-running non hard real-time tasks.
    ACM Transactions on Embedded Computing Systems (TECS). 03/2013; 12(1s).
  • ABSTRACT: Consider the problem of scheduling a task set $\tau$ of implicit-deadline sporadic tasks to meet all deadlines on a t-type heterogeneous multiprocessor platform where tasks may access multiple shared resources. The multiprocessor platform has $m_k$ processors of type-k, where $k \in \{1, 2, \ldots, t\}$. The execution time of a task depends on the type of processor on which it executes. The set of shared resources is denoted by $R$. For each task $\tau_i$, there is a resource set $R_i \subseteq R$ such that for each job of $\tau_i$, during one phase of its execution, the job requests to hold the resource set $R_i$ exclusively with the interpretation that (i) the job makes a single request to hold all the resources in the resource set $R_i$ and (ii) at all times, when a job of $\tau_i$ holds $R_i$, no other job holds any resource in $R_i$. Each job of task $\tau_i$ may request the resource set $R_i$ at most once during its execution. A job is allowed to migrate when it requests a resource set and when it releases the resource set but a job is not allowed to migrate at other times. Our goal is to design a scheduling algorithm for this problem and prove its performance. We propose an algorithm, LP-EE-vpr, which offers the guarantee that if an implicit-deadline sporadic task set is schedulable on a t-type heterogeneous multiprocessor platform by an optimal scheduling algorithm that allows a job to migrate only when it requests or releases a resource set, then our algorithm also meets the deadlines with the same restriction on job migration, if given processors $4 \times (1 + \operatorname{MAXP}\times \lceil \frac{\vert P\vert \times\operatorname{MAXP}}{\min \{m_{1}, m_{2}, \ldots, m_{t} \}} \rceil )$ times as fast. (Here $\operatorname{MAXP}$ and $\vert P\vert$ are computed based on the resource sets that tasks request.) For the special case that each task requests at most one resource, the bound of LP-EE-vpr collapses to $4 \times (1 + \lceil \frac{\vert R\vert }{\min \{m_{1}, m_{2}, \ldots, m_{t} \}} \rceil )$. To the best of our knowledge, LP-EE-vpr is the first algorithm with proven performance guarantee for real-time scheduling of sporadic tasks with resource sharing on t-type heterogeneous multiprocessors.
    Real-Time Systems 03/2014; 50(2):270-314. · 0.55 Impact Factor
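
The two speed-up bounds quoted in the last abstract above can be evaluated numerically. The following is a minimal sketch that only transcribes the formulas as printed; the function names are hypothetical, and because the abstract does not define how MAXP and |P| are derived from the resource sets, they are taken here as plain integer inputs.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers (avoids float rounding)."""
    return -(-a // b)


def speedup_bound(maxp, p_size, processors_per_type):
    """General bound: 4 * (1 + MAXP * ceil(|P| * MAXP / min(m_1..m_t))).

    maxp and p_size stand for MAXP and |P|; their derivation from the
    resource sets is not given in the abstract, so they are inputs here.
    """
    m_min = min(processors_per_type)
    return 4 * (1 + maxp * ceil_div(p_size * maxp, m_min))


def speedup_bound_single_resource(num_resources, processors_per_type):
    """Special case (each task requests at most one resource):
    4 * (1 + ceil(|R| / min(m_1..m_t)))."""
    m_min = min(processors_per_type)
    return 4 * (1 + ceil_div(num_resources, m_min))


# Illustrative values only: 3 shared resources, 2 processors of type 1
# and 4 processors of type 2.
single = speedup_bound_single_resource(3, [2, 4])
general = speedup_bound(2, 3, [2, 4])
```

As a purely arithmetic observation, substituting MAXP = 1 and |P| = |R| into the general expression yields the same value as the special-case expression; whether those are the values the paper assigns in that case is not stated in the abstract.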
