OpenMP-based parallelization on an MPCore multiprocessor platform – A performance and power analysis

Chair for Electrical Engineering and Computer Systems, RWTH Aachen University, Schinkelstraße 2, 52062 Aachen, Germany; Nokia Research Center, Meesmannstr. 103, 44807 Bochum, Germany
Journal of Systems Architecture (Impact Factor: 0.69). 11/2008; DOI: 10.1016/j.sysarc.2008.04.001
Source: DBLP

ABSTRACT In this contribution, the potential of parallelized software that implements digital signal processing algorithms on a multicore processor platform is analyzed. For this purpose, various digital signal processing tasks have been implemented on a prototyping platform, an ARM MPCore featuring four ARM11 processor cores. To analyze the effect of parallelization on the resulting performance-power ratio, influencing parameters such as the number of issued program threads have been studied. For parallelization, the OpenMP programming model has been used, which can be applied efficiently at the C level. To develop power-efficient code, a functional and instruction-level power model of the MPCore has also been derived, which features high estimation accuracy. Using this power model and exploiting the capabilities of OpenMP, a variety of exemplary tasks could be parallelized efficiently. From these results, the general efficiency potential of parallelization for multiprocessor architectures can be assessed.
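As a rough illustration of what applying OpenMP at the C level with a configurable thread count can look like, the following minimal sketch parallelizes a FIR filter loop. It is not taken from the paper: the function name `fir_filter`, its parameters, and the choice of a FIR filter as the signal processing task are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example kernel (not from the paper):
 * causal FIR filter, y[i] = sum over k of h[k] * x[i - k],
 * with samples before x[0] treated as zero.
 */
void fir_filter(const float *x, float *y, size_t n,
                const float *h, size_t taps, int num_threads)
{
    assert(taps > 0 && num_threads > 0);

    /* Each output sample is independent of the others, so the outer
     * loop can be split across threads. If the code is compiled
     * without OpenMP support, the pragma is ignored and the loop
     * simply runs sequentially. */
    #pragma omp parallel for num_threads(num_threads) schedule(static)
    for (long i = 0; i < (long)n; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps && k <= (size_t)i; ++k)
            acc += h[k] * x[(size_t)i - k];
        y[i] = acc;
    }
}
```

Calling the function with `num_threads` set to 1, 2, or 4 would be one simple way to vary the number of issued threads on a four-core platform; the result is the same for every thread count because no iteration depends on another.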

  • ABSTRACT: In this paper, we propose a dynamic management framework for multi-core systems. Power management and temperature control are two crucial components of multi-core system control. We therefore propose a flexible and scalable framework that reduces power consumption and temperature by adjusting the number of active cores on a Linux-based multi-core system. To manage the states of a multi-core system, both dynamic power management (DPM) and dynamic voltage/frequency scaling (DVFS) can be applied selectively depending on the type of cores, so that unnecessary cores can be shut down and idling cores can be slowed down. To show the effectiveness and flexibility of our framework, we implemented it on both Intel Centrino Duo and ARM-11 MPCore platforms. In various experiments with the proposed framework, we successfully reduced the power consumption and the temperature.
  • ABSTRACT: As chip multiprocessors have been widely adopted in embedded systems, achieving both high performance and low power consumption for parallel applications has become challenging. To meet these requirements, it is crucial for developers to analyze the performance and energy consumption of parallel applications. In this paper, we propose a tool for profiling and optimizing the performance and energy consumption of OpenMP applications (energy PROfiler and analyzer for OpenMP: ePRO-OMP). The main advantage of ePRO-OMP is that it can analyze both the performance and the energy consumption of each parallel region of an OpenMP application, which can help developers locate the bottlenecks of parallel applications in detail.
    Journal of IEMEK. 01/2011; 6(5).
  • ABSTRACT: Recent advances in semiconductor technologies make it possible to integrate many processor cores in a small device package. The parallel execution capability of such multi-core processors can be exploited to enhance the performance of many traditional sequential applications. There have been numerous research activities to develop parallelization techniques using the OpenMP programming model in order to speed up sequential applications such as the H.264/AVC codec, but mostly in the PC environment. Therefore, it is difficult to understand which parallelization technique fits well with the H.264/AVC encoder on an embedded multi-core architecture. In this paper, we present parallelization techniques applicable to the H.264/AVC encoder on ARM MPCore using the OpenMP programming model. Further, we propose an analytical model for the performance estimation of the H.264/AVC encoder, and we then verify the model accuracy by performing simulations using a hardware/software co-verification tool. Our experimental results show that the parallelization techniques proposed in this paper for the embedded multi-core platform improve the encoder performance by up to 2.36 times, and that the parallelization technique exploiting data-level parallelism outperforms the one using task-level parallelism by 41%. It is also observed that balancing loads among processor cores is critical to achieving better scalability in the encoder.
    Journal of Systems Architecture 10/2012; 58(9):339–353. · 0.69 Impact Factor
