OpenMP-based parallelization on an MPCore multiprocessor platform – A performance and power analysis

Chair for Electrical Engineering and Computer Systems, RWTH Aachen University, Schinkelstraße 2, 52062 Aachen, Germany; Nokia Research Center, Meesmannstr. 103, 44807 Bochum, Germany
Journal of Systems Architecture 01/2008; DOI:10.1016/j.sysarc.2008.04.001
Source: DBLP

ABSTRACT This contribution analyzes the potential of parallelized software that implements digital signal processing algorithms on a multicore processor platform. For this purpose, various digital signal processing tasks have been implemented on a prototyping platform, an ARM MPCore featuring four ARM11 processor cores. To analyze the effect of parallelization on the resulting performance-power ratio, influencing parameters such as the number of issued program threads have been studied. For parallelization, the OpenMP programming model has been used, which can be applied efficiently at the C level. To develop power-efficient code, a functional and instruction-level power model of the MPCore has also been derived, which features high estimation accuracy. Using this power model and exploiting the capabilities of OpenMP, a variety of exemplary tasks could be parallelized efficiently. On this basis, the general efficiency potential of parallelization for multiprocessor architectures can be assessed.
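As a rough illustration of the kind of C-level OpenMP parallelization the abstract describes, the sketch below splits a simple FIR filter across a configurable number of threads. It is not taken from the paper: the function name `fir_filter`, its parameters, and the choice of an FIR filter as the signal processing task are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical helper, not from the paper: direct-form FIR filter,
   y[i] = sum over k of h[k] * x[i - k], with x[j] treated as 0 for j < 0.
   Every output sample is independent of the others, so the outer loop
   can be divided among `threads` OpenMP threads. When compiled without
   OpenMP support, the pragma is ignored and the loop runs serially. */
void fir_filter(const float *x, float *y, int n,
                const float *h, int n_taps, int threads)
{
    assert(n >= 0 && n_taps > 0 && threads > 0);

    #pragma omp parallel for num_threads(threads) schedule(static)
    for (int i = 0; i < n; ++i) {
        float acc = 0.0f;  /* private to each iteration */
        for (int k = 0; k < n_taps && k <= i; ++k)
            acc += h[k] * x[i - k];
        y[i] = acc;
    }
}
```

With GCC, the pragma takes effect when the file is built with `-fopenmp`. Calling the function with `threads` set to 1, 2, or 4 gives a simple way to vary the number of issued threads on a four-core target, and the filtered output is the same for every setting.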

  • Source
    ABSTRACT: Recently, multicore technology has been introduced to embedded systems in order to improve performance and reduce power consumption. In the present study, three SMP multicore processors for embedded systems and a multicore processor for a desktop PC are evaluated by the parallel benchmark using OpenMP. The results indicate that, even if the memory performance is low, applications that are not memory-intensive exhibit large speedups by parallelization. The results also indicate a large performance improvement due to parallelization using OpenMP, despite its low cost.
    Evolving OpenMP in an Age of Extreme Parallelism, 5th International Workshop on OpenMP, IWOMP 2009, Dresden, Germany, June 3-5, 2009, Proceedings; 01/2009
  • Source
    ABSTRACT: During the past few years, embedded digital systems have been requested to provide a huge amount of processing power and functionality. A very likely foreseeable step to pursue this computational and flexibility trend is the generalization of on-chip multiprocessor platforms (MPSoC). In that context, choosing a programming model and providing optimized hardware support for it on these platforms is a challenging task. To deal in a portable way with MPSoCs having a different number of processors running possibly at different frequencies, work-stealing (WS) based parallelization is a current research trend. The contribution of this paper is to evaluate the impact of some simple MPSoC architecture characteristics on the performance of WS in the MPSoC context. The previous evaluations of WS, either theoretical or experimental, were done on fixed multicore architectures. This work extends these studies by exploring the use of WS for the codesign of embedded applications on MPSoC platforms with different hardware capabilities, thanks to cycle-accurate measurements. We first study the architectural choices suited to WS algorithms and measure the benefit of these architectural modifications. To assess whether WS is suited to the MPSoC context, we experimentally measure its intrinsic implementation overhead on the most efficient architectural designs. Finally, we validate the performance of the approach on two real applications: a regular multimedia application (temporal noise reduction) and an irregular computation-intensive application (frames of the Mandelbrot set). Our results show that enhancing MPSoC platforms having up to 16 processors with widespread hardware support mechanisms can lead to important performance improvements at acceptable hardware cost for the considered applications.
    Journal of Systems Architecture - Embedded Systems Design. 01/2010; 56:392-406.
  • ABSTRACT: The new generation of manufacturing systems brings intelligence and reconfigurability to computerized numerical controller (CNC) machines. This paper presents an open-architecture platform based on multi-agent hardware–software units, by developing a novel Multi-Agent Distributed CONtroller (MADCON) system. This system is intended to fulfill the reconfigurability requirements of the next generation of intelligent machines. The design of intelligent drives for this system follows a hardware–software co-design approach using a simple and intuitive structure. The hardware units of the proposed system integrate control and monitoring functions, providing an FPGA-based open architecture for reconfigurable applications. The software components, in turn, were developed using an XML structure for system description files and include features such as a flowchart descriptive language and a graphical user interface. MADCON was applied to a retrofitted CNC lathe for control and monitoring in order to validate the proposed architecture towards the development of new-generation intelligent manufacturing systems.
    Journal of Systems Architecture. 01/2010;
