R.M. Badia

Barcelona Supercomputing Center, Barcelona, Catalonia, Spain

Publications (20) · 3.07 Total impact

  • Conference Proceeding: A Study of Speculative Distributed Scheduling on the Cell/B.E.
    ABSTRACT: Star Superscalar's (StarSs) programming model converts a sequential application in C or Fortran into an efficient parallel program. The resulting parallel code is highly dynamic in the sense that data analysis and task scheduling occur at run-time, while the application executes. In this paper we compare this approach to the strategy adopted by other multi-core programming environments. The price to pay for dynamic scheduling and dependence tracking is higher runtime overhead. We propose a distributed scheduler for Task Dependence Graphs (TDGs) to attenuate the scheduling cost in heterogeneous multi-core architectures. This scheduler allows the cores to speculatively select tasks from a conservative estimate of the TDG. In case of conflicts or lack of tasks, a lightweight centralized scheduler services the faulting core, after which the latter resumes its participation in the distributed scheme. Experiments with Cell Superscalar (CellSs) on a representative set of benchmarks demonstrate the reduction in runtime overhead achieved by the distributed scheduler. This reduction in runtime overhead carries over directly to a performance improvement for a large fraction of the benchmarks.
    Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International; 06/2011
  • Conference Proceeding: COMPSs in the VENUS-C Platform: enabling e-Science applications on the Cloud
    4th Iberian Grid Infrastructure Conference; 03/2011
  • Conference Proceeding: Prediction of behavior of MPI applications
    M. Casas, R.M. Badia, J. Labarta
    ABSTRACT: The scalability and performance of applications are very important issues today. As high-performance architectures have become more complex, it has become harder to predict the behavior of a given application running on them. In this paper, we propose a methodology that automatically and quickly predicts, from a very limited number of runs using very few processors, the scalability and performance of a given application on a wide range of supercomputers, taking into account details of the architecture and the network of the machines.
    Cluster Computing, 2008 IEEE International Conference on; 11/2008
  • Conference Proceeding: A dependency-aware task-based programming environment for multi-core architectures
    J.M. Perez, R.M. Badia, J. Labarta
    ABSTRACT: Parallel programming on SMP and multi-core architectures is hard. In this paper we present a programming model for those environments based on automatic function level parallelism that strives to be easy, flexible, portable, and performant. Its main trait is its ability to exploit task level parallelism by analyzing task dependencies at run time. We present the programming environment in the context of algorithms from several domains and pinpoint its benefits compared to other approaches. We discuss its execution model and its scheduler. Finally we analyze its performance and demonstrate that it offers reasonable performance without tuning, and that it can rival highly tuned libraries with minimal tuning effort.
    Cluster Computing, 2008 IEEE International Conference on; 11/2008
  • Conference Proceeding: COMP Superscalar: Bringing GRID Superscalar and GCM Together
    E. Tejedor, R.M. Badia
    ABSTRACT: This paper presents the design, implementation and evaluation of COMP Superscalar, a new and componentised version of the GRID superscalar framework that enables the easy development of Grid-unaware applications. By means of a simple programming model, COMP Superscalar keeps the Grid as transparent as possible to the programmer. Moreover, the performance of the applications is optimized by exploiting their inherent concurrency when executing them on the Grid. The runtime of COMP Superscalar has been designed to follow the Grid Component Model (GCM) and is therefore formed by several components, each one encapsulating a given functionality identified in GRID superscalar.
    Cluster Computing and the Grid, 2008. CCGRID '08. 8th IEEE International Symposium on; 06/2008
  • Conference Proceeding: Including SMP in Grids as Execution Platform and Other Extensions in GRID Superscalar
    J.M. Perez, R.M. Badia, J. Labarta
    ABSTRACT: GRID superscalar provides a very easy to use programming environment for enabling applications on the grid. Although the system already has many features, there are some areas that we wanted to enhance. In this paper we present a new version of GRID superscalar based on code annotations that includes full renaming support for scalar, array and structure parameters. We also present a tracing mechanism that allows fine tuning GRID superscalar applications, and improved support for running on SMP hosts.
    e-Science and Grid Computing, 2006. e-Science '06. Second IEEE International Conference on; 01/2007
  • Article: Performance of computationally intensive parameter sweep applications on Internet‐based Grids of computers: the mapping of molecular potential energy hypersurfaces
    ABSTRACT: This work focuses on the use of computational Grids for processing the large set of jobs arising in parameter sweep applications. In particular, we tackle the mapping of molecular potential energy hypersurfaces. For computationally intensive parameter sweep problems, performance models are developed to compare the parallel computation in a multiprocessor system with the computation on an Internet-based Grid of computers. We find that the relative performance of the Grid approach increases with the number of processors, being independent of the number of jobs. The experimental data, obtained using electronic structure calculations, fit the proposed performance expressions accurately. To automate the mapping of potential energy hypersurfaces, an application based on GRID superscalar is developed. It is tested on the prototypical case of the internal dynamics of acetone. Copyright © 2006 John Wiley & Sons, Ltd.
    Concurrency and Computation Practice and Experience 09/2006; 19(4):463 - 481. · 0.64 Impact Factor
  • Article: System-level power-performance tradeoffs for reconfigurable computing
    J. Noguera, R.M. Badia
    ABSTRACT: In this paper, we propose a configuration-aware data-partitioning approach for reconfigurable computing. We show how the reconfiguration overhead impacts the data-partitioning process. Moreover, we explore the system-level power-performance tradeoffs available when implementing streaming embedded applications on fine-grained reconfigurable architectures. For a certain group of streaming applications, we show that an efficient hardware/software partitioning algorithm is required when targeting low power. However, if the application objective is performance, then we propose the use of dynamically reconfigurable architectures. We propose a design methodology that adapts the architecture and algorithms to the application requirements. The methodology has been proven to work on a real research platform based on Xilinx devices. Finally, we have applied our methodology and algorithms to the case study of image sharpening, which is required nowadays in digital cameras and mobile phones.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 08/2006; · 1.22 Impact Factor
  • Conference Proceeding: Performance and energy analysis of task-level graph transformation techniques for dynamically reconfigurable architectures
    J. Noguera, R.M. Badia
    ABSTRACT: In this paper, we present an analysis of the impact in both performance and energy of several task-level graph transformation techniques to exploit the parallel processing capabilities of run-time partially reconfigurable architectures. The proposed techniques have been applied to an image processing application (i.e., image sharpening), which has been implemented in a real research platform.
    Field Programmable Logic and Applications, 2005. International Conference on; 09/2005
  • Conference Proceeding: Implementing phylogenetic inference with GRID superscalar
    ABSTRACT: The grid has appeared recently as a new computing paradigm. However, to make the grid available to the scientific community, frameworks that make it easy to write applications and to run them efficiently on the grid should be provided. GRID superscalar has been specially designed to satisfy these two requirements. This paper presents an implementation of a biological application, fastDNAml, using GRID superscalar. The objective is to demonstrate not only the performance that can be achieved but also the programmability of the framework. The description contains details of the fastDNAml implementation, new features of GRID superscalar, and a summary of the results obtained.
    Cluster Computing and the Grid, 2005. CCGrid 2005. IEEE International Symposium on; 06/2005
  • Conference Proceeding: Power-performance trade-offs for reconfigurable computing
    J. Noguera, R.M. Badia
    ABSTRACT: We explore the system-level power-performance trade-offs available when implementing streaming embedded applications on fine-grained reconfigurable architectures. We show that an efficient hardware-software partitioning algorithm is required when targeting low-power. However, if the application objective is performance, then we propose the use of dynamically reconfigurable architectures. This work presents a configuration-aware data size partitioning approach. We propose a design methodology that adapts the architecture and used algorithms to the application requirements. The methodology has been proven to work on a real research platform based on Xilinx devices. Finally, we have applied our methodology and algorithms to the case study of image sharpening, which is required nowadays in digital cameras and mobile phones.
    Hardware/Software Codesign and System Synthesis, 2004. CODES + ISSS 2004. International Conference on; 10/2004
  • Article: HW/SW codesign techniques for dynamically reconfigurable architectures
    J. Noguera, R.M. Badia
    ABSTRACT: Hardware/software (HW/SW) codesign and reconfigurable computing are commonly used methodologies for digital-systems design. However, no previous work has been carried out in order to define a HW/SW codesign methodology with dynamic scheduling for run-time reconfigurable architectures. In addition, all previous approaches to reconfigurable computing multicontext scheduling are based on static-scheduling techniques. In this paper, we present three main contributions: 1) a novel HW/SW codesign methodology with dynamic scheduling for discrete event systems using dynamically reconfigurable architectures; 2) a new dynamic approach to reconfigurable computing multicontext scheduling; and 3) a HW/SW partitioning algorithm for dynamically reconfigurable architectures. We have developed a whole codesign framework, where we have applied our methodology and algorithms to the case study of software acceleration. An exhaustive study has been carried out, and the obtained results demonstrate the benefits of our approach.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 09/2002; · 1.22 Impact Factor
  • Conference Proceeding: Dynamic run-time HW/SW scheduling techniques for reconfigurable architectures
    J. Noguera, R.M. Badia
    ABSTRACT: Dynamic run-time scheduling in System-on-Chip platforms has recently become an active area of research because of the performance and power requirements of new applications. Moreover, dynamically reconfigurable logic (DRL) architectures are an exciting alternative for embedded systems design. However, all previous approaches to DRL multi-context scheduling and HW/SW scheduling for DRL architectures are based on static scheduling techniques. In this paper, we address this problem and present: (1) a dynamic scheduler hardware architecture, and (2) four dynamic run-time scheduling algorithms for DRL-based multi-context platforms. The scheduling algorithms have been integrated in our codesign environment, where a large number of experiments have been carried out. Results demonstrate the benefits of our approach.
    Hardware/Software Codesign, 2002. CODES 2002. Proceedings of the Tenth International Symposium on; 02/2002
  • Conference Proceeding: A HW/SW partitioning algorithm for dynamically reconfigurable architectures
    J. Noguera, R.M. Badia
    ABSTRACT: “System-On-Chip” has become a reality, and recently new reconfigurable devices have appeared. However, few efforts have been carried out in order to define HW/SW codesign methodologies and algorithms which address the challenges presented by new reconfigurable devices. In this paper we address this open problem and present a novel HW/SW partitioning algorithm for dynamically reconfigurable architectures. The algorithm is a constructive algorithm, which obtains an initial solution and afterwards tries to optimize it. The HW/SW partitioning is done taking into account the features of the dynamically reconfigurable devices, and its final goal is to minimize the reconfiguration latency. The partitioning algorithm has been implemented and integrated into our developed codesign environment, where several experiments have been carried out. The results obtained demonstrate the benefits of the algorithm.
    Design, Automation and Test in Europe, 2001. Conference and Exhibition 2001. Proceedings; 02/2001
  • Conference Proceeding: Run-time HW/SW codesign for discrete event systems using dynamically reconfigurable architectures
    J. Noguera, R.M. Badia
    ABSTRACT: Hardware/software (HW/SW) codesign and reconfigurable computing (RC) are commonly used methodologies for digital systems design. However, no previous work has been carried out in order to define a run-time HW/SW codesign methodology for dynamically reconfigurable architectures. Besides, all previous approaches to RC context scheduling were based on static scheduling techniques. In this paper, we present a run-time HW/SW codesign methodology for discrete event systems using dynamically reconfigurable architectures and a dynamic approach to RC multi-context scheduling. We have applied our methodology to software acceleration and present the obtained results.
    System Synthesis, 2000. Proceedings. The 13th International Symposium on; 02/2000
  • Conference Proceeding: Reconfigurable computing: an innovative solution for multimedia and telecommunication networks simulation
    ABSTRACT: Sequential network simulation is a highly time-consuming application and, with the emergence of global multihop networks and gigabit-per-second links, is becoming an unaffordable problem for traditional simulations. New techniques for the acceleration of these simulations based on other hardware architectures are required. Previous approaches to simulation acceleration are based on parallel computing and reconfigurable computing. A short review of the most outstanding approaches, showing their benefits and problems, is presented in the paper. A new approach based on mapping network simulations onto reconfigurable hardware is presented. The most important features of this system are the acceleration of the simulation by hardware and the use of a high-level network modeling language, which allows a transparent use of the hardware by telecommunication engineers. The core of the proposed environment is an automatic tool that compiles the high-level network model and maps the simulator behaviour into the hardware.
    EUROMICRO Conference, 1999. Proceedings. 25th; 02/1999
  • Conference Proceeding: High-level synthesis of asynchronous systems: Scheduling and process synchronization
    R.M. Badia, J. Cortadella
    ABSTRACT: Basic concepts for scheduling algorithms and control synthesis in high-level synthesis of asynchronous circuits are defined. Two scheduling strategies are presented and evaluated. Experiments on different benchmarks show that efficient asynchronous schedules can be obtained. Control is modeled in a distributed fashion with local controllers synchronizing between them by means of handshaking protocols.
    Design Automation, 1993, with the European Event in ASIC Design. Proceedings. [4th] European Conference on; 03/1993
  • Conference Proceeding: An asynchronous architecture model for behavioral synthesis
    J. Cortadella, R.M. Badia
    ABSTRACT: An asynchronous architecture model for behavioral synthesis is presented. The basis of the model lies in a distributed control structure consisting of multiple communicating processes. Data processing is performed by self-timed modules. Signal transition graphs (STGs) are used to specify the behavior of the control processes. By using existing synthesis procedures for STGs, circuits based on the presented architecture model are proved to be realizable and hazard-free.
    Design Automation, 1992. Proceedings., [3rd] European Conference on; 04/1992
  • Article: Monitoring and steering Grid applications with GRID superscalar
    ABSTRACT: We present the design and implementation of a general task monitoring and steering system for Grid applications (GSTAT). The system is integrated in the GRID superscalar (GRIDSs) programming framework. Information at the application, Grid node, and individual task levels is supplied upon request. Using the steering capabilities, individual tasks or the whole application can be cancelled. The corresponding jobs can be restarted using fault tolerance and checkpointing capabilities based on GRIDSs. In addition, the computational resources assigned to an application can be modified. GSTAT is tested using high throughput and high performance computing cases on an Internet-based Grid of computers.
    Future Generation Computer Systems.
  • Article: Demostración de uso de GRID superscalar
    R. Sirvent, R.M. Badia, P. Bellens
    ABSTRACT: This presentation, given at the Third Meeting of the Thematic Network for the Coordination of Grid Middleware Activities, showed in use the GRID superscalar programming environment, developed by the Grid computing group of the Barcelona Supercomputing Center-Centro Nacional de Supercomputación (BSC-CNS) and the Universitat Politècnica de Catalunya (UPC). This environment aims to ease the programming of applications on the Grid by providing a very simple sequential programming model. From the algorithm described sequentially in the code, GRID superscalar generates at run time a task for each call to a function listed in the IDL interface file. It also analyzes the data dependencies between the tasks, building a dependency graph, and submits them for execution on the Grid while respecting those dependencies. The runtime implements several techniques to increase the parallelism available in the application and to minimize the total execution time. The environment also provides a tool called Deployment Center, a graphical interface that eases the preparation of the execution environment. Finally, a prototype execution monitor for GRID superscalar was shown, which lets the user interactively visualize the execution of the jobs.
    RedIRIS: boletín de la Red Nacional de I+D RedIRIS, ISSN 1139-207X, Nº. 80, 2007, pags. 41-46.