Bruno Raffin

  • Research Director at National Institute for Research in Computer Science and Control

About

147 Publications · 22,819 Reads
2,039 Citations
Introduction
Bruno Raffin is currently the leader of the DataMove team at the National Institute for Research in Computer Science and Control (Inria). His current focus includes in situ and stream processing, sensitivity analysis and data assimilation, dynamic parallel data structures, and task programming. He cannot answer all article requests: all his publications are freely available on HAL or on his personal web page; please go there to get the full papers.
Publications (147)
Preprint
Artificial intelligence is transforming scientific computing with deep neural network surrogates that approximate solutions to partial differential equations (PDEs). Traditional off-line training methods face issues with storage and I/O efficiency, as the training dataset has to be computed with numerical solvers up-front. Our previous work, the Me...
Preprint
The spatiotemporal resolution of Partial Differential Equations (PDEs) plays important roles in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically by the use of computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtai...
Preprint
Full-text available
Particle filters are a group of algorithms to solve inverse problems through statistical Bayesian methods when the model does not comply with the linear and Gaussian hypothesis. Particle filters are used in domains like data assimilation, probabilistic programming, neural network optimization, localization and navigation. Particle filters estimate t...
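The entry above concerns particle filters; a minimal 1D bootstrap particle filter may make the idea concrete. This is an illustrative sketch only: the random-walk state model, the direct observation model y = x + noise, and all parameter names are assumptions, not the setup of the paper.

```python
import math
import random

def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Minimal 1D bootstrap particle filter (illustrative sketch only).

    Assumed model (not the paper's): x_t = x_{t-1} + noise, y_t = x_t + noise.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate each particle through the (assumed) transition model.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight particles by the likelihood of the observation.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted posterior-mean estimate of the state.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Resampling after every step is the simplest choice; practical filters resample only when the effective sample size drops, and the resampling step is precisely what makes large-scale parallelization of particle filters non-trivial.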
Conference Paper
Full-text available
Numerical simulations are ubiquitous in science and engineering. Machine learning for science investigates how artificial neural architectures can learn from these simulations to speed up scientific discovery and engineering processes. Most of these architectures are trained in a supervised manner. They require tremendous amounts of data from simul...
Preprint
Full-text available
Numerical simulations are ubiquitous in science and engineering. Machine learning for science investigates how artificial neural architectures can learn from these simulations to speed up scientific discovery and engineering processes. Most of these architectures are trained in a supervised manner. They require tremendous amounts of data from simul...
Article
Prediction of chaotic systems relies on a floating fusion of sensor data (observations) with a numerical model to decide on a good system trajectory and to compensate non-linear feedback effects. Ensemble-based data assimilation (DA) is a major method for this concern depending on propagating an ensemble of perturbed model realizations. In this pap...
Chapter
Multi-run numerical simulations using supercomputers are increasingly used by physicists and engineers for dealing with input data and model uncertainties. Most of the time, the input parameters of a simulation are modeled as random variables, then simulations are run a (possibly large) number of times with input parameters varied according to a sp...
Article
Full-text available
A widening gap is separating CPU performance and I/O bandwidth on large scale systems. In some fields such as weather forecast and nuclear fusion, numerical models generate such amounts of data that classical post hoc processing is not feasible anymore due to the limits in both storage capacity and I/O performance. In situ approaches are...
Preprint
Full-text available
The ubiquity of fluids in the physical world explains the need to accurately simulate their dynamics for many scientific and engineering applications. Traditionally, well-established but resource-intensive CFD solvers provide such simulations. The recent years have seen a surge of deep learning surrogate models substituting these solvers to allevia...
Poster
Full-text available
The ubiquity of fluids in the physical world explains the need to accurately simulate their dynamics for many scientific and engineering applications. Traditionally, well-established but resource-intensive CFD solvers provide such simulations. Recent years have seen a surge of deep learning surrogate models substituting these solvers to alleviate t...
Preprint
Prediction of chaotic systems relies on a floating fusion of sensor data (observations) with a numerical model to decide on a good system trajectory and to compensate nonlinear feedback effects. Ensemble-based data assimilation (DA) is a major method for this concern depending on propagating an ensemble of perturbed model realizations. In this paper...
Conference Paper
Full-text available
In situ analysis and visualization have mainly been applied to the output of a single large-scale simulation. However, topics involving the execution of multiple simulations in supercomputers have only received minimal attention so far. Some important examples are uncertainty quantification, data assimilation, and complex optimization. In this posi...
Article
Full-text available
Regardless of its origin, in the near future the challenge will not be how to generate data, but rather how to manage big and highly distributed data to make it more easily handled and more accessible by users on their personal devices. VELaSSCo (Visualization for Extremely Large-Scale Scientific Computing) is a platform developed to provide new vi...
Preprint
Full-text available
The classical approach for quantiles computation requires availability of the full sample before ranking it. In uncertainty quantification of numerical simulation models, this approach is not suitable at exascale as large ensembles of simulation runs would need to gather a prohibitively large amount of data. This problem is solved thanks to an on-t...
Article
With the goal of performing exascale computing, the importance of input/output (I/O) management becomes more and more critical to maintain system performance. While the computing capacities of machines are getting higher, the I/O capabilities of systems do not increase as fast. We are able to generate more data but unable to manage them efficiently...
Article
Full-text available
Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it presents some performance issues that hinder its utilization in many practical use cases. Although existing alternatives like Spark or Hama can outperform Hadoop, they require rewriting the source code of the applications due to API inc...
Conference Paper
Full-text available
In this paper, an on-line parallel analytics framework is proposed to process and store in transit all the data being generated by a Molecular Dynamics (MD) simulation run using staging nodes in the same cluster executing the simulation. The implementation and deployment of such a parallel workflow with standard HPC tools, managing problems such as...
Conference Paper
Full-text available
Quantiles are important order statistics for analysis tasks such as outlier detection or computation of non-parametric confidence intervals. Quantiles being order statistics, the classical approach for their computation requires availability of the full sample before ranking it. This approach is not suitable at exascale. Large ensembles would need...
Research Proposal
CALL FOR PAPERS High Performance Machine Learning Workshop - HPML 2018 https://hpml2018.github.io/ To be held in conjunction with the 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2018) September 24th, 2018 - Lyon, France This workshop is intended to bring together the Machine Learning (ML), Artif...
Chapter
Full-text available
The in situ paradigm proposes to co-locate simulation and analytics on the same compute node to analyze data while still resident in the compute node memory, hence reducing the need for post-processing methods. A standard approach that proved efficient for sharing resources on each node consists in running the analytics processes on a set of dedica...
Conference Paper
Full-text available
Global sensitivity analysis is an important step for analyzing and validating numerical simulations. One classical approach consists in computing statistics on the outputs from well-chosen multiple simulation runs. Simulation results are stored to disk and statistics are computed postmortem. Even if supercomputers enable running large studies, scien...
Conference Paper
Full-text available
In situ workflows contain tasks that exchange messages composed of several data fields. However, a consumer task may not necessarily need all the data fields from its producer. For example, a molecular dynamics simulation can produce atom positions, velocities, and forces; but some analyses require only atom positions. The user should decide whethe...
Article
Parallelizing industrial simulation codes, like the EUROPLEXUS software dedicated to the analysis of fast transient phenomena, is challenging. In this paper we focus on the efficient parallelization on shared memory node coupling. We propose to have each thread gather the data it needs for processing a given iteration range, before actually advan...
Conference Paper
Full-text available
In situ processing proposes to reduce storage needs and I/O traffic by processing results of parallel simulations as soon as they are available in the memory of the compute processes. We focus here on computing in situ statistics on the results of N simulations from a parametric study. The classical approach consists in running various instances of...
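One-pass (iterative) update formulas are what make such in situ statistics possible: each new simulation result updates the running statistics and can then be discarded. Welford's classic update for mean and variance is the canonical building block; this is a generic sketch, not the paper's code.

```python
class OnlineMoments:
    """One-pass (Welford) mean and variance, updated as results arrive."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new sample into the statistics; nothing is stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Unbiased sample variance of everything seen so far."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```

Because the update is associative up to a cheap merge step, partial moments computed on different compute nodes can be combined, which suits the parametric-study setting described above.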
Article
Voronoi diagrams are fundamental data structures in computational geometry, with applications in such areas as physics-based simulations. For non-Euclidean distances, the Voronoi diagram must be performed over a grid-graph, where the edges encode the required distance information. The major bottleneck in this case is a shortest path algorithm tha...
Poster
Full-text available
VELaSSCo (Visual Analysis for Extremely Large-Scale Scientific Computing) is an EC FP7 project involving a consortium of seven European partners (Fig. 1). VELaSSCo aims to provide new visual analysis methods for large-scale simulations serving the petabyte era. The main output of the project is the VELaSSCo platform which has been designed a...
Conference Paper
Full-text available
Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and visualization. In situ processing began as the coupling of a parallel simulation with an analysis or visualization library, motivated primarily by avoiding the high cost of...
Conference Paper
Full-text available
Numerical simulations using supercomputers are producing an ever growing amount of data. Efficient production and analysis of these data are the key to future discoveries. The in situ paradigm is emerging as a promising solution to avoid the I/O bottleneck encountered on the file system for both the simulation and the analytics by treating the data a...
Article
The PetaFlow application aims to contribute to the use of high performance computational resources for the benefit of society. To this goal the emergence of adequate information and communication technologies with respect to high performance computing-networking-visualisation and their mutual awareness is required. The developed technology and algorit...
Article
In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture...
Article
Full-text available
While studied over several decades, the computation of boolean operations on polyhedra is almost always addressed by focusing on the case of two polyhedra. For multiple input polyhedra and an arbitrary boolean operation to be applied, the operation is decomposed over a binary CSG tree, each node being processed separately in quasilinear time. For l...
Article
Full-text available
Voronoi diagrams are fundamental data structures in computational geometry with applications on different areas. Recent soft object simulation algorithms for real time physics engines require the computation of Voronoi diagrams over 3D images with non-Euclidean distances. In this case, the computation must be performed over a graph, where the edges...
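The graph-based Voronoi computation described above amounts to a multi-source shortest-path problem: every node is assigned to the seed that reaches it first. A minimal sequential Dijkstra-based sketch follows; the function name and adjacency encoding are illustrative, and the papers target parallel GPU formulations rather than this sequential form.

```python
import heapq

def graph_voronoi(adjacency, sites):
    """Discrete Voronoi diagram on a weighted graph via multi-source Dijkstra.

    adjacency: {node: [(neighbor, edge_weight), ...]}
    sites:     list of seed nodes.
    Returns {node: site}, assigning each node to its nearest seed; this is
    the graph analogue of a Voronoi cell for non-Euclidean distances.
    """
    dist = {}
    owner = {}
    # Seed the frontier with every site at distance zero.
    heap = [(0.0, site, site) for site in sites]
    heapq.heapify(heap)
    while heap:
        d, node, site = heapq.heappop(heap)
        if node in dist:               # already settled by a closer site
            continue
        dist[node] = d
        owner[node] = site
        for nbr, w in adjacency.get(node, ()):
            if nbr not in dist:
                heapq.heappush(heap, (d + w, nbr, site))
    return owner
```

Running all sites through one shared priority queue is what turns N separate shortest-path computations into a single sweep over the grid-graph.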
Conference Paper
Full-text available
High performance computing systems are today composed of tens of thousands of processors and deep memory hierarchies. The next generation of machines will further increase the unbalance between I/O capabilities and processing power. To reduce the pressure on I/Os, the in situ analytics paradigm proposes to process the data as closely as possible to...
Article
Full-text available
The amount of data generated by molecular dynamics simulations of large molecular assemblies and the sheer size and complexity of the systems studied call for new ways to analyse, steer and interact with such calculations. Traditionally, the analysis is performed off-line once the huge amount of simulation results have been saved to disks, thereby...
Article
Full-text available
Combining molecular dynamics simulations with user interaction would have various applications in both education and research. By enabling interactivity the scientist will be able to visualize the experiment in real time and drive the simulation to a desired state more easily. However, interacting with systems of interesting size requires signifi...
Conference Paper
Full-text available
This paper presents preliminary performance comparisons of parallel applications developed natively for the Intel Xeon Phi accelerator using three different parallel programming environments and their associated runtime systems. We compare Intel OpenMP, Intel CilkPlus and XKaapi together on the same benchmark suite and we provide comparisons betwee...
Conference Paper
Full-text available
The paper presents X-KAAPI, a compact runtime for multicore architectures that brings multiple parallel paradigms (parallel independent loops, fork-join tasks and dataflow tasks) in a unified framework without performance penalty. Comparisons on independent loops with OpenMP and on dense linear algebra with QUARK/PLASMA confirm our design decisions. A...
Conference Paper
Full-text available
Nowadays shared memory HPC platforms expose a large number of cores organized in a hierarchical way. Parallel application programmers struggle to express more and more fine-grain parallelism and to ensure locality on such NUMA platforms. Independent loops stand as a natural source of parallelism. Parallel environments like OpenMP provide ways of pa...
Conference Paper
The Petaflow project aims to contribute to the use of high performance computational resources to the benefit of society. To this goal the emergence of adequate information and communication technologies with respect to high performance computing-networking-visualisation and their mutual awareness is required. The developed technology and algorithms...
Conference Paper
Moving the simulation results produced by thousands of computing cores to the scientist's office is no longer an option. Remote visualization proposes to perform heavy-duty postprocessing tasks at the computing center while transferring images to the scientist. To be effective, such an environment needs to be flexible and interactive. Data loading fr...
Conference Paper
In this paper, we present a specification of a RESTful based networking interface for the efficient exchange and manipulation of visual computing resources. It is designed to include web applications by using modern web-technology such as Typed Arrays and WebSockets. The specification maps internal structures and data containers to two types, Eleme...
Conference Paper
Full-text available
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs and accelerators, like GPUs. Programming such nodes is typically based on a combination of OpenMP and CUDA/OpenCL codes; scheduling relies on a static partitioning and cost model. We present the XKaapi runtime system for data-flow task programming on multi-CPU and multi-...
Article
Full-text available
Scientific simulations produce more and more memory consuming datasets. The required processing resources need to keep pace with this increase. Though parallel visualization algorithms with strong performance gains have been developed, there is a need for a parallel programming environment tailored for scientific visualization algorithms that would...
Article
Full-text available
Neighbor identification is the most computationally intensive step in particle based simulations. To contain its cost, a common approach consists in using a regular grid to sort particles according to the cell they belong to. Then, neighbor search only needs to test the particles contained in a constant number of cells. During the simulation, a usu...
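The grid-based sorting described above can be sketched in 2D. This is a deliberately simplified illustration: production MD codes work in 3D, reuse the sorted grid across time steps, and avoid the duplicate cell-pair visits this naive version tolerates.

```python
from collections import defaultdict

def neighbor_pairs(points, cutoff):
    """All pairs of 2D points within `cutoff`, via a regular grid (cell list).

    Points are binned into square cells of side `cutoff`, so the candidate
    neighbors of a point lie only in its own cell and the 8 adjacent ones,
    instead of requiring an all-pairs distance test.
    """
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(int(x // cutoff), int(y // cutoff))].append(i)

    pairs = set()
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:
                            xi, yi = points[i]
                            xj, yj = points[j]
                            # Exact distance test on the few candidates left.
                            if (xi - xj) ** 2 + (yi - yj) ** 2 <= cutoff ** 2:
                                pairs.add((i, j))
    return pairs
```

For roughly uniform densities this reduces the neighbor search from O(n²) distance tests to O(n) expected work, which is why the abstract calls cell sorting the common approach to contain the cost.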
Chapter
In this paper, we present two different FlowVR systems aiming to render remote data using a Particle-based Volume Renderer (PBVR). The huge size of irregular volume datasets has always been one of the major obstacles in the field of scientific visualization. We developed an application with the software FlowVR, using its functionalities of "modules...
Article
Full-text available
The benefit of using the discrete element method (DEM) for simulations of fracture in heterogeneous media has been widely highlighted. However, modelling large structures leads to prohibitive computation times. We propose to use graphics processing units (GPUs) to reduce the computation time, taking advantage of the highly data parallel nature of D...
Article
Ray casting on graphics processing units (GPUs) opens new possibilities for molecular visualization. We describe the implementation and calculation of diverse molecular representations such as licorice, ball-and-stick, space-filling van der Waals spheres, and approximated solvent-accessible surfaces using GPUs. We introduce HyperBalls, an improved...
Article
Full-text available
Networked virtual environments like Second Life enable distant people to meet for leisure as well as work. But users are represented through avatars controlled by keyboards and mice, leading to a low sense of presence, especially regarding body language. Multi-camera real-time 3D modeling offers a way to ensure a significantly higher sense of pres...
Conference Paper
Full-text available
Reordering instructions and data layout can bring significant performance improvement for memory bounded applications. Parallelizing such applications requires a careful design of the algorithm in order to keep the locality of the sequential execution. In this paper, we aim at finding a good parallelization of memory bounded applications on multico...
Conference Paper
Full-text available
Today, it is possible to associate multiple CPUs and multiple GPUs in a single shared memory architecture. Using these resources efficiently in a seamless way is a challenging issue. In this paper, we propose a parallelization scheme for dynamically balancing work load between multiple CPUs and GPUs. Most tasks have a CPU and GPU implementation, so...
Conference Paper
The Petaflow project aims to contribute to the use of high performance computational resources to the benefit of society. To this goal the emergence of adequate information and communication technologies with respect to high performance computing-networking-visualisation and their mutual awareness is required. The developed technology and algorithm...
Article
The applications of virtual and augmented reality require high-performance equipment normally associated with high costs, inaccessible to small organizations and educational institutions. By combining the computing power and storage capacity of several PCs interconnected by a network, we obtain a high-performance, low-cost alternative. The objective...
Article
Full-text available
Reordering instructions and data layout can bring significant performance improvement for memory bounded applications. Parallelizing such applications requires a careful design of the algorithm in order to keep the locality of the sequential execution. On one hand, parallel computation tends to create concurrent tasks that work on independent data...
Article
Full-text available
This paper focuses on the design of high performance VR applications. These applications usually involve various I/O devices and complex simulations. A parallel architecture or grid infrastructure is required to provide the necessary I/O and processing capabilities. Developing such applications faces several difficulties, two important ones being s...
Conference Paper
Full-text available
This paper proposes to revisit isosurface extraction algorithms taking into consideration two specific aspects of recent multicore architectures: their intrinsic parallelism associated with the presence of multiple computing cores and their cache hierarchy that often includes private caches as well as caches shared between all cores. Taking advanta...
Article
Full-text available
We present a multicamera real-time 3D modeling system that aims at enabling new immersive and interactive environments. This system, called Grimage, allows retrieving, in real time, a 3D mesh of the observed scene as well as the associated textures. This information enables a strong visual presence of the user in virtual worlds. The 3D shape infor...
Article
Full-text available
The Vgate project introduces a new type of immersive environment that allows full-body immersion and interaction with virtual worlds. The project is a joint initiative between computer scientists from research teams in computer vision, parallel computing and computer graphics at the INRIA Grenoble Rhone-Alpes, and the 4D View Solutions company.
Article
Full-text available
This project associates multi-camera 3D modeling, physical simulation, and tracked head-mounted displays for a strong full-body immersion and presence in virtual worlds. Three-dimensional modeling is based on the EPHV algorithm, which provides an exact geometrical surface according to input data. The geometry enables computation of full-body collis...
Conference Paper
Full-text available
We present a framework for new 3D tele-immersion applications that allows collaborative and remote 3D interactions. This framework is based on a multiple-camera platform that builds, in real-time, 3D models of users. Such models are embedded into a shared virtual environment where they can interact with other users or purely virtual objects. 3D mod...
Article
Full-text available
Real-time multi-camera 3D modeling provides full-body geometric and photometric data on the objects present in the acquisition space. It can be used as an input device for rendering textured 3D models, and for computing interactions with virtual objects through a physical simulation engine. In this paper we present a work in progress to build a col...
Article
Full-text available
In this paper we propose a parallelization of interactive physical simulations. Our approach relies on task parallelism where the code is instrumented to mark tasks and shared data between tasks, as well as parallel loops even if they have dynamic conditions. Prior to running a simulation step, we extract a task dependency graph that is partitio...
Conference Paper
Interactions are a key part of Virtual Reality systems and can lead to complex software assembly for multi-modal and multi-site collaborative environments. This is even harder, when each participant is interacting in the same virtual world by very different hardware and software capabilities. This paper outlines a software architecture and interact...
Article
Full-text available
Available computing power keeps growing exponentially, but by offering ever more parallelism. This growth in available power can be harnessed to make certain computations interactive. The structure and objectives of such an application then differ significantly from those of traditional high-performance computing. The role of the...
Article
Full-text available
Real-time multi-camera 3D modeling provides full-body geometric and photometric data on the objects present in the acquisition space. It can be used as an input device for rendering textured 3D models, and for computing interactions with virtual objects through a physical simulation engine. In this paper we present a work in progress to build a col...
Article
Full-text available
This paper focuses on parallel interactive applications ranging from scientific visualization, to virtual reality or computational steering. Interactivity makes them particular on three main aspects: they are endlessly iterative, use advanced I/O devices, and must perform under strong performance constraints (latency, refresh rate). A data flow gra...
Article
Full-text available
In the late 90’s, the emergence of high-performance 3D commodity graphics cards paved the way to the use of PC clusters for high-performance Virtual Reality (VR) applications. Today PC clusters are broadly used to drive multi-projector immersive environments, among other high-performance VR tasks such as tracking and sound synthesis. This survey pr...
Article
Full-text available
One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache efficient algorithms. CO approaches have the advantage to adapt to unknown and varying memory hierarchies. Recent CA and CO a...
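The cache-oblivious principle mentioned above, recursing until subproblems fit in whatever cache exists without ever knowing its size, is often illustrated with a recursive matrix transpose. This is a generic textbook sketch, not an algorithm from the paper, and the `threshold` base case is an illustrative tuning knob.

```python
def co_transpose(a, b, r0, r1, c0, c1, threshold=16):
    """Cache-oblivious transpose: writes b = a^T on the block [r0:r1) x [c0:c1).

    The larger dimension is halved recursively; at some recursion depth the
    blocks fit in each level of the (unknown) cache hierarchy, giving good
    locality without any machine-specific parameters.
    """
    if (r1 - r0) * (c1 - c0) <= threshold:
        # Base case: block is small enough, transpose it directly.
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif r1 - r0 >= c1 - c0:
        mid = (r0 + r1) // 2           # split the row range
        co_transpose(a, b, r0, mid, c0, c1, threshold)
        co_transpose(a, b, mid, r1, c0, c1, threshold)
    else:
        mid = (c0 + c1) // 2           # split the column range
        co_transpose(a, b, r0, r1, c0, mid, threshold)
        co_transpose(a, b, r0, r1, mid, c1, threshold)
```

The contrast with a cache-aware version is that the latter would pick an explicit block size tuned to a known cache, whereas this recursion adapts to unknown and varying hierarchies, the property the abstract highlights.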
Article
Full-text available
This paper presents a new approach to collision detection and modeling between deformable volumetric bodies. It allows deep intersections while alleviating the difficulties of distance field update. A ray is shot from each surface vertex in the direction of the inward normal. A collision is detected when the first intersection belongs to an inward...
Article
Full-text available
Available computing power keeps growing exponentially, but by offering ever more parallelism. This growth in available power can be harnessed to make certain computations interactive. The structure and objectives of such an application then differ significantly from those of traditional high-performance computing. The role of the...
Conference Paper
Full-text available
This paper focuses on parallel interactive applications ranging from scientific visualization, to virtual reality or computational steering. Interactivity makes them particular on three main aspects: they are endlessly iterative, use advanced I/O devices, and must perform under strong performance constraints (latency, refresh rate). In this paper,...
Article
Full-text available
Grimage glues multi-camera 3D modeling, physical simulation and parallel execution for a new immersive experience. Put your hands or any object into the interaction space. It is instantaneously modeled in 3D and injected into a virtual world populated with solid and soft objects. Push them, catch them and squeeze them.
Article
Physically-based computer graphics offers the potential of achieving high-fidelity virtual environments in which the propagation of light in real environments is accurately simulated. However, such global illumination computations for even simple scenes ...
Conference Paper
Full-text available
We present a parallel octree carving algorithm applied to real time 3D modeling from multiple video streams. Our contribution is to propose a parallel adaptive algorithm for high performance width-first octree computation. It enables stopping the algorithm at any time while ensuring a balanced octree exploration.
Conference Paper
Full-text available
This paper introduces a dynamic work balancing algorithm, based on work stealing, for time-constrained parallel octree carving. The performance of the algorithm is proved and confirmed by experimental results where the algorithm is applied to a real-time 3D modeling from multiple video streams. Compared to classical work stealing, the proposed algo...
Conference Paper
Full-text available
This paper presents an approach to recover body motions from multiple views using a 3D skeletal model. It takes, as input, foreground silhouette sequences from multiple viewpoints, and computes, for each frame, the skeleton pose which best fits the body pose. Skeletal models mostly encode motion information and therefore allow separating motio...
Conference Paper
Full-text available
In the late 90’s, the emergence of high-performance 3D commodity graphics cards opened the way to using PC clusters for high-performance Virtual Reality (VR) applications. Today PC clusters are broadly used to drive multi-projector immersive environments. In this paper, we survey the different approaches that have been developed to use PC clusters for...
Article
Full-text available
We propose in this article a classification of the different notions of hybridization and a generic framework for the automatic hybridization of algorithms. Then, we detail the results of this generic framework on the example of the parallel solution of multiple linear systems.
Conference Paper
Full-text available
In this paper, we present a scalable architecture to compute, visualize and interact with 3D dynamic models of real scenes. This architecture is designed for mixed reality applications requiring such dynamic models, tele-immersion for instance. Our system consists of three main parts: the acquisition, based on standard FireWire cameras; the computation...
Article
Full-text available
Figure 1 (caption): coupling multiple codes such as a rigid body simulation (a) and a fluid solver (b) enables building complex worlds (c)-(d); different distribution and parallelization approaches can next be applied to achieve real-time user interactions (e)-(f). We present a novel software framework for developing highly an...
Conference Paper
Full-text available
Existing parallel or remote rendering solutions rely on communicating pixels, OpenGL commands, scene-graph changes or application-specific data. We propose an intermediate solution based on a set of independent graphics primitives that use hardware shaders to specify their visual appearance. Compared to an OpenGL based approach, it reduces the comp...
