# Frédéric Magoulès, CentraleSupélec | ECP

Frédéric Magoulès

BSc, MSc, PhD

## About

327 Publications

20,653 Reads

5,073 Citations

## Publications

The Global-Local non-invasive coupling improves on the submodeling technique: it locally enhances structure computations by introducing patches with refined models while taking all the interactions into account. To circumvent its inherently limited computational performance, we propose and implement an asynchronous ve...

In this paper, we address the problem of designing a distributed application meant to run both classical and asynchronous iterations. MPI libraries are very popular and widely used in the scientific community; however, asynchronous iterative methods raise non-negligible difficulties about the efficient management of communication requests and buffer...

In this paper, we address the problem of detecting the moment when an ongoing asynchronous parallel iterative process can be terminated to provide a sufficiently precise solution to a fixed-point problem being solved. Formulating the detection problem as a global solution identification problem, we analyze the snapshot-based approach, which is the...

A general asynchronous alternating iterative model is designed, for which convergence is theoretically ensured both under a classical spectral radius bound and for a classical class of matrix splittings for H-matrices. The computational model can be thought of as a two-stage alternating iterative method, which is well suited to the well-known Herm...

This paper deals with linear algebra operations on Graphics Processing Units (GPUs) using double-precision complex arithmetic. An analysis of their use within iterative Krylov methods is presented to solve acoustic problems. Numerical experiments performed on a set of acoustic matrices arising from the modelling of acoustic phenomena...

In this paper, we aim to introduce a new perspective when comparing highly parallelized algorithms on GPU: the energy consumption of the GPU. We give an analysis of the performance of linear algebra operations, including addition of vectors, element-wise product, dot product and sparse matrix-vector product, in order to validate our experimental pr...

This paper gives an analysis and an evaluation of linear algebra operations on Graphics Processing Units (GPUs) using double-precision complex arithmetic. Knowing the performance of these operations, iterative Krylov methods are considered to solve the acoustic problem efficiently. Numerical experiments carried out on a set of acoustic ma...

Low-order, sequential or non-massively parallel finite elements are generally used for three-dimensional gravity modelling. In this paper, in order to obtain better gravity anomaly solutions in heterogeneous media, we solve the gravimetry problem using massively parallel high-order finite elements on hybrid multi-CPU/GPU clusters. Parallel algorithm...

In this paper, we present, evaluate and analyse the performance of parallel synchronous Jacobi algorithms under different partitioning procedures, including band-row splitting, band-row sparsity pattern splitting and substructuring splitting, when solving large sparse linear systems. Numerical experiments performed on a set of academic 3D Laplace equati...
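
As a rough illustration of the band-row splitting mentioned in this abstract, the following Python sketch runs a synchronous Jacobi iteration in which contiguous bands of rows stand in for the rows owned by each process. It is only a sequential simulation under stated assumptions: the helper names (`band_row_partition`, `jacobi_band_row`), the dense list-of-lists matrix and the small 1D Laplacian test system are hypothetical choices made here, not taken from the paper.

```python
def band_row_partition(n, parts):
    """Split row indices 0..n-1 into `parts` contiguous bands of near-equal size."""
    base, extra = divmod(n, parts)
    bands, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bands.append(range(start, start + size))
        start += size
    return bands


def jacobi_band_row(A, b, parts=2, tol=1e-10, max_iter=10_000):
    """Synchronous Jacobi: every band updates its rows from the previous iterate."""
    n = len(b)
    x = [0.0] * n
    bands = band_row_partition(n, parts)
    for it in range(1, max_iter + 1):
        x_new = [0.0] * n
        # Each band would be owned by one process; here the bands run in turn.
        for band in bands:
            for i in band:
                sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
                x_new[i] = (b[i] - sigma) / A[i][i]
        # Synchronization point: all bands exchange their updated rows.
        diff = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if diff < tol:
            return x, it
    return x, max_iter


# Hypothetical test problem: 1D Laplacian tridiag(-1, 2, -1) with solution (1, ..., 1).
n = 6
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [sum(row) for row in A]
x, iters = jacobi_band_row(A, b, parts=3)
```

Because every band reads only the previous iterate, the result does not depend on the number of bands; the partition only decides which process would own which rows.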

In this paper, the authors propose an analysis of the frequency response function in a car compartment subjected to a fluctuating pressure distribution along the open cavity of the sun-roof at the top of a car. The coupling of a computational fluid dynamics code and a computational acoustics code is considered to simulate the acoustic fluid-structure i...

Asynchronous iterations are increasingly investigated for both scaling and fault-resilience purposes on high performance computing platforms. While they have so far been applied exclusively within space domain decomposition frameworks, this paper advocates a novel application direction targeting time-decomposed time-parallel approaches. Specifical...

A conventional study of fluid simulation involves different stages including conception, simulation, visualization, and analysis tasks. It is, therefore, necessary to switch between different software and interactive contexts which implies costly data manipulation and increases the time needed for decision making. Our interactive simulation approac...

Due to their highly parallel multi-core architecture, GPUs are being increasingly used in a wide range of computationally intensive applications. Compared to CPUs, GPUs can achieve higher performance in accelerating program execution in an energy-efficient way. Therefore, GPGPU computing is useful for high performance computing applications...

The main objective of this work is to analyze the sub-structuring method for the parallel solution of sparse linear systems with matrices arising from the discretization of partial differential equations by methods such as finite element, finite volume and finite difference. With the success encountered by the general-purpose processing on graphics proces...

Until now, almost all investigations of asynchronous iterations within domain decomposition frameworks targeted methods of the parallel Schwarz type. A first, and sole, attempt to deal with a primal substructuring framework resulted in an asynchronous substructuring method where relaxation occurs simultaneously on the subdomains and on the interfac...

We consider shape optimization problems for elasticity systems in architecture. A typical objective in this context is to identify a structure of maximal stability that is close to an initially proposed one. For structures without external forces on varying parts, classical methods make it possible to prove the existence of optimal shapes within well-known cla...

The Laplace transform method has proved to be very efficient and easy to parallelize for the solution of time-dependent problems. However, the synchronization delay among processors implies an upper bound on the achievable acceleration factor, which leads to a lot of wasted time. In this paper, we propose an original asynchronous Laplace transform meth...

The performance of gradient methods has been considerably improved by the introduction of delayed parameters. Recently, the revealing of second-order information has given rise to the Cauchy-based methods with alignment, which are generally considered as the state of the art of gradient methods. This paper investigates the spectral properties of mi...

With the aim of finding the simplest and most efficient shape of a noise absorbing wall to dissipate the acoustical energy of a sound wave, we consider a frequency model described by the Helmholtz equation with damping on the boundary. The well-posedness of the model is shown in a class of domains with $d$-set boundaries ($N-1\le d<N$). We introduce a...

We consider shape optimization problems for elasticity systems in architecture. A typical question in this context is to identify a structure of maximal stability close to an initially proposed one. We show the existence of such an optimally shaped structure within classes of bounded Lipschitz domains and within wider classes of bounded uniform dom...

Compared with arithmetic operations, communication is often the bottleneck on modern computers, and thus deserves increasing attention when choosing algorithms. Lagged gradient methods are known for their error tolerance and fast convergence. However, it appears that their parallel behavior is not well understood. In this paper, we explor...

This article presents enhancement strategies for the Hermitian and skew‐Hermitian splitting method based on gradient iterations. The spectral properties are exploited for the parameter estimation, often resulting in a better convergence. In particular, steepest descent with early stopping can generate a rough estimate of the optimal upper bound. Th...

In this paper, we tackled the convergence detection problem arising from the absence of synchronization during asynchronous iterative computation. We showed that, when one arbitrarily takes the local components of a global solution vector, an upper bound can be established on the difference between a residual error evaluated from this global vector...

Background: Bariatric surgery is an effective therapeutic procedure for morbidly obese patients. The two most common interventions are Sleeve Gastrectomy (SG) and Laparoscopic Roux-en-Y Gastric Bypass (LRYGB).
Objectives: The aim of this study was to compare the long-term microbiome after SG and LRYGB surgery in obese patients.
Setting: Unive...

In this paper we present an effective coarse space correction designed to accelerate the solution of an algebraic linear system. The system arises from the formulation of the problem of interpolating scattered data by means of Radial Basis Functions. Radial Basis Functions are commonly used for interpolating scattered data during the image reconst...

This paper covers the fast solution of large acoustic problems on low-resource parallel platforms. A domain decomposition method is coupled with a dynamic load balancing scheme to efficiently accelerate a geometrical acoustic method. The geometrical method studied implements a beam-tracing method where intersections are handled as in a ray-tracing...

In the context of a virtual reconstitution of the destroyed Royaumont abbey church, this paper investigates computer science issues intrinsic to physically-based image rendering. First, a virtual model was designed from historical sources and archaeological descriptions. Then the physical properties of some materials were measured on remains of the c...

This paper describes work in progress on a software and hardware architecture to steer and control an ongoing fluid simulation in the context of a serious game application. We propose to use the Lattice Boltzmann Method as the simulation approach considering that it can provide fully parallel algorithms to reach interactive time and because it is...

This paper discusses the advantage of using asynchronous simulation in the case of interactive simulation, in which the user can steer and control parameters during a simulation in progress. Asynchronous models allow each iteration to be computed faster, addressing the performance needed in a highly interactive context, and our hypothesis is...

In this paper, we use an original ray-tracing domain decomposition method to address image rendering of naturally lighted scenes. This new method makes it possible to analyze, in particular, rendering problems on parallel architectures in the case of interactions between light rays and glass material. Numerical experiments, for medieval glass rendering within t...

This paper covers the time-consuming issues intrinsic to physically-based image rendering algorithms. First, the optical properties of glass materials were measured on samples of real glasses, and the materials of other objects inside a hotel room were characterized by deducing spectral data from multiple trichromatic images. We then present the rendering model...

Convergence of both synchronous and asynchronous optimized Schwarz algorithms for the shifted Laplacian operator on a bounded rectangular domain, in a one‐way subdivision of the computational domain, with overlap, is shown. Convergence results are obtained under very mild conditions on the size of the subdomains and on the amount of overlap. A coup...

On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications of the existing methods have been proposed in order to keep the communication cost as low as possible. This paper aims at providing a brief overview of recent advances in parallel iterative...

We present some extensions to the limited memory steepest descent method based on spectral properties and cyclic iterations. Our aim is to show that it is possible to combine sweep and delayed strategies for improving the performance of gradient methods. Numerical results are reported which indicate that our new methods are better than the original...

The performance of gradient methods has been considerably improved by the introduction of delayed parameters. After two and a half decades, the revealing of second-order information has recently given rise to the Cauchy-based methods with alignment, which asymptotically reduce the search spaces to smaller and smaller dimensions. They are generally...
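
To make the idea of a delayed step size concrete, here is a minimal Python sketch of the classical Barzilai-Borwein (BB1) gradient method on a strictly convex quadratic. For such a quadratic, the BB1 step equals the exact line search (Cauchy) step of the previous iterate, so it can be read as steepest descent with one step of delay. This is a textbook illustration only, not the Cauchy-based methods with alignment discussed in the abstract; the function names and the 3x3 test matrix are assumptions made here.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bb_gradient(A, b, x0, tol=1e-10, max_iter=1000):
    """Minimize 0.5*x^T A x - b^T x (A symmetric positive definite) with BB1 steps."""
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
    alpha = None
    for k in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            return x, k
        if alpha is None:
            # No history yet: start with the exact line search (Cauchy) step.
            alpha = dot(g, g) / dot(g, matvec(A, g))
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = [ax - bi for ax, bi in zip(matvec(A, x_new), b)]
        s = [a - c for a, c in zip(x_new, x)]  # change in iterate
        y = [a - c for a, c in zip(g_new, g)]  # change in gradient
        # BB1 step for the next iteration, built from the step just taken.
        alpha = dot(s, s) / dot(s, y)
        x, g = x_new, g_new
    return x, max_iter


# Hypothetical small SPD system whose exact solution is (1, 2, 3).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]
x, iterations = bb_gradient(A, b, [0.0, 0.0, 0.0])
```

Unlike steepest descent, the BB iteration is not monotone in the objective, which is one reason its convergence analysis relies on spectral arguments.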

This paper presents enhancement strategies for the Hermitian and skew-Hermitian splitting method based on gradient iterations. The spectral properties are exploited for the parameter estimation, often resulting in a better convergence. In particular, steepest descent with early stopping can generate a rough estimate of the optimal parameter. This i...

Asynchronous iterations have been investigated more and more for both scaling and fault-resilience purposes on high performance computing platforms. While so far, they have been exclusively applied within space domain decomposition frameworks, this paper advocates a novel application direction targeting time-decomposed time-parallel approaches. Spe...

This paper addresses the distributed convergence detection problem in asynchronous iterations. A modified recursive doubling algorithm is investigated in order to adapt to the non-power-of-two case. Some convergence detection algorithms are illustrated based on the reduction operation. Finally, a concluding discussion about the implementation and t...
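
One common way to extend recursive doubling to a number of processes that is not a power of two is to fold the surplus ranks onto the first ranks, run the usual exchange on the remaining power-of-two group, and then send the result back. The Python sketch below simulates this on a plain list, one entry per rank. It illustrates that generic folding variant only, and is not necessarily the modification studied in the paper; the function name and the use of a list in place of real message passing are assumptions.

```python
def allreduce_recursive_doubling(values, op):
    """Simulate an all-reduce over len(values) ranks; returns each rank's result."""
    p = len(values)
    vals = list(values)
    # Largest power of two not exceeding p.
    pof2 = 1
    while pof2 * 2 <= p:
        pof2 *= 2
    rem = p - pof2
    # Fold: each surplus rank r + pof2 sends its value to rank r.
    for r in range(rem):
        vals[r] = op(vals[r], vals[r + pof2])
    # Recursive doubling among ranks 0..pof2-1: partner is r XOR dist.
    dist = 1
    while dist < pof2:
        snapshot = vals[:pof2]  # values before this exchange round
        for r in range(pof2):
            vals[r] = op(snapshot[r], snapshot[r ^ dist])
        dist *= 2
    # Unfold: surplus ranks receive the final result.
    for r in range(rem):
        vals[r + pof2] = vals[r]
    return vals


# Convergence test on hypothetical local residuals held by 5 ranks.
local_residuals = [0.5, 0.1, 0.3, 0.2, 0.4]
global_residuals = allreduce_recursive_doubling(local_residuals, max)
converged = all(r < 1e-6 for r in global_residuals)
```

The power-of-two phase takes log2(pof2) exchange rounds, and the fold and unfold add one extra message each for the surplus ranks.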

This paper proposes a new gradient method to solve large-scale problems. Theoretical analysis shows that the new method has a finite termination property in two dimensions and converges R-linearly in any dimension. Experimental results illustrate first the issue of parallel implementation. Then, the solution of a large-scale problem shows that...

Motivation: Analysis toolkits for shotgun metagenomic data achieve strain-level characterization of complex microbial communities by capturing intra-species gene content variation. Yet, these tools are hampered by the extent of reference genomes that are far from covering all microbial variability, as many species are still not sequenced or have o...

Background: Bariatric surgery is an effective therapeutic procedure for morbidly obese patients as it induces sustained weight loss. The two most common interventions are Laparoscopic Sleeve Gastrectomy (LSG) and Laparoscopic Roux-en-Y Gastric Bypass (LRYGB).
Objective: Characterizing the gut microbiota changes induced by LSG and LRYGB.
Design: 89 a...

Presents corrections to the paper, “K nearest neighbour joins for big data on MapReduce: A theoretical and experimental analysis,” (Song, G., et al.), IEEE Trans. Knowl. Data Eng., vol. 28, no. 9, pp. 2376–2392, Sep. 2016.

Motivation: Analysis toolkits for shotgun metagenomic data achieve strain-level characterization of complex microbial communities by capturing intra-species gene content variation. Yet, these tools are hampered by the extent of reference genomes that are far from covering all microbial variability, as many species are still not sequenced or have onl...

Spatial domain decomposition methods have been extensively investigated in recent decades, while time domain decomposition seems counterintuitive and so is not as popular as the former. However, many attractive methods have been proposed, especially the parareal algorithm, which showed both theoretical and experimental efficiency in the co...

Galerkin/least-squares and Galerkin gradient/least-squares stand out among several approaches designed to improve the numerical solution accuracy and counteract the pollution effect by adding terms to the standard Galerkin formulation. These added terms are multiplied by a ‘stability parameter’ which must be properly defined. In this paper, an orig...

Complex-valued Helmholtz equations arise in various applications, and a lot of research has been devoted to finding efficient preconditioners for the iterative solution of their discretizations. In this paper we consider the Helmholtz equation rewritten in real-valued block form, and use a preconditioner in a special two-by-two block form. We show...

In this paper, we address the design of a communication library which particularly targets distributed iterative computing, including randomly executed asynchronous iterations. The well-known MPI programming framework is considered, upon which unique generic routines are proposed for both blocking and non-blocking communication modes. This allows f...

A convergence proof of Asynchronous Optimized Schwarz Methods applied to a shifted Laplacian problem, with negative shift, in $\mathbb{R}^2$ is presented. Sufficient conditions for convergence involving initial values of the approximation of the solution are discussed.

An analysis of the convergence properties of Optimized Schwarz methods applied as solvers for Poisson’s Equation in a bounded rectangular domain with Dirichlet (physical) boundary conditions and Robin transmission conditions on the artificial boundaries is presented. To our knowledge this is the first time that this is done for multiple subdomains...

https://hal-centralesupelec.archives-ouvertes.fr/hal-02750641

Convergence of classical parallel iterations is detected by performing a reduction operation at each iteration in order to compute a residual error relative to a potential solution vector. To efficiently run asynchronous iterations, blocking communication requests are avoided, which makes it hard to isolate and handle any global vector. While some...

The advent of asynchronous iterative schemes brings high efficiency to numerical computations. However, it is generally difficult to handle the problems of resource management and convergence detection. This paper uses JACK2, an asynchronous communication kernel library for iterative algorithms, to implement both classical and asynchronous parareal a...

Asynchronous iterations arise naturally in parallel computing if one wants to solve large problems while minimizing idle time. This paper presents an original model of asynchronous iterations for a time-domain decomposition method, namely the parareal method. The asynchronous parareal algorithm is here applied to European option pricing,...
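
For readers unfamiliar with parareal, the following Python sketch shows the classical (synchronous) iteration on the scalar test equation y' = λy, using explicit Euler for both the coarse and the fine propagators. It is a baseline illustration only: the asynchronous variant and the option pricing application of the paper are not reproduced, and the function names, the test equation and the choice of propagators are assumptions made here.

```python
def euler(y, lam, dt, steps):
    """Integrate y' = lam * y over an interval of length dt with explicit Euler."""
    h = dt / steps
    for _ in range(steps):
        y = y + h * lam * y
    return y


def parareal(y0, lam, T, windows, fine_steps, iterations):
    """Classical parareal; returns the values at the window boundaries."""
    dt = T / windows

    def coarse(y):  # cheap propagator G: one Euler step per window
        return euler(y, lam, dt, 1)

    def fine(y):  # accurate propagator F: many Euler steps per window
        return euler(y, lam, dt, fine_steps)

    # Initial guess: a serial sweep with the coarse propagator.
    U = [y0]
    for n in range(windows):
        U.append(coarse(U[n]))

    for _ in range(iterations):
        # Independent per window, so this part could run in parallel.
        f_old = [fine(U[n]) for n in range(windows)]
        g_old = [coarse(U[n]) for n in range(windows)]
        # Serial correction sweep: U_{n+1} = G(U_n_new) + F(U_n_old) - G(U_n_old).
        U_new = [y0]
        for n in range(windows):
            U_new.append(coarse(U_new[n]) + f_old[n] - g_old[n])
        U = U_new
    return U


# Hypothetical setup: decay problem on [0, 2] split into 4 time windows.
U = parareal(y0=1.0, lam=-1.0, T=2.0, windows=4, fine_steps=50, iterations=4)
reference = euler(1.0, -1.0, 2.0, 4 * 50)  # serial fine solution at T
```

After as many iterations as there are windows, the parareal values coincide with the serial fine solution up to rounding, so the method only pays off when far fewer iterations are enough.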

In this paper we present an original contactless human machine interface for driving a car. The proposed framework is based on the image sent by a simple camera device, which is then processed by various computer vision algorithms. These algorithms allow the isolation of the user's hand on the camera frame and translate its movements into orders sent...

This paper describes a global framework that enables contactless human machine interaction using computer vision and machine learning techniques. The main originality of our framework is that only a very simple image acquisition device, such as a computer camera, is sufficient to establish a rich human machine interaction as traditional devices such as...