Conference Paper

Effective memory layout and accesses for the SPH method on the GPU


Abstract

The smoothed particle hydrodynamics (SPH) method has been implemented on graphics processing units (GPUs) several times to increase performance, yet the need for ever faster implementations remains. Modern GPUs have a complex memory hierarchy, and using it effectively is paramount for high-performance computing. Use of the GPU’s shared memory has traditionally been seen as important because GPUs have had little or no automatic caching. Newer NVIDIA GPUs, such as those based on the Fermi and Kepler architectures, have a more advanced cache implementation than previous generations, possibly alleviating the shared memory requirement. We present benchmark results for four different memory handling strategies for the SPH algorithm, with all computations performed on the GPU and with kernel support widths of both 2h and 3h. Our results indicate that modern caching to a great extent alleviates the need for explicit, manual use of shared memory, and that the kernel support width has a great influence on the choice of memory strategy.
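
One way to see why the kernel support width could matter for memory strategy is to count how many neighbors each particle must read. The following is a minimal sketch, not taken from the paper: it assumes particles placed on a regular 3D Cartesian lattice with spacing equal to h, and the function name `lattice_neighbors` is hypothetical.

```python
from itertools import product


def lattice_neighbors(support_in_h: float, spacing_in_h: float = 1.0) -> int:
    """Count lattice particles within the kernel support of one particle (3D).

    Assumption (not from the paper): particles sit on a regular Cartesian
    lattice, and both arguments are given in units of the smoothing length h.
    The particle itself is excluded; particles exactly on the support
    boundary are included, since a neighbor search would still visit them.
    """
    radius = support_in_h / spacing_in_h      # support radius in lattice units
    reach = int(radius)                       # farthest lattice offset to test
    radius_sq = radius * radius
    count = 0
    for i, j, k in product(range(-reach, reach + 1), repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue                          # skip the particle itself
        if i * i + j * j + k * k <= radius_sq:
            count += 1
    return count


if __name__ == "__main__":
    for support in (2.0, 3.0):
        print(f"support {support:.0f}h: {lattice_neighbors(support)} neighbors")
```

Under these assumptions, the 3h support reads several times as many neighbor records per particle as the 2h support, which is consistent with the volume of the support sphere growing as the cube of its radius. Actual neighbor counts depend on the particle distribution and on the ratio of h to particle spacing used in a given simulation.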
