Niclas Jansson

  • PhD Numerical Analysis
  • Researcher at KTH Royal Institute of Technology

About

63
Publications
10,480
Reads
672
Citations
Current institution
KTH Royal Institute of Technology
Current position
  • Researcher
Additional affiliations
April 2018 - April 2021
RIKEN
Position
  • Researcher
April 2016 - April 2018
RIKEN
Position
  • Researcher
April 2018 - present
KTH Royal Institute of Technology
Position
  • Researcher
Education
August 2008 - October 2011
KTH Royal Institute of Technology
Field of study
  • Numerical Analysis
August 2008 - October 2013
KTH Royal Institute of Technology
Field of study
  • Numerical Analysis
August 2003 - August 2008
KTH Royal Institute of Technology
Field of study
  • Computer Science and Engineering

Publications

Publications (63)
Article
Recent trends and advancements in including more diverse and heterogeneous hardware in High‐Performance Computing (HPC) are challenging scientific software developers in their pursuit of efficient numerical methods with sustained performance across a diverse set of platforms. As a result, researchers are today forced to re‐factor their codes to lev...
Article
The never-ending computational demand from simulations of turbulence makes computational fluid dynamics (CFD) a prime application use case for current and future exascale systems. High-order finite element methods, such as the spectral element method, have been gaining traction as they offer high performance on both multicore CPUs and modern GPU-ba...
Preprint
Full-text available
The computational power of High-Performance Computing (HPC) systems is constantly increasing, however, their input/output (IO) performance grows relatively slowly, and their storage capacity is also limited. This unbalance presents significant challenges for applications such as Molecular Dynamics (MD) and Computational Fluid Dynamics (CFD), which...
Article
Full-text available
The three-dimensional turbulent flow around a Flettner rotor, i.e. an engine-driven rotating cylinder in an atmospheric boundary layer, is studied via direct numerical simulations (DNS) for three different rotation speeds ($\alpha$). This technology offers a sustainable alternative mainly for marine propulsion, underscoring the critical impor...
Article
We present our approach to making direct numerical simulations of turbulence with applications in sustainable shipping. We use modern Fortran and the spectral element method to leverage and scale on supercomputers powered by the Nvidia A100 and the recent AMD Instinct MI250X GPUs, while still providing support for user software developed in Fortran...
Preprint
We present our approach to making direct numerical simulations of turbulence with applications in sustainable shipping. We use modern Fortran and the spectral element method to leverage and scale on supercomputers powered by the Nvidia A100 and the recent AMD Instinct MI250X GPUs, while still providing support for user software developed in Fortran...
Article
Full-text available
In-situ visualization on high-performance computing (HPC) systems allows us to analyze simulation results that would otherwise be impossible, given the size of the simulation data sets and offline post-processing execution time. We develop an in-situ adaptor for Paraview Catalyst and Nek5000, a massively parallel Fortran and C code for computation...
Preprint
Full-text available
We present new results on the strong parallel scaling for the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case considered consists of a direct numerical simulation of fully-developed turbulent flow in a straight pipe, at two different Reynolds numbers $Re_\tau=360$ and $Re_\tau=550$,...
Preprint
Full-text available
The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the...
Preprint
Recent trends and advancements in including more diverse and heterogeneous hardware in High-Performance Computing are challenging software developers in their pursuit of good performance and numerical stability. The well-known maxim "software outlives hardware" may no longer necessarily hold true, and developers are today forced to re-factor their c...
Preprint
For many, Graphics Processing Units (GPUs) provide a source of reliable computing power. Recently, Nvidia introduced its 9th generation HPC-grade GPUs, the Ampere 100, claiming significant performance improvements over previous generations, particularly for AI-workloads, as well as introducing new architectural features such as asynchronous data m...
Preprint
Radiation Treatment Planning (RTP) is the process of planning the appropriate external beam radiotherapy to combat cancer in human patients. RTP is a complex and compute-intensive task, which often takes a long time (several hours) to compute. Reducing this time allows for higher productivity at clinics and more sophisticated treatment planning, wh...
Preprint
Full-text available
Improvements in computer systems have historically relied on two well-known observations: Moore's law and Dennard's scaling. Today, both these observations are ending, forcing computer users, researchers, and practitioners to abandon the general-purpose architectures' comforts in favor of emerging post-Moore systems. Among the most salient of these...
Preprint
In the CFD solver Nek5000, the computation is dominated by the evaluation of small tensor operations. Nekbone is a proxy app for Nek5000 and has previously been ported to GPUs with a mixed OpenACC and CUDA approach. In this work, we continue this effort and optimize the main tensor-product operation in Nekbone further. Our optimization is done in C...
Article
The constraint-based immersed boundary (cIB) method has been shown to be accurate between low and moderate Reynolds number (Re) flows when the immersed body constraint is imposed as a volumetric constraint force. When the IB is modelled as a zero-thickness interface, where it is no longer possible to model a volumetric constraint force, we found th...
Chapter
We present a high performance computing framework for finite element simulation of blood flow in the left ventricle of the human heart. The mathematical model is described together with the discretization method and the parallel implementation in Unicorn which is part of the open source software framework FEniCS-HPC. We show results based on patien...
Article
Full-text available
Writing high-performance solvers for engineering applications is a delicate task. These codes are often developed on an application to application basis, highly optimized to solve a certain problem. Here, we present our work on developing a general simulation framework for efficient computation of time-resolved approximations of complex industrial...
Preprint
Full-text available
Writing high performance solvers for engineering applications is a delicate task. These codes are often developed on an application to application basis, highly optimized to solve a certain problem. Here, we present our work on developing a general simulation framework for efficient computation of time resolved approximations of complex industrial...
Article
Full-text available
Due to advances in medical imaging, computational fluid dynamics algorithms and high performance computing, computer simulation is developing into an important tool for understanding the relationship between cardiovascular diseases and intraventricular blood flow. The field of cardiac flow simulation is challenging and highly interdisciplinary. We...
Chapter
We present an adaptive finite element method for time-resolved simulation of aerodynamics without any turbulence-model parameters, which is applied to a benchmark problem from the HiLiftPW-3 workshop to compute the flow past a JAXA Standard Model (JSM) aircraft model at realistic Reynolds numbers. The mesh is automatically constructed by the method...
Conference Paper
The Algebraic Multigrid (AMG) method has over the years developed into an efficient tool for solving unstructured linear systems. The need to solve large industrial problems discretized on unstructured meshes, has been a key motivation for devising a parallel AMG method. Despite some success, the key part of the AMG algorithm; the coarsening step,...
Chapter
We give a brief introduction to research on adaptive computational methods for laminar compressible and incompressible flows and then focus on computability and adaptivity for turbulent incompressible flow, where we present a framework for adaptive finite element methods with duality-based a posteriori error control for chosen output quantities of...
Conference Paper
We present a framework for coupled multiphysics in computational fluid dynamics, targeting massively parallel systems. Our strategy is based on general problem formulations in the form of partial differential equations and the finite element method, which open for automation, and optimization of a set of fundamental algorithms. We describe these al...
Conference Paper
In parallel computing, load balancing is an essential component of any efficient and scalable simulation code. Static data decomposition methods have proven to work well for symmetric workloads. But, in today’s multiphysics simulations, with asymmetric workloads, this imbalance prevents good scalability on future generation of parallel architectures...
Article
Full-text available
This work presents a direct comparison of unsteady, turbulent flow simulations with measurements performed using a Gulfstream G550 nose landing gear model. The experimental campaign, which was carried out by researchers from the NASA Langley Research Center, provided a series of detailed, well documented wind-tunnel measurements for comparison and...
Conference Paper
Full-text available
Developing multiphysics finite element methods (FEM) and scalable HPC implementations can be very challenging in terms of software complexity and performance, even more so with the addition of goal-oriented adaptive mesh refinement. To manage the complexity we in this work present general adaptive stabilized methods with automated implementation in...
Article
This article is a review of our work towards a parameter-free method for simulation of turbulent flow at high Reynolds numbers. In a series of papers we have developed a model for turbulent flow in the form of weak solutions of the Navier-Stokes equations, approximated by an adaptive finite element method, where: (i) viscous dissipation is assumed...
Article
We present our simulation results for the benchmark problem of the flow past a rudimentary landing gear using a General Galerkin FEM, also referred to as adaptive DNS/LES. In General Galerkin, no explicit subgrid model is used; instead, the computational mesh is adaptively refined with respect to an a posteriori error estimate of a quantity of inte...
Conference Paper
We present a time-resolved, adaptive finite element method for aerodynamics, together with the results from the HiLiftPW-2 workshop, where this method is used to compute the flow past a DLR-F11 aircraft model at realistic Reynolds number. The mesh is automatically constructed by the method as part of the computation, and no explicit turbulence model...
Conference Paper
This is a summary of preliminary results from simulations with the 30P30N high-lift device. We used the General Galerkin finite element method (G2), where no explicit subgrid model is used, and where the computational mesh is adaptively refined with respect to a posteriori error estimates for a quantity of interest. The mesh is fully unstructured a...
Conference Paper
In parallel finite element solvers, sparse matrix assembly is often a bottleneck. Implemented using message passing, latency from message matching starts to limit performance as the number of cores increases. We here address this issue by using our own stack based representation of the sparse matrix, and a hybrid parallel programming model combinin...
Article
Full-text available
In this paper we describe a general adaptive finite element framework for unstructured tetrahedral meshes without hanging nodes suitable for large scale parallel computations. Our framework is designed to scale linearly to several thousands of processors, using fully distributed and efficient algorithms. The key components of our implementation, lo...
Chapter
This chapter provides a description of the technology of Unicorn focusing on simple, efficient and general algorithms and software for the Unified Continuum (UC) concept and the adaptive General Galerkin (G2) discretization as a unified approach to continuum mechanics.
Chapter
The FEniCS project aims towards the goals of generality, efficiency, and simplicity, concerning mathematical methodology, implementation and application, and the Unicorn project is an implementation aimed at FSI and high Re turbulent flow guided by these principles. Unicorn is based on the DOLFIN/FFC/FIAT suite and the linear algebra package PETS...
Article
We present a framework for adaptive finite element computation of turbulent flow and fluid–structure interaction, with focus on general algorithms that allow for complex geometry and deforming domains. We give basic models and finite element discretization methods, adaptive algorithms and strategies for efficient parallel implementation. To illustr...
Article
Full-text available
The massive computational cost for resolving all turbulent scales makes a direct numerical simulation of the underlying Navier-Stokes equations impossible in most engineering applications. We present recent advances in parallel adaptive finite element methodology that enable us to efficiently compute time resolved approximations for complex geome...
Chapter
In this paper we present a computational study of turbulent flow separation for a circular cylinder at high Reynolds numbers. We use a stabilized finite element method together with skin friction boundary conditions, where we study flow separation with respect to the decrease of a friction parameter. In particular, we consider the case of zero fric...
Article
Full-text available
In this paper we present our work on optimizing the automated scientific computing framework of FEniCS for modern high performance computer architectures. We describe recent developments of a high performance implementation of the finite element library DOLFIN and solver package Unicorn for distributed memory architectures. The current state of the...
Article
Full-text available
We describe a free software/open source continuum mechanics solver Unicorn [1] as part of the FEniCS [2, 3] software project for automation of computational modeling, with aspects such as Unified Continuum (UC) modeling for canonical representation/discretization of continuum mechanics modeling, abstraction of parallel low-level finite element ass...
Article
Full-text available
In this paper we describe a general adaptive finite element framework suitable for large scale parallel computations. Our framework is designed to scale linearly to several thousands of processors, using fully distributed and efficient algorithms. The key components of our implementation, mesh refinement and load balancing algorithms are described...
