
Tzanio Kolev
- Ph.D. Texas A&M University, M.Sc. Sofia University
- Computational Mathematician at Lawrence Livermore National Laboratory
Working on finite element discretizations, solvers and HPC applications.
About
131 Publications
42,432 Reads
4,002 Citations
Introduction
Computational mathematician in CASC, LLNL. Current projects:
• High-order ALE simulations, https://www.llnl.gov/casc/blast
• Finite element discretizations, https://mfem.org
• Center for Efficient Exascale Discretizations, https://ceed.exascaleproject.org
• Finite element visualization, https://glvis.org
• Scalable linear solvers, https://www.llnl.gov/casc/hypre
• Parallel time integration, https://www.llnl.gov/casc/xbraid
Current institution
Additional affiliations
January 2000 - May 2004
January 1997 - December 1998
June 2004 - present
Publications (131)
We introduce a novel method for bounding high-order multi-dimensional polynomials in finite element approximations. The method involves precomputing optimal piecewise-linear bounding boxes for polynomial basis functions, which can then be used to locally bound any combination of these basis functions. This approach can be applied to any element/bas...
Robust and scalable function evaluation at any arbitrary point in the finite/spectral element mesh is required for querying the partial differential equation solution at points of interest, comparison of solution between different meshes, and Lagrangian particle tracking. This is a challenging problem, particularly for high-order unstructured meshe...
Equilibriums in magnetic confinement devices result from force balancing between the Lorentz force and the plasma pressure gradient. In an axisymmetric configuration like a tokamak, such an equilibrium is described by an elliptic equation for the poloidal magnetic flux, commonly known as the Grad–Shafranov equation. It is challenging to develop a...
MAGMA (Matrix Algebra for GPU and Multicore Architectures) is a pivotal open-source library in the landscape of GPU-enabled dense and sparse linear algebra computations. With a repertoire of approximately 750 numerical routines across four precisions, MAGMA is deeply ingrained in the DOE software stack, playing a crucial role in high-performance co...
The MFEM (Modular Finite Element Methods) library is a high-performance C++ library for finite element discretizations. MFEM supports numerous types of finite element methods and is the discretization engine powering many computational physics and engineering applications across a number of domains. This paper describes some of the recent research...
The Lawrence Livermore National Laboratory (LLNL) will soon have in place the El Capitan exascale supercomputer, based on AMD GPUs. As part of a multiyear effort under the NNSA Advanced Simulation and Computing (ASC) program, we have been developing MARBL, a next generation, performance portable multiphysics application based on high-order finite e...
In this article, we present algorithms and implementations for the end-to-end GPU acceleration of matrix-free low-order-refined preconditioning of high-order finite element problems. The methods described here allow for the construction of effective preconditioners for high-order problems with optimal memory usage and computational complexity. The...
This work describes the development of matrix-free GPU-accelerated solvers for high-order finite element problems in $H(\mathrm{div})$. The solvers are applicable to grad-div and Darcy problems in saddle-point formulation, and have applications in radiation diffusion and porous media flow problems, among others. Using the interpolation-histopolatio...
In this paper we present a new GPU-oriented mesh optimization method based on high-order finite elements. Our approach relies on node movement with fixed topology, through the Target-Matrix Optimization Paradigm (TMOP) and uses a global nonlinear solve over the whole computational mesh, i.e., all mesh nodes are moved together. A key property of the...
In this paper, we present algorithms and implementations for the end-to-end GPU acceleration of matrix-free low-order-refined preconditioning of high-order finite element problems. The methods described here allow for the construction of effective preconditioners for high-order problems with optimal memory usage and computational complexity. The pr...
We propose a method for implicit high-order meshing that aligns easy-to-generate meshes with the boundaries and interfaces of the domain of interest. Our focus is particularly on the case when the target surface is prescribed as the zero isocontour of a smooth discrete function. Common examples of this scenario include using level set functions to...
We present an hr-adaptivity framework for optimization of high-order meshes. This work extends the r-adaptivity method by Dobrev et al. (Comput Fluids, 2020), where we utilized the Target-Matrix Optimization Paradigm (TMOP) to minimize a functional that depends on each element’s current and target geometric parameters: element aspect-ratio, size, s...
With the introduction of advanced heterogeneous computing architectures based on GPU accelerators, large-scale production codes have had to rethink their numerical algorithms and incorporate new programming models and memory management strategies in order to run efficiently on the latest supercomputers. In this work we discuss our co-design strateg...
In this paper we present a unified framework for constructing spectrally equivalent low-order-refined discretizations for the high-order finite element de Rham complex. This theory covers diffusion problems in $H^1$, $H({\rm curl})$, and $H({\rm div})$, and is based on combining a low-order discretization posed on a refined mesh with a high-order b...
In this paper we introduce general transfer operators between high-order and low-order refined finite element spaces that can be used to couple high-order and low-order simulations. Under natural restrictions on the low-order refined space we prove that both the high-to-low-order and low-to-high-order linear mappings are conservative, constant pres...
The magnetohydrodynamics (MHD) equations are continuum models used in the study of a wide range of plasma physics systems, including the evolution of complex plasma dynamics in tokamak disruptions. However, efficient numerical solution methods for MHD are extremely challenging due to disparate time and length scales, strong hyperbolic phenomena, an...
Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. One of the few viable approaches to achieve high...
In this paper we describe the research and development activities in the Center for Efficient Exascale Discretizations within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order applications on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED so...
In this paper, we develop subspace correction preconditioners for discontinuous Galerkin (DG) discretizations of elliptic problems with hp-refinement. These preconditioners are based on the decomposition of the DG finite element space into a conforming subspace, and a set of small nonconforming edge spaces. The conforming subspace is preconditioned...
We propose a new approach for controlling the characteristics of certain mesh faces during optimization of high-order curved meshes. The practical goals are tangential relaxation along initially aligned curved boundaries and internal surfaces, and mesh fitting to initially non-aligned surfaces. The distinct feature of the method is that it utilizes...
Large-scale finite element simulations of complex physical systems governed by partial differential equations crucially depend on adaptive mesh refinement (AMR) to allocate computational budget to regions where higher resolution is required. Existing scalable AMR methods make heuristic refinement decisions based on instantaneous error estimation an...
The greater arithmetic intensity of high‐order finite element discretizations makes them attractive for implementation on next‐generation hardware, but assembly of high‐order finite element operators as matrices is prohibitively expensive. As a result, the development of general algebraic solvers for such operators has been an open research challen...
We present an $hr$-adaptivity framework for optimization of high-order meshes. This work extends the $r$-adaptivity method for mesh optimization by Dobrev et al., where we utilized the Target-Matrix Optimization Paradigm (TMOP) to minimize a functional that depends on each element's current and target geometric parameters: element aspect-ratio, siz...
In this paper, we develop subspace correction preconditioners for discontinuous Galerkin (DG) discretizations of elliptic problems with $hp$-refinement. These preconditioners are based on the decomposition of the DG finite element space into a conforming subspace, and a set of small nonconforming edge spaces. The conforming subspace is precondition...
MFEM is an open-source, lightweight, flexible and scalable C++ library for modular finite element methods that features arbitrary high-order finite element meshes and spaces, support for a wide variety of discretization approaches and emphasis on usability, portability, and high-performance computing efficiency. MFEM’s goal is to provide applicatio...
Performance tests and analyses are critical to effective high-performance computing software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this article, we explore performance...
In this paper we propose tools for high-order mesh optimization and demonstrate their benefits in the context of multi-material Arbitrary Lagrangian-Eulerian (ALE) compressible shock hydrodynamic applications. The mesh optimization process is driven by information provided by the simulation which uses the optimized mesh, such as shock positions, ma...
The numerical approximation of compressible hydrodynamics is at the core of high-energy density (HED) multiphysics simulations as shocks are the driving force in experiments like inertial confinement fusion (ICF). In this work, we describe our extension of the hyperviscosity technique, originally developed for shock treatment in finite difference s...
Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this paper, we explore performance and space-time trade-off...
The main goal of this milestone was to help CEED-enabled ECP applications, including ExaSMR, MARBL, ExaWind and ExaAM, to improve their performance and capabilities on GPU systems like Summit and Lassen/Sierra. In addition, the CEED team also worked to: add and improve support for additional hardware and programming models in the CEED software comp...
This paper is focused on the aspects of limiting in residual distribution (RD) schemes for high-order finite element approximations to advection problems. Both continuous and discontinuous Galerkin methods are considered in this work. Discrete maximum principles are enforced using algebraic manipulations of element contributions to the global nonli...
As noted in Wikipedia, skin in the game refers to having ‘incurred risk by being involved in achieving a goal’, where ‘skin is a synecdoche for the person involved, and game is the metaphor for actions on the field of play under discussion’. For exascale applications under development in the US Department of Energy Exascale Computing Project, noth...
MFEM is an open-source, lightweight, flexible and scalable C++ library for modular finite element methods that features arbitrary high-order finite element meshes and spaces, support for a wide variety of discretization approaches and emphasis on usability, portability, and high-performance computing efficiency. MFEM's goal is to provide applicatio...
Productivity from day one on supercomputers that leverage new technologies requires significant preparation. An institution that procures a novel system architecture often lacks sufficient institutional knowledge and skills to prepare for it. Thus, the "Center of Excellence" (CoE) concept has emerged to prepare for systems such as Summit and Sierra...
The goal of this milestone was the performance tuning of the CEED software, as well as the use and tuning of CEED to accelerate the first and second wave of targeted ECP applications. In this milestone, the CEED team developed optimization techniques and tuned for performance the CEED software to accelerate the first and second wave target ECP appl...
In this work, we introduce a new residual distribution (RD) framework for the design of bound-preserving high-resolution finite element schemes. The continuous and discontinuous Galerkin discretizations of the linear advection equation are modified to construct local extremum diminishing (LED) approximations. To that end, we perform mass lumping an...
We propose a general algorithm for nonconforming adaptive mesh refinement (AMR) of unstructured meshes in high-order finite element codes. Our focus is on h-refinement with a fixed polynomial order. The algorithm handles triangular, quadrilateral, hexahedral, and prismatic meshes of arbitrarily high-order curvature, for any order finite element spa...
As part of its discretization mandate, CEED is developing adaptive algorithms for mesh refinement, coarsening and parallel rebalancing needed in general unstructured adaptive mesh refinement (AMR) of high-order hexahedral and/or tetrahedral meshes. This milestone provides an update on our developments of adaptive mesh control methods for both confo...
We present a method for simulation-driven optimization of high-order curved meshes. This work builds on the results of Dobrev et al. (The target-matrix optimization paradigm for high-order meshes. ArXiv e-prints, 2018, https://arxiv.org/abs/1807.09807), where we described a framework for controlling and improving the quality of high-order finite el...
We propose a general algorithm for non-conforming adaptive mesh refinement (AMR) of unstructured meshes in high-order finite element codes. Our focus is on h-refinement with a fixed polynomial order. The algorithm handles triangular, quadrilateral, hexahedral and prismatic meshes of arbitrarily high order curvature, for any order finite element spa...
In this milestone, we created and made publicly available the second full CEED software distribution, release CEED 2.0, consisting of software components such as MFEM, Nek5000, PETSc, MAGMA, OCCA, etc., treated as dependencies of CEED. The release consists of 12 integrated Spack packages for libCEED, mfem, nek5000, nekcem, laghos, nekbone, hpgmg, o...
A C++/CUDA library for executing general large scale tensor contractions on the GPU.
We describe a framework for controlling and improving the quality of high-order finite element meshes based on extensions of the Target-Matrix Optimization Paradigm (TMOP) of [P. Knupp, Eng. Comput., 28 (2012), pp. 419-429]. This approach allows high-order applications to have a very precise control over local mesh quality, while still improving th...
We describe a framework for controlling and improving the quality of high-order finite element meshes based on extensions of the Target-Matrix Optimization Paradigm (TMOP) of Knupp. This approach allows high-order applications to have a very precise control over local mesh quality, while still improving the mesh globally. We address the adaption of...
MFEM is a free, lightweight, flexible and scalable C++ library for modular finite element methods that features arbitrary high-order finite element meshes and spaces, support for a wide variety of discretization approaches and emphasis on usability, portability, and high-performance computing efficiency. Its mission is to provide application scientist...
DOE math libraries
We propose a unified algebraic approach for static condensation and hybridization, two popular techniques in finite element discretizations. The algebraic approach is supported by the construction of scalable solvers for problems involving H(div)-spaces discretized by conforming (Raviart-Thomas) elements of arbitrary order. We illustrate through n...
We present a new approach for multi-material arbitrary Lagrangian–Eulerian (ALE) hydrodynamics simulations based on high-order finite elements posed on high-order curvilinear meshes. The method builds on and extends our previous work in the Lagrangian [V. A. Dobrev, T. V. Kolev, and R. N. Rieben, SIAM J. Sci. Comput., 34 (2012), pp. B606–B641] and...
We show how a scalable preconditioner for the primal discontinuous Petrov–Galerkin (DPG) method can be developed using existing algebraic multigrid (AMG) preconditioning techniques. The stability of the DPG method gives a norm equivalence which allows us to exploit existing AMG algorithms and software. We show how these algebraic preconditioners ca...
We construct Balancing Domain Decomposition by Constraints methods for the linear systems arising from arbitrary order, finite element discretizations of the H(curl) model problem in three dimensions. Numerical results confirm that the proposed algorithm is quasi-optimal in the coarse-to-fine mesh ratio, and poly-logarithmic in the polynomial order...
We present a new predictor-corrector approach to enforcing local maximum principles in piecewise-linear finite element schemes for the compressible Euler equations. The new element-based limiting strategy is suitable for continuous and discontinuous Galerkin methods alike. In contrast to synchronized limiting techniques for systems of conservation...
In this paper we develop a two-grid convergence theory for the parallel-in-time scheme known as multigrid reduction in time (MGRIT), as it is implemented in the open-source XBraid package [29]. MGRIT is a scalable and multi-level approach to parallel-in-time simulations that non-intrusively uses existing time-stepping schemes, and that in a specifi...
A newly developed generic electro-magnetic (EM) simulation tool for modeling RF wave propagation in SOL plasmas is presented. The primary motivation of this development is to extend the domain partitioning approach for incorporating arbitrarily shaped SOL plasmas and antenna to the TORIC core ICRF solver, which was previously demonstrated in the 2D...
We consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-int...
This technical report describes our findings regarding performance optimizations of the tensor contraction kernels used in BLAST - a high-order FE hydrodynamics research code developed at LLNL - on various modern architectures. Our approach considers and shows ways to organize the contractions, their vectorization, data storage formats, read/write...
Outlines tensor-based finite element formation in the MFEM library. Also discusses executing these tensor computations on GPUs utilizing CUDA JIT compilation.
We present a computational framework for high-performance tensor contractions on GPUs. High-performance is difficult to obtain using existing libraries, especially for many independent contractions where each contraction is very small, e.g., sub-vector/warp in size. However, using our framework to batch contractions plus application-specifics, we d...
We show how a scalable preconditioner for the primal discontinuous Petrov-Galerkin (DPG) method can be developed using existing algebraic multigrid (AMG) preconditioning techniques. The stability of the DPG method gives a norm equivalence which allows us to exploit existing AMG algorithms and software. We show how these algebraic preconditioners ca...
In this work we present a FCT-like Maximum-Principle Preserving (MPP) method to solve the transport equation. We use high-order polynomial spaces; in particular, we consider up to 5th order spaces in two and three dimensions and 23rd order spaces in one dimension. The method combines the concepts of positive basis functions for discontinuous Galerk...
We present a new closure model for single fluid, multi-material Lagrangian hydrodynamics and its application to high-order finite element discretizations of these equations [1]. The model is general with respect to the number of materials, dimension, space and time discretization. Knowledge about exact material interfaces is not required. Material...
The parallel performance of several classical Algebraic Multigrid (AMG) methods applied to linear elasticity problems is investigated. These methods include standard AMG approaches for systems of partial differential equations such as the unknown and hybrid approaches, as well as the more recent global matrix (GM) and local neighborhood (LN) approa...
Algebraic Multigrid (AMG) solvers are an essential component of many large-scale scientific simulation codes. Their continued numerical scalability and efficient implementation is critical for preparing these codes for exascale. Our experiences on modern multi-core machines show that significant challenges must be addressed for AMG to perform well...
We have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linea...
In the foreseeable future, scientific applications will run on multiple diverse computer architectures with different power, resilience, and performance balances. To run efficiently, codes that have flexibility in their algorithm choices will be important. In this work, we describe how we employ a performance and power model to motivate algorithm...
The remap phase in arbitrary Lagrangian–Eulerian (ALE) hydrodynamics involves the transfer of field quantities defined on a post-Lagrangian mesh to some new mesh, usually generated by a mesh optimization algorithm. This problem is often posed in terms of transporting (or advecting) some state variable from the old mesh to the new mesh over a fictit...
The emergence of high-concurrency architectures offering unprecedented performance has brought many high-performance partial differential equation (PDE) discretization codes to the precipice of a major refactor. To help address this challenge a workshop titled "Algorithms and Abstractions for Assembly in PDE Codes" was held in the Computer Science...
We consider optimal-scaling multigrid solvers for the linear systems that arise from the discretization of problems with evolutionary behavior. Typically, solution algorithms for evolution equations are based on a time-marching approach, solving sequentially for one time step after the other. Parallelism in these traditional time-integration techni...
With current trends in computer architectures leading towards systems with more, but not faster, processors, faster time-to-solution must come from greater parallelism. We present a family of truly multilevel approaches to parallel time integration based on multigrid reduction (MGR) principles. The resulting multigrid-reduction-in-time (MGRIT) algo...
In this paper, we extend some of the multilevel convergence results obtained by Xu and Zhu in [Xu and Zhu, M3AS 2008], to the case of second order linear reaction-diffusion equations. Specifically, we consider the multilevel preconditioners for solving the linear systems arising from the linear finite element approximation of the problem, where bot...
Power and energy consumption are becoming an increasing concern in high performance computing. Compared to multi-core CPUs, GPUs have a much better performance per watt. In this paper we discuss efforts to redesign the most computation intensive parts of BLAST, an application that solves the equations for compressible hydrodynamics with high order...
This paper presents a high-order finite element method for calculating elastic-plastic flow on moving curvilinear meshes and is an extension of our general high-order curvilinear finite element approach for solving the Euler equations of gas dynamics in a Lagrangian frame [1,2]. In order to handle transition to plastic flow, we formulate the stress...
The BLAST code implements a high-order numerical algorithm that solves the equations of compressible hydrodynamics using the Finite Element Method in a moving Lagrangian frame. BLAST is coded in C++ and parallelized by MPI. We accelerate the most computationally intensive parts (80%-95%) of BLAST on an NVIDIA GPU with the CUDA programming model. Se...
We study regular decompositions for H(div) spaces. In particular, we show that such regular decompositions are closely related to a previously studied "inf-sup" condition for parameter-dependent Stokes problems, for which we provide an alternative, more direct, proof.
The hypre software library (http://www.llnl.gov/CASC/hypre/) is a collection of high performance preconditioners and solvers for large sparse linear systems of equations on massively parallel machines. This paper investigates the scaling properties of several of the popular multigrid solvers and system building interfaces in hypre on two mode...