Article

A Novel Algebraic Multigrid Approach Based on Adaptive Smoothing and Prolongation for Ill-Conditioned Systems


Abstract

The numerical simulation of modern engineering problems can easily involve millions or even billions of degrees of freedom. In several applications, these simulations require the solution of sparse linear systems of equations, and algebraic multigrid (AMG) methods are often standard choices as iterative solvers or preconditioners. This is due to their high convergence speed, guaranteed even in large-size problems, which is a consequence of the ability of AMG to reduce particular error components across the multilevel hierarchy. Despite carrying the name “algebraic”, most of these methods still rely on information beyond the globally assembled sparse matrix, for instance knowledge of the operator near-kernel. This somewhat limits their applicability as black-box solvers. In this work, we introduce a novel AMG approach featuring the adaptive Factored Sparse Approximate Inverse (aFSAI) method as a flexible smoother, as well as three new approaches to adaptively compute the prolongation operator. We assess the performance of the proposed AMG through the solution of a set of model problems along with real-world engineering test cases. Moreover, comparisons are made with the aFSAI and BoomerAMG preconditioners, showing that our new method is superior to the former and comparable to the latter, if not better, as in the elasticity problems.


... The iterations are stopped when the 2-norm of the initial residual is reduced by 8 orders of magnitude. The RACP(1) approach is used as a preconditioner, with the application of the inverse of the primal Schur complement, $S_u^{-1}$, carried out approximately by the AMG method proposed in [69,72]. For the sake of the comparison, we solve the system with the leading block A alone by accelerating GMRES(100) with the same multigrid method used to approximate $S_u^{-1}$. ...
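The stopping rule quoted above can be sketched in a few lines. This is only an illustration: the helper names `residual_norm` and `converged`, the 2×2 matrix, and the Jacobi loop standing in for the preconditioned Krylov solver are all assumptions and do not come from the cited paper.

```python
import math

def residual_norm(A, x, b):
    """2-norm of b - A x for a dense matrix stored as a list of rows."""
    r = [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]
    return math.sqrt(sum(ri * ri for ri in r))

def converged(A, x, b, r0_norm, orders=8):
    """True once ||b - A x||_2 <= 10^(-orders) * ||r0||_2."""
    return residual_norm(A, x, b) <= 10.0 ** (-orders) * r0_norm

# Toy system; a Jacobi iteration stands in for the actual Krylov solver.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
r0 = residual_norm(A, x, b)
iterations = 0
while not converged(A, x, b, r0) and iterations < 200:
    # Simultaneous (Jacobi) update: every component uses the old x.
    x = [(b[i] - sum(A[i][j] * x[j] for j in range(2) if j != i)) / A[i][i]
         for i in range(2)]
    iterations += 1
```

The test is relative to the initial residual, so it is independent of the scaling of the right-hand side.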
... As a comparison, we consider the performance obtained with other methods available from the literature. Table 4 reports the iteration counts, the preconditioner application costs and the total solution costs for a Mixed Constrained Preconditioner (MCP) [24,59] where:
• the inverse of the leading block A is approximately applied by using either the same AMG approach used with RACP(1) [69,72], or an adaptive FSAI (aFSAI) [59,73];
• the Schur complement $S = -B^T A^{-1} B$ is computed inexactly by replacing $A^{-1}$ with an aFSAI of A and the application of its inverse is carried out approximately by another aFSAI. ...
... Depending on the choice for the inner preconditioner of the leading block, we distinguish between MCP+AMG and MCP+FSAI in Table 4. For these approaches, the set of user-specified parameters providing the empirically optimal performance, as suggested in the relevant literature [69,72,73,74], is used. It can be noticed that RACP is able to solve all the test cases with acceptable computational costs. ...
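As a rough illustration of the inexact Schur complement described in the excerpt, the sketch below forms S ≈ −Bᵀ M B with an explicit approximate inverse M of A. The diagonal choice M = diag(A)⁻¹ is an assumption made here to keep the example short; the excerpt uses an aFSAI of A instead. The function name `approx_schur` and the small matrices are hypothetical.

```python
def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def approx_schur(A, B):
    """S ~ -B^T M B, with M = diag(A)^{-1} standing in for an aFSAI of A."""
    n = len(A)
    M = [[1.0 / A[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    BtMB = matmul(matmul(transpose(B), M), B)
    return [[-v for v in row] for row in BtMB]

# Hypothetical leading block A (3x3, SPD) and constraint block B (3x2).
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
S = approx_schur(A, B)
```

Because M is sparse and explicit, S stays sparse and can be assembled without ever forming the dense inverse of A, which is the point of replacing $A^{-1}$ by an approximate inverse.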
Article
Frictional contact is one of the most challenging problems in computational mechanics. Typically, it is a tough non-linear problem often requiring several Newton iterations to converge and causing troubles also in the solution to the related linear systems. When contact is modeled with the aid of Lagrange multipliers, the impenetrability condition is enforced exactly, but the associated Jacobian matrix is indefinite and needs a special treatment for a fast numerical solution. In this work, a constraint preconditioner is proposed where the primal Schur complement is computed after augmenting the zero block. The name Reverse is used in contrast to the traditional approach where only the structural block undergoes an augmentation. Besides being able to address problems characterized by singular structural blocks, often arising in contact mechanics, it is shown that the proposed approach is significantly cheaper than traditional constraint preconditioning for this class of problems and it is suitable for an efficient HPC implementation through the Chronos parallel package. Our conclusions are supported by several numerical experiments on mid- and large-size problems from various applications. The source files implementing the proposed algorithm are freely available on GitHub.
... The iterations are stopped when the 2-norm of the initial residual is reduced by 8 orders of magnitude. The RACP(1) approach is used as a preconditioner, with the application of the inverse of the primal Schur complement, $S_u^{-1}$, carried out approximately by the AMG method proposed in [59,62]. For the sake of the comparison, we solve the system with the leading block A alone by accelerating GMRES(100) with the same multigrid method used to approximate $S_u^{-1}$. ...
... • the inverse of the leading block A is approximately applied by using either the same AMG approach used with RACP(1) [59,62], or an adaptive FSAI (aFSAI) [63,49]; ...
... Depending on the choice for the inner preconditioner of the leading block, we distinguish between MCP+AMG and MCP+FSAI in Table 4. For these approaches, the set of user-specified parameters providing the empirically optimal performance, as suggested in the relevant literature [63,62,64,59], is used. It can be noticed that RACP is able to solve all the test cases with acceptable computational costs. ...
Preprint
Full-text available
Frictional contact is one of the most challenging problems in computational mechanics. Typically, it is a tough nonlinear problem often requiring several Newton iterations to converge and causing troubles also in the solution to the related linear systems. When contact is modeled with the aid of Lagrange multipliers, the impenetrability condition is enforced exactly, but the associated Jacobian matrix is indefinite and needs a special treatment for a fast numerical solution. In this work, a constraint preconditioner is proposed where the primal Schur complement is computed after augmenting the zero block. The name Reverse is used in contrast to the traditional approach where only the structural block undergoes an augmentation. Besides being able to address problems characterized by singular structural blocks, often arising in contact mechanics, it is shown that the proposed approach is significantly cheaper than traditional constraint preconditioning for this class of problems and it is suitable for an efficient HPC implementation through the Chronos parallel package. Our conclusions are supported by several numerical experiments on mid- and large-size problems from various applications. The source files implementing the proposed algorithm are freely available on GitHub.
... This work presents an extension of the adaptive Smoothing and Prolongation based Algebraic Multigrid method (aSP-AMG), proposed by [49], with the aim of specifically improving its performance for the solution of linear elasticity problems. Such method follows the path of bootstrap and adaptive AMG which is to build an approximation of the near-null space components of the problem at hand automatically. ...
... This is particularly true for structural problems where damping highest frequencies often requires the use of weights or Chebyshev polynomials [52,53]. In [49], the aFSAI [29] is proposed as smoother and its effectiveness ... Define $\Omega_k$ as the set of the $n_k$ vertices of the adjacency graph of $A_k$; 3: if $n_k$ is small enough to allow a direct factorization then 4: Compute $A_k = L_k L_k^T$; 5: ...
... The last key component of our AMG method is the construction of suitable prolongation and restriction operators. Following the idea proposed in [42] and successively refined in [49], we choose to build an interpolation operator fitting as closely as possible the set of test vectors computed in the early setup stage. More precisely, the prolongation weights $\beta_j$ are computed in order to minimize the interpolation residual: ...
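The excerpt truncates the residual formula, so the sketch below uses a plain, unweighted least-squares fit as an assumption: for one fine node i with interpolatory set C, it picks the weights that minimize the sum over test vectors of (v[i] − Σ_j β_j v[j])². The function names and the 1D test vectors are hypothetical, and the cited method may weight or constrain the residual differently.

```python
def solve(M, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def ls_weights(test_vectors, i, coarse):
    """Weights beta_j minimizing sum_k (v_k[i] - sum_j beta_j * v_k[coarse[j]])^2."""
    m = len(coarse)
    # Normal equations (Phi^T Phi) beta = Phi^T f, with Phi[k][j] = v_k[coarse[j]].
    gram = [[sum(v[coarse[a]] * v[coarse[c]] for v in test_vectors)
             for c in range(m)] for a in range(m)]
    rhs = [sum(v[coarse[a]] * v[i] for v in test_vectors) for a in range(m)]
    return solve(gram, rhs)

# Fine node 1 sits between coarse nodes 0 and 2 on a 1D grid.
constant = [1.0, 1.0, 1.0]
linear = [0.0, 1.0, 2.0]
beta = ls_weights([constant, linear], i=1, coarse=[0, 2])
```

With a constant and a linear test vector the fit recovers linear interpolation, which is the behaviour one would expect when the test vectors span the near-null space of a 1D Laplacian.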
Article
The numerical simulation of structural mechanics applications via finite elements usually requires the solution of large-size linear systems, especially when accurate results are sought for derived variables, like stress or deformation fields. Such a task represents the most time-consuming kernel, and motivates the development of robust and efficient linear solvers for these applications. On the one hand, direct solvers are robust and easy to use, but their computational complexity in the best scenario is superlinear, which limits applicability according to the problem size. On the other hand, iterative solvers, in particular those based on algebraic multigrid (AMG) preconditioners, can reach up to linear complexity, but require more knowledge from the user for an efficient setup, and convergence is not always guaranteed, especially in ill-conditioned problems. In this work, we present a novel AMG method specifically tailored for ill-conditioned structural problems. It is characterized by an adaptive factored sparse approximate inverse (aFSAI) method as smoother, an improved least-squares-based prolongation (DPLS) and a method for uncovering the near-null space that takes advantage of an already existing approximation. The resulting linear solver has been applied in the solution of challenging linear systems arising from real-world linear elastic structural problems. Numerical experiments prove the efficiency and robustness of the method and show how, in several cases, the proposed algorithm outperforms state-of-the-art AMG linear solvers. Even more important, the results show how the proposed method gives good results even assuming a default setup, making it fully adoptable also for non-expert users and commercial software.
... This work presents an extension of the adaptive Smoothing and Prolongation based Algebraic Multigrid method (aSP-AMG), proposed by [49], with the aim of specifically improving its performance for the solution of linear elasticity problems. Such method follows the path of bootstrap and adaptive AMG which is to build an approximation of the near-null space components of the problem at hand automatically. ...
... This is particularly true for structural problems where damping highest frequencies often requires the use of weights or Chebyshev polynomials [52,53]. In [49], the aFSAI [29] is proposed as smoother and its effectiveness is assessed on an extensive set of numerical experiments. aFSAI is designed for SPD matrices and, as smoother, takes the following form: ...
... The last key component of our AMG method is the construction of suitable prolongation and restriction operators. Following the idea proposed in [42] and successively refined in [49], we choose to build an interpolation operator fitting as closely as possible the set of test vectors computed in the early setup stage. More precisely, the prolongation weights $\beta_j$ are computed in order to minimize the interpolation residual: ...
Preprint
Full-text available
The numerical simulation of structural mechanics applications via finite elements usually requires the solution of large-size and ill-conditioned linear systems, especially when accurate results are sought for derived variables interpolated with lower order functions, like stress or deformation fields. Such a task represents the most time-consuming kernel in commercial simulators; thus, the development of robust and efficient linear solvers for such applications is of significant interest. In this context, direct solvers, which are based on LU factorization techniques, are often used due to their robustness and easy setup; however, they reach only superlinear complexity in the best case and thus have limited applicability depending on the problem size. On the other hand, iterative solvers based on algebraic multigrid (AMG) preconditioners can reach up to linear complexity for sufficiently regular problems but do not always converge and require more knowledge from the user for an efficient setup. In this work, we present an adaptive AMG method specifically designed to improve its usability and efficiency in the solution of structural problems. We show numerical results for several practical applications with millions of unknowns and compare our method with two state-of-the-art linear solvers, proving its efficiency and robustness.
... (9). They correspond to a MATLAB implementation using k = 2, Dynamic Pattern Least Squares (DPLS) [12] and energy minimisation [13]. Conversely, the AMG applied to the reduced operator used a PMIS coarsening [14], Extended+I interpolation [15] and an adaptive-pattern FSAI smoother [12]. ...
... They correspond to a MATLAB implementation using k = 2, Dynamic Pattern Least Squares (DPLS) [12] and energy minimisation [13]. Conversely, the AMG applied to the reduced operator used a PMIS coarsening [14], Extended+I interpolation [15] and an adaptive-pattern FSAI smoother [12]. As can be seen, adopting the aggressive coarsening that spatial symmetries induce does not harm convergence and allows for the computational advantages of replacing SpMV with SpMM. ...
... Other approaches are very appropriate as well, especially in view of the code extension to large-size problems and memory-distributed computational frameworks, such as advanced Algebraic Multigrid methods, e.g. [38][39][40]. Since our code is currently written at a Matlab prototypical level, to test the algorithmic capabilities of the proposed approach we simply rely on the approximation (32). ...
... For instance, a consistent inf-sup stable discretization is generated by ℚ 2 − ℚ 1 Taylor-Hood elements. By defining a partition  ℎ of Ω made of non-overlapping elements Ω , the basis functions in equation (40) can be set as follows: ...
... Adaptive AMG [21] and adaptive smoothed aggregation [10] are among early attempts to assess the quality of the AMG setup phase during the setup process, with the ability to adaptively improve the interpolation operators. Later works focus on extending the adaptive ideas to more general settings [22], and in particular, Bootstrap AMG [4] further develops the idea of adaptive interpolation with least-squares interpolation coupled with locally relaxed vectors and multilevel eigenmodes. Other advanced approaches have a focus on specific AMG components, such as energy minimization of the interpolation operator [23, 35,30,28,25], generalizing the strength of connection procedure [27,5], or by considering the nonsymmetric nature of the problem directly [26,24]. ...
... Analysis of the energy minimization process. We use two matrices for studying prolongation energy reduction, Cube and Pflow742 [22]. While the former is quite simple, as it is the fourth refinement level of the linear elasticity cube used in the weak scalability study, the latter arises from a three-dimensional (3D) simulation of the pressure field in a multilayered porous medium discretized by a sufficiently regular Q2-hexahedral finite elements. ...
... Many AMG algorithms have been proposed, such as classical AMG [13,14], smoothed aggregation AMGs [15,16], AMGs based on element interpolation (AMGe) [17], and element-free AMGe [18]. The adaptive smoothed AMG ( AMG) [19], bootstrap AMG [20,21], and adaptive smoothing and prolongation-based AMG (aSP-AMG) [22,23] were designed for ill-conditioned systems for which the classical AMG algorithms perform poorly or fail to converge. In AMG algorithms, the operator complexity greatly influences the parallel performance. ...
... Only when all subsystems satisfy the local equations can the global equations be satisfied. From Equation (22), the inner DOFs x I in(LB) for the local balanced state can be solved from following equations: ...
Article
Full-text available
Solving linear equations and finding eigenvalues are essential tasks in many simulations for engineering applications, but these tasks often cause performance bottlenecks. In this work, the hierarchical subspace evolution method (HiSEM), a hierarchical iteration framework for solving scientific computing problems with solution locality, is proposed. In HiSEM, the original problem is converted to a corresponding minimization function. The problem is decomposed into a series of subsystems. Subspaces and their weights are established for the subsystems and evolve in each iteration. The subspaces are calculated based on local equations and knowledge of physical problems. A small‐scale minimization problem determines the weights of the subspaces. The solution system can be hierarchically established based on the subspaces. As the iterations continue, the degrees of freedom gradually converge to an accurate solution. Two parallel algorithms are derived from HiSEM. One algorithm is designed for symmetric positive definite linear equations, and the other is designed for generalized eigenvalue problems. The linear solver and eigensolver performance is evaluated using a series of benchmarks and a tower model with a complex topology. Algorithms derived from HiSEM can solve a super large‐scale problem with high performance and good scalability.
... Adaptive AMG [19] and adaptive smoothed aggregation [9] are among early attempts to assess the quality of the AMG setup phase during the setup process, with the ability to adaptively improve the interpolation operators. Later works focus on extending the adaptive ideas to more general settings [20], and in particular, Bootstrap AMG [3] further develops the idea of adaptive interpolation with least-squares interpolation coupled with locally relaxed vectors and multilevel eigenmodes. Other advanced approaches have a focus on specific AMG components, such as energy minimization of the interpolation operator [21,33,28,26,23], generalizing the strength of connection procedure [25,4], or by considering the nonsymmetric nature of the problem directly [24,22]. ...
... Analysis of the energy minimization process. We use two matrices for studying prolongation energy reduction, Cube and Pflow742 [20]. While the former is quite simple, as it is the fourth refinement level of the linear elasticity cube used in the weak scalability study, the latter arises from a 3D simulation of the pressure-temperature field in a multilayered porous medium discretized by hexahedral finite elements. ...
Preprint
Full-text available
Algebraic multigrid (AMG) is one of the most widely used solution techniques for linear systems of equations arising from discretized partial differential equations. The popularity of AMG stems from its potential to solve linear systems in almost linear time, that is with an O(n) complexity, where n is the problem size. This capability is crucial at the present, where the increasing availability of massive HPC platforms pushes for the solution of very large problems. The key for a rapidly converging AMG method is a good interplay between the smoother and the coarse-grid correction, which in turn requires the use of an effective prolongation. From a theoretical viewpoint, the prolongation must accurately represent near kernel components and, at the same time, be bounded in the energy norm. For challenging problems, however, ensuring both these requirements is not easy and is exactly the goal of this work. We propose a constrained minimization procedure aimed at reducing prolongation energy while preserving the near kernel components in the span of interpolation. The proposed algorithm is based on previous energy minimization approaches utilizing a preconditioned restricted conjugate gradients method, but has new features and a specific focus on parallel performance and implementation. It is shown that the resulting solver, when used for large real-world problems from various application fields, exhibits excellent convergence rates and scalability and outperforms at least some more traditional AMG approaches.
... However, robustness, scalability and computational efficiency of this class of methods is tightly connected with the choice of a proper preconditioning technique [46]. Roughly speaking, preconditioners are approximate applications of the system matrix inverse, and, from the algebraic viewpoint, can be classified into three main categories: (i) incomplete factorizations [48][49][50], (ii) approximate inverses [51][52][53][54][55][56], and (iii) multilevel methods, i.e., domain decomposition [57][58][59][60][61] and multigrid-like techniques [62][63][64][65][66][67][68][69][70][71][72][73][74][75]. A key feature for a modern preconditioning framework is the algorithmic scalability, i.e., the ability to solve an increasingly refined problem with an approximately constant number of iterations of the Krylov solver. ...
... This is a key property to guarantee the solver scalability. Recent examples of effective AMG preconditioners are, for instance, taken from the References [71,73,74,75]. In this work, we use an aggregation-based multigrid as the reference AMG operator. ...
Article
A preconditioning framework for the coupled problem of frictional contact mechanics and fluid flow in the fracture network is presented. We focus on a blended finite element/finite volume method, where the porous medium is discretized by low-order continuous finite elements with nodal unknowns, cell-centered Lagrange multipliers are used to prescribe the contact constraints, and the fluid flow in the fractures is described by a classical two-point flux approximation scheme. This formulation is consistent, but is not uniformly inf-sup bounded and requires a stabilization. For the resulting 3×3 block Jacobian matrix, robust and efficient solution methods are not available, so we aim at designing new scalable preconditioning strategies based on the physically-informed block partitioning of the unknowns and state-of-the-art multigrid techniques. The key idea is to restrict the system to a single-physics problem, approximately solve it by an inner algebraic multigrid approach, and finally prolong it back to the fully-coupled problem. Two different techniques are presented, analyzed and compared by changing the ordering of the restrictions. Numerical results illustrate the algorithmic scalability, the impact of the relative number of fracture-based unknowns, and the performance on a benchmark problem. The objective of the analysis is to identify the most promising solution strategy.
... In summary, the main ingredients of AMG are: (i) a coarsening strategy, (ii) restriction and, (iii) prolongation (interpolation) operators, (iv) the smoother, and (v) the application technique [150]. Working on these components gives rise to a considerable number of possible variants. ...
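To make the listed ingredients concrete, here is a minimal two-grid cycle on a 1D Poisson matrix. Everything in it is an illustrative assumption rather than the setup of any cited work: geometric coarsening (every second node), linear interpolation as prolongation, its transpose as restriction, a Galerkin coarse operator, weighted Jacobi as smoother, and a direct coarse solve.

```python
import math

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def solve(M, rhs):
    """Gaussian elimination with partial pivoting (coarse-grid direct solve)."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def jacobi(A, x, b, omega=2.0 / 3.0, sweeps=2):
    """Weighted Jacobi smoother."""
    for _ in range(sweeps):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        x = [xi + omega * ri / A[i][i] for i, (xi, ri) in enumerate(zip(x, r))]
    return x

def two_grid(A, P, x, b):
    R = transpose(P)                          # restriction
    Ac = matmul(matmul(R, A), P)              # Galerkin coarse operator
    x = jacobi(A, x, b)                       # pre-smoothing
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    ec = solve(Ac, matvec(R, r))              # coarse-grid correction
    x = [xi + ei for xi, ei in zip(x, matvec(P, ec))]
    return jacobi(A, x, b)                    # post-smoothing

n, nc = 7, 3
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
P = [[0.0] * nc for _ in range(n)]
for j in range(nc):                           # linear interpolation
    P[2 * j][j] = 0.5
    P[2 * j + 1][j] = 1.0
    P[2 * j + 2][j] = 0.5
b = [1.0] * n
x = [0.0] * n
for _ in range(10):
    x = two_grid(A, P, x, b)
residual = math.sqrt(sum((bi - axi) ** 2 for bi, axi in zip(b, matvec(A, x))))
```

An algebraic method replaces the geometric choices made here (which nodes are coarse, which weights go into P) with decisions derived from the matrix itself, and applies the cycle recursively instead of solving the coarse problem directly.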
... Working on these components gives rise to a considerable number of possible variants. In this regard, AMG methods have attracted a great interest from the scientific community during the last 20 years and are currently object of intense development, see, for instance, [150][151][152][153][154][155][156][157] for a selection of methods. ...
Article
Full-text available
Linear solvers for reservoir simulation applications are the objective of this review. Specifically, we focus on techniques for Fully Implicit (FI) solution methods, in which the set of governing Partial Differential Equations (PDEs) is properly discretized in time (usually by the Backward Euler scheme), and space, and tackled by assembling and linearizing a single system of equations to solve all the model unknowns simultaneously. Due to the usually large size of these systems arising from real-world models, iterative methods, specifically Krylov subspace solvers, have become conventional choices; nonetheless, their success largely revolves around the quality of the preconditioner that is supplied to accelerate their convergence. These two intertwined elements, i.e., the solver and the preconditioner, are the focus of our analysis, especially the latter, which is still the subject of extensive research. The progressive increase in reservoir model size and complexity, along with the introduction of additional physics to the classical flow problem, display the limits of existing solvers. Intensive usage of computational and memory resources are frequent drawbacks in practice, resulting in unpleasantly slow convergence rates. Developing efficient, robust, and scalable preconditioners, often relying on physics-based assumptions, is the way to avoid potential bottlenecks in the solving phase. In this work, we proceed in reviewing principles and state-of-the-art of such linear solution tools to summarize and discuss the main advances and research directions for reservoir simulation problems. We compare the available preconditioning options, showing the connections existing among the different approaches, and try to develop a general algebraic framework.
... However, robustness, scalability and computational efficiency of this class of methods is tightly connected with the choice of a proper preconditioning technique [42]. Roughly speaking, preconditioners are approximate applications of the system matrix inverse, and, from the algebraic viewpoint, can be classified into three main categories: (i) incomplete factorizations [44][45][46], (ii) approximate inverses [47][48][49][50][51][52], and (iii) multilevel methods, i.e., domain decomposition [53-57] and multigrid-like techniques [58][59][60][61][62][63][64][65][66][67][68][69][70][71]. A key feature for a modern preconditioning framework is the algorithmic scalability, i.e., the ability to solve an increasingly refined problem with an approximately constant number of iterations of the Krylov solver. ...
... This is a key property to guarantee the solver scalability. Recent examples of effective AMG preconditioners are, for instance, taken from the References [67,69,70,71]. In this work, we use an aggregation-based multigrid as the reference AMG operator. ...
Preprint
Full-text available
A preconditioning framework for the coupled problem of frictional contact mechanics and fluid flow in the fracture network is presented. The porous medium is discretized using low-order continuous finite elements, with cell-centered Lagrange multipliers and pressure unknowns used to impose the constraints and solve the fluid flow in the fractures, respectively. This formulation does not require any interpolation between different fields, but is not uniformly inf-sup stable and requires a stabilization. For the resulting 3 x 3 block Jacobian matrix, we design scalable preconditioning strategies, based on the physically-informed block partitioning of the unknowns and state-of-the-art multigrid preconditioners. The key idea is to restrict the system to a single-physics problem, approximately solve it by an inner algebraic multigrid approach, and finally prolong it back to the fully-coupled problem. Two different techniques are presented, analyzed and compared by changing the ordering of the restrictions. Numerical results illustrate the algorithmic scalability, the impact of the relative number of fracture-based unknowns, and the performance on a real-world problem.
... SoC (12) is generally used in smoothed aggregation AMG [41] and usually gives good results in structural problems. Finally, SoC (13) has been introduced in [32] and, though requiring a rather expensive computation, is able to accurately capture anisotropies, as is shown in [33]. After SoC is computed for every pair of nodes, weak connections are eliminated to determine MISs whose nodes become the unknown in the next level. ...
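The coarsening step described above (compute strengths, drop weak connections, pick a maximal independent set) can be sketched as follows. The strength measure used here is the classical threshold test −a_ij ≥ θ · max_k(−a_ik), chosen as an assumption because the excerpt's SoC formulas (12) and (13) are not reproduced; the function names, the value of θ, and the greedy index-order selection are likewise hypothetical.

```python
def strong_connections(A, theta=0.25):
    """For each row i, the set of j with -a_ij >= theta * max_k(-a_ik), k != i."""
    n = len(A)
    strong = []
    for i in range(n):
        off = [-A[i][j] for j in range(n) if j != i]
        biggest = max(off) if off else 0.0
        strong.append({j for j in range(n)
                       if j != i and biggest > 0.0 and -A[i][j] >= theta * biggest})
    return strong

def greedy_mis(strong):
    """Greedy maximal independent set on the symmetrized strong-connection graph."""
    n = len(strong)
    nbrs = [set(s) for s in strong]
    for i in range(n):
        for j in strong[i]:
            nbrs[j].add(i)
    state = [None] * n
    coarse = []
    for i in range(n):
        if state[i] is None:
            state[i] = "C"            # i becomes a coarse node
            coarse.append(i)
            for j in nbrs[i]:
                if state[j] is None:
                    state[j] = "F"    # its strong neighbours become fine nodes
    return coarse

n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
strong = strong_connections(A)
coarse = greedy_mis(strong)
```

On this path graph the selection keeps every second node, so no two coarse nodes are strongly connected and every fine node has a coarse strong neighbour.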
... In turn, large jumps in P introduce high frequencies in the next level operator that the smoother hardly handles. To overcome these difficulties, we compute our BAMG interpolation with an adaptive procedure similar to those described in [18,33]. More specifically, let us define the matrix $\Phi$ whose entries $\varphi_{ij}$ correspond to the jth component of the ith test vector $v_i$ for any j in the interpolatory set. ...
... In this work, we propose Chronos, a massively parallel implementation of a novel AMG framework [12,6] which is able to adapt all of its components to the problem at hand, from the smoother set-up, to the coarse grid hierarchy and prolongation definition. This is achieved by guessing and iteratively improving in a bootstrap fashion the near-null space of the system, which allows for both testing the smoother and the prolongation operator as well as for inferring the connection strengths between system unknowns. ...
... In contrast to the Gauss-Seidel smoother, aFSAI is perfectly parallel also in its application: since it gives an explicit approximation of the system inverse, it can be applied simply by a matrix-vector product. The price to pay for the use of aFSAI is a not always negligible set-up cost that is usually compensated by a faster convergence, especially in ill-conditioned problems [12,6], where standard smoothers fail in damping high frequencies. ...
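A small sketch of why an FSAI-type smoother reduces to matrix-vector products: with a sparse lower-triangular factor G such that GᵀG approximates the inverse of A, one sweep is x ← x + ω GᵀG (b − A x). The sketch below is an assumption-laden stand-in: it uses a fixed bidiagonal pattern {i−1, i} instead of the adaptive pattern of aFSAI, the names `fsai_bidiagonal` and `fsai_smooth` are hypothetical, and ω = 1 is simply a value that is stable for this particular matrix.

```python
import math

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matvec_t(M, x):
    """Product with the transpose of M."""
    return [sum(M[i][j] * x[i] for i in range(len(M))) for j in range(len(M[0]))]

def fsai_bidiagonal(A):
    """FSAI factor G on the fixed row pattern {i-1, i}."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == 0:
            cols, y = [0], [1.0 / A[0][0]]
        else:
            a, off, d = A[i - 1][i - 1], A[i - 1][i], A[i][i]
            det = a * d - off * off
            # Solve the local system [[a, off], [off, d]] y = [0, 1] by Cramer's rule.
            cols, y = [i - 1, i], [-off / det, a / det]
        scale = math.sqrt(y[-1])      # scaling that gives (G A G^T)_ii = 1
        for c, v in zip(cols, y):
            G[i][c] = v / scale
    return G

def fsai_smooth(A, G, x, b, omega=1.0, sweeps=1):
    for _ in range(sweeps):
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        z = matvec_t(G, matvec(G, r))         # z = G^T G r: two mat-vec products
        x = [xi + omega * zi for xi, zi in zip(x, z)]
    return x

def energy(A, e):
    """Squared energy norm e^T A e."""
    return sum(ei * aei for ei, aei in zip(e, matvec(A, e)))

n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
G = fsai_bidiagonal(A)
b = [0.0] * n                                 # with b = 0 the iterate is the error
x0 = [(-1.0) ** i for i in range(n)]          # oscillatory initial error
x5 = fsai_smooth(A, G, x0, b, sweeps=5)
```

Each row of G is computed from an independent small local system and each sweep involves no triangular solve, which is what makes both set-up and application parallel in contrast to Gauss-Seidel.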
... One correlation-based measure for selecting the CDOFs is explored in Reference 12 for graph Laplacian systems (see also the recent article, 8 which uses a correlation measure for selecting the CDOFs for PDEs). With i, j being two arbitrary vector components of vector v ( ) and the component inner product defined as ...
... Indeed, these elasticity equations have been the impetus for much of the AMG developments over the past two decades. [2][3][4]7,8 The difficulty arises from the multidimensional near-nullspace (i.e., rigid body modes) and the non-M matrix property of the discretized system. ...
Article
This article develops an algebraic multigrid (AMG) method for solving systems of elliptic boundary‐value problems. It is well known that multigrid for systems of elliptic equations faces many challenges that do not arise for most scalar equations. These challenges include strong intervariable couplings, multidimensional and possibly large near‐nullspaces, analytically unknown near‐nullspaces, delicate selection of coarse degrees of freedom (CDOFs), and complex construction of intergrid operators. In this article, we consider only the selection of CDOFs and the construction of the interpolation operator. The selection is an extension of the Ruge–Stuben algorithm using a new strength of connection measure taken between nodal degrees of freedom, that is, between all degrees of freedom located at a gridpoint to all degrees of freedom at another gridpoint. This measure is based on a local correlation matrix generated for a set of smoothed test vectors derived from a relaxation‐based procedure. With this measure, selection of the CDOFs is then determined by the number of strongly correlated connections at each node, with the selection processed by a Ruge–Stuben coloring scheme. Having selected the CDOFs, the interpolation operator is constructed using a bootstrap AMG (BAMG) procedure. We apply the BAMG procedure either over the smoothed test vectors to obtain an intervariable interpolation scheme or over the like‐variable components of the smoothed test vectors to obtain an intravariable interpolation scheme. Moreover, comparing the correlation measured between the intravariable couplings with the correlation between all couplings, a mixed intravariable and intervariable interpolation scheme is developed. We further examine an indirect BAMG method that explicitly uses the coefficients of the system operator in constructing the interpolation weights. 
Finally, based on a weak approximation criterion, we consider a simple scheme to adapt the order of the interpolation (i.e., adapt the caliber or maximum number of coarse‐grid points that a fine‐grid point can interpolate from) over the computational domain.
... In fact, Fig. 6 indicates that the resulting coarsening is less effective than that of the underlying AMG, as it leads to larger and denser coarse-grid operators. Conversely, despite making AMGR lighter, using larger values of rapidly exceeds the range of applicability of long-distance interpolation formulas, such as Extended+I (ExtI) [79] or DPLS [80]. Indeed, the number of iterations grows fast with , rendering the AMG reduction ineffective and evincing the need for an optimal . ...
... Indeed, if there are $n_{inn}$ inner unknowns, the maximum inner-interface distance will be of the order of $\sqrt[d]{n_{inn}}$, where $d$ stands for the geometrical dimension of the problem. Then, a 3D problem with about a million inner unknowns would result in distances of about 100 units, which exceeds the applicability of long-distance interpolation formulas such as Extended+I (ExtI) [18] or dynamic-pattern Least Squares Fit (LSF) [43]. Hence, to allow for an accurate interpolation, we need to convert some inner nodes into coarse. ...
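The order-of-magnitude estimate quoted above can be checked with a short calculation. This sketch assumes the distance scales as the d-th root of the number of inner unknowns, which is the reading consistent with the "about 100 units" figure for a million unknowns in 3D; the function name is hypothetical.

```python
def inner_interface_distance(n_inn, d):
    """Order-of-magnitude estimate: d-th root of the number of inner unknowns."""
    return n_inn ** (1.0 / d)

# 3D problem with about a million inner unknowns
dist_3d = inner_interface_distance(1_000_000, 3)
# The same count in 2D gives a much larger distance
dist_2d = inner_interface_distance(1_000_000, 2)
```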
Preprint
Divergence constraints are present in the governing equations of many physical phenomena, and they usually lead to a Poisson equation whose solution typically is the main bottleneck of many simulation codes. Algebraic Multigrid (AMG) is arguably the most powerful preconditioner for Poisson's equation, and its effectiveness results from the complementary roles played by the smoother, responsible for damping high-frequency error components, and the coarse-grid correction, which in turn reduces low-frequency modes. This work presents several strategies to make AMG more compute-intensive by leveraging reflection, translational and rotational symmetries, often present in academic and industrial configurations. The best-performing method, AMGR, is based on a multigrid reduction framework that introduces an aggressive coarsening to the multigrid hierarchy, reducing the memory footprint, setup and application costs of the top-level smoother. While preserving AMG's excellent convergence, AMGR allows replacing the standard sparse matrix-vector product with the more compute-intensive sparse matrix-matrix product, yielding significant accelerations. Numerical experiments on industrial CFD applications demonstrated up to 70% speed-ups when solving Poisson's equation with AMGR instead of AMG. Additionally, strong and weak scalability analyses revealed no significant degradation.
... Lately, research efforts for efficiently solving large sparse linear systems have focused on multigrid methods [1,2]. Their applicability and efficiency are based on the use of a stationary method as a smoother for higher frequency components of the error, while the lower frequency components are transferred to a coarser level with higher frequency, which can reduce the error [3,4]. ...
Article
In this paper, we examine deflation-based algebraic multigrid methods for solving large systems of linear equations. Aggregation of the unknown terms is applied for coarsening, while deflation techniques are proposed for improving the rate of convergence. More specifically, the V-cycle strategy is adopted, in which, at each iteration, the solution is computed by initially decomposing it utilizing two complementary subspaces. The approximate solution is formed by combining the solution obtained using multigrids and deflation. In order to improve performance and convergence behavior, the proposed scheme was coupled with the Modified Generic Factored Approximate Sparse Inverse preconditioner. Furthermore, a parallel version of the multigrid scheme is proposed for multicore parallel systems, improving the performance of the techniques. Finally, characteristic model problems are solved to demonstrate the applicability of the proposed schemes, while numerical results are given.
... In contrast to Classical AMG, in this case, the coarsening consists of aggregating several fine nodes in one coarse level unknown and the interpolation operator is constructed by interpolating exactly a few approximations of the near-kernel [26,27]. Since then, many other multigrid variations have appeared in the literature, e.g., the element based AMG family, with energy-minimization AMGe [6], element-free AMGe [13] and spectral AMGe [9], but also the adaptive and Bootstrap AMG (BAMG) [3,4,7,10,18], where the near-kernel of the operator is approximated adaptively during the AMG setup stage. Despite many differences, all the above methods perform coarsening based on a C/F partitioning or aggregation of the unknowns, hence the common usage of distinguishing between classical or aggregationbased AMG. ...
... 229,230 . The newly introduced adaptive smoothing and prolongation-based algebraic multigrid (AMG) solver, designed for ill-conditioned nonlinear problems of structural mechanics (a broader problem space, in part of which fractal-like patterns can emerge), shows robust performance both for benchmark models and real-world calculations (such as fine-microstructure composite-made mechanical tools), a key feature employed in the approach being the adaptive factored sparse approximate inverse [231][232][233] . A more general study of conditioning the AMG for elliptic PDEs is conducted in ref. 234 . ...
Article
The complex interplay between chemistry, microstructure, and behavior of many engineering materials has been investigated predominantly by experimental methods. Parallel to the increase in computer power, advances in computational modeling methods have resulted in a level of sophistication which is comparable to that of experiments. At the continuum level, one class of such models is based on continuum thermodynamics, phase-field methods, and crystal plasticity, facilitating the account of multiple physical mechanisms (multi-physics) and their interaction during microstructure evolution. This paper reviews the status of simulation approaches and software packages in this field and gives an outlook towards promising research directions.
... A preconditioner is a cheap approximation of the matrix inverse operation [65], whose effective design often requires "a combination of art and science" [63]. There are several different preconditioner types, among them the most known are incomplete factorizations [66][67][68], approximate inverses [69][70][71][72], domain decomposition [73][74][75][76] and FETI (finite element tearing and interconnect) methods [77][78][79][80][81], and geometric and algebraic multigrid [82][83][84][85][86][87][88][89][90][91][92][93][94]. Not all of them show a linear complexity and/or a good parallel behavior. ...
Article
We present a family of preconditioning strategies for the contact problem in fractured and faulted porous media. We combine low-order continuous finite elements to simulate the bulk deformation with piecewise constant Lagrange multipliers to impose the frictional contact constraints. This formulation is not uniformly inf-sup stable and requires stabilization. We improve previous work by Franceschini et al. (2020) by introducing a novel jump stabilization technique that requires only local geometrical and mechanical properties. We then design scalable preconditioning strategies that take advantage of the block structure of the Jacobian matrix using a physics-based partitioning of the unknowns by field type, namely displacement and Lagrange multipliers. The key to the success of the proposed preconditioners is a pseudo-Schur complement obtained by eliminating the Lagrange multiplier degrees of freedom, which can then be efficiently solved using an optimal multigrid method. Numerical results, including complex real-world problems, are presented to illustrate theoretical properties, scalability and robustness of the preconditioner. A comparison with other approaches available in the literature is also provided.
... The communication-avoiding version of Arnoldi (CA-Arnoldi) of Hoemmen [62], also implemented by Fahmy [41], the adaptive smoothing and prolongation algebraic multigrid (aSP-AMG) of Magri et al. [63], also applied by Fahmy [43], and the regularized preconditioner of Badahmane [64], also used by Fahmy [65], were compared with each other in Table 3. This table reports the iteration number (IT), CPU time, relative residual (RES) and error (ERR) of the tested iteration methods with respect to different values of In the considered special case, the boundary element model of the considered example, the boundary has been discretized using 84 linear boundary elements and 404 internal points as shown in Fig. 2, and the results of temperature and displacements are plotted in Figs. 9, 10, and 11. ...
Article
The main aim of this article is to develop a new boundary element method (BEM) algorithm to model and simulate the nonlinear thermal stresses problems in micropolar functionally graded anisotropic (FGA) composites with temperature-dependent properties. Some inside points are chosen to treat the nonlinear terms and domain integrals. An integral formulation which is based on the use of Kirchhoff transformation is firstly used to simplify the transient heat conduction governing equation. Then, the residual nonlinear terms are carried out within the current formulation. The domain integrals can be effectively treated by applying the Cartesian transformation method (CTM). In the proposed BEM technique, the nonlinear temperature is computed on the boundary and some inside domain integral. Then, nonlinear displacement can be calculated at each time step. With the calculated temperature and displacement distributions, we can obtain the values of nonlinear thermal stresses. The efficiency of our proposed methodology has been improved by using the communication-avoiding versions of the Arnoldi (CA-Arnoldi) preconditioner for solving the resulting linear systems arising from the BEM to reduce the iterations number and computation time. The numerical outcomes establish the influence of temperature-dependent properties on the nonlinear temperature distribution, and investigate the effect of the functionally graded parameter on the nonlinear displacements and thermal stresses, through the micropolar FGA composites with temperature-dependent properties. These numerical outcomes also confirm the validity, precision and effectiveness of the proposed modeling and simulation methodology.
... According to Fahmy [18], the nonlinear operator can be written as Y'Φ x; P " = Y ̅ Φ x; P + According to the efficiency comparison of Fahmy [19] for the communication-avoiding Arnoldi (CA-Arnoldi) [20], adaptive smoothing and prolongation algebraic multigrid (aSP-AMG) [21], which was also implemented by Fahmy [22], and regularized [23] preconditioners, the efficiency of our proposed method has been improved by utilizing the CA-Arnoldi preconditioner, as in Fahmy [19], for solving the linear equations arising from the BEM, in order to decrease the number of iterations and the computation time. ...
Article
The main aim of this paper is to propose a new boundary element method (BEM) formulation for solving the nonlinear space-time fractional dual-phase-lag bio-heat transfer problems during electromagnetic radiation. Because BEM does not require a discretization of the interior of the treated region and has low RAM and CPU time requirements, it is a flexible and efficient tool for modeling bio-heat transfer problems. The efficiency of our proposed methodology has been improved by applying the communication-avoiding version of the Arnoldi (CA-Arnoldi) preconditioner for solving the linear systems arising from the BEM, reducing the number of iterations and the CPU time. Numerical results are depicted graphically to show the effects of time-fractional derivative order and space-fractional derivative order on the nonlinear temperature distributions. The numerical results also show the significant differences between the nonlinear temperature distributions of the classical Fourier, single-phase-lag, and dual-phase-lag bio-heat conduction models. To demonstrate the validity and accuracy of the proposed BEM methodology, numerical solutions for a two-dimensional (2D) special case of the nonlinear space-time fractional dual-phase-lag bio-heat transfer problems are obtained and compared to experimental results, the Legendre wavelet collocation method (LWCM), and the fractional-order Legendre functions and Galerkin method (FOLFs-GM).
... SoC (12) is generally used in smoothed aggregation AMG [51] and usually gives good results in structural problems. Finally, SoC (13) has been introduced in [36] and, though requiring a rather expensive computation, it is able to accurately capture anisotropies as is shown in [41]. After SoC is computed for every pair of nodes, weak connections are eliminated to determine a Maximum Independent Set (MIS) of nodes that will become coarse nodes in the next level. ...
Preprint
The numerical simulation of physical systems has become in recent years a fundamental tool to perform analyses and predictions in several application fields, spanning from industry to academia. As far as large scale simulations are concerned, one of the most computationally expensive tasks is the solution of linear systems arising from the discretization of the partial differential equations governing the physical processes. This work presents Chronos, a collection of linear algebra functions specifically designed for the solution of large, sparse linear systems on massively parallel computers (https://www.m3eweb.it/chronos/). Its emphasis is on modern, effective and scalable AMG preconditioners for High Performance Computing (HPC). This work describes the numerical algorithms and the main structures of this software suite, especially from the implementation standpoint. Several numerical results arising from practical mechanics and fluid dynamics applications with hundreds of millions of unknowns are addressed and compared with other state-of-the-art linear solvers, proving Chronos efficiency and robustness.
... Moreover, a low-rank acceleration of the polynomial preconditioner will be investigated, following e.g. [21] by exploiting the well separation of the smallest eigenvalues provided by our polynomial preconditioner. We finally observe that the described approach can be applied whenever a first level parallel preconditioner is at hand in factored form, say 0 = , to obtain a second level preconditioner applying the Newton-Chebyshev polynomials to the matrix . ...
Article
In this note we exploit polynomial preconditioners for the Conjugate Gradient method to solve large symmetric positive definite linear systems in a parallel environment. We put in connection a specialized Newton method to solve the matrix equation $X^{-1} = A$ and the Chebyshev polynomials for preconditioning. We propose a simple modification of one parameter which avoids clustering of extremal eigenvalues in order to speed up convergence. We provide results on very large matrices (up to 8.6 billion unknowns in a parallel environment) showing the efficiency of the proposed class of preconditioners.
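As background for the Newton method mentioned in this abstract, the following sketch shows the classical Newton (Newton-Schulz) iteration for the matrix equation X^{-1} = A, namely X_{k+1} = X_k (2I - A X_k). Starting from a scaled identity, every iterate is a polynomial in A, which is what makes the link to polynomial preconditioning possible. This is the textbook iteration only. The specialized variant, the Chebyshev connection, and the parameter modification proposed in the note are not reproduced, and the starting scaling and test matrix are assumptions.

```python
import numpy as np

def newton_inverse(A, steps):
    """Classical Newton-Schulz iteration for X ~ A^{-1} (A assumed SPD)."""
    n = A.shape[0]
    I = np.eye(n)
    # The 1-norm bounds the largest eigenvalue, so alpha*A has spectrum in (0, 1]
    alpha = 1.0 / np.linalg.norm(A, 1)
    X = alpha * I
    residuals = [np.linalg.norm(I - A @ X, 2)]
    for _ in range(steps):
        X = X @ (2.0 * I - A @ X)          # X_{k+1} = 2 X_k - X_k A X_k
        residuals.append(np.linalg.norm(I - A @ X, 2))
    return X, residuals

# Small, well-conditioned SPD test matrix (illustrative choice)
n = 8
A = 4 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
X, residuals = newton_inverse(A, steps=5)
```

The residual I - A X_k is squared at every step, so convergence is quadratic once the initial residual norm is below one. In a preconditioning setting one would apply the polynomial through matrix-vector products rather than forming X explicitly as done here.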
... However, in challenging real world problems such as those arising from structural mechanics or fluid flow in highly heterogeneous formations, standard AMG solvers may be slow to converge or even fail, so that more advanced approaches are needed. In particular, the use of powerful smoothers based on approximate inverses can be of great help as shown, for instance, in [31,13]. ...
Preprint
The solution of linear systems of equations is a central task in a number of scientific and engineering applications. In many cases the solution of linear systems may take most of the simulation time thus representing a major bottleneck in the further development of scientific and technical software. For large scale simulations, nowadays accounting for several millions or even billions of unknowns, it is quite common to resort to preconditioned iterative solvers for exploiting their low memory requirements and, at least potential, parallelism. Approximate inverses have been shown to be robust and effective preconditioners in various contexts. In this work, we show how adaptive FSAI, an approximate inverse characterized by a very high degree of parallelism, can be successfully implemented on a distributed memory computer equipped with GPU accelerators. Taking advantage of GPUs in adaptive FSAI set-up is not a trivial task, nevertheless we show through an extensive numerical experimentation how the proposed approach outperforms more traditional preconditioners and results in a close-to-ideal behaviour in challenging linear algebra problems.
... The aSP-AMG approach [Franceschini et al., 2018;Magri et al., 2019] is very suitable for efficient implementation in Matlab (R2018a). ...
Article
The main aim of this paper is to introduce a new memory-dependent derivative theory to contribute to the development of technological and industrial applications of anisotropic smart materials. This theory is called three-temperature anisotropic generalized micropolar piezothermoelasticity. The governing equations of the proposed theory are very difficult to solve analytically because of material anisotropy and its nonlinear properties. Therefore, we propose a new boundary element formulation for solving such equations. The efficiency of our proposed technique has been improved by using an adaptive smoothing and prolongation algebraic multigrid (aSP-AMG) preconditioner to reduce the computation time. The numerical results are presented highlighting the effects of the kernel function and time delay on the temperature and displacements. The numerical results also verify the validity and accuracy of the proposed methodology. It can be concluded from the numerical results of our current complex and general study that some well-known uncoupled, coupled and generalized theories of anisotropic micropolar piezothermoelasticity can be connected with the three-temperature radiative heat conduction to characterize the deformation of anisotropic micropolar piezothermoelastic structures in the context of memory-dependent derivative.
Article
Designing the topology of three-dimensional structures is a challenging problem due to its memory and time consumption. In this paper, we present a robust and efficient algorithm for solving large-scale 3D topology optimization problems. The robustness of the algorithm is ensured by adopting a globally convergent sequential linear programming method with a stopping criterion based on the first-order optimality conditions of the nonlinear problem. To increase the algorithm’s efficiency, it is combined with a multiresolution scheme that employs different discretizations to deal with displacement, design, and density variables. In addition, the time spent solving the linear equilibrium systems is substantially reduced using multigrid as a preconditioner for the conjugate gradient method. Since multiresolution can lead to the appearance of unwanted artefacts in the structure, we propose an adaptive strategy for increasing the degree of the displacement elements, with a technique for suppressing unnecessary variables that provides accurate solutions with a moderate impact on the algorithm’s performance. We also propose a new thresholding strategy, based on gradient information, to obtain structures composed only by solid or void regions. Computational experiments carried out in Matlab prove that the new algorithm effectively generates high-resolution structures at a low computational cost.
Article
In a sequence of papers, the author examined several statistical affinity measures for selecting the coarse degrees of freedom (CDOFs) or coarse nodes (Cnodes) in algebraic multigrid (AMG) for systems of elliptic partial differential equations (PDEs). These measures were applied to a set of relaxed vectors that exposes the problematic error components. Once the CDOFs are determined using any one of these measures, the interpolation operator is constructed in a bootstrap AMG (BAMG) procedure. However, in a recent paper of Kahl and Rottmann, the statistical least angle regression (LARS) method was utilized in the coarsening procedure and shown to be promising in the CDOF selection. This method is generally used in the statistics community to select the most relevant variables in constructing a parsimonious model for a very complicated and high‐dimensional model or data set (i.e., variable selection for a “reduced” model). As pointed out by Kahl and Rottmann, the LARS procedure has the ability to detect group relations between variables, which can be more useful than binary relations that are derived from strength‐of‐connection, or affinity measures, between pairs of variables. Moreover, by using an updated Cholesky factorization approach in the regression computation, the LARS procedure can be performed efficiently even when the original set of variables is large; and due to the LARS formulation itself (i.e., its $\ell_1$‐norm constraint), sparse interpolation operators can be generated. In this article, we extend the LARS coarsening approach to systems of PDEs. Furthermore, we incorporate some modifications to the LARS approach based on the so‐called elastic net and relaxed lasso methods, which are well known and thoroughly analyzed in the statistics community for ameliorating several major issues with LARS as a variable selection procedure. 
We note that the original LARS coarsening approach may have addressed some of these issues in similar or other ways but due to the limited details provided there, it is difficult to determine the extent of their similarities. Incorporating these modifications (or effecting them in similar ways) leads to improved robustness in the LARS coarsening procedure, and numerical experiments indicate that the changes lead to faster convergence in the multigrid method. Moreover, the relaxed lasso modification permits an indirect BAMG (iBAMG) extension to the interpolation operator. This iBAMG extension applied in an intra‐ or inter‐variable interpolation setting (i.e., nodal‐based coarsening), as well as in variable‐based coarsening, which will not preserve the nodal structure of a finest‐level discretization on the lower levels of the multilevel hierarchy, will be examined. For the variable‐based coarsening, because of the parsimonious feature of LARS, the performance is reasonably good when applied to systems of PDEs albeit at a substantial additional cost over a nodal‐based procedure.
Preprint
A low-synchronization MGS-GMRES Krylov solver employing a truncated Neumann series for the inverse compact $WY$ MGS correction matrix $T$ is presented. A corollary to the backward stability result of Paige et al. [1] establishes that $T = I - L_k$ is sufficient for convergence of GMRES when $\|L\|_F^p = O(\varepsilon^p)\,\kappa_F^p(B)$, $p > 1$, where $B = [r_0, AV_m]$ with condition number $\kappa(B)$. The columns of the strictly lower triangular matrix $L$ are defined by matrix-vector products of Krylov vectors $V_{1:k-2}^T v_{k-1}$. The preconditioner is the classical Ruge-Stüben AMG algorithm with compatible relaxation and inner-outer Gauss-Seidel smoother. This smoother may also be expressed as a truncated Neumann series. Despite the rapid convergence of GMRES-AMG, the cost of an elliptic pressure solver (e.g. for the Navier-Stokes equations) is still substantial. Drop tolerances are applied to the strictly lower triangular matrices arising in the smoother in order to reduce the number of non-zeros and accelerate the time to solution. The number of small matrix elements is found to increase from fine to coarse levels and thus the efficiency gains are greater for large problems with many levels in the V-cycle. The solver is applied to the pressure continuity equation for the incompressible Navier-Stokes equations. The pressure solve time is reduced considerably without a change in the convergence rate.
Article
The solution of linear systems of equations is a central task in a number of scientific and engineering applications. In many cases the solution of linear systems may take most of the simulation time thus representing a major bottleneck in the further development of scientific and technical software. For large scale simulations, nowadays accounting for several millions or even billions of unknowns, it is quite common to resort to preconditioned iterative solvers for exploiting their low memory requirements and, at least potential, parallelism. Approximate inverses have been shown to be robust and effective preconditioners in various contexts. In this work, we show how adaptive Factored Sparse Approximate Inverse (aFSAI), characterized by a very high degree of parallelism, can be successfully implemented on a distributed memory computer equipped with GPU accelerators. Taking advantage of GPUs in adaptive FSAI set-up is not a trivial task, nevertheless we show through an extensive numerical experimentation how the proposed approach outperforms more traditional preconditioners and results in a close-to-ideal behavior in challenging linear algebra problems.
Article
One of the most time-consuming tasks in the procedures for the numerical study of PDEs is the solution to linear systems of equations. To that purpose, iterative solvers are viewed as a promising alternative to direct methods on high performance computers since, in theory, they are almost perfectly parallelizable. Their main drawback is the need to find a suitable preconditioner to accelerate convergence. The Factorized Sparse Approximate Inverse (FSAI), mainly in its adaptive form, has proven to be an effective parallel preconditioner for several problems. In the present work, we report on two novel ideas to dynamically compute, on Graphics Processing Units, the FSAI sparsity pattern, which is the main task in its set-up. The first approach, borrowed from the CPU implementation, uses a global array as a non-zero indicator, whereas the second one relies on a merge-sort procedure of multiple arrays. We will show that the second approach requires significantly less memory and overcomes issues related to the limited global memory available on GPUs. Numerical tests prove that the GPU implementation of FSAI allows for an average speed-up of 7.5 over a parallel CPU implementation. Moreover, we will show that the preconditioner computation is still feasible using single precision arithmetic with a further 20% reduction of the set-up cost. Finally, the strong scalability of the overall approach is shown in a multi-GPU setting.
Article
This article has two main objectives: one is to describe some extensions of an adaptive Algebraic Multigrid (AMG) method of the form previously proposed by the first and third authors, and a second one is to present a new software framework, named BootCMatch, which implements all the components needed to build and apply the described adaptive AMG both as a stand-alone solver and as a preconditioner in a Krylov method. The adaptive AMG presented is meant to handle general symmetric and positive definite (SPD) sparse linear systems, without assuming any a priori information of the problem and its origin; the goal of adaptivity is to achieve a method with a prescribed convergence rate. The presented method exploits a general coarsening process based on aggregation of unknowns, obtained by a maximum weight matching in the adjacency graph of the system matrix. More specifically, a maximum product matching is employed to define an effective smoother subspace (complementary to the coarse space), a process referred to as compatible relaxation, at every level of the recursive two-level hierarchical AMG process. Results on a large variety of test cases and comparisons with related work demonstrate the reliability and efficiency of the method and of the software.
Article
This paper describes a software package called EVSL (for EigenValues Slicing Library) for solving large sparse real symmetric standard and generalized eigenvalue problems. As its name indicates, the package exploits spectrum slicing, a strategy that consists of dividing the spectrum into a number of subintervals and extracting eigenpairs from each subinterval independently. In order to enable such a strategy, the methods implemented in EVSL rely on a quick calculation of the spectral density of a given matrix, or a matrix pair. What distinguishes EVSL from other currently available packages is that EVSL relies entirely on filtering techniques. Polynomial and rational filtering are both implemented and are coupled with Krylov subspace methods and the subspace iteration algorithm. On the implementation side, the package offers interfaces for various scenarios including matrix-free modes, whereby the user can supply his/her own functions to perform matrix-vector operations or to solve sparse linear systems. The paper describes the algorithms in EVSL, provides details on their implementations, and discusses performance issues for the various methods.
Article
The use of factorized sparse approximate inverse (FSAI) preconditioners in a standard multilevel framework for symmetric positive definite (SPD) matrices may pose a number of issues as to the definiteness of the Schur complement at each level. The present work introduces a robust multilevel approach for SPD problems based on FSAI preconditioning, which eliminates the chance of algorithmic breakdowns independently of the preconditioner sparsity. The multilevel FSAI algorithm is further enhanced by introducing descending and ascending low-rank corrections, thus giving rise to the multilevel FSAI with low-rank corrections (MFLR) preconditioner. The proposed algorithm is investigated in a number of test problems. The numerical results show that the MFLR preconditioner is a robust approach that can significantly accelerate the solver convergence rate preserving a good degree of parallelism. The possibly large set-up cost, mainly due to the computation of the eigenpairs needed by low-rank corrections, makes its use attractive in applications where the preconditioner can be recycled along a number of linear solves.
Article
Full-text available
This paper provides a unified and detailed presentation of root-node style algebraic multigrid (AMG). AMG is a popular and effective iterative method for solving large, sparse linear systems that arise from discretizing partial differential equations. However, while AMG is designed for symmetric positive definite matrices (SPD), certain SPD problems, such as anisotropic diffusion, are still not adequately addressed by existing methods. The focus of this paper is on so-called root-node AMG, which can be viewed as a combination of classical and aggregation-based multigrid. An algorithm for root-node is outlined and theoretical motivation is provided, and a filtering strategy is developed, which is able to control the cost of using root-node AMG, particularly on difficult problems. Numerical results are then presented demonstrating the robust ability of root-node to solve systems-based problems, non-symmetric problems, and difficult SPD problems, including unstructured strongly anisotropic diffusion, in a scalable manner. Detailed estimates of the computational cost of the setup and solve phase are given for each example, providing additional support for root-node AMG over alternative methods.
Article
Full-text available
Graphics Processing Units (GPUs) exhibit significantly higher peak performance than conventional CPUs. However, in general only highly parallel algorithms can exploit their potential. In this scenario, the iterative solution to sparse linear systems of equations could be carried out quite efficiently on a GPU as it requires only matrix-by-vector products, dot products, and vector updates. However, to be really effective, any iterative solver needs to be properly preconditioned and this represents a major bottleneck for a successful GPU implementation. Due to its inherent parallelism, the factored sparse approximate inverse (FSAI) preconditioner represents an optimal candidate for the conjugate gradient-like solution of sparse linear systems. However, its GPU implementation requires a nontrivial recasting of multiple computational steps. We present our GPU version of the FSAI preconditioner along with a set of results that show how a noticeable speedup with respect to a highly tuned CPU counterpart is obtained.
Article
Full-text available
Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a Thick-Restart version of the Lanczos algorithm with deflation ('locking') and a new type of polynomial filters obtained from a least-squares technique. The resulting algorithm can be utilized in a 'spectrum-slicing' approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different sub-intervals independently from one another.
Article
Full-text available
This paper presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits the domain decomposition method and low-rank corrections. The domain decomposition approach decouples the matrix and once inverted, a low-rank approximation is applied by exploiting the Sherman-Morrison-Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisons with other distributed-memory preconditioning methods are presented.
Article
Full-text available
Bootstrap algebraic multigrid (BAMG) is a multigrid-based solver for matrix equations of the form Ax = b. Its aim is to automatically determine the interpolation weights used in algebraic multigrid by locally fitting a set of test vectors that have been relaxed as solutions to the corresponding homogeneous equation, Ax = 0. This paper studies an improved form of BAMG, called relaxation-corrected bootstrap algebraic multigrid (rBAMG), that involves adding scaled residuals of the test vectors to the least-squares equations. The basic rBAMG scheme was introduced in an earlier paper [1] and analyzed on a simple model problem. The purpose of the current paper is to further develop this algorithm by incorporating several new critical components and to systematically study its performance on an interesting model problem from quantum chromodynamics. Whereas the earlier paper introduced a new least-squares principle involving the residuals of the test vectors, a simple extrapolation scheme is developed here to accurately estimate the convergence factors of the evolving algebraic multigrid solver. Such a capability is essential to the effective development of a fast solver, and the approach introduced here is shown numerically to be much more effective than the conventional approach of just observing successive error reduction factors. Another component of the setup process developed here is an adaptive cycling process. This component assesses the effectiveness of the V-cycle constructed in the initial rBAMG phase by applying it to the homogeneous equation. When poor convergence is observed, the set of test vectors is enhanced with the resulting error, enabling the subsequent least-squares fit of interpolation to produce an improved V-cycle. A related component is the scaling and recombination Ritz process that targets the so-called weak approximation property in an attempt to reveal the important elements of these evolving error and test vector spaces.
The aim of the numerical study documented here is to provide insight into the various design choices that arise in the development of an rBAMG algorithm. With this in mind, the results for quantum chromodynamics focus on the behavior of rBAMG in terms of the number of initial test vectors used, the number of relaxation sweeps applied to them, and the size of the target matrices. Copyright © 2012 John Wiley & Sons, Ltd.
Article
Full-text available
Large discontinuities in material properties, such as encountered in composite materials, lead to ill-conditioned systems of linear equations. These discontinuities give rise to small eigenvalues that may negatively affect the convergence of iterative solution methods such as the Preconditioned Conjugate Gradient (PCG) method. This paper considers the Deflated Preconditioned Conjugate Gradient (DPCG) method for solving such systems. Our deflation technique uses as the deflation space the rigid body modes of sets of elements with homogeneous material properties. We show that in the deflated spectrum the small eigenvalues are mapped to zero and no longer negatively affect the convergence. We justify our approach through mathematical analysis and we show with numerical experiments on both academic and realistic test problems that the convergence of our DPCG method is independent of discontinuities in the material properties.
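The statement that deflation maps the small eigenvalues to zero can be checked on a toy problem. The sketch below is illustrative only: it assumes the usual deflation projector P = I - A Z (Z^T A Z)^{-1} Z^T and, as an idealization, takes the deflation space Z to be spanned by the exact eigenvectors of the two smallest eigenvalues, whereas the paper uses rigid body modes of element sets. The function name `deflation_projector` and the test matrix are hypothetical.

```python
import numpy as np

def deflation_projector(A, Z):
    """Return P = I - A Z (Z^T A Z)^{-1} Z^T for SPD A and full-rank Z."""
    AZ = A @ Z
    E = Z.T @ AZ  # small Galerkin matrix Z^T A Z
    return np.eye(A.shape[0]) - AZ @ np.linalg.solve(E, Z.T)

# Toy SPD matrix with two small eigenvalues (hypothetical test data).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthogonal basis
lam = np.array([1e-4, 1e-3, 1.0, 2.0, 3.0])
A = Q @ np.diag(lam) @ Q.T

# Idealized deflation space: the eigenvectors of the two small eigenvalues.
Z = Q[:, :2]
P = deflation_projector(A, Z)
PA = P @ A  # deflated operator; symmetric up to rounding

# Symmetrize before calling eigvalsh to suppress rounding asymmetry.
deflated_eigs = np.sort(np.linalg.eigvalsh((PA + PA.T) / 2))
```

In this idealized setting the deflated operator annihilates Z, so its nonzero spectrum runs from 1 to 3 rather than from 1e-4 to 3.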
Article
Full-text available
We propose a new general algorithm for constructing interpolation weights in algebraic multigrid (AMG). It exploits a proper extension mapping outside a neighborhood about a fine degree of freedom (dof) to be interpolated. The extension mapping provides boundary values (based on the coarse dofs used to perform the interpolation) at the boundary of the neighborhood. The interpolation value is then obtained by matrix dependent harmonic extension of the boundary values into the interior of the neighborhood. We describe the method, present examples of useful extension operators, provide a two-grid analysis of model problems, and, by way of numerical experiments, demonstrate the successful application of the method to discretized elliptic problems.
Article
Full-text available
Many matrix equations are either inherently discrete (e. g., in geodesy) or for certain practical purposes remote from their origin (e. g., a finite element discretization on a preselected irregular grid). AMG is an algorithm designed to solve such problems by using information contained only in the matrix while at the same time basing itself on multigrid principles. This paper introduces the basic AMG concepts, develops its foundations, and describes current AMG strategies.
Article
Full-text available
Substantial effort has been focused over the last two decades on developing multilevel iterative methods capable of solving the large linear systems encountered in engineering practice. These systems often arise from discretizing partial differential equations over unstructured meshes, and the particular parameters or geometry of the physical problem being discretized may be unavailable to the solver. Algebraic multigrid (AMG) and multilevel domain decomposition methods of algebraic type have been of particular interest in this context because of their promises of optimal performance without the need for explicit knowledge of the problem geometry. These methods construct a hierarchy of coarse problems based on the linear system itself and on certain assumptions about the smooth components of the error. For smoothed aggregation (SA) multigrid methods applied to discretizations of elliptic problems, these assumptions typically consist of knowledge of the near-kernel or near-nullspace of the weak form. This paper introduces an extension of the SA method in which good convergence properties are achieved in situations where explicit knowledge of the near-kernel components is unavailable. This extension is accomplished in an adaptive process that uses the method itself to determine near-kernel components and adjusts the coarsening processes accordingly.
Article
Full-text available
We introduce spectral AMGe (ρAMGe), a new algebraic multigrid method for solving systems of algebraic equations that arise in Ritz-type finite element discretizations of partial differential equations. The method requires access to the element stiffness matrices, which enables accurate approximation of algebraically "smooth" vectors (i.e., error components that relaxation cannot effectively eliminate). Most other algebraic multigrid methods are based in some manner on predefined concepts of smoothness. Coarse-grid selection and prolongation, for example, are often defined assuming that smooth errors vary slowly in the direction of "strong" connections (relatively large coefficients in the operator matrix). One aim of ρAMGe is to broaden the range of problems to which the method can be successfully applied by avoiding any implicit premise about the nature of the smooth error. ρAMGe uses the spectral decomposition of small collections of element stiffness matrices to determine local representations of algebraically smooth error components. This provides a foundation for generating the coarse level and for defining effective interpolation. This paper presents a theoretical foundation for ρAMGe along with numerical experiments demonstrating its robustness. 1. Introduction. Computational investigation is a vital tool for today's scientists and engineers. Modern computer simulation methods demand increasingly greater speed and accuracy, and are being applied to extraordinarily large systems of equations, with tens or hundreds of millions of unknowns. To solve these systems with the accuracy and speed required, massively parallel computers, and algorithms that effectively exploit their power, are essential. As problems grow larger, algorithms whose performance scales linearly with the problem size are necessary. Unfortunately, many of today's computational simulations use algorithms that do not scale in this sense.
Multigrid methods are scalable for many regular-grid problems. However, they can be extremely difficult to devise for the large unstructured grids that many simulations require. Algebraic Multigrid (AMG) overcomes this difficulty by abstracting, in an algebraic sense, the properties that make geometric multigrid methods effective. Ideally, this results in a method that is automatic and robust. The classical formulation of AMG grew from the efforts of Brandt, McCormick, Ruge, and Stüben in the 1980's. (For details, see (2, 1, 17).) Interest in AMG methods is growing rapidly, both in academia and in industry, because these methods have great potential for solving the large-scale problems common to many modern applications.
Article
Full-text available
Gmsh is an open-source 3-D finite element grid generator with a built-in CAD engine and post-processor. Its design goal is to provide a fast, light and user-friendly meshing tool with parametric input and advanced visualization capabilities. This paper presents the overall philosophy, the main design choices and some of the original algorithms implemented in Gmsh. Copyright © 2009 John Wiley & Sons, Ltd.
Article
Full-text available
Driven by the need to solve linear systems arising from problems posed on extremely large, unstructured grids, there has been a recent resurgence of interest in algebraic multigrid (AMG). AMG is attractive in that it holds out the possibility of multigrid-like performance on unstructured grids. The sheer size of many modern physics and simulation problems has led to the development of massively parallel computers, and has sparked much research into developing algorithms for them. Parallelizing AMG is a difficult task, however. While much of the AMG method parallelizes readily, the process of coarse-grid selection, in particular, is fundamentally sequential in nature. We have previously introduced a parallel algorithm [A.J. Cleary, R.D. Falgout, V.E. Henson, J.E. Jones, in: Proceedings of the Fifth International Symposium on Solving Irregularly Structured Problems in Parallel, Springer, New York, 1998] for the selection of coarse-grid points, based on modifications of certain parallel independent set algorithms and the application of heuristics designed to ensure the quality of the coarse grids, and shown results from a prototype serial version of the algorithm. In this paper we describe an implementation of a parallel AMG code, using the algorithm of A.J. Cleary, R.D. Falgout, V.E. Henson, J.E. Jones [in: Proceedings of the Fifth International Symposium on Solving Irregularly Structured Problems in Parallel, Springer, New York, 1998] as well as other approaches to parallelizing the coarse-grid selection. We consider three basic coarsening schemes and certain modifications to the basic schemes, designed to address specific performance issues. We present numerical results for a broad range of problem sizes and descriptions, and draw conclusions regarding the efficacy of the method. Finally, we indicate the current directions of the research.
Article
Full-text available
We describe the University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications. The Collection is widely used by the numerical linear algebra community for the development and performance evaluation of sparse matrix algorithms. It allows for robust and repeatable experiments: robust because performance results with artificially generated matrices can be misleading, and repeatable because matrices are curated and made publicly available in many formats. Its matrices cover a wide spectrum of domains, including those arising from problems with underlying 2D or 3D geometry (such as structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that typically do not have such geometry (optimization, circuit simulation, economic and financial modeling, theoretical and quantum chemistry, chemical process simulation, mathematics and statistics, power networks, and other networks and graphs). We provide software for accessing and managing the Collection, from MATLAB™, Mathematica™, Fortran, and C, as well as an online search capability. Graph visualization of the matrices is provided, and a new multilevel coarsening scheme is proposed to facilitate this task.
Article
Full-text available
Efficient numerical simulation of physical processes is constrained by our ability to solve the resulting linear systems, prompting substantial research into the development of multi-scale iterative methods capable of solving these linear systems with an optimal amount of effort. Overcoming the limitations of geometric multigrid methods to simple geometries and differential equations, algebraic multigrid methods construct the multigrid hierarchy based only on the given matrix. While this allows for efficient black-box solution of the linear systems associated with discretizations of many elliptic differential equations, it also results in a lack of robustness due to unsatisfied assumptions made on the near null spaces of these matrices. This paper introduces an extension to algebraic multigrid methods that removes the need to make such assumptions by utilizing an adaptive process. Emphasis is on the principles that guide the adaptivity and their application to algebraic multigrid solution of certain symmetric positive-definite linear systems.
Article
Full-text available
We develop an algebraic multigrid (AMG) setup scheme based on the bootstrap framework for multiscale scientific computation. Our approach uses a weighted least squares definition of interpolation, based on a set of test vectors that are computed by a bootstrap setup cycle and then improved by a multigrid eigensolver and a local residual-based adaptive relaxation process. To emphasize the robustness, efficiency, and flexibility of the individual components of the proposed approach, we include extensive numerical results of the method applied to scalar elliptic partial differential equations discretized on structured meshes. As a first test problem, we consider the Laplace equation discretized on a uniform quadrilateral mesh, a problem for which multigrid is well understood. Then, we consider various more challenging variable coefficient systems coming from covariant finite-difference approximations of the two-dimensional gauge Laplacian system, a commonly used model problem in AMG algorithm development for linear systems arising in lattice field theory computations.
Article
In this paper, we consider a classical form of optimal algebraic multigrid (AMG) interpolation that directly minimizes the two-grid convergence rate and compare it with the so-called ideal form that minimizes a certain weak approximation property of the coarse space. We study compatible relaxation type estimates for the quality of the coarse grid and derive a new sharp measure using optimal interpolation that provides a guaranteed lower bound on the convergence rate of the resulting two-grid method for a given grid. In addition, we design a generalized bootstrap algebraic multigrid setup algorithm that computes a sparse approximation to the optimal interpolation matrix. We demonstrate numerically that the BAMG method with sparse interpolation matrix (and spanning multiple levels) outperforms the two-grid method with the standard ideal interpolation (a dense matrix) for various scalar diffusion problems with highly varying diffusion coefficient.
Article
In the numerical simulation of structural problems, a crucial aspect concerns the solution of the linear system arising from the discretization of the governing equations. In fact, ill-conditioned systems, related to an unfavorable eigenspectrum, are quite common in several engineering applications. In these cases the Preconditioned Conjugate Gradient enhanced with the deflation technique seems to be a very promising approach, in particular because an effective deflation space is already at hand. In fact, it is possible to utilize the rigid body motions of the system, which can be calculated easily and cheaply, and only knowledge of the geometry of the problem is required. This paper investigates the advantages of using a Rigid Body Modes Deflated Conjugate Gradient in the solution of challenging systems arising from structural problems. Two different situations are analyzed: the ill-conditioning caused by low constraining is addressed by deflating the total rigid body modes, while the one concerning the heterogeneity of the problem is addressed by using the rigid body modes of separate components. Moreover, the implemented method is highly parallel and therefore suitable for High Performance Computing. Numerical results show how both approaches performed successfully in reducing the overall system solution time and the iterations required for convergence.
Article
This paper gives an overview of AMG methods for solving large-scale systems of equations such as those from the discretization of partial differential equations. AMG is often understood as the acronym of "Algebraic Multi-Grid", but it can also be understood as "Abstract Multi-Grid". Indeed, as this paper demonstrates, an algebraic multigrid method can be better understood at a more abstract level. In the literature, there are a variety of different algebraic multigrid methods that have been developed from different perspectives. In this paper, we try to develop a unified framework and theory that can be used to derive and analyze different algebraic multigrid methods in a coherent manner. Given a smoother $R$ for a matrix $A$, such as Gauss-Seidel or Jacobi, we prove that the optimal coarse space of dimension $n_c$ is the span of the eigenvectors corresponding to the first $n_c$ eigenvalues of $\bar{R}A$ (with $\bar{R} = R + R^T - R^T A R$). We also prove that this optimal coarse space can be obtained by a constrained trace-minimization problem for a matrix associated with $\bar{R}A$ and demonstrate that the coarse spaces of most existing AMG methods can be viewed as approximate solutions of this trace-minimization problem. Furthermore, we provide a general approach to the construction of a quasi-optimal coarse space and we prove that under appropriate assumptions the resulting two-level AMG method for the underlying linear system converges uniformly with respect to the size of the problem, the coefficient variation, and the anisotropy. Our theory applies to most existing multigrid methods, including the standard geometric multigrid method, the classic AMG, energy-minimization AMG, unsmoothed and smoothed aggregation AMG, and spectral AMGe.
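The symmetrized smoother in this abstract, R̄ = R + R^T - R^T A R, satisfies I - R̄A = (I - R^T A)(I - RA), i.e. it represents one sweep with R followed by one sweep with R^T. The sketch below checks this identity numerically and extracts a candidate coarse space. It is illustrative only: the 1D Laplacian test matrix, the Gauss-Seidel choice R = (D + L)^{-1}, and all variable names are assumptions, and reading "first n_c" as the n_c smallest eigenvalues of R̄A is an interpretation, not a statement from the abstract.

```python
import numpy as np

n = 8
# Hypothetical SPD test matrix: 1D Laplacian stencil [-1, 2, -1].
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Gauss-Seidel smoother: R is the inverse of the lower triangle of A.
R = np.linalg.inv(np.tril(A))

# Symmetrized smoother from the abstract.
R_bar = R + R.T - R.T @ A @ R

# Error propagation of a forward sweep followed by a backward sweep.
I = np.eye(n)
lhs = I - R_bar @ A
rhs = (I - R.T @ A) @ (I - R @ A)

# R_bar A is similar to a symmetric matrix, so its eigenvalues are real;
# np.linalg.eig is used because R_bar A itself is not symmetric.
evals, evecs = np.linalg.eig(R_bar @ A)
order = np.argsort(evals.real)

# Assumed reading of "first n_c": the n_c smallest eigenvalues.
n_c = 3
coarse_basis = evecs[:, order[:n_c]].real
```

The eigenvectors selected here correspond to the error components that the symmetrized smoother damps most slowly.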
Article
A class of preconditioners based on balancing domain decomposition by constraints methods is introduced in the Portable, Extensible Toolkit for Scientific Computation (PETSc). The algorithm and the underlying nonoverlapping domain decomposition framework are described with a specific focus on their current implementation in the library. Available user customizations are also presented, together with an experimental interface to the finite element tearing and interconnecting dual-primal methods within PETSc. Large-scale parallel numerical results are provided for the latest version of the code, which is able to tackle symmetric positive definite problems with highly heterogeneous distributions of the coefficients. Current limitations and future extensions of the preconditioner class are also discussed.
Article
In this paper we present a fully distributed, communicator-aware, recursive, and interlevel-overlapped message-passing implementation of the multilevel balancing domain decomposition by constraints (MLBDDC) preconditioner. The implementation highly relies on subcommunicators in order to achieve the desired effect of coarse-grain overlapping of computation and communication, and communication and communication among levels in the hierarchy (namely, interlevel overlapping). Essentially, the main communicator is split into as many nonoverlapping subsets of message-passing interface (MPI) tasks (i.e., MPI subcommunicators) as levels in the hierarchy. Provided that specialized resources (cores and memory) are devoted to each level, a careful rescheduling and mapping of all the computations and communications in the algorithm lets a high degree of overlapping be exploited among levels. All subroutines and associated data structures are expressed recursively, and therefore MLBDDC preconditioners with an arbitrary number of levels can be built while re-using significant and recurrent parts of the codes. This approach leads to excellent weak scalability results as soon as level-1 tasks can fully overlap coarser-levels duties. We provide a model to indicate how to choose the number of levels and coarsening ratios between consecutive levels and determine qualitatively the scalability limits for a given choice. We have carried out a comprehensive weak scalability analysis of the proposed implementation for the three-dimensional Laplacian and linear elasticity problems on structured and unstructured meshes. Excellent weak scalability results have been obtained up to 458,752 IBM BG/Q cores and 1.8 million MPI tasks, being the first time that exact domain decomposition preconditioners (only based on sparse direct solvers) reach these scales.
Article
The Factorized Sparse Approximate Inverse (FSAI) is an efficient technique for preconditioning parallel solvers of symmetric positive definite sparse linear systems. The key factor controlling FSAI efficiency is the identification of an appropriate nonzero pattern. Currently, several strategies have been proposed for building such a nonzero pattern, using both static and dynamic techniques. This article describes a fresh software package, called FSAIPACK, which we developed for shared memory parallel machines. It collects all available algorithms for computing FSAI preconditioners. FSAIPACK allows for combining different techniques according to any specified strategy, hence enabling the user to thoroughly exploit the potential of each preconditioner, in solving any peculiar problem. FSAIPACK is freely available as a compiled library at http://www.dmsa.unipd.it/~janna/software.html, together with an open-source command language interpreter. By writing a command ASCII file, one can easily perform and test any given strategy for building an FSAI preconditioner. Numerical experiments are discussed in order to highlight the FSAIPACK features and evaluate its computational performance.
Article
In recent years the growing popularity of supercomputers has fostered the development of algorithms able to take advantage of the massive parallelism offered by multiple processors. Direct methods, though robust and computationally efficient, hardly exploit high degrees of parallelism. By contrast, Krylov methods preconditioned by Factored Sparse Approximate Inverses (FSAI) provide, at least in principle, a perfectly parallel approach but are often thwarted by an excessive set-up cost. In this paper we extend the concept of supernode from sparse LU factorizations to approximate inverses, and use it to accelerate the computation of an FSAI-type preconditioner. The numerical experiments on real-world problems show that the overall FSAI efficiency can be significantly increased while preserving its intrinsic parallelism.
Article
This paper considers construction and properties of factorized sparse approximate inverse preconditionings well suited for implementation on modern parallel computers. In the symmetric case such preconditionings have the form $A \to G_L A G_L^T$, where $G_L$ is a sparse approximation, based on minimizing the Frobenius norm $\| I - G_L L_A \|_F$, to the inverse of the lower triangular Cholesky factor $L_A$ of $A$, which is not assumed to be known explicitly. These preconditionings preserve symmetry and/or positive definiteness of the original matrix and, in the case of M-, H-, or block H-matrices, lead to convergent splittings.
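A small dense sketch may help make the construction concrete. It follows the commonly used row-wise recipe for a static FSAI factor: for each row i, solve a small SPD system restricted to the prescribed pattern, then scale so that G A G^T has unit diagonal. The Cholesky factor of A is never formed. The function name `fsai`, the pattern encoding as a list of index sets, and the test matrix are all hypothetical. This is not the adaptive pattern selection (aFSAI) discussed elsewhere on this page.

```python
import numpy as np

def fsai(A, pattern):
    """Static FSAI factor G (lower triangular) for a dense SPD matrix A.

    pattern[i] holds the off-diagonal column indices j < i allowed in row i;
    the diagonal entry is always included.
    """
    n = A.shape[0]
    G = np.zeros((n, n))
    for i in range(n):
        cols = sorted(j for j in pattern[i] if 0 <= j < i) + [i]
        rhs = np.zeros(len(cols))
        rhs[-1] = 1.0  # unit vector selecting the diagonal position
        y = np.linalg.solve(A[np.ix_(cols, cols)], rhs)
        # y[-1] > 0 for SPD A; this scaling gives (G A G^T)_ii = 1.
        G[i, cols] = y / np.sqrt(y[-1])
    return G

n = 6
# Hypothetical SPD test matrix: 1D Laplacian stencil [-1, 2, -1].
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Sparse (bidiagonal) pattern versus the full lower-triangular pattern.
G_sparse = fsai(A, [{i - 1} for i in range(n)])
G_full = fsai(A, [set(range(i)) for i in range(n)])
```

With the full lower-triangular pattern the recipe reproduces the exact inverse Cholesky factor, so G A G^T is the identity. With a sparser pattern only the unit diagonal is guaranteed, and the quality of the approximation depends on the pattern.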
Article
This paper provides an overview of the main ideas driving the bootstrap algebraic multigrid methodology, including compatible relaxation and algebraic distances for defining effective coarsening strategies, the least squares method for computing accurate prolongation operators and the bootstrap cycles for computing the test vectors that are used in the least squares process. We review some recent research in the development, analysis and application of bootstrap algebraic multigrid and point to open problems in these areas. Results from our previous research as well as some new results for some model diffusion problems with highly oscillatory diffusion coefficient are presented to illustrate the basic components of the BAMG algorithm.
Article
An optimization problem is the task of minimizing (or maximizing; for definiteness we discuss minimization) a certain real-valued "objective functional" (or "cost", or "energy", or "performance index", etc.) E(x), possibly under a set of equality and/or inequality constraints, where x = (x_1, …, x_n) is a vector (often the discretization of one or several functions) of unknown variables (real or complex numbers, and/or integers, and/or Ising spins, etc.). A general process for solving such problems is the point-by-point minimization, in which one changes only one variable x_j (or few of them) at a time, lowering E as much as possible in each such step. More generally, the process accepts any candidate change of one or few variables if it causes a drop in energy (δE < 0).
Book
This is a revised edition of a book which appeared close to two decades ago. Someone scrutinizing how the field has evolved in these two decades will make two interesting observations. On the one hand the observer will be struck by the staggering number of new developments in numerical linear algebra during this period. The field has evolved in all directions: theory, algorithms, software, and novel applications. Two decades ago there was essentially no publicly available software for large eigenvalue problems. Today one has a flurry to choose from, and the activity in software development does not seem to be abating. A number of new algorithms appeared in this period as well. I can mention at the outset the Jacobi-Davidson algorithm and the idea of implicit restarts, both discussed in this book, but there are a few others. The most interesting development to the numerical analyst may be the expansion of the realm of eigenvalue techniques into newer and more challenging applications. Or perhaps, the more correct observation is that these applications were always there, but they were not as widely appreciated or understood by numerical analysts, or were not fully developed due to lack of software.
Article
SUMMARY: A smoothed aggregation-based algebraic multigrid solver for anisotropic diffusion problems is presented. Algebraic multigrid is a popular and effective method for solving sparse linear systems that arise from discretizing partial differential equations. However, although algebraic multigrid was designed for elliptic problems, the case of non-grid-aligned anisotropic diffusion is not adequately addressed by existing methods. To achieve scalable performance, it is shown that neither new coarsening nor new relaxation strategies are necessary. Instead, a novel smoothed aggregation approach is developed that combines long-distance interpolation, coarse-grid injection, and an energy-minimization strategy that finds the interpolation weights. Previously developed theory by Falgout and Vassilevski is used to discern that existing coarsening strategies are sufficient, but that existing interpolation methods are not. In particular, an interpolation quality measure tracks 'closeness' to the ideal interpolant and guides the interpolation sparsity pattern choice. Although the interpolation quality measure is computable for only small model problems, an inexact, but computable, measure is proposed for larger problems. This paper concludes with encouraging numerical results that also potentially show broad applicability (e.g., for linear elasticity). Copyright © 2012 John Wiley & Sons, Ltd.
Article
Instead of the standard estimate in terms of the spectral condition number we develop a new CG iteration number estimate depending on the quantity $B = \frac{1}{n}\,\mathrm{tr}\,M / (\det M)^{1/n}$, where M is an n × n preconditioned matrix. A new family of iterative methods for solving symmetric positive definite systems based on B-reducing strategies is described. Numerical results are presented for the new algorithms and compared with several well-known preconditioned CG methods.
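As a small illustration of the quantity B above (the arithmetic mean of the eigenvalues of M divided by their geometric mean), the following sketch evaluates it for a small dense symmetric positive definite matrix. The function names `cholesky_diag` and `kaporin_number` are hypothetical, and the dense Cholesky route to the determinant is an assumption made for brevity, not the procedure of the cited paper.

```python
import math


def cholesky_diag(M):
    """Return the diagonal of the Cholesky factor L (M = L L^T) of an SPD matrix.

    M is a dense list of lists; this is only meant for tiny illustrative cases.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return [L[i][i] for i in range(n)]


def kaporin_number(M):
    """Compute B = (tr(M) / n) / det(M)^(1/n) for an SPD matrix M."""
    n = len(M)
    trace_mean = sum(M[i][i] for i in range(n)) / n
    # det(M) = prod(L_ii)^2, so log det(M) = 2 * sum(log L_ii).
    # Working with logarithms avoids overflow/underflow of the determinant.
    log_det = 2.0 * sum(math.log(d) for d in cholesky_diag(M))
    return trace_mean / math.exp(log_det / n)
```

By the arithmetic-geometric mean inequality, B is at least 1 for any SPD matrix and equals 1 only when M is a multiple of the identity, which is why a strategy that reduces B pushes the preconditioned matrix toward the identity.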
Article
We propose an incomplete Cholesky factorization for the solution of large-scale trust region subproblems and positive definite systems of linear equations. This factorization depends on a parameter p that specifies the amount of additional memory (in multiples of n, the dimension of the problem) that is available; there is no need to specify a drop tolerance. Our numerical results show that the number of conjugate gradient iterations and the computing time are reduced dramatically for small values of p. We also show that in contrast with drop tolerance strategies, the new approach is more stable in terms of number of iterations and memory requirements.
Article
We consider the iterative solution of large sparse linear systems arising from the upwind finite difference discretization of convection-diffusion equations. The system matrix is then an M-matrix with nonnegative row sum, and, further, when the convective flow has zero divergence, the column sum is also nonnegative, possibly up to a small correction term. We investigate aggregation-based algebraic multigrid methods for this class of matrices. A theoretical analysis is developed for a simplified two-grid scheme with one damped Jacobi post-smoothing step. An uncommon feature of this analysis is that it applies directly to problems with variable coefficients; e.g., to problems with recirculating convective flow. On the basis of this theory, we develop an approach in which a guarantee is given on the convergence rate thanks to an aggregation algorithm that allows an explicit control of the location of the eigenvalues of the preconditioned matrix. Some issues that remain beyond the analysis are discussed in light of numerical experiments, and the efficiency of the method is illustrated on a sample of large two and three dimensional problems with highly varying convective flow.
Article
In this paper we describe an Incomplete LU factorization technique based on a strategy which combines two heuristics. This ILUT factorization extends the usual ILU(0) factorization without using the concept of level of fill-in. There are two traditional ways of developing incomplete factorization preconditioners. The first uses a symbolic factorization approach in which a level of fill is attributed to each fill-in element using only the graph of the matrix. Then each fill-in that is introduced is dropped whenever its level of fill exceeds a certain threshold. The second class of methods consists of techniques derived from modifications of a given direct solver by including a drop-off rule, based on the numerical size of the fill-ins introduced, traditionally referred to as threshold preconditioners. The first type of approach may not be reliable for indefinite problems, since it does not consider numerical values. The second is often far more expensive than the standard ILU(0). The strategy we propose is a compromise between these two extremes.
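The abstract does not spell out the two heuristics, so the sketch below follows the dual-threshold rule as ILUT(p, τ) is commonly described: entries smaller than τ times the norm of the original row are dropped, and then at most p entries of largest magnitude are kept on each side of the diagonal. The function name `ilut_drop_row` and its arguments are hypothetical, and the code shows only the dropping step applied to one already-eliminated row, not a full factorization.

```python
def ilut_drop_row(row, diag_index, p, tau, orig_row_norm):
    """Apply a dual-threshold dropping rule to one working row.

    row           -- dict mapping column index -> value (the row after elimination)
    diag_index    -- column index of the diagonal entry, which is always kept
    p             -- maximum number of entries kept in each of the L and U parts
    tau           -- relative drop tolerance
    orig_row_norm -- norm of the corresponding row of the original matrix
    """
    threshold = tau * orig_row_norm

    # Heuristic 1: discard entries that are small relative to the original row.
    kept = {j: v for j, v in row.items()
            if j == diag_index or abs(v) >= threshold}

    # Heuristic 2: bound the fill by keeping only the p largest entries
    # strictly left of the diagonal (L part) and strictly right of it (U part).
    def largest(entries):
        ranked = sorted(entries, key=lambda jv: abs(jv[1]), reverse=True)
        return dict(ranked[:p])

    lower = largest([(j, v) for j, v in kept.items() if j < diag_index])
    upper = largest([(j, v) for j, v in kept.items() if j > diag_index])

    result = {}
    result.update(lower)
    if diag_index in kept:
        result[diag_index] = kept[diag_index]
    result.update(upper)
    return result
```

The first rule takes numerical values into account, unlike a purely graph-based level of fill, while the second caps memory per row, which is what keeps the cost closer to that of ILU(0).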
Chapter
The development of high performance, massively parallel computers and the increasing demands of computationally challenging applications have necessitated the development of scalable solvers and preconditioners. One of the most effective ways to achieve scalability is the use of multigrid or multilevel techniques. Algebraic multigrid (AMG) is a very efficient algorithm for solving large problems on unstructured grids. While much of it can be parallelized in a straightforward way, some components of the classical algorithm, particularly the coarsening process and some of the most efficient smoothers, are highly sequential, and require new parallel approaches. This chapter presents the basic principles of AMG and gives an overview of various parallel implementations of AMG, including descriptions of parallel coarsening schemes and smoothers, some numerical results as well as references to existing software packages.
Article
An algebraic multigrid algorithm for symmetric, positive definite linear systems is developed based on the concept of prolongation by smoothed aggregation. Coarse levels are generated automatically. We present a set of requirements motivated heuristically by a convergence theory. The algorithm then attempts to satisfy the requirements. Input to the method are the coefficient matrix and zero energy modes, which are determined from nodal coordinates and knowledge of the differential equation. Efficiency of the resulting algorithm is demonstrated by computational results on real world problems from solid elasticity, plate bending, and shells.
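The idea of prolongation by smoothed aggregation can be sketched on a toy problem: a piecewise-constant tentative prolongator is built from the aggregates and then smoothed by one damped Jacobi step, P = (I − ω D⁻¹ A) P_tent. The sketch below makes several simplifying assumptions that are not taken from the abstract: a scalar 1D Laplacian, a single constant zero energy mode, user-supplied aggregates, no column normalization, and a fixed ω = 2/3. All function names are hypothetical.

```python
def laplacian_1d(n):
    """Dense 1D Laplacian tridiag(-1, 2, -1) of size n (toy test matrix)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def smoothed_aggregation_prolongator(A, aggregates, omega=2.0 / 3.0):
    """Return P = (I - omega * D^{-1} A) * P_tent as a dense list of lists.

    aggregates[i] is the index of the aggregate (coarse unknown) that owns
    fine unknown i. Column j of P_tent is the indicator vector of aggregate j,
    i.e. the constant vector restricted to that aggregate.
    """
    n = len(A)
    n_coarse = max(aggregates) + 1
    P_tent = [[1.0 if aggregates[i] == j else 0.0 for j in range(n_coarse)]
              for i in range(n)]
    P = [[0.0] * n_coarse for _ in range(n)]
    for i in range(n):
        for j in range(n_coarse):
            # (A * P_tent)[i][j], then one damped Jacobi smoothing step
            a_p = sum(A[i][k] * P_tent[k][j] for k in range(n))
            P[i][j] = P_tent[i][j] - omega * a_p / A[i][i]
    return P
```

With six unknowns split into two aggregates of three, the smoothing step widens each column by one node, so the columns overlap at the aggregate boundary while interior rows still sum to one (constants are reproduced away from the Dirichlet boundary).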
Article
Since the early 1990s, there has been a strongly increasing demand for more efficient methods to solve large sparse, unstructured linear systems of equations. For practically relevant problem sizes, classical one-level methods had already reached their limits and new hierarchical algorithms had to be developed in order to allow an efficient solution of even larger problems. This paper gives a review of the first hierarchical and purely matrix-based approach, algebraic multigrid (AMG). AMG can directly be applied, for instance, to efficiently solve various types of elliptic partial differential equations discretized on unstructured meshes, both in 2D and 3D. Since AMG does not make use of any geometric information, it is a “plug-in” solver which can even be applied to problems without any geometric background, provided that the underlying matrix has certain properties.
Article
Algebraic multigrid methods are designed for the solution of (sparse) linear systems of equations using multigrid principles. In contrast to standard multigrid methods, AMG does not take advantage of the origin of a particular system of equations at hand, nor does it exploit any underlying geometrical situation. Fully automatically and based solely on algebraic information contained in the given matrix, AMG constructs a sequence of “grids” and corresponding operators. A special AMG algorithm will be presented. For a wide range of problems (including certain problems which do not have a continuous background) this algorithm yields an iterative method which exhibits a convergence behavior typical for multigrid methods.
Article
This article surveys preconditioning techniques for the iterative solution of large linear systems, with a focus on algebraic methods suitable for general sparse matrices. Covered topics include progress in incomplete factorization methods, sparse approximate inverses, reorderings, parallelization issues, and block and multilevel extensions. Some of the challenges ahead are also discussed. An extensive bibliography completes the paper.
Article
A rigorous two-level theory is developed for general symmetric matrices (and nonsymmetric ones using Kaczmarz relaxation), without assuming any regularity, not even any grid structure of the unknowns. The theory applies to algebraic multigrid (AMG) processes, as well as to the usual (geometric) multigrid. It yields very realistic estimates and precise answers to basic algorithmic questions, such as: In what algebraic sense does Gauss-Seidel (or Jacobi, Kaczmarz, line Gauss-Seidel, etc.) relaxation smooth the error? When is it appropriate to use block relaxation? What algebraic relations must be satisfied by the coarse-to-fine interpolations? What is the algorithmic role of the geometric origin of the problem? The theory helps to rigorize local mode analyses and locally analyze cases where the latter is inapplicable.
Conference Paper
hypre is a software library for the solution of large, sparse linear systems on massively parallel computers. Its emphasis is on modern powerful and scalable preconditioners. hypre provides various conceptual interfaces to enable application users to access the library in the way they naturally think about their problems. This paper presents the conceptual interfaces in hypre. An overview of the preconditioners that are available in hypre is given, including some numerical results that show the efficiency of the library.
Article
Algebraic multigrid methods solve sparse linear systems Ax=b by automatic construction of a multilevel hierarchy. This hierarchy is defined by grid transfer operators that must accurately capture algebraically smooth error relative to the relaxation method. We propose a methodology to improve grid transfers through energy minimization. The proposed strategy is applicable to Hermitian, non-Hermitian, definite, and indefinite problems. Each column of the grid transfer operator P is minimized in an energy-based norm while enforcing two types of constraints: a defined sparsity pattern and preservation of specified modes in the range of P. A Krylov-based strategy is used to minimize energy, which is equivalent to solving $A P_j = \boldsymbol{0}$ for each column j of P, with the constraints ensuring a nontrivial solution. For the Hermitian positive definite case, a conjugate gradient (CG-)based method is utilized to construct grid transfers, while methods based on generalized minimum residual (GMRES) and CG on the normal equations (CGNR) are explored for the general case. The approach is flexible, allowing for arbitrary coarsenings, unrestricted sparsity patterns, straightforward long-distance interpolation, and general use of constraints, either user-defined or auto-generated. We conclude with numerical evidence in support of the proposed framework.
Article
An adaptive algorithm is presented to generate automatically the nonzero pattern of the block factored sparse approximate inverse (BFSAI) preconditioner. It is demonstrated that in symmetric positive definite (SPD) problems BFSAI minimizes an upper bound to the Kaporin number of the preconditioned matrix. The mathematical structure of this bound suggests an efficient and easily parallelizable strategy for improving the given nonzero pattern of BFSAI, thus providing a novel adaptive BFSAI (ABF) preconditioner. Numerical experiments performed on large-sized finite element problems show that ABF coupled with a block incomplete Cholesky (IC) outperforms BFSAI-IC by up to a factor of 4, preserving the same preconditioner density and exhibiting an excellent parallelization degree.