
Mark Shephard- Rensselaer Polytechnic Institute
387 publications, 39,185 reads, 12,435 citations
Publications (387)
As fusion simulation codes increasingly account for the full geometric complexity of magnetically confined fusion systems, a need arose to provide tailored unstructured mesh technologies to address the specific needs of fusion plasma simulation codes and their coupling to other physics simulation codes. This paper presents a high level overview of...
This article presents MuMFiM, an open-source application for multiscale modeling of fibrous materials on massively parallel computers. MuMFiM uses two scales to represent fibrous materials such as biological network materials (extracellular matrix, connective tissue, etc.). It is designed to make use of multiple levels of parallelism, including dis...
The paper introduces high temperature composite software developed for mechanism-based design of composite structures. Mechanism-based design is characterized by an understanding of the critical composite behaviors at several physical scales: the fibrous (micro) scale, the ply/weave (meso) scale and the laminated part (macro) scale, and by the spec...
This paper presents efforts to improve the hierarchical parallelism of a two scale simulation code. Two methods to improve the GPU parallel performance were developed and compared. The first used the NVIDIA Multi-Process Service and the second moved the entire sub-problem loop into a single kernel using Kokkos hierarchical parallelism and a PackedV...
Accurate RF (Radio Frequency) simulations of fusion systems like ITER require the definition of high-fidelity analysis geometries that include detailed antenna, reactor wall, and physics regions. This paper will describe a workflow for the execution of adaptive high-performance simulations of RF fusion systems. In this workflow, the simulation inpu...
Many engineering problems are characterized as having complex geometry that evolves over time combined with complex physical behaviors. In addition to an appropriate combination of physics models, discretization techniques and numerical methods, the effective simulation of such problems requires a set of methods to track the evolution of the simula...
Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. One of the few viable approaches to achieve high...
This paper presents a parallel interface tracking approach for evolving geometry problems where both the computational domain and mesh are updated as dictated by the analysis. An interface-fitted conforming hybrid/mixed mesh with anisotropic layered elements is used. A combination of mesh motion and mesh modification is employed to update the mesh...
Unstructured mesh particle-in-cell (PIC) simulations executing on the current and next generation of massively parallel systems require new methods for both the mesh and particles to achieve performance and scalability on GPUs. The traditional approach to implementing PIC simulations defines data structures and algorithms in terms of particles with...
We present the Exascale Framework for High Fidelity coupled Simulations (EFFIS), a workflow and code coupling framework developed as part of the Whole Device Modeling Application (WDMApp) in the Exascale Computing Project. EFFIS consists of a library, command line utilities, and a collection of run-time daemons. Together, these software products en...
A three-dimensional (3D) transient finite element formulation for modeling laser processing is presented, with the capability to simulate both additive manufacturing and laser ablation physics. State variables are introduced to model the phase change phenomenon and to track the phase and phase history throughout the simulation. Temperature dependen...
In finite element simulations, not all of the data are of equal importance. In fact, the primary purpose of a numerical study is often to accurately assess only one or two engineering output quantities that can be expressed as functionals. Adjoint-based error estimation provides a means to approximate the discretization error in functional quantiti...
As part of its discretization mandate, CEED is developing adaptive algorithms for mesh refinement, coarsening and parallel rebalancing needed in general unstructured adaptive mesh refinement (AMR) of high-order hexahedral and/or tetrahedral meshes. This milestone provides an update on our developments of adaptive mesh control methods for both confo...
The simulation of phase change processes that occur at high rates, like the collapse of a vapor bubble or the combustion of dense energetic materials, poses significant challenges that include strong discontinuities in select field variables at the interface, high-speed flows in at least one phase, significant role of compressibility, disparate pha...
A general multi-scale strategy is presented for modeling the mechanical environment of a group of neurons that were embedded within a collagenous matrix. The results of the multi-scale simulation are used to estimate the local strains that arise in neurons when the extracellular matrix is deformed. The distribution of local strains was found to dep...
A novel approach and finite element formulation for modeling the melting, consolidation, and re-solidification process that occurs in selective laser melting additive manufacturing is presented. Two state variables are introduced to track the phase (melt/solid) and the degree of consolidation (powder/fully dense). The effect of the consolidation on...
Adjoint-based error estimation provides the ability to approximate the discretization error for a functional quantity of interest, such as point-wise displacements or stresses. Mesh adaptation provides the ability to control the discretization error to obtain more accurate solutions while still remaining computationally feasible. In this paper, we...
We review the concept of stochasticity—i.e., unpredictable or uncontrolled fluctuations in structure, chemistry, or kinetic processes—in materials. We first define six broad classes of stochasticity: equilibrium (thermodynamic) fluctuations; structural/compositional fluctuations; kinetic fluctuations; frustration and degeneracy; imprecision in meas...
The scalability of unstructured mesh-based applications depends on partitioning methods that quickly balance the computational work while reducing communication costs. Zhou et al. [SIAM J. Sci. Comput., 32 (2010), pp. 3201–3227; J. Supercomput., 59 (2012), pp. 1218–1228] demonstrated the combination of (hyper)graph methods with vertex and element p...
Simulating systems with evolving relational structures on massively parallel computers requires the computational work to be evenly distributed across the processing resources throughout the simulation. Adaptive, unstructured, mesh-based finite element and finite volume tools best exemplify this need. We present EnGPar and its diffusive partition im...
This paper presents a set of parallel procedures for anisotropic mesh adaptation accounting for mixed element types used in boundary layer meshes, i.e., the current procedures operate in parallel on distributed boundary layer meshes. The procedures accept an anisotropic mesh metric field as an input for the desired mesh size field and apply local mesh...
Component-based simulation workflows can increase the agility of the design process by streamlining adaptation of new simulation methods. We present one such workflow for parallel unstructured mesh-based simulations and demonstrate its usefulness in the thermomechanical analysis of an array of solder joints used in microelectronics fabrication. We...
Duality-based approaches to estimate errors for functional output quantities require the solution of an auxiliary dual problem. This dual solution must be approximated in a richer function space than the one used for the original problem of interest. A novel strategy for dual enrichment is proposed based on variational multiscale (VMS) methods and...
Reliable mesh‐based simulations are needed to solve complex engineering problems. Mesh adaptivity can increase reliability by reducing discretization errors but requires multiple software components to exchange information. Often, components exchange information by reading and writing a common file format. This file‐based approach becomes a problem...
Topological data structures are useful in many areas, including the various mesh data structures used in finite element and finite volume applications. We present a mesh data structure which is both flexible enough to support general mesh adaptivity and is compactly stored in a few large arrays. This structure can efficiently store full topologica...
The Parallel Unstructured Mesh Infrastructure (PUMI) is designed to support the representation of, and operations on, unstructured meshes as needed for the execution of mesh-based simulations on massively parallel computers. In PUMI, the mesh representation is complete in the sense of being able to provide any adjacency of mesh entities of multiple...
Free-boundary 3D tokamak equilibria and resistive wall instabilities are calculated using a new resistive wall model in the two-fluid M3D-C1 code. In this model, the resistive wall and surrounding vacuum region are included within the computational domain. This implementation contrasts with the method typically used in fluid codes in which the resi...
XGC1 and M3D-C1 are two fusion plasma simulation codes being developed at Princeton Plasma Physics Laboratory. XGC1 uses the particle-in-cell method to simulate gyrokinetic neoclassical physics and turbulence (Chang et al. Phys Plasmas 16(5):056108, 2009; Ku et al. Nucl Fusion 49:115021, 2009; Adams et al. J Phys 180(1):012036, 2009). M3D-(Formula...
Many of the world’s leading supercomputer architectures are a hybrid of shared memory and network-distributed memory. Such an architecture lends itself to a hybrid MPI-thread programming model. We first present an implementation of inter-thread message passing based on the MPI and pthread libraries. In addition, we present an efficient implementati...
Fiber networks are assemblies of one-dimensional elements representative of materials with fibrous microstructures such as collagen networks and synthetic nonwovens. The mechanics of random fiber networks has been the focus of numerous studies. However, fiber crimp has been explicitly represented in only a few cases. In the present work, the mechanic...
Random fiber networks are assemblies of elastic elements connected in random configurations. They are used as models for a broad range of fibrous materials including biopolymer gels and synthetic nonwovens. Although the mechanics of networks made from the same type of fibers has been studied extensively, the behavior of composite systems of fibers...
George Xu, Tianyu Liu, Lin Su, [...], Bob Liu
The Monte Carlo radiation transport community faces a number of challenges associated with peta- and exa-scale computing systems that rely increasingly on heterogeneous architectures involving hardware accelerators such as GPUs and Xeon Phi coprocessors. Existing Monte Carlo codes and methods must be strategically upgraded to meet emerging hardware...
The use of simulation-based engineering taking advantage of massively parallel computing methods by industry is limited due to the costs associated with developing and using high performance computing software and systems. To address industry's ability to effectively include large-scale parallel simulations in daily production use, two key areas ne...
Simulation of wall-bounded turbulent flows poses significant challenges and requires tightly controlled mesh spacing and structure near the walls. Semistructured or hybrid meshes are often used for turbulent boundary layer flows. These meshes not only account for complex geometry but also maintain highly anisotropic, graded, and layered elements ne...
This paper presents a curved meshing technique for unstructured tetrahedral meshes where G^1 surface continuity is maintained for the triangular element faces representing the curved domain surfaces. A bottom-up curving approach is used to support geometric models with multiple surface patches where either C^0 or G^1 geometry continuity between pat...
Massively parallel computation provides an enormous capacity to perform simulations on a timescale that can change the paradigm of how scientists, engineers, and other practitioners use simulations to address discovery and design. This work considers an active flow control application on a realistic and complex wing design that could be leveraged b...
Efforts to develop component-based simulation workflows for industrial applications using XSEDE parallel computing systems are presented.
Boundary layers in turbulent flows require fine grid spacings near the walls which depend on the choice of turbulence model. To satisfy these requirements a semi-structured mesh is generally used in this area with orthogonal and layered elements. Adaptation of such a mesh needs to take into account the flow physics along with the standard error ind...
This paper presents a parallel adaptive mesh control procedure designed to operate with high-order finite element analysis packages to enable large-scale automated simulations on massively parallel computers. The curved mesh adaptation procedure uses curved entity mesh modification operations that explicitly consider the influence of the curved mes...
Multi-element wings are popular in the aerospace community due to their high lift performance. Turbulent flow simulations of these configurations require very fine mesh spacings especially near the walls, thereby making use of a boundary layer mesh necessary. However, it is difficult to accurately determine the required mesh resolution a priori to...
In the traditional programming paradigm, data structures and algorithms are developed for specific data types and requirements. This leads to code redundancy and inflexibility, thus not allowing effective code reuse for similar applications. One effective approach to increase code reuse is generic programming, which focuses on the development of ef...
The mechanical behavior of a three-dimensional cross-linked fiber network embedded in a matrix is studied in this work. The network is composed of linear elastic fibers which store energy only in the axial deformation mode, while the matrix is also isotropic and linear elastic. Such systems are encountered in a broad range of applications, from tis...
Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to a...
Simulations of turbulent flows are challenging and require tight and varying mesh spacings near the walls that depend on the turbulence models used. Semi-structured meshes are often used in the turbulent wall boundary layers due to their ability to be strongly graded and anisotropic. To reduce the discretization errors in the solution, an adaptiv...
We consider multiphysics applications from algorithmic and architectural perspectives, where "algorithmic" includes both mathematical analysis and computational complexity, and "architectural" includes both software and hardware environments. Many diverse multiphysics applications can be reduced, en route to their computational simulation, to a...
A soft tissue's macroscopic behavior is largely determined by its microstructural components (often a collagen fiber network surrounded by a nonfibrillar matrix (NFM)). In the present study, a coupled fiber-matrix model was developed to fully quantify the internal stress field within such a tissue and to explore interactions between the collagen fi...
Although there's a widespread belief that the effective application of high-performance computing will dramatically increase industrial innovation, progress in this area has been slow and limited because of a combination of technical and economic impediments. Here, such impediments are outlined, along with efforts to address them.
This chapter presents a set of procedures that start from image data to construct a non-manifold geometric model that supports the effective generation of meshes with the types of mesh configurations and gradations needed for efficient simulations. The types of operations needed to process the image information before and during the creation of the...
This paper describes a parallel procedure for anisotropic mesh adaptation with boundary layers for use in scalable CFD simulations. The parallel mesh adaptation algorithm operates with local mesh modification operations developed for both unstructured and boundary layer parts of the mesh. The adaptive approach maintains layered elements near the v...
This paper presents the development of a parallel adaptive mesh control procedure designed to operate with high-order finite element analysis packages to enable large scale automated simulations on massively parallel computers. The curved mesh adaptation procedure uses curved entity mesh modification operations. Applications of the curved mesh adap...
Two software packages from the Department of Energy (DOE) Office of Science's Scientific Discovery through Advanced Computing (SciDAC) Frameworks, Algorithms, and Scalable Technologies for Mathematics (FASTMath) institute, Parallel Unstructured Mesh Infrastructure (PUMI) and Partitioning using Mesh Adjacencies (ParMA), are presented.
We propose a method to automatically defeature a CAD model by detecting irrelevant features using a geometry-based size field and a method to remove the irrelevant features via facet-based operations on a discrete representation. A discrete B-Rep model ...
The mechanical properties of soft connective tissues are governed by their collagen fiber network and surrounding non-fibrillar matrix (e.g., proteoglycans, cells, elastin, etc.). In order to understand how healthy tissues function, and how properties change in injury and disease, it is necessary to quantify the mechanical response of both the coll...
This paper introduces a general-purpose communication package built on top of MPI which is aimed at improving inter-processor communications independently of the supercomputer architecture being considered. The package is developed to support parallel applications that rely on computation characterized by a large number of messages of various sizes,...
Parallel simulations at extreme scale require that the mesh is distributed across a large number of processors with equal work load and minimum inter-part communications. A number of algorithms have been developed to meet these goals and graph/hypergraph-based methods are by far the most powerful ones. However, the global implementation of current...
Scalability and time-to-solution studies have historically been focused on the size of the problem and run time. We consider a more strict definition of "solution" whereby a live data analysis (co-visualization of either the full data or in situ data extracts) provides continuous and reconfigurable insight into massively parallel simulations. Speci...
Monte Carlo simulation is ideally suited for solving the Boltzmann neutron transport equation in inhomogeneous media. However, routine applications require the computation time to be reduced to hours and even minutes in a desktop system. The interest in adopting GPUs for Monte Carlo acceleration is rapidly mounting, fueled partially by the parallelism...
The higher-order finite element method requires valid curved meshes in three-dimensional domains to achieve the solution accuracy. When applying adaptive higher-order finite elements in large-scale simulations, complexities that arise include moving the curved mesh adaptation along with the critical domains to achieve computational efficiency. This pap...
This paper investigates I/O approaches for massively parallel partitioned solver systems. Typically, such systems have synchronized "loops" and write data in a well defined block I/O format consisting of a header and data portion. Our target use for such a parallel I/O subsystem is checkpoint-restart where writing is by far the most common operatio...
Much of the effort required to create a new simulation code goes into developing infrastructure for mesh data manipulation, adaptive refinement, design optimization, and so forth. This infrastructure is an obvious target for code reuse, except that implementations of these functionalities are typically tied to specific data structures. In this arti...
Indentation has become a popular research technique for the mechanical characterization of collagen-based soft tissues. The popularity of the method stems from its requirement of a modestly sized sample, from its ability to be applied in vitro as well as in vivo, and from the ready availability of instrumentation and analytical techniques borrowed...
The adaptive variable p- and hp-version finite element method can achieve exponential convergence rate when a near optimal finite element mesh is provided. For general 3D domains, near optimal p-version meshes require large curved elements over the smooth portions of the domain, geometrically graded curved elements to the singular edges and vertice...
As cardiovascular models grow more sophisticated in terms of the geometry considered, and more physiologically realistic boundary conditions are applied, and fluid flow is coupled to structural models, the computational complexity grows. Massively parallel adaptivity and flow solvers with extreme scalability enable cardiovascular simulations to rea...
With the development of high-performance computing, I/O issues have become the bottleneck for many massively parallel applications. This paper investigates scalable parallel I/O alternatives for massively parallel partitioned solver systems. Typically such systems have synchronized "loops" and will write data in a well defined block I/O format co...
Multiscale simulation is a promising approach for addressing a variety of real-world engineering problems. Various mathematical approaches have been proposed to link single-scale models of physics into multiscale models. In order to be effective, new multiscale simulation algorithms must be implemented which use partial results provided by single-s...
This paper presents a software infrastructure being developed to support the implementation of adaptive multiple model simulations. The paper first describes an abstraction of single and multiple model simulations into the individual operational components with a focus on the relationships and transformations that relate them. Building on that abst...
This paper is concerned with three-dimensional numerical simulation of a plunging liquid jet. The transient processes of forming an air cavity around the jet, capturing an initially large air bubble, and the break-up of this large toroidal-shaped bubble into smaller bubbles were analyzed. A stabilized finite element method (FEM) was employed under...
This paper presents an adaptive mesh control procedure suitable for use with high-order finite element methods to solve viscous flow problems. The procedure presented is an appropriate combination of anisotropic and boundary layer mesh adaptation that accounts for the need to use curved mesh edges and faces to maintain the required geometric approx...
Effective use of the processor memory hierarchy is an important issue in high performance computing. In this work, a part level mesh topological traversal algorithm is used to define a reordering of both mesh vertices and regions that increases the spatial locality of data and improves overall cache utilization during on processor finite element ca...
Parallel simulations at extreme scale require that the mesh is distributed across a large number of processors with equal work load and minimum interpart communications. A number of algorithms have been developed to meet these goals, e.g., graph/hypergraph and coordinate-based methods. However, the global implementation of current approaches can fa...
The scalable execution of parallel adaptive analyses requires the application of dynamic load balancing to repartition the mesh into a set of parts with balanced work load and minimal communication. As the adaptive meshes being generated reach billions of elements and the analyses are performed on massively parallel computers with 100,000’s of comp...
Small-scale features and processes occurring at nanometer and femtosecond scales have a profound impact on what happens at larger scales and over extended periods of time. The primary objective of this volume is to reflect the state of the art in multiscale mathematics, modeling and simulations and to address the following barriers: What is the...
Introduction; Requirements for a Parallel Infrastructure for Adaptively Evolving Unstructured Meshes; Structure of the Flexible Mesh Database; Parallel Flexible Mesh Database (FMDB); Mesh Migration for Full Representations; Mesh Migration for Reduced Representations; Parallel Adaptive Applications; Closing Remark; References
Implicit methods for partial differential equations using unstructured meshes allow for an efficient solution strategy for many real-world problems (e.g., simulation-based virtual surgical planning). Scalable solvers employing these methods not only enable solution of extremely-large practical problems but also lead to dramatic compression in time-...
The M3D-C^1 code is a two-fluid toroidal magnetohydrodynamic code based on high-order, compact finite elements with C^1 continuity on an unstructured adaptive triangle-based grid. The code is built upon and extends many of the favorable features of the M3D approach to solving the MHD equations in a highly magnetized toroidal plasma. The vector fiel...
Building on a general abstraction of the steps and transformations of a multiscale analysis, this chapter considers an approach to the development of multiscale simulation in which interoperable components can be effectively combined to address a wide range of multiscale simulations. Key concerns in the development of these interoperable components...
SciDAC applications have a demonstrated need for advanced software tools to manage the complexities associated with sophisticated geometry, mesh, and field manipulation tasks, particularly as computer architectures move toward the petascale. In this paper, we describe a software component – an abstract data model and programming interface – designe...
This paper introduces a general-purpose communication package built on top of MPI which is aimed at improving inter-processor communications for parallel computations characterized by large numbers of messages. The current library provides a utility for such applications based on two key attributes that are: (i) explicit consideration of the neighb...
The M3D-C^1 code is a two-fluid toroidal magnetohydrodynamic code based on high-order, compact finite elements with C^1 continuity on an unstructured adaptive triangle-based grid. The code is built upon many of the favorable features of the M3D approach to solving the MHD equations in a highly magnetized toroidal plasma. The vector fields use a phy...
We developed a general framework for information-passing and concurrent discrete to continuum scale bridging and applied it to biological, electro-mechanical and thermo-electrical systems. Funds were used for partial support of two post-doctoral research associates (Aiqin Li, Dawei Zhang) and three graduate students (Renge Li, Mohan Nuggehally, J...