Yu Zhuang

  • Texas Tech University

About

92
Publications
9,928
Reads
1,374
Citations
Current institution
Texas Tech University

Publications (92)
Preprint
Full-text available
The rapid expansion of Internet of Things (IoT) devices demands robust and resource-efficient security solutions. Physically Unclonable Functions (PUFs), which generate unique cryptographic keys from inherent hardware variations, offer a promising approach. However, traditional PUFs like Arbiter PUFs (APUFs) and XOR Arbiter PUFs (XOR-PUFs) are susc...
Preprint
Full-text available
Lightweight authentication is essential for resource-constrained Internet-of-Things (IoT). Implementable with low resource and operable with low power, Physical Unclonable Functions (PUFs) have the potential as hardware primitives for implementing lightweight authentication protocols. The arbiter PUF (APUF) is probably the most lightweight strong P...
Preprint
Full-text available
Physical Unclonable Functions (PUFs) are emerging as promising security primitives for IoT devices, providing device fingerprints based on physical characteristics. Despite their strengths, PUFs are vulnerable to machine learning (ML) attacks, including conventional and reliability-based attacks. Conventional ML attacks have been effective in revea...
Article
Authentication is critical for Internet-of-Things. The traditional approach of using cryptographic keys is subject to invasive attacks. Being unclonable even by the manufacturers, physical unclonable functions (PUFs) leverage integrated circuits’ manufacturing variations to produce responses unique for individual devices, and hence are of great pot...
Article
Full-text available
IoT devices rely on authentication mechanisms to render secure message exchange. During data transmission, scalability, data integrity, and processing time have been considered challenging aspects for a system constituted by IoT devices. The application of physical unclonable functions (PUFs) ensures secure data transmission among the internet of t...
Preprint
Full-text available
Physical Unclonable Functions (PUFs) are promising security primitives for resource-constrained IoT devices. And the XOR Arbiter PUF (XOR-PUF) is one of the most studied PUFs, out of an effort to improve the resistance against machine learning attacks of probably the most lightweight delay-based PUFs - the Arbiter PUFs. However, recent attack studi...
Article
Full-text available
A network of agents constituted of multiple unmanned aerial vehicles (UAVs) is emerging as a promising technology with myriad applications in the military, public, and civil domains. UAVs’ power, memory, and size constraints, ultra-mobile nature, and non-trusted operational environments make them susceptible to various attacks, including physical c...
Article
Full-text available
Physical Unclonable Functions (PUFs) are promising security primitives for resource-constrained network nodes. The XOR Arbiter PUF (XOR PUF or XPUF) is an intensively studied PUF invented to improve the security of the Arbiter PUF, probably the most lightweight delay-based PUF. Recently, highly powerful machine learning attack methods were discover...
Preprint
Full-text available
Physical Unclonable Functions (PUFs) are promising security primitives for resource-constrained network nodes. The XOR Arbiter PUF (XOR PUF or XPUF) is an intensively studied PUF invented to improve the security of the Arbiter PUF, probably the most lightweight delay-based PUF. Recently, highly powerful machine learning attack methods were discover...
Conference Paper
Distance learning has dramatically increased in recent years because of advanced technology. In addition, numerous universities had to offer courses in online mode in 2020 and 2021 because of the COVID-19 pandemic. However, there are more challenges in distance learning than in the traditional learning method (e.g., feedback and interaction). Recen...
Article
By revisiting, improving, and extending recent neural-network based modeling attacks on XOR Arbiter PUFs from the literature, we show that XOR Arbiter PUFs, (XOR) Feed-Forward Arbiter PUFs, and Interpose PUFs can be attacked faster, up to larger security parameters, and with an order of magnitude fewer challenge-response pairs than previously known...
Preprint
Full-text available
Security is of critical importance for the Internet of Things (IoT). Many IoT devices are resource-constrained, calling for lightweight security protocols. Physical unclonable functions (PUFs) leverage integrated circuits' variations to produce responses unique for individual devices, and hence are not reproducible even by the manufacturers. Implem...
Article
In this paper, we introduce a new I/O characteristic discovery methodology for performance optimizations on object-based storage systems. Different from traditional methods that select limited access attributes or heavily rely on domain knowledge about applications’ I/O behaviors, our method enables capturing data-access features as many as possib...
Article
Full-text available
Physical unclonable functions (PUFs), leveraging tiny physical variations of the circuits to produce unique responses for individual PUF instances, are emerging as a promising class of hardware security primitives for resource-constrained IoT devices. Component-differentially-challenged XOR PUFs (CDC XPUFs) are among the PUFs which were shown to be...
Article
Full-text available
With the advent of the Internet of Things, security has become indispensable. Physical unclonable functions (PUFs) are emerging as a promising alternative to classical cryptographic algorithms because they provide a lightweight and cost-effective solution for implementing a keyless security mechanism. Before adopting a PUF for real-world applications, a...
Article
Physical unclonable functions (PUFs) are emerging as a promising alternative to traditional cryptographic protocols for IoT authentication. XOR Arbiter PUFs (XPUFs), a group of well-studied PUFs, are found to be secure against machine learning (ML) attacks if the XOR gate is large enough, as both the number of CRPs and the computational time require...
Article
Security is critically important for Internet-of-Things, but existing cryptographic protocols are not lightweight enough for resource-constrained IoT devices. Implementable with simplistic circuits and operable with shallow power, physical unclonable functions (PUFs) leverage small but unavoidable physical variations of the circuit to produce uniqu...
Article
Full-text available
Classical cryptographic methods that inherently employ secret keys embedded in non-volatile memory have been known to be impractical for limited-resource Internet of Things (IoT) devices. Physical Unclonable Functions (PUFs) have emerged as an applicable solution to provide a keyless means for secure authentication. PUFs utilize inevitable variatio...
Conference Paper
For a large volume of data, the clustering algorithm is of significant importance to categorize and analyze data. Accordingly, choosing the optimal number of clusters (K) is an essential factor, but it also is a tricky problem in big data analysis. More importantly, it is to efficiently determine the best K automatically, which is the main issue in...
Article
Full-text available
The Industrial Internet of Things (IIoT) platform consists of purpose-driven communication controllers, enterprise-grade modems (routers and gateways), and edge computing systems that require integrated software and sensing capability in mission-critical environments. Extensible purpose-built industrial supervisory control and data acquisition netw...
Conference Paper
Communication security is essential for the proper functioning of the Internet of Things. Traditional approaches that rely on cryptographic keys are vulnerable to side-channel attacks. Physical Unclonable Functions (PUFs), leveraging unavoidable and irreproducible variations of integrated circuits to produce responses unique for individual PUF devi...
Poster
❖ Grid Engine is a Distributed Resource Manager (DRM) that manages the resources of distributed systems (such as Grid, HPC, or Cloud systems). ❖ Grid Engine applies scheduling policies to allocate resources for jobs while maintaining optimal utilization of all resources. ❖ The complexity of Grid Engine’s job submission commands and complicated res...
Conference Paper
Grid Engine is a Distributed Resource Manager (DRM) that manages the resources of distributed systems (such as Grid, HPC, or Cloud systems) and executes designated jobs which have requested to occupy or consume those resources. Grid Engine applies scheduling policies to allocate resources for jobs while simultaneously attempting to maintain optima...
Preprint
Full-text available
We report on a new approach to ease the computational overhead of ab initio on-the-fly semiclassical dynamics simulations for vibrational spectroscopy. The well known bottleneck of such computations lies in the necessity to estimate the Hessian matrix for propagating the semiclassical pre-exponential factor at each step along the dynamics. The proc...
Article
We report on a new approach to ease the computational overhead of ab initio “on-the-fly” semiclassical dynamics simulations for vibrational spectroscopy. The well known bottleneck of such computations lies in the necessity to estimate the Hessian matrix for propagating the semiclassical pre-exponential factor at each step along the dynamics. The pr...
Article
For computational fluid dynamics (CFD) applications with a large number of grid points/cells, parallel computing is a common efficient strategy to reduce the computational time. How to achieve the best performance in the modern supercomputer system, especially with heterogeneous computing resources such as hybrid CPU+GPU, or a CPU + Intel Xeon Phi...
Preprint
Full-text available
For computational fluid dynamics (CFD) applications with a large number of grid points/cells, parallel computing is a common efficient strategy to reduce the computational time. How to achieve the best performance in the modern supercomputer system, especially with heterogeneous computing resources such as hybrid CPU+GPU, or a CPU + Intel Xeon Phi...
Article
Many scientific applications consist of heavy computational and analysis workload on data, and often require producing intermediate data for ongoing calculations. For instance, chemical dynamics simulations are known as heavy workload applications in terms of calculation in many aspects. There is a strong desire of seeking a solution to minimize ex...
Conference Paper
Clustering is one of the fundamental data mining procedures. Bisecting K-means (BKM) clustering has been studied to have higher computing efficiency and better clustering quality when compared with the basic Lloyd version of the K-means clustering. Elkan's method of utilizing triangle inequality significantly reduces distance calculations, and is a...
Conference Paper
Full-text available
Bisecting K-means (BKM) clustering, with or without refinement, has been shown to exhibit higher computing efficiency, better clustering quality, and low susceptibility to initial cluster centers, when compared with the basic K-means clustering algorithm. For bisecting K-means with refinement, in this paper, we investigate a variant that increases...
Poster
In many science and engineering investigations, there are data that are of critical importance but highly expensive to generate, and there is a strong incentive to reduce the amount of such data, since reduction of expensively generated data also means reduction of data generation costs. But what is critical is that the data reduction should not le...
Article
The non-contiguous access pattern of many scientific applications results in a large number of I/O requests, which can seriously limit the data-access performance. Collective I/O has been widely used to address this issue. However, the performance of collective I/O could be dramatically degraded in today's high-performance computing systems due to...
Conference Paper
Full-text available
Dynamic data sharing among cores during computation in a multi-core architectural environment has been recognized as one of the factors that add to the cost of the total execution time. One way of reducing the impact of the latency generated by intense data sharing is through communication hiding. The idea behind communication hiding is to create o...
Article
Chemical processes are intrinsically quantum mechanical and quantum effects cannot be excluded a priori. Classical dynamics that use fitted force fields have been routinely applied to complex molecular systems. But since the force fields used in classical dynamics are tuned to fit experimental and/or electronic structure data, the harmonic potentia...
Article
Collective I/O is a critical I/O strategy on high-performance parallel computing systems that enables programmers to reveal parallel processes' I/O accesses collectively and makes it possible for the parallel I/O middleware to carry out I/O requests in a highly efficient manner. Collective I/O has been proven as a core parallel I/O optimization techni...
Article
Many scientific computing applications and engineering simulations exhibit noncontiguous I/O access patterns. Data sieving is an important technique to improve the performance of noncontiguous I/O accesses by combining small and noncontiguous requests into a large and contiguous request. It has been proven effective even though more data are potent...
Article
Compared with current high-performance computing (HPC) systems, exascale systems are expected to have much less memory per node, which can significantly reduce necessary collective input/output (I/O) performance. In this study, we introduce a memory-conscious collective I/O strategy that takes into account memory capacity and bandwidth constraints....
Article
High Performance Computing is trending towards exascale and some of the major barriers of high performance computing or scientific computing are dominated by latencies incurred due to storage, communication, and component failures. In this paper we discuss a technique to overcome one of those obstacles: latency incurred due to communication. This...
Conference Paper
Force evaluation is the most computationally intensive part in a chemical dynamics simulation, and hence most parallel simulation algorithms choose the force calculation as the main target for parallelization. The majority of existing parallel algorithms assume a uniform force-evaluation cost for all atom pairs. For dynamics with considerable bonde...
Article
In this paper, a parallelization for a large scale CFD application with mixed one-to-one multiblock/overset structured grid was implemented into our in-house TH-CFD code running on Tianhe-1A supercomputer system. Strategies at multiple software levels were employed in a mutually supportive way for overall performance enhancement, and they include g...
Article
This paper concerns versions of the Trotter-Kato Theorem and the Chernoff Product Formula for $C_0$-semigroups in the absence of stability. Applications to $\mathcal{A}$-stable rational approximations of semigroups are presented.
Article
In classical and quasiclassical trajectory chemical dynamics simulations, the atomistic dynamics of collisions, chemical reactions, and energy transfer are studied by solving the classical equations of motion. These equations require the potential energy and its gradient for the chemical system under study, and they may be obtained directly from an...
Article
This paper shows how a compact finite difference Hessian approximation scheme can be proficiently implemented into semiclassical initial value representation molecular dynamics. Effects of the approximation on the monodromy matrix calculation are tested by propagating initial sampling distributions to determine power spectra for analytic potential...
Article
Direct dynamics simulations are a very useful and general approach for studying the atomistic properties of complex chemical systems, since an electronic structure theory representation of a system’s potential energy surface is possible without the need for fitting an analytic potential energy function. In this paper, recently introduced compact fi...
Conference Paper
The continuing decrease in memory capacity per core and the increasing disparity between core count and off-chip memory bandwidth create significant challenges for I/O operations in exascale systems. The exascale challenges require rethinking collective I/O for the effective exploitation of the correlation among I/O accesses in the exascale system....
Conference Paper
High end computing hardware has been growing fast in both uniprocessor performance and parallel system scales. Steadily advancing but somewhat lagging behind is the speed of memory accesses. Thus, needed are software and algorithms behind software that adapt well with architectural features of high end computing hardware. Stable explicit implicit d...
Article
Full-text available
We present a domain decomposition method for solving the equation modeling nitric oxide diffusions. The domain decomposition we use is one of the stabilized explicit implicit domain decomposition (SEIDD) methods. The SEIDD methods have a restriction that the interface boundaries have no cross-over inside the domain. In this paper, we present a doma...
Conference Paper
The continuing decrease in memory capacity per core and the increasing disparity between core count and off-chip memory bandwidth create significant challenges for I/O operations in exascale systems. The exascale challenges require rethinking collective I/O for the effective exploitation of the correlation among I/O accesses in the exascale system....
Article
Full-text available
Many scientific computing applications and engineering simulations exhibit noncontiguous I/O access patterns. Data sieving is an important technique to improve the performance of noncontiguous I/O accesses by combining small and noncontiguous requests into a large and contiguous request. It has been proven effective even though more data is potent...
Conference Paper
The emerging Solid State Drives (SSDs) have changed the landscape of storage systems and have the potential to be widely deployed in computing systems including HPC systems. However, the cost and the capacity of SSDs have often been cited as the primary barrier to SSD deployment. In this study, we revisit the RAID design and propose a new hybrid-RA...
Article
Full-text available
In this paper, we present a family of generally applicable schemes for updating the Hessian from electronic structure calculations based on an equation derived with compact finite difference (CFD). The CFD-based equation is of higher accuracy than the quasi-Newton equation on which existing generally applicable Hessian update schemes are based. Dir...
Article
Chemical dynamics simulation is essential in a broad range of science and engineering investigations, and hence essential for science and engineering education. Many chemical dynamics simulations are very time consuming and require high performance computing systems, which, however, are not as affordable, and hence as ubiquitously accessible, as de...
Article
Explicit–implicit domain decomposition (EIDD) is a class of globally non-iterative, non-overlapping domain decomposition methods for the numerical solution of parabolic problems on parallel computers, which are highly efficient both computationally and communicationally for each time step. In this paper an alternating EIDD method is proposed which...
Article
Full-text available
In previous research [J. Chem. Phys. 111, 3800 (1999)] a Hessian-based integration algorithm was derived for performing direct dynamics simulations. In the work presented here, improvements to this algorithm are described. The algorithm has a predictor step based on a local second-order Taylor expansion of the potential in Cartesian coordinates, wi...
Article
An important metric for simulation algorithms used in compartment modelling is computation efficiency. One algorithmic achievement in efficiency is the Hines method, which substantially reduces the computation cost of solving a system of linear equations arising in each time step of implicit time integration. However, the Hines method does not work...
Conference Paper
Full-text available
Stabilized explicit implicit domain decomposition (SEIDD) is a class of globally non-iterative domain decomposition methods for the numerical simulation of unsteady diffusion processes on parallel computers. By adding a communication-cost-free stabilization step to the explicit-implicit domain decomposition (EIDD) methods, the SEIDD methods achieve...
Article
Full-text available
We report a class of stabilized explicit-implicit domain decomposition (SEIDD) methods for the numerical solution of parabolic equations. Explicit-implicit domain decomposition (EIDD) methods are globally noniterative, nonoverlapping domain decomposition methods, which, when compared with Schwarz-algorithm-based parabolic solvers, are computationall...
Conference Paper
Full-text available
In this paper, we report a class of stabilized explicit-implicit domain decomposition (SEIDD) methods for the parallel solution of parabolic problems, based on the explicit-implicit domain decomposition (EIDD) methods. EIDD methods are globally non-iterative, non-overlapping domain decomposition methods which, when compared with Schwarz alternating...
Article
We present a multilevel high order ADI method for separable generalized Helmholtz equations. The discretization method we use is a one-dimensional fourth order compact finite difference applied to each directional component of the Laplace operator, resulting in a discrete system efficiently solvable by ADI methods. We apply this high order differenc...
Article
We present a fourth order numerical solution method for the singular Neumann boundary problem of Poisson equations. Such problems arise in the solution process of incompressible Navier–Stokes equations and in the time-harmonic wave propagation in the frequency space with the zero wavenumber. The equation is first discretized with a fourth order mod...
Article
Full-text available
Many temporal discretization methods for linear evolution equations converge uniformly on compact time intervals at the rate \(\tfrac{1}{n^{\alpha}}\) only for sufficiently smooth initial data. It is shown that these methods can be regularized such that the new schemes converge ‘in the average’ at the rate \(\tfrac{1}{n^{\alpha}}\) for all in...
Article
We present three parallel solvers for parabolic equations. The solution methods, which are based on non-overlapping explicit/implicit time marching and implicit correction, are simple, stable, and communicationally inexpensive. Numerical experiments confirm that the solver is unconditionally stable and highly scalable with respect to the machine and...
Article
Full-text available
this paper. The domain and range of an operator A is denoted by D(A) and R(A), and D(A
Article
Full-text available
Under general hypotheses on the target set S and the dynamics of the system, we show that the minimal time function T_S(·) is a proximal solution to the Hamilton–Jacobi equation. Uniqueness results are obtained with two different kinds of boundary conditions. A new propagation result is proven, and as an application, we give necessary and sufficie...
Article
Full-text available
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments, then, are introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solve...
Article
Full-text available
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments, then, are introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solve...
Article
Full-text available
Stabilized explicit-implicit domain decomposition (SEIDD) is a class of globally non-iterative, non-overlapping domain decomposition methods for the parallel solution of parabolic problems. However, it has a restriction that the interface boundaries do not cross into each other inside the domain. Thus, the conventional parallelization that assigns...
Article
Full-text available
Stabilized explicit-implicit domain decomposition (SEIDD) is a class of globally non-iterative, non-overlapping domain decomposition methods for the parallel solution of parabolic problems. However, it has a restriction that the interface boundaries do not cross into each other inside the domain. Thus, the conventional parallelization that assigns...
