Alexander McCaskey’s research while affiliated with NVIDIA and other places


Publications (44)


Fig. 1 A depiction of the sections covered in this review and how AI can be used to benefit the entire QC stack.
Fig. 2 A simple hierarchy from Artificial Intelligence to generative AI, broadly contextualizing the techniques discussed in this work. Each level is paired with a simple description.
Fig. 5 Most quantum device architectures require specific tuning and control protocols to operate as qubits. Machine learning-based approaches allow us to automate and speed up such protocols, allowing for high-throughput characterization and optimization of quantum devices.
Fig. 6 Transformer model for decoding quantum surface code. Figure adapted from [149]. Models trained on a small code distance (5 above) can transfer to larger distances (7 above) thanks to the variable input length of the transformer, cutting down on the training time.
Fig. 7 Depiction of a development platform incorporating access to both QC and AI resources. Such a platform should be accessible to both domain scientists and quantum developers, and must orchestrate hybrid workflows leveraging both AI supercomputers and quantum processors.


Artificial Intelligence for Quantum Computing
  • Preprint
  • File available

November 2024 · 191 Reads · Marwa H. Farag · [...] · Timothy Costa

Artificial intelligence (AI) advancements over the past few years have had an unprecedented and revolutionary impact across everyday application areas. Its significance also extends to technical challenges within science and engineering, including the nascent field of quantum computing (QC). The counterintuitive nature and high-dimensional mathematics of QC make it a prime candidate for AI's data-driven learning capabilities, and in fact, many of QC's biggest scaling challenges may ultimately rest on developments in AI. However, bringing leading techniques from AI to QC requires drawing on disparate expertise from arguably two of the most advanced and esoteric areas of computer science. Here we aim to encourage this cross-pollination by reviewing how state-of-the-art AI techniques are already advancing challenges across the hardware and software stack needed to develop useful QC - from device design to applications. We then close by examining its future opportunities and obstacles in this space.


Efficient charge-preserving excited state preparation with variational quantum algorithms

October 2024 · 12 Reads

Determining the spectrum and wave functions of excited states of a system is crucial in quantum physics and chemistry. Low-depth quantum algorithms, such as the Variational Quantum Eigensolver (VQE) and its variants, can be used to determine the ground-state energy. However, current approaches to computing excited states require numerous controlled unitaries, making the application of the original Variational Quantum Deflation (VQD) algorithm to problems in chemistry or physics suboptimal. In this study, we introduce a charge-preserving VQD (CPVQD) algorithm, designed to incorporate symmetry and the corresponding conserved charge into the VQD framework. This results in dimension reduction, significantly enhancing the efficiency of excited-state computations. We present benchmark results with GPU-accelerated simulations using systems up to 24 qubits, showcasing applications in high-energy physics, nuclear physics, and quantum chemistry. This work is performed on NERSC's Perlmutter system using NVIDIA's open-source platform for accelerated quantum supercomputing - CUDA-Q.
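To give a rough sense of the dimension reduction that a symmetry restriction can provide, the sketch below compares the full Hilbert-space dimension of an n-qubit register with the size of a single fixed-charge sector. It assumes, purely for illustration, that the conserved charge is the number of qubits in state |1> (a particle-number-like symmetry); the abstract does not say which charge is used, and the helper names `full_dimension` and `sector_dimension` are hypothetical and not part of CUDA-Q.

```python
from math import comb


def full_dimension(n_qubits: int) -> int:
    """Dimension of the unrestricted n-qubit Hilbert space."""
    return 2 ** n_qubits


def sector_dimension(n_qubits: int, charge: int) -> int:
    """Number of computational basis states with exactly `charge` qubits set to 1.

    Assumption: the conserved charge is the Hamming weight of the bitstring.
    """
    return comb(n_qubits, charge)


n = 24  # largest system size mentioned in the abstract
q = 12  # illustrative choice: the half-filled sector, which is the largest one
print(full_dimension(n), sector_dimension(n, q))
```

Under this assumption, even the largest sector of a 24-qubit register is roughly six times smaller than the full space, and sectors away from half filling shrink much faster.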


Figure 1: Summary of the XACC circuit execution workflow.
Figure 8: Strong scaling plots of the DDCL gradients for a single parameter update with 20-26 qubits for the circuit in Figure 5 with 10 layers.
Parallel Quantum Computing Simulations via Quantum Accelerator Platform Virtualization

June 2024 · 31 Reads

Quantum circuit execution is the central task in quantum computation. Due to inherent quantum-mechanical constraints, quantum computing workflows often involve a considerable number of independent measurements over a large set of slightly different quantum circuits. Here we discuss a simple model for parallelizing simulation of such quantum circuit executions that is based on introducing a large array of virtual quantum processing units, mapped to classical HPC nodes, as a parallel quantum computing platform. Implemented within the XACC framework, the model can readily take advantage of its backend-agnostic features, enabling parallel quantum circuit execution over any target backend supported by XACC. We illustrate the performance of this approach by demonstrating strong scaling in two pertinent domain science problems, namely in computing the gradients for the multi-contracted variational quantum eigensolver and in data-driven quantum circuit learning, where we vary the number of qubits and the number of circuit layers. The latter (classical) simulation leverages the cuQuantum SDK library to run efficiently on GPU-accelerated HPC platforms.
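The idea of farming out many slightly different circuits to independent workers can be sketched in a few lines. The example below is not XACC code: a closed-form expectation value stands in for a circuit simulation, a thread pool stands in for the array of virtual QPUs, and the gradient is assembled with the standard parameter-shift rule. All function names are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def expectation(thetas):
    """Toy stand-in for one circuit execution.

    Returns <Z x ... x Z> after applying RY(theta_i) to qubit i of |0...0>,
    which equals the product of cos(theta_i).
    """
    return math.prod(math.cos(t) for t in thetas)


def shifted(thetas, index, delta):
    """Copy of the parameter list with one entry shifted by delta."""
    out = list(thetas)
    out[index] += delta
    return out


def parallel_gradient(thetas, workers=4):
    """Parameter-shift gradient; each shifted circuit is an independent job."""
    shift = math.pi / 2
    jobs = []
    for i in range(len(thetas)):
        jobs.append(shifted(thetas, i, +shift))
        jobs.append(shifted(thetas, i, -shift))
    # Each worker plays the role of one virtual QPU evaluating one circuit.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(expectation, jobs))
    return [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(len(thetas))]


print(parallel_gradient([0.3, 1.1, -0.7]))
```

Because the jobs share no state, the same structure carries over to process- or node-level parallelism; Python threads are used here only to show the shape of the workflow and will not by themselves speed up CPU-bound work.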




Fig. 1: QCOR Machine Model [3]
Fig. 5: Scalability of the one-by-one and the parallel approaches: two SHOR(N=7, a=2) from Algorithm 1
Enabling Multi-threading in Heterogeneous Quantum-Classical Programming Models

January 2023 · 85 Reads

In this paper, we address some of the key limitations to realizing a generic heterogeneous parallel programming model for quantum-classical heterogeneous platforms. We discuss our experience in enabling user-level multi-threading in QCOR as well as challenges that need to be addressed for programming future quantum-classical systems. Specifically, we discuss our design and implementation of introducing C++-based parallel constructs to enable 1) parallel execution of a quantum kernel with std::thread and 2) asynchronous execution with std::async. To do so, we provide a detailed overview of the current implementation of the QCOR programming model and runtime, and discuss how we 1) add thread-safety to some of its user-facing API routines, and 2) increase parallelism in QCOR by removing data races that inhibit multi-threading so as to better utilize available computing resources. We also present preliminary performance results with the Quantum++ back end on a single-node Ryzen9 3900X machine that has 12 physical cores (24 hardware threads) with 128GB of RAM. The results show that running two Bell kernels with 12 threads per kernel in parallel outperforms running the kernels one after the other each with 24 threads (1.63x improvement). In addition, we observe the same trend when running two Shor's algorithm kernels in parallel (1.22x faster than executing the kernels one after the other). It is worth noting that the trends remain the same even when we only use physical cores instead of threads. We believe that our design, implementation, and results will open up an opportunity not only for 1) enabling quicker prototyping of parallel/asynchrony-aware quantum-classical algorithms on quantum circuit simulators in the short-term, but also for 2) realizing a generic heterogeneous parallel programming model for quantum-classical heterogeneous platforms in the long-term.


Numerical simulations of noisy quantum circuits for computational chemistry

September 2022 · 106 Reads · 7 Citations

Materials Theory

The opportunities afforded by near-term quantum computers to calculate the ground-state properties of small molecules depend on the structure of the computational ansatz as well as the errors induced by device noise. Here we investigate the behavior of these noisy quantum circuits using numerical simulations to estimate the accuracy and fidelity of the prepared quantum states relative to the ground truth obtained by conventional means. We implement several different types of ansatz circuits derived from unitary coupled cluster theory for the purposes of estimating the ground-state energy of sodium hydride using the variational quantum eigensolver algorithm. We show how relative error in the energy and the fidelity scale with the levels of gate-based noise, the internuclear configuration, the ansatz circuit depth, and the parameter optimization methods.


Retargetable Optimizing Compilers for Quantum Accelerators via a Multi-Level Intermediate Representation

September 2022 · 16 Reads · 6 Citations

IEEE Micro

We present a multilevel quantum–classical intermediate representation (IR) that enables an optimizing, retargetable compiler for available quantum languages. Our work builds upon the multilevel intermediate representation (MLIR) framework and leverages its unique progressive lowering capabilities to map quantum languages to the low-level virtual machine (LLVM) machine-level IR. We provide both quantum and classical optimizations via the MLIR pattern rewriting subsystem and standard LLVM optimization passes, and demonstrate the programmability, compilation, and execution of our approach via standard benchmarks and test cases. In comparison to other standalone language and compiler efforts available today, our work results in compile times that are 1,000× faster than standard Pythonic approaches, and 5–10× faster than comparative standalone quantum language compilers. Our compiler provides quantum resource optimizations via standard programming patterns that result in a 10× reduction in entangling operations, a common source of program noise. We see this work as a vehicle for rapid quantum compiler prototyping.


Hardware connectivity graphs for (a) heavy-hexagon, $d_H = 2.5$, (b) hexagon, $d_H = 3$, (c) square, $d_H = 4$, and (d) triangle, $d_H = 6$.
SWAP gate scaling with average problem degree $d_G$ and hardware degree $d_H$ for 7-vertex graphs. The solid line shows the non-linear least squares fit to $N_{\textsc{SWAP}}(d_G, d_H) = a d_G/d_H + b$, with $a = 5.9 \pm 0.1$ and $b = -2.5 \pm 0.2$, with ± indicating the asymptotic standard error of the fit parameters.
Average SWAP gate scaling with number of qubits $n$ and hardware degree $d_H$ for 3-regular graphs.
The number of measurement samples $M$ to measure a result from the intended state for 3-regular graphs, see text for details. Inset: $M$ diverges exponentially in $1/d_H$.
The number of initially unsatisfied edges $N_u$ in the initial qubit placement at each $n$ and $d_H$ for 3-regular graphs.
Scaling quantum approximate optimization on near-term hardware

July 2022 · 106 Reads · 40 Citations

The quantum approximate optimization algorithm (QAOA) is an approach for near-term quantum computers to potentially demonstrate computational advantage in solving combinatorial optimization problems. However, the viability of the QAOA depends on how its performance and resource requirements scale with problem size and complexity for realistic hardware implementations. Here, we quantify scaling of the expected resource requirements by synthesizing optimized circuits for hardware architectures with varying levels of connectivity. Assuming noisy gate operations, we estimate the number of measurements needed to sample the output of the idealized QAOA circuit with high probability. We show the number of measurements, and hence total time to solution, grows exponentially in problem size and problem graph degree as well as depth of the QAOA ansatz, gate infidelities, and inverse hardware graph degree. These problems may be alleviated by increasing hardware connectivity or by recently proposed modifications to the QAOA that achieve higher performance with fewer circuit layers.
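As a small worked example of the dependence on hardware connectivity, the sketch below evaluates the fit quoted in this paper's figure caption, N_SWAP(d_G, d_H) = a d_G/d_H + b with a = 5.9 and b = -2.5, for the four hardware degrees listed there. The fit is reported for 7-vertex graphs only; the choice d_G = 3 and the function name `n_swap` are illustrative assumptions, and the quoted uncertainties on a and b are ignored.

```python
# Central values of the fit parameters quoted in the figure caption.
A_FIT = 5.9
B_FIT = -2.5

# Hardware degrees quoted in the figure caption.
HARDWARE_DEGREES = {
    "heavy-hexagon": 2.5,
    "hexagon": 3.0,
    "square": 4.0,
    "triangle": 6.0,
}


def n_swap(d_g: float, d_h: float) -> float:
    """Fitted SWAP count for average problem degree d_g and hardware degree d_h."""
    return A_FIT * d_g / d_h + B_FIT


for name, d_h in HARDWARE_DEGREES.items():
    # d_G = 3 is an illustrative choice, not a value singled out by the source.
    print(f"{name:>13}: d_H = {d_h:<3} -> N_SWAP ~ {n_swap(3.0, d_h):.2f}")
```

The fitted SWAP count falls as the hardware degree rises, which is consistent with the abstract's remark that increasing hardware connectivity may alleviate the scaling problem.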


Tensor Network Quantum Virtual Machine for Simulating Quantum Circuits at Exascale

July 2022 · 42 Reads · 31 Citations

ACM Transactions on Quantum Computing

The numerical simulation of quantum circuits is an indispensable tool for development, verification and validation of hybrid quantum-classical algorithms intended for near-term quantum co-processors. The emergence of exascale high-performance computing (HPC) platforms presents new opportunities for pushing the boundaries of quantum circuit simulation. We present a modernized version of the Tensor Network Quantum Virtual Machine (TNQVM) which serves as the quantum circuit simulation backend in the eXtreme-scale ACCelerator (XACC) framework. The new version is based on the scalable tensor network processing library ExaTN (Exascale Tensor Networks). It provides multiple configurable quantum circuit simulators which perform either an exact quantum circuit simulation via the full tensor network contraction or an approximate simulation via a suitably chosen tensor factorization scheme. Upon necessity, stochastic noise modeling from real quantum processors is incorporated into the simulations by modeling quantum channels with Kraus tensors. By combining the portable XACC quantum programming frontend and the scalable ExaTN numerical processing backend, we introduce an end-to-end virtual quantum development environment which can scale from laptops to future exascale platforms. We report initial benchmarks of our framework which include a demonstration of the distributed execution, incorporation of quantum decoherence models, and simulation of the random quantum circuits used for the certification of quantum supremacy on Google’s Sycamore superconducting architecture.


Citations (21)


... To ascertain the ground-state energy of sodium hydride through the variational quantum eigensolver technique, Jerimiah et al. [42] examined the influence of device noise on the precision and fidelity of quantum circuits. They implemented various ansatz circuits derived from unitary coupled cluster theory and investigated how errors induced by gate-based noise, ansatz circuit depth, parameter optimization techniques, and internal nuclear configuration affected the relative error in energy and fidelity of the quantum states generated. ...

Reference:

A computational study and analysis of Variational Quantum Eigensolver over multiple parameters for molecules and ions
Numerical simulations of noisy quantum circuits for computational chemistry

Materials Theory

... Most combinatorial optimization problems, however, are defined on arbitrary graph topologies, some even all-to-all connected. This implies that the topology of the problem rarely matches the topology of the actual device, and that rerouting using SWAP networks is necessary [6,7]. A general way to circumvent the need for rerouting, is to employ different problem encodings that are native to the topology of the hardware, for instance a square grid. ...

Scaling quantum approximate optimization on near-term hardware

... At ORNL, we deploy a number of simulators and related tools, able to utilize the capabilities of our HPC infrastructure (see Section 4.1). These include TN-QVM [110,111], an ORNL-developed tensor-network based accelerator that works together with XACC (see Section 7.1) and tensor-network simulator backends such as ITensor [112] or ExaTN [113] for high-performance exploration of circuit evolution. Figure 3 shows a schematic of how TN-QVM integrates into a larger software ecosystem. ...

Tensor Network Quantum Virtual Machine for Simulating Quantum Circuits at Exascale
  • Citing Article
  • July 2022

ACM Transactions on Quantum Computing

... [40] Tensor Network qiskit, PastaQ.jl [41], NVidia cuQuantum, QXTools [42], Blueqat, Tai Zhang Simulator [43], qFlex [44], HybridQ, ExaTN [45] (with TNVQM Accelerator), Jet [46], Quimb [47], TensorCircuit [48], QTensor [49], Tensorly [50] allow the access of qubit numbers on the order of a million and more. They are mainly used to investigate quantum error correction codes [52] and more recently random unitary circuits. ...

ExaTN: Scalable GPU-Accelerated High-Performance Processing of General Tensor Networks at Exascale

Frontiers in Applied Mathematics and Statistics

... The use of compilation optimizers like the JIT (Just-In-Time) compiler from Numba, mentioned in the previous paragraph, is being explored in the field of Classical Quantum Computing Simulation. Unlike traditional compilation optimizers like the one found in Numba, quantum compilation optimizations are already being developed, such as Quantum Just-In-Time Compilation (QJIT) [6]. These optimizations help maintain computational performance in hybrid CPU-QPU models. ...

Extending Python for Quantum-Classical Computing via Quantum Just-in-Time Compilation
  • Citing Article
  • June 2022

ACM Transactions on Quantum Computing

... The relevant distinction here is that this code does not contain the inner loop -instead the group homomorphism property of the in-place multiplication operator U x can be used to fuse the loop into a singular multiplication with a classically precomputed multiplication factor: (19) and are subsequently called only "symbolically", i.e. the Python interpreter doesn't have to traverse them again. The flattening to a lower level representation like QIR [45] is then outsourced to established classical compilation infrastructure [46,47]. The proposed architecture brings some significant advantages: ...

Retargetable Optimizing Compilers for Quantum Accelerators via a Multi-Level Intermediate Representation
  • Citing Article
  • September 2022

IEEE Micro

... The following UCC results are obtained using the XACC quantum computing framework 39,40 and PySCF 41 to generate the Hamiltonians, calculate FCI energies, and select important τ's suggested by CCSD amplitudes. Converged UCC amplitudes are then manipulated by the UT2 Python module, 42 a software dedicated to rapidly prototyping alternative coupled cluster theories, to extract the triples corrections. ...

A Backend-agnostic, Quantum-classical Framework for Simulations of Chemistry in C++
  • Citing Article
  • April 2022

ACM Transactions on Quantum Computing

... More efficient numerical tensor-algebraic techniques coupled with advances in classical machine learning will be required for scalable characterization of larger NISQ devices. On the engineering side, all these newly devised numerical simulation techniques will have to be implemented in an efficient manner in order to fully exploit the computational power of Exascale HPC platforms which are becoming widely available worldwide [159,[167][168][169]. ...

Scalable Programming Workflows for Validation of Quantum Computers
  • Citing Conference Paper
  • November 2021

... MLIR (Multi-Level Intermediate Representation) [13], an emerging compiler infrastructure under the LLVM umbrella, overcomes this by having multiple levels of IR from close to the source language down to machine-level IR. This abstraction is quite useful in specifying domain-specific computations at the desired level of abstraction and hence utilized in various domains like Quantum Computing (Quantum MLIR) [19], machine learning models (Onnx MLIR [11], torch-mlir [16]), and hardware description languages (Circt [3]). Since these different dialects fall under the same infrastructure, the MLIR infrastructure seems quite promising for integrating different domains as the dialects will share consistent IR structure. ...

A MLIR Dialect for Quantum Assembly Languages
  • Citing Conference Paper
  • October 2021

... As suggested in Refs. [217,[225][226][227][228], large noise may lead to problems such as performance degradation or barren plateaus. Some special noises, such as leakage error, also have a bad effect on the performance of VQAs [229]. ...

Numerical Simulations of Noisy Variational Quantum Eigensolver Ansatz Circuits
  • Citing Conference Paper
  • October 2021