Article

FlexSpin: A CMOS Ising Machine With 256 Flexible Spin Processing Elements With 8-b Coefficients for Solving Combinatorial Optimization Problems


Abstract

Combinatorial optimization problems (COPs) are essential in various applications, including data clustering, supply chain management, and communication networks. Many real-world COPs are non-deterministic polynomial-time hard (NP-hard) problems that are intractable on classical computers. The Ising machine, a hardware accelerator based on the Ising model and an annealing operation, has gained much attention as an alternative for solving COPs. The COPs are mapped to the Ising model, and their optimal or near-optimal solutions are explored through the intrinsic convergence property of the Ising machine. However, prior Ising machines based on locally connected spins have limitations in solving hard COPs due to the significant overhead of mapping the Ising model to the inflexible hardware topology. In this work, we propose a scalable CMOS Ising machine with a network of flexible processing elements (PEs) to map and solve complex COPs with minimal overhead. The proposed Ising machine implements 256 PEs, where each PE can be reconfigured as 1 to 4 spins with 28 spin interactions based on 8-bit coefficients. A 65-nm prototype chip has been fabricated, and a range of COPs have been mapped and solved, including max-cut and Boolean satisfiability problems.
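As a rough illustration of how a COP is mapped to the Ising model and then explored by annealing, the sketch below encodes a small max-cut instance as Ising couplings and runs a plain software simulated-annealing loop. It is not the FlexSpin hardware algorithm: the sign convention H(s) = -Σ J_ij s_i s_j, the geometric temperature schedule, and all names (ising_energy, cut_value, anneal) are assumptions chosen for the example.

```python
import math
import random


def ising_energy(J, s):
    """Ising energy H(s) = -sum_{i<j} J[i][j] * s[i] * s[j], spins in {-1, +1}."""
    n = len(s)
    return -sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))


def cut_value(W, s):
    """Total weight of edges whose endpoints carry opposite spins."""
    n = len(s)
    return sum(W[i][j] for i in range(n) for j in range(i + 1, n) if s[i] != s[j])


def anneal(J, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Single-spin-flip simulated annealing; returns the best state seen and its energy."""
    rng = random.Random(seed)
    n = len(J)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    e = ising_energy(J, s)
    best_s, best_e = s[:], e
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        i = rng.randrange(n)
        local_field = sum(J[i][j] * s[j] for j in range(n) if j != i)
        delta = 2 * s[i] * local_field  # energy change if spin i flips
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            s[i] = -s[i]
            e += delta
            if e < best_e:
                best_s, best_e = s[:], e
    return best_s, best_e


# Max-cut on a 4-node ring with unit weights: setting J = -W makes the
# lowest-energy spin assignment coincide with the largest cut.
W = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
J = [[-w for w in row] for row in W]
spins, energy = anneal(J)
print(spins, energy, cut_value(W, spins))
```

With J = -W the cut weight equals (total edge weight - H) / 2, so lowering the energy and enlarging the cut are the same objective. The dense matrix is used only for brevity; it does not model the chip's PE network or its 8-bit coefficient format.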


... There are many classes of problems that cannot be efficiently solved by conventional computers, including optimization, inference, sampling, and invertible logic. [1][2][3][4][5][6][7][8] As problems scale poorly and Moore's law has begun to slow, there is increasing interest in designing alternative computational schemes and domain-specific hardware to accelerate these problems. Both experiments and theory show that the hardware-accelerated Boltzmann machines (BMs) outperform CPU implementation of the same probabilistic algorithms and deterministic algorithms on certain specific problems. ...
Article
Full-text available
Traditional processors based on the von Neumann architecture are not efficient when dealing with combinatorial optimization problems, which has led to the proposal of unconventional algorithms and domain-specific computing architectures. Using probabilistic computing to implement invertible logic has emerged as a potential solution, with the primary challenges being the realization of high-quality random sources and efficient circuit mapping schemes. In this work, we propose a reliable design for invertible logic circuits based on stochastic spin-transfer torque magnetic tunnel junctions (MTJs) and validate it through SPICE simulations. To achieve this, we develop a physics-driven stochastic MTJ model using Verilog-A, which is then implemented to construct binary stochastic neurons for building restricted Boltzmann machines (RBMs). Using linear programming (LP), the stochastic MTJs are weighted and interconnected to construct elementary RBM-based invertible logic gates, including AND, OR, NOT, and NAND gates. Furthermore, through logic synthesis, invertible logic circuits capable of realizing arbitrary logic functions are achieved. RBMs whose weights are determined by LP not only eliminate the need for training but also enhance iteration speed. Finally, we demonstrate how to solve the Boolean satisfiability problem using the proposed invertible logic circuits. Power consumption and area estimations indicate that our design consumes fewer resources compared to pure CMOS implementations.
... The performance of DTA is primarily determined by the initial annealing strength and annealing rate controlled by the bias voltages. It is necessary to strike a balance between annealing speed and Hamiltonian convergence quality for COPs of varying complexity [23], in order to achieve a tradeoff between solution quality and TTS. ...
Article
Full-text available
Recently, Ising machines have received rising attention as efficient alternatives for solving combinatorial optimization problems (COPs), compared to von Neumann architectures with high power consumption and prolonged latencies. While prior discrete-time Ising machines exploited near-/in-memory computing to mitigate memory access bottlenecks, their reliance on sequential spin updates inherently limits solution speed. To address this problem, continuous-time (CT) Ising machines have emerged as promising low-cost COP solvers due to their fully parallel spin-update mechanisms. However, existing CT Ising machines face a critical trilemma in simultaneously achieving multi-level coefficient precision, area-efficient spin implementations, and effective annealing strategies. In this work, we present a CT Ising machine with an eDRAM-based in-memory computing architecture. The design introduces a neighboring spin computation-sharing architecture that halves the storage overhead of interaction coefficients, thereby significantly reducing the spin cell area. Furthermore, a novel dynamic threshold annealing method is proposed to attain lower Ising Hamiltonians during the solution process. An 18 × 34 spin array has been designed in a 55-nm CMOS process, which achieves a compact spin unit footprint of merely 561 μm² with 15-level programmable interaction coefficients. Additionally, post-layout simulation results validate that the proposed Ising machine achieves comparable nanosecond-scale time-to-solution (TTS) for diverse max-cut COPs while consuming 0.84 nJ at a 1.0 V supply voltage.
... By mapping QUBO problems to the Ising formulation, the QUBO could be solved directly on the Ising machine hardware. Due to the applicability and efficiency of the Ising machine, hardware structures using MOS devices (e.g., [7][8][9][10][11]), memristor-based systems (e.g., [12]), quantum devices (e.g., [13]), classical superconductor devices (e.g., [14]), and optical devices (e.g., [15][16][17]) have been suggested. ...
Preprint
Full-text available
We present a heuristic algorithm designed to solve Quadratic Unconstrained Binary Optimization (QUBO) problems efficiently. The algorithm, referred to as IC-D2S, leverages a hybrid approach using Ising and classical machines to address very large problem sizes. Considering the practical limitation on the size of the Ising machine (IM), our algorithm partitions the QUBO problem into a collection of QUBO subproblems (called subQUBOs) and utilizes the IM to solve each subQUBO. Our proposed heuristic algorithm uses a set of control parameters to generate the subQUBOs and explore the search space. It also utilizes an annealer based on a cosine waveform and applies a mutation operator at each step of the search to diversify the solution space and facilitate finding the global minimum of the problem. We have evaluated the effectiveness of our IC-D2S algorithm on three large-sized problem sets and compared its efficiency in finding the (near-)optimal solution with three QUBO solvers. One of the solvers is a software-based algorithm (D2TS), while the other one (D-Wave) employs a similar approach to ours, utilizing both classical and Ising machines. The results demonstrate that for large-sized problems (>= 5000) the proposed algorithm identifies superior solutions. Additionally, for smaller-sized problems (= 2500), IC-D2S finds the optimal solution significantly faster.
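The QUBO-to-Ising mapping that this entry relies on can be sketched with the standard substitution x_i = (1 + s_i) / 2. The snippet below is only an illustration under assumed conventions, not code from IC-D2S: the QUBO is stored as a dictionary {(i, j): q} with i <= j, the Ising energy is written as offset + Σ h_i s_i + Σ J_ij s_i s_j, and the function names are hypothetical.

```python
from itertools import product


def qubo_to_ising(Q, n):
    """Convert E(x) = sum Q[i,j] * x_i * x_j (x in {0,1}) into Ising terms h, J, offset."""
    h = [0.0] * n
    J = {}
    offset = 0.0
    for (i, j), q in Q.items():
        if i == j:
            # q * x_i = q/2 + (q/2) * s_i
            h[i] += q / 2
            offset += q / 2
        else:
            # q * x_i * x_j = (q/4) * (1 + s_i + s_j + s_i * s_j)
            key = (min(i, j), max(i, j))
            J[key] = J.get(key, 0.0) + q / 4
            h[i] += q / 4
            h[j] += q / 4
            offset += q / 4
    return h, J, offset


def qubo_energy(Q, x):
    """QUBO objective for a binary vector x (diagonal terms use x_i * x_i = x_i)."""
    return sum(q * x[i] * x[j] for (i, j), q in Q.items())


def ising_energy(h, J, offset, s):
    """Ising objective in the plus-sign convention used by qubo_to_ising."""
    field = sum(hi * si for hi, si in zip(h, s))
    pair = sum(c * s[i] * s[j] for (i, j), c in J.items())
    return offset + field + pair


# Small made-up instance: both objectives should agree on every assignment.
Q = {(0, 0): -1, (1, 1): -1, (0, 1): 2, (1, 2): -4, (2, 2): 3}
h, J, offset = qubo_to_ising(Q, 3)
agree = all(
    qubo_energy(Q, x) == ising_energy(h, J, offset, [2 * xi - 1 for xi in x])
    for x in product((0, 1), repeat=3)
)
print(agree)
```

Because the two objectives differ only by a change of variables, any minimizer found on the Ising side maps back to a QUBO minimizer through x_i = (1 + s_i) / 2.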
Article
Ising machines have received growing interest as efficient and hardware-friendly solvers for combinatorial optimization problems (COPs). They search for the absolute or approximate ground states of the Ising model with a proper annealing process. In contrast to Ising machines built with superconductive or optical circuits, complementary metal–oxide–semiconductor (CMOS) Ising machines offer inexpensive fabrication, high scalability, and easy integration with mainstream semiconductor chips. As low-energy and CMOS-compatible emerging technologies, spintronics and phase-transition devices offer functionalities that can enhance the scalability and sampling performance of Ising machines. In this article, we survey various approaches in the process flow for solving COPs using CMOS, hybrid CMOS-spintronic, and phase-transition devices. First, the methods for formulating COPs as Ising problems and embedding Ising formulations to the topology of the Ising machine are reviewed. Then, Ising machines are classified by their underlying operational principles and reviewed from a perspective of hardware implementation. CMOS solutions are advantageous with denser connectivity, whereas hybrid CMOS-spintronic and phase-transition device-based solutions show great potential in energy efficiency and high performance. Finally, the challenges and prospects are discussed for the Ising formulation, embedding process, and implementation of Ising machines.
Article
Full-text available
Solving computationally hard problems using conventional computing architectures is often slow and energetically inefficient. Quantum computing may help with these challenges, but it is still in the early stages of development. A quantum-inspired alternative is to build domain-specific architectures with classical hardware. Here we report a sparse Ising machine that achieves massive parallelism where the flips per second—the key figure of merit—scales linearly with the number of probabilistic bits. Our sparse Ising machine architecture, prototyped on a field-programmable gate array, is up to six orders of magnitude faster than standard Gibbs sampling on a central processing unit, and offers 5–18 times improvements in sampling speed compared with approaches based on tensor processing units and graphics processing units. Our sparse Ising machine can reliably factor semi-primes up to 32 bits and it outperforms competition-winning Boolean satisfiability solvers in approximate optimization. Moreover, our architecture can find the correct ground state, even when inexact sampling is made with faster clocks. Our problem encoding and sparsification techniques could be applied to other classical and quantum Ising machines, and our architecture could potentially be scaled to 1,000,000 or more p-bits using analogue silicon or nanodevice technologies. Sparsification techniques can be used to create Ising machines prototyped on field-programmable gate arrays that can quickly and efficiently solve combinatorial optimization problems.
Article
Full-text available
Computational architectures that are optimized to solve non-deterministic polynomial-time hard or complete problems are of use in the development of machine learning, logistical planning and pathfinding. A range of quantum-, optical- and spintronic-based approaches have been explored for solving such combinatorial optimization problems, but they remain complicated to build and to scale. Here we report a scalable ring-oscillator-based integrated circuit for optimization problem solving. Our 1,968-node King’s graph ring oscillator array has five levels of coupling strengths and can achieve up to 95% accuracy for randomly generated combinatorial optimization problems. The measured average power consumption of the Ising chip is 0.042 W and it takes less than 50 oscillation cycles to resolve to the ground state. Our device is resilient to environmental and variation effects. By using a multi-phase phase measurement circuit, we also capture the true phase behaviour within a coupled-oscillator integrated circuit. A coupled ring-oscillator-based integrated circuit with 1,968 nodes can be used to efficiently solve combinatorial optimization problems with an accuracy of up to 95%.
Article
Full-text available
Therapeutic mRNAs and vaccines are being developed for a broad range of human diseases, including COVID-19. However, their optimization is hindered by mRNA instability and inefficient protein expression. Here, we describe design principles that overcome these barriers. We develop an RNA sequencing-based platform called PERSIST-seq to systematically delineate in-cell mRNA stability, ribosome load, as well as in-solution stability of a library of diverse mRNAs. We find that, surprisingly, in-cell stability is a greater driver of protein output than high ribosome load. We further introduce a method called In-line-seq, applied to thousands of diverse RNAs, that reveals sequence and structure-based rules for mitigating hydrolytic degradation. Our findings show that highly structured “superfolder” mRNAs can be designed to improve both stability and expression with further enhancement through pseudouridine nucleoside modification. Together, our study demonstrates simultaneous improvement of mRNA stability and protein expression and provides a computational-experimental platform for the enhancement of mRNA medicines. The authors develop an RNA sequencing-based platform, PERSIST-seq, to simultaneously delineate in-cell mRNA stability, ribosome load, and in-solution stability of a diverse mRNA library to derive design principles for improved mRNA therapeutics.
Article
Full-text available
Minor-embedding heuristics have become an indispensable tool for compiling problems in quadratically unconstrained binary optimization (QUBO) into the hardware graphs of quantum and CMOS annealing processors. While recent embedding heuristics have been developed for annealers of moderate size (about 2000 nodes), the size of the latest CMOS annealing processor (with 102,400 nodes) poses entirely new demands on the embedding heuristic. This raises the question of whether recent embedding heuristics can maintain meaningful embedding performance on hardware graphs of increasing size. Here, we develop an improved version of the probabilistic-swap-shift-annealing (PSSA) embedding heuristic [which has recently been demonstrated to outperform the standard embedding heuristic by D-Wave Systems (Cai et al. in http://arxiv.org/abs/1406.2741, 2014)] and evaluate its embedding performance on hardware graphs of increasing size. For random cubic and Barabási–Albert graphs we find the embedding performance of improved PSSA to consistently exceed the threshold of the best known complete graph embedding by a factor of 3.2 and 2.8, respectively, up to hardware graphs with 102,400 nodes. On the other hand, for random graphs with constant edge density not even improved PSSA can overcome the deterministic threshold guaranteed by the existence of the best known complete graph embedding. Finally, we prove a new upper bound on the maximal embeddable size of complete graphs into hardware graphs of CMOS annealers and show that the embedding performance of its currently best known complete graph embedding has optimal order for hardware graphs with fixed coordination number.
Article
Full-text available
This article presents a high-performance annealing processor named STochAsTIc Cellular automata Annealer (STATICA) for solving combinatorial optimization problems represented by fully connected graphs. Supporting fully connected graphs is strongly required for dealing with realistic optimization problems. Unlike previous annealing processors that follow Glauber dynamics, our proposed annealer can update multiple states of fully connected spins simultaneously by introducing different dynamics called stochastic cellular automata annealing. It allows us to utilize the pipeline-level and memory-bank-level parallelization in addition to the PE-level parallelization originally adopted in the previous annealers. The STATICA prototype chip, which supports a 512-spin fully connected graph, has been fabricated with the 65-nm CMOS technology and realized as a 3 mm × 4 mm chip. Using the fabricated 512-spin chip and numerical projections for a 2048-spin chip, we have conducted experiments to reveal the annealing performance of STATICA and examined how to control its annealing process efficiently.
Article
Full-text available
The observation of an unequivocal quantum speedup remains an elusive objective for quantum computing. A more modest goal is to demonstrate a scaling advantage over a class of classical algorithms for a computational problem running on quantum hardware. The D-Wave quantum annealing processors have been at the forefront of experimental attempts to address this goal, given their relatively large numbers of qubits and programmability. A complete determination of the optimal time-to-solution using these processors has not been possible to date, preventing definitive conclusions about the presence of a scaling advantage. The main technical obstacle has been the inability to verify an optimal annealing time within the available range. Here, we overcome this obstacle using a class of problem instances constructed by systematically combining many-spin frustrated loops with few-qubit gadgets exhibiting a tunneling event—a combination that we find to promote the presence of tunneling energy barriers in the relevant semiclassical energy landscape of the full problem—and we observe an optimal annealing time using a D-Wave 2000Q processor over a range spanning up to more than 2000 qubits. We identify the gadgets as being responsible for the optimal annealing time, whose existence allows us to perform an optimal time-to-solution benchmarking analysis. We perform a comparison to several classical algorithms, including simulated annealing, spin-vector Monte Carlo, and discrete-time simulated quantum annealing (SQA), and establish the first example of a scaling advantage for an experimental quantum annealer over classical simulated annealing. Namely, we find that the D-Wave device exhibits certifiably better scaling than simulated annealing, with 95% confidence, over the range of problem sizes that we can test. However, we do not find evidence for a quantum speedup: SQA exhibits the best scaling for annealing algorithms by a significant margin. 
This is a finding of independent interest, since we associate SQA’s advantage with its ability to traverse energy barriers in the semiclassical energy landscape by mimicking tunneling. Our construction of instance classes with verifiably optimal annealing times opens up the possibility of generating many new such classes based on a similar principle of promoting the presence of energy barriers that can be overcome more efficiently using quantum rather than thermal fluctuations, paving the way for further definitive assessments of scaling advantages using current and future quantum annealing devices.
Article
Full-text available
Quantum annealing aims at solving combinatorial optimization problems mapped to Ising interactions between quantum spins. Here, with the objective of developing a noise-resilient annealer, we propose a paradigm for quantum annealing with a scalable network of two-photon-driven Kerr-nonlinear resonators. Each resonator encodes an Ising spin in a robust degenerate subspace formed by two coherent states of opposite phases. A fully connected optimization problem is mapped to local fields driving the resonators, which are connected with only local four-body interactions. We describe an adiabatic annealing protocol in this system and analyse its performance in the presence of photon loss. Numerical simulations indicate substantial resilience to this noise channel, leading to a high success probability for quantum annealing. Finally, we propose a realistic circuit QED implementation of this promising platform for implementing a large-scale quantum Ising machine.
Article
Full-text available
Adiabatic quantum computing has evolved in recent years from a theoretical field into an immensely practical area, a change partially sparked by D-Wave Systems' quantum annealing hardware. These multimillion-dollar quantum annealers offer the potential to solve optimization problems millions of times faster than classical heuristics, prompting researchers at Google, NASA and Lockheed Martin to study how these computers can be applied to complex real-world problems such as NASA rover missions. Unfortunately, compiling (embedding) an optimization problem into the annealing hardware is itself a difficult optimization problem and a major bottleneck currently preventing widespread adoption. Additionally, while finding a single embedding is difficult, no generalized method is known for tuning embeddings to use minimal hardware resources. To address these barriers, we introduce a graph-theoretic framework for developing structured embedding algorithms. Using this framework, we introduce a biclique virtual hardware layer to provide a simplified interface to the physical hardware. Additionally, we exploit bipartite structure in quantum programs using odd cycle transversal (OCT) decompositions. By coupling an OCT-based embedding algorithm with new, generalized reduction methods, we develop a new baseline for embedding a wide range of optimization problems into fault-free D-Wave annealing hardware. To encourage the reuse and extension of these techniques, we provide an implementation of the framework and embedding algorithms.
Article
Full-text available
Taking the pulse of optimization. Finding the optimum solution of multiparameter or multifunctional problems is important across many disciplines, but it can be computationally intensive. Many such problems defined as computationally difficult can be mathematically mapped onto the so-called Ising problem, which looks at finding the minimum energy configuration for an array of coupled spins. Inagaki et al. and McMahon et al. show that an optical processing approach based on a network of coupled optical pulses in a ring fiber can be used to model and optimize large-scale Ising systems. Such a scalable architecture could help to optimize solutions to a wide range of complex problems. Science, this issue pp. 603 and 614
Article
Full-text available
An indispensable part of our personal and working lives, computing has also become essential to industries and governments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in integrated circuits over the past fifty years. Such Moore scaling now requires ever-increasing efforts, stimulating research in alternative hardware and stirring controversy. To help evaluate emerging technologies and increase our understanding of integrated-circuit scaling, here I review fundamental limits to computation in the areas of manufacturing, energy, physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, I recapitulate how some limits were circumvented, and compare loose and tight limits. Engineering difficulties encountered by emerging technologies may indicate yet unknown limits.
Article
Full-text available
Quantum technology is maturing to the point where quantum devices, such as quantum communication systems, quantum random number generators and quantum simulators, may be built with capabilities exceeding classical computers. A quantum annealer, in particular, solves hard optimisation problems by evolving a known initial configuration at non-zero temperature towards the ground state of a Hamiltonian encoding a given problem. Here, we present results from experiments on a 108 qubit D-Wave One device based on superconducting flux qubits. The strong correlations between the device and a simulated quantum annealer, in contrast with weak correlations between the device and classical annealing or classical spin dynamics, demonstrate that the device performs quantum annealing. We find additional evidence for quantum annealing in the form of small-gap avoided level crossings characterizing the hard problems. To assess the computational power of the device we compare it to optimised classical algorithms.
Article
Full-text available
We provide Ising formulations for many NP-complete and NP-hard problems, including all of Karp's 21 NP-complete problems. This collects and extends classic results relating partitioning problems to Ising spin glasses, as well as work describing exact covering algorithms and satisfiability. In each case, the state space is at most polynomial in the size of the problem, as is the number of terms in the Hamiltonian. This work may be useful in designing adiabatic quantum optimization algorithms.
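One of the simplest formulations of this kind is number partitioning: assigning a spin s_i = ±1 to each number n_i and taking H = (Σ n_i s_i)² gives an energy of zero exactly when the two groups have equal sums. The sketch below checks this by exhaustive search on a toy instance. The brute-force search stands in for an annealer and is only practical for a handful of spins, and the function names are hypothetical.

```python
from itertools import product


def partition_energy(numbers, s):
    """H(s) = (sum_i n_i * s_i)^2; zero means the two groups have equal sums."""
    return sum(n * si for n, si in zip(numbers, s)) ** 2


def best_partition(numbers):
    """Exhaustively search all spin assignments and return (energy, spins)."""
    best = min(
        product((-1, 1), repeat=len(numbers)),
        key=lambda s: partition_energy(numbers, s),
    )
    return partition_energy(numbers, best), list(best)


numbers = [3, 1, 1, 2, 2, 1]
energy, spins = best_partition(numbers)
group_a = [n for n, si in zip(numbers, spins) if si == 1]
group_b = [n for n, si in zip(numbers, spins) if si == -1]
print(energy, group_a, group_b)
```

Expanding the square gives Σ n_i² + 2 Σ_{i<j} n_i n_j s_i s_j, so the problem has a coupling proportional to n_i n_j between every pair of spins, and the number of Hamiltonian terms grows only quadratically with the input size.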
Article
Full-text available
Many interesting but practically intractable problems can be reduced to that of finding the ground state of a system of interacting spins; however, finding such a ground state remains computationally difficult. It is believed that the ground state of some naturally occurring spin systems can be effectively attained through a process called quantum annealing. If it could be harnessed, quantum annealing might improve on known methods for solving certain types of problem. However, physical investigation of quantum annealing has been largely confined to microscopic spins in condensed-matter systems. Here we use quantum annealing to find the ground state of an artificial Ising spin system comprising an array of eight superconducting flux quantum bits with programmable spin-spin couplings. We observe a clear signature of quantum annealing, distinguishable from classical thermal annealing through the temperature dependence of the time at which the system dynamics freezes. Our implementation can be configured in situ to realize a wide variety of different spin networks, each of which can be monitored as it moves towards a low-energy configuration. This programmable artificial spin network bridges the gap between the theoretical study of ideal isolated spin networks and the experimental investigation of bulk magnetic samples. Moreover, with an increased number of spins, such a system may provide a practical physical means to implement a quantum algorithm, possibly allowing more-effective approaches to solving certain classes of hard combinatorial optimization problems.
Article
Combinatorial optimization problems (COPs) find applications in real-world scientific, industrial, and societal scenarios. Such COPs are computationally NP-hard, and performing an exhaustive brute force search for the optimal solution becomes untenable as the COP size increases. To expedite the COP computation, the Ising model formalism is used, which abstracts spin dynamics in a ferromagnet. The spins are orientated to reach the minimum energy state, representing the optimum COP solution. Previous Ising engine designs utilized dedicated annealing processors or additional digital arithmetic circuits next to the memory bitcells. These custom circuits or processors cannot be repurposed for other applications, incurring significant area and power overhead. In contrast to the prior approaches, this work presents a reconfigurable and scalable compute-within-memory analog approach for Ising computation (called Ising-CIM). This area-efficient approach repurposes existing embedded memory bitcell columns and peripheral circuits to perform analog domain Hamiltonian calculations on the bitlines, minimizing area and power overhead significantly. A 13.18-Kb silicon prototype, implemented in a 65-nm CMOS process, demonstrates the Ising-CIM concept and functionality using a 100 × 64 pixel image in a max-cut COP. The Ising-CIM design achieves 48-μm²/spin unit spin area and 1091× speedup in annealing time compared to the CPU.
Article
No existing algorithms can find exact solutions to the combinatorial optimization problems (COPs) classified as non-deterministic polynomial-time (NP) hard problems. As an alternative, the Ising computer, based on the Ising model and an annealing process, has recently drawn significant attention. Ising computers can find approximate solutions to NP-hard COPs by observing the convergence of dynamic spin states. However, they have encountered challenges in mapping the optimization problems to inflexible Ising computers with fixed spin interconnects. In this article, we propose a scalable CMOS Ising computer with sparse and reconfigurable spin interconnects for arbitrary mapping of spin networks with minimal overhead. Without a mapping algorithm, the proposed Ising computer provides a method for directly mapping COPs to the reconfigurable hardware. A 65-nm CMOS Ising test chip with 252 spins is fabricated and used for solving COPs, including max-cut problems.
Article
Recently, annealing processors based on the Ising model have received rising attention as efficient alternative hardware for solving combinatorial optimization problems. After mapping a problem to the hardware Ising model, we can observe the natural convergence behavior of the Ising model and find potential solutions to the problem. The quantum annealing processor has shown effectiveness in finding better solutions to the problems by using quantum bits (qubits) with annealing based on their quantum tunneling behaviors. However, the annealing processor based on the emerging quantum technology faces challenges such as limited scalability, high energy consumption, and operating costs due to the extremely low operating temperature. As a low-cost alternative, a CMOS annealing processor has been recently developed and drawn increasing attention thanks to the advantages over its quantum counterpart, including better scalability and lower energy consumption. In this work, we present a digital CMOS annealing processor with in-memory spin operators and register spins. The proposed CMOS annealing processor achieves >10× energy efficiency and faster operation than state-of-the-art works.
Article
Quantum annealing (QA) can be used to quickly obtain near-optimal solutions for quadratic unconstrained binary optimization (QUBO) problems. In QA hardware, each decision variable of a QUBO should be mapped to one or more adjacent qubits in such a way that pairs of variables defining a quadratic term in the objective function are mapped to some pair of adjacent qubits. However, qubits have limited connectivity in existing QA hardware. This has spurred work on preprocessing algorithms for embedding the graph representing problem variables with quadratic terms into the hardware graph representing qubits adjacencies, such as the Chimera graph in hardware produced by D-Wave Systems. In this paper, we use integer linear programming to search for an embedding of the problem graph into certain classes of minors of the Chimera graph, which we call template embeddings. One of these classes corresponds to complete bipartite graphs, for which we show the limitation of the existing approach based on minimum odd cycle transversals (OCTs). One of the formulations presented is exact and thus can be used to certify the absence of a minor embedding using that template. On an extensive test set consisting of random graphs from five different classes of varying size and sparsity, we can embed more graphs than a state-of-the-art OCT-based approach, our approach scales better with the hardware size, and the runtime is generally orders of magnitude smaller. Summary of Contribution: Our work combines classical and quantum computing for operations research by showing that integer linear programming can be successfully used as a preprocessing step for adiabatic quantum optimization. We use it to determine how a quadratic unconstrained binary optimization problem can be solved by a quantum annealer in which the qubits are coupled as in a Chimera graph, such as in the quantum annealers currently produced by D-Wave Systems. 
The paper also provides a timely introduction to adiabatic quantum computing and related work on minor embeddings.
Article
Nondeterministic polynomial time hard (NP-hard) combinatorial optimization problems (COPs) are intractable to solve using a traditional computer as the time to find a solution increases very rapidly with the number of variables. An efficient alternative computing method uses coupled spin networks to solve COPs. This work presents a first-of-its-kind coupled ring oscillator (ROSC)-based scalable probabilistic Ising computer to solve NP-hard COPs. An integrated coupled oscillator network was designed with 560 ROSCs that mimic a coupled spin network. Each ROSC can be coupled to any of its neighbors using a programmable back-to-back (B2B) inverter-based coupling mechanism. The ROSC-based spins and B2B inverter-based coupling were optimized to work under a wide range of system noise as well as voltage and temperature variations. One thousand randomly generated max-cut problems were mapped and solved in the hardware. The integrated Ising computer produced satisfactory solutions of max-cut problems when compared with commercial software running on a CPU. Experiments show that the integrated CMOS-based Ising computer can find the solution to NP-hard problems with an accuracy of 82%-100%. In addition, the repeated measurements of the same problem showed that the Ising computer can traverse through several local minima to find high-quality solutions under various voltage and temperature variation conditions. The experimental results show that ROSCs are a potential candidate for a dedicated hardware accelerator aiming to solve a wide range of COPs.
Article
The world's first 2 × 30k-spin multi-chip CMOS annealing processor (AP), based on the processing-in-memory approach for solving large-scale combinatorial optimization problems, was developed. To expand the bit width of coefficients and enhance the scalability of the AP, it has three key features: an expandable and high-accuracy spin operator for local communication, a highly integrated spin circuit using direct access to SRAM, and a low-latency inter-chip interface that does not affect the runtime or results of the annealing process. The AP is fabricated on the basis of 40-nm CMOS technology. It was experimentally demonstrated that the spin-flip ratio of the processor agrees well with theoretical values based on the Gibbs distribution over a wide temperature range. As a result, under two-chip operation with 2 × 30k spins, the AP achieves an annealing time of 22 μs, which is 455 times and 2.6 × 10^4 times faster than those achieved by our previous CMOS-AP and a conventional CPU, respectively. Moreover, its energy efficiency is 1.75 × 10^5 times higher than that of a conventional CPU-based algorithm.
Article
We discuss the entanglement generated by the quantum annealing processor in the thermal state. The quantum annealing processor is modeled using the spin-chain model. The system is solved analytically using the unitary operator method, and the generated correlations (von Neumann and Shannon entropies, and purity) are discussed. The effect of system parameters such as the coupling constant, coupling strength, and bias parameter on the dynamics of the generated entanglement is studied. It is shown that the system parameters can be used to control the entanglement.
Book
Combinatorial optimization algorithms are used in many applications including the design, management, and operations of communication networks. The objective of this book is to advance and promote the theory and applications of combinatorial optimization in communication networks. The book collects a distinguished set of papers on subjects such as wireless communication systems, satellite networks, optical networks, and ad hoc networks. The topics covered range from topology control, routing optimization, and resource allocation to QoS provisioning. It is the first book that integrates rich theory from operations research with cutting-edge research in communication networks. Audience The target audience for the work includes the researchers in the field of network design and optimization, graduate students and professors interested in networking and optimization research, as well as network design engineers. It is a handy reference book for researchers in networking and mathematical programming, also a suitable textbook for advanced courses in the theoretical aspects of networking.
Book
Stochastic local search (SLS) algorithms are among the most prominent and successful techniques for solving computationally difficult problems in many areas of computer science and operations research, including propositional satisfiability, constraint satisfaction, routing, and scheduling. SLS algorithms have also become increasingly popular for solving challenging combinatorial problems in many application areas, such as e-commerce and bioinformatics. Hoos and Stützle offer the first systematic and unified treatment of SLS algorithms. In this groundbreaking new book, they examine the general concepts and specific instances of SLS algorithms and carefully consider their development, analysis and application. The discussion focuses on the most successful SLS methods and explores their underlying principles, properties, and features. This book gives hands-on experience with some of the most widely used search techniques, and provides readers with the necessary understanding and skills to use this powerful tool. *Provides the first unified view of the field. *Offers an extensive review of state-of-the-art stochastic local search algorithms and their applications. *Presents and applies an advanced empirical methodology for analyzing the behavior of SLS algorithms. *A companion website offers lecture slides as well as source code and Java applets for exploring and demonstrating SLS algorithms.
Article
In the near future, the ability to solve combinatorial optimization problems will be a key technique to enable the IoT era. A new computing architecture called Ising computing and implemented using CMOS circuits is proposed. This computing maps the problems to an Ising model, a model to express the behavior of magnetic spins, and solves combinatorial optimization problems efficiently exploiting its intrinsic convergence properties. In the computing, “CMOS annealing” is used to find a better solution for the problems. A 20k-spin prototype Ising chip is fabricated in 65 nm process. The Ising chip achieves 100 MHz operation and its capability of solving combinatorial optimization problems using an Ising model is confirmed. The power efficiency of the chip can be estimated to be 1800 times higher than that of a general purpose CPU when running an approximation algorithm.
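The mapping described in this abstract (problem to Ising model, then annealing toward a low-energy state) can be sketched in software. The following is a minimal illustration only, not the chip's circuit-level method: it assumes the common convention H(s) = -Σ J_ij s_i s_j - Σ h_i s_i, uses max-cut as the example problem, and substitutes ordinary Metropolis simulated annealing for "CMOS annealing". All function names and parameters (`maxcut_to_ising`, `anneal`, `sweeps`, `t_start`, `t_end`) are hypothetical.

```python
import math
import random


def ising_energy(J, h, s):
    """H(s) = -sum_{i<j} J[i][j]*s_i*s_j - sum_i h[i]*s_i (assumed convention)."""
    n = len(s)
    e = 0.0
    for i in range(n):
        e -= h[i] * s[i]
        for j in range(i + 1, n):
            e -= J[i][j] * s[i] * s[j]
    return e


def maxcut_to_ising(n, edges):
    """Map weighted max-cut to Ising: J_ij = -w_ij, h = 0.

    With this choice, cut(s) = (W_total - H(s)) / 2, so minimizing H
    maximizes the cut. `edges` is a list of (i, j, weight) tuples.
    """
    J = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        J[i][j] -= w
        J[j][i] -= w
    return J, [0.0] * n


def cut_value(edges, s):
    """Total weight of edges whose endpoints have opposite spins."""
    return sum(w for i, j, w in edges if s[i] != s[j])


def anneal(J, h, sweeps=200, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis annealing with a geometric temperature schedule.

    Returns the lowest-energy spin configuration visited and its energy.
    """
    rng = random.Random(seed)
    n = len(h)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    energy = ising_energy(J, h, s)
    best, best_e = s[:], energy
    for k in range(sweeps):
        t = t_start * (t_end / t_start) ** (k / max(sweeps - 1, 1))
        for i in range(n):
            # Local field on spin i; flipping s_i changes H by 2*s_i*local.
            local = h[i] + sum(J[i][j] * s[j] for j in range(n) if j != i)
            delta = 2.0 * s[i] * local
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                s[i] = -s[i]
                energy += delta
                if energy < best_e:
                    best, best_e = s[:], energy
    return best, best_e
```

Random flips at high temperature let the state escape local minima, and the falling temperature makes the dynamics increasingly greedy. The result is a heuristic solution, so optimality is not guaranteed.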
Article
An algebraic method has been developed which allows one to engineer several energy levels including the low-energy subspace of interacting spin systems. By introducing ancillary qubits, this approach allows k-body interactions to be captured exactly using 2-body Hamiltonians. Our method works when all terms in the Hamiltonian share the same basis and has no dependence on perturbation theory or the associated large spectral gap. Our methods allow problem instance solutions to be embedded into the ground energy state of Ising spin systems. Adiabatic evolution might then be used to place a computational system into its ground state.
Article
Production planning and scheduling models arising in automated manufacturing environments exhibit several features not encountered in models developed for traditional production systems. For instance, models of automated facilities typically include tooling constraints which reflect the possibility for a machine to use different tools in order to perform successive operations, within limits imposed by the size of the tool magazine. Also, these models often account for the existence of flexible material handling systems whose activities must be synchronized with the machining operations in order to optimize system utilization. In this paper, we describe a few interesting combinatorial optimization problems proposed in this framework, we point to their relationships with models investigated in seemingly remote areas, and we identify a number of challenging open problems.
Article
In [Choi08], we introduced the notion of minor-embedding in adiabatic quantum optimization. A minor-embedding of a graph G in a quantum hardware graph U is a subgraph of U such that G can be obtained from it by contracting edges. In this paper, we describe the intertwined adiabatic quantum architecture design problem, which is to construct a hardware graph U that satisfies all known physical constraints and, at the same time, permits an efficient minor-embedding algorithm. We illustrate an optimal complete-graph-minor hardware graph. Given a family F of graphs, a (host) graph U is called F-minor-universal if for each graph G in F, U contains a minor-embedding of G. The problem for designing a F-minor-universal hardware graph U_{sparse} in which F consists of a family of sparse graphs (e.g., bounded degree graphs) is open.
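The minor-embedding definition above can be checked mechanically. The sketch below assumes the usual equivalent "chain" formulation, which is not spelled out in the abstract: each vertex of G is assigned a set of hardware vertices (a chain) in U, chains are disjoint, each chain is connected in U, and every edge of G is realized by at least one edge of U between the two chains. The function name `is_minor_embedding` and its argument layout are hypothetical.

```python
def is_minor_embedding(g_edges, u_edges, chains):
    """Check whether `chains` is a valid minor-embedding of G in U.

    g_edges: iterable of (p, q) edges of the problem graph G
    u_edges: iterable of (a, b) edges of the hardware graph U
    chains:  dict mapping each vertex of G to a collection of U vertices
    Note: U's vertex set is inferred from u_edges, so isolated hardware
    vertices are not represented in this sketch.
    """
    u_adj = {}
    for a, b in u_edges:
        u_adj.setdefault(a, set()).add(b)
        u_adj.setdefault(b, set()).add(a)

    # Chains must be non-empty, lie inside U, and be pairwise disjoint.
    used = set()
    for chain in chains.values():
        nodes = set(chain)
        if not nodes or not nodes <= set(u_adj) or used & nodes:
            return False
        used |= nodes

    # Each chain must induce a connected subgraph of U (depth-first search).
    for chain in chains.values():
        nodes = set(chain)
        start = next(iter(nodes))
        stack, reached = [start], {start}
        while stack:
            v = stack.pop()
            for w in u_adj[v]:
                if w in nodes and w not in reached:
                    reached.add(w)
                    stack.append(w)
        if reached != nodes:
            return False

    # Each edge of G needs at least one U edge between the two chains.
    for p, q in g_edges:
        target = set(chains[q])
        if not any(u_adj[v] & target for v in chains[p]):
            return False
    return True
```

For example, the triangle K3 embeds into a 4-cycle if one triangle vertex is assigned a two-qubit chain; contracting that chain's edge recovers K3.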
Article
We show that the NP-hard quadratic unconstrained binary optimization (QUBO) problem on a graph G can be solved using an adiabatic quantum computer that implements an Ising spin-1/2 Hamiltonian, by reduction through minor-embedding of G in the quantum hardware graph U. There are two components to this reduction: embedding and parameter setting. The embedding problem is to find a minor-embedding G emb of a graph G in U, which is a subgraph of U such that G can be obtained from G emb by contracting edges. The parameter setting problem is to determine the corresponding parameters, qubit biases and coupler strengths, of the embedded Ising Hamiltonian. In this paper, we focus on the parameter setting problem. As an example, we demonstrate the embedded Ising Hamiltonian for solving the maximum independent set (MIS) problem via adiabatic quantum computation (AQC) using an Ising spin-1/2 system. We close by discussing several related algorithmic problems that need to be investigated in order to facilitate the design of adiabatic algorithms and AQC architectures.
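The step from a QUBO to Ising parameters (the biases and coupler strengths that "parameter setting" refers to, before any embedding) follows from the substitution x_i = (1 + s_i)/2. The sketch below is an illustration under stated assumptions, not the paper's procedure: it uses the sign convention E(s) = Σ J_ij s_i s_j + Σ h_i s_i + offset, stores the QUBO as a dictionary keyed by index pairs, and the helper names are hypothetical. The MIS objective in the usage note is the standard penalty formulation, assumed here for illustration.

```python
from itertools import product


def qubo_energy(Q, x):
    """QUBO objective sum_{(i,j)} Q[i,j]*x_i*x_j with x_i in {0, 1}.

    Diagonal entries Q[i,i] act as linear terms because x_i**2 == x_i.
    """
    return sum(q * x[i] * x[j] for (i, j), q in Q.items())


def qubo_to_ising(Q, n):
    """Substitute x_i = (1 + s_i)/2 and collect terms.

    Returns (J, h, offset) such that, for s_i = 2*x_i - 1,
    qubo_energy(Q, x) == sum J_ij*s_i*s_j + sum h_i*s_i + offset.
    """
    J, h, offset = {}, [0.0] * n, 0.0
    for (i, j), q in Q.items():
        if i == j:
            h[i] += q / 2
            offset += q / 2
        else:
            key = (min(i, j), max(i, j))
            J[key] = J.get(key, 0.0) + q / 4
            h[i] += q / 4
            h[j] += q / 4
            offset += q / 4
    return J, h, offset


def ising_value(J, h, offset, s):
    """Evaluate E(s) under the plus-sign convention assumed in this sketch."""
    e = offset + sum(hi * si for hi, si in zip(h, s))
    e += sum(c * s[i] * s[j] for (i, j), c in J.items())
    return e


def brute_force_ground_state(J, h, offset, n):
    """Exhaustively find a minimum-energy spin assignment (small n only)."""
    return min(product((-1, 1), repeat=n),
               key=lambda s: ising_value(J, h, offset, s))
```

As a usage example, maximum independent set on the path 0-1-2 can be written as the QUBO `{(0, 0): -1, (1, 1): -1, (2, 2): -1, (0, 1): 2, (1, 2): 2}` (reward for each selected vertex, penalty 2 for each selected edge). Converting it and calling `brute_force_ground_state` selects the two end vertices.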
GPU based parallel Ising computing for combinatorial optimization problems in VLSI physical design
  • C Cook
  • H Zhao
  • T Sato
  • M Hiromoto
  • X.-D Tan