Conference Paper

CTLE-Ising: A 1440-Spin Continuous-Time Latch-Based Ising Machine with One-Shot Fully-Parallel Spin Updates Featuring Equalization of Spin States

Authors:

... Unconventional computing methods [1][2][3][4][5][6][7][8][9][10][11][12][13][14] are increasingly being used to tackle challenging problems, such as combinatorial optimization problems (COPs), that are difficult to solve using conventional deterministic computers due to their insufficient search speeds [5]. Attempts have been made to use quantum computers [3,4] to solve COPs by manipulating the probability of quantum bits (qubits) in an adiabatic quantum computation [2] (AQC) method. ...
... Therefore, different static currents should be delicately chosen for each MTJ-based p-bit, which makes them less practical for large-scale probabilistic computing implementation. Similarly, fully complementary metal-oxide-semiconductor (CMOS) Ising machines [13,14] that use non-digitally coupled spins to reduce hardware area have also been introduced. However, their non-digitally coupled structures strictly limit the coupling distances and resolutions between spins [12]. ...
Preprint
Full-text available
Probabilistic computing—quantum-inspired computing that uses probabilistic bits (p-bits)—has emerged as a powerful method owing to its fast search speed and robust connectivity. Previous works used linear feedback shift registers (LFSRs) or stochastic magnetic tunnel junctions (MTJs) to implement p-bits. However, in large-scale problems, periodicity and correlation issues in LFSR p-bits and inherent variations in MTJ-based p-bits with narrow stochastic regions lead to unreliable results when seeking the appropriate solution. Therefore, we propose a fully CMOS frequency-scalable p-bit implemented with a discrete-time flipped-hook tent-map chaotic oscillator. The proposed chaotic oscillator produces high-quality noise voltage that is uniformly distributed across the entire supply voltage range, enabling aligned responses of p-bits free from calibration and an input resolution of 8 bits. In contrast to LFSR-based p-bits with hardware-dependent correlation, the chaotic oscillator p-bits could factorize semiprimes with lengths up to 64 bits without changing hardware size. The chaotic oscillator exhibited an energy efficiency of 4.26 pJ/bit at 1.8 V supply voltage. The robustness and the high randomness of the proposed chaotic oscillator p-bit suggest a new direction of a p-bit scalable to large-scale probabilistic computing.
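The p-bit behaviour described in this abstract can be illustrated with a minimal Python sketch. It uses the commonly cited p-bit rule (output +1 when tanh of the input plus uniform noise is non-negative) and is not the authors' circuit: the names `pbit_update` and `mean_state` are hypothetical, and Python's pseudo-random generator merely stands in for the uniformly distributed noise that the abstract attributes to the chaotic oscillator.

```python
import math
import random


def pbit_update(input_signal, rng):
    """Return +1 or -1; the probability of +1 is (1 + tanh(input_signal)) / 2."""
    # Assumption: uniform noise in [-1, 1] stands in for the chaotic oscillator output.
    noise = rng.uniform(-1.0, 1.0)
    return 1 if math.tanh(input_signal) + noise >= 0 else -1


def mean_state(input_signal, samples=20000, seed=0):
    """Average p-bit output over many updates; approaches tanh(input_signal)."""
    rng = random.Random(seed)
    return sum(pbit_update(input_signal, rng) for _ in range(samples)) / samples
```

With zero input the average output is close to zero (an unbiased coin), while a strongly positive or negative input pins the p-bit near +1 or -1.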
... However, the practical application of quantum annealing faces significant challenges, such as extremely low-temperature operating conditions and excessive power consumption. For these reasons, low-power CMOS-based Ising machines have been researched to achieve practical and energy-efficient spin computing acceleration [4][5][6]. ...
Article
Full-text available
The Ising spin model is an efficient method for solving combinatorial optimization problems (COPs) but faces challenges in conventional von Neumann architectures due to high computational costs, especially with the growing data volume in the IoT era. To address this problem, we propose a low-power CMOS stochastic-bit-based Ising machine to efficiently compute COPs. By adopting a compute-in-memory (CIM) approach for parallel spin computation, we achieve energy-efficient spin computing. Furthermore, we harness the inherent randomness of the CMOS stochastic bit to prevent the Ising computing process from getting stuck in local minima, effectively mitigating the power penalty associated with the random number generators (RNGs) in conventional CMOS-based Ising machines. We demonstrate the feasibility of our design by solving an NP-complete graph coloring problem with four vertices and three colors using the TSMC 65 nm GP process. Moreover, the proposed CMOS stochastic-bit-based spin unit consumes the lowest power per spin among state-of-the-art Ising machines, with a power/spin of 1.07 μW and an energy/spin of 107 fJ.
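As a quick consistency check on the two figures quoted above, and assuming both refer to the same operating point and that energy per spin is simply power per spin multiplied by the time of one spin operation (the abstract does not state this), the implied operation time is:

```latex
\[
t = \frac{E_{\text{spin}}}{P_{\text{spin}}}
  = \frac{107\ \text{fJ}}{1.07\ \mu\text{W}}
  = \frac{107 \times 10^{-15}\ \text{J}}{1.07 \times 10^{-6}\ \text{W}}
  = 1.0 \times 10^{-7}\ \text{s}
  = 100\ \text{ns}.
\]
```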
... In this section, we will make a comparison of the performance of PIMs and Ising machines built with non-photonic components. Table III presents representative Ising machines built with CMOS circuitry [116,198,204-206,208], SMTJs [41,43,213], and memristors [113]. As power consumption is typically not specified in photonic research, Table IV primarily focuses on the performance of Ising-related characteristics of PIMs. ...
Article
Full-text available
The demand for efficient solvers of complicated combinatorial optimization problems, especially those classified as NP-complete or NP-hard, has recently led to increased exploration of novel computing architectures. One prominent collective state computing paradigm embodied in the so-called Ising machines has recently attracted considerable research attention due to its ability to optimize complex problems with large numbers of interacting variables. Ising model-inspired solvers, thus named due to mathematical similarities to the well-known model from solid-state physics, represent a promising alternative to traditional von Neumann computer architectures due to their high degree of inherent parallelism. While there are many possible physical realizations of Ising solvers, just as there are many possible implementations of any binary computer, photonic Ising machines (PIMs) use primarily optical components for computation, taking advantage of features like lower power consumption, fast calculation speeds, the leveraging of physical optics to perform the calculations themselves, and decent scalability and noise tolerance. Photonic computing in the form of PIMs may offer certain computational advantages that are not easily achieved with non-photonic approaches and is nonetheless an altogether fascinating application of photonics to computing. In this review, we provide an overview of Ising machines generally, introducing why they are useful, what types of problems they can tackle, and how different Ising solvers can be compared and benchmarked. We delineate their various operational mechanisms, advantages, and limitations vis-à-vis non-photonic Ising machines. We describe their scalability, interconnectivity, performance, and physical dimensions. As research in PIMs continues to progress, there is a potential that photonic computing could well emerge as a way to handle large and challenging optimization problems across diverse domains. This review serves as a comprehensive resource for researchers and practitioners interested in understanding the capabilities and potential of PIMs in addressing such complex optimization problems.
... Ising model hardware, called an Ising machine, is a specific accelerator that aims to solve COPs more efficiently. A wide range of schemes for building Ising machines have been proposed on various technology platforms, including digital CMOS Ising annealers based on SRAM or registers [8], [9], [10], quantum annealers based on superconducting qubits [11], [12], [13], coherent optical Ising machines [14], [15], [16], CMOS ring oscillator (ROSC) Ising machines [17], [18], [19], and latch-based Ising machines [20]. Digital Ising machines have demonstrated flexible spin connections, large-scale spin integration, and high-precision interaction coefficients. However, they rely on an external source for random number generation to introduce stochasticity and require thousands of cycles to iterate and update because of their discrete-time properties. ...
Article
A range of quantum-, optical-, and CMOS-based approaches have been explored to solve nondeterministic polynomial-time hard (NP-hard) combinatorial optimization problems (COPs), of which we consider the ring oscillator (ROSC) coupled Ising machine to be highly promising. This work proposes a scalable ROSC-based Ising machine with capacity coupling and a phase drift eliminator. The coupling module, consisting of two MOSCAPs and several switch transistors, allows for nine coupling strengths, which can be arbitrarily configured into different weight resolutions, and shows resilience to capacitance variations. The phase drift eliminator is designed to alleviate the effect of intrinsic noise within the ROSC array, which ensures accurate readout of the spin state. The area of the readout circuit is only 32% of that of the previous phase-sampling circuit. We also propose a progressive annealing method inspired by quantum adiabatic annealing, which is easy to perform on ROSC-based Ising machines and does not require additional random number generators. After applying the progressive annealing method, an accuracy of up to 98% can be achieved for randomly generated contentious problems.
... A low-energy, highly robust TRNG is crucial in information security applications of edge devices for generating session keys, nonces, and initialization vectors [1]. In addition to supporting hardware security, TRNGs are becoming attractive for non-von Neumann computing architectures, such as neuromorphic computing [2], Ising machines [3], and stochastic computing [4], in which large amounts of random bits are required. ...
Article
This article presents a true random number generator (TRNG) that achieves high entropy generation across a wide voltage and temperature (VT) range (0.3–1.0 V, −40 °C to 110 °C) in a single latch-based entropy source (ES). In the ES, a static inverter selection technique to minimize the mismatch between the paired inverters and noise enhancement methods to increase the root mean square (rms) of the noise voltage (σ_n) are implemented for good randomness and robustness. In a 130-nm CMOS technology, the TRNG occupies 5343 μm² and consumes 0.116 pJ/bit at 0.3 V, including an on-chip von Neumann post-processing circuit. The cryptographic quality of the TRNG's output is verified by the National Institute of Standards and Technology (NIST) SP800-22 tests. Tolerance to noise injection of up to 325 mVpp is confirmed by a power supply frequency injection attack, and an equivalent 20-year life at 0.3 V, 25 °C is verified by an accelerated NBTI aging test.
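The von Neumann post-processing mentioned in this abstract is a standard debiasing step, sketched below in Python. This is the textbook software form of the technique, not the on-chip circuit; the function name `von_neumann_debias` is hypothetical.

```python
def von_neumann_debias(bits):
    """Debias a raw bit stream with the classic von Neumann extractor.

    Bits are consumed in non-overlapping pairs: (0, 1) emits 0, (1, 0) emits 1,
    and equal pairs (0, 0) or (1, 1) are discarded. A trailing unpaired bit is ignored.
    """
    output = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            output.append(first)  # the first bit of an unequal pair is the output
    return output
```

For independent raw bits with a fixed bias, the two unequal pairs are equally likely, so the output is unbiased at the cost of discarding at least half of the input.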
Article
Combinatorial optimization problems (COPs) are essential in various applications, including data clustering, supply chain management, and communication networks. Many real-world COPs are non-deterministic polynomial-time hard problems intractable using classical computers. Ising machine, the hardware accelerator based on the Ising model and annealing operation, has gained much attention as an alternative for solving COPs. The COPs are mapped to the Ising model, and their optimal/near-optimal solutions are explored by the intrinsic convergence property of the Ising machine. However, prior Ising machines based on locally connected spins have limitations in solving hard COPs due to significant overhead while mapping the Ising model to the inflexible hardware topology. In this work, we propose a scalable CMOS Ising machine with a network of flexible processing elements (PEs) to map and solve complex COPs with minimal overhead. The proposed Ising machine implements 256 PEs, where each PE is reconfigured to 1-to-4 spins with 28 spin interactions based on 8-bit coefficients. A 65-nm prototype chip has been fabricated, and a range of COPs have been mapped and solved, including max-cut and Boolean satisfiability problems.
Article
Recently, hardware accelerators based on the Ising model have gained ever-increasing interest by demonstrating their capabilities of solving complex decision and optimization problems that are intractable using classical computers such as CPUs and graphics processing units (GPUs). The problems are translated into combinatorial optimization problems (COPs) and mapped to the Ising machine, comprised of artificial spins interacting and naturally finding their optimal states. Recent discrete-time Ising machines operating at room temperature have demonstrated solving small-scale COPs while consuming orders of magnitude lower energy than prior quantum annealers; however, they have several limitations due to their discrete-time operations, bulky spins, and lack of compact random number generators. In this work, we propose a novel Ising machine with compact latch-based spin circuits operating in continuous time. The proposed continuous-time Ising machine finds solutions to COPs with fully parallel spin operations (couplings between latches), significantly reducing computing latency and energy consumption. Besides, the latch-based spins randomize or superpose their initial spin states to find better solutions with a lower Ising Hamiltonian (i.e., a key performance indicator (KPI) of the Ising machine). A 0.656 × 0.680 mm² test chip with a 40 × 36 latch-based spin array is fabricated using a 65 nm CMOS process. The proposed continuous-time latch-based spin with equalization (CTLE)-Ising achieves a 1000× speedup compared to the discrete-time Ising machine operating at 1 GHz when solving max-cut COPs while consuming 0.2–3 nJ using a 0.75–1.05 V core supply voltage.
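To make the link between max-cut and the Ising Hamiltonian concrete, here is a small Python sketch. It assumes the usual antiferromagnetic mapping, in which each edge of weight w contributes w·s_i·s_j to the energy, so that a lower Hamiltonian corresponds to a larger cut. The function names and the 4-vertex ring are illustrative only, and the brute-force search is a stand-in for the chip's parallel latch dynamics, which the abstract does not detail.

```python
from itertools import product


def ising_energy(edges, spins):
    """H = sum over edges (i, j, w) of w * s_i * s_j, with each spin in {-1, +1}."""
    return sum(w * spins[i] * spins[j] for i, j, w in edges)


def cut_value(edges, spins):
    """Total weight of edges whose endpoints carry opposite spins."""
    return sum(w for i, j, w in edges if spins[i] != spins[j])


def brute_force_ground_state(edges, n):
    """Exhaustively search all 2**n spin assignments (only feasible for tiny n)."""
    return min(product((-1, 1), repeat=n), key=lambda s: ising_energy(edges, s))


# Illustrative 4-vertex ring with unit weights (hypothetical example graph).
ring = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
best = brute_force_ground_state(ring, 4)
```

Under this mapping the two quantities are tied by cut = (W − H) / 2, where W is the total edge weight, which is why minimizing the Hamiltonian maximizes the cut.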
Article
The rapid advancement in genome sequencing technology has led to a significant increase in the number of genomic reads in recent years. Due to the immense size of reference genomes, which can be up to 3 billion bases, finding optimal solutions through approximate string matching proves to be computationally challenging. Current alignment algorithms address this by performing a preprocessing step to efficiently calculate likely matching regions and only aligning at the base level within these regions. This article demonstrates the acceleration of sorting and searching in memories, both crucial components of genome alignment algorithms. We designed a compute-in-memory (CIM) array using standard cells, which is capable of sorting datastreams blockwise, merging sorted blocks, as well as operating as a content addressable memory (CAM) while also being able to perform multiword logic operations. We address the problem of datasets not fitting into on-chip memory by reusing the CIM array for a merge sorting step, enabling arbitrarily sized sorting. Our 2.6-μm²/bit design, fabricated using 22-nm fully depleted silicon-on-insulator (FDSOI) technology, yields a throughput of up to 4.28 GB/s at f_max and 4.97 nJ/sort at the minimum energy point (MEP) when executing sort operations.
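The blockwise-sort-then-merge flow described in this abstract can be sketched in software as follows. This is only an analogy in Python for the data flow, not the CIM hardware; `blockwise_sort` and `block_size` are hypothetical names, with `block_size` playing the role of the on-chip array capacity.

```python
import heapq


def blockwise_sort(data, block_size):
    """Sort data that is larger than one block: sort each block, then merge the blocks."""
    # Step 1: sort each fixed-size block independently (what fits "on chip").
    blocks = [sorted(data[i:i + block_size]) for i in range(0, len(data), block_size)]
    # Step 2: merge the already sorted blocks into one sorted stream.
    return list(heapq.merge(*blocks))
```

Because the merge step only needs the current head of each sorted block, the total input size is not bounded by the block size.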
Article
We evaluate the racial bias in face recognition application programming interfaces (APIs) using real and deepfake celebrity images. We use deepfake generation methods to introduce small, imperceptible changes to the real images to shift the racial class of predictions, showing how deepfake images exacerbated racial bias in web-based face recognition APIs.
Article
Compute in memory (CIM) promises faster and lower power processing of data. Recently presented papers at the 2023 IEEE ISSCC gave some examples of how various semiconductor architectures can enable CIM devices for various computing applications.
Article
Full-text available
Many interesting but practically intractable problems can be reduced to that of finding the ground state of a system of interacting spins; however, finding such a ground state remains computationally difficult. It is believed that the ground state of some naturally occurring spin systems can be effectively attained through a process called quantum annealing. If it could be harnessed, quantum annealing might improve on known methods for solving certain types of problem. However, physical investigation of quantum annealing has been largely confined to microscopic spins in condensed-matter systems. Here we use quantum annealing to find the ground state of an artificial Ising spin system comprising an array of eight superconducting flux quantum bits with programmable spin-spin couplings. We observe a clear signature of quantum annealing, distinguishable from classical thermal annealing through the temperature dependence of the time at which the system dynamics freezes. Our implementation can be configured in situ to realize a wide variety of different spin networks, each of which can be monitored as it moves towards a low-energy configuration. This programmable artificial spin network bridges the gap between the theoretical study of ideal isolated spin networks and the experimental investigation of bulk magnetic samples. Moreover, with an increased number of spins, such a system may provide a practical physical means to implement a quantum algorithm, possibly allowing more-effective approaches to solving certain classes of hard combinatorial optimization problems.