Preprint

A Laser Spiking Neuron in a Photonic Integrated Circuit

Authors:
  • Luminous Computing

Abstract

There has been a recent surge of interest in implementing linear operations such as matrix multiplications using photonic integrated circuit technology. However, these approaches require an efficient and flexible way to perform nonlinear operations in the photonic domain. We have fabricated an optoelectronic nonlinear device, a laser neuron, that uses excitable laser dynamics to achieve biologically inspired spiking behavior. We demonstrate functionality with simultaneous excitation, inhibition, and summation across multiple wavelengths. We also demonstrate cascadability and compatibility with a wavelength-multiplexing protocol, both essential for larger-scale system integration. Laser neurons represent an important class of optoelectronic nonlinear processors that can complement both the enormous bandwidth density and the energy efficiency of photonic computing operations.
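The excitable dynamics behind such a laser neuron are commonly described by the Yamada model of a laser with a saturable absorber, in which a gain variable G, an absorption variable Q, and the cavity intensity I interact. Below is a minimal numerical sketch of that model; all parameter values and the pulse shapes are illustrative assumptions, not measurements of the fabricated device.

```python
import numpy as np

def yamada_step(G, Q, I, pump, dt, gG=0.05, gQ=0.05, gI=1.0,
                A=4.3, B=3.52, a=1.8, eps=1e-6):
    """One Euler step of the Yamada model: G is gain, Q is saturable
    absorption, I is intensity; eps seeds the field with weak noise."""
    dG = gG * (A + pump - G - G * I)
    dQ = gQ * (B - Q - a * Q * I)
    dI = gI * (G - Q - 1.0) * I + eps
    return G + dt * dG, Q + dt * dQ, I + dt * dI

def simulate(pulses, T=1000.0, dt=0.01):
    """Integrate the model; `pulses` is a list of (start, width, amplitude)
    perturbations added to the pump."""
    G, Q, I = 4.0, 3.5, 1e-6
    trace = np.empty(int(T / dt))
    for k in range(trace.size):
        t = k * dt
        pump = sum(amp for (t0, w, amp) in pulses if t0 <= t < t0 + w)
        G, Q, I = yamada_step(G, Q, I, pump, dt)
        trace[k] = I
    return trace

# A suprathreshold perturbation triggers a full spike; a weak one decays.
strong = simulate([(500.0, 30.0, 3.0)])
weak = simulate([(500.0, 30.0, 0.05)])
print(strong.max() > 1.0, weak.max() < 0.01)
```

The comparison illustrates the neuron-like thresholding: only a perturbation that pushes the cavity above net gain releases a full spike, after which the gain must recover before the device can fire again.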


Article
Full-text available
Software implementations of brain-inspired computing underlie many important computational tasks, from image processing to speech recognition, artificial intelligence and deep learning applications. Yet, unlike real neural tissue, traditional computing architectures physically separate the core computing functions of memory and processing, making fast, efficient and low-energy computing difficult to achieve. To overcome such limitations, an attractive alternative is to design hardware that mimics neurons and synapses. Such hardware, when connected in networks or neuromorphic systems, processes information in a way more analogous to brains. Here we present an all-optical version of such a neurosynaptic system, capable of supervised and unsupervised learning. We exploit wavelength division multiplexing techniques to implement a scalable circuit architecture for photonic neural networks, successfully demonstrating pattern recognition directly in the optical domain. Such photonic neurosynaptic networks promise access to the high speed and high bandwidth inherent to optical systems, thus enabling the direct processing of optical telecommunication and visual data.
Article
Full-text available
Spiking neural networks (SNNs) are inspired by information processing in biology, where sparse and asynchronous binary signals are communicated and processed in a massively parallel fashion. SNNs on neuromorphic hardware exhibit favorable properties such as low power consumption, fast inference, and event-driven information processing. This makes them interesting candidates for the efficient implementation of deep neural networks, the method of choice for many machine learning tasks. In this review, we address the opportunities that deep spiking networks offer and investigate in detail the challenges associated with training SNNs in a way that makes them competitive with conventional deep learning, but simultaneously allows for efficient mapping to hardware. A wide range of training methods for SNNs is presented, ranging from the conversion of conventional deep networks into SNNs, constrained training before conversion, spiking variants of backpropagation, and biologically motivated variants of STDP. The goal of our review is to define a categorization of SNN training methods, and summarize their advantages and drawbacks. We further discuss relationships between SNNs and binary networks, which are becoming popular for efficient digital hardware implementation. Neuromorphic hardware platforms have great potential to enable deep spiking networks in real-world applications. We compare the suitability of various neuromorphic systems that have been developed over the past years, and investigate potential use cases. Neuromorphic approaches and conventional machine learning should not be considered simply two solutions to the same classes of problems, instead it is possible to identify and exploit their task-specific advantages. Deep SNNs offer great opportunities to work with new types of event-based sensors, exploit temporal codes and local on-chip learning, and we have so far just scratched the surface of realizing these advantages in practical applications.
Article
Full-text available
Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry. Here we demonstrate mixed hardware-software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. We achieve generalization accuracies (on previously unseen data) equivalent to those of software-based training on various commonly used machine-learning test datasets (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100). The computational energy efficiency of 28,065 billion operations per second per watt and throughput per area of 3.6 trillion operations per second per square millimetre that we calculate for our implementation exceed those of today's graphical processing units by two orders of magnitude. This work provides a path towards hardware accelerators that are both fast and energy efficient, particularly on fully connected neural-network layers.
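The parallelized multiply-accumulate at the heart of this approach follows from Ohm's and Kirchhoff's laws: input voltages applied to crossbar rows are weighted by programmable conductances, and the resulting currents sum along columns. The sketch below illustrates the idea numerically; the differential-pair weight mapping, conductance range, and noise model are generic assumptions, not details of the phase-change hardware described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossbar_vmm(weights, x, g_min=1e-6, g_max=1e-4, sigma=0.0, rng=rng):
    """Analog vector-matrix multiply on a resistive crossbar (sketch).

    Each signed weight is stored as a differential pair of conductances
    (G+, G-) so it maps onto positive-only devices. Input x is applied as
    row voltages; column currents sum by Kirchhoff's current law.
    sigma models multiplicative device-to-device conductance variation."""
    w = np.asarray(weights, float)
    x = np.asarray(x, float)
    scale = (g_max - g_min) / np.abs(w).max()
    g_pos = g_min + scale * np.clip(w, 0, None)
    g_neg = g_min + scale * np.clip(-w, 0, None)
    if sigma > 0:  # programming noise on each device
        g_pos = g_pos * (1 + sigma * rng.standard_normal(g_pos.shape))
        g_neg = g_neg * (1 + sigma * rng.standard_normal(g_neg.shape))
    i_out = x @ (g_pos - g_neg)   # column currents, all rows in parallel
    return i_out / scale          # read out in weight units

W = np.array([[0.5, -1.0],
              [2.0,  0.25]])
x = np.array([1.0, 2.0])
ideal = x @ W
print(np.allclose(crossbar_vmm(W, x), ideal))  # noiseless read is exact
noisy = crossbar_vmm(W, x, sigma=0.02)         # variation degrades accuracy
```

The O(N) energy advantage cited above comes from the fact that all N row inputs are applied simultaneously, so the whole vector-matrix product costs one analog read.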
Article
Full-text available
Electronic and photonic technologies have transformed our lives, from computing and mobile devices to information technology and the internet. Our future demands in these fields require innovation in each technology separately, but also depend on our ability to harness their complementary physics through integrated solutions1,2. This goal is hindered by the fact that most silicon nanotechnologies, which enable our processors, computer memory, communications chips and image sensors, rely on bulk silicon substrates, a cost-effective solution with an abundant supply chain, but with substantial limitations for the integration of photonic functions. Here we introduce photonics into bulk silicon complementary metal-oxide-semiconductor (CMOS) chips using a layer of polycrystalline silicon deposited on silicon oxide (glass) islands fabricated alongside transistors. We use this single deposited layer to realize optical waveguides and resonators, high-speed optical modulators and sensitive avalanche photodetectors. We integrated this photonic platform with a 65-nanometre-transistor bulk CMOS process technology inside a 300-millimetre-diameter-wafer microelectronics foundry. We then implemented integrated high-speed optical transceivers in this platform that operate at ten gigabits per second, composed of millions of transistors, and arrayed on a single optical bus for wavelength division multiplexing, to address the demand for high-bandwidth optical interconnects in data centres and high-performance computing3,4. By decoupling the formation of photonic devices from that of transistors, this integration approach can achieve many of the goals of multi-chip solutions5, but with the performance, complexity and scalability of 'systems on a chip'1,6-8. As transistors smaller than ten nanometres across become commercially available9, and as new nanotechnologies emerge10,11, this approach could provide a way to integrate photonics with state-of-the-art nanoelectronics.
Article
Full-text available
Photonic neural networks have the potential to revolutionize the speed, energy efficiency and throughput of modern computing—and to give Moore’s law–style scaling a new lease on life.
Article
Full-text available
The progress in the field of neural computation hinges on the use of hardware more efficient than conventional microprocessors. Recent works have shown that mixed-signal integrated memristive circuits, especially their passive ('0T1R') variety, may increase neuromorphic network performance dramatically, leaving far behind their digital counterparts. The major obstacle, however, is the relative immaturity of memristor technology, so only limited functionality has been demonstrated to date. Here we experimentally demonstrate operation of a one-hidden-layer perceptron classifier entirely in mixed-signal integrated hardware, comprised of two passive 20x20 metal-oxide memristive crossbar arrays, board-integrated with discrete CMOS components. The demonstrated multilayer perceptron network, whose complexity is almost 10x higher than that of previously reported functional neuromorphic classifiers based on passive memristive circuits, achieves classification fidelity within 3 percent of that obtained in simulations when using an ex-situ training approach. The successful demonstration was facilitated by improvements in the fabrication technology of memristors, specifically by lowering variations in their I-V characteristics.
Article
Full-text available
The bandwidth requirement of wireline communications has increased exponentially because of the ever-increasing demand for data centers and high-performance computing systems. However, it is becoming difficult to satisfy this requirement with legacy electrical links, which suffer from frequency-dependent losses due to skin effects, dielectric losses, channel reflections, and crosstalk, resulting in a severe bandwidth limitation. To overcome this challenge, it is necessary to bring optical communication technology, which has mainly been used for long-reach communications such as long-haul networks and metropolitan area networks, to medium- and short-reach communication systems. However, important issues remain to be resolved to facilitate the adoption of optical technologies. The most critical challenges are energy efficiency and cost competitiveness compared to legacy copper-based electrical communications. One possible solution is silicon photonics, which has long been investigated by a number of research groups. Despite the inherent incompatibility of silicon with the photonic world, silicon photonics is promising and is the only solution that can leverage mature complementary metal-oxide-semiconductor (CMOS) technologies. Silicon photonics can be utilized not only in wireline communications but also in countless sensor applications. This paper first presents a brief review of silicon photonics and then describes the history, overview, and categorization of CMOS IC technology for high-speed photodetection, without enumerating complex circuit expressions and terminology.
Article
Full-text available
Photonic systems for high-performance information processing have attracted renewed interest. Neuromorphic silicon photonics has the potential to integrate processing functions that vastly exceed the capabilities of electronics. We report first observations of a recurrent silicon photonic neural network, in which connections are configured by microring weight banks. A mathematical isomorphism between the silicon photonic circuit and a continuous neural network model is demonstrated through dynamical bifurcation analysis. Exploiting this isomorphism, a simulated 24-node silicon photonic neural network is programmed using a 'neural compiler' to solve a differential system emulation task. A 294-fold acceleration against a conventional benchmark is predicted. We also propose and derive a power consumption analysis for modulator-class neurons that, as opposed to laser-class neurons, are compatible with silicon photonic platforms. At increased scale, neuromorphic silicon photonics could access new regimes of ultrafast information processing for radio, control, and scientific computing.
Conference Paper
Full-text available
Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU) --- deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X -- 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X -- 80X higher. Moreover, using the CPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
Article
Full-text available
Previous studies have shown that spike-timing-dependent plasticity (STDP) can be used in spiking neural networks (SNN) to extract visual features of low or intermediate complexity in an unsupervised manner. These studies, however, used relatively shallow architectures, and only one layer was trainable. Another line of research has demonstrated, using rate-based neural networks trained with back-propagation, that having many layers increases the recognition robustness, an approach known as deep learning. We thus designed a deep SNN, comprising several convolutional (trainable with STDP) and pooling layers. We used a temporal coding scheme where the most strongly activated neurons fire first, and less activated neurons fire later or not at all. The network was exposed to natural images. Thanks to STDP, neurons progressively learned features corresponding to prototypical patterns that were both salient and frequent. Only a few tens of examples per category were required and no label was needed. After learning, the complexity of the extracted features increased along the hierarchy, from edge detectors in the first layer to object prototypes in the last layer. Coding was very sparse, with only a few thousand spikes per image, and in some cases the object category could be reasonably well inferred from the activity of a single higher-order neuron. More generally, the activity of a few hundred such neurons contained robust category information, as demonstrated using a classifier on the Caltech 101, ETH-80, and MNIST databases. We think that the combination of STDP with latency coding is key to understanding the way the primate visual system learns, its remarkable processing speed and its low energy consumption. These mechanisms are also interesting for artificial vision systems, particularly for hardware solutions.
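The pair-based STDP rule used in such networks potentiates a synapse when a presynaptic spike shortly precedes a postsynaptic one and depresses it for the reverse ordering, with exponentially decaying timing windows. A minimal sketch with illustrative amplitudes and time constants (not the parameters of the work above):

```python
import numpy as np

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP kernel; times in ms, parameters illustrative.
    dt = t_post - t_pre: potentiate when the presynaptic spike precedes
    the postsynaptic spike, depress otherwise."""
    if dt >= 0:
        return a_plus * np.exp(-dt / tau_plus)
    return -a_minus * np.exp(dt / tau_minus)

def apply_stdp(w, pre_spikes, post_spikes, w_min=0.0, w_max=1.0, **kw):
    """Accumulate all-to-all pairwise updates for one synapse, then clip."""
    for tp in pre_spikes:
        for tq in post_spikes:
            w += stdp_dw(tq - tp, **kw)
    return float(np.clip(w, w_min, w_max))

# Causal pairing (pre 2 ms before post) strengthens the synapse,
# while the reversed ordering weakens it.
w_up = apply_stdp(0.5, pre_spikes=[10.0], post_spikes=[12.0])
w_down = apply_stdp(0.5, pre_spikes=[12.0], post_spikes=[10.0])
print(w_up > 0.5 > w_down)
```

Because each update depends only on local spike times, the rule needs no labels, which is what enables the unsupervised layer-by-layer feature learning described above.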
Article
Full-text available
Neuromorphic computing covers a diverse range of approaches to information processing, all of which demonstrate some degree of neurobiological inspiration that differentiates them from mainstream conventional computing systems. The philosophy behind neuromorphic computing has its origins in the seminal work carried out by Carver Mead at Caltech in the late 1980s. This early work influenced others to carry developments forward, and advances in VLSI technology supported steady growth in the scale and capability of neuromorphic devices. Recently, a number of large-scale neuromorphic projects have emerged, taking the approach to unprecedented scales and capabilities. These large-scale projects are associated with major new funding initiatives for brain-related research, creating a sense that the time and circumstances are right for progress in our understanding of information processing in the brain. In this review we present a brief history of neuromorphic engineering, then focus on some of the principal current large-scale projects, their main features, how their approaches are complementary and distinct, their advantages and drawbacks, and highlight the sorts of capabilities that each can deliver to neural modellers.
Article
Full-text available
The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
Article
Full-text available
Fully exploiting the silicon photonics platform for large-volume, cost-sensitive applications requires a fundamentally new approach to directly integrate high-performance laser sources using wafer-scale fabrication methods. Direct-bandgap III–V semiconductors allow efficient light generation, but the large mismatch in lattice constant, thermal expansion and crystal polarity makes their epitaxial growth directly on silicon extremely complex. Using a selective-area growth technique in confined regions, we surpass this fundamental limit and demonstrate an optically pumped InP-based distributed feedback laser array monolithically grown on (001)-silicon operating at room temperature and suitable for wavelength-division-multiplexing applications. The novel epitaxial technology suppresses threading dislocations and anti-phase boundaries to a less than 20-nm-thick layer, which does not affect device performance. Using an in-plane laser cavity defined using standard top-down lithographic patterning together with a high yield and high uniformity provides scalability and a straightforward path towards cost-effective co-integration with silicon photonic and electronic circuits.
Article
Full-text available
We review recent advances in the field of quantum dot lasers on silicon. A summary of device performance, reliability, and comparison with similar quantum well lasers grown on silicon will be presented. We consider the possibility of scalable, low size, weight, and power nanolasers grown on silicon enabled by quantum dot active regions for future short-reach silicon photonics interconnects.
Article
Full-text available
We investigate a photonic regenerative memory based upon a neuromorphic oscillator with a delayed self-feedback (autaptic) connection. We disclose the existence of a unique temporal response characteristic of localized structures enabling an ideal support for bits in an optical buffer memory for storage and reshaping of data information. We link our experimental implementation, based upon a nanoscale nonlinear resonant tunneling diode driving a laser, to the paradigm of neuronal activity, the FitzHugh-Nagumo model with delayed feedback. This proof-of-concept photonic regenerative memory might constitute a building block for a new class of neuron-inspired photonic memories that can handle high bit-rate optical signals.
Article
Full-text available
Engineering the electromagnetic environment of a nanometre-scale light emitter by use of a photonic cavity can significantly enhance its spontaneous emission rate, through cavity quantum electrodynamics in the Purcell regime. This effect can greatly reduce the lasing threshold of the emitter, providing a low-threshold laser system with small footprint, low power consumption and ultrafast modulation. An ultralow-threshold nanoscale laser has been successfully developed by embedding quantum dots into a photonic crystal cavity (PCC). However, several challenges impede the practical application of this architecture, including the random positions and compositional fluctuations of the dots, extreme difficulty in current injection, and lack of compatibility with electronic circuits. Here we report a new lasing strategy: an atomically thin crystalline semiconductor-that is, a tungsten diselenide monolayer-is non-destructively and deterministically introduced as a gain medium at the surface of a pre-fabricated PCC. A continuous-wave nanolaser operating in the visible regime is thereby achieved with an optical pumping threshold as low as 27 nanowatts at 130 kelvin, similar to the value achieved in quantum-dot PCC lasers. The key to the lasing action lies in the monolayer nature of the gain medium, which confines direct-gap excitons to within one nanometre of the PCC surface. The surface-gain geometry gives unprecedented accessibility and hence the ability to tailor gain properties via external controls such as electrostatic gating and current injection, enabling electrically pumped operation. Our scheme is scalable and compatible with integrated photonics for on-chip optical communication technologies.
Article
Full-text available
Despite all the progress of semiconductor integrated circuit technology, the extreme complexity of the human cerebral cortex makes the hardware implementation of neuromorphic networks with a comparable number of devices exceptionally challenging. One of the most promising candidates to provide comparable complexity, while operating much faster and with manageable power dissipation, are so-called CrossNets based on hybrid CMOS/memristor circuits. In these circuits, the usual CMOS stack is augmented with one or several crossbar layers, with adjustable two-terminal memristors at each crosspoint. Recently, there has been significant progress in the fabrication technology of such memristive crossbars and their integration with CMOS circuits, including first demonstrations of their vertical integration. Separately, there have been several demonstrations of discrete memristors as artificial synapses for neuromorphic networks. Very recently such experiments were extended to crossbar arrays of phase-change memristive devices. The adjustment of such devices, however, requires an additional transistor at each crosspoint, and hence the prospects of their scaling are less impressive than those of metal-oxide memristors, whose nonlinear I-V curves enable transistor-free operation. Here we report the first experimental implementation of a transistor-free metal-oxide memristor crossbar with device variability lowered sufficiently to demonstrate successful operation of a simple integrated neural network, a single-layer perceptron. The network could be taught in situ using a coarse-grain variety of the delta-rule algorithm to perform perfect classification of 3x3-pixel black/white images into 3 classes. We believe this demonstration is an important step towards the implementation of much larger and more complex memristive neuromorphic networks.
Article
Full-text available
Silicon photonics has emerged as the leading candidate for implementing ultralow-power wavelength-division-multiplexed communication networks in high-performance computers, yet current components (lasers, modulators, filters and detectors) consume too much power for the high-speed femtojoule-class links that ultimately will be required. Here we demonstrate and characterize the first modulator to achieve simultaneous high-speed (25 Gb s⁻¹), low-voltage (0.5 VPP) and efficient 0.9 fJ per bit error-free operation. This low-energy high-speed operation is enabled by a record electro-optic response, obtained in a vertical p-n junction device that at 250 pm V⁻¹ (30 GHz V⁻¹) is up to 10 times larger than prior demonstrations. In addition, this record electro-optic response is used to compensate for thermal drift over a 7.5 °C temperature range with little additional energy consumption (0.24 fJ per bit, for a total energy consumption below 1.03 fJ per bit). The combined results of highly efficient modulation and electro-optic thermal compensation represent a new paradigm in modulator development and a major step towards single-digit femtojoule-class communications.
Article
Full-text available
We report on experimental evidence of neuronlike excitable behavior in a micropillar laser with saturable absorber. We show that under a single pulsed perturbation the system exhibits subnanosecond response pulses and analyze the role of the laser bias pumping. Under a double pulsed excitation we study the absolute and relative refractory periods, similarly to what can be found in neural excitability, and interpret the results in terms of a dynamical inhibition mediated by the carrier dynamics. These measurements shed light on the analogy between optical and biological neurons and pave the way to fast spike-time coding based optical systems with a speed several orders of magnitude faster than their biological or electronic counterparts.
Article
Full-text available
In today's age, companies employ machine learning to extract information from large quantities of data. One of those techniques, reservoir computing (RC), is a decade old and has achieved state-of-the-art performance for processing sequential data. Dedicated hardware realizations of RC could enable speed gains and power savings. Here we propose the first integrated passive silicon photonics reservoir. We demonstrate experimentally and through simulations that, thanks to the RC paradigm, this generic chip can be used to perform arbitrary Boolean logic operations with memory as well as 5-bit header recognition up to 12.5 Gbit s⁻¹, without power consumption in the reservoir. It can also perform isolated spoken digit recognition. Our realization exploits optical phase for computing. It is scalable to larger networks and much higher bitrates, up to speeds >100 Gbit s⁻¹. These results pave the way for the application of integrated photonic RC for a wide range of applications.
Article
Full-text available
When light interacts with metal nanostructures, it can couple to free-electron excitations near the metal surface. The electromagnetic resonances associated with these surface plasmons depend on the details of the nanostructure, opening up opportunities for controlling light confinement on the nanoscale. The resulting strong electromagnetic fields allow weak nonlinear processes, which depend superlinearly on the local field, to be significantly enhanced. In addition to providing enhanced nonlinear effects with ultrafast response times, plasmonic nanostructures allow nonlinear optical components to be scaled down in size. In this Review, we discuss the principles of nonlinear plasmonic effects and present an overview of their main applications, including frequency conversion, switching and modulation of optical signals, and soliton effects.
Article
Full-text available
The increasing demands on information processing require novel computational concepts and true parallelism. Nevertheless, hardware realizations of unconventional computing approaches never exceeded a marginal existence. While the application of optics in super-computing receives reawakened interest, new concepts, partly neuro-inspired, are being considered and developed. Here we experimentally demonstrate the potential of a simple photonic architecture to process information at unprecedented data rates, implementing a learning-based approach. A semiconductor laser subject to delayed self-feedback and optical data injection is employed to solve computationally hard tasks. We demonstrate simultaneous spoken digit and speaker recognition and chaotic time-series prediction at data rates beyond 1 Gbyte/s. We identify all digits with very low classification errors and perform chaotic time-series prediction with 10% error. Our approach bridges the areas of photonic information processing, cognitive and information science.
Article
Full-text available
The past decade has seen rapid progress in research into high-performance Ge-on-Si photodetectors. Owing to their excellent optoelectronic properties, which include high responsivity from visible to near-infrared wavelengths, high bandwidths and compatibility with silicon complementary metal–oxide–semiconductor circuits, these devices can be monolithically integrated with silicon-based read-out circuits for applications such as high-performance photonic data links and infrared imaging at low cost and low power consumption. This Review summarizes the major developments in Ge-on-Si photodetectors, including epitaxial growth and strain engineering, free-space and waveguide-integrated devices, as well as recent progress in Ge-on-Si avalanche photodetectors.
Article
Full-text available
We examine the current performance and future demands of interconnects to and on silicon chips. We compare electrical and optical interconnects and project the requirements for optoelectronic and optical devices if optics is to solve the major problems of interconnects for future high-performance silicon chips. Optics has potential benefits in interconnect density, energy, and timing. The necessity of low interconnect energy imposes low limits especially on the energy of the optical output devices, with a ~ 10 fJ/bit device energy target emerging. Some optical modulators and radical laser approaches may meet this requirement. Low (e.g., a few femtofarads or less) photodetector capacitance is important. Very compact wavelength splitters are essential for connecting the information to fibers. Dense waveguides are necessary on-chip or on boards for guided wave optical approaches, especially if very high clock rates or dense wavelength-division multiplexing (WDM) is to be avoided. Free-space optics potentially can handle the necessary bandwidths even without fast clocks or WDM. With such technology, however, optics may enable the continued scaling of interconnect capacity required by future chips.
Article
Spiking neural networks enable efficient information processing in real time. Excitable lasers can exhibit ultrafast spiking dynamics. When preceded by a photodetector, in an O/E/O link, they can process optical spikes at different wavelengths and thus can be interconnected into large neural networks. Here, we experimentally demonstrate and numerically simulate the spiking dynamics of a laser neuron fabricated in a photonic integrated circuit. Our spiking laser neuron is shown to perform coincidence detection with nanosecond time resolution, and we observe refractory periods on the order of 0.1 ns. We propose a method to implement XOR classification using our laser neurons, and simulations of the resultant dynamics indicate robust tolerance to timing jitter.
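Coincidence detection of this kind can be approximated by a leaky integrate-and-fire model: each input spike injects charge that leaks away with a short time constant, so the threshold is crossed only when two spikes arrive within roughly one leak time of each other. The sketch below is a simplified stand-in with made-up constants, not a simulation of the laser cavity itself.

```python
def lif_coincidence(spike_times_a, spike_times_b, T=20.0, dt=0.001,
                    tau=0.2, w=0.6, threshold=1.0):
    """Leaky integrate-and-fire stand-in for coincidence detection
    (all constants illustrative, times in ns). Each input spike injects
    charge w; the membrane leaks with time constant tau, so threshold is
    only crossed when the two inputs arrive close together in time."""
    v = 0.0
    out = []
    spikes = sorted(list(spike_times_a) + list(spike_times_b))
    idx = 0
    for k in range(int(T / dt)):
        t = k * dt
        v *= (1 - dt / tau)          # leak
        while idx < len(spikes) and spikes[idx] <= t:
            v += w                    # instantaneous charge injection
            idx += 1
        if v >= threshold:
            out.append(t)
            v = 0.0                   # reset (refractory period omitted)
    return out

# Spikes 0.05 ns apart coincide and elicit an output spike; 1 ns apart do not.
print(len(lif_coincidence([5.0], [5.05])) == 1)
print(len(lif_coincidence([5.0], [6.0])) == 0)
```

The same structure extends naturally to the XOR scheme mentioned above, where the relative timing of spikes from two input neurons determines whether a downstream neuron fires.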
Article
Modern computers are based on the von Neumann architecture in which computation and storage are physically separated: data are fetched from the memory unit, shuttled to the processing unit (where computation takes place) and then shuttled back to the memory unit to be stored. The rate at which data can be transferred between the processing unit and the memory unit represents a fundamental limitation of modern computers, known as the memory wall. In-memory computing is an approach that attempts to address this issue by designing systems that compute within the memory, thus eliminating the energy-intensive and time-consuming data movement that plagues current designs. Here we review the development of in-memory computing using resistive switching devices, where the two-terminal structure of the devices, their resistive switching properties, and direct data processing in the memory can enable area- and energy-efficient computation. We examine the different digital, analogue, and stochastic computing schemes that have been proposed, and explore the microscopic physical mechanisms involved. Finally, we discuss the challenges in-memory computing faces, including the required scaling characteristics, in delivering next-generation computing. This Review Article examines the development of in-memory computing using resistive switching devices.
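The in-memory multiply-accumulate that resistive crossbars perform can be sketched as a vector-matrix product over conductances; a schematic version (all voltage and conductance values below are illustrative):

```python
import numpy as np

def crossbar_vmm(voltages, conductances):
    """Analog vector-matrix multiply in a resistive crossbar.

    Row voltages V_i drive columns of conductance G_ij; by Ohm's and
    Kirchhoff's laws, the current collected on column j is
    I_j = sum_i V_i * G_ij -- the multiply-accumulate happens inside
    the memory array rather than in a separate processing unit.
    """
    return np.asarray(voltages, dtype=float) @ np.asarray(conductances, dtype=float)

V = [0.2, 0.5, 0.1]              # input vector (volts)
G = [[1e-6, 2e-6],
     [3e-6, 1e-6],
     [2e-6, 4e-6]]               # stored weights (siemens)
I = crossbar_vmm(V, G)           # column currents: [1.9e-6, 1.3e-6] A
```

Because the data (conductances) never move, the energy-intensive shuttling between memory and processor that the abstract describes is avoided.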
Conference Paper
A nanocavity-based modulator is closely integrated with a nano-photodetector to form an ultracompact O-E-O converter. Femtojoule-scale operating energy demonstrates, for the first time, optoelectronic integrability with a femtofarad-scale capacitance.
Article
Weighted addition is an elemental multi-input to single-output operation that can be implemented with high-performance photonic devices. Microring (MRR) weight banks bring programmable weighted addition to silicon photonics. Prior work showed that their channel limits are affected by coherent inter-channel effects that occur uniquely in weight banks. We fabricate two-pole designs that exploit this inter-channel interference in a way that is robust to dynamic tuning and fabrication variation. Scaling analysis predicts a channel count improvement of 3.4-fold, which is substantially greater than predicted by incoherent analysis used in conventional MRR devices. Advances in weight bank design expand the potential of reconfigurable analog photonic networks and multivariate microwave photonics.
Article
This comprehensive review summarizes state of the art, challenges, and prospects of the neuro-inspired computing with emerging nonvolatile memory devices. First, we discuss the demand for developing neuro-inspired architecture beyond today's von-Neumann architecture. Second, we summarize the various approaches to designing the neuromorphic hardware (digital versus analog, spiking versus nonspiking, online training versus offline training) and discuss why emerging nonvolatile memory is attractive for implementing the synapses in the neural network. Then, we discuss the desired device characteristics of the synaptic devices (e.g., multilevel states, weight update nonlinearity/asymmetry, variation/noise), and survey a few representative material systems and device prototypes reported in the literature that show analog conductance tuning. These candidates include phase change memory, resistive memory, ferroelectric memory, floating-gate transistors, etc. Next, we introduce the crossbar array architecture to accelerate the weighted sum and weight update operations that are commonly used in neuro-inspired machine learning algorithms, and review recent progress on array-level experimental demonstrations for pattern recognition tasks. In addition, we discuss the peripheral neuron circuit design issues and present a device-circuit-algorithm codesign methodology to evaluate the impact of nonideal device effects on the system-level performance (e.g., learning accuracy). Finally, we give an outlook on the customization of the learning algorithms for efficient hardware implementation.
Article
Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today's computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach–Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.
Article
Microring weight banks could enable novel signal processing approaches in silicon photonics. We analyze factors limiting channel count in microring weight banks, which are central to analog wavelength-division multiplexed processing networks in silicon. We find that microring weight banks require a fundamentally different analysis compared to other wavelength-division multiplexing circuits (e.g., demultiplexers). By introducing a quantitative description of independent weighting, we establish performance tradeoffs between channel count and power penalty. This performance is significantly affected by coherent multiresonator interactions through bus waveguides. We experimentally demonstrate these effects in a fabricated device. Analysis relies on the development of a novel simulation technique combining parametric programming with generalized transmission theory. Experimental measurement fitting of an 8-channel weight bank is presented as an example of another application of the simulator.
Article
Recently, there has been tremendous interest in excitable optoelectronic devices and in particular excitable semiconductor lasers that could potentially enable unconventional processing approaches beyond conventional binary-logic-based approaches. In parallel, there has been renewed investigation of non-von Neumann architectures driven in part by incipient limitations in aspects of Moore’s law. These neuromorphic architectures attempt to decentralize processing by interweaving interconnection with computing while simultaneously incorporating time-resolved dynamics, loosely classified as spiking (a.k.a. excitability). The rapid and efficient advances in CMOS-compatible photonic interconnect technologies have led to opportunities in optics and photonics for unconventional circuits and systems. Effort in the budding research field of photonic spike processing aims to synergistically integrate the underlying physics of photonics with bio-inspired processing. Lasers operating in the excitable regime are dynamically analogous with the spiking dynamics observed in neuron biophysics but roughly 8 orders of magnitude faster. The field is reaching a critical juncture at which there is a shift from studying single devices to studying an interconnected network of lasers. In this paper, we review the recent research in the information processing abilities of such lasers, dubbed “photonic neurons,” “laser neurons,” or “optical neurons.” An integrated network of such lasers on a chip could potentially grant the capacity for complex, ultrafast categorization and decision making to provide a range of computing and signal processing applications, such as sensing and manipulating the radio frequency spectrum and for hypersonic aircraft control.
Article
We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters' gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which might lead to a great increase in power-efficiency. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available.
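The bit-wise replacement of arithmetic that BNNs exploit can be sketched as follows: encoding ±1 activations and weights as single bits, an inner product reduces to XNOR plus popcount. This is a schematic NumPy illustration of the identity, not the paper's optimized GPU kernel:

```python
import numpy as np

def binarize(x):
    """Map real values to {-1, +1} via sign (0 maps to +1)."""
    return np.where(np.asarray(x) >= 0, 1, -1).astype(np.int8)

def xnor_popcount_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors using bit-wise ops only.

    Encoding +1 as bit 1 and -1 as bit 0, the elementwise product
    equals XNOR, so dot(a, b) = 2 * popcount(XNOR(a, b)) - n.
    """
    n = a_bits.size
    a, b = (a_bits > 0), (b_bits > 0)
    matches = np.count_nonzero(~np.logical_xor(a, b))  # popcount of XNOR
    return 2 * matches - n

a = binarize([0.3, -1.2, 0.7, -0.1])
b = binarize([1.0, 0.5, -0.2, -0.9])
assert xnor_popcount_dot(a, b) == int(a.astype(int) @ b.astype(int))
```

On hardware, the boolean XNOR and popcount map to single wide bit-wise instructions, which is where the reported memory and power savings come from.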
Article
We demonstrate an analog O/E/O electronic link to allow integrated laser neurons to accept many distinguishable, high bandwidth input signals simultaneously. This device utilizes wavelength division multiplexing to achieve multi-channel fan-in, a photodetector to sum signals together, and a laser cavity to perform a nonlinear operation. Its speed outpaces accelerated-time neuromorphic electronics, and it represents a viable direction towards scalable networking approaches.
Article
Microwave photonics (MWP) provides advantages in bandwidth performance and fan-in scalability that are far superior to electronic counterparts. Processing of many channels at high bandwidths is not easily achievable in any electronic implementation. We consider an MWP system that iteratively performs principal component analysis (PCA) on partially correlated, eight-channel, and 13-GBd signals. The system that is presented is able to adapt to oscillations in interchannel correlations and follow changing principal components. Wideband multidimensional techniques are relevant to > 10-GHz radio systems and could bring solutions for intelligent radio communications and information sensing.
Article
Novel materials and devices in photonics have the potential to revolutionize optical information processing, beyond conventional binary-logic approaches. Laser systems offer a rich repertoire of useful dynamical behaviors, including the excitable dynamics also found in the time-resolved “spiking” of neurons. Spiking reconciles the expressiveness and efficiency of analog processing with the robustness and scalability of digital processing. We demonstrate a unified platform for spike processing with a graphene-coupled laser system. We show that this platform can simultaneously exhibit logic-level restoration, cascadability and input-output isolation—fundamental challenges in optical information processing. We also implement low-level spike-processing tasks that are critical for higher level processing: temporal pattern detection and stable recurrent memory. We study these properties in the context of a fiber laser system and also propose and simulate an analogous integrated device. The addition of graphene leads to a number of advantages which stem from its unique properties, including high absorption and fast carrier relaxation. These could lead to significant speed and efficiency improvements in unconventional laser processing devices, and ongoing research on graphene microfabrication promises compatibility with integrated laser platforms.
Article
Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems - from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic-photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic-photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic-photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a 'zero-change' approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic-photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.
Article
Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep Learning (DL). Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations by simple accumulations, as multipliers are the most space and power-hungry components of the digital implementation of neural networks. We introduce BinaryConnect, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining precision of the stored weights in which gradients are accumulated. Like other dropout schemes, we show that BinaryConnect acts as regularizer and we obtain near state-of-the-art results with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.
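The core BinaryConnect idea, binarizing weights for the forward and backward passes while accumulating gradients in full precision, can be sketched for a single linear unit. The squared-error loss, learning rate, and training data below are illustrative choices, not the paper's setup:

```python
import numpy as np

def binaryconnect_step(w_real, x, y, lr=0.1):
    """One BinaryConnect-style SGD step for a linear unit.

    The forward pass uses binarized weights sign(w_real), but the
    gradient is accumulated into the full-precision w_real, which is
    clipped to [-1, 1] so it stays near the binary values.
    """
    w_bin = np.where(w_real >= 0, 1.0, -1.0)   # binarize for this pass
    y_hat = x @ w_bin                          # forward with binary weights
    grad = 2.0 * (y_hat - y) * x               # dL/dw for squared error
    return np.clip(w_real - lr * grad, -1.0, 1.0)

w = np.array([0.3, 0.4, -0.2, 0.1])
target = np.array([1.0, -1.0, 1.0, -1.0])      # weight signs to learn
for _ in range(5):                             # a few passes over basis inputs
    for i in range(4):
        w = binaryconnect_step(w, np.eye(4)[i], float(target[i]))
# After training, sign(w) recovers the target pattern [1, -1, 1, -1].
```

Keeping the real-valued accumulator is the key design choice: tiny gradient steps would vanish if applied directly to ±1 weights, but they add up in `w_real` until a sign flips.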
Article
We consider an optical technique for performing tunable weighted addition using wavelength-division multiplexed (WDM) inputs, the enabling function of a recently proposed photonic spike processing architecture [J. Lightwave Technol., 32 (2014)]. WDM weighted addition provides important advantages to performance, integrability, and networking capability that were not possible in any past approaches to optical neurocomputing. In this letter, we report a WDM weighted addition prototype used to find the first principal component of a 1 Gbps, 8-channel signal. Wideband, multivariate techniques have immediate relevance to modern radio systems, and photonic spike processing networks enabled by WDM could open new domains of information processing that bring unprecedented bandwidth and intelligence to problems in radio communications, ultrafast control, and scientific computing.
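Functionally, WDM weighted addition is a dot product between per-wavelength weights and the signals those channels carry; a schematic sketch (the channel values and weight bound are illustrative of the broadcast-and-weight convention, not measured device data):

```python
import numpy as np

def wdm_weighted_sum(channel_signals, weights):
    """Weighted addition of WDM channels, broadcast-and-weight style.

    Each row of `channel_signals` is the waveform on one wavelength; a
    tunable filter (e.g., a microring) scales channel k by weights[k]
    in [-1, 1] (negative weights via a balanced photodetector pair),
    and the total photocurrent is the sum over channels.
    """
    w = np.asarray(weights, dtype=float)
    s = np.asarray(channel_signals, dtype=float)
    if np.any(np.abs(w) > 1.0):
        raise ValueError("weights must lie in [-1, 1]")
    return w @ s   # one output waveform: sum_k w_k * s_k(t)

signals = np.array([[1.0, 0.0, 1.0],    # lambda_1
                    [0.0, 1.0, 1.0],    # lambda_2
                    [1.0, 1.0, 0.0]])   # lambda_3
out = wdm_weighted_sum(signals, [0.5, -0.5, 1.0])  # -> [1.5, 0.5, 0.0]
```

Because the channels are wavelength-multiplexed, the fan-in and summation happen in a single photodetector rather than in N separate electronic adders.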
Article
We propose an on-chip optical architecture to support massive parallel communication among high-performance spiking laser neurons. Designs for a network protocol, computational element, and waveguide medium are described, and novel methods are considered in relation to prior research in optical on-chip networking, neural networking, and computing. Broadcast-and-weight is a new approach for combining neuromorphic processing and optoelectronic physics, a pairing that is found to yield a variety of advantageous features. We discuss properties and design considerations for architectures for scalable wavelength reuse and biologically relevant organizational capabilities, in addition to aspects of practical feasibility. Given recent developments in commercial photonic systems integration and neuromorphic computing, we suggest that a novel approach to photonic spike processing represents a promising opportunity in unconventional computing.
Article
Inspired by the brain’s structure, we have developed an efficient, scalable, and flexible non–von Neumann architecture that leverages contemporary silicon technology. To demonstrate, we built a 5.4-billion-transistor chip with 4096 neurosynaptic cores interconnected via an intrachip network that integrates 1 million programmable spiking neurons and 256 million configurable synapses. Chips can be tiled in two dimensions via an interchip communication interface, seamlessly scaling the architecture to a cortex-like sheet of arbitrary size. The architecture is well suited to many applications that use complex neural networks in real time, for example, multiobject detection and classification. With 400-pixel-by-240-pixel video input at 30 frames per second, the chip consumes 63 milliwatts.
Article
We propose an original design for a neuron-inspired photonic computational primitive for a large-scale, ultrafast cognitive computing platform. The laser exhibits excitability and behaves analogously to a leaky integrate-and-fire (LIF) neuron. This model is both fast and scalable, operating up to a billion times faster than a biological equivalent, and is realizable in a compact vertical-cavity surface-emitting laser (VCSEL). We show that, under a certain set of conditions, the rate equations governing a laser with an embedded saturable absorber reduce to the behavior of LIF neurons. We simulate the laser using realistic rate equations governing a VCSEL cavity, and show behavior representative of cortical spiking algorithms simulated in small circuits of excitable lasers. Pairing this technology with ultrafast neural learning algorithms would open up a new domain of processing.
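The excitable, LIF-like behavior described here can be illustrated with a generic Yamada-type two-section laser model (gain G, saturable absorption Q, intensity I) integrated by forward Euler. The parameters below are chosen only to sit in an excitable regime for illustration; they are not the paper's VCSEL values:

```python
def simulate(perturbation, t_end=60.0, dt=0.0005,
             A=6.5, B=5.8, a=1.8,
             gamma_G=0.05, gamma_Q=0.05, gamma_I=2.0, eps=1e-6):
    """Euler integration of a Yamada-type excitable laser model.

    At rest, net gain G - Q - 1 < 0 and the intensity stays near zero.
    A pump perturbation applied to G at t = 5 either decays
    (sub-threshold) or triggers one large pulse followed by a slow
    gain recovery (supra-threshold) -- LIF-like all-or-nothing firing.
    Returns the peak intensity observed.
    """
    G, Q, I = A, B, 1e-6
    peak, t, kicked = 0.0, 0.0, False
    while t < t_end:
        if not kicked and t >= 5.0:
            G += perturbation              # brief pump-current kick
            kicked = True
        dG = gamma_G * (A - G - G * I)     # gain pumping and depletion
        dQ = gamma_Q * (B - Q - a * Q * I) # absorber recovery/saturation
        dI = gamma_I * (G - Q - 1.0) * I + eps  # eps: spontaneous seed
        G, Q, I = G + dt * dG, Q + dt * dQ, I + dt * dI
        peak = max(peak, I)
        t += dt
    return peak

assert simulate(1.5) > 1.0    # supra-threshold kick -> full pulse
assert simulate(0.1) < 1e-3   # sub-threshold kick -> decays away
```

The threshold behavior comes from the absorber saturating faster than the gain (a > 1), which is the mechanism the abstract maps onto LIF dynamics.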
Article
A low operating energy is needed for nanocavity lasers designed for on-chip photonic network applications. On-chip nanocavity lasers must be driven by current because they act as light sources driven by electronic circuits. Here, we report the high-speed direct modulation of a lambda-scale embedded active region photonic-crystal (LEAP) laser that holds three records for any type of laser operated at room temperature: a low threshold current of 4.8 µA, a modulation current efficiency of 2.0 GHz µA^(-0.5) and an operating energy of 4.4 fJ/bit. Five major technologies make this performance possible: a compact buried heterostructure, a photonic-crystal nanocavity, a lateral p–n junction realized by ion implantation and thermal diffusion, an InAlAs sacrificial layer and current-blocking trenches. We believe that an output power of 2.17 µW and an operating energy of 4.4 fJ/bit will enable us to realize on-chip photonic networks in combination with the recently developed highly sensitive receivers.
Article
Preface 1. A Nonlinear History of Radio 2. Characteristics of Passive IC Components 3. A Review of MOS Device Physics 4. Passive RLC Networks 5. Distributed Systems 6. The Smith Chart and S-Parameters 7. Bandwidth Estimation Techniques 8. High-Frequency Amplifier Design 9. Voltage References and Biasing 10. Noise 11. Low-Noise Amplifier Design 12. Mixers 13. Radio-Frequency Power Amplifiers 14. Feedback Systems 15. Phase-Locked Loops 16. Oscillators and Synthesizers 17. Phase Noise 18. Architectures 19. Radio-Frequency Circuits Through the Ages.
Chapter
Analog-to-digital converters (ADCs) continue to be important components of signal-processing systems, such as those for mobile communications, software radio, radar, satellite communications, and others. This article revisits the state-of-the-art of ADCs and includes recent data on experimental converters and commercially available parts. Converter performances have improved significantly since previous surveys were published (1999–2005). Specifically, aperture uncertainty (jitter) and power dissipation have both decreased substantially during the early 2000s. The lowest jitter value has fallen from approximately 1 picosecond in 1999 to < 100 femtoseconds for the very best of current ADCs. In addition, the lowest values for the IEEE Figure of Merit (which is proportional to the product of jitter and power dissipation) have also decreased by an order of magnitude. For converters that operate at multi-GSPS rates, the speed of the fastest ADC IC device technologies (e.g., InP, GaAs) is the main limitation to performance; this speed, as measured by device transit-time frequency, fT, has roughly tripled since 1999. ADC architectures used in high-performance broadband circuits include pipelined (successive approximation, multistage flash) and parallel (time-interleaved, filter-bank), with the former leading to lower-power operation and the latter being applied to high-sample-rate converters. Bandpass ADCs based on delta-sigma modulation are being applied to narrowband applications with ever-increasing center frequencies. CMOS has become a mainstream ADC IC technology because (1) it enables designs with low power dissipation and (2) it allows for significant amounts of digital signal processing to be included on-chip. DSP enables correction of conversion errors, improved channel matching in parallel structures, and provides filtering required for delta-sigma converters.
Finally, a performance projection based on a trend in aperture jitter predicts 25 fs in approximately 10 years, which would imply performance of 12 ENOB at nearly 1-GHz bandwidth. Keywords: analog-to-digital converters; signal-to-noise ratio; aperture jitter; input-referred noise; comparator ambiguity; spurious-free dynamic range; digital-signal-processing
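The closing projection can be checked against the standard aperture-jitter SNR limit for a full-scale sine input:

```python
import math

def jitter_limited_snr_db(f_in_hz, jitter_s):
    """Aperture-jitter-limited SNR for a full-scale sine at f_in:
    SNR = -20 * log10(2 * pi * f_in * t_jitter)."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_s)

def enob(snr_db):
    """Effective number of bits: ENOB = (SNR_dB - 1.76) / 6.02."""
    return (snr_db - 1.76) / 6.02

snr = jitter_limited_snr_db(1e9, 25e-15)  # 25 fs jitter at 1 GHz
bits = enob(snr)                          # ~12.3 effective bits
```

Plugging in the projected 25 fs jitter at a 1 GHz input gives about 76 dB SNR, or roughly 12 effective bits, consistent with the abstract's stated projection.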