Wei D. Lu

  • Ph.D.
  • Professor at University of Michigan

About

285
Publications
89,034
Reads
38,122
Citations
Introduction
Professor Wei Lu is with the Department of Electrical Engineering and Computer Science (EECS), University of Michigan. His current research topics include high-density memory based on two-terminal resistive devices (RRAM), memristors and memristive systems, neuromorphic circuits, aggressively scaled nanowire transistors, and other emerging electrical devices. He is an IEEE Fellow and co-founder of Crossbar Inc.
Current institution
University of Michigan
Current position
  • Professor
Additional affiliations
April 2003 - August 2005
Harvard University
Position
  • Postdoctoral Researcher
September 1996 - March 2003
Rice University
Position
  • PhD Student
September 2005 - June 2016
University of Michigan
Position
  • Associate Professor

Publications (285)
Article
Analog compute in memory (CIM) with multilevel cell (MLC) resistive random access memory (ReRAM) promises highly dense and efficient compute support for machine learning and scientific computing. This article introduces analog to digital converter (ADC)-assisted bit-serial processing for efficient, high-throughput compute. Bit-serial digital to ana...
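The bit-serial scheme described in this abstract can be illustrated with a toy model: each input bit-plane drives the array in one cycle, the ADC digitizes the analog column sums, and a digital shift-and-add recombines the partial results. The sizes, bit widths, and integer weights below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-bit inputs and a small signed weight matrix stored in the array.
N_BITS = 4
x = rng.integers(0, 2**N_BITS, size=8)    # digital input vector
W = rng.integers(-3, 4, size=(8, 4))      # weights mapped to device conductances

acc = np.zeros(W.shape[1], dtype=np.int64)
for b in range(N_BITS):
    bit_plane = (x >> b) & 1              # one input bit per wordline, per cycle
    partial = bit_plane @ W               # analog column sum, digitized by the ADC
    acc += partial << b                   # digital shift-and-add of partial sums

# The serially accumulated result equals the full-precision vector-matrix product.
assert np.array_equal(acc, x @ W)
print(acc)
```

The sum over bit-planes of `(bits_b @ W) * 2^b` is exactly `x @ W`, which is why the serial loop loses no precision relative to a parallel multi-bit drive.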
Article
Full-text available
A memristor array has emerged as a potential computing hardware for artificial intelligence (AI). It has an inherent memory effect that allows information storage in the form of easily programmable electrical conductance, making it suitable for efficient data processing without shuttling of data between the processor and memory. To realize its full...
Article
Full-text available
Cutting-edge humanoid machine vision merely mimics human visual systems and lacks the polarimetric functionalities that convey navigation information and authentic images. Interspecies-chimera vision preserving multiple hosts’ capacities will lead to advanced machine vision. However, implementing the visual functions of multiple species (human and non-h...
Article
Full-text available
Decoder-only Transformer models such as Generative Pre-trained Transformers (GPT) have demonstrated exceptional performance in text generation by autoregressively predicting the next token. However, the efficiency of running GPT on current hardware systems is bounded by low compute-to-memory ratio and high memory access. In this work, we propose a...
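The low compute-to-memory ratio mentioned above can be seen from a back-of-the-envelope arithmetic-intensity estimate for autoregressive decoding, where each generated token requires a matrix-vector product that reads every weight once. The hidden dimension and fp16 weights below are illustrative assumptions.

```python
# Rough arithmetic-intensity estimate for one decoder weight matrix during
# autoregressive generation (hypothetical size; fp16 weights, batch size 1).
d = 4096                      # assumed hidden dimension
flops = 2 * d * d             # one matrix-vector multiply (mul + add per weight)
bytes_moved = 2 * d * d       # every fp16 weight (2 bytes) is read once
intensity = flops / bytes_moved
print(f"arithmetic intensity = {intensity:.1f} FLOPs/byte")
```

At roughly 1 FLOP per byte moved, the workload sits far below the compute-to-bandwidth ratio of modern accelerators, so weight traffic, not arithmetic, sets the decoding speed.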
Article
Full-text available
Neuromorphic technologies aim to use the organizing principles of the brain to build efficient and intelligent systems, making them the bridge between biological and current Artificial Intelligence (AI) systems. Specifically, in conventional AI systems, one of the dominant sources of power consumption is the data movement between the memo...
Article
Full-text available
Neuromorphic computing systems promise high energy efficiency and low latency. In particular, when integrated with neuromorphic sensors, they can be used to produce intelligent systems for a broad range of applications. An event‐based camera is such a neuromorphic sensor, inspired by the sparse and asynchronous spike representation of the biologica...
Article
Full-text available
Memristive devices are of potential use in a range of computing applications. However, many of these devices are based on amorphous materials, where systematic control of the switching dynamics is challenging. Here we report tunable and stable memristors based on an entropy-stabilized oxide. We use single-crystalline (Mg,Co,Ni,Cu,Zn)O films grown o...
Article
Full-text available
Artificial Intelligence (AI) is currently experiencing a boom driven by deep learning (DL) techniques, which rely on networks of connected simple computing units operating in parallel. The low communication bandwidth between memory and processing units in conventional von Neumann machines does not support the requirements of emerging applications...
Preprint
Full-text available
Decoder-only Transformer models such as GPT have demonstrated exceptional performance in text generation by autoregressively predicting the next token. However, the efficiency of running GPT on current hardware systems is bounded by low compute-to-memory ratio and high memory access. Process-in-memory (PIM) architectures can minimize off-chip data...
Article
Full-text available
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency when performing inference with deep learning workloads. Error backpropagation is presently regarded as the most effective method for training SNNs, but in a twist of irony, when training on modern graphics processing units (GPUs)...
Article
Deep learning accelerators (DLAs) based on compute-in-memory (CIM) technologies have been considered promising candidates to drastically improve the throughput and energy efficiency for running deep neural network models. In this review, we analyze DLA designs reported in the past decade, including both fully digital DLAs and analog CIM based DLAs,...
Article
With the rising popularity of post-quantum cryptographic schemes, realizing practical implementations for real-world applications is still a major challenge. A major bottleneck in such schemes is the fetching and processing of large polynomials in the Number Theoretic Transform (NTT), which makes non-von Neumann paradigms, such as near-memory proce...
Article
Full-text available
The need for deep neural network (DNN) models with higher performance and better functionality leads to the proliferation of very large models. Model training, however, requires intensive computation time and energy. Memristor‐based compute‐in‐memory (CIM) modules can perform vector‐matrix multiplication (VMM) in situ and in parallel, and have show...
Article
Full-text available
Analog compute‐in‐memory (CIM) systems are promising candidates for deep neural network (DNN) inference acceleration. However, as the use of DNNs expands, protecting user input privacy has become increasingly important. Herein, a potential security vulnerability is identified wherein an adversary can reconstruct the user's private input data from a...
Article
Full-text available
The brain is the perfect place to look for inspiration to develop more efficient neural networks. The inner workings of our synapses and neurons provide a glimpse at what the future of deep learning might look like. This article serves as a tutorial and perspective showing how to apply the lessons learned from several decades of research in deep le...
Article
Memristive technology has been rapidly emerging as a potential alternative to traditional CMOS technology, which is facing fundamental limitations in its development. Since oxide-based resistive switches were demonstrated as memristors in 2008, memristive devices have garnered significant attention due to their biomimetic memory properties, which p...
Preprint
Full-text available
The need for deep neural network (DNN) models with higher performance and better functionality leads to the proliferation of very large models. Model training, however, requires intensive computation time and energy. Memristor-based compute-in-memory (CIM) modules can perform vector-matrix multiplication (VMM) in situ and in parallel, and have show...
Preprint
Full-text available
Analog compute-in-memory (CIM) accelerators are becoming increasingly popular for deep neural network (DNN) inference due to their energy efficiency and in-situ vector-matrix multiplication (VMM) capabilities. However, as the use of DNNs expands, protecting user input privacy has become increasingly important. In this paper, we identify a security...
Preprint
Event-based cameras are inspired by the sparse and asynchronous spike representation of the biological visual system. However, processing the event data requires either using expensive feature descriptors to transform spikes into frames, or using spiking neural networks that are difficult to train. In this work, we propose a neural network architect...
Article
Fully automated retail vehicles will have stringent, real-time operational safety, sensing, communication, inference, planning, and control requirements. Meeting them with existing technologies will impose prohibitive energy-provisioning and thermal management requirements. This article summarizes the research challenges facing the designers of com...
Technical Report
Automated vehicles (AVs) hold great promise for improving safety, as well as reducing congestion and emissions. In order to make automated vehicles commercially viable, a reliable and high-performance vehicle-based computing platform that meets ever-increasing computational demands will be key. Given the state of existing digital computing technology...
Article
In-memory computing (IMC) systems have great potential for accelerating data-intensive tasks such as deep neural networks (DNNs). As DNN models are generally highly proprietary, the neural network architectures become valuable targets for attacks. In IMC systems, since the whole model is mapped on chip and weight memory read can be restricted, the...
Article
Compute-in-Memory (CIM) implemented with Resistive-Random-Access-Memory (RRAM) crossbars is a promising approach for accelerating Convolutional Neural Network (CNN) computations. The growing number of parameters in state-of-the-art CNN models, however, creates challenges for on-chip weight storage in CIM implementations, and CNN compres...
Preprint
Full-text available
Electronic switches based on the migration of high-density point defects, or memristors, are poised to revolutionize post-digital electronics. Despite significant research, key mechanisms for filament formation and oxygen transport remain unresolved, thus hindering our ability to predict and design crucial device properties. For example, predicted...
Article
Full-text available
As more cloud computing resources are used for machine learning training and inference processes, privacy-preserving techniques that protect data from revealing at the cloud platforms attract increasing interest. Homomorphic encryption (HE) is one of the most promising techniques that enable privacy-preserving machine learning because HE allows dat...
Article
We present MEMprop, the adoption of gradient-based learning to train fully memristive spiking neural networks (MSNNs). Our approach harnesses intrinsic device dynamics to trigger naturally arising voltage spikes. These spikes emitted by memristive dynamics are analog in nature, and thus fully differentiable, which eliminates the need for surroga...
Article
Compute-in-Memory (CIM) implemented with Resistive-Random-Access-Memory (RRAM) crossbars is a promising approach for Deep Neural Network (DNN) acceleration. As the DNN size continues to grow, the finite on-chip weight storage has become a challenge for CIM implementations. Pruning can reduce network size, but unstructured pruning is not compatible...
Preprint
Full-text available
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency when performing inference with deep learning workloads. Error backpropagation is presently regarded as the most effective method for training SNNs, but in a twist of irony, when training on modern graphics processing units (GPUs)...
Article
Full-text available
Network features found in the brain may help implement more efficient and robust neural networks. Spiking neural networks (SNNs) process spikes in the spatiotemporal domain and can offer better energy efficiency than deep neural networks. However, most SNN implementations rely on simple point neurons that neglect the rich neuronal and dendritic dyn...
Preprint
Full-text available
In-memory computing (IMC) systems have great potential for accelerating data-intensive tasks such as deep neural networks (DNNs). As DNN models are generally highly proprietary, the neural network architectures become valuable targets for attacks. In IMC systems, since the whole model is mapped on chip and weight memory read can be restricted, the...
Preprint
We present MEMprop, the adoption of gradient-based learning to train fully memristive spiking neural networks (MSNNs). Our approach harnesses intrinsic device dynamics to trigger naturally arising voltage spikes. These spikes emitted by memristive dynamics are analog in nature, and thus fully differentiable, which eliminates the need for surrogate...
Article
Memristive devices, which combine a resistor with memory functions such that voltage pulses can change their resistance (and hence their memory state) in a nonvolatile manner, are beginning to be implemented in integrated circuits for memory applications. However, memristive devices could have applications in many other technologies, such as non-vo...
Article
Memristive arrays are a natural fit to implement spiking neural network (SNN) acceleration. Representing information as digital spiking events can improve noise margins and tolerance to device variability compared to analog bitline current summation approaches to multiply–accumulate (MAC) operations. Restricting neuron activations to single-bit spi...
Article
In this letter, we demonstrate a physical unclonable function (PUF) system based on fingerprint-like random planar structures through pattern transfer of self-assembled binary polymer mixtures. With properly designed electrode structures and materials, different types of conductance distributions with large variations are achieved, allowing the PUF...
Article
Research on electronic devices and materials is currently driven by both the slowing down of transistor scaling and the exponential growth of computing needs, which make present digital computing increasingly capacity-limited and power-limited. A promising alternative approach consists in performing computing based on intrinsic device dynamics, suc...
Chapter
In-memory computing on RRAM crossbars enables efficient and parallel vector-matrix multiplication. The neural network weight matrix is mapped onto the crossbar, and multiplication is performed in the analog domain. This article discusses different RRAM crossbar implementations, the associated mixed-signal circuits, and their challenges. As a proof-...
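The weight-mapping step described in this chapter abstract can be sketched numerically. The differential conductance pair per signed weight used below is a common encoding assumed here for illustration, not necessarily the chapter's scheme, and the conductance range is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))           # signed weight matrix to be mapped
g_max = 1e-4                              # assumed maximum device conductance (S)

# Differential encoding: positive parts on one column, negative parts on another.
scale = g_max / np.abs(W).max()
G_pos = np.clip(W, 0, None) * scale       # conductances for positive weights
G_neg = np.clip(-W, 0, None) * scale      # conductances for negative weights

v = rng.standard_normal(4) * 0.1          # input voltages applied to the rows
i_out = v @ G_pos - v @ G_neg             # column currents (Kirchhoff summation)
y = i_out / scale                         # rescale currents back to weight units

assert np.allclose(y, v @ W)              # analog result matches the digital VMM
print(y)
```

The multiplication happens in the analog domain via Ohm's law at each cross-point and Kirchhoff's current law along each column, which is what makes the operation O(1) in time regardless of matrix size.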
Article
Full-text available
Memristive devices have demonstrated rich switching behaviors that closely resemble synaptic functions and provide a building block to construct efficient neuromorphic systems. It is demonstrated that resistive switching effects are controlled not only by the external field, but also by the dynamics of various internal state variables that facilita...
Preprint
Full-text available
Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms. However, these networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds. The broadly accepted trick to overcoming this is...
Preprint
Full-text available
Spiking neural networks can compensate for quantization error by encoding information either in the temporal domain, or by processing discretized quantities in hidden states of higher precision. In theory, a wide dynamic range state-space enables multiple binarized inputs to be accumulated together, thus improving the representational capacity of i...
Article
Full-text available
In analog in-memory computing systems based on nonvolatile memories such as resistive random-access memory (RRAM), neural network models are often trained offline and then the weights are programmed onto memory devices as conductance values. The programmed weight values inevitably deviate from the target values during the programming process. This...
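The programming deviations described above can be illustrated with a toy experiment that perturbs target weights with an assumed zero-mean Gaussian error and measures the resulting output error of a linear layer. The noise model, magnitudes, and layer sizes are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((64, 10))         # target ("offline-trained") weights
x = rng.standard_normal(64)               # a sample input vector

# Assumed programming-noise model: each device deviates from its target value
# by zero-mean Gaussian noise, with sigma given as a fraction of the range.
def programmed(W, sigma, rng):
    return W + rng.normal(0.0, sigma * np.abs(W).max(), size=W.shape)

for sigma in (0.0, 0.01, 0.05):
    y_ideal = x @ W
    y_noisy = x @ programmed(W, sigma, rng)
    err = np.linalg.norm(y_noisy - y_ideal) / np.linalg.norm(y_ideal)
    print(f"sigma={sigma:.2f}  relative output error={err:.3f}")
```

Sweeps like this are a common way to estimate how much programming accuracy a given model can tolerate before inference accuracy degrades.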
Preprint
Full-text available
The impact of device and circuit-level effects in mixed-signal Resistive Random Access Memory (RRAM) accelerators typically manifest as performance degradation of Deep Learning (DL) algorithms, but the degree of impact varies based on algorithmic features. These include network architecture, capacity, weight distribution, and the type of inter-laye...
Chapter
Advances in electronics have revolutionized the way people work, play, and communicate with each other. Historically, these advances were mainly driven by CMOS transistor scaling following Moore’s law, where new generations of devices are smaller, faster, and cheaper, leading to more powerful circuits and systems. However, conventional scaling is n...
Article
Full-text available
Neuromorphic systems that can emulate the structure and the operations of biological neural circuits have long been viewed as a promising hardware solution to meet the ever-growing demands of big-data analysis and AI tasks. Recent studies on resistive switching or memristive devices have suggested such devices may form the building blocks of biorea...
Article
We present and experimentally validate two minimal compact memristive models for spiking neuronal signal generation using commercially available low-cost components. The first neuron model is called the Memristive Integrate-and-Fire (MIF) model, for neuronal signaling with two voltage levels: the spike-peak, and the rest-potential. The second model...
Preprint
Full-text available
The brain is the perfect place to look for inspiration to develop more efficient neural networks. The inner workings of our synapses and neurons provide a glimpse at what the future of deep learning might look like. This paper shows how to apply the lessons learnt from several decades of research in deep learning, gradient descent, backpropagation...
Article
Full-text available
Reservoir computing (RC) offers efficient temporal data processing with a low training cost by separating recurrent neural networks into a fixed network with recurrent connections and a trainable linear network. The quality of the fixed network, called reservoir, is the most important factor that determines the performance of the RC system. In this...
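The fixed-reservoir/trainable-readout split described above can be sketched as a minimal echo state network. The reservoir size, spectral radius, and the toy one-step sine-prediction task below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100                                             # assumed reservoir size

# Fixed random recurrent network; only the linear readout is ever trained.
W_in = rng.uniform(-0.5, 0.5, size=(N, 1))
W_res = rng.standard_normal((N, N))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # spectral radius < 1

def run_reservoir(u):
    x = np.zeros(N)
    states = []
    for u_t in u:
        x = np.tanh(W_in[:, 0] * u_t + W_res @ x)   # leaky-free state update
        states.append(x.copy())
    return np.array(states)

# Toy task: one-step-ahead prediction of a sine wave.
t = np.arange(400) * 0.1
u, target = np.sin(t[:-1]), np.sin(t[1:])
S = run_reservoir(u)
W_out = np.linalg.lstsq(S[50:], target[50:], rcond=None)[0]  # train readout only
pred = S[50:] @ W_out
print("training MSE:", np.mean((pred - target[50:]) ** 2))
```

Because only `W_out` is fitted (a single least-squares solve), training cost is tiny compared with backpropagation through a recurrent network, which is the efficiency argument the abstract makes.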
Article
We present TAICHI, a general in-memory computing deep neural network accelerator design based on RRAM crossbar arrays heterogeneously integrated with local arithmetic units and global co-processors to allow the system to efficiently map different models while maintaining high energy efficiency and throughput. A hierarchical mesh network-on-chip is i...
Article
Knowing the connectivity patterns in neural circuitry is essential to understand the operating mechanism of the brain, as it allows the analysis of how neural signals are processed and flown through the neural system. With the recent advances in neural recording technologies in terms of channel size and time resolution, a simple and efficient syste...
Article
Full-text available
The advances of neural recording techniques have fostered rapid growth of the number of simultaneously recorded neurons, opening up new possibilities to investigate the interactions and dynamics inside neural circuitry. The high recording channel counts, however, pose significant challenges for data analysis because the required time and computatio...
Preprint
Full-text available
Reservoir computing (RC) offers efficient temporal data processing with a low training cost by separating recurrent neural networks into a fixed network with recurrent connections and a trainable linear network. The quality of the fixed network, called reservoir, is the most important factor that determines the performance of the RC system. In this...
Article
Full-text available
Memristors have emerged as transformative devices to enable neuromorphic and in‐memory computing, where success requires the identification and development of materials that can overcome challenges in retention and device variability. Here, high‐entropy oxide composed of Zr, Hf, Nb, Ta, Mo, and W oxides is first demonstrated as a switching material...
Article
Stochastic Computing (SC) is a computing paradigm that allows for the low-cost and low-power computation of various arithmetic operations using stochastic bit streams and digital logic. In contrast to conventional representation schemes used within the binary domain, the sequence of bit streams in the stochastic domain is inconsequential, and compu...
Preprint
Full-text available
Stochastic Computing (SC) is a computing paradigm that allows for the low-cost and low-power computation of various arithmetic operations using stochastic bit streams and digital logic. In contrast to conventional representation schemes used within the binary domain, the sequence of bit streams in the stochastic domain is inconsequential, and compu...
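The core trick of unipolar stochastic computing, namely that a single AND gate multiplies the probabilities carried by two bit streams, can be sketched as follows. Stream length and operand values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
L = 100_000                               # bit-stream length (longer = more precise)

def to_stream(p, rng):
    """Encode probability p as a random unipolar bit stream of length L."""
    return rng.random(L) < p

a, b = 0.6, 0.3
stream_a, stream_b = to_stream(a, rng), to_stream(b, rng)

# In the stochastic domain, bitwise AND of two independent streams yields a
# stream whose ones-density is the product of the input probabilities.
product = (stream_a & stream_b).mean()
print(f"AND-gate product ~ {product:.3f} (exact: {a * b:.3f})")
```

This is also why bit ordering is inconsequential, as the abstract notes: only the fraction of ones in the stream carries information, so any permutation of a stream encodes the same value.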
Article
Resistive random-access memory (RRAM) structured in crossbar arrays typically uses selectors to improve noise margins by suppressing sneak path currents and leakages. Without selectors, crossbar array dimensions would be prohibitively small for practical use in storage-class memory (SCM) and compute-in-memory (CIM) applications. Most one-selector o...
Article
Full-text available
In article number 2003984, Yiyang Li, A. Alec Talin, and co‐workers design a deterministic nonvolatile resistive memory cell without nanosized filaments. By using the statistical ensemble behavior of all point defects within the 3D bulk for information storage, they solve the challenge of stochastic switching that has plagued filament‐based memrist...
Article
Full-text available
Biologically plausible computing systems require fine‐grain tuning of analog synaptic characteristics. In this study, lithium‐doped silicate resistive random access memory with a titanium nitride (TiN) electrode mimicking biological synapses is demonstrated. Biological plausibility of this RRAM device is thought to occur due to the low ionization e...
Article
Full-text available
Digital computing is nearing its physical limits as computing needs and energy consumption rapidly increase. Analogue-memory-based neuromorphic computing can be orders of magnitude more energy efficient at data-intensive tasks like deep neural networks, but has been limited by the inaccurate and unpredictable switching of analogue resistive memory....
Article
Full-text available
To tackle important combinatorial optimization problems, a variety of annealing-inspired computing accelerators, based on several different technology platforms, have been proposed, including quantum-, optical- and electronics-based approaches. However, to be of use in industrial applications, further improvements in speed and energy efficiency are...
Article
To address the von Neumann bottleneck that leads to both energy and speed degradations, in-memory processing architectures have been proposed as a promising alternative for future computing applications. In this paper, we present an in-memory computing system based on resistive random-access memory (RRAM) crossbar arrays that is reconfigurable and...
Article
Full-text available
The ability to efficiently analyze the activities of biological neural networks can significantly promote our understanding of neural communications and functionalities. However, conventional neural signal analysis approaches need to transmit and store large amounts of raw recording data, followed by extensive processing offline, posing significant...
Article
Full-text available
Analog compute-in-memory with resistive random access memory (RRAM) devices promises to overcome the data movement bottleneck in data-intensive AI and machine learning. RRAM crossbar arrays improve the efficiency of vector-matrix multiplications (VMM), which is a vital operation in these applications. The prototype IC is the first complete, fully-i...
Article
Oxide-based memristors are two-terminal devices whose resistance can be modulated by the history of applied stimulation. Memristors have been extensively studied as memory (as resistive random-access memory (RRAM)) and synaptic devices for neuromorphic computing applications. Understanding the internal dynamics of memristors is essential for contin...
Article
With the slowing down of Moore's law and fundamental limitations due to the von Neumann bottleneck, continued improvements in computing hardware performance become increasingly challenging. Resistive switching (RS) devices are being extensively studied as promising candidates for next-generation memory and computing applications due to the...
Chapter
Advances in computing power have historically been driven by transistor scaling, commonly known as Moore’s law. However, transistor scaling is now close to an end as device sizes approach fundamental physical limits. In the meantime, demands in the application space have been evolving, often requiring processing large amounts of data at high throug...
Preprint
Full-text available
We present an optimized conductance-based retina microcircuit simulator which transforms light stimuli into a series of graded and spiking action potentials through photo transduction. We use discrete retinal neuron blocks based on a collation of single-compartment models and morphologically realistic formulations, and successfully achieve a biolog...
Article
Near infrared (NIR) synaptic devices offer a remote-control approach to implement neuromorphic computing for data safety and artificial retinal system applications. In upconverting nanoparticles (UCNPs)-mediated optogenetics biosystems, NIR regulation of membrane ion channels allows remote and selective control of the Ca²⁺ flux to modulate synaptic...
Article
Full-text available
Time-series analysis including forecasting is essential in a range of fields from finance to engineering. However, long-term forecasting is difficult, particularly for cases where the underlying models and parameters are complex and unknown. Neural networks can effectively process features in temporal units and are attractive for such purposes. Res...
Article
Silicon (Si) nanostructures are widely used in microelectronics and nanotechnology. Brittle to ductile transition in nanoscale Si is of great scientific and technological interest, but this phenomenon and its underlying mechanism remain elusive. By conducting in situ temperature-controlled nanomechanical testing inside a transmission electron micro...
Article
Full-text available
Memristors and memristor crossbar arrays have been widely studied for neuromorphic and other in-memory computing applications. To achieve optimal system performance, however, it is essential to integrate memristor crossbars with peripheral and control circuitry. Here, we report a fully functional, hybrid memristor chip in which a passive crossbar a...
Article
This paper is concerned with mode-dependent impulsive hybrid systems driven by deterministic finite automaton (DFA) with mixed-mode effects. In these hybrid systems, a complex phenomenon called mixed mode, which arises in switching systems with time-varying delays, is considered explicitly. Furthermore, mode-dependent impulses, which can exist not only at the in...
Article
Full-text available
Advances in the understanding of nanoscale ionic processes in solid‐state thin films have led to the rapid development of devices based on coupled ionic–electronic effects. For example, ion‐driven resistive‐switching (RS) devices have been extensively studied for future memory applications due to their excellent performance in terms of switching sp...
Preprint
Full-text available
We describe a hybrid analog-digital computing approach to solve important combinatorial optimization problems that leverages memristors (two-terminal nonvolatile memories). While previous memristor accelerators have had to minimize analog noise effects, we show that our optimization solver harnesses such noise as a computing resource. Here we descr...
Article
Resistive random-access memory (RRAM) devices have attracted broad interest as promising building blocks for high-density non-volatile memory and neuromorphic computing applications. Atomic-level thermodynamic and kinetic descriptions of resistive switching (RS) processes are essential for continued device design and optimization, but are relativel...
Chapter
Stochastic computing is a low-cost form of computing. To perform stochastic computing, inputs need to be converted to stochastic bit streams using stochastic number generators (SNGs). The random number generation presents a significant overhead, which partially defeats the benefits of stochastic computing. In this work, we show that stochastic comp...
Article
Full-text available
Coupled ionic–electronic effects present intriguing opportunities for device and circuit development. In particular, layered two-dimensional materials such as MoS2 offer highly anisotropic ionic transport properties, facilitating controlled ion migration and efficient ionic coupling among devices. Here, we report reversible modulation of MoS2 films...
Article
Full-text available
Resistive switching (RS) is an interesting property shown by some materials systems that, especially during the last decade, has gained a lot of interest for the fabrication of electronic devices, with electronic nonvolatile memories being those that have received the most attention. The presence and quality of the RS phenomenon in a materials syst...
Article
Memristors based on 2D layered materials could provide bio-realistic ionic interactions and potentially enable construction of energy-efficient artificial neural networks capable of faithfully emulating neuronal interconnections in human brains. To build reliable 2D-material-based memristors suitable for constructing working neural networks, the me...
Article
Self-limited and forming-free Cu-based CBRAM devices with a double Al2O3 atomic layer deposition layer (D-ALD) structure were developed. The prop...
