Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses
📝 Abstract
Current large scale implementations of deep learning and data mining require thousands of processors, massive amounts of off-chip memory, and consume gigajoules of energy. Emerging memory technologies such as nanoscale two-terminal resistive switching memory devices offer a compact, scalable, and low power alternative that permits on-chip co-located processing and memory in a fine-grained distributed parallel architecture. Here we report the first use of resistive switching memory devices for implementing and training a Restricted Boltzmann Machine (RBM), a generative probabilistic graphical model that is a key component for unsupervised learning in deep networks. We experimentally demonstrate a 45-synapse RBM realized with 90 resistive switching phase change memory (PCM) elements, trained with a bio-inspired variant of the Contrastive Divergence (CD) algorithm that implements Hebbian and anti-Hebbian weight updates. The resistive PCM devices show a two-fold to ten-fold reduction in error rate on a missing pixel pattern completion task trained over 30 epochs, compared to the untrained case. Measured programming energy consumption is 6.1 nJ per epoch with the resistive switching PCM devices, a factor of ~150 lower than conventional processor-memory systems. We analyze and discuss the dependence of learning performance on cycle-to-cycle variations as well as the number of gradual levels in the PCM analog memory devices.
📄 Content
Abstract— Current large scale implementations of deep learning and data mining require thousands of processors, massive amounts of off-chip memory, and consume gigajoules of energy. Emerging memory technologies such as nanoscale two-terminal resistive switching memory devices offer a compact, scalable, and low power alternative that permits on-chip co-located processing and memory in a fine-grained distributed parallel architecture. Here we report the first use of resistive switching memory devices for implementing and training a Restricted Boltzmann Machine (RBM), a generative probabilistic graphical model that is a key component for unsupervised learning in deep networks. We experimentally demonstrate a 45-synapse RBM realized with 90 resistive switching phase change memory (PCM) elements, trained with a bio-inspired variant of the Contrastive Divergence (CD) algorithm that implements Hebbian and anti-Hebbian weight updates. The resistive PCM devices show a two-fold to ten-fold reduction in error rate on a missing pixel pattern completion task trained over 30 epochs, compared to the untrained case. Measured programming energy consumption is 6.1 nJ per epoch with the resistive switching PCM devices, a factor of ~150 lower than conventional processor-memory systems. We analyze and discuss the dependence of learning performance on cycle-to-cycle variations as well as the number of gradual levels in the PCM analog memory devices.
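The training scheme named above, Contrastive Divergence with a Hebbian (data-driven) phase and an anti-Hebbian (reconstruction-driven) phase, can be sketched in a few lines. The sketch below is illustrative only: the network size matches the paper's 45 synapses (9 visible × 5 hidden units), but the quantized update step, the number of conductance levels, the omission of bias units, and all other details are assumptions for illustration, not the paper's actual PCM programming scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class QuantizedRBM:
    """Tiny Bernoulli RBM trained with one-step Contrastive Divergence (CD-1).

    Each weight update is rounded to an integer number of fixed conductance
    steps, loosely mimicking the limited number of gradual analog levels in
    a PCM synapse. Hypothetical sketch: sizes, levels, and step are assumed.
    """
    def __init__(self, n_visible, n_hidden, n_levels=33, w_max=1.0):
        self.W = np.zeros((n_visible, n_hidden))
        # Symmetric conductance grid: n_levels levels spanning [-w_max, w_max].
        self.step = 2.0 * w_max / (n_levels - 1)
        self.w_max = w_max

    def _sample(self, p):
        # Draw binary unit states from Bernoulli probabilities p.
        return (rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0):
        # Positive (Hebbian) phase: hidden activations driven by the data.
        ph0 = sigmoid(v0 @ self.W)
        h0 = self._sample(ph0)
        # Negative (anti-Hebbian) phase: driven by the reconstruction.
        pv1 = sigmoid(h0 @ self.W.T)
        v1 = self._sample(pv1)
        ph1 = sigmoid(v1 @ self.W)
        # CD-1 gradient: <v h>_data - <v h>_reconstruction.
        grad = v0[:, None] * ph0[None, :] - v1[:, None] * ph1[None, :]
        # Quantize the update to whole conductance steps, clip to the range.
        dW = np.round(grad / self.step) * self.step
        self.W = np.clip(self.W + dW, -self.w_max, self.w_max)

# 9 visible x 5 hidden = 45 synapses, matching the paper's network size.
rbm = QuantizedRBM(n_visible=9, n_hidden=5)
pattern = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
for _ in range(30):  # 30 epochs, as in the reported experiment
    rbm.cd1_update(pattern)
```

Because every update is a multiple of the conductance step and the weights start at zero, the weights remain on the quantized grid throughout training; shrinking `n_levels` coarsens that grid, which is one way to explore the dependence of learning on the number of gradual levels discussed in the abstract.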
Index Terms—neuromorphic computing, phase change memory, resistive memory, brain-inspired hardware, cognitive computing
I. INTRODUCTION
Deep learning can extract complex and useful structures within high-dimensional data without requiring significant amounts of manual feature engineering [1]. It has made significant advances in recent years and has been shown to outperform many other machine learning techniques for a variety of tasks such as image recognition, speech recognition, natural language understanding, predicting the effects of mutations in DNA, and reconstructing brain circuits [2].

This work is supported in part by SONIC, one of six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA, the NSF Expedition on Computing (Visual Cortex on Silicon, award 1317470), and the member companies of the Stanford Non-Volatile Memory Technology Research Initiative (NMTRI) and the Stanford SystemX Alliance.
S.B. Eryilmaz and H.-S.P. Wong are with the Electrical Engineering Department, Stanford University, Stanford, CA 94305 USA (e-mail: eryilmaz@stanford.edu; hspwong@stanford.edu).
E. Neftci is with the Department of Cognitive Sciences, UC Irvine, Irvine, CA 92697 USA (e-mail: eneftci@uci.edu).
S. Joshi is with the Department of Electrical and Computer Engineering, UC San Diego, San Diego, CA 92093 USA (e-mail: sijoshi@eng.ucsd.edu).
S. Kim, M. BrightSky, and C. Lam are with IBM Research, Yorktown Heights, NY 10598 USA (e-mail: SangBum.Kim@us.ibm.com; breitm@us.ibm.com; clam@us.ibm.com).
H.-L. Lung is with Macronix International Co., Ltd., Emerging Central Lab, Taiwan (e-mail: Sllung@mxic.com.tw).
G. Cauwenberghs is with the Department of Bioengineering, UC San Diego, San Diego, CA 92093 USA (e-mail: gert@ucsd.edu).
However, training large scale deep networks (~10^9 synapses, compared to ~10^15 synapses in the human brain) on today's hardware consumes more than 10 gigajoules (estimated) of energy [3-4]. An important source of this energy consumption is the physical separation of processing and memory, which is exacerbated by the large amounts of data needed for training deep networks [1-5]. It has been reported that ~40 percent of the energy consumed in general purpose computers is due to the off-chip memory hierarchy [6], and this fraction will increase as applications become more data-centric [7]. GPUs do not solve this problem, since up to 50 percent of dynamic power and 30 percent of overall power are consumed by off-chip memory, as shown in several benchmarks [8]. On-chip SRAM does not solve the problem either, since it is very area inefficient (> 100 F^2, F being the minimum half-pitch allowed by the considered lithography) and cannot scale up with system size.
Extracting useful information from data, which requires efficient data mining and (deep) learning algorithms, is becoming increasingly common in consumer products such as smartphones, and is expected to be even more important for the internet-of-things (IoT) [9], where energy efficiency is especially crucial. To scale up these systems in an energy efficient manner, it is necessary to develop new learning algorithms and hardware architectures that can capitalize on fine-grained on-chip integration of memory with computation. Because the number of synapses in a neural network far exceeds the number of neurons, we must pay special attention to the power, device density, and wiring of the electronic synapses, f