Bayes2IMC: In-Memory Computing for Bayesian Binary Neural Networks

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, refer to the original arXiv paper.

Bayesian Neural Networks (BNNs) generate an ensemble of possible models by treating model weights as random variables. This enables them to provide superior estimates of decision uncertainty. However, implementing Bayesian inference in hardware is resource-intensive, as it requires noise sources to generate the desired model weights. In this work, we introduce Bayes2IMC, an in-memory computing (IMC) architecture designed for binary BNNs that leverages the stochasticity inherent to nanoscale devices. Our novel design, based on Phase-Change Memory (PCM) crossbar arrays, eliminates the need for Analog-to-Digital Converters (ADCs) within the array, significantly improving power and area efficiency. Hardware-software co-optimized corrections are introduced to reduce device-induced accuracy variations across deployments on hardware, as well as to mitigate the effect of conductance drift of PCM devices. We validate the effectiveness of our approach on the CIFAR-10 dataset with a VGGBinaryConnect model containing 14 million parameters, achieving accuracy metrics comparable to ideal software implementations. We also present a complete core architecture and compare its projected power, performance, and area efficiency against an equivalent SRAM baseline, showing a 3.8–9.6× improvement in total efficiency (in GOPS/W/mm²) and a 2.2–5.6× improvement in power efficiency (in GOPS/W). In addition, the projected hardware performance of Bayes2IMC surpasses most memristive BNN architectures reported in the literature, achieving up to 20% higher power efficiency than the state-of-the-art.


💡 Research Summary

The paper introduces Bayes2IMC, an in‑memory computing (IMC) architecture tailored for Bayesian binary neural networks (BNNs). Traditional hardware implementations of Bayesian inference require explicit random number generators and analog‑to‑digital converters (ADCs) to sample stochastic weights, leading to substantial power, area, and design complexity. Bayes2IMC circumvents these bottlenecks by exploiting the intrinsic stochasticity of nanoscale Phase‑Change Memory (PCM) devices. In a PCM crossbar array, the amorphous and crystalline resistance states represent binary values, while the natural variability of the programming process provides a hardware‑native source of randomness that approximates the learned distribution over weights. By performing XNOR operations directly within the crossbar, matrix‑vector multiplications are executed in situ, eliminating the need for any ADC and dramatically reducing peripheral circuitry.
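The sampling idea above can be sketched in a few lines. This is a minimal illustration, not the paper's circuit model: it assumes a Gaussian read-noise model for PCM conductance and picks a programming target so that the noise alone realizes a Bernoulli draw of each binary weight. The threshold, noise level, and target mapping are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist  # stdlib inverse normal CDF

rng = np.random.default_rng(42)

THRESHOLD = 12.5  # illustrative sense-amp threshold (µS)
SIGMA = 2.0       # illustrative conductance read-noise std (µS)

def program_targets(p_plus):
    """Map each desired P(w = +1) to a conductance target mu such that
    P(mu + noise > THRESHOLD) = p_plus under Gaussian read noise."""
    p = np.clip(p_plus, 1e-6, 1.0 - 1e-6)
    z = np.array([NormalDist().inv_cdf(pi) for pi in p])
    return THRESHOLD + SIGMA * z

def read_binary_weights(mu):
    """Each read adds fresh device noise, so every inference pass
    draws a new binary weight sample 'for free'."""
    g = mu + rng.normal(0.0, SIGMA, size=mu.shape)
    return np.where(g > THRESHOLD, 1, -1)

# Desired per-weight probabilities of +1 (e.g., from a trained posterior)
p = np.array([0.1, 0.5, 0.9])
mu = program_targets(p)
samples = np.stack([read_binary_weights(mu) for _ in range(20000)])
print((samples == 1).mean(axis=0))  # empirical rates close to [0.1, 0.5, 0.9]
```

Here the device noise plays the role that an explicit random number generator would otherwise play, which is the core of the ADC-free, RNG-free argument.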

A two‑pronged hardware‑software co‑optimization is presented. During design time, the authors model the initial conductance distribution and long‑term drift of PCM cells, then construct a “drift‑compensation mapping” that adjusts the sampled binary weights to match the statistical properties of the physical devices. At runtime, on‑chip temperature sensors and current monitors estimate drift in real time, allowing dynamic recalibration of the weight values. This adaptive correction mitigates accuracy loss across multiple deployments and under varying environmental conditions.
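The drift behavior being corrected follows the standard empirical PCM power law, G(t) = G0 · (t/t0)^(−ν). The sketch below illustrates the recalibration idea with a hypothetical multiplicative correction; the paper's actual compensation scheme may differ, and the drift exponent here is an assumed value.

```python
import numpy as np

def drifted_conductance(g0, t, t0=1.0, nu=0.05):
    """Empirical PCM drift law: conductance decays as (t/t0)^(-nu).
    nu = 0.05 is a typical illustrative exponent."""
    return g0 * (t / t0) ** (-nu)

def compensate(g_read, t, t0=1.0, nu_hat=0.05):
    """Hypothetical runtime correction: scale the readout back up
    using an estimate nu_hat of the drift exponent."""
    return g_read * (t / t0) ** nu_hat

g0 = np.array([5.0, 15.0, 25.0])   # programmed conductances (µS)
g_t = drifted_conductance(g0, t=3600.0)  # after ~1 hour of drift
g_c = compensate(g_t, t=3600.0)
print(np.allclose(g_c, g0))  # exact recovery when nu_hat matches nu
```

In practice ν varies from cell to cell and with temperature, which is why a runtime estimate (rather than a fixed design-time constant) is needed to keep the binarization threshold aligned with the drifted conductance distribution.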

The authors evaluate the approach on the CIFAR‑10 benchmark using a VGGBinaryConnect network containing roughly 14 million parameters (seven layers). The ideal software implementation achieves 92.3% classification accuracy; Bayes2IMC attains 91.8%, a loss of less than 0.5%, which is markedly better than previously reported IMC BNN designs that typically suffer 2–3% degradation. Power‑performance‑area (PPA) analysis shows that, for the same GOPS throughput, Bayes2IMC delivers a 3.8–9.6× improvement in total efficiency (GOPS/W/mm²) and a 2.2–5.6× boost in pure power efficiency (GOPS/W) compared with an equivalent SRAM‑based baseline. The PCM crossbar's high density yields an energy per operation of approximately 0.12 pJ, translating to up to 20% higher power efficiency than the state‑of‑the‑art memristive BNN accelerators reported in the literature.
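The two reported ratio ranges are mutually consistent: since total efficiency is power efficiency divided by area, the ratio of the two improvement factors gives the implied area reduction. A quick arithmetic check:

```python
# (GOPS/W/mm^2 gain) / (GOPS/W gain) = implied area-reduction factor
ratios = [(3.8, 2.2), (9.6, 5.6)]  # (total-efficiency gain, power-efficiency gain)
for total, power in ratios:
    print(f"implied area reduction: {total / power:.2f}x")
# prints 1.73x and 1.71x
```

Both endpoints imply roughly the same ~1.7× area reduction over the SRAM baseline, so the two headline ranges tell one coherent story rather than two independent claims.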

Beyond the specific PCM implementation, the paper argues that the core concept—leveraging device‑level stochasticity as a native source of Bayesian weight sampling and correcting its non‑idealities in software—can be extended to other emerging non‑volatile memories such as RRAM and MRAM. Future work could explore scaling to larger architectures (e.g., ResNet‑50), real‑time streaming workloads, and more sophisticated drift‑modeling techniques.

In summary, Bayes2IMC demonstrates that Bayesian inference for binary neural networks can be realized efficiently in hardware by integrating stochastic memory devices directly into the compute fabric, eliminating costly ADCs, and applying systematic calibration methods. This work positions in‑memory computing as a viable path toward energy‑efficient, uncertainty‑aware AI accelerators for edge and embedded applications.

