HTMA-Net: Towards Multiplication-Avoiding Neural Networks via Hadamard Transform and In-Memory Computing
Reducing the cost of multiplications is critical for efficient deep neural network deployment, especially on energy-constrained edge devices. In this work, we introduce HTMA-Net, a novel framework that integrates the Hadamard Transform (HT) with multiplication-avoiding (MA) SRAM-based in-memory computing to reduce arithmetic complexity while maintaining accuracy. Unlike prior methods that only target multiplications in convolutional layers or focus solely on in-memory acceleration, HTMA-Net selectively replaces intermediate convolutions with hybrid Hadamard-based transform layers whose internal convolutions are implemented via multiplication-avoiding in-memory operations. We evaluate HTMA-Net on ResNet-18, ResNet-20, and ResNet-32 using CIFAR-10, CIFAR-100, and Tiny ImageNet, and provide a detailed comparison against regular, MF-only, and HT-only variants. Results show that HTMA-Net eliminates up to 52% of multiplications compared to the baseline ResNet models while achieving comparable accuracy and significantly reducing computational complexity and parameter count. Our results demonstrate that combining structured Hadamard transform layers with SRAM-based in-memory multiplication-avoiding operators is a promising path towards efficient deep learning architectures.
💡 Research Summary
This paper introduces HTMA-Net, a novel framework designed to drastically reduce the computational cost of deep neural networks, specifically targeting the expensive multiplication operations that dominate energy consumption, especially in resource-constrained edge devices. The core innovation lies in the synergistic integration of two distinct efficiency paradigms: the Walsh-Hadamard Transform (WHT) and Multiplication-Avoiding (MA) in-memory computing operators.
The WHT is an orthogonal linear transform that requires only additions and subtractions, making it inherently hardware-friendly. The authors leverage the Hadamard Convolution Theorem, which allows spatial convolutions to be expressed as element-wise multiplications in the transform domain. They implement this not as a fixed filter but as a learnable “spectral convolution” layer within the network, incorporating trainable spectral gates and soft-thresholding for sparsity.
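To make the addition-only property concrete, the following is a minimal NumPy sketch of the fast Walsh-Hadamard transform (butterfly form, power-of-2 length) together with a soft-thresholding function of the kind the summary describes. The function names `fwht` and `soft_threshold` are illustrative, not the paper's actual implementation; the paper's learnable spectral gates would wrap such primitives with trainable parameters.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform of a 1-D array whose
    length is a power of 2. Each butterfly stage uses only additions and
    subtractions, which is what makes the WHT hardware-friendly."""
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b  # butterfly: add/subtract only
        h *= 2
    return x

def soft_threshold(x, t):
    """Soft-thresholding: shrink coefficients toward zero by t, promoting
    sparsity in the transform domain."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# The unnormalized WHT is self-inverse up to a scale of n:
x = np.array([1.0, 0.0, 1.0, 0.0])
X = fwht(x)                 # transform domain
x_rec = fwht(X) / len(x)    # recovers the original signal
```

Applying the transform twice and dividing by the length recovers the input, which is why forward and inverse passes can share the same add/subtract hardware.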
The second component is the MA operator derived from MF-Net, which replaces the standard inner product operation w·x with the expression sign(x)|w| + sign(w)|x|. This eliminates hardware multipliers, requiring only two signed additions and simple sign logic, an operation well-suited for implementation within SRAM-based compute-in-memory (CIM) arrays.
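The MA operator is simple enough to state in a few lines. The sketch below (function names `ma_product` and `ma_dot` are illustrative, not from the paper) shows the elementwise surrogate and its use as an inner-product replacement; note that terms like `sign(x) * abs(w)` amount to conditional negation, so no true multiplier is needed in hardware.

```python
import numpy as np

def ma_product(w, x):
    """Multiplication-avoiding surrogate for the elementwise product w*x
    (MF-Net style): sign(x)|w| + sign(w)|x|. Agrees in sign with w*x for
    nonzero operands, but needs only sign logic and two signed additions."""
    return np.sign(x) * np.abs(w) + np.sign(w) * np.abs(x)

def ma_dot(w, x):
    """MA replacement for the inner product w.x: accumulate the elementwise
    MA terms, exactly the reduction a compute-in-memory array performs."""
    return np.sum(ma_product(w, x))

# sign(3)*|2| + sign(2)*|3| = 2 + 3 = 5 (vs. the true product 6)
print(ma_product(np.array([2.0]), np.array([3.0])))
```

The operator trades exact magnitudes (|w| + |x| instead of |w| * |x|) for hardware simplicity, which is why the network must be trained with the MA operator in place rather than swapped in post hoc.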
HTMA-Net strategically replaces standard convolutional blocks in ResNet architectures with hybrid HTMA blocks. Within these blocks, the input feature map is transformed via the 2D WHT. Crucially, the subsequent channel-mixing projections (implemented as 1x1 convolutions) within this transform domain are executed using the MA operator. The result is then inverse-transformed back to the spatial domain. The authors explore different integration strategies, such as replacing only the middle stages (“Middle-only”) or all but the first stage (“First-only”) of the network.
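The block's dataflow (forward 2D WHT, MA-based 1x1 channel mixing in the transform domain, inverse WHT) can be sketched as follows. This is a hypothetical illustration under stated assumptions, not the paper's implementation: `htma_block`, its `(C, H, W)` feature layout, and the dense `(C_out, C)` weight matrix standing in for a 1x1 convolution are all names and shapes chosen for clarity, and learnable gates, thresholds, and normalization are omitted.

```python
import numpy as np

def fwht(x, axis=-1):
    """Unnormalized fast WHT along `axis` (power-of-2 length), add/sub only."""
    x = np.moveaxis(np.asarray(x, dtype=float).copy(), axis, -1)
    n = x.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[..., j].copy(), x[..., j + h].copy()
                x[..., j], x[..., j + h] = a + b, a - b
        h *= 2
    return np.moveaxis(x, -1, axis)

def ma(w, x):
    """Multiplication-avoiding elementwise surrogate: sign(x)|w| + sign(w)|x|."""
    return np.sign(x) * np.abs(w) + np.sign(w) * np.abs(x)

def htma_block(feat, W):
    """Sketch of an HTMA block. feat: (C, H, Wsp) feature map with
    power-of-2 spatial dims; W: (C_out, C) channel-mixing weights playing
    the role of a 1x1 convolution in the transform domain."""
    C, H, Wsp = feat.shape
    F = fwht(fwht(feat, axis=1), axis=2)  # forward 2D WHT over spatial dims
    # MA "1x1 convolution": per output channel, reduce MA terms over inputs
    G = np.stack([np.sum(ma(W[o][:, None, None], F), axis=0)
                  for o in range(W.shape[0])])
    # Inverse 2D WHT (self-inverse up to the scale H * Wsp)
    return fwht(fwht(G, axis=1), axis=2) / (H * Wsp)

out = htma_block(np.ones((2, 4, 4)), np.eye(2))
print(out.shape)  # (2, 4, 4): spatial size and channel count preserved
```

All spatial mixing happens inside the add/subtract WHT stages, so the only remaining "multiplies" are the MA channel projections, which themselves reduce to sign logic and additions.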
Comprehensive evaluations were conducted on ResNet-18, ResNet-20, and ResNet-32 models using the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets. The results demonstrate that HTMA-Net achieves a superior trade-off between accuracy and efficiency. An "MF-only" approach removes over 98% of multiplications but suffers severe accuracy degradation, while an "HT-only" approach removes only about 48%; the hybrid HTMA-Net, by contrast, eliminates up to 54.4% of multiplications in ResNet-18 on CIFAR-10 with less than a 1% drop in accuracy. This trend holds across model depths, with 51-52% of multiplications eliminated in ResNet-20/32. The framework also generalizes well to the more complex CIFAR-100 and Tiny ImageNet datasets. The "Middle-only" integration strategy proved to be the most robust.
In conclusion, HTMA-Net presents a compelling pathway for energy-efficient deep learning by co-designing algorithmic transformations (WHT) and hardware-aware operators (MA). It effectively breaks the trade-off between extreme computational reduction and model accuracy, offering a practical solution for deploying DNNs in edge and embedded systems.