Sparse Spike Encoding of Channel Responses for Energy Efficient Human Activity Recognition

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, refer to the original arXiv paper.

Integrated sensing and communication (ISAC) enables pervasive monitoring, but modern sensing algorithms are often too complex for energy-constrained edge devices. This motivates the development of learning techniques that balance accuracy and energy efficiency. Spiking Neural Networks (SNNs) are a promising alternative, processing information as sparse binary spike trains and potentially reducing energy consumption by orders of magnitude. In this work, we propose a spiking convolutional autoencoder (SCAE) that learns tailored spike-encoded representations of channel impulse responses (CIR), jointly trained with an SNN for human activity recognition (HAR), thereby eliminating the need for Doppler domain preprocessing. The results show that our SCAE-SNN achieves F1 scores comparable to a hybrid approach (almost 96%), while producing a substantially sparser spike encoding (81.1% sparsity). We also show that encoding CIR data prior to classification improves both HAR accuracy and efficiency. The code is available at https://github.com/ele-ciccia/SCAE-SNN-HAR.

💡 Research Summary

This paper addresses the challenge of delivering accurate human activity recognition (HAR) on energy‑constrained edge devices within integrated sensing‑and‑communication (ISAC) systems. Conventional radio‑frequency (RF)‑based HAR pipelines typically rely on heavy preprocessing steps such as Doppler spectrogram or range‑Doppler map generation, followed by deep neural networks (DNNs). While effective, these approaches demand large memory footprints and intensive multiply‑accumulate (MAC) operations, making them unsuitable for low‑power hardware.

The authors propose an end‑to‑end spiking solution that operates directly on raw channel impulse response (CIR) measurements, thereby eliminating the need for Doppler‑domain preprocessing. The core of the system is a Spiking Convolutional Autoencoder (SCAE) that learns a task‑specific spike encoding of the CIR, coupled with a Spiking Neural Network (SNN) classifier that consumes the encoded spikes. The SCAE consists of a 3‑D convolutional encoder (two Conv3D layers with 64 and 2 feature maps, each followed by batch normalization and a dense layer of Leaky‑Integrate‑and‑Fire (LIF) neurons) that binarizes the input into sparse spike trains. A mirrored decoder (transposed Conv3D layers and LIF neurons, ending with a sigmoid) reconstructs the original CIR, and the encoder‑decoder pair is jointly optimized to minimize reconstruction error while the downstream SNN is trained to maximize classification accuracy.
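To make the role of the LIF neurons concrete, the following is a minimal sketch of a single discrete-time leaky integrate-and-fire neuron turning a real-valued input sequence into a binary spike train. It is not the authors' implementation: the function name `lif_encode`, the leak factor `beta`, the threshold value, and the reset-by-subtraction rule are all illustrative assumptions.

```python
def lif_encode(currents, beta=0.9, threshold=1.0):
    """Run one discrete-time LIF neuron over a sequence of input currents.

    beta (leak factor) and threshold are illustrative values, not the paper's.
    Returns a list of 0/1 spikes, one per timestep.
    """
    membrane = 0.0
    spikes = []
    for current in currents:
        # Leak the previous membrane potential, then integrate the new input.
        membrane = beta * membrane + current
        if membrane >= threshold:
            spikes.append(1)
            membrane -= threshold  # reset by subtraction (assumed)
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    print(lif_encode([0.6, 0.6, 0.0, 0.0]))
```

In a trained encoder the input currents would come from the convolution and batch-normalization outputs, so the learned weights determine when the neuron fires and therefore how sparse the resulting code is.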

The SNN classifier first downsamples the spike tensor via average pooling, then processes it through three fully‑connected layers (128, 64, and the number of activity classes) of LIF neurons. Output spikes are interpreted using a rate‑coding scheme over 29 timesteps. The entire pipeline is trained end‑to‑end, allowing the encoder to produce spikes that are both highly informative for HAR and extremely sparse.
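Rate coding at the output can be illustrated with a short sketch: count how often each output neuron fires across the timesteps and pick the most active one. The function name `rate_decode` and the nested-list layout are assumptions made for illustration, and ties are broken in favor of the lowest class index, which the summary does not specify.

```python
def rate_decode(output_spikes):
    """Decode a class from output spikes using spike counts.

    output_spikes: list of T timesteps, each a list of 0/1 values per class.
    Returns (predicted_class, spike_counts).
    """
    num_classes = len(output_spikes[0])
    counts = [0] * num_classes
    for step in output_spikes:
        for cls, spike in enumerate(step):
            counts[cls] += spike
    # max() returns the first maximum, so ties go to the lowest class index.
    predicted = max(range(num_classes), key=lambda c: counts[c])
    return predicted, counts


if __name__ == "__main__":
    timesteps = 29  # number of timesteps stated in the summary
    # Toy output: class 2 fires every step, class 0 fires on even steps only.
    spikes = [[1 if t % 2 == 0 else 0, 0, 1, 0] for t in range(timesteps)]
    print(rate_decode(spikes))
```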

Experiments are conducted on the DISC dataset, which contains 60 GHz IEEE 802.11ay CIR recordings of seven subjects performing four activities (walking, running, sit‑to‑stand, hand waving). After preprocessing (selection of the 10 most variable range bins via inter‑quartile range, separation of real and imaginary parts, min‑max normalization, and overlapping sliding windows), each sample has shape (2, N, R, W). The authors split the data by subjects to evaluate generalization to unseen users.
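The preprocessing steps listed above can be sketched roughly as follows. This is a simplified illustration rather than the authors' pipeline: it assumes the IQR is computed on the magnitude of each range bin over time, that min-max scaling is applied per real/imaginary channel over the whole recording, and it uses hypothetical helper names (`select_bins`, `min_max`, `preprocess`) and small window settings. The resulting layout is [channel][time][bin] and does not attempt to reproduce the exact (2, N, R, W) shape.

```python
from statistics import quantiles


def select_bins(frames, k):
    """Indices of the k range bins whose magnitude has the largest IQR over time."""
    num_bins = len(frames[0])
    iqr = []
    for b in range(num_bins):
        q1, _, q3 = quantiles([abs(f[b]) for f in frames], n=4)
        iqr.append(q3 - q1)
    top = sorted(range(num_bins), key=lambda b: iqr[b], reverse=True)[:k]
    return sorted(top)  # keep the selected bins in their original order


def min_max(values):
    """Scale values to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero
    return [(v - lo) / span for v in values]


def preprocess(frames, k=10, window=4, stride=2):
    """frames: list of time steps, each a list of complex CIR values per range bin."""
    bins = select_bins(frames, k)
    n = len(bins)
    # Split into real and imaginary channels and normalize each one.
    real = min_max([f[b].real for f in frames for b in bins])
    imag = min_max([f[b].imag for f in frames for b in bins])
    real = [real[i * n:(i + 1) * n] for i in range(len(frames))]
    imag = [imag[i * n:(i + 1) * n] for i in range(len(frames))]
    # Overlapping sliding windows (stride < window gives overlap).
    samples = []
    for start in range(0, len(frames) - window + 1, stride):
        samples.append([real[start:start + window], imag[start:start + window]])
    return bins, samples


if __name__ == "__main__":
    # Toy recording: 8 frames, 3 range bins; bin 0 is static, bins 1 and 2 vary.
    toy = [[1 + 0j, complex(t, -t), complex(0.1 * t, 0)] for t in range(8)]
    kept, windows = preprocess(toy, k=2)
    print(kept, len(windows))
```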

Four configurations are compared: (a) the proposed SCAE‑SNN, (b) a hybrid approach in which a conventional, non‑spiking convolutional autoencoder (CAE) feeds an SNN, (c) a simple delta‑threshold spike encoder feeding the same SNN, and (d) a direct SNN that consumes the raw CIR as a continuous current. The SCAE‑SNN achieves an F1 score of ≈ 96 %, comparable to the hybrid approach, while attaining 81.1 % spike sparsity, higher than both the CAE‑based method (28.6 %) and the delta‑threshold approach (71.7 %). Moreover, a standard CNN trained on Doppler spectrograms suffers a ~10 % drop in performance when retrained on raw CIR. This suggests that raw CIR is hard to classify without a learned encoding stage, consistent with the finding that encoding CIR data prior to classification improves both accuracy and efficiency.
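For intuition about the delta-threshold baseline and the sparsity figures, here is a minimal sketch of one common delta-encoding variant together with a sparsity measure. Both are assumptions: the paper's exact encoder rule and its definition of sparsity are not given in this summary. Here a spike is emitted when the signal moves more than `threshold` away from the last value that caused a spike, and sparsity is taken as the fraction of entries that are zero.

```python
def delta_encode(signal, threshold):
    """Emit a spike when the signal moves more than `threshold` from the reference.

    The reference starts at the first sample and is updated after each spike.
    """
    spikes = [0]  # the first sample has no predecessor, so it never spikes
    reference = signal[0]
    for value in signal[1:]:
        if abs(value - reference) > threshold:
            spikes.append(1)
            reference = value
        else:
            spikes.append(0)
    return spikes


def sparsity(spikes):
    """Fraction of entries that are zero (assumed definition)."""
    return 1.0 - sum(spikes) / len(spikes)


if __name__ == "__main__":
    encoded = delta_encode([0.0, 0.05, 0.3, 0.32, 0.9], threshold=0.2)
    print(encoded, sparsity(encoded))
```

A fixed threshold like this cannot adapt to the task, whereas a learned encoder can trade spike count against classification accuracy during training.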

The paper’s contributions are threefold: (1) introduction of a learned spike‑encoding mechanism tailored to CIR data, balancing recognition accuracy with ultra‑low spike activity; (2) demonstration that a spiking autoencoder can extract spectral features relevant for HAR without any explicit Doppler processing; (3) empirical evidence that the proposed architecture achieves F1 scores comparable to the hybrid CAE‑SNN baseline while substantially reducing spike activity, making it well suited to neuromorphic hardware.

Future work includes deploying the model on actual neuromorphic platforms (e.g., Intel Loihi, SpiNNaker) to quantify real power savings, extending the approach to multi‑user and multi‑activity scenarios, and investigating latency and robustness under streaming conditions. The open‑source implementation is provided at https://github.com/ele-ciccia/SCAE-SNN-HAR.
