FireFly-P: FPGA-Accelerated Spiking Neural Network Plasticity for Robust Adaptive Control


Spiking Neural Networks (SNNs) offer a biologically plausible learning mechanism through synaptic plasticity, enabling unsupervised adaptation without the computational overhead of backpropagation. To harness this capability for robotics, this paper presents FireFly-P, an FPGA-based hardware accelerator that implements a novel plasticity algorithm for real-time adaptive control. By leveraging on-chip plasticity, our architecture enhances the network's generalization, ensuring robust performance in dynamic and unstructured environments. The hardware design achieves an end-to-end latency of just 8 µs for both inference and plasticity updates, enabling rapid adaptation to unseen scenarios. Implemented on a tiny Cmod A7-35T FPGA, FireFly-P consumes only 0.713 W and ~10K LUTs, making it ideal for power- and resource-constrained embedded robotic platforms. This work demonstrates that hardware-accelerated SNN plasticity is a viable path toward enabling adaptive, low-latency, and energy-efficient control systems.


💡 Research Summary

FireFly‑P presents a co‑designed algorithm‑hardware solution that brings biologically inspired synaptic plasticity to real‑time adaptive control on resource‑constrained embedded platforms. The authors first introduce a parametric plasticity rule composed of four terms: an associative Hebbian term (α Sj Si), a presynaptic depression term (β Sj), a postsynaptic homeostatic term (γ Si), and a weight‑decay regularization term (δ). Spike traces Sj and Si are exponentially decayed memories of recent spiking activity, updated each timestep with a decay constant λ. All four components are computed in parallel using 16‑bit floating‑point arithmetic, allowing fine‑grained weight adjustments while keeping hardware resource usage low.
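The four-term rule and trace dynamics described above can be sketched as follows. The coefficient values, trace decay constant, and vectorized layout are illustrative assumptions; the paper's FP16 hardware computes the same terms per synapse in parallel.

```python
import numpy as np

def update_traces(trace, spikes, lam=0.9):
    """Exponentially decayed memory of recent spiking activity.
    `lam` is an illustrative decay constant, not the paper's exact value."""
    return lam * trace + spikes

def delta_w(alpha, beta, gamma, delta, S_pre, S_post):
    """Parametric plasticity rule with four parallel terms.
    S_pre (Sj) and S_post (Si) are pre-/post-synaptic trace vectors;
    the result is the weight update for every synapse in the layer."""
    hebbian  = alpha * np.outer(S_post, S_pre)   # α Sj Si: associative term
    pre_dep  = beta * S_pre[np.newaxis, :]       # β Sj: presynaptic depression
    post_hom = gamma * S_post[:, np.newaxis]     # γ Si: postsynaptic homeostasis
    decay    = delta                             # δ: weight-decay regularizer
    return hebbian + pre_dep + post_hom + decay
```

On hardware, the four terms have no data dependencies on one another, which is what allows them to be evaluated concurrently before a final summation.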

The learning framework is split into two phases. In the offline phase, an Evolutionary Strategy (ES) searches the space of plasticity coefficients θ = {α,β,γ,δ}. A population of SNNs, each instantiated with a different θ, is evaluated on a representative control task; selection, mutation, and recombination iteratively improve the population until a robust rule θ* emerges. Importantly, this phase optimizes the learning rule itself, not the synaptic weights, thereby producing a rule that generalizes across diverse dynamics.
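A minimal sketch of the offline search over θ is shown below. The population size, elitism scheme, mutation scale, and the stand-in `fitness` callback are all assumptions; the paper's ES variant and control-task evaluation may differ. The key point it illustrates is that selection acts on the rule coefficients θ = (α, β, γ, δ), not on synaptic weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def evolve_rule(fitness, pop_size=32, elite=8, sigma=0.1, generations=50):
    """Minimal elitist ES over plasticity coefficients θ = (α, β, γ, δ).
    `fitness(theta)` would evaluate an SNN equipped with rule θ on a
    representative control task (here, any scalar-returning callable)."""
    pop = rng.normal(0.0, 1.0, size=(pop_size, 4))
    for _ in range(generations):
        scores = np.array([fitness(th) for th in pop])
        parents = pop[np.argsort(scores)[-elite:]]                 # select best rules
        children = parents[rng.integers(elite, size=pop_size - elite)]
        children = children + rng.normal(0.0, sigma, children.shape)  # mutate
        pop = np.vstack([parents, children])                       # recombine population
    scores = np.array([fitness(th) for th in pop])
    return pop[np.argmax(scores)]                                  # robust rule θ*
```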

During the online phase, the optimized rule θ* is hard‑wired into the FPGA accelerator. Starting from zero‑initialized weights, the SNN continuously updates its synaptic matrix according to the learned rule while simultaneously performing inference. This enables the network to autonomously reorganize its connectivity in response to disturbances such as sudden limb failure, payload changes, or external forces, without any external supervision.

Hardware-wise, FireFly‑P is implemented on a Xilinx Artix‑7 Cmod A7‑35T board, consuming only 0.713 W and roughly 10 K LUTs. The architecture consists of a Forward Engine and a Plasticity Engine that share a dual‑port BRAM‑based on‑chip memory, coordinated by a lightweight Scheduler. The Forward Engine uses a partial‑sum‑stationary dataflow and a multiplier‑free Leaky Integrate‑and‑Fire (LIF) neuron model (τm = 2) to compute membrane potentials and generate spikes in a three‑stage pipeline (psum calculation → neuron dynamics → trace update). The Plasticity Engine fetches the four plasticity parameters for each synapse in a single wide memory access, feeds them together with the pre‑ and post‑synaptic traces into an array of DSP blocks that compute the four terms concurrently, and aggregates the results with a pipelined adder tree to produce Δw.
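The multiplier-free LIF trick follows from τm = 2: the leak v/τm reduces to a single arithmetic right shift in fixed-point arithmetic, so the neuron-dynamics pipeline stage needs no multiplier. A sketch with an illustrative threshold and bit layout (not the paper's exact values):

```python
def lif_step(v, psum, v_thresh=64):
    """One multiplier-free LIF update (fixed-point, τm = 2).
    `psum` is the accumulated weighted input from the psum-calculation stage;
    threshold and widths are illustrative assumptions."""
    v = (v >> 1) + psum          # leak by right shift, then integrate input
    spike = 1 if v >= v_thresh else 0
    if spike:
        v = 0                    # reset membrane potential on fire
    return v, spike
```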

A key innovation is the deep pipelining across layers: while layer L2 performs a forward pass, layer L1 updates its weights, and vice versa. The Scheduler resolves read‑after‑write conflicts by giving write operations priority, thereby eliminating the need for double buffering and keeping the pipeline stall‑free. This overlapping of inference and learning reduces the end‑to‑end latency to just 8 µs per timestep (including both forward propagation and weight update).
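The write-priority scheme can be modeled as a trivial same-cycle arbitration policy: if a plasticity write and a forward-pass read hit the same memory in the same cycle, the write is granted first so the read observes fresh weights. This behavioral sketch (list-of-requests representation, function name) is an assumption, not the paper's RTL.

```python
def arbitrate(requests):
    """Grant same-cycle memory requests with write priority, so a
    read-after-write hazard resolves without double buffering.
    `requests` is a list of ("read" | "write", address) tuples."""
    writes = [r for r in requests if r[0] == "write"]
    reads = [r for r in requests if r[0] == "read"]
    return writes + reads        # grant order: all writes before any read
```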

Experimental evaluation on a suite of continuous control benchmarks (direction, velocity, and position generalization) demonstrates that FireFly‑P‑controlled robots maintain stable trajectories under abrupt perturbations and converge to appropriate behaviors three times faster than a non‑plastic baseline. Energy efficiency is also superior, with the accelerator consuming less than a watt while delivering high‑throughput, low‑latency updates.

In summary, FireFly‑P delivers (1) an evolution‑based offline optimization of a compact four‑term plasticity rule, (2) a resource‑efficient FPGA implementation that computes the rule in parallel using DSPs and FP16 arithmetic, and (3) a dual‑engine, deeply pipelined architecture that hides plasticity latency behind inference. These contributions make it the first lightweight SNN accelerator capable of real‑time, on‑chip synaptic adaptation for robust robotic control, opening avenues for deployment in multi‑robot teams, low‑power drones, and other edge‑AI cyber‑physical systems.

