Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

This work proposes a general framework for the design and simulation of network on chip based turbo decoder architectures. Several parameters in the design space are investigated, namely the network topology, the parallelism degree, the rate at which messages are sent by processing nodes over the network and the routing strategy. The main results of this analysis are: i) the most suited topologies to achieve high throughput with a limited complexity overhead are generalized de-Bruijn and generalized Kautz topologies; ii) depending on the throughput requirements different parallelism degrees, message injection rates and routing algorithms can be used to minimize the network area overhead.


💡 Research Summary

The paper introduces “Turbo NOC,” a comprehensive framework for designing and evaluating network‑on‑chip (NoC) based turbo decoder architectures. Traditional turbo decoder implementations rely on bus‑centric or monolithic designs that struggle to scale efficiently as parallelism increases, leading to higher latency, congestion, and power consumption. Turbo NOC addresses these limitations by distributing the MAP (Maximum‑A‑Posteriori) decoding operations across a configurable array of processing nodes and interconnecting them with a dedicated NoC that carries soft‑information (LLR) packets.

The authors define a four‑dimensional design space: (1) network topology, (2) degree of parallelism (P), (3) message injection rate (R) – the number of packets each node can inject per clock cycle, and (4) routing strategy. Topologies examined include conventional mesh, torus, hyper‑cube, and two graph‑theoretic structures: generalized de‑Bruijn and generalized Kautz graphs. The latter two possess logarithmic diameter and high connectivity, promising low average hop count even as node count grows. Parallelism determines how many MAP units operate concurrently; higher P yields higher raw throughput but also increases traffic load on the NoC. The injection rate R must be tuned to avoid network saturation while meeting target throughput. Routing strategies evaluated are static (pre‑computed paths), minimal‑hop (shortest‑path) and adaptive (congestion‑aware) routing.
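Part of the appeal of the de Bruijn and Kautz families for hardware is that successors of a node follow directly from modular arithmetic, so next-hop computation is a single multiply-add. The sketch below (Python; the graph definitions are the standard Imase-Itoh generalized constructions, not code from the paper) builds both adjacencies and verifies the logarithmic diameter by BFS.

```python
from collections import deque

def gen_de_bruijn(n, d):
    """Generalized de Bruijn digraph GB(d, n): node i has successors
    (d*i + j) mod n for j = 0, ..., d-1."""
    return {i: [(d * i + j) % n for j in range(d)] for i in range(n)}

def gen_kautz(n, d):
    """Generalized Kautz digraph GK(d, n): node i has successors
    (-d*i - j) mod n for j = 1, ..., d."""
    return {i: [(-d * i - j) % n for j in range(1, d + 1)] for i in range(n)}

def diameter(adj):
    """Largest shortest-path hop count over all node pairs (BFS per source);
    returns infinity if the digraph is not strongly connected."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        if len(dist) < len(adj):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst

# With 16 nodes of out-degree 2, both families meet the ceil(log_d n) = 4
# diameter bound -- the same diameter as a 4x4 torus, which needs twice the
# ports per router.
print(diameter(gen_de_bruijn(16, 2)), diameter(gen_kautz(16, 2)))
```

Because the bound is `ceil(log_d n)`, the diameter grows only logarithmically with the node count at fixed degree, which is what keeps average hop count low as the parallelism degree P scales up.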

A cycle‑accurate simulator evaluates each configuration for throughput, latency, network saturation, router area, and power. Results show that generalized de‑Bruijn and Kautz topologies consistently outperform mesh/torus in both throughput (up to 30 % improvement) and area efficiency because their short diameters keep packet latency low and reduce buffer requirements. Kautz, with its higher out‑degree, tolerates larger injection rates without saturating, making it ideal for high‑performance scenarios.

When the design goal is >1 Gbps throughput, the optimal point is P = 64 processing elements, R = 2 packets per cycle, and minimal‑hop routing on a generalized Kautz graph. This combination achieves the required data rate while keeping router buffers modest. For low‑power, area‑constrained applications, the best trade‑off is P = 32, R = 1, static routing on a generalized de‑Bruijn graph, which reduces silicon area by roughly 20 % compared with mesh‑based designs while still delivering adequate throughput for mobile or IoT devices.
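To see how P, the clock rate, and per-half-iteration overhead interact, a common first-order throughput model for parallel turbo decoders can be used (an illustrative approximation, not a formula from the paper; the clock frequency, iteration count, block length, and overhead below are all hypothetical numbers):

```python
def turbo_throughput(f_clk_hz, p, n_iter, k, overhead_cycles=0):
    """First-order throughput estimate for a parallel turbo decoder.

    Assumes the K-bit block is split into P windows, each MAP unit advances
    one trellis step per cycle, and every half-iteration pays a fixed
    overhead (pipeline fill, interleaver/NoC latency)."""
    # 2 half-iterations per full iteration
    cycles = 2 * n_iter * (k / p + overhead_cycles)
    return k / cycles * f_clk_hz  # decoded bits per second

# Hypothetical operating point: 200 MHz clock, 8 iterations, K = 6144 bits.
print(turbo_throughput(200e6, 64, 8, 6144, overhead_cycles=32) / 1e6, "Mbps")
# -> 600.0 Mbps
```

The model makes the trade-off visible: doubling P roughly doubles throughput only while K/P dominates the fixed overhead, and that overhead term is exactly where NoC latency enters, which is why the topology and injection rate matter more as P grows.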

Routing analysis reveals that static routing minimizes hardware overhead but suffers from congestion hotspots under uneven traffic. Adaptive routing eliminates hotspots by dynamically selecting less‑loaded links, but it incurs extra logic and buffer resources, increasing area and power. Minimal‑hop routing offers a middle ground: pre‑computed shortest paths keep average hop count low without the complexity of full adaptivity.
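Static and minimal-hop routing both reduce, at run time, to looking up a precomputed output port per (current node, destination) pair. A minimal sketch of how such tables could be precomputed with one BFS per destination (an illustration of the general technique, not the paper's algorithm):

```python
from collections import deque

def routing_tables(adj):
    """Per-node static routing table: table[u][dst] is an output-port index
    into adj[u] lying on one shortest path from u to dst. Built with a BFS
    from each destination over the reversed graph; table[u][u] is omitted."""
    # Reverse adjacency: redges[v] lists (u, port) pairs with adj[u][port] == v.
    redges = {v: [] for v in adj}
    for u, outs in adj.items():
        for port, v in enumerate(outs):
            redges[v].append((u, port))
    table = {u: {} for u in adj}
    for dst in adj:
        dist = {dst: 0}
        q = deque([dst])
        while q:
            v = q.popleft()
            for u, port in redges[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    table[u][dst] = port  # first hop on a shortest u -> dst path
                    q.append(u)
    return table

# Bidirectional 4-node ring: port 0 = clockwise, port 1 = counter-clockwise.
ring = {i: [(i + 1) % 4, (i - 1) % 4] for i in range(4)}
tables = routing_tables(ring)
```

In hardware each table collapses to a small ROM or combinational function per router, which is why static routing is the cheapest option; adaptive routing instead chooses among candidate ports at run time from local congestion state, and that selection logic plus the extra buffering is where its area and power cost comes from.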

The paper culminates in a set of design guidelines: for high‑throughput base‑station or data‑center equipment, adopt a generalized Kautz topology, high parallelism, higher injection rates, and minimal‑hop routing; for power‑sensitive handheld or IoT devices, prefer generalized de‑Bruijn, moderate parallelism, low injection rates, and static routing. Turbo NOC thus provides a systematic methodology to explore the trade‑offs among performance, area, and power in NoC‑based turbo decoders, enabling designers to tailor architectures to specific application requirements.

