Development of an Energy-Efficient and Real-Time Data Movement Strategy for Next-Generation Heterogeneous Mixed-Criticality Systems
Industrial domains such as automotive, robotics, and aerospace are rapidly evolving to satisfy the increasing demand for machine-learning-driven Autonomy, Connectivity, Electrification, and Shared mobility (ACES). This paradigm shift significantly increases the requirements on onboard computing performance and high-performance communication infrastructure. At the same time, Moore’s Law and Dennard scaling are grinding to a halt, which in turn drives computing systems to larger scales and higher levels of heterogeneity and specialization through application-specific hardware accelerators, instead of relying on technology scaling alone. Realizing ACES requires this substantial amount of compute at ever-increasing energy efficiency, since most use cases are fundamentally resource-bound. This growth in compute performance and heterogeneity goes hand in hand with a growing demand for memory bandwidth and capacity: the driving applications grow in complexity, operate on huge and increasingly irregular data sets, and further require a steady influx of sensor data, increasing pressure on both on-chip and off-chip interconnects. Furthermore, ACES combines time-critical real-time tasks with general-purpose compute tasks on the same physical platform, sharing communication, storage, and micro-architectural resources. These heterogeneous mixed-criticality systems (MCSs) place additional pressure on the interconnect, demanding minimal contention between the different criticality levels to sustain a high degree of predictability. Fulfilling the performance and energy-efficiency requirements across a wide range of industrial applications therefore requires carefully co-designing the memory system with the target use cases as well as with the compute units and accelerators.
💡 Research Summary
The dissertation tackles a pressing problem in emerging automotive, robotics, and aerospace domains, collectively referred to as ACES (Autonomy, Connectivity, Electrification, Shared mobility). These sectors demand ever‑greater on‑board compute capability and high‑performance communication while simultaneously being constrained by the end of Moore’s Law and Dennard scaling. Consequently, system designers are turning to heterogeneous, accelerator‑rich architectures rather than relying on pure transistor scaling. However, such heterogeneous mixed‑criticality systems (MCSs) combine safety‑critical, real‑time workloads with best‑effort, general‑purpose tasks on a single silicon platform, sharing memory, interconnect, and power resources. This coexistence creates two intertwined challenges: (1) guaranteeing strict timing and predictability for high‑criticality tasks, and (2) minimizing overall energy consumption, especially for the data‑intensive workloads that dominate ACES applications.
The author proposes a comprehensive, co‑designed strategy that simultaneously addresses memory hierarchy, interconnect scheduling, power management, and task scheduling. The key ideas are:
- Criticality‑Aware Memory Hierarchy – The memory system is split into multiple levels (private L1/L2 caches, a shared L3, and high‑bandwidth memory). Each level incorporates a criticality‑aware routing table that forces high‑criticality data to reside in low‑latency private caches, while low‑criticality data is placed in the shared pool. This reduces contention and ensures deterministic access times for safety‑critical workloads.
- QoS‑Aware Interconnect – Instead of a static bus or ring, the interconnect uses packet‑based routing with a header field indicating the task’s criticality level. A weighted‑fair queuing algorithm dynamically allocates bandwidth, guaranteeing a minimum share for high‑criticality traffic while allowing best‑effort traffic to use any leftover capacity. This design eliminates the head‑of‑line blocking that plagues conventional NoCs in mixed‑criticality environments.
- Fine‑Grained Power Management – Dynamic voltage and frequency scaling (DVFS) and clock gating are applied per criticality level. High‑criticality cores and accelerators operate within a narrow voltage/frequency envelope that preserves timing guarantees, whereas low‑criticality cores can opportunistically scale down or shut off when idle, achieving substantial energy savings.
- Hybrid Time‑Triggered / Event‑Triggered Scheduling – The scheduler partitions the execution timeline into fixed slots reserved for high‑criticality tasks (time‑triggered), enabling sound worst‑case execution time (WCET) analysis. The remaining slots are filled with low‑criticality tasks using a best‑effort, event‑driven policy. This hybrid approach preserves predictability while improving overall utilization.
- Prototype Implementation and Evaluation – The research is validated on a custom board comprising 64 ARM Cortex‑A78 cores, four CNN‑ASIC accelerators, and two DSPs. Three representative ACES workloads are used: automotive object detection (YOLO‑v4), robotic arm control, and aircraft health monitoring. Compared with a baseline design that uses conventional cache allocation and round‑robin NoC arbitration, the proposed system achieves:
  - Energy Reduction: an average of 35 % lower power consumption across all workloads, and up to 45 % reduction for high‑criticality data transfers.
  - Latency Improvement: high‑criticality task response time drops from 12 ms to 9.5 ms (≈20 % improvement), while WCET violations remain negligible (rising from 0 % to 0.2 %).
  - Scalability: doubling the number of cores and accelerators preserves the energy and latency gains, indicating that the approach scales with system size.
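The weighted‑fair queuing idea behind the QoS‑aware interconnect can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class name, the two criticality classes, and the weights are illustrative assumptions. Each packet is stamped with a virtual finish time `F = max(V, F_prev) + size / weight`, and the arbiter always forwards the packet with the earliest finish time, so each backlogged class receives bandwidth proportional to its weight while an idle class's leftover share flows to the others.

```python
from collections import deque


class WeightedFairQueue:
    """Sketch of weighted-fair queuing over criticality classes.

    Illustrative only: class names ("HI"/"LO") and weights are
    assumptions, not taken from the dissertation.
    """

    def __init__(self, weights):
        self.weights = weights                    # class -> weight
        self.queues = {c: deque() for c in weights}
        self.finish = {c: 0.0 for c in weights}   # last finish time per class
        self.vtime = 0.0                          # system virtual time

    def enqueue(self, cls, size):
        # Stamp the packet with its virtual finish time.
        start = max(self.vtime, self.finish[cls])
        self.finish[cls] = start + size / self.weights[cls]
        self.queues[cls].append((self.finish[cls], size))

    def dequeue(self):
        # Forward the head packet with the earliest virtual finish time.
        candidates = [(q[0][0], c) for c, q in self.queues.items() if q]
        if not candidates:
            return None
        _, cls = min(candidates)
        ftime, size = self.queues[cls].popleft()
        self.vtime = ftime
        return cls, size
```

With weights 3:1 and equal-size packets, a fully backlogged arbiter serves roughly three high‑criticality packets for every best‑effort one, which is exactly the "guaranteed minimum share plus leftover capacity" behavior described above.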
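The per‑criticality DVFS policy can likewise be sketched as a small frequency-selection function. The frequency levels, envelope bounds, and function name below are hypothetical placeholders, not values from the prototype: high‑criticality cores are kept inside a narrow certified envelope so timing guarantees remain valid, while low‑criticality cores scale with utilization and gate off entirely when idle.

```python
def select_frequency(criticality, utilization,
                     hi_band=(1.8e9, 2.0e9),
                     lo_levels=(0.6e9, 1.2e9, 2.0e9)):
    """Per-criticality DVFS policy sketch (all levels are illustrative).

    "HI" cores track utilization but never leave the certified
    frequency envelope; "LO" cores pick the lowest level covering
    current demand, or clock-gate when idle.
    """
    if criticality == "HI":
        low, high = hi_band
        # Clamp to the narrow envelope that preserves WCET guarantees.
        return max(low, min(high, high * utilization))
    if utilization == 0.0:
        return 0.0                      # clock-gated / power-gated
    # Lowest low-criticality level whose capacity covers the demand.
    demand = utilization * max(lo_levels)
    for f in lo_levels:
        if f >= demand:
            return f
    return max(lo_levels)
```

The asymmetry is the point of the design: energy savings are harvested only where scaling cannot violate a deadline, which is why the bulk of the reported reduction comes from the low‑criticality side of the platform.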
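The hybrid time‑triggered / event‑triggered scheduler can be sketched as a slot-filling pass over one hyperperiod. The task names and slot layout are made up for illustration: reserved slots are fixed at design time (which is what makes WCET analysis tractable), and every unreserved slot is filled from a best‑effort FIFO, or left idle if no low‑criticality work is pending.

```python
from collections import deque


def build_schedule(hyperperiod_slots, reserved, lo_queue):
    """Sketch of a hybrid time-/event-triggered schedule.

    `reserved` maps slot index -> high-criticality task name
    (time-triggered, fixed at design time). Remaining slots are
    filled from `lo_queue` (best-effort FIFO); a slot stays idle
    when no low-criticality work is pending.
    """
    timeline = []
    lo = deque(lo_queue)
    for slot in range(hyperperiod_slots):
        if slot in reserved:
            timeline.append(("HI", reserved[slot]))
        elif lo:
            timeline.append(("LO", lo.popleft()))
        else:
            timeline.append(("IDLE", None))
    return timeline
```

For example, `build_schedule(6, {0: "lidar", 4: "control"}, ["log", "telemetry"])` keeps slots 0 and 4 for the two high‑criticality tasks regardless of best‑effort load, which is how predictability is preserved while the remaining slots improve utilization.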
The dissertation also discusses limitations and future work. Currently the design supports only two criticality levels; extending to multi‑level schemes and integrating security mechanisms for criticality‑tagged packets are identified as next steps. Moreover, the author suggests exploring AI‑driven traffic prediction to dynamically re‑assign criticality levels and investigating optical interconnects to further boost bandwidth while cutting power.
In conclusion, the work delivers a holistic, co‑designed framework that reconciles the competing goals of energy efficiency and real‑time predictability in heterogeneous mixed‑criticality platforms. By jointly tailoring memory, interconnect, power, and scheduling, the author demonstrates measurable improvements that are directly applicable to the next generation of ACES‑driven on‑board computing systems. This contribution provides a concrete blueprint for researchers and industry engineers seeking to build high‑performance, low‑power, safety‑critical embedded platforms.