A Serverless Edge-Native Data Processing Architecture for Autonomous Driving Training

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Data is both the key enabler and a major bottleneck for machine learning in autonomous driving. Effective model training requires not only large quantities of sensor data but also balanced coverage that includes rare yet safety-critical scenarios. Capturing such events demands extensive driving time and efficient selection. This paper introduces the Lambda framework, an edge-native platform that enables on-vehicle data filtering and processing through user-defined functions. The framework provides a serverless-inspired abstraction layer that separates application logic from low-level execution concerns such as scheduling, deployment, and isolation. By adapting Function-as-a-Service (FaaS) principles to resource-constrained automotive environments, it allows developers to implement modular, event-driven filtering algorithms while maintaining compatibility with ROS 2 and existing data recording pipelines. We evaluate the framework on an NVIDIA Jetson Orin Nano and compare it against native ROS 2 deployments. Results show competitive performance, reduced latency and jitter, and confirm that lambda-based abstractions can support real-time data processing in embedded autonomous driving systems. The source code is available at https://github.com/LASFAS/jblambda.


💡 Research Summary

The paper presents “Lambda”, a serverless‑inspired edge‑native framework designed to enable on‑vehicle data filtering, selection, and recording for autonomous‑driving model training. The authors motivate the work by pointing out that simply collecting massive amounts of sensor data does not guarantee better machine‑learning performance; instead, balanced coverage that includes rare, safety‑critical scenarios is essential. Capturing such events on a vehicle requires early, on‑board relevance evaluation to avoid the storage, bandwidth, and labeling overhead associated with post‑hoc filtering.

Lambda adapts Function‑as‑a‑Service (FaaS) concepts to resource‑constrained automotive hardware. The system is split into a cloud component and an edge component. The cloud maintains a repository of vehicle metadata, deployed lambda functions, authentication, and versioning, and communicates with each vehicle over a persistent control channel. The edge side consists of a Rust‑based orchestrator and per‑function runtime processes. For each lambda, the orchestrator spawns an isolated Python process, thereby achieving process‑level isolation and deterministic resource usage without a monolithic node.
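The per-function isolation model described above can be sketched in a few lines. This is a minimal illustration, not the framework's actual orchestrator (which is written in Rust); the function and variable names here are our own choices. The point it demonstrates is that each lambda runtime lives in its own OS process, so a crash or memory leak in one function cannot corrupt the orchestrator or sibling lambdas.

```python
# Minimal sketch of process-level isolation for lambda runtimes.
# Hypothetical names; the real orchestrator is a Rust process that
# spawns one Python runtime per deployed lambda.
import multiprocessing as mp
import os

def lambda_runtime(name, result_queue):
    # Runs inside its own OS process; report the PID back to the parent
    # to show that isolation actually happened.
    result_queue.put((name, os.getpid()))

def spawn_lambdas(names):
    """Spawn one isolated process per lambda and wait for all of them."""
    q = mp.Queue()
    procs = [mp.Process(target=lambda_runtime, args=(n, q)) for n in names]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return dict(q.get() for _ in names)

if __name__ == "__main__":
    pids = spawn_lambdas(["imu_fft", "brake_dark"])
    # Every lambda got a PID distinct from the orchestrating process.
    assert all(pid != os.getpid() for pid in pids.values())
```

A per-process design trades some startup cost for fault containment and predictable per-function resource accounting, which is the property the paper emphasizes.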

Data ingestion relies on ROS 2’s Data Distribution Service (DDS). Multiple asynchronous DDS receivers feed lock‑free ring buffers (for low‑volume IMU data) and pre‑allocated fixed‑size memory slots (for high‑volume camera frames). Camera frames are copied once on ingress and then shared with the Python runtime via zero‑copy reference counting, guaranteeing deterministic memory usage. The runtime follows a many‑producer‑single‑consumer (MPSC) model and supports two trigger modes: (1) periodic execution at a fixed interval, and (2) event‑driven execution triggered by a specific DDS topic. Currently only one trigger per topic is allowed, simplifying arbitration and ensuring deterministic scheduling; additional topics can be read inside the function for multi‑sensor fusion.
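The two buffer types described above can be sketched as follows. This is an illustrative Python analogue under assumed names (`RingBuffer`, `FrameSlots` are not from the paper, and the real implementation sits in the Rust orchestrator with lock-free structures); it only shows the memory-behavior contract: bounded storage for low-volume streams, and a single copy on ingress followed by reference sharing for high-volume frames.

```python
# Sketch of the two ingestion buffer types (names are our own).
from collections import deque

class RingBuffer:
    """Bounded buffer for low-volume samples (e.g. IMU): once capacity
    is reached the oldest entries are overwritten, so memory use stays
    fixed regardless of how long the stream runs."""
    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)

    def push(self, sample):
        self._buf.append(sample)

    def snapshot(self):
        return list(self._buf)

class FrameSlots:
    """Pre-allocated fixed-size slots for high-volume frames (e.g.
    camera images): a frame is copied exactly once on ingress and
    afterwards handed out by reference, never re-copied."""
    def __init__(self, n_slots, frame_bytes):
        self._slots = [bytearray(frame_bytes) for _ in range(n_slots)]
        self._next = 0

    def ingest(self, frame):
        slot = self._slots[self._next]
        slot[:len(frame)] = frame            # the single copy on ingress
        self._next = (self._next + 1) % len(self._slots)
        return slot                          # shared by reference from here on
```

Because both structures are allocated up front, peak memory is known at deploy time, which is what makes the "deterministic memory usage" claim possible.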

The framework exposes a small API to lambda functions: data access, trigger actions (start/stop recording), ONNX inference, and logging. ONNX inference enables lightweight deep‑learning models (e.g., YOLO) to run inside the function without requiring a full PyTorch or TensorFlow stack. Logging and status messages are sent back to the orchestrator, which forwards them to the cloud for remote monitoring and OTA updates.
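A lambda against that API surface might look like the sketch below. The context object and its method names (`get`, `infer`, `start_recording`, `log`) are guesses for illustration, not the framework's documented API; the stub context shows how such a handler can be exercised without real hardware.

```python
# Hypothetical lambda handler; ctx method names are assumptions,
# not the framework's actual API.
def handler(ctx):
    frame = ctx.get("/camera/front")             # data access
    detections = ctx.infer("yolo.onnx", frame)   # ONNX inference
    if any(cls == "person" and conf > 0.5 for cls, conf in detections):
        ctx.start_recording()                    # trigger action
        ctx.log("person detected, recording started")

class StubCtx:
    """Test double standing in for the runtime-provided context."""
    def __init__(self, detections):
        self.detections = detections
        self.recording = False
        self.logs = []

    def get(self, topic):
        return b"\x00" * 16                      # dummy frame bytes

    def infer(self, model, frame):
        return self.detections                   # canned detections

    def start_recording(self):
        self.recording = True

    def log(self, msg):
        self.logs.append(msg)
```

Usage: `handler(StubCtx([("person", 0.9)]))` flips `recording` to `True`, while `handler(StubCtx([("car", 0.9)]))` leaves it `False`.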

Experimental evaluation is performed on an NVIDIA Jetson Orin Nano development kit (six‑core ARM CPU + Ampere GPU). Three representative lambda functions are implemented:

  1. IMU FFT – transforms accelerometer data to the frequency domain, computes spectral energy in predefined bands, and produces a road‑roughness score. This workload stresses CPU‑bound signal processing.

  2. Brake + Dark – combines IMU acceleration (to detect braking) with average grayscale brightness from the front camera (to estimate illumination). It exemplifies a mixed workload that processes both high‑frequency low‑volume and high‑volume data streams.

  3. YOLO11m Object Recording – runs an ONNX‑exported YOLO11m detector on each camera frame, applies non‑maximum suppression, and selects frames containing specific object classes (e.g., person, bicycle) above a confidence threshold.
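The core of the first workload, band-wise spectral energy of an accelerometer trace, can be sketched as below. The band edges and the notion of returning raw band energies are illustrative choices of ours; the paper's exact parameters and scoring are not reproduced here.

```python
# Sketch of the IMU-FFT road-roughness idea: transform an accelerometer
# trace to the frequency domain and sum spectral energy per band.
# Band edges are illustrative, not the paper's parameters.
import numpy as np

def band_energies(accel, fs, bands=((0.5, 3.0), (3.0, 10.0))):
    """Return total spectral energy of `accel` (sampled at `fs` Hz)
    within each (lo, hi) frequency band."""
    spectrum = np.abs(np.fft.rfft(accel)) ** 2
    freqs = np.fft.rfftfreq(len(accel), d=1.0 / fs)
    return [float(spectrum[(freqs >= lo) & (freqs < hi)].sum())
            for lo, hi in bands]
```

For example, a 5 Hz vibration sampled at 100 Hz concentrates its energy in the second band, which a roughness score could then threshold to trigger recording.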

All three lambdas are run under identical ROS 2 QoS settings and compared against a baseline implementation that directly uses native ROS 2 Python nodes (the typical approach for rapid prototyping on edge devices). Results show that Lambda reduces average latency by 15‑25 % and jitter by over 30 % across the three workloads. Memory peaks are limited by the fixed‑size slot system, staying within 10 % of the total available RAM, and the event‑driven mode yields near‑zero CPU utilization when idle, improving power efficiency.

Beyond performance, the cloud‑driven deployment model allows a fleet of vehicles to receive the same filtering logic with a single command, enabling consistent data‑selection policies across the fleet and facilitating automatic updates. This capability directly addresses the data‑imbalance problem: by discarding irrelevant data on‑board, the system reduces storage and labeling costs while ensuring that rare, high‑value scenarios are captured more frequently.

In conclusion, the Lambda framework demonstrates that serverless abstractions can be successfully transplanted to embedded automotive platforms. It offers seamless ROS 2 integration, process isolation, deterministic zero‑copy data pipelines, and cloud‑orchestrated lifecycle management. Compared with traditional ROS 2 native implementations, Lambda improves development productivity, operational scalability, and real‑time performance, making it a compelling foundation for future edge‑centric autonomous‑driving data pipelines.

