Collection: UAV-Based Wireless Multi-modal Measurements from AERPAW Autonomous Data Mule (AADM) Challenge in Digital Twin and Real-World Environments
In this work, we present an unmanned aerial vehicle (UAV) wireless dataset collected as part of the AERPAW Autonomous Aerial Data Mule (AADM) challenge, organized by the NSF Aerial Experimentation and Research Platform for Advanced Wireless (AERPAW) project. The AADM challenge was the second competition in which an autonomous UAV acted as a data mule, where the UAV downloaded data from multiple base stations (BSs) in a dynamic wireless environment. Participating teams designed flight control and decision-making algorithms for choosing which BSs to communicate with and how to plan flight trajectories to maximize data download within a mission completion time. The competition was conducted in two stages: Stage 1 involved development and experimentation using a digital twin (DT) environment, and in Stage 2, the final test run was conducted on the outdoor testbed. The total score for each team was compiled from both stages. The resulting dataset includes link quality and data download measurements in both the DT and physical environments. Along with the USRP measurements used in the contest, the dataset also includes UAV telemetry, position estimates from Keysight RF sensors, link quality measurements from LoRa receivers, and Fortem radar measurements. It supports reproducible research on autonomous UAV networking, multi-cell association and scheduling, air-to-ground propagation modeling, DT-to-real-world transfer learning, and integrated sensing and communication, and serves as a benchmark for future autonomous wireless experimentation.
💡 Research Summary
The paper presents a comprehensive wireless dataset collected during the AERPAW Autonomous Aerial Data Mule (AADM) challenge, organized by the NSF AERPAW project and the second competition in which an autonomous UAV acted as a data mule. The challenge tasked autonomous UAVs with downloading data from four ground base stations (BSs) distributed across the Lake Wheeler Field Labs in Raleigh, NC. Each UAV performed a 500-second flight at a nominal altitude of 25 m, dynamically selecting which BS to associate with, how long to stay connected, and when to switch, in order to maximize the total amount of data downloaded.
The competition consisted of two stages. Stage 1 used a high-fidelity digital twin (DT) that replicated the physical testbed's topology and channel conditions (LOS plus ground-reflection model, no fading). Participants uploaded containerized algorithms that were executed in the DT for three distinct missions, each with a different data-volume configuration across the four BSs, for example (600, 600, 600, 200), (100, 400, 400, 100), and (20, 50, 400, 30) Mbit. Stage 2 transferred the same containers unchanged to the outdoor testbed for real-world validation. DT scores contributed 20% of the final ranking and real-world performance contributed 80%, which rewards algorithms that generalize from simulation to the field.
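The "LOS plus ground-reflection, no fading" channel described above corresponds to the classic two-ray model. The following is a minimal sketch of that model, not the DT's actual implementation: it assumes isotropic antennas, a flat ground with reflection coefficient -1, and a hypothetical BS antenna height of 10 m. The 3.4 GHz carrier is the USRP frequency listed in the dataset description, and 25 m is the nominal flight altitude.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def free_space_path_gain_db(distance_m, freq_hz=3.4e9):
    """Free-space path gain in dB (negative values mean loss)."""
    wavelength = C / freq_hz
    return 20.0 * math.log10(wavelength / (4.0 * math.pi * distance_m))


def two_ray_path_gain_db(horizontal_m, h_bs_m, h_uav_m, freq_hz=3.4e9, gamma=-1.0):
    """Path gain in dB for a direct ray plus one ground-reflected ray.

    gamma is the ground reflection coefficient (assumed -1 here).
    Both heights must be positive, otherwise the two rays cancel exactly.
    """
    wavelength = C / freq_hz
    k = 2.0 * math.pi / wavelength  # wavenumber
    d_los = math.hypot(horizontal_m, h_uav_m - h_bs_m)  # direct path length
    d_ref = math.hypot(horizontal_m, h_uav_m + h_bs_m)  # path via the image source
    field = (cmath.exp(-1j * k * d_los) / d_los
             + gamma * cmath.exp(-1j * k * d_ref) / d_ref)
    gain = (wavelength / (4.0 * math.pi)) ** 2 * abs(field) ** 2
    return 10.0 * math.log10(gain)


if __name__ == "__main__":
    # Hypothetical geometry: BS antenna at 10 m, UAV at 25 m.
    for d in (50, 100, 200, 400):
        print(d, round(two_ray_path_gain_db(d, 10.0, 25.0), 1))
```

Setting `gamma=0` removes the reflected ray and reduces the result to free-space path gain, which gives a quick sanity check. Comparing such a model against the real-world SNR traces is one way to quantify what the fading-free DT leaves out.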
The dataset is multi‑modal and time‑synchronized at a 1‑Hz resolution. It includes:
- UAV telemetry (RTK‑GNSS position, altitude, velocity, heading).
- Link‑quality metrics per second (SNR, RSSI, estimated data rate, current BS ID, cumulative downloaded data).
- USRP B205/B210 raw I/Q samples and derived data‑rate estimates at 3.4 GHz.
- Keysight RF sensor TDOA measurements derived from raw I/Q, providing an independent UAV position estimate.
- Fortem R20 radar tracks (3‑D position, radial velocity, radar cross‑section).
- LoRa gateway logs (packet RSSI, SNR, gateway ID, geographic coordinates).
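A joined, one-row-per-second view across these modalities could be built along the following lines. This is only a sketch: the file layout of the dataset is not described here, so the input format (lists of `(timestamp_seconds, value)` pairs) and the stream names `snr_db` and `bs_id` are hypothetical.

```python
from collections import defaultdict


def align_to_1hz(streams):
    """Merge timestamped streams into one row per whole second.

    streams maps a stream name to a list of (timestamp_seconds, value) pairs.
    Timestamps are floored to the second; if a stream has several samples
    within the same second, the latest one is kept.
    """
    rows = defaultdict(dict)
    for name, samples in streams.items():
        for t, value in sorted(samples, key=lambda s: s[0]):
            rows[int(t)][name] = value
    # Emit rows in time order; streams with no sample in a second are absent.
    return [dict(t=sec, **rows[sec]) for sec in sorted(rows)]


if __name__ == "__main__":
    demo = {
        "snr_db": [(0.1, 20.0), (0.9, 21.0), (2.2, 18.0)],
        "bs_id": [(0.5, 1), (1.4, 2)],
    }
    for row in align_to_1hz(demo):
        print(row)
```

Keeping missing values absent, rather than interpolating them, makes gaps in a given sensor visible to downstream analysis.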
Scoring follows \(S = S_1 + S_2 - P\), where \(S_1 = (500 - T) \times I(\text{download complete})\), \(S_2 = 100 \times \frac{D_{\text{down}}}{D_{\text{total}}}\), and \(P\) is a penalty for missing the landing deadline. This formulation balances data-collection efficiency against mission-time constraints, providing a natural multi-objective benchmark.
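The scoring rule can be written as a short function. This sketch assumes that T is the mission completion time in seconds and that the indicator equals 1 only when all offered data has been downloaded; the size of the penalty P is not given above, so it is passed in as a parameter.

```python
def mission_score(t_complete_s, downloaded_mbit, total_mbit,
                  penalty=0.0, t_max_s=500.0):
    """Score S = S1 + S2 - P for a single mission.

    S1 rewards finishing early, but only if the download is complete.
    S2 is the percentage of offered data that was downloaded.
    penalty (P) is an assumed, caller-supplied value.
    """
    download_complete = downloaded_mbit >= total_mbit
    s1 = (t_max_s - t_complete_s) if download_complete else 0.0
    s2 = 100.0 * downloaded_mbit / total_mbit
    return s1 + s2 - penalty


if __name__ == "__main__":
    # Mission with (600, 600, 600, 200) Mbit offered, i.e. 2000 Mbit in total.
    print(mission_score(350.0, 2000.0, 2000.0))  # full download, finished early
    print(mission_score(500.0, 1000.0, 2000.0))  # half of the data, no time bonus
```

Because S1 is gated on a complete download, a partial download can score at most 100 points before penalties, which is what pushes algorithms to finish every BS rather than linger on the best link.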
A total of 22 teams registered; 15 participated in the DT stage (45 flights) and 11 advanced to the real‑world stage (33 flights). Consequently, the dataset contains paired DT and real‑world logs for the same algorithms, enabling direct evaluation of domain shift, transfer learning, and domain‑adaptation techniques.
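One simple way to use the paired logs is to compute, per team, how much a metric changes between the DT and the outdoor testbed. The sketch below assumes each run has already been reduced to a single number (for example, total Mbit downloaded) keyed by a hypothetical `(team, mission)` pair; teams that flew only in the DT drop out of the comparison.

```python
from statistics import mean


def sim_to_real_gap(dt_runs, real_runs):
    """Per-team mean of (real - DT) over missions flown in both environments.

    Each argument maps (team, mission) -> metric value.
    """
    diffs = {}
    for key in dt_runs.keys() & real_runs.keys():
        team, _mission = key
        diffs.setdefault(team, []).append(real_runs[key] - dt_runs[key])
    return {team: mean(values) for team, values in diffs.items()}


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the dataset.
    dt = {("A", 1): 100.0, ("A", 2): 200.0, ("B", 1): 50.0, ("C", 1): 10.0}
    real = {("A", 1): 80.0, ("A", 2): 150.0, ("B", 1): 60.0}
    print(sim_to_real_gap(dt, real))
```

A negative gap means the algorithm did worse outdoors than in the DT, which is the pattern a domain-adaptation method would aim to shrink.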
The authors highlight several research opportunities:
- Multi‑cell scheduling and BS association – The heterogeneous data‑volume configurations force UAVs to make trade‑offs between link quality, distance, and remaining traffic, ideal for reinforcement‑learning or optimization‑based scheduling studies.
- Energy‑aware mission planning – Although battery consumption is not directly logged, speed, acceleration, and hover periods allow inference of energy usage, supporting studies on energy‑optimal trajectories under strict time budgets.
- UAV localization and sensor fusion – Independent position estimates from Keysight RF sensors, Fortem radar, and LoRa gateways can be fused and compared against GNSS ground truth to benchmark RF-based localization, Kalman filtering, and integrated sensing and communication (ISAC) algorithms.
- Fairness and QoS analysis – The dataset permits investigation of whether certain BSs are systematically underserved, informing equitable resource‑allocation policies for aerial IoT collectives.
- Scalability and multi‑UAV extensions – Researchers can simulate multiple UAVs operating concurrently, using the provided DT and real‑world traces as ground truth for validation.
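As one concrete instance of the fairness analysis mentioned in the list above, per-BS service can be summarized with Jain's fairness index. This metric is a common choice and is not named in the paper; the per-BS download fractions in the example are illustrative placeholders, not measured values.

```python
def jain_index(values):
    """Jain's fairness index: 1.0 is perfectly even, 1/n is maximally uneven."""
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        return 1.0  # all-zero allocation is treated as trivially equal
    return total * total / (len(values) * squares)


if __name__ == "__main__":
    # Fraction of each BS's offered data that was downloaded (placeholder numbers).
    served = [1.0, 0.9, 0.4, 0.0]
    print(round(jain_index(served), 3))
```

Using the downloaded fraction per BS, rather than raw Mbit, keeps the index comparable across missions with different data-volume configurations.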
In summary, this work delivers the first publicly available, richly annotated dataset of an autonomous UAV performing dynamic multi‑cell data‑muling in both a digital twin and a real‑world environment. By coupling high‑resolution telemetry, multi‑modal RF sensing, and a well‑defined scoring framework, the dataset serves as a benchmark for autonomous UAV networking, air‑to‑ground channel modeling, DT‑to‑real transfer learning, and integrated sensing‑communication research, paving the way for more reliable and scalable aerial data‑collection systems.