AUV Trajectory Learning for Underwater Acoustic Energy Transfer and Age Minimization

Notice: This research summary and analysis were automatically generated using AI technology. For definitive details, please refer to the original arXiv source.

The Internet of Underwater Things (IoUT) is attracting growing attention for monitoring sea life and the deep-ocean environment, underwater surveillance, and maintenance of underwater installations. However, conventional IoUT devices rely on battery power, which limits their lifespan and poses environmental hazards upon disposal. This paper introduces a sustainable approach in which an autonomous underwater vehicle (AUV) simultaneously collects uplink information from the IoUT devices and delivers acoustic energy transfer (AET) to them, potentially enabling indefinite operation. To capture time-sensitivity and fairness, we adopt the age of information (AoI) metric and Jain's fairness index. We develop two deep reinforcement learning (DRL) algorithms: a high-complexity, high-performance frequency-division duplex (FDD) solution and a low-complexity, medium-performance time-division duplex (TDD) approach. The results show that the proposed FDD and TDD solutions significantly reduce the average AoI and boost both the harvested energy and data-collection fairness compared to baseline approaches.


💡 Research Summary

The paper addresses the critical challenge of powering and collecting timely data from battery‑constrained underwater Internet‑of‑Things (IoUT) devices. It proposes a novel framework in which an autonomous underwater vehicle (AUV) simultaneously performs acoustic energy transfer (AET) and uplink data collection, while optimizing its three‑dimensional trajectory and the scheduling of the IoUT nodes. The authors formulate a multi‑objective problem that seeks to minimize the average weighted Age of Information (AoI) across all devices, maximize the harvested acoustic energy, and improve fairness of data collection as measured by Jain’s fairness index.
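Jain's fairness index, mentioned above as the fairness measure, has the standard closed form J(x) = (Σxᵢ)² / (n·Σxᵢ²), ranging from 1/n (one node gets everything) to 1 (perfectly equal). A minimal sketch:

```python
def jains_index(x):
    """Jain's fairness index over per-node allocations x.

    Ranges from 1/n (maximally unfair) to 1.0 (all equal).
    """
    n = len(x)
    total = sum(x)
    sq_sum = sum(v * v for v in x)
    return total * total / (n * sq_sum) if sq_sum > 0 else 0.0
```

Applied to per-node data-collection counts, equal counts give 1.0 and a single dominant node drives the index toward 1/n.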

Two duplexing strategies are investigated. The first, Frequency‑Division Duplexing (FDD), allocates separate frequency bands for AET and data transmission, allowing concurrent operation but requiring dual antennas, duplexers, and a broader spectrum. The second, Time‑Division Duplexing (TDD), shares a single frequency band and divides the transmission interval into an energy‑transfer portion (βτ) and a data‑uplink portion ((1‑β)τ). TDD reduces hardware complexity and spectrum usage at the cost of temporal separation between energy and data phases.
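The energy/data trade-off within one TDD slot can be sketched as below; the slot length τ, harvested power, and uplink rate are illustrative parameters, not values from the paper:

```python
def tdd_slot(tau, beta, p_harv, rate):
    """Split one TDD slot of length tau seconds: a fraction beta is
    used for acoustic energy transfer, the remainder for data uplink.
    Returns (harvested energy in J, collected data in bits)."""
    assert 0.0 <= beta <= 1.0
    energy = p_harv * beta * tau        # harvested during the AET phase
    bits = rate * (1.0 - beta) * tau    # collected in the uplink phase
    return energy, bits
```

Raising β favors energy delivery at the expense of data freshness, which is exactly the knob the TDD agent learns to set per slot.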

To solve the high‑dimensional trajectory‑scheduling problem, the authors employ deep reinforcement learning (DRL) with the Proximal Policy Optimization (PPO) algorithm. The state space includes the AUV’s 3‑D coordinates, the current AoI of each node, residual battery levels, channel attenuation, and duplexing mode. The action space comprises movement directions (six‑neighbor moves in the grid) and, for TDD, the time‑splitting factor β; for FDD, the selection of which node receives energy versus data in each slot. The reward function is a weighted sum of (i) AoI reduction, (ii) harvested energy increase, and (iii) improvement in Jain’s fairness index, thereby enforcing a balanced trade‑off among the three objectives.
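The weighted-sum reward described above might be sketched as follows; the specific weight values are illustrative assumptions, not taken from the paper:

```python
def reward(delta_aoi, delta_energy, delta_fairness,
           weights=(0.5, 0.3, 0.2)):
    """Per-step reward as a weighted sum of (i) AoI reduction,
    (ii) harvested-energy increase, and (iii) improvement in Jain's
    fairness index. The weights here are hypothetical placeholders."""
    w_aoi, w_energy, w_fair = weights
    return (w_aoi * delta_aoi
            + w_energy * delta_energy
            + w_fair * delta_fairness)
```

Each component is a per-step improvement (delta), so the agent is rewarded for progress on all three objectives rather than for absolute levels.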

The system model adopts realistic underwater acoustic channel characteristics: Thorp's absorption formula, spherical or cylindrical spreading loss, and ambient noise. Acoustic source level, directivity index, and electro‑acoustic conversion efficiency (η ranging from 0.2 to 0.7) are incorporated to compute the received signal level, the required transmit power for a given SNR, and the harvestable power at each sensor node. Energy harvesting is modeled as P_harv = η·V_ind²/(4·R_p), where V_ind depends on the received pressure and the hydrophone sensitivity.
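These two pieces of the channel model translate directly into code. Thorp's formula below is the standard empirical absorption expression (frequency in kHz, result in dB/km), and the harvesting line mirrors P_harv = η·V_ind²/(4·R_p); the example input values are illustrative only:

```python
def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical absorption coefficient (dB/km), f in kHz."""
    f2 = f_khz ** 2
    return (0.11 * f2 / (1.0 + f2)
            + 44.0 * f2 / (4100.0 + f2)
            + 2.75e-4 * f2
            + 0.003)

def harvested_power(v_ind, eta, r_p):
    """P_harv = eta * V_ind^2 / (4 * R_p): power delivered to a
    matched load from induced voltage v_ind (V) across the
    piezoelectric source resistance r_p (ohm), scaled by the
    electro-acoustic conversion efficiency eta."""
    return eta * v_ind ** 2 / (4.0 * r_p)
```

At a typical 10 kHz carrier, Thorp absorption is on the order of 1 dB/km, which is why AUV proximity matters so much for the harvested power budget.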

Simulation scenarios involve 20–50 IoUT nodes randomly placed in a 3‑D grid of several hundred cubic meters. The AUV starts from a surface buoy and moves with a fixed step size. Baseline policies include Random Walk (RW), Round Robin (RR), and a Greedy Algorithm (GA) that selects the node with the highest instantaneous AoI. Performance metrics are average AoI, total harvested energy, and Jain’s fairness index.
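The AoI dynamics and the Greedy Algorithm (GA) baseline described above can be sketched as follows; resetting a served node's AoI to 1 after collection is a common convention and an assumption here, not a detail confirmed by the paper:

```python
def step_aoi(aoi, served=None):
    """Advance every node's AoI by one slot; the served node (if any)
    just delivered a fresh packet, so its AoI resets to 1."""
    new = [a + 1 for a in aoi]
    if served is not None:
        new[served] = 1
    return new

def greedy_pick(aoi):
    """GA baseline: serve the node with the highest instantaneous AoI."""
    return max(range(len(aoi)), key=lambda i: aoi[i])
```

Greedy scheduling bounds the worst-case AoI but ignores the AUV's travel cost and energy delivery, which is the gap the learned policies exploit.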

Results show that the FDD‑PPO solution achieves the highest performance: average AoI is reduced by roughly 45 % compared with the best baseline, harvested energy increases by over 30 %, and fairness reaches 0.92. The TDD‑PPO approach yields slightly lower but still impressive gains (≈38 % AoI reduction, ≈25 % energy increase, fairness ≈0.88) while requiring only a single antenna and half the hardware cost. Both DRL‑based methods outperform the baselines by factors of 2–3 in AoI reduction and 1.5–2 in energy harvesting, confirming the effectiveness of joint trajectory and scheduling optimization.

The authors discuss practical implications, noting that the FDD scheme is suitable when spectrum and hardware resources are abundant, whereas TDD offers a low‑cost alternative for resource‑constrained deployments. They also outline future research directions: cooperative multi‑AUV coordination, incorporation of more complex channel dynamics (e.g., surface/bottom reflections), online adaptation to environmental changes, and experimental validation in real underwater testbeds.

In summary, this work demonstrates that deep reinforcement learning can effectively manage the intertwined problems of energy provisioning and information freshness in underwater IoUT networks. By jointly optimizing AUV motion, duplexing mode, and node scheduling, the proposed solutions enable sustainable, long‑lived underwater sensor deployments while maintaining high data quality and fairness.

