Using Machine Learning to Take Stay-or-Go Decisions in Data-driven Drone Missions

December 04, 2025

Reading time: 5 minutes


📝 Original Info

Title: Using Machine Learning to Take Stay-or-Go Decisions in Data-driven Drone Missions
ArXiv ID: 2512.04773
Date: 2025-12-04
Authors: Giorgos Polychronis, Foivos Pournaropoulos, Christos D. Antonopoulos, Spyros Lalis

📝 Abstract

Drones are becoming indispensable in many application domains. In data-driven missions, besides sensing, the drone must process the collected data at runtime to decide whether additional action must be taken on the spot, before moving to the next point of interest. If processing does not reveal an event or situation that requires such an action, the drone has waited in vain instead of moving to the next point. If, however, the drone starts moving to the next point and it turns out that a follow-up action is needed at the previous point, it must spend time flying back. To take this decision, we propose different machine-learning methods based on branch prediction and reinforcement learning. We evaluate these methods for a wide range of scenarios where the probability of event occurrence changes with time. Our results show that the proposed methods consistently outperform the regression-based method proposed in the literature and can improve the worst-case mission time by up to 4.1x. Also, the achieved median mission time is very close, at most 2.7% higher, to that of a method with perfect knowledge of the current underlying event probability at each point of interest.

📄 Full Content

Drones, multicopters in particular, have become popular across a wide range of civilian applications because they are easy to deploy, can fly and hover in a very controllable way, and can be equipped with various sensors. In several cases, the missions are data-driven, i.e., the drone may need to perform further action(s) depending on the data collected via its onboard sensors. For example, in a smart agriculture scenario, if a pest is detected at a specific location in the field, the drone can directly spray that location with pesticide. In search and rescue missions, if a person is detected with some confidence, the drone may repeat the sensing from a lower altitude and deliver a first-aid kit before help arrives. In the case of firefighting, the same drone could be used to both detect and control the fire at its early stage. However, the resource- and power-constrained embedded computing platforms of such drones may take a long time to process the sensor data to detect events or situations that require further handling. If this processing must be done often, the overall mission time can increase significantly.

One way to reduce the mission time is to accelerate data processing by leveraging external powerful computational resources (e.g., cloud or edge servers) to offload processing [17], [13], [24]. A complementary approach, proposed in [23], is to exploit the fact that a follow-up action may not always be needed at every point of interest. Namely, the drone can proceed with its mission and go to the next point of interest, right after the sensing task is completed, to overlap the computation time with the flight time. If, however, it turns out that an action is needed at the previous point of interest, extra time is spent for the drone to fly back. The alternative is for the drone to stay at the point of interest and wait for the computation to finish. But if no follow-up action is required, the drone will have wasted time waiting in vain.
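The trade-off described above can be made concrete with a small expected-time comparison. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: constant flight speed, a drone that turns back as soon as the result is known, a drone that waits at the next point if the result is still pending on arrival, and a follow-up action whose own duration is ignored because it is the same in both cases. The names `t_proc`, `t_fly`, and `p_action` are hypothetical.

```python
def time_if_stay(t_proc: float, t_fly: float) -> float:
    """Wait for the result at the current point, then fly to the next one."""
    return t_proc + t_fly


def expected_time_if_go(p_action: float, t_proc: float, t_fly: float) -> float:
    """Leave right after sensing; processing overlaps with the flight.

    Assumptions (illustrative only): constant speed, immediate turn-back
    once the result is known, and the action time itself is ignored.
    """
    # No action needed: the computation is hidden behind the flight
    # (or the drone waits at the next point if the result is still pending).
    no_action = max(t_proc, t_fly)
    # Action needed: the result arrives at t_proc, when the drone is
    # min(t_proc, t_fly) of flight time away; it flies back, then flies again.
    action = t_proc + min(t_proc, t_fly) + t_fly
    return (1.0 - p_action) * no_action + p_action * action


def decide(p_action: float, t_proc: float, t_fly: float) -> str:
    """Pick the option with the lower expected time under this toy model."""
    go = expected_time_if_go(p_action, t_proc, t_fly)
    return "go" if go < time_if_stay(t_proc, t_fly) else "stay"


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.9):
        print(p, decide(p, t_proc=4.0, t_fly=10.0))
```

Under this toy model the better choice depends on the probability that an action is needed, which the drone does not know in practice; estimating it (or learning the decision directly) is what the methods discussed in the paper address.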

In this paper, we focus on the problem of learning how to take good stay-or-go decisions. More specifically, the main contributions are: (i) We propose a perceptron-based approach to tackle the decision problem. (ii) In addition, we tackle the problem using reinforcement learning. (iii) We evaluate both approaches via extensive simulation experiments for a wide range of scenarios where the probability of action at each point of interest changes with time. (iv) Our results show that both approaches can achieve good results in dynamic environments, clearly outperforming the regression-based method proposed in [23] while performing close to a method that takes decisions based on perfect knowledge of the underlying probabilities.
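To give a feel for what a perceptron-based decision could look like, here is a minimal sketch in the style of classic perceptron branch predictors from computer architecture. It is not the authors' design, which is not detailed in this excerpt. The class name `PerceptronStayOrGo`, the history length, the training threshold `theta`, and the choice of past outcomes as the only input features are all assumptions made for illustration.

```python
class PerceptronStayOrGo:
    """Toy perceptron predictor over a history of past outcomes.

    Hypothetical illustration: +1 means a follow-up action was needed
    at a past point of interest, -1 means it was not.
    """

    def __init__(self, history_len: int = 4, theta: int = 8):
        self.weights = [0] * (history_len + 1)  # weights[0] is the bias
        self.history = [-1] * history_len       # oldest outcome first
        self.theta = theta                      # training threshold

    def _output(self) -> int:
        # Bias plus dot product of the weights with the outcome history.
        return self.weights[0] + sum(
            w * x for w, x in zip(self.weights[1:], self.history)
        )

    def predict(self) -> str:
        # Non-negative output: an action is expected, so wait for the result.
        return "stay" if self._output() >= 0 else "go"

    def update(self, action_needed: bool) -> None:
        t = 1 if action_needed else -1
        y = self._output()
        predicted = 1 if y >= 0 else -1
        # Train on a misprediction or when the output is not yet confident.
        if predicted != t or abs(y) <= self.theta:
            self.weights[0] += t
            for i, x in enumerate(self.history):
                self.weights[i + 1] += t * x
        # Shift the observed outcome into the history.
        self.history = self.history[1:] + [t]


if __name__ == "__main__":
    predictor = PerceptronStayOrGo()
    for _ in range(20):
        predictor.update(action_needed=True)
    print(predictor.predict())
```

Because the weights keep adapting after each observed outcome, a predictor of this kind can follow an event probability that drifts over time, which is the setting the evaluation targets.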

The rest of the paper is structured as follows. Section 2 gives an overview of related work. Section 3 presents the system model and the decision problem. Section 4 provides the logic for controlling the drone to perform the mission at hand, independently of the method that is used to take the stay-or-go decision at each point of interest. Section 5 and Section 6 describe the methods for taking such decisions based on branch prediction and reinforcement learning, respectively. Section 7 presents the evaluation of the proposed decision methods. Finally, Section 8 concludes the paper.

Reducing mission time by offloading. A number of studies aim to reduce the completion time of resource-intensive computations on autonomous drones by offloading them to the cloud or an edge infrastructure. For example, the authors of [17] focus on drone-based navigation and mapping in unknown areas, where the drone dynamically decides when to offload processing. The work in [7] proposes an algorithm that enables multiple drones to dynamically select edge servers for task offloading, taking into account factors such as channel quality and predicted trajectory. In [13], the drone uses a heuristic to choose at runtime between executing the computation locally or offloading it to an edge server. The choice relies on prior knowledge of each server's end-to-end response time. The work in [24] investigates combined mission planning and offloading for the case where multiple drones executing different missions experience uncertainty in flying times, while offloading computations to nearby edge servers to reduce the time spent waiting for the results. Our approach complements these studies by addressing mission time optimization through informed decision-making about whether the drone should wait for computation results or proceed to the next waypoint.

The authors of [5] investigate a drone system that starts with high-altitude surveillance to scan large areas. If an object of interest is detected, the drone descends to conduct a more accurate inspection. Their work primarily explores the trade-offs among detection delay, area coverage, and sensing quality. In contrast, our work focuses on minimizing the mission time by enabling the drone to proceed to the next point of interest before the current d
