Acoustic Drone Package Delivery Detection
In recent years, the illicit use of unmanned aerial vehicles (UAVs) for deliveries into restricted areas such as prisons has become a significant security challenge. While numerous studies have focused on UAV detection or localization, little attention has been given to identifying the delivery events themselves. This study presents the first acoustic package-delivery detection algorithm using a ground-based microphone array. The proposed method estimates both the drone’s propeller speed and the delivery event using acoustic features alone. A deep neural network detects the presence of a drone and estimates the propellers’ rotation speed, or blade passing frequency (BPF), from a mel spectrogram. The algorithm analyzes the BPFs to identify probable delivery moments based on sudden changes before and after a specific time. Results demonstrate a mean absolute error of 16 Hz for the BPF estimator when the drone is less than 150 meters from the microphone array. The drone presence detector achieves an accuracy of 97%. The delivery detection algorithm correctly identifies 96% of events with a false positive rate of 8%. This study shows that deliveries can be identified using acoustic signals at ranges up to 100 meters.
💡 Research Summary
The paper introduces a novel acoustic‑only approach for detecting illicit drone package deliveries in restricted zones such as prisons. While prior work has largely focused on detecting the presence of unmanned aerial vehicles (UAVs) or localising them using cameras, radar, RF antennas, or multimodal sensor suites, none have attempted to pinpoint the exact moment a payload is released. The authors address this gap by deploying a ground‑based 16‑channel microphone array to capture the characteristic blade‑passing frequency (BPF) generated by a drone’s propellers and by analysing abrupt changes in that frequency to infer delivery events.
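The summary describes the delivery detector only as flagging abrupt changes in the BPF track around a candidate time. A minimal sketch of that idea is shown below; the window length and jump threshold are illustrative assumptions, not values taken from the paper:

```python
def detect_delivery(bpf, win=10, threshold=30.0):
    """Flag frames where the mean BPF shifts abruptly.

    bpf       : list of per-frame BPF estimates in Hz
    win       : frames averaged before/after each candidate time (assumed value)
    threshold : minimum BPF jump in Hz to flag a delivery (assumed value)

    Returns the indices where |mean(after) - mean(before)| exceeds threshold.
    """
    events = []
    for t in range(win, len(bpf) - win):
        before = sum(bpf[t - win:t]) / win  # average BPF just before time t
        after = sum(bpf[t:t + win]) / win   # average BPF just after time t
        if abs(after - before) > threshold:
            events.append(t)
    return events

# Synthetic BPF track: steady flight at ~200 Hz, then a drop in rotor
# speed after the payload is released at frame 50.
track = [200.0] * 50 + [150.0] * 50
print(detect_delivery(track))
```

A real implementation would additionally gate this test on the drone-presence probability, so that BPF jumps are only evaluated while a drone is actually detected.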
The detection pipeline consists of two stages. In the first stage, a supervised multitask convolutional‑recurrent neural network (CRNN) processes three‑second audio segments (93 frames) represented as both mel‑spectrograms (128 mel bins) and power cepstrum features. Four convolutional blocks with varying kernel sizes (33, 21, 11, 3) extract time‑frequency patterns, followed by two parallel branches, each containing three stacked bidirectional GRU layers (128 hidden units) and fully‑connected layers. One branch regresses the BPF of the two fastest motors; the other outputs a binary probability of drone presence. The loss combines mean‑squared error for BPF regression and binary cross‑entropy for detection (α = β = 1). ReLU enforces positive BPF values, while a sigmoid constrains the detection probability to the interval (0, 1).
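The combined multitask objective described above (mean-squared error on the regressed BPFs plus binary cross-entropy on the presence probability, with α = β = 1) can be written out explicitly. The sketch below is a plain-Python illustration of that weighting, not the authors' code:

```python
import math

def multitask_loss(bpf_pred, bpf_true, p_pred, y_true, alpha=1.0, beta=1.0):
    """L = alpha * MSE(BPF) + beta * BCE(detection); the paper uses alpha = beta = 1."""
    # Mean-squared error over the regressed BPFs (Hz) of the two fastest motors
    mse = sum((p - t) ** 2 for p, t in zip(bpf_pred, bpf_true)) / len(bpf_true)
    # Binary cross-entropy on the sigmoid-constrained drone-presence probability
    eps = 1e-12  # numerical floor to keep log() finite at 0 and 1
    bce = -(y_true * math.log(p_pred + eps)
            + (1 - y_true) * math.log(1 - p_pred + eps))
    return alpha * mse + beta * bce

# Example: predicted BPFs of 210 and 195 Hz against true 200 Hz,
# with a presence probability of 0.9 for an actual drone (label 1)
print(multitask_loss([210.0, 195.0], [200.0, 200.0], 0.9, 1))
```

Because the MSE term is measured in Hz², it dominates the unitless cross-entropy term at α = β = 1; in practice the detection branch still trains because its gradients flow through a separate output head.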