A Multi-Robot Platform for Robotic Triage Combining Onboard Sensing and Foundation Models

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

This report presents a heterogeneous robotic system designed for remote primary triage in mass-casualty incidents (MCIs). The system employs a coordinated air-ground team of unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) to locate victims, assess their injuries, and prioritize medical assistance without risking the lives of first responders. The UAV identifies casualties and provides overhead views of them, while UGVs equipped with specialized sensors measure vital signs and detect and localize physical injuries. Unlike previous work that focused on exploration or limited medical evaluation, this system addresses the complete triage process: victim localization, vital-sign measurement, injury severity classification, mental status assessment, and data consolidation for first responders. Developed as part of the DARPA Triage Challenge, this approach demonstrates how multi-robot systems can augment human capabilities in disaster response scenarios to maximize lives saved.


💡 Research Summary

The paper presents a heterogeneous air‑ground robotic system designed for fully remote primary triage in mass‑casualty incidents (MCIs), developed as part of the DARPA Triage Challenge. The system consists of a custom‑built Falcon 4 quadcopter UAV and multiple Clearpath Jackal UGVs, each equipped with high‑performance onboard computers (Jetson Orin NX on the UAV, AMD Ryzen 7 8700G + RTX 4000 GPU on the UGVs) and a rich sensor suite that includes RGB and long‑wave infrared (LWIR) cameras, event cameras, millimeter‑wave radars, GPS/IMU units, speakers, and microphones.

The UAV performs continuous aerial surveillance, detecting victims both in daylight (RGB) and at night (LWIR) using two fine‑tuned YOLOv8 models. Detected bounding‑box centers are back‑projected into world coordinates using the UAV’s GPS, IMU, and calibrated camera extrinsics, yielding 3‑D victim locations. Temporal clustering creates persistent casualty IDs and refines positions as the UAV revisits the same area.
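The back-projection step can be sketched under a standard pinhole camera model: the detection's pixel center defines a ray, which is rotated into the world frame and intersected with a flat ground plane. The intrinsics (fx, fy, cx, cy), the camera-to-world rotation, and the flat-ground assumption below are illustrative; the paper's exact geometry pipeline is not specified here.

```python
import math

def pixel_to_world(u, v, fx, fy, cx, cy, R_wc, t_w, ground_z=0.0):
    """Back-project a pixel onto the ground plane z = ground_z.

    R_wc: 3x3 rotation (camera frame -> world frame), rows as tuples,
          typically built from the UAV's IMU attitude and camera extrinsics.
    t_w:  camera position in world coordinates (x, y, z), e.g. from GPS.
    Returns the 3-D intersection point, or None if the ray misses the ground.
    """
    # Ray direction in the camera frame (z points forward through the lens)
    d_c = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the ray into the world frame
    d_w = tuple(sum(R_wc[i][j] * d_c[j] for j in range(3)) for i in range(3))
    if abs(d_w[2]) < 1e-9:
        return None  # ray parallel to the ground plane
    s = (ground_z - t_w[2]) / d_w[2]
    if s <= 0:
        return None  # intersection behind the camera
    return (t_w[0] + s * d_w[0], t_w[1] + s * d_w[1], ground_z)
```

For a nadir-pointing camera at 10 m altitude, the image center back-projects to the point directly below the UAV; repeated fixes for the same casualty can then be averaged by the temporal clustering stage.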

UGVs receive the aerial victim map via a distributed database (MOCHA) and navigate to each casualty. On‑site, they acquire high‑resolution RGB and HDR‑fused images, and collect time‑series data from radar and event cameras for vital‑sign extraction. Heart‑rate and respiration are estimated from 10‑15 s windows using either neural networks or classic signal‑processing pipelines (FFT, band‑pass filtering). The radar provides non‑contact respiration detection, while the event camera captures subtle skin motion for pulse detection.
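The classic signal-processing path can be sketched as a band-limited spectral peak search over one such window: remove the DC component, evaluate DFT power only in the physiological band, and report the dominant frequency in beats (or breaths) per minute. The band limits and function below are illustrative assumptions, not the authors' implementation.

```python
import math

def dominant_rate_bpm(signal, fs, f_min=0.8, f_max=3.0):
    """Estimate the dominant periodic rate (per minute) in a sensor window.

    signal: list of samples (e.g. radar displacement or event-camera motion)
    fs:     sampling rate in Hz
    f_min/f_max: physiological band; 0.8-3.0 Hz (~48-180 bpm) suits heart
                 rate, while ~0.1-0.5 Hz would suit respiration.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC component
    best_f, best_p = None, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        if not (f_min <= f <= f_max):
            continue  # acts as a crude band-pass filter
        # Single-bin DFT power at frequency bin k
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        p = re * re + im * im
        if p > best_p:
            best_f, best_p = f, p
    return None if best_f is None else best_f * 60.0
```

With a 12 s window at 20 Hz (240 samples), the frequency resolution is 1/12 Hz, i.e. 5 bpm, which is one reason 10-15 s windows are a practical compromise between latency and precision.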

Injury severity and mental‑status assessment are performed with foundation models. The team fine‑tuned vision‑language models (VLMs) on multi‑view images and audio recordings, enabling both categorical injury labels (e.g., bleeding, suspected fracture) and natural‑language descriptions that aid first responders. A DINO‑based image classifier further refines visual injury detection.
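One simple way to turn per-view model predictions into a single casualty-level label is a majority vote with a severity-ordered tie-break. The label set and severity ordering below are hypothetical examples, not the paper's taxonomy:

```python
from collections import Counter

# Hypothetical label ordering from least to most severe
SEVERITY = ["none", "abrasion", "suspected fracture", "bleeding"]

def fuse_views(view_labels):
    """Fuse per-view injury labels into one casualty-level label by
    majority vote; ties are broken toward the more severe finding."""
    counts = Counter(view_labels)
    top = max(counts.values())
    tied = [lbl for lbl, c in counts.items() if c == top]
    # Conservative tie-break: prefer the more severe label
    return max(tied, key=SEVERITY.index)
```

Breaking ties toward severity is a deliberately conservative choice for triage: over-reporting an injury costs responder attention, while under-reporting can cost a life.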

Software is organized into Docker containers orchestrated by ROS 2. Each algorithm runs in an isolated container that can be toggled via environment variables, allowing rapid prototyping and easy replication across robots. An orchestrator aggregates algorithm outputs into a “scorecard” for each victim. MOCHA ensures opportunistic, reliable data transmission: results are stored locally and synchronized when mesh radios (Rajant DX4) achieve sufficient link quality, allowing the system to tolerate intermittent connectivity.
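The toggle-and-aggregate pattern could look roughly like the sketch below: each algorithm is enabled or disabled by an environment variable, and the orchestrator merges whatever ran into a per-victim scorecard. The `ENABLE_*` flag convention and algorithm names are assumptions for illustration, not the system's actual interface.

```python
import os

def run_triage_pipeline(algorithms, sensor_data):
    """Run only the algorithms enabled via environment variables
    (e.g. ENABLE_HEART_RATE=1) and merge their outputs into a scorecard.

    algorithms: dict mapping an algorithm name to a callable that takes
                the sensor data and returns a result dict.
    """
    scorecard = {}
    for name, algo in algorithms.items():
        flag = "ENABLE_" + name.upper()
        if os.environ.get(flag, "1") != "1":
            continue  # this algorithm/container is toggled off
        scorecard[name] = algo(sensor_data)
    return scorecard
```

In the real system each "algorithm" would be a ROS 2 node in its own container and the scorecard would be written to MOCHA for opportunistic synchronization; here the same control flow is collapsed into plain function calls.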

Three user interfaces support operators and responders: (1) an Android Team Awareness Kit (TAK) plugin displays robot positions, casualty locations, and triage severity on a map; (2) a browser‑based FastAPI/WebRTC dashboard streams low‑latency video, offers camera control, and shows status panels; (3) a base‑station monitoring GUI visualizes all robots on satellite imagery, tracks sensor health, and displays DARPA scoring server status.

Field trials at the Guardian Centers in Perry, GA (March and October 2025) demonstrated high performance: UAV detection AP of ~0.92, UGV vital‑sign accuracy >95 %, injury classification F1‑score ~0.88, and robust operation in low‑light conditions thanks to HDR processing and LWIR imaging. The system successfully triaged dozens of simulated victims, providing both quantitative vital‑sign data and qualitative injury narratives.

The authors discuss remaining challenges, including computational load of radar/event‑camera processing, communication latency between UAV and UGV, power management for prolonged missions, and scaling to larger robot fleets. Future work will explore energy‑aware sensor scheduling, autonomous recharging stations, and expanded multi‑UAV/UGV coordination.

In summary, the paper delivers a complete, modular, and experimentally validated pipeline that integrates aerial scouting, ground‑level sensing, and state‑of‑the‑art foundation models to achieve fully autonomous primary triage. By combining robust hardware, flexible software, and resilient data handling, the system demonstrates how multi‑robot platforms can augment human responders, reduce exposure risk, and potentially save more lives in real‑world disaster scenarios.

