Near-Field Perception for Safety Enhancement of Autonomous Mobile Robots in Manufacturing Environments
Near-field perception is essential for the safe operation of autonomous mobile robots (AMRs) in manufacturing environments. Conventional ranging sensors such as light detection and ranging (LiDAR) and ultrasonic devices provide broad situational awareness but often fail to detect small objects near the robot base. To address this limitation, this paper presents a three-tier near-field perception framework. The first approach employs light-discontinuity detection, which projects a laser stripe across the near-field zone and identifies interruptions in the stripe to perform fast, binary cutoff sensing for obstacle presence. The second approach utilizes light-displacement measurement to estimate object height by analyzing the geometric displacement of a projected stripe in the camera image, which provides quantitative obstacle height information with minimal computational overhead. The third approach employs a computer vision-based object detection model on embedded AI hardware to classify objects, enabling semantic perception and context-aware safety decisions. All methods are implemented on a Raspberry Pi 5 system, achieving real-time performance at 25 or 50 frames per second. Experimental evaluation and comparative analysis demonstrate that the proposed hierarchy balances precision, computation, and cost, thereby providing a scalable perception solution for enabling safe operations of AMRs in manufacturing environments.
💡 Research Summary
The paper addresses a critical safety gap in modern manufacturing environments where autonomous mobile robots (AMRs) operate in close proximity to human workers and small, often overlooked obstacles near the robot base. While long‑range sensors such as LiDAR and ultrasonic arrays provide broad situational awareness, they lack the spatial resolution and low latency required to reliably detect objects within the first half‑meter around the chassis. To fill this gap, the authors propose a three‑tier near‑field perception framework that progressively enriches the robot’s understanding of its immediate surroundings, moving from simple binary presence detection to quantitative geometric estimation and finally to semantic object classification.
Tier 1 – Light‑cutoff detection: A 653 nm, 5 mW Class 3R laser diode projects a thin stripe onto the floor. A Raspberry Pi 5 equipped with a Pi Camera 3 captures the stripe at up to 50 fps. The image is converted to HSV, adaptively thresholded to isolate the laser color, and a Hough line transform extracts the stripe’s central trajectory. Perspective distortion caused by the oblique camera angle is corrected via a homography calibrated with a planar target. Spatial denoising (5 mm kernel) and temporal averaging over the current and two previous frames smooth the binary mask, after which a continuity check flags any gap larger than a safety‑defined 5 mm threshold. This pipeline yields a binary “obstacle‑present” signal with a processing delay of roughly 20 ms, enabling immediate emergency stops. Experimental validation with a variety of industrial objects (metal blocks, plastic debris, human fingers) achieved 100 % detection reliability.
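The continuity check at the end of the Tier-1 pipeline can be sketched as two small pure functions over the binary stripe mask. This is a minimal sketch, not the paper's implementation: the function names, the millimetres-per-pixel scale (obtained in practice from the homography calibration), and the majority-vote threshold are illustrative assumptions.

```python
import numpy as np

def stripe_gap_detected(mask: np.ndarray, mm_per_px: float,
                        gap_mm: float = 5.0) -> bool:
    """Return True if the 1-D stripe mask contains a gap wider than gap_mm.

    mask: binary array along the stripe axis (1 = laser pixel present).
    mm_per_px: metric scale from the homography calibration (assumed).
    """
    xs = np.flatnonzero(mask)
    if xs.size == 0:
        return True  # stripe fully occluded -> treat as obstacle present
    gaps_px = np.diff(xs) - 1  # run lengths of missing pixels
    return bool(np.any(gaps_px * mm_per_px > gap_mm))

def temporal_average(frames: list, thresh: float = 0.5) -> np.ndarray:
    """Majority vote over the current and previous masks to suppress
    single-frame flicker (the paper averages over three frames)."""
    return np.mean(np.stack(frames), axis=0) >= thresh
```

A stripe interrupted by an obstacle produces a run of zero pixels; once the run exceeds the 5 mm safety threshold the function returns True and the robot can trigger its emergency stop.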
Tier 2 – Light‑displacement height estimation: The same laser‑camera pair is used to estimate the height of the interrupting object. By measuring the vertical displacement d of the stripe in the image and knowing the mounting heights of the laser (h_light) and camera (h_cam), the object height h_obj is computed through a simple geometric relationship derived from the intersecting triangles formed by the laser, camera, and floor plane. Tests on objects ranging from 30 mm to 150 mm tall produced a root‑mean‑square error of 17.8 mm, sufficient to differentiate low‑profile debris (< 50 mm) from potentially hazardous items such as human limbs (> 100 mm).
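The similar-triangles relation can be made concrete under a simplified geometry. Assume the laser is mounted at height h_light and its stripe lands a horizontal distance x0 from the laser's base on a clear floor; an object of height h_obj intercepts the oblique beam early, shifting the stripe's floor position by Δx. Because the intercept point lies on the beam, h_obj / Δx = h_light / x0. The calibrated homography (which is where the camera height h_cam enters) converts the image displacement d into the metric shift Δx. This is an assumed single-triangle simplification, not the paper's exact derivation, and the numbers below are illustrative.

```python
def object_height_mm(delta_x_mm: float, h_light_mm: float,
                     x0_mm: float) -> float:
    """Height from stripe shift via similar triangles along the beam:
    h_obj / delta_x = h_light / x0  =>  h_obj = h_light * delta_x / x0.

    delta_x_mm: metric stripe shift on the floor plane (from the
                homography applied to the image displacement d).
    h_light_mm: mounting height of the laser diode.
    x0_mm:      horizontal distance from the laser base to the
                unobstructed stripe position.
    """
    return h_light_mm * delta_x_mm / x0_mm
```

With a 300 mm laser mount and the stripe landing 500 mm out, a 100 mm floor-plane shift would correspond to a 60 mm object, i.e. low-profile debris under the paper's 50–100 mm risk bands would sit near the decision boundary.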
Tier 3 – Embedded AI object detection: To add semantic awareness, a lightweight YOLOv5 model is deployed on an AI HAT accelerator attached to the same Raspberry Pi 5. Because publicly available near‑field datasets are scarce, the authors generated synthetic training data by rendering CAD models of the target objects under varied lighting and floor textures, then fine‑tuned the network on a modest set of real‑world captures. The model classifies seven predefined categories (human, tools, materials, parts, vehicles, infrastructure, safety items) with a mean average precision of 1.00 at an IoU threshold of 0.5. Inference runs at 25 fps (low‑power mode) or 50 fps (high‑performance mode) while consuming under 5 W, demonstrating feasibility for continuous deployment on a low‑cost platform.
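The mAP figure above counts a predicted box as a true positive when its intersection-over-union (IoU) with the ground-truth box is at least 0.5. A minimal IoU helper for axis-aligned boxes in (x1, y1, x2, y2) form illustrates the metric; this is the generic formulation, not the paper's evaluation code.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0
```

Two boxes of width 10 overlapping by half share an intersection of 50 against a union of 150, giving an IoU of 1/3, below the 0.5 matching threshold.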
Hardware integration: The system uses eight monitoring zones (A–H) around the robot, each covered by a camera‑laser pair, ensuring full 360° near‑field coverage with minimal blind spots. All components—Raspberry Pi 5, Pi Camera 3, laser diode, and AI‑HAT—cost roughly $150 in total, making the solution attractive for large‑scale factory roll‑outs.
Decision logic: A rule‑based action table maps detection outcomes to robot responses. Human presence triggers an immediate stop; small non‑hazardous objects (e.g., screws, paper) are ignored to preserve throughput; larger tools, materials, or mobile obstacles cause a stop‑and‑warn or reroute maneuver. The hierarchy allows the robot to act on the most informative tier available: a binary cutoff for rapid emergency, height data for nuanced risk assessment, and full object classification for context‑aware planning.
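The rule-based action table can be sketched as a lookup with a fail-safe default, acting on the most informative tier available. The human → stop, small-parts → ignore, and >100 mm height rules follow the text; the remaining category-to-action assignments are illustrative assumptions, not the paper's table.

```python
STOP, IGNORE, WARN = "stop", "ignore", "stop_and_warn_or_reroute"

# Hypothetical mapping over the paper's seven categories; only "human"
# and "parts" are stated in the text, the rest are illustrative.
ACTION_TABLE = {
    "human": STOP,
    "tools": WARN,
    "materials": WARN,
    "parts": IGNORE,        # e.g. screws: ignored to preserve throughput
    "vehicles": WARN,
    "infrastructure": IGNORE,
    "safety_items": WARN,
}

def decide(category=None, height_mm=None, stripe_cut=False):
    """Act on the most informative tier available (Tier 3 > Tier 2 > Tier 1)."""
    if category is not None:                     # Tier 3: semantic class
        return ACTION_TABLE.get(category, STOP)  # unknown class: fail safe
    if height_mm is not None:                    # Tier 2: height estimate
        return STOP if height_mm > 100 else IGNORE
    return STOP if stripe_cut else IGNORE        # Tier 1: binary cutoff
```

The fail-safe default means an unrecognized class degrades to a stop rather than being silently ignored, which matches the safety-first intent of the hierarchy.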
Evaluation and trade‑offs: The authors compare the three tiers in terms of cost, latency, and accuracy. Tier 1 offers the lowest cost and fastest response but no size or type information. Tier 2 adds quantitative height at modest computational overhead, requiring precise calibration. Tier 3 provides rich semantic data but incurs higher processing load and depends on a curated training set. Limitations include sensitivity of the laser stripe to floor color and ambient lighting, restricted field‑of‑view for each camera, and the need for periodic recalibration.
Future work suggestions include replacing the single laser with structured‑light patterns for full 3D reconstruction, expanding the camera array for omnidirectional coverage, and implementing online learning to adapt to new object classes without retraining from scratch.
In summary, the paper delivers a practical, low‑cost, and scalable near‑field perception architecture that enhances AMR safety in manufacturing settings. By demonstrating real‑time operation on inexpensive embedded hardware and providing thorough experimental validation, it offers a compelling blueprint for industry practitioners seeking to bridge the safety gap between long‑range perception and the nuanced demands of human‑robot collaboration at the robot’s feet.