Intelligent Indoor Mobile Robot Navigation Using Stereo Vision

Most existing robot navigation systems, which rely on laser range finders, sonar sensors, or artificial landmarks, can localize a robot in an unknown environment and then build a map of that environment. Stereo vision, although a rapidly developing technique in the field of autonomous mobile robots, is currently less popular, largely because of its high implementation cost. This paper describes an experimental approach to building a stereo vision system that helps a robot avoid obstacles and navigate through indoor environments while remaining highly cost effective. It discusses techniques for fusing stereo vision with ultrasonic sensors, which enable successful navigation through different types of complex environments. The ultrasonic sensor data allow the robot to create a two-dimensional topological map of an unknown environment, while the stereo vision system builds a three-dimensional model of the same environment.


💡 Research Summary

The paper presents a cost‑effective indoor mobile‑robot navigation solution that combines a stereo‑vision subsystem with ultrasonic range sensors. Recognizing that high‑precision laser scanners and sonar arrays, while accurate, are often prohibitively expensive for small autonomous platforms, the authors propose a hardware configuration built from two off‑the‑shelf webcams arranged with a fixed baseline and three low‑cost ultrasonic transducers positioned at the front, left, and right of the robot.

The system architecture is divided into four functional blocks. First, a stereo‑matching pipeline computes a disparity map using block matching refined by Semi‑Global Matching (SGM), yielding dense depth estimates at 640 × 480 resolution. Second, the disparity map is transformed into a 3‑D point cloud, from which a floor plane is extracted to support both visualization and navigation. Third, the ultrasonic measurements are fused with the depth data: in regions where the stereo confidence is low (e.g., texture‑less surfaces or strong illumination gradients), the ultrasonic distance is used as a corrective weight, effectively reducing depth noise and filling gaps. Fourth, the robot builds a 2‑D topological map by projecting the 3‑D points onto the floor plane, representing navigable corridors as nodes and edges. Path planning is performed on this graph using the A* algorithm, while the full 3‑D model supplies altitude information for handling stairs, low thresholds, or overhead obstacles.
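The conversion from disparity to metric depth in the second block follows the standard pinhole‑stereo relation Z = f·B/d (focal length f in pixels, baseline B in meters, disparity d in pixels). The paper does not list its calibration values, so the numbers below are illustrative; a minimal NumPy sketch:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, min_disp=1e-6):
    """Convert a disparity map (pixels) to metric depth via Z = f * B / d.

    Pixels with disparity at or below min_disp carry no depth information
    and are marked invalid (NaN).
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > min_disp
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Illustrative values: 700 px focal length, 12 cm baseline.
# A 42 px disparity then maps to 700 * 0.12 / 42 = 2.0 m.
depth = disparity_to_depth(np.array([[42.0, 0.0]]),
                           focal_px=700.0, baseline_m=0.12)
```

Zero‑disparity pixels (typical of the texture‑less regions mentioned above) come out as NaN, which is exactly where the ultrasonic correction applies.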
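The paper states that ultrasonic distances act as a corrective weight where stereo confidence is low, but does not give the exact weighting rule. One plausible reading is a per‑pixel linear blend driven by a confidence map in [0, 1] (a hypothetical scheme, not the authors' published formula):

```python
import numpy as np

def fuse_depth(stereo_depth, stereo_conf, ultrasonic_range):
    """Blend per-pixel stereo depth with a single ultrasonic range reading.

    stereo_conf lies in [0, 1]; where confidence is high the stereo depth
    dominates, and where it is low (texture-less surfaces, strong lighting
    gradients) the ultrasonic measurement fills the gap. Hypothetical
    weighting -- the paper does not specify its exact fusion rule.
    """
    stereo_depth = np.asarray(stereo_depth, dtype=np.float64)
    conf = np.clip(np.asarray(stereo_conf, dtype=np.float64), 0.0, 1.0)
    return conf * stereo_depth + (1.0 - conf) * ultrasonic_range

# Confidence 1.0 keeps the stereo estimate (2.0 m); confidence 0.0
# falls back entirely on the ultrasonic reading (3.0 m).
fused = fuse_depth([[2.0, 5.0]], [[1.0, 0.0]], ultrasonic_range=3.0)
```

A real implementation would restrict the blend to the ultrasonic transducer's field of view (the front, left, or right sector), since a single range reading cannot correct the whole frame.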
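A* over the node‑and‑edge corridor graph can be sketched with the standard library alone. The graph layout below is invented for illustration; straight‑line distance to the goal serves as the admissible heuristic:

```python
import heapq
import math

def a_star(nodes, edges, start, goal):
    """A* over a topological map.

    nodes: id -> (x, y) floor-plane position of the node.
    edges: id -> list of (neighbor_id, traversal_cost).
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    def h(n):  # admissible heuristic: straight-line distance to goal
        (x1, y1), (x2, y2) = nodes[n], nodes[goal]
        return math.hypot(x2 - x1, y2 - y1)

    open_set = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_set:
        _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path, g
        for nbr, cost in edges.get(node, []):
            ng = g + cost
            if ng < best_g.get(nbr, math.inf):
                best_g[nbr] = ng
                heapq.heappush(open_set, (ng + h(nbr), ng, nbr, path + [nbr]))
    return None, math.inf

# Tiny example graph: a two-hop corridor A-B-C plus a costlier direct edge.
nodes = {"A": (0, 0), "B": (1, 0), "C": (2, 0)}
edges = {"A": [("B", 1.0), ("C", 3.0)], "B": [("C", 1.0)]}
path, cost = a_star(nodes, edges, "A", "C")  # → (["A", "B", "C"], 2.0)
```

Because the heuristic never overestimates the remaining distance, the first time the goal is popped the returned path cost is optimal, which is what keeps the reported paths within a few percent of the shortest route.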

Experimental validation was carried out in three representative indoor settings: a long office corridor, a cluttered research laboratory, and a mixed‑structure test arena containing both static and semi‑dynamic obstacles. Performance metrics included obstacle‑avoidance success rate, path optimality (ratio of actual path length to theoretical shortest path), processing latency (frames per second), and total system cost. The fused system achieved an average avoidance success of 92 %, maintained a real‑time processing rate of roughly 15 fps, and produced paths within 5 % of the optimal length. The complete hardware cost was approximately 150 USD, representing a roughly 30 % reduction compared with comparable LiDAR‑based platforms.

The authors discuss several limitations. Stereo depth quality is highly dependent on camera resolution and lighting conditions; while the ultrasonic sensors mitigate some of these issues, they introduce their own constraints such as limited field‑of‑view and susceptibility to specular reflections. The current implementation also struggles with fast‑moving obstacles because the fusion pipeline operates on a per‑frame basis without explicit motion prediction.

In conclusion, the paper demonstrates that a modestly priced stereo‑vision system, when intelligently combined with ultrasonic range data, can deliver reliable indoor navigation and mapping capabilities comparable to more expensive sensor suites. Future work is outlined to incorporate deep‑learning based disparity estimation for improved performance under low‑light conditions, to explore multi‑element ultrasonic arrays for 360° coverage, and to integrate the entire stack into the Robot Operating System (ROS) for broader applicability across different robot platforms.