Design of a Machine Vision-Based Intelligent Guide Robot Control System

2026-04-06 05:14:20

1. Introduction

Mobile robots are an important branch of robotics. With the rapid development of related technologies, they are evolving toward intelligence and diversification, and their applications now reach almost every field. Yu Chunhe's method of detecting road boundaries with lidar yields good results, but strong interference signals degrade its detection performance. Fu Mengyin et al. proposed a navigation method that uses skirting boards as reference targets, which improves the real-time performance of visual navigation.

This paper adopts a visual navigation approach: in a structured road environment, the robot tracks the road, stops at target points, and delivers tour-guide explanations, achieving good results.

2. Introduction to the Guide Robot

Guide robots are used in large exhibition halls, museums, and other convention centers to lead visitors along fixed routes, provide explanations, and hold simple conversations. A guide robot must therefore support autonomous navigation, path planning, intelligent obstacle avoidance, docking and positioning at target points, voice explanation, and simple conversation with visitors, and it must react quickly and adapt to the external environment. Hierarchically, the guide robot is divided into an artificial intelligence layer, a control and coordination layer, and a motion execution layer. The artificial intelligence layer uses a CCD camera for path planning and autonomous navigation; the control and coordination layer fuses multi-sensor information; and the motion execution layer drives the robot's movement. Figure 1 shows the overall structural block diagram of the intelligent guide robot.

3. Hardware Design of the Tour Guide Robot

3.1 Hardware Implementation of the Artificial Intelligence Layer

Given the mobile robot control system's requirements of high processing speed, easy expansion with peripheral devices, and small size and weight, a PC104 system was chosen as the host computer, with its software written in C. A USB camera collects visual information of the scene in front of the robot as the basis for visual navigation and path planning, and an external microphone and speaker provide guidance and explanation when the robot reaches a target point.

3.2 Hardware Implementation of the Control and Coordination Layer

The selection of robot sensors should depend on the robot's operational needs and application characteristics. Here, ultrasonic sensors, infrared sensors, electronic compasses, and gyroscopes are chosen to collect information about the robot's surrounding environment, aiding in obstacle avoidance and path planning. An ARM processing platform is used to drive the motors via an RS-485 bus, propelling the robot's movement.

The guide robot requires sensors with relatively high accuracy, good repeatability, strong anti-interference ability, and high stability and reliability. During movement the robot must accurately track its position: the digital compass reliably outputs the heading angle, and the gyroscope measures deviations so that the necessary corrections keep the direction of travel correct. A combination of ultrasonic and infrared sensors acquires information about obstacles ahead. The design uses six ultrasonic sensors and six infrared sensors: one ultrasonic sensor is mounted directly in front, one directly behind, and the remaining four at 45° on either side of the front and rear, with each infrared sensor mounted 1-2 cm directly above an ultrasonic sensor. The ultrasonic sensors handle obstacle avoidance through distance measurement, while the infrared sensors mainly cover the ultrasonic sensors' blind spots and detect obstacles at close range.
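As a rough illustration of how the two sensor types complement each other, the sketch below fuses one ultrasonic range reading with the near-range flag of the infrared sensor mounted above it, for each mounting direction. The thresholds and the treatment of a missing echo are assumptions for the demo, not values from this design.

```python
# Sketch (hypothetical thresholds) of fusing one ultrasonic range reading with
# the co-mounted infrared sensor's near-range flag, for a single direction.

SAFE_DISTANCE_CM = 40    # assumed stopping threshold for the ultrasonic range

def obstacle_detected(ultrasonic_cm, infrared_triggered):
    """Fuse the two co-mounted sensors for one direction.

    ultrasonic_cm      -- measured range in cm, or None if no echo returned
    infrared_triggered -- True when the infrared sensor sees a close object,
                          covering the ultrasonic sensor's near-field blind spot
    """
    if infrared_triggered:        # close-range hit the ultrasonic may miss
        return True
    if ultrasonic_cm is None:     # no echo: treat the direction as clear
        return False
    return ultrasonic_cm < SAFE_DISTANCE_CM

# One (ultrasonic, infrared) pair per mounting position described above.
readings = {"front": (120, False), "rear": (30, False), "front_left": (15, True)}
blocked = {name for name, (d, ir) in readings.items() if obstacle_detected(d, ir)}
```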

3.3 Hardware Implementation of the Motion Execution Layer

The actuator of the guide robot is a DC servo motor: a Sanyo Super-L motor (24 V/3.7 A) with a rated output power of 60 W and a maximum no-load speed of 3000 r/min, fitted with a 500-line optical encoder. The robot uses closed-loop speed control: the optical encoder measures the actual wheel speed and feeds it back to the microcontroller, and the driver adjusts the drive voltage according to the difference between the actual and the set speed using a control algorithm (such as PID), repeating this process until the set speed is reached. Speed regulation is handled by a FAULHABER MCDC2805 motion controller, which achieves speed synchronization with minimal torque fluctuation, and whose built-in PI regulator ensures accurate positioning. Combined with the Super-L motor and integrated encoder, a positioning accuracy of 0.18° can be achieved even at very low speeds.
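The closed speed loop described above can be sketched as follows. The PI control law stands in for the MCDC2805's built-in regulator; the gains and the first-order motor response (100 rpm per volt, sampled every 10 ms) are invented purely for the demonstration.

```python
# Sketch of the closed speed loop: encoder feedback is compared with the set
# speed and a PI term adjusts the drive voltage each sample. All numbers here
# are toy values, not parameters of the real motor or controller.

def run_speed_loop(target_rpm, steps=200, kp=0.04, ki=1.0, dt=0.01):
    speed = 0.0      # wheel speed reported by the optical encoder (rpm)
    integral = 0.0   # accumulated error for the integral term
    for _ in range(steps):
        err = target_rpm - speed            # set speed minus encoder feedback
        integral += err * dt
        voltage = kp * err + ki * integral  # PI control law -> drive voltage
        # toy first-order motor model: speed moves 20% of the way toward its
        # steady-state value of 100 rpm per applied volt each sample
        speed += 0.2 * (100.0 * voltage - speed)
    return speed
```

With these particular toy gains the loop settles on the set speed almost immediately; on real hardware the gains are tuned against the motor's measured response.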

3.4 Guide Robot Software Design

The robot acquires visual information ahead of it with the USB camera and processes the video with image processing algorithms, enabling path planning and autonomous navigation. Using the fused multi-sensor information from the lower layers, it avoids obstacles at close range and issues an alarm when an obstacle is encountered. Upon reaching a destination it provides voice narration and can then hold simple conversations with visitors.

4. Visual navigation

Visual navigation is a navigation method for mobile robots, and vision-based navigation is one of the main development directions of future mobile robot navigation research. The role of the vision subsystem in the overall system is to perform image understanding on the visual information of the surrounding environment acquired by the camera and to control the robot's movement based on the results. Image understanding means processing image data to understand the scene it depicts: which objects are present and where they are located. Images contain rich information, but only the useful part needs to be extracted; image understanding algorithms are therefore usually tailored to a specific purpose and have particular applicable conditions and limitations.

4.1 Image Preprocessing

The original image is a structured road image with blue markers inside a building, captured by a Logitech camera at 320 × 240 pixels. The original image is first converted to grayscale and binarized with a suitably chosen threshold. The useful image information is then extracted, and the direction of travel is obtained through morphological operations such as dilation and erosion, as shown in Figure 2.
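The preprocessing chain (binarization followed by dilation and erosion) can be sketched in pure Python on a toy grid; the real system processes 320 × 240 frames, and the global threshold value and 3 × 3 structuring element here are assumptions.

```python
# Pure-Python sketch of the preprocessing chain: global thresholding to a
# binary image, then dilation and erosion with a 3x3 structuring element.

def binarize(gray, threshold=128):
    # 1 where the gray level reaches the threshold, 0 elsewhere
    return [[1 if px >= threshold else 0 for px in row] for row in gray]

def _window(img, y, x):
    # the 3x3 neighborhood of (y, x), clipped at the image borders
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]

def dilate(img):
    # a pixel becomes 1 if any pixel in its 3x3 neighborhood is 1
    return [[1 if any(_window(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]

def erode(img):
    # a pixel stays 1 only if every pixel in its 3x3 neighborhood is 1
    return [[1 if all(_window(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]
```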

Figure 3 compares the detection performance of common edge operators. As Figure 3 shows, the Canny and Sobel operators perform best, and the Sobel operator additionally smooths noise and provides more accurate edge-direction information. The Sobel operator is therefore used for detection here, as shown in Figure 4.
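A minimal sketch of the Sobel operator named above: the two 3 × 3 kernels are applied at each interior pixel and the gradient magnitude is returned (border pixels are left at zero for simplicity).

```python
# Pure-Python Sobel sketch. GX responds to vertical edges (horizontal
# gradient), GY to horizontal edges (vertical gradient).

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5  # gradient magnitude
    return out
```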

As shown in Figure 4, the system detects the positions of the two edge lines with the Hough transform, measures their separation in pixels at both ends of the image, and then converts this to actual ground distance using the calibration to determine the robot's position.
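The line detection step can be illustrated with a minimal Hough transform: each edge pixel votes for all (ρ, θ) line parameters consistent with it, and the accumulator peak identifies the dominant line. The 1° angular resolution and integer ρ rounding are simplifications for the sketch.

```python
import math

# Minimal Hough-transform sketch: every edge pixel votes for the (rho, theta)
# of all lines through it, using rho = x*cos(theta) + y*sin(theta); the
# accumulator peak gives the dominant line's parameters.

def hough_peak(edges):
    votes = {}
    for y, row in enumerate(edges):
        for x, v in enumerate(row):
            if not v:
                continue
            for t in range(180):                      # theta in degrees
                theta = math.radians(t)
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return max(votes, key=votes.get)                  # (rho, theta_degrees)
```

Running this twice, once per road edge (masking out the first line's pixels before the second pass), would recover the two edge lines the text describes.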

4.2 Template Matching Algorithm

Template matching is an important research direction in image target recognition. Its algorithms are simple and computationally cheap, and they achieve high recognition rates, so it is widely used in target recognition. A small image serves as a reference: the template is compared against the source image to determine whether some region of the source image is identical or similar to it, and if such a region exists, its location can be determined and extracted. The sum of squared errors between the template and the corresponding region of the source image is often used as the similarity measure.

Let f(x, y) be the M × N source image and g(s, t) the S × T template image (S ≤ M, T ≤ N). The sum-of-squared-errors measure is then defined as

D(x, y) = Σs Σt [f(x+s, y+t) − g(s, t)]²
        = Σs Σt f(x+s, y+t)² − 2 Σs Σt f(x+s, y+t)·g(s, t) + Σs Σt g(s, t)²
        = A − 2B + C

where C, the template energy, is a constant. If A is also treated as constant, the cross-correlation term 2B alone can be used for matching: the template is considered to match where B(x, y) reaches its maximum. However, assuming A constant usually introduces errors, and in severe cases prevents correct matching. The normalized cross-correlation is therefore used in place of the sum-of-squared-errors measure:

R(x, y) = Σs Σt f(x+s, y+t)·g(s, t) / √( Σs Σt f(x+s, y+t)² · Σs Σt g(s, t)² )
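A brute-force sketch of the normalized cross-correlation measure R(x, y) defined above, scanning an S × T template over an M × N image and returning the offset with the highest score (indices follow f[row][col]):

```python
import math

# Brute-force normalized cross-correlation: R = B / sqrt(A * C), evaluated at
# every valid template offset; the maximum of R marks the matching position.

def ncc_match(f, g):
    M, N = len(f), len(f[0])
    S, T = len(g), len(g[0])
    g_energy = sum(v * v for row in g for v in row)       # the constant C
    best_score, best_pos = -1.0, None
    for x in range(M - S + 1):
        for y in range(N - T + 1):
            b = sum(f[x + s][y + t] * g[s][t]             # the term B
                    for s in range(S) for t in range(T))
            a = sum(f[x + s][y + t] ** 2                  # the term A
                    for s in range(S) for t in range(T))
            r = b / math.sqrt(a * g_energy) if a and g_energy else 0.0
            if r > best_score:
                best_score, best_pos = r, (x, y)
    return best_pos, best_score
```

By the Cauchy-Schwarz inequality R never exceeds 1, and it reaches 1 exactly where the image window is proportional to the template.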

4.3 Improved Template Matching Algorithm

The computational workload of direct template matching is, however, very large. Since correlation is a particular form of convolution, and given Matlab's powerful computing capabilities, an FFT-based method is adopted: the correlation is computed in the frequency domain and the result is recovered by an inverse transform. Rotating the localization template image by 180°, taking the Fourier transforms of the image and the rotated template, multiplying them pointwise, and applying the inverse FFT to return to the spatial domain is equivalent to the correlation operation. After the spatial-domain maximum is found, a suitable threshold based on that maximum determines the location of the target point. In the experiment, after a successful template match, the target and background are binarized and the target is marked with a red "+" symbol, with the data updated continuously. The stopping point is set at the desired pixel position (e.g., slightly below the center of the image), and the robot's position is adjusted automatically, as shown in Figure 5; there it can be seen that the robot needs to move to the right.
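The frequency-domain equivalence can be demonstrated in one dimension: circular cross-correlation of f with g equals the inverse DFT of DFT(f) · conj(DFT(g)), which for real signals is the same as transforming the 180°-rotated template. Matlab's fft2/ifft2 apply the identity in two dimensions far faster; a naive O(n²) DFT keeps this sketch self-contained.

```python
import cmath

# 1-D illustration of the frequency-domain shortcut:
#   correlation(f, g)[m] = IDFT( DFT(f) * conj(DFT(g)) )[m]
#                        = sum_k f[(k + m) mod n] * g[k]   (for real g)

def dft(x, inverse=False):
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def circular_correlation(f, g):
    F, G = dft(f), dft(g)
    prod = [a * b.conjugate() for a, b in zip(F, G)]      # pointwise product
    return [v.real for v in dft(prod, inverse=True)]      # imag parts ~ 1e-15
```

The index of the maximum of the returned sequence gives the template's offset, which is how the target point is localized after the inverse transform in the 2-D case.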

5. Experimental Results and Conclusions

Based on the above design, experiments on robot motion control and path planning were conducted. Image processing was simulated in Matlab with automatic threshold selection for segmentation, achieving good edge detection, although under some lighting conditions the threshold segmentation did not give ideal results. The robot's motion was controlled effectively in the VC++ environment, and template matching performed well. Future research will focus on image processing in the Visual C++ 6.0 environment, allowing still better control of the robot's motion. In summary, this design enables the robot to identify image information accurately in complex, changing environments, make correct decisions, and complete the required actions, thereby achieving the predetermined goals.
