
Interactive projection system based on fingertip touch

2026-04-06

Research on projector-camera systems has attracted considerable attention in the field of human-computer interaction. Because the camera and projector hardware is portable, this technology creates great opportunities for interactive projection on embedded mobile devices. This paper proposes an interactive system, based on a monocular camera and a projector, that enables freehand touch on any plane. The system comprises three parts: foreground region segmentation, fingertip localization, and touch detection. A saliency detection method that compares the reflectance of the projected image with that of the camera image, combined with mean-shift segmentation, is used to segment the foreground arm region. Edges of the foreground contour are detected, the curvature along the edge-point sequence is calculated, and the fingertip is located using a distance constraint between the curvature extrema and the centroid of the contour. An adaptive coding method based on spatial structured-light measurement is proposed: by embedding the adaptive code in consecutive projected frames, the spatial distance between the fingertip and the projection plane is determined, thus achieving touch detection.

System Framework and Algorithm Principles

This paper presents an interactive projection system consisting of a portable projector based on DLP (Digital Light Processing) technology and a single ordinary visible-light camera. The projector casts the background image onto any regular plane, such as a sheet of white paper, a wall, or a desktop. The camera captures the user's interactive gestures on the projection screen and transmits them to the computer over a data interface. The foreground arm region is segmented and extracted, the fingertip position is detected, and finally an adaptive structured-light coding method detects the fingertip touching the projection plane; the computer then executes the corresponding control, realizing a touch-based projection interaction mode.

The algorithm framework of the fingertip-touch interactive projection system is shown in Figure 1. Fingertip-touch detection using computer vision can be divided into three main parts: foreground arm region detection, fingertip localization, and touch detection.

Figure 1 System Algorithm Flowchart

To detect the location of a fingertip touch, the system must extract and segment the foreground arm region against the projected background. In interactive projection systems, the diversity of projected background images makes it difficult for traditional image segmentation methods to extract the foreground arm region accurately. This paper instead detects the salient features of the foreground region from the difference between the reflectance of the arm's skin and the reflectance of the object surfaces on the projection screen. Let the influence of ambient light be Q, the surface reflectance of the projected object be A, the pixel-level color conversion function of the camera be C, and the brightness value of the visual feedback image be E; E is then determined by A, Q, and C, and because the skin's reflectance differs markedly from that of the projection surface, the arm region produces a salient response.
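As a concrete illustration, the reflectance comparison can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's exact formulation: it assumes grayscale camera and projected images that are geometrically aligned, and that the projection surface dominates the scene, so its reflectance can be approximated by the median per-pixel ratio.

```python
import numpy as np

def reflectance_saliency(camera_img, projected_img, eps=1e-3):
    """Estimate per-pixel reflectance as the ratio of observed camera
    brightness to projected brightness, then score saliency as the
    deviation from the median reflectance of the scene. The projection
    surface dominates the image, so the arm, whose skin reflectance
    differs from the surface, stands out. Both inputs are float
    grayscale arrays aligned to the same view."""
    reflectance = camera_img / (projected_img + eps)
    surface_r = np.median(reflectance)          # dominant surface reflectance
    saliency = np.abs(reflectance - surface_r)  # arm pixels deviate most
    # normalize to [0, 1] for later thresholding / fusion
    return saliency / (saliency.max() + eps)
```

In practice the ambient term Q and the camera response C would also have to be calibrated; the ratio above simply cancels the projected content to first order.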

After mean-shift segmentation, the fine details and textures of both foreground and background in the camera image are smoothed away, and the pixels are grouped into M distinct regions. The algorithm then extracts the foreground arm region by fusing the saliency map with the mean-shift-segmented, detail-smoothed image.

(a) Projected background image; (b) Image captured by camera; (c) Saliency detection result based on reflectance contrast; (d) Mean-shifted image segmentation result; (e) Fusion result of saliency map and mean-shifted segmentation; (f) Final foreground segmentation result.

Figure 2 Foreground extraction based on saliency detection
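The fusion step can be sketched as follows, assuming the mean-shift stage has already produced one integer region label per pixel; the function name and threshold are illustrative:

```python
import numpy as np

def fuse_saliency_regions(saliency, labels, thresh=0.5):
    """Fuse a saliency map with a region-label image (e.g. the output of
    mean-shift segmentation, one integer label per pixel). A region is
    kept as foreground when its mean saliency exceeds `thresh`, which
    suppresses isolated salient pixels and recovers whole arm regions."""
    mask = np.zeros(labels.shape, dtype=bool)
    for lab in np.unique(labels):
        region = labels == lab
        if saliency[region].mean() > thresh:
            mask[region] = True
    return mask
```

Deciding per region rather than per pixel is what lets the smoothed mean-shift segmentation clean up the noisy saliency map, as in panels (c)-(f) of Figure 2.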

Fingertip localization and touch detection

Interactive projection systems based on fingertip touch must detect touches performed by users on the projection plane, and accurate fingertip localization is the basis for determining the touch position. As shown in Figure 3, fingertip localization in this paper is based on the curvature-extremum method. First, edge contour points of the segmented arm region are detected with the Canny operator. Second, the curvature of every point in the edge sequence is calculated, and candidate fingertip positions are obtained as the curvature maxima. Third, the high-curvature points on the palm contour include not only fingertips but also the peaks and valleys in the gaps between fingers; these are excluded using the distance between each candidate point and the centroid of the palm-region contour. Finally, candidate points close to one another are grouped together as belonging to the same finger, while distant points fall into different groups; a palm region can be divided into five groups in total, and the mean of the points in each group is returned as the final fingertip point.
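The curvature steps above can be sketched with the k-cosine approximation of curvature. The contour tracing itself (e.g. Canny followed by border following) is assumed to have produced an ordered N×2 point array; `k` and the angle threshold are illustrative tuning parameters, not values from the paper:

```python
import numpy as np

def fingertip_candidates(contour, k=5, angle_thresh=60.0):
    """Locate fingertip candidates on a closed contour (N x 2 array of
    ordered edge points). Curvature is approximated by the k-cosine
    method: the angle at point i between the vectors to points i-k and
    i+k. Sharp points (small angle) that also lie farther from the
    contour centroid than average are kept, which rejects the concave
    valleys between fingers described in the text."""
    n = len(contour)
    centroid = contour.mean(axis=0)
    dists = np.linalg.norm(contour - centroid, axis=1)
    cands = []
    for i in range(n):
        a = contour[(i - k) % n] - contour[i]
        b = contour[(i + k) % n] - contour[i]
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
        angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
        # sharp corner AND farther from the centroid than average
        if angle < angle_thresh and dists[i] > dists.mean():
            cands.append(i)
    return cands
```

The final grouping step would then merge candidate indices that are close along the contour and average each group into one fingertip point per finger.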

Touch detection is implemented with adaptive structured-light coding. After the user's fingertip position has been detected, the system projects a structured-light pattern onto a small area near the fingertip. The color channel of this pattern adapts to the pixel values at the corresponding positions of the projected background image. The adaptive code is embedded in two consecutive projected frames: the color-coded pattern adds a fixed code value to the background pixel brightness in one frame and subtracts it in the other, which reduces the pattern's visual interference with the background image.
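The inter-frame embedding can be sketched as follows; the code value `delta` and the binary-mask representation of the pattern are assumptions for illustration:

```python
import numpy as np

def embed_code(background, code_mask, delta=8):
    """Embed a structured-light code into two consecutive projector
    frames. `code_mask` is 1 where a code primitive is drawn, 0
    elsewhere; the code value `delta` is added in the first frame and
    subtracted in the second, so the two frames average back to the
    background and the pattern stays barely visible to the user."""
    bg = background.astype(np.int16)
    frame_a = np.clip(bg + delta * code_mask, 0, 255).astype(np.uint8)
    frame_b = np.clip(bg - delta * code_mask, 0, 255).astype(np.uint8)
    return frame_a, frame_b

def recover_code(frame_a, frame_b):
    """Recover the embedded pattern on the camera side by differencing
    the two frames; the static background cancels out."""
    return frame_a.astype(np.int16) - frame_b.astype(np.int16)
```

Because the projector alternates +delta and -delta at frame rate, human vision averages the two frames, while the camera recovers the code from their difference.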

The adaptive structured light coding used in this paper mainly consists of three different geometric patterns: bars, squares, and boxes. These three geometric patterns randomly generate color structured light codes in the four color channels: red, green, blue, and black. As shown in Figure 3, the codeword for each pixel in the image is a feature vector composed of the geometric patterns of the neighboring 3×3 sub-windows centered on that pixel.

Figure 3 Adaptive structured light coding pattern
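A minimal sketch of generating such a pattern, assuming each grid cell holds one of the 12 shape/color primitives; a full implementation would additionally enforce the uniqueness constraint on every 3×3 sub-window codeword:

```python
import random

SHAPES = ["bar", "square", "box"]
COLORS = ["red", "green", "blue", "black"]
# 3 shapes x 4 colors = 12 distinct primitives, indexed 0..11
PRIMITIVES = [(s, c) for s in SHAPES for c in COLORS]

def random_pattern(rows, cols, seed=None):
    """Generate a grid of random code primitives, one of the 12
    shape/color combinations per cell."""
    rng = random.Random(seed)
    return [[rng.randrange(len(PRIMITIVES)) for _ in range(cols)]
            for _ in range(rows)]
```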

Three geometric patterns and four colors yield 12 distinct feature primitives; the specific encoding of each primitive is shown in Table 1. The system defines the codeword of any point in the image as the 3×3 sub-window of neighboring patterns centered on that point; the codewords of different sub-windows are unique and differ significantly from those of neighboring windows. Figure 3 illustrates this codeword definition: each window's codeword is a feature vector composed of the nine primitive patterns read left to right and top to bottom within the sub-window. In general, the larger the Hamming distance between a sub-window's codeword and those of its neighbors, the lower the probability of error when decoding and recovering the 3D depth information of the object, and the stronger the noise resistance. The average Hamming distance between any two sub-window codewords in the proposed adaptive encoding reaches 8.2551, larger than that of pseudo-random, M-array spatial, and geometric-pattern encoding schemes; compared with current spatial structured-light encodings, it is therefore less susceptible to image noise and more stable.
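The codeword definition and its Hamming-distance comparison can be sketched directly, assuming the decoded pattern is available as a grid of primitive indices:

```python
def codeword(grid, r, c):
    """Codeword of cell (r, c): the 9 primitive indices of the 3x3
    sub-window centered on it, read left-to-right, top-to-bottom."""
    return [grid[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def hamming(cw1, cw2):
    """Number of positions where two codewords hold different
    primitives; larger distances between neighboring codewords make
    decoding more robust to image noise."""
    return sum(a != b for a, b in zip(cw1, cw2))
```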

During encoding, once the system detects the fingertip position in a given frame, adaptive structured-light coding is applied in the neighborhood window centered on the fingertip, embedding the code by offsetting the background pixel brightness in consecutive frames as described above. Decoding the pattern observed by the camera recovers the spatial distance between the fingertip and the projection plane; when this distance falls below a threshold, a touch is registered at the fingertip position.

Figure 4 Touch detection principle
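The final touch decision can be sketched as follows, under the assumption that structured-light decoding yields depth estimates for the window around the fingertip; the threshold is an illustrative tuning parameter, not a value from the paper:

```python
import numpy as np

def detect_touch(depth_window, plane_depth, thresh=5.0):
    """Decide touch from the depth values decoded in the small window
    around the fingertip: take the median for robustness to decoding
    noise and compare it with the known depth of the projection plane.
    `thresh` is in the same units as the depth values."""
    fingertip_depth = float(np.median(depth_window))
    return abs(fingertip_depth - plane_depth) < thresh
```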

