Analysis of core technologies and modules of service robots

The service robot industry chain is characterized by the intertwining of products and technologies.

The upstream of the service robot industry chain consists of component manufacturers, including makers of chips, LiDAR, and servo motors. These manufacturers are typically technology-driven, and if the midstream and downstream sectors boom rapidly, their production capacity may become a limiting factor. The midstream has two segments. The technology segment consists of voice and image providers; it is relatively independent, and its core competitiveness lies in data and algorithms. The product segment covers everything from design and manufacturing to marketing; its core barriers are brand, channels, and production capacity, and a product company that builds an ecosystem around an operating system gains a further important barrier. Both midstream segments reach the various downstream consumption and distribution scenarios through virtual and physical channels. Each scenario iterates and scales up products according to how easily it can be industrialized, which makes the downstream a powerful engine for the upstream and midstream.

Robotic vacuum cleaners: the most mature and easily industrialized application scenario

2.1 Service robot application scenarios are divided into three levels.

When strong consumer demand drives industry transformation, it will inevitably lead to rapid technological advancements and continuous cost reductions. Service robots, possessing the attributes of consumer goods, are ultimately demand-driven and will be mass-produced starting from specific scenarios.

The demand for service robots hinges on two factors: how essential the need is and how frequently the robot is used. Together, these two factors determine which application scenarios mature first: the stronger the need and the higher the frequency, the easier it is for a service robot to reach mass production. Currently, the most mature industrializations are robotic vacuum cleaners and customer service robots, driven by high-frequency demand. Companion robots and early education robots are developing rapidly, driven by essential demand. Other robots, with weaker and less frequent demand, are at a relatively early stage of industrialization.

Based on market penetration, the demand for service robots can be categorized into three types: upgrading existing needs, fulfilling existing needs, and exploring unknown needs. Upgrading existing needs refers to services that already exist in the market, and includes early education robots and robotic vacuum cleaners. Early education robots add human-computer interaction to the learning machine, while robotic vacuum cleaners add autonomous path-planning algorithms to the traditional vacuum cleaner. Fulfilling existing needs means adopting service robots because their procurement cost is lower than the cost of labor; this includes intelligent customer service and companion robots. Exploring unknown needs covers scenarios where demand is not yet strong, such as butler robots.

2.2 Robotic Vacuum Cleaners: Leveraging mature application scenarios, they will be the first to achieve industrialization.

The core of existing demand scenarios is that service robots can find corresponding products in reality, such as learning machines corresponding to early education robots, and vacuum cleaners corresponding to robotic vacuum cleaners. These types of products are the easiest to scale up and industrialize among all product types because they have a large user base, clear needs, and low user education costs.

Robotic vacuum cleaners answer the demand for intelligent cleaning that replaces human labor and frees up hands, and demand for them is growing explosively worldwide. Compared with manual sweeping and conventional vacuum cleaners, the intelligence of a robotic vacuum cleaner depends on how advanced its artificial intelligence technology is, which shows in three aspects. First, coverage: efficient algorithms leave fewer missed spots. Second, autonomous cleaning capability: fewer auxiliary cleaning buttons and no need for manual assistance. Third, environmental adaptation and judgment: flexible responses to furniture and the surrounding environment.

The core of China's domestic robotic vacuum cleaner industry chain lies in the midstream robot manufacturers. In the current structure of the chain, upstream components mainly consist of electromechanical equipment, batteries, motherboard chips, and other parts. Midstream robot manufacturers fall into two types: those with their own independent brands and those that provide ODM and OEM manufacturing for foreign brands. Downstream distribution relies mainly on offline physical stores and online channels, with online channels currently on track to surpass offline ones. The core of the industry chain remains in the hands of the midstream robot manufacturers, especially those with their own brands, who enjoy stronger bargaining power.

Currently, domestic robotic vacuum cleaner companies can be roughly divided into three tiers. According to data from CMM: (1) Ecovacs forms the first tier in the domestic market thanks to its first-mover advantage. It leads other companies in product R&D, product range, production capacity, and innovation capability, with a market share of about 50%. (2) The second tier mainly consists of the international giant iRobot and Xiaomi's robotic vacuum cleaners (Roborock Technology), which have risen gradually in recent years. Their combined market share in China is about 20%-25%. (3) The third tier includes some domestic brands and some traditional home appliance companies, such as Proscenic, Fmart, Philips, Haier, and Midea. The traditional home appliance companies moved into this segment relatively late, but they have channel advantages, and their room for future development should not be ignored.

Analysis of core technologies and modules of service robots

3.1 Three core technology modules of intelligent robots: perception + interaction + motion control

Service robots comprise three core technology modules: human-computer interaction and recognition, environmental perception, and motion control. The interaction module includes speech recognition and image recognition and is analogous to the human brain. The perception module uses various sensors, gyroscopes, and LiDAR and is analogous to the eyes, ears, nose, and skin. The motion control module includes servos, motors, and chips. Alongside these three modules, a robot needs basic hardware (battery modules, power modules, and a main unit) and an operating system such as ROS or Linux. Integrating the basic hardware, the operating system, and the control components yields a complete robot with a certain level of walking and interaction capability.

The three main modules of service robots can be further subdivided into a voice module, semantic module, image module, perception module, motion control module, and chip module. Among these sub-modules, the voice module is the most important and the most mature, the semantic module is the current focus of breakthroughs, and the motion control module is relatively the least important.

From a technological standpoint, artificial intelligence is the core. Currently, only the fields of voice recognition and OCR possess a certain level of maturity. These fields have been developing for nearly 20 years and have already established a data foundation in certain specific scenarios and industries. Other technologies, including image recognition and semantic analysis, are still in their very early stages. The voice recognition field is also currently the largest known segment for platform companies.

3.2 Sensing Module: LiDAR is the core, and multi-sensor fusion is essential.

Multi-sensor fusion ensures safety, and LiDAR is the key technical challenge. The functional redundancy among sensors such as LiDAR, millimeter-wave radar, and 3D cameras guarantees the safety and normal operation of service robots. Among these, LiDAR is an indispensable core component. A LiDAR emits n laser beams and measures the distance between itself and surrounding objects using either triangulation (the low-cost solution) or Time-of-Flight (TOF, the high-cost solution). The result is highly accurate distance information in the form of point cloud data.
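The two ranging principles can be illustrated with a minimal sketch. It is not taken from any particular LiDAR product: the function names, the parameter names, and the simplified similar-triangles model of triangulation (distance = focal length x baseline / spot offset on the sensor) are all assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_distance(round_trip_s: float) -> float:
    """Time-of-Flight: the pulse travels out and back, so halve the path."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0


def triangulation_distance(baseline_m: float, focal_length_m: float,
                           spot_offset_m: float) -> float:
    """Simplified laser triangulation (assumed pinhole geometry).

    baseline_m:     distance between the laser emitter and the lens
    focal_length_m: focal length of the receiving lens
    spot_offset_m:  where the reflected spot lands on the image sensor
    """
    if spot_offset_m <= 0:
        raise ValueError("spot offset must be positive")
    return focal_length_m * baseline_m / spot_offset_m


def scan_to_points(ranges_m, angle_min_rad, angle_step_rad):
    """Turn one sweep of range readings into 2D points (a tiny point cloud)."""
    points = []
    for i, r in enumerate(ranges_m):
        theta = angle_min_rad + i * angle_step_rad
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# A 20 ns round trip corresponds to roughly 3 m.
print(round(tof_distance(20e-9), 3))
```

In the triangulation model the spot offset shrinks as the object moves away, so resolution degrades with range. That is one reason triangulation suits short-range, low-cost devices such as robotic vacuum cleaners.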

For service robots to provide precise services, in addition to accurate positioning, they also need to identify the environment by combining positioning information. This requires the use of SLAM technology, and LiDAR is an important entry point for SLAM.

SLAM (Simultaneous Localization and Mapping) means localizing and building a map at the same time, in real time. A robot creates a map of a completely unknown environment and uses that map for autonomous localization and navigation. The SLAM problem can be described as follows: a robot starts moving from an unknown location in an unknown environment, localizes itself during movement based on position estimates and sensor data, and simultaneously builds the map incrementally. Autonomous localization and navigation of this kind require three main technologies: real-time localization, mapping, and path planning, with path planning being the most crucial.
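As a rough illustration of the path-planning step only, the sketch below searches an occupancy grid that is assumed to have been built already. It uses plain breadth-first search because it is the simplest method that returns a shortest route on a grid; production planners typically use more refined algorithms such as A*. The grid encoding (1 for obstacle, 0 for free) and the function name are assumptions for this example.

```python
from collections import deque


def plan_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] == 1 marks an obstacle and 0 marks free space.
    Returns the list of (row, col) cells from start to goal,
    or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] == 1 or grid[goal[0]][goal[1]] == 1:
        return None
    came_from = {start: None}  # cell -> predecessor on the shortest route
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk predecessors back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# A wall across the middle row forces a detour through the right-hand gap.
room = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(plan_path(room, (0, 0), (2, 0)))
```

In a real robot the grid would be updated continuously from sensor data and the route replanned as new obstacles appear.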

Take robotic vacuum cleaners as an example. Internationally, in a standard 80-square-meter area, a robot without a navigation module typically takes over 40 minutes to achieve an 80% cleaning rate, whereas a robot with a SLAM module achieves 95% coverage in just 10 minutes. In other service scenarios as well, moving efficiently to the destination is essential, which makes SLAM a technical solution that is hard to avoid. A simple comparison of the two approaches:

Visual positioning technology: The positioning range is 0.1-2 meters, it cannot obtain a map, requires additional sensors to avoid obstacles, and requires a suitable light source to adapt to the environment; its stability is relatively poor.

LiDAR + SLAM technology: Positioning accuracy can be controlled within 0.01-0.1 meters, and accurate maps can be obtained; it supports autonomous obstacle avoidance and does not generate cumulative errors.

3.3 Interaction Module: Voice recognition has reached the commercial threshold, but semantic understanding still needs time.

The processing path of the interaction module is as follows. The interactive interface receives information from the external input system, usually a voice recording. After the speech is decoded, the result is passed to a predefined knowledge base for semantic matching and logical processing. Finally, the reply is output as text or, after speech synthesis, as speech, depending on external requirements. The most important parts of the whole module are speech recognition and semantic analysis. Depending on the mode, the interaction can take several forms: text in and speech out, speech in and speech out, speech in and text out, and so on. Speech in and speech out is the most difficult.
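The flow described above can be sketched as a small pipeline. Everything here is hypothetical: the knowledge base entries, the function names, and the reduction of "semantic matching" to a keyword lookup are illustrative assumptions, and the speech recognizer and synthesizer are passed in as plain callables rather than tied to any real speech library.

```python
import re

# Hypothetical knowledge base: required keywords -> canned reply.
KNOWLEDGE_BASE = {
    ("start", "cleaning"): "Starting the cleaning cycle.",
    ("battery",): "Checking the battery level.",
}
FALLBACK_REPLY = "Sorry, I did not understand that."


def match_intent(text):
    """Semantic matching reduced to keyword lookup (an assumption).

    Returns the reply of the first entry whose keywords all occur in the text.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    for keywords, reply in KNOWLEDGE_BASE.items():
        if all(keyword in words for keyword in keywords):
            return reply
    return FALLBACK_REPLY


def respond(audio, recognize, synthesize=None):
    """Run one interaction turn: decode speech, match, then answer.

    recognize:  callable turning audio into text (speech recognition stage)
    synthesize: optional callable turning text into audio; when omitted,
                the reply is returned as text
    """
    text = recognize(audio)
    reply = match_intent(text)
    return reply if synthesize is None else synthesize(reply)


# Stand-in recognizer for the demo: pretend the audio bytes are a transcript.
def fake_recognize(audio):
    return audio.decode("utf-8")


print(respond(b"Please start cleaning the kitchen.", fake_recognize))
```

Keeping recognition and synthesis as interchangeable stages mirrors the modes listed above: omitting the synthesizer gives speech in and text out, while supplying one gives speech in and speech out.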

Semantic understanding still requires time. Natural language analysis techniques can be broadly divided into three levels: lexical analysis, syntactic analysis, and semantic analysis. Lexical and syntactic analysis have been largely solved, while semantic analysis is currently only being processed at a superficial level.

Challenges in natural language processing: word sense disambiguation is the bottleneck, and it is more difficult for Chinese than for English. Natural language processing faces several challenges: word segmentation (both Chinese and English processing begin with a preliminary step that decomposes the input string into lexical units), part-of-speech tagging, grammar theory, and word sense disambiguation. Chinese semantics is widely acknowledged to be difficult to understand, for several reasons: (1) Chinese has many ambiguities, complex grammatical structures, and a large gap between the communication context and the expressed meaning. (2) In semantic analysis, almost all algorithms in use, such as deep learning and other advanced neural networks, are heuristic in nature. Heuristic algorithms cannot exhaust all situations and can only reach local solutions, so there will always be flaws. (3) Building a semantic knowledge base is a long-term process, and the development of semantic analysis is still in its early stages.
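To make the segmentation and ambiguity problem concrete, the sketch below uses forward maximum matching, a classic dictionary-based baseline chosen here purely for illustration and not a method described in this article. The tiny dictionaries and the function name are assumptions.

```python
def forward_max_match(text, dictionary, max_word_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest dictionary word that matches;
    fall back to a single character when nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        longest = min(max_word_len, len(text) - i)
        for length in range(longest, 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
    return tokens


# Sentence: "study the origin of life" (研究 study, 生命 life, 起源 origin).
sentence = "研究生命起源"

# With only the intended words, the segmentation is correct.
print(forward_max_match(sentence, {"研究", "生命", "起源"}))
# → ['研究', '生命', '起源']

# Adding 研究生 ("graduate student") makes the greedy match go wrong.
print(forward_max_match(sentence, {"研究", "研究生", "生命", "起源"}))
# → ['研究生', '命', '起源']
```

The second result shows why disambiguation is the bottleneck: a richer dictionary alone can make segmentation worse, and choosing the right split requires understanding the context.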

3.4 Motion Control Module: Coexistence of Gait and Non-gait Dynamics

Motion control is the least important of the three modules because service robots are different from industrial robots of the past. They have lower requirements for precise control, and the core is still algorithms and interactive experience.

Gaited walking emphasizes precise control, while non-gaited walking mainly serves simple movement. In terms of appearance, service robots move in two ways: gaited and non-gaited. Gaited robots are driven hydraulically or by motors; typical examples are Nao, Asimo, and Atlas, and UBTECH is a prominent domestic company. Non-gaited robots are primarily motor-driven and have a simpler structure, mainly casters mounted on the bottom of the robot for simple movement. Typical examples include Pepper and Canbot.
