Xiaomi End-to-End Functionality Overview
"End-to-end" refers to a system in which the entire chain, from capturing information about the surrounding environment to outputting vehicle control commands, is handled by a single integrated software and hardware platform. Because the pipeline is no longer split into separate intermediate modules, less latency and less error accumulate as information is passed along. The architecture is more efficient to implement and also adapts better to complex scenarios, so the system stays robust and safe as road conditions change.
In February of this year, Xiaomi rolled out its end-to-end, full-scenario intelligent driving function to all users, giving drivers an "instant access, parking space to parking space" intelligent driving experience. Unlike traditional modular architectures, Xiaomi uses an end-to-end deep neural network in which a unified model handles everything from sensor signal input to decision-making and planning output. Because the architecture is data-driven and the model generalizes well, the system can cope with complex and constantly changing real-world road conditions.
Simply get in the car, turn on navigation, shift into Drive, and press the Smart Drive button. The vehicle then handles a range of driving scenarios on its own, including pulling out from the roadside, passing through ETC lanes or barrier gates, and navigating roundabouts. Whether on city roads, expressways, highways, closed campuses, or parking lots, Xiaomi's end-to-end intelligent driving system aims to provide a stable and smooth ride. The user experience is built around the idea of "driving more like a human": the system is meant to behave like an experienced driver, anticipating road conditions and planning the best route to keep the drive safe and comfortable.
Because it responds quickly to obstacles and complex situations, Xiaomi's end-to-end intelligent driving system can promptly decide to steer around slow-moving vehicles, pedestrians, and roadside obstacles while otherwise keeping the vehicle safely centered in its lane, which significantly improves traffic efficiency. The end-to-end experience greatly reduces the driver's workload and makes driving simpler and more convenient, and it is becoming an important trend in future intelligent mobility.
Hardware infrastructure determines the lower limit of intelligent driving capabilities
As the cornerstone of an intelligent driving system, the hardware platform plays a crucial role in overall performance. Efficient operation depends on accurate perception of the physical world, and that perception rests on a solid hardware foundation. Xiaomi Auto designed for this perception requirement from the outset. Taking the Xiaomi SU7 as an example, all models come with 11 high-definition cameras as standard, providing 360-degree perception of the environment around the vehicle. The Xiaomi SU7 Pro, Max, and Ultra versions further strengthen forward perception by adding a forward-facing lidar, which helps capture environmental information accurately in complex scenarios.
This multi-sensor fusion solution builds a "massive data foundation" for the vehicle, enabling the system to acquire richer and more accurate real-time scene information. Whether it's dynamic road conditions at high speeds or minor obstacles in enclosed areas like parking lots, all can be effectively monitored through the collaborative perception of high-definition cameras, millimeter-wave radar, and lidar. The high-precision data acquisition from multiple sensors also provides rich material for subsequent physical world modeling. By acquiring images, distance, and motion information around the vehicle in real time, the system can form a holistic understanding of complex traffic environments, thus laying a solid foundation for subsequent decision-making and planning.
Data processing determines the upper limit of intelligent driving capabilities
To truly enable intelligent driving systems to "understand" the world around them, simply relying on hardware to collect data is far from sufficient. More crucial is how to integrate, interpret, and ultimately transform this fragmented information into actionable driving decisions, which places high demands on data processing capabilities. Xiaomi has built a complete physical world modeling system in this process, primarily divided into three layers: the data observation layer, the implicit feature layer, and the explicit symbol layer.
Xiaomi's "Three-Layer Modeling" Architecture for the Physical World
The data observation layer is essentially the system's "eyes." Similar to the perception stage in traditional architectures, it focuses on observing road conditions and collecting traffic information. Xiaomi HAD (Hyper Autonomous Driving) simultaneously collects data from 11 high-definition cameras, millimeter-wave radar, and lidar, capturing real-time images and lidar point clouds across the full 360° around the vehicle, together with the route information supplied by the navigation function. This layer integrates scattered sensor signals into a data stream suitable for subsequent processing, providing the system with firsthand, real-world, multi-dimensional information about the state of the scene.
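As a rough illustration of what "integrating scattered sensor signals into a data stream" can mean, the sketch below bundles the sample from each sensor that is closest to a reference time into one frame. It is a minimal sketch under stated assumptions, not Xiaomi's implementation: the names SensorSample, nearest_sample, and build_frame, the sensor identifiers, and the 50 ms tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SensorSample:
    sensor_id: str    # hypothetical identifier, e.g. "cam_front" or "lidar_front"
    timestamp: float  # capture time in seconds
    payload: object   # image, point cloud, radar returns, or route information


def nearest_sample(samples: List[SensorSample], t: float,
                   tolerance: float) -> Optional[SensorSample]:
    """Return the sample closest in time to t, or None if none is within tolerance."""
    if not samples:
        return None
    best = min(samples, key=lambda s: abs(s.timestamp - t))
    return best if abs(best.timestamp - t) <= tolerance else None


def build_frame(streams: Dict[str, List[SensorSample]], t: float,
                tolerance: float = 0.05) -> Dict[str, Optional[SensorSample]]:
    """Bundle one sample per sensor around reference time t (assumed 50 ms window)."""
    return {sensor_id: nearest_sample(samples, t, tolerance)
            for sensor_id, samples in streams.items()}
```

A sensor with no sample inside the tolerance window maps to None, so downstream processing can tell a missing reading apart from a stale one.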
The implicit feature layer acts like the system's "thinking brain." Similar to the decision layer in traditional architectures, this layer processes and analyzes the raw data obtained from the data observation layer, extracting features and patterns hidden behind the vast amounts of data using deep neural networks. While these implicit features are not easily understood directly by humans, they contain sophisticated information for discriminating between surrounding vehicles, pedestrians, and other targets. Leveraging powerful model reasoning capabilities, the system can even recover information from partially obscured areas, providing a more comprehensive basis for decision-making.
The explicit symbol layer acts as a "translator," presenting complex data in a human-understandable format. It transforms the fuzzy information extracted by the implicit feature layer into symbols or labels that humans can directly comprehend, enabling intuitive judgment of the model's output. In this way, the system can not only evaluate the safety, comfort, and efficiency of each possible driving trajectory from multiple dimensions, but also continuously optimize the model, making the final decision more accurate and reliable. This end-to-end modeling approach simplifies the data interaction process between modules in traditional systems and significantly improves reaction speed and decision accuracy in dynamically changing environments.
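Evaluating candidate trajectories for safety, comfort, and efficiency can be pictured as a weighted cost comparison. The sketch below is only an illustration of that idea: the Candidate fields, the cost terms, and the weights are assumptions made for this example, and the source does not describe how Xiaomi actually scores trajectories.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    min_clearance_m: float  # smallest distance to any obstacle along the path
    peak_accel_mps2: float  # largest absolute acceleration along the path
    travel_time_s: float    # time needed to cover the planning horizon


def cost(c: Candidate, w_safety: float = 10.0, w_comfort: float = 1.0,
         w_efficiency: float = 0.5) -> float:
    """Lower is better. The terms and weights are illustrative assumptions."""
    safety = 1.0 / max(c.min_clearance_m, 0.1)  # small clearance is penalized heavily
    comfort = c.peak_accel_mps2                 # harsh acceleration is uncomfortable
    efficiency = c.travel_time_s                # slower progress costs more
    return w_safety * safety + w_comfort * comfort + w_efficiency * efficiency


def pick_best(candidates: List[Candidate]) -> Candidate:
    """Choose the candidate trajectory with the lowest combined cost."""
    return min(candidates, key=cost)
```

With these assumed weights, a detour that keeps a reasonable clearance can beat both slowly following a lead vehicle and squeezing past an obstacle at close range, which matches the trade-off the paragraph describes.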
Furthermore, to account for how the physical world changes over time, Xiaomi has also experimented with modeling the three layers jointly along the time dimension. In the cloud, data from future frames serves as a self-supervised signal that continuously feeds into training; on the vehicle, the continuously optimized model can adapt quickly to unexpected situations. This co-evolutionary design lets Xiaomi's intelligent driving system progressively improve the accuracy of its judgments and the speed of its responses in complex scenarios through continuous learning.
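Using future frames as a self-supervised signal means the recorded sequence supplies its own labels: what actually happened a few steps later becomes the training target for what the model saw earlier. The sketch below shows one simple way to build such pairs from a recorded sequence. The function name future_frame_pairs and the parameters history and horizon are hypothetical, and Xiaomi's actual training pipeline is not described in the source.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def future_frame_pairs(frames: Sequence[Frame], history: int,
                       horizon: int) -> List[Tuple[List[Frame], Frame]]:
    """Pair each window of `history` frames with the frame `horizon` steps after it."""
    if history < 1 or horizon < 1:
        raise ValueError("history and horizon must be positive")
    pairs = []
    last_start = len(frames) - history - horizon
    for start in range(last_start + 1):
        window = list(frames[start:start + history])     # model input
        target = frames[start + history + horizon - 1]   # label taken from the future
        pairs.append((window, target))
    return pairs
```

No manual annotation is involved: a sequence shorter than history + horizon simply yields no pairs, and a longer one yields one pair per sliding window.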
As early as March of last year, the Xiaomi SU7's intelligent driving system already offered advanced functions such as highway navigation, active safety, valet parking, and assisted parking. Subsequent OTA upgrades progressively unlocked urban scenarios, expanding coverage from a handful of cities to the entire Chinese mainland and culminating in the full rollout of HAD end-to-end intelligent driving. Earlier, at its automotive technology launch event on December 28, 2023, Xiaomi had unveiled a series of core technologies for the first time, including zoom-enabled BEV technology, super-resolution OCC (occupancy network) technology, and an integrated perception and decision-making model. These technologies reflect the depth of Xiaomi's intelligent driving R&D and its potential in autonomous driving. Through continued improvement in physical world modeling and in-depth engineering optimization, Xiaomi's intelligent driving system is making a leap from "high-precision map + modular architecture" to "mapless + modular architecture," and then to "end-to-end architecture."