
What technologies are crucial for achieving autonomous driving?

2026-04-06 05:32:46 · · #1

Vehicle hardware

The first step in autonomous driving is to "see and touch" the environment and its own state. Vehicle hardware can be divided into three major perception subsystems: external environment perception, vehicle state perception, and in-vehicle driver monitoring.

External environment perception relies on various sensors, including cameras, millimeter-wave radar, lidar, and ultrasonic sensors. Cameras use optical lenses and image sensors to convert roads, traffic signs, pedestrians, and other scene elements into digital images, from which visual algorithms extract two-dimensional features. Millimeter-wave radar maintains stable distance and speed measurements even in adverse weather such as rain and fog. Lidar emits hundreds of thousands to millions of laser pulses per second to generate high-precision 3D point clouds that reconstruct the contours of surrounding objects. Ultrasonic sensors handle low-speed parking and close-range obstacle avoidance, detecting objects within a few meters. Because each sensor has its own strengths and limitations in physical principle and detection range, data fusion is essential to combine their complementary strengths into an all-weather, all-scenario environmental model.
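
The fusion idea can be sketched as simple late fusion: pair each camera detection (which gives a class label and bearing but poor range) with the radar track whose bearing is closest (which gives range and radial speed but no label). The object names, thresholds, and data layout below are illustrative assumptions, not any production scheme.

```python
# Minimal late-fusion sketch: associate camera detections (label + bearing)
# with radar tracks (bearing + range + radial speed) by nearest bearing.
# All names and thresholds here are illustrative assumptions.

def fuse(camera_dets, radar_tracks, max_bearing_diff=0.1):
    """camera_dets: list of (label, bearing_rad);
    radar_tracks: list of (bearing_rad, range_m, speed_mps)."""
    fused = []
    for label, cam_bearing in camera_dets:
        # Pick the radar track whose bearing is closest to this detection.
        best = min(radar_tracks, key=lambda t: abs(t[0] - cam_bearing),
                   default=None)
        if best is not None and abs(best[0] - cam_bearing) < max_bearing_diff:
            fused.append({"label": label, "range_m": best[1],
                          "speed_mps": best[2]})
        else:
            # No radar support: keep the camera label, range unknown.
            fused.append({"label": label, "range_m": None, "speed_mps": None})
    return fused

cams = [("pedestrian", 0.02), ("car", -0.30)]
radars = [(0.01, 35.0, -1.2), (-0.29, 80.0, 22.0)]
print(fuse(cams, radars))
```

Real stacks fuse at the feature or track level with uncertainty models; nearest-bearing matching only shows why complementary sensors make each object description more complete.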

Vehicle state perception utilizes components such as steering angle sensors, wheel speed sensors, and inertial measurement units (IMUs) to monitor parameters like lateral and longitudinal acceleration, angular velocity, and wheel speed at high frequency. The steering angle sensor tracks the steering wheel angle in real time, helping infer steering intent; wheel speed sensors derive vehicle speed and slip ratio from differences between wheels, feeding the anti-lock braking system (ABS) and traction control system (TCS); the IMU, combined with GNSS/RTK positioning, fuses high-frequency inertial data with low-frequency satellite fixes through extended Kalman filtering to achieve centimeter-level continuous positioning. Even where GNSS signals are weak, such as in tunnels, this keeps short-term attitude-estimation error under control.
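
The predict/update structure of that IMU-plus-GNSS fusion can be shown in one dimension: high-rate accelerometer integration propagates position, and each low-rate GNSS fix corrects the estimate via a Kalman update. The noise values, rates, and scenario below are invented purely for illustration.

```python
# Toy 1-D Kalman fusion: 100 Hz inertial prediction, 10 Hz GNSS correction.

def predict(x, v, P, accel, dt, q=0.05):
    """Propagate position x and velocity v using measured acceleration."""
    x = x + v * dt + 0.5 * accel * dt * dt
    v = v + accel * dt
    P = P + q          # process noise inflates position uncertainty
    return x, v, P

def update(x, P, z, r=4.0):
    """Correct predicted position x with a GNSS measurement z (variance r)."""
    K = P / (P + r)    # Kalman gain: trust split between prediction and fix
    x = x + K * (z - x)
    P = (1 - K) * P
    return x, P

x, v, P = 0.0, 10.0, 1.0               # start at rest position, 10 m/s
for step in range(100):                 # 1 s at 100 Hz, zero acceleration
    x, v, P = predict(x, v, P, accel=0.0, dt=0.01)
    if step % 10 == 9:                  # a GNSS fix every 10th IMU step
        truth = 10.0 * 0.01 * (step + 1)
        x, P = update(x, P, z=truth)
print(round(x, 2))                      # stays near the true 10.0 m after 1 s
```

A real EKF carries a full state vector (position, velocity, attitude, biases) and linearizes a nonlinear motion model; the gain computation and prediction/correction rhythm are the same.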

The in-vehicle driver monitoring system (DMS) provides safety assurance for Level 2/Level 3 assisted driving. A DMS typically consists of infrared cameras, regular cameras, and pressure/biosensors. Through algorithms such as facial and eye key point detection, head posture estimation, and blink frequency statistics, it assesses the driver's level of attention and fatigue. When it detects inattention or hands leaving the steering wheel, the system can promptly issue warnings or trigger takeover conditions to ensure safety in the "last mile" of human-vehicle collaboration.
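
One common fatigue cue of the kind described above is a PERCLOS-style metric: the fraction of recent frames in which the eyes are closed. The window size and 0.25 threshold below are common textbook values, not taken from any specific DMS product.

```python
from collections import deque

# Sketch of a PERCLOS-style fatigue check over a sliding window of frames.
# Window length and threshold are illustrative assumptions.

class FatigueMonitor:
    def __init__(self, window=30, threshold=0.25):
        self.frames = deque(maxlen=window)   # recent eye-closed flags
        self.threshold = threshold

    def observe(self, eye_closed: bool) -> bool:
        """Feed one frame's eye state; return True when a warning should fire."""
        self.frames.append(eye_closed)
        perclos = sum(self.frames) / len(self.frames)
        return perclos > self.threshold

m = FatigueMonitor()
alert = False
for closed in [False] * 20 + [True] * 10:    # eyes close for the last 10 frames
    alert = m.observe(closed)
print(alert)
```

Production DMS combines several such cues (gaze, head pose, blink duration) and debounces alerts; the sliding-window statistic is the core building block.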

In-vehicle computing platform

After multi-sensor fusion is completed at the sensing hardware level, massive amounts of data must be aggregated in real time to the onboard computing platform for rapid inference and decision-making. The domain controller is the central nervous system of this process. It not only handles sensor data access and processing but also drives high-frequency deep learning inference computation. At the same time, it interfaces with subsystems such as human-machine interaction, vehicle communication (T-Box), and drive-by-wire chassis, making it a key component for vehicle interconnectivity.

The core components of a domain controller include a high-performance computing SoC (system on chip), memory and storage subsystems, power management and thermal design, and multiple high-speed interfaces. The computing SoC typically integrates a CPU, GPU, and NPU (neural processing unit), and supports INT8/FP16 quantized inference to balance computing power and power consumption. As a rough guide, L2 assisted driving calls for 50–200 TOPS (tera operations per second) of dense computing power; L3 requires ≥200 TOPS; L4 requires ≥1000 TOPS; and L5 can exceed 2000 TOPS. That said, blindly pursuing peak computing power is less cost-effective than optimizing algorithms and computing-power utilization.
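
The INT8 quantization mentioned above can be illustrated in a framework-agnostic way: map float weights to 8-bit integers with a per-tensor scale, then dequantize at inference. This is a generic post-training-quantization sketch (assuming a nonzero tensor), not any particular toolkit's API.

```python
# Symmetric per-tensor INT8 quantization sketch: one scale maps floats
# into the signed range [-127, 127]; dequantization multiplies back.

def quantize_int8(weights):
    """Quantize a (nonzero) float list to int8 codes plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.5, -1.27, 0.031, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, round(err, 4))     # small reconstruction error, 4x smaller weights
```

The power/computing trade-off in the text comes from exactly this: 8-bit arithmetic quarters memory traffic and lets the NPU's integer units run at full rate, at the cost of a bounded rounding error.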

In terms of interface design, the domain controller must support a range of automotive network and sensor bus standards: GMSL or FPD-Link SerDes links (carrying MIPI CSI-2) for high-speed camera input, IEEE 802.3 100BASE-T1/1000BASE-T1 for automotive Ethernet, CAN FD for traditional control buses, and low-speed expansion interfaces such as SPI, UART, and I²C. It must also provide multiple LIN or FlexRay interfaces to ensure compatibility with various actuators and body electronic modules.

To ensure functional safety and real-time performance, the operating system on a domain controller is typically an RTOS or microkernel OS that meets ISO 26262 ASIL D safety requirements, such as Classic/Adaptive AUTOSAR, QNX, an automotive-grade embedded Linux, or a vendor-developed automotive OS (e.g., Huawei AOS/VOS). These systems guarantee millisecond-level response times for high-priority tasks such as collision warnings through kernel-level isolation, priority-preemptive scheduling, and time partitioning, while also supporting multitasking, data encryption, and secure OTA upgrades.

Edge inference and execution hardware

After vehicle-side inference completes, the system outputs two key results. The first is real-time trajectory generation, which drives steering, acceleration, deceleration, and suspension adjustment; the second is scene understanding and prediction, which feeds human-machine interaction displays and redundant safety checks. Decision commands are ultimately realized by the execution hardware: steer-by-wire (SBW), brake-by-wire (EBB/EHB), drive-by-wire, and intelligent suspension.

The steer-by-wire system replaces the traditional mechanical linkage, directly driving the steering assembly via an electric motor. This allows for programmable steering feedback and predictable response, while also adaptively adjusting the power assist ratio according to vehicle speed. The brake-by-wire system, centered on electronically controlled hydraulic or electric motor braking, enables independent braking of all four wheels, improving safety on slippery surfaces and in emergency stopping scenarios. The drive-by-wire system is designed for electric or hybrid platforms, achieving precise drive through electric motor torque distribution. The intelligent suspension can dynamically adjust damping and height, balancing ride comfort and handling stability.

The aforementioned actuators are all equipped with feedback sensors (such as steering angle encoders, brake pressure sensors, and suspension travel sensors) to form a closed-loop control system. They are combined with model predictive control (MPC) or active disturbance rejection control (ADRC) algorithms to accurately track the trajectory and correct deviations in real time to cope with road disturbances and dynamic load changes.

Cloud-based training and vehicle-cloud closed loop

Vehicle-side inference alone is insufficient to meet the demands of long-tail scenarios and rapidly evolving algorithms. Autonomous driving systems require large-scale training and validation on cloud-based supercomputing platforms. The cloud-based training process includes five stages: data acquisition, annotation and quality control, simulation synthesis, distributed training, and lightweight deployment.

Training data must meet three key requirements: scale (tens of millions of kilometers of real data and billions of simulation data), accuracy (multi-level annotation and consistency control), and diversity (covering different roads, climates, and traffic habits) to ensure the model's generalization ability in long-tail corner cases. Distributed supercomputing clusters continuously optimize decision-making strategies in complex scenarios through reinforcement learning and large model architectures, and use techniques such as pruning, quantization, and distillation to generate lightweight models adapted to onboard computing power.
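
Of the compression techniques just listed, magnitude pruning is the easiest to show: zero out the fraction of weights with the smallest absolute value. Real pipelines prune structured blocks and fine-tune afterwards; this toy version (with invented weights) only demonstrates the selection rule.

```python
# Magnitude-pruning sketch: zero the smallest-|w| fraction of weights.
# Ties at the cutoff may zero slightly more than the requested fraction.

def prune_by_magnitude(weights, sparsity=0.5):
    """Return a copy with the `sparsity` fraction of smallest weights zeroed."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    cutoff = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= cutoff else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
print(prune_by_magnitude(w, sparsity=0.5))   # the three smallest become 0.0
```

The point for onboard deployment is that zeroed weights need neither storage nor multiply-accumulate cycles, which is how a cloud-trained model is shrunk to fit the vehicle's computing budget.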

During the verification phase, relying on a multi-level simulation system and a real accident reproduction platform, the new model undergoes scenario playback testing and adversarial testing to ensure the robustness of the algorithm under extreme conditions. Mature models are deployed to vehicles via OTA technology, and real-world road condition data is continuously collected in "shadow mode" for subsequent iterations, forming a closed-loop, end-to-end vehicle-cloud collaboration that drives continuous improvement in system performance.

Intelligent decision-making and behavior planning

The perception module accurately recreates the environment and vehicle state, and the actuators stably control the vehicle's motion; the "decision-making and planning" layer in between is the true "brain" of autonomous driving, determining what the vehicle should do next and how to do it safely. To exhibit human-like driving behavior, the system must understand the scene, predict the behavior of other road users, and plan its own path. This process comprises three core levels: behavior planning, path planning, and trajectory control.

Behavior planning is responsible for outputting the most reasonable driving action in the high-level decision space, similar to choices like "I want to overtake" or "I want to slow down and enter the ramp." Common methods include finite-state machines, decision trees, and reinforcement learning policy networks. In practical implementations, some solutions incorporate intent recognition modules to predict whether the vehicle in front will change lanes or whether a pedestrian is about to cross, thereby adjusting the vehicle's behavior strategy in advance. More advanced systems employ Transformer-based behavior prediction models, using multimodal inputs such as trajectory history, lane information, and traffic signals to predict the next behavioral intent for multiple targets and calculate a score indicating the quality of the behavior.
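
A finite-state machine of the kind named above can be sketched in a few lines. The states, gap thresholds, and transition rules here are illustrative inventions, far simpler than any production behavior planner.

```python
# Toy finite-state machine for high-level behavior planning.
# States and thresholds are illustrative assumptions.

class BehaviorFSM:
    def __init__(self):
        self.state = "LANE_KEEP"

    def step(self, lead_gap_m, left_lane_free):
        """One planning tick: choose the next high-level behavior."""
        if self.state == "LANE_KEEP":
            if lead_gap_m < 30 and left_lane_free:
                self.state = "OVERTAKE"
            elif lead_gap_m < 30:
                self.state = "FOLLOW"        # too close, no free lane: yield
        elif self.state == "FOLLOW":
            if left_lane_free:
                self.state = "OVERTAKE"
            elif lead_gap_m >= 50:
                self.state = "LANE_KEEP"     # gap opened up again
        elif self.state == "OVERTAKE":
            self.state = "LANE_KEEP"         # assume the pass completes
        return self.state

fsm = BehaviorFSM()
print(fsm.step(lead_gap_m=25, left_lane_free=False))  # FOLLOW
print(fsm.step(lead_gap_m=25, left_lane_free=True))   # OVERTAKE
```

The appeal of the FSM formulation is auditability: every reachable behavior and transition can be enumerated and tested, which is why it persists alongside learned policies.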

Path planning, based on the strategy output, generates a smooth and feasible path under the constraints of roads, obstacles, and traffic rules. This level often employs methods such as hybrid A* algorithms, Bézier curves, splines, and rapidly-exploring random trees (RRT) to construct a reference path in a geographic or vehicle coordinate system. This path must satisfy vehicle dynamics constraints, such as maximum steering angle, minimum turning radius, and maximum lateral acceleration, while also considering ride comfort, such as avoiding sharp turns and harsh jolts over potholes that would be uncomfortable for the driver and passengers.
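
The Bézier-curve option above, together with the lateral-acceleration constraint, can be made concrete: evaluate a quadratic Bézier segment between waypoints and check that v²·κ stays under a comfort limit along it. The control points, speed, and 3 m/s² limit are illustrative assumptions.

```python
import math

# Quadratic Bezier path segment plus a lateral-acceleration feasibility check.
# Waypoints and limits are illustrative assumptions.

def bezier2(p0, p1, p2, t):
    """Point on a quadratic Bezier curve at parameter t in [0, 1]."""
    x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * p1[0] + t ** 2 * p2[0]
    y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * p1[1] + t ** 2 * p2[1]
    return x, y

def curvature2(p0, p1, p2, t):
    """Curvature at t: |x'y'' - y'x''| / |v|^3 for the quadratic Bezier."""
    dx = 2 * (1 - t) * (p1[0] - p0[0]) + 2 * t * (p2[0] - p1[0])
    dy = 2 * (1 - t) * (p1[1] - p0[1]) + 2 * t * (p2[1] - p1[1])
    ddx = 2 * (p2[0] - 2 * p1[0] + p0[0])
    ddy = 2 * (p2[1] - 2 * p1[1] + p0[1])
    return abs(dx * ddy - dy * ddx) / math.hypot(dx, dy) ** 3

p0, p1, p2 = (0.0, 0.0), (20.0, 0.0), (40.0, 4.0)   # gentle lane shift
v = 15.0                                            # planned speed, m/s
a_lat_max = 3.0                                     # assumed comfort limit
worst = max(curvature2(p0, p1, p2, t / 20) for t in range(21))
print(v * v * worst <= a_lat_max)                   # path feasible at v?
```

If the check fails, a planner either lowers the speed profile for the segment or respaces the control points to flatten the curvature, which is the coupling between path geometry and dynamics constraints described above.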

Trajectory control is responsible for converting the path into continuous control commands, such as desired speed, steering wheel angle, and braking intensity. Common implementation methods include PID control, feedforward control, and MPC (Model Predictive Control). Model predictive control is a more advanced method that transforms the control problem into a constrained optimization problem, predicting the vehicle response in the future time domain and iteratively optimizing to ultimately output a set of optimal control action sequences. This method balances dynamic constraints and response optimization, making it highly adaptable to urban driving conditions.
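
Of the methods just listed, PID is simple enough to show end to end: track a desired speed by commanding acceleration on a point-mass vehicle model. The gains, actuator limits, and anti-windup clamp are illustrative tuning assumptions, not values from any vehicle.

```python
# Minimal PID speed controller on a point-mass plant.
# Gains, limits, and the anti-windup clamp are illustrative assumptions.

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = None

    def step(self, target, measured, dt):
        err = target - measured
        self.integral += err * dt
        self.integral = max(-5.0, min(5.0, self.integral))  # crude anti-windup
        deriv = 0.0 if self.prev_err is None else (err - self.prev_err) / dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

pid = PID(kp=0.8, ki=0.1, kd=0.05)
speed, dt = 0.0, 0.1
for _ in range(200):                       # 20 s of simulated driving
    accel = pid.step(target=20.0, measured=speed, dt=dt)
    accel = max(-3.0, min(2.0, accel))     # comfort/actuator limits
    speed += accel * dt                    # point-mass vehicle model
print(round(speed, 1))                     # converges close to 20 m/s
```

MPC generalizes this: instead of reacting to the instantaneous error, it optimizes the whole control sequence over a prediction horizon subject to the same actuator limits, which is why it handles constraints and look-ahead more gracefully than PID.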

In complex traffic environments such as multi-vehicle merging sections and densely populated pedestrian crossings, single-vehicle intelligence is often insufficient. This is where the collaborative prediction module comes into play. Through a centralized algorithm, the system not only predicts its own vehicle's behavior but also collaboratively calculates the trajectories of other vehicles and non-motorized vehicles, constructing a multi-agent game-theoretic planning model to improve overall decision-making robustness.

Functional safety and cybersecurity

Autonomous driving systems differ from ordinary internet applications in that they involve personal safety and are typical high-risk systems in terms of functional safety and information security. Therefore, the system architecture must incorporate robust functional safety mechanisms from the initial design stage and build a strong cybersecurity defense to ensure that the vehicle remains under control even in the event of failure or attack.

At the functional safety level, autonomous driving systems must meet the safety level requirements of the ISO 26262 standard. Based on the degree of impact of potential failures on personal safety, functional units are classified into different ASIL (Automotive Safety Integrity Level) levels, with Level D being the highest, applicable to critical control components such as steering, braking, and acceleration. During the design phase, potential hazardous events in the system need to be identified through HARA (Hazard Analysis and Risk Assessment), and corresponding redundancy schemes and fault detection mechanisms need to be specified for each event.

In braking systems, for example, brake-by-wire controllers typically have dual redundant channels: a primary channel for normal operation and a secondary channel as an emergency backup. When the primary control circuit detects voltage instability or abnormal response, the secondary channel immediately takes over control to ensure uninterrupted braking. Furthermore, the system must be equipped with a fault injection mechanism to periodically simulate various hardware and software failures to test emergency response capabilities. This design principle is called "Fail Operational," meaning that even if part of the system fails, basic functions can still be guaranteed to remain normal.
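
The primary/secondary takeover logic above reduces to a simple pattern: attempt the primary channel, and on any detected fault, route the same command to the backup. The class names and fault model below are simplified stand-ins for real channel diagnostics.

```python
# Illustrative dual-channel brake failover; health checks are simplified.

class BrakeChannel:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def apply(self, pressure_bar):
        if not self.healthy:
            raise RuntimeError(f"{self.name} channel fault")
        return f"{self.name}: {pressure_bar} bar"

class RedundantBrakes:
    def __init__(self):
        self.primary = BrakeChannel("primary")
        self.secondary = BrakeChannel("secondary")

    def brake(self, pressure_bar):
        """Try the primary channel; fall back to the secondary on a fault."""
        try:
            return self.primary.apply(pressure_bar)
        except RuntimeError:
            return self.secondary.apply(pressure_bar)

brakes = RedundantBrakes()
brakes.primary.healthy = False     # inject a fault, as in periodic testing
print(brakes.brake(40))            # secondary takes over with the same command
```

Periodic fault injection exercises exactly this path: deliberately failing the primary channel in a test confirms the takeover works before it is ever needed on the road.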

Autonomous driving systems interact with external systems extensively through processes such as OTA upgrades, cloud model synchronization, and V2X communication. These interactions can all become entry points for attacks. To prevent malicious remote control or data breaches, the system must comply with the ISO/SAE 21434 cybersecurity standard and employ a multi-layered security architecture. First, an in-vehicle Ethernet firewall and intrusion detection system (IDS) isolate the network channel between external communication modules (such as T-Boxes) and the core controllers (such as domain controllers). Second, an end-to-end data encryption mechanism uses TLS/SSL or DTLS to encrypt communication channels and verifies data integrity and source trustworthiness through digital signatures and certificates. Third, secure boot and a trusted execution environment (TEE) for critical components verify the integrity of the software image each time the system boots, preventing firmware tampering or backdoor injection.
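
The integrity-and-authenticity idea in the second layer can be sketched with Python's standard library: tag a payload with a keyed hash and verify it before acting on it. Real OTA pipelines use asymmetric signatures and certificate chains rather than a shared HMAC key; this sketch only shows the verify-before-install principle.

```python
import hmac
import hashlib

# Integrity check sketch: HMAC-tag an OTA payload, verify before install.
# The key and payload are illustrative; real systems use PKI signatures.

def sign(payload: bytes, key: bytes) -> bytes:
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify(payload: bytes, tag: bytes, key: bytes) -> bool:
    # compare_digest avoids timing side channels during comparison
    return hmac.compare_digest(sign(payload, key), tag)

key = b"shared-secret"               # illustrative key material
fw = b"firmware-v2.3"
tag = sign(fw, key)
print(verify(fw, tag, key))          # True: payload untampered
print(verify(fw + b"!", tag, key))   # False: a single changed byte is caught
```

Secure boot applies the same check recursively: each boot stage verifies the next stage's image before handing over control, so a tampered firmware never executes.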

With the dual protection of functional safety and information security, autonomous driving systems can possess the technical characteristics of being "controllable, explainable, and recoverable," ensuring that an acceptable level of safety can be maintained even in the event of attacks, malfunctions, or extreme scenarios.

Vehicle-Road Cooperation and System Evolution

Autonomous driving is not an isolated capability; it is a crucial component of future intelligent transportation systems. To enhance the system's perception breadth and planning foresight, the field is evolving from "vehicle-centric intelligence" to "vehicle-to-everything (V2X) cooperation." V2X technology deeply connects road infrastructure and vehicles through low-latency, highly reliable wireless communication, enabling traffic information sharing and dynamic collaborative decision-making.

V2X mainly includes four scenarios: V2I (vehicle-to-infrastructure), V2V (vehicle-to-vehicle), V2N (vehicle-to-network/cloud), and V2P (vehicle-to-pedestrian). Taking V2I as an example: when the traffic light at the intersection ahead is about to change, the vehicle can learn the signal timing in advance over V2I and adjust its speed early, reducing unnecessary hard braking and stops and thereby improving traffic efficiency and passenger comfort. Similarly, when a traffic accident or temporary construction lies ahead, onboard sensing may miss it because the view is blocked, whereas a roadside unit (RSU) can detect it with its own sensors and broadcast a warning, allowing following vehicles to slow down and avoid it in time.
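
The V2I speed-adjustment example reduces to a small calculation: given the distance to the stop line and the time until the light turns green, suggest a speed that arrives on green without stopping. The speed limits below are illustrative assumptions (16.7 m/s is roughly 60 km/h).

```python
# Toy V2I green-wave speed advisory. Limits are illustrative assumptions.

def advisory_speed(dist_m, t_to_green_s, v_max=16.7, v_min=5.0):
    """Speed (m/s) that reaches the stop line as the light turns green."""
    if t_to_green_s <= 0:
        return v_max                  # already green: proceed at the limit
    v = dist_m / t_to_green_s
    return max(v_min, min(v_max, v))  # clamp to a legal/comfortable range

print(round(advisory_speed(dist_m=200, t_to_green_s=15), 1))  # 13.3 m/s
```

When the required speed falls below the clamp range, the vehicle simply stops and waits; the advisory only removes the avoidable brake-and-restart cycles that V2I timing information makes predictable.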

The underlying V2X communication currently widely adopts C-V2X (Cellular Vehicle-to-Everything) technology based on cellular networks, which includes two modes: PC5 direct communication (no base station required) and Uu network communication (via base station). The former suits low-latency scenarios such as emergency avoidance and intersection coordination, while the latter suits large-volume non-urgent information exchange, such as map updates and road congestion broadcasts. Compared with 4G, 5G-based V2X significantly improves data rates and latency, reducing communication latency to 1–5 milliseconds and meeting the real-time coordination requirements of high-speed scenarios.

In the future, vehicle-road cooperation will combine urban digital infrastructure with edge computing platforms to form a three-layer collaborative architecture of "vehicle-road-cloud". The urban traffic management center will realize advanced functions such as real-time congestion prediction, green wave control, and emergency response route scheduling by aggregating various data sources; while edge computing nodes will handle tasks such as intersection signal timing, vehicle trajectory fusion, and conflict point identification at close range to achieve local autonomous optimization.

With the development of digital twin systems for transportation, cities will possess a virtual transportation world that is updated in real time. The trajectory of every vehicle, the timing of every traffic light cycle, and even every sudden change in traffic flow will be modeled and simulated in real time in virtual space. This system will not only serve autonomous driving but will also provide strong support for urban traffic planning, emergency management, and accident tracing.

Final words

Autonomous driving is not an advancement of a single technology, but a comprehensive breakthrough in a systems engineering project. From multimodal perception to intelligent decision-making, from safe execution to cloud training, from functional safety to cybersecurity, and then to the deep integration of vehicle-road cooperation, the autonomous driving system is a complex but orderly technological network. Each layer of architecture and each module must be highly coordinated to ultimately support a vehicle to achieve "safe and autonomous" driving in the real world.

In the future, with the application of large-scale algorithm models, the growth of chip computing power, and the acceleration of vehicle-road integration, autonomous driving will move from the assistance stage to truly unmanned operation. However, its ultimate large-scale deployment depends not only on the availability of advanced technology, but also on the stability, controllability, and collaborative capabilities of the system. Only by establishing a stable, universal, and sustainably optimizable autonomous driving technology architecture can we truly move from "smart cars" to "smart transportation" and usher in a new era of more efficient, safe, and green travel.
