CES 2026 made one thing clear: autonomous driving has reached a genuine inflection point, moving from experimental technology toward commercial scale. Robotaxi companies like Waymo and Zoox announced expansions into new cities. Uber confirmed plans to launch its own autonomous ride service. NVIDIA unveiled its Alpamayo physical AI platform, designed specifically to accelerate autonomous driving development. And Aurora, which launched the world’s first commercial driverless trucking service on the Dallas-Houston corridor in May 2025, revealed plans to scale dramatically.
But how does it actually work? What is the AI in an autonomous vehicle actually doing, and why is it so much harder than it looks?
The Four Layers Every Autonomous Vehicle Runs
Every self-driving system — whether it’s Waymo’s robotaxi, Tesla’s FSD, or Wayve’s foundation-model-based approach operating in London, Germany, and Japan — performs the same four fundamental tasks continuously while the vehicle is moving.
Perceive. The vehicle builds a real-time picture of the world around it. This means fusing data from multiple sensors simultaneously: cameras reading lane markings, traffic lights, and road signs; LiDAR firing laser pulses to build precise 3D maps of everything within 200 metres; radar tracking the speed and distance of objects in all conditions, including heavy rain and fog. NVIDIA’s DRIVE AGX platform, rated at up to 2,000 trillion operations per second (TOPS), runs this sensor fusion to maintain a coherent, up-to-date model of the environment at all times.
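One simple way to picture the fusion step is inverse-variance weighting, where each sensor's distance estimate counts in proportion to how much it is trusted. The sketch below is only an illustration under assumed noise figures: the RangeReading type, the fuse_ranges function, and the variance values are all hypothetical, and production stacks use far richer methods (Kalman filters, learned fusion) than this.

```python
from dataclasses import dataclass


@dataclass
class RangeReading:
    sensor: str
    distance_m: float
    variance: float  # assumed measurement noise variance, in m^2 (hypothetical)


def fuse_ranges(readings):
    """Combine distance estimates with inverse-variance weighting.

    Returns (fused_distance_m, fused_variance).
    """
    if not readings:
        raise ValueError("need at least one reading")
    weights = [1.0 / r.variance for r in readings]
    total = sum(weights)
    fused = sum(w * r.distance_m for w, r in zip(weights, readings)) / total
    return fused, 1.0 / total


# Illustrative values only: LiDAR is assumed to be the more precise sensor here.
lidar = RangeReading("lidar", distance_m=42.0, variance=0.01)
radar = RangeReading("radar", distance_m=43.0, variance=0.25)

fused_distance, fused_variance = fuse_ranges([lidar, radar])
print(f"fused distance: {fused_distance:.2f} m, variance: {fused_variance:.4f}")
```

With these assumed numbers the fused estimate stays close to the LiDAR reading, because its variance is much smaller, and the fused variance ends up lower than either sensor's on its own.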
For object detection, convolutional neural networks like YOLOv8 have proven particularly effective in real-world autonomous driving deployments in 2025-2026 because of their combination of accuracy and inference speed. Detecting that something is there isn’t enough: the system has to classify it as a pedestrian, not a cyclist or a bin, within milliseconds.
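Detectors in this family typically emit many overlapping candidate boxes per object, and a common post-processing step is non-maximum suppression (NMS), which keeps the highest-scoring box per object and class. The following is a minimal pure-Python sketch of class-wise NMS. The box coordinates, scores, and function names are made up for illustration, and it does not use any particular library's API.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the best-scoring box among heavily overlapping boxes of one class.

    detections: list of (box, score, label) tuples.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        box, _, label = det
        overlaps_kept = any(
            label == k[2] and iou(box, k[0]) >= iou_threshold for k in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept


# Hypothetical raw detections: two boxes on the same pedestrian, one cyclist.
raw = [
    ((10.0, 10.0, 50.0, 100.0), 0.90, "pedestrian"),
    ((12.0, 12.0, 52.0, 102.0), 0.60, "pedestrian"),
    ((200.0, 20.0, 260.0, 110.0), 0.80, "cyclist"),
]
final = non_max_suppression(raw)
for box, score, label in final:
    print(label, score, box)
```

The duplicate pedestrian box is dropped because it overlaps the stronger one, while the cyclist survives as a separate class and location.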
Predict. Knowing what’s around the vehicle is only step one. The AI then has to anticipate where everything is going. A car in the adjacent lane that’s drifting slightly. A cyclist whose wheel wobble suggests an imminent turn. A pedestrian at the kerb who has just looked left. Modern autonomous driving systems use transformer-based models trained on enormous datasets of real-world driving to encode what researchers call “social intelligence” — the subtle cues that experienced human drivers process instinctively.
Wayve’s approach, backed by Microsoft Azure and partnered with Uber and Nissan, explicitly trains on this: teaching the AI the equivalent of 16 years of spatial awareness before the system ever touches a real road, using a combination of real driving footage and simulation.
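To make the prediction step concrete, here is a deliberately naive baseline: constant-velocity extrapolation. This is not the transformer-based approach described above. It is the kind of simple reference that learned predictors are usually compared against, and the positions, velocities, and function names below are assumptions chosen for illustration.

```python
import math


def predict_constant_velocity(position, velocity, horizon_s, dt=0.5):
    """Extrapolate future (x, y) positions assuming velocity stays constant."""
    steps = int(round(horizon_s / dt))
    x, y = position
    vx, vy = velocity
    return [(x + vx * dt * k, y + vy * dt * k) for k in range(1, steps + 1)]


def min_separation(track_a, track_b):
    """Smallest distance between two tracks sampled at the same time steps."""
    return min(math.dist(a, b) for a, b in zip(track_a, track_b))


# Hypothetical scene: ego vehicle at 10 m/s, a cyclist ahead drifting toward its lane.
ego_track = predict_constant_velocity((0.0, 0.0), (10.0, 0.0), horizon_s=2.0)
cyclist_track = predict_constant_velocity((20.0, 2.0), (4.0, -1.0), horizon_s=2.0)

closest_m = min_separation(ego_track, cyclist_track)
print(f"closest predicted gap over 2 s: {closest_m:.1f} m")
```

A baseline like this cannot read a wheel wobble or a glance over the shoulder. That gap is what the learned models are meant to close.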
Plan. Path planning is where the vehicle decides what to do. Given everything it perceives and predicts, it calculates the safest, most efficient route through whatever the road is presenting — and recalculates that path continuously, not once at the start of a journey. The planning module also has to handle edge cases: a delivery vehicle double-parked blocking the lane, a child’s ball rolling into the road, a police officer directing traffic against the signals.
This is the hardest part. Imitation learning — training on millions of miles of human driving — handles routine situations well. But genuinely novel scenarios, ones outside the training distribution, remain the central challenge for every autonomous driving system in 2026.
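The idea of recalculating the path continuously can be sketched with a toy grid planner. The example below uses breadth-first search on a tiny two-lane grid and replans when a blocked cell appears, standing in for the double-parked delivery vehicle. The grid, the cell layout, and the plan_path function are hypothetical, and real planners work in continuous space with vehicle dynamics and cost functions rather than grid cells.

```python
from collections import deque


def plan_path(grid, start, goal):
    """Shortest path on a 4-connected grid via BFS; cells with value 1 are blocked."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None  # no route available


# Two lanes (rows), five cells long. Row 0 is the ego lane.
grid = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
start, goal = (0, 0), (0, 4)

first_plan = plan_path(grid, start, goal)

# Perception now reports a double-parked vehicle in the ego lane: replan.
grid[0][2] = 1
replanned = plan_path(grid, start, goal)

print("initial:  ", first_plan)
print("replanned:", replanned)
```

The first plan runs straight down the lane. Once the obstacle is reported, the same call returns a longer route through the adjacent lane. In a real system this loop runs many times per second.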
Act. Finally, the AI translates its decisions into physical outputs: steering angle, brake pressure, throttle position, turn signals. This has to happen in real time, with control latency measured in milliseconds, and with redundant backup systems in case any component fails.
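As a rough sketch of the actuation step, the snippet below clamps a planned command to actuator limits and falls back to gentle braking if the primary command is missing or stale. Every name and number in it (ControlCommand, the steering limit, the staleness threshold, the fallback brake level) is a hypothetical placeholder, not a value from any real vehicle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlCommand:
    steering_deg: float
    throttle: float  # 0.0 to 1.0
    brake: float     # 0.0 to 1.0


MAX_STEERING_DEG = 35.0   # hypothetical actuator limit
MAX_COMMAND_AGE_S = 0.1   # hypothetical staleness threshold


def clamp(value, low, high):
    return max(low, min(high, value))


def select_command(primary: Optional[ControlCommand], age_s: float) -> ControlCommand:
    """Return a safe actuator command, using a fallback if the primary is unusable."""
    if primary is None or age_s > MAX_COMMAND_AGE_S:
        # Illustrative fallback only: release throttle and brake gently.
        return ControlCommand(steering_deg=0.0, throttle=0.0, brake=0.5)
    return ControlCommand(
        steering_deg=clamp(primary.steering_deg, -MAX_STEERING_DEG, MAX_STEERING_DEG),
        throttle=clamp(primary.throttle, 0.0, 1.0),
        brake=clamp(primary.brake, 0.0, 1.0),
    )


fresh = select_command(ControlCommand(50.0, 0.3, 0.0), age_s=0.02)
stale = select_command(ControlCommand(5.0, 0.3, 0.0), age_s=0.5)
print(fresh)
print(stale)
```

The fresh command has its steering request limited to the assumed maximum, while the stale one is replaced entirely by the fallback.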
The End-to-End Shift That’s Changing Everything
For years, autonomous driving systems were built as modular pipelines — separate software components for perception, prediction, planning, and control, each developed and optimised independently. The emerging consensus in 2026 is that end-to-end AI, where a single foundation model processes sensor inputs and outputs driving actions directly, is both more efficient and more capable.
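The structural difference between the two designs can be shown with toy stand-ins. In the sketch below, the modular version exposes an intermediate result at every stage, while the end-to-end version is a single call from sensor input to driving action. All functions, thresholds, and values are hypothetical, and the "model" is a hand-written placeholder where a trained network would sit.

```python
def perceive(sensors):
    # Toy perception: report the gap and closing speed to the lead vehicle.
    return {"gap_m": sensors["gap_m"], "closing_mps": sensors["closing_mps"]}


def predict(scene, horizon_s=2.0):
    # Toy prediction: the gap shrinks at the current closing speed.
    return {"gap_m": scene["gap_m"] - scene["closing_mps"] * horizon_s}


def plan(forecast, min_gap_m=10.0):
    return "brake" if forecast["gap_m"] < min_gap_m else "cruise"


def act(decision):
    if decision == "brake":
        return {"throttle": 0.0, "brake": 0.4}
    return {"throttle": 0.2, "brake": 0.0}


def run_modular(sensors):
    """Four separate stages; every intermediate output can be inspected."""
    scene = perceive(sensors)
    forecast = predict(scene)
    decision = plan(forecast)
    trace = {"scene": scene, "forecast": forecast, "decision": decision}
    return act(decision), trace


def run_end_to_end(sensors, model):
    """One model maps sensor input directly to a driving action."""
    return model(sensors)


def toy_model(sensors):
    # Placeholder for a trained network; it has no inspectable stages.
    predicted_gap = sensors["gap_m"] - 2.0 * sensors["closing_mps"]
    return act("brake" if predicted_gap < 10.0 else "cruise")


sensors = {"gap_m": 14.0, "closing_mps": 3.0}
modular_action, trace = run_modular(sensors)
e2e_action = run_end_to_end(sensors, toy_model)
print(modular_action, trace["decision"])
print(e2e_action)
```

Both paths produce the same action here. The trade-off lies elsewhere: the modular trace makes each stage easy to debug, while the single model can be optimised as a whole.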
Waymo revealed in 2025 that its commercial fleet has transitioned to a foundation model trained end-to-end — a significant philosophical shift for a company that had long championed modular approaches. Li Auto’s MindVLA architecture, announced at NVIDIA GTC 2025, integrates spatial intelligence, language intelligence, and behavioural intelligence into a single system capable of 3D spatial understanding and logical reasoning, with mass production scheduled for 2026.
Tesla has used end-to-end training longest, and its data advantage — drawing from millions of consumer vehicles on roads worldwide — gives it exposure to far more edge cases than any competitor’s dedicated fleet.
V2X: When Cars Start Talking to Everything
The next frontier isn’t just smarter cars in isolation — it’s cars that communicate with each other and with infrastructure. Vehicle-to-Everything (V2X) communication allows an autonomous vehicle to know about a hazard around a blind corner from a car that’s already passed it, or receive information from a traffic light about when it will turn green a quarter mile ahead.
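The traffic-light case lends itself to a small worked example. If a signal reports that it will turn green in a known number of seconds, the vehicle can pick a speed that arrives as the light changes instead of braking to a stop. The function below is a hypothetical sketch of that arithmetic. The speed limit, the minimum speed, and the 30-second figure are assumptions, and real V2X messages follow standardised formats that are not modelled here.

```python
QUARTER_MILE_M = 402.3  # a quarter mile, in metres (rounded)


def advisory_speed_mps(distance_m, seconds_to_green, speed_limit_mps, min_speed_mps=3.0):
    """Speed that reaches the signal as it turns green, within legal and practical bounds."""
    if seconds_to_green <= 0:
        # Light is already green: proceed at the limit.
        return speed_limit_mps
    target = distance_m / seconds_to_green
    # Never exceed the limit. Below min_speed_mps the vehicle would arrive
    # early and still have to wait, so the result is floored there.
    return max(min_speed_mps, min(speed_limit_mps, target))


# Assumed scenario: 50 km/h limit (about 13.9 m/s), light turns green in 30 s.
speed = advisory_speed_mps(QUARTER_MILE_M, seconds_to_green=30.0, speed_limit_mps=13.9)
print(f"advisory speed: {speed:.2f} m/s")
```

With these numbers the advisory works out to roughly 13.4 m/s, just under the assumed limit. If the light were due to change in 10 seconds instead, the result would simply be capped at the limit.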
Research combining V2X data with large language models — published in 2025 — shows how real-time, human-like understanding of complex traffic scenarios becomes possible when vehicles share contextual information rather than each operating on only what their own sensors can see.
Where It Stands in 2026
By the end of 2025, Waymo had logged over 160 million fully driverless kilometres. Zoox launched its purpose-built bidirectional robotaxi in Las Vegas in September 2025 and is expanding in 2026. XPeng in China plans to launch vehicles with Level 4 hardware and software in mass production this year. Germany and Japan have established national legal frameworks for Level 4 deployment.
The technology is increasingly mature for well-defined operational domains: urban robotaxis, highway trucking, mapped routes in controlled conditions. The remaining challenge is broadening those domains: more cities, more weather conditions, more unpredictable scenarios. That’s where the investment, the data, and the research are focused right now, and why so many major automotive and AI companies have put autonomous driving at the centre of their strategy.
