Driverless Vehicle Data Engineering

This Week: How does an AV determine who has right of way at a four-way stop?

Dear Reader…

If you spend any time in Los Angeles County you will encounter a peculiar sight: white electric Jaguars driving around with no one at the wheel.

Follow one of these Waymo vehicles in your own car and you would not know, based on its behaviour, that it is “sans-driver”! Spotting these beasts on a recent visit got me curious not only to try the service, but also to understand more about the data and decision-making algorithms that make Autonomous Vehicles (AVs) viable. How, for instance, does one figure out who has right of way, or get out of the way of a fire truck or ambulance?

One pattern particular to driving in the USA is the right of way at a stop sign, often referred to as the “four-way stop rule”. Essentially it means that the first vehicle to arrive at an intersection where there are multiple stop signs has the right of way, regardless of direction or location. As I sat in my Waymo for the first time, it got me thinking about just what goes into making this decision from a Data Engineering standpoint. Here are some answers based on my research into this engineering conundrum.

Sense, solve and go is how Waymo explains AV driving. If we look under the hood though, there are several more layers to unpack that are of interest to Data Engineers…

Perception and Sensor Fusion

Waymo's autonomous vehicles rely on a sophisticated array of sensors to perceive their environment accurately. This multi-modal approach is crucial for making informed decisions at intersections, and includes:

  1. LiDAR (Light Detection and Ranging): Waymo uses custom-built LiDAR sensors that provide high-resolution 3D point clouds of the surrounding environment. These sensors can detect objects up to 300 metres away, offering precise distance measurements and shape information. LiDAR is particularly useful for detecting the presence and position of other vehicles at the intersection, as well as pedestrians and cyclists.

  2. Cameras: Multiple high-resolution cameras are strategically placed around the vehicle to provide a 360-degree view. These cameras are crucial for detecting and reading traffic signs, lane markings, and traffic lights. They also help in identifying the type and colour of other vehicles, which can be important for predicting their behaviour.

  3. Radar: Radar sensors complement LiDAR and cameras by providing accurate velocity measurements of moving objects. They are particularly effective in adverse weather conditions where visibility might be reduced.

  4. Sensor Fusion Algorithms: To combine data from these diverse sensors, Waymo likely uses advanced sensor fusion algorithms. These could include Kalman filters for tracking moving objects and integrating data over time, as well as deep learning-based fusion techniques that learn to combine multi-modal data optimally.
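
To make the Kalman filter idea concrete, here is a minimal sketch, not Waymo's implementation. It tracks a single number (the distance to another vehicle), uses a radar velocity reading to predict forward in time, and then corrects that prediction with a LiDAR range measurement. All names and noise values are my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RangeEstimate:
    distance_m: float  # estimated distance to the tracked vehicle
    variance: float    # uncertainty of that estimate, in m^2


def predict(est: RangeEstimate, radar_velocity_mps: float,
            dt_s: float, process_noise: float) -> RangeEstimate:
    """Move the estimate forward in time using the radar velocity reading."""
    return RangeEstimate(
        distance_m=est.distance_m + radar_velocity_mps * dt_s,
        variance=est.variance + process_noise,  # uncertainty grows over time
    )


def update(est: RangeEstimate, lidar_distance_m: float,
           lidar_variance: float) -> RangeEstimate:
    """Blend in a LiDAR range measurement, weighted by relative confidence."""
    gain = est.variance / (est.variance + lidar_variance)  # Kalman gain
    return RangeEstimate(
        distance_m=est.distance_m + gain * (lidar_distance_m - est.distance_m),
        variance=(1.0 - gain) * est.variance,  # uncertainty shrinks after fusing
    )


# Hypothetical numbers: a car 30 m away, closing at 5 m/s, observed 0.1 s later.
prior = RangeEstimate(distance_m=30.0, variance=4.0)
predicted = predict(prior, radar_velocity_mps=-5.0, dt_s=0.1, process_noise=0.1)
fused = update(predicted, lidar_distance_m=29.2, lidar_variance=0.25)
print(fused)
```

The fused estimate lands between the prediction and the LiDAR reading, and its variance is lower than either input, which is the whole point of combining sensors.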

Deep Learning Models

Waymo employs state-of-the-art deep learning architectures for various perception and decision-making tasks, to determine what is going on in any given situation:

  1. VectorNet: This is a deep learning architecture, published by researchers at Waymo and Google, designed specifically for predicting vehicle trajectories in complex traffic scenarios. VectorNet uses a hierarchical graph neural network to model interactions between vehicles and the road infrastructure. It processes the scene as a graph, where nodes represent entities (vehicles, pedestrians, road features) and edges represent their relationships. This approach allows the model to capture complex interactions and dependencies between different elements in the scene. VectorNet reported state-of-the-art performance on trajectory prediction benchmarks, making it particularly useful for anticipating the behaviour of other vehicles at intersections.

  2. Convolutional Neural Networks (CNNs): CNNs are typically also used for processing image data from cameras and possibly for interpreting LiDAR point clouds. Tasks include:

    • Object detection: Identifying and localizing other vehicles, pedestrians, and cyclists.

    • Semantic segmentation: Classifying each pixel in the image to understand road layout, lane markings, and drivable areas.

    • Traffic sign recognition: Detecting and interpreting traffic signs and signals.

    While not officially cited, the chances are that Waymo uses advanced CNN architectures like ResNet, EfficientNet, or custom-designed networks optimised for their specific use case.

  3. Recurrent Neural Networks (RNNs) and Transformers: These architectures are well-suited for processing sequential data and could be used for:

    • Predicting the future trajectory of other road users based on their past movements.

    • Understanding the temporal context of the scene, which is crucial for making right-of-way decisions.

    Long Short-Term Memory (LSTM) networks or more recent Transformer-based models could be employed for these tasks.
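
The "scene as a graph" idea is easier to picture with a toy example. The sketch below is not VectorNet: it is a hand-rolled graph with one round of message passing, where each node simply averages its features with those of its neighbours. Real graph neural networks replace that average with learned weights. The node names and the two features (speed and distance to the stop line) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    features: dict = field(default_factory=dict)    # node name -> feature list
    neighbours: dict = field(default_factory=dict)  # node name -> connected nodes

    def add_node(self, name: str, feature: list) -> None:
        self.features[name] = list(feature)
        self.neighbours.setdefault(name, set())

    def add_edge(self, a: str, b: str) -> None:
        # Relationships are symmetric in this toy graph.
        self.neighbours[a].add(b)
        self.neighbours[b].add(a)


def message_pass(graph: SceneGraph) -> dict:
    """One round: each node averages its own features with its neighbours'."""
    updated = {}
    for node, own in graph.features.items():
        group = [own] + [graph.features[n] for n in sorted(graph.neighbours[node])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


# Hypothetical features: [speed in m/s, distance to stop line in m]
scene = SceneGraph()
scene.add_node("ego", [0.0, 0.0])
scene.add_node("car_b", [4.0, 12.0])
scene.add_node("stop_line", [0.0, 0.0])
scene.add_edge("ego", "car_b")
scene.add_edge("ego", "stop_line")

updated = message_pass(scene)
print(updated["ego"], updated["car_b"])
```

After one round, the ego node's features already carry information about the approaching car, which is how context spreads through the graph.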

Behaviour Prediction

Anticipating the behaviour of other drivers, pedestrians and cyclists is one of the more complex parts of making safe and efficient right-of-way decisions. It is overly simplistic to sum up the four-way stop rule purely based on who arrives first. What happens if there is a pedestrian, or what if two cars arrive at the same time? Waymo likely uses the following to anticipate the actions of others:

  1. Probabilistic Models: Waymo is likely to use probabilistic approaches to handle uncertainty in predictions, such as Gaussian Process Models or Bayesian Neural Networks, which provide not just point estimates but probability distributions over possible future states.

  2. Intention Recognition: Advanced algorithms might be used to infer the intentions of other drivers based on subtle cues like slight movements or positioning within the lane. This could involve a combination of rule-based systems and learned models.

  3. Multi-Agent Prediction: At a 4-way stop, it's crucial to predict the behaviour of multiple agents simultaneously. Chances are that Waymo uses techniques from game theory or multi-agent reinforcement learning to model the interdependent decisions of the vehicles and other parties at the intersection.
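
As a small illustration of probabilistic intention recognition, here is a sketch that applies Bayes' rule to a single cue: the other car creeping forward. It is far simpler than a Gaussian Process or Bayesian Neural Network, and the probabilities are made up for illustration, but it shows how a belief about another driver's intent can sharpen as cues accumulate.

```python
def update_intent(prior_go: float, p_cue_given_go: float,
                  p_cue_given_yield: float) -> float:
    """Return P(driver intends to go | cue observed) using Bayes' rule."""
    evidence = p_cue_given_go * prior_go + p_cue_given_yield * (1.0 - prior_go)
    return p_cue_given_go * prior_go / evidence


# Assumed likelihoods: a driver who intends to go creeps forward 80% of the
# time, while a driver who intends to yield creeps forward only 20% of the time.
P_CREEP_GIVEN_GO = 0.8
P_CREEP_GIVEN_YIELD = 0.2

belief_go = 0.5  # start undecided
history = []
for _ in range(2):  # two creeping cues in a row, treated as independent
    belief_go = update_intent(belief_go, P_CREEP_GIVEN_GO, P_CREEP_GIVEN_YIELD)
    history.append(belief_go)

print(history)
```

One cue lifts the belief from 0.5 to roughly 0.8, and a second lifts it above 0.9. A planner could use that rising belief to decide to wait rather than contest the intersection.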

Decision Making

For making right-of-way decisions, Waymo likely uses a sophisticated combination of approaches:

  1. Rule-Based Systems: Hard-coded rules based on traffic laws and right-of-way conventions provide a baseline for decision-making. There is a balance to be struck between rule adherence, applied common sense and the nuances of acceptable etiquette. As all drivers have experienced, there is a significant difference between theory and practice. This is where culturally specific learning comes into play.

  2. Reinforcement Learning (RL): RL algorithms are used to optimise decision-making based on experience from millions of simulated and real-world interactions. Techniques like Deep Q-Networks (DQN) or Proximal Policy Optimization (PPO) might be employed to learn optimal policies for handling intersections. The reward function for such an RL system would likely balance factors like safety, efficiency, and adherence to traffic rules.

  3. Planning Algorithms: Two planning algorithms could come into play in the decision process:

    • Monte Carlo Tree Search (MCTS): This algorithm could be used to explore possible future scenarios and choose the best action.

    • Model Predictive Control (MPC): MPC could help plan safe trajectories through the intersection while considering the predicted behaviour of other vehicles.

  4. Hybrid Approaches: Waymo likely uses a combination of these techniques, perhaps with a hierarchical structure. For example, a high-level decision-making module might use RL to decide when to proceed, while a lower-level planning module uses MPC to generate the exact trajectory.
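
The rule-based baseline is the easiest layer to sketch. The code below is my own illustration, not Waymo's logic. It encodes the first-to-arrive rule, the common convention that when vehicles arrive together the one on the left yields to the one on its right, and a hard stop for pedestrians. When the rules cannot settle the matter it returns None, which is where the learned and planning layers above would take over. The tie window and all names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# For a vehicle approaching on a given leg, the leg that lies to its right.
# Example: approaching from the south (heading north), the east leg is on the right.
RIGHT_OF = {"S": "E", "E": "N", "N": "W", "W": "S"}


@dataclass(frozen=True)
class Arrival:
    vehicle_id: str
    leg: str          # leg the vehicle approaches on: "N", "E", "S" or "W"
    stop_time: float  # time it came to a stop at the line, in seconds


def next_to_go(arrivals: list, pedestrian_in_crosswalk: bool = False,
               tie_window_s: float = 0.5) -> Optional[str]:
    """Return the id of the vehicle with right of way, or None if unresolved."""
    if pedestrian_in_crosswalk or not arrivals:
        return None  # nobody proceeds while a pedestrian is crossing
    earliest = min(a.stop_time for a in arrivals)
    tied = [a for a in arrivals if a.stop_time - earliest <= tie_window_s]
    if len(tied) == 1:
        return tied[0].vehicle_id  # clear first arrival
    tied_legs = {a.leg for a in tied}
    # A tied vehicle may go only if no tied vehicle sits on its right.
    unblocked = [a for a in tied if RIGHT_OF[a.leg] not in tied_legs]
    if len(unblocked) == 1:
        return unblocked[0].vehicle_id
    return None  # e.g. opposing vehicles or a four-way tie: defer to other layers


clear_case = [Arrival("ego", "S", 0.0), Arrival("car_b", "E", 2.0)]
tie_case = [Arrival("ego", "S", 0.0), Arrival("car_b", "E", 0.2)]
print(next_to_go(clear_case), next_to_go(tie_case))
```

In the tie case the ego vehicle yields, because car_b is on its right. A four-way tie leaves every vehicle blocked, which is exactly the kind of stand-off that etiquette, and in an AV the prediction and planning layers, has to resolve.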

Training and Optimisation

Waymo's collaboration with Google’s DeepMind has led to advanced training techniques, including:

  • Population-Based Training (PBT): This evolutionary approach helps optimise neural network hyper-parameters more efficiently. PBT works by training multiple models in parallel and periodically replacing poorly performing models with modified versions of better-performing ones. This technique allows Waymo to efficiently search a large space of possible model configurations and find optimal settings for their neural networks.

  • Simulation: Waymo's "Carcraft" simulator is a crucial tool for training and testing their AI systems. It allows for the simulation of 25,000 virtual self-driving cars in various city models, including numerous 4-way stop scenarios. The simulator can generate a wide variety of challenging scenarios, including edge cases that are rare in real-world driving. This extensive simulation capability allows Waymo to train their models on millions of hours of driving experience, far more than would be possible with real-world driving alone.

  • Transfer Learning: Waymo likely uses transfer learning techniques to leverage knowledge gained from simulations when fine-tuning models for real-world performance. This approach helps bridge the gap between simulated and real-world environments.
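
Population-Based Training is easier to grasp in miniature. This sketch trains a population of toy "models" (a single number each) on a made-up objective, and every few steps the worst performers copy the weights and learning rate of the best, with the learning rate nudged up or down. The objective, population size and perturbation factors are all assumptions for illustration, not details of Waymo's or DeepMind's setup.

```python
import random


def score(theta: float) -> float:
    """Toy objective: higher is better, with the best value at theta = 3."""
    return -(theta - 3.0) ** 2


def train_step(theta: float, lr: float) -> float:
    """One step of gradient ascent on the toy objective."""
    gradient = -2.0 * (theta - 3.0)
    return theta + lr * gradient


def pbt(population_size: int = 8, steps: int = 40,
        ready_every: int = 5, seed: int = 0) -> dict:
    rng = random.Random(seed)
    # Each worker starts from the same weights with a random learning rate.
    workers = [{"theta": 0.0, "lr": 10 ** rng.uniform(-3, -1)}
               for _ in range(population_size)]
    for step in range(1, steps + 1):
        for w in workers:
            w["theta"] = train_step(w["theta"], w["lr"])
        if step % ready_every == 0:
            ranked = sorted(workers, key=lambda w: score(w["theta"]), reverse=True)
            cut = max(1, population_size // 4)
            for loser, winner in zip(ranked[-cut:], ranked[:cut]):
                loser["theta"] = winner["theta"]                     # exploit
                loser["lr"] = winner["lr"] * rng.choice([0.8, 1.2])  # explore
    return max(workers, key=lambda w: score(w["theta"]))


best = pbt()
print(best)
```

The key design choice is that hyper-parameters are adjusted during training rather than fixed up front, so a poor initial learning rate does not doom a worker for the whole run.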

Continuous Improvement

With more than a decade spent working on the AV challenge, the system is designed for ongoing refinement, with improvements fed back into the decision-making models through:

  • Data Collection and Analysis: Every real-world interaction at a 4-way stop provides valuable data for further training and improvement. Waymo likely has sophisticated data pipelines to collect, process, and analyse this data efficiently. Machine learning techniques might be used to automatically identify interesting or challenging scenarios from the collected data.

  • Model Updates: The company regularly updates its AI models based on new data and improved algorithms. This might involve techniques like online learning or periodic retraining of models.

  • Human-in-the-Loop Systems: For particularly challenging scenarios, Waymo might employ human experts to review and annotate data. This human insight can be used to improve the AI system's decision-making in edge cases.
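
To give a feel for the data engineering side, here is a sketch of a scenario-mining step that flags four-way stop events for human review. Where the text above suggests machine learning, this sketch swaps in plain fixed thresholds to keep the idea visible. The event schema, field names and threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StopEvent:
    event_id: str
    wait_s: float            # how long the AV waited at the stop line
    peak_decel_mps2: float   # hardest braking during the event
    agents_present: int      # other road users at the intersection


def flag_for_review(events: list, max_wait_s: float = 15.0,
                    hard_brake_mps2: float = 3.0, crowded: int = 4) -> list:
    """Return (event_id, reasons) pairs for events worth a closer look."""
    flagged = []
    for event in events:
        reasons = []
        if event.wait_s > max_wait_s:
            reasons.append("long_wait")
        if event.peak_decel_mps2 > hard_brake_mps2:
            reasons.append("hard_brake")
        if event.agents_present >= crowded:
            reasons.append("crowded")
        if reasons:
            flagged.append((event.event_id, reasons))
    return flagged


log = [
    StopEvent("e1", wait_s=4.0, peak_decel_mps2=1.2, agents_present=1),
    StopEvent("e2", wait_s=22.0, peak_decel_mps2=1.0, agents_present=3),
    StopEvent("e3", wait_s=6.0, peak_decel_mps2=4.5, agents_present=5),
]
review_queue = flag_for_review(log)
print(review_queue)
```

Routine events pass through untouched, while the long wait and the hard-braking, crowded event land in the review queue along with the reasons they were flagged.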

Governance & Quality Assurance

Fundamental to the viability of AVs are the governance frameworks that enable companies like Waymo to operate safely and securely. Given the understandable fears about driverless cars, governance is an even more important dimension of data engineering. Here is a short explanation of the quality assurance regime deployed on the Waymo platform.

Waymo uses a layered framework that prioritises robust infrastructure, behavioural adaptability, and operational support. This architectural foundation adheres to well-defined hardware and software standards and performance metrics, safeguarding baseline operations by confirming each component’s reliability, from LiDAR for sensing to automated cleaning that maintains sensor functionality.

Operational standards are also based on extensive research into real-world scenarios, like passengers spilling drinks in the vehicle or pedestrians potentially behaving badly. There are examples of people placing road cones on the hood of an AV or standing in front of one just to see what happens.

To provide transparency and accountability, Waymo has developed the Waymo Case Credibility Assessment to standardise its safety validation. This governance tool provides a consistent method to evaluate safety metrics, similar to regulatory frameworks that guide risk and safety assessments in other sectors. This is coupled with iterative testing, feedback loops, and deployment evaluations across different cities.

It is truly amazing to experience the wonder of an AV in action. It seems very much like the future that has so long been hyped is beginning to make an appearance in our lives. As someone interested in the field of Digital Transformation and Data Engineering, I highly recommend trying the service to see if your mind gets blown, as mine was.

That’s a wrap for this week.
Thank you