How Autonomous Vehicles Work

Swarit Dholakia
9 min read · Mar 12, 2019
design by Hamann via Dribbble

There’s been quite a lot of hype regarding self-driving cars in the past several months, and it’s definitely building up.

From events like the Uber driverless test vehicle crash back in 2017 to numerous Tesla Autopilot crashes that have taken place, a lot of people are becoming wary of the future of autonomy in transportation…AKA trust is starting to fade.

I’ve been researching self-driving cars a lot, and the technology that enables society to remove the driver from the front seat is remarkable; not to mention, it’s crucial to understand how these machines on wheels work before we (as in all of us) make a decision about these vehicles.

They are coming. And we should invest time in figuring out what makes driverless vehicles technology and service that will revolutionize how humans move around.

Self-driving cars work on five major components

a widely-used (yet effective) visualization of the process that drives (no pun intended) self-driving cars

At a high level, we can summarize all the individual processes in our self-driving vehicle under the following ‘umbrellas’ of processes, written in the order they occur (even though the entire cycle repeats hundreds or thousands of times every second); a toy code sketch of the full loop follows the list:

  1. Computer Vision: the eyes of the car; how it ‘sees’ the road (interesting thing: this is done through various types of sensors…not just cameras)
  2. Sensor Fusion: the process of combining information from multiple sensor sources (like how our brain combines auditory and visual data to make decisions)
  3. Localization: figuring out EXACTLY (down to single-digit centimetres) where the car is on the road, to a very high degree of accuracy
  4. Path Planning: taking all the information about our surroundings and making a decision about which way we need to go to get to our destination
  5. Control: the process of physically moving the car based on decisions made in “path planning”, by constantly adjusting the steering, gas and brakes
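To make that ordering concrete, here is a toy sketch of the loop in Python. Every function and object name here is a made-up placeholder for illustration, not the API of any real self-driving stack:

```python
# A toy sense -> fuse -> localize -> plan -> act loop.
# All functions and objects are hypothetical placeholders for illustration only.

def drive_loop(vehicle, sensors, hd_map, destination):
    while not vehicle.arrived_at(destination):
        # 1. Computer vision: read raw data from every sensor
        camera, radar, lidar = sensors.read_all()

        # 2. Sensor fusion: merge the readings into one picture of the world
        world = fuse(camera, radar, lidar)

        # 3. Localization: figure out precisely where we are on the map
        pose = localize(world, hd_map, vehicle.gps())

        # 4. Path planning: pick the next waypoints and target speeds
        trajectory = plan_path(pose, world, destination)

        # 5. Control: translate the plan into steering/brake/throttle commands
        steering, throttle, brake = compute_controls(pose, trajectory)
        vehicle.actuate(steering, throttle, brake)
```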

This entire cycle runs constantly, and it runs with respect to every object (read: obstacle) on the road that could threaten the vehicle’s safe navigation.

Imagine this:

If the car, for example, is driving down a simple, straight, calm street, it’s constantly using its sensors to take in the world around it. Some sensors on autonomous test vehicles scan the environment up to 120 times per second for changes that could signal potential obstacles, through the process of computer vision.

Information from all the different sensors (radar, LiDAR, cameras, etc.) is combined and looked at as one big picture, through the process of sensor fusion (fusing all sensor data).

If our car does happen to see a person in its path, it will detect them in the fused data from all the sensors that makes up that big picture.

what the car sees in its ‘big picture’ that fuses all sensor data

Throughout the entire process of operation, the self-driving car is ‘localizing’ itself. It’s figuring out where it is in the world, and based on that information, it can decide the best next course of action during path planning.

Our example car, in this case, will decide to stop 6 metres short of the person crossing perpendicular to its path.

This decision is then exercised during control: the car applies just enough brake to decelerate at the right rate and stop at the right spot.
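The physics behind that braking decision is basic kinematics: the required deceleration is a = v² / (2·d), where d is the distance available to stop. A quick sketch with made-up numbers:

```python
# Toy calculation: how hard must the car brake to stop short of a pedestrian?
# All numbers are illustrative assumptions, not real vehicle parameters.

speed = 12.0                 # current speed in m/s (~43 km/h)
pedestrian_distance = 30.0   # metres to the crossing pedestrian
safety_margin = 6.0          # stop this many metres before the pedestrian

stopping_distance = pedestrian_distance - safety_margin      # 24 m available
required_deceleration = speed**2 / (2 * stopping_distance)   # v^2 / (2d) = 3.0 m/s^2

print(f"Brake at {required_deceleration:.1f} m/s^2 over {stopping_distance} m")
```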

The car analyses the environment, makes decisions and executes on those decisions, constantly, for every single object it detects that may interfere with its path.

Computer Vision

cameras used to identify what an object is in a video feed for a self-driving car to make accurate decisions

Computer vision is the process of using numerous sensors to perceive the world around the car as it moves and build a dynamic picture of the environment.

Humans use their eyes to gain visual information about where they are driving; cars use a multitude of sensors to gather data of similar richness.

Cameras are used for visual detection and identification of elements on the road to a high degree of specificity, because of the great detail that can be extracted from pictures and video: figuring out where the lanes are, whether a traffic light is green or red, or labelling objects in the vehicle’s path.
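To get a feel for camera-based detection in code, here is a minimal lane-line sketch using OpenCV’s classic edge-plus-Hough-transform approach. This is a textbook simplification, the file name is an assumed sample image, and production systems use far more sophisticated learned models:

```python
import cv2
import numpy as np

# Minimal lane-line detection on a single camera frame (illustrative only).
frame = cv2.imread("road.jpg")                       # assumed sample image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)          # reduce noise before edge detection
edges = cv2.Canny(blurred, 50, 150)                  # find strong intensity edges

# Detect straight line segments (candidate lane markings) in the edge map
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,
                        minLineLength=40, maxLineGap=20)

if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)  # draw detected segments

cv2.imwrite("lanes.jpg", frame)
```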

Unfortunately, despite the rich data cameras provide, they cannot (in most cases) gauge depth or the velocities of other elements on the road. For this, we use sensors like radar and LiDAR, usually mounted on top of or on the sides of the vehicle.

Radar measures the time it takes for emitted radio waves to hit an object and return to the receiver, which gives distance (using simple kinematics) and provides very accurate data about the velocity of other objects; but it’s hard to figure out exactly what object we’re looking at from radar alone.
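The range maths behind radar is straightforward time-of-flight arithmetic. A toy sketch, with timings invented purely for illustration:

```python
# Radar measures range via the round-trip time of radio waves (toy numbers).
SPEED_OF_LIGHT = 3.0e8            # m/s

def radar_range(round_trip_time_s):
    # The wave travels out and back, so divide the total path by two
    return SPEED_OF_LIGHT * round_trip_time_s / 2

# Two readings 0.1 s apart give a crude relative-velocity estimate
r1 = radar_range(4.00e-7)             # 60.0 m
r2 = radar_range(3.90e-7)             # 58.5 m
relative_velocity = (r2 - r1) / 0.1   # -15 m/s: the object is closing in
```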

LiDAR, similar in concept to radar, emits laser pulses and measures their time of return to build a map of the surroundings, labelled with precise depth measurements for each object. The result is called a point cloud, as shown below.

the point cloud generated from a Google self-driving test vehicle’s LiDAR sensor
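Each LiDAR return is just a range plus the beam’s angles, which converts to a 3D point with basic trigonometry. A small sketch with made-up returns:

```python
import math

def lidar_return_to_point(range_m, azimuth_rad, elevation_rad):
    """Convert one LiDAR return (range + beam angles) into an (x, y, z) point."""
    x = range_m * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = range_m * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return (x, y, z)

# A full sweep of returns becomes the point cloud (values invented for illustration)
returns = [(12.4, 0.10, -0.02), (12.5, 0.11, -0.02), (35.0, 1.57, 0.00)]
point_cloud = [lidar_return_to_point(r, az, el) for r, az, el in returns]
```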

Sensor Fusion

As we’ve learned, no one sensor can provide ALL the information required for an autonomous vehicle to make an effective decision: the car must combine the data from all of its sensors to have the most information.

Through sensor fusion, depth data from LiDAR point clouds, velocity data from the radar sensors, and object classifications from the camera feeds are combined into one rich ‘big picture’, giving the most accurate possible understanding of the world around the car.

Sensor fusion lets the car know what objects are, how fast they’re moving and how far away they are. Because the vehicle performs this process many times a second, it has a real-time view of the world: it can track the movement of these objects and predict their next moves, all to make an accurate decision about what to do next.
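One way to see why fusing helps: combining two noisy measurements of the same quantity, weighted by how much you trust each, yields an estimate better than either alone. Real stacks use Kalman filters and much richer models; this is only a toy sketch with invented numbers:

```python
def fuse_estimates(value_a, var_a, value_b, var_b):
    """Variance-weighted fusion of two noisy measurements of the same quantity."""
    w_a = var_b / (var_a + var_b)   # trust sensor A more when B is noisier
    w_b = var_a / (var_a + var_b)
    fused_value = w_a * value_a + w_b * value_b
    fused_var = (var_a * var_b) / (var_a + var_b)  # always smaller than either input
    return fused_value, fused_var

# Toy numbers: the LiDAR range is precise, the radar range is noisier
distance, uncertainty = fuse_estimates(24.3, 0.05, 25.1, 0.50)
print(distance, uncertainty)   # ~24.37 m, with lower variance than either sensor alone
```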

Localization

an example of localization using distance measurements of different landmarks

During navigation, not only does a car need to know where it’s going, but it has to know — to great accuracy — where it is in the world.

Through localization, the car combines readings from its numerous vision sensors with high-precision GPS to figure out EXACTLY where it is in the world.

A GPS alone is perfectly fine for a human driver to get an overall sense of where the car is, but its readings can be off by 1–2 metres. If a self-driving car is ‘off’ by a metre, it could be on the wrong side of the road and cause major damage.

Vision sensors are used to measure exact distances away from specific landmarks in the world, and such results are used to triangulate the position of the vehicle, down to single-digit centimetre-level accuracy.

These results are superimposed over HD GPS maps to figure out where the car is on both a micro and macro level.
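As a rough sketch of that triangulation idea (toy landmark positions and ranges; real systems match sensor scans against detailed prior maps), a least-squares fit can refine a metre-level GPS guess using measured distances to known landmarks:

```python
import numpy as np
from scipy.optimize import least_squares

# Known map positions of three landmarks (toy values, in metres)
landmarks = np.array([[10.0, 5.0], [30.0, 8.0], [22.0, 40.0]])
# Distances to each landmark as measured by the car's sensors
measured_ranges = np.array([13.0, 9.0, 29.0])

def range_residuals(pose):
    """Difference between predicted and measured landmark distances."""
    predicted = np.linalg.norm(landmarks - pose, axis=1)
    return predicted - measured_ranges

rough_gps_fix = np.array([20.0, 10.0])          # metre-level GPS guess
solution = least_squares(range_residuals, rough_gps_fix)
print(solution.x)                                # refined (x, y) position, ~(21.5, 11.0)
```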

Path Planning

the green line with black dots in front of the car shows its waypoints, in this image from the Udacity simulator

The whole idea of a car is to go from one location to another. Path planning is the process of, literally, planning the next path of the car.

Based on the information we’ve received about the environment around the car, and the specific measurements of other elements on the road, we can make a good decision about where and how to move next to reach our destination.

Though there are numerous ways to accomplish this, the most common is for the autonomous vehicle to set waypoints on the road ahead that it needs to pass through, along with the velocity the car should have when it passes over each one.

The locations and target velocities of these waypoints are updated very frequently based on the movement of elements in the world around the car. If a person suddenly jumps in front of the car (a self-driving one, of course), the vehicle re-plans and stops immediately, as in the sketch below.
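Here is a toy sketch of that re-planning idea: a short list of waypoints with target speeds, re-issued with lower speeds when a pedestrian appears ahead. The numbers and the slow-down rule are invented; a real planner is far more sophisticated:

```python
# Toy waypoint list: (x, y, target_speed_m_s) for the next few metres of road.
waypoints = [
    (0.0,  0.0, 12.0),
    (5.0,  0.0, 12.0),
    (10.0, 0.0, 12.0),
    (15.0, 0.0, 12.0),
    (20.0, 0.0, 12.0),
]

def replan_for_obstacle(waypoints, obstacle_x, stop_margin=6.0):
    """Lower target speeds so the car stops stop_margin metres before the obstacle."""
    replanned = []
    for x, y, speed in waypoints:
        if x >= obstacle_x - stop_margin:
            speed = 0.0                                # stop short of the obstacle
        else:
            # Arbitrary rule: scale speed down as the car nears the stopping point
            remaining = (obstacle_x - stop_margin) - x
            speed = min(speed, max(0.0, remaining * 1.0))
        replanned.append((x, y, speed))
    return replanned

# A pedestrian appears 18 m ahead: waypoints are re-issued with lower speeds
print(replan_for_obstacle(waypoints, obstacle_x=18.0))
```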

Control

a Voyage self-driving test vehicle travelling down the road

Based on the paths planned in the previous step, the car has concluded the set of waypoints it will follow in its journey to the desired destination.

These ‘conclusions’ about the car’s next steps are turned into actionable commands: the steering angle, brake and acceleration amounts that need to be issued.

These commands, accompanied by how long they need to be executed, are sent to the electronic control unit of the car (ECU) for them to be carried out.

A car taking a right turn may instruct the ECU to turn the steering to an angle of 60° for a duration of 220ms.
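A hypothetical command structure for that kind of instruction might look like the sketch below. The `ecu` object and its `apply` method are placeholders, not any real vehicle interface:

```python
from dataclasses import dataclass
import time

@dataclass
class ControlCommand:
    steering_deg: float    # steering angle in degrees
    throttle: float        # 0.0 to 1.0
    brake: float           # 0.0 to 1.0
    duration_s: float      # how long to hold this command

def execute(commands, ecu):
    """Send each command to a (hypothetical) ECU interface and hold it."""
    for cmd in commands:
        ecu.apply(steering=cmd.steering_deg, throttle=cmd.throttle, brake=cmd.brake)
        time.sleep(cmd.duration_s)   # hold until the next correction arrives

# The right-turn example from the text: steer to 60 degrees for 220 ms
right_turn = [ControlCommand(steering_deg=60.0, throttle=0.1, brake=0.0, duration_s=0.22)]
```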

These ongoing commands are sent to the car’s ECU, based on decisions made from understanding the surroundings through the previous processes of a self-driving car.

So What?!

I’m a big believer in self-driving cars and their power to transform the way people think of mobility.

Not only will the cost per mile decrease drastically (no driver to pay!), but car ownership will also decline, since owning a car will become the more expensive option.

Because of a lack of car ownership, cities will be able to reclaim parking spaces (as much as 4.4 million acres in the US alone…about 3 times the size of Chicago) as green spaces or residential areas (which would bring down housing prices).

On top of that, the average American would save 382 hours every year by not needing to drive anymore (roughly 15 days, or about $6,000 at average wage levels).

Interestingly, Waymo, Alphabet’s self-driving car company, is testing its hail-a-driverless-car service in Arizona (shown below).

a Waymo One vehicle — from the former Google self-driving car team

People will be able to shop from their rides, sleep on a long commute, or even have a whole meal on their trip. The list keeps going on, as the impact of self-driving cars will touch every single industry, and will revolutionize the way people live.

Key Takeaways

  1. Computer vision is the process of cars being able to ‘see’ the world. A combination of sensors is used to do this.
  2. Sensor fusion is the process of combining the data from various sensors to create one ‘big picture’ of the world.
  3. Localization is the process of measuring distances from landmarks to triangulate the car’s position in the world while checking with GPS maps.
  4. Path planning is the process of using the understanding of the world around us to figure out exactly where to go next, and with what velocity.
  5. Control is the process of executing on the path planned by sending commands to the car computer about corrections to steering angle, braking and acceleration.
  6. Self-driving cars will revolutionize the way people live.

Some awesome articles you should read if you’re super interested in autonomous vehicles:

  1. https://medium.com/@swaritd/designing-for-trust-in-self-driving-cars-4bef4187a545 (full disclosure: shameless plug)
  2. https://medium.com/udacity/how-the-udacity-self-driving-car-works-575365270a40
  3. https://www.wired.com/story/the-know-it-alls-how-do-self-driving-cars-see/
  4. https://www.wired.com/story/guide-self-driving-cars/
  5. https://medium.com/@swaritd/reinventing-the-wheel-with-the-driverless-car-41b0ce2b1c29 (full disclosure: another shameless plug)

Liked this article? AWESOME! Show your appreciation down below 👏👏

  1. Follow me on Medium
  2. Connect with me on LinkedIn
  3. Reach out at dholakia.swarit@gmail.com to say hi!

I’d love to chat about autonomous vehicles or any cool exponential technology!
