Wednesday, May 8, 2024

Tesla unveils its new supercomputer to train its self-driving system

It’s no secret that Tesla is preparing a supercomputer named Dojo to help the company develop autonomous driving technologies. But at CVPR this week, the company surprised observers by unveiling a different supercomputer, one that is already the fifth most powerful in the world and will serve as the predecessor to Dojo.

The new system will be used to train the neural networks behind Tesla vehicles’ Autopilot feature and the company’s still-unreleased self-driving artificial intelligence systems. Tesla is making it possible for autonomous vehicle engineers to do their life’s work efficiently and at the cutting edge.

The cluster uses 720 nodes, each with eight NVIDIA A100 Tensor Core GPUs (5,760 GPUs in total), to achieve an industry-leading 1.8 exaflops of performance. NVIDIA A100 GPUs deliver acceleration at every scale to power the world’s highest-performing data centers. Powered by the NVIDIA Ampere architecture, the A100 GPU provides up to 20 times higher performance than the prior generation and can be partitioned into seven GPU instances to adjust dynamically to shifting demands.
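The cluster figures above can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the article; the function and constant names are illustrative. The resulting per-GPU figure is close to the roughly 312 teraflops peak that NVIDIA lists for dense FP16 Tensor Core operations on the A100, which suggests, as an inference rather than something the article states, that 1.8 exaflops is a theoretical peak and not a measured benchmark result.

```python
# Figures quoted in the article.
NODES = 720
GPUS_PER_NODE = 8
CLUSTER_EXAFLOPS = 1.8

TERAFLOPS_PER_EXAFLOP = 1_000_000  # 1 exaflop = 10^18 FLOPS = 10^6 teraflops


def total_gpus(nodes: int, gpus_per_node: int) -> int:
    """Number of GPUs in the whole cluster."""
    return nodes * gpus_per_node


def teraflops_per_gpu(cluster_exaflops: float, gpu_count: int) -> float:
    """Average share of the cluster's quoted performance per GPU, in teraflops."""
    return cluster_exaflops * TERAFLOPS_PER_EXAFLOP / gpu_count


gpus = total_gpus(NODES, GPUS_PER_NODE)
per_gpu = teraflops_per_gpu(CLUSTER_EXAFLOPS, gpus)

print(f"GPUs in cluster: {gpus:,}")
print(f"Teraflops per GPU: {per_gpu:.1f}")
```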

Tesla’s development cycle begins in the car: a deep neural network (DNN) running in “shadow mode” quietly perceives and makes predictions while the car is driving, without actually controlling the vehicle. The predictions are recorded, and Tesla engineers then use them to create a training dataset of difficult and diverse scenarios to refine the DNN.
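To make the idea of shadow mode concrete, here is a minimal sketch of the pattern: a model produces predictions that are logged but never acted on, and the log is later filtered for interesting cases. Every name here is hypothetical, and selecting cases by disagreement between prediction and observation is an assumption made for illustration. The article does not describe how Tesla’s pipeline actually chooses its scenarios.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ShadowRecord:
    frame_id: int
    predicted: str  # what the shadow network predicted
    observed: str   # what actually happened, e.g. the driver's action


def run_shadow_mode(
    frames: Sequence[str],
    predict: Callable[[str], str],
    observe: Callable[[str], str],
) -> List[ShadowRecord]:
    """Log a prediction for every frame without acting on any of them."""
    return [
        ShadowRecord(frame_id, predict(frame), observe(frame))
        for frame_id, frame in enumerate(frames)
    ]


def select_hard_cases(records: List[ShadowRecord]) -> List[ShadowRecord]:
    """Assumed selection rule: keep frames where prediction and observation differ."""
    return [r for r in records if r.predicted != r.observed]


# Toy stand-ins for the network and for the recorded driver behavior.
def toy_predict(frame: str) -> str:
    return "brake" if frame == "cut-in" else "cruise"


def toy_observe(frame: str) -> str:
    return "cruise" if frame == "clear road" else "brake"


frames = ["clear road", "cut-in", "clear road", "construction zone"]
records = run_shadow_mode(frames, toy_predict, toy_observe)
hard_cases = select_hard_cases(records)

for case in hard_cases:
    print(case)
```

In this toy run, only the construction-zone frame is kept, because it is the one place where the stand-in network’s prediction differs from what was observed.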

The result is a collection of roughly 1 million 10-second clips recorded at 36 frames per second, totaling a whopping 1.5 petabytes of data. The DNN is then run through these scenarios in the data center over and over until it operates without a mistake. Finally, the network is sent back to the vehicle, and the process begins again.
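The dataset figures above imply some useful per-clip and per-frame sizes. The sketch below works them out from the quoted numbers alone. It assumes decimal units (1 petabyte = 10^15 bytes), and the names are illustrative. The article does not say how many cameras or what compression each clip involves, so the per-frame figure is an average per time step and not a per-image size.

```python
# Figures quoted in the article.
CLIPS = 1_000_000
CLIP_SECONDS = 10
FRAMES_PER_SECOND = 36
DATASET_PETABYTES = 1.5

# Assumption: decimal (SI) units.
BYTES_PER_PETABYTE = 10**15
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_MEGABYTE = 10**6

# Total number of frame time steps across the whole dataset.
frames = CLIPS * CLIP_SECONDS * FRAMES_PER_SECOND

dataset_bytes = DATASET_PETABYTES * BYTES_PER_PETABYTE
gb_per_clip = dataset_bytes / CLIPS / BYTES_PER_GIGABYTE
mb_per_frame = dataset_bytes / frames / BYTES_PER_MEGABYTE

print(f"Total frames: {frames:,}")
print(f"Average size per clip: {gb_per_clip:.2f} GB")
print(f"Average size per frame time step: {mb_per_frame:.2f} MB")
```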

Tesla’s supercomputer also gives autonomous vehicle engineers the performance needed to experiment and iterate in the development process. “Computer vision is the bread and butter of what we do and enables Autopilot. For that to work, you need to train a massive neural network and experiment a lot,” said Andrej Karpathy, senior director of AI at Tesla. “That’s why we’ve invested a lot into the compute.”

“For us, computer vision is the bread and butter of what we do and what enables Autopilot,” Karpathy said during an announcement at the 2021 Conference on Computer Vision and Pattern Recognition on Monday. “And for that to work really well, we need to master the data from the fleet, and train massive neural nets and experiment a lot.”