Tuesday, October 15, 2024

NVIDIA DGX A100, the world’s first 5-petaflops system

NVIDIA has unveiled the third generation of its most advanced artificial intelligence system, the new NVIDIA DGX A100. The new AI system will be used by the US Department of Energy’s (DOE) Argonne National Laboratory to better understand and fight COVID-19.

According to NVIDIA, the DGX A100 is the first system to deliver five petaflops of AI performance, consolidating the power and capabilities of an entire data center into a single flexible platform. The A100 GPU it is built on is the largest 7nm chip ever made and can handle 1.5 TB of data per second.

“NVIDIA DGX is the first AI system built for the end-to-end machine learning workflow – from data analytics to training to inference. And with the giant performance leap of the new DGX, machine learning engineers can stay ahead of the exponentially growing size of AI models and data,” said Jensen Huang, NVIDIA's CEO.

It consolidates the power and capabilities of an entire data center into a single flexible platform.

Technical Specifications:

DGX A100 systems integrate eight of the new NVIDIA A100 Tensor Core GPUs, providing 320 GB of GPU memory for training on large AI datasets. The system includes six NVIDIA NVSwitch interconnects built on third-generation NVLink, which doubles the connection speed between GPUs. Together, they provide 4.8 TB/s of bidirectional bandwidth across the system, and each GPU can exchange data with the others at 600 GB/s. In addition, the system uses nine Mellanox ConnectX-6 HDR network interfaces running at 200 Gb per second each, for a total of 3.6 Tb per second of bidirectional bandwidth.
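
The headline totals above follow from simple multiplication of the per-component figures. The short Python sketch below reproduces that arithmetic. The constant and function names are illustrative, the 40 GB per-GPU figure is an assumption obtained by dividing 320 GB evenly across eight GPUs, and reading 4.8 TB/s as eight GPUs times 600 GB/s is one plausible interpretation of the quoted numbers rather than NVIDIA's stated derivation.

```python
# Back-of-the-envelope check of the DGX A100 totals quoted in the article.
# All names here are illustrative; they are not part of any NVIDIA tool.

NUM_GPUS = 8              # A100 GPUs per system (from the article)
GPU_MEMORY_GB = 40        # assumption: 320 GB split evenly across 8 GPUs
GPU_LINK_GB_PER_S = 600   # GPU-to-GPU rate per GPU, in GB/s (from the article)
NUM_NICS = 9              # Mellanox ConnectX-6 interfaces (from the article)
NIC_GBIT_PER_S = 200      # per-interface rate in Gb/s, one direction


def total_gpu_memory_gb() -> int:
    """Total GPU memory in the system, in GB."""
    return NUM_GPUS * GPU_MEMORY_GB


def aggregate_gpu_bandwidth_tb_per_s() -> float:
    """Aggregate GPU interconnect bandwidth in TB/s (assumed: GPUs x per-GPU rate)."""
    return NUM_GPUS * GPU_LINK_GB_PER_S / 1000


def network_bidirectional_tbit_per_s() -> float:
    """Total network bandwidth in Tb/s, counting both directions."""
    return NUM_NICS * NIC_GBIT_PER_S * 2 / 1000


if __name__ == "__main__":
    print(f"GPU memory:        {total_gpu_memory_gb()} GB")
    print(f"GPU interconnect:  {aggregate_gpu_bandwidth_tb_per_s()} TB/s")
    print(f"Network (bidir.):  {network_bidirectional_tbit_per_s()} Tb/s")
```

Note the unit difference between the two bandwidth figures: the GPU interconnect is quoted in terabytes per second, while the network total is in terabits per second.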

NVIDIA says a single rack of five DGX A100 systems replaces a data center of AI training and inference infrastructure, with 1/20th the power consumed, 1/25th the space, and 1/10th the cost.

Computational power to study the spread of viruses

With the launch, NVIDIA says companies will be able to allocate computing power and resources on demand to accelerate workloads, including data analytics, training, and inference, on a single software-defined platform. The DGX A100 can also be used for scale-out applications such as data analytics and inference.

According to the company, large companies, service providers, and government agencies around the world placed initial orders for the DGX A100, with the first systems being delivered to the US Department of Energy’s Argonne National Laboratory earlier this month.

“We’re using America’s most powerful supercomputers in the fight against COVID-19, running AI models and simulations on the latest technology available, like the NVIDIA DGX A100,” said Rick Stevens, associate laboratory director for Computing, Environment and Life Sciences at Argonne. “The compute power of the new DGX A100 systems coming to Argonne will help researchers explore treatments and vaccines and study the spread of the virus, enabling scientists to do years’ worth of AI-accelerated work in months or days.”

Nvidia’s new AI system is now available through NVIDIA Partner Network resellers worldwide starting at $199,000.
