Tuesday, May 21, 2024

Google introduces its most potent, most energy-efficient AI chip to date

Alphabet, Google’s parent company, has unveiled its latest addition to the artificial intelligence data center chip family: Trillium. 

Trillium is the newest of Google’s Tensor Processing Units (TPUs), the company’s most advanced AI-specific hardware. These custom chips for AI data centers are one of the few viable alternatives to Nvidia’s offerings.

Nvidia currently holds roughly 80% of the market for AI chips, and Google accounts for much of the remaining 20%. Notably, Google does not sell its chips but rents access to them through its cloud computing platform.

Trillium is Google’s sixth-generation TPU and, according to the company, its most performant and most energy-efficient TPU to date. In a blog post, Google said Trillium TPUs achieve a 4.7X increase in peak compute performance per chip compared to TPU v5e.

The company’s latest offering boasts doubled high bandwidth memory (HBM) capacity and bandwidth, along with doubled Interchip Interconnect (ICI) bandwidth compared to TPU v5e. 

In addition, Trillium features a third-generation SparseCore, a specialized accelerator for processing ultra-large embeddings found in advanced ranking and recommendation workloads. The blog post also highlights the capability of training the next wave of ‘foundation models’ at reduced latency and lower cost.

According to Google, Trillium TPUs are over 67 percent more energy efficient than TPU v5e. Trillium can also scale up to 256 TPUs within a single high-bandwidth, low-latency pod.
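To put the 67 percent figure in perspective, the short Python sketch below converts an efficiency gain into the energy needed for the same amount of work. It assumes "energy efficient" means performance per watt, which the article does not spell out, and the function name is purely illustrative.

```python
def relative_energy(efficiency_gain_pct: float) -> float:
    """Energy needed for the same work, relative to the baseline chip.

    Assumption: the gain is measured as performance per watt, so a chip
    that is X percent more efficient does (1 + X/100) times the work
    per unit of energy.
    """
    return 1.0 / (1.0 + efficiency_gain_pct / 100.0)


# 67 percent is the figure Google quotes for Trillium versus TPU v5e.
ratio = relative_energy(67.0)
print(f"{ratio:.3f}")  # 0.599
```

Under that assumption, a Trillium chip would use about 60 percent of the energy a TPU v5e needs for the same job, a saving of roughly 40 percent.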

Beyond a single pod, the blog post said, Trillium TPUs can use multislice technology and Titanium Intelligence Processing Units (IPUs) to scale to hundreds of pods, connecting tens of thousands of chips in a building-scale supercomputer interconnected by a multi-petabit-per-second datacenter network.
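The scaling claim is simple multiplication, sketched below in Python. The pod size of 256 chips comes from Google; the pod counts are hypothetical examples, since the blog post says only "hundreds of pods".

```python
CHIPS_PER_POD = 256  # pod size stated by Google for Trillium


def total_chips(pods: int, chips_per_pod: int = CHIPS_PER_POD) -> int:
    """Total chips across a number of fully populated pods."""
    if pods < 0 or chips_per_pod < 0:
        raise ValueError("pods and chips_per_pod must be non-negative")
    return pods * chips_per_pod


# Hypothetical pod counts, chosen only to illustrate the order of magnitude.
for pods in (100, 200, 300):
    print(pods, "pods ->", total_chips(pods), "chips")
```

Even at the low end of "hundreds", 100 pods would already mean 25,600 chips, which is consistent with the "tens of thousands" Google cites.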

Google attributes the 4.7X gain in per-chip compute performance to larger matrix multiply units (MXUs) and a higher clock speed.

In a blog post, the company stated, “Trillium TPUs will power the next wave of AI models and agents, and we’re looking forward to helping enable our customers with these advanced capabilities.”

The new chip is expected to benefit Google’s cloud computing services and its Gemini models. Google Cloud customers such as Deep Genomics and Deloitte also stand to gain from the added performance.

The support for training and serving long-context, multimodal models on Trillium TPUs will empower Google DeepMind to train and serve future generations of Gemini models faster, more efficiently, and with lower latency than ever before. 

Trillium TPUs are integral to Google Cloud’s AI Hypercomputer, a supercomputing architecture designed specifically for cutting-edge AI workloads.

“Gemini 1.5 Pro is Google’s largest and most capable AI model, and it was trained using tens of thousands of TPU accelerators,” said Jeff Dean, chief scientist at Google DeepMind and Google Research.

“Our team is excited about the announcement of the sixth generation of TPUs, and we’re looking forward to the increase in performance and efficiency for training and inference at the scale of our Gemini models.”