Tuesday, April 9, 2024

IBM’s NorthPole chip promises faster and more energy-efficient AI

Since the birth of the semiconductor industry, computer chips have largely followed the same basic structure, in which the processing units and the memory holding the data to be processed are kept separate. While this structure has allowed for simpler designs that have scaled well over the decades, it has created what is called the von Neumann bottleneck, where it takes time and energy to continually shuffle data back and forth between memory, processing, and any other devices within a chip.

A team of computer scientists and engineers at IBM Research aims to change this, taking inspiration from how the brain computes. The team is developing a brain-inspired computer chip called NorthPole that could supercharge artificial intelligence (AI) by working faster with much less power.

Their massive NorthPole processor chip eliminates the need to frequently access external memory, allowing it to perform tasks such as image recognition faster than existing architectures while consuming vastly less power.

The prototype NorthPole chip was fabricated with a 12-nm process and contains 22 billion transistors in 800 square millimeters. It has 256 cores and can perform 2,048 operations per core per cycle at 8-bit precision, with the potential to double that number at 4-bit precision and quadruple it at 2-bit precision.
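To put those per-core figures in perspective, the short Python sketch below multiplies them out to a chip-wide peak. It is only an illustration of the arithmetic: the function and constant names are made up for this example, and it assumes every core is fully busy on every cycle. The article does not state a clock frequency, so the result is left in operations per cycle rather than per second.

```python
# Figures quoted in the article for the NorthPole prototype.
CORES = 256
OPS_PER_CORE_PER_CYCLE_8BIT = 2048


def peak_ops_per_cycle(precision_bits: int) -> int:
    """Chip-wide peak operations per cycle (hypothetical helper).

    Assumes throughput doubles each time precision is halved,
    as the article describes for 4-bit and 2-bit operation.
    """
    if precision_bits not in (8, 4, 2):
        raise ValueError("precision must be 8, 4, or 2 bits")
    scale = 8 // precision_bits  # 1x at 8-bit, 2x at 4-bit, 4x at 2-bit
    return CORES * OPS_PER_CORE_PER_CYCLE_8BIT * scale


for bits in (8, 4, 2):
    print(f"{bits}-bit: {peak_ops_per_cycle(bits):,} operations per cycle")
```

At 8-bit precision this works out to 524,288 operations per cycle across the whole chip, rising to just over one million at 4-bit and just over two million at 2-bit.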

According to IBM, NorthPole is 25 times more energy efficient and up to 22 times faster than the existing chips it was compared against. The device also doesn’t need bulky liquid-cooling systems to run – fans and heat sinks are more than enough – meaning that it could be deployed in some rather small spaces.

While research into the NorthPole chip is still ongoing, its structure lends itself to emerging AI use cases as well as better-established ones. In testing, the NorthPole team focused primarily on computer vision-related uses. The main applications under consideration include image segmentation and video classification, natural language processing, and speech recognition.

However, even NorthPole’s 224 megabytes of RAM are not enough for large language models, such as those used by the chatbot ChatGPT, which take up several thousand megabytes of data. Also, the chip can run only pre-programmed neural networks that need to be ‘trained’ in advance on a separate machine.
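The memory limit is easy to see with a rough estimate. The Python sketch below compares the size of a model's weights with the 224 megabytes quoted above. The helper names and the two parameter counts are hypothetical and chosen only for illustration. The sketch also assumes that one megabyte is one million bytes and that only the weights need to fit, ignoring activations and any other working data.

```python
CHIP_MEMORY_MB = 224  # memory figure quoted in the article


def weights_size_mb(num_parameters: int, precision_bits: int) -> float:
    """Approximate size of a model's weights in megabytes (10**6 bytes)."""
    return num_parameters * precision_bits / 8 / 1_000_000


def fits_on_chip(num_parameters: int, precision_bits: int) -> bool:
    """True if the weights alone would fit in the chip's memory."""
    return weights_size_mb(num_parameters, precision_bits) <= CHIP_MEMORY_MB


# Illustrative parameter counts, not taken from the article.
examples = {
    "vision model, 25 million parameters": 25_000_000,
    "language model, 7 billion parameters": 7_000_000_000,
}

for name, params in examples.items():
    for bits in (8, 4, 2):
        size = weights_size_mb(params, bits)
        verdict = "fits" if fits_on_chip(params, bits) else "does not fit"
        print(f"{name} at {bits}-bit: {size:,.1f} MB, {verdict}")
```

Under these assumptions the vision-sized model fits comfortably at every precision, while the language model needs thousands of megabytes even at 2-bit precision, which matches the limitation described above.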

Still, the researchers say that the NorthPole architecture could be useful in speed-critical applications, such as autonomous vehicles, robotics, digital assistants, and spatial computing. They also say NorthPole could enable satellites that monitor agriculture and help manage wildlife populations, systems that track vehicles and freight for safer and less congested roads, robots that operate safely, and tools that detect cyber threats to businesses.