New computing scheme could enhance machine learning, bring breakthroughs in AI: PKU study

Sep 21, 2024

Peking University, September 21, 2024: Artificial Intelligence (AI) models like ChatGPT run on algorithms and have great appetites for data, which they process through machine learning. But have you wondered about the limits of their data-processing abilities?

Researchers from Peking University’s School of Integrated Circuits and Institute for Artificial Intelligence set out to solve the von Neumann bottleneck that limits data processing, in a paper published in the science journal Device on September 12, 2024. Led by Professor Sun Zhong, the research team developed a new computing scheme, known as the dual-IMC (in-memory computing) scheme, which not only accelerates the machine learning process but also improves the energy efficiency of traditional data operations.

Dual in-memory computing enables fully in-memory MVM operations

When developing algorithms, software engineers and computer scientists rely on a data operation known as matrix-vector multiplication (MVM), which underpins neural networks. A neural network is a computing architecture, often found in AI models, that mimics the function and structure of the human brain.
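To make the operation concrete, here is a minimal sketch of an MVM in plain Python. The function name `mvm` and the example numbers are illustrative assumptions, not taken from the paper. Each output value is the weighted sum of the inputs along one row of the weight matrix, which is the computation a single neural network layer performs before its activation function.

```python
def mvm(weights, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    if any(len(row) != len(vector) for row in weights):
        raise ValueError("each matrix row must match the vector length")
    # One output per row: the dot product of that row with the input vector.
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


# Hypothetical 2x3 weight matrix and 3-element input, chosen for illustration.
weights = [[1.0, 0.0, -1.0],
           [0.5, 2.0, 1.0]]
inputs = [2.0, 1.0, 4.0]

outputs = mvm(weights, inputs)
print(outputs)  # [-2.0, 7.0]
```

On a conventional processor, every weight and input in this loop has to be fetched from memory before it can be multiplied, which is where data movement starts to dominate.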

As datasets grow rapidly in scale, computing performance is often limited by data movement and by the mismatch between the speed of processing data and the speed of transferring it. This is known as the von Neumann bottleneck. The conventional solution is a single in-memory computing (single-IMC) scheme, in which neural network weights are stored in the memory chip while the input (such as an image) is provided externally.

Conventional single-IMC limits computing performance in MVM operations

However, the single-IMC scheme has drawbacks: it requires switching between on-chip and off-chip data transport, and it relies on digital-to-analog converters (DACs), which take up a large circuit footprint and consume a lot of power.
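The role of the DACs can be illustrated with a simplified model of a single-IMC array. This is a textbook-style idealization assumed here for illustration, not the circuit described in the paper: weights are modeled as stored conductances, each digital input is converted to a voltage by an ideal DAC, and each column current is the sum of conductance times voltage (Ohm's law and Kirchhoff's current law). The names `dac` and `crossbar_mvm`, the 4-bit resolution, and all values are hypothetical.

```python
def dac(code, bits=4, v_ref=1.0):
    """Ideal DAC: map an integer code to a voltage between 0 and v_ref."""
    max_code = (1 << bits) - 1
    if not 0 <= code <= max_code:
        raise ValueError("code out of range for this DAC resolution")
    return v_ref * code / max_code


def crossbar_mvm(conductances, voltages):
    """Column currents of an ideal array: I[j] = sum over i of G[i][j] * V[i]."""
    n_cols = len(conductances[0])
    return [
        sum(row[j] * v for row, v in zip(conductances, voltages))
        for j in range(n_cols)
    ]


# Hypothetical stored weights as conductances in siemens (2 rows, 2 columns).
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]

# Digital inputs arrive from outside the array and need one conversion each.
codes = [15, 0]
V = [dac(c) for c in codes]

I = crossbar_mvm(G, V)  # analog column currents, in amperes
```

In this model one conversion is needed per input row for every MVM, which gives a sense of why removing the DACs matters for chip area and power.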

To fully realize the potential of the IMC principle, the team developed a dual-IMC scheme that stores both the weight and input of a neural network in the memory array, thus performing data operations in a fully in-memory manner.

New dual-IMC scheme accelerates MVM operations

The team then tested the dual-IMC on resistive random-access memory (RRAM) devices for signal recovery and image processing. These are some benefits of the dual-IMC scheme when applied to MVM operations:

(1) Greater efficiency is achieved through fully in-memory computation, which saves the time and energy otherwise spent accessing off-chip dynamic random-access memory (DRAM) and on-chip static random-access memory (SRAM).

(2) Computing performance is optimized because data movement, previously a limiting factor, is eliminated by keeping the computation fully in memory.

(3) Production cost is lowered by eliminating the DACs required in the single-IMC scheme. This also reduces chip area, computing latency and power requirements.

With demand for data processing growing rapidly in today’s digital era, the discoveries made in this PKU research could bring about new breakthroughs in computing architecture and artificial intelligence.

Source: School of Integrated Circuits and Institute for Artificial Intelligence
Edited by: Wu Jiayun
Photos by: Research team, China Daily (cover photo)
