February 18, 2021

How the AI Revolution Spurred Samsung to Rethink Memory and Data Processing

With each generation, high-bandwidth memory (HBM) has incrementally improved in speed, power efficiency, and performance. But traditional HBM is no longer keeping pace with the rate of innovation in artificial intelligence (AI) and machine learning (ML). AI/ML applications not only process staggering amounts of data but are also expected to do so faster and better all the time, which requires ever more bandwidth, that most elusive of parameters.

The concept of processing-in-memory (PIM) technology has been talked about, examined, and tinkered with as a solution to bandwidth limitations for more than 30 years. However, the impetus to solve the technical challenges required to make it viable has not been strong enough. Here’s why:

The problem with PIM technology is that integrating memory and logic has always meant an unpalatable trade-off: giving up either storage density in a memory-optimized process or transistor performance in a logic-optimized process. Consequently, the performance and capability of the resulting PIM devices have never justified the technical difficulty and cost of the integration, and so the traditional von Neumann architecture (separate processor and memory units) has reigned supreme.

It has taken the explosion of AI/ML-based applications to spur investment in the development of PIM technology. AI/ML algorithms demand high rates of access to large volumes of data, and in modern systems memory bandwidth and power consumption ultimately limit what those applications can do. PIM technology is well suited to AI/ML workloads: optimized kernels minimize data movement by mapping data accesses with a high degree of spatial and temporal locality onto the parallel banks of a high-performance memory device, where they are processed concurrently. In this way, PIM addresses the typical CPU/GPU-to-memory bandwidth bottleneck, improving AI/ML application performance and capability.
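To make the data-movement argument concrete, here is a minimal Python sketch that counts the bytes crossing the memory interface for a dot product computed two ways: on the host, and as per-bank partial sums. It is a toy model, not Samsung's programming interface. The names `Bank`, `host_side_dot`, and `in_bank_dot`, the 2-byte element size, and the 4-byte partial result are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

ELEMENT_BYTES = 2  # assumption: 16-bit operands
RESULT_BYTES = 4   # assumption: each bank returns one 32-bit partial sum


@dataclass
class Bank:
    """Hypothetical memory bank holding one slice of each input vector."""
    a: List[float]
    b: List[float]


def make_banks(num_banks: int, elems_per_bank: int) -> List[Bank]:
    """Spread two vectors (all 1.0 and all 2.0) evenly across the banks."""
    return [Bank([1.0] * elems_per_bank, [2.0] * elems_per_bank)
            for _ in range(num_banks)]


def host_side_dot(banks: List[Bank]) -> Tuple[float, int]:
    """Conventional path: every operand is shipped to the processor."""
    total, bytes_moved = 0.0, 0
    for bank in banks:
        bytes_moved += (len(bank.a) + len(bank.b)) * ELEMENT_BYTES
        total += sum(x * y for x, y in zip(bank.a, bank.b))
    return total, bytes_moved


def in_bank_dot(banks: List[Bank]) -> Tuple[float, int]:
    """PIM-style path: each bank reduces its own slice locally, and only
    the partial sums cross the memory interface."""
    total, bytes_moved = 0.0, 0
    for bank in banks:
        partial = sum(x * y for x, y in zip(bank.a, bank.b))  # modeled as computed inside the bank
        bytes_moved += RESULT_BYTES
        total += partial
    return total, bytes_moved
```

With 16 banks of 1,024 elements each, both functions return the same result, but the host-side path moves 65,536 bytes while the in-bank path moves 64. Real devices differ in many ways (command overhead, numeric formats, scheduling), so the ratio illustrates the principle only and is not a performance prediction.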

Samsung’s new high-bandwidth memory with processing-in-memory (HBM-PIM) technology is the first memory of its kind to integrate high-performance, parallel data processing and DRAM on the same piece of silicon. HBM-PIM is based on the JEDEC-standard HBM2 specification, enhanced with a processing-in-memory (PIM) architecture. Building on the success of HBM2, Samsung already plans to include PIM technology in the upcoming HBM3.

HBM-PIM architecture

“Placing transistor technology for processing in the memory brings two worlds together,” says Tien Shiah, Sr. Manager, Memory Marketing, Samsung Semiconductor, Inc. “Software engineers can now write simple commands to take advantage of HBM-PIM’s 1.2TFLOPS programmable computing unit so all of their localized, repetitive workloads happen faster.”

For datacenter system architects, GPU architects, IT managers, and even tech company accountants (to name a few), HBM-PIM represents an opportunity to take a giant leap forward with relative ease. HBM-PIM delivers over twice the system performance while reducing energy consumption by more than 70%. Further, HBM-PIM does not require any hardware or software changes, allowing for seamless integration into existing systems.

Samsung HBM-PIM Performance

“I’m delighted to see that Samsung is addressing the memory bandwidth/power challenges for HPC and AI computing,” said Rick Stevens, Argonne’s Associate Laboratory Director for Computing, Environment and Life Sciences. “Samsung’s HBM-PIM design has demonstrated impressive performance and power gains on important classes of AI applications.”

AI-powered applications like natural language processing (speech recognition, translation), image classification, and recommendation engines are everywhere: in our phones, of course, but also in smart speakers, vehicles, wearables, and homes. We no longer have to imagine a future where virtually every task is AI-assisted; we only have to imagine how ‘intelligent’ our lives can become.

On the timeline of semiconductor innovation, we may have passed the point where bandwidth is the major limiting factor in AI/ML performance, at least for now. With Samsung’s latest industry-first innovation, HBM-PIM, we are officially ‘leaving the station’ and on track to meet our AI-powered future.