Xilinx’s Versal adaptive compute acceleration platform (ACAP) got a high-bandwidth memory (HBM) upgrade today that the company claims eliminates bottlenecks and reduces power consumption for memory-bound workloads.

“We are seeing exponential growth in network traffic demands, as well as exponential growth in the amount of data to be processed,” Mike Thompson, senior product line manager at Xilinx, said in an interview. “It's not really feasible to exponentially scale out the number of servers and data centers year over year ad infinitum.”

To meet these demands, data center operators are increasingly turning to SmartNICs or DPUs, like Xilinx's Versal series, to offload and accelerate software-defined networking (SDN), security, and storage workloads.

However, for large memory-bound applications, DDR memory subsystems are becoming a bottleneck, Thompson explained. “The amount of bandwidth that can go to the DDR memories… it's fairly limited. That’s one of the most significant bottlenecks facing those applications today.”

And that’s exactly the challenge Versal HBM attempts to solve. The platform builds on Xilinx Versal Premium FPGAs, announced early last year, but swaps out one of the chip’s super logic regions and modifies another in favor of HBM chiplets and a controller that interface directly with the die.

Versal Premium features an integrated FPGA and digital signal processor (DSP) alongside a dual-core Arm Cortex-A72 processor and a second dual-core Arm Cortex-R5F real-time processor. It supports 112 Gb/s four-level pulse-amplitude modulation (PAM4) serializer/deserializers (SerDes) enabling interface speeds up to 800 Gb/s, high-speed cryptographic accelerators optimized for 400 Gb/s line speeds, and an integrated switching chip capable of 1.5 Tb/s of throughput.

“If you look at the spec sheet of Versal HBM, it is big, it is robust, it's a very feature-rich product, but there's actually not a lot that's going to be new there by the time we start shipping them,” Thompson said.

Blowing Past Bottlenecks

Despite the similarities, the use of HBM offers several advantages, Thompson said.

“By integrating HBM2e we're able to provide up to 6.56 Tb/s of access bandwidth. That's eight times the memory bandwidth versus DDR5, the high-speed DDR5 at that, at 63% lower power,” he said, adding that in some cases it's possible to eliminate the need for slower external memory altogether.

This, he said, makes Versal HBM ideal for security appliances and firewalls, switches, and routers that require high-performance packet processing, as well as compute pre-processing and buffering workloads common in artificial intelligence training.
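
For readers who want to sanity-check the bandwidth figures Thompson quotes, the short Python sketch below converts them into more familiar units. The 6.56 Tb/s, 8x, and 63% values come from the quote; the DDR5 baseline it prints is only what those numbers imply, not a figure Xilinx stated, and the sketch assumes decimal units (1 Tb = 1,000 Gb) and that the power reduction is relative to that same DDR5 baseline.

```python
# Figures quoted in the article.
HBM_BANDWIDTH_TBPS = 6.56   # HBM2e access bandwidth, terabits per second
SPEEDUP_VS_DDR5 = 8         # "eight times the memory bandwidth versus DDR5"
POWER_REDUCTION = 0.63      # "63% lower power"

BITS_PER_BYTE = 8


def tbps_to_gbytes_per_s(tbps: float) -> float:
    """Convert terabits per second to gigabytes per second (decimal units)."""
    return tbps * 1000 / BITS_PER_BYTE


# HBM2e bandwidth expressed in gigabytes per second.
hbm_gbytes = tbps_to_gbytes_per_s(HBM_BANDWIDTH_TBPS)

# Assumption: the DDR5 baseline is simply the HBM figure divided by the
# quoted multiple. The article does not say which DDR5 configuration was used.
implied_ddr5_gbytes = hbm_gbytes / SPEEDUP_VS_DDR5

# Power drawn relative to the DDR5 baseline (1.0 means the same power).
relative_power = 1 - POWER_REDUCTION

print(f"HBM2e bandwidth:       {hbm_gbytes:.1f} GB/s")
print(f"Implied DDR5 baseline: {implied_ddr5_gbytes:.1f} GB/s")
print(f"Relative power:        {relative_power:.2f}x")
```

Under those assumptions, the quoted bandwidth works out to roughly 820 GB/s, against an implied DDR5 baseline of about 102.5 GB/s, at 0.37 times the power.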

“The common thread between all of these applications is they're significantly memory-bound and they're very compute-intensive, high-bandwidth applications,” Thompson said.

Versal HBM will be available in five SKUs with between 8 GB and 32 GB of on-package HBM2e memory. Versal Premium is available now, while the HBM variant is slated to begin sampling in the first half of 2022.