Marvell shed some light during this week's Hot Chips 2020 event on its upcoming ThunderX3 server chips and teased early performance benchmarks.
The chipmaker initially announced ThunderX3 back in March. The new chip is based on the same Arm Neoverse microarchitecture used by its predecessor, but ups the core count from 32 to 60 on the single-die version and 96 on the dual-die version.
ThunderX3 is built on Taiwan Semiconductor Manufacturing Co.'s (TSMC) well-established 7-nanometer manufacturing process and retains the 4-way SMT of previous editions, giving it four threads per core, or up to 384 threads per socket on the 96-core dual-die version.
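The thread counts follow directly from the core counts and the 4-way SMT quoted above. As a rough sketch of that arithmetic (the function name and dictionary labels here are illustrative, not Marvell terminology):

```python
# Threads per socket = cores per socket x SMT threads per core.
SMT_THREADS_PER_CORE = 4  # 4-way SMT, as stated in the article

# Core counts quoted in the article; the keys are illustrative labels only.
CORES_PER_SOCKET = {"single-die": 60, "dual-die": 96}


def threads_per_socket(cores: int, smt: int = SMT_THREADS_PER_CORE) -> int:
    """Return the hardware thread count for one socket."""
    return cores * smt


if __name__ == "__main__":
    for label, cores in CORES_PER_SOCKET.items():
        print(f"{label}: {cores} cores -> {threads_per_socket(cores)} threads")
```

On these figures, the 384-thread number applies to the 96-core dual-die part, while the 60-core single-die part would expose 240 threads.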
ThunderX3 is also the first update to the ThunderX line since Marvell's acquisition of Cavium in 2018. And the chip has a lot to live up to.
"[ThunderX2] was the first Arm-based system in the Supercomputing Top 500 list. It's also the first non-x86 CPU in Microsoft Azure. It is the most widely deployed Arm server processor currently," said Rabin Sugumar, lead architect for ThunderX3 at Marvell, in a presentation.
According to Sugumar, early benchmarks show ThunderX3 performing 30% better in single-threaded workloads and up to 300% better in multi-threaded workloads compared to ThunderX2. Approximately 30% of the single-threaded performance increase was achieved through architectural improvements to the chip, while the remainder was achieved through higher clock speeds, explained Sugumar.
Marvell expects performance to improve as it fine-tunes the design leading up to its launch later this year.
ThunderX3 starts to shine in multi-threaded workloads with relatively low instructions per clock, like MySQL, explained Sugumar. Here, the chip's four threads per core led to dramatic performance gains. Some of these gains come from the larger 64-kilobyte instruction cache and the 90-megabyte L3 cache.
"We found that data center codes miss a lot in the instruction cache and in some cases even in the L2 cache," Sugumar explained, adding that the larger cache "helped performance significantly in data center codes."
Stronger Arm Competition
When it launches, ThunderX3 will face a very different market and much stiffer competition than its predecessor.
Growing demand for higher compute density and higher performance per watt has driven many cloud providers, including Amazon, Oracle, and Microsoft, to invest in Arm CPUs. And since ThunderX2 was released in 2018, several companies, including Ampere and Amazon, have brought Arm-based server chips to market.
In June, Ampere announced a 128-core variant of its Altra processor called Altra Max. However, the two chips, while both based on Arm's Neoverse architecture, take different approaches to multi-threading. While Marvell's chip has four threads per core, Ampere's is single-threaded. Ampere claims that this enables it to provide perfectly linear performance across every core at the cost of fewer threads per die.
ThunderX3 is expected to launch later this year.