Intel’s Habana Labs subsidiary has unveiled two new chips for artificial intelligence workloads.
The subsidiary – which Intel acquired for $2 billion back in 2019 – debuted Gaudi2, its second-generation deep learning training processor, and Greco, the successor to its Goya deep learning inference processor.
Both chips improve on their 16nm predecessors by shifting to a 7nm process – but to do so, the company is using TSMC to manufacture the chips, rather than Intel’s own foundries.
Habana did not disclose full specifications for either chip, but said that Gaudi2 has 24 tensor processor cores (up from 10), while in-package memory capacity has tripled from 32GB of HBM2 to 96GB of HBM2E, and on-board SRAM has doubled from 24MB to 48MB.
The chip has a thermal design power (TDP) of 600W, up from 350W.
Compared to Nvidia’s A100 (80GB) GPU, Intel claimed a 1.9x improvement in ResNet-50 training throughput, and a 2.8x improvement in BERT Phase-2 training throughput.
“Compared with the A100 GPU, implemented in the same process node and roughly the same die size, Gaudi2 delivers clear leadership training performance as demonstrated with apples-to-apples comparison on key workloads,” said Eitan Medina, chief operating officer at Habana Labs.
“This deep-learning acceleration architecture is fundamentally more efficient and backed with a strong roadmap.”
Intel did not provide comparisons against the upcoming Nvidia H100 GPU, set for release in the third quarter, which is expected to be significantly more powerful than the A100.
Gaudi2 is already available to Habana customers, and will be brought to the wider market in a Supermicro server in the second half of this year. Another server is planned in partnership with DDN.
Intel said a thousand Gaudi2s have been deployed to Habana’s data center in Haifa, Israel, to help with the development of the next-generation Gaudi3.
As for Greco, on-card memory has moved from DDR4 to LPDDR5, and on-chip memory has been upped from 50MB to 128MB. Its TDP has actually gone down, from 200W to 75W, enabling a shift from a dual-slot to a single-slot form factor.
The inference chip is expected in the second half of the year.