Nvidia (NVDA) kicked off its GTC event in San Jose, Calif., on Monday, debuting a number of chips and platforms ranging from its all-new Nvidia Groq 3 language processing unit (LPU) to its massive Vera central processing unit (CPU) rack, designed to go head-to-head with offerings from Intel (INTC) and AMD (AMD).
All told, Nvidia said it’s rolling out five massive server racks, each serving a different purpose inside AI data centers.
The biggest announcement of the lot, though, is the Nvidia Groq 3 chip. Nvidia announced it had entered into an agreement to license technology from Groq and hired founder Jonathan Ross, president Sunny Madra, and other members of the Groq team as part of a $20 billion deal in December.
Groq’s processors focus on AI inferencing, or running AI models. It’s what happens when you type something into OpenAI’s (OPAI.PVT) ChatGPT, Anthropic’s (ANTH.PVT) Claude, or Google’s (GOOG, GOOGL) Gemini and get a response.
Nvidia’s graphics processing units (GPUs) are multipurpose and can both train and run AI models, but as the AI market moves toward running models, ensuring the company has a dedicated inferencing chip has become paramount.
That’s where Groq 3 comes in.
According to Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing, Nvidia’s GPUs support far more memory than Groq 3, but the LPU’s memory is faster. So the company is combining the performance benefits of both chips.
To do that, Nvidia is launching its Groq 3 LPX platform, a server rack powered by 128 individual Groq 3 LPUs. When it’s used together with Nvidia’s Vera Rubin NVL72 rack, the company says, customers could see 35x higher throughput per megawatt of power and 10x more revenue opportunity.
“Optimized for trillion-parameter models and million-token context, the codesigned LPX architecture pairs with Vera Rubin to maximize efficiency across power, memory and compute. The additional throughput per watt and token performance unlocks a new tier of ultra-premium, trillion-parameter, million-context inference, expanding revenue opportunity for all AI providers,” the company said in a statement.
The LPX rack should help address concerns that Nvidia could eventually lose its edge in the AI race to upstart companies designing inference-focused processors.
In addition to the LPX, Nvidia revealed its Vera CPU rack. When Nvidia talks about its Vera Rubin superchip, it’s referring to three processors in one: a Vera CPU and two Rubin GPUs.
Now the company is breaking Vera off into its own standalone chip, which it will slot into dedicated Vera server racks that combine 256 liquid-cooled Vera chips into one system.
