Thursday, September 18, 2025

Nvidia reveals new A.I. chip, says costs of running LLMs will ‘drop significantly’


Nvidia on Tuesday announced a new chip designed to run artificial intelligence models, as it seeks to fend off competitors in the AI hardware space, including AMD, Google and Amazon.

Currently, Nvidia dominates the market for AI chips with over 80% market share, according to some estimates. The company’s specialty is graphics processing units, or GPUs, which have become the preferred chips for the large AI models that underpin generative AI software, such as Google’s Bard and OpenAI’s ChatGPT. But Nvidia’s chips are in short supply as tech giants, cloud providers and startups vie for GPU capacity to develop their own AI models.

Nvidia’s new chip, the GH200, has the same GPU as the company’s current highest-end AI chip, the H100. But the GH200 pairs that GPU with 141 gigabytes of cutting-edge memory, as well as a 72-core ARM central processor.

“We’re giving this processor a boost,” Nvidia CEO Jensen Huang said in a talk at a conference on Tuesday. He added, “This processor is designed for the scale-out of the world’s data centers.”


The new chip will be available from Nvidia’s distributors in the second quarter of next year, Huang said, and should be available for sampling by the end of the year. Nvidia representatives declined to give a price.

Oftentimes, the process of working with AI models is split into at least two parts: training and inference.

First, a model is trained using large amounts of data, a process that can take months and sometimes requires thousands of GPUs, such as Nvidia’s H100 and A100 chips. Then the model is used in software to make predictions or generate content, in a process called inference. Like training, inference is computationally expensive, and it requires a lot of processing power every time the software runs, such as when it generates text or an image. But unlike training, which is only needed when the model requires updating, inference takes place near-constantly once the model is deployed.
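To make that split concrete, the short sketch below runs a training loop and then an inference pass on a tiny PyTorch model. The architecture, data and sizes are purely illustrative stand-ins, not anything from Nvidia’s announcement.

```python
# Minimal sketch of the training/inference split described above.
# A tiny PyTorch model stands in for a large language model; the
# architecture, data and sizes are illustrative, not Nvidia's workloads.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training: repeated forward and backward passes over large datasets.
# For real LLMs this loop runs for months across thousands of GPUs.
model.train()
for step in range(100):
    inputs = torch.randn(32, 128)          # stand-in for a batch of data
    targets = torch.randint(0, 10, (32,))  # stand-in labels
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()     # compute gradients
    optimizer.step()    # update the weights

# Inference: forward passes only, no gradient computation. This is what
# runs every time deployed software answers a request, which is why it
# happens near-constantly once a model is in production.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 128)).argmax(dim=1)
    print(prediction.item())
```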

“You can take pretty much any large language model you want and put it in this and it will inference like crazy,” Huang said. “The inference cost of large language models will drop significantly.”

Nvidia’s new GH200 is designed for inference since it has more memory capacity, allowing larger AI models to fit on a single system, Nvidia VP Ian Buck said on a call with analysts and reporters on Tuesday. Nvidia’s H100 has 80GB of memory, versus 141GB on the new GH200. Nvidia also announced a system that combines two GH200 chips into a single computer for even larger models.

“Having larger memory allows the model to remain resident on a single GPU and not have to require multiple systems or multiple GPUs in order to run,” Buck said.
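As a rough illustration of why that capacity matters, the back-of-the-envelope sketch below estimates how much memory a model’s weights alone occupy. The parameter counts and 16-bit precision are assumptions chosen for the example, not figures from Nvidia’s briefing.

```python
# Back-of-the-envelope estimate of the memory needed just to hold a
# model's weights. The parameter counts and 16-bit (2-byte) precision
# are illustrative assumptions, not figures from Nvidia's announcement.
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory, in gigabytes, for the model weights alone."""
    return num_params * bytes_per_param / 1e9

for name, params in [("7B-parameter model", 7e9),
                     ("70B-parameter model", 70e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights "
          f"(H100: 80 GB, GH200: 141 GB)")

# At 16-bit precision, a 70B-parameter model needs roughly 140 GB for its
# weights alone, before counting activations or the key/value cache, so it
# must be split across multiple 80 GB GPUs but comes close to fitting in
# the GH200's 141 GB, which is the single-GPU residency Buck describes.
```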

SOURCE: CNBC
