Monday, December 23, 2024

Groq to Feature World’s Fastest GenAI Inference Performance for Foundational LLMs at Supercomputing ’23 on Its LPU™ Systems


Groq, an artificial intelligence (AI) solutions company, announced that it will have a booth and multiple talks at SC23, the premier industry conference for high performance computing, held November 12-17 in Denver, CO. The Groq team will showcase a demo of the world's best low-latency performance for Large Language Models (LLMs) running on a Language Processing Unit™ system, its next-gen AI accelerator. Subject matter experts from Groq will present four sessions during the conference on a range of HPC, AI, and research-related topics.

Jim Miller, VP of Engineering at Groq, and former engineering leader at Qualcomm, Broadcom, and Intel, shared, “The scale and performance of systems used for AI today is enormous, and will get larger if built with legacy technology. At Groq we are setting a new standard with our LPU™-based systems that improve performance, power, and scale when serving a large customer base. This is thanks to the hard work and innovative ideas of our dedicated team of engineers at Groq who are committed to solving truly novel problems.”

The LPU™ accelerator is Groq's response to the next level of processing power required by enterprise-scale AI applications. With a clear market need for a purpose-built, software-driven processor, the Groq LPU accelerator will power LLMs for the exploding GenAI market.


Yaniv Shemesh, Head of Cloud & HPC Software Engineering at Groq, said, “Groq’s groundbreaking speed, in the form of tokens-as-a-service, was a major milestone for my organization and the company. Running your own hardware and building a large scale HPC can be hard, but Groq’s token-as-a-service ease of use and consumption-based model are very attractive to customers. Our performance is beyond fast and is opening new possibilities and innovative customer use-cases previously unimaginable given existing market solutions limitations.”

To date, the company has showcased record-breaking performance on the open source foundational LLM Llama-2 70B by Meta AI, now generating language at over 280 tokens per second per user. Groq also recently deployed Falcon, a powerful language model available for both research and commercial use that currently tops the Hugging Face Leaderboard for pre-trained open source LLMs, and Code Llama, one of the newest LLMs from Meta AI, which helps users generate code.
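For scale, the quoted 280 tokens per second per user works out to only a few milliseconds per generated token. A minimal back-of-the-envelope sketch (the reply length below is an illustrative assumption, not a figure from the announcement):

```python
# Rough arithmetic on the quoted per-user throughput figure.
tokens_per_second = 280  # Llama-2 70B rate cited in the announcement

# Average time to emit one token, in milliseconds.
ms_per_token = 1000 / tokens_per_second

# Time to stream a chat-length reply (500 tokens is a hypothetical size).
reply_tokens = 500
reply_seconds = reply_tokens / tokens_per_second

print(f"{ms_per_token:.1f} ms per token")                  # ~3.6 ms
print(f"{reply_seconds:.1f} s for {reply_tokens} tokens")  # ~1.8 s
```

At this rate, a full multi-paragraph response streams in under two seconds, which is the low-latency behavior the demo is meant to highlight.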

SOURCE: PRNewswire
