Monday, December 23, 2024

Lightning AI Announces Availability of Thunder, a Powerful Source-to-Source Compiler for PyTorch That Speeds Up Training and Serving of Generative AI Models Across Multiple GPUs, Built With Support From NVIDIA


Following its presentation at the NVIDIA GTC AI conference, Lightning AI, the company behind PyTorch Lightning (more than 100 million downloads), announced the availability of Thunder, a new source-to-source compiler for PyTorch designed to train and serve the latest generative AI models across multiple GPUs at maximum efficiency. Thunder is the culmination of two years of research on the next generation of deep learning compilers, built with support from NVIDIA.

Large model training can cost billions of dollars today because of the number of GPUs and the length of time it takes to train these models. A lack of high-performance optimization and profiling tools puts this scale of training out of reach for developers who don’t have the resources of a large technology company. Even at this early stage, Thunder achieves up to a 40% speed-up when training large language models, compared with unoptimized code in real-world scenarios. These speed-ups save weeks of training time and lower training costs proportionally.
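To make that claim concrete, here is a small back-of-the-envelope illustration. The run length, cluster size, and GPU-hour price below are hypothetical, and reading "40% speed-up" as 1.4x throughput is an assumption rather than a figure from the announcement:

    # Hypothetical numbers only: how a 40% speed-up (read as 1.4x throughput)
    # shortens a long multi-GPU training run and lowers its cost proportionally.
    baseline_weeks = 10.0    # assumed length of the unoptimized run
    gpu_count = 512          # assumed cluster size
    gpu_hour_cost = 2.00     # assumed price per GPU-hour in dollars

    speedup = 1.4
    optimized_weeks = baseline_weeks / speedup

    def gpu_hours(weeks: float) -> float:
        return weeks * 7 * 24 * gpu_count

    baseline_cost = gpu_hours(baseline_weeks) * gpu_hour_cost
    optimized_cost = gpu_hours(optimized_weeks) * gpu_hour_cost

    print(f"weeks saved: {baseline_weeks - optimized_weeks:.1f}")      # ~2.9 weeks
    print(f"cost: ${baseline_cost:,.0f} -> ${optimized_cost:,.0f}")    # ~$1.72M -> ~$1.23M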

In 2022, Lightning AI hired a group of expert PyTorch developers with the ambitious goal of creating a next-generation deep learning system for PyTorch that could take advantage of best-in-class executors and software such as torch.compile; NVIDIA’s nvFuser, Apex, and CUDA Deep Neural Network library (cuDNN); and OpenAI’s Triton. Thunder allows developers to use all of these executors at once, so that each executor handles the mathematical operations it is best designed for. Lightning AI leveraged support from NVIDIA to integrate NVIDIA’s best executors into Thunder.
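For orientation, the open-source lightning-thunder project documents wrapping a PyTorch function or module with thunder.jit and then calling the result as usual. The sketch below follows that pattern; the exact API surface, including how individual executors are selected, should be confirmed against the project’s own documentation:

    # Minimal sketch, following the thunder.jit pattern documented in the
    # lightning-thunder repository; treat the precise API and executor-selection
    # options as something to verify against the project docs, not a definitive recipe.
    import torch
    import thunder

    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096),
        torch.nn.GELU(),
        torch.nn.Linear(4096, 1024),
    )

    jit_model = thunder.jit(model)   # source-to-source compilation of the module

    x = torch.randn(8, 1024)
    y = jit_model(x)                 # executes the trace Thunder generated
    print(y.shape)                   # torch.Size([8, 1024])

The point of the executor design described above is that, once a module is compiled this way, individual operations in the resulting trace can be routed to whichever backend (nvFuser, cuDNN, Apex, torch.compile, or Triton) handles them best, without the model code being rewritten by hand.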

The Thunder team is led by Dr. Thomas Viehmann, a pioneer in the deep learning field best known for his early work on PyTorch, his key contributions to TorchScript, and his work making PyTorch run on mobile devices for the first time.


Today, Thunder is starting to be adopted by companies of all sizes to accelerate AI workloads. Thunder aims to make the highly specialized optimizations discovered by prominent programmers at organizations such as OpenAI, Meta AI, and NVIDIA more accessible to the rest of the open-source community.

“What we are seeing is that customers aren’t using available GPUs to their full capacity, and are instead throwing more GPUs at the problem,” says Luca Antiga, Lightning AI’s CTO. “Thunder, combined with Lightning Studios and its profiling tools, allows customers to effectively utilize their GPUs as they scale their models to be larger and run faster. As the AI community embarks on training increasingly capable models spanning multiple modalities, getting the best GPU performance is of paramount importance. The combination of Thunder’s code optimization and Lightning Studios’ profiling and hardware management will allow them to fully leverage their compute without having to hire their own systems software and optimization experts.”

“NVIDIA accelerates every deep learning framework,” said Christian Sarofeen, Director of Deep Learning Frameworks at NVIDIA. “Thunder facilitates model development and research. Our collaboration with Lightning AI, to integrate NVIDIA technologies into Thunder, will help the AI community improve training efficiency on NVIDIA GPUs and lead to larger and more capable AI models.”

“I couldn’t be more thrilled for Lightning AI to lead the next wave of performance optimizations to make AI more open source and accessible. I’m especially excited to partner with Thomas, one of the giants in our field, to lead the development of Thunder,” says Lightning AI CEO and founder Will Falcon. “Thomas literally wrote the book on PyTorch. At Lightning AI, he will lead the upcoming performance breakthroughs we will make available to the PyTorch and Lightning AI community.”

Thunder will be made available as open source under the Apache 2.0 license. Lightning Studios will offer first-class support for Thunder, along with native profiling tools that enable researchers and developers to easily pinpoint GPU memory and performance bottlenecks.

“Lightning Studios is quickly becoming the standard for not only new developers to build AI apps and train models but also for deep learning experts to understand their model performance in a way that is not possible on any other ML platform,” says Falcon.

SOURCE: BusinessWire
