Saturday, November 23, 2024

TensorOpera and Aethir Team Up to Advance Massive-Scale LLM Training on Decentralized Cloud

TensorOpera, the company providing “Your Generative AI Platform at Scale,” has partnered with Aethir, a distributed cloud infrastructure provider, to accelerate its newest foundation model, TensorOpera Fox-1, highlighting the first mass-scale LLM training use case on a decentralized physical infrastructure network.

Introduced last week, TensorOpera Fox-1 is a cutting-edge open-source small language model (SLM) with 1.6 billion parameters, outperforming other models in its class from tech giants like Apple, Google, and Alibaba. This decoder-only transformer was trained from scratch on three trillion tokens using a novel three-stage curriculum. It features an innovative architecture that is 78% deeper than comparable models such as Google’s Gemma 2B and surpasses competitors in standard LLM benchmarks like GSM8K and MMLU, even with significantly fewer parameters.

The partnership with Aethir equips TensorOpera with the advanced GPU resources necessary for training Fox-1. Aethir’s collaboration with NVIDIA Cloud Partners, Infrastructure Funds, and various enterprise-grade hardware providers has established a global, large-scale GPU cloud. This network delivers cost-effective and scalable GPU resources with the high throughput, substantial memory capacity, and efficient parallel processing that LLM training requires. With the support of Aethir’s decentralized cloud infrastructure, TensorOpera gains the tools it needs to streamline AI development that demands high network bandwidth and ample GPU power.

Through this collaboration, TensorOpera is further integrating a pool of GPU resources from Aethir that can be used seamlessly via TensorOpera’s AI platform for a variety of jobs, such as model deployment and serving, fine-tuning, and full training. With Aethir’s distributed GPU cloud network, AI platforms can adjust the amount of GPU power they consume dynamically as their needs change. Together, Aethir and TensorOpera aim to empower the next generation of large language model (LLM) training and give AI developers the assets they need to create powerful models and applications.

“I am thrilled about our partnership with Aethir,” said Salman Avestimehr, Co-Founder and CEO of TensorOpera. “In the dynamic landscape of generative AI, the ability to efficiently scale up and down during various stages of model development and in-production deployment is essential. Aethir’s decentralized infrastructure offers this flexibility, combining cost-effectiveness with high-quality performance. Having experienced these benefits firsthand during the training of our Fox-1 model, we decided to deepen our collaboration by integrating Aethir’s GPU resources into TensorOpera’s AI platform to empower developers with the resources necessary for pioneering the next generation of AI technologies.”

Aethir’s operational model is based on a globally distributed network of top-shelf GPUs capable of servicing enterprise clients in the AI and machine learning industry regardless of their physical locations. To provide lag-free, highly scalable GPU power worldwide, Aethir decentralizes its GPU resources across a multitude of locations in smaller clusters. Instead of pooling resources in a few massive data centers, as traditional centralized cloud service providers do, Aethir distributes its infrastructure to cover the network’s edge and cut the physical distance between GPU resources and end users.

“TensorOpera is the premier AI platform for LLMs and generative AI applications, and we are excited to be their supplier of enterprise GPU infrastructure,” said Kyle Okamoto, CTO of Aethir.

“Aethir is firmly dedicated to supporting the AI and machine learning sector in developing and launching groundbreaking solutions that can improve the everyday lives of people around the world. TensorOpera provides developers with a comprehensive AI platform, while Aethir will provide them with a steady supply of GPU power that can handle even the most demanding LLM training and AI inference. Thanks to our vast decentralized cloud infrastructure, Aethir is capable of powering large-scale AI development and deployment worldwide,” said Daniel Wang, Aethir’s CEO.

Source: Businesswire
