Foundry Cloud Platform democratizes AI development and solves the GPU shortage with the first real-time compute market purpose-built for AI
Foundry, an emerging cloud provider founded by alumni from Google DeepMind’s core Deep Learning team, launched Foundry Cloud Platform, a real-time market and orchestration engine for GPU compute that simplifies access to the infrastructure required to build and deploy AI. Foundry’s platform reduces operational complexity and improves compute cost efficiency by up to 6x, putting AI development within reach of more organizations and accelerating global AI innovation.
The AI boom has made GPU servers one of the world’s most strategic commodities. Surging demand has outpaced the capacity of traditional public clouds, prompting tech giants and AI startups alike to spend billions to independently secure essential hardware for AI development. Industry-standard long-term contracts and unreliable infrastructure drive buyers to overprovision for guaranteed capacity and redundancy, further restricting broader access. This arms race for GPU compute ownership has masked a critical issue and opportunity: existing GPUs are vastly underutilized due to the unique compute requirements of AI development.
“The GPU compute market as it exists is one of the most inefficient commodity markets in history, and it’s directly limiting critical AI innovations that will benefit society,” explains Jared Quincy Davis, founder and CEO of Foundry. “The majority of AI research and development teams struggle to access affordable and reliable compute for their workloads, while exceptionally well-funded organizations are forced to purchase long-term GPU reservations that they rarely utilize to maximum capacity. Foundry Cloud Platform addresses this market failure by aggregating and redistributing idle compute capacity to enable faster breakthroughs while improving return on GPU investments.”
Foundry Cloud Platform offers AI teams of every scale a more efficient and approachable way to access GPU compute, optimizing performance, cost efficiency, and reliability.
Foundry Cloud Platform aggregates compute into a single, dynamically priced pool that offers GPU capacity in two ways, each optimized for the distinct needs of different AI workloads:
- Resellable reserved instances. AI teams have self-serve access to reserve short-term capacity from Foundry’s pool of GPU virtual machines. Rather than pay for fixed, long-term contracts, customers can guarantee compute for predictable workloads by reserving interconnected clusters from the pool for as little as three hours. Customers can further increase cost efficiency by reselling any idle capacity from their reservations. For example, if a customer reserves 128 NVIDIA H100s and sets aside 16 as “healing buffer” nodes, they can temporarily relist those 16 nodes on the market, where they generate credits until the customer recalls them or the initial reservation period ends (see the cost sketch after this list). Reserved usage is optimal for pre-planned workloads like training runs and critical day-to-day developer tasks like verification and debugging.
- Spot instances. All unreserved and relisted compute on the platform is available as spot instances that users can bid on for interrupt-tolerant workloads like model inference, hyperparameter tuning, and fine-tuning.
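To make the 128-GPU healing-buffer example concrete, here is a minimal arithmetic sketch of how relisting idle reserved nodes can lower net spend. The hourly rates, credit rate, and utilization figure are assumptions for illustration only, not Foundry’s published pricing or API.

```python
# Hypothetical cost sketch for the 128-GPU example above: a team reserves
# 128 H100s, holds 16 as a healing buffer, and relists those 16 on the spot
# market while they sit idle. All rates below are assumed for illustration,
# not Foundry's published pricing.

RESERVED_RATE = 2.50   # assumed $/GPU-hour for reserved capacity
RELIST_CREDIT = 1.20   # assumed $/GPU-hour credited when a relisted node is bought

def effective_hourly_cost(reserved_gpus: int, relisted_gpus: int,
                          relist_utilization: float) -> float:
    """Net hourly spend after credits from relisted healing-buffer nodes.

    relist_utilization is the fraction of the hour in which relisted nodes
    actually find a spot buyer.
    """
    gross = reserved_gpus * RESERVED_RATE
    credits = relisted_gpus * RELIST_CREDIT * relist_utilization
    return gross - credits

if __name__ == "__main__":
    baseline = effective_hourly_cost(128, 0, 0.0)        # no relisting
    with_relist = effective_hourly_cost(128, 16, 0.75)   # buffer resold 75% of the time
    print(f"Without relisting: ${baseline:,.2f}/hour")
    print(f"With relisting:    ${with_relist:,.2f}/hour")
```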
Foundry Cloud Platform uses auction theory to set market-driven prices for reserved and spot compute based on real-time supply and demand. Whenever prices become too high, Foundry increases the overall GPU capacity of the platform, stabilizing the market.
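The release does not disclose Foundry’s actual auction design. As one standard point of reference, the sketch below shows a uniform-price clearing auction, in which the lowest accepted bid sets a single spot price for all winners; the bidder names, quantities, and prices are hypothetical.

```python
# Illustrative uniform-price auction: not Foundry's actual mechanism, just a
# standard way a spot price can emerge from bids and available GPU supply.
from typing import List, Tuple

def clear_spot_market(bids: List[Tuple[str, int, float]],
                      supply: int) -> Tuple[float, List[Tuple[str, int]]]:
    """Allocate `supply` GPUs to the highest bids.

    bids: (bidder, gpus_requested, price_per_gpu_hour)
    Returns the clearing price (lowest accepted bid) and the allocations.
    """
    allocations = []
    clearing_price = 0.0
    for bidder, gpus, price in sorted(bids, key=lambda b: b[2], reverse=True):
        if supply == 0:
            break
        granted = min(gpus, supply)
        allocations.append((bidder, granted))
        clearing_price = price  # last (lowest) accepted bid sets the uniform price
        supply -= granted
    return clearing_price, allocations

price, allocs = clear_spot_market(
    bids=[("team-a", 64, 1.80), ("team-b", 32, 1.40), ("team-c", 64, 0.90)],
    supply=96,
)
print(f"Clearing price: ${price:.2f}/GPU-hour, allocations: {allocs}")
```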
The platform also offers Kubernetes workload orchestration, which eliminates manual scheduling by programmatically adding reserved and spot instances to a managed Kubernetes cluster. Leveraging Kubernetes clusters through Foundry Cloud Platform allows AI development teams to optimize price-performance and minimize inference latency during traffic spikes by quickly scaling capacity horizontally.
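Foundry’s orchestration internals are not described in the release, but the horizontal scaling it refers to can be illustrated with the standard Kubernetes Horizontal Pod Autoscaler rule, desired = ceil(current × metric / target). The function name, metric, and numbers below are illustrative, not Foundry code.

```python
# Minimal sketch of the standard Kubernetes HPA scaling rule:
# desired_replicas = ceil(current_replicas * current_metric / target_metric).
# Names and numbers are illustrative; this is not Foundry's orchestration code.
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, max_replicas: int) -> int:
    """Scale inference replicas toward the target per-replica metric
    (e.g. requests per second), capped at max_replicas."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(1, min(desired, max_replicas))

# A traffic spike doubles per-replica load (200 RPS vs. a 100 RPS target),
# so the cluster scales from 8 to 16 replicas, drawing on spot capacity.
print(desired_replicas(current_replicas=8, current_metric=200,
                       target_metric=100, max_replicas=64))
```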
Infinite Monkey, an AI startup developing architectures for AGI, uses Foundry Cloud Platform to access a variety of state-of-the-art GPUs without overprovisioning. “With Foundry Cloud Platform, we made actionable discoveries in hours, not weeks,” says Matt Wheeler, Research Engineer at Infinite Monkey. “When we believe we could benefit from additional compute, we just turn it on. When we need to pause to study our results and design the next experiment, we turn it off. Because we aren’t locked into a long-term contract, we have the flexibility to experiment with a variety of GPUs and empirically determine how to get the best price-performance for our workload.”
“Foundry Cloud Platform has accelerated science at Arc,” notes Patrick Hsu, Co-Founder and Core Investigator at Arc Institute – a nonprofit research organization studying complex diseases, including cancer, neurodegeneration, and immune dysfunction. “Our machine learning work brings demanding performance infrastructure needs, and Foundry delivers. With Foundry, we can guarantee that our researchers have exactly the compute they need, when they need it, without procurement friction.”
Source: PRNewswire