Sunday, November 17, 2024

FriendliAI Introduces Friendli Dedicated Endpoints, A Managed Service Version of Friendli Container to Increase Accessibility


FriendliAI, a frontrunner in inference serving for generative AI, is thrilled to announce Friendli Dedicated Endpoints, which offers the capabilities of Friendli Container as a managed service. This latest addition to the Friendli Suite eliminates the complexities of containerization and development, providing customers with automated, cost-effective, and high-performance custom model serving.

Friendli Dedicated Endpoints is the managed cloud service alternative to Friendli Container. Friendli Container, currently adopted by startups and enterprises alike to deploy Large Language Models (LLMs) at scale within private environments, delivers significant reductions in GPU costs through the highly GPU-optimized Friendli Engine, which also powers Friendli Dedicated Endpoints.

In addition to leveraging the Friendli Engine, Friendli Dedicated Endpoints streamlines the process of building and serving LLMs through automation, making it more cost- and time-efficient. Friendli Dedicated Endpoints handles managing and operating generative AI deployments, from custom model fine-tuning to procuring cloud resources to automatic monitoring of deployments. For instance, users can fine-tune and deploy a quantized Llama 2 or Mixtral model using the powerful Friendli Engine in just a few clicks, bringing cutting-edge GPU-optimized serving to users of all technical backgrounds.
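Once a custom model is deployed to a dedicated endpoint, applications typically reach it over HTTP. The sketch below shows what assembling such a request might look like; the base URL, route, and payload fields here are hypothetical placeholders, not FriendliAI's documented API — consult the Friendli Dedicated Endpoints documentation for the actual interface.

```python
import json

# Hypothetical base URL -- illustrative only, not the documented API.
API_BASE = "https://api.example-endpoint.ai/dedicated"


def build_completion_request(endpoint_id: str, prompt: str,
                             max_tokens: int = 256,
                             api_key: str = "YOUR_API_KEY"):
    """Assemble the URL, headers, and JSON body for a text-completion
    call against a deployed custom-model endpoint (field names assumed)."""
    url = f"{API_BASE}/v1/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": endpoint_id,     # the dedicated endpoint serving the model
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return url, headers, json.dumps(body)


# Build (but do not send) a request for a hypothetical endpoint:
url, headers, body = build_completion_request("my-llama-2-endpoint", "Hello!")
print(url)
```

The request could then be sent with any HTTP client (e.g. `requests.post(url, headers=headers, data=body)`).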

Byung-Gon Chun, CEO of FriendliAI, highlighted the company's goal of democratizing generative AI, emphasizing its role in driving innovation and organizational productivity.


“With Friendli Dedicated Endpoints, we’re eliminating the hassle of infrastructure management so that customers can unlock the full potential of generative AI with the power of Friendli Engine. Whether it’s text generation, image creation, or beyond, our service opens the doors to endless possibilities for users of all backgrounds.”

Key features of Friendli Dedicated Endpoints:

  • Dedicated GPU Instances: Users can reserve entire GPUs for serving their custom generative AI models, ensuring consistent and reliable access to high-performance GPU resources.
  • Custom Model Support: Users can upload, fine-tune, and deploy models, enabling tailored solutions for diverse AI applications.
  • Superior Performance and Efficiency: A single GPU with the optimized Friendli Engine delivers results equivalent to up to seven GPUs with vLLM. Friendli Engine saves 50% to 90% on GPU costs and boasts up to 10x faster query response times.
  • Intelligent Operation: Friendli Dedicated Endpoints seamlessly adapts to fluctuating workloads and failures with automated failure management and auto-scaling that adjusts resource allocation based on traffic patterns, ensuring uninterrupted operations and resource efficiency during peak demand periods.
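The traffic-based auto-scaling described above can be sketched as a simple threshold policy. This is a toy illustration of the general idea, not FriendliAI's actual scaling algorithm; all parameter names and defaults are assumptions.

```python
import math


def scale_decision(current_replicas: int, requests_per_replica: float,
                   target_load: float = 10.0,
                   min_replicas: int = 1, max_replicas: int = 8) -> int:
    """Toy scaling rule: size the replica pool so per-replica load
    stays near a target, clamped to a configured range."""
    total_load = current_replicas * requests_per_replica
    desired = math.ceil(total_load / target_load)
    return max(min_replicas, min(max_replicas, desired))


# Load of 50 req/s spread over 2 replicas, targeting 10 req/s each:
print(scale_decision(2, 25.0))  # -> 5
```

Production schedulers layer in cooldown windows, health checks, and failure replacement on top of a rule like this, which is what "automated failure management" in the feature list refers to.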

By eliminating technical barriers and optimizing GPU usage, FriendliAI hopes that infrastructure constraints will no longer hinder innovation in generative AI.

Chun says, “We’re thrilled to welcome new users on our journey to make generative AI models fast and affordable.”

SOURCE: PRNewswire
