Galileo Introduces First-of-its-Kind Evaluation Foundation Models to Transform Enterprise GenAI Evaluations

AIT365 News Desk

1 year ago

Luna® provides high-accuracy, low-latency results at nearly no cost to help enterprises bring trustworthy AI to production

Galileo, a leader in developing generative AI for the enterprise, announced the release of Galileo Luna®, a first-of-its-kind suite of Evaluation Foundation Models (EFMs) designed to transform how generative AI evaluations are conducted. This novel approach is faster, more cost-effective, and more accurate than existing evaluation methods such as askGPT and human “vibe checks.” With Galileo Luna®, enterprises can finally bring trustworthy AI solutions to market faster and at production scale.

“For genAI to achieve mass adoption, it’s crucial that enterprises can evaluate hundreds of thousands of AI responses for hallucinations, toxicity, security risk, and more, in real time,” said Vikram Chatterji, Co-Founder and CEO of Galileo. “In speaking with customers, we found that existing approaches, such as human evaluation or LLM-based evaluation, were too expensive and slow, so we set out to solve that. With Galileo Luna®, we’re setting new benchmarks for speed, accuracy, and cost efficiency. Luna® can evaluate millions of responses per month 97% cheaper, 11x faster, and 18% more accurately than evaluating using OpenAI GPT-3.5.”

Luna®: Breakthroughs in AI Evaluation Technology
Core to Luna’s® innovation is the creation of EFMs, which are the first models purpose-built for generative AI evaluation. Each of these models has been fine-tuned to solve a specific evaluation task, such as detecting hallucinations, context quality, data leakage, and malicious prompts. By creating smaller, purpose-built EFMs, Luna® is able to conduct evaluations with never-before-seen accuracy, speed, and cost-efficiency.

Key Innovations and Features of Galileo Luna®:

Evaluation Accuracy: Exceeding all existing evaluation models, including Galileo’s own ChainPoll, Luna’s® EFMs lead the industry in detecting hallucinations, prompt injections, PII, and more, outperforming previous methods by up to 20%.
Ultra Low-Cost Operations: Proven to be 30x cheaper than conventional methods, such as OpenAI’s GPT-3.5.
Millisecond Speed: Evaluations are completed in milliseconds, which is essential for real-time applications like chatbots and AI monitoring systems.
No Ground Truth Required: Unlike other evaluation methods that depend on extensive and costly test sets, Luna® eliminates the need for ground truth data, facilitating faster deployment and scalability.
Unmatched Customizability: Each Luna® model can be quickly fine-tuned to meet specific customer needs, providing tailored solutions that achieve over 95% accuracy in critical applications.

“Evaluations are absolutely essential to delivering safe, reliable, production-grade AI products,” said Alex Klug, Head of Product, Data Science & AI at HP. “Until now, existing evaluation methods, such as human evaluations or using LLMs as a judge, have been very costly and slow. With Luna®, Galileo is overcoming enterprise teams’ biggest evaluation hurdles – cost, latency, and accuracy. This is a game changer for the industry.”

Powering Production Generative AI Applications
With its release, Luna® is already integrated into all Galileo platforms, including the new Galileo Protect® and Evaluate®. These tools use Luna’s® capabilities to intercept harmful inputs, improve system security, and enhance operational efficiency. Teams from Fortune 50 CPG brands to Fortune 10 U.S. banks are already using Luna® to handle millions of GenAI application queries per month, safeguard against malicious prompt injections, and reduce costs associated with GenAI operations.

SOURCE: PRNewswire