H2O.ai, the open-source leader in generative AI and machine learning, announced a collaboration with the AI Verify Foundation to support the safe deployment of AI. Through the collaboration, H2O has launched an initiative with the foundation to give clients the ability to test and govern their AI systems on H2O's platform and to contribute back to the global open-source community.
Under the agreement, H2O will contribute benchmarks and code to AI Verify's open-source Project Moonshot toolkit for Large Language Model (LLM) application testing, and will support tests recommended by AI Verify on its Machine Learning (ML) and LLM Ops platform.
At H2O.ai, we believe that responsible AI starts with ensuring transparency, accountability, and governance throughout the lifecycle of AI systems, from development to deployment. We have developed robust tools like H2O EvalStudio for customers to systematically assess the performance, security, fairness, and overall effectiveness of LLMs and Retrieval-Augmented Generation (RAG) systems. Our collaboration with the AI Verify Foundation will further strengthen these efforts, helping organizations meet both internal governance standards and external regulatory requirements.
“H2O has been committed to the open-source community since our founding, and we believe every organization should have a strategy to safely test AI,” said Sri Ambati, CEO and co-founder of H2O.ai. “Working with AI Verify clearly aligns with our values, and we look forward to continuing to lead the charge for responsible AI adoption.”
“We believe that appropriate tools and approaches to AI testing are critical to enabling the adoption of AI for society, businesses, and citizens,” said Shameek Kundu, executive director at the AI Verify Foundation. “We are very pleased to have H2O, an active member of the foundation, as a partner in this journey.”
H2O’s contribution to AI Verify’s Project Moonshot supports one of the world’s first LLM evaluation toolkits, which integrates benchmarking, red teaming, and testing baselines. The toolkit helps developers, compliance teams, and AI system owners manage LLM deployment risks by providing a straightforward way to evaluate their applications’ performance both before and after deployment.
SOURCE: Businesswire