Tuesday, November 5, 2024

Scale AI Partners with DoD’s Chief Digital and Artificial Intelligence Office to Test and Evaluate Large Language Models

Scale AI, the leading test and evaluation (T&E) partner for frontier artificial intelligence companies, is partnering with the U.S. Department of Defense’s (DoD) Chief Digital and Artificial Intelligence Office (CDAO) to create a comprehensive T&E framework for the responsible use of large language models (LLMs) within the DoD.

Through this partnership, Scale will develop benchmark tests tailored to DoD use cases, integrate them into Scale’s T&E platform, and support CDAO’s T&E strategy for using LLMs. The outcomes will provide the CDAO with a framework to deploy AI safely by measuring model performance, offering real-time feedback for warfighters, and creating specialized public sector evaluation sets to test AI models for military support applications, such as organizing the findings from after-action reports.

This work will enable the DoD to mature its T&E policies to address generative AI by combining quantitative measurements from benchmarking with qualitative feedback from users. The evaluation metrics will help identify generative AI models that are ready to support military applications with accurate and relevant results using DoD terminology and knowledge bases.

The rigorous T&E process aims to enhance the robustness and resilience of AI systems, enabling the adoption of LLM technology in classified and other secure environments.

Alexandr Wang, founder and CEO of Scale AI, emphasized Scale’s commitment to protecting the integrity of future AI applications for defense and solidifying the U.S.’s global leadership in the adoption of safe, secure, and trustworthy AI. “Testing and evaluating generative AI will help the DoD understand the strengths and limitations of the technology, so it can be deployed responsibly. Scale is honored to partner with the DoD on this framework,” said Wang.

For decades, T&E has been standard in product development across industries, ensuring products meet safety requirements for market readiness, but AI safety standards have yet to be codified. Scale’s methodology, published last summer, is the industry’s first comprehensive technical methodology for LLM T&E. Its adoption by the DoD reflects Scale’s commitment to understanding the opportunities and limitations of LLMs, mitigating risks, and meeting the unique needs of the military.

SOURCE: BusinessWire
