Orby AI (Orby), a technology trailblazer in generative AI solutions for the enterprise, has unveiled ActIO, which the company calls the most capable large action model (LAM) foundation engine yet, with state-of-the-art (SOTA) performance on the Large Action Model Benchmark.
The company also announced that it has teamed with Ohio State University’s Natural Language Processing (NLP) group to develop advanced AI techniques such as visual grounding, the ability of an AI agent to connect what it sees in an image with what it understands through language. OSU and Orby have co-authored and published an extensive research paper on the innovation, titled “Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents.” OSU refers to this new technique as UGround, and it is now native to Orby’s ActIO foundation LAM.
Conventional large language models (LLMs) often struggle to effectively connect visual information with textual understanding. They can miss subtle details or misinterpret different types of information entirely. Orby’s collaboration with OSU on visual grounding now gives machines the ability to identify what is visible and understand its importance in the context of a specific task to be performed.
“The advances we’ve made and the transition we are seeing right now within the AI world will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it,” said Will Lu, Co-Founder and CTO at Orby. “Next generation AI systems must be able to process and interpret visual information, like objects, scenes, and their relationships as well as grasping the meaning of words and sentences – making the connection between the two. That’s precisely what we’ve done,” concluded Lu.
“This is an incredible milestone that we’ve achieved with Orby, and yet we’re only beginning to scratch the surface of what’s possible,” said Yu Su, Assistant Professor in the Department of Computer Science and Engineering at The Ohio State University.
Orby and OSU have open-sourced the new visual grounding model, which is now available on Hugging Face, so developers can use it in a variety of applications.
Source: GlobeNewsWire