Thursday, November 21, 2024

Google’s new RT-2 AI model helps robots interpret visual and language patterns

Google is taking a significant leap in enhancing the intelligence of its robots with the introduction of the Robotic Transformer 2 (RT-2), an advanced AI learning model. A vision-language-action (VLA) model that builds on Google’s earlier robotics work, RT-2 equips robots with the ability to recognise visual and language patterns more effectively, enabling them to interpret instructions accurately and deduce the most suitable objects to fulfill specific requests.

In recent experiments, researchers put RT-2 to the test with a robotic arm in a mock kitchen set up in an office. The robot was instructed to identify an improvised hammer (which turned out to be a rock) and to choose a drink to offer an exhausted person (it picked a Red Bull). The researchers also instructed the robot to move a Coke can to a picture of Taylor Swift, revealing its apparent fondness for the famous singer.

Prior to the advent of VLA models like RT-2, teaching robots new behaviours required arduous, time-consuming programming for each specific task. These advanced models, by contrast, allow robots to draw on a vast pool of information, enabling them to make informed inferences and decisions on the fly.
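
To make that contrast concrete, here is a minimal, purely illustrative Python sketch of what a VLA-style interface could look like: a single policy object that maps a camera image plus a natural-language instruction to an action. The names ToyVLAPolicy, Observation and Action are hypothetical and do not correspond to RT-2’s actual implementation or any public Google API.

# Illustrative sketch only: a toy stand-in for a vision-language-action (VLA)
# policy. Nothing here reflects RT-2's real architecture or code.
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    image: bytes        # raw frame from the robot's camera
    instruction: str    # natural-language command, e.g. "hand me the drink"


@dataclass
class Action:
    # RT-2 is reported to emit discretised action tokens; this toy version
    # returns a simple end-effector target instead.
    gripper_xyz: List[float]
    gripper_closed: bool


class ToyVLAPolicy:
    """One model-like object mapping (image, text) -> action, standing in for
    the separate hand-written programs that older robot stacks needed per task."""

    def act(self, obs: Observation) -> Action:
        # A real VLA model would run a vision-language backbone here and decode
        # action tokens; this stub just returns a fixed placeholder action.
        return Action(gripper_xyz=[0.0, 0.0, 0.0], gripper_closed=False)


if __name__ == "__main__":
    policy = ToyVLAPolicy()
    action = policy.act(Observation(image=b"", instruction="pick up the improvised hammer"))
    print(action)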

Google’s endeavour to create more intelligent robots started the previous year with the announcement that it would integrate its large language model (LLM) PaLM into robotics, culminating in the development of the PaLM-SayCan system. This system aimed to unite LLMs with physical robotics and laid the foundation for Google’s current achievements.

Nevertheless, the new robot is not without its imperfections. During a live demonstration observed by The New York Times, the robot struggled to correctly identify soda flavours and occasionally misidentified the colour of a piece of fruit as white. Such glitches highlight the ongoing challenges in refining AI technology for real-world applications.

SOURCE: BusinessToday
