Monday, July 15, 2024

Google’s new RT-2 AI model helps robots interpret visual and language patterns


Google is taking a significant leap in enhancing the intelligence of its robots with the introduction of the Robotic Transformer (RT-2), an advanced AI learning model. Building upon its earlier vision-language-action (VLA) model, RT-2 equips robots with the ability to recognise visual and language patterns more effectively. This enables them to interpret instructions accurately and deduce the most suitable objects to fulfill specific requests.

In recent experiments, researchers put RT-2 to the test with a robotic arm in a makeshift kitchen set up in an office. The robot was instructed to identify an improvised hammer (it picked a rock) and to choose a drink for an exhausted person (it picked a Red Bull). The researchers also instructed the robot to move a Coke can to a picture of Taylor Swift, a request it carried out, suggesting a certain fondness for the singer.

Prior to the advent of VLA models like RT-2, teaching robots required arduous and time-consuming individual programming for each specific task. The power of these advanced models, however, enables robots to draw from a vast pool of information, allowing them to make informed inferences and decisions on the fly.
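To make the contrast concrete, here is a minimal, purely hypothetical sketch of the difference between per-task programming and a single vision-language-action style policy. None of the names or logic below come from Google's RT-2; the "inference" is a toy keyword match standing in for what a real VLA model learns from web-scale data.

```python
# Hypothetical sketch: one generalist policy replacing many
# hand-written, per-task routines. Illustrative only; this is not
# Google's RT-2 API, and the keyword matching merely stands in for
# a learned vision-language-action model.
from dataclasses import dataclass


@dataclass
class Observation:
    objects: list[str]  # labels the vision stack detected in the scene


def vla_policy(obs: Observation, instruction: str) -> list[str]:
    """Map a camera observation plus a free-form instruction to
    action tokens, instead of writing one routine per task."""
    words = set(instruction.lower().split())
    # Toy stand-in for learned world knowledge, e.g. that a rock can
    # serve as an improvised hammer (as in the reported demo).
    hints = {
        "hammer": "rock",
        "drink": "red bull",
        "energy": "red bull",
    }
    target = None
    for word, obj in hints.items():
        if word in words and obj in obs.objects:
            target = obj
            break
    if target is None:
        # Fall back to any object literally named in the instruction.
        target = next(
            (o for o in obs.objects if o in instruction.lower()), None
        )
    return [f"pick({target})", "place(goal)"] if target else ["no_op"]


scene = Observation(objects=["rock", "red bull", "coke can"])
print(vla_policy(scene, "Pick up the improvised hammer"))
# → ['pick(rock)', 'place(goal)']
```

The point of the sketch is only the interface: one function receives the observation and the instruction together and decides on the fly, whereas the older approach would have required a separate, individually programmed routine for every task.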

Google’s endeavour to create more intelligent robots started the previous year with the announcement that it would integrate its large language model (LLM) PaLM into robotics, culminating in the PaLM-SayCan system. That system aimed to unite LLMs with physical robots and laid the foundation for Google’s current achievements.

Nevertheless, the new robot is not without its imperfections. During a live demonstration observed by The New York Times, the robot struggled to identify soda flavours correctly and occasionally described the colour of a piece of fruit as white. Such glitches highlight the ongoing challenges in refining AI technology for real-world applications.

SOURCE: BusinessToday
