Google’s new RT-2 AI model helps robots interpret visual and language patterns

Google is taking a significant leap in enhancing the intelligence of its robots with the introduction of the Robotic Transformer (RT-2), an advanced AI learning model. Building upon its earlier vision-language-action (VLA) model, RT-2 equips robots with the ability to recognise visual and language patterns more effectively. This enables them to interpret instructions accurately and deduce the most suitable objects to fulfill specific requests.

In recent experiments, researchers put RT-2 to the test with a robotic arm in a mock kitchen set up in an office. The robot was instructed to identify an improvised hammer (it settled on a rock) and to choose a drink to offer an exhausted person (it picked a Red Bull). The researchers also asked it to move a Coke can to a picture of Taylor Swift, revealing the robot’s apparent fondness for the singer.

Before the advent of VLA models like RT-2, teaching robots required arduous, time-consuming programming for each individual task. These models instead let robots draw on a vast pool of information, so they can make informed inferences and decisions on the fly.
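To give a sense of what this looks like in practice, the Python sketch below outlines a generic vision-language-action control loop: a camera image and a natural-language instruction go into a single pretrained model, and a low-level arm action comes out. This is a hypothetical illustration, not Google’s API; every name in it (VLAPolicy, predict_action, run_episode, the camera and robot objects) is invented for this example, and a real system would decode learned action tokens rather than return a placeholder.

```python
# Hypothetical sketch of a vision-language-action (VLA) control loop.
# None of these classes come from a real library; they only illustrate the idea
# that one model maps (image, instruction) -> low-level robot action.

from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """A discretised end-effector command of the kind VLA models typically emit."""
    delta_xyz: List[float]   # gripper translation, in metres
    delta_rpy: List[float]   # gripper rotation, in radians
    close_gripper: bool      # whether to close the gripper
    terminate: bool          # whether the model considers the task finished


class VLAPolicy:
    """Stand-in for a pretrained vision-language-action model such as RT-2."""

    def predict_action(self, image_rgb, instruction: str) -> Action:
        # A real VLA model would tokenise the camera image and the instruction
        # together and decode action tokens; this placeholder returns a no-op.
        return Action([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], False, False)


def run_episode(policy, camera, robot, instruction: str, max_steps: int = 50) -> None:
    """Closed-loop control: re-query the model after every executed action."""
    for _ in range(max_steps):
        action = policy.predict_action(camera.read(), instruction)
        if action.terminate:
            break
        robot.execute(action)
```

The design point the sketch is meant to convey is that the same pretrained model handles both perception and language, so supporting a new task amounts to supplying a new instruction string rather than writing new control code.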

Google’s endeavour to create more intelligent robots began the previous year, when it announced the integration of its large language model PaLM into robotics, culminating in the PaLM-SayCan system. That system aimed to unite language models with physical robotics and laid the foundation for Google’s current achievements.

Nevertheless, the new robot is not without imperfections. During a live demonstration observed by The New York Times, it struggled to correctly identify soda flavours and occasionally misidentified fruit as the colour white. Such glitches highlight the ongoing challenges of refining AI technology for real-world applications.

SOURCE: BusinessToday
