In the paper on the Robotics Transformer model RT-1, researchers had robots solve problems in the real world, carrying out only those tasks that lie within their own capabilities. Here’s a short introduction to the paper.
"Roses are red, violets are blue, bring me the rice chips from the drawer, and a napkin too." This is an actual task description that was given to a robot. It was part of a research project from Robotics at Google, Everyday Robots, and Google Research.
The robot receives a task in natural language and processes it accordingly. In addition, the task is filtered: the robot evaluates which parts of it can be solved with its own skills. Transformers are used for this purpose. A significant advantage is that they can process both natural language and visual data.
The paper "RT-1: Robotics Transformer for Real-World Control at Scale" shows one way robots can be integrated into human (domestic) environments. Given images of the real-world environment and a task description in natural language, the RT-1 approach produces robot actions in three steps:
1. Process the task in natural language and extract task-relevant visual features
2. Compute a set of tokens (planning)
3. Transform the tokens into actions (and execute movements)
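
To make step 3 more concrete, here is a minimal Python sketch of how discrete tokens can be mapped back to continuous motor commands through uniform binning. It is an illustration under stated assumptions, not the authors' implementation: the function names, the value range of -1.0 to 1.0, and the bin count of 256 per action dimension are all assumptions chosen for this example.

```python
from typing import List, Sequence

# Assumption: each action dimension is split into 256 uniform bins.
NUM_BINS = 256


def tokenize_action(values: Sequence[float], low: float, high: float,
                    num_bins: int = NUM_BINS) -> List[int]:
    """Map continuous action values to integer tokens in [0, num_bins - 1]."""
    tokens = []
    for value in values:
        clipped = min(max(value, low), high)       # keep value inside the range
        fraction = (clipped - low) / (high - low)  # position in [0, 1]
        tokens.append(min(int(fraction * num_bins), num_bins - 1))
    return tokens


def detokenize_action(tokens: Sequence[int], low: float, high: float,
                      num_bins: int = NUM_BINS) -> List[float]:
    """Map integer tokens back to continuous values (the center of each bin)."""
    width = (high - low) / num_bins
    return [low + (token + 0.5) * width for token in tokens]


if __name__ == "__main__":
    # Hypothetical 3-dimensional arm command, each dimension in [-1, 1].
    command = [-1.0, 0.0, 1.0]
    tokens = tokenize_action(command, low=-1.0, high=1.0)
    recovered = detokenize_action(tokens, low=-1.0, high=1.0)
    print(tokens)
    print(recovered)
```

The round trip loses at most half a bin width per dimension, which is the price of treating actions as a small vocabulary of tokens that a transformer can predict like words.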
Find more information and the link to the full paper here: https://robotics-transformer.github.io/