Developers: | Google, Technical University of Berlin |
Date of the premiere of the system: | March 2023 |
Branches: | Information Technology |
Technology: | Robotics, Industrial Robots, Service Robots, Application Development Tools |
History
2023: Product Announcement
On March 6, 2023, researchers from Google and the Technical University of Berlin presented an open, artificial intelligence-based multimodal language model for training autonomous robots.
The project is called PaLM-E. The model combines language and computer vision to control robots. It has a total of 562 billion parameters, which gives it considerable flexibility: the robot can perform a wide range of tasks based on human voice commands without the need for constant retraining. In other words, the robot receives natural-language instructions, analyzes them, and immediately starts working.
For example, if the robot is told to "bring me rice chips from a desk drawer," PaLM-E promptly creates an action plan based on the command and its field of view. A mobile robotic platform with a robotic arm then carries out all the necessary operations fully autonomously.
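As a rough illustration of this command-to-plan step, the sketch below maps a spoken instruction to a sequence of robot skills. The function name, skill names, and hard-coded lookup are hypothetical and do not come from the PaLM-E system; a real planner would query the multimodal model with the command plus camera observations.

```python
def plan_from_command(command: str) -> list[str]:
    """Hypothetical planner stub: map an instruction to low-level robot skills.

    Hard-coded for illustration only; the actual system generates the plan
    from the command and the robot's current field of view.
    """
    example_plans = {
        "bring me rice chips from a desk drawer": [
            "navigate to the desk",
            "open the drawer",
            "locate the rice chips",
            "pick up the rice chips",
            "close the drawer",
            "return to the user",
        ],
    }
    return example_plans.get(command.lower(), ["ask the user to rephrase"])


for step in plan_from_command("Bring me rice chips from a desk drawer"):
    print(step)
```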
PaLM-E receives data from the robot's camera and analyzes the environment without needing a preliminary description of the scene, so a person does not have to annotate visual data. Moreover, PaLM-E can respond to changes in the environment while performing a task.
PaLM-E is based on the existing large language model known as PaLM (a technology similar to ChatGPT) and integrates it with sensor information and robotic control. The system continuously observes its environment and encodes the incoming data into a sequence of "vectors", much as words are encoded into "language tokens". Sensory information is therefore processed in the same way as voice commands. In addition, PaLM-E can transfer knowledge and skills gained from previous tasks to new ones, which improves efficiency.[1]
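The interleaving of sensor data with language tokens described above can be illustrated with a minimal sketch. The projection layer, dimensions, and names below are assumptions made for the example, not the actual PaLM-E implementation.

```python
import torch
import torch.nn as nn


class MultimodalPrefix(nn.Module):
    """Illustrative sketch: project visual features into the language model's
    embedding space so they can be interleaved with text tokens.

    Dimensions and module names are assumptions, not PaLM-E's real code.
    """

    def __init__(self, image_dim: int = 1024, token_dim: int = 4096):
        super().__init__()
        # Map each visual feature vector to the same width as a language
        # token embedding (the "vectors" mentioned in the text).
        self.project = nn.Linear(image_dim, token_dim)

    def forward(self, image_features: torch.Tensor, text_embeddings: torch.Tensor) -> torch.Tensor:
        # image_features: (num_patches, image_dim) from a vision encoder
        # text_embeddings: (num_tokens, token_dim) from the LLM's embedding table
        visual_tokens = self.project(image_features)
        # Prepend the visual tokens so the language model processes sensor
        # data and text in a single sequence.
        return torch.cat([visual_tokens, text_embeddings], dim=0)


# Usage with random placeholders standing in for real encoder outputs.
prefix = MultimodalPrefix()
sequence = prefix(torch.randn(256, 1024), torch.randn(12, 4096))
print(sequence.shape)  # torch.Size([268, 4096])
```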