The name of the base system (platform): | Artificial intelligence (AI) |
Developers: | Sberbank, Artificial Intelligence Research Institute (AIRI), Moscow Institute of Physics and Technology (MIPT) |
Date of the premiere of the system: | 2023/11/27 |
Technology: | Speech technology |
The main articles are:
- Speech Recognition (Technology, Market)
- Speech technology: On the path from recognition to understanding
2023: Robot Action Planning System Introduction
The Sberbank Center for Robotics, together with a team of scientists from AIRI and MIPT, is working to create a robot action planning system that will allow robots to perform everyday tasks in response to natural-language commands. Sberbank announced this on November 27, 2023.
Teaching robots to understand human speech is a real challenge for developers. Unambiguously interpreting the natural language that people speak every day is not an easy task for robots. Abstractions, generalizations, context, or slang can change the meaning of words and sentences and, as a result, confuse the robot. Controlling robots with spoken commands is further complicated by the fact that, although artificial intelligence has achieved significant success in understanding written text, it cannot yet carry this understanding over flawlessly to spoken language, with its variations in accent, speed, and intonation. In addition, robots struggle to understand ambiguous commands and do not "read between the lines," which comes naturally to humans. Moreover, modern robots, even the simplest ones, are controlled by a set of written commands, that is, program code.
Embodied artificial intelligence will allow the robot to independently form sequences of actions to solve problems while interacting with its environment in the real world. A system built on such technology processes information, orients itself in space, and makes decisions. As a result, the robot should be able to move objects at the user's request expressed in natural language, rather than, as before, following an algorithm predetermined by the developer as a sequence of commands in a programming language.
To harness the progress of generative technologies for applying AI in robotics, the Sberbank Center for Robotics, AIRI, and the MIPT Center for Cognitive Modeling are developing a universal approach to planning robot behavior based on large language models. To predict text, large language models had to implicitly learn ideas about the surrounding world: what objects exist in it and what can and cannot be done with them. The team uses this property to generate robot action plans.
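The idea of using a language model as a planner can be sketched roughly as follows. This is a minimal illustration and not the project's actual implementation: the action names, the prompt wording, and the `build_prompt` and `parse_plan` helpers are all assumptions, and the model call itself is omitted, so a hand-written reply stands in for real model output.

```python
import re

# Hypothetical primitive skills the robot is assumed to expose.
ALLOWED_ACTIONS = {"move_to", "pick_up", "put_in"}


def build_prompt(command, visible_objects):
    """Compose a text prompt asking a language model for a step-by-step plan."""
    actions = ", ".join(sorted(ALLOWED_ACTIONS))
    objects = ", ".join(visible_objects)
    return (
        f"Available actions: {actions}\n"
        f"Visible objects: {objects}\n"
        f"Task: {command}\n"
        "Answer with one action per line, e.g. pick_up(cup)."
    )


def parse_plan(reply, visible_objects):
    """Keep only lines that name an allowed action applied to known objects."""
    plan = []
    for line in reply.splitlines():
        match = re.fullmatch(r"\s*(\w+)\(([^)]*)\)\s*", line)
        if not match:
            continue  # skip free text the model may add around the plan
        action = match.group(1)
        args = [a.strip() for a in match.group(2).split(",") if a.strip()]
        if action in ALLOWED_ACTIONS and all(a in visible_objects for a in args):
            plan.append((action, args))
    return plan


objects = ["cup", "table", "box"]
prompt = build_prompt("put the cup in the box", objects)

# Hand-written stand-in for a model reply (assumption: no real model is called).
reply = "Sure!\nmove_to(cup)\npick_up(cup)\nfly_to(moon)\nput_in(cup, box)"
plan = parse_plan(reply, objects)
for action, args in plan:
    print(action, args)
```

Validating the reply against a fixed action set and the list of visible objects is one simple way to keep a generated plan within what the robot can actually do. In this sketch, the chatty first line and the unsupported `fly_to(moon)` step are discarded.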
In the future, the solution, which lets machines understand human commands, could be connected to different types of robots. The scientists are currently conducting experiments with a research rover robot.
One of the difficulties in implementing such a project is getting feedback from the environment in which the robot operates. Each apartment or office is unique, and the objects we are used to - cups, computers, furniture - differ from one another. To solve this problem, the system proposed by the scientists splits the task into several parts depending on the situation. For example, even a request that is simple for a child, "put all the toys in a box," turns out to be far from trivial for AI. It has no "common sense" and does not know what "all the toys" are. In such a situation, the robot must convert the request into a requirement to "segment the toys," compile a list of the items found in the room, and divide the task into stages, that is, independently write instructions for tidying up each specific object.
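The decomposition described above can be illustrated with a short sketch. It assumes a hypothetical fixed lookup table in place of real perception (an actual system would obtain object categories from a segmentation model), and the step wording and function names are illustrative only.

```python
# Hypothetical category lookup standing in for the robot's perception.
CATEGORY_OF = {
    "teddy bear": "toy",
    "toy car": "toy",
    "cup": "dishware",
    "laptop": "electronics",
}


def find_targets(detected_objects, category):
    """Return the detected objects that belong to the requested category."""
    return [obj for obj in detected_objects if CATEGORY_OF.get(obj) == category]


def decompose(detected_objects, category, container):
    """Expand one high-level request into per-object stages."""
    steps = []
    for obj in find_targets(detected_objects, category):
        steps.append(f"move to {obj}")
        steps.append(f"pick up {obj}")
        steps.append(f"move to {container}")
        steps.append(f"put {obj} in {container}")
    return steps


# Objects assumed to have been found in the room.
room = ["cup", "teddy bear", "laptop", "toy car"]
steps = decompose(room, "toy", "box")
for step in steps:
    print(step)
```

Here the request "put all the toys in a box" becomes four stages for each toy found in the room, while the cup and the laptop are left untouched.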