Developers: | SaluteDevices (formerly SberDevices) |
Date of the premiere of the system: | 2022/11/24 |
Technology: | Speech technology |
Molotilka is a basic infrastructure element of the "electronic brain of the company." In the ecosystems of the future, specialized web crawlers will scan the World Wide Web around the clock, and powerful tensor supercomputers will continually retrain farms of large machine learning models, which will become the intellectual core of many products, tools and services. At the product level, this makes it possible not only to rely on current information and trends, but also to study how the information space changes over time, in order to quickly make decisions that affect a company's strategy and tactics in the market.
2022: Announcement of Molotilka (ML Toolkit for Continuous Learning)
On November 24, 2022, SberDevices introduced Molotilka (ML Toolkit for Continuous Learning), a tool for working with pipelines of large language models. It automates continual additional training while minimizing the forgetting of previously learned knowledge. A service providing access to large neural network models that are continually trained with Molotilka is available in Cloud ML Space, a full-cycle ML development platform.
New knowledge arrives continuously: many events occur every day. A large neural language model is usually trained on a snapshot of the data available on the Internet or in other sources at a given point in time. As a result, a model trained in 2021 knows nothing about what happened in 2022. Molotilka was created to address this: it keeps a model's knowledge up to date at all times while preserving what the model learned earlier.
The first version of the framework used the ruGPT-3 language model, which the SberDevices team had previously trained on a large corpus of texts from various sources: books, the Internet, etc. A small dataset was regularly assembled from data downloaded from several news sources, and a small random sample from the large original dataset was added to it so that the ratio of old to new data matched a given proportion. ruGPT-3 was then further trained on this mixed dataset using current methods for combating catastrophic forgetting. Several approaches were tried during the experiment, and an adapter-based option was ultimately chosen: special layers are added to the model and then further trained.
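The mixing step described above can be illustrated with a minimal Python sketch. It is not Molotilka's actual code: the function name `mix_replay`, the `old_fraction` parameter (the share of old data in the final dataset) and the fixed random seed are assumptions made for this example only.

```python
import random


def mix_replay(new_samples, old_corpus, old_fraction, seed=0):
    """Combine fresh samples with a random draw from the old corpus.

    Hypothetical helper: `old_fraction` is the desired share of old
    data in the mixed dataset, e.g. 0.2 for 20% old and 80% new.
    """
    if not 0 <= old_fraction < 1:
        raise ValueError("old_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_old / (n_old + n_new) = old_fraction for n_old.
    n_old = round(len(new_samples) * old_fraction / (1 - old_fraction))
    # Never request more old samples than the old corpus contains.
    n_old = min(n_old, len(old_corpus))
    mixed = list(new_samples) + rng.sample(list(old_corpus), n_old)
    # Shuffle so that old and new samples are interleaved during training.
    rng.shuffle(mixed)
    return mixed


fresh_news = [f"news-{i}" for i in range(80)]
old_texts = [f"old-{i}" for i in range(1000)]
mixed = mix_replay(fresh_news, old_texts, old_fraction=0.2)
```

With 80 fresh samples and `old_fraction=0.2`, the sketch draws 20 old samples, so the mixed dataset holds 100 items. The actual proportion used by the SberDevices team is not stated in the source.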
This is how Molotilka appeared: a tool for continuous training of language models that repeats a predetermined sequence of steps, such as downloading current data from news sources, preprocessing it, creating a dataset for further training of the language model, and evaluating the model on different tasks.
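One way to picture such a repeated sequence of steps is the Python sketch below. It is only a schematic outline under stated assumptions: `run_cycle`, `CycleReport` and the five callables passed in are hypothetical names, not part of Molotilka's published interface, and the stand-in steps in the usage lines do no real downloading or training.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class CycleReport:
    n_raw: int                 # documents downloaded in this cycle
    n_clean: int               # documents left after preprocessing
    metrics: Dict[str, float]  # evaluation scores per task


def run_cycle(
    fetch: Callable[[], List[str]],
    preprocess: Callable[[str], str],
    build_dataset: Callable[[List[str]], Any],
    train: Callable[[Any], Any],
    evaluate: Callable[[Any], Dict[str, float]],
) -> CycleReport:
    """Run one pass: download, preprocess, build dataset, train, evaluate."""
    raw = fetch()
    # Drop documents that are empty after preprocessing.
    clean = [text for text in (preprocess(doc) for doc in raw) if text]
    dataset = build_dataset(clean)
    model = train(dataset)
    return CycleReport(len(raw), len(clean), evaluate(model))


# Usage with trivial stand-ins for each step (all hypothetical).
report = run_cycle(
    fetch=lambda: ["  News A ", "", "News B"],
    preprocess=str.strip,
    build_dataset=lambda texts: texts,
    train=lambda dataset: {"seen": len(dataset)},
    evaluate=lambda model: {"seen": float(model["seen"])},
)
```

Passing each step in as a callable keeps the loop itself independent of any particular news source, model or benchmark, so the same cycle could be rescheduled as often as new data arrives.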
ML Toolkit for Continuous Learning can be used in the following areas:
- standard application of ruGPT-3 as a language model with up-to-date knowledge of the world;
- solving customized tasks with continual additional training of the model on new data: classification, information extraction, dialogue systems, etc.
Through a dedicated service, Cloud ML Space users now have API access to the most current version of ruGPT-3, which is always aware of the latest news, trends and memes, as well as to previous versions of the model. An example of API usage is publicly available. Other Sberbank neural networks will be added to the set of continually retrained models.