The name of the base system (platform): | Artificial intelligence (AI) |
Developers: | Sberbank, SberDevices |
System premiere date: | 2020/12 |
Last Release Date: | July 2023 |
Technology: | Speech technology |
2023: Opening access to ruGPT-3.5 13B
On July 20, 2023, Sber announced open access to ruGPT-3.5 13B, a neural network model for generating Russian-language text. An updated version of this model is at the heart of the GigaChat service. The model is available on the HuggingFace platform and can be used by any developer: it is published under the open MIT license.
According to Sber, this is a state-of-the-art text-generation model for the Russian language, based on the GPT-3 architecture from OpenAI as refined by Sber researchers.
ruGPT-3.5 13B contains 13 billion parameters and can continue texts in Russian and English, as well as in programming languages. The model's context length is 2048 tokens. It was trained on a text corpus of about 1 TB which, in addition to the large collection of open text data already used to train ruGPT-3, included, for example, part of the open The Stack code dataset from the BigCode research collaboration and corpora of news texts. The final checkpoint of the model is a base pretrained model intended for further experiments, the bank said.
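A minimal sketch of how the published checkpoint could be loaded from HuggingFace and used to continue a Russian-language prompt, assuming the transformers library and the usual Hub loading pattern; the repository identifier used below is an assumption and should be checked against the actual model card before running.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ai-forever/ruGPT-3.5-13B"  # assumed Hub identifier, verify on the model card

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # 13B parameters: half precision to fit on a large GPU
    device_map="auto",          # spread layers across available devices (requires accelerate)
)

prompt = "Гравитация - это"  # "Gravity is", a Russian prompt for the model to continue
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,  # stay well inside the 2048-token context window
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```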
The model is also available on the Russian ML Space platform, in the DataHub catalogue of pre-trained models and datasets. The SberDevices and Sber AI teams, supported by the AIRI Institute of Artificial Intelligence, took part in training the model.
Sber, as a leading technology company, advocates for open technologies and the exchange of experience with the professional community, because any development or research has limited potential in a closed environment. We are therefore confident that publishing trained models will spur the work of Russian researchers and developers who need super-powerful language models to create their own technological products and solutions. Try, experiment and be sure to share the results, said Andrey Belevtsev, Senior Vice President, CTO, Head of the Technologies Unit of Sberbank.[1]
2022
Creating a collection of short stories with writer Pavel Pepperstein
Sberbank's ruGPT-3 neural network wrote a collection of stories together with writer Pavel Pepperstein; the collection was released by the Individuum publishing house. Sberbank told TAdviser about this on May 24, 2022. Read more here.
GPT-3 version generating texts in 61 languages of the world
On April 21, 2022, Sber introduced mGPT, a version of the GPT-3 neural network capable of generating texts in 61 languages of the world, including languages of the peoples of Russia and the CIS countries. mGPT is available in two versions: a basic one with 1.3 billion parameters, published publicly in SberDisk cloud storage, and an extended one with 13 billion parameters, which will soon become available on the ML Space machine learning platform from SberCloud.
The mGPT model can be used both simply to generate text and to solve various natural language processing tasks in any of the supported languages, either through additional training or as part of model ensembles. The model shows outstanding results on many few-shot and zero-shot learning tasks: in this area of machine learning, the model does not need to be separately fine-tuned; it is enough to formulate the task in text and provide a few examples, after which mGPT learns to perform the new task. This can be used to teach an automated system to answer questions, determine the sentiment of a text, extract names, surnames and company names from text, and so on. The model can also be used as a component of various speech technologies, for example to improve the quality of speech recognition or to generate scripts for dialogue systems.
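A minimal illustration of the few-shot pattern described above, assuming the transformers library: the task is stated in the prompt itself together with a couple of examples, and the model continues the pattern. The repository identifier is an assumption, and the example task (sentiment classification of short reviews) is hypothetical.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ai-forever/mGPT"  # assumed Hub identifier for the basic 1.3B version

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# The task is described only through examples; no fine-tuning is performed.
few_shot_prompt = (
    "Review: The film was wonderful, I watched it twice. Sentiment: positive\n"
    "Review: A boring and predictable plot. Sentiment: negative\n"
    "Review: The soundtrack alone is worth the ticket price. Sentiment:"
)

inputs = tokenizer(few_shot_prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=3,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
# Only the newly generated tokens carry the answer ("positive" / "negative").
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer.strip())
```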
The Russian-language version of GPT-3, developed by Sberbank, is available on the SmartMarket platform.
2021
ruGPT-3 at the core of the Salyut virtual assistants
On November 12, 2021, Sberbank announced that the Joy and Athena virtual assistants from the Salyut family had begun talking using ruGPT-3, a generative natural language model with 760 million parameters. The switch to a neural network model made the assistants more empathetic, allowed them to better understand users, and let them give original and unexpected answers to various requests. Read more here.
Creating a Code Generation Model
In November 2021, Sberbank presented a code generation model based on the ruGPT-3 neural network. The development was carried out by the SberDevices and SberWorks teams. The model formed the basis of a system created by Sberbank developers, which received the playful name JARVIS (Just another really valuable intellectual system). One part of the system is a service that automatically writes code, reducing development time. From November 15, 2021, external developers will be able to use this service on SmartMarket, a single point of access to all of Sberbank's technology platforms.
The code generation model is based on a deep ruGPT-3 neural network trained on Sberbank code and open-source libraries. Such a model allows the neural network to complete developer code, look for vulnerabilities in code, translate code from one programming language to another and even, in the future, turn an algorithm formulated in ordinary speech into code.
Part of the system is a code completion service that works on the principle of prompts: after the user writes part of the code, the neural network offers continuation options that can be selected instead of typing the code manually. As of November 2021, JARVIS includes plugins for development environments (IDEs), namely IDEA, PyCharm and WebStorm, with support for Java, Python and JavaScript, but this function is so far available only to Sberbank developers. It is planned that in early 2022 the JARVIS plugins for IDEA, PyCharm and WebStorm will become available to everyone. The function is also included in the application creation toolkit for the Salyut virtual assistants.
Unlike the standard code completion tools built into an IDE, JARVIS can rely not only on the project structure and language syntax when writing programs, but also on the text of comments in natural language. The system is thus, within certain limits, able to translate informal descriptions of functions into program code.
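JARVIS itself is not publicly available, so the following is only a hedged sketch of the general prompt-based approach it describes: a code-capable causal language model is given a natural-language comment and a function signature and asked to continue with an implementation. The model identifier below is a placeholder, not a real repository, and the helper prompt is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/any-code-causal-lm"  # placeholder: substitute any code-trained causal LM

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# A natural-language comment plus a function signature acts as the prompt;
# the model is expected to continue with the function body.
prompt = (
    "# Return the sum of the squares of all even numbers in the list\n"
    "def sum_of_even_squares(numbers):\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=48,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
# An IDE plugin would surface completions like this as selectable options.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```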
{{quote 'author = said Konstantin Kruglov, Senior Vice President for New Digital Surfaces at Sberbank, CEO of SberDevices. |
Writing code is a creative process, but a number of routine tasks can already be handed over to neural networks. Solutions based on our model save the developer's most valuable resource: time. We have become a company that has created its own code generation model, and we will soon offer external developers access to it. At the same time, the range of available services will expand: for example, the neural network will learn to complete code in emerging programming languages.}}
Increase in the number of neural network parameters from 760 million to 1.3 billion
Sber continues to develop the Russian-language neural network ruGPT-3, which can generate complex, meaningful texts from just one request formulated in "human" language. Since the presentation of the neural network in December 2020, the number of its parameters has almost doubled, from 760 million to 1.3 billion, Sberbank reported on January 29, 2021. According to representatives of the bank, this is a huge step forward for the processing of natural language by artificial intelligence methods in Russia.
GPT-3 (Generative Pre-trained Transformer) is the largest language model in the world, developed by OpenAI to solve a wide range of problems in English. For Russian, a structurally more complex language, no comparable high-quality models existed before ruGPT-3 appeared. The domestic GPT-3 is continuously trained on a huge data array on Sberbank's Christofari supercomputer, so its capabilities grow every day.
ruGPT-3 can not only create texts of any kind (news, novels, poems, parodies, technical documentation and so on), but also correct grammatical errors, conduct dialogues and write program code. In effect, this is a prototype of general, or strong, artificial intelligence (Artificial General Intelligence, AGI), capable of solving diverse problems in various fields of activity.
In December 2020, we presented ruGPT-3 and announced a further increase in its capabilities. Together with the SberDevices team, we are fulfilling this promise and have already increased the number of the neural network's parameters from 760 million to 1.3 billion. This quantitative growth means a qualitative improvement in the "intelligence" of the system and in its ability to solve new problems at a level comparable to a human's, or above it. The computing power of our Christofari supercomputer allows us to set even more ambitious goals, so 1.3 billion parameters are just the beginning, said Alexander Vedyakhin, First Deputy Chairman of the Management Board of Sberbank.