MTS AI: Cotype (Large Language Model, LLM)

Product
Base system (platform): Artificial intelligence (AI)
Developers: MTS AI (MTS Artificial Intelligence Center)
Last Release Date: 2024/09/10
Technology: Speech technology

2024

Deployment of MTS AI Cotype Plus in the AFL Technology Sandbox

The AFL Technology Sandbox has deployed GPU-based infrastructure for high-performance matrix and vector operations in order to pilot solutions that use artificial intelligence technologies. The Association announced this on September 12, 2024. In particular, the MTS AI Cotype Plus large language model has already been deployed on this infrastructure.

Optimization for Tatar language texts

MTS AI has developed an updated version of its Cotype Lite large language model for working with texts in the Tatar language. The company announced this on September 10, 2024.

The LLM can process documents of up to 8 thousand tokens (approximately 5 A4 sheets) and extract and summarize data in a few seconds.
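The page estimate above implies a simple rule of thumb. As a rough sketch (not MTS AI code), the Python snippet below uses the article's ratio of 8,000 tokens to about 5 A4 sheets to estimate whether a document fits into a context window. The function names and the derived constant of 1,600 tokens per page are assumptions for illustration only; real token counts depend on the tokenizer and on the text itself.

```python
import math

# Assumption: the article's own ratio, 8,000 tokens is roughly 5 A4 sheets.
TOKENS_PER_A4_PAGE = 8_000 / 5  # about 1,600 tokens per page (rough estimate)


def estimated_tokens(pages: float) -> int:
    """Rough token count for a document of the given length in A4 pages."""
    return math.ceil(pages * TOKENS_PER_A4_PAGE)


def fits_in_context(pages: float, context_tokens: int = 8_000) -> bool:
    """True if the estimated token count does not exceed the context window."""
    return estimated_tokens(pages) <= context_tokens


def chunks_needed(pages: float, context_tokens: int = 8_000) -> int:
    """Number of context-sized pieces needed to cover the whole document."""
    return max(1, math.ceil(estimated_tokens(pages) / context_tokens))
```

Under the same assumed ratio, a 12-page document would need three passes through an 8,000-token window, while a 32-thousand-token window would hold about 20 pages at once.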

Cotype Lite can be used in archives, libraries, and state and private organizations - wherever information needs to be processed and documents in Tatar need to be analyzed. For example, a large language model can speed up the processing of applications to government agencies.

Cotype extracts key information, such as the subject of the request, the location, and the applicant's personal data, and transfers it to the appropriate database. Like other models in the Cotype family, this version can be installed inside the organization's own perimeter, which helps prevent information leaks.

"In creating a large language model for Tatar, the MTS AI developers pursued several goals. First, we wanted to support the diversity of languages that exist in Russia and help them develop and stay in demand in the digital era. Second, this project has shown that we can adapt our models to any scientific and business task, including non-trivial ones such as processing information in the languages of the peoples of Russia," said Dmitry Markov, Executive Director of MTS AI.

To enable Cotype Lite to understand an unfamiliar language, the developers compiled a dataset and translated it from Russian into Tatar. All the data and the model's answers were then checked by Turkologists and native speakers. Cotype Lite is trained on MTS Web Services infrastructure.

According to the developers, Cotype Lite is among the best LLMs in its class: it contains 8 billion parameters. If necessary, MTS AI can create a Tatar-language LLM with a larger number of parameters, up to 70 billion, and a larger context window of up to 32 thousand tokens, so that the model can perform tasks such as translation and generation of long texts. MTS AI is also ready to adapt models of the Cotype family for other regional languages of Russia.

Ability to process a long user context

MTS AI has released an updated version of Cotype PRO, its large language model for business. The model can handle a long user context of up to 20 pages, which lets it produce personalized and accurate responses with modest computing resources. The company announced this on August 28, 2024.

Cotype is a large language model created by MTS AI specifically for working with corporate data. It is trained on a large volume of business correspondence, job descriptions, documentation, and other texts, which gives the LLM strong expertise in this area and allows AI to be used in business processes. According to the MERA benchmark, Cotype ranks among the top three Russian-language models.

"An increased context capacity without data loss is a Cotype PRO feature that is used in more than 10 pilot projects in the corporate sector and government agencies: to build an end-to-end search system for internal documentation and databases, generate technical instructions and corporate letters, SEO-optimize website materials, and analyze and summarize the results of meetings," said Sergey Ponomarenko, senior manager of LLM products at MTS AI.

Cotype PRO was created using a unique two-stage fine-tuning method and in-house benchmarks developed by MTS AI specialists. According to the company, this approach delivers high-quality results from a large language model that runs on a single NVIDIA A100 graphics card, unlike competing solutions that need four graphics cards.

A large context window lets the Cotype PRO language model process a significant amount of information at a time. For example, the user can upload a 20-page contract or other document. As a result, the model better understands what the document is about and gives more accurate answers, which is especially important when analyzing corporate documents and regulations.