
AI training

Content

The main articles are:

Robot training

Main article: Robot training

Machine learning

Main article: Machine Learning

LLM (Large Language Models)

Main article: LLM (Large Language Models)

Chronicle

2025

Moscow Mayor's Office has opened sets of anonymized data that companies can use to train AI and create services for residents

The Moscow Department of Information Technology has given developers access to 25 sets of anonymized city data for training artificial intelligence algorithms and developing digital services for citizens. The initiative is aimed at creating smart solutions that, after expert review, can be integrated into the urban infrastructure. The department's press service reported this on September 9, 2025. Read more here

Free data set for training artificial intelligence to recognize human emotions released in Russia

Researchers at the National Research University Higher School of Economics (HSE) in St. Petersburg have developed and released a multimodal emotional dataset for teaching artificial intelligence systems to analyze human emotions. HSE reported this at the end of August 2025. Read more here

Development of technology in Russia that accelerates training of distributed neural network models

Russian specialists from the Sberbank Center for Practical Artificial Intelligence and the Moscow Institute of Physics and Technology have created an innovative approach to optimizing the distributed training of neural network models that significantly reduces the load on computing resources and speeds up machine learning. This became known in August 2025. The new technology focuses on exploiting the homogeneity of local data samples and compressing the information transmitted between devices.

The development is aimed at solving the key problem of modern machine learning - the inefficiency of communications in distributed training of neural networks. AI models process vast amounts of data with billions of parameters, the researchers note, requiring distributed learning across thousands of devices to speed up computing processes.

In Russia, a technology has been developed that accelerates the training of distributed neural network models

The main problem is that a significant share of the time in distributed training is spent exchanging information between machines. Inefficient communication can make training a neural network slower than the centralized approach, negating the benefit of having multiple computing resources.

The method proposed by the Russian scientists rests on a combination of two key principles: exploiting the homogeneity of local data samples and compressing the transmitted information. This approach reduces the frequency of synchronization between devices and significantly cuts the amount of data transferred without compromising the quality of the model being trained.
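The article does not describe the algorithm itself, but both principles it names are standard ingredients of communication-efficient distributed learning. Below is a minimal illustrative sketch, not the Sber/MIPT method: each worker runs several local SGD steps between synchronizations and sends only a top-k-sparsified update to the server. All names and parameters are our own assumptions.

```python
import numpy as np

def top_k_compress(update, k):
    """Keep only the k largest-magnitude components of an update; zero the rest."""
    idx = np.argsort(np.abs(update))[-k:]
    sparse = np.zeros_like(update)
    sparse[idx] = update[idx]
    return sparse

def round_of_local_sgd(model, worker_grads, lr=0.1, local_steps=5, k=10):
    """One communication round: each worker takes several local steps,
    then sends a compressed update; the server averages them."""
    updates = []
    for grad_fn in worker_grads:            # grad_fn(w) -> gradient on local data
        local = model.copy()
        for _ in range(local_steps):        # no communication between these steps
            local -= lr * grad_fn(local)
        updates.append(top_k_compress(local - model, k))
    return model + np.mean(updates, axis=0)
```

The homogeneity assumption enters implicitly: the more similar the workers' local data samples are, the less the infrequent, compressed synchronization hurts the quality of the final model.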

Gleb Gusev, Director of the Center for Practical Artificial Intelligence of Sberbank, stressed that the combination of homogeneity of data with compression techniques makes it possible to exchange information between the server and devices less often. According to him, this not only speeds up the learning process, but also significantly reduces energy costs, opening up new opportunities for scaling artificial intelligence in large systems.[1]

Russia has developed a new method for understanding decisions made by artificial intelligence

Scientists from the T-Bank AI Research artificial intelligence laboratory have developed a method called SAE Match, which makes it possible to understand the decision-making mechanisms of artificial intelligence (AI) and to see why particular conclusions are formed during computation. Representatives of the Moscow research group announced this on April 10, 2025. Read more here.

Deputy Prime Minister Dmitry Grigorenko: State data for AI training will become available to business, government agencies and citizens

State data for AI training will become available to business, government agencies and citizens. Deputy Prime Minister Dmitry Grigorenko said in February 2025 that companies, government agencies and individuals will be able to use marked-up sets of government data to test and develop artificial intelligence algorithms. The first free-of-charge agreements on the provision of such data will begin to be concluded in February 2025.

According to Vedomosti, the marked-up sets are prepared from initial databases that contain no personal data of citizens, no official or other legally protected secrets, and no restricted-access information. Markup is the assignment of special labels and categories to data so that machine learning algorithms can recognize them.
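As a purely illustrative example (the actual schema of these government data sets is not public, so every field name below is hypothetical), a single marked-up record might look like this:

```python
# All field names below are hypothetical; the real schema of the marked-up
# government data sets is not described in the article.
record = {
    "text": "Приём заявлений на загранпаспорт продлён до 30 июня.",
    "labels": {
        "topic": "government_services",    # category assigned during markup
        "language": "ru",
        "contains_personal_data": False,   # sets are built from anonymized data
    },
}
```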

Deputy Prime Minister Dmitry Grigorenko

The formation of data sets for artificial intelligence is carried out within the framework of the Unified Information Platform of the National Data Management System. In 2023, the first 10 sets with marked data were formed, and in November 2024, another 40 sets were added as part of the federal project "Artificial Intelligence."

According to the publication, the Ministry of Digital Development of Russia acts as a data transmission operator and has developed two types of agreements - for state bodies and for individuals and legal entities. The documents define the tasks of information exchange, the terms of data transfer, the period of information use, as well as guarantees for the targeted use of data sets.

Agreements will be concluded with regional departments and their subordinate organizations engaged in introducing artificial intelligence in their regions. The governments of the Tyumen and Lipetsk regions have already sent requests for data sets.[2]

Russian scientists trained the neural network to correct the errors of quantum computers

Researchers at MISIS University have created a neural network-based system that learns to find and correct errors in quantum computing. The development combines the advantages of intelligent and classical algorithms, so it recognizes more effectively the errors that arise as the number of qubits - the "building blocks" of quantum processors - grows. The university announced this on February 14, 2025. Read more here.

2024

Russian scientists have created neural networks to detect AI-generated inserts in scientific texts

A team of researchers including Alexander Shirnin of the Higher School of Economics has created two models for detecting AI-generated passages in scientific texts. The AIpom system couples two types of models - a decoder and an encoder - which lets it find generated inserts more effectively. The Papilusion system is suited to recognizing corrections made with synonyms and short paraphrases generated by a neural network; it uses models of a single type - encoders. In the future, such models will help check the originality and reliability of scientific publications. HSE announced this on December 6, 2024. Read more here.

SteadyControl has patented a method of continuous further training of neural networks for business

The Eurasian Patent Organization has confirmed SteadyControl's exclusive right to a method of continuous further training of neural networks for business. The solution involves continuously retraining neural networks: experts raise the accuracy of the AI for high-quality analysis of business processes. The patent is valid in Russia and 7 neighboring states. The company announced this on December 4, 2024. Read more here.

Russia has created the world's first open AI environment for rapid in-context reinforcement learning

On November 29, 2024, it became known that Russian scientists from the T-Bank AI Research laboratory and the AIRI Institute, in collaboration with students of MIPT, Skoltech and Innopolis, had developed XLand-MiniGrid, the first open virtual environment for research on in-context reinforcement learning of artificial intelligence. Read more here.

The Ministry of Industry and Trade of the Russian Federation buys AI servers for 665 million rubles for training neural networks

On November 11, 2024, the Federal Center for the Applied Development of Artificial Intelligence (FCPR AI), which provides support for the digital transformation of the Ministry of Industry and Trade of Russia, announced a competition for the purchase of server and telecommunications equipment for training neural networks. The initial cost of the contract is about 665 million rubles. Read more here.

The Pixtral Large neural network with a search engine is presented, more powerful than GPT-4

In mid-November 2024, the French startup Mistral introduced the Pixtral Large neural network, which can compete with GPT-4. The neural network, which underlies the free chatbot Le Chat, is capable of generating images, conducting web searches and serving as an interactive "canvas." Read more here.

In Russia, a technology has been created that reduces neural network training time by 15-20 times

On September 9, 2024, Cognitive Pilot announced the development of a technology that automatically corrects neural network errors and improves training efficiency by up to 40%. The company jokingly calls the technology Cognitive Neural Network Hospital, for its ability to "cure the sore spots" of a neural network. For example, when new traffic light data was added to the training sample, Cognitive Neural Network Hospital raised recognition accuracy from 99.3% to 99.99%. The technology also reduces training time (including data selection) by 15 to 20 times. Read more here.

Russia has released ReBased technology for working with long text. It will help launch commercial neural networks faster

Russian scientists from the T-Bank AI Research laboratory have developed a new technology, ReBased, for accelerated processing of long texts by artificial intelligence. The innovation will significantly reduce the cost of using AI for text processing with almost no loss of quality, the T-Bank press service reported in August 2024. Read more here.

Reinforcement learning allowed generative flow networks to work better

Scientists at the AI Center and the Institute of Artificial Intelligence and Digital Sciences of the HSE Faculty of Computer Science applied classical reinforcement learning algorithms to tuning generative flow networks (GFlowNets). This improved the performance of GFlowNets, which for three years have been used to solve the most difficult scientific problems at the stages of modeling, hypothesis generation and experimental design. HSE announced this on June 13, 2024.

"The GFlowNets mechanism can be described using the example of a Lego set: given an unfinished object and the set of available parts, the model tries to predict where, and with what probability, a part should be added so that in the end we most likely assemble a good model of a car or a ship," said Nikita Morozov, research intern at the Centre for Deep Learning and Bayesian Methods at the Institute of Artificial Intelligence and Digital Sciences of the HSE Faculty of Computer Science.

Reinforcement Learning (RL) is one of the machine learning paradigms, in which an agent learns to interact with an environment so as to maximize a reward function. The classic example of a model built on reinforcement learning is AlphaGo, a program that defeated a professional player at the board game Go.

Generative flow networks and reinforcement learning are similar in that both receive a reward function as the training signal. However, a GFlowNet does not try to maximize the reward; it learns to generate objects with probabilities proportional to the reward.
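The contrast between the two objectives can be shown on a toy discrete space. This is only an illustration of the difference in goals, not of GFlowNet training itself:

```python
import numpy as np

rng = np.random.default_rng(0)
objects = ["A", "B", "C", "D"]
reward = np.array([10.0, 5.0, 4.0, 1.0])

# RL-style objective: maximize reward -> always pick the single best object.
best = objects[int(np.argmax(reward))]          # "A"

# GFlowNet-style objective: sample objects with probability proportional
# to their reward, yielding diverse high-reward candidates.
p = reward / reward.sum()                       # [0.5, 0.25, 0.2, 0.05]
samples = rng.choice(objects, size=1000, p=p)   # ~500 "A", ~250 "B", ...
```

Sampling proportionally to reward is what makes GFlowNets useful for tasks such as molecule generation, where many distinct good candidates matter more than one best answer.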

Scientists from the AI Center and the Institute of Artificial Intelligence and Digital Sciences of the HSE Faculty of Computer Science have shown that the problem of training generative flow networks can be brought as close as possible to the general reinforcement learning problem, and applied specialized reinforcement learning methods to generating discrete objects, such as molecular graphs.

"We have shown that classic reinforcement learning algorithms work for GFlowNets comparably to, and even more efficiently than, the known modern approaches developed specifically for training these models. For instance, in the task of modeling drug molecules with given properties, our method generated 30% more high-quality molecules during training than existing methods," said Alexey Naumov, scientific director of the AI Center and director of fundamental research at the Institute of Artificial Intelligence and Digital Sciences of the HSE Faculty of Computer Science.

The researchers emphasize that the ability to use existing reinforcement learning methods to train GFlowNets directly, without further adapting them, will accelerate the arrival of new methods in medicinal chemistry, materials science, energy, biotechnology and the many other fields where GFlowNets have found application over their three years of existence.

The study was supported by a grant for research centers in the field of artificial intelligence provided by the Analytical Center under the Government of the Russian Federation.

Scientists have taught artificial intelligence to process sequences with a length of two million tokens

A group of Russian scientists from the Moscow Institute of Physics and Technology, the AIRI Institute of Artificial Intelligence and the London Institute for Mathematical Sciences has proposed a method for processing big data that allows artificial intelligence to generate answers to questions over texts of up to 2 million tokens. MIPT announced this on May 31, 2024.

The proposed method is based on a special mechanism for using language models (algorithms that predict a word, character or phrase from context). Such models underlie modern dialogue systems, search services and voice assistants.

Their software core consists of transformers - universal architectures that help build the right sequence of actions when processing a request and generating a response. In particular, transformers let neural networks perform many tasks at once, which speeds up their work.

"However, models that use standard transformers cannot cope with long texts: their speed drops rapidly as the text grows. As a result, the neural networks reach the limits of their capabilities and produce 'hallucinations' or erroneous answers," explained one of the authors of the work, Aydar Bulatov, developer at the MIPT Laboratory of Neural Systems and Deep Learning.

According to him, to get around this barrier the team of researchers proposed adding a "memory mechanism" to the transformers. The essence of the idea is to divide long input sequences into segments and supply them with additional mechanisms for storing information. These elements serve as "bridges" over which important data is carried from the previous segment to the next one, allowing the language model to keep the entire long text in "memory" across its full length. At the next step, the program can perform various operations on the "learned" text, processing information according to user requests.
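The article gives no implementation details, so the following is only a minimal sketch in the spirit of segment-level recurrence with memory tokens (as in the Recurrent Memory Transformer line of work): trainable memory tokens are prepended to each segment, and the memory written while reading one segment is passed to the next. All module names and sizes are assumptions.

```python
import torch

class SegmentRecurrence(torch.nn.Module):
    """Processes a long input segment by segment; memory tokens act as the
    'bridge' carrying information from the previous segment to the next."""

    def __init__(self, d_model=64, n_memory=4):
        super().__init__()
        self.n_memory = n_memory
        self.memory0 = torch.nn.Parameter(torch.randn(1, n_memory, d_model))
        layer = torch.nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = torch.nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, segments):
        # segments: list of tensors of shape (batch, seg_len, d_model)
        memory = self.memory0.expand(segments[0].size(0), -1, -1)
        outputs = []
        for seg in segments:
            x = torch.cat([memory, seg], dim=1)   # memory + current segment
            h = self.encoder(x)
            memory = h[:, :self.n_memory]         # updated memory for next segment
            outputs.append(h[:, self.n_memory:])
        return torch.cat(outputs, dim=1), memory
```

Because each segment is processed at a fixed length, compute grows linearly with the number of segments rather than quadratically with total text length, which is what makes million-token inputs feasible.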

"First we ran experiments on short sequences of 7 to 15 segments, each of 500 tokens (the basic units of information in language models), and noticed that the quality of data processing does not decrease as the length grows. Then we continued testing the model and reached one million, and then two million tokens. For comparison, that is the volume of all the Harry Potter books," added co-author of the work Yuri Kuratov, researcher at AIRI.

In the course of the work, the scientists also investigated the model's "intellectual" abilities, giving it tasks that required finding the necessary data in long texts, remembering it, and "reasoning" on the basis of what it had learned. The program demonstrated not only the ability to hold arrays of information in "memory," but also skills of "critical thinking" and "writing."

In the future, the authors say, the proposed method will be in demand for developing fast neural network algorithms that process large databases - for example, for quickly translating books, reading program code, studying genomic sequences or predicting new materials.

Russian scientists have found a way to speed up the training of neural networks to navigate space

Researchers from the Higher School of Economics, NITU MISIS and AIRI have found a way to conduct reinforcement learning more effectively for neural networks designed for navigation in space. Using an attention mechanism, they increased the efficiency of a graph neural network by 15%. The results of the study are published in the journal IEEE Access. Representatives of the Higher School of Economics announced this on January 23, 2024. Read more here.

2023

Russian scientists have created an algorithm that teaches AI 4 times faster than global analogues

Scientists at the Tinkoff Research artificial intelligence (AI) laboratory have created an algorithm for training and adapting artificial intelligence. According to the scientists, the method, called ReBRAC (Revisited Behavior-Regularized Actor Critic), trains AI four times faster and 40% better than global counterparts in the field of reinforcement learning (RL), adapting it to new conditions on the go. These results were obtained by testing the algorithm on robotic simulators, representatives of Tinkoff Bank told TAdviser on December 21, 2023. Read more here.

It turned out that neural networks are not able to learn like a human brain due to lack of sleep

At the end of November 2023, American specialists from the Massachusetts Institute of Technology (MIT) released the results of a study that examined how far deep neural networks can imitate the human brain. The scientists concluded that neural networks cannot learn the way humans do, owing to their lack of sleep.

The main task of deep neural networks is to simulate human cognitive abilities. However, the unprecedented complexity of the human brain creates numerous difficulties. In particular, artificial intelligence uses predetermined parameters, which imposes restrictions when it processes unfamiliar or adverse scenarios.

Neural networks are not able to learn like humans due to lack of sleep

Research shows that while deep neural networks have made significant progress, they cannot fully mimic the human brain. In addition, such systems tend to overwrite existing data, a phenomenon known as catastrophic forgetting. This effect slows AI learning. The human brain, by contrast, incorporates new information into existing knowledge. The brain develops rational memory during rest: sleep makes it possible to form associations between objects and pieces of information that at first glance seem unrelated.
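For intuition, here is a deliberately tiny sketch (our own illustration, not from the MIT study) of catastrophic forgetting: a linear model fitted to one task loses it completely after being trained, without any rehearsal, on a conflicting second task.

```python
import numpy as np

w = np.zeros(2)

def sgd(w, x, y, lr=0.1, steps=100):
    """Plain SGD on squared error for a linear model w @ x."""
    for _ in range(steps):
        w = w - lr * (w @ x - y) * x
    return w

x_a, y_a = np.array([1.0, 0.0]), 1.0    # "task A": target +1 on feature 0
x_b, y_b = np.array([1.0, 0.0]), -1.0   # "task B": conflicting target on the
                                        # same feature, so training on B
                                        # overwrites what was learned on A
w = sgd(w, x_a, y_a)
print(abs(w @ x_a - y_a))   # ~0: task A learned
w = sgd(w, x_b, y_b)
print(abs(w @ x_a - y_a))   # ~2: task A forgotten after training only on B
```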

The American researchers propose incorporating artificial sleep cycles into deep neural networks. This approach is expected to mitigate the impact of catastrophic forgetting and increase the effectiveness of training AI models. At the same time, the scientists admit that, despite the progress, neural networks still have a long way to go to achieve parity with human cognitive abilities.[3]

2020: Russian scientists taught artificial intelligence to "see" quantum advantages

Russian scientists from MIPT, FTIAN and ITMO have created a neural network that has learned to predict the behavior of a quantum system by "looking" at the scheme of this system. This was announced on January 16, 2020 by MIPT to TAdviser.

Such a neural network independently finds those solutions that are well suited for demonstrating quantum advantages. This will help researchers develop efficient quantum computers.

A large range of problems in modern science is solved with quantum mechanical calculations - for example, chemical and biological ones: studying chemical reactions or searching for stable molecular structures for industry, medicine, pharmaceuticals and other fields.

Quantum computing is well suited to solving such "quantum" problems exactly, unlike classical computing, which in most cases can solve them only in a cumbersome and approximate way.

Creating quantum computing circuits is a laborious and expensive task, and the resulting devices do not always show "quantum supremacy" - that is, process information faster than an ordinary classical computer. Scientists would therefore like a tool for predicting whether a given circuit will have a quantum advantage or not.

One implementation of quantum computing is the quantum walk. Simplistically, this method can be pictured as a particle moving along a particular network made up of nodes and the links between them. Such networks form the circuit of a quantum system.

If the quantum movement of a particle - its walk - from one node of the network to another turns out to be faster than the classical one, then a device based on such a scheme can be said to show a quantum advantage. Finding networks with a quantum advantage is an important task that experts in the field of quantum walks are working on.
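The classical baseline in this comparison is the random walk's hitting time between the two nodes. A minimal sketch of how one might estimate it (our own illustration; the function and parameter names are assumptions, and the quantum side of the comparison is not shown):

```python
import numpy as np

def classical_hitting_time(adjacency, src, dst, n_runs=2000, max_steps=10_000):
    """Monte Carlo estimate of the mean number of steps a classical random
    walk needs to travel from node src to node dst on the given graph."""
    rng = np.random.default_rng(0)
    times = []
    for _ in range(n_runs):
        node, steps = src, 0
        while node != dst and steps < max_steps:
            neighbors = np.flatnonzero(adjacency[node])
            node = rng.choice(neighbors)   # hop to a uniformly random neighbor
            steps += 1
        times.append(steps)
    return float(np.mean(times))

# Example: a 4-node cycle graph; walking between opposite nodes
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
print(classical_hitting_time(A, src=0, dst=2))   # ~4 steps
```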

The idea of Alexei Melnikov, Leonid Fedichkin and Alexander Alojanets was to replace the expert with machine intelligence: to teach a computer to distinguish networks and to answer the question of for which networks the quantum walk gives an advantage - that is, to discover networks on which it makes sense to build a quantum computer.

The researchers took a neural network that "specializes" in image recognition. The network adjacency matrix and the numbers of the input and output nodes were fed to the program; at the output, the neural network answered whether the quantum walk between these nodes would be faster than the classical one.
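The article does not give the architecture, so the sketch below is only an assumption-laden illustration of the described setup: a small convolutional network reads the adjacency matrix as an "image," takes the input and output node indices, and outputs the probability that the quantum walk between them is faster.

```python
import torch

class AdvantageClassifier(torch.nn.Module):
    """Binary classifier: does the quantum walk from src to dst beat the
    classical one on this graph? Architecture and sizes are assumptions."""

    def __init__(self, n_nodes=16):
        super().__init__()
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(1, 8, kernel_size=3, padding=1), torch.nn.ReLU(),
            torch.nn.Conv2d(8, 8, kernel_size=3, padding=1), torch.nn.ReLU(),
        )
        self.head = torch.nn.Linear(8 * n_nodes * n_nodes + 2, 1)

    def forward(self, adjacency, src, dst):
        # adjacency: (batch, n, n); src, dst: (batch,) node indices
        h = self.conv(adjacency.unsqueeze(1)).flatten(1)
        h = torch.cat([h, src.float()[:, None], dst.float()[:, None]], dim=1)
        return torch.sigmoid(self.head(h))   # P(quantum walk is faster)
```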

"It was not obvious that this approach would work, but it does, and we very successfully taught the computer to predict on its own the quantum advantage in networks of complex structure," says Leonid Fedichkin, associate professor of the Department of Theoretical Physics at the Moscow Institute of Physics and Technology.

"The line between the quantum and classical behavior of systems is often blurred. The highlight of our work was the creation of a special computer vision with which we managed to see this line in the space of networks," explains Alexey Melnikov, researcher at ITMO.

The researchers have created a tool that simplifies the development of computational circuits based on quantum algorithms, whose main applications should be biophotonics and materials science.

For example, quantum walks readily describe the excitation of photosensitive proteins such as rhodopsin or chlorophyll. A protein is, in a sense, a complex molecule that looks like a network. Translated into formal language, the task of understanding what happens to an electron that enters the molecule at some point, how it moves and what excitation it causes is precisely the search for the walk time from one node of the network to another.

It is expected that computing natural processes with quantum walks will be easier to implement than on an architecture of qubits and gates, since the walk is itself a natural physical process.

Notes