2025/08/27 15:16:13

LLM (Large Language Models)

Content

LAM (Large Action Models)
AI training
Machine learning
Reasoning models: CoT LLM
2025
2024
2023: Global Large Language Models Market Growth to $4.6 Billion
Notes

Main article: Artificial intelligence (AI, Artificial intelligence, AI)

LAM (Large Action Models)

Main article: LAM (Large Action Models)

AI training

Main article: Training artificial intelligence

Machine learning

Main article: Machine learning

Reasoning models: CoT LLM

What are reasoning models? A thought chain (CoT) is a method in which an artificial intelligence model breaks down a task into steps, as if "thinking aloud," before giving an answer. This helps models better deal with tasks that require logic, such as solving mathematical problems or logic puzzles.

2025

Sberbank scientists have proposed a way to combat hallucinations of artificial intelligence models

Sberbank scientists have proposed a way to combat hallucinations of artificial intelligence models. Sberbank announced this on August 26, 2025.

This method improves the accuracy of detecting incorrect responses of large language models by almost 30% when using a small amount of data.

The problem of hallucinations of large language models is one of the most common in the industry. LLMs can generate plausible but false answers. The most modern methods for detecting hallucinations are trainees, they require a large amount of high-quality marked data for training.

Sberbank scientists have proposed a solution. They investigated current methods for detecting artificial intelligence hallucinations and developed metamodels that improve the detection accuracy of false responses by nearly 30%, using just 250 examples for training. This is many times less than other solutions need.

This approach allows companies to significantly save resources on data markup and improve the quality of RAG systems. Scientists and developers get a new tool for analyzing large language models, and users get more accurate answers from AI models.

Шаблон:Quote 'author=noted Gleb Gusev, Director of the Center for Practical Artificial Intelligence of Sberbank.

Russian scientists have proposed a method to increase the reliability of generating requests for working with data

In Russia, a method has been developed to improve the reliability of generating requests for working with data. The Institute of Artificial Intelligence (AIRI) announced this on July 2, 2025. Scientists from the AIRI Institute have improved the work of language models for generating SQL queries, creating a system that helps to more accurately create queries to databases using large language models, as well as evaluate their own confidence in the correctness of the result. Read more here.

OpenAI o3, Gemini 2.5 Pro and DeepSeek R1 are the leaders of the Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index: comparative analysis of neural networks (as of May 2025)

Large language models no longer require powerful servers

Scientists from the Higher School of Economics, Yandex, MIT, KAUST and ISTA made a breakthrough in LLM optimization. The Yandex Research Artificial Intelligence Research Laboratory, together with scientific and technological universities, has developed a method for rapid compression of large language models (LLM) without loss of quality. Now a smartphone or laptop is enough to work with models and you do not need to use expensive servers and powerful GPUs. The HSE announced this on April 11, 2025.

The method allows you to quickly test and implement solutions based on neural networks, save time and money on development. This makes LLM more accessible not only to large, but also to small companies, non-profit laboratories and institutes, individual developers and researchers.

Previously, to run the language model on a smartphone or laptop, it was required to quantize it on an expensive server, which took several weeks. Quantization can now be done directly on your phone or laptop in minutes.

The difficulty in using large language models is that they require significant computing resources. This also applies to open source models. For example, one of them, a popular DeepSeek-R1, does not fit even on expensive servers designed to work with artificial intelligence and machine learning. This means that only a limited range of companies can use large models, even if the model itself is in the public domain.

This method allows you to reduce the size of the model, while maintaining its quality, and run it on more affordable devices. For example, with this method, you can compress even such large models as DeepSeek-R1 by 671 billion parameters and Llama 4 Maverick by 400 billion parameters, which until now have been able to quantize only by the simplest methods with a significant loss in quality.

This method of quantization provides more opportunities for LLM use in various fields, especially those where resources are limited - for example, in education or the social sphere. Now startups and independent developers can use compressed models to create innovative products and services without spending money on expensive equipment. Yandex itself is already using a new method for prototyping - creating working versions of products and quickly checking ideas: compressed models are tested faster than their original versions.

This method is called HIGGS (Hadamard Incoherence with Gaussian MSE-optimal GridS). It allows you to compress neural networks without using additional data and without computationally complex parameter optimization. This is especially useful in situations where there is not enough suitable data for further training the model. The method provides a balance between the quality, size of the model and the complexity of quantization, which allows models to be used on a wide variety of devices.

The method has already been tested on the Llama 3, Llama 4 and Qwen 2.5 models. Experiments have shown that HIGGS is a good way to quantify the ratio of model quality and size among all existing data-free quantification methods, including GPTQ (GPT Quantification) and AWQ (Activation-Aware Quantification).

The HIGGS method is already available to developers and researchers on Hugging Face and GitHub.

Russia has made a breakthrough in optimizing large language models - now they do not need powerful servers

The Yandex Research Laboratory for Artificial Intelligence, together with the HSE, the Massachusetts Institute of Technology (MIT), the Austrian Institute of Science and Technology (ISTA) and the King Abdullah University of Science and Technology (KAUST), has developed a revolutionary method for rapidly compressing large language models (LLM) without loss of quality. Thanks to the new technology, neural networks no longer require expensive servers and powerful GPUs - just a regular smartphone or laptop. This became known on April 11, 2025. Read more here.

Tencent releases' reasoning 'Hunyuan-T1 AI model that beats ChatGPT

At the end of March 2025, the Chinese company Tencent Holdings presented a "reasoning" AI model of the Hunyuan-T1. It outperforms ChatGPT and also competes with DeepSeek R1 in both performance and price. Read more here.

Foxconn launches FoxBrain AI model for manufacturing

Foxconn has launched its own AI model, FoxBrain, to automate manufacturing processes and manage supply chains. The official presentation of the technology took place on March 10, 2025. Read more here.

Almost 50% of tasks related to the development of AI models in the Russian segment will concern LLM

Extending the functionality of Large Language Models (LLMs) is the highest priority application related to models artificial intelligence AI () for 2025. This is evidenced by data an express survey ICT.Moscow conducted in February 2025 among more than 500 representatives -. the Russian AI industries This was Information city (GBU Infogorod) announced on March 6, 2025. More. here

Battle for AI leadership between US and Chinese companies

In 2023, China did not have competitive modifications of LLM. At the end of 2023, the leading Chinese LLM was the Alibaba Qwen Chat 7B, which was much inferior to the GPT-3.5 Turbo.

In early 2024, the introduction of Alibaba Qwen Chat 72B began, which was better than GPT-3.5, but significantly worse than GPT-4, especially inferior in multimodality.

In the summer of 2024, DeepSeek V2 became the leading Chinese LLM, which was also of no significant interest due to a tangible gap in performance and efficiency compared to GPT-4o.

Alibaba Qwen 2 Instruct 72B was introduced almost simultaneously, slightly overtaking DeepSeek V2, but not posing a threat to GPT-4o.

The first wake-up call for the United States was the release of Alibaba Qwen 2.5 Instruct 72B, which managed to equal the GPT-4o and even surpass the advanced American model at that time in some tasks.

At the same time, in September 2024, OpenAI o1-preview was presented, making the first breakthrough in efficiency in 1.5 years. Since March 14, 2023 (release GPT-4) there has been practically no high-quality modification of LLM from OpenAI.

Yes, the contextual window was expanded, multimodality, client interaction, performance were improved, and hallucination was reduced, but for 1.5 years the LLM core remained unchanged (mainly cosmetic modifications).

In December 2024, the DeepSeek V3 was introduced, which was the best Chinese model, winning in all respects against GPT-4o, but behind the expanded o1, which OpenAI integrated in mid-December.

That's where the bitter leadership battle began, Spydell Finance wrote. DeepSeek will launch its flagship R1 model in mid-January 2025, which breaks performance ratings and is the first time in the entire short history of public LLMs that the Chinese have come close to the United States.

In two weeks, OpenAI is implementing o3, breaking ahead again, but not as much as it was in the fall of 2024 at the time of the release of o1.

To understand the logic behind the introduction of a tough export embargo on the supply of chips from Nvidia to China, it is necessary to understand the trajectory of the evolution of Chinese LLMs.

The table shows quite informatively which chips under the embargo and when export control was introduced. All advanced modifications from Nvidia are closed for export to China from the fall of 2023, and restrictions began to be introduced in mid-2022.

Thus, China is moving at a faster pace in the development of LLM than the United States, having incomparably less computing power.

Bloomberg massively advertised DeepSeek to a global audience on January 27, 2025, but not only DeepSeek and Qwen are in the presence of the Chinese.

Moonshot - Kimi 1.5
StepFun – Step R-mini
Baichuan - M1 Preview
Zhipu – GLM Zero Preview
Bytedance - Doubao 1.5 Pro
MiniMax – MiniMax Text-01

• Tencent - Hunyuan Large • Baidu - Ernie 4.0 Turbo

Yi AI - YiLightning.

Each of the presented models (9 in the + DeepSeek and Qwen list) is already stronger or comparable to GPT-4o, i.e. 11 advanced models from the Chinese, and only two are known to the world community so far.

Everything is just beginning, the battle is in full swing.

The graph above also shows a comparison of the top American LLMs. Only once did OpenAI concede the lead when Anthropic unveiled an improved version of Claude 3.5 Sonnet in June 2024. Just six months and Claude 3.5 Sonnet whistles even out of the top three LLMs.

2024

Trained on state AI models in Russia will be checked for threats to national security and defense

The Russian government approved the passport of the federal project "Digital Public Administration," providing for the creation of a system for checking artificial intelligence models trained on state data. The adoption of the document, the responsibility for the implementation of which is assigned to the FSB, became known on November 27, 2024.

According to Kommersant, it is planned to allocate ₽8,1 billion from the federal budget for the development and implementation of software for analyzing AI models by 2030.

In Russia, AI models trained on state data will be tested for possible threats to national security and defense capability

Dmitry Chernous, head of the MTS AI consulting group, stressed that government data allows you to create AI models that better take into account the features and needs of a particular country or region.

Nadezhda Gavrikova, a specialist in machine learning at Jet Infosystems, pointed out the main risks: the possibility of de-denigration and data leakage, as well as the unreliability of predictions that can lead to distorted recommendations when making strategic government decisions.

In 2025-2026, the scientific and research development of the principles for analyzing AI models will be carried out, and in 2027-2028 the first version of the program for their verification was created and implemented. By 2030, it is planned to confirm the safety of the use of five artificial intelligence systems.

Anton Averyanov, General Director of the ST IT Group of Companies, noted the need for enhanced security measures when training AI on public funds and restricting access to such a product only by authorized users.

NTI big data Timofey Voronin, Deputy Director of the Center for Competence at Moscow State University, announced the introduction of a new GOST from January 1, fixing the requirements for data protection when using artificial intelligence.

The project is being implemented within the framework of the new national project "Data Economics," which is at the final stage of approval. Commercial companies' access to government data is currently closed.^[1]

Launched an open neural network OLMo 2 with 13 billion parameters. She supports the Russian language

On November 26, 2024, the Paul Allen Institute of Artificial Intelligence (Ai2) introduced the fully open large language model OLMo 2 (second generation Open Language Model). The neural network supports, among other things, the Russian language. Read more here.

The emergence of multimodal AI that understands information not only from text, but also from images, audio and video

Most large language models (BYMs) can only handle one type of data - text, but multimodal models such as Google Gemini or Microsoft Copilot are able to understand information from different types of data - text, images, audio and video. This capability makes technologies, from search tools to creative applications, richer, more accurate and seamless.

You can find out in Copilot what happens in the uploaded image, thanks to a multimodal model that can process images, natural language and Bing search data. Copilot can generate, for example, relevant information about the historical significance of the monument in your photo.

Multimodal AI is also used in Microsoft Designer, a graphic design app that can generate images based on a description of what you want. It also allows you to create your own neural voices, or natural voices, useful for reading texts and tools for people with speech disorders.

"Bubble deflates a little": Bloomberg learned about OpenAI and Google's problems with new AI models

In mid-November 2024, it became known that OpenAI, Google and Anthropic faced difficulties in creating next-generation artificial intelligence models. These difficulties can negatively affect the development of the so-called general AI (AGI) - a system with autonomous self-control, a sufficient degree of self-awareness and the ability to master new skills. Read more here.

In Russia, for the first time in practice, federated machine learning was used. AI models are trained without data transfer

On October 8, 2024, it became known that Yandex, together with the V.P. Ivannikov Institute for System Programming of the Russian Academy of Sciences and Sechenov University, successfully applied federated machine learning technology for the first time in Russia. This innovative approach allows organizations to jointly train artificial intelligence models without the need to share sensitive data.

According to the press service of Yandex, federated training is intended for projects with several participants, each of which has its own set of data. The technology allows you to collectively train models without transferring the initial data to other project participants. This opens up new opportunities for AI collaboration, especially for companies in sensitive information industries such as finance, medicine and industry.

Federated machine learning was used in Russia. AI models are trained without data transfer

Within the framework of the project, a neural network was created to detect atrial fibrillation according to electrocardiograms. Two independent data sets were used for training: from Sechenov University and from the ISP RAS. Each partner conducted training on his side, after which he transferred the results to a common loop without disclosing the original data.

The technical implementation of the project was carried out by experts from the Technology Center for the Yandex Cloud Society together with engineers from the ISP RAS. Yandex Cloud has developed implementation phases, proposed a technology stack, and created a unified learning environment. ISP RAS adapted the model for an open framework of federal learning, and Sechenov University provided an expert assessment of the quality of the model.

In the future, federated machine learning technology will be available to Yandex Cloud customers. This will allow organizations that previously could not cooperate due to the risks associated with the transfer of sensitive data to participate in joint projects. This approach will not only improve the quality of the final models by increasing the amount of data for training, but also simplify cross-border cooperation.^[2]

OpenAI released AI models with the ability to reason

In mid-September 2024, OpenAI released a new o1 AI model, which the developers say shows excellent results in complex reasoning, outperforming people in math, coding and science tests. Read more here

Linux Foundation Launches Free Open AI Model Project

On August 12, 2024, the nonprofit Linux Foundation announced the Open Model Initiative (OMI) project. It aims to promote the creation and implementation of high-quality models artificial intelligence with an open license. More. here

The world's largest open AI model has been released. It has 405 billion parameters

July 23, 2024 Meta (recognized as an extremist organization; activities in the Russian Federation are prohibited) announced the release of the world's largest open model of artificial intelligence - Llama 3.1. It has 405 billion parameters and is said to surpass GPT-4o and Anthropic Claude 3.5 Sonnet in some characteristics. Read more here

8 billion parameters, faster than ChatGPT 3.5. The most powerful open Russian language AI model has been released

In July 2024, T-Bank announced the release of the most powerful Russian language model T-lite. It is designed to create AI solutions in the field of data analysis, search and development of chatbots. Read more here.

Global ranking of the most powerful open source AI models released

On June 26, 2024, the American company Hugging Face, which develops tools for creating applications using machine learning, announced a global ranking of the most powerful open source AI models. One of the solutions of the Qwen family of the Chinese company Alibaba tops the list.

Large language models (LLMs) open source with contribute to the development of AI and the acceleration of innovation. Thanks to openness, developers are able to adapt models to their tasks. In addition, open LLMs provide greater AI transparency. Plus, entrance barriers for individuals and companies implementing certain projects related to artificial intelligence are reduced.

Rating of the most powerful open source AI models published

The new Hugging Face rating is based on the results of six benchmarks. Это MMLU-Pro (Massive Multitask Language Understanding — Pro), GPQA (Google-Proof Q&A), MuSR (Multistep Soft Reasoning), MATH (Mathematics Aptitude Test of Heuristics), IFEval (Instruction Following Evaluation) и BBH (Big Bench Hard). In first place in the list is the Alibaba Qwen/Qwen2-72B-Instruct model with 72 billion parameters. She is recognized as the best for "efficiency in mathematics, foresight of reasoning and knowledge."

The second place in the ranking went to the meta-llama/Meta-Llama-3-70B-Instruct model, which was developed by Meta (recognized as an extremist organization; activities in the territory of the Russian Federation are prohibited). Closes the top three Microsoft/phi-3-medium-4k-instruct corporations. Overall, Microsoft the Tor-10 is as follows:

Qwen/Qwen2-72B-Instruct;
meta-llama/Meta-Llama-3-70B-Instruct;
microsoft/Phi-3-medium-4k-instruct;
01-ai/Yi-1.5-34B-Chat;
CohereForAI/c4ai-command-r-plus;
abacusai/Smaug-72B-v0.1;
Qwen/Qwen1.5-110B;
Qwen/Qwen1.5-110B-Chat;
microsoft/Phi-3-small-128k-instruct;
01-ai/Yi-1.5-9B-Chat.^[3]

Open AI model released to generate code in 80 programming languages

At the end of May 2024, the French company Mistral AI confirmed the launch of a new open AI model Codestral - the first large language model (LLM) to help developers write code. Read more here.

SberDevices Releases Open AI Model of Machine Learning for Speech and Emotion Recognition

In early April 2024, SberDevices introduced a set of Open Source machine learning models for speech and emotion recognition. The development available to everyone for free was called GigaAM (Giga Acoustic Model). Read more here.

Founded by immigrants from OpenAI, Anthropic has released a language model for AI learning. It turned out to be more powerful than Google and OpenAI systems

On March 4, 2024, Anthropic, founded by immigrants from OpenAI, announced models of artificial intelligence of the Claude 3 family. They are said to surpass the counterparts of both OpenAI itself and Google. Read more here.

Appearance of Small Language Models (MLMs)

By February 2024, many have already experienced the power of large language models (BYM, large language models, LLM), using including ChatGPT to answer difficult questions. These models are so large that they may require significant computing resources to run, so the emergence of small language models (SLM) has become a big sensation.

The MNYs are still large enough and have several billion parameters - unlike the hundreds of billions of parameters in the MNYs - but they are small enough to run offline on the phone. Parameters are variables, or custom elements, that determine the behavior of the model.

"Small language models can make AI more accessible because of their size and cheapness," says Sebastien Bubek, who heads the Machine Learning Foundations group at Microsoft Research. "At the same time, we are discovering new ways to make them as powerful as large language models."

Microsoft scientists have developed and released two MNYs - Phi and Orca, which in some areas work no worse or even better than large language models, refuting the opinion that scale is necessary for performance.

Unlike BYUs trained on vast amounts of data from the internet, more compact models use tailored high-quality training data, and scientists find new thresholds for size and performance. In 2024, we can expect the emergence of improved models designed to promote innovation.

Google has released a model available to everyone for training artificial intelligence

On February 21, 2024, Google announced open source artificial intelligence models Gemma 2B and Gemma 7B, which can be used by everyone. It is possible to solve such problems as document analysis, creation of chat bots, etc. Read more here.

The world's first open source model with support for 100 languages for learning artificial intelligence has been released

On February 13, 2024, Cohere for AI, a non-profit research laboratory created by Cohere in 2022, unveiled an open large language model (LLM) called Aya. This is said to be the first solution in this class to support more than 100 languages. Read more here.

2023: Global Large Language Models Market Growth to $4.6 Billion

In 2023, costs in the global large language model (LLM) market reached $4.6 billion. This industry is rapidly developing against the background of the rapid introduction of artificial intelligence technologies, including generative (GENI), in various fields. Market trends are addressed in the Market Research Future survey published in late November 2024.

LLMs can efficiently perform a wide range of tasks, including answering questions, summarizing documents, language translations, and drafting sentences. AI models are used to automate, support customers and analyze data, which stimulates innovation and improves business operations. Analysts highlight several key uses for LLM:

Supply chain management - AI models serve to transform management processes, providing greater predictability and control over supply and demand. GenI helps in selecting suppliers, analyzing financial data and studying the market;
Answers to Questions - LLMs can be used in many areas to provide certain information, such as in the areas of customer service, health care and education. LLMs extend the capabilities of virtual assistants;
Search - LLMs are used to improve the quality of search results, providing users with more relevant and accurate information;
Social Media - LLMs transform content creation and generation processes. Using clustering, LLMs can classify text with similar meanings or meanings;
Job transformation - LLMs are changing the established order of things, including in workplaces. AI models reduce the number of monotonous and repetitive tasks, reducing the burden on employees and reducing the impact of the human factor.

In general, the need for advanced natural language processing (NLP) and analytics of huge amounts of data in various fields is driving the growth of the large language model market. The key players in the industry in question are:

Google;
OpenAI;
Anthropic;
Meta (recognized as an extremist organization; activities on the territory of the Russian Federation are prohibited);
Microsoft;
Nvidia;
AWS;
IBM;
Oracle;
HPE.

By deployment type, the market is divided into local and cloud LLMs. The first type of solution provides higher revenue because it provides more options in terms of configuration and control over the infrastructure. In addition, some organizations, in particular from the financial sector and healthcare, choose local products for security and privacy reasons. At the same time, the cloud segment demonstrates the highest growth rates. These platforms provide scalability, flexibility, and cost-effectiveness, leading organizations to increasingly implement these services to meet different challenges. Using cloud services, enterprises can access AI models and real-time data from anywhere in the world. Regionally, North America is leading the way in 2023, with many of LLM's leading developers concentrated here, as well as major cloud providers such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud.

At the end of 2024, revenue in the global LLM market is estimated at $6.1 billion. Market Research Future analysts believe that in the future, the CAGR (average annual growth rate in complex percentages) will be 34.2%. As a result, costs globally could reach about $64.9 billion by 2032.^[4]

Notes

Источник — «https://tadviser.com/index.php/Article:LLM_(Large_Language_Models)»

The site content is translated by machine translation software powered by PROMT. The machine-translated articles are not always perfect and may contain errors in vocabulary, syntax or grammar. Read original article
If you find inaccuracies or errors in the results of machine translation, please write to editor@tadviser.ru. We will make every effort to correct them as soon as possible.

Simple Link

How to create a "smart plant": Key characteristics of a modern digital enterprise 11800

Model Studio CS: How to use BIM to give new impetus to the development of the fuel and energy complex 13000