
AI training

Content

Main article: Artificial intelligence (AI)

Machine learning

Main article: Machine Learning

Chronicle

2024

A system for training robots from each other without human participation has been launched

In late November 2024, scientists at the University of California, Berkeley unveiled a system that allows robots to exchange skills without human input. The RoVi-Aug platform generates synthetic visual demonstrations across different robot types and camera angles, increasing the versatility of learning. Read more here.

Russia has created the world's first open AI environment for rapid in-context reinforcement learning

On November 29, 2024, it became known that Russian scientists from the T-Bank AI Research laboratory and the AIRI Institute, in collaboration with students of MIPT, Skoltech and Innopolis, had developed XLand-MiniGrid, the first open virtual environment for research on in-context reinforcement learning in artificial intelligence. Read more here.

Open neural network OLMo 2 with 13 billion parameters launched. It supports Russian

On November 26, 2024, the Allen Institute for AI (Ai2) introduced the fully open large language model OLMo 2 (second-generation Open Language Model). The neural network supports, among other languages, Russian. Read more here.

AI models trained on state data in Russia will be checked for threats to national security and defense

The Russian government approved the passport of the federal project "Digital Public Administration," which provides for the creation of a system for vetting artificial intelligence models trained on state data. The adoption of the document, whose implementation is assigned to the FSB, became known on November 27, 2024.

According to Kommersant, it is planned to allocate ₽8.1 billion from the federal budget for the development and implementation of software for analyzing AI models by 2030.

In Russia, AI models trained on state data will be tested for possible threats to national security and defense capability

Dmitry Chernous, head of the MTS AI consulting group, stressed that government data makes it possible to create AI models that better account for the specifics and needs of a particular country or region.

Nadezhda Gavrikova, a machine learning specialist at Jet Infosystems, pointed out the main risks: the possibility of deanonymization and data leakage, as well as unreliable predictions that can lead to distorted recommendations in strategic government decision-making.

In 2025-2026, research into the principles for analyzing AI models will be carried out, and in 2027-2028 the first version of the verification software will be created and implemented. By 2030, it is planned to confirm the safety of five artificial intelligence systems in use.

Anton Averyanov, CEO of the ST IT group of companies, noted the need for enhanced security measures when training AI at public expense and for restricting access to such a product to authorized users only.

Timofey Voronin, Deputy Director of the NTI Big Data Competence Center at Moscow State University, announced the introduction of a new GOST standard from January 1 that fixes requirements for data protection when using artificial intelligence.

The project is being implemented within the framework of the new national project "Data Economy," which is at the final stage of approval. Commercial companies' access to government data is currently closed.[1]

The Ministry of Industry and Trade of the Russian Federation buys AI servers for 665 million rubles for training neural networks

On November 11, 2024, the Federal Center for the Applied Development of Artificial Intelligence (FCPR AI), which provides support for the digital transformation of the Ministry of Industry and Trade of Russia, announced a competition for the purchase of server and telecommunications equipment for training neural networks. The initial cost of the contract is about 665 million rubles. Read more here.

The Pixtral Large neural network with a search engine is presented; it is more powerful than GPT-4

In mid-November 2024, the French startup Mistral introduced the Pixtral Large neural network, which can compete with GPT-4. The neural network, which powers the free chatbot Le Chat, is capable of generating images, performing web searches and serving as an interactive "canvas." Read more here.

"The bubble is deflating a little": Bloomberg learned about OpenAI's and Google's problems with new AI models

In mid-November 2024, it became known that OpenAI, Google and Anthropic had faced difficulties in creating next-generation artificial intelligence models. These difficulties could negatively affect the development of so-called artificial general intelligence (AGI), a system with autonomous self-control, a sufficient degree of self-awareness and the ability to master new skills. Read more here.

In Russia, for the first time in practice, federated machine learning was used. AI models are trained without data transfer

On October 8, 2024, it became known that Yandex, together with the V.P. Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS) and Sechenov University, had successfully applied federated machine learning technology for the first time in Russia. This approach allows organizations to jointly train artificial intelligence models without the need to share sensitive data.

According to the Yandex press service, federated learning is intended for projects with several participants, each of which has its own dataset. The technology makes it possible to train models collectively without transferring the source data to other project participants. This opens up new opportunities for AI collaboration, especially for companies in industries that handle sensitive information, such as finance, medicine and industry.

Federated machine learning was used in Russia. AI models are trained without data transfer

As part of the project, a neural network was created to detect atrial fibrillation from electrocardiograms. Two independent datasets were used for training: one from Sechenov University and one from the ISP RAS. Each partner trained the model on its own side and then transferred the results to a common loop without disclosing the original data.

The technical implementation of the project was carried out by experts from Yandex Cloud's technology center together with ISP RAS engineers. Yandex Cloud developed the implementation phases, proposed a technology stack and created a unified training environment. ISP RAS adapted the model for an open federated learning framework, and Sechenov University provided an expert assessment of the model's quality.

In the future, federated machine learning technology will become available to Yandex Cloud customers. This will allow organizations that previously could not cooperate because of the risks of transferring sensitive data to take part in joint projects. The approach will not only improve the quality of the final models by increasing the amount of training data, but also simplify cross-border cooperation.[2]
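The core idea behind such setups can be illustrated with federated averaging (FedAvg), the standard technique in this field: each participant trains on its own data and only model weights are exchanged. The sketch below is a toy illustration, not Yandex's actual implementation; the "training" step and the two-site setup are simplified stand-ins.

```python
# Toy sketch of federated averaging (FedAvg). Model weights are plain
# lists of floats; local_update is a stand-in for real gradient descent.

def local_update(weights, data, lr=0.1):
    """One local training step: nudge each weight toward the mean of
    the participant's private data (illustrative, not a real loss)."""
    target = sum(data) / len(data)
    return [w - lr * (w - target) for w in weights]

def fed_avg(weight_sets):
    """Server step: average the weights received from all participants.
    Only weights are exchanged; raw data never leaves a participant."""
    n = len(weight_sets)
    return [sum(ws[i] for ws in weight_sets) / n
            for i in range(len(weight_sets[0]))]

# Two participants with private datasets (e.g. two medical institutions).
global_model = [0.0, 0.0]
site_a_data = [1.0, 2.0, 3.0]   # stays at site A
site_b_data = [5.0, 6.0, 7.0]   # stays at site B

for _ in range(50):  # communication rounds
    updates = [local_update(global_model, site_a_data),
               local_update(global_model, site_b_data)]
    global_model = fed_avg(updates)

print(global_model)  # converges toward the mean of both datasets (4.0)
```

The design point is that the server only ever sees weight vectors, so each site's records stay local, which is exactly why the approach suits finance and medicine.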

OpenAI released AI models with the ability to reason

In mid-September 2024, OpenAI released the new o1 AI model, which the developers say shows excellent results in complex reasoning, outperforming humans on math, coding and science tests. Read more here.

Russia has released ReBased technology for working with long text. It will help launch commercial neural networks faster

Russian scientists from the T-Bank AI Research laboratory have developed a new technology, ReBased, for accelerated processing of long texts by artificial intelligence. The innovation will significantly reduce the cost of using AI for text processing with almost no loss in quality, the T-Bank press service reported in August 2024. Read more here.

Linux Foundation Launches Free Open AI Model Project

On August 12, 2024, the nonprofit Linux Foundation announced the Open Model Initiative (OMI) project. It aims to promote the creation and adoption of high-quality artificial intelligence models with open licenses. Read more here.

The world's largest open AI model has been released. It has 405 billion parameters

On July 23, 2024, Meta (recognized as an extremist organization; its activities are prohibited in the Russian Federation) announced the release of the world's largest open artificial intelligence model, Llama 3.1. It has 405 billion parameters and is said to surpass GPT-4o and Anthropic's Claude 3.5 Sonnet on some benchmarks. Read more here.

8 billion parameters, faster than GPT-3.5. The most powerful open Russian-language AI model has been released

In July 2024, T-Bank announced the release of T-lite, the most powerful open Russian-language model. It is designed for building AI solutions in data analysis, search and chatbot development. Read more here.

Global ranking of the most powerful open source AI models released

On June 26, 2024, the American company Hugging Face, which develops tools for creating applications using machine learning, published a global ranking of the most powerful open source AI models. A model from the Qwen family of the Chinese company Alibaba tops the list.

Open source large language models (LLMs) contribute to the development of AI and accelerate innovation. Thanks to their openness, developers can adapt the models to their own tasks. In addition, open LLMs provide greater AI transparency and lower entry barriers for individuals and companies implementing projects related to artificial intelligence.

Rating of the most powerful open source AI models published

The new Hugging Face rating is based on the results of six benchmarks: MMLU-Pro (Massive Multitask Language Understanding Pro), GPQA (Google-Proof Q&A), MuSR (Multistep Soft Reasoning), MATH (Mathematics Aptitude Test of Heuristics), IFEval (Instruction Following Evaluation) and BBH (Big Bench Hard). First on the list is Alibaba's Qwen/Qwen2-72B-Instruct model with 72 billion parameters, recognized as the best for its "efficiency in mathematics, reasoning and knowledge."

Second place in the ranking went to the meta-llama/Meta-Llama-3-70B-Instruct model, developed by Meta (recognized as an extremist organization; its activities are prohibited in the territory of the Russian Federation). Microsoft's microsoft/Phi-3-medium-4k-instruct rounds out the top three. Overall, the Top 10 is as follows:

  1. Qwen/Qwen2-72B-Instruct;
  2. meta-llama/Meta-Llama-3-70B-Instruct;
  3. microsoft/Phi-3-medium-4k-instruct;
  4. 01-ai/Yi-1.5-34B-Chat;
  5. CohereForAI/c4ai-command-r-plus;
  6. abacusai/Smaug-72B-v0.1;
  7. Qwen/Qwen1.5-110B;
  8. Qwen/Qwen1.5-110B-Chat;
  9. microsoft/Phi-3-small-128k-instruct;
  10. 01-ai/Yi-1.5-9B-Chat.[3]
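A leaderboard of this kind is typically produced by averaging each model's normalized scores across the benchmarks and sorting. The sketch below illustrates that aggregation; the numeric scores are made up for illustration, and only the benchmark names and model identifiers come from the ranking above.

```python
# Hedged sketch of multi-benchmark leaderboard aggregation: average
# each model's per-benchmark scores, then sort descending.
# Scores are invented for illustration; they are NOT the real
# Hugging Face leaderboard numbers.

BENCHMARKS = ["MMLU-Pro", "GPQA", "MuSR", "MATH", "IFEval", "BBH"]

scores = {
    "Qwen/Qwen2-72B-Instruct":              [48.9, 17.8, 19.7, 35.1, 79.9, 57.5],
    "meta-llama/Meta-Llama-3-70B-Instruct": [46.7, 15.7, 18.3, 23.3, 80.2, 50.2],
    "microsoft/Phi-3-medium-4k-instruct":   [40.8, 11.9, 21.0, 19.4, 64.2, 49.4],
}

def rank(score_table):
    """Return (model, average score) pairs, best model first."""
    averaged = {m: sum(s) / len(s) for m, s in score_table.items()}
    return sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)

for place, (model, avg) in enumerate(rank(scores), start=1):
    print(f"{place}. {model}: {avg:.1f}")
```

Averaging across diverse benchmarks is a deliberate design choice: it rewards models that are consistently strong rather than ones that excel at a single task.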

Open AI model released to generate code in 80 programming languages

At the end of May 2024, the French company Mistral AI confirmed the launch of Codestral, a new open AI model and its first large language model (LLM) designed to help developers write code. Read more here.

SberDevices Releases Open AI Model of Machine Learning for Speech and Emotion Recognition

In early April 2024, SberDevices introduced a set of open source machine learning models for speech and emotion recognition. The development, available to everyone for free, is called GigaAM (Giga Acoustic Model). Read more here.

Anthropic, founded by former OpenAI employees, has released a language model. It turned out to be more powerful than Google's and OpenAI's systems

On March 4, 2024, Anthropic, founded by former OpenAI employees, announced the Claude 3 family of artificial intelligence models. They are said to surpass counterparts from both OpenAI itself and Google. Read more here.

Appearance of Small Language Models (SLMs)

By February 2024, many people had already experienced the power of large language models (LLMs), including by using ChatGPT to answer difficult questions. These models are so large that they can require significant computing resources to run, so the emergence of small language models (SLMs) became a big sensation.

SLMs are still fairly large, with several billion parameters (unlike the hundreds of billions of parameters in LLMs), but they are small enough to run offline on a phone. Parameters are the variables, or tunable elements, that determine a model's behavior.

"Small language models can make AI more accessible because of their size and cheapness," says Sébastien Bubeck, who heads the Machine Learning Foundations group at Microsoft Research. "At the same time, we are discovering new ways to make them as powerful as large language models."

Microsoft researchers have developed and released two SLMs, Phi and Orca, which in some areas perform no worse than, or even better than, large language models, refuting the notion that scale is necessary for performance.

Unlike LLMs trained on vast amounts of data from the internet, these more compact models use curated, high-quality training data, and researchers are finding new thresholds of size and performance. In 2024, we can expect improved models designed to drive innovation.
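The size gap between SLMs and LLMs comes down to parameter count, which for a transformer can be estimated from a few configuration numbers. The sketch below uses a simplified counting formula and invented configurations; it is an illustration of the scale difference, not the spec of any released model.

```python
# Back-of-envelope parameter count for a decoder-only transformer,
# illustrating the SLM vs. LLM size gap. The formula is simplified
# (biases and layer norms ignored) and the configs are illustrative,
# not real model specs.

def transformer_params(n_layers, d_model, vocab_size, d_ff_mult=4):
    """Rough count: attention projections (4*d^2) plus a two-matrix
    MLP (2 * d * d_ff) per layer, plus the token embedding table."""
    per_layer = 4 * d_model**2 + 2 * d_model * (d_ff_mult * d_model)
    return n_layers * per_layer + vocab_size * d_model

# A phone-sized SLM config vs. a data-center LLM config (hypothetical).
slm = transformer_params(n_layers=32, d_model=2560, vocab_size=32_000)
llm = transformer_params(n_layers=80, d_model=8192, vocab_size=128_000)

print(f"SLM: ~{slm / 1e9:.1f}B parameters")   # a few billion
print(f"LLM: ~{llm / 1e9:.1f}B parameters")   # tens of billions
```

With these example settings the count lands at a few billion parameters for the small config and tens of billions for the large one, matching the "several billion vs. hundreds of billions" contrast described above.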

The emergence of multimodal AI that understands information not only from text, but also from images, audio and video

Most large language models (LLMs) can handle only one type of data, text, but multimodal models such as Google Gemini or Microsoft Copilot can understand information from different types of data: text, images, audio and video. This capability makes technologies, from search tools to creative applications, richer, more accurate and more seamless.

You can ask Copilot what is happening in an uploaded image, thanks to a multimodal model that can process images, natural language and Bing search data. Copilot can generate, for example, relevant information about the historical significance of a monument in your photo.

Multimodal AI is also used in Microsoft Designer, a graphic design app that can generate images based on a description of what you want. It also allows you to create custom neural voices, natural-sounding voices that are useful for reading texts aloud and in tools for people with speech disorders.

Google has released a model available to everyone for training artificial intelligence

On February 21, 2024, Google announced the open source artificial intelligence models Gemma 2B and Gemma 7B, which anyone can use. They can solve tasks such as document analysis, chatbot creation, etc. Read more here.

The world's first open source model supporting 100 languages for training artificial intelligence has been released

On February 13, 2024, Cohere for AI, a non-profit research laboratory created by Cohere in 2022, unveiled an open large language model (LLM) called Aya. It is said to be the first solution of its class to support more than 100 languages. Read more here.

2023

Global Large Language Models Market Growth to $4.6 Billion

In 2023, spending in the global large language model (LLM) market reached $4.6 billion. The industry is developing rapidly amid the fast adoption of artificial intelligence technologies, including generative AI (GenAI), in various fields. Market trends are addressed in the Market Research Future survey published in late November 2024.

LLMs can efficiently perform a wide range of tasks, including answering questions, summarizing documents, translating languages and drafting text. AI models are used for automation, customer support and data analysis, which stimulates innovation and improves business operations. Analysts highlight several key uses for LLMs:

  • Supply chain management - AI models are transforming management processes, providing greater predictability and control over supply and demand. GenAI helps in selecting suppliers, analyzing financial data and studying the market;
  • Answering questions - LLMs can provide specific information in many areas, such as customer service, healthcare and education, and extend the capabilities of virtual assistants;
  • Search - LLMs are used to improve the quality of search results, providing users with more relevant and accurate information;
  • Social media - LLMs are transforming content creation and generation processes. Using clustering, LLMs can classify texts with similar meanings;
  • Job transformation - LLMs are changing the established order of things, including in the workplace. AI models reduce the number of monotonous, repetitive tasks, easing the burden on employees and reducing the impact of human error.

In general, the need for advanced natural language processing (NLP) and analytics of huge amounts of data in various fields is driving the growth of the large language model market. The key players in the industry in question are:

By deployment type, the market is divided into on-premises and cloud LLMs. On-premises solutions generate higher revenue because they offer more options for configuration and control over infrastructure. In addition, some organizations, particularly in the financial sector and healthcare, choose on-premises products for security and privacy reasons.

At the same time, the cloud segment demonstrates the highest growth rates. Cloud platforms provide scalability, flexibility and cost-effectiveness, leading organizations to increasingly adopt these services for a range of tasks. Using cloud services, enterprises can access AI models and real-time data from anywhere in the world.

Regionally, North America led the way in 2023: many leading LLM developers are concentrated there, as well as major cloud providers such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud.

At the end of 2024, revenue in the global LLM market is estimated at $6.1 billion. Market Research Future analysts believe that the CAGR (compound annual growth rate) will be 34.2% going forward. As a result, global spending could reach about $64.9 billion by 2032.[4]
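The quoted figures can be sanity-checked with the standard CAGR formula, future = present * (1 + cagr) ** years. Starting from the $6.1 billion 2024 estimate and growing at 34.2% per year over the eight years to 2032:

```python
# Sanity check of the Market Research Future projection using the
# compound annual growth rate (CAGR) formula:
#   future = present * (1 + cagr) ** years

base_2024 = 6.1        # $ billion, 2024 market estimate
cagr = 0.342           # 34.2% per year
years = 2032 - 2024    # forecast horizon

projected_2032 = base_2024 * (1 + cagr) ** years
print(f"Projected 2032 market: ${projected_2032:.1f}B")
```

The result lands close to the $64.9 billion quoted by the analysts, so the three numbers in the report are mutually consistent.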

Russian scientists have created an algorithm that teaches AI 4 times faster than global analogues

Scientists at Tinkoff Research's artificial intelligence laboratory have created an algorithm for training and adapting artificial intelligence. According to the scientists, the method, called ReBRAC (Revisited Behavior Regulated Actor Critic), trains AI four times faster and 40% better than global counterparts in reinforcement learning (RL), adapting it to new conditions on the fly. These results were obtained by testing the algorithm on robotic simulators, Tinkoff Bank representatives told TAdviser on December 21, 2023. Read more here.

Notes