AI training

Main article: Artificial intelligence (AI)

Machine learning

Main article: Machine Learning

Chronicle

2024

Federated machine learning used in practice in Russia for the first time. AI models are trained without data transfer

On October 8, 2024, it became known that Yandex, together with the V.P. Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS) and Sechenov University, had successfully applied federated machine learning technology for the first time in Russia. This approach allows organizations to jointly train artificial intelligence models without having to share sensitive data.

According to the Yandex press service, federated learning is intended for projects with several participants, each of which has its own dataset. The technology makes it possible to train models collectively without transferring the source data to the other project participants. This opens up new opportunities for AI collaboration, especially for companies in industries that handle sensitive information, such as finance, medicine and manufacturing.

Within the framework of the project, a neural network was created to detect atrial fibrillation from electrocardiograms. Two independent datasets were used for training: one from Sechenov University and one from ISP RAS. Each partner trained the model on its own side and then transferred the results to a shared training loop without disclosing the original data.
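The scheme described above can be illustrated with a minimal federated averaging (FedAvg) sketch in Python. Everything below is hypothetical: the toy ECG classifier, the partner data loaders and the single-process simulation stand in for whatever stack and model the project actually used.

import copy
import torch
import torch.nn as nn

# Toy stand-in for the atrial fibrillation detector; the real model,
# features and data are not public, so this is purely illustrative.
class ECGClassifier(nn.Module):
    def __init__(self, n_features: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, 2)
        )

    def forward(self, x):
        return self.net(x)

def local_update(global_model, data_loader, epochs=1, lr=1e-3):
    """Train a copy of the global model on one partner's private data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    # Only the updated weights leave the partner's side, never raw ECGs.
    return model.state_dict()

def federated_round(global_model, partner_loaders):
    """One round: every partner trains locally, then weights are averaged."""
    states = [local_update(global_model, dl) for dl in partner_loaders]
    avg_state = {
        key: torch.stack([s[key] for s in states]).mean(dim=0)
        for key in states[0]
    }
    global_model.load_state_dict(avg_state)
    return global_model

In a real deployment each call to local_update would run on a different organization's infrastructure, and only the weight tensors would cross the boundary, which is exactly the property that lets the partners cooperate without exchanging patient records.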

The technical implementation of the project was carried out by specialists from the Yandex Cloud Center for Technologies for Society together with ISP RAS engineers. Yandex Cloud designed the implementation phases, proposed the technology stack and created a unified training environment. ISP RAS adapted the model to an open federated learning framework, and Sechenov University provided an expert assessment of the model's quality.

In the future, federated machine learning technology will become available to Yandex Cloud customers. This will allow organizations that previously could not cooperate because of the risks associated with transferring sensitive data to take part in joint projects. The approach will not only improve the quality of the final models by increasing the amount of training data, but also simplify cross-border cooperation.[1]

OpenAI released AI models with the ability to reason

In mid-September 2024, OpenAI released the new o1 AI model, which, according to the developers, shows excellent results in complex reasoning, outperforming humans in math, coding and science tests. Read more here.

Russia has released ReBased technology for working with long text. It will help launch commercial neural networks faster

Russian scientists from the T-Bank AI Research laboratory have developed a new technology, ReBased, for accelerated processing of long texts by artificial intelligence. The innovation will significantly reduce the cost of using AI for text processing with almost no loss of quality, the T-Bank press service reported in August 2024. Read more here.

Linux Foundation Launches Free Open AI Model Project

On August 12, 2024, the nonprofit Linux Foundation announced the Open Model Initiative (OMI) project. It aims to promote the creation and adoption of high-quality artificial intelligence models with an open license. Read more here.

The world's largest open AI model has been released. It has 405 billion parameters

On July 23, 2024, Meta (recognized as an extremist organization; its activities are prohibited in the Russian Federation) announced the release of the world's largest open artificial intelligence model, Llama 3.1. It has 405 billion parameters and is said to surpass GPT-4o and Anthropic's Claude 3.5 Sonnet in some benchmarks. Read more here.

8 billion parameters, faster than ChatGPT 3.5. The most powerful open Russian language AI model has been released

In July 2024, T-Bank announced the release of T-lite, the most powerful open Russian-language model. It is designed for building AI solutions in data analysis, search and chatbot development. Read more here.

Global ranking of the most powerful open source AI models released

On June 26, 2024, the American company Hugging Face, which develops tools for building machine learning applications, published a global ranking of the most powerful open source AI models. The list is topped by a model from the Qwen family developed by the Chinese company Alibaba.

Open source large language models (LLMs) contribute to the development of AI and accelerate innovation. Thanks to their openness, developers can adapt the models to their own tasks. Open LLMs also make AI more transparent and lower the entry barriers for individuals and companies implementing projects related to artificial intelligence.
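As an illustration of what openness means in practice, here is a minimal sketch of running an open instruction-tuned model locally with the Hugging Face transformers library. The checkpoint name is taken from the ranking below purely as an example; it needs multiple GPUs, and any smaller open model can be substituted for experimentation.

from transformers import AutoModelForCausalLM, AutoTokenizer

# A minimal sketch of running an open instruction-tuned model locally.
# Qwen/Qwen2-72B-Instruct tops the ranking but is very large; any
# smaller open checkpoint can be swapped in for experimentation.
model_name = "Qwen/Qwen2-72B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [{"role": "user", "content": "Explain federated learning in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Because the weights are published, the same model can also be fine-tuned on proprietary data, which is precisely the kind of adaptation that closed models do not allow.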

The new Hugging Face ranking is based on the results of six benchmarks: MMLU-Pro (Massive Multitask Language Understanding Pro), GPQA (Google-Proof Q&A), MuSR (Multistep Soft Reasoning), MATH (Mathematics Aptitude Test of Heuristics), IFEval (Instruction Following Evaluation) and BBH (Big Bench Hard). First place in the list goes to Alibaba's Qwen/Qwen2-72B-Instruct model with 72 billion parameters. It is recognized as the best for "efficiency in mathematics, long-range reasoning and knowledge."

Second place in the ranking went to the meta-llama/Meta-Llama-3-70B-Instruct model, developed by Meta (recognized as an extremist organization; its activities are prohibited in the Russian Federation). The top three is rounded out by Microsoft's microsoft/Phi-3-medium-4k-instruct. Overall, the top 10 is as follows:

  1. Qwen/Qwen2-72B-Instruct;
  2. meta-llama/Meta-Llama-3-70B-Instruct;
  3. microsoft/Phi-3-medium-4k-instruct;
  4. 01-ai/Yi-1.5-34B-Chat;
  5. CohereForAI/c4ai-command-r-plus;
  6. abacusai/Smaug-72B-v0.1;
  7. Qwen/Qwen1.5-110B;
  8. Qwen/Qwen1.5-110B-Chat;
  9. microsoft/Phi-3-small-128k-instruct;
  10. 01-ai/Yi-1.5-9B-Chat.[2]

Open AI model released to generate code in 80 programming languages

At the end of May 2024, the French company Mistral AI confirmed the launch of Codestral, a new open AI model and the company's first large language model (LLM) designed to help developers write code. Read more here.

SberDevices Releases Open Machine Learning Models for Speech and Emotion Recognition

In early April 2024, SberDevices introduced a set of open source machine learning models for speech and emotion recognition. The development, available to everyone free of charge, was called GigaAM (Giga Acoustic Model). Read more here.

Anthropic, founded by former OpenAI employees, has released a language model for AI training. It turned out to be more powerful than Google and OpenAI systems

On March 4, 2024, Anthropic, founded by former OpenAI employees, announced the Claude 3 family of artificial intelligence models. They are said to surpass counterparts from both OpenAI itself and Google. Read more here.

Appearance of Small Language Models (SLMs)

By February 2024, many people had already experienced the power of large language models (LLMs), using ChatGPT and similar systems to answer difficult questions. These models are so large that they can require significant computing resources to run, so the emergence of small language models (SLMs) became a big sensation.

SLMs are still fairly large, with several billion parameters, unlike the hundreds of billions of parameters in LLMs, but they are small enough to run offline on a phone. Parameters are the variables, or tunable elements, that determine the behavior of a model.
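To make the notion of a parameter count concrete, here is a minimal sketch that counts the trainable parameters of a model in PyTorch; the tiny network is purely illustrative and is not any particular SLM.

import torch.nn as nn

# Parameters are the trainable weights that define a model's behavior,
# and "model size" is simply how many of them there are.
model = nn.Sequential(
    nn.Embedding(32_000, 512),   # token embeddings: 32,000 * 512 weights
    nn.Linear(512, 2048),
    nn.ReLU(),
    nn.Linear(2048, 512),
)

n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # ~18.5 million for this toy stack

An SLM has a few billion such values, which fits in a phone's memory; an LLM has hundreds of billions, which is why it usually runs in a data center.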

"Small language models can make AI more accessible because of their size and cheapness," says Sebastien Bubek, who heads the Machine Learning Foundations group at Microsoft Research. "At the same time, we are discovering new ways to make them as powerful as large language models."

Microsoft researchers have developed and released two SLMs, Phi and Orca, which in some areas work no worse than, or even better than, large language models, refuting the opinion that scale is necessary for performance.

Unlike LLMs trained on vast amounts of data from the internet, these more compact models use curated, high-quality training data, and researchers are finding new trade-offs between size and performance. In 2024, we can expect improved models designed to drive innovation.

The emergence of multimodal AI that understands information not only from text, but also from images, audio and video

Most large language models (LLMs) can only handle one type of data, text, but multimodal models such as Google Gemini or Microsoft Copilot can understand information from different types of data: text, images, audio and video. This capability makes technologies from search tools to creative applications richer, more accurate and more seamless.

In Copilot, you can ask what is happening in an uploaded image, thanks to a multimodal model that can process images, natural language and Bing search data. Copilot can generate, for example, relevant information about the historical significance of a monument in your photo.
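The same image-to-text step can be reproduced with open tools. Below is a minimal sketch using an open image-captioning model via the Hugging Face transformers pipeline; the checkpoint name is just one possible choice and is not what Copilot itself uses.

from transformers import pipeline

# A minimal sketch of multimodal input: an image goes in, text comes out.
# BLIP is one open image-captioning model; Copilot's own stack is different.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("monument.jpg")   # local file path or URL of the photo
print(result[0]["generated_text"])   # e.g. a one-sentence description

A production assistant would then feed such a description, together with search results, into a language model to produce the kind of contextual answer described above.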

Multimodal AI is also used in Microsoft Designer, a graphic design app that can generate images based on a description of what you want. The technology also makes it possible to create custom neural voices, natural-sounding synthetic voices that are useful for reading texts aloud and for tools for people with speech disorders.

Google has released models for AI training that are available to everyone

On February 21, 2024, Google announced the open source artificial intelligence models Gemma 2B and Gemma 7B, which anyone can use. They can solve tasks such as document analysis, chatbot creation and more. Read more here.

The world's first open source model for AI training with support for 100 languages has been released

On February 13, 2024, Cohere for AI, a non-profit research laboratory created by Cohere in 2022, unveiled an open large language model (LLM) called Aya. It is said to be the first solution of its class to support more than 100 languages. Read more here.

2023: Russian scientists have created an algorithm that trains AI 4 times faster than global analogues

Scientists at Tinkoff Research's artificial intelligence (AI) laboratory have created an algorithm for training and adapting artificial intelligence. According to the scientists, the method, called ReBRAC (Revisited Behavior Regularized Actor Critic), trains AI four times faster and 40% better than global counterparts in the field of reinforcement learning (RL), adapting it to new conditions on the fly. These results were obtained while testing the algorithm on robotic simulators, Tinkoff Bank representatives told TAdviser on December 21, 2023. Read more here.

Notes