Developers: | Alibaba Group |
System launch date: | August 2023 |
Industries: | Information Technology |
2025: Announcement of Qwen 2.5-Max
On January 29, 2025, Alibaba Cloud, the cloud division of the Chinese corporation Alibaba, introduced the Qwen 2.5-Max large language model. The company claims that the model outperforms DeepSeek V3, a powerful open-source AI model that is itself reported to be ahead of most open and closed counterparts, including ChatGPT.
Qwen 2.5-Max uses the Mixture-of-Experts (MoE) architecture. The model consists of many submodels (experts), each of which specializes in different aspects of the input data or in different types of tasks. Because only a subset of experts is activated for each input, this approach can significantly increase processing speed while also improving the quality of the generated results.
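The routing idea behind an MoE layer can be illustrated with a minimal sketch. This is a generic top-k gating example, not Qwen 2.5-Max's actual implementation, whose router details are not described here. The names `moe_forward`, `make_expert`, and `gate_weights`, the toy "experts" that merely scale their input, and the choice of `top_k=2` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: a toy function that scales its input vector."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs.

    gate_weights holds one row per expert; the gate score of an expert is the
    dot product of its row with x (an assumed, simplified linear router).
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)

    # Only the top_k experts are evaluated; the rest are skipped entirely.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)

    output = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm  # renormalize over the selected experts
        expert_out = experts[i](x)
        output = [o + weight * v for o, v in zip(output, expert_out)]
    return output, chosen


# Usage: four toy experts, of which only two run for this input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]]
mixed, used = moe_forward([1.0, 0.0], gate_weights, experts)
print("experts used:", used)
print("mixed output:", mixed)
```

In this sketch the speed benefit comes from evaluating only `top_k` of the experts per input, so the cost per input grows with `top_k` rather than with the total number of experts.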
Qwen 2.5-Max was pre-trained on more than 20 trillion tokens. It then underwent supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model is claimed to outperform DeepSeek V3 in benchmarks such as Arena-Hard, LiveBench, LiveCodeBench and GPQA-Diamond, while also performing competitively in other benchmarks, including MMLU-Pro.
"Qwen 2.5-Max surpasses GPT-4o, DeepSeek V3 and Llama-3.1-405B in almost all indicators. Our base models have shown significant advantages in most tests, and we are optimistic that improvements in post-training techniques will take the next version of Qwen to the next level," Alibaba says.
Qwen 2.5-Max is available through the Qwen Chat service, where users can test the model's capabilities and assess its effectiveness. In the future, Alibaba Cloud plans to integrate Qwen 2.5-Max into its cloud services to expand their functionality.[1]
2023: Neural Network Launch
On August 25, 2023, the Chinese corporation Alibaba introduced two artificial intelligence models: Qwen-VL[2] (Qwen Large Vision Language Model) and Qwen-VL-Chat, which provide advanced capabilities for image analysis and natural language dialogue.
The released models are open source, which means that independent researchers, scientific organizations and companies around the world can use them to build their own AI applications without having to train their own systems. This saves hardware resources, time and money, and allows final products to reach the commercial market sooner.
The Qwen-VL model can recognize images and text. It can process requests related to image files and generate responses, image captions and similar output. The Qwen-VL-Chat model, in turn, is designed for more complex interaction: it can compare several images, answer a series of questions, and generate narratives. The models can also create images based on photographs provided by the user and solve mathematical problems shown in a picture. For example, a user can ask where a particular company is located by uploading a photo of its signage.
The announced AI models, as noted, are designed to improve user interaction by providing more accurate and up-to-date information. At the same time, experts say, they raise confidentiality concerns. AI algorithms with visual localization capabilities could, in theory, determine the location of people captured in photographs, and this information could be used for surveillance or for criminal purposes.[3]