Microsoft DeepSpeed

Product
Developer: Microsoft
Release date: September 2020
Technology: Application development tools

2020: Announcement of Microsoft DeepSpeed, a tool for deep learning of AI models

In mid-September 2020, Microsoft released an updated open-source version of the DeepSpeed library on GitHub. The library is designed to optimize the training of deep learning artificial intelligence (AI) models.

According to SiliconANGLE, what makes DeepSpeed unique is its ability to train AI models with one trillion parameters. Microsoft notes that the approach developed by the DeepSpeed team, called 3D parallelism, adapts to the varying requirements of user workloads, including very large models, while keeping scaling balanced and efficient.

Microsoft released an open-source tool for deep learning of models with one trillion parameters
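
As an illustration of how this looks in code, here is a minimal sketch of a DeepSpeed pipeline-parallel training script, assuming a recent DeepSpeed release; the layer sizes, stage count, and config file are placeholders rather than the setup described in the announcement:

```python
# Minimal sketch of pipeline-parallel training with DeepSpeed.
# Launch with the DeepSpeed runner, e.g.: deepspeed --num_gpus=4 train.py
import torch
import deepspeed
from deepspeed.pipe import PipelineModule

deepspeed.init_distributed()

# Expressing the network as a flat list of layers lets DeepSpeed cut it
# into pipeline stages -- one axis of "3D parallelism". The other two
# axes, data and model (tensor) parallelism, depend on how many ranks
# the launcher starts and how the model itself is sharded.
layers = [torch.nn.Linear(1024, 1024) for _ in range(24)]
model = PipelineModule(layers=layers,
                       num_stages=4,
                       loss_fn=torch.nn.MSELoss())

# The engine handles distribution, mixed precision, and ZeRO memory
# partitioning according to the JSON config (hypothetical file here,
# holding batch size, fp16, and ZeRO settings).
engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",
)

# Each call schedules micro-batches through all four stages concurrently.
data = [(torch.randn(8, 1024), torch.randn(8, 1024)) for _ in range(16)]
loss = engine.train_batch(data_iter=iter(data))
```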

The problem DeepSpeed was created to solve is that developers can build neural networks with only as many parameters as their AI training infrastructure can handle. In other words, hardware constraints are an obstacle to creating larger and better models. DeepSpeed makes AI training more efficient at the hardware level: developers can increase the complexity of the AI software they build without having to buy additional infrastructure.
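
To see why hardware is the bottleneck, consider a rough memory estimate. The 16-bytes-per-parameter figure below is the commonly cited cost of mixed-precision Adam training (fp16 weights and gradients plus fp32 master weights and two optimizer moments); it is an assumption for illustration, not a number from the announcement:

```python
# Back-of-envelope memory arithmetic for mixed-precision Adam training.
# Assumption: ~16 bytes per parameter (2 fp16 weight + 2 fp16 gradient
# + 4 fp32 master weight + 4 fp32 momentum + 4 fp32 variance).
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16

def training_memory_gb(num_params: int) -> float:
    """Approximate memory needed just for model and optimizer state."""
    return num_params * BYTES_PER_PARAM / 1024**3

for name, n in [("1.5B", 1.5e9), ("13B", 13e9), ("1T", 1e12)]:
    print(f"{name} parameters: ~{training_memory_gb(int(n)):,.0f} GB")

# Even a 32 GB V100 cannot hold the training state of a 13B-parameter
# model (~194 GB), which is why DeepSpeed partitions and offloads it.
```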

Microsoft states that the tool can train a language model with one trillion parameters using 100 previous-generation Nvidia V100 GPUs. Ordinarily, according to the company, this task would require 100 days on 4,000 current-generation Nvidia A100 GPUs, even though the A100 is 20 times faster than the V100.

Microsoft states that even if the hardware is reduced to a single V100 GPU, DeepSpeed can still train a language model with 13 billion parameters. For comparison, the largest language model in the world has about 17 billion parameters, and the largest neural network overall contains about 175 billion.[1]
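
The single-GPU result relies on offloading optimizer state from GPU memory to CPU RAM (DeepSpeed's ZeRO-Offload technique). A minimal sketch of such a configuration is shown below, with key names taken from DeepSpeed's documented config schema; the model, batch sizes, and learning rate are placeholders:

```python
# Illustrative ZeRO-Offload setup: ZeRO stage 2 partitions gradients and
# optimizer state, and offload_optimizer keeps the Adam state in CPU RAM
# so the GPU holds only what the current step needs.
import torch
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 4,      # placeholder value
    "gradient_accumulation_steps": 16,        # placeholder value
    "fp16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},
    },
}

# Placeholder model standing in for a large transformer.
model = torch.nn.Sequential(*[torch.nn.Linear(1024, 1024) for _ in range(8)])
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```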

Notes