Developers: Yandex
Date of the premiere of the system: 2024/06/11
Technology: ITSM - IT Service Management Systems
2024: YaFSDP Library Presentation
Yandex has developed the YaFSDP library and released it as open source, the company announced on June 11, 2024. The library significantly speeds up the training of large language models, both Yandex's own and third-party open-source ones, delivering an acceleration of up to 25%; the exact result depends on the architecture and parameters of the neural network. YaFSDP can also reduce the GPU resources required for training by up to 20%. YaFSDP is now available to companies, developers and researchers around the world.
The Yandex library is designed primarily for large language models, although it is also suitable for other neural networks, such as those that generate images. YaFSDP can reduce the hardware costs of model training, which is especially important for startups and, for example, scientific projects.
One of the difficulties in training large language models is underutilization of the communication channels between graphics processors; YaFSDP addresses this. The library optimizes the use of GPU resources at all stages of training: pre-training, supervised fine-tuning and alignment. As a result, YaFSDP uses exactly as much GPU memory as training requires, while communication between GPUs is not slowed down by anything.
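For readers who want a concrete picture of the general approach, the sketch below shows fully sharded data parallelism in stock PyTorch (FullyShardedDataParallel). YaFSDP implements the same broad idea of sharding parameters, gradients and optimizer state across GPUs while keeping inter-GPU communication busy; this is not YaFSDP's own API, whose actual interface should be taken from the GitHub repository, and the model, sizes and training loop here are purely illustrative assumptions.

```python
# A minimal sketch of FSDP-style sharded data-parallel training in stock PyTorch.
# This illustrates the general technique YaFSDP belongs to; it is not YaFSDP's API.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main():
    # One process per GPU, launched e.g. with torchrun; NCCL handles GPU communication.
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    # Toy transformer layer standing in for a large language model (illustrative sizes).
    model = torch.nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True).cuda()

    # Wrapping in FSDP shards parameters, gradients and optimizer state across ranks.
    model = FSDP(model)

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(8, 128, 1024, device="cuda")  # dummy batch: (batch, seq, hidden)

    for _ in range(10):
        loss = model(x).pow(2).mean()  # dummy objective for illustration only
        loss.backward()                # gradients are reduce-scattered across GPUs
        optim.step()
        optim.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Libraries of this kind differ mainly in how aggressively they overlap the all-gather and reduce-scatter communication with computation and how much redundant GPU memory they avoid, which is where the reported speed-ups come from.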
Yandex developed YaFSDP while training its new-generation model YandexGPT 3. The company has already tested the library on third-party open-source neural networks. For example, if YaFSDP were used for the LLaMA 2 model, the pre-training phase on 1024 GPUs would be reduced from 66 days to 53 days.
The YaFSDP source code is already available on GitHub[1]. Details of the measurements can be found in the GitHub repository, and an article about the library's development is available on Habr.