| Developers: | Yandex |
| Release date: | 2025/05/28 |
| Technology: | Big Data |
Main article: Big Data
2025: Yambda Presentation
Yandex researchers have developed Yambda, one of the largest datasets for developing recommendation systems, and released it as open source. Yandex announced this on May 28, 2025.
With this dataset, scientists, researchers and universities around the world will be able to test and improve recommendation algorithms.
The dataset is available in three versions: the full version contains 5 billion records, and the two reduced versions contain 500 million and 50 million. Developers and researchers can choose the version that suits their task and available computing resources.
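As a rough illustration of matching a version to available resources, here is a minimal Python sketch. The variant labels, the function name, and the idea of a "record budget" are assumptions made for this example and are not part of the Yambda release; only the three sizes come from the text above.

```python
# Sizes as quoted in this article; the labels "full", "medium", "small"
# are hypothetical names used only for this illustration.
YAMBDA_VARIANTS = {
    "full": 5_000_000_000,
    "medium": 500_000_000,
    "small": 50_000_000,
}


def pick_variant(max_records: int) -> str:
    """Return the largest variant whose size does not exceed max_records."""
    fitting = {
        name: size
        for name, size in YAMBDA_VARIANTS.items()
        if size <= max_records
    }
    if not fitting:
        raise ValueError("budget is smaller than the smallest variant")
    # Among the variants that fit, prefer the one with the most records.
    return max(fitting, key=fitting.get)
```

For instance, a team that can process at most 600 million records would get the 500-million version from `pick_variant(600_000_000)`.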
Commercial companies rarely publish datasets for recommendation systems, so little up-to-date, high-quality data is available for research in this area. Access to high-quality big data opens up opportunities for scientific research and draws the attention of young scientists to the field.
Yambda is based on depersonalized data from Yandex Music, but it can be used to evaluate the quality of any recommendation system. Yambda includes aggregated listens, likes, and dislikes, as well as some track characteristics. All user and track data is anonymized.
