YTsaurus (YT)

Product
Developers: Yandex, Yandex B2B Tech
Branches: Information Technology
Technology: Big Data

Main article: Big Data

2025: Launch of a unified platform for data of any size for business

Yandex B2B Tech has opened business access to YTsaurus, its in-house platform for storing and processing big data. The developer announced this on May 28, 2025. The platform can be used to analyze exabytes of data and train complex machine learning models with billions of parameters. YTsaurus is available in two delivery formats: in the cloud and in the customer's own infrastructure (on-premises). In the cloud, the solution is available as a managed service, meaning that Yandex specialists fully support the platform.

Yandex has been developing YTsaurus since 2010. As of May 2025, it is used to store data from most of the company's services, to train YandexGPT and other neural networks, and to build the search index. For example, Yandex.Market uses the platform to develop its promotion system, and the autonomous transport division uses it to process trip data and improve its algorithms. The platform was previously released as open source and is already used by large technology companies in Russia and abroad.

YTsaurus is suitable both for processing small amounts of data and for working at the scale of a million CPUs and tens of thousands of GPUs. The platform can be used as a classic MapReduce system, and other popular data processing solutions, including ClickHouse and Apache Spark, can run within it. With YTsaurus, you can build enterprise data warehouses and ETL systems and process structured, semi-structured, and unstructured data, including logs and financial transactions.

"It is important for us that companies have services and tools for working with data in any scenario. To that end, on the one hand, we create and develop cloud platform services based on open solutions. On the other hand, we test and adapt for business our own developments that are used in the company's internal infrastructure, such as the YTsaurus and YDB platforms and the DataLens BI solution," said Ivan Puzyrevsky, CTO of the Yandex Cloud platform.

2023: Source Code Publication

Yandex has published the source code of YTsaurus, its main platform for working with big data. The company's press service announced this on March 20, 2023.

According to Yandex, the platform is suitable for a wide range of tasks, from analytics to training complex models with billions of parameters. For example, Search builds its search index using YTsaurus, and the self-driving car team uses the platform to process trip data and improve its algorithms. YTsaurus also manages Yandex supercomputers, distributing the load so that their computing power is used as efficiently as possible.

YTsaurus is Yandex's big data platform

By March 2023, Yandex had deployed the YTsaurus platform on tens of thousands of servers, where it processes exabytes of data; every second employee of the company works with it. YTsaurus can be used as a classic MapReduce system, but it also supports other popular approaches to data processing: for example, it has integrations with ClickHouse and Apache Spark.

The YTsaurus source code and documentation are available on GitHub. The code is distributed under the Apache 2.0 license, so anyone can use the platform or modify it for their own needs.

"Yandex has been developing YTsaurus, or YT, as we call it internally, since 2010. We started building our own ecosystem for big data because none of the solutions on the market met all our requirements. Now YTsaurus is one of the key elements of Yandex's internal infrastructure. Dozens of developers are working on the platform, and its capabilities are constantly expanding," said Maxim Babenko, head of the distributed computing technologies department, as quoted by the Yandex press service on March 20, 2023.[1]

Notes