
Cloud ML Space

Product
Base system (platform): Artificial intelligence (AI)
Developers: Cloud (formerly SberCloud), Sberbank
System launch date: 2020/12/04
Last Release Date: 2022/05/04
Technology: Application Development Tools


ML Space is a full-cycle cloud platform for developing and deploying AI services for businesses of all sizes. It contains all the tools and resources needed to create, train and deploy machine learning models, from fast connection to data sources to automatic deployment of trained models on dynamically scalable, high-performance SberCloud capacity.

2022

Product Analyzer Service Placement

On June 20, 2022, Napoleon IT announced that it had developed the Product Analyzer service and placed it on AI Services, the marketplace of ready-made ML models on the ML Space platform. The service recognizes goods and prices in Russian grocery stores. During the first month of use, it is available to all platform users under a freemium model.

Inclusion in the Unified Register of Russian Software

The ML Space machine learning platform from SberCloud was entered in the Unified Register of Russian Software. This became known on May 4, 2022.

ML Space is a full-cycle ML development platform that speeds up, optimizes and simplifies data preprocessing, training and deployment of machine learning models. It reduces the working time ML developers spend on training models and shortens the time it takes to bring an ML product to market: using the platform saves Data Science specialists 30% of their time, and model development and time-to-market are cut by 50%, from an average of three months to one and a half.

The ML Space platform is available both as a public cloud version and as a private cloud version, ML Space Private. ML Space Private offers all the advantages of the public version and can be deployed on the client's own infrastructure. According to SberCloud estimates, more than 70% of large companies cannot use public infrastructure because they hold critical data that, under enterprise information security standards, cannot be transferred outside the company's perimeter.

In addition, ML Space Private can be deployed in hybrid mode: part of the platform is installed on the client's servers, while computation runs in the SberCloud cloud. In all scenarios, an information system can be built that meets any of the company's information security requirements.

"According to our estimates, about 90% of Russian companies do not use a single solution for working with machine learning technologies, and at different stages of development they cover their needs with disparate open source utilities. ML Space, by contrast, lets such companies raise their ML processes straight to the highest level of maturity. Customers planning to use the ML Space Private private cloud include oil and gas, telecommunications and manufacturing companies, as well as the public sector. We provide a ready-made boxed product for end-to-end ML development, which ensures complete data privacy with reliable protection against leaks and fits easily into the company's information security landscape."

The ML Space platform was entered in the register on the basis of an order from the Ministry of Digital Development, Communications and Mass Media of the Russian Federation[1].

2021

RuDALL-E Neural Network Availability

On December 15, 2021, Sber announced that the ruDALL-E neural network, which generates images from descriptions in Russian, had become available on the ML Space platform.

Confidential Computing Service Announcement

The SberCloud ML Space cloud platform, which provides access to the oneAPI toolkit, will be supplemented with a confidential computing service based on Intel Software Guard Extensions (Intel SGX). This was announced on November 11, 2021 by David Rafalovsky, Executive Vice President of Sberbank, CTO of Sber and Head of the Technologies Block, and Greg Lavender, Chief Technology Officer of Intel.

The service will make it possible not only to store and transfer data in encrypted form, but also to process it in a secure enclave, where sensitive data stays confidential and is protected from any unauthorized software.

Intel SGX ensures the integrity and confidentiality of sensitive data in systems where even privileged processes must be considered untrusted. Neither the cloud service provider nor anyone from outside will be able to get into the protected area and gain access to the data processed there.

In May 2021, SberCloud announced the expansion of the ML Space cloud platform with Intel's cross-architecture oneAPI programming model. It enables developers to use the performance and capabilities of different architectures without rewriting the code for each hardware platform. The oneAPI model supports well-known programming languages (for example, C, C++, Fortran and Python) and common standards (such as MPI and OpenMP), providing interoperability and compatibility with existing code.

Using modules from Intel's optimized oneAPI toolkit in SberCloud speeds up AI applications on the CPU without spending months learning new tools.


The service will be available in the SberCloud cloud in March 2022.

Ability to upload and run your own Docker images

On June 1, 2021, SberCloud (part of the Sberbank ecosystem) announced the expansion of the capabilities of ML Space, a full-cycle platform for collaborative ML development based on the Christofari supercomputer.


According to the company, in addition to using the libraries and frameworks preinstalled in ML Space, users of the cloud platform can now upload their own Docker images to a dedicated Docker registry, which is also available for collaboration, and run them. Remote access over SSH makes it possible to debug processes both from a personal computer and from the terminal of familiar software (Jupyter Notebook or JupyterLab). Together, Docker registry support for custom images and SSH access allow you to train any model on the platform.
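The custom-image workflow described above comes down to tagging a local image with a registry address and pushing it. The sketch below only builds and prints the standard `docker tag` and `docker push` commands. The registry host `registry.example.invalid`, the project name and the image name are hypothetical placeholders, not real ML Space endpoints, and the actual upload procedure on the platform may differ.

```python
import shlex
from typing import List


def registry_reference(registry: str, project: str, image: str, tag: str = "latest") -> str:
    """Build a fully qualified image reference: registry/project/image:tag."""
    return f"{registry}/{project}/{image}:{tag}"


def push_commands(local_image: str, remote_ref: str) -> List[List[str]]:
    """Return the standard Docker CLI commands that tag and push a local image."""
    return [
        ["docker", "tag", local_image, remote_ref],
        ["docker", "push", remote_ref],
    ]


# Hypothetical values, for illustration only.
REMOTE = registry_reference("registry.example.invalid", "my-team", "trainer", "1.0")
COMMANDS = push_commands("trainer:1.0", REMOTE)

if __name__ == "__main__":
    for command in COMMANDS:
        # Print instead of executing, so the sketch has no side effects.
        print(shlex.join(command))
```

Printing the commands rather than running them keeps the example safe to try anywhere; to execute them, each list could be passed to `subprocess.run` on a machine with Docker installed.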

Another platform update was the DataHub module. Pre-trained models, datasets (specially prepared data collections) and containers stored on DataHub have become available to developers and data scientists. ML Space users no longer need to spend time searching for and downloading models, datasets and Docker containers from external sources, checking their EULA (end-user license agreement) or scanning them for viruses. If an ML Space client needs datasets, models or containers to solve AI problems, SberCloud specialists will find them and place them in DataHub.

For example, GPT-3 models with 760 million and 1.3 billion parameters are available on DataHub together with prepared scripts for further training and deployment on SberCloud. The GPT-3 language model with 13 billion parameters, which is not yet in the public domain, can be deployed in ML Space DataHub in a few clicks.

The module also offers a selection of NVIDIA NGC containers (NeMo, RAPIDS, etc.) adapted for use in ML Space and for solving problems in natural language processing (NLP), computer vision (CV), data processing (ETL), deployment of ML models in the cloud and many other cases. The Transfer Learning Toolkit container will be available in June 2021 in the updated version of DataHub. As of June 2021, all content in the beta version of DataHub (datasets, models and containers) is available free of charge.

With the updated ML Space functionality, developing machine learning products no longer requires bringing in additional DevOps engineers or computing infrastructure administrators, which optimizes AI product development.

ML Space users have access to collaboration at all stages of ML development and a flexible choice of infrastructure: CPU, GPU and the ability to launch distributed machine learning on 1000+ Tesla V100 GPUs of the Christofari supercomputer.

ML Space is already actively used within the Sberbank ecosystem and by large commercial companies, startups and scientific organizations.

The following customer cases were presented:

  • Aitarget Tech - model training on ML Space for automated creation and scaling of advertising creatives;
  • EORA - solving a Kaggle photo-matching task on the ML Space platform;
  • SberDevices - distributed training of GPT-3 transformer models on ML Space;
  • GetTransfer - training a model that predicts whether a client and a driver will agree on a deal, using the LightAutoML library and the ML Space platform;
  • Speech Technology Center (STC) group of companies - technologies and APIs for solving speech analytics problems in difficult acoustic conditions.

Intel oneAPI Toolkit Availability

On May 20, 2021, Sberbank announced the expansion of the capabilities of SberCloud ML Space, a full-cycle cloud platform for the development and deployment of AI services. It provides tools and resources for creating, training and deploying machine learning models, from fast connection to data sources to automatic deployment of models on SberCloud capacity.

ML Space is a cloud service that lets you organize distributed training using the scalable family of Intel Xeon processors with built-in AI accelerators. Its architecture is built on SberCloud's own supercomputer, Christofari, whose total performance is 6.7 petaflops. It ranks 40th in the Top500 list of the world's highest-performing systems.

The enhanced capabilities were achieved through the use of oneAPI, an open, standards-based cross-architecture programming model that allows developers to use the performance and capabilities of different architectures effectively without adjusting the code for each hardware platform. This gives you the freedom to choose the best hardware for a specific task. At the same time, oneAPI supports well-known programming languages (for example, C, C++, Fortran and Python) and common standards (such as MPI and OpenMP), providing interoperability and compatibility with existing code.

"The SberCloud ML Space cloud platform was created, on the one hand, to give data specialists the best tools for solving machine learning problems and, on the other, to simplify and democratize the development and use of products based on artificial intelligence. Intel oneAPI Toolkits fit perfectly into the ML Space ideology. Data scientists and ML developers working on a productive, flexible and cost-effective processor architecture will now be able to accelerate the development and deployment of their AI products and improve their characteristics," said David Rafalovsky, Executive Vice President of Sberbank Group and Head of the Technology block of Sberbank.

ML Space combines tools for working with big data (Jupyter Notebook and JupyterLab) with performance tools, now including Intel oneAPI Toolkits. It is built on a modular architecture, which allows users to add new features themselves. For a year after this announcement, anyone can register and get test access to the SberCloud ML Space platform, Intel oneAPI Toolkits and servers based on Intel processors.

Intel oneAPI Toolkits help developers create, analyze, and optimize high-performance cross-architectural applications for a variety of XPUs: Intel processors, GPUs, and FPGAs.

These toolkits include the cross-architectural programming language oneAPI Data Parallel C++ (DPC++) and more than 40 software products: compilers, libraries, and migration, analysis, and debugging tools that simplify the development of data processing applications.

One of the key elements of the ML Space cloud platform, the Environments module, will receive the following Intel oneAPI toolkits:

  • Intel oneAPI Base Toolkit - the core set of tools and libraries for developing high-performance, cross-architecture applications;
  • Intel oneAPI AI Analytics Toolkit - familiar and convenient tools that help data professionals, AI developers and researchers speed up data processing and analysis on Intel architecture CPUs and GPUs;
  • Intel oneAPI HPC Toolkit - tools for building and optimizing high-performance applications based on Fortran, OpenMP and MPI that can scale to the latest Intel-based systems and clusters. In combination with the Base Toolkit, it contains everything needed to develop high-performance applications for scientific or engineering problems on systems with shared or distributed memory;
  • Intel Distribution of OpenVINO Toolkit - tools that help you optimize, tune and run deep learning inference using the included model optimizer and runtime and development tools.

2020: ML Space platform launch

On December 4, 2020, Sberbank announced that, together with SberCloud, it had presented ML Space, a platform for working with artificial intelligence.

The announcement was made by David Rafalovsky, CTO of Sberbank Group, Executive Vice President and Head of the Technologies Block.

ML Space is a cloud service that lets you organize distributed training on 1000+ GPUs. This capability is provided by Sberbank's own supercomputer, Christofari. The platform is already used by the company's own teams, including SberDevices and the Speech Technology Center group. The service will be available from December 12, 2020.
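One common scheme for distributed training across many GPUs is data parallelism: each worker computes gradients on its own shard of the data, and the gradients are averaged before every weight update. The sketch below simulates that idea in plain Python for a one-parameter linear model. The worker count, toy data and learning rate are illustrative assumptions; the code does not use ML Space, Christofari or any GPU API, and the source does not state which scheme the platform itself uses.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def shard(data: List[Sample], workers: int) -> List[List[Sample]]:
    """Split the data round-robin into one shard per simulated worker."""
    return [data[i::workers] for i in range(workers)]


def local_gradient(w: float, samples: List[Sample]) -> float:
    """Gradient of the mean squared error for y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)


def train(data: List[Sample], workers: int, steps: int = 200, lr: float = 0.01) -> float:
    """Data-parallel gradient descent: average the worker gradients, then update."""
    shards = shard(data, workers)
    w = 0.0
    for _ in range(steps):
        # On real hardware each shard's gradient is computed on a separate device.
        grads = [local_gradient(w, s) for s in shards]
        # Averaging plays the role of the all-reduce step between devices.
        w -= lr * sum(grads) / len(grads)
    return w


# Toy data generated from y = 3x, split across four simulated workers.
DATA = [(float(x), 3.0 * x) for x in range(1, 9)]
WEIGHT = train(DATA, workers=4)

if __name__ == "__main__":
    print(f"learned weight: {WEIGHT:.6f}")
```

Because the shards here are equal in size, the averaged gradient equals the full-batch gradient, so the result matches single-machine training while the per-step work is split across workers.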

"It should be easy and convenient for any developer, data scientist, company or organization to implement machine learning in products and services. According to our estimates, only 30% of the time spent working on AI solutions goes to training models. Everything else goes to preparing for it and to other routine work. We want people to be able to devote 99% of their attention directly to model training. ML Space and Christofari speed up the creation and launch of ready-made machine learning solutions by an order of magnitude and bring artificial intelligence technologies much closer to business. We believe that our platform will lay the foundation for the practical, large-scale use of AI in Russia," said David Rafalovsky, Executive Vice President of Sberbank and Head of the Technologies block of Sberbank Group.

ML Space consists of integrated service modules, each of which solves a specific set of tasks. Thanks to Sberbank's open LAMA (LightAutoML) technology, the platform lets you create machine learning models automatically in a dedicated AutoML module.
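The core idea behind an AutoML module, automatically fitting several candidate models and keeping the one that validates best, can be illustrated with a toy sketch. This is not the LAMA (LightAutoML) API: the candidate models, data and function names below are hypothetical and use only the Python standard library.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, float]        # (x, y)
Model = Callable[[float], float]    # maps x to a prediction
Fitter = Callable[[List[Sample]], Model]


def fit_mean(train: List[Sample]) -> Model:
    """Baseline candidate: always predict the mean of the training targets."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train: List[Sample]) -> Model:
    """Least-squares line y = a * x + b."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    a = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
    b = my - a * mx
    return lambda x: a * x + b


def mse(model: Model, data: List[Sample]) -> float:
    """Mean squared error of a fitted model on a dataset."""
    return mean((model(x) - y) ** 2 for x, y in data)


def auto_select(train: List[Sample], valid: List[Sample],
                candidates: Dict[str, Fitter]) -> Tuple[str, Model]:
    """Fit every candidate and return the name and model with the lowest validation error."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mse(fitted[name], valid))
    return best, fitted[best]


# Toy data that follows y = 2x + 1.
TRAIN = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
VALID = [(4.0, 9.0), (5.0, 11.0)]
BEST_NAME, BEST_MODEL = auto_select(TRAIN, VALID, {"mean": fit_mean, "linear": fit_linear})

if __name__ == "__main__":
    print(BEST_NAME, BEST_MODEL(10.0))
```

Real AutoML systems search over far richer model families, features and hyperparameters, but the select-by-validation loop is the same basic pattern.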

The Environments module launches neural network training and monitors resource load (CPU, GPU, RAM). Data Catalog lets distributed teams collect and manage data and machine learning models in multi-user mode. The AutoDeploy module deploys ready-made models to high-performance SberCloud capacity automatically, in a few clicks. As a result, trained AI models can be put into production and business processes very quickly. In addition, platform users will have access to TagMe, a data labeling service.