Developers: | RSK Technologies |
Last Release Date: | 2019/11/28 |
Technology: | Supercomputer |
The solution based on the RSK Tornado architecture features high compactness (the entire cluster is located in two rack cabinets), reliability, scalability and energy efficiency. The PUE (ratio of total power consumption to IT equipment power consumption) is less than 1.2, which means that no more than 17% of the electricity consumed is spent on cooling. The computational efficiency of the cluster in the LINPACK test is more than 90%.
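The link between PUE and the share of electricity spent on cooling can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that all non-IT power goes to cooling, and the function name `cooling_share` is hypothetical. The second value, 1.06, is the PUE quoted for the 2013 generation later in this article.

```python
def cooling_share(pue: float) -> float:
    """Fraction of total power that is not consumed by IT equipment.

    PUE = total power / IT power, so the IT share is 1 / PUE and
    the remainder (assumed here to be cooling) is 1 - 1 / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be lower than 1.0")
    return 1.0 - 1.0 / pue


for pue in (1.2, 1.06):
    print(f"PUE {pue}: {cooling_share(pue):.1%} of total power is overhead")
```

With a PUE of 1.2 the overhead is about 16.7%, which agrees with the "no more than 17%" figure above.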
2019: RSC Tornado AP and RSC Tornado AFS
On November 29, 2019, RSK introduced an update to its RSC Tornado line of unified solutions for a wide range of resource-intensive scientific and applied tasks. The updated line of integrated, software-defined and reconfigurable solutions is aimed at classical high-performance computing (HPC) systems, at efficient data storage and processing, and at building artificial intelligence (AI) and machine and deep learning (ML/DL) systems.
Among the solutions presented by RSK specialists:
- The RSC Tornado AP computing node, based on high-performance Intel Xeon Platinum 9200 server processors (up to 56 cores per processor).
- High-performance RSC Tornado AFS storage systems for both high-performance computing and machine and deep learning applications. They use the DAOS software stack for building distributed object storage systems with support for Intel Optane DC Persistent Memory modules.
According to the developer, the updated RSC Tornado line takes the main capabilities of RSK solutions to a higher level: maximum computational density and energy efficiency (thanks to 100% "hot water" liquid cooling of all electronic components) and linear scalability from small systems of several servers to thousands of servers in large clusters or server farms. Support for open standards gives additional room to optimize the cost of the final solution. Supported storage devices include:
- Intel Optane DC Persistent Memory,
- NVMe drives in the densest EDSFF form factor (the so-called long/short ruler),
and also:
- server boards with more memory,
- processors with a maximum power consumption of up to 500 watts per socket,
- a wide range of accelerators with a power consumption of up to 700 watts.
As a result, according to the manufacturer, the updated RSC Tornado line makes it possible to build systems with even higher computational density and a wider variety of components and configurations, tuned for the efficiency of a specific solution. In turn, the unified cabinet form factor, which includes a distributed power system with N+x redundancy and an integrated monitoring and control system for computing and switching components, makes it possible to combine 100% liquid-cooled RSK solutions in one rack with standard 19-inch (rack unit, RU) server and communication equipment from other manufacturers that uses air or combined cooling.
RSC Tornado AP Computing Node
According to RSC, the high-performance RSC Tornado AP node with support for 56-core Intel Xeon Platinum 9200 processors (the Intel Xeon Platinum 9282 model) and direct liquid cooling in hot water mode has a theoretical peak performance of 9.3 TFLOPS, providing up to 1.5 channels of RAM storage. Such a node can be equipped with two NVMe solid-state drives (SSDs) in the M.2 form factor, for example the Intel Optane SSD DC P4801X M.2 Series or the Intel SSD DC P4511 (NVMe, M.2), or two SSD- RS S DC eS. As an option, the system can be expanded with an additional cage holding 6 hot-swappable NVMe SSDs in the E1.L (long ruler) form factor with a capacity of up to 15.36 TB each.
For example, a configuration with the Intel SSD DC P4320/P4520 (NVMe, E1.L) allows more than 100 TB of data to be kept in fast-access storage. The combination of compute, network and storage components provides the balance needed to build high-performance hyperconverged systems with linear scaling, both in processing power and in the capacity and speed of the distributed storage system. According to RSK, this approach makes it possible to build compact systems with industry-record figures: 0.8 PFLOPS of total peak performance and 8.4 PB of data storage in one 42U cabinet.
RSC Tornado AFS Storage Systems
To meet customers' ever-growing needs for storage capacity and data processing speed, RSK has developed RSC Tornado AFS, a solution for building high-capacity All-Flash storage systems based on high-speed NVMe technology and the densest EDSFF.L form factor. The presented All-Flash array with 100% "hot water" liquid cooling of all components supports up to 32 hot-swappable NVMe solid-state drives in the EDSFF.L form factor with a capacity of 15.36 TB each (as announced for November 2019). According to RSK, the expected doubling of the capacity of NVMe/EDSFF.L drives in the near future will increase the storage capacity to 1 PB per standard 19-inch rack unit (1RU) without any design changes.
According to the manufacturer, the widespread use of NVMe over Fabrics (NVMe-oF) technology makes it possible to build high-speed distributed systems with data transfer rates of up to several TB/s and a storage capacity of up to 20.64 PB per cabinet, with support for various parallel file systems such as Lustre and BeeGFS. The use of Intel Optane DC Persistent Memory and remote direct memory access (RDMA) opens up a different approach to building high-speed, low-latency distributed storage systems of the key-value store class using the DAOS (Distributed Asynchronous Object Storage) software stack. These storage systems are designed for extensive use in machine and deep learning.
To achieve optimal performance, special attention was paid to the base of the solution: two high-performance second-generation Intel Xeon Scalable processors, up to 2 TB of high-speed memory and up to four Intel Optane DC Persistent Memory modules used as L4-5 data caches. A communication subsystem of two PCIe Gen3/4 x16 adapters based on Intel Omni-Path, InfiniBand or Ethernet provides inter-node communication at up to 200 Gb/s, which gives a data access speed of up to 25 GB/s per array, RSC noted.
According to the manufacturer, the logical next step for the high-speed hyperconverged RSC Tornado HS storage system with 12 NVMe drives was the addition of Intel Optane DC Persistent Memory modules, which made it possible to support the DAOS software stack.
The RSC Tornado hyperconverged solution with the RSC BasIS software stack uses a built-in orchestrator to define the architecture of the storage system "on the fly" after the equipment is installed, adapting the complex to various types of load in accordance with user preferences and tasks. According to RSK, this makes it possible to create "storage-on-demand" instances, each with its own characteristics (capacity, file system type, access speed, level of reliability and security, lifetime).
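The capacity and bandwidth figures quoted for RSC Tornado AFS can be cross-checked with simple arithmetic. The sketch below is not a vendor sizing tool: it assumes that each array occupies one rack unit, that all 42 units of a cabinet are filled with arrays, and that capacities are decimal (1 PB = 1000 TB). All names in it are hypothetical.

```python
DRIVE_TB = 15.36        # capacity of one drive, from the text
DRIVES_PER_ARRAY = 32   # drives per All-Flash array, from the text
ARRAYS_PER_CABINET = 42 # assumption: one 1RU array per unit of a 42U cabinet


def array_capacity_tb(drive_tb: float = DRIVE_TB) -> float:
    """Raw capacity of one array in TB."""
    return DRIVES_PER_ARRAY * drive_tb


def cabinet_capacity_pb(drive_tb: float = DRIVE_TB) -> float:
    """Raw capacity of a fully populated cabinet in PB."""
    return array_capacity_tb(drive_tb) * ARRAYS_PER_CABINET / 1000


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert a link speed in Gb/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8


print(f"Per array today:      {array_capacity_tb():.2f} TB")
print(f"Per array, 2x drives: {array_capacity_tb(2 * DRIVE_TB):.2f} TB")
print(f"Per cabinet:          {cabinet_capacity_pb():.2f} PB")
print(f"200 Gb/s link:        {gbit_to_gbyte(200):.0f} GB/s")
```

Under these assumptions one array holds 491.52 TB, a doubled drive capacity gives about 983 TB (roughly 1 PB per rack unit), a full cabinet holds 20.64 PB, and 200 Gb/s corresponds to 25 GB/s, which matches the figures in the text.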
2017
Preparing RSC for the arrival of the Purley platform
RSK developed an updated ultra-dense, scalable and energy-efficient cluster solution, "RSK Tornado", which was presented on June 19, 2017 at the ISC'17 international conference in Frankfurt am Main. The solution is a set of components for building modern computing systems of various scales with 100% liquid cooling in hot water mode. It includes high-performance computing nodes based on Intel Xeon Phi 7290 and Intel Xeon E5-2697A v4 processors, combined with the world's first high-speed Intel Omni-Path switch that uses the same "hot water" cooling.
The integrated RSK BASIS software stack for cluster management handles administration and monitoring of the RSK Tornado subsystems.
"RSC Tornado" on Intel server processors offers high compactness and computational density (up to 153 nodes in one standard 80 cm x 80 cm x 42U cabinet) and high energy efficiency, and supports stable operation of computing nodes in hot water mode at a coolant temperature of up to +65 °C at the inlet to computing nodes and switches.
Operation in "hot water" mode makes year-round free cooling (24x365) possible using only dry cooling towers operating at ambient temperatures of up to +50 °C, which eliminates the freon circuit entirely[1].
Questions from TAdviser about RSK Tornado were answered by Alexei Shmelev, executive director of the RSK group of companies.
"RSK Tornado" performance reached 685.44 TFLOPS
On July 18, 2017, the RSK group of companies introduced its ultra-dense, scalable and energy-efficient cluster solution "RSK Tornado" based on the Intel Xeon Scalable processor family.
On Intel Xeon processors, the RSC Tornado set a world performance record for high-performance solutions: 685.44 TFLOPS in a standard 42U computing cabinet (80x80x200 cm). This figure is 2.65 times the performance of the RSC Tornado based on the top-end model of the previous-generation Intel Xeon E5-2600 v4 processor family.
This performance was achieved on the Intel Xeon Platinum 8180 processor (28 cores, 2.5 GHz, 205 W maximum power consumption, 38.5 MB L3 cache) from the Intel Xeon Scalable family.
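The cabinet-level figure can be reproduced from the processor parameters with the usual peak-performance formula: nodes x sockets x cores x clock x FLOPs per cycle. The sketch below is an illustration under stated assumptions, not the vendor's own calculation. It assumes two sockets per node, 153 nodes per cabinet (the density quoted elsewhere in this article), 32 double-precision FLOPs per cycle per core for AVX-512 with two FMA units, and 16 for the AVX2-based Intel Xeon E5-2699A v4 (22 cores, 2.4 GHz, as described in the 2016 section). The function name `peak_tflops` is hypothetical.

```python
def peak_tflops(nodes: int, sockets_per_node: int, cores: int,
                ghz: float, flops_per_cycle: int) -> float:
    """Theoretical double-precision peak of a cabinet in TFLOPS."""
    gflops = nodes * sockets_per_node * cores * ghz * flops_per_cycle
    return gflops / 1000


# Assumptions: 153 dual-socket nodes per cabinet in both cases.
xeon_8180 = peak_tflops(153, 2, cores=28, ghz=2.5, flops_per_cycle=32)
xeon_2699a = peak_tflops(153, 2, cores=22, ghz=2.4, flops_per_cycle=16)

print(f"Xeon Platinum 8180 cabinet: {xeon_8180:.2f} TFLOPS")
print(f"Xeon E5-2699A v4 cabinet:   {xeon_2699a:.1f} TFLOPS")
print(f"Ratio:                      {xeon_8180 / xeon_2699a:.2f}x")
```

Under these assumptions the formula gives 685.44 TFLOPS for the Intel Xeon Platinum 8180 cabinet and a ratio of about 2.65 to the previous generation, in line with the figures quoted above.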
"RSK Tornado" is cooled by liquid
On June 19, 2017, the RSK group of companies introduced the RSK Tornado cluster solution with direct liquid cooling - all elements of the computing cabinet, including network switches, are cooled with liquid coolant.
The RSC solution based on the 72-core Intel Xeon Phi 7290 processor achieves a computational density for the x86 architecture of 1.41 PFLOPS per cabinet, or more than 490 TFLOPS per cubic meter[2].
According to the company, the next generation of RSK Tornado is ready to support the Intel Xeon Processor Scalable Family server processors (code name Skylake-SP).
RSC Tornado based on Intel server processors provides a computational density of up to 153 nodes in one standard cabinet (80 cm x 80 cm x 42U), high energy efficiency, and stable operation of computing nodes in hot water mode at a coolant temperature of up to +65 °C at the inlet to computing nodes and switches.
2016
"RSK Tornado" reached a computational density of 1.41 PFLOPS
On November 16, 2016, the RSK group of companies introduced the RSK Tornado supercomputer solution with direct liquid cooling based on the 72-core Intel Xeon Phi 7290 processor. The development set a world record of 1.41 PFLOPS computational density per cabinet for the x86 architecture.
The system configuration includes:
- 72-core Intel Xeon Phi 7290 processors,
- Intel Server Board S7200AP,
- Intel SSD DC S3500 Series M.2 340 GB,
- Intel Omni-Path Interconnect Switch and Adapters,
- 16-32 GB Micron DDR4-2400 VLP memory.
Features of the "RSK Tornado" architecture
- Multi-core Intel Xeon Phi 7200 processor family, including the Intel Xeon Phi 7290 (up to 72 cores) and the Intel Xeon Phi 7250F and Intel Xeon Phi 7290F (the F suffix denotes processors with an integrated Intel Omni-Path high-speed interconnect),
- using the Intel Server Board S7200AP family,
- physical density with placement of up to 408 computing nodes in a 42U double-sided cabinet with dimensions of 120x120x200 cm,
- computational density of 1.41 PFLOPS (formerly 528 TFLOPS) in a 42U double-sided computing cabinet, or more than 490 TFLOPS/m³,
- energy density of up to 200 kW per cabinet; reducing the power consumption of the system helped increase energy efficiency almost threefold,
- an almost fivefold increase in the total RAM of the computing nodes in one cabinet, from 16.3 TB in the previous generation to 76.5 TB (up to 192 GB of DDR4-2400 RAM and 16 GB of MCDRAM per node),
- up to two SATA SSDs and one M.2 PCIe SSD, such as the Intel SSD DC S3500 and Intel SSD DC P3100 (M.2 NVMe),
- increased energy efficiency - stable operation of computing nodes in "hot water" mode at a temperature of +63 °C at the inlet to computing nodes, which allows the system to run in free cooling mode 24x365 with a system PUE of less than 1.05,
- a power supply module in the form factor of a computing node provides efficient conversion of 220 V AC to 400 V DC (power conversion efficiency of 96%) and parallel operation on a common bus with redundancy from N + 1 to N + N,
- Upgraded cabinet design supporting high-speed interconnect technologies including Intel Omni-Path and Mellanox EDR InfiniBand,
- flexible configurations of cooling systems, with redundancy both of individual hydroregulation units and of the whole system,
- any node of the "RSC Tornado" solution can be serviced individually without stopping other nodes. Easy access to all components of the node (memory, disks, high-speed interconnect adapters, control and power subsystems) makes it simple to replace or reconfigure these components at the customer's site.
The RSC Tornado cluster solution can also be built on the Intel Xeon E5-2600 server processor family, including the top-end Intel Xeon E5-2699A v4 model (22 cores, 2.40 GHz, 55 MB L3 cache), providing a high compute density of 258.5 TFLOPS in the standard 42U cabinet (80x80x200 cm).
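The per-cubic-meter densities and the total RAM quoted for the cabinets follow from the cabinet dimensions and node counts given in this article. The sketch below is a consistency check only: it uses the rounded peak figures from the text, treats a 42U cabinet as 200 cm high, and counts 1 TB as 1024 GB. All function and constant names are hypothetical.

```python
def cabinet_volume_m3(width_cm: float, depth_cm: float, height_cm: float) -> float:
    """Cabinet volume in cubic meters from dimensions in centimeters."""
    return width_cm * depth_cm * height_cm / 1_000_000


def tflops_per_m3(peak_tflops: float, dims_cm: tuple) -> float:
    """Peak performance divided by cabinet volume."""
    return peak_tflops / cabinet_volume_m3(*dims_cm)


STANDARD = (80, 80, 200)        # standard 42U cabinet, from the text
DOUBLE_SIDED = (120, 120, 200)  # double-sided 42U cabinet, from the text

print(f"Xeon Phi, double-sided: {tflops_per_m3(1410, DOUBLE_SIDED):.1f} TFLOPS/m3")
print(f"Xeon Phi, standard:     {tflops_per_m3(528, STANDARD):.1f} TFLOPS/m3")
print(f"Xeon E5, standard:      {tflops_per_m3(258.5, STANDARD):.1f} TFLOPS/m3")

# Total DDR4 RAM in a double-sided cabinet: 408 nodes x 192 GB per node.
ram_tb = 408 * 192 / 1024
print(f"RAM per cabinet:        {ram_tb:.1f} TB")
```

With the rounded inputs the results are about 489.6, 412.5 and 202.0 TFLOPS/m³ and 76.5 TB of RAM, close to the "more than 490", "more than 412", "more than 200" and 76.5 TB figures given in the text; the small gap in the first value comes from rounding the peak to 1.41 PFLOPS.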
Computational density raised to 528 TFLOPS
On June 21, 2016, RSK announced an upgrade of the RSK Tornado solution based on the Intel Xeon Phi processor.
The computational density of the solution nearly doubled, to 528 TFLOPS per cabinet. The upgraded RSC solution has improved physical and computational density and high energy efficiency, and provides stable hot water operation at a coolant temperature of +63 °C.
Characteristics of the updated system:
- top-end models of the latest multi-core (up to 72 cores) processors: Intel Xeon Phi 7250, Intel Xeon Phi 7290, Intel Xeon Phi 7250F or Intel Xeon Phi 7290F (the F suffix denotes processors with an integrated Intel Omni-Path high-speed interconnect),
- the new Intel Server Board S7200AP family,
- highest physical density with up to 153 computing nodes in a standard 42U cabinet with dimensions of 80x80x200 cm,
- computational density increased almost twofold - 528 TFLOPS (previously 280 TFLOPS) in the standard 42U computing cabinet, or more than 412 TFLOPS/m³,
- up to 192 GB of RAM per node (DDR4-2400 RAM + 16 GB MCDRAM),
- up to two SATA SSDs and one M.2 PCIe SSD, such as the Intel SSD DC S3500 and Intel SSD DC NVMe M.2,
- increased reliability - independent hydraulic pump modules (hydroregulation modules) of the liquid cooling system for each computing domain (up to 9 modules per cabinet in total) with redundancy from N + 1 to N + N,
- increased energy efficiency - stable operation of computing nodes in "hot water" mode at a temperature of +63 °C at the inlet to computing nodes,
- a new power supply module in the form factor of the computing unit, which provides high-efficiency conversion of 220 V AC to 400 V DC and the possibility of parallel operation on a common bus,
- Updated cabinet design supporting new high-speed interconnect technologies including Intel Omni-Path and Mellanox EDR InfiniBand,
- possibility of building flexible configurations of cooling systems, with possibility of redundancy of separate hydroregulation units and the whole system.
2015: RSK Tornado cluster presented
On July 13, 2015, the RSK group of companies introduced the next generation of its cluster solution, RSK Tornado.
"RSK Tornado" offers improved compactness, computational density and energy efficiency.
"RSK Tornado," 2015
Liquid-cooled solutions based on the RSK Tornado cluster architecture, developed by the company's specialists, have been in operation with Russian customers for more than four years. They are installed at Peter the Great St. Petersburg Polytechnic University (SPbPU), the Joint Supercomputer Center of the Russian Academy of Sciences (JSCC RAS), South Ural State University (SUSU), the Moscow Institute of Physics and Technology (MIPT) and Roshydromet, among other customers from various industries.
The "RSK Tornado" cluster solution has the following characteristics:
- increased physical density - up to 153 computing nodes per cabinet,
- increased computational density - more than 200 TFLOPS/m³ on standard processors and up to 256 GB of RAM per node,
- increased reliability - independent hydraulic pump modules (hydroregulation modules) of the liquid cooling system for each computing domain (up to 9 modules per cabinet in total) with redundancy from N + 1,
- increased energy efficiency - stable operation of computing nodes in "hot water" mode at a temperature of +65 °C at the node outlet (currently a world record in the HPC industry),
- a power supply module in the form factor of a computing node provides high-efficiency conversion of 220 V AC to 400 V DC and parallel operation on a common bus,
- an updated computing cabinet design with support for new high-speed interconnect technologies, including Mellanox EDR InfiniBand and Intel Omni-Path.
The solution includes support for future Intel Xeon and Intel Xeon Phi processors code-named Broadwell and Knights Landing.
High availability and fault tolerance are ensured by a system for controlling and monitoring both individual nodes and the cluster as a whole, by extended power management capabilities, and by redundancy of power supplies and hydroregulation modules. Every element of the complex (computing nodes, power supplies, hydroregulation modules, etc.) has a dedicated management controller, which provides broad telemetry and control for each element.
The cabinet design allows hot swapping of hydroregulation modules without interrupting system operation. Liquid cooling of all components ensures their long service life.
Advanced technological approaches implemented in the new generation of the RSK Tornado cluster solution have reduced the cost of infrastructure as part of computer complex projects and provided opportunities for more flexible modernization at the level of an individual node and the entire system.
The next generation "RSC Tornado" is based on Intel's standard server components: Intel Xeon E5-2600 v3 server processors, the Intel Server Board S2600KP, and Intel SSD DC S3500/3600/3700 drives for data centers.
According to the company's statement, the RSK Tornado cluster solution continues to lead the industry in terms of physical and computational density, energy efficiency, reliability, availability and manageability.
"The unique long-standing experience of RSK specialists in developing high-performance direct liquid cooling technologies and ultra-dense integration of supercomputer solutions based on standard server components made it possible to develop and introduce a new generation of the "RSK Tornado" cluster solution with a number of improved characteristics that are very much in demand among customers operating powerful computing centers. In addition to our previously established world records of computational and energy density per footprint, the new generation of "RSK Tornado" has set a world record for stable operation in "hot water" mode at a temperature of +65 °C. All RSK development is done in Russia, and in manufacturing our products we actively rely on the potential and production capacities of Russian industrial enterprises," said Alexei Shmelev, executive director of the RSK group of companies.
2013: Architecture Development
A new round of development of the RSK Tornado architecture for energy-efficient and compact data centers and supercomputer systems allowed specialists of the RSK group of companies to implement, for the first time in the world, direct liquid cooling for standard, widely available server boards from various manufacturers on the Intel Xeon processor platform. These boards were originally designed for traditional air cooling of electronic components. The same cooling is applied to the latest Intel Xeon Phi coprocessors, the developer's press service said on July 4, 2013.
This is the third generation of energy-efficient RSK solutions for the HPC, cloud and data center segments.
High-performance solutions with high computational density based on the liquid-cooled RSK Tornado architecture are designed to solve a variety of customer problems.
The product line includes:
- RSK micro data center (from 16 to 64 nodes),
- RSK mini data center (from 64 to 256 nodes),
- RSK data center (more than 2 racks with high density, up to tens of petaflops).
Characteristics
- up to 128 x86 servers in a standard 42U 80x80x200 cm rack;
- High-density blade server design based on standard and mass-accessible server boards
- Highest energy efficiency - Power Usage Effectiveness (PUE, the ratio of total system power to the power of the electronic components) reaches 1.06, a record for the HPC industry. That is, no more than 5.7% of the power consumption is spent on cooling the entire system;
- A record computing efficiency of 96% on the LINPACK test for the new Intel Xeon E5-2690 processor (Intel Turbo Boost Technology runs all the time, which adds up to 400 MHz of clock speed when running the LINPACK test);
- removal of more than 100 kW of thermal power from the rack using a unique RSC liquid cooling system;
- The ability to use the highest-performing models of Intel server processors with a thermal output of 135 watts, for example the Intel Xeon E5-2690 processor (2.9 GHz, 8 cores), and the latest high-performance Intel Xeon Phi 7120X and 5120D coprocessors (1.23 GHz, 61 cores);
- High peak processing power of more than 47 teraflops in a single rack based on Intel x86 architecture with Intel AVX instruction set and more than 200 teraflops using Intel Xeon Phi coprocessors;
- high density - 74 teraflops per square meter (based on Intel Xeon processors only) and 312 teraflops per square meter (with Intel Xeon Phi coprocessors);
- high scalability - up to the level of several petaflops (dozens of racks);
- economic efficiency - reduction of operating costs up to 60% (saving energy costs in rubles due to operation of the RSK solution);
- compactness - the data center footprint is reduced severalfold compared to traditional air-cooled solutions;
- The ability to use accelerators and coprocessors (e.g., Intel Xeon Phi);
- A complete integrated "RSC BASIS" software stack optimized for high performance computing;
- Performance and scalability of RSC Tornado based solutions are confirmed by the Intel Cluster Ready certificate.
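The per-square-meter densities in the list above can be related to the per-rack peak figures through the rack footprint. The sketch below is a rough consistency check, assuming the footprint is simply the 80 cm x 80 cm base of the rack with no service area; the names are hypothetical.

```python
FOOTPRINT_M2 = (80 * 80) / 10_000  # 80 cm x 80 cm rack base in square meters


def tflops_per_m2(rack_peak_tflops: float) -> float:
    """Peak performance of one rack divided by its floor footprint."""
    return rack_peak_tflops / FOOTPRINT_M2


def rack_peak_from_density(density_tflops_per_m2: float) -> float:
    """Rack peak implied by a quoted per-square-meter density."""
    return density_tflops_per_m2 * FOOTPRINT_M2


print(f"47 TFLOPS rack:   {tflops_per_m2(47):.1f} TFLOPS/m2")
print(f"200 TFLOPS rack:  {tflops_per_m2(200):.1f} TFLOPS/m2")
print(f"74 TFLOPS/m2  ->  {rack_peak_from_density(74):.2f} TFLOPS per rack")
```

A 200-teraflops rack gives 312.5 teraflops per square meter, matching the quoted 312, and the quoted 74 teraflops per square meter corresponds to about 47.4 teraflops per rack, consistent with "more than 47 teraflops".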