By the end of the second decade of the 21st century, computer technologies (CT) had reached a level of development at which it finally became fair to call them information technologies. Before that, the universal use of the term "information technologies" was inaccurate: how could one speak of information when computers were used only for calculations or for working with data? What does that have to do with information?
The habit of ignoring the difference between data and information has produced some very strange statements. For example, on the "Academician" website one can find the following definition: "Information technologies (IT, from the English information technology) are a broad class of disciplines and fields of activity related to technologies for creating, storing, managing and processing data, including with the use of computing equipment" [1]. If these technologies operate on data, why are they called information technologies?
Today the phrase "information technologies" has acquired a fully legitimate meaning, because information, not data, has become the subject of these technologies. The transition from data to information became possible thanks to the rapid spread of computer analytics in all its forms. Genuine information technologies are the subset of CT that process raw data and convert the results into a form from which a person can extract information that is new and useful to them.
The foundation for information technologies was laid by visualization techniques combined with statistics and various methods of working with natural languages (NLP); in the mid-2010s these were joined by numerous machine learning techniques. Information technologies are designed to help the user understand the patterns hidden in data.
It is not surprising that information technologies require qualitatively different computers. Before our very eyes, the concept of the general-purpose computer, which traces back to the Turing machine and has dominated for many years, is becoming outdated. ("Everything passes ... and the universality of computers too" [2])
Alternative computers will not appear overnight, and until then great hopes are placed on special-purpose computers built on existing technologies.
To replace "universal computers" (we will remember that the first commercial computer was called UNIAC or Universal Automatic Computer) systems specialized date-centrichnye come and together with them there comes the era of Data-intensive Computing (DIC). From new computers various capabilities for work with large volumes of unstructured data arriving from the outside world – texts, images of all types, video and audio are required. The same can be done also by universal machines, but for bigger efficiency for each data type specialized devices (appliance) are required. In the history of computing such devices already meet. Teradata built specialized analytical machines in due time, but they were too expensive. As a result it was necessary to pass to universal, but, as we know, evolution goes on a spiral. In figure 1 the conceptual vision of a system matching the need of DIC is provided. In it from the outside world data come to the subsystem capable to store large volumes of the processed data. The computing environment performing functions of work with data consists of a great number of specialized computing agents or appliance.
The operation of such a system resembles programming in Python or another modern language, where building a program no longer means writing long stretches of code; it is more like assembling a product from ready-made modules available in open libraries, where almost everyone finds what they need, and those who do not write their own module and add it to the library. A minimal sketch of this assembly style is shown below.
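The following sketch illustrates the "assembly from ready-made modules" style described above: a small text-analytics step composed entirely from Python standard-library parts. The function names and the example text are illustrative, not taken from the article.

```python
# Assembling a tiny analytics step from ready-made modules (re, collections,
# statistics) instead of writing low-level code from scratch.
import re
from collections import Counter
from statistics import mean

def tokenize(text: str) -> list[str]:
    """Split raw text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_terms(tokens: list[str], n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent tokens with their counts."""
    return Counter(tokens).most_common(n)

def average_word_length(tokens: list[str]) -> float:
    """A simple derived metric over the token stream."""
    return mean(len(t) for t in tokens)

if __name__ == "__main__":
    raw = "Data becomes information only when it is turned into a useful form."
    tokens = tokenize(raw)
    print(top_terms(tokens))
    print(average_word_length(tokens))
```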
The role of such library modules can be played by computing agents, or appliances: devices specialized for solving a particular class of tasks. When needed, they are plugged into a heterogeneous computing environment.
The first examples of such agents are GPUs used for general-purpose computing (GPGPU, General-Purpose computing on Graphics Processing Units), field-programmable gate arrays (FPGA, Field-Programmable Gate Array) and application-specific integrated circuits (ASIC, Application-Specific Integrated Circuit) designed to solve one particular problem. The coordinating and organizing function in the heterogeneous computing environment is performed by the CPU. The result is a computing environment in which, besides the CPU, a variety of specialized modules take part in working with the data, as in the sketch below.
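The sketch below is a purely hypothetical model, not any vendor's API, of how a CPU-side scheduler might route work to specialized agents in such a heterogeneous environment. The agent classes and task kinds are assumptions made for illustration.

```python
# A toy heterogeneous environment: the CPU keeps its coordinating role and
# routes each task to an agent specialized for that kind of work.
from dataclasses import dataclass

@dataclass
class Task:
    kind: str        # e.g. "inference", "stream_filter"
    payload: object

class Agent:
    """Any specialized device exposed to the scheduler."""
    handles: set = set()
    def run(self, task: Task) -> str:
        raise NotImplementedError

class GPUAgent(Agent):
    handles = {"matrix", "inference"}            # massively parallel workloads
    def run(self, task: Task) -> str:
        return f"GPU processed {task.kind}"

class FPGAAgent(Agent):
    handles = {"stream_filter", "compression"}   # fixed-function pipelines
    def run(self, task: Task) -> str:
        return f"FPGA processed {task.kind}"

class CPUScheduler:
    """The CPU only coordinates: it dispatches tasks and keeps a fallback."""
    def __init__(self, agents):
        self.agents = agents
    def dispatch(self, task: Task) -> str:
        for agent in self.agents:
            if task.kind in agent.handles:
                return agent.run(task)
        return f"CPU processed {task.kind} itself"   # no suitable agent found

if __name__ == "__main__":
    scheduler = CPUScheduler([GPUAgent(), FPGAAgent()])
    print(scheduler.dispatch(Task("inference", None)))
    print(scheduler.dispatch(Task("compression", None)))
    print(scheduler.dispatch(Task("sorting", None)))
```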
And here the central problem arises. This whole "zoo" of CPUs, GPUs, FPGAs, ASICs, and possibly other devices in the future, has to be given access to a common data array stored in memory. The data array is one, the agents are many, and all of them may address the same data at the same time. As a result, on the way to DIC we run into the well-known problem of memory coherence. It was recognized back in the 1970s: as soon as two processors were first connected to shared memory, their work with that memory somehow had to be coordinated. Since then, memory coherence has meant the problem of keeping data consistent when it is accessed by more than one processor. The sketch below illustrates the essence of the problem at the software level.
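Memory coherence is a hardware-level problem, but its essence, several agents updating the same data without coordination, can be shown with a software analogy. The sketch below is an illustration rather than a model of any real protocol: it shows how uncoordinated updates lose data and how explicit coordination restores a consistent result.

```python
# Lost updates from uncoordinated access to shared data, and the coordinated
# alternative. The lock plays the role that a coherence protocol plays in
# hardware: only one writer at a time sees and modifies the current value.
import threading

counter = 0
lock = threading.Lock()

def uncoordinated(n: int) -> None:
    global counter
    for _ in range(n):
        counter += 1              # read-modify-write with no coordination

def coordinated(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:
            counter += 1          # serialized read-modify-write

def run(worker, n: int = 100_000, threads: int = 4) -> int:
    global counter
    counter = 0
    ts = [threading.Thread(target=worker, args=(n,)) for _ in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counter

if __name__ == "__main__":
    print("uncoordinated:", run(uncoordinated))  # may be less than 400000
    print("coordinated:  ", run(coordinated))    # always 400000
```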
The figurative term "coherence" is borrowed from physics, where it denotes the coordinated behaviour, in space and time, of several oscillatory or wave processes whose phase difference remains constant. In computing, coordinated work with data (memory coherence) is ensured by special memory coherence protocols that regulate the work of the agents. As figures 2 and 3 show, two alternative monitoring schemes are possible: centralized and decentralized. Real implementations are considerably more complicated. Until recently, the monitoring problem arose mainly in UMA and NUMA multiprocessor systems, as well as cache coherency in CC-NUMA systems, where the caches themselves must be kept consistent. A simplified sketch of the centralized scheme is given below.
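The sketch below is a deliberately simplified illustration of the centralized (directory-based) scheme: a directory records which agents hold a copy of each line and invalidates stale copies before a write. Real directory protocols (MESI/MOESI variants) are far more elaborate; the class and method names here are invented for illustration.

```python
# A toy directory-based coherence scheme: the directory tracks sharers of
# each line and invalidates other copies before accepting a write.
class Directory:
    def __init__(self):
        self.sharers = {}   # line address -> set of agents holding a copy
        self.memory = {}    # line address -> current value

    def read(self, agent: str, addr: int) -> int:
        """Register the reader as a sharer and return the current value."""
        self.sharers.setdefault(addr, set()).add(agent)
        return self.memory.get(addr, 0)

    def write(self, agent: str, addr: int, value: int) -> None:
        """Invalidate every other copy, then let the writer update memory."""
        for other in self.sharers.get(addr, set()) - {agent}:
            print(f"invalidate line {addr:#x} in {other}'s cache")
        self.sharers[addr] = {agent}
        self.memory[addr] = value

if __name__ == "__main__":
    d = Directory()
    d.read("CPU", 0x100)
    d.read("GPU", 0x100)
    d.write("FPGA", 0x100, 42)    # CPU and GPU copies are invalidated first
    print(d.read("CPU", 0x100))   # the CPU re-reads the fresh value: 42
```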
For DIC, the movement toward memory coherence has split into two unequal streams. One is formed by Intel, the other by everyone else. Intel is developing its own proprietary approach, while the other participants have formed several public organizations with the aim of developing commonly accepted standards.
Intel's proprietary approach
Together with the launch of the Purley platform (Xeon Skylake-SP), Intel introduced a point-to-point interprocessor interconnect, Intel UltraPath Interconnect (UPI), a further development of QuickPath Interconnect (QPI). With its help, from two to eight processors can be given coherent access to shared memory. To ensure coherence, UPI uses both well-known mechanisms: directory-based and snoop-based ("snooping"). The decentralized, snoop-based alternative can be sketched in the same simplified manner, as shown below.
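For symmetry with the directory sketch above, the following toy model illustrates the snoop-based approach: every agent observes a shared bus, and a broadcast write invalidates the copies held by the others. This is an illustration of the general idea, not of Intel's UPI logic; all names are invented.

```python
# A toy snooping scheme: caches attached to a shared bus watch each other's
# writes and invalidate their own stale copies.
class Bus:
    def __init__(self):
        self.caches = []

    def attach(self, cache: "SnoopingCache") -> None:
        self.caches.append(cache)

    def broadcast_write(self, writer: "SnoopingCache", addr: int) -> None:
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_invalidate(addr)

class SnoopingCache:
    def __init__(self, name: str, bus: Bus):
        self.name, self.bus, self.lines = name, bus, {}
        bus.attach(self)

    def write(self, addr: int, value: int) -> None:
        self.bus.broadcast_write(self, addr)   # everyone else is told first
        self.lines[addr] = value

    def snoop_invalidate(self, addr: int) -> None:
        if addr in self.lines:
            print(f"{self.name}: line {addr:#x} invalidated")
            del self.lines[addr]

if __name__ == "__main__":
    bus = Bus()
    cpu, gpu = SnoopingCache("CPU", bus), SnoopingCache("GPU", bus)
    cpu.write(0x200, 7)
    gpu.write(0x200, 8)   # the CPU's copy is invalidated by the snooped write
```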
Intel's independent position is probably connected with the development of its own platform for DIC, in which the accelerator agents will be FPGAs rather than GPUs. It is no accident that Intel acquired Altera, one of the two main FPGA producers, and it looks as though the company is abandoning further development of the Xeon Phi processor, which supports massive parallelism and vectorization.
Unlike the majority, Intel is not betting on GPUs, which is quite reasonable. The use of GPUs for computation is, in essence, a forced measure: they were created for gaming computers and still carry the burden of that origin. GPUs are nothing more than arrays of simplified CPUs working according to the same von Neumann scheme. They cannot be as effective as specialized FPGAs in many applications that demand a high degree of parallelism with particular precision requirements, among them analytics, artificial intelligence, video transcoding, compression, security tasks and genomics. Intel's plans for the coming year include its own advanced PCI Express card based on the Arria 10 GX FPGA chip developed by Altera. It is expected to provide a higher data transfer rate and lower latency. Recall that UPI and QPI, the technologies that provide memory coherence, and their competitors are logical protocols: they run on top of physical protocols, in this case PCI Express.
The consolidated approach of the other market participants
All the other market participants are also preparing to build heterogeneous systems, but unlike Intel they are aiming at consolidation and have created three communities for this purpose, each of which would like to offer its own standards for memory coherence. The Register, never short of a sharp phrase, declares: "The CPU-memory-network interconnect technology wars are on."
On another site one can find an article entitled "The Battle of Data Center Interconnect Fabrics." The cause of these hostilities is the threat of an Intel monopoly in this area.
In some ways this war resembles the anti-Napoleonic campaigns, in which a single power was opposed by allied forces; here they form three coalitions:
- Cache Coherent Interconnect for Accelerators (CCIX)
- Gen-Z
- Open Coherent Accelerator Processor Interface (OpenCAPI)
They back, respectively, three standards. Each coalition has roughly 30 members, and their membership lists overlap by about 80%. This strange situation is most likely explained by the fact that, under conditions of uncertainty, it is hard to bet on a single player, so many companies have joined all three.
The CCIX standard
The logical protocol CCIX (Cache Coherent Interconnect for Accelerators) is proposed by the consortium of the same name (AMD, ARM, Huawei, IBM, Mellanox, Qualcomm, Xilinx and others) to provide distributed access to memory from agents (processors and accelerators), using the physical PCIe 4.0 protocol at the transport layer. In the specification of the non-profit PCI Special Interest Group (PCI-SIG), PCIe 4.0 runs at 16 GT/s, but CCIX allows acceleration to 25 Gbit/s. The consortium's efforts are concentrated on creating coherent cache memory that provides an interconnect between processors (from different vendors), accelerators and other agents. A distinctive feature of CCIX is that it is not tied to any particular architecture: it covers ARM, IBM Power and x86.
The Gen-Z standard
The scope of the Gen-Z standard extends not only to system memory but also to storage systems, so its founders include, along with the same companies as in CCIX, HPE, Dell, Micron, Seagate, SK Hynix, Samsung and Western Digital. According to the storage vendors, Gen-Z will gain popularity with the advent of the new Storage Class Memory (SCM) devices, which are the result of the convergence of RAM and storage. SCM is a new type of storage that can become an intermediate link between high-performance DRAM and cheap HDDs. SCM can provide read speeds close to those of DRAM and write speeds many times higher than those of hard drives. The first SCM was announced by Intel and Micron, who presented the three-dimensional 3D XPoint architecture. It will be used in data centers for storing frequently queried "hot" data. The Gen-Z interconnect will allow various accelerators to access that data. Ethernet is used as the transport.
The OpenCAPI standard
The OpenCAPI standard provides a basis for implementing high-speed communications between different system components, including RAM, data storage and the network. OpenCAPI (Open Coherent Accelerator Processor Interface) was founded by AMD, Google, IBM, Mellanox Technologies and Micron; its members also include Dell EMC, HPE, NVIDIA and Xilinx. The leading role belongs to IBM, which announced the CAPI technology in 2014. CAPI was developed for systems on the Power8 platform and became available to IBM's partners in the OpenPower project. Later the corporation decided to open access to the technology to the whole industry.
CAPI is tied to a specific processor, since it is implemented by a functional intermediary module for the coherent accelerator, the CAPP (Coherent Accelerator Processor Proxy). A response module, the PSL (Power Service Layer), is built into the accelerator. Together, the CAPP and the PSL can use the same coherent address space. In this pairing the accelerator receives the special name AFU (Accelerator Function Unit). The AFU should not be confused with the APU (Accelerated Processing Unit), the device proposed by AMD that combines a CPU and a GPU on one chip. The appearance of the APU recalls the consolidation into one processor of two devices of the 386 era: the CPU and the floating-point coprocessor. The conceptual sketch below contrasts the copy-based and the coherent, AFU-style models of accelerator access to memory.
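The sketch below is a purely conceptual contrast, not the CAPI/OpenCAPI API, between the classical accelerator model, which copies data back and forth, and an AFU-style agent that works directly in the host's coherent address space. All classes and method names are hypothetical.

```python
# Two models of accelerator access to host data: explicit copies versus a
# shared coherent address space (in real hardware, coherence is maintained
# by CAPP/PSL-like machinery; here it is only implied).
class HostMemory:
    def __init__(self, size: int):
        self.cells = [0] * size

class CopyBasedAccelerator:
    """Classic model: copy in, process a private buffer, copy back."""
    def process(self, host: HostMemory, lo: int, hi: int) -> None:
        local = list(host.cells[lo:hi])     # explicit copy over the bus
        local = [x * 2 for x in local]      # work on the private copy
        host.cells[lo:hi] = local           # explicit copy back

class CoherentAFU:
    """AFU-style model: operate directly on the shared address space."""
    def process(self, host: HostMemory, lo: int, hi: int) -> None:
        for addr in range(lo, hi):
            host.cells[addr] *= 2           # reads and writes host memory in place

if __name__ == "__main__":
    mem = HostMemory(8)
    mem.cells[:] = range(8)
    CopyBasedAccelerator().process(mem, 0, 4)
    CoherentAFU().process(mem, 4, 8)
    print(mem.cells)                        # [0, 2, 4, 6, 8, 10, 12, 14]
```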
In its first version, the CAPI interface worked on Power8 on top of PCIe Gen 3. CAPI 2, being prepared for the production Power9, assumes PCIe Gen 4.
In October 2016, after the formation of the OpenCAPI association (AMD, Google, IBM, Mellanox, Micron, Nvidia, Hewlett Packard Enterprise, Dell EMC, Xilinx and others), an entirely new OpenCAPI standard was created, previously known as CAPI 3.0. It abandons PCIe and moves to BlueLink 25G I/O in combination with NVLink 2.0.
The qualitative novelty of CAPI is that it opens the way to building heterogeneous systems at the level of the motherboard. Such heterogeneity is characteristic of modern cloud data centers, which are equipped with practically every type of computer.