2020/10/09 16:19:24

Data center reliability

Reliability requirements are rising further now that corporate data centers have become, without exaggeration, the "heart" and "brain" of digitized business. Under these conditions, modern information infrastructures are outgrowing the standard Tier III reliability level, even though it permits only a little more than an hour and a half of downtime per year. What additional requirements are being placed on data centers, and how is the data center construction industry responding? This article is part of the Technologies for DPC overview.
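
The Tier III allowance of a little more than an hour and a half per year follows directly from the availability percentage. Below is a minimal sketch of that arithmetic. The availability values per Tier are the commonly cited ones and are an assumption here, since the article itself does not list them.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored

# Commonly cited availability per Tier, in percent (assumed, not from the article)
TIER_AVAILABILITY = {
    "Tier I": 99.671,
    "Tier II": 99.741,
    "Tier III": 99.982,
    "Tier IV": 99.995,
}


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year a facility may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for tier, availability in TIER_AVAILABILITY.items():
    print(f"{tier}: {annual_downtime_hours(availability):.2f} h/year")
```

For Tier III this gives roughly 1.6 hours per year, which matches the figure quoted above.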

"It covers certification of the data center design (Design Documents), a physical check that the built engineering systems comply with the requirements and the approved design (Facility Infrastructure), and an audit of the data center's operating processes (Operational Sustainability)," says Pavel Goryunov, technical director of the CROC data center network.

By his estimate, the data center industry is already fairly mature, and over time the solutions used to build data centers do not change radically. Today the fault tolerance level has become as standard a characteristic of a data center as its floor area or installed power per employee.

"In fact, the continuity of a data center's operation depends on two key factors: a redundant engineering and hardware-software infrastructure built on reliable, proven systems, and competently organized operating processes. Taken together, all of this can be offered by providers that hold Tier III Gold Certification of Operational Sustainability," says Pavel Goryunov.

"The main factor in ensuring uninterrupted operation is redundancy of all key data center components, from individual components of a single server in a single rack up to redundancy of the entire data center by creating a backup site that can take over in a serious emergency."

"Indeed, the customer cares not only about the uninterrupted operation of the data center as a whole, but also about its individual applications and services," explains Vladimir Leonov, technical director of AMT Group. "With that in mind, and depending on the requirements and the budget, the customer can choose among different fault tolerance schemes, building redundancy either on additional hardware or by duplicating parts of the system across several data centers."

Distributed fault tolerance

The current trend toward geographically distributed data centers builds on a traditional method of achieving disaster tolerance and high availability: creating a structure of several data centers that duplicate and complement one another.

One possible topology of a disaster-tolerant data center

Each data center performs its own role at the same time, but thanks to economies of scale and maximum use of all the resources in this structure, construction becomes more cost-efficient. For example, in a typical distributed data center scheme, the local area networks (LAN) and storage area networks (SAN) of the sites are interconnected. The LAN segments to which servers are connected are joined into L2 domains, which allows server IP addresses to be moved between sites transparently to applications. Thanks to SAN consolidation, servers can use the storage resources of different sites. In addition, from the user's point of view the separate data centers look like a single system that delivers services through a single interface.

Classification of distributed data centers by distance between sites. Source: RUVDS

The emergence of new data center categories (regional, edge and so on) has made it necessary to classify data centers also by the distances that separate data centers of different levels within a single geographically distributed network.

Table. Characteristics of distributed data centers

Source: RUVDS company

New data center structures are changing ideas about how reliability is ensured. Experts note that the global Internet giants build their data centers on the "Tier 0" principle, that is, they use an architecture of distributed fault tolerance. This makes it possible to provide the resource redundancy needed for highly reliable services at the lowest cost. Figuratively speaking, the Tier III level is being replaced by "topological intelligence".
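
The economics of distributed fault tolerance can be illustrated with a simple availability calculation. The sketch below assumes that sites fail independently and that a service stays up as long as at least one site is up (ideal failover). Both assumptions are simplifications, and the input figures are illustrative rather than taken from the article.

```python
def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of a service that survives while any one site is up.

    Assumes independent site failures and instant, perfect failover.
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if sites < 1:
        raise ValueError("at least one site is required")
    # The service is down only when every site is down at the same time.
    return 1 - (1 - site_availability) ** sites


# Illustrative input: modest sites with 99% availability each
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions two 99% sites together reach 99.99%, which suggests why several modest sites can rival one heavily redundant facility. Correlated failures and failover delays reduce the gain in practice.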

According to IDC experts, by 2020-2021 autonomy of critical IT infrastructure will become one of the fundamental operating principles for half of all data centers. So-called "intelligent edge nodes" will be used more widely to provide it.

In line with this view, an edge data center has the "intelligence" needed for primary data analysis and therefore puts less load on communication channels, the central data center or the cloud. This is the principle behind, for example, the EcoStruxure IT data center infrastructure management system from Schneider Electric.

The basis of this platform is a data collection module. It includes a gateway that is installed on the edge system and aggregates data. The gateway performs primary analysis of events but does not send all data "upward"; it does so only when a failure occurs. If everything is operating normally, data can be sent, say, once every 15 minutes or half an hour.
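
The reporting policy described above (immediate forwarding on failure, periodic reporting otherwise) can be sketched as follows. This is a hypothetical illustration only: the `EdgeGateway` class, its method names and the message fields are invented for this example and are not the EcoStruxure IT API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EdgeGateway:
    """Hypothetical edge gateway: aggregates locally, reports upstream sparingly."""

    report_interval_s: float = 15 * 60  # assumed default: 15 minutes
    _last_report_s: Optional[float] = None
    _buffer: List[dict] = field(default_factory=list)

    def ingest(self, timestamp_s: float, reading: dict, is_failure: bool) -> Optional[dict]:
        """Return a message to send upstream, or None to keep the data local."""
        self._buffer.append(reading)
        if is_failure:
            # A failure is forwarded at once, together with the triggering reading.
            return {"type": "alarm", "time": timestamp_s, "reading": reading}
        due = (
            self._last_report_s is None
            or timestamp_s - self._last_report_s >= self.report_interval_s
        )
        if due:
            # Normal operation: send only a periodic summary of buffered readings.
            self._last_report_s = timestamp_s
            summary = {"type": "summary", "time": timestamp_s, "count": len(self._buffer)}
            self._buffer.clear()
            return summary
        return None


gateway = EdgeGateway()
print(gateway.ingest(0, {"temp_c": 22}, is_failure=False))
print(gateway.ingest(60, {"temp_c": 22}, is_failure=False))
print(gateway.ingest(120, {"temp_c": 41}, is_failure=True))
```

The design choice here is that an alarm does not reset the periodic timer, so routine summaries keep their cadence even during incidents; a real product may behave differently.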

According to Uptime Institute estimates, last year about 70% of companies worldwide used this approach to some extent to run applications synchronously in parallel across several geographically separated sites. One can even speak of its mass popularity as a way of organizing application operation, although it does require close attention to guaranteed transaction times. In addition, the required level of redundancy at each level of such a distributed data center, and the options for effective backup, must be calculated carefully. Meanwhile, according to a study conducted last year by Xelent, only 7% of Russian companies have their own disaster recovery plan.

Tier IV standard

The Tier IV standard takes into account the key aspect of the modern data center: its distributed structure.

"Under the Uptime Institute classification, a Tier IV data center means a fault-tolerant infrastructure," says Sergey Mishchuk, director of product development for data centers and cloud services at Rostelecom DPC.

Tier IV is the only level that provides fault tolerance, which is why it is called Fault Tolerant infrastructure. In practice this level means a fault-tolerant network topology, for which compartmentalization and continuous cooling are mandatory.

Table. Main differences between the Tier levels

Source: DataLine


Thus, at Tier I the minimum amount of equipment needed to run the data center (N) is used, i.e. there is no redundancy.

At Tier II, the engineering equipment is made redundant under an N+1 scheme.

At Tier III, both the engineering equipment and the distribution paths (power cables, routes, pipelines) are made redundant under an N+1 scheme.

At Tier IV, after a single failure of any piece of equipment, N active components still remain.

"Whereas in Tier III it is acceptable for such a switchover to require staff intervention, at Tier IV switchovers are either absent or automatic."

In addition, distribution paths (power cables, routes, pipelines) are designed differently in Tier IV. In Tier III they are redundant, while in Tier IV compartmentalization is also mandatory: the distribution paths must run through different rooms or in enclosed fire-resistant ducts. In effect, they meet only in the machine hall.
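
The difference between equipment redundancy and compartmentalized distribution paths can be illustrated with a toy model. It is a deliberately simplified sketch and not the Uptime Institute certification criteria: the function name and the layouts are hypothetical, and the model assumes that losing a distribution path also takes out every unit fed through it.

```python
from typing import Sequence


def survives_any_single_failure(units_per_path: Sequence[int], required_n: int) -> bool:
    """True if N units stay active after any single unit or single path is lost."""
    total = sum(units_per_path)
    # Case 1: one capacity unit fails.
    if total - 1 < required_n:
        return False
    # Case 2: one whole distribution path fails, along with its units (assumption).
    return all(total - units >= required_n for units in units_per_path)


N = 4  # units needed to carry the load (illustrative)
layouts = {
    "N on one path": [4],
    "N+1 on one path": [5],
    "N+1 split over two paths": [3, 2],
    "2N over two separated paths": [4, 4],
}
for name, layout in layouts.items():
    print(f"{name}: {survives_any_single_failure(layout, N)}")
```

In this model only the last layout passes: N+1 equipment tolerates the loss of one unit, but a shared or unevenly loaded path remains a single point of failure.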

"Tier III formally allows a slight temperature rise in the machine hall while switching between the main and backup air conditioner or chiller. In Tier IV, a temperature rise in the machine hall is not allowed even in theory," says Sergey Mishchuk.

The corresponding precise calculations must be carried out at the data center design stage.

To certify a data center to the Uptime Institute standard, it must pass assessment under three programs: certification of the design documentation, certification of the constructed facility and certification of operational sustainability under the Tier standard.

"Such certification is a challenge. We have to develop, and we feel we have the strength to be the first to reach this level," says Sergey Mishchuk. "It is important that it makes it possible to raise the level of service and reliability of the data center without a substantial increase in capital investment. It may well set a precedent for other commercial data centers."

Reliability of hybrid data centers

A public cloud can take over some functions of a backup data center. Placing applications in public clouds to increase flexibility and optimize IT infrastructure costs is often the most sensible solution.

According to a study of the disaster recovery market with a forecast for 2018-2025 (Disaster Recovery Solutions Market Size, Share & Trends, 2018 – 2025), in 2016 the hybrid cloud dominated deployments of new systems, and the corresponding market was estimated at $763.4 million. The popularity of this scheme is explained primarily by the fact that deploying disaster recovery through a hybrid cloud makes it possible to use software and hardware on the local site and recovery services in the cloud. It also allows a combination of virtual cloud servers and dedicated hosting infrastructure. This lets organizations substantially cut the costs of installing disaster recovery solutions.

In addition, deploying a hybrid cloud eliminates redundancy and increases fault tolerance, providing a flexible, reliable, scalable and cost-effective architecture with simplified backup and recovery of business data and applications.

"Over the past few years, we and Softline's customers have seen first-hand the importance of geo-distributed sites because of various incidents at the data centers where we build our solutions, for example fires, network outages and so on," says Yury Novikov, head of cloud computing development at Softline. "Geo-distributed infrastructure allowed us to keep cloud services running without interruption."

"If there are difficulties with communication channels or risks arise, our company creates backup infrastructure, and the clouds remain operational for customers," adds Yury Novikov.

Fast recovery after data center failures

According to forecasts by Grand View Research, by 2025 the global market for disaster recovery solutions will reach $26.23 billion. The above-mentioned study, Disaster Recovery Solutions Market Size, Share & Trends, forecasts that managed services will be the fastest-growing segment throughout the forecast period (to 2025). Analysts attribute the growth to services with additional features, such as remote monitoring, low costs and IT infrastructure management under convenient subscription pricing models.

Disaster Recovery as a Service (DRaaS) assumes that the provider replicates the company's servers to a remote site, where they can be brought up in the event of an accident. In other words, a copy of the company's servers is created in the cloud. If the client's infrastructure stops working, the copies can be started in the cloud and work can resume within minutes. All virtual servers remain reachable over the network, for example through an automatically created L2 VPN tunnel.

How the Cloud4Y DRaaS solution works

Problems of disaster recovery in hybrid IT infrastructures

Ensuring the reliability of hybrid data center structures is one aspect of the more general task of hybrid digital infrastructure management (HDIM). In their study of key trends for 2020, Gartner analysts covering IT infrastructure and its support point out that the scale and complexity of managing hybrid digital infrastructures is becoming an increasingly pressing problem for companies that bet on IT. However, HDIM is a new area, Gartner warns, and organizations should be wary of vendors that already offer a single solution for all hybrid management tasks. Gartner expects that it will take several more years for HDIM vendors to bring their products to a level that gives companies effective tools for managing their digital infrastructures.

Gartner illustrates the current problems with an example. Modern infrastructure is located in different places: at colocation providers, in on-premises data centers, on edge nodes and in cloud environments. The problem is that a hybrid IT structure can break existing disaster recovery schemes. This is because many organizations now rely on as-a-service (XaaS) offerings and often overlook the additional capabilities needed to ensure the right level of system resilience. Moreover, Gartner forecasts that by 2021 customers' failure to use the data redundancy capabilities offered by the cloud service provider will be the root cause of 90% of cloud availability (uptime) problems.

In other words, disaster recovery plans developed for traditional systems may well conflict with the requirements of new hybrid infrastructures. The data center industry has yet to find effective solutions to this problem. For now, the proposal is to rely on a traditional method that has proved itself well before: building resilience to failures into data center systems at the design stage of the upgraded system. At the practical level, IDC experts expect that by 2020-2021 systems for effective monitoring of computing capacity, data traffic and other data center resources will be deployed everywhere in order to guarantee high-quality customer service in data centers.