Storage Area Network (SAN)
A storage area network (SAN) is an architecture for connecting external storage devices, such as disk arrays, tape libraries, and optical drives, to servers in such a way that the operating system sees the attached resources as local. Building a SAN reduces the total cost of ownership of a storage system and provides the means to organize reliable data storage.
In the simplest case a SAN consists of storage systems, switches, and servers joined by optical links. Besides disk storage systems themselves, a SAN can include disk libraries, tape libraries (streamers), optical-disc storage devices (CD/DVD and others), and so on.
Example of a highly reliable infrastructure in which servers are connected simultaneously to the local network (left) and to the storage area network (right). This scheme preserves access to the data on the storage systems if any processor module, switch, or access path fails.
Using a SAN makes it possible to provide:
- centralized management of server and storage system resources;
- connection of new disk arrays and servers without stopping the whole storage system;
- use of previously purchased equipment alongside new storage devices;
- fast and reliable access to storage located far from the servers, without a significant performance penalty;
- faster backup and data recovery (BURA).
History
The development of network technologies produced two networked approaches to storage: storage area networks (Storage Area Network, SAN), which exchange data at the block level beneath the clients' file systems, and network attached storage servers (Network Attached Storage, NAS), which serve data at the file level. To distinguish traditional storage from networked storage, a retronym was coined: Direct Attached Storage (DAS).
DAS, SAN, and NAS appeared on the market one after another, and they reflect the evolving chain of links between the applications that use data and the bytes on the medium that holds it. At first application programs read and wrote blocks themselves; later, drivers appeared as part of the operating system. In modern DAS, SAN, and NAS the chain has three links: the first is building RAID arrays, the second is processing the metadata that lets binary data be interpreted as files and records, and the third is the services that deliver data to the application. The three architectures differ in where and how these links are implemented. With DAS the storage is "bare": it only stores data and provides access to it, and everything else happens on the server side, starting with the interfaces and the driver. With SAN, RAID moves to the storage side and everything else stays as in DAS. NAS differs in that the metadata needed for file access also moves to the storage, so the client only needs to support the data services.
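The placement of the three links can be written down as a small lookup table. The sketch below only restates the paragraph above; the names `LINKS`, `PLACEMENT`, and `server_side` are illustrative and do not come from any standard or product.

```python
# Where each link of the chain lives, per the description above.
# LINKS, PLACEMENT and server_side are hypothetical names used for illustration.
LINKS = ("raid", "metadata", "data_services")

PLACEMENT = {
    "DAS": {"raid": "server", "metadata": "server", "data_services": "server"},
    "SAN": {"raid": "storage", "metadata": "server", "data_services": "server"},
    "NAS": {"raid": "storage", "metadata": "storage", "data_services": "server"},
}

def server_side(architecture: str) -> list[str]:
    """Return the links that the server (client) still has to implement."""
    placement = PLACEMENT[architecture]
    return [link for link in LINKS if placement[link] == "server"]

for name in PLACEMENT:
    print(name, "->", server_side(name))
```

Reading the table row by row shows the progression: each architecture moves one more link from the server to the storage side.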
The emergence of SAN became possible after the Fibre Channel (FC) protocol was developed in 1988 and approved by ANSI as a standard in 1994. The term Storage Area Network dates from 1999. Over time FC gave ground to Ethernet, and IP-SAN networks connected over iSCSI became widespread.
The idea of a network storage server, NAS, belongs to Brian Randell of Newcastle University and was implemented on UNIX server machines in 1983. The idea proved so successful that it was taken up by many companies, including Novell, IBM, and Sun, although NetApp and EMC eventually took over as the leaders.
In 1995 Garth Gibson extended the principles of NAS and created object storage (Object Storage, OBS). He began by dividing all disk operations into two groups: those performed frequently, such as reads and writes, and those performed more rarely, such as operations on names. He then proposed one more container in addition to blocks and files, which he called an object.
OBS is distinguished by a new type of interface, called the object interface. Client data services interact with the metadata through an object API (Object API). OBS not only stores data but also supports RAID, stores the metadata that belongs to objects, and supports the object interface. DAS, SAN, NAS, and OBS coexist, but each access type largely corresponds to a particular type of data and application.
Architecture of SAN
Network topology
A SAN is a high-speed data network designed to connect servers to storage devices. The various SAN topologies (point-to-point, arbitrated loop, and switched) replace the traditional bus connections between a server and its storage devices and offer greater flexibility, performance, and reliability. The SAN concept rests on the ability to connect any server to any storage device that uses the Fibre Channel protocol. The way nodes interact in a point-to-point or switched SAN is shown in the figures. In a SAN with the Arbitrated Loop topology, data is passed sequentially from node to node. To begin transmitting, the sending device initiates arbitration for the right to use the transmission medium (hence the name of the topology, Arbitrated Loop).
The transport basis of a SAN is the Fibre Channel protocol, which runs over both copper and fiber-optic connections between devices.
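The arbitrated loop behavior described above can be illustrated with a toy model. This is a simplified sketch, not the FC-AL state machine: it assumes that the requester with the lowest address wins arbitration and that frames travel in one direction around the loop. The function names and node addresses are hypothetical.

```python
from typing import Optional

def arbitrate(requesters: list[int]) -> Optional[int]:
    """Pick which requesting node gets the loop.

    Assumption: the lowest address has the highest priority. Real loops
    use a more involved procedure; this only shows the idea.
    """
    return min(requesters) if requesters else None

def loop_path(nodes: list[int], sender: int, receiver: int) -> list[int]:
    """Nodes a frame visits, in order, travelling one way around the loop."""
    if receiver not in nodes:
        raise ValueError("receiver is not on the loop")
    i = nodes.index(sender)          # raises ValueError if sender is absent
    path = [sender]
    while nodes[i] != receiver:
        i = (i + 1) % len(nodes)     # pass the frame to the next node
        path.append(nodes[i])
    return path

loop = [1, 4, 7, 9]                  # hypothetical addresses, in loop order
winner = arbitrate([7, 4])           # nodes 7 and 4 both want to transmit
print(winner)                        # 4
print(loop_path(loop, winner, 1))    # [4, 7, 9, 1]
```

The second result shows why a loop scales poorly compared with a switch: a frame may have to pass through every other node before it reaches its destination.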
SAN components
SAN components fall into the following groups:
- Host Bus Adapters (HBA);
- data storage resources;
- devices that implement the SAN infrastructure;
- software.
Host Bus Adapters
HBAs are installed in servers and connect them to the SAN over the Fibre Channel protocol. The Fibre Channel protocol stack is implemented in the HBA. The best-known HBA manufacturers are Emulex, JNI, QLogic, and Agilent.
Data storage resources
Data storage resources include disk arrays, tape drives, and libraries with a Fibre Channel interface. Many capabilities of storage resources are available only when they are part of a SAN. For example, high-end disk arrays can replicate data between arrays over Fibre Channel networks, and tape libraries can write data to tape directly from disk arrays with a Fibre Channel interface, bypassing the network and the servers (serverless backup). The most popular disk arrays on the market came from EMC, Hitachi, IBM, and Compaq (the StorageWorks family, which Compaq inherited from Digital); among tape library manufacturers, StorageTek, Quantum/ATL, and IBM deserve mention.
Devices implementing the SAN infrastructure
The devices that implement the SAN infrastructure are Fibre Channel switches (FC switches), hubs (Fibre Channel Hub), and Fibre Channel-SCSI routers. Hubs join devices that work in Fibre Channel Arbitrated Loop (FC-AL) mode. With hubs, devices can be connected to and disconnected from the loop without stopping the system, because the hub automatically closes the loop when a device is switched off and automatically opens it when a new device is connected. Every change to the loop triggers a complex initialization process. The initialization has several stages, and no data can be exchanged in the loop until it completes.
All modern SANs are built on switches, which make full network connectivity possible. Switches not only connect Fibre Channel devices but also restrict access between them; for this purpose so-called zones are created on the switches. Devices placed in different zones cannot communicate with each other. The number of ports in a SAN can be increased by connecting switches to one another. A group of connected switches is called a Fibre Channel Fabric, or simply a Fabric. The links between switches are called Interswitch Links, or ISL for short.
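The zoning rule can be sketched as a membership check. The example below is a minimal model under stated assumptions: zone and device names are hypothetical, and a device is allowed to belong to more than one zone, which the text above does not state explicitly.

```python
# Zone name -> set of member devices. All names are hypothetical.
zones = {
    "zone_db": {"server_a", "array_1"},
    "zone_backup": {"server_b", "tape_lib", "array_1"},
}

def can_communicate(a: str, b: str) -> bool:
    """Two devices may talk only if at least one zone contains both."""
    return any(a in members and b in members for members in zones.values())

print(can_communicate("server_a", "array_1"))   # True: both are in zone_db
print(can_communicate("server_a", "tape_lib"))  # False: no shared zone
```

In this model the switch acts as a filter: `server_a` never sees the tape library, even though both are attached to the same fabric.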
Software
Software provides redundant access paths from servers to disk arrays and dynamic load distribution across those paths. For most disk arrays there is a simple way to determine that ports reachable through different controllers belong to the same disk. Specialized software maintains a table of access paths to devices, disables paths when they fail, brings new paths online dynamically, and balances load among them. Disk array manufacturers usually offer specialized software of this kind for their arrays. VERITAS Software produces VERITAS Volume Manager, which builds logical disk volumes from physical disks and provides redundant access paths to disks, along with load balancing across them, for most well-known disk arrays.
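The path table described above can be sketched in a few lines. This is a toy model, not the behavior of any particular multipathing product: it assumes simple round-robin load distribution, and the class, method, and path names are hypothetical.

```python
class MultipathDevice:
    """Toy path table for one disk that is reachable over several paths."""

    def __init__(self, paths):
        self.paths = list(paths)        # every path known for this disk
        self.healthy = set(self.paths)  # paths currently usable
        self._next = 0                  # round-robin counter

    def add_path(self, path):
        """Bring a new path online while the device is in use."""
        self.paths.append(path)
        self.healthy.add(path)

    def fail_path(self, path):
        """Take a path out of rotation after a failure."""
        self.healthy.discard(path)

    def pick_path(self):
        """Return the next healthy path in round-robin order."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path in self.healthy:
                return path
        raise RuntimeError("no healthy path to the device")

disk = MultipathDevice(["hba0:ctrlA", "hba1:ctrlB"])
print(disk.pick_path())        # hba0:ctrlA
print(disk.pick_path())        # hba1:ctrlB
disk.fail_path("hba0:ctrlA")   # simulate a controller or cable failure
print(disk.pick_path())        # hba1:ctrlB
```

After the failure all requests go over the surviving path, so the server keeps access to the disk without any change on the application side.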
Protocols used
Storage area networks use the following low-level protocols:
- Fibre Channel Protocol (FCP), SCSI transport over Fibre Channel. The most widely used protocol at present. It is available in 1 Gbit/s, 2 Gbit/s, 4 Gbit/s, 8 Gbit/s, and 10 Gbit/s variants.
- iSCSI, SCSI transport over TCP/IP.
- FCoE, transport of FCP/SCSI over "pure" Ethernet.
- FCIP and iFCP, encapsulation and transmission of FCP/SCSI in IP packets.
- HyperSCSI, SCSI transport over Ethernet.
- FICON, transport over Fibre Channel (used only by mainframes).
- ATA over Ethernet, ATA transport over Ethernet.
- SCSI and/or TCP/IP transport over InfiniBand (IB).
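The list above pairs each protocol with the network that carries it. As a rough illustration, the same information can be held as data and grouped by carrier. The tuple layout, the `by_carrier` helper, and the label used for the InfiniBand entry are assumptions made for this sketch.

```python
from collections import defaultdict

# (protocol, what it carries, what carries it), restating the list above.
PROTOCOLS = [
    ("FCP", "SCSI", "Fibre Channel"),
    ("iSCSI", "SCSI", "TCP/IP"),
    ("FCoE", "FCP/SCSI", "Ethernet"),
    ("FCIP", "FCP/SCSI", "IP"),
    ("iFCP", "FCP/SCSI", "IP"),
    ("HyperSCSI", "SCSI", "Ethernet"),
    ("FICON", "FICON", "Fibre Channel"),
    ("ATA over Ethernet", "ATA", "Ethernet"),
    ("transport over IB", "SCSI and/or TCP/IP", "InfiniBand"),  # label is ours
]

def by_carrier(protocols):
    """Group protocol names by the network that carries them."""
    groups = defaultdict(list)
    for name, _payload, carrier in protocols:
        groups[carrier].append(name)
    return dict(groups)

for carrier, names in by_carrier(PROTOCOLS).items():
    print(f"{carrier}: {', '.join(names)}")
```

Grouping this way makes it easy to see which options are open on an existing network, for example which protocols need only Ethernet.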
Advantages
- High reliability of access to data held on external storage systems. The SAN topology is independent of the storage systems and servers in use.
- Centralized data storage (reliability, security).
- Convenient centralized management of switching and data.
- Intensive I/O traffic is moved to a separate network, which offloads the LAN.
- High speed and low latency.
- Scalability and flexibility of the SAN's logical structure.
- The geographic extent of a SAN, unlike classic DAS, is practically unlimited.
- The ability to redistribute resources between servers quickly.
- The ability to build fault-tolerant cluster solutions on an existing SAN at no additional cost.
- A simple backup scheme, since all data is in one place.
- Additional features and services (snapshots, remote replication).
- A high degree of SAN security.
Sharing storage systems generally simplifies administration and adds considerable flexibility, because cables and disk arrays do not have to be physically moved and re-cabled from one server to another.
Another advantage is the ability to boot servers directly from the storage network. In such a configuration a faulty server can be replaced quickly and easily by reconfiguring the SAN so that the replacement server boots from the LUN of the faulty server. This procedure can take, for example, half an hour. The idea is relatively new, but it is already used in the newest data centers.
Storage networks also help restore operation more effectively after a failure. A SAN can include a remote site with a secondary storage device. In that case replication can be used, implemented either at the level of the array controllers or by special hardware devices. Because IP-based WAN links are common, Fibre Channel over IP (FCIP) and iSCSI were developed to extend a single SAN across IP networks. Demand for such solutions grew considerably after the events of September 11, 2001 in the USA.
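Replacing a server that boots from the SAN amounts to changing which server may see its LUNs. The sketch below models that with a plain dictionary; it is an assumption-laden simplification (one owner per LUN, hypothetical LUN and server names) and does not reflect the management interface of any real array or switch.

```python
# LUN -> server allowed to see it. All names are hypothetical.
lun_owner = {
    "lun_boot_01": "server_a",
    "lun_data_01": "server_a",
    "lun_boot_02": "server_b",
}

def replace_server(masking: dict[str, str], faulty: str, replacement: str) -> dict[str, str]:
    """Return a new table with every LUN of `faulty` handed to `replacement`."""
    return {
        lun: (replacement if owner == faulty else owner)
        for lun, owner in masking.items()
    }

after = replace_server(lun_owner, "server_a", "spare_1")
print(after["lun_boot_01"])   # spare_1
print(after["lun_boot_02"])   # server_b
```

The function returns a new table rather than editing the old one, so the previous assignment stays available if the change has to be rolled back.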
Shortcomings
The drawbacks all come down to the high cost of such solutions. The Russian storage market as a whole lags behind the markets of developed Western countries, especially in the widespread use of storage area networks. In particular, the scarcity and high cost of high-speed communication links continue to have an effect.
Difference from NAS
The main difference between SAN and NAS lies in how data exchange between storage devices and servers is organized. Broadly speaking, the SAN architecture addresses the problems caused by intensive backup and data exchange by moving the whole system to a dedicated subnet. SAN systems based on the Fibre Channel protocol allow the capacity of the storage system to be varied over a wide range and guarantee higher throughput within the dedicated subnet. Disk arrays and tape libraries that lack Fibre Channel interfaces can be connected to a SAN through Fibre Channel-SCSI routers.
See Also
- Software-defined storage (Software-Defined Storage, SDS)
- Direct Attached Storage, DAS
- Network attached storage (Network Attached Storage, NAS)
- Catalog of storage system products and projects
- Data backup