Backup system (technologies)
Neither RAID, nor clustering, nor any other fault-tolerance technology protects against errors that change or delete data, whether the operating system or a person is at fault. Backup is one of the best solutions for such situations, because it lets you keep copies of different ages: for example, one for each day of the current week, plus copies that are two weeks, one month, six months and one year old. The ability to use external removable media significantly reduces storage costs, although alternative technologies suit some tasks better.
A directory of backup solutions and projects is available on TAdviser.
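The idea of keeping copies of different ages can be illustrated with a minimal retention sketch. The target ages in days (30 for "monthly", 182 for "semi-annual", 365 for "annual") and the function name `backups_to_keep` are assumptions made for illustration; they do not come from any particular backup product.

```python
from datetime import date, timedelta

# Assumed target ages in days: each day of the current week,
# then two weeks, one month, six months and one year.
TARGET_AGES_DAYS = [0, 1, 2, 3, 4, 5, 6, 14, 30, 182, 365]


def backups_to_keep(backup_dates, today):
    """Return the backup dates worth keeping under the assumed policy.

    For every target age, keep the newest backup that is at least
    that old. Everything else may be rotated out.
    """
    keep = set()
    for age in TARGET_AGES_DAYS:
        cutoff = today - timedelta(days=age)
        candidates = [d for d in backup_dates if d <= cutoff]
        if candidates:
            keep.add(max(candidates))
    return sorted(keep)


if __name__ == "__main__":
    today = date(2024, 6, 30)
    # Hypothetical history: one backup per day for 400 days.
    history = [today - timedelta(days=i) for i in range(400)]
    for kept in backups_to_keep(history, today):
        print(kept.isoformat())
```

With a daily backup history, this keeps one copy per target age and lets every other copy be overwritten, which is what makes reusable removable media economical.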
Data backup is an integral part of corporate IT operations.
Streamers (tape drives)
Streamers (tape drives) are the most widespread means of backup. They are installed in every bank and in large and medium-sized enterprises, as well as in the IT departments of many small businesses. They are simple, reliable and inexpensive to maintain, and their few shortcomings have not yet outweighed their very substantial advantages. Incidentally, one of the largest consumers of magnetic tape cartridges is Google, which uses them for internal backup.
This is all the more true because modern tapes are far more advanced than their predecessors of even a few years ago. This makes it possible to reach impressive storage densities (for example, compact LTO-5 cartridges hold up to 3 TB) and to bring read and write speeds level with disk systems. In addition, storing data and organizing backup processes on tape remains the most economical solution for business users. So if a tape store is used on schedule and for its intended purpose, its minor inconveniences (for example, longer random access times) are insignificant compared with the advantages of tape drives. In a balanced IT infrastructure, creating backup copies does not require fast random access to data at all.
Disk storage
Disk storage is an alternative to tape drives. The idea of using hard drive arrays is not new, but until recently such storage was too expensive to use for backup. According to leading HP specialists, only 3% of companies in Russia use disk storage. However, as the cost per megabyte of hard drive storage has fallen, it has recently become feasible to use disks for backup copies. Today one of the most promising implementations is RDX systems, in which ruggedized disk cartridges in a special enclosure emulate a tape library. The Norwegian company Tandberg has been especially successful in producing them.
As the most effective fault-tolerance solution, specialists increasingly recommend complementing disk storage with tape and vice versa. One feature of modern IT systems is worth noting: "hot" data (data in constant use) usually makes up no more than 20%. Applications and data that are accessed fairly often are best kept on disk, and the archive on magnetic tape, a role in which tape has no worthy competitors yet.
Shadow copy technology (Shadow Copy)
Shadow copy technology is implemented in Windows Server 2003, and similar technology exists in third-party products for various platforms. The idea is fairly simple. On a disk partition, all changes are tracked at the lowest level according to a schedule (by default in Shadow Copy, twice a day), which makes it possible to restore the state of the whole disk, or even previous versions of individual files, as of the moment the shadow copy was created (restoring a previous version is available only when the file is accessed through a network share).
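Tracking changes at the block level can be pictured with a toy copy-on-write model. This is only a sketch of the general idea and makes no claim about how Windows Shadow Copy is implemented internally; the `Volume` class and its methods are hypothetical names.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live contents of the volume
        self.snapshots = []          # per snapshot: block index -> old content

    def create_snapshot(self):
        """Start a new snapshot; nothing is copied yet."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Overwrite a block, first preserving its old content for every
        snapshot that has not saved this block yet."""
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snapshot_id, index):
        """Read a block as it was when the snapshot was created."""
        saved = self.snapshots[snapshot_id]
        return saved.get(index, self.blocks[index])


if __name__ == "__main__":
    vol = Volume([b"report v1", b"budget v1"])
    morning = vol.create_snapshot()
    vol.write(0, b"report v2")
    print(vol.read_snapshot(morning, 0))  # old version of block 0
    print(vol.blocks[0])                  # current version of block 0
```

Only blocks that change after a snapshot take up extra space, which is why the space cost depends on how actively the volume is modified.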
The advantages of shadow copies are ease of use and the ability for users to restore a file without involving an administrator. Unfortunately, the copies consume hard drive space; copying cannot be configured for individual files or directories; the number of copies cannot be guaranteed; and retention cannot be configured, for example, to keep a month-old copy. On the whole, though, the technology deserves attention. A similar approach (automatic retention of old versions) is also implemented in many document workflow systems. Note that shadow copy technology is also implemented in Windows XP, where it is used for system restore (rollback) and for backup in NTBackup; unfortunately, there is no interface for restoring individual files.
Version control systems
Modern version control systems (such as CVS, Subversion or commercial products) can be used, sometimes quite conveniently, not only to track versions of program source code but also to store versions of, for example, corporate documents. The drawback of this approach in its "pure" form is that users have to be trained to work with such a system, which is not always easy. In addition, such systems handle some types of binary files extremely inefficiently.
Application-level data recovery
Many applications that work with data (for example, database management systems) maintain transaction logs that allow changes to be rolled back to a given point in time. Sometimes this requires nontrivial steps, as is the case with Microsoft SQL Server. This method should by no means be dismissed. In use it is very similar to a backup copy, but it gives fuller control over the point in the past to which the system is restored.
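Recovery to a chosen point in time can be sketched as replaying a log on top of an earlier copy of the data. The sketch below assumes a simplified log of `(timestamp, key, value)` records, where a value of `None` means deletion; the record layout and the function name `restore_to` are hypothetical and do not reflect the log format of any real DBMS.

```python
def restore_to(base_snapshot, log, target_time):
    """Rebuild the state as of target_time.

    base_snapshot: dict holding the data at the time of the last full copy.
    log: iterable of (timestamp, key, value) records made after that copy.
    """
    state = dict(base_snapshot)
    # Apply changes in chronological order and stop at the chosen moment.
    for timestamp, key, value in sorted(log, key=lambda record: record[0]):
        if timestamp > target_time:
            break
        if value is None:
            state.pop(key, None)   # the record describes a deletion
        else:
            state[key] = value
    return state


if __name__ == "__main__":
    base = {"balance": 100}
    log = [
        (1, "balance", 120),
        (2, "owner", "alice"),
        (3, "balance", 0),       # the erroneous change to undo
    ]
    # State just before the erroneous change at time 3.
    print(restore_to(base, log, target_time=2))
```

The choice of `target_time` is what gives finer control than a plain backup copy: any moment covered by the log can be reached, not only the moments when copies were taken.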
2016: Russian scientists find a method for fast data recovery
Alexander Barg, senior research associate at the Institute for Information Transmission Problems of the Russian Academy of Sciences (IITP RAS) and professor at the University of Maryland (USA), together with Itzhak Tamo of Tel Aviv University (Israel) and IITP RAS senior research associate Alexey Frolov, proposed bounds on the parameters of codes with local recovery, which are used in distributed data storage systems. Their article appeared in the journal IEEE Transactions on Information Theory. In 2015 Barg and Tamo received, for a publication in the same journal, one of the most prestigious awards in information theory and coding, the IEEE Information Theory Society Paper Award, for their extensive research on codes with local recovery[1].
To protect users against data loss, any information, whether on personal computers or in virtual storage (social networks, "clouds"), is distributed across several servers or disks. Disk failure is a frequent occurrence.
Modern distributed systems use two methods of data protection:
1. Duplication (backup) of the data on several disks. If one disk fails, recovering the information stored on it only requires reading one disk that holds a copy (in other words, the service information). Recovery time is minimal, but the total amount of stored information is very large: for example, if data is stored three times, the volume of service information is 200 percent.
2. Use of Reed-Solomon codes. In this case the amount of service information is minimal, but recovery takes much longer. For example, Facebook uses a Reed-Solomon code with parameters (14, 10). Here the volume of service information is 40 percent, but recovering one disk requires reading data from 10 others.
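The two percentages quoted in the list follow from the same formula: if k units of user data are stored as n units in total, the service information amounts to (n - k) / k of the user data. A short sketch of that arithmetic (the function name is illustrative):

```python
def overhead_percent(n, k):
    """Service information as a percentage of user data,
    when k units of data are stored as n units in total."""
    return 100.0 * (n - k) / k


# Three-fold replication: every unit of data is stored 3 times.
print(overhead_percent(3, 1))    # 200.0
# Code with parameters (14, 10): 10 data units plus 4 redundant units.
print(overhead_percent(14, 10))  # 40.0
```

Replication pays for its one-disk repair with a large overhead, while the (14, 10) code keeps the overhead low at the price of reading 10 disks to rebuild one.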
Since most often a single disk fails, the problem arises of constructing codes with the local recovery property. Such codes should be able to recover the failed disk with the minimum number of accesses to other disks. The volume of service information should also be minimal.
Codes with local recovery were first proposed by Microsoft researchers P. Gopalan, S. Yekhanin and others. They established a bound on the minimum amount of service information for such coding.
In their work, Alexander Barg and Itzhak Tamo proposed a general algebraic data encoding technique that achieves this bound, that is, has the best possible efficiency.
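The local recovery property can be illustrated with a deliberately simple toy scheme: data blocks are split into small groups, and each group gets one XOR parity block, so a single lost block is rebuilt from its own group only. This is an assumption-laden illustration of locality, not the algebraic construction discussed in the article, and it tolerates only one failure per group. All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks, group_size):
    """Split blocks into groups and append one XOR parity block to each."""
    groups = []
    for start in range(0, len(data_blocks), group_size):
        group = list(data_blocks[start:start + group_size])
        groups.append(group + [xor_blocks(group)])
    return groups


def recover(group, lost_index):
    """Rebuild one lost block using only the surviving blocks of its group."""
    survivors = [block for i, block in enumerate(group) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    groups = encode(data, group_size=2)
    # "Disk" 1 of the first group fails; only 2 other blocks are read.
    print(recover(groups[0], lost_index=1))  # b'BBBB'
```

Here repairing one block reads two blocks instead of all four data blocks, at the cost of one extra parity block per group; the trade-off between such locality and the volume of service information is exactly what the bounds describe.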
In an article published in the June issue of IEEE Transactions on Information Theory, Tamo, Barg and Frolov studied the construction of codes with local recovery and obtained lower and upper bounds on the parameters of codes with multiple recovering sets, such as the volume of service information and the minimum distance.
"We consider the case where several recovering sets of disks are available for each disk (a symbol of a large alphabet). This property guarantees high availability of frequently accessed data: when a disk fails, different users can recover it by accessing different servers holding service data. In this way, the load in the system is distributed optimally," Alexey Frolov explained.
Notes