2012/02/21 11:57:18

Backing up 'Big Data': 10 questions about deduplication

Just a few years ago, data deduplication was a specialized, optional feature of enterprise storage systems, used mainly for backup and archiving. It later found further use in cloud gateways, where it removed redundant blocks of data before they were placed on a disk array or in a virtual tape library. Now that deduplication is becoming a standard, built-in function of unified computing systems, knowledge of how to use it most effectively needs to spread. This slideshow offers a fresh look at some of the questions that storage specialists and IT managers should put to their storage vendors. The information was provided to eWeek by Jeff Tofano, chief technology officer of Sepaton, whose storage software scales to heavy workloads and runs on commodity-class hardware.


How will deduplication affect backup performance?

High performance matters for large enterprises that have to move huge, exponentially growing volumes of data into a protected backup environment within the short time windows allotted for it. Understanding the performance differences between deduplication technologies, especially as they evolve, plays a large role in choosing the one best suited to a specific environment.

Will deduplication reduce data recovery speed?

Measure the time needed to restore files whose backup copies were created within the last week (the most common recovery request). Ask the vendor whether, with its technology, the most recent backup copy is available for immediate restore and for fast copying to magnetic tape.

How will capacity and performance scale as the enterprise grows?

Determine how much data you will be able to store in a single system with deduplication, given your policies, the savings deduplication delivers, your data types and your rate of data growth. Assess the consequences of exceeding that capacity. For example, if exceeding it forces you to buy additional backup storage systems, factor in the administrative overhead, the capital costs and the disruption to your existing computing environment.

How effective is deduplication for large databases?

When evaluating performance, configure deduplication to compare data segments smaller than 8 KB. The large DBMSs that enterprises depend on, such as Oracle, SAP, SQL Server and DB2, usually modify data in segments of 8 KB or less. Many deduplication systems, however, slow the backup process sharply when they have to compare data segments smaller than 16 KB.
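To make the block-size point concrete, here is a minimal sketch (with invented sizes and data, not any vendor's implementation) of fixed-size block deduplication. It shows that when a single 8 KB database page changes, an 8 KB chunk size stores only 8 KB of new data, while a 64 KB chunk size forces a full 64 KB block to be stored again.

import hashlib

def dedup_store(data: bytes, chunk_size: int, store: dict) -> int:
    """Split data into fixed-size chunks and keep only unseen ones.
    Returns the number of bytes actually written to the store."""
    written = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:          # new, unique chunk
            store[digest] = chunk
            written += len(chunk)
    return written

# Simulate a 1 MB database file in which a single 8 KB page is modified.
original = bytes(1024 * 1024)
changed = bytearray(original)
changed[16384:16384 + 8192] = b"\x01" * 8192   # change one 8 KB page

for chunk_size in (8 * 1024, 64 * 1024):
    store = {}
    dedup_store(original, chunk_size, store)                # first full backup
    delta = dedup_store(bytes(changed), chunk_size, store)  # next backup
    print(f"{chunk_size // 1024} KB chunks: {delta} new bytes stored")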

How effective is deduplication with progressive incremental backup?

Some deduplication software is inefficient with the progressive incremental backups performed by the Tivoli Storage Manager (TSM) package, and with applications that fragment data, such as NetWorker and HP Data Protector. Ask the vendor whether its deduplication technology can use the metadata of such backup applications to identify the zones most likely to contain duplicate data, and then perform byte-level comparison to reduce the amount of data as much as possible while maintaining high performance.
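The fingerprint-then-verify idea mentioned above can be illustrated with a short sketch; the function names and the choice of SHA-1 are assumptions for illustration, not TSM's or any vendor's actual logic. A cheap fingerprint narrows the search to likely duplicates, and a byte-for-byte comparison confirms the match before the new copy is discarded.

import hashlib

index = {}  # fingerprint -> previously stored segment

def is_duplicate(segment: bytes) -> bool:
    """Return True only if an identical segment is already stored."""
    fingerprint = hashlib.sha1(segment).digest()        # cheap candidate lookup
    candidate = index.get(fingerprint)
    if candidate is not None and candidate == segment:  # byte-for-byte check
        return True
    index[fingerprint] = segment                        # keep the unique segment
    return False

print(is_duplicate(b"block A"))   # False: first time seen, stored
print(is_duplicate(b"block A"))   # True: confirmed duplicate, discarded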

What data reduction can be expected?

Rather than chasing the highest deduplication ratio in the abstract, use a strategy that works better for large enterprises: choose a solution that is guaranteed to store the data within the time allowed for backup while still deduplicating effectively. Parallel processing, a deterministic rate of data movement into the archive, deduplication and data replication are all essential in an enterprise environment.
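As a rough illustration with made-up numbers (not figures from the article), the calculation below shows why the deduplication ratio alone is not enough: the data must also move into the archive fast enough to fit the backup window.

# Illustrative, assumed numbers: a high ratio is of little use if the data
# cannot be ingested within the backup window.
logical_tb = 100.0      # data protected per backup cycle (assumed)
stored_tb = 10.0        # physical capacity consumed after deduplication (assumed)
window_hours = 8.0      # backup window (assumed)

dedup_ratio = logical_tb / stored_tb                          # 10:1 reduction
ingest_gb_per_s = logical_tb * 1024 / (window_hours * 3600)   # sustained rate needed

print(f"deduplication ratio: {dedup_ratio:.0f}:1")
print(f"required ingest rate: {ingest_gb_per_s:.2f} GB/s")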

Can backup administrators monitor backup, deduplication, replication and data recovery across the enterprise?

A comprehensive approach to data protection lets backup administrators manage more data per person, tune backups for optimal utility and efficiency, and plan accurately for future enterprise-wide performance and capacity requirements.

Will deduplication help lower the bandwidth requirements for backing up large volumes of enterprise data?

Some deduplication technologies let companies replicate data over the Internet more efficiently by replicating only byte-level changes, which reduces both the bandwidth requirements and the time it takes to protect the data.
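A simplified sketch of change-only replication, with invented block sizes and data and no relation to any vendor's protocol: the source compares the new backup against the previously replicated copy and sends only the regions that differ, rather than the whole file.

def byte_delta(old: bytes, new: bytes, block: int = 4096):
    """Yield (offset, data) for the blocks of 'new' that differ from 'old'."""
    for offset in range(0, len(new), block):
        chunk = new[offset:offset + block]
        if old[offset:offset + block] != chunk:
            yield offset, chunk

def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new version on the target from the old copy plus the delta
    (for simplicity this sketch assumes the file size did not change)."""
    result = bytearray(old)
    for offset, chunk in delta:
        result[offset:offset + len(chunk)] = chunk
    return bytes(result)

old = b"A" * 100_000
new = bytearray(old)
new[5000:5008] = b"CHANGED!"                     # a small byte-level change
delta = list(byte_delta(old, bytes(new)))
sent = sum(len(chunk) for _, chunk in delta)
print(f"replicated {sent} of {len(new)} bytes")  # only the changed block travels
assert apply_delta(old, delta) == bytes(new)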

Will IT staff be able to tailor deduplication to their needs?

Enterprise data-protection environments can contain data types that place special demands on deduplication. Look for solutions that let IT specialists define deduplication data sets based on backup policy and data type, and that automatically detect the type of data being backed up. Prefer technology that gives IT staff the ability to choose the most effective deduplication method for each data type.
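A hypothetical illustration of such policy-driven configuration; the data types, method names and defaults below are invented for the example and are not options of any real product.

# Invented data types, method names and defaults, purely for illustration.
DEDUP_POLICY = {
    "oracle_db":   {"method": "byte_differential", "segment_kb": 8},
    "vm_images":   {"method": "fixed_block",       "segment_kb": 64},
    "media_files": {"method": "none"},   # already compressed, poor dedup candidates
}

def settings_for(data_type: str) -> dict:
    """Pick deduplication settings for a backup job based on its data type."""
    return DEDUP_POLICY.get(data_type, {"method": "variable_block", "segment_kb": 16})

print(settings_for("oracle_db"))      # policy-defined settings
print(settings_for("file_shares"))    # falls back to the default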

What experience does the vendor have with large enterprise backup environments?

A vendor with experience in enterprise-class backup applications such as NetBackup, NetBackup OST and TSM is suited to an enterprise data center with its huge data volumes and complex policies. The vendor should be prepared to assess your backup needs and to recommend how to optimize the overall backup architecture to achieve maximum backup, replication and deduplication throughput in your environment.