History
The mathematical apparatus of DBMS with variable dimensionality, or multidimensional DBMS, was developed by the outstanding American mathematician Don Nelson in the 1960s at the request of the U.S. Department of Defense. Since 1968, multidimensional DBMS have been widely used by the federal services of many countries. In 1991 we chose the multidimensional DBMS of Pick Systems because the corporation had a representative office and technical center in Moscow, and because it offered a DBMS with operating system functions for the Intel 80x86 platform. A multidimensional DBMS is understood as a database management system that implements the so-called Not Normalized Relational Form (NNRF). Such a system can process data models that adequately represent the real world, and it is free from the well-known basic shortcomings of traditional DBMS built on the normalized relational form (SQL-like DBMS such as Oracle, Informix, MS SQL Server, etc.).
Specific Features
In a DBMS based on a multidimensional view of data, data are organized not as relational tables but as ordered multidimensional arrays: hypercubes (all cells stored in the database must have the same dimensionality, i.e. be defined over the most complete set of dimensions) and/or data marts. Data marts are subject-oriented subsets of the data warehouse designed to meet the needs of a particular group (community) of users and to satisfy the organization's requirements for protection against unauthorized access. They respond to queries faster because requests go to the relatively small units of data needed by a specific user group. To achieve comparable performance, relational systems require careful design of the database schema, selection of indexing methods, and special tuning. In multidimensional databases, as a rule, it is not even necessary to specify which attributes (or groups of attributes) the data should be indexed on. The restrictions of SQL remain a reality: many built-in functions that are easily provided in systems based on a multidimensional view of data cannot be implemented in relational DBMS.
At the same time, relational DBMS provide a qualitatively higher level of data protection and access control, have more mature administrative tools, and have real experience with large and very large databases. For multidimensional databases, by contrast, there are currently no uniform standards for the interface or for the data description and data manipulation languages. Multidimensional DBMS also do not support data replication, which is most often used as a loading mechanism.
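As a rough illustration of why a hypercube needs no separately declared index, the following Python sketch keeps cells in a flat array and computes each cell's position directly from its coordinates. The `DenseCube` class, the dimension names, and the members are hypothetical and chosen only for illustration; real multidimensional DBMS use their own storage layouts.

```python
class DenseCube:
    """A toy dense hypercube: every combination of dimension members has a cell."""

    def __init__(self, dimensions):
        # dimensions: dict mapping a dimension name to an ordered list of members
        self.names = list(dimensions)
        self.members = [list(dimensions[name]) for name in self.names]
        # Position of each member within its own dimension.
        self.positions = [{m: i for i, m in enumerate(ms)} for ms in self.members]
        size = 1
        for ms in self.members:
            size *= len(ms)
        self.cells = [None] * size  # None marks an undefined value

    def offset(self, coords):
        # Row-major addressing: the coordinates alone determine the cell position.
        off = 0
        for pos, ms, member in zip(self.positions, self.members, coords):
            off = off * len(ms) + pos[member]
        return off

    def set(self, coords, value):
        self.cells[self.offset(coords)] = value

    def get(self, coords):
        return self.cells[self.offset(coords)]


# Hypothetical dimensions: 2 regions x 3 products x 2 months = 12 cells.
cube = DenseCube({
    "region": ["North", "South"],
    "product": ["A", "B", "C"],
    "month": ["Jan", "Feb"],
})
cube.set(("South", "B", "Feb"), 120)
print(cube.get(("South", "B", "Feb")))  # 120
```

The lookup involves only arithmetic on member positions, which is one way to read the claim that no indexing instructions are required. The cost is that all 12 cells are allocated even though only one holds a value.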
Details of the organization
Multidimensional databases, for purely historical reasons, "are not able" to work with large volumes of data. Today their practical limit is a database of 10-20 gigabytes. This restriction is not connected with any inherent shortcoming of the multidimensional approach and is most likely temporary, but for now it has to be reckoned with. Moreover, because of denormalization and aggregation performed in advance, 20 gigabytes in a multidimensional database are equivalent, at best, to no more than 1 gigabyte of source data. By Codd's estimates, for systems based on a multidimensional view of data this ratio lies in the range from 2.5 to 100.
Here we must dwell on the main shortcoming of multidimensional databases: their inefficient use of external storage compared with relational databases. The multidimensional approach rests on representing data as multidimensional hypercubes, and it is usually assumed that such a hypercube has no gaps, that is, all cells of the cube are always filled. This is because the data are usually stored as a set of logically ordered blocks (arrays) of fixed length, and the block is the minimum indexed unit. Multidimensional DBMS usually do not store blocks that are filled entirely with undefined values, but this is only a partial solution. Because the data in such systems are stored in sorted order, undefined values are eliminated, and even then only partially, only if the sort order is chosen so that they cluster into the largest possible contiguous groups. Therefore, the use of a multidimensional DBMS is justified only under the following conditions:
- The volume of source data for analysis is not too large (no more than several gigabytes), i.e. the level of data aggregation is rather high;
- The set of dimensions is stable (since any change in their structure almost always requires a complete reorganization of the hypercube);
- System response time for ad hoc queries is the most critical parameter;
- Extensive use of complex built-in functions for cross-dimensional calculations over hypercube cells is required, including the ability to write user-defined functions.
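The fixed-length block storage described before this list can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the layout of any particular product: the block length of 4 cells, the function names, and the sample data are all hypothetical. It shows why dropping all-undefined blocks helps only partially, and why the sort order matters.

```python
BLOCK_SIZE = 4  # hypothetical fixed block length, in cells


def stored_blocks(cells, block_size=BLOCK_SIZE):
    """Return {block_number: block} for blocks holding at least one defined value."""
    blocks = {}
    for start in range(0, len(cells), block_size):
        block = cells[start:start + block_size]
        # A block filled entirely with undefined values (None) is not stored.
        if any(value is not None for value in block):
            blocks[start // block_size] = block
    return blocks


def wasted_cells(blocks):
    """Count undefined cells that still occupy space inside stored blocks."""
    return sum(value is None for block in blocks.values() for value in block)


# The same four defined values under two different sort orders.
scattered = [1, None, None, None, 2, None, None, None,
             3, None, None, None, 4, None, None, None]
grouped = [1, 2, 3, 4] + [None] * 12

print(len(stored_blocks(scattered)), wasted_cells(stored_blocks(scattered)))  # 4 12
print(len(stored_blocks(grouped)), wasted_cells(stored_blocks(grouped)))      # 1 0
```

With the scattered order every block contains one defined value, so all four blocks are stored and 12 undefined cells still take up space. With the grouped order the undefined values form one contiguous run, three blocks are dropped, and nothing is wasted.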
However, it would be incorrect to set the relational and multidimensional approaches against each other or to speak of competition between them. The two approaches complement each other. The relational approach was never intended for tasks that require synthesis, analysis, and consolidation of data; it was assumed that such functions would be implemented with tools external to the relational DBMS. Multidimensional DBMS are now increasingly used not only as independent software products but also as analytical tools on top of data warehouses or traditional operational systems implemented with relational DBMS. Such a solution makes the fullest use of the advantages of each approach: compact storage of detailed data and support for very large databases, provided by relational DBMS, and simple setup and good response time when working with aggregated data, provided by multidimensional DBMS.
Advantages
- With a multidimensional DBMS, search and retrieval of data are much faster than with a multidimensional conceptual view over a relational database, because the multidimensional database is denormalized, contains pre-aggregated indicators, and provides optimized access to the requested cells.
- Multidimensional DBMS easily handle the inclusion of various built-in functions in an information model, whereas the objectively existing restrictions of SQL make these tasks rather difficult, and sometimes impossible, on the basis of relational DBMS.
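The effect of pre-aggregation named in the first advantage can be illustrated with a small Python sketch. It assumes a hypothetical two-dimensional sales model; the `ALL` marker, the function names, and the sample rows are invented for illustration and do not come from any specific DBMS.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical marker meaning "all members of this dimension"


def build_cube(rows):
    """Pre-aggregate totals for every (region, product) pair, including ALL on either axis."""
    cube = defaultdict(int)
    for region, prod, amount in rows:
        # Each detail row contributes to its own cell and to three aggregate cells.
        for r, p in product((region, ALL), (prod, ALL)):
            cube[(r, p)] += amount
    return dict(cube)


def relational_total(rows, region=None, prod=None):
    """Answer the same question by scanning the detail rows at query time."""
    return sum(a for r, p, a in rows if region in (None, r) and prod in (None, p))


rows = [
    ("North", "A", 10),
    ("North", "B", 20),
    ("South", "A", 30),
    ("South", "B", 40),
]
cube = build_cube(rows)

print(cube[("North", ALL)])                    # 30, a single lookup
print(relational_total(rows, region="North"))  # 30, computed by scanning all rows
print(len(rows), len(cube))                    # 4 9
```

The cube answers an aggregate query with one lookup, while the scan has to visit every detail row. The last line also shows the price: four detail rows expand into nine stored cells, which is the same kind of growth that the storage ratio discussed earlier describes.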
Shortcomings
- Highly skilled programmers must be involved for even the slightest change to the database structure.
- The end user cannot independently analyze data in any way not provided for in advance by the programmers.