2012/02/08 10:22:03

In-Memory Computing: Computation in RAM

In-memory computing refers to high-performance distributed systems designed to store and process data in RAM in real time. They deliver performance orders of magnitude faster than disk-based systems. Because in-memory computing technologies accelerate the processing of large data volumes, they are gaining popularity among enterprises as the Big Data phenomenon grows.

A directory of BI solutions and projects is available on TAdviser.ru.


FAQ on in-memory DBMS

In-memory database management systems (in-memory DBMS, also known as in-memory database systems, IMDS) are a growing segment of the global DBMS market. In-memory DBMS emerged in response to the new tasks facing applications, new system requirements, and new operating environments.

What is an in-memory DBMS?

An in-memory DBMS is a database management system that stores data directly in RAM. This contrasts sharply with the approach of traditional DBMS, which are designed to store data on persistent media. Because processing data in RAM is faster than going to the file system and reading information from it, an in-memory DBMS delivers much higher application performance. And because an in-memory DBMS is built much more simply than a traditional one, it also places far lower demands on memory capacity and CPU resources.
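
As a minimal illustration of the difference, the sketch below uses Python's built-in sqlite3 module, which supports both a file-backed and a purely in-memory database. SQLite is used here only because it is readily available; it is not one of the products discussed in this article, and the file name is invented.

    import sqlite3

    # Traditional, disk-backed database: every write ultimately hits the file system.
    disk_db = sqlite3.connect("example_on_disk.db")   # hypothetical file name

    # In-memory database: the ":memory:" target keeps all data in RAM,
    # so reads and writes never touch persistent storage.
    ram_db = sqlite3.connect(":memory:")

    for db in (disk_db, ram_db):
        db.execute("CREATE TABLE IF NOT EXISTS sensor (id INTEGER PRIMARY KEY, value REAL)")
        db.execute("INSERT INTO sensor (value) VALUES (?)", (42.0,))
        db.commit()

    print(ram_db.execute("SELECT value FROM sensor").fetchone())   # (42.0,)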

If the goal is to avoid I/O, why not achieve it with caching?

Caching is a process in which a traditional DBMS keeps frequently used records in memory for quick access. However, caching speeds up only the lookup of the required information, not its processing, so the performance gain is significantly smaller. Moreover, managing the cache is itself a resource-intensive process that consumes considerable memory and CPU power.
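
A toy sketch of that distinction, under the simplifying assumption that the record fetch simulates disk I/O and that the processing step stands for the work done on the data afterwards (both functions are invented for illustration):

    import functools
    import time

    @functools.lru_cache(maxsize=1024)            # the cache: repeated lookups become cheap
    def fetch_record(record_id: int) -> tuple:
        time.sleep(0.01)                           # simulated disk access
        return (record_id, record_id * 2.5)

    def process(record: tuple) -> float:
        # Processing still runs on every request, cached or not:
        # the cache removed only the lookup cost.
        return sum(record) * 0.8

    print(process(fetch_record(7)))   # slower: cache miss plus processing
    print(process(fetch_record(7)))   # faster: cache hit, but the processing cost remains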

Is it possible to get an effect similar to an in-memory DBMS by creating a RAM disk and deploying a traditional DBMS on it?

As a stopgap, placing the entire database on a RAM disk can speed up reads and writes. Nevertheless, this approach has a number of shortcomings. In particular, the database is still tied to the disk storage layer, and database processes such as caching and data I/O are still performed even when they are redundant. In addition, a database placed on a disk receives a stream of requests that consume time and CPU resources, and with a traditional DBMS there is no way to avoid this even when it sits on a RAM disk. An in-memory DBMS, by contrast, uses a single data transfer path, which simplifies data processing. Eliminating redundant copies of data reduces memory consumption, improves reliability, and minimizes CPU requirements.

Are there data quantifying the performance difference between the three approaches described above?

According to published tests by McObject, in which the performance of the same application was compared, moving a traditional DBMS to RAM yielded reads 4 times faster and database updates 3 times faster than the traditional DBMS on a hard drive. The in-memory DBMS showed even more striking results against the DBMS on a RAM disk: database reads were 4 times faster, and database writes were 420 (!) times faster.

Performance of the in-memory eXtremeDB DBMS compared with the db.linux DBMS on a RAM disk
McObject, 2009
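
A rough micro-benchmark in the same spirit can be sketched with Python's sqlite3 module. This is only an illustration of how such a comparison is set up: absolute numbers will differ greatly from the McObject results, the file names are invented, and the RAM-disk line assumes a tmpfs mount such as /mnt/ramdisk exists on the machine.

    import os
    import sqlite3
    import time

    def bench(target: str, rows: int = 10_000) -> tuple:
        con = sqlite3.connect(target)
        con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v REAL)")
        t0 = time.perf_counter()
        con.executemany("INSERT INTO t (v) VALUES (?)", ((float(i),) for i in range(rows)))
        con.commit()
        write_s = time.perf_counter() - t0
        t0 = time.perf_counter()
        con.execute("SELECT SUM(v) FROM t").fetchone()
        read_s = time.perf_counter() - t0
        con.close()
        return write_s, read_s

    if os.path.exists("bench_disk.db"):
        os.remove("bench_disk.db")                        # start from a clean file

    print("hard drive:", bench("bench_disk.db"))          # traditional DBMS on disk
    # print("RAM disk:", bench("/mnt/ramdisk/bench.db"))  # traditional DBMS on a RAM disk
    print("in memory: ", bench(":memory:"))               # in-memory database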

What else distinguishes an in-memory DBMS from a traditional one?

An in-memory DBMS carries none of the overhead of data I/O operations. The architecture of such databases is more streamlined from the outset, and memory usage and processor cycles are optimized.

For which applications is an in-memory DBMS relevant?

In-memory DBMS are typically used for applications that demand extremely fast access to, storage of, and manipulation of data, and also in systems that have no disk at all yet must still manage a significant amount of data.

How well do in-memory DBMS scale? If an application manages a terabyte of data, is that too much for an in-memory DBMS?

According to a McObject report, in-memory DBMS scale well beyond a terabyte. In the tests carried out, a 64-bit in-memory DBMS installed on a 160-core SGI Altix 4700 server running Novell SUSE Linux Enterprise Server version 9 reached 1.17 terabytes and 15.54 billion rows with no visible limits to further scaling. Performance in this test remained practically unchanged as the database grew to hundreds of gigabytes and then past a terabyte, which demonstrates almost linear scalability.

Isn't it true that an in-memory DBMS is unsuitable for use across a network of several or more computers?

An in-memory DBMS can be either embedded or client-server. Client-server DBMS are inherently multi-user, so an in-memory DBMS can likewise be shared among several threads, processes, or users.
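
A small sketch of the multi-user point: two connections, which could belong to different threads of a server process, sharing one in-memory database. SQLite's shared-cache in-memory mode is used purely as an illustration; a server-grade in-memory DBMS would expose the same idea over a network protocol.

    import sqlite3

    URI = "file::memory:?cache=shared"          # one shared in-memory database

    writer = sqlite3.connect(URI, uri=True)
    reader = sqlite3.connect(URI, uri=True)     # a second "user" sees the same data

    writer.execute("CREATE TABLE quotes (symbol TEXT, price REAL)")
    writer.execute("INSERT INTO quotes VALUES ('ABC', 101.5)")
    writer.commit()

    print(reader.execute("SELECT price FROM quotes WHERE symbol = 'ABC'").fetchone())
    # -> (101.5,)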

In-memory computing: facts

A recent report from Aberdeen Research draws attention not only to several interesting facts about Big Data and the difficulty of processing and analyzing the growing volume of data, but also to how in-memory computing can play a key role in speeding up the collection, sharing, and management of information at the enterprise. At least at those enterprises that can afford it.

  • The volume of business data grows by 36% every year.
  • The main problem in processing Big Data is how to get results faster (data from the December 2011 report).
  • Of the 196 Aberdeen clients discussing Big Data, 33 use in-memory computing. The reason most of the rest decline the technology is most likely its high cost.
  • Answering a query takes 42 seconds instead of the 75 minutes spent with conventional technologies.
  • With in-memory computing, 1,200 TB/h are processed versus 3.2 TB/h with conventional technologies, a 375-fold gain in throughput (a quick check of these figures follows this list).
  • In short, in-memory computing makes processing and analysis of information fast, which is good for users and for IT organizations dealing with ever-growing volumes of information used in business decision-making.
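
The quick check referenced above, just to make the ratios explicit (figures taken from the report as quoted):

    in_memory_tb_per_hour = 1200
    conventional_tb_per_hour = 3.2
    print(in_memory_tb_per_hour / conventional_tb_per_hour)   # 375.0 -> the "375 times" figure

    # The query-time improvement (42 s vs. 75 min) is a separate, smaller ratio:
    print(75 * 60 / 42)                                        # roughly 107x per query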

Data sources depending on company size
TechTarget, December 2011

According to TechTarget, in-memory DBMS are most often used by mid-size companies (23%), compared with small companies (18%) and large companies (15%).

Challenges of applying in-memory computing

But in-memory computing, like any technology, has its own peculiarities, problems, and pitfalls. First, it is expensive. It requires powerful servers, multi-core processors, and vast amounts of RAM, along with the corresponding software and analytical applications. High-speed processing needs all of these components, because terabytes of data are kept with near-zero access latency directly in the servers' RAM rather than somewhere on disk.

Although vendors do not disclose the prices of in-memory computing applications and no prices are given in the report, the report's statistics speak for themselves: an enterprise using in-memory computing spent about 850,000 dollars on it over the last 12 months.

Another problem with in-memory computing is that it is well suited only to operations on sets of structured data, such as product items, customer information, and sales reports.

If your company has the means and understands the value of information in a modern business strategy, in-memory computing may well be the right choice for you.

Products

Development of in-memory DBMS began in 1993 at Bell Labs, where a system was prototyped as the Dali Main-Memory Storage Manager. This research laid the foundation for the first commercial in-memory DBMS, Datablitz.

In the years that followed, in-memory DBMS attracted the attention of the largest players in the database market. TimesTen, a startup founded by Marie-Anne Neimat in 1996 as a spin-off from Hewlett-Packard, was acquired by Oracle in 2005, and today Oracle sells the product, among other things, as an in-memory DBMS. IBM acquired SolidDB in 2008, and Microsoft is also working in the in-memory DBMS field.

VoltDB, founded by database-market pioneer Michael Stonebraker, announced the release of its in-memory DBMS in May 2010; the company currently offers both a free and a proprietary version of the system. SAP released its in-memory DBMS, SAP HANA, in June 2011.


Russian realities

A variety of in-memory solutions are available to Russian customers. Among the most widely used are solutions from Oracle, IBM Cognos TM1, SAP HANA, Microsoft PowerPivot, QlikView, and Pentaho Business Analytics. Such platforms are a good choice when data must be analyzed in real time, bearing in mind that the data may change at any moment during the analysis. These solutions also work well when building a multidimensional data warehouse is not feasible and the accounting system's data must be analyzed without modifying that system. The systems offer different methods of horizontal scaling, both through the platform's own facilities and through additional software.

In particular, the virtual-cube functionality of Pentaho Business Analytics can be scaled using the industrial-grade JBoss Data Grid solution, which is designed for building distributed in-memory data stores.

With this approach it is possible to build in-memory cubes of 1 TB and larger. In terms of price, these solutions are quite affordable for SMB companies: for the SMB market IBM offers the complete Cognos Express solution (of which TM1 is a part), while Pentaho has a free version and special price offers for small companies.
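
The horizontal-scaling idea behind such distributed in-memory grids can be sketched very roughly as hash-partitioning keys across several RAM-resident nodes. The toy code below is not the JBoss Data Grid API (or any vendor's API); the class and key names are invented purely to show the principle.

    class InMemoryNode:
        def __init__(self) -> None:
            self.store: dict = {}            # this node's slice of the data, held in RAM

        def put(self, key, value) -> None:
            self.store[key] = value

        def get(self, key):
            return self.store.get(key)

    class Grid:
        def __init__(self, node_count: int) -> None:
            self.nodes = [InMemoryNode() for _ in range(node_count)]

        def _node_for(self, key) -> InMemoryNode:
            return self.nodes[hash(key) % len(self.nodes)]   # route each key to one node

        def put(self, key, value) -> None:
            self._node_for(key).put(key, value)

        def get(self, key):
            return self._node_for(key).get(key)

    grid = Grid(node_count=4)                # adding nodes spreads the in-memory data wider
    grid.put("cube:sales:2011-12", {"total": 1_250_000})
    print(grid.get("cube:sales:2011-12"))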

Strictly speaking, in-memory technologies fall into two classes: data discovery solutions and in-memory solutions, or, more precisely, in-memory database management systems (DBMS). An example of a data discovery solution is QlikView. Data in this system is presented in a convenient form, and the use of in-memory technology makes work with the visual layer fast. But other tools cannot be connected to it: data from Microsoft Excel files, or the Cognos or Oracle BI systems.

An in-memory DBMS is one where the data is stored in RAM from the outset, so accessing it takes practically no time. For example, suppose a company's chief accountant needs to see the report for the year before last, broken down by day: run on a classical DBMS, this takes at least 10 minutes (with a properly tuned system). If the information is stored in RAM, the result appears instantly. An example of such a solution is SAP HANA. This system, being a DBMS, gives any BI tool in-memory access to the data: data can be loaded from Excel tables, from Cognos, Oracle BI, and other systems.
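
A minimal sketch of that reporting scenario: a per-day aggregate over a year of postings held entirely in RAM. SQLite's in-memory mode stands in for a product such as SAP HANA purely for illustration, and the table and column names are invented.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE postings (posting_date TEXT, amount REAL)")
    db.executemany(
        "INSERT INTO postings VALUES (?, ?)",
        [("2010-01-01", 120.0), ("2010-01-01", 80.0), ("2010-01-02", 45.5)],  # sample rows
    )

    # "The report for the year before last, broken down by day"
    daily = db.execute(
        "SELECT posting_date, SUM(amount) FROM postings "
        "WHERE posting_date BETWEEN '2010-01-01' AND '2010-12-31' "
        "GROUP BY posting_date ORDER BY posting_date"
    ).fetchall()
    print(daily)   # [('2010-01-01', 200.0), ('2010-01-02', 45.5)]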

The cost of such solutions depends on many factors, from the length of the implementation project to the cost of the technology itself. Some solutions are indeed expensive, but they pay for themselves quickly through gains in efficiency and speed of work. Such products are in demand in any company where getting analytical reports quickly matters. For example, if an analyst needs to build a report from 30 Excel files, assembling it by hand would take at least 3 days; with the right IT systems in place, it is enough to point to those 30 files and the system itself produces a single consolidated report ready to work with.
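
As a rough sketch of that consolidation step (outside any specific BI product), assuming pandas with an Excel engine such as openpyxl is installed, that the files share one column layout, and that the file and column names below are invented:

    import glob
    import pandas as pd

    # Read every regional file and stack them into one in-memory table.
    frames = [pd.read_excel(path) for path in sorted(glob.glob("reports/region_*.xlsx"))]
    combined = pd.concat(frames, ignore_index=True)

    # Example aggregation for the consolidated report.
    summary = combined.groupby("region", as_index=False)["revenue"].sum()
    summary.to_excel("consolidated_report.xlsx", index=False)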

Vladimir Itkin, development director of the Qlik (QlikTech) Russia partner network, told TAdviser that what distinguishes QlikView is its focus on simplicity and convenience in building reports. With this approach the implementation cycle shrinks significantly, and many of the company's partners can work in an extreme-programming mode: an iterative approach in which one cycle usually lasts no more than a week. The business user thus sees results from the first days of the project and takes part in building the solution.

"Through 5-6 such iterations at the exit the BI solution, and the relevant and "live" tool of the analyst turns out not just. From the last projects in such mode it is possible to call Geotek Holding and A5 pharmacy chain", - the top manager explained. According to him, about 77% of all projects, from joint creation of terms of reference before start are implemented into commercial operation less than in 3 months. A third of clients implement QlikView independently.

For example, CROC carried out a project for the pharmaceutical company Nikomed to consolidate its marketing information databases into a common information space. Previously, various front ends for working with marketing data were used for searching and analyzing information, and they were often inconvenient and unintuitive. After the solution was implemented, work with the data warehouse is handled by the QlikView analytical system, which made working with individual pieces of information fast and convenient.

At M.Video, for example, the SAP HANA system with in-memory computing technology was implemented. The storage and data analysis system the customer had before could no longer cope with the volume of information: loading more than 2.5 billion rows of data took about 3 hours. After the SAP HANA implementation, the system loads the same data in less than 30 minutes.

Another example is the Tern Group project for Surgutneftegas. The main goal of the project was to cut the time spent preparing reports, from data processing through to visualization of the results. Report preparation time was ultimately reduced a hundredfold, and users can now work with analytical queries almost online.

Free reports

In-Memory Data Management

In-Memory Analysis: Delivering Insights in the Speed of Thought
