EMC Greenplum Database Edition

Product
Base system (platform): PostgreSQL DBMS
Developers: Dell EMC
Last Release Date: 2015/10/28
Technology: BI, DBMS

2018: Integration with Luxms BI

In 2018 the Luxms BI platform was integrated with the open source massively parallel Greenplum DBMS. The connection to Greenplum is provided by a high-speed bidirectional FDW (foreign data wrapper) connector.

2015: The Greenplum Database source code is opened

On October 28, 2015 it became known that the source code of the Greenplum Database (GPDB) had been opened. The product is positioned as a full-featured open source data warehouse built on the free PostgreSQL DBMS[1].

Greenplum is a DBMS created by the company of the same name, which was acquired by EMC Corporation in 2010. In 2013 the product passed to Pivotal Software.

Pivotal announced the opening of the Greenplum Database (GPDB) code in February 2015, and now it has become a reality: the project has its own website, and the source code is published on GitHub under the free Apache License v2. Greenplum provides powerful and fast analytics on huge data sets and, according to the developers, uses "the world's most advanced cost-based query optimizer".

GPDB is based on the free PostgreSQL DBMS. Its functionality is extended with:

  • a massively parallel processing architecture (automatic parallelization of all data and queries),
  • MPP technologies for high performance at petabyte scale,
  • an innovative query optimizer (its analytical capabilities scale to large data sets without loss of performance or throughput),
  • polymorphic (row- or column-oriented) data storage and processing,
  • advanced machine learning based on the Apache MADlib library.
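
The difference between the row-oriented and column-oriented storage named in the list above can be illustrated with a small sketch. It is not Greenplum's on-disk format. The function names and the sample records are hypothetical and serve only to show why an analytical query that reads one column touches less data in a columnar layout.

```python
def to_row_store(records, column_order):
    """Row-oriented layout: one tuple per record, all columns kept together."""
    return [tuple(record[name] for name in column_order) for record in records]


def to_column_store(records):
    """Column-oriented layout: one list per column, values kept together."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_from_row_store(rows, column_index):
    """Summing one column still walks every full row."""
    return sum(row[column_index] for row in rows)


def sum_from_column_store(columns, name):
    """Summing one column reads only that column's values."""
    return sum(columns[name])


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the article.
    sample = [
        {"id": 1, "region": "north", "amount": 10},
        {"id": 2, "region": "south", "amount": 20},
        {"id": 3, "region": "north", "amount": 30},
    ]
    rows = to_row_store(sample, ["id", "region", "amount"])
    cols = to_column_store(sample)
    print(sum_from_row_store(rows, 2), sum_from_column_store(cols, "amount"))
```

Both layouts return the same answer; the choice between them is a trade-off between fast access to whole records (rows) and fast scans of a few columns (columns).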

A Greenplum cluster consists of a master server, which stores only metadata, and a set of "segment" servers, which store all user data. All servers use the same database schema.
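
The division of roles described above can be sketched as a toy model. The `Master` and `Segment` classes, the `sales` table and the use of CRC32 as the hash are all assumptions made for illustration; the real system uses its own catalog and hash function. The sketch only shows the idea that the master keeps table definitions while each row lands on exactly one segment chosen by hashing a distribution key.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical segment server: holds a slice of the user data."""
    rows: list = field(default_factory=list)


@dataclass
class Master:
    """Hypothetical master server: keeps only metadata, never user rows."""
    segments: list
    catalog: dict = field(default_factory=dict)

    def create_table(self, name, columns, distribution_key):
        # Only the table definition is recorded on the master.
        self.catalog[name] = {"columns": columns, "key": distribution_key}

    def insert(self, table, row):
        # Route the row to one segment by hashing its distribution key.
        # CRC32 is a deterministic stand-in, not the real hash function.
        key = self.catalog[table]["key"]
        index = zlib.crc32(str(row[key]).encode()) % len(self.segments)
        self.segments[index].rows.append((table, row))
        return index


def build_demo_cluster(num_segments=4, num_rows=100):
    """Create a toy cluster and load a hypothetical 'sales' table."""
    master = Master(segments=[Segment() for _ in range(num_segments)])
    master.create_table("sales", ["id", "amount"], distribution_key="id")
    for i in range(num_rows):
        master.insert("sales", {"id": i, "amount": i * 10})
    return master


if __name__ == "__main__":
    cluster = build_demo_cluster()
    print([len(segment.rows) for segment in cluster.segments])
```

Because the segment is derived from the key alone, rows with the same key value always land on the same segment, which is what lets each segment work on its slice independently.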

2011

EMC Greenplum Community Edition

EMC Greenplum Community Edition is a free version of the EMC Greenplum Database massively parallel processing (MPP) DBMS, bundled with free analytical algorithms and data mining tools. The product was announced at the 2011 O'Reilly Strata Conference (February 1-3, 2011) in Santa Clara, California, where Scott Yara, vice president of the EMC Data Computing Products Division, spoke. The free versions can already be downloaded at http://community.greenplum.com.

Building on the success of earlier Greenplum developments for big data, such as the EMC Greenplum Data Computing Appliance, the new EMC Greenplum Community Edition removes the cost barriers that keep powerful big data tools out of the hands of many developers, researchers and other professionals who work with data. This free toolkit lets the community not only understand data better, gain deeper insight into it and achieve better visualization, but also contribute to the development of next-generation tools and solutions. Using the Community Edition software stack, developers can build complex applications for collecting, analyzing and using big data at a new level, with best-in-class big data tools, including Greenplum Database and its analytical processing capabilities.

The free version of EMC Greenplum Community Edition includes:

  • 1) Greenplum Database CE – an industry-leading DBMS with massively parallel processing (MPP) for large-scale analytics and next-generation data warehouses;
  • 2) MADlib – an open source library of analytical algorithms that implements parallel mathematical, statistical and machine learning methods for structured and unstructured data;
  • 3) Alpine Miner – a third-party analytical tool with an intuitive visual interface for data mining that provides fast "modeling to scoring", takes in-database analytics to a new level and is purpose-built for big data applications.

For the community

This initial version of EMC Greenplum Community Edition is designed for both new users and experienced Greenplum customers. Users who are new to the product get access to a complete, purpose-built business intelligence environment in which they can browse, modify and improve the bundled demonstration data files, which makes it possible to experiment with the big data analytical tools in Greenplum DBMS. Existing users can download an updated version of Greenplum Database CE and the analytics tools for integration with their development and research environments.

Community Edition can be downloaded either as a preconfigured VMware virtual machine for use on laptop or desktop computers, or as a set of packages for development on users' own machines. All users can participate free of charge in the new Greenplum Community Forums to get support, collaborate with colleagues, publish ideas and test improvements developed independently by other users.

Availability

Since February 1, 2011 EMC Greenplum Community Edition can be downloaded free of charge from http://community.greenplum.com. Regular Community Edition updates will also be available online. Community Edition is intended only for experimentation, development and research. Users of the current Single-Node Edition can deploy the new Community Edition in a single-node environment. Before the software is used for internal data processing or for any commercial or production purposes, commercial Greenplum licenses must be purchased.

Modular Data Computing Appliance

The EMC Greenplum division has created the Modular Data Computing Appliance hardware and software system (announced in September 2011), which makes it possible to work with large volumes of structured and unstructured data at the same time, using both the relational processing methods implemented in the parallel Greenplum DBMS and the functions of the open source Apache Hadoop platform. The new Modular DCA devices will include high-performance modules running the SAS Institute In-Memory Analytics package, which processes data in parallel in RAM. The SAS software allows both structured and unstructured data to be placed across several cluster nodes at the same time. The company considers parallel processing the main advantage of the Greenplum systems. The modules are currently being tested and should go on sale by the end of the year. EMC also presented the Greenplum Analytics Workbench, a test cluster of more than 1000 nodes intended for integration testing of Apache Hadoop software.

The EMC Greenplum Database DBMS uses a parallel architecture that splits the complete data set into separate segments that can be processed at the same time (shared-nothing massively parallel processing, MPP). This architecture was designed from the start for business intelligence and analytical data processing on commodity hardware. Data segments are automatically distributed among several segment servers, each of which owns and manages a separate part of the overall data set. This shared-nothing architecture means that all communication goes through a network interconnect, so there are no problems with shared disk access or contention. More detailed information on Greenplum Database can be found at www.greenplum.com/products/greenplum-database.
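
The shared-nothing idea described above can be sketched as a toy parallel aggregation. This is only an illustration under stated assumptions: the function names are hypothetical, threads stand in for separate segment servers, and the returned partial results stand in for traffic over the interconnect. Each worker sees only its own slice of the data, and only small partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(values, num_segments):
    """Distribute values round-robin so each segment owns its own slice."""
    segments = [[] for _ in range(num_segments)]
    for i, value in enumerate(values):
        segments[i % num_segments].append(value)
    return segments


def partial_aggregate(segment_values):
    """Local work on one segment: (sum, count) over its own values only."""
    return sum(segment_values), len(segment_values)


def parallel_average(values, num_segments=4):
    """Average computed from per-segment partial results.

    Threads are a stand-in for independent segment servers; no segment
    reads another segment's data.
    """
    segments = split_into_segments(values, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(pool.map(partial_aggregate, segments))
    # Only the small (sum, count) pairs are combined, not the raw data.
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count if count else None


if __name__ == "__main__":
    print(parallel_average(list(range(1, 101))))
```

The average is computed from per-segment sums and counts rather than per-segment averages, because averages of unevenly sized slices cannot simply be averaged again.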

Notes