Apache Cassandra

Product
Developers: Apache Software Foundation (ASF)
Last Release Date: 2015/11/10
Technology: DBMS

Cassandra is a system for building storage for large amounts of data. It is used by many world-famous companies, including Twitter, Instagram, and eBay. In Russia, Cassandra is still little used, most often because of a lack of information.

2020: Inclusion in the line of DBaaS services from DataLine

Ready-made Redis and Cassandra databases have been added to the DBaaS line, which also includes PostgreSQL, MySQL, MongoDB, and MS SQL. DataLine announced this on June 30, 2020.

2015

Apache Cassandra 3.0

On November 10, 2015, the Apache Software Foundation (ASF) announced the release of Apache Cassandra 3.0[1].

The ASF press release describes Apache Cassandra 3.0 as a milestone in the evolution of this DBMS.

There are three main changes to the product:

  • a storage engine and SSTable data format optimized for CQL;
  • materialized views;
  • more efficient handling of hints.
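
To illustrate the second item in the list (materialized views, in Cassandra terminology), here is a minimal Python sketch. It is not Cassandra's implementation: it only shows the idea of a second copy of a table, keyed by a different column, that is kept in sync on every write. The class `UsersTable` and its fields are hypothetical names chosen for the example.

```python
class UsersTable:
    """Toy base table keyed by user id, with a "view" keyed by email.

    Illustration only: Cassandra maintains materialized views on the
    server side. All names here are hypothetical.
    """

    def __init__(self):
        self.by_id = {}     # base table: id -> row
        self.by_email = {}  # view: email -> row

    def upsert(self, user_id, email, name):
        old = self.by_id.get(user_id)
        if old is not None:
            # Drop the old view entry so the view never holds a stale key.
            self.by_email.pop(old["email"], None)
        row = {"id": user_id, "email": email, "name": name}
        self.by_id[user_id] = row
        self.by_email[email] = row


users = UsersTable()
users.upsert(1, "ann@example.com", "Ann")
users.upsert(1, "ann@example.org", "Ann")  # the email changes; the view follows
```

The application writes only through `upsert`, and lookups by email read from `by_email` without scanning the base table, which is the convenience a materialized view is meant to provide.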


At the time of the announcement, the beta version of the Java driver for Cassandra 3.0.0 was due to be officially released the following week, and the current Python driver had reached version 3.0.0 rc1.

Apache Cassandra 2.2

On July 20, 2015, the release of the distributed DBMS Apache Cassandra 2.2 was announced[2].

Major changes in the system

  • Data can be inserted, updated, and fetched directly in JSON format, without the sstable2json and json2sstable utilities, which are now deprecated;
  • User-defined functions can be written in Java, JavaScript, and other languages that support the Java Scripting API. Because these functions run on the DBMS side and process data without copying it to the client application, they can significantly increase the performance of the whole data processing system;
  • A series of performance optimizations has been made, including support for commit log compression and message coalescing enabled by default;
  • Data transfer between nodes is more efficient thanks to a flexible compression system that allows different compression levels for different conditions, which is particularly useful when rebuilding a node after a failure or transferring data to a new data center;
  • A role-based access control system has been added to simplify administration of configurations that span multiple development teams and departments. Users can delegate authority to other users, including at the level of CREATE, ALTER, DROP, and AUTHORIZE operations, without needing superuser privileges;
  • A new utility, sstableverify, has been added to check the integrity of all tables;
  • Full support for the Microsoft Windows platform is provided.
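
As a rough illustration of the JSON support in the first item, the following Python sketch builds CQL statements of the `INSERT ... JSON` and `SELECT JSON` forms. The table name `shop.users` and the helper functions are hypothetical. Building statements by string formatting is done here only to show the syntax; a real application would normally go through a driver and bind parameters.

```python
import json


def insert_json_statement(table, row):
    """Build a CQL `INSERT ... JSON` statement for a dict-like row."""
    payload = json.dumps(row, sort_keys=True)
    # CQL string literals use single quotes; embedded ones are doubled.
    payload = payload.replace("'", "''")
    return f"INSERT INTO {table} JSON '{payload}';"


def select_json_statement(table):
    """Build a CQL `SELECT JSON` statement that returns rows as JSON."""
    return f"SELECT JSON * FROM {table};"


# Hypothetical table and row, used only to show the resulting statement.
stmt = insert_json_statement("shop.users", {"id": 1, "name": "O'Neil"})
```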

The Cassandra DBMS is built on the fully distributed Dynamo hash system, which provides almost linear scalability as data volume grows. Cassandra uses a storage model based on column families (ColumnFamily). Unlike memcachedb-like systems, which store data only as key/value pairs, this model can store hashes with multiple levels of nesting. To simplify interaction with the database, Cassandra supports CQL (Cassandra Query Language), a structured query language reminiscent of SQL but reduced in functionality. Its features include support for namespaces and column families and index creation with the "CREATE INDEX" statement. Drivers with CQL support are available for Python, Java (JDBC/DBAPI2), and JavaScript (Node.js).
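
The difference between a plain key/value store and a column-family layout can be sketched with ordinary Python dictionaries. This is only a mental model, not how Cassandra stores data on disk, and the keys, columns, and the `get_column` helper are hypothetical.

```python
# Key/value store: one opaque value per key.
kv_store = {"user:1": b'{"name": "Ann", "city": "Oslo"}'}

# Column-family style: row key -> column name -> value,
# so individual columns can be read or written on their own.
users_cf = {
    "user:1": {"name": "Ann", "city": "Oslo"},
    "user:2": {"name": "Bob"},  # rows need not share the same columns
}


def get_column(column_family, row_key, column, default=None):
    """Return a single column of a row without touching the rest."""
    return column_family.get(row_key, {}).get(column, default)
```

With the key/value layout the client must fetch and decode the whole value to read one field, whereas the nested layout lets it address a single column directly.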

The DBMS makes it possible to build fault-tolerant storage: data placed in the database is automatically replicated to several nodes of a distributed network, which may span different data centers. If a node fails, other nodes take over its functions on the fly. New nodes can be added to the cluster and Cassandra can be upgraded on the fly, without additional manual intervention or reconfiguration of other nodes.
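
The replication behavior described above can be approximated with a toy hash ring in Python. This is a sketch under simplifying assumptions: it uses MD5 to place nodes and keys on the ring, one position per node, and a fixed replication factor, whereas a real Cassandra cluster has its own partitioners and replication strategies. The `Ring` class and node names are hypothetical.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a position on the ring (MD5, for this sketch only)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.ring = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        """Walk clockwise from the key's token and collect replica nodes."""
        tokens = [t for t, _ in self.ring]
        start = bisect.bisect_right(tokens, token(key))
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def live_replicas(self, key, down):
        """Replicas that can still serve the key when some nodes are down."""
        return [n for n in self.replicas(key) if n not in down]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
healthy = ring.replicas("user:42")
degraded = ring.live_replicas("user:42", down={healthy[0]})
```

Because every key lives on several nodes, losing one replica leaves the others able to answer requests, and adding a node only changes ownership of the ring segments adjacent to it.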

2013

Apache Cassandra 2.0

The most important innovations in the NoSQL DBMS Cassandra 2.0, released by the Apache Foundation in September 2013, are "lightweight" transactions, the SQL-like CQL query language, and triggers. Jonathan Ellis, vice president of the project, emphasized that Cassandra now simplifies the transition from relational DBMSs (RDBMS), but did not specify how this simplification is achieved[3][4][5].

Originally created at Facebook, this freely available distributed DBMS was transferred to the Apache Foundation in 2008 and is now actively used by CERN, eBay, IBM, Instagram, NASA, Reddit, Twitter, and many other organizations. The largest Cassandra cluster successfully processes 300 TB of data across 400 servers. All nodes in a Cassandra network perform identical functions, which eliminates any single point of failure and simplifies scaling.

In the new version, "lightweight" transactions resolve conflicts between concurrent queries that try to modify the same data. Triggers make it possible to execute application code when information in the database changes, and thanks to the distributed architecture, the speed of the request that fired the trigger does not degrade. The database has become more compact, and the likelihood of request timeouts has been reduced. Cassandra functionality can now be used in applications not only through the programming API but also through the CQL query language, which was officially added in January but has only now reached a mature form. Although Cassandra supports the key-value model, its specific implementation resembles working with tables, so CQL retains the syntax familiar to SQL programmers: the SELECT, INSERT, CREATE TABLE, and similar commands are available.
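
The conflict-resolution semantics of these transactions can be sketched as conditional (compare-and-set) writes, similar in spirit to the CQL forms `INSERT ... IF NOT EXISTS` and `UPDATE ... IF column = value`. The Python class below is a single-process stand-in that uses a lock; it makes no attempt to reproduce the coordination between replicas that a distributed DBMS needs, and the name `ConditionalStore` is hypothetical.

```python
import threading


class ConditionalStore:
    """Toy single-process stand-in for conditional (compare-and-set) writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key, value):
        """Like `INSERT ... IF NOT EXISTS`: only the first writer wins."""
        with self._lock:
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

    def update_if(self, key, expected, new_value):
        """Like `UPDATE ... IF col = expected`: apply only if unchanged."""
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new_value
            return True

    def get(self, key):
        with self._lock:
            return self._rows.get(key)


store = ConditionalStore()
first = store.insert_if_not_exists("login:ann", "owner-1")
second = store.insert_if_not_exists("login:ann", "owner-2")  # rejected
```

Each call reports whether the write was applied, so two clients racing to claim the same key can tell which of them succeeded instead of silently overwriting each other.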

Notes