
SAS In-Memory Statistics for Hadoop

Product
Developer: SAS Institute Inc.
System premiere date: 2014/04/24
Technology: BI

SAS In-Memory Statistics for Hadoop is a system for analyzing big data with in-memory technology. It offers a broad range of analytical algorithms for exploration and modeling in a distributed Hadoop environment.

On September 17, 2014, it became known that SAS had released a new product for big data analysis using in-memory technology: SAS In-Memory Statistics for Hadoop.

The solution follows an interactive programming model. It lets several users at once jointly explore and analyze data, build and compare models, and work quickly with large volumes of information on top of Hadoop.

For companies looking for ways to use Hadoop, it is important to be able to apply a wide variety of analysis methods, including advanced analytics, to the huge amounts of data that Hadoop is expected to handle. The new product is suited to such tasks.

Users of SAS In-Memory Statistics for Hadoop get access to all the main methods of statistical analysis and machine learning in interactive programming mode, including:

  • linear and logistic regressions,
  • generalized linear models,
  • decision trees and random forests,
  • time series forecasting,
  • text analysis,
  • clustering, etc.

The product covers both supporting and core tasks. Users can:

  • prepare data for analysis,
  • select significant predictors,
  • compare models,
  • generate code for applying (scoring) models.

The product makes it possible to build recommender systems using a large set of development methods. Such systems are in demand for a wide class of business problems, including targeted marketing.

Hadoop improves system reliability by running on a server cluster, which keeps data safe while reducing hardware costs. It also offers a high degree of scalability and places no strict requirements on data format or preprocessing.