Base system (platform): | Oracle Cloud |
Developers: | Oracle |
Release date: | 2020/02/20 |
Technology: | Big Data, Data Mining, MDM (Master Data Management), SaaS (Software as a Service) |
2020: Availability of the Oracle Cloud Data Science Platform
On February 20, 2020, Oracle announced the availability of the Oracle Cloud Data Science Platform, which comprises seven services with Oracle Cloud Infrastructure Data Science at its core. The services are intended to help enterprises make their data science projects more successful by supporting the collaborative building, training, management and deployment of machine learning models. Unlike other data science products, which focus on individual data scientists, Oracle Cloud Infrastructure Data Science helps improve the effectiveness of whole data science teams. To that end it offers capabilities such as shared projects, model catalogs, team security policies, reproducibility and auditability. Oracle Cloud Infrastructure Data Science automatically selects the most suitable training data sets through AutoML algorithm selection and tuning, model evaluation and model explanation.
According to Oracle, organizations today realize only a small part of the enormous transformative potential of data, because data scientists lack easy access to the data they need and to tools for building and deploying effective machine learning models. As a result, models take too long to develop, do not always meet enterprise requirements for accuracy and reliability, and very often never make it into production.
"Effective machine learning models are the foundation of successful data science projects, but the volume and variety of data that enterprises face can stall these initiatives before they even get started. With Oracle Cloud Infrastructure Data Science we are improving the productivity of individual data scientists by automating their entire workflow, and we are adding strong support for team collaboration. This ensures that data science projects deliver real value to the business", said Greg Pavlik, senior vice president of product development for data processing and AI at Oracle.
According to Oracle, the Oracle Cloud Infrastructure Data Science service includes automation that saves time and reduces errors, through the following features:
- AutoML: automated algorithm selection and tuning automates the process of running tests against multiple algorithms and hyperparameter configurations. The system checks the results for accuracy and confirms that the optimal model and configuration are selected for use. This saves data scientists time and allows each of them to achieve the same results as the most experienced practitioners.
- Automated predictive feature selection simplifies feature creation and selection by automatically identifying the key predictive features in large data sets.
- Model evaluation generates a comprehensive set of evaluation metrics and corresponding visualizations to measure model performance on new data. It makes it possible to rank models over time so that the production version behaves optimally. Model evaluation goes beyond raw performance: to fully account for the different impact of type I and type II errors (false positives and false negatives), it takes the expected baseline behavior into account and uses a cost model.
- Model explanation: the Oracle Cloud Infrastructure Data Science service automatically explains the relative weight and importance of the factors that contribute to a prediction. According to Oracle, it offers the first commercial implementation of model-agnostic explanation. For example, with a fraud detection model, a data scientist can explain which factors are the main drivers of fraud. This helps the company change its processes or implement safeguards.
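The automated selection and cost-based evaluation described in the list above can be illustrated with a minimal Python sketch. It is not Oracle's implementation or API: the function names, the sample scores and labels, and the cost values are all hypothetical assumptions chosen for illustration. The sketch tries several candidate configurations (here, decision thresholds), weights false positives and false negatives by an assumed cost, and compares the winner with a naive baseline that always predicts "negative".

```python
# Illustrative sketch only: NOT Oracle's implementation or API.
# All names, sample data and cost values below are hypothetical.

COST_FALSE_POSITIVE = 1.0   # assumed cost of flagging a legitimate case
COST_FALSE_NEGATIVE = 5.0   # assumed cost of missing a real positive case

# Hypothetical model scores and true labels (1 = positive, 0 = negative).
SCORES = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.55]
LABELS = [0, 0, 1, 1, 1, 0, 1, 0]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels and boolean predictions."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def total_cost(y_true, y_pred,
               cost_fp=COST_FALSE_POSITIVE, cost_fn=COST_FALSE_NEGATIVE):
    """Weight false positives and false negatives by their assumed costs."""
    _, fp, _, fn = confusion_counts(y_true, y_pred)
    return fp * cost_fp + fn * cost_fn


def select_best_threshold(scores, y_true, candidates):
    """Try each candidate threshold and keep the one with the lowest cost.

    Also returns the cost of a naive baseline that always predicts
    "negative", so the chosen configuration can be compared against it.
    """
    baseline_cost = total_cost(y_true, [False] * len(y_true))
    results = []
    for threshold in candidates:
        predictions = [score >= threshold for score in scores]
        results.append((total_cost(y_true, predictions), threshold))
    best_cost, best_threshold = min(results)
    return best_threshold, best_cost, baseline_cost


if __name__ == "__main__":
    best, cost, baseline = select_best_threshold(SCORES, LABELS, [0.3, 0.5, 0.75])
    print(f"best threshold={best}, cost={cost}, baseline cost={baseline}")
```

With these made-up numbers the lowest threshold is selected, because a missed positive is assumed to cost five times as much as a false alarm. Changing the assumed costs can change which configuration wins, which is the point of evaluating against a cost model and not against accuracy alone.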
Getting effective machine learning models into production takes more than dedicated individuals: it requires data science teams to work together. According to Oracle, the Oracle Cloud Infrastructure Data Science service provides extensive team collaboration capabilities, including:
- Shared projects help users organize their work, apply version control and reliably share results, including data and notebook sessions.
- Model catalogs allow team members to reliably share already-built models and the artifacts needed to modify and deploy them.
- Team-based security policies let users control access to models, code and data, and are fully integrated with Oracle Cloud Infrastructure Identity and Access Management.
- Reproducibility and auditability functionality allows the enterprise to keep track of all relevant assets. All models can be reproduced and audited even if team members leave.
With Oracle Cloud Infrastructure Data Science, organizations can accelerate successful model deployment, obtain enterprise-grade results and performance for predictive analytics, and deliver positive business outcomes, Oracle believes.
The Cloud Data Science Platform offers seven services. Together they provide a comprehensive, integrated experience that improves and accelerates results in data science projects:
- Oracle Cloud Infrastructure Data Science: enables users to build, train and manage new machine learning models on Oracle Cloud using Python and other open-source tools and libraries, including TensorFlow, Keras and Jupyter.
- Machine learning capabilities in Oracle Autonomous Database: machine learning algorithms are tightly integrated into Oracle Autonomous Database, with support for Python and automated machine learning. The upcoming integration with the Oracle Cloud Infrastructure Data Science service will allow developers to build models using both open-source code and scalable in-database algorithms. Applying algorithms to the data inside Oracle Database speeds up time to results by reducing setup time and the need to move data.
- Oracle Cloud Infrastructure Data Catalog: helps users discover, find, organize, enrich and trace data assets on Oracle Cloud. It has a built-in business glossary that makes it easy to select and find the right, trusted data.
- Oracle Big Data Service: offers a full Cloudera Hadoop implementation with significantly simpler management than other Hadoop offerings. For example, a high-availability cluster can be created, or security enabled, with a single click. Oracle Big Data Service also includes machine learning for Spark, which allows organizations to run Spark machine learning algorithms in memory with one product and minimal data movement.
- Oracle Cloud SQL: enables SQL queries on data in HDFS, Hive, Kafka, NoSQL and object storage. Cloud SQL lets any user, application or analytics tool that can interact with Oracle databases work transparently with data in other data stores, taking advantage of push-down processing and horizontal scale-out to minimize data movement.
- Oracle Cloud Infrastructure Data Flow: a fully managed Big Data service that lets users run Apache Spark applications without having to deploy or manage any infrastructure. It enables enterprises to deliver Big Data and AI applications faster. Unlike competing Hadoop and Spark services, Oracle Cloud Infrastructure Data Flow offers a single window for tracking all Spark jobs, which makes it possible to identify resource-intensive jobs and to diagnose and fix problems.
- Oracle Cloud Infrastructure Virtual Machines for Data Science: preconfigured GPU-based environments with common IDEs, notebooks and frameworks that, according to Oracle, can be configured and running in under 15 minutes for 30 dollars a day.