Developers: ITSumma (Summa iTi)
Technology: Big Data
Main article: Big Data
ITS Data Processing Platform (ITS DPP) is a big data solution that can be deployed as a standalone project or used as a subsystem within automation, intelligent analytics, and monitoring projects.
2023
Inclusion in the register of domestic software
On October 17, 2023, the ITS DPP platform was included in the Unified Register of Russian Computer Programs and Databases under No. 19542. ITSumma (Summa iTi) announced this on October 31, 2023.
ITS DPP (ITS Data Processing Platform) is a platform for big data analysis, storage, and processing built on open source software. The solution stack includes Apache Kafka, Apache Spark, Apache Airflow, Apache Hadoop, Greenplum, Apache Superset, Redash, and Prometheus.
ITS DPP is useful when you need to:
- Build a system for storing, processing and analyzing data from scratch.
- Quickly deploy infrastructure for data storage and analysis.
- Create data marts, organize data processing workflows, reorganize data storage.
- Optimize structure, reduce costs, and avoid resource losses.
Using the platform, data engineers will be able to:
- Create a Data Lake or Data Warehouse to store structured and unstructured data.
- Collect data from heterogeneous sources into a single repository.
- Configure ETL/ELT transformations.
- Organize a data quality check.
- Configure streaming and batch processing.
- Organize version control and delivery of code for data processing jobs.
- Configure dashboards with different levels of access for different departments.
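One of the tasks listed above, the data quality check, can be illustrated with a minimal sketch. It uses plain Python rather than any ITS DPP tooling, and the function `check_quality`, the `QualityReport` class, and the field names are hypothetical, chosen only for illustration. The sketch validates a batch of records against two kinds of rules: required fields and inclusive numeric ranges.

```python
from dataclasses import dataclass, field


@dataclass
class QualityReport:
    """Result of validating one batch of records."""
    total: int = 0
    passed: int = 0
    errors: list = field(default_factory=list)  # (row index, message) pairs


def check_quality(records, required_fields, numeric_ranges):
    """Validate records against presence and range rules.

    required_fields: field names that must be present and non-empty.
    numeric_ranges: {field: (low, high)} inclusive bounds for numeric fields.
    """
    report = QualityReport()
    for index, record in enumerate(records):
        report.total += 1
        problems = []
        for name in required_fields:
            if record.get(name) in (None, ""):
                problems.append(f"missing {name}")
        for name, (low, high) in numeric_ranges.items():
            value = record.get(name)
            if value is None:
                continue  # absence is handled by required_fields
            if not isinstance(value, (int, float)) or not low <= value <= high:
                problems.append(f"{name} out of range: {value!r}")
        if problems:
            report.errors.append((index, "; ".join(problems)))
        else:
            report.passed += 1
    return report


# Hypothetical sample data: readings collected from different sources.
sample = [
    {"sensor_id": "a1", "temperature": 21.5},
    {"sensor_id": "", "temperature": 19.0},
    {"sensor_id": "b7", "temperature": 900.0},
]
report = check_quality(sample, ["sensor_id"], {"temperature": (-50.0, 150.0)})
print(report.passed, "of", report.total, "records passed")
for row, message in report.errors:
    print(f"row {row}: {message}")
```

In a real deployment, a check like this would typically run as one step of a scheduled pipeline, with rejected rows routed to a separate store for review; how ITS DPP wires this up is not described in the source.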
The solution is built on a modular basis. Each module is responsible for an individual task such as data collection, processing, storage, or transformation. Fully configured, ITS DPP can process data in batch or streaming mode, store raw, structured, and unstructured data of various sizes, and generate data marts. The platform ships with a module for managing and monitoring individual components.
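The modular idea can be sketched in a few lines: independent steps are chained into one pipeline, and the same pipeline is then run either over a complete data set (batch) or record by record as data arrives (stream). This is a conceptual illustration in plain Python, not the platform's implementation; `compose`, `run_batch`, `run_stream`, and the sample steps are hypothetical names.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Step = Callable[[Record], Record]


def compose(*steps: Step) -> Step:
    """Chain independent modules (steps) into one processing function."""
    def run(record: Record) -> Record:
        for step in steps:
            record = step(record)
        return record
    return run


def run_batch(records: Iterable[Record], pipeline: Step) -> list:
    """Batch mode: process a complete, finite data set and return all results."""
    return [pipeline(r) for r in records]


def run_stream(records: Iterable[Record], pipeline: Step) -> Iterator[Record]:
    """Stream mode: yield each result as soon as its input record arrives."""
    for r in records:
        yield pipeline(r)


# Hypothetical modules: each one handles a single task.
def normalize(record: Record) -> Record:
    """Clean up a text field."""
    return {**record, "city": record["city"].strip().title()}


def enrich(record: Record) -> Record:
    """Add a derived field computed from existing ones."""
    return {**record, "total": round(record["price"] * record["quantity"], 2)}


pipeline = compose(normalize, enrich)
raw = [
    {"city": " moscow ", "price": 10.0, "quantity": 3},
    {"city": "kazan", "price": 2.5, "quantity": 4},
]

batch_result = run_batch(raw, pipeline)
stream_result = list(run_stream(iter(raw), pipeline))
print(batch_result == stream_result)
```

Keeping each step a pure function of one record is what lets the same logic serve both modes; in the actual stack, batch and stream execution would be handled by components such as Apache Spark and Apache Kafka.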
ITS DPP is a completely domestic development and replaces many foreign solutions.
Tasks to be solved
As of February 2023, the ITS DPP platform solves the following tasks:
- Building a data storage, processing, and analysis system from scratch; the flexibility of the approach makes it possible to assemble a solution of any complexity and capacity.
- Generating reliable real-time visualizations of large volumes of data.
- Automating reporting to minimize the human factor.
- Minimizing production downtime, increasing annual energy production and reducing losses from inefficient equipment use.
- Optimizing the production chain, reducing demand forecasting errors, and reducing losses from storing excess working capital and from production delays.
- Migrating from a proprietary software stack to open-source components for which support engineers are easy to find on the market.
- Launching MVPs and pilots that require a data processing and analysis system.
- Migrating away from foreign cloud big data services.