Base system (platform): Rostelecom Data Management Platform
Developers: Rostelecom
Technology: MDM - Master Data Management
RT.DataLoader is an import-independent, easily replicable solution for uploading large volumes of data from source systems to a data warehouse. Through a visual interface, users can connect new data sources, add new tables, and adjust the attribute composition of connected tables, which minimizes the involvement of ETL developers in the process.
2022: Functions, Capabilities and Features
Key features and capabilities of RT.DataLoader as of July 2022:
- 1. Full or partial upload from source tables.
- 2. Delivery of data files to the Hadoop Distributed File System (HDFS).
- 3. Support for full and incremental upload of data from sources (tables, views, SQL queries) to files on the local data storage server.
- 4. Orchestrator for managing upload flows, based on Apache Airflow. Upload processes start on a schedule or on an event condition. The schedule can be configured for a specific table or for a group of source tables.
- 5. Archiving of uploaded data.
- 6. Calculation of checksums (data quality control).
- 7. Queuing of upload jobs to balance the load on the source.
- 8. Separation of the data upload and delivery processes to reduce the load on the source in case of problems on the data warehouse side.
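As an illustration of items 3 and 6, the following is a minimal sketch of an incremental upload followed by a checksum calculation. It is not RT.DataLoader code: the function names, the `orders` table, the use of an in-memory SQLite database as a stand-in source, the CSV file format, and the choice of SHA-256 are all assumptions made for the example.

```python
import csv
import hashlib
import io
import sqlite3


def extract_increment(conn, table, key_column, last_value):
    """Return column names and rows whose key_column is greater than last_value.

    Hypothetical helper: table and column names are assumed to come from
    trusted configuration, so they are interpolated into the query directly.
    """
    query = f"SELECT * FROM {table} WHERE {key_column} > ? ORDER BY {key_column}"
    cursor = conn.execute(query, (last_value,))
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


def to_csv_bytes(columns, rows):
    """Serialize the extracted rows to CSV bytes (assumed file format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def checksum(payload):
    """Compute a SHA-256 digest of the payload (algorithm is an assumption)."""
    return hashlib.sha256(payload).hexdigest()


def demo():
    # In-memory SQLite database standing in for a real source system.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (2, 20.5), (3, 7.25)],
    )
    # Pretend rows with id <= 1 were already uploaded in a previous run.
    columns, rows = extract_increment(conn, "orders", "id", 1)
    payload = to_csv_bytes(columns, rows)
    conn.close()
    return rows, payload, checksum(payload)
```

Recomputing the digest on the receiving side and comparing it with the value recorded at upload time is one simple way to detect a file that was corrupted or truncated in transit.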
Features:
- 1. Import-independent product listed in the register of domestic software.
- 2. Visual interface for configuring, managing and monitoring upload processes.
- 3. Easy replication of the solution and connection of new sources.
Scope
The product can be used to build ETL processes in data warehouses that are refreshed daily. The solution is especially effective for data warehouses with a large number of data sources.