Customers: Design and Technology Bureau for Informatization Systems - Branch of Russian Railways (PKTB TsKI)
Product: Postgres Pro Enterprise
Second product: Red OS
Project date: 2022/11
In 2022, the Design and Technology Bureau for Informatization Systems - Center for Digital Technologies Branch of Russian Railways (PKTB-TsT) began migrating one of the high-load systems in railway transport from IBM DB2 for z/OS to the Postgres Pro DBMS. Vasily Timoshchenko, a sector head at PKTB-TsT, spoke publicly about the migration experience on April 3, 2023 at an industry conference.
PKTB-TsT develops automated systems for Russian Railways, and one of the main ones is the Automated System for Operational Transportation Management (ASOUP). The history of ASOUP dates back more than 40 years. The system collects information about all mobile objects on the railway: trains, cars, and containers.
ASOUP has a line level, which supplies the information: field automated workstations, sensors, and so on. These transmit information to the road level, which consists of 16 road-level ASOUP systems (the Russian network is organizationally divided into 16 railways, or "roads"). There is also a network-level ASOUP that processes data across the entire railway network. The information is then passed on to consumer systems, and analytics and reporting are built on it.
The total volume of the databases of all road-level ASOUP systems is about 10 TB, according to the PKTB-TsT representative. Historical information is stored for 10 years. About 30 GB of information coming from the line level to the road level is processed per day.
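For a sense of scale, the 30 GB per day figure can be converted into an average ingest rate. The short Python sketch below assumes decimal units (1 GB = 10^9 bytes) and a perfectly even load across the day. The talk specified neither, and real traffic is likely to be burstier, so this is a rough lower bound on what the system must sustain.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_bytes_per_sec(gb_per_day: float) -> float:
    """Average ingest rate, assuming decimal gigabytes and a uniform load."""
    return gb_per_day * 10**9 / SECONDS_PER_DAY


rate = average_rate_bytes_per_sec(30)
print(f"{rate / 10**3:.0f} kB/s")  # prints "347 kB/s"
```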
Vasily Timoshchenko clarified that in 2022 the company was working on import substitution for the network level of ASOUP, where information is aggregated across the entire railway network and passed on to other consumer systems.
At the end of 2022, the system was running on two fault-tolerant IBM mainframes in a Parallel Sysplex. The PKTB-TsT sector head says the company understood that it could not match a mainframe-based system "head-on". It was therefore decided to separate the OLTP and OLAP loads: messages are processed in one database, and analytical queries are executed in another.
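The idea of separating the two loads can be sketched as a small routing rule. This is only an illustration in Python, not the project's code: the request kinds, the connection strings, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Workload(Enum):
    OLTP = "oltp"  # message processing
    OLAP = "olap"  # analytical queries


@dataclass(frozen=True)
class Request:
    kind: str      # hypothetical labels, e.g. "message" or "report"
    payload: str


# Hypothetical connection strings, one per database.
DATABASES = {
    Workload.OLTP: "postgresql://oltp.example/asoup",
    Workload.OLAP: "postgresql://olap.example/asoup_analytics",
}

# Request kinds treated as analytical in this sketch (an assumption).
ANALYTICAL_KINDS = {"report", "kpi"}


def classify(request: Request) -> Workload:
    """Send analytical requests to OLAP and everything else to OLTP."""
    if request.kind in ANALYTICAL_KINDS:
        return Workload.OLAP
    return Workload.OLTP


def target_dsn(request: Request) -> str:
    """Pick the connection string of the database that should serve the request."""
    return DATABASES[classify(request)]
```

With this rule, an incoming train message goes to the OLTP database, while a KPI report is served by the OLAP database, so long-running analytics do not compete with message processing.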
It was also necessary to provide fault tolerance, because the system is critical.
The move away from IBM did not begin yesterday, said Vasily Timoshchenko. Before this, many attempts had been made to rewrite parts of the systems for other platforms. At one point, they tried to switch to SAP HANA, but that would not have been import substitution. They also tried to write something on VoltDB and tested Tarantool. For one reason or another, none of these took off.
By that time, the company had already used Postgres in several small projects. Postgres Pro Enterprise was a fit because it has vendor support and is listed in the register of Russian software. The multimaster extension was also attractive: "this means a cluster out of the box that does not require additional software."
The multimaster extension turns Postgres Pro Enterprise into a synchronous shared-nothing cluster that provides OLTP scalability for read transactions.
The problem was that the system runs on IBM System z, IBM's mainframe, under the z/OS operating system. It is a very specific platform, and almost all of the code is written in C and assembly. There was nothing left to do but rewrite everything, says Vasily Timoshchenko.
The result was as follows. The database was split into OLAP and OLTP parts. The OLAP part is designed to produce operational analytics on the state of transportation objects, plan their management, and calculate key performance indicators used to assess the efficiency of the railway as a whole. For OLTP, a multimaster cluster was built with two nodes and a referee, a Postgres Pro Enterprise extension.
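Why a two-node cluster benefits from a third, tie-breaking participant (a referee, in multimaster terms) can be illustrated with simple majority arithmetic. The Python sketch below is a simplified voting model, not Postgres Pro's actual referee protocol. The function name and the vote counts are illustrative assumptions.

```python
def has_majority(reachable_votes: int, total_votes: int) -> bool:
    """True when the reachable participants form a strict majority."""
    return 2 * reachable_votes > total_votes


# Two data nodes, no tie-breaker: a surviving node sees only its own vote.
print(has_majority(reachable_votes=1, total_votes=2))  # prints "False"

# Two data nodes plus a tie-breaker: survivor + tie-breaker = 2 of 3 votes.
print(has_majority(reachable_votes=2, total_votes=3))  # prints "True"
```

In this model, losing one of two nodes leaves the survivor without a majority, while the extra vote lets exactly one side of a split keep working.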
The application tier is written in Java: Spring Boot applications are packaged in Docker containers and run under Kubernetes on virtual machines. Red OS "Murom" is used as the operating system.
The system's performance had to be optimized quite a lot, Vasily Timoshchenko admits. For example, the team found that the processor was spending a lot of time waiting for I/O. The default OS settings were not suitable: the OS had to be told that it was working with SSDs rather than HDDs. This immediately gave a 2-3x performance increase.
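The talk did not say which settings were changed. On Linux, one common place to look is the block-device queue directory `/sys/block/<device>/queue/`, where `rotational` reports whether the kernel treats the device as a spinning disk (`1`) or not (`0`), and `scheduler` lists the available I/O schedulers with the active one in square brackets. The Python sketch below only reads those values. That these were the relevant settings in this project is an assumption, and the helper names are hypothetical.

```python
from pathlib import Path


def active_scheduler(scheduler_line: str) -> str:
    """Return the bracketed (active) entry, e.g. "noop deadline [cfq]" -> "cfq"."""
    for token in scheduler_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Fallback when no entry is bracketed: return the line as-is.
    return scheduler_line.strip()


def queue_settings(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Read how the kernel currently classifies and schedules a block device."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        # "1" means the kernel treats the device as rotational (HDD-like).
        "rotational": (queue / "rotational").read_text().strip() == "1",
        "scheduler": active_scheduler((queue / "scheduler").read_text()),
    }
```

On a Linux host, `queue_settings("sda")` (with `sda` standing in for the real device name) shows whether a disk is being treated as rotational. Inside virtual machines, SSD-backed virtual disks are sometimes reported as rotational, which is one way such a misconfiguration can arise.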
Besides the new architecture, which the developers had to get used to, new restrictions arose. For example, ASOUP has very wide tables, and these had to be trimmed down: the database was not migrated one to one.
An organizational conflict also arose. The organization that operates the system has a strict division between system administrators and application administrators. As a result, the system administrator does not want to grant superuser rights to the application database administrator, who cannot configure replication without them. Version 16 of Postgres Pro, however, introduces a feature that lets a non-superuser do this.
Plans for the project include further performance optimization and expanding the functionality of the archive database. Most likely, there will also be databases for specific tasks: more isolated ones that will still need operational information from the OLTP database.