2023/05/12 14:30:33

Document Digitization (Russian Market)


2023: Government documents in Russia to go to digital archives

On May 12, 2023, it became known that the Ministry of Economic Development of the Russian Federation would finalize the bill on digital archives, extending its scope to government documents.

According to the newspaper Kommersant, the amendments concern the law "On Information" as it relates to the use and storage of electronic documents. The amendments were submitted to the State Duma in 2021 and adopted in the first reading in April 2022. The bill in its original form prohibits the creation of electronic duplicates of documents of the Archive Fund. In particular, the Central Bank of the Russian Federation insists that the bill apply only to financial market participants, since they are subject to high cybersecurity requirements.

The Ministry of Economy should finalize the bill on digital archives, extending it to government documents

However, the White House (the Russian government) proposes to expand the initiative by extending it to government documents: in this case, electronic duplicates of such papers would be included in the digital archive. On April 25, 2023, Deputy Prime Minister Dmitry Grigorenko asked the Ministry of Economic Development to take the position of the government apparatus into account when finalizing the bill on electronic archives.

The head of the National Council of the Financial Market, Andrei Emelin, said that work on the project had been slowed by questions of ensuring security when the paper originals of documents converted into electronic form are destroyed. In particular, for financial organizations, including banks, the bill makes sense only if they are allowed to digitize documents with destruction of the paper original, without the mandatory participation of the second party, and with the involvement of specialized organizations.

"The market continues to hope that the project will provide reasonable solutions to these three basic issues. As long as they are missing, the bill brings no savings," Kommersant quoted Emelin as saying.[1]

2022

The Russian market for digitization of documents is estimated at 6.4 billion rubles

The volume of the Russian market for document digitization services (converting them into electronic form, selling scanning equipment, licensing standard text recognition programs (OCR) and developing individual solutions) in 2022 reached 6.4 billion rubles. This assessment was announced at the end of January 2023 by the online as-built documentation service BuildDocs.

As Kommersant writes, citing Yegor Girenko, manager of the Transformation Strategy practice at Reksoft Consulting, in 2017-2022 foreign manufacturers accounted for about 20% of the Russian market for digitization of documents, and by the beginning of 2023 their share had decreased by no more than 5 percentage points. At the same time, he noted that the market is quite fragmented: there are about 50 vendors with a share of more than 1%.

The volume of the Russian market for document digitization services in 2022 reached 6.4 billion rubles

About 5 billion rubles were spent on transferring historical data presented in analog form to information systems in 2022, Ilya Verigin, director of work with state customers of Biorg, told the newspaper.

Document processing technologies will be adopted more actively by banks, insurance companies and industry, said Smart Engines CEO Vladimir Arlazarov. The head of the National Council of the Financial Market, Andrei Emelin, noted that as of the end of January 2023 a bill on the digitization of most paper documents was being worked on in the State Duma.

Many Russian IT companies proceed from the fact that "during instability, money remains with the state," but this creates the risk of an increase in dependence on government orders, said Denis Kasimov, general director of Factory5 (develops industrial software). This, in his opinion, creates difficulties when working on projects for customers outside Moscow. Contracts often prescribe that the cost of a man-hour cannot exceed the average for the region. Therefore, in many regions, given the salaries of IT specialists, it is difficult even to coordinate work, he added.[2]

At the initiative of Putin, a project to digitize archives for 200 billion rubles may arise in Russia

The President of the Russian Federation instructed to work out the issue of digitization of state archives

The draft plan for transferring state archives to digital format should be prepared no later than July 31, 2022. Russian President Vladimir Putin gave this order on February 11. The head of the Ministry of Digital Development Maksut Shadayev, the head of the Federal Archival Agency Andrei Artizov and the President of the Russian Academy of Sciences Alexander Sergeyev have been appointed responsible.[1]

As the Ministry of Digital Development told TAdviser, under the President's instructions the ministry, together with Rosarchiv and the Federal State Budgetary Institution of the Russian Academy of Sciences, will consider the possibility of digitizing documents of the Archive Fund and other archival documents. Artificial intelligence technologies are also expected to be used in digitization. The issue is under consideration by the departments; proposals will be submitted by July 31, 2022.

Vladimir Putin instructed that a project for the transfer of state archives to digital format be prepared. Photo: Культура.рф

Task Scale Estimates

Pavel Küng, director of the All-Russian Research Institute of Records Management and Archival Science (VNIIDAD), which is subordinate to Rosarchiv, told TAdviser that the order does not clearly spell out the scope of the project. He also pointed out that this is not the first attempt to digitize state archives.

"Nothing about volumes follows from the order itself. The task of digitizing documents of the Archive Fund of the Russian Federation has come up before. The State Program of the Russian Federation "Information Society (2011-2020)" already included the measure "conversion to digital format of materials stored in the archives of the Russian Federation" and the indicator "Share of archival funds, including funds of audio and video archives, converted to electronic form: at least 50% by 2020." The task was not carried out, I can assume, because of the need for significant financial outlays," he said.

According to Pavel Küng, the real scope of the project will depend on how much funding is allocated for it. He estimates the total volume of state archives at 60 billion pages of archival files, or even more, since some documents need to be scanned on both sides.

The total volume of the archive fund is 520 million storage units, that is, archival files, according to an estimate made at TAdviser's request by Ilya Verigin, director of work with state customers of Biorg (one of the largest players in the document digitization market). A storage unit can be anything from a single-sheet resolution of the General Secretary of the CPSU Central Committee to a parish register 70 cm high and weighing 30 kg.

The project can cost the state 200 billion rubles

"It is important to understand that digitization consists of two broad blocks: scanning and data extraction. Estimating the cost of scanning is very difficult. A modern document can be scanned at a price of 1 ruble per sheet. But there are large-format documents and dilapidated historical funds for which it is important to preserve even the shade of the faded paper, various valuable annotations, endorsements and so on. For them, creating a master copy of the highest quality can cost 500 rubles per page. The scanner's color reproduction has to be calibrated against color scales and the lighting set up, which can take several hours. Given the volume of the Archive Fund of Russia, we are talking about at least 100 billion rubles," Ilya Verigin told TAdviser.

At the same time, the expert clarifies that scanning is not the end of the work with documents, since the information still needs to be indexed. If only the cover is indexed, which includes several attributes and numbers, the price will be 10-20 rubles per storage unit, and a reference and search database for all archives in Russia will cost 5-10 billion rubles. But even this is not the full cost of digitizing archives, since the documents also need to be recognized.
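The indexing estimate can be checked with simple arithmetic. The sketch below multiplies the 520 million storage units quoted earlier by the 10-20 ruble per-unit price; the function and variable names are illustrative and not part of any cited methodology, and a flat per-unit price is an assumption.

```python
# Figures quoted in the text; function and variable names are illustrative.
STORAGE_UNITS = 520_000_000        # archival files (Verigin's estimate)
PRICE_PER_UNIT_RUB = (10, 20)      # cover-level indexing, rubles per unit


def indexing_cost_rub(units: int, price_per_unit: int) -> int:
    """Total cost of indexing every storage unit at a flat per-unit price."""
    return units * price_per_unit


low, high = (indexing_cost_rub(STORAGE_UNITS, p) for p in PRICE_PER_UNIT_RUB)
print(f"{low / 1e9:.1f}-{high / 1e9:.1f} billion rubles")
```

This gives roughly 5.2-10.4 billion rubles, which is consistent with the 5-10 billion ruble range quoted for the reference and search database.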

Pavel Küng recalls that the documents in the archives are very diverse: from the minutes of the Politburo to private correspondence. Working with archival documents also imposes many restrictions, since they may contain private-life information, medical secrets or material related to intellectual property, or they may be documents for official use. Therefore, digitization will require control over access to information, and this will also affect the price of the project. According to his preliminary estimates, the transfer of state archives to digital format can cost 150-200 billion rubles.

"Documents do not just need to be scanned; they need to be taken out of storage, their condition assessed and, possibly, restored. Files need to be unbound and then rebound. This is a lot of additional work, plus the purchase of the necessary materials. We need fleets of various scanners, and we need to set up unified digitization centers and staff them. Moreover, the scanners on the market are mostly foreign-made. With a project this large, it would be right to develop the design and production of domestic equipment. In addition, scanning and recognizing information are different types of work. Recognition requires software and the work of tens of thousands of people who will verify the data obtained. The most modern and advanced handwriting recognition systems achieve a quality of about 70 percent. The rest of the data has to be entered manually by specialists," Pavel Küng told TAdviser.

According to him, the price of the project will also be influenced by the quality of the scan, that is, whether the master copies will be of print quality or only good enough for reading. This in turn determines how much space a document will take up in electronic storage: about 1 GB, or 10-15 MB or less. In any case, the creation of electronic archives will require more than 1 billion rubles a year for the maintenance of IT infrastructure.
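To get a feel for what the choice of scan quality means for storage, here is a rough sketch. It assumes, purely for illustration, that the quoted sizes apply per storage unit and that all 520 million storage units are stored at the same quality; the source does not say this, and decimal units are used (1 PB = 10^15 bytes).

```python
# Rough storage estimate. ASSUMPTION: the quoted per-document sizes are applied
# to each of the 520 million storage units; the source does not state this.
MB = 10**6    # decimal megabyte, in bytes
GB = 10**9    # decimal gigabyte, in bytes
PB = 10**15   # decimal petabyte, in bytes

STORAGE_UNITS = 520_000_000


def archive_size_pb(units: int, bytes_per_unit: int) -> float:
    """Total archive size in petabytes for a uniform per-unit size."""
    return units * bytes_per_unit / PB


scenarios = [
    ("reading copy, 10 MB", 10 * MB),
    ("reading copy, 15 MB", 15 * MB),
    ("master copy, 1 GB", 1 * GB),
]
for label, size in scenarios:
    print(f"{label}: {archive_size_pb(STORAGE_UNITS, size):,.1f} PB")
```

Under these assumptions, reading-quality copies come to a few petabytes, while print-quality master copies are about two orders of magnitude larger.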

Digitizing archives could take 100 years

Pavel Küng believes that it may take a year just to draw up the technical specification for digitizing the archives, since it is not yet known how many documents need to be restored before digitization, how many files need to be unbound and rebound, or where digitization centers should be located. The project itself could last 10, 20 or even 100 years, depending on the amount of equipment and the number of specialists engaged in scanning, verification and recognition.

Ilya Verigin is more optimistic in his assessments. In his opinion, developing regulations for the digitization of archives can take from 6 months to a year, and scanning all archival documents will take 10 or 20 years. The speed of work will depend heavily on the allocated funds and the amount of work. At the same time, he recalls that there is already experience in digitizing federal archives: from 2018 to 2020, the archives of the registry offices, containing documents up to 75 years old, were converted to digital form. In his opinion, market representatives should be involved in the project to speed up the process.

Approaches to solving the problem

Pavel Küng draws attention to the fact that the country has no ready-made methodology for digitization on this scale. Many issues will have to be resolved from scratch, and it will not be possible to use artificial intelligence technologies everywhere: data cannot always be recognized and indexed automatically. According to him, many documents contain handwritten information, and even printed documents may have handwritten notes written over the printed text. Even the most modern systems recognize handwriting with a quality of about 70 percent; the rest must be checked and entered by hand.

Large-scale examples of automatic recognition in Russia include the projects "Memory of the People" and "Feat of the People" and the digitization of civil status records from registry office funds. Large companies developing the digitization business operate on the market: Elar, Biorg. The IT giants Yandex and Sber have invested in AI-based technologies for recognizing handwritten text. However, the oldest player in the text recognition market, ABBYY, recently withdrew many of its tools from the register of domestic software, the expert notes.

Pavel Küng believes that the work on the project should either be diversified, divided between public and private contractors, or the state should, using the experience of the leaders, build its own system on the basis of their solutions.

According to Ilya Verigin, thousands of companies throughout Russia can cope with scanning, and the recognition and analysis of archival information can be entrusted to AI. First of all, we are talking about indexing a huge amount of monotonous information related to data on citizens, technical documentation. The level of automation in this direction reaches 80-90%.

Verigin believes that to automate digitization and recognition, the state needs a single tool, a digital platform.

"Firstly, it is more convenient to control the implementation of methodology and quality standards. Secondly, we are talking about AI for text recognition, a technology that is trained, and it is logical to end up not with a hundred separate, poorly working systems but to combine competencies on a single platform in order to get the maximum return. The question is whether such a system will be developed from scratch or on the basis of solutions already in the register of domestic software. I would propose studying the existing market experience with platforms that combine recognition by neural networks with data verification by citizens from different regions. With the help of the Ministry of Labor of Russia, the project can be given tremendous social significance if residents of the country, including socially vulnerable categories of citizens, are connected online through personal accounts to the recognition and verification of data. The money will remain in the regional economy and at the same time go directly to people," Ilya Verigin told TAdviser.

The expert is convinced that the Russian market will cope with the project. At the same time, in his opinion, a large state corporation should become the platform operator, since the project will require not only the creation of the platform itself, but also colossal computing resources, powerful communication channels and support for simultaneous authorization and the work of a large number of users.

2015-2018: ICT-Online research on the Russian document digitization market

Study period: 2015-2018 inclusive. The main emphasis is on current data from 2017-18. References to earlier periods are also made.

Research object: document digitization market - scanning and processing of text, graphic and other types of documents from paper media to save them electronically; creation of electronic archives.

The subject of the study: services for the digitization of documents, customers and performers of these services, state tenders.

Methodology

The work of organizations that digitize various kinds of documentation (business, official, historical, scientific and technical, artistic) reaches end consumers indirectly, as a ready-made set of files and database records stored in the corporate IT infrastructure, or as a set supplied by the relevant holding organization (for example, a state archive). The digitization process itself is generally hidden from end consumers, who take for granted the mechanism by which a document gets into an information system or electronic archive. Meanwhile, the digitization market is a complex structure that is currently focused mostly on competitive orders. This shapes the methodology used to study it.

The study was based on data collected over six years (2013-2018) from the official public procurement website (zakupki.gov.ru) and the main official electronic trading platforms (ETPs), filtered by the key phrases "scanning," "digitization," "retroconversion," "indexing," "electronic form," "data entry," "information resource," "electronic archive," etc. Purchases with an initial order value of more than 200 thousand rubles were taken into account, that is, those carried out under 94-FZ ("On tenders for placing orders for the supply of goods, for the performance of work, the provision of services for state needs"), 44-FZ ("On the contract system in the field of procurement of goods, works, services to meet state and municipal needs") or 223-FZ ("On the procurement of goods, works, services by certain types of legal entities"). Duplicates were eliminated, and purchases were selected that, according to the published technical specifications, involved actual digitization work. Industries, types of services and tender winners were identified.
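The selection steps described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual tooling: the `Purchase` record, its field names and the sample data are hypothetical, the key phrases are the English renderings given in the text (the real notices would be matched in Russian), and the manual review against technical specifications is not modeled.

```python
from dataclasses import dataclass

# Key phrases as listed in the text (English renderings; the list is not exhaustive).
KEY_PHRASES = (
    "scanning", "digitization", "retroconversion", "indexing",
    "electronic form", "data entry", "information resource",
    "electronic archive",
)
MIN_INITIAL_VALUE_RUB = 200_000  # only purchases above this value are counted


@dataclass(frozen=True)
class Purchase:
    """Hypothetical record of one procurement notice."""
    notice_id: str
    title: str
    initial_value_rub: int


def select_purchases(purchases):
    """Drop duplicates by notice ID, then keep notices that mention a key
    phrase and whose initial value exceeds the threshold."""
    seen = set()
    selected = []
    for p in purchases:
        if p.notice_id in seen:
            continue  # the same notice collected from another platform
        seen.add(p.notice_id)
        title = p.title.lower()
        has_phrase = any(phrase in title for phrase in KEY_PHRASES)
        if has_phrase and p.initial_value_rub > MIN_INITIAL_VALUE_RUB:
            selected.append(p)
    return selected


sample = [
    Purchase("A-1", "Scanning and indexing of archive files", 1_500_000),
    Purchase("A-1", "Scanning and indexing of archive files", 1_500_000),
    Purchase("B-2", "Creation of an electronic archive", 150_000),
    Purchase("C-3", "Supply of office furniture", 900_000),
]
print([p.notice_id for p in select_purchases(sample)])
```

In this sample only notice A-1 survives: its duplicate is dropped, B-2 is below the value threshold, and C-3 contains no key phrase.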

It should be noted that a tender for the digitization of documents is often accompanied by the purchase of physical goods or software: in these cases the entire purchase was counted when analyzing the order, since it is difficult or impossible to isolate the actual cost of the accompanying items. In addition, some purchases may have been missed in which digitization is not stated explicitly but complements the development of particular information systems. Analyzing such purchases is physically impossible.

Dynamics of the Russian document digitization market

Over the last decade, the market for digitization of documents in Russia has shown steady growth. A year-over-year increase in the volume of scanned documents and the total cost of orders was observed even in the crisis years of 2008-2009 and 2014-2015.

The positive market dynamics is due to several factors:

  • The number of management documents continues to increase, despite the desire to switch to electronic document management. At a certain stage of development, companies and institutions face the need to free up premises or simplify access to documents on their core activities, since working with paper causes regulatory deadlines for procedures and services to be missed.
  • In public administration, digitization is a necessary element in creating registries and registers, as well as a mechanism for the transition to providing public services in electronic form (the "registry model" of public services). Many departments have carried it out systematically, and continue to do so, under the programs "Electronic Russia," "Information Society" and "Digital Economy."
  • In industry, the most in-demand area is the digitization of technical documentation of various types: engineering design, design and estimate, as-built, operational, insurance fund. Whereas the industry's main task used to be preserving or reviving earlier designs, today the need for digitized data is tied to Industry 4.0: paper processes that slow things down hurt the main production, technological and construction cycles.
  • Legislation on the use of electronic documents in commercial structures is liberalized every year; for many companies, working with financial and client documents on paper entails direct costs and harms customer focus.

The exception to the positive dynamics of the market was 2016. The decline during this period (primarily in the public sector) was caused by uncertainty in project financing as the transition from the Information Society program to the Digital Economy program began. The target budgets, which had been increased in 2015, had already been spent by organizations, and the current ones were cut. Accordingly, the average initial purchase price was roughly halved.

Dynamics of the Russian digitization market by volume and number of orders

In 2017, the positive dynamics of the digitization market recovered, and 2018 was characterized by relative stability. This is largely due to the development of technologies: companies that run the relevant production cycles invest during crisis years in developing and improving information recognition algorithms and artificial intelligence. Together this reduces manual labor and lowers the cost of digitization work, making it more affordable even amid widespread cost-cutting.

Market specificity. Consumers of digitization services

The specificity of the mass document digitization market is due to the fact that expensive specialized equipment and software are used for scanning and recognizing documents. Accordingly, scanners and software for ongoing input (where there is a large incoming flow of documents) are purchased by the largest banks and government agencies: FTS, Sberbank, FIPS (Rospatent). Cultural institutions (RSL, NLR, large archives) also have their own scanning centers as part of their centers for conservation and processing of funds.

For most commercial organizations and government agencies, in-house mass digitization services are unprofitable, both in terms of the cost of buying equipment and in terms of overhead costs, especially if they are needed only for a one-off project. Ordering the relevant services from a specialized outsourcing company is both more economical and faster.

Under 44-FZ, to select a provider of digitization services, customers must hold a competition, electronic competition or electronic auction in compliance with a number of regulations and rules. The exception is especially valuable and archival funds, for which 44-FZ allows competitions with limited participation; this makes it possible to verify that the contractors are competent and able to carry out the project to a high standard and without damage to the original documents. Competitions that allow requirements to be set for the competencies of service providers are also chosen by large customers focused on long-term projects (including rolling contracts), in which digitization is usually treated as part of the development strategy of departments and companies and must be completed on time and to the specified quality.

Fundamental differences between electronic competition and electronic auction

The principal differences between the electronic competition and the electronic auction are shown in the figure. In general, in the electronic auction, all companies that have agreed to the general requirements for the indicators of goods/services in the first part of the application (consent to the provision of services in accordance with the specified conditions and in this volume, information about the goods/service) participate in the procedure for identifying the winner. If the second parts of the bids (information about the auction participant) comply with the original consent and in the absence of restrictions, the contract is concluded with the company that offered the lowest price.

In the electronic competition, in turn, the customer can set additional requirements for participants, in particular for qualification, already at the first stage. The winner is selected by a rating determined by several criteria (price, qualification, business reputation). Obviously, for many categories of customers this method of determining the winner is preferable.
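The difference between the two procedures can be illustrated with a small sketch. It is only a simplified model of the description above: the `Bid` structure, the 0-100 scoring scale, the price-score formula and the criterion weights are all hypothetical and are not taken from 44-FZ or from any actual tender documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    """Hypothetical bid; qualification and reputation are scored 0-100."""
    bidder: str
    price_rub: float
    qualification: float
    reputation: float
    compliant: bool = True  # second part of the bid matches the stated terms


def auction_winner(bids):
    """Electronic auction: the lowest price among compliant bids wins."""
    eligible = [b for b in bids if b.compliant]
    return min(eligible, key=lambda b: b.price_rub)


def competition_winner(bids, weights=(0.6, 0.3, 0.1)):
    """Electronic competition: the highest weighted rating wins.
    The weights (price, qualification, reputation) are illustrative only."""
    eligible = [b for b in bids if b.compliant]
    best_price = min(b.price_rub for b in eligible)
    w_price, w_qual, w_rep = weights

    def rating(b):
        price_score = 100 * best_price / b.price_rub  # cheapest bid scores 100
        return w_price * price_score + w_qual * b.qualification + w_rep * b.reputation

    return max(eligible, key=rating)


bids = [
    Bid("Alpha", 8_000_000, qualification=40, reputation=50),
    Bid("Beta", 10_000_000, qualification=95, reputation=90),
    Bid("Gamma", 7_000_000, qualification=80, reputation=80, compliant=False),
]
print(auction_winner(bids).bidder)      # cheapest compliant bid
print(competition_winner(bids).bidder)  # best overall rating
```

With these sample bids the auction goes to Alpha on price alone, while the competition goes to Beta, whose higher qualification and reputation scores outweigh its higher price under the assumed weights.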

But, as practice shows, even with competitions and additional requirements for suppliers, the organizer is not protected from non-performance of work. Cases in which a contractor overestimates its capabilities and the contract has to be terminated are rare on the market, but even large regional customers may be among the victims: for example, the registry office of the Leningrad Region (a contract worth 23.9 million rubles was terminated and the contractor was entered in the register of unscrupulous suppliers) or the Property Relations Committee of St. Petersburg (two contracts worth 8.1 million and 11.7 million rubles were terminated).

Among the customers of digitization services, three main groups stand out: government agencies, cultural institutions, industrial and commercial companies (see Table 1). At the same time, in 2017, the volume of orders belonging to government agencies exceeded the volume of orders of all other consumers of these services by more than two times. A year earlier, as already noted above, the market sank somewhat for organizations with state participation.

  • Information on most purchases by commercial companies that are carried out without open competitive procedures, through direct contracts with contractors, remains unavailable for analysis. According to statistics provided by market participants, this segment can be estimated at an additional 20-25% of the observable volume.

Dynamics of changes in the volume of orders by the main consumers of digitization services

In-demand digitization services

Several specific areas of work have been formed in the market under consideration: data entry, direct digitization of documents and comprehensive outsourcing.

The bulk of the projects involve the mass digitization of accumulated paper archives and files. These include:

  • files on the core activities of state authorities (registry offices, property management and urban planning, files and personal accounts of recipients of MPSC services, the Pension Fund of the Russian Federation and social services, files on the results of control and supervisory activities), and the digitization and population of state information systems and registers for the development of public services;
  • technical archives (design, as-built, design and estimate documentation of industrial enterprises, technical inventory documents of Rosreestr and the BTI funds transferred to the regions in 2015);
  • archival inventories, catalogs and accounting documentation of libraries and museums, the digitization of which makes it possible to take inventory, automate activities and move to the provision of electronic public services all at once; also the digitization and photographic recording of funds in order to develop exhibition activities and create insurance copies of exhibits;
  • "client dossiers": populating CRM and accounting systems to improve the quality of service in banks, NPFs, insurance funds, telecom and utility companies.

The areas described are characterized by long-term, strategic digitization work. Examples include the Ministry of Defense of the Russian Federation project "Memory of the People" (OBD "Memorial," OBD "Feat of the People," more than 12 years), the consistent digitization of property management documents in Moscow (more than 7 years), and a long-term project to digitize exhibits of the Hermitage (more than 7 years). These projects often involve rare and valuable materials: book monuments, manuscripts, archival films, photographs.

The second most important area of digitization is ongoing data entry: outsourcing the document registration functions of secretaries and accountants. These services are actively used in arbitration courts and centralized accounting departments. There are a large number of small purchases almost throughout Russia, which are carried out mainly by local companies and individual entrepreneurs.

Finally, a small segment of comprehensive outsourcing services covers a set of document-handling operations, most often routine and repetitive: checking incoming documents, copying them, scanning, registering, and preparing files for placement in the archive.

The largest players in the digitization market in Russia

Among Russian service providers in the market for digitization of documents, several dozen companies traditionally participate in tenders: both specialized and those for which digitization is not a specialized type of activity (system integrators, consulting and printing companies).

The largest service providers in terms of the volume of tenders won included:

  • ELAR Corporation, a full-service company that specializes in major digitization projects in all industries, including digitizing bound documents, archival files, books and drawings, and extracting data from unstructured documents (including handwritten ones); has software and equipment of its own production;
  • FSUE Main Research Computing Center of the Office of the President of the Russian Federation, which provides services to organizations of the Presidential Administration and federal departments, primarily digitizing funds commissioned by the B.N. Yeltsin Presidential Library;
  • Ladoga Telecom LLC, operator of the electronic government of St. Petersburg and the Leningrad Region; its main order is the conversion of information from the registry office of the Leningrad Region into electronic form;
  • BIORG LLC, a company specializing in outsourcing the processing of incoming documentation: questionnaires, scans of passports, price tags, handwritten statements, etc.; has its own software.
