Intelligent document processing. TAdviser Overview
Once import substitution ceases to be a key driver of the intelligent document processing market, the growth point will be AI and its integration, in conjunction with RPA, into BPM- and IDP-class corporate information systems.
Intelligent Document Processing (IDP) technologies have long been established on the market. The first domestic OCR (optical character recognition) programs were already in wide use in the mid-1990s.
Since then, they have become fixtures on many office and home computers, and the technologies they rely on, OCR and CV (computer vision), continue to be improved and modernized. Experts say that the now ubiquitous artificial intelligence is also contributing to the technological development of this software segment.
Among the immediate prospects in intelligent data processing, industry experts name programs based on large language models (LLM), natural language processing (NLP) and machine learning (ML) in its variants, such as AutoML and self-supervised learning.
1 Current trends in IDP
Experts note the expansion and interpenetration of "classical" and "modern" intelligent document processing technologies, in which traditional OCR/CV technologies interact with LLM/generative models and AI. Such integrations are already on the market, and most experts believe that their number will grow.
Vladimir Andreev, President of Docsvision:
"Recognition and intelligent search, grouping (clustering) and linking (regression) of documents have long had a place in document processing systems, and machine learning is used successfully in specific areas (for example, regulatory control of documents, checking the correctness of document packages, or classification of incoming documents). The 'freshest' technology is the use of large language models (LLM). In addition, intelligent document processing includes the tasks of optimizing document workflows, which are grouped under the name Process Intelligence (in particular, Process Mining technologies)."
"If we talk about the intelligent document processing segment, it is built on OCR technologies (including its variant, ICR) and NLP. (ICR - intelligent character recognition - ed.) To solve these problems, the whole range of machine vision and machine learning technologies is used, from classical heuristic approaches to pre-trained generative neural networks based on the transformer architecture. The current trend is to move away from heuristics and towards ML for most tasks. We can therefore say that OCR, ICR and NLP are themselves gradually becoming tasks that are solved along the way when a machine learning approach is applied. For example, VLM (Vision Language Model), a promising approach to extracting information from documents, does not in principle involve a separate OCR stage in the classical sense. Instead, a multimodal language model pre-trained on images and text receives the document image and generates a response containing the data to be extracted. VLM today is a promising rather than a widely used approach; as for more practical examples, within the OCR task itself the classic multi-stage pipeline is being replaced by end-to-end neural networks that take an image of text as input and immediately output the OCR result, while showing better quality and greater robustness to image distortion than classical approaches. The classical approach to NLP, which involved building a language model by hand and mapping text onto it, has completely given way to large generative language models (LLM). LLMs make it possible to solve NLP problems with greater flexibility and versatility and with less effort.
This lowers the 'entry threshold' and allows the technology to be applied to problems that were previously considered poorly suited to automation," said Oleg Sazhin, Advisor to the CEO of Content AI.
Grigory Starovoitov, product analyst at ELMA365 CSP, added several items to the intelligent document processing tools already named: speech-to-text software (for audio and video recordings), digital data archiving, and AI agents, that is, programs that work autonomously, processing data and making decisions without human involvement.
Vadim Petrosyan, Business Development Director of ITFB Group and author of EasyDoc (an intelligent platform for extracting, analyzing and generating text data with OCR, IDP and LLM), noted that the IDP market today uses the following technologies: OCR (text recognition) as the foundation; CV (computer vision) for recognizing structure, tables and signatures; ML/AI for classification, routing and preprocessing; NLP/NER for entity extraction and semantic parsing; LLM for understanding context and working with unstructured text; RPA for automating routine operations with documents; and integration with BPM/DMS/ECM to close the process end to end.
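The technology stack listed above can be pictured as a chain of stages that each enrich the same document record. The sketch below is only an illustration under loud assumptions: the `Document` class, the stage functions and the queue names are hypothetical, and simple keyword rules and regular expressions stand in for the ML classifier and NER model that a real IDP product would use. The OCR/CV stage is assumed to have already produced the text.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str                      # stands in for the output of the OCR/CV stage
    doc_type: str = "unknown"
    entities: dict = field(default_factory=dict)
    route: str = "manual_review"


def classify(doc: Document) -> Document:
    # ML/AI stage placeholder: a keyword rule instead of a trained classifier.
    lowered = doc.text.lower()
    if "invoice" in lowered:
        doc.doc_type = "invoice"
    elif "contract" in lowered:
        doc.doc_type = "contract"
    return doc


def extract_entities(doc: Document) -> Document:
    # NLP/NER stage placeholder: regular expressions instead of an NER model.
    date = re.search(r"\b\d{4}-\d{2}-\d{2}\b", doc.text)
    amount = re.search(r"\b\d+(?:\.\d{2})?\s?(?:RUB|USD|EUR)\b", doc.text)
    if date:
        doc.entities["date"] = date.group()
    if amount:
        doc.entities["amount"] = amount.group()
    return doc


def route(doc: Document) -> Document:
    # RPA/BPM hand-off placeholder: choose a queue from the document type.
    queues = {"invoice": "accounting", "contract": "legal"}
    doc.route = queues.get(doc.doc_type, "manual_review")
    return doc


def process(text: str) -> Document:
    doc = Document(text=text)
    for stage in (classify, extract_entities, route):
        doc = stage(doc)
    return doc


result = process("Invoice No. 17 dated 2025-03-01, total 1500.00 RUB")
print(result.doc_type, result.entities, result.route)
```

Keeping every stage behind the same "document in, document out" signature is one possible way to swap a rule for a model later without touching the rest of the chain.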
2 Technological Perspectives
Ilya Petukhov, head of AI solutions development projects at Directum, believes that in the future LLMs will be able to extract from documents not only text and metadata but also meaning, that is, to understand what a document is about. Work in this direction, according to him, is already underway.
"I think within a year we will arrive at a well-functioning scenario for processing data with LLMs and will be able to 'talk' to any document using queries, or prompts," Ilya Petukhov said. (A prompt is a text command or description used to generate content - ed.)
Nikolai Trzhaskal, Product Director of Preferentum of SL Soft, drew attention to the increase in the volume of data and the number of documents requiring processing and analysis:
"Large multimodal models (LMMs), which can work with both text and images, are of great help. As the amount of information analyzed grows, document processing alone becomes insufficient. The knowledge obtained needs to be structured and made searchable. For this purpose, text vectorization and semantic and vector search technologies are used. The latter requires specialized databases. Technologies that are cutting-edge today will soon be as much a part of the standard developer toolkit as classic relational databases. In general, everything related to turning information into accessible knowledge will develop by leaps and bounds in the coming years: whereas in 2010 the volume of data in the world was estimated at 2 zettabytes, in 2024 it was already about 145 zettabytes, and another 40 will be added in 2025 alone. At the same time, 80% of this data is unstructured, which means that the human brain can hardly perceive and memorize it without preliminary processing and contextual search mechanisms."
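To make the idea of text vectorization and vector search more concrete, here is a minimal sketch. It rests on a strong simplifying assumption: texts are turned into plain word-count vectors, whereas production systems would normally use learned embeddings and a dedicated vector database. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    # Toy vectorization: word counts. Real systems would use learned embeddings.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and return the best matches.
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:top_k]


docs = [
    "supply contract for office equipment",
    "citizen appeal about road repair",
    "invoice for electricity consumption",
]
print(search("road repair appeal", docs))
```

The ranking step is the part a vector database would take over at scale; the similarity measure itself stays the same in spirit.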
In the context of electronic document management (EDM), says Grigory Starovoitov, a product analyst at ELMA365 CSP, it is more appropriate to talk about the further development of existing technologies and the creation of mass-market solutions based on them. Nevertheless, in the medium term, ERP systems serving "digital factories" may emerge, that is, fully automated "unmanned" production facilities or facilities with minimal, indirect human participation.
Experts expect standard pre-trained ML models for classic document tasks, such as legal assessment of contracts or classification of citizens' appeals, to appear in the near future. This, says Vladimir Andreev, president of Docsvision, will significantly reduce the cost of putting models into practice.
The evolution of AI agents from "smart assistants" to modules that autonomously carry out business procedures not only in the internal network of the enterprise, but also in the external environment (for example, equipment purchases, hiring, automatic accounting) is also predicted. The use of augmented reality technologies in the design of workplace interfaces will also become more widespread.
Oleg Sazhin from Content AI said that in terms of business trends, including those driven by what customers expect from a modern IDP solution, the focus of intelligent data processing is shifting towards general business automation tasks. This makes RPA technologies an important element of intelligent data processing systems and part of this market. Here, alongside classical "scripting" approaches, the role of machine learning is growing. An important trend is the use of so-called AI agents, a technology for automating scenarios with a high degree of variability that are difficult to describe in the classic RPA style. AI agents have LLMs inside, and users can include them in automation scenarios. Guided by a job description and using elements of classical RPA as their "eyes and hands," AI agents can solve complex problems whose solution algorithm is hard to describe in advance.
Dmitry Ivankov, an expert at the SKB Kontur Center for Artificial Intelligence, draws attention to the prospects of technologies related to VLM (Vision Language Model), which can work with images and text simultaneously.
"You can already build question-answering systems on such technologies that solve document processing problems quite well. For example, they can recognize text in a photo or scan of a document, answer questions about it, summarize it, and detect seals and other objects. Moreover, a single model will cope with all these tasks," says Dmitry Ivankov.
At the same time, he calls the combination "OCR + a model for working with the recognition result" more popular on the market. Owing to the growth of metadata and the accumulation of a "digital footprint" of automated processes, interest is expected to grow in Process Mining tasks, in applying methods of continuous business process optimization to parameters of the enterprise's external environment, and in LLMs with supervised fine-tuning for preparing and processing documents.
Despite the rapid development of multimodal language models (MM-LLM), notes product manager Kirill Aliulin from Tevian.AI, specialized solutions based on convolutional neural networks still solve the document recognition problem most accurately and quickly. Such specialized solutions are the least demanding of computing resources and are easily deployed on the customer's infrastructure, inside its internal perimeter, which ensures data confidentiality.
In the near future, Oleg Sazhin believes, progress can be expected in machine learning automation (AutoML). Since the use of ML models is a clearly regulated, iterative process of data processing, model training and evaluation of results, it can be automated. AutoML can already optimize various stages of machine learning: data preprocessing, hyperparameter tuning and model performance evaluation. In the future, AutoML capabilities can be expected to expand, with new tools, more power and the ability to solve more complex problems. The next stage after AutoML is autonomous self-learning systems. In the medium term, systems will be able to learn from new data without human intervention, improving their accuracy and efficiency.
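One of the stages mentioned above, hyperparameter tuning with automatic evaluation, can be sketched as a plain grid search. This is a toy illustration, not how any particular AutoML product works: the "model" is a one-feature threshold rule, and the data, the candidate grid and all names are hypothetical.

```python
from itertools import product

# Hypothetical validation data: (feature value, true label).
valid = [(0.15, 0), (0.4, 0), (0.55, 1), (0.7, 1)]


def predict(x: float, threshold: float, invert: bool) -> int:
    # Toy model: label 1 when the feature reaches the threshold (or the reverse).
    label = 1 if x >= threshold else 0
    return 1 - label if invert else label


def accuracy(threshold: float, invert: bool) -> float:
    # Evaluation step: share of validation examples predicted correctly.
    hits = sum(predict(x, threshold, invert) == y for x, y in valid)
    return hits / len(valid)


# Search step: try every combination of candidate hyperparameters.
grid = list(product([0.2, 0.5, 0.8], [False, True]))
best_threshold, best_invert = max(grid, key=lambda params: accuracy(*params))
best_score = accuracy(best_threshold, best_invert)
print(best_threshold, best_invert, best_score)
```

Real AutoML tools replace the exhaustive loop with smarter search strategies and also automate preprocessing and model selection, but the evaluate-and-compare cycle is the same.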
Alexey Khakhunov, co-founder of Dbrain, drew attention to self-supervised learning in the context of model training. The technology makes it possible to train models without huge labeled datasets while maintaining high accuracy. This addresses the main problem of document labeling, which is labor-intensive and expensive.
He also mentioned blockchain technology, which makes it possible to verify the authenticity of documents reliably and transparently. In Europe, according to Alexey Khakhunov, the technology has already taken root, and in a few years it may become a universal standard for confirming the authenticity of data.
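The core idea behind such verification can be shown with a hash chain: a document's fingerprint is recorded in a block, and each block also stores the hash of the previous one, so later tampering becomes detectable. The sketch below is a deliberately simplified, single-process illustration; the `Ledger` class is hypothetical and omits everything that makes a real blockchain distributed (consensus, signatures, networking).

```python
import hashlib
import json


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Ledger:
    """Append-only list of blocks; each block stores the previous block's hash."""

    def __init__(self):
        self.blocks = []

    def register(self, document: bytes) -> None:
        # Record the document's fingerprint, chained to the previous block.
        prev_hash = self.blocks[-1]["block_hash"] if self.blocks else "0" * 64
        body = {"doc_hash": sha256(document), "prev_hash": prev_hash}
        block_hash = sha256(json.dumps(body, sort_keys=True).encode())
        self.blocks.append({**body, "block_hash": block_hash})

    def is_registered(self, document: bytes) -> bool:
        # A document is authentic if its exact fingerprint was recorded earlier.
        digest = sha256(document)
        return any(b["doc_hash"] == digest for b in self.blocks)

    def is_consistent(self) -> bool:
        # Recompute every block hash and check the links between blocks.
        prev_hash = "0" * 64
        for b in self.blocks:
            body = {"doc_hash": b["doc_hash"], "prev_hash": b["prev_hash"]}
            expected = sha256(json.dumps(body, sort_keys=True).encode())
            if b["prev_hash"] != prev_hash or b["block_hash"] != expected:
                return False
            prev_hash = b["block_hash"]
        return True


ledger = Ledger()
ledger.register(b"contract v1")
ledger.register(b"acceptance act v1")
print(ledger.is_registered(b"contract v1"), ledger.is_registered(b"contract v2"))
```

Changing even one byte of a registered document produces a different fingerprint, which is why the altered version is reported as unregistered.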
In the near and medium term, Alexander Vorobyov from Haulmont believes, a group of technologies will develop, including Data Science, Big Data, Machine Learning, as a result of which non-obvious patterns and sequences are extracted from data arrays (for example, predictive analytics - ed.).
"It is difficult to say now which specific tools will emerge from the development of these technologies," says Alexander Vorobyov. "Of the most significant tools that have appeared recently, it is worth mentioning generative models that work with several types of data at once and make it possible to recognize information from a document without extracting a text layer."
Artem Vartanyan, Director of the Marketing Department of ELAR Corporation, believes that the market is now focused on introducing generative AI both for solving individual problems and for designing national technological innovations. The use of generative neural networks is therefore the de facto cutting-edge technology on the market.
Vadim Petrosyan (ITFB Group) expects the emergence of highly specialized LLMs for legal, financial and medical documents, AI assistants for working with document content in EDMS, ML models that self-learn from user data, and the structuring of PDFs and scans into editable formats (an AI assistant for document layout and restoration). Development is also moving towards Document QA that works with data at the level of "understanding."
3 IDP Target Audience
In general, intelligent document processing is required in those industries and companies where administrative and business processes are associated with the use of paper media, the images of which are important to store and/or analyze for any purpose. The larger the scale of operations and the lower the level of digitalization, the larger the volume of processed documents and the more technologies for their processing are in demand.
Experts note that most solutions on the intelligent document processing market are "boxed": the effect of their implementation appears quickly, and they make it possible to automate routine operations and to process incoming feedback more efficiently, for example when receiving applications and requests. The customers can be enterprises of any size and form of ownership, as well as public sector organizations and institutions; typical users include accounting, records offices, human resources departments, MPSC and other departments that work with citizens' appeals. Here, AI technologies can have a significant effect by reducing document processing time and/or increasing processing volume and, overall, reducing operating costs.
The functional (by task types) differentiation of the implementation of solutions for intelligent document processing was also noted by Ilya Novoseltsev, director of new developments at Rubius:
"Custom data solutions built for a specific company and tailored to its business processes are more in demand in large companies. There it is economically justified. The volume of incoming and outgoing documentation is large, and many people work with it, so any automation immediately brings significant profit. Among the priority use cases for AI are tender processing and the analysis and generation of template documentation: acts, certificates, explanatory notes and so on. A separate major trend is the creation of corporate knowledge bases with AI assistants. They solve the problem of information being unevenly distributed across the company. The situation here is the same as with chatbots in customer service, except that chatbots work for an external audience and take the load off the support service, while corporate AI assistants help resolve issues internally. In addition, companies are looking for ways to turn paper documents into digital ones. For example, logistics companies are considering digital archives for old paper documentation. This will simplify search and reduce costs."
At the same time, experts note, public sector enterprises are less willing to adopt new technologies whose safety and reliability have not been proven in practice. Small and medium-sized businesses are more agile in this sense and in some areas are actively experimenting with AI tools, for example for fast and cheap content production (voice-over, visual and sound design, 3D modeling).
Nikolai Trzhaskal from SL Soft adheres to the point of view that the driver of development in the segment is state structures and large corporations. At the same time, he pointed out the objective difficulties of introducing intelligent document processing:
"It is in demand everywhere, not only among businesses but also among private users. Unfortunately, objective and subjective factors prevent these systems from advancing quickly in all directions. Among the objective ones, it is worth highlighting the high cost of implementing and owning solutions: outside classic OCR, a market for ready-made cloud or desktop products that could fully satisfy small and medium-sized businesses and advanced private users has not yet begun to form. The public sector is squeezed by limited budgets and by the impossibility of concluding contracts budgeted several years ahead. As a result, systems are developed in fits and starts, a little each year, which makes it impossible to build long-term modernization programs backed by contractual obligations. Therefore, at the moment, the main drivers of intelligent document processing are enterprises and large businesses, for which it is not just a fashionable trend or part of a national project but an entirely practical matter of saving money, increasing efficiency and reducing the risk of errors that lead to reputational and financial losses."
Alexey Khakhunov, co-founder of Dbrain, details the needs for IOD (intelligent data processing - ed.) in different market segments. According to the expert, the public sector automates the processing of tax returns, government contracts and citizens' electronic dossiers. For this software, stability and compliance are the key requirements.
Big business implements IOD for compliance, legal control and company-wide optimization of document management. Medium-sized and small businesses prefer cloud services and ready-made SaaS solutions that are easy to launch and integrate at minimal cost.
Intelligent data processing, Oleg Sazhin (Content AI) believes, is applicable in all sectors of the economy. In state structures, IDP solutions are most in demand for processing citizens' appeals. This is a difficult task, taking into account the variety of topics, types and formats of statements. The Government of the Russian Federation considers various options for unified automation of this process.
IOD, according to Oleg Sazhin, should become a key working tool for state organizations, since it makes it possible to analyze the current state of various industries, record the dynamics of change and identify trends, whether in demographic processes, the load on the road transport network or the state of the environment. The information obtained should be used to coordinate actions, respond in time and adjust current processes and plans.
As Vadim Petrosyan (ITFB Group) said, the public sector concentrates the greatest demand for import-independent solutions and for the automation of applications and inspections, while for large businesses (primarily banks, insurance and telecom) streaming processes matter most, along with requirements for IDP speed and scale. In the SMB segment, demand is gaining momentum, more often in the form of SaaS or boxed solutions.
4 The Role of Artificial Intelligence
AI and ML, according to most experts, continue to be key technologies in the intelligent document processing market. Their impact on the IOD market will continue and even increase. The dominance of AI and ML technologies, according to some experts, will persist until a real technological breakthrough in quantum computing.
At the same time, a number of experts believe that the key technologies in the IOD market today are low-code/no-code platforms, cloud technologies and blockchain.
Interest in AI, as noted by Rubius Development Manager Ilya Orlov, is not decreasing. Customers from the industrial sector (petrochemicals, the fuel and energy complex, fertilizer production) show great interest in predictive analytics and decision support systems. According to the expert, this is because by 2023 companies had accumulated significant datasets and are now looking for ways to monetize them.
As an example of the importance of predictive analytics, Ilya Orlov cited the task of forecasting electricity demand, which industrial enterprises are actively discussing. The relevance of the task stems from how electricity is produced and consumed in the industrial segment: the cost of electricity for a consumer depends on how accurately it declared the amount of electricity it needs. Companies calculate their consumption and submit forecasts to generating companies. If the forecast turns out to be inaccurate, consumption above the forecast is charged at a higher price, and the difference is noticeable. Consequently, the more accurate the forecast, the less the company pays for electricity. For many industries, this is a significant expense item, the expert summed up.
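The pricing logic described above can be expressed as a small calculation. The sketch assumes the simplest possible rule: consumption within the forecast is billed at a base price and any excess at a higher price. The function name, the prices and the volumes are hypothetical, and real tariffs (including charges for under-consumption) may be structured differently.

```python
def electricity_cost(forecast_kwh: float, actual_kwh: float,
                     base_price: float, excess_price: float) -> float:
    # Consumption covered by the forecast is billed at the base price.
    within = min(actual_kwh, forecast_kwh)
    # Consumption above the forecast is billed at the higher price.
    excess = max(actual_kwh - forecast_kwh, 0.0)
    return within * base_price + excess * excess_price


# Hypothetical figures: actual consumption is 1200 kWh in both cases.
inaccurate = electricity_cost(1000.0, 1200.0, base_price=5.0, excess_price=8.0)
accurate = electricity_cost(1200.0, 1200.0, base_price=5.0, excess_price=8.0)
print(inaccurate, accurate, inaccurate - accurate)
```

With these illustrative numbers, underestimating demand by 200 kWh costs 600 units more than an accurate forecast, which is the kind of gap a predictive model is meant to close.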
Experts believe that in most cases, the integration of "AI + data storage" or "AI + BPM" increases the final efficiency of the entire system, making it more flexible and user-friendly.
"There is now a trend towards end-to-end processes and a need to bring all systems into a single landscape; as a result, the most flexible AI tools with the broadest integration capabilities are needed. In practice, however, such AI is still hard to imagine, since every customer's IT system has its own specifics. We strive to implement the option closest to this. Flexible tools make it possible to use artificial intelligence to optimize any process in the system, and since the EDMS now often acts as a CSP, AI can be connected to the widest possible range of customer processes," said Alexander Vorobyov.
In terms of integration trends, Oleg Sazhin (Content AI) believes that the growing popularity of IDP solutions in conjunction with RPA and AI is obvious. Many large customers are implementing such projects. Examples of AI (LLM) and BPM integration can be found on the international market. In Russia, there is great interest in this technology, but there are as yet no examples of it being applied to real, existing processes. It can be assumed with a high degree of confidence that they will appear within the next few years. Work on integrating new technologies is of great importance for companies' stability on the market and their prospects for development. This factor will come to the fore once import substitution ceases to be a key market driver.
Vadim Petrosyan (ITFB Group) believes that integration with storage (DWH, S3, MinIO), BPM and ERP increases competitiveness, because customers value flexibility and seamlessness. As for AI and ML, according to the expert, they are becoming an integral part of any document processing system, from extraction to analytics. Without AI, automation is no longer possible.
5 Import substitution in IDP
Most experts are confident that import substitution in the segment of intelligent data processing is a significant factor affecting both the financial and technological results of companies engaged in the implementation of EDMS, document analysis and the development of ML models.
The focus of import substitution in intelligent document processing, according to a number of them, is ensuring that LLMs run on domestic hardware (CPU and GPU). Work in this direction is proceeding along several lines. One is running LLMs on devices that do not have high-performance hardware, such as smartphones. Another is the development of Russian GPU servers. Experts estimate that 100 such servers will be produced in 2025.
Import substitution will also be in demand in such areas as unification (replacement or integration of heterogeneous software within a single platform), ensuring the confidentiality of information and personal data, fault tolerance.
By many indicators, Alexey Khakhunov from Dbrain said, Russian systems are not merely comparable to foreign counterparts but surpass them in processing complex document types characteristic of Russia (for example, non-standard contract forms and blanks). This increases interest in domestic solutions and stimulates the industry to develop further.
Alexander Vorobyov from Haulmont, speaking of import substitution, noted that in many organizations foreign EDMS were introduced long ago and are already technologically outdated. Initially, architectural features made their development and modernization difficult, and in the last two years there have also been difficulties with acquiring licenses and dealing with foreign vendors. The ability to use AI is becoming one of the components of a technological upgrade. At the same time, even before the import substitution trend intensified over the last two years, domestic EDMS were already replacing foreign products, including in high-load projects.
Vadim Petrosyan, Business Development Director of ITFB Group, considers import substitution to be a key growth driver, especially for the public sector and state corporations. It stimulates the development of domestic OCR, NLP and platform solutions.
6 The demand for various formats of system use
When assessing the demand for integration formats, some experts, in particular Ilya Petukhov from Directum, believe that embedding AI technologies in existing products will be the priority. With regard to SaaS models, the specialist notes that 81% of large companies are not ready to use cloud services. Small and medium-sized businesses are less skeptical and more willing to use cloud AI services.
A similar opinion is shared by Grigory Starovoitov from ELMA365 CSP. In his opinion, the movement towards "ecosystems," integration with various software and the ability to work in a "single window" mode will become a trend in the near future.
Nikolai Trzhaskal from SL Soft believes that AI should be not only fashionable but also economical. Adding intelligent features without having to change stable systems is the most economical way. Therefore, the priority, in his opinion, is the ability to integrate AI solutions into existing EDMS, ERP and BPMS systems from a variety of vendors.
Artem Vartanyan from ELAR Corporation notes that integrating AI into IOD systems in SaaS form is the least in demand, since this format requires transferring documents from the customer's internal systems to an external service. Customers, for the most part, are not ready to hand documents over to third-party resources for processing. In addition, under the SaaS model it is difficult to automate the processing of the entire flow of incoming documents. Part of the data then has to be processed locally by hand, which increases labor costs and slows the entry of documents into corporate information systems.
Therefore, Artem Vartanyan believes, most customers prefer to deploy processing systems locally. The implementation strategy depends entirely on the processes inside the customer organization. If the main processes run in the document management system, one of the best solutions is to install the document processing module in that already functioning and familiar system. But the EDMS is not the only option for applying AI as a module. The most popular concept is the ECM platform, which acts as an enterprise data store. Accounting systems based on ERP, personnel systems based on EDC, EDMS, computer-aided design systems and CRM systems can be connected to it, as can automated banking systems for banks and financial organizations. All software platforms send their data to a single window in the form of the ECM platform, where the AI module categorizes, normalizes, classifies, parses and processes documents and loads the data into sections of a single corporate store. In this way, the company gets a single window for both processing and normalizing data, eliminates duplication and errors, and brings all information together in one system, which makes it possible to build search processes flexibly and assemble consolidated document collections for different scenarios.
Oleg Sazhin from Content AI adheres to a similar position:
"The Russian market is largely dominated by projects in which IDP solutions or additional modules are integrated into the document management systems already in use or installed separately, but, importantly, only on-premise (a local deployment - ed.). Customers prefer to work with information and store it within their internal perimeter. This position is especially typical of industries that work with sensitive data, both personal and financial."
Vadim Petrosyan (ITFB Group) considers integration into ECM/EDMS the priority for large business and the public sector, while SaaS is popular in SMB, especially since the pandemic. As for modules, they are relevant when AI functionality is embedded in existing processes.
7 Financial results of 2024 and expert forecasts
During 2024, the electronic document management market showed an upward trend, believes Grigory Starovoitov, who predicts stable growth at a comparable rate in 2025. The growth drivers, in his opinion, will be import substitution, the development of the regulatory framework and the evolution of EDMS into Content Services Platforms (CSP).
Ilya Petukhov from Directum also forecasts market growth in 2025. He believes AI for data processing is replicable, efficient and affordable. In 2024, the market for "boxed" solutions with AI grew by about 50% in unit terms, and the trend will continue in 2025.
Dmitry Ivankov from SKB Kontur agrees with them. He reinforces his opinion by the fact that many organizations still use paper media in their processes. Many documents are not even digitized, but potentially contain a large amount of data that can be used to manage the company.
Kirill Aliulin from Tevian.AI notes that in 2024, the direction of intelligent document processing showed an impressive increase in revenue. The market is quite mature and saturated, but there are niches with unsatisfied demand, so growth, according to him, will continue.
A similar opinion is shared by Alexander Vorobyov from Haulmont. In his opinion, the IOD segment is experiencing explosive growth and the trend will continue in the near future.
According to Content AI data cited by Oleg Sazhin, import substitution projects account for 30 to 50% of all projects implemented on the domestic market. Since 2022, domestic vendors have been occupying the niches vacated on the market. Vendors whose solutions offer advanced functionality, high performance, compatibility with domestic operating systems, reliability and stability can count on sustainable revenue growth.
ITFB Group Business Development Director Vadim Petrosyan noted that the market grew in 2024, driven especially by the digitalization of public services and banks. For 2025-2026, the expert expects steady growth, with AI, import substitution and digital transformation as the drivers.
8 Conclusion
Technological solutions for IOD are becoming easier to use, and the effect of their application is more tangible. In the near future, the uptrend will continue, and companies implementing AI platforms will strengthen their position in the market. All experts who took part in the TAdviser survey agree with this statement.
Many experts consider the expansion of AI technologies and tools in electronic document management and intelligent document processing inevitable and desirable. In addition, the integration of AI with EDMS and of AI with BPM gives such combined solutions extra advantages, and for customers the availability of integrations is an additional factor in choosing such a product. The integration capabilities of document recognition software add value to the product and help retain customers.
Business structures and government agencies also want electronic document management to be unified, moving from heterogeneous specialized systems to a single, unified system with similar functionality.
AI and ML continue to be key technologies in the intelligent document processing market and, according to some experts, will remain in this capacity until a real technological breakthrough in quantum computing.
Further prospects for the intelligent document processing market, and for the domestic IT market as a whole, will be largely determined by the general political and economic situation in the country, the prospects for interaction with the international community and access to advanced technologies.