TAdviser interview with Dmitry Romanov, CEO of Preferentum
Dmitry Romanov, CEO of Preferentum (IT Group), Candidate of Technical Sciences and associate professor at the National Research University Higher School of Economics, answered TAdviser's questions about information systems for analyzing unstructured information and about the state of this segment of the Russian IT market.
What objects do the software products you offer on the market work with?
Dmitry Romanov: We work with unstructured information. On the one hand, this is a difficult concept; on the other, it is very simple, because all of us deal with unstructured information every day and every minute. Web pages, audio recordings, text, video, photos: these are all examples of unstructured information spread across different sources and devices such as a smartphone, a voice recorder or a computer. This is what we work with, and we try to extract useful information from it. As a rule, in most cases it is still text, either in electronic form or as a graphic image. But there is also combined information: for example, when a support service analyzes incidents, it works with clients' voice calls, with screenshots and with documents.
In December of last year Sberbank officially announced that 3 thousand robot lawyers had started work at the bank. Are these software developments similar to your company's solutions?
Dmitry Romanov: I am not aware of the details of this project, but I can assume that our product line has something similar. Today we see many areas where an information system can take over routine yet fairly intellectual human work. Artificial intelligence, which has been talked about for decades, has quietly reached the market and is in fact already used by business.
How do you assess the segment of the Russian IT market in which you work?
Dmitry Romanov: The segment of the Russian IT market in which we work is more than 10 years old, but it is still young. At the initial stage, its clients were mainly government institutions, intelligence agencies and security agencies that needed to analyze unstructured information and take certain actions based on the results. Incidentally, the first example of intelligence agencies analyzing voice information is described in Alexander Solzhenitsyn's novel "In the First Circle", where the mention of a keyword was caught in a telephone conversation. Clearly, the technologies have advanced greatly since then, and in the last several years they have also become widely available, that is, they have spread into the retail segment. Speech recognition and machine translation are used by a great many people through smartphone applications, and for free. So today we see, on the one hand, growth in the power and quality of cognitive technologies and, on the other, growth in their availability. The corporate environment shows a similar picture: these technologies can already genuinely replace a person in operations that people used to regard as exclusively their own prerogative. Because of these factors, this segment of the IT market is growing rapidly, although volumes are still small.
What volumes are we talking about, in your estimate?
Dmitry Romanov: I estimate the annual volume of the Russian market for technologies that analyze unstructured information, in the corporate segment including the public sector, at several hundred million rubles, and certainly less than one billion. But I will repeat that this segment is growing very actively. According to IDC estimates, the international market will grow by about 50% annually through 2020 inclusive.
And what term is used on the international market to designate the segment of technologies for recognizing unstructured information?
Dmitry Romanov: There is no single commonly accepted term. A range of names is used, each reflecting different aspects and nuances of the technologies: semantic technologies, cognitive technologies, text analytics, natural language processing (NLP), intelligent search (Intelligence Search, IS) or enterprise intelligent search (Enterprise Intelligence Search, EIS), and search-based applications (Search Based Application, SBA).
And which term do you like best?
Dmitry Romanov: Text analytics. Text is what speech, video and scans are all converted into when we talk about documentary information. In the end we arrive at text anyway.
Let's return to the Russian market: how many players are there? Among the main vendors … How large is the presence of Western IT companies in the market?
Dmitry Romanov: About twenty players of different size and business strength operate on the Russian market. Probably well over half of them are Russian, although Western companies such as IBM and HP promote their technologies for recognizing unstructured information quite actively.
What can you say about the competitive situation in the market?
Dmitry Romanov: Competition certainly exists. I cannot recall a single project of ours in which there were no competitors at the tender stage. But I would not say that, in a rapidly growing market, this competition is very tough.
Has import substitution affected the market situation?
Dmitry Romanov: Our segment is still so immature and so sparsely filled with solutions that, frankly, I have not noticed any particular effect. At such an early stage of market development there is simply not much to substitute yet. Besides, I should note that domestic solutions have an a priori advantage in Russia regardless of import substitution policy, and this advantage is tied to the Russian language itself, both as the state language and as a language of international communication.
And are there tasks at the intersection of languages, for example Russian and English?
Dmitry Romanov: Such tasks arise, and they are within our field of view too. Our products support languages such as English and Ukrainian. We are now working on Kazakh, and there are plans to support French, German, Spanish and Portuguese.
What problems do government bodies want to solve with analytical tools? What do you offer them?
Dmitry Romanov: In the public sector we work in several areas, but most of the tasks involve processing regulatory legal acts, which is not surprising in itself: documents are the main thing government institutions deal with. Our product line includes a number of developments for legal review. In particular, we can detect non-obvious errors in texts. For example, a text may refer to an article of a law that is no longer in force, or introduce a legal provision that contradicts the rules of another document. Tracking all this manually is rather difficult.
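Editorial illustration: the check for references to provisions that are no longer in force can be pictured with a minimal Python sketch. It is not Preferentum's implementation. The citation format, the `REPEALED` registry and the law identifiers are all invented for the example, and a real system would need linguistic analysis rather than a single regular expression.

```python
import re

# Hypothetical registry of repealed provisions as (law id, article number).
# These identifiers are made up for illustration only.
REPEALED = {("123-FZ", "15")}

# Assumed citation format: "Article <number> of Law <id>".
REF_PATTERN = re.compile(r"Article\s+(\d+)\s+of\s+Law\s+([\w-]+)")


def find_repealed_references(text, repealed=REPEALED):
    """Return every citation in text that points at a repealed provision."""
    hits = []
    for match in REF_PATTERN.finditer(text):
        article, law = match.group(1), match.group(2)
        if (law, article) in repealed:
            hits.append(match.group(0))
    return hits


sample = "As required by Article 15 of Law 123-FZ and Article 7 of Law 45-FZ."
print(find_repealed_references(sample))
```

The sketch only flags citations found in the registry; detecting contradictions between documents, which the answer also mentions, is a much harder problem and is not attempted here.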
In what main directions is the company's product line developing?
Dmitry Romanov: There are three main directions. The first is about "pulling" structured components out of unstructured data (the name of an organization, a job title, a geographical location, an address, a phone number and so on), after which a number of applied problems can be solved. For example, you can compare reference books, clean data, or enrich the data of one information system from another. The working name of the first direction is "Preferentum Data" or, in the language of three-letter abbreviations, NER: Named Entity Recognition, that is, recognition of named entities. The second direction, "Preferentum Klass", is connected with machine learning methods, the classification of texts or other unstructured objects, and determining how similar they are. Machine learning here means, of course, not a machine teaching a person but a person teaching a machine, so that the machine, that is, the information system, can do a person's work almost as a person would. Products of this line are in demand for handling customers' requests to companies or citizens' appeals to authorities. The system reads the message, "understands" whom it is addressed to, and then sends it down one of the channels depending on its content. Sorting uploaded documents into folders, searching for plagiarism: there are dozens of very different scenarios for using this technology. And it is very important that an information system built on machine learning technologies is self-learning. The third product direction, Preferentum Robots, is intended for various formal checks. It handles routine verification of documents and issues expert opinions.
For example, at the agreement approval stage it turned out that about 80% of errors are the simplest ones: the amount in digits does not match the amount in words, VAT is calculated incorrectly, the counterparty's details do not match those in the counterparty database, the power of attorney cited in the agreement has expired, and so on. A person spends a lot of time correcting the simplest errors and forgets to check the substance of the agreement, for example how profitable it is.
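Editorial illustration: two of the formal checks listed above, the VAT calculation and the validity of the power of attorney, might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the company's code. The `Agreement` fields, the `check_agreement` function and the 18% rate in the usage example are hypothetical, and the fields are assumed to have already been extracted from the document text.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Agreement:
    # Hypothetical fields, assumed to be already extracted from the text.
    net_amount: Decimal
    vat_amount: Decimal
    vat_rate: Decimal        # e.g. Decimal("0.18"); the rate is a parameter
    poa_valid_until: date    # expiry date of the power of attorney
    signed_on: date


def check_agreement(agreement):
    """Run simple formal checks and return a list of detected issues."""
    issues = []
    # Recompute VAT from the net amount and round to kopecks.
    expected_vat = (agreement.net_amount * agreement.vat_rate).quantize(Decimal("0.01"))
    if agreement.vat_amount != expected_vat:
        issues.append(
            f"VAT mismatch: stated {agreement.vat_amount}, expected {expected_vat}"
        )
    # The power of attorney must still be valid on the signing date.
    if agreement.poa_valid_until < agreement.signed_on:
        issues.append("power of attorney expired before the signing date")
    return issues


draft = Agreement(
    net_amount=Decimal("1000.00"),
    vat_amount=Decimal("150.00"),
    vat_rate=Decimal("0.18"),
    poa_valid_until=date(2016, 12, 31),
    signed_on=date(2017, 1, 15),
)
for issue in check_agreement(draft):
    print(issue)
```

`Decimal` is used instead of floating point so that monetary comparisons are exact. Checks such as matching the amount in digits against the amount in words would follow the same pattern once both values are parsed.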
What is the main technological difficulty in developing smart programs capable of analyzing complex text information at an expert level? Where do you find staff?
Dmitry Romanov: We have no problems with developing software or with staff. The problem is formalizing the task, which depends directly on the customer's understanding of what they do and on our understanding of their problems. The customer knows their business processes, document flow, bottlenecks and risks, but knows nothing about recognition technologies. The term "neural network" is the limit of what they have heard about cognitive technologies. We know everything about text analytics, but often we do not have a detailed picture of the customer's business processes. Joining the customer's knowledge with ours is the hardest part.
And how is this problem solved?
Dmitry Romanov: We hold demonstrations of our solutions on the customer's premises, train specialists in the fundamentals of semantic technologies, and set up special test benches where customer representatives can verify in practice the effectiveness of the solutions we propose.
Which research institutes do you actively cooperate with in your developments?
Dmitry Romanov: We cooperate with the National Research University Higher School of Economics, the Institute of Informatics Problems of the Russian Academy of Sciences, and some other Russian institutes.
And with Western ones?
Dmitry Romanov: We do not contact Western institutes directly. There is no need: all the information is published in various specialized publications anyway and circulates in the scientific community. We follow all the information in our area; fortunately, there are many scientific works on semantic technologies and artificial intelligence.
In many areas of IT, Western technologies and solutions are the flagships, and the Russian market follows them willy-nilly; that is how it developed historically. Can the same be said of technologies for analyzing unstructured information?
Dmitry Romanov: No, for technologies that analyze unstructured information this is not true. There are many Russian (at that time still Soviet) developments from the 1970s and 1980s that still set the tone in this area of information technology. One example is the classical works of Vapnik and Chervonenkis on dimension, which are key for all developments in classification and machine learning, both in Russia and in the world. So with regard to text analytics, and unstructured information more broadly, it would be incorrect to claim that Western technologies are ahead and we lag behind. There are general problems connected with the development of science in the country, the same as in physics, chemistry, biology and any other science.
Which events would you count among the company's greatest successes in 2016?
Dmitry Romanov: We completed a project stage at the Ministry of Internal Affairs, where a number of interesting solutions were implemented, in particular a self-learning system for anti-corruption review of draft regulatory legal acts that uses crowdsourcing mechanisms to form the checking rules. A constructor of regulations is another system implemented at the Ministry of Internal Affairs. By pressing a few buttons, the user receives as output a competent text "On amendments to …", written in correct legal language. Another event: the NER technologies used in our products won first place in the named entity recognition competition (persons, organizations, geographical locations) held as part of the Dialog 2016 conference. Experts annotated more than 30 thousand news items from the Internet, and 16 participating companies competed to see who could extract this information best, most completely and most precisely. The third achievement is purely technical but important: we managed to improve the accuracy of automatic classification significantly. In certain cases our systems can now deliver accuracy comparable to a human's, above 97%.
What is the technology platform of the company's solutions?
Dmitry Romanov: The technology platform of our solutions is entirely our own development. We once started on the IBM platform, but after running into a number of problems connected, in particular, with the specifics of Russian language support and the general cumbersomeness of the platform, we gradually replaced individual components with our own. As a result we obtained a unique technology stack based on our own developments and open libraries.
Your forecast of the main market trends for 2017 and beyond?
Dmitry Romanov: I think 2017 will be a year of active growth and of drawing new customers into the practical use of unstructured information analysis for the benefit of business and effective management. And over the next 2 to 3 years the Russian market for cognitive technologies will keep growing, and severalfold at that.