2020/06/25 09:31:02

AI "eyes": what computer vision systems see today and what they will make out tomorrow

The computer vision (CV) market in Russia is growing at an impressive pace: analysts expect that after 2021 its average annual growth may reach 40%. These growth rates stem from truly revolutionary breakthroughs in mathematical methods made a few years ago. As a result, the whole world is rapidly expanding the number of practical video analytics projects. What challenges lie ahead, and what stands in the way of smart autonomous robots that "see like a human"? This article is part of the TAdviser overview "Technologies and solutions of artificial intelligence: change point".


Main article: Computer vision (machine vision)

Evolution of technologies

File:Aquote1.png
A network is a set of neurons with connections and signal transmission between them. Such systems can be called conditionally universal: we constantly fine-tune the network, adding new layers of neurons and connections and using an ever larger number of images for training,
says Tatyana Voronova, head of data analysis at Center 2M.
File:Aquote2.png

Experiments with architecture: how they change the market

File:Aquote1.png
Different network architectures are also used; often a separate architecture has to be developed for a separate task,
adds Tatyana Voronova.
File:Aquote2.png

Yury Vizilter, head of the intelligent data analysis and machine vision division at the Federal State Unitary Enterprise GOSNIIAS, considers fine-tuning by constantly adding new layers of neurons to an already trained network a rather exotic option.

File:Aquote1.png
Example databases really are constantly replenished, and ready networks are usually fine-tuned on the new examples without changing the architecture, - the expert notes. - If the architecture does need to change, developers simply try different ready-made architectures, varying their parameters and retraining the network each time.
File:Aquote2.png

The constant search for the best network architecture is the everyday reality of developers. With enviable regularity, several times a year, new neural network architectures appear around the world that significantly outperform their predecessors.

File:Aquote1.png
For all the main computer vision tasks (classification, object detection, object tracking, semantic segmentation, video annotation, face recognition, etc.) there are generally accepted public test datasets, and new ones keep appearing,
emphasizes Vizilter.
File:Aquote2.png

On these datasets, scientific and commercial development teams compete under agreed testing rules (protocols), and all practitioners watch closely which network architectures, loss functions and training techniques show the best results.

Because over the last decade it has become customary in the machine learning and computer vision research communities to publish new network architectures openly, the path from cutting-edge research to an applied product can, specialists estimate, take only a few months.

File:Aquote1.png
This is an inspiring opportunity, but also a serious commercial danger. Unlike in previous decades, it is no longer possible to claim that a company has secured lasting technological leadership in a given market. Whoever stops following the latest achievements and stops improving their product risks immediately losing out in competition to the greenest newcomer,
comments Yury Vizilter.
File:Aquote2.png

A good example of this state of affairs is biometric face recognition. In 2015 the Russian team of NTech Lab developed the FaceN algorithm, capable of recognizing faces more accurately than a human. The same year FaceN won the MegaFace world championship: it managed to pick out matching faces among one million samples, outperforming more than one hundred teams from around the world, including Google. Before that, this field had been dominated by a handful of well-known companies that had spent decades nurturing and refining their biometric solutions.

The global MegaFace face recognition competition uses a database of one million photos

Face recognition: revolution with continuation

Face recognition solutions of previous generations required strictly frontal shots of faces at good resolution in well-controlled conditions. The emergence of deep neural networks in 2015 overturned these established technologies: the new recognition algorithms made it possible to identify faces from any angle, and in images of much poorer quality.

File:Aquote1.png
Dozens of development teams around the world competed for the best results, and within literally a year the quality of face recognition rose to a fundamentally new level. Leaders changed weekly; this could be followed in the rankings on the websites of the relevant image databases, the main one at the time being LFW (Labeled Faces in the Wild). Russian teams were among the leaders more than once, including us: in 2015 GOSNIIAS won the first Russian public competition on face recognition in difficult conditions, held by FPI.

File:Aquote2.png

When the huge new MegaFace database appeared, the developer community switched to it, and the Russian company NTech Lab became one of the first winners of this competition. Today the pace of quality improvement in face biometrics has slowed somewhat, Yury Vizilter notes, but qualitatively new scientific solutions still appear several times a year, and the best commercial products are regularly compared in public, for example in the tests of the US National Institute of Standards and Technology (NIST).

Russian developers have recently been among the winners of the NIST tests. Last spring the face recognition algorithm of VisionLabs won in the Mugshot category (police photographs of offenders, where lighting and background vary and image quality can be poor), which used more than one million photos of people, with photos of the same person taken up to 14 years apart. VisionLabs took first place in this test (99.5% at a false positive rate of 0.001%) with the face recognition algorithm that proved most robust to age-related changes.

The NIST Mugshot category tests use photos with poor image quality

In the Visa category, where recognition is performed on high-quality photos, finding the right person is complicated by the fact that the database contains several thousand photos of people from different countries. Here the VisionLabs algorithm took "silver", showing the second-best result in the world with a recognition rate of 99.5% at a false positive rate of 0.0001%. And at an international competition held as part of the CVPR 2019 computer vision conference, VisionLabs beat all competitors in tests on detecting attempts to deceive a computer system: its Liveness system distinguishes the face of a living person not only from a photo but also from a video.

File:Aquote1.png
A changing line-up of leaders is typical of every area of computer vision today. That is why long-term technological leadership is impossible here, however much money is invested in development. The international developer community is too large and deep learning methods keep evolving, so the probability is high that tomorrow someone else will make the next breakthrough and get the best results. And this will remain the case at least until the technological revolution now under way in computer vision comes to an end,
emphasizes Yury Vizilter.
File:Aquote2.png

Achievements of today

We know everyone by sight

As of 2020, NTech Lab technologies can detect faces in a crowd even when they are far away, turned away from the camera or half-covered, for example by masks. Faces are also recognized in photos taken in difficult conditions, for example in poor lighting, or in blurred images taken, say, in the semi-darkness of an auditorium.

Meanwhile, real life brings its own adjustments: many people on the streets and in public places now wear medical masks. Has this become a challenge for today's video surveillance systems?

File:Aquote1.png
Everything depends very strongly on the installed cameras. If the cameras are of poor quality, a mask can cover half of a face that was hard to recognize anyway. If we are talking about an office, a shop or a warehouse, recognition accuracy hardly suffers. Moreover, we suggest that customers fine-tune their systems: add images of employees wearing masks to the standard passport-style photos already loaded into the recognition system. Enriching the employee portfolio in this way significantly increases recognition accuracy,
explains Roman Gots, director of the Big Data and security department at Atos in Russia.
File:Aquote2.png

According to the Atos expert, the real "tough nut" for systems that recognize faces and other objects is emotion recognition: "Especially the emotions of a Russian person. Here the rate of false positives is still extremely high".

This is the kind of solution that Center 2M, in particular, has focused on. Its access control solutions for premises recognize the visitor's emotions and identify the person accurately even if they try to change their appearance, say by putting on glasses, a wig, a false beard or headwear. There is also a built-in anti-spoofing system, which prevents an intruder from getting into the premises using a picture of a real employee.

This became possible with the emergence of special technologies for describing a human face. Each recognized face can be represented by a set of unique parameters called "face descriptors". These descriptors take far less computer memory than the images themselves. In addition, the source image of the face cannot be restored from the extracted descriptors, VisionLabs specialists note, stressing: "This principle of operation makes it possible to comply with personal data protection rules".
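To make the idea of matching by descriptors concrete, here is a minimal sketch. It assumes that a descriptor is a plain vector of numbers and that two descriptors are compared by cosine similarity against a threshold; the article does not say how any particular vendor compares descriptors, and the names `cosine_similarity`, `identify`, the 0.6 threshold and the toy 4-dimensional vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, gallery, threshold=0.6):
    """Return the best-matching identity, or None if no score reaches the threshold.

    The threshold is an illustrative assumption, not a vendor setting.
    """
    best_name, best_score = None, threshold
    for name, descriptor in gallery.items():
        score = cosine_similarity(probe, descriptor)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy descriptors; only these vectors are stored, never the source photos.
gallery = {
    "employee_a": [0.9, 0.1, 0.0, 0.4],
    "employee_b": [0.0, 0.8, 0.6, 0.1],
}
probe = [0.85, 0.15, 0.05, 0.35]
print(identify(probe, gallery))
```

Only the short vectors are kept and compared, which is what allows such a system to work without storing images of real people.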

Modern face recognition systems work with descriptors, not with images of real people

Urban video analytics – a component of the smart city

In March 2020 it became known that the Vizier biometric face recognition system of CST group had been included in the Digital Economy organization's database of effective cases. The database, which now holds about thirty IT solutions, is being compiled by ANO Tsifrovaya ekonomika (Digital Economy) together with regional authorities and business in order to develop the digital economy in Russia's regions. Vizier, a universal information system for detecting and identifying faces in video footage of dense flows of people, is listed in the Public Security block.

Screen of the Vizier biometric identification system

The system is highly effective in difficult conditions: poor lighting, low temperatures, sharp angles. Faces are recognized and verified while the person is moving, i.e. in real time. Responding to current challenges, CST specialists upgraded the system to adapt it to the epidemiological situation, in particular for retrospective search for people about whom there is information on potential coronavirus infection.

Meanwhile, the Ministry of Internal Affairs is developing a new citywide system for recognizing criminals and suspects using video cameras. It is expected to be able to pick out a criminal in a street crowd by face, voice, iris, tattoos on exposed parts of the body and gait. According to previously announced plans, the system is to be launched by the end of 2021.

Video analytics: industrial video surveillance

Among the most popular video analytics solutions for industrial enterprises are several types of monitoring:

  • Breaches of a hazardous perimeter.
  • Presence of personal protective equipment (PPE).
  • Detection of defects on production lines.

For example, mining and processing enterprises share a widespread problem: ore contamination. Elements of mine timbering, drilling tools, wooden objects and other non-magnetic items often end up on conveyor belts, which can lead to a torn belt, a crusher breakdown and increased wear of the crushing parts. Detecting foreign objects on the conveyor is one of the popular video analytics tasks.

Norilsk Nickel is deploying a neural-network-based solution that automatically recognizes ore-contaminating objects on the conveyor belt and raises alerts


Center 2M says its portfolio of completed projects includes ones in which the video surveillance system for production facilities covers an area the size of ten football fields. Across this entire area, the system checks that every employee is wearing all the PPE items required in each particular case. A specially trained neural network recognizes PPE items (helmets, vests, respirators, goggles, etc.) as well as pieces of equipment and their operating modes.
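The rule "every employee wears all the PPE items required in each case" can be sketched as a set comparison applied to a detector's output. This is an illustration only: the zone names, the required sets and the function names `missing_ppe` and `check_workers` are hypothetical, and the neural network that would produce the detected items is outside the sketch.

```python
# Hypothetical mapping from work zone to the PPE items required there.
REQUIRED_PPE = {
    "welding_bay": {"helmet", "goggles", "respirator"},
    "loading_dock": {"helmet", "vest"},
}

def missing_ppe(zone, detected_items):
    """Return the required PPE items that were not detected on a worker in the zone."""
    return REQUIRED_PPE.get(zone, set()) - set(detected_items)

def check_workers(detections):
    """detections: list of (worker_id, zone, detected_items) tuples.

    Returns a dict mapping each violating worker to a sorted list of missing items.
    """
    violations = {}
    for worker_id, zone, items in detections:
        missing = missing_ppe(zone, items)
        if missing:
            violations[worker_id] = sorted(missing)
    return violations

frame_detections = [
    ("w1", "loading_dock", ["helmet"]),
    ("w2", "welding_bay", ["helmet", "goggles", "respirator"]),
]
print(check_workers(frame_detections))
```

Keeping the required sets per zone in one table is one way to express "necessary in each case" without retraining the recognition network when the rules change.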

Croc has developed a comprehensive video analytics solution for occupational health and industrial safety (OTiPB) based on machine vision and artificial intelligence. The system helps to identify potentially dangerous situations through predictive analytics and can be used to investigate incidents. The solution also helps to track the location of personnel and to control access to hazardous zones. The monitored zone can be dynamic and move within the camera's field of view.

Video analytics in retail

In most shops today, visitor counting solutions based on a laser sensor or a video camera with special software are in operation. Retail chains are also testing face recognition technologies that let them literally "know by sight" their regular customers, including for launching personalized loyalty programs. For example, if the camera at the checkout recognizes someone as a regular customer, a discount is applied automatically and the sales assistant can address them by name. In Perekrestok stores, computer vision helps to reduce queues at the checkout: when the number of people exceeds the permitted level, management receives a signal that an additional checkout needs to be opened.
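The queue signal described above can be sketched as counting detected people inside a checkout zone and alerting only when the limit is exceeded for several frames in a row. This is a simplified illustration, not the retailer's actual logic: the rectangular zone, the limit of five people, the three-frame debounce and the function names are assumptions, and the person detector that supplies the bounding boxes is not shown.

```python
def count_in_zone(boxes, zone):
    """Count person boxes (x1, y1, x2, y2) whose centre lies inside the rectangular zone."""
    zx1, zy1, zx2, zy2 = zone
    count = 0
    for x1, y1, x2, y2 in boxes:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        if zx1 <= cx <= zx2 and zy1 <= cy <= zy2:
            count += 1
    return count

def should_alert(frames, zone, max_people=5, min_frames=3):
    """Alert only if the limit is exceeded in min_frames consecutive frames."""
    streak = 0
    for boxes in frames:
        # Reset the streak as soon as one frame is back under the limit.
        streak = streak + 1 if count_in_zone(boxes, zone) > max_people else 0
        if streak >= min_frames:
            return True
    return False

checkout_zone = (0, 0, 100, 100)
crowded = [(10, 10, 20, 20)] * 6   # six people in the zone
quiet = [(10, 10, 20, 20)] * 2     # two people in the zone
print(should_alert([crowded, crowded, crowded], checkout_zone))
```

Requiring several consecutive frames is a common way to avoid alerts caused by a single noisy detection or by people merely walking past the checkout.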

Computer vision in workflows

Experts unanimously name the key direction of development in this market segment: application software in which CV solutions are built into particular production or business processes.

File:Aquote1.png
Today, what works best for customers is video analytics tied to a platform, where a single platform, say, counts the number of people at a site, has biometric analysis built in, recognizes characters on equipment and checks technological processes. This gives a single "front" for displaying different events. Of course, all these events are related to the customer's production process, but in themselves they are diverse: they contain information about people, about equipment and about how people interact with the equipment.
File:Aquote2.png

The logical integration of processes, Center 2M believes, is perhaps the highest achievement of intelligent technologies today, because in non-trivial modern IT projects it is not only a person's work skills at a specific workplace that change, but also the sequence of actions of different employees, i.e. a production or business process as a whole.

A striking example of this approach is "unmanned shops"

Unmanned shops

The development of CV technologies has, among other things, given impetus to "unmanned" shops that operate without human sales staff. In 2018 the first such automated supermarket, Amazon Go, opened in Seattle. There, cameras and sensors track which goods a shopper takes from the shelf and then charge the shopper's account. Multiple cameras follow each of the shopper's actions from different angles, while the computer system identifies shoppers and the goods in the cart, builds a virtual basket and makes a cashless payment.

In Russia the first "unmanned" shop opened in the fall of 2019 as part of the Pyaterochka chain. It is a small street shop in a container that can easily be moved to another location with a crane truck. Entering the booth requires the Pyaterochka app, which, among other things, generates a dynamic QR code used to get in.

Video cameras scan the entire shop, and the computer system analyzes the actions of all shoppers simultaneously. In particular, if someone tries to leave the shop without paying, the door stays closed.

In the fall of 2019 the first street shop without cashiers, "Pyaterochka s soboy" ("Pyaterochka To Go"), opened in Dolgoprudny

The road to mass-market CV solutions

File:Aquote1.png
Besides information about objects, you can extract information about the image as a whole, for example where it was shot (indoors, street, city, country), the time of day, etc. You can also determine an object's characteristics: color, size, shape, identity. You can even estimate wind strength from the look of flags in the image. You can also classify people's movements, for example falls or phone conversations, and extract biometric data,
says Tatyana Voronova.
File:Aquote2.png

File:Aquote1.png
Shooting on mobile phone cameras with the corresponding processing algorithms, the simplest 3D cameras with structured light, surveillance cameras and video analytics using neural networks: all of this is, one way or another, already present in mass-market segments,
sums up Mikhail Smirnov, technical director of Computer Vision Systems (SCV).
File:Aquote2.png

File:Aquote1.png
Every new project has its own specifics, and functionality has to be refined. But in general, solutions for recognizing characters with a clear structure (for example, license plate recognition) and for recognizing frequently encountered objects (for example, human silhouettes) now replicate well, given sufficient image quality and object size in the image,
explains Tatyana Voronova.
File:Aquote2.png

Roman Gots describes a project that Atos is running at 116 Tesco filling stations in the UK:

File:Aquote1.png
Using our hardware and software system installed at the filling stations, Tesco managed to automate the fire extinguishing system and remove a bottleneck: the single employee responsible for fire safety at a station was replaced with a smoke and fire detection algorithm. The purpose of deploying such a system is to ensure business continuity, in particular given the risk of an employee falling ill during the pandemic.

File:Aquote2.png

The technological peaks of CV

File:Aquote1.png
The first class comprises systems in which cameras and computer vision technologies are combined with other sensors: lidars, GPS sensors, inertial sensors, etc. As a rule they are required to be highly accurate, and the main difficulty lies in setting up and calibrating such a system.
File:Aquote2.png

To the second class the expert assigns systems built on the simplest, low-quality cameras (fish-eye, smartphones, etc.) that can nonetheless solve complex problems using the images they capture, for example calibrating themselves, performing 3D reconstruction, stitching panoramas, enhancing quality and much more.

For instance, an autonomous drone capable of surveying and scanning a mine working in a hard-to-reach area without human presence can use cameras of varying quality and, accordingly, different software to obtain results of the required quality.

Smart UAVs capable of exploring mine workings on their own operate at Norilsk Nickel

Solutions of tomorrow

New level of autonomous navigation

Significant progress in autonomous visual navigation algorithms occurred even before deep neural networks emerged. In the early 2000s a class of Structure-from-Motion (SfM) algorithms appeared that could reconstruct three-dimensional data fully automatically from arbitrary sets of images taken from different viewpoints. But these algorithms were computationally expensive. Then came SLAM (Simultaneous Localization And Mapping) algorithms, which simultaneously build three-dimensional models and estimate the position and motion of an autonomous device, even when shooting with a single camera. SLAM algorithms were also fairly "lightweight" and could run in real time even on mobile phone processors of the time. Another important technological milestone was that photogrammetry gained methods for fairly accurate calibration of wide-angle cameras (up to fish-eye), and these cameras became the basis of electronic vision systems for autonomous ground and aerial platforms.

File:Aquote1.png
Thus by 2010, i.e. before the neural network revolution, computer vision already had practically usable tools for autonomous onboard navigation. Today, in 2020, however, all these problems are solved with deep neural networks (sometimes combined with the older methods to speed up computation), which add capabilities for intelligent situation analysis and build the dynamic semantic models of the surroundings that autonomous control systems need,

File:Aquote2.png

For example, SCV has a joint project with one of the leading smartphone manufacturers on automatically picking out objects of different types in a 3D point cloud.

File:Aquote1.png
For this it is important to know not only where objects are located but also their type (trees, columns) and material (concrete, wood, glass, steel). To that end we are developing an integrated solution that builds a 3D map from wide-angle camera data, uses a neural network to recognize object types and uses a polarization camera to determine the material type from its refractive index,
notes Mikhail Smirnov.
File:Aquote2.png

Smart driver assistance systems

Advanced driver-assistance systems (ADAS) are today part of any serious vehicle monitoring system offered on our market. The basic set includes collision warning, distance control, pedestrian warning and lane departure warning. Corporate-grade ADAS usually also includes analysis of driver behavior, detection of deviations based on the video data received, and monitoring of dangerous situations by a dispatcher. Some solutions are integrated with an insurance company's IT system.

File:Aquote1.png
By combining different modules we create a product for analyzing driver behavior, - the company says. - For example, the driver's level of drowsiness can be determined by combining monitoring of the mouth area to track yawning with monitoring of the eyes to measure how long they stay closed.
File:Aquote2.png
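The combination of eye and mouth cues described in the quote can be sketched as follows. The sketch assumes that upstream modules already provide a per-frame "eyes closed" flag and a yawn count for a time window; the share of closed-eye frames is the idea behind the PERCLOS measure used in drowsiness research. All thresholds, the 10 fps frame rate and the function names are illustrative assumptions, not values from the article.

```python
def perclos(eye_closed_flags):
    """Fraction of frames in the window in which the eyes were closed."""
    return sum(eye_closed_flags) / len(eye_closed_flags)

def longest_closure_seconds(eye_closed_flags, fps):
    """Longest uninterrupted eye closure in the window, in seconds."""
    longest = current = 0
    for closed in eye_closed_flags:
        current = current + 1 if closed else 0
        longest = max(longest, current)
    return longest / fps

def drowsiness_level(eye_closed_flags, yawn_count, fps=10):
    """Combine eye and mouth cues into a coarse level: 'alert', 'tired' or 'drowsy'.

    Thresholds are illustrative assumptions and would need tuning on real data.
    """
    closure = longest_closure_seconds(eye_closed_flags, fps)
    closed_share = perclos(eye_closed_flags)
    if closure >= 1.5 or closed_share >= 0.4:
        return "drowsy"
    if closed_share >= 0.15 or yawn_count >= 2:
        return "tired"
    return "alert"

# A 10-second window at 10 fps with one 2-second eye closure and no yawns.
window = [True] * 20 + [False] * 80
print(drowsiness_level(window, yawn_count=0))
```

Using both the longest single closure and the overall share of closed-eye frames lets the sketch react to a single long "microsleep" as well as to frequent slow blinks.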

For smart driver monitoring, an additional camera and an additional video recorder are installed in the vehicle telematics system


In December 2019 KAMAZ (the Kama Automobile Plant) presented a working prototype of the KAMAZ Continent concept tractor unit, which is regarded as a prototype of the next-generation K6 tractor. The model is equipped with Level 3 ADAS: the sensors and radars of the autopilot system are hidden behind a decorative glossy black panel that has replaced the radiator grille. This means that the truck can change lanes, park and travel in a convoy on its own.

"Continent" is a demonstration sample of a future production vehicle that has already been assigned the industry index KAMAZ-54907


The vehicle's onboard system offers various features: navigation, Internet browsing, video and audio playback, clock, weather forecast, stock quotes, fuel level, the date of the next scheduled maintenance and more. There are also two cameras pointed at the driver: the computer system assesses the position of the head, blinking frequency, gaze focus and some other indicators of physical condition. If there are warning signs, the system signals that it is time for the driver to take a rest.

Onboard cabin of KAMAZ Continent

The near future of city-class ADAS is driver assistance in a connected-car format. For example, Nissan presented an ADAS prototype built on the Invisible-to-Visible (I2V) concept: the system helps drivers see what is hidden, for example around a bend or behind the corner of a building.

By collecting data from sensors located inside and outside the car and comparing it with data from cloud storage, the system can not only understand what is happening around it at the moment but also anticipate what will happen next. The "connected" car gives the driver easy-to-grasp hints, including various icons displayed on the instrument panel, in particular about the situation around the car and up-to-date traffic information.

Unmanned courier robots

Yandex has taken an interest not in multi-ton trucks but in small, nimble courier robots. Testing of these bright "younger brothers" of Yandex's self-driving cars began last fall, and today they are already entrusted with real jobs, for example delivering a package of paper documents or picking up a parcel for an employee from a warehouse of the Beru service.

Yandex.Rover, the younger brother of the self-driving car, is designed for unmanned transport of small loads around the city


However, the "youngest" only looks small and helpless. In fact its hardware and software platform is not inferior to that of its senior "comrades". Moreover, the company developed its own lidar and video camera specifically for Yandex.Rover. As Dmitry Polishchuk, head of self-driving cars at Yandex, noted, low equipment cost is a crucial factor for mass production of such "self-propelled carriages".

In addition, unlike the commercial foreign products available on the market today, the lidar's software makes it possible to adjust scanning settings on the move. For example, it can focus on a distant object (at up to 200 meters) and determine precisely what kind of object it is: a pedestrian, a cyclist, etc.

File:Aquote1.png
Lidars from third-party manufacturers analyze and filter data at the collection stage. Using our own lidars, we obtain more information thanks to access to the "raw" data.
File:Aquote2.png

Yandex's unmanned courier robot accurately distinguishes objects in the road scene while moving, at a distance of 200 m

Medical image analysis: a growth point for the CV systems market

File:Aquote1.png
These tasks have very strict requirements for recognition quality. On top of that, large volumes of visual information have to be processed. For example, in analyzing a lung CT scan, a conclusion about the presence of pathology has to be drawn from a set of image slices, for instance by immediately distinguishing vessels from masses. This area of computer vision is now developing actively, and it has unique, specialized solutions that have not yet reached the broad commercial market,
says Tatyana Voronova.
File:Aquote2.png

A good example is Botkin.AI, a platform for medical image analysis developed by Intellogic. The platform performs diagnostics and analyzes the risk of disease development based on mathematical models that represent patients' state of health.

The Botkin.AI solution provides streaming screening of medical radiological images


At the end of March the company reported that it had quickly developed and added pneumonia analysis functionality to the Botkin.AI platform. Intellogic expects that this will help reduce risks and mitigate the possible consequences of the COVID-19 coronavirus epidemic. The company has opened free access to the new functionality for all medical organizations involved in diagnosing and treating patients with COVID-19.

Belgorod State National Research University has created a system for hematological analysis of blood condition from digital images of erythrocytes.


At the request of an Israeli clinic, SCV is developing software that will help determine the position of patients' heart valves in ultrasound images. The U-Net neural network architecture, often used in biomedical image processing tasks, was chosen as the base model. After the network architecture was refined, the company says, the accuracy of the result exceeded 95%.

The KZ system helps exact diagnostics serd


File:Aquote1.png
Even with low-quality images, where the heart valves are not always clearly visible, the classifier we developed determines their location with high accuracy. Semantic segmentation is used for this task because the method takes into account not only spatial information but also the contextual information obtained from the image,
says Mikhail Smirnov.
File:Aquote2.png
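The article reports an accuracy above 95% without saying how it was measured. As an illustration only, the sketch below shows two common ways to score a predicted segmentation mask against a ground-truth mask: pixel accuracy and the Dice coefficient. The tiny 2x4 masks and the function names are hypothetical, and nothing here reproduces the company's evaluation.

```python
def _pixel_pairs(pred, truth):
    """Flatten two equally sized 2D masks into a list of (predicted, true) pixel pairs."""
    return [(p, t) for pred_row, truth_row in zip(pred, truth)
            for p, t in zip(pred_row, truth_row)]

def pixel_accuracy(pred, truth):
    """Share of pixels where the predicted label equals the ground-truth label."""
    pairs = _pixel_pairs(pred, truth)
    return sum(p == t for p, t in pairs) / len(pairs)

def dice(pred, truth):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for binary masks (1 = valve, 0 = background)."""
    pairs = _pixel_pairs(pred, truth)
    intersection = sum(1 for p, t in pairs if p == 1 and t == 1)
    total = sum(p for p, _ in pairs) + sum(t for _, t in pairs)
    # Two empty masks agree perfectly, so define Dice as 1.0 in that case.
    return 2 * intersection / total if total else 1.0

truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0]]
pred_mask = [[0, 1, 0, 0],
             [0, 1, 1, 0]]
print(pixel_accuracy(pred_mask, truth_mask), dice(pred_mask, truth_mask))
```

For small structures such as heart valves, pixel accuracy can look high even when the structure is largely missed, because most pixels are background; overlap measures such as Dice are less forgiving in that case.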

Video analytics for agriculture

At the MAKS air show in the summer of 2019, Aeromax and Sitronics signed a cooperation agreement on implementing digital solutions and services for agriculture, the forest industry and regional government. Aeromax plans to collect information using unmanned aerial vehicles and sensors installed on agricultural machinery. Based on the collected data, Sitronics will create IT services for agricultural consumers. The partners are expected to create digital maps of farmland that allow crops to be planned using data on soil condition, moisture and mineralization, sunlight exposure, wind strength and temperature swings, making it possible to plan optimal watering and fertilization and to choose the best time for harvesting.

File:Aquote1.png
Field monitoring using unmanned aerial systems and automated data processing is a breakthrough for agribusiness and the agricultural industry as a whole. We plan to extend our experience to all regions of Russia, as it can make life much easier for farms and large agricultural enterprises,
noted Valery Shantsev, chairman of the board of directors of Aeromax.
File:Aquote2.png

Inspection of farmland using a UAV


According to consulting company Tractica, rapid growth in the use of autonomous smart agricultural machinery is expected worldwide through 2024: shipments will increase from 32 thousand units in 2016 to 594 thousand units in 2024.

Source: Tractica, 2016.
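To put the Tractica forecast into annual terms, the implied compound annual growth rate can be worked out from the two figures quoted above. The calculation assumes steady compound growth over the eight years from 2016 to 2024, which the source does not state; the function name `cagr` is just a label for this sketch.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, an end value and a period."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures from the Tractica forecast: 32 thousand units (2016), 594 thousand units (2024).
growth = cagr(32_000, 594_000, 2024 - 2016)
print(f"Implied annual growth: {growth:.1%}")
```

Under that assumption the forecast corresponds to growth of roughly 44% a year.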

Tractica analysts note the following key applications of robots in agriculture:

  • unmanned tractors and aircraft;
  • management of material resources;
  • automated crop cultivation systems;
  • forestry and subsoil use;
  • automated management systems for dairy farms.

In 2016 Cognitive Technologies carried out the first tests of its computer vision system on driverless agricultural machinery, and in July of last year it agreed to install the Agro Pilot autonomous control system on Rusagro combine harvesters. The system includes an artificial brain, or computing unit (Agrodroid), a video camera, a display and a number of other sensors and controls. It can take over driving completely, so that the machine operator can focus entirely on the harvesting parameters (the tilt angle of the header, threshing, etc.).

Last year Cognitive Agro Pilot was also deployed on a number of farms in the Tomsk region. The smart equipment was tested in difficult conditions: varied field textures, steep ascents and descents, irregular field geometry, and night-time operation. During testing in the Kurgan region, a harvesting record of 67 centners per hectare was set. The system also has customers in Brazil, the USA and Asian countries.

The Cognitive Agro Pilot system has been deployed on a number of farms in the Tomsk region

Group intelligence of smart devices

The operation of driverless equipment certainly impresses an outside observer. The coordinated actions of a whole group of smart autonomous objects are even more impressive. We can regularly observe this, for example, at large open-air ceremonies, where many drones rise into the air, spell out texts, dance in the sky, and so on. Professionals are developing a special branch of control theory for this: so-called network-centric control of a group of intelligent objects. Laypeople, meanwhile, are tempted to elevate these developments to the level of autonomous intelligence and even to attribute to them some kind of will in decision-making.

File:Aquote1.png
Strong artificial intelligence, which is much talked about today, is intelligence comparable to that of a human: it can successfully perform all or almost all of the intellectual tasks that people can. But so far this type of artificial intelligence exists only in popular science-fiction films and in the heads of some marketers who skillfully exploit it.
File:Aquote2.png

File:Aquote1.png
Of course, this requires complex control systems and a system for navigation in space. But the "will" in this case consists of the same algorithms, which operate under certain conditions on the principle "if a black cat runs across the road, the autopilot hits the brake". With drones the situation is exactly the same. There is a mathematical model for controlling the drones based on flight parameters, current coordinates, battery charge, etc. There is a goal: to take off and line up in a particular formation. The model then uses the incoming flight parameters to send control commands and adjust each drone's flight trajectory in order to build the required "compositions",
explains Alexander Spiridonov.
File:Aquote2.png
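
The control loop described in the quote can be illustrated with a toy model. This is a minimal sketch, not the method used by any real drone show system: the proportional control law, the low-battery rule and all names and constants (DroneState, control_step, gain, min_battery and so on) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DroneState:
    x: float
    y: float
    z: float
    battery: float  # remaining charge, from 0.0 (empty) to 1.0 (full)


def control_step(state, target, gain=0.5, max_speed=2.0, min_battery=0.2):
    """Return a velocity command (vx, vy, vz) in m/s for one control cycle."""
    if state.battery < min_battery:
        # Fixed rule, not "will": on low battery, ignore the target and descend.
        return (0.0, 0.0, -max_speed if state.z > 0.0 else 0.0)
    # Proportional control: fly towards the assigned slot in the formation.
    position = (state.x, state.y, state.z)
    command = [gain * (t - p) for t, p in zip(target, position)]
    speed = sum(c * c for c in command) ** 0.5
    if speed > max_speed:
        # Scale the command down so the drone never exceeds its speed limit.
        command = [c * max_speed / speed for c in command]
    return tuple(command)


def simulate(state, target, steps=400, dt=0.1, drain_per_step=0.001):
    """Apply control_step repeatedly, updating position and battery in place."""
    for _ in range(steps):
        vx, vy, vz = control_step(state, target)
        state.x += vx * dt
        state.y += vy * dt
        state.z = max(0.0, state.z + vz * dt)
        state.battery -= drain_per_step
    return state


drone = simulate(DroneState(0.0, 0.0, 0.0, battery=1.0), target=(3.0, 4.0, 10.0))
print(round(drone.x, 2), round(drone.y, 2), round(drone.z, 2))
```

Each drone simply reacts to its current parameters according to fixed rules; a formation emerges when every drone is given its own target point.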

Computer vision on the way to integration with augmented reality

An interesting and promising direction for CV systems is integration with virtual and augmented reality (AR) technologies. Accenture has developed the GoodsAR application, designed to help shoppers in a store. It is adapted for smartphones, tablets and special augmented reality glasses.

The application is installed on a tablet or mobile phone, imports a shopping list from any messenger, and then plots a route to the nearest zone of the sales floor where the first item on the list is located. It then guides the shopper from shelf to shelf according to the list. With augmented reality glasses, all the necessary information about the route and the goods appears directly in front of the shopper, without the need to look at a tablet.

The Accenture GoodsAR application uses virtual reality to guide the shopper through the store to the shelf with the required goods

Perhaps plotting a route between supermarket shelves will strike some as an idea far removed from real life. But there is no need to rush to conclusions: similar algorithms are genuinely useful, for example, at large airports. Last year in London, at the Passenger Terminal EXPO exhibition, Panasonic presented its innovative Smart airport solution, which is based on face recognition technology and offers passengers various useful services. Among them is a software assistant that shows a person's current location on a screen and plots a route to the required point in the airport.

Experts largely link the future of AR to the development of AR Cloud technology. Today it is at the start of its climb to the peak of expectations on Gartner's well-known curve of emerging technology development.

Source: Gartner, 2019.


AR Cloud is essentially an exact model of the world at 1:1 scale or, more precisely, a "software copy" of the world, obtained by scanning the physical objects around us, to which augmented reality elements are added.

Applications based on AR Cloud are already appearing. For example, the YaPlace platform and the Augmented.City application let the user look at a building through a smartphone camera and get useful information overlaid on the video: historical background, a list of the organizations located in the building, or the rating of the restaurants on the first floor from the TripAdvisor application. The developer promised to release an AR application for the Italian city of Bari by May 2020. However, the coronavirus situation may delay the start of the pilgrimage season in southern Italy.

The company "SKZ" has created an application for industrial enterprises which, using AR Cloud, allows plant employees to walk around the plant, point the camera at a piece of equipment (a machine) of interest and obtain information about it in real time, for example equipment operating parameters, the date of the last maintenance inspection, etc.

An AR Cloud-based virtual reality application for industrial enterprises

Challenges facing the industry. Can a smart program "see like a person"?

File:Aquote1.png
The purpose of a computer vision system is to extract information from an image as well as a person does, or better. There is even a special Turing test for computer vision: the computer should be able to provide the same amount of information about an image as a person. This, in fact, is why this field of research falls within artificial intelligence. Solving this problem is quite difficult, because a person learns these skills from birth,
says Tatyana Voronova.
File:Aquote2.png

Seeing like a person: the practical problem statement

But is it necessary to set such a task in practice? Perhaps practical implementations of computer vision technologies will always be niche and specialized?

File:Aquote1.png
The functions of a computer vision system are dictated by its purpose. The analysis of three-dimensional images, dynamic and stream analysis, and the analysis of changing scenes in real time are impressive and important, but not always necessary. I would put first such factors as suitability for the task at hand and reliability. A system should be reasonably priced, reliable and fast enough. For object recognition, the quality factors may be the recognition range, the number of false alarms and the system's ability to detect its own malfunction. For 3D, they are the complexity of capturing a scene for reconstruction, the speed of operation, and the demands on the computing unit and memory. As for stream analysis, the quality indicators will be the complexity of the computing unit, speed, redundancy and reliability.
File:Aquote2.png
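
Quality factors such as the number of false alarms can be expressed as simple numbers once a system's output has been compared with ground truth. The sketch below is a generic illustration using the standard precision and recall definitions; the function names and the sample counts are assumptions, not figures from any system mentioned in the article.

```python
def detection_quality(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Compute precision and recall from detection counts."""
    detections = true_positives + false_positives
    actual_objects = true_positives + false_negatives
    # Precision: share of raised detections that were real objects.
    precision = true_positives / detections if detections else 0.0
    # Recall: share of real objects that the system actually found.
    recall = true_positives / actual_objects if actual_objects else 0.0
    return {"precision": precision, "recall": recall}


def false_alarms_per_hour(false_positives: int, hours_observed: float) -> float:
    """Average number of false alarms per hour of operation."""
    if hours_observed <= 0:
        raise ValueError("hours_observed must be positive")
    return false_positives / hours_observed


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed objects.
quality = detection_quality(true_positives=90, false_positives=10, false_negatives=30)
print(quality, false_alarms_per_hour(10, hours_observed=8))
```

Which of these numbers matters most depends on the purpose of the system: a security camera may tolerate false alarms but not misses, while an automatic fine-issuing system is the opposite.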

In March, nine automatic weight and dimension control points for heavy vehicles were put into operation on the busiest highways of the Omsk region

On the way to universal computer vision

File:Aquote1.png
So far, the resources such systems require exceed the capabilities of the available computing hardware. All CV systems solve some finite set of tasks. But a car's autopilot, say, is an approach to such a universal system. It cannot yet be implemented with computer vision technologies alone, though. A number of issues of quality, reliability and accuracy need to be solved,
believes Mikhail Smirnov.
File:Aquote2.png

The Russian company Cognitive Technologies, among others, is working on these problems. At the end of 2018 it developed the Cognitive Imaging Radar, a 4D radar. Unlike conventional radars, which emit radio waves in a plane, it scans space thanks to the original design of its antenna array and, without any mechanical parts, makes it possible to "see" the shape of objects in a road scene and to achieve good resolution and detection accuracy. In addition, the company explains, the device builds a four-dimensional picture of the road scene in a single transmit-receive cycle. This makes it possible to increase the data update rate and, as a result, to determine the parameters of dynamic objects and use the radar's power efficiently. It also keeps the total cost of the finished device low.

In traditional automotive systems, a lidar (laser scanner), an expensive device that is sensitive to dust and precipitation, is used alongside the video camera to obtain additional data on the road situation. Cognitive Imaging Radar handles the lidar's tasks: for example, it can distinguish the shape of an object and classify it. Moreover, it can identify several visible objects at once; for example, it will be able "to see" the pedestrians standing in front of the vehicle.

Source: Cognitive Technologies presentation about the 4D-radar

Last summer Cognitive Technologies signed an agreement with Hyundai Mobis: the Russian developer's software, built into luxury cars, will appear on the market in 2021-2022. The agreement concerns C-Pilot, a fourth-level driver assistance system, which assumes that the vehicle will be able to cover most of the route without the driver's involvement at any time and in any weather: in the dark, in rain, snow, fog, etc.

At the end of November, Sberbank and Cognitive Technologies announced the creation of a new company focused on developing driverless transport. In addition to cars, the new company will work on smart agricultural machinery, railway locomotives and trams.

Challenges for CV system software

Development of this segment is objectively driven by improvements in hardware: camera resolution is increasing, lenses with fast refocusing and polarization cameras are being developed, and computing power is growing. However, experts also see directions for development in the corresponding software.

File:Aquote1.png
There are two basic approaches to developing a specific computer vision solution: building a model of the object and its features yourself, or training a ready-made model on the features of that object. In the first case the result is a model that works with only a narrow class of objects, but with high accuracy. In the second case accuracy is, as a rule, lower, but so are the costs, and a wider variety of objects can be covered,
says Mikhail Smirnov.
File:Aquote2.png

File:Aquote1.png
While a fairly complex model can be built today, training and tuning it is a big task. The question of tuning these algorithms to object types arises immediately. One would like this to be as automated as possible. Ideally, one would also like to get away from the need for hundreds and thousands of training examples, as is usually required now. A separate issue is model validation. It requires a large number of experiments and is a big task in itself.
File:Aquote2.png

File:Aquote1.png
If we talk about limitations, the first is the requirements for performance and for data transmission channels. Today the algorithms are kept running by fairly powerful servers, but as the performance of wearable devices grows, these drawbacks will cease to be drawbacks. The second limitation is noise in the data and the fact that the algorithms are rather easy to "deceive". We now see interest shifting towards finding solutions for non-ideal conditions, for example fog, night-time, etc. This will certainly help improve the quality of autonomous devices.
File:Aquote2.png

File:Aquote1.png
Although a huge number of example images have accumulated over the years the Internet has existed, there are specific objects that are used only in, say, a particular production process: special tools, equipment or signs. The data set for such objects needs to be expanded constantly with new video recordings, for example with data coming from the customer's cameras, and the system needs to be regularly trained further on them.
File:Aquote2.png
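
The idea of regularly topping up a trained system with new examples can be shown on a deliberately simple stand-in model. The sketch below uses a nearest-centroid classifier with running means rather than a neural network, purely to keep the example self-contained; the class name, the two-number feature vectors and the object labels are all hypothetical.

```python
class IncrementalCentroidClassifier:
    """Nearest-centroid classifier that can absorb new examples one at a time."""

    def __init__(self):
        self.centroids = {}  # label -> running mean of the feature vectors
        self.counts = {}     # label -> number of examples seen so far

    def partial_fit(self, features, label):
        """Update the running mean for `label` without revisiting old data."""
        seen = self.counts.get(label, 0)
        if seen == 0:
            self.centroids[label] = list(features)
        else:
            old = self.centroids[label]
            self.centroids[label] = [m + (f - m) / (seen + 1) for m, f in zip(old, features)]
        self.counts[label] = seen + 1

    def predict(self, features):
        """Return the label whose centroid is closest to `features`."""
        if not self.centroids:
            raise ValueError("the classifier has not seen any examples yet")

        def squared_distance(label):
            return sum((f - m) ** 2 for f, m in zip(features, self.centroids[label]))

        return min(self.centroids, key=squared_distance)


model = IncrementalCentroidClassifier()
for vector, name in [([1.0, 1.0], "wrench"), ([1.2, 0.8], "wrench"), ([5.0, 5.0], "valve")]:
    model.partial_fit(vector, name)
before = model.predict([8.5, 1.2])   # the new object type is not known yet

# Later, footage from the customer's cameras yields an example of a new tool.
model.partial_fit([9.0, 1.0], "gauge")
after = model.predict([8.5, 1.2])
print(before, after)
```

A production system would fine-tune a neural network instead, but the workflow is the same: new labeled examples arrive, and the existing model is updated rather than rebuilt from scratch.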

File:Aquote1.png
If the changes in the objects are purely visual, this is enough. If there are more complex dependencies, the model may need to be changed. For simple cases, a self-learning model can be considered.

File:Aquote2.png

"Tough nuts" for computer vision systems

File:Aquote1.png
Classifiers with an accuracy of 99.9%. For most neural-network classifiers there is no established testing approach. They are a black box, and the cost of testing can become excessive. The accuracy of systems is often limited by the available cameras and optics. To work over a wide range, cameras with different optics have to be combined, etc.
File:Aquote2.png
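
A back-of-the-envelope calculation gives a sense of why testing a classifier claimed to be 99.9% accurate is expensive. The sketch below asks how many independent test samples must be classified without a single error before the claim can be accepted at a given confidence level. It assumes independent, representative samples and a zero-error test run, which are strong simplifications; the function name is illustrative.

```python
import math


def samples_needed(max_error_rate: float, confidence: float = 0.95) -> int:
    """Error-free test samples required to support `error rate <= max_error_rate`.

    If the true error rate were higher than max_error_rate, the chance of seeing
    n correct answers in a row would be below (1 - max_error_rate) ** n. We pick
    the smallest n that pushes this chance under 1 - confidence.
    """
    if not 0.0 < max_error_rate < 1.0:
        raise ValueError("max_error_rate must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_error_rate))


n_999 = samples_needed(0.001)   # 99.9% accuracy
n_99 = samples_needed(0.01)     # 99% accuracy, for comparison
print(n_999, n_99)
```

Under these assumptions, about 3,000 labeled samples with no errors at all are needed for 99.9%, roughly ten times more than for 99%, and every sample has to be collected and labeled for each operating condition of interest.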

File:Aquote1.png
Tasks where the answer depends very strongly on the viewing angle and lighting are considered difficult. For example, a white object can often look gray. Because of the viewing angle, the perceived shape and number of objects can be distorted, and objects can occlude one another. Situations where damage or markings have to be found on a material also cause difficulties, because they may be hidden by dirt or shading.
File:Aquote2.png

Where the scientific world is heading

The achievements of the computer vision industry to date are capable of astonishing the average person. Yet the scientific world is storming new heights. According to Yury Vizilter, the most interesting directions of research in deep neural networks (DNN) for computer vision include the following:

  • Structured and irregular networks, DNNs on graphs (Graph Convolutional Networks, GCN), and attention networks (Attention networks), which came from the NLP field and are now actively used in computer vision.
  • Interpretation of video in natural language: Action Detection and Prediction, Image Captioning & Video Annotation, Video-Language Understanding, Visual Question Answering (VQA), Visual Dialogues.
  • Automatic training and architecture selection for DNNs: AutoML, Neural Architecture Search (NAS).
  • Attacks on DNNs (Adversarial Attack), the search for DNN vulnerabilities, attack detection and protection against attacks.
  • Synthesis of visual data and transfer of training (Domain Adaptation, Generative Adversarial Networks, GAN).
  • Mimicry and knowledge extraction (Knowledge Distillation).
  • Training on a small number of examples (Few-Shot Learning/Detection/Segmentation).
  • Training without examples (Zero-Shot Learning, Grounding).
  • Networks with memory (Memory Nets).
  • Creation and use of scene graphs (Scene Graph).

And, of course, there is the common problem of the whole Deep Learning class: the opacity of how a DNN algorithm works, which calls for innovative developments aimed at explaining the operation of DNNs (CNN Explanation).
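
To make the adversarial attack item from the list above more concrete, here is a minimal sketch of the fast gradient sign method (FGSM) applied to a tiny logistic-regression "classifier". The weights, the input vector and the step size epsilon are made-up values chosen for illustration; real attacks target deep networks and images, but the principle of nudging every input value in the direction that increases the loss is the same.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, true_label, epsilon):
    """Shift each input value by epsilon in the direction that increases the loss."""
    p = predict_proba(weights, bias, x)
    # Gradient of the cross-entropy loss with respect to the input is (p - y) * w.
    gradient = [(p - true_label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


# Hypothetical model and input; the true class of the input is 1.
weights, bias = [2.0, -1.0, 0.5], 0.0
original = [0.3, 0.2, 0.4]
adversarial = fgsm_perturb(weights, bias, original, true_label=1, epsilon=0.3)

p_original = predict_proba(weights, bias, original)
p_adversarial = predict_proba(weights, bias, adversarial)
print(round(p_original, 3), round(p_adversarial, 3))
```

In this toy case a bounded change to each input value is enough to push the predicted probability of the correct class below 0.5, which is the kind of vulnerability the attack detection and protection research aims to address.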

File:Aquote1.png
On the whole, there is a feeling that the revolution in computer vision continues but is slowing down a little. The speed and direction of further progress will depend on how and when the problems that have arisen are dealt with, and on which hopes come true and when,
notes Yury Vizilter.
File:Aquote2.png

Among these problems the scientist names, first of all, the vulnerabilities of DNNs and countering attacks, methods for effectively transferring training to practical real-world tasks, and the shortage of real data for practical applications. From the whole list of problems requiring a solution, it is worth singling out those that are characteristic of all application areas of deep neural networks and, moreover, hold back real implementations:

  • There is a catastrophic shortage of real data for training neural networks!
  • Promising training methods require too many computing resources!
  • The bridge across the gulf between vision and language/understanding exists in theory, but its mass use in practical tasks still has not happened!

When the computer vision industry overcomes all the above problems and barriers, broad prospects will open up for a mass transition of neural-network reasoning from the level of visible objects to the level of semantic constructs (ontologies), including, in particular, Object Level SLAM. Implementing the object approach will, in turn, make it possible to move on to solving vision and control tasks jointly. Only then will it be possible to say that the world has reached a new level in the development of computer vision technologies. Presumably this will become a new point of explosive growth for autonomous robot technologies. But movement in this direction has already begun.

See also