Intelligent video analytics
A few years ago, AI systems for video analytics had many limitations: for instance, a face could not be recognized if it was turned sideways in the frame, and documents were recognized poorly in low light. How has the situation changed since then? How do artificial intelligence methods now help to recognize and identify objects, people and highly complex situations with high accuracy? The article is included in the TAdviser review "Artificial Intelligence Technologies".
Complex, smart video analytics is increasingly built into hardware such as smart video cameras and similar equipment, which makes these solutions more affordable for small businesses. For example, Dahua Technology supplies this kind of solution under the EZ-IP brand: video surveillance cameras and video recorders with video analytics, as well as network equipment. The devices are designed to control visitor entry and exit, monitor outdoor areas, secure the cash register, and so on.
All this is aimed at creating a safe environment for small and medium-sized businesses, the company says.
Limitations remain, because you cannot see what is not there and cannot recognize what is not visible. Nevertheless, algorithms are being actively refined to reduce the number of "non-working" situations. Combined methods are being created that only need to recognize an object at some point and can then track it. Detection range and system speed have increased.
According to Dmitry Morozov, Development Director of 3DiVi, the accuracy of recognition algorithms today reaches 99.7% on databases of about one million people, and the result is not degraded by a hat, medical mask, wig, glasses, fake beard, etc. Today's technologies can also work in low-light conditions.
Mikhail Smirnov gives other examples of advanced solutions: face recognition at long distances (more than 20 meters) through a smartphone camera; flaw detection systems in manufacturing that use cameras with advanced capabilities (polarization, IR, etc.) to locate defects with millimeter accuracy; 3D reconstruction systems based on neural networks, and so on. For example, in the BioSmart Quasar face recognition terminal, a unique optical system that includes a stereo camera with adaptive illumination can build a 3D model of the face even in complete darkness.
"A few years ago, not many photographs of faces were available for training neural networks, and most of them were taken full face with good, uniform lighting. Accordingly, when a photo with a turned face or uneven lighting was fed to the input of the neural network, recognition accuracy deteriorated significantly," says Alexey Tsyplov, head of the production digitalization department at RuSat Infrastructure Solutions.
In recent years there has been a serious breakthrough in the development of neural network architectures, and their complexity is constantly growing, the expert says. The growth of computing power has also helped: new models of Nvidia graphics cards, Google's tensor processing units (TPUs) and, more generally, powerful computers that can train neural networks in a reasonable time.
"Not only have large databases of photographs appeared, but also technologies for building synthetic data for training neural networks," continues Alexey Tsyplov. "Generative neural network models make it possible to produce a set of pictures with a different angle of head rotation and different light sources. Greater variability in the training data has improved face recognition accuracy. Neural networks have also learned to apply additional transformations that eliminate rotation and tilt, and to reconstruct a 3D face from a 2D image of a face taken at an angle."
The fact that artificial intelligence once had many restrictions on its use was never directly related to the nature of AI itself, believes Dmitry Nikolaev. "It simply takes a certain amount of time to make sure that your system works stably under given conditions. That is why visual intelligence is being introduced much more slowly than, for example, wired communication technologies. After all, it is quite difficult to influence a signal that travels along a shielded wire, but the picture received at the output can be influenced in a million different ways," he explains.
As an example he cites the Smart Engines solution for full recognition of the main spread of the passport of the Russian Federation with handwritten entries, including cases where the passport "book" is not fully opened: many technical problems had to be solved to account for all kinds of geometric distortions in video streams and photos, but the stack of underlying technologies did not need to be updated.
Variety of object types for video recognition
Platforma, a joint venture between VTB and Rostelecom, is developing a new service for the insurance business: it can assess the extent of damage to a car body after an accident, detect traces of dirt, snow and stickers on the surface of the car, etc. The system was trained on a set of 30 thousand photographs of cars containing more than 65 thousand car elements, and reliably recognizes 48 types of parts and 14 types of damage.
Computer Vision Systems (part of the LANIT group) is implementing Smart Timber, a solution designed to automate the measurement of round timber using computer vision algorithms, at the enterprises of the timber holding Segezha Group in Karelia and Vologda Oblast. During the project, integration with the SegezhaLes SDA wood accounting system is being implemented; moreover, data from all business units and enterprises is collected on the Smart Timber server located in Segezha Group's cloud storage.
Pilot projects showed that the system takes measurements with an accuracy of 97% and reduces the number of measurement errors that are inevitable with manual counting.
"The system is built into the group's control loop, works together with a person and is trained for specific operating conditions. The volume and quality of the recognized parameters will improve continuously through training on large data streams," says Sergey Merkulov, director of digital transformation at Segezha Group.
A new direction in video analysis is the rapid analysis of images received from UAVs. For example, Skoltech scientists have developed a monitoring system that segments the image in real time on board the UAV and identifies hogweed, a harmful plant that poses a danger to agriculture and human health. The solution provides high-resolution information about the spread of the plant even when the sky is overcast.
In addition, it was possible to abandon the traditional sequence of "data collection, creation of an orthophoto map, analysis of the resulting image" in favor of processing aerial imagery right on board the UAV during the flight. This turned out to be a non-trivial task, because it requires running heavy segmentation algorithms based on Fully Convolutional Neural Networks (FCNN). Networks of this type make it possible to outline irregularly shaped objects of interest with pixel accuracy, which in the hogweed detection task means that individual plants can be recognized with high accuracy. The scientists had to choose a suitable single-board computer architecture and optimize the neural network (the popular UNet, SegNet and ResNet architectures were chosen for the research) so that it could run on it.
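To make "pixel accuracy" more tangible, the following minimal Python sketch shows only the last step of such a pipeline: turning a per-pixel probability map into a binary mask and estimating how much of the frame the plant occupies. It assumes the network has already produced the probabilities; the function names, the 0.5 threshold and the toy 3x4 map are illustrative assumptions and are not taken from the Skoltech system.

```python
from typing import List

ProbMap = List[List[float]]  # per-pixel probability that the pixel shows the target plant


def to_mask(prob_map: ProbMap, threshold: float = 0.5) -> List[List[int]]:
    """Turn per-pixel probabilities into a binary mask (1 = target plant)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def covered_fraction(mask: List[List[int]]) -> float:
    """Share of pixels in the frame labelled as the target plant."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


# Toy 3x4 probability map standing in for real network output (assumed values).
probs = [
    [0.1, 0.2, 0.9, 0.8],
    [0.0, 0.6, 0.7, 0.3],
    [0.1, 0.1, 0.4, 0.2],
]
mask = to_mask(probs)
fraction = covered_fraction(mask)
```

Because every pixel gets its own label, an irregularly shaped patch is described by the mask itself rather than by a bounding box around it.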
"In practical tests, it took 40 minutes to survey an area of 28 hectares when flying at an altitude of 10 m. Not a single plant was missed," comments Andrey Somov, senior lecturer at Skoltech and scientific director of the project.
In the spring, power engineers of the Distribution Networks branch of Sakhalinenergo (part of the RusHydro group of companies) tested unmanned aerial photography of a section of a 35 kV power transmission line using the Geoskan 201 complex. Such solutions are used in the electric power industry to monitor elements of the network infrastructure, inspect power line routes to identify technological violations, check the condition of overhead line clearances, carry out thermal imaging control, and for other purposes.
"We now have two types of unmanned aircraft at our disposal: a fixed-wing airplane and a quadcopter," notes Andrei Bulakhov, operator of unmanned aviation systems at the Distribution Networks branch. "They complement each other when examining power lines. With the airplane we study the general condition of the power line, and with the quadcopter we look in more detail, for example at support elements when inspecting the upper parts of the supports."
Scientists of Vyatka State University presented at the end of May at the interdepartmental conference "Artificial Intelligence in the Police Service" an algorithm for restoring and improving the quality of images and video sequences obtained from unmanned aerial vehicles.
A non-trivial application of video recognition technologies was implemented by Datana (part of the LANIT group): digital technologies for detecting slag in the steel stream at the Abinsk Electrometallurgical Plant (AEMZ).
Previously, the company says, the steel tapping process was controlled on the basis of the steelmaker's visual assessment: he watched the steel stream through protective glasses and relied on its color, noise and sparks, and on his own experience. Smelting is accompanied by heavy smoke, which often prevents even an experienced expert from determining the exact moment when slag enters the stream. As a result, over-oxidized furnace slag enters the steelmaking ladle together with the metal. Datana has implemented its own comprehensive solution, Datana Sense: it monitors steel tapping by visually detecting slag in the stream and, when the permissible level is exceeded, gives a light and sound signal. Tapping is monitored with a far-infrared camera, which can "see" through the smoke, and an artificial intelligence module that accurately detects the presence of furnace slag in the stream.
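The signalling logic described above can be pictured as a simple threshold rule applied to the output of the detection module. The sketch below is only an illustration under stated assumptions: it supposes that the AI module yields an estimated slag share for each frame, and it adds a requirement of several consecutive frames above the level to avoid false alarms. The function name, the parameters and the sample values are hypothetical and do not describe Datana Sense itself.

```python
from typing import Iterable, List


def slag_alarm(slag_share_per_frame: Iterable[float],
               permissible_level: float,
               min_consecutive_frames: int = 3) -> List[bool]:
    """Return, for each frame, whether the light-and-sound alarm is on.

    The alarm switches on once the estimated slag share has exceeded the
    permissible level for `min_consecutive_frames` frames in a row
    (a debouncing choice made for this sketch) and switches off as soon
    as a frame falls back to or below the level.
    """
    alarm_states = []
    streak = 0
    for share in slag_share_per_frame:
        streak = streak + 1 if share > permissible_level else 0
        alarm_states.append(streak >= min_consecutive_frames)
    return alarm_states


# Hypothetical per-frame estimates from the detection module.
frames = [0.01, 0.02, 0.12, 0.15, 0.18, 0.20, 0.03]
states = slag_alarm(frames, permissible_level=0.10)
```

With these sample values the alarm turns on at the fifth frame, after three frames in a row above the level, and turns off again at the last frame.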
According to the expert assessment of AEMZ specialists, as a result of the introduction of Datana Sense, the potential savings in deoxidizers, ferroalloys and slag-forming materials can be 10%, and the reduction in electricity consumption can be up to 5%.
Big Data Recognition Tasks
Systems for recognizing people, vehicles and other objects are learning to work with more and more objects in increasingly difficult situations.
"Today, video surveillance systems are used in places with a large flow of people: sports competitions, transport, public events," says Alexey Tsyplov from RuSat Infrastructure Solutions. "With the help of neural networks, you can not only find criminals but also track the routes they take. For neural networks, factors that change a person's appearance, such as glasses, a beard, a mask or headwear, are no longer a big problem. Recognition also takes gender, age and race into account."
For example, NtechLab software analyzes scenes captured by 100,000 CCTV cameras in Moscow.
For neural networks, recognition on large databases has a drawback. In most algorithms the search time depends linearly on the size of the database: if the database grows 4 times, the search time grows proportionally.
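The linear dependence is easy to see in a brute-force search, where the query descriptor is compared with every stored descriptor. The following Python sketch is a toy illustration under assumptions: the descriptors are three-dimensional, the identities are invented, and cosine similarity is used as the comparison measure. It does not reproduce any particular vendor's algorithm.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two descriptors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_search(query: Vector, gallery: Dict[str, Vector]) -> Tuple[str, float, int]:
    """Compare the query with every stored descriptor.

    Returns the best-matching identity, its similarity and the number of
    comparisons made, which always equals the size of the gallery.
    """
    best_id, best_score, comparisons = "", -1.0, 0
    for person_id, descriptor in gallery.items():
        comparisons += 1
        score = cosine_similarity(query, descriptor)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id, best_score, comparisons


# Toy 3-dimensional descriptors; real face descriptors are much longer.
small = {"anna": [1.0, 0.0, 0.0], "boris": [0.0, 1.0, 0.0]}
large = dict(small, clara=[0.0, 0.0, 1.0], denis=[0.7, 0.7, 0.0],
             eva=[0.0, 0.7, 0.7], fedor=[0.7, 0.0, 0.7],
             galina=[0.5, 0.5, 0.5], hanna=[0.2, 0.9, 0.1])

query = [0.9, 0.1, 0.0]
match_small = brute_force_search(query, small)
match_large = brute_force_search(query, large)
```

The gallery that is four times larger needs four times as many comparisons to find the same match, which is exactly the proportional growth in search time mentioned above.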
VisionLabs has added the Index software module to its core computer vision platform LUNA; the module excludes obviously dissimilar faces from the search. The company says that in NIST tests, searching a database of 3 million people now takes 36 ms, and searching a database of 12 million people takes only 43 ms. Search accuracy is preserved.
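A quick back-of-the-envelope check shows how far these figures are from linear scaling. The short Python calculation below uses only the numbers quoted above; the variable names are chosen for this illustration.

```python
# Figures quoted above for the NIST test runs.
small_db, small_ms = 3_000_000, 36
large_db, large_ms = 12_000_000, 43

size_ratio = large_db / small_db              # how much larger the database is
linear_prediction_ms = small_ms * size_ratio  # expected time if search scaled linearly
observed_ratio = large_ms / small_ms          # how much slower the search actually got

print(f"database grew {size_ratio:.0f}x")
print(f"linear scaling would predict {linear_prediction_ms:.0f} ms")
print(f"reported time grew only {observed_ratio:.2f}x")
```

With linear scaling a fourfold larger database would be expected to take about 144 ms, whereas the reported time grows by roughly a fifth.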
This is important for many real-world face recognition applications, says Sergey Milyaev, head of research projects at VisionLabs, because standard algorithms usually take several seconds to recognize a face on large databases, which is unacceptable for real-time services such as paying a subway fare by face. The Index software is already used in the Sberbank biometric platform.
Digital Twin Video Analytics
Nanosemantics describes how smart neural network modules can be used to create a digital twin of a road and its traffic. Here, a digital twin means a map like Yandex Maps that is very detailed in terms of road infrastructure. Moreover, you can click on objects of interest (pedestrian crossings, traffic lights, intersections, etc.) and see a set of photos from that place, the experts say, similar to the street-level view mode in Yandex Maps.
Our assessors labelled only part of the images, that is, they outlined and classified traffic lights, signs and other important objects, and then our neural network specialists trained neural network detectors and classifiers to recognize these objects. With the help of these smart neural network modules, the rest of the data can be labelled and transferred to the digital twin automatically and with very high quality.
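The workflow described here, in which a model is trained on a manually labelled subset and then labels the rest, can be sketched in a few lines of Python. In this sketch a simple nearest-centroid classifier stands in for the neural network detectors and classifiers, and the two-dimensional features, class names and distance threshold are all assumptions made for the illustration. Items the model is unsure about are left unlabelled so that a person can review them.

```python
from typing import Dict, List, Optional, Tuple

Feature = List[float]


def train_centroids(labelled: List[Tuple[Feature, str]]) -> Dict[str, Feature]:
    """Average the feature vectors of each manually labelled class."""
    sums: Dict[str, Feature] = {}
    counts: Dict[str, int] = {}
    for features, label in labelled:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def auto_label(features: Feature, centroids: Dict[str, Feature],
               max_distance: float) -> Optional[str]:
    """Assign the nearest class, or None to send the item back to a human."""
    def distance(a: Feature, b: Feature) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    label, dist = min(((lbl, distance(features, c)) for lbl, c in centroids.items()),
                      key=lambda pair: pair[1])
    return label if dist <= max_distance else None


# Hypothetical 2-D features for a handful of manually labelled objects.
manual = [([0.9, 0.1], "traffic_light"), ([1.1, 0.0], "traffic_light"),
          ([0.0, 1.0], "sign"), ([0.2, 0.9], "sign")]
centroids = train_centroids(manual)

unlabelled = [[1.0, 0.1], [0.1, 1.1], [5.0, 5.0]]
labels = [auto_label(f, centroids, max_distance=0.5) for f in unlabelled]
```

The first two unlabelled items fall close to a known class and are labelled automatically, while the third is too far from both classes and is returned as `None` for manual review.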
During the project, problems arose with optimizing the training set. For example, the number of accumulated photos containing traffic lights runs to tens and hundreds of thousands. Moreover, these volumes will grow over time, since the goal of the project is not only to digitize the road infrastructure but also to monitor the condition of traffic lights and signs so that they can be replaced quickly.
Because of this, we had to abandon automatic detection of some types at the current stage. Where the initial data is insufficient, we plan to create synthetic data to further improve the system, the experts explain.
There is also little data on the types of sign deformation, since it is very difficult to select such objects manually because of their rarity and specificity. Therefore, some of the deformation detection problems were solved using classical computer vision algorithms without deep machine learning.
"The problem of tracking objects (traffic lights, signs) has not yet been fully solved, since the work is done not with video but with a discrete set of photos taken at a certain interval and tied to geolocation," the Nanosemantics experts note.
Video analytics to confirm identity
The basic facial recognition service, identity confirmation in electronic transactions, is finding new applications as digitalization spreads. For example, one of VTB Leasing's key business development priorities is moving car leasing services into the digital environment. To do this, the organization is creating the omnichannel ELISA platform, a modern e-leasing solution that makes a completely paperless leasing transaction possible. This means, says Konstantin Yesyunin, chief architect of VTB Leasing, that the documents the client has to provide to the company, such as a passport, must be processed in the IT system at close to real-time speed. We are talking about a few seconds: in this time the passport data must not only be recognized but also transferred to the CRM system. The company wants the client, after uploading a picture of the passport, to receive approval of the transaction and an agreement as quickly as possible, sign it immediately with an electronic signature, and go to collect the car.
At National Australia Bank, a customer can take out a large loan or a mortgage remotely: documents are submitted electronically and the borrower talks with a bank employee. In such conditions banks seek to use additional capabilities, in particular recognition of the potential borrower's emotions, which allows them to refine the credit score.
Video cameras and face recognition software are installed at all workplaces of Post Bank employees: an employee can start working in the banking system only after confirming his identity. Similarly, each client of the bank is checked against a database of fraudsters. The bank claims that over the year this approach made it possible to prevent more than 1200 thousand attempts to use other people's accounts to penetrate the banking system and more than 2 thousand suspicious transactions worth more than 500 million rubles.
The Bank of Russia controls employees' access to information resources using BioSmart's biometric palm vein pattern analysis system.
LG proposes using biometric authentication, with the help of several cameras installed in the car, to start the engine without a key.
Overall, according to estimates by the analytical agency MarketsandMarkets, the market for biometric solutions will grow to $41.80 billion by 2023, with an average annual growth rate of almost 20%. The high growth rate of the segment has a simple explanation: biometrics radically simplifies the customer journey. For example, Ak Bars Bank has introduced a biometric service for identifying users of loyalty programs in restaurants in Tatarstan.
Russian banks are still wary of bio-acquiring, that is, paying for purchases not with a bank card but with one's face in front of a video camera. Expectations are tied to the development of the Unified Biometric System (EBS), which should receive the status of a state system this fall. Experts consider the main obstacles to the spread of biometric identification to be deepfakes and citizens' distrust of centralized systems such as the EBS.
The development of this technology may thus play into the hands of attackers.
In an interview with BIS Journal in November last year, VTB Chairman Vadim Kulik spoke about the emergence of technologies that keep primary data confidential and at the same time, after certain manipulations, allow models to be built on it. One example is Data Fusion technology, in which a neural network "wraps" the primary data so that predictive models can process it while decrypting it is practically impossible. Vadim Kulik says that as part of the joint work of VTB and Rostelecom on the Big Data Platform, MPC (multi-party computation) technology has been tested; it does not require the companies to exchange source data, which allows service users to implement the Data Fusion approach.
Dmitry Morozov from 3DiVi believes that the solution could be a transition to decentralized distributed ledger systems, which would let users share their biometrics directly with the service provider without intermediaries.
"The mechanisms for faking biometric data are developing at the same speed as the technologies themselves," says Svetlana Efimova, co-founder of Oz Forensics. "Deepfakes can be called one of the fastest-growing areas of fraud: fake viral videos no longer surprise anyone, and during the pandemic there was a real boom in fake news."
"Attackers are increasingly resorting to tricks that allow them to hide their identity and obtain banking services under the guise of another person," says Svetlana Efimova. "For example, they use printed photos, photos or videos shown on another device and high-quality silicone masks, apply deepfake technology and substitute the video stream using a virtual camera."
Of course, this exposes a financial organization to the risk not only of losing money on a banking product issued in error, but also of violating the law on client verification, the expert notes.
Today, two main approaches to the detection of a living face (Liveness) have taken shape:
- Active Liveness: the user is asked to perform some movement, such as moving closer, winking, smiling or turning the head. In some cases, when a contract or other business relationship is entered into remotely, the fact of an active action has to be recorded. The active action means that the user has read the contract and, by doing what was asked (smiling or turning the head), confirmed that he is acting of his own free will and is of sound mind and memory.
Active Liveness requires a long video (from 3 to 5 seconds) so that the user has time to perform the action, which lengthens the customer journey. At the same time, the active version of Liveness does not improve security or protection against spoofing attacks, since modern deepfake technologies make it possible to bypass any active liveness check by animating a picture to perform the required action.
This simplifies data transfer and brings processing speed to one process per second, says Oz Forensics.
"Not all vendors of biometric identification solutions can today determine with 100% probability that there is a living person on the other side of the screen and that this person matches the identity in the document presented, or confirm that no personal data will leak to third parties during identification," says Svetlana Efimova. "Therefore, when the EBS becomes a state system, it is important to create technological standards and requirements for software that protect the interests of all participants in the process."
Biometrics-based behavioral analytics
In general, in the coming years, face analysis technologies will develop in line with behavioral analytics (User and Entity Behavioral Analytics, UBA/UEBA), Bitcop says: it is important not only to understand who exactly appeared in the frame, but also how this person behaves in a particular situation.
According to Gartner analysts, the UEBA solutions market has already reached maturity, which is characterized, first, by the widespread use of these technologies by medium and large businesses and, second, by the integration of UEBA analytics into various vertical niches related to information security: SIEM, identity and access management (IAM), endpoint protection and data leakage prevention. According to Allied Market Research estimates, the market will grow by an average of 23.7% per year until 2025, with the highest growth rates in the gait analysis segment.
Behavioral biometrics will become the basic element of identity verification in two-factor or multi-factor authentication, according to researchers at Frost & Sullivan.
Russian developers also offer products for analyzing "face and body language". For example, 3DiVi develops solutions for tracking bodies and faces (recognition of faces, movements and gestures). NtechLab is developing software that automatically identifies people showing signs of aggressive behavior. For this purpose it uses an integrated approach that includes recognition of silhouette, actions and emotions.
Process Analysis Video Analytics
Advanced video analytics systems can now serve as good advisers to people making decisions in various situations, primarily those involving the identification of suspicious actions and processes. For example, Meta System, a company specializing in scoring solutions for telematics equipment, offers insurance companies a product that predicts the risks associated with a driver's performance based on an analysis of his face and driving style.
The Fuzzy Lodge Labs Smart Fraud Detection system analyzes employee behavior in real time using video cameras and microphones. It can compare the emotions of people in the frame and their gaze dynamics with a set of patterns of suspicious employee actions, such as covering a video camera or photographing the screen with a phone camera.
The SearchInform CIB DLP system can also detect that a user has pointed a smartphone camera at a monitor showing data from an internal corporate system and determine the identity of this user, while the MonitorController software module at that moment collects information about open sites and active processes in the IT systems at the workplace. The data collected by the DLP system for each incident can thus become evidence in court cases on improper access to information or disclosure of confidential information.
For an IT system to be a good assistant in the prompt investigation of incidents, a company builds a video archive with advanced intelligent search capabilities. Yakov Volkind, director of the ITV Group branch in St. Petersburg, lists the key functions of such a search: searching for all events in which the offender's face was recognized, searching for people who were in contact with the selected person and were in view of the cameras at the same time, searching within specific areas of the premises, and searching by attributes. For banks, for instance, searching by the characteristics of a particular transaction matters.
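One of the listed functions, finding people who were in view of the cameras at the same time as a selected person, comes down to checking whether time intervals overlap. The Python sketch below illustrates the idea under assumptions: the archive is represented as a list of sightings with a person identifier, a camera identifier and start and end times, and all names and values are hypothetical rather than taken from ITV's software.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Sighting:
    person_id: str
    camera_id: str
    start: float  # seconds from the start of the archive
    end: float


def seen_together_with(target: str, sightings: List[Sighting]) -> Set[str]:
    """People whose sightings overlap in time with the target's on the same camera."""
    target_sightings = [s for s in sightings if s.person_id == target]
    companions = set()
    for other in sightings:
        if other.person_id == target:
            continue
        for own in target_sightings:
            same_camera = other.camera_id == own.camera_id
            overlaps = other.start <= own.end and own.start <= other.end
            if same_camera and overlaps:
                companions.add(other.person_id)
                break
    return companions


archive = [
    Sighting("suspect", "hall", 100, 160),
    Sighting("visitor_a", "hall", 150, 200),   # overlaps with the suspect
    Sighting("visitor_b", "hall", 300, 320),   # same camera, but later
    Sighting("visitor_c", "vault", 110, 130),  # same time, different camera
    Sighting("suspect", "vault", 400, 450),
    Sighting("visitor_d", "vault", 440, 460),  # overlaps on the second camera
]
companions = seen_together_with("suspect", archive)
```

Two intervals overlap when each one starts before the other ends, so the search returns only the visitors who shared a camera view with the selected person at the same moment.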
Smart video analytics technologies help banks recognize suspicious situations, for example a visitor in a shooting stance or an employee with raised hands, and promptly notify bank staff even if the employee was unable to press the alarm button.
Of particular importance is the analysis of the actions of visitors in the service areas of ATMs, which are usually located in unguarded premises.
This functionality is especially important when it comes not to office premises but to money storage facilities, where all actions related to the work of cash collectors must be recorded, as well as illegal actions by intruders against bank property. According to Yakov Volkind, ITV's analytical software can recognize a large number of suspicious situations in a video image: a person lying down (has he fallen ill, been attacked, or is it a homeless person who decided to spend the night in a warm place?), a person crouching in front of an ATM (a potential burglar?), or a crowd in the ATM area (why have teenagers gathered in a group in a quiet place?).
We went much further than the function of identifying people. We have developed basic tools for profiling people based on their network activity on platforms such as TikTok, YouTube, VKontakte, Facebook, Yandex and other services.
In addition to the profiling system, the hardware and software complex for non-stop covert screening will include a system for recognizing facial microexpressions and gait.
"Receiving data from various open and closed sources, as well as from heterogeneous technical means, the system automatically processes information about objects and builds a picture of the current target situation in the selected region. At the same time, it continuously maintains its own database of object information, which makes it possible to carry out retrospective analysis and to build and refine a statistical model of the normal shipping situation.
The use of machine learning algorithms and elements of artificial intelligence in further processing will allow the system to identify deviations from the normal situation, detect and track potentially dangerous objects and automatically issue the appropriate notifications to military command bodies," says Yuri Anoshko.