Developers: Nanosemantics Lab
System premiere date: 2020/09/08
Last release date: 2023/11/09
Technology: Big Data, MDM (Master Data Management)
NLab Marker is an industrial-grade platform, ready for deployment and operation, for tasks involving large volumes of data that require manual processing by specialists: annotating data and preparing training examples for machine learning algorithms. NLab Marker lets you select objects in video, transcribe audio recordings, and annotate medical images. The service reduces the working time of data preparation specialists (annotators) and the number of errors made when building a training dataset.
2023: Speeding Up the Data Markup Process When Working with Images
On November 9, 2023, Nanosemantics presented an updated version of its data markup service, Marker.ru, a tool for preparing datasets for machine learning and for building neural networks on them.
The main changes affect the platform interface. Marker.ru now has an English-language version, and the platform offers more options for visualizing marked-up data, which makes specialists' work clearer and more convenient.
New functions have been added to the Active Learning technology, which speeds up annotators' work through parallel training: the neural network built into Marker.ru "observes" the assessor's actions and then begins to "see" the required data on its own. Tools that speed up image markup have also been added: Magic Wand and One-Shot. Magic Wand automatically outlines an object and minimizes the need for manual adjustment. One-Shot lets you select the desired object in a reference example, after which the neural network begins to find similar objects in other images on its own.
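The article does not describe how Marker.ru decides which items to show the assessor, so the following is only a minimal sketch of one common active learning strategy, uncertainty sampling: the model's least confident predictions are sent to a human first. The function names, item ids, and probabilities are hypothetical and are not part of the Marker.ru API.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return the ids of the `budget` items with the most uncertain predictions.

    `predictions` maps an item id to the model's class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for three unlabeled images (two classes each).
predictions = {
    "img_001": [0.98, 0.02],  # model is confident
    "img_002": [0.55, 0.45],  # model is unsure
    "img_003": [0.70, 0.30],
}

# Items the human annotator should look at first.
queue = select_for_annotation(predictions, budget=2)
```

In a loop of this kind, the annotator's answers for the queued items would be added to the training set and the model retrained, so that fewer items need manual attention on each round.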
According to the developers, Marker.ru lets you fine-tune the tool as precisely as possible, which helps the service "recognize" very small details in the materials. The new algorithms cut markup time many times over, reducing the standard process from 2 minutes to 0.5 seconds.
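To put the quoted figures in perspective, the short calculation below works out the implied speed-up. It assumes the 2 minutes and 0.5 seconds refer to the time spent on a single item, which the article does not state explicitly; the dataset size of 10,000 items is purely illustrative.

```python
def speedup(manual_seconds, assisted_seconds):
    """How many times faster the assisted process is than the manual one."""
    return manual_seconds / assisted_seconds


def total_hours(items, seconds_per_item):
    """Total working time in hours for a given number of items."""
    return items * seconds_per_item / 3600


# Figures quoted by the developers: 2 minutes versus 0.5 seconds.
factor = speedup(120, 0.5)

# Illustrative dataset size (an assumption, not from the article).
manual_hours = total_hours(10_000, 120)
assisted_hours = total_hours(10_000, 0.5)
```

Under these assumptions the speed-up factor is 240, and a job of 10,000 items shrinks from roughly 333 hours to under an hour and a half.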
Another important update is support for the DICOM data format, which is used for medical scans. The new interface lets you navigate through image slices and use Marker.ru for "smart" solutions in medicine.
For audio data, Marker.ru has integrated a neural network model for automatic speech-to-text transcription. This saves time for annotators, who can start from an almost finished text instead of transcribing the recording from scratch. Tools such as cutting audio into the required fragments and removing noise make the markup process more convenient and improve the result.
The Nanosemantics team also optimized the project management logic in Marker.ru: tasks can now be grouped into collections, which makes it much easier to distribute tasks among annotators. Specialists can see all stages of the work and immediately flag points that need adjustment.
In the future, the developers plan to introduce advanced statistics on the tasks performed by specialists. This will make it possible to evaluate each employee's effectiveness in detail and optimize workflow management.
"Nanosemantics created the Marker.ru platform to make an important and expensive stage of building neural networks, data markup, easier for customers. From our own experience, we know that the final product depends on the quality of the dataset, whether it is a familiar chatbot or an advanced digital twin. The most advanced language models require thousands of professional annotators, which eats up the lion's share of a project's budget and raises the barrier to entry for artificial intelligence technologies. Partial automation of data markup through Marker.ru's active learning technology reduces the cost of assessors many times over. And the more convenient functionality for monitoring completed work makes the final result noticeably better," said Stanislav Ashmanov, General Director of Nanosemantics.
2020: NLab Marker service launch
Nanosemantics, a developer of artificial intelligence (AI) technologies and a resident of the Skolkovo Foundation's Information Technology Cluster, has launched the NLab Marker service. It converts data into information that neural networks can understand. The Skolkovo Foundation announced this on September 8, 2020.
"Machine learning is impossible without training data: the examples from which algorithms learn. As developers of AI algorithms, we know how important high-quality annotated data is. Our team originally developed the NLab Marker platform for its own use. Now we are ready to offer this product to the market, because we see demand for industrial data markup platforms that let companies with a strong Data Science department flexibly implement any markup task and administer the process themselves."
Errors in the dataset critically affect the quality of neural network training. For example, a neural network trained for video analytics may miss a defect on a production line or incorrectly transfer personal data from a completed questionnaire to the MPSC. NLab Marker includes a system that automatically checks annotators' work using trap examples (honeypots). The time spent and the volume of completed tasks are also monitored. In addition, NLab Marker has various modules for working with text and audio. For example, the module for announcers lets them record audio for speech synthesis, and the categorization module assigns a category to a given text.
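The honeypot check mentioned above can be illustrated with a minimal sketch. It assumes that trap tasks with known correct answers are mixed into each annotator's queue and that an annotator is flagged when their accuracy on those traps falls below a threshold. The function names, the 0.8 threshold, and the sample data are hypothetical; the article does not describe NLab Marker's actual implementation.

```python
def honeypot_accuracy(answers, gold):
    """Share of trap tasks answered correctly, or None if none were seen.

    `answers` maps task id -> the annotator's label.
    `gold` maps trap task id -> the known correct label.
    """
    checked = [task_id for task_id in gold if task_id in answers]
    if not checked:
        return None
    correct = sum(1 for task_id in checked if answers[task_id] == gold[task_id])
    return correct / len(checked)


def flag_annotators(all_answers, gold, threshold=0.8):
    """Return the sorted names of annotators whose trap accuracy is below `threshold`."""
    flagged = []
    for annotator, answers in all_answers.items():
        accuracy = honeypot_accuracy(answers, gold)
        if accuracy is not None and accuracy < threshold:
            flagged.append(annotator)
    return sorted(flagged)


# Trap tasks with known labels, hidden among ordinary tasks.
gold = {"t1": "cat", "t4": "dog", "t7": "cat"}

# Hypothetical answers from two annotators (t2 and t3 are ordinary tasks).
all_answers = {
    "anna": {"t1": "cat", "t2": "dog", "t4": "dog", "t7": "cat"},
    "boris": {"t1": "dog", "t3": "cat", "t4": "dog", "t7": "dog"},
}

flagged = flag_annotators(all_answers, gold)
```

Here only the trap tasks are scored, so ordinary tasks without a known answer never affect an annotator's rating.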
Unlike comparable products, NLab Marker guarantees high accuracy of data preparation, quality control at all stages of the markup process, and protection of personal data, since the platform can be deployed inside the customer's secure perimeter. NLab Marker also lets you strip personal data from a finished dataset so that it does not end up in the public domain.
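The article does not say how personal data is removed from a finished dataset. As a rough illustration of the idea only, the sketch below masks e-mail addresses and phone numbers in text with regular expressions. The patterns and placeholder tokens are simplifying assumptions; a production system would need much broader coverage (names, addresses, document numbers).

```python
import re

# Deliberately simple patterns; both are illustrative assumptions.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s()-]{8,}\d")


def scrub(text):
    """Replace e-mail addresses and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


# Fictional sample record.
sample = "Contact Ivan at ivan.petrov@example.com or +7 (495) 123-45-67."
cleaned = scrub(sample)
```

Note that the personal name in the sample is left untouched, which shows why pattern matching alone is not sufficient for full anonymization.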
NLab Marker provides a convenient system for organizing and managing a markup team: the curator distributes tasks and instructions to project managers or performers and sets individual deadlines for tasks within a specific project. This saves the company time and money. The service makes work accessible to residents of the most remote regions, the unemployed, and people with disabilities. It does not matter where in the world the annotator is located; what matters is that they have a computer and Internet access. This is especially relevant given the trend toward remote work with a flexible schedule and no fixed workplace.