Project

The startup "OnMeeting" and "Inevils" built a cloud platform for meeting transcription, summarization and speaker recognition

Customers: Actor-Educational Technologies

Moscow; Information Technology

Contractors: Inevils
Product: IT outsourcing projects

Project date: 2023/11 - 2024/11

2025: Building a cloud platform to transcribe meetings

At the end of April 2025, the company "Actor-Educational Technologies" told TAdviser about a project carried out jointly by the startup "OnMeeting" and the IT company "Inevils" to create a cloud platform for meeting transcription, summarization and speaker recognition.

As reported, "OnMeeting" is a Russian technology startup founded in 2020. Initially, it developed hybrid learning solutions for higher education, but as the business developed, its model shifted toward the B2B and B2G segments. The OnMeeting service is an intelligent platform for recording and analyzing online communications and offline meetings.

The "OnMeeting" team entrusted the development of several system modules to the IT company "Inevils", which has expertise in building architectures based on neural network models. The work began with a simple audio recording module, but over the course of the collaboration the product grew into a full-fledged cloud platform with transcription, summarization and bot connection features.

The founder of the startup drew attention to a widespread problem: recordings of online meetings - whether in government departments, universities or businesses - are often unstructured, the sound quality is low, and listening to them takes a long time. To solve this problem, the original plan was to create a service that records sound and converts it into text. Later it became clear that transcription alone is not enough - users need concise, understandable summaries and automatic identification of who said what.

Interest from state-owned companies was also taken into account from the very beginning, so the architecture had to allow for operation in an isolated environment without internet access.

What was done:

  • Stage 1: Recording module and backend foundation. The Inevils team extended the server backend and implemented the first module: capturing sound from the microphone and performing basic processing. Because the project developed iteratively, this stage was treated as a starting point and was not overloaded with unnecessary functionality.
  • Stage 2: Speech transcription. The next step was a module for converting audio into text. It used ready-made models (in particular, Whisper) configured for the project's tasks, and took into account the need for accurate timestamp segmentation.

Fragment of transcribed text with timestamps

  • Stage 3: Summarization. One of the most sensitive components is the generation of summaries. Two options have been implemented:
    • Boxed (on-premises) solution - for customers with an isolated environment (for example, government agencies);
    • Cloud solution - connected to powerful neural models that ensure the optimal quality of summarization.

  • Stage 4: Browser bots and a Telegram bot. The infrastructure for browser-based video conferencing bots has been developed, and a Telegram bot has also been set up; these can be "invited" to online meetings. Integration with Google Meet, Yandex.Telemost and Zoom has already been implemented.

Bot-to-call interface
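
To illustrate the timestamp requirement from Stage 2, here is a minimal sketch of how timestamped segments could be rendered as readable text. It assumes segments arrive as dictionaries with "start", "end" (in seconds) and "text" keys, which is the shape the open-source Whisper package uses for its segments. The function names and the sample data are hypothetical and are not taken from the project.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a non-negative offset in seconds as HH:MM:SS."""
    total = int(seconds)  # fractional seconds are dropped
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: Iterable[Mapping]) -> str:
    """Turn segments with 'start', 'end' and 'text' keys into one line each."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


# Hypothetical sample data in the shape of Whisper-style segments.
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": " Good morning, let's begin."},
    {"start": 4.2, "end": 65.9, "text": " First item: the release plan."},
]
transcript_text = render_transcript(sample_segments)
```

Keeping the raw start and end offsets alongside the text, rather than only the formatted string, would let later stages such as summarization or speaker labeling refer back to exact positions in the recording.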

The startup team was looking not just for a contractor to complete a single task, but for a partner who could evolve the architecture as the product grew. Three things counted in Inevils' favor:

  • experience with AI and neural networks from long before their mass adoption;
  • an understanding of startup logic (iterative growth, changing requirements);
  • willingness to build a long-term relationship and develop the product as it scales.

The main technological problem was the gap between the expected quality of summarization and the resources actually available in the boxed version. High-quality language models require high-performance GPUs and a large amount of memory, and the customer did not have that capacity at the start of the project.

In order not to postpone the launch of the product, the Inevils team decided to focus temporarily on the cloud version: this made it possible to quickly deploy a stable version of the service with the optimal quality of summaries. In parallel, the architecture of the boxed solution was preserved and put on hold until the necessary resources became available. As of April 2025, this version is being finalized and tested on hardware platforms, with TensorRT optimizations taken into account.

The second significant challenge concerned the organization of processes. Initially, the work followed a classic "contractor-customer" scheme with separate tasks. But at an early stage it became clear that the product required flexible and quick refinement, often without formal technical specifications. The solution was a move to a joint backlog and a single planning cycle: the teams synchronized sprints, prioritized tasks together and began to work as one product team. This made it possible to respond more quickly to changes and avoid wasting time on long rounds of approval.

The following results have been achieved through the joint efforts of "OnMeeting" and Inevils:

  • A platform that can record and transcribe meetings, produce summaries, minutes and other reports, and analyze meetings to generate recommendations for users;
  • Architecture ready for boxed and cloud operation;
  • Built-in integration with video services;
  • Plug-in bots;
  • The possibility of scaling for government agencies and the corporate sector.
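
One way to picture an architecture that is "ready for boxed and cloud operation" is a common summarizer interface with two interchangeable backends. The sketch below is only an illustration under that assumption. The class names, the `make_summarizer` function and the `send_request` callable are hypothetical, and the boxed backend uses a trivial placeholder in place of a locally hosted model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class Summarizer(Protocol):
    """Common interface shared by both deployment variants."""

    def summarize(self, transcript: str) -> str: ...


@dataclass
class BoxedSummarizer:
    """Stand-in for a model running inside an isolated environment."""

    max_sentences: int = 2

    def summarize(self, transcript: str) -> str:
        # Placeholder logic: keep the first few sentences. A real boxed
        # deployment would call a locally hosted language model here.
        sentences = [s.strip() for s in transcript.split(".") if s.strip()]
        if not sentences:
            return ""
        return ". ".join(sentences[: self.max_sentences]) + "."


@dataclass
class CloudSummarizer:
    """Delegates to a remote model through an injected client function."""

    send_request: Callable[[str], str]  # hypothetical API client

    def summarize(self, transcript: str) -> str:
        return self.send_request(transcript)


def make_summarizer(
    mode: str, send_request: Optional[Callable[[str], str]] = None
) -> Summarizer:
    """Pick a backend by deployment mode: 'boxed' or 'cloud'."""
    if mode == "boxed":
        return BoxedSummarizer()
    if mode == "cloud":
        if send_request is None:
            raise ValueError("cloud mode needs a send_request callable")
        return CloudSummarizer(send_request)
    raise ValueError(f"unknown deployment mode: {mode!r}")
```

With this kind of split, the rest of the platform depends only on `summarize`, so a cloud-first launch would not prevent a boxed backend from being added later.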

Inevils was not the only contributor to the system, but it was thanks to its work that the system received important modules, including the main transcription module.

Plans:

  • Add a diarization function. For many customers, it is critical to understand exactly who said what. A single audio track can contain several voices, and the system should not only transcribe the speech but also attribute each utterance to a speaker. AI diarization helps here: the model identifies speakers, separates phrases and labels them in the final text.
  • Expand the list of integrations;
  • Improve the quality of summarization through further training;
  • Add the generation of links to meetings;
  • Implement custom scenarios for storing and routing transcripts.
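
The planned diarization step can be sketched as combining two inputs: transcript segments with timestamps and speaker turns produced by a separate diarization model. The example below assigns each segment to the speaker whose turn overlaps it the most. This is a common, simple heuristic and not necessarily what the project will use. All names and the sample data are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Turn(NamedTuple):
    """A time span attributed to one speaker by a diarization model."""
    start: float
    end: float
    speaker: str


class Segment(NamedTuple):
    """A transcribed phrase with its time span."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time spans."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(
    segments: Sequence[Segment], turns: Sequence[Turn], unknown: str = "Unknown"
) -> List[str]:
    """Prefix each segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append(f"{best_speaker}: {seg.text}")
    return labeled


# Hypothetical sample data.
turns = [Turn(0.0, 5.0, "Speaker 1"), Turn(5.0, 9.0, "Speaker 2")]
segments = [
    Segment(0.5, 4.0, "Shall we start?"),
    Segment(4.5, 8.0, "Yes, go ahead."),
    Segment(20.0, 22.0, "Thanks everyone."),
]
labeled = label_segments(segments, turns)
```

Segments that overlap no speaker turn keep the "Unknown" label, which makes gaps in the diarization output visible instead of silently guessing a speaker.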

"There are no stable requirements in startups. We know that. So we built a system that can be quickly refined without reworking the entire core. And it worked,"

said Dmitry Dudnikov, CEO of Inevils.