Customers: Usachev Show
Contractors: Nanosemantics Lab
Product: NLab Speech TTS
Project date: 2023/09 - 2024/03
2024: Synthesis of the voice of blogger Ruslan Usachev
On April 9, 2024, Nanosemantics announced the completion of a project to synthesize the voice of the well-known Russian blogger Ruslan Usachev. As a result of the project, the blogger's team will be able to produce synthesized audio content for their own platforms.
Ruslan Usachev is a Russian-speaking video blogger who can rightfully be considered one of the pioneers of Russian YouTube: he recorded his first videos back in March 2010. Ruslan is the host and screenwriter of his own travel show and of the news digest Usachev Show, as well as the showrunner of the ClickKlak project.
Producing content for video blogs and audio podcasts is a laborious process that requires careful work on the script, recording, and editing of audio or video material. Speech synthesis can help with this. Instead of recording in a studio, bloggers can use artificial intelligence (AI) as an assistant that converts text scripts into audio files. This can significantly speed up content production and free up time for other tasks.
Despite the clear advantages, bloggers may run into difficulties. High-quality speech synthesis requires the system to learn the timbre of a particular person's voice and the subtleties of their pronunciation, especially when working with complex terms or professional vocabulary.
The voice model of Ruslan Usachev was developed on the Nanosemantics platform NLab Speech TTS (Text-to-Speech), which makes it possible to create a close copy of a media person's voice. The platform specializes in synthesizing speech from text and is used in various fields, including generating content for training and entertainment. According to the company, NLab Speech TTS can handle a number of problems that arise in voice synthesis, such as audio splices, varying levels of noise and intonation, varying speech rates, and coughing.
The developers' task was to reproduce the original timbre of Ruslan Usachev's voice along with the peculiarities of his pronunciation. To train the model, they used 10 hours of recordings from the blogger's YouTube channel. Fine-tuning various nuances took another 10 hours of clean recordings of the customer's voice. During the work, the developers ran into a problem with how the audio track of the synthesized recording was displayed when published in Telegram, and solved it by converting the recording to a suitable file format.
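The announcement does not name the file format that was used. As a purely illustrative sketch, assuming the target was OGG with the Opus codec (a format that Telegram's Bot API documents for voice messages) and that the ffmpeg command-line tool is installed, the conversion step might be prepared as follows. The function name, file name, and bitrate are hypothetical and are not taken from the project.

```python
from pathlib import Path


def build_voice_conversion_command(source: str, bitrate: str = "48k") -> list[str]:
    """Return an ffmpeg argument list that re-encodes `source` as mono OGG/Opus.

    Assumption: OGG/Opus is the "suitable file format"; the source article
    does not say which format the developers actually chose.
    """
    target = Path(source).with_suffix(".ogg")
    return [
        "ffmpeg",
        "-y",               # overwrite the target file if it already exists
        "-i", source,       # synthesized audio produced by the TTS step
        "-vn",              # drop any video or cover-art stream
        "-ac", "1",         # downmix to a single (mono) channel
        "-c:a", "libopus",  # encode the audio with the Opus codec
        "-b:a", bitrate,    # hypothetical bitrate, adjust as needed
        str(target),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without ffmpeg.
    print(" ".join(build_voice_conversion_command("usachev_sample.wav")))
```

The resulting list can be passed to `subprocess.run(..., check=True)` on a machine where ffmpeg is available.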
As a result of the project, a voice bot was created that generates audio messages in the voice of Ruslan Usachev. The customer and his team of editors have access to the bot and will use it to produce audio content for their own and, possibly, third-party platforms.
"My own voice bot is a valuable tool that will help me both with everyday content production tasks and with projects for which I simply lacked the time. In addition to automatically recording podcasts and audio interviews, I can now easily create audiobooks or voice training courses. A voice bot can also become an assistant in collaborations with fashion brands or in promoting my own products from my online store," Ruslan Usachev emphasized.
"Working on each new project to create a replica of a media person's voice is an inspiring experience. The bot with the voice of Ruslan Usachev is a very flexible model that we can configure and refine according to the customer's tasks. For example, in the future the model could be extended to generate speech in foreign languages for a multilingual audience," commented Ilya Ivanov, commercial director of Nanosemantics.