The name of the base system (platform): | NLab Speech Nanosemantic |
Developers: | Nanosemantics Lab |
Last Release Date: | 2022/09/15 |
Technology: | Speech technology |
The main articles are:
- Speech synthesis
- Speech Recognition (Technology, Market)
- Speech technology: On the path from recognition to understanding
NLab Speech TTS is a speech synthesis technology.
2023
NLab TTS at the heart of the Levitan voice model
In the year of the 110th anniversary of the birth of the famous Soviet announcer Yuri Levitan, Nanosemantics, a developer of neural network solutions, will present a synthesis of his voice. To mark the anniversary of the man whose voice announced victory in the Great Patriotic War on All-Union Radio, the company will present a voice model built on the NLab TTS platform and trained on rare recordings from the Levitan archive.
How to develop a chatbot based on a modern dialogue platform
Creating a full-fledged virtual assistant requires a careful choice of platform, one that lets companies build bots for their own needs independently. Nanosemantics, a Russian developer of AI technologies, uses its own product DialogOS as an example to describe the capabilities that a bot development and training environment should provide to the client.
2022
Update of NLab Speech TTS dictionaries
Nanosemantics continuously optimizes its NLab Speech TTS speech synthesis technology, regularly updating dictionaries and experimenting with voice model parameters and signal processing tools.
To understand the client and respond, a voice assistant needs a good vocabulary. Nanosemantics specialists constantly monitor lexical units that are frequently used in the media, in professional communities and in everyday speech. These units are added to the datasets used to train the voice model. NLab Speech TTS often "learns" neologisms before lexicographers record them.
In 2022, 151 words were added to the spelling dictionary of the V. V. Vinogradov Russian Language Institute of the Russian Academy of Sciences, for example: stand-up, crossfit, jetlag, hundred-point, procrastination. The voice assistant Natasha (a trained voice model based on NLab Speech TTS) already knows all these words and can pronounce them correctly, representatives of Nanosemantics said on September 15, 2022.
Nanosemantics is also working on other aspects of speech synthesis and on the intellectual functions of assistants. A "live" voice and the ability of an automated interlocutor to answer non-standard questions on its own can increase customer loyalty by 2-3 times, the company emphasized.
To improve the quality of its datasets and expand customization options, Nanosemantics is enlarging its pool of announcers, collecting speech from well-known people in both male and female voices. To achieve natural-sounding speech and correct intonation in Russian and English, the company works with speech signal synthesis and processing tools: vocoders, phonetizers, normalizers and post-processing.
Analysts expect that by 2024 the number of voice devices will equal the population of the Earth. According to representatives of Nanosemantics, the company is working to ensure that artificial voices sound natural, melodic and correct.
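The vocabulary monitoring described in this section can be illustrated with a small sketch. It is not Nanosemantics' tooling: the function name `find_missing_words`, the `min_count` threshold and the sample data are all assumptions made for illustration. The idea is simply to count words in a collection of texts and report the frequent ones that a pronunciation dictionary does not yet contain.

```python
import re
from collections import Counter

# Matches runs of letters in any script, optionally joined by hyphens ("stand-up").
WORD_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def find_missing_words(texts, lexicon, min_count=2):
    """Return (word, count) pairs for frequent words absent from the lexicon.

    Hypothetical helper: `texts` is any iterable of strings, `lexicon` is a
    set of lowercase words already covered by the dictionary.
    """
    counts = Counter(
        word
        for text in texts
        for word in WORD_RE.findall(text.lower())
    )
    missing = [
        (word, n)
        for word, n in counts.items()
        if n >= min_count and word not in lexicon
    ]
    # Most frequent first, ties broken alphabetically.
    return sorted(missing, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample_texts = [
        "Jetlag after the flight",
        "Crossfit and jetlag",
        "stand-up show, stand-up night",
    ]
    known = {"after", "the", "flight", "and", "show", "night"}
    for word, n in find_missing_words(sample_texts, known):
        print(word, n)
```

Words found this way would still need a pronunciation (stress, transcription) before they could be added to a dictionary or a training dataset; that step is outside this sketch.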
How NLab Speech TTS works
- Voice model training: to develop and launch its speech synthesis technology, Nanosemantics trained two voice models (Natasha and Artyom) using neural networks.
- Step-by-step speech synthesis process:
  - The NLP processor prepares the data. It is used where, for example, stress marks or the letters "е/ё" must be placed. This is done automatically using dictionaries and neural networks;
  - The engine converts the text into mel spectrograms;
  - The vocoder converts the mel spectrograms into voice (a separate model is trained for each announcer);
  - Post-processing corrects the speed, tone and volume of the synthesized audio.
(data as of September 2022)
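The four stages listed above can be sketched as a chain of functions. This is a toy illustration under stated assumptions, not the NLab Speech TTS implementation: every function name, the stress lexicon, the speaker pitches and all constants are hypothetical, and simple arithmetic stands in for the neural networks. Tone correction is omitted from the post-processing stage.

```python
import math

# Hypothetical toy lexicon: word -> index of the stressed vowel (0-based).
STRESS_LEXICON = {"hello": 1, "natasha": 1}
VOWELS = "aeiouy"
N_BANDS = 4            # toy number of spectrogram bands
FRAMES_PER_CHAR = 2    # toy duration model
SAMPLES_PER_FRAME = 8
SAMPLE_RATE = 8000


def nlp_process(text: str) -> list[str]:
    """Stage 1: normalize text and put '+' after each stressed vowel."""
    tokens = []
    for raw in text.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        if not word:
            continue
        stressed = STRESS_LEXICON.get(word)
        if stressed is not None:
            out, seen = [], -1
            for ch in word:
                out.append(ch)
                if ch in VOWELS:
                    seen += 1
                    if seen == stressed:
                        out.append("+")
            word = "".join(out)
        tokens.append(word)
    return tokens


def acoustic_engine(tokens: list[str]) -> list[list[float]]:
    """Stage 2: map tokens to spectrogram-like frames (frames x N_BANDS)."""
    frames = []
    for token in tokens:
        for i, ch in enumerate(token):
            if ch == "+":
                continue
            is_stressed = token[i + 1 : i + 2] == "+"
            n_frames = FRAMES_PER_CHAR * (2 if is_stressed else 1)
            energy = 1.0 + (ord(ch) - ord("a")) / 26.0
            for _ in range(n_frames):
                frames.append([energy / (band + 1) for band in range(N_BANDS)])
    return frames


def vocoder(spectrogram: list[list[float]], speaker: str = "natasha") -> list[float]:
    """Stage 3: turn frames into samples; a sine wave stands in for a per-speaker model."""
    pitch = {"natasha": 220.0, "artyom": 110.0}[speaker]  # assumed values
    samples = []
    for f, frame in enumerate(spectrogram):
        amplitude = sum(frame) / len(frame)
        for s in range(SAMPLES_PER_FRAME):
            t = (f * SAMPLES_PER_FRAME + s) / SAMPLE_RATE
            samples.append(amplitude * math.sin(2 * math.pi * pitch * t))
    return samples


def post_process(samples: list[float], speed: float = 1.0, volume: float = 1.0) -> list[float]:
    """Stage 4: naive speed change (index resampling) and gain, clipped to [-1, 1]."""
    if not samples:
        return []
    n_out = max(1, round(len(samples) / speed))
    resampled = [samples[min(len(samples) - 1, int(i * speed))] for i in range(n_out)]
    return [max(-1.0, min(1.0, volume * x)) for x in resampled]


def synthesize(text: str, speaker: str = "natasha", speed: float = 1.0, volume: float = 1.0) -> list[float]:
    """Run the four stages in order and return audio samples."""
    tokens = nlp_process(text)
    spectrogram = acoustic_engine(tokens)
    audio = vocoder(spectrogram, speaker)
    return post_process(audio, speed, volume)


if __name__ == "__main__":
    print(nlp_process("Hello, Natasha!"))
    print(len(synthesize("Hello", speaker="artyom", speed=2.0)))
```

Keeping each stage behind its own function mirrors the separation described in the list: the speaker-specific part is confined to the vocoder, so switching the voice does not touch text preparation or post-processing. Note that index resampling changes pitch along with speed, which a real post-processing step would handle separately.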
Inclusion in the Register of Domestic Software
In March 2022, NLab Speech ASR technology was included in the Unified Register of Russian Programs for Electronic Computers and Databases. NLab Speech TTS was added to the Unified Register at the same time.