Base system (platform): NLab Speech TTS
Developer: Nanosemantics Lab
System premiere date: 2023/12/05
Technology: Speech technology
The main articles are:
- Speech Recognition (Technology, Market)
- Speech technology: On the path from recognition to understanding
- Speech synthesis
2023: Voice Model Presentation
Nanosemantics, a developer of neural network solutions, will present a synthesized voice of the famous Soviet announcer Yuri Levitan in the year of the 110th anniversary of his birth. To mark the anniversary of the man whose voice announced victory in the Great Patriotic War on All-Union Radio, the company will present a voice model built on the NLab TTS platform and trained on rare recordings from the Levitan archive.
2024 marks the 110th anniversary of Yuri Levitan's birth. His great-grandson, Arthur Levitan-Sudarikov, approached the company with a proposal to develop a voice model of the All-Union announcer. The idea of the project is to preserve in digital form the voice that announced the most significant events of the 20th century to the Soviet Union. Levitan's voice sounded from loudspeakers and radio sets when reports from the fronts of the Great Patriotic War were broadcast in the USSR. It was Levitan who read out to the country the news of the beginning of the war in 1941 and of the victory in 1945.
Levitan's voice model will be based on NLab TTS (Text-to-Speech), the Nanosemantics platform that synthesizes speech from text. A large archive of recordings from the State Film Fund will be used to train the model. In addition, the announcer's great-grandson, Arthur Levitan-Sudarikov, asked that the developers be given access to audio materials stored in the Levitan Museum in the announcer's hometown of Vladimir.
The developers face a difficult task: synthesizing a recognizable vocal timbre from audio recordings of varying quality. The recordings differ widely in volume, compression, equalization, noise level and distortion. These differences stem from defects introduced during the original recording of Levitan's voice, as well as from re-recording or copying the master tape. Such heterogeneous material always complicates the creation of a voice model, which is expected to generate a "clean," even voice without unwanted interference.
"'Source Zero' is a very clear, warm, tube sound that delicately emphasizes Levitan's voice. This warmth and these harmonics play a significant role in how the announcer's final voice is perceived, as he was remembered. We managed to remove 80% of the unwanted defects, but, where necessary, we even added background noise so that the recordings sound equally pleasant and recognizable and at the same time do not lose the sound of that era," said Stanislav Ashmanov, CEO of Nanosemantics.
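One of the differences listed above, uneven volume between recordings, can be illustrated with a simple levelling step. The following is a minimal sketch only, assuming mono audio already decoded to floating-point samples in the range [-1.0, 1.0]. The function names and the -20 dBFS target are hypothetical choices for illustration, and nothing here describes the actual Nanosemantics pipeline.

```python
import math
from typing import List, Sequence


def rms_dbfs(samples: Sequence[float]) -> float:
    """Return the RMS level of the samples in dB relative to full scale."""
    if not samples:
        raise ValueError("empty signal")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square)


def normalize_rms(samples: Sequence[float], target_dbfs: float = -20.0) -> List[float]:
    """Scale the samples so their RMS level matches target_dbfs.

    The gain is capped so that the loudest sample never exceeds 1.0,
    which avoids adding clipping distortion to an already damaged recording.
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # nothing to scale
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    peak = max(abs(s) for s in samples)
    if peak * gain > 1.0:
        gain = 1.0 / peak
    return [s * gain for s in samples]


def sine(amplitude: float, freq_hz: float = 220.0, rate: int = 16000) -> List[float]:
    """Generate one second of a test tone (a stand-in for a real recording)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / rate) for n in range(rate)]


if __name__ == "__main__":
    quiet, loud = sine(0.05), sine(0.8)
    for name, clip in (("quiet", quiet), ("loud", loud)):
        levelled = normalize_rms(clip)
        print(name, round(rms_dbfs(clip), 2), "->", round(rms_dbfs(levelled), 2))
```

Levelling of this kind addresses only volume. Differences in equalization, noise and distortion would need separate processing, and a perceptual loudness measure may suit speech better than plain RMS.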
As of December 2023, Nanosemantics is developing the first version of the model. The project participants are discussing future open access to it for users, as well as the possibility of integrating the voice model with generative technologies. It is already known that Yuri Levitan's voice will be available on the platform of the orthoepic (pronunciation) service, in the "Stress" application, which was launched in 2015. The product release is scheduled for 2024.
"It is a great honor for us to take part in the project to create a voice model of one of the most important voices of the 20th century in our country. It is important that voices like Levitan's sound again, but in a new format, from modern gadgets and applications. This will open up broad opportunities for preserving the memory of these people, whose voices can narrate books, news reports and virtual interactive characters. Thanks to such projects, the voice will live on and remind us of the person," said Stanislav Ashmanov, CEO of Nanosemantics.