
Google Translatotron

Product
Developers: Google
System launch date: May 2019
Last release date: 2021/08/09
Branches: Internet services
Technology: Speech technology, Office applications

2021: Introduction of Translatotron 2

Google introduced Translatotron 2, which addresses the risk of voice abuse by preserving only the original speaker's voice. This became known on August 9, 2021.

The original Translatotron did not just translate speech in real time; it transformed the speaker's voice so that words spoken in one language sounded in another. Despite the obvious advantages of this technology, it had a significant drawback: because the system could generate speech in different voices, it could be used by fraudsters, including to create deepfakes.

Now Google has introduced the Translatotron 2 system, which addresses possible abuse by preserving only the original speaker's voice: the system cannot produce the translation in a voice different from the speaker's. Translation quality and the naturalness of the speech were also improved by reducing unwanted artifacts such as slurred speech and overly long pauses between phrases. In addition, the performance of Translatotron 2 is significantly higher than that of the original system.

According to experts, voice conversion technologies have become very popular over the past few years. They work so well that even automated systems cannot always distinguish "live" speech from modified speech. It is therefore necessary to ensure that they cannot be used to cause harm. The creators of Translatotron 2 hope that, if successful, their project could be a breakthrough in this area.[1]

2019: Announcement

On May 15, 2019, Google introduced a tool for simultaneous translation of speech from one language to another. The technology is called Translatotron.

An important feature of the development is that it translates conversations while preserving the speaker's voice and intonation. Unlike Google Translate, it does not convert the voice to text and back. Translatotron skips this stage and works directly with the sound: the system creates an "impression" of the original speech and converts it.

Google Translatotron Architecture

The neural network presented by Google receives a spectrogram (a visual representation of frequencies) of the original voice recording and synthesizes a spectrogram of the speech in another language. The algorithm then synthesizes an audio file from it. This method can significantly speed up the translation of spoken language, but as of mid-May 2019 its accuracy was still far from perfect.
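To make the notion of a spectrogram more concrete, here is a minimal sketch in Python that computes a plain magnitude spectrogram of a test tone with a short-time Fourier transform. It is only an illustration: the function names, frame size, hop and sample rate are assumptions chosen for the example, and this is not necessarily the representation or code used inside Translatotron.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Split the signal into overlapping frames and return, for each frame,
    the magnitudes of frequency bins 0..frame_size // 2."""
    # A Hann window softens the frame edges and reduces spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform (slow, but dependency-free).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def dominant_frequency(spectrum, sample_rate, frame_size):
    """Return the frequency (Hz) of the strongest bin in one frame."""
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / frame_size


# Hypothetical input: a 1000 Hz sine tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
samples = [math.sin(2 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(512)]

spectrogram = magnitude_spectrogram(samples, frame_size=FRAME_SIZE, hop=32)
print(len(spectrogram), "frames x", len(spectrogram[0]), "frequency bins")
print(dominant_frequency(spectrogram[0], SAMPLE_RATE, FRAME_SIZE))  # 1000.0
```

Each row of the result describes how strongly each frequency is present in a short slice of time. A speech-to-speech model of the kind described above takes such a time-frequency grid as input and produces another one as output.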

"Although our results lag behind the usual cascade system, we have demonstrated the possibility of end-to-end direct conversion of speech to speech," the Google website says.

At the same time, according to the developers, the direct approach avoids the errors that accumulate between the recognition and translation stages of a cascade model, and it also handles proper names better.

The synthesized voice sounds somewhat robotic, but it still closely resembles the original. Samples of the machine translation are available on the Google blog.[2]

Google experts tested the algorithm using the BLEU metric, which compares a machine translation with a translation made by a person. The tests involved translating speech from Spanish into English.
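The idea behind BLEU can be sketched in a few lines of Python. The example below is a simplified sentence-level version with a single reference translation, whitespace tokenization and no smoothing. The function names and sample sentences are assumptions made for illustration, and this is not the exact evaluation procedure Google used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) multiplied by a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Each candidate n-gram is credited at most as many times
        # as it occurs in the reference ("clipping").
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Candidates shorter than the reference are penalized.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical sentences: a human reference and two machine outputs.
human = "the cat sat on the mat"
print(bleu("the cat sat on the mat", human))  # 1.0
print(bleu("the cat sat on a mat", human))    # between 0 and 1
```

The score ranges from 0 to 1: the more word sequences the machine translation shares with the human one, the higher it is. Evaluations of real systems usually compute the score over a whole test set rather than a single sentence.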

Translatotron could improve the Google Assistant voice assistant, which in May 2019 began to work 10 times faster because the company shrank its recurrent neural networks and moved speech processing onto devices.

Notes