
Google Tacotron (speech synthesizer)

Product
Developer: Google
Last release date: December 2017
Industries: Internet services
Technology: Speech technologies

2017: Development of the Tacotron 2 algorithm

At the end of December 2017, Google announced a speech synthesis system that converts text into speech sounding as close as possible to a human voice. The algorithm was named Tacotron 2.

The system can read any sentence, cope with grammatical errors in the input and vary the tone of its speech. So far the algorithm speaks only English.

Tacotron 2 reportedly uses context to decide how to pronounce words that are spelled identically. It also responds to punctuation in the text and can emphasize specific words. The technology can distinguish different forms of a verb and determine whether a word is acting as a verb or as a noun.

Google has developed a speech synthesis system that is hard to distinguish from a human voice

Google has long been working on speech synthesis based on artificial intelligence. In 2016 the company presented a synthesizer that sounded close to human speech. It uses the WaveNet AI system, which learns how text corresponds to particular waveform shapes and then uses that knowledge to generate the individual sound waves for text fragments.

Tacotron 2 is connected to the WaveNet neural network, which generates the required sounds from data supplied by another deep learning system. That second system converts the text into a spectrogram (a representation of audio frequencies over time).
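To make the notion of a spectrogram more concrete, the following is a minimal sketch in Python of how one can be computed from an existing audio signal: the signal is cut into short overlapping frames and the strength of each frequency in every frame is measured. This is only an illustration of the data format, not Google's code. Tacotron 2 works in the opposite direction, predicting a spectrogram from text, and the function name, frame size, hop and test tone below are all assumptions chosen for the example.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists one magnitude per frequency bin."""
    # A Hann window softens the frame edges and reduces spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive discrete Fourier transform over the non-negative frequency bins.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
TONE_HZ = 1000
FRAME_SIZE = 64
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(signal, frame_size=FRAME_SIZE, hop=32)

# The strongest bin of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames,", len(spec[0]), "frequency bins, peak at", peak_hz, "Hz")
```

Each row of the result is a moment in time and each column a frequency band. In the arrangement described above, a table of this kind is the intermediate data handed to WaveNet, which turns it into an audible waveform.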

Google notes that Tacotron 2 works well overall, but it still has difficulty pronouncing some complex words and occasionally produces strange noises at random. In addition, the system cannot run in real time, and its authors cannot yet control the engine, i.e. make it use a desired intonation such as a happy or sad voice.

According to the Tacotron 2 developers, the algorithm can be used to improve voice assistants, which are becoming increasingly widespread.[1]

Notes