
RHVoice Speech Synthesis System

Product
Last Release Date: 2022/04/11
Technology: Speech Technology

RHVoice is a free, open source multilingual speech synthesizer.

2022: RHVoice 1.8.0

On April 11, 2022, it became known that RHVoice 1.8.0, an open speech synthesis system, had been released. It was initially developed to support the Russian language but was later adapted for other languages, including English, Portuguese, Ukrainian, Kyrgyz, Tatar, and Georgian. The code is written in C++ and is distributed under the LGPL 2.1 license. It runs on GNU/Linux, Windows, and Android. The program is compatible with the typical TTS (text-to-speech) interfaces: SAPI5 (Windows), Speech Dispatcher (GNU/Linux), and the Android Text-To-Speech API. It can also be used with the NVDA screen reader. The creator and main developer of RHVoice is Olga Yakovleva, who develops the project despite being completely blind.
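
As an illustration of the Speech Dispatcher interface mentioned above, the following minimal Python sketch builds a call to the `spd-say` command-line client that ships with Speech Dispatcher. It assumes that the RHVoice output module is installed and registered under the name `rhvoice`; that module name, the default language code, and the helper function names are assumptions made for this example and do not come from the release notes.

```python
import shutil
import subprocess


def clamp(value, low=-100, high=100):
    """Keep a setting inside the -100..100 range that spd-say accepts."""
    return max(low, min(high, value))


def build_spd_say_command(text, module="rhvoice", language="ru",
                          rate=0, pitch=0, volume=0):
    """Return the argument list for spd-say (the module name is an assumption)."""
    return [
        "spd-say",
        "--wait",                      # block until the message has been spoken
        "-o", module,                  # output module to use
        "-l", language,                # language code
        "-r", str(clamp(rate)),        # speech rate
        "-p", str(clamp(pitch)),       # pitch
        "-i", str(clamp(volume)),      # volume
        text,
    ]


def speak(text, **settings):
    """Speak the text if spd-say is available; return False otherwise."""
    if shutil.which("spd-say") is None:
        return False
    subprocess.run(build_spd_say_command(text, **settings), check=True)
    return True


if __name__ == "__main__":
    speak("Привет, мир", language="ru", rate=20)
```

Building the argument list separately from running it keeps the sketch usable on systems without Speech Dispatcher, where `speak` simply returns `False`.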

Illustration: selectel.ru

Version 1.8.0 for the Android platform offers an optimized system for managing voice and language data, which makes it possible to download voice data updates without updating the mobile application. Updates for installed voices and languages are checked automatically. In addition, this release adds support for the Polish language and a voice for the Macedonian language. Compatibility with the latest alpha and beta releases of the NVDA screen reader is ensured. Build problems on Linux that occurred when Speech Dispatcher was absent have been fixed.

RHVoice builds on the work of the HTS project (HMM/DNN-based Speech Synthesis System) and uses statistical parametric synthesis based on hidden Markov models (HMM). The advantage of the statistical model is its low overhead and modest CPU requirements. All processing is performed locally on the user's system. Three levels of speech quality are supported: the lower the quality, the higher the performance and the shorter the response time.

The downside of the statistical model is relatively low pronunciation quality, which falls short of synthesizers that generate speech by combining fragments of natural speech. Nevertheless, the result is quite intelligible and resembles a recording played through a loudspeaker. For comparison, the Silero project, which provides an open speech synthesis engine based on machine learning and a set of models for the Russian language, surpasses RHVoice in quality.

For the Russian language, 14 voices are available; for English, 6. The voices are built from recordings of natural speech. The speed, pitch, and volume can be changed in the settings, and the Sonic library can be used to change the tempo. The language can be detected and switched automatically based on analysis of the input text (for example, words and quotations in another language can be synthesized with the model native to that language). Voice profiles, which define combinations of voices for different languages, are supported.[1]
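
The idea of switching languages within a text and choosing a voice from a profile can be sketched as follows. This is only an illustration under simple assumptions and not RHVoice's actual algorithm: the language is guessed from the Unicode script of each letter (Cyrillic is treated as Russian, Latin as English), and the profile is a hypothetical dictionary that maps language codes to voice names. The voice names are placeholders for this example.

```python
import unicodedata


def script_language(ch):
    """Guess a language code from a character's script; None for neutral characters."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    if name.startswith("CYRILLIC"):
        return "ru"
    if name.startswith("LATIN"):
        return "en"
    return None


def split_by_language(text, default="ru"):
    """Split text into (language, fragment) runs; spaces and punctuation stay with the current run."""
    segments = []
    current = None
    buffer = []
    for ch in text:
        lang = script_language(ch)
        if lang is not None and current is not None and lang != current:
            # The script changed: close the current run and start a new one.
            segments.append((current, "".join(buffer)))
            buffer = []
        if lang is not None:
            current = lang
        buffer.append(ch)
    if buffer:
        segments.append((current or default, "".join(buffer)))
    return segments


def assign_voices(segments, profile):
    """Replace each language code with the voice the (hypothetical) profile maps it to."""
    return [(profile[lang], fragment) for lang, fragment in segments]


if __name__ == "__main__":
    profile = {"ru": "Aleksandr", "en": "Alan"}  # placeholder voice names
    for voice, fragment in assign_voices(split_by_language("Он сказал hello world и ушёл"), profile):
        print(voice, repr(fragment))
```

A real synthesizer would need more than script detection, since many languages share an alphabet, but the sketch shows how a profile lets one request be spoken by several voices.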

Notes