BSS: ASR (Automatic Speech Recognition)

Developers: Banks Soft Systems, BSS
System premiere date: 2024/06/20
Technology: Information Security - Biometric Identification, Speech Technologies

2024: Building the ASR Model

BSS achieved almost 80% automatic recognition quality for the Kazakh language after assembling its own ASR (Automatic Speech Recognition) model in just three months. The developer announced this on June 20, 2024.

ASR technology is needed to recognize the spoken requests of clients who contact the virtual assistant. To train the model quickly despite having no initial data, the developers used real dialogues in Kazakh containing vocabulary relevant to the customer. BSS analysts carefully transcribed 10 hours of audio into text.

In parallel, ready-made speech corpora totaling 1,500 hours were collected from open sources. The first, basic version of the model was trained on this data and reached a speech recognition quality of 70%. The basic ASR model was then fine-tuned on the 10 hours of audio transcribed by the analysts. After this second iteration, the model showed 80% quality on the customer's target requests. As of June 2024, BSS developers are launching a new ASR training cycle to improve the speech recognition metric.
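The announcement does not say which metric stands behind the 70% and 80% figures. One common choice for ASR is word accuracy, that is, 1 minus the word error rate (WER). The sketch below assumes that metric purely for illustration. The function names `edit_distance` and `word_accuracy` and the English sample phrases are hypothetical and do not come from BSS.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(pairs):
    """Corpus-level word accuracy: 1 - (total word errors / total reference words).

    Assumption: "quality" is read as 1 - WER; the source does not confirm this.
    """
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        errors += edit_distance(ref, hyp)
        words += len(ref)
    if words == 0:
        raise ValueError("no reference words to score against")
    # WER can exceed 1 when the hypothesis is much longer, so clamp at 0.
    return max(0.0, 1.0 - errors / words)


# Made-up (reference transcript, ASR output) pairs, not real BSS data.
sample_pairs = [
    ("check my card balance please", "check my card balance"),
    ("transfer money to my account", "transfer money to the account"),
]
print(f"word accuracy: {word_accuracy(sample_pairs):.0%}")
```

Scoring the basic model and the fine-tuned model on the same held-out set of analyst-transcribed customer requests would make figures such as 70% and 80% directly comparable.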

"This made it possible to speed up ASR by 3 times and improve the quality of speech recognition by 5%. Also, thanks to the transition to a more productive architecture, the bot's reaction time is reduced by 200-500 ms, which makes the dialogue with customers feel more natural," said Alexander Krushinsky.