
Yandex Cloud SpeechSense

Product
Base system (platform): Artificial intelligence (AI)
Developers: Yandex.Cloud
System premiere date: 2024/03/11
Technology: Speech technology

Main articles:

2024: Introduction of the empath neural network

An empath neural network from the Yandex Cloud platform will help businesses better understand customers' emotions. The developer announced this on March 11, 2024.

The algorithm can recognize a person's emotions from their voice during a dialogue. The ML model can already detect negativity, informal statements, and obscene language, as well as determine the speaker's gender and attribute phrases in the dialogue to their speaker. In the future, the algorithm will work in conjunction with YandexGPT: together, the neural networks will be able to recognize more complex emotions, in particular sarcasm.

The empath neural network is based on the Yandex SpeechKit speech recognition technology. With its help, companies will be able to create voice assistants and virtual call-center operators that understand human emotions. This will allow businesses to improve the quality of telephone analytics, adapt call-center interactions to each client, and respond quickly to emergencies during a dialogue.

The ML model works in streaming mode: transcription and emotion analysis happen during the conversation itself. For example, if a subscriber speaks negatively with a voice assistant, the neural network can pass this information to the customer's internal system, which will automatically switch the subscriber to a call-center employee. If an operator was rude to a client, the system will notify management of problems during the conversation.
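The escalation workflow described above can be sketched as a small piece of client-side logic. This is a hypothetical illustration, not SpeechSense code: the threshold, class name, and the idea of per-phrase negativity scores are assumptions about how an internal system might consume the model's streaming output.

```python
# Hypothetical sketch: routing a call to a human operator when a
# streaming emotion model reports sustained negativity. All names and
# thresholds are illustrative, not part of the SpeechSense API.

NEGATIVE_THRESHOLD = 0.7   # assumed score above which a phrase counts as negative
ESCALATE_AFTER = 2         # assumed count of consecutive negative phrases

class EscalationMonitor:
    """Tracks per-phrase negativity scores and decides when to hand
    the call from the voice assistant to a call-center employee."""

    def __init__(self) -> None:
        self.consecutive_negative = 0

    def on_phrase(self, negativity_score: float) -> bool:
        """Return True when the dialogue should be escalated."""
        if negativity_score >= NEGATIVE_THRESHOLD:
            self.consecutive_negative += 1
        else:
            self.consecutive_negative = 0
        return self.consecutive_negative >= ESCALATE_AFTER

monitor = EscalationMonitor()
scores = [0.2, 0.8, 0.9]   # per-phrase scores arriving during the call
decisions = [monitor.on_phrase(s) for s in scores]
print(decisions)  # the second consecutive negative phrase triggers escalation
```

Requiring several consecutive negative phrases, rather than reacting to a single one, is one plausible way to avoid switching the call on an isolated outburst.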

The algorithm determines emotions not only from the content of the speaker's speech but also from the voice itself: speech rate, pitch, timbre, and other parameters. The neural network also determines the gender of the conversation participants and supports speaker labeling technology, noting which participant a given utterance belongs to. This makes it possible to work fully with single-channel audio tracks, for example recordings from a voice recorder or calls made under the technological restrictions of a virtual PBX. In addition, the ML model expands the capabilities of offline analytics: its data helps identify which topics and operator formulations cause client negativity, so communication scenarios can be optimized.
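To make the speaker-labeling idea concrete, the sketch below groups labeled utterances from a single-channel recording by speaker, the kind of step offline analytics would take before looking for formulations that precede client negativity. The segment structure (speaker, gender, text fields) is an assumption for illustration; the real service's output format may differ.

```python
# Hypothetical sketch: consuming speaker-labeled segments from a
# single-channel recording. The segment fields are illustrative and
# do not reflect the actual SpeechSense output schema.

from collections import defaultdict

segments = [
    {"speaker": "operator", "gender": "female", "text": "Hello, how can I help?"},
    {"speaker": "client", "gender": "male", "text": "My order never arrived."},
    {"speaker": "operator", "gender": "female", "text": "Let me check that for you."},
]

# Group utterances by speaker so analytics can examine, for example,
# which operator phrases precede negative client reactions.
by_speaker = defaultdict(list)
for seg in segments:
    by_speaker[seg["speaker"]].append(seg["text"])

print(dict(by_speaker))
```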

Soon the new model will work in the SpeechSense speech analytics service, which includes Yandex SpeechKit technologies and is integrated with the YandexGPT generative neural network. The interaction of several ML models will make it possible to recognize more complex emotions of the speaker, for example uncertainty or sarcasm. The neural networks will also be able to assess how deeply the operator engaged with the client's problem: whether they genuinely tried to help find a solution or tried to end the conversation as quickly as possible.

"When developing ML services, we always take market feedback into account. One of the requests from our clients and partners was the ability to identify emotions during speech recognition. This feature is now available to every user. In the future, within the SpeechSense speech analytics service, we plan to significantly expand the list of recognizable emotions and let customers choose which emotions they need to detect," said Vasili Yershov, head of ML services at Yandex Cloud.