
Whisper

Product
The name of the base system (platform): Artificial intelligence (AI)
Developers: OpenAI
Date of the premiere of the system: March 2023
Branches: Information Technology
Technology: Service Robots, Application Development Tools

2023: Speech-to-Text System Announcement

On March 1, 2023, OpenAI introduced the application programming interface (API) for the Whisper system, which debuted in September 2022.

Whisper is a speech-to-text system trained on 680,000 hours of multilingual and "multitask" data collected from the internet. The system copes well with accented speech, background noise, and technical jargon. According to OpenAI, it can "reliably" transcribe speech in several languages and translate from those languages into English. However, Whisper has limitations, especially in next-word prediction: the transcription may include words that were not actually spoken. In addition, transcription quality varies between languages.

OpenAI introduces open speech-to-text API

With the Whisper API, third-party developers can integrate the model into their own applications. The API accepts files in several formats, including M4A, MP3, MP4, MPEG, MPGA, WAV and WEBM. Using the Whisper large-v2 model costs $0.006 per minute of audio. The resulting text can then be passed to other neural network-based applications.
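As a rough illustration of the integration described above, the following minimal sketch checks a file against the listed formats, estimates the cost at the quoted price, and sends the file for transcription. It assumes the official `openai` Python package (version 1.x) and the API model identifier `whisper-1`, neither of which is named in this article; the helper names `is_supported`, `estimate_cost_usd` and `transcribe` are hypothetical. The cost helper also assumes that billing scales linearly with audio length, with no rounding or minimum charge.

```python
from decimal import Decimal
from pathlib import Path

# File formats listed in the article. Checking the extension is only a
# heuristic; it does not inspect the actual file contents.
SUPPORTED_EXTENSIONS = {".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}

# Price quoted in the article for the large-v2 model.
PRICE_PER_MINUTE_USD = Decimal("0.006")


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def estimate_cost_usd(duration_seconds: float) -> Decimal:
    """Estimate the transcription cost for audio of the given length.

    Assumption: cost is strictly proportional to duration.
    """
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return minutes * PRICE_PER_MINUTE_USD


def transcribe(path: str) -> str:
    """Send an audio file to the Whisper API and return the text."""
    if not is_supported(path):
        raise ValueError(f"Unsupported audio format: {path}")

    # Imported lazily so the helpers above work without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY variable
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",  # assumed model identifier, not from the article
            file=audio_file,
        )
    return result.text
```

Under these assumptions, `estimate_cost_usd(600)` gives $0.06 for a ten-minute recording, and `transcribe("interview.mp3")` would return the recognized text as a string that can be passed on to other applications.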

"We released the tool, but it wasn't really enough for the entire developer ecosystem to build around it. The Whisper API is the same large model that you can get with open source, but we have optimized it as much as possible. It is much faster and very convenient," TechCrunch quotes Greg Brockman, president and chairman of the board of OpenAI, as saying.

The Whisper API is reportedly already used by Speak, an AI-based application for learning foreign languages. In particular, the API will be used to create a "new companion AI product".[1]

Notes