RSS
Логотип
Баннер в шапке 1
Баннер в шапке 2

Whisper

Product
The name of the base system (platform): Artificial intelligence (AI, Artificial intelligence, AI)
Developers: OpenAI
Date of the premiere of the system: March 2023
Branches: Information Technology
Technology: Robots Service,  Application Development Tools

Content

2024: Hospital-used OpenAI model found to be hallucinated

The model used in hospitals OpenAI turned out to be subject to hallucinations. This became known on October 28, 2024.

Generative models of artificial intelligence are prone to generating incorrect information. Surprisingly, this problem also affected the field of automatic transcription, where the model must accurately play the audio recording. Software engineers, developers and scientists are seriously concerned about OpenAI's Whisper decryptions.

A University of Michigan researcher found hallucinations in eight out of ten audio recordings. A machine learning engineer who studied more than 100 hours of Whisper transcriptions found errors in more than half of them. And the developer said that he discovered fictitious information in almost all 26,000 transcriptions created by him using Whisper.

Scientists from Cornell University, the University of Washington and other institutions found that Whisper "hallucinates" about 1% of the time, coming up with whole sentences during pauses in records. Pauses are particularly common in the speech of people with aphasia, the researchers note. Sometimes AI-invented phrases contain aggression and racism, and sometimes nonsense.

Hallucinations included fictional medical terms or phrases one would expect from YouTube videos, such as "Thank you for watching!." OpenAI reportedly used more than 1 million hours of YouTube video to decrypt the GPT-4.

All this creates serious risks, since Whisper is applied in medical institutions. Thus, Whisper is used by Nabla as a medical transcription tool. According to her estimates, the model deciphered 7 million conversations with doctors. More than 30,000 doctors and 40 health systems use the AI tool. Nabla is reportedly aware of the Whisper hallucinations and is "addressing this issue."

File:Aquote1.png
We thank the researchers for sharing their findings, "said OpenAI[1] to[2].
File:Aquote2.png

2023: Speech-to-Text System Announcement

On March 1, 2023, OpenAI introduced the application programming interface (API) for the Whisper system, which debuted in September 2022.

Whisper is an intelligent speech-to-text tool trained on 680,000 hours of multilingual and "multitasking" data collected from the internet. The system is able to correctly perceive pronunciation with an accent, identify background noises, as well as technical jargon. According to OpenAI, the solution can "reliably" transcribe speech in several languages, as well as translate from these languages ​ ​ into English. However, Whisper has its limitations, especially in the field of predictive decryption. In addition, the quality of Whisper's work varies between languages.

OpenAI introduces open speech-to-text API

Thanks to the introduction of the Whisper API, third-party developers will be able to integrate this neural network into their applications. It supports working with files in various formats, including M4A, MP3, MP4, MPEG, MPGA, WAV and WEBM. The cost of using the Whisper large-v2 model is $0.006 per minute. The resulting text can then be used in other neural network-based applications.

File:Aquote1.png
We released the tool, but it wasn't really enough for the entire developer ecosystem to build around it. The Whisper API is the same large model that you can get with open source, but we have optimized it as much as possible. This is much faster and very convenient, - TechCrunch quotes the words of the president and chairman of the board of OpenAI Greg Brockman.
File:Aquote2.png

It is noted that the Whisper API is already used by participants in the Speak project - applications based on artificial intelligence for learning foreign languages. In particular, using the API, a "new companion AI product" will be created[3]

Notes