Good morning, dear readers of Tecnogalaxy. Today we will talk about OpenAI's Whisper model and what it means for AI.

OpenAI has released Whisper, an open source deep learning model for speech recognition. OpenAI's tests of the model show promising results in transcribing audio not only in English, but also in several other languages.

Developers and researchers who have experimented with Whisper have been impressed by what the model can do. But what is perhaps equally important is what Whisper's release tells us about the changing culture in artificial intelligence (AI) research and the kind of applications we can expect in the future.

A return to openness?

OpenAI has been criticized for not making its models open source. GPT-3 and DALL-E, two of OpenAI's most impressive deep learning models, are only available behind paid API services, and there is no way to download and inspect them.

In contrast, Whisper was released as a pre-trained open source model that anyone can download and run on the computing platform of their choice. This development comes as, in recent months, there has been a trend toward greater openness among commercial AI research laboratories.

Meta has open-sourced OPT-175B, a Large Language Model (LLM) that matches the size of GPT-3. Hugging Face has released BLOOM, another open source LLM of GPT-3 scale. Stability AI has released Stable Diffusion, an open source image generation model that rivals OpenAI's DALL-E.


One of the important features of Whisper is the diversity of the data used to train it. Whisper was trained on 680,000 hours of multilingual and multitask data collected from the web. A third of the training data consists of non-English audio examples.

"Whisper can transcribe speech robustly and perform at a state-of-the-art level in about 10 languages, as well as translate from those languages into English," said an OpenAI spokesperson.


Because Whisper is open source, developers and users can choose to run it on the computing platform of their choice, whether that is a laptop, a desktop workstation, a mobile device, or a cloud server. OpenAI has released five different sizes of Whisper, each trading accuracy for speed, with the smallest model running about 60 times faster than the largest.
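The five released sizes and their approximate parameter counts come from the openai/whisper repository; a minimal sketch of picking the largest model that fits a parameter budget, then transcribing with the library's documented `load_model`/`transcribe` entry points, could look like this (the `pick_model` helper is a hypothetical convenience, not part of the library):

```python
# Approximate parameter counts in millions, smallest to largest,
# per the openai/whisper README.
MODEL_SIZES = [
    ("tiny", 39),
    ("base", 74),
    ("small", 244),
    ("medium", 769),
    ("large", 1550),
]

def pick_model(max_params_millions):
    """Return the name of the largest model that fits the budget.

    Falls back to the smallest model if nothing fits.
    """
    chosen = MODEL_SIZES[0][0]
    for name, params in MODEL_SIZES:
        if params <= max_params_millions:
            chosen = name
    return chosen

def transcribe(audio_path, max_params_millions=300):
    """Transcribe an audio file with the largest model that fits.

    Requires `pip install openai-whisper`; imported lazily so the
    selection helper above stays standalone.
    """
    import whisper

    model = whisper.load_model(pick_model(max_params_millions))
    return model.transcribe(audio_path)["text"]

# With a ~300M-parameter budget, pick_model selects "small".
print(pick_model(300))
```

The budget-based selection mirrors the speed/accuracy trade-off described above: a phone might stay at "tiny" or "base", while a workstation with a GPU can afford "large".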

"Because transcription with the smaller Whisper models runs faster, there are practical use cases for running them on mobile or desktop systems, once the models have been properly ported to those environments," said the OpenAI spokesperson. Developers who have tried the model are pleased with the opportunities it can offer. It could also challenge the cloud-based ASR services that have been the main option until now.

"At first glance, Whisper seems to be much better than other SaaS [software-as-a-service] products."


There are already several initiatives to make Whisper easier to use for people who lack the technical skills to configure and run machine learning models. One example is a joint project by journalist Peter Sterne and GitHub engineer Christina Warren to create a free, secure and easy-to-use transcription app for journalists, based on Whisper.

That's all about Whisper; see you in a forthcoming article.


Was this article helpful to you? Help this site cover its expenses with a donation of your liking by clicking on this link. Thank you!

Follow us also on Telegram by clicking on this link to stay updated on the latest articles and news about the site.

If you want to ask questions or talk about technology you can join our Telegram group by clicking on this link.

© - It is forbidden to reproduce the content of this article.