Are you an academic researcher or podcaster looking for a free-to-use transcription tool that does not use US-based cloud processing to do its work? If that sounds like you, then I’d like to introduce you to Whisper.ai!

      Note: the Whisper.ai link above points to the open-source code for the project, which is quite difficult for non-programmers to use. Don't worry, I'll provide links to more user-friendly versions of the software below.

        OpenAI DALL-E 3. (2023). ChatGPT [Large language model]. https://chat.openai.com

          For example, I installed the Whisper Transcription software for Mac (a graphical wrapper around the open-source Whisper.ai command-line tools) on my 14-inch M1 MacBook Pro, and it transcribed a 30-minute podcast interview in 1 minute and 15 seconds! Not only did it transcribe the interview, but it also gave me the option of grouping the transcribed text into paragraphs for each speaker, making it easier to know who was talking at different points in the recording.

            For Windows users, a good option is GoWhisper, which also uses a freemium model to fund the development of its Whisper.ai graphical interface.

              Whisper.ai Background

                Whisper.ai is an openly licensed project (MIT license) from the folks at OpenAI. Unlike ChatGPT, which runs on cloud-based servers and may use your prompts as training data, Whisper.ai can be installed locally on your laptop and does all of its processing on your computer. This avoids the privacy concerns that come with cloud-based tools such as Otter.ai. For Canadian-based researchers, it means they do not have to write cloud-based computing into their research ethics proposals, or cloud-based storage consent into the consent forms that their research participants need to read and sign.

                  The Mac-based Whisper Transcription software interface.

                    While there are a lot of paid tools available for transcribing and translating speech to text, like Otter.ai, and some free tools like YouTube that work fairly well, almost all of them use cloud-based services to process the transcription. Until relatively recently, I had not encountered any free tools that would transcribe audio locally on my laptop without doing any processing in the cloud. Enter Whisper.ai from OpenAI, the makers of ChatGPT!

                      Whisper.ai Reliability

                        Whisper.ai is quite reliable in my experience, but its reliability does vary by language. Below are charts provided by OpenAI outlining its reliability for various languages, and available language models. Note: The smaller bars on the charts below indicate more reliable transcription.

                          Graphs showing the reliability of large language models for various languages: Common Voice 15 being most reliable for Dutch and Spanish and less reliable for Afrikaans and Punjabi, and FLEURS being most reliable for languages such as Spanish and Italian and less reliable for Persian and Icelandic.

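To make "smaller bars are better" concrete: as far as I can tell, the charts in OpenAI's repository report word error rate (WER) for most languages, meaning the share of words the model got wrong compared with a human-checked transcript. The sketch below is my own minimal illustration of that calculation, not OpenAI's evaluation code, and the function name `word_error_rate` is hypothetical. Real evaluations also normalize punctuation and spelling, which this version skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: lowercases and splits on whitespace, with no
    punctuation or spelling normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping one row of the table at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)


if __name__ == "__main__":
    human = "the cat sat on the mat"
    machine = "the cat sat on a mat"
    print(f"WER: {word_error_rate(human, machine):.1%}")
```

If you have a short passage that you have transcribed by hand, running it through a function like this is a quick way to check how well a given model handles your own recordings and language.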

                            Whisper.ai Pros and Cons

                              Why should you consider using a Whisper.ai based tool to transcribe your audio files?

                                        • Whisper.ai is free to use

                                        • Whisper.ai uses your computer’s processing power for transcription, so there are no worries about personally identifiable or confidential information being stored in the cloud or potentially being used as training data for other AI tools

                                        • Whisper.ai is relatively fast

                                        • The Mac-based Whisper Transcription software creates separate paragraphs for different speakers (which can make analyzing interviews easier for researchers)

                                        • Creates files for closed captioning video
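On the closed-captioning point, the most common caption format is SRT, which is just numbered blocks of text with start and end timestamps. Here is a minimal sketch of how timed transcript segments could be turned into an SRT file. I am assuming each segment is a dictionary with `start` and `end` times in seconds plus a `text` field, which is similar to what the open-source Whisper code returns. The function names are hypothetical, and this is not how Whisper Transcription or GoWhisper necessarily do it.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp style)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT caption text from segments.

    Assumed segment shape: {"start": float, "end": float, "text": str}.
    """
    blocks = []
    for index, segment in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(segment['start'])} --> {srt_timestamp(segment['end'])}\n"
            f"{segment['text'].strip()}\n"
        )
    # A blank line separates one caption block from the next.
    return "\n".join(blocks)


if __name__ == "__main__":
    example = [
        {"start": 0.0, "end": 2.5, "text": " Welcome to the show."},
        {"start": 2.5, "end": 4.0, "text": " Thanks for having me."},
    ]
    print(segments_to_srt(example))
```

Most video editors and platforms such as YouTube accept a file in this format as a caption track.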

                                What are some potential drawbacks to Whisper.ai?

                                          • Whisper.ai is slower than Otter.ai (instead of processing a 30-minute interview in 20 seconds, Whisper.ai took 1 minute and 15 seconds on my M1 MacBook Pro).

                                          • The free versions of both Whisper Transcription and GoWhisper limit users to the smaller models for analyzing audio; enabling the larger, more accurate (but also slower) models requires paying a licensing fee

                                          • Older computers will take longer to transcribe audio than newer, faster computers
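To put the speed comparison in perspective, it helps to express it as a "real-time factor", meaning how many seconds of audio get processed per second of waiting. The arithmetic below uses only the timings from my own test described in this post (a 30-minute interview, 1 minute 15 seconds for Whisper.ai, roughly 20 seconds for Otter.ai). The function name is hypothetical, and your numbers will differ depending on your computer and the model you choose.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds


# Timings from the test described in this post.
AUDIO_SECONDS = 30 * 60    # 30-minute interview
WHISPER_SECONDS = 60 + 15  # 1 minute 15 seconds on an M1 MacBook Pro
OTTER_SECONDS = 20         # approximate cloud processing time

whisper_speed = realtime_factor(AUDIO_SECONDS, WHISPER_SECONDS)
otter_speed = realtime_factor(AUDIO_SECONDS, OTTER_SECONDS)

if __name__ == "__main__":
    print(f"Whisper.ai: {whisper_speed:.0f}x faster than real time")
    print(f"Otter.ai:   {otter_speed:.0f}x faster than real time")
    print(f"Whisper.ai took {WHISPER_SECONDS / OTTER_SECONDS:.2f}x as long as Otter.ai")
```

So while Whisper.ai took nearly four times as long as Otter.ai in this test, it still worked through the recording about 24 times faster than listening to it.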

                                  Useful Whisper.ai Resources