
How to Install & Use Whisper AI Voice to Text


Introduction

In this article, we’ll explore how to install and use OpenAI’s Whisper AI for transcription on Windows. Whisper converts speech to text, supports roughly 100 languages, and is free and open source. If you would rather not install anything on your PC, a cloud-based transcription service is an alternative.

Step-by-Step Installation Guide

Step 1: Install Python

To run Whisper AI, you'll need Python installed on your computer. Here’s how to do it:

  1. Go to the Python homepage.
  2. Click on the “Download” link, and select a version that Whisper supports. When this guide was written, that meant 3.7 to 3.10 but not 3.11; the Whisper README lists the versions currently supported.
  3. Scroll down to choose the Windows Installer (64-bit).
  4. After downloading, locate the .exe file in your Downloads folder and start the installation.
  5. Important: During installation, check the box that says "Add Python.exe to PATH".
  6. Finish the installation and confirm by typing python -V in the command prompt (CMD) to check the version.
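
If you want more than a version string, a short script can compare the interpreter against the range quoted in this guide. This is a minimal sketch: the 3.7 to 3.10 bounds are copied from the step above and may be out of date for newer Whisper releases, and the `is_supported` helper is a hypothetical name used only for illustration.

```python
import sys

# Range quoted in this guide; newer Whisper releases may accept other versions.
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the range quoted above."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "within" if is_supported() else "outside"
    print(f"Python {found} is {verdict} the 3.7 to 3.10 range")
```

Save it as a .py file and run it with python from the same command prompt you used for python -V.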

Step 2: Install PyTorch

Next, you need to install PyTorch, which is required for machine learning tasks:

  1. Visit the PyTorch homepage.
  2. Select your configuration:
    • Stable version
    • Windows
    • Package type: pip (since you installed Python)
    • Compute platform: choose CUDA if you have an Nvidia GPU, or CPU if you don't.
  3. Copy the command provided for installation and paste it into your command prompt.
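
Once the install command finishes, you can check from Python whether PyTorch is importable and whether it sees a CUDA-capable GPU. The sketch below assumes nothing beyond the standard `torch` package; `describe_torch` is a hypothetical helper name, and the script deliberately still runs if PyTorch is missing.

```python
import importlib.util


def describe_torch():
    """Report whether PyTorch is importable and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this Python environment"
    import torch  # imported lazily so the script still runs without PyTorch

    if torch.cuda.is_available():
        return f"PyTorch {torch.__version__} with CUDA: {torch.cuda.get_device_name(0)}"
    return f"PyTorch {torch.__version__}, CPU only"


print(describe_torch())
```

A "CPU only" result on a machine with an Nvidia GPU usually means the CPU build was selected on the PyTorch page.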

Step 3: Install Chocolatey

Chocolatey is a package manager for Windows. Here’s how to install it:

  1. Open PowerShell (search for it in the Start menu) as an administrator.
  2. Copy the installation command provided on the Chocolatey installation page.
  3. Paste it into PowerShell and press Enter.

Step 4: Install FFmpeg

FFmpeg is required to handle various audio files:

  1. In the same administrator PowerShell window, type the following command and press Enter:
    choco install ffmpeg
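
Whisper calls FFmpeg by name, so the program has to be reachable through PATH. A small check along these lines can confirm that; `ffmpeg_version` is a hypothetical helper, and a new terminal window may be needed before a fresh Chocolatey install shows up.

```python
import shutil
import subprocess


def ffmpeg_version(executable="ffmpeg"):
    """Return the first line of `ffmpeg -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    output = subprocess.run([path, "-version"], capture_output=True, text=True).stdout
    return output.splitlines()[0] if output else None


print(ffmpeg_version() or "ffmpeg was not found on PATH; reopen the terminal or reinstall")
```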

Step 5: Install Whisper AI

Finally, you can install Whisper AI:

  1. Open a command prompt as an administrator.
  2. Type in the following command:
    pip install --upgrade openai-whisper
    
  3. Press Enter to run the command, and wait for the installation to complete.

Running Whisper AI

After the installation, you can transcribe audio files using Whisper AI:

  1. In File Explorer, navigate to the directory containing your audio or video files (such as .wav, .mp3, or .mp4).
  2. Click on the address field, type cmd, and press Enter to open the command prompt in that directory.
  3. To transcribe an audio file, type:
    whisper sample_audio_1.wav
    
    If your file name has spaces, enclose it in quotes. Hit Enter to start transcription.

Whisper automatically detects the language spoken in the audio and writes the transcription to new files in the current directory, in several formats (such as .txt, .srt, .vtt, and .json).
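
The JSON output is the easiest one to post-process, because it keeps per-segment timestamps. The sketch below assumes the file holds a "segments" list whose entries have "start", "end", and "text" keys, which is how Whisper structures its result; the file name sample_audio_1.json is an assumption, so check what Whisper actually wrote to the folder.

```python
import json
from pathlib import Path


def format_time(seconds):
    """Convert seconds to an HH:MM:SS string, dropping fractions of a second."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def summarize(json_path):
    """Return one 'start - end  text' line per segment in a Whisper JSON file."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return [
        f"{format_time(seg['start'])} - {format_time(seg['end'])}  {seg['text'].strip()}"
        for seg in data.get("segments", [])
    ]


if __name__ == "__main__":
    target = Path("sample_audio_1.json")  # assumed name; adjust to your output file
    if target.exists():
        print("\n".join(summarize(target)))
    else:
        print(f"{target} not found; run whisper on an audio file first")
```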

Transcribing Multiple Files

To transcribe multiple files at once:

whisper sample_audio_1.wav sample_audio_2.wav
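
For a folder with many recordings, typing every name gets tedious. One possible approach is a small script that collects the files and passes them to the same whisper command. In this sketch, `build_command` and `AUDIO_SUFFIXES` are hypothetical names, and the suffix list simply mirrors the file types mentioned in this guide.

```python
import subprocess
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".mp3", ".mp4"}  # the file types mentioned in this guide


def build_command(folder="."):
    """Build one `whisper` call covering every matching file in the folder."""
    files = sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )
    return ["whisper", *files] if files else []


if __name__ == "__main__":
    command = build_command()
    if command:
        subprocess.run(command, check=True)  # a list argument needs no manual quoting
    else:
        print("No audio files found in this folder")
```

Because the file names are passed as a list, names containing spaces work without extra quotes.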

Using Different Models

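Whisper offers models in several sizes (tiny, base, small, medium, and large). Larger models are generally more accurate but run slower and need more memory. To specify a model when running Whisper, use the following command: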

whisper sample_audio_1.wav --model medium
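
The same choice is available from Python through the package's `load_model` and `transcribe` functions. The sketch below is one way to wrap them, not the only one: the `transcribe` wrapper and the `MODEL_SIZES` tuple are hypothetical, and the tuple lists only the basic sizes rather than every model name Whisper accepts.

```python
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")  # not exhaustive


def transcribe(audio_path, model_name="medium"):
    """Transcribe one file with the chosen model and return the text."""
    if model_name not in MODEL_SIZES:
        raise ValueError(f"unknown model {model_name!r}; expected one of {MODEL_SIZES}")
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Calling transcribe("sample_audio_1.wav", "medium") from a script in the audio folder should return the transcription as a single string.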

Language and Translation Options

You can also specify the language of the audio:

whisper german.wav --language de

Additionally, Whisper can translate the audio into English if you add the --task translate option to the command.
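
In Python, the equivalent settings are passed as keyword arguments to `model.transcribe`. The following sketch maps the two command-line flags onto those arguments; `build_options` and `translate_to_english` are hypothetical helper names, and the default model size is an arbitrary choice.

```python
def build_options(language=None, translate=False):
    """Map this guide's CLI flags onto keyword arguments for model.transcribe."""
    options = {"task": "translate" if translate else "transcribe"}
    if language is not None:
        options["language"] = language  # e.g. "de"; leaving it out keeps auto-detection
    return options


def translate_to_english(audio_path, language=None, model_name="medium"):
    """Return an English translation of the speech in audio_path."""
    import whisper  # lazy import: provided by the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_options(language, translate=True))
    return result["text"].strip()
```

For the German example above, translate_to_english("german.wav", language="de") would be the matching call.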

Checking Available Arguments

To see all the different arguments that Whisper AI supports, use:

whisper --help

Conclusion

Whisper AI is a powerful tool for transcribing and translating audio. Its automatic transcription works well, but you should still proofread the output and correct any misheard words.

FAQ

Q1: What are the system requirements for installing Whisper AI?

A1: Whisper AI requires Python (3.7 to 3.10 when this guide was written), PyTorch, and FFmpeg installed on your computer.

Q2: Can Whisper AI transcribe audio in different languages?

A2: Yes, Whisper AI supports transcription in roughly 100 languages, and it detects the language automatically unless you set one with --language.

Q3: Is Whisper AI free to use?

A3: Yes, Whisper AI is completely free to use.

Q4: Can I uninstall Whisper AI if I no longer need it?

A4: Yes. This guide does not include uninstallation steps, but the package can be removed with pip uninstall openai-whisper, and FFmpeg with choco uninstall ffmpeg.

Q5: How do I specify the model for transcription in Whisper AI?

A5: You can specify the model with the --model option followed by the model name (e.g., small or medium) when running Whisper AI.
