Installing WhisperX locally lets you use this powerful audio transcription tool without relying on cloud-based services. In this article, we’ll guide you through the local installation of WhisperX, regardless of your operating system.
To avoid getting lost, follow the steps in the order listed below. Each item links to the section that details the installation process for that program. Once you’ve completed a section, come back to this list and continue with the next step.
In this article you will find:
- Install Miniconda. See how to install miniconda3.
- Install CUDA (For users with NVIDIA GPUs). See how to install CUDA.
- How to Download Git on Windows, macOS, and Linux.
- Download the GitHub repository prepared by MISTER CONTENIDOS. See how to download the repository.
- Install WhisperX. See how to install WhisperX.
- How to use WhisperX locally. See usage guide.
- How to Translate Your Transcriptions
- Comparison of transcription times with Whisper and WhisperX.
- Relevant Conda Commands.
Miniconda Download and Installation
Installing Miniconda is essential to create the proper environment and install the necessary packages for WhisperX. Additionally, it will simplify the installation of other artificial intelligence projects and any other project that requires environment and package management.
To install Miniconda, follow these steps:
- Access the official Miniconda page: Open your browser and go to the Miniconda3 download page.
- Download the Miniconda3 installer: On the downloads page, look for the installers by platform and download the one corresponding to your operating system (Windows, macOS, or Linux).
- Run the installer: Once you’ve downloaded the installer, find it in your downloads folder and double-click to start the Miniconda3 installation process. The Miniconda3 installer window will open.
- Configure the installation options:
- In the first window, click Next.
- Accept the terms of use by clicking I Agree.
- In the “Install for” window, select Just Me and click Next.
- In the next window, select the location where you want to install Miniconda, leave the default option, and click Next.
- In the “installation options” window, select Create start menu shortcuts and click Install.
- Wait for the installation to complete: This process may take a few minutes. Once finished, you’ll have Miniconda installed on your computer.
Great! Now you have Miniconda installed on your operating system, and you’re ready for the next step.
CUDA Download and Installation Process
If your computer has an NVIDIA graphics card, you can install the CUDA Toolkit so that WhisperX can run on the GPU, which shortens transcription times considerably. If you don’t have a card from this brand, you can skip this step.
For detailed, up-to-date information about CUDA version compatibility, consult the official NVIDIA documentation and developer forums.
In this guide, we install CUDA 12.1.0 (February 2023) on an RTX 3060 12GB, where it works correctly.
To install CUDA on Windows, follow these steps:
- Verify the compatibility of your graphics card with CUDA: Before starting the installation, it’s important to ensure that your graphics card is compatible with CUDA. You can check the list of compatible graphics cards on the official CUDA GPUs page.
- Download the CUDA Toolkit installer: Visit the official CUDA Toolkit downloads page and select the CUDA version you want to install. Choose Windows as the operating system, your system architecture (e.g., x86_64), your Windows version (10 or 11), and the installer type (local exe). CUDA 11.8 and 12.1 are usually the most common.
- Run the installer as an administrator: Once you’ve downloaded the installer, right-click on the file and select the Run as administrator option to start the CUDA Toolkit installation process.
- Follow the CUDA Toolkit installer instructions: The installer will guide you through several steps. It’s recommended to keep the default options unless you have specific requirements.
- Verify the installation: Once the installation is complete, open a command prompt (cmd) and run the following command, which should display the version of the CUDA Toolkit you’ve installed:
nvcc --version
There you have it! With these steps, you should have completed the installation of CUDA on Windows, preparing your system for the development of applications that leverage the processing power of Nvidia GPUs.
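The verification step above can also be scripted. The following is a minimal Python sketch, not part of the original kit: the helper names `parse_nvcc_release` and `installed_cuda_release` are our own, and the parser assumes that the output of nvcc contains a phrase of the form "release 12.1".

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) parsed from nvcc's version text, or None."""
    # Assumption: the version text contains "release <major>.<minor>".
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run `nvcc --version` if nvcc is on PATH and parse the release."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # toolkit not installed, or not on PATH
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc was not found, or its version text was not recognized")
    else:
        print(f"CUDA Toolkit release {release[0]}.{release[1]}")
```

If the script reports that nvcc was not found right after installing, opening a new terminal so that the updated PATH is picked up is usually worth trying first.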
How to Download Git on Windows, macOS, and Linux
Git on Windows
To install Git on Windows, follow these steps:
- Visit the Git for Windows download page.
- Click the link to download the installer.
- Run the downloaded file and follow the installation wizard instructions.
- Make sure to select the default options.
- Once the installation is complete, open the command line (Cmd, PowerShell, or Git Bash) and verify the installation by running:
git --version
Git on macOS
To install Git on macOS, follow these steps:
- Open the Terminal.
- If you have Homebrew installed, you can install Git with the following command:
brew install git
If you don’t have Homebrew, you can install it by following the instructions on the official Homebrew website.
- Verify the installation by running in the Terminal:
git --version
Git on Linux
To install Git on Linux, follow these steps:
- Open the Terminal.
- Use your distribution’s package manager to install Git. Here are some examples based on the distribution:
- Debian/Ubuntu:
sudo apt update && sudo apt install git
- Fedora:
sudo dnf install git
- Arch Linux:
sudo pacman -S git
- Once the installation is complete, verify the installation by running:
git --version
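Before moving on, it can help to confirm that the tools installed so far are reachable from your terminal. Here is a small sketch under our own assumptions: the function names are hypothetical, and it only checks whether each executable is on the PATH, not whether it works. On Windows, conda may only be found when the script is run from the Miniconda3 terminal.

```python
import shutil


def find_tools(names):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """Return the names of the tools that could not be located."""
    return [name for name, path in found.items() if path is None]


if __name__ == "__main__":
    found = find_tools(["git", "conda"])
    for name, path in found.items():
        print(f"{name}: {path or 'not found'}")
    if missing_tools(found):
        print("Install the missing tools before continuing.")
```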
Downloading the GitHub Repository
To download the GitHub repository prepared for the installation and use of WhisperX, follow these steps:
- Download with Git: Find the Miniconda3 terminal and open it, then run the following command:
git clone https://github.com/rgcodeai/Kit-Whisperx.git
This will download the repository for this project.
- Download the repository without Git: Go to our GitHub repository Kit-Whisperx, click the green Code button at the top, and select Download ZIP. Unzip the file on your desktop.
How to Download and Install WhisperX Locally
To install the packages WhisperX needs, we have two options. The first is an almost automatic installation using the “environment-cpu” or “environment-cuda” files. If that process fails (which is unlikely, but not impossible), we can fall back on the manual installation.
Automatic Installation of WhisperX
Each of the files has a special configuration, depending on the characteristics of your PC. In particular, the “environment-cpu” file is intended for users who do not have an NVIDIA graphics card or who have graphics cards from other brands. Meanwhile, the “environment-cuda” file is specifically for users who have NVIDIA GPUs.
Recommendation: Before installing, open the Anaconda Prompt (Miniconda3) terminal exclusively for this process, making sure you haven’t used it previously for another task.
Follow these steps for the installation:
- Open the Anaconda Prompt (Miniconda3) terminal.
- In the terminal, navigate to the project folder downloaded from GitHub.
- Once in the project folder, use one of the following commands to start the installation process:
- For users with Nvidia GPU:
conda env create -f environment-cuda.yml
- For users without Nvidia GPU:
conda env create -f environment-cpu.yml
- Once the installation process is complete, run the following command to activate the environment and start WhisperX locally (the first run takes a little longer):
- Windows:
conda activate whisperx-web-ui & python app.py
- Unix (Linux/macOS):
conda activate whisperx-web-ui && python app.py
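If you are unsure which of the two environment files applies to your machine, the choice described above can be sketched in Python. This is only an illustration: the function names are our own, and looking for the `nvidia-smi` executable is a heuristic we assume indicates that an NVIDIA driver is installed. It does not prove that the GPU is compatible with CUDA.

```python
import shutil


def choose_environment_file(has_nvidia_gpu):
    """Return the environment file name from the repository to use."""
    return "environment-cuda.yml" if has_nvidia_gpu else "environment-cpu.yml"


def looks_like_nvidia_system():
    """Heuristic (assumption): nvidia-smi on PATH suggests an NVIDIA driver."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    env_file = choose_environment_file(looks_like_nvidia_system())
    # Print the command instead of running it, so nothing is installed by accident.
    print(f"conda env create -f {env_file}")
```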
Manual Installation of WhisperX
To manually configure WhisperX in your environment, follow these detailed steps. This process includes creating a new conda environment and installing the necessary dependencies based on the characteristics of your PC.
Recommendation: Before installing, open the Anaconda Prompt (Miniconda3) terminal exclusively for this process, making sure you haven’t used it previously for another task.
Steps for the installation:
- Create a new Conda environment: Open the Anaconda Prompt (Miniconda3) terminal and create a new environment called whisperx-web-ui with the following command:
conda create --name whisperx-web-ui python=3.10
- Activate the environment: Activate the new environment you just created with the following command
conda activate whisperx-web-ui
- Install PyTorch and Torchaudio:
- For users with Nvidia GPU:
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
- For users without GPU:
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 cpuonly -c pytorch
- Install additional dependencies: Execute each command separately:
conda install conda-forge::gradio
conda install conda-forge::ffmpeg
- Install WhisperX: Finally, install WhisperX using the following command
pip install whisperx
With these steps, you will have manually configured WhisperX in your conda environment. Now you are ready to use the WhisperX web interface and take advantage of its audio processing capabilities.
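After a manual installation, it is worth checking whether PyTorch can actually see your GPU. The sketch below is our own helper (the name `describe_torch_setup` is hypothetical). Run it with the whisperx-web-ui environment activated. It relies only on `torch.cuda.is_available()` and `torch.cuda.get_device_name()`, and it falls back gracefully when PyTorch is not installed.

```python
def describe_torch_setup():
    """Summarize whether PyTorch is installed and whether it can use CUDA."""
    try:
        import torch  # imported here so the script also runs without PyTorch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        device_name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__} can use CUDA on: {device_name}"
    return f"PyTorch {torch.__version__} is installed, running on CPU only"


if __name__ == "__main__":
    print(describe_torch_setup())
```

If you installed the GPU build and the script still reports CPU only, the driver or the CUDA installation is the first place to look.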
How to Use WhisperX Locally
Once you have completed the installation process, open the Miniconda3 terminal and run one of the following commands to activate the WhisperX interface:
- Windows:
cd Kit-Whisperx & conda activate whisperx-web-ui & python app.py
- Unix (Linux/macOS):
cd Kit-Whisperx && conda activate whisperx-web-ui && python app.py
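The web interface is the route this guide follows, but the same environment can also be used from a Python script. The sketch below follows the usage shown in the WhisperX README (`load_model`, `load_audio`, `transcribe`). The helper names, the default model, and the file name `audio.mp3` are our own placeholders, and we do not claim that app.py works the same way internally.

```python
def pick_settings(use_gpu):
    """Return (device, compute_type): float16 on GPU, int8 on CPU."""
    return ("cuda", "float16") if use_gpu else ("cpu", "int8")


def transcribe(audio_path, model_name="large-v2", use_gpu=True, batch_size=16):
    """Transcribe one audio file and return the list of segments."""
    import whisperx  # imported lazily; requires the whisperx-web-ui environment

    device, compute_type = pick_settings(use_gpu)
    model = whisperx.load_model(model_name, device, compute_type=compute_type)
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=batch_size)
    return result["segments"]


# Example call (placeholder file name, not executed here):
# for segment in transcribe("audio.mp3", use_gpu=False):
#     print(segment["start"], segment["end"], segment["text"])
```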
How to Translate Your Transcriptions
To translate your transcriptions, follow these steps:
- Sign up for Claude.ai.
- Use the following prompt, adjusting the fields in brackets to your needs:
Translate the following transcript in [SRT] format following these instructions to perform the translation.
- Perform the translation in a way that sounds natural in [TARGET LANGUAGE], as if written by a native speaker.
- Adjust the translation to make sense in the target language, without changing the original purpose of the message.
- Interpret the speaker's intention so that each sentence is expressed in the way a native speaker of the target language would say it.
- Do not modify the sentence timings.
- The source language is [ORIGINAL LANGUAGE]
"""
PASTE YOUR TRANSCRIPT HERE
"""
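Because the prompt asks the model not to modify the sentence timings, it is worth checking the translated file before using it. The following is a minimal sketch with hypothetical function names. It assumes standard SRT timing lines of the form `00:00:00,000 --> 00:00:02,500` and compares them between the original and the translation. The sample subtitles are invented for illustration.

```python
import re

# Matches one SRT timing line, e.g. "00:00:00,000 --> 00:00:02,500".
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_timings(text):
    """Return every timing line found in an SRT document, in order."""
    return TIMING.findall(text)


def timings_match(original, translated):
    """True when both documents contain exactly the same timing lines."""
    return srt_timings(original) == srt_timings(translated)


original = (
    "1\n00:00:00,000 --> 00:00:02,500\nHola a todos.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nBienvenidos al canal.\n"
)
translated = (
    "1\n00:00:00,000 --> 00:00:02,500\nHi everyone.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nWelcome to the channel.\n"
)

if __name__ == "__main__":
    print("Timings preserved:", timings_match(original, translated))
```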
Transcription Times with Whisper and WhisperX
In this section, we will explore how long Whisper and WhisperX take to convert a 13 minute and 38 second Spanish audio to text. We performed this comparison using an Nvidia RTX 3060 12 GB graphics card and a Ryzen 7 5700X processor.
The models we tested are: Large-v2, Medium, Small, and Base. Here are the results we obtained.
Whisper and WhisperX on CPU Ryzen 7 5700X
Model | Whisper | WhisperX |
---|---|---|
Large-v2 | 23:10 min | 20:53 min |
Medium | 12:48 min | 7:44 min |
Small | 4:58 min | 5:43 min |
Base | 1:54 min | 3:40 min |
Comparing the times, WhisperX is faster than Whisper with the Medium and Large-v2 models, while Whisper is faster with the Small and Base models. The following graph will make this comparison easier to see:
Whisper and WhisperX on RTX 3060 – GPU (CUDA)
Model | Whisper | WhisperX |
---|---|---|
Large-v2 | 3:35 min | 1:25 min |
Medium | 2:41 min | 52.5 sec |
Small | 1:32 min | 32.7 sec |
Base | 48 sec | 23.9 sec |
The difference in transcription times is even more apparent when using a graphics card (GPU). WhisperX is considerably faster in all cases. In particular, with the Large-v2 model, the ratio between the time it takes and the quality of the resulting text is exceptional.
Relative Efficiency
- WhisperX proves to be more efficient in GPU utilization, especially in larger models like Large-v2.
- Whisper shows faster transcription times in the Small and Base models when using the CPU, but its efficiency on the GPU is lower compared to WhisperX.
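To put numbers on these observations, the times from both tables can be turned into speed ratios. The sketch below is our own illustration. It assumes the Large-v2 GPU time for Whisper means 3 minutes 35 seconds, and it divides the Whisper time by the WhisperX time, so values above 1 mean WhisperX was faster.

```python
def to_seconds(text):
    """Convert 'M:SS' or a plain number of seconds (e.g. '52.5') to seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)


# (Whisper, WhisperX) times copied from the tables above.
CPU = {
    "Large-v2": ("23:10", "20:53"),
    "Medium": ("12:48", "7:44"),
    "Small": ("4:58", "5:43"),
    "Base": ("1:54", "3:40"),
}
GPU = {
    "Large-v2": ("3:35", "1:25"),  # assumption: Whisper took 3 min 35 s
    "Medium": ("2:41", "52.5"),
    "Small": ("1:32", "32.7"),
    "Base": ("48", "23.9"),
}


def speedup(whisper_time, whisperx_time):
    """Ratio of Whisper time to WhisperX time; above 1 favors WhisperX."""
    return to_seconds(whisper_time) / to_seconds(whisperx_time)


if __name__ == "__main__":
    for label, table in (("CPU", CPU), ("GPU", GPU)):
        for model, (whisper_time, whisperx_time) in table.items():
            ratio = speedup(whisper_time, whisperx_time)
            print(f"{label} {model}: {ratio:.2f}")
```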
Relevant Conda Commands
Conda is a very useful package and virtual environment manager for Python application development. Throughout the process of installing and configuring WhisperX, we used several Conda commands. In this section, we’ll review the most relevant ones, along with the basic terminal commands for moving between directories, which can be helpful in this and other similar projects.
- Navigate between directories:
cd directory_path
- Go back one directory:
cd ..
- Create a new virtual environment:
conda create --name environment_name python=version
- Activate a virtual environment:
conda activate environment_name
- Deactivate a virtual environment:
conda deactivate
- Remove a virtual environment:
conda env remove --name environment_name
- List all virtual environments:
conda env list
- Install a package in the active environment:
conda install package_name
- Remove a package from the active environment:
conda remove package_name
- Update a package in the active environment:
conda update package_name
- Save the environment state to a YML file:
conda env export > environment.yml
- Create a new environment from a YML file:
conda env create -f environment.yml
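The environment list can also be read from a script, for example to check that whisperx-web-ui exists before trying to activate it. This sketch assumes that `conda env list --json` returns a JSON object with an `envs` list of paths. The helper names are our own. Note that the base environment appears under its installation folder name (for example miniconda3) rather than as "base".

```python
import json
import os
import shutil
import subprocess


def env_names(env_paths):
    """Reduce environment paths to their folder names."""
    return [os.path.basename(os.path.normpath(path)) for path in env_paths]


def list_conda_env_paths():
    """Return environment paths reported by conda, or [] if unavailable."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    result = subprocess.run(
        [conda, "env", "list", "--json"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return []
    try:
        return json.loads(result.stdout).get("envs", [])
    except ValueError:
        return []  # output was not valid JSON


if __name__ == "__main__":
    names = env_names(list_conda_env_paths())
    print("whisperx-web-ui found:", "whisperx-web-ui" in names)
```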