# Project - Generate podcast transcript using whisper.cpp
## What's whisper.cpp
```whisper.cpp``` is a C/C++ port of OpenAI's Whisper speech-to-text model, optimized to run quickly on local CPUs, even without a GPU. It transcribes audio files to text with Whisper models in a lightweight, portable way.
### ✅ Key Features
- Runs offline – No internet or cloud API needed
- No Python required – Great for embedding into C/C++ or other native apps
- Multi-language support – Supports transcription in many languages
- Real-time/streaming – Possible with microphone or streaming input
- Small binary – Lightweight and portable
### 📦 Model Support
- It uses Whisper models from OpenAI in ggml format, in sizes such as tiny, base, small, medium, and large
## How to generate transcript
### Install FFmpeg
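- FFmpeg is needed to convert the audio file into a .wav file
- On macOS: run ```brew install ffmpeg```
- Verify installation: ```ffmpeg -version```
- Convert audio into .wav format (```-ar 16000``` resamples to 16 kHz, ```-ac 1``` mixes down to mono, ```-y``` overwrites an existing output file):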
```
ffmpeg -y -i [input_file.mp3] -ar 16000 -ac 1 [output].wav
```
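To convert several files, the command above can be wrapped in a small helper. This is a minimal sketch, assuming bash; ```wav_name``` and ```to_wav``` are hypothetical names chosen for illustration, not part of FFmpeg.

```bash
#!/bin/bash

# Hypothetical helper: derive the .wav path that sits next to the input file.
wav_name() {
    local input="$1"
    local dir base
    dir="$(dirname -- "$input")"
    base="$(basename -- "$input")"
    # ${base%.*} strips the last extension, if there is one
    printf '%s/%s.wav\n' "$dir" "${base%.*}"
}

# Hypothetical helper: convert one audio file to 16 kHz mono WAV.
to_wav() {
    local input="$1"
    local output
    output="$(wav_name "$input")"
    # ffmpeg cannot read and write the same file, so refuse that case
    if [[ "$input" -ef "$output" ]]; then
        echo "Error: output would overwrite the input -> $input" >&2
        return 1
    fi
    ffmpeg -y -i "$input" -ar 16000 -ac 1 "$output"
}

# Example: convert every .mp3 in the current directory
# for f in *.mp3; do to_wav "$f"; done
```

The helper writes the .wav next to the original file, so ```to_wav ~/Downloads/audio.mp3``` would produce ```~/Downloads/audio.wav```.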
### Install whisper.cpp
- GitHub link: https://github.com/ggml-org/whisper.cpp
- Follow the quick start
- Clone the repo: ```git clone https://github.com/ggml-org/whisper.cpp.git```
- Navigate into the directory: ```cd whisper.cpp```
- Download the model in ggml format: ```sh ./models/download-ggml-model.sh medium```
- Build the project:
```
cmake -B build
cmake --build build --config Release
```
- Run whisper-cli with default settings and redirect its output to transcript.txt:
```
./build/bin/whisper-cli -m ./models/ggml-medium.bin [path_of_target_audio].wav > transcript.txt
```
- Transcribe Chinese audio by adding ```-l zh```:
```
./build/bin/whisper-cli -m ./models/ggml-medium.bin [path_of_target_audio].wav -l zh > transcript.txt
```
- List all available flags: run ```./build/bin/whisper-cli -h```
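The model download can be skipped when the file is already on disk. The following is a minimal sketch, assuming it is run from inside the ```whisper.cpp``` directory and that the download script stores each model as ```models/ggml-<name>.bin``` (the pattern seen with ```ggml-medium.bin``` above); ```model_path``` and ```ensure_model``` are hypothetical helper names.

```bash
#!/bin/bash
# Assumption: the current directory is the whisper.cpp checkout.

# Path where a model is assumed to be stored, following the
# ggml-medium.bin naming used in this guide.
model_path() {
    printf 'models/ggml-%s.bin\n' "$1"
}

# Hypothetical helper: download a model only if its file is missing.
ensure_model() {
    local name="${1:-medium}"   # default to the model used in this guide
    local path
    path="$(model_path "$name")"
    if [ -f "$path" ]; then
        echo "model already present: $path"
    else
        sh ./models/download-ggml-model.sh "$name"
    fi
}

# Example: ensure_model medium
```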
### Automate the process with a shell script
- Command line input: audio filename (looked up in ```~/Downloads```) and an optional language code (defaults to ```en```), for example:
  - ```audio.mp3 zh```
  - ```audio.m4a en```
- Convert the audio into .wav format
- Transcribe the audio into a .txt file
```bash
#!/bin/bash
# Usage: ./transcribe.sh <audio_filename> [language]
set -e  # stop at the first failing command (e.g. a failed ffmpeg conversion)

# === USAGE CHECK ===
if [ $# -lt 1 ] || [ $# -gt 2 ]; then
    echo "Usage: $0 <audio_filename> [language]"
    echo "Example: $0 audio.mp3 zh"
    exit 1
fi
FILENAME="$1"
LANGUAGE="${2:-en}"  # language code passed to whisper-cli -l; defaults to English

# === CONFIGURATION ===
# Relative paths: run the script from the directory that contains whisper.cpp/
INPUT_DIR="${HOME}/Downloads"
OUTPUT_DIR="${HOME}/Downloads"
MODEL_PATH="./whisper.cpp/models/ggml-medium.bin"
WHISPER_BIN="./whisper.cpp/build/bin/whisper-cli"

# === DERIVED NAMES ===
INPUT_FILE="${INPUT_DIR}/${FILENAME}"
BASENAME="${FILENAME%.*}"
WAV_FILE="${OUTPUT_DIR}/${BASENAME}.wav"
TXT_FILE="${OUTPUT_DIR}/${BASENAME}.txt"

# === CHECK FILE EXISTS ===
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: File not found -> $INPUT_FILE"
    exit 1
fi

# === MAKE OUTPUT DIRECTORY ===
mkdir -p "$OUTPUT_DIR"

# === CONVERT TO WAV ===
echo ">> Converting $FILENAME to WAV..."
ffmpeg -y -i "$INPUT_FILE" -ar 16000 -ac 1 "$WAV_FILE"

# === TRANSCRIBE WITH WHISPER ===
# -otxt with -of writes the transcript to "$OUTPUT_DIR/$BASENAME.txt" ($TXT_FILE),
# so stdout is not redirected into the same file. -nt omits timestamps.
echo ">> Transcribing $BASENAME.wav..."
"$WHISPER_BIN" -m "$MODEL_PATH" -otxt -nt -l "$LANGUAGE" -of "$OUTPUT_DIR/$BASENAME" "$WAV_FILE"
echo ">> Done. Transcript saved to: $TXT_FILE"
```
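For a backlog of episodes, the script can be called in a loop. The sketch below assumes the script above is saved as ```./transcribe.sh``` and is executable; ```transcribe_all``` and the ```TRANSCRIBE_CMD``` variable are hypothetical names introduced here for illustration. It keeps going when one file fails and reports the failures at the end.

```bash
#!/bin/bash

# Hypothetical wrapper: run the transcription script once per file.
# TRANSCRIBE_CMD can point at the script if it lives somewhere else.
transcribe_all() {
    local cmd="${TRANSCRIBE_CMD:-./transcribe.sh}"
    local failed=0
    local f
    for f in "$@"; do
        # Continue with the next file even if this one fails
        if ! "$cmd" "$f"; then
            echo "failed: $f" >&2
            failed=$((failed + 1))
        fi
    done
    echo "processed $# file(s), $failed failed"
    # Succeed only when every file was transcribed
    [ "$failed" -eq 0 ]
}

# Example: transcribe_all episode1.mp3 episode2.m4a
```

Each argument is a filename as the script expects it, that is, relative to the configured input directory.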