Benchmarking speech-to-text on long-form audio
Comparing 8 STT models on a 27-minute podcast. Local Whisper wins on word accuracy, but cloud APIs dominate punctuation.
A short curated list of the best Whisper fine-tuning resources: tutorials, notebooks, and managed compute examples.
Evaluating whether fine-tuning Whisper improves transcription accuracy. Spoiler: it depends on model size and use case.
A script for fine-tuning OpenAI's Whisper speech recognition models using Modal's serverless GPU infrastructure.