Support the new TTS and STT models from OpenAI
planned
Ngoc Nguyen
Merged in a post:
Proposal: Switching from Whisper to the new GPT-4o-transcribe model
Serg
I'd like to suggest an improvement that would significantly enhance transcription accuracy.
## Current issue
Transcripts of Russian-language audio contain numerous errors. Fixing them requires extensive manual correction, which slows work down.
## Solution: Transition to GPT-4o-transcribe
OpenAI has released a new transcription model, GPT-4o-transcribe, which according to OpenAI's announcement substantially outperforms the current Whisper models: https://openai.com/index/introducing-our-next-generation-audio-models/
### Key advantages:
- Significantly improved accuracy (lower word error rate), benchmarked on more than 100 languages
- Better recognition of accents and regional speech patterns
- Greater resilience to background noise in recordings
- Better handling of varying speech speeds
- Fewer misrecognitions of complex words
- Better understanding of context and recognition of specialized terminology
## Simple integration
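This enhancement should require minimal effort: the new model is served through the same OpenAI transcription endpoint as Whisper, so if the app already calls Whisper through the OpenAI API, the change is mostly a matter of switching the model name.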
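To illustrate how small the change could be, here is a minimal sketch using the official `openai` Python package. It assumes the app currently sends audio to OpenAI's transcription endpoint with `whisper-1`; the helper names `transcription_params` and `transcribe` are made up for this example and are not taken from the app's code.

```python
from typing import Optional

# Assumption: the app currently passes "whisper-1" here.
DEFAULT_MODEL = "gpt-4o-transcribe"


def transcription_params(model: str = DEFAULT_MODEL,
                         language: Optional[str] = None) -> dict:
    """Build the keyword arguments for the transcription request.

    `language` is an optional ISO-639-1 hint such as "ru".
    """
    params = {"model": model}
    if language:
        params["language"] = language
    return params


def transcribe(path: str, model: str = DEFAULT_MODEL,
               language: Optional[str] = None) -> str:
    """Send one audio file to the OpenAI transcription endpoint."""
    # Imported lazily so the helper above works without the package installed.
    # Requires `pip install openai` and the OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(
            file=audio, **transcription_params(model, language)
        )
    return result.text
```

With this shape, switching back to Whisper for comparison would just be `transcribe("meeting.mp3", model="whisper-1", language="ru")`, where `meeting.mp3` is a placeholder file name.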
Ngoc Nguyen
planned
Diego Sala
Isn't the new ElevenLabs "Scribe" even better?
Serg
@Diego Sala It would be great if we could add several transcription providers, such as OpenAI and ElevenLabs, so we can see which one actually performs better in practice.
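One rough way to compare providers in practice would be to transcribe the same recording with each and score the results against a hand-corrected reference using word error rate. The sketch below is only an illustration: the function names are hypothetical and the transcripts in the usage note are placeholder strings, not real provider output.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def rank_providers(reference: str, transcripts: dict) -> list:
    """Return (provider, WER) pairs sorted from best to worst."""
    scores = [(name, word_error_rate(reference, text))
              for name, text in transcripts.items()]
    return sorted(scores, key=lambda pair: pair[1])
```

For example, `rank_providers("привет как дела", {"provider_a": "привет как дела", "provider_b": "привет так дела"})` would list `provider_a` first. A real comparison would also need to normalize punctuation and use more than one recording.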