CHIM Speech-to-Text Services
This page documents the speech-to-text services that HerikaServer currently uses for CHIM. The grouping follows the quickstart flow: recommended services first, then the other supported services.
Recommended
Deepgram
Hosted speech-to-text service and one of the two recommended baseline choices. The current schema supports language selection and a model picker that includes newer Deepgram models.
- Best for: hosted microphone transcription.
- Needs: Deepgram API key.
- Current model examples: nova-3, nova-2, whisper-medium.
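To make the API key, model, and language settings concrete, here is a minimal Python sketch of a direct request to Deepgram's public pre-recorded audio endpoint. It is not HerikaServer's own code. The function name `build_deepgram_request` and the WAV content type are assumptions made for illustration, and the request is only built, not sent.

```python
import urllib.request
from urllib.parse import urlencode

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_deepgram_request(api_key, audio_bytes, model="nova-3", language="en"):
    """Build (but do not send) a transcription request.

    Hypothetical helper: model and language mirror the two settings
    described above; the audio is assumed to be WAV.
    """
    query = urlencode({"model": model, "language": language})
    return urllib.request.Request(
        f"{DEEPGRAM_LISTEN_URL}?{query}",
        data=audio_bytes,
        headers={
            # Deepgram expects the key in a "Token <key>" Authorization header.
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the audio. The JSON response then carries the transcript.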
Parakeet
The other recommended speech-to-text option in HerikaServer. The speech-to-text connector grouping lists it as a first-class recommended service.
- Best for: the recommended alternative to Deepgram.
- Current schema focus: language selection.
- Good fit when you want the current CHIM-recommended non-Deepgram route.
Other Services
OpenAI Whisper
Hosted OpenAI Whisper speech-to-text. The current schema supports language selection and an optional translate-to-English mode.
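As a rough illustration of how the two options could map onto OpenAI's public audio API, the Python sketch below picks between the transcription and translation endpoints. Whether HerikaServer implements translate-to-English this way is an assumption, and `whisper_request_plan` is a hypothetical helper name.

```python
OPENAI_AUDIO_BASE = "https://api.openai.com/v1/audio"


def whisper_request_plan(language="", translate_to_english=False):
    """Return (endpoint_url, form_fields) for a Whisper request.

    Hypothetical helper: the audio file itself would be attached as a
    multipart "file" field when the request is actually sent.
    """
    fields = {"model": "whisper-1"}
    if translate_to_english:
        # The translations endpoint always produces English output,
        # so no language hint is sent.
        return f"{OPENAI_AUDIO_BASE}/translations", fields
    if language:
        # Optional ISO-639-1 hint, for example "de" for German.
        fields["language"] = language
    return f"{OPENAI_AUDIO_BASE}/transcriptions", fields
```

For example, `whisper_request_plan("de")` targets the transcriptions endpoint with a German language hint, while `whisper_request_plan("de", translate_to_english=True)` targets the translations endpoint and drops the hint.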
Local Whisper
The DwemerDistro-installed local Whisper endpoint. This is the route to use if you want your speech recognition to stay on your own machine instead of using a hosted speech-to-text provider.
Gemini
Google Gemini speech-to-text plus emotion detection. The current schema supports Google AI API credentials, language selection, and a choice of Gemini models.
Azure
Azure speech-to-text with language and profanity handling controls. Use this if Azure is already your preferred speech stack.
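For orientation, the language and profanity controls correspond to query parameters on Azure's REST endpoint for short audio. The Python sketch below builds such a URL. It assumes the public Azure Speech REST API rather than HerikaServer's connector, and `azure_stt_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Profanity modes accepted by the Azure short-audio REST endpoint.
PROFANITY_MODES = ("masked", "removed", "raw")


def azure_stt_url(region, language="en-US", profanity="masked"):
    """Build the recognition URL for a given Azure region (hypothetical helper)."""
    if profanity not in PROFANITY_MODES:
        raise ValueError(f"profanity must be one of {PROFANITY_MODES}")
    query = urlencode({"language": language, "profanity": profanity})
    return (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
```

The subscription key is not part of the URL. It would be sent separately in the `Ocp-Apim-Subscription-Key` request header.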
Inworld
Inworld speech-to-text with provider/model identifiers and BCP-47 language codes. This is the speech-to-text path to use if you are already building around Inworld services.
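Since the language setting expects BCP-47 codes, a quick shape check can catch values such as `en_US` or `english` before they reach the provider. The Python sketch below is a deliberately simplified check, not a full BCP-47 parser and not part of Inworld's API: it accepts only language, optional script, and optional region subtags in their conventional casing, and `looks_like_bcp47` is a hypothetical helper name.

```python
import re

# Simplified pattern: language[-Script][-REGION], for example "en",
# "en-US", or "zh-Hans-CN". Real BCP-47 tags are case-insensitive and
# allow more subtags (variants, extensions) than this covers.
_BCP47_SIMPLE = re.compile(r"^[a-z]{2,3}(-[A-Z][a-z]{3})?(-([A-Z]{2}|\d{3}))?$")


def looks_like_bcp47(tag):
    """Return True if tag has the common language[-Script][-REGION] shape."""
    return _BCP47_SIMPLE.match(tag) is not None
```

Under this check, `en-US` and `zh-Hans-CN` pass, while `en_US` and `english` are rejected.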


