Support for Third-Party ASR, TTS, and Voice Biometrics
Automatic Speech Recognition (ASR)
Kore supports the following third-party service providers for ASR services:
ASR | On-Prem / Cloud | Languages | Regions | Word Error Rate (WER) | Comments |
---|---|---|---|---|---|
Google | Cloud | Supported Languages | 1. Locations v2 - Docs 2. Regions | 4–9% | 1. Good for short utterances such as 'yes' and 'no'. 2. Good for numeric and alphanumeric inputs (for example, IDs and SSNs). 3. Supports class tokens, so the output can be formatted to some extent. 4. Hints and hint boosts are supported. 5. Extensive language support. |
Deepgram | Cloud & On-Prem | Supported Languages | Supports all regions across the globe | 3.44% | 1. Hints are supported. 2. Custom models are available through the Deepgram technical team. 3. Smart formatting for inputs such as numbers and dates. |
Azure | Cloud & On-Prem | Supported Languages | Regions | 5–10% | 1. Preferred ASR provider; it should be the default for new accounts. 2. Low WER and considerable flexibility with custom models. 3. Hints are supported. 4. Extensive language support. 5. Custom language models can be created and deployed on a self-service (DIY) basis through the Azure portal. |
Nvidia Riva (Nvidia) | On-Prem | ASR Overview | - | 6.67% | |
Amivoice ASR (Advanced Media Inc) | Cloud | Supported Languages | Processing and storage primarily based in Japan | N/A | |
Amazon Transcribe | Cloud | Supported Languages | Regions | 2.60% | |
Gnani.AI | Cloud & On-Prem | Supported Languages | Deployable in customer-specified region (private cloud or on-premises) | 2% | |
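
The WER column reports word error rate, conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the ASR transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition in Python. It is not how the figures in the table were measured (the source does not describe the test sets or text normalization used), and the function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration: tokenization is a simple
    lowercase whitespace split, which real evaluations usually refine.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "my account number is one two three"
    hypothesis = "my account number is one three three"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

In this example one of the seven reference words is misrecognized, so the WER is 1/7. Figures from different vendors are only directly comparable when they are measured on the same audio with the same text normalization.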
Text to Speech (TTS)
Kore supports the following third-party service providers for TTS services:
TTS | On-Prem / Cloud | Languages | Regions | Comments |
---|---|---|---|---|
Google | Cloud | Supported Voices | Operates within Google Cloud’s global infrastructure | |
Azure | Cloud & On-Prem | Supported Languages | Regions | 1. Extensive language support. 2. A large number of voices. 3. Custom voices can be prepared through the portal. 4. SSML support (limited to the tags that Microsoft Azure supports). |
OpenAI TTS | Cloud | Supported Languages | – | 1. Human-like voices. 2. Limited number of voices. |
Eleven Labs | Cloud | Docs | – | 1. Human-like voices. 2. Speed, temperature, and stability can be controlled through call control parameters. 3. Voice cloning is possible with 30-second to 1-minute voice samples. |
AWS | Cloud | Supported Languages | Regions | |
Gnani.AI | Cloud & On-Prem | API Service | – | |
playHT | Cloud & On-Prem | FAQ | – | |
Deepgram | Cloud & On-Prem | Supported Languages | – | 1. Limited number of languages. 2. Human-like voices. |
Nvidia Riva TTS | On-Prem | – | – | |
Voice Biometrics
Kore supports the following third-party service providers for voice biometrics:
Voice Biometric Vendor | Voice Biometric Engine | On-Prem / Cloud | Comments |
---|---|---|---|
ID R&D | ID Voice | – | – |