Support for Third-Party ASR, TTS, and Voice Biometrics

Automatic Speech Recognition (ASR)

We support the following third-party service providers for ASR services:

ASR	On-Prem / Cloud	Languages	Regions	Word Error Rate (WER)	Comments
Google	Cloud	Supported Languages	1. Locations v2 - Docs 2. Regions	4–9%	1. Good for short utterances such as 'yes' and 'no'. 2. Good for numeric and alphanumeric inputs (for example, IDs and SSNs). 3. Supports class tokens, so the output format can be controlled to some extent. 4. Hints and Hint-hosts are supported. 5. Extensive language support.
Deepgram	Cloud & On-Prem	Supported Languages	Supports all regions across the globe	3.44%	1. Hints are supported. 2. Custom models are available through the Deepgram technical team. 3. Smart formatting for inputs such as numbers and dates.
Azure	Cloud & On-Prem	Supported Languages	Regions	5–10%	1. Preferred ASR provider. It should be the default for new accounts. 2. Low WER and a lot of flexibility with custom models. 3. Hints are supported. 4. Extensive language support. 5. Custom language models can be created and deployed self-service (DIY) through the Azure portal.
Nvidia Riva (Nvidia)	On-Prem	ASR Overview	-	6.67%
Amivoice ASR (Advanced Media Inc)	Cloud	Supported Languages	Processing and storage primarily based in Japan	N/A
Amazon Transcribe	Cloud	Supported Languages	Regions	2.60%
gnani.ai	Cloud & On-Prem	Supported Languages	Deployable in customer-specified region (private cloud or on-premises)	2%
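
The WER column above is easier to interpret with the standard definition in mind: WER = (substitutions + deletions + insertions) / number of words in the reference transcript. The following is a minimal Python sketch of that calculation using a word-level edit distance. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for illustration; the figures in the table come from the providers' own test sets and normalization rules, which this sketch does not reproduce.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate(
        "my account number is one two three",
        "my account number is one too three",
    )
    print(f"{wer:.2%}")  # one substitution in seven words: 14.29%
```

Because WER depends heavily on the audio, vocabulary, and normalization used, the percentages in the table are best compared only as rough indicators, not as results measured on a common test set.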

Text-to-Speech (TTS)

We support the following third-party service providers for TTS services:

TTS	On-Prem / Cloud	Languages	Regions	Comments
Google	Cloud	Supported Voices	Operates within Google Cloud’s global infrastructure
Azure	Cloud & On-Prem	Supported Languages	Regions	1. Extensive language support. 2. A large number of voices. 3. Custom voices can be prepared through the portal. 4. SSML support (limited to the tags that Azure supports).
OpenAI TTS	Cloud	Supported Languages	–	1. Human-like voices. 2. Limited number of voices.
Eleven Labs	Cloud	Docs	–	1. Human-like voices. 2. Speed, temperature, and stability can be controlled through call control parameters. 3. Voice cloning is possible with voice samples of 30 seconds to 1 minute.
AWS	Cloud	Supported Languages	Regions
gnani.ai	Cloud & On-Prem	API Service	–
Deepgram	Cloud & On-Prem	Supported Languages	–	1. Limited number of languages. 2. Human-like voices.
Nvidia Riva TTS	On-Prem	–	–
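
For the SSML support noted in the Azure row, the following Python sketch shows what a small SSML document might look like when built programmatically. It uses only the `speak`, `voice`, and `prosody` elements; the helper name `build_ssml`, the default voice `en-US-JennyNeural`, and the default rate are assumptions chosen for illustration, and how the resulting string is passed to the platform is not covered here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str,
               voice: str = "en-US-JennyNeural",  # assumed example voice name
               lang: str = "en-US",
               rate: str = "medium") -> str:
    """Build a minimal SSML document with one voice and one prosody element."""
    ssml = (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        # escape() protects characters such as & and < in the prompt text.
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</voice></speak>"
    )
    # Parsing the result raises ParseError if the markup is not well-formed XML.
    ET.fromstring(ssml)
    return ssml


if __name__ == "__main__":
    print(build_ssml("Your balance is 42 dollars & 10 cents.", rate="slow"))
```

Tags outside the set Azure supports may be rejected or ignored, so it is worth checking each tag against the Azure SSML documentation before using it in a prompt.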

Voice Biometrics

We support the following third-party service providers for voice biometrics:

Voice Biometric Vendor	Voice Biometric Engine	On-Prem / Cloud	Comments
ID R&D	ID Voice	–	–
