
Enhanced speech recognition model is now available
62% Word Error Rate (WER) improvement for US English

62% Word Error Rate (WER) improvement for US English

Following Google’s release of new Speech API, we are happy to announce improved quality of call records transcription.

New integrations for Voice AI have arrived: Google's Gemini 2.0 Flash model, featuring seamless voice-to-voice conversation capabilities and ElevenLabs low-latency streaming speech synthesis are now available for Voximplant developers

Connect any Voximplant call to ElevenLabs Conversational AI agents

New Features in Voximplant Kit: Update overview We are constantly working to improve our product to make it easier to use and more effective for you. In this update, we have added several useful features. Here’s what’s new:

OpenAI has recently announced GA version of their Realtime API that Voximplant now fully supports

Discover the future of tech at LEAP 2025 in Riyadh! Join Voximplant as we dive into the latest AI innovations, startup ecosystems, and groundbreaking technologies shaping tomorrow. Don’t miss this chance to network, learn, and transform your business.

Voximplant now includes a native Grok module that connects any Voximplant call to xAI’s Grok Voice Agent API for real-time, speech-to-speech conversations. With a single VoxEngine scenario, you can interact via audio with Grok over phone numbers, SIP trunks and infrastructure, WhatsApp Business, or WebRTC into Grok — all without building custom media gateways or WebSocket streaming infrastructure.

Today Ultravox announced they are directly integrating Voximplant into their platform to provide SIP capabilities. The integration builds on Voximplant’s deep telephony and Voice AI tooling

Voximplant now includes a native Deepgram module that connects any Voximplant call to Deepgram’s Voice Agent API for real-time, speech‑to‑speech conversations. You can stream audio from phone numbers, SIP trunks, WhatsApp, or WebRTC into Deepgram’s unified agent environment—combining STT, LLM reasoning, and TTS—and play responses via Voximplant’s serverless runtime with minimal latency.