Most voice AI still adds too much lag. Even a few hundred milliseconds of delay can make an AI voice sound robotic or hard to interrupt. That’s why we’ve brought Deepgram’s Flux model directly into the Telnyx global network. This allows developers to run conversational speech recognition at the edge, right next to our telephony infrastructure.
Flux is the first Conversational Speech Recognition (CSR) model designed for real-time voice agents. It understands conversation flow: it can tell when a speaker has finished, when they’ve interrupted, and when it’s time for the AI to respond.
That means smoother interactions, faster responses, and more natural, human-like conversations.
Flux is now self-hosted within Telnyx’s global Points of Presence (PoPs), colocated with our telephony and GPU inference layers. Your audio never leaves our private backbone; it's processed, transcribed, and responded to in the same data center.
This architecture removes the multiple network hops that add latency in public cloud setups, enabling sub-second round trips from a user’s speech to the AI’s spoken response.
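To make that budget concrete, here is a rough back-of-the-envelope sum of the stages in a single conversational turn. All stage numbers are illustrative assumptions, not measured Telnyx figures, except the ~260 ms end-of-turn detection time cited for Flux:

```python
# Illustrative latency budget for one turn of an edge-hosted voice agent.
# Every number here is an assumption for the sketch, except the ~260 ms
# end-of-turn figure cited for Flux.
budget_ms = {
    "audio ingress (PSTN -> PoP)": 40,
    "Flux transcription + end-of-turn detection": 260,
    "LLM first token": 300,
    "TTS first audio chunk": 150,
    "audio egress (PoP -> caller)": 40,
}

total = sum(budget_ms.values())
print(f"estimated round trip: {total} ms")  # 790 ms -- under one second
```

Colocating these stages in one data center is what keeps the two network-transit lines small; in a multi-cloud pipeline each hop can add 100 ms or more.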
Natural conversation flow: Flux detects end of turn in ~260 ms, enabling true barge-in and instant agent response.
Lower latency: Hosting Flux within Telnyx PoPs cuts 100–300 ms compared to cloud-based deployments.
Simpler pipeline: One integrated model for speech recognition and turn-detection, eliminating the need for separate VAD or endpoint logic.
Predictable performance: Everything runs on the Telnyx private backbone, eliminating jitter and cross-cloud delays.
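As a sketch of what “one integrated model” means in practice, the agent-side decision logic can collapse to a single check on the model’s end-of-turn signal, with no separate VAD or endpointing stage. The event shape and field names below are hypothetical illustrations, not the actual Deepgram or Telnyx wire format:

```python
# Hypothetical per-event decision loop for a voice agent driven by a
# CSR model that emits end-of-turn probabilities. Field names are
# invented for illustration; check the real event schema before use.
EOT_THRESHOLD = 0.7  # mirrors the eot_threshold setting

def next_action(event: dict, agent_speaking: bool) -> str:
    """Map one transcription event to an agent action."""
    if agent_speaking and event.get("speech_started"):
        return "barge_in"   # caller interrupted: stop TTS playback
    if event.get("eot_probability", 0.0) >= EOT_THRESHOLD:
        return "respond"    # turn is over: hand the transcript to the LLM
    return "listen"         # keep accumulating the caller's speech

print(next_action({"eot_probability": 0.91}, agent_speaking=False))  # respond
print(next_action({"speech_started": True}, agent_speaking=True))    # barge_in
print(next_action({"eot_probability": 0.3}, agent_speaking=False))   # listen
```

Raising `EOT_THRESHOLD` makes the agent wait for more confident pauses; lowering it makes responses snappier at the cost of occasionally cutting the caller off.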
Flux is now available in AI Assistants, both through the Mission Control Portal and the Telnyx AI Assistants API.
You can configure it under AI → Transcription Models in the portal, or set it programmatically when creating an assistant using the transcription object shown below.
Flux is also available via the Telnyx Voice AI API, using the same endpoint you already use for transcription and streaming:
```json
{
  "transcription": {
    "model": "deepgram/flux",
    "language": "en",
    "settings": {
      "eot_threshold": 0.7,
      "eot_timeout_ms": 5000
    }
  }
}
```
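For example, here is a minimal Python sketch that sends that configuration when creating an assistant. The endpoint path, payload fields outside the transcription object, and the assistant name are assumptions based on general Telnyx v2 API conventions; consult the AI Assistants API reference for the exact schema:

```python
import json
import os
import urllib.request

# Assumed endpoint, following Telnyx v2 API conventions -- verify the
# exact path and payload schema in the AI Assistants API docs.
API_URL = "https://api.telnyx.com/v2/ai/assistants"

payload = {
    "name": "support-agent",  # illustrative assistant name
    "transcription": {
        "model": "deepgram/flux",
        "language": "en",
        "settings": {
            "eot_threshold": 0.7,    # confidence needed to declare end of turn
            "eot_timeout_ms": 5000,  # hard cutoff if no confident end of turn
        },
    },
}

def create_assistant() -> bytes:
    """POST the assistant configuration and return the raw response body."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TELNYX_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# create_assistant()  # uncomment with a valid TELNYX_API_KEY set
```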
By hosting Flux at the edge, Telnyx closes the gap between speech recognition and AI response. Together with NaturalHD voices, LLM orchestration, and direct PSTN access, developers now have a single stack to build production-ready conversational agents that respond as fast as a human.
Flux is now available for all Voice AI users.
Explore it in your Mission Control Portal or through the API to hear how low-latency conversation really sounds.