Inworld Realtime TTS 2 Now Available Through Telnyx APIs and Voice AI Assistants

16 Jun 2026

Inworld Realtime TTS 2 is now available through Telnyx APIs and Voice AI Assistants, giving developers access to expressive voices built for realtime conversation across managed agents and custom voice pipelines.

Unlike traditional TTS built for narration, Realtime TTS 2 is designed for live interaction. It can account for conversational context, take voice direction in plain English, and hold one voice identity across more than 100 languages.

What's new

Inworld Realtime TTS 2 via API: Use Inworld Realtime TTS 2 through the Telnyx TTS API with REST or WebSocket streaming, not only through managed assistants.
Voice AI Assistants support: Select Inworld as the TTS provider in Mission Control for Telnyx Voice AI Assistants.
Voice direction: Pass a natural-language description of how a line should be delivered, inline at the start of your text.
Conversational awareness: The model can account for prior turns, tone, pacing, and emotional context when shaping delivery.
Crosslingual support: Preserve one voice identity across more than 100 languages, including mid-utterance language switches.
On-network processing: Inworld Realtime TTS 2 runs through Telnyx-hosted models, keeping audio and inference on the same private backbone.
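
As a rough illustration of the voice direction item above, the sketch below places a plain-English direction at the start of the text to be spoken. The helper name with_direction and the square-bracket template are assumptions made for illustration only; this announcement does not specify the delimiter, so check the Telnyx Inworld TTS documentation for the exact syntax before relying on it.

```python
def with_direction(direction: str, text: str,
                   template: str = "[{direction}] {text}") -> str:
    """Place a natural-language voice direction at the start of the text.

    The default template (square brackets) is a hypothetical placeholder,
    not a documented syntax. Pass a different template once the real
    delimiter is confirmed.
    """
    # Collapse runs of whitespace so the direction stays on one line.
    direction = " ".join(direction.split())
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    if not direction:
        # No direction given: send the text unchanged.
        return text
    return template.format(direction=direction, text=text)


line = with_direction("warm and unhurried, like greeting a regular",
                      "Thanks for calling. How can I help?")
print(line)
```

Keeping the template as a parameter means only one argument changes if the real delimiter differs from the placeholder used here.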

How it works

For API synthesis, specify Inworld voices in Telnyx TTS requests using the Inworld.<Model>.<VoiceId> format. The same format applies over REST and WebSocket streaming, so developers can use Inworld voices when building custom voice stacks, media experiences, IVRs, or agent infrastructure outside the managed assistant flow.
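
A minimal sketch of the Inworld.<Model>.<VoiceId> format described above. The function names and the values "ExampleModel" and "ExampleVoice" are hypothetical placeholders, not real model or voice names. The parser also assumes the model segment contains no dots, which this announcement does not confirm.

```python
from typing import NamedTuple

PROVIDER = "Inworld"


class InworldVoice(NamedTuple):
    model: str
    voice_id: str


def format_voice(model: str, voice_id: str) -> str:
    """Build a voice string in the Inworld.<Model>.<VoiceId> format."""
    if not model or not voice_id:
        raise ValueError("model and voice_id must be non-empty")
    return f"{PROVIDER}.{model}.{voice_id}"


def parse_voice(voice: str) -> InworldVoice:
    """Split a voice string back into its model and voice ID.

    Splits on the first two dots only, so any further dots stay in the
    voice ID (an assumption about the format, not a documented rule).
    """
    parts = voice.split(".", 2)
    if len(parts) != 3 or parts[0] != PROVIDER or not all(parts):
        raise ValueError(f"expected {PROVIDER}.<Model>.<VoiceId>, got {voice!r}")
    return InworldVoice(model=parts[1], voice_id=parts[2])


# "ExampleModel" and "ExampleVoice" are placeholders for illustration.
voice = format_voice("ExampleModel", "ExampleVoice")
print(voice)
print(parse_voice(voice))
```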

For managed voice agents, select Inworld as the Text-to-Speech provider inside your Telnyx Voice AI Assistant and configure the voice for your assistant flow.

Why it matters

Developers building custom voice pipelines can use the same Inworld voice model through the API that teams use in Telnyx Voice AI Assistants.
TTS built for narration, such as audiobook voices, treats every turn as isolated. Realtime TTS 2 is built for conversation: it accounts for what the other person said, adjusts tone, and responds in context.
Voice direction replaces fixed emotion presets. Developers steer delivery with plain English, the way a director coaches a voice actor.
Crosslingual consistency means one agent can serve multilingual audiences without switching voices.

Example use cases

Custom voice agents using the Telnyx TTS API with WebSocket streaming.
Customer support agents that adapt tone based on caller sentiment.
IVRs and automated call flows that need expressive, low-latency speech.
Global SaaS platforms deploying one voice identity across English, Spanish, Japanese, Hindi, and more.
Voice AI developers iterating on persona behavior through text prompts alone.

Getting started

Via the API:

Generate a Telnyx API key in Mission Control.
Use the Telnyx TTS API with REST or WebSocket streaming.
Select an Inworld voice using the Inworld.<Model>.<VoiceId> format.
Configure language and voice settings for your use case.
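
The API steps above could be gathered into one validated settings object before any request is made. This is a sketch under stated assumptions: the TtsSettings class, the TELNYX_API_KEY and TELNYX_TTS_VOICE environment variable names, and the "en" language default are all hypothetical and are not Telnyx-defined names. It makes no network calls and does not model the request itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping

TRANSPORTS = ("rest", "websocket")


@dataclass(frozen=True)
class TtsSettings:
    """Hypothetical local settings holder; not part of any Telnyx SDK."""
    api_key: str
    voice: str                    # Inworld.<Model>.<VoiceId>
    transport: str = "websocket"  # or "rest"
    language: str = "en"          # placeholder default (assumption)

    def __post_init__(self) -> None:
        if not self.api_key:
            raise ValueError("API key is missing")
        if self.transport not in TRANSPORTS:
            raise ValueError(f"transport must be one of {TRANSPORTS}")
        parts = self.voice.split(".", 2)
        if len(parts) != 3 or parts[0] != "Inworld" or not all(parts):
            raise ValueError("voice must look like Inworld.<Model>.<VoiceId>")


def load_settings(env: Mapping[str, str] = os.environ) -> TtsSettings:
    """Read settings from environment-style variables (hypothetical names)."""
    return TtsSettings(
        api_key=env.get("TELNYX_API_KEY", ""),
        voice=env.get("TELNYX_TTS_VOICE", ""),
        transport=env.get("TELNYX_TTS_TRANSPORT", "websocket"),
        language=env.get("TELNYX_TTS_LANGUAGE", "en"),
    )


# Placeholder values for illustration; a real key comes from Mission Control.
settings = load_settings({
    "TELNYX_API_KEY": "example-key",
    "TELNYX_TTS_VOICE": "Inworld.ExampleModel.ExampleVoice",
})
print(settings.transport, settings.voice, settings.language)
```

Validating up front means a malformed voice string or a missing key is caught before a REST call or WebSocket connection is attempted.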

Via Voice AI Assistants:

Navigate to Mission Control > AI > Assistants > select your assistant > Voice tab.
Under Text-to-Speech, select Inworld as the provider.
Choose the Inworld voice for your assistant.
Configure voice direction prompts and language settings as needed.

Learn more in the Telnyx Inworld TTS documentation or the Telnyx TTS API documentation.