Enable Custom LLMs for Telnyx AI Assistants

Oct 24, 2025

What’s new

You can now power your Telnyx AI Assistants with any OpenAI-compatible chat completions endpoint, including models hosted on AWS Bedrock, Azure OpenAI, Baseten, or your own self-hosted inference servers (vLLM, SGLang, TGI, etc.).
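For reference, "OpenAI-compatible" means the endpoint accepts requests in the chat completions shape shown in the sketch below. The base URL, API key, and model name are placeholders for whatever your own deployment exposes.

```python
# Minimal sketch of an OpenAI-compatible chat completions call.
# The base_url, api_key, and model are placeholders for your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.example.internal/v1",  # your self-hosted or cloud endpoint
    api_key="YOUR_API_KEY",                      # the key you will store as an Integration Secret
)

response = client.chat.completions.create(
    model="my-finetuned-model",                  # whichever model ID your server serves
    messages=[{"role": "user", "content": "Hello from a Telnyx AI Assistant"}],
)
print(response.choices[0].message.content)
```

Any server that handles this request shape, whether a managed cloud model or a GPU cluster you run yourself, can back your assistant.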

This update adds a “Use Custom LLM” option under the Agent tab in the Mission Control Portal. When enabled, your assistant routes all model requests to your chosen endpoint while Telnyx continues to handle real-time voice orchestration, speech recognition, and synthesis.

Why it matters

Many enterprises already run fine-tuned or region-specific models and need to keep inference within their own infrastructure or compliance boundaries. This feature removes that blocker, letting you bring your own model without rebuilding the voice stack.

You can now:

  • Run inference on your preferred cloud or GPU cluster.

  • Maintain data residency and compliance control.

  • Use fine-tuned or proprietary models with Telnyx voice AI.

How to use it

  1. Go to Mission Control → AI Assistants → Agent.
  2. Check “Use Custom LLM”.
  3. Enter your model’s Base URL and create an Integration Secret for the API key.
  4. Optionally validate the connection and save.

The only requirement: your endpoint must support the OpenAI /v1/chat/completions spec.
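One way to confirm compatibility before saving is a quick smoke test against the endpoint. The URL, key, and model name below are placeholder values, not Telnyx-specific settings.

```python
# Hedged smoke test: check that an endpoint speaks the chat completions spec
# before wiring it into an assistant. URL, key, and model are placeholders.
import requests

resp = requests.post(
    "https://llm.example.internal/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "my-finetuned-model",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
resp.raise_for_status()
body = resp.json()

# A compliant response returns a "choices" list whose first entry
# contains a message object with the generated content.
print("Endpoint looks OpenAI-compatible:", body["choices"][0]["message"]["content"])
```

If this request succeeds, the endpoint should also work when entered as the Base URL in the Agent tab.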

Learn more

📘 Using a Custom LLM - Developer Docs