State-of-the-Art Open Source LLMs Now Available

March 5, 2026

Most teams accessing open source LLMs route through third-party inference providers, adding latency, cost markups, and external dependencies to every request. Telnyx now runs three leading open source models directly on its own infrastructure.

New Models

GLM-5

zai-org/GLM-5 - The highest-intelligence open source LLM in the world. With weights exceeding a terabyte, GLM-5 delivers unmatched capability for complex reasoning and generation tasks.

Kimi-K2.5

moonshotai/Kimi-K2.5 - A strong balance between intelligence and cost. The non-reasoning version is ideal for real-time voice AI. Also recommended for AI Assistants: a significant step up from Qwen 235B in intelligence at effectively the same latency and price point. Many use cases that required complex prompt engineering with Qwen will simply work with Kimi.

MiniMax-M2.5

MiniMaxAI/MiniMax-M2.5 - Highly intelligent at a lower cost. A practical choice for teams optimizing spend while maintaining strong model performance.

All three models were released in 2026 and sit on the efficient frontier of intelligence per dollar.

Why it matters

  • No external API keys or third-party accounts required. Models run on Telnyx infrastructure.
  • Lower latency compared to routing through external inference providers.
  • Use the same Chat Completions endpoint you already integrate with.
  • Choose the right model for each use case: maximum intelligence, balanced cost-performance, or cost-optimized.

Getting started

  1. Use your existing Telnyx API key (or create an account to get one).
  2. Send a request to the Chat Completions API using one of the new model names: zai-org/GLM-5, moonshotai/Kimi-K2.5, or MiniMaxAI/MiniMax-M2.5.
  3. No additional configuration or external API keys needed.
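As a sketch of step 2, the request below builds an OpenAI-style Chat Completions call with one of the new model names. The endpoint URL shown is an assumption based on this announcement (Telnyx's standard API base plus a chat completions path) and should be checked against the API reference; the payload shape follows the familiar Chat Completions format.

```python
import json
import os
import urllib.request

# Assumed endpoint URL; verify against the Telnyx API reference.
TELNYX_CHAT_URL = "https://api.telnyx.com/v2/ai/chat/completions"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a Chat Completions request for one of the new models."""
    payload = {
        "model": model,  # e.g. "zai-org/GLM-5", "moonshotai/Kimi-K2.5",
                         # or "MiniMaxAI/MiniMax-M2.5"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        TELNYX_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "moonshotai/Kimi-K2.5",
    "Summarize this support call in two sentences.",
    os.environ.get("TELNYX_API_KEY", ""),
)
# To send: urllib.request.urlopen(req), then json.loads() the response body.
```

Because the API uses your existing Telnyx key, switching between the three models is just a change to the `model` string; no other configuration differs per model.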

Currently available in the US only. Additional regions coming soon.