Qwen3 235B A22B

Mixture-of-Experts model with unique thinking/non-thinking modes and comprehensive multilingual agent capabilities.

about

Qwen3 235B A22B is Alibaba's latest-generation mixture-of-experts (MoE) model, representing a significant advance in reasoning and agent capabilities. With 235B total parameters and 22B activated per token (128 experts, 8 active), it supports seamless switching between thinking mode (for complex mathematical reasoning, code generation, and logical problem-solving) and non-thinking mode (for efficient, real-time dialogue). Trained with extensive post-training optimization, Qwen3 surpasses previous generations (QwQ and Qwen2.5) on mathematics, coding, and commonsense reasoning while delivering stronger human-preference alignment in creative writing, role-playing, and multi-turn conversation. Its native context window of 32,768 tokens extends to 131,072 with YaRN scaling, supporting long-document analysis and extended multi-turn reasoning sessions.

License: apache-2.0
Context window: 32,768 tokens

Use cases for Qwen3 235B A22B

  1. Hybrid Reasoning Tasks: Seamlessly switch between thinking mode (complex math, logical reasoning, coding) and non-thinking mode (fast dialogue) within a single model.
  2. Agent-Based Automation: Excel at tool calling and agent capabilities with precise integration of external tools, achieving leading performance among open-source models on complex agent tasks.
  3. Multilingual Applications: Process and respond in 100+ languages and dialects with strong instruction-following capabilities for translation and cross-language reasoning.

Quality

Arena Elo: 1422
MMLU: N/A
MT Bench: N/A

Qwen3 235B A22B surpasses previous generations on mathematical reasoning, code generation, and commonsense logical reasoning benchmarks. In thinking mode, performance exceeds QwQ-32B; in non-thinking mode, it matches or exceeds Qwen2.5 Instruct. Leading performance on complex agent-based tasks among open-source models. Benchmarked on MATH, AIME, reasoning competitions, and real-world agent tasks. Supports dynamic output lengths up to 38,912 tokens for complex problem-solving scenarios.

Arena Elo comparison:

  • GLM-5: 1456
  • Kimi-K2.5: 1454
  • Qwen3 235B A22B: 1422
  • gpt-4.1: 1413
  • Gemini-2.5-Flash: 1411

What's Twitter saying?

  • Reasoning breakthrough: Qwen3 uniquely combines thinking and dialogue modes, surpassing specialized reasoning models on math and code. Alibaba's technical report shows MATH benchmark improvements over QwQ-32B. src: x.com
  • Open-source leadership: Leading performance on agent capabilities among open-source models, with comprehensive tool-calling framework via Qwen-Agent. src: x.com
  • Multilingual powerhouse: 100+ language support with native instruction-following for translation, cross-lingual reasoning, and localized agent deployment. src: x.com


FAQs

What makes Qwen3 235B A22B different from other reasoning models?

Qwen3 uniquely supports dynamic switching between thinking and non-thinking modes within a single model, eliminating the need for separate inference paths. This hybrid architecture allows complex reasoning (math, coding) without sacrificing dialogue efficiency. Traditional reasoning models like o1 operate in thinking-only mode; Qwen3's toggle enables cost-optimized deployment. Visit our comparison guide to understand architectural trade-offs.
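When Qwen3 is served behind an OpenAI-compatible endpoint (e.g., vLLM or SGLang), the thinking toggle is typically passed through `chat_template_kwargs` as an `enable_thinking` flag. A hedged sketch that only builds the request keyword arguments (verify the exact passthrough key against your serving stack's documentation):

```python
def chat_request(messages: list[dict], enable_thinking: bool,
                 max_tokens: int = 32768) -> dict:
    """Keyword arguments for client.chat.completions.create(**chat_request(...))."""
    return {
        "model": "Qwen/Qwen3-235B-A22B",
        "messages": messages,
        "max_tokens": max_tokens,
        # vLLM/SGLang forward chat_template_kwargs to the tokenizer's chat
        # template, where Qwen3's template reads the enable_thinking flag.
        "extra_body": {"chat_template_kwargs": {"enable_thinking": enable_thinking}},
    }

req = chat_request([{"role": "user", "content": "Summarize this contract."}],
                   enable_thinking=False)
```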

How does the mixture-of-experts architecture improve performance?

With 235B total parameters but only 22B activated per token, Qwen3 achieves a superior quality-to-cost ratio compared to dense models of similar capability. The 128-expert, 8-active configuration enables specialized task expertise while maintaining training efficiency. For a technical deep-dive on MoE scaling, see here.
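The arithmetic behind the quality-to-cost claim, plus a toy top-k router, can be sketched as follows (the routing function is an illustrative simplification of how a MoE gate selects experts per token, not Qwen3's actual implementation):

```python
TOTAL_PARAMS = 235e9   # total parameters
ACTIVE_PARAMS = 22e9   # parameters activated per token
NUM_EXPERTS, TOP_K = 128, 8

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS  # ~9.4% of weights per token
experts_fraction = TOP_K / NUM_EXPERTS          # 6.25% of experts per token

def top_k_routing(gate_logits: list[float], k: int = TOP_K) -> list[int]:
    """Pick indices of the k highest-scoring experts for one token."""
    return sorted(range(len(gate_logits)),
                  key=lambda i: gate_logits[i], reverse=True)[:k]
```

Note that the active-parameter fraction (~9.4%) exceeds the active-expert fraction (6.25%): shared components such as attention layers and embeddings run for every token regardless of routing.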

Can Qwen3 handle long documents?

Yes. It natively supports 32,768 tokens and extends to 131,072 with YaRN rope scaling (4x factor). This is recommended for legal contracts, research papers, and multi-turn agent workflows. Deploy via vLLM or SGLang with the YaRN configuration enabled. Reference for deployment options.
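A minimal sketch of the YaRN extension, assuming the `rope_scaling` keys follow the Hugging Face config convention published for Qwen3 (check your serving stack's docs before enabling, since long-context scaling can slightly affect short-context quality):

```python
NATIVE_CONTEXT = 32_768
YARN_FACTOR = 4.0

# Fragment of a model config enabling YaRN rope scaling.
rope_scaling = {
    "rope_type": "yarn",
    "factor": YARN_FACTOR,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

extended_context = int(NATIVE_CONTEXT * YARN_FACTOR)  # 131072 tokens
```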

What languages does Qwen3 support?

100+ languages and dialects with strong instruction-following for translation and cross-language reasoning. Multilingual agent capabilities enable localized tool calling and regional deployment. Validated on instruction-following across major language families. See here for multilingual voice-AI applications.

How do I use Qwen3 in agent workflows?

Deploy via Qwen-Agent framework (built-in tool calling), standard OpenAI-compatible APIs (vLLM, SGLang), or custom integrations. MCP (Model Context Protocol) support enables seamless tool binding. Recommended output length: 32,768 tokens for standard tasks, 38,912 for complex problem-solving. Integration examples available at https://developers.telnyx.com.
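For the OpenAI-compatible route, tools are declared with the standard function-calling schema. A sketch with a hypothetical `get_weather` tool (the `make_tool` helper is illustrative, not part of Qwen-Agent or any SDK):

```python
def make_tool(name: str, description: str, parameters: dict) -> dict:
    """OpenAI-compatible tool definition, as accepted by vLLM/SGLang servers."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,  # JSON Schema for the tool's arguments
        },
    }

weather_tool = make_tool(
    "get_weather",
    "Look up current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
# Pass [weather_tool] as the `tools` parameter of a chat.completions request.
```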

Is Qwen3 production-ready?

Yes. It is designed for enterprise deployment with SLA support via Telnyx Inference. Recommended sampling parameters: thinking mode (temperature 0.6, top_p 0.95); non-thinking mode (temperature 0.7, top_p 0.8). A presence_penalty between 0 and 2 reduces repetition. Validate on your workload before production rollout. Full deployment guide for compliance-sensitive applications.
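The mode-dependent sampling parameters above can be wrapped in a small helper so the right settings always travel with the chosen mode (the helper is a sketch; the values are the recommendations from this page):

```python
# Recommended sampling parameters from the model card above; tune per workload.
SAMPLING = {
    "thinking": {"temperature": 0.6, "top_p": 0.95},
    "non_thinking": {"temperature": 0.7, "top_p": 0.8},
}

def sampling_params(mode: str, presence_penalty: float = 0.0) -> dict:
    """Return sampling kwargs for the given mode, with bounds-checked penalty."""
    if not 0.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty should be in [0, 2]")
    return {**SAMPLING[mode], "presence_penalty": presence_penalty}
```

Keeping thinking-mode temperature lower (0.6 vs 0.7) follows the intuition that long reasoning chains compound sampling noise, so slightly greedier decoding helps keep derivations on track.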