GPT-5 mini has a 400K-token context window, matching the 400,000 tokens listed for GPT-5 in the model table below. It scores 91.1% on AIME 2025 and 87.8% on HMMT 2025, placing it above every previous "mini"-class model and within striking distance of full GPT-5 on mathematical reasoning. At $0.25 per million input tokens and $2.00 per million output tokens, it is designed for well-defined tasks where precise prompting substitutes for maximum reasoning depth.
GPT-5 mini is a general-purpose model for coding, analysis, writing, and reasoning tasks. It is built for production-grade applications, with an emphasis on reliability and safety.
GPT-5 mini trades reasoning depth for speed and cost efficiency. It handles well-defined tasks with clear instructions effectively but falls short of the full GPT-5 on complex multi-step problems that benefit from extended thinking mode.
GPT-5 mini is a faster, more cost-efficient version of GPT-5 designed for high-volume tasks and precise prompts. It runs at lower cost and latency than the full GPT-5 model.
GPT-5 mini does not have a published standard MMLU score. It scores 91.1% on AIME 2025 and 87.8% on HMMT 2025, which puts its math reasoning within striking distance of full GPT-5 (94.6% on AIME 2025). The benchmarks are not directly comparable, but set against GPT-4o mini's 82.0% on MMLU, these results point to a generational jump in reasoning quality at the mini price tier.
Running GPT-5 mini through Telnyx Inference costs $0.25 per million input tokens and $2.00 per million output tokens. Processing 10,000,000 classification tasks at 200 input tokens and 200 output tokens each (2 billion tokens in each direction) would cost approximately $500 for input plus $4,000 for output, a fraction of the cost of full GPT-5 for well-defined workloads.
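The arithmetic behind that estimate can be reproduced with a short script. This is a minimal sketch that uses only the two rates quoted above; the function name `estimate_cost` and the assumption that each task consumes 200 input tokens and 200 output tokens are illustrative, not part of any Telnyx or OpenAI tooling.

```python
from decimal import Decimal

# Rates quoted above for GPT-5 mini on Telnyx Inference (USD per million tokens).
INPUT_RATE_PER_M = Decimal("0.25")
OUTPUT_RATE_PER_M = Decimal("2.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(tasks: int, input_tokens_per_task: int, output_tokens_per_task: int) -> dict:
    """Estimate input, output, and total cost in USD for a batch of tasks.

    Hypothetical helper: the per-task token counts are assumptions you supply.
    """
    input_millions = Decimal(tasks * input_tokens_per_task) / TOKENS_PER_MILLION
    output_millions = Decimal(tasks * output_tokens_per_task) / TOKENS_PER_MILLION
    input_cost = input_millions * INPUT_RATE_PER_M
    output_cost = output_millions * OUTPUT_RATE_PER_M
    return {"input": input_cost, "output": output_cost, "total": input_cost + output_cost}


# The workload described above: 10,000,000 tasks, assumed 200 tokens in and 200 out.
estimate = estimate_cost(tasks=10_000_000, input_tokens_per_task=200, output_tokens_per_task=200)
for label, amount in estimate.items():
    print(f"{label}: ${amount:,.2f}")
```

With these inputs the input and output components come to $500 and $4,000, for a total of $4,500. Output tokens dominate the bill, so trimming response length matters more than trimming prompts at these rates.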
| Organization | Model Name | Tasks | Languages Supported | Context Length | Parameters | Model Tier | License |
|---|---|---|---|---|---|---|---|
| deepseek-ai | DeepSeek-R1-Distill-Qwen-14B | text generation | English | 43,000 | 14.8B | medium | deepseek |
| fixie-ai | ultravox-v0_4_1-llama-3_1-8b | audio text-to-text | Multilingual | 8,000 | 8.7B | small | mit |
| google | gemma-2b-it | text generation | English | 8,192 | 2.5B | small | gemma |
| google | gemma-7b-it | text generation | English | 8,192 | 8.5B | small | gemma |
| meta-llama | Llama-3.3-70B-Instruct | text generation | Multilingual | 99,000 | 70.6B | large | llama3.3 |
| meta-llama | Llama-Guard-3-1B | safety classification | Multilingual | 128,000 | 1.5B | small | llama3.3 |
| meta-llama | Meta-Llama-3.1-70B-Instruct | text generation | Multilingual | 99,000 | 70.6B | large | llama3.1 |
| meta-llama | Meta-Llama-3.1-8B-Instruct | text generation | Multilingual | 131,072 | 8.0B | small | llama3.1 |
| minimaxai | MiniMax-M2.5 | text generation | English | 2,000,000 | 0 | large | minimaxai |
| minimaxai | MiniMax-M2.7 | text generation | English | 200,000 | 0 | large | minimaxai |
| mistralai | Mistral-7B-Instruct-v0.1 | text generation | English | 8,192 | 7.2B | small | apache-2.0 |
| mistralai | Mistral-7B-Instruct-v0.2 | text generation | English | 32,768 | 7.2B | small | apache-2.0 |
| mistralai | Mixtral-8x7B-Instruct-v0.1 | text generation | Multilingual | 32,768 | 46.7B | medium | apache-2.0 |
| moonshotai | Kimi-K2.5 | text generation | English | 256,000 | 1.0T | large | modified-mit |
| moonshotai | Kimi-K2.6 | text generation | English | 262,144 | 1.0T | large | modified-mit |
| Qwen | Qwen3-235B-A22B | text generation | English | 32,768 | 235.1B | large | apache-2.0 |
| zai-org | GLM-5.1-FP8 | text generation | English | 202,752 | 753.9B | large | mit |
| anthropic | claude-3-7-sonnet-latest | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-haiku-4-5 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-opus-4-6 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-sonnet-4-20250514 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| google | gemini-2.0-flash | text generation | Multilingual | 1,048,576 | 0 | large | |
| google | gemini-2.5-flash | text generation | Multilingual | 1,048,576 | 0 | large | |
| google | gemini-2.5-flash-lite | text generation | Multilingual | 1,048,576 | 0 | large | |
| groq | gpt-oss-120b | text generation | English | 131,072 | 117.0B | large | groq |
| groq | kimi-k2-instruct | text generation | English | 131,072 | 1.0T | large | groq |
| groq | llama-3.3-70b-versatile | text generation | Multilingual | 131,072 | 70.6B | large | llama3.3 |
| groq | llama-4-maverick-17b-128e-instruct | text generation | Multilingual | 1,000,000 | 400.0B | large | llama4 |
| groq | llama-4-scout-17b-16e-instruct | text generation | Multilingual | 128,000 | 109.0B | large | llama4 |
| openai | gpt-3.5-turbo | text generation | Multilingual | 4,096 | 0 | large | openai |
| openai | gpt-4 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0125-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0314 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0613 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-1106-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-32k-0314 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-turbo-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4.1 | text generation | Multilingual | 1,047,576 | 0 | large | openai |
| openai | gpt-4.1-mini | text generation | Multilingual | 1,047,576 | 0 | large | openai |
| openai | gpt-4o | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4o-mini | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-5 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5-mini | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5.1 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5.2 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | o1-mini | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | o1-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | o3-mini | text generation | Multilingual | 200,000 | 0 | large | openai |
| xai-org | grok-2 | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-2-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3 | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-beta | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast-beta | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-mini | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-mini-fast | text generation | Multilingual | 131,072 | 0 | large | xai |
GPT-5 mini may be available with usage limits through ChatGPT's free tier. API access is paid, with pricing significantly lower than the full GPT-5 model to support high-volume applications.
At the time of writing, OpenAI has not released a GPT-5.4 mini variant. The GPT-5 family listed in the table above includes GPT-5, GPT-5 mini, GPT-5.1, and GPT-5.2, each targeting different performance and cost tradeoffs.
Through Telnyx Inference, GPT-5 mini costs $0.25 per million input tokens and $2.00 per million output tokens, significantly cheaper than the full GPT-5. OpenAI publishes its own rates in its API documentation, and exact per-token rates there vary by usage tier.
GPT-5 mini inherits much of GPT-5's mathematical reasoning capability but performs best on well-structured problems with clear instructions. For complex multi-step math, the full GPT-5 with extended thinking remains the stronger choice.
GPT-5 is better for complex reasoning, creative writing, and multi-step tasks that benefit from deeper thinking. GPT-5 mini is better for high-volume production workloads like chatbots, classification, and structured data extraction where speed and cost matter more than maximum capability.
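One way to apply this split in an application is a small routing function that picks a model per request. The sketch below is illustrative only: the task labels and the `pick_model` helper are hypothetical, and the model names are taken from the table above rather than from any official routing guidance.

```python
# Hypothetical task labels that mirror the guidance above; adjust to your own workload.
MINI_TASKS = {"chatbot", "classification", "structured_extraction"}
FULL_TASKS = {"complex_reasoning", "creative_writing", "multi_step"}


def pick_model(task_type: str) -> str:
    """Return the model name to use for a given task label.

    High-volume, well-defined tasks go to gpt-5-mini; tasks that benefit
    from deeper thinking go to gpt-5. Unknown labels raise an error so the
    caller has to make an explicit choice.
    """
    if task_type in MINI_TASKS:
        return "gpt-5-mini"
    if task_type in FULL_TASKS:
        return "gpt-5"
    raise ValueError(f"unknown task type: {task_type!r}")


for task in ("classification", "creative_writing"):
    print(task, "->", pick_model(task))
```

Raising on unknown labels, rather than defaulting to one model, keeps cost and quality decisions visible. A production router might fall back to the cheaper model and escalate only when a response fails validation.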