The "o" stands for omni: unlike GPT-4V which routed vision through a separate encoder, GPT-4o processes text, images, and audio through a single end-to-end neural network. It responds to audio input in roughly 320ms on average, runs 2x faster than GPT-4 Turbo at half the cost, and was the first model to bring GPT-4-class intelligence to ChatGPT's free tier.
Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.
Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.
GPT-4o is available in ChatGPT's free tier with usage limits. Paid subscribers get higher rate limits and priority access. API pricing is $2.50 per million input tokens through OpenAI and inference providers.
GPT-4o ("omni") is OpenAI's natively multimodal successor to GPT-4, processing text, images, and audio in a single model. It is faster, cheaper, and more capable than GPT-4 across most benchmarks.
GPT-4o scores 88.7% on MMLU (5-shot) and 90.2% on HumanEval, surpassing GPT-4 (86.4% MMLU, 67.0% HumanEval) on the same sheet across both knowledge and code benchmarks. It runs at 2x the speed of GPT-4 Turbo at 50% lower cost while adding native audio and image processing. On multilingual tasks it significantly outperforms GPT-4 Turbo, particularly on Arabic, Hindi, and Mandarin.
Running GPT-4o through Telnyx Inference costs $2.50 per million input tokens and $10.00 per million output tokens. Processing 1,000,000 multimodal interactions at 1,500 tokens each would cost approximately $9,375, half the price of GPT-4 Turbo ($30,000) with faster speed and native audio/image support.
| Organization | Model Name | Tasks | Languages Supported | Context Length | Parameters | Model Tier | License |
|---|---|---|---|---|---|---|---|
| deepseek-ai | DeepSeek-R1-Distill-Qwen-14B | text generation | English | 43,000 | 14.8B | medium | deepseek |
| fixie-ai | ultravox-v0_4_1-llama-3_1-8b | audio text-to-text | Multilingual | 8,000 | 8.7B | small | mit |
| gemma-2b-it | text generation | English | 8,192 | 2.5B | small | gemma | |
| gemma-7b-it | text generation | English | 8,192 | 8.5B | small | gemma | |
| meta-llama | Llama-3.3-70B-Instruct | text generation | Multilingual | 99,000 | 70.6B | large | llama3.3 |
| meta-llama | Llama-Guard-3-1B | safety classification | Multilingual | 128,000 | 1.5B | small | llama3.3 |
| meta-llama | Meta-Llama-3.1-70B-Instruct | text generation | Multilingual | 99,000 | 70.6B | large | llama3.1 |
| meta-llama | Meta-Llama-3.1-8B-Instruct | text generation | Multilingual | 131,072 | 8.0B | small | llama3.1 |
| minimaxai | MiniMax-M2.5 | text generation | English | 2,000,000 | 0 | large | minimaxai |
| minimaxai | MiniMax-M2.7 | text generation | English | 200,000 | 0 | large | minimaxai |
| mistralai | Mistral-7B-Instruct-v0.1 | text generation | English | 8,192 | 7.2B | small | apache-2.0 |
| mistralai | Mistral-7B-Instruct-v0.2 | text generation | English | 32,768 | 7.2B | small | apache-2.0 |
| mistralai | Mixtral-8x7B-Instruct-v0.1 | text generation | Multilingual | 32,768 | 46.7B | medium | apache-2.0 |
| moonshotai | Kimi-K2.5 | text generation | English | 256,000 | 1.0T | large | modified-mit |
| moonshotai | Kimi-K2.6 | text generation | English | 262,144 | 1.0T | large | modified-mit |
| Qwen | Qwen3-235B-A22B | text generation | English | 32,768 | 235.1B | large | apache-2.0 |
| zai-org | GLM-5.1-FP8 | text generation | English | 202,752 | 753.9B | large | mit |
| anthropic | claude-3-7-sonnet-latest | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-haiku-4-5 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-opus-4-6 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| anthropic | claude-sonnet-4-20250514 | text generation | Multilingual | 200,000 | 0 | large | anthropic |
| gemini-2.0-flash | text generation | Multilingual | 1,048,576 | 0 | large | ||
| gemini-2.5-flash | text generation | Multilingual | 1,048,576 | 0 | large | ||
| gemini-2.5-flash-lite | text generation | Multilingual | 1,048,576 | 0 | large | ||
| groq | gpt-oss-120b | text generation | English | 131,072 | 117.0B | large | groq |
| groq | kimi-k2-instruct | text generation | English | 131,072 | 1.0T | large | groq |
| groq | llama-3.3-70b-versatile | text generation | Multilingual | 131,072 | 70.6B | large | llama3.3 |
| groq | llama-4-maverick-17b-128e-instruct | text generation | Multilingual | 1,000,000 | 400.0B | large | llama4 |
| groq | llama-4-scout-17b-16e-instruct | text generation | Multilingual | 128,000 | 109.0B | large | llama4 |
| openai | gpt-3.5-turbo | text generation | Multilingual | 4,096 | 0 | large | openai |
| openai | gpt-4 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0125-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0314 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-0613 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-1106-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-32k-0314 | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4-turbo-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4.1 | text generation | Multilingual | 1,047,576 | 0 | large | openai |
| openai | gpt-4.1-mini | text generation | Multilingual | 1,047,576 | 0 | large | openai |
| openai | gpt-4o | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-4o-mini | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | gpt-5 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5-mini | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5.1 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | gpt-5.2 | text generation | Multilingual | 400,000 | 0 | large | openai |
| openai | o1-mini | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | o1-preview | text generation | Multilingual | 128,000 | 0 | large | openai |
| openai | o3-mini | text generation | Multilingual | 200,000 | 0 | large | openai |
| xai-org | grok-2 | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-2-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3 | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-beta | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast-beta | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-fast-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-latest | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-mini | text generation | Multilingual | 131,072 | 0 | large | xai |
| xai-org | grok-3-mini-fast | text generation | Multilingual | 131,072 | 0 | large | xai |
Yes, GPT-4o remains available in both ChatGPT and the API. It continues to be one of OpenAI's primary models alongside newer releases like GPT-4.1 and GPT-5, accessible through multiple inference platforms.
GPT-4o is accessible through ChatGPT (free and paid tiers), the OpenAI API, and third-party inference providers. API access requires an OpenAI account with billing configured.
GPT-4o excels at multimodal tasks combining text, vision, and audio understanding, making it particularly strong for real-time voice applications and document analysis. It also performs well on coding, reasoning, and creative writing tasks.
GPT-4o is priced at $2.50 per million input tokens and $10 per million output tokens through the API. This is significantly cheaper than the original GPT-4 while delivering better performance across most benchmarks.
GPT-4o outperforms GPT-4 on most benchmarks while being faster and approximately 50% cheaper. Its native multimodal capabilities for vision and audio processing represent a significant upgrade over GPT-4's text-first architecture.