OpenAI's first open-weight release uses 128 experts per layer with top-4 routing, keeping 5.1B of 116.8B total parameters active per token, and fits on a single 80GB GPU through MXFP4 post-training quantization. Trained over 2.1 million H100-hours with a STEM and coding focus, it scores 96.6% on AIME 2024 and reaches a Codeforces Elo of 2,622 with configurable low/medium/high reasoning effort.
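To make the "128 experts, top-4 routing" figure concrete, here is a minimal sketch of how a router might pick experts for one token. It is an illustration only, not OpenAI's implementation: the function names `route_token` and `moe_output` are hypothetical, the experts are toy scalar functions, and normalising with a softmax over just the selected logits is an assumption.

```python
import math
import random

NUM_EXPERTS = 128  # experts per layer, from the text
TOP_K = 4          # experts activated per token, from the text


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and normalise their weights.

    Assumption: weights come from a softmax over only the selected logits.
    """
    if len(router_logits) < k:
        raise ValueError("need at least k router logits")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = router_logits[top[0]]
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_output(token, experts, router_logits, k=TOP_K):
    """Weighted sum of the selected experts' outputs for one token."""
    return sum(w * experts[i](token) for i, w in route_token(router_logits, k))


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    # Toy scalar "experts": each just scales its input by a fixed factor.
    experts = [lambda x, s=i: x * (s + 1) / NUM_EXPERTS
               for i in range(NUM_EXPERTS)]
    for idx, weight in route_token(logits):
        print(f"expert {idx:3d}  weight {weight:.3f}")
    print("output:", moe_output(1.0, experts, logits))
```

Only the four selected experts run for a given token, which is why the active parameter count (5.1B) is a small fraction of the 116.8B total.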
GPT-OSS 120B is OpenAI's first open-weight model, released with roughly 120 billion parameters (116.8B total) under a permissive license. It is available on Hugging Face and supported through Telnyx's inference infrastructure.
GPT-OSS 120B excels at coding, reasoning, and instruction-following tasks, performing competitively with proprietary models. It is suited to production voice AI and inference workloads.
GPT-OSS 120B scores 87.2% on MMLU and 90.0% on MMLU-Pro, compared with 88.7% MMLU for GPT-4o and 90.2% MMLU for GPT-4.1. With a Codeforces Elo of 2,622 it outperforms every other open-weight model on competitive coding. It is OpenAI's first Apache 2.0 release, and MXFP4 quantization lets it run on a single H100 GPU despite its 116.8B total parameters.
Running GPT-OSS 120B through Telnyx Inference costs $0.039 per million input tokens and $0.10 per million output tokens via the open-weight deployment. Processing 10,000,000 reasoning tasks at 1,000 tokens each (10 billion tokens in total) would cost approximately $700, a figure consistent with an even split between input and output tokens. That makes it the cheapest frontier-class reasoning model available under an Apache 2.0 license.
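The cost estimate can be reproduced with a few lines of arithmetic. The sketch below uses the two per-token prices quoted above; the helper name `batch_cost` and the `input_share` parameter are hypothetical, and the 50/50 split between input and output tokens is an assumption that happens to land near the quoted $700.

```python
INPUT_PRICE_PER_M = 0.039   # USD per million input tokens (from the text)
OUTPUT_PRICE_PER_M = 0.10   # USD per million output tokens (from the text)


def batch_cost(tasks, tokens_per_task, input_share=0.5):
    """Estimate the USD cost of a batch of tasks.

    input_share is the assumed fraction of tokens billed as input;
    the remainder is billed as output.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    total_millions = tasks * tokens_per_task / 1_000_000
    input_millions = total_millions * input_share
    output_millions = total_millions - input_millions
    return (input_millions * INPUT_PRICE_PER_M
            + output_millions * OUTPUT_PRICE_PER_M)


if __name__ == "__main__":
    # 10,000,000 tasks at 1,000 tokens each, as in the text.
    for share in (1.0, 0.5, 0.0):
        cost = batch_cost(10_000_000, 1_000, input_share=share)
        print(f"input share {share:.0%}: ${cost:,.2f}")
```

Depending on the mix, the same workload ranges from about $390 (all input) to about $1,000 (all output), so the real bill depends on how long the model's responses are.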
GPT-OSS 120B requires approximately 240GB of VRAM at full precision, which typically means multiple A100 GPUs; with MXFP4 quantization the weights fit on a single 80GB GPU. Hosted inference platforms provide access without managing GPU infrastructure.
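The VRAM figures follow from bits per parameter. The sketch below is a rough back-of-the-envelope estimate with several assumptions: "full precision" is taken to mean 16 bits per parameter, MXFP4 is costed at 4.25 bits per parameter (4-bit values plus one shared 8-bit scale per block of 32), every parameter is treated as quantized, and KV cache and activation memory are ignored. The helper name `weight_memory_gb` is hypothetical.

```python
TOTAL_PARAMS_BILLION = 116.8  # total parameters, from the text

BITS_16 = 16.0                # assumed meaning of "full precision"
BITS_MXFP4 = 4.0 + 8.0 / 32   # 4-bit values + shared 8-bit scale per 32 values


def weight_memory_gb(params_billion, bits_per_param):
    """Weight storage in GB (1 GB = 1e9 bytes); excludes KV cache and activations."""
    return params_billion * bits_per_param / 8


if __name__ == "__main__":
    full = weight_memory_gb(TOTAL_PARAMS_BILLION, BITS_16)
    quant = weight_memory_gb(TOTAL_PARAMS_BILLION, BITS_MXFP4)
    print(f"16-bit weights: {full:.1f} GB")   # about 233.6 GB
    print(f"MXFP4 weights:  {quant:.1f} GB")  # about 62.1 GB
```

The 16-bit estimate lands close to the 240GB quoted above, and the MXFP4 estimate sits under 80GB. Treating every parameter as quantized makes the second number optimistic if some layers are kept at higher precision.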
GPT-OSS 120B is released under the Apache 2.0 license, which permits free commercial use. Weights are available on Hugging Face, and API access is available through hosting providers.
GPT-OSS 120B is competitive with GPT-4o class models but does not match GPT-5's full reasoning capability. It represents OpenAI's commitment to open-weight models and is available through Telnyx alongside GPT-5.
OpenAI released GPT-OSS to participate in the open-weight ecosystem and provide a self-hostable alternative to its API-only models. The rationale is documented in OpenAI's announcement.