Announced at OpenAI DevDay in November 2023, this preview expanded GPT-4's context window from 8K to 128K tokens while cutting input pricing to one-third of the original GPT-4's. It introduced JSON mode, parallel function calling, and reproducible outputs via a seed parameter, though it was widely noted for a "laziness" problem where the model would truncate code or respond with "rest remains the same."
GPT-4 Turbo 1106 Preview was the first GPT-4 Turbo release, introduced in November 2023 with the 128K context window and JSON mode. The later 0125 snapshot reduced the "laziness" cases where the model left tasks such as code generation incomplete and fixed a bug affecting non-English UTF-8 generations.
OpenAI is deprecating individual GPT-4 snapshots as newer models supersede them. The 1106 preview remains available but OpenAI recommends migrating to GPT-4o or GPT-4.1 for new projects.
GPT-4 is a general-purpose language model, while o1 Preview is a reasoning-focused model that uses internal chain-of-thought before responding. GPT-4 excels at broad tasks, while o1 is better for math, science, and complex analytical problems.
GPT-4 Turbo 1106 Preview maintains the 86.5% MMLU (5-shot) baseline of the GPT-4 Turbo family while adding a 128K context window, JSON mode, and parallel function calling. Its Arena ELO of 1,251 places it above GPT-4 (1,165) but below GPT-4o (1,316) on the same leaderboard, reflecting incremental improvements in chat quality beyond what MMLU captures.
The cost per 1,000 tokens for running the model with Telnyx Inference is $0.0010. For instance, analyzing 1,000,000 customer chats, assuming each chat is 1,000 tokens long, would cost $1,000.
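As a sanity check, the flat-rate arithmetic above can be reproduced in a few lines of Python; the rate, chat count, and chat length are the figures from the example:

```python
# Telnyx Inference flat rate quoted above: $0.0010 per 1,000 tokens.
RATE_PER_1K_TOKENS = 0.0010  # USD

def telnyx_cost(num_chats: int, tokens_per_chat: int) -> float:
    """Total cost in USD for processing num_chats chats of tokens_per_chat tokens each."""
    total_tokens = num_chats * tokens_per_chat
    return total_tokens / 1_000 * RATE_PER_1K_TOKENS

# The worked example: 1,000,000 chats of 1,000 tokens each.
print(f"${telnyx_cost(1_000_000, 1_000):,.2f}")  # → $1,000.00
```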
GPT-4 Turbo 1106 Preview is priced at $10 per million input tokens and $30 per million output tokens: one-third of the original GPT-4's $30-per-million input price and half its $60-per-million output price, while offering a larger context window and additional features.
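Because input and output tokens are billed at different rates, per-request cost depends on the prompt/response split. A short sketch of that arithmetic, using the 1106 rates quoted above and OpenAI's published original GPT-4 (8K) rates of $30 and $60 per million tokens for comparison:

```python
# Rates in USD per million tokens.
TURBO_INPUT, TURBO_OUTPUT = 10.0, 30.0   # gpt-4-1106-preview
GPT4_INPUT, GPT4_OUTPUT = 30.0, 60.0     # original GPT-4 (8K)

def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in USD of one request at the given per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 5,000-token prompt with a 1,000-token reply.
turbo = request_cost(5_000, 1_000, TURBO_INPUT, TURBO_OUTPUT)
gpt4 = request_cost(5_000, 1_000, GPT4_INPUT, GPT4_OUTPUT)
print(f"Turbo: ${turbo:.2f}, GPT-4: ${gpt4:.2f}")  # → Turbo: $0.08, GPT-4: $0.21
```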
The 1106 snapshot introduced the 128K context window, JSON mode for structured outputs, parallel function calling, and improved instruction following. These features made it the foundation for GPT-4 Turbo and influenced subsequent model releases.
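These features surface as request parameters in the Chat Completions API. Below is a minimal sketch of a request body exercising JSON mode (`response_format`), parallel function calling (`tools`), and the `seed` parameter; the `get_weather` tool and its schema are hypothetical, and the dispatch helper is an illustration rather than part of any SDK:

```python
import json

# Sketch of a Chat Completions request body using the 1106 features.
request_body = {
    "model": "gpt-4-1106-preview",
    "seed": 42,  # best-effort reproducible sampling
    "response_format": {"type": "json_object"},  # JSON mode
    "messages": [
        # JSON mode requires the word "JSON" to appear in the prompt.
        {"role": "system", "content": "Reply in JSON."},
        {"role": "user", "content": "Weather in Paris and Tokyo?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# With parallel function calling, a single response message may carry
# several tool calls, each with JSON-encoded arguments to dispatch.
def extract_tool_calls(message: dict) -> list[tuple[str, dict]]:
    return [(c["function"]["name"], json.loads(c["function"]["arguments"]))
            for c in message.get("tool_calls", [])]
```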
GPT-4 Turbo 1106 Preview remains functional but has been superseded by the 0125 snapshot, GPT-4o, and GPT-4.1. Unless you have a specific reason to pin to this version, migrating to a newer model will give you better results at equal or lower cost.