GPT-4 1106 Preview

The November 2023 GPT-4 Turbo preview featuring a 128k context window, JSON mode, and improved instruction-following for complex generation tasks.

about

Announced at OpenAI DevDay in November 2023, this preview expanded GPT-4's context window from 8K to 128K tokens while cutting input pricing by 3x. It introduced JSON mode, parallel function calling, and reproducible outputs via a seed parameter, though it was widely noted for a "laziness" problem where the model would truncate code or respond with "rest remains the same."

License: openai
Context window: 128,000 tokens

Use cases for GPT-4 1106 Preview

  1. Book-length document analysis: The 128K context window processes full manuscripts, regulatory filings, or codebases in a single prompt for comprehensive analysis without chunking.
  2. Guaranteed JSON extraction: JSON mode guarantees syntactically valid JSON output for data pipelines, eliminating the parsing failures that plagued earlier GPT-4 versions.
  3. Reproducible content generation: The seed parameter delivers largely deterministic (best-effort) outputs for applications requiring audit trails, regression testing, or consistent baseline comparisons.
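The JSON mode and seed features above can be sketched as a request payload. This is a minimal illustration, assuming the field names OpenAI documented for the Chat Completions API (`response_format`, `seed`); the order-extraction prompt and sample completion are hypothetical:

```python
import json

# Request payload for GPT-4 1106 Preview with JSON mode and a fixed seed,
# shown as a plain dict (the OpenAI SDK accepts the same fields as kwargs).
request = {
    "model": "gpt-4-1106-preview",
    "response_format": {"type": "json_object"},  # JSON mode
    "seed": 42,                                  # best-effort reproducibility
    "messages": [
        # JSON mode requires the word "JSON" to appear somewhere in the prompt
        {"role": "system", "content": "Extract the order fields as JSON."},
        {"role": "user", "content": "Order #123: 2 widgets at $4.50 each."},
    ],
}

# JSON mode guarantees the completion parses, so downstream code can
# call json.loads without a fallback path.
sample_completion = '{"order_id": 123, "quantity": 2, "unit_price": 4.5}'
order = json.loads(sample_completion)
print(order["order_id"])  # 123
```

Note that the seed gives reproducibility only on a best-effort basis; responses include a `system_fingerprint` so callers can detect backend changes that break determinism.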

Quality

Arena Elo: 1251
MMLU: N/A
MT Bench: 9.32

GPT-4 1106 Preview maintains the 86.5% MMLU (5-shot) baseline of the GPT-4 Turbo family while adding a 128K context window, JSON mode, and parallel function calling. Its Arena Elo of 1,251 places it above GPT-4 (1,165) but below GPT-4o (1,316) on the same leaderboard, reflecting incremental improvements in chat quality beyond what MMLU captures.

GPT-4 Omni: 1316
Claude-3-7-Sonnet-Latest: 1268
GPT-4 1106 Preview: 1251
Llama-4-Scout-Instruct: 1250
Llama 3.1 70B Instruct: 1248

pricing

The cost per 1,000 tokens for running the model with Telnyx Inference is $0.0010. For instance, analyzing 1,000,000 customer chats of 1,000 tokens each (1 billion tokens in total) would cost $1,000.
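The arithmetic above can be checked with a short helper. This is a back-of-envelope sketch using the $0.0010-per-1,000-tokens rate stated above; the function name is illustrative:

```python
# Assumed rate from the pricing section: $0.0010 per 1,000 tokens.
RATE_PER_1K_TOKENS = 0.0010

def inference_cost(num_requests: int, tokens_per_request: int) -> float:
    """Total cost in dollars for a batch of same-sized requests."""
    total_tokens = num_requests * tokens_per_request
    return total_tokens / 1_000 * RATE_PER_1K_TOKENS

# 1,000,000 chats at 1,000 tokens each -> 1 billion tokens
print(inference_cost(1_000_000, 1_000))  # 1000.0
```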

What's Twitter saying?

  • Mixed coding performance: While GPT-4 Turbo showed faster inference, coding benchmarks revealed trade-offs—GPT-4 solved 86 out of 122 tasks on the first attempt compared to GPT-4 Turbo's 56, though GPT-4 Turbo improved significantly on second attempts with 28 successes versus GPT-4's 10.
  • Quality concerns despite claims: Some developers reported that GPT-4-1106-preview generated consistently worse content than the previous GPT-3.5-turbo model, questioning whether this was due to the model's "not yet suited for production traffic" status or other factors.
  • Latency and responsiveness issues: Users on social media reported significant latency problems with GPT-4-Turbo (1106-Preview), noting it was slower than the original GPT-4, and some observed that Azure deployments returned noticeably shorter and less detailed responses compared to OpenAI's direct API.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.

Organization: deepseek-ai
Model Name: DeepSeek-R1-Distill-Qwen-14B
Tasks: text generation
Languages Supported: English
Context Length: 43,000
Parameters: 14.8B
Model Tier: medium
License: deepseek

TRY IT OUT

Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.

HOW IT WORKS

Selecting LLMs for Voice AI

RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Test in the portal

    Easily browse and select your preferred model in the AI Playground.

  • Explore the docs

    Don't wait to scale; start today with our public API endpoints.

  • Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

Sign up and start building

faqs

How much is GPT-4 1106 Preview?

GPT-4 1106 Preview is priced at $10 per million input tokens and $30 per million output tokens. This represented a significant price drop compared to the base GPT-4 models while offering a 128K context window.

What is the difference between GPT-4 and 1106 Preview?

GPT-4 1106 Preview (GPT-4 Turbo) introduced a 128K context window (16x the base GPT-4), JSON mode, parallel function calling, and improved instruction-following. It was OpenAI's first "Turbo" variant of GPT-4 with significantly faster inference and lower pricing.
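Parallel function calling, mentioned above, means a single assistant message can carry several independent tool calls. The following sketch assumes the `tool_calls` response shape OpenAI documented for this model; the `get_weather` tool and the mocked calls are hypothetical:

```python
import json

def get_weather(city: str) -> str:          # hypothetical local tool
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Mocked parallel tool calls, shaped like message.tool_calls from the API:
# two independent invocations returned in one assistant turn.
tool_calls = [
    {"id": "call_1", "function": {"name": "get_weather",
                                  "arguments": '{"city": "Paris"}'}},
    {"id": "call_2", "function": {"name": "get_weather",
                                  "arguments": '{"city": "Tokyo"}'}},
]

# Each call resolves independently; results are sent back to the model
# as "tool" messages keyed by tool_call_id.
results = [
    {"tool_call_id": call["id"],
     "role": "tool",
     "content": TOOLS[call["function"]["name"]](
         **json.loads(call["function"]["arguments"]))}
    for call in tool_calls
]
print(results[0]["content"])  # Sunny in Paris
```

Before this release, each round trip carried at most one function call, so multi-lookup tasks required several model turns; parallel calls collapse them into one.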

Is GPT-4 Vision Preview deprecated?

Yes, the standalone GPT-4 Vision Preview has been deprecated. Vision capabilities are now built into GPT-4o and later models as standard features rather than requiring separate endpoints.

What is the difference between GPT-4 and o1 Preview?

GPT-4 is a general-purpose model that responds immediately, while o1 Preview is a reasoning model that spends time "thinking" before answering. The o1 series excels at math, coding, and science tasks that benefit from multi-step reasoning, while GPT-4 is better for general conversation and speed.

What is GPT-4 1106 Preview?

GPT-4 1106 Preview is the November 2023 release of GPT-4 Turbo, featuring a 128K context window, JSON mode, and parallel function calling. It was the first GPT-4 variant designed for large-scale document processing and structured output generation.