Telnyx - Global Communications Platform ProviderHome
Voice AIVoice APIInferenceMobile VoiceSpeech-to-TextText-to-speechSIP TrunkingSMS APIWhatsApp Business APIView all productsHealthcareFinanceTravel and HospitalityLogistics and TransportationContact CenterInsuranceRetail and E-CommerceSales and MarketingServices and DiningView all solutionsVoice AIVoice APIInferenceMobile VoiceSpeech-to-TextText-to-SpeechSIP TrunkingSMS APIWhatsApp Business APIGlobal NumbersIoT SIM CardView all pricingOur NetworkMission Control PortalCustomer storiesGlobal coveragePartnersCareersEventsResource centerSupport centerAI TemplatesSETIDev DocsIntegrations
Contact usLog in
Contact usLog inSign up

Social

Company

  • Our Network
  • Global Coverage
  • Release Notes
  • Careers
  • Voice AI
  • AI Glossary
  • Shop

Legal

  • Data and Privacy
  • Report Abuse
  • Privacy Policy
  • Cookie Policy
  • Law Enforcement
  • Acceptable Use
  • Trust Center
  • Country Specific Requirements
  • Website Terms and Conditions
  • Terms and Conditions of Service

Compare

  • ElevenLabs
  • Vapi
  • Baseten
  • Together.ai
  • Twilio
  • Bandwidth
  • Vonage
  • Amazon Connect
© Telnyx LLC 2026
ISO • PCI • HIPAA • GDPR • SOC2 Type II

Ask AI

  • GPT
  • Claude
  • Perplexity
  • Gemini
  • Grok

GPT-3.5 Turbo-1106

The November 2023 GPT-3.5 Turbo snapshot introducing JSON mode, parallel function calling, and a 16k context window for structured output tasks.

Start buildingGET Available Models

about

Released at OpenAI DevDay in November 2023, this snapshot merged the standard and 16K context variants into a single model defaulting to 16,384 tokens and introduced JSON mode, parallel function calling, and reproducible outputs via a seed parameter. Input pricing dropped 50% compared to the 0613 snapshot, and the training data cutoff moved forward to April 2023.

Licenseopenai
Context window(in thousands)4096

Use cases for GPT-3.5 Turbo-1106

  1. Structured data extraction: JSON mode guarantees valid JSON output, making it reliable for parsing unstructured text into database-ready records without post-processing.
  2. Multi-tool API orchestration: Parallel function calling enables it to query multiple external services simultaneously within a single conversation turn.
  3. Reproducible test generation: The seed parameter produces deterministic outputs, making it useful for automated testing pipelines that require consistent baseline responses.

Quality

Arena Elo1068
MMLUN/A
MT Bench8.32

GPT-3.5 Turbo scores 70.0% on MMLU (5-shot) and 7.94 on MT-Bench, placing it below GPT-4 (86.4% MMLU, 8.99 MT-Bench) but above Mixtral 8x7B Instruct (70.6% MMLU, 8.30 MT-Bench) on general knowledge. The 1106 snapshot added JSON mode and parallel function calling without changing the underlying benchmark profile.

Hermes 2 Pro Mistral 7B

1074

Mistral 7B Instruct v0.2

1072

GPT-3.5 Turbo-1106

1068

Llama 2 Chat 13B

1063

Dolphin 2.5 Mixtral 8X7B

1063

pricing

The cost per 1,000 tokens for running the model with Telnyx Inference is $0.0010. For instance, analyzing 1,000,000 customer chats, assuming each chat is 1,000 tokens long, would cost $1,000.

What's Twitter saying?

  • Developers praised the November 2023 updates to GPT-3.5 Turbo (e.g., gpt-3.5-turbo-1006/1106) for 38% better format following like JSON/XML and improved function calling accuracy.
  • Community forums reported sudden quality degradation in natural language understanding and instruction following, with increased hallucinations after repeated API calls.
  • Users noted ongoing proprietary refinements to models like gpt-3.5-turbo-0613 into November 2023, breaking some applications despite promises of stability.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.

No data available at this time, please try again later.
OrganizationModel NameTasksLanguages SupportedContext LengthParametersModel TierLicense
No data available at this time, please try again later.
HOW IT WORKS

Selecting LLMs for Voice AI

GET Available Models
RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Icon Resources ebook

    Test in the portal

    Easily browse and select your preferred model in the AI Playground.

    Test today
  • Icon Resources Docs

    Explore the docs

    Don’t wait to scale, start today with our public API endpoints.

    Get started
  • Icon Resources Article

    Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

    See updates

Sign up and start building

Sign upContact sales

faqs

Is GPT-3.5 Turbo a good model?

GPT-3.5 Turbo 1106 introduced JSON mode and parallel function calling, making it a strong choice for structured applications at a low price point. It offered solid performance on chat, summarization, and code tasks, though newer models like GPT-4o mini have since surpassed it.

Does GPT-3.5 Turbo still exist?

GPT-3.5 Turbo 1106 remains accessible through the API but has been superseded by the 0125 snapshot and newer models. OpenAI recommends GPT-4o mini for new projects requiring similar capabilities at better performance.

What is the difference between GPT-3 and GPT-3.5 Turbo?

GPT-3.5 Turbo was a major step forward from GPT-3, adding chat optimization, a 16K context window, JSON mode, and parallel function calling. It was designed for the Chat Completions API rather than the older completions format.

How much does GPT-3.5 Turbo cost?

The 1106 variant is priced at $1.00 per million input tokens and $2.00 per million output tokens through OpenAI's API. This makes it one of the most affordable OpenAI models, though GPT-4o mini now offers better performance at a comparable price.