Telnyx - Global Communications Platform ProviderHome
Voice AIVoice APIInferenceMobile VoiceSpeech-to-TextText-to-speechSIP TrunkingSMS APIWhatsApp Business APIView all productsHealthcareFinanceTravel and HospitalityLogistics and TransportationContact CenterInsuranceRetail and E-CommerceSales and MarketingServices and DiningView all solutionsVoice AIVoice APIInferenceMobile VoiceSpeech-to-TextText-to-SpeechSIP TrunkingSMS APIWhatsApp Business APIGlobal NumbersIoT SIM CardView all pricingOur NetworkMission Control PortalCustomer storiesGlobal coveragePartnersCareersEventsResource centerSupport centerAI TemplatesSETIDev DocsIntegrations
Contact usLog in
Contact usLog inSign up

Social

Company

  • Our Network
  • Global Coverage
  • Release Notes
  • Careers
  • Voice AI
  • AI Glossary
  • Shop

Legal

  • Data and Privacy
  • Report Abuse
  • Privacy Policy
  • Cookie Policy
  • Law Enforcement
  • Acceptable Use
  • Trust Center
  • Country Specific Requirements
  • Website Terms and Conditions
  • Terms and Conditions of Service

Compare

  • ElevenLabs
  • Vapi
  • Baseten
  • Together.ai
  • Twilio
  • Bandwidth
  • Vonage
  • Amazon Connect
© Telnyx LLC 2026
ISO • PCI • HIPAA • GDPR • SOC2 Type II

Ask AI

  • GPT
  • Claude
  • Perplexity
  • Gemini
  • Grok

GPT-3.5 Turbo-0125

The January 2024 snapshot of GPT-3.5 Turbo with improved format-following accuracy, faster responses, and better handling of non-English function calls.

Start buildingGET Available Models

about

The final major snapshot of the GPT-3.5 generation, released January 2024, fixed a UTF-8 encoding bug affecting non-English function calls and improved format-following accuracy for JSON, YAML, and XML outputs. The gpt-3.5-turbo alias now points permanently to this version, representing the ceiling of what the architecture could achieve after over a year of iterative refinement.

Licenseopenai
Context window(in thousands)4096

Use cases for GPT-3.5 Turbo-0125

  1. Multilingual function calling: The 0125 snapshot fixed a UTF-8 encoding bug that broke non-English function calls, making it reliable for structured API interactions in languages beyond English.
  2. Format-constrained generation: Improved format-following accuracy makes it suited for producing valid YAML, XML, and JSON outputs without post-processing or retry loops.
  3. High-throughput chat backends: As the final and most refined GPT-3.5 snapshot, it delivers the best quality-per-dollar ratio for production chat systems that don't require GPT-4-class reasoning.

Quality

Arena Elo1106
MMLUN/A
MT BenchN/A

GPT-3.5 Turbo 0125 shares the same 70.0% MMLU (5-shot) and 7.94 MT-Bench baseline as the broader GPT-3.5 Turbo family. As the final snapshot in the series, it improved format-following accuracy and non-English function calling without changing core benchmark performance. Compared to Mixtral 8x7B Instruct (70.6% MMLU, 8.30 MT-Bench) on the sheet, it trails slightly on both measures.

GPT-3.5 Turbo-0613

1117

Mixtral 8x7B Instruct v0.1

1114

GPT-3.5 Turbo-0125

1106

GPT-3.5 Turbo

1105

Llama 2 Chat 70B

1093

pricing

The cost per 1,000 tokens for the model with Telnyx Inference is $0.0010. To illustrate, if an organization were to analyze 1,000,000 customer chats, and each chat consisted of an average of 1,000 tokens, the total cost would be $1,000.

What's Twitter saying?

  • Developers report GPT-3.5 Turbo 0125 performs worse than 0613, with shorter, lazier outputs that skip greetings and miss ideas in paraphrasing tasks.
  • Users note increased output variation even at temperature 0 and reduced performance in classification and filter extraction compared to prior versions.
  • Some testers praise its speed in multi-language chatbots and fast streaming, though tool-calling issues persist.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.

No data available at this time, please try again later.
OrganizationModel NameTasksLanguages SupportedContext LengthParametersModel TierLicense
No data available at this time, please try again later.
TRY IT OUT

Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.

Loading...
HOW IT WORKS

Selecting LLMs for Voice AI

GET Available Models
RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Icon Resources ebook

    Test in the portal

    Easily browse and select your preferred model in the AI Playground.

    Test today
  • Icon Resources Docs

    Explore the docs

    Don’t wait to scale, start today with our public API endpoints.

    Get started
  • Icon Resources Article

    Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

    See updates

Sign up and start building

Sign upContact sales

faqs

Is GPT-3.5 Turbo 0125 free?

GPT-3.5 Turbo 0125 is not free through the API but is one of OpenAI's most affordable models. It is available through inference providers and may be accessible in ChatGPT's free tier.

Does GPT-3.5 Turbo still exist?

Yes, GPT-3.5 Turbo 0125 is the latest snapshot in the 3.5 Turbo series and remains available through the API. OpenAI recommends GPT-4o mini as the successor for new projects.

Is GPT-3.5 Turbo a good model?

GPT-3.5 Turbo 0125 improved accuracy on format-following tasks and fixed a bug that caused incomplete UTF-8 sequences. It is a reliable choice for structured output tasks like classification and summarization, though newer models outperform it on reasoning.

How much does GPT-3.5 Turbo cost?

GPT-3.5 Turbo 0125 is priced at $0.50 per million input tokens and $1.50 per million output tokens, making it one of the cheapest options in OpenAI's model lineup.

Is GPT-3.5 free or paid?

The GPT-3.5 Turbo API is paid with usage-based pricing. Free access is available through ChatGPT with usage limits. For production deployments, hosted inference platforms offer API access with their own pricing.

Is GPT-3.5 Turbo better than GPT-4?

GPT-4 is significantly more capable than GPT-3.5 Turbo on reasoning, coding, and complex tasks. GPT-3.5 Turbo's advantage is speed and cost, making it better for high-volume, straightforward tasks where maximum quality is not critical.

Loading...