GPT-3.5 Turbo-1106

Transform your AI interactions with GPT-3.5 Turbo-1106, featuring top-tier conversational skills and rapid response times.

Choose from hundreds of open-source LLMs in our model directory.

GPT-3.5 Turbo-1106, licensed by OpenAI, is an advanced language model known for crafting engaging responses and managing intricate dialogues. Its impressive performance makes it suitable for various applications, including virtual assistance, customer service chatbots, and interactive storytelling.

Context window(in thousands)4096

Use cases for GPT-3.5 Turbo-1106

  1. Content generation: GPT-3.5 Turbo-1106 efficiently produces diverse content types such as articles, blog posts, and creative writing.
  2. Chatbots: Leveraging its language comprehension, GPT-3.5 Turbo-1106 builds sophisticated chatbots capable of handling complex user queries.
  3. Translation: With its impressive MT Bench score, the model performs well in language translation across different languages.
Arena Elo1068
MT Bench8.32

GPT-3.5 Turbo-1106 demonstrates high-caliber performance across key metrics, providing effective language responses, solid translation benchmarks, and strong knowledge-based task understanding.

Hermes 2 Pro Mistral 7B


Mistral 7B Instruct v0.2


GPT-3.5 Turbo-1106


Llama 2 Chat (13B)


Dolphin 2.5 Mixtral 8X7B


Throughput(output tokens per second)56
Latency(seconds to first tokens chunk received)0.32
Total Response Time(seconds to output 100 tokens)2.2

The model features moderate throughput, low latency, and quick total response time, ideal for real-time interactive applications but may face challenges with high-volume concurrent usage.


The cost per 1,000 tokens for running the model with Telnyx Inference is $0.0010. For instance, analyzing 1,000,000 customer chats, assuming each chat is 1,000 tokens long, would cost $1,000.

What's Twitter saying?

  • Performance: William Tweet highlights discussions on GPT-3.5-turbo-0613 model performance compared to its predecessors, sparking interest in the model's metrics. (Source: @wgussml)
  • Steerability improvements: Simon Willison questions if steerability improvements in GPT-3.5 Turbo-0613 also apply to GPT-3.5 Turbo-16k, prompting a debate on update reliability. (Source: @simonw)
  • Function calling in GPT models: Jayjen highlights function calling capabilities in OpenAI's GPT models, enhancing versatility in GPT-3.5 Turbo-0613 and GPT-4-0613. (Source: @jayjen_x)

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.


Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.

Sign-up to get started with the Telnyx model library

Get started

Check out our helpful tools to help get you started.

  • Icon Resources EBook

    Test in the portal

    Easily browse and select your preferred model in the AI Playground.

  • Icon Resources Docs

    Explore the docs

    Don’t wait to scale, start today with our public API endpoints.

  • Icon Resources Article

    Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

Start building your future with Telnyx AI

What is GPT-3.5-Turbo-1106 and how does it differ from other GPT models?

GPT-3.5-Turbo-1106 is a version of the GPT-3.5 Turbo large language model developed by OpenAI, featuring a 16,385 token context window, optimized for chat and non-chat tasks, with capabilities like improved instruction following and parallel function calling. It differs from models like GPT-3.5-Turbo-0125 by offering specific enhancements in accuracy and text encoding for non-English languages, and from GPT-4 by not being as advanced in tasks and languages but providing a cost-effective alternative for specific applications.

What is the context window size of GPT-3.5-Turbo-1106?

The context window size of GPT-3.5-Turbo-1106 is 16,385 tokens, allowing it to consider a large amount of text for generating responses, making it effective for complex conversations and tasks.

What are the training data cut-off and capabilities of GPT-3.5-Turbo-1106?

GPT-3.5-Turbo-1106 was trained on data up to September 2021, and it is optimized for both chat using the Chat Completions API and non-chat tasks. Its capabilities include improved instruction following, JSON mode, reproducible outputs, and parallel function calling.

How does GPT-3.5-Turbo-1106 perform compared to GPT-4?

While GPT-4 outperforms GPT-3.5 models in various tasks and languages, including chat and vision tasks, GPT-3.5-Turbo-1106 is a more cost-effective option for applications that do not require GPT-4's advanced capabilities. GPT-3.5-Turbo-1106 is optimized for chat and non-chat tasks, offering improved instruction following and parallel function calling.

Can GPT-3.5-Turbo-1106 understand and generate non-English languages?

Yes, GPT-3.5-Turbo-1106 includes improvements for handling non-English languages, although it might not reach the performance level of GPT-4 in multilingual tasks. It has a fix for a text encoding issue in non-English language function calls, enhancing its capability in processing and generating text in various languages.

What are some alternative models to GPT-3.5-Turbo-1106?

Alternative models include Mistral Medium, known for being less lazy and more efficient in certain instances, and Mixtral 8x7B, which is comparable to GPT-3.5 v1106 and available for free at These models offer varied capabilities for different use cases and preferences.

How can I start using GPT-3.5-Turbo-1106 for building connectivity apps?

To start using GPT-3.5-Turbo-1106 in connectivity apps, developers can access the model through platforms like Telnyx, which supports integration with OpenAI's models. For more detailed instructions on integrating GPT models with Telnyx, visit Telnyx's documentation.

Are there any user experiences or feedback on GPT-3.5-Turbo-1106?

Some users have reported GPT-3.5-Turbo-1106 as being less capable than GPT-3.5-Turbo-0125 in certain aspects, while others have found it better at following instructions. Compared to GPT-4, users have found GPT-3.5-Turbo-1106 sometimes struggles with accurately reading returned documents, indicating a mix of experiences based on the task and user expectations.