Released at OpenAI DevDay in November 2023, this snapshot merged the standard and 16K context variants into a single model with a 16,385-token context window and introduced JSON mode, parallel function calling, and best-effort reproducible outputs via a seed parameter. Input pricing dropped to $0.001 per 1K tokens, down from $0.0015 (4K) and $0.003 (16K) for the 0613 snapshots, while the training data cutoff remained September 2021.
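To make the new options concrete, here is a minimal sketch of how they map onto Chat Completions request parameters. No API call is made; the parameter names follow OpenAI's documented API, and the message contents are invented for the example:

```python
# Assemble a Chat Completions request body using the 1106 features.
# No network call is made here; this only shows where each option goes.
request = {
    "model": "gpt-3.5-turbo-1106",
    "messages": [
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three EU capitals."},
    ],
    "response_format": {"type": "json_object"},  # JSON mode
    "seed": 42,  # best-effort reproducible outputs
}
```

With `response_format` set to `json_object`, the model is constrained to emit syntactically valid JSON; the `seed` makes repeated calls return the same output on a best-effort basis.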
Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.
Our playground runs on our own GPU infrastructure: select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal.
GPT-3.5 Turbo 1106 introduced JSON mode and parallel function calling, making it a strong choice for structured applications at a low price point. It offered solid performance on chat, summarization, and code tasks, though newer models like GPT-4o mini have since surpassed it.
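As a sketch of how parallel function calling is set up, the snippet below defines a tool in OpenAI's documented `tools` schema. The function name and schema are hypothetical examples, not part of any real API:

```python
# A tool definition in OpenAI's "tools" format. get_order_status is a
# hypothetical example function, not a real endpoint.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Look up the status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]
# With parallel function calling, a single model response can contain
# multiple tool_calls entries, e.g. one get_order_status call per order
# mentioned in the user's message, instead of one call per round trip.
```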
GPT-3.5 Turbo 1106 remains accessible through the API but has been superseded by the 0125 snapshot and newer models. OpenAI recommends GPT-4o mini for new projects requiring similar capabilities at better performance.
GPT-3.5 Turbo scores 70.0% on MMLU (5-shot) and 7.94 on MT-Bench, placing it below GPT-4 (86.4% MMLU, 8.99 MT-Bench) and slightly behind Mixtral 8x7B Instruct (70.6% MMLU, 8.30 MT-Bench) on general knowledge. The 1106 snapshot added JSON mode and parallel function calling without changing the underlying benchmark profile.
The cost per 1,000 tokens for running the model with Telnyx Inference is $0.0010. For instance, analyzing 1,000,000 customer chats of 1,000 tokens each (one billion tokens in total) would cost $1,000.
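The arithmetic above can be checked with a short helper. This is an illustrative sketch using the flat per-1K rate quoted in this article, not an official billing calculator:

```python
# Telnyx Inference rate quoted above: $0.0010 per 1,000 tokens.
PRICE_PER_1K_TOKENS = 0.0010

def inference_cost(num_chats: int, tokens_per_chat: int) -> float:
    """Total dollar cost for a batch of chats at the flat per-1K rate."""
    total_tokens = num_chats * tokens_per_chat
    return total_tokens / 1_000 * PRICE_PER_1K_TOKENS

# 1,000,000 chats x 1,000 tokens = 1 billion tokens
print(inference_cost(1_000_000, 1_000))  # 1000.0
```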
GPT-3.5 Turbo was a major step forward from GPT-3, adding chat optimization, a 16K context window, JSON mode, and parallel function calling. It was designed for the Chat Completions API rather than the older completions format.
The 1106 variant is priced at $1.00 per million input tokens and $2.00 per million output tokens through OpenAI's API. This made it one of the most affordable OpenAI models, though GPT-4o mini now offers better performance at an even lower price ($0.15 per million input tokens and $0.60 per million output tokens).
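Because input and output tokens are billed at different rates, per-request cost depends on the prompt/completion split. A small sketch of that calculation, using the 1106 list prices quoted above:

```python
# OpenAI list prices for gpt-3.5-turbo-1106, dollars per million tokens.
INPUT_PRICE_PER_M = 1.00
OUTPUT_PRICE_PER_M = 2.00

def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the per-million-token rates above."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A request with a 900-token prompt and a 100-token reply:
print(api_cost(900, 100))  # 0.0011
```

Note that output tokens cost twice as much as input tokens, so completion-heavy workloads (long generations from short prompts) cost more per token than prompt-heavy ones.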