Inference changelog

Stay up to date with the latest feature releases on the Telnyx AI platform, so you can easily connect to your data.

Inference API graphic

Using AI to summarize Telnyx Storage objects

March 20th, 2024

The summarize API provides a single convenient endpoint to summarize any text, audio, or video file in a Telnyx Storage bucket. File summaries are generated entirely in-house: under the hood, we use our /audio/transcriptions endpoint to transcribe audio and video files, and the /chat/completions endpoint to produce the summary.

This feature is available now in the portal and via API.

The Telnyx Summary API supports the following formats:

  • Text formats: pdf, html, txt, json, csv
  • Audio and video formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm

Summaries can be generated for files of up to 100MB.
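The routing described above can be sketched as follows. The endpoint paths and format lists come from this changelog; the helper name and dispatch logic are illustrative, not the actual implementation:

```python
# Illustrative sketch of how a summarize request is routed, based on the
# endpoints named above. Function and variable names are hypothetical.

TEXT_FORMATS = {"pdf", "html", "txt", "json", "csv"}
AUDIO_VIDEO_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}

def summarize_pipeline(filename: str) -> list:
    """Return the endpoints a file would pass through when summarized."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext in TEXT_FORMATS:
        # Text files go straight to summarization.
        return ["/chat/completions"]
    if ext in AUDIO_VIDEO_FORMATS:
        # Audio and video are transcribed first, then summarized.
        return ["/audio/transcriptions", "/chat/completions"]
    raise ValueError(f"Unsupported format: {ext}")
```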


Summary API pricing is dependent on the file type being summarized.

For audio and video files, pricing starts from $0.003/minute, as per the pricing for the /audio/transcriptions endpoint. Text file summary pricing is based on the /chat/completions endpoint pricing, at $0.0003/1K tokens.
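Those starting rates make it straightforward to estimate a summary's cost. A minimal sketch, using only the per-minute and per-token rates quoted above:

```python
# Starting rates quoted in this changelog.
TRANSCRIPTION_RATE = 0.003   # $ per minute of audio/video
COMPLETION_RATE = 0.0003     # $ per 1K tokens of text

def audio_summary_cost(minutes: float) -> float:
    """Transcription cost for summarizing an audio/video file."""
    return minutes * TRANSCRIPTION_RATE

def text_summary_cost(tokens: int) -> float:
    """Completion cost for summarizing a text file."""
    return tokens / 1000 * COMPLETION_RATE
```

For example, a 10-minute recording starts at $0.03 in transcription cost, and a 5,000-token document at $0.0015 in completion cost.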

A portal view of storage buckets summarized using the Telnyx Summarize API

OpenAI Compatible /audio/transcriptions (BETA)

March 12th, 2024

The /audio/transcriptions API provides a speech-to-text endpoint to transcribe spoken words to text.


  • Supports flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm file types.
  • Supports segment level timestamps.
  • Pairs nicely with our /chat/completions endpoint to summarize audio.

The Telnyx /audio/transcriptions API supports a 4x higher maximum file size than OpenAI: users can transcribe files up to 100MB, compared to OpenAI's 25MB limit.

Pricing starts from $0.003/minute, 50% cheaper than OpenAI.
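A simple pre-upload check captures the constraints listed above. The supported extensions and 100MB limit come from this changelog; the helper itself is a hypothetical client-side convenience, not part of the API:

```python
# Hypothetical pre-upload validation against the documented limits.
MAX_BYTES = 100 * 1024 * 1024  # Telnyx limit: 100MB (vs. 25MB with OpenAI)
SUPPORTED = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}

def can_transcribe(filename: str, size_bytes: int) -> bool:
    """True if the file type and size fall within documented limits."""
    ext = filename.rsplit(".", 1)[-1].lower()
    return ext in SUPPORTED and size_bytes <= MAX_BYTES
```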

Follow our Call Summarization tutorial to get started.

Audio Transcriptions API

Explore new updates in our AI Playground

February 29th, 2024

We’re excited to bring system prompts and chat to our AI Playground in the portal.


  • System prompts allow users to give a model context and instructions before asking a question. For example, users can specify a role, how to personalize the response, or what tone to use.
  • Telnyx users can view the chat responses and interact with an LLM in the conversation field.

Storage for AI and System Prompts demo

Start testing today in the Mission Control Portal.


Inference pricing scales with model size:

  • $0.0002 / 1K tokens for 7B parameter models
  • $0.0003 / 1K tokens for 13B, 34B, 8x7B parameter models
  • $0.001 / 1K tokens for 70B parameter models

Take a look at our pricing page for all our Inference pricing.

OpenAI Compatible /chat/completions

February 22nd, 2024

The Chat Completions API enables the LLM to use the chat history as context when returning a model-generated response.


  • Chat Completions with support for messages, temperature, max_tokens, stream, and more.
  • Retrieval augmented generation (RAG) with embedded Telnyx Storage buckets using the tools parameter.
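A request body combining the parameters above might look like the sketch below. The messages, temperature, max_tokens, stream, and tools parameters are named in this changelog, but the exact tools schema here is an assumption; consult the API reference for the real shape:

```python
def build_chat_request(question: str, bucket: str = None) -> dict:
    """Assemble an OpenAI-style chat completions request body."""
    payload = {
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.7,
        "max_tokens": 512,
        "stream": False,
    }
    if bucket:
        # Hypothetical retrieval tool pointing at an embedded Telnyx Storage
        # bucket for RAG; the exact schema is an assumption, not documented here.
        payload["tools"] = [{
            "type": "retrieval",
            "retrieval": {"bucket_ids": [bucket]},
        }]
    return payload

rag_request = build_chat_request("What does the onboarding doc say?", bucket="my-bucket")
plain_request = build_chat_request("Hello!")
```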

Chat Completion API

Take a look at our Inference Pricing Page for a detailed pricing list.

Connect your data on our AI platform