
A voice API is your software interface to the phone network

Learn how programmable voice works, what you can build with it, and why carrier-layer control changes the economics.

By Marlo Vernon

Definition: A voice API is an Application Programming Interface (API) that software developers use to make and receive phone calls from their applications. It connects internet-based applications to the Public Switched Telephone Network (PSTN).

Key concepts

  • A voice API enables software to initiate, receive, route, and manage phone calls programmatically.
  • It supports use cases like IVRs, call tracking, conferencing, automation, and AI-powered voice experiences.
  • Programmable voice gives businesses more flexibility than static phone systems or fixed workflows.

One of the main advantages of a voice API is that developers don't need telecom expertise to build voice applications. The API handles the telephony functions, so developers can focus on designing an engaging customer experience. Voice APIs are also configurable, easy to consume, scalable and cost-effective.

In a nutshell, a voice API enables an application to programmatically make, receive and manage calls, without having to interface directly with the PSTN. You can also use a voice API to route voice calls with global reach to phones, browsers, SIP domains and mobile applications.


Common use cases for voice APIs

Voice APIs serve a diverse range of use cases, including contact center and UCaaS platforms, cloud-based IVRs, call tracking solutions, Artificial Intelligence (AI) applications, omnichannel routing, and voice notifications and alerts.

Any workforce application with embedded collaboration or communication requirements can leverage a programmable voice API. As a result, use cases for embeddable communication APIs have multiplied in recent years.

Key benefits of voice APIs

"One of the key benefits of APIs is that they are configurable."
Naomi Ko, D!gitalist Magazine

Businesses can even extend the value of their existing solutions by using APIs to add voice or messaging capabilities to legacy software.

APIs are also easily consumable, which means businesses can leverage them to improve their customer experience, while at the same time lowering their operational costs.

Using APIs, companies of all sizes can embed secure and reliable omnichannel communications capabilities into their applications, to better meet the needs of today's businesses, customers and end-users.

How much does a voice API cost?

Costs for voice APIs vary depending on the provider you choose and on the call types, volumes and features you need. In general, providers add a standard per-minute charge for programmable voice.

Here's a quick comparison of Telnyx pricing vs. Twilio pricing to give you an idea:


                            Make calls (per min)    Receive calls (per min)
Call type                   Telnyx      Twilio      Telnyx      Twilio
Local calls                 $0.0070     $0.0140     $0.0055     $0.0085
Toll-free calls             $0.0020     $0.0140     $0.0170     $0.0220
SIP, browser or app calls   $0.0020     $0.0040     $0.0020     $0.0040
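
To see what these per-minute rates mean for a monthly bill, here is a minimal Python sketch. The local-call rates are copied from the comparison above; the call volumes (10,000 outbound and 5,000 inbound minutes) and the function name are hypothetical and only there for illustration.

```python
# Per-minute local-call rates in USD, copied from the comparison above.
RATES = {
    "telnyx": {"make": 0.0070, "receive": 0.0055},
    "twilio": {"make": 0.0140, "receive": 0.0085},
}


def monthly_cost(provider: str, outbound_minutes: int, inbound_minutes: int) -> float:
    """Estimate a monthly bill for local calls, rounded to whole cents."""
    rate = RATES[provider]
    total = outbound_minutes * rate["make"] + inbound_minutes * rate["receive"]
    return round(total, 2)


# Hypothetical monthly volumes, chosen only for illustration.
OUTBOUND_MINUTES = 10_000
INBOUND_MINUTES = 5_000

telnyx_cost = monthly_cost("telnyx", OUTBOUND_MINUTES, INBOUND_MINUTES)
twilio_cost = monthly_cost("twilio", OUTBOUND_MINUTES, INBOUND_MINUTES)
difference = round(twilio_cost - telnyx_cost, 2)
```

With those assumed volumes, the estimate comes to $97.50 for Telnyx and $182.50 for Twilio. Real invoices depend on number rental, features and volume discounts, which this sketch leaves out.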

What features should a voice API include?

Programmable voice APIs vary from basic to feature-rich. The most basic call flows start with the ability to speak strings of text and gather DTMF keypad input. You can use most voice APIs to create outbound calls, record calls and manage conferences.

Global Audio Conferencing

An API with global audio conferencing features enables your application to connect people and teams globally, with highly configurable behavior.

For example, you can specify hosts that control global conference behavior, play sounds when participants join and leave, define a length of time after which the conference automatically ends, mute and unmute or hold and unhold participants, and much more.

Media Streaming

Media Streaming (sometimes called media forking) enables your application to deliver calls while simultaneously duplicating call media to multiple recipients. The moment the call is established, the Telnyx voice API takes the call media and forks it. Call media can be duplicated, delivered, analyzed and returned in real time. And the second recipient never occupies the call stream, so you never have to worry about degraded quality or dropped connections.
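
The forking idea can be illustrated with a small Python sketch. This is a conceptual model only, not how Telnyx implements Media Streaming: the function name, the frame type and the callback signatures are all assumptions. It shows the key property described above, namely that the call itself is served first and a failing secondary recipient cannot disturb it.

```python
from typing import Callable, Iterable, List

Frame = bytes  # assumption: one chunk of call audio


def fork_media(
    frames: Iterable[Frame],
    primary: Callable[[Frame], None],
    listeners: List[Callable[[Frame], None]],
) -> None:
    """Deliver every frame to the call and a copy to each extra recipient."""
    for frame in frames:
        # The call itself always gets the frame first.
        primary(frame)
        for listener in listeners:
            try:
                # Each listener receives its own copy of the audio.
                listener(bytes(frame))
            except Exception:
                # A broken listener must never affect the live call.
                continue


# Illustrative usage: plain lists stand in for the callee and an analytics service.
call_leg: List[Frame] = []
analytics: List[Frame] = []
fork_media([b"frame-1", b"frame-2"], call_leg.append, [analytics.append])
```

A production system would fork asynchronously so that a slow recipient cannot delay the call, which this synchronous sketch does not attempt.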

Using Media Streaming, you can build next-gen features into your application, like sentiment analysis, conversational AI, fraud detection, call transcriptions and voice biometrics, all powered by Telnyx Inference.

Text-to-Speech

Text-to-speech (TTS) is a form of speech synthesis that converts text into spoken voice output. It is first and foremost an accessibility feature that makes automated customer service systems usable for customers with disabilities. Customers who don't need it for accessibility often prefer text-to-speech options too, because they make it easier to interact with your IVR on the go.

Many programmable voice APIs incorporate text-to-speech technology for this reason. The Telnyx solution is powered by Amazon Polly and allows you to speak dynamic text in 29 different languages and accents.

Smart IVR

Using a programmable voice API, you can build a multi-level IVR to intelligently route your call flows. A smart IVR has enough menu options and interactive capabilities to handle simple customer service tasks without the aid of a human representative. Your programmable voice API should enable you to build a customer-first IVR that leverages:

  • AI technologies
  • Intelligent call routing
  • Omnichannel experiences
  • Text-to-speech capabilities
  • Call recording

The Telnyx voice API is ideally suited to building smart IVR systems. In fact, we ran a super in-depth (hour-long!) webinar, where our developers build one from start to finish.
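
As a rough illustration of a multi-level IVR, the Python sketch below models the menu as a nested dictionary and walks it with the DTMF digits a caller has pressed. The menu contents, the action strings and the function name are hypothetical; in a real application the returned step would be turned into speak, gather or transfer commands sent to your voice API.

```python
# Hypothetical two-level menu; prompts and action names are illustrative only.
MENU = {
    "prompt": "Press 1 for sales, 2 for support.",
    "options": {
        "1": {"action": "transfer:sales"},
        "2": {
            "prompt": "Press 1 for billing, 2 for technical help.",
            "options": {
                "1": {"action": "transfer:billing"},
                "2": {"action": "transfer:tech"},
            },
        },
    },
}


def next_step(menu: dict, digits: str) -> tuple:
    """Walk the menu with the digits pressed so far and return the next step."""
    node = menu
    for digit in digits:
        if "action" in node:
            break  # already at a leaf, so extra digits are ignored
        child = node["options"].get(digit)
        if child is None:
            # Invalid key: repeat the prompt of the current menu level.
            return ("speak", "Sorry, that is not a valid option. " + node["prompt"])
        node = child
    if "action" in node:
        return ("action", node["action"])
    return ("speak", node["prompt"])
```

For example, `next_step(MENU, "22")` resolves to the technical help transfer, while `next_step(MENU, "2")` returns the second-level prompt to be spoken to the caller.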


Answering Machine Detection

Answering Machine Detection (AMD) can tell you in real time whether a call has been answered by a human or a machine, so you can tailor your experience accordingly. Outbound calling use cases where Answering Machine Detection is particularly important include following up on potential leads, providing customers with critical updates and collecting information via voice surveys.

Answering Machine Detection in the Telnyx voice API offers industry-leading accuracy of over 97%. It works by sending a webhook to your application when:

  • Your outbound call is answered by a machine, so you can avoid connecting an agent unnecessarily.
  • The greeting ends, so you can leave a complete message rather than frustrating customers with half-messages left on their voicemail.
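
The two webhooks above could be handled along these lines. This is a minimal Python sketch under stated assumptions: the event type strings, the result values and the returned action names are placeholders chosen for illustration, so check your provider's webhook reference for the actual names.

```python
# Assumed event names and result values; verify them against your provider's docs.
DETECTION_ENDED = "call.machine.detection.ended"
GREETING_ENDED = "call.machine.greeting.ended"


def handle_amd_event(event_type: str, result: str = "") -> str:
    """Decide what the application should do after an AMD webhook."""
    if event_type == DETECTION_ENDED:
        if result == "machine":
            # Don't tie up an agent; wait until the voicemail greeting finishes.
            return "wait_for_greeting_end"
        # A human answered, or detection was inconclusive: favor the live caller.
        return "connect_agent"
    if event_type == GREETING_ENDED:
        # The greeting is over, so the whole message will be recorded.
        return "play_voicemail_message"
    return "ignore"
```

Treating an inconclusive result as a human is a design choice in this sketch: it risks an agent occasionally reaching voicemail, but it avoids leaving a real person waiting in silence.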

How does the Telnyx voice API work?

The Telnyx programmable voice API framework is essentially a set of REST APIs that allow a developer to control call flows, from the moment a call is made or received to the moment the call is terminated. In between, you'll receive a number of webhooks for each step of the call, which you answer with a command.

It's this communication back and forth of webhooks and commands that gives you granular control over your calls.
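
The webhook-and-command exchange can be sketched as a small dispatcher. The Python below is illustrative only: the event type strings, the JSON payload shape (`data`, `event_type`, `payload`, `call_control_id`) and the command names are assumptions, and the sketch returns the command as a dictionary instead of sending it over REST so that it stays self-contained.

```python
import json
from typing import Optional

# Assumed mapping from incoming webhook event to the command sent in response.
COMMAND_FOR_EVENT = {
    "call.initiated": "answer",      # incoming call: pick it up
    "call.answered": "speak",        # call is live: play a greeting
    "call.speak.ended": "hangup",    # greeting finished: end the call
}


def command_for(webhook_body: str) -> Optional[dict]:
    """Turn one webhook (a JSON string) into the next command, if any."""
    data = json.loads(webhook_body)["data"]
    action = COMMAND_FOR_EVENT.get(data["event_type"])
    if action is None:
        return None  # nothing to do for events the app doesn't handle
    return {
        "call_control_id": data["payload"]["call_control_id"],
        "action": action,
    }


# Illustrative webhook body with a made-up call identifier.
example = json.dumps(
    {"data": {"event_type": "call.initiated", "payload": {"call_control_id": "abc123"}}}
)
next_command = command_for(example)
```

In a real application, the returned dictionary would be translated into an authenticated REST request for the identified call, and the resulting webhook would drive the next step.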

What makes a good voice API?

First and foremost, a good API is one that's easy to build with. However, it's also important that your API offers the flexibility and control you need to customize your experience to your specific needs.

SDKs and robust documentation

A Software Development Kit (SDK) is a set of software development tools in one installable package. SDKs are designed for specific platforms or programming languages and make building applications a far more streamlined process that requires fewer development resources.

A good provider should deliver SDKs for popular programming languages and platforms such as Python, Ruby, .NET, Node, and PHP. These API wrappers mean you can easily integrate with the voice API, resulting in a faster go-to-market process for your application.

The same goes for developer documentation. Your voice API provider should offer robust documentation, such as quickstart guides, tutorials and product overviews, that covers a variety of use cases so developers can build what they need quickly and easily.

Great developer support

Some integrations are simple, and others are far more complicated. But it's impossible to predict when you're going to need technical support to move your project forward. That makes it important that your voice API provider offers solid technical support (ideally 24/7) at no extra cost.

Common Telnyx Voice API Questions

Telnyx is a company of engineers and developers who've built a voice API that puts the developer experience first. The Telnyx voice API is truly built by developers, for developers.

How scalable is Telnyx?

Best in class. The Telnyx infrastructure scales on-demand, enabling customers to provision voice connectivity instantly and with virtually unlimited capacity. You can build and scale configurable voice applications in minutes. And that means any type of voice app: unified communication software, contact center apps, call tracking and call recording software, and anything else that requires voice connectivity.

Is Telnyx a licensed carrier?

Yes. Telnyx is a true, licensed carrier with a private, global IP network that delivers unmatched call quality and reliability. So you can build ultra-stable apps that users and businesses can depend on.

Does Telnyx have 24/7/365 support?

Yes. Telnyx operates network operations support centers in Chicago and Dublin. So there's an engineer on call to help you troubleshoot and solve problems, at any time, on any day. Since we own and operate our own network, if the problem is on our end, our support engineers can fix it.

Telnyx also has a dedicated developer Slack channel, where you can speak to our NOC team, engineers and product managers 24/7, free of charge.

Is it difficult to migrate to Telnyx?

It's actually extremely easy. Telnyx has developed a tool called TeXML, which enables you to switch your programmable voice solution from the Twilio API with ease. And by using the Telnyx voice API to execute your existing TwiML code, you'll experience better call quality at significantly lower costs.

How to get started?

Want to build with Voice APIs? Join the Telnyx subreddit community. https://www.reddit.com/r/Telnyx/
