Build AI agents for WhatsApp with the Business API and Telnyx. Connect LLMs, handle webhooks, and deploy conversational AI on one co-located network.
Learning how to create AI agents for WhatsApp starts with a simple definition. AI agents for WhatsApp are automated conversational systems that use large language models to understand, respond to, and act on messages sent through the WhatsApp Business API. They replace static rule-based chatbots with dynamic, context-aware conversations that resolve real customer problems.
This guide walks through the full build. You will learn the architecture, set up the prerequisites, write a working Python agent, compare platforms, and pick up the practices that separate a useful WhatsApp AI agent from a frustrating one.
A WhatsApp AI agent is software that reads inbound WhatsApp messages, generates responses with an LLM, and sends replies through the WhatsApp Business API. Unlike a rule-based WhatsApp chatbot that matches keywords to canned replies, an AI agent understands intent, holds context across a conversation, and takes actions like looking up an order or booking an appointment.
The business case is straightforward. WhatsApp has 2 billion active users in 180+ countries, and 75% of consumers prefer messaging over email for support, according to Zendesk. A WhatsApp bot for business meets customers on the channel they already use every day.
Building one requires three layers. The WhatsApp Business API handles messaging transport. An LLM handles conversation logic. Infrastructure ties the two together. Most teams stitch these layers across separate vendors, and each boundary adds latency, cost, and failure points. Telnyx provides all three on one platform, with AI inference co-located with messaging on a single network.
Every WhatsApp AI agent runs on the same three-layer architecture. The WhatsApp API moves messages between your business number and the user. An inference layer runs the LLM that reads context and generates responses. Your application sits in the middle, handling webhooks, managing conversation state, and executing business logic.
The message lifecycle follows a predictable loop. A user sends a WhatsApp message. Telnyx delivers it to your webhook endpoint. Your app builds a prompt with conversation context and calls AI Inference, which supports OpenAI, Anthropic, and open-source models through one API. The LLM generates a response. Your app sends the reply back through the WhatsApp Messaging API.
Figure: WhatsApp AI agent architecture flow
Five components make this work in production. You need a WhatsApp Business API number, a messaging profile that routes traffic, a webhook endpoint your app exposes, an LLM model, and conversation state management so the agent remembers what was said three messages ago.
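Conversation state can be as simple as a capped per-user message history. The sketch below is one minimal way to do it, assuming an in-memory store keyed by phone number; the names `remember`, `build_messages`, and `MAX_TURNS` are illustrative and not part of any Telnyx API. A production deployment would likely swap the dictionary for Redis or a database so state survives restarts.

```python
from collections import defaultdict, deque

MAX_TURNS = 10  # assumption: keep only the 10 most recent messages per user

# Maps a phone number to its recent messages, oldest first.
# A deque with maxlen silently drops the oldest entry once it is full.
_histories = defaultdict(lambda: deque(maxlen=MAX_TURNS))


def remember(phone_number, role, content):
    """Record one message ("user" or "assistant") in the user's history."""
    _histories[phone_number].append({"role": role, "content": content})


def build_messages(phone_number, system_prompt):
    """Return the system prompt followed by the stored history for this user."""
    return [{"role": "system", "content": system_prompt}, *_histories[phone_number]]
```

Call `remember` once for the inbound message and once for the generated reply, then pass the result of `build_messages` as the `messages` list in the LLM request.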
The hop between steps 3 and 4 of that loop, where your app calls the LLM and waits for its response, is where most multi-vendor stacks fall apart. When messaging and inference run on different networks, every message pays a cross-vendor round trip. Telnyx co-locates both, which keeps inference latency under 500ms and the conversation feeling live. The same network also runs Voice AI, so the same agent can pick up a WhatsApp voice call.
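To check a latency figure like this against your own deployment, it helps to time each stage separately. The following is a small sketch using only the standard library; `timed`, `timings`, and `over_budget` are hypothetical helper names, and the 500 ms budget simply mirrors the number quoted above.

```python
import time
from contextlib import contextmanager

LATENCY_BUDGET_MS = 500  # mirrors the inference latency figure quoted above

# Most recent duration, in milliseconds, for each named stage.
timings = {}


@contextmanager
def timed(stage):
    """Measure the wall-clock time of the enclosed block and store it by name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000


def over_budget(stage, budget_ms=LATENCY_BUDGET_MS):
    """Return True if the last recorded run of a stage exceeded the budget."""
    return timings.get(stage, 0.0) > budget_ms
```

Wrapping the inference call in `with timed("inference"):` and the send call in `with timed("send"):` shows which hop dominates the round trip.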
Before writing code, gather the following. Each item takes minutes to set up in the Telnyx portal.
The build follows six steps. Set up your WhatsApp Business number on Telnyx. Create a messaging profile. Point the profile's webhook URL at your app. Implement the webhook handler. Call the AI Inference API with the message and context. Send the generated reply back through the WhatsApp Messaging API.
The example below is an order tracking agent built with Flask. It receives an inbound WhatsApp message, asks the LLM to answer using order data, and replies on the same thread.
```python
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

TELNYX_API_KEY = os.environ["TELNYX_API_KEY"]
WHATSAPP_NUMBER = os.environ["WHATSAPP_NUMBER"]

SYSTEM_PROMPT = (
    "You are an order tracking assistant. Answer in two sentences "
    "or fewer. If you cannot resolve the request, say you will "
    "connect the customer with a human agent."
)


def get_order_context(phone_number):
    # Replace with a lookup against your order database
    return "Order #4821: shipped, arriving Thursday via FedEx."


def generate_reply(user_message, order_context):
    # Ask the LLM for a reply grounded in the order data
    response = requests.post(
        "https://api.telnyx.com/v2/ai/chat/completions",
        headers={"Authorization": f"Bearer {TELNYX_API_KEY}"},
        json={
            "model": "meta-llama/Meta-Llama-3.1-70B-Instruct",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "system", "content": f"Order data: {order_context}"},
                {"role": "user", "content": user_message},
            ],
        },
        timeout=30,
    )
    response.raise_for_status()  # surface API errors before parsing the body
    return response.json()["choices"][0]["message"]["content"]


def send_whatsapp_message(to_number, text):
    # Send the reply back on the same WhatsApp thread
    response = requests.post(
        "https://api.telnyx.com/v2/messages",
        headers={"Authorization": f"Bearer {TELNYX_API_KEY}"},
        json={
            "from": WHATSAPP_NUMBER,
            "to": to_number,
            "type": "whatsapp",
            "text": text,
        },
        timeout=10,
    )
    response.raise_for_status()


@app.route("/webhook", methods=["POST"])
def webhook():
    # Tolerate empty or malformed bodies instead of raising a 500
    event = (request.get_json(silent=True) or {}).get("data", {})
    if event.get("event_type") == "message.received":
        try:
            payload = event["payload"]
            user_number = payload["from"]["phone_number"]
            user_message = payload["text"]
            order_context = get_order_context(user_number)
            reply = generate_reply(user_message, order_context)
            send_whatsapp_message(user_number, reply)
        except (requests.RequestException, KeyError):
            # Log the failure but still acknowledge the event below
            app.logger.exception("Failed to handle inbound message")
    return jsonify({"status": "ok"}), 200


if __name__ == "__main__":
    app.run(port=5000)
```

Three things to notice. The system prompt constrains the agent to short answers and defines an escalation path. The get_order_context function injects real business data into the prompt, which is what makes this an agent rather than a generic chatbot. And the webhook always returns 200, even when a downstream call fails, so Telnyx does not retry the event. The handler still waits for the LLM call before it responds, so under load you should move that work off the request thread to keep the acknowledgement fast.
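Because the handler above runs the LLM call before it responds, a slow inference request delays the acknowledgement. One common way around this is to hand the work to a thread pool and return right away. The sketch below also skips events it has already seen, on the assumption that each webhook event carries a unique `id` field and that retried deliveries reuse it; `schedule_once` is a hypothetical helper name, and a real deployment would likely use a task queue and an expiring store instead of an in-process set.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)

# Event ids already scheduled. This set grows without bound in this sketch;
# use a store with expiry (for example Redis) in a long-running service.
_seen_ids = set()
_seen_lock = threading.Lock()


def schedule_once(event, process):
    """Run process(event) in the background unless this event id was seen before.

    Returns the Future for the scheduled work, or None for a duplicate.
    """
    event_id = event.get("id")  # assumption: events include a unique "id"
    with _seen_lock:
        if event_id in _seen_ids:
            return None
        _seen_ids.add(event_id)
    return executor.submit(process, event)
```

Inside the Flask route, you would call `schedule_once(event, handle_message)` and return the 200 response immediately, where `handle_message` is a function of your own that wraps the lookup, inference, and send steps.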
This example handles order tracking, but the pattern is identical for FAQ automation, appointment scheduling, customer support, and lead qualification. Swap the context function and the system prompt. The transport and inference layers stay the same.
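One way to make that swap explicit is to bundle each use case's system prompt and context function into a small profile object. This is a sketch under assumptions: `AgentProfile`, `PROFILES`, `build_prompt`, and the appointment data are all illustrative, and the context functions return placeholder strings where a real lookup would go.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentProfile:
    system_prompt: str
    get_context: Callable[[str], str]  # takes a phone number, returns context text


def order_context(phone_number):
    # Placeholder: replace with a lookup against your order database
    return "Order #4821: shipped, arriving Thursday via FedEx."


def appointment_context(phone_number):
    # Placeholder: replace with a lookup against your scheduling system
    return "Next appointment: Tuesday at 10:00."


PROFILES = {
    "orders": AgentProfile(
        "You are an order tracking assistant. Answer in two sentences or fewer.",
        order_context,
    ),
    "appointments": AgentProfile(
        "You are a scheduling assistant. Answer in two sentences or fewer.",
        appointment_context,
    ),
}


def build_prompt(profile_name, phone_number, user_message):
    """Assemble the LLM message list for the chosen use case."""
    profile = PROFILES[profile_name]
    return [
        {"role": "system", "content": profile.system_prompt},
        {"role": "system", "content": f"Context: {profile.get_context(phone_number)}"},
        {"role": "user", "content": user_message},
    ]
```

Adding a new use case then means adding one entry to `PROFILES`, with no change to the webhook or the send path.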
WhatsApp business automation covers most of the conversations a support or sales team handles today. These are the patterns teams deploy first.
Adjacent workflows extend the same infrastructure. You can send verification codes by following the WhatsApp OTP guide, fall back to the SMS API when a user is unreachable on WhatsApp, and route inbound calls to the same agent with WhatsApp calling AI.
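The SMS fallback can be sketched as a thin wrapper around two send functions. This is an illustration only: `DeliveryError` is a hypothetical exception that your own send functions would raise, and in practice you may only learn that a WhatsApp message failed from a later delivery status webhook rather than from the send call itself.

```python
class DeliveryError(Exception):
    """Hypothetical error raised by a sender when a message cannot be delivered."""


def send_with_fallback(to_number, text, send_whatsapp, send_sms):
    """Try WhatsApp first, then SMS. Returns the channel that accepted the message.

    Both senders are callables taking (to_number, text). They are passed in
    so the fallback logic can be exercised without any network access.
    """
    try:
        send_whatsapp(to_number, text)
        return "whatsapp"
    except DeliveryError:
        send_sms(to_number, text)
        return "sms"
```

Passing the senders as arguments keeps the policy separate from the transport, so the same function works with whatever WhatsApp and SMS clients you already have.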
WhatsApp bot platforms fall into three camps. Raw API providers give you transport and nothing else. No-code builders give you speed but cap what the agent can do. Full-stack platforms give you messaging, inference, and voice together. Here is how the options compare.
| Provider | AI approach | Limitation |
|---|---|---|
| Twilio | WhatsApp API via send/receive, AI from a separate vendor | Multi-vendor stitching adds latency and cost |
| Meta Cloud API | Raw API access, no AI infrastructure | Developer builds and hosts everything |
| Wati | No-code chatbot builder | Limited LLM support, no voice infrastructure |
| Respond.io | Omnichannel inbox with AI agent features | No telephony or voice AI |
| Landbot | No-code rule-based flows | Limited LLM integration, no voice |
| Telnyx | Full-stack, co-located AI and messaging | Requires developer setup for custom logic |
A working webhook is the easy part. The difference between a demo and a production WhatsApp conversational AI agent comes down to five habits.
If your agent will handle voice notes or WhatsApp calls, add transcription with the Speech-to-Text API and spoken responses with the Text-to-Speech API. Both run on the same network as your messaging, so the voice path inherits the same sub-500ms latency profile. Budget planning is covered in our WhatsApp API cost guide.