In Pakistan, Alibaba.com has over 100 sales representatives, each making roughly 50 outbound calls a day. Five thousand calls, every day, most of them cold, most of them in Urdu, most of them answered by people who are not yet ready to buy. The sheer volume of human effort required to separate interested leads from unqualified ones was eating into the time reps could spend actually closing deals.
Alibaba had already proven that AI-powered calling worked in other markets. Pakistan was next, but they needed Voice AI in English and Urdu, with local Pakistani phone numbers that local prospects would actually answer.
After Alibaba evaluated several providers, the decision came down to one thing: could anyone make Voice AI in Urdu work?
Building an AI agent that speaks Urdu is not a matter of picking a voice from a dropdown. Urdu and Hindi share a spoken foundation but diverge in script, vocabulary, and cultural register. Speech-to-text (STT) models trained on Hindi data regularly mis-transcribed Urdu, producing garbled call logs and broken conversation flows. A mis-transcribed word can change the customer's intent entirely: what sounds like "interest" in one language can map to "objection" in the other, and the AI agent responds to the wrong signal.
Text-to-speech (TTS) engines trained on Hindi or other regional languages could not match the cadence and clarity that Pakistani callers expect from a local phone number. The pronunciation differences are subtle but persistent. To a native speaker, a Hindi-trained TTS voice on a Pakistan caller ID sounds foreign, and foreign-sounding calls get hung up on.
Pakistan number provisioning added another layer. Calls from unattested numbers get flagged as spam on Pakistani mobile networks, which means they never ring at all. Local presence, carrier attestation, and routing reliability across South Asian networks all had to work together for calls to actually connect and sound legitimate.
And finally, there was the integration problem. Stitching together separate vendors for telephony, STT, TTS, and agent orchestration meant four places for the pipeline to break, and when Urdu transcription went wrong, diagnosing whether the issue was the carrier, the speech model, or the agent logic meant coordinating across three vendor support teams.
Telnyx Voice AI Agents gave Alibaba a single platform where telephony, speech processing, and agent orchestration run on the same infrastructure. SIP Trunking delivered Pakistan numbers with carrier-grade identity and local caller ID. The Voice API handled programmatic outbound routing at volume. No stitching, no middleware, no handoff between vendors.
The evaluation came down to Urdu. Telnyx's STT correctly transcribed Urdu without confusing it for Hindi. The TTS voices produced natural cadence and pronunciation for South Asian callers. But getting it right in a demo was not the same as getting it right in production. Alibaba's team worked through multiple prompt iterations with the Telnyx solutions engineering team, refining the agent's conversation flow, testing edge cases across Urdu dialects, and adjusting speech model parameters until recorded calls passed internal QA. Each round of test calls surfaced new edge cases: regional Urdu accents, code-switching between Urdu and English within a single sentence, background noise on Pakistani mobile networks. Each one was addressed, tested, and verified.
Urdu accuracy was the deciding factor. Without it, the entire pipeline falls apart: transcriptions are wrong, the agent responds to the wrong intent, and the call fails in ways the customer hears immediately. With Telnyx, the transcriptions were clean, the agent understood the caller's intent, and the conversation flowed naturally.
Alibaba's team emphasized that having telephony, speech processing, and agent orchestration under one contract and one escalation path was a practical advantage they had not found with any other provider. When Urdu transcription issues surfaced during testing, there was one team to call, not three vendors pointing fingers at each other.
Alibaba's Voice AI agents now handle outbound cold calling in Urdu and English, qualifying leads before routing high-intent prospects to human sales reps. Local Pakistan numbers provisioned through Telnyx's carrier network give local caller ID, so phones actually ring instead of getting flagged as spam. Sales reps spend their time on qualified conversations instead of cold dials. The AI handles the top of the funnel. Humans handle the close.
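The qualify-then-route split described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the CallOutcome record, the intent_score field, the 0.7 threshold, and the route function are hypothetical names, not part of Alibaba's deployment or of any Telnyx API.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the source does not say how "high intent" is scored.
HIGH_INTENT_THRESHOLD = 0.7


@dataclass
class CallOutcome:
    """Illustrative summary of one AI-handled qualification call."""
    lead_id: str
    language: str        # "ur" for Urdu, "en" for English
    intent_score: float  # 0.0 to 1.0, assumed to be produced by the AI agent


def route(outcome: CallOutcome) -> str:
    """Decide the next step for a lead after the AI qualification call."""
    if outcome.intent_score >= HIGH_INTENT_THRESHOLD:
        return "human_rep"  # high intent: hand off to a sales rep for the close
    return "nurture"        # not ready yet: stays at the top of the funnel


outcomes = [
    CallOutcome("lead-001", "ur", 0.85),
    CallOutcome("lead-002", "en", 0.30),
]

# Only leads routed to "human_rep" reach the sales team.
handoffs = [o.lead_id for o in outcomes if route(o) == "human_rep"]
```

In this sketch only lead-001 would reach a human rep. A real deployment would presumably rely on richer signals than a single score.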
The same infrastructure that powers Pakistan today is already scoped for Vietnam and India. The platform that solved Urdu can solve Vietnamese, Tamil, Bahasa. Same stack, same integration pattern, same escalation path.
Alibaba.com is the world's largest online commerce ecosystem, connecting hundreds of millions of buyers and suppliers across 190 countries and territories. Their International Cross-Border Business Unit operates sales teams across South and Southeast Asia, pitching platform memberships to suppliers who have never heard of Alibaba through anything but a phone call.