Telnyx

SRTP protocol: practical guide to secure VoIP

Secure VoIP and WebRTC media with SRTP: learn encryption basics, key exchange options (SDES, DTLS-SRTP, ZRTP), TLS best practices, and practical troubleshooting to prevent eavesdropping without adding latency.

By Eli Mogul

Unencrypted VoIP traffic is an open invitation for eavesdropping. With 93.7 million data records exposed in Q3 2024 alone and the average data breach now costing $4.88 million, teams running voice, WebRTC, and Voice AI Agents need clear guidance on securing real-time communications without sacrificing latency. That's where SRTP (Secure Real-time Transport Protocol) comes in.

What SRTP does and why it matters

SRTP is a security profile for RTP (Real-time Transport Protocol) that provides three core protections: encryption of media payloads, message authentication, and replay attack prevention. Developed by Cisco and Ericsson engineers and first published as RFC 3711 in 2004, SRTP has become the standard for securing VoIP and WebRTC media streams. Unlike standard RTP, which transmits voice packets in plaintext, SRTP uses AES (Advanced Encryption Standard) encryption to scramble media data. Even if an attacker intercepts packets, they can't decode the conversation without the session keys. SRTP is lightweight by design. It adds minimal overhead, making it suitable for low-bitrate voice codecs like G.729 and iLBC where bandwidth efficiency is critical. This means you get strong security without noticeable impact on call quality or latency.
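To put "minimal overhead" in numbers, here is a rough calculation for a single G.729 stream. The 20 ms packetization interval, IPv4 transport, absence of an MKI field, and the choice of tag lengths are assumptions made for illustration, not figures from a particular deployment.

```python
# Back-of-the-envelope SRTP overhead for one G.729 stream.
# Assumptions (illustrative only): 20 ms packetization, IPv4 without options,
# no MKI field, AES counter mode (which does not pad or expand the payload).

IP_HEADER = 20           # bytes, IPv4 without options
UDP_HEADER = 8           # bytes
RTP_HEADER = 12          # bytes, no CSRC list or header extensions
G729_PAYLOAD = 20        # bytes: 8 kbit/s * 20 ms / 8 bits per byte
PACKETS_PER_SECOND = 50  # one packet every 20 ms


def packet_size(auth_tag_bytes: int = 0) -> int:
    """Total IP packet size in bytes for one voice frame."""
    return IP_HEADER + UDP_HEADER + RTP_HEADER + G729_PAYLOAD + auth_tag_bytes


def bitrate_kbps(size_bytes: int) -> float:
    """Bandwidth at the IP layer for a stream of packets of this size."""
    return size_bytes * 8 * PACKETS_PER_SECOND / 1000


for label, tag in [("RTP", 0), ("SRTP, 32-bit tag", 4), ("SRTP, 80-bit tag", 10)]:
    size = packet_size(tag)
    print(f"{label:18} {size} bytes/packet, {bitrate_kbps(size):.1f} kbit/s")
```

Under these assumptions the authentication tag is the only per-packet cost, so the increase is a few kilobits per second per stream.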

SRTP vs RTP: key differences

Feature | RTP | SRTP
Encryption | None | AES-128/192/256 (counter mode)
Authentication | None | HMAC-SHA1 (80- or 32-bit tag)
Replay protection | None | Sequence number validation
Overhead | Minimal | Typically 10-14 bytes per packet

Standard RTP leaves voice data exposed. SRTP wraps each packet with encryption and an authentication tag, ensuring confidentiality and integrity without adding significant latency.
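The authentication and replay-protection rows can be sketched with the Python standard library alone. This is a simplified illustration, not a usable SRTP stack: it skips AES encryption and session-key derivation, the 20-byte key is a placeholder, and the names `auth_tag`, `protect`, `verify`, and `ReplayWindow` are made up for this example. The tag construction (HMAC-SHA1 over the packet followed by the 32-bit rollover counter, truncated to 80 bits) follows RFC 3711.

```python
import hashlib
import hmac

TAG_LEN = 10  # 80-bit tag; the 32-bit variant would use 4


def auth_tag(auth_key: bytes, packet: bytes, roc: int) -> bytes:
    """HMAC-SHA1 over packet || rollover counter, truncated to TAG_LEN bytes."""
    mac = hmac.new(auth_key, packet + roc.to_bytes(4, "big"), hashlib.sha1)
    return mac.digest()[:TAG_LEN]


def protect(auth_key: bytes, packet: bytes, roc: int = 0) -> bytes:
    """Append the tag (payload encryption is omitted in this sketch)."""
    return packet + auth_tag(auth_key, packet, roc)


def verify(auth_key: bytes, protected: bytes, roc: int = 0) -> bool:
    """Recompute the tag and compare in constant time."""
    packet, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
    return hmac.compare_digest(tag, auth_tag(auth_key, packet, roc))


class ReplayWindow:
    """Sliding window over packet indices; rejects duplicates and stale packets."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1  # highest index accepted so far
        self.seen = 0      # bit i set means index (highest - i) was received

    def check_and_update(self, index: int) -> bool:
        # A real receiver would only update the window after verify() succeeds.
        if index > self.highest:
            shift = index - self.highest
            self.seen = ((self.seen << shift) | 1) & ((1 << self.size) - 1)
            self.highest = index
            return True
        offset = self.highest - index
        if offset >= self.size or self.seen & (1 << offset):
            return False  # too old, or already received
        self.seen |= 1 << offset
        return True
```

A receiver built along these lines would drop any packet for which `verify` returns `False` or `check_and_update` returns `False`, which is how a tampered or replayed packet gets discarded before it reaches the decoder.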

[Figure: RTP vs. SRTP]

Key exchange methods: SDES, DTLS-SRTP, and ZRTP

SRTP itself doesn't handle key exchange. You need a separate mechanism to negotiate encryption keys between endpoints. Three methods dominate:

SDES (Session Description Protocol Security Descriptions) passes keys inline within SIP signaling. It's simple to implement, but security depends entirely on the signaling channel. If you use SDES, TLS encryption for SIP is non-negotiable. Without it, keys travel in plaintext.

DTLS-SRTP performs key exchange directly in the media path using a DTLS handshake (TLS adapted for datagrams) over UDP. It is mandatory for WebRTC, so every browser-based call uses it by default. The IETF has also favored DTLS-SRTP for SIP deployments because it builds on TLS's proven security model rather than inventing new mechanisms.

ZRTP uses Diffie-Hellman key agreement directly in the media stream, enabling end-to-end encryption without relying on signaling security. However, adoption has declined since 2020, with vendors prioritizing legacy maintenance over expansion amid evolving standards.

For mixed SIP and WebRTC environments, DTLS-SRTP provides the best interoperability. It works natively with browsers and integrates smoothly with SIP endpoints through media gateways.
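Whichever method you pick, the result is the same kind of material: a master key and master salt for SRTP. The sketch below shows how that material arrives in two of the cases, assuming the AES_CM_128_HMAC_SHA1_80 suite (16-byte key, 14-byte salt). For SDES it parses an `a=crypto` line as laid out in RFC 4568; for DTLS-SRTP it splits exported keying material in the order given by RFC 5764. The function names and the demo key bytes are hypothetical, and the DTLS handshake itself is out of scope here.

```python
import base64
from typing import NamedTuple

# Lengths in bytes for AES_CM_128_HMAC_SHA1_80 (assumed suite for this sketch)
KEY_LEN, SALT_LEN = 16, 14


class MasterKey(NamedTuple):
    key: bytes
    salt: bytes


def parse_sdes_crypto(line: str) -> tuple[str, MasterKey]:
    """Parse 'a=crypto:<tag> <suite> inline:<base64 key||salt>[|lifetime][|MKI]'."""
    _tag, suite, key_params = line.strip().removeprefix("a=crypto:").split()[:3]
    method, _, key_info = key_params.partition(":")
    if method != "inline":
        raise ValueError(f"unsupported key method: {method!r}")
    blob = base64.b64decode(key_info.split("|")[0])
    if len(blob) != KEY_LEN + SALT_LEN:
        raise ValueError(f"expected {KEY_LEN + SALT_LEN} key bytes, got {len(blob)}")
    return suite, MasterKey(blob[:KEY_LEN], blob[KEY_LEN:])


def split_dtls_srtp_keying_material(material: bytes) -> tuple[MasterKey, MasterKey]:
    """Split exporter output into (client_write, server_write) master keys.

    Layout: client key, server key, client salt, server salt.
    """
    if len(material) != 2 * (KEY_LEN + SALT_LEN):
        raise ValueError("unexpected keying material length")
    client_key = material[:KEY_LEN]
    server_key = material[KEY_LEN:2 * KEY_LEN]
    salts = material[2 * KEY_LEN:]
    return (
        MasterKey(client_key, salts[:SALT_LEN]),
        MasterKey(server_key, salts[SALT_LEN:]),
    )


# Demo with obviously fake key bytes; never reuse these values.
demo_inline = base64.b64encode(bytes(range(30))).decode()
suite, master = parse_sdes_crypto(
    f"a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:{demo_inline}|2^20"
)
print(suite, len(master.key), len(master.salt))
```

The SDES half makes the risk concrete: anyone who can read that SDP line can reconstruct the master key, which is why the signaling channel carrying it has to be encrypted.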

Securing both signaling and media

SRTP protects media, but signaling (call setup, caller ID, routing information) requires separate encryption. TLS for SIP handles this by encrypting the control channel between endpoints and servers. The combination of TLS for signaling and SRTP for media has become the industry-standard security baseline. This dual-layer approach prevents attackers from intercepting call metadata or injecting malicious signaling messages while also keeping the actual conversation private. For WebRTC implementations, the split is similar: signaling typically travels over HTTPS or secure WebSockets, while a DTLS handshake in the media path handles SRTP key exchange. The certificate fingerprints carried in the signaling tie the two layers together.
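On the signaling side, a client-side TLS setup for SIP might look like the following sketch using Python's standard `ssl` module. The TLS 1.2 floor is an assumed policy choice, `open_sips_connection` is a hypothetical helper, and the host name you pass in would be your own SIP server. Port 5061 is the conventional port for SIP over TLS.

```python
import socket
import ssl


def make_sip_tls_context() -> ssl.SSLContext:
    """Client-side TLS context for SIP signaling.

    create_default_context() enables certificate and hostname verification;
    the minimum version below is an assumed policy, adjust to your own.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def open_sips_connection(host: str, port: int = 5061) -> ssl.SSLSocket:
    """Open a verified TLS connection to a SIP server (hypothetical helper)."""
    raw = socket.create_connection((host, port), timeout=5)
    return make_sip_tls_context().wrap_socket(raw, server_hostname=host)


ctx = make_sip_tls_context()
print(ctx.minimum_version.name, ctx.verify_mode.name, ctx.check_hostname)
```

Leaving verification on matters more than the exact version floor: a TLS connection that accepts any certificate still exposes SDES keys to an attacker positioned between the endpoint and the server.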

Compliance requirements: HIPAA, PCI, GDPR

Regulatory frameworks increasingly mandate encryption for voice communications carrying sensitive data.

HIPAA requires healthcare organizations to encrypt protected health information (PHI) during transmission. VoIP systems handling patient data must use TLS and SRTP to meet security requirements. Non-compliance penalties can reach $2.19 million per violation category annually.

PCI DSS mandates encryption for any system processing payment card data. Contact centers handling transactions over VoIP must implement SRTP to protect cardholder information during calls.

GDPR requires appropriate technical measures to protect personal data, including voice communications. While GDPR doesn't specify encryption protocols, SRTP provides the confidentiality controls needed to demonstrate compliance.

For organizations navigating these requirements, SIP trunking with built-in TLS and SRTP simplifies compliance by handling encryption at the network edge.

Troubleshooting common SRTP issues

Misconfigured SRTP often manifests as one-way audio, no audio, or failed call setup. Common culprits include:

Cipher mismatch: Endpoints must agree on at least one crypto suite. If one side offers only AES-256 suites and the other supports only AES-128, negotiation fails. Check your SDP offers and answers to confirm they share a suite.

Key exchange failures: SDES keys embedded in SIP INVITE messages won't work if the signaling path strips or corrupts them. Verify TLS is active end-to-end and that intermediary proxies preserve crypto attributes.

NAT traversal issues: SRTP doesn't change NAT behavior, but encrypted packets can trigger different firewall rules. Ensure your session border controller (SBC) or media gateway handles SRTP-aware NAT traversal.

Certificate problems: DTLS-SRTP relies on certificate fingerprints carried in SDP. If the fingerprint doesn't match the certificate an endpoint actually presents, or the certificate has expired, the handshake fails. For WebRTC, verify fingerprint attributes match the certificates your endpoints present.

Packet captures remain essential for debugging. Tools like Wireshark can identify SRTP packets and show whether encryption is active, even if they can't decrypt the content without keys.
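Two of the checks above, crypto suite agreement and fingerprint matching, are easy to script against SDP bodies pulled from a capture or a log. The helpers below are a minimal sketch with hypothetical names. They assume SDES-style `a=crypto` lines, a SHA-256 `a=fingerprint` attribute, and a certificate already available as DER bytes.

```python
import hashlib


def offered_crypto_suites(sdp: str) -> list[str]:
    """Return the crypto suite names from every a=crypto line, in order."""
    suites = []
    for line in sdp.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].startswith("a=crypto:"):
            suites.append(parts[1])
    return suites


def common_suites(offer_sdp: str, answer_sdp: str) -> list[str]:
    """Suites present in both SDP bodies, in the offerer's preference order."""
    answered = set(offered_crypto_suites(answer_sdp))
    return [s for s in offered_crypto_suites(offer_sdp) if s in answered]


def sdp_fingerprint(cert_der: bytes) -> str:
    """Format the SHA-256 digest of a DER certificate the way SDP carries it."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    pairs = [digest[i:i + 2] for i in range(0, len(digest), 2)]
    return "sha-256 " + ":".join(pairs)


def fingerprint_matches(sdp: str, cert_der: bytes) -> bool:
    """True if any a=fingerprint line in the SDP matches the certificate."""
    expected = sdp_fingerprint(cert_der).lower()
    prefix = "a=fingerprint:"
    for line in sdp.splitlines():
        line = line.strip()
        if line.lower().startswith(prefix):
            if line[len(prefix):].strip().lower() == expected:
                return True
    return False
```

An empty result from `common_suites` points to a cipher mismatch, and a `False` from `fingerprint_matches` points to a certificate problem. Neither replaces a packet capture, but both narrow down where to look.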

How Telnyx secures voice at the edge

Telnyx built its network with security as a foundation. As a licensed carrier with a private IP backbone, Telnyx offers TLS and SRTP at every edge point. For WebRTC applications, DTLS-SRTP handles key exchange automatically, so browser-based Voice AI Agents get encryption by default. The infrastructure supports HD codecs alongside encryption, so security doesn't compromise call quality. GPU colocation with global points of presence keeps latency low for real-time AI voice applications while maintaining full encryption. For teams building compliant communications, Telnyx provides self-service debugging tools to diagnose encryption issues and free 24/7 engineering support when you need expert help.

Get started with secure VoIP

Ready to lock down your voice infrastructure? Explore Telnyx SIP trunking with built-in TLS and SRTP, or build secure real-time experiences with our Voice API. For AI-powered voice applications, check out Voice AI Agents running on encrypted infrastructure designed for low-latency, compliant communications.
