# Telnyx Voice: Voice Design Lab — Full Documentation
> Complete page content for Voice Design Lab (Voice section) of the Telnyx developer docs (https://developers.telnyx.com).
> Root index: https://developers.telnyx.com/llms.txt · Lightweight index for this subsection: https://telnyx-openapi-ng.s3.us-east-1.amazonaws.com/llms/voice/voice-design-lab.txt


### Overview

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab.md

The **[Voice Design Lab](https://portal.telnyx.com/#/app/ai/voice-design-lab)** lets you create custom voices for text-to-speech. No recording studio, no training datasets, no waiting.

- **Start from a description.** Write what you want in plain text (age, tone, accent, energy) and the AI generates it.
- **Start from a recording.** Upload an audio clip (or record one in the browser) and the AI captures that voice identity.

---

## Design a Voice

### Overview

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/design-voice/concepts.md

## What voice design does

Voice design generates a synthetic voice from a natural language description. You describe what you want — age, tone, accent, pacing — and the AI creates audio samples that match.

This is **not** voice cloning. There's no source audio. The voice is generated from scratch based on your text prompt.

## The two-step flow: design → clone

The API has two separate resources:

1. **Voice Design** — an intermediate artifact. Think of it as a draft. You can iterate on it (up to 50 versions per design). It is NOT usable for TTS directly.
2. **Voice Clone** — a production-ready voice. Created from a design. This is what you pass to AI Assistants, Call Control, and the TTS API.

```
POST /v2/voice_designs → generates a sample → returns design id + version
POST /v2/voice_clones  → saves the design as a usable voice → returns voice clone id
```

The portal hides this two-step flow behind a single "Save This Voice" button. If you're using the API directly, you need both steps.
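
A minimal sketch of how the two resources relate when you iterate on a draft before saving it. It only builds request bodies and sends nothing. The helper names and the local version check are illustrative, not part of the Telnyx SDK, and it assumes that adding a version needs only `voice_design_id`, `prompt`, and `text`.

```python
MAX_VERSIONS_PER_DESIGN = 50  # limit stated in the list above


def new_version_body(design_id: str, prompt: str, text: str, current_version: int) -> dict:
    """Body for POST /v2/voice_designs that adds a version to an existing design."""
    if current_version >= MAX_VERSIONS_PER_DESIGN:
        raise ValueError("design already has the maximum number of versions")
    # Assumption: supplying voice_design_id targets the existing design
    return {"voice_design_id": design_id, "prompt": prompt, "text": text}


def clone_body(design_id: str, version: int, name: str, language: str, gender: str) -> dict:
    """Body for POST /v2/voice_clones that saves one design version as a usable voice."""
    return {
        "voice_design_id": design_id,
        "version": version,
        "name": name,
        "language": language,
        "gender": gender,
    }


# Iterate on the draft, then save the version you like
draft = new_version_body(
    "DESIGN_ID", "Female, mid-thirties. Warm, slightly husky.", "Hello!", current_version=1
)
final = clone_body("DESIGN_ID", version=2, name="Friendly Receptionist", language="en", gender="female")
print(draft)
print(final)
```

Keeping the version number explicit in the clone request makes it clear which iteration of the draft becomes the production voice.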

---

### Quickstart

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/design-voice/quickstart.md

## Portal walkthrough

1. Select **Telnyx** or **Minimax** using the provider toggle.
2. Write a natural language description of the voice you want: gender, age, tone, pace, texture, personality.
3. Click **Generate Samples** to create three audio previews. Each reads a different script in your chosen language.
4. Listen to each sample. Click **Regenerate All** to try again, or refine your description.
5. Click **Save This Voice**. Give it a name and gender tag. This creates a production-ready voice clone.

## Using the API

### 1. Create a voice design

```http
POST /v2/voice_designs

{
  "name": "Friendly Receptionist",
  "prompt": "Female, mid-thirties. Warm and full, slightly husky.",
  "text": "Hello, thank you for calling. How can I help you today?",
  "language": "en",
  "provider": "telnyx"
}
```

Set `"provider": "minimax"` to use the Minimax provider instead.

### 2. Listen to the generated sample

```http
GET /v2/voice_designs/{id}/sample
```

Returns `audio/wav`.

### 3. Save as a usable voice clone

A voice design is a draft. To use it in production, save it as a clone:

```http
POST /v2/voice_clones

{
  "name": "Friendly Receptionist",
  "voice_design_id": "DESIGN_ID",
  "version": 1,
  "language": "en",
  "gender": "female"
}
```
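
If you are not using an SDK, the three requests above can be sketched with Python's standard library. This is an illustration under assumptions: the base URL `https://api.telnyx.com/v2` and Bearer-token authentication are not stated on this page, and `json_request` is a hypothetical helper. The requests are built but not sent.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.telnyx.com/v2"  # assumed base URL
API_KEY = "YOUR_TELNYX_API_KEY"


def json_request(method: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {API_KEY}"}  # assumed auth scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{API_BASE}{path}", data=data, headers=headers, method=method)


# 1. Create a voice design
create_design = json_request("POST", "/voice_designs", {
    "name": "Friendly Receptionist",
    "prompt": "Female, mid-thirties. Warm and full, slightly husky.",
    "text": "Hello, thank you for calling. How can I help you today?",
    "language": "en",
    "provider": "telnyx",
})

# 2. Fetch the generated sample (the response body is audio/wav)
get_sample = json_request("GET", "/voice_designs/DESIGN_ID/sample")

# 3. Save the design as a usable voice clone
save_clone = json_request("POST", "/voice_clones", {
    "name": "Friendly Receptionist",
    "voice_design_id": "DESIGN_ID",
    "version": 1,
    "language": "en",
    "gender": "female",
})

for request in (create_design, get_sample, save_clone):
    print(request.get_method(), request.full_url)
# To send a request: urllib.request.urlopen(request)
```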

## Full example

```python Python
import telnyx

client = telnyx.Telnyx(api_key="YOUR_TELNYX_API_KEY")

# 1. Create a voice design
design = client.voice_designs.create(
    name="Friendly Receptionist",
    prompt="Female, mid-thirties. Warm and full, slightly husky.",
    text="Hello, thank you for calling. How can I help you today?",
    language="en",
    provider="telnyx",
)
design_id = design.data.id
print(f"Voice design created: {design_id}")

# 2. Download the generated audio sample
sample = client.voice_designs.download_sample(design_id)
with open("sample.wav", "wb") as f:
    f.write(sample.content)
print("Sample saved to sample.wav")

# 3. Save the design as a usable voice clone
clone = client.voice_clones.create(
    params={
        "voice_design_id": design_id,
        "name": "Friendly Receptionist",
        "language": "en",
        "gender": "female",
    }
)
print(f"Voice clone ready: {clone.data.id}")
# Use this clone ID in TTS, Call Control, or AI Assistants
```

```javascript Node.js
import Telnyx from 'telnyx';
import fs from 'fs';

const client = new Telnyx({ apiKey: 'YOUR_TELNYX_API_KEY' });

// 1. Create a voice design
const design = await client.voiceDesigns.create({
  name: 'Friendly Receptionist',
  prompt: 'Female, mid-thirties. Warm and full, slightly husky.',
  text: 'Hello, thank you for calling. How can I help you today?',
  language: 'en',
  provider: 'telnyx',
});
console.log(`Voice design created: ${design.data.id}`);

// 2. Download the generated audio sample
const sample = await client.voiceDesigns.downloadSample(design.data.id);
const buffer = Buffer.from(await sample.arrayBuffer());
fs.writeFileSync('sample.wav', buffer);
console.log('Sample saved to sample.wav');

// 3. Save the design as a usable voice clone
const clone = await client.voiceClones.create({
  params: {
    voice_design_id: design.data.id,
    name: 'Friendly Receptionist',
    language: 'en',
    gender: 'female',
  },
});
console.log(`Voice clone ready: ${clone.data.id}`);
// Use this clone ID in TTS, Call Control, or AI Assistants
```

Once saved, see [Using Custom Voices](/docs/voice/voice-design-lab/using-custom-voices) for how to use it in AI Assistants, Call Control, and the TTS API.

---

### Parameters

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/design-voice/api-details.md

## Providers

Set via the `provider` body parameter on `POST /v2/voice_designs`.

| | Telnyx (Qwen3TTS) | Minimax |
|---|---|---|
| **`provider` value** | `"telnyx"` (default) | `"minimax"` |
| **Generation parameters** | Respected | Not supported |
| **Languages** | Auto, Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian | Auto, Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian |
| **Prompt interpretation** | Follows prompts closely | May interpret differently |
| **When to use** | Fine control over generation, consistent results across iterations | Try a different model's interpretation of the same prompt |

## Generation Parameters

Body parameters on `POST /v2/voice_designs`. **Telnyx provider only** — ignored when `provider` is `"minimax"`.

| Body parameter | Default | Range | What it does |
|---|---|---|---|
| `temperature` | 0.9 | 0–2 | Higher = more varied/creative output. Lower = more predictable. |
| `top_k` | 50 | 1–1000 | Samples only from the `k` most likely tokens at each generation step. Lower = more focused. |
| `top_p` | 1.0 | 0–1 | Nucleus sampling cutoff. Lower = fewer token choices. |
| `repetition_penalty` | 1.05 | 1–2 | Reduces repeated patterns in generated audio. |
| `max_new_tokens` | 2048 | 100–4096 | Maximum tokens to generate. Affects output length. |

The defaults are a great starting point — you can skip these parameters entirely and get good results. Adjust them later if you want to fine-tune the output.
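
A minimal sketch of checking overrides against the table before sending them. The defaults and ranges come from the table above. Treating the ranges as inclusive, and raising an error for non-Telnyx providers instead of letting the values be silently ignored, are choices made for this example. `with_generation_params` is a hypothetical helper.

```python
# (default, minimum, maximum) copied from the Generation Parameters table
GENERATION_PARAMS = {
    "temperature": (0.9, 0, 2),
    "top_k": (50, 1, 1000),
    "top_p": (1.0, 0, 1),
    "repetition_penalty": (1.05, 1, 2),
    "max_new_tokens": (2048, 100, 4096),
}


def with_generation_params(body: dict, **overrides) -> dict:
    """Return a copy of a voice design body with range-checked overrides added."""
    if overrides and body.get("provider", "telnyx") != "telnyx":
        raise ValueError("generation parameters apply to the Telnyx provider only")
    result = dict(body)
    for key, value in overrides.items():
        if key not in GENERATION_PARAMS:
            raise KeyError(f"unknown generation parameter: {key}")
        _, low, high = GENERATION_PARAMS[key]
        if not low <= value <= high:  # assumption: bounds are inclusive
            raise ValueError(f"{key}={value} is outside {low}-{high}")
        result[key] = value
    return result


base = {"prompt": "Male, late thirties. Clean and dry.", "text": "Press 1 for sales.", "provider": "telnyx"}
# A lower temperature and top_k for more predictable output across iterations
print(with_generation_params(base, temperature=0.6, top_k=30))
```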

---

### Prompting Guide

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/design-voice/prompting-guide.md

## Recommended format

Structure your prompt for consistent results:

```
[Gender], [age]. [Tone and texture].
[Pace and distinctive quality].
```

**Example:**

> Female, mid-thirties. Warm and full, slightly husky. Moderate pace, sounds like someone who smiles while talking.

## Dimensions to describe

### Age

| Descriptor | What it produces |
|---|---|
| "young adult", "in their 20s" | Lighter, more energetic |
| "mid-thirties", "early forties" | Balanced, mature |
| "elderly", "in his 80s" | Deeper, weathered texture |

### Tone / Timbre

- **Deep** / **low-pitched** — gravitas, authority
- **Smooth** / **rich** — polished, professional
- **Gravelly** / **raspy** — character, authenticity
- **Airy** / **breathy** — intimate, soft
- **Warm** / **mellow** — approachable, friendly

### Gender

Male, female, or describe the sound directly: "a lower-pitched, husky female voice" or "a neutral, mid-pitched androgynous voice."

### Pacing

- **Measured** / **deliberate** — careful, authoritative
- **Rapid-fire** / **quick** — energetic, urgent
- **Relaxed** / **conversational** — natural, approachable
- **Rhythmic** — storytelling, narration

### Emotion / Energy

- **Calm** / **serene** — support, meditation
- **Enthusiastic** / **upbeat** — marketing, announcements
- **Authoritative** / **matter-of-fact** — IVR, instructions
- **Warm** / **empathetic** — customer service, healthcare

### Accent / Regional

Describe the regional quality you want. Be specific:
- "Slight British accent" rather than "British"
- "Neutral American" rather than just "American"
- "Soft Southern drawl" rather than "Southern"

### Use case context

Adding context helps the model understand intent:
- "Customer service agent for a bank"
- "Podcast narrator for true crime"
- "Bedtime story reader for children"

## Example prompts

| Use Case | Prompt | Recommended Engine |
|---|---|---|
| **Customer service** | Female, mid-thirties. Warm and full, slightly husky. Moderate pace, sounds like someone who smiles while talking. | **Minimax** |
| **IVR system** | Male, late thirties. Clean and dry, matter-of-fact. Deliberate pace, pauses before numbers and details. | **Telnyx** |
| **Voice agent** | Female, late twenties. Clear and professional, slightly upbeat. Natural conversational pace with a helpful tone. | |
| **Podcast narrator** | Male, early forties. Deep and smooth, with a rich baritone. Measured pacing, storytelling cadence. | **Minimax** |
| **Empathetic support** | Male, mid-thirties. Warm, slightly gravelly. Measured and unhurried. You can hear patience in the breathing rhythm. | **Telnyx** |
| **Notification/alert** | Female, mid-twenties. Bright and crisp. Quick pace, clear enunciation. No emotion — just information. | **Minimax** |
| **Meditation guide** | Female, mid-forties. Soft, airy, and serene. Extremely slow and deliberate pace. Soothing and deeply relaxing delivery. | **Minimax** |
| **Energetic promo** | Male, early twenties. Bright and enthusiastic, high energy. Rapid-fire pacing, sounds highly engaged and convincing. | **Minimax** |
| **Audiobook (Fiction)** | Male, in his 60s. Deep, weathered texture. Relaxed, storytelling cadence with a warm, nostalgic feel. | **Telnyx** |

## Common pitfalls

- **Too vague** — "nice voice" or "good voice" produces generic output. Be specific about at least 3 dimensions.
- **Contradictory traits** — "whisper" + "booming" confuses the model. Pick a coherent set of characteristics.
- **Provider differences** — the same prompt may produce noticeably different results on Telnyx vs Minimax. Try both.
- **Ignoring the preview text** — the text you provide for synthesis should match the voice's intended use. Don't use a cheerful script for a somber voice.
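
One way to keep prompts consistent is to assemble them from named dimensions. The sketch below follows the shape of the example prompt in this guide and enforces the "at least 3 dimensions" advice. `build_prompt` and its argument names are illustrative, not part of any Telnyx API.

```python
def build_prompt(gender: str = "", age: str = "", tone: str = "", pace: str = "", quality: str = "") -> str:
    """Compose a prompt shaped like 'Gender, age. Tone. Pace, quality.'"""
    given = [value for value in (gender, age, tone, pace, quality) if value]
    if len(given) < 3:
        raise ValueError("describe at least 3 dimensions to avoid generic output")
    sentences = [
        ", ".join(part for part in (gender, age) if part),
        tone,
        ", ".join(part for part in (pace, quality) if part),
    ]
    # Capitalize each non-empty sentence and end it with a period
    return " ".join(s[0].upper() + s[1:] + "." for s in sentences if s)


prompt = build_prompt(
    gender="female",
    age="mid-thirties",
    tone="warm and full, slightly husky",
    pace="moderate pace",
    quality="sounds like someone who smiles while talking",
)
print(prompt)
```

The function cannot detect contradictory traits such as "whisper" plus "booming"; that still needs a human read of the final prompt.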

## The Enhance button

The portal's **Enhance** button uses AI to expand a short description into a detailed prompt.

| Before (Short description) | After (Enhanced prompt) |
|---|---|
| Empathetic tech support agent | *Empathetic tech support agent* **Gender and age:** Female, late 20s to early 30s. **Where the voice sits:** Head and chest, with a balanced resonance. **Texture:** Silky smooth with a faint warmth, slightly airy. **Pace:** Moderate, with deliberate pauses for clarity and reassurance. **Distinctive quality:** A gentle, patient lilt that conveys calm and understanding. |
| Persuasive outbound sales caller | *Persuasive outbound sales caller* **Gender and age:** Female, 28-34. **Where the voice sits:** Throat with chest undertones. **Texture:** Smooth and polished, like warm honey over gravel. **Pace:** Brisk and rhythmic, with confident pauses for emphasis. **Distinctive quality:** A bright, engaging lilt that conveys enthusiasm without sounding forced. |
| Professional medical clinic receptionist | *Professional medical clinic receptionist* **Gender and age:** Female, 28-34. **Where the voice sits:** Chest and throat, grounded and clear. **Texture:** Smooth, slightly warm, with a subtle firmness like pressed cotton. **Pace:** Measured and steady, with deliberate pauses for clarity. **Distinctive quality:** A calm, reassuring tone, as if accustomed to offering comfort in stressful moments. |
| Patient language tutor | *Patient language tutor* **Gender and age:** Female, late 20s to early 30s. **Where the voice sits:** Head and chest, balanced resonance. **Texture:** Smooth, warm, and gently textured like soft velvet. **Pace:** Measured and deliberate, with thoughtful pauses and clear enunciation. **Distinctive quality:** A calm, encouraging lilt that feels reassuring and attentive. |

This is a good starting point, but review the expanded prompt before generating — you may want to tweak specific dimensions.

---

## Clone from Audio

### Overview

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/clone-voice/concepts.md

## What voice cloning does

Voice cloning captures a speaker's vocal characteristics — timbre, cadence, accent, pronunciation — from a short audio sample and applies them to new speech synthesis.

The clone is a *representation* of the voice, not a recording of it. The system learns patterns from your audio and encodes them into parameters that guide TTS. This means:

- The cloned voice can say things the original speaker never said
- Clone quality is bounded by what the model can learn from your sample
- Poor recordings, background noise, or inconsistent delivery degrade the clone

## What cloning doesn't do

A clone is not a recording. It's a statistical approximation of a voice: the model extracts patterns (formant frequencies, prosodic tendencies, spectral characteristics) and applies them during synthesis. This means:

- Output passes through the TTS model, which has its own characteristics. A clone sounds *like* the speaker, but through the lens of the model.
- Quality has a ceiling set by your source audio. No amount of API parameters will fix a noisy or inconsistent recording.
- The clone may not handle speech styles far from the original sample well. A voice cloned from calm narration may sound different when asked to express strong emotion.

## Two ways to create a clone

| Method | What it does | When to use |
|---|---|---|
| **Upload audio** | Send an audio file directly | You have a recording ready |
| **From a voice design** | Save a previously generated design as a clone | You used Design a Voice to create it |

Both produce the same output: a voice clone with a voice ID you can use in production.

## Recording best practices

1. **Match your recording to your use case.** Don't read a monotone script if you want an expressive clone. The AI replicates what it hears — including energy, emotion, and pacing.
2. **Speak clearly, avoid background noise.** Use a decent microphone in a quiet space. Background noise gets cloned too. You don't need a $10K mic — a $100-300 USB condenser in a quiet room is sufficient.
3. **Avoid long pauses.** The cloned voice will mimic pauses between sentences. Keep speech flowing naturally.
4. **Trim your recording.** Speech from start to finish, no dead air at the beginning or end.
5. **Speak in the target language.** If you want the clone to speak Spanish, record in Spanish.
6. **Keep it consistent.** Same tone, accent, and energy throughout. Wide fluctuations confuse the model. The AI clones everything — including stutters, "uhms", and inconsistencies.
7. **Aim for the right volume.** Target -23 to -18 dB RMS with peaks no higher than -3 dB. Too quiet = noise floor issues. Too loud = clipping.
8. **Audio codec doesn't matter much.** MP3 at 128 kbps or above is fine. WAV is ideal but higher bitrate MP3 won't noticeably hurt quality.
9. **Optimal duration by model:**
   - **Qwen3TTS:** 5–10 seconds. Auto-trims to 10s. More isn't better.
   - **Ultra:** Up to 10 seconds.
   - **Minimax:** 1–2 minutes is the sweet spot. Longer recordings capture more vocal range, but going beyond 3 minutes yields diminishing returns.
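
A rough way to check the volume and duration guidance before uploading, using only the standard library. This sketch assumes 16-bit PCM WAV input and interprets the dB targets as levels relative to digital full scale. `analyze_wav`, `level_ok`, and `make_tone` are hypothetical helpers; the synthetic tone only stands in for a real recording.

```python
import array
import io
import math
import sys
import wave

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def to_db(value: float) -> float:
    """Convert a linear sample magnitude to dB relative to full scale."""
    return 20 * math.log10(value / FULL_SCALE) if value > 0 else float("-inf")


def analyze_wav(source) -> dict:
    """Return duration, RMS level, and peak level for a 16-bit PCM WAV path or file object."""
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames, rate = wav.getnframes(), wav.getframerate()
        samples = array.array("h", wav.readframes(frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        raise ValueError("empty recording")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak = max(abs(s) for s in samples)
    return {"duration_s": frames / rate, "rms_db": to_db(rms), "peak_db": to_db(peak)}


def level_ok(stats: dict) -> bool:
    """Apply the volume targets from the list above (-23 to -18 dB RMS, peaks at or below -3 dB)."""
    return -23.0 <= stats["rms_db"] <= -18.0 and stats["peak_db"] <= -3.0


def make_tone(amplitude: float, seconds: float = 1.0, rate: int = 16000) -> io.BytesIO:
    """Synthesize a 440 Hz mono tone as an in-memory WAV, used here only as test input."""
    count = int(seconds * rate)
    data = array.array(
        "h", (int(amplitude * 32767 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(count))
    )
    if sys.byteorder == "big":
        data.byteswap()
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(data.tobytes())
    buffer.seek(0)
    return buffer


stats = analyze_wav(make_tone(0.12))
print(stats, "levels ok:", level_ok(stats))
```

For a real recording, pass a file path such as `analyze_wav("recording.wav")` and compare `duration_s` with the per-model durations listed above.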

---

### Quickstart

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/clone-voice/quickstart.md

## Upload a file

1. Select **Telnyx** or **Minimax** using the provider toggle.
2. In the [Voice Design Lab](https://portal.telnyx.com/#/app/ai/voice-design-lab), click **Upload Audio** and choose your file or drag and drop it.
3. Enter a name for the voice and select the gender.
4. Click **Clone Voice**. The system processes the audio and creates a voice clone, typically in a few seconds.

## Record directly in the browser

1. Click **Upload Audio**, then select the **Record** tab.
2. Select the language you'll speak in. The system generates a reading script optimized for voice cloning.
3. Click **Start Recording** and read the script clearly. It's designed to capture the full range of phonemes.
4. Listen to your recording. Re-record if needed, then click **Clone Voice**.

## Using the API

### Clone with Telnyx (default)

```http
POST /v2/voice_clones/from_upload
Content-Type: multipart/form-data

audio_file: recording.wav
name: My Custom Voice
language: en
gender: female
```

### Clone with Minimax

Supports longer audio (up to 5 minutes):

```http
POST /v2/voice_clones/from_upload
Content-Type: multipart/form-data

audio_file: recording.wav
name: My Custom Voice
language: en
gender: female
provider: minimax
```

### Clone with Ultra model

Ultra clones use the higher-quality `Ultra` model. The request returns `202 Accepted` — poll the clone's status until it becomes `active`.

```http
POST /v2/voice_clones/from_upload
Content-Type: multipart/form-data

audio_file: recording.wav
name: My Ultra Voice
language: en
gender: female
provider: telnyx
model_id: Ultra
```

Response (`202 Accepted`):

```json
{
  "data": {
    "id": "uuid",
    "status": "pending",
    ...
  }
}
```

Poll `GET /v2/voice_clones` and check the clone with the returned `id` until its `status` is `active`.

## SDK Examples

### Clone with Telnyx (default)

```python Python
import telnyx

client = telnyx.Telnyx(api_key="YOUR_TELNYX_API_KEY")

clone = client.voice_clones.create_from_upload(
    params={
        "audio_file": open("recording.wav", "rb"),
        "name": "My Custom Voice",
        "language": "en",
        "gender": "female",
        "provider": "telnyx",
    }
)

print("Voice Clone Created:", clone.data)
```

```javascript Node.js
import Telnyx from 'telnyx';
import fs from 'fs';

const client = new Telnyx({ apiKey: 'YOUR_TELNYX_API_KEY' });

const clone = await client.voiceClones.createFromUpload({
  params: {
    audio_file: fs.createReadStream('recording.wav'),
    name: 'My Custom Voice',
    language: 'en',
    gender: 'female',
    provider: 'telnyx',
  },
});

console.log('Voice Clone Created:', clone.data);
```

### Clone with Minimax

Minimax supports longer audio (10 seconds to 5 minutes, up to 20 MB) and uses the `speech-2.8-turbo` model.

```python Python
import telnyx

client = telnyx.Telnyx(api_key="YOUR_TELNYX_API_KEY")

clone = client.voice_clones.create_from_upload(
    params={
        "audio_file": open("recording.wav", "rb"),
        "name": "My Custom Voice",
        "language": "en",
        "gender": "female",
        "provider": "minimax",
    }
)

print("Voice Clone Created:", clone.data)
```

```javascript Node.js
import Telnyx from 'telnyx';
import fs from 'fs';

const client = new Telnyx({ apiKey: 'YOUR_TELNYX_API_KEY' });

const clone = await client.voiceClones.createFromUpload({
  params: {
    audio_file: fs.createReadStream('recording.wav'),
    name: 'My Custom Voice',
    language: 'en',
    gender: 'female',
    provider: 'minimax',
  },
});

console.log('Voice Clone Created:', clone.data);
```

### Clone with Ultra model

Ultra clones return `202 Accepted` and require polling until the status is `active`.

```python Python
import telnyx
import time

client = telnyx.Telnyx(api_key="YOUR_TELNYX_API_KEY")

# Create the clone (returns 202 Accepted)
clone = client.voice_clones.create_from_upload(
    params={
        "audio_file": open("recording.wav", "rb"),
        "name": "My Ultra Voice",
        "language": "en",
        "gender": "female",
        "provider": "telnyx",
        "model_id": "Ultra",
    }
)

print("Clone submitted:", clone.data.id, "— status:", clone.data.status)

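# Poll until active; stop early if processing fails
while True:
    clones = client.voice_clones.list()
    current = next((c for c in clones if c.id == clone.data.id), None)
    if current is not None and current.status == "active":
        print("Clone ready!")
        break
    if current is not None and current.status == "failed":
        raise RuntimeError("Voice clone processing failed")
    time.sleep(5)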
```

```javascript Node.js
import Telnyx from 'telnyx';
import fs from 'fs';

const client = new Telnyx({ apiKey: 'YOUR_TELNYX_API_KEY' });

// Create the clone (returns 202 Accepted)
const clone = await client.voiceClones.createFromUpload({
  params: {
    audio_file: fs.createReadStream('recording.wav'),
    name: 'My Ultra Voice',
    language: 'en',
    gender: 'female',
    provider: 'telnyx',
    model_id: 'Ultra',
  },
});

console.log('Clone submitted:', clone.data.id, '— status:', clone.data.status);

// Poll until active; stop early if processing fails
while (true) {
  const clones = await client.voiceClones.list();
  const found = clones.data.find((c) => c.id === clone.data.id);
  if (found?.status === 'active') {
    console.log('Clone ready!');
    break;
  }
  if (found?.status === 'failed') {
    throw new Error('Voice clone processing failed');
  }
  await new Promise((r) => setTimeout(r, 5000));
}
```

Once saved, see [Using Custom Voices](/docs/voice/voice-design-lab/using-custom-voices) for how to use it in AI Assistants, Call Control, and the TTS API.

---

### Parameters

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/clone-voice/parameters.md

## Models

Set via `model_id` (body) on `POST /v2/voice_clones/from_upload`, or use `provider` (body) to select Minimax.

| Model | Provider | Audio length | Max file | Sync/Async | Best for |
|---|---|---|---|---|---|
| **Qwen3TTS** | Telnyx (default) | 3–15s (auto-trimmed to 10s) | 5 MB | Sync (201) | Short, clean samples |
| **Ultra** | Telnyx | Up to 10s | 5 MB | **Async (202)** | Higher quality, more natural |
| **speech-2.8-turbo** | Minimax | 10s–5 min | 20 MB | Sync (201) | Longer recordings, more vocal range |

## Audio Requirements

Body parameter `audio_file` (multipart) on `POST /v2/voice_clones/from_upload`.

|                   | Qwen3TTS | Ultra | Minimax |
| ----------------- | -------- | ----- | ------- |
| **Audio length**  | 3–15s (5–10s optimal) | Up to 10s | 10s–5 min |
| **Max file size** | 5 MB | 5 MB | 20 MB |
| **Formats**       | WAV, MP3, FLAC, OGG, M4A | Same | Same |

- **Qwen3TTS:** aim for 5–10 seconds. Longer isn't better — auto-trims to 10s.
- **Minimax:** longer is better. 1–2 minutes of varied speech gives more vocal range.
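
A small pre-upload check can catch most of these limits locally. The numbers below are copied from the table above. Reading "MB" as 1024 × 1024 bytes, treating "Up to 10s" as having no minimum, and judging the format by file extension are assumptions of this sketch. `check_upload` and the dictionary keys are illustrative names.

```python
import os

MB = 1024 * 1024  # assumption: the documented limits use binary megabytes

# Limits copied from the Audio Requirements table
LIMITS = {
    "Qwen3TTS": {"min_s": 3, "max_s": 15, "max_bytes": 5 * MB},
    "Ultra": {"min_s": 0, "max_s": 10, "max_bytes": 5 * MB},
    "Minimax": {"min_s": 10, "max_s": 300, "max_bytes": 20 * MB},
}
FORMATS = {".wav", ".mp3", ".flac", ".ogg", ".m4a"}


def check_upload(model: str, filename: str, size_bytes: int, duration_s: float) -> list:
    """Return a list of problems; an empty list means the file fits the documented limits."""
    limits = LIMITS[model]
    problems = []
    if os.path.splitext(filename)[1].lower() not in FORMATS:
        problems.append("unsupported format")
    if size_bytes > limits["max_bytes"]:
        problems.append("file too large")
    if not limits["min_s"] <= duration_s <= limits["max_s"]:
        problems.append("duration out of range")
    return problems


# An 8-second clip suits Qwen3TTS but is too short for Minimax
print(check_upload("Qwen3TTS", "recording.wav", 1_000_000, 8))
print(check_upload("Minimax", "recording.wav", 1_000_000, 8))
```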

## The `ref_text` Parameter

Body parameter on `POST /v2/voice_clones/from_upload`. Optional.

A transcript of what's being said in the audio. Improves clone quality by giving the model a text reference to align against.
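
As an illustration, the text fields of the multipart body can be assembled so that optional values such as `ref_text` are sent only when set. The field names are the ones used on this page; `upload_fields` is a hypothetical helper, and the `audio_file` part is attached separately by whatever HTTP client or SDK you use.

```python
from typing import Optional


def upload_fields(
    name: str,
    language: str,
    gender: str,
    provider: Optional[str] = None,
    model_id: Optional[str] = None,
    ref_text: Optional[str] = None,
) -> dict:
    """Text fields for POST /v2/voice_clones/from_upload, omitting unset optional values."""
    fields = {"name": name, "language": language, "gender": gender}
    optional = {"provider": provider, "model_id": model_id, "ref_text": ref_text}
    fields.update({key: value for key, value in optional.items() if value is not None})
    return fields


# The transcript should match what is actually spoken in the recording
fields = upload_fields(
    "My Custom Voice",
    "en",
    "female",
    ref_text="Hello, thank you for calling. How can I help you today?",
)
print(fields)
```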

## Ultra Async Flow

When `model_id` is `"Ultra"`, the API returns **202 Accepted** instead of 201:

```http
POST /v2/voice_clones/from_upload → 202 { "data": { "status": "pending" } }
```

Poll until ready:

```http
GET /v2/voice_clones/{id} → 200 { "data": { "status": "active" } }
```
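
The polling step can be wrapped in a helper that also stops on a terminal status or a timeout. This is a sketch: `wait_until_active` is a hypothetical function, the timeout and interval defaults are arbitrary, and `fetch_status` stands for whatever call you use to read the clone's current `status`.

```python
import time
from typing import Callable


def wait_until_active(
    fetch_status: Callable[[], str],
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll fetch_status() until it returns 'active'; raise on 'failed', 'expired', or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "active":
            return
        if status in ("failed", "expired"):
            raise RuntimeError(f"voice clone ended in status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("voice clone did not become active in time")
        sleep(interval_s)


# Simulated status sequence; in real use fetch_status would call the API
statuses = iter(["pending", "pending", "active"])
wait_until_active(lambda: next(statuses), sleep=lambda seconds: None)
print("clone is active")
```

Injecting `sleep` keeps the helper easy to test and lets callers swap in a different waiting strategy.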

See [Responses](/docs/voice/voice-design-lab/clone-voice/responses) for status values and voice ID format. See [Errors](/docs/voice/voice-design-lab/clone-voice/errors) for Minimax error codes.

---

### Responses

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/clone-voice/responses.md

## Voice ID Format

Every clone response includes fields to construct the voice ID: `{Provider}.{Model}.{provider_voice_id}`

| Provider | `provider_voice_id` value |
|---|---|
| **Telnyx Qwen3TTS** | Equals the clone's UUID (`id` field) |
| **Telnyx Ultra** | Cartesia-assigned voice ID |
| **Minimax** | Minimax-assigned ID (encoded format) |

See [Using Custom Voices](/docs/voice/voice-design-lab/using-custom-voices) for how to use these IDs across products.

## Clone Status

| Status | Meaning |
|---|---|
| `active` | Ready to use |
| `pending` | Being processed (Ultra only — poll until active) |
| `failed` | Processing failed |
| `expired` | Voice was not kept alive |

Qwen3TTS and Minimax clones are always `active` on creation.

---

### Errors

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/clone-voice/errors.md

## General Errors

These errors apply to all providers.

| Endpoint | Status | Code | Detail |
|----------|--------|------|--------|
| `show` | **404** | `10005` | Clone not found or invalid UUID |
| `create` | **404** | `10005` | Voice design not found / no version |
| `create` | **409** | `10012` | Duplicate `provider_voice_id` |
| `create` | **422** | `10027` | Changeset validation (see below) |
| `create_from_upload` | **422** | `10027` | `audio_file is required` |
| `create_from_upload` | **422** | `10027` | `File too large. Maximum allowed size is 5MB` (20MB for Minimax) |
| `create_from_upload` | **400** | `10015` | `Failed to process audio file: ` (FFmpeg failure) |
| `create_from_upload` | **422** | `10027` | Invalid `provider`+`model_id` combination |
| `update` | **404** | `10005` | Clone not found |
| `update` | **422** | `10027` | Changeset validation errors |
| `delete` | **404** | `10005` | Clone not found |
| `sample` | **404** | `10005` | Clone or sample not found |

## Telnyx / Cartesia Errors

| Status | Code | Detail | Provider |
|--------|------|--------|----------|
| **422** | `10027` | Pattern-matched messages (voice not found, audio too short/long, bad quality, unsupported format, unsupported language, invalid params, text length invalid) | Telnyx/Cartesia |
| **429** | `10011` | `Provider rate limit exceeded` | All providers |

## Minimax Errors

| Status | Code | Detail | Provider |
|--------|------|--------|----------|
| **422** | `10038` | `Audio is too short (min 10s)` | Minimax (code 2037) |
| **422** | `10038` | `Audio is too long` | Minimax (code 2038) |
| **422** | `10038` | `Audio quality too low` | Minimax (code 2039) |
| **422** | `10038` | `Audio contains too much noise` | Minimax (code 2048) |
| **422** | `10038` | `Voice cloning provider error: ` | Minimax (other codes) |
| **500** | `10037` | `Voice clone service configuration error` | Auth misconfiguration detected |
| **502** | `10037` | `Voice clone service unavailable` | Upstream error or connection failure |
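
A sketch of how a client might turn the status and code pairs from these tables into a coarse next step. The action labels are invented for this example, and the retry policy is a suggestion rather than documented Telnyx behavior.

```python
def classify_error(status: int, code: str) -> str:
    """Map an HTTP status and error code from the tables above to a coarse client-side action."""
    code = str(code)
    if status in (429, 502):
        return "retry_later"   # provider rate limit, or upstream error / connection failure
    if status == 404:
        return "check_id"      # clone, design, or sample not found
    if status == 409:
        return "duplicate"     # duplicate provider_voice_id
    if status == 422 and code == "10038":
        return "fix_audio"     # Minimax rejected the recording
    if status in (400, 422):
        return "fix_request"   # validation, file size, or audio processing failure
    return "report"            # e.g. 500 service configuration error


print(classify_error(422, "10038"))
print(classify_error(429, "10011"))
```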

---

## Using Custom Voices

### Using Custom Voices

> Source: https://developers.telnyx.com/docs/voice/voice-design-lab/using-custom-voices.md

Every voice clone gets a unique voice ID: `{Provider}.{Model}.{provider_voice_id}`

- **Telnyx:** `Telnyx.Qwen3TTS.33226e69-3abd-429b-b64a-86775c9b5850`
- **Minimax:** `Minimax.speech-2.8-turbo.TB4ZMVKanThGeldiw8rLBEg21v4ifjUTRgLpkodJxpMYV`

Find it in the Voice Design Lab by clicking on any saved voice, or build it from the clone response's `provider`, `provider_supported_models`, and `provider_voice_id` fields.
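
Building the ID is a plain join of the three parts. The sketch below assumes you already hold the provider and model strings exactly as they appear in the examples above; how those map from the response fields (casing, or which entry of `provider_supported_models` to pick) is not specified here. `build_voice_id` is a hypothetical helper.

```python
def build_voice_id(provider: str, model: str, provider_voice_id: str) -> str:
    """Join the three parts into the {Provider}.{Model}.{provider_voice_id} format."""
    parts = (provider, model, provider_voice_id)
    if not all(parts):
        raise ValueError("provider, model, and provider_voice_id are all required")
    return ".".join(parts)


telnyx_id = build_voice_id("Telnyx", "Qwen3TTS", "33226e69-3abd-429b-b64a-86775c9b5850")
minimax_id = build_voice_id(
    "Minimax", "speech-2.8-turbo", "TB4ZMVKanThGeldiw8rLBEg21v4ifjUTRgLpkodJxpMYV"
)
print(telnyx_id)
print(minimax_id)
```

Going the other way is less reliable: a model name such as `speech-2.8-turbo` contains a dot, so splitting a voice ID on `.` does not cleanly recover the three parts. Store the parts alongside the ID if you need them later.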

### AI Assistants

Select your custom voice in the assistant's voice settings. Telnyx clones appear under **Telnyx / Qwen3TTS**, Minimax clones under **Minimax**.

### Call Control

Pass the voice ID in the `voice` field of the `speak` command.
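
A minimal sketch of a `speak` request body that uses a custom voice. Only the `voice` field is described on this page; the `payload` field name for the text to speak, and the endpoint mentioned in the comment, are assumptions to confirm against the Call Control reference. `speak_body` is a hypothetical helper.

```python
import json


def speak_body(text: str, voice_id: str) -> str:
    """JSON body for a Call Control speak command that uses a custom voice."""
    # Assumption: the text to speak goes in a field named "payload"
    return json.dumps({"payload": text, "voice": voice_id})


body = speak_body(
    "Hello, thank you for calling. How can I help you today?",
    "Telnyx.Qwen3TTS.33226e69-3abd-429b-b64a-86775c9b5850",
)
# Assumed endpoint: POST /v2/calls/{call_control_id}/actions/speak
print(body)
```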

### TTS WebSocket

Pass the voice ID as the `voice` query parameter on the WebSocket URL.
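
As a sketch, the query parameter can be added with the standard library so that any existing parameters on the URL are preserved. The WebSocket URL below is a placeholder, not the real endpoint, and `with_voice` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_voice(ws_url: str, voice_id: str) -> str:
    """Return ws_url with its `voice` query parameter set to voice_id."""
    parts = urlsplit(ws_url)
    query = dict(parse_qsl(parts.query))
    query["voice"] = voice_id
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host and path; take the real URL from the TTS streaming guide
url = with_voice("wss://example.invalid/tts", "Telnyx.Qwen3TTS.33226e69-3abd-429b-b64a-86775c9b5850")
print(url)
```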

See the [TTS streaming guide](/docs/tts-stt/tts-websocket-streaming) for the full connection flow.

---

## API Reference (Voice Design Lab)

### Voice Designs

- [List voice designs](https://developers.telnyx.com/api-reference/voice-designs/list-voice-designs.md): Returns a paginated list of voice designs belonging to the authenticated account.
- [Create or add a version to a voice design](https://developers.telnyx.com/api-reference/voice-designs/create-or-add-a-version-to-a-voice-design.md): Creates a new voice design (version 1) when `voice_design_id` is omitted. When `voice_design_id` is provided, adds a new version to the existing design instead…
- [Get a voice design](https://developers.telnyx.com/api-reference/voice-designs/get-a-voice-design.md): Returns the latest version of a voice design, or a specific version when `?version=N` is provided. The `id` parameter accepts either a UUID or the design name.
- [Rename a voice design](https://developers.telnyx.com/api-reference/voice-designs/rename-a-voice-design.md): Updates the name of a voice design. All versions retain their other properties.
- [Delete a voice design](https://developers.telnyx.com/api-reference/voice-designs/delete-a-voice-design.md): Permanently deletes a voice design and all of its versions. This action cannot be undone.
- [Download voice design audio sample](https://developers.telnyx.com/api-reference/voice-designs/download-voice-design-audio-sample.md): Downloads the WAV audio sample for the voice design. Returns the latest version's sample by default, or a specific version when `?version=N` is provided. The `…
- [Delete a specific version of a voice design](https://developers.telnyx.com/api-reference/voice-designs/delete-a-specific-version-of-a-voice-design.md): Permanently deletes a specific version of a voice design. The version number must be a positive integer.

### Voice Clones

- [List voice clones](https://developers.telnyx.com/api-reference/voice-clones/list-voice-clones.md): Returns a paginated list of voice clones belonging to the authenticated account.
- [Create a voice clone from a voice design](https://developers.telnyx.com/api-reference/voice-clones/create-a-voice-clone-from-a-voice-design.md): Creates a new voice clone by capturing the voice identity of an existing voice design. The clone can then be used for text-to-speech synthesis.
- [Create a voice clone from an audio file upload](https://developers.telnyx.com/api-reference/voice-clones/create-a-voice-clone-from-an-audio-file-upload.md): Creates a new voice clone by uploading an audio file directly. Supported formats: WAV, MP3, FLAC, OGG, M4A. For best results, provide 5–10 seconds of clear spe…
- [Update a voice clone](https://developers.telnyx.com/api-reference/voice-clones/update-a-voice-clone.md): Updates the name, language, or gender of a voice clone.
- [Delete a voice clone](https://developers.telnyx.com/api-reference/voice-clones/delete-a-voice-clone.md): Permanently deletes a voice clone. This action cannot be undone.
- [Download voice clone audio sample](https://developers.telnyx.com/api-reference/voice-clones/download-voice-clone-audio-sample.md): Downloads the WAV audio sample that was used to create the voice clone.