Claude-Haiku-4-5

Anthropic's fastest model with near-frontier performance, matching Sonnet 4 in coding at one-third the cost and more than double the speed.

about

Running at 98.9 tokens per second with a 0.68-second time-to-first-token, Haiku 4.5 scores 73.3% on SWE-bench Verified, within 5 points of the mid-tier Sonnet despite costing $1/$5 per million tokens. It was the first Haiku model to ship with extended thinking, computer use, and context awareness, closing the gap between Anthropic's speed tier and its reasoning tier.

Licenseanthropic
Context window(in thousands)200000

Use cases for Claude-Haiku-4-5

  1. Real-time customer interaction triage: At 98.9 tokens per second and 0.68s time-to-first-token, Haiku 4.5 classifies and routes incoming queries faster than users can notice latency.
  2. Automated code review at scale: Scoring 73.3% on SWE-bench Verified, it catches bugs and suggests fixes in pull request pipelines where speed matters more than maximum depth.
  3. Edge-deployed content moderation: Its small footprint and extended thinking capability make it suited for on-device safety filtering where round-trip API calls are too slow.

Quality

Arena EloN/A
MMLUN/A
MT BenchN/A

Claude Haiku 4.5 scores 73.3% on SWE-bench Verified, within 5 points of Claude Sonnet 4 (72.7%) on the same benchmark despite costing one-third as much. On MMLU, the Claude 3 Haiku baseline scored 76.7% (0-shot CoT), and the 4.5 update maintains that range while adding extended thinking and tool use. At 98.9 tokens per second, it delivers near-Sonnet quality at Haiku speed.

Claude-Opus-4-6

1501

GLM-5

1456

gpt-5.1

1455

Kimi-K2.5

1454

gpt-5.2

1440

pricing

Running Claude Haiku 4.5 through Telnyx Inference costs $1.00 per million input tokens and $5.00 per million output tokens. Processing 1,000,000 customer support conversations at 1,000 tokens each would cost approximately $3,000, roughly one-third the cost of the same workload on Claude Sonnet 4 ($9,000).

What's Twitter saying?

  • Developers praise Claude Haiku 4.5 for its near-Sonnet 4.5 performance on coding benchmarks like 73% on SWE-Bench Verified, at a fraction of the cost and twice the speed, ideal for agentic apps and tool calling.
  • Users on Hacker News call it brilliant for nuanced coding tasks but note it's slow in some cases and requires strict rules to avoid deviation.
  • Early testers report shocking speed enabling full-stack apps in under a minute, though one YouTube review finds it disappointing compared to GPT-5 in non-coding areas.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.

Organizationdeepseek-ai
Model NameDeepSeek-R1-Distill-Qwen-14B
Taskstext generation
Languages SupportedEnglish
Context Length43,000
Parameters14.8B
Model Tiermedium
Licensedeepseek

TRY IT OUT

Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.

HOW IT WORKS

Selecting LLMs for Voice AI

RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Icon Resources ebook

    Test in the portal

    Easily browse and select your preferred model in the AI Playground.

  • Icon Resources Docs

    Explore the docs

    Don’t wait to scale, start today with our public API endpoints.

  • Icon Resources Article

    Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

Sign up and start building

faqs

What is Claude Haiku 4.5 best used for?

Claude Haiku 4.5 is optimized for high-speed, cost-efficient tasks where quick responses matter. It excels at classification, summarization, and conversational AI, delivering coding performance similar to Claude Sonnet 4 at one-third the cost and more than double the speed.

Is Claude Haiku 4.5 free?

Claude Haiku 4.5 is available for free with usage limits on claude.ai. Through the API, it is priced at $1 per million input tokens and $5 per million output tokens, making it Anthropic's most affordable model.

Is Claude Haiku 4.5 better than ChatGPT 4?

Claude Haiku 4.5 offers near-frontier performance at a fraction of GPT-4's cost. On coding benchmarks, Haiku 4.5 performs comparably to larger models while running significantly faster, making it competitive for speed-sensitive applications.

Is Claude Haiku 4.5 better than Sonnet for coding?

Haiku 4.5 approaches Sonnet 4's coding performance while being faster and cheaper. For straightforward coding tasks, Haiku 4.5 is often the better choice due to its lower latency and cost. For complex multi-file refactoring, Sonnet or Opus may still be preferred.

Can I use Haiku 4.5 in Claude Code?

Yes, Claude Haiku 4.5 is available as a model option in Claude Code. It provides a fast, cost-effective option for coding assistance where speed is prioritized over maximum capability.

Claude Haiku 4: Fast Reasoning Model for Real-Time Applications