Claude Haiku 4.5 generates 98.9 tokens per second with a 0.68-second time-to-first-token and scores 73.3% on SWE-bench Verified, within 5 points of the mid-tier Sonnet, while costing $1 per million input tokens and $5 per million output tokens. It was the first Haiku model to ship with extended thinking, computer use, and context awareness, closing the gap between Anthropic's speed tier and its reasoning tier.
Claude Haiku 4.5 scores 73.3% on SWE-bench Verified, 0.6 points above Claude Sonnet 4 (72.7%) on the same benchmark, while costing one-third as much. On MMLU, the Claude 3 Haiku baseline scored 76.7% (0-shot CoT), and the 4.5 update maintains that range while adding extended thinking and tool use. At 98.9 tokens per second, it delivers near-Sonnet quality at Haiku speed.
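To make the speed figures concrete, the sketch below estimates end-to-end response time as time-to-first-token plus generation time at a steady streaming rate. This is a simplified model, not a measurement: it assumes constant throughput and ignores network and queueing delays. The function name `estimated_response_seconds` is hypothetical and chosen only for illustration.

```python
# Figures quoted in the text above.
TTFT_SECONDS = 0.68        # time-to-first-token
TOKENS_PER_SECOND = 98.9   # output throughput


def estimated_response_seconds(
    output_tokens: int,
    ttft: float = TTFT_SECONDS,
    tokens_per_second: float = TOKENS_PER_SECOND,
) -> float:
    """Rough latency estimate: TTFT plus time to stream the output.

    Assumes a constant generation rate, which real deployments
    will only approximate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (25, 100, 200):
        print(f"{n:>4} output tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under this model the first token arrives in well under a second, while the total time grows linearly with reply length, so short replies benefit most in latency-sensitive applications.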
Running Claude Haiku 4.5 through Telnyx Inference costs $1.00 per million input tokens and $5.00 per million output tokens. Processing 1,000,000 customer support conversations at 1,000 tokens each, split evenly between input and output, would cost approximately $3,000: $500 for 500 million input tokens plus $2,500 for 500 million output tokens. That is roughly one-third the cost of the same workload on Claude Sonnet 4 ($9,000).
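The cost comparison can be reproduced with a few lines of arithmetic. The sketch below is illustrative only. The Haiku 4.5 rates come from the text. The Claude Sonnet 4 rates ($3 input, $15 output per million tokens) and the even 500/500 split between input and output tokens are assumptions, chosen because they reproduce the $3,000 and $9,000 totals quoted above. The names `Pricing` and `workload_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


HAIKU_4_5 = Pricing(input_per_million=1.00, output_per_million=5.00)  # from the text
SONNET_4 = Pricing(input_per_million=3.00, output_per_million=15.00)  # assumed rates


def workload_cost(
    pricing: Pricing,
    conversations: int,
    input_tokens_each: int,
    output_tokens_each: int,
) -> float:
    """Total USD cost for a batch of equally sized conversations."""
    total_input = conversations * input_tokens_each
    total_output = conversations * output_tokens_each
    return (
        total_input * pricing.input_per_million
        + total_output * pricing.output_per_million
    ) / 1_000_000


if __name__ == "__main__":
    # Assumed split: 500 input + 500 output tokens per conversation.
    haiku = workload_cost(HAIKU_4_5, 1_000_000, 500, 500)
    sonnet = workload_cost(SONNET_4, 1_000_000, 500, 500)
    print(f"Haiku 4.5: ${haiku:,.0f}")
    print(f"Sonnet 4:  ${sonnet:,.0f}")
    print(f"Ratio:     {haiku / sonnet:.2f}")
```

Because output tokens cost five times as much as input tokens, the total is sensitive to the input/output mix: a workload with longer replies costs more than the even split assumed here.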
Claude Haiku 4.5 is optimized for high-speed, cost-efficient tasks where quick responses matter. It excels at classification, summarization, and conversational AI, delivering coding performance similar to Claude Sonnet 4 at one-third the cost and more than double the speed.
Yes, Claude Haiku 4.5 is available as a model option in Claude Code, providing a fast and cost-effective choice for coding assistance and development workflows. Its low latency makes it particularly useful for rapid iteration during development.
Claude Haiku 4.5 is available for free with usage limits on claude.ai. Through the API, it is priced at $1 per million input tokens and $5 per million output tokens, making it the most affordable model in Anthropic's current generation.
Sonnet 4 is Anthropic's more capable model for complex reasoning and multi-step tasks, while Haiku 4.5 prioritizes speed and cost efficiency. Haiku 4.5 approaches Sonnet's coding performance while running significantly faster at a lower price, making it better suited for high-volume or latency-sensitive applications.
Haiku 4.5 is the lowest-cost model in Anthropic's current generation at $1 per million input tokens, roughly one-third the price of Claude Sonnet 4. For voice AI and real-time applications that require sub-second responses, its low latency and this cost structure make Haiku a practical choice for production workloads.
Haiku 4.5 is not a weak model. It performs competitively on coding benchmarks and handles most everyday tasks well, according to Anthropic's own benchmarks. Its limitations show on complex reasoning and multi-step analysis, where Sonnet or Opus models are better suited.
Haiku 4.5 is priced at $1 per million input tokens and $5 per million output tokens through the API. Telnyx offers access to Haiku 4.5 through its inference infrastructure, where co-located processing can reduce overall pipeline latency.