Gemini-2.5-Flash-Lite

The fastest and lowest-cost model in Google's Gemini 2.5 family, optimized for latency-sensitive tasks like classification, translation, and intelligent routing.

about

Ranking first in output speed at 324.2 tokens per second with a 0.48-second time-to-first-token, Flash Lite ships with multi-pass reasoning disabled by default but available on demand via the API. At $0.10/$0.40 per million tokens it is Google's cheapest model with 1M-token context, explicitly designed as a latency and cost play rather than an intelligence play.
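The on-demand reasoning toggle mentioned above is exposed as a thinking budget in the Gemini API. As a hedged sketch, here is how a request body opting in or out of thinking might be built; the `thinkingConfig`/`thinkingBudget` field names follow the public REST schema and may differ in your SDK version:

```python
def build_request(prompt: str, enable_thinking: bool) -> dict:
    """Build a generateContent request body for Gemini 2.5 Flash-Lite.

    Thinking is off by default on Flash-Lite; a nonzero thinkingBudget
    (or -1 for dynamic) opts in for harder tasks. Field names follow
    the public REST schema and are assumptions if your client differs.
    """
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {
                # 0 disables thinking (the Flash-Lite default);
                # -1 lets the model decide how much to think.
                "thinkingBudget": -1 if enable_thinking else 0
            }
        },
    }

# Routine classification: leave thinking off for lowest latency.
fast = build_request("Classify this ticket: 'refund not received'", False)
# Harder reasoning task: opt in to multi-pass reasoning.
hard = build_request("Plan a three-step data migration", True)
```

Keeping the budget at 0 preserves the latency numbers quoted above; enabling thinking trades speed for quality on a per-request basis.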

License: google
Context window: 1,048,576 tokens

Use cases for Gemini-2.5-Flash-Lite

  1. High-speed classification pipelines: At 324 tokens per second and 0.48s time-to-first-token, it processes real-time content classification, intent detection, and routing decisions faster than any comparable model.
  2. Cost-optimized batch translation: At $0.10 per million input tokens, it handles high-volume translation workloads across text, image, and speech inputs at minimal cost.
  3. Intelligent request routing: Its speed makes it practical as a front-end classifier that triages incoming requests to more capable models based on complexity, reducing overall system cost.
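The routing pattern in item 3 can be sketched as a cheap front-end classifier whose label selects the downstream model. In production the label would come from a Flash-Lite call; the keyword heuristic below is a local stand-in, and the model names in `ROUTES` are illustrative:

```python
# Sketch of intelligent request routing: a fast, cheap classifier
# (Flash-Lite in production; a heuristic stand-in here) labels each
# request, and the label picks the model that handles it.

ROUTES = {
    "simple": "gemini-2.5-flash-lite",  # answer directly, lowest cost
    "complex": "gemini-2.5-pro",        # escalate hard requests
}

HARD_SIGNALS = ("prove", "derive", "multi-step", "architecture", "debug")

def classify(request: str) -> str:
    """Stand-in for the Flash-Lite triage call: label a request
    'complex' if it shows signals of needing deeper reasoning."""
    text = request.lower()
    if len(text) > 500 or any(word in text for word in HARD_SIGNALS):
        return "complex"
    return "simple"

def route(request: str) -> str:
    """Return the model name that should handle this request."""
    return ROUTES[classify(request)]
```

Because the triage step runs on the cheapest model in the family, the added cost per request is small relative to the savings from not sending every request to a frontier model.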

Quality

Arena Elo: 1374
MMLU: N/A
MT Bench: N/A

Gemini 2.5 Flash Lite scores 81.1% on Global-MMLU-Lite (a standard MMLU score is not published separately), close behind GPT-4o mini's 82.0% MMLU while running at 324 tokens per second. Its Arena Elo of 1,374 is comparable to GPT-4o mini's 1,382 on the same sheet, reflecting similar quality at a lower price per token.

Arena Elo comparison:

  • gpt-4.1-mini: 1382
  • gpt-4o-mini: 1382
  • Gemini-2.5-Flash-Lite: 1374
  • Gemini-2.0-Flash: 1360
  • gpt-oss-120b: 1354

pricing

Running Gemini 2.5 Flash Lite through Telnyx Inference costs $0.10 per million input tokens and $0.40 per million output tokens. Processing 10,000,000 classification or routing tasks at 200 tokens each (2 billion tokens in total) would cost between $200 if every token were input and $800 if every token were output; classification workloads are input-heavy, so the bill lands near the low end. Either way, it is the lowest cost per query of any model on the sheet at this quality tier.
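That arithmetic can be sketched in a few lines, using the $0.10/$0.40 per-million rates above; the 180/20 input/output split per task is an illustrative assumption for an input-heavy classification workload:

```python
# Estimate a batch bill for Gemini 2.5 Flash Lite on Telnyx Inference,
# using the published $0.10 / $0.40 per-million-token rates.

INPUT_RATE = 0.10 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.40 / 1_000_000  # dollars per output token

def batch_cost(tasks: int, input_tokens: int, output_tokens: int) -> float:
    """Total cost in dollars for a batch of identical tasks."""
    per_task = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return tasks * per_task

# 10M classification calls, ~180 input and ~20 output tokens each
# (the split is an assumption; classification is input-heavy).
cost = batch_cost(10_000_000, input_tokens=180, output_tokens=20)
print(f"${cost:,.2f}")  # $260.00
```

Shifting more of the 200 tokens to output moves the total toward the $800 ceiling, which is why short, structured outputs (a label, a route name) keep this tier so cheap.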

What's Twitter saying?

  • Developers praise its super-fast coding for simple apps and games, often matching Pro quality with visual effects and animations, though it struggles with complex interactions.
  • Community notes high benchmark gains over prior Flash models in math (96.9%), coding (59.3%), and reasoning, with low latency (529ms) ideal for real-time tasks.
  • Users highlight speed and cost efficiency for bulk jobs like content creation, but report occasional bugs like incomplete responses and recommend "thinking mode" for better outputs.

Explore Our LLM Library

Discover the power and diversity of large language models available with Telnyx. Explore the options below to find the perfect model for your project.

Organization: deepseek-ai
Model Name: DeepSeek-R1-Distill-Qwen-14B
Tasks: text generation
Languages Supported: English
Context Length: 43,000
Parameters: 14.8B
Model Tier: medium
License: deepseek

TRY IT OUT

Chat with an LLM

Powered by our own GPU infrastructure, select a large language model, add a prompt, and chat away. For unlimited chats, sign up for a free account on our Mission Control Portal here.

HOW IT WORKS

Selecting LLMs for Voice AI

RESOURCES

Get started

Check out our helpful tools to help get you started.

  • Test in the portal

    Easily browse and select your preferred model in the AI Playground.

  • Explore the docs

    Don’t wait to scale, start today with our public API endpoints.

  • Stay up to date

    Keep an eye on our AI changelog so you don't miss a beat.

Sign up and start building

faqs

What is Gemini 2.5 Flash-Lite good for?

Gemini 2.5 Flash-Lite is optimized for latency-sensitive, high-volume tasks like classification, translation, and intelligent routing. It is 1.5x faster than Gemini 2.0 Flash at lower cost, with optional reasoning capabilities that can be toggled on for harder tasks.

Is Flash-Lite faster than Flash?

Yes, Gemini 2.5 Flash-Lite is faster and cheaper than both Gemini 2.0 Flash and 2.5 Flash. It is specifically designed to push the frontier of intelligence per dollar for cost-sensitive, high-scale operations.

How much is Gemini 2.5 Flash-Lite?

Gemini 2.5 Flash-Lite offers the lowest pricing in the Gemini 2.5 family. Current rates are available through Google AI Studio and Vertex AI documentation, with free tier access for testing.

Can you use Gemini 2.5 Flash for free?

Yes, both Gemini 2.5 Flash and Flash-Lite are available for free through Google AI Studio with usage limits. The free tier provides enough capacity for testing and development before committing to paid API access.