Google Gemini API pricing 2026 — Gemini 2.5 Pro & Flash costs, examples, cheaper access

Q: How much does Gemini 2.5 Flash cost?

On Kunavo, Gemini 2.5 Flash is $0.09 per 1M input tokens and $0.75 per 1M output tokens — roughly 70% below Google's list price of $0.30 / $2.50. A typical chatbot turn (1K in, 300 out) costs about $0.0003.

Q: How much does the Gemini 2.5 Pro API cost?

On Kunavo, Gemini 2.5 Pro API pricing is $0.375 per 1M input tokens and $3.00 per 1M output tokens — about 70% below Google's $1.25 / $10.00 list price. Use Pro over Flash for multi-step reasoning, vision and long context.

Q: Is Gemini cheaper than Claude or GPT?

Gemini 2.5 Flash is one of the cheapest capable models available — well under Claude Haiku and most GPT tiers for high-volume work. Gemini 2.5 Pro sits between Claude Haiku and Sonnet on cost while offering a very large context window.

Q: How do I reduce Gemini API cost?

Tier down to gemini-2-5-flash for simple tasks, cap output tokens, batch independent requests, and reuse stable context. Stacking these typically cuts a Gemini bill by more than half with no quality loss.

This is Google Gemini API pricing for 2026: the current per-model token rates, worked cost examples you can sanity-check, and the cheapest way to call Gemini in production. Gemini is among the best value in frontier AI, and Kunavo prices it about 70% below Google's list rate behind one OpenAI-compatible API.

Rates last verified July 2, 2026. Kunavo's per-token prices below are read live from the model catalog, and the “Google list” column tracks Google's published rate — the official source is ai.google.dev/gemini-api/docs/pricing.

Gemini API pricing at a glance

Rates are per 1M tokens, in USD, as billed on Kunavo. The “Google list” column is Google's published rate for the same model, shown so you can see the delta.

Model	Input / 1M	Output / 1M	Google list (in / out)	You save
`gemini-2-5-flash`	$0.09	$0.75	$0.30 / $2.50	~70%
`gemini-2-5-pro`	$0.375	$3.00	$1.25 / $10.00	~70%

Flash is the high-volume workhorse; Pro is for harder reasoning, vision and long-context jobs. Live rates always show on the pricing page and each model page (gemini-2-5-flash, gemini-2-5-pro).

How Gemini token pricing works

You pay for input tokens (everything you send — system prompt, retrieved context, the user message) and output tokens (what the model generates). Output is the more expensive side, so the single biggest lever on a Gemini bill is how much text you let the model write. Images and audio are converted to token-equivalents and billed on the same meter.

Worked cost examples

Real numbers at Kunavo's Gemini 2.5 Flash rate, except the last row which uses Gemini 2.5 Pro:

Workload	Tokens (in / out)	Model	Cost
Chatbot turn	1,000 / 300	Flash	$0.0003
RAG answer	8,000 / 500	Flash	$0.0011
Batch classify (per doc)	500 / 20	Flash	$0.00006
Long-context analysis	20,000 / 2,000	Pro	$0.0135

At those rates a 100,000-document classification batch on Flash runs about $6, and a million chatbot turns about $315. The math, runnable:

gemini_cost.py

# Kunavo Gemini 2.5 Flash rates (USD per 1M tokens)
IN_RATE, OUT_RATE = 0.09, 0.75

def cost(in_tokens: int, out_tokens: int) -> float:
    return in_tokens / 1_000_000 * IN_RATE + out_tokens / 1_000_000 * OUT_RATE

print(cost(1_000, 300))            # one chatbot turn   -> $0.000315
print(cost(8_000, 500))            # one RAG answer     -> $0.001095
print(cost(500, 20) * 100_000)     # 100k-doc batch     -> ~$6.00

Kunavo pricing and Stripe billing

There is no subscription and no Google Cloud project. You top up a balance (Stripe or local payment methods), and calls draw down from it at the per-token rates above. Pay-as-you-go from a $5 minimum top-up, the balance never expires, and larger top-ups carry bonus credit. One balance covers Gemini and every other model — Claude, GPT, image, video and audio — so you are not reconciling a separate invoice per provider.

Which Gemini model should I choose?

gemini-2-5-flash — default for chat, extraction, classification, summarization and most RAG. Fast and the cheapest capable option.
gemini-2-5-pro — reach for it when Flash is not accurate enough: multi-step reasoning, code, vision and very long context.

A good pattern is to route by difficulty: Flash for the common case, escalate to Pro only when a check fails. See the AI cost optimization guide for the routing pattern in code, and the Claude and GPT pricing guides to compare providers.

Cutting your Gemini bill

Tier down. Send the easy 80% to Flash; reserve Pro for the hard 20%.
Cap output. Set max_tokens and stop sequences — output is the pricey side of the meter.
Trim input. Retrieve fewer, better RAG chunks instead of stuffing the whole knowledge base into context.
Batch. Group independent calls to keep latency down and avoid retry storms.

FAQ

Is the Gemini API free?

Google AI Studio has a rate-limited free tier for prototyping; production is pay-per-token. Kunavo is pay-as-you-go from a $5 minimum top-up — you pay the per-token rates above, the balance never expires, and no Google Cloud billing account is required.

How much does Gemini 2.5 Flash cost?

$0.09 per 1M input tokens and $0.75 per 1M output tokens on Kunavo — about 70% under Google's $0.30 / $2.50 list price. A typical chatbot turn costs roughly $0.0003.

How much does the Gemini 2.5 Pro API cost?

Gemini 2.5 Pro API pricing on Kunavo is $0.375 per 1M input and $3.00 per 1M output — about 70% under Google's $1.25 / $10.00. Reach for Pro over Flash on multi-step reasoning, vision and long-context work.

Is Gemini cheaper than Claude or GPT?

Gemini 2.5 Flash is one of the cheapest capable models anywhere — under Claude Haiku and most GPT tiers for high-volume work. Compare the full table on the pricing page.

How do I reduce Gemini API cost?

Tier to Flash, cap output, trim retrieved context, and batch. Details in the cost optimization guide. To start calling Gemini, see how to get a Gemini API key.