Back to guides
Pricing·June 8, 2026·8 min read

Gemini API pricing 2026 — model costs, examples, and cheaper OpenAI-compatible access

Gemini is among the best value in frontier AI. Here are the current per-model rates — about 70% below Google's list — with worked cost examples and the cheapest way to call Gemini in production.

Gemini is among the best value in frontier AI, and Kunavo prices it about 70% below Google's list rate behind one OpenAI-compatible API. This guide gives the current per-model rates, worked cost examples you can sanity-check, and the cheapest way to call Gemini in production.

Gemini pricing at a glance

Rates are per 1M tokens, in USD, as billed on Kunavo. The “Google list” column is Google's published rate for the same model, shown so you can see the delta.

ModelInput / 1MOutput / 1MGoogle list (in / out)You save
gemini-2-5-flash$0.09$0.75$0.30 / $2.50~70%
gemini-2-5-pro$0.375$3.00$1.25 / $10.00~70%

Flash is the high-volume workhorse; Pro is for harder reasoning, vision and long-context jobs. Live rates always show on the pricing page and each model page (gemini-2-5-flash, gemini-2-5-pro).

How Gemini token pricing works

You pay for input tokens (everything you send — system prompt, retrieved context, the user message) and output tokens (what the model generates). Output is the more expensive side, so the single biggest lever on a Gemini bill is how much text you let the model write. Images and audio are converted to token-equivalents and billed on the same meter.

Worked cost examples

Real numbers at Kunavo's Gemini 2.5 Flash rate, except the last row which uses Gemini 2.5 Pro:

WorkloadTokens (in / out)ModelCost
Chatbot turn1,000 / 300Flash$0.0003
RAG answer8,000 / 500Flash$0.0011
Batch classify (per doc)500 / 20Flash$0.00006
Long-context analysis20,000 / 2,000Pro$0.0135

At those rates a 100,000-document classification batch on Flash runs about $6, and a million chatbot turns about $315. The math, runnable:

gemini_cost.py
# Kunavo Gemini 2.5 Flash rates (USD per 1M tokens)
IN_RATE, OUT_RATE = 0.09, 0.75

def cost(in_tokens: int, out_tokens: int) -> float:
    return in_tokens / 1_000_000 * IN_RATE + out_tokens / 1_000_000 * OUT_RATE

print(cost(1_000, 300))            # one chatbot turn   -> $0.000315
print(cost(8_000, 500))            # one RAG answer     -> $0.001095
print(cost(500, 20) * 100_000)     # 100k-doc batch     -> ~$6.00

Kunavo pricing and Stripe billing

There is no subscription and no Google Cloud project. You top up a balance (Stripe or local payment methods), and calls draw down from it at the per-token rates above. New accounts start with $2 of free credit, and larger top-ups carry bonus credit. One balance covers Gemini and every other model — Claude, GPT, image, video and audio — so you are not reconciling a separate invoice per provider.

Which Gemini model should I choose?

  • gemini-2-5-flash — default for chat, extraction, classification, summarization and most RAG. Fast and the cheapest capable option.
  • gemini-2-5-pro — reach for it when Flash is not accurate enough: multi-step reasoning, code, vision and very long context.

A good pattern is to route by difficulty: Flash for the common case, escalate to Pro only when a check fails. See the AI cost optimization guide for the routing pattern in code.

Cutting your Gemini bill

  1. Tier down. Send the easy 80% to Flash; reserve Pro for the hard 20%.
  2. Cap output. Set max_tokens and stop sequences — output is the pricey side of the meter.
  3. Trim input. Retrieve fewer, better RAG chunks instead of stuffing the whole knowledge base into context.
  4. Batch. Group independent calls to keep latency down and avoid retry storms.

FAQ

Is the Gemini API free?

Google AI Studio has a rate-limited free tier for prototyping; production is pay-per-token. On Kunavo you get $2 of free credit at sign-up, then pay the per-token rates above — no Google Cloud billing account required.

How much does Gemini 2.5 Flash cost?

$0.09 per 1M input tokens and $0.75 per 1M output tokens on Kunavo — about 70% under Google's $0.30 / $2.50 list price. A typical chatbot turn costs roughly $0.0003.

Is Gemini cheaper than Claude or GPT?

Gemini 2.5 Flash is one of the cheapest capable models anywhere — under Claude Haiku and most GPT tiers for high-volume work. Compare the full table on the pricing page.

How do I reduce Gemini API cost?

Tier to Flash, cap output, trim retrieved context, and batch. Details in the cost optimization guide. To start calling Gemini, see how to get a Gemini API key.