Back to guides
Pricing·June 8, 2026·9 min read

Claude API pricing 2026 — Anthropic costs, examples, and cheaper OpenAI-compatible access

Claude is the default for reasoning, coding and agents. Here are the current per-model rates — about 60% below Anthropic's list — with worked examples and the levers that cut a Claude bill the most.

Claude is the default choice for reasoning, coding and agentic work, and Kunavo prices the whole family about 60% below Anthropic's list rate behind one OpenAI-compatible API. Here are the current per-model rates, worked cost examples, and the levers that cut a Claude bill the most.

Claude pricing at a glance

Rates are per 1M tokens, in USD, as billed on Kunavo. The “Anthropic list” column is Anthropic's published rate for the same model.

ModelInput / 1MOutput / 1MAnthropic list (in / out)You save
claude-haiku-4-5$0.40$2.00$1.00 / $5.00~60%
claude-sonnet-4-6$1.20$6.00$3.00 / $15.00~60%
claude-opus-4-7$2.00$10.00$5.00 / $25.00~60%

Live rates always show on the pricing page and each model page. Haiku is the cheap workhorse, Sonnet the balanced default, Opus the heavy reasoner.

How Claude pricing works

You pay for input tokens (system prompt, context, messages) and output tokens (the completion). Output is billed at roughly 5× the input rate across the family, so the largest lever on cost is how much the model writes. The second lever is prompt caching: stable, repeated context can be cached and billed at a fraction of the normal input rate.

Worked cost examples

Real numbers at Kunavo's rates:

WorkloadTokens (in / out)ModelCost
Support ticket800 / 200Haiku 4.5$0.00072
RAG answer6,000 / 500Sonnet 4.6$0.0102
Coding assistant call10,000 / 1,500Sonnet 4.6$0.021
Long-context analysis50,000 / 2,000Opus 4.7$0.12

So 10,000 support tickets a day on Haiku is about $7.20/day. The math, runnable:

claude_cost.py
# Kunavo Claude rates (USD per 1M tokens): (input, output)
RATES = {
    "claude-haiku-4-5":  (0.40, 2.00),
    "claude-sonnet-4-6": (1.20, 6.00),
    "claude-opus-4-7":   (2.00, 10.00),
}

def cost(model: str, in_tokens: int, out_tokens: int) -> float:
    i, o = RATES[model]
    return in_tokens / 1_000_000 * i + out_tokens / 1_000_000 * o

print(cost("claude-haiku-4-5", 800, 200))      # support ticket -> $0.00072
print(cost("claude-sonnet-4-6", 6_000, 500))   # RAG answer     -> $0.0102
print(cost("claude-opus-4-7", 50_000, 2_000))  # long analysis  -> $0.12

Prompt caching — the biggest single saving

If your system prompt or retrieved context is large and stable, prompt caching bills cached input at 10% of the input rate. For a long, reused system prompt that is a 60–90% cut on the input line. It is one extra field on the request — see the Anthropic prompt caching deep dive for the exact pattern and the Messages API reference.

Kunavo pricing and Stripe billing

No subscription, no Anthropic Console billing setup. Top up a balance (Stripe or local payment methods) and calls draw down at the rates above. New accounts get $2 of free credit; larger top-ups carry bonus credit. The same balance covers Claude, Gemini, GPT, image, video and audio models — one wallet, one invoice.

Claude vs Gemini: the cost decision

Match the model to the job rather than defaulting to the most capable one:

NeedPickKunavo in / out per 1M
Cheapest viablegemini-2-5-flash$0.09 / $0.75
Cheap + Anthropic qualityclaude-haiku-4-5$0.40 / $2.00
Balanced reasoningclaude-sonnet-4-6$1.20 / $6.00
Hardest reasoningclaude-opus-4-7$2.00 / $10.00

See the Gemini pricing guide for the other side of this table, and the cost optimization guide for the difficulty-routing pattern in code.

FAQ

Is the Claude API free?

Anthropic has no free Claude API tier — you pay per token from the first call. Kunavo gives you $2 of free credit at sign-up, then per-token rates about 60% under Anthropic's list.

How much does Claude Sonnet cost?

Claude Sonnet 4.6 is $1.20 per 1M input and $6.00 per 1M output on Kunavo — about 60% under Anthropic's $3 / $15. A 6K-in / 500-out RAG answer costs about $0.01.

How do I reduce Claude API cost?

Prompt caching first (cached input at 10% of rate), then tier down to Haiku for simple tasks and cap output. Details in the cost optimization guide.

Does Kunavo support the Anthropic Messages API?

Yes — Claude is available through both the native Messages API (/v1/messages) and the OpenAI-compatible /v1/chat/completions endpoint. To start, see how to get a Claude API key.