Three of the strongest image models you can call on Kunavo today are Google's Nano Banana Pro, OpenAI's GPT-Image-2, and Google's newer, cheaper Nano Banana 2. They share one OpenAI-compatible API surface, so switching between them is a one-word change to the model slug. Here's how they differ, when we'd reach for which, and where the cost actually lands.

The contenders

Nano Banana Pro — Google's premium image model. Top-tier fidelity and the best text-in-image rendering in the set: signs, UI labels, packaging, and multi-line copy come out legible on the first try. The priciest of the three, but still about 50% under Google's official rate.
GPT-Image-2 — OpenAI's native image model. Strong prompt understanding — it follows literal, detailed instructions closely — and a flat per-image price that does not step up with resolution, which makes it the cheapest option once you go to 4K.
Nano Banana 2 — the value pick. A newer Nano Banana with improved fidelity and prompt adherence over the original, at the lowest 1K price of the three. The default when you're generating at volume.

All three are available on Kunavo at 30–50% under each provider's official rate. The OpenAI-compatible API surface is identical, so only the model slug changes:

shootout.py

# Same client. Same code shape. Different model.
# All three return a temporary URL you can download.
from openai import OpenAI

client = OpenAI(
    api_key="sk-kn-...",
    base_url="https://api.kunavo.com/v1",
)

prompts = [
    "a glass-walled tea house at the edge of a misty forest, golden hour, "
    "cinematic, 50mm lens, shallow depth of field",
    "a vintage Italian espresso machine on a marble counter, soft window "
    "light, food magazine photography",
    "a stylized cityscape map of Tokyo, isometric, pastel colors, "
    "labeled districts in elegant serif",
]

for prompt in prompts:
    for model in ["nano-banana-pro", "gpt-image-2", "nano-banana-2"]:
        resp = client.images.generate(
            model=model,
            prompt=prompt,
            size="1024x1024",
        )
        print(model, resp.data[0].url)
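
Because the returned URLs are temporary, it can be worth saving each image as soon as the call returns. Here is a minimal sketch using only the standard library. The helper names, the shootout_out directory, and the .png extension are assumptions for illustration, not part of Kunavo's API.

```python
import re
import urllib.request
from pathlib import Path


def output_path(model: str, prompt: str, out_dir: str = "shootout_out") -> Path:
    """Build a readable file name from the model slug and the start of the prompt."""
    # Collapse anything that is not a lowercase letter or digit into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40].rstrip("-")
    # Assumption: the image is saved with a .png extension regardless of its real format.
    return Path(out_dir) / f"{model}__{slug}.png"


def save_image(url: str, dest: Path) -> Path:
    """Download the bytes behind a temporary URL and write them to dest."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=60) as resp:
        dest.write_bytes(resp.read())
    return dest
```

Inside the loop above, calling save_image(resp.data[0].url, output_path(model, prompt)) would leave one file per model and prompt, which makes side-by-side comparison easier.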

How to choose

Text inside the image — signs, UI mockups, posters

Start with Nano Banana Pro. Google's Nano Banana line is the strongest in the catalog at rendering legible text inside an image — multi-line copy, varied weights, and brand-style consistency. If your use case puts any rendered text in the frame — Twitter-card mockups, product packaging, infographic posters, app-store screenshots — this is the one to reach for first. GPT-Image-2 is a competent second; the budget Nano Banana 2 handles short labels but gets less reliable as the text grows.

Prompt fidelity — literal, complex prompts

GPT-Image-2 and Nano Banana Pro both shine here, with different personalities. GPT-Image-2 leans literal: it follows a detailed, spelled-out prompt closely, which is what you want when the brief is precise. Nano Banana Pro is the better instruction-follower for compositional asks ("put X on the left, Y in the back") and for anything with text. For four-or-more-element scenes with strict spatial relationships, all current models degrade — expect to generate a few candidates and pick.
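
The "generate a few candidates and pick" workflow can be sketched as a small loop. This is an illustration under assumptions: generate_candidates is a hypothetical helper, client is the OpenAI client configured as in shootout.py, and each candidate is requested as a separate call rather than relying on a batch parameter.

```python
def generate_candidates(client, model: str, prompt: str, count: int = 4) -> list[str]:
    """Request `count` independent generations and return their temporary URLs."""
    urls = []
    for _ in range(count):
        # One call per candidate; each call produces one image.
        resp = client.images.generate(
            model=model,
            prompt=prompt,
            size="1024x1024",
        )
        urls.append(resp.data[0].url)
    return urls
```

Since billing is per image, four candidates cost four images, so it may make sense to draft candidates on the cheaper model and rerun only the chosen prompt on the premium one.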

Budget and high volume

Nano Banana 2 is the value default. At roughly $0.047 per 1K image it's the cheapest of the three at standard resolution, with fidelity that's clearly a step up from the original Nano Banana. If you're generating thumbnails, variations, or anything at scale where per-image cost dominates, start here and only escalate to Nano Banana Pro for the frames that need text or top fidelity. (The original Nano Banana is cheaper still if you want the absolute floor.)
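
To see what "start here and only escalate" does to a bill, here is a rough sketch of the arithmetic at 1K. It uses the approximate per-image prices quoted in this post; the function name and the 10% escalation share are illustrative assumptions.

```python
# Approximate 1K prices per image from this post (USD); check /pricing for current values.
PRICE_1K = {"nano-banana-2": 0.047, "nano-banana-pro": 0.067}


def blended_cost(total_images: int, escalated_share: float) -> float:
    """Cost of a batch where a share of images is escalated to Nano Banana Pro."""
    escalated = round(total_images * escalated_share)
    bulk = total_images - escalated
    return bulk * PRICE_1K["nano-banana-2"] + escalated * PRICE_1K["nano-banana-pro"]


if __name__ == "__main__":
    for share in (0.0, 0.10, 1.0):
        print(f"{share:.0%} escalated: ${blended_cost(10_000, share):,.2f}")
```

For 10,000 images, escalating one frame in ten works out to about $490, against about $470 with no escalation and about $670 if everything runs on Nano Banana Pro.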

When you need 4K

GPT-Image-2 is the surprise winner on cost at 4K. Its per-image price is flat across 1K, 2K, and 4K, while Nano Banana 2 steps up at every resolution and Nano Banana Pro steps up at 4K. At the per-image prices in the cost section below, GPT-Image-2 is already the cheapest of the set at 2K and stays the cheapest at 4K, so it is both capable and economical for high-resolution output. That is worth keeping in mind for print assets and large hero images.

Cost (per image, on Kunavo)

All three are billed per image, not per token. The per-call sticker prices on /pricing:

Nano Banana Pro: ~$0.067 per 1K / 2K, ~$0.12 per 4K (the premium pick — the text quality is worth it for the right use case)
GPT-Image-2: ~$0.063 per image at every resolution (flat, so it's mid-priced at 1K and the cheapest at 2K and 4K)
Nano Banana 2: ~$0.047 per 1K, ~$0.071 per 2K, ~$0.106 per 4K (the cheapest at standard resolution)
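
The price list above can be turned into a small lookup that answers "which model is cheapest at this resolution?". This is a sketch using the approximate figures quoted here; the table layout and function names are assumptions, and the numbers should be checked against /pricing before relying on them.

```python
# Approximate per-image prices in USD, copied from the list above.
PRICES = {
    "nano-banana-pro": {"1K": 0.067, "2K": 0.067, "4K": 0.12},
    "gpt-image-2": {"1K": 0.063, "2K": 0.063, "4K": 0.063},
    "nano-banana-2": {"1K": 0.047, "2K": 0.071, "4K": 0.106},
}


def cheapest(resolution: str) -> str:
    """Return the model slug with the lowest listed price at the given resolution."""
    return min(PRICES, key=lambda model: PRICES[model][resolution])


def batch_cost(model: str, resolution: str, count: int) -> float:
    """Estimated cost in USD of `count` images, rounded to cents."""
    return round(PRICES[model][resolution] * count, 2)


if __name__ == "__main__":
    for res in ("1K", "2K", "4K"):
        print(res, cheapest(res))
```

With these figures the lookup picks Nano Banana 2 at 1K and GPT-Image-2 at both 2K and 4K.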

Failed calls aren't billed — if you misspell the model slug or the upstream rejects your prompt, your wallet is untouched. See /docs/billing for the full rules.

Which one should you start with?

Anything with text in the image (UI mockups, marketing assets, packaging)? Nano Banana Pro. Don't even start with the others.
Precise, literal prompts or 4K output? GPT-Image-2. The prompt adherence is excellent and the flat price makes high-resolution cheap.
High volume where cost dominates? Nano Banana 2. Cheapest at 1K, with fidelity that holds up for most production work.
Building a general-purpose tool that needs all of the above? Default to Nano Banana 2 for the bulk of requests, and let users (or your own routing) escalate to Nano Banana Pro for text-heavy frames and GPT-Image-2 for 4K. One default with two escape hatches beats forcing every request through the priciest model.
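
The "one default with two escape hatches" idea could look something like the routing function below. It is a minimal sketch: pick_model and its parameters are hypothetical, and letting text take priority over resolution when a request needs both is an assumption, not a rule from the guidance above.

```python
def pick_model(has_text: bool = False, resolution: str = "1K") -> str:
    """Choose a model slug for one request.

    Default to the value model, and escalate only for text-heavy frames or 4K.
    """
    if has_text:
        # Assumption: legible text matters more than the 4K price difference.
        return "nano-banana-pro"
    if resolution == "4K":
        return "gpt-image-2"
    return "nano-banana-2"
```

The returned slug drops straight into the model argument of client.images.generate, so the rest of the request code stays the same.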

Editing images — separate question

For image-to-image work, two edit models are live on the same /v1/images/edits endpoint: Nano Banana Edit (the same text-rendering strength, best for replacing copy on packaging and signs) and GPT-Image-2 Edit (close prompt adherence for targeted changes). Just pass the source image in the image parameter alongside your prompt.
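
An edit call might look like the sketch below, which uses the images.edit method of the OpenAI Python SDK. Several things here are assumptions: the edit_image helper is hypothetical, the exact edit model slugs are not given in this post (look them up at /models), and the edit response is assumed to carry a URL in the same way generation does.

```python
def edit_image(client, model: str, image_path: str, prompt: str) -> str:
    """Send one source image plus an instruction to the edits endpoint."""
    # The SDK expects the source image as a binary file object.
    with open(image_path, "rb") as image_file:
        resp = client.images.edit(
            model=model,
            image=image_file,
            prompt=prompt,
        )
    # Assumption: the edit response exposes a temporary URL like images.generate.
    return resp.data[0].url
```

With the client from shootout.py, a call would be edit_image(client, "<edit-model-slug>", "box.png", "replace the label text with 'Jasmine Green'"), where the slug placeholder stands for whichever edit model you pick from the catalog.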

Try them yourself

The whole catalog is at /models, filterable by category. The fastest way to compare is to sign up and top up from $5, run the snippet above with your own 5 prompts, and let your eyes pick. Image taste is personal — the right shootout is the one you run on your prompts.
