Veo 3 and Sora put text-to-video at the same quality bar where text-to-image was eighteen months ago. The catch: both providers gate access behind waitlists, regional restrictions, and bespoke billing flows that don't play well with the rest of your AI stack.
This guide walks through making your first Veo 3 and Sora API calls through Kunavo: no waitlist, OpenAI-compatible auth, billing per second of generated video, and results served from a permanent URL. Total time: about five minutes.
Setup
- Sign up at kunavo.com/app/signup. You get $2 in credit on sign-up — enough for a couple of 5-second test clips.
- Create a key in /app/keys. It starts with sk-kunavo-.
- Export it:
export KUNAVO_API_KEY=sk-kunavo-...
Text-to-video with Veo 3
Veo 3 is currently the best text-to-video model on the market for cinematic shots — it understands camera language (dolly, push-in, rack focus), produces stable lighting across cuts, and handles 24fps motion correctly. Generations take 30 seconds to a few minutes; the HTTP response is synchronous — set a long client timeout.
import os

import requests

KEY = os.environ["KUNAVO_API_KEY"]

resp = requests.post(
    "https://api.kunavo.com/v1/video/generations",
    headers={"Authorization": f"Bearer {KEY}"},
    json={
        "model": "veo-3",
        "prompt": "A drone shot pulling back from a quiet mountain lake at dawn, mist rising off the water. Cinematic, 24fps, soft golden light.",
        "duration": 8,
        "aspect_ratio": "16:9",
        "resolution": "1080p",
    },
    timeout=600,  # generations take 30s to several minutes
)
resp.raise_for_status()
data = resp.json()
print(data["data"][0]["url"])

The response is OpenAI-style: { data: [{ url: '...' }] }. The URL is permanent and served from files.kunavo.com; download it once into your own storage if you need long-term hosting.
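The "download it once" step can be sketched as follows. This is a minimal example using only the standard library, and it assumes the result URL can be fetched with a plain GET and no auth header, which the guide does not state. The helper names `local_name` and `save_video` and the fallback file name `clip.mp4` are illustrative, not part of the Kunavo API.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def local_name(url: str, default: str = "clip.mp4") -> str:
    """Derive a local file name from the last path segment of the URL."""
    name = os.path.basename(urlparse(url).path)
    return name or default  # fall back when the path ends in "/"


def save_video(url: str, dest_dir: str = ".") -> str:
    """Stream the video to disk and return the local path.

    Assumption: the URL is publicly readable without an Authorization header.
    """
    path = os.path.join(dest_dir, local_name(url))
    with urllib.request.urlopen(url, timeout=120) as resp, open(path, "wb") as f:
        shutil.copyfileobj(resp, f)  # copies in chunks, never loads the whole file
    return path
```

Calling `save_video(data["data"][0]["url"], "/path/to/videos")` after a successful generation would then leave a copy on local disk; swapping the `open(...)` target for an object-storage upload is the usual next step.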
Image-to-video
Anchoring with an image is usually where you get production-quality results. Veo 3 supports two image modes:
- image_mode: "frame" (the default for image_url): a single image is the first frame; two images are the first and last frames.
- image_mode: "reference": up to 3 style references for character / wardrobe consistency without forcing frames.
# image-to-video: pass an image_url to anchor the first frame.
resp = requests.post(
    "https://api.kunavo.com/v1/video/generations",
    headers={"Authorization": f"Bearer {KEY}"},
    json={
        "model": "veo-3",
        "prompt": "She smiles, then walks out of frame to the left",
        "image_url": "https://files.kunavo.com/<your-upload>.jpg",
        "image_mode": "frame",  # one image => first frame
        "duration": 6,
        "aspect_ratio": "9:16",  # vertical, mobile-native
    },
    timeout=600,
)
print(resp.json()["data"][0]["url"])

If you don't already have a public URL for your anchor image, post the bytes to /v1/files and Kunavo hosts the file for you under files.kunavo.com:
# If you don't have a public URL, upload bytes; Kunavo hosts the file.
with open("anchor.jpg", "rb") as f:
    up = requests.post(
        "https://api.kunavo.com/v1/files",
        headers={"Authorization": f"Bearer {KEY}"},
        files={"file": f},
    )
up.raise_for_status()
image_url = up.json()["url"]  # permanent files.kunavo.com URL

Sora and other models
The same endpoint shape works for every video model in the catalog — pass the relevant model slug:
- veo-3: cinematic, 1080p, supports image-to-video.
- sora-2: OpenAI Sora.
- seedance-1-5-pro, seedance-2-0-pro: ByteDance, very strong on character motion.
See /models for the live list and the per-second price on each.
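As a rough illustration of "same endpoint shape, different slug", the sketch below builds the JSON body for several models from one helper. The helper name `build_request` is hypothetical, and whether every model accepts the same optional fields (aspect_ratio, resolution, and so on) is an assumption to check against /models.

```python
import json


def build_request(model: str, prompt: str, duration: int = 5, **options) -> dict:
    """Build the JSON body for POST /v1/video/generations."""
    body = {"model": model, "prompt": prompt, "duration": duration}
    # Include only the optional fields that were actually set.
    body.update({k: v for k, v in options.items() if v is not None})
    return body


PROMPT = "A paper boat drifting down a rain-soaked street at night"

# Same prompt, three slugs from the list above; only "model" changes.
for slug in ("veo-3", "sora-2", "seedance-1-5-pro"):
    print(json.dumps(build_request(slug, PROMPT, duration=5, aspect_ratio="16:9")))
```

Each printed body can be passed as the `json=` argument of the `requests.post` call shown earlier.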
From Node / TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.KUNAVO_API_KEY,
  baseURL: "https://api.kunavo.com/v1",
});

// /v1/video/generations isn't in OpenAI's SDK shape, but the same auth
// header works, so call it with fetch:
const resp = await fetch(
  "https://api.kunavo.com/v1/video/generations",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.KUNAVO_API_KEY}`,
    },
    body: JSON.stringify({
      model: "sora-2",
      prompt: "A red origami crane unfolding into a paper plane and flying away through a window",
      duration: 5,
      resolution: "1080p",
    }),
  },
);
const { data } = await resp.json();
console.log(data[0].url);

Pricing model
Video models bill per second of output, not per token. Kunavo publishes the per-second rate on /pricing for every video model. A common 8-second Veo 3 1080p clip costs a few cents. Failed generations (4xx / 5xx) are never billed.
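Per-second billing makes cost estimation a single multiplication. The sketch below shows the arithmetic; the rate used is a made-up placeholder, not a Kunavo price, so substitute the real per-second figure from /pricing. The function names are illustrative.

```python
from decimal import Decimal


def clip_cost(duration_s: int, rate_per_second: Decimal) -> Decimal:
    """Cost of one clip: seconds of output times the per-second rate."""
    return Decimal(duration_s) * rate_per_second


def clips_within_budget(budget: Decimal, duration_s: int, rate_per_second: Decimal) -> int:
    """How many clips of a given length fit in a budget (whole clips only)."""
    return int(budget // clip_cost(duration_s, rate_per_second))


# PLACEHOLDER rate in USD per second, invented for the example.
PLACEHOLDER_RATE = Decimal("0.01")

print(clip_cost(8, PLACEHOLDER_RATE))                           # 0.08
print(clips_within_budget(Decimal("1"), 8, PLACEHOLDER_RATE))   # 12
```

Decimal is used instead of float so that small per-second rates add up without rounding drift.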
Production checklist
- Set a 10-minute HTTP timeout. The gateway polls upstream for up to 540 seconds and returns a 504 if the model is still working past that. For very long jobs, retry: generations are idempotent per prompt.
- Persist the result URL. Even though files.kunavo.com URLs are permanent, your product should keep its own copy in storage you control.
- Handle 429s with backoff. Video models are GPU-bound; brief contention is normal. The retry-after header is honored when present.
- Cache by prompt hash if reasonable. The same prompt + seed + model returns near-identical video — paying twice for it is wasteful.
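Two of the checklist items, backoff on 429/504 and caching by prompt hash, can be sketched together. This is one possible shape, not Kunavo's recommended client: the function names, the backoff constants, and the choice to treat 504 as retryable alongside 429 are assumptions, and `send` is a caller-supplied function so the sketch stays independent of any HTTP library.

```python
import hashlib
import json
import random
import time

RETRYABLE = {429, 504}  # assumption: rate limit and gateway timeout are worth retrying


def cache_key(model: str, prompt: str, seed=None, **options) -> str:
    """Stable SHA-256 key over the fields that determine the output."""
    payload = {"model": model, "prompt": prompt, "seed": seed, **options}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def retry_delay(attempt: int, retry_after=None, base: float = 2.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # Retry-After given in seconds
        except ValueError:
            pass  # Retry-After was an HTTP date; fall back to backoff
    # Exponential backoff, capped, plus up to 1s of jitter.
    return min(cap, base * (2 ** attempt)) + random.uniform(0, 1)


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status.

    `send` must return (status_code, headers_dict, body). With a plain dict
    the header lookup below is case-sensitive.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE:
            break
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    return status, body
```

In use, a lookup in your own store under `cache_key(...)` comes first, and `call_with_retry` runs only on a miss, with `send` wrapping the `requests.post` call from earlier and returning `(resp.status_code, resp.headers, resp.json() if resp.ok else None)`.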
Questions: contact@kunavo.com. The team behind the gateway reads every email and replies within 24 hours.