AI customer support — Claude + Kunavo for production-grade ticket automation

The fastest-ROI AI deployment in any B2C SaaS — automate ticket triage, draft 80% of responses, and escalate the rest cleanly. Production code, real cost numbers, and the compliance pitfalls that catch teams off-guard.

Why customer support is the highest-ROI AI deployment

Three numbers explain it. Average human-handled ticket: ~5 minutes, $1.50-3 fully loaded. Average AI-drafted ticket (human review): ~1 minute, $0.01-0.05 in API costs. Average pure-AI resolution (no human): ~30 seconds, $0.01. Comparing the $1.50-3 human cost against $0.01-0.05 of API spend gives a 30-300x cost reduction per ticket (before counting the reviewer's minute on AI-drafted tickets), at equal quality once the classification and drafting prompts are dialed in.

The catch: tier-1 frequent tickets work great with AI; complex tickets need humans. The architecture below routes intelligently — cheap model classifies, expensive model drafts for cases that warrant it, escalation rules send the hard ones to a human queue.

Architecture in four steps

  1. Classify — Claude Haiku 4.5, returns category + priority + sentiment as JSON
  2. Retrieve — pull top-5 relevant docs from your KB (use the RAG guide)
  3. Draft — Claude Sonnet 4.6 writes a response using KB context + tone guidelines
  4. Route — auto-send if priority=low and confidence high; otherwise to human agent for one-click approve/edit
support_bot.py
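import json  # needed by classify() to parse the model's JSON reply

from openai import OpenAI

# The OpenAI SDK is pointed at Kunavo's OpenAI-compatible endpoint via base_url.
client = OpenAI(api_key="sk-kunavo-...", base_url="https://api.kunavo.com/v1")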

# 1) Classify the ticket (cheap model)
def classify(text: str) -> dict:
    resp = client.chat.completions.create(
        model="claude-haiku-4-5",
        messages=[
            {"role": "system", "content": "Return JSON: {category, priority, sentiment}"},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        max_tokens=200,
    )
    return json.loads(resp.choices[0].message.content)

# 2) Draft a response (better model)
def draft(ticket: str, kb_context: list[str], guidelines: str) -> str:
    resp = client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=[
            {"role": "system", "content": [{
                "type": "text",
                "text": guidelines,  # tone, policy, escalation rules
                "cache_control": {"type": "ephemeral"},
            }]},
            {"role": "user", "content": (
                f"# Knowledge base\n{chr(10).join(kb_context)}\n\n"
                f"# Customer message\n{ticket}\n\n"
                "Draft a reply. Be specific, cite the KB doc id."
            )},
        ],
        max_tokens=500,
    )
    return resp.choices[0].message.content
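
Step 4 (Route) is not covered by the snippet above. A minimal sketch of that rule follows. It assumes the classifier prompt is extended to also return a `confidence` score between 0 and 1, which the prompt above does not ask for. The category names in `HIGH_STAKES` and the 0.9 threshold are placeholders to tune against your own reviewed tickets.

```python
from dataclasses import dataclass

# Hypothetical category names; replace with the ones your classifier returns.
HIGH_STAKES = {"refund", "account_change", "complaint"}
AUTO_SEND_THRESHOLD = 0.9  # assumed cutoff, not a recommended value


@dataclass
class Decision:
    action: str  # "auto_send" or "human_review"
    reason: str


def route(classification: dict) -> Decision:
    """Decide whether a drafted reply may be sent without human review."""
    category = str(classification.get("category", "")).lower()
    priority = str(classification.get("priority", "")).lower()
    # "confidence" is an assumed extra field; missing or malformed counts as 0.
    try:
        confidence = float(classification.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0

    if category in HIGH_STAKES:
        return Decision("human_review", "high-stakes category")
    if priority != "low":
        return Decision("human_review", f"priority is {priority or 'unknown'}")
    if confidence < AUTO_SEND_THRESHOLD:
        return Decision("human_review", "classifier confidence below threshold")
    return Decision("auto_send", "low priority, high confidence")
```

Every failure mode (unknown priority, missing confidence, unparseable value) falls through to `human_review`, so a malformed classification can never trigger an auto-send.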

Real cost projection

  • 100 tickets/day with classify + draft + caching: ~$0.02/ticket × 100 tickets × 30 days = ~$60/month
  • 1,000 tickets/day: ~$600/month — replaces ~3 full-time agents at typical western tier-1 rates
  • 10,000 tickets/day (large e-commerce): ~$6,000/month — replaces 30+ agents

These figures assume prompt caching is turned on (the system prompt is static; KB context changes per ticket). See "How prompt caching works": caching is the difference between $60 and $200/month at 100 tickets/day.
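
The projections above are simple multiplication, and it can help to rerun them with your own volume. A small sketch, assuming a 30-day month and the ~$0.02/ticket figure quoted above; your real per-ticket cost depends on prompt length, KB context size, and cache hit rate.

```python
def monthly_cost(tickets_per_day: int, cost_per_ticket: float = 0.02, days: int = 30) -> float:
    """Projected monthly API spend in dollars (per-ticket cost is an estimate)."""
    return round(tickets_per_day * cost_per_ticket * days, 2)


for volume in (100, 1_000, 10_000):
    print(f"{volume:>6} tickets/day -> ${monthly_cost(volume):,.0f}/month")
```

Under the same formula, the uncached $200/month figure at 100 tickets/day corresponds to roughly $0.067 per ticket.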

Anti-patterns to avoid

  • Auto-sending high-stakes replies (refunds, account changes, complaints) without human review. The PR risk of one bad auto-reply outweighs all efficiency gains.
  • Routing by keyword instead of LLM classification. Keyword routing breaks the moment users phrase things differently; LLM classification handles paraphrase trivially.
  • Inflated tier-1 expectations. AI handles repeats and common questions, not "I've been charged 3x for one order and your chatbot keeps saying contact support". Always leave the customer a clear path to a human.
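
To make the keyword-routing anti-pattern concrete, here is a toy keyword router. The keyword table and queue names are invented for illustration. It routes the literal phrasing correctly and silently drops a paraphrase of the same request.

```python
from typing import Optional

# Hypothetical keyword -> queue table, for illustration only.
KEYWORD_ROUTES = {"refund": "billing", "invoice": "billing", "password": "account"}


def keyword_route(text: str) -> Optional[str]:
    """Return the queue of the first matching keyword, or None if nothing matches."""
    lowered = text.lower()
    for keyword, queue in KEYWORD_ROUTES.items():
        if keyword in lowered:
            return queue
    return None


literal = keyword_route("I need a refund for order 1182")
paraphrase = keyword_route("I was charged twice and want my money back")
print(literal, paraphrase)
```

The second message is a refund request with no matching keyword, so it falls out of the table. An LLM classifier such as `classify` above is intended to catch this kind of case, though it is still worth spot-checking on your own tickets.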

What good looks like at month 6

  • 60-80% of incoming tickets resolved without human touch (AI + KB)
  • Remaining 20-40% drafted by AI, reviewed and sent by humans in ~1 min/ticket
  • Human team focuses on edge cases + improving the KB
  • Net cost: 70-85% lower than pre-AI, and CSAT is often higher because responses are faster

Locale considerations

If you support customers in multiple languages, Claude Sonnet 4.6 handles ~30 languages well. For Korean specifically (formal speech levels matter), see our deep dive on Korean call center automation. For Japanese, Chinese, German, French — same playbook with locale-specific guidelines in the cached system prompt.
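
One way to apply "locale-specific guidelines in the cached system prompt" is to build the guidelines string deterministically per language, so every ticket in the same language shares an identical system prompt. The sketch below is an assumption about how you might organize this. `BASE_GUIDELINES`, the `LOCALE_GUIDELINES` entries, and `guidelines_for` are hypothetical names, and the guideline wording is illustrative only.

```python
BASE_GUIDELINES = "Be concise. Cite the KB doc id. Escalate refunds to a human."  # placeholder

# Hypothetical per-language additions; write your own with native reviewers.
LOCALE_GUIDELINES = {
    "ko": "Reply in Korean using formal polite speech.",
    "ja": "Reply in Japanese using polite keigo.",
    "de": "Reply in German using the formal 'Sie' form.",
}


def guidelines_for(locale: str) -> str:
    """Return the system-prompt text for a locale tag such as 'ko-KR' or 'de'."""
    lang = locale.replace("_", "-").split("-")[0].lower()
    extra = LOCALE_GUIDELINES.get(lang)
    if extra is None:
        return BASE_GUIDELINES
    return f"{BASE_GUIDELINES}\n\n{extra}"
```

The result can be passed as the `guidelines` argument of `draft`. Because the text depends only on the language, tickets in one language reuse the same prefix, which is what prefix-based caching needs to pay off.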

Start in 30 minutes

  1. Sign up — $2 credit at /app/signup
  2. Index your KB in pgvector using text-embedding-3-large embeddings (cost: under $5 one-time for 5K docs)
  3. Wire the classify + draft snippet above into your ticketing system (Intercom, Zendesk, Freshdesk all have webhook entry points)
  4. Start in "draft mode": humans approve every response. After reviewers have approved 100 drafts as correct, enable auto-send for low-priority categories
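
The "draft mode" graduation rule in step 4 can be tracked with a small counter. This is a sketch under assumptions: `DraftModeGate` is a hypothetical helper, the count is kept per category, and an edited or rejected draft resets that category's count. The reset is a stricter reading than the step states, so relax it if it is too conservative for your team.

```python
from collections import defaultdict

REQUIRED_APPROVALS = 100  # threshold taken from step 4


class DraftModeGate:
    """Tracks reviewer outcomes per category and unlocks auto-send."""

    def __init__(self, required: int = REQUIRED_APPROVALS) -> None:
        self.required = required
        self.approved = defaultdict(int)  # category -> approvals since last reset

    def record(self, category: str, approved_unedited: bool) -> None:
        if approved_unedited:
            self.approved[category] += 1
        else:
            # Assumed policy: an edit or rejection resets the count.
            self.approved[category] = 0

    def auto_send_enabled(self, category: str, priority: str) -> bool:
        # Only low-priority tickets in a graduated category may skip review.
        return priority == "low" and self.approved.get(category, 0) >= self.required
```

In production the counts would live in your database rather than in memory. The in-memory version is only meant to show the rule.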