· 6 min read

Vercel AI Gateway: why I stopped calling the Anthropic SDK directly

I switched production calls from @ai-sdk/anthropic to model strings via Vercel AI Gateway. Provider-agnostic fallback, OIDC auth without API keys, $5/day cap. Before/after streamText snippets.

AI · Vercel · Architecture

For half a year I called Anthropic via @ai-sdk/anthropic in every production deploy. Standard pattern: npm install @ai-sdk/anthropic, drop ANTHROPIC_API_KEY into Vercel ENV, plug it into streamText({ model: anthropic('claude-sonnet-4-6') }), done. Works. Until the day Anthropic went down for an hour with 529 errors and I had nowhere to fail over. That's when I migrated to Vercel AI Gateway and I'm not going back.

What AI Gateway is and what it solves

Vercel AI Gateway is a proxy for LLM providers that sits between your app and Anthropic/OpenAI/Google/xAI. Instead of calling an SDK directly, you call models via the Gateway endpoint. That gets you four things that are hard to build yourself:

  1. Provider-agnostic model strings - 'anthropic/claude-sonnet-4-6', 'openai/gpt-5.1', 'google/gemini-3'. No import { anthropic } from '@ai-sdk/anthropic'. Same code, different string.
  2. Fallback chain - when the primary provider returns 5xx or 429, Gateway switches to a secondary inside the same request. The client only sees the successful stream.
  3. Centralized observability - every request in one dashboard, breakdown by model, project, route. Without bolting on Datadog.
  4. OIDC auth on Vercel - Vercel deploy signs the request, Gateway verifies. No API key in ENV.
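The 'provider/model' convention is just a string format the Gateway routes on. A tiny illustrative helper shows what gets parsed out of it — `parseModelString` is my own sketch, not part of the AI SDK:

```typescript
// Sketch of the 'provider/model' convention the Gateway routes on.
// parseModelString is NOT an AI SDK function -- purely illustrative.
type ParsedModel = { provider: string; model: string };

function parseModelString(id: string): ParsedModel {
  const slash = id.indexOf('/');
  if (slash === -1) {
    throw new Error(`Expected 'provider/model', got '${id}'`);
  }
  // Everything before the first slash is the provider, the rest is the model
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

// parseModelString('anthropic/claude-sonnet-4-6')
//   -> { provider: 'anthropic', model: 'claude-sonnet-4-6' }
```

Because the provider lives in data rather than in an import, swapping providers is a string change, not a dependency change.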

Before: direct Anthropic SDK

This was my pattern in DokladBot, Maruška, and customer support bots.

import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';
 
export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model: anthropic('claude-sonnet-4-6'),
    system: 'You are a helpful assistant.',
    messages,
    maxOutputTokens: 2048,
  });
  return result.toUIMessageStreamResponse();
}

Looks fine. Three problems:

  • API key in ENV. When I rotate the key (regularly, for leak prevention), I have to redeploy. Two-minute pause for production.
  • Single provider. Anthropic 529 = my endpoint returns 500. No fallback.
  • No per-project usage tracking. Anthropic Console shows total spend per key, with no per-project or per-route breakdown.
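Fixing the single-provider problem without a gateway means hand-rolling the fallback yourself, in every route. A sketch of roughly what that looks like — the provider calls are stubbed as async functions, and the helper name is mine:

```typescript
// Hand-rolled fallback chain -- the kind of code the Gateway makes
// unnecessary. Each entry would wrap a streamText call against a
// different provider SDK in real code; here they are stubs.
type ProviderCall = () => Promise<string>;

async function firstSuccessful(chain: ProviderCall[]): Promise<string> {
  let lastError: unknown;
  for (const call of chain) {
    try {
      return await call();
    } catch (err) {
      lastError = err; // 529 / 429 / 5xx -> fall through to the next provider
    }
  }
  throw lastError; // every provider failed
}
```

Every route handler needs this wrapper, plus a second API key in ENV and a second provider package in package.json. The Gateway moves that loop server-side.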

After: AI Gateway via model string

import { streamText } from 'ai';
 
export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model: 'anthropic/claude-sonnet-4-6',
    system: 'You are a helpful assistant.',
    messages,
    maxOutputTokens: 2048,
  });
  return result.toUIMessageStreamResponse();
}

That string instead of an import is not cosmetic. AI SDK v6 detects the provider/model format and routes through Gateway. No @ai-sdk/anthropic in package.json. No ANTHROPIC_API_KEY in ENV. When the deploy runs on Vercel and is linked to the AI Gateway integration, the OIDC token attaches automatically.

Local dev works too: vercel env pull plus vercel dev gives you a Gateway key for the local runtime, and OIDC takes over at deploy time.

Fallback chain in 2 lines

This is the main reason I'm not going back. AI SDK v6 supports it directly in options:

import { streamText } from 'ai';
 
const result = streamText({
  model: 'anthropic/claude-sonnet-4-6',
  providerOptions: {
    gateway: {
      order: ['anthropic', 'openai'],
    },
  },
  messages,
});

order: ['anthropic', 'openai'] means: try Anthropic, on fail jump to the OpenAI equivalent. Gateway internally maps claude-sonnet-4-6 → gpt-5.1 (or whatever you configure as equivalent).
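The equivalence mapping is configuration, not magic. Conceptually it's a lookup table from primary model to cross-provider stand-in — the pairs below are my example config, not Gateway defaults:

```typescript
// Illustrative fallback-equivalence map: what I configure in the Gateway,
// expressed as plain data. These pairings are MY choices, not defaults.
const fallbackEquivalents: Record<string, string> = {
  'anthropic/claude-sonnet-4-6': 'openai/gpt-5.1',
  'google/gemini-3': 'anthropic/claude-sonnet-4-6',
};

// Returns the configured stand-in, or undefined if no fallback is set up
function resolveFallback(model: string): string | undefined {
  return fallbackEquivalents[model];
}
```

If a model has no entry, there is nothing to fail over to — which is why I define an equivalent for every model I put in production.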

Real story: during a demo deploy of Maruška for a client, Anthropic dropped for 47 minutes with 529. Gateway switched to OpenAI fallback, the demo finished, the client never noticed. Without this I would have had to interrupt the demo.

Cost cap and observability

Vercel dashboard has per-project budget for Gateway. I set $5/day soft cap, $20/day hard cap on every project. Soft cap exceeded → I get a notification. Hard cap = Gateway returns 402, and I know there's a leak somewhere.
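The soft/hard cap behavior is a simple three-state decision. This is my mental model of it, sketched as code — the real enforcement lives in Vercel's infrastructure, not in anything you write:

```typescript
// My mental model of the Gateway budget check -- not Vercel's actual code.
// Below soft cap: serve normally. At soft cap: serve, but notify.
// At hard cap: reject the request (the Gateway returns HTTP 402).
type CapDecision = 'ok' | 'notify' | 'reject-402';

function checkBudget(spentToday: number, softCap: number, hardCap: number): CapDecision {
  if (spentToday >= hardCap) return 'reject-402';
  if (spentToday >= softCap) return 'notify';
  return 'ok';
}

// With my $5 soft / $20 hard caps:
// checkBudget(3, 5, 20)  -> 'ok'
// checkBudget(7, 5, 20)  -> 'notify'
// checkBudget(25, 5, 20) -> 'reject-402'
```

The useful property: the hard cap fails loudly (402 in your logs) instead of silently accumulating spend, which is exactly what you want when the cause is a leak.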

| Before (direct SDK) | After (AI Gateway) |
| --- | --- |
| API key per provider in ENV | No key on Vercel deploy (OIDC) |
| Single provider per request | Configurable fallback chain |
| Anthropic Console = only usage view | Per-project, per-route breakdown |
| Provider package per model | 'provider/model' string |
| Provider migration = code change + redeploy | Migration = config change |
| Cost cap = roll your own alerting | Built-in soft/hard cap |

Edge cases you might hit

1. Streaming compatibility. AI Gateway fully supports the streamText SSE flow. No extra buffering, < 50ms overhead vs direct SDK call. I measured p95 240ms (gateway) vs 215ms (direct) on a typical 800-token response.

2. Tool calling across providers. anthropic/claude-sonnet-4-6 and openai/gpt-5.1 have different tool schemas, but AI SDK v6 normalizes them. You define tools once, fallback works.
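"Define tools once" in practice: you describe the tool in one provider-neutral shape and the SDK emits the provider-specific wire format. A simplified, dependency-free sketch — the AI SDK's own tool() helper uses a zod schema rather than raw JSON Schema, so treat this as an illustration of the idea, not the exact API:

```typescript
// Provider-neutral tool definition as a plain object with JSON Schema
// parameters. Simplified sketch: the AI SDK's tool() helper takes a zod
// schema instead, but the "one definition, two wire formats" idea is the same.
const getWeatherTool = {
  name: 'get_weather',
  description: 'Get the current weather for a city',
  parameters: {
    type: 'object',
    properties: { city: { type: 'string' } },
    required: ['city'],
  },
} as const;

// From this single source, Anthropic's `input_schema` field and OpenAI's
// `parameters` field are both generated -- which is what makes the
// fallback chain safe for tool-calling routes.
```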

3. Local dev. If you don't use vercel dev, you'll need AI_GATEWAY_API_KEY in .env.local. For CI/CD pipelines that's an extra secret, but only one - instead of per-provider keys.

4. Pricing. Gateway doesn't add markup. You pay provider price plus a minimal observability fee. For me roughly $0.50/month across all projects.

Migrating an existing project

In Maruška the migration took 9 minutes:

# 1. Drop the provider package
pnpm remove @ai-sdk/anthropic
 
# 2. Find every anthropic('...') call
grep -rn "anthropic('" src/
 
# 3. Replace with model string
#    (GNU sed shown; on macOS use `sed -i ''`. The ** glob needs bash globstar.)
sed -i "s/anthropic('claude-sonnet-4-6')/'anthropic\/claude-sonnet-4-6'/g" src/**/*.ts
 
# 4. Remove ANTHROPIC_API_KEY from Vercel ENV
vercel env rm ANTHROPIC_API_KEY production

Plus a PR with a diff that looks trivial. Code review was more about whether it makes sense than about correctness.

Lessons

  • Model string > SDK import is the right default for AI features on Vercel. Provider-package per model is technical debt.
  • Fallback chain saves demo presentations. At least once a year your primary provider dies on the day you have an important call.
  • OIDC auth = less secret rotation work. No API key in ENV means nothing to compromise.
  • Set the cost cap before the first deploy. I once left an app with a buggy retry loop running overnight - without the cap it would have been five figures.
  • AI SDK v6 ships a compatible migration path. You can run hybrid (some routes on Gateway, some direct) during migration.
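The retry-loop lesson, in code: cap the attempts and stop on non-retryable errors, so a bug can burn at most N requests instead of a night's worth. A minimal sketch — helper name and limits are mine:

```typescript
// Bounded retry with a hard attempt cap -- the guard that would have
// contained my overnight retry-loop bill. Names and defaults are mine.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Give up immediately on non-retryable errors, or when out of attempts
      if (!isRetryable(err) || attempt === maxAttempts) throw err;
    }
  }
  throw lastError; // unreachable; satisfies TypeScript's return analysis
}
```

The cost cap is the backstop; this is the first line of defense. Belt and suspenders.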

What's next

If you're migrating from direct SDK to Gateway on your own project, drop me a line. Most migrations are under 30 minutes.