Claude API Errors: How to Fix the 8 Most Common Ones
TL;DR — A senior engineer's field guide to the most common Claude (Anthropic API) errors — 401, 403, 413, 429, 529 and more — with real fixes, commands, and a do's/don'ts table.
It's 2 a.m., your batch job is two hours into a 50,000-prompt run, and the logs suddenly fill with red. Error code: 429. Then 529. Then a stray 403 that you swear was working an hour ago. If you've built anything serious on the Anthropic API, you've lived some version of this scene — and you've probably wasted an evening chasing the wrong fix because the status code didn't mean what you assumed.
I've shipped a few production systems on Claude (Opus, Sonnet, Haiku) and the same handful of errors account for the overwhelming majority of pages I've been woken up for. The good news: each one has a clear cause and a repeatable fix. This guide walks through the eight Claude API error codes you'll actually hit, what each one really means, and exactly what to do about it.
The problem: status codes that lie to your intuition
The trap with API errors is that we pattern-match from past experience. A 403 in a web app usually means "you're logged in but not allowed here." A 400 means "your JSON is malformed." Those instincts are mostly right with Claude — but there are two specific places where they'll send you down a rabbit hole:
- Billing failures show up as 403, not a 402. There is no HTTP 402 in the Anthropic API. If your credits run dry, you get a 403 with a billing_error in the body, the same status as a genuine permission problem.
- An oversized request is 413, not 400. It's tempting to treat "too much content" as a bad request, but Anthropic returns request_too_large as a 413.
Get those two straight and you've already eliminated the most common misdiagnoses.
The authoritative Claude error code table
Here's the full set, straight from Anthropic's error codes reference. Bookmark this — it's the single most useful thing in this post.
| HTTP status | Error type | What it actually means |
|---|---|---|
| 400 | invalid_request_error | Malformed request: bad JSON, missing/invalid params |
| 401 | authentication_error | API key missing, wrong, or revoked |
| 403 | permission_error | Key valid but not allowed to use this resource |
| 403 | billing_error | Out of credits / billing issue (same status, different .type) |
| 404 | not_found_error | Resource (e.g. a model id) doesn't exist |
| 413 | request_too_large | Payload exceeds the size limit |
| 429 | rate_limit_error | You're hitting rate or token limits |
| 500 | api_error | Something broke on Anthropic's side |
| 529 | overloaded_error | Anthropic's API is temporarily overloaded |
Key insight: 403 does double duty. It covers both permission_error and billing_error. Always read the .type field in the response body, not just the HTTP status. Two requests can both fail with 403 for completely different reasons.
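To make "read the body, not the status" concrete, here is a minimal sketch that pulls the error type out of a raw JSON error response. It assumes the body is shaped like {"type": "error", "error": {"type": ..., "message": ...}}; the function name error_type and the sample bodies are mine, for illustration only.

```python
import json
from typing import Optional


def error_type(raw_body: str) -> Optional[str]:
    """Return the error's type string from a JSON error body, or None."""
    try:
        body = json.loads(raw_body)
    except (TypeError, ValueError):
        return None  # not JSON at all (e.g. a proxy's HTML error page)
    if not isinstance(body, dict):
        return None
    error = body.get("error")
    if not isinstance(error, dict):
        return None
    return error.get("type")


# Two hypothetical bodies that would arrive with the same HTTP status.
same_status = [
    '{"type": "error", "error": {"type": "permission_error", "message": "..."}}',
    '{"type": "error", "error": {"type": "billing_error", "message": "..."}}',
]
print([error_type(b) for b in same_status])  # ['permission_error', 'billing_error']
```

Returning None for anything unparseable keeps the caller honest: an unknown body is treated as unknown rather than guessed at.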
Fixing each error, step by step
401 authentication_error — your key is the problem
This is the most common first-day error. The key is missing, malformed, or revoked.
# Check the key is actually exported (don't echo the value in shared logs)
printenv ANTHROPIC_API_KEY | head -c 8; echo "..."
# Quick smoke test against the API
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{"model":"claude-sonnet-4-5","max_tokens":16,"messages":[{"role":"user","content":"ping"}]}'
Step-by-step:
- Confirm the env var is set in this shell (a key in .bashrc won't help a different process).
- Check for trailing whitespace or a copy-paste truncation. Keys start with sk-ant-.
sk-ant-. - If it was working and suddenly stopped, the key may have been rotated or revoked in the console. Generate a fresh one.
- Never hard-code the key as a CLI argument or commit it. Read it from the environment.
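The first two checks are easy to automate. The sketch below is a hypothetical helper (key_problems is my name, not part of any SDK) that reports what looks wrong with the key value without ever printing the key. It only checks the things listed above: presence, stray whitespace, and the sk-ant- prefix. Passing it does not prove the key is valid or unrevoked.

```python
import os
from typing import List, Optional


def key_problems(key: Optional[str]) -> List[str]:
    """Describe obvious problems with an API key value, never echoing the key."""
    if not key:
        return ["ANTHROPIC_API_KEY is unset or empty in this process"]
    problems = []
    if key != key.strip():
        problems.append("leading or trailing whitespace")
    if not key.strip().startswith("sk-ant-"):
        problems.append("does not start with sk-ant-")
    return problems


if __name__ == "__main__":
    # Reads from the environment of *this* process, which is the one that matters.
    print(key_problems(os.environ.get("ANTHROPIC_API_KEY")) or "no obvious problems")
```

Run it from the same process manager (cron, systemd, container) that runs your real job, since that is where a missing env var usually hides.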
403 — and here's where most people get burned
Two flavors, same status:
- permission_error: your key is fine but lacks access to that resource (e.g. an org/workspace restriction, or a model you're not enabled for). Check the workspace and the model id.
- billing_error: you've run out of prepaid credits or a payment failed. This is not a code bug. Top up credits in the console.
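import logging
import anthropic
log = logging.getLogger(__name__)
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
try:
    resp = client.messages.create(...)
except anthropic.PermissionDeniedError as e:
    # e.status_code == 403 for BOTH cases, so inspect the body
    body = e.body if isinstance(e.body, dict) else {}  # body may be None or not JSON
    err_type = body.get("error", {}).get("type")
    if err_type == "billing_error":
        # alert_ops stands in for your own paging hook
        alert_ops("Claude credits exhausted")  # not a deploy bug
    else:
        log.error("Permission problem: %s", err_type)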
I once spent forty minutes "fixing" a permission bug that was actually an empty credit balance. The .type field would have told me in two seconds.
413 request_too_large — you sent too much
You're stuffing a giant document (or an out-of-control conversation history) into one request. The fix is to shrink the payload, not to retry.
- Trim or chunk large inputs. Split a 200-page PDF into sections.
- Prune conversation history — summarize older turns instead of resending them verbatim.
- For genuinely large context, lean on Claude's long-context models, but still respect the request size ceiling.
Remember: this is 413, not 400. If you see request_too_large, the cure is "send less," full stop.
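Here is one rough way to prune history before sending. Note the swap: this sketch drops the oldest turns rather than summarizing them, which is a simpler stand-in for the summarization suggested above. It also assumes each message's content is a plain string, uses character count as a crude proxy for request size, and assumes the conversation should open with a user turn. prune_history and max_chars are hypothetical names.

```python
from typing import Dict, List


def prune_history(messages: List[Dict[str, str]], max_chars: int) -> List[Dict[str, str]]:
    """Keep the newest messages whose combined content fits in max_chars."""
    kept: List[Dict[str, str]] = []
    used = 0
    for message in reversed(messages):  # walk newest to oldest
        size = len(message["content"])
        if kept and used + size > max_chars:
            break  # older turns no longer fit
        kept.append(message)  # the newest message is always kept
        used += size
    kept.reverse()  # restore chronological order
    # Assumption: the conversation should start with a user turn.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return kept


history = [
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 20},
]
print(len(prune_history(history, max_chars=80)))   # 1
print(len(prune_history(history, max_chars=200)))  # 3
```

A character budget is only a proxy. If you need to stay under a token limit rather than a byte limit, count tokens instead and keep the same loop.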
429 rate_limit_error — the one you'll hit at scale
This is requests-per-minute or tokens-per-minute exhaustion. The right response is exponential backoff with jitter, plus honoring the retry-after header when present.
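import random
import time
import anthropic
def call_with_backoff(client, **kwargs):
    for attempt in range(6):
        try:
            return client.messages.create(**kwargs)
        except anthropic.RateLimitError as e:
            # The exception has no retry_after attribute; read the response header.
            header = e.response.headers.get("retry-after")
            try:
                wait = float(header)
            except (TypeError, ValueError):
                wait = 2 ** attempt  # header missing or not a number of seconds
            time.sleep(wait + random.uniform(0, 1))  # jitter avoids thundering herd
    raise RuntimeError("Rate limited after 6 retries")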
Step-by-step:
- Add retry-with-backoff on every call path (the official SDK already retries some of this — verify your version's defaults).
- Spread load: queue work instead of firing 1,000 concurrent requests.
- Watch token limits, not just request counts — a few huge prompts can trip TPM before RPM.
- If you've genuinely outgrown your tier, request a limit increase in the console.
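For the "spread load" step, a bounded worker pool is often enough. This is a sketch using only the standard library; run_bounded and fake_call are hypothetical names, and in real use the task would be your own function that wraps the API call with retry logic.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_bounded(
    task: Callable[[str], str],
    prompts: Iterable[str],
    max_in_flight: int = 4,
) -> List[str]:
    """Run task over prompts with at most max_in_flight calls active at once."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() yields results in input order, whatever order they finish in.
        return list(pool.map(task, prompts))


def fake_call(prompt: str) -> str:
    """Stand-in for a real API call, so the sketch runs offline."""
    return prompt.upper()


print(run_bounded(fake_call, ["a", "b", "c"], max_in_flight=2))  # ['A', 'B', 'C']
```

One caveat: if any task raises, the exception surfaces when the results are collected and the remaining results are lost, so for a long batch you probably want the task itself to catch and record failures. Also, capping concurrency bounds requests in flight, not tokens per minute, so a handful of huge prompts can still trip TPM.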
500 api_error and 529 overloaded_error — not your fault
500 is a server-side hiccup; 529 means Anthropic's API is temporarily overloaded. For both, the move is the same: retry with backoff, and degrade gracefully if it persists.
529 is common enough that it deserves its own playbook — I've written a dedicated walkthrough on handling overload with retry budgets and fallbacks: see the Claude API 529 overloaded error guide. The short version: treat 529 like 429 for retry purposes, but cap your total retry budget so you fail fast instead of hammering an already-struggling service.
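A capped retry budget might look something like the sketch below. It is deliberately SDK-agnostic: you pass in the exception classes you consider retryable, and you should check which classes your SDK version raises for 500 and 529 rather than trust my memory. retry_within_budget and its parameters are hypothetical names; the sleep and clock arguments exist only so the function can be tested without waiting.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_within_budget(
    call: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    budget_seconds: float = 30.0,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry call() with backoff until it succeeds or the time budget is spent."""
    deadline = clock() + budget_seconds
    attempt = 0
    while True:
        try:
            return call()
        except retryable:
            # Exponential backoff plus jitter, same idea as for 429.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            if clock() + delay > deadline:
                raise  # budget spent: fail fast instead of hammering the service
            sleep(delay)
            attempt += 1
```

The budget is wall-clock time rather than an attempt count, which is the point: a caller waiting on a user request can say "30 seconds, then give up", while a batch job can afford a much larger number.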
400 invalid_request_error and 404 not_found_error — read the message
These are the "read the actual error string" errors.
- 400: usually a malformed body, an invalid parameter, or a max_tokens that violates the model's constraints. The error message in the response body usually names the offending field.
- 404: most often a typo'd or deprecated model id. Model names change; pin the current ones from the docs rather than from memory.
A note on Claude Code (the CLI), not just the API
A lot of these errors surface inside Claude Code, Anthropic's terminal agent, where they look like auth failures rather than HTTP codes. If Claude Code won't authenticate:
- Prefer the in-session /login command: run /login inside an active Claude Code session and follow the browser flow.
- Don't invent CLI flags. Command surfaces change between versions, so confirm anything version-specific with claude --help or the official docs before relying on it.
- If you're wiring Claude Code to use an API key directly, set ANTHROPIC_API_KEY in the environment rather than pasting it into a prompt.
The same principle applies to other CLIs you might run alongside it. For OpenAI's Codex CLI, for example, the device-login flag is codex login --device-auth, and you pass an API key via stdin — printenv OPENAI_API_KEY | codex login --with-api-key — never as a command-line argument. Different tool, same hygiene: keys come from the environment, flags get verified against --help.
Do's and Don'ts
| Do | Don't |
|---|---|
| Read the .type field in the response body | Assume 403 always means "permission" |
| Treat billing_error as an ops/credits issue | Look for a 402; it doesn't exist |
| Handle request_too_large as 413 by shrinking payloads | Treat oversized requests as 400 and retry blindly |
| Retry 429/500/529 with exponential backoff + jitter | Retry 400/401/413; they'll fail identically |
| Read keys from environment variables | Hard-code or commit ANTHROPIC_API_KEY |
| Pin current model ids from the docs | Hard-code stale model names (→ 404) |
| Verify CLI flags with --help | Invent version-specific commands from memory |
Three habits that prevent most of these pages
- Centralize your error handling. One wrapper function that classifies by .type and applies the right strategy (retry vs. alert vs. fail) beats scattered try/except blocks that each guess differently.
- Separate "retryable" from "terminal" errors explicitly. 429, 500, and 529 are retryable; 400, 401, 403, 404, and 413 are not. Encode that list once.
- Alert on billing_error like an outage. Credits running out takes down everything silently and looks exactly like a permission bug. A dedicated alert saves the late-night confusion I described earlier.
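Those three habits fit in one small function. This is a sketch, not a drop-in: Action, RETRYABLE_STATUSES, and classify are hypothetical names, and the status-to-strategy mapping is simply the one described in this post.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY = "retry"  # back off and try again
    ALERT = "alert"  # page a human; retrying will not help
    FAIL = "fail"    # terminal: fix the request, key, or model id


# Encoded once, as the second habit suggests.
RETRYABLE_STATUSES = frozenset({429, 500, 529})


def classify(status: int, error_type: Optional[str]) -> Action:
    """Map an HTTP status plus the body's error type to a handling strategy."""
    if error_type == "billing_error":
        return Action.ALERT  # checked first: it shares a status with permission_error
    if status in RETRYABLE_STATUSES:
        return Action.RETRY
    return Action.FAIL  # includes anything unrecognized


print(classify(403, "billing_error"))     # Action.ALERT
print(classify(403, "permission_error"))  # Action.FAIL
print(classify(529, "overloaded_error"))  # Action.RETRY
```

Defaulting unknown statuses to FAIL is a judgment call. It means a new or unexpected error stops the job loudly instead of being retried in a loop.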
Wrapping up
Almost every Claude API failure you'll meet in production reduces to this short list: 400 (fix the request), 401 (fix the key), 403 (permission or billing — check .type), 404 (fix the model id), 413 (send less), 429 (back off), and 500/529 (retry, it's their side). The two facts that save the most time are the ones intuition gets wrong: there is no 402 — billing failures are 403, and oversized requests are 413, not 400.
Your next action: take the table above, drop it into a single error-classification helper in your codebase, and wire up exponential backoff on the retryable codes. Do that once and the 2 a.m. pages mostly stop.
Want to go deeper on the specific overload case? Read our 529 overloaded error guide, or browse more practical AI-tooling walkthroughs on the blog. And if you're shaping the prompts feeding these calls, a tighter prompt means fewer oversized-payload 413s and lower token bills — well worth the time.