Anthropic Claude API Errors Reference - HTTP Status Codes, Error Types, Causes, and Official Solutions

First Published: 2026-07-05
Last Updated: 2026-07-05

If you are building on the Claude API (also called the Anthropic API) and a request comes back with overloaded_error, rate_limit_error, or a bare invalid_request_error, you usually want three things fast: what the error means, why it happened, and the official way to resolve it. The trouble is that this information is spread across several documentation pages - the errors page, the rate limits page, the streaming page, and each SDK reference. This article aggregates all of it into a single reference so that searching the exact error string lands you on one page with the meaning, the officially documented causes, the resolution steps, and a link back to the primary source.

The scope here is the first-party Claude API (Messages API and its supporting endpoints). Errors that are specific to Amazon Bedrock or Google Vertex AI are out of scope and are covered separately; where they overlap, this page links to the first-party behavior. Pricing figures and exact rate-limit numbers are intentionally not reproduced here because they change over time - the article links to the official pages that hold the current values.

1. Overview

This is a reference aggregation page for Claude API errors. It is organized so that an exact-match search (for example, "claude api 529", "overloaded_error", or "rate_limit_error retry") lands on the specific entry you need.

The core of the page is Section 3, the Error Reference, where every HTTP error listed in the official documentation is covered with a fixed four-part structure: its meaning, its officially documented common causes, how to diagnose and resolve it, and a link to the official reference. Section 2 explains the mechanics that apply across all errors - the JSON shape an error takes, where the request ID lives, how errors surface differently during streaming, how to read rate-limit headers, and why stop_reason is not an error. Section 4 covers the official guidance on retries, timeouts, and backoff. Section 5 answers the questions developers ask most often, and Section 6 summarizes and points to related material.

A note on how to read this page:

Error type names, header names, and status codes are quoted verbatim from the official documentation. When Anthropic writes invalid_request_error, this page uses invalid_request_error, not a paraphrase.
Causes are limited to what the official documentation states. Where the docs are specific (for example, the documented validation errors under 400), those are reproduced. Where the docs are general, this page does not invent additional causes.
The error surface can expand over time. Per Anthropic's versioning policy, the set of type values may grow. Code that switches on error types should handle unknown types gracefully.

Related material on this site: the Anthropic Claude Model Release Timeline for model IDs and availability, the Anthropic Claude API Prompt Caching and Token Efficiency guide for cache-aware rate limits, and the Anthropic Claude Model Migration Guide for parameter changes that commonly produce 400 errors when migrating between models.

2. How Errors Are Returned

Before the reference itself, it helps to understand the shapes an error can take and the fields you read to handle it. The same status code can arrive as an HTTP response, as a raised SDK exception, or - during streaming - as an event in an otherwise-successful stream.
The table below summarizes the outcomes a single API request can resolve into, and how each one is handled:

Outcome	What the response carries	How to handle it
200 OK (success)	The body carries a `stop_reason` (`end_turn` / `max_tokens` / `refusal` ...) — a `stop_reason` is not an error	Branch on `stop_reason` in application logic
Non-2xx: 4xx (client-side)	`{ "type": "error", "error": { "type": ... }, "request_id": ... }` with status `400` / `401` / `402` / `403` / `404` / `413`, or `429 rate_limit_error`	Fix the request, key, or account state before retrying; on `429`, back off and respect `retry-after`
Non-2xx: 5xx / 529 (server-side)	The same error JSON with `500 api_error`, `504 timeout_error`, or `529 overloaded_error`	Retry with exponential backoff (the SDK default); `529` signals aggregate overload
Streaming (SSE)	An error such as `overloaded_error` can arrive as an `event: error` after a `200` response	Handle `error` events inside the stream with the same branch-by-`error.type` logic

2.1 The error JSON shape

Errors are always returned as JSON with a top-level error object that always includes a type and a message. The response also includes a top-level request_id. For example:

{
  "type": "error",
  "error": {
    "type": "not_found_error",
    "message": "The requested resource could not be found."
  },
  "request_id": "req_011CSHoEeqs5C35K2UUqR7Fy"
}

The outer type is always "error". The inner error.type is the machine-readable error type (for example, not_found_error, rate_limit_error, overloaded_error) that you branch on. The error.message is a human-readable description whose text may change and should not be pattern-matched. The request_id is the identifier to quote when contacting support.
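As a minimal sketch of reading this shape, the function below parses an error body and reports whether its error.type is one of the types listed on this page. The names ParsedError, parse_error_body, and KNOWN_ERROR_TYPES are hypothetical helpers for illustration, not part of any SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Error types listed on this page. The set may grow under the versioning
# policy, so an unlisted type is reported as "unknown" rather than rejected.
KNOWN_ERROR_TYPES = {
    "invalid_request_error", "authentication_error", "billing_error",
    "permission_error", "not_found_error", "request_too_large",
    "rate_limit_error", "api_error", "timeout_error", "overloaded_error",
}


@dataclass
class ParsedError:
    error_type: str            # machine-readable value to branch on
    message: str               # human-readable; do not pattern-match it
    request_id: Optional[str]  # quote this when contacting support
    known: bool                # False for types this code has not seen


def parse_error_body(body: str) -> Optional[ParsedError]:
    """Return a ParsedError for an error payload, or None if it is not one."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("type") != "error":
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    error_type = str(error.get("type", ""))
    return ParsedError(
        error_type=error_type,
        message=str(error.get("message", "")),
        request_id=payload.get("request_id"),
        known=error_type in KNOWN_ERROR_TYPES,
    )
```

Returning known=False instead of raising keeps the caller working when a new error type appears; the caller can fall back to status-code handling in that case.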

2.2 Where the request ID lives

Every API response includes a unique request-id HTTP header, with a value such as req_018EeWyXxfu5pfWkrYcMdjWG. Include it when you contact support about a specific request so the issue can be traced end-to-end.

The official SDKs expose this value as a property on the top-level response object. In Python and TypeScript it is message._request_id (a public property despite the underscore). You can also read the raw header via the SDK's raw-response accessor.

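from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)
print(message._request_id)  # e.g. req_018EeWyXxfu5pfWkrYcMdjWG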

On Claude Platform on AWS, responses carry two request IDs: the AWS request ID (x-amzn-requestid, indexed in CloudTrail) and the Anthropic request ID (request-id). Use the AWS request ID for CloudTrail lookups and the Anthropic request ID for Anthropic support tickets.

2.3 Errors during streaming

When you request a streaming response over server-sent events (SSE), an error can occur after a 200 response has already been returned. In that case the error does not follow the standard HTTP-status mechanism - it arrives as an error event in the stream.

For example, during periods of high usage you may receive an overloaded_error, which in a non-streaming context would correspond to HTTP 529:

event: error
data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}

The full set of stream event types is message_start, content_block_start, content_block_delta, content_block_stop, message_delta, and message_stop, plus any number of ping events and the occasional error event. Because new event types may be added under the versioning policy, your stream handler should ignore unknown event types rather than fail on them. Practically, this means a streaming client needs the same error-handling branch as a non-streaming client, wired into the event loop rather than only around the initial request.
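For a direct integration that reads the SSE stream itself, the handling described above might look like the sketch below. It works on already-decoded text lines and uses only the standard library; iter_sse_events, consume, and StreamError are hypothetical names, and a real client would also need to handle reconnects and partial lines.

```python
import json
from typing import Iterable, Iterator, List, Tuple

# Event types named in this section; anything else is ignored, not rejected.
HANDLED_EVENTS = {
    "message_start", "content_block_start", "content_block_delta",
    "content_block_stop", "message_delta", "message_stop", "ping",
}


class StreamError(Exception):
    """Raised when the stream delivers an `error` event after a 200."""

    def __init__(self, error_type: str, message: str) -> None:
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs; a blank line terminates each event."""
    event, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if event is not None and data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
    if event is not None and data_parts:  # stream ended without a blank line
        yield event, json.loads("\n".join(data_parts))


def consume(lines: Iterable[str]) -> List[str]:
    """Return the handled event names in order; raise on an error event."""
    seen = []
    for name, data in iter_sse_events(lines):
        if name == "error":
            err = data.get("error", {})
            raise StreamError(err.get("type", "unknown"), err.get("message", ""))
        if name in HANDLED_EVENTS:
            seen.append(name)
        # Unknown event types fall through and are skipped on purpose.
    return seen
```

A caller can catch StreamError and branch on error_type exactly as it would for a non-streaming response, for example backing off and retrying on overloaded_error.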

2.4 Reading rate-limit headers

Every response includes headers that show the enforced limit, current usage, and when the limit resets. These are the levers for staying under 429:

retry-after - the number of seconds to wait before retrying; earlier retries fail.
anthropic-ratelimit-requests-limit / -remaining / -reset - the requests-per-minute limit, requests remaining, and the reset time (RFC 3339).
anthropic-ratelimit-tokens-limit / -remaining / -reset - the token limit values for the most restrictive limit currently in effect.
anthropic-ratelimit-input-tokens-limit / -remaining / -reset - the input-token-per-minute values.
anthropic-ratelimit-output-tokens-limit / -remaining / -reset - the output-token-per-minute values.

Priority Tier organizations also receive anthropic-priority-input-tokens-* and anthropic-priority-output-tokens-* headers. Token remaining values are rounded to the nearest thousand. The token headers reflect the most restrictive limit that currently applies (for example, a Workspace per-minute limit if one is set).
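The headers above can be collected into one small structure before deciding whether to send the next request. The sketch below assumes the headers arrive as a plain string mapping and that reset values are RFC 3339 timestamps as described above; summarize_rate_limit is a hypothetical helper, and the header values in the usage note are made up for illustration.

```python
from datetime import datetime
from typing import Mapping, Optional


def _to_number(value, cast):
    """Convert a header value with `cast`, returning None if absent or invalid."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def _parse_reset(value: Optional[str]) -> Optional[datetime]:
    """Parse an RFC 3339 timestamp; a trailing 'Z' is rewritten as +00:00."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None


def summarize_rate_limit(headers: Mapping[str, str], kind: str = "requests") -> dict:
    """Summarize one header family.

    `kind` is one of "requests", "tokens", "input-tokens", "output-tokens".
    """
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    prefix = f"anthropic-ratelimit-{kind}-"
    return {
        "limit": _to_number(lowered.get(prefix + "limit"), int),
        "remaining": _to_number(lowered.get(prefix + "remaining"), int),
        "reset": _parse_reset(lowered.get(prefix + "reset")),
        "retry_after": _to_number(lowered.get("retry-after"), float),
    }
```

For example, summarize_rate_limit({"anthropic-ratelimit-requests-remaining": "0", "retry-after": "12"}) yields remaining 0 and retry_after 12.0, which is enough to decide to pause before the next call.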

2.5 stop_reason is not an error

A common source of confusion is treating a stop_reason as an error. It is not. stop_reason is a field in an otherwise-successful HTTP 200 response body that tells you why generation stopped, whereas an error indicates the request failed to process.

The Messages API can return these stop_reason values:

stop_reason	Meaning
`end_turn`	Claude finished its response naturally (most common).
`max_tokens`	The response reached the `max_tokens` limit in the request.
`stop_sequence`	Claude emitted one of your custom `stop_sequences`.
`tool_use`	Claude is calling a tool.
`pause_turn`	The server-side tool loop reached its iteration limit (default 10).
`refusal`	Claude declined to respond.
`model_context_window_exceeded`	The response filled the model's context window.

Two of these are frequently mistaken for errors:

max_tokens looks like a failure but is a normal stop. If output is truncated, raise max_tokens (and stream for large values - see Section 4).
refusal is returned as a normal 200, not an error. When stop_reason == "refusal", the stop_details object identifies the policy category that triggered it; stop_details is null for every other stop_reason. Check stop_reason before reading content, because a pre-output refusal can have an empty content array.

pause_turn is resumed by sending the assistant response back as-is; do not add an extra user message. See the official handling stop reasons guide for the full treatment.
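One way to keep stop_reason handling separate from error handling is a single dispatch function over the response body. The sketch below takes the response as a plain dict; the action labels it returns are hypothetical names chosen for this example, not API values.

```python
# Maps each stop_reason in the table above to an application-level action.
# The action labels are illustrative; only the keys come from the API.
ACTIONS = {
    "end_turn": "use_content",
    "stop_sequence": "use_content",
    "max_tokens": "raise_max_tokens",
    "tool_use": "run_tool",
    "pause_turn": "resend_assistant_turn",  # send the turn back as-is
    "refusal": "show_refusal",              # content may be empty
    "model_context_window_exceeded": "shorten_context",
}


def next_action(response: dict) -> str:
    """Decide what to do with a successful (HTTP 200) Messages response.

    stop_reason is checked before content is read, because a pre-output
    refusal can carry an empty content array.
    """
    reason = response.get("stop_reason")
    # Unknown values are reported rather than treated as failures.
    return ACTIONS.get(reason, "unknown_stop_reason")
```

Because every branch starts from a 200 response, none of these paths should feed into retry logic; retries belong to the error handling described in Section 4.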

3. Error Reference

This is the core of the page. Errors are listed in HTTP status-code order (4xx first, then 5xx, ending with 529), followed by two topics that do not map to a single status: streaming error events and the SDK exception mapping. Each entry has the same four parts - Meaning, Common Causes (from the official documentation only), How to Diagnose and Resolve, and Official Reference.

Error index:

400 invalid_request_error
401 authentication_error
402 billing_error
403 permission_error
404 not_found_error
413 request_too_large
429 rate_limit_error
500 api_error
504 timeout_error
529 overloaded_error
Streaming error events
SDK exceptions by status code

400 invalid_request_error

Meaning. There was an issue with the format or content of your request. This error type may also be used for other 4XX status codes that are not otherwise listed.

Common Causes (officially documented). The general cause is a malformed or invalid request body. The documentation also calls out specific validation errors that surface as 400 invalid_request_error:

Prefilling assistant messages is not supported for this model. Claude Fable 5, Claude Mythos 5, Claude Mythos Preview, Claude Opus 4.8, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6 do not support prefilling assistant messages. Sending a request whose last message is a prefilled assistant turn to any of these models returns 400 with the message Prefilling assistant messages is not supported for this model.
thinking or redacted_thinking blocks cannot be modified. If the most recent assistant message contains thinking or redacted_thinking blocks that were edited, reordered, filtered out, or reconstructed before being sent back, the request returns 400. The message begins with the position of the offending block (for example, messages.1.content.0). Every thinking and redacted_thinking block from the assistant turn must be passed back exactly as received, including blocks whose thinking field is empty.

How to Diagnose and Resolve. Read error.message first - for 400 it is specific about what was wrong. Then:

For the prefill error, remove the trailing assistant message and use structured outputs (output_config.format) or system-prompt instructions to shape the response instead.
For the thinking-blocks error, pass thinking blocks back unchanged; if your code filters content blocks by type before resending, include both thinking and redacted_thinking.
For other 400s, validate the request against the Messages API reference before sending: a valid model ID, a positive max_tokens, and a non-empty messages array. When migrating between models, deprecated parameters (for example, temperature/top_p/top_k or a fixed budget_tokens) are a frequent cause - see the model migration guide.

Official Reference. Errors - HTTP errors and Errors - Common validation errors.

401 authentication_error

Meaning. There is an issue with your API key. On Claude Platform on AWS, this can also indicate a problem with your AWS credentials or SigV4 signature.

Common Causes (officially documented). An authentication problem with the credential presented to the API.

How to Diagnose and Resolve. Confirm the API key is present, current, and sent on the correct header. Set ANTHROPIC_API_KEY in the environment rather than hardcoding it, or authenticate with a stored profile and leave the client constructor empty. If you authenticate with an OAuth bearer token via raw HTTP, send it as Authorization: Bearer <token> rather than on x-api-key. On Claude Platform on AWS, verify the AWS credentials and SigV4 signing.
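For raw HTTP integrations, the header choice described above can be sketched as follows. auth_headers is a hypothetical helper; it returns only the authentication header, and any other headers the API requires are out of scope for this example.

```python
import os
from typing import Dict, Mapping, Optional


def auth_headers(
    oauth_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Return the authentication header for a raw HTTP request.

    An OAuth bearer token goes on Authorization; an API key read from
    ANTHROPIC_API_KEY goes on x-api-key. The key is never hardcoded.
    """
    if oauth_token:
        return {"Authorization": f"Bearer {oauth_token}"}
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return {"x-api-key": api_key}
```

Failing fast when the variable is missing turns a confusing 401 at request time into a clear configuration error at startup.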

Official Reference. Errors - HTTP errors.

402 billing_error

Meaning. There is an issue with your billing or payment information.

Common Causes (officially documented). A billing or payment-information problem on the organization.

How to Diagnose and Resolve. Check your payment details in the Claude Console (or in AWS Marketplace if you are using Claude Platform on AWS). This is an account-level condition, not a request-shape problem, so retrying the same request without fixing billing will not help.

Official Reference. Errors - HTTP errors.

403 permission_error

Meaning. Your API key does not have permission to use the specified resource.

Common Causes (officially documented). The key lacks permission for the requested resource.

How to Diagnose and Resolve. Confirm the API key's permissions in the Console and that it is scoped to the workspace and resource you are calling. You may need a different key or additional access. Because this is a permission condition, retrying with the same key will keep failing until the permission is granted.

Official Reference. Errors - HTTP errors.

404 not_found_error

Meaning. The requested resource was not found.

Common Causes (officially documented). The requested resource does not exist - for example, an incorrect model ID or an invalid endpoint path.

How to Diagnose and Resolve. Verify the exact model ID and endpoint. Model IDs must match the published strings exactly; a typo (for example, claude-sonnet-4.6 instead of claude-sonnet-4-6) or a retired model produces 404. Confirm current model IDs against the model release timeline or the official models documentation.

Official Reference. Errors - HTTP errors.

413 request_too_large

Meaning. The request exceeds the maximum allowed number of bytes.

Common Causes (officially documented). The request body is larger than the per-endpoint maximum. On the direct Claude API, this error is returned from Cloudflare before the request reaches the API servers. The documented maximum request sizes are:

Endpoint type	Maximum request size
Messages API	32 MB
Token Counting API	32 MB
Batch API	256 MB
Files API	500 MB

How to Diagnose and Resolve. Reduce the request size to fit under the endpoint's maximum: trim conversation history, compress or resize images, or split large documents. For large inputs referenced repeatedly, the Files API lets you upload a file once and reference it by ID instead of re-embedding it in every request.
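A pre-flight size check can catch an oversized request before it is sent. The sketch below is an approximation under two assumptions that the documentation quoted here does not settle: that "MB" means 1024 × 1024 bytes, and that the bytes produced by json.dumps match what your HTTP client actually sends. The names MAX_REQUEST_BYTES, request_size_bytes, and fits_endpoint are hypothetical.

```python
import json

MB = 1024 * 1024  # assumption: binary megabytes

# Documented maximum request sizes from the table above.
MAX_REQUEST_BYTES = {
    "messages": 32 * MB,
    "token_counting": 32 * MB,
    "batch": 256 * MB,
    "files": 500 * MB,
}


def request_size_bytes(payload: dict) -> int:
    """Approximate the request body size as UTF-8 encoded JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_endpoint(payload: dict, endpoint: str = "messages") -> bool:
    """Return True if the serialized payload is within the endpoint maximum."""
    return request_size_bytes(payload) <= MAX_REQUEST_BYTES[endpoint]
```

Since the real limit is enforced on the bytes on the wire, it is safer to leave some margin below the maximum than to rely on an exact comparison.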

Official Reference. Errors - Request size limits.

429 rate_limit_error

Meaning. Your account has hit a rate limit.

Common Causes (officially documented). Exceeding one of the per-minute rate limits, which for the Messages API are measured as requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM) for each model class. A sharp increase in usage can also trigger 429 because of acceleration limits on the API, even below your stated per-minute limits.

A key detail for ITPM: for most models, only uncached input tokens count. input_tokens and cache_creation_input_tokens count toward ITPM; cache_read_input_tokens do not (Claude Haiku 3.5 is the documented exception). This is why prompt caching effectively raises your throughput ceiling - see the prompt caching and token efficiency guide. OTPM is evaluated in real time on tokens actually generated, so max_tokens does not factor into the OTPM calculation.
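The counting rule above reduces to a small calculation over the usage fields of a response. In this sketch, itpm_counted_tokens is a hypothetical helper and the token numbers in the usage note are invented; the cache_reads_count flag models the documented exception where cache reads also count.

```python
def itpm_counted_tokens(usage: dict, cache_reads_count: bool = False) -> int:
    """Return the input tokens that count toward ITPM for one request.

    input_tokens and cache_creation_input_tokens always count.
    cache_read_input_tokens count only when cache_reads_count is True.
    """
    counted = usage.get("input_tokens", 0) + usage.get("cache_creation_input_tokens", 0)
    if cache_reads_count:
        counted += usage.get("cache_read_input_tokens", 0)
    return counted
```

With a usage of 200 input_tokens, 0 cache_creation_input_tokens, and 9,800 cache_read_input_tokens, the request counts as 200 tokens toward ITPM on most models, versus 10,000 if cache reads counted.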

How to Diagnose and Resolve. A 429 response includes a retry-after header telling you how many seconds to wait; respect it. The official guidance is to retry with exponential backoff (the SDKs do this automatically - see Section 4) and to read the anthropic-ratelimit-* headers to see remaining headroom and reset times. To avoid acceleration-limit 429s, ramp traffic up gradually and keep usage patterns consistent. To raise your effective ceiling without a limit increase, cache repeated content so those tokens stop counting toward ITPM. You can also read your configured limits programmatically with the Rate Limits API.

Official Reference. Rate limits and Errors - HTTP errors.

500 api_error

Meaning. An unexpected error occurred internal to Anthropic's systems.

Common Causes (officially documented). A transient, server-side condition inside Anthropic's infrastructure.

How to Diagnose and Resolve. This is retryable. Retry with exponential backoff (the SDKs retry >= 500 automatically). If it persists, check the Anthropic status page and quote the request-id when contacting support.

Official Reference. Errors - HTTP errors.

504 timeout_error

Meaning. The request timed out while processing.

Common Causes (officially documented). The request took too long to process - most often a long-running, non-streaming request, especially one with a large max_tokens. Some networks also drop idle connections after a period of time, which can cause a long request to fail without a response.

How to Diagnose and Resolve. For long-running requests, use the streaming Messages API or the Message Batches API rather than a single long non-streaming call. Avoid setting a large max_tokens without streaming; if you do not need to process events incrementally, use .stream() with .get_final_message() (Python) or .finalMessage() (TypeScript) to get the complete Message object without writing event handlers. For direct integrations, a TCP socket keep-alive reduces the impact of idle-connection timeouts.
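For a direct integration that manages its own sockets, enabling TCP keep-alive is a one-line socket option in the Python standard library. The sketch below only sets and verifies the option on an unconnected socket; how that socket is handed to an HTTP client depends on the library, and the finer-grained idle and interval settings are platform-specific, so both are left out here. enable_keepalive is a hypothetical helper name.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send TCP keep-alive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


# Demonstrate on an unconnected socket: set the option, then read it back.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    enable_keepalive(sock)
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
finally:
    sock.close()
```

Keep-alive reduces the chance that an intermediary drops an idle connection, but it does not replace streaming for long-running requests.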

Official Reference. Errors - HTTP errors and Errors - Long requests.

529 overloaded_error

Meaning. The API is temporarily overloaded.

Common Causes (officially documented). High traffic across all users: the documentation notes that 529 can occur when the API experiences high traffic across the whole user base, so it is not specific to your account. Separately, if your organization has a sharp increase in usage, you may instead see 429 errors because of acceleration limits.

How to Diagnose and Resolve. This is retryable. Retry with exponential backoff. Because it reflects aggregate load, options include spreading requests over time, queuing, and (as a general capacity strategy) trying a model with more headroom. During streaming, overloaded_error arrives as an error event after a 200 rather than as an HTTP 529 (see the streaming entry below). To avoid the related acceleration-limit 429, ramp usage gradually and keep it consistent.

Official Reference. Errors - HTTP errors.

Streaming error events

Meaning. When streaming over SSE, an error can occur after the API has already returned a 200. Rather than an HTTP status, it is delivered as an error event within the stream.

Common Causes (officially documented). Any error the API surfaces mid-stream. The documented example is overloaded_error during high usage, which would correspond to HTTP 529 in a non-streaming context:

event: error
data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}

How to Diagnose and Resolve. Handle error events inside your stream loop, applying the same branch-by-error.type logic you use for non-streaming responses (for example, back off and retry on overloaded_error). Ignore unknown event types so that new event types added under the versioning policy do not break your handler. If you use the SDK streaming helpers, prefer .get_final_message() / .finalMessage(), which surface the terminal state without requiring you to wire every event by hand.

Official Reference. Streaming - Error events.

SDK exceptions by status code

Meaning. The official SDKs raise typed exceptions instead of returning raw error JSON, so you branch on exception classes rather than parsing status codes yourself. The class names and namespaces differ by language.

Common Causes (officially documented). Each non-success status maps to a typed exception. The Python SDK, for example, defines this mapping:

HTTP status	Python exception (anthropic.*)
400	`BadRequestError`
401	`AuthenticationError`
403	`PermissionDeniedError`
404	`NotFoundError`
409	`ConflictError`
413	`RequestTooLargeError`
422	`UnprocessableEntityError`
429	`RateLimitError`
503	`ServiceUnavailableError`
504	`DeadlineExceededError`
529	`OverloadedError`
Other >= 500	`InternalServerError`
N/A (connection failure)	`APIConnectionError`

All of these subclass APIError. Any non-2xx response that does not have a more specific subclass surfaces as APIStatusError (with status_code and response available) - this is how a status such as 402 appears. A timeout raises APITimeoutError.

The TypeScript SDK follows the same pattern under the Anthropic.* namespace (Anthropic.BadRequestError, Anthropic.RateLimitError, Anthropic.APIError, and so on), but as of this article's Last Updated date it does not define the dedicated 413/503/504/529 classes: in TypeScript, a 4xx status without a dedicated class (such as 402 or 413) surfaces as the base APIError, and every >= 500 response (including 504 and 529) surfaces as InternalServerError.

Other SDKs follow the same shape - for example, Anthropic::Errors::NotFoundError in Ruby, com.anthropic.errors.NotFoundException in Java, and a single *anthropic.Error value (branch on StatusCode) in Go.
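The Python mapping in the table above can be restated as a lookup, which is handy for logging or for tests that do not import the SDK. This is a sketch that mirrors the table on this page, not the SDK's own implementation; python_exception_name is a hypothetical helper that returns class names as strings.

```python
# Status codes with a dedicated Python exception class, per the table above.
PYTHON_EXCEPTION_BY_STATUS = {
    400: "BadRequestError",
    401: "AuthenticationError",
    403: "PermissionDeniedError",
    404: "NotFoundError",
    409: "ConflictError",
    413: "RequestTooLargeError",
    422: "UnprocessableEntityError",
    429: "RateLimitError",
    503: "ServiceUnavailableError",
    504: "DeadlineExceededError",
    529: "OverloadedError",
}


def python_exception_name(status: int) -> str:
    """Return the exception class name the table assigns to a non-2xx status."""
    if 200 <= status < 300:
        raise ValueError("2xx responses do not raise an exception")
    if status in PYTHON_EXCEPTION_BY_STATUS:
        return PYTHON_EXCEPTION_BY_STATUS[status]
    if status >= 500:
        return "InternalServerError"  # "Other >= 500" row
    return "APIStatusError"           # no dedicated subclass, e.g. 402
```

For example, python_exception_name(402) falls through to APIStatusError, which matches the note above about how a 402 appears.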

How to Diagnose and Resolve. Catch the most specific class first, then the base:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

try:
    message = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}],
    )
except anthropic.NotFoundError:
    ...  # 404 - check model ID / endpoint
except anthropic.RateLimitError as e:
    retry_after = e.response.headers.get("retry-after")
    ...  # 429 - back off, respect retry-after
except anthropic.OverloadedError:
    ...  # 529 - back off and retry with jitter
except anthropic.APIStatusError as e:
    print(e.status_code, e.message)  # any other non-2xx (e.g. 402)
except anthropic.APIConnectionError as e:
    print(e.__cause__)  # network failure before a response

Branch on the typed classes rather than string-matching error.message, which can change. For programmatic error classification finer than the HTTP status (for example, distinguishing billing_error from permission_error, both nominally in the 4xx range), the SDKs also expose the API error.type string on the exception.

Official Reference. Errors - SDK error types and the Python SDK - Handling errors reference.

4. Retries, Timeouts, and Backoff

Anthropic's guidance for transient failures is exponential backoff, and the official SDKs implement it so you usually do not need to write retry logic yourself.

What the SDKs retry automatically. Certain errors are retried up to two times by default, with a short exponential backoff. The retried conditions are connection errors, 408 (request timeout), 409 (conflict), 429 (rate limit), and >= 500 (server errors). You configure this with max_retries (default 2; set it to 0 to disable):

from anthropic import Anthropic

# Configure for all requests
client = Anthropic(max_retries=5)

# Or per request
client.with_options(max_retries=5).messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)

Respecting retry-after. On 429, the response includes a retry-after header with the number of seconds to wait; earlier retries fail. The SDKs read this header when retrying; if you implement your own backoff, honor it.
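If you do write your own backoff, a delay calculation that honors retry-after might look like the sketch below. The base, cap, and jitter range are illustrative values chosen for this example and are not the SDKs' actual parameters; backoff_delay is a hypothetical helper.

```python
import random
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 8.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return how many seconds to sleep before retry number `attempt` (0-based).

    The delay doubles with each attempt up to `cap`, then jitter scales it to
    between 50% and 100% of that value so clients do not retry in lockstep.
    When the server sent retry-after, the result is never shorter than it.
    """
    exponential = min(cap, base * (2 ** attempt))
    jittered = exponential * (0.5 + 0.5 * rng())
    if retry_after is not None:
        return max(retry_after, jittered)
    return jittered
```

Taking the maximum of retry-after and the computed delay means the header acts as a floor: the client may wait longer than the server asked, but never less.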

Timeouts. Requests time out after 10 minutes by default. You can change this with the timeout option. A request that times out is retried under the policy above, and the SDK raises APITimeoutError if the final attempt also times out. Because timeouts are retried, wall-clock time can reach roughly timeout × (max_retries + 1).
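The wall-clock bound is simple arithmetic, worked through here for the defaults. worst_case_seconds is a hypothetical helper, and the figure is approximate because it leaves out the backoff sleeps between attempts.

```python
def worst_case_seconds(timeout_s: float, max_retries: int) -> float:
    """Upper bound on time spent waiting on requests that all time out.

    One initial attempt plus max_retries retries, each allowed to run for
    the full timeout. Backoff sleeps between attempts are not included.
    """
    return timeout_s * (max_retries + 1)


# Defaults described above: 10-minute timeout, 2 retries.
default_bound = worst_case_seconds(600, 2)  # 1800 seconds, i.e. 30 minutes
```

If a caller has its own deadline shorter than this bound, lowering timeout or max_retries keeps the SDK from outliving it.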

Large max_tokens and long requests. Avoid a large max_tokens on a non-streaming request. Some networks drop idle connections, which can fail a long request without a response, and the SDK will raise a ValueError if it estimates a non-streaming request will exceed roughly 10 minutes. Use the streaming Messages API - or, for latency-insensitive batch work, the Message Batches API - for long or high-max_tokens requests. When you only need the final result, the streaming helpers give it to you without hand-wiring events:

with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=128000,
    messages=[{"role": "user", "content": "Write a detailed analysis..."}],
) as stream:
    message = stream.get_final_message()

Do not retry non-retryable errors. 400, 401, 402, 403, 404, and 413 reflect a problem with the request or account, not a transient condition; retrying the same request unchanged will keep failing. Fix the request (or the account/billing/permissions) first.
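For custom retry loops, the split between retryable and non-retryable conditions described in this section can be expressed as one predicate. This sketch follows the lists on this page; is_retryable is a hypothetical helper, and None stands for a connection failure where no HTTP status was received.

```python
from typing import Optional


def is_retryable(status: Optional[int]) -> bool:
    """Return True if a failed request is worth retrying unchanged.

    None means a connection error occurred before any response arrived.
    """
    if status is None:
        return True                # connection errors are retried
    if status in (408, 409, 429):
        return True                # request timeout, conflict, rate limit
    return status >= 500           # server errors, including 504 and 529
```

Everything else, including 400, 401, 402, 403, 404, and 413, returns False, which signals that the request or the account needs to change before another attempt.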

5. Frequently Asked Questions about Claude API Errors

What does overloaded_error (529) mean?
It means the API is temporarily overloaded. 529 overloaded_error occurs when the API experiences high traffic across all users - it is not specific to your account. It is retryable: retry with exponential backoff, spread requests over time, or try a model with more headroom. During streaming, overloaded_error arrives as an error event after a 200 response rather than as an HTTP 529.

How do I resolve rate_limit_error (429)?
A 429 rate_limit_error means you hit a rate limit (RPM, ITPM, or OTPM for a model class, or an acceleration limit from a sharp usage increase). The response includes a retry-after header - wait that many seconds, then retry with exponential backoff (the SDKs do this automatically). Read the anthropic-ratelimit-* headers for remaining headroom and reset times, ramp traffic gradually to avoid acceleration limits, and use prompt caching so cached input tokens stop counting toward ITPM on most models.

What is the difference between 429 and 529?
429 rate_limit_error is about your usage exceeding your limits (or acceleration limits from a sharp increase in your traffic); the fix is on your side - back off, respect retry-after, cache, and ramp gradually. 529 overloaded_error is about aggregate load on the API across all users; it is not tied to your account. Both are retryable with backoff, but 429 carries a retry-after header and is resolved by managing your own request rate, whereas 529 is resolved by retrying and, if needed, shifting load in time or model.

Why did my request return invalid_request_error?
400 invalid_request_error means the request format or content was invalid. Read error.message - it is specific. Documented cases include prefilling an assistant message on a model that does not support it (Claude Fable 5, Mythos 5, Mythos Preview, Opus 4.8/4.7/4.6, and Sonnet 4.6) and modifying thinking or redacted_thinking blocks before sending them back. Other frequent causes are an invalid model ID, a missing or invalid required parameter, and deprecated parameters left over from an older model. This error is not retryable - fix the request first.

Does the SDK retry automatically?
Yes. The official SDKs retry certain errors up to two times by default with a short exponential backoff: connection errors, 408, 409, 429, and >= 500. Configure this with max_retries (default 2; 0 disables). Non-retryable client errors such as 400, 401, 402, 403, 404, and 413 are not retried. Requests that time out (default timeout 10 minutes) are also retried before APITimeoutError is raised, so total wall-clock time can approach timeout × (max_retries + 1).

How do I find my request id for support?
Every response includes a request-id header (value like req_018EeWyXxfu5pfWkrYcMdjWG). The SDKs expose it as message._request_id (Python and TypeScript), or you can read the header directly via the raw-response accessor. Include this ID when contacting support. On Claude Platform on AWS, responses also include an AWS request ID (x-amzn-requestid) for CloudTrail; use the Anthropic request-id for Anthropic support tickets.

6. Summary

The Claude API returns a compact, predictable set of errors: 400 invalid_request_error, 401 authentication_error, 402 billing_error, 403 permission_error, 404 not_found_error, 413 request_too_large, 429 rate_limit_error, 500 api_error, 504 timeout_error, and 529 overloaded_error. Client-side 4xx errors (except 429) mean you must change the request, the credential, or the account state before retrying; 429, 500, 504, and 529 are transient and are retried automatically by the SDKs with exponential backoff, honoring retry-after. During streaming, errors such as overloaded_error arrive as error events after a 200, so a streaming client needs the same branch-by-error.type handling as a non-streaming one. And stop_reason values - including max_tokens and refusal - are part of a successful 200 response, not errors.

Error specifications evolve: per the versioning policy, the set of type values can grow, and models change which parameters they accept. This page is maintained as a living reference and will be updated as the official documentation changes - handle unknown error and event types gracefully so your integration keeps working when the surface expands. A companion reference for calling Claude and other models on AWS is available: Amazon Bedrock Errors and Exceptions Reference.

Related references on this site:

Anthropic Claude Model Release Timeline - current model IDs and availability, to avoid 404 not_found_error from stale IDs.
Anthropic Claude API Prompt Caching and Token Efficiency - how caching lowers ITPM pressure and reduces 429 rate_limit_error.
Anthropic Claude Model Migration Guide - parameter changes that commonly produce 400 invalid_request_error when moving between models.
LLM Token Counter and Context Budget Planner - estimate request size to stay under 413 request_too_large and ITPM limits.
JSONL Conversation Transcript Viewer - inspect message history and content blocks when debugging 400 request-shape errors.

References:
Anthropic - Errors
Anthropic - Rate limits
Anthropic - Streaming messages
Anthropic - Handling stop reasons
Anthropic - Python SDK

Written by Hidekazu Konishi