Amazon Bedrock Errors and Exceptions Reference - Runtime Exceptions, Causes, and Official Solutions

Q: What causes ThrottlingException in Amazon Bedrock?

ThrottlingException (HTTP 429) means your request was denied because it exceeded the per-account quotas for Amazon Bedrock - typically the request rate or token throughput during bursty or highly concurrent traffic. Check the limits allotted to your account in the Amazon Bedrock service quotas, use retries with exponential backoff and jitter, and for sustained high throughput consider Provisioned Throughput, cross-region inference, or a quota increase.

Q: What is the difference between ThrottlingException and ServiceQuotaExceededException?

ThrottlingException (429) is a momentary rate/throughput limit: the correct response is to back off and retry, and to smooth or scale your traffic. ServiceQuotaExceededException (400) signals that a service quota boundary was exceeded; retrying immediately does not help. Instead, resubmit later or request a quota increase through Service Quotas. Because it is a client-side (400) condition, the SDKs do not automatically retry it, whereas 429 throttling is retried with backoff.

Q: What does ModelNotReadyException mean?

ModelNotReadyException means the model specified in the request is not ready to serve inference requests (for example, it is still warming up). It is transient. The AWS SDK automatically retries the operation up to five times, and you can add your own exponential backoff for additional resilience. Its HTTP status is 429 on the Bedrock Runtime APIs and 424 on InvokeAgent.

Q: Why do I get AccessDeniedException when invoking a model?

AccessDeniedException (HTTP 403) means the IAM identity making the request lacks sufficient permissions for the action, or its temporary credentials have expired. Verify that your user or role has the required Bedrock permission for the exact model or inference profile ARN you are invoking, confirm no SCP or explicit Deny blocks it, and refresh expired temporary credentials. If access to the model itself was never requested, you may see a model-access or Marketplace-agreement error instead of AccessDeniedException.

Q: Is a guardrail intervention or a max_tokens stopReason an error?

No. Both are ordinary, successful 200 responses. A guardrail that blocks or masks content returns action: GUARDRAIL_INTERVENED (or stopReason: guardrail_intervened on Converse), which means the guardrail worked as configured. A stopReason of max_tokens means the model reached your maxTokens limit and the answer was truncated. Neither raises an exception, so handle them as response data rather than as failures.

First Published: 2026-07-05
Last Updated: 2026-07-14

When you build on Amazon Bedrock, the exception you catch is often the fastest route to the root cause — but only if you can map the exception name to its precise meaning, its documented causes, and the official fix. Amazon Bedrock spreads that information across several API reference pages (Bedrock Runtime, Agents and Knowledge Bases runtime, the control plane) plus a User Guide troubleshooting page, so answering a simple question like "what is the difference between ThrottlingException and ServiceQuotaExceededException?" can mean opening five tabs.

This article is a single reference that consolidates the exceptions Amazon Bedrock returns at runtime, what each one means, the causes that AWS documents, and the official way to diagnose and resolve it. Every entry is grounded in the current AWS API Reference and the Amazon Bedrock User Guide — no guessed causes, no unofficial workarounds. It is written for developers and SREs who reach this page by searching for an exact exception name such as bedrock ThrottlingException, ModelNotReadyException, or bedrock ValidationException.

For history and the broader service context, see the companion articles "AWS History and Timeline regarding Amazon Bedrock" and "Amazon Bedrock Basic Information and API Examples".

1. Overview

Amazon Bedrock exposes several distinct APIs, and the exception you receive depends on which one you call:

Bedrock Runtime (bedrock-runtime) — inference operations InvokeModel, InvokeModelWithResponseStream, Converse, and ConverseStream. This is where the model-specific exceptions (ModelErrorException, ModelNotReadyException, ModelTimeoutException, ModelStreamErrorException) live.
Agents and Knowledge Bases Runtime (bedrock-agent-runtime) — InvokeAgent, Retrieve, and RetrieveAndGenerate. These add dependency-oriented exceptions (BadGatewayException, DependencyFailedException, ConflictException) because an agent or a knowledge base orchestrates other services. Note that Amazon Bedrock Agents (launched November 2023) is now Amazon Bedrock Agents Classic and will no longer be open to new customers starting on July 30, 2026; existing customers can continue to use the service as normal, and AWS points to Amazon Bedrock AgentCore for similar capabilities (Amazon Bedrock Agents Classic maintenance mode).
Control plane (bedrock) — management operations such as CreateGuardrail, ListFoundationModels, and Provisioned Throughput management. These share the same exception names but sometimes with different HTTP status codes.

Because the same exception name can appear on more than one API with a different HTTP status code, this reference orders the core section (Error Reference) alphabetically by exception name and states the status code for each API that raises it. The rest of the article follows a fixed shape:

How Errors Are Returned — the wire format of a Bedrock error, how the AWS SDKs surface it as an exception, how streaming errors differ, and the signals (stopReason, guardrail intervention) that look like errors but are not.
Error Reference — one entry per exception, each with Meaning, Common Causes, How to Diagnose and Resolve, and Official Reference. An anchor index sits at the top of the section.
Retries, Timeouts, and Backoff — the official retry guidance, and the crucial distinction between an error you should retry and one you should not.
Frequently Asked Questions about Amazon Bedrock Errors.
Summary.

To use this reference, start with the quick-reference table at the top of the Error Reference, jump to the exception you hit through the exception index, and read its four fixed fields. Every cause and fix here is taken from the official AWS documentation; where the documentation does not state a cause, this page does not invent one.

This page is maintained as a living reference. As Amazon Bedrock adds APIs and exceptions, the entries here are updated in place against the official documentation.

2. How Errors Are Returned

2.1 The Shape of a Bedrock Error

Amazon Bedrock APIs are AWS JSON APIs. When a request fails, the service returns a non-2xx HTTP status code, an error code (the exception name), and a human-readable message. The error type is carried in the response — through the x-amzn-ErrorType header and the response body — and a request identifier is returned in the x-amzn-RequestId (also surfaced as x-amzn-requestid) header. The request id is the single most useful value to capture in logs and to quote when you contact AWS Support.

A raw error response (status line, headers, and body) looks like this:

HTTP/1.1 429 Too Many Requests
x-amzn-RequestId: 9d2b1f1a-1c3a-4e2e-9b7a-0f6b2f5f1a20
x-amzn-ErrorType: ThrottlingException:http://internal.amazon.com/coral/com.amazon.bedrock/

{
  "message": "Too many requests, please wait before trying again."
}
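
If you do handle the raw response yourself, the exception name can be pulled out of the x-amzn-ErrorType header. The following is a minimal sketch, not an official parser: the function name parse_error_type is hypothetical, and the handling of a "namespace#Name" prefix is an assumption added for robustness rather than something shown in the example above.

```python
def parse_error_type(header_value: str) -> str:
    """Return the bare exception name from an x-amzn-ErrorType header value."""
    # Everything after the first ':' is a namespace URL, not part of the name.
    name = header_value.split(":", 1)[0]
    # Assumption: some responses prefix the name with "namespace#"; keep the tail.
    return name.rsplit("#", 1)[-1].strip()


header = "ThrottlingException:http://internal.amazon.com/coral/com.amazon.bedrock/"
print(parse_error_type(header))  # prints ThrottlingException
```

The returned string is the same value the SDKs expose as the error code, so it can be compared directly against the exception names in the Error Reference.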

2.2 How the AWS SDKs Surface Errors

You rarely parse the raw body yourself; the AWS SDKs model each exception as a typed error.

In the AWS SDK for Python (Boto3), Bedrock exceptions are raised as botocore.exceptions.ClientError, and the structured fields live under the response attribute:

import boto3
from botocore.exceptions import ClientError

client = boto3.client("bedrock-runtime", region_name="us-east-1")

try:
    response = client.converse(
        modelId="anthropic.claude-sonnet-5",
        messages=[{"role": "user", "content": [{"text": "Hello"}]}],
    )
except ClientError as error:
    code = error.response["Error"]["Code"]                        # e.g. "ThrottlingException"
    message = error.response["Error"]["Message"]
    status = error.response["ResponseMetadata"]["HTTPStatusCode"] # e.g. 429
    request_id = error.response["ResponseMetadata"]["RequestId"]
    print(code, status, request_id, message)

Boto3 also exposes each modeled exception as a class on the client, so you can branch on a specific one:

try:
    # Reuses `client` from the previous example; model_id and body are
    # placeholders for your model ID and serialized request payload.
    response = client.invoke_model(modelId=model_id, body=body)
except client.exceptions.ModelNotReadyException:
    # transient: the model is warming up - safe to retry
    ...
except client.exceptions.ValidationException:
    # your request is malformed - fix it, do not retry as-is
    ...

The AWS CLI surfaces the same error on standard error with the exception name in parentheses, which makes it easy to reproduce and confirm an exception outside your application:

$ aws bedrock-runtime converse \
    --model-id anthropic.claude-sonnet-5 \
    --messages '[{"role":"user","content":[{"text":"Hello"}]}]'

An error occurred (ThrottlingException) when calling the Converse operation
(reached max retries: 2): Too many requests, please wait before trying again.

The parenthetical "reached max retries: 2" confirms that the built-in retries had already been exhausted before the error surfaced to you.

Other SDKs follow the same modeling. The AWS SDK for Java v2 raises subclasses of BedrockRuntimeException (for example, ThrottlingException, ValidationException) under software.amazon.awssdk.services.bedrockruntime.model. The AWS SDK for JavaScript v3 attaches the exception name to the thrown error's name property. In every case, the exception name equals the error Code documented in the Error Reference.

2.3 Errors During Streaming

InvokeModelWithResponseStream and ConverseStream can fail in two different places, and robust code handles both:

Before the stream opens — the same exceptions as the non-streaming operations (for example, ValidationException, AccessDeniedException, ThrottlingException, ResourceNotFoundException) are raised synchronously when you call the operation.
Mid-stream — after the response stream has started, an error can arrive as an event inside the event stream. The streaming operations can deliver InternalServerException, ModelStreamErrorException, ValidationException, ThrottlingException, and ServiceUnavailableException as stream events. ModelStreamErrorException is specific to streaming and carries the original status code and message from the underlying model.

Because a mid-stream error surfaces while you are already iterating over chunks, wrap the iteration itself (not only the initial call) in error handling.
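
One way to structure that is sketched below. The helper name consume_converse_stream and the fake_stream generator are hypothetical, and the event shape (contentBlockDelta, delta, text) follows the ConverseStream response format. With Boto3 you would pass response["stream"] from converse_stream and, on the assumption that botocore raises mid-stream errors as ClientError subclasses during iteration, error_types=(ClientError,).

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def consume_converse_stream(
    events: Iterable[Dict[str, Any]],
    error_types: Tuple[type, ...] = (Exception,),
) -> Tuple[str, Optional[BaseException]]:
    """Collect streamed text and return (text_so_far, error_or_None)."""
    parts = []
    try:
        # The try block wraps the iteration itself, because a mid-stream
        # failure is raised while the next event is being pulled.
        for event in events:
            delta = event.get("contentBlockDelta", {}).get("delta", {})
            if "text" in delta:
                parts.append(delta["text"])
    except error_types as error:
        # Keep the partial text so the caller can decide what to do with it.
        return "".join(parts), error
    return "".join(parts), None


def fake_stream():
    # Hypothetical stand-in for response["stream"]: two chunks, then a failure.
    yield {"contentBlockDelta": {"delta": {"text": "Hel"}}}
    yield {"contentBlockDelta": {"delta": {"text": "lo"}}}
    raise RuntimeError("simulated mid-stream error")


text, error = consume_converse_stream(fake_stream())
print(repr(text), type(error).__name__)
```

Returning the partial text together with the error lets the caller choose between discarding the output and retrying, or showing what arrived before the failure.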

2.4 Signals That Are Not Errors

Two Bedrock behaviors are frequently reported as "errors" but are ordinary, successful 200 responses. Distinguishing them from exceptions saves a great deal of debugging time.

stopReason (Converse) / stop_reason (model output). A successful Converse response includes a stopReason that explains why generation stopped. Documented values include end_turn, max_tokens, stop_sequence, tool_use, content_filtered, and guardrail_intervened (and, depending on the model, model_context_window_exceeded, malformed_tool_use, and malformed_model_output). A stopReason of max_tokens means the model hit your maxTokens limit — the call succeeded and you simply received a truncated answer. It is not an exception.

Guardrail intervention. When a guardrail blocks or masks content, the API still returns 200. ApplyGuardrail returns an action of NONE or GUARDRAIL_INTERVENED, and Converse reports a stopReason of guardrail_intervened. Intervention is the guardrail working as configured, not a failure of the call. See the Amazon Bedrock glossary for related terminology, and treat guardrail results as normal response data.
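
Because these signals arrive as response data, they are handled with ordinary branching rather than exception handling. The sketch below shows one possible mapping from the documented stopReason values to application-level outcomes; the function name and the outcome labels ("complete", "truncated", and so on) are hypothetical and not part of any AWS API.

```python
def summarize_stop_reason(response: dict) -> str:
    """Map a Converse response's stopReason to an application-level outcome."""
    stop_reason = response.get("stopReason")
    if stop_reason in ("end_turn", "stop_sequence"):
        return "complete"
    if stop_reason == "max_tokens":
        # The call succeeded; the answer was cut off at the maxTokens limit.
        return "truncated"
    if stop_reason in ("guardrail_intervened", "content_filtered"):
        # The guardrail or content filter worked as configured.
        return "blocked"
    if stop_reason == "tool_use":
        return "tool_requested"
    # Model-dependent values such as model_context_window_exceeded land here.
    return "other"


print(summarize_stop_reason({"stopReason": "max_tokens"}))  # prints truncated
```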

2.5 Control Plane vs. Runtime

The exception you catch also depends on which client you used. Management calls (creating a guardrail, listing foundation models, managing Provisioned Throughput) go to the bedrock control plane; inference calls go to bedrock-runtime; agent and knowledge base runtime calls go to bedrock-agent-runtime. The same name — for example ConflictException or ModelNotReadyException — can carry a different HTTP status code on each surface, which is exactly why the reference below lists the status code per API.

The table below maps the most common operations to the exceptions their API reference documents, so you can see at a glance which errors a given call can raise.

Operation	Client	Documented exceptions (HTTP status)
InvokeModel	bedrock-runtime	AccessDenied(403), InternalServer(500), ModelError(424), ModelNotReady(429), ModelTimeout(408), ResourceNotFound(404), ServiceQuotaExceeded(400), ServiceUnavailable(503), Throttling(429), Validation(400)
InvokeModelWithResponseStream	bedrock-runtime	Same as InvokeModel, plus ModelStreamError(424)
Converse / ConverseStream	bedrock-runtime	Same as InvokeModel, minus ServiceQuotaExceeded (capacity limits surface as Throttling)
ApplyGuardrail	bedrock-runtime	AccessDenied(403), InternalServer(500), ResourceNotFound(404), ServiceQuotaExceeded(400), ServiceUnavailable(503), Throttling(429), Validation(400)
InvokeAgent	bedrock-agent-runtime	AccessDenied(403), BadGateway(502), Conflict(409), DependencyFailed(424), InternalServer(500), ModelNotReady(424), ResourceNotFound(404), ServiceQuotaExceeded(400), Throttling(429), Validation(400)
Retrieve / RetrieveAndGenerate	bedrock-agent-runtime	Same as InvokeAgent, minus ModelNotReady
CreateGuardrail (control-plane example)	bedrock	AccessDenied(403), Conflict(400), InternalServer(500), ResourceNotFound(404), ServiceQuotaExceeded(400), Throttling(429), TooManyTags(400), Validation(400)

3. Error Reference

Each entry below lists the exception's Meaning (the official definition), Common Causes (only causes documented by AWS), How to Diagnose and Resolve (official steps and links), and an Official Reference. Where an exception appears on more than one API with a different HTTP status code, all documented status codes are listed.

The quick-reference table summarizes every exception, its HTTP status, the API surfaces that raise it, and the first-line response. Detailed entries follow.

Exception	HTTP status	API surface	First-line response
AccessDeniedException	403	runtime, agent-runtime, control plane	Do not retry; fix IAM permissions or credentials
BadGatewayException	502	agent-runtime	Retry with backoff; inspect the named dependency
ConflictException	409 / 400	agent-runtime / control plane	Resolve the conflict, then retry
DependencyFailedException	424	agent-runtime	Fix the dependency configuration, then retry
InternalServerException	500	runtime, agent-runtime	Retry with backoff and jitter
ModelErrorException	424	runtime	Inspect originalStatusCode; retry only if transient
ModelNotReadyException	429 / 424	runtime / agent-runtime	Retry (SDK auto-retries up to 5 times)
ModelStreamErrorException	424	runtime (streaming)	Retry; handle inside stream iteration
ModelTimeoutException	408	runtime	Retry with backoff; reduce work per request
ResourceNotFoundException	404	runtime, agent-runtime	Do not retry; fix the resource identifier
ServiceQuotaExceededException	400	runtime, agent-runtime, control plane	Resubmit later or request a quota increase
ServiceUnavailableException	503	runtime, ApplyGuardrail	Retry with backoff; consider a different Region
ThrottlingException	429	runtime, agent-runtime	Retry with backoff and jitter; smooth traffic
TooManyTagsException	400	control plane	Do not retry; reduce the number of tags
ValidationException	400	runtime, agent-runtime, control plane	Do not retry unchanged; fix the request
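
The first-line responses in the table can be encoded as a small lookup that application code consults before retrying. This is a sketch under stated assumptions: the names RETRY_WITH_BACKOFF and should_retry are hypothetical, and ModelErrorException is deliberately left out of the retry set because the table makes its retry conditional on originalStatusCode.

```python
# Exceptions whose first-line response in the table is "retry".
RETRY_WITH_BACKOFF = frozenset({
    "BadGatewayException",
    "InternalServerException",
    "ModelNotReadyException",
    "ModelStreamErrorException",
    "ModelTimeoutException",
    "ServiceUnavailableException",
    "ThrottlingException",
})


def should_retry(error_code: str) -> bool:
    """Return True when the table's first-line response is an unconditional retry."""
    # Everything else needs a fix, a wait, or an inspection before resubmitting.
    return error_code in RETRY_WITH_BACKOFF


for code in ("ThrottlingException", "ValidationException", "ModelErrorException"):
    print(code, should_retry(code))
```

Unknown codes default to "do not retry", which is the safer failure mode for a newly added exception.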

Exception index:

AccessDeniedException (403)
BadGatewayException (502)
ConflictException (409 / 400)
DependencyFailedException (424)
InternalServerException (500)
ModelErrorException (424)
ModelNotReadyException (429 / 424)
ModelStreamErrorException (424)
ModelTimeoutException (408)
ResourceNotFoundException (404)
ServiceQuotaExceededException (400)
ServiceUnavailableException (503)
ThrottlingException (429)
TooManyTagsException (400)
ValidationException (400)
Model access, subscription, and request-signing errors

AccessDeniedException (403)

Meaning. The request is denied because you do not have sufficient permissions to perform the requested action. HTTP status code: 403 across bedrock-runtime, bedrock-agent-runtime, and the bedrock control plane.

Common Causes. The IAM identity (user or role) making the request lacks the required permission for the action, or temporary security credentials have expired. On inference calls, the identity may be missing bedrock:InvokeModel or bedrock:InvokeModelWithResponseStream for the specific model or inference profile. Converse and ConverseStream have no IAM actions of their own: Converse is authorized by bedrock:InvokeModel and ConverseStream by bedrock:InvokeModelWithResponseStream.

How to Diagnose and Resolve.

Verify that your IAM user or role has the necessary permissions for the action you are attempting.
If you are using temporary security credentials, ensure they have not expired.
For inference, confirm the policy allows the action on the exact model ARN or inference profile ARN you are invoking, and that no service control policy (SCP) or explicit Deny blocks it. AWS publishes IAM identity-based policy examples for Amazon Bedrock, including how to deny inference on specific models, which are useful for confirming your effective permissions.
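
As an illustration of the "exact ARN" check, the sketch below builds an identity-based policy document that allows the two inference actions on an explicit list of resources. The helper name build_invoke_policy and the Sid value are hypothetical, and the model ID in the ARN is the one used in the earlier examples; substitute the model or inference profile ARN you actually invoke.

```python
import json


def build_invoke_policy(resource_arns):
    """Return an IAM policy document allowing inference on the given ARNs only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowBedrockInference",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": list(resource_arns),
            }
        ],
    }


policy = build_invoke_policy(
    ["arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-5"]
)
print(json.dumps(policy, indent=2))
```

Comparing the Resource list in your effective policy with the ARN in the failing request is often the quickest way to spot a Region or model ID mismatch.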

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - AccessDeniedException and Identity-based policy examples for Amazon Bedrock.

BadGatewayException (502)

Meaning. There was an issue with a dependency due to a server issue; retry your request. The exception includes a resourceName identifying the dependency (such as Amazon Bedrock, AWS Lambda, or AWS STS). HTTP status code: 502. Raised by bedrock-agent-runtime operations (InvokeAgent, Retrieve, RetrieveAndGenerate).

Common Causes. A downstream dependency that an agent or knowledge base depends on encountered a server-side issue. Because agents orchestrate action groups (often backed by Lambda), knowledge bases, and the model, a transient failure in any of these dependencies can surface as BadGatewayException.

How to Diagnose and Resolve.

Read the resourceName field to identify which dependency failed.
Retry the request — the condition is described as transient.
If it persists, inspect the identified dependency (for example, the Lambda function backing an action group, or the STS role assumption path).

Official Reference. InvokeAgent - Errors.

ConflictException (409 / 400)

Meaning. There was a conflict performing an operation; resolve the conflict and retry. HTTP status code: 409 on bedrock-agent-runtime; 400 on the bedrock control plane (for example, CreateGuardrail).

Common Causes. A concurrent or conflicting operation on the same resource, or a state conflict such as attempting to create or modify a resource in a way that conflicts with its current state.

How to Diagnose and Resolve. Resolve the conflicting condition (for example, wait for an in-flight modification to finish, or reconcile the resource state) and retry the request.

Official Reference. CreateGuardrail - Errors and RetrieveAndGenerate - Errors.

DependencyFailedException (424)

Meaning. There was an issue with a dependency; check the resource configurations and retry. The exception includes a resourceName identifying the dependency (such as Amazon Bedrock, Lambda, or AWS STS). HTTP status code: 424. Raised by bedrock-agent-runtime operations.

Common Causes. A dependency the request relies on failed in a way that is tied to configuration rather than a transient server error — for example, a misconfigured action-group Lambda function, an IAM role that cannot be assumed, or an inaccessible data source.

How to Diagnose and Resolve.

Read the resourceName to identify the failing dependency.
Check that dependency's configuration (permissions, resource existence, input/output contract).
Retry after correcting the configuration.
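
The two dependency-oriented exceptions differ mainly in what you do next, which the following sketch makes explicit. It operates on the dictionary found at error.response on a Boto3 ClientError. The function name is hypothetical, and it assumes that botocore copies modeled members such as resourceName to the top level of that dictionary; verify this against your SDK version before relying on it.

```python
def dependency_action(error_response: dict) -> str:
    """Suggest a next step for agent-runtime dependency errors."""
    code = error_response.get("Error", {}).get("Code", "")
    # Assumption: modeled error members appear at the top level of the response.
    resource = error_response.get("resourceName", "unknown dependency")
    if code == "BadGatewayException":
        # Server-side issue in the dependency: retrying is the documented response.
        return f"retry with backoff; server-side issue in {resource}"
    if code == "DependencyFailedException":
        # Configuration issue: retrying unchanged is unlikely to help.
        return f"check the configuration of {resource}, then retry"
    return "not a dependency error"


sample = {"Error": {"Code": "DependencyFailedException"}, "resourceName": "AWS Lambda"}
print(dependency_action(sample))
```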

Official Reference. InvokeAgent - Errors.

InternalServerException (500)

Meaning. An internal server error occurred. HTTP status code: 500 across bedrock-runtime and bedrock-agent-runtime. On bedrock-agent-runtime, the exception can include a reason; when the reason is BEDROCK_MODEL_INVOCATION_SERVICE_UNAVAILABLE, the model invocation service is unavailable and you should retry. (On the control plane and the User Guide troubleshooting page, the equivalent condition is documented as InternalFailure.)

Common Causes. The request processing failed due to a server-side error. This is a service-side condition, not a problem with your request payload.

How to Diagnose and Resolve.

Use the AWS-recommended approach of retries with exponential backoff and random jitter for improved reliability.
Capture the request id (x-amzn-RequestId) for each occurrence.
If the issue persists, contact AWS Support with the request details and the error.

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - InternalFailure.

ModelErrorException (424)

Meaning. The request failed due to an error while processing the model. The exception includes originalStatusCode and resourceName. HTTP status code: 424. Raised by bedrock-runtime inference operations.

Common Causes. The underlying model returned an error while processing the request. The originalStatusCode reflects the status the model itself produced, and resourceName identifies the affected resource.

How to Diagnose and Resolve.

Inspect originalStatusCode and resourceName to understand what the model reported.
Validate that the request body conforms to the target model's expected schema and inference parameters.
If the original status indicates a transient condition, retry with backoff; if it indicates a malformed request, correct the payload before retrying.
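
The last step can be expressed as a small classifier. The rule below is an assumption made for illustration, not AWS guidance: it treats an originalStatusCode of 408, 429, or any 5xx as transient and everything else as a request problem. The function name is hypothetical.

```python
from typing import Optional


def model_error_is_transient(original_status_code: Optional[int]) -> bool:
    """Guess whether a ModelErrorException is worth retrying with backoff."""
    if original_status_code is None:
        # Nothing to classify: treat as non-transient and inspect manually.
        return False
    # Assumption: timeouts, throttling, and server-side statuses are transient.
    return original_status_code in (408, 429) or original_status_code >= 500


print(model_error_is_transient(503), model_error_is_transient(400))
```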

Official Reference. InvokeModel - Errors.

ModelNotReadyException (429 / 424)

Meaning. The model specified in the request is not ready to serve inference requests. The AWS SDK automatically retries the operation up to five times. HTTP status code: 429 on bedrock-runtime; 424 on bedrock-agent-runtime (InvokeAgent).

Common Causes. The target model is not yet ready to serve inference — for example, it is still warming up or otherwise temporarily unavailable to accept requests.

How to Diagnose and Resolve.

Allow the SDK's automatic retries to run; the operation is retried up to five times by default.
For your own retry loop, use exponential backoff with jitter.
Review SDK retry configuration if you need to tune the behavior (see Retries, Timeouts, and Backoff).
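
For Boto3, retry behavior is tuned through botocore's Config object. The sketch below is one possible setup rather than a recommended value: RETRY_SETTINGS and make_runtime_client are hypothetical names, and a total of six attempts (the initial call plus five retries) is chosen only to mirror the five retries mentioned above. The SDK imports sit inside the function so the settings can be inspected without Boto3 installed.

```python
# botocore's "total_max_attempts" counts the initial call plus retries.
RETRY_SETTINGS = {"total_max_attempts": 6, "mode": "standard"}


def make_runtime_client(region_name: str = "us-east-1"):
    """Create a bedrock-runtime client that uses RETRY_SETTINGS."""
    # Imported lazily so this module loads even where Boto3 is absent.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "bedrock-runtime",
        region_name=region_name,
        config=Config(retries=RETRY_SETTINGS),
    )


print(RETRY_SETTINGS)
```

Keeping the settings in one dictionary makes it easy to log the effective retry policy alongside each captured request id.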

Official Reference. Converse - Errors and Retry behavior in the AWS SDKs and Tools Reference Guide.

ModelStreamErrorException (424)

Meaning. An error occurred while streaming the response; retry your request. The exception includes originalMessage and originalStatusCode. HTTP status code: 424. Listed as an error of the streaming operation InvokeModelWithResponseStream; it can also surface as an event inside the response stream of either streaming operation (InvokeModelWithResponseStream or ConverseStream).

Common Causes. The response stream from the model was interrupted by an error. originalStatusCode and originalMessage carry the underlying cause reported by the model.

How to Diagnose and Resolve.

Handle the error where you iterate over the stream, not only at the initial call.
Inspect originalStatusCode/originalMessage to classify the cause.
Retry the request as directed by the exception.

Official Reference. InvokeModelWithResponseStream - Errors.

ModelTimeoutException (408)

Meaning. The request took too long to process; processing time exceeded the model timeout length. HTTP status code: 408. Raised by bedrock-runtime inference operations.

Common Causes. The model's processing time for the request exceeded the allowed timeout. Large inputs, long generations, and heavy reasoning workloads increase processing time.

How to Diagnose and Resolve.

Reduce the work per request where possible (for example, smaller inputs or a lower maxTokens), or use streaming so partial output is delivered as it is produced.
Treat ModelTimeoutException as transient and retry with exponential backoff and jitter.
For long-running or streaming calls that traverse NAT gateways, interface VPC endpoints, or Network Load Balancers, be aware of the 350-second idle-connection timeout that these network components apply by default (see Retries, Timeouts, and Backoff).

Official Reference. Converse - Errors.

ResourceNotFoundException (404)

Meaning. The specified resource ARN was not found. HTTP status code: 404 across bedrock-runtime and bedrock-agent-runtime.

Common Causes. An incorrect model ID, endpoint name, inference profile ID, or other resource identifier in the request. The resource may not exist in the Region you are calling, or the identifier may be misspelled. When using cross-region inference, the identifier must be the inference profile ID or ARN rather than a bare model ID, so passing the wrong form for the target Region can raise this error.

How to Diagnose and Resolve.

Verify the correctness of the model ID, endpoint name, or other resource identifiers in your request.
Use ListFoundationModels to confirm which models are available to you in the Region.
Implement a fallback mechanism to use alternative models or endpoints when a primary resource is not found, and periodically synchronize your local resource catalog.
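
The lookup and fallback steps can be combined as sketched below. The helper names and the example model IDs in the usage lines are hypothetical. The call to list_foundation_models on the bedrock control-plane client and the modelSummaries and modelId fields are part of the Boto3 API, but the SDK import is kept inside the function so the selection logic can be exercised on its own.

```python
from typing import Iterable, Optional, Set


def list_region_model_ids(region_name: str = "us-east-1") -> Set[str]:
    """Return the model IDs that ListFoundationModels reports for a Region."""
    import boto3  # imported lazily; requires credentials when actually called

    client = boto3.client("bedrock", region_name=region_name)
    summaries = client.list_foundation_models()["modelSummaries"]
    return {summary["modelId"] for summary in summaries}


def pick_model(preferred: Iterable[str], available_ids: Set[str]) -> Optional[str]:
    """Return the first preferred model ID that is present, else None."""
    for model_id in preferred:
        if model_id in available_ids:
            return model_id
    return None


# Hypothetical catalog and preference order, for illustration only.
available = {"example.model-b", "example.model-c"}
print(pick_model(["example.model-a", "example.model-b"], available))
```

Refreshing the available set on a schedule, rather than on every request, corresponds to the "periodically synchronize your local resource catalog" advice above.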

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - ResourceNotFound.

ServiceQuotaExceededException (400)

Meaning. Your request exceeds the service quota for your account; you can resubmit your request later. HTTP status code: 400. Raised by InvokeModel and InvokeModelWithResponseStream on bedrock-runtime, and by bedrock-agent-runtime and control-plane operations. (Note: Converse and ConverseStream do not list this exception; they surface capacity limits as ThrottlingException.)

Common Causes. The request would exceed a service quota associated with your account. Unlike momentary rate limiting, this signals a quota boundary.

How to Diagnose and Resolve.

View your quotas in the Service Quotas console and the Amazon Bedrock endpoints and quotas page.
Resubmit the request later, or request a quota increase through Service Quotas or AWS Support if your workload legitimately needs a higher limit.
See Throttling vs. Service Quota for how this differs from ThrottlingException.

Official Reference. InvokeModel - Errors and Requesting a quota increase.

ServiceUnavailableException (503)

Meaning. The service is not currently available. HTTP status code: 503. Raised by bedrock-runtime inference operations and ApplyGuardrail. On the User Guide troubleshooting page, this is documented as ServiceUnavailable, and it is explicitly distinguished from account-level quota limits: 503 indicates high demand or temporary capacity constraints, not your rate limits (which return 429 ThrottlingException).

Common Causes. The service is temporarily unable to handle the request because of high demand or temporary capacity constraints. This is a service-side condition and is unrelated to your account quotas.

How to Diagnose and Resolve.

Retry with exponential backoff and random jitter.
Consider switching to a different AWS Region if the issue persists in your current Region, since load varies by Region.
Use cross-region inference to manage unplanned traffic bursts by spreading compute across Regions, and consider Provisioned Throughput for high, steady throughput.
Ensure your application handles 503 in its retry logic, and check the AWS Service Health Dashboard for announced issues.

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - ServiceUnavailable.

ThrottlingException (429)

Meaning. Your request was denied due to exceeding the account quotas for Amazon Bedrock. HTTP status code: 429 on the Bedrock Runtime and Agents runtime APIs.

Common Causes. The request rate or token throughput exceeded the per-account limits allotted to you for Amazon Bedrock. Bursty traffic and high concurrency are the usual triggers.

How to Diagnose and Resolve.

Check the Amazon Bedrock service quotas to learn the limits allotted to your account.
Use retries with exponential backoff and random jitter.
If you have high throughput requirements, explore Provisioned Throughput, and use cross-region inference to absorb bursts.
Request a quota increase through your account manager or AWS Support if your workload traffic legitimately exceeds your account quotas.
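
A common way to implement exponential backoff with random jitter is the "full jitter" variant sketched here, in which each delay is drawn uniformly between zero and an exponentially growing ceiling. The function names and the base and cap values are illustrative assumptions, not values taken from the AWS documentation.

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 20.0, rng=random) -> float:
    """Return a full-jitter delay in seconds for a zero-based retry attempt."""
    # The ceiling doubles with each attempt, but never exceeds the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] to spread out retries.
    return rng.uniform(0.0, ceiling)


def retry_schedule(max_retries: int, base: float = 0.5, cap: float = 20.0, seed=None):
    """Return the list of delays a client would sleep before each retry."""
    rng = random.Random(seed)
    return [backoff_delay(attempt, base, cap, rng) for attempt in range(max_retries)]


print([round(delay, 2) for delay in retry_schedule(5, seed=42)])
```

In a real retry loop you would call time.sleep(backoff_delay(attempt)) between attempts and stop once the retry budget is spent. The jitter matters most under bursty, highly concurrent traffic, where clients that back off in lockstep would otherwise retry at the same instant.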

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - ThrottlingException and Implement retry logic and exponential backoff for Amazon Bedrock.

TooManyTagsException (400)

Meaning. The request contains more tags than can be associated with a resource; the maximum includes both existing tags and those in your current request. The exception includes a resourceName. HTTP status code: 400. Raised by control-plane operations that tag resources (for example, CreateGuardrail).

Common Causes. The combined count of existing tags and the tags in the request exceeds the per-resource tag limit.

How to Diagnose and Resolve.

Read resourceName to identify the over-tagged resource.
Remove unnecessary tags or reduce the number of tags in the request so the total stays within the documented per-resource limit.

Official Reference. CreateGuardrail - Errors.

ValidationException (400)

Meaning. The input fails to satisfy the constraints specified by Amazon Bedrock. HTTP status code: 400 across bedrock-runtime, bedrock-agent-runtime, and the control plane. On the User Guide troubleshooting page, this is documented as ValidationError.

Common Causes. A required parameter is missing, a value is out of the allowed range or does not match the expected pattern, or a model-specific validation rule is violated. For Converse/ConverseStream, an empty or incorrectly structured JSON Pointer in additionalModelResponseFieldPaths is rejected with a 400. On InvokeModel, the API documents specific guardrail-configuration errors: supplying the amazon-bedrock-guardrailConfig field in the body without a guardrail identifier, enabling a guardrail when the contentType is not application/json, or providing a guardrail identifier without a guardrailVersion.

How to Diagnose and Resolve.

Review the API documentation to ensure all required parameters are included and formatted correctly.
Check that your input values are within allowed ranges and conform to expected patterns (including the modelId format and length constraints).
Pay attention to the validation rules in the API reference for the specific action you are using. Do not retry a ValidationException unchanged — fix the request first.
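
The three guardrail-configuration errors documented for InvokeModel can be caught before the request is sent. The following client-side check is a sketch only: the function name and the wording of the messages are hypothetical, and the service remains the authority on what it accepts. The parameter names mirror the InvokeModel request fields guardrailIdentifier, guardrailVersion, and contentType.

```python
import json
from typing import List, Optional


def guardrail_request_problems(
    body: str,
    content_type: str = "application/json",
    guardrail_identifier: Optional[str] = None,
    guardrail_version: Optional[str] = None,
) -> List[str]:
    """Return the documented guardrail misconfigurations found in a request."""
    problems = []
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = {}  # a non-JSON body cannot carry the guardrail config field
    has_body_config = isinstance(parsed, dict) and "amazon-bedrock-guardrailConfig" in parsed
    if has_body_config and not guardrail_identifier:
        problems.append("guardrail config in body without a guardrail identifier")
    if guardrail_identifier and content_type != "application/json":
        problems.append("guardrail enabled but contentType is not application/json")
    if guardrail_identifier and not guardrail_version:
        problems.append("guardrail identifier without a guardrailVersion")
    return problems


print(guardrail_request_problems('{"prompt": "Hello"}', guardrail_identifier="gr-example"))
```

An empty list means none of the three documented conditions applies; it does not guarantee that the request passes every other validation rule.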

Official Reference. Troubleshooting Amazon Bedrock API Error Codes - ValidationError.

Model access, subscription, and request-signing errors

Some failures happen before your request ever reaches a model — during authentication, request signing, or the model-access approval flow. These are documented on the Amazon Bedrock troubleshooting page and in the Common Error Types shared by AWS services. They matter most when you invoke a model whose access has not been requested or fully provisioned. The table below reproduces the documented error code, cause, and solution.

Error code	HTTP status	Cause (official)	Solution (official)
FTUFormNotFilled	404	Model use case details have not been submitted for this account.	Fill out the Anthropic use case details form before using the model.
MPAgreementBeingCreated	403	Your account is not yet authorized; the AWS Marketplace subscription for this model is still being processed.	Try again after 15 minutes.
AWS Marketplace Agreement Pending after 15 minutes	403	The AWS Marketplace agreement has not succeeded and 15 minutes have passed.	Try the request again every 15 minutes; if it persists, contact AWS Support with request details.
AWS Marketplace Agreement Failed within 15 minutes	403	The AWS Marketplace agreement failed due to an underlying issue (for example, an invalid payment method or a restricted geo-location).	Review the error message and remediate the underlying issue.
NotAuthorized	400	You do not have permission to perform this action.	Review IAM permissions and role trust relationships; check for organizational or service control policies restricting access.
IncompleteSignature	400	The request signature does not conform to AWS standards.	Use an up-to-date AWS SDK that supports Amazon Bedrock; verify access key ID and secret key; if signing manually, re-check the signature calculation.
InvalidClientTokenId	403	The X.509 certificate or AWS access key ID provided does not exist in AWS records.	Verify you are using the correct (and current) access key ID.
RequestExpired	400	The request is no longer valid due to expired timestamps.	Synchronize your system clock with a reliable time source; account for time-zone differences.
InvalidAction	400	The action or operation requested is invalid.	Check the action name spelling and formatting; verify it is supported and use an up-to-date SDK or CLI.

For request-signing issues specifically, the interactive AWS SigV4 Request Signer and Explainer shows how a canonical request and signature are built, which helps isolate IncompleteSignature and InvalidClientTokenId causes.

The Common Error Types shared across AWS services also apply to Amazon Bedrock and are worth knowing: ExpiredTokenException (403, expired security token), UnrecognizedClientException (403, unknown or expired credentials), OptInRequired (403, the account needs a subscription for the service), RequestEntityTooLargeException (413), RequestTimeoutException (408), and MalformedHttpRequestException (400). See the Common Error Types reference for the full list.

Official Reference. Troubleshooting Amazon Bedrock API Error Codes and Amazon Bedrock Common Error Types.

4. Retries, Timeouts, and Backoff

4.1 Which Errors to Retry

The single most important operational distinction is between transient errors (retry with backoff) and client errors (fix the request; do not blindly retry). AWS documents the split as follows:

Retry with exponential backoff and jitter: ThrottlingException, ModelTimeoutException, ServiceUnavailableException, InternalServerException (and network-related errors). For agents and knowledge bases, BadGatewayException and the transient DependencyFailedException are also retry-with-backoff candidates. ModelNotReadyException is retried automatically by the SDK.
Do not retry unchanged: ValidationException and AccessDeniedException. Retrying a malformed or unauthorized request only wastes attempts — correct the request or the permissions first. ServiceQuotaExceededException is a client error (400) that indicates a quota boundary; retrying immediately will not help — resubmit later or request a quota increase.


Exception	Retry?	Recommended action
ThrottlingException	Yes (backoff + jitter)	Back off and retry; smooth traffic; consider Provisioned Throughput or cross-region inference
ModelTimeoutException	Yes (backoff + jitter)	Retry; reduce work per request or stream the response
ServiceUnavailableException	Yes (backoff + jitter)	Retry; consider a different Region or cross-region inference
InternalServerException	Yes (backoff + jitter)	Retry; capture the request id if it persists
ModelNotReadyException	Yes (SDK auto-retries up to 5 times)	Allow SDK retries; add your own backoff if needed
BadGatewayException	Yes (backoff + jitter)	Retry; inspect the named dependency
ValidationException	No	Fix the request payload before retrying
AccessDeniedException	No	Fix IAM permissions or refresh credentials
ResourceNotFoundException	No	Correct the resource identifier or Region
ServiceQuotaExceededException	No (client-side quota)	Resubmit later or request a quota increase
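
The table above can be encoded as a small lookup so that application code makes the retry decision in one place. This is a sketch under assumptions: it expects a dict shaped like the response attribute of a botocore ClientError (an "Error" entry holding a "Code"), the function name classify and the returned labels are hypothetical, and DependencyFailedException is left out because only its transient form is a retry candidate.

```python
RETRY_WITH_BACKOFF = frozenset({
    "ThrottlingException",
    "ModelTimeoutException",
    "ServiceUnavailableException",
    "InternalServerException",
    "ModelNotReadyException",
    "BadGatewayException",
})
FIX_REQUEST = frozenset({
    "ValidationException",
    "AccessDeniedException",
    "ResourceNotFoundException",
})
QUOTA_BOUNDARY = frozenset({"ServiceQuotaExceededException"})


def classify(error_response):
    """Map an error response dict to 'retry', 'fix', 'quota', or 'unknown'."""
    code = error_response.get("Error", {}).get("Code", "")
    if code in RETRY_WITH_BACKOFF:
        return "retry"    # transient: back off with jitter and try again
    if code in FIX_REQUEST:
        return "fix"      # correct the payload, identifier, or permissions first
    if code in QUOTA_BOUNDARY:
        return "quota"    # resubmit later or request a quota increase
    return "unknown"      # not covered by the table; inspect before deciding
```

Returning "unknown" rather than defaulting to a retry keeps an unlisted exception from being retried blindly.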

4.2 SDK Automatic Retries

The AWS SDKs retry many failures for you. The behavior is governed by a retry mode and a maximum attempt count:

Standard mode — the recommended default in the cross-SDK AWS SDKs and Tools Reference Guide: it retries failed requests using exponential backoff with jitter, with shorter delays for transient errors and longer delays for throttling. It includes a retry quota (a token bucket) that stops retries when tokens are exhausted so your application fails fast during a broad disruption instead of retrying indefinitely. Use standard mode unless you have a specific reason not to.
Adaptive mode — everything in standard mode plus a client-side rate limiter that tracks throttling responses and slows the send rate; it can delay even the initial request. It suits a client that targets a single resource and expects frequent throttling, but it is not recommended as a general default because throttling on one resource slows all requests from that client.
Legacy mode — the pre-standard behavior, kept for backward compatibility and still the actual default in the AWS SDK for Python (Boto3), per the Boto3 retries guide. Python code that never sets retry_mode therefore runs in legacy mode, with no standardized retry quota, so set the mode to standard explicitly.
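
To make the retry quota idea concrete, here is a toy token bucket. It is an illustration of the concept only, not the SDK's implementation: the class name RetryQuota and the capacity, cost, and refund values are assumptions chosen for readability.

```python
class RetryQuota:
    """Toy retry quota: each retry spends tokens, each success refunds a few."""

    def __init__(self, capacity=500, retry_cost=5, success_refund=1):
        # All three numbers are illustrative assumptions.
        self.capacity = capacity
        self.retry_cost = retry_cost
        self.success_refund = success_refund
        self.tokens = capacity

    def try_acquire(self):
        """Return True and spend tokens if a retry is allowed, else False."""
        if self.tokens < self.retry_cost:
            return False  # bucket exhausted: fail fast instead of retrying
        self.tokens -= self.retry_cost
        return True

    def on_success(self):
        """Refill slowly as calls succeed, never above capacity."""
        self.tokens = min(self.capacity, self.tokens + self.success_refund)
```

During a broad disruption every call fails, the bucket drains, and further retries are refused until successful calls refill it.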

The maximum attempt count is controlled by max_attempts (environment variable AWS_MAX_ATTEMPTS, config key max_attempts), and the default under standard mode is 3 — one initial request plus up to two retries (Boto3's legacy mode defaults to 5 total attempts). Setting it to 1 disables retries. ModelNotReadyException is retried up to five times.
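
The attempt arithmetic is easy to get wrong, so the following sketch shows a hand-written loop in which max_attempts counts the initial call. It is not the SDK's algorithm: the function name call_with_backoff, the base and cap values, and the "full jitter" delay formula are assumptions for illustration, and sleep and rand are injectable so the loop can be tested without waiting.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=3, base=0.5, cap=20.0,
                      sleep=time.sleep, rand=random.random):
    """Call fn() up to max_attempts times in total, including the first call."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Full jitter: a random delay in [0, min(cap, base * 2^(attempt-1))).
            ceiling = min(cap, base * 2 ** (attempt - 1))
            sleep(rand() * ceiling)
```

With max_attempts=3 the function is called at most three times and sleeps at most twice; with max_attempts=1 the first failure is raised immediately.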

A minimal Boto3 configuration that raises the attempt count and uses adaptive mode for a throttling-heavy, single-model workload:

import boto3
from botocore.config import Config

config = Config(
    retries={"max_attempts": 5, "mode": "adaptive"},
    read_timeout=120,   # allow time for long or streaming generations
    connect_timeout=10,
)
client = boto3.client("bedrock-runtime", config=config, region_name="us-east-1")

Even with automatic retries, keep your own idempotency and backoff discipline for application-level orchestration, and consider a circuit breaker for production workloads so a failing dependency does not exhaust resources.
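
A circuit breaker can be as small as a failure counter plus a cool-down timer. The sketch below is one possible shape, not a prescribed design: the class name CircuitBreaker, the threshold of 5 failures, and the 30-second cool-down are hypothetical values, and the clock is injectable for testing.

```python
import time


class CircuitBreaker:
    """Open after N consecutive failures; allow a trial call after a cool-down."""

    def __init__(self, failure_threshold=5, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self.opened_at is None:
            return True
        # Half-open: once the cool-down has elapsed, let a trial call through.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # (re)start the cool-down
```

The caller checks allow() before invoking the model and reports the outcome with record_success() or record_failure(); when allow() returns False it can fail fast or serve a fallback.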

4.3 Timeouts and Long-Running Connections

Three different timeouts can end a Bedrock call, and they are easy to confuse:

Model timeout — surfaced as ModelTimeoutException (408) when the model's processing time exceeds its limit. Reduce the work per request or stream the response.
Client read/connect timeout — configured on your SDK client (as in the example above). For long generations, extended reasoning, or streaming, set a generous read timeout so the client does not abort a call the service would have completed.
Network idle timeout — NAT gateways, interface VPC endpoints, and Network Load Balancers drop a TCP connection that has been idle for 350 seconds, without notifying the client. This appears as connection resets or timeouts on long-running or idle connections, and sometimes as a very slow first call after an idle period when a stale pooled connection is reused. When traffic egresses through these components (common on Amazon EKS and Amazon ECS), configure TCP keep-alives or connection recycling so idle connections do not silently die.
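
At the socket level, TCP keep-alive means asking the operating system to send probes on an idle connection. The sketch below uses only the Python standard library and shows the options involved. The function name enable_keepalive and the values (first probe after 120 seconds, well under the 350-second idle limit) are assumptions, the TCP_KEEP* constants are platform-dependent and therefore guarded, and how a socket like this is wired into a particular HTTP client or SDK is outside the scope of the sketch.

```python
import socket


def enable_keepalive(sock, idle=120, interval=30, probes=4):
    """Turn on TCP keep-alive with the first probe sent after `idle` seconds."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning constants below exist on Linux; other platforms may lack them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

Enabling SO_KEEPALIVE alone is often not enough, because the operating system's default idle time before the first probe can be far longer than 350 seconds.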

4.4 Throttling vs. Service Quota - the Key Difference

ThrottlingException (429) and ServiceQuotaExceededException (400) both relate to limits but call for different responses:

ThrottlingException (429) — you exceeded your per-account rate/throughput momentarily. The correct response is to back off and retry with jitter, smooth your traffic, and — for sustained needs — adopt Provisioned Throughput, cross-region inference, or a quota increase.
ServiceQuotaExceededException (400) — you hit a quota boundary. Retrying immediately does not help; the correct response is to resubmit later or request a quota increase through Service Quotas. It is a client-side condition, so the SDK does not treat it as automatically retryable.

The specific numeric limits are account- and Region-specific and change over time, so this article deliberately points you to the live Service Quotas console and the Amazon Bedrock endpoints and quotas page rather than quoting values that would go stale. When estimating request size against model context limits (a common source of ValidationException and truncation), the LLM Token Counter and Context Budget Planner can help you plan token budgets before you call the API.

5. Frequently Asked Questions about Amazon Bedrock Errors

What causes ThrottlingException in Amazon Bedrock?

ThrottlingException (HTTP 429) means your request was denied because it exceeded the per-account quotas for Amazon Bedrock — typically the request rate or token throughput during bursty or highly concurrent traffic. Check the limits allotted to your account in the Amazon Bedrock service quotas, use retries with exponential backoff and jitter, and for sustained high throughput consider Provisioned Throughput, cross-region inference, or a quota increase.

What is the difference between ThrottlingException and ServiceQuotaExceededException?

ThrottlingException (429) is a momentary rate/throughput limit: the correct response is to back off and retry, and to smooth or scale your traffic. ServiceQuotaExceededException (400) signals that a service quota boundary was exceeded; retrying immediately does not help. Instead, resubmit later or request a quota increase through Service Quotas. Because it is a client-side (400) condition, the SDKs do not automatically retry it, whereas 429 throttling is retried with backoff.

What does ModelNotReadyException mean?

ModelNotReadyException means the model specified in the request is not ready to serve inference requests (for example, it is still warming up). It is transient. The AWS SDK automatically retries the operation up to five times, and you can add your own exponential backoff for additional resilience. Its HTTP status is 429 on the Bedrock Runtime APIs and 424 on InvokeAgent.

Why do I get AccessDeniedException when invoking a model?

AccessDeniedException (HTTP 403) means the IAM identity making the request lacks sufficient permissions for the action, or its temporary credentials have expired. Verify that your user or role has the required Bedrock permission for the exact model or inference profile ARN you are invoking, confirm no SCP or explicit Deny blocks it, and refresh expired temporary credentials. If access to the model itself was never requested, you may instead see a model-access or Marketplace-agreement error rather than AccessDeniedException.

Does the AWS SDK retry Bedrock errors automatically?

Yes. In standard mode, the SDK retries transient failures — such as ThrottlingException, ServiceUnavailableException, InternalServerException, and network timeouts — using exponential backoff with jitter, up to max_attempts (default 3 under standard mode: one initial request plus two retries). ModelNotReadyException is retried up to five times. The SDK does not retry client errors such as ValidationException or AccessDeniedException, because retrying them unchanged cannot succeed. Note that while the cross-SDK reference lists standard as the recommended default, the AWS SDK for Python (Boto3) still defaults to legacy mode, so set the mode (standard, adaptive, or legacy) and max_attempts explicitly via client configuration, environment variables, or the shared config file.

Is a guardrail intervention or a max_tokens stopReason an error?

No. Both are ordinary, successful 200 responses. A guardrail that blocks or masks content returns action: GUARDRAIL_INTERVENED (or stopReason: guardrail_intervened on Converse), which means the guardrail worked as configured. A stopReason of max_tokens means the model reached your maxTokens limit and the answer was truncated. Neither raises an exception, so handle them as response data rather than as failures.
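
Because these outcomes arrive as data, they belong in ordinary response handling rather than in an except block. A minimal sketch, assuming a response dict that carries either a top-level stopReason (as on Converse) or an action field (as on ApplyGuardrail); the function name interpret_result and the returned labels are hypothetical.

```python
def interpret_result(response):
    """Classify a successful response as 'guardrail', 'truncated', or 'complete'."""
    if response.get("action") == "GUARDRAIL_INTERVENED":
        return "guardrail"   # the guardrail blocked or masked content as configured
    stop_reason = response.get("stopReason")
    if stop_reason == "guardrail_intervened":
        return "guardrail"
    if stop_reason == "max_tokens":
        return "truncated"   # output hit the maxTokens limit; the answer is cut off
    return "complete"        # any other stop reason is treated as a normal finish here
```

A caller might raise maxTokens or continue the conversation on "truncated", and show a policy message on "guardrail", without any retry logic being involved.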

6. Summary

Amazon Bedrock's exceptions are precise, and once you can map an exception name to its official meaning, documented causes, and prescribed fix, most incidents resolve quickly. The essentials to remember:

Read the exception name and status code together, and note the API surface. The same name (for example, ModelNotReadyException or ConflictException) can carry a different HTTP status on bedrock-runtime, bedrock-agent-runtime, and the control plane.
Separate transient from client errors. Retry ThrottlingException, ModelTimeoutException, ServiceUnavailableException, and InternalServerException with backoff and jitter; fix — do not blindly retry — ValidationException, AccessDeniedException, and ServiceQuotaExceededException.
Distinguish throttling from a quota boundary. 429 ThrottlingException means back off and retry; 400 ServiceQuotaExceededException means resubmit later or request an increase.
Know what is not an error. stopReason values and guardrail interventions are successful 200 responses.
Capture the request id. It is the fastest path to a resolution with AWS Support.
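
As a sketch of the last point, the request id can be pulled out of an error response and logged alongside the exception name and status code. This assumes a dict shaped like the response attribute of a botocore ClientError, with "Error" and "ResponseMetadata" entries; the function name describe_error is hypothetical.

```python
def describe_error(error_response):
    """Collect the fields worth logging or attaching to a support case."""
    error = error_response.get("Error", {})
    metadata = error_response.get("ResponseMetadata", {})
    return {
        "code": error.get("Code"),
        "message": error.get("Message"),
        "http_status": metadata.get("HTTPStatusCode"),
        "request_id": metadata.get("RequestId"),
    }
```

Missing fields come back as None rather than raising, so the helper is safe to call inside an exception handler.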

This reference is maintained as a living page: as Amazon Bedrock adds APIs and exceptions, entries are added and updated in place against the official AWS documentation, and the URL stays stable so links keep working.

Related reading on this site:

AWS History and Timeline regarding Amazon Bedrock
Amazon Bedrock Basic Information and API Examples
Amazon Bedrock RAG Architecture Guide
Agentic RAG Architecture on Amazon Bedrock
GraphRAG Architecture on Amazon Neptune and Amazon Bedrock
Amazon Bedrock Glossary
Anthropic Claude API Errors Reference — the companion reference for calling Claude directly on the Anthropic API
Anthropic Claude Model Release Timeline
LLM Token Counter and Context Budget Planner
AWS SigV4 Request Signer and Explainer

References:
Troubleshooting Amazon Bedrock API Error Codes (Amazon Bedrock User Guide)
InvokeModel (Amazon Bedrock Runtime API Reference)
Converse (Amazon Bedrock Runtime API Reference)
ConverseStream (Amazon Bedrock Runtime API Reference)
InvokeModelWithResponseStream (Amazon Bedrock Runtime API Reference)
ApplyGuardrail (Amazon Bedrock Runtime API Reference)
InvokeAgent (Agents for Amazon Bedrock Runtime API Reference)
RetrieveAndGenerate (Agents for Amazon Bedrock Runtime API Reference)
CreateGuardrail (Amazon Bedrock API Reference)
Common Error Types (Amazon Bedrock API Reference)
Retry behavior (AWS SDKs and Tools Reference Guide)
Implement retry logic and exponential backoff for Amazon Bedrock (AWS re:Post)

Written by Hidekazu Konishi