Basic Information about Amazon Bedrock with API Examples - Model Features, Pricing, How to Use, Explanation of Tokens and Inference Parameters

First Published: 2023-10-02
Last Updated: 2025-03-09

Today, I have summarized the basic information on Amazon Bedrock, which became General Availability (GA) on 2023-09-28, as well as examples of the Runtime API execution. Additionally, I have sprinkled in some minimal terminology explanations to help grasp the image of tokens and parameters.
* For a list of Amazon Bedrock models as of the end of 2024 after AWS re:Invent 2024, please see the following article:
　Amazon Bedrock Models as of 2024 - An Analysis of the Comprehensive Model Catalog
* The content of this article has been updated as of the date and time mentioned in the "Last Updated" above.
* The source code published in this article and other articles by this author has been created as part of independent research activities and does not guarantee operation. Please use it at your own risk. Also, please be aware that it may be modified without prior notice.

Amazon Bedrock Basic Information

Amazon Bedrock Reference Materials & Learning Resources

The main reference materials and learning resources that can help in understanding Amazon Bedrock are as follows.
The content of this article is based on the information from these reference materials and learning resources.

What's New: Amazon Bedrock is now generally available
AWS Blog: Amazon Bedrock Is Now Generally Available – Build and Scale Generative AI Applications with Foundation Models
Price: Amazon Bedrock Pricing
Workshop: GitHub - aws-samples/amazon-bedrock-workshop: This is a workshop designed for Amazon Bedrock a foundational model service.
AWS Documentation(User Guide): What is Amazon Bedrock? - Amazon Bedrock
AWS Documentation(API Reference): Bedrock API Reference - Amazon Bedrock
AWS SDK for Python(Boto3) Documentation(Bedrock): Bedrock - Boto3 documentation
AWS SDK for Python(Boto3) Documentation(BedrockRuntime): BedrockRuntime - Boto3 documentation
AWS CLI Command Reference(bedrock): bedrock — AWS CLI Command Reference
AWS CLI Command Reference(bedrock-runtime): bedrock-runtime — AWS CLI Command Reference
AWS Management Console(Amazon Bedrock Model Providers): Amazon Bedrock Model Providers - AWS Management Console

What is Amazon Bedrock?

Amazon Bedrock is a service that provides access to Foundation Models (FMs) such as AI21 Labs' Jurassic-2, Amazon's Titan, Anthropic's Claude, Cohere's Command, Meta's Llama 2, and Stability AI's Stable Diffusion via API, as well as features to customize FMs privately using unique data.
You can choose a foundation model based on use cases like text generation, chatbots, search, text summarization, image generation, and personalized recommendations to build and expand Generative AI applications.

Tokens in Generative AI for Text Handling

Before looking at the list of models and pricing for Amazon Bedrock, let me briefly explain tokens, which serve as the units for restrictions and billing.
However, please note that this description may differ from the strict definition as I prioritize ease of understanding here.

In Generative AI handling text, tokens refer to units that break text into meaningful parts.
While tokens can correspond to words, they don't necessarily equate to words and can be split into characters, subwords, etc.

For instance, if we tokenize the string Amazon Bedrock is amazing! based on words, it would look like this:
["Amazon", "Bedrock", "is", "amazing", "!"]

However, using a non-word-based tokenization method, it might also include spaces like this:
["Amazon", " ", "Bedrock", " ", "is", " ", "amazing", "!"]

There are advanced tokenization methods beyond word-based, like Unigram Tokenization, WordPiece, SentencePiece, and Byte Pair Encoding (BPE). Different models adopt various methods, so it's essential to be aware of that.

Especially when calculating fees on a token basis, it's best to determine the number of tokens based on the model's tokenization method and in a scenario close to actual usage conditions.
However, personally, when considering the monthly budget of the Generative AI service I use, if I don't want to spend time and effort estimating the exact number of tokens, I either use Generative AI itself for calculations or overestimate by assuming 1 character = 1 token for a higher fee estimate.

List and Features of Available Models

Based on the product page of Amazon Bedrock – AWS and AWS Management Console's Amazon Bedrock Model Providers, I compiled data as of the time of writing this article.

* Models supporting Embeddings (Embed) are capable of converting text input (words, phrases, large text units, etc.) into a numerical representation (Embedding) that contains the semantic content of the text.

Model Provider	Model	Model ID	Max tokens	Modality (Data Type)	Languages	Supported use cases
AI21 Labs	Jurassic-2 Ultra (v1)	ai21.j2-ultra-v1	8191	Text	English Spanish French German Portuguese Italian Dutch	Open book question answering summarization draft generation information extraction ideation
AI21 Labs	Jurassic-2 Mid (v1)	ai21.j2-mid-v1	8191	Text	English Spanish French German Portuguese Italian Dutch	Open book question answering summarization draft generation information extraction ideation
Amazon	Titan Embeddings G1 - Text (v1.2)	amazon.titan-embed-text-v1	8k	Embedding	English, Arabic, Chinese (Sim.), French, German, Hindi, Japanese, Spanish, Czech, Filipino, Hebrew, Italian, Korean, Portuguese, Russian, Swedish, Turkish, Chinese (trad), Dutch, Kannada, Malayalam, Marathi, Polish, Tamil, Telugu and others.	Translate text inputs (words, phrases or possibly large units of text) into numerical representations (known as embeddings) that contain the semantic meaning of the text.
Amazon	Titan Text G1 - Lite	amazon.titan-text-lite-v1	4k	Text	English	Summarization and copywriting.
Amazon	Titan Text G1 - Express	amazon.titan-text-express-v1	8k	Text	English (GA), Multilingual in 100+ languages (Preview)	Open ended text generation brainstorming summarization code generation table creation data formatting paraphrasing chain of though rewrite extraction Q&A chat
Amazon	Titan Image Generator G1	amazon.titan-image-generator-v1	77	Image	English	Text to image generation image editing image variations
Amazon	Titan Multimodal Embeddings G1	amazon.titan-embed-image-v1	128	Embedding	English	Search recommendation personalization
Anthropic	Claude 3.5 Sonnet	anthropic.claude-3-5-sonnet-20240620-v1:0	200k	Text	English and multiple other languages	Complex tasks like customer support Coding Data Analysis and Visual Processing. Streamlining of Workflows Generation of Insights and Production of High-Quality Natural-Sounding Content.
Anthropic	Claude 3 Opus	anthropic.claude-3-opus-20240229-v1:0	200k	Text	English and multiple other languages	Task automation: plan and execute complex actions across APIs and databases, interactive coding R&D: research review, brainstorming and hypothesis generation, drug discovery Strategy: advanced analysis of charts & graphs, financials and market trends, forecasting
Anthropic	Claude 3 Sonnet	anthropic.claude-3-sonnet-20240229-v1:0	200k	Text	English and multiple other languages	Data processing: RAG or search & retrieval over vast amounts of knowledge Sales: product recommendations, forecasting, targeted marketing Time-saving tasks: code generation, quality control, parse text from images
Anthropic	Claude 3 Haiku	anthropic.claude-3-haiku-20240307-v1:0	200k	Text	English and multiple other languages	Customer interactions: quick and accurate support in live interactions, translations Content moderation: catch risky behavior or customer requests Cost-saving tasks: optimized logistics, inventory management, extract knowledge from unstructured data
Anthropic	Claude v2.1	anthropic.claude-v2:1	200k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	Claude v2	anthropic.claude-v2	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	[Legacy version] Claude v1.3	anthropic.claude-v1	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Anthropic	Claude Instant v1.2	anthropic.claude-instant-v1	100k	Text	English and multiple other languages	Question answering information extraction removing PII content generation multiple choice classification Roleplay comparing text summarization document Q&A with citation
Cohere	Command R+ (v1)	cohere.command-r-plus-v1:0	128k	Text	English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese	Complex RAG on large amounts of data Q&A Multi-step tool use chat text generation text summarization
Cohere	Command R (v1)	cohere.command-r-v1:0	128k	Text	English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese	Chat text generation text summarization RAG on large amounts of data Q&A function calling
Cohere	Command (v14.7)	cohere.command-text-v14	4000	Text	English	Summarization copywriting dialogue extraction question answering
Cohere	Command Light (v14.7)	cohere.command-light-text-v14	4000	Text	English	Summarization copywriting dialogue extraction question answering
Cohere	Embed English (v3)	cohere.embed-english-v3	512	Embedding	English	Semantic search retrieval-augmented generation (RAG) classification clustering
Cohere	Embed Multilingual (v3)	cohere.embed-multilingual-v3	512	Embedding	108 Languages	Semantic search retrieval-augmented generation (RAG) classification clustering
Meta	Llama 3 70B Instruct	meta.llama3-70b-instruct-v1:0	8k	Text	English	Language modeling Dialog systems Code generation Following instructions Sentiment analysis with nuances in reasoning Text classification with improved accuracy and nuance Text summarization with accuracy and nuance
Meta	Llama 3 8B Instruct	meta.llama3-8b-instruct-v1:0	8k	Text	English	Text summarization Text classification Sentiment analysis
Meta	Llama 2 Chat 13B	meta.llama2-13b-chat-v1	4096	Text	English	Text generation Conversation Chat based applications
Meta	Llama 2 Chat 70B	meta.llama2-70b-chat-v1	4096	Text	English	Text generation Conversation Chat based applications
Mistral AI	Mistral 7B Instruct	mistral.mistral-7b-instruct-v0:2	32K	Text	English	Classification Text generation Code generation
Mistral AI	Mixtral 8x7B Instruct	mistral.mixtral-8x7b-instruct-v0:1	32K	Text	English, French, Italian, German and Spanish	Complex reasoning & analysis Text generation Code generation
Mistral AI	Mistral Large	mistral.mistral-large-2402-v1:0	32K	Text	English, French, Italian, German and Spanish	Complex reasoning & analysis Text generation Code generation RAG Agents
Mistral AI	Mistral Small	mistral.mistral-small-2402-v1:0	32K	Text	English, French, Italian, German and Spanish	Text generation Code generation Classification RAG Conversation
Stability AI	[Legacy version] Stable Diffusion XL (v0.8)	stability.stable-diffusion-xl-v0	77	Image	English	image generation image editing
Stability AI	Stable Diffusion XL (v1.0)	stability.stable-diffusion-xl-v1	77	Image	English	image generation image editing

Model Pricing

Based on the Amazon Bedrock Pricing, I have summarized the US East(N. Virginia) pricing available at the time of writing this article.

If no pricing is listed for a model, it indicates that the pricing option is not offered, or the functionality to customize the model is not supported.

Text Models Pricing

The pricing for text-based models is set based on the following criteria:

On-Demand
On-Demand pricing is calculated per 1,000 input tokens and per 1,000 output tokens (it's not based on time).
Provisioned Throughput
Provisioned Throughput allows you to commit to a time-based payment for a specified period, ensuring sufficient throughput for large-scale use and other requirements.
For the commitment duration, options include none, 1 month, and 6 months, with longer durations offering discounts.
Model customization (Fine-tuning)
When creating a custom model using Fine-tuning, training fees are incurred per 1,000 tokens, and there is a monthly storage fee for each custom model.

Model Provider	Model	On-Demand (per 1000 input tokens)	On-Demand (per 1000 output tokens)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
AI21 Labs	Jurassic-2 Ultra	0.0188 USD	0.0188 USD	-	-
AI21 Labs	Jurassic-2 Mid	0.0125 USD	0.0125 USD	-	-
Amazon	Titan Text Lite(Titan Text G1 - Lite)	0.0003 USD	0.0004 USD	no commitment: 7.10 USD 1-month commitment: 6.40 USD 6-month commitment: 5.10 USD	Train(per 1000 tokens): 0.0004 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Text Express(Titan Text G1 - Express)	0.0008 USD	0.0016 USD	no commitment: 20.50 USD 1-month commitment: 18.40 USD 6-month commitment: 14.80 USD	Train(per 1000 tokens): 0.008 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Embeddings(Titan Embeddings G1 - Text)	0.0001 USD	N/A	no commitment: N/A 1-month commitment: 6.40 USD 6-month commitment: 5.10 USD	-
Anthropic	Claude 3.5 Sonnet	0.00300 USD	0.01500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Opus	0.01500 USD	0.07500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Sonnet	0.00300 USD	0.01500 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude 3 Haiku	0.00025 USD	0.00125 USD	no commitment: N/A 1-month commitment: N/A 6-month commitment: N/A	-
Anthropic	Claude(v2.0, v2.1)	0.00800 USD	0.02400 USD	no commitment: N/A 1-month commitment: 63.00 USD 6-month commitment: 35.00 USD	-
Anthropic	Claude Instant(v1.2)	0.00080 USD	0.00240 USD	no commitment: N/A 1-month commitment: 39.60 USD 6-month commitment: 22.00 USD	-
Cohere	Command R+	0.0030 USD	0.0150 USD	-	-
Cohere	Command R	0.0005 USD	0.0015 USD	-	-
Cohere	Command	0.0015 USD	0.0020 USD	no commitment: 49.50 USD 1-month commitment: 39.60 USD 6-month commitment: 23.77 USD	Train(per 1000 tokens): 0.004 USD Store each custom model(per month): 1.95 USD
Cohere	Command-Light	0.0003 USD	0.0006 USD	no commitment: 8.56 USD 1-month commitment: 6.85 USD 6-month commitment: 4.11 USD	Train(per 1000 tokens): 0.001 USD Store each custom model(per month): 1.95 USD
Cohere	Embed – English	0.0001 USD	N/A	no commitment: 7.12 USD 1-month commitment: 6.76 USD 6-month commitment: 6.41 USD	-
Cohere	Embed – Multilingual	0.0001 USD	N/A	no commitment: 7.12 USD 1-month commitment: 6.76 USD 6-month commitment: 6.41 USD	-
Meta	Llama 3 Instruct 8B	0.0003 USD	0.0006 USD	-	-
Meta	Llama 3 Instruct 70B	0.00265 USD	0.0035 USD	-	-
Meta	Llama 2 Chat 13B	0.00075 USD	0.00100 USD	no commitment: N/A 1-month commitment: 21.18 USD 6-month commitment: 13.08 USD	Train(per 1000 tokens): 0.00149 USD Store each custom model(per month): 1.95 USD
Meta	Llama 2 Chat 70B	0.00195 USD	0.00256 USD	no commitment: N/A 1-month commitment: 21.18 USD 6-month commitment: 13.08 USD	Train(per 1000 tokens): 0.00799 USD Store each custom model(per month): 1.95 USD
Mistral AI	Mistral 7B Instruct	0.00015 USD	0.0002 USD	-	-
Mistral AI	Mixtral 8x7B Instruct	0.00045 USD	0.0007 USD	-	-
Mistral AI	Mistral Small	0.001 USD	0.003 USD	-	-
Mistral AI	Mistral Large	0.004 USD	0.012 USD	-	-

Multi-modal Models Pricing

The pricing of multi-modal models that process images and other media is based on various criteria such as the number of images, resolution, etc., so it is summarized for each model.

Model Provider	Model	Standard quality(<51 steps) (per image)	Premium quality(>51 steps) (per image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Stability AI	Stable Diffusion XL (v0.8)	512x512 or smaller: 0.018 USD Larger than 512x512: 0.036 USD	512x512 or smaller: 0.036 USD Larger than 512x512: 0.072 USD	-	-
Stability AI	Stable Diffusion XL (v1.0)	Up to 1024 x 1024: 0.04 USD	Up to 1024 x 1024: 0.08 USD	no commitment: N/A 1-month commitment: 49.86 USD 6-month commitment: 46.18 USD	-

Model Provider	Model	Standard quality (per image)	Premium quality (per image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Amazon	Titan Image Generator	512x512: 0.008 USD 1024X1024: 0.01 USD	512x512: 0.01 USD 1024X1024: 0.012 USD	no commitment: N/A 1-month commitment: 16.20 USD 6-month commitment: 13.00 USD	Train(per image seen): 0.005 USD Store each custom model(per month): 1.95 USD
Amazon	Titan Image Generator(custom models)	512x512: 0.018 USD 1024X1024: 0.02 USD	512x512: 0.02 USD 1024X1024: 0.022 USD	no commitment: 23.40 USD 1-month commitment: 21.00 USD 6-month commitment: 16.85 USD	-

Model Provider	Model	On-Demand (per 1000 input tokens)	On-Demand (per 1000 input image)	Provisioned Throughput (per hour per model)	Model customization through Fine-tuning
Amazon	Titan Multimodal Embeddings	0.0008 USD	0.00006 USD	no commitment: 9.38 USD 1-month commitment: 8.45 USD 6-month commitment: 6.75 USD	Train(per image seen): 0.0002 USD Store each custom model(per month): 1.95 USD

Basic How to Use Amazon Bedrock

Getting Started & Preparation for Amazon Bedrock

To get started with Amazon Bedrock, go to the Model access screen of Amazon Bedrock in the AWS Management Console, click Edit, select the model you want to use, and request access to the model by clicking Save changes.
Amazon Bedrock > Model access - AWS Management Console
Please note that for Anthropic models, you need to enter company information and the purpose to make a request.

Once the request is approved, you can access and use the model.

Amazon Bedrock Runtime API's InvokeModel, InvokeModelWithResponseStream, and Parameters

Here, I will explain the APIs needed to actually use Amazon Bedrock.
There are mainly two types of APIs related to Amazon Bedrock: the Bedrock API and the Bedrock Runtime API.

The Bedrock API is used for operations like creating custom models through Fine-tuning or purchasing Provisioned Throughput for models.

On the other hand, the Bedrock Runtime API is used for the actual execution, where you specify the base or custom model, request input data (Prompt), and obtain output data (Completions) from the response.

The Amazon Bedrock Runtime API includes InvokeModel and InvokeModelWithResponseStream for actually invoking and using the model.

The InvokeModel of Amazon Bedrock Runtime API is an API that obtains all the contents of the response to a request at once.

Meanwhile, the InvokeModelWithResponseStream of the Amazon Bedrock Runtime API is an API that obtains the contents of the response to a request gradually, in small chunks of text, as a stream.
If you've used a chat-style Generative AI service before, you might have seen the results for a Prompt being displayed a few characters at a time. InvokeModelWithResponseStream can be used for this type of display.

The parameters specified in the request for the InvokeModel and InvokeModelWithResponseStream of the Amazon Bedrock Runtime API commonly use the following:

accept: MIME type of the inference Body of the response. (Default: application/json)
contentType: MIME type of the input data of the request. (Default: application/json)
modelId: [Required] Identifier of the model. (e.g., ai21.j2-ultra-v1)
body: [Required] Input data in the format specified by contentType. Specify the format of the body field according to the inference parameters supported by each model.

Meaning of Common Inference Parameters

In the following, I will introduce examples of executing the Amazon Bedrock Runtime API, but before that, let's briefly explain the common inference parameters frequently used in the body of the model request. However, please be aware that, for the sake of clarity in visualization, this explanation might not be strictly aligned with the exact definition.

temperature
This parameter adjusts the randomness and diversity of the model's output probability distribution. If the value is high, it tends to return answers with higher randomness and diversity. Conversely, if the value is low, it is more likely to return answers that are estimated with higher probability. The typical range for temperature is between 0 - 1, but there are models that can be set to values exceeding 1. For instance, between temperature=1.0 and temperature=0.1, temperature=1.0 is inclined to provide answers with higher randomness and diversity, whereas temperature=0.1 tends to return more probable answers.
topK
This parameter adjusts randomness and diversity by limiting the top K tokens considered by the model. The optimal range for topK varies depending on the model used. When you set this value, the output tokens are selected from these top K. For example, with topK=10, the model considers only the top 10 tokens with the highest probability when generating answers. Put simply, topK limits the range of selectable tokens by the number of output tokens, thus adjusting the diversity as well.
topP
This parameter adjusts randomness and diversity by sampling from the set of tokens whose cumulative probability doesn't exceed a specified P. The usual range for topP is between 0 - 1. For instance, with topP=0.9, when the model generates answers, it considers tokens in decreasing order of probability until the cumulative probability exceeds 0.9. In simpler terms, topP limits the range of selectable tokens based on the cumulative probability of the output tokens, and adjusts randomness and diversity accordingly.
maxTokens
This parameter limits the maximum number of tokens generated to control the length of the produced text. For example, with maxTokens=800, the model ensures that the text doesn't exceed 800 tokens.

In the API request, I combine the parameters temperature, topK, and topP to adjust the balance between confidence and diversity, and use maxTokens to limit the number of tokens output.

For detailed inference parameters of each model available in Amazon Bedrock, please refer to "Inference parameters for foundation models - Amazon Bedrock".

Example of invoking Amazon Bedrock Runtime using AWS SDK for Python (Boto3)

Here, I introduce an example where I executed the Amazon Bedrock Runtime's invoke_model using AWS SDK for Python (Boto3) in an AWS Lambda function.
At the time of writing this article, the default AWS SDK for Python (Boto3) in AWS Lambda functions did not yet support calling the bedrock and bedrock-runtime Clients.
Therefore, the following is an example using the bedrock-runtime Client after adding the latest AWS SDK for Python (Boto3) to the Lambda Layer.
・Execution Example (AWS Lambda function)

import boto3
import json
import os

region = os.environ.get('AWS_REGION')
bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=region)

def lambda_handler(event, context):
    modelId = 'ai21.j2-ultra-v1'
    contentType = 'application/json'
    accept = 'application/json'
    body = json.dumps({
        "prompt": "Please tell us all the states in the U.S.",
        "maxTokens": 800,
        "temperature": 0.7,
        "topP": 0.95
    })

    response = bedrock_runtime_client.invoke_model(
        modelId=modelId,
        contentType=contentType,
        accept=accept, 
        body=body
    )
    response_body = json.loads(response.get('body').read())
    return response_body

・Execution Result Example (Return value of the above AWS Lambda function)

{
    "id": 1234,
    "prompt": {
        "text": "Please tell us all the states in the U.S.",
        "tokens": [
            ...
        ]
    },
    "completions": [
        {
            "data": {
                "text": "\nUnited States of America is a federal republic consisting of 50 states, a federal district (Washington, D.C., the capital city of the United States), five major territories, and various minor islands. The 50 states are Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, and Wyoming.",
                "tokens": [
                    ...
                ]
            },
            "finishReason": {
                "reason": "endoftext"
            }
        }
    ]
}

Note: As of the time I wrote this article, the latest AWS SDK for Python (Boto3) provides the invoke_model_with_response_stream command for Amazon Bedrock Runtime.
However, I plan to explain the details in another article, so I will omit it in this article.

AWS CLI Implementation Example for Amazon Bedrock Runtime's invoke-model

In this article, I introduce the implementation example of Amazon Bedrock Runtime's invoke-model using AWS CLI.
As of the time of writing this article, the Amazon Bedrock Runtime API was not yet compatible with AWS CLI Version 2.
Therefore, the following example was executed by separately installing AWS CLI Version 1, which supported the Amazon Bedrock Runtime API.
・Format

aws bedrock-runtime invoke-model \
    --region [Region] \
    --model-id "[modelId]" \
    --content-type "[contentType]" \
    --accept "[accept]" \
    --body "[body]" [Output FileName]

・Implementation Example

aws bedrock-runtime invoke-model \
    --region us-east-1 \
    --model-id "ai21.j2-ultra-v1" \
    --content-type "application/json" \
    --accept "application/json" \
    --body "{\"prompt\": \"Please tell us all the states in the U.S.\", \"maxTokens\": 800,\"temperature\": 0.7,\"topP\": 0.95}" invoke-model-output.txt

・Response Example

* Displayed on screen  
{"contentType": "application/json"}

* File Content (invoke-model-output.txt)  
{"id": 1234,"prompt": {"text": "Please tell us all the states in the U.S.","tokens": [...]},"completions": [{"data": {"text": "\nUnited States of America is a federal republic consisting of 50 states, a federal district (Washington, D.C., the capital city of the United States), five major territories, and various minor islands. The 50 states are Alabama, Alaska, Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Hawaii, Idaho, Illinois, Indiana, Iowa, Kansas, Kentucky, Louisiana, Maine, Maryland, Massachusetts, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Tennessee, Texas, Utah, Vermont, Virginia, Washington, West Virginia, Wisconsin, and Wyoming.","tokens": [...]},"finishReason": {"reason": "endoftext"}}]}

Note: As of the time of writing this article, AWS CLI does not have the invoke-model-with-response-stream command for Amazon Bedrock Runtime.

References:
Tech Blog with curated related content

Summary

In this article, I introduced reference materials for Amazon Bedrock, model features, pricing, how to use, explanations of terms like tokens and inference parameters, and examples of the Runtime API. While compiling the information, I realized that with Amazon Bedrock, you can choose from a variety of models according to use cases and call them with AWS SDK or AWS CLI interfaces that are highly compatible with other AWS services.
I plan to continue monitoring Amazon Bedrock for updates, implementation methods, and its integration with other services in the future.

Written by Hidekazu Konishi