Amazon Bedrock AgentCore Beginner's Guide - AI Agent Development from Basics with Detailed Term Explanations


In recent years, generative AI technology has been evolving rapidly. Amazon Bedrock, Amazon Bedrock Agent, and now Amazon Bedrock AgentCore: keeping up with the new services and frameworks that appear one after another is not easy.
This article is written for those who find it challenging to keep up with the evolution of generative AI technology. While understanding the overall picture of Amazon Bedrock AgentCore, I will also carefully explain basic terms such as "LLM," "agent," and "serverless" one by one. Even if you're starting now, this content will help you catch up effectively.

Features of This Article

Emphasis on Term Explanations
In this article, I provide explanations each time specialized terms commonly used in technical documents appear. For example, "LLM is an artificial intelligence model trained on vast amounts of text data," and "serverless is an execution model where you leave server management to the cloud provider." You can read while filling in knowledge gaps.

Structure That Promotes Gradual Understanding
I start with the basic concepts of AI agents, grasp the overall picture of AgentCore, and then learn each of the seven core services one by one. The structure is designed so that you can understand even without prerequisite knowledge if you read sequentially.

Providing Practical Information
In addition to explaining concepts, I also cover information needed in actual work, such as use cases, pricing models, security, and frequently asked questions.

Target Readers of This Article

  • Engineers who want to learn about AI agents from the basics
  • Developers who want to keep up with generative AI technology trends
  • Architects who want to build AI applications on AWS
  • Technology leaders considering AI adoption in enterprises

How to Read

This article is designed so that understanding deepens gradually by reading from the beginning in order. Although you can skip parts you already know, checking the term explanation sections will help you understand subsequent content smoothly.

What is an AI Agent

Definition of Agent

An AI agent is an AI system that can autonomously judge and act to achieve goals. The difference from traditional chatbots lies in autonomy and tool utilization capability.
Traditional chatbots only responded according to predetermined scenarios. In contrast, AI agents autonomously judge which tools to use and in what order for a given goal, and execute them. For example, in response to the instruction "arrange a business trip," they can autonomously execute a series of tasks such as calendar checking, flight booking, hotel reservation, and application form creation.

Traditional Chatbot vs AI Agent

Feature | Traditional Chatbot | AI Agent
Nature of Response | Simple response | Goal-oriented
Tool Usage | Cannot use tools | Utilizes multiple tools
Processing Flow | Fixed flow | Dynamically reasons and plans
Autonomy | Requires human instructions | Acts autonomously
Typical Example | FAQ responses | "Arrange a business trip" → autonomously uses multiple tools to complete booking
As shown above, traditional types operate along fixed flows, but agent types dynamically judge according to situations and select optimal approaches.

Basic Elements of an Agent

AI agents are composed of the following elements. These elements work together to realize a system that thinks, judges, and acts like a human.

1. LLM (Large Language Model) - Brain
LLM is an artificial intelligence model trained on vast amounts of text data, a system that can understand and generate language like humans. Representative examples include GPT-4 (the foundation model of ChatGPT), Claude, and Gemini.
LLM functions as the brain of the agent, understanding user requirements, thinking logically, and judging what to do next. For example, when receiving the instruction "arrange a business trip to Tokyo next week," it plans a series of flows: first checking available dates on the calendar, then booking flights, and subsequently arranging hotels.

2. Tools - Hands and Feet
Tools are the means for agents to actually act. They can execute various operations such as information retrieval from databases, calling external APIs, file reading and writing, and web page browsing.
API (Application Programming Interface) is an interface for exchanging data between applications. The role division is that LLM judges "what should be done," and tools "actually do it."

3. Memory - Storage Device
Memory is a mechanism that retains past conversation history and learned information. This allows agents to understand conversation context, remember user preferences, and refer to previous decisions. For example, to understand what "that" refers to in the question "When is that?", they need to remember past conversations.

4. Planning - Strategy
Planning is the ability to break down complex goals into small tasks and determine execution order. Agents formulate strategies to achieve goals and decide next actions while evaluating results of each step. If a plan fails, they can also try alternative approaches.

Example of Agent Operation

Let's look at a specific agent's operation.
User Instruction: "Arrange a business trip to Tokyo next week"

Agent Thinking Process:
1. Planning
  • Check calendar availability
  • Search and book flights
  • Search and book hotels
  • Create business trip application
2. Execution
  • Step 1: Execute calendar tool → Confirm availability: 3/15-3/17 available
  • Step 2: Execute flight booking tool → Search and book round-trip flights
  • Step 3: Execute hotel booking tool → Book hotel within company regulations
  • Step 4: Execute internal system tool → Automatically create and submit business trip application
3. Report
Business trip arrangement completed:
- Flight: Departing 3/15 8:00, Returning 3/17 18:00
- Hotel: Tokyo Business Hotel
- Application: Submitted (awaiting approval)
In this process, the agent understands human instructions, selects necessary tools, executes them in appropriate order, and finally reports results. If problems occur midway (e.g., desired flight is fully booked), it can respond flexibly by presenting alternatives.
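The plan-execute-report cycle above can be sketched in code. This is an illustrative toy only: the tool functions and the hard-coded plan are hypothetical stand-ins for what a real agent's LLM would decide dynamically.

```python
# Toy plan-and-execute loop. In a real agent, an LLM derives the plan from
# the goal; here the plan is hard-coded so the control flow is easy to follow.

def check_calendar(dates: str) -> str:
    return f"Available: {dates}"

def book_flight(route: str) -> str:
    return f"Booked round-trip flight: {route}"

def book_hotel(city: str) -> str:
    return f"Booked hotel in {city}"

TOOLS = {
    "calendar": check_calendar,
    "flight": book_flight,
    "hotel": book_hotel,
}

def run_agent(goal: str) -> list[str]:
    # 1. Planning: an LLM would derive these steps from the goal.
    plan = [
        ("calendar", "3/15-3/17"),
        ("flight", "HND round trip"),
        ("hotel", "Tokyo"),
    ]
    # 2. Execution: invoke each tool in order and collect results.
    results = [TOOLS[tool](arg) for tool, arg in plan]
    # 3. Report: summarize the outcome for the user.
    results.append(f"Goal '{goal}' completed in {len(plan)} steps")
    return results

report = run_agent("Arrange a business trip to Tokyo next week")
```

The key structural point is the separation between deciding what to do (the plan) and doing it (the tool dispatch table), which is the same split the article describes between the LLM "brain" and the tool "hands and feet."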

What is Amazon Bedrock AgentCore

Positioning of AgentCore

Amazon Bedrock AgentCore is a development and operation platform for AI agents. It allows developers to focus on business logic by providing infrastructure, tools, and security features necessary for agents in an integrated manner.
Platform refers to a group of environments and services that form the foundation for application development and execution.
AgentCore takes on the "troublesome parts" of agent development. It provides all elements necessary for enterprise-grade systems, such as server management, security implementation, tool integration, and performance monitoring.

Why AgentCore is Needed

Traditional agent development had the following challenges.

Development Challenges:
  • Complex implementation of authentication and authorization (OAuth 2.0, API Key management, etc.)
  • Complexity of tool integration (understanding each API specification, error handling)
  • Framework selection and learning costs
Operation Challenges:
  • Balancing scaling and security
  • Difficulty in detailed monitoring and debugging
  • Infrastructure management burden
Security Challenges:
  • Implementation of data protection and encryption
  • Detailed management of access control
  • Compliance requirements response
AgentCore solves all these challenges in an integrated manner.

Difference from Amazon Bedrock Agent

There are two ways to build agents on AWS: Amazon Bedrock Agent and Amazon Bedrock AgentCore. These two differ in purpose and usage.
Feature | Amazon Bedrock Agent | Amazon Bedrock AgentCore
Development Style | No-code/Low-code | Full code control
Configuration Method | Define agent via GUI | Can use any framework
Structure | Predefined structure | High customization freedom
Prototyping | Rapid prototyping | Complex agent logic
LLM | Amazon Bedrock models focused | Can use any LLM
Application Scenarios | Create agents quickly; standard patterns sufficient; minimize code writing | Complex agent logic; leverage existing frameworks; use specific LLMs; enterprise requirements
GUI (Graphical User Interface) is an interface that allows intuitive operation through graphical screens using mouse operations, etc.
Amazon Bedrock Agent is a service that enables rapid agent creation through no-code and low-code approaches. You define agent behavior via GUI and configure along predefined structures. It's suitable when you want rapid prototyping or when standard patterns are sufficient.
On the other hand, Amazon Bedrock AgentCore is a platform that can be fully controlled by code. You can use any framework or LLM, with very high customization freedom. It's suitable when you need complex agent logic or must respond to enterprise-specific requirements.
Importantly, these two are not mutually exclusive. You can use Bedrock Agent for prototyping and migrate to AgentCore for production environments. This article focuses on the more flexible and powerful AgentCore.

Three Value Propositions of AgentCore

AgentCore provides three main values in agent development.

1. Make Agents More Effective
Feature | Provided Value
Memory | Agents can retain conversation context and utilize past information
Code Interpreter & Browser Tool | Greatly expands executable actions such as data analysis and web operations
Gateway | Existing APIs and services can be easily integrated as agent tools
These features allow building agents that not only provide simple responses but actually "act."

2. Scale Safely
For agents to operate in production environments, scalability and security are essential.
  • Runtime automatically scales through serverless architecture, reducing infrastructure management burden
  • Identity service centralizes user and agent identity management, realizing enterprise-grade authentication and authorization
  • Complete session isolation minimizes data leakage risks
Scalability is the property of being able to flexibly adjust system processing capacity according to load increases and decreases.

3. Trustworthy Operation
In production operation, it's necessary to visualize agent behavior and detect problems early.
  • Observability service traces entire agent execution, collects performance metrics, and records detailed logs
  • This enables problem debugging, performance optimization, and continuous improvement
  • OpenTelemetry compatibility allows integration with existing monitoring tools

Framework and Model Agnostic Design Philosophy

The greatest feature of AgentCore is flexibility not tied to specific technology stacks. You can obtain enterprise-grade security and reliability while leveraging existing development assets.

Supported Frameworks

Framework refers to development libraries and toolkits for building agents. AgentCore works with popular frameworks such as:
Framework | Features
LangGraph | Defines complex workflows as graph structures. Suitable for complex agent logic including conditional branches and loops
CrewAI | Specialized in multi-agent collaboration. Can build agent teams with role division
Strands Agents | Framework specialized in agent building, with a modular design
Custom Implementation | Direct implementation in Python without a framework. Can leverage an existing codebase as is
Workflow refers to a series of task flows or processing procedures.

Supported Models

Model refers to LLM (Large Language Model) that serves as the agent's brain. AgentCore works with various model providers.
Model Provider | Description
Amazon Bedrock models | Various models accessible via Bedrock, such as Anthropic's Claude and Amazon Nova
Anthropic Claude | Direct access to the latest models via the Claude API
Google Gemini | Google's latest LLM, strong in multimodal capabilities
OpenAI | GPT-4 series also available
Open Source/Custom Models | Open source LLMs or custom in-house models can also be used
Multimodal refers to the ability to handle multiple types of data such as text, images, and audio.
This flexibility allows selecting optimal models according to use cases, costs, and performance requirements.

AgentCore Architecture Overview

AgentCore consists of seven core services. These are provided as independent components, and through loosely coupled design, you can select and combine only necessary functions.

Service Configuration Diagram

Note: The following diagram is independently classified and organized by this article to make it easier to understand the roles and relationships of the seven services.

AgentCore Service Architecture and Component Layering

Service Roles and Relationships

In the configuration diagram above, the seven services are grouped into four layers for ease of understanding, but in reality each is an independent component. Below, I explain each service's role.

Monitoring Layer - Observability
  • Observability: Visualizes operations of all services across the board. Collects and analyzes information necessary for operation such as agent reasoning processes, tool invocations, performance indicators, and error information. Runtime, Identity, Memory, Gateway, and Built-in Tools are all monitoring targets.
Execution Platform and Security Layer - Runtime & Identity
  • Runtime: Provides agent code execution environment. Handles agent lifecycle management, scaling, and session management. Forms the foundation of all agent processing.
  • Identity: Provides authentication and authorization functions. Handles user authentication, agent permission management, access control to external services, and overall security. Works closely with Runtime to realize secure execution environment.
Function Enhancement Layer - Memory, Code Interpreter, Browser Tool
  • Memory: Grants memory capability to agents. Persists conversation history, user settings, learned information, and maintains context.
  • Code Interpreter: Provides environment where agents can execute Python code. Enables data analysis, computational processing, graph generation, etc.
  • Browser Tool: Provides web browsing functionality to agents. Can automate web page retrieval, information extraction, form input, etc.
Tool Integration Layer - Gateway
  • Gateway: Simplifies integration with external tools and APIs. Centralizes credential management, request standardization, and error handling. Used by all services (Runtime, Memory, Code Interpreter, Browser Tool) and functions as a bridge connecting internal and external.

Composable Design

The greatest feature of AgentCore is composable design where each service operates independently. This allows flexible combination of services according to use cases.
【Configuration Examples】

Simple Agent
  Runtime only
  └─ Minimal configuration. Basic LLM responses only

Agent with Conversation Memory
  Runtime + Memory
  └─ Context-retained dialogue possible

Agent with External API Integration
  Runtime + Gateway
  └─ Integration with external services

Agent with Data Analysis Capability
  Runtime + Gateway + Code Interpreter
  └─ Combination of data retrieval and analysis

Agent Utilizing Web Information
  Runtime + Gateway + Browser Tool
  └─ Web information retrieval and utilization

Secure External Integration Agent
  Runtime + Gateway + Identity
  └─ External service integration with authentication/authorization

Enterprise Full-Stack Agent
  Runtime + Memory + Gateway + Identity + 
  Code Interpreter + Browser Tool + Observability
  └─ Integrate all functions with complete monitoring
Since you can select only necessary functions and gradually build and expand agents, an approach of starting with minimal configuration in early development stages and adding functions according to requirements is possible.
Service relationships:
  • Runtime is the foundation for everything
  • Identity provides security across the board
  • Gateway handles internal-external integration
  • Memory/Code Interpreter/Browser Tool can be added as independent capabilities
  • Observability monitors everything
Through this flexible combination, you can build gradually on the same platform from simple prototypes to full-scale enterprise systems.
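The composable design above can be modeled as a mandatory Runtime plus an opt-in set of capabilities. The sketch below is purely illustrative: the service names follow the article, but the `AgentConfig` class itself is hypothetical, not an AgentCore API.

```python
# Toy model of AgentCore's composable design: Runtime is always present,
# and optional services are added as requirements grow.

from dataclasses import dataclass, field

OPTIONAL_SERVICES = {
    "Memory", "Gateway", "Identity",
    "Code Interpreter", "Browser Tool", "Observability",
}

@dataclass
class AgentConfig:
    services: set = field(default_factory=set)  # Runtime is always implied

    def add(self, service: str) -> "AgentConfig":
        if service not in OPTIONAL_SERVICES:
            raise ValueError(f"Unknown service: {service}")
        self.services.add(service)
        return self

# Start minimal (a data-analysis agent), expanding later as needed.
agent = AgentConfig().add("Gateway").add("Code Interpreter")
stack = ["Runtime", *sorted(agent.services)]
```

This mirrors the "start with minimal configuration, add functions according to requirements" approach: the full enterprise stack is just the same config with every optional service added.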

Seven Core Services Details

Now let's look in detail at each service's roles and main functions. Below, I explain in an order that prioritizes ease of learning, starting with Runtime as the foundation and gradually expanding functionality.

1. Runtime: Agent Execution Platform

Runtime is a service that safely hosts and executes agents and tools. As an agent execution environment, it manages all aspects of security, scalability, and performance.
Hosting refers to providing an environment for executing applications or services.

What is Serverless

Serverless is an execution model where developers can focus only on code implementation by leaving server management to cloud providers. Operational tasks such as server startup, shutdown, scaling, and patch application become unnecessary. Additionally, since billing is only for actual usage, cost efficiency is excellent.
Patch application refers to performing security updates and bug fixes for software.
AgentCore Runtime provides a serverless environment optimized for agent-specific requirements. While traditional serverless services assume short-term processing, Runtime also supports long-term execution and large-capacity data processing.

Common Use Cases for Runtime

  • Long-duration data analysis processing: Continuous execution up to 8 hours maximum for complex statistical analysis or simulations
  • Multimodal content processing: Agent processing including large-capacity files such as images, audio, and video
  • Complex workflow execution: Long-term task automation coordinating multiple tools

Key Features

Serverless Architecture
Runtime's greatest feature is complete session isolation through microVM. microVM is a lightweight virtual machine technology with faster startup and less overhead than traditional virtual machines.
Overhead refers to incidental processing or resource consumption required beyond the original processing.

Runtime Session Isolation with microVM Architecture

Each user's session is executed in an independent microVM, so they cannot access other users' data or processes at all. When sessions end, microVMs are completely deleted and memory is cleared, minimizing data leakage risks.

Long-Duration Execution Support
An important feature of Runtime is long-duration execution support. While traditional serverless services limit execution time to a few minutes, Runtime supports execution for up to 8 hours. This allows completing agent tasks involving complex reasoning and large numbers of tool invocations.

Large-Capacity Payload Processing
Payload refers to the body portion of data actually transmitted and received. Runtime can handle up to 100MB of data. It also supports agents that process multimodal data such as images, audio, and video.

Fast Cold Start
Runtime minimizes time from request to execution start. Cold start refers to initial waiting time for new instance startup. Runtime is designed to minimize this startup time.

Instance refers to a program or service execution unit.

Basic Concepts

Runtime is a foundational unit that hosts agent code. Version management functionality allows managing code update history. Each version is immutable, unchangeable once created. This allows easy rollback to previous versions when problems occur.

Rollback refers to returning to a previous stable state.

Endpoint is a URL for accessing specific Runtime versions. DEFAULT endpoint always automatically references the latest version. By creating custom endpoints, you can separate development, test, and production environments.

Session is a unit representing a series of dialogues between user and agent. Each session runs in an independent microVM and automatically terminates after a maximum of 8 hours, or after 15 minutes of inactivity. Within a session, conversation context is retained, enabling responses that reference previous statements.
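The session lifetime rules just described (a hard 8-hour cap, plus a 15-minute idle timeout) can be sketched as follows. This is an illustration of the rules only: the `Session` class is hypothetical, and in practice Runtime handles session expiry for you.

```python
# Toy model of the session expiry rules described above.

MAX_LIFETIME_S = 8 * 60 * 60   # hard limit: 8 hours
IDLE_TIMEOUT_S = 15 * 60       # idle limit: 15 minutes of inactivity

class Session:
    def __init__(self, start: float):
        self.start = start
        self.last_activity = start

    def touch(self, now: float) -> None:
        """Record user activity, resetting the idle timer."""
        self.last_activity = now

    def is_expired(self, now: float) -> bool:
        return (now - self.start >= MAX_LIFETIME_S
                or now - self.last_activity >= IDLE_TIMEOUT_S)

s = Session(start=0.0)
s.touch(now=600.0)  # activity after 10 minutes keeps the session alive
alive_after_14_idle_min = not s.is_expired(now=600.0 + 14 * 60)
dead_after_16_idle_min = s.is_expired(now=600.0 + 16 * 60)
```

Note that either condition alone ends the session: even a continuously active session terminates at the 8-hour mark.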

Differences from Traditional Serverless

Item | Traditional Serverless | AgentCore Runtime
Execution Time | Several minutes to ~15 minutes | Up to 8 hours
Payload Size | Several MB to ~10MB | Up to 100MB
State Management | Stateless | Session state retained (up to 8 hours, or 15 minutes of inactivity)
Billing Model | Request count + execution time | Actual CPU usage time only (I/O waits such as waiting on LLM responses are typically excluded)
Isolation Level | Process-level isolation | Complete isolation through microVM
Stateless refers to design that doesn't retain state and processes each request independently.
I/O is abbreviation for Input/Output, referring to data reading/writing and communication.
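The difference in billing models is worth a quick back-of-the-envelope comparison. The numbers and the per-second rate below are entirely made up for illustration; the point is only that agents spend most of their wall-clock time waiting on LLM responses, so CPU-time billing can be much cheaper than duration billing.

```python
# Illustrative arithmetic only: hypothetical rate and durations.
# Agents are I/O-heavy, so CPU-only billing typically covers a small
# fraction of the total request duration.

wall_clock_s = 120.0   # total request duration (mostly waiting on the LLM)
cpu_busy_s = 8.0       # time actually spent computing
rate_per_s = 0.0001    # hypothetical price per billed second

cost_wall_clock = wall_clock_s * rate_per_s   # billed on total duration
cost_cpu_only = cpu_busy_s * rate_per_s       # billed on CPU time only
savings_ratio = cost_cpu_only / cost_wall_clock
```

With these made-up numbers the CPU-only model bills under 10% of what duration-based billing would; real ratios depend entirely on the workload and actual pricing.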

2. Memory: Context Management

Memory is a service that gives agents "memory" and maintains context. By saving past conversation history and learned information and retrieving it at appropriate timing, it provides agents with human-like memory capability.

What is Context

Context is contextual information of conversations or situations. In human conversations, pronouns and abbreviations like "that," "it," and "what I said before" are frequently used, but context is necessary to understand these.
For example, the question "When is that?" cannot be answered without knowing what "that" refers to. If "business trip to Tokyo" was mentioned in past conversation, you can understand "that" refers to Tokyo business trip.

Why Memory is Important

AI agents are stateless (they don't retain state) by default, meaning each request is processed independently with no memory of past interactions. While this wasn't a problem for traditional APIs and web services, for agents it is a fatal limitation.

Conversation without Memory:
  • User: "What's the weather in Seattle?"
  • Agent: "It's sunny in Seattle"
  • User: "How about tomorrow?"
  • Agent: "Tomorrow's weather where?" ← Cannot understand context
Conversation with Memory:
  • User: "What's the weather in Seattle?"
  • Agent: "It's sunny in Seattle"
  • User: "How about tomorrow?"
  • Agent: "Tomorrow's weather in Seattle will be cloudy" ← Understands context
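The two dialogues above can be reproduced with a toy agent. Everything here is a hypothetical stand-in: a real agent passes the history to an LLM, which resolves references like "tomorrow" and the omitted city; this sketch fakes that with a simple city lookup.

```python
# Toy demonstration of why conversation history matters. The weather data
# and the reference-resolution logic are made-up stand-ins for an LLM.

WEATHER = {("Seattle", "today"): "sunny", ("Seattle", "tomorrow"): "cloudy"}
CITIES = {city for city, _ in WEATHER}

def answer(question: str, history: tuple[str, ...] = ()) -> str:
    # Look for a city in the question first, then in the most recent history.
    texts = (question, *reversed(history))
    city = next((c for t in texts for c in CITIES if c in t), None)
    if city is None:
        return "Tomorrow's weather where?"   # context lost
    day = "tomorrow" if "tomorrow" in question.lower() else "today"
    return f"{day}: {WEATHER[(city, day)]}"

no_memory = answer("How about tomorrow?")
with_memory = answer("How about tomorrow?",
                     history=("What's the weather in Seattle?",))
```

Without history the follow-up question is unanswerable; with history the omitted "in Seattle" is recovered, which is exactly what short-term memory provides.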

Two Types of Memory

AgentCore Memory adopts a two-layer structure modeled after human memory systems. This realizes both short-term context understanding and long-term personalization.
Personalization refers to customizing according to individual preferences and characteristics.

Two-Layer Memory Architecture: Short-term and Long-term

This two-layer structure realizes both efficient memory management and natural conversation experience. Short-term memory is directly included in LLM's context window, and long-term memory is retrieved via semantic search as needed.
Context window refers to the range of input text that LLM can process at once.
Semantic search is a technique that searches information based on semantic similarity. It finds semantically similar information rather than keyword matches.

Short-term Memory

Short-term memory retains history of ongoing conversations. This corresponds to human working memory, remembering what you're talking about right now.
Feature | Description
Retention Range | Retains conversation history per turn
Session Scope | Associated with a specific session ID
Retention Period | Automatically deleted after a configurable period (up to 365 days). Events are retained even after the conversation session ends, until the retention period expires
Storage Unit | Saved as an Event
Note: While short-term memory is scoped to individual sessions, the events persist beyond active conversation sessions for the configured retention period. This allows retrieving conversation history even after users end their sessions.

Turn refers to a pair of user statement and agent response.
Conversation example (Session 1):
  • User: "Tell me Tokyo weather"
  • Agent: "It's sunny in Tokyo"
  • User: "What's the temperature?" ← Can omit "in Tokyo"
  • Agent: "25 degrees"
In this way, short-term memory means users do not need to repeat content from earlier statements. The agent understands the conversation flow and fills in omitted information.

Long-term Memory

Long-term memory persists important information that should be retained across sessions. This corresponds to human long-term memory, remembering important facts, preferences, and past experiences.
Persist means storing long-term rather than temporarily.

Generation through Asynchronous Processing:
Long-term memory generation is an asynchronous process executed in background. After conversation data is saved in short-term memory, long-term memory is generated through the following two-stage process:
  • Extraction: Automatically extract important information from interactions with agent
  • Consolidation: Integrate newly extracted information with existing memory and eliminate duplicates
This process efficiently integrates important information without interrupting conversations in real-time. Processing may take over 1 minute to complete.
Feature | Description
Retention Range | LLM automatically extracts and integrates important information
Valid Period | Persisted across sessions (manageable with TTL)
Sharing Scope | Shareable per Actor, per Session, and across multiple agents
Primary Use | Personalization, knowledge accumulation
TTL (Time To Live) is a function to set data expiration period. Data is automatically deleted after specified period elapses.
Examples of extracted information:
  • User preferences: "Prefers window seats" "Morning person"
  • Important facts: "Allergy: Peanuts" "Wheelchair user"
  • Past decisions: "Project X was canceled"
  • Session summaries: "Consulted about Tokyo business trip in March"
Conversation example:
  • Session 1 (3 months ago):
    • User: "I want to book a flight. I prefer window seats"
    • → Long-term: "Preference: Window seats"
  • Session 2 (Today):
    • User: "Book a flight for next week's business trip"
    • Agent: "I'll look for a window seat for you" ← Remembers previous preference
In this way, long-term memory allows agents to remember users' past preferences and information and act based on them. Users no longer need to repeat the same information every time.
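The two-stage extraction/consolidation pipeline described above can be sketched with toy rules. The keyword matching below is a hypothetical stand-in for the LLM that actually decides what is worth remembering; only the two-stage shape (extract, then merge and deduplicate) reflects the article.

```python
# Toy version of long-term memory generation: extract durable facts from a
# conversation, then consolidate them with existing memories.

def extract(turns: list[str]) -> list[str]:
    """Extraction stage: pull preferences out of conversation turns.
    A real system uses an LLM; this keyword rule is illustrative only."""
    memories = []
    for turn in turns:
        if "prefer" in turn.lower():
            detail = turn.split("prefer", 1)[1].strip(" .")
            memories.append(f"Preference: {detail}")
    return memories

def consolidate(existing: list[str], new: list[str]) -> list[str]:
    """Consolidation stage: merge new memories, dropping duplicates."""
    merged = list(existing)
    for memory in new:
        if memory not in merged:
            merged.append(memory)
    return merged

store: list = []
session1 = ["I want to book a flight. I prefer window seats."]
store = consolidate(store, extract(session1))
# A later session repeating the same preference adds nothing new.
store = consolidate(store, extract(["I still prefer window seats."]))
```

Because consolidation deduplicates, repeating a preference across sessions does not bloat the memory store, which is the point of the integration step.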

Memory Strategy

Memory Strategy defines conversion rules from short-term to long-term memory. It instructs LLM what information to extract and how to integrate it.
AgentCore provides three types of Memory Strategies:
  • Built-in strategies: Predefined standard strategies. No configuration needed and optimized for standard use cases
  • Built-in with overrides: Can customize prompts and models based on built-in strategies
  • Self-managed strategies: Completely customize entire memory processing pipeline. Implement custom extraction/consolidation algorithms
Built-in strategies include the following three types:
  • Semantic Memory: Extracts factual information and contextual knowledge
  • User Preference Memory: Extracts user preferences and choices
  • Summary Memory: Generates summaries of conversations within sessions

Difference from RAG (Retrieval-Augmented Generation)

AgentCore Memory and RAG are complementary technologies with different purposes.
Feature | Long-term Memory | RAG
Primary Purpose | Personal context and session continuity | Access to authoritative, up-to-date information
Stored Content | User preferences, past decisions, conversation history, behavior patterns | Documents, technical specifications, policies, domain expertise
Data Source | Session-specific context | Large repositories, databases
Update Frequency | Dynamically updated per conversation | Periodic document updates
Answers the Question | "Who is this user and what happened before?" | "What do trusted sources currently say?"
RAG (Retrieval-Augmented Generation) is a technique that searches related information from large document repositories and incorporates it into LLM responses.
By combining these two, agents can provide both personalized experiences through remembered context and reliable information through real-time knowledge search. Long-term Memory answers "Who is this user and what happened before," while RAG answers "What do trusted sources currently say."
When implementing RAG on AWS, you can use Amazon Bedrock Knowledge Base.
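The complementary split can be sketched as two lookups feeding one context. Everything here is hypothetical: the user memories, the documents, and the keyword-overlap retrieval, which stands in for the embedding-based semantic search a real RAG system (such as Amazon Bedrock Knowledge Base) would use.

```python
# Toy illustration of Memory + RAG: memory answers "who is this user",
# retrieval answers "what do trusted documents say".

USER_MEMORY = {"alice": ["Preference: window seats", "Allergy: peanuts"]}

DOCUMENTS = [
    "Travel rules: economy class for flights under 6 hours",
    "Expense rules: hotel cap of 20,000 JPY per night in Tokyo",
]

def retrieve(query: str, docs: list[str]) -> list[str]:
    """Rank documents by word overlap (a crude stand-in for semantic search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(d.lower().split())), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True) if score > 0]

def build_context(user: str, query: str) -> dict:
    return {
        "memory": USER_MEMORY.get(user, []),       # personal context
        "knowledge": retrieve(query, DOCUMENTS),   # authoritative sources
    }

ctx = build_context("alice", "hotel rules for tokyo")
```

A real agent would then hand both lists to the LLM, letting personal preferences and authoritative policy jointly shape the answer.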

Common Use Cases for Memory

  • Multi-turn dialogue: Interview bots that collect information through multiple exchanges
  • User profile building: Personal assistants that learn user preferences and characteristics
  • Long-term project management: Project management agents that remember project history and decisions
  • Customer support: Provide continuous support by remembering customer's past inquiry history

3. Gateway: Tool Integration Simplification

Gateway is a service that transforms existing APIs, Lambda functions, and services into tools available to agents. It simplifies complex integration work, allowing tool addition in minutes.

What are API, Lambda, and Smithy

Term | Description
API (Application Programming Interface) | Interface for exchanging data between applications. Example: retrieving or updating customer information via the Salesforce API
AWS Lambda | AWS service that executes code without server management. Runs custom logic serverlessly
Smithy | Language for defining AWS service APIs. API specifications can be generated from Smithy models

Common Use Cases for Gateway

  • Multi-system integration: Workflows integrating multiple services like Salesforce, Slack, Jira
  • Legacy system integration: Make existing internal APIs available to agents
  • Third-party service utilization: Toolize external specialized services (payment, translation, etc.)
  • AWS service integration: Use AWS services like DynamoDB, S3, CloudWatch as tools
Legacy system refers to old but currently operating existing systems.

Challenges Solved by Gateway

When having agents use external tools, traditionally much work was required.
Traditional integration work (several weeks to months per tool):
Step | Time Required | Content
API documentation analysis | Several hours to days | Understand which endpoints to call and how
Authentication implementation | Several days | Implement authentication such as OAuth 2.0 or API keys
Protocol conversion code | Several days | Support different protocols such as REST or GraphQL
Error handling | Several days | Handle various error cases
Retry logic | Several days | Handle temporary failures
Logging and monitoring | Several days | Track tool invocations
Security considerations | Several days | Implement secure credential management
Testing and debugging | Several days | Operation verification and problem fixing
Protocol refers to communication rules and procedures.
Retry refers to automatically reattempting failed processing.
Using Gateway dramatically simplifies this work.
Integration with Gateway (minutes):
  1. Register OpenAPI/Smithy definition, or specify Lambda function
  2. Authentication configuration (a few clicks)
  3. Complete
Official documentation explains integration is possible with "just a few lines of code," shortening work that took weeks to months down to minutes.
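Conceptually, what Gateway does is take an existing callable backend and wrap it in a uniform tool description. The sketch below is not the Gateway API: the `as_tool` helper and the simplified schema are hypothetical, standing in for the MCP tool definitions Gateway generates from OpenAPI/Smithy specs or Lambda targets.

```python
# Toy version of "turn an existing function into an agent tool":
# one uniform wrapper instead of per-API integration code.

def get_customer(customer_id: str) -> dict:
    # Stand-in for a real backend call (e.g. a Lambda function or CRM API).
    return {"id": customer_id, "name": "Example Corp"}

def as_tool(fn, description: str, params: dict) -> dict:
    """Wrap a callable in a minimal, uniform tool definition."""
    return {
        "name": fn.__name__,
        "description": description,
        "input_schema": {"type": "object", "properties": params},
        "invoke": fn,
    }

tool = as_tool(
    get_customer,
    "Look up a customer record by ID",
    {"customer_id": {"type": "string"}},
)
result = tool["invoke"]("C-1001")
```

The agent only ever sees the uniform shape (name, description, schema), which is what lets it use many heterogeneous backends without knowing their individual protocols.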

Gateway's Six Key Capabilities

Gateway provides the following six key capabilities:
Capability | Description
1. Security Guard | Manages OAuth authorization, ensuring only valid users and agents can access tools and resources
2. Translation | Translates agent requests using protocols like MCP into API requests or Lambda invocations, eliminating the need to manage protocol integration or version support
3. Composition | Integrates multiple APIs, functions, and tools into a single MCP endpoint, streamlining agent access
4. Secure Credential Exchange | Handles credential injection for each tool, so agents can seamlessly use tools with different authentication requirements
5. Semantic Tool Selection | Searches available tools to find the best fit for a given context. Thousands of tools can be leveraged while minimizing prompt size and reducing latency
6. Infrastructure Manager | Provides a serverless solution with built-in observability and auditing, eliminating infrastructure management overhead
Through these capabilities, Gateway functions not just as a protocol conversion tool but as an enterprise-grade tool integration platform.

MCP and A2A Protocols

Gateway supports two important protocols.

MCP (Model Context Protocol)
MCP is a standard protocol for AI applications to access tools and resources. An open specification proposed by Anthropic, it functions as a common language between tools and agents.

Gateway Protocol Support: MCP (Model Context Protocol)

MCP benefits:
| Benefit | Description |
| --- | --- |
| Loose Coupling | Tools and agents can be developed and updated independently |
| Compatibility | Compatibility between different frameworks; a tool created once can be reused across multiple agents |
| Reusability | Tools can be shared across the ecosystem |
| Standardization | Works with popular open-source frameworks like CrewAI, LangGraph, LlamaIndex, and Strands Agents |
Loose coupling refers to design where each part of a system is independent with minimal interdependencies.

Gateway operates as an MCP server, automatically converting existing APIs and Lambda functions into MCP-compatible tools. As long as an agent understands the MCP protocol, it can use any tool exposed through Gateway.
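Since MCP builds on JSON-RPC 2.0, a tool invocation is just a structured request. A hedged sketch of what a `tools/call` message looks like on the wire (the tool name and arguments are hypothetical):

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request (MCP is JSON-RPC 2.0 based).

    Sketch of the message an agent sends to an MCP server such as
    Gateway; the tool name and arguments below are hypothetical.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

request = mcp_tool_call(1, "get_customer", {"customer_id": "C-1001"})
wire = json.dumps(request)  # serialized form sent over the transport
```

Because every tool is invoked through this one message shape, an agent needs no tool-specific client code: Gateway translates the call into the underlying REST request or Lambda invocation.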

A2A (Agent-to-Agent Protocol)
A2A is a protocol for multiple agents to communicate and cooperate. Complex tasks can be divided among specialized agents and executed cooperatively.

Gateway Protocol Support: A2A (Agent-to-Agent)

A2A benefits:
| Benefit | Description |
| --- | --- |
| Role Division | Each agent focuses on its specialized field |
| Specialization | Efficiently process complex tasks by combining specialized agents |
| Dynamic Cooperation | Automatically find and cooperate with the necessary agents |
AgentCore Runtime can also be deployed as an A2A server, enabling construction of multi-agent systems.

Gateway Architecture

Gateway is composed of multiple layers.

Gateway Multi-Layer Architecture with Authorization

Through this layered structure, agents use a wide range of tools through a unified interface without having to handle the complex integration details themselves.

Inbound Authorization and Outbound Authorization

Gateway ensures security through a two-stage authorization process.

Inbound Authorization (Entry Authorization)
Inbound Authorization handles authorization when users or agents access Gateway. Controls "who can use this Gateway."
| Authorization Method | Description | Use Case |
| --- | --- | --- |
| JWT (JSON Web Token) | Authenticate with tokens issued by any ID provider (Cognito, Okta, Auth0, etc.) | Access from external users or applications |
| IAM | Authentication using AWS IAM credentials | Access from services or applications within AWS |
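To make the JWT option concrete, here is a minimal sketch of reading a token's claims with the standard library. Note that this deliberately skips signature verification, which a real inbound authorizer (and Gateway) must always perform against the ID provider's keys:

```python
import base64
import json

def jwt_claims(token):
    """Decode a JWT's payload (claims) WITHOUT verifying the signature.

    For illustration only; in production the signature must be
    verified, which is what Gateway's inbound authorization does.
    """
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url; restore padding before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a sample (unsigned) token just to demonstrate the structure
def _b64(obj):
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=")

token = b".".join([
    _b64({"alg": "none"}),                               # header
    _b64({"sub": "user-123", "iss": "https://example-idp"}),  # claims
    b"",                                                 # (empty) signature
]).decode()

decoded = jwt_claims(token)
```

The `sub` (subject) and `iss` (issuer) claims are what an authorizer checks to decide "who is calling" and "which ID provider vouches for them".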

Outbound Authorization (Exit Authorization)
Outbound Authorization handles authorization when Gateway accesses backend tools (targets) on behalf of authenticated users or agents. Controls "what this Gateway can do against external services or resources."
| Authorization Method | Description | Supported Targets |
| --- | --- | --- |
| No Auth (not recommended) | Access targets without authentication; not recommended due to security risks | MCP Server (partial) |
| Gateway Service Role (IAM) | Use the Gateway service role's IAM credentials; authenticates with AWS Signature Version 4 (SigV4) | Lambda, Smithy, AWS services |
| OAuth 2.0 (2LO, 2-legged OAuth) | Access resources with the application's own authority, not the user's; obtains a token using the Client Credentials flow | OpenAPI, MCP Server |
| API Key | Authentication using API keys | OpenAPI |
2-legged OAuth (2LO) is an OAuth flow that authenticates between applications without user intervention. Client applications directly access resources without requiring end-user authentication.
Through this two-stage authorization, "who can use Gateway" and "what Gateway can access" can be controlled separately. This separation is what enables Gateway to provide comprehensive authentication as a fully managed service.
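The 2LO flow mentioned above reduces to a single token request made with the application's own credentials. A sketch of the Client Credentials form body (the client ID, secret, and scope are placeholders):

```python
from urllib.parse import urlencode

def client_credentials_request(client_id, client_secret, scope):
    """Form body for an OAuth 2.0 Client Credentials (2LO) token request.

    This is the kind of request Gateway issues on the agent's behalf;
    all values here are placeholders for illustration.
    """
    return urlencode({
        "grant_type": "client_credentials",  # 2-legged: no end user involved
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })

body = client_credentials_request("my-agent", "placeholder-secret", "orders:read")
```

The body is POSTed to the authorization server's token endpoint; the access token in the response is then attached to outbound API calls, with Gateway storing and refreshing it so the secret never appears in agent code.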

Supported Tool Types

Gateway supports five types of tools.
| Type | Description | Supported Outbound Auth |
| --- | --- | --- |
| OpenAPI | Standard format describing REST API specifications (OpenAPI 3.0/3.1 compatible) | OAuth 2.0, API Key |
| Lambda | Toolize custom logic | IAM (Service Role) |
| Smithy | Use AWS service APIs | IAM (Service Role) |
| MCP Server | Integrate existing MCP servers | No Auth, OAuth 2.0 |
| Integration Provider Templates | Pre-configured templates for popular services | Varies by template |

1-Click Integration (Integration Provider Templates)

Popular tools are pre-configured and immediately available. Supports major business tools like Salesforce, Slack, Microsoft 365, SAP, Jira. Can integrate in minutes from AWS console.

Semantic Tool Selection

One of Gateway's most powerful features is Semantic Tool Selection, which automatically finds the appropriate tools from among thousands.
Benefits of semantic search:
  • Scalability: Automatically select appropriate tools from thousands
  • Prompt size reduction: No need to include all tool details in prompt
  • Latency reduction: Present only related tools to agent
  • Dynamic tool discovery: Agents find optimal tools according to tasks
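Gateway performs this selection with embedding-based semantic search over tool descriptions. The ranking idea can be illustrated with a toy word-overlap version (the tool catalog below is hypothetical):

```python
import math
from collections import Counter

# Hypothetical tool catalog; Gateway maintains this for you
TOOLS = {
    "create_ticket": "create a support ticket in the helpdesk system",
    "get_weather": "retrieve the current weather forecast for a city",
    "send_invoice": "generate and send an invoice to a customer",
}

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def select_tool(query):
    """Pick the tool whose description best matches the query.

    Gateway uses learned embeddings; this word-overlap sketch
    just illustrates the similarity-ranking idea.
    """
    qv = vectorize(query)
    return max(TOOLS, key=lambda name: cosine(qv, vectorize(TOOLS[name])))

best = select_tool("open a ticket for this support request")
```

Only the top-ranked tools need to be placed in the prompt, which is how semantic selection keeps prompt size and latency down even with thousands of registered tools.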

Practical Example

Multi-system Integration Scenario: Customer Support Workflow
  1. Receive inquiry from customer via Slack
  2. Agent retrieves customer information from Salesforce via Gateway
  3. Create support ticket in Zendesk via Gateway
  4. Create technical task in Jira via Gateway
  5. After processing completion, send notification to Slack via Gateway
In this scenario, the agent can interact with four different services (Slack, Salesforce, Zendesk, and Jira) by connecting to just one Gateway endpoint. Gateway handles each service's credentials, API differences, and protocol conversions.

4. Code Interpreter: Safe Code Execution Environment

Code Interpreter is a service that provides an environment where agents can safely execute code. It can safely execute various tasks requiring code execution, such as data analysis, complex calculations, and file processing.

Why Code Interpreter is Needed

For AI agents to become truly useful, they need to be able to execute actual operations beyond just conversation.
Examples of necessary operations:
| Category | Specific Examples |
| --- | --- |
| Data Analysis | CSV file reading, statistical calculations, graph generation and visualization |
| Complex Calculations | Financial model simulation, scientific computation execution |
| File Processing | PDF parsing, image conversion, data formatting |
| API Response Processing | JSON data parsing, complex transformations |
However, arbitrary code execution carries security risks. There's potential for malicious code to infiltrate systems or access other users' data. Code Interpreter solves this challenge.

What is Sandbox

Sandbox is a safe execution environment completely isolated from external environments. Like children playing in a sandbox, it refers to an environment where you can safely experiment without affecting the outside. Code executes only within sandbox, unable to access external systems or data.

Common Use Cases for Code Interpreter

  • Data Science Tasks: Complex data analysis using pandas, numpy
  • Report Generation: Automatically create graphs and charts from data
  • Data Cleansing: Detection and correction of invalid data
  • Large-Scale Data Processing: Efficiently process datasets up to 5GB stored in S3
Data cleansing refers to correcting data errors and inconsistencies to improve quality.

Architecture

Code Interpreter creates independent sandbox environments for each session.

Code Interpreter Sandbox Isolation Architecture

Through this structure, each session is executed in a completely independent environment without affecting each other.

Key Features

Serverless Architecture
Code Interpreter's greatest strength is its fully managed serverless environment. Developers can focus entirely on code execution without worrying about infrastructure management.
Each session's isolation:
  • Session 1: Python execution → Independent Sandbox A
  • Session 2: Python execution → Independent Sandbox B
  • No mutual influence, no data leakage
Long-Duration Execution Support
Code Interpreter provides a 15-minute execution time by default, extendable to a maximum of 8 hours as needed. This allows completing tasks such as complex data analysis and large-volume data processing.

Large-Capacity Payload Processing
Code Interpreter supports the following file sizes:
  • Inline upload: up to 100MB
  • S3-based upload: Up to 5GB (via terminal commands)
Also supports agents that process multimodal data such as images, audio, video, or large-scale datasets.
Network Configuration
Code Interpreter supports three network modes:
| Mode | Description | Use Case |
| --- | --- | --- |
| Sandbox | Completely isolated environment; no external network access (S3 access possible) | Most secure choice; when handling sensitive data |
| Public | Internet access possible | When integration with external APIs or services is needed |
| VPC | Can access private resources within a VPC | When access to internal databases or internal APIs is needed |
VPC (Virtual Private Cloud) is a virtual private network created on AWS.

Supported Languages

Code Interpreter supports major programming languages.
| Language | Features |
| --- | --- |
| Python 3.12 | Optimal for data science and machine learning; 100+ libraries pre-installed including pandas, numpy, matplotlib, scikit-learn, torch |
| TypeScript | Type-safe script execution |
| JavaScript | Suitable for lightweight processing |
Type-safe refers to strictly managing variable and data types to prevent errors.

Practical Examples

Data Analysis Scenario: Analyze CSV file and create graphs
  1. User: "Analyze sales data"
  2. Agent: Retrieve CSV (up to 100MB inline, or up to 5GB via S3)
  3. Code Interpreter:
    • Load data with pandas
    • Statistical calculations (mean, median, standard deviation, etc.)
    • Generate graphs with matplotlib
  4. Agent: Return graphs and summary
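Steps 2-3 of the scenario can be sketched without any AWS dependency. Here the CSV is an in-memory stand-in for the uploaded file, and the statistics come from the standard library rather than pandas:

```python
import csv
import io
import statistics

# Tiny in-memory stand-in for the sales CSV the agent would retrieve
SALES_CSV = """month,revenue
Jan,120
Feb,150
Mar,90
Apr,160
"""

rows = list(csv.DictReader(io.StringIO(SALES_CSV)))
revenue = [float(r["revenue"]) for r in rows]

# The statistical summary the sandbox would return to the agent
summary = {
    "mean": statistics.mean(revenue),
    "median": statistics.median(revenue),
    "stdev": round(statistics.stdev(revenue), 2),
}
```

Inside the real sandbox the same step would typically use pandas for loading and matplotlib for the graphs, as listed above.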
Complex Calculation Scenario: Financial Model Simulation
  1. User: "Compare investment scenarios"
  2. Agent: Collect parameters
  3. Code Interpreter:
    • Compound interest calculations
    • Risk analysis (Monte Carlo simulation)
    • Scenario comparison
    • Visualization (graph generation)
  4. Agent: Present recommendations
Compound interest calculation is a calculation method where interest accrues not only on principal but also on interest.
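The compound interest step is simple enough to show directly. A sketch of the future-value formula FV = P(1 + r/n)^(nt), compared with simple interest:

```python
def compound(principal, annual_rate, years, compounds_per_year=1):
    """Future value with compound interest: P * (1 + r/n)^(n*t)."""
    n = compounds_per_year
    return principal * (1 + annual_rate / n) ** (n * years)

# 1,000,000 invested at 5% per year for 10 years
fv_simple = 1_000_000 * (1 + 0.05 * 10)             # simple interest
fv_compound = round(compound(1_000_000, 0.05, 10))  # interest also earns interest
```

The gap between the two results is exactly the "interest on interest" the definition above describes; Monte Carlo risk analysis layers random rate paths on top of this same calculation.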

5. Browser Tool: Cloud Browser Execution Environment

Browser Tool is a service that provides an environment where agents can safely interact with websites. It can operate web pages using an actual browser and retrieve information.

What is Web Scraping

Web Scraping is a technology for automatically extracting information from web pages. However, traditional scraping has limitations.
Traditional scraping only handled static HTML retrieval. That is, it analyzes HTML code sent from servers as-is.
HTML is a language that describes web page structure.
However, since many modern websites dynamically generate content using JavaScript, complete information cannot be obtained from static HTML alone.

Common Use Cases for Browser Tool

  • Price Monitoring: Automatically track e-commerce site price fluctuations
  • Competitive Analysis: Periodically collect competitor site product information and content
  • Form Submission Automation: Automate routine web input tasks
  • Web Application Testing: Test execution in secure environment
  • Online Resource Access: Access to web-based services and data
  • Screenshot Capture: Visual recording of web pages

Difference from Web Scraping

| Item | Web Scraping | Browser Tool |
| --- | --- | --- |
| Retrieval Method | Static HTML retrieval | Actual browser operation |
| JavaScript | Not executed | Executed |
| Dynamic Content | Difficult to retrieve | Supported |
| Authentication | Difficult | Login and operation possible |
| Visual Understanding | Not possible | Possible through screenshots |
JavaScript is a programming language that adds dynamic functionality to web pages. Many modern websites use JavaScript to dynamically display content according to user operations.
Browser Tool uses actual browsers, so JavaScript executes and dynamically generated content can be retrieved.

Browser Tool Features

Browser Tool runs actual Chromium-based browsers in the cloud.
Chromium is the open-source browser engine that forms the foundation of Google Chrome. Browser Tool uses Chromium-based browsers to render and display web pages just as a real user's browser would.
Rendering refers to interpreting code like HTML and CSS and displaying it as a visual web page.

Main functions:
| Function | Description |
| --- | --- |
| Page Navigation | Open URLs, click links |
| Form Input | Input into text boxes, select from dropdowns |
| Button Clicks | Click buttons or links |
| Wait & Scroll | Wait for page loading, scroll to display elements |
| Screenshot Capture | Save images of the entire page or specific elements |
| Cookie Management | Maintain login state |
Cookie is small data that websites store in browsers, retaining information like login state.

Network Configuration

Browser Tool currently supports only Public network mode. Access to external websites and internet resources is possible.

Security and Scaling

Browser Tool realizes both security and scalability.

Isolated Execution
Each user's browser sessions are completely isolated.
Session isolation:
  • User A → Browser Instance A (completely isolated)
  • User B → Browser Instance B (completely isolated)
  • User C → Browser Instance C (completely isolated)
Each instance's characteristics:
  • Independent cookies/sessions
  • Independent cache
  • Independent CPU, memory, filesystem resources (microVM)
  • Completely deleted at session end
Cache is a mechanism that temporarily stores once-retrieved data and reuses it to speed up processing.

microVM is a lightweight virtual machine, with each session executed in a dedicated microVM. This makes it impossible for one user's tool invocations to access another user's session data. When sessions complete, microVMs completely terminate and memory is sanitized (erased), eliminating cross-session data leakage risks.

Auto Scaling
Browser Tool automatically increases and decreases browser instances according to request volume.
  • No infrastructure management needed
  • Usage-based billing prevents wasted costs
  • Maintains stable performance even during peaks
  • Up to 500 concurrent sessions possible

Practical Examples

E-commerce Price Monitoring Scenario:
  1. Access target site
  2. Input product name in search form
  3. Execute search
  4. Extract price information from results page
  5. Compare with price history
  6. Notify if price drops
In this case, the agent uses Browser Tool to actually operate the website. It can automate operations humans would perform, such as input into search forms, clicking search buttons, and analyzing results pages.
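The decision logic in steps 5-6 is plain code once Browser Tool has extracted the current price. A sketch with an illustrative notification threshold:

```python
def price_alert(history, current_price, threshold=0.05):
    """Notify if the price dropped by at least `threshold` (5% here)
    compared with the most recent recorded price.

    Pure-logic sketch of steps 5-6; the agent would obtain
    `current_price` by operating the site via Browser Tool.
    """
    if not history:
        return False
    last = history[-1]
    drop = (last - current_price) / last
    return drop >= threshold

# Recorded prices from previous monitoring runs (illustrative)
history = [19800, 19800, 19600]
should_notify = price_alert(history, 18500)  # roughly a 5.6% drop
```

Separating the extraction (browser automation) from the decision (plain code) keeps the monitoring logic easy to test without a live site.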

Competitive Research Scenario:
  1. Access competitor site
  2. Browse product catalog
  3. Collect detailed information for each product
  4. Save screenshots
  5. Save as structured data
  6. Generate report
Structured data refers to data organized in a defined format.
In this way, Browser Tool can also handle information collection across multiple pages and access to sites requiring login.

6. Identity: Identity and Access Management

Identity is a service that centrally manages authentication and authorization for agents and humans. Controls who can access agents and what agents can access.

What are Authentication and Authorization

Authentication confirms "Who are you?" Verifies identity through username and password, multi-factor authentication, biometric authentication, etc.
Multi-factor authentication is a mechanism that verifies identity using multiple factors beyond passwords, such as SMS codes or fingerprint authentication.
Authorization determines "What can you do?" Controls which resources authenticated users or systems can access.
For example, an employee logging in is authentication, and deciding whether that employee can access the HR system is authorization.
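The distinction can be captured in a few lines. A toy sketch where the credential store and permission table are hypothetical stand-ins for an ID provider and a policy store:

```python
# Hypothetical permission table; a real system queries an ID provider
# (authentication) and a policy store (authorization) such as Identity.
PERMISSIONS = {
    "alice": {"hr_system", "payroll"},
    "bob": {"wiki"},
}

def authenticate(username, password, credentials):
    """Authentication: 'who are you?' (verify identity)."""
    return credentials.get(username) == password

def authorize(username, resource):
    """Authorization: 'what can you do?' (check access rights)."""
    return resource in PERMISSIONS.get(username, set())

creds = {"alice": "pw1", "bob": "pw2"}
logged_in = authenticate("alice", "pw1", creds)   # the employee logs in
can_access_hr = authorize("alice", "hr_system")   # may she use the HR system?
bob_access_hr = authorize("bob", "hr_system")     # Bob is authenticated too,
                                                  # but not authorized for HR
```

Note that Bob can pass authentication yet still fail authorization, which is exactly why the two concepts must be controlled separately.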

Challenges Solved

In traditional AI agent development, implementing authentication and authorization was a major challenge.
Implementation Complexity:
  • Implement user authentication and agent authentication separately
  • Use different libraries or frameworks for each
  • Time-consuming authentication flow testing and debugging
Credential Management:
  • Access to external services requires API keys or OAuth tokens
  • Hardcoding these credentials in code poses serious security risks
  • Need mechanisms for safe storage and updates
Credential refers to authentication information (passwords, API keys, tokens, etc.).
Hardcoding refers to writing values directly into source code.
User Consent Flow:
  • Need flow for users to consent to external service access
  • OAuth 2.0 implementation is complex, requiring much development time
Identity solves all these challenges.

Identity Architecture

Identity consists of three main components.
Component refers to parts or elements that constitute a system.

Identity Three-Layer Authentication Architecture

Through this three-layer structure, the entire authentication flow from users to agents and from agents to external services can be managed in an integrated manner.

Key Feature Details

Inbound Auth (Entry Authentication)
Inbound Auth handles authentication when users or applications access agents.
AgentCore Identity supports two main authentication mechanisms:
| Authentication Method | Description | Use Case |
| --- | --- | --- |
| IAM SigV4 Authentication | Identity verification using AWS credentials; works automatically without additional configuration | Calls from services or applications within AWS |
| OAuth 2.0 / OpenID Connect | Integration with external ID providers; uses Bearer Tokens | When end users access agents |
OAuth 2.0 is a standard authorization framework on the internet, a mechanism that can safely grant access rights without sharing passwords. OpenID Connect is an authentication layer built on top of OAuth 2.0, specialized in user identity verification.

Inbound Auth can integrate with existing OAuth 2.0/OpenID Connect compliant ID providers. Also integrates with AWS IAM, supporting calls from services or applications within AWS.

ID provider is a system that manages user authentication information and provides authentication services. Includes Okta, Microsoft Entra ID (formerly Azure AD), Amazon Cognito, etc.

Workload Identity Directory
Workload Identity Directory manages agents' own identities.

Workload identity is identity granted to applications or services (workloads), not humans. In AgentCore Identity, agent identities are implemented as specialized workload identities. Agents authenticate with their own identities and operate with appropriate permissions.
Each agent has a unique ARN (Amazon Resource Name: name that uniquely identifies resources within AWS). This ARN allows organizing agents hierarchically and applying group-based access control.
Directory supports hierarchical structures, allowing agents to be grouped according to organizational structure. For example, control like allowing customer support group agents access to customer data while restricting marketing group agents is possible.

Outbound Auth (Exit Authentication)
Outbound Auth handles authentication when agents access external services or AWS services.
| Mode | Description | Use Example |
| --- | --- | --- |
| USER_FEDERATION (3LO) | Access with the user's permissions; requires explicit user consent; uses the OAuth 2.0 Authorization Code Grant flow | Add events to the user's Google Calendar |
| M2M (2LO) | Access with the agent's own permissions using service-level credentials; uses the OAuth 2.0 Client Credentials Grant flow | Access to internal APIs or shared databases |
| AWS Services | Access AWS services using AWS IAM roles | Access to S3 buckets, DynamoDB read/write |
Token Vault
Token Vault is encrypted storage that safely stores OAuth 2.0 tokens and API keys.
Encryption refers to converting data into a format unreadable by third parties.
Token Vault's main functions:
  • Encrypted storage: Encrypted with AWS KMS
  • Access control: Managed with IAM policies and resource policies
  • Automatic token refresh: Automatic updates using OAuth 2.0 refresh tokens
  • Scope-based permission management: Realizes principle of least privilege
  • Audit logs: Records access history with AWS CloudTrail
AWS KMS (Key Management Service) is an AWS service that safely manages encryption keys. Token Vault uses KMS to encrypt and store credentials.
Scope (access permission range)-based permission management realizes principle of least privilege.
Principle of least privilege is a fundamental security principle of granting only minimum necessary permissions.
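The automatic token refresh listed above hinges on a simple expiry decision, which Token Vault makes for you. A sketch with an illustrative five-minute leeway:

```python
import time

def needs_refresh(expires_at, leeway_seconds=300):
    """Refresh an OAuth token once it is within `leeway_seconds`
    of expiry, so requests never go out with a stale token.

    Token Vault automates the refresh (using OAuth 2.0 refresh
    tokens); this sketch shows only the expiry decision, and the
    five-minute leeway is an illustrative choice.
    """
    return time.time() >= expires_at - leeway_seconds

now = time.time()
fresh_token_exp = now + 3600  # expires in an hour: keep using it
stale_token_exp = now + 60    # expires in a minute: refresh now
```

The leeway matters because a token that is valid when checked can expire mid-request; refreshing slightly early avoids that race.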

Actual Authentication Flow Example

Let's look at authentication flow in a specific scenario.
Scenario: Sales representative retrieves customer information from Salesforce
  1. User Authentication: Sales representative logs in with company's Okta
  2. Permission Verification: Identity verifies "Sales Group" permissions
  3. Salesforce Access: Agent obtains OAuth token
  4. Data Retrieval: Agent accesses customer information
  5. Audit: All operations recorded in CloudTrail
In this way, complex authentication flows are handled without developers having to implement them. Sales representatives simply log in with their Okta accounts, and the agent automatically accesses Salesforce with the appropriate permissions to retrieve the necessary customer information. Throughout, developers never have to deal with OAuth implementation or token management.

7. Observability: Observability and Operations

Observability is a service that visualizes agent behavior and supports debugging and optimization. Comprehensively monitors agent internal state, execution processes, and performance indicators.

What is Observability

Observability is the property of being able to observe and understand system internal state from outside. AgentCore Observability provides observability tailored to agent-specific challenges.

Why Observability is Important

AI agents exhibit non-deterministic behavior. That is, the same input can produce different results.
Different behaviors with same input:
  • Different reasoning paths
  • Different tool selections
  • Different results
  • → Traditional logs alone are insufficient
For example, even with the same instruction "contact customer," agents may choose different means like email, Slack, or phone depending on circumstances. To understand this decision-making process and grasp why that choice was made requires detailed observability.

Common Use Cases for Observability

  • Identifying Performance Bottlenecks: Analyze which tool invocations take time
  • Error Cause Investigation: Trace agent reasoning process to identify problem locations
  • Cost Optimization: Monitor token usage and API call counts to reduce costs
  • Resource Usage Tracking: Monitor CPU/memory usage to achieve optimal resource allocation
Bottleneck refers to locations causing processing delays.

Two Perspectives for Understanding Observability

To understand AgentCore Observability, you need to grasp two classification axes.
Perspective 1: By Data Type (What to See)
There are three types of data for observing systems. These are called the "three pillars of observability."
| Data Type | Format | Granularity | Viewpoint | Examples | Main Use |
| --- | --- | --- | --- | --- | --- |
| Metrics | Numeric (time series) | Coarse (aggregate values) | What is happening (What) | Invocation count, error rate, latency | Trend monitoring, alert configuration |
| Logs | Text (events) | Detailed (individual events) | When and what happened (When & What) | Error messages, request content | Detailed investigation, root cause analysis |
| Spans | Structured data (hierarchical) | Intermediate (per operation) | How it was processed (How) | Start/end times, parent-child relationships | Processing flow visualization, bottleneck identification |
Perspective 2: By Granularity (At What Level to See)
Agent behavior is tracked at three hierarchical levels. These form a nested structure (like matryoshka dolls).
Session - Top level
= Complete conversation with user
Duration: Several minutes to 8 hours maximum
├─ Trace 1 - Intermediate level
│  = 1 request-response
│  Duration: Several seconds to minutes
│
│  ├─ Span 1: Parse user input (50ms)
│  ├─ Span 2: Retrieve from Memory (200ms)
│  ├─ Span 3: Tool invocation (1500ms)
│  └─ Span 4: Generate response with LLM (2000ms)
│
├─ Trace 2
│  └─ ...
│
└─ Trace 3
   └─ ...
| Level | Definition | Duration | Identifier | Answers the Question |
| --- | --- | --- | --- | --- |
| Session | Complete conversation with the user | Several minutes up to 8 hours | session.id | Who is this user and what conversation occurred |
| Trace | One request-response | Several seconds to minutes | trace_id | What happened in this one exchange |
| Span | Minimum processing unit (one operation) | Milliseconds to seconds | span_id | How much time this operation took |
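The Session → Trace → Span nesting can be modeled directly. A simplified, hand-rolled sketch (real spans follow the OpenTelemetry data model) that reproduces Trace 1 from the diagram above:

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """One operation, with start/end times in milliseconds."""
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms

@dataclass
class Trace:
    """One request-response, made up of spans."""
    trace_id: str
    spans: list = field(default_factory=list)

    @property
    def duration_ms(self):
        return sum(s.duration_ms for s in self.spans)

# Trace 1 from the diagram above, as structured data
trace = Trace("trace-1", [
    Span("parse user input", 0, 50),
    Span("retrieve from Memory", 50, 250),
    Span("tool invocation", 250, 1750),
    Span("generate response with LLM", 1750, 3750),
])
slowest = max(trace.spans, key=lambda s: s.duration_ms)
```

Because each span carries its own timing, bottleneck identification is just a query over the structure, which is exactly what the Trace View timeline visualizes.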

Combination of Two Perspectives

Importantly, "data type" and "granularity" are independent concepts that can be used in combination.
| Granularity Level | Metrics (Numeric) | Logs (Text) | Spans (Structured) |
| --- | --- | --- | --- |
| Session (entire conversation) | Session count, total processing time | Conversation history, error summary | Overall flow, multiple Traces |
| Trace (one round trip) | Latency, error rate | Request, response | Processing steps, tool invocations |
| Span (one operation) | Execution time, resource amount | Detailed logs, exception info | Parent-child relationships, dependencies |
For example, "Trace's Spans" retain execution records of each operation within 1 request as structured data. "Span's Logs" record detailed events of each operation as text.

Actual Problem-Solving Flow

Let's look at a specific example of solving problems by combining two perspectives.
Problem: Agent responses are slow
| Step | Perspective Used | Check Content | Judgment |
| --- | --- | --- | --- |
| 1 | Metrics (overall) | Average latency degraded from the normal 2s to 5s | Problem exists; detailed investigation needed |
| 2 | Session (conversation level) | Occurs in a specific user's sessions | Problem is not global but tied to a specific pattern |
| 3 | Trace (request level) | The "Salesforce information retrieval" trace is slow | Possible problem with the Salesforce integration |
| 4 | Spans (operation level) | Gateway→Salesforce: 4500ms (normally 500ms) | The Salesforce API call is the bottleneck |
| 5 | Logs (details) | "Salesforce rate limit exceeded" | Rate limiting is the cause; address with retries or caching |
In this way, by progressively investigating from coarse perspective (Metrics) to fine perspective (Spans, Logs), problems can be efficiently identified.

Implementation in AgentCore

AgentCore provides all this data in an integrated manner.
| Data | Provision Method |
| --- | --- |
| Metrics | Automatically output by default in all services |
| Logs | Runtime outputs automatically; Memory/Gateway/Built-in Tools require configuration |
| Spans | Requires agent code instrumentation (using ADOT) |
| Session/Trace/Span | Automatically managed in a hierarchical structure conforming to OpenTelemetry standards |
All data is stored in Amazon CloudWatch and can be visualized in an integrated manner with CloudWatch GenAI Observability Dashboard.

What is Telemetry

Telemetry refers to measurement data collected remotely from systems. Metrics, Logs, and Spans are all types of telemetry data. Observability collects, analyzes, and visualizes this telemetry data.

Visualization Features

Observability visualizes collected data in various ways.

1. CloudWatch GenAI Observability Dashboard
Displays AgentCore metrics, spans, and traces in an integrated manner. This dashboard provides:
  • Agent View: Lists agents and displays metrics, sessions, traces for selected agent
  • Session View: Displays all sessions associated with agent
  • Trace View: Investigate agent traces and span information. Visualize processing flow on timeline
  • Resource Usage Graphs: Visualize CPU and memory usage
2. Error Analysis
Classifies errors and analyzes trends. By visualizing error types and frequency, areas needing improvement can be identified.

3. Performance Monitoring
Monitors performance indicators in real-time. Can check current session count, average response time, error rate, token usage, etc. in real-time.

OpenTelemetry Compatibility

Since AgentCore's telemetry data conforms to OpenTelemetry standards, integration with existing monitoring tools is possible.
Integrable tools:
  • CloudWatch (default)
  • Datadog
  • New Relic
  • Prometheus
  • Grafana
  • Splunk
  • Custom backends
CloudWatch is the default storage destination, but data can also be sent to popular monitoring tools like Datadog, New Relic, Prometheus. AgentCore observability can be added while leveraging existing monitoring infrastructure.

Service Integration Patterns

Each service can be used independently, but combining them allows building powerful agents. Here I introduce representative configuration patterns.

Basic Configuration Patterns

Pattern 1: Minimal Configuration

The simplest configuration uses only Runtime.
| Configuration | Use Cases |
| --- | --- |
| Runtime | Simple LLM responses, stateless processing, prototyping |
This configuration is suitable for simple Q&A or processing that doesn't require context. For example, agents that answer technical questions or summarize text.
Prototyping refers to creating trial products before full-scale development.
When prototyping, I recommend starting with this configuration.

Pattern 2: Context Retention

By adding Memory to Runtime, conversation context can be retained.
| Configuration | Use Cases |
| --- | --- |
| Runtime + Memory | Chatbots, interactive assistants, multi-turn conversations |
This configuration is suitable for applications requiring multi-turn conversations. For example, customer support chatbots, personal assistants, interactive tutorial systems.
Memory eliminates the need for users to repeat content mentioned in previous statements.

Pattern 3: Tool Utilization

By adding Gateway to Runtime, external tools can be integrated.
| Configuration | Use Cases |
| --- | --- |
| Runtime + Gateway | API integration agents, workflow automation, data retrieval/updates |
This configuration is suitable for agents needing to integrate with external services. For example, agents retrieving customer information from CRM systems, agents sending notifications to Slack, agents automating workflows by combining multiple APIs.

CRM (Customer Relationship Management) refers to customer relationship management systems.

Enterprise Configuration Patterns

Pattern 4: Full Features

Enterprise-grade configuration combining all services.
| Component | Role |
| --- | --- |
| Runtime | Execution platform |
| Memory | Context retention |
| Gateway | External tool integration |
| Code Interpreter | Data analysis/computation |
| Browser Tool | Web operations |
| Identity | Authentication/authorization |
| Observability | Overall monitoring |
Use Cases:
  • Enterprise agents
  • Mission-critical applications
  • Compliance requirements response
Mission-critical refers to systems indispensable for business continuity.
This configuration is suitable for advanced agents operated in production environments. Runtime provides execution platform, Memory retains context, Gateway integrates external tools, Code Interpreter and Browser Tool extend execution capabilities, Identity ensures security, and Observability monitors overall.

Specific Integration Examples

Let's look at how each service cooperates in actual use cases.

Example 1: Customer Support Agent

Customer support agents respond to customer inquiries and access external systems as needed.

Configuration:
  • Runtime: Agent execution
  • Memory: Customer history/preference retention
  • Gateway: Salesforce + Zendesk + Internal APIs
  • Identity: User authentication + Salesforce OAuth
  • Observability: Performance monitoring
Operation Flow:
  1. User inquires via chat (Identity: user authentication)
  2. Retrieve past inquiry history (Memory)
  3. Retrieve customer information from Salesforce (Gateway + Identity)
  4. Search the internal knowledge base (Gateway)
  5. Present a solution, creating a Zendesk ticket if needed (Gateway)
  6. Save this exchange (Memory)
  7. Record a trace of all steps (Observability)

Knowledge base refers to a database that systematically organizes knowledge and know-how.
In this flow, each service plays a distinct role: Identity ensures security, Memory provides customer history, Gateway simplifies access to external systems, and Observability monitors overall operations.

Example 2: Data Analysis Agent

Data analysis agents retrieve and analyze data based on user instructions and visualize results.

Configuration:
  • Runtime: Agent execution
  • Memory: Analysis history/user preferences
  • Gateway: S3 + Athena + Internal DB
  • Code Interpreter: Data processing/visualization
  • Identity: AWS IAM authentication
  • Observability: Execution time/resource monitoring
Operation Flow:
  1. User: "Analyze last month's sales" (Runtime)
  2. Retrieve past analysis settings (Memory)
  3. Execute a query on Athena to retrieve data (Gateway)
  4. Data processing, statistical calculations, graph generation (Code Interpreter)
  5. Return analysis results and graphs (Runtime)
  6. Save analysis settings (Memory)
  7. Record query execution time, data volume, and processing time (Observability)
Amazon Athena is a service that executes SQL queries on data stored in S3.
SQL is a language for manipulating databases.
In this flow, data is retrieved from Athena through Gateway, analyzed and visualized with Code Interpreter, user preferences are remembered with Memory, and processing time is monitored with Observability.
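The Code Interpreter step (step 4) boils down to ordinary data processing. As a rough illustration of what that computation looks like, here is a minimal pure-Python sketch; the row structure and field names are invented for this example and are not an AgentCore API:

```python
import statistics

# Hypothetical sales rows as Code Interpreter might receive them after
# the Gateway/Athena step (step 3); field names are illustrative only.
rows = [
    {"day": "2025-06-01", "sales": 1200},
    {"day": "2025-06-02", "sales": 1500},
    {"day": "2025-06-03", "sales": 900},
]

def summarize_sales(rows):
    """Step 4: basic statistical calculations on the retrieved data."""
    values = [r["sales"] for r in rows]
    return {
        "total": sum(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }

print(summarize_sales(rows))  # {'total': 3600, 'mean': 1200, 'max': 1500}
```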

Getting Started with AgentCore

I will explain preparation for starting AgentCore and how to proceed with development.

Prerequisite Knowledge

Knowledge required to start AgentCore is as follows.

Essential Skills

  • Basic AWS knowledge: AWS account creation and an understanding of basic services (EC2, S3, IAM, etc.)
  • Programming language: experience with Python or TypeScript
  • Basic REST API understanding: knowledge of HTTP methods (GET, POST, etc.), status codes, and the JSON format
EC2 (Elastic Compute Cloud) is AWS's virtual server service.
S3 (Simple Storage Service) is AWS's object storage service.
IAM (Identity and Access Management) is AWS's access control service.
HTTP methods specify types of operations on servers (retrieve, create, update, delete, etc.).
Status codes are 3-digit numbers indicating server response results (200=success, 404=not found, etc.).
JSON (JavaScript Object Notation) is a lightweight text format widely used for data exchange.
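The REST API basics above can be shown in a few lines of standard-library Python; the payload here is a made-up example:

```python
import json
from http import HTTPStatus

# A JSON document is plain text; json.loads turns it into Python objects.
payload = '{"method": "GET", "path": "/customers/42"}'
request = json.loads(payload)

# Status codes are 3-digit numbers; the stdlib names the common ones.
assert HTTPStatus.OK == 200          # success
assert HTTPStatus.NOT_FOUND == 404   # resource not found

print(request["method"])  # GET
```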

Recommended Skills

  • Agent frameworks such as LangGraph or CrewAI: agent design becomes smoother
  • Basic OAuth 2.0 understanding: integration with external services becomes easier to understand
  • Container technology (Docker, etc.): deployment to Runtime becomes easier
Docker is a tool for containerizing applications.

Development Flow

Typical AgentCore development flow is as follows.

1. Design Phase

Use Case Definition:
  • What problems should the agent solve
  • Who will use it
  • What functions are needed
Service Selection:
  • Context retention needed → Memory
  • External API integration needed → Gateway
  • Data analysis needed → Code Interpreter
Architecture Design:
  • How each service cooperates
  • Data flow
  • Security requirements

2. Development Phase

  1. Local Development: use frameworks like LangGraph or CrewAI, or implement independently
  2. Containerization: package the agent and its dependencies with a Dockerfile
  3. Deployment: deploy to AgentCore Runtime; this can be automated with the AWS CLI or AWS SDK
AWS CLI (Command Line Interface) is a tool for operating AWS from command line.
AWS SDK is a library for operating AWS services from programs.
Deploy refers to placing developed applications in production environment and making them available.
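As a rough idea of the containerization step, a minimal Dockerfile might look like the following. The file names, base image, port, and start command are all assumptions for illustration; check the AgentCore Runtime documentation for the actual requirements (such as the supported CPU architecture):

```dockerfile
# Minimal sketch; file names and base image are illustrative assumptions.
FROM public.ecr.aws/docker/library/python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY agent.py .
# The port and start command depend on your agent server; adjust as needed.
EXPOSE 8080
CMD ["python", "agent.py"]
```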

3. Integration Phase

Configure each service as needed.
  • Memory: define a Memory Strategy; decide what information to extract as long-term memory
  • Gateway: register necessary tools; configure credentials
  • Identity: integrate ID providers; configure the user authentication flow

4. Test & Optimization Phase

  • Operation Verification: verify behavior with actual data or scenarios; check that expected responses are returned and error handling is appropriate
  • Monitoring: monitor latency, token usage, and error rate with Observability
  • Optimization: identify bottlenecks and improve

5. Production Operation Phase

  • Deployment: deploy to the production environment; create a prod-endpoint separately from the DEFAULT endpoint
  • Continuous Monitoring: monitor continuously with Observability; configure alerts for automatic anomaly detection
  • Continuous Improvement: based on user feedback and metrics, add new tools, optimize prompts, and tune performance
Alert is a mechanism that notifies when anomalies occur.
Performance tuning refers to optimizing system performance.

First Steps

When starting AgentCore, I recommend proceeding in the following order.
  1. Starter Toolkit: experience basic AgentCore operations using the official starter kit
  2. Simple Agent: create a basic agent using only Runtime (develop locally → verify operation → deploy to AWS)
  3. Gradually Add Functions: add Memory to retain conversation context → integrate external tools with Gateway → verify operations with Observability
  4. Full-Scale Agent: strengthen authentication/authorization with Identity → add Code Interpreter or Browser Tool → address enterprise requirements

Resources

I introduce resources helpful for learning AgentCore.

Official Documentation

  • AgentCore Documentation: AWS official documentation with detailed explanations of each service, an API reference, and best practices
  • API Reference: all API endpoints, parameters, and response formats

Sample Code

  • GitHub: amazon-bedrock-agentcore-samples: sample code for various use cases, implementation best practices, and ready-to-use implementation examples
You can develop your own agents by referring to these codes.

Understanding Cost Structure

AgentCore costs are composed of multiple service charges. Understanding each cost factor helps with budget planning and optimization.
Cost Factors to Consider:
  • Runtime: actual CPU time used per session (variables: session duration, concurrent sessions)
  • Memory: number of events stored and searches performed (variables: conversation turns, retention period, search frequency)
  • Gateway: API call frequency and data transfer volume (variables: tool invocations, payload size)
  • Observability: log volume and metrics stored in CloudWatch (variables: log retention, metric resolution, trace volume)
Steps to Estimate Your Costs:
  1. Identify usage patterns: Determine expected monthly sessions, average session duration, and tool usage frequency
  2. Check current pricing: Visit the Amazon Bedrock AgentCore Pricing Page for your AWS region
  3. Use AWS Pricing Calculator: Input your specific requirements into the AWS Pricing Calculator
  4. Add buffer: Include 20-30% buffer for unexpected usage spikes
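The estimation steps above amount to simple arithmetic. The sketch below uses entirely made-up placeholder rates to show the shape of the calculation; it is not based on actual AWS pricing:

```python
# Illustrative only: the rate below is a made-up placeholder, not an AWS price.
# Always check the Amazon Bedrock AgentCore pricing page for real numbers.
sessions_per_month = 10_000
avg_cpu_seconds_per_session = 4.0
placeholder_rate_per_cpu_second = 0.0001  # hypothetical unit price

# Step 1 inputs feed a per-service estimate (here, Runtime compute only).
compute_cost = (sessions_per_month
                * avg_cpu_seconds_per_session
                * placeholder_rate_per_cpu_second)
with_buffer = compute_cost * 1.25  # step 4: add a 20-30% buffer

print(round(compute_cost, 2), round(with_buffer, 2))  # 4.0 5.0
```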
Important: Actual costs vary significantly based on:
  • AWS Region selected
  • Specific usage patterns and peak loads
  • Data retention policies configured
  • Network transfer requirements
  • Choice of LLM models (if using built-in strategies)

Cost Optimization Tips

Implement these strategies to optimize AgentCore costs:
  • Efficient Prompt Design: Minimize LLM token usage through clear, concise instructions and well-structured prompts
  • Strategic Memory Management:
    • Save only essential information as long-term memory
    • Set appropriate TTL (Time To Live) for memory data
    • Regularly review and clean up unused memories
  • Smart Caching Implementation:
    • Cache frequently accessed data
    • Reduce redundant API calls through Gateway
    • Implement response caching where appropriate
  • Proactive Monitoring:
    • Set up CloudWatch alarms for cost thresholds
    • Monitor usage patterns with Observability
    • Identify and address cost spikes early
  • Resource Right-Sizing:
    • Optimize session duration
    • Use appropriate timeout settings
    • Implement efficient error handling to prevent retry storms
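As one concrete example of the caching and TTL tips above, here is a minimal in-process TTL cache sketch; the cache key format is invented for illustration:

```python
import time

class TTLCache:
    """Tiny response cache with a per-entry TTL (Time To Live)."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: clean up and report a miss
            return None
        return value

cache = TTLCache(ttl_seconds=60)
cache.set("crm:customer:42", {"name": "Alice"})  # hypothetical key format
print(cache.get("crm:customer:42"))  # {'name': 'Alice'}
```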

Security and Compliance

AgentCore provides enterprise-grade security.

Security Features

AgentCore ensures security at multiple layers.

Data Protection

  • Data in Transit: encrypted with TLS 1.2/1.3; end-to-end encryption
  • Data at Rest: encrypted with AWS KMS; data saved in Memory is also encrypted, and credentials saved in the Token Vault are likewise encrypted with KMS
TLS (Transport Layer Security) is a protocol that encrypts internet communications.
End-to-end refers to the entire path from start to end of communications.

Access Control

  • IAM-Based Permission Management: strictly control who can access what; role-based access control; principle of least privilege
  • Resource-Based Policies: detailed management of access to each resource; conditional access
  • Principle of Least Privilege: grant only the minimum necessary permissions; review permissions periodically
Role-based access control is a method that grants permissions according to roles.
Policy refers to rules that define access permissions.
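As a sketch of the least-privilege idea, an IAM policy scoped to a single action and resource might look like the following. The action name and ARN format are assumptions for illustration; consult the AgentCore documentation for the actual IAM actions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LeastPrivilegeAgentInvoke",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:InvokeAgentRuntime"],
      "Resource": "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-agent*"
    }
  ]
}
```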

Isolation

  • Complete Session Isolation: each user's sessions are executed in independent microVMs, cannot access each other, and are completely deleted at session end
  • Tenant Isolation: different customers' data is completely isolated, both physically and logically
  • Network Isolation: communication between sessions is blocked; agents can connect to private networks via VPC integration
Tenant refers to units of customers or organizations using a system.

Auditing

  • CloudTrail Recording: records all API calls; tracks who did what and when; helps detect unauthorized access
  • Observability Logs: detailed agent operation logs; security event recording; abnormal behavior detection

Compliance

Compliance refers to adherence to laws and industry regulations. AgentCore addresses major compliance requirements.
  • SOC 1, 2, 3: security audit standards; security and privacy controls audited by third parties
  • HIPAA eligible: can be used for applications handling medical information; appropriate protection of PHI
  • GDPR compliant: appropriately protects EU citizens' data; responds to data deletion requests
  • ISO 27001: international standard for information security management
HIPAA is US medical information protection law.
PHI (Protected Health Information) is protected medical information.
GDPR (General Data Protection Regulation) is EU's general data protection regulation.

Data Privacy

Understanding AgentCore's data usage policy is important.
AgentCore Data Usage Policy:
AgentCore may use and store customer content to improve service experience and performance.
However, such improvements are:
  • ✓ For your own AgentCore usage
  • ✗ Not for other customers
That is, your company's data will not be shared with other companies. AWS may use customer data to improve that customer's service experience, but will not use it for other customers.

Troubleshooting

I introduce frequently encountered problems when using AgentCore and their solutions.

Common Problems and Solutions

Problem 1: Session Timeout

Symptoms:
  • Session terminates during long processing
  • "Session expired" error occurs
Causes:
  • Reached the maximum 8-hour execution time limit
  • Exceeded the 15-minute inactivity limit
Solutions:
  1. Divide processing into smaller units and save each step's results in Memory
  2. Implement periodic keep-alives
  3. Leverage asynchronous processing
Keep-alive refers to periodically sending signals to maintain connections.
Asynchronous processing refers to proceeding to next processing without waiting for processing completion.
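Solution 1 (divide processing and checkpoint results) can be sketched as follows. Here a plain dict stands in for AgentCore Memory; the real Memory API is different:

```python
# Sketch: divide work into steps and checkpoint each result so a new
# session can resume where the previous one timed out.
def run_with_checkpoints(steps, memory):
    start = memory.get("completed_steps", 0)
    for i, step in enumerate(steps):
        if i < start:
            continue  # already done in a previous session
        memory[f"result_{i}"] = step()
        memory["completed_steps"] = i + 1
    return [memory[f"result_{i}"] for i in range(len(steps))]

# Simulate resuming after step 0 already ran in an earlier session.
memory = {"completed_steps": 1, "result_0": "done-earlier"}
steps = [lambda: "step0", lambda: "step1", lambda: "step2"]
print(run_with_checkpoints(steps, memory))  # ['done-earlier', 'step1', 'step2']
```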

Problem 2: Out of Memory Error

Symptoms:
  • "Out of memory" error
  • Crashes during large data processing
Causes:
  • Loaded large amounts of data at once
  • Memory usage exceeded the session limit
Solutions:
  1. Adjust the batch size (divide data into smaller units)
  2. Adopt streaming processing
  3. Delete unnecessary data
Streaming processing refers to continuously processing data little by little.
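Solutions 1 and 2 share one idea: never hold the full dataset in memory at once. A minimal batching sketch:

```python
def in_batches(items, batch_size):
    """Yield items in fixed-size batches instead of loading everything at once."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Only one batch is materialized at a time, keeping peak memory small.
totals = [sum(b) for b in in_batches(range(10), batch_size=4)]
print(totals)  # [6, 22, 17]
```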

Problem 3: Tool Invocation Failures

Symptoms:
  • API calls via Gateway fail
  • Authentication errors or connection errors
Causes:
  • Expired credentials
  • Reached API rate limits
  • Network connection issues
Solutions:
  1. Verify credentials
  2. Implement retry logic
  3. Address rate limits
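Solution 2 (retry logic) is commonly implemented with exponential backoff and jitter, which also helps you stay under rate limits. A generic sketch, not tied to any Gateway API:

```python
import random
import time

def call_with_retries(func, max_attempts=4, base_delay=0.5):
    """Retry transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Delay doubles each attempt, with random jitter to avoid
            # synchronized retry storms from many clients.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            time.sleep(delay)

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(call_with_retries(flaky, base_delay=0.01))  # ok
```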

Problem 4: High Latency

Symptoms:
  • Agent responses are slow
  • User experience is degraded
Causes:
  • Unnecessary tool invocations
  • Inefficient prompts
  • Network latency
Solutions:
  1. Analyze with Observability (identify bottlenecks with traces)
  2. Optimize prompts
  3. Leverage caching
  4. Use parallel processing
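Solution 4 (parallel processing) pays off when tool calls are independent: total latency approaches the slowest call rather than the sum of all calls. A generic sketch using Python's standard thread pool; the two fetch functions are stand-ins for real Gateway calls:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_crm():      # stand-in for an independent Gateway tool call
    return {"customer": "Alice"}

def fetch_tickets():  # stand-in for a second, independent tool call
    return [{"id": 1, "status": "open"}]

# Submit both calls at once; each future completes independently.
with ThreadPoolExecutor(max_workers=2) as pool:
    crm_future = pool.submit(fetch_crm)
    tickets_future = pool.submit(fetch_tickets)
    crm, tickets = crm_future.result(), tickets_future.result()

print(crm["customer"], len(tickets))  # Alice 1
```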

Problem 5: Cost Spikes

Symptoms:
  • Unexpectedly high bills
  • Rapid usage increases
Causes:
  • Infinite loops or retry storms
  • Inefficient Memory searches
  • Excessive API calls
Solutions:
  1. Strengthen cost monitoring (set thresholds with CloudWatch Alarms)
  2. Analyze usage patterns
  3. Set resource limits
Threshold refers to reference values that trigger alerts.
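Solution 1's threshold idea reduces to a simple cumulative check, which services like CloudWatch Alarms perform for you. A toy sketch:

```python
def check_cost_threshold(daily_costs, threshold):
    """Return the first day index whose cumulative cost crosses the threshold,
    or None if the budget is never exceeded."""
    total = 0.0
    for day, cost in enumerate(daily_costs):
        total += cost
        if total > threshold:
            return day  # alert: budget exceeded on this day
    return None

# Cumulative costs: 10, 22, 52 -> the threshold of 40 is crossed on day 2.
print(check_cost_threshold([10.0, 12.0, 30.0, 5.0], threshold=40.0))  # 2
```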

Comparison with Other Solutions

Let's compare AgentCore with other agent development solutions.

Comparison Table

Feature | AgentCore | LangGraph Standalone | CrewAI Standalone | Amazon Bedrock Agent
Infrastructure Management | Unnecessary (fully managed) | Required | Required | Unnecessary (fully managed)
Scaling | Automatic | Manual | Manual | Automatic
Security | Enterprise-grade | Requires implementation | Requires implementation | Enterprise-grade
Observability | Built-in | Requires implementation | Requires implementation | Basic monitoring
Framework | Any framework | LangGraph | CrewAI | Bedrock Agent specific
LLM Selection | Any LLM | Any LLM | Any LLM | Bedrock models focused
Customizability | High (code control) | High | Medium | Medium (GUI-centered)
Development Speed | Medium to high | Low (infrastructure needed) | Low (infrastructure needed) | High (GUI)
Operation Cost | Pay-as-you-go | Infrastructure + operation cost | Infrastructure + operation cost | Pay-as-you-go
Production Ready | Immediately ready | Additional work needed | Additional work needed | Immediately ready
Fully managed refers to service providers handling all infrastructure management.

Application Scenarios for Each Solution

When AgentCore is Suitable

  • Enterprise-grade security needed
  • Want to leverage existing frameworks (LangGraph, CrewAI, etc.)
  • Want to balance customizability and ease of operation
  • Complex agent logic needed
  • Assuming production environment operation

When LangGraph Standalone is Suitable

  • Complete control of complex workflows needed
  • Infrastructure management structure already exists
  • On-premises operation is required

When CrewAI Standalone is Suitable

  • Want to specialize in multi-agent systems
  • Infrastructure management structure already exists
  • On-premises operation is required

When Amazon Bedrock Agent is Suitable

  • Rapid prototyping needed
  • Want to create agents without writing code
  • Standard patterns sufficient
  • Bedrock models alone sufficient

Frequently Asked Questions (FAQ)

I've compiled frequently asked questions and answers about AgentCore.

Q1: Can I use existing frameworks?

A: Yes, AgentCore is framework-agnostic.
All major agent frameworks including LangGraph, CrewAI, Strands Agents work. Additionally, custom agents without frameworks also work without problems.

Q2: Which LLMs can be used?

A: Any LLM can be used.
Supported LLMs:
  • Amazon Bedrock models (Claude, Nova, etc.)
  • Anthropic Claude (direct API)
  • OpenAI GPT (GPT-4, GPT-3.5, etc.)
  • Google Gemini
  • On-premises LLM
  • Company customized models

Q3: Integration with existing AWS services?

A: Seamless integration through Gateway is possible.
Integrable services:
  • Amazon S3 (object storage)
  • Amazon DynamoDB (NoSQL database)
  • Amazon Athena (data analysis)
  • AWS Lambda (serverless functions)
  • Amazon RDS (relational database)
  • Almost all other AWS services

Q4: Separation of development and production environments?

A: This can be achieved using endpoints. By creating a different endpoint for each environment and pointing each at a different version, you can keep environments separate.

Q5: Where are logs stored?

A: Stored in Amazon CloudWatch. Since it's OpenTelemetry compatible, data can also be sent to other monitoring tools (Datadog, New Relic, etc.).

Q6: Maximum session time?

A: 8 hours maximum. Active execution is capped at 8 hours, and an idle session automatically terminates after 15 minutes. If processing longer than 8 hours is needed, divide the work across multiple sessions and save state with Memory.

Q7: How to estimate charges?

A: Use the AWS Pricing Calculator or cost management tools. Calculate expected usage and estimate with the AWS Pricing Calculator. Monitor actual usage with Observability and conduct periodic cost reviews.

Q8: What is MCP Server? Is it required?

A: MCP Server is a development support tool and is not required.
MCP Server is an optional tool for streamlining AgentCore deployment and testing from an IDE (integrated development environment). Benefits include one-command deployment from the IDE and automatic code conversion support. Without it, you can still deploy the traditional way with the AWS CLI, or programmatically via the AWS SDK.
MCP Server is an auxiliary tool for improving development experience and is separate from AgentCore's seven core services.

References:
Amazon Bedrock AgentCore Documentation
Tech Blog with curated related content

Conclusion

Reaffirming AgentCore's Value

Amazon Bedrock AgentCore solves the following challenges in AI agent development.

1. Improved Development Efficiency

Tool integration work that traditionally took weeks is shortened to minutes with AgentCore.
  • Gateway dramatically simplifies API integration
  • Identity eliminates authentication/authorization implementation
  • Framework agnostic leverages existing development assets

2. Enterprise-Grade Security

Meets all security requirements demanded in production environments.
  • Identity provides enterprise-grade authentication/authorization
  • Complete session isolation minimizes data leakage risks
  • Token Vault safely manages credentials
  • Compliance ready (SOC, HIPAA, GDPR, ISO 27001)

3. Improved Operability

Supports stable production environment operation.
  • Serverless architecture reduces infrastructure management burden
  • Auto-scaling automatically responds to request volume fluctuations
  • Observability provides detailed visibility into agent behavior
  • Pay-as-you-go reduces wasted costs

Gradual Understanding of Seven Services

AgentCore's seven services can also be learned gradually.
Gradual Learning Path

1. Runtime (Foundation)
   ↓ Run agents
2. Memory (Retention)
   ↓ Remember conversations
3. Gateway (Integration)
   ↓ Use external tools
4. Code Interpreter (Analysis)
   ↓ Analyze data
5. Browser Tool (Web Ops)
   ↓ Operate web
6. Identity (Security)
   ↓ Operate safely
7. Observability (Monitoring)
   ↓ Continue improving

Finally

AI agent technology is evolving rapidly, but by leveraging integrated platforms like AgentCore, you can efficiently keep up with that evolution.
Understanding basic concepts and service roles explained in this article allows you to smoothly start agent development using AgentCore.
AgentCore is:
  • A platform where anyone can safely build AI agents
  • Balances development efficiency and security
  • Provides enterprise-grade features
In the coming AI era, AgentCore will be a powerful tool.
I recommend starting small and gradually adding functions. By actually working hands-on, you should understand AgentCore's true value.
I hope this article helps in understanding Amazon Bedrock AgentCore.

Written by Hidekazu Konishi