Amazon Bedrock AgentCore Beginner's Guide - AI Agent Development from Basics with Detailed Term Explanations


In recent years, generative AI technology has been evolving rapidly. Amazon Bedrock, Amazon Bedrock Agent, and now Amazon Bedrock AgentCore: keeping up with the new services and frameworks that appear one after another is not easy.
This article is written for those who find it challenging to keep up with the evolution of generative AI technology. While understanding the overall picture of Amazon Bedrock AgentCore, I will also carefully explain basic terms such as "LLM," "agent," and "serverless" one by one. Even if you're starting now, this content will help you catch up effectively.

Features of This Article

Emphasis on Term Explanations
In this article, I provide explanations each time specialized terms commonly used in technical documents appear. For example, "LLM is an artificial intelligence model trained on vast amounts of text data," and "serverless is an execution model where you leave server management to the cloud provider." You can read while filling in knowledge gaps.

Structure That Promotes Gradual Understanding
I start with the basic concepts of AI agents, grasp the overall picture of AgentCore, and then learn each of the seven core services one by one. The structure is designed so that you can understand even without prerequisite knowledge if you read sequentially.

Providing Practical Information
In addition to explaining concepts, I also cover information needed in actual work, such as use cases, pricing models, security, and frequently asked questions.

Target Readers of This Article

  • Engineers who want to learn about AI agents from the basics
  • Developers who want to keep up with generative AI technology trends
  • Architects who want to build AI applications on AWS
  • Technology leaders considering AI adoption in enterprises

How to Read

This article is designed so that understanding deepens gradually by reading from the beginning in order. Although you can skip parts you already know, checking the term explanation sections will help you understand subsequent content smoothly.

What is an AI Agent

Definition of Agent

An AI agent is an AI system that can autonomously judge and act to achieve goals. The difference from traditional chatbots lies in autonomy and tool utilization capability.
Traditional chatbots only responded according to predetermined scenarios. In contrast, AI agents autonomously judge which tools to use and in what order for a given goal, and execute them. For example, in response to the instruction "arrange a business trip," they can autonomously execute a series of tasks such as calendar checking, flight booking, hotel reservation, and application form creation.

Traditional Chatbot vs AI Agent

Feature | Traditional Chatbot | AI Agent
Nature of Response | Simple response | Goal-oriented
Tool Usage | Cannot use tools | Utilizes multiple tools
Processing Flow | Fixed flow | Dynamically reasons and plans
Autonomy | Requires human instructions | Acts autonomously
Typical Example | FAQ responses | "Arrange a business trip" → autonomously uses multiple tools to complete booking
As shown above, traditional types operate along fixed flows, but agent types dynamically judge according to situations and select optimal approaches.

Basic Elements of an Agent

AI agents are composed of the following elements. These elements work together to realize a system that thinks, judges, and acts like a human.

1. LLM (Large Language Model) - Brain
LLM is an artificial intelligence model trained on vast amounts of text data, a system that can understand and generate language like humans. Representative examples include GPT-4 (the foundation model of ChatGPT), Claude, and Gemini.
LLM functions as the brain of the agent, understanding user requirements, thinking logically, and judging what to do next. For example, when receiving the instruction "arrange a business trip to Tokyo next week," it plans a series of flows: first checking available dates on the calendar, then booking flights, and subsequently arranging hotels.

2. Tools - Hands and Feet
Tools are the means for agents to actually act. They can execute various operations such as information retrieval from databases, calling external APIs, file reading and writing, and web page browsing.
API (Application Programming Interface) is an interface for exchanging data between applications. The role division is that LLM judges "what should be done," and tools "actually do it."

3. Memory - Storage Device
Memory is a mechanism that retains past conversation history and learned information. This allows agents to understand conversation context, remember user preferences, and refer to previous decisions. For example, to understand what "that" refers to in the question "When is that?", they need to remember past conversations.

4. Planning - Strategy
Planning is the ability to break down complex goals into small tasks and determine execution order. Agents formulate strategies to achieve goals and decide next actions while evaluating results of each step. If a plan fails, they can also try alternative approaches.

Example of Agent Operation

Let's look at a specific agent's operation.
User Instruction: "Arrange a business trip to Tokyo next week"

Agent Thinking Process:
1. Planning
  • Check calendar availability
  • Search and book flights
  • Search and book hotels
  • Create business trip application
2. Execution
  • Step 1: Execute calendar tool → Confirm availability: 3/15-3/17 available
  • Step 2: Execute flight booking tool → Search and book round-trip flights
  • Step 3: Execute hotel booking tool → Book hotel within company regulations
  • Step 4: Execute internal system tool → Automatically create and submit business trip application
3. Report
Business trip arrangement completed:
- Flight: Departing 3/15 8:00, Returning 3/17 18:00
- Hotel: Tokyo Business Hotel
- Application: Submitted (awaiting approval)
In this process, the agent understands human instructions, selects necessary tools, executes them in appropriate order, and finally reports results. If problems occur midway (e.g., desired flight is fully booked), it can respond flexibly by presenting alternatives.
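The plan-execute-report cycle above can be sketched in code. This is an illustrative toy only: the tool functions and the hard-coded plan are hypothetical stand-ins for what a real agent's LLM would decide dynamically.

```python
# Toy plan-and-execute loop. In a real agent, an LLM derives the plan from
# the goal; here the plan is hard-coded so the control flow is easy to follow.

def check_calendar(dates: str) -> str:
    return f"Available: {dates}"

def book_flight(route: str) -> str:
    return f"Booked round-trip flight: {route}"

def book_hotel(city: str) -> str:
    return f"Booked hotel in {city}"

TOOLS = {
    "calendar": check_calendar,
    "flight": book_flight,
    "hotel": book_hotel,
}

def run_agent(goal: str) -> list[str]:
    # 1. Planning: an LLM would derive these steps from the goal.
    plan = [
        ("calendar", "3/15-3/17"),
        ("flight", "HND round trip"),
        ("hotel", "Tokyo"),
    ]
    # 2. Execution: invoke each tool in order and collect results.
    results = [TOOLS[tool](arg) for tool, arg in plan]
    # 3. Report: summarize the outcome for the user.
    results.append(f"Goal '{goal}' completed in {len(plan)} steps")
    return results

report = run_agent("Arrange a business trip to Tokyo next week")
```

The key structural point is the separation between deciding what to do (the plan) and doing it (the tool dispatch table), which is the same split the article describes between the LLM "brain" and the tool "hands and feet."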

What is Amazon Bedrock AgentCore

Positioning of AgentCore

Amazon Bedrock AgentCore is a development and operation platform for AI agents. It allows developers to focus on business logic by providing infrastructure, tools, and security features necessary for agents in an integrated manner.
Platform refers to a group of environments and services that form the foundation for application development and execution.
AgentCore takes on the "troublesome parts" of agent development. It provides all elements necessary for enterprise-grade systems, such as server management, security implementation, tool integration, and performance monitoring.

Why AgentCore is Needed

Traditional agent development had the following challenges.

Development Challenges:
  • Complex implementation of authentication and authorization (OAuth 2.0, API Key management, etc.)
  • Complexity of tool integration (understanding each API specification, error handling)
  • Framework selection and learning costs
Operation Challenges:
  • Balancing scaling and security
  • Difficulty in detailed monitoring and debugging
  • Infrastructure management burden
Security Challenges:
  • Implementation of data protection and encryption
  • Detailed management of access control
  • Compliance requirements response
AgentCore solves all these challenges in an integrated manner.

Difference from Amazon Bedrock Agent

There are two ways to build agents on AWS: Amazon Bedrock Agent and Amazon Bedrock AgentCore. These two differ in purpose and usage.
Feature | Amazon Bedrock Agent | Amazon Bedrock AgentCore
Development Style | No-code/Low-code | Full code control
Configuration Method | Define agent via GUI | Can use any framework
Structure | Predefined structure | High customization freedom
Prototyping | Rapid prototyping | Complex agent logic
LLM | Amazon Bedrock models focused | Can use any LLM
Application Scenarios | Create agents quickly; standard patterns sufficient; minimize code writing | Complex agent logic; leverage existing frameworks; use specific LLMs; enterprise requirements
GUI (Graphical User Interface) is an interface that allows intuitive operation through graphical screens using mouse operations, etc.
Amazon Bedrock Agent is a service that enables rapid agent creation through no-code and low-code approaches. You define agent behavior via GUI and configure along predefined structures. It's suitable when you want rapid prototyping or when standard patterns are sufficient.
On the other hand, Amazon Bedrock AgentCore is a platform that can be fully controlled by code. You can use any framework or LLM, with very high customization freedom. It's suitable when you need complex agent logic or must respond to enterprise-specific requirements.
Importantly, these two are not mutually exclusive. You can use Bedrock Agent for prototyping and migrate to AgentCore for production environments. This article focuses on the more flexible and powerful AgentCore.

Three Value Propositions of AgentCore

AgentCore provides three main values in agent development.

1. Make Agents More Effective
Feature | Provided Value
Memory | Agents can retain conversation context and utilize past information
Code Interpreter & Browser Tool | Greatly expands executable actions such as data analysis and web operations
Gateway | Existing APIs and services can be easily integrated as agent tools
These features allow building agents that not only provide simple responses but actually "act."

2. Scale Safely
For agents to operate in production environments, scalability and security are essential.
  • Runtime automatically scales through serverless architecture, reducing infrastructure management burden
  • Identity service centralizes user and agent identity management, realizing enterprise-grade authentication and authorization
  • Complete session isolation minimizes data leakage risks
Scalability is the property of being able to flexibly adjust system processing capacity according to load increases and decreases.

3. Trustworthy Operation
In production operation, it's necessary to visualize agent behavior and detect problems early.
  • Observability service traces entire agent execution, collects performance metrics, and records detailed logs
  • This enables problem debugging, performance optimization, and continuous improvement
  • OpenTelemetry compatibility allows integration with existing monitoring tools

Framework and Model Agnostic Design Philosophy

The greatest feature of AgentCore is flexibility not tied to specific technology stacks. You can obtain enterprise-grade security and reliability while leveraging existing development assets.

Supported Frameworks

Framework refers to development libraries and toolkits for building agents. AgentCore works with popular frameworks such as:
Framework | Features
LangGraph | Defines complex workflows as graph structures. Suitable for complex agent logic including conditional branches and loops
CrewAI | Specialized in multi-agent collaboration. Can build agent teams with role division
Strands Agents | Framework specialized in agent building, with a modular design
Custom Implementation | Direct implementation in Python without a framework. Can leverage an existing codebase as is
Workflow refers to a series of task flows or processing procedures.

Supported Models

Model refers to LLM (Large Language Model) that serves as the agent's brain. AgentCore works with various model providers.
Model Provider | Description
Amazon Bedrock models | Various models accessible via Bedrock, such as Anthropic's Claude and Amazon Nova
Anthropic Claude | Direct access to the latest models via the Claude API
Google Gemini | Google's latest LLM, strong in multimodal capabilities
OpenAI | GPT-4 series also available
Open Source/Custom Models | Open source LLMs or custom in-house models can also be used
Multimodal refers to the ability to handle multiple types of data such as text, images, and audio.
This flexibility allows selecting optimal models according to use cases, costs, and performance requirements.

AgentCore Architecture Overview

AgentCore consists of seven core services. These are provided as independent components, and through loosely coupled design, you can select and combine only necessary functions.

Service Configuration Diagram

Note: The following diagram is independently classified and organized by this article to make it easier to understand the roles and relationships of the seven services.

AgentCore Service Architecture and Component Layering

Service Roles and Relationships

In the configuration diagram above, the seven services are grouped into four layers for ease of understanding, but in reality each is an independent component. Below, I explain each service's role.

Monitoring Layer - Observability
  • Observability: Visualizes operations of all services across the board. Collects and analyzes information necessary for operation such as agent reasoning processes, tool invocations, performance indicators, and error information. Runtime, Identity, Memory, Gateway, and Built-in Tools are all monitoring targets.
Execution Platform and Security Layer - Runtime & Identity
  • Runtime: Provides agent code execution environment. Handles agent lifecycle management, scaling, and session management. Forms the foundation of all agent processing.
  • Identity: Provides authentication and authorization functions. Handles user authentication, agent permission management, access control to external services, and overall security. Works closely with Runtime to realize secure execution environment.
Function Enhancement Layer - Memory, Code Interpreter, Browser Tool
  • Memory: Grants memory capability to agents. Persists conversation history, user settings, learned information, and maintains context.
  • Code Interpreter: Provides environment where agents can execute Python code. Enables data analysis, computational processing, graph generation, etc.
  • Browser Tool: Provides web browsing functionality to agents. Can automate web page retrieval, information extraction, form input, etc.
Tool Integration Layer - Gateway
  • Gateway: Simplifies integration with external tools and APIs. Centralizes credential management, request standardization, and error handling. Used by all services (Runtime, Memory, Code Interpreter, Browser Tool) and functions as a bridge connecting internal and external.

Composable Design

The greatest feature of AgentCore is composable design where each service operates independently. This allows flexible combination of services according to use cases.
【Configuration Examples】

Simple Agent
  Runtime only
  └─ Minimal configuration. Basic LLM responses only

Agent with Conversation Memory
  Runtime + Memory
  └─ Context-retained dialogue possible

Agent with External API Integration
  Runtime + Gateway
  └─ Integration with external services

Agent with Data Analysis Capability
  Runtime + Gateway + Code Interpreter
  └─ Combination of data retrieval and analysis

Agent Utilizing Web Information
  Runtime + Gateway + Browser Tool
  └─ Web information retrieval and utilization

Secure External Integration Agent
  Runtime + Gateway + Identity
  └─ External service integration with authentication/authorization

Enterprise Full-Stack Agent
  Runtime + Memory + Gateway + Identity + 
  Code Interpreter + Browser Tool + Observability
  └─ Integrate all functions with complete monitoring
Since you can select only necessary functions and gradually build and expand agents, an approach of starting with minimal configuration in early development stages and adding functions according to requirements is possible.
Service relationships:
  • Runtime is the foundation for everything
  • Identity provides security across the board
  • Gateway handles internal-external integration
  • Memory/Code Interpreter/Browser Tool can be added as independent capabilities
  • Observability monitors everything
Through this flexible combination, you can build gradually on the same platform from simple prototypes to full-scale enterprise systems.
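The composable design above can be modeled as a mandatory Runtime plus an opt-in set of capabilities. The sketch below is purely illustrative: the service names follow the article, but the `AgentConfig` class itself is hypothetical, not an AgentCore API.

```python
# Toy model of AgentCore's composable design: Runtime is always present,
# and optional services are added as requirements grow.

from dataclasses import dataclass, field

OPTIONAL_SERVICES = {
    "Memory", "Gateway", "Identity",
    "Code Interpreter", "Browser Tool", "Observability",
}

@dataclass
class AgentConfig:
    services: set = field(default_factory=set)  # Runtime is always implied

    def add(self, service: str) -> "AgentConfig":
        if service not in OPTIONAL_SERVICES:
            raise ValueError(f"Unknown service: {service}")
        self.services.add(service)
        return self

# Start minimal (a data-analysis agent), expanding later as needed.
agent = AgentConfig().add("Gateway").add("Code Interpreter")
stack = ["Runtime", *sorted(agent.services)]
```

This mirrors the "start with minimal configuration, add functions according to requirements" approach: the full enterprise stack is just the same config with every optional service added.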

Seven Core Services Details

Now let's look in detail at each service's roles and main functions. Below, I explain in an order that prioritizes ease of learning, starting with Runtime as the foundation and gradually expanding functionality.

1. Runtime: Agent Execution Platform

Runtime is a service that safely hosts and executes agents and tools. As an agent execution environment, it manages all aspects of security, scalability, and performance.
Hosting refers to providing an environment for executing applications or services.

What is Serverless

Serverless is an execution model where developers can focus only on code implementation by leaving server management to cloud providers. Operational tasks such as server startup, shutdown, scaling, and patch application become unnecessary. Additionally, since billing is only for actual usage, cost efficiency is excellent.
Patch application refers to performing security updates and bug fixes for software.
AgentCore Runtime provides a serverless environment optimized for agent-specific requirements. While traditional serverless services assume short-term processing, Runtime also supports long-term execution and large-capacity data processing.

Common Use Cases for Runtime

  • Long-duration data analysis processing: Continuous execution up to 8 hours maximum for complex statistical analysis or simulations
  • Multimodal content processing: Agent processing including large-capacity files such as images, audio, and video
  • Complex workflow execution: Long-term task automation coordinating multiple tools

Key Features

Serverless Architecture
Runtime's greatest feature is complete session isolation through microVM. microVM is a lightweight virtual machine technology with faster startup and less overhead than traditional virtual machines.
Overhead refers to incidental processing or resource consumption required beyond the original processing.

Runtime Session Isolation with microVM Architecture

Each user's session is executed in an independent microVM, so they cannot access other users' data or processes at all. When sessions end, microVMs are completely deleted and memory is cleared, minimizing data leakage risks.

Long-Duration Execution Support
An important feature of Runtime is long-duration execution support. While traditional serverless services limit execution time to a few minutes, Runtime supports execution for up to 8 hours. This allows completing agent tasks involving complex reasoning and large numbers of tool invocations.

Large-Capacity Payload Processing
Payload refers to the body portion of data actually transmitted and received. Runtime can handle up to 100MB of data. It also supports agents that process multimodal data such as images, audio, and video.

Fast Cold Start
Runtime minimizes time from request to execution start. Cold start refers to initial waiting time for new instance startup. Runtime is designed to minimize this startup time.

Instance refers to a program or service execution unit.

Basic Concepts

Runtime is a foundational unit that hosts agent code. Version management functionality allows managing code update history. Each version is immutable, unchangeable once created. This allows easy rollback to previous versions when problems occur.

Rollback refers to returning to a previous stable state.

Endpoint is a URL for accessing specific Runtime versions. DEFAULT endpoint always automatically references the latest version. By creating custom endpoints, you can separate development, test, and production environments.

Session is a unit representing a series of dialogues between user and agent. Each session runs in an independent microVM and automatically terminates after a maximum of 8 hours, or after 15 minutes of inactivity. Within a session, conversation context is retained, enabling responses that reference previous statements.
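The session lifetime rules just described (a hard 8-hour cap, plus a 15-minute idle timeout) can be sketched as follows. This is an illustration of the rules only: the `Session` class is hypothetical, and in practice Runtime handles session expiry for you.

```python
# Toy model of the session expiry rules described above.

MAX_LIFETIME_S = 8 * 60 * 60   # hard limit: 8 hours
IDLE_TIMEOUT_S = 15 * 60       # idle limit: 15 minutes of inactivity

class Session:
    def __init__(self, start: float):
        self.start = start
        self.last_activity = start

    def touch(self, now: float) -> None:
        """Record user activity, resetting the idle timer."""
        self.last_activity = now

    def is_expired(self, now: float) -> bool:
        return (now - self.start >= MAX_LIFETIME_S
                or now - self.last_activity >= IDLE_TIMEOUT_S)

s = Session(start=0.0)
s.touch(now=600.0)  # activity after 10 minutes keeps the session alive
alive_after_14_idle_min = not s.is_expired(now=600.0 + 14 * 60)
dead_after_16_idle_min = s.is_expired(now=600.0 + 16 * 60)
```

Note that either condition alone ends the session: even a continuously active session terminates at the 8-hour mark.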

Differences from Traditional Serverless

Item | Traditional Serverless | AgentCore Runtime
Execution Time | Several minutes to ~15 minutes | Up to 8 hours
Payload Size | Several MB to ~10MB | Up to 100MB
State Management | Stateless | Session state retained (up to 8 hours, or 15 minutes of inactivity)
Billing Model | Request count + execution time | Actual CPU usage time only (I/O waits such as waiting on LLM responses are typically excluded)
Isolation Level | Process-level isolation | Complete isolation through microVM
Stateless refers to design that doesn't retain state and processes each request independently.
I/O is abbreviation for Input/Output, referring to data reading/writing and communication.
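The difference in billing models is worth a quick back-of-the-envelope comparison. The numbers and the per-second rate below are entirely made up for illustration; the point is only that agents spend most of their wall-clock time waiting on LLM responses, so CPU-time billing can be much cheaper than duration billing.

```python
# Illustrative arithmetic only: hypothetical rate and durations.
# Agents are I/O-heavy, so CPU-only billing typically covers a small
# fraction of the total request duration.

wall_clock_s = 120.0   # total request duration (mostly waiting on the LLM)
cpu_busy_s = 8.0       # time actually spent computing
rate_per_s = 0.0001    # hypothetical price per billed second

cost_wall_clock = wall_clock_s * rate_per_s   # billed on total duration
cost_cpu_only = cpu_busy_s * rate_per_s       # billed on CPU time only
savings_ratio = cost_cpu_only / cost_wall_clock
```

With these made-up numbers the CPU-only model bills under 10% of what duration-based billing would; real ratios depend entirely on the workload and actual pricing.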

2. Memory: Context Management

Memory is a service that gives agents "memory" and maintains context. By saving past conversation history and learned information and retrieving it at appropriate timing, it provides agents with human-like memory capability.

What is Context

Context is contextual information of conversations or situations. In human conversations, pronouns and abbreviations like "that," "it," and "what I said before" are frequently used, but context is necessary to understand these.
For example, the question "When is that?" cannot be answered without knowing what "that" refers to. If "business trip to Tokyo" was mentioned in past conversation, you can understand "that" refers to Tokyo business trip.

Why Memory is Important

AI agents are stateless (they don't retain state) by default, meaning each request is processed independently with no memory of past interactions. While this wasn't a problem for traditional APIs and web services, for agents it is a fatal limitation.

Conversation without Memory:
  • User: "What's the weather in Seattle?"
  • Agent: "It's sunny in Seattle"
  • User: "How about tomorrow?"
  • Agent: "Tomorrow's weather where?" ← Cannot understand context
Conversation with Memory:
  • User: "What's the weather in Seattle?"
  • Agent: "It's sunny in Seattle"
  • User: "How about tomorrow?"
  • Agent: "Tomorrow's weather in Seattle will be cloudy" ← Understands context
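The two dialogues above can be reproduced with a toy agent. Everything here is a hypothetical stand-in: a real agent passes the history to an LLM, which resolves references like "tomorrow" and the omitted city; this sketch fakes that with a simple city lookup.

```python
# Toy demonstration of why conversation history matters. The weather data
# and the reference-resolution logic are made-up stand-ins for an LLM.

WEATHER = {("Seattle", "today"): "sunny", ("Seattle", "tomorrow"): "cloudy"}
CITIES = {city for city, _ in WEATHER}

def answer(question: str, history: tuple[str, ...] = ()) -> str:
    # Look for a city in the question first, then in the most recent history.
    texts = (question, *reversed(history))
    city = next((c for t in texts for c in CITIES if c in t), None)
    if city is None:
        return "Tomorrow's weather where?"   # context lost
    day = "tomorrow" if "tomorrow" in question.lower() else "today"
    return f"{day}: {WEATHER[(city, day)]}"

no_memory = answer("How about tomorrow?")
with_memory = answer("How about tomorrow?",
                     history=("What's the weather in Seattle?",))
```

Without history the follow-up question is unanswerable; with history the omitted "in Seattle" is recovered, which is exactly what short-term memory provides.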

Two Types of Memory

AgentCore Memory adopts a two-layer structure modeled after human memory systems. This realizes both short-term context understanding and long-term personalization.
Personalization refers to customizing according to individual preferences and characteristics.

Two-Layer Memory Architecture: Short-term and Long-term

This two-layer structure realizes both efficient memory management and natural conversation experience. Short-term memory is directly included in LLM's context window, and long-term memory is retrieved via semantic search as needed.
Context window refers to the range of input text that LLM can process at once.
Semantic search is a technique that searches information based on semantic similarity. It finds semantically similar information rather than keyword matches.

Short-term Memory

Short-term memory retains history of ongoing conversations. This corresponds to human working memory, remembering what you're talking about right now.
Feature | Description
Retention Range | Retains conversation history per turn
Session Scope | Associated with a specific session ID
Retention Period | Automatically deleted after a configurable period (up to 365 days). Events are retained even after the conversation session ends, until the retention period expires
Storage Unit | Saved as an Event
Note: While short-term memory is scoped to individual sessions, the events persist beyond active conversation sessions for the configured retention period. This allows retrieving conversation history even after users end their sessions.

Turn refers to a pair of user statement and agent response.
Conversation example (Session 1):
  • User: "Tell me Tokyo weather"
  • Agent: "It's sunny in Tokyo"
  • User: "What's the temperature?" ← Can omit "in Tokyo"
  • Agent: "25 degrees"
In this way, short-term memory means users do not need to repeat content from earlier statements. The agent understands the conversation flow and fills in omitted information.

Long-term Memory

Long-term memory persists important information that should be retained across sessions. This corresponds to human long-term memory, remembering important facts, preferences, and past experiences.
Persist means storing long-term rather than temporarily.

Generation through Asynchronous Processing:
Long-term memory generation is an asynchronous process executed in background. After conversation data is saved in short-term memory, long-term memory is generated through the following two-stage process:
  • Extraction: Automatically extract important information from interactions with agent
  • Consolidation: Integrate newly extracted information with existing memory and eliminate duplicates
This process efficiently integrates important information without interrupting conversations in real-time. Processing may take over 1 minute to complete.
Feature | Description
Retention Range | LLM automatically extracts and integrates important information
Valid Period | Persisted across sessions (manageable with TTL)
Sharing Scope | Shareable per Actor, per Session, and across multiple agents
Primary Use | Personalization, knowledge accumulation
TTL (Time To Live) is a function to set data expiration period. Data is automatically deleted after specified period elapses.
Examples of extracted information:
  • User preferences: "Prefers window seats" "Morning person"
  • Important facts: "Allergy: Peanuts" "Wheelchair user"
  • Past decisions: "Project X was canceled"
  • Session summaries: "Consulted about Tokyo business trip in March"
Conversation example:
  • Session 1 (3 months ago):
    • User: "I want to book a flight. I prefer window seats"
    • → Long-term: "Preference: Window seats"
  • Session 2 (Today):
    • User: "Book a flight for next week's business trip"
    • Agent: "I'll look for a window seat for you" ← Remembers previous preference
In this way, long-term memory allows agents to remember users' past preferences and information and act based on them. Users no longer need to repeat the same information every time.
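The two-stage extraction/consolidation pipeline described above can be sketched with toy rules. The keyword matching below is a hypothetical stand-in for the LLM that actually decides what is worth remembering; only the two-stage shape (extract, then merge and deduplicate) reflects the article.

```python
# Toy version of long-term memory generation: extract durable facts from a
# conversation, then consolidate them with existing memories.

def extract(turns: list[str]) -> list[str]:
    """Extraction stage: pull preferences out of conversation turns.
    A real system uses an LLM; this keyword rule is illustrative only."""
    memories = []
    for turn in turns:
        if "prefer" in turn.lower():
            detail = turn.split("prefer", 1)[1].strip(" .")
            memories.append(f"Preference: {detail}")
    return memories

def consolidate(existing: list[str], new: list[str]) -> list[str]:
    """Consolidation stage: merge new memories, dropping duplicates."""
    merged = list(existing)
    for memory in new:
        if memory not in merged:
            merged.append(memory)
    return merged

store: list = []
session1 = ["I want to book a flight. I prefer window seats."]
store = consolidate(store, extract(session1))
# A later session repeating the same preference adds nothing new.
store = consolidate(store, extract(["I still prefer window seats."]))
```

Because consolidation deduplicates, repeating a preference across sessions does not bloat the memory store, which is the point of the integration step.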

Memory Strategy

Memory Strategy defines conversion rules from short-term to long-term memory. It instructs LLM what information to extract and how to integrate it.
AgentCore provides three types of Memory Strategies:
  • Built-in strategies: Predefined standard strategies. No configuration needed and optimized for standard use cases
  • Built-in with overrides: Can customize prompts and models based on built-in strategies
  • Self-managed strategies: Completely customize entire memory processing pipeline. Implement custom extraction/consolidation algorithms
Built-in strategies include the following three types:
  • Semantic Memory: Extracts factual information and contextual knowledge
  • User Preference Memory: Extracts user preferences and choices
  • Summary Memory: Generates summaries of conversations within sessions

Difference from RAG (Retrieval-Augmented Generation)

AgentCore Memory and RAG are complementary technologies with different purposes.
Feature | Long-term Memory | RAG
Primary Purpose | Personal context and session continuity | Access to authoritative, up-to-date information
Stored Content | User preferences, past decisions, conversation history, behavior patterns | Documents, technical specifications, policies, domain expertise
Data Source | Session-specific context | Large repositories, databases
Update Frequency | Dynamically updated per conversation | Periodic document updates
Answers the Question | "Who is this user and what happened before?" | "What do trusted sources currently say?"
RAG (Retrieval-Augmented Generation) is a technique that searches related information from large document repositories and incorporates it into LLM responses.
By combining these two, agents can provide both personalized experiences through remembered context and reliable information through real-time knowledge search. Long-term Memory answers "Who is this user and what happened before," while RAG answers "What do trusted sources currently say."
When implementing RAG on AWS, you can use Amazon Bedrock Knowledge Base.
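The complementary split can be sketched as two lookups feeding one context. Everything here is hypothetical: the user memories, the documents, and the keyword-overlap retrieval, which stands in for the embedding-based semantic search a real RAG system (such as Amazon Bedrock Knowledge Base) would use.

```python
# Toy illustration of Memory + RAG: memory answers "who is this user",
# retrieval answers "what do trusted documents say".

USER_MEMORY = {"alice": ["Preference: window seats", "Allergy: peanuts"]}

DOCUMENTS = [
    "Travel rules: economy class for flights under 6 hours",
    "Expense rules: hotel cap of 20,000 JPY per night in Tokyo",
]

def retrieve(query: str, docs: list[str]) -> list[str]:
    """Rank documents by word overlap (a crude stand-in for semantic search)."""
    words = set(query.lower().split())
    scored = [(len(words & set(d.lower().split())), d) for d in docs]
    return [d for score, d in sorted(scored, reverse=True) if score > 0]

def build_context(user: str, query: str) -> dict:
    return {
        "memory": USER_MEMORY.get(user, []),       # personal context
        "knowledge": retrieve(query, DOCUMENTS),   # authoritative sources
    }

ctx = build_context("alice", "hotel rules for tokyo")
```

A real agent would then hand both lists to the LLM, letting personal preferences and authoritative policy jointly shape the answer.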

Common Use Cases for Memory

  • Multi-turn dialogue: Interview bots that collect information through multiple exchanges
  • User profile building: Personal assistants that learn user preferences and characteristics
  • Long-term project management: Project management agents that remember project history and decisions
  • Customer support: Provide continuous support by remembering customer's past inquiry history

3. Gateway: Tool Integration Simplification

Gateway is a service that transforms existing APIs, Lambda functions, and services into tools available to agents. It simplifies complex integration work, allowing tool addition in minutes.

What are API, Lambda, and Smithy

Term | Description
API (Application Programming Interface) | Interface for exchanging data between applications. Example: retrieving or updating customer information via the Salesforce API
AWS Lambda | AWS service that executes code without server management. Runs custom logic serverlessly
Smithy | Language for defining AWS service APIs. API specifications can be generated from Smithy models

Common Use Cases for Gateway

  • Multi-system integration: Workflows integrating multiple services like Salesforce, Slack, Jira
  • Legacy system integration: Make existing internal APIs available to agents
  • Third-party service utilization: Toolize external specialized services (payment, translation, etc.)
  • AWS service integration: Use AWS services like DynamoDB, S3, CloudWatch as tools
Legacy system refers to old but currently operating existing systems.

Challenges Solved by Gateway

When having agents use external tools, traditionally much work was required.
Traditional integration work (several weeks to months per tool):
Step | Time Required | Content
API documentation analysis | Several hours to days | Understand which endpoints to call and how
Authentication implementation | Several days | Implement authentication such as OAuth 2.0 or API keys
Protocol conversion code | Several days | Support different protocols such as REST or GraphQL
Error handling | Several days | Handle various error cases
Retry logic | Several days | Handle temporary failures
Logging and monitoring | Several days | Track tool invocations
Security considerations | Several days | Implement secure credential management
Testing and debugging | Several days | Operation verification and problem fixing
Protocol refers to communication rules and procedures.
Retry refers to automatically reattempting failed processing.
Using Gateway dramatically simplifies this work.
Integration with Gateway (minutes):
  1. Register OpenAPI/Smithy definition, or specify Lambda function
  2. Authentication configuration (a few clicks)
  3. Complete
Official documentation explains integration is possible with "just a few lines of code," shortening work that took weeks to months down to minutes.
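Conceptually, what Gateway does is take an existing callable backend and wrap it in a uniform tool description. The sketch below is not the Gateway API: the `as_tool` helper and the simplified schema are hypothetical, standing in for the MCP tool definitions Gateway generates from OpenAPI/Smithy specs or Lambda targets.

```python
# Toy version of "turn an existing function into an agent tool":
# one uniform wrapper instead of per-API integration code.

def get_customer(customer_id: str) -> dict:
    # Stand-in for a real backend call (e.g. a Lambda function or CRM API).
    return {"id": customer_id, "name": "Example Corp"}

def as_tool(fn, description: str, params: dict) -> dict:
    """Wrap a callable in a minimal, uniform tool definition."""
    return {
        "name": fn.__name__,
        "description": description,
        "input_schema": {"type": "object", "properties": params},
        "invoke": fn,
    }

tool = as_tool(
    get_customer,
    "Look up a customer record by ID",
    {"customer_id": {"type": "string"}},
)
result = tool["invoke"]("C-1001")
```

The agent only ever sees the uniform shape (name, description, schema), which is what lets it use many heterogeneous backends without knowing their individual protocols.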

Gateway's Six Key Capabilities

Gateway provides the following six key capabilities:
Capability | Description
1. Security Guard | Manages OAuth authorization, ensuring only valid users and agents can access tools and resources
2. Translation | Translates agent requests using protocols like MCP into API requests or Lambda invocations, eliminating the need to manage protocol integration or version support
3. Composition | Integrates multiple APIs, functions, and tools into a single MCP endpoint, streamlining agent access
4. Secure Credential Exchange | Handles credential injection for each tool, so agents can seamlessly use tools with different authentication requirements
5. Semantic Tool Selection | Searches available tools to find the best fit for a given context. Thousands of tools can be leveraged while minimizing prompt size and reducing latency
6. Infrastructure Manager | Provides a serverless solution with built-in observability and auditing, eliminating infrastructure management overhead
Through these capabilities, Gateway functions not just as a protocol conversion tool but as an enterprise-grade tool integration platform.

MCP and A2A Protocols

Gateway supports two important protocols.

MCP (Model Context Protocol)
MCP is a standard protocol for AI applications to access tools and resources. An open specification proposed by Anthropic, it functions as a common language between tools and agents.

Gateway Protocol Support: MCP (Model Context Protocol)

MCP benefits:
| Benefit | Description |
| --- | --- |
| Loose Coupling | Tools and agents can be developed and updated independently |
| Compatibility | Compatibility between different frameworks; a tool created once can be reused across multiple agents |
| Reusability | Tools can be shared across the ecosystem |
| Standardization | Works with popular open-source frameworks like CrewAI, LangGraph, LlamaIndex, and Strands Agents |
Loose coupling refers to design where each part of a system is independent with minimal interdependencies.

Gateway operates as an MCP server, automatically converting existing APIs and Lambda functions into MCP-compatible tools. As long as an agent understands the MCP protocol, it can use any tool exposed through Gateway.
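Since MCP builds on JSON-RPC 2.0, a tool invocation is just a structured request. A hedged sketch of what a `tools/call` message looks like on the wire (the tool name and arguments are hypothetical):

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request (MCP is JSON-RPC 2.0 based).

    Sketch of the message an agent sends to an MCP server such as
    Gateway; the tool name and arguments below are hypothetical.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

request = mcp_tool_call(1, "get_customer", {"customer_id": "C-1001"})
wire = json.dumps(request)  # serialized form sent over the transport
```

Because every tool is invoked through this one message shape, an agent needs no tool-specific client code: Gateway translates the call into the underlying REST request or Lambda invocation.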

A2A (Agent-to-Agent Protocol)
A2A is a protocol for multiple agents to communicate and cooperate. Complex tasks can be divided among specialized agents and executed cooperatively.

Gateway Protocol Support: A2A (Agent-to-Agent)

A2A benefits:
| Benefit | Description |
| --- | --- |
| Role Division | Each agent focuses on its specialized field |
| Specialization | Efficiently process complex tasks by combining specialized agents |
| Dynamic Cooperation | Automatically find and cooperate with the necessary agents |
AgentCore Runtime can also be deployed as an A2A server, enabling construction of multi-agent systems.

Gateway Architecture

Gateway is composed of multiple layers.

Gateway Multi-Layer Architecture with Authorization

Through this layered structure, agents use a wide range of tools through a unified interface without having to handle the complex integration details themselves.

Inbound Authorization and Outbound Authorization

Gateway ensures security through a two-stage authorization process.

Inbound Authorization (Entry Authorization)
Inbound Authorization handles authorization when users or agents access Gateway. Controls "who can use this Gateway."
| Authorization Method | Description | Use Case |
| --- | --- | --- |
| JWT (JSON Web Token) | Authenticate with tokens issued by any ID provider (Cognito, Okta, Auth0, etc.) | Access from external users or applications |
| IAM | Authentication using AWS IAM credentials | Access from services or applications within AWS |
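To make the JWT option concrete, here is a minimal sketch of reading a token's claims with the standard library. Note that this deliberately skips signature verification, which a real inbound authorizer (and Gateway) must always perform against the ID provider's keys:

```python
import base64
import json

def jwt_claims(token):
    """Decode a JWT's payload (claims) WITHOUT verifying the signature.

    For illustration only; in production the signature must be
    verified, which is what Gateway's inbound authorization does.
    """
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url; restore padding before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a sample (unsigned) token just to demonstrate the structure
def _b64(obj):
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=")

token = b".".join([
    _b64({"alg": "none"}),                               # header
    _b64({"sub": "user-123", "iss": "https://example-idp"}),  # claims
    b"",                                                 # (empty) signature
]).decode()

decoded = jwt_claims(token)
```

The `sub` (subject) and `iss` (issuer) claims are what an authorizer checks to decide "who is calling" and "which ID provider vouches for them".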

Outbound Authorization (Exit Authorization)
Outbound Authorization handles authorization when Gateway accesses backend tools (targets) on behalf of authenticated users or agents. Controls "what this Gateway can do against external services or resources."
| Authorization Method | Description | Supported Targets |
| --- | --- | --- |
| No Auth (not recommended) | Access targets without authentication; not recommended due to security risks | MCP Server (partial) |
| Gateway Service Role (IAM) | Use the Gateway service role's IAM credentials; authenticates with AWS Signature Version 4 (SigV4) | Lambda, Smithy, AWS services |
| OAuth 2.0 (2LO, 2-legged OAuth) | Access resources with the application's own authority, not the user's; obtains a token using the Client Credentials flow | OpenAPI, MCP Server |
| API Key | Authentication using API keys | OpenAPI |
2-legged OAuth (2LO) is an OAuth flow that authenticates between applications without user intervention. Client applications directly access resources without requiring end-user authentication.
Through this two-stage authorization, "who can use Gateway" and "what Gateway can access" can be controlled separately. This separation is what enables Gateway to provide comprehensive authentication as a fully managed service.
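The 2LO flow mentioned above reduces to a single token request made with the application's own credentials. A sketch of the Client Credentials form body (the client ID, secret, and scope are placeholders):

```python
from urllib.parse import urlencode

def client_credentials_request(client_id, client_secret, scope):
    """Form body for an OAuth 2.0 Client Credentials (2LO) token request.

    This is the kind of request Gateway issues on the agent's behalf;
    all values here are placeholders for illustration.
    """
    return urlencode({
        "grant_type": "client_credentials",  # 2-legged: no end user involved
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })

body = client_credentials_request("my-agent", "placeholder-secret", "orders:read")
```

The body is POSTed to the authorization server's token endpoint; the access token in the response is then attached to outbound API calls, with Gateway storing and refreshing it so the secret never appears in agent code.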

Supported Tool Types

Gateway supports five types of tools.
| Type | Description | Supported Outbound Auth |
| --- | --- | --- |
| OpenAPI | Standard format describing REST API specifications (OpenAPI 3.0/3.1 compatible) | OAuth 2.0, API Key |
| Lambda | Toolize custom logic | IAM (Service Role) |
| Smithy | Use AWS service APIs | IAM (Service Role) |
| MCP Server | Integrate existing MCP servers | No Auth, OAuth 2.0 |
| Integration Provider Templates | Pre-configured templates for popular services | Varies by template |

1-Click Integration (Integration Provider Templates)

Popular tools are pre-configured and immediately available. Supports major business tools like Salesforce, Slack, Microsoft 365, SAP, Jira. Can integrate in minutes from AWS console.

Semantic Tool Selection

One of Gateway's most powerful features is Semantic Tool Selection, which automatically finds the appropriate tools from among thousands.
Benefits of semantic search:
  • Scalability: Automatically select appropriate tools from thousands
  • Prompt size reduction: No need to include all tool details in prompt
  • Latency reduction: Present only related tools to agent
  • Dynamic tool discovery: Agents find optimal tools according to tasks
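Gateway performs this selection with embedding-based semantic search over tool descriptions. The ranking idea can be illustrated with a toy word-overlap version (the tool catalog below is hypothetical):

```python
import math
from collections import Counter

# Hypothetical tool catalog; Gateway maintains this for you
TOOLS = {
    "create_ticket": "create a support ticket in the helpdesk system",
    "get_weather": "retrieve the current weather forecast for a city",
    "send_invoice": "generate and send an invoice to a customer",
}

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def select_tool(query):
    """Pick the tool whose description best matches the query.

    Gateway uses learned embeddings; this word-overlap sketch
    just illustrates the similarity-ranking idea.
    """
    qv = vectorize(query)
    return max(TOOLS, key=lambda name: cosine(qv, vectorize(TOOLS[name])))

best = select_tool("open a ticket for this support request")
```

Only the top-ranked tools need to be placed in the prompt, which is how semantic selection keeps prompt size and latency down even with thousands of registered tools.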

Practical Example

Multi-system Integration Scenario: Customer Support Workflow
  1. Receive inquiry from customer via Slack
  2. Agent retrieves customer information from Salesforce via Gateway
  3. Create support ticket in Zendesk via Gateway
  4. Create technical task in Jira via Gateway
  5. After processing completion, send notification to Slack via Gateway
In this scenario, the agent can interact with four different services (Slack, Salesforce, Zendesk, and Jira) by connecting to just one Gateway endpoint. Gateway handles each service's credentials, API differences, and protocol conversions.

4. Code Interpreter: Safe Code Execution Environment

Code Interpreter is a service that provides an environment where agents can safely execute code. It can safely execute various tasks requiring code execution, such as data analysis, complex calculations, and file processing.

Why Code Interpreter is Needed

For AI agents to become truly useful, they need to be able to execute actual operations beyond just conversation.
Examples of necessary operations:
| Category | Specific Examples |
| --- | --- |
| Data Analysis | CSV file reading, statistical calculations, graph generation and visualization |
| Complex Calculations | Financial model simulation, scientific computation execution |
| File Processing | PDF parsing, image conversion, data formatting |
| API Response Processing | JSON data parsing, complex transformations |
However, arbitrary code execution carries security risks. There's potential for malicious code to infiltrate systems or access other users' data. Code Interpreter solves this challenge.

What is Sandbox

Sandbox is a safe execution environment completely isolated from external environments. Like children playing in a sandbox, it refers to an environment where you can safely experiment without affecting the outside. Code executes only within sandbox, unable to access external systems or data.

Common Use Cases for Code Interpreter

  • Data Science Tasks: Complex data analysis using pandas, numpy
  • Report Generation: Automatically create graphs and charts from data
  • Data Cleansing: Detection and correction of invalid data
  • Large-Scale Data Processing: Efficiently process datasets up to 5GB stored in S3
Data cleansing refers to correcting data errors and inconsistencies to improve quality.

Architecture

Code Interpreter creates independent sandbox environments for each session.

Code Interpreter Sandbox Isolation Architecture

Through this structure, each session is executed in a completely independent environment without affecting each other.

Key Features

Serverless Architecture
Code Interpreter's greatest strength is its fully managed serverless environment. Developers can focus entirely on code execution without worrying about infrastructure management.
Each session's isolation:
  • Session 1: Python execution → Independent Sandbox A
  • Session 2: Python execution → Independent Sandbox B
  • No mutual influence, no data leakage
Long-Duration Execution Support
Code Interpreter provides a 15-minute execution time by default, extendable to a maximum of 8 hours as needed. This allows completing tasks such as complex data analysis and large-volume data processing.

Large-Capacity Payload Processing
Code Interpreter supports the following file sizes:
  • Inline upload: up to 100MB
  • S3-based upload: Up to 5GB (via terminal commands)
Also supports agents that process multimodal data such as images, audio, video, or large-scale datasets.
Network Configuration
Code Interpreter supports three network modes:
| Mode | Description | Use Case |
| --- | --- | --- |
| Sandbox | Completely isolated environment; no external network access (S3 access possible) | Most secure choice; when handling sensitive data |
| Public | Internet access possible | When integration with external APIs or services is needed |
| VPC | Can access private resources within a VPC | When access to internal databases or internal APIs is needed |
VPC (Virtual Private Cloud) is a virtual private network created on AWS.

Supported Languages

Code Interpreter supports major programming languages.
| Language | Features |
| --- | --- |
| Python 3.12 | Optimal for data science and machine learning; 100+ libraries pre-installed including pandas, numpy, matplotlib, scikit-learn, torch |
| TypeScript | Type-safe script execution |
| JavaScript | Suitable for lightweight processing |
Type-safe refers to strictly managing variable and data types to prevent errors.

Practical Examples

Data Analysis Scenario: Analyze CSV file and create graphs
  1. User: "Analyze sales data"
  2. Agent: Retrieve CSV (up to 100MB inline, or up to 5GB via S3)
  3. Code Interpreter:
    • Load data with pandas
    • Statistical calculations (mean, median, standard deviation, etc.)
    • Generate graphs with matplotlib
  4. Agent: Return graphs and summary
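Steps 2-3 of the scenario can be sketched without any AWS dependency. Here the CSV is an in-memory stand-in for the uploaded file, and the statistics come from the standard library rather than pandas:

```python
import csv
import io
import statistics

# Tiny in-memory stand-in for the sales CSV the agent would retrieve
SALES_CSV = """month,revenue
Jan,120
Feb,150
Mar,90
Apr,160
"""

rows = list(csv.DictReader(io.StringIO(SALES_CSV)))
revenue = [float(r["revenue"]) for r in rows]

# The statistical summary the sandbox would return to the agent
summary = {
    "mean": statistics.mean(revenue),
    "median": statistics.median(revenue),
    "stdev": round(statistics.stdev(revenue), 2),
}
```

Inside the real sandbox the same step would typically use pandas for loading and matplotlib for the graphs, as listed above.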
Complex Calculation Scenario: Financial Model Simulation
  1. User: "Compare investment scenarios"
  2. Agent: Collect parameters
  3. Code Interpreter:
    • Compound interest calculations
    • Risk analysis (Monte Carlo simulation)
    • Scenario comparison
    • Visualization (graph generation)
  4. Agent: Present recommendations
Compound interest calculation is a calculation method where interest accrues not only on principal but also on interest.
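The compound interest step is simple enough to show directly. A sketch of the future-value formula FV = P(1 + r/n)^(nt), compared with simple interest:

```python
def compound(principal, annual_rate, years, compounds_per_year=1):
    """Future value with compound interest: P * (1 + r/n)^(n*t)."""
    n = compounds_per_year
    return principal * (1 + annual_rate / n) ** (n * years)

# 1,000,000 invested at 5% per year for 10 years
fv_simple = 1_000_000 * (1 + 0.05 * 10)             # simple interest
fv_compound = round(compound(1_000_000, 0.05, 10))  # interest also earns interest
```

The gap between the two results is exactly the "interest on interest" the definition above describes; Monte Carlo risk analysis layers random rate paths on top of this same calculation.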

5. Browser Tool: Cloud Browser Execution Environment

Browser Tool is a service that provides an environment where agents can safely interact with websites. It can operate web pages using an actual browser and retrieve information.

What is Web Scraping

Web Scraping is a technology for automatically extracting information from web pages. However, traditional scraping has limitations.
Traditional scraping only handled static HTML retrieval. That is, it analyzes HTML code sent from servers as-is.
HTML is a language that describes web page structure.
However, since many modern websites dynamically generate content using JavaScript, complete information cannot be obtained from static HTML alone.

Common Use Cases for Browser Tool

  • Price Monitoring: Automatically track e-commerce site price fluctuations
  • Competitive Analysis: Periodically collect competitor site product information and content
  • Form Submission Automation: Automate routine web input tasks
  • Web Application Testing: Test execution in secure environment
  • Online Resource Access: Access to web-based services and data
  • Screenshot Capture: Visual recording of web pages

Difference from Web Scraping

| Item | Web Scraping | Browser Tool |
| --- | --- | --- |
| Retrieval Method | Static HTML retrieval | Actual browser operation |
| JavaScript | Not executed | Executed |
| Dynamic Content | Difficult to retrieve | Supported |
| Authentication | Difficult | Login and operation possible |
| Visual Understanding | Not possible | Possible through screenshots |
JavaScript is a programming language that adds dynamic functionality to web pages. Many modern websites use JavaScript to dynamically display content according to user operations.
Browser Tool uses actual browsers, so JavaScript executes and dynamically generated content can be retrieved.

Browser Tool Features

Browser Tool runs actual Chromium-based browsers in the cloud.
Chromium is the open-source browser engine that forms the foundation of Google Chrome. Browser Tool uses Chromium-based browsers to render and display web pages just as a real user's browser would.
Rendering refers to interpreting code like HTML and CSS and displaying it as a visual web page.

Main functions:
| Function | Description |
| --- | --- |
| Page Navigation | Open URLs, click links |
| Form Input | Input into text boxes, select from dropdowns |
| Button Clicks | Click buttons or links |
| Wait & Scroll | Wait for page loading, scroll to display elements |
| Screenshot Capture | Save images of the entire page or specific elements |
| Cookie Management | Maintain login state |
Cookie is small data that websites store in browsers, retaining information like login state.

Network Configuration

Browser Tool currently supports only Public network mode. Access to external websites and internet resources is possible.

Security and Scaling

Browser Tool realizes both security and scalability.

Isolated Execution
Each user's browser sessions are completely isolated.
Session isolation:
  • User A → Browser Instance A (completely isolated)
  • User B → Browser Instance B (completely isolated)
  • User C → Browser Instance C (completely isolated)
Each instance's characteristics:
  • Independent cookies/sessions
  • Independent cache
  • Independent CPU, memory, filesystem resources (microVM)
  • Completely deleted at session end
Cache is a mechanism that temporarily stores once-retrieved data and reuses it to speed up processing.

microVM is a lightweight virtual machine, with each session executed in a dedicated microVM. This makes it impossible for one user's tool invocations to access another user's session data. When sessions complete, microVMs completely terminate and memory is sanitized (erased), eliminating cross-session data leakage risks.

Auto Scaling
Browser Tool automatically increases and decreases browser instances according to request volume.
  • No infrastructure management needed
  • Usage-based billing prevents wasted costs
  • Maintains stable performance even during peaks
  • Up to 500 concurrent sessions possible

Practical Examples

E-commerce Price Monitoring Scenario:
  1. Access target site
  2. Input product name in search form
  3. Execute search
  4. Extract price information from results page
  5. Compare with price history
  6. Notify if price drops
In this case, the agent uses Browser Tool to actually operate the website. It can automate operations humans would perform, such as input into search forms, clicking search buttons, and analyzing results pages.
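The decision logic in steps 5-6 is plain code once Browser Tool has extracted the current price. A sketch with an illustrative notification threshold:

```python
def price_alert(history, current_price, threshold=0.05):
    """Notify if the price dropped by at least `threshold` (5% here)
    compared with the most recent recorded price.

    Pure-logic sketch of steps 5-6; the agent would obtain
    `current_price` by operating the site via Browser Tool.
    """
    if not history:
        return False
    last = history[-1]
    drop = (last - current_price) / last
    return drop >= threshold

# Recorded prices from previous monitoring runs (illustrative)
history = [19800, 19800, 19600]
should_notify = price_alert(history, 18500)  # roughly a 5.6% drop
```

Separating the extraction (browser automation) from the decision (plain code) keeps the monitoring logic easy to test without a live site.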

Competitive Research Scenario:
  1. Access competitor site
  2. Browse product catalog
  3. Collect detailed information for each product
  4. Save screenshots
  5. Save as structured data
  6. Generate report
Structured data refers to data organized in a defined format.
In this way, Browser Tool can also handle information collection across multiple pages and access to sites requiring login.

6. Identity: Identity and Access Management

Identity is a service that centrally manages authentication and authorization for agents and humans. Controls who can access agents and what agents can access.

What are Authentication and Authorization

Authentication confirms "Who are you?" Verifies identity through username and password, multi-factor authentication, biometric authentication, etc.
Multi-factor authentication is a mechanism that verifies identity using multiple factors beyond passwords, such as SMS codes or fingerprint authentication.
Authorization determines "What can you do?" Controls which resources authenticated users or systems can access.
For example, an employee logging in is authentication, and deciding whether that employee can access the HR system is authorization.
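The distinction can be captured in a few lines. A toy sketch where the credential store and permission table are hypothetical stand-ins for an ID provider and a policy store:

```python
# Hypothetical permission table; a real system queries an ID provider
# (authentication) and a policy store (authorization) such as Identity.
PERMISSIONS = {
    "alice": {"hr_system", "payroll"},
    "bob": {"wiki"},
}

def authenticate(username, password, credentials):
    """Authentication: 'who are you?' (verify identity)."""
    return credentials.get(username) == password

def authorize(username, resource):
    """Authorization: 'what can you do?' (check access rights)."""
    return resource in PERMISSIONS.get(username, set())

creds = {"alice": "pw1", "bob": "pw2"}
logged_in = authenticate("alice", "pw1", creds)   # the employee logs in
can_access_hr = authorize("alice", "hr_system")   # may she use the HR system?
bob_access_hr = authorize("bob", "hr_system")     # Bob is authenticated too,
                                                  # but not authorized for HR
```

Note that Bob can pass authentication yet still fail authorization, which is exactly why the two concepts must be controlled separately.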

Challenges Solved

In traditional AI agent development, implementing authentication and authorization was a major challenge.
Implementation Complexity:
  • Implement user authentication and agent authentication separately
  • Use different libraries or frameworks for each
  • Time-consuming authentication flow testing and debugging
Credential Management:
  • Access to external services requires API keys or OAuth tokens
  • Hardcoding these credentials in code poses serious security risks
  • Need mechanisms for safe storage and updates
Credential refers to authentication information (passwords, API keys, tokens, etc.).
Hardcoding refers to writing values directly into source code.
User Consent Flow:
  • Need flow for users to consent to external service access
  • OAuth 2.0 implementation is complex, requiring much development time
Identity solves all these challenges.

Identity Architecture

Identity consists of three main components.
Component refers to parts or elements that constitute a system.

Identity Three-Layer Authentication Architecture

Through this three-layer structure, the entire authentication flow from users to agents and from agents to external services can be managed in an integrated manner.

Key Feature Details

Inbound Auth (Entry Authentication)
Inbound Auth handles authentication when users or applications access agents.
AgentCore Identity supports two main authentication mechanisms:
| Authentication Method | Description | Use Case |
| --- | --- | --- |
| IAM SigV4 Authentication | Identity verification using AWS credentials; works automatically without additional configuration | Calls from services or applications within AWS |
| OAuth 2.0 / OpenID Connect | Integration with external ID providers; uses Bearer Tokens | When end users access agents |
OAuth 2.0 is a standard authorization framework on the internet, a mechanism that can safely grant access rights without sharing passwords. OpenID Connect is an authentication layer built on top of OAuth 2.0, specialized in user identity verification.

Inbound Auth can integrate with existing OAuth 2.0/OpenID Connect compliant ID providers. Also integrates with AWS IAM, supporting calls from services or applications within AWS.

ID provider is a system that manages user authentication information and provides authentication services. Includes Okta, Microsoft Entra ID (formerly Azure AD), Amazon Cognito, etc.

Workload Identity Directory
Workload Identity Directory manages agents' own identities.

Workload identity is identity granted to applications or services (workloads), not humans. In AgentCore Identity, agent identities are implemented as specialized workload identities. Agents authenticate with their own identities and operate with appropriate permissions.
Each agent has a unique ARN (Amazon Resource Name: name that uniquely identifies resources within AWS). This ARN allows organizing agents hierarchically and applying group-based access control.
Directory supports hierarchical structures, allowing agents to be grouped according to organizational structure. For example, control like allowing customer support group agents access to customer data while restricting marketing group agents is possible.

Outbound Auth (Exit Authentication)
Outbound Auth handles authentication when agents access external services or AWS services.
| Mode | Description | Use Example |
| --- | --- | --- |
| USER_FEDERATION (3LO) | Access with the user's permissions; requires explicit user consent; uses the OAuth 2.0 Authorization Code Grant flow | Add events to the user's Google Calendar |
| M2M (2LO) | Access with the agent's own permissions using service-level credentials; uses the OAuth 2.0 Client Credentials Grant flow | Access to internal APIs or shared databases |
| AWS Services | Access AWS services using AWS IAM roles | Access to S3 buckets, DynamoDB read/write |
Token Vault
Token Vault is encrypted storage that safely stores OAuth 2.0 tokens and API keys.
Encryption refers to converting data into a format unreadable by third parties.
Token Vault's main functions:
  • Encrypted storage: Encrypted with AWS KMS
  • Access control: Managed with IAM policies and resource policies
  • Automatic token refresh: Automatic updates using OAuth 2.0 refresh tokens
  • Scope-based permission management: Realizes principle of least privilege
  • Audit logs: Records access history with AWS CloudTrail
AWS KMS (Key Management Service) is an AWS service that safely manages encryption keys. Token Vault uses KMS to encrypt and store credentials.
Scope (access permission range)-based permission management realizes principle of least privilege.
Principle of least privilege is a fundamental security principle of granting only minimum necessary permissions.
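The automatic token refresh listed above hinges on a simple expiry decision, which Token Vault makes for you. A sketch with an illustrative five-minute leeway:

```python
import time

def needs_refresh(expires_at, leeway_seconds=300):
    """Refresh an OAuth token once it is within `leeway_seconds`
    of expiry, so requests never go out with a stale token.

    Token Vault automates the refresh (using OAuth 2.0 refresh
    tokens); this sketch shows only the expiry decision, and the
    five-minute leeway is an illustrative choice.
    """
    return time.time() >= expires_at - leeway_seconds

now = time.time()
fresh_token_exp = now + 3600  # expires in an hour: keep using it
stale_token_exp = now + 60    # expires in a minute: refresh now
```

The leeway matters because a token that is valid when checked can expire mid-request; refreshing slightly early avoids that race.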

Actual Authentication Flow Example

Let's look at authentication flow in a specific scenario.
Scenario: Sales representative retrieves customer information from Salesforce
  1. User Authentication: Sales representative logs in with company's Okta
  2. Permission Verification: Identity verifies "Sales Group" permissions
  3. Salesforce Access: Agent obtains OAuth token
  4. Data Retrieval: Agent accesses customer information
  5. Audit: All operations recorded in CloudTrail
In this way, complex authentication flows are handled without developers having to implement them. Sales representatives simply log in with their Okta accounts, and the agent automatically accesses Salesforce with the appropriate permissions to retrieve the necessary customer information. Throughout, developers never have to deal with OAuth implementation or token management.

7. Observability: Observability and Operations

Observability is a service that visualizes agent behavior and supports debugging and optimization. Comprehensively monitors agent internal state, execution processes, and performance indicators.

What is Observability

Observability is the property of being able to observe and understand system internal state from outside. AgentCore Observability provides observability tailored to agent-specific challenges.

Why Observability is Important

AI agents exhibit non-deterministic behavior. That is, the same input can produce different results.
Different behaviors with same input:
  • Different reasoning paths
  • Different tool selections
  • Different results
  • → Traditional logs alone are insufficient
For example, even with the same instruction "contact customer," agents may choose different means like email, Slack, or phone depending on circumstances. To understand this decision-making process and grasp why that choice was made requires detailed observability.

Common Use Cases for Observability

  • Identifying Performance Bottlenecks: Analyze which tool invocations take time
  • Error Cause Investigation: Trace agent reasoning process to identify problem locations
  • Cost Optimization: Monitor token usage and API call counts to reduce costs
  • Resource Usage Tracking: Monitor CPU/memory usage to achieve optimal resource allocation
Bottleneck refers to locations causing processing delays.

Two Perspectives for Understanding Observability

To understand AgentCore Observability, you need to grasp two classification axes.
Perspective 1: By Data Type (What to See)
There are three types of data for observing systems. These are called the "three pillars of observability."
| Data Type | Format | Granularity | Viewpoint | Examples | Main Use |
| --- | --- | --- | --- | --- | --- |
| Metrics | Numeric (time series) | Coarse (aggregate values) | What is happening (What) | Invocation count, error rate, latency | Trend monitoring, alert configuration |
| Logs | Text (events) | Detailed (individual events) | When and what happened (When & What) | Error messages, request content | Detailed investigation, root cause analysis |
| Spans | Structured data (hierarchical) | Intermediate (per operation) | How it was processed (How) | Start/end times, parent-child relationships | Processing flow visualization, bottleneck identification |
Perspective 2: By Granularity (At What Level to See)
Agent behavior is tracked at three hierarchical levels. These form a nested structure (like matryoshka dolls).
Session - Top level
= Complete conversation with user
Duration: Several minutes to 8 hours maximum
├─ Trace 1 - Intermediate level
│  = 1 request-response
│  Duration: Several seconds to minutes
│
│  ├─ Span 1: Parse user input (50ms)
│  ├─ Span 2: Retrieve from Memory (200ms)
│  ├─ Span 3: Tool invocation (1500ms)
│  └─ Span 4: Generate response with LLM (2000ms)
│
├─ Trace 2
│  └─ ...
│
└─ Trace 3
   └─ ...
| Level | Definition | Duration | Identifier | Answers the Question |
| --- | --- | --- | --- | --- |
| Session | Complete conversation with the user | Several minutes up to 8 hours | session.id | Who is this user and what conversation occurred |
| Trace | One request-response | Several seconds to minutes | trace_id | What happened in this one exchange |
| Span | Minimum processing unit (one operation) | Milliseconds to seconds | span_id | How much time this operation took |
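The Session → Trace → Span nesting can be modeled directly. A simplified, hand-rolled sketch (real spans follow the OpenTelemetry data model) that reproduces Trace 1 from the diagram above:

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """One operation, with start/end times in milliseconds."""
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms

@dataclass
class Trace:
    """One request-response, made up of spans."""
    trace_id: str
    spans: list = field(default_factory=list)

    @property
    def duration_ms(self):
        return sum(s.duration_ms for s in self.spans)

# Trace 1 from the diagram above, as structured data
trace = Trace("trace-1", [
    Span("parse user input", 0, 50),
    Span("retrieve from Memory", 50, 250),
    Span("tool invocation", 250, 1750),
    Span("generate response with LLM", 1750, 3750),
])
slowest = max(trace.spans, key=lambda s: s.duration_ms)
```

Because each span carries its own timing, bottleneck identification is just a query over the structure, which is exactly what the Trace View timeline visualizes.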

Combination of Two Perspectives

Importantly, "data type" and "granularity" are independent concepts that can be used in combination.
| Granularity Level | Metrics (Numeric) | Logs (Text) | Spans (Structured) |
| --- | --- | --- | --- |
| Session (entire conversation) | Session count, total processing time | Conversation history, error summary | Overall flow, multiple Traces |
| Trace (one round trip) | Latency, error rate | Request, response | Processing steps, tool invocations |
| Span (one operation) | Execution time, resource amount | Detailed logs, exception info | Parent-child relationships, dependencies |
For example, "Trace's Spans" retain execution records of each operation within 1 request as structured data. "Span's Logs" record detailed events of each operation as text.

Actual Problem-Solving Flow

Let's look at a specific example of solving problems by combining two perspectives.
Problem: Agent responses are slow
| Step | Perspective Used | Check Content | Judgment |
| --- | --- | --- | --- |
| 1 | Metrics (overall) | Average latency degraded from the normal 2s to 5s | Problem exists; detailed investigation needed |
| 2 | Session (conversation level) | Occurs in a specific user's sessions | Problem is not global but tied to a specific pattern |
| 3 | Trace (request level) | The "Salesforce information retrieval" trace is slow | Possible problem with the Salesforce integration |
| 4 | Spans (operation level) | Gateway→Salesforce: 4500ms (normally 500ms) | The Salesforce API call is the bottleneck |
| 5 | Logs (details) | "Salesforce rate limit exceeded" | Rate limiting is the cause; address with retries or caching |
In this way, by progressively investigating from coarse perspective (Metrics) to fine perspective (Spans, Logs), problems can be efficiently identified.

Implementation in AgentCore

AgentCore provides all this data in an integrated manner.
| Data | Provision Method |
| --- | --- |
| Metrics | Automatically output by default in all services |
| Logs | Runtime outputs automatically; Memory/Gateway/Built-in Tools require configuration |
| Spans | Requires agent code instrumentation (using ADOT) |
| Session/Trace/Span | Automatically managed in a hierarchical structure conforming to OpenTelemetry standards |
All data is stored in Amazon CloudWatch and can be visualized in an integrated manner with CloudWatch GenAI Observability Dashboard.

What is Telemetry

Telemetry refers to measurement data collected remotely from systems. Metrics, Logs, and Spans are all types of telemetry data. Observability collects, analyzes, and visualizes this telemetry data.

Visualization Features

Observability visualizes collected data in various ways.

1. CloudWatch GenAI Observability Dashboard
Displays AgentCore metrics, spans, and traces in an integrated manner. This dashboard provides:
  • Agent View: Lists agents and displays metrics, sessions, traces for selected agent
  • Session View: Displays all sessions associated with agent
  • Trace View: Investigate agent traces and span information. Visualize processing flow on timeline
  • Resource Usage Graphs: Visualize CPU and memory usage
2. Error Analysis
Classifies errors and analyzes trends. By visualizing error types and frequency, areas needing improvement can be identified.

3. Performance Monitoring
Monitors performance indicators in real-time. Can check current session count, average response time, error rate, token usage, etc. in real-time.

OpenTelemetry Compatibility

Since AgentCore's telemetry data conforms to OpenTelemetry standards, integration with existing monitoring tools is possible.
Integrable tools:
  • CloudWatch (default)
  • Datadog
  • New Relic
  • Prometheus
  • Grafana
  • Splunk
  • Custom backends
CloudWatch is the default storage destination, but data can also be sent to popular monitoring tools like Datadog, New Relic, Prometheus. AgentCore observability can be added while leveraging existing monitoring infrastructure.

Service Integration Patterns

Each service can be used independently, but combining them allows building powerful agents. Here I introduce representative configuration patterns.

Basic Configuration Patterns

Pattern 1: Minimal Configuration

The simplest configuration uses only Runtime.
| Configuration | Use Cases |
| --- | --- |
| Runtime | Simple LLM responses, stateless processing, prototyping |
This configuration is suitable for simple Q&A or processing that doesn't require context. For example, agents that answer technical questions or summarize text.
Prototyping refers to creating trial products before full-scale development.
When prototyping, I recommend starting with this configuration.

Pattern 2: Context Retention

By adding Memory to Runtime, conversation context can be retained.
| Configuration | Use Cases |
| --- | --- |
| Runtime + Memory | Chatbots, interactive assistants, multi-turn conversations |
This configuration is suitable for applications requiring multi-turn conversations. For example, customer support chatbots, personal assistants, interactive tutorial systems.
Memory eliminates the need for users to repeat content mentioned in previous statements.

Pattern 3: Tool Utilization

By adding Gateway to Runtime, external tools can be integrated.
| Configuration | Use Cases |
| --- | --- |
| Runtime + Gateway | API integration agents, workflow automation, data retrieval/updates |
This configuration is suitable for agents needing to integrate with external services. For example, agents retrieving customer information from CRM systems, agents sending notifications to Slack, agents automating workflows by combining multiple APIs.

CRM (Customer Relationship Management) refers to customer relationship management systems.

Enterprise Configuration Patterns

Pattern 4: Full Features

Enterprise-grade configuration combining all services.
| Component | Role |
| --- | --- |
| Runtime | Execution platform |
| Memory | Context retention |
| Gateway | External tool integration |
| Code Interpreter | Data analysis/computation |
| Browser Tool | Web operations |
| Identity | Authentication/authorization |
| Observability | Overall monitoring |
Use Cases:
  • Enterprise agents
  • Mission-critical applications
  • Compliance requirements response
Mission-critical refers to systems indispensable for business continuity.
This configuration is suitable for advanced agents operated in production environments. Runtime provides execution platform, Memory retains context, Gateway integrates external tools, Code Interpreter and Browser Tool extend execution capabilities, Identity ensures security, and Observability monitors overall.

Specific Integration Examples

Let's look at how each service cooperates in actual use cases.

Example 1: Customer Support Agent

Customer support agents respond to customer inquiries and access external systems as needed.

Configuration:
  • Runtime: Agent execution
  • Memory: Customer history/preference retention
  • Gateway: Salesforce + Zendesk + Internal APIs
  • Identity: User authentication + Salesforce OAuth
  • Observability: Performance monitoring
Operation Flow:
  1. User inquires via chat (Identity: user authentication)
  2. Retrieve past inquiry history (Memory)
  3. Retrieve customer information from Salesforce (Gateway + Identity)
  4. Search the internal knowledge base (Gateway)
  5. Present a solution, creating a Zendesk ticket if needed (Gateway)
  6. Save this exchange (Memory)
  7. Record a trace of all steps (Observability)

Knowledge base refers to a database that systematically organizes knowledge and know-how.
In this flow, each service plays a distinct role: Identity ensures security, Memory provides customer history, Gateway simplifies access to external systems, and Observability monitors overall operations.

Example 2: Data Analysis Agent

Data analysis agents retrieve and analyze data based on user instructions and visualize results.

Configuration:
  • Runtime: Agent execution
  • Memory: Analysis history/user preferences
  • Gateway: S3 + Athena + Internal DB
  • Code Interpreter: Data processing/visualization
  • Identity: AWS IAM authentication
  • Observability: Execution time/resource monitoring
Operation Flow:
  1. User: "Analyze last month's sales" (Runtime)
  2. Retrieve past analysis settings (Memory)
  3. Execute a query on Athena to retrieve data (Gateway)
  4. Data processing, statistical calculations, graph generation (Code Interpreter)
  5. Return analysis results and graphs (Runtime)
  6. Save analysis settings (Memory)
  7. Record query execution time, data volume, and processing time (Observability)
Amazon Athena is a service that executes SQL queries on data stored in S3.
SQL is a language for manipulating databases.
In this flow, data is retrieved from Athena through Gateway, analyzed and visualized with Code Interpreter, user preferences are remembered with Memory, and processing time is monitored with Observability.
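The Code Interpreter step (step 4) boils down to ordinary data processing. As a rough illustration of what that computation looks like, here is a minimal pure-Python sketch; the row structure and field names are invented for this example and are not an AgentCore API:

```python
import statistics

# Hypothetical sales rows as Code Interpreter might receive them after
# the Gateway/Athena step (step 3); field names are illustrative only.
rows = [
    {"day": "2025-06-01", "sales": 1200},
    {"day": "2025-06-02", "sales": 1500},
    {"day": "2025-06-03", "sales": 900},
]

def summarize_sales(rows):
    """Step 4: basic statistical calculations on the retrieved data."""
    values = [r["sales"] for r in rows]
    return {
        "total": sum(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }

print(summarize_sales(rows))  # {'total': 3600, 'mean': 1200, 'max': 1500}
```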

Getting Started with AgentCore

I will explain preparation for starting AgentCore and how to proceed with development.

Prerequisite Knowledge

Knowledge required to start AgentCore is as follows.

Essential Skills

  • Basic AWS knowledge: AWS account creation and an understanding of basic services (EC2, S3, IAM, etc.)
  • Programming language: experience with Python or TypeScript
  • Basic REST API understanding: knowledge of HTTP methods (GET, POST, etc.), status codes, and the JSON format
EC2 (Elastic Compute Cloud) is AWS's virtual server service.
S3 (Simple Storage Service) is AWS's object storage service.
IAM (Identity and Access Management) is AWS's access control service.
HTTP methods specify types of operations on servers (retrieve, create, update, delete, etc.).
Status codes are 3-digit numbers indicating server response results (200=success, 404=not found, etc.).
JSON (JavaScript Object Notation) is a lightweight text format widely used for data exchange.
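The REST API basics above can be shown in a few lines of standard-library Python; the payload here is a made-up example:

```python
import json
from http import HTTPStatus

# A JSON document is plain text; json.loads turns it into Python objects.
payload = '{"method": "GET", "path": "/customers/42"}'
request = json.loads(payload)

# Status codes are 3-digit numbers; the stdlib names the common ones.
assert HTTPStatus.OK == 200          # success
assert HTTPStatus.NOT_FOUND == 404   # resource not found

print(request["method"])  # GET
```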

Recommended Skills

  • Agent frameworks such as LangGraph or CrewAI: agent design becomes smoother
  • Basic OAuth 2.0 understanding: integration with external services becomes easier to understand
  • Container technology (Docker, etc.): deployment to Runtime becomes easier
Docker is a tool for containerizing applications.

Development Flow

Typical AgentCore development flow is as follows.

1. Design Phase

Use Case Definition:
  • What problems should the agent solve
  • Who will use it
  • What functions are needed
Service Selection:
  • Context retention needed → Memory
  • External API integration needed → Gateway
  • Data analysis needed → Code Interpreter
Architecture Design:
  • How each service cooperates
  • Data flow
  • Security requirements

2. Development Phase

  1. Local Development: use frameworks like LangGraph or CrewAI, or implement independently
  2. Containerization: package the agent and its dependencies with a Dockerfile
  3. Deployment: deploy to AgentCore Runtime; this can be automated with the AWS CLI or AWS SDK
AWS CLI (Command Line Interface) is a tool for operating AWS from command line.
AWS SDK is a library for operating AWS services from programs.
Deploy refers to placing developed applications in production environment and making them available.
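As a rough idea of the containerization step, a minimal Dockerfile might look like the following. The file names, base image, port, and start command are all assumptions for illustration; check the AgentCore Runtime documentation for the actual requirements (such as the supported CPU architecture):

```dockerfile
# Minimal sketch; file names and base image are illustrative assumptions.
FROM public.ecr.aws/docker/library/python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY agent.py .
# The port and start command depend on your agent server; adjust as needed.
EXPOSE 8080
CMD ["python", "agent.py"]
```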

3. Integration Phase

Configure each service as needed.
  • Memory: define a Memory Strategy; decide what information to extract as long-term memory
  • Gateway: register necessary tools; configure credentials
  • Identity: integrate ID providers; configure the user authentication flow

4. Test & Optimization Phase

  • Operation Verification: verify behavior with actual data or scenarios; check that expected responses are returned and error handling is appropriate
  • Monitoring: monitor latency, token usage, and error rate with Observability
  • Optimization: identify bottlenecks and improve

5. Production Operation Phase

  • Deployment: deploy to the production environment; create a prod-endpoint separately from the DEFAULT endpoint
  • Continuous Monitoring: monitor continuously with Observability; configure alerts for automatic anomaly detection
  • Continuous Improvement: based on user feedback and metrics, add new tools, optimize prompts, and tune performance
Alert is a mechanism that notifies when anomalies occur.
Performance tuning refers to optimizing system performance.

First Steps

When starting AgentCore, I recommend proceeding in the following order.
  1. Starter Toolkit: experience basic AgentCore operations using the official starter kit
  2. Simple Agent: create a basic agent using only Runtime (develop locally → verify operation → deploy to AWS)
  3. Gradually Add Functions: add Memory to retain conversation context → integrate external tools with Gateway → verify operations with Observability
  4. Full-Scale Agent: strengthen authentication/authorization with Identity → add Code Interpreter or Browser Tool → address enterprise requirements

Resources

I introduce resources helpful for learning AgentCore.

Official Documentation

  • AgentCore Documentation: AWS official documentation with detailed explanations of each service, an API reference, and best practices
  • API Reference: all API endpoints, parameters, and response formats

Sample Code

  • GitHub: amazon-bedrock-agentcore-samples: sample code for various use cases, implementation best practices, and ready-to-use implementation examples
You can develop your own agents by referring to these codes.

Understanding Cost Structure

AgentCore costs are composed of multiple service charges. Understanding each cost factor helps with budget planning and optimization.
Cost Factors to Consider:
  • Runtime: actual CPU time used per session (variables: session duration, concurrent sessions)
  • Memory: number of events stored and searches performed (variables: conversation turns, retention period, search frequency)
  • Gateway: API call frequency and data transfer volume (variables: tool invocations, payload size)
  • Observability: log volume and metrics stored in CloudWatch (variables: log retention, metric resolution, trace volume)
Steps to Estimate Your Costs:
  1. Identify usage patterns: Determine expected monthly sessions, average session duration, and tool usage frequency
  2. Check current pricing: Visit the Amazon Bedrock AgentCore Pricing Page for your AWS region
  3. Use AWS Pricing Calculator: Input your specific requirements into the AWS Pricing Calculator
  4. Add buffer: Include 20-30% buffer for unexpected usage spikes
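The estimation steps above amount to simple arithmetic. The sketch below uses entirely made-up placeholder rates to show the shape of the calculation; it is not based on actual AWS pricing:

```python
# Illustrative only: the rate below is a made-up placeholder, not an AWS price.
# Always check the Amazon Bedrock AgentCore pricing page for real numbers.
sessions_per_month = 10_000
avg_cpu_seconds_per_session = 4.0
placeholder_rate_per_cpu_second = 0.0001  # hypothetical unit price

# Step 1 inputs feed a per-service estimate (here, Runtime compute only).
compute_cost = (sessions_per_month
                * avg_cpu_seconds_per_session
                * placeholder_rate_per_cpu_second)
with_buffer = compute_cost * 1.25  # step 4: add a 20-30% buffer

print(round(compute_cost, 2), round(with_buffer, 2))  # 4.0 5.0
```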
Important: Actual costs vary significantly based on:
  • AWS Region selected
  • Specific usage patterns and peak loads
  • Data retention policies configured
  • Network transfer requirements
  • Choice of LLM models (if using built-in strategies)

Cost Optimization Tips

Implement these strategies to optimize AgentCore costs:
  • Efficient Prompt Design: Minimize LLM token usage through clear, concise instructions and well-structured prompts
  • Strategic Memory Management:
    • Save only essential information as long-term memory
    • Set appropriate TTL (Time To Live) for memory data
    • Regularly review and clean up unused memories
  • Smart Caching Implementation:
    • Cache frequently accessed data
    • Reduce redundant API calls through Gateway
    • Implement response caching where appropriate
  • Proactive Monitoring:
    • Set up CloudWatch alarms for cost thresholds
    • Monitor usage patterns with Observability
    • Identify and address cost spikes early
  • Resource Right-Sizing:
    • Optimize session duration
    • Use appropriate timeout settings
    • Implement efficient error handling to prevent retry storms
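As one concrete example of the caching and TTL tips above, here is a minimal in-process TTL cache sketch; the cache key format is invented for illustration:

```python
import time

class TTLCache:
    """Tiny response cache with a per-entry TTL (Time To Live)."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: clean up and report a miss
            return None
        return value

cache = TTLCache(ttl_seconds=60)
cache.set("crm:customer:42", {"name": "Alice"})  # hypothetical key format
print(cache.get("crm:customer:42"))  # {'name': 'Alice'}
```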

Security and Compliance

AgentCore provides enterprise-grade security.

Security Features

AgentCore ensures security at multiple layers.

Data Protection

  • Data in Transit: encrypted with TLS 1.2/1.3; end-to-end encryption
  • Data at Rest: encrypted with AWS KMS; data saved in Memory is also encrypted, and credentials saved in the Token Vault are likewise encrypted with KMS
TLS (Transport Layer Security) is a protocol that encrypts internet communications.
End-to-end refers to the entire path from start to end of communications.

Access Control

  • IAM-Based Permission Management: strictly control who can access what; role-based access control; principle of least privilege
  • Resource-Based Policies: detailed management of access to each resource; conditional access
  • Principle of Least Privilege: grant only the minimum necessary permissions; review permissions periodically
Role-based access control is a method that grants permissions according to roles.
Policy refers to rules that define access permissions.
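As a sketch of the least-privilege idea, an IAM policy scoped to a single action and resource might look like the following. The action name and ARN format are assumptions for illustration; consult the AgentCore documentation for the actual IAM actions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LeastPrivilegeAgentInvoke",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:InvokeAgentRuntime"],
      "Resource": "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-agent*"
    }
  ]
}
```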

Isolation

  • Complete Session Isolation: each user's sessions are executed in independent microVMs, cannot access each other, and are completely deleted at session end
  • Tenant Isolation: different customers' data is completely isolated, both physically and logically
  • Network Isolation: communication between sessions is blocked; agents can connect to private networks via VPC integration
Tenant refers to units of customers or organizations using a system.

Auditing

  • CloudTrail Recording: records all API calls; tracks who did what and when; helps detect unauthorized access
  • Observability Logs: detailed agent operation logs; security event recording; abnormal behavior detection

Compliance

Compliance refers to adherence to laws and industry regulations. AgentCore addresses major compliance requirements.
  • SOC 1, 2, 3: security audit standards; security and privacy controls audited by third parties
  • HIPAA eligible: can be used for applications handling medical information; appropriate protection of PHI
  • GDPR compliant: appropriately protects EU citizens' data; responds to data deletion requests
  • ISO 27001: international standard for information security management
HIPAA is US medical information protection law.
PHI (Protected Health Information) is protected medical information.
GDPR (General Data Protection Regulation) is EU's general data protection regulation.

Data Privacy

Understanding AgentCore's data usage policy is important.
AgentCore Data Usage Policy:
AgentCore may use and store customer content to improve service experience and performance.
However, such improvements are:
  • ✓ For your own AgentCore usage
  • ✗ Not for other customers
That is, your company's data will not be shared with other companies. AWS may use customer data to improve that customer's service experience, but will not use it for other customers.

Troubleshooting

I introduce frequently encountered problems when using AgentCore and their solutions.

Common Problems and Solutions

Problem 1: Session Timeout

Symptoms:
  • Session terminates during long processing
  • "Session expired" error occurs
Causes:
  • Reached the maximum 8-hour execution time limit
  • Exceeded the 15-minute inactivity limit
Solutions:
  1. Divide processing into smaller units and save each step's results in Memory
  2. Implement periodic keep-alives
  3. Leverage asynchronous processing
Keep-alive refers to periodically sending signals to maintain connections.
Asynchronous processing refers to proceeding to next processing without waiting for processing completion.
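Solution 1 (divide processing and checkpoint results) can be sketched as follows. Here a plain dict stands in for AgentCore Memory; the real Memory API is different:

```python
# Sketch: divide work into steps and checkpoint each result so a new
# session can resume where the previous one timed out.
def run_with_checkpoints(steps, memory):
    start = memory.get("completed_steps", 0)
    for i, step in enumerate(steps):
        if i < start:
            continue  # already done in a previous session
        memory[f"result_{i}"] = step()
        memory["completed_steps"] = i + 1
    return [memory[f"result_{i}"] for i in range(len(steps))]

# Simulate resuming after step 0 already ran in an earlier session.
memory = {"completed_steps": 1, "result_0": "done-earlier"}
steps = [lambda: "step0", lambda: "step1", lambda: "step2"]
print(run_with_checkpoints(steps, memory))  # ['done-earlier', 'step1', 'step2']
```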

Problem 2: Out of Memory Error

Symptoms:
  • "Out of memory" error
  • Crashes during large data processing
Causes:
  • Loaded large amounts of data at once
  • Memory usage exceeded the session limit
Solutions:
  1. Adjust the batch size (divide data into smaller units)
  2. Adopt streaming processing
  3. Delete unnecessary data
Streaming processing refers to continuously processing data little by little.
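Solutions 1 and 2 share one idea: never hold the full dataset in memory at once. A minimal batching sketch:

```python
def in_batches(items, batch_size):
    """Yield items in fixed-size batches instead of loading everything at once."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Only one batch is materialized at a time, keeping peak memory small.
totals = [sum(b) for b in in_batches(range(10), batch_size=4)]
print(totals)  # [6, 22, 17]
```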

Problem 3: Tool Invocation Failures

Symptoms:
  • API calls via Gateway fail
  • Authentication errors or connection errors
Causes:
  • Expired credentials
  • Reached API rate limits
  • Network connection issues
Solutions:
  1. Verify credentials
  2. Implement retry logic
  3. Address rate limits
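Solution 2 (retry logic) is commonly implemented with exponential backoff and jitter, which also helps you stay under rate limits. A generic sketch, not tied to any Gateway API:

```python
import random
import time

def call_with_retries(func, max_attempts=4, base_delay=0.5):
    """Retry transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Delay doubles each attempt, with random jitter to avoid
            # synchronized retry storms from many clients.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            time.sleep(delay)

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(call_with_retries(flaky, base_delay=0.01))  # ok
```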

Problem 4: High Latency

Symptoms:
  • Agent responses are slow
  • User experience is degraded
Causes:
  • Unnecessary tool invocations
  • Inefficient prompts
  • Network latency
Solutions:
  1. Analyze with Observability (identify bottlenecks with traces)
  2. Optimize prompts
  3. Leverage caching
  4. Use parallel processing
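Solution 4 (parallel processing) pays off when tool calls are independent: total latency approaches the slowest call rather than the sum of all calls. A generic sketch using Python's standard thread pool; the two fetch functions are stand-ins for real Gateway calls:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_crm():      # stand-in for an independent Gateway tool call
    return {"customer": "Alice"}

def fetch_tickets():  # stand-in for a second, independent tool call
    return [{"id": 1, "status": "open"}]

# Submit both calls at once; each future completes independently.
with ThreadPoolExecutor(max_workers=2) as pool:
    crm_future = pool.submit(fetch_crm)
    tickets_future = pool.submit(fetch_tickets)
    crm, tickets = crm_future.result(), tickets_future.result()

print(crm["customer"], len(tickets))  # Alice 1
```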

Problem 5: Cost Spikes

Symptoms:
  • Unexpectedly high bills
  • Rapid usage increases
Causes:
  • Infinite loops or retry storms
  • Inefficient Memory searches
  • Excessive API calls
Solutions:
  1. Strengthen cost monitoring (set thresholds with CloudWatch Alarms)
  2. Analyze usage patterns
  3. Set resource limits
Threshold refers to reference values that trigger alerts.
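Solution 1's threshold idea reduces to a simple cumulative check, which services like CloudWatch Alarms perform for you. A toy sketch:

```python
def check_cost_threshold(daily_costs, threshold):
    """Return the first day index whose cumulative cost crosses the threshold,
    or None if the budget is never exceeded."""
    total = 0.0
    for day, cost in enumerate(daily_costs):
        total += cost
        if total > threshold:
            return day  # alert: budget exceeded on this day
    return None

# Cumulative costs: 10, 22, 52 -> the threshold of 40 is crossed on day 2.
print(check_cost_threshold([10.0, 12.0, 30.0, 5.0], threshold=40.0))  # 2
```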

Comparison with Other Solutions

Let's compare AgentCore with other agent development solutions.

Comparison Table

Feature | AgentCore | LangGraph Standalone | CrewAI Standalone | Amazon Bedrock Agent
Infrastructure Management | Unnecessary (fully managed) | Required | Required | Unnecessary (fully managed)
Scaling | Automatic | Manual | Manual | Automatic
Security | Enterprise-grade | Requires implementation | Requires implementation | Enterprise-grade
Observability | Built-in | Requires implementation | Requires implementation | Basic monitoring
Framework | Any framework | LangGraph | CrewAI | Bedrock Agent specific
LLM Selection | Any LLM | Any LLM | Any LLM | Bedrock models focused
Customizability | High (code control) | High | Medium | Medium (GUI-centered)
Development Speed | Medium to high | Low (infrastructure needed) | Low (infrastructure needed) | High (GUI)
Operation Cost | Pay-as-you-go | Infrastructure + operation cost | Infrastructure + operation cost | Pay-as-you-go
Production Ready | Immediately ready | Additional work needed | Additional work needed | Immediately ready
Fully managed refers to service providers handling all infrastructure management.

Application Scenarios for Each Solution

When AgentCore is Suitable

  • Enterprise-grade security needed
  • Want to leverage existing frameworks (LangGraph, CrewAI, etc.)
  • Want to balance customizability and ease of operation
  • Complex agent logic needed
  • Assuming production environment operation

When LangGraph Standalone is Suitable

  • Complete control of complex workflows needed
  • Infrastructure management structure already exists
  • On-premises operation is required

When CrewAI Standalone is Suitable

  • Want to specialize in multi-agent systems
  • Infrastructure management structure already exists
  • On-premises operation is required

When Amazon Bedrock Agent is Suitable

  • Rapid prototyping needed
  • Want to create agents without writing code
  • Standard patterns sufficient
  • Bedrock models alone sufficient

Frequently Asked Questions (FAQ)

I've compiled frequently asked questions and answers about AgentCore.

Q1: Can I use existing frameworks?

A: Yes, AgentCore is framework-agnostic.
All major agent frameworks including LangGraph, CrewAI, Strands Agents work. Additionally, custom agents without frameworks also work without problems.

Q2: Which LLMs can be used?

A: Any LLM can be used.
Supported LLMs:
  • Amazon Bedrock models (Claude, Nova, etc.)
  • Anthropic Claude (direct API)
  • OpenAI GPT (GPT-4, GPT-3.5, etc.)
  • Google Gemini
  • On-premises LLM
  • Company customized models

Q3: Integration with existing AWS services?

A: Seamless integration through Gateway is possible.
Integrable services:
  • Amazon S3 (object storage)
  • Amazon DynamoDB (NoSQL database)
  • Amazon Athena (data analysis)
  • AWS Lambda (serverless functions)
  • Amazon RDS (relational database)
  • Almost all other AWS services

Q4: Separation of development and production environments?

A: This can be achieved using endpoints. By creating a different endpoint for each environment and pointing each at a different version, you can keep environments separate.

Q5: Where are logs stored?

A: Stored in Amazon CloudWatch. Since it's OpenTelemetry compatible, data can also be sent to other monitoring tools (Datadog, New Relic, etc.).

Q6: Maximum session time?

A: 8 hours maximum. Active execution is capped at 8 hours, and an idle session automatically terminates after 15 minutes. If processing longer than 8 hours is needed, divide the work across multiple sessions and save state with Memory.

Q7: How to estimate charges?

A: Use the AWS Pricing Calculator or cost management tools. Calculate expected usage and estimate with the AWS Pricing Calculator. Monitor actual usage with Observability and conduct periodic cost reviews.

Q8: What is MCP Server? Is it required?

A: MCP Server is a development support tool and is not required.
MCP Server is an optional tool for streamlining AgentCore deployment and testing from an IDE (integrated development environment). Benefits include one-command deployment from the IDE and automatic code conversion support. Without it, you can still deploy the traditional way with the AWS CLI, or programmatically via the AWS SDK.
MCP Server is an auxiliary tool for improving development experience and is separate from AgentCore's seven core services.

References:
Amazon Bedrock AgentCore Documentation
Tech Blog with curated related content

Conclusion

Reaffirming AgentCore's Value

Amazon Bedrock AgentCore solves the following challenges in AI agent development.

1. Improved Development Efficiency

Tool integration work that traditionally took weeks is shortened to minutes with AgentCore.
  • Gateway dramatically simplifies API integration
  • Identity eliminates authentication/authorization implementation
  • Framework agnostic leverages existing development assets

2. Enterprise-Grade Security

Meets all security requirements demanded in production environments.
  • Identity provides enterprise-grade authentication/authorization
  • Complete session isolation minimizes data leakage risks
  • Token Vault safely manages credentials
  • Compliance ready (SOC, HIPAA, GDPR, ISO 27001)

3. Improved Operability

Supports stable production environment operation.
  • Serverless architecture reduces infrastructure management burden
  • Auto-scaling automatically responds to request volume fluctuations
  • Observability provides detailed visibility into agent behavior
  • Pay-as-you-go reduces wasted costs

Gradual Understanding of Seven Services

AgentCore's seven services can also be learned gradually.
Gradual Learning Path

1. Runtime (Foundation)
   ↓ Run agents
2. Memory (Retention)
   ↓ Remember conversations
3. Gateway (Integration)
   ↓ Use external tools
4. Code Interpreter (Analysis)
   ↓ Analyze data
5. Browser Tool (Web Ops)
   ↓ Operate web
6. Identity (Security)
   ↓ Operate safely
7. Observability (Monitoring)
   ↓ Continue improving

Finally

AI agent technology is evolving rapidly, but by leveraging integrated platforms like AgentCore, you can efficiently keep up with that evolution.
Understanding basic concepts and service roles explained in this article allows you to smoothly start agent development using AgentCore.
AgentCore is:
  • A platform where anyone can safely build AI agents
  • Balances development efficiency and security
  • Provides enterprise-grade features
In the coming AI era, AgentCore will be a powerful tool.
I recommend starting small and gradually adding functions. By actually working hands-on, you should understand AgentCore's true value.
I hope this article helps in understanding Amazon Bedrock AgentCore.

Written by Hidekazu Konishi