AWS Database Glossary - RDS, Aurora, DynamoDB, DocumentDB, and Neptune Explained


This glossary collects 60 core terms an engineer or architect repeatedly meets when choosing or operating an AWS database - across relational engines (Amazon RDS, Amazon Aurora), key-value (Amazon DynamoDB), document (Amazon DocumentDB), graph (Amazon Neptune), time-series (Amazon Timestream), in-memory (Amazon ElastiCache, Amazon MemoryDB), wide-column (Amazon Keyspaces), ledger (Amazon QLDB), and the migration and integration layer that sits beside them.

It is a companion to my earlier AI and Machine Learning Glossary for AWS and AI Agent Engineering Glossary. Each entry follows the same shape: a 2-4 sentence definition, a Related line cross-linking to other terms on this page, and a Source line linking to the canonical AWS documentation.

How to Use This Glossary

Use the A-Z Term Index below to jump directly to a term. The seven category sections group terms by the layer of the AWS database stack they belong to: RDS / Aurora core concepts, RDS / Aurora capacity and operations, RDS / Aurora compatibility and extensions, DynamoDB data modeling, DynamoDB capacity and replication, purpose-built databases (DocumentDB, Neptune, Timestream, ElastiCache, MemoryDB, Keyspaces, QLDB, vector search), and the migration and integration layer.

For deep dives on specific patterns, see Amazon DynamoDB Single-Table Design Guide, Amazon DynamoDB Key Design, GSI, and LSI Dictionary, Comparison of AWS Databases Using the Quorum Model, and AWS History and Timeline - Amazon RDS. This glossary intentionally stays at the concept level so the page does not rot as features evolve; follow the Source link on each term for the current scope, limits, and pricing in the AWS documentation.

A-Z Term Index

Adaptive Capacity · Amazon DocumentDB · Amazon ElastiCache for Redis OSS · Amazon ElastiCache for Valkey · Amazon Keyspaces · Amazon MemoryDB · Amazon Neptune · Amazon Neptune Analytics · Amazon QLDB · Amazon RDS Data API · Amazon Timestream for InfluxDB · Amazon Timestream for LiveAnalytics · Aurora Cluster Endpoint · Aurora Custom Endpoint · Aurora DSQL · Aurora I/O-Optimized · Aurora Optimized Reads · Aurora Optimized Writes · Aurora Reader Endpoint · Aurora Serverless v1 · Aurora Serverless v2 · Aurora Storage Layer · Aurora Zero-ETL Integration with Amazon Redshift · AWS Backup for RDS, Aurora, and DynamoDB · AWS Database Migration Service (DMS) · AWS Schema Conversion Tool / DMS Schema Conversion · Babelfish for Aurora PostgreSQL · Backtrack (Aurora MySQL) · Blue/Green Deployments · Composite Primary Key · Database Activity Streams (Aurora) · Database Cloning (Aurora) · DynamoDB Streams · DynamoDB Zero-ETL Integration with Amazon OpenSearch Service · Global Database (Aurora) · Global Secondary Index (GSI) · Global Tables (DynamoDB) · Graph Query Languages (Gremlin / openCypher / SPARQL) · Hot Partition · Lake Formation Integration · Local Secondary Index (LSI) · Multi-AZ Deployment · On-Demand Capacity Mode · Partition Key · Performance Insights · Point-in-Time Recovery (PITR) · Provisioned Capacity Mode · Provisioned Throughput (RDS instance class) · Quorum-based Replication (Aurora) · RDS Custom · RDS Proxy · Read Capacity Unit (RCU) · Read Replica · Single-Table Design · Sort Key · Standard-IA Table Class · Time to Live (TTL) · Transactions (DynamoDB) · Trusted Language Extensions (TLE) for PostgreSQL · Vector Search (pgvector / Aurora / DocumentDB) · Write Capacity Unit (WCU)

AWS Database Service Map

The eight categories below show how the terms in this glossary map to AWS database families: Relational (RDS, Aurora), Key-Value (DynamoDB), Document (DocumentDB), Graph (Neptune, Neptune Analytics), Time-series (Timestream LiveAnalytics, Timestream for InfluxDB), In-Memory (ElastiCache for Redis OSS, ElastiCache for Valkey, MemoryDB), Wide-Column (Keyspaces), and Ledger (QLDB, with end-of-support announced for 2025-07-31). DMS, the Schema Conversion Tool, AWS Backup, and the Zero-ETL integrations cross-cut every category.

Relational (SQL OLTP): Amazon RDS · Amazon Aurora · Aurora Serverless v2 · Babelfish · RDS Proxy
Key-Value (NoSQL KV): Amazon DynamoDB · Global Tables · PITR · DynamoDB Streams · Standard-IA
Document (JSON): Amazon DocumentDB · Vector Search · MongoDB API
Graph (Property / RDF): Amazon Neptune · Neptune Analytics · Gremlin · openCypher · SPARQL
Time-series (IoT / Metrics): Amazon Timestream LiveAnalytics · Amazon Timestream for InfluxDB
In-Memory (Cache / Primary): ElastiCache for Redis OSS · ElastiCache for Valkey · Amazon MemoryDB
Wide-Column (Cassandra): Amazon Keyspaces · CQL Protocol
Ledger (EoS 2025-07-31): Amazon QLDB · Verifiable History
Cross-cutting Layer (applies across all categories)
Migration: AWS DMS (full-load + CDC)  |  AWS Schema Conversion Tool / DMS Schema Conversion  |  Blue/Green Deployments
Backup: AWS Backup (RDS, Aurora, DynamoDB)  |  PITR  |  Global Tables / Global Database  |  Vault Lock
Zero-ETL Integrations: Aurora → Amazon Redshift  |  DynamoDB → Amazon OpenSearch Service  |  Lake Formation export to S3 (Parquet)
Amazon QLDB end-of-support: 2025-07-31. Migrate to immutable patterns on Amazon Aurora PostgreSQL, Amazon DynamoDB, or Amazon Managed Blockchain.

A. RDS / Aurora - Core Concepts

Multi-AZ Deployment

A Multi-AZ deployment for Amazon RDS provisions a primary DB instance and a synchronously replicated standby (or two readable standbys in Multi-AZ DB cluster deployments) in different Availability Zones. RDS performs automatic failover to the standby when the primary fails, when the underlying host needs replacement, or during patching, typically within 60-120 seconds. Multi-AZ is the standard production posture for RDS engines that do not run on the Aurora storage layer.

Related: Read Replica, Aurora Storage Layer, Blue/Green Deployments, RDS Proxy

Source: Amazon RDS - High availability (Multi-AZ)

Read Replica

A Read Replica is an asynchronously replicated copy of an Amazon RDS source DB instance used to offload read traffic and to create cross-Region copies for disaster recovery. RDS Read Replicas can be promoted to standalone primary instances, while Aurora Replicas share the same underlying storage volume and therefore have near-zero replication lag. Read Replicas are independent of Multi-AZ standbys: a single deployment can use both at once for different reasons.

Related: Multi-AZ Deployment, Aurora Reader Endpoint, Global Database, Aurora Storage Layer

Source: Amazon RDS - Working with read replicas

Aurora Storage Layer

Aurora separates compute from storage and uses a distributed, log-structured Aurora storage layer that stores six copies of every data block across three Availability Zones. Writes are acknowledged when a 4-of-6 quorum confirms the redo log record, so the system tolerates the loss of an entire AZ plus one additional copy without losing data. A similar compute-storage separation underlies Amazon DocumentDB.

Related: Quorum-based Replication, Aurora Cluster Endpoint, Multi-AZ Deployment, Amazon DocumentDB

Source: Amazon Aurora - Storage and reliability

Quorum-based Replication (Aurora)

Aurora uses a 4/6 write quorum and a 3/6 read quorum across its six storage copies in three AZs, which means a write succeeds once four nodes acknowledge it and a read needs only three nodes to reconstruct the latest version. This is the design that lets Aurora survive the loss of a whole AZ for writes and the loss of an AZ plus one node for reads. For a deeper comparison across AWS databases, see Comparison of AWS Databases Using the Quorum Model.

Related: Aurora Storage Layer, Multi-AZ Deployment, Global Database

Source: Amazon Aurora - Storage and reliability
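The quorum arithmetic above can be sketched in a few lines. This is an illustrative model, not Aurora's implementation: the classic rule Vw + Vr > N guarantees that every read quorum overlaps the latest write quorum.

```python
# Illustrative quorum model (not Aurora's actual code): with N copies,
# a write quorum Vw and read quorum Vr must satisfy Vw + Vr > N so that
# any read quorum intersects the most recent write quorum.

def quorums_overlap(n: int, write_quorum: int, read_quorum: int) -> bool:
    """True if every read quorum must intersect every write quorum."""
    return write_quorum + read_quorum > n

def tolerates_az_loss_for_writes(n: int, write_quorum: int, copies_per_az: int) -> bool:
    """Writes survive losing one AZ if a write quorum still fits in the remaining copies."""
    return n - copies_per_az >= write_quorum

# Aurora's 6-copy / 3-AZ layout: 4/6 write quorum, 3/6 read quorum.
assert quorums_overlap(6, 4, 3)               # 4 + 3 > 6: reads always see the latest write
assert tolerates_az_loss_for_writes(6, 4, 2)  # losing one AZ removes 2 copies; 4 remain
```

Note that a 3/6 read quorum paired with a 3/6 write quorum would not overlap (3 + 3 is not greater than 6), which is why the write side needs four acknowledgements.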

Backtrack (Aurora MySQL)

Backtrack lets an Aurora MySQL DB cluster rewind its current state to a target time in the recent past without restoring from a snapshot. It is implemented on top of the Aurora storage log and is limited to Aurora MySQL with a configurable target backtrack window of up to 72 hours. Unlike PITR, backtrack does not create a new cluster - the current cluster is moved to the target time in place.

Related: Database Cloning, Point-in-Time Recovery, AWS Backup

Source: Aurora MySQL - Backtracking

Database Cloning (Aurora)

Aurora supports near-instant database clones using copy-on-write semantics on its shared storage layer. A clone starts as a logical pointer to the source volume and only consumes additional storage as either the clone or the source is modified, which makes it useful for production-fidelity test environments without doubling cost. The same copy-on-write cloning also underpins the green environment in Aurora Blue/Green Deployments.

Related: Backtrack, Blue/Green Deployments, Aurora Storage Layer

Source: Amazon Aurora - Cloning a volume

Aurora Cluster Endpoint

The cluster endpoint of an Aurora DB cluster always resolves to the current writer instance. Applications use it for read/write traffic, and during a failover Aurora reassigns the endpoint to the newly promoted writer with no client-side reconfiguration. There is exactly one cluster endpoint per Aurora cluster.

Related: Aurora Reader Endpoint, Aurora Custom Endpoint, RDS Proxy

Source: Amazon Aurora - Endpoints

Aurora Reader Endpoint

The reader endpoint of an Aurora DB cluster load-balances connections across all available Aurora Replicas in the cluster. It is the standard target for read-only workloads and automatically excludes the writer or any unhealthy replicas. Connection balancing happens at session establishment time, not per query, so long-lived connections can pin to one replica.

Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Read Replica, RDS Proxy

Source: Amazon Aurora - Endpoints

Aurora Custom Endpoint

A custom endpoint is a user-defined endpoint that load-balances across a chosen subset of instances in an Aurora DB cluster - for example, "analytics replicas only" or "instances of class db.r6g.4xlarge". It lets you route specific workloads to specific hardware without splitting the cluster; a custom endpoint's membership is managed as either a static list of instances or an exclusion list that admits every instance not named.

Related: Aurora Cluster Endpoint, Aurora Reader Endpoint, RDS Proxy

Source: Amazon Aurora - Endpoints

Global Database (Aurora)

An Aurora Global Database spans multiple AWS Regions: writes go to a primary cluster in one Region, and Aurora replicates them asynchronously at the storage layer to up to five secondary Regions with typical lag under one second. It is the recommended pattern for low-latency cross-Region reads and for disaster recovery with a target RPO measured in seconds. A secondary Region can be promoted to primary in a planned switchover or in an unplanned failover.

Related: Read Replica, Multi-AZ Deployment, Aurora Storage Layer, Global Tables (DynamoDB)

Source: Amazon Aurora - Global databases

B. RDS / Aurora - Capacity and Operations

Aurora Serverless v1

Aurora Serverless v1 is the original on-demand auto-scaling configuration for Aurora, which scales compute in discrete capacity units between defined minimum and maximum values and can pause to zero capacity during idle periods. Aurora Serverless v1 reached its end of life on March 31, 2025; clusters that had not been migrated to Aurora Serverless v2 by that date are upgraded by AWS during a scheduled maintenance window. v1 lacks several features that v2 supports (Global Database, RDS Proxy, parallel query), so v2 is the recommended target for all new workloads and remaining migrations.

Related: Aurora Serverless v2, Provisioned Throughput, Aurora Cluster Endpoint

Source: Amazon Aurora Serverless v1

Aurora Serverless v2

Aurora Serverless v2 scales compute capacity in fine-grained increments measured in Aurora Capacity Units (ACUs) without disconnecting clients. It supports all major Aurora features (Global Database, RDS Proxy, Performance Insights, parallel query) that v1 did not, and is the recommended serverless option for production workloads with variable load. v2 can also be combined with provisioned instances in the same cluster.

Related: Aurora Serverless v1, Aurora I/O-Optimized, Provisioned Throughput, Amazon RDS Data API

Source: Amazon Aurora Serverless v2

Aurora DSQL

Amazon Aurora DSQL is a serverless, distributed SQL database with PostgreSQL compatibility that offers active-active multi-Region read and write with strong consistency. It is architected as a separate service from the classic Aurora cluster (no provisioned instance, no failover concept) and scales reads and writes independently across Regions without application-level conflict handling. Aurora DSQL suits globally distributed OLTP workloads that need single-digit-millisecond regional latency and the ability to write in any Region, where the operational simplicity of "no instances and no failover" outweighs classic Aurora's broader feature set (extensions, custom parameters, IAM Database Authentication parity, and so on are still maturing).

Related: Aurora Serverless v2, Global Database (Aurora), Quorum-based Replication (Aurora)

Source: Amazon Aurora DSQL - User Guide

Aurora I/O-Optimized

Aurora I/O-Optimized is a cluster storage configuration in which I/O charges are removed and the per-hour instance and storage rates are higher in exchange. It is intended for workloads where I/O cost would otherwise dominate (typically more than 25 percent of total Aurora cost), and the configuration can be toggled per cluster.

Related: Aurora Optimized Reads, Aurora Optimized Writes, Aurora Storage Layer

Source: Amazon Aurora - Storage configurations
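The 25 percent rule of thumb from the entry above is a one-line cost check. The helper name and the example figures below are hypothetical; only the threshold comes from the entry.

```python
# Hypothetical helper for the rule of thumb above: Aurora I/O-Optimized tends
# to pay off when I/O charges exceed ~25% of total Aurora spend.

def io_optimized_candidate(instance_cost: float, storage_cost: float,
                           io_cost: float, threshold: float = 0.25) -> bool:
    """True when I/O is a large enough share of cost to evaluate I/O-Optimized."""
    total = instance_cost + storage_cost + io_cost
    return io_cost / total > threshold

# Hypothetical monthly figures in USD:
assert io_optimized_candidate(instance_cost=500.0, storage_cost=100.0, io_cost=400.0)      # 40% I/O
assert not io_optimized_candidate(instance_cost=800.0, storage_cost=150.0, io_cost=50.0)   # 5% I/O
```

The actual decision also depends on the higher I/O-Optimized instance and storage rates, so treat this as a first filter, not a pricing model.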

Aurora Optimized Reads

Aurora Optimized Reads is an instance-class feature for Aurora PostgreSQL and Aurora MySQL on supported Graviton and Intel NVMe-equipped instances (such as db.r6gd and db.r6id) that uses the local NVMe SSD as a tier for temporary objects and as an extension of the buffer cache. It can materially improve query latency for read-heavy workloads with large working sets that do not fit in memory, and it is enabled simply by choosing an instance class that supports it.

Related: Aurora Optimized Writes, Aurora I/O-Optimized, Aurora Storage Layer

Source: Aurora PostgreSQL - Optimized Reads

Aurora Optimized Writes

Aurora Optimized Writes is a feature of Aurora MySQL on supported instance classes that improves write throughput by eliminating the InnoDB doublewrite buffer, relying instead on atomic 16 KiB page writes to Aurora storage. It is selected on cluster creation on supported Aurora MySQL 3.x instance classes (such as db.r6i and db.r7g); enabling it on an existing cluster typically requires a snapshot restore or a Blue/Green switchover to re-initialise the storage page format. The feature is the main reason write-heavy MySQL workloads see a step change in throughput on newer Aurora MySQL instance classes.

Related: Aurora Optimized Reads, Aurora I/O-Optimized, Aurora Storage Layer

Source: Aurora MySQL - Optimized Writes

Blue/Green Deployments

A Blue/Green Deployment in Amazon RDS and Amazon Aurora creates a separate green environment that mirrors the production blue one and stays synchronised with it; you apply the engine upgrades, schema changes, or parameter changes you want to test to the green side. Switchover swaps endpoints to the green environment, typically in under a minute, with safety checks for replication lag and active sessions. The pattern is the recommended way to perform major-version upgrades on RDS and Aurora.

Related: Database Cloning, Multi-AZ Deployment, AWS Database Migration Service

Source: Amazon RDS - Blue/Green Deployments

Provisioned Throughput (RDS instance class)

For Amazon RDS in its non-serverless mode, capacity is provisioned by choosing a specific DB instance class (db.m6g, db.r7g, db.x2iedn, etc.) that fixes CPU, memory, and EBS bandwidth. Vertical scaling requires a modify-instance operation with a short downtime, which is the structural reason teams move CPU-spiky workloads to Aurora Serverless v2 or split read traffic onto Read Replicas.

Related: Aurora Serverless v2, On-Demand Capacity Mode, Read Replica

Source: Amazon RDS - DB instance classes

RDS Proxy

Amazon RDS Proxy is a fully managed, highly available database proxy in front of RDS and Aurora that pools and shares database connections across application clients. It reduces connection overhead for serverless and microservice workloads, preserves application connections across DB failovers, and integrates with AWS IAM and AWS Secrets Manager for authentication. RDS Proxy is the recommended fronting layer for Lambda-based access to RDS and Aurora.

Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Multi-AZ Deployment, Amazon RDS Data API

Source: Amazon RDS Proxy

C. RDS / Aurora - Compatibility and Extensions

Babelfish for Aurora PostgreSQL

Babelfish for Aurora PostgreSQL is an optional capability that lets an Aurora PostgreSQL DB cluster understand the SQL Server T-SQL dialect and the TDS wire protocol on a separate port. Applications written for Microsoft SQL Server can connect with minimal code changes, which makes it a tool for incremental migration off SQL Server licences. Babelfish coexists with the native PostgreSQL endpoint on the same cluster.

Related: Trusted Language Extensions for PostgreSQL, AWS Schema Conversion Tool, AWS Database Migration Service

Source: Aurora PostgreSQL - Babelfish

Trusted Language Extensions (TLE) for PostgreSQL

Trusted Language Extensions is an open-source development kit that lets users build PostgreSQL extensions in JavaScript, Perl, or other supported languages and install them safely in managed environments such as Amazon RDS and Aurora PostgreSQL. It removes the need to wait for AWS to certify each new extension and is the official path for custom server-side logic on managed PostgreSQL.

Related: Babelfish for Aurora PostgreSQL, Aurora Storage Layer, RDS Custom, Vector Search

Source: Trusted Language Extensions for PostgreSQL

Performance Insights

Performance Insights is a database-aware monitoring view on top of Amazon RDS and Amazon Aurora that visualises database load (Active Sessions / DBLoad) broken down by wait event, SQL, host, or user. It complements Amazon CloudWatch and is the default first stop when diagnosing latency or saturation on a managed DB instance. Retention is configurable, with seven days included at no extra charge.

Related: RDS Proxy, Aurora Optimized Reads, Database Activity Streams

Source: Amazon RDS - Performance Insights

RDS Custom

Amazon RDS Custom is a managed RDS variant that grants the customer access to the underlying operating system and database engine for workloads (notably Oracle) that require third-party software, custom patches, or filesystem-level customisation. The customer takes on more of the operational responsibility in return, including OS patching and database engine patching outside of the standard RDS automation.

Related: Trusted Language Extensions for PostgreSQL, Provisioned Throughput, AWS Database Migration Service

Source: Amazon RDS Custom

D. DynamoDB - Data Modeling

Partition Key

The partition key (also called the hash key) is the mandatory first attribute of a DynamoDB primary key, and its value is hashed to decide which physical partition stores the item. A well-distributed partition key is the single most important determinant of DynamoDB throughput because traffic on a single partition is bounded. Item collections sharing a partition key are stored together.

Related: Sort Key, Composite Primary Key, Hot Partition, Adaptive Capacity

Source: Amazon DynamoDB - Primary key
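The placement rule can be made concrete with a toy hash function. DynamoDB's internal partition hash is not public, so this is purely illustrative of the principle that the partition key value alone decides where an item lives.

```python
import hashlib

# Illustrative only: DynamoDB's real partition hash is internal. This shows
# the principle that hashing the partition key value decides placement, and
# that items sharing a partition key always land on the same partition.

def partition_for(partition_key: str, num_partitions: int) -> int:
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Same key, same partition - which is why one hot key value cannot be
# spread across partitions by the service alone.
assert partition_for("USER#42", 8) == partition_for("USER#42", 8)
assert 0 <= partition_for("USER#42", 8) < 8
```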

Sort Key

The sort key (also called the range key) is the optional second attribute of a DynamoDB primary key and orders items within the same partition. Adding a sort key turns the primary key into a composite key and unlocks range queries with conditions such as begins_with, between, and >. The sort key is the foundation for both LSIs and for sort-key-prefix-based single-table designs.

Related: Partition Key, Composite Primary Key, Local Secondary Index, Single-Table Design

Source: Amazon DynamoDB - Primary key
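The range conditions above can be simulated over an in-memory item collection. Real queries use a KeyConditionExpression; this pure-Python sketch (with hypothetical PK/SK values) only shows the semantics of begins_with and between over lexicographically ordered sort keys.

```python
# Pure-Python sketch of sort-key conditions a DynamoDB Query supports.
# Item shapes and key values are hypothetical.

items = [  # one item collection (same partition key), ordered by sort key
    {"PK": "USER#42", "SK": "ORDER#2024-01-03"},
    {"PK": "USER#42", "SK": "ORDER#2024-02-14"},
    {"PK": "USER#42", "SK": "PROFILE"},
]

def begins_with(collection, prefix):
    return [i for i in collection if i["SK"].startswith(prefix)]

def between(collection, low, high):
    return [i for i in collection if low <= i["SK"] <= high]

assert len(begins_with(items, "ORDER#")) == 2                          # all orders
assert len(between(items, "ORDER#2024-01-01", "ORDER#2024-01-31")) == 1  # January only
```

Because sort keys compare lexicographically, ISO-8601 date strings sort chronologically, which is why they dominate real-world sort-key designs.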

Composite Primary Key

A composite primary key in DynamoDB combines a partition key and a sort key so that an item is uniquely identified by both attributes together. It is the structural foundation of the Single-Table Design pattern, in which one table holds many entity types differentiated by sort key prefix. The detailed key design playbook is collected in the Amazon DynamoDB Key Design, GSI, and LSI Dictionary.

Related: Partition Key, Sort Key, Single-Table Design, Global Secondary Index

Source: Amazon DynamoDB - Primary key

Global Secondary Index (GSI)

A Global Secondary Index is an index whose partition and sort keys can be different from the base table, supporting queries on alternate access patterns. A GSI has its own provisioned or on-demand capacity, is updated asynchronously from the base table, and therefore returns eventually consistent reads (strong reads are not supported on a GSI). GSIs can be added or removed online.

Related: Local Secondary Index, Composite Primary Key, Single-Table Design, Partition Key

Source: Amazon DynamoDB - Global Secondary Indexes

Local Secondary Index (LSI)

A Local Secondary Index shares the partition key with the base table but uses a different sort key, so it indexes within each partition. LSIs must be defined at table-creation time, share capacity with the base table, and support both eventually consistent and strongly consistent reads. LSIs are a useful tool when an entity needs multiple sort orders, but they cannot be added or removed after table creation.

Related: Global Secondary Index, Sort Key, Composite Primary Key

Source: Amazon DynamoDB - Local Secondary Indexes

Single-Table Design

Single-Table Design is a DynamoDB modeling approach in which one table stores multiple entity types (users, orders, comments, etc.) differentiated by partition key prefix and sort key prefix, with GSIs added for alternate access patterns. It minimises cross-entity round trips and is the canonical pattern for backend-for-frontend services on DynamoDB. The full playbook is in the Amazon DynamoDB Single-Table Design Guide.

Related: Composite Primary Key, Global Secondary Index, Hot Partition, Transactions

Source: Amazon DynamoDB - NoSQL design
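A minimal key scheme makes the pattern concrete. The prefixes and helper names below are hypothetical; the point is that entities sharing a partition key form one item collection that a single Query can fetch.

```python
# Hypothetical single-table key scheme: the entity type is encoded in the
# key prefixes, so a user and their orders share one item collection.

def user_key(user_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}

def order_key(user_id: str, order_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_id}"}

# Same PK means same item collection: Query(PK="USER#42") returns the
# profile and all orders in one round trip.
assert user_key("42")["PK"] == order_key("42", "A1")["PK"]
assert order_key("42", "A1")["SK"].startswith("ORDER#")
```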

Hot Partition

A hot partition occurs when a disproportionate share of read or write traffic targets a single partition key value, so the partition saturates while other partitions are idle. Adaptive Capacity is the platform-level mitigation; partition-key design (key salting, write sharding, time-bucket suffixes) is the application-level mitigation. Hot partitions are the most common root cause of "DynamoDB throttling" on otherwise under-utilised tables.

Related: Partition Key, Adaptive Capacity, Provisioned Capacity Mode, Single-Table Design

Source: DynamoDB - Designing partition keys to distribute workloads
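Write sharding, mentioned above as the application-level mitigation, looks like this in miniature. The key names and shard count are illustrative: writes append a deterministic suffix so traffic spreads over N partition key values, and reads fan out across all suffixes.

```python
import hashlib

# Write-sharding sketch (key names and NUM_SHARDS are illustrative): spread
# a hot partition key across N suffixed shards on write, fan out on read.

NUM_SHARDS = 8

def sharded_key(base_key: str, payload_id: str) -> str:
    """Deterministic shard suffix so the same payload always maps to one shard."""
    shard = int(hashlib.md5(payload_id.encode()).hexdigest(), 16) % NUM_SHARDS
    return f"{base_key}#{shard}"

def all_shards(base_key: str) -> list[str]:
    """Readers must query every shard and merge the results."""
    return [f"{base_key}#{s}" for s in range(NUM_SHARDS)]

keys = {sharded_key("EVENTS#2025-01-01", str(i)) for i in range(1000)}
assert 1 < len(keys) <= NUM_SHARDS              # writes no longer hit one partition key
assert len(all_shards("EVENTS#2025-01-01")) == NUM_SHARDS
```

The trade is explicit: write throughput multiplies by the shard count, while reads pay N queries plus a merge.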

Adaptive Capacity

Adaptive Capacity is a DynamoDB feature in which the service automatically reallocates throughput to absorb traffic spikes on specific partition keys without requiring manual rebalancing. It is enabled by default on all tables and effectively makes provisioned-throughput "hot key" problems much rarer than in the early years of the service. Adaptive Capacity does not remove the need for good key design - it raises the threshold at which bad key design becomes a production incident.

Related: Hot Partition, Provisioned Capacity Mode, On-Demand Capacity Mode

Source: DynamoDB - Designing partition keys

E. DynamoDB - Capacity, Streams, and Replication

Provisioned Capacity Mode

Provisioned capacity mode lets you specify the number of Read Capacity Units and Write Capacity Units a DynamoDB table can sustain per second. It is the right mode for predictable, steady workloads, and Auto Scaling adjusts the provisioned values within configured bounds. Reserved Capacity discounts are available for committed long-term provisioned capacity.

Related: On-Demand Capacity Mode, Read Capacity Unit, Write Capacity Unit, Adaptive Capacity

Source: DynamoDB - Read/write capacity modes

On-Demand Capacity Mode

On-demand capacity mode charges per request and instantly scales to up to double any previously observed peak (and beyond, after a brief warm-up following large step changes). It is the default and recommended mode for most workloads - unpredictable traffic, new applications without an established baseline, and spiky workloads where provisioning headroom would be wasteful. Switching a table between capacity modes is allowed, but subsequent switches are subject to a cooldown window - check the Source link for the current limit, which has tightened and loosened over the years.

Related: Provisioned Capacity Mode, Read Capacity Unit, Write Capacity Unit

Source: DynamoDB - On-demand capacity mode

Read Capacity Unit (RCU)

One Read Capacity Unit represents one strongly consistent read per second, two eventually consistent reads per second, or one-half of a transactional read per second, for items up to 4 KiB. RCU is the billing and quota unit for provisioned capacity reads and for the read side of on-demand metering. Reads of items larger than 4 KiB consume additional RCUs in 4 KiB increments.

Related: Write Capacity Unit, Provisioned Capacity Mode, On-Demand Capacity Mode, Transactions

Source: DynamoDB - Read/write capacity modes
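The RCU arithmetic above is easy to get wrong under load planning, so here it is as a small calculator following the rounding rules stated in the entry.

```python
import math

# RCU arithmetic from the entry above: size rounds up in 4 KiB increments,
# then the consistency level applies a factor.

def rcus_per_read(item_size_bytes: int, consistency: str = "strong") -> float:
    units = math.ceil(item_size_bytes / 4096)  # 4 KiB increments, rounded up
    factor = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return units * factor

assert rcus_per_read(4096, "strong") == 1
assert rcus_per_read(4097, "strong") == 2            # spills into a second 4 KiB increment
assert rcus_per_read(4096, "eventual") == 0.5        # eventual reads cost half
assert rcus_per_read(4096, "transactional") == 2     # transactional reads cost double
```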

Write Capacity Unit (WCU)

One Write Capacity Unit represents one standard write per second or one-half of a transactional write per second, for items up to 1 KiB. WCU is the billing and quota unit for provisioned capacity writes and the write side of on-demand metering. Writes of items larger than 1 KiB consume additional WCUs in 1 KiB increments, which is the structural reason single-table designs flatten data rather than nest it.

Related: Read Capacity Unit, Provisioned Capacity Mode, Transactions

Source: DynamoDB - Read/write capacity modes
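The write side follows the same shape with a 1 KiB increment, which is what makes large nested items expensive:

```python
import math

# WCU arithmetic from the entry above: size rounds up in 1 KiB increments;
# transactional writes cost double.

def wcus_per_write(item_size_bytes: int, transactional: bool = False) -> int:
    units = math.ceil(item_size_bytes / 1024)   # 1 KiB increments, rounded up
    return units * (2 if transactional else 1)

assert wcus_per_write(1024) == 1
assert wcus_per_write(1025) == 2                     # a second 1 KiB increment
assert wcus_per_write(1024, transactional=True) == 2
```

A 300-byte item and a 1,000-byte item cost the same one WCU, but a 5 KiB document costs five - the structural pressure toward flat items the entry mentions.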

Standard-IA Table Class

The Standard-Infrequent Access (Standard-IA) table class is a per-table storage option that lowers storage rates while raising read/write request rates, intended for tables whose data is rarely accessed but must remain queryable with single-digit-millisecond latency. It is selected at table creation or via a table-class modification, and it is the right choice for tables holding cold but still-online operational data.

Related: Provisioned Capacity Mode, On-Demand Capacity Mode, Point-in-Time Recovery

Source: DynamoDB - Table classes

Time to Live (TTL)

TTL lets you mark a numeric attribute (Unix epoch seconds) as the expiration time of an item, and DynamoDB asynchronously deletes expired items at no extra request cost. TTL deletions also flow into DynamoDB Streams as REMOVE events, which makes TTL a building block for session stores, ephemeral chat history, and archival pipelines that move expired items to S3.

Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup

Source: DynamoDB - Time to Live
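Setting the attribute is the whole client-side story. In this sketch the attribute name "expireAt" is hypothetical - it is whatever name you configured on the table - and the value must be Unix epoch seconds, per the entry above.

```python
import time

# TTL sketch: attach an epoch-seconds expiry attribute to an item.
# The attribute name ("expireAt") is whatever the table's TTL config names.

def with_ttl(item: dict, ttl_seconds: int, attr: str = "expireAt") -> dict:
    return {**item, attr: int(time.time()) + ttl_seconds}

session = with_ttl({"PK": "SESSION#abc", "data": "..."}, ttl_seconds=3600)
assert session["expireAt"] > int(time.time())   # one hour in the future
assert session["PK"] == "SESSION#abc"           # original attributes untouched
```

Deletion is asynchronous and can lag the expiry time, so readers that must never see expired items should also filter on the attribute.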

DynamoDB Streams

DynamoDB Streams is an ordered change-data-capture stream of item-level INSERT, MODIFY, and REMOVE events for a DynamoDB table, with 24-hour retention. It is the standard input for AWS Lambda triggers and for keeping derived indexes in sync. The separate Amazon Kinesis Data Streams for DynamoDB integration lets the same change feed land directly in Kinesis for longer retention and wider fan-out.

Related: Global Tables, Time to Live, Transactions, DynamoDB Zero-ETL Integration with OpenSearch

Source: DynamoDB Streams
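A Lambda-style consumer of this feed is a loop over records. The record shape below (eventName plus a dynamodb block with DynamoDB-JSON keys) follows the documented stream format; the key values are hypothetical.

```python
# Sketch of a Lambda-style handler over DynamoDB Streams records. The record
# shape follows the documented stream event format; key values are made up.

def handle_records(records: list[dict]) -> dict:
    """Count stream events by type, as a stand-in for real per-event logic."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in records:
        counts[record["eventName"]] += 1
    return counts

event = {"Records": [
    {"eventName": "INSERT", "dynamodb": {"Keys": {"PK": {"S": "USER#1"}}}},
    {"eventName": "REMOVE", "dynamodb": {"Keys": {"PK": {"S": "SESSION#9"}}}},  # e.g. a TTL expiry
]}
assert handle_records(event["Records"]) == {"INSERT": 1, "MODIFY": 0, "REMOVE": 1}
```

Note the REMOVE record: TTL deletions arrive through the same feed, which is how the archival pipelines mentioned under Time to Live are built.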

Global Tables (DynamoDB)

A Global Table is a multi-Region, multi-active replica set of a DynamoDB table where every replica accepts writes and conflicts are resolved with last-writer-wins. It targets active-active deployments with single-digit-millisecond read latency in each Region and an automatic RPO measured in seconds. Global Tables are independent of PITR and AWS Backup, which apply per-replica.

Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup, Global Database (Aurora)

Source: DynamoDB - Global Tables
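Last-writer-wins is simple enough to state as code. Global Tables resolve conflicts internally with a system timestamp; the sketch below uses a hypothetical application-level attribute purely to illustrate the rule.

```python
# Illustrative last-writer-wins resolution: given conflicting versions of the
# same item written in different Regions, the most recent timestamp wins.
# (Global Tables do this internally; "last_updated" here is hypothetical.)

def resolve(versions: list[dict], ts_attr: str = "last_updated") -> dict:
    return max(versions, key=lambda v: v[ts_attr])

winner = resolve([
    {"PK": "USER#42", "plan": "free", "last_updated": 1700000000},  # us-east-1 write
    {"PK": "USER#42", "plan": "pro",  "last_updated": 1700000005},  # eu-west-1 write, later
])
assert winner["plan"] == "pro"   # the concurrent "free" write is silently discarded
```

The discarded write is the design consequence to plan for: workloads that cannot tolerate losing a concurrent write need to route writes for a given key to one Region.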

Point-in-Time Recovery (PITR)

Point-in-Time Recovery is a per-table backup feature for DynamoDB that lets you restore a table to any second in the trailing recovery window (up to 35 days). It is independent of on-demand backups and is the recommended baseline for production tables. PITR continues to operate transparently across capacity-mode changes, table-class changes, and the addition or removal of GSIs.

Related: AWS Backup, Global Tables, Time to Live

Source: DynamoDB - Point-in-Time Recovery

Transactions (DynamoDB)

DynamoDB Transactions (TransactWriteItems / TransactGetItems) group up to 100 actions across one or many tables in the same AWS account and Region into a single ACID operation, with an aggregate item size limit of 4 MB per transaction. Transactional requests consume twice the capacity of equivalent non-transactional requests, in exchange for atomicity, isolation, and serializable cross-item updates. Transactions are the standard primitive for inventory, payment, and uniqueness-constraint patterns on DynamoDB; ACID guarantees apply only within the Region where the call was made, not across Global Tables replicas.

Related: Write Capacity Unit, Read Capacity Unit, Single-Table Design, DynamoDB Streams

Source: DynamoDB - Transactions
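The uniqueness-constraint pattern mentioned above can be sketched as a boto3-shaped TransactWriteItems request. The table name, keys, and attributes below are hypothetical; only the request structure and the 100-action limit come from the documented API.

```python
# Sketch of a boto3-style TransactWriteItems request implementing a
# uniqueness constraint: write the order and claim an idempotency key
# atomically. Table and attribute names are hypothetical.

transact_items = [
    {"Put": {
        "TableName": "app-table",
        "Item": {"PK": {"S": "ORDER#A1"}, "SK": {"S": "META"}, "total": {"N": "42"}},
    }},
    {"Put": {
        "TableName": "app-table",
        "Item": {"PK": {"S": "IDEMPOTENCY#req-777"}, "SK": {"S": "CLAIM"}},
        "ConditionExpression": "attribute_not_exists(PK)",  # reject a duplicate request
    }},
]

# Both writes commit or neither does; a transaction holds at most 100 actions.
assert len(transact_items) <= 100
assert all(len(action) == 1 for action in transact_items)  # one action type per element
```

If the condition on the claim item fails, the whole transaction is cancelled, so the order is never written twice - at double the WCU cost of plain writes.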

F. Purpose-built Databases

Amazon DocumentDB (with MongoDB compatibility)

Amazon DocumentDB is a managed document database with MongoDB-compatible APIs, designed for JSON workloads that benefit from a flexible schema. Its compute-storage separation is similar to Aurora: six storage copies across three AZs, and read replicas that share the same underlying volume. DocumentDB supports native vector search, which makes it a candidate for RAG systems whose operational data is already in document form.

Related: Amazon Neptune, Aurora Storage Layer, Vector Search, Amazon Keyspaces

Source: Amazon DocumentDB

Amazon Neptune

Amazon Neptune is a managed graph database engine that supports the property-graph model (queryable with Apache Gremlin and openCypher) and the RDF model (queryable with SPARQL). It targets workloads such as fraud detection, knowledge graphs, identity graphs, and recommendation engines where the structure of relationships is the dominant query pattern. Neptune complements Neptune Analytics, which loads a graph into memory for fast multi-hop analytics.

Related: Amazon Neptune Analytics, Graph Query Languages, Amazon DocumentDB

Source: Amazon Neptune

Amazon Neptune Analytics

Amazon Neptune Analytics is a separate analytics engine for graph workloads that loads a graph into memory for fast multi-hop analytics, path-finding, centrality, and similarity algorithms. It complements Amazon Neptune Database, which is the OLTP-style graph engine, and it supports vector search on graph node embeddings.

Related: Amazon Neptune, Graph Query Languages, Vector Search

Source: Amazon Neptune Analytics

Graph Query Languages (Gremlin / openCypher / SPARQL)

The three graph query languages supported on Amazon Neptune are Apache Gremlin (traversal-style, for property graphs), openCypher (declarative, for property graphs, originally from Neo4j), and SPARQL (declarative, for RDF graphs). Choice of language is driven by the data model and existing developer skills, not by Neptune itself; a Neptune cluster can hold both property graph and RDF data, but the two are stored in separate namespaces and an individual query targets only one model.

Related: Amazon Neptune, Amazon Neptune Analytics

Source: Amazon Neptune - Overview
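To make the stylistic differences concrete, here is the same question ("who did this user pay?") phrased in all three languages. The graph schema, labels, and URIs are hypothetical; only the language shapes are the point.

```python
# The same question in Neptune's three query languages. Vertex labels, edge
# names, and URIs below are hypothetical; the syntax contrast is the point.

gremlin = 'g.V("user-42").out("paid").values("name")'   # imperative traversal steps

opencypher = """
MATCH (u:User {id: 'user-42'})-[:PAID]->(m:Merchant)
RETURN m.name
"""                                                      # declarative pattern match

sparql = """
SELECT ?name WHERE {
  <urn:user:42> <urn:rel:paid> ?m .
  ?m <urn:prop:name> ?name .
}
"""                                                      # triple patterns over RDF

assert "out(" in gremlin
assert "MATCH" in opencypher
assert "SELECT" in sparql
```

Gremlin and openCypher both address the property-graph model, so they can coexist against the same data; SPARQL queries the RDF side only.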

Amazon Timestream for LiveAnalytics

Amazon Timestream for LiveAnalytics (formerly Amazon Timestream) is a serverless time-series database optimised for IoT and operational telemetry, with a tiered storage model (memory store for recent data, magnetic store for historical data) and a SQL-like query language. It scales to trillions of events per day without provisioning capacity, and it includes built-in interpolation and gap-filling functions for time-series queries.

Related: Amazon Timestream for InfluxDB, Amazon MemoryDB, Amazon ElastiCache for Redis OSS

Source: Amazon Timestream
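
As a sketch of what the built-in gap-filling looks like, the query below linearly interpolates a CPU metric over a regular timeline using LiveAnalytics' time-series functions. The database, table, measure, and dimension names (telemetry, cpu_metrics, cpu_utilization, hostname) are hypothetical; the string would normally be passed to the timestream-query client's query call.

```python
# A LiveAnalytics-style SQL query using CREATE_TIME_SERIES and
# INTERPOLATE_LINEAR to gap-fill a metric. All identifiers are
# illustrative placeholders, not real resources.
query = """
SELECT hostname,
       INTERPOLATE_LINEAR(
           CREATE_TIME_SERIES(time, measure_value::double),
           SEQUENCE(min(time), max(time), 1m)) AS cpu_filled
FROM "telemetry"."cpu_metrics"
WHERE measure_name = 'cpu_utilization'
  AND time > ago(1h)
GROUP BY hostname
"""
```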

Amazon Timestream for InfluxDB

Amazon Timestream for InfluxDB is a managed deployment of the open-source InfluxDB engine, suitable for teams already standardised on the InfluxQL or Flux query languages and on Telegraf for ingestion. It complements LiveAnalytics by offering a familiar InfluxDB API and ecosystem on managed AWS infrastructure.

Related: Amazon Timestream for LiveAnalytics, Amazon MemoryDB, AWS Backup

Source: Amazon Timestream for InfluxDB

Amazon ElastiCache for Redis OSS

Amazon ElastiCache for Redis OSS is a managed deployment of Redis OSS suitable for caching, leaderboards, queues, and pub/sub workloads. Clusters can be configured with Multi-AZ replication and automatic failover, and the service supports both classic cache use and persistent data store modes. ElastiCache is positioned as the cache in front of a separate primary; MemoryDB is the choice when in-memory state is the primary.

Related: Amazon ElastiCache for Valkey, Amazon MemoryDB, RDS Proxy

Source: Amazon ElastiCache for Redis OSS
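
The "cache in front of a separate primary" positioning usually means the cache-aside pattern. The sketch below shows its three steps with a plain dict standing in for the Redis OSS client and a placeholder function standing in for the primary database query; in production the dict would be a redis.Redis connection to the ElastiCache endpoint and writes would carry a TTL.

```python
# Cache-aside sketch. `cache` stands in for an ElastiCache endpoint and
# `load_from_primary` for a query against the primary database - both
# are illustrative placeholders.

cache = {}

def load_from_primary(key):
    # placeholder for a SELECT against the primary database
    return f"value-for-{key}"

def get_with_cache_aside(key):
    hit = cache.get(key)                 # 1. try the cache first
    if hit is not None:
        return hit
    value = load_from_primary(key)       # 2. on a miss, read the primary
    cache[key] = value                   # 3. populate the cache (TTL omitted)
    return value
```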

Amazon ElastiCache for Valkey

Amazon ElastiCache for Valkey is a managed deployment of Valkey, the Linux Foundation fork of Redis OSS created after the licence change in Redis 7.4. AWS positions it as a fully API-compatible alternative for new workloads and existing Redis OSS clusters, with the same scaling, Multi-AZ replication, and failover behaviour as ElastiCache for Redis OSS.

Related: Amazon ElastiCache for Redis OSS, Amazon MemoryDB

Source: Amazon ElastiCache

Amazon MemoryDB

Amazon MemoryDB is a Redis OSS / Valkey-compatible, durable in-memory database with Multi-AZ transaction log durability and microsecond reads / single-digit-millisecond writes. Unlike ElastiCache, it is positioned as the primary database for the workload rather than as a cache in front of a separate primary. MemoryDB is the in-memory companion to DynamoDB for workloads that need extremely low latency without giving up durability.

Related: Amazon ElastiCache for Redis OSS, Amazon ElastiCache for Valkey, Vector Search, On-Demand Capacity Mode

Source: Amazon MemoryDB

Amazon Keyspaces (for Apache Cassandra)

Amazon Keyspaces is a managed, serverless, Apache Cassandra-compatible wide-column database that uses the CQL protocol. It targets applications already built on Cassandra that want to remove the operational burden of running Cassandra rings, with single-digit-millisecond p99 latency at scale. Keyspaces supports both on-demand and provisioned capacity modes and provides point-in-time recovery (PITR) much as DynamoDB does.

Related: Amazon DocumentDB, On-Demand Capacity Mode, Point-in-Time Recovery

Source: Amazon Keyspaces
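
A minimal CQL table of the kind Keyspaces accepts is sketched below; the keyspace, table, and column names are hypothetical. The partition key (device_id) and clustering column (event_time) play the same roles as DynamoDB's partition key and sort key.

```python
# CQL DDL as a Python string; it would be executed through a Cassandra
# driver pointed at the Keyspaces endpoint. All identifiers are
# illustrative placeholders.
create_table_cql = """
CREATE TABLE IF NOT EXISTS iot.readings (
    device_id   text,
    event_time  timestamp,
    temperature double,
    PRIMARY KEY ((device_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
"""
```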

Amazon QLDB

Amazon QLDB (Quantum Ledger Database) was a managed ledger database with a cryptographically verifiable change history. It reached end of support on 2025-07-31, and AWS published migration playbooks to Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Managed Blockchain for affected workloads. New ledger-style requirements on AWS should now be modelled as immutable, append-only patterns on those services rather than on QLDB.

Related: AWS Database Migration Service, AWS Backup, Database Activity Streams

Source: Amazon QLDB (End-of-Support 2025-07-31)

Vector Search (pgvector / Aurora / DocumentDB)

Vector Search on AWS is the family of features that store high-dimensional embeddings and answer approximate-nearest-neighbour queries: pgvector on Amazon RDS for PostgreSQL and Aurora PostgreSQL, native vector search on Amazon DocumentDB and Amazon MemoryDB, and vector search on Amazon Neptune Analytics. It is the foundation of RAG systems built on operational databases rather than on a separate vector store, and it is one of the patterns most commonly added to existing AWS database stacks for generative AI workloads.

Related: Amazon DocumentDB, Amazon MemoryDB, Amazon Neptune Analytics, Trusted Language Extensions for PostgreSQL

Source: Aurora PostgreSQL - Vector store with pgvector

G. Migration and Integration

AWS Database Migration Service (DMS)

AWS DMS is a managed service for migrating data between an on-premises or AWS source database and an AWS target, with support for full-load, change-data-capture (CDC), and full-load-plus-CDC modes. It works across heterogeneous engine pairs (for example, Oracle to Aurora PostgreSQL) when combined with the AWS Schema Conversion Tool, and it is the default tool for online migrations where the source must stay live during cutover.

Related: AWS Schema Conversion Tool, Blue/Green Deployments, Amazon RDS Data API

Source: AWS Database Migration Service
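
The three modes show up as a single MigrationType field on the replication task. The sketch below builds the request shape for boto3's dms create_replication_task without sending it; the ARNs and the HR schema in the table mapping are placeholders.

```python
# Request shape for a full-load-plus-CDC DMS task. The ARNs and schema
# name are illustrative placeholders; the dict is built but not sent.
table_mappings = """{
  "rules": [{
    "rule-type": "selection",
    "rule-id": "1",
    "rule-name": "include-hr-schema",
    "object-locator": {"schema-name": "HR", "table-name": "%"},
    "rule-action": "include"
  }]
}"""

replication_task = {
    "ReplicationTaskIdentifier": "oracle-to-aurora-pg",
    "SourceEndpointArn": "arn:aws:dms:...:endpoint:SOURCE",    # placeholder
    "TargetEndpointArn": "arn:aws:dms:...:endpoint:TARGET",    # placeholder
    "ReplicationInstanceArn": "arn:aws:dms:...:rep:INSTANCE",  # placeholder
    "MigrationType": "full-load-and-cdc",  # or "full-load" / "cdc"
    "TableMappings": table_mappings,
}
# client = boto3.client("dms")
# client.create_replication_task(**replication_task)
```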

AWS Schema Conversion Tool (SCT) / DMS Schema Conversion

The AWS Schema Conversion Tool (SCT) and the newer DMS Schema Conversion convert source schemas, stored procedures, and SQL dialect features to the target engine, flagging objects that need manual rework. They are the typical first step of a heterogeneous migration with DMS - for example, Oracle PL/SQL or SQL Server T-SQL to Aurora PostgreSQL.

Related: AWS Database Migration Service, Babelfish for Aurora PostgreSQL, RDS Custom

Source: AWS Schema Conversion Tool

Amazon RDS Data API

The RDS Data API exposes supported Aurora DB clusters - including Aurora Serverless v2 and Aurora Serverless v1 - over HTTPS so that AWS Lambda and other serverless callers can run SQL without managing connections or VPC networking. It removes the connection-pool problem at the cost of slightly higher per-call overhead than a persistent connection, and it integrates with AWS Secrets Manager for credentials.

Related: Aurora Serverless v2, RDS Proxy, Aurora Zero-ETL Integration with Redshift

Source: Amazon RDS Data API
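
A single Data API call looks like the sketch below: one HTTPS request carrying the cluster ARN, a Secrets Manager ARN for credentials, and parameterised SQL. The ARNs, database name, and table are placeholders, and the request is built but not sent.

```python
# Request shape for an rds-data ExecuteStatement call via boto3.
# ARNs and identifiers are illustrative placeholders.
execute_statement_request = {
    "resourceArn": "arn:aws:rds:...:cluster:my-aurora-cluster",    # placeholder
    "secretArn": "arn:aws:secretsmanager:...:secret:my-db-creds",  # placeholder
    "database": "appdb",
    "sql": "SELECT id, name FROM users WHERE id = :id",
    "parameters": [
        {"name": "id", "value": {"longValue": 42}},
    ],
}
# client = boto3.client("rds-data")
# response = client.execute_statement(**execute_statement_request)
```

Because credentials come from the secretArn and the transport is plain HTTPS, the caller needs no driver, no connection pool, and no VPC attachment - which is exactly the trade described above.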

Aurora Zero-ETL Integration with Amazon Redshift

An Aurora Zero-ETL Integration replicates an Aurora MySQL or Aurora PostgreSQL DB cluster into Amazon Redshift continuously and within seconds, without customer-managed ETL pipelines. Once landed, the data is queryable in Redshift alongside other warehouse data, which makes it the standard path for "operational data, but in Redshift" use cases. Amazon RDS for MySQL, RDS for PostgreSQL, and RDS for Oracle are covered by the analogous Amazon RDS zero-ETL integration with Amazon Redshift, and Amazon DynamoDB has its own zero-ETL integration with Amazon Redshift, so this pattern now spans the bulk of the AWS operational database surface.

Related: DynamoDB Zero-ETL Integration with OpenSearch, Lake Formation Integration, AWS Database Migration Service

Source: Aurora Zero-ETL Integrations

DynamoDB Zero-ETL Integration with Amazon OpenSearch Service

Zero-ETL integration between Amazon DynamoDB and Amazon OpenSearch Service automatically replicates DynamoDB items into an OpenSearch index for full-text search, vector search, and analytics, without a customer-built pipeline on DynamoDB Streams plus Lambda. It is the recommended path for search use cases on DynamoDB-backed data, especially when full-text and vector queries are needed alongside the existing DynamoDB key-value access pattern.

Related: DynamoDB Streams, Aurora Zero-ETL Integration with Redshift, Vector Search

Source: DynamoDB - Zero-ETL with OpenSearch

Database Activity Streams (Aurora)

Database Activity Streams is a feature for Amazon Aurora (PostgreSQL and MySQL) and Amazon RDS for Oracle that streams a near-real-time, encrypted, tamper-resistant log of database activity to Amazon Kinesis Data Streams. It is the basis for external SIEM and audit pipelines, and it complements Performance Insights, which is for performance troubleshooting rather than for compliance audit.

Related: AWS Backup, Performance Insights, AWS Database Migration Service

Source: Database Activity Streams

Lake Formation Integration (DynamoDB / Aurora export to S3)

AWS Lake Formation integrates with operational databases through point-in-time exports to Amazon S3 from DynamoDB and from Aurora, which can then be governed by Lake Formation permissions and queried with Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum. It is the recommended path for offline analytics on operational data without putting query load on the source database.

Related: Aurora Zero-ETL Integration with Redshift, DynamoDB Zero-ETL Integration with OpenSearch, AWS Backup

Source: DynamoDB - Export to Amazon S3
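
On the DynamoDB side, the export is a single API call. The sketch below builds the request shape for boto3's dynamodb export_table_to_point_in_time without sending it; the table ARN, bucket, and prefix are placeholders.

```python
# Request shape for a DynamoDB point-in-time export to S3. The ARN,
# bucket, and prefix are illustrative placeholders; the dict is built
# but not sent.
export_request = {
    "TableArn": "arn:aws:dynamodb:...:table/Orders",  # placeholder
    "S3Bucket": "my-data-lake-bucket",                # placeholder
    "S3Prefix": "exports/orders/",
    "ExportFormat": "DYNAMODB_JSON",  # or "ION"
}
# client = boto3.client("dynamodb")
# client.export_table_to_point_in_time(**export_request)
```

The export reads from the continuous backup, not from the live table, which is why it adds no load to the source.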

AWS Backup for RDS, Aurora, and DynamoDB

AWS Backup is a centralised, policy-driven backup service that manages backups for Amazon RDS, Amazon Aurora, Amazon DynamoDB (including continuous backups for PITR), and other AWS resources. It provides cross-Region and cross-account copy, immutable Vault Lock, and lifecycle to cold storage in one place. AWS Backup is the recommended control plane when the same retention and compliance policies must span multiple AWS database services.

Related: Point-in-Time Recovery, Global Tables, Global Database (Aurora)

Source: AWS Backup
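
A shared retention policy across those services is expressed as one backup plan. The sketch below shows the plan structure that boto3's backup create_backup_plan expects; the plan name, vault name, and 35-day retention are illustrative choices, and the dict is built but not sent.

```python
# Structure of a BackupPlan for AWS Backup's CreateBackupPlan. Names
# and retention values are illustrative placeholders.
backup_plan = {
    "BackupPlanName": "db-fleet-daily",
    "Rules": [
        {
            "RuleName": "daily-35-day-retention",
            "TargetBackupVaultName": "Default",
            "ScheduleExpression": "cron(0 5 * * ? *)",  # daily at 05:00 UTC
            "Lifecycle": {"DeleteAfterDays": 35},
        }
    ],
}
# client = boto3.client("backup")
# client.create_backup_plan(BackupPlan=backup_plan)
```

Resources from RDS, Aurora, and DynamoDB are then attached to the plan via resource assignments (tags or ARNs), which is what makes one policy span several database services.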

Summary

This glossary collects the essential terms an engineer or architect repeatedly encounters when choosing or operating an AWS database, across relational engines (Amazon RDS, Amazon Aurora), key-value (Amazon DynamoDB), document (Amazon DocumentDB), graph (Amazon Neptune, Amazon Neptune Analytics), time-series (Amazon Timestream), in-memory (Amazon ElastiCache, Amazon MemoryDB), wide-column (Amazon Keyspaces), and ledger (Amazon QLDB) services, plus the migration and integration layer that sits beside them.

Each definition is short enough to read in one breath, each Related line maps the term to its neighbours, and each Source link goes to the canonical AWS documentation. I will continue to update this glossary as the AWS database surface evolves - new engines, new Zero-ETL integrations, and new compatibility layers are exactly the kind of vocabulary worth keeping current in one place.



Written by Hidekazu Konishi