AWS Database Glossary - RDS, Aurora, DynamoDB, DocumentDB, and Neptune Explained
This glossary is a companion to my earlier AI and Machine Learning Glossary for AWS and AI Agent Engineering Glossary. Each entry follows the same shape: a 2-4 sentence definition, a Related line cross-linking to other terms on this page, and a Source line linking to the canonical AWS documentation.
How to Use This Glossary
Use the A-Z Term Index below to jump directly to a term. The seven category sections group terms by the layer of the AWS database stack they belong to: RDS / Aurora core concepts, RDS / Aurora capacity and operations, RDS / Aurora compatibility and extensions, DynamoDB data modeling, DynamoDB capacity and replication, purpose-built databases (DocumentDB, Neptune, Timestream, ElastiCache, MemoryDB, Keyspaces, QLDB, vector search), and the migration and integration layer.
For deep dives on specific patterns, see Amazon DynamoDB Single-Table Design Guide, Amazon DynamoDB Key Design, GSI, and LSI Dictionary, Comparison of AWS Databases Using the Quorum Model, and AWS History and Timeline - Amazon RDS. This glossary intentionally stays at the concept level so the page does not rot as features evolve; follow the Source link on each term for the current scope, limits, and pricing in the AWS documentation.
A-Z Term Index
Adaptive Capacity · Amazon DocumentDB · Amazon ElastiCache for Redis OSS · Amazon ElastiCache for Valkey · Amazon Keyspaces · Amazon MemoryDB · Amazon Neptune · Amazon Neptune Analytics · Amazon QLDB · Amazon RDS Data API · Amazon Timestream for InfluxDB · Amazon Timestream for LiveAnalytics · Aurora Cluster Endpoint · Aurora Custom Endpoint · Aurora DSQL · Aurora I/O-Optimized · Aurora Optimized Reads · Aurora Optimized Writes · Aurora Reader Endpoint · Aurora Serverless v1 · Aurora Serverless v2 · Aurora Storage Layer · Aurora Zero-ETL Integration with Amazon Redshift · AWS Backup for RDS, Aurora, and DynamoDB · AWS Database Migration Service (DMS) · AWS Schema Conversion Tool / DMS Schema Conversion · Babelfish for Aurora PostgreSQL · Backtrack (Aurora MySQL) · Blue/Green Deployments · Composite Primary Key · Database Activity Streams (Aurora) · Database Cloning (Aurora) · DynamoDB Streams · DynamoDB Zero-ETL Integration with Amazon OpenSearch Service · Global Database (Aurora) · Global Secondary Index (GSI) · Global Tables (DynamoDB) · Graph Query Languages (Gremlin / openCypher / SPARQL) · Hot Partition · Lake Formation Integration · Local Secondary Index (LSI) · Multi-AZ Deployment · On-Demand Capacity Mode · Partition Key · Performance Insights · Point-in-Time Recovery (PITR) · Provisioned Capacity Mode · Provisioned Throughput (RDS instance class) · Quorum-based Replication (Aurora) · RDS Custom · RDS Proxy · Read Capacity Unit (RCU) · Read Replica · Single-Table Design · Sort Key · Standard-IA Table Class · Time to Live (TTL) · Transactions (DynamoDB) · Trusted Language Extensions (TLE) for PostgreSQL · Vector Search (pgvector / Aurora / DocumentDB) · Write Capacity Unit (WCU)
AWS Database Service Map
The eight columns below frame how the terms in this glossary map to AWS database categories: Relational (RDS, Aurora), Key-Value (DynamoDB), Document (DocumentDB), Graph (Neptune, Neptune Analytics), Time-series (Timestream LiveAnalytics, Timestream for InfluxDB), In-Memory (ElastiCache for Redis OSS, ElastiCache for Valkey, MemoryDB), Wide-Column (Keyspaces), and Ledger (QLDB, with end-of-support announced for 2025-07-31). DMS, the Schema Conversion Tool, AWS Backup, and the Zero-ETL integrations cross-cut every column.
| Relational (SQL OLTP) | Key-Value (NoSQL KV) | Document (JSON) | Graph (Property / RDF) | Time-series (IoT / Metrics) | In-Memory (Cache / Primary) | Wide-Column (Cassandra) | Ledger (EoS 2025-07-31) |
|---|---|---|---|---|---|---|---|
| Amazon RDS · Amazon Aurora · Aurora Serverless v2 · Babelfish · RDS Proxy | Amazon DynamoDB · Global Tables · PITR · DynamoDB Streams · Standard-IA | Amazon DocumentDB · Vector Search · MongoDB API | Amazon Neptune · Neptune Analytics · Gremlin · openCypher · SPARQL | Amazon Timestream LiveAnalytics · Amazon Timestream for InfluxDB | ElastiCache for Redis OSS · ElastiCache for Valkey · Amazon MemoryDB | Amazon Keyspaces · CQL Protocol | Amazon QLDB · Verifiable History |

Cross-cutting layer (applies across all columns):

- Migration: AWS DMS (full-load + CDC) · AWS Schema Conversion Tool / DMS Schema Conversion · Blue/Green Deployments
- Backup: AWS Backup (RDS, Aurora, DynamoDB) · PITR · Global Tables / Global Database · Vault Lock
- Zero-ETL integrations: Aurora → Amazon Redshift · DynamoDB → Amazon OpenSearch Service · Lake Formation export to S3 (Parquet)
- Amazon QLDB end-of-support: 2025-07-31. Migrate to immutable patterns on Amazon Aurora PostgreSQL, Amazon DynamoDB, or Amazon Managed Blockchain.
A. RDS / Aurora - Core Concepts
Multi-AZ Deployment
A Multi-AZ deployment for Amazon RDS provisions a primary DB instance and a synchronously replicated standby (or two readable standbys in Multi-AZ DB cluster deployments) in different Availability Zones. RDS performs automatic failover to the standby when the primary fails, when the underlying host needs replacement, or during patching, typically within 60-120 seconds. Multi-AZ is the standard production posture for RDS engines that do not run on the Aurora storage layer.
Related: Read Replica, Aurora Storage Layer, Blue/Green Deployments, RDS Proxy
Source: Amazon RDS - High availability (Multi-AZ)
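As a minimal boto3 sketch (all identifiers hypothetical), provisioning the synchronous standby is a single flag at instance creation:

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="orders-prod",   # hypothetical name
    Engine="postgres",
    DBInstanceClass="db.m6g.large",
    AllocatedStorage=100,
    MasterUsername="dbadmin",
    ManageMasterUserPassword=True,  # store the master password in Secrets Manager
    MultiAZ=True,                   # synchronous standby in a second AZ
)
```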
Read Replica
A Read Replica is an asynchronously replicated copy of an Amazon RDS source DB instance used to offload read traffic and to create cross-Region copies for disaster recovery. RDS Read Replicas can be promoted to standalone primary instances, while Aurora Replicas share the same underlying storage volume and therefore have near-zero replication lag. Read Replicas are independent of Multi-AZ standbys: a single deployment can use both at once for different reasons.
Related: Multi-AZ Deployment, Aurora Reader Endpoint, Global Database, Aurora Storage Layer
Source: Amazon RDS - Working with read replicas
Aurora Storage Layer
Aurora separates compute from storage and uses a distributed, log-structured Aurora storage layer that stores six copies of every data block across three Availability Zones. Writes are acknowledged when a 4-of-6 quorum confirms the redo log record, so the system tolerates the loss of an entire AZ plus one additional copy without losing data. Amazon DocumentDB is built on a similar compute-storage separation.
Related: Quorum-based Replication, Aurora Cluster Endpoint, Multi-AZ Deployment, Amazon DocumentDB
Source: Amazon Aurora - Storage and reliability
Quorum-based Replication (Aurora)
Aurora uses a 4/6 write quorum and a 3/6 read quorum across its six storage copies in three AZs, which means a write succeeds once four nodes acknowledge it and a read needs only three nodes to reconstruct the latest version. This is the design that lets Aurora survive the loss of a whole AZ for writes and the loss of an AZ plus one node for reads. For a deeper comparison across AWS databases, see Comparison of AWS Databases Using the Quorum Model.
Related: Aurora Storage Layer, Multi-AZ Deployment, Global Database
Source: Amazon Aurora - Storage and reliability
Backtrack (Aurora MySQL)
Backtrack lets an Aurora MySQL DB cluster rewind its current state to a target time in the recent past without restoring from a snapshot. It is implemented on top of the Aurora storage log and is limited to Aurora MySQL with a configurable target backtrack window of up to 72 hours. Unlike PITR, backtrack does not create a new cluster - the current cluster is moved to the target time in place.
Related: Database Cloning, Point-in-Time Recovery, AWS Backup
Source: Aurora MySQL - Backtracking
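A minimal boto3 sketch of a rewind, assuming a hypothetical cluster that was created with a non-zero backtrack window:

```python
import boto3
from datetime import datetime, timedelta, timezone

rds = boto3.client("rds")

# Rewind the cluster in place to 30 minutes ago; no new cluster is created.
rds.backtrack_db_cluster(
    DBClusterIdentifier="shop-aurora-mysql",  # hypothetical name
    BacktrackTo=datetime.now(timezone.utc) - timedelta(minutes=30),
)
```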
Database Cloning (Aurora)
Aurora supports near-instant database clones using copy-on-write semantics on its shared storage layer. A clone starts as a logical pointer to the source volume and only consumes additional storage as either the clone or the source is modified, which makes it useful for production-fidelity test environments without doubling cost. The same mechanism underlies Aurora Blue/Green Deployments and Backtrack.
Related: Backtrack, Blue/Green Deployments, Aurora Storage Layer
Source: Amazon Aurora - Cloning a volume
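A minimal sketch of a clone via the copy-on-write restore type (all identifiers hypothetical):

```python
import boto3

rds = boto3.client("rds")

# Copy-on-write clone of a production cluster; storage is shared until modified.
rds.restore_db_cluster_to_point_in_time(
    SourceDBClusterIdentifier="shop-aurora-prod",
    DBClusterIdentifier="shop-aurora-loadtest",
    RestoreType="copy-on-write",
    UseLatestRestorableTime=True,
)
# The clone has no compute yet; add an instance before connecting.
rds.create_db_instance(
    DBInstanceIdentifier="shop-aurora-loadtest-1",
    DBClusterIdentifier="shop-aurora-loadtest",
    Engine="aurora-mysql",
    DBInstanceClass="db.r6g.large",
)
```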
Aurora Cluster Endpoint
The cluster endpoint of an Aurora DB cluster always resolves to the current writer instance. Applications use it for read/write traffic, and during a failover Aurora reassigns the endpoint to the newly promoted writer with no client-side reconfiguration. There is exactly one cluster endpoint per Aurora cluster.
Related: Aurora Reader Endpoint, Aurora Custom Endpoint, RDS Proxy
Source: Amazon Aurora - Endpoints
Aurora Reader Endpoint
The reader endpoint of an Aurora DB cluster load-balances connections across all available Aurora Replicas in the cluster. It is the standard target for read-only workloads and automatically excludes the writer or any unhealthy replicas. Connection balancing happens at session establishment time, not per query, so long-lived connections can pin to one replica.
Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Read Replica, RDS Proxy
Source: Amazon Aurora - Endpoints
Aurora Custom Endpoint
A custom endpoint is a user-defined endpoint that load-balances across a chosen subset of instances in an Aurora DB cluster - for example, "analytics replicas only" or "instances of class db.r6g.4xlarge". It lets you route specific workloads to specific hardware without splitting the cluster; membership is managed with a static member list or an exclusion list of instances.
Related: Aurora Cluster Endpoint, Aurora Reader Endpoint, RDS Proxy
Source: Amazon Aurora - Endpoints
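For example, a hypothetical "analytics" endpoint pinned to two specific replicas might look like this in boto3:

```python
import boto3

rds = boto3.client("rds")

# Route analytics traffic to two dedicated replicas (all names hypothetical).
rds.create_db_cluster_endpoint(
    DBClusterIdentifier="shop-aurora-prod",
    DBClusterEndpointIdentifier="shop-analytics",
    EndpointType="READER",
    StaticMembers=["shop-aurora-prod-replica-3", "shop-aurora-prod-replica-4"],
)
```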
Global Database (Aurora)
An Aurora Global Database spans multiple AWS Regions: writes go to a primary cluster in one Region, and Aurora replicates them asynchronously at the storage layer to up to five secondary Regions with typical lag under one second. It is the recommended pattern for low-latency cross-Region reads and for disaster recovery with a target RPO measured in seconds. A secondary Region can be promoted to primary in a planned switchover or in an unplanned failover.
Related: Read Replica, Multi-AZ Deployment, Aurora Storage Layer, Global Tables (DynamoDB)
Source: Amazon Aurora - Global databases
B. RDS / Aurora - Capacity and Operations
Aurora Serverless v1
Aurora Serverless v1 is the original on-demand auto-scaling configuration for Aurora, which scales compute in discrete capacity units between defined minimum and maximum values and can pause to zero capacity during idle periods. Aurora Serverless v1 reached its end of life on March 31, 2025, and clusters that had not been migrated to Aurora Serverless v2 by that date are upgraded by AWS during a scheduled maintenance window. v1 lacks several features that v2 supports (Global Database, RDS Proxy, parallel query), so v2 is the recommended target for all new workloads and remaining migrations.
Related: Aurora Serverless v2, Provisioned Throughput, Aurora Cluster Endpoint
Source: Amazon Aurora Serverless v1
Aurora Serverless v2
Aurora Serverless v2 scales compute capacity in fine-grained increments measured in Aurora Capacity Units (ACUs) without disconnecting clients. It supports all major Aurora features (Global Database, RDS Proxy, Performance Insights, parallel query) that v1 did not, and is the recommended serverless option for production workloads with variable load. v2 can also be combined with provisioned instances in the same cluster.
Related: Aurora Serverless v1, Aurora I/O-Optimized, Provisioned Throughput, Amazon RDS Data API
Source: Amazon Aurora Serverless v2
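A minimal boto3 sketch with hypothetical names and an assumed Serverless-v2-capable engine version; the ACU range is the only capacity knob:

```python
import boto3

rds = boto3.client("rds")

rds.create_db_cluster(
    DBClusterIdentifier="events-aurora-sv2",
    Engine="aurora-postgresql",
    EngineVersion="16.4",  # assumption: any Serverless v2-capable version works
    MasterUsername="dbadmin",
    ManageMasterUserPassword=True,
    ServerlessV2ScalingConfiguration={"MinCapacity": 0.5, "MaxCapacity": 32},
)
rds.create_db_instance(
    DBInstanceIdentifier="events-aurora-sv2-1",
    DBClusterIdentifier="events-aurora-sv2",
    Engine="aurora-postgresql",
    DBInstanceClass="db.serverless",  # the Serverless v2 instance class
)
```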
Aurora DSQL
Amazon Aurora DSQL is a serverless, distributed SQL database with PostgreSQL compatibility that offers active-active multi-Region reads and writes with strong consistency. It is architected as a separate service from the classic Aurora cluster (no provisioned instances, no failover concept) and scales reads and writes independently across Regions without application-level conflict handling. Aurora DSQL suits globally distributed OLTP workloads that need single-digit-millisecond regional latency and the ability to write in any Region, provided the operational simplicity of "no instances and no failover" outweighs the broader feature set of classic Aurora; extensions, custom parameters, and IAM Database Authentication parity are still maturing.
Related: Aurora Serverless v2, Global Database (Aurora), Quorum-based Replication (Aurora)
Source: Amazon Aurora DSQL - User Guide
Aurora I/O-Optimized
Aurora I/O-Optimized is a cluster storage configuration in which I/O charges are removed and the per-hour instance and storage rates are higher in exchange. It is intended for workloads where I/O cost would otherwise dominate (typically more than 25 percent of total Aurora cost), and the configuration can be toggled per cluster.
Related: Aurora Optimized Reads, Aurora Optimized Writes, Aurora Storage Layer
Source: Amazon Aurora - Storage configurations
Aurora Optimized Reads
Aurora Optimized Reads is an instance-class feature for Aurora PostgreSQL and Aurora MySQL on supported Graviton and Intel NVMe-equipped instances (such as db.r6gd and db.r6id) that uses the local NVMe SSD as a tier for temporary objects and as an extension of the buffer cache. It can materially improve query latency for read-heavy workloads with large working sets that do not fit in memory, and it is enabled simply by choosing an instance class that supports it.
Related: Aurora Optimized Writes, Aurora I/O-Optimized, Aurora Storage Layer
Source: Aurora PostgreSQL - Optimized Reads
Aurora Optimized Writes
Aurora Optimized Writes is a feature of Aurora MySQL on supported instance classes that improves write throughput by eliminating the InnoDB doublewrite buffer, relying instead on atomic 16 KiB page writes to Aurora storage. It is selected at cluster creation on supported Aurora MySQL 3.x instance classes (such as db.r6i and db.r7g); enabling it on an existing cluster typically requires a snapshot restore or a Blue/Green switchover to re-initialise the storage page format. The feature is the main reason write-heavy MySQL workloads see a step change in throughput on newer Aurora MySQL instance classes.
Related: Aurora Optimized Reads, Aurora I/O-Optimized, Aurora Storage Layer
Source: Aurora MySQL - Optimized Writes
Blue/Green Deployments
A Blue/Green Deployment in Amazon RDS and Amazon Aurora creates a separate, continuously replicated green environment that mirrors the production blue one, where you stage and test engine upgrades, schema changes, or parameter changes. Switchover swaps endpoints to the green environment, typically in under a minute, with safety checks for replication lag and active sessions. The pattern is the recommended way to perform major-version upgrades on RDS and Aurora.
Related: Database Cloning, Multi-AZ Deployment, AWS Database Migration Service
Source: Amazon RDS - Blue/Green Deployments
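A hedged sketch of the two API calls involved, with a hypothetical cluster ARN and version numbers:

```python
import boto3

rds = boto3.client("rds")

# Create the green environment for a major-version upgrade.
bg = rds.create_blue_green_deployment(
    BlueGreenDeploymentName="shop-pg15-to-pg16",
    Source="arn:aws:rds:eu-west-1:123456789012:cluster:shop-aurora-prod",
    TargetEngineVersion="16.4",
)
# After validating green, swap the endpoints; the timeout aborts a slow switchover.
rds.switchover_blue_green_deployment(
    BlueGreenDeploymentIdentifier=bg["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"],
    SwitchoverTimeout=300,
)
```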
Provisioned Throughput (RDS instance class)
For Amazon RDS in its non-serverless mode, capacity is provisioned by choosing a specific DB instance class (db.m6g, db.r7g, db.x2iedn, etc.) that fixes CPU, memory, and EBS bandwidth. Vertical scaling requires a modify-instance operation with a short downtime, which is the structural reason teams move CPU-spiky workloads to Aurora Serverless v2 or split read traffic onto Read Replicas.
Related: Aurora Serverless v2, On-Demand Capacity Mode, Read Replica
Source: Amazon RDS - DB instance classes
RDS Proxy
Amazon RDS Proxy is a fully managed, highly available database proxy in front of RDS and Aurora that pools and shares database connections across application clients. It reduces connection overhead for serverless and microservice workloads, preserves application connections across DB failovers, and integrates with AWS IAM and AWS Secrets Manager for authentication. RDS Proxy is the recommended fronting layer for Lambda-based access to RDS and Aurora.
Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Multi-AZ Deployment, Amazon RDS Data API
Source: Amazon RDS Proxy
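A minimal sketch of IAM-authenticated access through a hypothetical proxy endpoint (pymysql is an assumption; any TLS-capable MySQL client works):

```python
import boto3
import pymysql  # assumption: a MySQL driver for an Aurora MySQL proxy target

PROXY_HOST = "shop-proxy.proxy-abc123.eu-west-1.rds.amazonaws.com"  # hypothetical

# IAM auth: a short-lived signed token replaces the password.
token = boto3.client("rds").generate_db_auth_token(
    DBHostname=PROXY_HOST, Port=3306, DBUsername="app_user"
)
conn = pymysql.connect(
    host=PROXY_HOST,
    user="app_user",
    password=token,
    ssl={"ca": "/opt/rds-ca-bundle.pem"},  # TLS is required for IAM auth
)
```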
C. RDS / Aurora - Compatibility and Extensions
Babelfish for Aurora PostgreSQL
Babelfish for Aurora PostgreSQL is an optional capability that lets an Aurora PostgreSQL DB cluster understand the SQL Server T-SQL dialect and the TDS wire protocol on a separate port. Applications written for Microsoft SQL Server can connect with minimal code changes, which makes it a tool for incremental migration off SQL Server licences. Babelfish coexists with the native PostgreSQL endpoint on the same cluster.
Related: Trusted Language Extensions for PostgreSQL, AWS Schema Conversion Tool, AWS Database Migration Service
Source: Aurora PostgreSQL - Babelfish
Trusted Language Extensions (TLE) for PostgreSQL
Trusted Language Extensions is an open-source development kit that lets users build PostgreSQL extensions in JavaScript, Perl, or other supported languages and install them safely in managed environments such as Amazon RDS and Aurora PostgreSQL. It removes the need to wait for AWS to certify each new extension and is the official path for custom server-side logic on managed PostgreSQL.
Related: Babelfish for Aurora PostgreSQL, Aurora Storage Layer, RDS Custom, Vector Search
Source: Trusted Language Extensions for PostgreSQL
Performance Insights
Performance Insights is a database-aware monitoring view on top of Amazon RDS and Amazon Aurora that visualises database load (DB Load, measured in average active sessions) broken down by wait event, SQL, host, or user. It complements Amazon CloudWatch and is the default first stop when diagnosing latency or saturation on a managed DB instance. Retention is configurable, with seven days included at no extra charge.
Related: RDS Proxy, Aurora Optimized Reads, Database Activity Streams
Source: Amazon RDS - Performance Insights
RDS Custom
Amazon RDS Custom is a managed RDS variant that grants the customer access to the underlying operating system and database engine for workloads (notably Oracle) that require third-party software, custom patches, or filesystem-level customisation. The customer takes on more of the operational responsibility in return, including OS patching and database engine patching outside of the standard RDS automation.
Related: Trusted Language Extensions for PostgreSQL, Provisioned Throughput, AWS Database Migration Service
Source: Amazon RDS Custom
D. DynamoDB - Data Modeling
Partition Key
The partition key (also called the hash key) is the mandatory first attribute of a DynamoDB primary key, and its value is hashed to decide which physical partition stores the item. A well-distributed partition key is the single most important determinant of DynamoDB throughput because traffic on a single partition is bounded. Item collections sharing a partition key are stored together.
Related: Sort Key, Composite Primary Key, Hot Partition, Adaptive Capacity
Source: Amazon DynamoDB - Primary key
Sort Key
The sort key (also called the range key) is the optional second attribute of a DynamoDB primary key and orders items within the same partition. Adding a sort key turns the primary key into a composite key and unlocks range queries with conditions such as begins_with, between, and >. The sort key is the foundation for both LSIs and for sort-key-prefix-based single-table designs.
Related: Partition Key, Composite Primary Key, Local Secondary Index, Single-Table Design
Source: Amazon DynamoDB - Primary key
Composite Primary Key
A composite primary key in DynamoDB combines a partition key and a sort key so that an item is uniquely identified by both attributes together. It is the structural foundation of the Single-Table Design pattern, in which one table holds many entity types differentiated by sort key prefix. The detailed key design playbook is collected in the Amazon DynamoDB Key Design, GSI, and LSI Dictionary.
Related: Partition Key, Sort Key, Single-Table Design, Global Secondary Index
Source: Amazon DynamoDB - Primary key
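A minimal boto3 sketch of a hypothetical table with a generic PK/SK composite key:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# One hypothetical table: PK identifies the entity owner, SK orders its items.
dynamodb.create_table(
    TableName="shop",
    AttributeDefinitions=[
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
        {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)
```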
Global Secondary Index (GSI)
A Global Secondary Index is an index whose partition and sort keys can be different from the base table, supporting queries on alternate access patterns. A GSI has its own provisioned or on-demand capacity, is updated asynchronously from the base table, and therefore returns eventually consistent reads (strong reads are not supported on a GSI). GSIs can be added or removed online.
Related: Local Secondary Index, Composite Primary Key, Single-Table Design, Partition Key
Source: Amazon DynamoDB - Global Secondary Indexes
Local Secondary Index (LSI)
A Local Secondary Index shares the partition key with the base table but uses a different sort key, so it indexes within each partition. LSIs must be defined at table-creation time, share capacity with the base table, and support both eventually consistent and strongly consistent reads. LSIs are a useful tool when an entity needs multiple sort orders, but they cannot be added or removed after table creation.
Related: Global Secondary Index, Sort Key, Composite Primary Key
Source: Amazon DynamoDB - Local Secondary Indexes
Single-Table Design
Single-Table Design is a DynamoDB modeling approach in which one table stores multiple entity types (users, orders, comments, etc.) differentiated by partition key prefix and sort key prefix, with GSIs added for alternate access patterns. It minimises cross-entity round trips and is the canonical pattern for backend-for-frontend services on DynamoDB. The full playbook is in the Amazon DynamoDB Single-Table Design Guide.
Related: Composite Primary Key, Global Secondary Index, Hot Partition, Transactions
Source: Amazon DynamoDB - NoSQL design
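A sketch of the pattern's core read, reusing the hypothetical shop table and key convention from the Composite Primary Key entry:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# One customer and all their orders in a single request, via the sort-key prefix.
resp = dynamodb.query(
    TableName="shop",
    KeyConditionExpression="PK = :pk AND begins_with(SK, :prefix)",
    ExpressionAttributeValues={
        ":pk": {"S": "CUSTOMER#42"},
        ":prefix": {"S": "ORDER#"},
    },
)
for item in resp["Items"]:
    print(item["SK"]["S"])
```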
Hot Partition
A hot partition occurs when a disproportionate share of read or write traffic targets a single partition key value, so the partition saturates while other partitions are idle. Adaptive Capacity is the platform-level mitigation; partition-key design (key salting, write sharding, time-bucket suffixes) is the application-level mitigation. Hot partitions are the most common root cause of "DynamoDB throttling" on otherwise under-utilised tables.
Related: Partition Key, Adaptive Capacity, Provisioned Capacity Mode, Single-Table Design
Source: DynamoDB - Designing partition keys to distribute workloads
Adaptive Capacity
Adaptive Capacity is a DynamoDB feature in which the service automatically reallocates throughput to absorb traffic spikes on specific partition keys without requiring manual rebalancing. It is enabled by default on all tables and effectively makes provisioned-throughput "hot key" problems much rarer than in the early years of the service. Adaptive Capacity does not remove the need for good key design - it raises the threshold at which bad key design becomes a production incident.
Related: Hot Partition, Provisioned Capacity Mode, On-Demand Capacity Mode
Source: DynamoDB - Designing partition keys
E. DynamoDB - Capacity, Streams, and Replication
Provisioned Capacity Mode
Provisioned capacity mode lets you specify the number of Read Capacity Units and Write Capacity Units a DynamoDB table can sustain per second. It is the right mode for predictable, steady workloads, and Auto Scaling adjusts the provisioned values within configured bounds. Reserved Capacity discounts are available for committed long-term provisioned capacity.
Related: On-Demand Capacity Mode, Read Capacity Unit, Write Capacity Unit, Adaptive Capacity
Source: DynamoDB - Read/write capacity modes
On-Demand Capacity Mode
On-demand capacity mode charges per request and scales instantaneously up to any previously observed peak (and beyond, with brief warm-up after large step changes). It is the default and recommended mode for most workloads - unpredictable traffic, new applications without an established baseline, and spiky workloads where provisioning headroom would be wasteful. Switching a table between capacity modes is allowed, but subsequent switches are subject to a cooldown window - check the Source link for the current limit, which has tightened and loosened over the years.
Related: Provisioned Capacity Mode, Read Capacity Unit, Write Capacity Unit
Source: DynamoDB - On-demand capacity mode
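Switching modes is a one-line table update (table name hypothetical); the reverse switch uses BillingMode PROVISIONED plus a ProvisionedThroughput value, and is subject to the cooldown window:

```python
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.update_table(TableName="shop", BillingMode="PAY_PER_REQUEST")
```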
Read Capacity Unit (RCU)
One Read Capacity Unit represents one strongly consistent read per second, two eventually consistent reads per second, or one-half of a transactional read per second, for items up to 4 KiB. RCU is the billing and quota unit for provisioned capacity reads and for the read side of on-demand metering. Reads of items larger than 4 KiB consume additional RCUs in 4 KiB increments.
Related: Write Capacity Unit, Provisioned Capacity Mode, On-Demand Capacity Mode, Transactions
Source: DynamoDB - Read/write capacity modes
Write Capacity Unit (WCU)
One Write Capacity Unit represents one standard write per second or one-half of a transactional write per second, for items up to 1 KiB. WCU is the billing and quota unit for provisioned capacity writes and the write side of on-demand metering. Writes of items larger than 1 KiB consume additional WCUs in 1 KiB increments, which is the structural reason single-table designs flatten data rather than nest it.
Related: Read Capacity Unit, Provisioned Capacity Mode, Transactions
Source: DynamoDB - Read/write capacity modes
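The arithmetic from the RCU and WCU definitions above, as a small worked sketch:

```python
import math

def rcus(item_kib: float, consistency: str = "strong") -> float:
    """RCUs per read: 4 KiB units; eventual = 0.5x, transactional = 2x."""
    units = math.ceil(item_kib / 4)
    factor = {"eventual": 0.5, "strong": 1, "transactional": 2}[consistency]
    return units * factor

def wcus(item_kib: float, transactional: bool = False) -> float:
    """WCUs per write: 1 KiB units; transactional writes cost 2x."""
    return math.ceil(item_kib) * (2 if transactional else 1)

print(rcus(10.5))                     # 3.0 -> strongly consistent 10.5 KiB read
print(rcus(10.5, "eventual"))         # 1.5
print(wcus(2.2, transactional=True))  # 6   -> 3 KiB-units x 2
```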
Standard-IA Table Class
The Standard-Infrequent Access (Standard-IA) table class is a per-table storage option that lowers storage rates while raising read/write request rates, intended for tables whose data is rarely accessed but must remain queryable with single-digit-millisecond latency. It is selected at table creation or via a table-class modification, and it is the right choice for tables holding cold but still-online operational data.
Related: Provisioned Capacity Mode, On-Demand Capacity Mode, Point-in-Time Recovery
Source: DynamoDB - Table classes
Time to Live (TTL)
TTL lets you mark a numeric attribute (Unix epoch seconds) as the expiration time of an item, and DynamoDB asynchronously deletes expired items at no extra request cost. TTL deletions also flow into DynamoDB Streams as REMOVE events, which makes TTL a building block for session stores, ephemeral chat history, and archival pipelines that move expired items to S3.
Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup
Source: DynamoDB - Time to Live
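A minimal sketch against a hypothetical sessions table:

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Mark an epoch-seconds attribute as the TTL attribute (names hypothetical).
dynamodb.update_time_to_live(
    TableName="sessions",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)
# Each session expires roughly 24 hours after it is written.
dynamodb.put_item(
    TableName="sessions",
    Item={
        "session_id": {"S": "abc-123"},
        "expires_at": {"N": str(int(time.time()) + 86_400)},
    },
)
```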
DynamoDB Streams
DynamoDB Streams is an ordered change-data-capture stream of item-level INSERT, MODIFY, and REMOVE events for a DynamoDB table, with 24-hour retention. It is the standard input for AWS Lambda triggers, for keeping a derived index in sync, and for fan-out integrations with services such as Amazon Kinesis Data Streams. The separate Kinesis Data Streams for DynamoDB option lets the same change feed land directly in Kinesis for longer retention.
Related: Global Tables, Time to Live, Transactions, DynamoDB Zero-ETL Integration with OpenSearch
Source: DynamoDB Streams
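A sketch of a stream-triggered Lambda handler; the event shape is the documented DynamoDB stream record format, while archive and index are hypothetical downstream steps:

```python
def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] == "REMOVE":
            # TTL deletions carry this service principal in userIdentity.
            who = record.get("userIdentity", {}).get("principalId")
            expired = who == "dynamodb.amazonaws.com"
            archive(record["dynamodb"].get("OldImage", {}), expired)
        elif record["eventName"] in ("INSERT", "MODIFY"):
            index(record["dynamodb"]["NewImage"])  # keep a derived index in sync

def archive(image, expired): ...   # hypothetical: move expired items to S3
def index(image): ...              # hypothetical: update a search index
```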
Global Tables (DynamoDB)
A Global Table is a multi-Region, multi-active replica set of a DynamoDB table where every replica accepts writes and conflicts are resolved with last-writer-wins. It targets active-active deployments with single-digit-millisecond read latency in each Region and an automatic RPO measured in seconds. Global Tables are independent of PITR and AWS Backup, which apply per-replica.
Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup, Global Database (Aurora)
Source: DynamoDB - Global Tables
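Adding a replica Region to a hypothetical existing table (using the 2019.11.21 global tables version) is a single update_table call:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

dynamodb.update_table(
    TableName="shop",
    ReplicaUpdates=[{"Create": {"RegionName": "us-east-1"}}],
)
```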
Point-in-Time Recovery (PITR)
Point-in-Time Recovery is a per-table backup feature for DynamoDB that lets you restore a table to any second in the trailing recovery window (up to 35 days). It is independent of on-demand backups and is the recommended baseline for production tables. PITR continues to operate transparently across capacity-mode changes, table-class changes, and the addition or removal of GSIs.
Related: AWS Backup, Global Tables, Time to Live
Source: DynamoDB - Point-in-Time Recovery
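A minimal sketch: enable PITR, then restore to a chosen second as a new table (names and timestamp hypothetical):

```python
import boto3
from datetime import datetime, timezone

dynamodb = boto3.client("dynamodb")

dynamodb.update_continuous_backups(
    TableName="shop",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)
# Restores always create a NEW table; the source table is untouched.
dynamodb.restore_table_to_point_in_time(
    SourceTableName="shop",
    TargetTableName="shop-before-incident",
    RestoreDateTime=datetime(2025, 3, 1, 14, 30, 0, tzinfo=timezone.utc),
)
```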
Transactions (DynamoDB)
DynamoDB Transactions (TransactWriteItems / TransactGetItems) group up to 100 actions across one or many tables in the same AWS account and Region into a single ACID operation, with an aggregate item size limit of 4 MB per transaction. Transactional requests consume twice the capacity of equivalent non-transactional requests, in exchange for atomicity, isolation, and serializable cross-item updates. Transactions are the standard primitive for inventory, payment, and uniqueness-constraint patterns on DynamoDB; ACID guarantees apply only within the Region where the call was made, not across Global Tables replicas.
Related: Write Capacity Unit, Read Capacity Unit, Single-Table Design, DynamoDB Streams
Source: DynamoDB - Transactions
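A sketch of the classic inventory pattern on the hypothetical shop table: decrement stock only if available, and record the order atomically:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Both actions succeed or neither does; the condition prevents overselling.
dynamodb.transact_write_items(
    TransactItems=[
        {
            "Update": {
                "TableName": "shop",
                "Key": {"PK": {"S": "SKU#991"}, "SK": {"S": "STOCK"}},
                "UpdateExpression": "SET qty = qty - :n",
                "ConditionExpression": "qty >= :n",
                "ExpressionAttributeValues": {":n": {"N": "1"}},
            }
        },
        {
            "Put": {
                "TableName": "shop",
                "Item": {"PK": {"S": "CUSTOMER#42"}, "SK": {"S": "ORDER#2025-03-01"}},
            }
        },
    ]
)
```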
F. Purpose-built Databases
Amazon DocumentDB (with MongoDB compatibility)
Amazon DocumentDB is a managed document database with MongoDB-compatible APIs, designed for JSON workloads that benefit from a flexible schema. Its compute-storage separation is similar to Aurora: six storage copies across three AZs, and read replicas that share the same underlying volume. DocumentDB supports native vector search, which makes it a candidate for RAG systems whose operational data is already in document form.
Related: Amazon Neptune, Aurora Storage Layer, Vector Search, Amazon Keyspaces
Source: Amazon DocumentDB
Amazon Neptune
Amazon Neptune is a managed graph database engine that supports the property-graph model (queryable with Apache Gremlin and openCypher) and the RDF model (queryable with SPARQL). It targets workloads such as fraud detection, knowledge graphs, identity graphs, and recommendation engines where the structure of relationships is the dominant query pattern. Neptune complements Neptune Analytics, which loads a graph into memory for fast multi-hop analytics.
Related: Amazon Neptune Analytics, Graph Query Languages, Amazon DocumentDB
Source: Amazon Neptune
Amazon Neptune Analytics
Amazon Neptune Analytics is a separate analytics engine for graph workloads that loads a graph into memory for fast multi-hop analytics, path-finding, centrality, and similarity algorithms. It complements Amazon Neptune Database, which is the OLTP-style graph engine, and it supports vector search on graph node embeddings.
Related: Amazon Neptune, Graph Query Languages, Vector Search
Source: Amazon Neptune Analytics
Graph Query Languages (Gremlin / openCypher / SPARQL)
The three graph query languages supported on Amazon Neptune are Apache Gremlin (traversal-style, for property graphs), openCypher (declarative, for property graphs, originally from Neo4j), and SPARQL (declarative, for RDF graphs). Choice of language is driven by the data model and existing developer skills, not by Neptune itself; a Neptune cluster can hold both property graph and RDF data, but the two are stored in separate namespaces and an individual query targets only one model.
Related: Amazon Neptune, Amazon Neptune Analytics
Source: Amazon Neptune - Overview
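To make the contrast concrete, here is the same two-hop "friends of friends" question in all three languages; the labels, edge names, and IRIs are hypothetical schema choices:

```python
# Traversal style, property graph.
GREMLIN = (
    "g.V().has('User','name','alice')"
    ".out('FOLLOWS').out('FOLLOWS').dedup().values('name')"
)

# Declarative pattern matching, property graph.
OPENCYPHER = """
MATCH (:User {name: 'alice'})-[:FOLLOWS]->()-[:FOLLOWS]->(fof:User)
RETURN DISTINCT fof.name
"""

# Declarative triple patterns, RDF graph.
SPARQL = """
PREFIX ex: <http://example.org/>
SELECT DISTINCT ?name WHERE {
  ?alice ex:name "alice" .
  ?alice ex:follows ?friend .
  ?friend ex:follows ?fof .
  ?fof ex:name ?name .
}
"""
```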
Amazon Timestream for LiveAnalytics
Amazon Timestream for LiveAnalytics (formerly Amazon Timestream) is a serverless time-series database optimised for IoT and operational telemetry, with a tiered storage model (memory store for recent data, magnetic store for historical data) and a SQL-like query language. It scales to trillions of events per day without provisioning capacity, and it includes built-in interpolation and gap-filling functions for time-series queries.
Related: Amazon Timestream for InfluxDB, Amazon MemoryDB, Amazon ElastiCache for Redis OSS
Source: Amazon Timestream
Amazon Timestream for InfluxDB
Amazon Timestream for InfluxDB is a managed deployment of the open-source InfluxDB engine, suitable for teams already standardised on the InfluxQL or Flux query languages and on Telegraf for ingestion. It complements LiveAnalytics by offering a familiar InfluxDB API and ecosystem on managed AWS infrastructure.
Related: Amazon Timestream for LiveAnalytics, Amazon MemoryDB, AWS Backup
Source: Amazon Timestream for InfluxDB
Amazon ElastiCache for Redis OSS
Amazon ElastiCache for Redis OSS is a managed deployment of Redis OSS suitable for caching, leaderboards, queues, and pub/sub workloads. Clusters can be configured with Multi-AZ replication and automatic failover, and snapshots provide a point-in-time persistence path. ElastiCache is positioned as the cache in front of a separate primary; MemoryDB is the choice when in-memory state is the primary.
Related: Amazon ElastiCache for Valkey, Amazon MemoryDB, RDS Proxy
Source: Amazon ElastiCache for Redis OSS
Amazon ElastiCache for Valkey
Amazon ElastiCache for Valkey is a managed deployment of Valkey, the Linux Foundation fork of Redis OSS created after the licence change in Redis 7.4. AWS positions it as a fully API-compatible alternative for new workloads and existing Redis OSS clusters, with the same scaling, Multi-AZ replication, and failover behaviour as ElastiCache for Redis OSS.
Related: Amazon ElastiCache for Redis OSS, Amazon MemoryDB
Source: Amazon ElastiCache
Amazon MemoryDB
Amazon MemoryDB is a Redis OSS / Valkey-compatible, durable in-memory database with multi-AZ transaction log durability and microsecond reads / single-digit-millisecond writes. Unlike ElastiCache, it is positioned as the primary database for the workload rather than as a cache in front of a separate primary. MemoryDB is the in-memory companion to DynamoDB for workloads that need extremely low latency without giving up durability.
Related: Amazon ElastiCache for Redis OSS, Amazon ElastiCache for Valkey, Vector Search, On-Demand Capacity Mode
Source: Amazon MemoryDB
Amazon Keyspaces (for Apache Cassandra)
Amazon Keyspaces is a managed, serverless, Apache Cassandra-compatible wide-column database that uses the CQL protocol. It targets applications already built on Cassandra that want to remove the operational burden of running rings, with single-digit-millisecond p99 latency at scale. Keyspaces supports both on-demand and provisioned capacity modes and provides PITR similarly to DynamoDB.
Related: Amazon DocumentDB, On-Demand Capacity Mode, Point-in-Time Recovery
Source: Amazon Keyspaces
Amazon QLDB
Amazon QLDB (Quantum Ledger Database) was a managed ledger database with a cryptographically verifiable change history. It reached end of support on 2025-07-31, and AWS published migration playbooks to Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Managed Blockchain for affected workloads. New ledger-style requirements on AWS should now be modelled as immutable, append-only patterns on those services rather than on QLDB.
Related: AWS Database Migration Service, AWS Backup, Database Activity Streams
Source: Amazon QLDB (End-of-Support 2025-07-31)
Vector Search (pgvector / Aurora / DocumentDB)
Vector Search on AWS is the family of features that store high-dimensional embeddings and answer approximate-nearest-neighbour queries: pgvector on Amazon RDS for PostgreSQL and Aurora PostgreSQL, native vector search on Amazon DocumentDB and Amazon MemoryDB, and vector search on Amazon Neptune Analytics. It is the foundation of RAG systems built on operational databases rather than on a separate vector store, and it is one of the patterns most commonly added to existing AWS database stacks for generative AI workloads.
Related: Amazon DocumentDB, Amazon MemoryDB, Amazon Neptune Analytics, Trusted Language Extensions for PostgreSQL
Source: Aurora PostgreSQL - Vector store with pgvector
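A minimal pgvector sketch over a hypothetical docs table (psycopg2, the DSN, and a 3-dimensional vector are assumptions for brevity; real embeddings have hundreds of dimensions):

```python
import psycopg2  # assumption: any PostgreSQL driver against the cluster endpoint

conn = psycopg2.connect(
    "host=shop-aurora-prod.cluster-abc.eu-west-1.rds.amazonaws.com "
    "dbname=app user=dbadmin password=..."  # hypothetical DSN
)
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(3)  -- 3 dims for brevity; match your embedding model
    );
""")
# Nearest neighbours by cosine distance (<=>); the query vector is a literal
# here, but in practice it comes from an embedding model.
cur.execute(
    "SELECT body FROM docs ORDER BY embedding <=> %s::vector LIMIT 5;",
    ("[0.1, 0.2, 0.3]",),
)
print(cur.fetchall())
```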
G. Migration and Integration
AWS Database Migration Service (DMS)
AWS DMS is a managed service for migrating data between an on-premises or AWS source database and an AWS target, with support for full-load, change-data-capture (CDC), and full-load-plus-CDC modes. It works across heterogeneous engine pairs (for example, Oracle to Aurora PostgreSQL) when combined with the Schema Conversion tool, and it is the default tool for online migrations where the source must stay live during cutover.
Related: AWS Schema Conversion Tool, Blue/Green Deployments, Amazon RDS Data API
Source: AWS Database Migration Service
AWS Schema Conversion Tool (SCT) / DMS Schema Conversion
The AWS Schema Conversion Tool (SCT) and the newer DMS Schema Conversion convert source schemas, stored procedures, and SQL dialect features to the target engine, flagging objects that need manual rework. They are the typical first step of a heterogeneous migration with DMS - for example, Oracle PL/SQL or SQL Server T-SQL to Aurora PostgreSQL.
Related: AWS Database Migration Service, Babelfish for Aurora PostgreSQL, RDS Custom
Source: AWS Schema Conversion Tool
Amazon RDS Data API
The RDS Data API exposes Aurora clusters (Aurora Serverless v2, provisioned Aurora PostgreSQL and Aurora MySQL, and the legacy Aurora Serverless v1) over HTTPS so that AWS Lambda and other serverless callers can run SQL without managing connections or VPC networking. It removes the connection-pool problem at the cost of slightly higher per-call overhead than a persistent connection, and it integrates with AWS Secrets Manager for credentials.
Related: Aurora Serverless v2, RDS Proxy, Aurora Zero-ETL Integration with Redshift
Source: Amazon RDS Data API
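A minimal sketch with hypothetical ARNs; note there is no connection object anywhere:

```python
import boto3

rds_data = boto3.client("rds-data")

# One HTTPS call per statement; credentials come from Secrets Manager.
resp = rds_data.execute_statement(
    resourceArn="arn:aws:rds:eu-west-1:123456789012:cluster:events-aurora-sv2",
    secretArn="arn:aws:secretsmanager:eu-west-1:123456789012:secret:events-db",
    database="app",
    sql="SELECT id, status FROM orders WHERE customer_id = :cid",
    parameters=[{"name": "cid", "value": {"longValue": 42}}],
)
print(resp["records"])
```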
Aurora Zero-ETL Integration with Amazon Redshift
An Aurora Zero-ETL Integration replicates an Aurora MySQL or Aurora PostgreSQL DB cluster into Amazon Redshift continuously and within seconds, without customer-managed ETL pipelines. Once landed, the data is queryable in Redshift alongside other warehouse data, which makes it the standard path for "operational data, but in Redshift" use cases. Amazon RDS for MySQL, RDS for PostgreSQL, and RDS for Oracle are covered by the analogous Amazon RDS zero-ETL integration with Amazon Redshift, and Amazon DynamoDB has its own zero-ETL integration with Amazon Redshift, so this pattern now spans the bulk of the AWS operational database surface.
Related: DynamoDB Zero-ETL Integration with OpenSearch, Lake Formation Integration, AWS Database Migration Service
Source: Aurora Zero-ETL Integrations
DynamoDB Zero-ETL Integration with Amazon OpenSearch Service
Zero-ETL integration between Amazon DynamoDB and Amazon OpenSearch Service automatically replicates DynamoDB items into an OpenSearch index for full-text search, vector search, and analytics, without a customer-built pipeline on DynamoDB Streams plus Lambda. It is the recommended path for search use cases on DynamoDB-backed data, especially when full-text and vector queries are needed alongside the existing DynamoDB key-value access pattern.
Related: DynamoDB Streams, Aurora Zero-ETL Integration with Redshift, Vector Search
Source: DynamoDB - Zero-ETL with OpenSearch
Database Activity Streams (Aurora)
Database Activity Streams is a feature for Amazon Aurora (PostgreSQL and MySQL) and Amazon RDS for Oracle that streams a near-real-time, encrypted, tamper-resistant log of database activity to Amazon Kinesis Data Streams. It is the basis for external SIEM and audit pipelines, and it complements Performance Insights, which is for performance troubleshooting rather than for compliance audit.
Related: AWS Backup, Performance Insights, AWS Database Migration Service
Source: Database Activity Streams
Lake Formation Integration (DynamoDB / Aurora export to S3)
AWS Lake Formation integrates with operational databases through point-in-time exports to Amazon S3 from DynamoDB and from Aurora, which can then be governed by Lake Formation permissions and queried with Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum. It is the recommended path for offline analytics on operational data without putting query load on the source database.
Related: Aurora Zero-ETL Integration with Redshift, DynamoDB Zero-ETL Integration with OpenSearch, AWS Backup
Source: DynamoDB - Export to Amazon S3
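A sketch of the DynamoDB side (ARN and bucket hypothetical; PITR must be enabled on the source table):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Point-in-time export to S3; the files can then be governed by Lake Formation
# and queried with Athena without touching the live table.
dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:eu-west-1:123456789012:table/shop",
    S3Bucket="shop-analytics-exports",
    ExportFormat="DYNAMODB_JSON",  # or ION
)
```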
AWS Backup for RDS, Aurora, and DynamoDB
AWS Backup is a centralised, policy-driven backup service that manages backups for Amazon RDS, Amazon Aurora, Amazon DynamoDB (including continuous backups for PITR), and other AWS resources. It provides cross-Region and cross-account copy, immutable Vault Lock, and lifecycle to cold storage in one place. AWS Backup is the recommended control plane when the same retention and compliance policies must span multiple AWS database services.
Related: Point-in-Time Recovery, Global Tables, Global Database (Aurora)
Source: AWS Backup
Related Articles
- Amazon DynamoDB Single-Table Design Guide
- Amazon DynamoDB Key Design, GSI, and LSI Dictionary
- Comparison of AWS Databases Using the Quorum Model
- AWS History and Timeline - Amazon RDS
- AI and Machine Learning Glossary for AWS
- AI Agent Engineering Glossary
References
- AWS - Cloud Databases
- Amazon RDS User Guide
- Amazon Aurora User Guide
- Amazon DynamoDB Developer Guide
- Amazon DocumentDB Developer Guide
- Amazon Neptune User Guide
- Amazon Neptune Analytics User Guide
- Amazon Timestream Developer Guide
- Amazon ElastiCache User Guide
- Amazon MemoryDB Developer Guide
- Amazon Keyspaces Developer Guide
- Amazon QLDB Developer Guide (End-of-Support 2025-07-31)
- AWS Database Migration Service User Guide
- AWS Schema Conversion Tool User Guide
- AWS Backup Developer Guide
Summary
This glossary collects the essential terms an engineer or architect repeatedly encounters when choosing or operating an AWS database, across relational engines (Amazon RDS, Amazon Aurora), key-value (Amazon DynamoDB), document (Amazon DocumentDB), graph (Amazon Neptune, Amazon Neptune Analytics), time-series (Amazon Timestream), in-memory (Amazon ElastiCache, Amazon MemoryDB), wide-column (Amazon Keyspaces), and ledger (Amazon QLDB) services, plus the migration and integration layer that sits beside them.
Each definition is short enough to read in one breath, each Related line maps the term to its neighbours, and each Source link goes to the canonical AWS documentation. I will continue to update this glossary as the AWS database surface evolves - new engines, new Zero-ETL integrations, and new compatibility layers are exactly the kind of vocabulary worth keeping current in one place.