AWS Database Glossary - RDS, Aurora, DynamoDB, DocumentDB, and Neptune Explained


This glossary collects 60 core terms an engineer or architect repeatedly meets when choosing or operating an AWS database - across relational engines (Amazon RDS, Amazon Aurora), key-value (Amazon DynamoDB), document (Amazon DocumentDB), graph (Amazon Neptune), time-series (Amazon Timestream), in-memory (Amazon ElastiCache, Amazon MemoryDB), wide-column (Amazon Keyspaces), ledger (Amazon QLDB), and the migration and integration layer that sits beside them.

It is a companion to my earlier AI and Machine Learning Glossary for AWS and AI Agent Engineering Glossary. Each entry follows the same shape: a 2-4 sentence definition, a Related line cross-linking to other terms on this page, and a Source line linking to the canonical AWS documentation.

How to Use This Glossary

Use the A-Z Term Index below to jump directly to a term. The seven category sections group terms by the layer of the AWS database stack they belong to: RDS / Aurora core concepts, RDS / Aurora capacity and operations, RDS / Aurora compatibility and extensions, DynamoDB data modeling, DynamoDB capacity and replication, purpose-built databases (DocumentDB, Neptune, Timestream, ElastiCache, MemoryDB, Keyspaces, QLDB, vector search), and the migration and integration layer.

For deep dives on specific patterns, see Amazon DynamoDB Single-Table Design Guide, Amazon DynamoDB Key Design, GSI, and LSI Dictionary, Comparison of AWS Databases Using the Quorum Model, and AWS History and Timeline - Amazon RDS. This glossary intentionally stays at the concept level so the page does not rot as features evolve; follow the Source link on each term for the current scope, limits, and pricing in the AWS documentation.

A-Z Term Index

Adaptive Capacity · Amazon DocumentDB · Amazon ElastiCache for Redis OSS · Amazon ElastiCache for Valkey · Amazon Keyspaces · Amazon MemoryDB · Amazon Neptune · Amazon Neptune Analytics · Amazon QLDB · Amazon RDS Data API · Amazon Timestream for InfluxDB · Amazon Timestream for LiveAnalytics · Aurora Cluster Endpoint · Aurora Custom Endpoint · Aurora DSQL · Aurora I/O-Optimized · Aurora Optimized Reads · Aurora Optimized Writes · Aurora Reader Endpoint · Aurora Serverless v1 · Aurora Serverless v2 · Aurora Storage Layer · Aurora Zero-ETL Integration with Amazon Redshift · AWS Backup for RDS, Aurora, and DynamoDB · AWS Database Migration Service (DMS) · AWS Schema Conversion Tool / DMS Schema Conversion · Babelfish for Aurora PostgreSQL · Backtrack (Aurora MySQL) · Blue/Green Deployments · Composite Primary Key · Database Activity Streams (Aurora) · Database Cloning (Aurora) · DynamoDB Streams · DynamoDB Zero-ETL Integration with Amazon OpenSearch Service · Global Database (Aurora) · Global Secondary Index (GSI) · Global Tables (DynamoDB) · Graph Query Languages (Gremlin / openCypher / SPARQL) · Hot Partition · Lake Formation Integration · Local Secondary Index (LSI) · Multi-AZ Deployment · On-Demand Capacity Mode · Partition Key · Performance Insights · Point-in-Time Recovery (PITR) · Provisioned Capacity Mode · Provisioned Throughput (RDS instance class) · Quorum-based Replication (Aurora) · RDS Custom · RDS Proxy · Read Capacity Unit (RCU) · Read Replica · Single-Table Design · Sort Key · Standard-IA Table Class · Time to Live (TTL) · Transactions (DynamoDB) · Trusted Language Extensions (TLE) for PostgreSQL · Vector Search (pgvector / Aurora / DocumentDB) · Write Capacity Unit (WCU)

AWS Database Service Map

The eight categories below show how the terms in this glossary map to AWS database families: Relational (RDS, Aurora), Key-Value (DynamoDB), Document (DocumentDB), Graph (Neptune, Neptune Analytics), Time-series (Timestream LiveAnalytics, Timestream for InfluxDB), In-Memory (ElastiCache for Redis OSS, ElastiCache for Valkey, MemoryDB), Wide-Column (Keyspaces), and Ledger (QLDB, with end-of-support announced for 2025-07-31). DMS, the Schema Conversion Tool, AWS Backup, and the Zero-ETL integrations cross-cut every category.

Relational (SQL OLTP): Amazon RDS · Amazon Aurora · Aurora Serverless v2 · Babelfish · RDS Proxy
Key-Value (NoSQL KV): Amazon DynamoDB · Global Tables · PITR · DynamoDB Streams · Standard-IA
Document (JSON): Amazon DocumentDB · Vector Search · MongoDB API
Graph (Property / RDF): Amazon Neptune · Neptune Analytics · Gremlin · openCypher · SPARQL
Time-series (IoT / Metrics): Amazon Timestream LiveAnalytics · Amazon Timestream for InfluxDB
In-Memory (Cache / Primary): ElastiCache for Redis OSS · ElastiCache for Valkey · Amazon MemoryDB
Wide-Column (Cassandra): Amazon Keyspaces · CQL Protocol
Ledger (EoS 2025-07-31): Amazon QLDB · Verifiable History
Cross-cutting Layer (applies across all categories)
Migration: AWS DMS (full-load + CDC)  |  AWS Schema Conversion Tool / DMS Schema Conversion  |  Blue/Green Deployments
Backup: AWS Backup (RDS, Aurora, DynamoDB)  |  PITR  |  Global Tables / Global Database  |  Vault Lock
Zero-ETL Integrations: Aurora → Amazon Redshift  |  DynamoDB → Amazon OpenSearch Service  |  Lake Formation export to S3 (Parquet)
Amazon QLDB end-of-support: 2025-07-31. Migrate to immutable patterns on Amazon Aurora PostgreSQL, Amazon DynamoDB, or Amazon Managed Blockchain.

A. RDS / Aurora - Core Concepts

Multi-AZ Deployment

A Multi-AZ deployment for Amazon RDS provisions a primary DB instance and a synchronously replicated standby (or two readable standbys in Multi-AZ DB cluster deployments) in different Availability Zones. RDS performs automatic failover to the standby when the primary fails, when the underlying host needs replacement, or during patching, typically within 60-120 seconds. Multi-AZ is the standard production posture for RDS engines that do not run on the Aurora storage layer.

Related: Read Replica, Aurora Storage Layer, Blue/Green Deployments, RDS Proxy

Source: Amazon RDS - High availability (Multi-AZ)

Read Replica

A Read Replica is an asynchronously replicated copy of an Amazon RDS source DB instance used to offload read traffic and to create cross-Region copies for disaster recovery. RDS Read Replicas can be promoted to standalone primary instances, while Aurora Replicas share the same underlying storage volume and therefore have near-zero replication lag. Read Replicas are independent of Multi-AZ standbys: a single deployment can use both at once for different reasons.

Related: Multi-AZ Deployment, Aurora Reader Endpoint, Global Database, Aurora Storage Layer

Source: Amazon RDS - Working with read replicas

Aurora Storage Layer

Aurora separates compute from storage and uses a distributed, log-structured Aurora storage layer that stores six copies of every data block across three Availability Zones. Writes are acknowledged when a 4-of-6 quorum confirms the redo log record, so the system tolerates the loss of an entire AZ plus one additional copy without losing data. A similar compute-storage separation underlies Amazon DocumentDB.

Related: Quorum-based Replication, Aurora Cluster Endpoint, Multi-AZ Deployment, Amazon DocumentDB

Source: Amazon Aurora - Storage and reliability

Quorum-based Replication (Aurora)

Aurora uses a 4/6 write quorum and a 3/6 read quorum across its six storage copies in three AZs, which means a write succeeds once four nodes acknowledge it and a read needs only three nodes to reconstruct the latest version. This is the design that lets Aurora survive the loss of a whole AZ for writes and the loss of an AZ plus one node for reads. For a deeper comparison across AWS databases, see Comparison of AWS Databases Using the Quorum Model.

Related: Aurora Storage Layer, Multi-AZ Deployment, Global Database

Source: Amazon Aurora - Storage and reliability
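The quorum arithmetic above can be sketched in a few lines. This is an illustrative model, not Aurora's implementation: the classic rule Vw + Vr > N guarantees that every read quorum overlaps the latest write quorum.

```python
# Illustrative quorum model (not Aurora's actual code): with N copies,
# a write quorum Vw and read quorum Vr must satisfy Vw + Vr > N so that
# any read quorum intersects the most recent write quorum.

def quorums_overlap(n: int, write_quorum: int, read_quorum: int) -> bool:
    """True if every read quorum must intersect every write quorum."""
    return write_quorum + read_quorum > n

def tolerates_az_loss_for_writes(n: int, write_quorum: int, copies_per_az: int) -> bool:
    """Writes survive losing one AZ if a write quorum still fits in the remaining copies."""
    return n - copies_per_az >= write_quorum

# Aurora's 6-copy / 3-AZ layout: 4/6 write quorum, 3/6 read quorum.
assert quorums_overlap(6, 4, 3)               # 4 + 3 > 6: reads always see the latest write
assert tolerates_az_loss_for_writes(6, 4, 2)  # losing one AZ removes 2 copies; 4 remain
```

Note that a 3/6 read quorum paired with a 3/6 write quorum would not overlap (3 + 3 is not greater than 6), which is why the write side needs four acknowledgements.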

Backtrack (Aurora MySQL)

Backtrack lets an Aurora MySQL DB cluster rewind its current state to a target time in the recent past without restoring from a snapshot. It is implemented on top of the Aurora storage log and is limited to Aurora MySQL with a configurable target backtrack window of up to 72 hours. Unlike PITR, backtrack does not create a new cluster - the current cluster is moved to the target time in place.

Related: Database Cloning, Point-in-Time Recovery, AWS Backup

Source: Aurora MySQL - Backtracking

Database Cloning (Aurora)

Aurora supports near-instant database clones using copy-on-write semantics on its shared storage layer. A clone starts as a logical pointer to the source volume and only consumes additional storage as either the clone or the source is modified, which makes it useful for production-fidelity test environments without doubling cost. The same copy-on-write cloning also underpins the green environment in Aurora Blue/Green Deployments.

Related: Backtrack, Blue/Green Deployments, Aurora Storage Layer

Source: Amazon Aurora - Cloning a volume

Aurora Cluster Endpoint

The cluster endpoint of an Aurora DB cluster always resolves to the current writer instance. Applications use it for read/write traffic, and during a failover Aurora reassigns the endpoint to the newly promoted writer with no client-side reconfiguration. There is exactly one cluster endpoint per Aurora cluster.

Related: Aurora Reader Endpoint, Aurora Custom Endpoint, RDS Proxy

Source: Amazon Aurora - Endpoints

Aurora Reader Endpoint

The reader endpoint of an Aurora DB cluster load-balances connections across all available Aurora Replicas in the cluster. It is the standard target for read-only workloads and automatically excludes the writer or any unhealthy replicas. Connection balancing happens at session establishment time, not per query, so long-lived connections can pin to one replica.

Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Read Replica, RDS Proxy

Source: Amazon Aurora - Endpoints

Aurora Custom Endpoint

A custom endpoint is a user-defined endpoint that load-balances across a chosen subset of instances in an Aurora DB cluster - for example, "analytics replicas only" or "instances of class db.r6g.4xlarge". It lets you route specific workloads to specific hardware without splitting the cluster; a custom endpoint's membership is managed as either a static list of instances or an exclusion list that admits every instance not named.

Related: Aurora Cluster Endpoint, Aurora Reader Endpoint, RDS Proxy

Source: Amazon Aurora - Endpoints

Global Database (Aurora)

An Aurora Global Database spans multiple AWS Regions: writes go to a primary cluster in one Region, and Aurora replicates them asynchronously at the storage layer to up to five secondary Regions with typical lag under one second. It is the recommended pattern for low-latency cross-Region reads and for disaster recovery with a target RPO measured in seconds. A secondary Region can be promoted to primary in a planned switchover or in an unplanned failover.

Related: Read Replica, Multi-AZ Deployment, Aurora Storage Layer, Global Tables (DynamoDB)

Source: Amazon Aurora - Global databases

B. RDS / Aurora - Capacity and Operations

Aurora Serverless v1

Aurora Serverless v1 is the original on-demand auto-scaling configuration for Aurora, which scales compute in discrete capacity units between defined minimum and maximum values and can pause to zero capacity during idle periods. Aurora Serverless v1 reached its end of life on March 31, 2025; clusters that had not been migrated to Aurora Serverless v2 by that date are upgraded by AWS during a scheduled maintenance window. v1 lacks several features that v2 supports (Global Database, RDS Proxy, parallel query), so v2 is the recommended target for all new workloads and remaining migrations.

Related: Aurora Serverless v2, Provisioned Throughput, Aurora Cluster Endpoint

Source: Amazon Aurora Serverless v1

Aurora Serverless v2

Aurora Serverless v2 scales compute capacity in fine-grained increments measured in Aurora Capacity Units (ACUs) without disconnecting clients. It supports all major Aurora features (Global Database, RDS Proxy, Performance Insights, parallel query) that v1 did not, and is the recommended serverless option for production workloads with variable load. v2 can also be combined with provisioned instances in the same cluster.

Related: Aurora Serverless v1, Aurora I/O-Optimized, Provisioned Throughput, Amazon RDS Data API

Source: Amazon Aurora Serverless v2

Aurora DSQL

Amazon Aurora DSQL is a serverless, distributed SQL database with PostgreSQL compatibility that offers active-active multi-Region read and write with strong consistency. It is architected as a separate service from the classic Aurora cluster (no provisioned instance, no failover concept) and scales reads and writes independently across Regions without application-level conflict handling. Aurora DSQL suits globally distributed OLTP workloads that need single-digit-millisecond regional latency and the ability to write in any Region, where the operational simplicity of "no instances and no failover" outweighs classic Aurora's broader feature set (extensions, custom parameters, IAM Database Authentication parity, and so on are still maturing).

Related: Aurora Serverless v2, Global Database (Aurora), Quorum-based Replication (Aurora)

Source: Amazon Aurora DSQL - User Guide

Aurora I/O-Optimized

Aurora I/O-Optimized is a cluster storage configuration in which I/O charges are removed and the per-hour instance and storage rates are higher in exchange. It is intended for workloads where I/O cost would otherwise dominate (typically more than 25 percent of total Aurora cost), and the configuration can be toggled per cluster.

Related: Aurora Optimized Reads, Aurora Optimized Writes, Aurora Storage Layer

Source: Amazon Aurora - Storage configurations
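The 25 percent rule of thumb from the entry above is a one-line cost check. The helper name and the example figures below are hypothetical; only the threshold comes from the entry.

```python
# Hypothetical helper for the rule of thumb above: Aurora I/O-Optimized tends
# to pay off when I/O charges exceed ~25% of total Aurora spend.

def io_optimized_candidate(instance_cost: float, storage_cost: float,
                           io_cost: float, threshold: float = 0.25) -> bool:
    """True when I/O is a large enough share of cost to evaluate I/O-Optimized."""
    total = instance_cost + storage_cost + io_cost
    return io_cost / total > threshold

# Hypothetical monthly figures in USD:
assert io_optimized_candidate(instance_cost=500.0, storage_cost=100.0, io_cost=400.0)      # 40% I/O
assert not io_optimized_candidate(instance_cost=800.0, storage_cost=150.0, io_cost=50.0)   # 5% I/O
```

The actual decision also depends on the higher I/O-Optimized instance and storage rates, so treat this as a first filter, not a pricing model.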

Aurora Optimized Reads

Aurora Optimized Reads is an instance-class feature for Aurora PostgreSQL and Aurora MySQL on supported Graviton and Intel NVMe-equipped instances (such as db.r6gd and db.r6id) that uses the local NVMe SSD as a tier for temporary objects and as an extension of the buffer cache. It can materially improve query latency for read-heavy workloads with large working sets that do not fit in memory, and it is enabled simply by choosing an instance class that supports it.

Related: Aurora Optimized Writes, Aurora I/O-Optimized, Aurora Storage Layer

Source: Aurora PostgreSQL - Optimized Reads

Aurora Optimized Writes

Aurora Optimized Writes is a feature of Aurora MySQL on supported instance classes that improves write throughput by eliminating the InnoDB doublewrite buffer, relying instead on atomic 16 KiB page writes to Aurora storage. It is selected on cluster creation on supported Aurora MySQL 3.x instance classes (such as db.r6i and db.r7g); enabling it on an existing cluster typically requires a snapshot restore or a Blue/Green switchover to re-initialise the storage page format. The feature is the main reason write-heavy MySQL workloads see a step change in throughput on newer Aurora MySQL instance classes.

Related: Aurora Optimized Reads, Aurora I/O-Optimized, Aurora Storage Layer

Source: Aurora MySQL - Optimized Writes

Blue/Green Deployments

A Blue/Green Deployment in Amazon RDS and Amazon Aurora creates a separate green environment that mirrors the production blue one and stays synchronised with it; you apply the engine upgrades, schema changes, or parameter changes you want to test to the green side. Switchover swaps endpoints to the green environment, typically in under a minute, with safety checks for replication lag and active sessions. The pattern is the recommended way to perform major-version upgrades on RDS and Aurora.

Related: Database Cloning, Multi-AZ Deployment, AWS Database Migration Service

Source: Amazon RDS - Blue/Green Deployments

Provisioned Throughput (RDS instance class)

For Amazon RDS in its non-serverless mode, capacity is provisioned by choosing a specific DB instance class (db.m6g, db.r7g, db.x2iedn, etc.) that fixes CPU, memory, and EBS bandwidth. Vertical scaling requires a modify-instance operation with a short downtime, which is the structural reason teams move CPU-spiky workloads to Aurora Serverless v2 or split read traffic onto Read Replicas.

Related: Aurora Serverless v2, On-Demand Capacity Mode, Read Replica

Source: Amazon RDS - DB instance classes

RDS Proxy

Amazon RDS Proxy is a fully managed, highly available database proxy in front of RDS and Aurora that pools and shares database connections across application clients. It reduces connection overhead for serverless and microservice workloads, preserves application connections across DB failovers, and integrates with AWS IAM and AWS Secrets Manager for authentication. RDS Proxy is the recommended fronting layer for Lambda-based access to RDS and Aurora.

Related: Aurora Cluster Endpoint, Aurora Custom Endpoint, Multi-AZ Deployment, Amazon RDS Data API

Source: Amazon RDS Proxy

C. RDS / Aurora - Compatibility and Extensions

Babelfish for Aurora PostgreSQL

Babelfish for Aurora PostgreSQL is an optional capability that lets an Aurora PostgreSQL DB cluster understand the SQL Server T-SQL dialect and the TDS wire protocol on a separate port. Applications written for Microsoft SQL Server can connect with minimal code changes, which makes it a tool for incremental migration off SQL Server licences. Babelfish coexists with the native PostgreSQL endpoint on the same cluster.

Related: Trusted Language Extensions for PostgreSQL, AWS Schema Conversion Tool, AWS Database Migration Service

Source: Aurora PostgreSQL - Babelfish

Trusted Language Extensions (TLE) for PostgreSQL

Trusted Language Extensions is an open-source development kit that lets users build PostgreSQL extensions in JavaScript, Perl, or other supported languages and install them safely in managed environments such as Amazon RDS and Aurora PostgreSQL. It removes the need to wait for AWS to certify each new extension and is the official path for custom server-side logic on managed PostgreSQL.

Related: Babelfish for Aurora PostgreSQL, Aurora Storage Layer, RDS Custom, Vector Search

Source: Trusted Language Extensions for PostgreSQL

Performance Insights

Performance Insights is a database-aware monitoring view on top of Amazon RDS and Amazon Aurora that visualises database load (Active Sessions / DBLoad) broken down by wait event, SQL, host, or user. It complements Amazon CloudWatch and is the default first stop when diagnosing latency or saturation on a managed DB instance. Retention is configurable, with seven days included at no extra charge.

Related: RDS Proxy, Aurora Optimized Reads, Database Activity Streams

Source: Amazon RDS - Performance Insights

RDS Custom

Amazon RDS Custom is a managed RDS variant that grants the customer access to the underlying operating system and database engine for workloads (notably Oracle) that require third-party software, custom patches, or filesystem-level customisation. The customer takes on more of the operational responsibility in return, including OS patching and database engine patching outside of the standard RDS automation.

Related: Trusted Language Extensions for PostgreSQL, Provisioned Throughput, AWS Database Migration Service

Source: Amazon RDS Custom

D. DynamoDB - Data Modeling

Partition Key

The partition key (also called the hash key) is the mandatory first attribute of a DynamoDB primary key, and its value is hashed to decide which physical partition stores the item. A well-distributed partition key is the single most important determinant of DynamoDB throughput because traffic on a single partition is bounded. Item collections sharing a partition key are stored together.

Related: Sort Key, Composite Primary Key, Hot Partition, Adaptive Capacity

Source: Amazon DynamoDB - Primary key
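The placement rule can be made concrete with a toy hash function. DynamoDB's internal partition hash is not public, so this is purely illustrative of the principle that the partition key value alone decides where an item lives.

```python
import hashlib

# Illustrative only: DynamoDB's real partition hash is internal. This shows
# the principle that hashing the partition key value decides placement, and
# that items sharing a partition key always land on the same partition.

def partition_for(partition_key: str, num_partitions: int) -> int:
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Same key, same partition - which is why one hot key value cannot be
# spread across partitions by the service alone.
assert partition_for("USER#42", 8) == partition_for("USER#42", 8)
assert 0 <= partition_for("USER#42", 8) < 8
```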

Sort Key

The sort key (also called the range key) is the optional second attribute of a DynamoDB primary key and orders items within the same partition. Adding a sort key turns the primary key into a composite key and unlocks range queries with conditions such as begins_with, between, and >. The sort key is the foundation for both LSIs and for sort-key-prefix-based single-table designs.

Related: Partition Key, Composite Primary Key, Local Secondary Index, Single-Table Design

Source: Amazon DynamoDB - Primary key
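The range conditions above can be simulated over an in-memory item collection. Real queries use a KeyConditionExpression; this pure-Python sketch (with hypothetical PK/SK values) only shows the semantics of begins_with and between over lexicographically ordered sort keys.

```python
# Pure-Python sketch of sort-key conditions a DynamoDB Query supports.
# Item shapes and key values are hypothetical.

items = [  # one item collection (same partition key), ordered by sort key
    {"PK": "USER#42", "SK": "ORDER#2024-01-03"},
    {"PK": "USER#42", "SK": "ORDER#2024-02-14"},
    {"PK": "USER#42", "SK": "PROFILE"},
]

def begins_with(collection, prefix):
    return [i for i in collection if i["SK"].startswith(prefix)]

def between(collection, low, high):
    return [i for i in collection if low <= i["SK"] <= high]

assert len(begins_with(items, "ORDER#")) == 2                          # all orders
assert len(between(items, "ORDER#2024-01-01", "ORDER#2024-01-31")) == 1  # January only
```

Because sort keys compare lexicographically, ISO-8601 date strings sort chronologically, which is why they dominate real-world sort-key designs.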

Composite Primary Key

A composite primary key in DynamoDB combines a partition key and a sort key so that an item is uniquely identified by both attributes together. It is the structural foundation of the Single-Table Design pattern, in which one table holds many entity types differentiated by sort key prefix. The detailed key design playbook is collected in the Amazon DynamoDB Key Design, GSI, and LSI Dictionary.

Related: Partition Key, Sort Key, Single-Table Design, Global Secondary Index

Source: Amazon DynamoDB - Primary key

Global Secondary Index (GSI)

A Global Secondary Index is an index whose partition and sort keys can be different from the base table, supporting queries on alternate access patterns. A GSI has its own provisioned or on-demand capacity, is updated asynchronously from the base table, and therefore returns eventually consistent reads (strong reads are not supported on a GSI). GSIs can be added or removed online.

Related: Local Secondary Index, Composite Primary Key, Single-Table Design, Partition Key

Source: Amazon DynamoDB - Global Secondary Indexes

Local Secondary Index (LSI)

A Local Secondary Index shares the partition key with the base table but uses a different sort key, so it indexes within each partition. LSIs must be defined at table-creation time, share capacity with the base table, and support both eventually consistent and strongly consistent reads. LSIs are a useful tool when an entity needs multiple sort orders, but they cannot be added or removed after table creation.

Related: Global Secondary Index, Sort Key, Composite Primary Key

Source: Amazon DynamoDB - Local Secondary Indexes

Single-Table Design

Single-Table Design is a DynamoDB modeling approach in which one table stores multiple entity types (users, orders, comments, etc.) differentiated by partition key prefix and sort key prefix, with GSIs added for alternate access patterns. It minimises cross-entity round trips and is the canonical pattern for backend-for-frontend services on DynamoDB. The full playbook is in the Amazon DynamoDB Single-Table Design Guide.

Related: Composite Primary Key, Global Secondary Index, Hot Partition, Transactions

Source: Amazon DynamoDB - NoSQL design
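A minimal key scheme makes the pattern concrete. The prefixes and helper names below are hypothetical; the point is that entities sharing a partition key form one item collection that a single Query can fetch.

```python
# Hypothetical single-table key scheme: the entity type is encoded in the
# key prefixes, so a user and their orders share one item collection.

def user_key(user_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}

def order_key(user_id: str, order_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_id}"}

# Same PK means same item collection: Query(PK="USER#42") returns the
# profile and all orders in one round trip.
assert user_key("42")["PK"] == order_key("42", "A1")["PK"]
assert order_key("42", "A1")["SK"].startswith("ORDER#")
```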

Hot Partition

A hot partition occurs when a disproportionate share of read or write traffic targets a single partition key value, so the partition saturates while other partitions are idle. Adaptive Capacity is the platform-level mitigation; partition-key design (key salting, write sharding, time-bucket suffixes) is the application-level mitigation. Hot partitions are the most common root cause of "DynamoDB throttling" on otherwise under-utilised tables.

Related: Partition Key, Adaptive Capacity, Provisioned Capacity Mode, Single-Table Design

Source: DynamoDB - Designing partition keys to distribute workloads
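Write sharding, mentioned above as the application-level mitigation, looks like this in miniature. The key names and shard count are illustrative: writes append a deterministic suffix so traffic spreads over N partition key values, and reads fan out across all suffixes.

```python
import hashlib

# Write-sharding sketch (key names and NUM_SHARDS are illustrative): spread
# a hot partition key across N suffixed shards on write, fan out on read.

NUM_SHARDS = 8

def sharded_key(base_key: str, payload_id: str) -> str:
    """Deterministic shard suffix so the same payload always maps to one shard."""
    shard = int(hashlib.md5(payload_id.encode()).hexdigest(), 16) % NUM_SHARDS
    return f"{base_key}#{shard}"

def all_shards(base_key: str) -> list[str]:
    """Readers must query every shard and merge the results."""
    return [f"{base_key}#{s}" for s in range(NUM_SHARDS)]

keys = {sharded_key("EVENTS#2025-01-01", str(i)) for i in range(1000)}
assert 1 < len(keys) <= NUM_SHARDS              # writes no longer hit one partition key
assert len(all_shards("EVENTS#2025-01-01")) == NUM_SHARDS
```

The trade is explicit: write throughput multiplies by the shard count, while reads pay N queries plus a merge.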

Adaptive Capacity

Adaptive Capacity is a DynamoDB feature in which the service automatically reallocates throughput to absorb traffic spikes on specific partition keys without requiring manual rebalancing. It is enabled by default on all tables and effectively makes provisioned-throughput "hot key" problems much rarer than in the early years of the service. Adaptive Capacity does not remove the need for good key design - it raises the threshold at which bad key design becomes a production incident.

Related: Hot Partition, Provisioned Capacity Mode, On-Demand Capacity Mode

Source: DynamoDB - Designing partition keys

E. DynamoDB - Capacity, Streams, and Replication

Provisioned Capacity Mode

Provisioned capacity mode lets you specify the number of Read Capacity Units and Write Capacity Units a DynamoDB table can sustain per second. It is the right mode for predictable, steady workloads, and Auto Scaling adjusts the provisioned values within configured bounds. Reserved Capacity discounts are available for committed long-term provisioned capacity.

Related: On-Demand Capacity Mode, Read Capacity Unit, Write Capacity Unit, Adaptive Capacity

Source: DynamoDB - Read/write capacity modes

On-Demand Capacity Mode

On-demand capacity mode charges per request and instantly scales to up to double any previously observed peak (and beyond, after a brief warm-up following large step changes). It is the default and recommended mode for most workloads - unpredictable traffic, new applications without an established baseline, and spiky workloads where provisioning headroom would be wasteful. Switching a table between capacity modes is allowed, but subsequent switches are subject to a cooldown window - check the Source link for the current limit, which has tightened and loosened over the years.

Related: Provisioned Capacity Mode, Read Capacity Unit, Write Capacity Unit

Source: DynamoDB - On-demand capacity mode

Read Capacity Unit (RCU)

One Read Capacity Unit represents one strongly consistent read per second, two eventually consistent reads per second, or one-half of a transactional read per second, for items up to 4 KiB. RCU is the billing and quota unit for provisioned capacity reads and for the read side of on-demand metering. Reads of items larger than 4 KiB consume additional RCUs in 4 KiB increments.

Related: Write Capacity Unit, Provisioned Capacity Mode, On-Demand Capacity Mode, Transactions

Source: DynamoDB - Read/write capacity modes
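The RCU arithmetic above is easy to get wrong under load planning, so here it is as a small calculator following the rounding rules stated in the entry.

```python
import math

# RCU arithmetic from the entry above: size rounds up in 4 KiB increments,
# then the consistency level applies a factor.

def rcus_per_read(item_size_bytes: int, consistency: str = "strong") -> float:
    units = math.ceil(item_size_bytes / 4096)  # 4 KiB increments, rounded up
    factor = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return units * factor

assert rcus_per_read(4096, "strong") == 1
assert rcus_per_read(4097, "strong") == 2            # spills into a second 4 KiB increment
assert rcus_per_read(4096, "eventual") == 0.5        # eventual reads cost half
assert rcus_per_read(4096, "transactional") == 2     # transactional reads cost double
```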

Write Capacity Unit (WCU)

One Write Capacity Unit represents one standard write per second or one-half of a transactional write per second, for items up to 1 KiB. WCU is the billing and quota unit for provisioned capacity writes and the write side of on-demand metering. Writes of items larger than 1 KiB consume additional WCUs in 1 KiB increments, which is the structural reason single-table designs flatten data rather than nest it.

Related: Read Capacity Unit, Provisioned Capacity Mode, Transactions

Source: DynamoDB - Read/write capacity modes
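The write side follows the same shape with a 1 KiB increment, which is what makes large nested items expensive:

```python
import math

# WCU arithmetic from the entry above: size rounds up in 1 KiB increments;
# transactional writes cost double.

def wcus_per_write(item_size_bytes: int, transactional: bool = False) -> int:
    units = math.ceil(item_size_bytes / 1024)   # 1 KiB increments, rounded up
    return units * (2 if transactional else 1)

assert wcus_per_write(1024) == 1
assert wcus_per_write(1025) == 2                     # a second 1 KiB increment
assert wcus_per_write(1024, transactional=True) == 2
```

A 300-byte item and a 1,000-byte item cost the same one WCU, but a 5 KiB document costs five - the structural pressure toward flat items the entry mentions.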

Standard-IA Table Class

The Standard-Infrequent Access (Standard-IA) table class is a per-table storage option that lowers storage rates while raising read/write request rates, intended for tables whose data is rarely accessed but must remain queryable with single-digit-millisecond latency. It is selected at table creation or via a table-class modification, and it is the right choice for tables holding cold but still-online operational data.

Related: Provisioned Capacity Mode, On-Demand Capacity Mode, Point-in-Time Recovery

Source: DynamoDB - Table classes

Time to Live (TTL)

TTL lets you mark a numeric attribute (Unix epoch seconds) as the expiration time of an item, and DynamoDB asynchronously deletes expired items at no extra request cost. TTL deletions also flow into DynamoDB Streams as REMOVE events, which makes TTL a building block for session stores, ephemeral chat history, and archival pipelines that move expired items to S3.

Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup

Source: DynamoDB - Time to Live
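Setting the attribute is the whole client-side story. In this sketch the attribute name "expireAt" is hypothetical - it is whatever name you configured on the table - and the value must be Unix epoch seconds, per the entry above.

```python
import time

# TTL sketch: attach an epoch-seconds expiry attribute to an item.
# The attribute name ("expireAt") is whatever the table's TTL config names.

def with_ttl(item: dict, ttl_seconds: int, attr: str = "expireAt") -> dict:
    return {**item, attr: int(time.time()) + ttl_seconds}

session = with_ttl({"PK": "SESSION#abc", "data": "..."}, ttl_seconds=3600)
assert session["expireAt"] > int(time.time())   # one hour in the future
assert session["PK"] == "SESSION#abc"           # original attributes untouched
```

Deletion is asynchronous and can lag the expiry time, so readers that must never see expired items should also filter on the attribute.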

DynamoDB Streams

DynamoDB Streams is an ordered change-data-capture stream of item-level INSERT, MODIFY, and REMOVE events for a DynamoDB table, with 24-hour retention. It is the standard input for AWS Lambda triggers and for keeping derived indexes in sync. The separate Amazon Kinesis Data Streams for DynamoDB integration lets the same change feed land directly in Kinesis for longer retention and wider fan-out.

Related: Global Tables, Time to Live, Transactions, DynamoDB Zero-ETL Integration with OpenSearch

Source: DynamoDB Streams
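A Lambda-style consumer of this feed is a loop over records. The record shape below (eventName plus a dynamodb block with DynamoDB-JSON keys) follows the documented stream format; the key values are hypothetical.

```python
# Sketch of a Lambda-style handler over DynamoDB Streams records. The record
# shape follows the documented stream event format; key values are made up.

def handle_records(records: list[dict]) -> dict:
    """Count stream events by type, as a stand-in for real per-event logic."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in records:
        counts[record["eventName"]] += 1
    return counts

event = {"Records": [
    {"eventName": "INSERT", "dynamodb": {"Keys": {"PK": {"S": "USER#1"}}}},
    {"eventName": "REMOVE", "dynamodb": {"Keys": {"PK": {"S": "SESSION#9"}}}},  # e.g. a TTL expiry
]}
assert handle_records(event["Records"]) == {"INSERT": 1, "MODIFY": 0, "REMOVE": 1}
```

Note the REMOVE record: TTL deletions arrive through the same feed, which is how the archival pipelines mentioned under Time to Live are built.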

Global Tables (DynamoDB)

A Global Table is a multi-Region, multi-active replica set of a DynamoDB table where every replica accepts writes and conflicts are resolved with last-writer-wins. It targets active-active deployments with single-digit-millisecond read latency in each Region and an automatic RPO measured in seconds. Global Tables are independent of PITR and AWS Backup, which apply per-replica.

Related: DynamoDB Streams, Point-in-Time Recovery, AWS Backup, Global Database (Aurora)

Source: DynamoDB - Global Tables
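Last-writer-wins is simple enough to state as code. Global Tables resolve conflicts internally with a system timestamp; the sketch below uses a hypothetical application-level attribute purely to illustrate the rule.

```python
# Illustrative last-writer-wins resolution: given conflicting versions of the
# same item written in different Regions, the most recent timestamp wins.
# (Global Tables do this internally; "last_updated" here is hypothetical.)

def resolve(versions: list[dict], ts_attr: str = "last_updated") -> dict:
    return max(versions, key=lambda v: v[ts_attr])

winner = resolve([
    {"PK": "USER#42", "plan": "free", "last_updated": 1700000000},  # us-east-1 write
    {"PK": "USER#42", "plan": "pro",  "last_updated": 1700000005},  # eu-west-1 write, later
])
assert winner["plan"] == "pro"   # the concurrent "free" write is silently discarded
```

The discarded write is the design consequence to plan for: workloads that cannot tolerate losing a concurrent write need to route writes for a given key to one Region.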

Point-in-Time Recovery (PITR)

Point-in-Time Recovery is a per-table backup feature for DynamoDB that lets you restore a table to any second in the trailing recovery window (up to 35 days). It is independent of on-demand backups and is the recommended baseline for production tables. PITR continues to operate transparently across capacity-mode changes, table-class changes, and the addition or removal of GSIs.

Related: AWS Backup, Global Tables, Time to Live

Source: DynamoDB - Point-in-Time Recovery

Transactions (DynamoDB)

DynamoDB Transactions (TransactWriteItems / TransactGetItems) group up to 100 actions across one or many tables in the same AWS account and Region into a single ACID operation, with an aggregate item size limit of 4 MB per transaction. Transactional requests consume twice the capacity of equivalent non-transactional requests, in exchange for atomicity, isolation, and serializable cross-item updates. Transactions are the standard primitive for inventory, payment, and uniqueness-constraint patterns on DynamoDB; ACID guarantees apply only within the Region where the call was made, not across Global Tables replicas.

Related: Write Capacity Unit, Read Capacity Unit, Single-Table Design, DynamoDB Streams

Source: DynamoDB - Transactions
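The uniqueness-constraint pattern mentioned above can be sketched as a boto3-shaped TransactWriteItems request. The table name, keys, and attributes below are hypothetical; only the request structure and the 100-action limit come from the documented API.

```python
# Sketch of a boto3-style TransactWriteItems request implementing a
# uniqueness constraint: write the order and claim an idempotency key
# atomically. Table and attribute names are hypothetical.

transact_items = [
    {"Put": {
        "TableName": "app-table",
        "Item": {"PK": {"S": "ORDER#A1"}, "SK": {"S": "META"}, "total": {"N": "42"}},
    }},
    {"Put": {
        "TableName": "app-table",
        "Item": {"PK": {"S": "IDEMPOTENCY#req-777"}, "SK": {"S": "CLAIM"}},
        "ConditionExpression": "attribute_not_exists(PK)",  # reject a duplicate request
    }},
]

# Both writes commit or neither does; a transaction holds at most 100 actions.
assert len(transact_items) <= 100
assert all(len(action) == 1 for action in transact_items)  # one action type per element
```

If the condition on the claim item fails, the whole transaction is cancelled, so the order is never written twice - at double the WCU cost of plain writes.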

F. Purpose-built Databases

Amazon DocumentDB (with MongoDB compatibility)

Amazon DocumentDB is a managed document database with MongoDB-compatible APIs, designed for JSON workloads that benefit from a flexible schema. Its compute-storage separation is similar to Aurora: six storage copies across three AZs, and read replicas that share the same underlying volume. DocumentDB supports native vector search, which makes it a candidate for RAG systems whose operational data is already in document form.

Related: Amazon Neptune, Aurora Storage Layer, Vector Search, Amazon Keyspaces

Source: Amazon DocumentDB

Amazon Neptune

Amazon Neptune is a managed graph database engine that supports the property-graph model (queryable with Apache Gremlin and openCypher) and the RDF model (queryable with SPARQL). It targets workloads such as fraud detection, knowledge graphs, identity graphs, and recommendation engines where the structure of relationships is the dominant query pattern. Neptune complements Neptune Analytics, which loads a graph into memory for fast multi-hop analytics.

Related: Amazon Neptune Analytics, Graph Query Languages, Amazon DocumentDB

Source: Amazon Neptune

Amazon Neptune Analytics

Amazon Neptune Analytics is a separate analytics engine for graph workloads that loads a graph into memory for fast multi-hop analytics, path-finding, centrality, and similarity algorithms. It complements Amazon Neptune Database, which is the OLTP-style graph engine, and it supports vector search on graph node embeddings.

Related: Amazon Neptune, Graph Query Languages, Vector Search

Source: Amazon Neptune Analytics

Graph Query Languages (Gremlin / openCypher / SPARQL)

The three graph query languages supported on Amazon Neptune are Apache Gremlin (traversal-style, for property graphs), openCypher (declarative, for property graphs, originally from Neo4j), and SPARQL (declarative, for RDF graphs). Choice of language is driven by the data model and existing developer skills, not by Neptune itself; a Neptune cluster can hold both property graph and RDF data, but the two are stored in separate namespaces and an individual query targets only one model.

Related: Amazon Neptune, Amazon Neptune Analytics

Source: Amazon Neptune - Overview
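To make the stylistic differences concrete, here is the same question ("who did this user pay?") phrased in all three languages. The graph schema, labels, and URIs are hypothetical; only the language shapes are the point.

```python
# The same question in Neptune's three query languages. Vertex labels, edge
# names, and URIs below are hypothetical; the syntax contrast is the point.

gremlin = 'g.V("user-42").out("paid").values("name")'   # imperative traversal steps

opencypher = """
MATCH (u:User {id: 'user-42'})-[:PAID]->(m:Merchant)
RETURN m.name
"""                                                      # declarative pattern match

sparql = """
SELECT ?name WHERE {
  <urn:user:42> <urn:rel:paid> ?m .
  ?m <urn:prop:name> ?name .
}
"""                                                      # triple patterns over RDF

assert "out(" in gremlin
assert "MATCH" in opencypher
assert "SELECT" in sparql
```

Gremlin and openCypher both address the property-graph model, so they can coexist against the same data; SPARQL queries the RDF side only.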

Amazon Timestream for LiveAnalytics

Amazon Timestream for LiveAnalytics (formerly Amazon Timestream) is a serverless time-series database optimised for IoT and operational telemetry, with a tiered storage model (memory store for recent data, magnetic store for historical data) and a SQL-like query language. It scales to trillions of events per day without provisioning capacity, and it includes built-in interpolation and gap-filling functions for time-series queries.

Related: Amazon Timestream for InfluxDB, Amazon MemoryDB, Amazon ElastiCache for Redis OSS

Source: Amazon Timestream
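
As a sketch of what the built-in gap-filling looks like, the query below linearly interpolates a CPU metric over a regular timeline using LiveAnalytics' time-series functions. The database, table, measure, and dimension names (telemetry, cpu_metrics, cpu_utilization, hostname) are hypothetical; the string would normally be passed to the timestream-query client's query call.

```python
# A LiveAnalytics-style SQL query using CREATE_TIME_SERIES and
# INTERPOLATE_LINEAR to gap-fill a metric. All identifiers are
# illustrative placeholders, not real resources.
query = """
SELECT hostname,
       INTERPOLATE_LINEAR(
           CREATE_TIME_SERIES(time, measure_value::double),
           SEQUENCE(min(time), max(time), 1m)) AS cpu_filled
FROM "telemetry"."cpu_metrics"
WHERE measure_name = 'cpu_utilization'
  AND time > ago(1h)
GROUP BY hostname
"""
```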

Amazon Timestream for InfluxDB

Amazon Timestream for InfluxDB is a managed deployment of the open-source InfluxDB engine, suitable for teams already standardised on the InfluxQL or Flux query languages and on Telegraf for ingestion. It complements LiveAnalytics by offering a familiar InfluxDB API and ecosystem on managed AWS infrastructure.

Related: Amazon Timestream for LiveAnalytics, Amazon MemoryDB, AWS Backup

Source: Amazon Timestream for InfluxDB

Amazon ElastiCache for Redis OSS

Amazon ElastiCache for Redis OSS is a managed deployment of Redis OSS suitable for caching, leaderboards, queues, and pub/sub workloads. Clusters can be configured with Multi-AZ replication and automatic failover, and the service supports both classic cache use and persistent data store modes. ElastiCache is positioned as the cache in front of a separate primary; MemoryDB is the choice when in-memory state is the primary.

Related: Amazon ElastiCache for Valkey, Amazon MemoryDB, RDS Proxy

Source: Amazon ElastiCache for Redis OSS
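
The "cache in front of a separate primary" positioning usually means the cache-aside pattern. The sketch below shows its three steps with a plain dict standing in for the Redis OSS client and a placeholder function standing in for the primary database query; in production the dict would be a redis.Redis connection to the ElastiCache endpoint and writes would carry a TTL.

```python
# Cache-aside sketch. `cache` stands in for an ElastiCache endpoint and
# `load_from_primary` for a query against the primary database - both
# are illustrative placeholders.

cache = {}

def load_from_primary(key):
    # placeholder for a SELECT against the primary database
    return f"value-for-{key}"

def get_with_cache_aside(key):
    hit = cache.get(key)                 # 1. try the cache first
    if hit is not None:
        return hit
    value = load_from_primary(key)       # 2. on a miss, read the primary
    cache[key] = value                   # 3. populate the cache (TTL omitted)
    return value
```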

Amazon ElastiCache for Valkey

Amazon ElastiCache for Valkey is a managed deployment of Valkey, the Linux Foundation fork of Redis OSS created after the licence change in Redis 7.4. AWS positions it as a fully API-compatible alternative for new workloads and existing Redis OSS clusters, with the same scaling, Multi-AZ replication, and failover behaviour as ElastiCache for Redis OSS.

Related: Amazon ElastiCache for Redis OSS, Amazon MemoryDB

Source: Amazon ElastiCache

Amazon MemoryDB

Amazon MemoryDB is a Redis OSS / Valkey-compatible, durable in-memory database with Multi-AZ transaction log durability and microsecond reads / single-digit-millisecond writes. Unlike ElastiCache, it is positioned as the primary database for the workload rather than as a cache in front of a separate primary. MemoryDB is the in-memory companion to DynamoDB for workloads that need extremely low latency without giving up durability.

Related: Amazon ElastiCache for Redis OSS, Amazon ElastiCache for Valkey, Vector Search, On-Demand Capacity Mode

Source: Amazon MemoryDB

Amazon Keyspaces (for Apache Cassandra)

Amazon Keyspaces is a managed, serverless, Apache Cassandra-compatible wide-column database that uses the CQL protocol. It targets applications already built on Cassandra that want to remove the operational burden of running Cassandra rings, with single-digit-millisecond p99 latency at scale. Keyspaces supports both on-demand and provisioned capacity modes and provides point-in-time recovery (PITR) much as DynamoDB does.

Related: Amazon DocumentDB, On-Demand Capacity Mode, Point-in-Time Recovery

Source: Amazon Keyspaces
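
A minimal CQL table of the kind Keyspaces accepts is sketched below; the keyspace, table, and column names are hypothetical. The partition key (device_id) and clustering column (event_time) play the same roles as DynamoDB's partition key and sort key.

```python
# CQL DDL as a Python string; it would be executed through a Cassandra
# driver pointed at the Keyspaces endpoint. All identifiers are
# illustrative placeholders.
create_table_cql = """
CREATE TABLE IF NOT EXISTS iot.readings (
    device_id   text,
    event_time  timestamp,
    temperature double,
    PRIMARY KEY ((device_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
"""
```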

Amazon QLDB

Amazon QLDB (Quantum Ledger Database) was a managed ledger database with a cryptographically verifiable change history. It reached end of support on 2025-07-31, and AWS published migration playbooks to Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Managed Blockchain for affected workloads. New ledger-style requirements on AWS should now be modelled as immutable, append-only patterns on those services rather than on QLDB.

Related: AWS Database Migration Service, AWS Backup, Database Activity Streams

Source: Amazon QLDB (End-of-Support 2025-07-31)

Vector Search (pgvector / Aurora / DocumentDB)

Vector Search on AWS is the family of features that store high-dimensional embeddings and answer approximate-nearest-neighbour queries: pgvector on Amazon RDS for PostgreSQL and Aurora PostgreSQL, native vector search on Amazon DocumentDB and Amazon MemoryDB, and vector search on Amazon Neptune Analytics. It is the foundation of RAG systems built on operational databases rather than on a separate vector store, and it is one of the patterns most commonly added to existing AWS database stacks for generative AI workloads.

Related: Amazon DocumentDB, Amazon MemoryDB, Amazon Neptune Analytics, Trusted Language Extensions for PostgreSQL

Source: Aurora PostgreSQL - Vector store with pgvector

G. Migration and Integration

AWS Database Migration Service (DMS)

AWS DMS is a managed service for migrating data between an on-premises or AWS source database and an AWS target, with support for full-load, change-data-capture (CDC), and full-load-plus-CDC modes. It works across heterogeneous engine pairs (for example, Oracle to Aurora PostgreSQL) when combined with the AWS Schema Conversion Tool, and it is the default tool for online migrations where the source must stay live during cutover.

Related: AWS Schema Conversion Tool, Blue/Green Deployments, Amazon RDS Data API

Source: AWS Database Migration Service
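
The three modes show up as a single MigrationType field on the replication task. The sketch below builds the request shape for boto3's dms create_replication_task without sending it; the ARNs and the HR schema in the table mapping are placeholders.

```python
# Request shape for a full-load-plus-CDC DMS task. The ARNs and schema
# name are illustrative placeholders; the dict is built but not sent.
table_mappings = """{
  "rules": [{
    "rule-type": "selection",
    "rule-id": "1",
    "rule-name": "include-hr-schema",
    "object-locator": {"schema-name": "HR", "table-name": "%"},
    "rule-action": "include"
  }]
}"""

replication_task = {
    "ReplicationTaskIdentifier": "oracle-to-aurora-pg",
    "SourceEndpointArn": "arn:aws:dms:...:endpoint:SOURCE",    # placeholder
    "TargetEndpointArn": "arn:aws:dms:...:endpoint:TARGET",    # placeholder
    "ReplicationInstanceArn": "arn:aws:dms:...:rep:INSTANCE",  # placeholder
    "MigrationType": "full-load-and-cdc",  # or "full-load" / "cdc"
    "TableMappings": table_mappings,
}
# client = boto3.client("dms")
# client.create_replication_task(**replication_task)
```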

AWS Schema Conversion Tool (SCT) / DMS Schema Conversion

The AWS Schema Conversion Tool (SCT) and the newer DMS Schema Conversion convert source schemas, stored procedures, and SQL dialect features to the target engine, flagging objects that need manual rework. They are the typical first step of a heterogeneous migration with DMS - for example, Oracle PL/SQL or SQL Server T-SQL to Aurora PostgreSQL.

Related: AWS Database Migration Service, Babelfish for Aurora PostgreSQL, RDS Custom

Source: AWS Schema Conversion Tool

Amazon RDS Data API

The RDS Data API exposes supported Aurora DB clusters - including Aurora Serverless v2 and Aurora Serverless v1 - over HTTPS so that AWS Lambda and other serverless callers can run SQL without managing connections or VPC networking. It removes the connection-pool problem at the cost of slightly higher per-call overhead than a persistent connection, and it integrates with AWS Secrets Manager for credentials.

Related: Aurora Serverless v2, RDS Proxy, Aurora Zero-ETL Integration with Redshift

Source: Amazon RDS Data API
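
A single Data API call looks like the sketch below: one HTTPS request carrying the cluster ARN, a Secrets Manager ARN for credentials, and parameterised SQL. The ARNs, database name, and table are placeholders, and the request is built but not sent.

```python
# Request shape for an rds-data ExecuteStatement call via boto3.
# ARNs and identifiers are illustrative placeholders.
execute_statement_request = {
    "resourceArn": "arn:aws:rds:...:cluster:my-aurora-cluster",    # placeholder
    "secretArn": "arn:aws:secretsmanager:...:secret:my-db-creds",  # placeholder
    "database": "appdb",
    "sql": "SELECT id, name FROM users WHERE id = :id",
    "parameters": [
        {"name": "id", "value": {"longValue": 42}},
    ],
}
# client = boto3.client("rds-data")
# response = client.execute_statement(**execute_statement_request)
```

Because credentials come from the secretArn and the transport is plain HTTPS, the caller needs no driver, no connection pool, and no VPC attachment - which is exactly the trade described above.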

Aurora Zero-ETL Integration with Amazon Redshift

An Aurora Zero-ETL Integration replicates an Aurora MySQL or Aurora PostgreSQL DB cluster into Amazon Redshift continuously and within seconds, without customer-managed ETL pipelines. Once landed, the data is queryable in Redshift alongside other warehouse data, which makes it the standard path for "operational data, but in Redshift" use cases. Amazon RDS for MySQL, RDS for PostgreSQL, and RDS for Oracle are covered by the analogous Amazon RDS zero-ETL integration with Amazon Redshift, and Amazon DynamoDB has its own zero-ETL integration with Amazon Redshift, so this pattern now spans the bulk of the AWS operational database surface.

Related: DynamoDB Zero-ETL Integration with OpenSearch, Lake Formation Integration, AWS Database Migration Service

Source: Aurora Zero-ETL Integrations

DynamoDB Zero-ETL Integration with Amazon OpenSearch Service

Zero-ETL integration between Amazon DynamoDB and Amazon OpenSearch Service automatically replicates DynamoDB items into an OpenSearch index for full-text search, vector search, and analytics, without a customer-built pipeline on DynamoDB Streams plus Lambda. It is the recommended path for search use cases on DynamoDB-backed data, especially when full-text and vector queries are needed alongside the existing DynamoDB key-value access pattern.

Related: DynamoDB Streams, Aurora Zero-ETL Integration with Redshift, Vector Search

Source: DynamoDB - Zero-ETL with OpenSearch

Database Activity Streams (Aurora)

Database Activity Streams is a feature for Amazon Aurora (PostgreSQL and MySQL) and Amazon RDS for Oracle that streams a near-real-time, encrypted, tamper-resistant log of database activity to Amazon Kinesis Data Streams. It is the basis for external SIEM and audit pipelines, and it complements Performance Insights, which is for performance troubleshooting rather than for compliance audit.

Related: AWS Backup, Performance Insights, AWS Database Migration Service

Source: Database Activity Streams

Lake Formation Integration (DynamoDB / Aurora export to S3)

AWS Lake Formation integrates with operational databases through point-in-time exports to Amazon S3 from DynamoDB and from Aurora, which can then be governed by Lake Formation permissions and queried with Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum. It is the recommended path for offline analytics on operational data without putting query load on the source database.

Related: Aurora Zero-ETL Integration with Redshift, DynamoDB Zero-ETL Integration with OpenSearch, AWS Backup

Source: DynamoDB - Export to Amazon S3
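
On the DynamoDB side, the export is a single API call. The sketch below builds the request shape for boto3's dynamodb export_table_to_point_in_time without sending it; the table ARN, bucket, and prefix are placeholders.

```python
# Request shape for a DynamoDB point-in-time export to S3. The ARN,
# bucket, and prefix are illustrative placeholders; the dict is built
# but not sent.
export_request = {
    "TableArn": "arn:aws:dynamodb:...:table/Orders",  # placeholder
    "S3Bucket": "my-data-lake-bucket",                # placeholder
    "S3Prefix": "exports/orders/",
    "ExportFormat": "DYNAMODB_JSON",  # or "ION"
}
# client = boto3.client("dynamodb")
# client.export_table_to_point_in_time(**export_request)
```

The export reads from the continuous backup, not from the live table, which is why it adds no load to the source.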

AWS Backup for RDS, Aurora, and DynamoDB

AWS Backup is a centralised, policy-driven backup service that manages backups for Amazon RDS, Amazon Aurora, Amazon DynamoDB (including continuous backups for PITR), and other AWS resources. It provides cross-Region and cross-account copy, immutable Vault Lock, and lifecycle to cold storage in one place. AWS Backup is the recommended control plane when the same retention and compliance policies must span multiple AWS database services.

Related: Point-in-Time Recovery, Global Tables, Global Database (Aurora)

Source: AWS Backup
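
A shared retention policy across those services is expressed as one backup plan. The sketch below shows the plan structure that boto3's backup create_backup_plan expects; the plan name, vault name, and 35-day retention are illustrative choices, and the dict is built but not sent.

```python
# Structure of a BackupPlan for AWS Backup's CreateBackupPlan. Names
# and retention values are illustrative placeholders.
backup_plan = {
    "BackupPlanName": "db-fleet-daily",
    "Rules": [
        {
            "RuleName": "daily-35-day-retention",
            "TargetBackupVaultName": "Default",
            "ScheduleExpression": "cron(0 5 * * ? *)",  # daily at 05:00 UTC
            "Lifecycle": {"DeleteAfterDays": 35},
        }
    ],
}
# client = boto3.client("backup")
# client.create_backup_plan(BackupPlan=backup_plan)
```

Resources from RDS, Aurora, and DynamoDB are then attached to the plan via resource assignments (tags or ARNs), which is what makes one policy span several database services.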

Summary

This glossary collects the essential terms an engineer or architect repeatedly encounters when choosing or operating an AWS database, across relational engines (Amazon RDS, Amazon Aurora), key-value (Amazon DynamoDB), document (Amazon DocumentDB), graph (Amazon Neptune, Amazon Neptune Analytics), time-series (Amazon Timestream), in-memory (Amazon ElastiCache, Amazon MemoryDB), wide-column (Amazon Keyspaces), and ledger (Amazon QLDB) services, plus the migration and integration layer that sits beside them.

Each definition is short enough to read in one breath, each Related line maps the term to its neighbours, and each Source link goes to the canonical AWS documentation. I will continue to update this glossary as the AWS database surface evolves - new engines, new Zero-ETL integrations, and new compatibility layers are exactly the kind of vocabulary worth keeping current in one place.



Written by Hidekazu Konishi