hidekazu-konishi.com

Summary of Differences and Commonalities in AWS Database Services using the Quorum Model - Comparison Charts of Amazon Aurora, Amazon DocumentDB, and Amazon Neptune

First Published: 2023-08-06
Last Updated: 2023-10-04

AWS currently offers a wide range of database services, including Amazon Aurora, Amazon DocumentDB, Amazon DynamoDB, Amazon ElastiCache, Amazon Keyspaces (for Apache Cassandra), Amazon MemoryDB for Redis, Amazon Neptune, Amazon QLDB, Amazon RDS, and Amazon Timestream.
Among these, Amazon Aurora and services released after it, such as Amazon DocumentDB and Amazon Neptune, replicate data using the Quorum Model.

To put it succinctly, the Quorum Model is a mechanism in which a write or read is considered complete once it has succeeded on a specified number of the data copies, while the remaining copies are brought into consistency asynchronously.
This addresses both the latency of traditional synchronous replication and the limited fault tolerance of asynchronous replication.

For example, Amazon Aurora automatically creates six copies of the data across three Availability Zones (AZs) and replicates asynchronously. A write is considered successful once it reaches 4 of the 6 copies (Write Quorum), and a read is considered successful with 3 of the 6 copies (Read Quorum). As a result, writes are unaffected by the loss of up to 2 copies, and reads are unaffected by the loss of up to 3 copies.
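The arithmetic behind these numbers can be checked with a short sketch. The function and key names below are illustrative and are not part of any AWS API. The two overlap conditions it checks are the usual quorum rules: a read quorum must intersect every write quorum, and any two write quorums must intersect each other.

```python
def quorum_tolerance(copies: int, write_quorum: int, read_quorum: int) -> dict:
    """Return how many copies can be lost before writes or reads are affected."""
    if not (0 < write_quorum <= copies and 0 < read_quorum <= copies):
        raise ValueError("quorum sizes must be between 1 and the number of copies")
    return {
        # Every read quorum shares at least one copy with every write quorum.
        "read_overlaps_write": read_quorum + write_quorum > copies,
        # Any two write quorums share at least one copy.
        "writes_overlap": 2 * write_quorum > copies,
        "tolerated_losses_for_write": copies - write_quorum,
        "tolerated_losses_for_read": copies - read_quorum,
    }


# Values from the Aurora example: 6 copies, Write Quorum 4, Read Quorum 3.
aurora = quorum_tolerance(copies=6, write_quorum=4, read_quorum=3)
print(aurora)
```

With 6 copies, a Write Quorum of 4, and a Read Quorum of 3, both overlap conditions hold, and the tolerated losses come out as 2 copies for writes and 3 copies for reads.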

Write Quorum and Read Quorum
In Amazon Aurora, for reasons of cost and efficiency, the read quorum is actually used only in cases such as loss of the cache on the primary DB instance, a restart of the primary DB instance, promotion of a replica to primary, local state reconstruction, reconstruction of corrupted segments, and quorum repair. In all other cases, reads are served from nodes that are already known to hold the required data version. For details of the Quorum Model used in Aurora, the following articles are helpful.

Amazon Aurora under the hood: quorums and correlated failure | AWS Database Blog
Amazon Aurora Under the Hood: Quorum Reads and Mutating State | AWS Database Blog
Amazon Aurora Under the Hood: Reducing Costs Using Quorum Sets | AWS Database Blog
Amazon Aurora Under the Hood: Quorum Membership | AWS Database Blog

Amazon Aurora, Amazon DocumentDB, and Amazon Neptune, which all use this Quorum Model, have strikingly similar cluster configurations and storage mechanisms according to their respective AWS documentation.
However, as far as I could find, AWS has not officially published the commonalities and differences in the mechanisms and functions of these database infrastructures.

Therefore, I have compiled comparison tables of the commonalities and differences to help deepen understanding of these database services efficiently.
The following comparison tables are based on the AWS documentation, supplemented by my own verification of differences between the documentation and actual behavior.
Because this is a personal compilation, it may contain inaccuracies. Please keep this in mind when referring to it.

Comparison of Major Features of Amazon Aurora, Amazon DocumentDB, and Amazon Neptune

| Comparison Item | Amazon Aurora | Amazon DocumentDB | Amazon Neptune |
|---|---|---|---|
| Overview | Relational Database | MongoDB Compatible Database | Graph Database |
| Supported Database | MySQL, PostgreSQL | MongoDB | Gremlin, SPARQL |
| Storage Expansion | Automatically expands in 10 GB increments up to 128 TiB | Automatically expands in 10 GB increments up to 128 TiB | Automatically expands in 10 GB increments up to 128 TiB |
| Storage Availability | Automatically creates 6 copies across 3 AZs. Write Quorum is 4 of 6 and Read Quorum is 3 of 6, so up to 2 copies can be lost without write impact and up to 3 copies without read impact. | Automatically creates 6 copies across 3 AZs. Write Quorum is 4 of 6 and Read Quorum is 3 of 6, so up to 2 copies can be lost without write impact and up to 3 copies without read impact. | Automatically creates 6 copies across 3 AZs. Write Quorum is 4 of 6 and Read Quorum is 3 of 6, so up to 2 copies can be lost without write impact and up to 3 copies without read impact. |
| Cluster Configuration | One Primary DB Instance, 0 to 15 Read Replica DB Instances | One Primary DB Instance, 0 to 15 Read Replica DB Instances | One Primary DB Instance, 0 to 15 Read Replica DB Instances |
| Failover Priority | Replica promotion tier specified from 0 (highest) to 15 (lowest). Promoted in the following order: (1) the Read Replica with the higher specified priority (closer to 0); (2) if priorities are the same, the larger Read Replica; (3) if both priority and size are the same, any Read Replica. | Replica promotion tier specified from 0 (highest) to 15 (lowest). Promoted in the following order: (1) the Read Replica with the higher specified priority (closer to 0); (2) if priorities are the same, the larger Read Replica; (3) if both priority and size are the same, any Read Replica. | Replica promotion tier specified from 0 (highest) to 15 (lowest). Promoted in the following order: (1) the Read Replica with the higher specified priority (closer to 0); (2) if priorities are the same, the larger Read Replica; (3) if both priority and size are the same, any Read Replica. |
| Endpoint | Cluster endpoint, Reader endpoint, Instance endpoint, Custom endpoint | Cluster endpoint, Reader endpoint, Instance endpoint | Cluster endpoint, Reader endpoint, Instance endpoint, Custom endpoint |
| Default DB Port | 3306 (MySQL), 5432 (PostgreSQL) | 27017 | 8182 |
| VPC Configuration | Set enableDnsHostnames and enableDnsSupport to true. | Set enableDnsHostnames and enableDnsSupport to true. | Set enableDnsHostnames and enableDnsSupport to true. |
| Encryption of Data Storage | Use AWS KMS | Use AWS KMS | Use AWS KMS |
| Encryption of Data Transfer | TLS encryption setting in the parameter group (enabled by default) | TLS encryption setting in the parameter group (enabled by default) | TLS encryption is mandatory in Neptune engine version 1.0.4.0 and later |
| Resource Management Access Control | IAM Role, IAM User | IAM Role, IAM User | IAM Role, IAM User |
| DB Port Access Control | Security Group, NACL | Security Group, NACL | Security Group, NACL |
| DB Connection Access Control | Username and Password, IAM Database Authentication | Username and Password | IAM Database Authentication |
| Audit Logs | Possible with Amazon CloudWatch Logs export and enabling "server_audit_logging" in the parameter group | Possible with Amazon CloudWatch Logs export and enabling "audit_logs" in the parameter group | Possible with Amazon CloudWatch Logs export and enabling "neptune_enable_audit_log" in the parameter group |
| Billing Structure | Pay-as-you-go for DB instances, storage, backup storage, and data transfer | Pay-as-you-go for DB instances, storage, backup storage, and data transfer | Pay-as-you-go for DB instances, storage, backup storage, and data transfer |
| Maintenance Window | A random 30-minute period is allocated within an 8-hour block per region. If not specified, the day of the week is also randomly assigned. | A random 30-minute period is allocated within an 8-hour block per region. If not specified, the day of the week is also randomly assigned. | A random 30-minute period is allocated within an 8-hour block per region. If not specified, the day of the week is also randomly assigned. |
| Backup Window | A random 30-minute period is allocated within an 8-hour block per region. At the start, there is a storage I/O interruption of a few seconds for Single-AZ configurations and a latency increase of a few minutes for Multi-AZ configurations. | A random 30-minute period is allocated within an 8-hour block per region. At the start, there is a storage I/O interruption of a few seconds for Single-AZ configurations and a latency increase of a few minutes for Multi-AZ configurations. | A random 30-minute period is allocated within an 8-hour block per region. At the start, there is a storage I/O interruption of a few seconds for Single-AZ configurations and a latency increase of a few minutes for Multi-AZ configurations. |
| Auto Backup Retention Period | 1 to 35 days | 1 to 35 days | 1 to 35 days |
| Point in Time Recovery | Recovery using auto backup | Recovery using auto backup | Recovery using auto backup |
| Backup Method | Auto backup (auto snapshot, transaction log), manual snapshot | Auto backup (auto snapshot, transaction log), manual snapshot | Auto backup (auto snapshot, transaction log), manual snapshot |
| Backup Sharing & Copying | Manual snapshots can be copied between regions and shared between accounts. Automatic snapshots can be copied to manual snapshots. Encrypted snapshots can be shared by granting the destination access to the custom KMS key. Snapshots encrypted with the default KMS key can be shared after copying them to a snapshot encrypted with a custom KMS key. | Manual snapshots can be copied between regions and shared between accounts. Automatic snapshots can be copied to manual snapshots. Encrypted snapshots can be shared by granting the destination access to the custom KMS key. Snapshots encrypted with the default KMS key can be shared after copying them to a snapshot encrypted with a custom KMS key. | Manual snapshots can be copied between regions and shared between accounts. Automatic snapshots can be copied to manual snapshots. Encrypted snapshots can be shared by granting the destination access to the custom KMS key. Snapshots encrypted with the default KMS key can be shared after copying them to a snapshot encrypted with a custom KMS key. |
| Streams Feature | Database Activity Streams. Pushes a log of database changes, which can be integrated with other services, to Amazon Kinesis Data Streams without duplication while maintaining the order of occurrence. | Change Streams. Keeps a log of database changes, which can be integrated with other services, in the cluster for 7 days without duplication while maintaining the order of occurrence. Can be obtained only from primary instances. | Neptune Streams. Keeps a log of database changes, which can be integrated with other services, in the cluster for 7 days without duplication while maintaining the order of occurrence. Can be obtained from both primary instances and read replica instances. |
| Event Notification Feature | Integrates with Amazon SNS when cluster or instance events occur. | Integrates with Amazon SNS when cluster or instance events occur. | Integrates with Amazon SNS when cluster or instance events occur. |
| Backtrack | A feature that can roll back data in the same DB cluster to a specified time in just a few minutes. | - | - |
| Clone Feature | Create a clone of the DB cluster without downtime. | Create a clone of the DB cluster without downtime. | Create a clone of the DB cluster without downtime. |
| Creation of Serverless Cluster | Create a cluster with on-demand auto-scaling configuration (Aurora Serverless v1, Aurora Serverless v2). | - | Create a cluster with on-demand auto-scaling configuration (similar to Aurora Serverless v2). |
| Creation of Elastic Clusters | - | Create Amazon DocumentDB Elastic Clusters that can scale to petabytes of storage, handling millions of writes and reads per second, with minimal impact on downtime and performance. | - |
| Performance Insights | Analyze database load causes by wait state, SQL queries, host, users, etc. | Analyze database load causes by wait state, SQL queries, host, users, etc. | - |
| Global Database (Clusters) | Single-master configuration that creates read replicas in each region with Amazon Aurora Global Database. Transactions are supported by a primary DB instance existing in one region. | Creates read replicas in each region with Amazon DocumentDB Global Clusters (single-master configuration). | Creates read replicas in each region with Amazon Neptune Global Database (single-master configuration). |
| Multi-Master Deployment | Create a multi-master database within one region with Amazon Aurora Multi-Master. | - | - |
| Profiler | - | Records operation execution time and detailed logs to Amazon CloudWatch Logs. | - |
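
The failover priority order in the table above can be sketched as a two-part sort key: promotion tier first, then instance size. This is only an illustration of the documented ordering, not how the services implement failover. The `Replica` class and the numeric `size_rank` field are hypothetical stand-ins for real instance metadata.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    name: str
    promotion_tier: int  # 0 (highest priority) to 15 (lowest)
    size_rank: int       # hypothetical numeric stand-in for instance size; larger = bigger


def pick_failover_target(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Choose the replica to promote: lowest tier first, then largest size."""
    if not replicas:
        return None
    # min() returns the first of several equal candidates, which stands in
    # for "any Read Replica" when both tier and size are the same.
    return min(replicas, key=lambda r: (r.promotion_tier, -r.size_rank))


replicas = [
    Replica("replica-a", promotion_tier=1, size_rank=4),
    Replica("replica-b", promotion_tier=0, size_rank=2),
    Replica("replica-c", promotion_tier=0, size_rank=8),
]
target = pick_failover_target(replicas)
print(target.name)
```

Here `replica-b` and `replica-c` share tier 0, so the larger one, `replica-c`, is selected.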

Similar Monitoring Items for Amazon Aurora, Amazon DocumentDB, and Amazon Neptune

I have compiled the monitoring items that have similar content across Amazon Aurora, Amazon DocumentDB, and Amazon Neptune.
Each database also has its own specific metrics in addition to those in the following table.
Among these, BufferCacheHitRatio, CPUUtilization, and FreeableMemory are used as criteria for deciding whether to scale the instance class up or down.

| Metric Name | Metric Content |
|---|---|
| BufferCacheHitRatio (Aurora), BufferCacheHitRatio (DocumentDB), BufferCacheHitRatio (Neptune) | The percentage of requests served by the buffer cache (%). Consider a larger instance class (more memory for the data cache) when, for example, the cache hit ratio is below 99.9% and latency is high. |
| CPUUtilization (Aurora), CPUUtilization (DocumentDB), CPUUtilization (Neptune) | CPU usage rate (%). Consider a larger instance class when CPU usage is close to 100% or sufficient performance cannot be obtained. |
| FreeableMemory (Aurora), FreeableMemory (DocumentDB), FreeableMemory (Neptune) | Available memory (RAM) in bytes. Consider a larger instance class when memory is running short. |
| EngineUptime (Aurora), EngineUptime (DocumentDB), EngineUptime (Neptune) | Instance running time (seconds). |
| VolumeBytesUsed (Aurora), VolumeBytesUsed (DocumentDB), VolumeBytesUsed (Neptune) | Storage capacity (bytes) allocated to the DB cluster. |
| AuroraReplicaLag (Aurora), DBInstanceReplicaLag (DocumentDB), ClusterReplicaLag (Neptune) | Replication delay time (milliseconds) between the primary instance and the replica. |
| AuroraReplicaLagMaximum (Aurora), DBClusterReplicaLagMaximum (DocumentDB), ClusterReplicaLagMaximum (Neptune) | Maximum replication delay time (milliseconds) between the primary instance and each replica. |
| AuroraReplicaLagMinimum (Aurora), DBClusterReplicaLagMinimum (DocumentDB), ClusterReplicaLagMinimum (Neptune) | Minimum replication delay time (milliseconds) between the primary instance and each replica. |
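
As a rough illustration of how the three sizing metrics might be combined, the sketch below flags reasons to consider a larger instance class. Only the 99.9% cache hit ratio comes from the table above. The CPU cutoff (`cpu_threshold`) and the memory cutoff (`min_freeable_bytes`) are assumed values for illustration and should be tuned per workload. The function name is hypothetical.

```python
def scale_up_reasons(buffer_cache_hit_ratio, cpu_utilization, freeable_memory_bytes,
                     cpu_threshold=90.0, min_freeable_bytes=512 * 1024**2):
    """Return the reasons (if any) to consider a larger instance class."""
    reasons = []
    # 99.9% is the figure from the table; the table pairs it with high
    # latency, which this sketch does not model.
    if buffer_cache_hit_ratio < 99.9:
        reasons.append("BufferCacheHitRatio below 99.9%")
    # Assumed cutoff for "close to 100%".
    if cpu_utilization >= cpu_threshold:
        reasons.append("CPUUtilization near 100%")
    # Assumed cutoff for "a shortage of memory".
    if freeable_memory_bytes < min_freeable_bytes:
        reasons.append("FreeableMemory is low")
    return reasons


busy = scale_up_reasons(99.5, 95.0, 2 * 1024**3)
healthy = scale_up_reasons(99.95, 40.0, 2 * 1024**3)
print(busy)
print(healthy)
```

An empty list means none of the three criteria suggests a larger instance class. In practice the inputs would come from Amazon CloudWatch metric values averaged over a suitable period.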

Common Change Application Timing for Amazon Aurora, Amazon DocumentDB, Amazon Neptune

This section summarizes when changes are applied, which is common to Amazon Aurora, Amazon DocumentDB, and Amazon Neptune.
If you do not select "Apply Immediately" when making a change, changes to the following items are applied during the next maintenance window: cluster identifier, master password, IAM DB authentication, instance identifier, and instance class.
Changes to the other items in the table below are applied immediately, regardless of the "Apply Immediately" setting.

| Scope | Change Target | Impact |
|---|---|---|
| Cluster | Cluster Identifier | No downtime. |
| Cluster | Master Password | No downtime. |
| Cluster | IAM DB Authentication (not available in DocumentDB) | No downtime. |
| Cluster | Security Group | No downtime. |
| Cluster | Parameter Group | Parameter changes take effect only after a manual restart without failover. |
| Cluster | Maintenance Window | If you change the maintenance window to include the current time and there are pending actions that cause an outage, those pending actions are applied immediately and an outage occurs. |
| Cluster | Backup Retention Period | No downtime. |
| Cluster | Backup Window | No downtime. |
| Instance | Instance Identifier | The DB instance restarts, and an outage occurs during the change. |
| Instance | Instance Class | An outage occurs during the change. |
| Instance | Promotion Tier | No downtime. |
| Cluster | Deletion Protection | No downtime. |
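
The timing rule described in this section can be expressed as a small lookup. This is a minimal sketch of the rule as stated here, not an AWS API. The names `DEFERRABLE` and `application_timing` are hypothetical, and the sketch does not model per-item impact such as the manual restart required for parameter group changes.

```python
# Items that wait for the maintenance window unless "Apply Immediately" is selected.
DEFERRABLE = {
    "cluster identifier",
    "master password",
    "iam db authentication",
    "instance identifier",
    "instance class",
}


def application_timing(change_target: str, apply_immediately: bool) -> str:
    """Return when a change is applied, following the rule described above."""
    if change_target.lower() in DEFERRABLE and not apply_immediately:
        return "maintenance window"
    return "immediately"


print(application_timing("Instance Class", apply_immediately=False))
print(application_timing("Instance Class", apply_immediately=True))
print(application_timing("Security Group", apply_immediately=False))
```

Only the five deferrable items depend on the flag. Every other change target in the table is applied immediately either way.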

Summary

As the comparison tables show, although there are differences in quotas, Amazon Aurora, Amazon DocumentDB, and Amazon Neptune are almost the same in terms of availability based on the Quorum Model and in their backup functions.
On the other hand, the security functions use different authentication methods depending on the characteristics of each database, and the monitoring functions differ subtly in the names of similar metrics.
Also, at the time of writing, Amazon Aurora features such as Backtrack and Multi-Master are not available in Amazon DocumentDB or Amazon Neptune.
Conversely, Amazon DocumentDB features such as Profiler and Elastic Clusters are not available in Amazon Aurora or Amazon Neptune at the time of writing.
Although these services currently share many features, each AWS service is updated frequently, so a feature that is common to all three at one point may differ at another. It is therefore essential to keep gathering information about AWS service updates (I think this is one reason it is hard to state definitively what these services have in common and where they differ).
It is important to keep checking for new features and updates to these Quorum Model-based AWS database services, while looking forward to future improvements.

Written by Hidekazu Konishi


Copyright © Hidekazu Konishi ( hidekazu-konishi.com ) All Rights Reserved.