MySQL high availability in 2026: Galera, InnoDB Cluster, or async replicas?
Three real HA approaches for MySQL, what each actually buys you, and the decision tree we use when standing up customer clusters.
May 13, 2026 · 10 min · by Sudhanshu K.
Every few months a customer comes to us with the same question. "We need MySQL HA. We've heard about Galera. We've heard about InnoDB Cluster. We've heard people say async replicas are fine. What do we actually use?" The honest answer is "it depends on what you mean by HA," which is a frustrating answer, so this post is the longer version we walk them through.
We run managed MySQL for customers across AWS, GCP, Azure, and DigitalOcean. The three HA topologies below cover roughly 95% of what we deploy. Each one is correct for some workloads and disastrously wrong for others.
What HA actually means for MySQL
Before the architectures, the definitions, because most outages we get paged for stem from teams not having explicitly agreed on these:
- RTO — recovery time objective. How long can the database be unavailable for writes? 5 seconds? 5 minutes? 30 minutes?
- RPO — recovery point objective. How much data can you afford to lose? Zero? Up to the last committed transaction? Up to the last few seconds?
- Read scaling — do you need read replicas for query offload, or is HA strictly about failover?
- Geographic distribution — single AZ? Multi-AZ? Multi-region? Active-active across regions?
"We need HA" usually means "we need automatic failover with under-a-minute RTO and zero data loss for the common case, with read scaling, in a single region." That's a perfectly common shape, and three different topologies can deliver it with very different trade-offs.
Option 1: Async replication with an orchestrator
The classic shape. One primary, one or more replicas, asynchronous replication, and an external orchestrator (Orchestrator or its Vitess fork VTOrc, MaxScale, ProxySQL with a failover script, or RDS's built-in mechanism) that handles primary failover.
# my.cnf (primary)
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin
binlog_format = ROW
binlog_row_image = FULL
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
gtid_mode = ON
enforce_gtid_consistency = ON
GTID-based async replication is what almost every production MySQL deployment from the last decade has run. It's simple, well-understood, and the failure modes are documented in a thousand blog posts. We default to this for any customer where:
- The workload is read-heavy and writes are concentrated on one node anyway
- Sub-second failover is not a requirement (a 30-60 second failover window is acceptable)
- The team is small and doesn't want to learn cluster-mode operations
- Geographic distribution matters (async replication tolerates 100ms+ replica latency without breaking)
What it costs you: RPO is non-zero. In a failover scenario, any transaction that was committed on the primary but hadn't replicated to the chosen replica is lost. For most B2B SaaS workloads this is a handful of transactions; for a payments system it's unacceptable.
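To put a rough number on that exposure, here is a back-of-the-envelope sketch: transactions at risk are roughly the commit rate multiplied by the replication lag at the moment the primary dies. The function name and the sample workloads are illustrative assumptions, not measurements from any customer.

```python
def transactions_at_risk(commits_per_second: float, replica_lag_seconds: float) -> float:
    """Rough upper bound on committed transactions not yet on the replica."""
    if commits_per_second < 0 or replica_lag_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return commits_per_second * replica_lag_seconds


if __name__ == "__main__":
    # Hypothetical workloads: (label, commits per second, replica lag in seconds)
    samples = [
        ("quiet period", 20, 0.2),
        ("busy period", 200, 0.5),
        ("bulk import running", 200, 30.0),
    ]
    for label, rate, lag in samples:
        lost = transactions_at_risk(rate, lag)
        print(f"{label}: ~{lost:.0f} transactions at risk")
```

The useful part is the last sample: the exposure is set by lag at the worst moment, not by average lag, so replica lag is the metric to alert on.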
Semi-sync helps but isn't a panacea. Setting rpl_semi_sync_master_enabled = ON (rpl_semi_sync_source_enabled from MySQL 8.0.26 onward) makes the primary wait for at least one replica to acknowledge each transaction before returning to the client. This bounds your RPO but doesn't eliminate it (if no replica acknowledges within the configured timeout, for example during a network partition, the primary falls back to async), and it adds 1-3ms to every write. Most of our customers running async run semi-sync too.
For straightforward async deployments on cloud-managed services, we tend to lean on the provider's tooling — RDS Multi-AZ on AWS, Cloud SQL HA on GCP, Azure Database for MySQL Flexible Server, or DigitalOcean Managed MySQL. The provider handles failover; we handle the application-side connection retry logic and the monitoring.
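The application-side retry logic mentioned above can be sketched as a capped exponential backoff with jitter that keeps trying for longer than the expected failover window. This is a minimal sketch, not our production code: `call_with_retry`, its parameters, and the 90-second default are assumptions chosen to cover a 30-60 second failover, and the caller supplies `is_transient` because which exceptions count as "primary is failing over" depends on the driver.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    is_transient: Callable[[Exception], bool],
    max_wait_seconds: float = 90.0,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run `operation`, retrying transient failures until a deadline passes."""
    deadline = clock() + max_wait_seconds
    attempt = 0
    while True:
        try:
            return operation()
        except Exception as exc:
            # Give up on non-transient errors, or once the failover window is exceeded.
            if not is_transient(exc) or clock() >= deadline:
                raise
            # Capped exponential backoff; the exponent is bounded to avoid huge numbers.
            delay = min(max_delay, base_delay * (2 ** min(attempt, 16)))
            sleep(random.uniform(0, delay))  # full jitter avoids a reconnect stampede
            attempt += 1


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_query() -> str:
        # Simulates a connection that fails twice while the primary is being replaced.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("primary unavailable")
        return "ok"

    result = call_with_retry(flaky_query, lambda e: isinstance(e, ConnectionError))
    print(result, "after", state["calls"], "attempts")
```

One design caveat: blindly retrying a write whose outcome is unknown (the connection dropped after the statement was sent) is only safe if the write is idempotent, so in practice the retry wrapper belongs around whole transactions that can be replayed.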
Option 2: Galera (Percona XtraDB Cluster, MariaDB Galera)
Galera is synchronous, multi-primary, certification-based replication. Every transaction is broadcast to every node before commit; conflicts are detected at commit time and one transaction is aborted. From the application's perspective, every node looks like a primary you can write to.
This sounds magical. In practice it has a specific set of strengths and a specific set of footguns.
Where Galera shines:
- True zero-RPO. Once a commit returns, the transaction's writeset has been delivered to every node, so losing any single node loses no committed data.
- Sub-second failover. Connection-pool-level failover; no orchestrator needed.
- Active-active reads and writes across nodes in the same DC.
- Excellent for the "can't tolerate even brief read-only periods" workload.
Where Galera bites:
- Write throughput does not scale with node count. Every write is replicated to every node synchronously. A 3-node cluster has roughly the same write capacity as a 1-node primary, minus the replication overhead. People deploy Galera expecting 3x write scaling and are disappointed.
- Hot-row contention causes cluster-wide stalls. If two clients on different nodes update the same row, one of them gets a deferred deadlock-style error at commit time. Galera works beautifully for workloads where writes are distributed across the keyspace; it works badly for workloads with hotspot tables.
- Cross-AZ latency directly adds to commit latency. A Galera cluster spread across AZs in the same region adds the inter-AZ RTT (typically 1-2ms on AWS, similar on GCP/Azure) to every write transaction. Cross-region Galera is not viable — the latency makes write performance miserable.
- State Snapshot Transfer (SST) is painful. When a node falls behind too far or rejoins, the cluster donor streams a full snapshot to it. On a 500GB database this can take hours during which one node is serving as donor and possibly degraded.
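The hot-row behaviour in the list above follows from how certification works: transactions run optimistically on their local node and conflicts are only discovered once writesets are put in a global order. The toy model below illustrates the first-committer-wins rule under simplifying assumptions (row keys as strings, a single in-memory certifier). The class and method names are hypothetical and this is not Galera's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class Certifier:
    """Toy first-committer-wins certification over a global writeset order."""

    last_seqno: int = 0
    # Maps a row key to the sequence number of the writeset that last changed it.
    last_writer: Dict[str, int] = field(default_factory=dict)

    def certify(self, snapshot_seqno: int, write_keys: FrozenSet[str]) -> bool:
        """Return True if the writeset commits, False if it must be aborted."""
        for key in write_keys:
            # Conflict: the row was changed by a writeset ordered after our snapshot.
            if self.last_writer.get(key, 0) > snapshot_seqno:
                return False
        self.last_seqno += 1
        for key in write_keys:
            self.last_writer[key] = self.last_seqno
        return True


if __name__ == "__main__":
    hot = Certifier()
    # Two clients on different nodes both read at seqno 0 and update the same row.
    print(hot.certify(0, frozenset({"counters:1"})))  # first committer wins
    print(hot.certify(0, frozenset({"counters:1"})))  # second is aborted at commit

    spread = Certifier()
    # Same timing, but the writes touch different rows, so both commit.
    print(spread.certify(0, frozenset({"orders:101"})))
    print(spread.certify(0, frozenset({"orders:202"})))
```

In the hot-row case the losing client has already done all of its work before it is told to retry, which is why hotspot tables hurt more on Galera than on a single primary where the second writer would simply have waited on a row lock.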
We deploy Galera (specifically Percona XtraDB Cluster, which is what we provision by default for HA workloads needing zero-RPO) when the customer's workload is write-light-to-moderate, latency-tolerant within a region, and zero-RPO is a hard requirement. A five-node cluster across three AZs is the typical shape. The fifth node is usually an arbiter (no data) to break ties.
# my.cnf for PXC
[mysqld]
wsrep_provider = /usr/lib/galera4/libgalera_smm.so
wsrep_cluster_address = "gcomm://node1,node2,node3,node4,node5"
wsrep_node_name = node1
wsrep_node_address = 10.0.1.10
wsrep_sst_method = xtrabackup-v2
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
That last line, innodb_autoinc_lock_mode = 2, is mandatory for Galera and breaks any application that relies on monotonic auto-increment IDs across sessions. It's the most common subtle bug we see when teams migrate from a standalone MySQL to Galera.
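A related effect, separate from the lock mode itself, makes the non-monotonic IDs easy to see. By default Galera gives each node its own auto-increment offset and sets the increment to the cluster size (controlled by wsrep_auto_increment_control), so IDs sorted by commit time are not sorted by value. The simulation below is a toy sketch under that assumption; the function names and the commit order are made up for illustration.

```python
from itertools import count
from typing import Iterator, List


def node_id_generator(node_index: int, cluster_size: int) -> Iterator[int]:
    """IDs one node hands out: offset = node index (1-based), increment = cluster size."""
    return count(start=node_index, step=cluster_size)


def ids_in_commit_order(commit_order: List[int], cluster_size: int = 3) -> List[int]:
    """Given which node handled each insert, return the IDs in commit order."""
    generators = {n: node_id_generator(n, cluster_size) for n in range(1, cluster_size + 1)}
    return [next(generators[node]) for node in commit_order]


if __name__ == "__main__":
    # Hypothetical traffic: node 1 takes three inserts, then nodes 2, 3, 2.
    ids = ids_in_commit_order([1, 1, 1, 2, 3, 2])
    print(ids)                 # [1, 4, 7, 2, 3, 5]
    print(ids == sorted(ids))  # False: a later insert can get a smaller ID
```

Any code that uses "highest ID seen so far" as a cursor, or assumes ORDER BY id matches insertion order, will silently skip or reorder rows under this pattern.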
Option 3: MySQL InnoDB Cluster (Group Replication + MySQL Router)
Oracle's first-party answer, and the architecture we recommend most often in 2026 for new deployments on stock MySQL 8.x.
InnoDB Cluster is a packaging of three components:
- Group Replication: the actual replication protocol. Paxos-based, similar synchronous semantics to Galera but with a more conservative conflict model.
- MySQL Shell: administration tooling (dba.createCluster(), cluster.addInstance(), etc.).
- MySQL Router: the connection routing layer that hides primary location from the application.
By default, Group Replication runs in single-primary mode — one writeable node, the others are read-only standbys. Failover happens automatically when the primary leaves the group. Multi-primary mode exists but Oracle's official guidance is "don't use it unless you really need it," and we've never recommended it.
# Setup via MySQL Shell
mysqlsh root@node1 --sql -e "SET sql_log_bin = 0; \
CREATE USER 'clusteradmin'@'%' IDENTIFIED BY '<pw>'; \
GRANT ALL ON *.* TO 'clusteradmin'@'%' WITH GRANT OPTION;"
# Then from mysqlsh JS:
var cluster = dba.createCluster('prodCluster');
cluster.addInstance('clusteradmin@node2:3306');
cluster.addInstance('clusteradmin@node3:3306');
cluster.status();
What InnoDB Cluster gets right:
- Stock MySQL 8.x. No external storage engine plugin. Patching is the normal MySQL release cycle.
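- Fast automatic failover, with the Router reconnecting clients to the new primary. Planned switchovers are sub-second; an unplanned failure first has to be detected by the group, which with default settings takes a few seconds.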
- No application changes for failover. The Router exposes a R/W port (6446) and a R/O port (6447); apps using the R/W port get routed to the current primary.
- Better behaviour under partition. Group Replication explicitly fences minority partitions, preventing split-brain.
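On the application side, the Router's port split can be wrapped in a small helper so that code declares read or write intent instead of hard-coding ports. This is a minimal sketch: the ports are the ones quoted above, while `router_endpoint`, `Endpoint`, and the hostname are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

ROUTER_RW_PORT = 6446  # routed to the current primary
ROUTER_RO_PORT = 6447  # routed to read-only members


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


def router_endpoint(router_host: str, read_only: bool) -> Endpoint:
    """Pick the Router port that matches the caller's intent."""
    return Endpoint(router_host, ROUTER_RO_PORT if read_only else ROUTER_RW_PORT)


if __name__ == "__main__":
    # "mysql-router.internal" is a placeholder for the load balancer in front of the Routers.
    print(router_endpoint("mysql-router.internal", read_only=False))
    print(router_endpoint("mysql-router.internal", read_only=True))
```

Because the application only ever knows the Router address, a failover changes nothing in its configuration; it just needs to reconnect.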
What it gets wrong (or merely awkward):
- Group Replication has a 9-node limit (in practice, 3, 5, or 7 nodes is what people run).
- The Router is a single point of failure unless you run multiple Router instances behind a load balancer — which means you do.
- DDL still serialises. Schema changes are not Group-Replication-friendly without care; we use pt-online-schema-change or gh-ost regardless of HA topology.
For most new managed deployments on AWS, GCP, or Azure where the customer wants stock MySQL 8.x and synchronous replication, InnoDB Cluster is what we ship. A 3-node Group Replication setup with 2 Router instances behind an internal load balancer is the typical shape.
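The preference for odd group sizes comes straight from majority arithmetic: a group keeps accepting writes only while a majority of its members can reach each other. The sketch below works through the numbers; the function names are illustrative and the rule modelled is plain majority quorum, not any weighting feature.

```python
def majority(group_size: int) -> int:
    """Smallest number of members that forms a majority."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return group_size // 2 + 1


def tolerated_failures(group_size: int) -> int:
    """How many members can be lost while a majority remains."""
    return group_size - majority(group_size)


def has_quorum(reachable_members: int, group_size: int) -> bool:
    """True if this side of a partition may keep accepting writes."""
    return reachable_members >= majority(group_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7, 9):
        print(f"{n} nodes: majority {majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

A 4-node group tolerates one failure, the same as a 3-node group, which is why the even sizes buy cost without buying availability. It is also why a 3-node group split 2/1 across a partition keeps serving on the side with two members and fences the side with one.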
The decision tree
How we actually decide, in the order we ask the questions:
- Are you on AWS RDS / Cloud SQL / Azure Database for MySQL? If yes and "managed by the cloud provider, with their failover" is acceptable, that's the default. Multi-AZ deployments give you roughly 60-second failover, and the provider replicates to the standby synchronously (at the storage layer for RDS Multi-AZ instances and Cloud SQL HA), so committed data should survive a failover without you configuring semi-sync.
- Is zero-RPO a hard requirement? If yes, you need synchronous replication: either Galera or Group Replication.
- Is your workload write-heavy with hotspot rows? If yes, avoid synchronous multi-primary topologies. Async with a single primary, scaled vertically, will be faster and more reliable.
- Do you need active-active writes across nodes? If yes (and you've genuinely thought through the conflict resolution), Galera. If no, prefer Group Replication for the cleaner failure semantics.
- Do you need to span regions? Synchronous HA does not span regions. Use async replication to the second region with explicit failover procedures.
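The questions above can be sketched as a function, mostly to make the ordering explicit. This is one possible encoding, not a definitive rule set: the `Requirements` fields and `recommend` are hypothetical names, and the branch for "zero-RPO plus hotspot rows" is our interpretation (single-primary Group Replication), since the questions only say to avoid synchronous multi-primary in that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Requirements:
    on_managed_cloud_db: bool
    provider_failover_acceptable: bool
    zero_rpo_required: bool
    write_heavy_hotspots: bool
    needs_active_active_writes: bool
    spans_regions: bool


def recommend(req: Requirements) -> List[str]:
    """Walk the questions in order and return the suggested topology plus add-ons."""
    if req.on_managed_cloud_db and req.provider_failover_acceptable:
        advice = ["provider-managed HA (Multi-AZ or equivalent)"]
    elif not req.zero_rpo_required:
        advice = ["async replication with an orchestrator (consider semi-sync)"]
    elif req.write_heavy_hotspots:
        # Interpretation: zero-RPO still needs sync replication, but not multi-primary.
        advice = ["Group Replication, single-primary (avoid multi-primary)"]
    elif req.needs_active_active_writes:
        advice = ["Galera"]
    else:
        advice = ["Group Replication, single-primary"]
    if req.spans_regions:
        # Synchronous HA stays in one region; the second region gets an async replica.
        advice.append("async replica in the second region with explicit failover procedures")
    return advice


if __name__ == "__main__":
    payments = Requirements(
        on_managed_cloud_db=False,
        provider_failover_acceptable=False,
        zero_rpo_required=True,
        write_heavy_hotspots=False,
        needs_active_active_writes=False,
        spans_regions=True,
    )
    for line in recommend(payments):
        print("-", line)
```

Note that the cross-region question is additive rather than a branch: whichever topology is chosen inside the region, the second region is always reached by async replication.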
Cost notes
A 3-node InnoDB Cluster or Galera cluster costs roughly 3x the compute of a single primary, plus the Router/proxy layer. On AWS this is typically the difference between $300/month for a single db.r6g.large Multi-AZ RDS and $1,200-1,500/month for a self-managed 3-node cluster on EC2 (Router on small instances, EBS gp3 storage, 100GB data set). The trade-off is more operational control and zero-RPO synchronous replication.
For most customers the managed MySQL HA we provide sits on top of either RDS/Cloud SQL Multi-AZ (when the provider's HA is sufficient) or self-managed Group Replication on EC2/Compute Engine/Azure VMs (when synchronous semantics or specific tuning is required). We have a strong default toward the provider's managed HA — every operation you don't have to run is an operation you don't get paged for.
What we ship by default
For new managed MySQL HA customers:
- Group Replication (3 nodes across AZs) on top of stock MySQL 8.4 LTS, with two MySQL Router instances behind a private load balancer
- sync_binlog = 1, innodb_flush_log_at_trx_commit = 1, semi-sync as a belt-and-braces measure
- ProxySQL or MySQL Router for connection routing, with explicit R/W split where the application supports it
- Async replica in a second region for disaster recovery (not failover)
- Backup via Percona XtraBackup with binlog archival for point-in-time recovery (the topic of the next article in this series)
If you're staring at a MySQL deployment that's been "we'll add HA later" for the past three years, reach out. We'll do a one-day architecture review and tell you which of the three shapes above is right for your workload — and, importantly, when one of them is overkill and you should just use semi-sync async with a managed failover.
Sudhanshu K. is a Principal Database Engineer at EdgeServers (RemotIQ Pty Ltd, ABN 91 682 628 128). She has personally been on-call during three Galera split-brain incidents and lived to deploy Group Replication.