Postgres replication patterns in 2026 — Patroni, managed services, and the failover story
May 27, 2026 · 1 min read · by Sudhanshu K.
There are essentially three ways to run highly-available Postgres in 2026: managed Postgres on a hyperscaler, a self-managed Patroni cluster, or a custom replication script someone wrote in 2019. Option three is a trap.
We run options one and two extensively across customer engagements. Which one you should pick depends on how much operational capacity you have and whether you need extensions or RPO/RTO numbers the managed services can't hit.
Patroni — the failover semantics that matter
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    maximum_lag_on_failover: 1048576
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
synchronous_mode: true is RPO=0 on automatic failover in exchange for commit latency. maximum_lag_on_failover: 1048576 (bytes, so 1 MiB) means a replica lagging by more than that can't be promoted; the alternative is silent data loss in a real outage. These two settings are what make Patroni's failover story stronger than RDS Multi-AZ.
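To make the promotion rule concrete, here is a minimal Python sketch of the gate these two settings describe. It is a simplified model, not Patroni's actual implementation: the `Replica` type and the `promotable` function are hypothetical names, and the logic captures only two ideas from this section, that a replica more than `maximum_lag_on_failover` bytes behind is excluded, and that in synchronous mode only a synchronous standby is a candidate.

```python
from dataclasses import dataclass

MAXIMUM_LAG_ON_FAILOVER = 1_048_576  # bytes (1 MiB), matching the config above


@dataclass(frozen=True)
class Replica:
    """Hypothetical view of a standby at the moment the leader is lost."""
    name: str
    replayed_lsn: int      # WAL position the replica has replayed, in bytes
    is_synchronous: bool   # True if it was a synchronous standby


def promotable(replica, last_leader_lsn, synchronous_mode=True,
               max_lag=MAXIMUM_LAG_ON_FAILOVER):
    """Return True if this simplified model would allow promotion."""
    # In synchronous mode, only a synchronous standby is a candidate.
    if synchronous_mode and not replica.is_synchronous:
        return False
    # Otherwise the replica must be within max_lag bytes of the old leader.
    lag = max(0, last_leader_lsn - replica.replayed_lsn)
    return lag <= max_lag


replicas = [
    Replica("pg-2", replayed_lsn=5_000_000, is_synchronous=True),
    Replica("pg-3", replayed_lsn=3_000_000, is_synchronous=False),
]
candidates = [r.name for r in replicas
              if promotable(r, last_leader_lsn=5_000_000)]
print(candidates)  # ['pg-2']
```

Even with `synchronous_mode=False`, `pg-3` stays excluded in this model because it is roughly 2 MB behind, which is over the 1 MiB limit.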
The full write-up covers:
- Managed Postgres trade-offs (no superuser, parameter group limits, vendor-defined RPO)
- Patroni cluster topology (3 Postgres + 3 etcd or K8s leases)
- The 30-45 second failover sequence and how to tune ttl down
- DCS (etcd) partition failures: the silent demote that confuses everyone
- Patroni on Kubernetes via Zalando/Crunchy operators vs bare VMs
- HAProxy / PgBouncer health checks against Patroni's REST API
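On tuning ttl down: Patroni's documentation asks that `loop_wait + 2 * retry_timeout <= ttl` keep holding whenever any of the three values changes. The following is a minimal Python sketch of that check. The function name `timing_is_consistent` is hypothetical, and `retry_timeout: 10` is assumed here as Patroni's default because it does not appear in the config shown earlier.

```python
def timing_is_consistent(ttl, loop_wait, retry_timeout):
    """Check the documented rule: loop_wait + 2 * retry_timeout <= ttl."""
    return loop_wait + 2 * retry_timeout <= ttl


# Values from the config earlier in this post; retry_timeout assumed to be 10.
print(timing_is_consistent(ttl=30, loop_wait=10, retry_timeout=10))  # True

# Lowering ttl on its own breaks the rule ...
print(timing_is_consistent(ttl=20, loop_wait=10, retry_timeout=10))  # False

# ... so loop_wait and retry_timeout have to come down with it.
print(timing_is_consistent(ttl=15, loop_wait=5, retry_timeout=5))    # True
```

This only validates the relationship between the three timers. It says nothing about whether a shorter ttl is safe for a given network or DCS, which still needs testing.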
For most workloads we recommend managed Postgres. Reach out if yours doesn't fit.