postgresql

Postgres replication patterns in 2026 — Patroni, managed services, and the failover story

When to roll your own Patroni cluster, when to use managed Postgres, and the failover semantics nobody explains until production breaks.

May 27, 2026 · 10 min · by Sudhanshu K.

There are essentially three ways to run highly-available Postgres in 2026:

  1. Managed Postgres on a hyperscaler (RDS, Cloud SQL, Azure Database for Postgres, DigitalOcean Managed Postgres)
  2. Self-managed Patroni cluster on VMs or Kubernetes
  3. Self-managed primary + custom replication scripts

Option 3 is a trap. Don't.

This post is about choosing between options 1 and 2, the trade-offs that actually matter, and the failover behavior nobody explains until you've experienced it in anger. We run both patterns extensively across managed Postgres customers.

Managed Postgres: the trade-offs

The pitch is compelling: clicking a button gives you a primary + standby + automated failover + backups + monitoring. The reality is more nuanced.

What you get

  • Provisioned in minutes, not days
  • Automated minor version upgrades (mostly)
  • Failover usually happens automatically on primary failure (the "usually" matters)
  • Backups configured by default (verify the retention is what you think)
  • A maintenance window where the vendor can reboot you

What you give up

  • Superuser access. No SUPERUSER role, no access to the OS, no custom extensions outside the vendor's allowlist
  • Logical replication out, in some implementations. RDS Postgres added it eventually but with caveats; Cloud SQL is similar.
  • Custom postgresql.conf. You get a parameter group, not direct edit access. Some parameters are entirely off-limits.
  • Knowledge of when failover happened. RDS in particular is notoriously vague about whether you're connected to the original primary or the failover replica
  • Cost predictability. Storage I/O charges can dominate the bill at scale; the headline instance price is rarely the actual bill

When managed is correct

  • You're a small team and don't have on-call DBA capacity
  • Your workload fits comfortably within the vendor's extension allowlist
  • You don't need bleeding-edge Postgres features
  • You have predictable load (no need for last-mile tuning)
  • You're already invested in that cloud's IAM and networking

For ~70% of customer workloads, we recommend managed Postgres for these reasons. RDS Postgres on AWS and Cloud SQL on GCP are both mature, well-instrumented, and operationally cheap. The marginal cost of "managed" is well worth it for most teams.

When managed is wrong

  • You need custom Postgres extensions (Citus and TimescaleDB, for example, are unavailable or only partially supported on most managed offerings, and the support that exists is full of footnotes)
  • You need very low RPO (managed configurations that replicate asynchronously can leave an RPO of up to ~1 minute; Patroni with synchronous replication can hit zero)
  • You need very low RTO (RDS Multi-AZ failover takes 60-120 seconds; Patroni failover time is dominated by the leader lease ttl, so a cluster with a tuned ttl can fail over in well under a minute)
  • You're running multi-region active-active (no managed Postgres does this well in 2026)
  • You're moving across clouds and don't want to be locked in

For these cases, self-managed Patroni is the right answer.

Patroni: the self-managed HA pattern

Patroni is the de-facto standard for self-managed Postgres HA. It's an orchestrator that wraps Postgres, uses a distributed configuration store (the DCS: etcd, Consul, ZooKeeper, or the Kubernetes API) to elect a leader, and handles failover automatically.

A minimal cluster:

  • 3 Postgres nodes (1 primary, 2 replicas)
  • 3 etcd nodes (or use the Kubernetes API as the DCS)
  • HAProxy or PgBouncer in front, with health checks pointing at Patroni's REST API
  • A reliable WAL archive (pgBackRest or WAL-G to object storage)
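
To make the health-check bullet concrete, here is a minimal Python sketch of the routing decision a load balancer makes against Patroni's REST API. It assumes the REST API listens on port 8008 as in the configuration in this post, and it relies on Patroni answering GET /primary with 200 only on the node holding the leader lock and GET /replica with 200 only on a running replica. The helper names (probe, classify, pick_writer) are hypothetical and exist only for illustration.

```python
from typing import Dict, Optional, Tuple
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def probe(base_url: str, path: str, timeout: float = 2.0) -> int:
    """Return the HTTP status for GET base_url + path, or 0 if unreachable."""
    try:
        with urlopen(base_url + path, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # Non-2xx answers (for example 503 from /primary on a replica) land here.
        return err.code
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure: treat the node as down.
        return 0


def classify(primary_status: int, replica_status: int) -> str:
    """Map the /primary and /replica status codes to a routing decision."""
    if primary_status == 200:
        return "primary"
    if replica_status == 200:
        return "replica"
    return "down"


def pick_writer(nodes: Dict[str, Tuple[int, int]]) -> Optional[str]:
    """Return the one node that reports itself as primary, else None.

    Refusing to pick when zero or several nodes claim the role is a
    deliberately conservative choice for this sketch.
    """
    writers = [name for name, (p, r) in nodes.items() if classify(p, r) == "primary"]
    return writers[0] if len(writers) == 1 else None
```

In practice you would call probe("http://10.0.1.10:8008", "/primary") and probe("http://10.0.1.10:8008", "/replica") for each node and feed the results to pick_writer. The point of checking Patroni rather than port 5432 is that a Postgres process can be up and accepting connections without being the node that holds the leader lock.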

The Patroni configuration on each node:

scope: pg-prod
namespace: /db/
name: postgres-1
 
restapi:
  listen: 0.0.0.0:8008
  connect_address: 10.0.1.10:8008
 
etcd:
  hosts: 10.0.2.10:2379, 10.0.2.11:2379, 10.0.2.12:2379
 
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    synchronous_mode: true
    postgresql:
      use_pg_rewind: true
      parameters:
        max_connections: 200
        shared_buffers: 8GB
        effective_cache_size: 24GB
        wal_level: replica
        max_wal_senders: 10
        max_replication_slots: 10
 
postgresql:
  listen: 0.0.0.0:5432
  connect_address: 10.0.1.10:5432
  data_dir: /var/lib/postgresql/16/main
  bin_dir: /usr/lib/postgresql/16/bin
  authentication:
    superuser:
      username: postgres
      password: <vault-managed>
    replication:
      username: replicator
      password: <vault-managed>

The two important settings:

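  • synchronous_mode: true: every commit waits for at least one synchronous replica to confirm before returning. This gives you RPO=0 (no committed data is lost in failover) but costs latency. In the default non-strict mode Patroni falls back to asynchronous replication when no synchronous candidate is available; synchronous_mode_strict: true prevents that fallback, at the cost of blocking writes until a replica returns.
  • maximum_lag_on_failover: 1048576: a replica with more than 1 MiB (1048576 bytes) of lag cannot be promoted. This prevents promoting a stale replica that would lose committed data.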

These two together are what give Patroni its strong failover story. The default Postgres streaming replication, without an orchestrator, has neither.
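
The lag rule is easier to reason about with numbers. The sketch below is a simplified model, not Patroni's implementation: it converts Postgres LSNs (the X/Y hexadecimal form printed by pg_current_wal_lsn()) to byte positions and applies the maximum_lag_on_failover threshold from the configuration above. The function names are hypothetical.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a Postgres LSN such as '0/3000060' to an absolute byte position.

    An LSN is two hexadecimal numbers: the high 32 bits and the low 32 bits
    of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replica_lag_bytes(leader_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica is behind the last known leader position."""
    return max(0, lsn_to_bytes(leader_lsn) - lsn_to_bytes(replica_lsn))


def is_promotable(leader_lsn: str, replica_lsn: str,
                  maximum_lag_on_failover: int = 1048576) -> bool:
    """A replica is a candidate only if its lag does not exceed the threshold."""
    return replica_lag_bytes(leader_lsn, replica_lsn) <= maximum_lag_on_failover
```

For example, a replica at 0/3000000 against a last known leader position of 0/3100000 is exactly 1048576 bytes behind and still qualifies; one more byte of lag and it is excluded from the election.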

What happens during failover

The sequence on primary failure with Patroni:

  1. t=0 — primary becomes unreachable
  2. t=0-30s — Patroni's leader lease in etcd expires (ttl)
  3. t=30s — surviving replicas notice no leader; they propose promotion
  4. t=30-32s — Patroni picks the replica with the smallest replication lag
  5. t=32-35s — chosen replica runs pg_ctl promote, becomes primary
  6. t=35s — HAProxy health check notices new primary, starts routing writes there
  7. t=35s- — old primary, when it comes back, is automatically reconfigured as a replica via pg_rewind

Total failover time: typically 30-45 seconds, almost all of it the DCS lease expiry. You can tune ttl down to ~15 seconds for faster failover at the cost of more sensitivity to network blips.
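
If you do lower ttl, loop_wait and retry_timeout need to move with it. To our understanding, Patroni expects loop_wait + 2 * retry_timeout <= ttl, so verify this against the documentation for your version. The sketch below checks that relationship and derives a consistent set of values. The one-third split in derive_timeouts is our assumption, chosen because it matches the 30/10/10 values in the configuration above; both function names are hypothetical.

```python
from typing import Dict


def timeouts_are_consistent(ttl: int, loop_wait: int, retry_timeout: int) -> bool:
    """Check the relationship loop_wait + 2 * retry_timeout <= ttl."""
    return loop_wait + 2 * retry_timeout <= ttl


def derive_timeouts(ttl: int) -> Dict[str, int]:
    """Derive loop_wait and retry_timeout as one third of ttl each.

    With both set to ttl // 3 the sum loop_wait + 2 * retry_timeout is
    3 * (ttl // 3), which can never exceed ttl.
    """
    step = ttl // 3
    return {"ttl": ttl, "loop_wait": step, "retry_timeout": step}
```

Dropping ttl to 15 while leaving loop_wait and retry_timeout at 10 fails the check; derive_timeouts(15) gives 15/5/5, which passes. Shorter intervals also mean more frequent DCS traffic and less tolerance for a slow etcd, which is the sensitivity trade-off mentioned above.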

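Compare with RDS Multi-AZ: 60-120 seconds, and no equivalent of synchronous_mode that you control. Multi-AZ instance deployments replicate synchronously to the standby at the storage layer, but the replication mode and its lag are not something you can tune or inspect from Postgres.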

The failure mode nobody warns you about

Patroni's most common failure mode in production isn't Postgres failure — it's DCS partition. If etcd loses quorum (network partition, AZ outage, simultaneous restart) while Postgres is healthy, Patroni demotes the primary to read-only ("no DCS, can't be sure I'm still leader, safer to step down").

This is the correct behavior. It also produces an outage that's confusing because all your Postgres processes are running fine.

The mitigation:

  • Run etcd in a separate failure domain from Postgres. Three-node etcd, one per AZ, with low-latency networking between them.
  • Tune etcd's election timeout (the --election-timeout flag, in milliseconds) to be longer than your typical cross-AZ network blip (default 1000 ms; we use 2000-3000 ms).
  • Monitor etcd quorum health independently of Postgres health.
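  • Have a documented "DCS-down" procedure: the runbook is to put Patroni in maintenance mode with patronictl pause and let Postgres serve traffic without orchestration. The pause flag is itself stored in the DCS, so this has to happen while etcd is still reachable (for example, ahead of planned etcd maintenance or at the first sign of degradation), not after it is fully gone.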
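
For the quorum-monitoring bullet, the arithmetic is worth having in code so alerts fire on "one more failure loses quorum" rather than on "etcd is already down". This is a minimal sketch with hypothetical function names. parse_health assumes the JSON body that etcd's /health endpoint returns, in which the health field is the string "true" on a healthy member; check that shape against your etcd version.

```python
import json


def quorum_size(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def has_quorum(healthy: int, members: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum_size(members)


def tolerated_failures(members: int) -> int:
    """How many members can fail before quorum is lost."""
    return members - quorum_size(members)


def parse_health(body: str) -> bool:
    """Interpret the JSON body of an etcd /health response.

    Assumes a body like {"health": "true"}; anything else, including
    malformed JSON, is treated as unhealthy.
    """
    try:
        return str(json.loads(body).get("health")).lower() == "true"
    except (ValueError, AttributeError):
        return False
```

A three-node etcd tolerates exactly one failure, and a four-node cluster still tolerates only one, which is why odd member counts are the norm. Alerting when the healthy count equals quorum_size(members) gives you a warning while Patroni is still working.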

We've shipped this configuration on bare-VM Patroni clusters, on Kubernetes (with the Crunchy Postgres Operator or the Zalando Postgres Operator wrapping Patroni), and on hybrid setups. The patterns generalize.

Kubernetes vs VMs for Patroni

Patroni on Kubernetes (via an operator) is more popular than Patroni on VMs in 2026 — but it's not always the right answer.

Kubernetes-Patroni is best when:

  • You're already running other workloads on Kubernetes
  • You want the operator to handle replica restarts, backup scheduling, and TLS rotation
  • You have a competent Kubernetes ops practice
  • You don't have specific performance requirements that need bare-metal tuning

VM-Patroni is best when:

  • You want every Postgres dependency on bare hardware for performance
  • You don't want Postgres co-located with general workloads (noisy-neighbor concerns)
  • Your team has stronger Postgres skills than Kubernetes skills
  • You have hard latency requirements where Kubernetes networking adds too much overhead

We run both. For provisioning new managed Postgres environments, we'll typically start with the customer's preference and adjust.

What we ship by default

For new managed Postgres engagements:

  1. Default to managed Postgres (RDS, Cloud SQL, Azure DB, DO Managed) unless there's a specific reason not to.
  2. Patroni on Kubernetes for customers who need self-managed and already have a Kubernetes operations practice.
  3. Patroni on VMs for customers who need self-managed and prefer the simpler operational model.
  4. Synchronous replication enabled by default on Patroni clusters; the customer's RPO/RTO requirements drive the specific tuning.
  5. WAL archiving to object storage with quarterly PITR drills.
  6. HAProxy or PgBouncer in front, with proper health checks against Patroni's REST API (not Postgres directly).

The right pattern depends on the workload, the team, and the budget. The wrong pattern is the third one I mentioned at the top — rolling your own with pg_basebackup and cron. That works until it doesn't, and the day it doesn't is the worst day of someone's career.

Sudhanshu K. is Principal Engineer at EdgeServers (RemotIQ Pty Ltd, ABN 91 682 628 128). He has bootstrapped many Patroni clusters, failed over many primaries (mostly intentionally), and has the strong opinions to show for it.