PM2 vs cluster vs containers — how we run Node.js in 2026

There are three ways teams put Node.js into production, and most of them are running the wrong one for their workload. PM2 is still the most popular search result. The cluster module is still in nearly every "scale Node.js" tutorial. Containers — single-process per container, orchestrator-managed — are the default we ship on greenfield work.

The honest answer is that all three are defensible in 2026. They optimise for different things. This is the rubric we use when onboarding a new customer onto managed Node.js hosting.

What each one actually does

Before the comparison, the mechanics, because they get muddled in conversation.

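Node's cluster module forks N worker processes that all serve the same listening port. On every platform except Windows the default scheduling policy is round-robin (the default since Node 0.12): the primary accepts connections and hands them to workers in turn. On Windows the default leaves distribution to the OS. Each worker is a full Node process with its own heap, its own event loop, its own everything. The primary process acts as the supervisor, but only because you write it that way: the module emits an exit event when a worker dies, and respawning is your code's job.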

// cluster-server.js
const cluster = require('node:cluster');
const os = require('node:os');
 
if (cluster.isPrimary) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, respawning`);
    cluster.fork();
  });
} else {
  require('./app.js'); // listens on PORT
}

PM2 is, at its core, a fancy wrapper around the cluster module plus a daemon plus a CLI. It adds zero-downtime reloads, log management, automatic restart on crash, memory-threshold restarts, ecosystem files, and a startup script generator. It runs as a long-lived pm2 god daemon that survives Node app crashes.

Containers put each Node process in its own container, with an orchestrator (Kubernetes, ECS, Nomad, Docker Swarm, even plain systemd units) handling restarts, scaling, and health checks. The Node process is single-threaded and single-instance; horizontal scaling is the orchestrator's job, not Node's.

Where each one wins

PM2 wins for: one VM, mixed workload, ops team that lives in SSH

If you have a single VM (or three) running half a dozen Node services, no Kubernetes, no Docker, and an ops team that's comfortable with pm2 list and pm2 logs, PM2 is genuinely good. Zero-downtime reload (pm2 reload) works. The log aggregation works. The startup persistence (pm2 startup + pm2 save) works.

The configuration ergonomics are also better than people give credit for in 2026:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api',
    script: './dist/index.js',
    instances: 'max',
    exec_mode: 'cluster',
    max_memory_restart: '1G',
    env_production: { NODE_ENV: 'production' },
    error_file: '/var/log/api/error.log',
    out_file: '/var/log/api/out.log',
  }],
};

One file, version it, deploy with pm2 deploy. For small fleets — say, under ten VMs total — this is genuinely productive.

The caveat: PM2's monitoring side (Keymetrics, PM2 Plus) is fine but you'll usually want your real observability stack (Prometheus, Datadog, Grafana) regardless. So PM2 is doing process management only.

Cluster wins for: nothing, really, in 2026

Raw cluster without PM2 or a container orchestrator was the right answer in 2014. It is rarely the right answer now. The reasons:

No persistence across reboots without writing your own systemd unit
No memory-threshold restart without writing it yourself
No log rotation
No CLI for "show me what's running"
Sticky-session support requires cluster.SCHED_NONE plus a third-party module, and it's fragile
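
To make the "write it yourself" cost concrete, here is a minimal sketch of a memory-threshold restart for raw cluster. It is an illustration under stated assumptions, not how PM2 implements max_memory_restart: the 1 GiB limit, the message shape, and the function names are all hypothetical. Workers report their RSS over the cluster IPC channel, and the primary disconnects any worker above the limit, relying on an exit handler like the one in cluster-server.js to fork the replacement.

```javascript
// memory-watchdog.js: hand-rolled stand-in for a memory-threshold restart.
const cluster = require('node:cluster');

// Hypothetical threshold for this sketch (1 GiB).
const LIMIT_BYTES = 1024 * 1024 * 1024;

// Pure policy helper, kept separate so it can be tested without forking.
function overLimit(rssBytes, limitBytes = LIMIT_BYTES) {
  return Number.isFinite(rssBytes) && rssBytes > limitBytes;
}

// Worker side: report resident set size to the primary on an interval.
// unref() keeps the timer from holding a draining worker open.
function reportMemory(intervalMs = 5000) {
  setInterval(() => {
    process.send({ type: 'rss', rss: process.memoryUsage().rss });
  }, intervalMs).unref();
}

// Primary side: disconnect any worker that reports RSS above the limit.
// Forking the replacement is left to the primary's 'exit' handler.
function watchWorkers(limitBytes = LIMIT_BYTES) {
  cluster.on('message', (worker, msg) => {
    if (msg && msg.type === 'rss' && overLimit(msg.rss, limitBytes)) {
      console.log(`worker ${worker.process.pid} over memory limit, recycling`);
      worker.disconnect();
    }
  });
}
```

Call watchWorkers() in the primary branch and reportMemory() in the worker branch. Even this small version leaves open questions (what if a worker never finishes draining, what if every worker crosses the limit at once), which is the work PM2 or an orchestrator already does for you.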

The one case we still use raw cluster: inside a container, where the orchestrator handles all the supervision concerns and we just want to use multiple CPU cores within a single container's CPU quota. Even there, we usually prefer one container per core.
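
If you do run cluster inside a container, the worker count needs care: os.cpus().length reports the host's logical CPUs, not the container's CPU quota, so forking one worker per reported CPU can oversubscribe a small quota. The sketch below takes the count from an operator-set environment variable instead. WEB_CONCURRENCY is an assumed name for illustration, not something Node reads itself, and the fallback to a single worker is our choice for this example.

```javascript
const os = require('node:os');

// Decide how many workers to fork.
// WEB_CONCURRENCY is a hypothetical, operator-set variable: set it to the
// container's CPU quota (for example 2 for a 2000m limit).
function workerCount(env = process.env, detected = os.availableParallelism()) {
  const requested = Number.parseInt(env.WEB_CONCURRENCY ?? '', 10);
  if (Number.isInteger(requested) && requested > 0) {
    // Never fork more workers than the runtime reports it can run in parallel.
    return Math.min(requested, detected);
  }
  // No explicit value: fall back to the single-process model.
  return 1;
}
```

In the primary branch, replace os.cpus().length with workerCount(). os.availableParallelism() needs Node 18.14 or later; pass the second argument explicitly on older runtimes.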

Containers win for: everything bigger than a single VM

For anything running on Kubernetes, ECS, Cloud Run, App Runner, Nomad, or any other modern orchestrator, the answer is one Node process per container. Single-threaded. The orchestrator scales horizontally.

Why this is the right default:

Resource isolation is real. A leak in service A doesn't OOM service B if they're in different containers. Inside PM2 or cluster, a leak in one worker can starve siblings.
Rollouts are atomic. Kubernetes does the cluster's "graceful restart" for you, properly, with readiness probes and surge limits.
Autoscaling is the orchestrator's job. PM2's instances setting is static; an HPA scales pods up and down based on real metrics.
The Node binary stays simple. No supervisor, no cluster.fork, no IPC, just a single process that listens on a port and exits when it's told.
Observability is uniform. stdout/stderr per container, scraped by the platform. No PM2-specific logging path.

The Dockerfile for a Node service in this model is about 20 lines, follows our Dockerfile guidance, and runs as a non-root user:

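# syntax=docker/dockerfile:1.7

# Stage 1: install production dependencies only, reusing a cached npm directory.
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

# Stage 2: runtime image.
FROM node:22-alpine
WORKDIR /app
# Create an unprivileged user and group to run the service.
RUN addgroup -S app && adduser -S app -G app
COPY --from=deps /app/node_modules ./node_modules
# Assumes dist/ is built before `docker build` and that .dockerignore
# excludes node_modules, so this COPY does not overwrite the layer above.
COPY . .
USER app:app
ENV NODE_ENV=production
EXPOSE 3000
# Exec form: node receives SIGTERM directly instead of via a shell.
CMD ["node", "dist/index.js"]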

No PM2. No cluster. The orchestrator handles restart, scale, and health.

The PM2-in-Docker antipattern

Periodically we onboard a customer and find PM2 running inside Docker containers, inside Kubernetes, with pm2-runtime as the container entrypoint. The argument is usually "we want zero-downtime reloads" or "we want to use all the cores in the container."

Both reasons fall over:

Zero-downtime reloads are Kubernetes' job. The Deployment controller with a rolling update strategy and a working readiness probe gives you cleaner rollouts than pm2 reload ever did. PM2 inside a container is reloading the wrong thing — it's reloading a process inside an ephemeral pod that's about to be replaced anyway.
Using all the cores in the container only matters if you've given the container multiple cores. Don't. Give each container one core (or 500m if you're tight) and scale horizontally. The platform's bin-packing is better than yours.
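
The "working readiness probe" half of that argument lives in the Node process. A minimal sketch, assuming a probe path of /readyz (our choice for illustration; use whatever path your probe is configured with) and hypothetical helper names: the endpoint answers 200 while the instance should receive traffic and 503 once shutdown has begun.

```javascript
const http = require('node:http');

// Tracks whether this instance should receive traffic.
function createReadiness() {
  let ready = true;
  return {
    markNotReady() { ready = false; },
    // Handles the probe path; returns true if the request was a probe.
    handle(req, res) {
      if (req.url !== '/readyz') return false;
      res.statusCode = ready ? 200 : 503;
      res.end(ready ? 'ok' : 'draining');
      return true;
    },
  };
}

// Wiring sketch: answer the probe first, otherwise defer to the app handler.
function createServer(appHandler, readiness = createReadiness()) {
  const server = http.createServer((req, res) => {
    if (!readiness.handle(req, res)) appHandler(req, res);
  });
  // Start failing the probe as soon as termination is requested.
  process.once('SIGTERM', () => readiness.markNotReady());
  return server;
}
```

With a framework such as Express, createServer(app).listen(port) is enough, since an Express app is itself a request handler. The probe only reports state; actually draining connections is a separate concern handled on SIGTERM.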

The PM2-in-Docker setup also doubles your supervisor chain (pm2-runtime supervises Node, Kubelet supervises pm2-runtime), which makes signal handling, log routing, and exit codes all subtly weird. Crashes that should propagate get swallowed. Restart loops that should be obvious get hidden behind PM2's restart counter.

The exception, narrowly: if you're using PM2 specifically for its zero-downtime reload semantics on a long-running websocket workload that can't tolerate connection churn from rolling pod replacement, PM2-in-Docker may earn its keep. We have one customer running Node.js on DigitalOcean where this is the case. Out of dozens of Node deployments, that's the only one.

Things we do regardless of model

A few practices that apply equally whether you're running PM2, raw cluster, or containers:

Set --max-old-space-size explicitly. Don't let V8 guess. Pick a number 15-20% below the container/VM limit so V8 throws JavaScript heap out of memory before the OS OOM killer fires. The former is debuggable; the latter is not.
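
The arithmetic is simple enough to script. The sketch below is a hypothetical helper (the function names and the 20% default are ours) that turns a memory limit in bytes into a value for --max-old-space-size, plus a second helper that reads back the limit V8 is actually running with, which is useful in a startup log line.

```javascript
const v8 = require('node:v8');

const MIB = 1024 * 1024;

// Old-space budget in MiB, leaving `headroom` (a fraction of the limit)
// for buffers, thread stacks, and other native memory.
function maxOldSpaceMb(limitBytes, headroom = 0.2) {
  if (!(limitBytes > 0) || headroom < 0 || headroom >= 1) {
    throw new RangeError('limitBytes must be positive and headroom in [0, 1)');
  }
  return Math.floor((limitBytes / MIB) * (1 - headroom));
}

// The heap limit V8 is currently using, rounded to MiB.
function currentHeapLimitMb() {
  return Math.round(v8.getHeapStatistics().heap_size_limit / MIB);
}
```

For a 1 GiB container, maxOldSpaceMb(1024 * MIB) returns 819, so the start command becomes node --max-old-space-size=819 dist/index.js. The value reported by currentHeapLimitMb() will be close to, but not necessarily identical to, the flag, because V8's total heap limit covers more than old space.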

Run Node 22 or Node 24, both LTS release lines (24 entered LTS in October 2025). Anything older is leaving performance on the table: V8 got substantially faster between 18 and 22, particularly for promise-heavy code.

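Use NODE_OPTIONS='--enable-source-maps' so production stack traces are useful. The cost is paid mostly when a stack trace is actually read, so it is negligible unless the app touches Error.stack on a hot path; the time saved on incident triage is large.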

Drain on SIGTERM. Whatever the supervisor, the contract for a graceful shutdown is: stop accepting new connections, finish in-flight requests within a deadline, then exit. Node services that exit immediately on SIGTERM drop traffic during every rollout.

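// Assumes `app` (for example an Express app) and `port` are defined above.
const server = app.listen(port);
let shuttingDown = false;
process.on('SIGTERM', () => {
  if (shuttingDown) return; // ignore repeated signals
  shuttingDown = true;
  // Stop accepting new connections; the callback runs once existing ones have closed.
  server.close(() => process.exit(0));
  // Hard deadline. Keep it below the supervisor's kill timeout
  // (Kubernetes' default grace period is 30 s). unref() stops the
  // timer from keeping the process alive on its own.
  setTimeout(() => process.exit(1), 25_000).unref();
});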

That short snippet eliminates an entire category of "why are we getting 502s during deploys" tickets.

How we make the call

When we provision a new Node app for a customer, the rubric is roughly:

Fewer than three VMs, no container experience on the ops team — PM2.
Anything on Kubernetes, ECS, Cloud Run, Fly, or Nomad — single-process containers.
Legacy app being lifted-and-shifted, no time to refactor — PM2 inside Docker as a stepping stone, with a written plan to move to single-process within two quarters.
Brand new build — single-process containers, no exceptions.

It's mostly the third one in practice. Most Node fleets we inherit are PM2 setups that grew into Kubernetes without anyone reviewing the supervisor chain. Untangling that is usually a quick win, often cutting incident frequency in half just by removing duplicate restart logic.
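
For teams that want the rubric in a provisioning script, it can be sketched as a small decision function. The field names and return strings are hypothetical, and the precedence between overlapping cases (a legacy app that already sits on Kubernetes, say) is our reading of the list rather than a hard rule.

```javascript
// Encodes the four rubric lines as an ordered series of checks.
// All option names are illustrative, not part of any real tool.
function chooseRuntime({
  greenfield = false,
  legacyLiftAndShift = false,
  orchestrated = false, // Kubernetes, ECS, Cloud Run, Fly, Nomad
  vmCount = 1,
  opsKnowsContainers = false,
} = {}) {
  // Brand new build: no exceptions.
  if (greenfield) return 'single-process containers';
  // Lift-and-shift with no time to refactor: a temporary stepping stone.
  if (legacyLiftAndShift) return 'pm2-in-docker (temporary)';
  // Already on an orchestrator: let it do the supervision.
  if (orchestrated) return 'single-process containers';
  // Small VM fleet and no container experience on the ops team.
  if (vmCount < 3 && !opsKnowsContainers) return 'pm2';
  // Everything else is bigger than PM2's sweet spot.
  return 'single-process containers';
}
```

For example, chooseRuntime({ vmCount: 2 }) returns 'pm2', while chooseRuntime({ legacyLiftAndShift: true, orchestrated: true }) returns the temporary PM2-in-Docker option.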

Sudhanshu K. is a Principal Engineer at EdgeServers (RemotIQ Pty Ltd, ABN 91 682 628 128). He has, at various points, written process supervisors for Node — and now uses Kubernetes, which is fine.