A pragmatic Argo CD setup — GitOps that survives contact with reality

Argo CD is one of those projects that's wildly popular and wildly mis-deployed. Every team I work with has some version of Argo CD running. Maybe a third of them have it set up in a way that's actually saving them effort instead of being a worse version of kubectl apply with extra YAML.

The difference is usually in three places: repo structure, sync wave orchestration, and how you handle secrets. This post walks through what we ship for managed Kubernetes customers where Argo CD is the deployment surface.

Repo structure: the App-of-Apps pattern, done right

The single biggest mistake is one giant argocd-apps/ directory with 80 application YAMLs in it. You lose track of which app belongs to which team, environment promotion becomes copy-paste-modify, and the whole thing rots in 6 months.

The structure that scales:

gitops-repo/
├── bootstrap/                  # the root app-of-apps
│   └── root-app.yaml
├── platform/                   # cluster-level shared infrastructure
│   ├── ingress-nginx/
│   ├── cert-manager/
│   ├── external-secrets/
│   ├── argo-cd/                # Argo CD manages itself
│   └── monitoring/
├── tenants/
│   ├── team-payments/
│   │   ├── dev/
│   │   ├── staging/
│   │   └── prod/
│   └── team-search/
│       ├── dev/
│       ├── staging/
│       └── prod/
└── projects/                   # AppProject resources
    ├── platform.yaml
    ├── team-payments.yaml
    └── team-search.yaml

Three levels:

bootstrap/root-app.yaml is a single Argo CD Application that points at platform/ and tenants/. Apply this one file and the cluster bootstraps itself.
platform/ holds cluster-wide infrastructure managed by the platform team — ingress, cert-manager, monitoring, Argo CD itself. Each subdirectory is a separate Application.
tenants/<team>/<env>/ holds workloads owned by individual application teams. Each environment is its own Application, so promotion from dev → staging → prod is a PR that copies the manifests one directory over.
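
As a rough sketch, the root app described above could look like the following. It assumes Argo CD 2.6 or later for multi-source Applications, that Argo CD runs in the argocd namespace, and that each subdirectory keeps its child Application manifest in a file named application.yaml. That file name, the app name, and the repo URL are illustrative assumptions, and the include globs are worth checking against your Argo CD version.

```yaml
# bootstrap/root-app.yaml (sketch; names and globs are assumptions)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd                 # assumes Argo CD is installed here
spec:
  project: default                  # assumption: the root app uses the built-in project
  sources:                          # one source per tree the root app reads
    - repoURL: https://github.com/example/gitops
      targetRevision: main
      path: platform
      directory:
        recurse: true
        include: '*/application.yaml'      # hypothetical naming convention
    - repoURL: https://github.com/example/gitops
      targetRevision: main
      path: tenants
      directory:
        recurse: true
        include: '*/*/application.yaml'    # tenants/<team>/<env>/application.yaml
  destination:
    server: https://kubernetes.default.svc # child Applications live in the local cluster
    namespace: argocd
```

Bootstrapping is then a single kubectl apply -n argocd -f bootstrap/root-app.yaml. The AppProjects under projects/ have to exist before the child Applications that reference them will sync.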

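The projects/ directory holds AppProject resources, which are the RBAC boundary. The team-payments project may deploy only into team-payments-* namespaces, and only from the repositories listed in its sourceRepos. Note that sourceRepos matches repository URLs, not paths, so keeping a team to its own directories inside a shared repo has to come from review rules on the repo itself. Without AppProjects, every Application ends up in the default project, which by default lets any team deploy anywhere. That is not what you want.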
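
A minimal sketch of what projects/team-payments.yaml might contain, assuming the repo URL used elsewhere in this post and Argo CD in the argocd namespace. The description text and the decision to allow Namespace creation are assumptions to adjust for your setup.

```yaml
# projects/team-payments.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-payments
  namespace: argocd
spec:
  description: Workloads owned by the payments team
  sourceRepos:
    - https://github.com/example/gitops   # repo URLs only; paths cannot be restricted here
  destinations:
    - server: '*'                         # any registered cluster
      namespace: 'team-payments-*'        # but only the team's own namespaces
  clusterResourceWhitelist:
    - group: ''
      kind: Namespace                     # assumption: teams may create their own Namespace
```

Any other cluster-scoped kind stays denied for this project, which keeps ClusterRoles and CRDs in the platform team's hands.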

Sync waves: making things happen in the right order

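By default every resource lands in wave 0. Within a wave, Argo CD orders resources only by kind and name, and it does not wait for one resource to become healthy before applying the next. That works until you have a custom resource whose CRD has to be registered first, or an ExternalSecret that needs the SecretStore to be ready first.

The fix is sync waves: argocd.argoproj.io/sync-wave annotations on resources. Argo CD applies waves in numerical order, lowest first, and moves on to the next wave only once the current one is synced and healthy: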

# Wave -1: Namespaces (CRDs go one wave earlier, in -2)
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
---
# Wave 0: Configuration & secrets (default)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secrets
  annotations:
    argocd.argoproj.io/sync-wave: "0"
---
# Wave 1: Workloads
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  annotations:
    argocd.argoproj.io/sync-wave: "1"
---
# Wave 3: Ingress (wave 2 is reserved for HPAs and PDBs)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    argocd.argoproj.io/sync-wave: "3"

Conventions we use across all the Kubernetes clusters we manage on AKS, EKS, and GKE:

Wave	What goes here
-2	CRDs
-1	Namespaces, AppProjects, ClusterRoles
0	ConfigMaps, Secrets (ExternalSecret), Service Accounts
1	StatefulSets, Deployments, Services
2	HPAs, PDBs
3	Ingresses, Gateways

The point is: every team in your org uses the same waves for the same kinds of resources, so the order is predictable across applications.

Secrets: External Secrets Operator + a vault, not sealed-secrets

There's a long argument in the GitOps community about sealed-secrets vs External Secrets Operator (ESO). We come down firmly on ESO + a real secret store (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, or HashiCorp Vault), for one decisive reason: rotation.

With sealed-secrets, rotating a secret means re-encrypting it, committing the new ciphertext to git, and waiting for Argo CD to sync. The plaintext lives nowhere; you can't see what the current value is without decrypting. Rotation requires a deploy.

With ESO + a vault, rotation is a vault operation. Update the secret in the vault, ESO syncs the new value into the in-cluster Secret on its next refresh (within refreshInterval, one hour in the example below), and the workload picks it up (with reloader annotations, or a rolling restart). No commit, no deploy, no ciphertext sitting in git.

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-database-creds
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: app-database-creds
  data:
    - secretKey: password
      remoteRef:
        key: prod/app/db
        property: password

This is the canonical pattern. The vault is the source of truth for secret values; git is the source of truth for which secrets exist and where they're consumed. The two things are kept separate, which is what good security boundaries look like.
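
For completeness, here is a sketch of the two pieces the ExternalSecret above leans on: the ClusterSecretStore it references, and a workload that restarts when the synced Secret changes. The region, the service account name and namespace, and the use of Stakater Reloader are assumptions, not something the setup above requires.

```yaml
# ClusterSecretStore referenced by secretStoreRef above (sketch)
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: ap-southeast-2              # assumption: use your own region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets        # hypothetical service account bound to an IAM role
            namespace: external-secrets
---
# Deployment metadata fragment (spec omitted): restart on Secret change
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  annotations:
    reloader.stakater.com/auto: "true"    # assumes Stakater Reloader is installed
```

With both in place, rotating prod/app/db in the vault reaches the pods without anyone touching git.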

Auto-sync vs manual sync

Auto-sync for dev and staging. Manual sync for prod.

This is non-negotiable for us. Syncing is a single click in the Argo CD UI, and requiring a human to make that click gives you:

One last chance to read the diff Argo CD is about to apply
An audit log entry naming the actual person who did the deploy
A natural pause point if you spot something weird in metrics post-deploy

For dev/staging, auto-sync with prune: true, selfHeal: true is fine — these are ephemeral environments and the value of fast iteration outweighs the risk of an unintended apply.

For prod, you can still automate the PR that promotes from staging to prod (Renovate or a custom GitHub Action that copies manifests over). But the merge of that PR, and the explicit sync in Argo CD, both stay manual. This is the discipline that turns Argo CD from "deploy tool" into "deploy gate."
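
In Application terms, the split comes down to whether syncPolicy.automated is present. The sketch below assumes the team-payments layout and cluster names used elsewhere in this post, and namespaces that follow the team-payments-* pattern the AppProject allows.

```yaml
# dev: auto-sync, prune, and self-heal
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-dev
  namespace: argocd
spec:
  project: team-payments
  source:
    repoURL: https://github.com/example/gitops
    targetRevision: main
    path: tenants/team-payments/dev
  destination:
    name: dev-cluster
    namespace: team-payments-dev
  syncPolicy:
    automated:
      prune: true       # delete resources that were removed from git
      selfHeal: true    # revert manual drift in the cluster
---
# prod: no automated block, so every sync is an explicit human action
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-prod
  namespace: argocd
spec:
  project: team-payments
  source:
    repoURL: https://github.com/example/gitops
    targetRevision: main
    path: tenants/team-payments/prod
  destination:
    name: prod-cluster
    namespace: team-payments-prod
```

The prod Application still shows OutOfSync as soon as the promotion PR merges. It simply waits for someone to press the button.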

ApplicationSet for fleet-of-environments patterns

When you have many similar applications or many similar environments, ApplicationSet lets you generate Applications from a template:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: team-payments-environments
spec:
  generators:
    - list:
        elements:
          - env: dev
            cluster: dev-cluster
          - env: staging
            cluster: staging-cluster
          - env: prod
            cluster: prod-cluster
  template:
    metadata:
      name: 'payments-{{env}}'
    spec:
      project: team-payments
      source:
        repoURL: https://github.com/example/gitops
        path: 'tenants/team-payments/{{env}}'
        targetRevision: main
      destination:
        name: '{{cluster}}'
        namespace: 'team-payments-{{env}}'   # must match the team-payments-* namespaces the AppProject allows

One ApplicationSet creates three Applications, one per environment. When you onboard a new environment, you add a row to the list. We use this heavily for multi-cluster setups — same workload, different clouds — which is common when we provision clusters across multiple providers for resilience.

Things we wish we'd known sooner

A grab-bag of lessons:

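Always prune on production syncs. Pruning is a sync setting on the Application, not on the AppProject: prune: true under automated sync, or the Prune option (argocd app sync --prune) when you sync manually, which is the case for prod here. Otherwise, a removed resource stays orphaned in the cluster forever and nobody notices until it bites.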
Don't let Argo CD manage its own Application without a safety check. It's possible to commit a broken self-Application that breaks Argo CD's ability to fix itself. Use a separate "infra root" sync path that doesn't include Argo CD's own resources.
Notifications matter more than you think. Wire Argo CD into Slack for failed syncs, degraded health, and out-of-sync states. The default "go look at the UI" experience is far too passive.
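Back up Argo CD's own state, not just the git repo. The repo-server is only a manifest cache. What you need after a cluster-wide loss is the Applications, AppProjects, and repository and cluster credentials stored in the argocd namespace (argocd admin export dumps them), plus etcd snapshots where you run the control plane yourself. The git repo is the source of truth, but recovering from a cluster-wide loss involves more than argocd app sync if you're rebuilding from scratch.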
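
On the notifications point, subscriptions are annotations on the Application. The fragment below is a sketch. It assumes Argo CD notifications is set up with a Slack service and with the on-sync-failed and on-health-degraded triggers from the notifications catalog, and the channel name is hypothetical. As far as we know the catalog has no out-of-sync trigger, so that one would need a custom trigger of your own.

```yaml
# Application metadata fragment (spec omitted): Slack subscriptions
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-prod
  namespace: argocd
  annotations:
    # format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-sync-failed.slack: payments-deploys
    notifications.argoproj.io/subscribe.on-health-degraded.slack: payments-deploys
```

Putting the same annotations in an ApplicationSet template gives every generated Application the same alerts.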

The point of all of this is to make git push boring. When deploys are boring, you do more of them, and small frequent deploys are the single highest-leverage thing you can do for change-failure rates. That's the whole bet of GitOps, and it pays off cleanly once the foundations are right.

We're happy to set this up properly — it's one of the first things we do for new managed customers.

Sudhanshu K. is a Staff DevOps engineer at EdgeServers (RemotIQ Pty Ltd, ABN 91 682 628 128). He has bootstrapped Argo CD on more clusters than is healthy and learned most of the above the hard way.