FastAPI on Kubernetes — the production deployment we ship by default

FastAPI became ubiquitous in production around 2023. The framework itself is excellent. The deployment story around it is full of patterns that almost work: a single uvicorn process running as PID 1, missing graceful shutdown, no OpenAPI verification in CI, Pydantic v1 still hanging around. Each of these has caused a customer incident at some point.

This is the FastAPI deployment template we apply to every new service we manage on Kubernetes.

Dockerfile + healthcheck endpoints

FROM python:3.12-slim
WORKDIR /app
COPY requirements.lock.txt ./
RUN pip install --no-cache-dir --require-hashes -r requirements.lock.txt
COPY . .
USER 1000
EXPOSE 8000
CMD ["gunicorn", "app.main:app", "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--workers", "4", "--bind", "0.0.0.0:8000", "--timeout", "30"]

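from fastapi import FastAPI, Response
app = FastAPI()

@app.get("/healthz", include_in_schema=False)
async def liveness(): return {"ok": True}

@app.get("/ready", include_in_schema=False)
async def readiness(response: Response):
    ok = await db.is_connected()  # db: the service's own database client
    response.status_code = 200 if ok else 503  # probes read the status code, not the body
    return {"ok": ok}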

Liveness is "the process is up." Readiness is "the process can serve traffic." Don't conflate them — a slow database should drop a pod from the rotation, not restart it.
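
One way to keep a slow database from hanging the probe itself is to bound the readiness check with a timeout. The sketch below is a minimal illustration using only the standard library. The names readiness_status and check, and the 1-second default, are illustrative assumptions, not part of the template above.

```python
import asyncio
from typing import Awaitable, Callable


async def readiness_status(
    check: Callable[[], Awaitable[bool]],
    timeout_s: float = 1.0,  # assumed budget; keep it below the probe's timeoutSeconds
) -> int:
    """Return the HTTP status a readiness endpoint should send.

    200 when the dependency check passes within the budget; 503 when it
    returns False, raises, or takes too long.
    """
    try:
        ok = await asyncio.wait_for(check(), timeout=timeout_s)
    except Exception:
        # Timeouts and connection errors both mean "not ready", not "restart me".
        return 503
    return 200 if ok else 503
```

A /ready handler could set its response status from await readiness_status(db.is_connected), so a stalled connection pool takes the pod out of rotation without touching liveness.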

The full write-up covers:

Pydantic v2 (model_config, computed_field, model_validator) — the migration story
ASGI server choice: Gunicorn + UvicornWorker (default), Hypercorn (HTTP/2), Granian (newer)
OpenAPI schema verification in CI — fail the build if the schema has drift
Structured logging with the request ID propagated through dependencies
Background task patterns — fastapi.BackgroundTasks vs Celery vs Arq
Kubernetes manifests: resource requests, readinessProbe, terminationGracePeriodSeconds
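
As a rough illustration of the schema-drift check in the list above, the sketch below compares two OpenAPI documents and reports where they differ. It is a minimal stdlib-only sketch, not the check from the full write-up. The function name drift_paths and the " > " path notation are illustrative assumptions.

```python
from typing import Any


def drift_paths(committed: Any, generated: Any, path: str = "") -> list[str]:
    """List the locations where two JSON-style OpenAPI documents differ.

    Assumes both documents were loaded from JSON, so every dict key is a string.
    """
    if isinstance(committed, dict) and isinstance(generated, dict):
        out: list[str] = []
        for key in sorted(set(committed) | set(generated)):
            child = f"{path} > {key}" if path else key
            if key not in committed or key not in generated:
                out.append(child)  # key was added or removed
            else:
                out.extend(drift_paths(committed[key], generated[key], child))
        return out
    # Lists and scalars are compared as a whole.
    return [] if committed == generated else [path or "<root>"]
```

In CI, committed would be a schema file checked into the repository and generated the result of app.openapi(). A non-empty list fails the build and tells the reviewer which part of the contract changed.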

We ship this template on every managed FastAPI service.