Docker Multi-Stage Builds for Smaller Images

Docker image size matters. Smaller images mean faster deployments, lower storage costs, reduced bandwidth, and a smaller attack surface.

1. The Problem with Single-Stage Builds

# Single-stage — 1.5GB image!
FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["node", "dist/server.js"]

Contains: build tools, devDependencies, npm cache, source files, full Debian base.

2. Multi-Stage: The Builder Pattern

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /build
COPY package*.json ./
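# Install production deps only and set them aside, then install everything for the build.
# --omit=dev replaces the deprecated --only=production flag.
# --ignore-scripts skips install scripts; drop it if a production dependency builds a native addon.
RUN npm ci --omit=dev --ignore-scripts && \
    cp -R node_modules /prod_modules && \
    npm ci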
COPY . .
RUN npm run build

# Stage 2: Runtime — ~150MB (10x reduction)
FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /build/dist ./dist
COPY --from=builder /prod_modules ./node_modules
COPY package*.json ./
EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]
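
An alternative to installing twice is to install everything once, build, and then strip devDependencies in place with npm prune. The following is a minimal sketch of that variant, assuming the same build script and dist/server.js entry point as above; unlike the version above, it runs dependency install scripts.

```dockerfile
# Stage 1: install all dependencies, build, then prune dev-only packages
FROM node:20-alpine AS builder
WORKDIR /build
COPY package*.json ./
# Dev dependencies are needed here because the build tooling lives in them
RUN npm ci
COPY . .
# After the build, remove devDependencies from node_modules
RUN npm run build && npm prune --omit=dev

# Stage 2: runtime image with only the build output and production modules
FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=builder /build/dist ./dist
COPY --from=builder /build/node_modules ./node_modules
COPY package*.json ./
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]
```

Which variant is faster depends on the project; the prune approach avoids a second install but cannot reuse a cached production-only layer.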

3. Choosing a Base Image

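Sizes are approximate uncompressed figures for amd64 and vary by tag; check docker images for the tags you actually use.

Image                      Approx. size   libc                        Use case
node:20 (Debian)           ~1.1GB         glibc                       Development, compatibility
node:20-alpine             ~135MB         musl                        Most production apps
node:20-slim               ~200MB         glibc (fewer OS packages)   When you need glibc
gcr.io/distroless/nodejs   ~130-170MB     glibc                       Maximum security
scratch                    0 (empty)      none                        Go/Rust static binaries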

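Alpine notes: Uses musl libc (not glibc). Most Node.js native modules work, but packages that ship prebuilt glibc binaries, such as sharp and grpc, may need a musl build or compilation from source. musl's DNS resolver also behaves differently from glibc's; for example, releases before musl 1.2.4 (Alpine 3.18) had no TCP fallback for large DNS responses.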
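
When a native module has no musl prebuild, it has to be compiled during npm ci. The sketch below installs a compiler toolchain in the builder stage only, so it never reaches the runtime image. The package list (python3, make, g++) is what node-gyp typically needs and is an assumption here; some modules require additional -dev packages.

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /build
# Toolchain for node-gyp; lives only in this stage and is discarded with it
RUN apk add --no-cache python3 make g++
COPY package*.json ./
# Native addons without a musl prebuild are compiled here
RUN npm ci
COPY . .
RUN npm run build
```

If a module still fails to build or load on musl, switching both stages to node:20-slim is usually the simpler fix.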

Distroless notes: No shell and no package manager, so docker exec -it <container> bash does not work. Minimal attack surface.
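
A distroless runtime stage might look like the sketch below. It assumes the versioned tag gcr.io/distroless/nodejs20-debian12 and a Debian-based builder, so that any native modules are linked against glibc rather than musl. The dist/server.js path and the /prod_modules trick are carried over from section 2.

```dockerfile
# Stage 1: build on a glibc (Debian) image to match the distroless runtime
FROM node:20-slim AS builder
WORKDIR /build
COPY package*.json ./
RUN npm ci --omit=dev && \
    cp -R node_modules /prod_modules && \
    npm ci
COPY . .
RUN npm run build

# Stage 2: distroless runtime (no shell, no package manager)
FROM gcr.io/distroless/nodejs20-debian12 AS runner
WORKDIR /app
COPY --from=builder /build/dist ./dist
COPY --from=builder /prod_modules ./node_modules
# Distroless images ship an unprivileged "nonroot" user
USER nonroot
EXPOSE 3000
# The image's entrypoint is already the node binary, so CMD is only the script path
CMD ["dist/server.js"]
```

Because there is no shell, only the exec (JSON array) form of CMD works in this stage.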

4. Layer Caching

# GOOD: Dependencies first, source later
FROM node:20-alpine AS builder
WORKDIR /build

# Layer 1: rarely changes
COPY package*.json ./
RUN npm ci

# Layer 2: changes frequently
COPY . .
RUN npm run build
# Without this ordering: every commit re-runs npm ci (e.g. 2m30s per build)
# With this ordering: the npm ci layer is served from cache (e.g. 30s per build)
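
For contrast, here is a sketch of the ordering to avoid. Docker invalidates a layer and every layer after it when that layer's inputs change, so copying the source first forces a full dependency install on each commit.

```dockerfile
# BAD: source copied before dependencies are installed
FROM node:20-alpine AS builder
WORKDIR /build

# Any source edit changes the inputs of this layer...
COPY . .
# ...so this layer and everything after it is rebuilt on every commit
RUN npm ci
RUN npm run build
```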

Also use .dockerignore:

node_modules
.git
.env
dist
*.md

5. BuildKit Cache Mounts

# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /build

RUN --mount=type=cache,target=/root/.npm \
    --mount=type=bind,source=package.json,target=package.json \
    --mount=type=bind,source=package-lock.json,target=package-lock.json \
    npm ci

COPY . .
RUN npm run build

6. Language-Specific Optimizations

Go (static binary, ~12MB):

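# Stage 1: compile a statically linked binary (CGO disabled)
FROM golang:1.22 AS builder
WORKDIR /build
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o app .

# Stage 2: empty base image containing only the binary
FROM scratch
COPY --from=builder /build/app /app
CMD ["/app"]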
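
The layer-ordering and cache-mount ideas from sections 4 and 5 carry over to Go. The sketch below assumes the module has both go.mod and go.sum at the repository root and that the binary makes outbound TLS calls (hence the CA bundle); drop either part if it does not apply. The -ldflags value is an optional size optimization, not something the original build requires.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS builder
WORKDIR /build

# Download modules first so this step is reused until go.mod/go.sum change
COPY go.mod go.sum ./
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

COPY . .
# Module and build caches persist across builds; -s -w strips symbol and debug tables
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o app .

FROM scratch
# scratch has no CA certificates; copy the bundle from the builder for TLS
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /build/app /app
ENTRYPOINT ["/app"]
```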

Python (with virtualenv):

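FROM python:3.12-slim AS builder
WORKDIR /build
# Create the virtualenv and put it first on PATH so pip installs into it
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.12-slim
WORKDIR /app
# Same base image and same path, so the virtualenv works unchanged
COPY --from=builder /opt/venv /opt/venv
# Without this, the uvicorn executable would not be found on PATH
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0"]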
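
The BuildKit cache mount from section 5 works for pip as well. This is a sketch of a builder stage only, assuming a virtualenv at /opt/venv (an illustrative path) and pip's default cache directory for the root user. Note that --no-cache-dir is dropped, since it would defeat the mounted cache.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim AS builder
WORKDIR /build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
# pip's download and wheel cache persists across builds but never enters an image layer
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```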

7. Production Hardening

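# Runtime stage; assumes the builder stage from section 2
FROM node:20-alpine AS runner

RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app

COPY --from=builder /build/dist ./dist
# Copy the production-only modules, not the full node_modules with devDependencies
COPY --from=builder /prod_modules ./node_modules
COPY --from=builder /build/package*.json ./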

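# Security: remove the package manager from the final filesystem.
# This hides it from the running container but does not shrink the image, since earlier layers still hold the files.
RUN apk del --no-cache apk-tools 2>/dev/null || true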

USER appuser
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node -e "require('http').get('http://localhost:3000/health', r => process.exit(r.statusCode === 200 ? 0 : 1))"

CMD ["node", "dist/server.js"]

Summary Checklist

  • Multi-stage build with builder pattern
  • Alpine or distroless base image
  • Dependencies copied before source code (layer caching)
  • .dockerignore excluding unnecessary files
  • Non-root user
  • HEALTHCHECK defined
  • Only production dependencies in final image
  • BuildKit cache mounts for CI

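Each technique compounds. Start with a multi-stage build on a small base image (roughly a 10x size reduction in the example above), then order layers for caching (about 5x faster rebuilds in the example above) and add BuildKit cache mounts.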