Operations


https://docs.docker.com/engine/manage-resources/pruning/
https://docs.docker.com/reference/cli/docker/system/df/
https://docs.docker.com/reference/cli/docker/system/prune/
https://docs.docker.com/engine/logging/drivers/json-file/
https://docs.docker.com/docker-hub/repos/manage/hub-images/
https://docs.docker.com/docker-hub/repos/manage/hub-images/manage/

1. Important Points

Docker operations policy should define:
    local disk cleanup:
        what can be pruned, when, and what must be protected

    registry retention:
        which tags are kept, for how long, and how rollback images are protected

    volume backup:
        which volumes contain data and how restore is tested

    runbook:
        commands for no space, broken container, failed pull, and rollback

Do not leave these as bare checklist items:
    cleanup policy is defined
    registry retention policy is defined
    volume backup policy is defined

Write each policy down, include the commands, and verify it.

2. Local Disk Cleanup Policy

Host Type              | Cleanup Default                        | Volume Rule
-----------------------|----------------------------------------|-------------------------------------
Dev laptop             | weekly or Docker disk usage > 30 GB    | prune only unlabeled unused volumes
Shared Docker VM       | maintenance window or filesystem > 80% | never auto-prune data volumes
CI builder             | after build or daily                   | no persistent data volume
Production Docker host | change window only                     | backup first, then delete explicitly
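
The filesystem threshold in the table can be checked before any prune runs. A minimal sketch, assuming the Docker data root is /var/lib/docker (the Linux default; `docker info --format '{{.DockerRootDir}}'` shows the real one). The helper names `fs_usage_pct` and `over_threshold` are illustrative only.

```shell
# Print the usage percentage (integer, no % sign) of the filesystem holding a path.
# `df -P` is the POSIX output format: one header line, one line per filesystem.
fs_usage_pct() {
  df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Succeed when usage is strictly above the threshold (e.g. 80 for a shared VM).
over_threshold() {
  [ "$(fs_usage_pct "$1")" -gt "$2" ]
}

# Example use (DOCKER_ROOT is an assumption; override it if the data root was moved):
# if over_threshold "${DOCKER_ROOT:-/var/lib/docker}" 80; then
#   docker system df
# fi
```

The check only reports; deciding what to prune stays with the policy below.
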
default policy:
    stopped containers:
        remove older than 7 days

    unused images:
        remove older than 14 days

    build cache:
        remove older than 7 days

    networks:
        remove older than 7 days when unused

    volumes:
        do not run global volume prune unless data volumes have labels and backups

commands

docker system df
docker container prune --filter "until=168h"
docker image prune -a --filter "until=336h"
docker network prune --filter "until=168h"
docker builder prune --filter "until=168h"
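
For scheduled runs (cron or a systemd timer) the same commands can be wrapped in one function. This is a sketch: `docker_cleanup`, `run_or_echo`, and the `DRY_RUN` variable are made-up names, not Docker features. The added `-f` flag skips the confirmation prompt, which a non-interactive job needs.

```shell
# Execute a command, or only print it when DRY_RUN=1.
run_or_echo() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Apply the default cleanup policy. Volumes are deliberately not touched here.
docker_cleanup() {
  run_or_echo docker container prune -f --filter "until=168h"   # stopped containers older than 7 days
  run_or_echo docker image prune -a -f --filter "until=336h"    # unused images older than 14 days
  run_or_echo docker network prune -f --filter "until=168h"     # unused networks older than 7 days
  run_or_echo docker builder prune -f --filter "until=168h"     # build cache older than 7 days
}
```

Running `DRY_RUN=1 docker_cleanup` first prints the four commands without deleting anything.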

protect data volumes

docker volume create --label keep=true pgdata
docker volume create --label keep=true redisdata
docker volume prune --filter "label!=keep"
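
volume rule:
    label real data volumes with keep=true when creating them
    on Docker Engine 23.0 and later, docker volume prune removes only anonymous volumes unless --all is passed
    back up before manual deletion
    a restore test is required before claiming the backup is usable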
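
Before pruning, it helps to preview which unused volumes lack the keep label. A sketch using `docker volume ls` filters (`dangling=true` selects volumes not referenced by any container); the function name `unprotected_unused_volumes` is hypothetical.

```shell
# List unused (dangling) volumes that do NOT carry the keep label.
unprotected_unused_volumes() {
  # Unused volumes that are protected by the label.
  protected=$(docker volume ls -q --filter dangling=true --filter label=keep)
  # All unused volumes, minus the protected ones.
  docker volume ls -q --filter dangling=true |
    while IFS= read -r name; do
      printf '%s\n' "$protected" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

Anything this prints is a candidate for deletion; anything holding real data that shows up here should be backed up and recreated with the label.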

verify

docker system df
docker ps -a
docker volume ls
df -h
expected:
    Docker disk usage goes down
    running containers still run
    named data volumes still exist
    app can restart and read data

rollback

containers/images/cache:
    no direct rollback after prune
    re-pull image or rebuild image

volumes:
    restore from backup archive
    if no backup exists, do not delete

3. Registry Retention Policy

Registry cleanup is not controlled by docker system prune. It must be configured in Docker Hub, ECR, Harbor, GitLab Registry, GHCR, or the registry your team uses.

Tag Type          | Example             | Retention
------------------|---------------------|----------------------------------
Immutable release | 1.0.0               | keep 180 days or last 30 releases
Git SHA           | git-a1b2c3d         | keep 30-90 days
Environment alias | prod, staging       | keep current only
PR / branch image | pr-123, feature-x   | keep 7-14 days
Untagged image    | digest only         | delete after 1-7 days
Last known good   | rollback-2026-06-05 | keep until next stable deploy
retention rule:
    never delete the image currently deployed
    keep enough old images for rollback window
    delete branch/PR images aggressively
    do not use latest as the only rollback reference
    document which registry owns the actual deletion rule
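
The tag naming scheme can be turned into a small classifier, which is useful when a cleanup job has to decide per tag. A sketch only: `retention_for_tag` is a hypothetical helper, it uses the upper bound of each range in the table, and the version pattern is a loose glob rather than a strict semver check.

```shell
# Map a tag to its maximum retention in days, following the table above.
# Prints "current" for environment aliases and "protected" for rollback tags,
# because those are not expired by age.
retention_for_tag() {
  case "$1" in
    prod|staging)          echo "current" ;;
    rollback-*)            echo "protected" ;;
    git-*)                 echo 90 ;;
    pr-*|feature-*)        echo 14 ;;
    [0-9]*.[0-9]*.[0-9]*)  echo 180 ;;
    *)                     echo "unknown" ;;   # e.g. latest: needs a human decision
  esac
}
```

The actual deletion still has to be configured in the registry; this only documents the intent in executable form.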

tag release

docker tag order-api:local registry.example.com/order-api:1.0.0
docker tag order-api:local registry.example.com/order-api:git-a1b2c3d
docker push registry.example.com/order-api:1.0.0
docker push registry.example.com/order-api:git-a1b2c3d

verify before registry cleanup

docker pull registry.example.com/order-api:1.0.0
docker image inspect registry.example.com/order-api:1.0.0
registry UI/API checks:
    current prod tag exists
    rollback tag exists
    old branch/PR tags match retention rule
    untagged images can be safely deleted
    storage usage trend is visible
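
The first two checks can be scripted so that a cleanup job refuses to run when a required tag is missing. A sketch, assuming the host is already logged in to the registry; `require_tags` is a hypothetical name. `docker manifest inspect` queries the registry without pulling layers.

```shell
# Return non-zero if any required tag is missing from the registry.
# Usage: require_tags <repository> <tag>...
require_tags() {
  repo=$1; shift
  missing=0
  for tag in "$@"; do
    if docker manifest inspect "$repo:$tag" >/dev/null 2>&1; then
      echo "ok       $repo:$tag"
    else
      echo "MISSING  $repo:$tag"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_tags registry.example.com/order-api 1.0.0 0.9.9` could gate the cleanup step, with the tag list taken from the current deployment and the rollback window.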

Docker Hub notes

Docker Hub cleanup:
    manage tags/images in repository Image Management
    stale images are visible in the UI
    deletion can be manual or API-based depending on plan/tooling
    deletion is registry-side and does not affect images already present on a host

failure cases

image accidentally deleted:
    check whether the image still exists on a deployed host (docker image ls)
    re-tag the local image with its registry reference (docker tag)
    push the restored tag (docker push)
    if no local copy exists, rebuild from the exact source commit and base image references
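
The recovery steps above can be sketched as one function. `restore_tag` is a hypothetical helper. One caveat stated as an assumption: a push from a single host likely restores only the platform that host holds, so a multi-arch tag may need a rebuild to be complete again.

```shell
# Re-publish a tag from a local copy of the image.
# Usage: restore_tag <local image ID or reference> <registry reference>
restore_tag() {
  # Abort early if the host has no local copy.
  docker image inspect "$1" >/dev/null 2>&1 || { echo "no local copy of $1" >&2; return 1; }
  # Re-tag, then push only if tagging succeeded.
  docker tag "$1" "$2" && docker push "$2"
}
```

Example: `restore_tag a1b2c3d4e5f6 registry.example.com/order-api:1.0.0`, where the image ID comes from `docker image ls` on the deployed host.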

rollback pull fails:
    verify registry auth
    verify tag was not cleaned
    verify platform architecture exists for multi-arch images

4. Volume Backup Policy

use Docker volume for:
    local dev database data
    small self-managed lab service data

avoid Docker volume for:
    production database without backup/restore runbook
    data that needs cross-host failover
    data that should be managed by cloud block storage or Kubernetes PV

backup

docker run --rm \
  -v order-data:/data:ro \
  -v "$PWD":/backup \
  alpine:3.20 \
  tar czf /backup/order-data-$(date +%F).tgz -C /data .

restore

docker volume create --label keep=true order-data-restore

docker run --rm \
  -v order-data-restore:/data \
  -v "$PWD":/backup \
  alpine:3.20 \
  tar xzf /backup/order-data-2026-06-05.tgz -C /data

verify restore

docker run --rm \
  -v order-data-restore:/data \
  alpine:3.20 \
  ls -lah /data
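
backup policy:
    define RPO/RTO
    for databases, stop the writer or use the database's own dump tool so the archive is consistent
    encrypt the backup if it contains customer data
    store the backup outside the Docker host
    test restore after creating or changing the backup job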
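
A cheap integrity check can run right after each backup, before the archive is copied off the host. A sketch, assuming GNU coreutils `sha256sum` is available; `verify_backup_archive` is a hypothetical name. It does not replace the restore test, it only catches empty or truncated archives early.

```shell
# Check that a backup archive is readable and record its checksum next to it.
verify_backup_archive() {
  archive=$1
  # Reject a missing or zero-byte file.
  [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
  # Listing the archive forces gzip and tar to read it end to end.
  tar tzf "$archive" >/dev/null || { echo "corrupt: $archive" >&2; return 1; }
  # Store the checksum so the off-host copy can be compared later.
  sha256sum "$archive" > "$archive.sha256"
}
```

Example: `verify_backup_archive "$PWD/order-data-2026-06-05.tgz"`.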

5. Host Log Policy

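Containers should write logs to stdout/stderr. With the default json-file driver those logs are stored on the Docker host with no size limit unless rotation is configured, so log growth still needs a host policy.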

daemon log rotation example

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "10"
  }
}
file:
    /etc/docker/daemon.json

apply:
    restart the Docker daemon during a maintenance window
    new log options apply only to containers created after the restart
    recreate existing containers so they pick up the new log options
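
With these settings each container can keep up to 10 files of 100 MB, so roughly 1 GB of logs per container. The worst case for a host is simple arithmetic, sketched here with a hypothetical helper `max_log_mb`:

```shell
# Worst-case json-file log usage in MB: containers x max-size (MB) x max-file.
max_log_mb() {
  echo $(( $1 * $2 * $3 ))
}

max_log_mb 20 100 10   # 20 containers with the settings above: prints 20000
```

That is about 20 GB for 20 containers, which should fit inside the disk budget used for the cleanup thresholds.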

verify logs

docker inspect <container> --format '{{.HostConfig.LogConfig.Type}} {{json .HostConfig.LogConfig.Config}}'
docker logs --tail 20 <container>

6. Incident Runbook

container should start after reboot

sudo systemctl is-enabled docker
docker inspect order-api --format '{{.HostConfig.RestartPolicy.Name}}'
docker inspect order-api --format '{{.State.Status}} {{.State.StartedAt}}'
expected:
    Docker daemon:
        enabled

    restart policy:
        unless-stopped
        or always

    container:
        running after host reboot

Set restart policy on an existing container:

docker update --restart unless-stopped order-api
notes:
    docker update changes container config without recreating the container
    --restart unless-stopped does not restart a container that you manually stopped
    use docker start order-api once if it was manually stopped

no space left on device

df -h
docker system df
docker ps -a
docker images
docker volume ls
fix order:
    prune stopped containers
    prune old unused images
    prune old build cache
    inspect volumes manually
    expand disk if cleanup cannot recover enough space

failed image pull

docker login registry.example.com
docker pull registry.example.com/order-api:1.0.0
docker manifest inspect registry.example.com/order-api:1.0.0
check:
    tag exists
    registry credentials are valid
    network/proxy can reach registry
    image platform matches host architecture
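
The platform check can be scripted for multi-arch tags. A sketch under assumptions: `docker_arch` and `image_has_host_arch` are hypothetical names, the mapping covers only common Linux architectures, and the grep relies on the manifest list JSON containing an "architecture" field (a single-platform manifest may not, so treat a miss as "check manually").

```shell
# Translate `uname -m` output into the architecture name used in image manifests.
docker_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)   echo amd64 ;;
    aarch64|arm64)  echo arm64 ;;
    armv7l)         echo arm ;;
    *)              echo unknown ;;
  esac
}

# Succeed if the manifest JSON for the image mentions the host architecture.
image_has_host_arch() {
  docker manifest inspect "$1" | grep -q "\"architecture\": *\"$(docker_arch)\""
}
```

Example: `image_has_host_arch registry.example.com/order-api:1.0.0 || echo "no image for $(docker_arch)"`.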

rollback by exact tag

docker pull registry.example.com/order-api:0.9.9

docker stop order-api
docker rm order-api

docker run -d \
  --name order-api \
  --restart unless-stopped \
  -p 3000:3000 \
  --env-file /etc/order-api.env \
  registry.example.com/order-api:0.9.9
rollback notes:
    rollback image tag must be retained by registry policy
    runtime env file or secret source must still exist
    database migration rollback is separate from Docker rollback
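
The rollback commands can be wrapped so that a failed pull never takes down the running container. A sketch reusing the names from the example above; `rollback_order_api` itself is a hypothetical helper. The design choice is ordering: pull first, and stop and remove only after the image is on the host.

```shell
# Roll back order-api to an exact tag.
# Usage: rollback_order_api <tag>
rollback_order_api() {
  image="registry.example.com/order-api:$1"
  # A missing tag or failed login aborts here, before the running container is touched.
  docker pull "$image" || { echo "pull failed, container left untouched" >&2; return 1; }
  docker stop order-api
  docker rm order-api
  docker run -d \
    --name order-api \
    --restart unless-stopped \
    -p 3000:3000 \
    --env-file /etc/order-api.env \
    "$image"
}
```

Example: `rollback_order_api 0.9.9`, followed by the status check with `docker inspect order-api --format '{{.State.Status}} {{.State.StartedAt}}'`.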