Immich Postgres recovery (checkpoint corruption)

What happened

Postgres in the immich namespace was failing to start with:

  • invalid xl_info in checkpoint record
  • PANIC: could not locate a valid checkpoint record
  • startup process was terminated by signal 6: Aborted

These messages indicate WAL/checkpoint corruption, which typically follows an unclean shutdown: for example, the pod was killed mid-write, or the volume was attached to more than one node at once (the RWO Multi-Attach scenario).
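
To tell this failure apart from other startup errors, the log signatures above can be matched mechanically. The following is a minimal sketch, not part of the repo: the helper name is_checkpoint_corruption is hypothetical, and the usage line assumes the container is named postgres, as in the exec commands elsewhere in this document.

```shell
# Hypothetical helper: exit 0 if the log text on stdin contains one of the
# checkpoint-corruption signatures listed above, non-zero otherwise.
is_checkpoint_corruption() {
  grep -E -q 'invalid xl_info in checkpoint record|could not locate a valid checkpoint record'
}

# Example usage (--previous reads the logs of the last crashed container):
#   kubectl logs deploy/immich-postgres -n immich -c postgres --previous \
#     | is_checkpoint_corruption && echo "checkpoint corruption detected"
```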

Volume health (Longhorn)

  • immich-postgres-pvc should show robustness: healthy. If it were degraded, I/O errors could cause or worsen corruption. Check with:
    kubectl get volumes.longhorn.io -n longhorn-system | grep <postgres-pv-name>
  • The immich-library volume (200Gi) is separate; if it is degraded, fix it in the Longhorn UI or by waiting for/triggering replica rebuild, but that does not block Postgres recovery.
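
The volume check above can be done without looking up the PV name by hand. This is a sketch under two assumptions: the Longhorn volume is named after the bound PV (the usual case for dynamically provisioned volumes, and what the grep above relies on), and the Volume resource exposes its state under status.robustness. The function name longhorn_robustness is hypothetical.

```shell
# Hypothetical helper: print the Longhorn robustness of the volume behind a PVC.
# Arguments: <pvc-name> <namespace>
longhorn_robustness() {
  pvc="$1"
  ns="$2"
  # Resolve the PV bound to the PVC.
  pv="$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}')" || return 1
  if [ -z "$pv" ]; then
    echo "PVC $pvc in namespace $ns is not bound to a PV" >&2
    return 1
  fi
  # Assumption: the Longhorn volume has the same name as the PV.
  kubectl get volumes.longhorn.io "$pv" -n longhorn-system -o jsonpath='{.status.robustness}'
}

# Example usage:
#   longhorn_robustness immich-postgres-pvc immich
```

Anything other than healthy is worth resolving before attempting a recovery that writes to the volume.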

Recovery options

Option A: Start fresh (delete DB)

If you don't need the existing data:

  1. Scale Postgres to 0 and delete the PVC:
    kubectl scale deploy immich-postgres -n immich --replicas=0
    kubectl delete pvc immich-postgres-pvc -n immich
    (Ensure no other pods use the PVC, e.g. delete any log-reader job. The PVC stays in Terminating until every pod that mounts it is gone.)
  2. Recreate the PVC (Flux will do it on reconcile, or apply apps/base/immich/postgres.yaml).
  3. Scale Postgres back to 1. It will init a new empty DB.
  4. Create extensions (run from a node that can reach the cluster, e.g. plumbus):
    kubectl exec -n immich deploy/immich-postgres -c postgres -- psql -U immich -d immich -c "CREATE EXTENSION IF NOT EXISTS vectors; CREATE EXTENSION IF NOT EXISTS cube; CREATE EXTENSION IF NOT EXISTS earthdistance;"

Immich server will run migrations on first connect. You'll need to set up Immich again (admin user, etc.).
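
Before starting the Immich server, it can help to confirm that Postgres is actually up and that step 4 took effect. A minimal sketch, reusing the deployment, container, user, and database names from the commands above; the function name verify_immich_extensions is hypothetical.

```shell
# Hypothetical helper: wait for the Postgres Deployment to roll out, then
# list the installed extensions, one per line.
verify_immich_extensions() {
  kubectl rollout status deploy/immich-postgres -n immich --timeout=120s || return 1
  # -A (unaligned) and -t (tuples only) give plain output that is easy to grep.
  kubectl exec -n immich deploy/immich-postgres -c postgres -- \
    psql -U immich -d immich -At -c "SELECT extname FROM pg_extension ORDER BY extname;"
}

# Example usage:
#   verify_immich_extensions | grep -x vectors || echo "vectors extension missing"
```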

Option B: Restore from backup (preferred if you have data)

If you have a backup of the Immich Postgres data:

  1. Kopia (or similar)
    With Postgres scaled to 0, restore the immich-postgres-pvc volume (or the pgdata directory) from a snapshot taken when the DB was healthy. Then scale Postgres back to 1 and start Immich.

  2. Immich export
    If you previously used Immich’s backup/export feature, first bring Postgres up with a new PVC and a fresh data dir (see Option A), then import the export.

Option C: Last resort — pg_resetwal

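Warning: pg_resetwal discards the existing WAL and resets the control file so that Postgres can start again. Transactions that had not yet reached the data files are lost, and the database may be left internally inconsistent. Use it only if you have no usable backup and accept possible data loss. If Postgres starts afterwards, take a pg_dump right away.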

  1. Scale Postgres down so the PVC is free:
    kubectl scale deploy immich-postgres -n immich --replicas=0
    
  2. Wait until the postgres pod is gone:
    kubectl get pods -n immich -l app=immich-postgres
    
  3. Run the one-off reset job on the node where the Postgres pod last ran (e.g. rex):
    # Edit pg-resetwal-job.yaml and set nodeName to the node that had the postgres pod (e.g. rex)
    kubectl apply -f apps/base/immich/pg-resetwal-job.yaml -n immich
    kubectl wait job/immich-pg-resetwal -n immich --for=condition=complete --timeout=120s
    kubectl logs job/immich-pg-resetwal -n immich
    
  4. Delete the job and scale Postgres back up:
    kubectl delete job immich-pg-resetwal -n immich
    kubectl scale deploy immich-postgres -n immich --replicas=1
    
  5. Recreate Immich extensions if needed (once Postgres is Ready):
    kubectl exec -n immich deploy/immich-postgres -c postgres -- psql -U immich -d immich -c \
      "CREATE EXTENSION IF NOT EXISTS vectors; CREATE EXTENSION IF NOT EXISTS cube; CREATE EXTENSION IF NOT EXISTS earthdistance;"
    

If Postgres still fails after pg_resetwal, the data directory may be too damaged to recover. The remaining options are restoring from backup (Option B) or re-deploying Immich with a new PVC (Option A) and re-importing assets.
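
Steps 1 and 2 of Option C can be combined so that the reset job is never applied while a Postgres pod still holds the volume. A sketch, assuming the pods carry the label app=immich-postgres as in step 2; the function name stop_immich_postgres is hypothetical. On recent kubectl versions, a wait with --for=delete should return immediately if no matching pods remain, but treat that as an assumption to verify on your version.

```shell
# Hypothetical helper: scale Postgres to 0 and block until its pods are deleted.
stop_immich_postgres() {
  kubectl scale deploy immich-postgres -n immich --replicas=0 || return 1
  # Returns non-zero if the pods are still present after the timeout.
  kubectl wait pod -n immich -l app=immich-postgres --for=delete --timeout=120s
}

# Example usage:
#   stop_immich_postgres && kubectl apply -f apps/base/immich/pg-resetwal-job.yaml -n immich
```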

Preventing recurrence

  • Immich Postgres and server are already pinned off the oracle node (nodeAffinity) and the server uses strategy: Recreate with replicas: 1 to avoid RWO Multi-Attach.
  • Ensure Postgres is never scheduled on a node that might lose connectivity or be terminated abruptly; keep backups (e.g. Kopia) of the immich-postgres-pvc or regular pg_dump/Immich exports.
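
For the pg_dump route mentioned above, a logical dump can be pulled out of the running pod. This is a minimal sketch rather than the repo's backup mechanism: the function name backup_immich_db and the output file naming are hypothetical, and it reuses the deployment, container, user, and database names from the exec commands in this document.

```shell
# Hypothetical helper: write a timestamped logical dump of the immich database
# into a local directory (default: current directory) and print the file path.
backup_immich_db() {
  dest_dir="${1:-.}"
  out="$dest_dir/immich-$(date +%Y%m%d-%H%M%S).sql"
  if kubectl exec -n immich deploy/immich-postgres -c postgres -- \
       pg_dump -U immich -d immich > "$out"; then
    echo "$out"
  else
    # Do not leave a partial dump behind that could be mistaken for a good one.
    rm -f "$out"
    return 1
  fi
}

# Example usage:
#   backup_immich_db /var/backups/immich
```

A dump produced this way could later be loaded into a fresh database by piping it to psql -U immich -d immich through kubectl exec -i.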