Complete Setup Guide

Step-by-step instructions to deploy your entire K3s GitOps infrastructure from scratch.

Phase 1: Preparation (Friday Evening - 1 Hour)

Prerequisites

Ensure you have:

  • GitHub account with a personal access token
  • Git installed locally
  • kubectl, kustomize, k3d, and flux CLIs installed
  • SSH access to all Proxmox nodes (r2d2, butthole-ice-cream, windows, schwifty)

Install Required Tools

# macOS
brew install flux kustomize k3d kubectl

# Linux (Ubuntu/Debian)
curl -s https://fluxcd.io/install.sh | sudo bash
curl -s https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh | bash
sudo mv kustomize /usr/local/bin/  # the install script drops the binary in the current directory
curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash
kubectl version --client  # Should already be installed
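
Quick sanity check that every required CLI is on your PATH (a minimal sketch; the tool list matches the prerequisites above):

for tool in git kubectl kustomize k3d flux; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done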

Create GitHub Repository

# 1. On GitHub: Create new repo "homelab-gitops" (empty)
# 2. Clone locally
git clone https://github.com/yourusername/homelab-gitops
cd homelab-gitops

# 3. Create folder structure
mkdir -p clusters/{local,production,staging}
mkdir -p infrastructure/{base,production,local}
mkdir -p apps/{base,production,local}
mkdir -p docs scripts

# 4. Create initial files
touch clusters/local/.gitkeep
touch clusters/production/.gitkeep
touch .gitignore
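
# Example .gitignore contents (a minimal sketch -- patterns are assumptions, adjust to your setup)
cat > .gitignore <<'EOF'
kubeconfig*
*.swp
.DS_Store
EOF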

# 5. Add docs
# Copy README.md, ARCHITECTURE.md, SERVICE-CATALOG.md to repo root/docs

# 6. First commit
git add .
git commit -m "chore: initial structure"
git push -u origin main

Phase 2: Local Development (Friday Evening - 2 Hours)

Create Local k3d Cluster

# Create cluster matching production topology
k3d cluster create local \
  --servers 1 \
  --agents 1 \
  --port "8080:80@loadbalancer" \
  --port "8443:443@loadbalancer" \
  --volume "/tmp/k3d-storage:/var/lib/rancher/k3s/storage" \
  --wait

# Verify
kubectl cluster-info
kubectl get nodes

Bootstrap Flux on Local Cluster

# Set GitHub credentials
export GITHUB_TOKEN=your_personal_access_token
export GITHUB_USER=yourusername

# Bootstrap Flux (creates deploy key automatically)
kubectl config use-context k3d-local

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repo=homelab-gitops \
  --branch=main \
  --path=clusters/local \
  --personal

Verify Flux is Running

# Check Flux controllers
kubectl -n flux-system get pods

# Watch reconciliation
flux get kustomizations --all-namespaces --watch

# Should see: flux-system (Reconcile succeeded)

Phase 3: Proxmox VM Provisioning (Saturday Morning - 2 Hours)

Create VMs on r2d2 (192.168.1.10)

SSH into r2d2:

ssh root@192.168.1.10

# Create VMs (use Proxmox UI or script below)
# VMs for K3s:
# - leia (100): 4c, 6GB RAM, 50GB disk
# - luke-1 (101): 4c, 6GB RAM, 100GB disk
# - luke-2 (102): 4c, 6GB RAM, 100GB disk

# Example using Proxmox CLI:
qm create 100 --name k3s-leia --cores 4 --memory 6144 --scsihw virtio-scsi-pci
qm set 100 --scsi0 local-lvm:50
qm set 100 --net0 virtio,bridge=vmbr0
qm start 100

# (repeat for luke-1 and luke-2, VM IDs 101/102, with 100GB disks)

Create VMs on butthole-ice-cream (192.168.1.20)

ssh root@192.168.1.20

# VMs:
# - obi-wan (110): 2c, 3GB RAM, 30GB disk
# - yoda-1 (111): 2c, 3GB RAM, 50GB disk

qm create 110 --name k3s-obi-wan --cores 2 --memory 3072 --scsihw virtio-scsi-pci
qm set 110 --scsi0 local-lvm:30
qm set 110 --net0 virtio,bridge=vmbr0
qm start 110

# (repeat for yoda-1, VM ID 111, with a 50GB disk)

Create VMs on windows (192.168.1.30)

ssh root@192.168.1.30

# VMs:
# - lando (120): 2c, 4GB RAM, 40GB disk

qm create 120 --name k3s-lando --cores 2 --memory 4096 --scsihw virtio-scsi-pci
qm set 120 --scsi0 local-lvm:40
qm set 120 --net0 virtio,bridge=vmbr0
qm start 120

Create VMs on schwifty (10.0.2.30)

ssh root@10.0.2.30

# VMs:
# - rick (100): 6c, 12GB RAM, 100GB disk
# - morty-1 (101): 6c, 12GB RAM, 300GB disk
# - morty-2 (102): 4c, 8GB RAM, 400GB disk

qm create 100 --name k3s-rick --cores 6 --memory 12288 --scsihw virtio-scsi-pci
qm set 100 --scsi0 local-lvm:100
qm set 100 --net0 virtio,bridge=vmbr0
qm start 100

# (repeat for morty-1 and morty-2)
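
If you prefer to script the repetition, a sketch like this creates the two morty VMs in one pass (sizes from the list above; the local-lvm storage name mirrors the commands above -- adjust to your Proxmox storage):

for spec in "101:k3s-morty-1:6:12288:300" "102:k3s-morty-2:4:8192:400"; do
  IFS=: read -r id name cores mem disk <<< "$spec"
  qm create "$id" --name "$name" --cores "$cores" --memory "$mem" --scsihw virtio-scsi-pci
  qm set "$id" --scsi0 "local-lvm:$disk"
  qm set "$id" --net0 virtio,bridge=vmbr0
  qm start "$id"
done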

Phase 4: OS Setup on VMs (Saturday Afternoon - 2 Hours)

Boot and Configure Each VM

For each VM (access via console or SSH once booted):

# Login to VM (default: root/proxmox or ubuntu/ubuntu)
# Configure hostname
hostnamectl set-hostname k3s-leia
echo "192.168.1.100 k3s-leia" >> /etc/hosts

# Update OS
apt update && apt upgrade -y

# Install packages needed by K3s
apt install -y curl wget git vim htop

# Set up sudo for non-root user (optional)
# Create user, add to sudoers

# Enable IP forwarding (required for K3s networking)
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
sysctl -p
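
To avoid repeating this by hand on every VM, the common steps can be pushed over SSH in one loop (hostnames still need to be set per node; any worker IPs not shown elsewhere in this guide are assumptions -- substitute your actual addresses):

for ip in 192.168.1.100 192.168.1.101 192.168.1.102 \
          192.168.1.110 192.168.1.111 192.168.1.120 \
          10.0.2.100 10.0.2.101 10.0.2.102; do
  ssh root@"$ip" 'apt update && apt upgrade -y &&
    apt install -y curl wget git vim htop &&
    echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf && sysctl -p'
done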

Verify Network Access

# From your local machine, verify you can reach all VMs
ping 192.168.1.100  # leia
ping 192.168.1.110  # obi-wan
ping 192.168.1.120  # lando
ping 10.0.2.100     # rick

Phase 5: K3s Installation (Saturday Evening - 3 Hours)

Install Primary Master (leia)

# SSH into leia VM
ssh root@192.168.1.100

# Install K3s server (first master, with embedded etcd so more masters can join)
curl -sfL https://get.k3s.io | sh -s - server --cluster-init

# The install script creates and starts the k3s systemd service automatically

# Wait ~30 seconds for startup
sleep 30

# Get node token (needed to join other servers)
TOKEN=$(sudo cat /var/lib/rancher/k3s/server/node-token)
echo "Token: $TOKEN"  # Save this!

# Verify K3s is running
sudo k3s kubectl get nodes

Install Secondary Masters (obi-wan, lando)

# SSH into obi-wan VM
ssh root@192.168.1.110

# Join as an additional server (master). The "server" subcommand matters:
# without it, the install script would join the node as an agent.
curl -sfL https://get.k3s.io | K3S_TOKEN=<TOKEN_FROM_LEIA> \
  sh -s - server --server https://192.168.1.100:6443

# Repeat on lando (192.168.1.120)

# Verify (run on leia)
sudo k3s kubectl get nodes
# Should show: leia, obi-wan, lando (3 masters)

Install Worker Nodes (luke-1, luke-2, yoda-1, morty-1, morty-2)

# SSH into luke-1 VM
ssh root@192.168.1.101

# Join as agent (worker). K3S_URL/K3S_TOKEN must be set for the install script
# itself, i.e. on the right-hand side of the pipe.
curl -sfL https://get.k3s.io | \
  K3S_URL=https://192.168.1.100:6443 \
  K3S_TOKEN=<TOKEN_FROM_LEIA> sh -

# Repeat for luke-2, yoda-1, morty-1, and morty-2 (see the loop sketch below)
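
If you would rather drive the remaining joins from your workstation, a sketch like this works (the worker IPs are assumptions based on the VM IDs above -- substitute your actual addresses):

TOKEN=<TOKEN_FROM_LEIA>
for ip in 192.168.1.102 192.168.1.111 10.0.2.101 10.0.2.102; do
  ssh root@"$ip" "curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.100:6443 K3S_TOKEN=$TOKEN sh -"
done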

Verify Cluster Health

# From leia, check all nodes joined
sudo k3s kubectl get nodes -o wide

# Expected output: every joined node reports Ready
# NAME      STATUS   ROLES
# leia      Ready    control-plane,etcd,master
# obi-wan   Ready    control-plane,etcd,master
# lando     Ready    control-plane,etcd,master
# luke-1    Ready    <none>
# luke-2    Ready    <none>
# yoda-1    Ready    <none>
# morty-1   Ready    <none>
# morty-2   Ready    <none>

Copy kubeconfig Locally

# From leia, copy config
sudo cat /etc/rancher/k3s/k3s.yaml > /tmp/k3s-prod.yaml

# Download to local machine
scp root@192.168.1.100:/tmp/k3s-prod.yaml ~/.kube/config-prod

# Update server IP in file (change 127.0.0.1 to 192.168.1.100)
sed -i 's/127.0.0.1/192.168.1.100/g' ~/.kube/config-prod

# Merge into your kubeconfig (simply appending the file does not produce a valid merged config)
KUBECONFIG=~/.kube/config:~/.kube/config-prod kubectl config view --flatten > /tmp/config-merged
mv /tmp/config-merged ~/.kube/config

# Rename the context (K3s names its context "default")
kubectl config rename-context default k3s-prod

# Test access from local machine
kubectl --context=k3s-prod get nodes

Phase 6: Infrastructure Deployment (Sunday Morning - 2 Hours)

Add Infrastructure to Git

Create the base infrastructure manifests in your repo:

cd homelab-gitops

# Create infrastructure manifests
mkdir -p infrastructure/base/{traefik,sablier,longhorn,cloudflare-tunnel,monitoring,storage}

# Create manifests in each directory
# See previous documentation for complete YAML files
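
# A sketch of infrastructure/base/kustomization.yaml tying the component directories together
# (directory names match those created above)
cat > infrastructure/base/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - traefik
  - sablier
  - longhorn
  - cloudflare-tunnel
  - monitoring
  - storage
EOF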

git add infrastructure/
git commit -m "feat: add infrastructure manifests"
git push

Create Clusters Config

# clusters/production/kustomization.yaml
cat > clusters/production/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../infrastructure/base
  - ../../infrastructure/production
  - ../../apps/base
  - ../../apps/production

# No global "namespace:" override here -- the infrastructure components create
# and manage their own namespaces (traefik, longhorn-system, monitoring, ...)
EOF

git add clusters/production/
git commit -m "feat: add production cluster config"
git push

Bootstrap Flux on Production

# From local machine
kubectl config use-context k3s-prod

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repo=homelab-gitops \
  --branch=main \
  --path=clusters/production \
  --personal

Monitor Flux Deployment

# Watch Flux reconcile infrastructure
flux get kustomizations --all-namespaces --watch

# Check Flux logs
flux logs --follow

# Verify resources deployed
kubectl get pods -A
kubectl get svc -A
kubectl get ingress -A

Wait for Longhorn

# Longhorn takes ~3 minutes to deploy
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/instance=longhorn \
  -n longhorn-system --timeout=300s

# Verify storageclass
kubectl get storageclass
# Should show: local-path, longhorn

Phase 7: Verify Infrastructure (Sunday Afternoon - 1 Hour)

Check All Nodes

kubectl get nodes -o wide
kubectl top nodes
kubectl describe nodes

Check Core Services

# Traefik
kubectl -n traefik get pods,svc

# Sablier
kubectl -n default get pods | grep sablier

# Longhorn
kubectl -n longhorn-system get pods

# Monitoring (Prometheus/Grafana)
kubectl -n monitoring get pods,svc

Test Persistence

# Create test PVC
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test
    image: alpine
    command: ['sh', '-c', 'echo "Test data" > /data/test.txt && sleep 3600']
    volumeMounts:
    - name: storage
      mountPath: /data
  volumes:
  - name: storage
    persistentVolumeClaim:
      claimName: test-pvc
EOF

# Check PVC status
kubectl get pvc
kubectl describe pvc test-pvc

# Verify pod wrote data
kubectl exec test-pod -- cat /data/test.txt
# Output: "Test data"

# Cleanup
kubectl delete pod/test-pod pvc/test-pvc

Phase 8: Deploy First Service - Moodle (Optional - Sunday Evening)

Add Moodle to Git

# Create Moodle manifests
mkdir -p apps/base/moodle/{00-database,01-storage,02-moodle-app,03-monitoring}

# Copy YAML files from SERVICE-CATALOG.md

# Create secret
kubectl create secret generic moodle-db-secret \
  --from-literal=password=$(openssl rand -base64 32) \
  -n moodle \
  --dry-run=client \
  -o yaml > apps/base/moodle/secret.yaml
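
# Sketch of apps/base/moodle/kustomization.yaml (directory names match those created above;
# the "moodle" namespace matches the secret)
cat > apps/base/moodle/kustomization.yaml <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: moodle
resources:
  - 00-database
  - 01-storage
  - 02-moodle-app
  - 03-monitoring
  - secret.yaml
EOF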

git add apps/base/moodle/
git commit -m "feat: add Moodle deployment"
git push

Verify Moodle Deployment

# Wait for pods
kubectl wait --for=condition=ready pod \
  -l app=moodle \
  -n moodle --timeout=300s

# Check pods
kubectl -n moodle get pods

# Check PVCs
kubectl -n moodle get pvc

# View logs
kubectl -n moodle logs -f deployment/moodle

Test Moodle

# Port-forward Moodle service
kubectl port-forward -n moodle svc/moodle 8080:80 &

# Open browser
# http://localhost:8080

Phase 9: Set Up Cloudflare Tunnel (Monday)

Create Cloudflare Tunnel

# On Cloudflare dashboard:
# 1. Create Tunnel → Named: "homelab-prod"
# 2. Download credentials JSON
# 3. Copy token

# Create secret in cluster
kubectl create secret generic cloudflare-tunnel-secret \
  --from-file=tunnel-credentials.json=<path-to-json> \
  -n cloudflare \
  --dry-run=client \
  -o yaml > infrastructure/base/cloudflare-tunnel/secret.yaml

# Update infrastructure/base/cloudflare-tunnel/config.yaml with:
# - Tunnel ID
# - Hostname mappings (moodle.yourdomain.com → http://traefik:80)
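#
# The cloudflared side of that config typically looks like this (tunnel ID and hostname are
# placeholders; the Traefik service address is an assumption, and if your manifest wraps the
# config in a ConfigMap, place this under its data key instead):
#
#   tunnel: <TUNNEL_ID>
#   credentials-file: /etc/cloudflared/creds/tunnel-credentials.json
#   ingress:
#     - hostname: moodle.yourdomain.com
#       service: http://traefik.traefik.svc.cluster.local:80
#     - service: http_status:404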

git add infrastructure/base/cloudflare-tunnel/
git commit -m "feat: set up Cloudflare Tunnel"
git push

# Flux auto-deploys

Verify External Access

# Wait for cloudflared pod
kubectl -n cloudflare get pods

# Test from external machine
curl https://moodle.yourdomain.com
# Should work!

Post-Deployment Checklist

  • [ ] All K3s nodes reporting Ready
  • [ ] Longhorn replicas working (test PVC)
  • [ ] Traefik responding to requests
  • [ ] Cloudflare Tunnel connected
  • [ ] Services accessible via domain names
  • [ ] Prometheus collecting metrics
  • [ ] Grafana dashboards loading
  • [ ] Backup CronJobs running
  • [ ] DNS resolving correctly

Backup & Restore

Backup etcd (Control Plane Database)

# Backup etcd from leia
ssh root@192.168.1.100

sudo k3s etcd-snapshot save --name etcd-backup-20250101

# Verify
sudo k3s etcd-snapshot list

# Copy to external storage
scp /var/lib/rancher/k3s/server/db/snapshots/* backup@backuphost:/backups/
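
K3s can also snapshot etcd on a schedule via its config file; a sketch with example values (the cron expression and retention count are just examples):

sudo tee -a /etc/rancher/k3s/config.yaml <<'EOF'
etcd-snapshot-schedule-cron: "0 3 * * *"
etcd-snapshot-retention: 7
EOF
sudo systemctl restart k3s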

Restore etcd

# If the cluster breaks, stop K3s on leia and restore from a snapshot
sudo systemctl stop k3s
sudo k3s server \
  --cluster-reset \
  --cluster-reset-restore-path=/path/to/snapshot.db

# Once the reset completes, start K3s again
sudo systemctl start k3s

Troubleshooting

Nodes not joining cluster

# On worker node, check logs
sudo journalctl -u k3s-agent -f

# Common issues:
# - Wrong token
# - Firewall blocking 6443
# - Network issue

# Check connectivity
ping 192.168.1.100  # leia IP
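
If ping works but the join still fails, check that the K3s API port itself is reachable from the worker (assumes netcat is installed):

nc -zv 192.168.1.100 6443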

Pods stuck in Pending

# Check node resources
kubectl top nodes
kubectl describe nodes

# Check PVC
kubectl describe pvc

# Common causes:
# - Not enough disk space
# - PVC not bound
# - Resource requests too high

Flux not syncing

# Check Flux status
flux get all

# Check logs
flux logs --follow

# Common issues:
# - GitHub token expired
# - Deploy key removed from repo
# - Kustomize build errors
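
After fixing the underlying problem, force an immediate sync instead of waiting for the next interval:

flux reconcile source git flux-system
flux reconcile kustomization flux-system --with-source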

Next Steps

  1. Add more services: Follow SERVICE-CATALOG.md
  2. Set up monitoring: Configure Grafana dashboards
  3. Enable backups: Configure external backup storage
  4. Add users: Create RBAC policies
  5. Implement GitOps: Use PRs for all changes
  6. Monitor costs: Track resource usage

Timeline Summary

Phase   Tasks                               Duration
1       Prep, tools, Git setup              1h
2       k3d local cluster, Flux bootstrap   2h
3       Proxmox VM provisioning             2h
4       OS setup on VMs                     2h
5       K3s installation (all nodes)        3h
6       Infrastructure deployment           2h
7       Verification & testing              1h
8       Moodle deployment                   1h
9       Cloudflare Tunnel setup             1h
Total   Complete production setup           ~15h

Estimated timeline: Friday evening through Monday (roughly 15 hours of hands-on work over one long weekend) to have a fully operational, production-grade K3s cluster with persistent storage, GitOps automation, and your first service (Moodle) running.

Good luck! 🚀