Backup and Recovery with K8up
This guide will walk you through the creation, backup, and recovery processes for a local SeaweedFS deployment using K8up and Backblaze B2.
K8up is a Kubernetes-native wrapper around restic, so users of restic and other restic-based tooling such as Velero should also find the techniques described here useful. Users of other S3-hosted services such as Wasabi S3 or Cloudflare R2 should be able to follow along as well.
For the purposes of this demo, backups are scheduled to run very frequently and plain-text passwords are used for convenience. Do NOT do either of these in production.
PR ref: https://github.com/seaweedfs/seaweedfs/pull/5034
Outline
- K3s Cluster creation
- SeaweedFS instance setup
- Configure scheduled backups of SeaweedFS to B2
- Restore SeaweedFS from B2 backups
Requirements
K3s Cluster creation
- Download the k3s installer:

  ```sh
  curl -sfL https://get.k3s.io > k3s-install.sh
  ```
- Install k3s:

  ```sh
  bash k3s-install.sh --disable=traefik
  ```
- Wait for the node to be ready:

  ```sh
  $ sudo k3s kubectl get node
  NAME   STATUS   ROLES                  AGE   VERSION
  vm0    Ready    control-plane,master   1m    v1.27.4+k3s1
  ```
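  If you would rather block until the node is Ready instead of polling, something like the following also works (a minimal sketch using plain kubectl; adjust the timeout as needed):

  ```sh
  sudo k3s kubectl wait --for=condition=Ready node --all --timeout=300s
  ```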
- Make an accessible version of the kubeconfig:

  ```sh
  mkdir -p ~/.config/kube
  sudo cp /etc/rancher/k3s/k3s.yaml ~/.config/kube/config
  sudo chown $USER:$USER ~/.config/kube/config
  export KUBECONFIG=~/.config/kube/config
  ```
- Install K8up:

  ```sh
  helm repo add k8up-io https://k8up-io.github.io/k8up
  helm repo update
  kubectl apply -f https://github.com/k8up-io/k8up/releases/download/k8up-4.4.3/k8up-crd.yaml
  helm install k8up k8up-io/k8up
  ```
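  Before moving on, it is worth confirming that the operator pod is running. The label selector below is an assumption about the chart's labels; a plain `kubectl get pods` works just as well:

  ```sh
  kubectl get pods -l app.kubernetes.io/name=k8up
  ```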
SeaweedFS instance and user setup
- Install the MinIO client

  Docs: https://min.io/docs/minio/linux/reference/minio-mc.html

  ```sh
  mkdir -p $HOME/minio-binaries
  wget https://dl.min.io/client/mc/release/linux-amd64/mc -O $HOME/minio-binaries/mc
  chmod +x $HOME/minio-binaries/mc
  export PATH=$PATH:$HOME/minio-binaries/
  ```
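  A quick sanity check that the client is on your PATH:

  ```sh
  mc --version
  ```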
- Download SeaweedFS, unzip, then cd to the helm dir:

  ```sh
  wget https://github.com/seaweedfs/seaweedfs/archive/refs/tags/3.77.zip
  unzip 3.77.zip
  cd seaweedfs-3.77/k8s/charts/seaweedfs
  ```
- Create a minimal values file for the SeaweedFS deployment which adds annotations for K8up:

  ```sh
  /bin/cat << EOF > test-values.yaml
  master:
    enabled: true
    data:
      type: "persistentVolumeClaim"
      size: "10G"
      storageClass: "local-path"
      annotations:
        "k8up.io/backup": "true"
    livenessProbe:
      periodSeconds: 5
    readinessProbe:
      periodSeconds: 5
  volume:
    enabled: true
    readMode: proxy
    dataDirs:
      - name: data
        type: "persistentVolumeClaim"
        size: "10G"
        storageClass: "local-path"
        annotations:
          "k8up.io/backup": "true"
        maxVolumes: 0
    idx: {}
    livenessProbe:
      periodSeconds: 5
    readinessProbe:
      periodSeconds: 5
  filer:
    enabled: true
    encryptVolumeData: true
    enablePVC: true
    storage: 10Gi
    defaultReplicaPlacement: "000"
    data:
      type: "persistentVolumeClaim"
      size: "10G"
      storageClass: "local-path"
      annotations:
        "k8up.io/backup": "true"
    s3:
      enabled: true
      enableAuth: true
      port: 8333
      httpsPort: 0
      allowEmptyFolder: false
      createBuckets:
        - name: shared
          anonymousRead: false
    livenessProbe:
      periodSeconds: 5
    readinessProbe:
      periodSeconds: 5
  s3:
    enabled: false
  cosi:
    enabled: false
  EOF
  ```
- Deploy via Helm (this takes longer on slow drives):

  ```sh
  helm install seaweedfs . -f test-values.yaml --wait
  ```
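  You can watch the pods come up while the chart settles. The label selector is an assumption about the chart's labels; a plain `kubectl get pods -w` also works:

  ```sh
  kubectl get pods -l app.kubernetes.io/name=seaweedfs -w
  ```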
- Expose the filer service via a LoadBalancer (servicelb in k3s). This will let us view the admin UI as well as reach the S3 endpoint during the demo.

  ```sh
  /bin/cat << EOF > service.yaml
  apiVersion: v1
  kind: Service
  metadata:
    labels:
      app.kubernetes.io/component: filer
      app.kubernetes.io/instance: seaweedfs
      app.kubernetes.io/name: seaweedfs
    name: seaweedfs-filer-lb
  spec:
    ports:
      - name: swfs-filer
        port: 8888
        protocol: TCP
        targetPort: 8888
      - name: swfs-filer-grpc
        port: 18888
        protocol: TCP
        targetPort: 18888
      - name: swfs-s3
        port: 8333
        protocol: TCP
        targetPort: 8333
      - name: metrics
        port: 9327
        protocol: TCP
        targetPort: 9327
    selector:
      app.kubernetes.io/component: filer
      app.kubernetes.io/name: seaweedfs
    type: LoadBalancer
  EOF
  ```
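  Apply the manifest so servicelb provisions the LoadBalancer:

  ```sh
  kubectl apply -f service.yaml
  ```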
- Export your LoadBalancer IP address as an env var:

  ```sh
  export NODE_IP=""
  ```
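  With k3s' servicelb the external IP is normally the node's own IP. Assuming the `seaweedfs-filer-lb` service above was created, you can also read it straight from the Service status:

  ```sh
  export NODE_IP=$(kubectl get svc seaweedfs-filer-lb -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  ```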
- Create an alias for your server using your S3 CLI tool:
  - You can find the `admin_access_key_id` and `admin_secret_access_key` values in the secret `seaweedfs-s3-secret` (one way to read them out is sketched below).

  ```sh
  mc alias set seaweedfs http://$NODE_IP:8333 $admin_access_key_id $admin_secret_access_key
  ```
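  Assuming the chart stores the credentials under keys of those same names (check `kubectl describe secret seaweedfs-s3-secret` if yours differ), you could export them like this:

  ```sh
  export admin_access_key_id=$(kubectl get secret seaweedfs-s3-secret -o jsonpath='{.data.admin_access_key_id}' | base64 -d)
  export admin_secret_access_key=$(kubectl get secret seaweedfs-s3-secret -o jsonpath='{.data.admin_secret_access_key}' | base64 -d)
  ```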
- Create a bucket that will hold our demo data:

  ```sh
  mc mb seaweedfs/backups
  ```
- Add some data to the bucket:

  ```sh
  mc cp ./some-file seaweedfs/backups/
  ```
- Verify it's there:

  ```sh
  mc ls seaweedfs/backups
  ```
- Open the Web UI at http://$NODE_IP:8888 in a browser to view or add more data.
Configure scheduled backups of SeaweedFS to B2
- Create a secret containing your external S3 credentials.
  - You will need to get these from your provider (Backblaze, Wasabi, etc.):

  ```sh
  export ACCESS_KEY_ID=$(echo -n "" | base64)
  export ACCESS_SECRET_KEY=$(echo -n "" | base64)
  ```

  ```sh
  /bin/cat << EOF > backblaze-secret.yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: backblaze-credentials
  type: Opaque
  data:
    "ACCESS_KEY_ID": "$ACCESS_KEY_ID"
    "ACCESS_SECRET_KEY": "$ACCESS_SECRET_KEY"
  EOF
  kubectl apply -f backblaze-secret.yaml
  ```
- Create a secret containing a random password for restic.
  - Generate a password (one option is sketched just below):

    ```sh
    export RESTIC_PASS=""
    ```
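    Any sufficiently long random string will do; openssl is one way to produce one:

    ```sh
    export RESTIC_PASS=$(openssl rand -base64 32)
    ```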
  - Create a secret manifest:

    ```sh
    /bin/cat << EOF > restic.yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: restic-repo
    type: Opaque
    stringData:
      "password": "$RESTIC_PASS"
    EOF
    ```
  - Create the secret:

    ```sh
    kubectl apply -f restic.yaml
    ```
- Create a scheduled backup.
  - Export your S3 address:

    ```sh
    export BACKUP_S3_URL=""
    export BACKUP_S3_BUCKET=""
    ```

  - Create a manifest for the backup:

    ```sh
    /bin/cat << EOF > backup.yaml
    apiVersion: k8up.io/v1
    kind: Schedule
    metadata:
      name: schedule-backups
    spec:
      backend:
        repoPasswordSecretRef:
          name: restic-repo
          key: password
        s3:
          endpoint: "$BACKUP_S3_URL"
          bucket: "$BACKUP_S3_BUCKET"
          accessKeyIDSecretRef:
            name: backblaze-credentials
            key: ACCESS_KEY_ID
          secretAccessKeySecretRef:
            name: backblaze-credentials
            key: ACCESS_SECRET_KEY
      backup:
        schedule: '*/5 * * * *'
        keepJobs: 4
      check:
        schedule: '0 1 * * 1'
      prune:
        schedule: '0 1 * * 0'
        retention:
          keepLast: 5
          keepDaily: 14
    EOF
    ```

  - Create the backup and let it run:

    ```sh
    kubectl apply -f backup.yaml
    ```
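  You can watch the scheduled backups fire via the K8up resources and the jobs they spawn (assuming the K8up CRDs installed earlier):

  ```sh
  kubectl get schedules.k8up.io
  kubectl get backups.k8up.io
  kubectl get jobs
  ```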
Restore SeaweedFS from B2 backups
- Uninstall SeaweedFS, delete the PVCs, secrets, and scheduled backup:

  ```sh
  kubectl delete -f backup.yaml
  helm uninstall seaweedfs
  kubectl delete pvc data-default-seaweedfs-master-0
  kubectl delete pvc data-filer-seaweedfs-filer-0
  kubectl delete pvc data-seaweedfs-volume-0
  ```
- Create PVCs to hold our restored data.
  - Create a manifest for the PVCs:

    ```sh
    /bin/cat << EOF > pvc.yaml
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: swfs-volume-data
      annotations:
        "k8up.io/backup": "true"
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: swfs-master-data
      annotations:
        "k8up.io/backup": "true"
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: swfs-filer-data
      annotations:
        "k8up.io/backup": "true"
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    EOF
    ```

  - Create the PVCs:

    ```sh
    kubectl apply -f pvc.yaml
    ```
- Set up your restic credentials:

  ```sh
  # the password used in your restic-repo secret
  export RESTIC_PASSWORD=""
  # Your S3 credentials
  export AWS_ACCESS_KEY_ID=""
  export AWS_SECRET_ACCESS_KEY=""
  export RESTIC_REPOSITORY="s3://$BACKUP_S3_URL/$BACKUP_S3_BUCKET"
  ```
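  This step uses the restic CLI directly on your workstation. If it is not already installed, your distribution's package is usually enough (a Debian/Ubuntu example; adjust for your distro):

  ```sh
  sudo apt-get install -y restic
  ```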
- Find your desired snapshot to restore:

  ```sh
  $ restic snapshots
  repository d91e9530 opened (version 2, compression level auto)
  created new cache in /home/friend/.cache/restic
  ID        Time                 Host     Tags  Paths
  --------------------------------------------------------------------------------------------
  4a25424a  2023-11-20 19:40:10  default        /data/data-default-seaweedfs-master-0
  649b25c7  2023-11-20 19:40:14  default        /data/data-filer-seaweedfs-filer-0
  99160498  2023-11-20 19:40:19  default        /data/data-seaweedfs-volume-0
  --------------------------------------------------------------------------------------------
  3 snapshots
  ```
- Use the K8up CLI or a declarative setup to restore data to the PVCs. You will need to do this for each PVC that needs to be restored.
  - Example manifest for an S3-to-PVC restore job which uses the restic snapshots shown above:

    ```sh
    /bin/cat << EOF > s3-to-pvc.yaml
    ---
    apiVersion: k8up.io/v1
    kind: Restore
    metadata:
      name: restore-volume-data
    spec:
      restoreMethod:
        folder:
          claimName: swfs-volume-data
      snapshot: "99160498"
      backend:
        repoPasswordSecretRef:
          name: restic-repo
          key: password
        s3:
          endpoint: "$BACKUP_S3_URL"
          bucket: "$BACKUP_S3_BUCKET"
          accessKeyIDSecretRef:
            name: backblaze-credentials
            key: ACCESS_KEY_ID
          secretAccessKeySecretRef:
            name: backblaze-credentials
            key: ACCESS_SECRET_KEY
    ---
    apiVersion: k8up.io/v1
    kind: Restore
    metadata:
      name: restore-master-data
    spec:
      restoreMethod:
        folder:
          claimName: swfs-master-data
      snapshot: "4a25424a"
      backend:
        repoPasswordSecretRef:
          name: restic-repo
          key: password
        s3:
          endpoint: "$BACKUP_S3_URL"
          bucket: "$BACKUP_S3_BUCKET"
          accessKeyIDSecretRef:
            name: backblaze-credentials
            key: ACCESS_KEY_ID
          secretAccessKeySecretRef:
            name: backblaze-credentials
            key: ACCESS_SECRET_KEY
    ---
    apiVersion: k8up.io/v1
    kind: Restore
    metadata:
      name: restore-filer-data
    spec:
      restoreMethod:
        folder:
          claimName: swfs-filer-data
      snapshot: "649b25c7"
      backend:
        repoPasswordSecretRef:
          name: restic-repo
          key: password
        s3:
          endpoint: "$BACKUP_S3_URL"
          bucket: "$BACKUP_S3_BUCKET"
          accessKeyIDSecretRef:
            name: backblaze-credentials
            key: ACCESS_KEY_ID
          secretAccessKeySecretRef:
            name: backblaze-credentials
            key: ACCESS_SECRET_KEY
    EOF
    ```
  - Apply the manifest:

    ```sh
    kubectl apply -f s3-to-pvc.yaml
    ```
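  The restores can be followed via the K8up Restore objects and the jobs they spawn (assuming the CRDs from earlier):

  ```sh
  kubectl get restores.k8up.io
  kubectl get jobs
  ```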
- Redeploy SeaweedFS from the existing PVCs.
  - Create a manifest that targets the PVCs we created:

    ```sh
    /bin/cat << EOF > restore-values.yaml
    master:
      enabled: true
      data:
        type: "existingClaim"
        claimName: "swfs-master-data"
      livenessProbe:
        periodSeconds: 5
      readinessProbe:
        periodSeconds: 5
    volume:
      enabled: true
      readMode: proxy
      dataDirs:
        - name: data
          type: "existingClaim"
          claimName: "swfs-volume-data"
          maxVolumes: 0
      idx: {}
      livenessProbe:
        periodSeconds: 5
      readinessProbe:
        periodSeconds: 5
    filer:
      enabled: true
      encryptVolumeData: true
      enablePVC: true
      storage: 10Gi
      defaultReplicaPlacement: "000"
      data:
        type: "existingClaim"
        claimName: "swfs-filer-data"
      s3:
        enabled: true
        enableAuth: false
        port: 8333
        httpsPort: 0
        allowEmptyFolder: false
      livenessProbe:
        periodSeconds: 5
      readinessProbe:
        periodSeconds: 5
    s3:
      enabled: false
    cosi:
      enabled: false
    EOF
    ```
  - Deploy via Helm:

    ```sh
    helm install seaweedfs . -f restore-values.yaml --wait
    ```
- Update the alias for your server:
  - Get the `admin_access_key_id` and `admin_secret_access_key` from the secret `seaweedfs-s3-secret`.

  ```sh
  mc alias set seaweedfs http://$NODE_IP:8333 $admin_access_key_id $admin_secret_access_key
  ```
- View your data:

  ```sh
  mc ls seaweedfs
  ```