Proxmox Best Practice Part 3 – Backup Strategies: Keep data safe

Welcome back to the best practices series for your Proxmox server or Proxmox cluster. Today we take a detailed look at backups. For this we need – as you already know – our colleague here:

Proxmox Backup Server (PBS)

PBS is Proxmox's professional backup solution and should definitely be on your radar. The main benefits are not only marketing talk, but actually make a difference in practice:

  • Deduplication: Identical data blocks are stored only once, often saving up to 90% of storage space
  • Incremental backups: Only changes are backed up, dramatically reducing your backup time
  • Encryption: Client-side AES-256-GCM encryption keeps your data protected even if someone compromises your backup server
  • Bandwidth limiting: Your backups run in the background without the production systems suffering
  • Verify jobs: Automatic integrity checks – you always know whether your backups are actually usable
  • Retention policies: Intelligent deletion rules ensure that you don't sink into backup chaos
  • Web GUI: A modern user interface that is actually fun to use

PBS Setup (basic configuration)

Hardware Requirements and Planning

Before you start, you should think about the hardware. A PBS server is only as good as the hardware it runs on. Here are realistic minimum requirements that have proven themselves in practice:

Minimum hardware recommendations for PBS:

  • CPU: 4 cores (more if you want to back up many VMs in parallel)
  • RAM: 8GB absolute minimum, 16GB+ if you're serious about it
  • Storage: A fast SSD for metadata, large HDDs for the actual backup data
  • Network: Gigabit Ethernet as the bare minimum; better 2.5/5 GbE, or 10 GbE if you constantly move large amounts of data
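To sanity-check the network requirement, it helps to roughly estimate how long a full backup run would take at a given link speed. A minimal sketch – the helper name and the throughput figures in the comments are my own illustrative assumptions:

```shell
# Rough estimate of how long moving a given amount of backup data takes.
# Realistic sustained throughput is roughly 110 MB/s on 1 GbE,
# ~280 MB/s on 2.5 GbE and ~1100 MB/s on 10 GbE.
estimate_backup_minutes() {
    local data_gb=$1      # amount of data in GB
    local mb_per_s=$2     # sustained throughput in MB/s
    # GB -> MB, divide by throughput (seconds), then by 60 for minutes
    echo $(( data_gb * 1024 / mb_per_s / 60 ))
}

# 2 TB of VM data over 1 GbE takes roughly 5 hours:
estimate_backup_minutes 2048 110
```

Numbers like these make it obvious why 10 GbE pays off once you move terabytes every night.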

The installation itself is pleasantly uncomplicated. You should definitely install PBS on a separate physical server – a virtualised PBS on the same Proxmox cluster it is supposed to back up is like an empty fire extinguisher in a burning house.

# Add Proxmox Backup Server repository
echo "deb http://download.proxmox.com/debian/pbs bookworm pbs-no-subscription" > /etc/apt/sources.list.d/pbs.list
wget https://enterprise.proxmox.com/debian/proxmox-release-bookworm.gpg -O /etc/apt/trusted.gpg.d/proxmox-release-bookworm.gpg
apt update && apt install proxmox-backup-server

# Initial configuration - create an admin user
proxmox-backup-manager user create backup@pbs --email admin@example.com
proxmox-backup-manager acl update / Admin --auth-id backup@pbs

Optimize storage configuration

Storage configuration is at the heart of your backup system, so it is worth investing some time here. ZFS is often the best choice because it gives you snapshots, compression and checksumming out of the box. We already discussed this in detail in the second part (Storage).

# Create a ZFS pool for optimum performance:
# mirror two disks and use NVMe partitions for cache and log
zpool create backup-pool \
  mirror /dev/sdb /dev/sdc \
  cache /dev/nvme0n1p1 \
  log /dev/nvme0n1p2

# Datastore with sensible retention settings
proxmox-backup-manager datastore create backup-storage /backup-pool/data \
  --gc-schedule 'daily' \
  --prune-schedule 'daily' \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 12 \
  --keep-yearly 2 \
  --notify-user backup@pbs

Configure backup jobs

Basic configuration

Now it's getting interesting – the actual backup jobs. This is where you decide whether your backup system will become a reliable companion or a source of constant frustration!

The CLI variant gives you much more control over the parameters. The bandwidth limit is particularly important – no one wants backups to bring the rest of the network to its knees.

# Backup job with all important parameters
pvesh create /cluster/backup \
  --schedule "02:00" \
  --storage backup-pbs \
  --mode snapshot \
  --all 1 \
  --compress zstd \
  --protected 1 \
  --notes-template "{{cluster}}: {{guestname}} - {{job-id}}" \
  --mailnotification always \
  --mailto admin@example.com \
  --bwlimit 50000   # 50 MB/s bandwidth limit

Alternatively, you can also do the whole thing conveniently via the web interface: Datacenter → Backup → Add. This is often clearer, especially for beginners.

Important parameters explained

These parameters are crucial for a functioning backup system, so here are the most important in detail:

Schedule: This uses the standard cron format. 0 2 * * * means every day at 2:00 a.m. Plan your backup windows so they don't collide with other maintenance-heavy tasks.

Modes: There are three options, each with its advantages and disadvantages:

  • snapshot: The VM keeps running and you get a consistent snapshot. Ideal for production systems.
  • suspend: The VM is briefly paused. Maximum consistency, but a short interruption.
  • stop: The VM is shut down. Only recommended for non-critical systems.

Compress: Here you have to balance speed against efficiency:

  • lz4 is fast and needs little CPU, but compresses less
  • zstd compresses much better, but needs more computing power

Protected: Protects the backup from accidental deletion. Always enable it for important backups.

BWLimit: Bandwidth limit in KB/s. 50000 corresponds to about 50 MB/s – adapt this to your network infrastructure.
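Since --bwlimit is given in KB/s but you usually think in MB/s, a tiny conversion helper can prevent off-by-a-factor mistakes. A sketch using the same decimal convention as the example above (50 MB/s ≈ 50000 KB/s); the function name is my own:

```shell
# Convert a MB/s target into the KB/s value for --bwlimit
# (decimal convention: 1 MB/s = 1000 KB/s, matching 50 MB/s ~ 50000)
mb_to_bwlimit() {
    echo $(( $1 * 1000 ))
}

mb_to_bwlimit 50   # -> 50000
```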

Enable backup verification

A backup without verification is like a parachute without a test certificate – you only find out that it doesn't work when you really need it, which in the worst case is too late. PBS helps you here because it can automatically check the integrity of your backups.

# Verify job for automatic backup integrity checks
pvesh create /admin/verify \
  --store backup-storage \
  --schedule 'weekly' \
  --outdated-after 7 \
  --ignore-verified

The verify job runs once a week and re-checks backups whose last verification is more than 7 days old. Already verified backups are skipped to save time.

Implement 3-2-1 backup rule

Understanding the 3-2-1 Rule

The 3-2-1 rule is not just a marketing gimmick, but proven practice from decades of experience with data loss. It states:

  • 3 copies of your data (1 original + 2 backups)
  • 2 different media/storage types (not everything on the same technology)
  • 1 offsite backup (geographically separated – fire, flood, theft)

Practical implementation

In a typical Proxmox environment, this could look like this:

  1. Primary data: On your Proxmox server with ZFS and redundancy
  2. Local backup: On the PBS server in the same data center/office
  3. Offsite backup: PBS sync to an external site or cloud provider

The remote configuration is relatively straightforward, but you have to pay attention to the network connection. An unstable connection can turn your sync jobs into a nightmare.

# Configure remote backup target - fingerprint comes from the target server
pvesh create /admin/remote \
  --name offsite-backup \
  --host backup.external.com \
  --userid backup@pbs \
  --password "secure_password" \
  --fingerprint AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD

# Sync job with bandwidth limit - runs at night when traffic is low
pvesh create /admin/sync \
  --remote offsite-backup \
  --remote-store production \
  --store local-backup \
  --schedule "04:00" \
  --rate-in 10000 \
  --burst-in 15000 \
  --comment "Nightly offsite sync"

Advanced backup strategies

Configure backup retention

Retention policies are often underestimated, but extremely important. Without meaningful rules, you accumulate gigabytes of old backups over the years that no one needs anymore. At the same time, you don't want to accidentally delete the only usable backup from last month.

A proven strategy is the 'grandfather-father-son' method: many recent backups, fewer older ones, very few ancient ones. PBS handles this automatically for you.

# Configure a tiered retention policy:
#   keep-last 3:     always keep the last 3 backups (emergencies)
#   keep-hourly 24:  24 hourly backups (last day)
#   keep-daily 7:    7 daily backups (last week)
#   keep-weekly 4:   4 weekly backups (last month)
#   keep-monthly 12: 12 monthly backups (last year)
#   keep-yearly 5:   5 yearly backups (long-term archive)
proxmox-backup-manager datastore update backup-storage \
  --keep-last 3 \
  --keep-hourly 24 \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 12 \
  --keep-yearly 5
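To get a feel for which tier ends up covering a backup of a given age under a policy like this, here is an illustrative sketch. Note that this is a deliberate simplification for intuition, not the actual PBS prune algorithm, and the function name is my own:

```shell
# Illustrative only: which retention tier would typically still cover a
# backup of a given age (in days) under a grandfather-father-son policy.
retention_tier() {
    local age_days=$1
    if   [ $age_days -le 1 ];   then echo "hourly"
    elif [ $age_days -le 7 ];   then echo "daily"
    elif [ $age_days -le 28 ];  then echo "weekly"
    elif [ $age_days -le 365 ]; then echo "monthly"
    else echo "yearly"
    fi
}

retention_tier 3   # a 3-day-old backup falls into the daily tier
```

So a three-day-old backup survives via the daily rule, a three-week-old one via the weekly rule, and so on – the older a backup gets, the coarser the granularity at which copies are kept.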

Bandwidth management

In production environments, you often have to work with limited bandwidth. Backups should not slow down production systems, but they shouldn't take forever either. Traffic shaping helps you find the right balance.

# Set up traffic shaping on the PBS server
tc qdisc add dev eth0 root handle 1: htb default 30
tc class add dev eth0 parent 1: classid 1:1 htb rate 1000mbit
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 800mbit ceil 1000mbit
tc class add dev eth0 parent 1:1 classid 1:20 htb rate 200mbit ceil 400mbit

# Classify PBS backup traffic (port 8007) into the limited class:
# iptables -A OUTPUT -p tcp --dport 8007 -j CLASSIFY --set-class 1:20

This configuration reserves 80% of the bandwidth for normal traffic and limits backups to a maximum of 40% – with the option of borrowing unused bandwidth.
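The rate/ceil values follow directly from the link speed, so if your uplink is not 1 Gbit you can simply recompute them. A small sketch (the helper name is my own assumption):

```shell
# Compute an htb rate as a percentage of the link speed (in mbit)
htb_rate() {
    local link_mbit=$1
    local percent=$2
    echo "$(( link_mbit * percent / 100 ))mbit"
}

# For a 1000 mbit link: 80% for normal traffic, 20%/40% rate/ceil for backups
htb_rate 1000 80   # -> 800mbit
htb_rate 1000 20   # -> 200mbit
htb_rate 1000 40   # -> 400mbit
```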

Backup integration with storage

Storage-specific optimizations

Different storage types have different strengths. Use this to your advantage by storing metadata and large blocks of data on different media.

# Separate datastores for different requirements

# Fast NVMe for metadata and small files
proxmox-backup-manager datastore create vm-backups-meta \
  /nvme/backup-meta \
  --gc-schedule 'daily' \
  --prune-schedule 'daily'

# Large RAID for the actual backup data
proxmox-backup-manager datastore create vm-backups-data \
  /raid/backup-data \
  --gc-schedule 'weekly' \
  --keep-daily 14 \
  --keep-weekly 8 \
  --keep-monthly 24

# Backup job that excludes unnecessary files - saves space and time
pvesh create /cluster/backup \
  --schedule "02:00" \
  --storage backup-pbs \
  --mode snapshot \
  --compress zstd \
  --notes-template "{{cluster}}: {{guestname}} ({{mode}})" \
  --protected 1 \
  --exclude-path "/tmp/*,/var/log/*,/var/cache/*"

Backup strategies by storage type

LVM-Thin Snapshots

LVM thin volumes are widely used and provide simple snapshot functionality. The trick is to keep snapshots only as long as you really need them – they can hurt performance.

# Automated LVM snapshot backup with error handling
create_lvm_backup() {
    local vmid=$1
    local disk_path="/dev/pve/vm-${vmid}-disk-0"
    local snap_name="vm-${vmid}-snap-$(date +%Y%m%d-%H%M%S)"

    echo "Creating backup for VM $vmid..."

    # Check whether there is enough free space for the snapshot
    if ! lvcreate --test --snapshot --name $snap_name --size 10G $disk_path; then
        echo "Error: not enough free space for snapshot!"
        return 1
    fi

    # Create the snapshot - this is near-instantaneous
    lvcreate --snapshot --name $snap_name --size 10G $disk_path

    # Back up from the snapshot with a progress indicator
    dd if=/dev/pve/$snap_name bs=64k status=progress | \
        gzip -c > /backup/vm-${vmid}-$(date +%Y%m%d).img.gz

    # Remove the snapshot immediately to preserve performance
    lvremove -f /dev/pve/$snap_name

    echo "Backup for VM $vmid completed."
}

# Usage for VM 100:
# create_lvm_backup 100

ZFS Snapshots

ZFS is much more elegant here – snapshots cost practically nothing and you can make incremental backups. This saves a lot of time and bandwidth, especially with large VMs.

# Intelligent ZFS incremental backup strategy
zfs_incremental_backup() {
    local dataset=$1
    local remote_host=$2
    local current_snap="${dataset}@backup-$(date +%Y%m%d-%H%M%S)"
    local last_snap=$(zfs list -t snapshot -o name -s creation | grep ${dataset}@ | tail -1)

    echo "Creating ZFS backup for $dataset..."

    # Create a new snapshot - takes milliseconds
    zfs snapshot $current_snap

    if [ -n "$last_snap" ]; then
        echo "Sending incremental backup since $last_snap"
        # Transfer only the changes - much faster
        zfs send -i $last_snap $current_snap | \
            ssh $remote_host "zfs receive -F backup/${dataset##*/}"
    else
        echo "Sending initial full backup"
        # The first backup is always a full one
        zfs send $current_snap | \
            ssh $remote_host "zfs receive backup/${dataset##*/}"
    fi

    # Clean up snapshots - keep only the last 7
    local old_snaps=$(zfs list -t snapshot -o name -s creation | grep ${dataset}@ | head -n -7)
    if [ -n "$old_snaps" ]; then
        echo "Deleting old snapshots..."
        echo "$old_snaps" | while read snap; do
            zfs destroy $snap
        done
    fi

    echo "ZFS backup for $dataset completed."
}

# Automation for all VM datasets
zfs_replicate_all() {
    echo "Starting ZFS replication for all VMs..."
    for dataset in $(zfs list -H -o name | grep "tank/vm-"); do
        zfs_incremental_backup $dataset backup-server.local
    done
    echo "ZFS replication completed."
}

Monitoring and Alerting

Monitor backup status

A backup system without monitoring is like a smoke detector without batteries – you only notice that it doesn't work when it's too late. You should therefore set up automatic checks.

#!/bin/bash
# Backup status monitoring script
check_backup_status() {
    echo "Checking backup status..."

    # Look for failed jobs in the last 24 hours
    local failed_jobs=$(pvesh get /cluster/backup --output-format json | \
        jq -r '.[] | select(.state == "error") | .id')

    if [ -n "$failed_jobs" ]; then
        echo "ALERT: backup jobs failed: $failed_jobs"

        # Send an e-mail
        echo "Backup jobs $failed_jobs failed. Please investigate!" | \
            mail -s "BACKUP FAILURE ALERT" admin@example.com

        # Optional: push an alert to a monitoring system
        curl -X POST "https://monitoring.example.com/alert" \
            -H "Content-Type: application/json" \
            -d "{\"message\":\"Backup failures: $failed_jobs\", \"severity\":\"critical\"}"
        return 1
    else
        echo "All backup jobs are running normally."
        return 0
    fi
}

# Run as a cron job every 2 hours:
# 0 */2 * * * /usr/local/bin/check_backup_status.sh

Proxmox Backup Server Health Check

Regular health checks will help you identify problems before they become critical. You should keep an eye on storage space, performance and service status. For example, it might look like this:

# Comprehensive PBS health check
pbs_health_check() {
    echo "=== PBS Health Check $(date) ==="

    # Datastore status and storage
    echo "--- Datastore Status ---"
    proxmox-backup-manager datastore list

    echo "--- Storage ---"
    df -h | grep -E "(backup|pbs)" | while read line; do
        usage=$(echo $line | awk '{print $5}' | tr -d '%')
        if [ $usage -gt 85 ]; then
            echo "WARNING: $line - storage almost full!"
        else
            echo "OK: $line"
        fi
    done

    # Running processes
    echo "--- Active backup processes ---"
    local active_jobs=$(ps aux | grep -E "(proxmox-backup|pbs)" | grep -v grep | wc -l)
    echo "Active jobs: $active_jobs"

    # Verify job status
    echo "--- Verify Job Status ---"
    pvesh get /admin/verify --output-format json | \
        jq -r '.[] | "Job \(.id): \(.state) (last run: \(.last_run_endtime // "never"))"'

    # Test network performance to remote targets
    echo "--- Network Tests ---"
    if command -v iperf3 > /dev/null; then
        echo "Testing bandwidth to backup targets..."
        # iperf3 -c backup-remote.com -t 10 -f M
    fi

    echo "=== Health Check End ==="
}

Disaster recovery planning

Full system recovery

The day will come when you have to restore your entire system. Yes, of course, that never happens – until it does. Preparation is everything here: regularly test your disaster recovery plans in an isolated environment.

# Full cluster recovery after a total failure
restore_cluster() {
    local backup_location=$1

    echo "Starting cluster recovery from $backup_location"

    # 1. Prepare a fresh Proxmox installation
    echo "Step 1: base system installed (manual step)"

    # 2. Restore the cluster configuration
    echo "Step 2: restoring the cluster configuration..."
    if [ -f "$backup_location/cluster-config.tar.gz" ]; then
        tar -xzf "$backup_location/cluster-config.tar.gz" -C /etc/pve/
        systemctl reload pve-cluster
    fi

    # 3. Restore the storage configuration
    echo "Step 3: restoring storage pools..."
    # Adapt this to your specific storage configuration

    # 4. Restore VMs individually
    echo "Step 4: VM recovery..."
    if [ -f "$backup_location/vm-list.txt" ]; then
        while read vmid; do
            echo "Restoring VM $vmid..."
            local backup_file="$backup_location/vm-${vmid}-latest.vma.gz"
            if [ -f "$backup_file" ]; then
                qmrestore "$backup_file" $vmid --storage local-lvm
                echo "VM $vmid restored"
            else
                echo "WARNING: backup for VM $vmid not found!"
            fi
        done < "$backup_location/vm-list.txt"
    fi

    echo "Cluster recovery completed. Please check the services!"
}

Automate backup validation

Did I mention this today? Never mind, it can't be said often enough: regularly testing your backups is essential. Nothing is more frustrating than discovering a corrupt backup only in an emergency.

PBS will verify backups for you automatically, but I still find it hard to trust that 100%. At the very least, you should do actual restores from time to time, e.g. into your test/staging environment – only then can you really sleep soundly.

# Automated backup tests with a test VM
backup_validation() {
    local test_vmid=9999          # reserved VM ID for tests
    local datastore="backup-storage"

    echo "Starting backup validation..."

    # Find the latest backup
    local latest_backup=$(proxmox-backup-client list --repository $datastore | \
        grep "vm/$test_vmid" | sort -r | head -1 | awk '{print $1}')

    if [ -z "$latest_backup" ]; then
        echo "No backup found for the test VM, using any VM backup"
        latest_backup=$(proxmox-backup-client list --repository $datastore | \
            grep "vm/" | head -1 | awk '{print $1}')
    fi

    if [ -z "$latest_backup" ]; then
        echo "ERROR: no backups found!"
        return 1
    fi

    echo "Testing backup: $latest_backup"

    # Test restore (without starting the VM yet)
    if qmrestore --archive "$datastore:backup/$latest_backup" $test_vmid --storage local-lvm; then
        echo "Restore successful, starting test VM..."

        # Test-start the VM
        qm start $test_vmid
        sleep 30

        if qm status $test_vmid | grep -q "running"; then
            echo "SUCCESS: backup validation passed - VM boots correctly"
            qm stop $test_vmid
            qm destroy $test_vmid --purge
            return 0
        else
            echo "ERROR: VM does not start after restore"
            qm destroy $test_vmid --purge
            return 1
        fi
    else
        echo "ERROR: restore failed"
        return 1
    fi
}

Performance optimization

Backup performance tuning

Performance tuning in backups is often a balancing act between speed, system load and network impact. Here are proven settings for different scenarios.

# Datacenter-wide backup settings, optimized for a homelab
# /etc/pve/datacenter.cfg
max_workers: 3        # no more than 3 parallel backup jobs
bandwidth_limit: 200  # 200 MB/s total bandwidth for all jobs
ionice: 7             # lowest I/O priority (0-7)
lockwait: 180         # wait 3 minutes for locks

These settings prevent your backups from overloading your production systems. With more powerful hardware you can raise the values accordingly – the example above reflects a modest homelab setup without high-performance server hardware. On a reasonably up-to-date server it could look like this instead:

# Datacenter-wide backup settings, optimized for a server
# /etc/pve/datacenter.cfg
max_workers: 6         # no more than 6 parallel backup jobs
bandwidth_limit: 1500  # 1500 MB/s total bandwidth for all jobs (NVMe storage)
ionice: 4              # medium I/O priority (0-7)
lockwait: 60           # wait 1 minute for locks

Storage-specific optimizations

Different storage technologies need different optimizations. What's good for NVMe SSDs can be counterproductive with rotating hard drives. Here are a few examples:

# Tune I/O schedulers for different storage types

# For NVMe SSDs: mq-deadline is usually optimal
echo mq-deadline > /sys/block/nvme0n1/queue/scheduler

# For SATA SSDs: none or mq-deadline
echo none > /sys/block/sda/queue/scheduler

# For rotating hard drives: bfq or cfq
echo bfq > /sys/block/sdb/queue/scheduler

# Increase readahead for large sequential reads
blockdev --setra 4096 /dev/sda   # 2 MB readahead

# Often helpful for backup workloads
echo 32 > /sys/block/sda/queue/nr_requests

Troubleshoot common problems

Hanging backup jobs

Hanging backup jobs are a classic problem. This is usually due to locks, network problems or overloaded storage systems.

# Identify and handle hanging backup jobs
find_hanging_jobs() {
    echo "Searching for hanging backup jobs..."

    # Long-running vzdump processes
    ps aux | grep vzdump | grep -v grep | while read line; do
        pid=$(echo $line | awk '{print $2}')
        runtime=$(ps -o etime= -p $pid | tr -d ' ')
        echo "Job PID $pid has been running for: $runtime"

        # Jobs running longer than 6 hours are suspicious
        if [[ $runtime =~ ^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$ ]]; then
            echo "WARNING: job $pid has been running very long!"
        fi
    done

    # Check lock files
    if [ -f /var/lock/vzdump.lock ]; then
        echo "vzdump lock found, checking process..."
        local lock_pid=$(cat /var/lock/vzdump.lock)
        if ! kill -0 $lock_pid 2>/dev/null; then
            echo "Lock file is orphaned, deleting it..."
            rm /var/lock/vzdump.lock
        fi
    fi
}

# Emergency cleanup for completely stuck systems
emergency_backup_cleanup() {
    echo "EMERGENCY: killing all backup processes!"
    killall -9 vzdump
    killall -9 proxmox-backup-client
    rm -f /var/lock/vzdump.lock
    rm -f /tmp/vzdumptmp*
    echo "Cleanup completed - check the logs!"
}

Storage space management

Full backup datastores are also a common problem. Garbage collection and pruning should run automatically, but sometimes you still need to intervene manually.

# Diagnose and fix storage space problems
storage_cleanup() {
    local datastore=$1

    echo "Storage space analysis for $datastore..."

    # Current storage usage
    df -h $(proxmox-backup-manager datastore list | grep $datastore | awk '{print $3}')

    # Find the largest backup groups
    echo "Largest backup groups:"
    du -sh /backup/$datastore/* | sort -hr | head -10

    # Run garbage collection manually
    echo "Starting garbage collection..."
    proxmox-backup-manager garbage-collection start $datastore

    # Find orphaned chunks
    echo "Searching for orphaned chunks..."
    proxmox-backup-manager verify $datastore

    # Storage usage after cleanup
    echo "Storage usage after cleanup:"
    df -h $(proxmox-backup-manager datastore list | grep $datastore | awk '{print $3}')

    # Run prune jobs manually if they are not running automatically
    echo "Running prune jobs..."
    proxmox-backup-client prune --repository $datastore \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 12
}

# Storage monitoring with alerts
monitor_storage_space() {
    for datastore in $(proxmox-backup-manager datastore list | awk 'NR>1 {print $1}'); do
        local path=$(proxmox-backup-manager datastore list | grep $datastore | awk '{print $3}')
        local usage=$(df "$path" | awk 'NR==2 {print $5}' | tr -d '%')

        echo "Datastore $datastore: ${usage}% used"

        if [ $usage -gt 90 ]; then
            echo "CRITICAL: $datastore is over 90% full!"
            # Old backups could be deleted automatically here
        elif [ $usage -gt 80 ]; then
            echo "WARNING: $datastore is over 80% full"
        fi
    done
}

Diagnosing network problems

Network issues can also be the reason for slow or failed backups. Here are tools and techniques to help you diagnose them.

# Comprehensive network diagnostics for backup connections
network_backup_diagnosis() {
    local target_host=$1

    echo "=== Network diagnosis for $target_host ==="

    # Basic reachability
    echo "--- Ping Test ---"
    if ping -c 5 $target_host > /dev/null; then
        echo "Host $target_host is reachable"
        ping -c 5 $target_host | tail -1
    else
        echo "ERROR: host $target_host not reachable!"
        return 1
    fi

    # Bandwidth test with iperf3
    echo "--- Bandwidth Test ---"
    if command -v iperf3 > /dev/null; then
        echo "Testing bandwidth to $target_host..."
        timeout 30s iperf3 -c $target_host -t 20 -f M 2>/dev/null | \
            grep "sender" | awk '{print "Upload: " $7 " " $8}'
        timeout 30s iperf3 -c $target_host -t 20 -f M -R 2>/dev/null | \
            grep "receiver" | awk '{print "Download: " $7 " " $8}'
    else
        echo "iperf3 not installed - install with: apt install iperf3"
    fi

    # Latency test
    echo "--- Latency Analysis ---"
    local avg_latency=$(ping -c 100 $target_host | tail -1 | awk -F'/' '{print $5}')
    echo "Average latency: ${avg_latency}ms"
    if (( $(echo "$avg_latency > 100" | bc -l) )); then
        echo "WARNING: high latency could hurt backup performance"
    fi

    # Port availability for PBS
    echo "--- Port Tests ---"
    for port in 8007 8008; do
        if timeout 5s bash -c "</dev/tcp/$target_host/$port"; then
            echo "Port $port: OPEN"
        else
            echo "Port $port: CLOSED or filtered"
        fi
    done

    # MTU discovery
    echo "--- MTU Test ---"
    for size in 1472 1500 9000; do
        if ping -c 1 -M do -s $size $target_host > /dev/null 2>&1; then
            echo "MTU $size: OK"
        else
            echo "MTU $size: fragmentation required"
        fi
    done
}

SSL/TLS Certificate Issues

PBS uses HTTPS for all connections. Certificate issues are a common stumbling block, especially with self-signed certificates.

# SSL certificate diagnosis and repair
check_pbs_certificates() {
    local pbs_host=$1

    echo "=== SSL certificate check for $pbs_host ==="

    # Fetch certificate details
    echo "--- Certificate Information ---"
    local cert_info=$(echo | openssl s_client -servername $pbs_host -connect $pbs_host:8007 2>/dev/null | \
        openssl x509 -noout -dates -subject -issuer)
    echo "$cert_info"

    # Check certificate validity
    local expiry_date=$(echo "$cert_info" | grep "notAfter" | cut -d'=' -f2)
    local expiry_seconds=$(date -d "$expiry_date" +%s)
    local now_seconds=$(date +%s)
    local days_until_expiry=$(( (expiry_seconds - now_seconds) / 86400 ))

    echo "Certificate expires in: $days_until_expiry days"
    if [ $days_until_expiry -lt 30 ]; then
        echo "WARNING: certificate expires soon!"
    fi

    # Fingerprint for remote configuration
    echo "--- Fingerprint for remote configuration ---"
    local fingerprint=$(echo | openssl s_client -servername $pbs_host -connect $pbs_host:8007 2>/dev/null | \
        openssl x509 -noout -fingerprint -sha256 | cut -d'=' -f2)
    echo "Fingerprint: $fingerprint"
}

# Generate a new self-signed certificate
regenerate_pbs_certificate() {
    local pbs_host=$1

    echo "Generating a new self-signed certificate for $pbs_host..."

    # Back up the old certificate
    cp /etc/proxmox-backup/proxy.pem /etc/proxmox-backup/proxy.pem.backup.$(date +%Y%m%d)

    # Create a new certificate
    openssl req -x509 -newkey rsa:4096 -keyout /tmp/proxy.key -out /tmp/proxy.crt \
        -days 3650 -nodes -subj "/CN=$pbs_host"

    # Combine certificate and key
    cat /tmp/proxy.crt /tmp/proxy.key > /etc/proxmox-backup/proxy.pem

    # Delete temporary files
    rm /tmp/proxy.key /tmp/proxy.crt

    # Restart the service
    systemctl restart proxmox-backup-proxy

    echo "New certificate installed. New fingerprint:"
    check_pbs_certificates $pbs_host | grep "Fingerprint:"
}

Best Practices TL;DR

You should definitely pay attention to this

Here are the key points that determine the success or failure of your backup system:

Hardware & Infrastructure:

  • PBS always on separate hardware – never on the system you back up
  • Sufficient RAM for deduplication (at least 16GB for production environments)
  • Fast SSDs for metadata, large HDDs for backup data
  • Redundant network connections if possible

Configuration:

  • Configure retention policies correctly from the outset – fixing them afterwards is tedious
  • Enable bandwidth limitation so as not to interfere with productive systems
  • Set up Verify jobs – an untested backup is not a backup
  • Enable email notifications for all critical events

Monitoring:

  • Automatic health checks at least daily
  • Storage monitoring with alerts at >80% usage
  • Regular test restores in isolated environment
  • Regularly review log files – take warnings seriously

Security:

  • Enable client-side encryption for sensitive data
  • Separate credentials for backup users - no admin rights
  • Firewall rules only for necessary ports (8007, 8008)
  • Offsite backups with secure transfer (SSH keys instead of passwords)
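To illustrate the firewall point, here is a minimal iptables sketch. The function name and the management subnet are placeholders of mine – adapt them to your environment, or use the built-in Proxmox firewall instead:

```shell
# Allow the PBS ports (8007/8008, as listed above) only from a trusted
# management subnet and drop everything else.
# 192.168.1.0/24 is a placeholder - adapt it to your network.
allow_pbs_ports() {
    local subnet=${1:-192.168.1.0/24}
    iptables -A INPUT -p tcp -s "$subnet" --dport 8007 -j ACCEPT
    iptables -A INPUT -p tcp -s "$subnet" --dport 8008 -j ACCEPT
    iptables -A INPUT -p tcp --dport 8007 -j DROP
    iptables -A INPUT -p tcp --dport 8008 -j DROP
}
```

The order matters: the ACCEPT rules for the trusted subnet must come before the DROP rules, since iptables evaluates rules top to bottom.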

Avoiding Frequent Beginner Mistakes

The ‘virtualised PBS’ mistake: Never run PBS on the same Proxmox cluster it is supposed to back up. That's like a seat belt made of chewing gum.

The ‘set and forget’ mentality: Backups need regular maintenance. Check at least monthly that everything is working.

Lack of bandwidth limitation: Backup jobs without limits can paralyze your production network.

No test restores: You don't know whether your backups work until you test them. Murphy's Law applies to backups in particular.

Unclear retention policies: Define from the beginning how long you want to keep something. Infinitely growing backup archives are a nightmare.

With these basics, you have a solid foundation for your Proxmox backup system. Invest time in a proper configuration – your future self will thank you, either with quiet, relaxed sleep or, at the latest, when the first emergency occurs.


Wrap-up and further resources

Proxmox is a powerful tool, but with great power comes great responsibility. The best practices shown here are the result of practical experience. Start with the basics and gradually work your way up to the advanced features.

Your next steps:

  1. Build a testing/staging environment: Test all configurations in a separate environment
  2. Implement monitoring: Monitor your system from the beginning
  3. Test your backup strategy: Perform regular restore tests
  4. Join the Community: The Proxmox forum is very helpful

So remember: take your time and understand the basics before you move on to more complex setups. The Proxmox admin guide, which I have linked several times in this article, is also worth its weight in gold as a reference. Have a look around the forum if you have a question. The official YouTube channel is another good entry point.

I have also linked the remaining parts of this article series here for you again: Part 1: network | Part 2: storage | Part 3: backup | Part 4: security | Part 5: performance