Guillaume Matheron

Data scientist, PhD in computer science

Taking advantage of ZFS for smarter Proxmox backups


Let’s say we have a Proxmox cluster running ~30 VMs using ZFS as a storage backend. We want to back up each VM hourly to a remote server, and then replicate these backups to an offsite server.

Proxmox Backup Server is nicely integrated into PVE’s web GUI and can work with ZFS volumes. However, PBS is storage-agnostic: it does not take advantage of ZFS snapshots, and instead implements de-duplication using a chunk store indexed by checksum. Thanks to this chunk store, only the modified portions of a volume need to be transferred over the network to the backup server.
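
As a rough illustration of this approach (a simplified sketch, not PBS’s actual chunk format, hashing scheme or API), a checksum-indexed chunk store could look like this:

# Simplified sketch of checksum-indexed de-duplication (illustration only).
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # fixed-size chunks; the size here is arbitrary

def backup_volume(volume_path, chunk_store):
    """Hash each chunk and upload only those the store does not already hold."""
    # chunk_store maps hex digests to chunk data and stands in for the remote store
    index = []  # ordered list of digests fully describing this backup
    with open(volume_path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in chunk_store:  # only previously unseen data is transferred
                chunk_store[digest] = chunk
            index.append(digest)
    return index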

However, the full volume must still be read from disk for each backup to compute the chunk hashes and determine which chunks need to be copied. PVE can maintain an index of changed chunks, called a dirty bitmap, but this information is discarded whenever the VM or the node shuts down: if the VM disk lives on external storage, there is no guarantee that the volume is left untouched once it is out of the node’s control.

This means that in our case, full reads of the VM disks are inevitable. Worse, there does not seem to be any way to limit the read bandwidth consumed by chunk checksum computations, so whenever a dirty bitmap was lost, the resulting full-volume reads would frequently freeze our nodes.


This is especially frustrating because ZFS already tracks modified data through its snapshot mechanism, and zfs send can transfer snapshots (incrementally or not) to a remote host running ZFS.
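
In practice these primitives boil down to taking a snapshot, then piping an incremental zfs send into zfs receive on the remote host. A minimal sketch driving them from Python (the VM disk name and snapshot names below are made up, only the zbm-dest host and main/bak-zbm dataset come from the example further down):

import subprocess

def snapshot(dataset, name):
    """Create dataset@name and return the full snapshot name."""
    snap = f"{dataset}@{name}"
    subprocess.run(["zfs", "snapshot", snap], check=True)
    return snap

def send_incremental(prev_snap, new_snap, remote, remote_dataset):
    """Send only the blocks that changed between two snapshots to a remote pool."""
    send = subprocess.Popen(["zfs", "send", "-i", prev_snap, new_snap],
                            stdout=subprocess.PIPE)
    subprocess.run(["ssh", remote, "zfs", "receive", remote_dataset],
                   stdin=send.stdout, check=True)
    send.stdout.close()
    if send.wait() != 0:
        raise RuntimeError(f"zfs send failed for {new_snap}")

# Hypothetical hourly backup of one VM disk:
prev = "rpool/data/vm-100-disk-0@bak-zbm-2024-01-01-0900"
new = snapshot("rpool/data/vm-100-disk-0", "bak-zbm-2024-01-01-1000")
send_incremental(prev, new, "zbm-dest", "main/bak-zbm/vm-100-disk-0")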

This means that if our backup server uses a ZFS pool for storage, we could just take snapshots of our VMs, send them over the network incrementally and then prune them from the source nodes.

While simple in theory, abandoning PBS means we need to implement our own pruning, testing, replication and alerting logic. When replicating the backups to offsite servers with different pruning rules, we must also ensure that both sides keep at least one snapshot in common, so that incremental sync remains possible.
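
A simplified sketch of that pruning constraint (not the actual logic of the tool): a snapshot may only be deleted from a host if it has already been replicated and it is not the most recent snapshot shared with the replica.

def snapshots_to_prune(local_snaps, replica_snaps, keep_last):
    """Return local snapshots that are safe to delete.

    local_snaps is ordered from oldest to newest; keep_last is the retention
    rule, simplified here to "keep the N most recent snapshots".
    """
    replicated = set(replica_snaps)
    keep = set(local_snaps[-keep_last:])                         # normal retention
    keep.update(s for s in local_snaps if s not in replicated)   # not yet replicated
    common = [s for s in local_snaps if s in replicated]
    if common:
        keep.add(common[-1])             # base for the next incremental send
    return [s for s in local_snaps if s not in keep]

# Example: snapshots_to_prune(["h1", "h2", "h3", "h4"], ["h1", "h2", "h3"], keep_last=1)
# keeps h4 (retention, not yet replicated) and h3 (latest common), pruning h1 and h2.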

Alternatives

Some tools like Syncoid offer automated ZFS snapshot replication, but we decided to develop our own solution as a Python tool for more flexibility, mainly because I don’t know any Perl.

zrep looks like a well-documented and mature tool that provides many similar features; however, since it is not tailored to Proxmox, it does not replicate VM configuration files. On the plus side, it seems to handle locking and metadata a lot better than our tool, which has no locking whatsoever. On the other hand, zrep is designed for replication, and by default it has no way of storing a set of backups with automatic pruning.

We implemented this idea in the form of an open-source Python script called ZFS Backups Manager.

Example

On source servers running PVE:

# Backup VMs to zbm-dest
python3 zbm.py backup-pve \
  --remote zbm-dest \
  --prefix bak-zbm- \
  --local-zfs-root rpool/data \
  --remote-zfs-root main/bak-zbm \
  --remote-metadata-dir /root/backups_metadata \
  -y

On destination server zbm-dest running a bare Debian with an OpenZFS pool:

# Replicate backups to zbm-replica
python3 zbm.py replicate \
  --remote zbm-replica \
  --remote-zfs-root main/bak-zbm \
  --local-zfs-root main/bak-zbm

# Prune local backups, ensuring we keep non-replicated
# backups, and a snapshot in common with the replica
python3 zbm.py prune \
  --local-zfs-root main/bak-zbm \
  --prefix bak-zbm- \
  --keep-last 2 \
  --keep-hourly 2 \
  --remote zbm-replica \
  --remote-zfs-root main/bak-zbm \
  -y

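# Verify that the latest backup is recent enough and that
# a minimum number of snapshots is present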
python3 zbm.py verify \
  --local-zfs-root main/bak-zbm \
  --prefix bak-zbm- \
  --max-age-days 1 \
  --min-snapshots 2

On replica server zbm-replica:

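# Prune replica backups with a longer retention policy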
python3 zbm.py prune \
  --local-zfs-root main/bak-zbm \
  --prefix bak-zbm- \
  --keep-last 5 \
  --keep-hourly 24 \
  --keep-daily 30 \
  --keep-monthly 12 \
  -y

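# Verify backups on the replica as well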
python3 zbm.py verify \
  --local-zfs-root main/bak-zbm \
  --prefix bak-zbm- \
  --max-age-days 1 \
  --min-snapshots 20

