Re: btrbk question: Should I Prefer Fileserver-initiated Backups from... | linux-btrfs

Re: btrbk question: Should I Prefer Fileserver-initiated Backups from Several Hosts (Instead of Each Host Sending to the Server)?

From: Graham Cobb <hidden>
Date: 2021-09-14 09:59:48

On 12/09/2021 18:40, Dave T wrote:

Are btrbk-specific questions OK here?

I have a small LAN with a fileserver that should store backups from
each attached host on the LAN. What is the most efficient (performant)
way to do this with btrbk?

My main goal is not performance but safety - but I realise there is
always a tradeoff to be made! And security and data protection also feed
into the analysis (ransomware, personal data, etc etc).

Each host (laptops, desktops and a few other devices) does hourly
local snapshots with btrbk. Once per day, I would like to send backups
of each volume on each device to the local fileserver. This has to be
done via SSH (as NFS isn't supported by btrfs send|receive, afaik).

That is similar to my setup. But in my case the server is always in control.
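
For reference, the send|receive-over-SSH transfer that btrbk automates looks roughly like the sketch below. The function name, the DRY_RUN switch and any paths or host names passed in are illustrative assumptions, not taken from either setup described in this thread. Note that btrfs send needs a read-only snapshot and btrfs receive needs root privileges on the receiving side.

```shell
#!/bin/sh
# Sketch only: all paths and host names passed in are hypothetical.
# Usage: send_snapshot PARENT SNAPSHOT SSH_DEST TARGET_DIR
#   PARENT may be empty for a first (full) transfer.
#   Paths containing whitespace are not handled by this simple version.
send_snapshot() {
    parent=$1 snapshot=$2 dest=$3 target=$4
    if [ -n "$parent" ]; then
        send="btrfs send -p $parent $snapshot"   # incremental against parent
    else
        send="btrfs send $snapshot"              # full stream
    fi
    recv="ssh $dest btrfs receive $target"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$send | $recv"                     # only show the pipeline
    else
        $send | $recv
    fi
}
```

Calling it with DRY_RUN=1 prints the pipeline instead of running it, which is a cheap way to check the arguments before a real transfer.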

The options I'm aware of from the btrbk readme
(https://digint.ch/btrbk/doc/readme.html) are:

1. host-initiated backup to the fileserver from each host

2. fileserver-initiated backups from all hosts

My guess is that the second option is preferred. Is that correct?

Assuming I use the second option, do I need to be concerned about it
initiating a backup on a host while that host is also performing a
local hourly snapshot?

I use the second option, but I rely on btrbk on the server to take the
local snapshots on the hosts as well. In other words, btrbk software is
installed on the host but I never run it there explicitly. btrbk on the
server controls making both host and server snapshots.

What are the disadvantages of the fileserver-initiated approach?

Laptops, and other intermittently connected hosts, don't even get local
backups done unless they are connected at the time the server tries to
do them. Not a big problem for me.

If one host is offline, will the backup procedure continue on with the
other hosts it can reach at that time?

I run separate cron jobs (with separate btrbk conf files) for each host.
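
If separate cron entries are not wanted, a similar "one host failing does not stop the others" effect can be had with a small wrapper that loops over one config file per host. This is only a sketch: the directory /etc/btrbk/hosts, the function name and the BTRBK override variable are assumptions for illustration, not part of the setup described above.

```shell
#!/bin/sh
# Sketch only: assumes one btrbk config file per host in CONF_DIR.
BTRBK=${BTRBK:-btrbk}
CONF_DIR=${CONF_DIR:-/etc/btrbk/hosts}

# Run btrbk once per config file; a failure for one host is reported
# on stderr and the loop carries on with the next host.
run_all_hosts() {
    failed=0
    for conf in "$CONF_DIR"/*.conf; do
        [ -e "$conf" ] || continue          # glob matched nothing
        if ! "$BTRBK" -c "$conf" run; then
            echo "btrbk failed for $conf" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]                     # non-zero exit if any host failed
}
```

The non-zero exit status lets cron (or whatever calls the wrapper) notice that at least one host was missed.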

Since deleting snapshots can potentially be a costly operation (in
terms of performance), should I split the process into two steps,
where one step would pull the backups from each host without any
deletions, and a second step would then prune the backups according to
configured retention policies?

I don't. I just let btrbk run through the process.
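
For anyone who does want the two-step split asked about above, btrbk has a --preserve option (skip all deletions) and a prune command (only apply the retention policy). A minimal sketch, in which the function names and the BTRBK override variable are made up for illustration:

```shell
#!/bin/sh
# Sketch only: the config path is supplied by the caller.
BTRBK=${BTRBK:-btrbk}

# Phase 1: create snapshots and transfer backups, but delete nothing.
pull_without_pruning() {
    "$BTRBK" -c "$1" --preserve run
}

# Phase 2: only delete snapshots and backups that the configured
# retention policy no longer wants to keep.
prune_only() {
    "$BTRBK" -c "$1" prune
}
```

The two phases could then be scheduled at different times, for example transfers at night and pruning when the server is otherwise idle.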

How many backups (snapshots) can I safely retain for each host volume?
I would like to keep as many as possible, but I know there is a
threshold at which performance can become a problem.

On the server I use a separate btrfs filesystem for snapshots (a mixture
of btrbk snapshots and rsnapshot snapshots). It is currently 18TB (data
single, metadata raid1 on two spinning disks, with LUKS and LVM), of
which about 15TB is in use. It has about 1300 btrbk subvolumes on it
(and about 50 rsnapshot subvolumes). The btrbk jobs run (mostly at
night) using cron so I don't pay any attention to how long they take but
it isn't excessive. It doesn't seem to slow the system down or cause any
problems.

The only problem is that check (run monthly) takes a few days! I just
let it run in the background.

I don't keep many snapshots on the hosts - they take up disk space and
can cause unnecessary issues. Keep the main snapshots on the server,
with just a small number of recent ones on the host for easy access when
someone deletes the wrong file by mistake. For laptops there is a
trade-off: keep more snapshots so older data can be accessed when on the
move, or fewer so that deleted files don't hang around if the laptop is
lost.

I mount btrfs volumes on the **hosts** with these mount options:

    autodefrag,noatime,nodiratime,compress=lzo,space_cache=v2

On the hosts I use nothing special. For example:

    noatime,nodiratime,nossd

On the server, I use:


    noatime,nodiratime,compress-force=zstd:15,skip_balance,commit=15,space_cache=v2,x-systemd.mount-timeout=180s,nofail

And I have the systemd fstrim.service enabled.

The fileserver is a dedicated backup server, not a general-purpose
fileserver. I plan to use most of those same mount options. Do I need
the autodefrag option? Will autodefrag help or hurt performance in
this use-case? The following message from this list caused me some
confusion as I would have expected the opposite:

[freezes during snapshot creation/deletion -- to be expected? November
2019, 00:21:18 CET]

I don't use autodefrag or any other defrag. As these are backups I don't
see any need to improve read access, and I prefer to avoid the concern
over defrag breaking something.

quoted

So just to follow up on this, reducing the total number of snapshots and
increasing the time between their creation from hourly to once every six
hours did help a *little* bit. However, about a week ago I decided to try
an experiment and added the "autodefrag" mount option (which I don't
usually do on SSDs), and that helped *massively*. Ever since,
snapper-cleanup.service runs without me noticing at all!

Are there any other recommendations?

Don't use snapshots as your only backup. They are great for easy access
and for being bang up to date but I maintain more traditional backups as
well (using DAR, daily in my case, and encrypted and sent to a cloud
service). This is mainly in case some bug or disk problem caused me to
lose the btrfs filesystem structure of the snapshots filesystem, but it
also provides protection against a fire or similar.

Graham

P.S. Just fyi, here is an example of one of my btrbk conf files (for an
intermittently connected laptop in this particular case, others are more
complex with multiple subvolumes but they are all similar):

volume <REDACTED>:/mnt/rootdisk
 ssh_identity /root/.ssh/<REDACTED>
 snapshot_dir btrbk_snapshots
 snapshot_create onchange
 preserve_day_of_week monday

# On the disk itself only keep recent snapshots
 snapshot_preserve_min  5d
 snapshot_preserve 5d 2w
 timestamp_format long-iso

# On the backup disk keep historic monthlies
 target_preserve_min no
 target_preserve 30d 8w *m

subvolume rootfs
 target send-receive    /snapshots/<REDACTED>_snapshots
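
Before putting a config like this into cron, it can be previewed with btrbk's dryrun command, which reports what would be created, sent and deleted without changing anything. A minimal sketch; the function name and the BTRBK override variable are illustrative assumptions:

```shell
#!/bin/sh
# Sketch only: the config path is supplied by the caller.
BTRBK=${BTRBK:-btrbk}

# preview_conf CONF: show what btrbk would do with this config,
# without creating, sending or deleting any snapshot.
preview_conf() {
    if [ ! -r "$1" ]; then
        echo "cannot read config: $1" >&2
        return 2
    fi
    "$BTRBK" -c "$1" -v dryrun
}
```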
