Borg Backups on Linode Object Storage

Posted on March 13, 2020 in Storage and Backups, Sysadmin



My trials using a mounted s3 bucket as a destination for remote Borg backups

I started using Borg for backups almost two years ago now, and I couldn't be happier. Between its deduplication and compression, I've been able to back up ~10 hosts with 3 months' worth of weekly snapshots (plus 7 daily snaps) in under 200G. My backup server is hosted by Backupsy, and I highly recommend them if you're looking for a solid VPS provider for storage purposes. While they aren't the fastest, I can't remember any downtime in the last two years, and they run a persistent 40% off campaign that you can take advantage of to get a pretty cheap VPS. They do charge an extra $2/month to use your VPS for any non-backup services, which I happily pay knowing I can run whatever the hell I want on it. Keep in mind that Borg can encrypt data at rest (and you should), so even if your VPS provider accessed the drives, they'd need the encryption key or passphrase to get at your data. All in all, I'm paying $8/mo for 250G of storage on a fully functional VPS, and that's a pretty good deal.

But I started to get curious about the possibility of using s3fs to mount an S3 bucket (or an S3-compatible one, anyway) as a destination for Borg backups. Having a fixed 250G is great, but what if I wanted to expand beyond that? I'd have to move to Backupsy's next tier, which is 500G and double the price. But it might take me another 3 years to actually accumulate 500G to back up, or maybe I'd just hover around 300G and be paying for 500G for a long time, which isn't necessarily a great use of resources. The benefit of object storage is that while some providers have a minimum bucket size, beyond that you usually pay a fixed cost per unit of storage. So, in walks Linode with a free object storage offer, and I figured it was the right time to test the waters.

Before I continue, I will note that I am NOT sponsored by Linode in any way, and they did not encourage me to write this post. I do love Linode though and use them for almost all of my VPS needs, and I highly recommend them.

The Setup

The nice thing about using a Backupsy VPS is that it's VPS + storage all-in-one, so you set up the VPS, write to your root partition, and life is good. In this case, though, I'm trying to use Linode's object storage, and while I could mount a bucket on each host I want to back up, I don't want my production hosts to bear the CPU load of s3fs (which in the past I have found to be a real issue). So for the rest of this guide, know that I have already provisioned a Linode Nanode and configured the server per the guidelines Brent put together here. I stopped before actually performing any backups, because first I'm going to have to create a storage bucket, install s3fs-fuse, and mount the bucket. I also configured DNS, and my examples use backups.jthan.io. Oh, and I use Arch Linux, but you should use whatever you're comfortable with and adapt the rest accordingly!
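
For context, the part of that setup that matters here is having a dedicated user on the backup server for each client, with its SSH key pinned to borg serve. A rough sketch of what the authorized_keys entry for my shaco user could look like (the key blob is a placeholder; adjust the restricted path to wherever your repos will live):

# ~shaco/.ssh/authorized_keys on backups.jthan.io - only allow this key to run borg serve
command="borg serve --restrict-to-path /mnt/s3/backups",restrict ssh-ed25519 AAAA...replace-with-client-public-key... jonathan@shaco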

Install s3fs-fuse

On Arch, s3fs-fuse is provided as a package and can be installed with pacman: pacman -S s3fs-fuse. At this point, most distros have s3fs-fuse available through their package managers, so if you don't use Arch you'll still be okay!
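
If you're not on Arch, the package name varies a bit by distro; roughly like this, but double-check against your own package manager:

# Debian/Ubuntu (the package is just called s3fs)
apt install s3fs
# Fedora, or RHEL/CentOS with EPEL enabled
dnf install s3fs-fuse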

Configure a new bucket

I'm not going to rehash the how here - go ahead and read Linode's documentation for yourself. You'll need to set up a bucket and at least one access key for mounting the bucket via s3fs. When you create an access key, the secret key is only shown at the time of creation, so make a note of it then! Also be aware that you will continue to be charged for a minimum of 250G/mo until you explicitly go to your account settings and disable object storage after you're done using it.
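
If you'd rather stay on the command line than click through the Cloud Manager, a tool like s3cmd can create the bucket once it knows about Linode's endpoint - a quick sketch (the configure step will prompt for your access key, secret key, and the us-east-1.linodeobjects.com endpoint; the bucket name borg matches my mount examples below):

# one-time interactive setup, then make and list the bucket
s3cmd --configure
s3cmd mb s3://borg
s3cmd ls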

Mount the bucket

Mounting the bucket is fairly straightforward; the README for the s3fs-fuse project was the only resource I needed. Before you can run a mount command, you'll have to create a credentials file at /root/.passwd-s3fs. The format of the file is in the aforementioned README, but it's just a single line: YOUR_ACCESS_TOKEN:YOUR_SECRET_KEY. Keep in mind you will probably want to mount your bucket as root and then create folders underneath for each user you're backing up as. An example mount command compatible with Linode's object storage looks like this:

[root@backups ~]# s3fs borg /mnt/s3 -o passwd_file=$HOME/.passwd-s3fs -o url=https://us-east-1.linodeobjects.com/ -o use_path_request_style
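
For completeness, creating that credentials file is just an echo and a chmod (the key values here are placeholders, obviously) - s3fs will refuse to use the file unless the permissions are locked down:

[root@backups ~]# echo 'YOUR_ACCESS_TOKEN:YOUR_SECRET_KEY' > /root/.passwd-s3fs
[root@backups ~]# chmod 600 /root/.passwd-s3fs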

If you want to make the mount persistent across reboots, an example /etc/fstab entry is below. I always recommend testing your fstab entries BEFORE you issue a reboot, because if they're wrong, they can hang your boot process for a while.

s3fs#borg /mnt/s3 fuse _netdev,allow_other,use_path_request_style,url=https://us-east-1.linodeobjects.com/ 0 0
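
A quick way to test that entry without rebooting (assuming the bucket isn't already mounted) is to let mount read fstab directly and then check the result:

[root@backups ~]# mount /mnt/s3
[root@backups ~]# findmnt /mnt/s3
[root@backups ~]# df -h /mnt/s3

One caveat: if the boot-time mount can't find your credentials because $HOME isn't set yet, copying them to /etc/passwd-s3fs (the system-wide location s3fs also checks) should sort it out.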

Initialize a new backup repository

Your organization is entirely up to you, but I split up my repositories for each client - one for /etc, one for /home, one for /srv/http, and so on, depending on the role of the client. In my case, I am just testing with a single user, backing up from my laptop to /mnt/s3/backups on backups.jthan.io. Always remember your passphrase and back up your encryption keys, ideally on paper and stuck in a safety deposit box! In this episode of Sysadministrivia we talk about backups and archiving in more depth.

jonathan@shaco:~$ borg init --encryption=repokey ssh://shaco@backups.jthan.io:22/mnt/s3/backups
Enter new passphrase: 
Enter same passphrase again: 
Do you want your passphrase to be displayed for verification? [yN]: n
Remote: Failed to securely erase old repository config file (hardlinks not supported). Old repokey data, if any, might persist on physical storage.

By default repositories initialized with this version will produce security
errors if written to with an older version (up to and including Borg 1.0.8).

If you want to use these older versions, you can disable the check by running:
borg upgrade --disable-tam ssh://shaco@backups.jthan.io:22/mnt/s3/backups

See https://borgbackup.readthedocs.io/en/stable/changes.html#pre-1-0-9-manifest-spoofing-vulnerability for details about the security implications.

IMPORTANT: you will need both KEY AND PASSPHRASE to access this repo!
Use "borg key export" to export the key, optionally in printable format.
Write down the passphrase. Store both at safe place(s).
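
Speaking of which, exporting the key for the new repo is a one-liner - the output paths here are just examples, and --paper gives you a printable version for that safety deposit box:

jonathan@shaco:~$ borg key export ssh://shaco@backups.jthan.io:22/mnt/s3/backups ~/borg-backups.key
jonathan@shaco:~$ borg key export --paper ssh://shaco@backups.jthan.io:22/mnt/s3/backups ~/borg-backups-paper.txt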

Create our first backup

This is just like creating a backup anywhere else with Borg, except we'll point to wherever the s3fs mount is (in my example, /mnt/s3).

jonathan@shaco:~$ borg create ssh://shaco@backups.jthan.io:22/mnt/s3/backups::testinit ~/linux
Enter passphrase for key ssh://shaco@backups.jthan.io:22/mnt/s3/backups:
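
The compression comparisons in the next section just vary the --compression flag on the create command; the flag syntax looks roughly like this (the archive names here are arbitrary ones of my own choosing):

jonathan@shaco:~$ borg create --compression lzma,9 ssh://shaco@backups.jthan.io:22/mnt/s3/backups::testlzma9 ~/linux
jonathan@shaco:~$ borg create --compression none ssh://shaco@backups.jthan.io:22/mnt/s3/backups::testnone ~/linux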

Performance

For each test I ran, the initial backup was the Linux kernel source, with varying levels of compression. As expected, initial backups took longer than they would backing up to Backupsy or other Linode storage options, but realistically, especially depending on your backup frequency, who cares?! Your initial backup will always take more time than your incrementals thereafter anyway, and even if your incrementals take an hour every day, that isn't so bad as long as it isn't placing an unreasonable load on your production servers for an hour at a time. Most of the personal backups I perform regularly are <1G/day, making this completely viable for me. The following table shows a few iterations of the test. Note that because deduplication is at play, even with NO compression the destination is smaller than the source!

Compression      Initial Backup Size    Initial Backup Time    Size on Destination
Default (lz4)    1.1G                   7m57s                  325M
lzma9            1.1G                   1h14m15s               191M
None             1.1G                   24m58s                 888M

Next I added a 500M text file to the base of the repo and sent an incremental backup using each of the same compression algorithms - results below:

Compression      Incremental Backup Time    Final Size on Destination
Default (lz4)    14m3s                      804M
lzma9            19m50s                     558M
None             11m44s                     1.4G
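
For the curious, the incremental runs were nothing special - drop the new file into the source directory and run borg create again against the same repository. Something in this spirit (exactly how the 500M file gets generated doesn't really matter; this is just one arbitrary way to fake a big text file, with an archive name of my choosing):

jonathan@shaco:~$ base64 /dev/urandom | head -c 500M > ~/linux/bigfile.txt
jonathan@shaco:~$ borg create ssh://shaco@backups.jthan.io:22/mnt/s3/backups::testincr ~/linux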

And to be REALLY complete, I tested a restore of the same 500M file using borg mount from my test client. I also checksummed the restored copies to verify file integrity!

Compression      Restore Time
Default (lz4)    1m38s
lzma9            1m39s
None             1m23s
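
The restore procedure itself is worth sketching, since borg mount makes it painless - the mount point, archive name, and checksum tool here are just my choices:

jonathan@shaco:~$ mkdir /tmp/restore
jonathan@shaco:~$ borg mount ssh://shaco@backups.jthan.io:22/mnt/s3/backups::testincr /tmp/restore
jonathan@shaco:~$ sha256sum /tmp/restore/home/jonathan/linux/bigfile.txt ~/linux/bigfile.txt
jonathan@shaco:~$ borg umount /tmp/restore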

Closing Remarks

I think this is a viable use of Linode (or any) object storage as a destination for Borg backups, and you can save money over other options, especially if you already have somewhere to mount the bucket and send backups from (i.e. you wouldn't need to add another Nanode to your monthly budget). You definitely have to weigh the increased backup time against other types of backup locations, but all things considered, especially using lz4, there was almost no noticeable CPU load. Something I also hadn't thought of initially is that because mounting a bucket requires an access key, if your backup server were ever compromised, you could revoke all keys to cut off access to your backups. This is an added security benefit over using something like Backupsy, though a minor one. I will continue to mirror all of my backups here for at least the duration of Linode's free object storage offer and will update with any observations, notes, etc. below. I will also test some more file restores at some point to make sure that's viable, because a backup isn't worth a damn if you can't restore from it.

