Posted on March 13, 2020 in Storage and Backups, Sysadmin
My trials using a mounted s3 bucket as a destination for remote Borg backups
I started using Borg for backups almost two years ago now, and I couldn't be happier. Between its deduplication and compression, I've been able to back up ~10 hosts with three months' worth of weekly snapshots (plus 7 daily snaps) in under 200G. My backup server is hosted by Backupsy, and I highly recommend them if you're looking for a solid VPS provider for storage purposes. While they aren't the fastest, I've had no downtime in the last two years that I can actually remember, and they run a persistent 40% off campaign that you can take advantage of to get a pretty cheap VPS. They charge an extra $2/month to use your VPS for any non-backup services, which I happily pay knowing I can run whatever the hell I want on my VPS. Keep in mind that Borg can (and you should) encrypt data at rest, so even if your VPS provider accessed the drives, they'd need the encryption key or password to access the data. All in all, I'm paying $8/mo for 250G worth of storage on a fully functional VPS, and that's a pretty good deal.
But, I started to get curious about the possibility of using s3fs to mount an S3 bucket (or an S3-compatible one, anyway) to use as a destination for Borg backups. Having a fixed 250G is great, but what if I wanted to expand beyond that? I'd have to get the next tier Backupsy, which is 500G and double the price. But it might take me another 3 years to get to the point where I have 500G to back up, or maybe I would just hover around 300G and be paying for 500G for a long time, which isn't necessarily a great use of resources. The benefit of object storage is that while there may be a minimum bucket size with some providers, it's usually a fixed cost per unit beyond that. So, in walks Linode with a free object storage offer, and I figured it was the right time to test the waters.
Before I continue, I will note that I am NOT sponsored by Linode in any way, and they did not encourage me to write this post. I do love Linode though and use them for almost all of my VPS needs, and I highly recommend them.
The nice thing about using a Backupsy VPS is that it's VPS + storage all-in-one: you set up the VPS, write to your root partition, and life is good. In this case, though, I'm trying to use Linode's object storage, and while I could mount a bucket on each host I want to back up, I don't want my production hosts to bear the CPU load of s3fs (which in the past I have found to be a real issue). So for the rest of this guide, know that I have already provisioned a Linode Nanode and configured the server per the guidelines Brent put together here. I stopped before actually performing any backups, because first I need to create a storage bucket, install s3fs, and mount it. I also configured DNS, and my examples use backups.jthan.io. Oh, and I use Arch Linux, but you should use whatever you're comfortable with and adapt the rest accordingly!
On Arch, s3fs is provided by the s3fs-fuse package and can be installed using pacman: pacman -S s3fs-fuse. At this point, most distros have s3fs available through their package managers, so if you don't use Arch you'll still be okay!
I'm not going to rehash the how here - go ahead and read Linode's documentation for yourself. You'll need to set up a bucket and at least one access key for mounting the bucket via s3fs. When you create an access key, the secret key will only be shown at the time of creation, so keep note of it when you create it! You will continue to get charged for a minimum of 250G/mo if you don't explicitly go to your account settings and disable object storage after you're done using it.
Mounting the bucket is fairly straightforward; the README for the s3fs-fuse project was the only resource I needed. Before you can run a mount command, you'll have to create a credentials file at /root/.passwd-s3fs. The format of the file is in the aforementioned README, but it's just a single line: YOUR_ACCESS_TOKEN:YOUR_SECRET_KEY. Now, keep in mind you will probably want to mount your bucket as root and then create folders underneath for each user you're backing up as. An example mount command compatible with Linode's object storage looks like this:
[root@backups ~]# s3fs borg /mnt/s3 -o passwd_file=$HOME/.passwd-s3fs -o url=https://us-east-1.linodeobjects.com/ -o use_path_request_style
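Before that mount will work, the credentials file needs to exist with the right permissions. A minimal sketch of creating it - the ACCESS/SECRET values here are placeholders for the key pair you generated in the Linode Cloud Manager:

```shell
# Write the single-line credentials file (ACCESS:SECRET format).
# The values below are placeholders -- substitute your own key pair.
printf '%s\n' 'YOUR_ACCESS_TOKEN:YOUR_SECRET_KEY' > "$HOME/.passwd-s3fs"

# s3fs will refuse a credentials file that other users can read
chmod 600 "$HOME/.passwd-s3fs"
```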
If you want to make the mount persistent across reboots, an example /etc/fstab entry is below. I always recommend testing your fstab entries BEFORE you issue a reboot, because if they're wrong, they will hang your boot process for a while and delay boot.
s3fs#borg /mnt/s3 fuse _netdev,allow_other,use_path_request_style,url=https://us-east-1.linodeobjects.com/ 0 0
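One way to test the entry without rebooting - mount(8) resolves a bare mount point against /etc/fstab, so a typo in the entry shows up here instead of stalling your next boot:

```shell
# Mount using only the fstab entry -- this exercises the exact line
# that would run at boot, typos and all.
mount /mnt/s3

# Confirm the fuse.s3fs mount is actually present
findmnt /mnt/s3

# Unmount again if you were only testing
umount /mnt/s3
```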
Your organization is entirely up to you, but I split up my repositories for each client: one for /etc, one for /home, one for /srv/http, and so on, depending on the role of the client. In my case, I am just testing with a single user from my laptop to backups.jthan.io:/mnt/s3/backups. Always remember your password and back up your encryption keys, ideally on paper and stuck in a safe deposit box! In this episode of Sysadministrivia we talk about backups and archiving in more depth.
jonathan@shaco:~$ borg init --encryption=repokey ssh://firstname.lastname@example.org:22/mnt/s3/backups
Enter new passphrase:
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]: n
Remote: Failed to securely erase old repository config file (hardlinks not supported). Old repokey data, if any, might persist on physical storage.
By default repositories initialized with this version will produce security errors if written to with an older version (up to and including Borg 1.0.8).
If you want to use these older versions, you can disable the check by running:
borg upgrade --disable-tam ssh://email@example.com:22/mnt/s3/backups
See https://borgbackup.readthedocs.io/en/stable/changes.html#pre-1-0-9-manifest-spoofing-vulnerability for details about the security implications.
IMPORTANT: you will need both KEY AND PASSPHRASE to access this repo! Use "borg key export" to export the key, optionally in printable format. Write down the passphrase. Store both at safe place(s).
This is just like creating a backup anywhere else with Borg, except we'll point to wherever the s3fs mount is (in my example, /mnt/s3/backups):

jonathan@shaco:~$ borg create ssh://firstname.lastname@example.org:22/mnt/s3/backups::testinit ~/linux
Enter passphrase for key ssh://email@example.com:22/mnt/s3/backups:
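For completeness, the retention schedule I mentioned earlier can be enforced with borg prune. A hedged sketch against the same test repo - the --keep values (7 dailies, 13 weeklies for roughly three months) reflect my schedule, not something you have to copy:

```shell
# Hypothetical prune matching my retention: 7 daily snapshots plus
# roughly three months (13 weeks) of weeklies.
borg prune --keep-daily 7 --keep-weekly 13 \
    ssh://firstname.lastname@example.org:22/mnt/s3/backups
```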
My initial backup for each test I ran was using the Linux kernel source and varying levels of compression. As expected, initial backups took longer than if you were backing up to Backupsy or other Linode storage options, but realistically, especially depending on your backup frequency, who cares?! Your initial backup will always take more time than your incrementals thereafter anyway, and even if your incrementals take an hour every day, that isn't that bad as long as it isn't placing an unreasonable load on your production servers for an hour at a time. Most of the personal backups I perform regularly are <1G/day, making this completely viable for me. The following table shows a few iterations of the test. Note that because deduplication is at play, even with NO compression the destination is smaller than the source!
| Compression | Initial Backup Size | Initial Backup Time | Size on Destination |
Next I added a 500M text file to the base of the repo and sent an incremental backup using each of the same compression algorithms - results below:
| Compression | Incremental Backup Time | Final Size on Destination |
And to be REALLY complete, I tested a restore of the same 500M file using borg mount from my test client. I also checksummed all of them to verify the file integrity!
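That restore check looked roughly like this. The archive name is from my test run and the file path is a placeholder for wherever the 500M file lives in your archive, so treat it as a sketch:

```shell
# Mount the archive read-only via FUSE, then compare checksums of the
# restored copy against the original. Archive and file names here are
# from my test setup -- adjust for your own repository layout.
mkdir -p /tmp/restore
borg mount ssh://firstname.lastname@example.org:22/mnt/s3/backups::testinit /tmp/restore
md5sum /tmp/restore/home/jonathan/linux/bigfile ~/linux/bigfile
borg umount /tmp/restore
```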
I think this is a viable use of Linode (or any) object storage as a destination for Borg backups, and you can save money over other options, especially if you already have somewhere to mount the bucket and send backups (i.e. you wouldn't need to add another Nanode to your monthly budget). You definitely have to weigh the increased time over other types of backup locations, but all things considered, especially using lz4, there was almost no noticeable CPU load. Something I also hadn't thought of initially is that because an access key is needed to mount a bucket, if your backup server were ever compromised, you could revoke all keys to ensure that access to your backups was cut off. This is an added security benefit over using something like Backupsy, though a minor one. I will continue to mirror all of my backups here for at least the duration of Linode's free object storage offer and be sure to update with any observations, notes, etc. below. I will also test some more file restores at some point to ensure that's viable, because a backup isn't worth a damn if you can't restore from it.