Can you please share your backup strategies for Linux? I’m curious to know what tools you use and why. How do you automate/schedule backups? Which files/folders do you back up? What is your preferred hardware/cloud storage, and how do you manage storage space?
etckeeper, and borg/vorta for /home
I try to be good about everything being installed in packages, even if I’m the one that made the package. that means I only have to worry about backing up my local package archive. but I’ve never actually recreated a personal system from a backup, and usually end up starting from a fresh install, slowly adding back things from the backup if I missed them. this tends to cut down on cruft and no longer needed hacks and fixes. also makes for a good way to be exposed to new paradigms (desktop environments, shells, etc)
something that helps is daily notes. one file for any day I’m working on my system and want to remember what a custom file, config edit, or downloaded/created package does and why. these get saved separately and I try to remember to grep them before asking the internet
i see the benefit to snapshots, but disk space is expensive, and I’m (usually) careful (enough) not to lock myself out or prevent boots. anything catastrophic I have to fix is usually seen as a fun, stressful learning experience! that rarely happens anymore, for better or for worse
Example of a Bash script that performs the following tasks:
- Checks the availability of an important web server.
- Checks disk space usage.
- Makes a backup of the specified directories.
- Sends a report to the administrator’s email.
Example script:
#!/bin/bash

# Settings
WEB_SERVER="https://example.com"
BACKUP_DIR="/backup"
TARGET_DIRS="/var/www /etc"
DISK_USAGE_THRESHOLD=90
ADMIN_EMAIL="[email protected]"   # replace with a real address
DATE=$(date +"%Y-%m-%d")
BACKUP_FILE="$BACKUP_DIR/backup-$DATE.tar.gz"

# Checking web server availability
echo "Checking web server availability..."
# Read the status line once. Match only the status code, because
# HTTP/2 responses say "HTTP/2 200" rather than "200 OK".
WEB_STATUS=$(curl -s --head "$WEB_SERVER" | head -n 1 | tr -d '\r')
if echo "$WEB_STATUS" | grep -q " 200"; then
    echo "Web server is available."
else
    echo "Warning: Web server is unavailable!" | mail -s "Problem with web server" "$ADMIN_EMAIL"
fi

# Checking disk space
echo "Checking disk space..."
# -P forces one line per filesystem; NR==2 skips the header row
DISK_USAGE=$(df -P / | awk 'NR==2 { print $5 }' | sed 's/%//g')
if [ "$DISK_USAGE" -gt "$DISK_USAGE_THRESHOLD" ]; then
    echo "Warning: Disk space usage exceeded $DISK_USAGE_THRESHOLD%!" | mail -s "Problem with disk space" "$ADMIN_EMAIL"
else
    echo "There is enough disk space."
fi

# Creating backup
echo "Creating backup..."
# TARGET_DIRS is left unquoted on purpose so it splits into separate directories
if tar -czf "$BACKUP_FILE" $TARGET_DIRS; then
    echo "Backup created successfully: $BACKUP_FILE"
else
    echo "Error creating backup!" | mail -s "Error creating backup" "$ADMIN_EMAIL"
fi

# Sending report
echo "Sending report to $ADMIN_EMAIL..."
REPORT="Report for $DATE\n\n"
REPORT+="Web server status: $WEB_STATUS\n"
REPORT+="Disk space usage: $DISK_USAGE%\n"
REPORT+="Backup location: $BACKUP_FILE\n"
echo -e "$REPORT" | mail -s "Daily system report" "$ADMIN_EMAIL"
echo "Done."
Description:
- Check web server: Uses the curl command to check if the site is available.
- Check disk space: Uses df and awk to check disk usage. If the threshold (90%) is exceeded, a notification is sent.
- Create a backup: The tar command archives and compresses the directories specified in the TARGET_DIRS variable.
- Send a report: A report on all operations is sent to the administrator’s email using mail.
How to use:
- Set the desired parameters, such as the web server address, directories for backup, disk usage threshold and email.
- Make the script executable:
chmod +x /path/to/your/script.sh
- Add the script to cron to run on a regular basis:
crontab -e
Example to run every day at 00:00:
0 0 * * * /path/to/your/script.sh
All my code and projects are on GitHub/codeberg.
All my personal info and photos are on proton drive.
If Linux shits itself (and it does often) who cares. I can have it up and running again in a fresh install in ten minutes.
But proton drive doesn’t have a linux client yet, I suppose you just upload your files there once through the web interface and don’t sync?
Personal stuff is mostly on my phone. And I’ll just sync to the computer what’s needed.
All of my servers make local dumps of their databases and config files to directories owned by unprivileged users. This includes file paths, permissions, and ownerships (so I know how to put them back).
My primary research server at home uses rsync to pull copies of those local backups from my servers.
My primary research server uses Restic to make a daily incremental backup to Backblaze’s B2 service.
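For anyone wanting to copy this layout, the pull-then-upload step could look roughly like the sketch below. It is not the poster's actual setup: the host names, user, paths, bucket name and retention policy are all made-up placeholders, and restic expects the B2 credentials in `B2_ACCOUNT_ID` and `B2_ACCOUNT_KEY`. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Hypothetical values for illustration only.
SERVERS=${SERVERS:-"web1.example.com db1.example.com"}
REMOTE_USER=${REMOTE_USER:-backup}
REMOTE_DIR=${REMOTE_DIR:-/home/backup/dumps/}
LOCAL_ROOT=${LOCAL_ROOT:-/srv/pulled-backups}
export RESTIC_REPOSITORY=${RESTIC_REPOSITORY:-b2:my-bucket:servers}
export RESTIC_PASSWORD_FILE=${RESTIC_PASSWORD_FILE:-/root/.restic-password}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

pull_and_upload() {
    # Pull each server's local dump directory over SSH.
    for host in $SERVERS; do
        run mkdir -p "$LOCAL_ROOT/$host"
        run rsync -a "$REMOTE_USER@$host:$REMOTE_DIR" "$LOCAL_ROOT/$host/" || return 1
    done
    # restic only uploads data it has not seen before, so a daily run
    # is effectively incremental.
    run restic backup "$LOCAL_ROOT" || return 1
    # Assumed retention policy; adjust or drop as needed.
    run restic forget --keep-daily 7 --keep-weekly 4 --prune
}
```

Pulling from the research server, rather than pushing from each server, means the servers never hold credentials for the backup host or for B2.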
The only thing I use as a backup is a Live CD that’s mounted to a USB thumb drive.
I used to use Timeshift but the one time I needed it, it didn’t work for some reason. It also had a problem of making my PC temporarily unusable while it was making a backup, so I didn’t enable it when I had to reinstall Linux Mint.
Same, Timeshift let me down one time when I needed it. I still use it though, and I’m afraid to upgrade Mint because I don’t want to set up my system again if the upgrade fails to keep my configuration and Timeshift fails to take me back.
Scuse the cut and paste, but this is something I recently thought quite hard about and blogged, so stealing my own content:
What to back up? This is a core question to ask when you start planning. I think it’s quite simply answered by asking the secondary question: “Can I get the data again?” Don’t back up stuff you downloaded from the public internet unless it’s particularly rare. No TV, no Movies, no software installers. Don’t hoard data you can replace. Do back up stuff you’ve personally created and that doesn’t exist elsewhere, or stuff that would cause you a lot of effort or upset if it wasn’t available. Letters you’ve written, pictures you’ve taken, code you authored, configurations and systems that took you a lot of time to set up and fine tune.
If you want to be able to restore a full system, that’s something else and generally dealt best with imaging – I’m talking about individual file backups here!
Backup Scenario
Multiple household computers. Home Linux servers. Many services running natively and in Docker. A couple of Windows computers.
Daily Backups
Once a day, automate backups of your important files.
On my Linux machines, that’s directories like /etc, /root and /docker-data, plus some shared files.
On my Windows machines, that’s some mapping data, Word documents, pictures, geocaching files, generated backups and so on.
You work out the files and get an idea of how much space you need to set aside.
Then, with automated methods, have these files copied or zipped up to a common directory on an always-available server. Let’s call that /backup.
These should be versioned, so that older ones get expired automatically. You can do that with bash scripts, or automated backup software (I use backup-manager for local machines, and backuppc or robocopy for windows ones)
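If you go the bash-script route, versioning with automatic expiry can be as small as the sketch below. This is only an illustration, not what backup-manager does internally: the function name `rotate_backup`, the file naming scheme and the default of three copies are assumptions.

```shell
# Archive one directory into a dated tarball under a backup root and
# keep only the newest KEEP archives for that name.
# Usage: rotate_backup SOURCE_DIR BACKUP_ROOT NAME [KEEP]
rotate_backup() (
    src=$1; root=$2; name=$3; keep=${4:-3}
    stamp=$(date +%Y-%m-%d_%H%M%S)
    mkdir -p "$root" || exit 1
    # -C keeps the paths inside the archive relative to the source's parent
    tar -czf "$root/$name-$stamp.tar.gz" \
        -C "$(dirname "$src")" "$(basename "$src")" || exit 1
    # Timestamped names sort chronologically, so everything except the
    # last $keep entries is expired (head -n -N needs GNU coreutils).
    ls -1 "$root"/"$name"-*.tar.gz | sort | head -n -"$keep" |
        while IFS= read -r old; do
            rm -f -- "$old"
        done
)

# Example: rotate_backup /etc /backup etc 3
```

Because the function body runs in a subshell, its variables do not leak into the calling script.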
How many copies you keep depends on your preferences – 3 is a sound number, but choose what you want and what disk space you have. More than 1 is a good idea since you may not notice the next day if something is missing or broken.
Monthly Backups – Make them Offline if possible
I puzzled a long time over the best way to do offline backups. For years I would manually copy the contents of /backup to large HDDs once a month. That took an hour or two for a few terabytes.
Now, I attach an external USB hard drive to my server, with a smart power socket controlled by Home Assistant.
This means it’s “cold storage”. The computer can’t access it unless the switch is turned on – something no ransomware knows about. But I can write a script that turns on the power, waits a minute for it to spin up, then mounts the drive and copies the data. When it’s finished, it’ll then unmount the drive and turn off the switch, and lastly, email me to say “Oi, change the drives, human”.
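A rough sketch of that power-on, copy, power-off sequence follows. It assumes the smart socket is exposed in Home Assistant as a switch entity and is driven through Home Assistant's REST API with a long-lived access token; the URL, entity name, device label and mount point are placeholders, not details from the setup described above. Setting `DRY_RUN=1` prints the steps without touching anything.

```shell
# All of these values are placeholders for illustration.
HA_URL=${HA_URL:-http://homeassistant.local:8123}
HA_TOKEN=${HA_TOKEN:-changeme}              # long-lived access token
HA_SWITCH=${HA_SWITCH:-switch.backup_drive}
BACKUP_DEV=${BACKUP_DEV:-/dev/disk/by-label/offline-backup}
MOUNT_POINT=${MOUNT_POINT:-/mnt/offline}
SPINUP_WAIT=${SPINUP_WAIT:-60}
ADMIN_EMAIL=${ADMIN_EMAIL:-root}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

# Ask Home Assistant to switch the socket on or off.
ha_switch() {
    run curl -fsS -X POST \
        -H "Authorization: Bearer $HA_TOKEN" \
        -H "Content-Type: application/json" \
        -d "{\"entity_id\": \"$HA_SWITCH\"}" \
        "$HA_URL/api/services/switch/turn_$1"
}

offline_backup() {
    ha_switch on || return 1
    run sleep "$SPINUP_WAIT"        # give the drive time to spin up
    if run mount "$BACKUP_DEV" "$MOUNT_POINT"; then
        run rsync -a /backup/ "$MOUNT_POINT/backup/"
        status=$?
        run umount "$MOUNT_POINT"
    else
        status=1
    fi
    # Always cut the power again, even if the copy failed.
    ha_switch off
    if [ "$status" -eq 0 ]; then
        msg="Oi, change the drives, human"
    else
        msg="Offline backup FAILED (status $status)"
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ mail: $msg"
    else
        echo "$msg" | mail -s "Offline backup" "$ADMIN_EMAIL"
    fi
    return "$status"
}
```

The power is switched off on every path through the function, so a failed mount or copy does not leave the drive reachable.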
Once I get that email, I open my safe (fireproof and in a different physical building) and take out the oldest of three usb Caddies. Swap that with the one on the server and put that away. Classic Grandfather/Father/Son backups.
Once a year, I change the oldest of those caddies to “Annual backup, 2024” and buy a new one. That way no monthly drive will be older than three years, and I have a (probably still viable) backup by year.
BTW – I use USB3 HDD caddies (and do test for speed – they vary hugely) because I keep a fair bit of data. But you can also use one of the large capacity USB Thumbdrives or MicroSD cards for this. It doesn’t really matter how slowly it writes, since you’ll be asleep when it’s backing up. But you do really want it to be reasonably fast to read data from, and also large enough for your data – the above system gets considerably less simple if you need multiple disks.
Error Check: Of course with automated systems, you need additional automated systems to ensure they’re working! When you complete a backup, touch a file to give you a timestamp of when it was done – online and offline. I find using “tree” to catalogue the files is worthwhile too, so you know what’s on there.
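One way to wire up that timestamp and catalogue check is sketched here. The file names `.last-backup` and `.catalogue.txt`, the two-day default and the fallback to `find` when `tree` is missing are my own choices for illustration.

```shell
# Record when a backup finished and what it contains.
# Usage: mark_backup_done BACKUP_DIR
mark_backup_done() (
    dir=$1
    touch "$dir/.last-backup" || exit 1
    # Fall back to find if tree is not installed.
    if command -v tree >/dev/null 2>&1; then
        tree -a "$dir" > "$dir/.catalogue.txt"
    else
        find "$dir" > "$dir/.catalogue.txt"
    fi
)

# Succeed only if the stamp file exists and was touched less than
# MAX_DAYS days ago.
# Usage: backup_is_fresh BACKUP_DIR [MAX_DAYS]
backup_is_fresh() (
    dir=$1; max_days=${2:-2}
    [ -f "$dir/.last-backup" ] || exit 1
    # find prints the stamp only if it was modified within the window
    [ -n "$(find "$dir/.last-backup" -mtime -"$max_days")" ]
)

# Example (from cron, on a different machine if possible):
#   backup_is_fresh /backup 2 || echo "backup is stale" | mail -s "Backup check" root
```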
Lastly – test your backups. Once or twice a year, pick a backup at random and ensure you can copy and unpack the files. Ensure they are what you expect and free from errors.
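The restore test can be scripted too. A minimal sketch, assuming the backups are `.tar.gz` archives sitting in one directory (the function name is made up): it picks one at random, checks the compressed stream, unpacks it into a scratch directory and reports how many files came out. Checking that the contents are what you expect still needs a human.

```shell
# Pick one archive at random, verify it, and try a full unpack.
# Usage: test_random_backup BACKUP_DIR
test_random_backup() (
    dir=$1
    archive=$(ls -1 "$dir"/*.tar.gz 2>/dev/null | shuf -n 1)
    [ -n "$archive" ] || { echo "no archives in $dir" >&2; exit 1; }
    scratch=$(mktemp -d) || exit 1
    # Remove the scratch directory when the subshell exits.
    trap 'rm -rf "$scratch"' EXIT
    # gzip -t verifies the compressed stream without writing anything
    gzip -t "$archive" || { echo "CORRUPT: $archive" >&2; exit 1; }
    tar -xzf "$archive" -C "$scratch" ||
        { echo "UNPACK FAILED: $archive" >&2; exit 1; }
    count=$(find "$scratch" -type f | wc -l)
    echo "OK: $archive ($count files)"
)

# Example: test_random_backup /backup
```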
.dotfiles on github
Big/critical files on an external HD
simple as
One reason for moving to Nix was declarative config so at least that part of my system is a series of Nix files to build into a working setup.
…The rest… let’s just say “needs improvement” & I would like to set up a NAS.
My desktop, laptop and homelab all sync my important stuff over syncthing. They all do btrfs snapshots three months back in case an oopsie would propagate.
The homelab additionally fetches deduplicated snapshots of my VPS weekly, before syncing all of the above to an encrypted hetzner storage for those burning-down-the-house events.
Firstly, for my dotfiles, I use home-manager. I keep the config on my git server and in theory I can pull it down and set up a system the way I like it.
In terms of backups, I use Pika to backup my home directory to my hard disk every day, so I can, in theory, pull back files I delete.
I also push a core selection of my files to my server using Pika, just in case my house burns down. Likewise, I pull backups from my server to my desktop (again with Pika) in case Linode starts messing me about.
I also have a 2TiB ssd I keep in a strongbox and some cloud storage which I push bigger things to sporadically.
I also take occasional data exports from online services I use. Because hey, Google or Discord can ban you at any time for no reason. :P
Currently I use Borg Backup with Vorta as a GUI. I don’t really do anything automated/scheduled, I just back it up manually to an external SSD every few days or so. I pretty much do my whole /home folder, except for a couple of subfolders that aren’t really necessary (and Videos, which I back up separately.)
I do eventually want to upgrade to a NAS, but I’m waiting until we move to start setting that up. Also I don’t really have an off-site plan yet which I know is bad, but I need to figure that out.
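For anyone who prefers the command line over Vorta, roughly the same manual backup could be run with plain borg, as in this sketch. The repository path, the excluded folders and the retention numbers are hypothetical, and the repository is assumed to have been created already with `borg init`. Setting `DRY_RUN=1` only prints the commands.

```shell
# Hypothetical repository location on the external SSD.
BORG_REPO_PATH=${BORG_REPO_PATH:-/run/media/user/backup-ssd/borg}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

borg_home_backup() {
    # {hostname} and {now} are placeholders that borg itself expands.
    run borg create --stats \
        --exclude "$HOME/Videos" \
        --exclude "$HOME/.cache" \
        "${BORG_REPO_PATH}::{hostname}-{now}" "$HOME" || return 1
    # Assumed retention policy so the SSD does not fill up.
    run borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
        "$BORG_REPO_PATH"
}
```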
You have loads of options but you need to also start from … “what if”. Work out how important your data really is. Take another look and ask the kids and others if they give a toss. You might find that no one cares about your photo collection in which case if your phone dies … who cares? If you do care then sync them to a PC or laptop.
Perhaps take a look at this - https://www.veeam.com/products/free/linux.html it’s free for a few systems.
I use OneDrive. I know people will hate but it’s cheap and works on everything (well, it takes a third party tool on Linux). If I care about it it goes in OneDrive, otherwise I don’t need it that much.
may I ask which third-party tool you use? i’m using onedriver and it’s pretty unreliable in my experience
May I ask why you prefer that over Google Drive, or others such as Dropbox or Mega? I used it extensively when I used Windows, but that’s been several years.
Here’s one that probably nobody else here is doing. The backup goes on my mobile device. Yes, the thing in my pocket.
- Mount it over SSHFS on the local network
- Unlock a LUKS container in the form of a 30GB sparse file on the device
- rsync the files across
- Lock, unmount
The backup is incremental but the container file never changes size, no matter what’s in it. Your data is in two places and always under your physical control. But the key is never stored on the remote device, so you could also do this with a VPS.
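The four steps above might be scripted along these lines. This is a guess at the shape of it, not the poster's script: the host, paths, container file name and mapper name are invented, it assumes the container was already created and formatted with `cryptsetup luksFormat`, and it needs root for the `cryptsetup` and `mount` calls. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Placeholder values; adjust to your own device and paths.
PHONE=${PHONE:-user@phone.local}
REMOTE_DIR=${REMOTE_DIR:-/storage/backup}
SSHFS_MNT=${SSHFS_MNT:-/mnt/phone}
CONTAINER=${CONTAINER:-backup.luks}      # the sparse file on the phone
KEY_FILE=${KEY_FILE:-/root/backup.key}   # stays on the local machine
MAPPER_NAME=${MAPPER_NAME:-phonebackup}
VAULT_MNT=${VAULT_MNT:-/mnt/vault}
SOURCE=${SOURCE:-/home/}

# With DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

phone_backup() {
    # 1. Mount the phone's storage over SSHFS.
    run sshfs "$PHONE:$REMOTE_DIR" "$SSHFS_MNT" || return 1
    # 2. Unlock the LUKS container; the key never leaves this machine.
    if run cryptsetup open --key-file "$KEY_FILE" \
            "$SSHFS_MNT/$CONTAINER" "$MAPPER_NAME"; then
        if run mount "/dev/mapper/$MAPPER_NAME" "$VAULT_MNT"; then
            # 3. Copy only what changed since the last run.
            run rsync -a "$SOURCE" "$VAULT_MNT/"
            status=$?
            run umount "$VAULT_MNT"
        else
            status=1
        fi
        # 4. Lock the container again.
        run cryptsetup close "$MAPPER_NAME"
    else
        status=1
    fi
    run fusermount -u "$SSHFS_MNT"
    return "$status"
}
```

Each teardown step runs only if its matching setup step succeeded, so a failure partway through does not leave the container unlocked.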
Highly recommended.
Where is the key stored?
Locally.
If your local machine dies, and you have a backup on your phone which you cannot unlock… aren’t you screwed?
Good question. No, but at a small cost in security. I generated the key with sha512sum from a very solid memorized passphrase. This means I can regenerate the key in the scenario you describe.
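Regenerating a key that way could look like the sketch below. The exact invocation is an assumption (for example, whether the passphrase was hashed with or without a trailing newline changes the result), and the function name is made up. A single SHA-512 pass is fast to brute-force, which is presumably the "small cost in security" mentioned, so the passphrase really does have to be strong.

```shell
# Derive a 128-hex-character key from a memorized passphrase.
# printf '%s' avoids hashing a trailing newline.
derive_key() {
    printf '%s' "$1" | sha512sum | cut -d ' ' -f 1
}

# Example: read the passphrase without echoing it, then write the key
# to a file readable only by the current user (read -rs needs bash).
#   read -rs -p "Passphrase: " pass
#   ( umask 077; derive_key "$pass" > backup.key )
```

The same passphrase always yields the same key, so the key file can be rebuilt on a new machine.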