How Do You Handle Backups in Linux? Hardware, Software, Configuration, etc…

Linux EXT-4 File System Corruption & Attempted Recovery

There’s a file system corruption bug related to EXT4 in Linux, and it has happened to me a few times in Ubuntu 18.04. You are using your computer normally, then suddenly you can’t write anything to the drive, as the root partition has switched to read-only. Why? Here are some error messages:
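The exact messages vary, but kernel logs for this failure mode typically look something like this (device, inode, and line numbers below are illustrative):

```
EXT4-fs error (device sda2): ext4_find_entry:1455: inode #393217: comm gnome-shell: reading directory lblock 0
Aborting journal on device sda2-8.
EXT4-fs (sda2): Remounting filesystem read-only
```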


What then happens is that you restart your PC, and get to a recovery prompt where you are asked to run fsck manually:
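The command looks something like this (run from the recovery/initramfs prompt; it is a fragment to adapt, not to run as-is):

```shell
# Force a check of the root partition; answer the prompts to review each error
fsck -f /dev/sda2
```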


Change /dev/sda2 to whatever your drive is, and manually review the errors. You can take note of the files modified, as you’ll likely have to fix your Ubuntu installation later on. Usually the fix consists of various package re-installations:
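In practice that means commands along these lines (the package names here are just examples; use whichever packages own the files fsck flagged):

```shell
# Reinstall packages whose files were damaged by the corruption
sudo apt install --reinstall gdm3 xserver-xorg-core
```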


It happened to me two or three times in the past, and it’s a pain, but I eventually recovered. But this time, I was not so lucky. The system would not boot, but I could still SSH to it. I fixed a few issues, but there was a problem with the graphics drivers, and gdm3 would refuse to start because it was unable to open “EGL Display”. Three hours had passed with no solution in sight, so I decided to reinstall Ubuntu 18.04.

Backup Sluggishness

Which brings me to the main topic of this post: backups. I normally back up the files on my laptop once a week. Why only once a week? Because it considerably slows down my computer, especially during full backups, as opposed to incremental backups. So I normally leave my laptop on on Wednesday night, so the backup will start at midnight and hopefully end before I start working in the morning, but that’s not always the case. I’m using the default Duplicity backend via the Deja Dup graphical interface to back up files to a USB 3.0 drive directly connected to my laptop.

Nevertheless, since I only do a backup once a week on Thursday, and I got the file system bug on Tuesday evening, I decided to manually back up my full home directory to the USB 3.0 drive using tar and bzip2 compression as an extra precaution before reinstalling Ubuntu. I started at 22:00, and figured it would be done by the time I woke up in the morning. How naive of me! I had to wait until 13:00 the next day for the ~300GB tarball to be ready.

One of my mistakes was to use the single-threaded version of bzip2 with a command like:
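The command was roughly like this (the destination path is illustrative):

```shell
# tar's -j flag pipes the archive through plain bzip2, which only uses one core
tar -cjf /media/usb/home-backup.tar.bz2 "$HOME"
```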


Instead of going with the faster multi-threaded bzip2 (lbzip2):
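With GNU tar, the -I option substitutes the compression program, so the same archive can be built with all CPU cores (again, the destination path is illustrative):

```shell
# -I tells tar to filter the archive through lbzip2 instead of bzip2
tar -I lbzip2 -cf /media/usb/home-backup.tar.bz2 "$HOME"
```

The resulting .tar.bz2 file remains compatible with standard bzip2 tools.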


It would probably have ensured the archive was finished before morning, but it would still have taken many hours.

(Failing to) Reinstall Ubuntu 18.04

Time to reinstall Ubuntu 18.04. I did not search the Internet first, as I assumed it would be straightforward, but the installer does not offer a nice and easy way to reinstall Ubuntu 18.04. Instead, you need to select “Something Else”

and do the configuration manually by selecting the partition you want to reinstall to, setting the mount point, and making sure “Format the partition” is NOT ticked.

Ubuntu 18.04 reinstallation partition selection

Click OK, and you’ll get a first warning that the partition has not been marked for formatting, and that you should back up any critical data before installation.

Reinstall Ubuntu 18.04 backup warning

That’s followed by another warning that reinstalling Ubuntu may completely fail due to existing files.

Ubuntu 18.04 Reinstallation Warning

Installation completed in my case, but the system would not boot. So I reinstalled Ubuntu 18.04 on another partition of my hard drive. In hindsight, I wonder whether the Ubuntu 18.04.3 LTS release may have caused the re-installation failure, since I had the Ubuntu 18.04.2 ISO, and my system may already have been updated to Ubuntu 18.04.3.

Fresh Ubuntu 18.04 Install and Restoring the Backup – A Time-Consuming Endeavor

Installation on the other partition worked well, and I decided to restore the Duplicity backup in case I had forgotten some important files in my manual tar backup. I was sort of hoping restoring a backup would be faster, but I was wrong. I started at 13:00, and it completed around 10:00 the next morning.

Duplicity/Deja Dup backup restoration

It looks like Duplicity does not support multi-threaded compression. The good news is that I still have access to my previous home folder, so I’ll be able to update the recently modified files accordingly before deleting the old Ubuntu 18.04 installation and recovering the space.

Lessons Learned, Potential Improvements, and Feedback

Anyway, I learned a few important lessons:

  1. Backups are very important, as you never know when problems may happen
  2. Restoring a backup may be a time-consuming process
  3. Having backup hardware like another PC or laptop is critical if your work depends on having access to a computer. I also learned this the day the power supply of my PC blew up; since it was under warranty, I would get a free replacement, but I had to wait three weeks because they had to send the old one to a shop in the capital city and get a new PSU back
  4. The Ubuntu re-installation procedure is not optimal.

I got back to normal use after three days due to the slow backup creation and restoration. I still have to reinstall some of the apps I used, but the good thing is that they’ll still have my customizations and things like lists of recent files.

I’d like to improve my backup situation. To summarize, I’m using Duplicity / Deja Dup in Ubuntu 18.04, backing up weekly to a USB 3.0 drive directly attached to my laptop.

One of the software improvements I can think of is making sure the backup software uses multi-threaded compression and decompression. I may also be more careful when selecting which files I back up to decrease the backup time & size, but this is a time-consuming process that I’d like to avoid.

Hardware-wise, I’m considering saving the files to an external NAS, as Gigabit Ethernet should be faster than USB 3.0, and it may also lower the I/O load on my laptop. Using a USB 3.0 SSD instead of a hard drive may also help, but it does not seem to be the most cost-effective solution. I’m not a big fan of backing up personal files to the cloud, but at the same time, I’d like such an option for restoring programs and configuration files. I just don’t know if that’s currently feasible.

So how do you handle backups in Linux on your side? What software & hardware combination do you use, and how often do you backup files from your personal computer / laptop?


56 Comments
Gaël STEPHAN

I use burp (https://burp.grke.org/). It works quite well: you tell the server how many backups you want for your computers, and the client connects regularly to the server (crontab) to ask if a backup needs to be run. Security is done through SSL and certificates, and there’s a quite good web UI on top of that.
Very reliable, handles big backups very well. I recommend it (over BackupPC, which I used before; it worked well, but was easy to break).

tkaiser

Did you have a look at https://restic.net already?

Matlo

Have a look at timeshift: https://github.com/teejee2008/timeshift

JrRocket

I think this is what I’m looking for!

dim

Looks really nice. Thanks for the tip.

Brumla

And what about using self-hosted Nextcloud, where you can back up continuously?

dgp

Back when making an install usable was a week-long grind of messing with config files, backing up and restoring everything sort of made sense, but I don’t think it does anymore. Especially if the result is 300GB files that will cause automated backups to stop working pretty soon and make doing them manually a massive pain. Backing up home should really be enough if you are careful enough not to have loads of totally legal BD rips etc in there.

That said I don’t back up anything. My work stuff is all done in git so as long as I push every few hours something suddenly failing means at most a few hours are lost. Just in case github falls into a blackhole I have a local copy of everything in gitolite. I don’t keep anything important on laptops as they are just as likely to get lost/stolen/stepped on/covered in drink as to have a filesystem failure.
Stuff that’s important enough that losing it could result in jail time (business accounts) I do in a cloud service and make them responsible.

Member

Make a gitlab.com account too and push to both of them.

Marian

Hey,
Try Veeam Agent for Linux FREE, fast system restore.

back2future

What was the disk usage (df -h)?

What’s the reason for using ~0.5TB for the OS?
Data should have its own partition aside from the OS.

memeka

I have a backup server that has a cron job to run my own script that does rsync over ssh on selected folders from multiple computers (windows and Linux). Backup server is Ubuntu + ZFS, and after rsync on a zfs filesystem, it does a snapshot, so I can easily mount backups from different days in the past (time machine). Also backup server has a dock where I can put an external hdd, and update the zfs snapshots on it e.g. weekly, then take hdd offsite – I rotate a couple of hdd so that at any point in time a hdd is in a safe place offsite with at most 1 week old data.

Edit: the offsite disks are encrypted.
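memeka’s rsync-then-snapshot flow might look something like this sketch (the pool, dataset, host, and user names are made up):

```shell
# Pull the latest files from the laptop onto the ZFS-backed backup server
rsync -aH --delete user@laptop:/home/user/ /tank/backups/laptop/
# Freeze today's state; older snapshots stay mountable read-only, time-machine style
zfs snapshot tank/backups/laptop@$(date +%Y-%m-%d)
```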

zoobab

I use ccollect, a big shell script over rsync+hardlinks, to handle all the backups of my servers and laptops. I did not like rsnapshot’s way of naming the backups (I prefer the ccollect way of naming them with date strings).

https://www.nico.schottelius.org/software/ccollect/

Over SSH on low-CPU devices, I used the arcfour cipher to lower the encryption load, or even went straight over the rsync:// protocol without SSH. SSH is still not using a low-CPU cipher based on elliptic curves.

Mihai

My Linux laptop is not backed up, but I am usually creating a separate partition for the home directory. My Windows has a Macrium Reflect backup without any drivers installed and one where all needed drivers + minimal programs are installed (Total Commander, Office, Firefox, Chrome).

On my XU4 running as a NAS I used to have a cronjob running once a month where the boot and the system partitions were dumped with dd to an external drive. It used to be ok to backup partitions like that since XU4 used a microSD card and partitions were small by today’s criteria. On my H2 (that has replaced XU4 as a NAS) I did a dd backup of the EFI partition and another dd for the root filesystem for the first 15 GB (since H2 is using a M.2 SATA drive that is considerably larger than a microSD card and dumping the whole system image is not feasible). I plan to shrink the root filesystem to 15 GB and have a separate partition for the rest (256 GB M.2 drive) so I can backup the whole rootFS partition with dd. I found that I can compress the output of dd on the fly, so free space is not an issue. I may just keep backing up 15 GB of the rootFS for now, hopefully occupied space will not be larger or even close to 15 GB (Plex has a not-so-nice habit of eating up space with its metadata and database).

Long story short: creating a separate partition for the home directory and using it helps a lot in case of OS failure. It means one would have to manually create partitions (the swap partition as well), and for some this is not easily achievable. I also prefer backing up an entire partition as opposed to backing up files.

tonny

Rather than dd, you could try fsarchiver (http://www.fsarchiver.org/). It’ll compress just the selected partitions’ content. Say you have a 40GB sda1 with just 10GB of used space: fsarchiver will just compress that 10GB to a file.

I used to back up the first 446 bytes of my drive, then fsarchiver the partitions that needed to be backed up.

Domi

I stopped backing up a long time ago. I daily copy (manually or automatically) from my workstation to a NAS (currently an HC2 plus two others that sync from it). I also have a 2nd workstation which I start using when the first one starts misbehaving. If the issue was hardware, I replace it. Then I simply reinstall. I found it simpler and faster than dealing with a backup architecture. I have not lost data in the last 25 years (except my phone, which recently committed seppuku for an unknown reason; I lost my contact list, having totally forgotten that my phone was a computer too!)

dgp

>I found it simpler and faster than to deal with a backup architecture

This. I suspect a lot of the people suggesting their favourite system wouldn’t realise if it stopped working months ago until they actually needed to restore something and the backups don’t exist or are totally useless. Just make logging your changes and pushing them somewhere part of your workflow.

>lost my contact list,

Allowing your phone to use google or apples sync service does make sense for this at least.

Domi

> Just make logging your changes and pushing them somewhere part of your workflow.

Exactly. For coding, as you wrote, git. For other stuff like docs you received or wrote and sent by email, just copy them to the NAS once you’re done with them.

> Google sync

I have very little trust in cloud storage that ends up being data mined…

kdayns

Just export your contacts as files.

Diego

I’d get myself a big enough storage medium and use simple rsync. No compression (or at most something light like gzip) and easy incremental backups. Of course you should do a full snapshot every now and then, or you back up to ZFS or similar (with rsync, as long as your desktop uses ext4) and snapshot with the filesystem’s on-board utilities. Should be considerably faster.

Currently I’m rather lazy and don’t back up the OS at all. The data is copied between my current and previous PC manually. Which has the added benefit that my previous PC is hardly ever running.

Gaetano

I use rsync for the home and data directories/filesystems, and fsarchiver (https://en.wikipedia.org/wiki/FSArchiver) for the OS. With it I make a weekly compressed “hot backup” image. I chose it because its compression also uses threads, so you can maximize the usage of hyper-threaded CPUs during compression, resulting in a lower time to create the entire image. To recover a single file/directory you have to restore the image to another partition equal to or greater than the source, but we all hope that never happens; in the meantime, I can confirm that each time I had issues with the OS (it wouldn’t boot and so on), restoring this image (booting from a live distro) successfully restored the system and the OS booted again.

I use it with this syntax:

DATA_FILE_EXT=$(date +"%Y-%m-%d")
HOSTNAME=$(hostname)
UBUNTU_VERSION=$(cat /etc/issue | awk '{print$1$2}' | awk -F"/" '{print$1}')
ARCH=$(uname -m | sed "s|x86_64|64bit|g" | sed "s|i386|32bit|g")
DEST_FILE=$BACKUP_DIR/${HOSTNAME}.${UBUNTU_VERSION}.${ARCH}.${DATA_FILE_EXT}
NUM_THREADS=$(cat /proc/cpuinfo | grep "processor" | grep ":" | wc -l)
LOGFILE=${BACKUP_DIR}/fsarchiver_backup_system.log

/usr/sbin/fsarchiver -a -o -v -Z 22 -j ${NUM_THREADS} --exclude=/home/gaetano/Downloads --exclude=/var/lib/snapd/cache/ -A savefs ${DEST_FILE} $ROOTFS_DEV 2>&1 | tee -a ${LOGFILE}

Member

I don’t back up anything either, I have everything in the cloud. I can whack my Linux box and not lose anything except a few configuration settings which are easy to set back in. In fact, every time I upgrade my major Ubuntu revision I use a clean partition. Gitlab.com gives you unlimited, private projects for free. Plus I have a second copy of most stuff in my Amazon account. All of my mail is in gmail. I do work on everything locally and then commit and push the changes up to gitlab. Gitlab works for all files, not just code.

William Barath

Sneaky. Google Compute has 50GB of free git as well… Not “unlimited” but that’s more than enough for config and most personal/business documents that anyone will have. If you’re a video content creator then cloud storage is probably not in the cards, at least not for raw footage.

Daniel

Have you tried out https://www.borgbackup.org/ ?

Ricky

What ext4 software bug caused this issue? Are you sure it is not your hardware problem?

Chris

I think exactly like you! It looks like hardware corruption!

RK

/etc/nixos and /home rsynced to a separate hard drive every once in a while. I should probably create another backup on a separate machine, but I’m too lazy.

Redeployment takes an hour, more or less.

RK

Just verified; It’s this snippet on a cron job twice a week:

#! /usr/bin/env sh
rsync -avc /mnt/sdb1/home/rk /mnt/sdc1/home
rsync -avc /etc/nixos /mnt/sdc1/nixos

Additionally I have syncthing dumping some stuff from my phone over here on my home but yeah… If both hard drives fail I’m screwed so I should probably do something about this at some point… Bah. Still too lazy.

Frédéric BERIER

Hi,
I use backintime on Kubuntu 18.04. It is very configurable: cron + auto-run when the external drive is plugged in.
I use a USB 3 NVMe external SSD drive: 10 Gbit/s.
I’m happy with that.

William Barath

I use duplicity with Backblaze and I simply renamed pigz to gz and pbzip2 to bzip2 to overcome duplicity’s silly lack of switches… and tbh I don’t understand why installing these parallel versions doesn’t use the alternatives system to replace the single-threaded ones by default… SMP/SMT has been a standard feature of even cheap systems for a decade.

Amanda

Sorry you are having data problems, I know how frustrating that can be. I’ve lost important data before, so now I overcompensate with triple redundancy. I rotate between 3 identical external hard drives that are kept in 3 different locations so that even a catastrophic building collapse or fire will not destroy my data. I rsync my important data frequently to the external drive currently on-site, then swap it with one of the other drives when I visit one of the other locations where the drives are stored. All drives are LUKS encrypted so I don’t have to worry about theft either. A very simple rsync script does the job.

rsync -av --delete-before --ignore-errors --exclude={'/Downloads/Linux Distros/','/Media/Plex/','/.Trash-1000/','lost+found'} /mnt/data-drive/ /mnt/backup-drive/

A typical run takes just a few minutes because it’s only copying what has changed. I also exclude a few directories of easily replaceable data to save space and speed up the process. I hope you get things up and running again soon.

Richard Crook

“The system would not boot, but I could still SSH to it. ”

If the system had not booted, you would not be able to SSH in to it; for the SSH service to be running, the network device must be activated. CTRL-ALT-F1 or CTRL-ALT-F2, etc., up to maybe CTRL-ALT-F6, should get you to a virtual console where you could log in directly to a shell prompt as root, or su to root (exec sudo su) to then fix the problem.

For corrupted file system recovery or other major disasters, it is always a TOP priority to have a boot DVD or USB with the same version of the distribution in order to boot from that. Then repair the installation (fsck partitions, mount partitions + bind mounts for proc, sys, dev and friends, and chroot into the mounted system for reinstallation of packages, assuming neither perl nor /var/lib/dpkg is broken). Essential files are backed up to /var/backups, so some copying from there may allow recovery.
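The repair sequence described above can be sketched like this, run from a live/rescue boot (device and package names are illustrative):

```shell
# 1. Check, then mount, the damaged root filesystem
fsck -f /dev/sda2
mount /dev/sda2 /mnt
# 2. Bind-mount the pseudo filesystems the chroot will need
for d in proc sys dev dev/pts; do mount --bind /$d /mnt/$d; done
# 3. Reinstall damaged packages inside the chrooted installation
chroot /mnt apt install --reinstall gdm3
```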

Stefan

I am using timeshift for backing up the system automatically, keeping different snapshots for three weeks each. As they are done incrementally, they are very fast and just need a few minutes to be created (depending on how fast the storage is attached, of course).

Additionally, I created a lot of scripts using rsync, run automatically by a cronjob, backing up several other aspects (work data, e-mail profiles, etc…).
And every month I create a whole image using Clonezilla. Just on my personal computer, not on the servers, as there has to be no downtime, of course.

Maybe these are a lot of backups, but I like to be safe.
The funny thing is: I have never needed these backups. I have been using Linux for more than 17 years now and I never had a serious filesystem corruption. My personal computers use Linux Mint, the servers run Debian, and there are also some Raspberry Pis with Raspbian (where an SD card died, but the filesystem is not to blame for that). Ext2/3/4 has been very reliable for me, as was even ReiserFS, which I used 15 years ago.

Themadmax

I’m testing restic for my servers, with S3 cloud storage. Easy differential backups.

willy

I’m using rsync through my “dailybak” script (http://git.1wt.eu/web?p=dailybak.git). It performs daily snapshots via the network (when the laptop is connected) to my NAS (DS116J, reinstalled on Linux to get rid of bugs^Wvendor-specific features), and it manages to keep a configurable number of snapshots per week, month, year and 5yr so that I can easily go back in time just by SSHing into the NAS without having to untar gigs of files. rsync is particularly efficient at doing this because it performs hardlinks to the previous backup and uses little extra space. However it uses gigs of RAM to keep track of all files and that required me to initially use zram and swap on the NAS (1 GB RAM), and finally to switch back to swap only as the little memory used by zram was making the task even harder for rsync. Note that this is an issue because I do have many kernel workdirs, and the workload totals 5.5 million files, but most users will have far less. Also since it doesn’t require much network bandwidth, it even works fine when the laptop is plugged downstairs on a PLC adapter.

It just reminds me it’s been two months without being connected, it’s about time I do it again.

In your case I doubt you’re hitting last year’s bug, because it was fixed around december and I guess you’ve updated your machine’s kernel since then.
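The hardlink trick willy’s script relies on can be shown with a tiny self-contained rsync demo (the /tmp paths are made up for illustration):

```shell
# Day 1: full copy of the source
mkdir -p /tmp/src /tmp/bak
echo "hello" > /tmp/src/a.txt
rsync -a /tmp/src/ /tmp/bak/day1/
# Day 2: unchanged files become hardlinks to the previous snapshot,
# so each directory looks like a full backup but costs almost no space
rsync -a --link-dest=/tmp/bak/day1 /tmp/src/ /tmp/bak/day2/
# Both copies of a.txt share one inode
stat -c %i /tmp/bak/day1/a.txt /tmp/bak/day2/a.txt
```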

S K

“Gigabit Ethernet should be faster than USB 3.0” – wait, what? Why?
Gigabit Ethernet is, well, 1 Gb/s, whereas USB 3.0 is 5 Gb/s.
Am I missing something here?

fossxplorer

Borg Backup, which someone mentioned here, has deduplication (as well as compression, etc.) that can be invaluable to some users.

theguyuk

Why has no one suggested checking for bios or driver updates for the laptop?

aussetg

Just use ZFS 🙂

( with replication / mirrors + backup to a ZFS remote pool using zfs send )

Stephan

http://www.nongnu.org/rdiff-backup/ is what I use with an external USB drive.