Backup using rsync - a good idea but preserving hard-links is important!

Greetings!

One thing I have discovered is the awesome power of rsync.

Rsync can, among other things, make a literal “clone” of the O/S such that it can be restored in its totality and function as a working, running, system when done.

I have noticed “quirky” problems with rsync’d backups in the past, especially when backing up and restoring a PINNified O/S.  Things behave in weird and wonky ways that the original O/S didn’t.  Or, the PINNified system behaves differently than the original, as-downloaded, system does.

I have kept this in the back of my mind and have kept my eyes open for the last year or so, and I think I have found an answer.  Maybe not THE answer, but a possible failure mode.

Issue:
The suggested and widely documented rsync command for backing things up, including entire filesystems (rsync -axXv [source] [destination]), does not preserve hard links.  It does not take into account the possibility that there may be more than one hard link to an individual file object.

If you run rsync ignoring hard links, (i.e. not preserving them), you will end up with multiple copies of the file instead of multiple references linked back to a single, unique instance of that file object.

One possible result of this is that a program, or a package of files, may update a file through one of its pathnames (or filenames) while the copies under the other names stay unchanged.  This can cause essential files to become “out of sync” - as they are now separate file objects instead of one file with multiple hard links.
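Here is a minimal sketch of that failure in a throwaway directory.  The file names (original, linked, copied) are made up for illustration, and the plain cp stands in for a backup that did not preserve the hard link.  It only shows what happens when a file is rewritten in place through one of its names.

```shell
#!/bin/sh
# Work in a temporary directory so nothing real is touched.
demo=$(mktemp -d)

echo "version 1" > "$demo/original"
ln "$demo/original" "$demo/linked"   # second name for the SAME file object
cp "$demo/original" "$demo/copied"   # separate file object with the same contents

# Rewrite the contents in place through one name only.
echo "version 2" > "$demo/original"

linked_content=$(cat "$demo/linked")   # hard link sees the update
copied_content=$(cat "$demo/copied")   # separate copy is now stale

echo "linked: $linked_content"
echo "copied: $copied_content"
```

The hard-linked name follows the change, while the separate copy still holds the old contents.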

The solution is to “preserve hard links” when rsyncing anything more complex than a user-created data store - and even then I would do it anyway, just in case.

Viz.:
rsync -axXHv [source] [destination]
. . . where the “H” option tells rsync to save files that are hard-linked as hard-linked files instead of separate file objects.
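A quick way to see the difference for yourself, as a rough sketch: it assumes rsync is installed, and the directory and file names (src, file_a, file_b) are invented for the test.  It copies the same small tree twice, once without -H and once with it, and then counts the files in each copy that still have more than one link.

```shell
#!/bin/sh
# Throwaway source and destination directories (hypothetical test data).
src=$(mktemp -d)
without_h=$(mktemp -d)
with_h=$(mktemp -d)

echo "some data" > "$src/file_a"
ln "$src/file_a" "$src/file_b"       # file_a and file_b share one file object

if command -v rsync >/dev/null 2>&1; then
    rsync -a  "$src/" "$without_h/"  # hard links NOT preserved
    rsync -aH "$src/" "$with_h/"     # hard links preserved

    # Count the files that have more than one link in each copy.
    count_without=$(find "$without_h" -type f -links +1 | wc -l)
    count_with=$(find "$with_h" -type f -links +1 | wc -l)

    echo "hard-linked names without -H: $count_without"
    echo "hard-linked names with -H:    $count_with"
else
    echo "rsync is not installed, skipping the comparison"
fi
```

Without -H the copy should contain two independent files, so the first count comes out as zero.  With -H both names should still point at one file object.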

That raises the question:
Does it matter?  Does a real Raspbian filesystem actually contain files that are hard-linked together?

I recently ran a test suggested by my Ph.D. brother who is an acknowledged expert in Linux.
sudo find . -links +1 -type f -name '*' -printf '%i %p\n' | sort

(I ran it redirecting output to a file so I had something to study.)
sudo find . -links +1 -type f -name '*' -printf '%i %p\n' | sort > /home/pi/Desktop/hardlinks.txt 2>&1

A sample of the output:

10006 ./usr/bin/pigz
10006 ./usr/bin/unpigz
130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_nouveau.so.1.0.0
130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_r300.so.1.0.0
130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_r600.so.1.0.0
130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_radeonsi.so.1.0.0
153228 ./usr/share/doc/libreadline6/examples/Inputrc
153228 ./usr/share/doc/libreadline7/examples/Inputrc
153529 ./usr/share/doc/wamerican/README.Debian
153529 ./usr/share/doc/wbritish/README.Debian
153530 ./usr/share/doc/wamerican/changelog.Debian.gz
153530 ./usr/share/doc/wbritish/changelog.Debian.gz
153531 ./usr/share/doc/wamerican/copyright
153531 ./usr/share/doc/wbritish/copyright
153587 ./usr/share/doc/x11-common/changelog.gz
153587 ./usr/share/doc/xserver-xorg/changelog.gz
153587 ./usr/share/doc/xserver-xorg-input-all/changelog.gz

. . . and so on.  There were 1612 lines of hard-linked files in my running instance of GoPiGo 3.0.2 Beta!  I am assuming other Linux installations have a similar number of hard links.  Maybe more, maybe less, but all potentially important.
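If you want a summary instead of reading the whole list, something like the following could work.  This is only a sketch: summarize_hardlinks is a made-up helper name, and it assumes the input is in the "inode path" format shown above.  Note that inode numbers are only unique within one filesystem, so the grouping is only meaningful if the listing did not cross mount points.

```shell
#!/bin/sh
# Hypothetical helper: reads "inode path" lines on stdin and reports how many
# names there are and how many distinct inodes (groups) they belong to.
summarize_hardlinks() {
    awk '
        { count[$1]++ }                    # tally names per inode number
        END {
            for (inode in count) groups++  # one group per distinct inode
            printf "%d names in %d groups\n", NR, groups
        }
    '
}

# Example with four lines in the same format as the listing above.
printf '%s\n' \
    '10006 ./usr/bin/pigz' \
    '10006 ./usr/bin/unpigz' \
    '130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_r300.so.1.0.0' \
    '130571 ./usr/lib/arm-linux-gnueabihf/vdpau/libvdpau_r600.so.1.0.0' |
    summarize_hardlinks
```

To run it on the saved listing, feed the file in on stdin: summarize_hardlinks < /home/pi/Desktop/hardlinks.txt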

That being the case, it is obvious that preserving hard-links is important, nay, potentially CRITICAL to maintaining the integrity of a backed-up filesystem.

What say ye?
