It was while talking about backups with my colleague Ludovic that I became aware of a danger threatening them: bitrot.
Sure, my backups are redundant and stored in different places, but what if my master copy gets corrupted by bitrot and the corrupted files then propagate across all my backups?
Ludovic told me about ZFS, a filesystem designed with checksums and data protection built in at the filesystem level; unfortunately my (beloved) NAS, the Synology DS211J, does not support ZFS (it uses ext4 by default)…
But even if my filesystem cannot detect corrupted data, jpeginfo can detect corrupted JPEG files: so I would just need to run jpeginfo on my photo collection before any new backup!
Now, jpeginfo is widely available on mainstream Linux distributions such as Ubuntu or Debian (sudo apt-get install jpeginfo) and even on Mac OS X (brew install jpeginfo), but it does not come pre-installed on the Synology Diskstation.
Installing jpeginfo on the Synology Diskstation
- install ipkg on your diskstation, this post sums it up nicely
- then from your workstation, download the debian sid jpeginfo package (this is important: more recent releases have a dependency on libjpeg.8 which is not provided by the ipkg feed) :
- unpack it :
ar x jpeginfo_1.6.0-4_armel.deb
- unpack the archive containing the binary :
tar xvzf data.tar.gz
- copy the binary to your diskstation :
scp ./usr/bin/jpeginfo firstname.lastname@example.org:
- log in to your diskstation: jpeginfo is now available at /root/jpeginfo (you can then move it to a directory on your PATH)
I also published this howto as a reply to a Synology forum thread, which actually got me started on the next step.
Running jpeginfo before backing up
Let’s say you want to make sure your photos are not corrupted before running cron against your collection (or a subset of it); here is the one-liner:
find . -type d -iname "@eaDir" -prune -o -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) -print0 | xargs -0 /root/jpeginfo -cv | grep "ERROR"
In other words: find all files ending in jpg or jpeg that are not in a directory named @eaDir (where the Diskstation stores its thumbnails), check them with jpeginfo, and print any result containing ERROR to the console.
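If you want to convince yourself that the filter really skips the @eaDir thumbnails before pointing it at your real photos, you can try it on a throwaway tree. This is only a sketch: list_photos and the sandbox layout are names I made up for the illustration, and the find expression is written with the -prune test first (skip @eaDir directories, otherwise print regular files ending in jpg or jpeg), which is one common way to express the filter described above.

```shell
#!/bin/sh
# list_photos DIR (hypothetical helper): print the JPEG files that would be
# handed to jpeginfo, skipping anything below a directory named @eaDir.
list_photos() {
    find "$1" -type d -name "@eaDir" -prune -o \
        -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) -print
}

# Build a throwaway tree that mimics a Diskstation photo folder.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/2012/@eaDir"
touch "$sandbox/2012/beach.jpg" "$sandbox/2012/Sunset.JPEG" "$sandbox/2012/notes.txt"
touch "$sandbox/2012/@eaDir/thumb.jpg"

# Only beach.jpg and Sunset.JPEG should be listed:
# notes.txt is not a JPEG, and thumb.jpg lives under @eaDir.
list_photos "$sandbox"
```

Once the listing looks right, swapping -print for -print0 and piping into xargs -0 gives you back the one-liner.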
Now you just need to make sure that whenever an ERROR is found, you log it somewhere and you abort the backup.
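Here is a minimal sketch of what that could look like. Everything named here is my own assumption rather than something jpeginfo or the Diskstation provides: the check_photos function, the JPEGINFO variable (defaulting to the /root/jpeginfo location used above) and the log file argument. I also swapped xargs for find's -exec … {} + form, so that jpeginfo is simply not run when no photo is found; it relies on jpeginfo -c printing ERROR on the line of each damaged file, as in the one-liner.

```shell
#!/bin/sh
# Path to the jpeginfo binary; override it if you moved the binary elsewhere.
JPEGINFO="${JPEGINFO:-/root/jpeginfo}"

# check_photos DIR LOGFILE (hypothetical helper)
# Returns 0 when jpeginfo reports no ERROR under DIR.
# Returns 1 otherwise, after appending the offending lines to LOGFILE.
check_photos() {
    dir=$1
    log=$2
    # Skip @eaDir directories, check every .jpg/.jpeg file, keep only ERROR lines.
    errors=$(find "$dir" -type d -name "@eaDir" -prune -o \
                 -type f \( -iname "*.jpg" -o -iname "*.jpeg" \) \
                 -exec "$JPEGINFO" -c {} + | grep "ERROR")
    if [ -n "$errors" ]; then
        printf '%s corrupted files found in %s:\n%s\n' "$(date)" "$dir" "$errors" >> "$log"
        return 1
    fi
    return 0
}

# Example of use in a backup script (paths are placeholders):
#   check_photos /path/to/photos /path/to/jpeg-check.log || exit 1
#   ...run the actual backup command here...
```

The function only reports; the caller decides what to do with a failure, so the same check can guard any backup command.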