Sunday, 28 October 2007

Deceiving checksums

Ever tried to copy a file or burn a CD, only to find that the copy seems just fine, except according to the checksum? Sometimes it's worth digging a bit deeper (or so it seems to me, as a geek):
ubuntu@ubuntu:/mnt/hda2/ISO$ cat md5sum.txt
d2334dbba7313e9abc8c7c072d2af09c ubuntu-7.10-desktop-i386.iso
ubuntu@ubuntu:/mnt/hda2/ISO$ dd if=/dev/hdd | md5sum
1425008+0 records in
1425008+0 records out
729604096 bytes (730 MB) copied, 165.264 seconds, 4.4 MB/s
04af936c32bf2a26062a70360dd447cb -
Game, set, and ... no match. (For the record, "md5sum /dev/hdd" wouldn't illustrate my point here, since it doesn't report how many bytes were read.) Let's see what we have:
ubuntu@ubuntu:/mnt/hda2/ISO$ ls -l *.iso
-rw-r--r-- 1 ubuntu ubuntu 729608192 2007-10-28 13:05 ubuntu-7.10-desktop-i386.iso
Ah, 4096 bytes are missing: dd read 729604096 bytes from the disc, but the image is 729608192 bytes long. With some dd / md5sum use it turns out that the part that was read is a perfect copy, as expected. Well, I'm using that live CD to blog about it while installing; the installation has finished already, so I'm assuming it worked out... :-)
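
The dd / md5sum check can be sketched roughly like this: hash only as many bytes of the image as dd managed to read from the disc, and compare that with the hash of the disc. The helper name prefix_md5 is made up for illustration, and the sketch assumes GNU head and md5sum as found on the live CD.

```shell
# prefix_md5 FILE BYTES: print the MD5 of the first BYTES bytes of FILE.
# (Hypothetical helper, not part of any tool.)
prefix_md5() {
    head -c "$2" "$1" | md5sum | cut -d' ' -f1
}

# Usage with the names and byte count from the transcript above:
#   prefix_md5 ubuntu-7.10-desktop-i386.iso 729604096
#   prefix_md5 /dev/hdd 729604096
# If both hashes agree, the disc matches the image up to the missing tail.
```

If the two hashes agree, only the last 4096 bytes of the image are unaccounted for.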

(Next time I intend to stick to K3b. I'm guessing its integrated verification process copes with these problems, or that it never creates them in the first place. It wasn't available on the live CD, though.)

An older story: I once tried a poor man's backup of a 40 GB drive:
$ nc -l -p 5678 > hda # host waits for data
# nc HOST 5678 < /dev/hda # start backup on a Linux live-CD (HOST = the waiting machine)
This worked fine, or so it seemed. I checksummed different parts of the original and the copy (binary-search style) and narrowed the mismatch down to one inconsistent piece. In the end I must have noticed that this chunk gave different checksums at two different times. While transmission errors are theoretically possible, I wouldn't normally expect them with both computers in the same room and on the same switch.
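
For the curious, the binary-search checksumming could look something like the sketch below. It is not what I ran back then: it assumes both the original and the copy are readable on the same machine, that GNU dd is available (for bs=1M), and the function names chunk_md5 and first_diff_mib are invented here.

```shell
# chunk_md5 FILE START COUNT: MD5 of COUNT MiB of FILE, starting at MiB offset START.
chunk_md5() {
    dd if="$1" bs=1M skip="$2" count="$3" 2>/dev/null | md5sum | cut -d' ' -f1
}

# first_diff_mib A B TOTAL: print the index of the first MiB block that differs
# between A and B within the first TOTAL MiB, or "none" if they are identical.
# The body runs in a subshell so lo/hi/mid do not leak into the caller.
first_diff_mib() (
    lo=0
    hi=$3
    if [ "$(chunk_md5 "$1" 0 "$hi")" = "$(chunk_md5 "$2" 0 "$hi")" ]; then
        echo none
        exit 0
    fi
    # Invariant: blocks [0, lo) are equal, blocks [lo, hi) contain a difference.
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if [ "$(chunk_md5 "$1" "$lo" $((mid - lo)))" = "$(chunk_md5 "$2" "$lo" $((mid - lo)))" ]; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$lo"
)
```

Note that this only converges if the source holds still while you search. Running chunk_md5 twice on the same range of the original and getting two different answers is the sign that it doesn't.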

Running the live CD, I naively assumed that all the partitions being copied were mounted read-only at most, so nothing on the disk could change either. Maybe the hardware was crashing on me...?

Nope. Evidently the live CD had been using the swap space on that very disk, so:
# swapoff -a
would have helped before copying the whole disk. Sigh! ;-)
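
A quick pre-flight check before imaging a disk from a live CD might look like the sketch below. It reads /proc/swaps, which on Linux lists the active swap areas after one header line. The function name active_swaps is made up for illustration.

```shell
# active_swaps [FILE]: print the swap areas currently in use, one per line.
# FILE defaults to /proc/swaps; its first line is a column header, so skip it.
active_swaps() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Usage: refuse to start the copy while anything is listed.
#   if [ -n "$(active_swaps)" ]; then
#       echo "swap still active, run swapoff -a first" >&2
#   fi
```

Empty output means no swap is active, and the disk should hold still while it is being read.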
