Thor might be on to something regarding cp's use of read() vs. mmap(). I'll summarize my long post, which I only sent to the mailing list:

I had similar problems on a USB hard disk with 90,000 files on it. A few would have sha256 checksum differences (from what was expected on a backup copy). I wrote the sha256 program myself, and discovered that the read() function returned 0 (EOF) BEFORE the end of the file was reached! So the checksum was wrong.

Some affected files were small (1 MB), some were larger ISO files (400 MB). Maybe some 4 GB files failed, I can't remember. The files affected were random and different every time.

Something I forgot to mention was that this USB disk has 4 KB sectors:

http://mail-index.netbsd.org/netbsd-users/2012/03/03/msg010180.html

I also inserted the sha256 code into cp.c to compute the checksum at the same time as copying. I think I recall that cp uses read() or mmap() depending on the size of the file to be copied. I can't remember whether this version computed correct checksums 100% of the time or not.

Anyhow, something is wrong. And I suppose it is still possible that the disk drive has a flaw too.

I also can't remember (not near the systems right now) if the machine showing these symptoms was NetBSD or FreeBSD. I assume that the system code at this low level might be the same. This may not be a valid assumption on my part, but the symptoms seemed SO very similar to the original poster's.

John Refling

_____________________________________________
From: John Refling [mailto:netbsdrat%gmail.com@localhost]
Sent: Wednesday, August 21, 2013 5:18 AM
To: netbsd-users%NetBSD.org@localhost
Subject: Re: MD5 failing on optical media

I have had similar problems, although in some respects quite different, regarding USB hard disk vs. SATA/PATA hard disk. Let me try to explain from memory:

I have a mix of FreeBSD and NetBSD systems. So I might be using a NetBSD-formatted disk on FreeBSD AND vice versa. This might be the issue, but read on.
I can track down and replicate the details in a few days, and retest things as needed.

Anyway, I wrote an md5 and sha256 program from the spec by cutting and pasting. Quite simple really: just an init() stage, an update() stage (repeated with each INPUTBUFFSIZE block of your data), and a finalize() stage, then print out the result.

I have 3 identical archives of approx. 90,000 files, none really huge. One is on a USB hard disk, one on a SATA hard disk and one on a PATA hard disk, on different machines (some NetBSD, some FreeBSD). These disks are all copies of each other, made either over the network or locally.

When I run my md5/sha256 on all the files on the SATA and PATA disks, everything matches perfectly. However, on the USB disk, there seem to be about 4-5 files that fail to match the checksum (both md5 and sha256, as will be seen later) of the same files (by filename, and expected to be the same copies) on the PATA/SATA disks.

Every time I run the md5/sha256 checksum on the USB hard disk, a different 4-5-6 files (out of 90,000) fail to compare (different checksum).

More bizarre, when I do a 'cmp' from the USB disk to PATA/SATA, the files compare OK. Also, when I copy them from the USB disk to a local tmp area on PATA/SATA, the checksum of the local file is now correct (matches PATA/SATA), similar to the OP.

Since I wrote (copied) the md5/sha256 program, I was curious as to what was happening. I tossed in some debugging info:

What I discovered was that my update() loop, on the few files that failed the checksum, would always quit on exactly a multiple of the INPUTBUFFSIZE used in the read() call.

So if my read() INPUTBUFFSIZE was 1024 k bytes, and the file was 4 x 1024 k bytes + SOME_SMALL_NUMBER (say 100), there would be:

Read #1 gets 1024 k bytes, update(), chksum OK
Read #2 gets 1024 k bytes, update(), chksum OK
Read #3 gets 1024 k bytes, update(), chksum OK
Read #4 gets 1024 k bytes, update(), chksum OK
Read #5 returns 0, no more bytes, even though 100 (say) bytes are left in the file that are needed to compute the correct checksum.....
THEREFORE CHECKSUM WRONG!

I also added a bit of code to give a warning if the stat.st_size of the file does not equal the cumulative number of bytes returned by read(). Same result: the few files that lost bytes on the read() also (obviously) did not match the byte count from stat() for the file. stat() was correct, but read() read LESS than the stat() file length and returned 0 (no more data) early. UNLESS WE NEED TO WAIT FOR A TIMEOUT, OR FOR MORE DATA TO BECOME AVAILABLE AFTER A DELAY.

So on these 4-5 or 5-6 files (DIFFERENT FILES *EVERY* TIME IT IS RUN), a tiny number of files lose a small number of bytes in the read() loop (which obviously causes the checksum to be entirely different). [The read() loop gets a 0 and terminates early, with a few bytes left that SHOULD BE READ but are not.]

This is only observed on the USB hard disk, not SATA/PATA. I only have one USB hard disk under test, so this statement is not really definitive as to cause.

I initially assumed this was a hardware defect in the USB hard disk (might still be), or maybe in a driver, or ???...

FYI,

John Refling

Maybe the common thing with the OP and me is 'USB' (CD/HD)
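
For anyone who wants to try the same check, here is a minimal sketch in C of the read() loop plus the st_size comparison described above. It is not the original program: the names count_bytes_read() and check_read_length() are made up for illustration, the checksum update() call is only marked by a comment, and retrying on EINTR is an added assumption rather than something the post describes.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read fd to EOF in bufsize-byte chunks and return the total number of
 * bytes delivered by read(), or -1 on error. (Hypothetical helper.)
 */
long long
count_bytes_read(int fd, size_t bufsize)
{
	unsigned char *buf = malloc(bufsize);
	long long total = 0;
	ssize_t n;

	if (buf == NULL)
		return -1;
	for (;;) {
		n = read(fd, buf, bufsize);
		if (n > 0) {
			/* a checksum update() over buf[0..n) would go here */
			total += n;
		} else if (n == 0) {
			break;		/* read() reports EOF */
		} else if (errno != EINTR) {
			free(buf);
			return -1;	/* real I/O error */
		}
		/* n == -1 with EINTR: retry (assumption, see above) */
	}
	free(buf);
	return total;
}

/*
 * Compare what read() delivered against st_size from fstat().
 * Returns 0 if they agree, 1 if they differ, -1 on error.
 * (Hypothetical helper.)
 */
int
check_read_length(int fd, size_t bufsize)
{
	struct stat st;
	long long total;

	if (fstat(fd, &st) == -1)
		return -1;
	total = count_bytes_read(fd, bufsize);
	if (total < 0)
		return -1;
	if (total != (long long)st.st_size) {
		fprintf(stderr, "warning: read %lld bytes, st_size is %lld\n",
		    total, (long long)st.st_size);
		return 1;
	}
	return 0;
}
```

To use it, open() each file read-only and call check_read_length(fd, 1024 * 1024). A return of 1 would correspond to the symptom above, where read() returns 0 before st_size bytes have been delivered.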