I have just been pointed at a postgresql mailing list discussion about a
scenario where a program has written data to a file descriptor, calls
fsync(), there is an I/O error in writing the data out, and the program
re-tries the fsync() until it returns success. On Linux, that can
apparently leave your data unwritten: the kernel forgets the I/O error
after the first try.

As far as I know, NetBSD leaves dirty buffers marked dirty in case of
error, so that they are retried again later. Is my understanding right?

References:

Original mail:
https://www.postgresql.org/message-id/flat/CAMsr+YE5Gs9iPqw2mQ6OHt1aC5Qk5EuBFCyG+vzHun1EqMxyQg%mail.gmail.com@localhost#CAMsr+YE5Gs9iPqw2mQ6OHt1aC5Qk5EuBFCyG+vzHun1EqMxyQg%mail.gmail.com@localhost

Referenced question on Stack Overflow about what the Linux kernel does:
https://stackoverflow.com/questions/42434872/writing-programs-to-cope-with-i-o-errors-causing-lost-writes-on-linux/42436054#42436054

Quote from the email thread:

    I found your discussion with kernel hacker Jeff Layton at
    https://lwn.net/Articles/718734/ in which he said:

    "The stackoverflow writeup seems to want a scheme where pages stay
    dirty after a writeback failure so that we can try to fsync them
    again. Note that that has never been the case in Linux after hard
    writeback failures, AFAIK, so programs should definitely not assume
    that behavior."

I hope that NetBSD does leave the pages dirty, because that's the only
thing that makes sense, right?

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert -- Wayland: Those who don't understand X
\X/ rhialto/at/falu.nl      -- are condemned to reinvent it. Poorly.