tech-misc archive
Re: du -m is off
On 22-Mar-08, at 11:00 AM, Jeremy C. Reed wrote:
> I noticed that a file of 10485760 bytes would be listed as 11MB by
> du -m. The -h "humanize" option is correct and lists it as 10MB.
> du.c uses:
>     (void)printf("%lld\t%s\n",
>         (long long)howmany(blocks, (int64_t)blocksize),
> This works better for me:
>     printf("%ld\n", (long)((int64_t)blocks / blocksize));
> But that doesn't help for files of less than a megabyte, which will
> then be displayed as zero.
> As another example, a file of 3372684 bytes is 3.21644MB. du -m would
> round up to 4, while my printf above prints "3".
> I am not sure what the correct behaviour should be.
The first hint is to always think about "blocks", not bytes.
I think there's an argument to be made that when dealing with units
of allocation (and available space) the rounding should always be
done so that any excess goes to the next increment when counting
allocated units and, vice versa, any shortfall goes to the next
decrement when counting free units. This is effectively how df
portrayed free blocks way back when filesystems were simple and you
couldn't squeeze another one-byte file onto a filesystem with 511
bytes free no matter how hard you tried. 511 bytes is 99.8% of a
512-byte block, so in normal arithmetic the free block count would be
rounded up (to 1), yet no whole block is actually free and the count
shown must be zero. I think the same logic should apply no matter
what units df is displaying its counts in. If there's less than 1MB
free and we're displaying in terms of MB then there are zero MB free,
for example.
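
As a rough illustration of the free-space rule (a sketch only, not code
taken from df; the name units_free is made up for this example), plain
truncating integer division already gives the "anything under goes to
the next decrement" behaviour:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from df(1): count how many whole display
 * units fit in the free space.  Integer division truncates, so a
 * partial unit is never reported as free.
 */
static int64_t
units_free(int64_t bytes_free, int64_t unit_bytes)
{
	return bytes_free / unit_bytes;
}
```

With this, units_free(511, 512) is 0, and so is
units_free(1048575, 1048576).
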
Similarly for du: files of less than 1MB, even if they're only one
byte long, should be shown as using one display-sized unit, and so
for 'du -m' the one-byte file should show as using 1MB. The same goes
for that 3372684-byte file: it should indeed show as 4MB with
"du -m", even without considering any finer details.
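
The usage side of the same rule can be sketched like this. It is only
an illustration working in bytes; units_used is a hypothetical name,
and the expression is the usual round-up division that the howmany()
call quoted above is used for:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from du.c: number of display units that
 * must be counted as used.  Adding (unit_bytes - 1) before dividing
 * makes any partial unit count as a whole one.
 */
static int64_t
units_used(int64_t bytes_used, int64_t unit_bytes)
{
	return (bytes_used + unit_bytes - 1) / unit_bytes;
}
```

Here units_used(1, 1048576) gives 1 and units_used(3372684, 1048576)
gives 4, while a plain division would give 0 and 3.
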
Now as to the problem with your 10485760-byte file, that's easy
to understand if you first take a peek at just how much space has
been allocated for that file (and keep in mind that "du" shows disk
usage, not file size) [use "stat -s"]. For example, when I create an
exactly 10MB file I get a file that uses 20512 "blocks" (i.e.
512-byte blocks) of disk space. If you multiply that back out, the
disk space used is 10502144 bytes, i.e. 10.015625MB, and so by the
logic above the file does indeed use what must be counted as 11MB of
disk space when shown in 1MB units. This is of course
because the block size on the filesystem is 16KB, not 512B, and your
10MB file actually needs 641 filesystem (16KB) units of allocation
(20512 / 32), i.e. it does actually use more than 10MB of filesystem
space.
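
The arithmetic for that file can be checked with a few lines of C.
This is a sketch under my own assumptions, not the du.c code: the
macro and function names are invented, and the count is taken in
512-byte blocks, so 1MB is 2048 of them.

```c
#include <stdint.h>

/* Round-up division, redefined here so the sketch stands alone. */
#define ROUND_UP_DIV(x, y)	(((x) + ((y) - 1)) / (y))

#define DEV_BLOCK	512	/* bytes per "block" */
#define MB_IN_BLOCKS	2048	/* 1MB expressed in 512-byte blocks */

/* Disk space in bytes for a count of 512-byte blocks. */
static int64_t
blocks_to_bytes(int64_t nblocks)
{
	return nblocks * DEV_BLOCK;
}

/* MB counted so that any partial MB becomes a whole one. */
static int64_t
mb_rounded_up(int64_t nblocks)
{
	return ROUND_UP_DIV(nblocks, MB_IN_BLOCKS);
}

/* MB counted with plain truncating division. */
static int64_t
mb_truncated(int64_t nblocks)
{
	return nblocks / MB_IN_BLOCKS;
}
```

For 20512 blocks this gives 10502144 bytes, 11 when rounded up and 10
when truncated; a file occupying exactly 20480 blocks would show as 10
either way.
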
That could also suggest that "du -h" is in some ways wrong too, at
least with respect to the intent of showing disk usage (as opposed to
file size) in hard display-sized units. I'm not sure if it can or
should be fixed though; after all, it is just a more easily
human-interpretable measure of usage, and by definition it always
glosses over the finer details.
--
Greg A. Woods; Planix, Inc.
<woods%planix.ca@localhost>