tech-userlevel archive
Re: alignment or compiler bug?
> Another possibility: *resid is, like max_write, signed, and is
> negative. (If ps->ps_max_write is less than sizeof(*fwi), max_write
> could be negative, but your "bigger than max_write" makes it sound as
> though that's not it.)
Everything is size_t, which is unsigned AFAIK:
size_t max_write;
size_t *resid;
size_t data_len;
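
For reference, here is a minimal sketch of what happens with unsigned arithmetic when the limit is smaller than the header size: the result cannot go negative, it wraps around to a huge value. The helper names payload_room() and payload_room_unchecked() are hypothetical and are not taken from perfused.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from perfused: unchecked subtraction.
 * With size_t operands this wraps modulo SIZE_MAX + 1 when
 * limit < hdr_len, so the result is huge rather than negative. */
static size_t payload_room_unchecked(size_t limit, size_t hdr_len)
{
    return limit - hdr_len;
}

/* Hypothetical helper: same computation, but clamped to 0 when the
 * header alone would not fit within the limit. */
static size_t payload_room(size_t limit, size_t hdr_len)
{
    return (limit < hdr_len) ? 0 : limit - hdr_len;
}
```

With a limit of 16 and a 40-byte header, the unchecked version yields SIZE_MAX - 23, while the clamped version yields 0.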
> I'm assuming data_len is of unsigned type. Try printing it out in hex.
> Are the low bits zero? If not, this increases the plausibility of the
> "corruption" theory, because of the clearing of the low bits if it's
> larger than PAGE_SIZE.
In a core dump left from a previous attempt, data_len is 0xbb5d7050. Its
low bits are not zero, so I would say it really is corrupted.
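
The low-bits check can be sketched as below, assuming 4 KiB pages (the usual i386 page size) and a 32-bit size_t. SKETCH_PAGE_SIZE and low_bits() are names made up for this illustration.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as is usual on i386. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper: return the bits of len below the page size.
 * If larger values are rounded down to a page multiple, a legitimate
 * length above the page size should give 0 here. */
static uint32_t low_bits(uint32_t len)
{
    return len & (SKETCH_PAGE_SIZE - 1u);
}
```

For the value from the core dump, low_bits(0xbb5d7050) gives 0x050, which is not zero.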
> - Look at the assembly/machine code. See if it looks broken. (What
> hardware is this on? If it's one I know, I can have a look.)
This is i386. You can build it using pkgsrc/filesystems/perfused.
The problem occurs in perfuse_node_write().
Seeing the bug live is a bit more complicated. I mount a glusterfs
volume (pkgsrc/filesystems/glusterfs) and do tar -xzvf src.tgz in it.
The bug pops up after about half an hour.
Here is the assembly leading up to the memcpy call. The 0x28 is
sizeof(*fwi), which suggests that (fwi + 1) is computed correctly:
0xbbbe14dc <perfuse_node_write+460>: mov %eax,0x20(%esi)
0xbbbe14df <perfuse_node_write+463>: lea 0x28(%esi),%edx
0xbbbe14e2 <perfuse_node_write+466>: mov 0x10(%ebp),%eax
0xbbbe14e5 <perfuse_node_write+469>: add 0xffffffe8(%ebp),%eax
0xbbbe14e8 <perfuse_node_write+472>: push %edi
0xbbbe14e9 <perfuse_node_write+473>: push %eax
0xbbbe14ea <perfuse_node_write+474>: push %edx
0xbbbe14eb <perfuse_node_write+475>: call 0xbbbdfd90 <memcpy@plt>
> - Leave the "data" variable there, including the code you added to set
> it, but still pass fwi+1 to the memcpy.
I tried passing data, and it still crashed. It seems to be this test
that saves my day:
if (data != ((char *)fwi) + sizeof(*fwi))
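
As a standalone illustration of what that test compares, here is a minimal sketch. The struct hdr_sketch below is a hypothetical 40-byte (0x28) stand-in for *fwi with illustrative field names, and payload_of() and payload_ptr_ok() are made-up helpers. On a correct compiler, (h + 1) and ((char *)h) + sizeof(*h) must always be the same address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 40-byte header standing in for *fwi.
 * The field names are illustrative only. */
struct hdr_sketch {
    uint64_t fh;
    uint64_t offset;
    uint32_t size;
    uint32_t write_flags;
    uint64_t lock_owner;
    uint32_t flags;
    uint32_t padding;
};

/* Payload starts right after the header: same address as (h + 1). */
static char *payload_of(struct hdr_sketch *h)
{
    return (char *)(h + 1);
}

/* The comparison from the test above, as a predicate:
 * nonzero when data points just past the header. */
static int payload_ptr_ok(const struct hdr_sketch *h, const char *data)
{
    return data == (const char *)h + sizeof(*h);
}
```

If payload_ptr_ok(h, payload_of(h)) ever returned 0, that would point at the compiler or at memory corruption rather than at the pointer arithmetic itself.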
> - If it doesn't break the semantics, make data have static storage
> duration rather than automatic.
It would break semantics.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu%netbsd.org@localhost