Subject: Re: port-alpha/35448: memory management fault trap during heavy
To: None <gnats-bugs@NetBSD.org>
From: Michael L. Hitch <mhitch@lightning.msu.montana.edu>
List: netbsd-bugs
Date: 01/29/2007 11:09:01
On Mon, 22 Jan 2007, Michael L. Hitch wrote:
> fails, so it's a little hard to figure out where it came from. I'm going
> to start groveling through the stack myself to see if I can dig out the
> parameters to the in4_cksum() call, and if I can follow the traceback
> manually.
OK, I've dug out more information from the raw stack dump. I located
the address of the mbuf and found that it has the same bad address in
mh_data:
(gdb) print (struct mbuf)*0xfffffc000ef7be18
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
$2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0,
mh_data = 0xfffffe0108266000 <Address 0xfffffe0108266000 out of
bounds>,
mh_owner = 0x4e4f5a414d412d58, mh_len = 4096, mh_flags = 67108865,
mh_paddr = 251117080, mh_type = 1}, M_dat = {MH = {MH_pkthdr = {
rcvif = 0xfffffe000005a080, tags = {slh_first = 0x0}, len = 188,
csum_flags = 0, csum_data = 0, segsz = 0}, MH_dat = {MH_ext = {
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
can not access 0x8266000, invalid translation (invalid L2 PTE)
ext_buf = 0xfffffe0108266000 <Address 0xfffffe0108266000 out of
bounds>, ext_fr$
ext_arg = 0xfffffe000c617cb8, ext_size = 4096,
ext_type = 0xfffffc0000a62558, ext_nextref = 0xfffffc000ef7b118,
ext_prevref = 0xfffffc000ef7a218, ext_un = {
extun_paddr = 14733978372531027968, extun_pgs = {
On a whim, I took a look at the data located at 0xfffffe0008266000 and
found what looks like data that might be expected, and Aaron confirmed
that the data was part of a mailbox file that was being synched. So it
looked like something had corrupted the address used by the mbuf. I
followed the stack back to nfs_writerpc, which can use the address of data
being sent as the external data address for the mbuf. I dug out the
address of the uio and iovec structures used at that point and found:
(gdb) print (struct uio)*0xfffffe000c617e70
$8 = {uio_iov = 0xfffffe000c617e60, uio_iovcnt = 1, uio_offset = 102400,
uio_resid = 18446744069414588416, uio_rw = UIO_WRITE,
uio_vmspace = 0xfffffc0000abc018}
(gdb) print (struct iovec)*0xfffffe000c617e60
$9 = {iov_base = 0xfffffe0108267000, iov_len = 18446744069414588416}
(gdb) x/2gx 0xfffffe000c617e60
0xfffffe000c617e60: 0xfffffe0108267000 0xffffffff00001000
The buffer address in iov_base is corrupt as well. In addition, the
iov_len field appears corrupted.
Following the stack back further, I get to nfs_doio and get the address
of the struct buf that was used to generate the uio/iovec data:
(gdb) print (struct buf)*0xfffffc00052b8dc0
$3 = {b_u = {u_actq = {tqe_next = 0xdeadbeef, tqe_prev =
0xfffffc00052b88b8},
u_work = {wk_entry = {sqe_next = 0xdeadbeef}}}, b_interlock = {
lock_data = 86745072}, b_flags = 85, b_error = 0, b_prio = 0,
b_bufsize = 8192, b_bcount = 8192, b_resid = 8192, b_dev = 4294967295,
b_un = {
b_addr = 0xfffffe0008266000 "ntent-Transfer-Encoding:Message-ID;\n
b=T2nY8PninSOLy9W$
b_iodone = 0xfffffc00005bd600 <uvm_aio_biodone>,
b_proc = 0xfffffc0000abc4a0, b_vp = 0xfffffc000bea53c0, b_dep = {
lh_first = 0x0}, b_saveaddr = 0x0, b_fspriv = {
bf_private = 0xfffffc00052b95a8, bf_dcookie = -4397959768664}, b_hash
= {
le_next = 0x16, le_prev = 0x0}, b_vnbufs = {le_next = 0x87654321,
le_prev = 0x4}, b_freelist = {tqe_next = 0x0,
tqe_prev = 0xfffffe0000263700}, b_lblkno = 0, b_freelistindex = 0}
Lo and behold, it has the correct address of the data! So somewhere
between nfs_doio() and nfs_writerpc(), the iov_base and iov_len values
get clobbered (in an apparently fairly consistent way).
Since the bad address was easy to check for, I inserted a number of
KASSERT() statements in nfs_doio(), nfs_doio_write(), and nfs_writerpc().
I was able to induce this failure on my own alpha at this point. I found
that the address was good at the entry of nfs_writerpc(), but had been
corrupted at the start of the loop sending out the data. This seemed odd,
since there didn't appear to be anything that would cause the type of
corruption I was seeing. While trying to figure out where some of the
local variables in nfs_writerpc() were located on the stack, I noticed
there was a 'retry:' label before the output loop. Finding where that
label was used shed some light on things. Certain conditions (which I'm
not too clear on, since I don't understand NFS all that well) would cause
a resend of the entire data buffer, and if that clobbered the data address
and length, would result in what I was seeing. Indeed, that was the case;
a few more KASSERT() statements showed that the UIO_ADVANCE() at line 1547
of nfs_vnops.c was clobbering the iovec data.
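The actual KASSERT() conditions are not shown in this message. As a rough
sketch of the kind of check described above, the following hypothetical
helper tests whether an iovec base address still lies inside the buffer it
was derived from. The name demo_iov_in_buffer() and the use of plain 64 bit
integers in place of kernel pointers are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sanity check, not taken from the NetBSD sources.
 * Returns true when iov_base lies within [buf_start, buf_start + buf_size].
 * The end address is allowed because a fully consumed iovec points just
 * past the last byte of the buffer.
 */
static bool
demo_iov_in_buffer(uint64_t buf_start, uint64_t buf_size, uint64_t iov_base)
{
	return iov_base >= buf_start && iov_base - buf_start <= buf_size;
}
```

With b_addr = 0xfffffe0008266000 and b_bcount = 8192 from the struct buf
above, the address 0xfffffe0108266000 seen in the mbuf fails this check. In
a kernel, such a predicate could be wrapped in KASSERT() at each point of
interest.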
Closer examination of what UIO_ADVANCE() was doing, and examination of
the generated code, showed what the problem was.
The alpha has 64 bit pointers, and the iov_len value is also 64 bits.
The variable backup used to adjust the iovec data is an unsigned 32 bit
value. The changes for version 1.225 appear to have introduced a problem
that only showed up on the alpha. Prior to that, the unsigned value of
'backup' was being subtracted from iov_base, and added to iov_len. In
version 1.225, that was changed to use the macro UIO_ADVANCE() and pass
a negated value of 'backup' to the macro. The compiler thus negated the
32 bit unsigned value of 'backup' and zero-extended the result to 64 bits,
which was added to iov_base and subtracted from iov_len, resulting in the
clobbered values.
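To make the arithmetic concrete, here is a minimal sketch of the
zero-extension problem. DEMO_ADVANCE() is a simplified stand-in, not the
real UIO_ADVANCE() definition, and struct demo_iovec keeps the address as a
64 bit integer. The starting values used in the note after the code
(iov_base = 0xfffffe0008267000, iov_len = 0x1000, backup = 4096) are
assumptions chosen to be consistent with the dumps above; the message does
not state them.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct iovec with 64 bit fields, as on alpha. */
struct demo_iovec {
	uint64_t iov_base;	/* address held as an integer for the demo */
	uint64_t iov_len;
};

/* Simplified stand-in for an "advance" macro; not the real UIO_ADVANCE(). */
#define DEMO_ADVANCE(iov, siz) \
	((iov)->iov_base += (siz), (iov)->iov_len -= (siz))

/* Back up by 'backup' bytes by advancing with a negated unsigned value. */
static struct demo_iovec
demo_backup_unsigned(struct demo_iovec iov, uint32_t backup)
{
	/*
	 * -backup is still a 32 bit unsigned value (0xfffff000 for 4096).
	 * It is zero-extended, not sign-extended, when combined with the
	 * 64 bit fields, so this adds almost 4GB instead of subtracting 4KB.
	 */
	DEMO_ADVANCE(&iov, -backup);
	return iov;
}
```

Under those assumed inputs, the result is iov_base = 0xfffffe0108266000 and
iov_len = 0xffffffff00002000. The base matches the ext_buf address in the
mbuf, and sending one further 4096 byte chunk from there would leave the
iov_base and iov_len values shown in the iovec dump.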
Changing the UIO_ADVANCE() to a UIO_RETREAT() which passed 'backup'
directly and subtracted that from iov_base, and added it to iov_len gave
me a kernel which did not crash when nfs_writerpc() resent the data. I've
also just verified that simply making 'backup' a signed 32 bit value works
with the UIO_ADVANCE() macro.
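Both fixes can be sketched the same way. DEMO_ADVANCE() and DEMO_RETREAT()
below are simplified stand-ins rather than the real macros, struct
demo_iovec holds the address as a 64 bit integer, and the function names
are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct iovec with 64 bit fields, as on alpha. */
struct demo_iovec {
	uint64_t iov_base;	/* address held as an integer for the demo */
	uint64_t iov_len;
};

/* Simplified stand-ins; not the real UIO_ADVANCE()/UIO_RETREAT(). */
#define DEMO_ADVANCE(iov, siz) \
	((iov)->iov_base += (siz), (iov)->iov_len -= (siz))
#define DEMO_RETREAT(iov, siz) \
	((iov)->iov_base -= (siz), (iov)->iov_len += (siz))

/* Fix 1: pass the unsigned value directly and subtract it. */
static struct demo_iovec
demo_backup_retreat(struct demo_iovec iov, uint32_t backup)
{
	DEMO_RETREAT(&iov, backup);	/* zero-extension of 4096 is harmless */
	return iov;
}

/* Fix 2: keep the advance form but make 'backup' signed. */
static struct demo_iovec
demo_backup_signed(struct demo_iovec iov, int32_t backup)
{
	/* -backup is a negative int, so it is sign-extended to 64 bits. */
	DEMO_ADVANCE(&iov, -backup);
	return iov;
}
```

Starting from the assumed values iov_base = 0xfffffe0008267000 and
iov_len = 0x1000 with backup = 4096, both variants give
iov_base = 0xfffffe0008266000 and iov_len = 0x2000, which is the start of
the original buffer with its full 8192 byte length.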
---
Michael L. Hitch mhitch@montana.edu
Computer Consultant
Information Technology Center
Montana State University Bozeman, MT USA