Port-amd64 archive
Re: Repeatable crash
On Mon, 4 Feb 2008, Paul Goyette wrote:
On Mon, 4 Feb 2008, Andrew Doran wrote:
On Mon, Feb 04, 2008 at 03:40:02AM -0800, Paul Goyette wrote:
This is from a 4.99.49 kernel and userland built from sources dated
2008-01-24 21:14:50 UTC
Usually, I run build.sh on the machine that actually contains the
source, but this time I ran it on another host. The entire /usr/src and
/usr/obj directories were NFS-mounted. The crash happens about 30 or 40
minutes after starting build.sh and it happens at different places, so I
don't think it's data-specific; rather I suspect some strange race
condition. The back-traces don't seem terribly useful (maybe gdb is
out-of-sync with the trap stack-frame again?).
The trap frame layout changed; I think dsl%netbsd.org@localhost is looking at
it.
Did you get anything out of ddb?
Transcribed by hand since I don't have a serial console:
kernel: protection fault trap, code=0
Stopped in pid 29318.1 (x86_64--netbsd-g) at
netbsd:nfs_loadattrcache+0x13c: cmpl %ecx,0x10(%rbx)
nfs_loadattrcache() at netbsd:nfs_loadattrcache+0x13c
nfsm_loadattrcache() at netbsd:nfs_loadattrcache+0x70
nfs_lookup() at netbsd:nfs_lookup+0xdbd
VOP_LOOKUP() at netbsd:VOP_LOOKUP+0x49
lookup() at netbsd:lookup+0x345
namei() at netbsd:namei+0x1a1
sys_access() at netbsd:sys_access+0x97
syscall() at netbsd:syscall+0xa9
Unable to enter any commands at the ddb prompt - it appears that the keyboard
(USB) is dead or interrupts are blocked.
Looking at the sources, nfs_loadattrcache+0x13c is here:
(gdb) list * nfs_loadattrcache + 0x13c
0xffffffff8019d44c is in nfs_loadattrcache (/usr/src/sys/nfs/nfs_subs.c:1687).
1682            vap = np->n_vattr;
1683
1684            /*
1685             * Invalidate access cache if uid, gid, mode or ctime changed.
1686             */
1687            if (np->n_accstamp != -1 &&
1688                (gid != vap->va_gid || uid != vap->va_uid || vmode != vap->va_mode
1689                || timespeccmp(&ctime, &vap->va_ctime, !=)))
1690                    np->n_accstamp = -1;
1691
Disassembling this area gives us (offset 0x13c --> +316)
0xffffffff8019d416 <nfs_loadattrcache+262>: mov %r9d,0xa0(%r13)
0xffffffff8019d41d <nfs_loadattrcache+269>: mov %rax,0xa8(%r13)
0xffffffff8019d424 <nfs_loadattrcache+276>: mov 0xc(%r12),%eax
0xffffffff8019d429 <nfs_loadattrcache+281>: mov %eax,%esi
0xffffffff8019d42b <nfs_loadattrcache+283>: bswap %esi
0xffffffff8019d42d <nfs_loadattrcache+285>: mov 0x10(%r12),%ecx
0xffffffff8019d432 <nfs_loadattrcache+290>: bswap %ecx
0xffffffff8019d434 <nfs_loadattrcache+292>: cmpl $0xffffffffffffffff,0xf8(%r13)
0xffffffff8019d43c <nfs_loadattrcache+300>: mov 0x80(%r13),%rbx
0xffffffff8019d443 <nfs_loadattrcache+307>: movzwl 0xffffffffffffffd6(%rbp),%eax
0xffffffff8019d447 <nfs_loadattrcache+311>: je 0xffffffff8019d461 <nfs_loadattrcache+337>
0xffffffff8019d449 <nfs_loadattrcache+313>: cmp %ecx,0x10(%rbx)
0xffffffff8019d44c <nfs_loadattrcache+316>: mov %eax,%edx
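
For anyone cross-checking the offsets: ddb prints them in hex while gdb's disassembly prints them in decimal. The arithmetic can be sketched in C as below. The helper names are made up for illustration and are not part of any NetBSD or gdb interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: start address of a function, given an address
 * inside it and that address's offset from the function start. */
static uint64_t function_base(uint64_t addr, uint64_t offset)
{
    assert(addr >= offset);
    return addr - offset;
}

/* Hypothetical helper: offset of an instruction from the function start. */
static uint64_t offset_in_function(uint64_t addr, uint64_t base)
{
    assert(addr >= base);
    return addr - base;
}
```

With the addresses from the listing, function_base(0xffffffff8019d44c, 0x13c) gives 0xffffffff8019d310, and 0x13c is 316 decimal. The cmp instruction at 0xffffffff8019d449 is at offset 313 (0x139), one instruction before +316.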
So I'm guessing that np (in %rbx ?) contained something invalid...
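
One possible reading of the listing, offered as an assumption rather than a confirmed diagnosis: %rbx is loaded by "mov 0x80(%r13),%rbx", which would match "vap = np->n_vattr" with np in %r13, and "cmp %ecx,0x10(%rbx)" would then be the "gid != vap->va_gid" test. The C sketch below models only that two-step access. The struct names and padding are hypothetical stand-ins chosen to reproduce the two offsets seen in the disassembly; they are not the real nfsnode or vattr layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: only the 0x10 offset from the disassembly is modeled. */
struct fake_vattr {
    uint8_t  pad[0x10];
    uint32_t va_gid;            /* assumed target of cmp %ecx,0x10(%rbx) */
};

/* Hypothetical stand-in: only the 0x80 offset from the disassembly is modeled. */
struct fake_nfsnode {
    uint8_t  pad[0x80];
    struct fake_vattr *n_vattr; /* assumed target of mov 0x80(%r13),%rbx */
};

/* Mirrors the two-step access: load vap from np, then read a field of *vap.
 * If np->n_vattr holds a bad pointer, the second step is where it faults. */
static int gid_differs(const struct fake_nfsnode *np, uint32_t gid)
{
    assert(np != NULL);
    const struct fake_vattr *vap = np->n_vattr; /* mov 0x80(%r13),%rbx */
    return gid != vap->va_gid;                  /* cmp %ecx,0x10(%rbx) */
}
```

Under that reading, np itself would have been usable (the load from 0x80(%r13) succeeded) and the invalid value would be the n_vattr pointer stored in it.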
----------------------------------------------------------------------
| Paul Goyette | PGP DSS Key fingerprint: | E-mail addresses: |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul%whooppee.com@localhost |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette%juniper.net@localhost |
----------------------------------------------------------------------