Port-xen archive


Re: WAPBL + xen3 amd64 = idle loop - [was: amd64 xen3_dom0 failing to boot - stalls (generic boots fine)]



Sarton O'Brien wrote:
Mike Bowie wrote:
See http://mail-index.netbsd.org/port-xen/2009/02/03/msg004728.html for
the more complete resolution.

Might give you a starting point.

Playing around in ddb, I found a lot of 'waits' and a wapbl mount. I'd already disabled log on /, so I figured it was /usr locking up. Removing log from all fstab entries cleared up my issues.
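For anyone wanting to script the same workaround, here is a minimal sketch of removing the log option from ffs entries in fstab text. The helper name strip_log_option and the sample entries are hypothetical, made up for illustration; it only transforms a string and writes nothing to disk, so check the result by hand before replacing a real /etc/fstab.

```python
def strip_log_option(fstab_text: str) -> str:
    """Return fstab text with the 'log' option removed from ffs entries."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave blank lines, comments, and short/malformed entries untouched.
        if not fields or fields[0].startswith("#") or len(fields) < 4:
            out.append(line)
            continue
        options = fields[3].split(",")
        # Only touch ffs entries that actually carry the log option.
        if fields[2] != "ffs" or "log" not in options:
            out.append(line)
            continue
        remaining = [opt for opt in options if opt != "log"]
        if not remaining:
            # 'log' was the only option; leave the line for manual editing.
            out.append(line)
            continue
        fields[3] = ",".join(remaining)
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


# Hypothetical sample entries, not copied from any real system.
sample = (
    "# example fstab\n"
    "/dev/xbd0a / ffs rw,log 1 1\n"
    "/dev/xbd0b none swap sw 0 0\n"
)
cleaned = strip_log_option(sample)
print(cleaned, end="")
```

Entries that are not ffs, or that never had log set, pass through byte for byte, so running it twice is harmless.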

Now all my domu are stuck.

It seems to be a global xen + wapbl problem. I'm not sure who the best person to notify is. For now I'll just disable wapbl in every dom0 and domu that I have.

If there is anything I can do, let me know. I'm happy to poke around a bit.

Further along, removing log from fstab in the domu results in:

Starting file system checks:
/dev/rxbd0a: UNREF FILE I=838860  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: UNREF FILE I=839006  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: UNREF FILE I=839104  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: UNREF FILE I=839220  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: UNREF FILE I=839371  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: LINK COUNT DIR I=839689  OWNER=0 MODE=40755
/dev/rxbd0a: SIZE=1024 MTIME=Jan 31 19:48 2009 COUNT 3 SHOULD BE 2 (ADJUSTED)
/dev/rxbd0a: UNREF FILE I=839801  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: UNREF FILE I=839916  OWNER=0 MODE=100644
/dev/rxbd0a: SIZE=0 MTIME=Feb  4 22:37 2009  (CLEARED)
/dev/rxbd0a: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
/dev/rxbd0a: SUMMARY INFORMATION BAD (SALVAGED)
/dev/rxbd0a: BLK(S) MISSING IN BIT MAPS (SALVAGED)
/dev/rxbd0a: 591881 files, 12050935 used, 8334591 free (65615 frags, 1033622 blocks, 0.3% fragmentation)
/dev/rxbd0a: MARKING FILE SYSTEM CLEAN
Root filesystem was modified, rebooting ...
1 2009-02-04T23:03:16.022575+11:00 - reboot - - - rebooted by root
syncing disks... done
unmounting file systems...
unmounting / (root_device)... done
rebooting...
# xm console spike
RAID components...
boot device: xbd0
root on xbd0a dumps on xbd0b
mountroot: trying smbfs...
mountroot: trying ntfs...
mountroot: trying nfs...
mountroot: trying msdos...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
init: copying out path `/sbin/init' 11
Wed Feb  4 23:03:30 EST 2009
swapctl: setting dump device to /dev/xbd0b
swapctl: adding /dev/xbd0b as swap device at priority 0
Starting file system checks:
/dev/rxbd0a: file system is clean; not checking
Setting tty flags.
Setting sysctl variables:
Starting network.
Hostname: spike.internal
NIS domainname: internal
IPv6 mode: host
Configuring network interfaces: xennet0.
Adding interface aliases:.
add net default: gateway 192.168.210.1
Building databases: dev, utmp, utmpx done
/etc/rc: WARNING: $named9 is not set properly - see rc.conf(5).
Starting syslogd.
Checking for core dump...
savecore - - - no core dump
/etc/rc: WARNING: $named9 is not set properly - see rc.conf(5).
Starting rpcbind.
Starting ypserv.
Starting ypbind.
Starting yppasswdd.
Mounting all filesystems...
Reader / writer lock error: rw_vector_exit: assertion failed: RW_COUNT(rw) != 0

lock address : 0xffffa00024733240
current cpu  :                  0
current lwp  : 0xffffa000248f9bc0
owner/count  : 000000000000000000 flags    : 000000000000000000

panic: lock error
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff80129aad cs e030 rflags 246 cr2 7f7ffd96abc0 cpl 0 rsp ffffa0002491d690
Stopped in pid 159.1 (mount_ffs) at     netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
panic() at netbsd:panic+0x242
lockdebug_abort() at netbsd:lockdebug_abort+0x42
rw_vector_exit() at netbsd:rw_vector_exit+0xa6
vlockmgr() at netbsd:vlockmgr+0xd8
VOP_UNLOCK() at netbsd:VOP_UNLOCK+0x28
spec_open() at netbsd:spec_open+0x33e
VOP_OPEN() at netbsd:VOP_OPEN+0x29
ffs_mount() at netbsd:ffs_mount+0x1c7
do_sys_mount() at netbsd:do_sys_mount+0x62d
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0xb4
ds          0x6fc7
es          0xe02b
fs          0xd658
gs          0x246
rdi         0
rsi         0xd
rbp         0xffffa0002491d690
rbx         0xffffa0002491d6a0
rdx         0
rcx         0
rax         0x1
r8          0xffffffff80588940  cpu_info_primary
r9          0x1
r10         0xffffa0002491d5b0
r11         0xffffffff803abf80  xenconscn_putc
r12         0x104
r13         0xffffffff80406fc7  copyright+0xe907
r14         0xffffffff80590540  rwlock_lockops
r15         0xffffa000246f9608
rip         0xffffffff80129aad  breakpoint+0x5
cs          0xe030
rflags      0x246
rsp         0xffffa0002491d690
ss          0xe02b
netbsd:breakpoint+0x5:  leave
db>
db> ps/l
PID         LID S     FLAGS       STRUCT LWP *               NAME WAIT
>159       >   1 7         4   ffffa000248f9bc0          mount_ffs
158           1 3        84   ffffa000245dc7c0              mount wait
157           1 3        84   ffffa000232397c0                 sh wait
153           1 3        84   ffffa000245dc000      rpc.yppasswdd select
143           1 3        84   ffffa000245dc3e0             ypbind select
144           1 3        84   ffffa000245dcba0             ypserv select
108           1 3        84   ffffa00023239ba0            syslogd kqueue
2             1 3        84   ffffa000232393e0                 sh wait
1             1 3        84   ffffa0002323a400               init wait
0            29 3       204   ffffa0002323b040            physiod physiod
             28 3       204   ffffa00023239000        vmem_rehash vmem_rehash
             27 3       204   ffffa0002323abc0           aiodoned aiodoned
             26 3       204   ffffa0002323a7e0            ioflush syncer
             25 3       204   ffffa0002323a020           pgdaemon pgdaemon
             24 3       204   ffffa00023231040          cryptoret crypto_wait
             23 2       204   ffffa0002323b800             xenbus
             22 3       204   ffffa0002323bbe0           xenwatch evtsq
             12 3       204   ffffa00023231420           pmfevent pmfevent
             11 3       204   ffffa00023231800           nfssilly nfssilly
             10 3       204   ffffa00023231be0            cachegc cachegc
              9 3       204   ffffa0002322e020              vrele vrele
              8 3       204   ffffa0002322e400          modunload modunload
              7 3       204   ffffa0002322e7e0            xcall/0 xcall
              6 1       204   ffffa0002322ebc0          softser/0
              5 1       204   ffffa0002322c000          softclk/0
              4 1       204   ffffa0002322c3e0          softbio/0
              3 1       204   ffffa0002322c7c0          softnet/0
              2 1       205   ffffa0002322cba0             idle/0
              1 3       204   ffffffff80590000            swapper schedule
db>

I got the same panic last night when I'd left the image/vnd attached and then started the domu. So this time I detached it and started the domu ... and the above is what happened. In case the vnd was still attached during the automatic reboot, I'll fsck manually and try starting it again.

dom0 seems to be fine ... for now :)

Sarton

