Port-xen archive
Re: unresponsive domu
On Thu, Apr 24, 2008 at 09:49:05AM +1000, Sarton O'Brien wrote:
> This has happened before but I thought I'd update kernel and userland to see
> if it happened again ... and it has.
>
> I really don't know what I'm doing, so here goes:
>
> gogeta# xm console spike
>
> Stopped in pid 0.2 (system) at netbsd:breakpoint+0x4: popl %ebp
> db> bt
> breakpoint(65c87,0,caa75f78,11,1) at netbsd:breakpoint+0x4
> xencons_tty_input(c0ea1ef0,c05be011,1,c03a1afd,3b9aca00) at netbsd:xencons_tty_input+0xa6
> xencons_handler(c0ea1ef0,caa3ac90,0,caa75fec,20) at netbsd:xencons_handler+0x5f
> evtchn_do_event(2,caa3ac90,caa3ac48,0,0) at netbsd:evtchn_do_event+0xbd
> --- switch to interrupt stack ---
> call_evtchn_do_event(caa3ac90,0,11,ca000031,c0100011) at netbsd:call_evtchn_do_event+0x1e
> hypervisor_callback(ca27dbb4,1,c02d6489,0,0) at netbsd:hypervisor_callback+0x65
> idle_loop(ca286d80,0,c010006b,c0100063,c010006b) at netbsd:idle_loop+0xf9
> db> ps/l
> PID LID S FLAGS STRUCT LWP * NAME WAIT
> 6248 1 3 4 cd1f7000 master tstile
> 18497 1 3 4 cd1f7d80 find vnode
> 22163 1 3 84 cda8e260 postdrop netio
> 13874 1 3 84 cda8e020 sendmail piperd
> 29998 1 3 84 cda8eb60 tee piperd
> 4904 1 3 84 cc71c260 sh wait
> 16457 1 3 84 cc609000 sh wait
> 6994 1 3 84 cc71c6e0 cron piperd
> 16586 1 3 84 cda8e4a0 ssh select
> 8291 1 3 80 cd1f7b40 cvs piperd
> 3980 1 3 0 cda8eda0 cvs vnode
> 12492 1 3 80 cda8e920 sh wait
> 4448 1 3 80 cc609b40 cron piperd
> 664 1 3 80 ca2894a0 smbd pause
> 647 1 3 4 ca289260 getty tstile
> 628 1 3 4 cc609480 cron tstile
> 496 1 3 84 cc71c920 postgres select
> 533 1 3 84 cc71cda0 postgres select
> 383 1 3 4 cc71cb60 postgres tstile
> 510 1 3 84 cc609d80 qmgr kqueue
> 530 1 3 84 cc6096c0 master kqueue
> 328 8 3 84 cc71c020 slapd parked
> 7 3 80 cc71c4a0 slapd parked
> 6 3 80 ce131000 slapd parked
> 5 3 84 cda8e6e0 slapd parked
> 4 3 84 cc609900 slapd parked
> 3 3 84 cc609240 slapd parked
> 2 3 84 cc4ceb60 slapd select
> 1 3 80 cc4ce020 slapd parked
> 325 1 3 4 cc4ce260 smbd tstile
> 313 1 3 4 cc4ce4a0 nmbd tstile
> 300 1 3 80 cc4ce6e0 sshd select
> 230 1 3 80 ca2896e0 powerd kqueue
> 271 1 3 1000004 cc4ce920 ntpd tstile
> 214 1 3 84 cc4ceda0 rpc.lockd select
> 209 1 3 84 cb696000 rpc.statd select
> 206 5 3 84 cb696240 slave nfsd
> 4 3 84 cb696480 slave nfsd
> 3 3 84 cb6966c0 slave nfsd
> 2 3 84 cb696900 slave nfsd
> 1 3 80 cb696b40 master select
> 187 1 3 84 cb445260 mountd select
> 156 1 3 84 cb696d80 lfs_cleanerd segment
> 151 1 3 84 cb4454a0 rpc.yppasswdd select
> 141 1 3 4 cb4456e0 ypbind tstile
> 137 1 3 4 cb445920 ypserv tstile
> 131 1 3 4 cb445b60 rpcbind tstile
> 106 1 3 84 ca28e000 syslogd kqueue
> 1 1 3 84 ca28e900 init wait
> >0 28 3 204 cb445020 lfs_writer lfswriter
> 27 3 204 cb445da0 physiod physiod
> 26 3 204 ca289020 vmem_rehash vmem_rehash
> 25 3 204 ca28ed80 aiodoned aiodoned
> 24 3 204 ca28eb40 ioflush syncer
> 23 3 204 ca28e6c0 pgdaemon pgdaemon
> 22 3 204 ca289920 cryptoret crypto_wait
> 21 3 204 ca28e240 xenbus rdst
> 20 3 204 ca28e480 xenwatch evtsq
> 10 3 204 ca289b60 pmfevent pmfevent
> 9 3 204 ca289da0 cachegc cachegc
> 8 3 204 ca286000 vrele vrele
> 7 3 204 ca286240 xcall/0 xcall
> 6 1 204 ca286480 softser/0
> 5 1 204 ca2866c0 softclk/0
> 4 1 204 ca286900 softbio/0
> 3 1 204 ca286b40 softnet/0
> > 2 7 20000205 ca286d80 idle/0
> 1 3 204 c044b240 swapper schedule
> db> ps
> PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
> 6248 530 530 0 2 0x100 1 master tstile
> 18497 4904 16457 0 2 0x4000 1 find vnode
> 22163 13874 16457 0 2 0x4100 1 postdrop netio
> 13874 16457 16457 0 2 0x4000 1 sendmail piperd
> 29998 16457 16457 0 2 0x4000 1 tee piperd
> 4904 16457 16457 0 2 0x4000 1 sh wait
> 16457 6994 16457 0 2 0x4000 1 sh wait
> 6994 628 628 0 2 0 1 cron piperd
> 16586 8291 12492 0 2 0x4000 1 ssh select
> 8291 3980 12492 0 2 0 1 cvs piperd
> 3980 12492 12492 0 2 0x4000 1 cvs vnode
> 12492 4448 12492 0 2 0x4000 1 sh wait
> 4448 628 628 0 2 0 1 cron piperd
> 664 325 325 0 2 0x101 1 smbd pause
> 647 1 647 0 2 0x4000 1 getty tstile
> 628 1 628 0 2 0 1 cron tstile
> 496 383 496 1003 2 0 1 postgres select
> 533 383 533 1003 2 0 1 postgres select
> 383 1 2 1003 2 0x4000 1 postgres tstile
> 510 530 530 12 2 0x4100 1 qmgr kqueue
> 530 1 530 0 2 0x4100 1 master kqueue
> 328 1 328 1001 2 0x101 8 slapd *
> 325 1 325 0 2 0x101 1 smbd tstile
> 313 1 313 0 2 0x1 1 nmbd tstile
> 300 1 300 0 2 0 1 sshd select
> 230 1 230 0 2 0 1 powerd kqueue
> 271 1 271 0 2 0 1 ntpd tstile
> 214 1 214 0 2 0 1 rpc.lockd select
> 209 1 209 0 2 0xa0000 1 rpc.statd select
> 206 1 206 0 2 0 5 nfsd *
> 187 1 187 0 2 0 1 mountd select
> 156 1 156 0 2 0 1 lfs_cleanerd segment
> 151 1 151 0 2 0 1 rpc.yppasswdd select
> 141 1 141 0 2 0 1 ypbind tstile
> 137 1 137 0 2 0xa0000 1 ypserv tstile
> 131 1 131 0 2 0 1 rpcbind tstile
> 106 1 106 0 2 0 1 syslogd kqueue
> 1 0 1 0 2 0x4001 1 init wait
> >0 -1 0 0 2 0x20002 19 system *
> db> reboot
> syncing disks... panic: assert_sleepable: interrupt caller=0xc0348fa1
> Stopped in pid 0.2 (system) at netbsd:breakpoint+0x4: popl %ebp
>
> I don't know how to reproduce this; it just happens. If there are more
> commands I can issue to assist, please let me know. Is there a basic set
> of commands or a procedure to follow when dropping to ddb? Sorry for my
> naivety.
It's a file system / disk I/O problem. What is the domain's exact file
system configuration?
Andrew
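
One way to look at the quoted `ps/l` listing from the file system angle is to tally its WAIT column, where several processes sleep on channels such as tstile and vnode. The sketch below does that for a handful of rows copied from the listing. `SAMPLE` and `tally_wait_channels` are names made up for this illustration, and the field-count rule used to find the WAIT column is an assumption that fits the rows shown here, not a general ddb parser.

```python
from collections import Counter

# A few rows copied from the quoted `ps/l` listing
# (columns: PID LID S FLAGS LWP-address NAME WAIT).
# Continuation rows for extra LWPs omit the PID, and
# running LWPs have no WAIT entry at all.
SAMPLE = """\
> 6248 1 3 4 cd1f7000 master tstile
> 18497 1 3 4 cd1f7d80 find vnode
> 647 1 3 4 ca289260 getty tstile
> 3980 1 3 0 cda8eda0 cvs vnode
> 328 8 3 84 cc71c020 slapd parked
> 7 3 80 cc71c4a0 slapd parked
> 156 1 3 84 cb696d80 lfs_cleanerd segment
> 6 1 204 ca286480 softser/0
"""


def tally_wait_channels(listing):
    """Count how many LWPs sleep on each wait channel.

    Assumption: after stripping the mail quote prefix, a row with
    7 fields (PID present) or 6 fields (PID omitted) ends with its
    WAIT channel; shorter rows have no WAIT entry and are skipped.
    """
    counts = Counter()
    for line in listing.splitlines():
        fields = line.lstrip("> ").split()
        if not fields or not fields[0].isdigit():
            continue  # blank line or column header
        if len(fields) in (6, 7):
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    for channel, n in tally_wait_channels(SAMPLE).most_common():
        print(f"{n:3d} {channel}")
```

Feeding the full listing to the function instead of the sample would show how many LWPs are sleeping on each channel, which is a quick way to spot a pile-up behind one lock or resource.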