On my Xen box, I sometimes get a state where any "xl" command will just hang (in such a way as to not respond to SIGTERM, SIGKILL, or even a ^Z at the shell).

Existing domUs are still running fine and can talk to the network and their xbd disks and *existing* xl console sessions, but I can't start new ones, or open new console connections.

xenconsoled and xenstored are both running (well, processes with those names exist and are in the "select" state according to top); all my xl processes are in state "tstile", and when I had a look this time, there were two xenstore-rm processes and a xenstore-read around, also in the tstile state:

root 13694  0.0  0.5 13092 1264 ?  I  3:04PM 0:00.00 xenstore-rm /local/domain/0/backend/vif/17/0
root 17011  0.0  0.5 13092 1260 ?  D  3:04PM 0:00.00 xenstore-read /local/domain/0/backend/vbd/17/0/params
root 25386  0.0  0.5 12064 1224 ?  D  3:04PM 0:00.00 xenstore-rm /local/domain/0/backend/vif/17/3

Any ideas what it might be?

When it happens, the only cure seems to be to ssh into the domUs that are running ssh (some are meant to be accessed purely via xl console) and shut them down, then reboot dom0 and restart everything :-(

Thanks,

ABS

--
Alaric Snell-Pym
http://www.snell-pym.org.uk/alaric/