Port-xen archive


Re: cannot kill (null) domains



On Fri, 28 Dec 2018 15:50:03 +0100
Petr Topiarz <topiarz%post.cz@localhost> wrote:

> Yes, that helped, exactly as you stated, there were qemu-dm processes 
> with fixed memory which roughly corresponded to the memory the xl list 
> stated for (null) domains.
> 
> When I killed those, the (null) domains were lost!
> 
> Thank you very much for your advice!
> 
> Petr
> 
> On 28 Dec 2018 at 15:22, Brad Spencer wrote:
> > Petr Topiarz <topiarz%post.cz@localhost> writes:
> >  
> >> Hello folks,
> >>
> >> I run NetBSD 8_stable amd64 and upgraded xen from the old 4.1 to 4.8 and
> >> face this issue. I do not know  when it happened but there seem to be
> >> two (null) paused and dying domains which consume memory and I cannot
> >> kill them. These two domains cannot be unpaused or destroyed, they
> >> ignore all commands. Any way to get rid of them?
> >>
> >> Thank you, Petr
> >>
> >> Here is the xen version:
> >>
> >> # pkg_info |grep xen
> >> xenkernel48-4.8.3   Xen 4.8.x Kernel
> >> xentools48-4.8.3nb5 Userland Tools for Xen 4.8.x
> >>
> >> Here you can see the list command:
> >>
> >> # xl list
> >> Name                                        ID   Mem VCPUs State   Time(s)
> >> Domain-0                                     0  4096     1 r-----   75701.8
> >> lin-orange                                   1  5000     1 -b----   46265.8
> >> (null)                                       2  2641     1 --ps-d   10254.3
> >> (null)                                       3  2248     1 --ps-d   12449.9
> >> net-web                                      4  2000     1 -b----    9425.8
> >> lin-tritius                                  5 10000     2 -b----   86002.0
> >> net8-proxy                                   8  2000     1 -b----    3115.6
> >> net8-compiler                                9  4000     2 -b----    7516.3
> >> net8-web                                    10  4000     2 -b----    7672.0
> >> win-red                                     12  4999     1 -b----    5550.6
> >> win-aegidius                                13  3999     1 -b----      32.7
> >> win-les                                     14  3999     1 -b----      18.6  
> >
> > I see this with HVM style guests.  If these are or were HVM guest zones
> > look for a hung qemu process.  I typically end up having to kill those
> > off to get the (null) zones to clear out.  This happened in 4.5 and
> > happens in 4.8.
> >
> >
> >
> >  
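
For a manual check along those lines, a small sketch like the following may help. It pulls the IDs of the (null) domains out of "xl list" output and looks up the matching qemu-dm PID in "ps" output. The function names null_domids and qemu_dm_pid are made up for illustration, and the sketch assumes the same "qemu-dm -d <domid>" command-line form that the script further down matches. Both functions read from stdin, so they can be tried on saved output first.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Xen tools.

# Print the ID column of every domain that `xl list` shows as (null).
# Reads `xl list` output on stdin.
null_domids() {
    awk '$1 == "(null)" { print $2 }'
}

# Print the PID (field 2 of `ps -auwx` output) of the qemu-dm process
# started with "-d <domid>". Reads ps output on stdin; $1 is the domain ID.
# Comparing whole fields avoids matching ID 2 against ID 22.
qemu_dm_pid() {
    awk -v id="$1" '
        {
            for (i = 1; i + 2 <= NF; i++)
                if ($i ~ /qemu-dm$/ && $(i + 1) == "-d" && $(i + 2) == id)
                    print $2
        }'
}
```

On a dom0 this would be used as "xl list | null_domids" and then, for each ID printed, "ps -auwx | qemu_dm_pid ID". Nothing is killed here; it only reports.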

I've really only seen this on Windows, where a software install (e.g. one
using InstallShield) triggers a reboot that fails to complete. Running
"shutdown.exe /R" from the Windows terminal seems to be fine. No doubt
there are other failure modes. I've got a bunch of larger dom0 servers,
so fixing it manually is not a good option for me.

Here's a script I run out of cron every 10 minutes on my servers to deal
with this. Offered as-is with no guarantee it won't bork your system,
etc. Use at your own risk.

--------------------------
#!/usr/pkg/bin/bash

if [ "$(uname -s)" != "NetBSD" ]; then
    echo "This script assumes the OS is NetBSD. Changes are required to" 1>&2
    echo "make this portable." 1>&2
    exit 99
fi

export PATH=/usr/pkg/bin:/usr/pkg/sbin:$PATH

# xl create below is given a bare name, so run from the config directory
if ! pushd /usr/pkg/etc/xen/conf > /dev/null 2>&1; then
    echo "Could not change CWD to xen config file directory" 1>&2
    exit 1
fi

# look for domains which hang on shutdown/reboot (usually Windows)
DOMIDS=$(xl list | awk '$1 == "(null)" {print $2}')
for d in $DOMIDS; do
    # find the qemu-dm still attached to this domain ID; field 2 of the
    # ps output is its PID, field 15 is presumably the domain name, which
    # doubles as the config file name
    line=$(ps -auwx | grep -v egrep | egrep "/usr/pkg/libexec/xen/bin/qemu-dm -d ${d}\b")
    p=$(echo "$line" | awk '{print $2}')
    n=$(echo "$line" | awk '{print $15}')
    # no matching qemu-dm: nothing to kill and no name to restart
    [ -z "$p" ] && continue
    kill $p
    sleep 3
    # still alive after a plain kill, so force it
    if ps -p $p > /dev/null 2>&1; then
        kill -9 $p
    fi
    sleep 3
    xl create $n
done
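
For the cron part, an entry along these lines in root's crontab would run the script every 10 minutes. This is only a sketch: the path /usr/local/sbin/xen-null-cleanup.sh is a made-up location for wherever the script above is saved, and it has to be executable.

```
# m   h  dom mon dow  command
*/10  *  *   *   *    /usr/local/sbin/xen-null-cleanup.sh
```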

