On 11.05.2020 13:35, Robert Elz wrote:
>     Date:        Mon, 11 May 2020 11:03:15 +0000
>     From:        "Kamil Rytarowski" <kamil%netbsd.org@localhost>
>     Message-ID:  <20200511110315.54B13FB27%cvs.NetBSD.org@localhost>
>
>   | Do not fail when trying to kill a dying process
>   |
>   | A dying process can disappear for a while. Rather than aborting, retry
>   | sending SIGKILL to it.
>
> I don't understand this ... a process should never be able to
> disappear and then reappear (not in any way). If a SIGKILL (or
> ptrace(PT_KILL) fails with a "no such process" error, then repeating
> it won't (or shouldn't) help - if it does, there's a kernel bug that
> needs fixing (and it is OK for the test to fail until that happens.)
>
> Further, if the reason for this failure is that the process is
> dying, you probably never needed the kill in the first place (and
> no, I don't mean it should be deleted - the parent is unlikely to
> know the state of the child, so killing it, if that is what is needed
> is the right thing to do ... just that if the kill fails because you
> were too late issuing it, it isn't an error, just a race that you lost,
> and certainly shouldn't be repeated).
>
> But more than that, adding an infinite loop to the test, where you keep
> doing the kill forever until it succeeds, or errno somehow stops being
> ESRCH looks like a recipe for disaster.
>
> Just do the kill once, ignore the error if it is ESRCH (and probably
> also ECHILD) report other errors as failures.
>
> kre
>

The only purpose of the test is to check whether a misaligned program
counter can crash the kernel (it can on NetBSD/sparc). Whether the
process then dies or keeps running is not important, so it is killed.

A process can disappear after dying and before reappearing as a zombie.
This is not a bug, but an expected race. We have already discussed this
in the past: whether to return the same process multiple times, or to
overlook it for a while during the dying->zombie transition.
Once an entity has died it disappears, and the same is true for a
process. Doing the kill once (and missing the process) is possibly still
enough, but following it up with another SIGKILL costs nothing.
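
For reference, the "kill once, ignore ESRCH/ECHILD" approach suggested in
the quoted mail could look roughly like the sketch below. This is only an
illustration, not the code in the test: kill_once() and reap_child() are
hypothetical helper names, and it assumes the parent signals the child
with plain kill(2) rather than ptrace(PT_KILL).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper, not taken from the test: send SIGKILL exactly once.
 * Returns 0 if the signal was sent or the process was already gone
 * (ESRCH/ECHILD, i.e. a race we lost), -1 on any other error with errno
 * left as set by kill(2).
 */
static int
kill_once(pid_t pid)
{
	assert(pid > 0);	/* pid <= 0 would address a process group */

	if (kill(pid, SIGKILL) == 0)
		return 0;
	if (errno == ESRCH || errno == ECHILD)
		return 0;
	return -1;
}

/*
 * Hypothetical helper: reap the child, retrying only on EINTR.
 * Returns 0 once the child has been reaped, or if there was nothing
 * left to reap (ECHILD); -1 on any other error.
 */
static int
reap_child(pid_t pid)
{
	int status;

	while (waitpid(pid, &status, 0) == -1) {
		if (errno == EINTR)
			continue;
		return (errno == ECHILD) ? 0 : -1;
	}
	return 0;
}
```

With helpers like these there is no unbounded loop on ESRCH: the only
retry is on EINTR in waitpid(2), and any unexpected errno is reported to
the caller as a failure.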