NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: kern/59175: posix_spawn hang, hanging other process too
The following reply was made to PR kern/59175; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Thomas Klausner <wiz%NetBSD.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost,
martin%NetBSD.org@localhost, chs%NetBSD.org@localhost, dholland%NetBSD.org@localhost
Subject: Re: kern/59175: posix_spawn hang, hanging other process too
Date: Fri, 14 Mar 2025 20:05:59 +0000
This is a multi-part message in MIME format.
--=_k+R2A0jPCfl/fFvU84G4E4n02s6AxFtx
> Date: Fri, 14 Mar 2025 15:14:37 +0000
> From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
>
> > Date: Fri, 14 Mar 2025 14:28:38 +0000
> > From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
> >
> > 1. do_posix_spawn in the parent p1 creates a process p2 with
> > proc_alloc, which has:
> > - p2->p_stat = SIDL (can't be looked up with proc_find)
> > - p2->p_vmspace = &vmspace0 (initial vmspace, kernel's) or NULL
> > - p2->p_psstrp = garbage from recycled struct proc or NULL
> >
> > In particular, if p2 is freshly allocated, it will have null
> > p_vmspace and p_psstrp; if it was recycled, p_vmspace will have
> > been set to proc0.p_vmspace=&vmspace0 shortly before recycling
> > in uvm_proc_exit:
> > https://nxr.netbsd.org/xref/src/sys/uvm/uvm_glue.c?r=1.182#414
>
> Tiny correction: do_posix_spawn always initializes p2->p_vmspace to
> proc0.p_vmspace, i.e., &vmspace0, so NULL is not possible here, as of
> kern_exec.c rev. 1.348.
One more correction: p_psstrp is always inherited from the parent
because it lies between p_startcopy and p_endcopy, so it's never null
when copyin_psstrings tries to get at it.
So, here are two candidate patches that attempt to resolve the issue
without discarding the early-parent-wakeup path that is evidently the
primary reason for having posix_spawn as a syscall altogether instead
of a library wrapper around vfork/execve:
1. pr59175-posixspawnnullpsstr.patch sets the child's p2->p_psstrp to
null, and updates copyin_psstrings (and the compat32 version of it)
to fail with EINVAL (or maybe it should be EBUSY so it doesn't look
like a spurious invalid-argument failure shortly after process
creation) when it finds this.
I think this is reasonably low-risk, but:
(a) I don't know what other pointers into the parent's vmspace
might still exist, and be visible to other processes, before
the child finishes the exec logic -- maybe it's just the
ps_strings but I haven't checked.
(b) This adds more complications outside posix_spawn.
2. pr59175-posixspawntempinheritvmspace.patch starts by sharing the
parent's vmspace like vfork(2) does, and then switches it during
execve_runproc like execve(2) does. During that time, the child's
p2->p_psstrp reflects the parent, just like it would be with
vfork(2).
I think this is somewhat higher-risk, but:
(a) This doesn't require reviewing other possible pointers into the
parent's vmspace -- they will continue to work exactly as well
as they would with vfork(2) and execve(2), so it requires no
changes to copyin_psstrings.
(b) This brings do_posix_spawn closer to fork1, so the code has a
better chance of being shared (as suggested in PR kern/50480:
posix_spawn vs. fork code duplication).
--=_k+R2A0jPCfl/fFvU84G4E4n02s6AxFtx
Content-Type: text/plain; charset="ISO-8859-1"; name="pr59175-posixspawnnullpsstr"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr59175-posixspawnnullpsstr.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh%NetBSD.org@localhost>
# Date 1741980237 0
# Fri Mar 14 19:23:57 2025 +0000
# Branch trunk
# Node ID 9a8ceed8653bfe797e75c766ca83593c50eecce2
# Parent 3c321704980af3b382b8e8ab493927a652a48ea9
# EXP-Topic riastradh-pr59175-spawnwait
posix_spawn(2): Block access to ps_strings until initialized.
When the child is first created, it has:
- p_vmspace set to &vmspace0 (proc0.p_vmspace) as a sentinel value
- p_psstrp inherited from the parent
It is in this state that the pid is first published -- returned to
the parent, and findable by other processes.
XXX This means that there is a short time early in the child's
lifetime during which kern.proc_args.<childpid>.nargv &c. will fail
with EINVAL. Perhaps we should inherit p_vmspace _and_ p_psstrp like
vfork(2) would, and update them simultaneously like execve(2) would.
PR kern/59175: posix_spawn hang, hanging other process too
diff -r 3c321704980a -r 9a8ceed8653b sys/compat/netbsd32/netbsd32_kern_proc=
.c
--- a/sys/compat/netbsd32/netbsd32_kern_proc.c Fri Mar 14 17:51:57 2025 +00=
00
+++ b/sys/compat/netbsd32/netbsd32_kern_proc.c Fri Mar 14 19:23:57 2025 +00=
00
@@ -109,8 +109,11 @@ static int
copyin_psstrings_32(struct proc *p, struct ps_strings *arginfo)
{
struct ps_strings32 arginfo32;
+ int error;
=20
- int error =3D copyin_proc(p, (void *)p->p_psstrp, &arginfo32,
+ if (p->p_psstrp =3D=3D 0) /* primordial spawn */
+ return EINVAL;
+ error =3D copyin_proc(p, (void *)p->p_psstrp, &arginfo32,
sizeof(arginfo32));
if (error)
return error;
diff -r 3c321704980a -r 9a8ceed8653b sys/kern/kern_exec.c
--- a/sys/kern/kern_exec.c Fri Mar 14 17:51:57 2025 +0000
+++ b/sys/kern/kern_exec.c Fri Mar 14 19:23:57 2025 +0000
@@ -2642,6 +2642,7 @@ do_posix_spawn(struct lwp *l1, pid_t *pi
memcpy(&p2->p_startcopy, &p1->p_startcopy,
(unsigned) ((char *)&p2->p_endcopy - (char *)&p2->p_startcopy));
p2->p_vmspace =3D proc0.p_vmspace;
+ p2->p_psstrp =3D 0; /* to be initialized by child later */
=20
TAILQ_INIT(&p2->p_sigpend.sp_info);
=20
diff -r 3c321704980a -r 9a8ceed8653b sys/kern/kern_proc.c
--- a/sys/kern/kern_proc.c Fri Mar 14 17:51:57 2025 +0000
+++ b/sys/kern/kern_proc.c Fri Mar 14 19:23:57 2025 +0000
@@ -2327,6 +2327,8 @@ copyin_psstrings(struct proc *p, struct=20
}
#endif /* !defined(_RUMPKERNEL) */
=20
+ if (p->p_psstrp =3D=3D 0) /* primordial spawn */
+ return EINVAL;
return copyin_proc(p, (void *)p->p_psstrp, arginfo, sizeof(*arginfo));
}
=20
--=_k+R2A0jPCfl/fFvU84G4E4n02s6AxFtx
Content-Type: text/plain; charset="ISO-8859-1"; name="pr59175-posixspawntempinheritvmspace"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr59175-posixspawntempinheritvmspace.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh%NetBSD.org@localhost>
# Date 1741982594 0
# Fri Mar 14 20:03:14 2025 +0000
# Branch trunk
# Node ID 0769a64ad876303f34f46e14ba424f9a0ffb90dc
# Parent 3c321704980af3b382b8e8ab493927a652a48ea9
# EXP-Topic riastradh-pr59175-spawnwait
posix_spawn(2): Share parent's VM space temporarily.
This is just like vfork(2), and this avoids special cases in other
parts of the logic. This is mainly for ps_strings access.
PR kern/59175: posix_spawn hang, hanging other process too
diff -r 3c321704980a -r 0769a64ad876 sys/kern/kern_exec.c
--- a/sys/kern/kern_exec.c Fri Mar 14 17:51:57 2025 +0000
+++ b/sys/kern/kern_exec.c Fri Mar 14 20:03:14 2025 +0000
@@ -1245,14 +1245,9 @@ execve_runproc(struct lwp *l, struct exe
* until we have awoken the parent below, or it will defeat
* lazy pmap switching (on x86).
*/
- if (is_spawn)
- uvmspace_spawn(l, epp->ep_vm_minaddr,
- epp->ep_vm_maxaddr,
- epp->ep_flags & EXEC_TOPDOWN_VM);
- else
- uvmspace_exec(l, epp->ep_vm_minaddr,
- epp->ep_vm_maxaddr,
- epp->ep_flags & EXEC_TOPDOWN_VM);
+ uvmspace_exec(l, epp->ep_vm_minaddr,
+ epp->ep_vm_maxaddr,
+ epp->ep_flags & EXEC_TOPDOWN_VM);
vm =3D p->p_vmspace;
=20
vm->vm_taddr =3D (void *)epp->ep_taddr;
@@ -2641,7 +2636,6 @@ do_posix_spawn(struct lwp *l1, pid_t *pi
(unsigned) ((char *)&p2->p_endzero - (char *)&p2->p_startzero));
memcpy(&p2->p_startcopy, &p1->p_startcopy,
(unsigned) ((char *)&p2->p_endcopy - (char *)&p2->p_startcopy));
- p2->p_vmspace =3D proc0.p_vmspace;
=20
TAILQ_INIT(&p2->p_sigpend.sp_info);
=20
@@ -2732,8 +2726,15 @@ do_posix_spawn(struct lwp *l1, pid_t *pi
*/
p2->p_stats =3D pstatscopy(p1->p_stats);
=20
- /* copy over machdep flags to the new proc */
- cpu_proc_fork(p1, p2);
+ /*
+ * Set up the new process address space.
+ *
+ * Initially, we act like vfork(2) in sharing the address
+ * space, so that the (inherited) ps_strings will be preserved.
+ * The child will soon replace its address space in
+ * execve_runproc, as if by execve(2).
+ */
+ uvm_proc_fork(p1, p2, /*shared*/true);
=20
/*
* Prepare remaining parts of spawn data
diff -r 3c321704980a -r 0769a64ad876 sys/uvm/uvm_extern.h
--- a/sys/uvm/uvm_extern.h Fri Mar 14 17:51:57 2025 +0000
+++ b/sys/uvm/uvm_extern.h Fri Mar 14 20:03:14 2025 +0000
@@ -731,7 +731,6 @@ struct vmspace *uvmspace_alloc(vaddr_t,
void uvmspace_init(struct vmspace *, struct pmap *,
vaddr_t, vaddr_t, bool);
void uvmspace_exec(struct lwp *, vaddr_t, vaddr_t, bool);
-void uvmspace_spawn(struct lwp *, vaddr_t, vaddr_t, bool);
struct vmspace *uvmspace_fork(struct vmspace *);
void uvmspace_addref(struct vmspace *);
void uvmspace_free(struct vmspace *);
diff -r 3c321704980a -r 0769a64ad876 sys/uvm/uvm_map.c
--- a/sys/uvm/uvm_map.c Fri Mar 14 17:51:57 2025 +0000
+++ b/sys/uvm/uvm_map.c Fri Mar 14 20:03:14 2025 +0000
@@ -4259,27 +4259,6 @@ uvmspace_unshare(struct lwp *l)
=20
=20
/*
- * uvmspace_spawn: a new process has been spawned and needs a vmspace
- */
-
-void
-uvmspace_spawn(struct lwp *l, vaddr_t start, vaddr_t end, bool topdown)
-{
- struct proc *p =3D l->l_proc;
- struct vmspace *nvm;
-
-#ifdef __HAVE_CPU_VMSPACE_EXEC
- cpu_vmspace_exec(l, start, end);
-#endif
-
- nvm =3D uvmspace_alloc(start, end, topdown);
- kpreempt_disable();
- p->p_vmspace =3D nvm;
- pmap_activate(l);
- kpreempt_enable();
-}
-
-/*
* uvmspace_exec: the process wants to exec a new program
*/
=20
--=_k+R2A0jPCfl/fFvU84G4E4n02s6AxFtx--
Home |
Main Index |
Thread Index |
Old Index