NetBSD-Bugs archive


Re: toolchain/57241: mips64el--netbsd-install core dumps randomly



Why not build the prog.debug file as part of the prog target?

christos

--- bsd.prog.mk 28 Nov 2021 15:49:36 -0000      1.340
+++ bsd.prog.mk 18 Apr 2025 18:33:21 -0000
@@ -529,6 +529,12 @@
 .if ${MKSTRIPIDENT} != "no"
        ${OBJCOPY} -R .ident ${.TARGET}
 .endif
+.if defined(_PROGDEBUG.${_P})
+       (  ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
+       && ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
+               --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
+       ) || (rm -f ${_PROGDEBUG.${_P}}; false)
+.endif
 
 CLEANFILES+=   ${_P}.d
 .if exists(${_P}.d)
@@ -554,21 +560,18 @@
 .if ${MKSTRIPIDENT} != "no"
        ${OBJCOPY} -R .ident ${.TARGET}
 .endif
-.endif # !commands(${_P})
-.endif # USE_COMBINE
-
-${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
-       ${_MKTARGET_LINK}
-       ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}
-
 .if defined(_PROGDEBUG.${_P})
-${_PROGDEBUG.${_P}}: ${_P}
-       ${_MKTARGET_CREATE}
        (  ${OBJCOPY} --only-keep-debug ${_P} ${_PROGDEBUG.${_P}} \
        && ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
                --add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
        ) || (rm -f ${_PROGDEBUG.${_P}}; false)
 .endif
+.endif # !commands(${_P})
+.endif # USE_COMBINE
+
+${_P}.ro: ${OBJS.${_P}} ${_DPADD.${_P}}
+       ${_MKTARGET_LINK}
+       ${CC} ${LDFLAGS:N-pie} -nostdlib -r -Wl,-dc -o ${.TARGET} ${OBJS.${_P}}
 
 .endif # defined(OBJS.${_P}) && !empty(OBJS.${_P})                     # }
 


> On Apr 18, 2025, at 12:30 PM, Taylor R Campbell via gnats <gnats-admin%netbsd.org@localhost> wrote:
> 
> The following reply was made to PR toolchain/57241; it has been noted by GNATS.
> 
> From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
> To: Roland Illig <rillig%NetBSD.org@localhost>
> Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
> Subject: Re: toolchain/57241: mips64el--netbsd-install core dumps randomly
> Date: Fri, 18 Apr 2025 16:26:30 +0000
> 
> Hi rillig, I wonder whether you might be able to help solve a
> make(1)-related mystery?
> 
> I'm drafting a change to fix the parallel-safety of the foo.debug
> recipe in bsd.prog.mk (a little finicky because it has nontrivial
> interaction with other makefiles like libexec/ld.elf_so/Makefile).
> 
> But before I commit it, I want to make sure I understand the
> underlying cause of PR 57241.
> 
> The immediate symptom is that, e.g., `mips64el--netbsd-install ...
> ipftest ${DESTDIR}/usr/sbin/ipftest' is crashing because its input
> file has been truncated between fstat/mmap and access to file content.
> And it looks like there's a concurrent objcopy from the .debug recipe
> which has truncated ipftest to rewrite it in place.
> 
> But I can't figure out why the concurrent objcopy is happening only in
> the mips64 builds of certain programs like ipftest(8) and crash(8),
> which seem to have in common the use of compat/exec.mk.  (These are
> programs that run with the n64 ABI, in order to read out kernel guts
> on mips64 CPUs, in a userland where _most_ programs run with the n32
> ABI instead because it's more compact and they usually have <4GB RAM.)
> 
> And so I think I need a make(1) wizard to help.
> 
> 
> Here's an example:
> 
> https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
> https://web.archive.org/web/20250418154748/https://releng.netbsd.org/builds/HEAD/202504161330Z/evbmips-mips64el.build.failed
> 
> [1]   Bus error (core dumped) /home/builds/ab/HEAD/evbmips-mips64el/20250416...
> --- /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest ---
> ...
> *** Failed target: /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> *** In directory: /home/source/ab/HEAD/src/external/bsd/ipf/bin/ipftest
> *** Failed commands:
> 	${_MKTARGET_INSTALL}
> 	=> @# "install " /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> 	${INSTALL_FILE} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE}  ${STRIPFLAG} ${.ALLSRC} ${.TARGET}
> 	=> /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-install -U -M /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/METALOG -D /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest -h sha256 -N /home/source/ab/HEAD/src/etc -c  -r -o root -g wheel -m 555   ipftest /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest
> *** [/home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-dest/usr/sbin/ipftest] Error code 138
> ...
> /home/builds/ab/HEAD/evbmips-mips64el/202504161330Z-tools/bin/mips64el--netbsd-objcopy: libcrypto.so.15.0.debug: section `.note.netbsd.pax' can't be allocated in segment 0
> LOAD: .MIPS.abiflags .reginfo .dynamic .hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .init .text .MIPS.stubs .fini .rodata .eh_frame_hdr .eh_frame .note.netbsd.ident .note.netbsd.pax
> 
> The last part -- a warning message about which I just filed another
> bug, PR port-mips/59320: objcopy: section `.note.netbsd.pax' can't be
> allocated in segment 0 -- is evidence that make(1) is still running
> the buggy ipftest.debug recipe which rewrites ipftest in place:
> 
>     507 ${_PROGDEBUG.${_P}}: ${_P}
>     508 	${_MKTARGET_CREATE}
>     509 	( ${OBJCOPY} --only-keep-debug --compress-debug-sections \
>     510 	    ${_P} ${_PROGDEBUG.${_P}} && \
>     511 	  ${OBJCOPY} --strip-debug -p -R .gnu_debuglink \
>     512 		--add-gnu-debuglink=${_PROGDEBUG.${_P}} ${_P} \
>     513 	) || (rm -f ${_PROGDEBUG.${_P}}; false)
> 
> https://nxr.netbsd.org/xref/src/share/mk/bsd.prog.mk?r=1.355#509
> 
> 
> My best guess was that:
> 
> 1. When doing dependall, the ipftest.debug recipe above:
>    (a) creates ipftest.debug with objcopy at time t0,
>    (b) a moment later, modifies ipftest in place with objcopy, at time
>        t1 = t0 + eps > t0.
> 
> 2. When doing install, make(1) finds that ${DESTDIR}/usr/sbin/ipftest
>    and ${DESTDIR}/usr/libdata/debug/usr/sbin/ipftest.debug are both
>    out of date, so it tries to run, _in parallel_:
> 
>    (a) mips64el--netbsd-install ... ipftest ${DESTDIR}/usr/sbin/ipftest,
>        because ipftest exists and is up-to-date
> 
>    (b) the .debug recipe above again, because ipftest exists and is
>        up-to-date with timestamp t1, but ipftest.debug exists and is
>        out-of-date with timestamp t0 < t1
> 
> Except this hypothesis doesn't make sense, for two reasons:
> 
> - The problem empirically _only_ happens in mips64 builds with a few
>   programs, and nothing in the hypothesis above is restricted to that.
> 
> - We pass `-p' (--preserve-dates) to objcopy(1) in step (1), so it
>   restores the mtime of the input file after truncating and
>   overwriting it -- and so by the time of make install, it should look
>   like ipftest.debug is up-to-date.
> 
> So I can't figure out why, under these circumstances, make install is
> trying to rerun the .debug recipe.  And I can't reproduce it on my
> laptop.
> 
> I tried reading out `make -d g1' and `make -d m' output but it's kind
> of inscrutable to me (I thought `-d g1' would show a graph, with nodes
> and edges for dependency relations, but I can't figure out how to read
> the edges in it).
> 
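For readers following along, the failure mode described in the quoted
message (the input file shrinking between fstat/mmap and the read) can be
reproduced in miniature. The sketch below is only an illustration under
assumptions: the file name and contents are made up, plain read() stands in
for mmap so nothing crashes, and the rename-based variant is shown purely
for contrast. It is not what the patch above or the draft fix does, and it
assumes a POSIX filesystem.

```python
import os
import tempfile


def read_after_rewrite(atomic):
    """Open a file, let a 'writer' replace its contents, then read it.

    Returns (size seen at fstat time, bytes actually read afterwards).
    """
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "prog")  # hypothetical stand-in for ipftest
        with open(path, "wb") as f:
            f.write(b"unstripped program with debug info\n")

        # The reader opens the file and records its size, as install would.
        reader = open(path, "rb")
        size_at_fstat = os.fstat(reader.fileno()).st_size

        if atomic:
            # Write the new contents elsewhere, then rename over the old name.
            # The reader keeps the old inode and its full contents.
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(b"stripped\n")
            os.replace(tmp, path)
        else:
            # Truncate and rewrite the very inode the reader has open.
            with open(path, "wb") as f:
                f.write(b"stripped\n")

        data = reader.read()
        reader.close()
        return size_at_fstat, data


if __name__ == "__main__":
    for atomic in (False, True):
        size, data = read_after_rewrite(atomic)
        print("atomic" if atomic else "in place", size, len(data))
```

With the in-place rewrite the reader gets fewer bytes than fstat promised.
With mmap instead of read(), touching pages past the new end of file raises
SIGBUS, which would be consistent with the "Bus error" in the log.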

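The timestamp reasoning in the quoted hypothesis, and the `-p' objection to
it, can be checked with a small model. This is a sketch under assumptions:
out_of_date() is a simplified version of make's rule (target missing or
older than its source), debug_recipe() is a made-up stand-in for the two
objcopy steps, and all file names and times are hypothetical. It says
nothing about what make(1) actually does in the mips64 builds.

```python
import os
import tempfile

SECOND = 10**9  # nanoseconds


def out_of_date(target, source):
    """Simplified timestamp rule: rebuild if target is missing or older."""
    if not os.path.exists(target):
        return True
    return os.stat(target).st_mtime_ns < os.stat(source).st_mtime_ns


def debug_recipe(prog, debug, preserve_dates):
    """Stand-in for the two objcopy steps of the .debug recipe."""
    before = os.stat(prog)
    # Step (a): create prog.debug at t0, one second after prog was linked.
    with open(debug, "wb") as f:
        f.write(b"debug info\n")
    t0 = before.st_mtime_ns + SECOND
    os.utime(debug, ns=(t0, t0))
    # Step (b): rewrite prog in place; its mtime becomes "now", i.e. t1 > t0.
    with open(prog, "wb") as f:
        f.write(b"stripped program\n")
    if preserve_dates:
        # Analogue of objcopy -p: put the original timestamps back.
        os.utime(prog, ns=(before.st_atime_ns, before.st_mtime_ns))


def debug_is_stale_after_build(preserve_dates):
    """Run the recipe once and report whether prog.debug then looks stale."""
    with tempfile.TemporaryDirectory() as d:
        prog = os.path.join(d, "prog")
        debug = os.path.join(d, "prog.debug")
        with open(prog, "wb") as f:
            f.write(b"unstripped program\n")
        linked = 1_000_000_000 * SECOND  # a fixed link time far in the past
        os.utime(prog, ns=(linked, linked))
        debug_recipe(prog, debug, preserve_dates)
        return out_of_date(debug, prog)


if __name__ == "__main__":
    print("without -p:", debug_is_stale_after_build(False))
    print("with -p:   ", debug_is_stale_after_build(True))
```

In this model the .debug file looks stale only when the timestamps are not
restored, which matches the second objection in the quoted message: with
`-p' in effect, a plain mtime comparison should not trigger a rerun.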


