On Wed 14 Oct 2015 at 09:39:40 +0200, J. Hannken-Illjes wrote:
> Looks like a deadlock, two threads in tstile.
>
> Please take a backtrace (with arguments) of these threads.

I've got a whole lot more in tstile, and that is even just from running
pkg_comp in the chroot. I didn't try to interrupt anything yet.

load averages:  0.00,  0.20,  0.44;               up 0+02:23:43        22:43:52
78 processes: 76 sleeping, 2 on CPU
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Memory: 393M Act, 60K Inact, 31M Wired, 31M Exec, 273M File, 3239M Free
Swap: 4096M Total, 4096M Free

vargaz:~$ ps alxtp1
 UID   PID  PPID   CPU PRI NI   VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
1000  1391    74     0  85  0 13208  2528 wait    Is   ttyp1 0:00.02 -bash
   0  1759  1391  1107  85  0 13304  1576 wait    I    ttyp1 0:00.13 /bin/sh /usr/pkg/sbin/pkg_comp chroot
   0   865  1759  1107  85  0 13304  1140 wait    I    ttyp1 0:00.01 /bin/sh /pkg_comp/tmp/pkg_comp-sOjsoA.sh
   0   874   865 13547  82  0 11088  1412 pause   I    ttyp1 0:00.01 /bin/ksh
   0   267   874 20048  81  0 15360  1720 wait    I+   ttyp1 0:00.22 /bin/sh -e /usr/pkg/sbin/pkg_chk
   0  9782   267 20048  81  0 15360  1448 wait    I+   ttyp1 0:00.00 sh -c cd /usr/pkgsrc/devel/mercurial && /usr/bin/make u
   0  8085  9782     0 117  0 15224  3452 tstile  D+   ttyp1 0:00.14 /usr/bin/make update CLEANDEPENDS
   0 26889  8085 29745  78  0 15360  1424 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e; /usr/bin/env MAKECONF=/etc/mk.conf P
   0 14050 26889     0 117  0 15224  3444 tstile  D+   ttyp1 0:00.14 /usr/bin/make _MAKE OPSYS OS_VERSION LOWER_OPSYS _PKGSR
   0  6325 14050 22699  80  0 15360  1428 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e; pkgpattern=mercurial-3.5.1;\t\t\t\t
   0 13334  6325     0 117  0 15224  3452 tstile  D+   ttyp1 0:00.14 /usr/bin/make .MAKE.LEVEL.ENV CLEANDEPENDS HOST_OSTYPE
   0  2892 13334 29745  78  0 15364  1444 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e;\t\t\t\t\t\t\t\t exec 3<&0;\t\t\t\t\t
   0 13425  2892 29745  78  0 15364  1136 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e;\t\t\t\t\t\t\t\t exec 3<&0;\t\t\t\t\t
   0 17339 13425     0 117  0 15224  3504 tstile  D+   ttyp1 0:00.16 /usr/bin/make .MAKE.LEVEL.ENV CLEANDEPENDS DEPENDS_TARG
   0 11893 17339 23601  80  0 15364  1432 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e; pkgpattern=py27-mercurial\\>=3.5.1;\
   0 21797 11893     0 117  0 15228  3512 tstile  D+   ttyp1 0:00.18 /usr/bin/make .MAKE.LEVEL.ENV CLEANDEPENDS DEPENDS_TARG
   0  1347 21797 23778  80  0 15364  1456 wait    I+   ttyp1 0:00.00 /bin/sh -c set -e;\t\t\t\t\t if test -n "" && /usr/pkg
   0 23567  1347     0 117  0 15228  4032 tstile  D+   ttyp1 0:00.38 /usr/bin/make .MAKE.LEVEL.ENV CLEANDEPENDS DEPENDS_TARG
   0  3383 23567 29360  78  0 15364  1432 wait    I+   ttyp1 0:00.00 /bin/sh -c (cd /pkg_comp/obj/pkgsrc/devel/py-mercurial/
   0 21311  3383 28277  79  0 81652 11580 wait    I+   ttyp1 0:00.14 /usr/pkg/bin/python2.7 setup.py build
   0 24114 21311 28277  79  0 15364  1424 wait    I+   ttyp1 0:00.01 /bin/sh /pkg_comp/obj/pkgsrc/devel/py-mercurial/default
   0  3590 24114 28277  79  0 15364  1472 wait    I+   ttyp1 0:00.00 /bin/sh /usr/pkgsrc/mk/tools/msgfmt.sh
   0  7060  3590 28277 117  0  4244   188 tstile  D+   ttyp1 0:00.00 /bin/cat
   0 18497  3590 28277  79  0 10880  1064 pipe_wr I+   ttyp1 0:00.00 /bin/cat i18n/el.po
   0 23883  3590     0 117  0  6580   236 netio   D+   ttyp1 0:00.00 /usr/bin/msgfmt -v -o mercurial/locale/el/LC_MESSAGES/h
   0 27257  3590 28277 117  0  4244   188 tstile  D+   ttyp1 0:00.00 /bin/cat
   0 29472  3590 28277  79  0 14244  2344 pipe_wr I+   ttyp1 0:00.01 /usr/bin/awk -f /usr/bin/awk

(I've re-arranged the order to get parents before children)

Here are backtraces of the processes in tstile (and the shell that
spawned the 4 leaf children). I have kept the dump so I can examine it
further.

Unfortunately, crash(8) didn't give me arguments, nor did ddb when I
tried that (I used the GENERIC kernel, what options do I need to get the
arguments?)

Script started on Wed Oct 14 23:41:43 2015
vargaz:~/crash$ crash -M netbsd.3.core -N netbsd.test
Crash version 7.0, image version 7.99.21.
WARNING: versions differ, you may not be able to examine this image.
System panicked: dump forced via kernel debugger
Backtrace from time of crash is available.
crash> bt/t 0t3590
trace: pid 3590 lid 1 at 0xfffffe8040758d00
sleepq_block() at sleepq_block+0xa2
cv_wait_sig() at cv_wait_sig+0xfe
do_sys_wait() at do_sys_wait+0x22c
sys___wait450() at sys___wait450+0x3a
syscall() at syscall+0x9c
--- syscall (number 449) ---
7f7ff683c1ea:
crash> bt/t 0t7060
trace: pid 7060 lid 1 at 0xfffffe804076c770
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
nfs_lookup() at nfs_lookup+0xfb4
VOP_LOOKUP() at VOP_LOOKUP+0xa8
lookup_once() at lookup_once+0x216
namei_tryemulroot() at namei_tryemulroot+0x5b0
namei() at namei+0x29
vn_open() at vn_open+0x8e
do_open() at do_open+0x111
do_sys_openat() at do_sys_openat+0x68
sys_open() at sys_open+0x24
syscall() at syscall+0x9c
--- syscall (number 5) ---
7f7ff7c0c20a:
crash> bt/t 0t27257
trace: pid 27257 lid 1 at 0xfffffe8040748770
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
nfs_lookup() at nfs_lookup+0xfb4
VOP_LOOKUP() at VOP_LOOKUP+0xa8
lookup_once() at lookup_once+0x216
namei_tryemulroot() at namei_tryemulroot+0x5b0
namei() at namei+0x29
vn_open() at vn_open+0x8e
do_open() at do_open+0x111
do_sys_openat() at do_sys_openat+0x68
sys_open() at sys_open+0x24
syscall() at syscall+0x9c
--- syscall (number 5) ---
7f7ff7c0c20a:
crash> bt/t 0t23567
trace: pid 23567 lid 1 at 0xfffffe8040734c60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> bt/t 0t21797
trace: pid 21797 lid 1 at 0xfffffe804073cc60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> bt/t 0t17339
trace: pid 17339 lid 1 at 0xfffffe80407a4c60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> bt/t 0t13334
trace: pid 13334 lid 1 at 0xfffffe80406b0c60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> bt/t 0t14050
trace: pid 14050 lid 1 at 0xfffffe8040778c60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> bt/t 0t8085
trace: pid 8085 lid 1 at 0xfffffe80406ecc60
sleepq_block() at sleepq_block+0xa2
turnstile_block() at turnstile_block+0x40e
rw_vector_enter() at rw_vector_enter+0x2d0
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x54
vn_lock() at vn_lock+0x82
getcwd_common() at getcwd_common+0x2cd
sys___getcwd() at sys___getcwd+0xae
syscall() at syscall+0x9c
--- syscall (number 296) ---
7f7ff6c9f6ba:
crash> ^Dvargaz:~/crash$ exit

Script done on Wed Oct 14 23:48:44 2015

Note the complicated mount points, which might make any bugs in locking
more likely to pop up: the usual null mounts from pkg_comp, but with an
additional mount of a local directory for actually building in (so that
that doesn't need to go over NFS).

/dev/wd0a on / type ffs (local)
/dev/wd0f on /var type ffs (log, local)
/dev/wd0e on /usr type ffs (log, local)
/dev/wd0g on /home type ffs (log, local)
/dev/wd0h on /tmp type ffs (log, local)
kernfs on /kern type kernfs (local)
ptyfs on /dev/pts type ptyfs (local)
procfs on /proc type procfs (local)
procfs on /usr/pkg/emul/linux32/proc type procfs (read-only, local)
nfsserver:/mnt/vol1 on /mnt/vol1 type nfs
nfsserver:/mnt/scratch on /mnt/scratch type nfs
tmpfs on /var/shm type tmpfs (local)
/mnt/vol1/rhialto/cvs/src on /mnt/scratch/scratch/chroot/pkg_comp.amd64-7.0/default/usr/src type null (read-only)
/mnt/vol1/rhialto/cvs/pkgsrc on /mnt/scratch/scratch/chroot/pkg_comp.amd64-7.0/default/usr/pkgsrc type null (read-only)
/mnt/vol1/distfiles on /mnt/scratch/scratch/chroot/pkg_comp.amd64-7.0/default/pkg_comp/distfiles type null
/mnt/scratch/scratch/packages.amd64-7.0 on /mnt/scratch/scratch/chroot/pkg_comp.amd64-7.0/default/pkg_comp/packages type null
/home/rhialto/obj on /mnt/scratch/scratch/chroot/pkg_comp.amd64-7.0/default/pkg_comp/obj/pkgsrc type null (local)
procfs on /usr/pkg/emul/linux32/proc type procfs (local)

-Olaf.
--
___ Olaf 'Rhialto' Seibert  -- The Doctor: No, 'eureka' is Greek for
\X/ rhialto/at/xs4all.nl    -- 'this bath is too hot.'