Hello,

For a very long time (I don't remember whether it started with 9.0 or 9.1), my main server has randomly panicked or deadlocked when it tries to access an iSCSI NAS. These panics are not related to a hardware failure: I have replaced the motherboard, disks and memory with spare components, with exactly the same result.

Last night it deadlocked. Twice!

I built a new kernel yesterday from the source tree:

legendre# uname -a
NetBSD legendre.systella.fr 9.3_STABLE NetBSD 9.3_STABLE (CUSTOM) #17: Thu Nov 17 23:16:11 CET 2022 root%legendre.systella.fr@localhost:/usr/src/netbsd-9/obj/sys/arch/amd64/compile/CUSTOM amd64

My kernel is a customized one (CUSTOM) because I have added ALTQ support. This server is my main professional server, and it has to prioritize VoIP packets and some other IP traffic.

Some time ago I tried to build a kernel with the LOCKDEBUG option, but that kernel never reached init.

You will find the dmesg of the current kernel attached.

I always see the same panic, and it occurs when the system tries to access a Qnap NAS over iSCSI. I have tried to bisect: I believe that if I stop the NAS, the system is stable. The panic is always the one I copied into PR/56925.

Network configuration:
- bridge0 (wm0 and wm1) is connected to two Qnap NAS (iSCSI, MTU 9000)
- wm2: WAN (MTU 1500), IPv4 default route
- agr0 (wm3 and wm4): LAN (MTU 1500)
- tap0: WAN, IPv6 default route (MTU 1500); I have to use an IPv6 broker.

System configuration:
- Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
- 16 GB of memory
- motherboard ASUSTeK Z97-E
- 7 internal SATA disks (ccd0 [wd0, wd1], raid0 [wd2, wd3], raid1 [wd4, wd5, wd6])

legendre# df
Filesystem    1K-blocks       Used      Avail %Cap Mounted on
/dev/raid0a    32529068   13521382   17381234  43% /
/dev/raid0e    65058298   30805934   30999450  49% /usr
/dev/raid0f    32529068   26574944    4327672  85% /var
/dev/dk5       16515182    3715070   11974354  23% /var/squid/cache
/dev/raid0g   264277976   97612068  153452012  38% /usr/src
/dev/raid0h   548684628  303773356  217477044  58% /srv
/dev/dk0     3876580176 1408841464 2273909704  38% /home
kernfs                1          1          0 100% /kern
ptyfs                 1          1          0 100% /dev/pts
procfs                4          4          0 100% /proc
tmpfs           4162812         48    4162764   0% /var/shm
/dev/dk6    11335898764 9743618440 1025485388  90% /opt/bacula
/dev/dk7    11343502476 2175889896 8600437460  20% /opt/video

legendre# raidctl -s raid0
Components:
    /dev/wd2a: optimal
    /dev/wd3a: optimal

legendre# raidctl -s raid1
Components:
    /dev/wd4e: optimal
    /dev/wd5e: optimal
    /dev/wd6e: optimal

legendre# ccdconfig -g
ccd0 32 0x0 2000408739840 /dev/wd0a /dev/wd1a

/opt/bacula and /opt/video are on the iSCSI NAS. ccd0 holds many partitions: /var/squid/cache and all the swap partitions exported over iSCSI to the diskless workstations on the LAN. This system acts as an iSCSI initiator (for both NAS) and as an iSCSI target (istgt) for the diskless workstations.

I don't know how to investigate (the LOCKDEBUG kernel doesn't boot), and I have to fix this bug as soon as possible because I cannot continue with an unstable main server.

Best regards,

JB
Attachment:
dmesg.gz
Description: application/gzip