Some news. The system now runs a kernel with ALTQ disabled (ALTQ is compiled into the kernel, but not configured, as ALTQD=NO is set in rc.conf). And... it crashed last night. Thus, ALTQ is not responsible. But now I have a suspect.

I don't know whether the attached file will reach the mailing list. The kernel is in the pooldr state and only find is running (from /etc/daily). The load average is very high (why? All systems on the LAN are idle, and the NFS server daemon is idle too).

legendre# df -h
Filesystem     Size   Used  Avail %Cap Mounted on
/dev/raid0a     31G    14G    16G  46% /
/dev/raid0e     62G    30G    29G  50% /usr
/dev/raid0f     31G    23G   6.7G  77% /var
/dev/raid0g    252G   121G   118G  50% /usr/src
/dev/raid0h    523G   337G   160G  67% /srv
/dev/dk0       3.6T   1.9T   1.5T  55% /home
kernfs         1.0K   1.0K     0B 100% /kern
ptyfs          1.0K   1.0K     0B 100% /dev/pts
procfs         4.0K   4.0K     0B 100% /proc
tmpfs          4.0G    16K   4.0G   0% /var/shm
/dev/dk5        11T   9.9T   121G  98% /opt/bacula
/dev/dk6        11T   3.2T   6.8T  31% /opt/video
legendre#

As I assume NetBSD 10.x is tested on standard configurations, I suppose find crashes the system when it tries to access /dev/dk5 or /dev/dk6.
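A minimal sketch of how this suspicion could be checked, assuming a POSIX sh. The helper name walk_one_fs is hypothetical; the mount points in the usage comment are taken from the df output. It walks one mount point at a time with find's -xdev primary, so the walk stays on a single filesystem and a hang can be tied to a specific mount point. If the suspicion is right, running it may hang the machine again.

```shell
#!/bin/sh
# Hypothetical helper: walk a single mount point without crossing into
# other filesystems, and report when the walk starts and finishes.
walk_one_fs() {
    mnt=$1
    echo "walking $mnt"
    # -xdev keeps find from descending into directories on another device
    find "$mnt" -xdev -print > /dev/null
    echo "finished $mnt (exit status $?)"
}

# Usage (run by hand, one mount point after the other):
# for m in /opt/bacula /opt/video; do walk_one_fs "$m"; done
```

If the walk over a local filesystem such as /srv completes while the walk over /opt/bacula or /opt/video does not, that would point at the iSCSI-backed wedges rather than at find itself.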
dk5 and dk6 are wedges on iSCSI target devices:

[ 2546.611910] sd0 at scsibus0 target 0 lun 0: <QNAP, iSCSI Storage, 4.0> disk fixed
[ 2546.631910] scsibus1 at iscsi0: 1 target, 16 luns per target
[ 2546.641918] sd0: fabricating a geometry
[ 2546.641918] sd0: 10980 GB, 11244416 cyl, 64 head, 32 sec, 512 bytes/sect x 23028563968 sectors
[ 2546.661910] sd0: fabricating a geometry
[ 2546.681910] sd0: GPT GUID: a5d27c7c-8eda-40e8-a29b-e85a539a5bc7
[ 2546.681910] dk5 at sd0: "bacula", 23028563901 blocks at 34, type: ffs
[ 2546.681910] sd0: async, 8-bit transfers, tagged queueing
[ 2546.681910] sd1 at scsibus1 target 0 lun 0: <QNAP, iSCSI Storage, 4.0> disk fixed
[ 2546.711910] sd1: fabricating a geometry
[ 2546.711910] sd1: 10988 GB, 11251968 cyl, 64 head, 32 sec, 512 bytes/sect x 23044030464 sectors
[ 2546.731910] sd1: fabricating a geometry
[ 2546.751909] sd1: GPT GUID: 799b4d25-970c-4a32-a388-a59470280de0
[ 2546.761910] dk6 at sd1: "video", 23044030397 blocks at 34, type: ffs
[ 2546.761910] sd1: async, 8-bit transfers, tagged queueing

Both NAS units are connected to the server through two dedicated wm interfaces (direct connection).
legendre# ifconfig wm0
wm0: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 9000
        capabilities=0x7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
        capabilities=0x7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
        capabilities=0x7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0x3ff00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx>
        enabled=0x3ff00<UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx>
        enabled=0x3ff00<UDP6CSUM_Rx,UDP6CSUM_Tx>
        ec_capabilities=0x17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=0x3<VLAN_MTU,VLAN_HWTAGGING>
        address: b4:96:91:92:77:6e
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
        inet6 fe80::b696:91ff:fe92:776e%wm0/64 flags 0 scopeid 0x1
        inet 192.168.12.1/24 broadcast 192.168.12.255 flags 0
legendre# ifconfig wm1
wm1: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 9000
        capabilities=0x7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
        capabilities=0x7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
        capabilities=0x7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0x3ff00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx>
        enabled=0x3ff00<UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx>
        enabled=0x3ff00<UDP6CSUM_Rx,UDP6CSUM_Tx>
        ec_capabilities=0x17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=0x3<VLAN_MTU,VLAN_HWTAGGING>
        address: b4:96:91:92:77:6f
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
        inet6 fe80::b696:91ff:fe92:776f%wm1/64 flags 0 scopeid 0x2
legendre#

wm0 and wm1 are bridged:

legendre# cat /etc/ifconfig.bridge0
create
mtu 9000
#inet6 2001:7a8:a8ed:1::2 prefixlen 64 alias
!brconfig $int add wm0
!brconfig $int add wm1
!brconfig $int up
!brconfig $int ipf

NetBSD 10
  |
  +------ wm0 ------ NAS0 (192.168.12.2) ------- /dev/dk5
  | bridge0
  +------ wm1 ------ NAS1 (192.168.12.3) ------- /dev/dk6

The culprit seems to be the iSCSI initiator or the bridge. The system was stable before the last update of my source tree.
The faulty code seems to have been added during the last six months.

Best regards,

JB
Attachment:
crash.jpg
Description: JPEG image
Attachment:
signature.asc
Description: OpenPGP digital signature