tech-kern archive

Re: [10.0_STABLE] Hard lock



	Some news.

	The system now runs a kernel with ALTQ disabled (ALTQ is compiled into
the kernel, but not configured, as ALTQD=NO is set in rc.conf). And... it
crashed last night. Thus, ALTQ is not responsible.

	But now I have a suspect. I don't know whether the attached file will
make it to the mailing list.

	The kernel is in pooldr state and only find is running (from
/etc/daily). The load average is very high (why? All systems on the LAN
are idle, and the NFS server daemon is idle as well).

legendre# df -h
Filesystem     Size   Used  Avail %Cap Mounted on
/dev/raid0a     31G    14G    16G  46% /
/dev/raid0e     62G    30G    29G  50% /usr
/dev/raid0f     31G    23G   6.7G  77% /var
/dev/raid0g    252G   121G   118G  50% /usr/src
/dev/raid0h    523G   337G   160G  67% /srv
/dev/dk0       3.6T   1.9T   1.5T  55% /home
kernfs         1.0K   1.0K     0B 100% /kern
ptyfs          1.0K   1.0K     0B 100% /dev/pts
procfs         4.0K   4.0K     0B 100% /proc
tmpfs          4.0G    16K   4.0G   0% /var/shm
/dev/dk5        11T   9.9T   121G  98% /opt/bacula
/dev/dk6        11T   3.2T   6.8T  31% /opt/video
legendre#

	As I assume NetBSD 10.x is well tested on standard configurations, I
suppose find crashes the system when it tries to access /dev/dk5 or /dev/dk6.
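
	One way to test this hypothesis without waiting for the next /etc/daily
run might be to walk each filesystem on its own and see which one triggers
the hang. The following is only a minimal sketch: the function name
walk_one_filesystem is hypothetical, and it assumes that a plain directory
walk exercises the same code path as the find run, which is not confirmed.

```python
import os
import sys


def same_device(path, dev):
    """Return True if path can be stat'ed and lives on device dev."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        # Entry vanished or is unreadable: skip it rather than abort.
        return False


def walk_one_filesystem(root):
    """Count entries under root without crossing mount points.

    Pruning by st_dev has the same intent as find's -xdev option.
    """
    root_dev = os.lstat(root).st_dev
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Edit dirnames in place so os.walk does not descend into
        # directories that belong to another filesystem.
        dirnames[:] = [
            d for d in dirnames
            if same_device(os.path.join(dirpath, d), root_dev)
        ]
        count += len(dirnames) + len(filenames)
    return count


if __name__ == "__main__":
    # Mount points are given on the command line, one walk per mount.
    for mount in sys.argv[1:]:
        print(mount, walk_one_filesystem(mount), flush=True)
```

	Running it once with /opt/bacula and once with /opt/video (the two
iSCSI-backed mount points in the df output above), and then with a local
filesystem such as /usr, would show whether only the iSCSI volumes are
involved.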

	dk5 and dk6 are wedges on iSCSI target devices:

[  2546.611910] sd0 at scsibus0 target 0 lun 0: <QNAP, iSCSI Storage, 4.0> disk fixed
[  2546.631910] scsibus1 at iscsi0: 1 target, 16 luns per target
[  2546.641918] sd0: fabricating a geometry
[  2546.641918] sd0: 10980 GB, 11244416 cyl, 64 head, 32 sec, 512 bytes/sect x 23028563968 sectors
[  2546.661910] sd0: fabricating a geometry
[  2546.681910] sd0: GPT GUID: a5d27c7c-8eda-40e8-a29b-e85a539a5bc7
[  2546.681910] dk5 at sd0: "bacula", 23028563901 blocks at 34, type: ffs
[  2546.681910] sd0: async, 8-bit transfers, tagged queueing
[  2546.681910] sd1 at scsibus1 target 0 lun 0: <QNAP, iSCSI Storage, 4.0> disk fixed
[  2546.711910] sd1: fabricating a geometry
[  2546.711910] sd1: 10988 GB, 11251968 cyl, 64 head, 32 sec, 512 bytes/sect x 23044030464 sectors
[  2546.731910] sd1: fabricating a geometry
[  2546.751909] sd1: GPT GUID: 799b4d25-970c-4a32-a388-a59470280de0
[  2546.761910] dk6 at sd1: "video", 23044030397 blocks at 34, type: ffs
[  2546.761910] sd1: async, 8-bit transfers, tagged queueing
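
	To cross-check which wedge sits on which iSCSI disk, the dk lines above
can be parsed mechanically. This is a rough sketch only: the regular
expression is written against the message format shown in this dmesg
excerpt, the names WEDGE_RE and parse_wedges are hypothetical, and the
512-byte sector size is taken from the sd0 and sd1 lines.

```python
import re

# Matches kernel messages such as:
#   dk5 at sd0: "bacula", 23028563901 blocks at 34, type: ffs
WEDGE_RE = re.compile(
    r'(?P<wedge>dk\d+) at (?P<parent>\w+): "(?P<label>[^"]*)", '
    r'(?P<blocks>\d+) blocks at (?P<start>\d+), type: (?P<fstype>\w+)'
)

SECTOR_SIZE = 512  # bytes/sect, as reported for sd0 and sd1


def parse_wedges(dmesg_text):
    """Map each wedge name to its parent disk, label and size in GiB."""
    wedges = {}
    for m in WEDGE_RE.finditer(dmesg_text):
        blocks = int(m.group("blocks"))
        wedges[m.group("wedge")] = {
            "parent": m.group("parent"),
            "label": m.group("label"),
            "start": int(m.group("start")),
            "fstype": m.group("fstype"),
            "size_gib": blocks * SECTOR_SIZE / 2**30,
        }
    return wedges
```

	Fed the excerpt above, it should report dk5 ("bacula") on sd0 and dk6
("video") on sd1, which matches the /opt/bacula and /opt/video mounts in
the df output.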

	Both NAS units are connected to the server through two dedicated wm
interfaces (direct connections).

legendre# ifconfig wm0
wm0: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 9000
        capabilities=0x7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
        capabilities=0x7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
        capabilities=0x7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0x3ff00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx>
        enabled=0x3ff00<UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx>
        enabled=0x3ff00<UDP6CSUM_Rx,UDP6CSUM_Tx>
        ec_capabilities=0x17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=0x3<VLAN_MTU,VLAN_HWTAGGING>
        address: b4:96:91:92:77:6e
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
        inet6 fe80::b696:91ff:fe92:776e%wm0/64 flags 0 scopeid 0x1
        inet 192.168.12.1/24 broadcast 192.168.12.255 flags 0
legendre# ifconfig wm1
wm1: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 9000
        capabilities=0x7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
        capabilities=0x7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
        capabilities=0x7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0x3ff00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx>
        enabled=0x3ff00<UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx>
        enabled=0x3ff00<UDP6CSUM_Rx,UDP6CSUM_Tx>
        ec_capabilities=0x17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=0x3<VLAN_MTU,VLAN_HWTAGGING>
        address: b4:96:91:92:77:6f
        media: Ethernet autoselect (1000baseT full-duplex,master)
        status: active
        inet6 fe80::b696:91ff:fe92:776f%wm1/64 flags 0 scopeid 0x2
legendre#

	wm0 and wm1 are bridged:
legendre# cat /etc/ifconfig.bridge0
create
mtu 9000
#inet6 2001:7a8:a8ed:1::2 prefixlen 64 alias
!brconfig $int add wm0
!brconfig $int add wm1
!brconfig $int up
!brconfig $int ipf

NetBSD 10
 |
 +------ wm0 ------ NAS0 (192.168.12.2) ------- /dev/dk5
 |     bridge0
 +------ wm1 ------ NAS1 (192.168.12.3) ------- /dev/dk6

	The culprit seems to be the iSCSI initiator or the bridge. The system
was stable before the last update of my source tree, so the faulty code
seems to have been added during the last six months.

	Best regards,

	JB

Attachment: crash.jpg
Description: JPEG image

Attachment: signature.asc
Description: OpenPGP digital signature
