NetBSD-Bugs archive
Re: kern/44002: 3ware 9690 (ld driver) doesn't respond after transfer big amount of data
The following reply was made to PR kern/44002; it has been noted by GNATS.
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: Jiri Novotny <novotny%ics.muni.cz@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, kern-bug-people%NetBSD.org@localhost
Subject: Re: kern/44002: 3ware 9690 (ld driver) doesn't respond after
transfer big amount of data
Date: Wed, 27 Oct 2010 21:29:54 +0200
On Wed, Oct 27, 2010 at 11:46:20AM +0200, Jiri Novotny wrote:
>
> Dear Manuel,
>
> thank you for your response.
>
> In the meantime the situation got even worse. I tried to write files
> to the disk and the machine crashed. A screenshot of the crash
> is attached; I hope you can read it.
I can. I wonder what causes the "twa0: clearing queue error"; it's
probably related. It would also be interesting to see if there
are other messages before this one.
>
> I use generic kernel 5.1RC4 (I can use 5.0.2 as well).
>
> As the system crashed, I tried a smaller amount of
> data and reproduced the situation. I was able to freeze the raid now
> without the crash :-) and I can give you the answers to your
> questions.
>
>
> > > ... and the disk stops responding
> >
> > Could you check with 'ps -axl' which wait channels the processes are on ?
>
> $ ps -axl
> UID PID PPID  CPU PRI NI  VSZ   RSS WCHAN   STAT TTY      TIME COMMAND
>   0   0    0    0   0  0    0 17564 -       OKl  ?     0:07.03 [system]
>   0   1    0 3061  85  0 2932     4 wait    IWs  ?     0:00.00 init
>   0 112    1    0  85  0 2932   988 kqueue  Ss   ?     0:00.01 /usr/sbin/syslogd -s
>   0 250    1    0  85  0 5936     4 select  IWs  ?     0:00.00 /usr/sbin/sshd
>   0 327  250    0  85  0 8704     4 netio   IWs  ?     0:00.01 sshd: novotny [priv]
>  12 328  363    0  85  0 4796   680 kqueue  I    ?     0:00.00 qmgr -l -t unix -u
>   0 363    1    0  85  0 4796   628 kqueue  Is   ?     0:00.00 /usr/libexec/postfix/master
>   0 370    1 3061  85  0 2972     4 kqueue  IWs  ?     0:00.00 /usr/sbin/inetd -l
>   0 388    1    0  85  0 2900   876 nanoslp Ss   ?     0:00.00 /usr/sbin/cron
>   0 394  250    0  85  0 8704     4 netio   IWs  ?     0:00.01 sshd: novotny [priv]
>  12 396  363    0  85  0 4796   632 kqueue  I    ?     0:00.00 pickup -l -t fifo -u
> 300 398  394    0  85  0 8704  1000 select  S    ?     0:00.00 sshd: novotny@pts/0 (sshd)
> 300 403  327    0  85  0 8704  2824 select  I    ?     0:00.01 sshd: novotny@pts/1 (sshd)
> 300 375  398    0  85  0 2952   952 wait    Ss   ttyp0 0:00.00 -sh
> 300 473  375    0  43  0 2960   840 -       O+   ttyp0 0:00.00 ps -axl
> 300 405  403    0  85  0 2952     4 wait    IWs  ttyp1 0:00.00 -sh
>   0 413  405    0  85  0 2952  1168 wait    I    ttyp1 0:00.01 sh
>   0 461  413    0 117  0 2900   800 tstile  D+   ttyp1 0:00.00 dd if bs count of
The famous tstile ... it doesn't tell much, unfortunately.
Maybe 'ps -axws -O lname' would have given more info (with ps, we don't know
what the kernel is doing ...)
>   0 468  413    0  85  0 2904  1012 piperd  I+   ttyp1 0:00.00 grep -v records
>   0 390    1    0  85  0 2912   788 ttyraw  Is+  ttyE0 0:00.00 /usr/libexec/getty Pc console
>   0 387    1 1815  85  0 2912     4 ttyraw  IWs+ ttyE1 0:00.00 /usr/libexec/getty Pc ttyE1
>   0 383    1 1815  85  0 2912     4 ttyraw  IWs+ ttyE2 0:00.00 /usr/libexec/getty Pc ttyE2
>   0 393    1 1815  85  0 2912     4 ttyraw  IWs+ ttyE3 0:00.00 /usr/libexec/getty Pc ttyE3
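
To pick the stuck processes out of a listing like the one above, the
wait-channel column can be filtered mechanically. What follows is only a
sketch: it assumes 'ps -axl' output that has not been folded by the mailer
(one process per line), and the name blocked_on and the SAMPLE text are
made up for illustration (the rows are copied from the listing above).

```python
# Sketch: list processes from 'ps -axl' output that sleep on a given
# wait channel. Assumes one process per line (no mail line-wrapping).

SAMPLE = """\
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 413 405 0 85 0 2952 1168 wait I ttyp1 0:00.01 sh
0 461 413 0 117 0 2900 800 tstile D+ ttyp1 0:00.00 dd if bs count of
0 468 413 0 85 0 2904 1012 piperd I+ ttyp1 0:00.00 grep -v records
"""


def blocked_on(ps_text, wchan="tstile"):
    """Return (pid, command) pairs whose WCHAN column equals `wchan`."""
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    header = lines[0].split()
    wchan_idx = header.index("WCHAN")
    pid_idx = header.index("PID")
    ncols = len(header)
    found = []
    for ln in lines[1:]:
        # Split into at most ncols fields so COMMAND keeps its spaces.
        fields = ln.split(None, ncols - 1)
        if len(fields) < ncols:
            continue  # folded or truncated line, skip it
        if fields[wchan_idx] == wchan:
            found.append((int(fields[pid_idx]), fields[-1]))
    return found


if __name__ == "__main__":
    print(blocked_on(SAMPLE))  # [(461, 'dd if bs count of')]
```

This only tells which processes sit on tstile, not who holds the lock
they are waiting for.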
>
>
> > Do you have any message in dmesg or console ?
>
> twa0: clearing controller queue error - many times; the LEDs on the disk array
> are not active.
And nothing before this ?
>
> > What is the interrupt setup ?
>
> Standard, as in the generic kernel; here is the dmesg:
> The dmesg contains a warning that the filesystem is not clean, but the
> situation was the same just after newfs.
So twa0 shares an interrupt with wm0 and uhci0.
I have 2 systems with 3ware controllers (these are 9550X, not 9650 though),
but the controllers are alone on their interrupt line.
I'm not sure if this can be the problem, but I would try to
disable some devices so that twa0 doesn't share its interrupt with
anything else.
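
One way to check which devices end up on the same line is to group the
"interrupting at" messages from dmesg. A minimal sketch, with loud caveats:
the SAMPLE lines are hypothetical (the pin numbers and the siop0 entry are
invented, they are not the reporter's dmesg), shared_interrupts is an
illustrative name, and it assumes attach messages of the form
"dev: interrupting at ...".

```python
# Sketch: group devices by the interrupt they attach to, based on
# dmesg lines of the assumed form "<dev>: interrupting at <where>".
import re
from collections import defaultdict

# Hypothetical dmesg excerpt; pin numbers are invented for illustration.
SAMPLE = """\
twa0: interrupting at ioapic0 pin 16
wm0: interrupting at ioapic0 pin 16
uhci0: interrupting at ioapic0 pin 16
siop0: interrupting at ioapic0 pin 18
"""

INTR_RE = re.compile(r"^(\w+): interrupting at (.+)$")


def shared_interrupts(dmesg_text):
    """Return {interrupt: [devices]} for interrupts used by 2+ devices."""
    by_intr = defaultdict(list)
    for ln in dmesg_text.splitlines():
        m = INTR_RE.match(ln.strip())
        if m:
            by_intr[m.group(2)].append(m.group(1))
    return {intr: devs for intr, devs in by_intr.items() if len(devs) > 1}


if __name__ == "__main__":
    for intr, devs in shared_interrupts(SAMPLE).items():
        print(intr, "->", ", ".join(devs))
```

If twa0 still shows up in such a group after disabling devices, it is
still sharing its line.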
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 ans d'experience feront toujours la difference