[7.99.12] tda0 issue was: Ultrasparc III+ kernel panic

To: Eduardo Horvath <eeh%NetBSD.org@localhost>
Subject: [7.99.12] tda0 issue was: Ultrasparc III+ kernel panic
From: BERTRAND Joël <joel.bertrand%systella.fr@localhost>
Date: Mon, 27 Apr 2015 00:09:11 +0200

Eduardo Horvath wrote:

On Mon, 13 Apr 2015, BERTRAND Joël wrote:

	I have seen it. And I have seen another panic:

panic: cpu1: ipi_send: couldn't send ipi to UPAID 0 (tried 10000 times)
cpu1: Begin traceback...
cpu1: End traceback...
Frame pointer is at 0x2004e41
Call traceback:
  netbsd:cpu_reboot+0x208(182f828, 1, ffff, 77bb78, 1cce380, 1c97000) fp = 2004f01
  netbsd:vpanic+0x178(104, 0, 1852638, 1cb6800, f, 1c70740) fp = 2004fb1
  netbsd:panic+0x24(1852638, 20059a8, 1cdc800, 1cddaf8, 1cddc00, 104) fp = 2005061
  netbsd:sparc64_send_ipi_sun4u+0x1ac(1852638, 1, 0, 2710, fffffffffffffffe, 0) fp = 2005121
  netbsd:cpu_need_resched+0x54(f4240, 1018a80, 0, 0, 70, 0) fp = 20051d1
  netbsd:sched_changepri+0x64(2014000, 2, 2014000, 101db1d08, 101db1040, 2a) fp = 2005281
  netbsd:resetpriority+0x90(1043816c0, 2a, 0, 1, 101daec40, 101daedc0) fp = 2005331
  netbsd:sched_pstats+0x118(1043816c0, 0, 1c70868, 0, 10caf5510, 2a) fp = 20053e1
  netbsd:uvm_scheduler+0x60(64, 1c71000, 0, 101daedc0, 10caf5510, 1043816c0) fp = 2005491
  netbsd:main+0x83c(101d89f00, 1c70740, 1c70740, 101da2c80, 1c0a1fc, 18a0598) fp = 2005541
  netbsd:cpu_initialize+0x154(184d500, 10624dd3, 1c97800, 0, 101daee00, 1) fp = 2005621
  netbsd:100030+0(f0059840, 113800, 113c00, 111880, 111ce8, 1117f8) fp = fff33651

dumping to dev 25,1 offset 12291071

But I don't understand. With the same kernel, this Blade 2000 used to reboot
one or more times _per day_, and now its uptime is greater than 8 days. I have
saved the kernel image and core if you want them.


Well that's not terribly useful.

One CPU tried to tell another CPU something but the other CPU did not
respond.  It then panicked.  In this circumstance the interesting info is
the state of the unresponsive CPU.  An SIR would be much more useful in
this circumstance than a panic.
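
The failure mode described above (a sender that retries a bounded number of times and then gives up) can be sketched as follows. This is only an illustrative user-space simulation, not the NetBSD sparc64 code: `send_ipi_bounded`, `ipi_try_fn`, `never_acks` and `acks_after` are hypothetical names, and the only value taken from the report is the retry limit of 10000 shown in the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Retry limit taken from the panic message ("tried 10000 times"). */
#define IPI_MAX_TRIES 10000

/*
 * Hypothetical stand-in for one hardware dispatch attempt: returns true
 * when the target CPU acknowledges the interrupt.
 */
typedef bool (*ipi_try_fn)(int target, void *ctx);

/*
 * Try to deliver an IPI at most IPI_MAX_TRIES times.
 * Returns the number of attempts used on success, or -1 if the target
 * never acknowledged (the point at which a kernel would panic).
 */
static int send_ipi_bounded(int target, ipi_try_fn try_once, void *ctx)
{
    assert(try_once != NULL);
    for (int i = 1; i <= IPI_MAX_TRIES; i++) {
        if (try_once(target, ctx))
            return i;
    }
    fprintf(stderr, "couldn't send ipi to %d (tried %d times)\n",
        target, IPI_MAX_TRIES);
    return -1;
}

/* Simulated target that is wedged and never acknowledges. */
static bool never_acks(int target, void *ctx)
{
    (void)target;
    (void)ctx;
    return false;
}

/* Simulated target that acknowledges on the Nth attempt (*ctx == N). */
static bool acks_after(int target, void *ctx)
{
    int *remaining = ctx;
    (void)target;
    return --(*remaining) <= 0;
}
```

With `never_acks` the sender exhausts its retries and only learns that delivery failed. It learns nothing about why the target was stuck, which is why the state of the unresponsive CPU is the interesting part.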


	Hello,

Some good news. Before patching locore.s with your suggestions, I rebuilt a 7.99.9 kernel from sources (with userland), and I had planned to investigate last Saturday. This 7.99.9 kernel is stable on my Blade 2000. I obtained an uptime greater than 6 days (the system finally crashed when I tried to run /etc/rc.d/altqd restart, but that is not the same issue). With 7.99.6, under the same conditions, the same Blade 2000 panicked once or twice a day. I haven't seen any modification in sparc64/sparc64 or sparc64/dev that can explain why 7.99.9 is stable and 7.99.6 wasn't.

Thus, I have rebuilt 7.99.12 from sources, and tda.c seems to be broken. In dmesg, tda.c writes:


tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values
tda0: skipping temp adjustment - no sensor values

and envstat only returns:
envstat: no drivers registered

but the fans do not run at maximum speed.

	Best regards,

	JKB

Follow-Ups:
- re: [7.99.12] tda0 issue was: Ultrasparc III+ kernel panic
  - From: matthew green
