Port-sparc archive


Re: SuperSPARC vs HyperSPARC and SS20 stability



On Mon, Dec 02, 2024 at 12:07:10AM +0100, Riccardo Mottola wrote:
> Martin and others described the process hanging as a long time issue and
> dependent on CPU speed... however, I do not (and did not) experience it on
> the SS10 with 8.x and 9.x while building about the same pkgsrc packages. Why
> instead on the SS20? is it due to NetBSD 10 making things worse or are there
> other issues?

I am not sure we see the same issue here. In my case (and this is what I meant
by "long-standing problem") it is some issue with libpthread on SMP machines,
which seems to trigger much more often when the CPUs are faster.

It causes userland processes (using pthreads) to basically busy-loop in
some mutex operation and never make any progress. Once you are in this
state, the process never recovers. I think I could always (manually)
kill them, but right now I am not 100% sure of that.

I have never seen them hang the whole machine (i.e. ddb on the console
still works).

My reliable way to trigger this issue is: unpack a sparc userland from the
autobuilds on any sparc64 SMP machine, chroot to it and run the ATF test
suite. It will reliably stop making progress quickly when enough RUMP servers
have been run.
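
The steps above can be sketched as a small dry-run helper that only prints
the commands and does not execute anything. This is a rough sketch, not the
exact procedure used: the set names (base.tgz, etc.tgz, tests.tgz), the .tgz
format, both directory paths and the helper names are assumptions, and it
assumes the tests are installed under /usr/tests and are run with
atf-run piped into atf-report.

```python
import shlex


def repro_commands(sets_dir, chroot_dir,
                   sets=("base.tgz", "etc.tgz", "tests.tgz")):
    """Build the command list for the reproduction without running it.

    sets_dir   -- hypothetical directory holding the downloaded sparc sets
    chroot_dir -- hypothetical directory that will become the chroot root
    sets       -- assumed set file names; adjust to what the autobuild ships
    """
    cmds = [["mkdir", "-p", chroot_dir]]
    for name in sets:
        # Unpack each set into the chroot, preserving permissions (-p).
        cmds.append(["tar", "-xzpf", f"{sets_dir}/{name}", "-C", chroot_dir])
    # Run the ATF test suite inside the chroot (assumed location /usr/tests).
    cmds.append(["chroot", chroot_dir, "/bin/sh", "-c",
                 "cd /usr/tests && atf-run | atf-report"])
    return cmds


def render(cmds):
    """Render the command list as shell-quoted lines, one command per line."""
    return "\n".join(shlex.join(c) for c in cmds)


if __name__ == "__main__":
    # Both paths are placeholders; run the printed commands as root on the
    # sparc64 SMP machine after reviewing them.
    print(render(repro_commands("/tmp/sparc-sets", "/var/chroot/sparc")))
```

Printing instead of executing keeps the sketch harmless on a machine that is
not the test box. A hang then shows up as atf-run no longer producing output
while a test process keeps burning CPU.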

I seriously tried to debug it once, but back then gdb was ... not helpful
and I postponed it. gdb is several versions newer now, so maybe it is
time to re-attack this issue.

Martin

