Port-sparc64 archive
Re: Netbsd v 8.0 fails to boot on sparc ultra 45
On Sun, 23 Dec 2018, Michael L. Hitch wrote:
On Sat, 22 Dec 2018, Jerome Ibanes wrote:
Could you try a GENERIC.UP kernel and/or disable the audio driver?
Thank you,
Jerome
I haven't tried GENERIC.MP, but I have effectively disabled the audio
driver (forced the code to never match - quicker than removing it from the
config and rebuilding the entire kernel). It is able to boot after that,
although I didn't run that kernel very long.
Oops - I meant GENERIC.UP. None of those worked either.
I was able to track a specific commit of audio.c that starts crashing.
/* $NetBSD: audio.c,v 1.335 2017/05/06 00:13:25 nat Exp $ */
This one boots.
/* $NetBSD: audio.c,v 1.337 2017/05/08 07:31:34 martin Exp $ */
This one crashes with SIR.
I can't see any reason why that change would cause this. I'm starting to
suspect the problem is actually elsewhere and that one change just causes the
real problem to consistently occur.
I was trying to see how far the audio attach got on a current netbsd-8
tree, and got to a point where kmem_zalloc() is called, but never returned.
With the tree I used to locate the audio.c commit that fails, it appears to
get much further in the audio attach code before it fails.
Heading down that rabbit hole.....
I replaced the 1.337 version of audio.c with the 1.335 and found that
the kernel again would boot - but when I enabled AUDIO_DEBUG to get more
information, it would crash with SIR again.
One other possibly related problem is that a few times I had sshd fault on
startup on boot. I wasn't paying much attention to which kernel that was at
the time.
I started running native builds and getting segment violations, and
worked my way to earlier versions of the tree and found that the problems
started when gcc was switched to 5.3 in May 2016. I even managed to hit a
kernel that got the SIR when using gcc 5.3. Building that with gcc 4.8
worked fine.
Then I decided to try DDB with the failing kernel. After some memory
refreshes of sparc64 assembly (I was rather rusty with sparc64), I noticed
one difference between a working kernel and a crashing kernel. The stack
pointer was somewhat lower in the crashing kernel and it looked like the
new gcc used more stack space than gcc 4.8 did. I suspected at that point
that perhaps the kernel stack was overwriting the pcb. I changed the
stack size (SSIZE in param.h) to 4 pages initially and the kernels would
now boot. I did drop it to 3 pages and still was ok.
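To make the suspected failure concrete, here is a minimal sketch in C of the arithmetic involved. It is a toy model, not kernel code: it assumes the pcb sits at the low end of the per-thread area with the stack growing down toward it, and the names MODEL_PAGE_SIZE, stack_headroom() and stack_hits_pcb() are hypothetical, as are any pcb and stack-usage sizes passed in. Only the 8 KB page size and the page counts (3 and 4) come from sparc64 and the message above.

```c
#include <assert.h>
#include <stddef.h>

/* sparc64 uses 8 KB base pages; every other number fed to this model is an assumption. */
#define MODEL_PAGE_SIZE 8192u

/*
 * Bytes of stack available in an area of ssize_pages pages when a pcb of
 * pcb_bytes occupies the low end and the stack grows down toward it.
 * (Assumed layout for illustration only.)
 */
static size_t
stack_headroom(unsigned ssize_pages, size_t pcb_bytes)
{
	size_t area = (size_t)ssize_pages * MODEL_PAGE_SIZE;

	assert(pcb_bytes <= area);
	return area - pcb_bytes;
}

/* Returns 1 if a call chain needing used_bytes of stack would run into the pcb. */
static int
stack_hits_pcb(unsigned ssize_pages, size_t pcb_bytes, size_t used_bytes)
{
	return used_bytes > stack_headroom(ssize_pages, pcb_bytes);
}
```

In this model a compiler that emits larger stack frames raises used_bytes for the same call chain, so a page count that was adequate with the older compiler can stop being adequate with the newer one, and adding a page restores the margin.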
Now I had kernels that would boot consistently, but I was still having
problems running native builds. I had remembered looking at the
port-sparc64 mailing list from that time, when sparc64 reverted back to
gcc 4.8 and a message about a problem. When I tracked down the change to
fix that, I realized the fix was to ld.elf_so, which was not in the
7.0 release that I was running all this on. One of my builds had a
ld.elf_so that included the fix, and once I updated ld.elf_so, I was able
to run complete native builds with no problems. I ran that with both
8.0_STABLE and current kernels.
So it looks like a larger kernel stack is now needed for sparc64.
Mike
---
Michael L. Hitch mhitch%montana.edu@localhost
Operations Consulting, University Information Technology
Montana State University, Bozeman, MT USA