tech-kern archive
Re: Problems with hangs under NetBSD-5.x
On Thu, Aug 06, 2009 at 04:42:13PM +0100, Mindaugas Rasiukevicius wrote:
> Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
> > It turns out my issue seems to be caused by a hardware bug.
> > In my case the kernel was completely dead except I still could
> > enter ddb from serial console. Disabling hyperthreading seems to have
> > helped in my case.
>
> Why do you think it is a hardware bug, i.e. do you have some way to validate
> this? No problem with HT disabled can also mean a synchronization issue.
See my post to port-i386/port-amd64, but basically:
diagnostic code I added shows that we call splx() with a bogus value
(which I guess is what causes various problems later when it's not checked).
Looking in ddb at the address where the value comes from, the value stored
there is correct.
This bogus value always comes from a struct kmutex, which on i386 is
32 bits wide and is read/written as bytes. The byte next to
mtxs_ipl is the simple lock used for the mutex ...
mtxs_ipl itself is initialised when the lock is created and never changed
afterwards.
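
To make the layout concrete, here is a rough sketch of a 32-bit word made of
byte-sized fields, with the lock byte sitting right next to mtxs_ipl, plus the
kind of range check a diagnostic in front of splx() could do. This is not the
actual NetBSD definition or my actual diagnostic code: the struct name, every
field name other than mtxs_ipl, the field order, and the SKETCH_NIPL bound are
assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the spin-mutex view of struct kmutex on i386:
 * four byte-sized fields packed into one 32-bit word. Only mtxs_ipl is a
 * name taken from the message; the other names and the order are assumed.
 */
struct sketch_kmutex {
	volatile uint8_t sk_pad0;	/* assumed filler byte */
	volatile uint8_t mtxs_ipl;	/* set once at init, never rewritten */
	volatile uint8_t sk_lock;	/* simple lock byte, hit by locked ops */
	volatile uint8_t sk_pad1;	/* assumed filler byte */
};

/* The whole structure must fit in a single 32-bit word. */
_Static_assert(sizeof(struct sketch_kmutex) == 4,
    "sketch_kmutex is expected to be 32 bits wide");

/* Hypothetical number of valid interrupt priority levels. */
#define SKETCH_NIPL 8

/*
 * Diagnostic-style check: a value about to be handed to splx() is only
 * plausible if it is below the number of priority levels.
 */
static bool
sketch_ipl_is_sane(unsigned int ipl)
{
	return ipl < SKETCH_NIPL;
}

/* True when the lock byte is the byte immediately following mtxs_ipl. */
static bool
sketch_lock_is_adjacent(void)
{
	return offsetof(struct sketch_kmutex, sk_lock) ==
	    offsetof(struct sketch_kmutex, mtxs_ipl) + 1;
}
```

With a layout like this, a byte read of mtxs_ipl and a locked byte operation
on the lock always touch the same 32-bit word, which is why a value that is
never rewritten can still be suspected of being read back wrong.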
>
> Also, if it's a CPU issue (e.g. it requires a special patch/workaround on
> the software side), then issues would be seen in random subsystems,
In my case I suspect that locked byte operations can affect the values around
the locked byte, as seen from the other HT, in a non-trivial way.
> while
> problem is currently isolated in VFS/FFS layers.
No, in my case it was a complete hang of the system: dead VFS, dead
network, dead soft interrupts (including softclock) on CPU0.
hardclock and serial were still working.
So we may be talking about different issues here.
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 ans d'experience feront toujours la difference