Port-macppc archive


Re: tstile lockups - test case



In article <p06240800cdd916f42451@[71.39.101.51]>,
Donald Lee  <MacPPC2%c.icompute.com@localhost> wrote:
>At 11:55 PM +0000 3/30/13, Valery Ushakov wrote:
>>Donald Lee <MacPPC2%c.icompute.com@localhost> wrote:
>>> At 5:22 PM +0000 3/30/13, Valery Ushakov wrote:
>>>
>>>>Next time it hangs you can break into DDB from console and check with
>>>>ps (ddb command, not ordinary command; see ddb(4)).  I've seen such
>>>>lockup once on my newly installed mini g4 when I was copying over
>>>>several cvs trees from the old machine.
>>> 
>>> Console does not respond (aside from the first CR, sometimes).
>>> I don't think I can use ddb.  I'll read the man page......  Thanks
>>
>>Press Ctrl-Alt-Esc on the console keyboard to break into ddb.  It
>>should work on tstiled machine even when normal console input isn't.
>>IIRC, NetBSD 6 should have the necessary fixes for USB keyboard to be
>>usable with ddb.
>>
>>-uwe
>
>I can reproduce the tstile hang fairly easily and am looking for
>a path to fix it.  I don't really know how to use ddb, and am not very
>familiar with the internals of the kernel.  What I have done so
>far is put together a test case that reliably causes the problem
>in a few hours or less.
>
>The test case is just a shell script.  I have apache running on the
>machine, and have a shell script doing
>wget operations on http://127.0.0.1/...  The network interface
>is down (gem0) when the failure occurs, so it doesn't look like a driver
>problem.  When the hang occurs, I cannot get a command launched, so no
>user-level debugging is possible, but I can break into ddb with the
>ctrl-alt-esc sequence.  When I break into ddb I can do "ps", and see
>many, many processes waiting on "tstile".
>
>I have run this test maybe 10 times, and the failures all look about
>the same.
>
>This smells to me like a race condition - a small timing window that gets
>hit somewhere.  Those are always fun to find. ;->
>
>I have run the same test case on a VM instance of an i386 NetBSD install,
>and it does not fail (so far), so I'm pretty sure it's a macppc-specific
>problem.
>
>Anyone have suggestions on how to track this down?  What's the shortest path
>to my getting enough ddb expertise to help track this down, or getting my
>test case in the hands of someone with the requisite skill?

I would start by running a DIAGNOSTIC/DEBUG/LOCKDEBUG kernel. If that does
not find the deadlock, at least it will let us look at the locks more easily.
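
For reference, a minimal sketch of a kernel config that turns those three
options on, assuming a stock source tree and the macppc GENERIC config as
the base. The file name TSTILEDEBUG is only a placeholder, and any line
that GENERIC already enables can be dropped:

```
# sys/arch/macppc/conf/TSTILEDEBUG  (hypothetical name, pick your own)
include "arch/macppc/conf/GENERIC"

options 	DIAGNOSTIC	# inexpensive kernel consistency checks
options 	DEBUG		# more expensive debugging checks
options 	LOCKDEBUG	# lock debugging: catches misuse, records lock holders
```

Something along the lines of "./build.sh -U -m macppc tools kernel=TSTILEDEBUG"
from the top of the source tree should then build it. Expect the resulting
kernel to run noticeably slower, so the test case may take longer to hit
the hang.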

christos


