NetBSD-Bugs archive
re: kern/57694 (rge(4) hang)
one of my rge(4)'s has soft-hung twice recently. after a power
outage, it lost connectivity soon after coming back online (on
the order of tens of minutes, less than an hour), and an ifconfig
down/up sequence restored connectivity. then, about 2 days
later, the same thing happened.
i wrote a stupid script to notice and do the down/up for me.
i also started looking at rge_rxeof(), and i wonder if there's
a bad case we get into and can't get out of without manually
resetting the descriptors (eg, the down/up sequence):
1249    for (i = sc->rge_ldata.rge_rxq_considx; ; i = RGE_NEXT_RX_DESC(i)) {
...
1255            cur_rx = &sc->rge_ldata.rge_rx_list[i];
1256
1257            if (RGE_OWN(cur_rx))
1258                    break;
...
1361    }
1362
1363    sc->rge_ldata.rge_rxq_considx = i;
if for some reason rge_rxq_considx ends up pointing to a desc
that is owned, then i is never changed from rge_rxq_considx and
the assignment on L1363 is a no-op. if that desc is not
filled in _ever_ for some other reason, this loop will never
find the other filled-in descriptors. can that happen? i am not
nearly familiar enough with ethernet drivers/hardware...
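
to make the suspected stuck state concrete, here is a small userland
sketch of the same loop shape. it is not the driver code: toy_desc,
toy_ring, toy_rxeof(), toy_stuck_demo() and the ring size of 8 are all
made up for illustration, and "owned" is assumed to mean the hardware
has not finished with the descriptor yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u				/* hypothetical ring size */
#define NEXT_DESC(i) (((i) + 1u) % RING_SIZE)

/* toy descriptor: only the ownership bit matters here */
struct toy_desc {
	bool hw_owned;		/* assumed: true means not yet filled in */
};

struct toy_ring {
	struct toy_desc desc[RING_SIZE];
	unsigned considx;	/* next descriptor the host looks at */
};

/*
 * same shape as the quoted loop: start at considx, stop at the first
 * owned descriptor, then store the index back.  each consumed
 * descriptor is handed back (marked owned), which is what ends the
 * loop after one full lap.  returns the number consumed.
 */
static unsigned
toy_rxeof(struct toy_ring *r)
{
	unsigned i, n = 0;

	assert(r->considx < RING_SIZE);
	for (i = r->considx; ; i = NEXT_DESC(i)) {
		if (r->desc[i].hw_owned)
			break;
		r->desc[i].hw_owned = true;
		n++;
	}
	r->considx = i;
	return n;
}

/* slot 0 never completes, slots 1..7 are filled in and waiting */
static unsigned
toy_stuck_demo(void)
{
	struct toy_ring r = { .considx = 0 };
	unsigned k;

	r.desc[0].hw_owned = true;
	for (k = 1; k < RING_SIZE; k++)
		r.desc[k].hw_owned = false;

	/* breaks on the first test, so considx is written back unchanged */
	return toy_rxeof(&r);
}
```

in this toy, toy_stuck_demo() consumes nothing even though seven
descriptors are ready, and every later call does the same until
something resets the ring. whether the real hardware can ever leave
the ring in that state is exactly the open question.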
i guess it might be useful to get a dump of all the descriptors
at the hang-time, to see their status.
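
such a dump could look roughly like the sketch below. everything in it
is an assumption for illustration: toy_rx_desc, TOY_OWN and
toy_dump_ring() are made-up names, and the single status word with the
ownership bit in the top position is not taken from the rge(4) headers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_OWN 0x80000000u	/* assumed position of the ownership bit */

/* hypothetical descriptor: just one command/status word */
struct toy_rx_desc {
	uint32_t cmdsts;
};

/*
 * print one line per descriptor, mark the consumer index with '>',
 * and return how many descriptors have the ownership bit set.
 */
static size_t
toy_dump_ring(const struct toy_rx_desc *ring, size_t n, size_t considx,
    FILE *out)
{
	size_t i, owned = 0;

	assert(ring != NULL);
	for (i = 0; i < n; i++) {
		int own = (ring[i].cmdsts & TOY_OWN) != 0;

		owned += (size_t)own;
		fprintf(out, "%c%3zu: cmdsts=0x%08" PRIx32 " %s\n",
		    i == considx ? '>' : ' ', i, ring[i].cmdsts,
		    own ? "own" : "done");
	}
	return owned;
}
```

in the kernel the ring would presumably need a bus_dmamap_sync() before
the descriptors are read, and the output would go through the kernel's
printf rather than a FILE. an owned desc at the '>' marker with
completed ones after it would match the stuck state suspected above.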
i'll see about doing that, but this is the only machine i've
seen this hang on and i'm loath to reboot it ever, as it does
not reboot without a power cycle, and recently it took about 5
power cycles for it to get past whatever problem it has.
.mrg.