Subject: Re: Massive lossage with -current as of tonight?
To: None <current-users@netbsd.org>
From: Christos Zoulas <christos@astron.com>
List: current-users
Date: 11/05/2006 02:52:59
In article <454D1B08.1000500@warped.com>,
Scott Ellis <scotte@warped.com> wrote:
>Eric Haszlakiewicz wrote:
>> On Fri, Nov 03, 2006 at 11:25:57PM -0800, Scott Ellis wrote:
>>> Well, after cvs updating and doing a complete rebuild (so using -current
>>> as of ~10pm PST Nov 3rd), I get the same behavior as before: Various
>>> programs appear to hang when booting multi-user.
>>>
>>> Going back to libc.so.12.147 "fixes" things (mostly), but sshd still
>>> fails, and now I see the new, even more exciting behavior that prevents
>>> logging in:
>[snip]
>> Given that you can fix your problem by reverting libc, and I
>> haven't updated anything beyond the ipf binaries, we might have
>> separate issues here.
>
>Well, I'm starting to suspect some of the kauth changes here.
>
>Booting a -current (Nov 4th, cvs updated moments ago) kernel works fine
>with the October 26th userland.
>
>Updating to Nov 4th userland breaks just as it did when originally
>reported (stuff like named hanging on "load: 0.95 cmd: named 795
>[piperd] 0.00u 0.00s 0% 1808k", but being able to be ^C'ed). My gut
>tells me this is really sh that's hanging, since we're really running
>through rc.local and the rc.d/ scripts at this point. But I digress...
>
>Reverting to Oct 26th binaries, but Nov 4th /lib and /usr/lib "mostly"
>works. Most everything is functional (the system works more-or-less as
>expected) except for some weird permission problems. For example,
>during boot I see:
>raidctl: unable to open device file: raid0
>
>And trying to run atactl (Oct 24th or Nov 4th) yields:
>atactl: wd0: Operation not permitted
>
>A ktrace of this shows:
>499 1 atactl NAMI "/dev/rwd0d"
>499 1 atactl RET open -1 errno 1 Operation not permitted
>
>Using the Oct 24th libc (and other libs), this works fine.
>
>I'm quickly running out of clues. Can anyone suggest what additional
>debugging to collect, or what steps to take to try and root-cause this?
>My build machine is only an Athlon64 3400+, so rebuilding userland for
>every day between Oct 24th and Nov 4th seems time-prohibitive.
>
I have no idea. I am running -current here on two machines and it seems to
work, but that is on i386, not amd64.
christos
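
On the "rebuilding userland for every day" concern: if the breakage came
in at a single point and stayed in, the date range can be bisected rather
than walked day by day. The sketch below only shows the search logic. The
function name bisect_dates and the is_bad callback are made up for
illustration; in practice is_bad would have to check out sources as of the
given date (for example with "cvs update -D"), rebuild userland, and
report whether the hang or the EPERM shows up. The break date used in the
example call is purely hypothetical.

```python
from datetime import date, timedelta


def bisect_dates(good, bad, is_bad):
    """Find the earliest failing date in the range (good, bad].

    Assumes `good` is known to work, `bad` is known to fail, and that
    once the breakage appears it persists on every later date.
    `is_bad(d)` is a caller-supplied check (hypothetical here) that
    returns True if a build from date `d` shows the problem.
    Returns (first_bad_date, list_of_dates_tested).
    """
    tested = []
    while (bad - good).days > 1:
        # Pick the midpoint of the remaining range and test it.
        mid = good + timedelta(days=(bad - good).days // 2)
        tested.append(mid)
        if is_bad(mid):
            bad = mid      # breakage is at or before mid
        else:
            good = mid     # breakage is after mid
    return bad, tested


if __name__ == "__main__":
    # Stand-in predicate: pretend the breakage landed on Oct 31st.
    # This date is an assumption for the demo, not a finding.
    pretend_break = date(2006, 10, 31)
    first_bad, tested = bisect_dates(
        date(2006, 10, 24), date(2006, 11, 4),
        lambda d: d >= pretend_break,
    )
    print("first bad date:", first_bad)
    print("builds needed:", len(tested))
```

For the Oct 24th to Nov 4th window this needs at most four builds instead
of eleven. It will not help if the failure is intermittent or if more than
one change is involved.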