Subject: Re: 'hanging' mount problem 'fixed'? Forklift Upgrade....
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Matthew Jacob <mjacob@feral.com>
List: port-alpha
Date: 04/07/1999 21:02:35
>
> > My kernels were always really current too. It's the user space bits that
> > have hit the problem.
>
> Did you preserve those user space bits, or pinpoint EXACTLY what they were
> doing differently that caused them to lose?
See other mail.
>
> > (you done with brunner then?)
>
> No, actually. I'm sick as a dog (NASTY cold, unfortunate timing), and
> caught the early express train back to San Francisco, where I am currently
> parked on the sofa with a hot cup of tea and a warm blanket, listening to
> KQED. After I get a bite to eat, I'm gonna huff down to Cala and get a
> bottle of Nyquil. I am taking a sick day tomorrow :-)
Sorry to hear you're unwell. I was certainly this way a week or so ago.
>
> > > ...well, that's ... unclear, considering that you never pinpointed exactly
> > > what the problem was.
> > >
> >
> > True- I *did* pinpoint the source changes that caused it- and I *did*
> > narrow it down to subshells within /etc/rc. If it had been *just* me, I
> > would just write it off with an apology, but because at least one other
> > person had the issue, I'd just be a bit uneasy about COMPAT_13.
>
> No you didn't... Because you never commented out the part of pmap_enter()
> that makes the rest of those changes do anything and see if the problem
> still exists, i.e. if that was actually the problem. You didn't pinpoint
> the system call or fault or whatever that caused the hang to occur. Note
> that the other person that observes this can get it to happen without
> /etc/rc subshells. In other words, all you did was find "evidence" to
> support your theory, but you didn't actually provide anything to make
> me believe that your theory is in any way correct (especially considering
> that I wrote a very large portion of the Alpha pmap module, and know
> precisely what the change you're pointing at does and how it works).
In email about a week ago, I asked for someone who *does* know this
area to look at it and comment. I *did* say I didn't know the area of code
well. I really didn't think through how to test it, and wasn't going
to try to decipher this and get up to speed on all of it because I have
other stuff to work on. I was upset because I didn't get anything but a
sarcastic comment from you starting today. I was sitting down trying to
read the code when you dropped by and said you wanted to look at the
problem (whereupon I relinquished one of the machines right away).
>
> The point I'm trying to make here is that you're getting upset at "us"
> for not dealing with the problem, when we have no real information to
> go on because we can't reproduce it, and you haven't provided any actual
> info other than "I think this is what caused it".. and now we'll never
> get that info because you added variables (updated your userland) and
> now can't get the problem to occur. That's just nonsense.
Like my other email said, I have two other systems that have the problem.