On Thu, Jul 24, 2008 at 10:57:16PM +0100, Mindaugas Rasiukevicius wrote:
> Bill Stouder-Studenmund <wrstuden%netbsd.org@localhost> wrote:
> > > Imagine a case of 1000 (new) pthreads which block - that would mean:
> > > 1000 * (LWP creation + SA context switch) operations. Plus, LWPs for
> > > VPs...
> >
> > That would be the exact same LWP usage as a 1:1 threading model would
> > give. The SA process spends the time creating the LWPs between blocking
> > events while the 1:1 process created all of the same LWPs at initial
> > thread creation time.
>
> Not exactly. To create LWP when blocking (that is, switching the context) SA
> invents a lot of complexity, and hacks (eg. locking against order). Also,
> inventing the limits on such flow is harder.

It's not that complicated. Yes, we have to grab an additional lock in
sa_switch(). However, "locking against order" is a bit of a stretch, as I've
not seen a locking hierarchy describing how to lock multiple sleepq locks at
once. (*) So if we can't lock the vp (and thus the lwp that we need to wake
up), we do the unlock/relock dance. We then make sure we haven't been woken
up in the process. The lock we're grabbing is one we should rarely find
locked.

(*) I've seen the locking ordering in kern_lwp.c, but my reading of it was
that it speaks of one lwp lock, then other locks, like runqueue locks, etc.
The specific issue here is that our lwp is on a sleepq, so our lwp lock is a
sleepq lock. The sleeper thread we want to wake is also on another sleepq,
so its lwp lock is a sleepq lock as well. And I saw no global hierarchy of
them. If there were one, I'd have made it so that the lock we need to take
is something we could safely take w/o deadlock. Assuming I understood it
correctly. :-)

Also, some aspects of "hackishness" in the code stem from the fact that the
rest of the kernel hasn't been changed much to better support the SA code.
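To make the unlock/relock dance concrete, here is a minimal userspace sketch
using pthread mutexes. The names and the pthread setting are mine, not the
actual sa_switch() code; it only illustrates the try-lock/back-off pattern
used when two locks have no defined ordering:

```c
#include <pthread.h>
#include <stdbool.h>

/* Illustrative stand-ins: our own sleepq lock and the vp lock have no
 * global ordering between them, so we must never block on one while
 * holding the other. */
static pthread_mutex_t my_sleepq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vp_lock = PTHREAD_MUTEX_INITIALIZER;
static bool woken;  /* would be set by a waker while our lock is dropped */

/* Returns true with both locks held, or false if we were woken while
 * neither lock was held and the caller should bail out. */
bool
lock_vp_from_sleepq(void)
{
	pthread_mutex_lock(&my_sleepq_lock);
	if (pthread_mutex_trylock(&vp_lock) != 0) {
		/* Can't take the vp lock while holding ours: drop ours,
		 * take the vp lock blocking, then re-take ours. */
		pthread_mutex_unlock(&my_sleepq_lock);
		pthread_mutex_lock(&vp_lock);
		pthread_mutex_lock(&my_sleepq_lock);
		/* We held neither lock for a moment; re-check state. */
		if (woken) {
			pthread_mutex_unlock(&vp_lock);
			pthread_mutex_unlock(&my_sleepq_lock);
			return false;
		}
	}
	return true;
}
```

Since the other path only ever try-locks while holding a lock, nobody
blocks while holding the "wrong" lock, and the deadlock cannot form.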
I know there is one place where I had to copy code rather than use existing
routines, because each was almost-but-not-quite what was needed. The main
thing is that the SA code wants to manage a pool of threads, so it needs to
be able to take other threads (not curlwp) and add them to and remove them
from sleep queues. Or SA needs to do something in the middle of un-sleeping
something on its sleep queues. So I have a "first half" and a "second half"
routine.

About limiting: there is code in the "start up the new LWP to report
blocking" routine that handles failure to set the new thread up (which means
getting a stack for the upcall we're delivering about blocking, allocating a
new upcall data structure, and allocating a new LWP). If that happens, we
put our upcall data structure back on the vp, mark the lwp that was the
blessed one and blocked as the blessed lwp again, and put ourselves into the
lwp cache. Simply put, we undo the upcall triggering and turn the blocking
into a no-upcall block. So if the routine to make a new lwp fails, we will
enter this path. It will be slow and expensive, but it should work.

> > One other thing to consider is how long different context switches take.
> > The two important ones are intra-process-same-space switches (inter-LWP in
> > the kernel and inter-thread in SA userland) and user-kernel switches. When
> > I was starting the Wasabi iSCSI target, I asked around before we used (SA)
> > pthreads to implement this. I asked a number of NetBSD threading folks
> > about this.
> >
> > The answer I was given was that user-kernel switches are NOTABLY more
> > expensive. Like 10x. Their numbers, not mine. So while SA is adding extra
> > steps, they are steps that aren't the most expensive thing around.
>
> But well.. what Andrew said - let's rather spend time optimising the context
> switch on such architectures like ARM - that would give overall benefit.
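(Coming back for a moment to the limiting/undo path I described above: in
very rough C it amounts to the following. Every struct and function name
here is an invented stand-in for illustration, not the real KERN_SA code.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented, heavily simplified stand-ins for the kernel structures. */
struct upcall { struct upcall *next; };
struct vp {
	struct upcall *upcalls;	/* per-VP list of pending upcall data */
	int blessed_lwp;	/* lwp currently "blessed" to run */
	int cached_lwps;	/* lwps parked in the lwp cache */
};

/* Pretend setup that can fail, standing in for getting an upcall
 * stack, allocating an upcall structure, and allocating a new LWP. */
static bool
setup_blocked_upcall(bool alloc_ok)
{
	return alloc_ok;
}

/* Try to deliver a "blocked" upcall; on setup failure, undo everything
 * so the block degrades to a plain no-upcall block. Returns true if
 * the upcall was set up. */
bool
sa_report_blocked(struct vp *vp, struct upcall *uc, int blocked_lwp,
    int cur_lwp, bool alloc_ok)
{
	if (setup_blocked_upcall(alloc_ok))
		return true;		/* new LWP will run the upcall */

	/* Undo: put the upcall data structure back on the vp ... */
	uc->next = vp->upcalls;
	vp->upcalls = uc;
	/* ... re-bless the lwp that blocked ... */
	vp->blessed_lwp = blocked_lwp;
	/* ... and park ourselves (cur_lwp) in the lwp cache. */
	(void)cur_lwp;
	vp->cached_lwps++;
	return false;
}
```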
Do you really think we can get a 10x reduction in the amount of time it
takes to get in and out of the kernel? I'd love it if we could, but my
understanding is that the numbers I was given were for things people had
already spent a fair amount of time working on before I asked the question,
so there is not much time left to squeeze out. The hardware is going to need
a certain amount of time, and there's only so much that can be done about
it. Put another way, people don't complain about system calls on ARM because
all the OSs are stupid; they complain because of how the hardware handles
them.

Also, we're a volunteer project. As such, it's not clear to me that doing
one thing really means not doing another thing. Since I'm personally much
more interested in SA than in improving ARM context switching (and, to be
blunt, I feel I have no skill at the latter), me doing this isn't holding us
back.

> > What I don't understand, though, is why we're discussing this issue like
> > this. I don't see what the NetBSD kernel loses by having both 1:1 AND SA
> > threading support. While the SA code is a fresh port, it is a fresh port
> > of the NetBSD 4 code. So it actually is something we're familiar with as a
> > project. People on this list have shown that SA does better on some work
> > loads, and other people have shown (quite spectacularly) that 1:1 performs
> > stunningly.
>
> Bringing SA back invents more than 3000 lines of very complicated code. Why?
>
> - To support specific backwards compatibility which we never actually
>   supported (see what Andrew and Jason wrote).

That's two voices, three if we add you. What did everyone else say? Almost
all the other comments I've heard have supported this. We used to make
exactly this promise, and we never clearly decided not to. It would be fine
to ship if we had no alternative, but we do have an alternative. Note that
Jason is saying we should shift what our promise about compatibility is.
That seems to me to clearly indicate what our promise used to be. I think
he's right that static libpthread is something that should be touched with
care (and generally avoided), and that compatibility at the .so level may be
a better thing to do overall. But that's not what the promise has been.

Also, please pay attention to the comments about the practicalities of this.
Keeping a promise at the .so level is one thing if we support an update
model like, say, Mac OS's. There, you update the whole OS, then carry
forward using your existing apps linking against new libraries and living
under the new kernel. That's not what we've done in the past, and, most
importantly, it's not what our users are used to doing. In the past, the new
kernel would support the old libraries. This makes a difference in chroot
environments and in kernel-only upgrades.

Finally, we've tested the code. Yes, it's a port to our new kernel, but it's
a port of code we've been shipping for three previous releases. We as a
project have a feel for what it does and doesn't do well.

> - To support theoretical performance for some workload, where seems nobody in
>   this mailing-list can provide a prove-of-concept test application, or even
>   a reasonable SA benchmark. And no - "I saw a benchmark" or 5 years old
>   graph about NPTL, unfortunately, does not say anything...

Listen to the people who are saying this. A good number of them are people I
have come to pay attention to on this list. If they bother to say something,
it usually is worth listening to.

> Looks ironical. Especially when people arguing more from belief, instead of
> saying: "Hey, here is the example of real-world application which works with
> SA much better - let's try it!"

Conversely, you can't _prove_ that there aren't real-world cases where SA
would do better. :-)

> But again, the main thing which makes me upset is adding thousands of lines
> to improve few percent of theoretical cases.
> This breaks one of the main
> software engineering principles. I thought it is not the way NetBSD goes...

But it is. We've supported extreme amounts of backwards compatibility. We
have options to turn on system call compatibility going back to NetBSD 0.9.
I've never run NetBSD pre-1.2 (I think; that was late 1995).

Also, I think it'd be kinda exciting for us to be one of the few OSs able to
support both threading models. :-)

I also wish you were not so upset by this. Your help has been invaluable,
and your comments have led to a MUCH better SA than we would have had
otherwise. :-) To be blunt, without your comments, KERN_SA would not be
something we could be talking about integrating. Thank you for this
assistance.

Take care,

Bill