On 2023-09-24 19:52, Mouse wrote:
>>> Yes...but, now, I find that it fails when I remove the TB logging
>>> code (which I'd like to remove for performance's sake; it introduces
>>> some relatively complex operations into a very heavily used
>>> codepath). I've so far been unable to see what it does that makes
>>> any difference; I'll have to dig deeper.
>> So maybe there is actually some bugs in your code after all. ;-)
> What? My code has a _bug_?? That _never_ happens!

:-)
>>>> Why not cache entries outside main memory, by the way?
>>> The main motivation for adding the TB was that I want to experiment
>>> with JITting the code to native machine code. [...] This would
>>> permit the native machine code for a memory access to be relatively
>>> streamlined. "Make the common case fast"....
>> But you don't even know it's for the bus adapter until you've gone
>> through the PTE translation anyway. While once you've done the
>> translation, it's obvious.
> True as far as it goes. My thought went more or less like this: I
> cache (faster-to-access forms of) PTEs for virtual addresses that map
> to main memory, but nothing else. The first lookup for a given page,
> yes, is slow, but after that, we hit the TB cache and the JITted code
> doesn't need to fail to the slow path; it can go straight to main
> memory. For access to devices, I never enter it in the TB, so JITted
> code always takes the slow path, which it might as well do anyway.

The first lookup (which is what would go through the TB if an entry
exists there) is what always has to happen. And once that is done, you
then know whether it hits main memory or I/O. I can't see what you
could do to make it faster for memory, or how anything about I/O can
be decided before that is done anyway. And if you don't cache entries
for I/O addresses, it just means that you always have to go via the
PTEs to get the translation.

Maybe I'm just dense. But caching just so that you don't have to go
via the PTEs when there is a TB hit seems like always a win. I don't
see any gain in not caching some entries. Or, hmm. Ok, so you know
that if you hit in the TB, it's memory. So you can skip one compare
after the TB lookup in that case.

Oh well. Not sure what the performance hits for various options are here.
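
Roughly how I picture the two variants, as a C sketch. The names, the
sizes and the direct-mapped layout are all made up for illustration;
this is not the actual emulator code:

    /* All names, sizes and the direct-mapped layout below are made up
     * for illustration; not the actual emulator code. */

    #include <stdint.h>

    #define TB_ENTRIES  1024u              /* assumed: power of two */
    #define PAGE_SHIFT  9                  /* VAX hardware page = 512 bytes */
    #define PAGE_MASK   ((1u << PAGE_SHIFT) - 1)
    #define TB_INVALID  (~0u)              /* impossible VPN, marks empty slot */

    struct tb_entry {
        uint32_t tag;                      /* virtual page number, or TB_INVALID */
        uint8_t *host_page;                /* host RAM backing this guest page */
    };

    struct tb_entry tb[TB_ENTRIES];        /* tags set to TB_INVALID at startup */

    /* Variant A (only main-memory pages are ever entered): a hit can go
     * straight to host memory, so JITted code needs no further checks.
     * Device pages always miss and fall back to the slow path, which
     * does the PTE walk and the memory-vs-I/O decision. */
    static uint8_t *
    tb_lookup(uint32_t va)
    {
        uint32_t vpn = va >> PAGE_SHIFT;
        struct tb_entry *e = &tb[vpn & (TB_ENTRIES - 1)];

        if (e->tag == vpn)
            return e->host_page + (va & PAGE_MASK);
        return NULL;                       /* miss: take the slow path */
    }

    /* Variant B (cache everything) would add an is_ram flag to the
     * entry; a hit then saves the PTE walk for I/O pages too, at the
     * price of one extra compare before touching memory. */

So the trade-off is one extra compare on every hit versus never getting
a TB hit at all for I/O pages.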
>>>>> I find it interesting that this NetBSD doesn't use TBIS at all.
>>>> Agreed. Seems like it would potentially be clever to just do TBIS
>>>> when you update one page.
>>> True. It may be relevant that NetBSD/vax, 1.4T at least, uses 4K
>>> pages in the getpagesize() sense; [...]
>> The 4K page is a bit more recent. Back when this was messed around
>> with, NetBSD used a 1K page size on VAX.
> Well, the 1.4T kernel I'm working with definitely uses 4K pages.

The change was made back on 1998-08-21. Thanks for just reminding me how old I'm getting. :-)
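
Honouring TBIS on the emulator side would be cheap enough in any case;
roughly like this (restating the made-up declarations from the sketch
above so it stands on its own):

    /* Sketch only: single-entry invalidation (TBIS) versus full flush
     * (TBIA) for the made-up TB cache above. */

    #include <stdint.h>
    #include <stddef.h>

    #define TB_ENTRIES  1024u
    #define PAGE_SHIFT  9
    #define TB_INVALID  (~0u)

    struct tb_entry {
        uint32_t tag;
        uint8_t *host_page;
    };

    extern struct tb_entry tb[TB_ENTRIES];

    /* Guest wrote the TBIS internal register: drop only the entry
     * covering that virtual address, if it is cached at all. */
    static void
    tb_invalidate_single(uint32_t va)
    {
        uint32_t vpn = va >> PAGE_SHIFT;
        struct tb_entry *e = &tb[vpn & (TB_ENTRIES - 1)];

        if (e->tag == vpn)
            e->tag = TB_INVALID;
    }

    /* Guest wrote TBIA: throw the whole cache away. */
    static void
    tb_invalidate_all(void)
    {
        for (size_t i = 0; i < TB_ENTRIES; i++)
            tb[i].tag = TB_INVALID;
    }
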
  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost   ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol