Port-vax archive
Re: Moving VAX into 21 century :-)
On 2019-08-27 14:56, Anders Magnusson wrote:
Morning Johnny,
Good morning... :-)
On 2019-08-27 at 00:40, Johnny Billquist wrote:
A couple of comments without having caught up where the thread is now...
On 2019-08-26 14:08, Anders Magnusson wrote:
Hi all,
I have been looking at some VAX problems lately, and have found out
that there are two architectural things that probably would
help VAX quite a lot.
1) Change calling convention.
As described in my previous mail, it would solve a very old
well-known performance problem.
It is indeed well known that the CALL/RET instructions on the VAX are
very heavy. However, it is not that they are heavy for no reason.
I can't really see that the JSB/RSB would be much better, unless we
actually think that we want to strip some of that functionality away.
Things that CALL does:
Pushes the requested registers on the stack at entry, and automatically
restores them again at return.
Saves AP and sets up new AP.
Saves and sets up new FP.
Pushing and popping registers will need to be done by compiler if not
using CALL/RET. Should not have any penalty on speed, but will grow
memory needs a little, I would expect.
Setting up AP - well, you might have some clever conventions in mind,
but I would expect similar effort to CALL would be needed for JSB.
Setting up FP - if we don't want tracebacks and returning without
first cleaning up the stack to work, then this can save some time. But
is this really something we'd like?
I wonder how much gain there really is, if you still want all the
bells that CALL gives you? I would expect the end cost to come out
about the same, but with more memory required.
In the common case we don't need any of the extra stuff that CALLS does;
this is why jsb/rsb can be used instead.
- Pass parameters in registers (we have a bunch of them). This avoids
memory cycles as well which is good.
True. Passing parameters in registers is definitely an option. Will
require saving and restoring registers at call. And it might be messy in
how to deal with different types of parameters. So it would increase
complexity in possibly several ways. But should enable faster execution.
- No need for AP (Use for TLS?)
You need some way of telling where the arguments are for arguments
beyond what you can pass in registers. Are you suggesting just some
fixed offset on the stack? This can become ugly and error prone...
- No need for FP (unless we are playing with VLAs which is quite uncommon)
The FP is mainly used for tracebacks, and automatic cleanup of the
stack. I think in general that is a nice thing, but it does also cost, yes.
- No need to save PSL or align stack. Keeping stack aligned is up to
the compiler.
I can't remember. Does CALL really align the stack? How is that then
handled at return? Does it realign back to whatever it was previously?
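On the alignment question: as far as the VAX architecture documentation describes it, CALLS/CALLG record the low two bits of SP in the call frame (the SPA field of the saved mask/PSW longword), round SP down to a longword boundary, and RET adds the saved bits back after popping the frame. A minimal sketch of just that bookkeeping follows; the function names are made up for illustration, and the rest of the frame (registers, PC, FP, AP) is left out.

```python
def align_on_call(sp: int) -> tuple[int, int]:
    """Model the alignment step of CALLS/CALLG.

    Returns the longword-aligned SP and the two low bits (SPA)
    that would be stored in the call frame.
    """
    spa = sp & 0x3           # low two bits of SP before alignment
    aligned_sp = sp & ~0x3   # round down to a longword boundary
    return aligned_sp, spa


def restore_on_ret(sp_after_frame_pop: int, spa: int) -> int:
    """Model the matching step of RET: add the saved SPA bits back."""
    return sp_after_frame_pop + spa


# Hypothetical, deliberately misaligned stack pointer.
sp_before = 0x7FFE0FFE
aligned, spa = align_on_call(sp_before)
sp_after = restore_on_ret(aligned, spa)
```

So under this model the stack pointer does return to whatever it was before the call, misaligned or not.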
- Keep a "red zone" below stack of 8 words or so to simplify for leaf
functions. Amd64 ABI does this as well.
You mean preallocate some space on the stack? Sure. Doesn't cost anything,
and could already be done today. Not sure how much speed it saves.
It won't increase the code size noticeably:
- CALLS is (usually) 9 bytes (7 + the word in the function)
- JSB is 6 bytes.
- PUSHR and POPR take 4 bytes each.
So if no regs need saving we save 3 bytes, otherwise we add 5.
Also we save three bytes if we only use the red zone.
I wasn't thinking about space increase in the call, but the return. If
you have multiple places of return, you need to both clean the stack,
and restore registers at every place you do the return.
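The arithmetic from both sides can be put in one small sketch. The byte counts are the ones quoted above; the `returns` parameter and the function names are hypothetical, and the sketch assumes one POPR per return point and ignores RSB versus RET (both are one byte) and any explicit stack cleanup.

```python
CALLS_SITE = 7   # bytes for CALLS at the call site (figure from the thread)
ENTRY_MASK = 2   # the register save mask word at the function entry
JSB_SITE = 6     # bytes for JSB at the call site (figure from the thread)
PUSHR = 4        # bytes for one PUSHR
POPR = 4         # bytes for one POPR


def calls_bytes() -> int:
    """Size of one CALLS call plus the callee's entry mask."""
    return CALLS_SITE + ENTRY_MASK


def jsb_bytes(saves_regs: bool, returns: int = 1) -> int:
    """Size of one JSB call plus explicit save/restore in the callee.

    Assumes a single PUSHR at entry and one POPR at each return point.
    """
    size = JSB_SITE
    if saves_regs:
        size += PUSHR + POPR * returns
    return size


no_regs_delta = jsb_bytes(False) - calls_bytes()               # -3
one_return_delta = jsb_bytes(True) - calls_bytes()             # +5
three_returns_delta = jsb_bytes(True, returns=3) - calls_bytes()  # +13
```

With a single return point this reproduces the "save 3, otherwise add 5" figures; each extra return point adds another 4 bytes under these assumptions.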
2) Make VAX use IEEE floats :-)
Today virtually no floating-point format exists that is not IEEE. The
only holdout around is probably the VAX floats.
This can hardly be a performance problem. So now we're talking about
some compatibility or general behavior thing?
But floating point on VAX does have differences, in the hardware, that we cannot
pretend don't exist. So is this about stupid programs that are making
some assumptions that we fail, while still actually not caring enough
about actual IEEE FP, or do we really want to be proper IEEE FP, in
which case we're going to need to emulate in software, which will have
a huge impact on performance.
We're talking about being able to compile and run existing programs
without getting a headache.
Ok. I'm sort of fond of that idea, in that I expect that most programs
will never care enough anyway. Not in the normal practical sense.
However, I think there are slight differences in how a number is
actually represented, so any floating values read in will turn out
wrong, won't they? Or do you really mean that the values are similar
enough that almost all values are represented the same way in the bit
pattern?
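To make the representation question concrete, here is a small sketch that decodes a VAX F_floating value from its in-memory bytes and then reinterprets the same sign/exponent/fraction fields as an IEEE single. It assumes the usual F_floating layout (sign in bit 15 of the first word, 8-bit excess-128 exponent, hidden-bit fraction normalized to 0.5 <= f < 1, low fraction bits in the second word); the function names are made up for illustration.

```python
import struct


def vax_f_to_float(raw: bytes) -> float:
    """Decode 4 bytes laid out as a VAX F_floating value in memory."""
    first, second = struct.unpack("<HH", raw)  # two little-endian words
    sign = (first >> 15) & 1
    exponent = (first >> 7) & 0xFF             # excess-128
    fraction = ((first & 0x7F) << 16) | second  # 23 explicit fraction bits
    if exponent == 0:
        if sign:
            raise ValueError("reserved operand")
        return 0.0                              # exponent 0, sign 0 is zero
    mantissa = 0.5 + fraction / 2**24           # hidden bit, 0.5 <= m < 1
    value = mantissa * 2.0 ** (exponent - 128)
    return -value if sign else value


def same_fields_as_ieee(raw: bytes) -> float:
    """Reinterpret the same sign/exponent/fraction fields as IEEE single."""
    first, second = struct.unpack("<HH", raw)
    return struct.unpack(">f", struct.pack(">HH", first, second))[0]


one_in_vax = bytes([0x80, 0x40, 0x00, 0x00])    # VAX F_floating 1.0
as_vax = vax_f_to_float(one_in_vax)
as_ieee = same_fields_as_ieee(one_in_vax)
```

Under these assumptions the same fields read as IEEE come out four times larger (exponent bias 127 with a 1.f mantissa versus bias 128 with a 0.1f mantissa), and that is before the word order in memory is taken into account, so stored binary values would not carry over unchanged.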
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt%softjar.se@localhost || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol