Subject: Re: Shared arm26/arm32 user code
To: Ben Harris <bjh21@netbsd.org>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm26
Date: 11/25/2000 13:34:03
> Probably the most common question I get asked about NetBSD/arm26 is "can
> it run arm32 programs?".  So far, my answer has been "no".  I'd like to
> change this.  This will require some co-operation from the arm32 side, but
> if it could be achieved at the same time as the ELF transition (and before
> arm26 gets shared libraries), I think it would be fairly painless.
> 
> What we need to do is to arrange for GCC to produce code that'll run
> correctly both in USR32 mode on new ARMs and in USR26 mode on old
> ones.  Conveniently, this is more-or-less the same problem the RISC OS
> world is facing, since they're finally having to abandon 26-bit modes.  We
> may be able to make use of their efforts.
> 
> Clearly, the calling convention used by portable binaries will have to be
> a 32-bit APCS variant (RISCOS Ltd call it APCS-32).  Unfortunately, GCC
> uses "-mapcs-32" to mean that the code will be running in a 32-bit mode,
> and not just that a 32-bit APCS should be used.  Either this needs to be
> corrected, or a new flag to specify mode-independent code will be
> needed.  In either case, the APCS variant and the target mode need to be
> distinguished.

I'd suggest -mapcs-26+32 ;-)  (if it turns out that we really need 
anything at all).

> In GCC 2.95.2, -mapcs-32 affects the following:
> 
>  - Whether thumb interworking is possible (arm.c:362)
>  - What kind of CPUs can be selected. (arm.c:376)
>  + Function return conventions (arm.c:5372)
>  + Preservation of PSR over calls (arm.c:6440)
>  * Setting of MASK_RETURN_ADDR
> 
> Now, the first two should probably be made fairly relaxed for the new
> mode.  The next two should follow -mapcs-32, and the last one is
> interesting.  MASK_RETURN_ADDR needs to be an rtx that is used to mask off
> the PSR bits from return addresses, and hence needs to generate code to
> check the processor mode at runtime.  The recommended way to do this is:
> 
>         TEQ     R0, R0          ; sets Z (can be omitted if not in User mode)
>         TEQ     PC, PC          ; EQ if in a 32-bit mode, NE if 26-bit

I'm missing something here.  What does the first instruction give us?

As for making this dynamic, that isn't too hard: we just need to make 
MASK_RETURN_ADDR into a sequence of instructions that returns a value, 
rather than a static constant as it currently is.  This macro is only used 
when throwing exceptions, so it isn't a major overhead.
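
As a rough illustration of the idea, here it is in plain C rather than as 
the instruction sequence GCC would actually emit.  The constant 0x03fffffc 
follows from the 26-bit R15 layout (flags in bits 26-31, mode in bits 
0-1).  The function names and the in_32bit_mode parameter are hypothetical; 
the parameter stands in for whatever the run-time mode test yields.

```c
#include <stdint.h>

/* Hypothetical helper: choose the return-address mask at run time.
   in_32bit_mode stands in for the result of the mode test. */
static uint32_t return_addr_mask(int in_32bit_mode)
{
    /* 32-bit mode: the return address carries no PSR bits, keep everything.
       26-bit mode: keep only the word-aligned PC field, bits 2..25. */
    return in_32bit_mode ? 0xffffffffu : 0x03fffffcu;
}

/* Hypothetical helper: strip any PSR bits from a saved return address. */
static uint32_t strip_psr(uint32_t lr, int in_32bit_mode)
{
    return lr & return_addr_mask(in_32bit_mode);
}
```

In 26-bit mode a saved value of 0xA0008003 would reduce to 0x00008000; in 
a 32-bit mode the value passes through unchanged.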

> A final thing that needs to be considered is that arm26 requires sections
> to be aligned on 32k boundaries, so the linker will have to be set up to
> arrange this.

This will of course have a cost for arm32 people in that it will make 
their binaries bigger.  There may also have to be some work done in the 
kernel to force it to load such images on well-aligned pages, but I don't 
think it will affect the mapping algorithms; arm32 kernels can still map 
4k pages at a time.
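
To put a number on that cost, here is a small sketch of the rounding 
involved.  It assumes the padding really does end up in the file, which 
depends on how the linker lays the image out; align_up and padding are 
names made up for this example.

```c
#include <stddef.h>

/* Round size up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Bytes of padding added when a section of the given size is aligned. */
static size_t padding(size_t size, size_t align)
{
    return align_up(size, align) - size;
}
```

For a 10000-byte section, 4k alignment rounds up to 12288 bytes (2288 bytes 
of padding), while 32k alignment rounds up to 32768 bytes (22768 bytes of 
padding).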

A trick that was done in the RISC iX world was to merge the last (partial) 
text page with the first data page, mapping the entire page read-write; 
this meant that for many small programs, the image could be kept to just 
32K.  I've no idea whether we can (or already do) play a similar trick on 
NetBSD.
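
A back-of-the-envelope sketch of what the merge saves, assuming 32K pages 
and a data segment that simply follows the text with no gap.  The constant 
and function names are invented for illustration and do not describe any 
existing loader.

```c
#include <stddef.h>

/* Page size assumed for arm26 in this sketch. */
#define ARM26_PAGE ((size_t)32 * 1024)

/* Round a byte count up to a whole number of pages. */
static size_t round_up_pages(size_t bytes)
{
    return (bytes + ARM26_PAGE - 1) / ARM26_PAGE * ARM26_PAGE;
}

/* Text and data each start on their own page boundary. */
static size_t footprint_separate(size_t text, size_t data)
{
    return round_up_pages(text) + round_up_pages(data);
}

/* Data begins in the last partial text page, which is mapped read-write. */
static size_t footprint_merged(size_t text, size_t data)
{
    return round_up_pages(text + data);
}
```

With 10000 bytes of text and 3000 bytes of data, the separate layout needs 
65536 bytes and the merged one 32768.  The price is that the shared page is 
writable, so the tail of the text loses its write protection.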

R.

PS.  Not relevant to this discussion, but we should probably be thinking 
about a common dev/podulebus in the kernel source area.  There are one or 
two podules that run equally well on both architectures.