On the other hand, it's very clear that the "native" byte ordering of
the ppc family is big-endian, and lots of operations take a
performance hit in little-endian mode.
Never mind the performance hit; it doesn't even work rationally. It's
been a while since I read up on ppc little-endian mode, but from what I
recall, I suspect someone tried to design a little-endian mode that
allowed
sharing memory between endiannesses without needing to byteswap data,
which is not a coherent desire. They came impressively close, but,
since it's not a self-consistent thing to do, the result is somewhat
broken.
I'm not sure about all of that - as I say, it's been a while - but I'm
sure it didn't work sanely, and it didn't behave very much like a truly
little-endian machine.
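
For illustration, here is a rough sketch of the scheme as it is usually
described for the 32-bit ppc parts: data stays in big-endian form in
memory, and in little-endian mode the low three bits of each aligned
access address are XORed with a size-dependent constant. This is a toy
model written from that description, not from the hardware manuals; the
function names (munge, le_mode_store, le_mode_load) are made up for the
example, and unaligned accesses are simply rejected.

```python
# Toy model of ppc little-endian mode "address munging" (an assumption
# based on the commonly given description, not a hardware-accurate model).

def munge(addr, size):
    """Map a program address to a physical address for an aligned access
    of `size` bytes by XORing the low three bits with (8 - size)."""
    if size not in (1, 2, 4, 8) or addr % size:
        raise ValueError("only aligned 1/2/4/8-byte accesses are modelled")
    return addr ^ (8 - size)  # byte: ^7, halfword: ^6, word: ^4, doubleword: ^0


def le_mode_store(mem, addr, size, value):
    """Store as the little-endian-mode program would: the data itself is
    still written in big-endian byte order, only the address is munged."""
    pa = munge(addr, size)
    mem[pa:pa + size] = value.to_bytes(size, "big")


def le_mode_load(mem, addr, size):
    """Load as the little-endian-mode program would."""
    pa = munge(addr, size)
    return int.from_bytes(mem[pa:pa + size], "big")


if __name__ == "__main__":
    mem = bytearray(8)
    le_mode_store(mem, 0, 4, 0x11223344)

    # Seen from inside the program, this looks little-endian:
    print(hex(le_mode_load(mem, 0, 1)))  # 0x44, the low byte is at address 0
    print(hex(le_mode_load(mem, 0, 2)))  # 0x3344

    # Seen from outside (raw memory, e.g. a device or another CPU):
    print(mem.hex())  # 0000000011223344

    # A true little-endian machine would have produced this image instead:
    true_le = bytearray(8)
    true_le[0:4] = (0x11223344).to_bytes(4, "little")
    print(true_le.hex())  # 4433221100000000

    # A string written one byte at a time ends up reversed in raw memory:
    text = bytearray(8)
    for i, ch in enumerate(b"ABCDEFGH"):
        le_mode_store(text, i, 1, ch)
    print(bytes(text))  # b'HGFEDCBA'
```

Under these assumptions a big-endian reader finds the word intact with
no byteswap, but at physical address 4 rather than 0, and sees byte
strings reversed within each 8-byte chunk. The raw image is also not
what a truly little-endian machine would have written.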