Port-sun3 archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Xorg 1.20 1bpp server performance regression and possible fixes
On testing Xsun server on sun3/60, I noticed Xorg 1.20 1bpp
(i.e. mono) server was much slower than Xorg 1.10 server,
especially on filling /usr/X11R7/include/X11/bitmap/root_weave
4x4 bitmap onto the root window, when "noblank" screen saver was
activated etc.
Xorg 1.10 based Xsun server on Sun3/60 bwtwo:
https://twitter.com/tsutsuii/status/1289451828036300800
-> Drawing time is not measurable by eyes.
Xorg 1.20 based Xsun server on Sun3/60 bwtwo:
https://twitter.com/tsutsuii/status/1289437204654075907
-> It takes >10 seconds to fill root window.
Xorg 1.18 based Xsun server (not in the tree) on Sun3/60 bwtwo:
https://twitter.com/tsutsuii/status/1291000288862560256
-> Same as 1.20.
Xorg 1.20 server + xf86-video-wsfb driver on LUNA using single plane:
https://twitter.com/tsutsuii/status/1291772031525179392
-> Also >10 seconds even on the genuine(?) xf86-video-wsfb driver.
Note this cannot be observed on TME sun3 and bwtwo emulation.
https://twitter.com/tsutsuii/status/1289590810392924162
With several investigation, it turns out the following
xorg-server upstream changes to fb/fbtile.c cause this
regression:
https://cgit.freedesktop.org/xorg/xserver/commit/fb?id=e572bcc7f4236b7e0f23ab762f225b3bce37db59
>> fb: Remove even/odd tile slow-pathing
>>
>> Again, clearly meant to be a fast path, but this turns out not to be the
>> case.
I'm not sure how the "not to be the case" in the log was concluded,
but the "fast path" of the removed fbEvenTile() function was
only called if "FbEvenTile(tileWidth)" was true:
https://cgit.freedesktop.org/xorg/xserver/tree/fb/fbtile.c?id=9838b7032ea9792bec21af424c53c07078636d21#n145
---
void
fbTile(FbBits * dst,
FbStride dstStride,
int dstX,
int width,
int height,
FbBits * tile,
FbStride tileStride,
int tileWidth,
int tileHeight, int alu, FbBits pm, int bpp, int xRot, int yRot)
{
if (FbEvenTile(tileWidth))
fbEvenTile(dst, dstStride, dstX, width, height,
tile, tileStride, tileHeight, alu, pm, xRot, yRot);
---
FbEvenTile() is defined in fb/fb.h:
https://cgit.freedesktop.org/xorg/xserver/tree/fb/fb.h?id=e572bcc7f4236b7e0f23ab762f225b3bce37db59#n543
---
/*
* Accelerated tiles are power of 2 width <= FB_UNIT
*/
#define FbEvenTile(w) ((w) <= FB_UNIT && FbPowerOfTwo(w))
---
FB_UNIT is 32 here, so the "fast path" is activiated only if
"tileWidth" arg is 32 or smaller (i.e. 1, 2, 4, 8, or 16).
The main caller of fbTile() is fbFill() with "FillTiled" op in
fb/fbfill.c:
https://cgit.freedesktop.org/xorg/xserver/tree/fb/fbtile.c?id=9838b7032ea9792bec21af424c53c07078636d21#n145
---
fbTile(dst + (y + dstYoff) * dstStride,
dstStride,
(x + dstXoff) * dstBpp,
width * dstBpp, height,
tile,
tileStride,
tileWidth * tileBpp,
tileHeight,
pGC->alu,
pPriv->pm,
dstBpp,
(pGC->patOrg.x + pDrawable->x + dstXoff) * dstBpp,
pGC->patOrg.y + pDrawable->y - y);
---
The argument "tileWidth" of fbTile() includes bpp, so
the "fast path" fbEvenTile() won't called on 32bpp servers.
On the other hand, 1bpp server uses it for 32x32 and smaller bitmaps.
Reverting the above "fb: Remove even/odd tile slow-pathing" change
significantly improves speed of filling the root_weave and other
pattern of Xorg 1bpp server as before:
Patched Xsun server on Sun3/60 bwtwo:
https://twitter.com/tsutsuii/status/1291061688762957826
Patched Xorg + xf86-video-wsfb server on LUNA:
https://twitter.com/tsutsuii/status/1291773410964463617
I would like to commit changes to revert this fbtile.c removal
into the NetBSD xsrc/external/mit/xorg tree, but I wonder how
they should be applied.
1) Revert the whole commit as is (and put back fbtile.c to SRCS.fb
in src/external/mit/xorg/server/xorg-server/fb/Makefile.fb)
2) Put back the "fbEvenTile" and old "fbTile" to switch
"fbOddTile" (which is same as current fbTile) or "fbEvenTile"
into existing fb/fbfill.c as inline functions, to make
future import/merge less confusing
3) Ask xorg upstream to revert the change
Of course the best way is 3) and 1), but I'm afread most people
don't think it's worth to keep and maintain extra code for obsolete
monochrome (and some 8bpp) servers and retro-computing geeks.
Comments?
---
Izumi Tsutsui
Home |
Main Index |
Thread Index |
Old Index