hi,

the attached diff switches to a per-cpu tss rather than a per-process tss.
with this, the number of processes is no longer limited by the number of
gdt slots.

i tried hbench's lat_ctx benchmark and didn't notice any performance
difference. (although it might differ for processes with i/o bitmaps.)

any comments?

todo: kvm86

YAMAMOTO Takashi