Port-amd64 archive
ACPI idle performance problem
ACPI idle is a choke point: profiling with tprof(8) shows it hammering the
CPU and the L2/L3 caches. It probably also delays TLB shootdown IPI
processing. Below are some kernel compile times from a 2 x 12 core system
running a kernel from the ad-namecache branch, comparing MWAIT and ACPI idle:
-j24 (no HT)  MWAIT   83.57s real   1016.03s user   205.83s system
-j24 (no HT)  ACPI    88.64s real   1067.68s user   221.18s system
-j48          MWAIT   74.25s real   1582.42s user   367.20s system
-j48          ACPI    77.26s real   1564.38s user   368.41s system
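For a rough sense of scale, the relative differences can be worked out from
the figures above. A minimal C sketch follows; the helper names
slowdown_pct() and report() are made up for illustration and are not part
of any existing tool:

```c
#include <stdio.h>

/*
 * Percentage by which `slow` exceeds `fast`.
 * Both arguments are times in seconds; `fast` must be non-zero.
 */
double slowdown_pct(double fast, double slow)
{
    return (slow - fast) / fast * 100.0;
}

/* Print the ACPI-vs-MWAIT difference for one measurement. */
void report(const char *label, double mwait_secs, double acpi_secs)
{
    printf("%-12s ACPI idle is %+.2f%% relative to MWAIT\n",
        label, slowdown_pct(mwait_secs, acpi_secs));
}
```

Calling report("-j24 real", 83.57, 88.64) and report("-j48 real", 74.25,
77.26) puts ACPI idle at roughly 6.1% and 4.1% more wall-clock time
respectively.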
To solve the problem, my initial thoughts are:
(1) For each CPU, make use of ACPI idle only if all CPUs in the same CPU
core have been idle for N clock ticks; otherwise use MWAIT/HLT.
(2) Remove CPUs doing ACPI idle from participation in TLB shootdown IPIs,
although that may impose its own cost because it means using directed
IPIs instead of broadcast IPIs. Needs to be tried.
But I'm not really sure about this one. Should the decision to use ACPI
idle be system wide, rather than a per-CPU/per-core decision?
Am I missing some vital piece of information, and/or is there a better way
to solve this?
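To make idea (1) concrete, here is a rough sketch of the decision in C. It
is only an illustration under assumptions: the types, field names,
MAX_SIBLINGS and pick_idle_method() are all hypothetical and do not
correspond to existing kernel structures, and how idle_ticks would be
maintained is left open.

```c
#include <stddef.h>

#define MAX_SIBLINGS 8  /* hypothetical bound on hardware threads per core */

enum idle_method { IDLE_MWAIT_HLT, IDLE_ACPI };

/* Hypothetical per-CPU state: consecutive clock ticks spent idle (0 = busy). */
struct cpu_sketch {
    unsigned idle_ticks;
};

/* Hypothetical per-core state: the CPUs (hardware threads) sharing a core. */
struct core_sketch {
    size_t nsiblings;
    struct cpu_sketch sibling[MAX_SIBLINGS];
};

/*
 * Choose ACPI idle only if every CPU in the core has been idle for at
 * least n_ticks clock ticks; otherwise stay with MWAIT/HLT.
 */
enum idle_method
pick_idle_method(const struct core_sketch *core, unsigned n_ticks)
{
    for (size_t i = 0; i < core->nsiblings; i++) {
        if (core->sibling[i].idle_ticks < n_ticks)
            return IDLE_MWAIT_HLT;  /* a sibling was busy recently */
    }
    return IDLE_ACPI;
}
```

The point of checking all siblings is that a busy CPU on the same core
keeps the core on the cheaper MWAIT/HLT path. The value of N would need
tuning by experiment.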
Thanks,
Andrew