tech-kern archive
Re: RFC: NUMA support
On Mon, Nov 10, 2008 at 05:11:37PM +0200, Christoph Egger wrote:
>
> Hi!
>
> I started to work on NUMA support. First step is to set up the
> topology.
>
> The boot code does this by scanning the ACPI SRAT table.
> If no ACPI SRAT table is present, or if you boot with ACPI disabled,
> a single-node NUMA system is faked.
>
> The boot code also uses the ACPI MADT table to get more
> information where possible.
>
> I showed rmind@ my patch so far. I share his opinion that
> it needs some more thought on the MI API side.
> Nonetheless, it's a start.
>
> The next two items are to write a numactl(8) utility and
> to utilize the BIOS e820 memory map for more detailed
> and accurate information on the NUMA memory
> layout - the memory holes in particular.
>
> The dmesg snippet on a four-socket machine looks like this:
>
> NUMA: SRAT table found
> NUMA: SLIT table not found
> ioapic0 at mainbus0 apid 0
> ioapic1 at mainbus0 apid 1
> numa0 at mainbus0
> cpu0 at numa0 apic 4 (BP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu1 at numa0 apic 5 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu2 at numa0 apic 6 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu3 at numa0 apic 7 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> numa0: memory: 0x0 - 0xa0000 (0xa0000, physical, raw, raw)
> numa0: memory: 0x100000 - 0x40000000 (0x3ff00000, physical, raw, raw)
> numa1 at mainbus0
> cpu4 at numa1 apic 8 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu5 at numa1 apic 9 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu6 at numa1 apic 10 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu7 at numa1 apic 11 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> numa1: memory: 0x40000000 - 0x80000000 (0x40000000, physical, raw, raw)
> numa2 at mainbus0
> cpu8 at numa2 apic 12 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu9 at numa2 apic 13 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu10 at numa2 apic 14 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu11 at numa2 apic 15 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> numa2: memory: 0x80000000 - 0xc0000000 (0x40000000, physical, raw, raw)
> numa3 at mainbus0
> cpu12 at numa3 apic 16 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu13 at numa3 apic 17 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu14 at numa3 apic 18 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> cpu15 at numa3 apic 19 (AP): AMD 686-class, 2300 MHz, id 0x100f40
> numa3: memory: 0xc0000000 - 0xd8000000 (0x18000000, physical, raw, raw)
> numa3: memory: 0x100000000 - 0x128000000 (0x28000000, physical, raw, raw)
>
> I can also suspend and resume a full node:
>
> # cpuctl list
> Num  HwId Unbound LWPs Interrupts     Last change
> ---- ---- ------------ -------------- ----------------------------
> 0    0    online       intr           Wed Nov 5 01:55:47 2008
> 1    1    online       intr           Wed Nov 5 01:55:47 2008
> 2    2    online       intr           Wed Nov 5 01:55:47 2008
> 3    3    online       intr           Wed Nov 5 01:55:47 2008
> 4    4    online       intr           Wed Nov 5 01:55:47 2008
> 5    5    online       intr           Wed Nov 5 01:55:47 2008
> 6    6    online       intr           Wed Nov 5 01:55:47 2008
> 7    7    online       intr           Wed Nov 5 01:55:47 2008
> 8    8    online       intr           Wed Nov 5 01:55:47 2008
> 9    9    online       intr           Wed Nov 5 01:55:47 2008
> 10   a    online       intr           Wed Nov 5 01:55:47 2008
> 11   b    online       intr           Wed Nov 5 01:55:47 2008
> 12   c    online       intr           Wed Nov 5 01:55:47 2008
> 13   d    online       intr           Wed Nov 5 01:55:47 2008
> 14   e    online       intr           Wed Nov 5 01:55:47 2008
> 15   f    online       intr           Wed Nov 5 01:55:47 2008
>
> # drvctl -l mainbus0
> mainbus0 ioapic0
> mainbus0 ioapic1
> mainbus0 numa0
> mainbus0 numa1
> mainbus0 numa2
> mainbus0 numa3
> mainbus0 acpi0
> mainbus0 pci0
> mainbus0 pci8

Would pci0, pci8, and the other peripheral buses be more properly
attached to a NUMA node?

Currently, numa0..numaN are just aggregations of CPUs, at least as far
as pmf(9) is concerned. AFAICT, a NUMA node is a real physical entity
with RAM attached. Looking ahead, what will it mean for a NUMA node
to be suspended? Will the system vacate that node's RAM and turn off
DRAM refresh? What, for that matter, will it mean to detach a NUMA node?

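
To make the RAM question concrete, here is a minimal sketch (not taken
from the patch) of a per-node range table built from the dmesg lines
quoted above. The type `struct numa_range` and the functions
`numa_node_of()` and `numa_node_bytes()` are hypothetical names chosen for
illustration; the only assumption about the data is that each range's end
address is exclusive, which matches the sizes printed in parentheses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type for illustration; not from the patch. */
struct numa_range {
	int      node;   /* NUMA node number as printed in dmesg */
	uint64_t start;  /* first physical address in the range */
	uint64_t end;    /* one past the last address (exclusive) */
};

/* Ranges copied from the quoted dmesg output. */
static const struct numa_range numa_ranges[] = {
	{ 0, 0x0ULL,         0xa0000ULL     },
	{ 0, 0x100000ULL,    0x40000000ULL  },
	{ 1, 0x40000000ULL,  0x80000000ULL  },
	{ 2, 0x80000000ULL,  0xc0000000ULL  },
	{ 3, 0xc0000000ULL,  0xd8000000ULL  },
	{ 3, 0x100000000ULL, 0x128000000ULL },
};

#define NUMA_NRANGES (sizeof(numa_ranges) / sizeof(numa_ranges[0]))

/* Return the node owning physical address pa, or -1 if pa is in a hole. */
static int
numa_node_of(uint64_t pa)
{
	size_t i;

	for (i = 0; i < NUMA_NRANGES; i++) {
		if (pa >= numa_ranges[i].start && pa < numa_ranges[i].end)
			return numa_ranges[i].node;
	}
	return -1;
}

/* Total bytes of RAM that would have to be vacated to suspend a node. */
static uint64_t
numa_node_bytes(int node)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < NUMA_NRANGES; i++) {
		assert(numa_ranges[i].end > numa_ranges[i].start);
		if (numa_ranges[i].node == node)
			total += numa_ranges[i].end - numa_ranges[i].start;
	}
	return total;
}
```

With this table, suspending numa3 would mean vacating two disjoint
ranges, and an address such as 0xa0000 belongs to no node at all, which
is the kind of hole the e820 map is meant to describe more accurately.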
Dave
--
David Young OJC Technologies
dyoung%ojctech.com@localhost Urbana, IL * (217) 278-3933 ext 24