Port-xen archive
Re: dom0 kernel profiling on Xen
On Sun, Apr 03, 2016 at 08:45:28PM -0400, Thor Lancelot Simon wrote:
> On Sun, Apr 03, 2016 at 07:34:25PM -0400, Thor Lancelot Simon wrote:
> > Before I dig into this too much -- is this expected to work? I'm at wits'
> > end trying to track down the consumer of 50-80% "interrupt" time under I/O
> > load on my dom0 elsewise.
>
> Well, that was cool:
>
> # kgmon -r -b
> (XEN) Pagetable walk from ffffa0006050ffc8:
> (XEN) L4[0x140] = 000000120b3af027 00000000000013af
> (XEN) L3[0x001] = 00000010400fd027 000000000007fff8
> (XEN) L2[0x102] = 000000107c7d8027 0000000000011cc3
> (XEN) L1[0x10f] = 0000000000000000 ffffffffffffffff
> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080215bfe create_bounce_frame+0x66/0x13a
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-4.5.2 x86_64 debug=n Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e033:[<ffffffff8075f8c4>]
> (XEN) RFLAGS: 0000000000010296 EM: 1 CONTEXT: pv guest (d0v0)
> (XEN) rax: ffffffff813a8000 rbx: 0000000000000001 rcx: 0000000000000000
> (XEN) rdx: 0000000000000004 rsi: ffffffff8020aad2 rdi: ffffffff8075f962
> (XEN) rbp: ffffa00060510010 rsp: ffffa0006050ffd8 r8: 00007f8000000000
> (XEN) r9: 0000000000000000 r10: ffffa00003a555a0 r11: ffffa000605100a0
> (XEN) r12: ffffffff8020aad2 r13: ffffffff8075f962 r14: ffffffff80c8d020
> (XEN) r15: 0000000000000004 cr0: 000000008005003b cr4: 00000000000026f0
> (XEN) cr3: 000000120b37f000 cr2: ffffa0006050ffd8
> (XEN) ds: 003f es: 003f fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=ffffa0006050ffd8:
> (XEN) Fault while accessing guest memory.
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
I can reproduce this with a domU.
It looks like we run out of stack in mcount (note that cr2 in the dump above equals the guest rsp).
I found what looks like a loop:
mcount:
[...]
0xffffffff8049a2ca <+54>: callq 0xffffffff8049a2f3 <_mcount>
[...]
_mcount:
[...]
0xffffffff8049a335 <+66>: callq 0xffffffff80106478 <x86_lfence>
[...]
0xffffffff8049a341 <+78>: callq 0xffffffff8010c7c0 <__cpu_simple_lock>
[...]
0xffffffff8049a443 <+336>: callq 0xffffffff8010c7e0 <__cpu_simple_unlock>
[...]
0xffffffff8049a461 <+366>: jmpq 0xffffffff80106478 <x86_lfence>
x86_lfence:
0xffffffff80106478 <+0>: push %rbp
0xffffffff80106479 <+1>: lea (%rsp),%rbp
0xffffffff8010647d <+5>: callq 0xffffffff8049a294 <mcount>
0xffffffff80106482 <+10>: pop %rbp
0xffffffff80106483 <+11>: lfence
0xffffffff80106486 <+14>: retq
0xffffffff80106487 <+15>: nop
__cpu_simple_unlock:
0xffffffff8010c7e0 <+0>: push %rbp
0xffffffff8010c7e1 <+1>: lea (%rsp),%rbp
0xffffffff8010c7e5 <+5>: callq 0xffffffff8049a294 <mcount>
0xffffffff8010c7ea <+10>: pop %rbp
0xffffffff8010c7eb <+11>: movb $0x0,(%rdi)
0xffffffff8010c7ee <+14>: retq
It seems that __cpu_simple_lock doesn't call mcount.
Now I have to find out why this doesn't loop on bare metal. Any ideas before
I keep looking?
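
For illustration only, here is a minimal, self-contained C sketch of the kind
of loop above: a helper that is itself built with profiling re-enters mcount
on every call, and only a reentrancy guard (or leaving the helper
uninstrumented) stops the recursion before the stack runs out. The names
in_mcount and helper_fence, and the explicit mcount() calls standing in for
the compiler-inserted -pg instrumentation, are made up for the sketch and are
not the NetBSD implementation.

/*
 * Hypothetical sketch, not kernel code.  With -pg the compiler inserts a
 * call to mcount() at the entry of every instrumented function; the
 * explicit mcount() calls below stand in for that implicit call.
 */
#include <stdio.h>

static volatile int in_mcount;		/* would be a per-CPU flag in a kernel */

static void mcount(void);

static void
helper_fence(void)			/* stands in for a profiled low-level helper */
{
	mcount();			/* the compiler-inserted profiling call */
	/* ... the helper's real work would go here ... */
}

static void
mcount(void)
{
	if (in_mcount)			/* reentrancy guard: without this check */
		return;			/* the helper call below recurses forever */
	in_mcount = 1;

	helper_fence();			/* instrumented helper -> re-enters mcount */
	/* ... record the caller/callee arc here ... */

	in_mcount = 0;
}

int
main(void)
{
	mcount();			/* terminates only because of the guard */
	printf("no runaway recursion thanks to the guard\n");
	return 0;
}

Another common way out is to build such low-level helpers without -pg so they
never call mcount in the first place.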
--
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference