Port-amd64 archive
Re: 70,000 TLB shootdown IPIs per second
On Wed, Dec 05, 2012 at 08:29:23AM -0800, Chuck Silvers wrote:
>
>
> and the top few entries from that with a portion of your dd test are:
>
>
> netbsd`pmap_deactivate+0x3b
> netbsd`mi_switch+0x329
> netbsd`idle_loop+0xe0
> netbsd`0xffffffff80100817
> 4349
>
> netbsd`pmap_deactivate+0x3b
> netbsd`mi_switch+0x329
> netbsd`kpreempt+0xe2
> netbsd`0xffffffff80114295
> netbsd`ubc_uiomove+0x113
> netbsd`ffs_write+0x2c5
> netbsd`VOP_WRITE+0x37
> netbsd`vn_write+0xf9
> netbsd`dofilewrite+0x7d
> netbsd`sys_write+0x62
> netbsd`syscall+0x94
> netbsd`0xffffffff801006a1
> 6168
This is a little surprising.  A number of developers pointed me earlier
this morning at the code path from genfs{get,put}pages to ubc_pagermapout,
which calls pmap_kremove() on a range of pages (one I/O request, I'd assume)
and then pmap_update().
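In code form, that pattern would look roughly like this -- a sketch under my
assumptions only (the function name and arguments are mine, not the actual
ubc_pagermapout source):

	#include <sys/param.h>
	#include <uvm/uvm_extern.h>

	/*
	 * Unmap the temporary kernel mappings for one I/O request, then
	 * publish the removals with a single pmap_update() so the TLB
	 * invalidations can go out to the other CPUs as one batch/xcall.
	 */
	static void
	pagermapout_sketch(vaddr_t kva, u_int npages)
	{
		pmap_kremove(kva, (vsize_t)npages << PAGE_SHIFT);
		pmap_update(pmap_kernel());
	}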
This should cause the per-page TLB invalidations to be batched up 6 at a time
(or a full flush to happen if more than 6 are batched at once, as would in
fact be the case for, say, a 64K I/O) and then sent to the other CPUs via the
xcall issued by pmap_shootdown_tlbs from the pmap_update call -- or so I
would think.  But evidently not!  If I read the above correctly, we're being
preempted, and the shootdowns are coming from some pmap_update() other than
the one in pagermapout, no?
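To be concrete about the batching I'm describing above, this is roughly what
I have in mind -- a sketch only; the structure, names, and the constant are
my assumptions, not the real x86 pmap code:

	#include <sys/types.h>

	#define	TLB_SHOOT_MAXVA	6		/* assumed batch limit */

	struct tlb_shoot_batch {
		vaddr_t	tb_va[TLB_SHOOT_MAXVA];	/* queued page VAs */
		int	tb_count;		/* -1 means "full flush" */
	};

	static void
	tlb_shoot_enqueue(struct tlb_shoot_batch *tb, vaddr_t va)
	{
		if (tb->tb_count == -1)
			return;			/* already degraded to a full flush */
		if (tb->tb_count == TLB_SHOOT_MAXVA) {
			tb->tb_count = -1;	/* overflow: flush everything instead */
			return;
		}
		tb->tb_va[tb->tb_count++] = va;
	}

	/*
	 * pmap_update() would then send one xcall/IPI carrying either the
	 * queued VAs or the "flush everything" request to the other CPUs.
	 */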
In any event I am surprised, because at 300MB/sec I'd expect to see roughly
the 70,000 TLB IPIs I'm seeing only if no batching were happening at all;
I think I should be seeing at most 1/6 that many, and potentially 1/16
that many, since with a 64K I/O size we should do one full flush instead
of 16 single-page invalidations.
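To spell out the arithmetic behind those fractions (my numbers, assuming
4K pages and 64K I/Os):

	300 MB/s / 4 KB per page     ~= 76,800 page invalidations/s  (about the 70,000 IPIs observed)
	batched 6 pages per IPI      ~= 12,800 IPIs/s                (1/6)
	one full flush per 64K I/O   ~=  4,800 IPIs/s                (1/16, since 64K / 4K = 16 pages)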
What am I misunderstanding?
Thor