tech-kern archive
Re: event counting vs. the cache
On Thu, Jan 17, 2013 at 11:10:24PM +0000, David Laight wrote:
> On Thu, Jan 17, 2013 at 03:43:13PM -0600, David Young wrote:
> >
> > 2) Split all counters into two parts: high-order 32 bits, low-order 32
> > bits. It's only necessary to touch the high-order part when the
> > low-order part rolls over, so in effect you split the counters into
> > write-often (hot) and write-rarely (cold) parts. Cram together the
> > cold parts in cachelines. Cram together the hot parts in cachelines.
> > Only the hot parts change that often, so the ordinary footprint of
> > counters in the cache is cut almost in half.
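(The split proposed above might look something like this sketch; the
struct and function names, and the counter count, are all illustrative,
not from any tree:

```c
/* Hypothetical sketch of the hot/cold split: the low-order (hot)
 * halves are packed together and the high-order (cold) halves are
 * packed separately, so the frequently written cache lines cover
 * only half the data. */
#include <stdint.h>

#define NCOUNTERS 16

struct split_counters {
	uint32_t lo[NCOUNTERS];	/* hot: written on every event */
	uint32_t hi[NCOUNTERS];	/* cold: written only when lo[] wraps */
};

static void
counter_inc(struct split_counters *c, int i)
{
	if (++c->lo[i] == 0)	/* lo wrapped to 0: carry into hi */
		c->hi[i]++;
}
```

)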
>
> That means you have to have special code to read them in order to
> avoid getting 'silly' values.
We can end up with silly values under the status quo, too, can't we?
On 32-bit architectures like i386, x++ for a uint64_t x compiles to
something like

	addl	$0x1, x
	adcl	$0x0, x+4

If the addl carries, then reading x between the addl and the adcl will
show a silly value.
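Illustrative arithmetic mirroring that addl/adcl pair: a reader that
samples x between the two instructions sees the already-wrapped low
word paired with the stale high word. (The helper name is mine, just
for demonstration.)

```c
#include <stdint.h>

/* Value a concurrent reader would observe after the addl has
 * executed but before the adcl has applied the carry to the
 * high word. */
static uint64_t
torn_read(uint64_t x)
{
	uint32_t lo = (uint32_t)x + 1;		/* addl: low word, may wrap */
	uint32_t hi = (uint32_t)(x >> 32);	/* adcl not yet run: stale */

	return ((uint64_t)hi << 32) | lo;
}
```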
I think that you can avoid the silly values. Say you're using per-CPU
counters. If counter x belongs to CPU p, then avoid silly values by
reading x in a low-priority thread, t, that's bound to p and reads
hi(x), then lo(x), then hi(x) again. If hi(x) changed, then t was
preempted by a thread or an interrupt handler that wrapped lo(x), so t
has to restart the sequence.
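A sketch of that retry read, under the assumptions above (names are
hypothetical; the loop only terminates usefully because the reader is
bound to the owning CPU, so any preemption that wraps lo also bumps
hi before the reader resumes):

```c
#include <stdint.h>

struct split_counter {
	uint32_t lo;	/* hot half, incremented on every event */
	uint32_t hi;	/* cold half, bumped only when lo wraps */
};

/* Read hi, then lo, then hi again; if hi changed, the reader was
 * preempted across a wrap of lo, so it restarts. */
static uint64_t
counter_read(const struct split_counter *c)
{
	uint32_t hi0, lo, hi1;

	do {
		hi0 = c->hi;
		lo = c->lo;
		hi1 = c->hi;
	} while (hi0 != hi1);

	return ((uint64_t)hi1 << 32) | lo;
}
```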
Dave
--
David Young
dyoung%pobox.com@localhost Urbana, IL (217) 721-9981