Current-Users archive
Re: 9.99.69 panic - libcrypto changes?
Thank you for the detailed response (I can't claim to understand it
completely, of course). A saved kernel from 9.99.68 still lets me
work with the machine as before; I updated it yesterday and got
another - perhaps identical - panic when downloading mail with
Thunderbird:
panic: fpudna from userland, ip 0x7c16e87b95ca, trapframe 0xffffce01527ec000
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x152
snprintf() at netbsd:snprintf
fpu_set_default_cw() at netbsd:fpu_set_default_cw
cpu0: End traceback...
dumping to dev 168,15 (offset=8, size=4152523):
dump autoconfiguration error: ahcisata0 port 3: clearing WDCTL_RST failed for drive 0
WARNING: negative runtime; monotonic clock has gone backwards
wddump: device timed out
i/o error
rebooting...
(and no core dump, of course).
On Sat, 4 Jul 2020 at 21:19, Taylor R Campbell <riastradh%netbsd.org@localhost> wrote:
>
> > Date: Thu, 2 Jul 2020 23:09:16 +0100
> > From: Chavdar Ivanov <ci4ic4%gmail.com@localhost>
> >
> > On amd64 9.99.69 from yesterday I get:
> > [...]
> > System panicked: fpudna from kernel, ip 0xffffffff802292af, trapframe
> > 0xffffbe013c564a50
> > [...]
> > Xtrap07() at Xtrap07+0xbd
> > aesni_enc_impl() at aesni_enc_impl+0x1c
> > rijndaelEncrypt() at rijndaelEncrypt+0x4b
> > ccmp_init_blocks() at ccmp_init_blocks+0xe8
> > [...]
>
> I am investigating. There must be a bug somewhere in the x86 vector
> register state management I used to allow the kernel to use
> AES-NI, but I'm not yet sure what it is.
>
> > My WiFi link (iwm) is also visibly slower than usual.
> > [...]
> > happened while I was running 'pkgin upgrade' over an NFS mount through
> > the iwm adapter.
>
> This is likely an unintended side effect of my recent AES rework
> (https://mail-index.netbsd.org/tech-kern/2020/06/18/msg026505.html).
>
> For systems where we can take advantage of hardware AES support, like
> yours, after every call into the AES subsystem, the kernel will zero
> the vector registers to avoid leaking secrets through Spectre-class
> speculative execution attacks.
>
> Although your kernel is evidently now taking advantage of hardware
> support for AES (the x86 AES-NI CPU instructions), which is much
> faster than software AES, the logic in our 802.11 stack to compute
> CCMP (the authenticated cipher used in your WPA setup) calls the AES
> block cipher one block at a time.
>
> So it's zeroing all the vector registers for every 16 bytes of data in
> every frame -- twice, because AES-CCM involves two block cipher calls
> for every block of data (one for the AES-CBC-MAC authenticator, one
> for the AES-CTR encryption pad). I expect this is the source of the
> slowdown you're witnessing.
>
>
> There are a few ways we could work around this:
>
> 1. Push the AES-CCM computation into the AES subsystem, so we only
> zero the vector registers once per frame, or once per mbuf segment.
> This requires a bit of work but if I can find CCMP test vectors
> then it shouldn't be too hard. At worst, it will require redoing
> when the wifi branch is merged.
>
> 2. Push ieee80211_crypto_* into a worker thread, and use
> <https://mail-index.netbsd.org/tech-kern/2020/06/20/msg026524.html>
> to avoid zeroing the vector registers. However, this may require
> some design changes in the 802.11 stack and it's not clear that
> they're the right changes or that this can be done quickly.
>
> 3. Invent a new nestable transaction mechanism to defer zeroing the
> vector registers. However, there might also be a penalty to
> enabling or disabling the fpu, so it might not solve the whole
> problem, and it is not entirely clear what it should mean in an MI
> context.
>
> Another approach, of course, is to simply use an open wifi network
> instead -- generally hop-by-hop authenticated encryption like WPA is
> not worth much compared to end-to-end authenticated encryption like
> TLS, SSH, or Wireguard.
Chavdar
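
As a rough illustration of the per-block cost described in the quoted
message, and of how a nestable deferral (option 3) would change it, here
is a minimal C sketch. Nothing in it is NetBSD code: vec_begin, vec_end,
aes_block_call and ccm_frame are hypothetical names, no real cryptography
is performed, and the count ignores the extra CCM blocks for the header
and the tag. It only counts how often the vector registers would be
zeroed for one frame payload.

```c
#include <stddef.h>

#define AES_BLOCK 16

static unsigned zero_count;	/* times the vector registers were cleared */
static unsigned depth;		/* nesting depth of the hypothetical transaction */

/* Hypothetical: enter a section that may use the vector registers. */
static void
vec_begin(void)
{
	depth++;
}

/* Hypothetical: leave the section; clear only at the outermost exit. */
static void
vec_end(void)
{
	if (--depth == 0)
		zero_count++;	/* stands in for zeroing the vector registers */
}

/* Stand-in for one AES block cipher call. */
static void
aes_block_call(void)
{
	vec_begin();
	/* ... the AES-NI work would happen here ... */
	vec_end();
}

/*
 * Simplified CCM over a payload of len bytes: one CBC-MAC call and one
 * CTR call per 16-byte block.  If batch is nonzero, the whole frame is
 * wrapped in one outer transaction so zeroing is deferred to the end.
 * Returns the number of times the registers were zeroed.
 */
unsigned
ccm_frame(size_t len, int batch)
{
	size_t nblocks = (len + AES_BLOCK - 1) / AES_BLOCK;
	size_t i;

	zero_count = 0;
	if (batch)
		vec_begin();
	for (i = 0; i < nblocks; i++) {
		aes_block_call();	/* CBC-MAC authenticator */
		aes_block_call();	/* CTR encryption pad */
	}
	if (batch)
		vec_end();
	return zero_count;
}
```

Under these assumptions, ccm_frame(1500, 0) returns 188 (94 blocks, two
calls each), while ccm_frame(1500, 1) returns 1. Whether enabling and
disabling the FPU carries its own penalty, as the quoted message notes,
is not modelled here.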