[src-draft/trunk]: src/sys New cgd cipher adiantum.

To: source-changes-hg%NetBSD.org@localhost
Subject: [src-draft/trunk]: src/sys New cgd cipher adiantum.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Date: Wed, 17 Jun 2020 02:54:57 +0000
details:   https://anonhg.NetBSD.org/src-all/rev/90cc7c0921a0
branches:  trunk
changeset: 934719:90cc7c0921a0
user:      Taylor R Campbell <riastradh%NetBSD.org@localhost>
date:      Wed Jun 17 02:47:43 2020 +0000

description:
New cgd cipher adiantum.

Adiantum is a wide-block cipher, built out of AES, XChaCha12,
Poly1305, and NH, defined in

   Paul Crowley and Eric Biggers, `Adiantum: length-preserving
   encryption for entry-level processors', IACR Transactions on
   Symmetric Cryptology 2018(4), pp. 39--61.

Adiantum provides better security than a narrow-block cipher with CBC
or XTS, because every bit of each sector affects every other bit,
whereas with CBC each block of plaintext only affects the following
blocks of ciphertext in the disk sector, and with XTS each block of
plaintext only affects its own block of ciphertext and nothing else.
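
To make the diffusion claim concrete, here is a minimal model in C of
which ciphertext blocks can change when one plaintext block of a
sector changes.  It is only an illustration of the paragraph above,
not code from this commit: the names (enum mode, ct_block_affected,
count_affected) are hypothetical, and it models the dependency
structure only, not any actual cipher.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three constructions being compared. */
enum mode { MODE_CBC, MODE_XTS, MODE_WIDE };

/*
 * Can a change to plaintext block `changed' alter ciphertext block
 * `j' of the same sector?  (Dependency model only, no real crypto.)
 */
static bool
ct_block_affected(enum mode mode, size_t changed, size_t j)
{

	switch (mode) {
	case MODE_CBC:		/* chaining: this block and later ones */
		return j >= changed;
	case MODE_XTS:		/* each block is independent */
		return j == changed;
	case MODE_WIDE:		/* whole sector is one block */
		return true;
	}
	return false;
}

/* Number of ciphertext blocks in the sector that can change. */
static size_t
count_affected(enum mode mode, size_t changed, size_t nblocks)
{
	size_t j, n = 0;

	for (j = 0; j < nblocks; j++) {
		if (ct_block_affected(mode, changed, j))
			n++;
	}
	return n;
}
```

For a 512-byte sector viewed as 32 blocks of 16 bytes, changing block
10 can alter 22 ciphertext blocks under CBC, 1 under XTS, and all 32
under a wide-block cipher.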

Adiantum generally provides much better performance than
constant-time AES-CBC or AES-XTS software does without hardware
support, and performance comparable to or better than the
variable-time (i.e., leaky) AES-CBC and AES-XTS software we had
before.  (Note: Adiantum also uses AES as a subroutine, but only once
per disk sector.  It takes only a small fraction of the time spent by
Adiantum, so there's relatively little performance impact to using
constant-time AES software over using variable-time AES software for
it.)

Adiantum naturally scales to essentially arbitrary disk sector sizes;
sizes >=1024 bytes take the most advantage of Adiantum's design for
performance, so 4096-byte sectors would be a natural choice if we
taught cgd to change the disk sector size.  (However, it's a
different cipher for each disk sector size, so it _must_ be a cgd
parameter.)

The paper presents a similar construction HPolyC.  The salient
difference is that HPolyC uses Poly1305 directly, whereas Adiantum
uses Poly1305(NH(...)).  NH is annoying because it requires a
1072-byte key, which means the test vectors are ginormous, and
changing keys is costly; HPolyC avoids these shortcomings by using
Poly1305 directly, but HPolyC is measurably slower, costing about
1.5x what Adiantum costs on 4096-byte sectors.
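
As a rough illustration of where the 1072-byte key comes from, here
is a minimal sketch of NH in C.  It assumes the parameters given in
the paper as I understand them (messages of up to 1024 bytes,
consumed in 16-byte units, hashed in 4 passes with the key advanced
by 16 bytes per pass), which gives 1024 + 16*(4 - 1) = 1072 bytes of
key.  The names (nh_sketch, load_le32, the NH_* macros) are
hypothetical, and this is not the implementation in adiantum.c.

```c
#include <stddef.h>
#include <stdint.h>

#define	NH_UNIT_BYTES	16	/* message bytes consumed per step */
#define	NH_NPASSES	4	/* independent 64-bit sums */
#define	NH_MSG_BYTES	1024	/* maximum message length */
#define	NH_KEY_BYTES	(NH_MSG_BYTES + NH_UNIT_BYTES*(NH_NPASSES - 1))
#define	NH_KEY_WORDS	(NH_KEY_BYTES/4)

/* Load a 32-bit little-endian word. */
static uint32_t
load_le32(const uint8_t *p)
{

	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * h[pass] := sum over 16-byte units of
 *	(m0 + k0)*(m2 + k2) + (m1 + k1)*(m3 + k3)	(mod 2^64),
 * with each addition taken mod 2^32.  mlen must be a multiple of
 * 16 and at most NH_MSG_BYTES.
 */
static void
nh_sketch(uint64_t h[NH_NPASSES], const uint32_t key[NH_KEY_WORDS],
    const uint8_t *m, size_t mlen)
{
	size_t off;
	unsigned pass;

	for (pass = 0; pass < NH_NPASSES; pass++)
		h[pass] = 0;

	for (off = 0; off + NH_UNIT_BYTES <= mlen; off += NH_UNIT_BYTES) {
		uint32_t m0 = load_le32(m + off + 0);
		uint32_t m1 = load_le32(m + off + 4);
		uint32_t m2 = load_le32(m + off + 8);
		uint32_t m3 = load_le32(m + off + 12);

		for (pass = 0; pass < NH_NPASSES; pass++) {
			/* each pass reads the key 4 words further along */
			const uint32_t *k = key + off/4 + 4*pass;

			h[pass] += (uint64_t)(uint32_t)(m0 + k[0]) *
			    (uint32_t)(m2 + k[2]);
			h[pass] += (uint64_t)(uint32_t)(m1 + k[1]) *
			    (uint32_t)(m3 + k[3]);
		}
	}
}
```

The last unit of a 1024-byte message starts at key word 252 in pass
0 and at key word 264 in pass 3, so the final word read is key word
267, the last of the 268 words (1072 bytes).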

For the purposes of cgd, we will reuse each key for many messages,
and there will be very few keys in total (one per cgd volume) so --
except for the annoying verbosity of test vectors -- the tradeoff
weighs in the favour of Adiantum, especially if we teach cgd to do
>>512-byte sectors.

For now, everything that Adiantum needs beyond what's already in the
kernel is gathered into a single file, including NH, Poly1305, and
XChaCha12.  We can split those out -- and reuse them, and provide MD
tuned implementations, and so on -- as needed; this is just a first
pass to get Adiantum implemented for experimentation.

diffstat:

 sys/conf/files                          |     3 +-
 sys/crypto/adiantum/adiantum.c          |  2316 +++++++++++++++++++++++++++++++
 sys/crypto/adiantum/adiantum_selftest.c |  1835 ++++++++++++++++++++++++
 sys/crypto/adiantum/files.adiantum      |     6 +
 sys/dev/cgd_crypto.c                    |    69 +
 5 files changed, 4228 insertions(+), 1 deletions(-)

diffs (truncated from 4286 to 300 lines):

diff -r fea895ead7aa -r 90cc7c0921a0 sys/conf/files
--- a/sys/conf/files    Mon Jun 15 22:55:59 2020 +0000
+++ b/sys/conf/files    Wed Jun 17 02:47:43 2020 +0000
@@ -200,6 +200,7 @@
 # use it.
 
 # Individual crypto transforms
+include "crypto/adiantum/files.adiantum"
 include "crypto/aes/files.aes"
 include "crypto/des/files.des"
 include "crypto/blowfish/files.blowfish"
@@ -1395,7 +1396,7 @@
 defpseudodev vnd:      disk
 defflag opt_vnd.h      VND_COMPRESSION
 defpseudo ccd:         disk
-defpseudodev cgd:      disk, des, blowfish, cast128, aes
+defpseudodev cgd:      disk, des, blowfish, cast128, aes, adiantum
 defpseudodev md:       disk
 defpseudodev fss:      disk
 
diff -r fea895ead7aa -r 90cc7c0921a0 sys/crypto/adiantum/adiantum.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/sys/crypto/adiantum/adiantum.c    Wed Jun 17 02:47:43 2020 +0000
@@ -0,0 +1,2316 @@
+/*     $NetBSD$        */
+
+/*-
+ * Copyright (c) 2020 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * The Adiantum wide-block cipher, from
+ *
+ *     Paul Crowley and Eric Biggers, `Adiantum: length-preserving
+ *     encryption for entry-level processors', IACR Transactions on
+ *     Symmetric Cryptology 2018(4), pp. 39--61.
+ *
+ *     https://doi.org/10.13154/tosc.v2018.i4.39-61
+ */
+
+#include <sys/cdefs.h>
+__KERNEL_RCSID(1, "$NetBSD$");
+
+#include <sys/types.h>
+#include <sys/endian.h>
+
+#ifdef _KERNEL
+
+#include <sys/module.h>
+#include <sys/systm.h>
+
+#include <lib/libkern/libkern.h>
+
+#include <crypto/adiantum/adiantum.h>
+#include <crypto/aes/aes.h>
+
+#else  /* !defined(_KERNEL) */
+
+#include <sys/cdefs.h>
+
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <openssl/aes.h>
+
+struct aesenc {
+       AES_KEY enckey;
+};
+
+struct aesdec {
+       AES_KEY deckey;
+};
+
+#define        AES_256_NROUNDS 14
+#define        aes_setenckey256(E, K)  AES_set_encrypt_key((K), 256, &(E)->enckey)
+#define        aes_setdeckey256(D, K)  AES_set_decrypt_key((K), 256, &(D)->deckey)
+#define        aes_enc(E, P, C, NR)    AES_encrypt(P, C, &(E)->enckey)
+#define        aes_dec(D, C, P, NR)    AES_decrypt(C, P, &(D)->deckey)
+
+#include "adiantum.h"
+
+#define        CTASSERT        __CTASSERT
+#define        KASSERT         assert
+#define        MIN(x,y)        ((x) < (y) ? (x) : (y))
+
+static void
+hexdump(int (*prf)(const char *, ...) __printflike(1,2), const char *prefix,
+    const void *buf, size_t len)
+{
+       const uint8_t *p = buf;
+       size_t i;
+
+       (*prf)("%s (%zu bytes)\n", prefix, len);
+       for (i = 0; i < len; i++) {
+               if (i % 16 == 8)
+                       (*prf)("  ");
+               else
+                       (*prf)(" ");
+               (*prf)("%02hhx", p[i]);
+               if ((i + 1) % 16 == 0)
+                       (*prf)("\n");
+       }
+       if (i % 16)
+               (*prf)("\n");
+}
+
+#endif /* _KERNEL */
+
+/* Arithmetic modulo 2^128, represented by 16-digit strings in radix 2^8.  */
+
+/* s := a + b (mod 2^128) */
+static inline void
+add128(uint8_t s[restrict static 16],
+    const uint8_t a[static 16], const uint8_t b[static 16])
+{
+       unsigned i, c;
+
+       c = 0;
+       for (i = 0; i < 16; i++) {
+               c = a[i] + b[i] + c;
+               s[i] = c & 0xff;
+               c >>= 8;
+       }
+}
+
+/* d := a - b (mod 2^128) */
+static inline void
+sub128(uint8_t d[restrict static 16],
+    const uint8_t a[static 16], const uint8_t b[static 16])
+{
+       unsigned i, c;
+
+       c = 0;
+       for (i = 0; i < 16; i++) {
+               c = a[i] - b[i] - c;
+               d[i] = c & 0xff;
+               c = 1 & (c >> 8);
+       }
+}
+
+static int
+addsub128_selftest(void)
+{
+       static const uint8_t zero[16] = {
+               0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+               0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+       };
+       static const uint8_t one[16] = {
+               0x01,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+               0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+       };
+       static const uint8_t negativeone[16] = {
+               0xff,0xff,0xff,0xff, 0xff,0xff,0xff,0xff,
+               0xff,0xff,0xff,0xff, 0xff,0xff,0xff,0xff,
+       };
+       static const uint8_t a[16] = {
+               0x03,0x80,0x00,0x00, 0x00,0x00,0x00,0x00,
+               0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+       };
+       static const uint8_t b[16] = {
+               0x01,0x82,0x00,0x00, 0x00,0x00,0x00,0x00,
+               0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
+       };
+       static const uint8_t c[16] = {
+               0x02,0xfe,0xff,0xff, 0xff,0xff,0xff,0xff,
+               0xff,0xff,0xff,0xff, 0xff,0xff,0xff,0xff,
+       };
+       uint8_t r[16];
+       int result = 0;
+
+       sub128(r, zero, one);
+       if (memcmp(r, negativeone, 16)) {
+               hexdump(printf, "sub128 1", r, sizeof r);
+               result = -1;
+       }
+
+       sub128(r, a, b);
+       if (memcmp(r, c, 16)) {
+               hexdump(printf, "sub128 2", r, sizeof r);
+               result = -1;
+       }
+
+       return result;
+}
+
+/* Poly1305 */
+
+struct poly1305 {
+       uint32_t r[5];          /* evaluation point */
+       uint32_t h[5];          /* value */
+};
+
+static void
+poly1305_init(struct poly1305 *P, const uint8_t key[static 16])
+{
+
+       /* clamp */
+       P->r[0] = (le32dec(key +  0) >> 0) & 0x03ffffff;
+       P->r[1] = (le32dec(key +  3) >> 2) & 0x03ffff03;
+       P->r[2] = (le32dec(key +  6) >> 4) & 0x03ffc0ff;
+       P->r[3] = (le32dec(key +  9) >> 6) & 0x03f03fff;
+       P->r[4] = (le32dec(key + 12) >> 8) & 0x000fffff;
+
+       /* initialize polynomial evaluation */
+       P->h[0] = P->h[1] = P->h[2] = P->h[3] = P->h[4] = 0;
+}
+
+static void
+poly1305_update_internal(struct poly1305 *P, const uint8_t m[static 16],
+    uint32_t pad)
+{
+       uint32_t r0 = P->r[0];
+       uint32_t r1 = P->r[1];
+       uint32_t r2 = P->r[2];
+       uint32_t r3 = P->r[3];
+       uint32_t r4 = P->r[4];
+       uint32_t h0 = P->h[0];
+       uint32_t h1 = P->h[1];
+       uint32_t h2 = P->h[2];
+       uint32_t h3 = P->h[3];
+       uint32_t h4 = P->h[4];
+       uint64_t k0, k1, k2, k3, k4; /* 64-bit extension of h */
+       uint64_t p0, p1, p2, p3, p4; /* columns of product */
+       uint32_t c;                  /* carry */
+
+       /* h' := h + m */
+       h0 += (le32dec(m +  0) >> 0) & 0x03ffffff;
+       h1 += (le32dec(m +  3) >> 2) & 0x03ffffff;
+       h2 += (le32dec(m +  6) >> 4) & 0x03ffffff;
+       h3 += (le32dec(m +  9) >> 6);
+       h4 += (le32dec(m + 12) >> 8) | (pad << 24);
+
+       /* extend to 64 bits */
+       k0 = h0;
+       k1 = h1;
+       k2 = h2;
+       k3 = h3;
+       k4 = h4;
+
+       /* p := h' * r = (h + m)*r mod 2^130 - 5 */
+       p0 = r0*k0 + 5*r4*k1 + 5*r3*k2 + 5*r2*k3 + 5*r1*k4;
+       p1 = r1*k0 +   r0*k1 + 5*r4*k2 + 5*r3*k3 + 5*r2*k4;
+       p2 = r2*k0 +   r1*k1 +   r0*k2 + 5*r4*k3 + 5*r3*k4;
+       p3 = r3*k0 +   r2*k1 +   r1*k2 +   r0*k3 + 5*r4*k4;
+       p4 = r4*k0 +   r3*k1 +   r2*k2 +   r1*k3 +   r0*k4;
+
+       /* propagate carries */
+       p0 += 0; c = p0 >> 26; h0 = p0 & 0x03ffffff;
+       p1 += c; c = p1 >> 26; h1 = p1 & 0x03ffffff;
+       p2 += c; c = p2 >> 26; h2 = p2 & 0x03ffffff;
+       p3 += c; c = p3 >> 26; h3 = p3 & 0x03ffffff;
+       p4 += c; c = p4 >> 26; h4 = p4 & 0x03ffffff;
+
+       /* reduce 2^130 = 5 */
+       h0 += c*5; c = h0 >> 26; h0 &= 0x03ffffff;
+       h1 += c;
+
+       /* update hash values */
+       P->h[0] = h0;
+       P->h[1] = h1;
+       P->h[2] = h2;
+       P->h[3] = h3;
+       P->h[4] = h4;
+}
+
+static void
+poly1305_update_block(struct poly1305 *P, const uint8_t m[static 16])
+{
+
+       poly1305_update_internal(P, m, 1);
+}
+
+static void
+poly1305_update_last(struct poly1305 *P, const uint8_t *m, size_t mlen)
+{
+       uint8_t buf[16];