Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
for ARM64.  This is ported from the 32-bit version.  It may be useful on
devices with 64-bit ARM CPUs that don't have the Cryptography
Extensions, so cannot do AES efficiently -- e.g. the Cortex-A53
processor on the Raspberry Pi 3.

It generally works the same way as the 32-bit version, but there are
some slight differences due to the different instructions, registers,
and syntax available in ARM64 vs. in ARM32.  For example, in the 64-bit
version there are enough registers to hold the XTS tweaks for each
128-byte chunk, so they don't need to be saved on the stack.

Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:

    Algorithm                        Encryption    Decryption
    ---------                        ----------    ----------
    Speck64/128-XTS (NEON)            92.2 MB/s     92.2 MB/s
    Speck128/256-XTS (NEON)           75.0 MB/s     75.0 MB/s
    Speck128/256-XTS (generic)        47.4 MB/s     35.6 MB/s
    AES-128-XTS (NEON bit-sliced)     33.4 MB/s     29.6 MB/s
    AES-256-XTS (NEON bit-sliced)     24.6 MB/s     21.7 MB/s

The code performs well on higher-end ARM64 processors as well, though
such processors tend to have the Crypto Extensions which make AES
preferred.  For example, here are the same benchmarks run on a HiKey960
(with CPU affinity set for the A73 cores), with the Crypto Extensions
implementation of AES-256-XTS added:

    Algorithm                          Encryption     Decryption
    ---------                          -----------    -----------
    AES-256-XTS (Crypto Extensions)    1273.3 MB/s    1274.7 MB/s
    Speck64/128-XTS (NEON)              359.8 MB/s     348.0 MB/s
    Speck128/256-XTS (NEON)             292.5 MB/s     286.1 MB/s
    Speck128/256-XTS (generic)          186.3 MB/s     181.8 MB/s
    AES-128-XTS (NEON bit-sliced)       142.0 MB/s     124.3 MB/s
    AES-256-XTS (NEON bit-sliced)       104.7 MB/s      91.1 MB/s

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

(cherry picked from commit 91a2abb78f940ac821345cb7cc376dca94336c2f
 git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master)
(changed speck-neon-glue.c to use blkcipher API instead of skcipher API)
(resolved merge conflicts in arch/arm64/crypto/Makefile and
 arch/arm64/crypto/Kconfig)
(made CONFIG_CRYPTO_SPECK_NEON select CONFIG_CRYPTO_GF128MUL, since
 gf128mul_x_ble() is non-inline in older kernels)

Change-Id: Iaed7a14c84b32b09ec299060a5d27060693043d5
Signed-off-by: Eric Biggers <ebiggers@google.com>
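
The commit message mentions holding the XTS tweaks for a 128-byte chunk in registers and relying on gf128mul_x_ble() for the tweak update. The following is a minimal C sketch of that idea for the Speck128 case (16-byte blocks, so 8 tweaks per 128-byte chunk). It is not the code from this commit: struct tweak128, tweak_mul_x() and chunk_tweaks() are hypothetical names chosen for illustration, and the sketch assumes the usual little-endian XTS convention in which the tweak is multiplied by x in GF(2^128) and reduced with the constant 0x87.

```c
#include <stdint.h>

/*
 * Illustrative only: one 128-bit XTS tweak held as two 64-bit halves,
 * with 'lo' carrying bits 0..63 and 'hi' carrying bits 64..127.
 * The type and function names here are hypothetical.
 */
struct tweak128 {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Multiply a tweak by x in GF(2^128). A bit shifted out of the top is
 * folded back in as 0x87, i.e. reduction by x^128 + x^7 + x^2 + x + 1.
 */
struct tweak128 tweak_mul_x(struct tweak128 t)
{
	uint64_t carry = t.hi >> 63;
	struct tweak128 r;

	r.hi = (t.hi << 1) | (t.lo >> 63);
	r.lo = (t.lo << 1) ^ (carry ? 0x87 : 0);
	return r;
}

/* A 128-byte chunk of 16-byte Speck128 blocks needs 8 tweaks. */
#define TWEAKS_PER_CHUNK 8

/*
 * Derive all tweaks for one chunk up front. In the assembly these would
 * live in NEON registers; here they are simply written to an array.
 */
void chunk_tweaks(struct tweak128 first,
		  struct tweak128 out[TWEAKS_PER_CHUNK])
{
	int i;

	out[0] = first;
	for (i = 1; i < TWEAKS_PER_CHUNK; i++)
		out[i] = tweak_mul_x(out[i - 1]);
}
```

A caller would pass the encrypted-IV tweak as 'first', XOR out[i] into block i before and after the block cipher, and carry tweak_mul_x(out[7]) forward as the first tweak of the next chunk. Speck64-XTS would need an analogous 64-bit variant, which this sketch does not cover.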
48 lines
1.5 KiB
Makefile
#
# linux/arch/arm64/crypto/Makefile
#
# Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#

obj-$(CONFIG_CRYPTO_SHA1_ARM64_CE) += sha1-ce.o
sha1-ce-y := sha1-ce-glue.o sha1-ce-core.o

obj-$(CONFIG_CRYPTO_SHA2_ARM64_CE) += sha2-ce.o
sha2-ce-y := sha2-ce-glue.o sha2-ce-core.o

obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o

obj-$(CONFIG_CRYPTO_POLY_HASH_ARM64_CE) += poly-hash-ce.o
poly-hash-ce-y := poly-hash-ce-glue.o poly-hash-ce-core.o

obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
CFLAGS_aes-ce-cipher.o += -march=armv8-a+crypto

obj-$(CONFIG_CRYPTO_AES_ARM64_CE_CCM) += aes-ce-ccm.o
aes-ce-ccm-y := aes-ce-ccm-glue.o aes-ce-ccm-core.o

obj-$(CONFIG_CRYPTO_AES_ARM64_CE_BLK) += aes-ce-blk.o
aes-ce-blk-y := aes-glue-ce.o aes-ce.o

obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
aes-neon-blk-y := aes-glue-neon.o aes-neon.o

obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
speck-neon-y := speck-neon-core.o speck-neon-glue.o

AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4

CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS

obj-$(CONFIG_CRYPTO_CRC32_ARM64) += crc32-arm64.o

CFLAGS_crc32-arm64.o := -mcpu=generic+crc

$(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE
	$(call if_changed_rule,cc_o_c)