Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
for ARM64. This is ported from the 32-bit version. It may be useful on
devices with 64-bit ARM CPUs that don't have the Cryptography
Extensions, so cannot do AES efficiently -- e.g. the Cortex-A53
processor on the Raspberry Pi 3.
It generally works the same way as the 32-bit version, but there are
some slight differences due to the different instructions, registers,
and syntax available in ARM64 vs. in ARM32. For example, in the 64-bit
version there are enough registers to hold the XTS tweaks for each
128-byte chunk, so they don't need to be saved on the stack.
Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:
Algorithm Encryption Decryption
--------- ---------- ----------
Speck64/128-XTS (NEON) 92.2 MB/s 92.2 MB/s
Speck128/256-XTS (NEON) 75.0 MB/s 75.0 MB/s
Speck128/256-XTS (generic) 47.4 MB/s 35.6 MB/s
AES-128-XTS (NEON bit-sliced) 33.4 MB/s 29.6 MB/s
AES-256-XTS (NEON bit-sliced) 24.6 MB/s 21.7 MB/s
The code performs well on higher-end ARM64 processors as well, though
such processors tend to have the Crypto Extensions which make AES
preferred. For example, here are the same benchmarks run on a HiKey960
(with CPU affinity set for the A73 cores), with the Crypto Extensions
implementation of AES-256-XTS added:
Algorithm Encryption Decryption
--------- ----------- -----------
AES-256-XTS (Crypto Extensions) 1273.3 MB/s 1274.7 MB/s
Speck64/128-XTS (NEON) 359.8 MB/s 348.0 MB/s
Speck128/256-XTS (NEON) 292.5 MB/s 286.1 MB/s
Speck128/256-XTS (generic) 186.3 MB/s 181.8 MB/s
AES-128-XTS (NEON bit-sliced) 142.0 MB/s 124.3 MB/s
AES-256-XTS (NEON bit-sliced) 104.7 MB/s 91.1 MB/s
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 91a2abb78f940ac821345cb7cc376dca94336c2f
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master)
(changed speck-neon-glue.c to use blkcipher API instead of skcipher API)
(resolved merge conflicts in arch/arm64/crypto/Makefile and
arch/arm64/crypto/Kconfig)
(made CONFIG_CRYPTO_SPECK_NEON select CONFIG_CRYPTO_GF128MUL, since
gf128mul_x_ble() is non-inline in older kernels)
Change-Id: Iaed7a14c84b32b09ec299060a5d27060693043d5
Signed-off-by: Eric Biggers <ebiggers@google.com>
poly_hash is part of the HEH (Hash-Encrypt-Hash) encryption mode,
proposed in Internet Draft
https://tools.ietf.org/html/draft-cope-heh-01. poly_hash is very
similar to GHASH; besides the swapping of the last two coefficients
which we opted to handle in the HEH template, poly_hash just uses a
different finite field representation. As with GHASH, poly_hash becomes
much faster and more secure against timing attacks when implemented
using carryless multiplication instructions instead of tables. This
patch adds an ARMv8-CE optimized version of poly_hash, based roughly on
the existing ARMv8-CE optimized version of GHASH.
Benchmark results are shown below, but note that the resistance to
timing attacks may be even more important than the performance gain.
poly_hash only:
poly_hash-generic:
1,000,000 setkey() takes 1185 ms
hashing is 328 MB/s
poly_hash-ce:
1,000,000 setkey() takes 8 ms
hashing is 1756 MB/s
heh(aes) with 4096-byte inputs (this is the ideal case, as the
improvement is less significant with smaller inputs):
encryption with "heh_base(cmac(aes-ce),poly_hash-generic,ecb-aes-ce)": 118 MB/s
decryption with "heh_base(cmac(aes-ce),poly_hash-generic,ecb-aes-ce)": 120 MB/s
encryption with "heh_base(cmac(aes-ce),poly_hash-ce,ecb-aes-ce)": 291 MB/s
decryption with "heh_base(cmac(aes-ce),poly_hash-ce,ecb-aes-ce)": 293 MB/s
Bug: 32508661
Signed-off-by: Eric Biggers <ebiggers@google.com>
Change-Id: I621ec0e1115df7e6f5cbd7e864a4a9d8d2e94cf2
Pull crypto update from Herbert Xu:
- The crypto API is now documented :)
- Disallow arbitrary module loading through crypto API.
- Allow get request with empty driver name through crypto_user.
- Allow speed testing of arbitrary hash functions.
- Add caam support for ctr(aes), gcm(aes) and their derivatives.
- nx now supports concurrent hashing properly.
- Add sahara support for SHA1/256.
- Add ARM64 version of CRC32.
- Misc fixes.
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits)
crypto: tcrypt - Allow speed testing of arbitrary hash functions
crypto: af_alg - add user space interface for AEAD
crypto: qat - fix problem with coalescing enable logic
crypto: sahara - add support for SHA1/256
crypto: sahara - replace tasklets with kthread
crypto: sahara - add support for i.MX53
crypto: sahara - fix spinlock initialization
crypto: arm - replace memset by memzero_explicit
crypto: powerpc - replace memset by memzero_explicit
crypto: sha - replace memset by memzero_explicit
crypto: sparc - replace memset by memzero_explicit
crypto: algif_skcipher - initialize upon init request
crypto: algif_skcipher - removed unneeded code
crypto: algif_skcipher - Fixed blocking recvmsg
crypto: drbg - use memzero_explicit() for clearing sensitive data
crypto: drbg - use MODULE_ALIAS_CRYPTO
crypto: include crypto- module prefix in template
crypto: user - add MODULE_ALIAS
crypto: sha-mb - remove a bogus NULL check
crytpo: qat - Fix 64 bytes requests
...
This module registers a crc32 algorithm and a crc32c algorithm
that use the optional CRC32 and CRC32C instructions in ARMv8.
Tested on AMD Seattle.
Improvement compared to crc32c-generic algorithm:
TCRYPT CRC32C speed test shows ~450% speedup.
Simple dd write tests to btrfs filesystem show ~30% speedup.
Signed-off-by: Yazen Ghannam <yazen.ghannam@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements the AES key schedule generation using ARMv8
Crypto Instructions. It replaces the table based C implementation
in aes_generic.ko, which means we can drop the dependency on that
module.
Tested-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This adds ARMv8 implementations of AES in ECB, CBC, CTR and XTS modes,
both for ARMv8 with Crypto Extensions and for plain ARMv8 NEON.
The Crypto Extensions version can only run on ARMv8 implementations that
have support for these optional extensions.
The plain NEON version is a table based yet time invariant implementation.
All S-box substitutions are performed in parallel, leveraging the wide range
of ARMv8's tbl/tbx instructions, and the huge NEON register file, which can
comfortably hold the entire S-box and still have room to spare for doing the
actual computations.
The key expansion routines were borrowed from aes_generic.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the AES-CCM encryption algorithm for CPUs that
have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the AES symmetric encryption algorithm for CPUs
that have support for the AES part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
This is a port to ARMv8 (Crypto Extensions) of the Intel implementation of the
GHASH Secure Hash (used in the Galois/Counter chaining mode). It relies on the
optional PMULL/PMULL2 instruction (polynomial multiply long, what Intel call
carry-less multiply).
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the SHA-224 and SHA-256 Secure Hash Algorithms
for CPUs that have support for the SHA-2 part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the SHA-1 Secure Hash Algorithm for CPUs that
have support for the SHA-1 part of the ARM v8 Crypto Extensions.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>