Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 19f39eff authored by Eric Biggers's avatar Eric Biggers Committed by Greg Kroah-Hartman
Browse files

crypto: x86/salsa20 - remove x86 salsa20 implementations



commit b7b73cd5d74694ed59abcdb4974dacb4ff8b2a2a upstream.

The x86 assembly implementations of Salsa20 use the frame base pointer
register (%ebp or %rbp), which breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
Recent (v4.10+) kernels will warn about this, e.g.

WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c
[...]

But after looking into it, I believe there's very little reason to still
retain the x86 Salsa20 code.  First, these are *not* vectorized
(SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
close to the best Salsa20 performance on any remotely modern x86
processor; they're just regular x86 assembly.  Second, it's still
unclear that anyone is actually using the kernel's Salsa20 at all,
especially given that now ChaCha20 is supported too, and with much more
efficient SSSE3 and AVX2 implementations.  Finally, in benchmarks I did
on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
(~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
is only slightly faster than salsa20-generic (~15% faster on Skylake,
~20% faster on Zen).  The gcc version made little difference.

So, the x86_64 salsa20-asm is pretty clearly useless.  That leaves just
the i686 salsa20-asm, which based on my tests provides a 15-20% speed
boost.  But that's without updating the code to not use %ebp.  And given
the maintenance cost, the small speed difference vs. salsa20-generic,
the fact that few people still use i686 kernels, the doubt that anyone
is even using the kernel's Salsa20 at all, and the fact that a SSE2
implementation would almost certainly be much faster on any remotely
modern x86 processor yet no one has cared enough to add one yet, I don't
think it's worthwhile to keep.

Thus, just remove both the x86_64 and i686 salsa20-asm implementations.

Reported-by: default avatar <syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com>
Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
parent 2a017ea2
Loading
Loading
Loading
Loading
+0 −4
Original line number Diff line number Diff line
@@ -15,7 +15,6 @@ obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o

obj-$(CONFIG_CRYPTO_AES_586) += aes-i586.o
obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
obj-$(CONFIG_CRYPTO_SALSA20_586) += salsa20-i586.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_586) += serpent-sse2-i586.o

obj-$(CONFIG_CRYPTO_AES_X86_64) += aes-x86_64.o
@@ -24,7 +23,6 @@ obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o
obj-$(CONFIG_CRYPTO_SALSA20_X86_64) += salsa20-x86_64.o
obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha20-x86_64.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o
obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
@@ -59,7 +57,6 @@ endif

aes-i586-y := aes-i586-asm_32.o aes_glue.o
twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
salsa20-i586-y := salsa20-i586-asm_32.o salsa20_glue.o
serpent-sse2-i586-y := serpent-sse2-i586-asm_32.o serpent_sse2_glue.o

aes-x86_64-y := aes-x86_64-asm_64.o aes_glue.o
@@ -68,7 +65,6 @@ camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o
twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o
salsa20-x86_64-y := salsa20-x86_64-asm_64.o salsa20_glue.o
chacha20-x86_64-y := chacha20-ssse3-x86_64.o chacha20_glue.o
serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o

+0 −1114

File deleted.

Preview size limit exceeded, changes collapsed.

+0 −919

File deleted.

Preview size limit exceeded, changes collapsed.

arch/x86/crypto/salsa20_glue.c

deleted100644 → 0
+0 −116

File deleted.

Preview size limit exceeded, changes collapsed.

+0 −26
Original line number Diff line number Diff line
@@ -1324,32 +1324,6 @@ config CRYPTO_SALSA20
	  The Salsa20 stream cipher algorithm is designed by Daniel J.
	  Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>

config CRYPTO_SALSA20_586
	tristate "Salsa20 stream cipher algorithm (i586)"
	depends on (X86 || UML_X86) && !64BIT
	select CRYPTO_BLKCIPHER
	help
	  Salsa20 stream cipher algorithm.

	  Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT
	  Stream Cipher Project. See <http://www.ecrypt.eu.org/stream/>

	  The Salsa20 stream cipher algorithm is designed by Daniel J.
	  Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>

config CRYPTO_SALSA20_X86_64
	tristate "Salsa20 stream cipher algorithm (x86_64)"
	depends on (X86 || UML_X86) && 64BIT
	select CRYPTO_BLKCIPHER
	help
	  Salsa20 stream cipher algorithm.

	  Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT
	  Stream Cipher Project. See <http://www.ecrypt.eu.org/stream/>

	  The Salsa20 stream cipher algorithm is designed by Daniel J.
	  Bernstein <djb@cr.yp.to>. See <http://cr.yp.to/snuffle.html>

config CRYPTO_CHACHA20
	tristate "ChaCha20 cipher algorithm"
	select CRYPTO_BLKCIPHER