Opened 3 years ago

Last modified 3 years ago

#6099 new Enhancement

Add libkcapi crypto backend.

Reported by: fijam
Owned by: jordan
Priority: Normal
Milestone: None Set
Component: libtransmission
Version: 2.92
Severity: Normal
Keywords:
Cc:

Description

libkcapi (http://www.chronox.de/libkcapi.html) provides a userspace API to the Linux kernel crypto subsystem.

Documentation is available at http://www.chronox.de/libkcapi/html/

Having this as an optional backend would be tremendously useful for NASes and embedded devices, since many of them are based on SoCs featuring hardware SHA-1 accelerators (notably Marvell CESA, present in all Marvell Armada SoCs: http://www.marvell.com/embedded-processors) for which kernel drivers are available (e.g. marvell_cesa).

Rechecking large torrents on that kind of hardware without hardware acceleration grinds those systems to a halt.

In theory RC4 could also be handled by libkcapi, but I don't know of any common hardware crypto accelerators in ARM SoCs that can do RC4.

This solution is preferable to cryptodev, since cryptodev requires users to maintain out-of-tree kernel modules and (probably more annoyingly) a patched OpenSSL version on their own. (It has never worked with Transmission anyway, since the available crypto engines are not initialized in the OpenSSL helper code.)

Attachments (1)

libkcapi.patch (2.2 KB) - added by fijam 3 years ago.
PoC libkcapi patch


Change History (2)

Changed 3 years ago by fijam

PoC libkcapi patch

comment:1 Changed 3 years ago by fijam

I made a proof-of-concept patch to test this. It simply replaces the OpenSSL portion of the SHA-1 code with libkcapi. To build it, you need to include kcapi.h and kcapi_aio.h and have the library itself installed. The patch passes the internal test suite, although memory use in test_ssha1 is much higher; I didn't investigate why.
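For readers who want to try the kernel hashing path without building the patch, here is a minimal sketch. It is not the attached patch and does not use libkcapi itself. It assumes a Linux kernel with the AF_ALG socket interface enabled (the kernel interface that libkcapi wraps) and goes through Python's standard socket module. The function names and the 128 KiB chunk size are illustrative only. Whether the request lands on a hardware engine such as marvell_cesa or on a software implementation is decided by the kernel, not by this code.

```python
import hashlib
import socket

CHUNK = 128 * 1024  # illustrative slice size for each send to the kernel


def sha1_via_kernel(data: bytes) -> bytes:
    """Hash data through the kernel crypto API using an AF_ALG "hash" socket."""
    with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
        alg.bind(("hash", "sha1"))
        op, _ = alg.accept()
        with op:
            for off in range(0, len(data), CHUNK):
                # MSG_MORE signals that more input follows before the digest is read.
                op.sendall(data[off:off + CHUNK], socket.MSG_MORE)
            return op.recv(20)  # a SHA-1 digest is 20 bytes


def sha1_piece(data: bytes) -> bytes:
    """Prefer the kernel path; fall back to hashlib where AF_ALG is unavailable."""
    try:
        return sha1_via_kernel(data)
    except (AttributeError, OSError):
        # AttributeError: platform has no socket.AF_ALG; OSError: kernel refused it.
        return hashlib.sha1(data).digest()


if __name__ == "__main__":
    print(sha1_piece(b"example piece data").hex())
```

The fallback keeps the sketch usable on systems without AF_ALG, which loosely mirrors the idea of keeping this backend optional.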

I tested the performance with a Marvell CESA accelerator and the marvell_cesa kernel driver. MSEC_TO_SLEEP_PER_SECOND_DURING_VERIFY was set to 0 for both the vanilla and patched builds. A 2 GB torrent was re-verified:

without the patch: CPU ~97%
It took 50 seconds to verify 1997966667 bytes (39175817 bytes per second)
perf report --sort=dso
75.42% libcrypto.so.1.0.0
19.06% [kernel.kallsyms]

with the patch: CPU ~60%
It took 45 seconds to verify 1997966667 bytes (43434057 bytes per second)
perf report --sort=dso
71.73% [kernel.kallsyms] (mostly memory shuffling)
12.21% [marvell_cesa]

With the patch, Transmission apparently spends time waiting for something. I plan to rebuild my kernel with CONFIG_SCHEDSTATS to see what's going on.

I also thought that increasing buflen from the default 128 KB would help, as there would be less overhead from shuffling data between the kernel and userland, but that didn't seem to change anything.

So while the speedup at the moment is modest, the system remains entirely usable while torrents are being verified, which was my goal.
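To put numbers on "modest", the figures quoted above can be checked with a few lines of arithmetic. This is only a sketch over the values reported in this comment. The CPU percentages are the approximate ones given above, and "CPU-seconds" (elapsed time multiplied by CPU share) is a rough proxy introduced here, not something that was measured. One observation: both reported byte rates match the total size divided by the reported seconds plus one, which suggests (an assumption, not verified against the source) that the log line adds one second to avoid dividing by zero.

```python
TOTAL_BYTES = 1_997_966_667

# (reported seconds, reported bytes per second, approximate CPU %) from the runs above
RUNS = {
    "openssl": (50, 39_175_817, 97),
    "libkcapi": (45, 43_434_057, 60),
}


def rate_with_plus_one(total_bytes: int, seconds: int) -> int:
    """Byte rate as it appears to be logged: total / (seconds + 1), integer division."""
    return total_bytes // (seconds + 1)


def cpu_seconds(seconds: int, cpu_percent: int) -> float:
    """Rough CPU cost of a run: elapsed time scaled by the observed CPU share."""
    return seconds * cpu_percent / 100


for name, (secs, rate, cpu) in RUNS.items():
    # Both reported rates are reproduced exactly by the (seconds + 1) formula.
    assert rate_with_plus_one(TOTAL_BYTES, secs) == rate
    print(f"{name}: {rate} B/s, about {cpu_seconds(secs, cpu)} CPU-seconds")

throughput_gain = RUNS["libkcapi"][1] / RUNS["openssl"][1] - 1
cpu_saving = 1 - cpu_seconds(45, 60) / cpu_seconds(50, 97)
print(f"throughput gain: {throughput_gain:.1%}, CPU-seconds saved: {cpu_saving:.1%}")
```

By this estimate throughput improves by roughly 11%, while the CPU time spent drops by roughly 44%, which matches the observation that the system stays responsive during verification.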
