author | Hans Petter Selasky <hselasky@FreeBSD.org> | 2016-05-26 11:10:31 +0000 |
---|---|---|
committer | Hans Petter Selasky <hselasky@FreeBSD.org> | 2016-05-26 11:10:31 +0000 |
commit | fc271df3416a5162043a33b8533951a5f08120ea (patch) | |
tree | f758c1da9af30b9a64d1ec6d2f10068e77aee47b /sys/netinet/tcp_lro.h | |
parent | 9a5e3bff0d07881d246ac9415ddbb4262c7cd445 (diff) | |
Use an optimised, complexity-safe sorting routine instead of the kernel's
"qsort()".
The kernel's "qsort()" routine can in the worst case perform O(N*N)
comparisons before the input array is sorted. It can also recurse a
significant number of times, using up the kernel's interrupt thread
stack.
The custom sorting routine takes advantage of the fact that the sorting
key is only 64 bits wide. Based on the set and cleared bits in the
sorting key, it partitions the array until it is sorted. This process
has a recursion limit of 64, because the key has only 64 bits that can
be set or cleared. Compiled with -O2, the sorting routine was measured
to use 64 bytes of stack. Multiplying this by 64 gives a maximum stack
consumption of 4096 bytes on AMD64. The same bound applies to the
execution time: the array to be sorted will not be traversed more than
64 times.
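The bit-partitioning idea described above can be sketched as follows. This is a simplified illustration, not the code from the commit: the names "struct sort_item" and "bit_partition_sort()" are hypothetical, the payload pointer stands in for the mbuf pointer, and the real "tcp_lro_sort()" may choose the partitioning bit differently. The sketch partitions on one key bit per recursion level, starting at bit 63, so the depth is bounded by 64 and each level visits every element at most once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct lro_mbuf_sort (key plus payload). */
struct sort_item {
	uint64_t seq;		/* 64-bit sorting key */
	void *payload;		/* would be the mbuf pointer in the kernel */
};

/*
 * Sort items[0..n-1] in ascending key order by partitioning on a single
 * key bit, then recursing on the next lower bit for each half.
 * Pass bit = 63 on the first call.
 */
static void
bit_partition_sort(struct sort_item *items, size_t n, unsigned bit)
{
	size_t lo = 0, hi = n;
	uint64_t mask;

	assert(bit < 64);
	if (n < 2)
		return;
	mask = (uint64_t)1 << bit;

	/* Items with the bit cleared go to the front, set to the back. */
	while (lo < hi) {
		if ((items[lo].seq & mask) == 0) {
			lo++;
		} else {
			struct sort_item tmp = items[lo];

			hi--;
			items[lo] = items[hi];
			items[hi] = tmp;
		}
	}

	/* One recursion level per key bit: depth never exceeds 64. */
	if (bit > 0) {
		bit_partition_sort(items, lo, bit - 1);
		bit_partition_sort(items + lo, n - lo, bit - 1);
	}
}
```

A caller would invoke "bit_partition_sort(array, count, 63)". Because the recursion depth is tied to the key width rather than to the input order, the worst case cannot degrade the way a quicksort with unlucky pivots can.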
When serving roughly 80Gb/s with 80K TCP connections, the old method
consisting of "qsort()" and "tcp_lro_mbuf_compare_header()" used 1.4%
CPU, while the new "tcp_lro_sort()" used 1.1% for LRO-related sorting
as measured by Intel Vtune. The testing was done using a sysctl to
toggle between "qsort()" and "tcp_lro_sort()".
Differential Revision: https://reviews.freebsd.org/D6472
Sponsored by: Mellanox Technologies
Tested by: Netflix
Reviewed by: gallatin, rrs, sephe, transport
Notes:
svn path=/head/; revision=300731
Diffstat (limited to 'sys/netinet/tcp_lro.h')
-rw-r--r-- | sys/netinet/tcp_lro.h | 10 |
1 files changed, 6 insertions, 4 deletions
diff --git a/sys/netinet/tcp_lro.h b/sys/netinet/tcp_lro.h
index b81a95020c18..63aa62edd8ba 100644
--- a/sys/netinet/tcp_lro.h
+++ b/sys/netinet/tcp_lro.h
@@ -38,9 +38,6 @@
 #define TCP_LRO_ENTRIES 8
 #endif
 
-#define TCP_LRO_SEQUENCE(mb) \
-    (mb)->m_pkthdr.PH_loc.thirtytwo[0]
-
 struct lro_entry {
     LIST_ENTRY(lro_entry) next;
     struct mbuf *m_head;
@@ -80,10 +77,15 @@ LIST_HEAD(lro_head, lro_entry);
 #define source_ip6 lesource.s_ip6
 #define dest_ip6 ledest.d_ip6
 
+struct lro_mbuf_sort {
+    uint64_t seq;
+    struct mbuf *mb;
+};
+
 /* NB: This is part of driver structs. */
 struct lro_ctrl {
     struct ifnet *ifp;
-    struct mbuf **lro_mbuf_data;
+    struct lro_mbuf_sort *lro_mbuf_data;
     uint64_t lro_queued;
     uint64_t lro_flushed;
     uint64_t lro_bad_csum;