author | Andrew Gallatin <gallatin@FreeBSD.org> | 2025-08-20 16:49:32 +0000 |
---|---|---|
committer | Andrew Gallatin <gallatin@FreeBSD.org> | 2025-08-20 16:49:32 +0000 |
commit | 84f8ca1bd11d97d8d254248da7c09507038be505 (patch) | |
tree | a14751234f2d7dd952e760f5c99613578b89f402 /tools/test/netfibs | |
parent | fe2418f26ed0b2b9e06152c16f72f2bb4ec27c29 (diff) |
While mp_ring can provide amazing scalability in scenarios
where the number of cores exceeds the number of NIC tx
rings, it can also lead to greatly reduced performance in simpler,
high packet rate scenarios due to extra CPU cycles and cache
misses stemming from its complexity.
This change implements a simple if_transmit routine, selected
at driver load. The routine does not queue anything and uses
a simple queue selection, which makes it far more cache
friendly.
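As a rough illustration of what "simple queue selection" without queueing can look like, here is a user-space sketch. It is not the code from this change: the names (sketch_pkt, sketch_select_txq) are hypothetical, and the policy shown (flow id modulo the number of tx rings, falling back to the current CPU) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the packet fields a selector might consult. */
struct sketch_pkt {
	bool     has_flowid;	/* true if the stack supplied a flow hash */
	uint32_t flowid;	/* flow hash, valid only if has_flowid */
};

/*
 * Pick a tx ring index directly from the packet (assumed policy):
 * use the flow id when present, otherwise the sending CPU, and
 * reduce it modulo the ring count. No software queue is involved,
 * so the caller would hand the packet straight to that ring.
 */
static unsigned
sketch_select_txq(const struct sketch_pkt *p, unsigned cur_cpu, unsigned ntxq)
{
	uint32_t key;

	if (ntxq <= 1)
		return (0);
	key = p->has_flowid ? p->flowid : cur_cpu;
	return (key % ntxq);
}
```

The point of the sketch is only that the selection is a few arithmetic operations on data already in cache, with no shared ring state touched before the driver's own tx ring.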
In testing on a 400GbE NIC in an AMD 7502P EPYC server, this
simple tx routine is roughly 2.5 times as fast as mp_ring
(8Gb/s -> 20Gb/s), and 5x as fast as mp_ring with tx_abdicate=1
(4Gb/s -> 20Gb/s) for a simple in-kernel packet generator, which
is currently closed source. It also shows a 50% speedup for
a simple netperf -tUDP_STREAM test (5Gb/s -> 8Gb/s).
This change is mostly a no-op, as it is not enabled by default.
The one exception is the change to iflib_encap(), which now
immediately reclaims completed tx descriptors, and only fails the
transmit and schedules a later reclaim if iflib_completed_tx_reclaim()
didn't free enough descriptors.
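The reclaim-then-fail ordering described above can be modeled in user space as follows. This is a sketch under stated assumptions, not the iflib code: sketch_txq, sketch_reclaim and sketch_encap are hypothetical names, the ring is reduced to two counters, and ENOBUFS is assumed as the failure code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a tx ring. */
struct sketch_txq {
	unsigned avail;			/* free descriptors */
	unsigned completed;		/* finished by hardware, not yet reclaimed */
	bool     reclaim_scheduled;	/* a deferred reclaim was requested */
};

/* Return completed descriptors to the free pool; report how many. */
static unsigned
sketch_reclaim(struct sketch_txq *q)
{
	unsigned n = q->completed;

	q->avail += n;
	q->completed = 0;
	return (n);
}

/*
 * Try to claim 'needed' descriptors. When the ring is short, reclaim
 * right away; fail and request a later reclaim only if that still
 * did not free enough.
 */
static int
sketch_encap(struct sketch_txq *q, unsigned needed)
{
	if (q->avail < needed) {
		(void)sketch_reclaim(q);
		if (q->avail < needed) {
			q->reclaim_scheduled = true;
			return (ENOBUFS);
		}
	}
	q->avail -= needed;
	return (0);
}
```

With one free and four completed descriptors, a request for three succeeds after the inline reclaim; with one free and one completed, it fails and leaves reclaim_scheduled set.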
Reviewed by: kbowling, sumit.saxena_broadcom.com, vmaffione
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D51905
Diffstat (limited to 'tools/test/netfibs')
0 files changed, 0 insertions, 0 deletions