path: root/sys/netinet/tcp_ratelimit.c
Commit message | Author | Age | Files | Lines
* Add a switch structure for send tags. | John Baldwin | 2021-09-14 | 1 | -4/+4
  Move the type and function pointers for operations on existing send tags (modify, query, next, free) out of 'struct ifnet' and into a new 'struct if_snd_tag_sw'. A pointer to this structure is added to the generic part of send tags and is initialized by m_snd_tag_init() (which now accepts a switch structure as a new argument in place of the type).

  Previously, device driver ifnet methods switched on the type to call type-specific functions. Now, those type-specific functions are saved in the switch structure and invoked directly. In addition, this more gracefully permits multiple implementations of the same tag within a driver. In particular, NIC TLS for future Chelsio adapters will use a different implementation than the existing NIC TLS support for T6 adapters.

  Reviewed by: gallatin, hselasky, kib (older version)
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D31572
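The switch-structure idea described in this commit can be sketched in plain userland C. Every name below (`demo_tag`, `demo_tag_sw`, `demo_tag_init`, and so on) is a hypothetical stand-in chosen for illustration; none of it is the FreeBSD definition of 'struct if_snd_tag_sw' or m_snd_tag_init().

```c
#include <assert.h>
#include <stddef.h>

struct demo_tag;

/* Hypothetical per-implementation operations table (the "switch"). */
struct demo_tag_sw {
    int  (*modify)(struct demo_tag *, unsigned long new_rate);
    void (*free)(struct demo_tag *);
    int  type;
};

/* Generic part of a tag: it carries a pointer to its switch. */
struct demo_tag {
    const struct demo_tag_sw *sw;
    unsigned long rate;
    int freed;
};

static int
demo_rl_modify(struct demo_tag *t, unsigned long new_rate)
{
    t->rate = new_rate;
    return (0);
}

static void
demo_rl_free(struct demo_tag *t)
{
    t->freed = 1;
}

/* One implementation; a driver could define several for the same type. */
static const struct demo_tag_sw demo_rl_sw = {
    .modify = demo_rl_modify,
    .free = demo_rl_free,
    .type = 1,
};

/* Init takes the switch structure where it used to take only a type. */
static void
demo_tag_init(struct demo_tag *t, const struct demo_tag_sw *sw)
{
    assert(sw != NULL);
    t->sw = sw;
    t->rate = 0;
    t->freed = 0;
}

/* Generic caller: no switch statement on the type, just an indirect call. */
static int
demo_tag_modify(struct demo_tag *t, unsigned long new_rate)
{
    if (t->sw->modify == NULL)
        return (-1);    /* operation not provided by this implementation */
    return (t->sw->modify(t, new_rate));
}
```

Because each tag points at its own table, two implementations of the same tag type can coexist in one driver simply by defining two tables.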
* This brings FreeBSD into sync with the Netflix versions of rack and bbr. | Randall Stewart | 2021-05-06 | 1 | -2/+27
  This fixes several breakages (panics) that have been reported since the tcp_lro code was committed. Quite a few new features are now in rack (perfecting of DGP -- Dynamic Goodput Pacing among the largest). There is also support for ack-war prevention. Documents coming soon on rack.

  Sponsored by: Netflix
  Reviewed by: rscheff, mtuexen
  Differential Revision: https://reviews.freebsd.org/D30036
* Fix LINT kernel builds after 1a714ff20419. | Hans Petter Selasky | 2021-02-01 | 1 | -28/+8
  MFC after: 1 week
  Discussed with: rrs@
  Differential Revision: https://reviews.freebsd.org/D28357
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* This pulls over all the changes that are in the Netflix | Randall Stewart | 2021-01-28 | 1 | -201/+383
  tree that fix the ratelimit code. There were several bugs in tcp_ratelimit itself and we needed further work to support the multiple tag format coming for the joint TLS and Ratelimit dances.

  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D28357
* Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc(). | John Baldwin | 2020-10-29 | 1 | -17/+7
  This gives a more uniform API for send tag life cycle management.

  Reviewed by: gallatin, hselasky
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D27000
  Notes: svn path=/head/; revision=367151
* Call m_snd_tag_rele() to free send tags. | John Baldwin | 2020-10-29 | 1 | -7/+3
  Send tags are refcounted and if_snd_tag_free() is called by m_snd_tag_rele() when the last reference is dropped on a send tag.

  Reviewed by: gallatin, hselasky
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D26995
  Notes: svn path=/head/; revision=367148
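The release-frees-on-last-reference pattern this commit relies on can be illustrated with a small userland sketch. The `demo_*` names are hypothetical and C11 atomics stand in for the kernel's refcount primitives; this is not the m_snd_tag_rele() implementation.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_tag {
    atomic_uint refcount;
    int freed;              /* set by the free routine, for illustration */
};

/* Stand-in for the driver's free routine. */
static void
demo_tag_free(struct demo_tag *t)
{
    t->freed = 1;
}

/* A new tag starts with one reference held by its creator. */
static void
demo_tag_init(struct demo_tag *t)
{
    atomic_init(&t->refcount, 1);
    t->freed = 0;
}

static void
demo_tag_ref(struct demo_tag *t)
{
    atomic_fetch_add(&t->refcount, 1);
}

/* Drop one reference; only the caller that drops the last one frees. */
static void
demo_tag_rele(struct demo_tag *t)
{
    unsigned int old = atomic_fetch_sub(&t->refcount, 1);

    assert(old > 0);        /* releasing an unreferenced tag is a bug */
    if (old == 1)
        demo_tag_free(t);
}
```

Callers never invoke the free routine directly, so a tag shared between two holders stays alive until both have released it.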
* Remove an extra if_ref(). | John Baldwin | 2020-10-29 | 1 | -1/+0
  In r348254, if_snd_tag_alloc() routines were changed to bump the ifp refcount via m_snd_tag_init(). This function wasn't in the tree at the time and wasn't updated for the new semantics, so was still doing a separate bump after if_snd_tag_alloc() returned.

  Reviewed by: gallatin
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D26999
  Notes: svn path=/head/; revision=367147
* Support hardware rate limiting (pacing) with TLS offload. | John Baldwin | 2020-10-29 | 1 | -9/+63
  - Add a new send tag type for a send tag that supports both rate limiting (packet pacing) and TLS offload (mostly similar to D22669 but adds a separate structure when allocating the new tag type).
  - When allocating a send tag for TLS offload, check to see if the connection already has a pacing rate. If so, allocate a tag that supports both rate limiting and TLS offload rather than a plain TLS offload tag.
  - When setting an initial rate on an existing ifnet KTLS connection, set the rate in the TCP control block inp and then reset the TLS send tag (via ktls_output_eagain) to reallocate a TLS + ratelimit send tag. This allocates the TLS send tag asynchronously from a task queue, so the TLS rate limit tag alloc is always sleepable.
  - When modifying a rate on a connection using KTLS, look for a TLS send tag. If the send tag is only a plain TLS send tag, assume we failed to allocate a TLS ratelimit tag (either during the TCP_TXTLS_ENABLE socket option, or during the send tag reset triggered by ktls_output_eagain) and ignore the new rate. If the send tag is a ratelimit TLS send tag, change the rate on the TLS tag and leave the inp tag alone.
  - Lock the inp lock when setting sb_tls_info for a socket send buffer so that the routines in tcp_ratelimit can safely dereference the pointer without needing to grab the socket buffer lock.
  - Add an IFCAP_TXTLS_RTLMT capability flag and associated administrative controls in ifconfig(8). TLS rate limit tags are only allocated if this capability is enabled. Note that TLS offload (whether unlimited or rate limited) always requires IFCAP_TXTLS[46].

  Reviewed by: gallatin, hselasky
  Relnotes: yes
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D26691
  Notes: svn path=/head/; revision=367123
* Save the current TCP pacing rate in t_pacing_rate. | John Baldwin | 2020-10-29 | 1 | -0/+9
  Reviewed by: gallatin, gnn
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D26875
  Notes: svn path=/head/; revision=367122
* Check if_capenable, not if_capabilities when enabling rate limiting. | John Baldwin | 2020-10-06 | 1 | -2/+2
  if_capabilities is a read-only mask of supported capabilities. if_capenable is a mask under administrative control via ifconfig(8).

  Reviewed by: gallatin
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D26690
  Notes: svn path=/head/; revision=366492
* net: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -3/+0
  Notes: svn path=/head/; revision=365071
* Fix copyright year and eliminate the obsolete all rights reserved line. | Warner Losh | 2020-04-08 | 1 | -2/+1
  Reviewed by: rrs@
  Notes: svn path=/head/; revision=359729
* sys/netinet: remove spurious doubled ;s | Ed Maste | 2020-03-27 | 1 | -1/+1
  Notes: svn path=/head/; revision=359381
* make lacp's use_numa hashing aware of send tags | Andrew Gallatin | 2020-03-09 | 1 | -0/+1
  When I did the use_numa support, I missed the fact that there is a separate hash function for send tag NIC selection. So when use_numa is enabled, ktls offload does not work properly, as it does not reliably allocate a send tag on the proper egress NIC since different egress NICs are selected for send-tag allocation and packet transmit.

  To fix this, this change:
  - refactors lacp_select_tx_port_by_hash() and lacp_select_tx_port() to make lacp_select_tx_port_by_hash() always called by lacp_select_tx_port()
  - pre-shifts flowids to convert them to hashes when calling lacp_select_tx_port_by_hash()
  - adds a numa_domain field to if_snd_tag_alloc_params
  - plumbs the numa domain into places where we allocate send tags

  In testing with NIC TLS setup on a NUMA machine, I see thousands of output errors before the change when enabling kern.ipc.tls.ifnet.permitted=1. After the change, I see no errors, and I see the NIC sysctl counters showing active TLS offload sessions.

  Reviewed by: rrs, hselasky, jhb
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=358808
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -6/+6
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes.

  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.

  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.

  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* This commit expands tcp_ratelimit to be able to handle cards | Randall Stewart | 2020-02-26 | 1 | -39/+318
  like the mlx-c5 and c6 that require a "setup" routine before the tcp_ratelimit code can declare and use a rate. I add the setup routine to if_var as well as fix tcp_ratelimit to call it. I also revisit the rates so that in the case of an mlx card of type c5/6 we will use about 100 rates concentrated in the range where the most gain can be had (1-200Mbps). Note that I have tested these on a c5 and they work and perform well. In fact in an unloaded system they pace right to the correct rate (great job mlx!). There will be a further commit here from Hans that will add the respective changes to the mlx driver to support this work (which I was testing with).

  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D23647
  Notes: svn path=/head/; revision=358332
* Let's get the real correct version... geesh. I need | Randall Stewart | 2020-02-12 | 1 | -8/+10
  more coffee evidently.

  Sponsored by: Netflix
  Notes: svn path=/head/; revision=357823
* Oops, committed the wrong ratelimit version in the | Randall Stewart | 2020-02-12 | 1 | -319/+43
  whitespace cleanup. Restore it to the proper version.

  Sponsored by: Netflix Inc.
  Notes: svn path=/head/; revision=357819
* White space cleanup -- remove trailing tabs or spaces | Randall Stewart | 2020-02-12 | 1 | -35/+309
  from any line.

  Sponsored by: Netflix Inc.
  Notes: svn path=/head/; revision=357818
* Whitespace: remove trailing white space from three files | Randall Stewart | 2020-02-12 | 1 | -3/+3
  (leftover presents from emacs).

  Sponsored by: Netflix Inc.
  Notes: svn path=/head/; revision=357817
* A miss from r356754. | Gleb Smirnoff | 2020-01-15 | 1 | -1/+1
  Notes: svn path=/head/; revision=356756
* Introduce NET_EPOCH_CALL() macro and use it everywhere where we free | Gleb Smirnoff | 2020-01-15 | 1 | -1/+1
  data based on the network epoch. The macro reverses the argument order of epoch_call(9) - first function, then its argument. NFC

  Notes: svn path=/head/; revision=356755
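A wrapper macro that fixes the epoch and swaps the argument order, as this commit message describes, might look like the following userland sketch. `demo_epoch_call()` is a hypothetical stand-in that runs the callback immediately, whereas the real epoch_call(9) defers it; the signature shown is an assumption based only on the commit message's statement about argument order.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ctx {
    int ran;    /* counts callback invocations, for illustration */
};

typedef void (*demo_cb_t)(struct demo_ctx *);

/*
 * Stand-in for the underlying primitive: epoch first, then the context,
 * then the callback. It calls back immediately instead of deferring.
 */
static void
demo_epoch_call(int epoch, struct demo_ctx *ctx, demo_cb_t cb)
{
    (void)epoch;
    assert(cb != NULL);
    cb(ctx);
}

#define DEMO_NET_EPOCH  1   /* hypothetical handle for the network epoch */

/* Wrapper: the caller writes the function first, then its argument. */
#define DEMO_NET_EPOCH_CALL(f, c) \
    demo_epoch_call(DEMO_NET_EPOCH, (c), (f))

static void
demo_free_cb(struct demo_ctx *ctx)
{
    ctx->ran++;
}
```

With the wrapper, call sites read like an ordinary function call (`DEMO_NET_EPOCH_CALL(demo_free_cb, &ctx)`) and no longer name the epoch themselves.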
* Use official macro to enter/exit the network epoch. NFC | Gleb Smirnoff | 2020-01-15 | 1 | -5/+5
  Notes: svn path=/head/; revision=356754
* Since this code dereferences struct ifnet, it must include if_var.h | Gleb Smirnoff | 2020-01-15 | 1 | -1/+3
  explicitly, not via header pollution. While here move TCPSTATES declaration right above the include that is going to make use of it.

  Notes: svn path=/head/; revision=356751
* The non-preemptible network epoch identified by net_epoch isn't used. | Gleb Smirnoff | 2020-01-15 | 1 | -1/+1
  This code definitely meant net_epoch_preempt.

  Notes: svn path=/head/; revision=356747
* Factor out TCP rateset destruction code. | Hans Petter Selasky | 2019-10-09 | 1 | -34/+23
  Ensure the epoch_call() function is not called more than one time before the callback has been executed, by always checking the RS_FUNERAL_SCHD flag before invoking epoch_call(). The "rs_number_dead" is balanced again after r353353.

  Discussed with: rrs@
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=353359
* Fix locking order reversal in the TCP ratelimit code by moving | Hans Petter Selasky | 2019-10-09 | 1 | -20/+23
  destructors outside the rsmtx mutex.

  Witness message:
    lock order reversal: (sleepable after non-sleepable)
     1st tcp_rs_mtx (rsmtx) @ sys/netinet/tcp_ratelimit.c:242
     2nd sysctl lock (sysctl lock) @ sys/kern/kern_sysctl.c:607

  Backtrace:
    witness_debugger
    witness_checkorder
    _rm_wlock_debug
    sysctl_ctx_free
    rs_destroy
    epoch_call_task
    gtaskqueue_run_locked
    gtaskqueue_thread_loop

  Discussed with: rrs@
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=353353
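One common way to move destructors outside a mutex is to unlink the doomed objects while holding the lock and destroy them only after dropping it. The sketch below shows that pattern with a simulated lock (a boolean plus assertions) so it stays self-contained; all `demo_*` names are hypothetical, and the commit does not state that the fix was written this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_rs {
    struct demo_rs *next;
    bool dead;
    bool destroyed;
};

static bool demo_lock_held;         /* simulated non-sleepable mutex */
static struct demo_rs *demo_list;   /* list protected by the mutex */

static void
demo_lock(void)
{
    assert(!demo_lock_held);
    demo_lock_held = true;
}

static void
demo_unlock(void)
{
    assert(demo_lock_held);
    demo_lock_held = false;
}

/* The destructor may take a sleepable lock, so it must run unlocked. */
static void
demo_destroy(struct demo_rs *rs)
{
    assert(!demo_lock_held);
    rs->destroyed = true;
}

/* Unlink dead entries under the lock; destroy them after unlocking. */
static int
demo_reap(void)
{
    struct demo_rs *doomed = NULL, **pp, *rs;
    int n = 0;

    demo_lock();
    pp = &demo_list;
    while ((rs = *pp) != NULL) {
        if (rs->dead) {
            *pp = rs->next;     /* unlink from the shared list */
            rs->next = doomed;  /* push onto the private list */
            doomed = rs;
        } else
            pp = &rs->next;
    }
    demo_unlock();

    while ((rs = doomed) != NULL) {
        doomed = rs->next;
        demo_destroy(rs);
        n++;
    }
    return (n);
}
```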
* With the recent commit of ktls, we no longer have a | Randall Stewart | 2019-09-11 | 1 | -4/+1
  sb_tls_flags, it's just the sb_flags. Also the ratelimit code, now that the definition is in sockbuf.h, does not need the ktls.h file (or its predecessor).

  Sponsored by: Netflix Inc
  Notes: svn path=/head/; revision=352215
* Don't hold the rs_mtx lock while calling malloc(). | Michael Tuexen | 2019-08-26 | 1 | -13/+7
  Reviewed by: rrs@
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21416
  Notes: svn path=/head/; revision=351512
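The usual shape of such a fix is to allocate before taking the lock and hold the lock only for the list update. The sketch below assumes that shape; the simulated lock, `demo_alloc()`, and the list are hypothetical, and userland malloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_entry {
    struct demo_entry *next;
    int key;
};

static bool demo_lock_held;             /* simulated mutex */
static struct demo_entry *demo_head;    /* list protected by the mutex */

/* Allocation may block, so it asserts that the lock is not held. */
static void *
demo_alloc(size_t n)
{
    assert(!demo_lock_held);
    return (malloc(n));
}

/* Allocate and fill in the entry first; lock only to publish it. */
static int
demo_insert(int key)
{
    struct demo_entry *e = demo_alloc(sizeof(*e));

    if (e == NULL)
        return (-1);
    e->key = key;

    demo_lock_held = true;      /* lock */
    e->next = demo_head;
    demo_head = e;
    demo_lock_held = false;     /* unlock */
    return (0);
}
```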
* Fix !INET build. | Xin LI | 2019-08-02 | 1 | -0/+4
  Notes: svn path=/head/; revision=350547
* Fix one more atomic for i386 | Randall Stewart | 2019-08-02 | 1 | -1/+1
  Obtained from: mtuexen@freebsd.org
  Notes: svn path=/head/; revision=350537
* Oops, use fetchadd_u64, not long, to keep old 32-bit platforms | Randall Stewart | 2019-08-01 | 1 | -1/+1
  happy.

  Notes: svn path=/head/; revision=350521
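The width issue behind this fix is that `long` is only 32 bits on 32-bit platforms, so a counter that must be 64 bits wide needs an explicitly 64-bit atomic operation. A minimal userland sketch with C11 atomics (not the kernel's atomic(9) API; the counter name is hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Explicitly 64 bits wide on every platform, unlike 'long'. */
static _Atomic uint64_t demo_counter;

/* Atomically add n and return the value the counter held before. */
static uint64_t
demo_count(uint64_t n)
{
    return (atomic_fetch_add(&demo_counter, n));
}
```

Using a fixed-width type keeps the counter from wrapping at 2^32 on 32-bit targets, at the cost of a possibly slower atomic there.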
* This adds the third step in getting BBR into the tree. BBR and | Randall Stewart | 2019-08-01 | 1 | -0/+1234
  an updated rack depend on having access to the new ratelimit api in this commit.

  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D20953
  Notes: svn path=/head/; revision=350501