| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I have some patches which make ip_mroute and ip6_mroute multi-FIB-aware.
This enables running per-FIB routing daemons, each of which has a
separate routing socket.
Several places in the network stack check whether multicast routing is
configured by checking whether the multicast routing socket is non-NULL.
This doesn't directly translate in my proposed scheme, as each FIB would
have its own socket. I'd like to modify the ip(6)_mroute code to store
all state, including the socket, in a per-FIB structure. So, take a
step towards that and 1) hide the socket, 2) add a boolean flag which
indicates whether a multicast router is registered.
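
The direction described above could look roughly like the following sketch. It is not the code from the review: the names `mroute_fib_state`, `mrouter_register()`, `mrouter_is_registered()` and the fixed `NFIBS` value are hypothetical, and it shows the proposed per-FIB end state rather than this intermediate step. The point is only that callers ask a boolean question and never see the socket pointer.

```c
#include <stdbool.h>
#include <stddef.h>

struct socket;			/* opaque to callers in this sketch */

#define NFIBS 4			/* hypothetical number of FIBs */

/* Hypothetical per-FIB state; the socket stays private to the mroute code. */
struct mroute_fib_state {
	struct socket *router_so;
	bool router_registered;
};

static struct mroute_fib_state mroute_state[NFIBS];

/* What the rest of the stack would check instead of "socket != NULL". */
static bool
mrouter_is_registered(unsigned int fib)
{
	return (fib < NFIBS && mroute_state[fib].router_registered);
}

/* Returns 0 on success, -1 if the FIB is invalid or already has a router. */
static int
mrouter_register(unsigned int fib, struct socket *so)
{
	if (fib >= NFIBS || so == NULL || mroute_state[fib].router_registered)
		return (-1);
	mroute_state[fib].router_so = so;
	mroute_state[fib].router_registered = true;
	return (0);
}

static void
mrouter_unregister(unsigned int fib)
{
	if (fib >= NFIBS)
		return;
	mroute_state[fib].router_so = NULL;
	mroute_state[fib].router_registered = false;
}
```

With this shape, one routing daemon per FIB can register independently, and a registration in one FIB does not make the check succeed in another.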
Reviewed by: pouria, zlei, glebius, adrian
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55236
|
| |
|
|
|
| |
Reviewed by: gallatin, tuexen
Differential Revision: https://reviews.freebsd.org/D55196
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ip_mroute and ip6_mroute modules hook into the network stack via
several function pointers. Declarations for these pointers are
scattered around several headers. Put them all in the same place,
ip(6)_mroute.h.
No functional change intended.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55058
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is cleaner and will make it a bit easier to add some more
indirection to the VIF table, specifically, to add per-FIB tables.
No functional change intended.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D55057
|
| |
|
|
|
|
|
| |
Previously this used a home-rolled version.
Reviewed by: tuexen, imp, markj
Differential Revision: https://reviews.freebsd.org/D55165
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sockets that implement their own socket buffers (marked with PR_SOCKBUF)
are now also responsible for initialization of socket buffer mutexes in
pr_attach and for destruction in pr_detach (or pr_close).
This removes a big bunch of reported LORs, as now WITNESS is able to see
that tcp(4) socket buffer mutex and netlink(4) socket buffer mutex are two
different things. Distinct names also improve diagnostics for blocked
threads.
This also removes a hack from unix(4), where we used to mtx_destroy().
It also removes an innocent bug from unix(4), where soreserve() was called
twice for an accept(2)-ed socket. The bug was innocent because the first
call to soreserve() asked for 0 bytes of space.
This slightly increases the amount of pasted code in TCP's
syncache_socket(). The problem is that for sockets created with socket(2),
pr_attach is responsible for the call to soreserve() (including for
!PR_SOCKBUF protocols), while for sockets created with accept(2) it was
solisten_clone() that did the soreserve(), and for accept(2) TCP
completely bypasses pr_attach. This all should improve once TCP has its
own socket buffers.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D54984
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- The v6 socket option and ioctl handlers had no privilege checks at
all. The socket options, I believe, can only be reached via a raw
socket, but a jailed root user with a raw socket shouldn't be able to
configure multicast routing in a non-VNET jail. The ioctls can only
be used to fetch stats.
- Delete a bogus comment in X_mrt_ioctl(), one can issue multicast
routing ioctls against any socket. Note that the call path is
soo_ioctl()->rtioctl_fib()->mrt_ioctl().
I think all of the mroute privilege checks should be done within the
ip(6)_mroute code, but let's first make the v4 and v6 modules
consistent.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D54982
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The multicast routing code was using spin mutexes for packet counting,
but there is no reason to use them instead of regular mutexes, given
that none of this code runs in an interrupt context. Convert to using
default mutexes.
Reviewed by: glebius
MFC after: 2 weeks
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D54603
|
| |
|
|
|
|
|
|
| |
No functional change intended.
MFC after: 1 week
Sponsored by: Stormshield
Sponsored by: Klara, Inc.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Jumbo Payload option was intended to allow the deployment of IPv6 on
networks with a link MTU in excess of 65,575 octets.
According to one of the authors of RFC 2675, the networks which motivated
the Jumbo Payload option no longer exist.
FreeBSD does not currently support any links with this capacity, and
discussion when this change was first proposed suggested that the
loopback interface had to be patched to test the implementation.
As there are no known devices that can carry Jumbo Payloads, remove
support.
Reviewed by: glebius, tuexen, kp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19960
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stop using struct nd_ifinfo for that, because it is an API struct for
SIOCGIFINFO_IN6. The functional changes are isolated to the protocol
attach and detach: in6_ifarrival(), nd6_ifattach(), in6_ifdeparture(),
nd6_ifdetach(), as well as to the nd6_ioctl(), nd6_ra_input(),
nd6_slowtimo() and in6_ifmtu().
The dad_failures member was just renamed to match the rest. The M_IP6NDP
malloc(9) type declaration moved to files that actually use it.
The rest of the changes are mechanical substitution of double pointer
dereference via ND_IFINFO() to a single pointer dereference. This was
achieved with a sed(1) script:
s/ND_IFINFO\(([a-z0-9>_.-]+)\)->(flags|linkmtu|basereachable|reachable|retrans|chlim)/\1->if_inet6->nd_\2/g
s/nd_chlim/nd_curhoplimit/g
Reviewed by: tuexen, madpilot
Differential Revision: https://reviews.freebsd.org/D54725
|
| |
|
|
|
| |
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D54723
|
| |
|
|
|
|
| |
Reported by: Timo Völker
Tested by: Timo Völker
MFC after: 3 days
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the same functionality for the IPv4 header checksum
as was done earlier for the SCTP/TCP/UDP transport checksum.
When the IP implementation sends a packet, it does not compute the
corresponding checksum but defers that. It will determine whether the
network interface selected for the packet has the requested capability
and computes the checksum in software, if the selected network
interface does not have the requested capability.
Do this not only for packets being sent by the local IP stack, but
also when forwarding packets. Furthermore, when such packets are
delivered to a local IP stack, do not compute or validate the checksum,
since such packets have never been on the wire. This allows checksum
offloading to be supported also in the case of local virtual machines or
jails. Support for epair interfaces will be added in a separate commit.
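
The deferral described above can be illustrated with a small userland sketch. It is not the kernel code: `struct pkt`, `cksum_pending` and `output_on_interface()` are hypothetical stand-ins for the mbuf checksum flag and the interface capability check. Only the checksum arithmetic itself (one's-complement sum over the header, folded and inverted) is the standard IPv4 algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of the header, folded to 16 bits and inverted. */
static uint16_t
ipv4_hdr_cksum(const uint8_t *hdr, size_t hlen)
{
	uint32_t sum = 0;

	assert(hlen >= 20 && hlen % 4 == 0);
	for (size_t i = 0; i < hlen; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Hypothetical packet: a 20-byte header plus a "checksum deferred" flag. */
struct pkt {
	uint8_t hdr[20];
	bool cksum_pending;
};

/*
 * Called once the outgoing interface is known.  If the interface cannot
 * offload the header checksum, compute it in software now; otherwise
 * leave the request pending for the hardware (or for a local receiver).
 */
static void
output_on_interface(struct pkt *p, bool ifp_can_offload)
{
	uint16_t c;

	if (!p->cksum_pending || ifp_can_offload)
		return;
	p->hdr[10] = p->hdr[11] = 0;	/* checksum field is zero while summing */
	c = ipv4_hdr_cksum(p->hdr, sizeof(p->hdr));
	p->hdr[10] = (uint8_t)(c >> 8);
	p->hdr[11] = (uint8_t)(c & 0xff);
	p->cksum_pending = false;
}
```

A receiver that sees the pending flag still set knows the packet never crossed a wire and can skip validation, which is the case the commit message describes for local virtual machines and jails.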
Reviewed by: pouria, tuexen
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D54455
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since "d9c55b2e8cd6 rss: Enable portions of RSS globally.."
exposed the RSS software hashing functions, it has been possible
to use them without "ifdef RSS". Do so now in the syncache
so as to get flowids recorded.
Note that the use of the rss hash functions is conditional on IP versions,
so we must ifdef INET to ensure rss_proto_software_hash_v4() is available.
Fixes: 73fe85e486d2
Sponsored by: Netflix
Reviewed by: glebius, p.mousavizadeh_protonmail.com, nickbanks_netflix.com, tuexen
Differential Revision: https://reviews.freebsd.org/D54534
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A recent LRO mis-queuing bug in the igb driver (and a few others) showed that rack handled the resulting
reordering acceptably, and better than the base stack, thanks to its reordering protections, but there was
still room for improvement.
When a series of packets is completely mis-ordered, the ACKs often arrive shortly after you have
entered recovery and retransmitted the first of the packets indicated in the SACK stream. Then the cum-ack
arrives, acknowledging basically all of those packets. By looking at the time from when the packet was sent
to when the ACK came back, you can quickly determine that the ACK was not for what you just retransmitted
but for the original transmission, and that the recovery entry was completely false. You can then drop out
of recovery, restore the congestion state, and continue on your way. The dup-acks that also arrive help
increase the reordering window, which makes a repeat of the scenario less likely.
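
The timing test described above can be sketched as follows. This is not rack's code: the structures, the function names, and in particular the choice of the minimum observed RTT as the threshold are assumptions made for illustration. The idea is that an ACK arriving sooner after a retransmission than any round trip could take must have been triggered by the original transmission.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical congestion state that is saved when recovery is entered. */
struct cc_state {
	uint32_t cwnd;
	uint32_t ssthresh;
	bool in_recovery;
};

struct conn {
	struct cc_state cur;
	struct cc_state saved;
};

/* Times are microseconds on one monotonic clock; wraparound-safe. */
static bool
ack_is_for_original(uint32_t rexmit_time, uint32_t ack_time, uint32_t min_rtt)
{
	return ((uint32_t)(ack_time - rexmit_time) < min_rtt);
}

static void
enter_recovery(struct conn *c)
{
	c->saved = c->cur;		/* remember the pre-recovery state */
	c->cur.ssthresh = c->cur.cwnd / 2;
	c->cur.cwnd = c->cur.ssthresh;
	c->cur.in_recovery = true;
}

/* On a cumulative ACK covering the retransmitted data. */
static void
on_cum_ack(struct conn *c, uint32_t rexmit_time, uint32_t ack_time,
    uint32_t min_rtt)
{
	if (c->cur.in_recovery &&
	    ack_is_for_original(rexmit_time, ack_time, min_rtt))
		c->cur = c->saved;	/* false recovery: undo the reduction */
}
```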
Differential Revision: https://reviews.freebsd.org/D53832
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new function in_delayed_cksum_o() was introduced to compute
the checksum in the case the mbuf chain does not start with the
IP header. The offset of the IP header is specified by the
parameter iph_offset.
If iph_offset was positive, the function computed an incorrect
checksum.
Reviewed by: sobomax, tuexen
Fixes: 5feb38e37847 ("netinet: provide "at offset" variant of the in_delayed_cksum() API")
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D54269
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change retires two historic relics: the if_afdata[] array and the
dom_ifattach/dom_ifdetach methods.
The if_afdata[] array is a relic of the era when many transport protocols,
e.g. IPX or NetAtalk, were expected to coexist with IP.
The array hasn't had any members except AF_INET and AF_INET6 for over a
decade already. This change removes the array and just leaves two pointer
fields: if_inet and if_inet6.
The dom_ifattach/dom_ifdetach predates the EVENTHANDLER(9) framework and
was a good enough method to initialize protocol contexts back then. Today
there is no good reason to treat IPv4 and IPv6 stacks differently to other
protocols/features that attach and detach from an interface.
The locking of if_afdata[] is a relic of SMPng times, when the system
startup and the interface attach was even more convoluted than before this
change, and we also had unloadable protocols that used a field in
if_afdata[]. Note that IPv4 and IPv6 are not unloadable.
Note that this change removes NET_EPOCH_WAIT() from the interface detach
sequence. This may surface several new races associated with interface
removal. I failed to hit any with consecutive test suite runs, though.
The expected general race scenario is that while struct ifnet is freed
with proper epoch_call(9) itself, some structures hanging off ifnet are
freed with direct free(9). The proper fix is either to make if_foo point
at some static "dead" structure, providing SMP visibility of this store,
or to free those structures with epoch_call(9). All of these cases are planned
to be found and resolved during 16.0-CURRENT lifetime.
Reviewed by: zlei, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D54089
|
| |
|
|
| |
See c3fc0db3bc50df18a724e6e6b12ea4e060fd9255 for details.
|
| |
|
|
|
|
|
|
|
| |
Add struct mtx to struct lltable and stop using IF_AFDATA_LOCK, that
was created for a completely different purpose. No functional change
intended.
Reviewed by: zlei, melifaro
Differential Revision: https://reviews.freebsd.org/D54086
|
| |
|
|
|
| |
PR: 291439
Fixes: 73fe85e486d297c9c976095854c1c84007e543f0
|
| |
|
|
|
|
|
|
|
| |
Deprecation notice for net.inet.tcp.newsack is in 15.0.
Remove this tunable for HEAD, streamlining the code slightly.
Reviewed by: tuexen, cc, nickbanks_netflix.com, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D54072
|
| | |
|
| |
|
|
|
|
|
|
|
| |
With dd0e6bb996dc setting it always on connect(2) and syncache always
picking up the flowid from the incoming packet, any ESTABLISHED connection
shall have the flowid already set.
Reviewed by: tuexen, gallatin
Differential Revision: https://reviews.freebsd.org/D53886
|
| |
|
|
|
|
|
|
| |
Now retransmissions by the syncache use the correct flowid, the same as
synchronous responses.
Reviewed by: tuexen, gallatin
Differential Revision: https://reviews.freebsd.org/D51792
|
| |
|
|
|
|
|
|
|
|
|
| |
The hash table is accessed in ip_divert_packet(), and there the accesses
are synchronized only by the net epoch, so plain SLIST is not safe.
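
Why a plain list is unsafe for lockless readers can be shown with a userland sketch using C11 atomics. This is not the Concurrency Kit implementation behind CK_SLIST, and the memory orderings chosen here are an assumption for illustration. The key point is that the writer must publish a fully initialized node with a release store, and readers must load the links with matching ordering.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int port;
	_Atomic(struct node *) next;
};

struct list {
	_Atomic(struct node *) head;
};

/* Writers are assumed to be serialized by a lock that is not shown. */
static void
list_insert_head(struct list *l, struct node *n)
{
	struct node *old;

	old = atomic_load_explicit(&l->head, memory_order_relaxed);
	atomic_store_explicit(&n->next, old, memory_order_relaxed);
	/* Release: n's fields become visible before n becomes reachable. */
	atomic_store_explicit(&l->head, n, memory_order_release);
}

/* Readers take no lock; in the kernel they would run inside the epoch. */
static struct node *
list_lookup(struct list *l, int port)
{
	struct node *n;

	for (n = atomic_load_explicit(&l->head, memory_order_acquire);
	    n != NULL;
	    n = atomic_load_explicit(&n->next, memory_order_acquire)) {
		if (n->port == port)
			return (n);
	}
	return (NULL);
}
```

With plain loads and stores, the compiler or CPU may reorder the initialization of a node after the store that links it in, so a concurrent reader could follow a pointer to a half-built entry.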
Reviewed by: ae
MFC after: 1 week
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D54011
|
| |
|
|
|
|
|
|
| |
These were for $FreeBSD$ that was removed a while ago, but these
includes didn't get swept up in that. Remove them all now.
Sponsored by: Netflix
MFC After: 2 weeks
|
| |
|
|
|
|
|
|
|
|
|
| |
Now that we can trust NICs to supply an identical hash result
to software, we can set up the inpcb hash on outgoing connections.
This gives us symmetric hashing, meaning packets should enter
and leave on the same NIC queue.
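
As a rough illustration of the idea, the sketch below computes a Toeplitz hash in software, which is the hash commonly used for RSS. The key shown is only an example, and `flowid_for_outgoing()` is a hypothetical helper, not a kernel function. It assumes the NIC hashes incoming packets over source address, destination address, source port and destination port, so for an outgoing connection the tuple is built the way the returning packets will present it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Example 40-byte key; a real system would use the key its NICs use. */
static const uint8_t example_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
};

/*
 * Toeplitz hash: for every set bit of the input, XOR in the 32-bit
 * window of the key that starts at the same bit position.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen, const uint8_t *data,
    size_t datalen)
{
	uint32_t hash = 0, window;
	size_t next = 4;

	assert(keylen >= 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | key[3];
	for (size_t i = 0; i < datalen; i++, next++) {
		uint8_t keybyte = (next < keylen) ? key[next] : 0;

		for (int b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* Slide the window one bit along the key. */
			window = (window << 1) | ((keybyte >> b) & 1u);
		}
	}
	return (hash);
}

/*
 * Hypothetical helper: hash an outgoing IPv4 TCP connection the way the
 * NIC will hash the packets coming back (remote is the source there).
 * Addresses and ports are byte arrays in network byte order.
 */
static uint32_t
flowid_for_outgoing(const uint8_t laddr[4], const uint8_t raddr[4],
    const uint8_t lport[2], const uint8_t rport[2])
{
	uint8_t in[12];

	memcpy(&in[0], raddr, 4);
	memcpy(&in[4], laddr, 4);
	memcpy(&in[8], rport, 2);
	memcpy(&in[10], lport, 2);
	return (toeplitz_hash(example_key, sizeof(example_key), in,
	    sizeof(in)));
}
```

If the value stored on the connection matches what the hardware computes for the replies, both directions of the flow map to the same queue.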
Differential Revision: https://reviews.freebsd.org/D53104
Reviewed by: adrian, cc, kbowling, tuexen, zlei
Sponsored by: Netflix
|
| |
|
|
|
|
|
|
|
|
|
| |
We use the fact that all NICs that support hashing are using the
same hash algorithm and hash key to enable symmetric hashing in
TCP, where a software version of the same hash is used to
establish hashes on outgoing connections.
Sponsored by: Netflix
Reviewed by: adrian, zlei (both early version)
Differential Revision: https://reviews.freebsd.org/D53089
|
| |
|
|
| |
This allows an ipfw_insn member to be dereferenced immediately.
|
| |
|
|
|
|
|
| |
No functional change intended, suggested by glebius.
Reviewed by: rscheff, zlei, tuexen
Differential Revision: https://reviews.freebsd.org/D53739
|
| |
|
|
|
|
|
| |
Recent changes to HPTS have broken an API that was somehow removed (used by user space programs for
time calculations). This commit will add back the inline function that was removed.
Differential Revision: https://reviews.freebsd.org/D53225
|
| |
|
|
|
|
|
|
|
|
| |
Add a comment explaining why syncache entries are dropped and fix a
typo in a comment.
Reviewed by: rrs, glebius
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53564
|
| |
|
|
|
|
|
| |
Reviewed by: markj, Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53542
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When a SYN ACK is received for a listening socket, just drop it
instead of killing the SYN-cache entry and sending a RST.
This closes off the possibility of killing a TCP connection during its
handling in the SYN-cache.
Reviewed by: Nick Banks, Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53540
|
| |
|
|
|
|
|
|
|
|
|
| |
* shuffle around the inp_label to give inp_flags more space since it
can become long.
* fix the indentation of in6p_icmp6filt, in6p_cksum, and in6p_hops.
Reviewed by: Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53541
|
| |
|
|
|
|
|
|
|
| |
This is much more compact. Thanks to markj@ for suggesting the change.
Reviewed by: markj, Peter Lei, imp, Nick Banks
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53510
|
| |
|
|
|
|
|
|
|
| |
This is much more compact. Thanks to markj@ for suggesting the change.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53507
|
| |
|
|
|
| |
Fixes: 9aa5a79e2af9 ("ddb: optionally print inp when printing tcpcb")
Sponsored by: Netflix, Inc.
|
| |
|
|
|
|
|
|
|
|
| |
Add /i option to the ddb commands show tcpcb and show all tcpcbs,
which enables the printing of the t_inpcb.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53497
|
| |
|
|
|
|
|
| |
No functional change intended.
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
| |
|
|
|
|
|
|
|
|
|
| |
Add four missing flags (INP_BINDANY, INP_INHASHLIST, INP_RESERVED_0,
INP_BOUNDFIB) used in inp_flags and remove one flag (INP_ORIGDSTADDR),
which is actually a flag used in inp_flags2 and not in inp_flags.
Reviewed by: markj
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53498
|
| |
|
|
|
|
| |
Reviewed by: tuexen
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When adding a syncache entry, take a reference count of the
credentials while the inp is still locked.
Thanks to markj@ for providing a hint regarding the root cause.
Reported by: David Marker
Reviewed by: glebius
Tested by: David Marker
Fixes: cbc9438f0505 ("tcp: improve ref count handling when processing SYN")
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53380
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The current IPFW version 3 dates to 2010 (commit cc4d3c30ea28, "Bring in
the most recent version of ipfw and dummynet, developed").
The compat code for FreeBSD 8 and earlier has a number of issues and is
no longer needed, so remove it.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: ae, glebius
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D53343
|
| |
|
|
|
|
|
|
|
|
| |
Honor the IPPROTO_IPV6-level cmsg of type IPV6_TCLASS when sending
a UDP/IPv4 packet on an AF_INET6 socket.
Reviewed by: bz
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53347
|
| |
|
|
|
|
|
|
|
|
| |
Honor the IPPROTO_IPV6-level socket option IPV6_TCLASS when sending
a UDP/IPv4 packet on an AF_INET6 socket.
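
From userland, the situation this change addresses looks roughly like the sketch below: an AF_INET6 UDP socket with IPV6_TCLASS set that sends to an IPv4-mapped destination. The helper names are hypothetical and the address is from the documentation range. With the change, the traffic class set this way is expected to be applied to the resulting IPv4 packet as well.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Create a dual-stack UDP socket with the given traffic class. */
static int
udp6_socket_with_tclass(int tclass)
{
	int s, off = 0;

	s = socket(AF_INET6, SOCK_DGRAM, 0);
	if (s < 0)
		return (-1);
	/* Allow IPv4-mapped destinations on this AF_INET6 socket. */
	if (setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) < 0 ||
	    setsockopt(s, IPPROTO_IPV6, IPV6_TCLASS, &tclass,
	    sizeof(tclass)) < 0) {
		close(s);
		return (-1);
	}
	return (s);
}

/* Fill in an IPv4-mapped destination such as "::ffff:192.0.2.1". */
static int
v4mapped_dest(struct sockaddr_in6 *sin6, const char *mapped, uint16_t port)
{
	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	sin6->sin6_port = htons(port);
	if (inet_pton(AF_INET6, mapped, &sin6->sin6_addr) != 1)
		return (-1);
	return (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr) ? 0 : -1);
}
```

A caller would then pass the filled-in address to sendto(2) on the socket returned by `udp6_socket_with_tclass()`.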
Reviewed by: bz, glebius
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D53346
|
| |
|
|
|
|
|
|
|
|
|
| |
TCP stats are currently incremented for the persist and progress
timeout conditions, but only the persist cause was saved in the
connection end info status, which in turn is logged in the
blackbox "connection end" event.
Reviewed by: tuexen
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
| |
|
|
|
|
|
|
|
| |
The TCP_SAD_DETECTION code was removed. Remove the remaining
sysctl-variables and counters.
Reviewed by: tuexen
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When copying the data in the first mbuf to get rid of the UDP
header, use the correct length. It was copying too much (8 bytes,
the length of the UDP header).
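
The class of bug fixed here can be shown on a flat buffer. This is a simplified sketch, not the mbuf code: `strip_udp_header()` is a hypothetical helper, and the buffer layout (IP header, then UDP header, then the TCP segment) is an assumption. The number of bytes to move is what follows the UDP header, so the header length has to be subtracted from the copy length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UDP_HDR_LEN 8

/*
 * buf holds [IP header (iphlen bytes)][UDP header (8 bytes)][TCP segment].
 * Slide the TCP segment down over the UDP header and return the new
 * total length.
 */
static size_t
strip_udp_header(uint8_t *buf, size_t len, size_t iphlen)
{
	assert(len >= iphlen + UDP_HDR_LEN);
	/*
	 * Copy only what follows the UDP header.  Using len - iphlen here
	 * would copy 8 bytes too many and read past the end of the data.
	 */
	memmove(buf + iphlen, buf + iphlen + UDP_HDR_LEN,
	    len - iphlen - UDP_HDR_LEN);
	return (len - UDP_HDR_LEN);
}
```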
This only applies to handling TCP over UDP packets. The support for
TCP over UDP is disabled by default.
Reported by: jtl
Reviewed by: Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
|