path: root/sys/net
Commit message | Author | Age | Files | Lines
* nd6: Remove DRAFT_IETF_6MAN_IPV6ONLY_FLAG and EXPERIMENTAL options | Pouria Mousavizadeh Tehrani | 35 hours | 1 | -39/+0
| | | | | | | | | | | The draft-ietf-6man-ipv6only-flag has been obsoleted by RFC 8925. Remove the EXPERIMENTAL compile option from the kernel and remove DRAFT_IETF_6MAN_IPV6ONLY_FLAG from userland. This compile option was not enabled by default. Also regenerate src.conf.5. Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D56228
* ifnet: Add some sanity checks | Zhenlei Huang | 5 days | 1 | -10/+19
| | | | | | | | | | | To be more robust since the checking is now performed where the interface is referenced. While here, remove a redundant check from if_vmove_loan(). Reviewed by: kp, glebius, pouria MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D55875
* net: Add SIOCGI2CPB ioctl & add page/bank fields to ifi2creq | Andrew Gallatin | 6 days | 3 | -3/+16
| | | | | | | | | | | | | | | | | | | | This commit adds page & bank fields to ifi2creq in preparation for adding CMIS support for 400g optics to ifconfig. The new ioctl SIOCGI2CPB is added, so that drivers can distinguish between callers asking for page/bank selection and legacy callers that simply failed to zero out all ifi2creq fields. The mlx5en(4) driver and iflib(4) driver framework have been updated to use this new SIOCGI2CPB ioctl and support page/bank operations. A follow-on patchset will add support to ifconfig for reporting data from CMIS optics. This has been tested on Nvidia ConnectX-7 and Broadcom Thor2 (using out of tree driver) based NICs. Differential Revision: https://reviews.freebsd.org/D55912 Sponsored by: Netflix Inc. Reviewed by: kib
* net/route: Add an eventhandler for rt_numfibs changes | Mark Johnston | 13 days | 2 | -1/+9
| | | | | | | | | | | | | The multicast routing code will start implementing per-FIB routing tables. As a part of this, it needs to be notified when the number of FIBs changes, so that it can expand its tables. Add an eventhandler for this purpose. MFC after: 2 weeks Sponsored by: Stormshield Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D55239
* ifnet: Fix races in if_vmove_reclaim() | Zhenlei Huang | 13 days | 1 | -5/+9
| | | | | | | | | | | | | | | | | | | | | | | The thread running if_vmove_reclaim() may race with other threads running if_detach(), if_vmove_loan() or if_vmove_reclaim(). If the current thread loses the race, two issues arise: 1. It is unstable and unsafe to access ifp->if_vnet. 2. The interface has been removed from the "active" list, hence if_unlink_ifnet() can fail. For the first case, check against the source prison's vnet instead, given the interface is obtained from that vnet. For the second one, return ENODEV to indicate the interface was on the list but the current thread lost the race, to distinguish from ENXIO, which means the interface or child prison is not found. This is the same as in if_vmove_loan(). Reviewed by: kp, pouria Fixes: a779388f8bb3 if: Protect V_ifnet in vnet_if_return() MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D55997
* routing: Include opt_route.h in route_ctl.c | Pouria Mousavizadeh Tehrani | 2026-03-27 | 1 | -0/+1
| | | | | | | | Fix incorrect removal of opt_route.h header in route_ctl.c Reported by: Jenkins Fixes: 254b23eb1f54 ("routing: Retire ROUTE_MPATH compile option") Differential Revision: https://reviews.freebsd.org/D55884
* routing: Retire ROUTE_MPATH compile option | Pouria Mousavizadeh Tehrani | 2026-03-27 | 15 | -117/+22
| | | | | | | | | | | The ROUTE_MPATH compile option was introduced to test the new multipath implementation. Since compiling it has no overhead and it's enabled by default, remove it. Reviewed by: melifaro, markj Relnotes: yes Differential Revision: https://reviews.freebsd.org/D55884
* if_types: Fix a typo in a source code comment | Gordon Bergling | 2026-03-27 | 1 | -1/+1
| | | | | | | - s/Circiut/Circuit/ Obtained from: OpenBSD MFC after: 3 days
* bridge(4): Remove epoch_enter during destruction | Pouria Mousavizadeh Tehrani | 2026-03-19 | 1 | -5/+0
| | | | | | | bridge doesn't need to enter epoch during destruction. Reviewed by: zlei, glebius Differential Revision: https://reviews.freebsd.org/D55935
* if_bridge(4): don't sleep under epoch(9) in destruction | Pouria Mousavizadeh Tehrani | 2026-03-17 | 1 | -2/+2
| | | | | | | | | | | | bridge tries to run callout_drain(9) twice under epoch during destruction. once for bridge_timer, which is not required to be under epoch. second time for the BSTP callout, which is already disabled earlier inside bridge_delete_member. Reviewed by: glebius, zlei MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D55876
* ifnet: Remove unreachable code | Zhenlei Huang | 2026-03-16 | 1 | -18/+0
| | | | | | | | | | | | | | The ioctls SIOCSIFVNET and SIOCSIFRVNET are for userland only. For SIOCSIFVNET, if_vmove_loan(), the interface is obtained from current VNET. For SIOCSIFRVNET, if_vmove_reclaim(), a valid child prison is held before getting the interface. In both cases the VNET of the obtained interfaces is stable, so there's no need to check it. No functional change intended. Reviewed by: glebius, jamie (for #jails) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D55828
* ifnet: Fix decreasing the vnet interface count | Zhenlei Huang | 2026-03-16 | 1 | -3/+3
| | | | | | | | | | | | | It should be decreased only when the interface has been successfully removed from the "active" list. This prevents vnet_if_return() from potential OOB writes to the allocated memory "pending". Reviewed by: kp, pouria Fixes: a779388f8bb3 if: Protect V_ifnet in vnet_if_return() MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D55873
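The invariant stated in this commit, that the count may drop only when the interface was actually unlinked, can be sketched in userland C. This is a minimal illustration and not the kernel code: `sketch_iflist`, `sketch_unlink`, and the fixed-size array are hypothetical stand-ins for the per-vnet "active" list and its counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_IFS 8

/* Hypothetical stand-in for a per-vnet interface list plus its counter. */
struct sketch_iflist {
	int	ids[SKETCH_MAX_IFS];
	size_t	count;
};

/*
 * Remove `id` from the list.  The counter is decremented only on the
 * path where the entry was found and unlinked; a miss leaves it alone.
 */
static bool
sketch_unlink(struct sketch_iflist *l, int id)
{
	assert(l->count <= SKETCH_MAX_IFS);
	for (size_t i = 0; i < l->count; i++) {
		if (l->ids[i] != id)
			continue;
		/* Close the gap left by the removed entry. */
		for (size_t j = i + 1; j < l->count; j++)
			l->ids[j - 1] = l->ids[j];
		l->count--;
		return (true);
	}
	return (false);
}
```

A caller that sizes a later allocation from `count` then never sees a value smaller than the number of entries still on the list, which is the kind of mismatch the commit message ties to out-of-bounds writes.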
* rss_config: Add option to enable rss udp hashing | bigJ | 2026-03-16 | 1 | -12/+22
| | | | | | | | | Added optional system tunable parameter to enable 4-tuple rss udp hashing. Signed-off-by: bigJ <bigj@solanavibestation.com> Reviewed by: adrian, pouria Pull Request: https://github.com/freebsd/freebsd-src/pull/2057
* libpcap: Update to 1.10.6 | Joseph Mingrone | 2026-03-15 | 1 | -36/+141
| | | | | | | | | Changes: https://raw.githubusercontent.com/the-tcpdump-group/libpcap/89e982c37c36ad0bf9f10b7ded421cb42422effa/CHANGES Reviewed by: bms, emaste Obtained from: https://www.tcpdump.org/release/libpcap-1.10.6.tar.gz Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D55545 Differential Revision: https://reviews.freebsd.org/D55858
* debugnet: don't include udp_var.h | Gleb Smirnoff | 2026-03-12 | 1 | -1/+0
| | | | The module constructs UDP packets, but doesn't use the UDP stack.
* systm.h: don't declare socket and inpcb globally | Gleb Smirnoff | 2026-03-12 | 1 | -0/+1
* carp: retire ioctl(2) API | Gleb Smirnoff | 2026-03-12 | 1 | -10/+0
| | | | | | | | | All supported stable branches use the netlink(4) API to configure carp(4). The deleted code also has a kernel stack leak vulnerability that requires extra effort to fix. Reviewed by: pouria, kp Differential Revision: https://reviews.freebsd.org/D55804
* rss: manifest RSS option in kernel with kern.features sysctl | Gleb Smirnoff | 2026-03-05 | 1 | -0/+1
* vnet: Ensure the space allocated by vnet_data_alloc() is sufficiently aligned | Zhenlei Huang | 2026-02-28 | 1 | -3/+11
| | | | | | | | | | | | | | | | | Some 32-bit architectures, e.g., armv7, require strict 8-byte alignment while doing atomic 64-bit access. Hence aligning to the pointer type (4-byte alignment) does not meet the requirement on those architectures. Make the space allocated by vnet_data_alloc() sufficiently aligned to avoid unaligned access. PR: 265639 Diagnosed by: markj Reviewed by: jhb, markj Co-authored-by: jhb MFC after: 5 days Differential Revision: https://reviews.freebsd.org/D55560
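The alignment arithmetic behind this fix can be sketched as follows. This is not the vnet_data_alloc() implementation: `sketch_align_up`, `sketch_alloc_size`, and the constant `SKETCH_DATA_ALIGN` are hypothetical names, and the value 8 is simply the 64-bit requirement quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 8 bytes suffices for 64-bit atomic access. */
#define SKETCH_DATA_ALIGN 8

/* Round `off` up to the next multiple of `align` (a power of two). */
static size_t
sketch_align_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((off + align - 1) & ~(align - 1));
}

/* Size actually reserved for a request of `size` bytes. */
static size_t
sketch_alloc_size(size_t size)
{
	return (sketch_align_up(size, SKETCH_DATA_ALIGN));
}
```

With pointer-sized (4-byte) rounding, a 12-byte request stays at 12, so the next object could start at an offset that is not a multiple of 8. With 8-byte rounding the same request reserves 16 bytes, and every following object stays 8-byte aligned provided the base of the region is.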
* gre: unbreak LINT-NOINET | Enji Cooper | 2026-02-27 | 1 | -4/+12
| | | | | | | | | | | | | - Move some of the braces under their respective conditionals to make the statements more self-encapsulated and only define the `aliasreq` union in the event either INET or INET6 is defined. - Fix a copy-paste error: `in_gre_ioctl` should be `in6_gre_ioctl` in the INET6 case. Reported by: tinderbox Fixes: e1e18cc12e68 ("if_gre: Add netlink support with tests") Differential Revision: https://reviews.freebsd.org/D55546
* rtsock: Fix stack overflow | Mark Johnston | 2026-02-24 | 1 | -2/+2
| | | | | | | Approved by: so Security: FreeBSD-SA-26:05.route Security: CVE-2026-3038 Fixes: 92be2847e845 ("rtsock: Avoid copying uninitialized padding bytes")
* net/if_vlan.c: do not leak vlan sx slock in vlan_clone_dump_nl() | Konstantin Belousov | 2026-02-22 | 1 | -0/+1
| | | | | | | | | Reported by: pho Reviewed by: markj Fixes: d4062b9f16e46f039f2b5b40dd35592b5dabf00c Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D55447
* if_gre: Add netlink support with tests | Pouria Mousavizadeh Tehrani | 2026-02-18 | 2 | -66/+408
| | | | | | | | Migrate to new if_clone KPI and implement netlink support for gre(4). Also refactor some of the gre specific ioctls. Reviewed by: glebius, zlei Differential Revision: https://reviews.freebsd.org/D54443
* bpf: don't call bpf_detachd() in bpf_setdlt() | Gleb Smirnoff | 2026-02-13 | 1 | -1/+0
| | | | | | | The bpf_attachd() will perform bpf_detachd() itself. Performing it twice will lead to doing CK_LIST_REMOVE twice. Reported & tested by: bz
* lagg: Avoid dropping locks when starting the interface | Zhenlei Huang | 2026-02-11 | 1 | -17/+19
| | | | | | | | | | | | | | | The init routine of a lagg(4) interface will not change during the whole lifecycle. So we can call lagg_init() directly instead of through the function pointer. However, that requires dropping and reacquiring the lock, which unnecessarily exposes a small race window. Refactor lagg_init() into lagg_init_locked() and call the latter to avoid that. Meanwhile, delay updating the driver managed status until after the interface is really ready. Reviewed by: markj MFC after: 5 days Differential Revision: https://reviews.freebsd.org/D55198
* pf: remove unused variable from pf_test_ctx | Kristof Provost | 2026-02-10 | 1 | -1/+0
| | | | Sponsored by: Rubicon Communications, LLC ("Netgate")
* net: Remove the IFF_RENAMING flag | Mark Johnston | 2026-02-10 | 6 | -21/+0
| | | | | | | | | | | This used to be needed when interface renames were broadcast using the ifnet_departure_event eventhandler, but since commit 349fcf079ca3 ("net: add ifnet_rename_event EVENTHANDLER(9) for interface renaming"), it has no purpose. Remove it. Reviewed by: pouria, zlei Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D55171
* iflib: Add support for SIOCGIFDOWNREASON ioctl | Chandrakanth Patil | 2026-02-10 | 2 | -0/+16
| | | | | | | | | | | | | | | | | | | This change adds native support for the SIOCGIFDOWNREASON ioctl in iflib. When ifconfig issues SIOCGIFDOWNREASON, the request is now routed through a new driver callback (IFDI_GET_DOWNREASON). iflib allocates the ifdownreason structure, calls the driver to fill the down-reason message, and then returns the data back to ifconfig for display. Without this change, iflib-based drivers cannot implement link-down reason reporting even if the hardware provides the information. No functional change for existing drivers unless they implement the new IFDI_GET_DOWNREASON method. Existing drivers continue to behave as before. Reviewed by: gallatin, erj, kgalazka, ssaxena, #iflib Differential Revision: https://reviews.freebsd.org/D54045 MFC After: 1 week
* lagg: Make lagg_link_active() static | Zhenlei Huang | 2026-02-09 | 1 | -1/+1
| | | | | | | | | | | | It is declared as static. Make the definition consistent with the declaration. It was once fixed by commit 52e53e2de0ec, but that commit was reverted, leaving it unfixed. No functional change intended. MFC after: 3 days
* lagg: Remove the member pr_num from struct lagg_proto | Zhenlei Huang | 2026-02-06 | 1 | -13/+6
| | | | | | | | | | | | | | | It is set but never used. Remove it to avoid confusion and save a little space. While here, use designated initializers to initialize the LAGG protocol table. That improves readability, and it will be safer to initialize the table if we introduce new protocols in the future. No functional change intended. Reviewed by: glebius MFC after: 5 days Differential Revision: https://reviews.freebsd.org/D55124
* lagg: Make the none protocol a first-class citizen | Zhenlei Huang | 2026-02-06 | 1 | -9/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All the other protocols have corresponding start and input routines, which are used in the fast path. Currently the none protocol is treated specially. In the fast path it is checked to indicate whether a working protocol is configured. There are two issues raised by this design: 1. In production, other protocols are commonly used, but not the none protocol. It smells like an overkill to always check it in the fast path. It is unfair to other commonly used protocols. 2. PR 289017 reveals that there's a small window between checking the protocol and calling lagg_proto_start(). lagg_proto_start() may see the none protocol and dereference a NULL pointer. Fix them by making the none protocol a first-class citizen so that it has start and input routines just the same as other protocols. Then we can stop checking it in the fast path, since lagg_proto_start() and lagg_proto_input() will never fail to work. The error ENETDOWN is chosen for the start routine. Obviously no active ports are available, and the packets will go nowhere. It is also a better error than ENXIO, since indeed the interface is configured and has a TX algorithm (the none protocol). PR: 289017 Diagnosed by: Qiu-ji Chen <chenqiuji666@gmail.com> Tested by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed by: glebius MFC after: 5 days Differential Revision: https://reviews.freebsd.org/D55123
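The design described here, where every protocol (including "none") has a start routine so the fast path needs no NULL check, might look roughly like the userland sketch below. All names (`sketch_proto`, `sketch_proto_ops`, `sketch_transmit`) are hypothetical and the table is far simpler than the real lagg(4) one. It also uses designated initializers, as the pr_num cleanup commit above does for the protocol table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical packet and protocol identifiers for this sketch. */
struct sketch_pkt { size_t len; };

enum sketch_proto {
	SKETCH_PROTO_NONE = 0,
	SKETCH_PROTO_ROUNDROBIN,
	SKETCH_PROTO_MAX
};

struct sketch_proto_ops {
	int (*start)(struct sketch_pkt *);
};

/* The "none" protocol has no ports, so transmission reports ENETDOWN. */
static int
sketch_none_start(struct sketch_pkt *p)
{
	(void)p;
	return (ENETDOWN);
}

/* Placeholder for a working protocol; it just reports success. */
static int
sketch_rr_start(struct sketch_pkt *p)
{
	(void)p;
	return (0);
}

/* Designated initializers keep each entry tied to its enum value. */
static const struct sketch_proto_ops sketch_protos[SKETCH_PROTO_MAX] = {
	[SKETCH_PROTO_NONE] = { .start = sketch_none_start },
	[SKETCH_PROTO_ROUNDROBIN] = { .start = sketch_rr_start },
};

/* Fast path: no special case for "none", every slot is callable. */
static int
sketch_transmit(enum sketch_proto proto, struct sketch_pkt *p)
{
	assert((unsigned)proto < SKETCH_PROTO_MAX);
	return (sketch_protos[proto].start(p));
}
```

Because the dispatcher never tests for a missing routine, there is no window between a check and a call in which the configured protocol could change underneath it.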
* bpf: don't clear pointer from descriptor to the tap on descriptor close | Gleb Smirnoff | 2026-02-04 | 1 | -1/+1
| | | | | | | | | | During packet processing the descriptor is looked up using epoch(9) and it can be accessed after bpf_detachd(). In scenario of descriptor close the tap point is alive (it actually produces packets) and thus the pointer can be legitimately dereferenced. This fixes a race on a bpf(4) device close that would otherwise result in panic. Differential Revision: https://reviews.freebsd.org/D55064
* pf: fix use of uninitialised variable | Kristof Provost | 2026-02-03 | 1 | -4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | In pf_match_rule() we attempt to append matching rules to the end of 'match_rules'. We want to preserve the order to make the multiple pflog entries easier to understand. So we keep track of the last added rule item in 'rt'. However, that assumed that 'match_rules' was only ever added to in that one call to pf_match_rules(). This isn't always the case, for example if we have match rules in different anchors. In that case we'd end up using the uninitialised 'rt' variable in the SLIST_INSERT_AFTER call. Instead track the match rules and the last matching rule (to enable easy appending) in the struct pf_test_ctx. This also allows us to reduce the number of arguments for some functions, because we passed a ctx to most functions that needed 'match_rules'. While here also make pf_match_rules() static, because it's only ever used in pf.c Add a test case to exercise the relevant code path. MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC ("Netgate")
* epair: add VLAN_HWTAGGING | Timo Völker | 2026-01-30 | 1 | -12/+16
| | | | | | | | | | | | | | | | | | | | | | Add capability VLAN_HWTAGGING to the epair interface and enable it by default. When sending a packet over a VLAN interface that uses an epair interface, the flag M_VLANTAG and the ether_vtag (which contains the VLAN ID and/or PCP) are set in the mbuf to inform the hardware that the VLAN header has to be added. The sending epair end does not need to actually add a VLAN header. It can just pass the mbuf with this setting to the other epair end, which receives the packet. The receiving epair end can just pass the mbuf with this setting to the upper layer. Due to this setting, the upper layer believes that there was a VLAN header that has been removed by the interface. If the packet later leaves the host, the outgoing physical interface can add the VLAN header in hardware if it supports VLAN_HWTAGGING. If not, the implementation of Ethernet or bridge adds the VLAN header in software. Reviewed by: zlei, tuexen MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D52465
* net/iflib.c: move out scheduler-dependent code into the hook | Konstantin Belousov | 2026-01-29 | 1 | -79/+3
| | | | | | | | | | | | Add sched_find_l2_neighbor(). This really should not be scheduler-dependent; it does not have anything to do with the scheduler at all. But for now keep the same code structure. Reviewed by: olce Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D54831
* netinet6: store ND context directly in struct in6_ifextra | Gleb Smirnoff | 2026-01-23 | 1 | -2/+2
| Stop using struct nd_ifinfo for that, because it is an API struct for
| SIOCGIFINFO_IN6. The functional changes are isolated to the protocol
| attach and detach: in6_ifarrival(), nd6_ifattach(), in6_ifdeparture(),
| nd6_ifdetach(), as well as to the nd6_ioctl(), nd6_ra_input(),
| nd6_slowtimo() and in6_ifmtu().
|
| The dad_failures member was just renamed to match the rest.
|
| The M_IP6NDP malloc(9) type declaration moved to files that actually use it.
|
| The rest of the changes are mechanical substitution of double pointer
| dereference via ND_IFINFO() to a single pointer dereference. This was
| achieved with a sed(1) script:
|
| s/ND_IFINFO\(([a-z0-9>_.-]+)\)->(flags|linkmtu|basereachable|reachable|retrans|chlim)/\1->if_inet6->nd_\2/g
| s/nd_chlim/nd_curhoplimit/g
|
| Reviewed by: tuexen, madpilot
| Differential Revision: https://reviews.freebsd.org/D54725
* iflib: null out freed mbuf in iflib_txsd_free | Andrew Gallatin | 2026-01-19 | 1 | -0/+1
| | | | | | | | | | | | When adding the IFLIB_GET_MBUF/FLAGS, I neglected to NULL out the mbuf in the descriptor ring. I didn't think this should matter as I thought this code was only used when the ring was about to be freed. But I was wrong, and leaving a stale mbuf in there can cause panics. Reported by: Marek Zarychta (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=292547) Fixes: 14d93f612f26 Sponsored by: Netflix
* net: on interface detach purge all its routes before detaching protocols | Gleb Smirnoff | 2026-01-17 | 1 | -2/+2
| | | | | | | | | | | | | | | | Otherwise, a forwarding thread may use the interface being detached. This is a regression from 0d469d23715d, which manifests itself as a reliably reproducible panic in in6_selecthlim(). Note that there are old bug reports about such a panic, and I believe this change will not fix them, as their nature is not due to a screwed up detach sequence, but due to lack of proper epoch(9) based synchronization between the detach and forwarding. Reviewed by: pouria Reported & tested by: jhibbits PR: 292162 Fixes: 0d469d23715d690b863787ebfa51529e1f6a9092 Differential Revision: https://reviews.freebsd.org/D54721
* if_ovpn: add interface counters | Kristof Provost | 2026-01-15 | 1 | -0/+32
| | | | | | | | | Count input/output packets and bytes on the interface as well, not just in openvpn-specific counters. PR: 292464 MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC ("Netgate")
* pf: configurable action on limiter exceeded | Kristof Provost | 2026-01-14 | 1 | -2/+9
| This change extends pf(4) limiters so the administrator can specify the
| action the rule executes when the limit is reached. By default, when the
| limit is reached the limiter overrides the action specified by the rule
| to no-match. If the administrator wants to block the packet instead, the
| rule with the limiter should be changed to:
|
|     pass in from any to any state limiter test (block)
|
| OK dlg@
|
| Obtained from: OpenBSD, sashan <sashan@openbsd.org>, 04394254d9
| Sponsored by: Rubicon Communications, LLC ("Netgate")
* pf: convert state limiter interface to netlink | Kristof Provost | 2026-01-14 | 1 | -65/+43
| | | | | | | This is a new feature with new ioctl calls, so we can safely remove them right now. Sponsored by: Rubicon Communications, LLC ("Netgate")
* pf: introduce source and state limiters | Kristof Provost | 2026-01-14 | 1 | -3/+411
| both source and state limiters can provide constraints on the number
| of states that a set of rules can create, and optionally the rate at
| which they are created. state limiters have a single limit, but source
| limiters apply limits against a source address (or network). the source
| address entries are dynamically created and destroyed, and are also
| limited.
|
| this started out because i was struggling to understand the source and
| state tracking options in pf.conf, and looking at the code made it
| worse. it looked like some functionality was missing, and the code also
| did some things that surprised me. taking a step back from it, even if
| it did work, what is described doesn't work well outside very simple
| environments.
|
| the functionality i'm talking about is most of the stuff in the
| Stateful Tracking Options section of pf.conf(4). some of the problems
| are illustrated by one of the simplest options: the "max number" option
| that limits the number of states that a rule is allowed to create:
|
| - wiring limits up to rules is a problem because when you load a new
|   ruleset the limit is reset, allowing more states to be created than
|   you intended.
| - a single "rule" in pf.conf can expand to multiple rules in the kernel
|   thanks to things like macro expansion for multiple ports. "max 1000"
|   on a line in pf.conf could end up being many times that in effect.
| - when a state limit on a rule is reached, the packet is dropped. this
|   makes it difficult to do other things with the packet, such as
|   redirecting it to a tarpit or another server that replies with an
|   outage notice or such.
|
| a state limiter solves these problems. the example from the pf.conf.5
| change demonstrates this:
|
|   An example use case for a state limiter is to restrict the number of
|   connections allowed to a service that is accessible via multiple
|   protocols, e.g. a DNS server that can be accessed by both TCP and UDP
|   on port 53, DNS-over-TLS on TCP port 853, and DNS-over-HTTPS on TCP
|   port 443 can be limited to 1000 concurrent connections:
|
|     state limiter "dns-server" id 1 limit 1000
|     pass in proto { tcp udp } to port domain state limiter "dns-server"
|     pass in proto tcp to port { 853 443 } state limiter "dns-server"
|
| a single limit across all these protocols can't be implemented with per
| rule state limits, and any limits that were applied are reset if the
| ruleset is reloaded.
|
| the existing source-track implementation appears to be incomplete, i
| could only see code for "source-track global", but not "source-track
| rule". source-track global is too heavy and unwieldy a hammer, and
| source-track rule would suffer the same issues around rule lifetimes
| and expansions that the "max number" state tracking config above has.
|
| a slightly expanded example from the pf.conf.5 change for source
| limiters:
|
|   An example use for a source limiter is the mitigation of denial of
|   service caused by the exhaustion of firewall resources by network or
|   port scans from outside the network. The states created by any one
|   scanner from any one source address can be limited to avoid impacting
|   other sources. Below, up to 10000 IPv4 hosts and IPv6 /64 networks
|   from the external network are each limited to a maximum of 1000
|   connections, and are rate limited to creating 100 states over a 10
|   second interval:
|
|     source limiter "internet" id 1 entries 10000 \
|         limit 1000 rate 100/10 \
|         inet6 mask 64
|     block in on egress
|     pass in quick on egress source limiter "internet"
|     pass in on egress proto tcp probability 20% rdr-to $tarpit
|
| the extra bit is if the source limiter doesn't have "space" for the
| state, the rule doesn't match and you can fall through to tarpitting
| 20% of the tcp connections for fun.
|
| i've been using this in anger in production for over 3 years now.
|
| sashan@ has been poking me along (slowly) to get it in a good enough
| shape for the tree for a long time. it's been one of those years.
|
| bluhm@ says this doesn't break the regress tests.
| ok sashan@
|
| Obtained from: OpenBSD, dlg <dlg@openbsd.org>, 8463cae72e
| Sponsored by: Rubicon Communications, LLC ("Netgate")
* enc: create an interface at SI_SUB_PROTO_IF stage | Gleb Smirnoff | 2026-01-13 | 1 | -1/+1
| | | | | | | | | | | | | | | | Creation of enc0 before SI_SUB_PROTO_MC mangles the MLD list as well as encounters IGMP mutex not initialized yet. Reported & tested by: mjg NB: the enc(4) is not a true interface indeed. In a perfect world the module shall not create a cloner, shall not enter if_attach(), shall not trigger ifnet_arrival_event, neither shall have any protocol attached to it. The enc0 exists for two purposes: 1) create a bpf(9) tap; 2) to allow injection packets in the middle of ipsec(4) processing temporarily rewriting m_pkthdr.rcvif to point at enc0. While the problem 1 is already solved with a recent divorce between bpf(9) and ifnet(9), the problem 2 is harder to solve without breaking packet filter rules that use "via enc0".
* iflib: remove convoluted custom zeroing code | Brooks Davis | 2026-01-09 | 1 | -60/+5
| | | | | | | | | | | | | | Replace a collection of aliasing violations and ifdefs with memset (which now expands to __builtin_memset and should be quite reliably inlined.) The old code is hard to maintain as evidenced by the most recent change to if_pkt_info_t updating the defines, but not the zeroing code. Reviewed by: gallatin, erj Effort: CHERI upstreaming Sponsored by: Innovate UK Fixes: 43d7ee540efe ("iflib: support for transmit side nic KTLS offload") Differential Revision: https://reviews.freebsd.org/D54605
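The maintainability argument in this commit can be illustrated with a small sketch. `struct sketch_pkt_info` is a hypothetical stand-in, not the real if_pkt_info_t layout, and `sketch_pkt_info_zero` is an invented helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet-info structure; the field names are illustrative. */
struct sketch_pkt_info {
	void		*segs;
	uint32_t	 len;
	uint16_t	 nsegs;
	uint16_t	 flags;
	uint64_t	 csum_flags;
};

/*
 * Zero the whole structure in one call.  sizeof(*pi) follows the type,
 * so adding or resizing a field needs no matching change here.
 */
static void
sketch_pkt_info_zero(struct sketch_pkt_info *pi)
{
	assert(pi != NULL);
	memset(pi, 0, sizeof(*pi));
}
```

Hand-written word-by-word zeroing has to be updated whenever the structure changes, which is the failure mode the commit message describes.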
* iflib: Drop tx lock when freeing mbufs using simple_transmit | Andrew Gallatin | 2026-01-07 | 1 | -35/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing completed transmit mbufs can be time consuming (due to them being cold in cache, and due to ext free routines taking locks), especially when we batch tx completions. If we do this when holding the tx ring mutex, this can cause lock contention on the tx ring mutex when using iflib_simple_transmit. To resolve this, this patch opportunistically copies completed mbuf pointers into a new array (ifsd_m_defer) so they can be freed after dropping the transmit mutex. The ifsd_m_defer array is opportunistically used, and may be NULL. If it's NULL, then we free mbufs in the old way. The ifsd_m_defer array is atomically nulled when a thread is using it, and atomically restored when the freeing thread is done with it. The use of atomics here avoids acquire/release of the tx lock to restore the array after freeing mbufs. Since we're no longer always freeing mbufs inline, peeking into them to see if a transmit used TSO or not will cause a useless cache miss, as nothing else in the mbuf is likely to be accessed soon. To avoid that cache miss, we encode a TSO or not TSO flag in the lower bits of the mbuf pointer stored in the ifsd_m array. Note that the IFLIB_NO_TSO flag exists primarily for sanity/debugging. iflib_completed_tx_reclaim() was refactored to break out iflib_txq_can_reclaim() and _iflib_completed_tx_reclaim() so that the tx routine can call iflib_tx_credits_update() just once, rather than twice. Note that deferred mbuf freeing is not enabled by default, and can be enabled using the dev.$DEV.$UNIT.iflib.tx_defer_mfree sysctl. Differential Revision: https://reviews.freebsd.org/D54356 Sponsored by: Netflix Reviewed by: markj, kbowling, ziaee
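Encoding a flag in the low bits of a pointer, as the commit message describes for the TSO marker, can be sketched in portable C. This is an illustration under assumptions and not the iflib code: the names `SKETCH_FLAG_TSO`, `SKETCH_FLAG_MASK`, `sketch_tag`, `sketch_ptr`, and `sketch_is_tso` are hypothetical, and the sketch assumes the pointed-to objects are at least 4-byte aligned so the two low bits are free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout: two low bits reserved, bit 0 means "TSO". */
#define SKETCH_FLAG_TSO		((uintptr_t)0x1)
#define SKETCH_FLAG_MASK	((uintptr_t)0x3)

/* Store the flag alongside the pointer; the low bits must be free. */
static void *
sketch_tag(void *m, bool tso)
{
	uintptr_t v = (uintptr_t)m;

	assert((v & SKETCH_FLAG_MASK) == 0);
	return ((void *)(v | (tso ? SKETCH_FLAG_TSO : 0)));
}

/* Recover the original pointer by masking off the flag bits. */
static void *
sketch_ptr(void *tagged)
{
	return ((void *)((uintptr_t)tagged & ~SKETCH_FLAG_MASK));
}

/* Read the flag without dereferencing the object at all. */
static bool
sketch_is_tso(void *tagged)
{
	return (((uintptr_t)tagged & SKETCH_FLAG_TSO) != 0);
}
```

The point of the technique is in `sketch_is_tso()`: the answer comes from the pointer value itself, so the cold object behind it is never touched.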
* bridge: Allow BRDGSIFVLANSET without IFBRF_VLANFILTER | Lexi Winter | 2026-01-03 | 1 | -3/+0
| | | | | | | | | | | | | | | Currently, we disallow BRDGSIFVLANSET when IFBRF_VLANFILTER is disabled. There's no particular reason to do this, and it causes some undesirable behaviour such as not being able to remove the tagged config on a member after disabling vlanfilter on the bridge. Remove the restriction so BRDGSIFVLANSET is always accepted. PR: 292019 MFC after: 1 week Reviewed by: zlei, p.mousavizadeh_protonmail.com Sponsored by: https://www.patreon.com/bsdivy Differential Revision: https://reviews.freebsd.org/D54435
* pf: sprinkle const over pf_addr_cmp() | Kristof Provost | 2026-01-02 | 1 | -1/+1
| | | | Sponsored by: Rubicon Communications, LLC ("Netgate")
* sys/netipsec: ensure sah stability during input callback processing | Konstantin Belousov | 2025-12-22 | 1 | -2/+10
| | | | | | | | | | | | Citing ae: this fixes some rare panics, that are reported in derived projects: `panic: esp_input_cb: Unexpected address family'. Reported by: ae Tested by: ae, Daniel Dubnikov <ddaniel@nvidia.com> Reviewed by: ae, Ariel Ehrenberg <aehrenberg@nvidia.com> (previous version) Sponsored by: NVidia networking MFC after: 1 week Differential revision: https://reviews.freebsd.org/D54325
* if_tuntap: use ifnet_rename_event instead of ifnet_arrival_event | Gleb Smirnoff | 2025-12-22 | 1 | -12/+6
* ng_ether: refactor to use interface EVENTHANDLER(9)s | Gleb Smirnoff | 2025-12-22 | 4 | -43/+0