| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Differential Revision: https://reviews.freebsd.org/D53873
|
Imagine that bpf(9) tapping can happen at any point in the network stack,
not necessarily at interface transmit or receive. To achieve that we need
a thin abstraction layer, defined by struct bif_methods, that describes
how the generic bpf layer works with a tap point of this kind.
Implement the ifnet(9)-specific methods in a separate file, bpf_ifnet.c. At
this point there is 100% compatibility for all existing interfaces and
no KPI change yet. The legacy attaching KPI is layered over the new
ifnet-agnostic KPI. The new KPI may still change, though, as we can
implement multiple DLTs per single tap point in a prettier fashion.
The new abstraction layer allows us to move all the 802.11 radio injection
hacks out of bpf.c into ieee80211_radiotap.c, so do that immediately as a
good proof of concept.
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D53872
|
This makes it easier to reason about system topology, and to
potentially map applications to NIC queues by (ab)using the
mbuf flowid to select egress NIC and queue in a predictable fashion.
Differential Revision: https://reviews.freebsd.org/D54053
Reviewed by: glebius, kbowling
Sponsored by: Netflix
|
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D54190
|
pfsync_state_export() takes a pointer to a union that is in reality a
pointer to one of the three state formats (1301, 1400, 1500), and zeros
the union. The three formats do not have the same size, so zeroing is
wrong when the format isn't that which has the largest size.
Refactor a bit so that the zeroing happens at the layer where we know
which format we're dealing with.
Reported by: CHERI
Reviewed by: kp
MFC after: 1 week
Sponsored by: CHERI Research Centre (EPSRC grant UKRI3001)
Differential Revision: https://reviews.freebsd.org/D54163
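The zeroing problem described in this commit message can be sketched in plain C. This is only an illustration under stated assumptions: the struct names, sizes, and the helper `state_export_prepare()` are hypothetical stand-ins, not the real pfsync definitions. It shows why clearing `sizeof(union)` bytes is unsafe when the caller's buffer only holds one of the smaller formats.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for three export formats of different sizes. */
struct state_v1301 { unsigned char bytes[16]; };
struct state_v1400 { unsigned char bytes[24]; };
struct state_v1500 { unsigned char bytes[32]; };

union state_union {
	struct state_v1301 v1301;
	struct state_v1400 v1400;
	struct state_v1500 v1500;
};

/* Size of the format the caller actually allocated; 0 if unknown. */
static size_t
state_size(int version)
{
	switch (version) {
	case 1301:
		return (sizeof(struct state_v1301));
	case 1400:
		return (sizeof(struct state_v1400));
	case 1500:
		return (sizeof(struct state_v1500));
	default:
		return (0);
	}
}

/*
 * Zero only as many bytes as the selected format occupies.
 * memset(sp, 0, sizeof(*sp)) would always clear the size of the
 * largest member and overrun a buffer holding a smaller format.
 */
static int
state_export_prepare(union state_union *sp, int version)
{
	size_t len = state_size(version);

	assert(sp != NULL);
	if (len == 0)
		return (-1);
	memset(sp, 0, len);
	return (0);
}
```

The point of the sketch is only that the size decision moves to the place where the format is known; how the real code is layered is described in the review, not here.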
|
- s/backet/bucket/
MFC after: 3 days
|
This shrinks the structure a bit. Should be no functional change.
Differential Revision: https://reviews.freebsd.org/D53870
|
If the TTL (or hop limit) expires during nat64 translation we may
need to send the error message in the original address family (i.e.
pre-translation).
We'd usually handle this in pf_route()/pf_route6(), but at that point we
have already translated the packet, making it difficult to include it in
the generated ICMP message.
Check for this case in pf_translate_af() and send icmp errors directly
from it.
PR: 291527
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D54166
|
Avoid a possible use-after-free in the rx path.
ovpn_decrypt_rx_cb() calls ovpn_finish_rx() which releases the lock,
but continues to use the peer.
Ensure that the peer cannot be freed until we're sure all potential
users have stopped using it (i.e. have left net_epoch).
Reported by: Kevin Day <kevin@your.org>
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
Add struct mtx to struct lltable and stop using IF_AFDATA_LOCK, which
was created for a completely different purpose. No functional change
intended.
Reviewed by: zlei, melifaro
Differential Revision: https://reviews.freebsd.org/D54086
|
The old approach where we go through the list of interfaces and count them
has bugs. One obvious bug with this dynamic translation is that once an
Ethernet interface in the middle of the list goes away, all interfaces
following it would change their Linux names.
A bigger problem is the ifnet arrival and departure times. For example,
linsysfs has an event handler for ifnet_arrival_event, and of course it wants
to resolve the name. This accidentally works, due to a bug in
if_attach() where we call if_link_ifnet() before invoking all the event
handlers. Once the bug is fixed, linsysfs won't be able to resolve the name
the old way. The other side is ifnet_departure_event, where there is no bug:
the event handlers are called after if_unlink_ifnet(). This means the old
translation won't work for departure event handlers. One example is
netlink. This change gives Netlink a chance to emit a proper Linux
interface departure message.
However, there is another problem in Netlink, that the ifnet pointer is
lost in the Netlink translation layer. Plug this with a cookie in netlink
writer structure that can be set by the route layer and used by the Netlink
Linux translation layer. This part of the diff seems unrelated, but it is
hard to make it a separate change, as the old KPI goes away and to use the
new one we need the pointer.
Differential Revision: https://reviews.freebsd.org/D54077
|
Fixes: fd131b47f20dbeb515f5e3e6ea87948f2638eda9
|
It is a remnant of a network stack design that was supposed to support
multiple network protocols. Today it is clear that we are left with IPv4
and IPv6 only. Only IPv6 may have an MTU different to the interface MTU.
|
Otherwise you just can't include pfvar.h without compiling pf in.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D54064
|
All accesses to this list are done with the global lock held. The
CK connotation is just confusing the reader.
Fixes: 699281b545a8a3fc5109b5f2db62d261b65b588b
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D53869
|
This removes the global counter, which was updated in a racy manner.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D53868
|
The struct was used for bpf_if to bif_dlist masking, that is used to
optimize bpf_peers_present() call. The only functional change here is
that bif_dlist and bif_next swap their places in the structure. Both
belong to the first cache line anyway.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D53867
|
No functional change.
|
Clear the RSS hash on transmit, now that RSS hashing is enabled
unconditionally, and the network stack may want to trust that
it is getting the correct hash on input.
Differential Revision: https://reviews.freebsd.org/D53090
Reviewed by: zlei
Sponsored by: Netflix
|
We use the fact that all NICs that support hashing are using the
same hash algorithm and hash key to enable symmetric hashing in
TCP, where a software version of the same hash is used to
establish hashes on outgoing connections.
Sponsored by: Netflix
Reviewed by: adrian, zlei (both early version)
Differential Revision: https://reviews.freebsd.org/D53089
|
With modern debugging tools it isn't useful at all and is just a
maintenance burden.
|
The unlocked one is used only once. No functional change.
|
This basically refactors 4f42daa4a326f to use less indentation and
variables. The code is still not race proof.
|
Should have gone together with 9738277b5c66.
|
Use the same check as iflib_if_transmit() to detect when the
interface is down and return the proper error code, and also
free the mbuf.
This fixes an mbuf leak when a member of a lagg is brought
down (and probably many other scenarios).
Sponsored by: Netflix
|
IFT_ENC has special behaviour in pf that we don't desire, and this also ensures that
for all interface types there is an N:1:1 correspondence between if_type:dlt:header len.
Requested by: glebius
MFC after: 1 week
|
active cable
The register address of link length of copper or active cable is 146 as
per the SFF-8436 specification [1].
[1] 7.6.2 Upper Memory Map Page 00h SFF-8436 Specification (pdf): https://members.snia.org/document/dl/25896
Reviewed by: imp, zlei
MFC after: 1 week
Pull Request: https://github.com/freebsd/freebsd-src/pull/1885
Closes: https://github.com/freebsd/freebsd-src/pull/1885
|
Reported by: glebius
Fixes: 2d608a4cebbd if_media.h: Add 400GBase-SR8 and 400GBase-CR8
MFC after: 1 week
Sponsored by: Chelsio Communications
|
Reviewed by: bz (network)
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D53387
|
Remove prefetching from the transmit path of iflib in the interest of
increased performance and reduced complexity. Details regarding the
performance penalties of prefetching can be found in the differential
review.
Note this prefetching was only done on link speeds of 10Gb/s and
above, so the change is a no-op (or perhaps slight performance
improvement simply due to the code simplification) for slower
interfaces.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D53674
Reviewed by: kbowling, markj, mjg
|
We're in the dtor, so we can't destroy it now without deadlocking after
recent changes to make destroy_dev() provide a barrier. However, we
know there isn't any other dtor to run, so we can go ahead and clean up
our state and just prevent a use-after-free if someone races to open
the device while we're trying to destroy it. tunopen() now uses the
net epoch to protect against softc release by a concurrent
tun_destroy().
While we're here, allow a destroy operation to proceed if we caught a
signal in cv_wait_sig() but tun_busy dropped to 0 while we were waiting
to acquire the lock.
This was more of an inherent design flaw, rather than a bug in the
below-refed commit.
PR: 290575
Fixes: 4dbe6628179d ("devfs: make destroy_dev() a release [...]")
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D53438
|
The list of addresses is potentially very large, larger than we can fit in a
single netlink request, so we indicate via the PFR_FLAG_START/PFR_FLAG_DONE
flags when we start and finish, so the kernel can work out which addresses need
to be removed.
Sponsored by: Rubicon Communications, LLC ("Netgate")
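The start/finish marking described above can be sketched as a small helper that decides which markers a given chunk of a long list carries. This is an assumption-laden illustration: `SKETCH_FLAG_START`, `SKETCH_FLAG_DONE`, and `chunk_flags()` are hypothetical names with made-up values, not the real PFR_FLAG_* definitions or the pf netlink code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the real PFR_FLAG_* values are not shown here. */
#define SKETCH_FLAG_START	0x1
#define SKETCH_FLAG_DONE	0x2

/*
 * Flags for the chunk that begins at 'offset' and carries up to 'chunk'
 * entries out of 'total'.  The first chunk is marked START, the chunk
 * that reaches the end of the list is marked DONE, and a list that fits
 * in one request carries both.
 */
static int
chunk_flags(size_t offset, size_t chunk, size_t total)
{
	int flags = 0;

	assert(offset <= total);
	if (offset == 0)
		flags |= SKETCH_FLAG_START;
	if (total - offset <= chunk)
		flags |= SKETCH_FLAG_DONE;
	return (flags);
}
```

With the two markers the receiver can tell a complete replacement from a partial update, which is what lets it work out the addresses that were not mentioned.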
|
Source nodes redirect (nat-to, rdr-to, route-to) all further connections
matching the rule which has created the source node. The source node is
valid as long as there are states resulting from the rule or until the
source node lifetime expires. When the rule's redirection pool is
modified (e.g. table contents are changed) the source node is still
valid and it will redirect new connections to an invalid target (e.g. a
dead next-hop).
When performing source tracking, after finding a source node check if the
redirection address still exists in the pool of the rule which has created
this node. If not, delete the source node. This will result in finding a
new redirection address and creation of a new source node.
Reviewed by: kp
Obtained from: OpenBSD
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D53231
|
MFC after: 1 week
|
These structures are copied out to userspace, and it's possible to leak
uninitialized stack bytes since these routines and their callers weren't
careful to clear them first. Add memsets to avoid this.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: kp, emaste
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D53342
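The class of leak this commit closes can be illustrated with a userland sketch. The struct `export_rec` and the helper `fill_export_rec()` are hypothetical, not the structures touched by the commit. The idea is that a structure with padding is cleared in full before its fields are set, so no stale stack bytes remain in the padding when the structure is later copied out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical record that would be copied out to userspace.  On common
 * ABIs there are padding bytes between 'kind' and 'value'.
 */
struct export_rec {
	uint8_t		kind;
	uint64_t	value;
};

/*
 * Clear the whole object first, padding included, and only then assign
 * the fields.  Assigning the fields alone would leave the padding with
 * whatever happened to be on the stack.
 */
static void
fill_export_rec(struct export_rec *rec, uint8_t kind, uint64_t value)
{
	assert(rec != NULL);
	memset(rec, 0, sizeof(*rec));
	rec->kind = kind;
	rec->value = value;
}
```

Two records filled with the same values then compare equal byte for byte, regardless of what the memory held beforehand.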
|
The handlers were not checking that the group names are nul-terminated.
Add checks for this.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by: zlei
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D53344
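A check of the kind described above can be sketched as follows. The buffer size `GROUP_NAMSIZ` and the helper name are hypothetical and chosen for illustration; the actual handlers and their limits are in the review. The sketch looks for a NUL byte inside the fixed-size buffer without ever reading past it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GROUP_NAMSIZ	16	/* hypothetical buffer size */

/*
 * Return true if a NUL byte occurs within the first 'size' bytes of
 * 'name'.  memchr() never reads beyond 'size', so this is safe to call
 * on a buffer supplied by an untrusted caller.
 */
static bool
group_name_is_terminated(const char *name, size_t size)
{
	assert(name != NULL);
	return (memchr(name, '\0', size) != NULL);
}
```

A handler would reject the request when this returns false, before passing the name to any routine that expects a C string.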
|
Fix the htons byteorder of vxlan packets after
`vxlan_pick_source_port` picks a source port during encapsulation.
Reviewed by: zlei, kp, adrian
Differential Revision: https://reviews.freebsd.org/D53022
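The commit message does not spell out the exact defect, so the following is only a generic sketch of the byte-order rule involved, assuming a picker that returns a port in host byte order. Both `pick_source_port()` and `wire_source_port()` are hypothetical helpers, not the vxlan driver's functions. A port computed in host order has to pass through htons() exactly once before it is stored in a UDP header.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical picker: map a flow hash into [min, max] and return the
 * port in host byte order.
 */
static uint16_t
pick_source_port(uint32_t hash, uint16_t min, uint16_t max)
{
	uint32_t range;

	assert(min <= max);
	range = (uint32_t)max - min + 1;
	return ((uint16_t)(min + hash % range));
}

/* Value suitable for the UDP header: converted to network order once. */
static uint16_t
wire_source_port(uint32_t hash, uint16_t min, uint16_t max)
{
	return (htons(pick_source_port(hash, min, max)));
}
```

Keeping the conversion in a single place makes it harder to apply it twice or to forget it on one code path.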
|
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D52045
|
MFC after: 1 week
|
On some iflib drivers, the txd reclaim routine can be fairly expensive
at high packet rates. Iflib was designed with the intent of only
reclaiming tx descriptors above a configurable threshold, but this
logic was left unimplemented.
This change:
- implements 2 new knobs, iflib.tx_reclaim_thresh and
iflib.tx_reclaim_ticks.
- moves tx reclaim thresh from the if_shared_ctx and into the
iflib_ctx as drivers don't need to see it, and it needs to be
changed, so it can't be const
- tx_reclaim_thresh and ticks are replicated into the txq to
improve cache locality of data accessed in the hot path
- ticks is used rather than more expensive timekeeping mechanism so
as to keep things simple and cheap
This change substantially improves packet rates on bnxt. It has been
tested on bnxt and ixl.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D52561
Reviewed by: markj (initial version)
|
This reverts commit 4e7a375804e5ad4b244ce9a035fa971cbf2f0944.
We do not want out-of-tree consumers to access the home_vnet variable.
As discussed with the author and Gleb Smirnoff.
|
Reviewed by: kp
Signed-off-by: Kevin Irabor <kevin.irabor04@gmail.com>
|
Otherwise builds warn about them being unused.
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
Increasing counters on "match" rules causes the 1st packet making a
connection to be double-counted, but only for rule counters, not rules'
tables, because those are not increased at all during rule parsing.
Remove "match" rule counter handling during rule parsing, do it only in
pf_counters_inc().
NAT can be performed either by "nat" rules in the NAT ruleset or by "match"
rules. Rules before the NAT rule, and the NAT rule itself match on pre-NAT
addresses, and later rules match on post-NAT addresses. When increasing
counters, go over rules in the same order as a packet would and use
source and destination addresses for updating table counters from
appropriate state key, taking into consideration on which rule NAT
happens.
Use AF from state key, so that table counters can be properly updated for
af-to rules.
Synchronize match rule updating behaviour to that of OpenBSD: if rules
match, but state is not created, don't update counters.
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D52447
|
Obtained from: OpenBSD, sashan <sashan@openbsd.org>, 8cf23eed7f
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
A new version of pfsync packet is introduced: 1500. This version solves
the issues with data alignment introduced in version 1400 and adds syncing
of information needed to sync states created by rules with af-to (original
interface, af and proto separate for wire and stack keys), of rt_af
needed for prefer-ipv6-nexthop, and of tag names.
Reviewed by: kp
Sponsored by: InnoGames GmbH
Differential Revision: https://reviews.freebsd.org/D52176
|
This requires passing the reason pointer down into pf_build_tcp().
ok bluhm@
Obtained from: OpenBSD, sf <sf@openbsd.org>, 03c532ca70
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
In case we use OVPN_CIPHER_ALG_NONE, the memcpy will attempt to copy 0
bytes from an uninitialized pointer. While the memcpy() implementation
will treat this as a no-op and not actually dereference the undefined
variable, it is still undefined behaviour to the compiler and should be
fixed. Found by building with clang HEAD.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D52543
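One common way to avoid this kind of undefined behaviour is to make sure the source pointer always has a defined value and to skip the copy when there is nothing to copy. The helper below is a hypothetical sketch of that pattern, not the if_ovpn code; the name `copy_key()` and its parameters are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy up to 'dstlen' bytes of key material.  A NULL or empty key (the
 * "no cipher" case in this sketch) is handled before memcpy() is
 * reached, so memcpy() never sees an invalid source pointer, even with
 * a length of zero.
 */
static size_t
copy_key(unsigned char *dst, size_t dstlen, const unsigned char *key,
    size_t keylen)
{
	assert(dst != NULL);
	if (key == NULL || keylen == 0)
		return (0);
	if (keylen > dstlen)
		keylen = dstlen;
	memcpy(dst, key, keylen);
	return (keylen);
}
```

A caller would initialise its key pointer to NULL before selecting the cipher, so the "none" branch leaves it with a well-defined value.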
|
Obtained from: OpenBSD, jsg <jsg@openbsd.org>, 7ac7a88014
Sponsored by: Rubicon Communications, LLC ("Netgate")
|