path: root/sys
Commit message | Author | Age | Files | Lines
* linux: Implement F_DUPFD_QUERY fcntl with kcmp(2) KCMP_FILE [HEAD, main] | Ricardo Branco | 16 min. | 2 | -0/+9
    Signed-off-by: Ricardo Branco <rbranco@suse.de>
    Reviewed by: kib
    Pull Request: https://github.com/freebsd/freebsd-src/pull/1920
* linux: Add support for kcmp(2) system call | Ricardo Branco | 16 min. | 3 | -2/+38
    Signed-off-by: Ricardo Branco <rbranco@suse.de>
    Reviewed by: kib
    Pull Request: https://github.com/freebsd/freebsd-src/pull/1920
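For readers unfamiliar with the Linux interface these two commits emulate, here is a minimal userland sketch of how a Linux program would ask whether two descriptors share one open file description. The helper name same_open_file() is hypothetical, and the fallback value 1027 (F_LINUX_SPECIFIC_BASE + 3) is supplied only for older C library headers. This is not the linuxulator code. On native FreeBSD the comparable query is assumed to be kcmp(2) with KCMP_FILE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Older libc headers may lack the definition; value from the Linux UAPI. */
#ifndef F_DUPFD_QUERY
#define	F_DUPFD_QUERY	1027	/* F_LINUX_SPECIFIC_BASE + 3 */
#endif

/*
 * Hypothetical helper: returns 1 if fd1 and fd2 refer to the same open
 * file description, 0 if they do not, and -1 with errno set when the
 * kernel rejects the request (for example EINVAL on kernels that do
 * not implement F_DUPFD_QUERY).
 */
static int
same_open_file(int fd1, int fd2)
{
	assert(fd1 >= 0 && fd2 >= 0);
	return (fcntl(fd1, F_DUPFD_QUERY, fd2));
}
```

A descriptor and its dup(2) copy should compare as the same; callers need to treat -1/EINVAL as "not supported here" rather than as "different".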
* make_dtb.sh: add include path | Oskar Holmlund | 28 min. | 1 | -1/+2
    The device tree include file for TI TPS65* is referenced by a path relative to the source, for example:
        device-tree/src/arm/ti/omap/am335x-bone-common.dtsi#n305
        device-tree/src/arm/rockchip/rk3066a-marsboard.dts#n183
    This patch gets the dts path and adds it as an include path for the device tree compiler.
    Approved by: manu (mentor)
    Differential revision: https://reviews.freebsd.org/D53887
* pf: handle TTL expired during nat64 | Kristof Provost | 8 hours | 2 | -6/+20
    If the TTL (or hop limit) expires during nat64 translation we may need to send the error message in the original address family (i.e. pre-translation). We'd usually handle this in pf_route()/pf_route6(), but at that point we have already translated the packet, making it difficult to include it in the generated ICMP message.
    Check for this case in pf_translate_af() and send ICMP errors directly from it.
    PR: 291527
    MFC after: 2 weeks
    Sponsored by: Rubicon Communications, LLC ("Netgate")
    Differential Revision: https://reviews.freebsd.org/D54166
* vm: Fix kstack alignment assertion | Dag-Erling Smørgrav | 9 hours | 1 | -4/+6
    The expectation that the allocation will be aligned to the kstack size only applies when allocating from a kstack arena, not when allocating a non-standard size from the kernel arena.
    MFC after: 1 week
    Sponsored by: Klara, Inc.
    Sponsored by: NetApp, Inc.
    Fixes: 7a79d0669761 ("vm: improve kstack_object pindex calculation to avoid pindex holes")
    Reviewed by: bnovkov, siderop1_netapp.com
    Differential Revision: https://reviews.freebsd.org/D54171
* aq(4): Use sys, not userland, headers | Ed Maste | 15 hours | 6 | -12/+5
    And remove some unused definitions.
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D54152
* sockets: remove compat shim for divert(4) | Gleb Smirnoff | 16 hours | 1 | -11/+0
    All known software in ports was addressed three years ago, and the shim stays in stable/14 and stable/15 for another couple of years with its printf(), so all outliers are expected to conform before 16.0-RELEASE. See 8624f4347e8133911b0554e816f6bedb56dc5fb3 for details.
* LinuxKPI: 802.11: lock down the "txq_scheduled" tailq | Bjoern A. Zeeb | 17 hours | 2 | -12/+42
    For consistency rename the "scheduled_txqs" tailq to "txq_scheduled" and add a lock per txq ("txq_scheduled_lock[]"). We use the "_bh" locking as this is called from the device driver.
    This fixes panics due to concurrent access to the tailq, especially in between "first" and "remove" on the out-direction and between "insert" and "elem_init" on the in-direction. This was easily reproducible just running iperf3 at basic rates for a few seconds to minutes with multiple chipsets, not only rtw89.
    Sponsored by: The FreeBSD Foundation
    PR: 290636
    Reported by: arved, and others before
    MFC after: 3 days
* nvme: Only attach to storage NVMe devices | Warner Losh | 19 hours | 2 | -2/+9
    Only attach CAM to the nvme storage devices.
    Sponsored by: Netflix
* nvme: remove now-redundant consumer interface | Warner Losh | 19 hours | 5 | -155/+0
    Now that we've moved to newbus methods, we can delete this...
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D54095
* nvme: Notify failure with newbus call | Warner Losh | 19 hours | 3 | -16/+13
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D51391
* nvme: Use new method to do async notifications | Warner Losh | 19 hours | 3 | -25/+23
    Nothing uses these at the moment, but it would be useful in the future, so convert this functionality to a newbus function dispatch.
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D51390
* nvd: Connect nvme_if methods | Warner Losh | 19 hours | 6 | -157/+167
    Connect methods to manage namespaces explicitly to replace the old consumer interface.
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D51388
* nvme_sim: Connect to events broadcast with nvme_if | Warner Losh | 19 hours | 2 | -61/+85
    Connect up the nvme_ns_* events. Copy code from old ways, as needed, and refactor a little.
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D51387
* nvd: Attach as a child of nvme | Warner Losh | 19 hours | 1 | -37/+73
    Rather than registering as a consumer of the nvme controller, hook into the child device and use that. This is a small regression at the moment: we don't fail the device when that happens at runtime.
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D51385
* nvme_sim: Attach as a child of nvme | Warner Losh | 19 hours | 1 | -53/+85
    Rather than registering as a consumer of the nvme controller, hook into the child device and use that. This is a small regression at the moment: we don't fail the device when that happens at runtime, and we don't handle new namespaces when they arrive (though that feature is currently fragile).
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D51384
* nvme: Add child device for each controller | Warner Losh | 19 hours | 2 | -0/+10
    Step 1 in the move from registering consumers for NVMe drives to newbus nvme drives: Add a child device and attach them for each controller that we initialize. Detach them when we detach the main device.
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D51383
* nvme: Nvme controller generated events | Warner Losh | 19 hours | 2 | -0/+56
    Interface for the nvme driver notifying its children of different events: async notifications, namespace events and device failure. These aren't yet connected.
    Sponsored by: Netflix
    Reviewed by: dab
    Differential Revision: https://reviews.freebsd.org/D51386
* sendfile: if sendfile_getobj() fails jump to the function epilogue | Gleb Smirnoff | 20 hours | 1 | -1/+1
    The functional change here is that *sent would be zeroed. Note that some portable applications, e.g. OpenSSL, use a wrapper around our sendfile(2) to make it more Linux-like. These wrappers are usually written in a manner that expects *sbytes to always be initialized regardless of the error code returned.
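The pattern described above can be illustrated with a small stand-alone sketch. Everything in it is hypothetical (sendfile_sketch(), its parameters, and the choice of EINVAL); it is not the kernel's sendfile code. It only shows why jumping to a shared epilogue, rather than returning early, guarantees that the caller's byte counter is always written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a sendfile-like routine.  have_object == 0
 * models the early failure; nbytes is the amount "sent" on success.
 */
static int
sendfile_sketch(int have_object, long nbytes, long *sent)
{
	long sbytes = 0;
	int error = 0;

	assert(nbytes >= 0);
	if (!have_object) {
		error = EINVAL;
		/* A bare "return (error)" here would skip the store below. */
		goto out;
	}
	sbytes = nbytes;	/* pretend the whole request was sent */
out:
	/* Epilogue: the out-parameter is written on every path. */
	if (sent != NULL)
		*sent = sbytes;
	return (error);
}
```

A wrapper that reads the counter after a failed call then sees 0 instead of whatever value the variable happened to hold.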
* linux: fix unr(9) leak on module unload | Gleb Smirnoff | 20 hours | 1 | -2/+0
    Suggested by: jhb
    Fixes: 607f11055d2d421770963162a4d9a99cdd136152
* cam: decode and print direct access block device sense data | Warner Losh | 20 hours | 2 | -2/+68
    A more efficient way to include multiple bits of data in a sense descriptor was defined in SBC4 in 2020. Decode and print it.
    Sponsored by: Netflix
* cam: Expand the parts of the sense buffer we report | Warner Losh | 20 hours | 3 | -6/+212
    Decode the descriptors we put into devd.
    Sponsored by: Netflix
* mpr: Partially revert 332096ebb638 | Warner Losh | 21 hours | 2 | -32/+2
    These were a doodle that escaped into my staging tree. Remove them.
    Sponsored by: Netflix
* linux: fix panic on kldunload | Gleb Smirnoff | 22 hours | 1 | -0/+7
    The vnet_deregister_sysuninit() that is called by linker unload sequence also calls every registered destructor before unregistering it. IMHO, this is not correct in principle, but for now plug the regression right in the code that introduced the panic.
    Fixes: 607f11055d2d421770963162a4d9a99cdd136152
* kboot: Explicitly use host:/proc | Warner Losh | 23 hours | 2 | -2/+32
    When looking for the boot_params symbol we need to get the UEFI memory map, use host: prefix. The short-circuit we have for this only works when we have a filesystem. During the earliest parts of boot, we can sometimes not have this yet, so making this explicit allows these environments to function. It's always in the host path.
    Print better error messages, and add newlines in two places.
    Sponsored by: Netflix
* linuxkpi: clean up stray pctrie_iter_reset | Austin Shafer | 26 hours | 1 | -3/+1
    This removes an extraneous pctrie_iter_reset before returning. This is not needed as it simply clears a local variable that will get cleaned up anyway as we immediately return from the function.
    MFC after: 1 week
    Sponsored by: NVIDIA
    Reviewed by: alc
    Differential Revision: https://reviews.freebsd.org/D54153
* netlink: Don't overwrite existing data in a linear buffer in snl_writer | John Baldwin | 26 hours | 1 | -11/+9
    First, a bit of background on some of the data structures netlink uses to manage data associated with a netlink connection.
    - struct linear_buffer contains a single virtually-contiguous buffer of bytes. Regions of this buffer are suballocated via lb_allocz(), which uses a simple "bump" scheme in which the buffer is split into an allocated region at the start and a free region at the end. Each allocation "bumps" the boundary (lb->offset) forward by the allocation size. Individual allocations are not freed. Instead, the entire buffer is freed once all of the allocations are no longer in use. Linear buffers also contain an embedded link to permit chaining buffers together.
    - snl_state contains various state for a netlink connection including a chain of linear buffers. This chain of linear buffers can contain allocations for netlink messages as well as other ancillary data buffers such as socket address structures. The chain of linear buffers is freed once the connection is torn down.
    - snl_writer is used to construct a message written on a netlink connection. It contains a single virtually-contiguous buffer (nw->base) allocated from the associated snl_state's linear buffer chain. The buffer distinguishes between the amount of space reserved from the underlying allocator (nw->size) and the current message length actually written (nw->offset). As new chunks of data (e.g. netlink attributes) are added to the write buffer, the buffer is grown by snl_realloc_msg_buffer by reallocating a larger buffer from the associated snl_state and copying over the current message data to the new buffer.
    Commit 0c511bafdd5b309505c13c8dc7c6816686d1e103 aimed to fix two bugs in snl_realloc_msg_buffer.
    The first bug is that snl_realloc_msg_buffer originally failed to update nw->size after growing the buffer, which could result in spurious re-allocations when growing in the future. It also probably could eventually lead to overflowing the buffer since each reallocation request was just adding the new bytes needed for a chunk to the original 'nw->size' while 'nw->offset' kept growing. Eventually the new 'nw->offset' would be larger than 'nw->size + sz', causing routines like snl_reserve_msg_data_raw() to return an out-of-bounds pointer.
    The second change in this commit I think was trying to fix the buffer overflows due to 'nw->size' being wrong, but instead introduced a new set of bugs. The second change ignored the returned pointer from snl_allocz() and instead assumed it could use all of the currently-allocated data in the current linear buffer. This is only ok if the only data in the linear buffer chain for the associated snl_state is the snl_writer's message buffer. If there is any other data allocated from the snl_state, it could be earlier in the current linear buffer, so resetting new_base to nw->ss->lb->base can result in overwriting that other data. The second change was also over-allocating storage from the underlying chain of linear buffers (e.g. a writer allocation of 256 followed by 512 would end up using the first 512 bytes, but 768 bytes would be reserved in the underlying linear buffer).
    To fix, revert the second change, keeping only the fix for 'nw->size' being wrong.
    Reviewed by: igoro, markj
    Fixes: 0c511bafdd5b ("netlink: fix snl_writer and linear_buffer re-allocation logic")
    Sponsored by: AFRL, DARPA
    Differential Revision: https://reviews.freebsd.org/D54148
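The allocator and writer behaviour described in this commit message can be modelled in a few lines of stand-alone C. The sketch below is an illustration under stated assumptions, not the netlink source: struct lbuf, struct writer, and every function name are hypothetical simplifications (a single buffer instead of a chain, a doubling growth policy chosen for brevity). It shows the two points the message makes: a grown writer must use the pointer the allocator returns rather than the start of the linear buffer, and it must record the new reserved size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single linear buffer with a bump pointer. */
struct lbuf {
	char	*base;		/* start of backing storage */
	size_t	 size;		/* total bytes of backing storage */
	size_t	 offset;	/* boundary between allocated and free space */
};

/* Bump-allocate len zeroed bytes (rounded up to 8); NULL when full. */
static void *
lbuf_allocz(struct lbuf *lb, size_t len)
{
	char *p;

	len = (len + 7) & ~(size_t)7;
	if (len > lb->size - lb->offset)
		return (NULL);
	p = lb->base + lb->offset;
	lb->offset += len;	/* "bump" the boundary forward */
	memset(p, 0, len);
	return (p);
}

/* Hypothetical message writer backed by allocations from an lbuf. */
struct writer {
	struct lbuf	*lb;
	char		*base;		/* current message buffer */
	size_t		 size;		/* bytes reserved from the allocator */
	size_t		 offset;	/* bytes of message written so far */
};

static int
writer_init(struct writer *w, struct lbuf *lb, size_t size)
{
	w->lb = lb;
	w->base = lbuf_allocz(lb, size);
	w->size = size;
	w->offset = 0;
	return (w->base != NULL);
}

/* Grow by allocating a fresh region and copying the message over. */
static int
writer_grow(struct writer *w, size_t need)
{
	size_t new_size = (w->size + need) * 2;
	char *new_base = lbuf_allocz(w->lb, new_size);

	if (new_base == NULL)
		return (0);
	memcpy(new_base, w->base, w->offset);
	w->base = new_base;	/* use the returned pointer, not lb->base */
	w->size = new_size;	/* keep the reserved size up to date */
	return (1);
}

static int
writer_append(struct writer *w, const void *data, size_t len)
{
	if (w->offset + len > w->size && !writer_grow(w, len))
		return (0);
	memcpy(w->base + w->offset, data, len);
	w->offset += len;
	return (1);
}
```

In this model an unrelated allocation made before the writer (for example a socket address) sits at the start of the buffer and stays intact across a grow, because the writer never rewinds to lb->base.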
* Add sys/_align.h replacing machine/_align.h | Brooks Davis | 31 hours | 15 | -258/+39
    Define _ALIGNBYTES using sizeof(void *) (no functional change on any existing architecture), which will allow it to work with CHERI, where we must align things up to capability alignment.
    In _ALIGN, replace integer manipulation which does not preserve pointer provenance with a type and provenance preserving builtin. This requires modest changes in code which assumes _ALIGN returns an integer, but those are relatively rare.
    Reviewed by: kib, markj
    Effort: CHERI upstreaming
    Sponsored by: Innovate UK
    Differential Revision: https://reviews.freebsd.org/D53947
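As a rough illustration of the idea (not the contents of sys/_align.h), the sketch below aligns a pointer up to pointer-size alignment. ALIGNBYTES_SKETCH and align_up_ptr() are hypothetical names, and the use of Clang's __builtin_align_up is an assumption about which builtin is meant. Compilers without that builtin fall back to the integer round trip, which yields the same address but does not preserve provenance on CHERI-style targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of _ALIGNBYTES: pointer size minus one. */
#define	ALIGNBYTES_SKETCH	(sizeof(void *) - 1)

#ifndef __has_builtin
#define	__has_builtin(x)	0
#endif

/* Round p up to the next multiple of sizeof(void *); the type is kept. */
static char *
align_up_ptr(char *p)
{
#if __has_builtin(__builtin_align_up)
	/* Operates on the pointer itself, so type and provenance survive. */
	return (__builtin_align_up(p, sizeof(void *)));
#else
	/* Fallback: classic integer arithmetic, same numeric result. */
	return ((char *)(((uintptr_t)p + ALIGNBYTES_SKETCH) &
	    ~(uintptr_t)ALIGNBYTES_SKETCH));
#endif
}
```

Because the result is a pointer rather than an integer, callers that previously did arithmetic on the returned value need a cast or a small rewrite, which matches the "modest changes" mentioned above.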
* splice: Fix leaks that can happen when initiating a splice | Andrew Gallatin | 44 hours | 1 | -17/+27
    - Change the state to SPLICE_EXCEPTION to allow so_unsplice() to clean up failed splices (fixes socket reference leak).
    - NULL out sp->dst when unsplicing from so_splice() before so2 has been referenced.
    - Deal with a null sp->dst / so2 in so_unsplice.
    - Fix asserts that talked about sp->state == SPLICE_INIT; that state is not possible here.
    Differential Revision: https://reviews.freebsd.org/D54157
    Reviewed by: markj
    Sponsored by: Netflix
    Fixes: c0c5d01e5374 ("so_splice: Synchronize so_unsplice() with so_splice()")
    MFC after: 3 days
* bhnd_bus_*_resource: Remove redundant type and rid arguments | John Baldwin | 46 hours | 22 | -111/+58
    Remove type and rid arguments from bhnd_bus_(activate|deactivate|release)_resource. This should have been done earlier to match the changes made to bus_release_resource, etc.
    While fixing up the callers, remove rid members from softc structures since the only time a value is needed is as a constant input to bhnd_bus_alloc_resource*.
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53410
* bhnd_bus_alloc_resource*: Pass rid by value | John Baldwin | 46 hours | 12 | -17/+17
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53409
* dpaa2_rc_add_res: Pass rid by value | John Baldwin | 46 hours | 1 | -16/+14
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53408
* gpio_alloc_intr_resource: Pass rid by value | John Baldwin | 46 hours | 8 | -9/+9
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53407
* acpi_PkgGas: Pass rid by value | John Baldwin | 46 hours | 4 | -7/+7
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53406
* acpi_bus_alloc_gas: Pass rid by value | John Baldwin | 46 hours | 7 | -13/+13
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53405
* pci_reserve_map: Pass rid by value | John Baldwin | 46 hours | 3 | -17/+17
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53404
* resource_list_reserve: Pass rid by value | John Baldwin | 46 hours | 5 | -18/+17
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53403
* bus_alloc_resource: Pass rid by value to BUS_ALLOC_RESOURCE DEVMETHOD | John Baldwin | 46 hours | 83 | -282/+276
    The wrapper functions such as bus_alloc_resource_any() still support passing the rid by value or pointer, but the underlying implementation now passes by value.
    Reviewed by: imp
    Differential Revision: https://reviews.freebsd.org/D53402
* Revert "netlink: Fix overallocation of netlink message buffers" | John Baldwin | 46 hours | 1 | -16/+11
    This patch was based on an incorrect assumption that the linear buffer chain for an snl_writer only contained the netlink message body.
    This reverts commit 828df4d36d9d5a6ca0dcc294d65572b4a0474142.
    Sponsored by: AFRL, DARPA
* linuxkpi: Take const root in read-only radix tree functions | Jean-Sébastien Pédron | 48 hours | 2 | -6/+6
    This is a preparation step for a future addition to this file. This is also closer to what Linux does.
    Reviewed by: emaste
    Sponsored by: The FreeBSD Foundation
* linux: fix build without VIMAGE | Gleb Smirnoff | 2 days | 1 | -0/+1
    Fixes: fbf05d2147b1add8b760be166c4b1fd4499ebce8
* zfs: Reuse ZINCDIR variable from kmod.mk | John Baldwin | 2 days | 1 | -6/+5
    Reviewed by: brooks, imp
    Differential Revision: https://reviews.freebsd.org/D54147
* rman: Embed the mutex in struct rman instead of using a separate allocation | John Baldwin | 2 days | 2 | -37/+34
    This used a separate allocation when rman was first imported (back when the lock was a pre-SMPng "simplelock" instead of a mutex).
    Reported by: des
    Reviewed by: des
    Differential Revision: https://reviews.freebsd.org/D54143
* rman: Simplify initialization of internal globals | John Baldwin | 2 days | 1 | -9/+3
    Use TAILQ_HEAD_INITIALIZER and MTX_SYSINIT to remove the 'once' code from rman_init.
    Reviewed by: des
    Differential Revision: https://reviews.freebsd.org/D54142
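A userland sketch of the queue half of this change may help: a list head initialised at compile time with TAILQ_HEAD_INITIALIZER is valid before any code runs, so no "initialise on first call" flag is needed. The struct region type and the helper functions are hypothetical and are not taken from subr_rman.c. MTX_SYSINIT is kernel-only and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element; not the kernel's struct rman. */
struct region {
	int			id;
	TAILQ_ENTRY(region)	link;
};
TAILQ_HEAD(region_head, region);

/* Statically initialised head: usable immediately, no 'once' flag. */
static struct region_head regions = TAILQ_HEAD_INITIALIZER(regions);

/* Append a region to the global list. */
static void
region_register(struct region *r)
{
	TAILQ_INSERT_TAIL(&regions, r, link);
}

/* Count the registered regions by walking the list. */
static int
region_count(void)
{
	struct region *r;
	int n = 0;

	TAILQ_FOREACH(r, &regions, link)
		n++;
	return (n);
}
```

With a runtime TAILQ_INIT() the first caller would have to detect and perform the initialisation; the static initializer removes that branch entirely.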
* cxgbe: Stop using bus_space_tag/handle directly | John Baldwin | 2 days | 3 | -19/+11
    Reviewed by: np, imp
    Sponsored by: Chelsio Communications
    Differential Revision: https://reviews.freebsd.org/D53030
* if_ovpn: use epoch to free peers | Kristof Provost | 2 days | 1 | -2/+12
    Avoid a possible use-after-free in the rx path. ovpn_decrypt_rx_cb() calls ovpn_finish_rx() which releases the lock, but continues to use the peer. Ensure that the peer cannot be freed until we're sure all potential users have stopped using it (i.e. have left net_epoch).
    Reported by: Kevin Day <kevin@your.org>
    MFC after: 1 week
    Sponsored by: Rubicon Communications, LLC ("Netgate")
* sys/_types.h: recognise char8_t as a builtin type in C++20 | Robert Clausecker | 2 days | 1 | -0/+4
    Unlike in C23 where it's a typedef, char8_t is a built-in type in C++20. Recognise it as such.
    PR: 291449
    Reported by: Tomoaki AOKI <junchoon@dec.sakura.ne.jp>
    Approved by: markj (mentor)
    Reviewed by: imp
    Fixes: f0e541118c374869a8226eaa1320bb6eda248a20
    MFC after: 1 week
    Differential Revision: https://reviews.freebsd.org/D54124
* vm_fault: only rely on PG_ZERO when the page was newly allocated | Konstantin Belousov | 3 days | 1 | -1/+5
    If the fs->m page was found invalid on the object queue, the PG_ZERO flag is stale. Track the source of the page in the new fault state variable m_needs_zero, and ignore PG_ZERO if the page did not come from the allocator.
    Reviewed by: markj
    Tested by: pho
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D53963
* vm_page.h: remove no longer defined (P) locking annotation | Konstantin Belousov | 3 days | 1 | -2/+2
    Reviewed by: markj
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D53963
* lltable: use own lock | Gleb Smirnoff | 3 days | 8 | -64/+65
    Add struct mtx to struct lltable and stop using IF_AFDATA_LOCK, that was created for a completely different purpose. No functional change intended.
    Reviewed by: zlei, melifaro
    Differential Revision: https://reviews.freebsd.org/D54086