path: root/sys/geom
Commit message | Author | Age | Files | Lines
* geom/zero: Add support for unmapped I/O | Mateusz Piotrowski | 2025-11-11 | 1 | -19/+63

  This patch adds support for unmapped I/O to gzero(4).

  Let's consider the following script to illustrate the change in
  gzero(4)'s behavior:

  ```
  dd="dd if=/dev/gzero of=/dev/null bs=512 count=100000"
  dtrace -q -c "$dd" -n '
      fbt::pmap_qenter:entry,
      fbt::uiomove_fromphys:entry,
      fbt::memset:entry
      /execname == "dd"/
      {
          @[probefunc] = count();
      }
  '
  ```

  Let's run that script 4 times:

  ```
  ==> 1: unmapped I/O not supported (fallback to mapped I/O), kern.geom.zero.clear=1
  51200000 bytes transferred in 1.795809 secs (28510829 bytes/sec)
    pmap_qenter        100000
    memset             400011
  ==> 2: unmapped I/O not supported (fallback to mapped I/O), kern.geom.zero.clear=0
  51200000 bytes transferred in 0.701079 secs (73030337 bytes/sec)
    memset             300011
  ==> 3: unmapped I/O supported, kern.geom.zero.clear=1
  51200000 bytes transferred in 0.771680 secs (66348750 bytes/sec)
    uiomove_fromphys   100000
    memset             300011
  ==> 4: unmapped I/O supported, kern.geom.zero.clear=0
  51200000 bytes transferred in 0.621303 secs (82407407 bytes/sec)
    memset             300011
  ```

  If kern.geom.zero.clear=0, then nothing really changes as no copying
  takes place. Otherwise, we see that by adding unmapped I/O support we
  avoid calls to pmap_qenter(), which was called by GEOM to turn
  unmapped I/O requests into mapped ones before passing them to
  gzero(4) for processing.

  Reviewed by: bnovkov, markj
  Approved by: bnovkov (mentor), markj (mentor)
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D52998
* exterr: print exterr for struct buf and bio in ddb show commands | Konstantin Belousov | 2025-11-04 | 1 | -0/+4

  Noted by: imp
  Sponsored by: The FreeBSD Foundation
* geom/geom_vfs.c: use EXTERROR_KE() in g_vfs_strategy for ENXIOs | Konstantin Belousov | 2025-11-04 | 1 | -0/+2

  As an example of use for the bp_exterr infrastructure.

  Reviewed by: mckusick
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D53351
* exterror(9): add infra for bufs and bios | Konstantin Belousov | 2025-11-04 | 4 | -5/+26

  The extended error can be stored in either struct bio or struct buf,
  indicated by BIO_EXTERR bio_flag. At some strategic places, it is
  copied into the current thread extended error.

  This structure is required because io request from the top might pass
  down through several io threads and the context that can report
  meaningful extended error does not belong to the thread that
  initiated the io.

  Sizes before the change, on amd64 nodebug:
    sizeof(struct buf) = 456
    sizeof(struct bio) = 376
  after:
    sizeof(struct buf) = 496
    sizeof(struct bio) = 408

  WIP: more geom providers should handle BIO_EXTERR when passing cloned
  bios down and then handling completions.

  Reviewed by: mckusick
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D53351
* geom(4): Fix typo in a kernel message | Gordon Bergling | 2025-10-29 | 1 | -1/+1

  - s/supressing/suppressing/

  MFC after: 5 days
* geom: zero: Let sysctls .byte and .clear to be settable in loader | Mateusz Piotrowski | 2025-10-24 | 1 | -2/+2

  There is no reason to not allow kern.geom.zero.byte and
  kern.geom.zero.clear to be settable as a tunable.

  Reviewed by: imp, markj
  Approved by: markj (mentor)
  MFC after: 1 week
  Event: EuroBSDCon 2025
  Differential Revision: https://reviews.freebsd.org/D52763
* knotes: kqueue: handle copy for trivial filters | Konstantin Belousov | 2025-10-18 | 1 | -0/+1

  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D52045
* g_part: Replace some spaces with tabs to match the rest of this struct | Brad Davis | 2025-10-09 | 1 | -7/+7

  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D52907
* GEOM: remove the redundant if statement | Wuyang Chung | 2025-09-14 | 4 | -9/+0

  g_provider_by_name already skips the leading '/dev/' so these if
  statements are redundant. This changes some error messages, but those
  aren't parsed. g_concat also calls g_concat_find_disk, but it also
  skips /dev/ if present at the start of the string.

  Reviewed by: imp, Elliot Mitchell
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1793
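
The prefix handling this commit relies on can be sketched in userspace C. This is only an illustration of the idea, not the kernel code: `skip_dev_prefix` is a hypothetical helper name, and the real g_provider_by_name() does more than strip a prefix.

```c
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): return a pointer past a
 * leading "/dev/" if the name starts with one, otherwise return the
 * name unchanged.
 */
static const char *
skip_dev_prefix(const char *name)
{
	static const char prefix[] = "/dev/";
	size_t len = sizeof(prefix) - 1;

	if (strncmp(name, prefix, len) == 0)
		return (name + len);
	return (name);
}
```

Once the lookup routine normalizes names this way, a caller that strips the prefix itself before the lookup adds nothing, which is presumably why the caller-side checks could be dropped.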
* gunion: Also destroy the rw_lock | Warner Losh | 2025-09-12 | 1 | -0/+1

  We also need to destroy the rw_lock when we free the softc.

  Noticed by: markj
  Fixes: 656f7f43f204
  Sponsored by: Netflix
* geom: only set TDP_GEOM for user threads | Konstantin Belousov | 2025-09-09 | 1 | -2/+7

  For kernel threads, ASTs are not handled at all, so there is no
  reason to expect that g_waitidle() would be called through AST
  scheduling.

  PR: 289204
  Reviewed by: markj
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D52421
* call g_new_geom instead for callers that pass regular string to g_new_geomf | Wuyang Chung | 2025-09-05 | 23 | -34/+34

  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1786
* GEOM: add a new function g_new_geom | Wuyang Chung | 2025-09-05 | 2 | -13/+25

  This function is a variant of g_new_geomf. It accepts a regular
  string instead of a format string as its input parameter. It can save
  the time wasted on unnecessary format string processing.

  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1786
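
The difference between a format-string constructor and a plain-string one can be sketched in userspace C. The names `name_from_fmt` and `name_from_str` are hypothetical stand-ins, not the signatures of g_new_geomf() or g_new_geom(), and the fixed 64-byte buffer is an assumption made only for this sketch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Plain-string variant: a straight copy, no format parsing. Caller frees. */
static char *
name_from_str(const char *name)
{
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, name, len);
	return (copy);
}

/* Format-string variant: expands fmt first, then copies. Caller frees. */
static char *
name_from_fmt(const char *fmt, ...)
{
	va_list ap;
	char buf[64];	/* sketch-only limit on the expanded name */

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return (name_from_str(buf));
}
```

Besides skipping the formatting pass, the plain variant treats a `%` in the name as an ordinary character, so a caller holding a ready-made string has no reason to route it through the variadic path.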
* GEOM_UNION: Should free sc in g_union_ctl_create when error happened. | Wuyang Chung | 2025-09-05 | 1 | -0/+1

  Signed-off-by: Wuyang Chung <wy-chung@outlook.com>
  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1835
* gstripe: remove bio->bio_ma_n assignment | Miroslav Cimerman | 2025-09-05 | 1 | -4/+2

  We shouldn't be manipulating the parent's bio at all (except to
  update the number of children). physio() already set this properly as
  well, in addition.

  Signed-off-by: Miroslav Cimerman <mc@doas.su>
  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1800
* g_part: Fix a few typos in source code comments | Gordon Bergling | 2025-08-17 | 1 | -3/+3

  - s/partitition/partition/

  MFC after: 3 days
* sys/geom: use proper style for sizeof operator | Andriy Gapon | 2025-07-26 | 8 | -39/+39

  No functional change is intended.

  Missing parentheses around sizeof operands have been added with a
  coccinelle patch:

    @disable paren@
    expression E;
    @@

    (
      sizeof(E)
    |
      sizeof
    + (
      E
    + )
    )

  Spaces between sizeof and a parenthesis have been removed with sed.

  Discussed with: imp
  MFC after: 2 weeks
* gvirstor: in virstor_ctl_remove() the copy length of the second bcopy might be wrong. | Wuyang Chung | 2025-07-23 | 1 | -1/+1

  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1763
* GEOM: allocate gp->name immediately after gp | Wuyang Chung | 2025-07-23 | 1 | -3/+2

  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1765
* gconcat: Return EINVAL when the metadata is invalid for an added disk. | Wuyang Chung | 2025-07-23 | 1 | -0/+1

  We don't use the disk and stop using it right afterwards. The user
  should get an error indication, just like they would if there had
  been a disk read error.

  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1775
* machine/stdarg.h -> sys/stdarg.h | Brooks Davis | 2025-06-11 | 7 | -9/+7

  Switch to using sys/stdarg.h for va_list type and va_* builtins.

  Make an attempt to insert the include in a sensible place. Where
  style(9) was followed this is easy, where it was ignored, aim for the
  first block of sys/*.h headers and don't get too fussy or try to fix
  other style bugs.

  Reviewed by: imp
  Exp-run by: antoine (PR 286274)
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1595
* sysctl(9): Ease exporting struct sizes; Discourage doing that | Olivier Certner | 2025-05-07 | 1 | -10/+5

  Introduce two helpers, the more general SYSCTL_SIZEOF() and a
  struct-specific one SYSCTL_SIZEOF_STRUCT() which prepends 'struct' in
  the description and in the use of sizeof() but uses the raw structure
  name as the knob's name. The size of the object/structure is exported
  under 'debug.sizeof'.

  Existing knobs under 'debug.sizeof' were all converted to use the
  helpers.

  Add a note before the helpers discouraging the introduction of new
  leaves for ad-hoc reasons. List alternative means for developers to
  obtain the size of arbitrary kernel structures easily (thanks to
  markj@ for providing these).

  No functional change (intended).

  Reviewed by: kib, markj
  MFC after: 3 days
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D50121
* Reject providers with too small a size for metadata | Rose | 2025-04-20 | 1 | -0/+7

  Otherwise, if a misbehaving device claims a sectorsize smaller than
  256, the memcpy will overflow the allocated buffer, since
  sizeof(*meta) is 256.

  Signed-off-by: Rose <gfunni234@gmail.com>
  Reviewed by: imp
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1668
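
The guard described here amounts to checking the sector size before a fixed-size copy. A minimal userspace sketch, assuming a 256-byte metadata block as the message states; `read_meta` and `META_SIZE` are hypothetical names, and the actual kernel check and copy may be arranged differently.

```c
#include <stddef.h>
#include <string.h>

#define META_SIZE 256	/* metadata size stated in the commit message */

/*
 * Copy metadata out of a sector-sized buffer. Returns 0 on success,
 * or -1 if the sector is too small to hold the metadata, in which
 * case nothing is copied.
 */
static int
read_meta(unsigned char *meta, const unsigned char *sector, size_t sectorsize)
{
	if (sectorsize < META_SIZE)
		return (-1);
	memcpy(meta, sector, META_SIZE);
	return (0);
}
```

Rejecting the provider up front keeps the fixed-length memcpy from ever touching a buffer shorter than the metadata.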
* geom: Push GEOM sysinit ordering to after devctl | Justin Hibbits | 2025-03-26 | 2 | -2/+2

  GEOM depends on devctl being initialized, as it uses devctl_notify,
  which assumes that devctl is initialized already. However, if devctl
  is not initialized yet, the devctl UMA zone is NULL, resulting in a
  panic.

  Thus far this has worked seemingly by linker luck that lets devctl
  sort before GEOM, but this is not guaranteed. Instead, enforce the
  ordering by pushing GEOM to third place, explicitly ordering it after
  devctl_init, which is ordered second. Since g_raid wants to
  initialize after GEOM, push that to fourth place as well.

  Sponsored by: Juniper Networks, Inc.
* g_dev_orphan(): Return early if the device is already gone | Fabian Keil | 2025-03-13 | 1 | -0/+3

  The following panic was the result of running "cdcontrol eject" after
  using the physical ejection key on the device before the tray was
  actually ejected. So we have hardware racing software. The device was
  loaded with a DVD.

  Resulted in a NULL pointer dereference

    g_dev_orphan() at g_dev_orphan+0x2e/frame 0xfffffe01eba0a9f0
    g_resize_provider_event() at g_resize_provider_event+0x71/frame 0xfffffe01eba0aa20
    g_run_events() at g_run_events+0x20e/frame 0xfffffe01eba0aa70
    fork_exit() at fork_exit+0x85/frame 0xfffffe01eba0aab0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe01eba0aab0

  Avoid this possibility and return early if dev is NULL already.

  PR: 215856
  Reviewed by: imp (I've triggered this once or twice over the years too)
  Sponsored by: Netflix
* geli: Fix signature mismatch in mountroot callback | SHENGYI HONG | 2025-03-05 | 1 | -1/+1

  This is required for kernel CFI.

  Reviewed by: rrs, jhb, glebius
  Differential Revision: https://reviews.freebsd.org/D49111
* gvinum: Remove kernel support | John Baldwin | 2025-01-23 | 19 | -9092/+0

  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D48541
* gvinum: Emit deprecation notice upon drive tasting | Ed Maste | 2025-01-20 | 1 | -0/+9

  Reviewed by: phk, jhb
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D38607
* kern: Make fileops and filterops tables const where possible | Mark Johnston | 2024-11-26 | 1 | -1/+1

  No functional change intended.

  MFC after: 1 week
* geom: Allow BSD type '!0' partitions | Jose Luis Duran | 2024-11-20 | 1 | -1/+1

  Allow the creation of '!0' partition types. Fix it by not considering
  "0" an invalid partition type.

  Reviewed by: emaste
  Approved by: emaste (mentor)
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D47652
* g_eli: update comment for bool return type | Ed Maste | 2024-11-13 | 1 | -2/+2

  Fixes: 68eadcec0f7c8 ("Give a couple of predication functions a bool return type.")
  Sponsored by: The FreeBSD Foundation
* geom_flashmap: Rename the kernel module | Mark Johnston | 2024-10-29 | 1 | -1/+1

  Absent a linker.hints, if a module dependency exists on disk, the
  loader will automatically load it. That is, if something depends on
  module foo, and foo.ko exists, we'll load foo.ko even though the
  linker hints file is missing. It's a bit of a hack but it's handy.

  This breaks with geom_flashmap though, since it's geom_flashmap.ko on
  disk but the module is called g_flashmap. However, pretty much every
  other GEOM module is given a "geom_" prefix, so for consistency's
  sake alone, it seems nice to rename the module.

  PR: 274388
  Reviewed by: jhb
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D47311
* gpart: Add warning when the start sector is too low. | Warner Losh | 2024-10-16 | 1 | -0/+14

  Add a warning if the starting sector is too low. The standard
  requires that at least 16k is reserved for the GPT Partition Array,
  but some tools produce GPT images with fewer than the required number
  of reserved sectors.

  PR: 274312
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D42247
* gpart: More nuance for GPT support | Warner Losh | 2024-10-15 | 8 | -4/+11

  A careful reading of the GPT standard shows that one may have fewer
  than 128 entries in your GPT table. While the standard requires that
  we reserve enough space (32 512-byte-LBAs or 4 4096-byte-LBAs), it
  also explicitly allows one to specify fewer actual partitions (since
  that controls what is in the CRC). It requires that the first LBA to
  be 32 (512 sectors) or 6 (4k sectors) or larger. That requirement is
  not enforced (it's not listed as one of validation criteria for the
  GPT). We should likely do so in the future.

  To that end, allow a default number of entries to use (defent) on
  creation to be different (larger) than the minimum number of legal
  entries. For gpt, these numbers work out to 128 and 1 respectively.
  For all the others, make minent == defent so this is a nop for those
  partitioning schemes.

  Sponsored by: Netflix
  Reviewed by: zlei, emaste
  Differential Revision: https://reviews.freebsd.org/D42246
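
The reserved-space figures quoted above (32 LBAs at 512 bytes, 4 LBAs at 4096 bytes) follow from a 16 KiB minimum partition array. A small sketch of that arithmetic, assuming the customary 128-byte entry size, which the commit message does not state explicitly; `gpt_reserved_lbas` is a hypothetical helper, not a gpart function.

```c
#include <stdint.h>

#define GPT_MIN_ARRAY_BYTES 16384u	/* 128 entries * 128 bytes (assumed) */

/* Number of LBAs needed to hold the minimum reserved partition array. */
static uint32_t
gpt_reserved_lbas(uint32_t sectorsize)
{
	/* Round up so a partial trailing sector still counts as one LBA. */
	return ((GPT_MIN_ARRAY_BYTES + sectorsize - 1) / sectorsize);
}
```

The reservation is fixed at 16 KiB regardless of how many entries the header actually declares, which is what lets defent (128) and minent (1) differ for GPT.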
* ggate: Avoid dropping the GEOM topology lock in dumpconf | Mark Johnston | 2024-10-04 | 1 | -3/+0

  In general it's not safe to drop the topology lock in these routines,
  as GEOM assumes that the mesh will be consistent during traversal.
  However, there's no reason we can't hold the topology lock across
  calls to g_gate_release(). (Note that g_gate_hold() can be called
  with the topology lock held.)

  PR: 238814
  MFC after: 2 weeks
* pkcs5v2: Add pkcs5v2_genkey_raw function | Colin Percival | 2024-09-22 | 2 | -2/+14

  This is like pkcs5v2_genkey but takes a "passphrase" as a buffer and
  length rather than a NUL-terminated string.

  Reviewed by: pjd
  MFC after: 1 week
  Sponsored by: Amazon
  Differential Revision: https://reviews.freebsd.org/D46633
* gpart: Add u-boot-env alias for U-Boot's environment GPT partition UUID | Jessica Clarke | 2024-09-02 | 3 | -0/+4

  This is a platform-independent UUID, and this is the name U-Boot
  uses.

  MFC after: 1 week
* geom_io: Shift to pause_sbt to eliminate bogus min and update comment. | Warner Losh | 2024-05-24 | 1 | -15/+12

  Update to eliminate bogus min to ensure 0 was never passed to pause.
  Instead, request 1ms with an 'infinite' precision, which defaults to
  whatever the underlying time counter can do. This should ensure we
  run fairly quickly to start processing done events, while still
  giving a small pause for the system to catch its breath. This rate
  limiter still is less than ideal, and this commit doesn't change
  that. It should really have no functional change: it just uses a
  better interface to express the desired sleep.

  Sponsored by: Netflix
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D45316
* geom: Add counts for enomem and pausing | Warner Losh | 2024-05-24 | 1 | -0/+11

  Add counts for the number of requests that complete with the ENOMEM
  as kern.geom.nomem_count and the number of times we pause the g_down
  thread to let the system recover as kern.geom.pause_count.

  Sponsored by: Netflix
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D45309
* Stop treating size 0 as unknown size in vnode_create_vobject(). | Pawel Jakub Dawidek | 2024-05-23 | 1 | -1/+10

  Whenever file is created, the vnode_create_vobject() function will
  try to determine its size by calling vn_getsize_locked() as size 0 is
  ambiguous: it means either the file size is 0 or the file size is
  unknown.

  Introduce special value for the size argument: VNODE_NO_SIZE. Only
  when it is given, the vnode_create_vobject() will try to obtain
  file's size on its own.

  Introduce dedicated vnode_disk_create_vobject() for use by
  g_vfs_open(), so we don't have to call vn_isdisk() in the common case
  (for regular files).

  Handle the case of mediasize==0 in g_vfs_open().

  Reviewed by: alc, kib, markj, olce
  Approved by: oshogbo (mentor), allanjude (mentor)
  Differential Revision: https://reviews.freebsd.org/D45244
* geom: Remove sysctl.h | Warner Losh | 2024-05-22 | 2 | -2/+0

  These files don't need sysctl.h, so remove it.

  Sponsored by: Netflix
* buf: define and use BUF_DISOWNED | Ryan Libby | 2024-05-21 | 1 | -2/+2

  Implement an API where previously code was directly reaching into the
  buf's internal lock.

  Reviewed by: mckusick, imp, kib, markj
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D45249
* geli: fix indentation | Mariusz Zaborski | 2024-05-19 | 1 | -126/+126

  no functional changes
* geli: allocate a UMA pool earlier | Mariusz Zaborski | 2024-05-19 | 1 | -1/+3

  The functions g_eli_init_uma and g_eli_fini_uma are used to trace the
  number of devices in GELI. There is an issue where the g_eli_create
  function may fail before g_eli_init_uma is called, however
  g_eli_fini_uma is still executed in the fail path. This can
  incorrectly decrease the device count to zero, potentially leading to
  the UMA pool being freed. Accessing the device after the pool has
  been freed causes a system panic.

  This commit resolves the issue by ensuring devices count is increased
  earlier.

  PR: 278828
  Reported by: Andre Albsmeier <mail@fbsd2.e4m.org>
  Reviewed by: asomers
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D45225
* Remove final cross-reference to GBDE | Poul-Henning Kamp | 2024-05-07 | 1 | -3/+3
* Remove GBDE source files | Poul-Henning Kamp | 2024-05-07 | 5 | -2125/+0
* geom_stripe: Cascade cantrim just like we do for gmirror | Matthew Grooms | 2024-05-03 | 2 | -1/+23

  If any of the disks can support trim, cascade that up the stack.
  Otherwise, trims won't pass through striped raid setups.

  PR: 277673
  Reviewed by: imp (minor style tweaks from bug report)
* glabel: Add support for Linux swap | Ricardo Branco | 2024-04-29 | 3 | -0/+93

  Reviewed by: imp, kib
  Pull Request: https://github.com/freebsd/freebsd-src/pull/1205
* geli: add a read-only kern.geom.eli.use_uma_bytes sysctl | Alan Somers | 2024-04-22 | 1 | -1/+3

  It reports the value of the g_eli_alloc_sz variable. Allocations of
  this size or less will use UMA. Larger allocations will use malloc.
  Since malloc is slower, it is useful for users to know this variable
  so they can avoid such allocations. For example, ZFS users can set
  vfs.zfs.vdev.aggregation_limit to this value.

  MFC after: 1 week
  Sponsored by: Axcient
  Reviewed by: markj, imp
  Differential Revision: https://reviews.freebsd.org/D44904
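
The size threshold described in this message can be sketched as a simple comparison. This is an illustration only: `choose_alloc_path` and the `alloc_path` enum are hypothetical names, and the threshold is passed as a parameter because its actual value is not given in the message.

```c
#include <stddef.h>

/* Which allocator a request of a given size would be routed to. */
enum alloc_path { ALLOC_UMA, ALLOC_MALLOC };

/*
 * Requests of use_uma_bytes or less go to the UMA zone; anything
 * larger falls back to malloc, the slower path per the message.
 */
static enum alloc_path
choose_alloc_path(size_t request, size_t use_uma_bytes)
{
	return (request <= use_uma_bytes ? ALLOC_UMA : ALLOC_MALLOC);
}
```

Capping an upper layer's largest I/O at the reported value, as the ZFS example suggests, keeps every request on the UMA side of this comparison.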
* geom(4): Fix a typo in a source code comment | Gordon Bergling | 2024-04-21 | 1 | -1/+1

  - s/cant/can't/

  MFC after: 3 days