path: root/sys/fs/devfs
Commit message | Author | Age | Files | Lines
* devfs_allocv(): style | Konstantin Belousov | 2024-05-19 | 1 | -2/+1
  (cherry picked from commit 6d79564fe341c8dbf09405cae1a0a76460aaf8aa)
* Fix MNT_IGNORE for devfs, fdescfs and nullfs | Doug Rabson | 2024-04-27 | 1 | -1/+1
  The MNT_IGNORE flag can be used to mark certain filesystem mounts so that utilities such as df(1) and mount(8) can filter out those mounts by default. This can be used, for instance, to reduce the noise from running container workloads inside jails which often have at least three and sometimes as many as ten mounts per container.
  The flag is supplied by the nmount(2) system call and is recorded so that it can be reported by statfs(2). Unfortunately several filesystems override the default behaviour and mask out the flag, defeating its purpose.
  This change preserves the MNT_IGNORE flag for those filesystems so that it can be reported correctly.
  MFC after: 1 week
  (cherry picked from commit b5c4616582cebdcf4dee909a3c2f5b113c4ae59e)
* cdevpriv(9): add iterator | Konstantin Belousov | 2024-03-30 | 1 | -0/+20
  (cherry picked from commit d3efbe0132b24e8660df836905cda7662f85a154)
* kcmp(2): implement for devfs files | Konstantin Belousov | 2024-02-11 | 1 | -0/+9
  (cherry picked from commit 5c41d888de1aba0e82531fb6df4cc3b6989d37bd)
* devfs(5): Fix a typo in a source code comment | Gordon Bergling | 2024-01-23 | 1 | -1/+1
  - s/interpeted/interpreted/
  (cherry picked from commit 7cf293536ebacc92150be12e0be928500e670610)
* devfs: add integrity asserts for cdevp_list | Jason A. Harmening | 2023-09-28 | 3 | -1/+16
  It's possible for misuse of cdev KPIs or for bugs in devfs itself to result in e.g. a cdev object's container being freed while still on the global list used to populate each devfs mount; see PR 273418 for a recent example.
  Since a node may be marked inactive well before it is reaped from the list, add a new flag solely to track list membership, and employ it in some basic list integrity assertions to catch bad actors.
  Discussed with: kib, mjg
  (cherry picked from commit 67864268da53b792836f13be10299de8cd62997e)
* sys: Remove $FreeBSD$: two-line .h pattern | Warner Losh | 2023-08-23 | 7 | -14/+0
  Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
  Similar commit in current:
  (cherry picked from commit 95ee2897e98f)
* spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD | Warner Losh | 2023-07-25 | 7 | -7/+7
  The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause.
  Discussed with: pfg
  MFC After: 3 days
  Sponsored by: Netflix
  (cherry picked from commit 4d846d260e2b9a3d4d0a701462568268cbfe7a5b)
* sys/fs: do not report blocks allocated for synthetic file systems | Stefan Eßer | 2023-05-01 | 1 | -2/+2
  The pseudo file systems (devfs, fdescfs, procfs, etc.) report total and available blocks and inodes despite being synthetic with no underlying storage device to which those values could be applied.
  The current code of these file systems tends to report a fixed number of total blocks but no free blocks, and in the case of procfs, linprocfs, linsysfs also no free inodes.
  This can be irritating in e.g. the "df" output, since 100% of the resources seem to be in use, but it can also create warnings in monitoring tools used for capacity management.
  This patch makes these file systems return the same value for the total and free parameters, leading to 0% in use being displayed by "df". Since there is no resource that can be exhausted, this appears to be a sensible result.
  Reviewed by: mckusick
  Differential Revision: https://reviews.freebsd.org/D39442
  (cherry picked from commit 88a795e80c03ff1d960d830ee273589664ab06cc)
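The arithmetic behind the "100% in use" versus "0% in use" display can be sketched as follows. This is a simplified model rather than the df(1) source: `pct_used` is a hypothetical helper, and the block counts in the usage note are illustrative values, not what any of these file systems actually report.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: percentage of blocks in use, derived from the
 * total and free block counts a file system reports via statfs(2).
 * Simplified model of a capacity column; not taken from df(1).
 */
int
pct_used(uint64_t total_blocks, uint64_t free_blocks)
{
	if (total_blocks == 0)
		return (0);		/* nothing to report on */
	if (free_blocks > total_blocks)
		free_blocks = total_blocks;
	return ((int)(((total_blocks - free_blocks) * 100) / total_blocks));
}
```

With a fixed total and zero free blocks (the old behaviour) this yields 100; when free equals total (the behaviour after this change) it yields 0.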
* fs: fix a few common typos in source code comments | Gordon Bergling | 2022-02-09 | 1 | -1/+1
  - s/quadradically/quadratically/
  - s/persistant/persistent/
  Obtained from: NetBSD
  (cherry picked from commit 8ea3ceda7644b7b93532d0c31b50ac5fa61e51a3)
* devfs: fix use count leak when using TIOCSCTTY | Mateusz Guzik | 2021-04-10 | 1 | -1/+1
  by matching devfs_ctty_ref
  Fixes: 3b44443626603f65 ("devfs: rework si_usecount to track opens")
  (cherry picked from commit 3bc17248d31794519ba95b2c6b9ff8a0d31dba81)
* devfs(4): defer freeing until we drop devmtx ("cdev") | Edward Tomasz Napierala | 2020-12-29 | 1 | -3/+3
  Before r332974 the old code would sometimes cause a rare lock order reversal against pagequeue, which looked roughly like this:
    witness_checkorder()
    __mtx_lock_flags()
    vm_page_alloc()
    uma_small_alloc()
    keg_alloc_slab()
    keg_fetch_slab()
    zone_fetch_slab()
    zone_import()
    zone_alloc_bucket()
    uma_zalloc_arg()
    bucket_alloc()
    uma_zfree_arg()
    free()
    devfs_metoo()
    devfs_populate_loop()
    devfs_populate()
    devfs_rioctl()
    VOP_IOCTL_APV()
    VOP_IOCTL()
    vn_ioctl()
    fo_ioctl()
    kern_ioctl()
    sys_ioctl()
  Since r332974 the original problem no longer exists, but it still makes sense to move things out of the - often congested - lock.
  Reviewed By: kib, markj
  Sponsored by: NetApp, Inc.
  Sponsored by: Klara, Inc.
  Differential Revision: https://reviews.freebsd.org/D27334
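The general pattern of deferring a free until a contended lock has been dropped can be sketched in userspace C. Everything here is an assumption made for illustration: `model_lock` merely records ownership and stands in for a mutex such as devmtx, and `node`, `list_add`, `node_free` and `reap_dead` are hypothetical names, not devfs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a global mutex; it only tracks whether it is held. */
struct model_lock {
	bool held;
};

struct node {
	struct node *next;
	bool dead;
};

struct model_lock list_lock;
struct node *live_list;
int frees_under_lock;		/* counts frees done with the lock held */

void
lock_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

void
lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Wrapper around free() that records whether the lock was held. */
void
node_free(struct node *n)
{
	if (list_lock.held)
		frees_under_lock++;
	free(n);
}

/* Push a new node onto the list while holding the lock. */
void
list_add(bool dead)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->dead = dead;
	lock_acquire(&list_lock);
	n->next = live_list;
	live_list = n;
	lock_release(&list_lock);
}

/* Unlink dead nodes under the lock; free them only after dropping it. */
int
reap_dead(void)
{
	struct node *garbage = NULL, **pp, *n;
	int reaped = 0;

	lock_acquire(&list_lock);
	pp = &live_list;
	while ((n = *pp) != NULL) {
		if (n->dead) {
			*pp = n->next;		/* unlink from live list */
			n->next = garbage;	/* park on private list */
			garbage = n;
		} else {
			pp = &n->next;
		}
	}
	lock_release(&list_lock);

	while ((n = garbage) != NULL) {
		garbage = n->next;
		node_free(n);			/* the allocator runs unlocked */
		reaped++;
	}
	return (reaped);
}
```

The private `garbage` list is what keeps the critical section short: only pointer updates happen under the lock, and the allocator is never entered while it is held.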
* fs: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -4/+0
  Notes: svn path=/head/; revision=365070
* devfs: Abstract locking assertions | Conrad Meyer | 2020-08-12 | 2 | -3/+6
  The conversion was largely mechanical: sed(1) with:
    -e 's|mtx_assert(&devmtx, MA_OWNED)|dev_lock_assert_locked()|g'
    -e 's|mtx_assert(&devmtx, MA_NOTOWNED)|dev_lock_assert_unlocked()|g'
  The definitions of these abstractions in fs/devfs/devfs_int.h are the only non-mechanical change.
  No functional change.
  Notes: svn path=/head/; revision=364135
* devfs: rework si_usecount to track opens | Mateusz Guzik | 2020-08-11 | 2 | -16/+149
  This removes a lot of special casing from the VFS layer.
  Reviewed by: kib (previous version)
  Tested by: pho (previous version)
  Differential Revision: https://reviews.freebsd.org/D25612
  Notes: svn path=/head/; revision=364113
* devfs: bool -> int | Mateusz Guzik | 2020-08-10 | 2 | -2/+2
  Fixes buildworld after r364069
  Notes: svn path=/head/; revision=364076
* devfs: save on spurious relocking for devfs_populate | Mateusz Guzik | 2020-08-10 | 3 | -2/+16
  Tested by: pho
  Notes: svn path=/head/; revision=364069
* devfs: use cheaper lockmgr entry points | Mateusz Guzik | 2020-08-10 | 1 | -0/+6
  Tested by: pho
  Notes: svn path=/head/; revision=364068
* devfs: use vget_prep/vget_finish | Mateusz Guzik | 2020-08-10 | 1 | -7/+8
  Tested by: pho
  Notes: svn path=/head/; revision=364067
* vfs: remove the obsolete privused argument from vaccess | Mateusz Guzik | 2020-08-05 | 1 | -1/+1
  This brings argument count down to 6, which is passable without the stack on amd64.
  Notes: svn path=/head/; revision=363893
* devfs: fix a vnode use-after-free in devfs_ioctl | Mateusz Guzik | 2020-07-04 | 1 | -8/+9
  The vnode to be replaced was read with a shared lock, meaning 2 racing threads can find the same one.
  While here clean it up a little bit.
  Notes: svn path=/head/; revision=362923
* vfs: track sequential reads and writes separately | Thomas Munro | 2020-06-21 | 1 | -2/+2
  For software like PostgreSQL and SQLite that sometimes reads sequentially while also writing sequentially some distance behind with interleaved syscalls on the same fd, performance is better on UFS if we do sequential access heuristics separately for reads and writes.
  Patch originally by Andrew Gierth in 2008, updated and proposed by me with his permission.
  Reviewed by: mjg, kib, tmunro
  Approved by: mjg (mentor)
  Obtained from: Andrew Gierth <andrew@tao11.riddles.org.uk>
  Differential Revision: https://reviews.freebsd.org/D25024
  Notes: svn path=/head/; revision=362460
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -1/+2
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren't properly marked). Use it in preparation for a general review of all nodes.
  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.
  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* Fix up various vnode-related asserts which did not dump the used vnode | Mateusz Guzik | 2020-02-03 | 1 | -2/+1
  Notes: svn path=/head/; revision=357446
* Provide O_SEARCH | Kyle Evans | 2020-02-02 | 1 | -2/+2
  O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value.
  This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH.
  This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei.
  [0] https://pubs.opengroup.org/onlinepubs/9699919799/
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23247
  Notes: svn path=/head/; revision=357412
* vfs: consistently use size_t for buflen around VOP_VPTOCNP | Mateusz Guzik | 2020-02-01 | 1 | -1/+1
  Notes: svn path=/head/; revision=357383
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 3 | -11/+11
  Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.
  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D21427
  Notes: svn path=/head/; revision=356337
* vfs: flatten vop vectors | Mateusz Guzik | 2019-12-16 | 1 | -0/+2
  This eliminates the following loop from all VOP calls:
    while(vop != NULL && \
        vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
            vop = vop->vop_default;
  Reviewed by: jeff
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22738
  Notes: svn path=/head/; revision=355790
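The idea of flattening can be illustrated with a much smaller operation vector. This is a sketch under stated assumptions, not the kernel implementation: `struct opvec`, its two slots, and the helpers `lookup_read_slow` and `opvec_flatten` are hypothetical, and the per-call loop is simplified to test a single slot.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*op_fn)(void);

/* Hypothetical two-slot vector; real vop vectors have many more. */
struct opvec {
	struct opvec *ov_default;	/* fallback vector, or NULL */
	op_fn ov_read;
	op_fn ov_write;
};

int default_read(void) { return (1); }
int default_write(void) { return (2); }
int custom_read(void) { return (10); }

/* Unflattened dispatch: walk the default chain on every call. */
op_fn
lookup_read_slow(const struct opvec *v)
{
	while (v != NULL && v->ov_read == NULL)
		v = v->ov_default;
	return (v != NULL ? v->ov_read : NULL);
}

/* Flattening: copy inherited entries in once, so calls need no loop. */
void
opvec_flatten(struct opvec *v)
{
	const struct opvec *d;

	for (d = v->ov_default; d != NULL; d = d->ov_default) {
		if (v->ov_read == NULL)
			v->ov_read = d->ov_read;
		if (v->ov_write == NULL)
			v->ov_write = d->ov_write;
	}
}
```

After `opvec_flatten()` every slot that some vector in the chain provides is filled in directly, so a call site can invoke `v->ov_write()` without first searching for a vector that implements it.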
* vfs: introduce v_irflag and make v_type smaller | Mateusz Guzik | 2019-12-08 | 1 | -5/+5
  The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time.
  v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED.
  Reviewed by: kib, jeff
  Differential Revision: https://reviews.freebsd.org/D22715
  Notes: svn path=/head/; revision=355537
* tty: implement TIOCNOTTY | Kyle Evans | 2019-11-30 | 1 | -2/+9
  Generally, it's preferred that an application fork/setsid if it doesn't want to keep its controlling TTY, but it could be that a debugger is trying to steal it instead -- so it would hook in, drop the controlling TTY, then do some magic to set things up again. In this case, TIOCNOTTY is quite handy and still respected by at least OpenBSD, NetBSD, and Linux as far as I can tell.
  I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as long as it doesn't impose a major burden.
  Reviewed by: bcr (manpages), kib
  Differential Revision: https://reviews.freebsd.org/D22572
  Notes: svn path=/head/; revision=355248
* devfs: introduce a per-dev lock to protect ->si_devsw | Mateusz Guzik | 2019-11-30 | 2 | -0/+5
  This allows bumping threadcount without taking the global devmtx lock.
  In particular this eliminates contention on said lock while using bhyve with multiple vms.
  Reviewed by: kib
  Tested by: markj
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D22548
  Notes: svn path=/head/; revision=355228
* vfs: change si_usecount management to count used vnodes | Mateusz Guzik | 2019-11-20 | 1 | -8/+8
  Currently si_usecount is effectively a sum of usecounts from all associated vnodes. This is maintained by special-casing for VCHR every time usecount is modified. Apart from complicating the code a little bit, it has a scalability impact since it forces a read from a cacheline shared with said count.
  There are no consumers of the feature in the ports tree. In head there are only 2: revoke and devfs_close. Both can get away with a weaker requirement than the exact usecount, namely just the count of active vnodes. Changing the meaning to the latter means we only need to modify it on 0<->1 transitions, avoiding the check plenty of times (and entirely in something like vrefact).
  Reviewed by: kib, jeff
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22202
  Notes: svn path=/head/; revision=354890
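Counting active vnodes instead of summing their use counts can be modelled in a few lines. The field names `si_usecount` and `v_usecount` come from the message above; the structures `model_dev` and `model_vnode` and the functions `model_vref` and `model_vrele` are hypothetical simplifications with no locking or atomics, so this is only a sketch of the bookkeeping.

```c
#include <assert.h>

/* Hypothetical device: si_usecount counts vnodes that are in use. */
struct model_dev {
	int si_usecount;
};

/* Hypothetical vnode referencing its device. */
struct model_vnode {
	int v_usecount;
	struct model_dev *v_dev;
};

/* Take a reference; touch the device only on the 0 -> 1 transition. */
void
model_vref(struct model_vnode *vp)
{
	if (vp->v_usecount++ == 0)
		vp->v_dev->si_usecount++;
}

/* Drop a reference; touch the device only on the 1 -> 0 transition. */
void
model_vrele(struct model_vnode *vp)
{
	assert(vp->v_usecount > 0);
	if (--vp->v_usecount == 0)
		vp->v_dev->si_usecount--;
}
```

A vnode that already holds references can gain or lose further ones without writing to the device at all, which is where the saving described above comes from.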
* devfs: use MNTK_NOMSYNC | Mateusz Guzik | 2019-10-13 | 1 | -1/+2
  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D22009
  Notes: svn path=/head/; revision=353471
* devfs_vptocnp(): correct the component name when node is not at top. | Konstantin Belousov | 2019-10-11 | 1 | -27/+16
  The node's cdp.si_name is the full path as provided by make_dev(9); it should not be returned by VOP_VPTOCNP() when only the last component is requested. Use the dirent entry instead.
  With this note, handling of VDIR and VCHR nodes only differs in handling of root vnode, which simplifies and unifies the logic.
  Reported by: Li, Zhichao1 <Zhichao_Li1@Dell.com>
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=353447
* devfs: add root vnode caching | Mateusz Guzik | 2019-10-06 | 1 | -1/+3
  See r353150.
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21646
  Notes: svn path=/head/; revision=353152
* devfs: plug redundant bwillwrite avoidance | Mateusz Guzik | 2019-10-05 | 1 | -11/+0
  vn_write already checks for vnode type to see if bwillwrite should be called.
  This effectively reverts r244643.
  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21905
  Notes: svn path=/head/; revision=353126
* Rework v_object lifecycle for vnodes. | Konstantin Belousov | 2019-08-29 | 1 | -1/+0
  The current implementation of vnode_create_vobject() and vnode_destroy_vobject() is written so that it is prepared to handle the vm object destruction for a live vnode. Practically, no filesystems use this, except for some remnants that were present in UFS till today. One of the consequences of that model is that each filesystem must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier; as a result all of them get rid of the v_object in reclaim.
  Move the call to vnode_destroy_vobject() to vgonel() before VOP_RECLAIM(). This makes v_object stable: either the object is NULL, or it is a valid vm object till the vnode reclamation. Remove code from vnode_create_vobject() to handle races with the parallel destruction.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D21412
  Notes: svn path=/head/; revision=351598
* Avoid relying on header pollution from sys/refcount.h. | Mark Johnston | 2019-07-29 | 1 | -0/+1
  MFC after: 3 days
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=350421
* Extract eventfilter declarations to sys/_eventfilter.h | Conrad Meyer | 2019-05-20 | 1 | -0/+1
  This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h" in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header pollution substantially.
  EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
  As a side effect of reduced header pollution, many .c files and headers no longer contain needed definitions. The remainder of the patch addresses adding appropriate includes to fix those files.
  LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by sys/mutex.h since r326106 (but silently protected by header pollution prior to this change).
  No functional change (intended). Of course, any out of tree modules that relied on header pollution for sys/eventhandler.h, sys/lock.h, or sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
  Notes: svn path=/head/; revision=347984
* Ensure that directory entry padding bytes are zeroed. | Mark Johnston | 2018-11-23 | 1 | -1/+1
  Directory entries must be padded to maintain alignment; in many filesystems the padding was not initialized, resulting in stack memory being copied out to userspace. With the ino64 work there are also some explicit pad fields in struct dirent. Add a subroutine to clear these bytes and use it in the in-tree filesystems. The NFS client is omitted for now as it was fixed separately in r340787.
  Reported by: Thomas Barabosch, Fraunhofer FKIE
  Reviewed by: kib
  MFC after: 3 days
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=340856
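What "clearing the padding" means can be shown on a toy record. This is only a sketch: `struct model_dirent` is a hypothetical fixed-size layout, not the real struct dirent, and `model_dirent_zero_tail` is an illustrative name rather than the subroutine the commit adds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NAMELEN 16

/* Hypothetical fixed-size record; not the real struct dirent layout. */
struct model_dirent {
	uint32_t d_fileno;
	uint8_t d_namlen;
	char d_name[MODEL_NAMELEN];
};

/*
 * Zero every byte of d_name past the name itself, so that stale
 * memory is not exposed when the record is copied out to a caller.
 */
void
model_dirent_zero_tail(struct model_dirent *dp)
{
	assert(dp->d_namlen < MODEL_NAMELEN);
	memset(dp->d_name + dp->d_namlen, 0,
	    MODEL_NAMELEN - dp->d_namlen);
}
```

A caller fills in `d_fileno`, `d_namlen` and the name bytes, then calls the helper before handing the record out; the terminating NUL and all trailing bytes are then guaranteed to be zero.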
* Add d_off support for multiple filesystems. | Konstantin Belousov | 2018-11-14 | 1 | -0/+2
  The d_off field has been added to the dirent structure recently. Currently filesystems don't support this feature. Support has been added and tested for zfs, ufs, ext2fs, fdescfs, msdosfs and unionfs. A stub implementation is available for cd9660, nandfs, udf and pseudofs but hasn't been tested.
  Motivation for this feature: our usecase is for a userspace nfs server (nfs-ganesha) with zfs. At the moment we cache direntry offsets by calling lseek once per entry, with this patch we can get the offset directly from getdirentries(2) calls which provides a significant speedup.
  Submitted by: Jack Halford <jack@gandi.net>
  Reviewed by: mckusick, pfg, rmacklem (previous versions)
  Sponsored by: Gandi.net
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D17917
  Notes: svn path=/head/; revision=340431
* Move 32-bit compat support for FIODGNAME to the right place. | Brooks Davis | 2018-10-26 | 1 | -8/+34
  ioctl(2) commands only have meaning in the context of a file descriptor so translating them in the syscall layer is incorrect.
  The new handler uses an accessor to retrieve/construct a pointer from the last member of the passed structure and relies on type punning to access the other member which requires no translation.
  Unlike r339174 this change supports both places FIODGNAME is handled.
  Reviewed by: kib
  Obtained from: CheriBSD
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D17475
  Notes: svn path=/head/; revision=339779
* Revert r339174: Move 32-bit compat support for FIODGNAME to the right place. | Brooks Davis | 2018-10-04 | 1 | -42/+8
  A case was missed in this commit which breaks sshing into a 32-bit sshd on a 64-bit system.
  Approved by: re (gjb)
  Notes: svn path=/head/; revision=339186
* Move 32-bit compat support for FIODGNAME to the right place. | Brooks Davis | 2018-10-03 | 1 | -8/+42
  ioctl(2) commands only have meaning in the context of a file descriptor so translating them in the syscall layer is incorrect.
  The new handler uses an accessor to retrieve/construct a pointer from the last member of the passed structure and relies on type punning to access the other member which requires no translation.
  Reviewed by: kib
  Approved by: re (rgrimes, gjb)
  Obtained from: CheriBSD
  Sponsored by: DARPA, AFRL
  Differential Review: https://reviews.freebsd.org/D17388
  Notes: svn path=/head/; revision=339174
* Make it easier for filesystems to count themselves as jail-enabled, | Jamie Gritton | 2018-05-04 | 1 | -3/+0
  by doing most of the work in a new function prison_add_vfs in kern_jail.c. Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and the rest is taken care of. This includes adding a jail parameter like allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed. Both of these used to be a static list of known filesystems, with predefined permission bits.
  Reviewed by: kib
  Differential Revision: D14681
  Notes: svn path=/head/; revision=333263
* Move most of the contents of opt_compat.h to opt_global.h. | Brooks Davis | 2018-04-06 | 1 | -2/+0
  opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options.
  Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures.
  Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files.
  Reviewed by: kib, cem, jhb, jtl
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D14941
  Notes: svn path=/head/; revision=332122
* Report INT_MAX for LINK_MAX for devfs' VOP_PATHCONF(). | John Baldwin | 2017-12-19 | 1 | -1/+1
  devfs uses ints for link counts internally and already reports the full link count via stat() post ino64.
  Sponsored by: Chelsio Communications
  Notes: svn path=/head/; revision=326996
* Handle _PC_FILESIZEBITS and _PC_SYMLINK_MAX for devfs' VOP_PATHCONF(). | John Baldwin | 2017-12-19 | 1 | -0/+6
  MFC after: 1 month
  Sponsored by: Chelsio Communications
  Notes: svn path=/head/; revision=326994
* Move NAME_MAX, LINK_MAX, and CHOWN_RESTRICTED out of vop_stdpathconf(). | John Baldwin | 2017-12-19 | 1 | -0/+9
  Having all filesystems fall through to default values isn't always correct and these values can vary for different filesystem implementations. Most of these changes just use the existing default values with a few exceptions:
  - Don't report CHOWN_RESTRICTED for ZFS since it doesn't do the exact permissions check this claims for chown().
  - Use NANDFS_NAME_LEN for NAME_MAX for nandfs.
  - Don't report a LINK_MAX of 0 on smbfs. Now fail with EINVAL to indicate hard links aren't supported.
  Requested by: bde (though perhaps not this exact implementation)
  Reviewed by: kib (earlier version)
  MFC after: 1 month
  Sponsored by: Chelsio Communications
  Notes: svn path=/head/; revision=326993
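From userspace, these per-filesystem values are visible through the POSIX pathconf(2) interface. The following sketch shows how a caller might distinguish a reported limit from an EINVAL failure such as the smbfs case above. `query_limit` is a hypothetical wrapper, and the values returned depend entirely on the system and file system being queried.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around pathconf(2). Returns the value reported
 * for `name` on `path`. On a -1 return, *unsupported is set to 1 if the
 * call failed with EINVAL, and to 0 otherwise (for example when the
 * limit is indeterminate and errno is left untouched).
 */
long
query_limit(const char *path, int name, int *unsupported)
{
	long v;

	errno = 0;	/* lets "no limit" be told apart from an error */
	v = pathconf(path, name);
	*unsupported = (v == -1 && errno == EINVAL);
	return (v);
}
```

For instance, `query_limit("/dev", _PC_LINK_MAX, &u)` asks the file system mounted at /dev for its link count limit; `_PC_NAME_MAX`, `_PC_FILESIZEBITS` and `_PC_SYMLINK_MAX` are queried the same way.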
* In devfs_lookupx() dotdot lookup case, avoid dereferencing | Konstantin Belousov | 2017-12-14 | 1 | -5/+6
  dvp->v_mount after dvp is unlocked. The vnode might be reclaimed after unlock, so v_mount becomes NULL. Cache the struct mount pointer before the unlock, the struct is type-stable.
  Note that devfs_allocv() reads mp->mnt_data but does not operate on it further when dirent is doomed. The unmount cannot proceed until all dirents are reclaimed.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=326851