path: root/sys/fs/unionfs
Commit message (Author, Age, Files, Lines)
* VOP_RENAME(9): add flags argument (Konstantin Belousov, 2026-03-05, 1 file, -1/+6)
| | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D55539
* Remove the DEBUG_VFS_LOCKS kernel option (Mark Johnston, 2026-01-15, 1 file, -1/+1)
| | | | | | | | | | | After commit 3bd8fab2415b ("vfs: Move DEBUG_VFS_LOCKS checks to INVARIANTS"), this option has no effect. Let's finish the removal. There are a couple of additional uses in zfs, I will submit a separate patch upstream for them. Reviewed by: mckusick, kib Differential Revision: https://reviews.freebsd.org/D54662
* unionfs: Sporadic cleanup (Dag-Erling Smørgrav, 2025-12-17, 3 files, -25/+23)
| | | | | Sponsored by: Klara, Inc. Sponsored by: NetApp, Inc.
* unionfs: Support renaming symbolic links (Dag-Erling Smørgrav, 2025-12-17, 3 files, -0/+179)
| | | | | | | | | | | This adds support for renaming a symbolic link found on the lower fs, which necessitates copying it to the upper fs, as well as basic tests. MFC after: 1 week Sponsored by: Klara, Inc. Sponsored by: NetApp, Inc. Reviewed by: olce, siderop1_netapp.com, jah Differential Revision: https://reviews.freebsd.org/D54229
* unionfs: detect common deadlock-producing mount misconfigurations (Jason A. Harmening, 2025-12-12, 1 file, -2/+25)
| | | | | | | | | | | | | | | | | | | | | | | | When creating a unionfs mount, it's fairly easy to shoot oneself in the foot by specifying upper and lower file hierarchies that resolve back to the same vnodes. This is fairly easy to do if the sameness is not obvious due to aliasing through nullfs or other unionfs mounts (as in the associated PR), and will produce either deadlock or failed locking assertions on any attempt to use the resulting unionfs mount. Leverage VOP_GETLOWVNODE() to detect the most common cases of foot-shooting at mount time and fail the mount with EDEADLK. This is not meant to be an exhaustive check for all possible deadlock-producing scenarios, but it is an extremely cheap and simple approach that, unlike previous proposed fixes, also works in the presence of nullfs aliases. PR: 172334 Reported by: ngie, Karlo Miličević <karlo98.m@gmail.com> Reviewed by: kib, olce Tested by: pho MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D53988
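The mount-time check described above can be pictured with a small userspace sketch. It assumes a hypothetical `sketch_vnode` type whose `backing` pointer stands in for what VOP_GETLOWVNODE() would return for an aliasing layer such as nullfs; none of these names exist in the kernel, and the real check also takes an access type into account.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical vnode: 'backing' is NULL for a base-layer vnode. */
struct sketch_vnode {
	struct sketch_vnode *backing;
};

/* Follow alias layers down to the base vnode (stand-in for VOP_GETLOWVNODE). */
static struct sketch_vnode *
sketch_resolve_base(struct sketch_vnode *vp)
{
	assert(vp != NULL);
	while (vp->backing != NULL)
		vp = vp->backing;
	return (vp);
}

/* Refuse the mount when both layers resolve to the same base vnode. */
static int
sketch_check_layers(struct sketch_vnode *upper, struct sketch_vnode *lower)
{
	if (sketch_resolve_base(upper) == sketch_resolve_base(lower))
		return (EDEADLK);
	return (0);
}
```

As the commit message notes, a pointer comparison of this kind is cheap but not exhaustive; it only catches layers that resolve to the very same vnode.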
* unionfs: Implement VOP_GETLOWVNODE (Jason A. Harmening, 2025-12-12, 1 file, -0/+45)
| | | | | | | | | | | | | | | | | | | This function returns the vnode that will be used to resolve the access type specified in the 'flags' argument, and is useful for optimal behavior of vn_copy_file_range(). While most filesystems can simply use the default implementation which returns the passed-in vnode, unionfs (like nullfs) ideally should resolve the access request to whichever base layer vnode will be used for the I/O. For unionfs, write accesses must be resolved through the upper vnode, while read accesses will be resolved through the upper vnode if present or the lower vnode otherwise. Provide a simple unionfs_getlowvnode() implementation that reflects this policy. Reviewed by: kib, olce Tested by: pho MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D53988
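The read/write policy stated above is small enough to sketch in userspace C. The types and the function name below are hypothetical, not the kernel's; returning NULL for a write when no upper vnode exists is an assumption of this sketch, not something the commit message specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a base-layer vnode and a unionfs node. */
struct sketch_vnode {
	const char *name;
};

struct sketch_union_node {
	struct sketch_vnode *upper;	/* may be NULL */
	struct sketch_vnode *lower;	/* may be NULL */
};

/*
 * Pick the base-layer vnode that would service the access:
 * writes resolve only through the upper vnode (NULL here if absent,
 * which is an assumption of this sketch); reads prefer the upper vnode
 * and fall back to the lower one.
 */
static struct sketch_vnode *
sketch_getlowvnode(const struct sketch_union_node *un, bool for_write)
{
	assert(un != NULL);
	if (for_write)
		return (un->upper);
	return (un->upper != NULL ? un->upper : un->lower);
}
```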
* unionfs: avoid vdrop()ing a locked but doomed vnode (Jason A. Harmening, 2025-10-16, 1 file, -7/+3)
| | | | | | | | | | | | | | | | | | | | | | | | unionfs_lock() unconditionally calls vdrop() on the target vnode after locking it, but it's possible this vnode may be doomed. In that case, vdrop() may free the vnode, which in certain cases requires taking the vnode lock. Commit a7aac8c20497d added an assert to this effect, which unionfs_lock() now trips over. Fix this by lightly reworking the flow of unionfs_lock() so that the target vnode is vdrop()ed after being unlocked in the case where the unionfs lock operation needs to be restarted (which will happen if the unionfs vnode has been doomed, which is a prerequisite for the target vnode in the underlying filesystem to have been doomed). While here, get rid of a superfluous vhold/vdrop sequence in unionfs_unlock() that was probably inherited from nullfs and whose nullfs equivalent was recently removed. MFC after: 1 week Reviewed by: kib, markj, olce Tested by: pho Differential Revision: https://reviews.freebsd.org/D53107
* unionfs: fix NULL deref on closing an fd passed through SCM_RIGHTS (Jason A. Harmening, 2025-10-14, 2 files, -1/+3)
| | | | | | | | | | | | | | | | | | If the last reference to an open file is contained in an SCM_RIGHTS message in a UNIX domain socket, and that message is discarded without being read out by the receiver, VOP_CLOSE will ultimately be called with ap->a_td == NULL. Change unionfs_close() to check for this condition instead of blindly passing the thread to unionfs_find_node_status() which will try to dereference it. Also add relevant asserts on the node status lookup paths. PR: 289700 Reported by: asomers Reviewed by: asomers, olce MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D53079
* vfs: retire the NULLVP macro (Mateusz Guzik, 2025-09-27, 3 files, -208/+208)
| | | | | | | | | | | | The kernel was already mostly using plain NULL, just whack it and be done with the legacy. Churn generated with coccinelle: @@ @@ - NULLVP + NULL
* namei: Fix cn_flags width in various places (Mark Johnston, 2025-05-27, 1 file, -1/+1)
| | | | | | | | | This truncation is mostly harmless today, but fix it anyway to avoid pain later down the road. Reviewed by: olce, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D50417
* namei: Make stackable filesystems check harder for jail roots (Mark Johnston, 2025-05-23, 1 file, -0/+21)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Suppose a process has its cwd pointing to a nullfs directory, where the lower directory is also visible in the jail's filesystem namespace. Suppose that the lower directory vnode is moved out from under the nullfs mount. The nullfs vnode still shadows the lower vnode, and dotdot lookups relative to that directory will instantiate new nullfs vnodes outside of the nullfs mountpoint, effectively shadowing the lower filesystem. This phenomenon can be abused to escape a chroot, since the nullfs vnodes instantiated by these dotdot lookups defeat the root vnode check in vfs_lookup(), which uses vnode pointer equality to test for the process root. Fix this by extending nullfs and unionfs to perform the same check, exploiting the fact that the passed componentname is embedded in a nameidata structure to avoid changing the VOP_LOOKUP interface. That is, add a flag to indicate that containerof can be used to get the full nameidata structure, and perform the root vnode check on the lower vnode when performing a dotdot lookup. PR: 262180 Reviewed by: olce, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D50418
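The containerof trick mentioned above can be illustrated outside the kernel. Everything in this sketch is hypothetical (the structure layouts, the `SKETCH_CN_EMBEDDED` flag, and the function name); it only shows how a flag can license recovering the enclosing structure from a pointer to an embedded member, and then comparing the lower vnode against the process root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Recover a pointer to the enclosing structure from a member pointer. */
#define SKETCH_CONTAINEROF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical flag: the componentname is embedded in a nameidata. */
#define SKETCH_CN_EMBEDDED 0x1u

struct sketch_componentname {
	unsigned flags;
	const char *nameptr;
};

struct sketch_nameidata {
	const void *rootdir;			/* process root vnode */
	struct sketch_componentname cnd;	/* embedded component name */
};

/*
 * For a dotdot lookup in a stacked filesystem: report whether the LOWER
 * vnode is the process root. Only valid when the flag says the
 * componentname really lives inside a nameidata.
 */
static bool
sketch_dotdot_hits_root(struct sketch_componentname *cnp, const void *lowervp)
{
	struct sketch_nameidata *ndp;

	assert(cnp != NULL);
	if ((cnp->flags & SKETCH_CN_EMBEDDED) == 0)
		return (false);
	ndp = SKETCH_CONTAINEROF(cnp, struct sketch_nameidata, cnd);
	return (ndp->rootdir == lowervp);
}
```

Gating the pointer arithmetic on a flag is what keeps it safe: a componentname that was allocated on its own has no enclosing structure to recover.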
* cred: proc_set_cred(), proc_unset_cred(): Update user's process count (Olivier Certner, 2024-12-16, 1 file, -6/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a process really changes credentials at the moment proc_set_cred() or proc_unset_cred() is called, these functions are the proper locations to perform the update of the new and old real users' process count (using chgproccnt()). Before this change, change_ruid() instead would perform that update, although it operates only on a passed credential which is a priori not tied to the calling process (or not to any process at all). This was arguably a flaw of commit b1fc0ec1a7a49ded, r77183, based on its commit message, and in particular the portion "(...) In each case, the call now acts on a credential not a process (...)". Fixing this makes using change_ruid() more natural when building candidate credentials that in the end are not applied to a process, e.g., because of some intervening privilege check. Also, it removes a hack around this unwanted process count change in unionfs. We also introduce the new proc_set_cred_enforce_proc_lim() so that callers can respect the per-user process limit, and will use it for the upcoming setcred(). We plan to change all callers of proc_set_cred() to call this new function instead at some point. In the meantime, both proc_set_cred() and the new function will coexist. As detailed in some proc_set_cred_enforce_proc_lim()'s comment, checking against the process limit is currently flawed as the kernel doesn't really maintain the number of processes per UID (besides RLIMIT_NPROC, this in fact also applies to RLIMIT_KQUEUES, RLIMIT_NPTS, RLIMIT_SBSIZE and RLIMIT_SWAP). The applied limit is currently that of the old real UID. Root (or a process granted with PRIV_PROC_LIMIT) is not subject to this limit. Approved by: markj (mentor) Fixes: b1fc0ec1a7a49ded MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D46923
* vfs: Add IGNOREWHITEOUT flag and adopt it in UFS/unionfs (Jason A. Harmening, 2024-09-08, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | This flag is meant to request that the VOP implementation ignore whiteout entries when processing directory contents. Employ this flag (initially) in UFS when determining whether a directory is empty for the purpose of deleting it or renaming another directory over it. The previous UFS behavior was to always ignore whiteouts and to therefore always allow directories containing only whiteouts to be deleted or overwritten. This makes sense when the directory in question is being accessed through a unionfs view in which the whiteouts produce a unionfs directory that is logically empty, but it makes less sense when directly operating against the UFS directory in which case silently discarding the whiteouts may produce unexpected behavior in a current or future unionfs view. IGNOREWHITEOUT is therefore treated as opt-in and only specified by unionfs_rmdir() when invoking VOP_RMDIR() against the upper filesystem. IGNOREWHITEOUT is not currently used for unionfs rename operations, as the current implementation of unionfs_rename() simply forbids renaming over any existing upper filesystem directory in the first place. Differential Revision: https://reviews.freebsd.org/D45987 Reviewed by: olce Tested by: pho
* unionfs: fix LINT build (Jason A. Harmening, 2024-07-13, 1 file, -2/+2)
| | | | | | | | | Fix a stale variable name that snuck into a tracepoint from an earlier version of the change. Fixes: eb60ff1e "unionfs: rework locking scheme to only lock a single vnode" Reported by: jenkins
* unionfs: do not create a new status object during vop_close() (Jason A. Harmening, 2024-07-12, 3 files, -16/+40)
| | | | | | | | | | | | | | | Split the portion of unionfs_get_node_status() that searches for an existing status object into a new helper function, unionfs_find_node_status(), and use that in unionfs_close(). Additionally, modify unionfs_close() to accept a NULL status object if unionfs_find_node_status() does not find a matching status object. This can happen due to the unconditional VOP_CLOSE() operation issued by vgonel(). Differential Revision: https://reviews.freebsd.org/D45398 Reviewed by: olce Tested by: pho
* unionfs: rework locking scheme to only lock a single vnode (Jason A. Harmening, 2024-07-12, 4 files, -746/+979)
| | | | | | | | | | | | | | | | | | | Instead of locking both the lower and upper vnodes, which is both complex and deadlock-prone, only lock the upper vnode, or the lower vnode if no upper vnode is present. In most cases this is all that is needed; for the cases in which both vnodes do need to be locked, this change also employs deadlock-avoiding techniques such as LK_NOWAIT and vn_lock_pair(). There are still some corner cases in which the current implementation ends up taking multiple vnode locks across different filesystems without taking special steps to avoid deadlock; those cases have been noted in the comments. Differential Revision: https://reviews.freebsd.org/D45398 Reviewed by: olce Tested by: pho
* unionfs_rename: fix numerous locking issues (Jason A. Harmening, 2024-04-29, 1 file, -56/+96)
| | | | | | | | | | | | | | | | | | | | | | | There are a few places in which unionfs_rename() accesses fvp's private data without holding the necessary lock/interlock. Moreover, the implementation completely fails to handle the case in which fdvp is not the same as tdvp; in this case it simply fails to lock fdvp at all. Finally, it locks fvp while potentially already holding tvp's lock, but makes no attempt to deal with possible LOR there. Fix this by optimistically using the vnode interlock to protect the short accesses to fdvp and fvp private data, sequentially. If a file copy or shadow directory creation is required to prepare the upper FS for the rename operation, the interlock must be dropped and fdvp/fvp locked as necessary. Additionally, use ERELOOKUP (as suggested by kib@) to simplify the locking logic and eliminate unionfs_relookup() calls for file-copy/shadow-directory cases that require tdvp's lock to be dropped. Reviewed by: kib (earlier version), olce Tested by: pho Differential Revision: https://reviews.freebsd.org/D44788
* unionfs_lookup(): fix wild accesses to vnode private data (Jason A. Harmening, 2024-04-09, 1 file, -7/+15)
| | | | | | | | | There are a few spots in which unionfs_lookup() accesses unionfs vnode private data without holding the corresponding vnode lock or interlock. Reviewed by: kib, olce MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D44601
* unionfs: implement VOP_UNP_* and remove special VSOCK vnode handling (Jason A. Harmening, 2024-03-24, 1 file, -89/+84)
| | | | | | | | | | | | | | unionfs has a bunch of clunky special-case code to avoid creating unionfs wrapper vnodes for AF_UNIX sockets. This was added in 2008 to address PR 118346, but in the intervening years the VOP_UNP_* operations have been added to provide a clean interface to allow sockets to work in the presence of stacked filesystems. PR: 275871 Reviewed by: kib (prior version), olce Tested by: Karlo Miličević <karlo98.m@gmail.com> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D44288
* unionfs: accommodate underlying FS calls that may re-lock (Jason A. Harmening, 2024-03-10, 3 files, -60/+289)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since non-doomed unionfs vnodes always share their primary lock with either the lower or upper vnode, any forwarded call to the base FS which transiently drops that upper or lower vnode lock may result in the unionfs vnode becoming completely unlocked during that transient window. The unionfs vnode may then become doomed by a concurrent forced unmount, which can lead to either or both of the following: --Complete loss of the unionfs lock: in the process of being doomed, the unionfs vnode switches back to the default vnode lock, so even if the base FS VOP reacquires the upper/lower vnode lock, that no longer translates into the unionfs vnode being relocked. This will then violate that caller's locking assumptions as well as various assertions that are enabled with DEBUG_VFS_LOCKS. --Complete loss of reference on the upper/lower vnode: the caller normally holds a reference on the unionfs vnode, while the unionfs vnode in turn holds references on the upper/lower vnodes. But in the course of being doomed, the unionfs vnode will drop the latter set of references, which can effectively lead to the base FS VOP executing with no references at all on its vnode, violating the assumption that vnodes can't be recycled during these calls and (if lucky) violating various assertions in the base FS. Fix this by adding two new functions, unionfs_forward_vop_start_pair() and unionfs_forward_vop_finish_pair(), which are intended to bookend any forwarded VOP which may transiently unlock the relevant vnode(s). These functions are currently only applied to VOPs that modify file state (and require vnode reference and lock state to be identical at call entry and exit), as the common reason for transiently dropping locks is to update filesystem metadata. Reviewed by: olce Tested by: pho MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D44076
* unionfs: work around underlying FS failing to respect cn_namelen (Jason A. Harmening, 2024-02-18, 1 file, -0/+17)
| | | | | | | | | | | | | | | | | | | | | unionfs_mkshadowdir() may be invoked on a non-leaf pathname component during lookup, in which case the NUL terminator of the pathname buffer will be well beyond the end of the current component. cn_namelen in this case will still (correctly) indicate the length of only the current component, but ZFS in particular does not currently respect cn_namelen, leading to the creation of inaccessible files with slashes in their names. Work around this behavior by temporarily NUL-terminating the current pathname component for the call to VOP_MKDIR(). https://github.com/openzfs/zfs/issues/15705 has been filed to track a proper upstream fix for the issue at hand. PR: 275871 Reported by: Karlo Miličević <karlo98.m@gmail.com> Tested by: Karlo Miličević <karlo98.m@gmail.com> Reviewed by: kib, olce MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D43818
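The workaround described above amounts to a save, overwrite, call, restore sequence on the path buffer. The following userspace sketch shows that pattern only; the function names and the callback type are hypothetical, and the callback stands in for a callee (VOP_MKDIR() in the commit) that scans to the NUL terminator instead of honoring the component length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callee that ignores the length and reads up to the NUL. */
typedef int (*sketch_name_consumer_t)(const char *name, void *arg);

/*
 * Call fn() with the component path[off .. off+len) temporarily
 * NUL-terminated, then restore the byte that was overwritten so the
 * rest of the path is intact for later lookup steps.
 */
static int
sketch_call_terminated(char *path, size_t off, size_t len,
    sketch_name_consumer_t fn, void *arg)
{
	char saved;
	int error;

	assert(off + len <= strlen(path));
	saved = path[off + len];
	path[off + len] = '\0';
	error = fn(path + off, arg);
	path[off + len] = saved;
	return (error);
}

/* Example consumer: record how many bytes a NUL-scanning callee sees. */
static int
sketch_record_strlen(const char *name, void *arg)
{
	*(size_t *)arg = strlen(name);
	return (0);
}
```

For a buffer holding `usr/local/bin`, calling `sketch_call_terminated()` with offset 4 and length 5 lets the callee see only `local`, and the slash after it is put back before the function returns.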
* unionfs: upgrade the vnode lock during fsync() if necessary (Jason A. Harmening, 2024-02-18, 2 files, -1/+10)
| | | | | | | | | | | | | | | | | | | | If the underlying upper FS supports shared locking for write ops, as is the case with ZFS, VOP_FSYNC() may only be called with the vnode lock held shared. In this case, temporarily upgrade the lock for those unionfs maintenance operations which require exclusive locking. While here, make unionfs inherit the upper FS' support for shared write locking. Since the upper FS is the target of VOP_GETWRITEMOUNT() this is what will dictate the locking behavior of any unionfs caller that uses vn_start_write() + vn_lktype_write(), so unionfs must be prepared for the caller to only hold a shared vnode lock in these cases. Found in local testing of unionfs atop ZFS with DEBUG_VFS_LOCKS. MFC after: 2 weeks Reviewed by: kib, olce Differential Revision: https://reviews.freebsd.org/D43817
* unionfs: cache upper/lower mount objects (Jason A. Harmening, 2024-02-18, 3 files, -19/+24)
| | | | | | | | | | | | | | | | | Store the upper/lower FS mount objects in unionfs per-mount data and use these instead of the v_mount field of the upper/lower root vnodes. As described in the referenced PR, it is unsafe to access this field on the unionfs unmount path as ZFS rollback may have obliterated the v_mount field of the upper or lower root vnode. Use these stored objects to slightly simplify other code that needs access to the upper/lower mount objects as well. PR: 275870 Reported by: Karlo Miličević <karlo98.m@gmail.com> Tested by: Karlo Miličević <karlo98.m@gmail.com> Reviewed by: kib (prior version), olce MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D43815
* sys/fs/unionfs/union_vnops.c: remove an extra semicolon (rilysh, 2024-02-03, 1 file, -1/+1)
| | | | | | Signed-off-by: rilysh <nightquick@proton.me> Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/959
* sys: Remove ancient SCCS tags. (Warner Losh, 2023-11-27, 4 files, -8/+0)
| | | | | | | | Remove ancient SCCS tags from the tree, automated scripting, with two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script. Sponsored by: Netflix
* sys: Remove $FreeBSD$: one-line .h pattern (Warner Losh, 2023-08-16, 4 files, -4/+0)
| | | | Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
* vfs: use __enum_uint8 for vtype and vstate (Mateusz Guzik, 2023-07-05, 1 file, -1/+1)
| | | | | | This whacks hackery around only reading v_type once. Bump __FreeBSD_version to 1400093
* unionfs(): destroy root vnode if upper registration fails (Jason A. Harmening, 2023-05-07, 1 file, -0/+1)
| | | | | | | | | | | | | If unionfs_domount() fails, the mount path will not call VFS_UNMOUNT() to clean up after it. If this failure happens during upper vnode registration, the unionfs root vnode will already be allocated. vflush() it in order to prevent the vnode from being leaked and the subsequent vfs_mount_destroy() call from getting stuck waiting for the mountpoint reference count to drain. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D39767
* unionfs: prevent upperrootvp from being recycled during mount (Jason A. Harmening, 2023-05-07, 1 file, -1/+14)
| | | | | | | | | | | | | If upperrootvp is doomed by a concurrent unmount, unionfs_nodeget() may return without a reference or lock on it. unionfs_domount() must prevent the vnode from being recycled for use by a different file until it is finished with the vnode, namely once vfs_register_upper_from_vp() fails. Accomplish this by holding the reference returned by namei() a bit longer. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D39767
* unionfs: fixes to unionfs_nodeget() error handling (Jason A. Harmening, 2023-05-07, 1 file, -3/+5)
| | | | | | | | | | | | | | If either the lower or upper vnode is found to be doomed after locking it, the newly-created unionfs node won't be associated with it and its lock will be dropped. In that case, clear the uppervp and lowervp locals as necessary to avoid further use of the vnode in unionfs_nodeget(). If the upper vnode is doomed but the lower vnode remains valid, additionally reset the unionfs node's v_vnlock field to point to the lower vnode lock. Reviewed by: kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D39767
* unionfs_mkdir(): handle dvp reclamation (Jason A. Harmening, 2023-04-18, 1 file, -4/+29)
| | | | | | | | | | | | | | | | | | | | The underlying VOP_MKDIR() implementation may temporarily drop the parent directory vnode's lock. If the vnode is reclaimed during that window, the unionfs vnode will effectively become unlocked because its v_vnlock field will be reset. To uphold the locking requirements of VOP_MKDIR() and to avoid triggering various VFS assertions, explicitly re-lock the unionfs vnode before returning in this case. Note that there are almost certainly other cases in which we'll similarly need to handle vnode relocking by the underlying FS; this is the only one that's caused problems in stress testing so far. A more general solution, such as that employed for nullfs in null_bypass(), will likely need to be implemented. Tested by: pho Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D39272
* Remove unionfs_islocked() (Jason A. Harmening, 2023-04-18, 1 file, -19/+1)
| | | | | | | | | | | | The implementation is racy; if the unionfs vnode is not in fact locked, vnode private data may be concurrently altered or freed. Instead, simply rely upon the standard implementation to query the v_vnlock field, which is type-stable and will reflect the correct lower/upper vnode configuration for the unionfs node. Tested by: pho Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D39272
* Remove an impossible condition from unionfs_lock() (Jason A. Harmening, 2023-04-18, 1 file, -8/+0)
| | | | | | | | | We hold the vnode interlock, so vnode private data cannot suddenly become NULL. Tested by: pho Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D39272
* unionfs: remove LK_UPGRADE if falling back to the standard lock (Jason A. Harmening, 2023-04-18, 1 file, -2/+18)
| | | | | | | | | | | | | | | The LK_UPGRADE operation may have temporarily dropped the upper or lower vnode's lock. If the unionfs vnode was reclaimed during that window, its lock field will be reset to no longer point at the upper/lower vnode lock, so the lock operation will use the standard lock stored in v_lock. Remove LK_UPGRADE from the flags in this case to avoid a lockmgr assertion, as this lock has not been previously owned by the calling thread. Reported by: pho Tested by: pho Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D39272
* vn_lock_pair(): allow to request shared locking (Konstantin Belousov, 2023-04-07, 1 file, -1/+2)
| | | | | | | | | | | If either of vnodes is shared locked, lock must not be recursed. Requested by: rmacklem Reviewed by: markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39444
* vfs: add the concept of vnode state transitions (Mateusz Guzik, 2022-12-26, 1 file, -0/+2)
| | | | | | | | | | | | | | | | | | | | | | | | | | To quote from a comment above vput_final: <quote> * XXX Some filesystems pass in an exclusively locked vnode and strongly depend * on the lock being held all the way until VOP_INACTIVE. This in particular * happens with UFS which adds half-constructed vnodes to the hash, where they * can be found by other code. </quote> As is there is no mechanism which allows filesystems to denote that a vnode is fully initialized, consequently problems like the above are only found the hard way(tm). Add rudimentary support for state transitions, which in particular allow to assert the vnode is not legally unlocked until its fate is decided (either construction finishes or vgone is called to abort it). The new field lands in a 1-byte hole, thus it does not grow the struct. Bump __FreeBSD_version to 1400077 Reviewed by: kib (previous version) Tested by: pho Differential Revision: https://reviews.freebsd.org/D37759
* vfs: retire the now unused SAVESTART flag (Mateusz Guzik, 2022-12-19, 1 file, -3/+2)
| | | | | | Bump __FreeBSD_version to 1400075 Tested by: pho
* vfs: make relookup take an additional argument (Mateusz Guzik, 2022-12-19, 1 file, -7/+11)
| | | | | | | | | | instead of looking at SAVESTART This is a step towards removing the flag. Reviewed by: mckusick Tested by: pho Differential Revision: https://reviews.freebsd.org/D34468
* unionfs: allow recursion on covered vnode lock during mount/unmount (Jason A. Harmening, 2022-12-11, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | | | | When taking the covered vnode lock during mount and unmount operations, specify LK_CANRECURSE as the existing lock state of the covered vnode is not guaranteed (AFAIK) either by assertion or documentation for these code paths. For the mount path, this is done only for completeness as the covered vnode lock is not currently held when VFS_MOUNT() is called. For the unmount path, the covered vnode is currently held across VFS_UNMOUNT(), and the existing code only happens to work when unionfs is mounted atop FFS because FFS sets LO_RECURSABLE on its vnode locks. This of course doesn't cover a hypothetical case in which the covered vnode may be held shared, but for the mount and unmount paths such a scenario seems unlikely to materialize. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D37458
* Add VV_CROSSLOCK vnode flag to avoid cross-mount lookup LOR (Jason A. Harmening, 2022-10-27, 1 file, -0/+27)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a lookup operation crosses into a new mountpoint, the mountpoint must first be busied before the root vnode can be locked. When a filesystem is unmounted, the vnode covered by the mountpoint must first be locked, and then the busy count for the mountpoint drained. Ordinarily, these two operations work fine if executed concurrently, but with a stacked filesystem the root vnode may in fact use the same lock as the covered vnode. By design, this will always be the case for unionfs (with either the upper or lower root vnode depending on mount options), and can also be the case for nullfs if the target and mount point are the same (which admittedly is very unlikely in practice). In this case, we have LOR. The lookup path holds the mountpoint busy while waiting on what is effectively the covered vnode lock, while a concurrent unmount holds the covered vnode lock and waits for the mountpoint's busy count to drain. Attempt to resolve this LOR by allowing the stacked filesystem to specify a new flag, VV_CROSSLOCK, on a covered vnode as necessary. Upon observing this flag, the vfs_lookup() will leave the covered vnode lock held while crossing into the mountpoint. Employ this flag for unionfs with the caveat that it can't be used for '-o below' mounts until other unionfs locking issues are resolved. Reported by: pho Tested by: pho Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D35054
* vfs: introduce V_PCATCH to stop abusing PCATCH (Mateusz Guzik, 2022-09-17, 1 file, -3/+3)
|
* vfs: always retain path buffer after lookup (Mateusz Guzik, 2022-09-17, 2 files, -30/+4)
| | | | | | | | This removes some of the complexity needed to maintain HASBUF and allows for removing injecting SAVENAME by filesystems. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
* unionfs: Use __diagused for a variable only used in KASSERT(). (John Baldwin, 2022-04-13, 1 file, -1/+1)
|
* vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) (Mateusz Guzik, 2022-03-24, 1 file, -1/+1)
|
* vfs: prefix lookup and relookup with vfs_ (Mateusz Guzik, 2022-03-13, 1 file, -2/+2)
| | | | | Reviewed by: imp, mckusick Differential Revision: https://reviews.freebsd.org/D34530
* unionfs: rework unionfs_getwritemount() (Jason A. Harmening, 2022-02-24, 1 file, -15/+34)
| | | | | | | | | | | | | VOP_GETWRITEMOUNT() is called on the vn_start_write() path without any vnode locks guaranteed to be held. It's therefore unsafe to blindly access per-mount and per-vnode data. Instead, follow the approach taken by nullfs and use the vnode interlock coupled with the hold count to ensure the mount and the vnode won't be recycled while they are being accessed. Reviewed by: kib (earlier version), markj, pho Tested by: pho Differential Revision: https://reviews.freebsd.org/D34282
* unionfs: fix typo in comment (Jason A. Harmening, 2022-02-10, 1 file, -1/+1)
| | | | | | I deleted the wrong word when writing up a comment in a prior change; the covered vnode may be recursed during any unmount, not just forced unmount.
* unionfs: do not force LK_NOWAIT if VI_OWEINACT is set (Jason A. Harmening, 2022-02-03, 1 file, -4/+0)
| | | | | | | | | | | | | | I see no apparent need to avoid waiting on the lock just because vinactive() may be called on another thread while the thread that cleared the vnode refcount has the lock dropped. In fact, this can at least lead to a panic of the form "vn_lock: error <errno> incompatible with flags" if LK_RETRY was passed to VOP_LOCK(). In this case LK_NOWAIT may cause the underlying FS to return an error which is incompatible with LK_RETRY. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109
* unionfs: allow lock recursion when reclaiming the root vnode (Jason A. Harmening, 2022-02-03, 2 files, -4/+16)
| | | | | | | | | | | | | The unionfs root vnode will always share a lock with its lower vnode. If unionfs was mounted with the 'below' option, this will also be the vnode covered by the unionfs mount. During unmount, the covered vnode will be locked by dounmount() while the unionfs root vnode will be locked by vgone(). This effectively requires recursion on the same underlying lock, albeit through two different vnodes. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109
* unionfs: fix assertion order in unionfs_lock() (Jason A. Harmening, 2022-02-03, 1 file, -4/+5)
| | | | | | | | | | | | | VOP_LOCK() may be handed a vnode that is concurrently reclaimed. unionfs_lock() accounts for this by checking for empty vnode private data under the interlock. But it incorrectly asserts that the vnode is using the unionfs dispatch table before making this check. Reverse the order, and also update KASSERT_UNIONFS_VNODE() to provide more useful information. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109