| Commit message | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
[skip ci]
|
| |
|
|
|
|
|
|
| |
Reviewed by: kib
Fixes: be1f7435ef218b1d ("kern: start tracking cr_gid outside of cr_groups[]")
MFC after: 9 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D52257
|
| |
|
|
|
|
|
| |
Fixes: be1f7435ef218b1d ("kern: start tracking cr_gid outside of cr_groups[]")
MFC after: 9 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D52256
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This comment is obsolete, as:
1. This code is FreeBSD-specific and is not shared with other BSDs.
2. With our recent changes in commit be1f7435ef218b1d ("kern: start
tracking cr_gid outside of cr_groups[]"), all of NetBSD, OpenBSD and
FreeBSD have the effective GID in a separate field (DragonFlyBSD
remains to this day an outlier).
MFC after: 9 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D52254
|
| |
|
|
|
|
|
| |
This reverts commit 65059dd2b6f94e570acc645be82b8ea056316459.
lindebugfs does the vast majority of its pseudofs initialization nearly
everywhere but pseudofs, so let's defer this to post-branching.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In D52038, we kind of guess at the reason that pfs_create_dir() failed,
which isn't great: it could be EEXIST, or it could even be ENOMEM.
Change the pfs_create_*() interfaces to return an error and use a double
pointer to return the new node as requested. Outside of the changes
in sys/fs/pseudofs, this diff is the result of running the added
coccinelle script against in-tree pseudofs and fixing all of the
style(9) violations that spatch added.
We set *opn to NULL in the failure cases to avoid breaking callers that
did actually error-check their results, since the cocci patch does not
attempt to handle that in any way.
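
The calling convention described above can be illustrated with a small
userland sketch. Everything here is hypothetical (struct toy_node and
toy_create_dir() are stand-ins, not the pseudofs API); it only shows the
shape "return an errno value, hand the node back through a double
pointer, and NULL the pointer on failure".

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pseudofs node; not the kernel structure. */
struct toy_node {
	char name[32];
	struct toy_node *children;	/* singly linked list of children */
	struct toy_node *next;
};

/*
 * Create a directory under 'parent'.  Returns 0 and stores the new node
 * in *opn on success; returns an errno value and leaves *opn NULL on
 * failure, so callers that only test the pointer keep working.
 */
int
toy_create_dir(struct toy_node *parent, const char *name,
    struct toy_node **opn)
{
	struct toy_node *n;

	*opn = NULL;
	if (strlen(name) >= sizeof(n->name))
		return (ENAMETOOLONG);
	for (n = parent->children; n != NULL; n = n->next)
		if (strcmp(n->name, name) == 0)
			return (EEXIST);	/* duplicate node */
	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return (ENOMEM);
	strcpy(n->name, name);
	n->next = parent->children;
	parent->children = n;
	*opn = n;
	return (0);
}
```

A caller can now tell a duplicate name (EEXIST) from an allocation
failure (ENOMEM) instead of guessing from a NULL return.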
Reviewed by: des (previous version), kib
Differential Revision: https://reviews.freebsd.org/D52157
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, pseudofs-based file systems all get fully constructed when the
module is loaded and the vfs is registered, but this is unnecessary. Just
loading the
fs doesn't mean that it will be used so we're adding overhead and
risk[0] by fully initializing these at the start, along with committing
resources that may not be used.
Deferring pfs_init() allows us to reduce the risk of simply loading the
module causing problems that are harder to avoid, and existing pseudo
filesystems don't really care: configuration that is context-sensitive
is generally deferred to access-time with PFS_PROCDEP.
To preserve symmetry, we'll also tear down our pseudofs on last unmount,
which leaves us with a vfs_uninit() implementation that simply destroys
our lock and prints a message.
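
As a rough sketch of the lifecycle described above: construct on first
mount, tear down on last unmount, and do nothing at module load. The
names (struct toy_fs, toy_mount(), and so on) are hypothetical, and the
locking the real code needs is omitted.

```c
#include <stdbool.h>

/* Hypothetical module state; the real code protects this with a lock. */
struct toy_fs {
	int mounts;		/* number of active mounts */
	bool constructed;	/* is the node tree currently built? */
	int init_calls;		/* counts constructions, for illustration */
};

static void
toy_init(struct toy_fs *fs)
{
	fs->constructed = true;
	fs->init_calls++;
}

static void
toy_uninit(struct toy_fs *fs)
{
	fs->constructed = false;
}

/* Loading the module only resets the state; nothing is built yet. */
void
toy_load(struct toy_fs *fs)
{
	fs->mounts = 0;
	fs->constructed = false;
	fs->init_calls = 0;
}

/* The first mount pays for construction. */
void
toy_mount(struct toy_fs *fs)
{
	if (fs->mounts++ == 0)
		toy_init(fs);
}

/* The last unmount releases everything again. */
void
toy_unmount(struct toy_fs *fs)
{
	if (--fs->mounts == 0)
		toy_uninit(fs);
}
```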
[0] Example of such being recent bugs in linsysfs, which caused a panic
as soon as the module was loaded because we're eager to set it up.
Reviewed by: des (previous version), kib
Differential Revision: https://reviews.freebsd.org/D52156
|
| |
|
|
|
| |
Reviewed by: des, kib
Differential Revision: https://reviews.freebsd.org/D52155
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This one in particular is rife with opportunities to trigger a duplicate
node error in pfs_create_dir(), so we do actually want to error-check
it. The rest, more or less, should be expected not to fail. We'll
propagate the error from pfs_create_dir() up through linsysfs_run_bus
and complain about the device node that caused the error. Note that we
avoid failing vfs_init() since a partially-constructed linsysfs with
missing devices is probably more useful than missing linsysfs entirely.
While we're here, convert two malloc() that weren't being error checked
to M_WAITOK -- we already wait in the rest of the function, might as
well do the same here.
Add a missing newline to the pseudofs error message.
Reviewed by: fuz, kib
Differential Revision: https://reviews.freebsd.org/D52038
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 9a3edc8 modified the behaviour of ZFS's
VOP_READDIR() such that it will reply EINVAL for
an offset past EOF on the directory.
This exposed a latent bug in the NFSv4 Readdir
code, which would attempt a Readdir with an
offset beyond EOF for a directory that consists
of only "." and "..". This happened because NFSv4
does not reply "." or ".." to the client and, after
skipping over them, attempted another VOP_READDIR().
This patch fixes the problem by checking the eofflag
for the case where all entries have been skipped over.
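
The control flow can be sketched in userland as follows. This is an
illustrative model only: toy_readdir() is a hypothetical stand-in for
VOP_READDIR() that rejects any read at or beyond the end with EINVAL,
and toy_nfs_readdir() models the server loop that hides "." and "..".

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy directory: a fixed list of names.  Hypothetical, for illustration. */
struct toy_dir {
	const char **names;
	size_t count;
};

/* Read up to 'max' names starting at *off; a read at or past EOF fails. */
static int
toy_readdir(const struct toy_dir *d, size_t *off, const char **out,
    size_t max, size_t *nread, int *eofflag)
{
	size_t n = 0;

	if (*off >= d->count)
		return (EINVAL);
	while (n < max && *off < d->count)
		out[n++] = d->names[(*off)++];
	*eofflag = (*off == d->count);
	*nread = n;
	return (0);
}

/*
 * Collect the names a client should see, omitting "." and "..".
 * Returns 0 or an errno value; *nout is the number of names stored.
 */
int
toy_nfs_readdir(const struct toy_dir *d, const char **out, size_t max,
    size_t *nout)
{
	const char *batch[2];
	size_t off = 0, nread, i;
	int eofflag = 0, error;

	*nout = 0;
	do {
		error = toy_readdir(d, &off, batch, 2, &nread, &eofflag);
		if (error != 0)
			return (error);
		for (i = 0; i < nread && *nout < max; i++) {
			if (strcmp(batch[i], ".") == 0 ||
			    strcmp(batch[i], "..") == 0)
				continue;
			out[(*nout)++] = batch[i];
		}
		/* All entries skipped: retry only if not already at EOF. */
	} while (*nout == 0 && !eofflag);
	return (0);
}
```

Dropping the `!eofflag` test from the loop condition reproduces the bug
in this model: a directory holding only "." and ".." triggers a second
read past the end and the caller sees EINVAL.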
Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D52370
|
| |
|
|
|
| |
Reviewed by: stevek
Obtained from: Juniper Networks, Inc.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
It eliminates the need to upgrade the lock in the function.
More importantly, the calls to nfs_advlock_p()/nlm_advlock() sometimes
flush buffers, which requires exclusive locking.
Reported and tested by: bz
Reviewed by: rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D52195
|
| |
|
|
|
|
| |
- s/fist/first/
MFC after: 3 days
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The server is allowed to fill any value into the rdev attribute; clear it
to satisfy the local requirements.
Reported by: bakul
Reviewed by: rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D51988
|
| |
|
|
|
|
|
|
| |
Reported and tested by: pho
Reviewed by: rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D51988
|
| |
|
|
|
|
|
| |
Reviewed by: rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D51987
|
| |
|
|
|
| |
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D51955
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We frequently need to check if a vnode refers to either a character or
block special, so we might as well have a macro for it.
We somewhat less frequently need to perform similar checks on things
that aren't vnodes (usually a struct vattr *), so add VATTR_ISDEV()
and a generic VTYPE_ISDEV() as well.
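
A minimal sketch of how such macros could be layered, using mock types
so that it compiles outside the kernel. The enum and struct definitions
are stand-ins, the macro bodies are a guess at the obvious
implementation, and VN_ISDEV is an assumed name for the vnode variant
(the message above names only VATTR_ISDEV() and VTYPE_ISDEV()).

```c
/* Minimal stand-ins for the kernel types; only the fields used here. */
enum toy_vtype { VNON, VREG, VDIR, VBLK, VCHR, VLNK };
struct toy_vnode { enum toy_vtype v_type; };
struct toy_vattr { enum toy_vtype va_type; };

/* Generic check on a bare type value (evaluates its argument twice). */
#define	VTYPE_ISDEV(vtype)	((vtype) == VCHR || (vtype) == VBLK)

/* Convenience wrappers built on the generic check. */
#define	VN_ISDEV(vp)		VTYPE_ISDEV((vp)->v_type)
#define	VATTR_ISDEV(vap)	VTYPE_ISDEV((vap)->va_type)
```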
Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D51947
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5b5b7e2ca2fa changed namei()'s behaviour such that it
does not free the NAMEI buffer unless returning an error.
The nfsd was not fixed for this. Fortunately, the only
leak would be one NAMEI buffer each time mountd(8) reloads
the exports. (There were also leaks in the pNFS server
configuration, but almost no one uses it.)
This patch fixes the leaks by adding NDFREE_PNBUF() macros
in the appropriate places.
MFC after: 2 weeks
Discussed with: kib
Fixes: 5b5b7e2ca2fa ("vfs: always retain path buffer after lookup")
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 37b2cb5ecb0f added VFS support for block cloning.
This patch uses the VFS changes to add support for the
NFSv4.2 Clone operation, which copies ranges within one
or two files via block cloning.
The Clone operation is similar to Copy, but always
completes the "copy on write". It is not allowed to
return partially done. It also allows copying of byte
ranges within the same file, which the NFSv4.2 Copy
operation does not allow.
Unless COPY_FILE_RANGE_CLONE has been specified for
copy_file_range(2), a failing Clone operation will be
redone with a Copy.
The Clone operation requires that offsets (and length,
if it does not go to EOF in the input file) be aligned
to _PC_CLONE_BLKSIZE. This is similar to what ZFS
implements now.
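
The alignment rule can be written as a small predicate. This is a sketch
under assumptions: clone_range_ok() is a hypothetical helper, 'blksize'
stands for whatever value _PC_CLONE_BLKSIZE reports (0 meaning cloning
is unsupported), and "goes to EOF" is taken to mean the range reaches or
passes the end of the input file.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a range satisfies the Clone alignment rule.  Offsets
 * must be multiples of 'blksize'; the length must be as well, unless
 * the range runs to the end of the input file ('insize' bytes long).
 */
bool
clone_range_ok(uint64_t inoff, uint64_t outoff, uint64_t len,
    uint64_t insize, uint64_t blksize)
{
	bool to_eof;

	if (blksize == 0)
		return (false);		/* cloning not supported */
	if (inoff % blksize != 0 || outoff % blksize != 0)
		return (false);
	/* Written to avoid overflow in inoff + len. */
	to_eof = inoff <= insize && len >= insize - inoff;
	if (len % blksize != 0 && !to_eof)
		return (false);
	return (true);
}
```

A caller that gets `false` here would fall back to an ordinary copy
unless cloning was explicitly required.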
At this time, ZFS is the only exportable file system
that supports block cloning. As such, the Clone operation
is only supported for ZFS exports at this time.
Fixes: 37b2cb5ecb0f ("vfs: Add support for file cloning to VOP_COPY_FILE_RANGE")
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NFSv4 has a separate CLONE operation from COPY with
a couple of semantic differences. Unlike COPY, CLONE
must complete the "copy on write" and cannot return
partially copied. It also is required to use offsets (and
the length if not to EOF) that are aligned to a buffer
boundary.
Since VOP_COPY_FILE_RANGE() can already do "copy on write"
for file systems that support it, such as ZFS with block
cloning enabled, all this patch does is add a flag called
COPY_FILE_RANGE_CLONE so that it will conform to the
rule that it must do a "copy on write" to completion.
The patch also adds a new pathconf(2) name _PC_CLONE_BLKSIZE,
which acquires the blocksize requirement for cloning and
returns 0 for file systems that do not support the
"copy on write" feature. (This is needed for the NFSv4.2
clone_blksize attribute.)
This patch will allow the implementation of CLONE
for NFSv4.2.
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D51808
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ever since the first GSoC contribution, fusefs has had a curious
behavior. If the daemon hasn't finished responding to FUSE_INIT,
fuse_vnop_getattr would reply to VOP_GETATTR requests for the mountpoint
by returning all zeros. I don't know why. It isn't necessary for
unmounting, even if the daemon is dead.
Delete that behavior. Now VOP_GETATTR for the mountpoint will wait for
the daemon to be ready, just like it will for any other vnode.
Reported by: Vassili Tchersky
Sponsored by: ConnectWise
Differential Revision: https://reviews.freebsd.org/D50800
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Without this patch, an NFSv4.1/4.2 GetACL RPC requests that
the session cache the reply. In some cases, the reply may
be too large to cache, resulting in a NFS4ERR_X
error from the server.
Since a GetACL is idempotent, disable reply caching for it,
by setting that it can generate a large reply.
Tested against a Linux server with a large ACL on a file.
MFC after: 2 weeks
|
| |
|
|
|
|
|
|
|
|
|
| |
Once a new vnode is visible from the mountpoint hash, we should set its
state from VSTATE_UNINITIALIZED to VSTATE_CONSTRUCTED. I do not think
this affects correctness at all, but the bug trips a check in
vop_unlock_debugpost(), previously hidden under options DEBUG_VFS_LOCKS.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D51720
|
| |
|
|
|
|
|
|
|
| |
This assertion can reasonably be checked when plain INVARIANTS is
configured; there's no need to configure a separate option.
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D51697
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Return EINVAL if this is the first dirent encountered with the short
buffer, or EJUSTRETURN if something was already copied out.
This is needed to pass eof check in vop_readdir_post(): we are not at
eof but resid was not advanced.
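
A userland sketch of that rule, with hypothetical names throughout.
EJUSTRETURN is a kernel-internal pseudo-error, so a local stand-in
constant is defined here purely for illustration.

```c
#include <errno.h>
#include <stddef.h>

#define	TOY_EJUSTRETURN	(-2)	/* stand-in for the kernel's EJUSTRETURN */

/*
 * Copy directory entries into a buffer with 'resid' bytes of space.
 * 'reclens' holds the record length of each pending entry and '*copied'
 * receives the number of entries that fit.
 */
int
toy_copyout_dirents(const size_t *reclens, size_t nrec, size_t resid,
    size_t *copied)
{
	size_t i;

	*copied = 0;
	for (i = 0; i < nrec; i++) {
		if (reclens[i] > resid) {
			/* Not at EOF, but this entry does not fit. */
			return (*copied == 0 ? EINVAL : TOY_EJUSTRETURN);
		}
		resid -= reclens[i];
		(*copied)++;
	}
	return (0);
}
```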
Reported and tested by: pho (previous version)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D51667
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of FreeBSD 15.0, crsetgroups() *only* sets supplementary groups,
while crsetgroups_and_egid() will do both using an array of the same
style that previous versions used for crsetgroups() -- i.e., the first
element is the egid, and the remainder are supplementary groups.
Compared to the previous iteration of crsetgroups(), crsetgroups_and_egid()
is less prone to misuse as the caller must provide a default egid to use
in case the array is empty. This is particularly useful for groups
being set from data provided by userland.
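
The split can be modelled in userland like this. The struct and both
functions are hypothetical stand-ins (toy_ prefix) and only mirror the
semantics described above: first element is the egid, the rest are
supplementary groups, and an empty array falls back to the default egid.

```c
#include <stddef.h>
#include <sys/types.h>

#define	TOY_NGROUPS	16

/* Hypothetical credential with the egid held outside the group array. */
struct toy_cred {
	gid_t	cr_gid;
	int	cr_ngroups;
	gid_t	cr_groups[TOY_NGROUPS];
};

/* Set supplementary groups only; the egid is left untouched. */
static void
toy_setgroups(struct toy_cred *cr, int ngrp, const gid_t *groups)
{
	if (ngrp < 0)
		ngrp = 0;
	if (ngrp > TOY_NGROUPS)
		ngrp = TOY_NGROUPS;
	for (int i = 0; i < ngrp; i++)
		cr->cr_groups[i] = groups[i];
	cr->cr_ngroups = ngrp;
}

/* Old-style array: groups[0] is the egid, the rest are supplementary. */
void
toy_setgroups_and_egid(struct toy_cred *cr, int ngrp, const gid_t *groups,
    gid_t default_egid)
{
	if (ngrp <= 0) {
		/* Empty array: the caller-supplied default is used. */
		cr->cr_gid = default_egid;
		toy_setgroups(cr, 0, NULL);
		return;
	}
	cr->cr_gid = groups[0];
	toy_setgroups(cr, ngrp - 1, groups + 1);
}
```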
Reviewed by: olce
Suggested by: olce
Differential Revision: https://reviews.freebsd.org/D51647
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the (mostly) kernel side of de-conflating cr_gid and the
supplemental groups. The pre-existing behavior for getgroups() and
setgroups() is retained to keep the user <-> kernel boundary
functionally the same while we audit use of these syscalls, but we can
remove a lot of the internal special-casing just by reorganizing ucred
like this.
struct xucred has been altered because the cr_gid macro becomes
problematic if ucred has a real cr_gid member but xucred does not. Most
notably, they both also have cr_groups[] members, so the definition
means that we could easily have situations where we end up using the
first supplemental group as the egid in some places. We really can't
change the ABI of xucred, so instead we alias the first member to the
`cr_gid` name and maintain the status quo.
This also fixes the Linux setgroups(2)/getgroups(2) implementation to
more cleanly preserve the group set, now that we don't need to special
case cr_groups[0].
__FreeBSD_version bumped for the `struct ucred` ABI break.
For relnotes: downstreams and out-of-tree modules absolutely must fix
any references to cr_groups[0] in their code. These are almost
exclusively incorrect in the new world, and cr_gid should be used
instead. There is a cr_gid macro available in earlier FreeBSD versions
that can be used to avoid having version-dependent conditionals to refer
to the effective group id. Surrounding code may need to be adjusted if it
peels off the first element of cr_groups and uses the others as the
supplemental groups, since the supplemental groups start at cr_groups[0]
now if &cr_groups[0] != &cr_gid.
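
For out-of-tree code, the adjustment might look like the following
userland model. struct demo_ucred is a hypothetical stand-in for the new
layout, not the kernel's struct ucred; the point is only that the egid
is read from cr_gid and that iteration over supplementary groups starts
at index 0.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical model of the new layout. */
struct demo_ucred {
	gid_t	cr_gid;		/* effective GID, its own field */
	int	cr_ngroups;	/* number of supplementary groups */
	gid_t	cr_groups[16];	/* supplementary groups only */
};

/* Read the effective GID from cr_gid, never from cr_groups[0]. */
gid_t
demo_egid(const struct demo_ucred *cr)
{
	return (cr->cr_gid);
}

/* Membership test: check the egid, then every supplementary group. */
bool
demo_groupmember(gid_t gid, const struct demo_ucred *cr)
{
	if (gid == cr->cr_gid)
		return (true);
	for (int i = 0; i < cr->cr_ngroups; i++)	/* starts at 0, not 1 */
		if (cr->cr_groups[i] == gid)
			return (true);
	return (false);
}
```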
Relnotes: yes (see last paragraph)
Co-authored-by: olce
Differential Revision: https://reviews.freebsd.org/D51489
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A future change may split cr_gid out of cr_groups[0] so that there's a
cleaner separation between the supplemental groups and the effective
group. Do the mechanical conversion where we can, and drop some
comments where we need further work because some assumptions about
cr_gid == cr_groups[0] have been made.
This should not be a functional change, but downstreams and other
out-of-tree code are advised to investigate their usage of cr_groups
sooner rather than later, as a future change will render assumptions
about these two being equivalent harmful.
Reviewed by: asomers, kib, olce
Differential Revision: https://reviews.freebsd.org/D51153
|
| |
|
|
| |
MFC after: 2 weeks
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two loops, one to skip entries before the requested offset,
and one to populate entries. Merge them and use the return value of
pfs_iterate() to decide whether to set eofflag.
As a side effect, this fixes a small bug: the first loop would decrement
`offset` to zero, and the second loop would increment it again.
However, `offset` was used to set `d_off`, and we should include the
starting offset there.
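
A simplified model of the merged loop, with hypothetical names: one pass
both skips entries before the requested offset and emits the rest, the
position counter is never rewound, and d_off therefore includes the
starting offset. Whether the loop ran to completion decides eofflag.

```c
#include <stddef.h>

/* Hypothetical directory entry: an id plus the offset of the next entry. */
struct toy_dirent {
	long d_off;
	int id;
};

/*
 * Fill 'out' with up to 'max' entries starting at entry index 'offset'.
 * Returns the number of entries stored and sets *eofflag if the whole
 * list was consumed.
 */
size_t
toy_fill(const int *ids, size_t nids, long offset, struct toy_dirent *out,
    size_t max, int *eofflag)
{
	size_t i, n = 0;
	long pos = 0;

	for (i = 0; i < nids; i++, pos++) {
		if (pos < offset)
			continue;	/* before the requested offset */
		if (n == max)
			break;		/* output buffer is full */
		out[n].id = ids[i];
		out[n].d_off = pos + 1;	/* absolute offset of the next entry */
		n++;
	}
	*eofflag = (i == nids);
	return (n);
}
```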
Add a comment explaining the use of the allproc lock.
Reviewed by: des, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D51462
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
des@ reported a panic in the NFSv4 server, where
nfsv4_fillattr() did a VOP_PATHCONF() without having
"vp" locked.
Relocking the vnode is inefficient and, for Readdir,
may cause deadlocks. As such, this patch handles
VOP_PATHCONF() in the same way that the code checks
for ACL support, by doing the VOP_PATHCONF() before
the calls to nfsv4_fillattr() where the vnode is still locked.
Reported by: des
Reviewed by: kib (earlier version)
Differential Revision: https://reviews.freebsd.org/D51410
Fixes: c5d72d29fe0e ("nfsv4: Add support for the NFSv4 hidden and system attributes")
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this patch, nfsv4_fillattr() relocks the vnode to
test to see if extended attributes are supported.
This is inefficient and could cause deadlocks if Readdir
ever asks for this attribute.
At this time, no extant NFSv4 client asks for this attribute
for Readdir, but this patch fixes the problem in case a
future client does so, by moving the test for extended attribute
support to before the nfsv4_fillattr() call where the vnode
is still locked.
MFC after: 2 weeks
|
| |
|
|
|
|
|
|
|
| |
Without this patch, nfsrvd_openattr() requests an unlocked
vnode via VOP_LOOKUP(). This is not allowed for
"options DEBUG_VFS_LOCKS" kernels, so this patch requests a
locked vnode and then unlocks it.
Fixes: e4c7b2b6053f ("nfsv4: Add support to NFSv4 for named attributes")
|
| |
|
|
|
|
|
|
|
| |
PR: 288266
Reported by: Robert Morris <rtm@lcs.mit.edu>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D51365
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code frequently assigns unsigned 64-bit values to variables
that are signed and / or shorter without checking for overflow. Try to
deal with these cases.
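
One common way to deal with such cases is a checked narrowing helper.
The sketch below is generic and not taken from the change itself; the
helper names are hypothetical and EOVERFLOW is just one reasonable error
to report.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Store 'v' in a signed int only if it fits; otherwise report an error. */
int
u64_to_int(uint64_t v, int *out)
{
	if (v > (uint64_t)INT_MAX)
		return (EOVERFLOW);
	*out = (int)v;
	return (0);
}

/* Same idea for a signed 64-bit destination such as a file offset. */
int
u64_to_i64(uint64_t v, int64_t *out)
{
	if (v > (uint64_t)INT64_MAX)
		return (EOVERFLOW);
	*out = (int64_t)v;
	return (0);
}
```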
While here, fix two structs that used single-element arrays in place of
flexible array members.
PR: 287896
MFC after: 1 week
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D51339
|
| |
|
|
|
|
| |
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D51319
|
| |
|
|
|
|
| |
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D50760
|
| |
|
|
|
|
|
|
|
| |
We also need to set it when an end-of-directory marker is reached.
Reported by: vishwin
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D51290
|
| |
|
|
|
| |
Fixes: c5d72d29fe0e
Reviewed by: rmacklem
|
| |
|
|
|
|
|
|
|
| |
Fixes: ef6ea91593ebff73e2fc201efd9f848b71c5a125
Reported by: des
Reviewed by: markj, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D51211
|
| |
|
|
|
|
|
|
|
|
| |
it clashes with ERESTART. Use EJUSTRETURN for the case, as it is often
done in other places in the kernel.
Reviewed by: markj, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D51211
|
| |
|
|
|
|
|
|
|
|
|
|
| |
There now appears to be a use for the NFSv4 hidden and system
attributes for the Windows ms-nfs41 client. As such, this
patch implements these using the UF_HIDDEN and UF_SYSTEM
flags. Commit afd5bc630930 added support for _PC_HAS_HIDDENSYSTEM,
to VOP_PATHCONF(), which is used by the server to check for
support of the UF_HIDDEN and UF_SYSTEM flags.
This patch only affects NFSv4 and only when the client/server
on the other end supports the hidden and system attributes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the NFSv4 server to implement the "hidden" and
"system" attributes, it needs to know if UF_HIDDEN,
UF_SYSTEM are supported for the file.
This patch adds a new pathconf variable called
_PC_HAS_HIDDENSYSTEM to do that. The ZFS patch
will be handled separately as an OpenZFS pull request.
Although this pathconf variable may be queried
by applications using pathconf(2), the current
interface where chflags(2) returns EOPNOTSUPP
may still be used to check if the flags are supported.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D51172
|
| |
|
|
|
|
|
|
| |
Reviewed by: markj, olce
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D50648
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an implementation of inotify_init(), inotify_add_watch(),
inotify_rm_watch(), source-compatible with Linux. This provides
functionality similar to kevent(2)'s EVFILT_VNODE, i.e., it lets
applications monitor filesystem files for accesses. Compared to
inotify, however, EVFILT_VNODE has the limitation of requiring the
application to open the file to be monitored. This means that activity
on a newly created file cannot be monitored reliably, and that a file
descriptor per file in the hierarchy is required.
inotify on the other hand allows a directory and its entries to be
monitored at once. It introduces a new file descriptor type to which
"watches" can be attached; a watch is a pseudo-file descriptor
associated with a file or directory and a set of events to watch for.
When a watched vnode is accessed, a description of the event is queued
to the inotify descriptor, readable with read(2). Events for files in a
watched directory include the file name.
A watched vnode has its usecount bumped, so name cache entries
originating from a watched directory are not evicted. Name cache
entries are used to populate inotify events for files with a link in a
watched directory. In particular, if a file is accessed with, say,
read(2), an IN_ACCESS event will be generated for any watched hard link
of the file.
The inotify_add_watch_at() variant is included so that this
functionality is available in capability mode; plain inotify_add_watch()
is disallowed in capability mode.
When a file in a nullfs mount is watched, the watch is attached to the
lower vnode, such that accesses via either layer generate inotify
events.
Many thanks to Gleb Popov for testing this patch and finding lots of
bugs.
PR: 258010, 215011
Reviewed by: kib
Tested by: arrowd
MFC after: 3 months
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D50315
|
| |
|
|
|
| |
Approved by: asomers
Pull Request: https://github.com/freebsd/freebsd-src/pull/1727
|
| |
|
|
|
| |
Approved by: asomers
Pull Request: https://github.com/freebsd/freebsd-src/pull/1727
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Upgrade the FUSE API from protocol 7.33 to 7.35.
Add support for FOPEN_NOFLUSH, introduced in 7.35.
Also, reduce diffs vis-a-vis upstream by factoring out an ioctl type, a
change missed in d5e3cf41e89.
Signed-off-by: Claudiu I. Palincas <mscotty@protonmail.ch>
Reviewed by: asomers
Pull Request: https://github.com/freebsd/freebsd-src/pull/1744
|
| |
|
|
|
|
|
|
|
|
| |
Commit 50e733f19b37 broke kernel builds without "options UFS_ACL".
This patch fixes it.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D51131
Fixes: 50e733f19b37 ("nfscl: Use delegation ACE when mounted with nocto")
|