aboutsummaryrefslogtreecommitdiff
path: root/sys/cddl/contrib
Commit message (Collapse)AuthorAgeFilesLines
* dtrace/amd64: Implement emulation of call instructionsMark Johnston2022-08-091-0/+4
| | | | | | | | | | | | | | | | | | | Here, the provider is responsible for updating the trapframe to redirect control flow and for computing the return address. Once software-saved registers are restored, the emulation shifts the remaining context down on the stack to make space for the return address, then copies the address provided by the invop handler. dtrace_invop() is modified to allocate temporary storage space on the stack for use by the provider to return the return address. This is to support a new provider for amd64 which can instrument arbitrary instructions, not just function entry and exit instructions as FBT does. In collaboration with: christos Sponsored by: Google, Inc. (GSoC 2022) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
* Adjust dtrace_getf_barrier() definition to avoid clang 15 warningDimitry Andric2022-07-191-1/+1
| | | | | | | | | | | | | | | With clang 15, the following -Werror warnings is produced: sys/cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:17019:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes] dtrace_getf_barrier() ^ void This is because dtrace_getf_barrier() is declared with a (void) argument list, but defined with an empty argument list. Make the definition match the declaration. MFC after: 3 days
* Use KERNEL_PANICKED() in more placesMitchell Horne2022-06-021-2/+2
| | | | | | | | | | This is slightly more optimized than checking panicstr directly. For most of these instances performance doesn't matter, but let's make KERNEL_PANICKED() the common idiom. Reviewed by: mjg MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D35373
* zfs: merge openzfs/zfs@a86e08941 (master) into mainMartin Matuska2022-03-081-2/+2
| | | | | | | | | | | | | | | | | | | | Notable upstream pull request merges: #9078: log xattr=sa create/remove/update to ZIL #11919: Cross-platform xattr user namespace compatibility #13014: Report dnodes with faulty bonuslen #13016: FreeBSD: Fix zvol_cdev_open locking #13019: spl: Don't check FreeBSD rwlocks for double initialization #13027: Fix clearing set-uid and set-gid bits on a file when replying a write #13031: Add enumerated vdev names to 'zpool iostat -v' and 'zpool list -v' #13074: Enable encrypted raw sending to pools with greater ashift #13076: Receive checks should allow unencrypted child datasets #13098: Avoid dirtying the final TXGs when exporting a pool #13172: Fix ENOSPC when unlinking multiple files from full pool Obtained from: OpenZFS OpenZFS commit: a86e089415679cf1b98eb424a159bb36aa2c19e3
* ctf: Import ctf.h from OpenBSDMark Johnston2022-03-071-360/+0
| | | | | | | | | | | | | | | | | Use it instead of the existing ctf.h from OpenSolaris. This makes it easier to use CTF in the core kernel, and to extend the CTF format to support wider type IDs. The imported ctf.h is modified to depend only on _types.h, and also to provide macros which use the "parent" bit of a type ID to refer to types in a parent CTF container. No functional change intended. Reviewed by: Domagoj Stolfa, emaste MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34358
* fasttrap: Avoid creating WX mappingsMark Johnston2022-03-012-3/+4
| | | | | | | | | | | | | | | | | fasttrap instruments certain instructions by overwriting them and copying the original instruction to some per-thread scratch space which is executed after the probe fires. This trampoline jumps back to the tracepoint after executing the original instruction. The created mapping has both write and execute permissions, and so this mechanism doesn't work when allow_wx is disabled. Work around the restriction by using proc_rwmem() to write to the trampoline. Reviewed by: vangyzen Tested by: Amit <akamit91@hotmail.com> MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34304
* fasttrap: Assert that fasttrap_fork() successfully unmaps scratch spaceMark Johnston2022-03-011-2/+3
| | | | | | | No functional change intended. MFC after: 2 weeks Sponsored by: The FreeBSD Foundation
* fd: rename fget*_locked to fget*_norefMateusz Guzik2022-02-221-1/+4
| | | | | | | | This gets rid of the error prone naming where fget_unlocked returns with a ref held, while fget_locked requires a lock but provides nothing in terms of making sure the file lives past unlock. No functional changes.
* Teach DTrace about BTI on arm64Andrew Turner2022-01-191-0/+3
| | | | | | | | | | | | | The Branch Target Identification (BTI) Armv8-A extension adds new instructions that can be placed where we may indirrectly branch to, e.g. at the start of a function called via a function pointer. We can't emulate these in DTrace as the kernel will have raised a different exception before the DTrace handler has run. Skip over the BTI instruction if it's used as the first instruction in a function. Sponsored by: The FreeBSD Foundation
* dtrace: add a knob to control maximum size of principal buffersAndriy Gapon2022-01-111-1/+3
| | | | | | | | | | | | | | | | | | We had a hardcoded limit of 1/128-th of physical memory that was further subdivided between all CPUs as principal buffers are allocated on the per-CPU basis. Actually, the buffers could use up 1/64-th of the memmory because with the default switch policy there are two buffers per CPU. This commit allows to change that limit. Note that the discussed limit is per dtrace command invocation. The idea is to limit the size of a single malloc(9) call, not the total memory size used by DTrace buffers. Reviewed by: markj MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D33648
* dtrace: Disable getf() as it is broken on FreeBSDDomagoj Stolfa2021-12-171-0/+4
| | | | | | | | | | | | | | | | | | getf() on FreeBSD calls _sx_slock(), _sx_sunlock() and fget_locked(). Furthermore, it does not set the per-core fault flag, meaning it usually ends up in a double fault panic once getf() does get called, especially from fbt. Reviewing the DTrace Toolkit + a number of other scripts scattered around FreeBSD, I have not been able to find one use of getf(). Given how broken the implementation currently is, we disable it until it can be implemented properly. Also comment out a test in aggs/tst.subr.d for getf(). Reviewed by: markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D33378
* Create sys/reg.h for the common code previously in machine/reg.hAndrew Turner2021-08-301-1/+1
| | | | | | | | | | Move the common kernel function signatures from machine/reg.h to a new sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2). Reviewed by: imp, markj Sponsored by: DARPA, AFRL (original work) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19830
* sys/cddl: remove extraneous semicolonsEd Maste2021-08-161-1/+1
| | | | | | | Fixes: 5a1b490d502e ("FreeBSD changes to vendor source.") Fixes: 91eaf3e1831d ("Custom DTrace kernel module...") MFC after: 1 week Sponsored by: The FreeBSD Foundation
* dtrace: use %zu format specifier for data of size_t typeKonstantin Belousov2021-08-081-1/+1
| | | | Sponsored by: The FreeBSD Foundation
* Teach DTrace that unaligned accesses are OK on aarch64, not just x86.Robert Watson2021-03-221-1/+1
| | | | | | MFC after: 3 days Reviewed: andrew Differential Revision: https://reviews.freebsd.org/D29369
* Handle functions that use a nop in the arm64 fbtAndrew Turner2021-03-031-0/+2
| | | | | | | | | To trace leaf asm functions we can insert a single nop instruction as the first instruction in a function and trigger off this. Reviewed by: gnn Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D28132
* Handle using a sub instruction in the arm64 fbtAndrew Turner2021-01-121-0/+9
| | | | | | | | Some stack frames are too large for a store pair instruction we already detect in the arm64 fbt code. Add support for handling subtracting the stack pointer directly. Sponsored by: Innovate UK
* Only allow a store through sp in the arm64 fbtAndrew Turner2021-01-121-1/+3
| | | | | | | | | | | | | | | | When searching for an instruction to patch out in the arm64 function boundary trace we search for a store pair with a write back. This instruction is commonly used to store two registers to the stack and update the stack pointer to hold space for more. This works in many cases, however not all functions use this, e.g. when the stack frame is too large. In these cases we may find another instruction of the same type that doesn't store through the stack pointer. Filter these instructions out and assume if we see one we are past the function prologue. Reported by: rwatson Sponsored by: Innovate UK
* dtrace: Fix /"string" == NULL/ comparisons using an uninitialized value.Bryan Drewery2021-01-081-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A test of this is funcs/tst.strtok.d which has this filter: BEGIN /(this->field = strtok(this->str, ",")) == NULL/ { exit(1); } The test will randomly fail with exit status of 1 indicating that this->field was NULL even though printing it out shows it is not. This is compiled to the DTrace instruction set: // Pushed arguments not shown here // call strtok() and set result into %r1 07: 2f001f01 call DIF_SUBR(31), %r1 ! strtok // set thread local scalar this->field from %r1 08: 39050101 stls %r1, DT_VAR(1281) ! DT_VAR(1281) = "field" // Prepare for the == comparison // Set right side of %r2 to NULL 09: 25000102 setx DT_INTEGER[1], %r2 ! 0x0 // string compare %r1 (strtok result) to %r2 10: 27010200 scmp %r1, %r2 In this case only %r1 is loaded with a string limit set to lim1. %r2 being NULL does not get loaded and does not set lim2. Then we call dtrace_strncmp() with MIN(lim1, lim2) resulting in passing 0 and comparing neither side. dtrace_strncmp() handles this case fine and it already has been while being lucky with what lim2 was [un]initialized as. Reviewed by: markj, Don Morris <dgmorris AT earthlink.net> Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D27671
* Install dtrace.h and dependenciesAlex Richardson2021-01-071-2/+12
| | | | | | | | | | | | | This makes the minimum amount of changes to allow inclusion of dtrace.h without all the solaris compatibility headers. Installing dtrace.h allows compiling consumers of libdtrace (e.g. https://github.com/tmetsch/python-dtrace) without requiring a copy of the source tree. For python-dtrace I worked around this in https://github.com/tmetsch/python-dtrace/commit/58019c9a12022203a9ffda286dd8b41f1a5ace42 but being able to build the library without installed sources would be extremely useful. Reviewed By: gnn Differential Revision: https://reviews.freebsd.org/D27884
* [cddl] Fix lz4 function definitions to not tri pup compile.Adrian Chadd2020-11-172-2/+7
| | | | | | | | | | | This tripped up in llvm compilation on amd64 noting that lz4_init/lz4_fini were lacking in being previously defined. Reviewed by: emaste, freqlabs, brooks Differential Revision: https://reviews.freebsd.org/D27240 Notes: svn path=/head/; revision=367770
* Update OpenZFS to 2.0.0-rc3-gfc5966Matt Macy2020-10-171-2130/+0
| | | | | | | | | | | - fix panic due to tqid overflow - Improve libzfs_error_init messages - Expose zfetch_max_idistance tunable - Make dbufstat work on FreeBSD - Fix EIO after resuming receive of new dataset over an existing one Notes: svn path=/head/; revision=366780
* Add zstd support to the boot loader.Warner Losh2020-10-121-243/+0
| | | | | | | | | | | | | | | Add support to the _STANDALONE environment enough bits of the kernel that we can compile it. We still have a small zstd_shim.c since there were 3 items that were a bit hard to nail down and may be cleaned up in the future. These go hand in hand with a number of commits to sys/sys in the past weeks, should this need be MFCd. Discussed with: mmacy (in review and on IRC/Slack) Reviewed by: freqlabs (on openzfs repo) Differential Revision: https://reviews.freebsd.org/D26218 Notes: svn path=/head/; revision=366657
* ZFS: band-aid for -DNO_CLEANMatt Macy2020-08-251-1/+3
| | | | | | | | | Submitted by: Neal Chauhan Approved by: imp@ Differential Revision: https://reviews.freebsd.org/D26183 Notes: svn path=/head/; revision=364788
* Merge OpenZFS support in to HEAD.Matt Macy2020-08-25298-196927/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The primary benefit is maintaining a completely shared code base with the community allowing FreeBSD to receive new features sooner and with less effort. I would advise against doing 'zpool upgrade' or creating indispensable pools using new features until this change has had a month+ to soak. Work on merging FreeBSD support in to what was at the time "ZFS on Linux" began in August 2018. I first publicly proposed transitioning FreeBSD to (new) OpenZFS on December 18th, 2018. FreeBSD support in OpenZFS was finally completed in December 2019. A CFT for downstreaming OpenZFS support in to FreeBSD was first issued on July 8th. All issues that were reported have been addressed or, for a couple of less critical matters there are pull requests in progress with OpenZFS. iXsystems has tested and dogfooded extensively internally. The TrueNAS 12 release is based on OpenZFS with some additional features that have not yet made it upstream. Improvements include: project quotas, encrypted datasets, allocation classes, vectorized raidz, vectorized checksums, various command line improvements, zstd compression. Thanks to those who have helped along the way: Ryan Moeller, Allan Jude, Zack Welch, and many others. Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25872 Notes: svn path=/head/; revision=364746
* zfs: add an option to the bootloader to rewind the ZFS checkpointMariusz Zaborski2020-08-183-3/+20
| | | | | | | | | | | | | | | | | | The checkpoints are another way of keeping the state of ZFS. During the rewind, the pool has to be exported. This makes checkpoints unusable when using ZFS as root. Add the option to rewind the ZFS checkpoint at the boot time. If checkpoint exists, a new option for rewinding a checkpoint will appear in the bootloader menu. We fully support boot environments. If the rewind option is selected, the boot loader will show a list of boot environments that existed before the checkpoint. Reviewed by: tsoome, allanjude, kevans (ok with high-level overview) Differential Revision: https://reviews.freebsd.org/D24920 Notes: svn path=/head/; revision=364355
* Fix linker error in libuutil with recent LLVMAlex Richardson2020-08-072-8/+1
| | | | | | | | | | | | | | | | Not marking the function as static can result in a linker error: undefined reference to __assfail [--no-allow-shlib-undefined] I noticed this error after updating our CHERI LLVM to the latest upstream LLVM HEAD revision. This change effectively reverts r329984 and marks dmu_buf_init_user as static (which keeps the GCC build happy). Reviewed By: #zfs, asomers, freqlabs, mav Differential Revision: https://reviews.freebsd.org/D25663 Notes: svn path=/head/; revision=364027
* Fix cddl tools bootstrapping on macOS and LinuxAlex Richardson2020-08-071-1/+8
| | | | | | | | Reviewed By: brooks Differential Revision: https://reviews.freebsd.org/D25979 Notes: svn path=/head/; revision=364022
* MFOpenZFS: Add support for boot environment data to be stored in the labelToomas Soome2020-08-056-14/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are building new bootonce mechanism (previously zfs bootnext) and it is based on this OpenZFS change. Since this patch is nicely self contained, I am commiting it as is, and we can stack our changes. Original patch description follows: Modern bootloaders leverage data stored in the root filesystem to enable some of their powerful features. GRUB specifically has a grubenv file which can store large amounts of configuration data that can be read and written at boot time and during normal operation. This allows sysadmins to configure useful features like automated failover after failed boot attempts. Unfortunately, due to the Copy-on-Write nature of ZFS, the standard behavior of these tools cannot handle writing to ZFS files safely at boot time. We need an alternative way to store data that allows the bootloader to make changes to the data. This work is very similar to work that was done on Illumos to enable similar functionality in the FreeBSD bootloader. This patch is different in that the data being stored is a raw grubenv file; this file can store arbitrary variables and values, and the scripting provided by grub is powerful enough that special structures are not required to implement advanced behavior. We repurpose the second padding area in each label to store the grubenv file, protected by an embedded checksum. We add two ioctls to get and set this data, and libzfs_core and libzfs functions to access them more easily. There are no direct command line interfaces to these functions; these will be added directly to the bootloader utilities. Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes #10009 Obtained from: OpenZFS Sponsored by: Netflix, Klara Inc. Notes: svn path=/head/; revision=363911
* zfs_keys_nextboot array is missing ZPOOL_CONFIG_POOL_GUID and ZPOOL_CONFIG_GUIDToomas Soome2020-08-051-1/+3
| | | | | | | | | | | | As we do check the incomint nvlist, we either need to list all possible keys or use wildcard. PR: 248462 Reported by: larafercue@gmail.com Sponsored by: Netflix, Klara Inc. Notes: svn path=/head/; revision=363910
* vfs: remove the obsolete privused argument from vaccessMateusz Guzik2020-08-051-1/+1
| | | | | | | | This brings argument count down to 6, which is passable without the stack on amd64. Notes: svn path=/head/; revision=363893
* zfs: add support for lockless lookupMateusz Guzik2020-07-255-10/+147
| | | | | | | | Tested by: pho (in a patchset, previous version) Differential Revision: https://reviews.freebsd.org/D25581 Notes: svn path=/head/; revision=363522
* Fix a memory leak in dsl_scan_visitbp().Mark Johnston2020-07-201-1/+1
| | | | | | | | | | | | This should be triggered only if arc_read() fails, i.e., quite rarely. The same logic is already present in OpenZFS. PR: 247445 Submitted by: jdolecek@NetBSD.org MFC after: 1 week Notes: svn path=/head/; revision=363373
* Fix page fault in zfsctl_snapdir_getattrAlan Somers2020-07-021-1/+2
| | | | | | | | | | | | | | | Must acquire the z_teardown_lock before accessing the zfsvfs_t object. I can't reproduce this panic on demand, but this looks like the correct solution. PR: 247668 Reviewed by: avg MFC after: 2 weeks Sponsored by: Axcient Differential Revision: https://reviews.freebsd.org/D25543 Notes: svn path=/head/; revision=362891
* Fix "current" variable name conflict with openzfsMatt Macy2020-06-271-39/+39
| | | | | | | | | The variable "current" is an alias for curthread in openzfs. Rename all variable uses of current in dtrace.c to curstate. Notes: svn path=/head/; revision=362667
* MFOpenZFS: Add basic zfs ioc input nvpair validationToomas Soome2020-06-232-79/+376
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want newer versions of libzfs_core to run against an existing zfs kernel module (i.e. a deferred reboot or module reload after an update). Programmatically document, via a zfs_ioc_key_t, the valid arguments for the ioc commands that rely on nvpair input arguments (i.e. non legacy commands from libzfs_core). Automatically verify the expected pairs before dispatching a command. This initial phase focuses on the non-legacy ioctls. A follow-on change can address the legacy ioctl input from the zfs_cmd_t. The zfs_ioc_key_t for zfs_keys_channel_program looks like: static const zfs_ioc_key_t zfs_keys_channel_program[] = { {"program", DATA_TYPE_STRING, 0}, {"arg", DATA_TYPE_UNKNOWN, 0}, {"sync", DATA_TYPE_BOOLEAN_VALUE, ZK_OPTIONAL}, {"instrlimit", DATA_TYPE_UINT64, ZK_OPTIONAL}, {"memlimit", DATA_TYPE_UINT64, ZK_OPTIONAL}, }; Introduce four input errors to identify specific input failures (in addition to generic argument value errors like EINVAL, ERANGE, EBADF, and E2BIG). ZFS_ERR_IOC_CMD_UNAVAIL the ioctl number is not supported by kernel ZFS_ERR_IOC_ARG_UNAVAIL an input argument is not supported by kernel ZFS_ERR_IOC_ARG_REQUIRED a required input argument is missing ZFS_ERR_IOC_ARG_BADTYPE an input argument has an invalid type Reviewed by: allanjude Obtained from: OpenZFS Sponsored by: Netflix, Klara Inc. Differential Revision: https://reviews.freebsd.org/D25393 Notes: svn path=/head/; revision=362531
* MFOpenZFS: Add zio_ddt_free()+ddt_phys_decref() error handlingAllan Jude2020-06-222-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | The assumption in zio_ddt_free() is that ddt_phys_select() must always find a match. However, if that fails due to a damaged DDT or some other reason the code will NULL dereference in ddt_phys_decref(). While this should never happen it has been observed on various platforms. The result is that unless your willing to patch the ZFS code the pool is inaccessible. Therefore, we're choosing to more gracefully handle this case rather than leave it fatal. http://mail.opensolaris.org/pipermail/zfs-discuss/2012-February/050972.html https://github.com/openzfs/zfs/commit/5dc6af0eec29b119b731c793037fd77214fc9438 Reported by: Pierre Beyssac Obtained from: OpenZFS MFC after: 2 weeks Sponsored by: Klara Inc. Notes: svn path=/head/; revision=362505
* ZFS: Allow setting checksum=skein on boot poolsAllan Jude2020-06-191-10/+1
| | | | | | | | | PR: 245889 Reported by: delphij Sponsored by: Klara Inc. Notes: svn path=/head/; revision=362396
* Fix export_args ex_flags field so that is 64bits, the same as mnt_flags.Rick Macklem2020-06-141-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since mnt_flags was upgraded to 64bits there has been a quirk in "struct export_args", since it hold a copy of mnt_flags in ex_flags, which is an "int" (32bits). This happens to currently work, since all the flag bits used in ex_flags are defined in the low order 32bits. However, new export flags cannot be defined. Also, ex_anon is a "struct xucred", which limits it to 16 additional groups. This patch revises "struct export_args" to make ex_flags 64bits and replaces ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a groups list, so it can be malloc'd up to NGROUPS in size. This requires that the VFS_CHECKEXP() arguments change, so I also modified the last "secflavors" argument to be an array pointer, so that the secflavors could be copied in VFS_CHECKEXP() while the export entry is locked. (Without this patch VFS_CHECKEXP() returns a pointer to the secflavors array and then it is used after being unlocked, which is potentially a problem if the exports entry is changed. In practice this does not occur when mountd is run with "-S", but I think it is worth fixing.) This patch also deleted the vfs_oexport_conv() function, since do_mount_update() does the conversion, as required by the old vfs_cmount() calls. Reviewed by: kib, freqlabs Relnotes: yes Differential Revision: https://reviews.freebsd.org/D25088 Notes: svn path=/head/; revision=362158
* fix up r362047: a call to zvol_*_minors() was not hidden from userlandAndriy Gapon2020-06-111-0/+2
| | | | | | | | | Reported by: CI/FreeBSD-head-powerpc64-build MFC after: 5 weeks X-MFC with: r362047 Notes: svn path=/head/; revision=362048
* rework how ZVOLs are updated in response to DSL operationsAndriy Gapon2020-06-1110-72/+212
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this change all ZVOL updates are initiated from the SPA sync context instead of a mix of the sync and open contexts. The updates are queued to be applied by a dedicated thread in the original order. This should ensure that ZVOLs always accurately reflect the corresponding datasets. ZFS ioctl operations wait on the mentioned thread to complete its work. Thus, the illusion of the synchronous ZVOL update is preserved. At the same time, the SPA sync thread never blocks on ZVOL related operations avoiding problems like reported in bug 203864. This change is based on earlier work in the same direction: D7179 and D14669 by Anthoine Bourgeois. D7179 tried to perform ZVOL operations in the open context and that opened races between them. D14669 uses a design very similar to this change but with different implementation details. This change also heavily borrows from similar code in ZoL, but there are many differences too. See: - https://github.com/zfsonlinux/zfs/commit/a0bd735adb1b1eb81fef10b4db102ee051c4d4ff - https://github.com/zfsonlinux/zfs/issues/3681 - https://github.com/zfsonlinux/zfs/issues/2217 PR: 203864 MFC after: 5 weeks Sponsored by: CyberSecure Differential Revision: https://reviews.freebsd.org/D23478 Notes: svn path=/head/; revision=362047
* Don't block on the range lock in zfs_getpages().Mark Johnston2020-05-203-22/+67
| | | | | | | | | | | | | | | | | | | | | | | | After r358443 the vnode object lock no longer synchronizes concurrent zfs_getpages() and zfs_write() (which must update vnode pages to maintain coherence). This created a potential deadlock between ZFS range locks and VM page busy locks: a fault on a mapped file will cause the fault page to be busied, after which zfs_getpages() locks a range around the file offset in order to map adjacent, resident pages; zfs_write() locks the range first, and then must busy vnode pages when synchronizing. Solve this by adding a non-blocking mode for ZFS range locks, and using it in zfs_getpages(). If zfs_getpages() fails to acquire the range lock, only the fault page will be populated. Reported by: bdrewery Reviewed by: avg Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24839 Notes: svn path=/head/; revision=361287
* lz4 hash table does not start zeroedToomas Soome2020-05-191-1/+1
| | | | | | | | | illumos issue: https://www.illumos.org/issues/12757 Submitted by: andyf Notes: svn path=/head/; revision=361266
* zfs: reject read(2) of a dirfd with EISDIRKyle Evans2020-05-191-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | This is independent of the recently-discussed global change, which is still in review/discussion stage. This is effectively a measure for consistency in the ZFS world, where FreeBSD was the only platform (as far as I could find) that allowed this. What ZFS exposes is decidedly not useful for any real purposes, to paraphrase (hopefully faithfully) jhb's findings when exploring this: The size of a directory in ZFS is the number of directory entries within. When reading a directory, you would instead get the leading part of its raw contents; the amount you get being dictated by the "size," i.e. number of directory entries. There's decidedly (luckily) no stack disclosure happening here, though the behavior is bizarre and almost certainly a historical accident. This change has already been upstreamed to OpenZFS. MFC after: 1 week Notes: svn path=/head/; revision=361238
* Correct the order of arguments to copyin() for Q_SETQUOTA.John Baldwin2020-05-181-1/+1
| | | | | | | | | MFC after: 2 weeks Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D24656 Notes: svn path=/head/; revision=361220
* Avoid the GEOM topology lock recursion when we automatically expand a pool.Pawel Jakub Dawidek2020-04-251-2/+6
| | | | | | | | | | | | | The steps to reproduce the problem: mdconfig -a -t swap -s 3g -u 0 gpart create -s GPT md0 gpart add -t freebsd-zfs -s 1g md0 zpool create -o autoexpand=on foo md0p1 gpart resize -i 1 -s 2g md0 Notes: svn path=/head/; revision=360325
* Make ZFS depend on xdr.ko only. It doesn't need kernel RPC.Gleb Smirnoff2020-04-171-1/+1
| | | | | | | Differential Revision: https://reviews.freebsd.org/D24408 Notes: svn path=/head/; revision=360037
* MFOpenZFS: ZVOLs should not be allowed to have childrenRyan Moeller2020-03-255-7/+83
| | | | | | | | | | | | | | | | | | | | | | | | zfs create, receive and rename can bypass this hierarchy rule. Update both userland and kernel module to prevent this issue and use pyzfs unit tests to exercise the ioctls directly. Note: this commit slightly changes zfs_ioc_create() ABI. This allow to differentiate a generic error (EINVAL) from the specific case where we tried to create a dataset below a ZVOL (ZFS_ERR_WRONG_PARENT). Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tom Caputi <tcaputi@datto.com> Signed-off-by: loli10K <ezomori.nozomu@gmail.com> Approved by: mav (mentor) MFC after: 2 weeks Sponsored by: iXsystems, Inc. openzfs/zfs@d8d418ff0cc90776182534bce10b01e9487b63e4 Notes: svn path=/head/; revision=359303
* MFOpenZFS: make zil max block size tunableAlexander Motin2020-03-195-31/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've observed that on some highly fragmented pools, most metaslab allocations are small (~2-8KB), but there are some large, 128K allocations. The large allocations are for ZIL blocks. If there is a lot of fragmentation, the large allocations can be hard to satisfy. The most common impact of this is that we need to check (and thus load) lots of metaslabs from the ZIL allocation code path, causing sync writes to wait for metaslabs to load, which can take a second or more. In the worst case, we may not be able to satisfy the allocation, in which case the ZIL will resort to txg_wait_synced() to ensure the change is on disk. To provide a workaround for this, this change adds a tunable that can reduce the size of ZIL blocks. External-issue: DLPX-61719 Reviewed-by: George Wilson <george.wilson@delphix.com> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes #8865 openzfs/zfs@b8738257c2607c73c731ce8e0fd73282b266d6ef MFC after: 2 weeks Notes: svn path=/head/; revision=359112
* Fix infinite scan on a pool with only special allocationsAlexander Motin2020-03-161-3/+6
| | | | | | | | | | | | | | | | | | | Attempt to run scrub or resilver on a new pool containing only special allocations (special vdev added on creation) caused infinite loop because of dsl_scan_should_clear() limiting memory usage to 5% of pool size, which it calculated accounting only normal allocation class. Addition of special and just in case dedup classes fixes the issue. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #10106 Closes #8694 openzfs/zfs@fa130e010c2ff9b33aba11d2699b667e454b3ccb Notes: svn path=/head/; revision=359018