path: root/sys/cddl/contrib/opensolaris/uts
Commit message | Author | Age | Files | Lines
* MFV r323794: 8605 zfs channel programs: zfs.exists undocumented and non-working | Andriy Gapon | 2017-10-01 | 1 | -1/+1
  illumos/illumos-gate@5f39f884e2035d671ec02148fc4d8420c670bcb4
  https://github.com/illumos/illumos-gate/commit/5f39f884e2035d671ec02148fc4d8420c670bcb4
  https://www.illumos.org/issues/8605

  zfs.exists() in channel programs doesn't return any result, and should
  have a man page entry.

  Reviewed by: Paul Dagnelie <pcd@delphix.com>
  Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Chris Williamson <chris.williamson@delphix.com>
  MFC after: 5 weeks
  X-MFC after: r324163
  Notes: svn path=/head/; revision=324170
* MFV r323531: 8521 nvlist memory leak in get_clones_stat() and spa_load_best() | Andriy Gapon | 2017-10-01 | 2 | -3/+5
  illumos/illumos-gate@7d3000f774e20097a1ee45cbd06d0e38065ddd5a
  https://github.com/illumos/illumos-gate/commit/7d3000f774e20097a1ee45cbd06d0e38065ddd5a
  https://www.illumos.org/issues/8521

  Yuri reported this to the mailing list: doing a `reboot -d` on current
  illumos-gate HEAD gives the following ":: findleaks -dv" output:

  findleaks: maximum buffers => 301061
  findleaks: actual buffers => 297587
  findleaks:
  findleaks: potential pointers => 29289774
  findleaks: dismissals => 26242305 (89.5%)
  findleaks: misses => 331153 ( 1.1%)
  findleaks: dups => 2419681 ( 8.2%)
  findleaks: follows => 296635 ( 1.0%)
  findleaks:
  findleaks: peak memory usage => 7353 kB
  findleaks: elapsed CPU time => 1.5 seconds
  findleaks: elapsed wall time => 2.0 seconds
  findleaks:
  CACHE             LEAKED  BUFCTL            CALLER
  ffffff03d222b008     120  ffffff03ef7ceb78  nv_alloc_sys+0x1f
  ffffff03d222a448     123  ffffff03f4150cc8  nv_alloc_sys+0x1f
  ffffff03d222b448       5  ffffff03f28bd598  nv_alloc_sys+0x1f
  ffffff03d222b888      87  ffffff03f28c10f0  nv_alloc_sys+0x1f
  ffffff03d222c008      21  ffffff03f4139310  nv_alloc_sys+0x1f
  ffffff03d222b888      43  ffffff040ef3f3e8  nv_alloc_sys+0x1f
  ffffff03d222c008     120  ffffff03f4591e58  nv_alloc_sys+0x1f
  ffffff03d222b008     121  ffffff03f352c068  nv_alloc_sys+0x1f
  ffffff03d222a448     112  ffffff03f414e5f8  nv_alloc_sys+0x1f
  ffffff03d222b008     119  ffffff03ee92fdc0  nv_alloc_sys+0x1f
  ffffff03d222b888      46  ffffff03f28c1378  nv_alloc_sys+0x1f
  ffffff03d222b448       4  ffffff03f28c7708  nv_alloc_sys+0x1f
  ffffff03d222c008      20  ffffff03f2a6e7e8  nv_alloc_sys+0x1f

  Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: Yuri Pankov <yuripv@gmx.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Pavel Zakharov <pavel.zakharov@delphix.com>
  MFC after: 5 weeks
  X-MFC after: r324163
  Notes: svn path=/head/; revision=324168
* revert r324166, it has an unrelated change in it | Andriy Gapon | 2017-10-01 | 2 | -5/+3
  Notes: svn path=/head/; revision=324167
* MFV r323531: 8521 nvlist memory leak in get_clones_stat() and spa_load_best() | Andriy Gapon | 2017-10-01 | 2 | -3/+5
  illumos/illumos-gate@7d3000f774e20097a1ee45cbd06d0e38065ddd5a
  https://github.com/illumos/illumos-gate/commit/7d3000f774e20097a1ee45cbd06d0e38065ddd5a
  https://www.illumos.org/issues/8521

  Yuri reported this to the mailing list: doing a `reboot -d` on current
  illumos-gate HEAD gives the following ":: findleaks -dv" output:

  findleaks: maximum buffers => 301061
  findleaks: actual buffers => 297587
  findleaks:
  findleaks: potential pointers => 29289774
  findleaks: dismissals => 26242305 (89.5%)
  findleaks: misses => 331153 ( 1.1%)
  findleaks: dups => 2419681 ( 8.2%)
  findleaks: follows => 296635 ( 1.0%)
  findleaks:
  findleaks: peak memory usage => 7353 kB
  findleaks: elapsed CPU time => 1.5 seconds
  findleaks: elapsed wall time => 2.0 seconds
  findleaks:
  CACHE             LEAKED  BUFCTL            CALLER
  ffffff03d222b008     120  ffffff03ef7ceb78  nv_alloc_sys+0x1f
  ffffff03d222a448     123  ffffff03f4150cc8  nv_alloc_sys+0x1f
  ffffff03d222b448       5  ffffff03f28bd598  nv_alloc_sys+0x1f
  ffffff03d222b888      87  ffffff03f28c10f0  nv_alloc_sys+0x1f
  ffffff03d222c008      21  ffffff03f4139310  nv_alloc_sys+0x1f
  ffffff03d222b888      43  ffffff040ef3f3e8  nv_alloc_sys+0x1f
  ffffff03d222c008     120  ffffff03f4591e58  nv_alloc_sys+0x1f
  ffffff03d222b008     121  ffffff03f352c068  nv_alloc_sys+0x1f
  ffffff03d222a448     112  ffffff03f414e5f8  nv_alloc_sys+0x1f
  ffffff03d222b008     119  ffffff03ee92fdc0  nv_alloc_sys+0x1f
  ffffff03d222b888      46  ffffff03f28c1378  nv_alloc_sys+0x1f
  ffffff03d222b448       4  ffffff03f28c7708  nv_alloc_sys+0x1f
  ffffff03d222c008      20  ffffff03f2a6e7e8  nv_alloc_sys+0x1f

  Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: Yuri Pankov <yuripv@gmx.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Pavel Zakharov <pavel.zakharov@delphix.com>
  MFC after: 5 weeks
  X-MFC after: r324163
  Notes: svn path=/head/; revision=324166
* MFV r323530,r323533,r323534: 7431 ZFS Channel Programs, and followups | Andriy Gapon | 2017-10-01 | 73 | -243/+20995
  7431 ZFS Channel Programs
  illumos/illumos-gate@dfc115332c94a2f62058ac7f2bce7631fbd20b3d
  https://github.com/illumos/illumos-gate/commit/dfc115332c94a2f62058ac7f2bce7631fbd20b3d
  https://www.illumos.org/issues/7431

  ZFS channel programs (ZCP) add support for performing compound ZFS
  administrative actions via Lua scripts in a sandboxed environment (with
  time and memory limits). This initial commit includes both base support
  for running ZCP scripts, and a small initial library of API calls which
  support getting properties and listing, destroying, and promoting
  datasets.

  Testing: in addition to the included unit tests, channel programs have
  been in use at Delphix for several months for batch destroying
  filesystems. The dsl_destroy_snaps_nvl() call has also been replaced with

  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: John Kennedy <john.kennedy@delphix.com>
  Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
  Approved by: Garrett D'Amore <garrett@damore.org>
  Author: Chris Williamson <chris.williamson@delphix.com>

  8552 ZFS LUA code uses floating point math
  illumos/illumos-gate@916c8d881190bd2c3ca20d9fca919aecff504435
  https://github.com/illumos/illumos-gate/commit/916c8d881190bd2c3ca20d9fca919aecff504435
  https://www.illumos.org/issues/8552

  In the LUA interpreter used by "zfs program", the lua format() function
  accidentally includes support for '%f' and friends, which can cause
  compilation problems when building on platforms that don't support
  floating-point math in the kernel (e.g. sparc). Support for '%f' and
  friends (%f %e %E %g %G) should be removed, since there's no way to
  supply a floating-point value anyway (all numbers in ZFS LUA are
  int64_t's).

  Reviewed by: Yuri Pankov <yuripv@gmx.com>
  Reviewed by: Igor Kozhukhov <igor@dilos.org>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Matthew Ahrens <mahrens@delphix.com>

  8590 memory leak in dsl_destroy_snapshots_nvl()
  illumos/illumos-gate@e6ab4525d156c82445c116ecf6b2b874d5e9009d
  https://github.com/illumos/illumos-gate/commit/e6ab4525d156c82445c116ecf6b2b874d5e9009d
  https://www.illumos.org/issues/8590

  In dsl_destroy_snapshots_nvl(), "snaps_normalized" is not freed after it
  is added to "arg".

  Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
  Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Matthew Ahrens <mahrens@delphix.com>

  FreeBSD notes:
  - zfs-program.8 manual page is taken almost as is from the vendor
    repository, no FreeBSD-ification done
  - fixed multiple instances of NULL being used where an integer is
    expected
  - replaced ETIME and ECHRNG with ETIMEDOUT and EDOM respectively

  This commit adds a modified version of Lua 5.2.4 under
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua, mirroring the
  upstream. See README.zfs in that directory for the description of Lua
  customizations. See zfs-program.8 on how to use the new feature.

  MFC after: 5 weeks
  Relnotes: yes
  Differential Revision: https://reviews.freebsd.org/D12528
  Notes: svn path=/head/; revision=324163
* Use C99 initializers for DTrace provider methods. | Mark Johnston | 2017-09-27 | 2 | -34/+34
  This makes the definitions easier to read and more cscope-friendly.

  MFC after: 1 week
  Notes: svn path=/head/; revision=324066
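  As a rough illustration of why designated initializers read better, the
  sketch below contrasts the two styles. The `provider_ops` structure, its
  fields, and the `demo_*` functions are hypothetical stand-ins invented for
  this example; they are not the real DTrace provider method table.

```c
#include <stddef.h>

/* Hypothetical method table; NOT the real DTrace provider ops structure. */
struct provider_ops {
	void (*po_provide)(void *arg);
	int  (*po_enable)(void *arg, int id);
	void (*po_disable)(void *arg, int id);
	void (*po_destroy)(void *arg, int id);
};

/* Hypothetical callbacks used only to populate the tables. */
static void demo_provide(void *arg) { (void)arg; }
static int  demo_enable(void *arg, int id) { (void)arg; return (id); }

/* Positional style: the meaning of each slot depends on its position. */
static const struct provider_ops positional_ops = {
	demo_provide,
	demo_enable,
	NULL,
	NULL
};

/*
 * C99 designated style: each member is named, so the table can be
 * searched by member name, and members left out are zero-initialized.
 */
static const struct provider_ops designated_ops = {
	.po_provide = demo_provide,
	.po_enable = demo_enable,
};
```

  Both tables end up with identical contents; the designated form simply
  makes the mapping from member to function explicit.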
* fix r324011, MFV of r323535, 8585 improve batching done in zil_commit() | Andriy Gapon | 2017-09-26 | 1 | -3/+7
  I managed to commit an older version of the change. Plus, even the
  latest version was not ready for userland compilation.

  Reported by: "O. Hartmann" <ohartmann@walstatt.org>, cy
  MFC after: 1 week
  X-MFC with: r324011
  Notes: svn path=/head/; revision=324016
* MFV r323535: 8585 improve batching done in zil_commit() | Andriy Gapon | 2017-09-26 | 11 | -289/+1375
  FreeBSD notes:
  - this MFV reverts FreeBSD commit r314549 to make the merge easier
  - at present our emulation of cv_timedwait_hires is rather poor, so I
    elected to use cv_timedwait_sbt directly

  Please see the differential revision for details. Unfortunately, I did
  not get any positive reviews, so there could be bugs in the
  FreeBSD-specific piece of the merge. Hence, the long MFC timeout.

  illumos/illumos-gate@1271e4b10dfaaed576c08a812f466f6e81370e5e
  https://github.com/illumos/illumos-gate/commit/1271e4b10dfaaed576c08a812f466f6e81370e5e
  https://www.illumos.org/issues/8585

  The current implementation of zil_commit() can introduce significant
  latency, beyond what is inherent due to the latency of the underlying
  storage. The additional latency comes from two main problems:

  1. When there are outstanding ZIL blocks being written (i.e. there's
     already a "writer thread" in progress), then any new calls to
     zil_commit() will block waiting for the currently outstanding ZIL
     blocks to complete. The set of blocks written for each "writer
     thread" is called a "batch", and there can only ever be a single
     "batch" being written at a time. When a batch is being written, any
     new ZIL transactions will have to wait for the next batch to be
     written, which won't occur until the current batch finishes.

     As a result, the underlying storage may not be used as efficiently
     as possible. While "new" threads enter zil_commit() and are blocked
     waiting for the next batch, it's possible that the underlying
     storage isn't fully utilized by the current batch of ZIL blocks. In
     that case, it'd be better to allow these new threads to generate
     (and issue) a new ZIL block, such that it could be serviced by the
     underlying storage concurrently with the other ZIL blocks that are
     being serviced.

  2. Any call to zil_commit() must wait for all ZIL blocks in its "batch"
     to complete, prior to zil_commit() returning. The size of any given
     batch is proportional to the number of ZIL transactions in the queue
     at the time that the batch starts processing the queue, which
     doesn't occur until the previous batch completes. Thus, if there are
     a lot of transactions in the queue, the batch could be composed of
     many ZIL blocks, and each call to zil_commit() will have to wait for
     all of these writes to complete (even if the thread calling
     zil_commit() only cared about one of the transactions in the batch).

  Reviewed by: Brad Lewis <brad.lewis@delphix.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Prakash Surya <prakash.surya@delphix.com>
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D12355
  Notes: svn path=/head/; revision=324011
* Use nstosbt() instead of multiplying by SBT_1NS to avoid roundoff errors. | Ian Lepore | 2017-09-25 | 1 | -2/+2
  Differential Revision: https://reviews.freebsd.org/D11779
  Notes: svn path=/head/; revision=323985
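  To see where the roundoff comes from, here is a minimal user-space
  sketch. It assumes a 32.32 fixed-point time type and a per-nanosecond
  constant obtained by integer division, which is my understanding of how
  sbintime_t and SBT_1NS are defined. The names `sbt_t`, `SBT_ONE_SEC`,
  `SBT_ONE_NS`, `ns_to_sbt_naive` and `ns_to_sbt_split` are hypothetical;
  this is not the kernel's nstosbt() implementation.

```c
#include <stdint.h>

/* Stand-in for a 32.32 fixed-point time value (assumption for this sketch). */
typedef int64_t sbt_t;

#define SBT_ONE_SEC	((sbt_t)1 << 32)
/* Integer division truncates 4.294967296 down to 4. */
#define SBT_ONE_NS	(SBT_ONE_SEC / 1000000000)

/* Naive conversion: inherits the truncation error of SBT_ONE_NS. */
static sbt_t
ns_to_sbt_naive(int64_t ns)
{
	return (ns * SBT_ONE_NS);
}

/*
 * Conversion that handles whole seconds exactly and scales only the
 * sub-second remainder. Assumes ns >= 0; the remainder is below 2^30,
 * so shifting it left by 32 cannot overflow a signed 64-bit value.
 */
static sbt_t
ns_to_sbt_split(int64_t ns)
{
	int64_t sec = ns / 1000000000;
	int64_t frac = ns % 1000000000;

	return (sec * SBT_ONE_SEC + ((frac << 32) / 1000000000));
}
```

  With these definitions, one second expressed in nanoseconds converts to
  4000000000 with the naive multiply but to exactly 2^32 (4294967296) with
  the split conversion, an error of roughly 7% in the naive case.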
* MFV r323917: 8648 Fix range locking in ZIL commit codepath | Andriy Gapon | 2017-09-22 | 2 | -4/+13
  illumos/illumos-gate@42b14111721da2ebd5159e7b45012a3eb0e3384c
  https://github.com/illumos/illumos-gate/commit/42b14111721da2ebd5159e7b45012a3eb0e3384c
  https://www.illumos.org/issues/8648

  I'm opening this bug to track integration of the following ZFS on Linux
  commit into illumos:

    commit f763c3d1df569a8d6b60bcb5e95cf07aa7a189e6
    Author: LOLi <loli10K@users.noreply.github.com>
    Date: Mon Aug 21 17:59:48 2017 +0200

    Fix range locking in ZIL commit codepath

    Since OpenZFS 7578 (1b7c1e5) if we have a ZVOL with logbias=throughput
    we will force WR_INDIRECT itxs in zvol_log_write() setting
    itx->itx_lr offset and length to the offset and length of the BIO
    from zvol_write()->zvol_log_write(): these offset and length are
    later used to take a range lock in zilog->zl_get_data function:
    zvol_get_data().

    Now suppose we have a ZVOL with blocksize=8K and push 4K writes to
    offset 0: we will only be range-locking 0-4096. This means the
    ASSERTion we make in dbuf_unoverride() is no longer valid because now
    dmu_sync() is called from zilog's get_data functions holding a
    partial lock on the dbuf.

    Fix this by taking a range lock on the whole block in
    zvol_get_data().

    Reviewed-by: Chunwei Chen <tuxoko@gmail.com>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: loli10K <ezomori.nozomu@gmail.com>

  Reviewed by: Igor Kozhukhov <igor@dilos.org>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Reviewed by: Andriy Gapon <avg@FreeBSD.org>
  Reviewed by: Alexander Motin <mav@FreeBSD.org>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: LOLi <loli10K@users.noreply.github.com>
  MFC after: 10 days
  Notes: svn path=/head/; revision=323918
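  The effect of locking the whole block rather than the written bytes can
  be sketched with plain arithmetic. The helper below is a hypothetical
  user-space illustration, not the zvol_get_data() code: it widens a write
  range to block boundaries, assuming the block size is a power of two.

```c
#include <stdint.h>

/* Hypothetical description of a byte range to lock. */
struct byte_range {
	uint64_t start;
	uint64_t length;
};

/*
 * Widen [offset, offset + length) so that it starts and ends on block
 * boundaries. Assumes blocksize is a power of two and length > 0.
 */
static struct byte_range
whole_block_range(uint64_t offset, uint64_t length, uint64_t blocksize)
{
	struct byte_range r;
	uint64_t mask = blocksize - 1;
	uint64_t end = offset + length;

	r.start = offset & ~mask;			/* round start down */
	r.length = ((end + mask) & ~mask) - r.start;	/* round end up */
	return (r);
}
```

  For the case described in the commit message (blocksize=8K, a 4K write at
  offset 0), the widened range is 0-8192 instead of 0-4096, so the lock
  covers the entire block that the dbuf represents.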
* MFV r323914: 8661 remove "zil-cw2" dtrace probe | Andriy Gapon | 2017-09-22 | 1 | -1/+0
  illumos/illumos-gate@bd9d3f904625846bdc61af8897a1072029c7aeb7
  https://github.com/illumos/illumos-gate/commit/bd9d3f904625846bdc61af8897a1072029c7aeb7
  https://www.illumos.org/issues/8661

  The "zil-cw1" dtrace probe was previously removed in 8558, and the
  "zil-cw2" probe should have been removed in that patch as well.
  Unfortunately, "zil-cw2" was not removed in 8558, so this bug is to
  track its removal.

  Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Igor Kozhukhov <igor@dilos.org>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Prakash Surya <prakash.surya@delphix.com>
  MFC after: 1 week
  Notes: svn path=/head/; revision=323915
* MFV r323789: 8473 scrub does not detect errors on active spares | Alan Somers | 2017-09-20 | 1 | -8/+41
  illumos/illumos-gate@554675eee75dd2d7398d960aa5c81083ceb8505a
  https://github.com/illumos/illumos-gate/commit/554675eee75dd2d7398d960aa5c81083ceb8505a
  https://www.illumos.org/issues/8473

  Scrubbing is supposed to detect and repair all errors in the pool.
  However, it wrongly ignores active spare devices. The problem can easily
  be reproduced in OpenZFS at git rev 0ef125d with these commands:

    truncate -s 64m /tmp/a /tmp/b /tmp/c
    sudo zpool create testpool mirror /tmp/a /tmp/b spare /tmp/c
    sudo zpool replace testpool /tmp/a /tmp/c
    /bin/dd if=/dev/zero bs=1024k count=63 oseek=1 conv=notrunc of=/tmp/c
    sync
    sudo zpool scrub testpool
    zpool status testpool # Will show 0 errors, which is wrong
    sudo zpool offline testpool /tmp/a
    sudo zpool scrub testpool
    zpool status testpool # Will show errors on /tmp/c,
                          # which should've already been fixed

  FreeBSD head is partially affected: the first scrub will detect some
  errors, but the second scrub will detect more.

  Reviewed by: Andy Stormont <astormont@racktopsystems.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Reviewed by: George Wilson <george.wilson@delphix.com>
  Approved by: Richard Lowe <richlowe@richlowe.net>
  MFC after: 1 week
  Sponsored by: Spectra Logic Corp
  Notes: svn path=/head/; revision=323813
* add vfs_zfs.abd_chunk_size tunable | Andriy Gapon | 2017-09-20 | 1 | -0/+7
  It is reported that the default value of 4KB results in a substantial
  memory use overhead (at least, on some configurations). Using 1KB seems
  to reduce the overhead significantly.

  PR: 222377
  Reported by: Sean Chittenden <sean@chittenden.org>
  MFC after: 1 week
  Notes: svn path=/head/; revision=323797
* fix memory leak in g_bio zone introduced in r320452, another ABD fallout | Andriy Gapon | 2017-09-20 | 1 | -7/+18
  I overlooked the fact that ZIO_IOCTL_PIPELINE does not include the
  ZIO_STAGE_VDEV_IO_DONE stage. We do allocate a struct bio for an ioctl
  zio (a disk cache flush), but we never freed it.

  This change splits bio handling into two groups, one for normal
  read/write i/o that passes data around and, thus, needs the abd data
  transform; the other group is for "data-less" i/o such as trim and cache
  flush.

  PR: 222288
  Reported by: Dan Nelson <dnelson@allantgroup.com>
  Tested by: Borja Marcos <borjam@sarenet.es>
  MFC after: 10 days
  Notes: svn path=/head/; revision=323796
* MFV r323792: 8602 remove unused "dp_early_sync_tasks" field from "dsl_pool" structure | Andriy Gapon | 2017-09-20 | 1 | -1/+0
  illumos/illumos-gate@2bcb5458541cc6e8bf7dc541303da29297b82e8b
  https://github.com/illumos/illumos-gate/commit/2bcb5458541cc6e8bf7dc541303da29297b82e8b
  https://www.illumos.org/issues/8602

  When I landed the fix for 8558, I incorrectly added the
  "dp_early_sync_tasks" field to the "dsl_pool" structure. This field is
  used in DelphixOS, but not in illumos. It was incorrectly pulled into
  illumos, so this bug is to remove it from the structure.

  Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Prakash Surya <prakash.surya@delphix.com>
  MFC after: 1 week
  Notes: svn path=/head/; revision=323793
* slightly simplify zfs_vptocnp | Andriy Gapon | 2017-09-13 | 1 | -8/+1
  It's not necessary to look up the parent's ID to check if the node is
  the root node of the filesystem.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=323522
* fix a fallout from the ZTOV tightening, r323479 | Andriy Gapon | 2017-09-12 | 1 | -1/+4
  MFC after: 13 days
  X-MFC with: r323479
  Notes: svn path=/head/; revision=323491
* zfsctl_snapdir_lookup should be able to handle an uncovered vnode | Andriy Gapon | 2017-09-12 | 1 | -10/+25
  The uncovered vnode is possible because there is no guarantee that its
  hold count would go to zero (and it would be inactivated and reclaimed)
  immediately after a covering filesystem is unmounted. So, such a vnode
  should be expected and it is possible to re-use it without any trouble.

  MFC after: 3 weeks
  Sponsored by: Panzura
  Notes: svn path=/head/; revision=323483
* zfs_ctldir: remove obsolete / bogus ARGSUSED lint directives | Andriy Gapon | 2017-09-12 | 1 | -6/+0
  None of the tagged functions had unused parameters.

  MFC after: 1 week
  Notes: svn path=/head/; revision=323482
* zfsvfs_hold: assert that the busied filesystem can not be unmounted | Andriy Gapon | 2017-09-12 | 1 | -0/+8
  This is a FreeBSD specific feature.

  MFC after: 3 weeks
  Sponsored by: Panzura
  Notes: svn path=/head/; revision=323481
* zfs_get_vfs: reference a requested filesystem instead of vfs_busy-ing it | Andriy Gapon | 2017-09-12 | 1 | -10/+6
  The only consumer of zfs_get_vfs, zfs_unmount_snap, does not need the
  filesystem to be busy; it just needs a reference that it can pass to
  dounmount. Also, previously the code was racy as it unbusied the
  filesystem before taking a reference on it. Now the code should be
  simpler and safer.

  MFC after: 2 weeks
  Sponsored by: Panzura
  Notes: svn path=/head/; revision=323480
* zfs: tighten debug versions of ZTOV and VTOZ | Andriy Gapon | 2017-09-12 | 2 | -4/+3
  MFC after: 2 weeks
  Sponsored by: Panzura
  Notes: svn path=/head/; revision=323479
* MFV r323111: 8569 problem with inline functions in abd.h | Andriy Gapon | 2017-09-11 | 1 | -0/+7
  illumos/illumos-gate@37e84ab74e939caf52150fc3352081786ecc0c29
  https://github.com/illumos/illumos-gate/commit/37e84ab74e939caf52150fc3352081786ecc0c29
  https://www.illumos.org/issues/8569

  C [C99] has peculiar rules for inline functions that are different from
  the C++ rules. Unlike C++ where inline is "fire and forget", in C a
  programmer must pay attention to the function's storage class /
  visibility. The main problem is with the case where a compiler decides
  to not inline a call to the function declared as inline.

  Some relevant links:
  - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15831.html
  - http://www.drdobbs.com/the-new-c-inline-functions/184401540

  The summary is that either the inline functions should be declared
  'static inline' or one of the compilation units (.c files) must provide
  a callable externally visible function definition. In the former case,
  the compiler would automatically create a local non-inlined function
  instance in every compilation unit where it's needed. In the latter case
  the single external definition is used to satisfy any non-inlined calls
  in all compilation units.

  As things stand right now, we can get an undefined reference error under
  certain combinations of compilers and compiler options. For example,
  this is what I get on FreeBSD when compiling with clang 4.0.0 and -O1:

    In function `abd_free':
    /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:385:
    undefined reference to `abd_is_linear'

  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Andriy Gapon <avg@FreeBSD.org>
  MFC after: 1 week
  Notes: svn path=/head/; revision=323435
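  The two options named in the summary can be sketched in a single file as
  follows. This is an illustration under C99 inline semantics only; the
  `buf_desc` type, the flag, and both function names are hypothetical and
  do not come from abd.h.

```c
#include <stdbool.h>

/* Hypothetical buffer descriptor, used only for this illustration. */
struct buf_desc {
	unsigned flags;
};
#define BUF_FLAG_LINEAR	0x1u

/*
 * Option 1: 'static inline'. If the compiler declines to inline a call,
 * it emits a private copy in each compilation unit that needs one, so
 * the linker never looks for an external symbol.
 */
static inline bool
buf_is_linear(const struct buf_desc *b)
{
	return ((b->flags & BUF_FLAG_LINEAR) != 0);
}

/*
 * Option 2: plain 'inline' definition (as it would appear in a header).
 * On its own this is only an "inline definition" and provides no
 * external symbol for calls that are not inlined.
 */
inline bool
buf_is_scattered(const struct buf_desc *b)
{
	return ((b->flags & BUF_FLAG_LINEAR) == 0);
}

/*
 * ...so exactly one .c file adds this declaration, which makes that
 * compilation unit emit the externally visible definition.
 */
extern inline bool buf_is_scattered(const struct buf_desc *b);
```

  Dropping the final `extern inline` line and building without
  optimization is one way to reproduce an undefined-reference error of the
  kind quoted in the commit message.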
* Revert r322601, Mark ZFS ABD inline functions static | Andriy Gapon | 2017-09-11 | 1 | -6/+6
  An alternative fix is to be merged from illumos shortly.

  Notes: svn path=/head/; revision=323434
* MFV r323110: 8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems | Andriy Gapon | 2017-09-11 | 4 | -9/+43
  illumos/illumos-gate@216d7723a1a58124cf95c4950d51d5f99d3f4128
  https://github.com/illumos/illumos-gate/commit/216d7723a1a58124cf95c4950d51d5f99d3f4128
  https://www.illumos.org/issues/8558

  On a system with more than 80K ZFS filesystems, we've seen cases where
  lwp_create() will start to fail by returning EAGAIN. The problem is
  that, for each of those 80K ZFS filesystems, a taskq is created as part
  of the dataset's ZIL. For each of these taskq's, a kernel thread is
  created, which results in 24KB being allocated for each thread. With
  enough of these 24KB allocations, we eventually exhaust the memory
  region set aside for these allocations. Currently, segkpsize is set to a
  value of 2GB, which means we can only support about 80K filesystems;
  2GB / 24KB = ~80K.

  The lwp_create() failure comes into play due to the fact that LWP
  creation also allocates 24KB from this same region of memory. Thus, if
  we've exhausted this region of memory due to the number of ZIL taskq's,
  there won't be any memory available to allow the call to lwp_create() to
  succeed.

  FreeBSD note: I haven't created sysctl-s for the new ZIL clean
  parameters. Let's add them if anyone needs to tune them.

  Reviewed by: George Wilson <george.wilson@delphix.com>
  Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Prakash Surya <prakash.surya@delphix.com>
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=323433
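  The "2GB / 24KB = ~80K" estimate can be checked with a few lines of
  integer arithmetic. The helper below is a hypothetical sketch that
  assumes binary units (GiB and KiB) and ignores any other consumers of
  the region.

```c
#include <stdint.h>

#define KIB	1024ULL
#define GIB	(1024ULL * 1024ULL * 1024ULL)

/*
 * Number of fixed-size per-thread allocations that fit in a region.
 * Hypothetical helper for illustration; integer division rounds down.
 */
static uint64_t
max_thread_allocations(uint64_t region_bytes, uint64_t per_thread_bytes)
{
	return (region_bytes / per_thread_bytes);
}
```

  Calling `max_thread_allocations(2 * GIB, 24 * KIB)` yields 87381, which
  is the same order of magnitude as the 80K figure quoted above.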
* MFV r323107: 8414 Implemented zpool scrub pause/resume | Andriy Gapon | 2017-09-09 | 9 | -39/+197
  illumos/illumos-gate@1702cce751c5cb7ead878d0205a6c90b027e3de8
  https://github.com/illumos/illumos-gate/commit/1702cce751c5cb7ead878d0205a6c90b027e3de8

  FreeBSD note: rather than merging the zpool.8 update I copied the zpool
  scrub section from the illumos zpool.1m to FreeBSD zpool.8 almost
  verbatim. Now that the illumos page uses the mdoc format, it was an
  easier option. Perhaps the change is not in perfect compliance with the
  FreeBSD style, but I think that it is acceptable.

  https://www.illumos.org/issues/8414

  This issue tracks the port of scrub pause from ZoL:
  https://github.com/zfsonlinux/zfs/pull/6167

  Currently, there is no way to pause a scrub. Pausing may be useful when
  the pool is busy with other I/O to preserve bandwidth.

  Description

  This patch adds the ability to pause and resume scrubbing. This is
  achieved by maintaining a persistent on-disk scrub state. While the
  state is 'paused' we do not scrub any more blocks. We do however perform
  regular scan housekeeping such as freeing async destroyed and deadlist
  blocks while paused.

  Motivation and Context

  Scrub pausing can be an I/O intensive operation and people have been
  asking for the ability to pause a scrub for a while. This allows one to
  preserve scrub progress while freeing up bandwidth for other I/O.

  Reviewed by: George Melikov <mail@gmelikov.ru>
  Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
  Reviewed by: Brad Lewis <brad.lewis@delphix.com>
  Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
  Reviewed by: Matt Ahrens <mahrens@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Alek Pinchuk <apinchuk@datto.com>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=323355
* Add sysctls for arc shrinking and growing values | Baptiste Daroussin | 2017-08-31 | 1 | -0/+30
  The default value for arc_no_grow_shift may not be optimal when using
  an ARC of several GiB. Exposing it via sysctl allows users to tune it
  easily.

  Also expose arc_grow_retry via sysctl for the same reason. The default
  value of 60s might, in case of intensive load, be too long.

  Submitted by: Nikita Kozlov <nikita.kozlov@blade-group.com>
  Reviewed by: mav, manu, bapt
  MFC after: 2 weeks
  Sponsored by: blade
  Differential Revision: https://reviews.freebsd.org/D12144
  Notes: svn path=/head/; revision=323051
* Add a guard around _ILP32 for mips. | John Baldwin | 2017-08-21 | 1 | -0/+2
  This is already done for other architectures in this file and fixes the
  build with clang.

  Notes: svn path=/head/; revision=322765
* Mark ZFS ABD inline functions static. | John Baldwin | 2017-08-16 | 1 | -6/+6
  When built with -fno-inline-functions zfs.ko contains undefined
  references to these functions if they are only marked inline.

  Reviewed by: avg (earlier version)
  MFC after: 1 week
  Sponsored by: Chelsio Communications
  Notes: svn path=/head/; revision=322601
* Fix some ZFS debugging messages | Alan Somers | 2017-08-15 | 1 | -4/+4
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
  Be more careful about the use of provider names vs vdev names in
  ZFS_LOG statements.

  MFC after: 3 weeks
  Sponsored by: Spectra Logic Corp
  Notes: svn path=/head/; revision=322546
* MFV r322242: 8373 TXG_WAIT in ZIL commit path | Andriy Gapon | 2017-08-08 | 1 | -1/+18
  illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
  https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7
  https://www.illumos.org/issues/8373

  The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign a
  transaction to a transaction group. That seems to be logically incorrect
  as writing of the ZIL block does not introduce any new dirty data. Also,
  when there is a lot of dirty data, the call can introduce significant
  delays into the ZIL commit path, thus affecting all synchronous writes.
  Additionally, ARC throttling may affect the ZIL writing.

  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Prakash Surya <prakash.surya@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Andriy Gapon <avg@FreeBSD.org>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=322245
* MFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS | Andriy Gapon | 2017-08-08 | 1 | -0/+7
  illumos/illumos-gate@79c2b812ee2010ebf20fdd92dc5f06b59000a94c
  https://github.com/illumos/illumos-gate/commit/79c2b812ee2010ebf20fdd92dc5f06b59000a94c
  https://www.illumos.org/issues/8491

  The zpool checkpoint feature in DxOS added a new field in the uberblock.
  The Multi-Modifier Protection Pull Request from ZoL adds two new fields
  in the uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279).

  These two changes come from two different sources and, once upstreamed
  and deployed, would be incompatible with each other. We therefore want
  to upstream a change that reserves the padding for both of them, so that
  integration goes smoothly and everyone gets both features.

  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
  Reviewed by: Olaf Faaland <faaland1@llnl.gov>
  Approved by: Gordon Ross <gwr@nexenta.com>
  Author: Serapheim Dimitropoulos <serapheim@delphix.com>
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=322241
* MFV r322238: 7915 checks in l2arc_evict could use some cleaning up | Andriy Gapon | 2017-08-08 | 1 | -15/+9
  illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82
  https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82
  https://www.illumos.org/issues/7915

  l2arc_evict() is strictly serialized with respect to
  l2arc_write_buffers() and l2arc_write_done(). Normally, l2arc_evict()
  and l2arc_write_buffers() are called from the same thread, so they can
  not be concurrent. Also, l2arc_write_buffers() uses zio_wait() on the
  parent zio of all cache zio-s. That ensures that l2arc_write_done() is
  completed before l2arc_write_buffers() returns. Finally, if a cache
  device is removed, then l2arc_evict() is called under SCL_ALL in the
  exclusive mode. That ensures that it can not be concurrent with the
  normal L2ARC accesses to the device (including writing and evicting
  buffers).

  Given the above, some checks and actions in l2arc_evict() do not make
  sense. For instance, it must never encounter the write head header let
  alone remove it from the buffer list.

  Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
  Reviewed by: Prakash Surya <prakash.surya@delphix.com>
  Approved by: Matthew Ahrens <mahrens@delphix.com>
  Author: Andriy Gapon <avg@FreeBSD.org>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=322239
* MFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing | Andriy Gapon | 2017-08-08 | 1 | -5/+10
  illumos/illumos-gate@dcb6872c565819ac88acbc2ece999ef241c8b982
  https://github.com/illumos/illumos-gate/commit/dcb6872c565819ac88acbc2ece999ef241c8b982
  https://www.illumos.org/issues/8126

  The sync thread is concurrently modifying dn_phys->dn_nlevels while
  dbuf_dirty() is trying to assert something about it, without holding the
  necessary lock. We need to move this assertion further down in the
  function, after we have acquired the dn_struct_rwlock.

  Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
  Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
  Approved by: Robert Mustacchi <rm@joyent.com>
  Author: Matthew Ahrens <mahrens@delphix.com>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=322237
* zfs: no need for __DECONST after abd constification in r322233 | Andriy Gapon | 2017-08-08 | 1 | -1/+1
  Note that vdev_label_write_pad2() is FreeBSD specific.

  MFC after: 2 weeks
  X-MFC after: r322233
  Notes: svn path=/head/; revision=322234
* MFV r322232: 8426 mark immutable buffer arguments as such in abd.h | Andriy Gapon | 2017-08-08 | 1 | -2/+2
    illumos/illumos-gate@9b195260e22529ac0e2580faaf89402420589c1c
    https://github.com/illumos/illumos-gate/commit/9b195260e22529ac0e2580faaf89402420589c1c
    https://www.illumos.org/issues/8426

    abd_copy_from_buf and abd_cmp_buf do not modify their void *buf
    arguments, so qualify them with const. abd_copy_from_buf_off and
    abd_cmp_buf_off already had that type for the corresponding arguments.

    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Andriy Gapon <avg@FreeBSD.org>
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=322233
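The effect of const-qualifying a read-only buffer argument can be sketched in a few lines of C. The functions below are hypothetical stand-ins written for illustration; they are not the abd.h prototypes, and only the const qualification itself is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a comparison helper (not the real abd API).
 * The second argument is only read, so it is declared const void *.
 */
static int
demo_cmp_buf(const void *stored, const void *buf, size_t size)
{
	assert(stored != NULL && buf != NULL);
	return (memcmp(stored, buf, size));
}

/*
 * A caller that only holds a pointer-to-const can pass it straight
 * through; no cast that strips the qualifier is needed.
 */
static int
demo_label_matches(const char *label, const char *expected, size_t size)
{
	return (demo_cmp_buf(label, expected, size) == 0);
}
```

Without the const on the parameter, the call in demo_label_matches() would need a qualifier-stripping cast to compile cleanly, which is the kind of cast the constification makes unnecessary.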
* MFV r322229: 7600 zfs rollback should pass target snapshot to kernel | Andriy Gapon | 2017-08-08 | 3 | -6/+34
    illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df
    https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df
    https://www.illumos.org/issues/7600

    At present, the kernel side code seems to blindly rollback to whatever
    happens to be the latest snapshot at the time when the rollback task is
    processed. The expected target's name should be passed to the kernel
    driver and the sync task should validate that the target exists and
    that it is the latest snapshot indeed.

    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Andriy Gapon <avg@FreeBSD.org>
    MFC after: 3 weeks
    Notes: svn path=/head/; revision=322230
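The validation described in the message above might look roughly like the following. This is a minimal sketch under stated assumptions: the function name, the use of plain strings for snapshot names, and the error codes are all placeholders, not the kernel's actual interface or return values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check, for illustration only.
 *   latest: name of the newest snapshot, or NULL if there is none
 *   target: name the caller expects to roll back to, or NULL if the
 *           caller did not pass one (the old, "blind" behavior)
 * The error codes are placeholders chosen for this sketch.
 */
static int
demo_rollback_check(const char *latest, const char *target)
{
	if (latest == NULL)
		return (ENOENT);	/* nothing to roll back to */
	if (target != NULL && strcmp(latest, target) != 0)
		return (EINVAL);	/* target is not the latest snapshot */
	return (0);
}
```

The point of the sketch is that a snapshot created between the user's request and the execution of the rollback task makes the check fail instead of silently changing the rollback target.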
* MFV r322227: 8377 Panic in bookmark deletion | Andriy Gapon | 2017-08-08 | 1 | -1/+7
    illumos/illumos-gate@42418f9e73f0d007aa87675ecc206c26fc8e073e
    https://github.com/illumos/illumos-gate/commit/42418f9e73f0d007aa87675ecc206c26fc8e073e
    https://www.illumos.org/issues/8377

    The problem is that when dsl_bookmark_destroy_check() is executed from
    open context (the pre-check), it fills in dbda_success based on the
    existence of the bookmark. But the bookmark (or containing filesystem
    as in this case) can be destroyed before we get to syncing context.
    When we re-run dsl_bookmark_destroy_check() in syncing context, it will
    not add the deleted bookmark to dbda_success, intending for
    dsl_bookmark_destroy_sync() to not process it. But because the bookmark
    is still in dbda_success from the open-context call, we do try to
    destroy it.

    The fix is that dsl_bookmark_destroy_check() should not modify
    dbda_success when called from open context.

    Reviewed by: Paul Dagnelie <pcd@delphix.com>
    Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Matthew Ahrens <mahrens@delphix.com>
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=322228
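The shape of that fix can be illustrated with a small model. Everything here is hypothetical (the struct, the fixed-size array, the boolean "syncing" flag); it only mirrors the idea that the check function records successes when run in syncing context and leaves the list untouched in open context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	DEMO_MAX	8

/* Hypothetical model of the destroy argument; not the real structure. */
struct demo_destroy_arg {
	const bool *exists;		/* exists[i]: is bookmark i still there? */
	size_t nbookmarks;
	size_t success[DEMO_MAX];	/* indices the sync step will destroy */
	size_t nsuccess;
};

/*
 * The check runs twice: once as a pre-check (syncing == false) and once
 * in syncing context (syncing == true). Only the second run may record
 * results, so a bookmark that vanished in between is never queued.
 */
static int
demo_destroy_check(struct demo_destroy_arg *a, bool syncing)
{
	assert(a->nbookmarks <= DEMO_MAX);
	for (size_t i = 0; i < a->nbookmarks; i++) {
		if (!a->exists[i])
			continue;	/* already gone: skip it */
		if (syncing)
			a->success[a->nsuccess++] = i;
	}
	return (0);
}
```

If the open-context run also appended to the list, an entry for a bookmark destroyed before the syncing run would survive and be processed, which is the failure the commit message describes.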
* MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block | Andriy Gapon | 2017-08-08 | 3 | -28/+48
    illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278
    https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278
    https://www.illumos.org/issues/8378

    The problem is that zfs_get_data() supplies a stale zgd_bp to
    dmu_sync(), which we then nopwrite against. zfs_get_data() doesn't hold
    any DMU-related locks, so after it copies db_blkptr to zgd_bp,
    dbuf_write_ready() could change db_blkptr, and dbuf_write_done() could
    remove the dirty record. dmu_sync() then sees the stale BP and that the
    dbuf is not dirty, so it is eligible for nop-writing.

    The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring
    the db_mtx. We could still see a stale db_blkptr, but if it is stale
    then the dirty record will still exist and thus we won't attempt to
    nopwrite.

    Reviewed by: Prakash Surya <prakash.surya@delphix.com>
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Matthew Ahrens <mahrens@delphix.com>
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=322226
* MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz | Andriy Gapon | 2017-08-08 | 1 | -29/+29
    FreeBSD note: the essence of this change was committed to FreeBSD in
    r314274. This commit catches up with differences between what was
    committed to FreeBSD and what was committed to OpenZFS, mainly more
    logical variable names.

    illumos/illumos-gate@16a7e5ac116c85d965007a5f201104b564e82210
    https://github.com/illumos/illumos-gate/commit/16a7e5ac116c85d965007a5f201104b564e82210
    https://www.illumos.org/issues/7910

    It seems that the change in issue #6950 resurrected the problem that
    was earlier fixed by the change in issue #5219. Please also see the
    following FreeBSD bug report:
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178

    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Andriy Gapon <avg@FreeBSD.org>
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=322222
* o Replace __riscv__ with __riscv | Ruslan Bukin | 2017-08-07 | 3 | -3/+3
    o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

    This is required to support the new GCC 7.1 compiler.
    This is compatible with the current GCC 6.1 compiler.

    RISC-V is an extensible ISA, and the idea here is to have a built-in
    define for each extension, so together with __riscv we will have some
    subset of these as well (depending on the -march string passed to the
    compiler):

    __riscv_compressed
    __riscv_atomic
    __riscv_mul
    __riscv_div
    __riscv_muldiv
    __riscv_fdiv
    __riscv_fsqrt
    __riscv_float_abi_soft
    __riscv_float_abi_single
    __riscv_float_abi_double
    __riscv_cmodel_medlow
    __riscv_cmodel_medany
    __riscv_cmodel_pic
    __riscv_xlen

    Reviewed by: ngie
    Sponsored by: DARPA, AFRL
    Differential Revision: https://reviews.freebsd.org/D11901
    Notes: svn path=/head/; revision=322168
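A preprocessor test written in the new style could look like the sketch below. The DEMO_RISCV_BITS macro and demo_riscv_bits() function are hypothetical names introduced for illustration; only __riscv and __riscv_xlen come from the commit message.

```c
#include <assert.h>

/*
 * Old style:  #ifdef __riscv64
 * New style:  test __riscv for the architecture and __riscv_xlen for
 *             the register width.
 */
#if defined(__riscv) && (__riscv_xlen == 64)
#define	DEMO_RISCV_BITS	64
#elif defined(__riscv)
#define	DEMO_RISCV_BITS	__riscv_xlen	/* e.g. 32 on RV32 targets */
#else
#define	DEMO_RISCV_BITS	0		/* not a RISC-V target */
#endif

/* Returns the RISC-V register width in bits, or 0 on other targets. */
static int
demo_riscv_bits(void)
{
	int bits = DEMO_RISCV_BITS;

	assert(bits >= 0);
	return (bits);
}
```

On a non-RISC-V compiler neither macro is defined, so the last branch is taken and the function returns 0.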
* spa_import_rootpool should be able to handle an imported root pool | Andriy Gapon | 2017-07-25 | 1 | -0/+10
    That is required to support reboot -r with a new root filesystem being
    on an already imported pool.

    PR: 210721
    Reported by: Jan Bramkamp <crest_maintainer@rlwinm.de>
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=321471
* zfs: Fix a typo in the delay_min_dirty_percent sysctl description | Ed Maste | 2017-07-19 | 1 | -1/+1
    The description is FreeBSD-specific and was added in r266497 to fix
    PR189865.

    PR: 220825
    Submitted by: Fabian Keil
    Obtained from: ElectroBSD
    MFC after: 1 week
    Notes: svn path=/head/; revision=321218
* fix a regression in r320452, ZFS ABD import | Andriy Gapon | 2017-07-18 | 1 | -0/+8
    I overlooked the fact that the vdev_op_io_done hook is called even if
    the actual I/O is skipped, for example, in the case of a missing vdev.
    Arguably, this could be considered an issue in the zio pipeline engine,
    but for now I am adding defensive code to check for io_bp being NULL
    along with assertions that this happens only when it can really be
    expected.

    PR: 220691
    Reported by: peter, cy
    Tested by: cy
    MFC after: 1 week
    X-MFC with: r320156, r320452
    Notes: svn path=/head/; revision=321111
* Make ZFS not crash on mount on 32-bit systems | Justin Hibbits | 2017-07-18 | 1 | -1/+1
    ZPL_VERSION is unsigned long long, not an int. With this change, a
    zpool can be created on a 32-bit system (tested on powerpcspe) and
    mounted correctly.

    Reviewed by: allanjude
    Notes: svn path=/head/; revision=321104
* fix an architectural problem introduced in r320156, ZFS ABD import | Andriy Gapon | 2017-06-28 | 2 | -7/+14
    The implementation of ZFS refcount_t uses the emulated illumos mutex
    (the sx lock) and the waiting memory allocation when ZFS_DEBUG is
    enabled. This makes refcount_t unsuitable for use in GEOM g_up thread
    where sleeping is prohibited. When importing the ABD change I modified
    vdev_geom using illumos vdev_disk as an example. As a result, I added a
    call to abd_return_buf in vdev_geom_io_intr. The latter is called on
    g_up thread while the former uses refcount_t.

    This change fixes the problem by deferring the abd_return_buf call to
    the previously unused vdev_geom_io_done that is called on a ZFS zio
    taskqueue thread where sleeping is allowed. A side bonus of this change
    is that now a vdev zio has a pointer to its corresponding bio while the
    zio is active.

    Reported by: Shawn Webb <shawn.webb@hardenedbsd.org>
    Tested by: Shawn Webb <shawn.webb@hardenedbsd.org>
    MFC after: 1 week
    X-MFC with: r320156
    Notes: svn path=/head/; revision=320452
* zfs: port vdev_file part of illumos change 3306 | Andriy Gapon | 2017-06-26 | 3 | -27/+68
    3306 zdb should be able to issue reads in parallel
    illumos/illumos-gate/31d7e8fa33fae995f558673adb22641b5aa8b6e1
    https://www.illumos.org/issues/3306

    The upstream change was made before we started to import upstream
    commits individually. It was imported into the illumos vendor area as
    r242733. That commit was MFV-ed in r260138, but as the commit message
    says vdev_file.c was left intact.

    This commit actually implements the parallel I/O for vdev_file using a
    taskqueue with multiple threads. This implementation does not depend on
    the illumos or FreeBSD bio interface at all, but uses zio_t to pass
    around all the relevant data. So, the code looks a bit different from
    the upstream.

    This commit also incorporates ZoL commit
    zfsonlinux/zfs/bc25c9325b0e5ced897b9820dad239539d561ec9 that fixed
    https://github.com/zfsonlinux/zfs/issues/2270

    We need to use a dedicated taskqueue for exactly the same reason as ZoL
    as we do not implement TASKQ_DYNAMIC.

    Obtained from: illumos, ZFS on Linux
    MFC after: 2 weeks
    Notes: svn path=/head/; revision=320352
* fix gcc-specific fallout from r320156, MFV of r318946, ZFS ABD | Andriy Gapon | 2017-06-23 | 2 | -2/+2
    Reported by: jhibbits
    MFC after: 1 week
    X-MFC with: r320156
    Notes: svn path=/head/; revision=320262
* MFV r319950: 5220 L2ARC does not support devices that do not provide 512B access | Andriy Gapon | 2017-06-22 | 1 | -3/+8
    FreeBSD note: the actual change has been in FreeBSD since r297848. This
    commit accounts for integration of that change with subsequent changes,
    especially r320156 (MFV of r318946) and r314274.

    illumos/illumos-gate@403a8da73c64ff9dfb6230ba045c765a242213fb
    https://github.com/illumos/illumos-gate/commit/403a8da73c64ff9dfb6230ba045c765a242213fb
    https://www.illumos.org/issues/5220

    There are disk devices that have logical sector size larger than 512B,
    for example 4KB. That is, their physical sector size is larger than
    512B and they do not provide emulation for 512B sector sizes. For such
    devices both a data offset and a data size must be properly aligned.
    L2ARC should arrange that because it uses physical I/O.
    zio_vdev_io_start() performs a necessary transformation if io_size is
    not aligned to vdev_ashift, but that is done only for logical I/O.
    Something similar should be done in L2ARC code.

    * a temporary write buffer should be allocated if the original buffer
      is not going to be compressed and its size is not aligned
    * size of a temporary compression buffer should be ashift aligned
    * for the reads, if a size of a target buffer is not sufficiently large
      and it is not aligned then a temporary read buffer should be
      allocated

    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
    Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
    Author: Andriy Gapon <avg@FreeBSD.org>
    MFC after: 3 weeks
    Notes: svn path=/head/; revision=320239
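The alignment requirement in the message above comes down to rounding sizes up to a power-of-two boundary given by the device's ashift (the base-2 logarithm of its sector size). The helpers below are a minimal sketch with hypothetical names; they are not the L2ARC code.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of (1 << ashift). */
static uint64_t
demo_roundup_ashift(uint64_t size, unsigned ashift)
{
	assert(ashift < 64);
	uint64_t align = UINT64_C(1) << ashift;

	return ((size + align - 1) & ~(align - 1));
}

/* Nonzero if value is a multiple of (1 << ashift). */
static int
demo_is_aligned(uint64_t value, unsigned ashift)
{
	assert(ashift < 64);
	return ((value & ((UINT64_C(1) << ashift) - 1)) == 0);
}
```

For a 4KB-sector device (ashift 12), a 512-byte buffer is not aligned and would have to be written through a temporary 4096-byte buffer, while a 4096-byte buffer can be used as is.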
* MFV r319742: 8056 zfs send size estimate is inaccurate for some zvols | Andriy Gapon | 2017-06-22 | 1 | -2/+13
    illumos/illumos-gate@0255edcc85fc0cd1dda0e49bcd52eb66c06a1b16
    https://github.com/illumos/illumos-gate/commit/0255edcc85fc0cd1dda0e49bcd52eb66c06a1b16
    https://www.illumos.org/issues/8056

    The send size estimate for a zvol can be too low, if the size of the
    record headers (dmu_replay_record_t's) is a significant portion of the
    size. This is typically the case when the data is highly compressible,
    especially with embedded blocks.

    The problem is that dmu_adjust_send_estimate_for_indirects() assumes
    that blocks are the size of the "recordsize" property (128KB). However,
    for zvols, the blocks are the size of the "volblocksize" property
    (8KB). Therefore, we estimate that there will be 16x fewer record
    headers than there really will be.

    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
    Author: Paul Dagnelie <pcd@delphix.com>
    MFC after: 3 weeks
    Notes: svn path=/head/; revision=320238
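The 16x figure follows directly from the two block sizes. The helper below is a hypothetical illustration of the arithmetic (one record header per block), not the estimator in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks, and so record headers, needed to cover data_bytes. */
static uint64_t
demo_record_headers(uint64_t data_bytes, uint64_t block_bytes)
{
	assert(block_bytes != 0);
	return ((data_bytes + block_bytes - 1) / block_bytes);
}
```

For 1 GiB of data, assuming 128KB blocks gives 8192 headers, while 8KB blocks give 131072, which is 16 times as many.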