author Mark Johnston <markj@FreeBSD.org> 2021-11-20 16:21:25 +0000
committer Tony Hutter <hutter2@llnl.gov> 2022-03-04 23:37:41 +0000
commit b3427b18b1c694f9dbc4673a11a6cf57c7f5c5d8
tree 778ca15b4e32a99fec5a5e3d299882c0445c7d69
parent 0e2bb1a3ee395887a8e75f0273aca2b328a3f3cd
zfs: Fix a deadlock between page busy and the teardown lock

When rolling back a dataset, ZFS has to purge file data resident in the
system page cache. To do this, it loops over all vnodes for the
mountpoint and calls vn_pages_remove() to purge pages associated with
the vnode's VM object. Each page is thus exclusively busied while the
dataset's teardown write lock is held.

When handling a page fault on a mapped ZFS file, FreeBSD's page fault
handler busies newly allocated pages and then uses VOP_GETPAGES to fill
them. The ZFS getpages VOP acquires the teardown read lock with vnode
pages already busied. This represents a lock order reversal which can
lead to deadlock.

To break the deadlock, observe that zfs_rezget() need only purge those
pages marked valid, and that pages busied by the page fault handler are,
by definition, invalid. Furthermore, ZFS pages always transition from
invalid to valid with the teardown lock held, and ZFS never creates
partially valid pages. Thus, zfs_rezget() can use the new
vn_pages_remove_valid() to skip over pages busied by the fault handler.

PR: 258208
Tested by: pho
Reviewed by: avg, sef, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32931

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828

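
The reasoning in the commit message can be illustrated with a small userland model. This is only a sketch and not FreeBSD kernel code: struct page, purge_all() and purge_valid() are hypothetical stand-ins for a VM page, vn_pages_remove() and vn_pages_remove_valid(), and "blocking" is modelled by returning the index of the page the purge would have to wait on.

```c
/*
 * Userland model only, NOT kernel code. All names here are
 * hypothetical stand-ins chosen for illustration.
 */
#include <stdbool.h>
#include <stddef.h>

struct page {
	bool present;	/* page is resident in the VM object */
	bool valid;	/* page contents have been filled in */
	bool busy;	/* page is exclusively busied by another thread */
};

/*
 * Models the old purge: every resident page must be busied before it
 * can be removed. Returns the index of the first page the caller would
 * sleep on (while holding the teardown write lock), or -1 if the purge
 * runs to completion.
 */
int
purge_all(struct page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].present)
			continue;
		if (pages[i].busy)
			return ((int)i);
		pages[i].present = false;
	}
	return (-1);
}

/*
 * Models the new purge: invalid pages are skipped without being
 * busied. A page busied by the fault handler is still invalid, so the
 * purge never waits on it.
 */
int
purge_valid(struct page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].present || !pages[i].valid)
			continue;
		if (pages[i].busy)
			return ((int)i);
		pages[i].present = false;
	}
	return (-1);
}
```

With pages { valid, invalid and busy, valid }, purge_all() stops at index 1, which in the kernel corresponds to sleeping on a page whose owner is itself waiting for the teardown read lock. purge_valid() skips that page and removes the other two.
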
-rw-r--r-- module/os/freebsd/zfs/zfs_znode.c 9
1 files changed, 9 insertions, 0 deletions
diff --git a/module/os/freebsd/zfs/zfs_znode.c b/module/os/freebsd/zfs/zfs_znode.c
index 85f71f99ea16..317a35eefd0e 100644
--- a/module/os/freebsd/zfs/zfs_znode.c
+++ b/module/os/freebsd/zfs/zfs_znode.c
@@ -1083,9 +1083,18 @@ zfs_rezget(znode_t *zp)
 	 * the vnode in case of error, but currently we cannot do that
 	 * because of the LOR between the vnode lock and z_teardown_lock.
 	 * So, instead, we have to "doom" the znode in the illumos style.
+	 *
+	 * Ignore invalid pages during the scan. This is to avoid deadlocks
+	 * between page busying and the teardown lock, as pages are busied prior
+	 * to a VOP_GETPAGES operation, which acquires the teardown read lock.
+	 * Such pages will be invalid and can safely be skipped here.
 	 */
 	vp = ZTOV(zp);
+#if __FreeBSD_version >= 1400042
+	vn_pages_remove_valid(vp, 0, 0);
+#else
 	vn_pages_remove(vp, 0, 0);
+#endif
 	ZFS_OBJ_HOLD_ENTER(zfsvfs, obj_num);