From 8694fd333556addb97acfff1feca6a1e389201ce Mon Sep 17 00:00:00 2001
From: Mark Johnston
Date: Sat, 24 Sep 2022 09:18:04 -0400
Subject: smr: Fix synchronization in smr_enter()

smr_enter() must publish its observed read sequence number before
issuing any subsequent memory operations.  The ordering provided by
atomic_add_acq_int() is insufficient on some platforms, at least on
arm64, because it permits reordering of subsequent loads with the store
to c_seq.

Thus, use atomic_thread_fence_seq_cst() to issue a store-load barrier
after publishing the read sequence number.

On x86, take advantage of the fact that memory operations are not
reordered with locked instructions to improve code density: we can store
the observed read sequence and provide a store-load barrier with a
single operation.

Based on a patch from Pierre Habouzit.

PR:		265974
Reviewed by:	alc
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36370
---
 sys/sys/smr.h | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/sys/sys/smr.h b/sys/sys/smr.h
index c110be9a66c2..1319e2bf465b 100644
--- a/sys/sys/smr.h
+++ b/sys/sys/smr.h
@@ -122,8 +122,12 @@ smr_enter(smr_t smr)
 	 * Frees that are newer than this stored value will be
 	 * deferred until we call smr_exit().
 	 *
-	 * An acquire barrier is used to synchronize with smr_exit()
-	 * and smr_poll().
+	 * Subsequent loads must not be re-ordered with the store.  On
+	 * x86 platforms, any locked instruction will provide this
+	 * guarantee, so as an optimization we use a single operation to
+	 * both store the cached write sequence number and provide the
+	 * requisite barrier, taking advantage of the fact that
+	 * SMR_SEQ_INVALID is zero.
 	 *
 	 * It is possible that a long delay between loading the wr_seq
 	 * and storing the c_seq could create a situation where the
@@ -132,8 +136,12 @@ smr_enter(smr_t smr)
 	 * the load.  See smr_poll() for details on how this condition
 	 * is detected and handled there.
 	 */
-	/* This is an add because we do not have atomic_store_acq_int */
+#if defined(__amd64__) || defined(__i386__)
 	atomic_add_acq_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+#else
+	atomic_store_int(&smr->c_seq, smr_shared_current(smr->c_shared));
+	atomic_thread_fence_seq_cst();
+#endif
 }
 
 /*
-- 
cgit v1.2.3
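
For illustration only, the two code paths in the hunk above can be
approximated in userland with C11 <stdatomic.h>.  This is a sketch under
stated assumptions, not the kernel code: struct reader, wr_seq,
SEQ_INVALID, reader_enter_fence(), reader_enter_rmw() and reader_exit()
are hypothetical names, and the C11 calls merely stand in for
atomic_store_int(), atomic_thread_fence_seq_cst() and
atomic_add_acq_int().  The read-modify-write variant gets its store-load
ordering from the locked instruction it compiles to on x86; the C11
memory model alone does not promise that for later relaxed accesses.

```c
#include <assert.h>
#include <stdatomic.h>

#define SEQ_INVALID 0u		/* hypothetical stand-in for SMR_SEQ_INVALID */

struct reader {			/* hypothetical per-CPU reader state */
	atomic_uint c_seq;	/* sequence number published by the reader */
};

static atomic_uint wr_seq = 1;	/* hypothetical global write sequence */

/* Portable path: plain store, then a full (store-load) fence. */
static void
reader_enter_fence(struct reader *r)
{
	unsigned seq = atomic_load_explicit(&wr_seq, memory_order_acquire);

	assert(atomic_load_explicit(&r->c_seq, memory_order_relaxed) ==
	    SEQ_INVALID);
	atomic_store_explicit(&r->c_seq, seq, memory_order_relaxed);
	/* Later loads may not be reordered before the store above. */
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * x86-style path: because c_seq is SEQ_INVALID (zero) on entry, adding
 * seq stores seq.  On x86 the add is a locked instruction, which also
 * acts as the barrier.
 */
static void
reader_enter_rmw(struct reader *r)
{
	unsigned seq = atomic_load_explicit(&wr_seq, memory_order_acquire);

	assert(atomic_load_explicit(&r->c_seq, memory_order_relaxed) ==
	    SEQ_INVALID);
	atomic_fetch_add_explicit(&r->c_seq, seq, memory_order_seq_cst);
}

/* Leave the section: reset c_seq so the next add-based entry works. */
static void
reader_exit(struct reader *r)
{
	atomic_store_explicit(&r->c_seq, SEQ_INVALID, memory_order_release);
}
```

Both entry functions leave the same value in c_seq; they differ only in
how the store-load barrier is obtained, which is the distinction the
#if/#else in the patch draws.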