path: root/sys/kern/subr_smr.c
Commit message | Author | Age | Files | Lines
* Use COUNTER_U64_DEFINE_EARLY() in places where it simplifies things.
  Author: Mark Johnston | Age: 2020-03-06 | Files: 1 | Lines: -17/+5

  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D23978
  Notes: svn path=/head/; revision=358716
* Simplify lazy advance with a 64-bit atomic cmpset.
  Author: Jeff Roberson | Age: 2020-02-27 | Files: 1 | Lines: -56/+30

  This provides the potential to force a lazy (tick based) SMR to advance
  when there are blocking waiters by decoupling the wr_seq value from the
  ticks value.

  Add some missing compiler barriers.

  Reviewed by: rlibby
  Differential Revision: https://reviews.freebsd.org/D23825
  Notes: svn path=/head/; revision=358400
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
  Author: Pawel Biernacki | Age: 2020-02-26 | Files: 1 | Lines: -1/+2

  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
  still not MPSAFE (or already are but aren't properly marked). Use it in
  preparation for a general review of all nodes.

  This is a non-functional change that adds annotations to SYSCTL_NODE and
  SYSCTL_PROC nodes using one of the soon-to-be-required flags.

  Mark all obvious cases as MPSAFE. All entries that haven't been marked as
  MPSAFE before are by default marked as NEEDGIANT.

  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* Add an atomic-free tick moderated lazy update variant of SMR.
  Author: Jeff Roberson | Age: 2020-02-22 | Files: 1 | Lines: -133/+339

  This enables very cheap read sections with free-to-use latencies and
  memory overhead similar to epoch. On a recent AMD platform a read section
  cost 1 ns vs 5 ns for the default SMR. On Xeon the numbers should be more
  like 1 ns vs 11 ns.

  The memory consumption should be proportional to the product of the free
  rate and 2*1/hz, while normal SMR consumption is proportional to the
  product of free rate and maximum read section time.

  While here refactor the code to make future additions more
  straightforward.

  Name the overall technique Global Unbound Sequences (GUS) and adjust some
  comments accordingly. This helps distinguish discussions of the general
  technique (SMR) vs this specific implementation (GUS).

  Discussed with: rlibby, markj
  Notes: svn path=/head/; revision=358236
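The two memory estimates can be turned into a small worked calculation. This is a rough sketch that takes the proportionality constant as 1; the function names and the sample rates used with it are illustrative assumptions, not measurements from the commit.

```c
#include <stdint.h>

/*
 * Lazy variant: objects waiting for reuse ~ free rate * 2 * (1 / hz),
 * i.e. roughly two clock ticks' worth of frees.
 */
static uint64_t
lazy_backlog(uint64_t frees_per_sec, uint64_t hz)
{
	return (frees_per_sec * 2 / hz);
}

/*
 * Default SMR: objects waiting for reuse ~ free rate * longest read
 * section, with the read section length given in nanoseconds.
 */
static uint64_t
default_backlog(uint64_t frees_per_sec, uint64_t max_read_ns)
{
	return (frees_per_sec * max_read_ns / 1000000000ULL);
}
```

For example, at an assumed 3,000,000 frees per second and hz = 1000, `lazy_backlog()` gives 6000 objects, while `default_backlog()` with an assumed 10 microsecond maximum read section gives 30. The lazy variant trades a larger backlog for cheaper read sections.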
* Since r357804 pcpu zones are required to use zalloc_pcpu(). Prior to this
  it was only required if you were zeroing. Switch to these interfaces.
  Author: Jeff Roberson | Age: 2020-02-13 | Files: 1 | Lines: -2/+2

  Reviewed by: mjg
  Notes: svn path=/head/; revision=357884
* Add more precise SMR entry asserts.
  Author: Jeff Roberson | Age: 2020-02-13 | Files: 1 | Lines: -4/+5

  Notes: svn path=/head/; revision=357882
* Fix a race in smr_advance() that could result in unnecessary poll calls.
  Author: Jeff Roberson | Age: 2020-02-06 | Files: 1 | Lines: -4/+10

  This was relatively harmless but surprising to see in counters. The race
  occurred when rd_seq was read after the goal was updated and we
  incorrectly calculated the delta between them.

  Reviewed by: rlibby
  Differential Revision: https://reviews.freebsd.org/D23464
  Notes: svn path=/head/; revision=357641
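One reading of the fix is that the read sequence must be sampled before the write sequence is bumped, so the computed delta can never go negative and wrap. The following userspace C11 sketch illustrates that ordering only. The struct, the function `advance_sketch`, and the threshold `SEQ_MAX_DELTA` are hypothetical and are not taken from the kernel source.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_INCR	2u	/* assumed step between sequence values */
#define SEQ_MAX_DELTA	8u	/* hypothetical "too far ahead" threshold */

struct shared_sketch {
	_Atomic uint32_t wr_seq;	/* last sequence handed to writers */
	_Atomic uint32_t rd_seq;	/* sequence all readers have passed */
};

/*
 * Returns the new goal and reports whether the writer should poll.
 * rd_seq is loaded first: at that instant rd_seq <= wr_seq, so
 * goal - rd is a small positive number.  Loading it after the
 * increment would let a concurrent poller move rd_seq past the goal,
 * and the unsigned difference would wrap to a huge value.
 */
static uint32_t
advance_sketch(struct shared_sketch *s, bool *need_poll)
{
	uint32_t rd, goal;

	rd = atomic_load_explicit(&s->rd_seq, memory_order_acquire);
	goal = atomic_fetch_add_explicit(&s->wr_seq, SEQ_INCR,
	    memory_order_acq_rel) + SEQ_INCR;
	*need_poll = (goal - rd) > SEQ_MAX_DELTA;
	return (goal);
}
```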
* Add some global counters for SMR. These may eventually become per-smr
  counters.
  Author: Jeff Roberson | Age: 2020-02-06 | Files: 1 | Lines: -3/+32

  In my stress test there is only one poll for every 15,000 frees. This
  means we are effectively amortizing the cache coherency overhead even
  with very high write rates (3M/s/core).

  Reviewed by: markj, rlibby
  Differential Revision: https://reviews.freebsd.org/D23463
  Notes: svn path=/head/; revision=357637
* Implement a deferred write advancement feature that can be used to further
  amortize shared cacheline writes.
  Author: Jeff Roberson | Age: 2020-02-04 | Files: 1 | Lines: -0/+31

  Discussed with: rlibby
  Differential Revision: https://reviews.freebsd.org/D23462
  Notes: svn path=/head/; revision=357487
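Deferred advancement can be pictured as a per-CPU counter that lets only every Nth call write to the shared sequence. The sketch below is a userspace approximation under that assumption. `struct pcpu_sketch`, `advance_deferred_sketch`, and `SEQ_INCR` are hypothetical names, and a real kernel version would also need to prevent migration between CPUs while touching the per-CPU counter.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SEQ_INCR 2u	/* assumed step between sequence values */

/* Hypothetical per-CPU (here: per-caller) state. */
struct pcpu_sketch {
	int deferred;	/* calls since the last real advance */
};

/*
 * Return a goal sequence for a freed object.  Only every 'limit'-th
 * call writes the shared cacheline; the other calls read it and
 * return the value the next real advance will produce.
 */
static uint32_t
advance_deferred_sketch(struct pcpu_sketch *pc, _Atomic uint32_t *wr_seq,
    int limit)
{
	if (++pc->deferred < limit) {
		/* Cheap path: a shared read, no shared write. */
		return (atomic_load_explicit(wr_seq,
		    memory_order_acquire) + SEQ_INCR);
	}
	pc->deferred = 0;
	/* Expensive path: actually bump the shared write sequence. */
	return (atomic_fetch_add_explicit(wr_seq, SEQ_INCR,
	    memory_order_acq_rel) + SEQ_INCR);
}
```

The goal returned on the cheap path only becomes reachable once some caller performs a real advance, so a larger `limit` saves more shared writes at the cost of a longer free-to-use latency.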
* Add two missing fences with comments describing them. These were found by
  inspection and after a lengthy discussion with jhb and kib. They have not
  produced test failures.
  Author: Jeff Roberson | Age: 2020-01-31 | Files: 1 | Lines: -3/+14

  Don't pointer chase through cpu0's smr. Use the CPU-correct smr even when
  not in a critical section to reduce the likelihood of false sharing.

  Notes: svn path=/head/; revision=357355
* Don't use "All rights reserved" in new copyrights.
  Author: Jeff Roberson | Age: 2020-01-31 | Files: 1 | Lines: -2/+1

  Requested by: rgrimes
  Notes: svn path=/head/; revision=357316
* Implement a safe memory reclamation feature that is tightly coupled with UMA.
  Author: Jeff Roberson | Age: 2020-01-31 | Files: 1 | Lines: -0/+387

  This is in the same family of algorithms as Epoch/QSBR/RCU/PARSEC but is
  a unique algorithm. This has 3x the performance of epoch in a write heavy
  workload with less than half of the read side cost. The memory overhead
  is significantly lessened by limiting the free-to-use latency. A
  synthetic test uses 1/20th of the memory vs Epoch. There is significant
  further discussion in the comments and code review.

  This code should be considered experimental. I will write a man page
  after it has settled. After further validation the VM will begin using
  this feature to permit lockless page lookups.

  Both markj and cperciva tested on arm64 at large core counts to verify
  fences on weaker ordering architectures. I will commit a stress testing
  tool in a follow-up.

  Reviewed by: mmacy, markj, rlibby, hselasky
  Discussed with: sbahara
  Differential Revision: https://reviews.freebsd.org/D22586
  Notes: svn path=/head/; revision=357314
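For readers new to this family of algorithms, here is a minimal userspace C11 sketch of sequence-based reclamation in general. It is not the code in subr_smr.c: the fixed reader array, every `sketch_` name, and the constants are assumptions made for illustration. Readers publish the write sequence they observed on entry, a writer tags each freed object with a goal sequence, and the object may be reused once no active reader holds a sequence older than that goal.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NREADERS	4	/* hypothetical fixed reader count */
#define SEQ_INVALID	0u	/* reader is not in a read section */
#define SEQ_INCR	2u	/* odd start + even step never hits 0 */

struct sketch_smr {
	_Atomic uint32_t wr_seq;
	_Atomic uint32_t reader_seq[NREADERS];
};

static void
sketch_init(struct sketch_smr *s)
{
	atomic_init(&s->wr_seq, 1);
	for (int i = 0; i < NREADERS; i++)
		atomic_init(&s->reader_seq[i], SEQ_INVALID);
}

/* Enter a read section: publish the write sequence currently visible. */
static void
sketch_enter(struct sketch_smr *s, int id)
{
	atomic_store(&s->reader_seq[id], atomic_load(&s->wr_seq));
}

/* Leave the read section: this reader no longer blocks any goal. */
static void
sketch_exit(struct sketch_smr *s, int id)
{
	atomic_store_explicit(&s->reader_seq[id], SEQ_INVALID,
	    memory_order_release);
}

/* Writer: call after unlinking an object; returns its goal sequence. */
static uint32_t
sketch_advance(struct sketch_smr *s)
{
	return (atomic_fetch_add(&s->wr_seq, SEQ_INCR) + SEQ_INCR);
}

/* True once every active reader entered at or after 'goal'. */
static bool
sketch_poll(struct sketch_smr *s, uint32_t goal)
{
	for (int i = 0; i < NREADERS; i++) {
		uint32_t seq = atomic_load(&s->reader_seq[i]);

		/* Signed difference keeps the test wrap-safe. */
		if (seq != SEQ_INVALID && (int32_t)(seq - goal) < 0)
			return (false);
	}
	return (true);
}
```

A reader that entered before `sketch_advance()` recorded an older sequence and therefore holds up `sketch_poll()`. A reader that enters afterwards records a sequence at or past the goal and can no longer find the unlinked object, so it does not need to be waited for.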