| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Historically we have not distinguished between kernel wirings and user
wirings for accounting purposes. User wirings (via mlock(2)) were
subject to a global limit on the number of wired pages, so if large
swaths of physical memory were wired by the kernel, as happens with
the ZFS ARC among other things, the limit could be exceeded, causing
user wirings to fail.
The change adds a new counter, v_user_wire_count, which counts the
number of virtual pages wired by user processes via mlock(2) and
mlockall(2). Only user-wired pages are subject to the system-wide
limit which helps provide some safety against deadlocks. In
particular, while sources of kernel wirings typically support some
backpressure mechanism, there is no way to reclaim user-wired pages
shorting of killing the wiring process. The limit is exported as
vm.max_user_wired, renamed from vm.max_wired, and changed from u_int
to u_long.
The choice to count virtual user-wired pages rather than physical
pages was done for simplicity. There are mechanisms that can cause
user-wired mappings to be destroyed while maintaining a wiring of
the backing physical page; these make it difficult to accurately
track user wirings at the physical page layer.
The change also closes some holes which allowed user wirings to succeed
even when they would cause the system limit to be exceeded. For
instance, mmap() may now fail with ENOMEM in a process that has called
mlockall(MCL_FUTURE) if the new mapping would cause the user wiring
limit to be exceeded.
Note that bhyve -S is subject to the user wiring limit, which defaults
to 1/3 of physical RAM. Users that wish to exceed the limit must tune
vm.max_user_wired.
Reviewed by: kib, ngie (mlock() test changes)
Tested by: pho (earlier version)
MFC after: 45 days
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D19908
Notes:
svn path=/head/; revision=347532
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use these predicates instead of inline references to vm_min_domains.
Also add a global all_domains set, akin to all_cpus.
Reviewed by: alc, jeff, kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17278
Notes:
svn path=/head/; revision=338919
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
other allowed domains if the requested domain is below the minimum paging
threshold. Block in fork only if all domains available to the forking
thread are below the severe threshold rather than any.
Submitted by: jeff
Reported by: mjg
Reviewed by: alc, kib, markj
Approved by: re (rgrimes)
Differential Revision: https://reviews.freebsd.org/D16191
Notes:
svn path=/head/; revision=338507
|
|
|
|
|
|
|
|
|
| |
While here whack unused locking keys for the struct.
Discussed with: jeff
Notes:
svn path=/head/; revision=333051
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
significant source of cache line contention from vm_page_alloc(). Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.
Reviewed by: markj
Discussed with: kib, glebius
Tested by: pho (earlier version)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14273
Notes:
svn path=/head/; revision=329187
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
global to per-domain state. Protect reservations with the free lock
from the domain that they belong to. Refactor to make vm domains more
of a first class object.
Reviewed by: markj, kib, gallatin
Tested by: pho
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14000
Notes:
svn path=/head/; revision=328954
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- The process stats are actually thread counts rather than process
counts.
- Simplify various descriptions to remove mention of stats that are
updated every 5 seconds (all VM related stats are now "instant",
only the load average is updated every 5 seconds).
- Don't make any mention of special treatment for processes that have
been active in the last 20 seconds. We don't track that stat.
- Rework the description of active virtual memory. Call it mapped
virtual memory and explicitly point out it is not the same as the
active page queue (which corresponds to "Active" in top(1)), and
also hint at the possible bogusness of the value (e.g. if a process
maps a single page out of a multiple GB file, the entire file's size
is considered mapped).
- Simplify a few descriptions that implied their output was a value
per interval. All of the "rate" values are per-second rates scaled
across the interval.
- Update a few comments for 'struct vmtotal' along similar lines.
Reported by: mwlucas (indirectly)
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D13905
Notes:
svn path=/head/; revision=328134
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Notes:
svn path=/head/; revision=326023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hardware sizes.
32bit counters already overflow on approachable virtual memory page
counts, and soon would overflow on the physical pages counts as well.
Bump sizes to 64bit types. Bump __FreeBSD_version.
It is impossible to provide perfect backward ABI compat for this
change. If a program requests an old structure, it can be detected by
size. But if it queries the size first by passing NULL old req
pointer, there is almost nothing we can do to detect the desired ABI.
As a partial solution, check p_osrel of the quering process when
selecting the size to report.
Submitted by: Pawel Biernacki <pawel.biernacki@gmail.com>
Differential revision: https://reviews.freebsd.org/D13018
Notes:
svn path=/head/; revision=325852
|
|
|
|
|
|
|
| |
Reported by: alc
Notes:
svn path=/head/; revision=324614
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variable is modified with the highly contended page free queue lock.
It unnecessarily shares a cacheline with purely read-only fields and is
re-read after the lock is dropped in the page allocation code making the
hold time longer.
Pad the variable just like the others and store the value as found with
the lock held instead of re-reading.
Provides a modest 1%-ish speed up in concurrent page faults.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D12665
Notes:
svn path=/head/; revision=324610
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to the change they were subject to extreme false sharing.
In particular this change shaves about 3 seconds real time of -j 80 buildkernel.
Reviewed by: alc, markj
Differential Revision: https://reviews.freebsd.org/D12281
Notes:
svn path=/head/; revision=323393
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.
Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.
This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html
Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156
Notes:
svn path=/head/; revision=317061
|
|
|
|
|
|
|
|
|
|
|
|
| |
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
Notes:
svn path=/head/; revision=314436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
longer used. More precisely, they are always zero because the code that
decremented and incremented them no longer exists.
Bump __FreeBSD_version to mark this change.
Reviewed by: kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8583
Notes:
svn path=/head/; revision=309017
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pages, specificially, dirty pages that have passed once through the inactive
queue. A new, dedicated thread is responsible for both deciding when to
launder pages and actually laundering them. The new policy uses the
relative sizes of the inactive and laundry queues to determine whether to
launder pages at a given point in time. In general, this leads to more
intelligent swapping behavior, since the laundry thread will avoid pageouts
when the marginal benefit of doing so is low. Previously, without a
dedicated queue for dirty pages, the page daemon didn't have the information
to determine whether pageout provides any benefit to the system. Thus, the
previous policy often resulted in small but steadily increasing amounts of
swap usage when the system is under memory pressure, even when the inactive
queue consisted mostly of clean pages. This change addresses that issue,
and also paves the way for some future virtual memory system improvements by
removing the last source of object-cached clean pages, i.e., PG_CACHE pages.
The new laundry thread sleeps while waiting for a request from the page
daemon thread(s). A request is raised by setting the variable
vm_laundry_request and waking the laundry thread. We request launderings
for two reasons: to try and balance the inactive and laundry queue sizes
("background laundering"), and to quickly make up for a shortage of free
pages and clean inactive pages ("shortfall laundering"). When background
laundering is requested, the laundry thread computes the number of page
daemon wakeups that have taken place since the last laundering. If this
number is large enough relative to the ratio of the laundry and (global)
inactive queue sizes, we will launder vm_background_launder_target pages at
vm_background_launder_rate KB/s. Otherwise, the laundry thread goes back
to sleep without doing any work. When scanning the laundry queue during
background laundering, reactivated pages are counted towards the laundry
thread's target.
In contrast, shortfall laundering is requested when an inactive queue scan
fails to meet its target. In this case, the laundry thread attempts to
launder enough pages to meet v_free_target within 0.5s, which is the
inactive queue scan period.
A laundry request can be latched while another is currently being
serviced. In particular, a shortfall request will immediately preempt a
background laundering.
This change also redefines the meaning of vm_cnt.v_reactivated and removes
the functions vm_page_cache() and vm_page_try_to_cache(). The new meaning
of vm_cnt.v_reactivated now better reflects its name. It represents the
number of inactive or laundry pages that are returned to the active queue
on account of a reference.
In collaboration with: markj
Reviewed by: kib
Tested by: pho
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8302
Notes:
svn path=/head/; revision=308474
|
|
|
|
|
|
|
|
|
|
|
| |
It's a threshold for v_free_count, which is of type u_int. This also lets
us get rid of a cast in vm_paging_needed().
Reviewed by: alc
MFC after: 1 week
Notes:
svn path=/head/; revision=303052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
indicate that threads are waiting for free pages to become available and
(2) to indicate whether a wakeup call has been sent to the page daemon.
The trouble is that a single flag cannot really serve both purposes, because
we have two distinct targets for when to wakeup threads waiting for free
pages versus when the page daemon has completed its work. In particular,
the flag will be cleared by vm_page_free() before the page daemon has met
its target, and this can lead to the OOM killer being invoked prematurely.
To address this problem, a new flag "vm_pageout_wanted" is introduced.
Discussed with: jeff
Reviewed by: kib, markj
Tested by: markj
Sponsored by: EMC / Isilon Storage Division
Notes:
svn path=/head/; revision=300865
|
|
|
|
|
|
|
|
| |
Discussed with: alc, kib
MFC after: 1 week
Notes:
svn path=/head/; revision=300261
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Frankly, it doesn't make sense for vm_pageout_wakeup_thresh to have a negative
value (it is only ever set to a fraction of v_free_min, which is unsigned and
also obviously non-negative). But I'm not going to try and convert every
non-negative scalar in the VM to unsigned today, so just cast it for the
comparison.
Submitted by: Clang 3.3
Sponsored by: EMC / Isilon Storage Division
Notes:
svn path=/head/; revision=300220
|
|
|
|
| |
Notes:
svn path=/head/; revision=300213
|
|
|
|
|
|
|
|
|
|
| |
no effect.
Reviewed by: alc
Sponsored by: EMC / Isilon Storage Division
Notes:
svn path=/head/; revision=287640
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@
Sponsored by: EMC / Isilon Storage Division
Notes:
svn path=/head/; revision=263620
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
maintaining better LRU of active pages.
- Change v_free_target to include the quantity previously represented by
v_cache_min so we don't need to add them together everywhere we use them.
- Add a pageout_wakeup_thresh that sets the free page count trigger for
waking the page daemon. Set this 10% above v_free_min so we wakeup before
any phase transitions in vm users.
- Adjust down v_free_target now that we're willing to accept more pagedaemon
wakeups. This means we process fewer pages in one iteration as well,
leading to shorter lock hold times and less overall disruption.
- Eliminate vm_pageout_page_stats(). This was a minor variation on the
PQ_ACTIVE segment of the normal pageout daemon. Instead we now process
1 / vm_pageout_update_period pages every second. This causes us to visit
the whole active list every 60 seconds. Previously we would only maintain
the active LRU when we were short on pages which would mean it could be
woefully out of date.
Reviewed by: alc (slight variant of this)
Discussed with: alc, kib, jhb
Sponsored by: EMC / Isilon Storage Division
Notes:
svn path=/head/; revision=254304
|
|
|
|
|
|
|
|
| |
Reviewed by: alc
MFC after: 2 weeks
Notes:
svn path=/head/; revision=246032
|
|
|
|
|
|
|
|
|
| |
active and inactive paging queues.
Reviewed by: kib
Notes:
svn path=/head/; revision=242941
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inactive queue, unless busy page is found.
Dropping the mutex often should allow the other lock acquires to
proceed without waiting for whole inactive scan to finish. On machines
with lot of physical memory scan often need to iterate a lot before it
finishes or finds a page which requires laundring, causing high
latency for other lock waiters.
Suggested and reviewed by: alc
MFC after: 3 weeks
Notes:
svn path=/head/; revision=238212
|
|
|
|
|
|
|
|
|
|
| |
Update the outdated comments describing MAXSLP and the process
selection algorithm for swap out.
Comments wording and reviewed by: alc
Notes:
svn path=/head/; revision=217192
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vm_page_try_to_free(). Consequently, push down the page queues lock into
pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and
pmap_remove_write().
Push down the page queues lock into Xen's pmap_page_is_mapped(). (I
overlooked the Xen pmap in r207702.)
Switch to a per-processor counter for the total number of pages cached.
Notes:
svn path=/head/; revision=207796
|
|
|
|
|
|
|
|
| |
Switch to a per-processor counter for the number of pages freed during
process termination.
Notes:
svn path=/head/; revision=207739
|
|
|
|
| |
Notes:
svn path=/head/; revision=180622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ways:
(1) Cached pages are no longer kept in the object's resident page
splay tree and memq. Instead, they are kept in a separate per-object
splay tree of cached pages. However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock. Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.
This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held. Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.
Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case. Cached pages
are reclaimed far, far more often than they are reactivated. Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.
(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.
Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated. Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page. Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.
Discussed with: many over the course of the summer, including jeff@,
Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
Notes:
svn path=/head/; revision=172317
|
|
|
|
|
|
|
|
|
| |
reporting the value of this counter in the program "vmstat".
Approved by: re (rwatson)
Notes:
svn path=/head/; revision=171633
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular:
- Add an explicative table for locking of struct vmmeter members
- Apply new rules for some of those members
- Remove some unuseful comments
Heavily reviewed by: alc, bde, jeff
Approved by: jeff (mentor)
Notes:
svn path=/head/; revision=170517
|
|
|
|
|
|
|
|
|
|
|
| |
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.
Requested by: alc
Approved by: jeff (mentor)
Notes:
svn path=/head/; revision=170170
|
|
|
|
|
|
|
|
| |
Suggested by: julian@
Contributed by: attilio@
Notes:
svn path=/head/; revision=169805
|
|
|
|
|
|
|
|
|
|
|
| |
vmcnts. This can be used to abstract away pcpu details but also changes
to use atomics for all counters now. This means sched lock is no longer
responsible for protecting counts in the switch routines.
Contributed by: Attilio Rao <attilio@FreeBSD.org>
Notes:
svn path=/head/; revision=169667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
them unsigned I made the possible overflows hard to detect,
and it only saved 1 bit which isn't principal, even less now
that the underlying issue with the total of virtual memory has
been fixed. (For the record, it will overflow with >=2T of
VM total, with 32-bit ints used to keep counters in pages.)
- While here, fix printing of other "struct vmtotal" members
such as t_rq, t_dw, t_pw, and t_sw as they are also signed.
Reviewed by: bde
MFC after: 3 days
Notes:
svn path=/head/; revision=164718
|
|
|
|
|
|
|
|
|
|
| |
- Fix overflow bugs in sysctl(8), systat(1), and vmstat(8)
when printing values of "struct vmmeter" in kilobytes as
they don't necessarily fit into 32 bits. (Fix sysctl(8)
reporting of a total virtual memory; it's in pages too.)
Notes:
svn path=/head/; revision=164443
|
|
|
|
| |
Notes:
svn path=/head/; revision=130239
|
|
|
|
|
|
|
|
|
| |
per letter dated July 22, 1999.
Approved by: core
Notes:
svn path=/head/; revision=127976
|
|
|
|
|
|
|
|
|
| |
than a positive number.
- In pagedaemon_wakeup(), set vm_pages_needed to 1 rather than
incrementing it to accomplish the same.
Notes:
svn path=/head/; revision=110225
|
|
|
|
|
|
|
| |
the structure definition easier to find using grep).
Notes:
svn path=/head/; revision=97691
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vm.stats.vm.v_forks
vm.stats.vm.v_vforks
vm.stats.vm.v_rforks
vm.stats.vm.v_kthreads
vm.stats.vm.v_forkpages
vm.stats.vm.v_vforkpages
vm.stats.vm.v_rforkpages
vm.stats.vm.v_kthreadpages
Submitted by: Paul Herman <pherman@frenchfries.net>
Reviewed by: alfred
Notes:
svn path=/head/; revision=71429
|
|
|
|
|
|
|
|
|
| |
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
Notes:
svn path=/head/; revision=55205
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace various VM related page count calculations strewn over the
VM code with inlines to aid in readability and to reduce fragility
in the code where modules depend on the same test being performed
to properly sleep and wakeup.
Split out a portion of the page deactivation code into an inline
in vm_page.c to support vm_page_dontneed().
add vm_page_dontneed(), which handles the madvise MADV_DONTNEED
feature in a related commit coming up for vm_map.c/vm_object.c. This
code prevents degenerate cases where an essentially active page may
be rotated through a subset of the paging lists, resulting in premature
disposal.
Notes:
svn path=/head/; revision=51337
|
|
|
|
| |
Notes:
svn path=/head/; revision=50477
|
|
|
|
|
|
|
| |
Revert the comment for v_ozfod now that vm_fault is fixed.
Notes:
svn path=/head/; revision=44251
|
|
|
|
|
|
|
|
|
|
| |
zero-fill levels.
Adjust comment for ozfod in vmmeter.h - this counter represents
non-optimal ( on the fly ) zero fills, not prefills.
Notes:
svn path=/head/; revision=43758
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
put alot of it's context into a data structure. This allows
significant shortening of its codepath, and will significantly
decrease it's cache footprint.
Also, add some stats to vmmeter. Note that you'll have to
rebuild/recompile vmstat, systat, etc... Otherwise, you'll
get "very interesting" paging stats.
Notes:
svn path=/head/; revision=34202
|