Diffstat (limited to 'en_US.ISO8859-1/books/arch-handbook/vm/chapter.sgml')
-rw-r--r--  en_US.ISO8859-1/books/arch-handbook/vm/chapter.sgml  255
1 files changed, 0 insertions, 255 deletions
diff --git a/en_US.ISO8859-1/books/arch-handbook/vm/chapter.sgml b/en_US.ISO8859-1/books/arch-handbook/vm/chapter.sgml
deleted file mode 100644
index b48098aa00..0000000000
--- a/en_US.ISO8859-1/books/arch-handbook/vm/chapter.sgml
+++ /dev/null
@@ -1,255 +0,0 @@
-<!--
- The FreeBSD Documentation Project
-
- $FreeBSD$
--->
-
-<chapter id="vm">
- <title>Virtual Memory System</title>
-
- <sect1 id="internals-vm">
- <title>The FreeBSD VM System</title>
-
- <para><emphasis>Contributed by &a.dillon;. 6 Feb 1999</emphasis></para>
-
- <sect2>
- <title>Management of physical
- memory&mdash;<literal>vm_page_t</literal></title>
-
- <para>Physical memory is managed on a page-by-page basis through the
- <literal>vm_page_t</literal> structure. Pages of physical memory are
- categorized through the placement of their respective
- <literal>vm_page_t</literal> structures on one of several paging
- queues.</para>
-
- <para>A page can be in a wired, active, inactive, cache, or free state.
- Except for the wired state, the page is typically placed in a doubly
- linked list queue representing the state that it is in. Wired pages
- are not placed on any queue.</para>
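-
- <para>As a rough illustration (the type and field names below are
- invented for this example and are not the real
- <literal>vm_page_t</literal> layout), a page essentially carries its
- queue linkage plus the state that decides which queue, if any, it is
- on:</para>
-
- <programlisting>enum page_queue {
-    PQ_NONE,      /* wired pages are placed on no queue at all */
-    PQ_ACTIVE,
-    PQ_INACTIVE,
-    PQ_CACHE,
-    PQ_FREE
-};
-
-struct example_page {
-    struct example_page *next;       /* doubly-linked queue linkage    */
-    struct example_page *prev;
-    enum page_queue      queue;      /* which queue the page is on     */
-    int                  wire_count; /* nonzero: wired, not on a queue */
-    int                  hold_count; /* soft reference (hold) count    */
-    unsigned int         flags;      /* busy bit and friends           */
-};</programlisting>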
-
- <para>FreeBSD implements a more involved paging queue for cached and
- free pages in order to implement page coloring. Each of these states
- involves multiple queues arranged according to the size of the
- processor's L1 and L2 caches. When a new page needs to be allocated,
- FreeBSD attempts to obtain one that is reasonably well aligned from
- the point of view of the L1 and L2 caches relative to the VM object
- the page is being allocated for.</para>
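-
- <para>The following hypothetical sketch shows the idea: the queue (or
- <quote>color</quote>) a page is taken from is chosen so that
- consecutive pages of an object land in different cache lines. The
- constants and names are invented for illustration:</para>
-
- <programlisting>#define EX_PAGE_SIZE   4096
-#define EX_CACHE_SIZE  (256 * 1024)                  /* e.g. a 256K L2 cache */
-#define EX_NCOLORS     (EX_CACHE_SIZE / EX_PAGE_SIZE)
-
-/*
- * Pick the free/cache queue (color) to allocate from, given the color
- * assigned to the VM object and the page's index within that object.
- */
-static unsigned int
-ex_page_color(unsigned int object_color, unsigned int page_index)
-{
-    return (object_color + page_index) % EX_NCOLORS;
-}</programlisting>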
-
- <para>Additionally, a page may be held with a reference count or locked
- with a busy count. The VM system also implements an <quote>ultimate
- locked</quote> state for a page using the PG_BUSY bit in the page's
- flags.</para>
-
- <para>In general terms, each of the paging queues operates in an LRU
- fashion. A page is typically placed in a wired or active state
- initially. When wired, the page is usually associated with a page
- table somewhere. The VM system ages the page by scanning pages in a
- more active paging queue (LRU) in order to move them to a less-active
- paging queue. Pages that get moved into the cache are still
- associated with a VM object but are candidates for immediate reuse.
- Pages in the free queue are truly free. FreeBSD attempts to minimize
- the number of pages in the free queue, but a certain minimum number of
- truly free pages must be maintained in order to accommodate page
- allocation at interrupt time.</para>
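-
- <para>A much-simplified sketch of the aging idea follows (all names are
- invented; the real scan also launders dirty pages, honors hold and busy
- counts, and is driven by free-page targets):</para>
-
- <programlisting>struct scan_page {
-    struct scan_page *next;
-    int               referenced;   /* set if a mapping touched the page */
-    int               act_count;    /* crude age counter                 */
-};
-
-/*
- * Age one page on the active queue: recently referenced pages stay
- * active, while pages that go unreferenced long enough become
- * candidates for the inactive queue (and eventually cache or free).
- */
-static struct scan_page *
-ex_age_page(struct scan_page *pg)
-{
-    if (pg->referenced) {
-        pg->referenced = 0;
-        pg->act_count++;
-    } else {
-        if (pg->act_count > 0)
-            pg->act_count--;
-        /* when act_count reaches zero the page would be moved to the
-           inactive queue, e.g. ex_move_to_inactive(pg) */
-    }
-    return pg->next;
-}</programlisting>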
-
- <para>If a process attempts to access a page that does not exist in its
- page table but does exist in one of the paging queues (such as the
- inactive or cache queues), a relatively inexpensive page reactivation
- fault occurs which causes the page to be reactivated. If the page
- does not exist in system memory at all, the process must block while
- the page is brought in from disk.</para>
-
- <para>FreeBSD dynamically tunes its paging queues and attempts to
- maintain reasonable ratios of pages in the various queues as well as
- attempts to maintain a reasonable breakdown of clean vs. dirty pages.
- The amount of rebalancing that occurs depends on the system's memory
- load. This rebalancing is implemented by the pageout daemon and
- involves laundering dirty pages (syncing them with their backing
- store), noticing when pages are actively referenced (resetting their
- position in the LRU queues or moving them between queues), migrating
- pages between queues when the queues are out of balance, and so forth.
- FreeBSD's VM system is willing to take a reasonable number of
- reactivation page faults to determine how active or how idle a page
- actually is. This leads to better decisions being made as to when to
- launder or swap-out a page.</para>
- </sect2>
-
- <sect2>
- <title>The unified buffer
- cache&mdash;<literal>vm_object_t</literal></title>
-
- <para>FreeBSD implements the idea of a generic <quote>VM object</quote>.
- VM objects can be associated with backing store of various
- types&mdash;unbacked, swap-backed, physical device-backed, or
- file-backed storage. Since the filesystem uses the same VM objects to
- manage in-core data relating to files, the result is a unified buffer
- cache.</para>
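-
- <para>Illustratively (the names below are loosely modeled on the
- kernel's object type constants, but the layout is invented for this
- example), an object records what kind of backing store it has and, as
- described next, a pointer to the object behind it in a shadow
- chain:</para>
-
- <programlisting>enum ex_objtype {
-    EXOBJ_DEFAULT,   /* anonymous memory, unbacked until paged out */
-    EXOBJ_SWAP,      /* backed by swap space                       */
-    EXOBJ_VNODE,     /* backed by a file (vnode)                   */
-    EXOBJ_DEVICE     /* backed by a physical device                */
-};
-
-struct ex_object {
-    enum ex_objtype   type;
-    void             *handle;          /* e.g. the vnode for a file   */
-    struct ex_object *backing_object;  /* the object this one shadows */
-};</programlisting>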
-
- <para>VM objects can be <emphasis>shadowed</emphasis>. That is, they
- can be stacked on top of each other. For example, you might have a
- swap-backed VM object stacked on top of a file-backed VM object in
- order to implement a MAP_PRIVATE mmap()ing. This stacking is also
- used to implement various sharing properties, including
- copy-on-write, for forked address spaces.</para>
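-
- <para>The effect is visible from userland with an ordinary mmap() call.
- In the minimal example below (which assumes
- <filename>/etc/motd</filename> exists and is not empty), the write goes
- to a private, swap-backed copy of the page and the underlying file is
- never modified:</para>
-
- <programlisting>#include &lt;sys/mman.h&gt;
-#include &lt;fcntl.h&gt;
-#include &lt;unistd.h&gt;
-
-int
-main(void)
-{
-    int fd = open("/etc/motd", O_RDONLY);
-    char *p;
-
-    if (fd &lt; 0)
-        return 1;
-    /* Private mapping: writes are copy-on-write against a shadow object. */
-    p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
-    if (p == MAP_FAILED)
-        return 1;
-    p[0] = 'X';             /* faults in a private copy of the page */
-    munmap(p, 4096);
-    close(fd);
-    return 0;
-}</programlisting>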
-
- <para>It should be noted that a <literal>vm_page_t</literal> can only be
- associated with one VM object at a time. The VM object shadowing
- implements the perceived sharing of the same page across multiple
- instances.</para>
- </sect2>
-
- <sect2>
- <title>Filesystem I/O&mdash;<literal>struct buf</literal></title>
-
- <para>vnode-backed VM objects, such as file-backed objects, generally
- need to maintain their own clean/dirty info independent from the VM
- system's idea of clean/dirty. For example, when the VM system decides
- to synchronize a physical page to its backing store, the VM system
- needs to mark the page clean before the page is actually written to
- its backing store. Additionally, filesystems need to be able to map
- portions of a file or file metadata into KVM in order to operate on
- it.</para>
-
- <para>The entities used to manage this are known as filesystem buffers,
- <literal>struct buf</literal>'s, or
- <literal>bp</literal>'s. When a filesystem needs to operate on a
- portion of a VM object, it typically maps part of the object into a
- struct buf and then maps the pages in the struct buf into KVM. In the
- same manner, disk I/O is typically issued by mapping portions of
- objects into buffer structures and then issuing the I/O on the buffer
- structures. The underlying vm_page_t's are typically busied for the
- duration of the I/O. Filesystem buffers also have their own notion of
- being busy, which is useful to filesystem driver code which would
- rather operate on filesystem buffers instead of hard VM pages.</para>
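-
- <para>Schematically (this is an invented illustration, not the real
- <literal>struct buf</literal> layout), a buffer records which part of
- which object it covers, the pages under it, where those pages are
- mapped in KVM, and its own busy and dirty state:</para>
-
- <programlisting>#define EX_BUF_MAXPAGES 16
-
-struct ex_buf {
-    void      *vp;                      /* vnode/object being operated on */
-    long long  offset;                  /* byte offset within the file    */
-    int        bcount;                  /* size of this buffer in bytes   */
-    char      *kva;                     /* where the pages are mapped in
-                                           kernel virtual memory          */
-    void      *pages[EX_BUF_MAXPAGES];  /* underlying pages, kept busied
-                                           for the duration of the I/O    */
-    int        npages;
-    int        busy;                    /* buffer-level busy flag         */
-    int        dirty;                   /* buffer-level dirty flag        */
-};</programlisting>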
-
- <para>FreeBSD reserves a limited amount of KVM to hold mappings from
- struct bufs, but it should be made clear that this KVM is used solely
- to hold mappings and does not limit the ability to cache data.
- Physical data caching is strictly a function of
- <literal>vm_page_t</literal>'s, not filesystem buffers. However,
- since filesystem buffers are used to stage I/O, they do inherently
- limit the amount of concurrent I/O possible. In practice, since there are
- usually a few thousand filesystem buffers available, this is rarely a
- problem.</para>
- </sect2>
-
- <sect2>
- <title>Mapping Page Tables&mdash;<literal>vm_map_t</literal>, <literal>vm_entry_t</literal></title>
-
- <para>FreeBSD separates the physical page table topology from the VM
- system. All hard per-process page tables can be reconstructed on the
- fly and are usually considered throwaway. Special page tables such as
- those managing KVM are typically permanently preallocated. These page
- tables are not throwaway.</para>
-
- <para>FreeBSD associates portions of vm_objects with address ranges in
- virtual memory through <literal>vm_map_t</literal> and
- <literal>vm_entry_t</literal> structures. Page tables are directly
- synthesized from the
- <literal>vm_map_t</literal>/<literal>vm_entry_t</literal>/
- <literal>vm_object_t</literal> hierarchy. Recall that I mentioned
- that physical pages are only directly associated with a
- <literal>vm_object</literal>; that is not quite true.
- <literal>vm_page_t</literal>'s are also linked into page tables that
- they are actively associated with. One <literal>vm_page_t</literal>
- can be linked into several <emphasis>pmaps</emphasis>, as page tables
- are called. However, the hierarchical association holds, so all
- references to the same page in the same object reference the same
- <literal>vm_page_t</literal> and thus give us buffer cache unification
- across the board.</para>
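-
- <para>A hypothetical sketch of how a page table entry can be rebuilt
- from this hierarchy on demand follows (all names are invented, and the
- offset bookkeeping for shadow objects is omitted): find the map entry
- covering the faulting address, turn the address into a page index
- within the entry's object, and search the object and its backing
- objects for the page:</para>
-
- <programlisting>#include &lt;stddef.h&gt;
-
-#define EX_PAGE_SIZE 4096UL
-
-struct ex_vmobject {
-    struct ex_vmobject *backing_object;
-    /* Invented lookup hook: find a resident page by its index. */
-    void *(*page_lookup)(struct ex_vmobject *obj, unsigned long pindex);
-};
-
-struct ex_map_entry {
-    unsigned long        start, end;   /* virtual address range       */
-    unsigned long        offset;       /* byte offset into the object */
-    struct ex_vmobject  *object;
-    struct ex_map_entry *next;
-};
-
-static void *
-ex_fault_lookup(struct ex_map_entry *entries, unsigned long addr)
-{
-    struct ex_map_entry *e;
-    struct ex_vmobject *obj;
-    unsigned long pindex;
-
-    for (e = entries; e != NULL; e = e->next) {
-        if (addr &lt; e->start)
-            continue;
-        if (addr &lt; e->end)
-            break;
-    }
-    if (e == NULL)
-        return NULL;                  /* unmapped address: a real fault */
-
-    pindex = (addr - e->start + e->offset) / EX_PAGE_SIZE;
-    for (obj = e->object; obj != NULL; obj = obj->backing_object) {
-        void *pg = obj->page_lookup(obj, pindex);
-        if (pg != NULL)
-            return pg;                /* would now be entered into the pmap */
-    }
-    return NULL;                      /* page must be created or paged in */
-}</programlisting>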
- </sect2>
-
- <sect2>
- <title>KVM Memory Mapping</title>
-
- <para>FreeBSD uses KVM to hold various kernel structures. The single
- largest entity held in KVM is the filesystem buffer cache. That is,
- mappings relating to <literal>struct buf</literal> entities.</para>
-
- <para>Unlike Linux, FreeBSD does <emphasis>not</emphasis> map all of physical memory into
- KVM. This means that FreeBSD can handle memory configurations up to
- 4GB on 32-bit platforms. In fact, if the MMU were capable of it,
- FreeBSD could theoretically handle memory configurations up to 8TB on
- a 32-bit platform. However, since most 32-bit platforms are only
- capable of mapping 4GB of RAM, this is a moot point.</para>
-
- <para>KVM is managed through several mechanisms. The main mechanism
- used to manage KVM is the <emphasis>zone allocator</emphasis>. The
- zone allocator takes a chunk of KVM and splits it up into
- constant-sized blocks of memory in order to allocate a specific type
- of structure. You can use <command>vmstat -m</command> to get an
- overview of current KVM utilization broken down by zone.</para>
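-
- <para>Conceptually (this is only a sketch of the idea, not the kernel's
- zone implementation), a zone takes one chunk of KVM and threads it into
- a free list of fixed-size blocks:</para>
-
- <programlisting>#include &lt;stddef.h&gt;
-
-struct ex_zone {
-    void   *freelist;   /* next free block, or NULL       */
-    size_t  size;       /* fixed block size for this zone */
-};
-
-/* Carve a chunk of (kernel virtual) memory into fixed-size blocks;
-   size is assumed to be at least sizeof(void *). */
-static void
-ex_zone_init(struct ex_zone *z, void *chunk, size_t chunklen, size_t size)
-{
-    char *p = chunk;
-    char *end = (char *)chunk + chunklen;
-
-    z->size = size;
-    z->freelist = NULL;
-    while (p + size &lt;= end) {
-        *(void **)p = z->freelist;   /* link the block onto the free list */
-        z->freelist = p;
-        p += size;
-    }
-}
-
-/* Hand out one block, or NULL when the zone is exhausted. */
-static void *
-ex_zone_alloc(struct ex_zone *z)
-{
-    void *p = z->freelist;
-
-    if (p != NULL)
-        z->freelist = *(void **)p;
-    return p;
-}</programlisting>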
- </sect2>
-
- <sect2>
- <title>Tuning the FreeBSD VM system</title>
-
- <para>A concerted effort has been made to make the FreeBSD kernel
- dynamically tune itself. Typically you do not need to mess with
- anything beyond the <option>maxusers</option> and
- <option>NMBCLUSTERS</option> kernel config options. That is, kernel
- compilation options specified in (typically)
- <filename>/usr/src/sys/i386/conf/<replaceable>CONFIG_FILE</replaceable></filename>.
- A description of all available kernel configuration options can be
- found in <filename>/usr/src/sys/i386/conf/LINT</filename>.</para>
-
- <para>In a large system configuration you may wish to increase
- <option>maxusers</option>. Values typically range from 10 to 128.
- Note that raising <option>maxusers</option> too high can cause the
- system to overflow available KVM resulting in unpredictable operation.
- It is better to leave <option>maxusers</option> at some reasonable number and add other
- options, such as <option>NMBCLUSTERS</option>, to increase specific
- resources.</para>
-
- <para>If your system is going to use the network heavily, you may want
- to increase <option>NMBCLUSTERS</option>. Typical values range from
- 1024 to 4096.</para>
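-
- <para>For example, a configuration file for a busy server might contain
- lines like the following (the values are illustrative only; check
- <filename>LINT</filename> for the exact syntax on your release):</para>
-
- <programlisting>maxusers 64
-options NMBCLUSTERS=4096</programlisting>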
-
- <para>The <literal>NBUF</literal> parameter is also traditionally used
- to scale the system. This parameter determines the amount of KVA the
- system can use to map filesystem buffers for I/O. Note that this
- parameter has nothing whatsoever to do with the unified buffer cache!
- This parameter is dynamically tuned in 3.0-CURRENT and later kernels
- and should generally not be adjusted manually. We recommend that you
- <emphasis>not</emphasis> try to specify an <literal>NBUF</literal>
- parameter. Let the system pick it. Too small a value can result in
- extremely inefficient filesystem operation while too large a value can
- starve the page queues by causing too many pages to become wired
- down.</para>
-
- <para>By default, FreeBSD kernels are not optimized. You can set
- debugging and optimization flags with the
- <literal>makeoptions</literal> directive in the kernel configuration.
- Note that you should not use <option>-g</option> unless you can
- accommodate the large (typically 7 MB+) kernels that result.</para>
-
- <programlisting>makeoptions DEBUG="-g"
-makeoptions COPTFLAGS="-O -pipe"</programlisting>
-
- <para>Sysctl provides a way to tune kernel parameters at run-time. You
- typically do not need to mess with any of the sysctl variables,
- especially the VM related ones.</para>
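-
- <para>You can, however, inspect them without changing anything, for
- example (exact variable names differ between releases):</para>
-
- <programlisting>% sysctl -a | grep '^vm\.'
-% sysctl vm.v_free_min</programlisting>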
-
- <para>Run time VM and system tuning is relatively straightforward.
- First, use softupdates on your UFS/FFS filesystems whenever possible.
- <filename>/usr/src/sys/ufs/ffs/README.softupdates</filename> contains
- instructions (and restrictions) on how to configure it.</para>
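-
- <para>On releases where soft updates support is already compiled into
- the kernel, enabling it is typically a per-filesystem
- <command>tunefs</command> operation performed while the filesystem is
- not mounted read-write, for example:</para>
-
- <programlisting># umount /usr
-# tunefs -n enable /usr
-# mount /usr</programlisting>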
-
- <para>Second, configure sufficient swap. You should have a swap
- partition configured on each physical disk, up to four, even on your
- <quote>work</quote> disks. You should have at least twice as much swap
- space as you have main memory, and possibly even more if you do not have a
- lot of memory. You should also size your swap partition based on the
- maximum memory configuration you ever intend to put on the machine so
- you do not have to repartition your disks later on. If you want to be
- able to accommodate a crash dump, your first swap partition must be at
- least as large as main memory and <filename>/var/crash</filename> must
- have sufficient free space to hold the dump.</para>
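-
- <para>As an illustration only (the device names are examples), a
- machine with 128 MB of main memory and two disks might list its swap
- partitions in <filename>/etc/fstab</filename> like this, with the
- first partition sized at or above main memory so that it can hold a
- crash dump:</para>
-
- <programlisting># Device      Mountpoint  FStype  Options  Dump  Pass#
-/dev/da0s1b   none        swap    sw       0     0
-/dev/da1s1b   none        swap    sw       0     0</programlisting>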
-
- <para>NFS-based swap is perfectly acceptable on -4.x or later systems,
- but you must be aware that the NFS server will take the brunt of the
- paging load.</para>
- </sect2>
- </sect1>
-
-</chapter>