Diffstat (limited to 'en_US.ISO8859-1/books/handbook')
-rw-r--r-- en_US.ISO8859-1/books/handbook/Makefile | 2
-rw-r--r-- en_US.ISO8859-1/books/handbook/bibliography/chapter.xml | 2
-rw-r--r-- en_US.ISO8859-1/books/handbook/book.xml | 2
-rw-r--r-- en_US.ISO8859-1/books/handbook/chapters.ent | 4
-rw-r--r-- en_US.ISO8859-1/books/handbook/colophon.xml | 2
-rw-r--r-- en_US.ISO8859-1/books/handbook/introduction/chapter.xml | 4
-rw-r--r-- en_US.ISO8859-1/books/handbook/mirrors/chapter.xml | 1
-rw-r--r-- en_US.ISO8859-1/books/handbook/users/Makefile | 15
-rw-r--r-- en_US.ISO8859-1/books/handbook/users/chapter.xml | 1043
-rw-r--r-- en_US.ISO8859-1/books/handbook/vinum/Makefile | 15
-rw-r--r-- en_US.ISO8859-1/books/handbook/vinum/chapter.xml | 1251
11 files changed, 2331 insertions, 10 deletions
diff --git a/en_US.ISO8859-1/books/handbook/Makefile b/en_US.ISO8859-1/books/handbook/Makefile index f2e41fc06f..fa7ad27323 100644 --- a/en_US.ISO8859-1/books/handbook/Makefile +++ b/en_US.ISO8859-1/books/handbook/Makefile @@ -39,8 +39,6 @@ DOC?= book FORMATS?= html-split -HAS_INDEX= true - INSTALL_COMPRESSED?= gz INSTALL_ONLY_COMPRESSED?= diff --git a/en_US.ISO8859-1/books/handbook/bibliography/chapter.xml b/en_US.ISO8859-1/books/handbook/bibliography/chapter.xml index 2b7bdece60..da614bdf17 100644 --- a/en_US.ISO8859-1/books/handbook/bibliography/chapter.xml +++ b/en_US.ISO8859-1/books/handbook/bibliography/chapter.xml @@ -532,7 +532,7 @@ <ulink url="http://www.FreeBSD.org/cgi/cvsweb.cgi/src/share/misc/bsd-family-tree"></ulink> or - <ulink type="html" + <ulink url="file://localhost/usr/share/misc/bsd-family-tree"><filename>/usr/share/misc/bsd-family-tree</filename></ulink> on a FreeBSD machine.</para> </listitem> diff --git a/en_US.ISO8859-1/books/handbook/book.xml b/en_US.ISO8859-1/books/handbook/book.xml index e23d7ec009..59144256ac 100644 --- a/en_US.ISO8859-1/books/handbook/book.xml +++ b/en_US.ISO8859-1/books/handbook/book.xml @@ -295,7 +295,7 @@ &chap.eresources; &chap.pgpkeys; </part> - &freebsd-glossary; + &chap.freebsd-glossary; &chap.index; &chap.colophon; </book> diff --git a/en_US.ISO8859-1/books/handbook/chapters.ent b/en_US.ISO8859-1/books/handbook/chapters.ent index d5cd395e7f..0bcc64c5d7 100644 --- a/en_US.ISO8859-1/books/handbook/chapters.ent +++ b/en_US.ISO8859-1/books/handbook/chapters.ent @@ -63,7 +63,7 @@ <!ENTITY chap.eresources.www.index.inc SYSTEM "eresources.xml.www.index.inc"> <!ENTITY chap.eresources.www.inc SYSTEM "eresources.xml.www.inc"> <!ENTITY chap.pgpkeys SYSTEM "pgpkeys/chapter.xml"> - <!ENTITY chap.freebsd-glossary "&freebsd-glossary;"> - <!ENTITY chap.index SYSTEM "index.xml"> + <!ENTITY chap.freebsd-glossary SYSTEM "../../share/xml/glossary.ent"> + <!ENTITY chap.index "<index xmlns='http://docbook.org/ns/docbook'/>"> <!ENTITY 
chap.colophon SYSTEM "colophon.xml"> diff --git a/en_US.ISO8859-1/books/handbook/colophon.xml b/en_US.ISO8859-1/books/handbook/colophon.xml index 0d93d9bb66..c583ddfe90 100644 --- a/en_US.ISO8859-1/books/handbook/colophon.xml +++ b/en_US.ISO8859-1/books/handbook/colophon.xml @@ -12,7 +12,7 @@ from XML into many different presentation formats using XSLT. The printed version of this document would not be possible without Donald Knuth's - <application>&tex;</application> typesetting language, Leslie + &tex; typesetting language, Leslie Lamport's <application>LaTeX</application>, or Sebastian Rahtz's <application>JadeTeX</application> macro package.</para> </colophon> diff --git a/en_US.ISO8859-1/books/handbook/introduction/chapter.xml b/en_US.ISO8859-1/books/handbook/introduction/chapter.xml index afbe796fcf..9adbd535d2 100644 --- a/en_US.ISO8859-1/books/handbook/introduction/chapter.xml +++ b/en_US.ISO8859-1/books/handbook/introduction/chapter.xml @@ -1185,7 +1185,7 @@ <term>The FreeBSD Handbook</term> <listitem> - <para><ulink type="html" + <para><ulink url="file://localhost/usr/local/share/doc/freebsd/handbook/index.html"><filename>/usr/local/share/doc/freebsd/handbook/index.html</filename></ulink></para> </listitem> </varlistentry> @@ -1194,7 +1194,7 @@ <term>The FreeBSD FAQ</term> <listitem> - <para><ulink type="html" + <para><ulink url="file://localhost/usr/local/share/doc/freebsd/faq/index.html"><filename>/usr/local/share/doc/freebsd/faq/index.html</filename></ulink></para> </listitem> </varlistentry> diff --git a/en_US.ISO8859-1/books/handbook/mirrors/chapter.xml b/en_US.ISO8859-1/books/handbook/mirrors/chapter.xml index ba2dde8e6b..f46b6db93e 100644 --- a/en_US.ISO8859-1/books/handbook/mirrors/chapter.xml +++ b/en_US.ISO8859-1/books/handbook/mirrors/chapter.xml @@ -899,7 +899,6 @@ Certificate information: by a configuration file called the <filename>supfile</filename>. 
There are some sample <filename>supfiles</filename> in the directory <ulink - type="html" url="file://localhost/usr/share/examples/cvsup/"><filename>/usr/share/examples/cvsup/</filename></ulink>.</para> <para>The information in a <filename>supfile</filename> answers diff --git a/en_US.ISO8859-1/books/handbook/users/Makefile b/en_US.ISO8859-1/books/handbook/users/Makefile new file mode 100644 index 0000000000..b44bd80628 --- /dev/null +++ b/en_US.ISO8859-1/books/handbook/users/Makefile @@ -0,0 +1,15 @@ +# +# Build the Handbook with just the content from this chapter. +# +# $FreeBSD$ +# + +CHAPTERS= users/chapter.xml + +VPATH= .. + +MASTERDOC= ${.CURDIR}/../${DOC}.${DOCBOOKSUFFIX} + +DOC_PREFIX?= ${.CURDIR}/../../../.. + +.include "../Makefile" diff --git a/en_US.ISO8859-1/books/handbook/users/chapter.xml b/en_US.ISO8859-1/books/handbook/users/chapter.xml new file mode 100644 index 0000000000..39f145021d --- /dev/null +++ b/en_US.ISO8859-1/books/handbook/users/chapter.xml @@ -0,0 +1,1043 @@ +<?xml version="1.0" encoding="iso-8859-1"?> +<!-- + The FreeBSD Documentation Project + + $FreeBSD$ +--> + +<chapter id="users"> + <chapterinfo> + <authorgroup> + <author> + <firstname>Neil</firstname> + <surname>Blakey-Milner</surname> + <contrib>Contributed by </contrib> + </author> + </authorgroup> + <!-- Feb 2000 --> + </chapterinfo> + + <title>Users and Basic Account Management</title> + + <sect1 id="users-synopsis"> + <title>Synopsis</title> + + <para>&os; allows multiple users to use the computer at the same + time. While only one user can sit in front of the screen and + use the keyboard at any one time, any number of users can log + in to the system through the network. 
To use the system, every + user must have a user account.</para> + + <para>After reading this chapter, you will know:</para> + + <itemizedlist> + <listitem> + <para>The differences between the various user accounts on a + &os; system.</para> + </listitem> + + <listitem> + <para>How to add and remove user accounts.</para> + </listitem> + + <listitem> + <para>How to change account details, such as the user's full + name or preferred shell.</para> + </listitem> + + <listitem> + <para>How to set limits on a per-account basis to control the + resources, such as memory and CPU time, that accounts and + groups of accounts are allowed to access.</para> + </listitem> + + <listitem> + <para>How to use groups to make account management + easier.</para> + </listitem> + </itemizedlist> + + <para>Before reading this chapter, you should:</para> + + <itemizedlist> + <listitem> + <para>Understand the <link linkend="basics">basics of &unix; + and &os;</link>.</para> + </listitem> + </itemizedlist> + </sect1> + + <sect1 id="users-introduction"> + <title>Introduction</title> + + <para>Since all access to the &os; system is achieved via accounts + and all processes are run by users, user and account management + is important.</para> + + <para>Every account on a &os; system has certain information + associated with it to identify the account.</para> + + <variablelist> + <varlistentry> + <term>User name</term> + + <listitem> + <para>The user name is typed at the <prompt>login:</prompt> + prompt. User names must be unique on the system as no two + users can have the same user name. There are a number of + rules for creating valid user names, documented in + &man.passwd.5;. Typically user names consist of eight or + fewer all lower case characters in order to maintain + backwards compatibility with applications.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Password</term> + + <listitem> + <para>Each account has an associated password. 
While the + password can be blank, this is highly discouraged and + every account should have a password.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>User ID (<acronym>UID</acronym>)</term> + + <listitem> + <para>The User ID (<acronym>UID</acronym>) is a number, + traditionally from 0 to 65535<footnote + id="users-largeuidgid"> + <para>It is possible to use + <acronym>UID</acronym>s/<acronym>GID</acronym>s as + large as 4294967295, but such IDs can cause serious + problems with software that makes assumptions about + the values of IDs.</para> + </footnote>, used to uniquely identify the user to the + system. Internally, &os; uses the + <acronym>UID</acronym> to identify users. Commands that + allow a user name to be specified will first convert it to + the <acronym>UID</acronym>. Though unlikely, it is + possible for several accounts with different user names to + share the same <acronym>UID</acronym>. As far as &os; is + concerned, these accounts are one user.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Group ID (<acronym>GID</acronym>)</term> + + <listitem> + <para>The Group ID (<acronym>GID</acronym>) is a number, + traditionally from 0 to 65535<footnoteref + linkend="users-largeuidgid"/>, used to uniquely identify + the primary group that the user belongs to. Groups are a + mechanism for controlling access to resources based on a + user's <acronym>GID</acronym> rather than their + <acronym>UID</acronym>. This can significantly reduce the + size of some configuration files. 
A user may also be a + member of more than one group.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Login class</term> + + <listitem> + <para>Login classes are an extension to the group mechanism + that provide additional flexibility when tailoring the + system to different users.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Password change time</term> + + <listitem> + <para>By default &os; does not force users to change their + passwords periodically. Password expiration can be + enforced on a per-user basis, forcing some or all users to + change their passwords after a certain amount of time has + elapsed.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Account expiry time</term> + + <listitem> + <para>By default &os; does not expire accounts. When + creating accounts that need a limited lifespan, such as + student accounts in a school, specify the account expiry + date. After the expiry time has elapsed, the account + cannot be used to log in to the system, although the + account's directories and files will remain.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>User's full name</term> + + <listitem> + <para>The user name uniquely identifies the account to &os;, + but does not necessarily reflect the user's real name. + This information can be associated with the + account.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Home directory</term> + + <listitem> + <para>The home directory is the full path to a directory on + the system. This is the user's starting directory when + the user logs in. A common convention is to put all user + home directories under <filename + class="directory">/home/<replaceable>username</replaceable></filename> + or <filename + class="directory">/usr/home/<replaceable>username</replaceable></filename>. 
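The user-name-to-UID and GID lookups described above can be tried with the standard &man.id.1; utility; a minimal sketch (portable sh, using the root account, whose UID is 0 on any BSD or Linux system):

```shell
# Resolve a user name to its numeric UID; commands that accept a
# user name convert it to the UID internally, as described above.
id -u root

# Show the primary GID and the full group membership (a user may
# belong to more than one group) for the same account.
id -g root
id -Gn root
```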
+ Each user stores their personal files and subdirectories + in their own home directory.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term>User shell</term> + + <listitem> + <para>The shell provides the default environment users use + to interact with the system. There are many different + kinds of shells, and experienced users will have their own + preferences, which can be reflected in their account + settings.</para> + </listitem> + </varlistentry> + </variablelist> + + <para>There are three main types of accounts: the <link + linkend="users-superuser">superuser</link>, <link + linkend="users-system">system accounts</link>, and <link + linkend="users-user">user accounts</link>. The superuser + account, usually called <username>root</username>, is used to + manage the system with no limitations on privileges. System + accounts are used to run services. User accounts are + assigned to real people and are used to log in and use the + system.</para> + + <sect2 id="users-superuser"> + <title>The Superuser Account</title> + + <indexterm> + <primary>accounts</primary> + <secondary>superuser (root)</secondary> + </indexterm> + <para>The superuser account, usually called + <username>root</username>, is used to perform system + administration tasks and should not be used for day-to-day + tasks like sending and receiving mail, general exploration of + the system, or programming.</para> + + <para>This is because the superuser, unlike normal user + accounts, can operate without limits, and misuse of the + superuser account may result in spectacular disasters. 
User + accounts are unable to destroy the system by mistake, so it is + generally best to use normal user accounts whenever possible, + unless extra privilege is required.</para> + + <para>Always double and triple-check any commands issued as the + superuser, since an extra space or missing character can mean + irreparable data loss.</para> + + <para>Always create a user account for the system administrator + and use this account to log in to the system for general + usage. This applies equally to multi-user or single-user + systems. Later sections will discuss how to create additional + accounts and how to change between the normal user and + superuser.</para> + </sect2> + + <sect2 id="users-system"> + <title>System Accounts</title> + + <indexterm> + <primary>accounts</primary> + <secondary>system</secondary> + </indexterm> + <para>System accounts are used to run services such as DNS, + mail, and web servers. The reason for this is security; if + all services ran as the superuser, they could act without + restriction.</para> + + <indexterm> + <primary>accounts</primary> + <secondary><username>daemon</username></secondary> + </indexterm> + <indexterm> + <primary>accounts</primary> + <secondary><username>operator</username></secondary> + </indexterm> + <para>Examples of system accounts are + <username>daemon</username>, <username>operator</username>, + <username>bind</username>, <username>news</username>, and + <username>www</username>.</para> + + <indexterm> + <primary>accounts</primary> + <secondary><username>nobody</username></secondary> + </indexterm> + <para><username>nobody</username> is the generic unprivileged + system account. 
However, the more services that use + <username>nobody</username>, the more files and processes that + user will become associated with, and hence the more + privileged that user becomes.</para> + </sect2> + + <sect2 id="users-user"> + <title>User Accounts</title> + + <indexterm> + <primary>accounts</primary> + <secondary>user</secondary> + </indexterm> + <para>User accounts are the primary means of access for real + people to the system. User accounts insulate the user and + the environment, preventing users from damaging the system + or other users, and allowing users to customize their + environment without affecting others.</para> + + <para>Every person accessing the system should have a unique + user account. This allows the administrator to find out who + is doing what, prevents users from clobbering each others' + settings or reading each others' mail, and so forth.</para> + + <para>Each user can set up their own environment to accommodate + their use of the system, by using alternate shells, editors, + key bindings, and language.</para> + </sect2> + </sect1> + + <sect1 id="users-modifying"> + <title>Modifying Accounts</title> + + <indexterm> + <primary>accounts</primary> + <secondary>modifying</secondary> + </indexterm> + + <para>&os; provides a variety of different commands to manage + user accounts. 
The most common commands are summarized below, + followed by more detailed examples of their usage.</para> + + <informaltable frame="none" pgwide="1"> + <tgroup cols="2"> + <colspec colwidth="1*"/> + <colspec colwidth="2*"/> + + <thead> + <row> + <entry>Command</entry> + <entry>Summary</entry> + </row> + </thead> + <tbody> + <row> + <entry>&man.adduser.8;</entry> + <entry>The recommended command-line application for adding + new users.</entry> + </row> + + <row> + <entry>&man.rmuser.8;</entry> + <entry>The recommended command-line application for + removing users.</entry> + </row> + + <row> + <entry>&man.chpass.1;</entry> + <entry>A flexible tool for changing user database + information.</entry> + </row> + + <row> + <entry>&man.passwd.1;</entry> + <entry>The simple command-line tool to change user + passwords.</entry> + </row> + + <row> + <entry>&man.pw.8;</entry> + <entry>A powerful and flexible tool for modifying all + aspects of user accounts.</entry> + </row> + </tbody> + </tgroup> + </informaltable> + + <sect2 id="users-adduser"> + <title><command>adduser</command></title> + + <indexterm> + <primary>accounts</primary> + <secondary>adding</secondary> + </indexterm> + <indexterm> + <primary><command>adduser</command></primary> + </indexterm> + <indexterm> + <primary><filename + class="directory">/usr/share/skel</filename></primary> + </indexterm> + <indexterm><primary>skeleton directory</primary></indexterm> + <para>&man.adduser.8; is a simple program for adding new users. + When a new user is added, this program automatically updates + <filename>/etc/passwd</filename> and + <filename>/etc/group</filename>. 
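The account records that these commands maintain are colon-separated lines, and can be examined directly; a sketch that pulls apart root's entry in /etc/passwd with awk (field layout per &man.passwd.5;; on &os; the extended class/change/expire fields live in master.passwd):

```shell
# Print selected fields of root's passwd entry:
# field 1 = name, 3 = UID, 4 = GID, 6 = home directory, 7 = shell.
awk -F: '$1 == "root" { print "uid=" $3 " gid=" $4 " home=" $6 " shell=" $7 }' /etc/passwd
```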
It also creates a home + directory for the new user, copies in the default + configuration files from <filename + class="directory">/usr/share/skel</filename>, and can + optionally mail the new user a welcome message.</para> + + <example> + <title>Adding a User on &os;</title> + + <screen>&prompt.root; <userinput>adduser</userinput> +Username: <userinput>jru</userinput> +Full name: <userinput>J. Random User</userinput> +Uid (Leave empty for default): +Login group [jru]: +Login group is jru. Invite jru into other groups? []: <userinput>wheel</userinput> +Login class [default]: +Shell (sh csh tcsh zsh nologin) [sh]: <userinput>zsh</userinput> +Home directory [/home/jru]: +Home directory permissions (Leave empty for default): +Use password-based authentication? [yes]: +Use an empty password? (yes/no) [no]: +Use a random password? (yes/no) [no]: +Enter password: +Enter password again: +Lock out the account after creation? [no]: +Username : jru +Password : **** +Full Name : J. Random User +Uid : 1001 +Class : +Groups : jru wheel +Home : /home/jru +Shell : /usr/local/bin/zsh +Locked : no +OK? (yes/no): <userinput>yes</userinput> +adduser: INFO: Successfully added (jru) to the user database. +Add another user? (yes/no): <userinput>no</userinput> +Goodbye! +&prompt.root;</screen> + </example> + + <note> + <para>Since the password is not echoed when typed, be careful + to not mistype the password when creating the user + account.</para> + </note> + </sect2> + + <sect2 id="users-rmuser"> + <title><command>rmuser</command></title> + + <indexterm><primary><command>rmuser</command></primary></indexterm> + <indexterm> + <primary>accounts</primary> + <secondary>removing</secondary> + </indexterm> + + <para>To completely remove a user from the system use + &man.rmuser.8;. 
This command performs the following + steps:</para> + + <procedure> + <step> + <para>Removes the user's &man.crontab.1; entry if one + exists.</para> + </step> + + <step> + <para>Removes any &man.at.1; jobs belonging to the + user.</para> + </step> + + <step> + <para>Kills all processes owned by the user.</para> + </step> + + <step> + <para>Removes the user from the system's local password + file.</para> + </step> + + <step> + <para>Removes the user's home directory, if it is owned by + the user.</para> + </step> + + <step> + <para>Removes the incoming mail files belonging to the user + from <filename + class="directory">/var/mail</filename>.</para> + </step> + + <step> + <para>Removes all files owned by the user from temporary + file storage areas such as <filename + class="directory">/tmp</filename>.</para> + </step> + + <step> + <para>Finally, removes the username from all groups to which + it belongs in <filename>/etc/group</filename>.</para> + + <note> + <para>If a group becomes empty and the group name is the + same as the username, the group is removed. This + complements the per-user unique groups created by + &man.adduser.8;.</para> + </note> + </step> + </procedure> + + <para>&man.rmuser.8; cannot be used to remove superuser + accounts since that is almost always an indication of massive + destruction.</para> + + <para>By default, an interactive mode is used, as shown + in the following example.</para> + + <example> + <title><command>rmuser</command> Interactive Account + Removal</title> + + <screen>&prompt.root; <userinput>rmuser jru</userinput> +Matching password entry: +jru:*:1001:1001::0:0:J. Random User:/home/jru:/usr/local/bin/zsh +Is this the entry you wish to remove? <userinput>y</userinput> +Remove user's home directory (/home/jru)? <userinput>y</userinput> +Updating password file, updating databases, done. +Updating group file: trusted (removing group jru -- personal group is empty) done. +Removing user's incoming mail file /var/mail/jru: done. 
+Removing files belonging to jru from /tmp: done. +Removing files belonging to jru from /var/tmp: done. +Removing files belonging to jru from /var/tmp/vi.recover: done. +&prompt.root;</screen> + </example> + </sect2> + + <sect2 id="users-chpass"> + <title><command>chpass</command></title> + + <indexterm><primary><command>chpass</command></primary></indexterm> + <para>&man.chpass.1; can be used to change user database + information such as passwords, shells, and personal + information.</para> + + <para>Only the superuser can change other users' information and + passwords with &man.chpass.1;.</para> + + <para>When passed no options, aside from an optional username, + &man.chpass.1; displays an editor containing user information. + When the user exits from the editor, the user database is + updated with the new information.</para> + + <note> + <para>You will be asked for your password after exiting the + editor if you are not the superuser.</para> + </note> + + <example> + <title>Interactive <command>chpass</command> by + Superuser</title> + + <screen>#Changing user database information for jru. +Login: jru +Password: * +Uid [#]: 1001 +Gid [# or name]: 1001 +Change [month day year]: +Expire [month day year]: +Class: +Home directory: /home/jru +Shell: /usr/local/bin/zsh +Full Name: J. Random User +Office Location: +Office Phone: +Home Phone: +Other information:</screen> + </example> + + <para>A user can change only a small subset of this + information, and only for their own user account.</para> + + <example> + <title>Interactive <command>chpass</command> by Normal + User</title> + + <screen>#Changing user database information for jru. +Shell: /usr/local/bin/zsh +Full Name: J. Random User +Office Location: +Office Phone: +Home Phone: +Other information:</screen> + </example> + + <note> + <para>&man.chfn.1; and &man.chsh.1; are links to + &man.chpass.1;, as are &man.ypchpass.1;, &man.ypchfn.1;, and + &man.ypchsh.1;. 
<acronym>NIS</acronym> support is + automatic, so specifying the <literal>yp</literal> before + the command is not necessary. How to configure NIS is + covered in <xref linkend="network-servers"/>.</para> + </note> + </sect2> + <sect2 id="users-passwd"> + <title><command>passwd</command></title> + + <indexterm><primary><command>passwd</command></primary></indexterm> + <indexterm> + <primary>accounts</primary> + <secondary>changing password</secondary> + </indexterm> + <para>&man.passwd.1; is the usual way to change your own + password as a user, or another user's password as the + superuser.</para> + + <note> + <para>To prevent accidental or unauthorized changes, the user + must enter their original password before a new password can + be set. This is not the case when the superuser changes a + user's password.</para> + </note> + + <example> + <title>Changing Your Password</title> + + <screen>&prompt.user; <userinput>passwd</userinput> +Changing local password for jru. +Old password: +New password: +Retype new password: +passwd: updating the database... +passwd: done</screen> + </example> + + <example> + <title>Changing Another User's Password as the + Superuser</title> + + <screen>&prompt.root; <userinput>passwd jru</userinput> +Changing local password for jru. +New password: +Retype new password: +passwd: updating the database... +passwd: done</screen> + </example> + + <note> + <para>As with &man.chpass.1;, &man.yppasswd.1; is a link to + &man.passwd.1;, so NIS works with either command.</para> + </note> + </sect2> + + + <sect2 id="users-pw"> + <title><command>pw</command></title> + + <indexterm><primary><command>pw</command></primary></indexterm> + + <para>&man.pw.8; is a command line utility to create, remove, + modify, and display users and groups. It functions as a front + end to the system user and group files. 
&man.pw.8; has a very + powerful set of command line options that make it suitable for + use in shell scripts, but new users may find it more + complicated than the other commands presented in this + section.</para> + </sect2> + + + </sect1> + + <sect1 id="users-limiting"> + <title>Limiting Users</title> + + <indexterm><primary>limiting users</primary></indexterm> + <indexterm> + <primary>accounts</primary> + <secondary>limiting</secondary> + </indexterm> + <para>&os; provides several methods for an administrator to limit + the amount of system resources an individual may use. These + limits are discussed in two sections: disk quotas and other + resource limits.</para> + + <indexterm><primary>quotas</primary></indexterm> + <indexterm> + <primary>limiting users</primary> + <secondary>quotas</secondary> + </indexterm> + <indexterm><primary>disk quotas</primary></indexterm> + <para>Disk quotas limit the amount of disk space available to + users and provide a way to quickly check that usage without + calculating it every time. Quotas are discussed in <xref + linkend="quotas"/>.</para> + + <para>The other resource limits include ways to limit the amount + of CPU, memory, and other resources a user may consume. These + are defined using login classes and are discussed here.</para> + + <indexterm> + <primary><filename>/etc/login.conf</filename></primary> + </indexterm> + <para>Login classes are defined in + <filename>/etc/login.conf</filename> and are described in detail + in &man.login.conf.5;. Each user account is assigned to a login + class, <literal>default</literal> by default, and each login + class has a set of login capabilities associated with it. 
A + login capability is a + <literal><replaceable>name</replaceable>=<replaceable>value</replaceable></literal> + pair, where <replaceable>name</replaceable> is a well-known + identifier and <replaceable>value</replaceable> is an arbitrary + string which is processed accordingly depending on the + <replaceable>name</replaceable>. Setting up login classes and + capabilities is rather straightforward and is also described in + &man.login.conf.5;.</para> + + <note> + <para>&os; does not normally read the configuration in + <filename>/etc/login.conf</filename> directly, but instead + reads the <filename>/etc/login.conf.db</filename> database + which provides faster lookups. Whenever + <filename>/etc/login.conf</filename> is edited, the + <filename>/etc/login.conf.db</filename> must be updated by + executing the following command:</para> + + <screen>&prompt.root; <userinput>cap_mkdb /etc/login.conf</userinput></screen> + </note> + + <para>Resource limits differ from the default login capabilities + in two ways. First, for every limit, there is a soft (current) + and hard limit. A soft limit may be adjusted by the user or + application, but may not be set higher than the hard limit. The + hard limit may be lowered by the user, but can only be raised + by the superuser. Second, most resource limits apply per + process to a specific user, not to the user as a whole. These + differences are mandated by the specific handling of the limits, + not by the implementation of the login capability + framework.</para> + + <para>Below are the most commonly used resource limits. 
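The soft versus hard distinction described above can be tried from any POSIX shell with the ulimit builtin; a minimal sketch (not FreeBSD-specific; the value 64 is an arbitrary example):

```shell
# Show the hard (maximum) and soft (current) limits on open files.
ulimit -H -n
ulimit -S -n

# Any user may lower the soft limit for the current shell and its
# children; once lowered, the hard limit can only be raised by the
# superuser.
ulimit -S -n 64
ulimit -S -n
```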
The rest + of the limits, along with all the other login capabilities, can + be found in &man.login.conf.5;.</para> + + <variablelist> + <varlistentry> + <term><literal>coredumpsize</literal></term> + + <listitem> + <para>The limit on the size of a core file<indexterm><primary>coredumpsize</primary></indexterm> generated by a + program is subordinate to other limits<indexterm><primary>limiting users</primary><secondary>coredumpsize</secondary></indexterm> on disk usage, such + as <literal>filesize</literal>, or disk quotas. + This limit is often used as a less-severe method of + controlling disk space consumption. Since users do not + generate core files themselves, and often do not delete + them, setting this may save them from running out of disk + space should a large program crash.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>cputime</literal></term> + + <listitem> + <para>The maximum amount of CPU<indexterm><primary>cputime</primary></indexterm><indexterm><primary>limiting users</primary><secondary>cputime</secondary></indexterm> time a user's process may + consume. Offending processes will be killed by the + kernel.</para> + + <note> + <para>This is a limit on CPU <emphasis>time</emphasis> + consumed, not percentage of the CPU as displayed in + some fields by &man.top.1; and &man.ps.1;.</para> + </note> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>filesize</literal></term> + + <listitem> + <para>The maximum size of a file<indexterm><primary>filesize</primary></indexterm><indexterm><primary>limiting users</primary><secondary>filesize</secondary></indexterm> the user may own. 
Unlike + <link linkend="quotas">disk quotas</link>, this limit is + enforced on individual files, not the set of all files a + user owns.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>maxproc</literal></term> + + <listitem> + <para>The maximum number of processes<indexterm><primary>maxproc</primary></indexterm><indexterm><primary>limiting users</primary><secondary>maxproc</secondary></indexterm> a user can run. This + includes foreground and background processes. This limit + may not be larger than the system limit specified by the + <varname>kern.maxproc</varname> &man.sysctl.8;. Setting + this limit too small may hinder a user's productivity as + it is often useful to be logged in multiple times or to + execute pipelines. Some tasks, such as compiling a large + program, spawn multiple processes and other intermediate + preprocessors.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>memorylocked</literal></term> + + <listitem> + <para>The maximum amount of memory<indexterm><primary>memorylocked</primary></indexterm><indexterm><primary>limiting users</primary><secondary>memorylocked</secondary></indexterm> a process may request + to be locked into main memory using &man.mlock.2;. Some + system-critical programs, such as &man.amd.8;, lock into + main memory so that if the system begins to swap, they do + not contribute to disk thrashing.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>memoryuse</literal></term> + + <listitem> + <para>The maximum amount of memory<indexterm><primary>memoryuse</primary></indexterm><indexterm><primary>limiting users</primary><secondary>memoryuse</secondary></indexterm> a process may consume at + any given time. It includes both core memory and swap + usage. 
This is not a catch-all limit for restricting + memory consumption, but is a good start.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>openfiles</literal></term> + + <listitem> + <para>The maximum number of files a process may have open<indexterm><primary>openfiles</primary></indexterm><indexterm><primary>limiting users</primary><secondary>openfiles</secondary></indexterm>. + In &os;, files are used to represent sockets and IPC + channels, so be careful not to set this too low. The + system-wide limit for this is defined by the + <varname>kern.maxfiles</varname> &man.sysctl.8;.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>sbsize</literal></term> + + <listitem> + <para>The limit on the amount of network memory, and + thus mbufs<indexterm><primary>sbsize</primary></indexterm><indexterm><primary>limiting users</primary><secondary>sbsize</secondary></indexterm>, a user may consume in order to limit network + communications.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>stacksize</literal></term> + + <listitem> + <para>The maximum size of a process stack<indexterm><primary>stacksize</primary></indexterm><indexterm><primary>limiting users</primary><secondary>stacksize</secondary></indexterm>. This alone is + not sufficient to limit the amount of memory a program + may use so it should be used in conjunction with other + limits.</para> + </listitem> + </varlistentry> + </variablelist> + + <para>There are a few other things to remember when setting + resource limits. 
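Taken together, these capabilities are written as colon-delimited entries in <filename>/etc/login.conf</filename>. A sketch of what a combined entry might look like (the class name <literal>staff</literal> and all values here are illustrative, not recommendations):

```
staff:\
	:cputime=1h30m:\
	:coredumpsize=unlimited:\
	:filesize=unlimited:\
	:maxproc=128:\
	:openfiles=512:\
	:tc=default:
```

After editing the file, run <command>cap_mkdb /etc/login.conf</command> so that the changes take effect, as described in &man.cap.mkdb.1;.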
The following are some general tips and suggestions.</para> + + <itemizedlist> + <listitem> + <para>Processes started at system startup by + <filename>/etc/rc</filename> are assigned to the + <literal>daemon</literal> login class.</para> + </listitem> + + <listitem> + <para>Although the <filename>/etc/login.conf</filename> that + comes with the system is a good source of reasonable values + for most limits, they may not be appropriate for every + system. Setting a limit too high may open the system up to + abuse, while setting it too low may put a strain on + productivity.</para> + </listitem> + + <listitem> + <para>Users of <application>&xorg;</application> should + probably be granted more resources than other users. + <application>&xorg;</application> by itself takes a lot of + resources, but it also encourages users to run more programs + simultaneously.</para> + </listitem> + + <listitem> + <para>Many limits apply to individual processes, not the user + as a whole. For example, setting + <varname>openfiles</varname> to 50 means that each process + the user runs may open up to 50 files. The total number + of files a user may open is the value of + <literal>openfiles</literal> multiplied by the value of + <literal>maxproc</literal>. This also applies to memory + consumption.</para> + </listitem> + </itemizedlist> + + <para>For further information on resource limits and login classes + and capabilities in general, refer to &man.cap.mkdb.1;, + &man.getrlimit.2;, and &man.login.conf.5;.</para> + </sect1> + + <sect1 id="users-groups"> + <title>Groups</title> + + <indexterm><primary>groups</primary></indexterm> + <indexterm> + <primary><filename>/etc/group</filename></primary> + </indexterm> + <indexterm> + <primary>accounts</primary> + <secondary>groups</secondary> + </indexterm> + <para>A group is a list of users, identified by its + group name and <acronym>GID</acronym>.
In &os;, the + kernel uses the <acronym>UID</acronym> of a process, and the + list of groups it belongs to, to determine what the process is + allowed to do. Most of the time, the <acronym>GID</acronym> of + a user or process refers to the first group in the + list.</para> + + <para>The group name to <acronym>GID</acronym> mapping is listed + in <filename>/etc/group</filename>. This is a plain text file + with four colon-delimited fields. The first field is the group + name, the second is the encrypted password, the third the + <acronym>GID</acronym>, and the fourth the comma-delimited list + of members. For a more complete description of the syntax, + refer to &man.group.5;.</para> + + <para>The superuser can modify <filename>/etc/group</filename> + using a text editor. Alternatively, &man.pw.8; can be used to + add and edit groups. For example, to add a group called + <groupname>teamtwo</groupname> and then confirm that it + exists:</para> + + <example> + <title>Adding a Group Using &man.pw.8;</title> + + <screen>&prompt.root; <userinput>pw groupadd teamtwo</userinput> +&prompt.root; <userinput>pw groupshow teamtwo</userinput> +teamtwo:*:1100:</screen> + </example> + + <para>In this example, <literal>1100</literal> is the + <acronym>GID</acronym> of <groupname>teamtwo</groupname>. Right + now, <groupname>teamtwo</groupname> has no members. This + command will add <username>jru</username> as a member of + <groupname>teamtwo</groupname>.</para> + + <example> + <title>Adding User Accounts to a New Group Using + &man.pw.8;</title> + + <screen>&prompt.root; <userinput>pw groupmod teamtwo -M jru</userinput> +&prompt.root; <userinput>pw groupshow teamtwo</userinput> +teamtwo:*:1100:jru</screen> + </example> + + <para>The argument to <option>-M</option> is a comma-delimited + list of users to be added to a new (empty) group or to replace + the members of an existing group.
To the user, this group + membership is different from (and in addition to) the user's + primary group listed in the password file. This means that + the user will not show up as a member when using + <option>groupshow</option> with &man.pw.8;, but will show up + when the information is queried via &man.id.1; or a similar + tool. When &man.pw.8; is used to add a user to a group, it only + manipulates <filename>/etc/group</filename> and does not attempt + to read additional data from + <filename>/etc/passwd</filename>.</para> + + <example> + <title>Adding a New Member to a Group Using &man.pw.8;</title> + + <screen>&prompt.root; <userinput>pw groupmod teamtwo -m db</userinput> +&prompt.root; <userinput>pw groupshow teamtwo</userinput> +teamtwo:*:1100:jru,db</screen> + </example> + + <para>In this example, the argument to <option>-m</option> is a + comma-delimited list of users who are to be added to the group. + Unlike the previous example, these users are appended to the + group list and do not replace the list of existing users in the + group.</para> + + <example> + <title>Using &man.id.1; to Determine Group Membership</title> + + <screen>&prompt.user; <userinput>id jru</userinput> +uid=1001(jru) gid=1001(jru) groups=1001(jru), 1100(teamtwo)</screen> + </example> + + <para>In this example, <username>jru</username> is a member of the + groups <groupname>jru</groupname> and + <groupname>teamtwo</groupname>.</para> + + <para>For more information about this command and the format of + <filename>/etc/group</filename>, refer to &man.pw.8; and + &man.group.5;.</para> + </sect1> + + <sect1 id="users-becomesuper"> + <title>Becoming Superuser</title> + + <para>There are several ways to do things as the superuser. The + worst way is to log in as <username>root</username> directly. 
+ Usually very little activity requires <username>root</username> + privileges, so logging off and logging in as <username>root</username>, + performing tasks, then logging off and on again as a normal user + is a waste of time.</para> + + <para>A better way is to use &man.su.1; without providing a login + name, using <literal>-</literal> to inherit the root environment. + When no login name is provided, the superuser is implied. For + this to work, the login must be in the + <groupname>wheel</groupname> group. + A typical software installation involves the administrator + unpacking the software as a normal user and then + elevating their privileges for the build and installation of + the software.</para> + + <example> + <title>Install a Program As The Superuser</title> + + <screen>&prompt.user; <userinput>configure</userinput> +&prompt.user; <userinput>make</userinput> +&prompt.user; <userinput>su -</userinput> +Password: +&prompt.root; <userinput>make install</userinput> +&prompt.root; <userinput>exit</userinput> +&prompt.user;</screen> + </example> + + <para>Note that in this example, the transition to + <username>root</username> is less painful than logging off + and back on twice.</para> + + <para>Using &man.su.1; works well for single systems or small + networks with just one system administrator. For more complex + environments, or even for these simple environments, + <command>sudo</command> should be used. It is provided as a port, + <filename role="package">security/sudo</filename>. It allows for + things like activity logging, granting users the ability to only + run certain commands as the superuser, and several other + options.</para> + </sect1> +</chapter> diff --git a/en_US.ISO8859-1/books/handbook/vinum/Makefile b/en_US.ISO8859-1/books/handbook/vinum/Makefile new file mode 100644 index 0000000000..b970524581 --- /dev/null +++ b/en_US.ISO8859-1/books/handbook/vinum/Makefile @@ -0,0 +1,15 @@ +# +# Build the Handbook with just the content from this chapter.
+# +# $FreeBSD$ +# + +CHAPTERS= vinum/chapter.xml + +VPATH= .. + +MASTERDOC= ${.CURDIR}/../${DOC}.${DOCBOOKSUFFIX} + +DOC_PREFIX?= ${.CURDIR}/../../../.. + +.include "../Makefile" diff --git a/en_US.ISO8859-1/books/handbook/vinum/chapter.xml b/en_US.ISO8859-1/books/handbook/vinum/chapter.xml new file mode 100644 index 0000000000..0b0dc34114 --- /dev/null +++ b/en_US.ISO8859-1/books/handbook/vinum/chapter.xml @@ -0,0 +1,1251 @@ +<?xml version="1.0" encoding="iso-8859-1"?> +<!-- + The Vinum Volume Manager + By Greg Lehey (grog at lemis dot com) + + Added to the Handbook by Hiten Pandya <hmp@FreeBSD.org> + and Tom Rhodes <trhodes@FreeBSD.org> + + For the FreeBSD Documentation Project + $FreeBSD$ +--> + +<chapter id="vinum-vinum"> + <chapterinfo> + <authorgroup> + <author> + <firstname>Greg</firstname> + <surname>Lehey</surname> + <contrib>Originally written by </contrib> + </author> + </authorgroup> + </chapterinfo> + + <title>The <devicename>vinum</devicename> Volume Manager</title> + + <sect1 id="vinum-synopsis"> + <title>Synopsis</title> + + <para>No matter the type of disks, there are always potential + problems. The disks can be too small, too slow, or too + unreliable to meet the system's requirements. While disks are + getting bigger, so are data storage requirements. Often a file + system is needed that is bigger than a disk's capacity. Various + solutions to these problems have been proposed and + implemented.</para> + + <para>One method is through the use of multiple, and sometimes + redundant, disks. In addition to supporting various cards and + controllers for hardware Redundant Array of Independent + Disks (<acronym>RAID</acronym>) systems, the base &os; system + includes the <devicename>vinum</devicename> volume manager, a + block device driver that implements virtual disk drives and + addresses these three problems.
<devicename>vinum</devicename> + provides more flexibility, performance, and reliability than + traditional disk storage and implements + <acronym>RAID</acronym>-0, <acronym>RAID</acronym>-1, and + <acronym>RAID</acronym>-5 models, both individually and in + combination.</para> + + <para>This chapter provides an overview of potential problems with + traditional disk storage, and an introduction to the + <devicename>vinum</devicename> volume manager.</para> + + <note> + <para>Starting with &os; 5, <devicename>vinum</devicename> + has been rewritten in order to fit into the <link + linkend="GEOM">GEOM architecture</link>, while retaining the + original ideas, terminology, and on-disk metadata. This + rewrite is called <emphasis>gvinum</emphasis> (for <emphasis> + GEOM vinum</emphasis>). While this chapter uses the term + <devicename>vinum</devicename>, any command invocations should + be performed with <command>gvinum</command>. The name of the + kernel module has changed from the original + <filename>vinum.ko</filename> to + <filename>geom_vinum.ko</filename>, and all device nodes + reside under <filename + class="directory">/dev/gvinum</filename> instead of + <filename class="directory">/dev/vinum</filename>. As of + &os; 6, the original <devicename>vinum</devicename> + implementation is no longer available in the code base.</para> + </note> + </sect1> + + <sect1 id="vinum-access-bottlenecks"> + <title>Access Bottlenecks</title> + + <para>Modern systems frequently need to access data in a highly + concurrent manner. 
For example, large FTP or HTTP servers can + maintain thousands of concurrent sessions and have multiple + 100 Mbit/s connections to the outside world, well beyond + the sustained transfer rate of most disks.</para> + + <para>Current disk drives can transfer data sequentially at up to + 70 MB/s, but this value is of little importance in an + environment where many independent processes access a drive, and + where they may achieve only a fraction of these values. In such + cases, it is more interesting to view the problem from the + viewpoint of the disk subsystem. The important parameter is the + load that a transfer places on the subsystem, or the time for + which a transfer occupies the drives involved in the + transfer.</para> + + <para>In any disk transfer, the drive must first position the + heads, wait for the first sector to pass under the read head, + and then perform the transfer. These actions can be considered + to be atomic as it does not make any sense to interrupt + them.</para> + + <para><anchor id="vinum-latency"/> Consider a typical transfer of + about 10 kB: the current generation of high-performance + disks can position the heads in an average of 3.5 ms. The + fastest drives spin at 15,000 rpm, so the average + rotational latency (half a revolution) is 2 ms. At + 70 MB/s, the transfer itself takes about 150 μs, + almost nothing compared to the positioning time. In such a + case, the effective transfer rate drops to a little over + 1 MB/s and is clearly highly dependent on the transfer + size.</para> + + <para>The traditional and obvious solution to this bottleneck is + <quote>more spindles</quote>: rather than using one large disk, + use several smaller disks with the same aggregate storage + space. 
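The drop in effective transfer rate described above is easy to verify with a little arithmetic. This sketch (Python, purely illustrative, not part of the Handbook's examples) plugs in the figures quoted in the text:

```python
# Effective transfer rate for a single 10 kB request, using the
# figures quoted above: 3.5 ms average positioning time, a drive
# spinning at 15,000 rpm, and a 70 MB/s media transfer rate.
seek_ms = 3.5
rotation_ms = 0.5 * 60_000 / 15_000   # half a revolution, in ms
transfer_ms = 10 / 70_000 * 1_000     # 10 kB at 70 MB/s, in ms

total_ms = seek_ms + rotation_ms + transfer_ms

# 10 kB delivered in roughly 5.6 ms: the effective rate collapses
# to a small fraction of the 70 MB/s sequential figure.
effective_mb_per_s = (10 / 1_000) / (total_ms / 1_000)
print(f"{total_ms:.2f} ms per request, {effective_mb_per_s:.2f} MB/s effective")
```

Depending on how request size and overheads are counted, this lands in the low single digits of MB/s that the text describes, dominated almost entirely by positioning time.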
Each disk is capable of positioning and transferring + independently, so the effective throughput increases by a factor + close to the number of disks used.</para> + + <para>The actual throughput improvement is smaller than the + number of disks involved. Although each drive is capable of + transferring in parallel, there is no way to ensure that the + requests are evenly distributed across the drives. Inevitably + the load on one drive will be higher than on another.</para> + + <indexterm> + <primary>disk concatenation</primary> + </indexterm> + <indexterm> + <primary>Vinum</primary> + <secondary>concatenation</secondary> + </indexterm> + + <para>The evenness of the load on the disks is strongly dependent + on the way the data is shared across the drives. In the + following discussion, it is convenient to think of the disk + storage as a large number of data sectors which are addressable + by number, rather like the pages in a book. The most obvious + method is to divide the virtual disk into groups of consecutive + sectors the size of the individual physical disks and store them + in this manner, rather like taking a large book and tearing it + into smaller sections. This method is called + <emphasis>concatenation</emphasis> and has the advantage that + the disks are not required to have any specific size + relationships. It works well when the access to the virtual + disk is spread evenly about its address space. When access is + concentrated on a smaller area, the improvement is less marked. 
+ <xref linkend="vinum-concat"/> illustrates the sequence in + which storage units are allocated in a concatenated + organization.</para> + + <para> + <figure id="vinum-concat"> + <title>Concatenated Organization</title> + + <graphic fileref="vinum/vinum-concat"/> + </figure></para> + + <indexterm> + <primary>disk striping</primary> + </indexterm> + <indexterm> + <primary>Vinum</primary> + <secondary>striping</secondary> + </indexterm> + <indexterm> + <primary><acronym>RAID</acronym></primary> + </indexterm> + + <para>An alternative mapping is to divide the address space into + smaller, equal-sized components and store them sequentially on + different devices. For example, the first 256 sectors may be + stored on the first disk, the next 256 sectors on the next disk + and so on. After filling the last disk, the process repeats + until the disks are full. This mapping is called + <emphasis>striping</emphasis> or + <acronym>RAID-0</acronym>.</para> + + <para><acronym>RAID</acronym> offers various forms of fault + tolerance, though <acronym>RAID-0</acronym> is somewhat + misleading as it provides no redundancy. Striping requires + somewhat more effort to locate the data, and it can cause + additional I/O load where a transfer is spread over multiple + disks, but it can also provide a more constant load across the + disks. <xref linkend="vinum-striped"/> illustrates the + sequence in which storage units are allocated in a striped + organization.</para> + + <para> + <figure id="vinum-striped"> + <title>Striped Organization</title> + + <graphic fileref="vinum/vinum-striped"/> + </figure></para> + </sect1> + + <sect1 id="vinum-data-integrity"> + <title>Data Integrity</title> + + <para>The final problem with disks is that they are unreliable. + Although reliability has increased tremendously over the last + few years, disk drives are still the most likely core component + of a server to fail. 
When they do, the results can be + catastrophic; replacing a failed disk drive and restoring + data can result in server downtime.</para> + + <indexterm> + <primary>disk mirroring</primary> + </indexterm> + <indexterm><primary>vinum</primary> + <secondary>mirroring</secondary> + </indexterm> + <indexterm><primary><acronym>RAID</acronym>-1</primary> + </indexterm> + + <para>One approach to this problem is + <emphasis>mirroring</emphasis>, or + <acronym>RAID-1</acronym>, which keeps two copies of the + data on different physical hardware. Any write to the volume + writes to both disks; a read can be satisfied from either, so if + one drive fails, the data is still available on the other + drive.</para> + + <para>Mirroring has two problems:</para> + + <itemizedlist> + <listitem> + <para>It requires twice as much disk storage as a + non-redundant solution.</para> + </listitem> + + <listitem> + <para>Writes must be performed to both drives, so they take up + twice the bandwidth of a non-mirrored volume. Reads do not + suffer from a performance penalty and can even be + faster.</para> + </listitem> + </itemizedlist> + + <indexterm><primary><acronym>RAID</acronym>-5</primary></indexterm> + + <para>An alternative solution is <emphasis>parity</emphasis>, + implemented in <acronym>RAID</acronym> levels 2, 3, 4 and 5. + Of these, <acronym>RAID-5</acronym> is the most interesting. As + implemented in <devicename>vinum</devicename>, a + <acronym>RAID-5</acronym> plex is a variant on a striped plex + which dedicates one block of each stripe to parity of the + other blocks. As required by + <acronym>RAID-5</acronym>, the location of this parity block + changes from one stripe to the next.
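The parity computation itself is a plain XOR, which is what makes degraded-mode reconstruction possible: XOR-ing the parity block with the surviving data blocks yields the missing block. This is not <devicename>vinum</devicename> code, just a sketch of the arithmetic:

```python
from functools import reduce

def parity(blocks):
    """XOR the blocks of one stripe together to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*blocks))

# Three data blocks in one stripe, plus their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
p = parity(data)

# If the drive holding data[1] fails, its block is recomputed by
# XOR-ing the parity block with the surviving data blocks.
recovered = parity([data[0], data[2], p])
print(recovered == data[1])  # True
```

This is also why <acronym>RAID-5</acronym> writes are expensive: updating one data block requires reading and rewriting the stripe's parity block as well.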
The numbers in the data + blocks in <xref linkend="vinum-raid5-org"/> indicate the + relative block numbers.</para> + + <para> + <figure id="vinum-raid5-org"> + <title><acronym>RAID</acronym>-5 Organization</title> + + <graphic fileref="vinum/vinum-raid5-org"/> + </figure></para> + + <para>Compared to mirroring, <acronym>RAID-5</acronym> has the + advantage of requiring significantly less storage space. Read + access is similar to that of striped organizations, but write + access is significantly slower, approximately 25% of the read + performance. If one drive fails, the array can continue to + operate in degraded mode: a read from one of the remaining + accessible drives continues normally, but a read from the + failed drive is recalculated from the corresponding blocks on + all the remaining drives.</para> + </sect1> + + <sect1 id="vinum-objects"> + <title><devicename>vinum</devicename> Objects</title> + + <para>In order to address these problems, + <devicename>vinum</devicename> implements a four-level hierarchy + of objects:</para> + + <itemizedlist> + <listitem> + <para>The most visible object is the virtual disk, called a + <emphasis>volume</emphasis>. Volumes have essentially the + same properties as a &unix; disk drive, though there are + some minor differences. For one, they have no size + limitations.</para> + </listitem> + + <listitem> + <para>Volumes are composed of <emphasis>plexes</emphasis>, + each of which represents the total address space of a + volume. This level in the hierarchy provides redundancy. + Think of plexes as individual disks in a mirrored array, + each containing the same data.</para> + </listitem> + + <listitem> + <para>Since <devicename>vinum</devicename> exists within the + &unix; disk storage framework, it would be possible to use + &unix; partitions as the building block for multi-disk + plexes. In fact, this turns out to be too inflexible as + &unix; disks can have only a limited number of partitions.
+ Instead, <devicename>vinum</devicename> subdivides a single + &unix; partition, the <emphasis>drive</emphasis>, into + contiguous areas called <emphasis>subdisks</emphasis>, which + are used as building blocks for plexes.</para> + </listitem> + + <listitem> + <para>Subdisks reside on <devicename>vinum</devicename> + <emphasis>drives</emphasis>, currently &unix; partitions. + <devicename>vinum</devicename> drives can contain any + number of subdisks. With the exception of a small area at + the beginning of the drive, which is used for storing + configuration and state information, the entire drive is + available for data storage.</para> + </listitem> + </itemizedlist> + + <para>The following sections describe the way these objects + provide the functionality required of + <devicename>vinum</devicename>.</para> + + <sect2> + <title>Volume Size Considerations</title> + + <para>Plexes can include multiple subdisks spread over all + drives in the <devicename>vinum</devicename> configuration. + As a result, the size of an individual drive does not limit + the size of a plex or a volume.</para> + </sect2> + + <sect2> + <title>Redundant Data Storage</title> + + <para><devicename>vinum</devicename> implements mirroring by + attaching multiple plexes to a volume. Each plex is a + representation of the data in a volume. A volume may contain + between one and eight plexes.</para> + + <para>Although a plex represents the complete data of a volume, + it is possible for parts of the representation to be + physically missing, either by design (by not defining a + subdisk for parts of the plex) or by accident (as a result of + the failure of a drive). 
As long as at least one plex can + provide the data for the complete address range of the volume, + the volume is fully functional.</para> + </sect2> + + <sect2> + <title>Which Plex Organization?</title> + + <para><devicename>vinum</devicename> implements both + concatenation and striping at the plex level:</para> + + <itemizedlist> + <listitem> + <para>A <emphasis>concatenated plex</emphasis> uses the + address space of each subdisk in turn. Concatenated + plexes are the most flexible as they can contain any + number of subdisks, and the subdisks may be of different + length. The plex may be extended by adding additional + subdisks. They require less <acronym>CPU</acronym> + time than striped plexes, though the difference in + <acronym>CPU</acronym> overhead is not measurable. On + the other hand, they are most susceptible to hot spots, + where one disk is very active and others are idle.</para> + </listitem> + + <listitem> + <para>A <emphasis>striped plex</emphasis> stripes the data + across each subdisk. The subdisks must all be the same + size and there must be at least two subdisks in order to + distinguish it from a concatenated plex. The greatest + advantage of striped plexes is that they reduce hot spots. + By choosing an optimum sized stripe, about 256 kB, + the load can be evened out on the component drives. 
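The striped mapping described here is simple modular arithmetic. The following sketch (illustrative only, not <devicename>vinum</devicename>'s implementation) maps a byte offset within a striped plex to a subdisk and an offset within it, assuming the 256 kB stripe size suggested above:

```python
STRIPE = 256 * 1024  # stripe size in bytes (the ~256 kB suggested above)

def locate(offset, n_subdisks, stripe=STRIPE):
    """Map a byte offset within a striped plex to
    (subdisk index, byte offset within that subdisk)."""
    stripe_no, within = divmod(offset, stripe)
    subdisk = stripe_no % n_subdisks            # stripes rotate across subdisks
    offset_in_sd = (stripe_no // n_subdisks) * stripe + within
    return subdisk, offset_in_sd

# With four subdisks, consecutive stripes land on successive drives:
print(locate(0, 4))                 # (0, 0)
print(locate(STRIPE, 4))            # (1, 0)
print(locate(4 * STRIPE + 100, 4))  # (0, 262244)
```

Because the mapping depends on the subdisk count, adding a subdisk would reshuffle every existing block, which is why extending a striped plex is impractical.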
+ Extending a plex by adding new subdisks is so complicated + that <devicename>vinum</devicename> does not implement + it.</para> + </listitem> + </itemizedlist> + + <para><xref linkend="vinum-comparison"/> summarizes the + advantages and disadvantages of each plex organization.</para> + + <table id="vinum-comparison" frame="none"> + <title><devicename>vinum</devicename> Plex + Organizations</title> + + <tgroup cols="5"> + <thead> + <row> + <entry>Plex type</entry> + <entry>Minimum subdisks</entry> + <entry>Can add subdisks</entry> + <entry>Must be equal size</entry> + <entry>Application</entry> + </row> + </thead> + + <tbody> + <row> + <entry>concatenated</entry> + <entry>1</entry> + <entry>yes</entry> + <entry>no</entry> + <entry>Large data storage with maximum placement + flexibility and moderate performance</entry> + </row> + + <row> + <entry>striped</entry> + <entry>2</entry> + <entry>no</entry> + <entry>yes</entry> + <entry>High performance in combination with highly + concurrent access</entry> + </row> + </tbody> + </tgroup> + </table> + </sect2> + </sect1> + + <sect1 id="vinum-examples"> + <title>Some Examples</title> + + <para><devicename>vinum</devicename> maintains a + <emphasis>configuration database</emphasis> which describes the + objects known to an individual system. Initially, the user + creates the configuration database from one or more + configuration files using &man.gvinum.8;. + <devicename>vinum</devicename> stores a copy of its + configuration database on each disk + <emphasis>device</emphasis> under its control. This database is + updated on each state change, so that a restart accurately + restores the state of each + <devicename>vinum</devicename> object.</para> + + <sect2> + <title>The Configuration File</title> + + <para>The configuration file describes individual + <devicename>vinum</devicename> objects. 
The definition of a + simple volume might be:</para> + + <programlisting> drive a device /dev/da3h + volume myvol + plex org concat + sd length 512m drive a</programlisting> + + <para>This file describes four <devicename>vinum</devicename> + objects:</para> + + <itemizedlist> + <listitem> + <para>The <emphasis>drive</emphasis> line describes a disk + partition (<emphasis>drive</emphasis>) and its location + relative to the underlying hardware. It is given the + symbolic name <emphasis>a</emphasis>. This separation of + symbolic names from device names allows disks to be moved + from one location to another without confusion.</para> + </listitem> + + <listitem> + <para>The <emphasis>volume</emphasis> line describes a + volume. The only required attribute is the name, in this + case <emphasis>myvol</emphasis>.</para> + </listitem> + + <listitem> + <para>The <emphasis>plex</emphasis> line defines a plex. + The only required parameter is the organization, in this + case <emphasis>concat</emphasis>. No name is necessary as + the system automatically generates a name from the volume + name by adding the suffix + <emphasis>.p</emphasis><emphasis>x</emphasis>, where + <emphasis>x</emphasis> is the number of the plex in the + volume. Thus this plex will be called + <emphasis>myvol.p0</emphasis>.</para> + </listitem> + + <listitem> + <para>The <emphasis>sd</emphasis> line describes a subdisk. + The minimum specifications are the name of a drive on + which to store it, and the length of the subdisk. No name + is necessary as the system automatically assigns names + derived from the plex name by adding the suffix + <emphasis>.s</emphasis><emphasis>x</emphasis>, where + <emphasis>x</emphasis> is the number of the subdisk in + the plex. 
Thus <devicename>vinum</devicename> gives this + subdisk the name <emphasis>myvol.p0.s0</emphasis>.</para> + </listitem> + </itemizedlist> + + <para>After processing this file, &man.gvinum.8; produces the + following output:</para> + + <programlisting width="97"> + &prompt.root; gvinum -> <userinput>create config1</userinput> + Configuration summary + Drives: 1 (4 configured) + Volumes: 1 (4 configured) + Plexes: 1 (8 configured) + Subdisks: 1 (16 configured) + + D a State: up Device /dev/da3h Avail: 2061/2573 MB (80%) + + V myvol State: up Plexes: 1 Size: 512 MB + + P myvol.p0 C State: up Subdisks: 1 Size: 512 MB + + S myvol.p0.s0 State: up PO: 0 B Size: 512 MB</programlisting> + + <para>This output shows the brief listing format of + &man.gvinum.8;. It is represented graphically in <xref + linkend="vinum-simple-vol"/>.</para> + + <para> + <figure id="vinum-simple-vol"> + <title>A Simple <devicename>vinum</devicename> + Volume</title> + + <graphic fileref="vinum/vinum-simple-vol"/> + </figure></para> + + <para>This figure, and the ones which follow, represent a + volume, which contains the plexes, which in turn contains the + subdisks. In this example, the volume contains one plex, and + the plex contains one subdisk.</para> + + <para>This particular volume has no specific advantage over a + conventional disk partition. It contains a single plex, so it + is not redundant. The plex contains a single subdisk, so + there is no difference in storage allocation from a + conventional disk partition. The following sections + illustrate various more interesting configuration + methods.</para> + </sect2> + + <sect2> + <title>Increased Resilience: Mirroring</title> + + <para>The resilience of a volume can be increased by mirroring. + When laying out a mirrored volume, it is important to ensure + that the subdisks of each plex are on different drives, so + that a drive failure will not take down both plexes. 
The + following configuration mirrors a volume:</para> + + <programlisting> drive b device /dev/da4h + volume mirror + plex org concat + sd length 512m drive a + plex org concat + sd length 512m drive b</programlisting> + + <para>In this example, it was not necessary to specify a + definition of drive <emphasis>a</emphasis> again, since + <devicename>vinum</devicename> keeps track of all objects in + its configuration database. After processing this definition, + the configuration looks like:</para> + + <programlisting width="97"> + Drives: 2 (4 configured) + Volumes: 2 (4 configured) + Plexes: 3 (8 configured) + Subdisks: 3 (16 configured) + + D a State: up Device /dev/da3h Avail: 1549/2573 MB (60%) + D b State: up Device /dev/da4h Avail: 2061/2573 MB (80%) + + V myvol State: up Plexes: 1 Size: 512 MB + V mirror State: up Plexes: 2 Size: 512 MB + + P myvol.p0 C State: up Subdisks: 1 Size: 512 MB + P mirror.p0 C State: up Subdisks: 1 Size: 512 MB + P mirror.p1 C State: initializing Subdisks: 1 Size: 512 MB + + S myvol.p0.s0 State: up PO: 0 B Size: 512 MB + S mirror.p0.s0 State: up PO: 0 B Size: 512 MB + S mirror.p1.s0 State: empty PO: 0 B Size: 512 MB</programlisting> + + <para><xref linkend="vinum-mirrored-vol"/> shows the + structure graphically.</para> + + <para> + <figure id="vinum-mirrored-vol"> + <title>A Mirrored <devicename>vinum</devicename> + Volume</title> + + <graphic fileref="vinum/vinum-mirrored-vol"/> + </figure></para> + + <para>In this example, each plex contains the full 512 MB + of address space. As in the previous example, each plex + contains only a single subdisk.</para> + </sect2> + + <sect2> + <title>Optimizing Performance</title> + + <para>The mirrored volume in the previous example is more + resistant to failure than an unmirrored volume, but its + performance is less as each write to the volume requires a + write to both drives, using up a greater proportion of the + total disk bandwidth. 
Performance considerations demand a + different approach: instead of mirroring, the data is striped + across as many disk drives as possible. The following + configuration shows a volume with a plex striped across four + disk drives:</para> + + <programlisting> drive c device /dev/da5h + drive d device /dev/da6h + volume striped + plex org striped 512k + sd length 128m drive a + sd length 128m drive b + sd length 128m drive c + sd length 128m drive d</programlisting> + + <para>As before, it is not necessary to define the drives which + are already known to <devicename>vinum</devicename>. After + processing this definition, the configuration looks + like:</para> + + <programlisting width="92"> + Drives: 4 (4 configured) + Volumes: 3 (4 configured) + Plexes: 4 (8 configured) + Subdisks: 7 (16 configured) + + D a State: up Device /dev/da3h Avail: 1421/2573 MB (55%) + D b State: up Device /dev/da4h Avail: 1933/2573 MB (75%) + D c State: up Device /dev/da5h Avail: 2445/2573 MB (95%) + D d State: up Device /dev/da6h Avail: 2445/2573 MB (95%) + + V myvol State: up Plexes: 1 Size: 512 MB + V mirror State: up Plexes: 2 Size: 512 MB + V striped State: up Plexes: 1 Size: 512 MB + + P myvol.p0 C State: up Subdisks: 1 Size: 512 MB + P mirror.p0 C State: up Subdisks: 1 Size: 512 MB + P mirror.p1 C State: initializing Subdisks: 1 Size: 512 MB + P striped.p0 S State: up Subdisks: 4 Size: 512 MB + + S myvol.p0.s0 State: up PO: 0 B Size: 512 MB + S mirror.p0.s0 State: up PO: 0 B Size: 512 MB + S mirror.p1.s0 State: empty PO: 0 B Size: 512 MB + S striped.p0.s0 State: up PO: 0 B Size: 128 MB + S striped.p0.s1 State: up PO: 512 kB Size: 128 MB + S striped.p0.s2 State: up PO: 1024 kB Size: 128 MB + S striped.p0.s3 State: up PO: 1536 kB Size: 128 MB</programlisting> + + <para> + <figure id="vinum-striped-vol"> + <title>A Striped <devicename>vinum</devicename> + Volume</title> + + <graphic fileref="vinum/vinum-striped-vol"/> + </figure></para> + + <para>This volume is represented in <xref +
linkend="vinum-striped-vol"/>. The darkness of the + stripes indicates the position within the plex address space, + where the lightest stripes come first and the darkest + last.</para> + </sect2> + + <sect2> + <title>Resilience and Performance</title> + + <para><anchor id="vinum-resilience"/>With sufficient hardware, + it is possible to build volumes which show both increased + resilience and increased performance compared to standard + &unix; partitions. A typical configuration file might + be:</para> + + <programlisting> volume raid10 + plex org striped 512k + sd length 102480k drive a + sd length 102480k drive b + sd length 102480k drive c + sd length 102480k drive d + sd length 102480k drive e + plex org striped 512k + sd length 102480k drive c + sd length 102480k drive d + sd length 102480k drive e + sd length 102480k drive a + sd length 102480k drive b</programlisting> + + <para>The subdisks of the second plex are offset by two drives + from those of the first plex. This helps to ensure that + writes do not go to the same subdisks even if a transfer goes + over two drives.</para> + + <para><xref linkend="vinum-raid10-vol"/> represents the + structure of this volume.</para> + + <para> + <figure id="vinum-raid10-vol"> + <title>A Mirrored, Striped <devicename>vinum</devicename> + Volume</title> + + <graphic fileref="vinum/vinum-raid10-vol"/> + </figure></para> + </sect2> + </sect1> + + <sect1 id="vinum-object-naming"> + <title>Object Naming</title> + + <para><devicename>vinum</devicename> assigns default names to + plexes and subdisks, although they may be overridden. + Overriding the default names is not recommended, as it brings + no significant advantage and can cause + confusion.</para> + + <para>Names may contain any non-blank character, but it is + recommended to restrict them to letters, digits, and the + underscore character.
The names of volumes, plexes, and + subdisks may be up to 64 characters long, and the names of + drives may be up to 32 characters long.</para> + + <para><devicename>vinum</devicename> objects are assigned device + nodes in the hierarchy <filename + class="directory">/dev/gvinum</filename>. The configuration + shown above would cause <devicename>vinum</devicename> to create + the following device nodes:</para> + + <itemizedlist> + <listitem> + <para>Device entries for each volume. These are the main + devices used by <devicename>vinum</devicename>. The + configuration above would include the devices + <filename class="devicefile">/dev/gvinum/myvol</filename>, + <filename class="devicefile">/dev/gvinum/mirror</filename>, + <filename class="devicefile">/dev/gvinum/striped</filename>, + <filename class="devicefile">/dev/gvinum/raid5</filename> + and <filename + class="devicefile">/dev/gvinum/raid10</filename>.</para> + </listitem> + + <listitem> + <para>All volumes get direct entries under + <filename class="directory">/dev/gvinum/</filename>.</para> + </listitem> + + <listitem> + <para>The directories + <filename class="directory">/dev/gvinum/plex</filename>, and + <filename class="directory">/dev/gvinum/sd</filename>, which + contain device nodes for each plex and for each subdisk, + respectively.</para> + </listitem> + </itemizedlist> + + <para>For example, consider the following configuration + file:</para> + + <programlisting> drive drive1 device /dev/sd1h + drive drive2 device /dev/sd2h + drive drive3 device /dev/sd3h + drive drive4 device /dev/sd4h + volume s64 setupstate + plex org striped 64k + sd length 100m drive drive1 + sd length 100m drive drive2 + sd length 100m drive drive3 + sd length 100m drive drive4</programlisting> + + <para>After processing this file, &man.gvinum.8; creates the + following structure in <filename + class="directory">/dev/gvinum</filename>:</para> + + <programlisting> drwxr-xr-x 2 root wheel 512 Apr 13 +16:46 plex + crwxr-xr-- 1 
root wheel 91, 2 Apr 13 16:46 s64 + drwxr-xr-x 2 root wheel 512 Apr 13 16:46 sd + + /dev/gvinum/plex: + total 0 + crwxr-xr-- 1 root wheel 25, 0x10000002 Apr 13 16:46 s64.p0 + + /dev/gvinum/sd: + total 0 + crwxr-xr-- 1 root wheel 91, 0x20000002 Apr 13 16:46 s64.p0.s0 + crwxr-xr-- 1 root wheel 91, 0x20100002 Apr 13 16:46 s64.p0.s1 + crwxr-xr-- 1 root wheel 91, 0x20200002 Apr 13 16:46 s64.p0.s2 + crwxr-xr-- 1 root wheel 91, 0x20300002 Apr 13 16:46 s64.p0.s3</programlisting> + + <para>Although it is recommended that plexes and subdisks + not be given specific names, + <devicename>vinum</devicename> drives must be named. This makes + it possible to move a drive to a different location and still + recognize it automatically.</para> + + <sect2> + <title>Creating File Systems</title> + + <para>Volumes appear to the system to be identical to disks, + with one exception. Unlike &unix; drives, + <devicename>vinum</devicename> does not partition volumes, + which thus do not contain a partition table. This has + required modification to some disk utilities, notably + &man.newfs.8;, so that it does not try to interpret the last + letter of a <devicename>vinum</devicename> volume name as a + partition identifier. For example, a disk drive may have a + name like <filename class="devicefile">/dev/ad0a</filename> + or <filename class="devicefile">/dev/da2h</filename>. These + names represent the first partition + (<devicename>a</devicename>) on the first (0) IDE disk + (<devicename>ad</devicename>) and the eighth partition + (<devicename>h</devicename>) on the third (2) SCSI disk + (<devicename>da</devicename>) respectively.
By contrast, a + <devicename>vinum</devicename> volume might be called + <filename class="devicefile">/dev/gvinum/concat</filename>, + which has no relationship with a partition name.</para> + + <para>In order to create a file system on this volume, use + &man.newfs.8;:</para> + + <screen>&prompt.root; <userinput>newfs /dev/gvinum/concat</userinput></screen> + </sect2> + </sect1> + + <sect1 id="vinum-config"> + <title>Configuring <devicename>vinum</devicename></title> + + <para>The <filename>GENERIC</filename> kernel does not contain + <devicename>vinum</devicename>. It is possible to build a + custom kernel which includes <devicename>vinum</devicename>, but + this is not recommended. The standard way to start + <devicename>vinum</devicename> is as a kernel module. + &man.kldload.8; is not needed because when &man.gvinum.8; + starts, it checks whether the module has been loaded, and if it + is not, it loads it automatically.</para> + + + <sect2> + <title>Startup</title> + + <para><devicename>vinum</devicename> stores configuration + information on the disk slices in essentially the same form as + in the configuration files. When reading from the + configuration database, <devicename>vinum</devicename> + recognizes a number of keywords which are not allowed in the + configuration files. 
For example, a disk configuration might + contain the following text:</para> + + <programlisting width="119">volume myvol state up +volume bigraid state down +plex name myvol.p0 state up org concat vol myvol +plex name myvol.p1 state up org concat vol myvol +plex name myvol.p2 state init org striped 512b vol myvol +plex name bigraid.p0 state initializing org raid5 512b vol bigraid +sd name myvol.p0.s0 drive a plex myvol.p0 state up len 1048576b driveoffset 265b plexoffset 0b +sd name myvol.p0.s1 drive b plex myvol.p0 state up len 1048576b driveoffset 265b plexoffset 1048576b +sd name myvol.p1.s0 drive c plex myvol.p1 state up len 1048576b driveoffset 265b plexoffset 0b +sd name myvol.p1.s1 drive d plex myvol.p1 state up len 1048576b driveoffset 265b plexoffset 1048576b +sd name myvol.p2.s0 drive a plex myvol.p2 state init len 524288b driveoffset 1048841b plexoffset 0b +sd name myvol.p2.s1 drive b plex myvol.p2 state init len 524288b driveoffset 1048841b plexoffset 524288b +sd name myvol.p2.s2 drive c plex myvol.p2 state init len 524288b driveoffset 1048841b plexoffset 1048576b +sd name myvol.p2.s3 drive d plex myvol.p2 state init len 524288b driveoffset 1048841b plexoffset 1572864b +sd name bigraid.p0.s0 drive a plex bigraid.p0 state initializing len 4194304b driveoffset 1573129b plexoffset 0b +sd name bigraid.p0.s1 drive b plex bigraid.p0 state initializing len 4194304b driveoffset 1573129b plexoffset 4194304b +sd name bigraid.p0.s2 drive c plex bigraid.p0 state initializing len 4194304b driveoffset 1573129b plexoffset 8388608b +sd name bigraid.p0.s3 drive d plex bigraid.p0 state initializing len 4194304b driveoffset 1573129b plexoffset 12582912b +sd name bigraid.p0.s4 drive e plex bigraid.p0 state initializing len 4194304b driveoffset 1573129b plexoffset 16777216b</programlisting> + + <para>The obvious differences here are the presence of + explicit location information and naming, both of which are + allowed but discouraged, and the information on the
states. + <devicename>vinum</devicename> does not store information + about drives in the configuration database. It finds the + drives by scanning the configured disk drives for partitions + with a <devicename>vinum</devicename> label. This enables + <devicename>vinum</devicename> to identify drives correctly + even if they have been assigned different &unix; drive + IDs.</para> + + <sect3 id="vinum-rc-startup"> + <title>Automatic Startup</title> + + <para><emphasis>Gvinum</emphasis> starts automatically + whenever its kernel module is loaded. To load the + <emphasis>Gvinum</emphasis> module at boot time, add + <literal>geom_vinum_load="YES"</literal> to + <filename>/boot/loader.conf</filename>, as described in + &man.loader.conf.5;.</para> + + <para>When <devicename>vinum</devicename> is started with + <command>gvinum start</command>, + <devicename>vinum</devicename> reads the configuration + database from one of the <devicename>vinum</devicename> + drives. Under normal circumstances, each drive contains + an identical copy of the configuration database, so it + does not matter which drive is read. After a crash, + however, <devicename>vinum</devicename> must determine + which drive was updated most recently and read the + configuration from this drive. It then updates the + configuration, if necessary, from progressively older + drives.</para> + </sect3> + </sect2> + </sect1> + + <sect1 id="vinum-root"> + <title>Using <devicename>vinum</devicename> for the Root + File System</title> + + <para>For a machine that has fully-mirrored file systems using + <devicename>vinum</devicename>, it is desirable to also + mirror the root file system.
Setting up such a configuration + is more involved than mirroring an arbitrary file system + because:</para> + + <itemizedlist> + <listitem> + <para>The root file system must be available very early + during the boot process, so the + <devicename>vinum</devicename> infrastructure must + already be available at this time.</para> + </listitem> + <listitem> + <para>The volume containing the root file system also + contains the system bootstrap and the kernel. These must + be read using the host system's native utilities, such as + the BIOS, which often cannot be taught about the details + of <devicename>vinum</devicename>.</para> + </listitem> + </itemizedlist> + + <para>In the following sections, the term <quote>root + volume</quote> is generally used to describe the + <devicename>vinum</devicename> volume that contains the root + file system.</para> + + <sect2> + <title>Starting up <devicename>vinum</devicename> Early + Enough for the Root File System</title> + + <para><devicename>vinum</devicename> must be available early + in the system boot, so &man.loader.8; must load + the vinum kernel module before starting the kernel. This + can be accomplished by putting this line in + <filename>/boot/loader.conf</filename>:</para> + + <programlisting>geom_vinum_load="YES"</programlisting> + + </sect2> + + <sect2> + <title>Making a <devicename>vinum</devicename>-based Root + Volume Accessible to the Bootstrap</title> + + <para>The current &os; bootstrap is only 7.5 KB of code and + does not understand the internal + <devicename>vinum</devicename> structures. This means that it + cannot parse the <devicename>vinum</devicename> configuration + data or figure out the elements of a boot volume.
Thus, some + workarounds are necessary to provide the bootstrap code with + the illusion of a standard <literal>a</literal> partition + that contains the root file system.</para> + + <para>For this to be possible, the following requirements must + be met for the root volume:</para> + + <itemizedlist> + <listitem> + <para>The root volume must not be a striped or + <acronym>RAID</acronym>-5 volume.</para> + </listitem> + + <listitem> + <para>The root volume must not contain more than one + concatenated subdisk per plex.</para> + </listitem> + </itemizedlist> + + <para>Note that it is desirable and possible to use multiple + plexes, each containing one replica of the root file system. + The bootstrap process will only use one replica for finding + the bootstrap and all boot files, until the kernel mounts the + root file system. Each subdisk within these plexes + needs its own <literal>a</literal> partition illusion, for + the respective device to be bootable. It is not strictly + necessary that each of these faked <literal>a</literal> + partitions is located at the same offset within its device, + compared with other devices containing plexes of the root + volume. However, it is probably a good idea to create the + <devicename>vinum</devicename> volumes that way so the + resulting mirrored devices are symmetric, to avoid + confusion.</para> + + <para>In order to set up these <literal>a</literal> + partitions for each device containing part of the root + volume, the following is required:</para> + + <procedure> + <step> + <para>The location, offset from the beginning of the device, + and size of this device's subdisk that is part of the root + volume need to be examined, using the command:</para> + + <screen>&prompt.root; <userinput>gvinum l -rv root</userinput></screen> + + <para><devicename>vinum</devicename> offsets and sizes are + measured in bytes.
They must be divided by 512 in order + to obtain the block numbers that are to be used by + <command>bsdlabel</command>.</para> + </step> + + <step> + <para>Run this command for each device that participates in + the root volume:</para> + + <screen>&prompt.root; <userinput>bsdlabel -e <replaceable>devname</replaceable></userinput></screen> + + <para><replaceable>devname</replaceable> must be either the + name of the disk, like <devicename>da0</devicename> for + disks without a slice table, or the name of the + slice, like <devicename>ad0s1</devicename>.</para> + + <para>If there is already an <literal>a</literal> + partition on the device from a + pre-<devicename>vinum</devicename> root file system, it + should be renamed to something else so that it remains + accessible (just in case), but will no longer be used by + default to bootstrap the system. A currently mounted root + file system cannot be renamed, so this must be executed + either while booted from a <quote>Fixit</quote> + medium, or in a two-step process where, in a mirror, the + disk that the system is not currently booted from is + manipulated first.</para> + + <para>The offset of the <devicename>vinum</devicename> + partition on this device (if any) must be added to the + offset of the respective root volume subdisk on this + device. The resulting value will become the + <literal>offset</literal> value for the new + <literal>a</literal> partition. The + <literal>size</literal> value for this partition can be + taken verbatim from the calculation above. The + <literal>fstype</literal> should be + <literal>4.2BSD</literal>. The + <literal>fsize</literal>, <literal>bsize</literal>, + and <literal>cpg</literal> values should be chosen + to match the actual file system, though they are fairly + unimportant within this context.</para> + + <para>That way, a new <literal>a</literal> partition will + be established that overlaps the + <devicename>vinum</devicename> partition on this device.
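The offset and size arithmetic described in this step can be sketched as follows. The numbers here are the example values that appear later in this chapter (a subdisk at byte offset 135680, 125829120 bytes long, on a drive whose vinum partition starts at block 16); for a real setup, substitute the values reported by gvinum and bsdlabel.

```sh
#!/bin/sh
# Sketch of the bsdlabel arithmetic for the faked 'a' partition.
SECTOR=512                     # bsdlabel counts 512-byte blocks
SD_OFFSET_BYTES=135680         # subdisk offset within the vinum drive (from "gvinum l -rv root")
SD_SIZE_BYTES=125829120        # subdisk size in bytes
H_OFFSET_BLOCKS=16             # offset of the vinum ('h') partition, from bsdlabel

# offset = subdisk offset in blocks + offset of the vinum partition
A_OFFSET=$((SD_OFFSET_BYTES / SECTOR + H_OFFSET_BLOCKS))
# size is taken verbatim from the subdisk size, converted to blocks
A_SIZE=$((SD_SIZE_BYTES / SECTOR))

echo "a: size $A_SIZE offset $A_OFFSET"    # a: size 245760 offset 281
```

These are exactly the `size` and `offset` values entered for the new `a` partition in the bsdlabel editor.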
+ <command>bsdlabel</command> will only allow for this + overlap if the <devicename>vinum</devicename> partition + has properly been marked using the + <literal>vinum</literal> fstype.</para> + </step> + + <step> + <para>A faked <literal>a</literal> partition now exists + on each device that has one replica of the root volume. + It is highly recommended to verify the result using a + command like:</para> + + <screen>&prompt.root; <userinput>fsck -n /dev/<replaceable>devname</replaceable>a</userinput></screen> + </step> + </procedure> + + <para>It should be remembered that all files containing control + information must be relative to the root file system in the + <devicename>vinum</devicename> volume which, when setting up + a new <devicename>vinum</devicename> root volume, might not + match the root file system that is currently active. In + particular, <filename>/etc/fstab</filename> and + <filename>/boot/loader.conf</filename> need to be taken care + of.</para> + + <para>At the next reboot, the bootstrap should figure out the + appropriate control information from the new + <devicename>vinum</devicename>-based root file system, and act + accordingly. At the end of the kernel initialization process, + after all devices have been announced, the prominent notice + that shows the success of this setup is a message like:</para> + + <screen>Mounting root from ufs:/dev/gvinum/root</screen> + </sect2> + + <sect2> + <title>Example of a <devicename>vinum</devicename>-based Root + Setup</title> + + <para>After the <devicename>vinum</devicename> root volume has + been set up, the output of <command>gvinum l -rv + root</command> could look like:</para> + + <screen>...
+Subdisk root.p0.s0: + Size: 125829120 bytes (120 MB) + State: up + Plex root.p0 at offset 0 (0 B) + Drive disk0 (/dev/da0h) at offset 135680 (132 kB) + +Subdisk root.p1.s0: + Size: 125829120 bytes (120 MB) + State: up + Plex root.p1 at offset 0 (0 B) + Drive disk1 (/dev/da1h) at offset 135680 (132 kB)</screen> + + <para>The values to note are <literal>135680</literal> for the + offset, relative to partition + <filename class="devicefile">/dev/da0h</filename>. This + translates to 265 512-byte disk blocks in + <command>bsdlabel</command>'s terms. Likewise, the size of + this root volume is 245760 512-byte blocks. <filename + class="devicefile">/dev/da1h</filename>, containing the + second replica of this root volume, has a symmetric + setup.</para> + + <para>The bsdlabel for these devices might look like:</para> + + <screen>... +8 partitions: +# size offset fstype [fsize bsize bps/cpg] + a: 245760 281 4.2BSD 2048 16384 0 # (Cyl. 0*- 15*) + c: 71771688 0 unused 0 0 # (Cyl. 0 - 4467*) + h: 71771672 16 vinum # (Cyl. 0*- 4467*)</screen> + + <para>It can be observed that the <literal>size</literal> + parameter for the faked <literal>a</literal> partition + matches the value outlined above, while the + <literal>offset</literal> parameter is the sum of the offset + within the <devicename>vinum</devicename> partition + <literal>h</literal>, and the offset of this partition + within the device or slice. This is a typical setup that is + necessary to avoid the problem described in <xref + linkend="vinum-root-panic"/>. 
The entire + <literal>a</literal> partition is completely within the + <literal>h</literal> partition containing all the + <devicename>vinum</devicename> data for this device.</para> + + <para>In the above example, the entire device is dedicated to + <devicename>vinum</devicename> and there is no leftover + pre-<devicename>vinum</devicename> root partition.</para> + </sect2> + + <sect2> + <title>Troubleshooting</title> + + <para>The following list contains a few known pitfalls and + solutions.</para> + + <sect3> + <title>System Bootstrap Loads, but System Does Not + Boot</title> + + <para>If for any reason the system does not continue to boot, + the bootstrap can be interrupted by pressing + <keycap>space</keycap> at the 10-second warning. The + loader variable <literal>vinum.autostart</literal> can be + examined by typing <command>show</command> and manipulated + using <command>set</command> or + <command>unset</command>.</para> + + <para>If the <devicename>vinum</devicename> kernel module was + not yet in the list of modules to load automatically, type + <command>load geom_vinum</command>.</para> + + <para>When ready, the boot process can be continued by typing + <command>boot -as</command>. The + <option>-as</option> options request the kernel to ask for the + root file system to mount (<option>-a</option>) and to stop the + boot process in single-user mode (<option>-s</option>), + where the root file system is mounted read-only. That way, + even if only one plex of a multi-plex volume has been + mounted, no data inconsistency between plexes is + risked.</para> + + <para>At the prompt asking for a root file system to mount, + any device that contains a valid root file system can be + entered. If <filename>/etc/fstab</filename> is set up + correctly, the default should be something like + <literal>ufs:/dev/gvinum/root</literal>.
A typical + alternate choice would be something like + <literal>ufs:da0d</literal>, which could be a + hypothetical partition containing the + pre-<devicename>vinum</devicename> root file system. If one + of the alias <literal>a</literal> partitions is entered + here, care should be taken that it actually references a + subdisk of the + <devicename>vinum</devicename> root device, because in a + mirrored setup this would mount only one piece of the + mirrored root device. If this file system is to be mounted + read-write later on, it is necessary to remove the other + plex(es) of the <devicename>vinum</devicename> root volume + since these plexes would otherwise carry inconsistent + data.</para> + </sect3> + + <sect3> + <title>Only Primary Bootstrap Loads</title> + + <para>If <filename>/boot/loader</filename> fails to load, but + the primary bootstrap still loads (visible by a single dash + in the left column of the screen right after the boot + process starts), an attempt can be made to interrupt the + primary bootstrap by pressing + <keycap>space</keycap>. This will make the bootstrap stop + in <link linkend="boot-boot1">stage two</link>. An attempt + can be made here to boot off an alternate partition, like + the partition containing the previous root file system that + has been moved away from <literal>a</literal>.</para> + </sect3> + + <sect3 id="vinum-root-panic"> + <title>Nothing Boots, the Bootstrap + Panics</title> + + <para>This situation will happen if the bootstrap has been + destroyed by the <devicename>vinum</devicename> + installation. Unfortunately, <devicename>vinum</devicename> + accidentally leaves only 4 KB at the beginning of its + partition free before starting to write its + <devicename>vinum</devicename> header information. However, + the stage one and two bootstraps plus the bsdlabel require 8 + KB.
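The collision can be put in numbers (a small sketch of my own, based on the sizes just stated):

```sh
#!/bin/sh
# The bootstrap occupies the first 8 KB of a bootable slice, but
# vinum starts writing its header after only 4 KB, so the two overlap.
BOOTSTRAP_BYTES=$((8 * 1024))       # stage one/two bootstraps plus bsdlabel
VINUM_HEADER_OFFSET=$((4 * 1024))   # free space vinum leaves at the start

OVERLAP=$((BOOTSTRAP_BYTES - VINUM_HEADER_OFFSET))
echo "overlap: $OVERLAP bytes"      # overlap: 4096 bytes
echo "move the vinum partition by at least $OVERLAP bytes to avoid it"
```

This is why moving the vinum partition by at least 4 KB, as described below, resolves the conflict.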
So if a <devicename>vinum</devicename> partition was + started at offset 0 within a slice or disk that was meant to + be bootable, the <devicename>vinum</devicename> setup will + trash the bootstrap.</para> + + <para>Similarly, if the above situation has been recovered + by booting from a <quote>Fixit</quote> medium and the + bootstrap has been re-installed using + <command>bsdlabel -B</command> as described in <xref + linkend="boot-boot1"/>, the bootstrap will trash the + <devicename>vinum</devicename> header, and + <devicename>vinum</devicename> will no longer find its + disk(s). Although no actual <devicename>vinum</devicename> + configuration data or data in <devicename>vinum</devicename> + volumes will be trashed, and it would be possible to recover + all the data by entering exactly the same + <devicename>vinum</devicename> configuration data again, the + situation is hard to fix. It is necessary to move the + entire <devicename>vinum</devicename> partition by at least + 4 KB, in order to have the <devicename>vinum</devicename> + header and the system bootstrap no longer collide.</para> + </sect3> + </sect2> + </sect1> +</chapter>