Diffstat (limited to 'en_US.ISO8859-1/articles/vinum/article.sgml')
-rw-r--r-- en_US.ISO8859-1/articles/vinum/article.sgml 2547
1 files changed, 0 insertions, 2547 deletions
diff --git a/en_US.ISO8859-1/articles/vinum/article.sgml b/en_US.ISO8859-1/articles/vinum/article.sgml
deleted file mode 100644
index 730605d2cc..0000000000
--- a/en_US.ISO8859-1/articles/vinum/article.sgml
+++ /dev/null
@@ -1,2547 +0,0 @@
-<!-- $FreeBSD$ -->
-<!-- FreeBSD Documentation Project -->
-
-<!DOCTYPE article PUBLIC "-//FreeBSD//DTD DocBook V4.1-Based Extension//EN" [
-<!ENTITY % articles.ent PUBLIC "-//FreeBSD//ENTITIES DocBook FreeBSD Articles Entity Set//EN">
-%articles.ent;
-
-<!ENTITY vinum.ap "<application>Vinum</application>">
-]>
-
-<article>
- <articleinfo>
- <title>
- Bootstrapping Vinum: A Foundation for Reliable Servers
- </title>
- <author>
- <firstname>Robert A.</firstname>
- <surname>Van Valzah</surname>
- </author>
- <copyright>
- <year>2001</year>
- <holder>Robert A. Van Valzah</holder>
- </copyright>
- <pubdate>$Date: 2004-08-08 13:43:56 $ GMT</pubdate>
- <releaseinfo>$Id: article.sgml,v 1.15 2004-08-08 13:43:56 hrs Exp $</releaseinfo>
- <legalnotice id="trademarks" role="trademarks">
- &tm-attrib.freebsd;
- &tm-attrib.general;
- </legalnotice>
- </articleinfo>
-
- <abstract>
- <para> In the most abstract sense, these instructions show how
- to build a pair of disk drives where either one is adequate
- to keep your server running if the other fails.
- Life is better if they are both working, but your server will never die
- unless both disk drives die at once.
- If you choose ATAPI drives and use a fairly generic kernel, you can
- be confident that either of these drives can be plugged into most any
- main board to produce a working server in a pinch.
- The drives need not be identical.
- These techniques work as well with SCSI drives as they do with ATAPI,
- but I will focus on ATAPI here because main boards with this interface are
- ubiquitous.
- After building the foundation of a reliable server as shown here, you
- can expand to as many disk drives as necessary to build the
- failure-resilient server of your dreams.</para>
- </abstract>
-
- <section id="Introduction">
- <title>Introduction</title>
-
- <para>Any machine that is going to provide reliable service needs
- to have either redundant components on-line or a pool of
- off-line spares that can be promptly swapped in. Commodity
- PC hardware makes it affordable for even small organizations
- to have some spare parts available that could be pressed
- into service following the failure of production equipment.
- In many organizations, a failed power supply, NIC, memory,
- or main board could easily be swapped with a standby in a
- matter of minutes and be ready to return to production work.</para>
-
- <para>If a disk drive fails, however, it often has to be restored
- from a tape backup. This may take many hours. With disk
- drive capacities rising faster than tape drive capacities,
- the time needed to restore a failed disk drive seems to
- increase as technology progresses.</para>
-
- <para>&vinum.ap;
- is a volume manager for FreeBSD that provides a standard block
- I/O layer interface to the filesystem code just as any hardware
- device driver would.
- It works by managing partitions
- of type <literal>vinum</literal> and
- allows you to subdivide and group the space in such
- partitions into logical devices called
- <firstterm>volumes</firstterm> that
- can be used in the same way as disk partitions.
- Volumes can
- be configured for resilience, performance, or both. Experienced
- system administrators will immediately recognize the benefits
- of being able to configure each filesystem to match the way
- it is most often used.</para>
-
- <para>In some ways, <application>Vinum</application> is similar to
- &man.ccd.4;, but it is far more flexible and robust in the face
- of failures.
- It is only slightly more difficult to set up than &man.ccd.4;.
- &man.ccd.4; may meet your needs if you are only interested in
- concatenation.</para>
-
- <section id="Terminology">
- <title>Terminology</title>
-
- <para>Discussion of storage management can get very tricky
- simply because of the terminology involved.
- As we will see below,
- the terms <firstterm>disk</firstterm>,
- <firstterm>slice</firstterm>, <firstterm>partition</firstterm>,
- <firstterm>subdisk</firstterm>, and <firstterm>volume</firstterm>
- each refer to different things that present the same interface
- to a kernel function like swapping.
- The potential for confusion is compounded because the objects
- that these terms represent can be nested inside each other.</para>
-
- <para>I will refer to a physical disk drive as a
- <firstterm>spindle</firstterm>.
- A <firstterm>partition</firstterm> here means a BSD partition as
- maintained by <command>disklabel</command>.
- It does not refer to <firstterm>slices</firstterm> or
- <firstterm>BIOS partitions</firstterm> as
- maintained by <command>fdisk</command>.</para>
- </section>
-
- <section id="Objects">
- <title>Vinum Objects</title>
-
- <para><application>Vinum</application>
- defines a hierarchy of four objects that it uses to manage storage
- (see <xref linkend="arch">).
- Different combinations of these objects are used to achieve
- failure resilience, performance, and/or extra capacity.
- I will give a whirlwind tour of the objects here--see the
- <ulink url="http://www.vinumvm.org/">Vinum web site</ulink>
- for a more thorough description.</para>
-
- <figure id="arch">
- <title>Vinum Objects and Architecture</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="arch" format="EPS">
- </imageobject>
-
- <textobject>
- <literallayout class="monospaced">+-----+------+------+
-| UFS | swap | Etc. |
-+---+-+------+----+ +
-| volume | |
-+ V +-------------+ +
-| i plex | |
-+ n +-------------+ +
-| u subdisk | |
-+ m +-------------+ +
-| drive | |
-+-----------------+ +
-| Block I/O devices |
-+-------------------+</literallayout>
- </textobject>
-
- <textobject>
- <phrase>Vinum Objects and Architecture</phrase>
- </textobject>
- </mediaobject>
- </figure>
-
- <para>The top object, a vinum <firstterm>volume</firstterm>,
- implements a virtual disk that
- provides a standard block I/O layer
- interface to other parts of the kernel.
- The bottom object, a vinum <firstterm>drive</firstterm>,
- uses this same interface to
- request I/O from physical devices below it.</para>
-
- <para>In between these two (from top to bottom) we have objects called
- a vinum <firstterm>plex</firstterm>
- and a vinum <firstterm>subdisk</firstterm>.
- As you can probably guess from the name, a vinum subdisk is a
- contiguous subset of the space available on a vinum drive.
- It lets you subdivide a vinum drive in much the same way that
- a BSD partition lets you subdivide a BIOS slice.</para>
-
- <para>A plex allows subdisks to be grouped together making the space
- of all subdisks available as a single object.</para>
-
- <para>A plex can be organized with its constituent subdisks concatenated
- or striped.
- Both organizations are useful for spreading I/O requests across
- spindles when the subdisks of a plex reside on distinct spindles.
- A striped plex will switch spindles each time a multiple of the
- stripe size is reached.
- A concatenated plex will switch spindles only when the end of
- a subdisk is reached.</para>
-
- <para>An important characteristic of a
- <application>Vinum</application> volume is that it can be
- made up of more than one plex.
- In this case, writes go to all plexes and a read may be satisfied
- by any plex.
- Configuring two or more plexes on distinct spindles yields a
- volume that is resilient to failure.</para>
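-
- <para>As a minimal sketch of how these objects fit together,
- the following configuration fragment (in the format read by
- <command>vinum create</command>) describes one volume made up of
- two concatenated plexes, each with a single subdisk on its own drive.
- The volume name <literal>example</literal>, the drive names
- <literal>a</literal> and <literal>b</literal>, the device names,
- and the subdisk length are assumptions chosen for illustration;
- they are not the values used elsewhere in this article.</para>
-
- <programlisting># Two vinum drives, each a partition of type vinum (hypothetical devices).
-drive a device /dev/ad0s1h
-drive b device /dev/ad2s1h
-# One volume with two plexes: writes go to both, a read may use either.
-volume example
-  plex org concat
-    sd length 512m drive a
-  plex org concat
-    sd length 512m drive b</programlisting>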
-
- <para><application>Vinum</application> maintains a
- <firstterm>configuration</firstterm>
- that defines instances of the above objects and the way they
- are related to each other.
- This configuration is automatically written to all spindles under
- <application>Vinum</application> management whenever it changes.</para>
- </section>
-
- <section id="Organizations">
- <title>Vinum Volume/Plex Organization</title>
-
- <para>Although <application>Vinum</application>
- can manage any number of spindles,
- I will only cover scenarios with two spindles here
- for simplification.
- <xref linkend=OrgCompare> shows how
- two spindles organized with <application>Vinum</application>
- compare to two spindles without <application>Vinum</application>.</para>
-
- <para>
- <table id=OrgCompare frame=all>
- <title>Characteristics of Two Spindles Organized with Vinum</title>
-
- <tgroup cols="5">
- <thead>
- <row>
- <entry>Organization</entry>
- <entry>Total Capacity</entry>
- <entry>Failure Resilient</entry>
- <entry>Peak Read Performance</entry>
- <entry>Peak Write Performance</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>Concatenated Plexes</entry>
- <entry>Unchanged, but appears as a single drive</entry>
- <entry>No</entry>
- <entry>Unchanged</entry>
- <entry>Unchanged</entry>
- </row>
- <row>
- <entry>Striped Plexes (RAID-0)</entry>
- <entry>Unchanged, but appears as a single drive</entry>
- <entry>No</entry>
- <entry>2x</entry>
- <entry>2x</entry>
- </row>
- <row>
- <entry>Mirrored Volumes (RAID-1)</entry>
- <entry>1/2, appearing as a single drive</entry>
- <entry>Yes</entry>
- <entry>2x</entry>
- <entry>Unchanged</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </para>
-
- <para><xref linkend=OrgCompare> shows that striping yields
- the same capacity and lack of failure resilience
- as concatenation, but it has better peak read and write performance.
- Hence we will not be using concatenation in any of the examples here.
- Mirrored volumes provide the benefits of improved peak read performance
- and failure resilience--but this comes at a loss in capacity.</para>
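-
- <para>For comparison, a striped plex across two spindles might be
- described to <command>vinum create</command> roughly as follows.
- This is only a sketch: the volume name <literal>stripe</literal>,
- the drive names <literal>a</literal> and <literal>b</literal>,
- the device names, the 512k stripe size, and the subdisk lengths
- are all assumptions chosen for illustration.</para>
-
- <programlisting># Hypothetical drives on two different spindles.
-drive a device /dev/ad0s1h
-drive b device /dev/ad2s1h
-# A single striped plex: no resilience, but I/O alternates between
-# the two subdisks each time a stripe boundary is crossed.
-volume stripe
-  plex org striped 512k
-    sd length 128m drive a
-    sd length 128m drive b</programlisting>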
-
- <note><para>Both concatenation and striping bring their benefits over a
- single spindle at the cost of increased likelihood of failure since
- more than one spindle is now involved.</para></note>
-
- <para>When three or more spindles are present,
- <application>Vinum</application> also supports rotated,
- block-interleaved parity (also called <firstterm>RAID-5</firstterm>)
- that provides better
- capacity than mirroring (but not quite as good as striping), better
- read performance than both mirroring and striping,
- and good failure resilience.
- There is, however,
- a substantial decrease in write performance with RAID-5.
- Most of the benefits become more pronounced with five or more
- spindles.</para>
-
- <para>The organizations described above may be combined to provide
- benefits that no single organization can match.
- For example, mirroring and striping can be combined to provide
- failure-resilience with very fast read performance.</para>
-
- </section>
-
- <section id="History">
- <title>Vinum History</title>
-
- <para><application>Vinum</application>
- is a standard part of even a "minimum" FreeBSD distribution and
- it has been standard since 3.0-RELEASE.
- The official pronunciation of the name is
- <emphasis>VEE-noom</emphasis>.</para>
-
- <para>&vinum.ap; was inspired by the Veritas Volume Manager, but
- was not derived from it.
- The name is a play on that history and the Latin adage
- <foreignphrase>In Vino Veritas</foreignphrase>
- (<foreignphrase>Vino</foreignphrase> is the ablative form of
- <foreignphrase>Vinum</foreignphrase>).
- Literally translated, that is <quote>Truth lies in wine,</quote> hinting that
- drunkards have a hard time lying.
- </para>
-
- <para>I have been using it in production on six different servers for
- over two years with no data loss.
- Like the rest of FreeBSD, <application>Vinum</application>
- provides <quote>rock-stable performance.</quote>
- (On a personal note, I have seen <application>Vinum</application>
- panic when I misconfigured something, but I have
- never had any trouble in normal operation.)
- Greg Lehey wrote
- <application>Vinum</application> for FreeBSD,
- but he is seeking
- help in porting it to NetBSD and OpenBSD.</para>
-
- <warning>
- <para>Just like the rest of FreeBSD, <application>Vinum</application>
- is undergoing continuous
- development.
- Several subtle, but significant bugs have been fixed in recent
- releases.
- It is always best to use the most recent code base that meets your
- stability requirements.</para></warning>
-
- </section>
-
- <section id="Strategy">
- <title>Vinum Deployment Strategy</title>
-
- <para><application>Vinum</application>,
- coupled with prudent partition management, lets you
- keep <quote>warm-spare</quote> spindles on-line so that failures
- are transparent to users. Failed spindles can be replaced
- during regular maintenance periods or whenever it is convenient.
- When all spindles are working, the server benefits from increased
- performance and capacity.</para>
-
- <para>Having redundant copies of your home directory does not
- help you if the spindle holding root,
- <filename>/usr</filename>, or swap fails on your server.
- Hence I focus here on building a simple
- foundation for a failure-resilient server covering the root,
- <filename>/usr</filename>,
- <filename>/home</filename>, and swap partitions.</para>
-
- <warning>
- <para><application>Vinum</application>
- mirroring does not remove the need for making backups!
- Mirroring cannot help you recover from site disasters
- or the dreaded
- <command>rm -r -f /</command> command.</para></warning>
- </section>
-
- <section id="WhyBootstrap">
- <title>Why Bootstrap Vinum?</title>
-
- <para>It is possible to add <application>Vinum</application>
- to a server configuration after
- it is already in production use, but this is much harder than
- designing for it from the start. Ironically,
- <application>Vinum</application> is not supported by
- <command>/stand/sysinstall</command>
- and hence you cannot install
- <filename>/usr</filename> right onto a
- <application>Vinum</application> volume.</para>
-
- <note><para><application>Vinum</application> currently does not
- support the root filesystem (this feature
- is in development).</para></note>
-
- <para>Hence it is a bit
- tricky to get started using
- <application>Vinum</application>, but these instructions
- take you through the process of planning for
- <application>Vinum</application>, installing FreeBSD
- without it, and then beginning to use it.</para>
-
- <para>I have come to call this whole process <quote>bootstrapping Vinum.</quote>
- That is, the process of getting <application>Vinum</application>
- initially installed
- and operating to the point where you have met your resilience
- or performance goals. My purpose here is to document a
- <application>Vinum</application>
- bootstrapping method that I have found works well for me.</para>
- </section>
-
- <section id="Benefits">
- <title>Vinum Benefits</title>
-
- <para>The server foundation scenario I have chosen here allows me
- to show you examples of configuring for resilience on
- <filename>/usr</filename> and
- <filename>/home</filename>.
- Yet <application>Vinum</application>
- provides benefits other than resilience--namely
- performance, capacity, and manageability.
- It can significantly improve disk performance (especially
- under multi-user loads).
- <application>Vinum</application>
- can easily concatenate many smaller disks to produce the
- illusion of a single larger disk (but my server foundation
- scenario does not allow me to illustrate these benefits here).</para>
-
- <para>For servers with many spindles, <application>Vinum</application>
- provides substantial
- benefits in volume management, particularly when coupled with
- hot-pluggable hardware. Data can be moved from spindle to
- spindle while the system is running without loss of production
- time. Again, details of this will not be given here, but once
- you get your feet wet with <application>Vinum</application>,
- other documentation will help you do things like this.
- See
- "<ulink url="http://www.vinumvm.org/vinum/vinum.ps">The Vinum
- Volume Manager</ulink>" for a technical introduction to
- <application>Vinum</application>,
- &man.vinum.8; for a description of the <command>vinum</command>
- command, and
- &man.vinum.4;
- for a description of the vinum device
- driver and the way <application>Vinum</application>
- objects are named.</para>
-
- <note>
- <para>Breaking up your disk space into smaller and smaller partitions
- has the benefit of allowing you to <quote>tune</quote> for the most common
- type of access and tends to keep disk hogs <quote>within their pens.</quote>
- However it also causes some loss in total available disk space
- due to fragmentation.</para></note>
- </section>
-
- <section id="DegradedOperation">
- <title>Server Operation in Degraded Mode</title>
-
- <para>Some disk failures in this two-spindle scenario will result in
- <application>Vinum</application>
- automatically routing
- all disk I/O to the remaining good spindle.
- Others will require brief manual intervention on the console
- to configure the server for degraded mode operation, followed by a quick reboot.
- Other than actual hardware repairs, most recovery work
- can be done while the server is running in multi-user degraded
- mode so there is as little production impact
- from failures as possible.</para>
-
- <para>I give the instructions in <xref linkend=Failures> needed to
- configure the server for degraded mode operation
- in those cases where <application>Vinum</application>
- cannot do it automatically.
- I also give the instructions needed to
- return to normal operation once the failed hardware is repaired.
- You might call these instructions <application>Vinum</application>
- failure recovery techniques.</para>
-
- <para>I recommend practicing these instructions
- by recovering from simulated failures.
- For each failure scenario, I also give tips below for simulating
- a failure even when your hardware is working well.
- Even a minimum <application>Vinum</application>
- system as described in
- <xref linkend="HW">
- below can be a good place to experiment with
- recovery techniques without impacting production equipment.</para>
- </section>
-
- <section id="HWvsSW">
- <title>Hardware RAID vs. Vinum (Software RAID)</title>
-
- <para>Manual intervention is sometimes required to configure a server for
- degraded mode because
- <application>Vinum</application>
- is implemented in software that runs after the FreeBSD
- kernel is loaded. One disadvantage of such
- <firstterm>software RAID</firstterm>
- solutions is that there is nothing that can be done to hide spindle
- failures from the BIOS or the FreeBSD boot sequence. Hence
- the manual reconfiguration of the server
- for degraded operation mentioned
- above just informs the BIOS and boot sequence of failed
- spindles.
- <firstterm>Hardware RAID</firstterm> solutions generally have an
- advantage in that they require no such reconfiguration since
- spindle failures are hidden from the BIOS and boot sequence.</para>
-
- <para>Hardware RAID, however, may have some disadvantages that can
- be significant in some cases:
- <itemizedlist>
- <listitem><para>
- The hardware RAID controller itself may become a single
- point of failure for the system.
- </para></listitem>
- <listitem><para>
- The data is usually kept in a proprietary
- format so that a disk drive cannot be simply plugged
- into another main board and booted.
- </para></listitem>
- <listitem><para>
- You often cannot mix and
- match drives with different sizes and interfaces.
- </para></listitem>
- <listitem><para>
- You are often limited to the number of drives supported by the
- hardware RAID controller (often only four or eight).
- </para></listitem>
- </itemizedlist>
- In other words, &vinum.ap; may offer advantages in that
- there is no single point of failure,
- the drives can boot on most any main board, and
- you are free to mix and match as many drives as you like, using
- whatever interfaces you choose.</para>
-
- <tip>
- <para>Keep your kernel fairly generic (or at least keep
- <filename>/kernel.GENERIC</filename> around).
- This will improve the chances that you can come back up on
- <quote>foreign</quote> hardware more quickly.</para>
- </tip>
-
- <para>The pros and cons discussed above suggest
- that the root filesystem and swap partition are good
- candidates for hardware RAID if available.
- This is especially true for servers where it is difficult for
- administrators to get console access (recall that this is sometimes
- required to configure a server for degraded mode operation).
- A server with only software RAID is well suited to office and home
- environments where an administrator can be close at hand.</para>
-
- <note><para>A common myth is that hardware RAID is always faster
- than software RAID.
- Since it runs on the host CPU, <application>Vinum</application>
- often has more CPU power and memory available than a
- dedicated RAID controller would have.
- If performance is a prime concern, it is best to benchmark
- your application running on your CPU with your spindles using
- both hardware and software RAID systems before making
- a decision.</para></note>
-
- </section>
-
- <section id="HW">
- <title>Hardware for Vinum</title>
-
- <para>These instructions may be timely since commodity PC hardware
- can now easily host several hundred gigabytes of reasonably
- high-performance disk space at a low price. Many disk
- drive manufacturers now sell 7,200 RPM disk drives with quite
- low seek times and high transfer rates through ATA-100
- interfaces, all at very attractive prices. Four such drives,
- attached to a suitable main board and configured with
- <application>Vinum</application>
- and prudent partitioning, yield a failure-resilient, high
- performance disk server at a very reasonable cost.</para>
-
- <para>However, you can indeed get started with
- <application>Vinum</application> very simply.
- A minimum system can be as simple as
- an old CPU (even a 486 is fine) and a pair of drives
- that are 500 MB or more. They need not be the same size or
- even use the same interface (i.e., it is fine to mix ATAPI and
- SCSI). So get busy and give this a try today! You will have
- the foundation of a failure-resilient server running in an
- hour or so!</para>
- </section>
- </section>
-
- <section id="BootstrappingPhases">
- <title>Bootstrapping Phases</title>
-
- <para>Greg Lehey suggested this bootstrapping method.
- It uses knowledge of how <application>Vinum</application>
- internally allocates disk space to avoid copying data.
- Instead, <application>Vinum</application>
- objects are configured so that they occupy the
- same disk space where <command>/stand/sysinstall</command> built
- filesystems.
- The filesystems are thus embedded within
- <application>Vinum</application> objects without copying.</para>
-
- <para>There are several distinct phases to the
- <application>Vinum</application> bootstrapping
- procedure. Each of these phases is presented in a separate section below.
- Each section starts with a general overview of the phase and its goals.
- It then gives example steps for the two-spindle scenario
- presented here and advice on how to adapt them for your server.
- (If you are reading for a general understanding
- of <application>Vinum</application>
- bootstrapping, the example sections for each phase
- can safely be skipped.)
- The remainder of this section gives
- an overview of the entire bootstrapping process.</para>
-
- <para>Phase 1 involves planning and preparation.
- We will balance requirements
- for the server against available resources and make design
- tradeoffs.
- We will plan the transition from no
- <application>Vinum</application> to
- <application>Vinum</application>
- on just one spindle, to <application>Vinum</application>
- on two spindles.</para>
-
- <para>In phase 2, we will install a minimum FreeBSD system on a
- single spindle using partitions of type
- <literal>4.2BSD</literal> (regular UFS filesystems).</para>
-
- <para>Phase 3 will embed the non-root filesystems from phase 2 in
- <application>Vinum</application> objects.
- Note that <application>Vinum</application> will be up and
- running at this point,
- but it cannot yet provide any resilience since it only has
- one spindle on which to store data.</para>
-
- <para>Finally in phase 4, we configure <application>Vinum</application>
- on a second spindle and make a backup copy of the root filesystem.
- This will give us resilience on all filesystems.</para>
-
- <section id="P1">
- <title>Bootstrapping Phase 1: Planning and Preparation</title>
-
- <para>Our goal in this phase is to define the different partitions
- we will need and examine their requirements.
- We will also look at available disk drives and controllers and allocate
- partitions to them.
- Finally, we will determine the size of
- each partition and its use during the bootstrapping process.
- After this planning is complete, we can optionally prepare to use some
- tools that will make bootstrapping <application>Vinum</application>
- easier.</para>
-
- <para>Several key questions must be answered in this
- planning phase:</para>
-
- <itemizedlist>
- <listitem><para>
- What filesystems and partitions will be needed?
- </para></listitem>
- <listitem><para>
- How will they be used?
- </para></listitem>
- <listitem><para>
- How will we name each spindle?
- </para></listitem>
- <listitem><para>
- How will the partitions be ordered for each spindle?
- </para></listitem>
- <listitem><para>
- How will partitions be assigned to the spindles?
- </para></listitem>
- <listitem><para>
- How will partitions be configured? Resilience or performance?
- </para></listitem>
- <listitem><para>
- What technique will be used to achieve resilience?
- </para></listitem>
- <listitem><para>
- What spindles will be used?
- </para></listitem>
- <listitem><para>
- How will they be configured on the available controllers?
- </para></listitem>
- <listitem><para>
- How much space is required for each partition?
- </para></listitem>
- </itemizedlist>
-
- <section id="P1E">
- <title>Phase 1 Example</title>
-
- <para>In this example, I will assume a scenario
- where we are building
- a minimal foundation for a failure-resilient server.
- Hence we will need at least root,
- <filename>/usr</filename>,
- <filename>/home</filename>,
- and swap partitions.
- The root,
- <filename>/usr</filename>, and
- <filename>/home</filename> filesystems all need resilience since the
- server will not be much good without them.
- The swap partition needs performance first and
- generally does
- not need resilience since nothing it holds needs to be retained
- across a reboot.</para>
-
- <section>
- <title>Spindle Naming</title>
-
- <para>The kernel would refer to the master spindle on
- the primary and secondary ATA controllers as
- <devicename>/dev/ad0</devicename> and
- <devicename>/dev/ad2</devicename> respectively.
- <footnote>
- <para>
- This assumes that you have not removed the line
- <programlisting>options ATA_STATIC_ID</programlisting>
- from your kernel configuration.
- </para>
- </footnote>
- But <application>Vinum</application>
- also needs to have a name for each spindle
- that will stay the same regardless
- of how it is attached to the CPU (i.e., if the drive moves, the
- <application>Vinum</application> name moves with the drive).</para>
-
- <para>Some recovery techniques documented below suggest
- moving a spindle from
- the secondary ATA controller to the primary ATA controller.
- (Indeed, the flexibility of making such moves is a key benefit
- of <application>Vinum</application>
- especially if you are managing a large number of spindles.)
- After such a drive/controller swap,
- the kernel will see what used to be
- <devicename>/dev/ad2</devicename> as
- <devicename>/dev/ad0</devicename>
- but <application>Vinum</application>
- will still call
- it by whatever name it had when it was attached to
- <devicename>/dev/ad2</devicename>
- (i.e., when it was <quote>created</quote> or first made known to
- <application>Vinum</application>).</para>
-
- <para>Since connections can change, it is best to give
- each spindle a unique, abstract
- name that gives no hint of how it is attached.
- Avoid names that suggest a manufacturer, model number,
- physical location, or membership in a sequence
- (e.g., avoid names like
- <literal>upper</literal>, <literal>lower</literal>, etc.,
- <literal>alpha</literal>, <literal>beta</literal>, etc.,
- <literal>SCSI1</literal>, <literal>SCSI2</literal>, etc., or
- <literal>Seagate1</literal>, <literal>Seagate2</literal>, etc.).
- Such names are likely to lose their uniqueness or
- get out of sequence
- someday even if they seem like great names today.</para>
-
- <tip>
- <para>Once you have picked names for your spindles,
- label them with a permanent marker.
- If you have hot-swappable hardware, write the names on the sleds
- in which the spindles are mounted.
- This will significantly reduce the likelihood of
- error when you are moving spindles around later as
- part of failure recovery or routine system management
- procedures.</para></tip>
-
- <para>In the instructions that follow,
- <application>Vinum</application>
- will name the root spindle <literal>YouCrazy</literal>
- and the rootback spindle <literal>UpWindow</literal>.
- I will only use <devicename>/dev/ad0</devicename>
- when I want to refer to whichever
- of the two spindles is currently attached as
- <devicename>/dev/ad0</devicename>.</para>
- </section>
-
- <section>
- <title>Partition Ordering</title>
-
- <para>Modern disk drives operate with fairly uniform areal
- density across the surface of the disk.
- That implies that more data is available under the heads without
- seeking on the outer cylinders than on the inner cylinders.
- We will allocate partitions most critical to system performance
- from these outer cylinders as
- <command>/stand/sysinstall</command> generally does.</para>
-
- <para>The root filesystem is traditionally the outermost, even though
- it generally is not as critical to system performance as others.
- (However root can have a larger impact on performance if it contains
- <filename>/tmp</filename> and <filename>/var</filename> as it
- does in this example.)
- The FreeBSD boot loaders assume that the
- root filesystem lives in the <literal>a</literal> partition.
- There is no requirement that the <literal>a</literal>
- partition start on the outermost cylinders, but this
- convention makes it easier to manage disk labels.</para>
-
- <para>Swap performance is critical so it comes next on our way toward
- the center.
- I/O operations here tend to be large and contiguous.
- Having as much data under the heads as possible avoids seeking
- while swapping.</para>
-
- <para>With all the smaller partitions out of the way, we finish
- up the disk with
- <filename>/home</filename> and
- <filename>/usr</filename>.
- Access patterns here tend not to be as intense as for other
- filesystems (especially if there is an abundant supply of RAM
- and read cache hit rates are high).</para>
-
- <para>If the pair of spindles you have are large enough to allow
- for more than
- <filename>/home</filename> and
- <filename>/usr</filename>,
- it is fine to plan for additional filesystems here.</para>
-
- </section>
-
- <section>
- <title>Assigning Partitions to Spindles</title>
-
- <para>We will want to assign
- partitions to these spindles so that either can fail
- without loss of data on filesystems configured for
- resilience.</para>
-
- <para>Reliability on
- <filename>/usr</filename> and
- <filename>/home</filename>
- is best achieved using <application>Vinum</application>
- mirroring.
- Resilience will have to come differently, however, for the root
- filesystem since <application>Vinum</application>
- is not a part of the FreeBSD boot sequence.
- Here we will have to settle for two identical
- partitions, with a periodic copy from the primary to the
- backup.</para>
-
- <para>The kernel already has support for interleaved swap across
- all available partitions so there is no need for help from
- <application>Vinum</application> here.
- <command>/stand/sysinstall</command>
- will automatically configure <filename>/etc/fstab</filename>
- for all swap partitions given.</para>
-
- <para>The &vinum.ap; bootstrapping method given below
- requires a pair of spindles that I will call the
- <firstterm>root spindle</firstterm> and the
- <firstterm>rootback spindle</firstterm>.</para>
-
- <important><para>The rootback spindle must be the same size or
- larger than the root spindle.</para></important>
-
- <para>These instructions first allocate all space on the root
- spindle and then allocate exactly that amount of space on
- a rootback spindle.
- (After &vinum.ap; is bootstrapped, there is nothing special
- about either of these spindles--they are interchangeable.)
- You can later use the remaining space on the rootback spindle for
- other filesystems.</para>
-
- <para>If you have more than two spindles, the
- <literal>bootvinum</literal> Perl script and the procedure
- below will help you initialize them for use with &vinum.ap;.
- However you will have to figure out how to assign partitions
- to them on your own.</para>
-
- </section>
-
- <section>
- <title>Assigning Space to Partitions</title>
-
- <para>For this example, I will use two spindles: one with
- 4,124,673 blocks (about 2 GB) on <devicename>/dev/ad0</devicename>
- and one with 8,420,769 blocks (about 4 GB) on
- <devicename>/dev/ad2</devicename>.</para>
-
- <para>It is best to configure your two spindles on separate
- controllers so that both can operate in parallel and
- so that you will have failure resilience in case a
- controller dies.
- Note that mirrored volume write performance will be halved
- in cases where both spindles share a controller that requires
- them to operate serially (as is often the case with ATA controllers).
- In this example, one spindle is the master on the primary ATA
- controller and the other is the master on the
- secondary ATA controller.</para>
-
- <para>Recall that we will be allocating space on the smaller
- spindle first and the larger spindle second.</para>
-
- </section>
-
- <section id=AssignSmall>
- <title>Assigning Partitions on the Root Spindle</title>
-
- <para>We will allocate 200,000 blocks (about 98 MB)
- for a root filesystem on each spindle
- (<devicename>/dev/ad0s1a</devicename> and
- <devicename>/dev/ad2s1a</devicename>).
- We will initially allocate 200,265 blocks for a swap partition
- on each spindle,
- giving a total of about 195 MB of
- swap space (<devicename>/dev/ad0s1b</devicename> and
- <devicename>/dev/ad2s1b</devicename>).</para>
-
- <note><para>We will lose 265 blocks from each swap partition
- as part of the bootstrapping process.
- This is the size of the space used by
- <application>Vinum</application> to store configuration
- information.
- The space will be taken from swap and given to a vinum
- partition but will be unavailable for
- <application>Vinum</application> subdisks.</para></note>
-
- <note><para>I have done the partition allocation in nice round
- numbers of blocks just to emphasize where the 265 blocks go.
- There is nothing wrong with allocating space in MB if that is
- more convenient for you.</para></note>
-
- <para>This leaves 4,124,673 - 200,000 - 200,265 = 3,724,408 blocks
- (about 1,818 MB) on the root spindle for the partitions that
- <application>Vinum</application> will take over
- (<devicename>/dev/ad0s1e</devicename> and
- <devicename>/dev/ad0s1f</devicename>).
- From this, allocate
- 1,000,000 blocks (about 488 MB)
- for <filename>/home</filename> and the remaining
- 2,724,408 blocks (about 1,330 MB) for
- <filename>/usr</filename>.
- The 265 blocks for <application>Vinum</application> configuration
- information come out of swap, as noted above, not out of this space.
- See <xref linkend=ad0b4aft> below for a diagram.</para>
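-
- <para>As a rough check on these figures, the block counts can be
- run through <command>expr</command>.
- This is only a sketch: it assumes 512-byte blocks, so that
- 2,048 blocks make 1 MB, and <command>expr</command> rounds down
- to whole megabytes.</para>
-
-<screen>&prompt.root; <userinput>expr 4124673 - 200000 - 200265</userinput>
-3724408
-&prompt.root; <userinput>expr 3724408 / 2048</userinput>
-1818
-&prompt.root; <userinput>expr 1000000 / 2048</userinput>
-488
-&prompt.root; <userinput>expr 2724408 / 2048</userinput>
-1330</screen>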
-
- <para>The left-hand side of
- <xref linkend="ad0b4aft"> below shows what spindle ad0 will
- look like at the end of phase 2.
- The right-hand side shows what it will look like at the
- end of phase 3.</para>
-
- <figure id="ad0b4aft">
- <title>Spindle ad0 Before and After Vinum</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="ad0b4aft" format="EPS">
- </imageobject>
-
- <textobject>
- <literallayout class="monospaced"> ad0 Before Vinum Offset (blocks) ad0 After Vinum
-+----------------------+ <-- 0--> +----------------------+
-| root | | root |
-| /dev/ad0s1a | | /dev/ad0s1a |
-+----------------------+ <-- 200000--> +----------------------+
-| swap | | swap |
-| /dev/ad0s1b | | /dev/ad0s1b |
-| | 400000--> +----------------------+
-| | | Vinum drive YouCrazy |
-| | | /dev/ad0s1h |
-+----------------------+ <-- 400265--> +-----------------+ |
-| /home | | Vinum sd | |
-| /dev/ad0s1e | | home.p0.s0 | |
-+----------------------+ <--1400265--> +-----------------+ |
-| /usr | | Vinum sd | |
-| /dev/ad0s1f | | usr.p0.s0 | |
-+----------------------+ <--4124673--> +-----------------+----+
-Not to scale</literallayout>
- </textobject>
-
- <textobject>
- <phrase>Spindle /dev/ad0 Before and After Vinum</phrase>
- </textobject>
- </mediaobject>
- </figure>
-
- </section>
-
- <section id=AssignLarge>
- <title>Assigning Partitions on the Rootback Spindle</title>
-
- <para>The <filename>/rootback</filename> and swap partition sizes
- on the rootback spindle must
- match the root and swap partition sizes on the root spindle.
- That leaves 8,420,769 - 200,000 - 200,265 = 8,020,504
- blocks for the <application>Vinum</application> partition.
- Mirrors of <filename>/home</filename> and
- <filename>/usr</filename> receive the same allocation as on
- the root spindle.
- That will leave an extra 2 GB or so that we can deal
- with later.
- See <xref linkend=ad2b4aft> below for a diagram.</para>
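-
- <para>The same kind of rough check, again assuming 512-byte
- blocks, shows how much space is left over on the rootback
- spindle once the mirrors of <filename>/home</filename> and
- <filename>/usr</filename> are accounted for
- (<command>expr</command> rounds down to whole megabytes):</para>
-
-<screen>&prompt.root; <userinput>expr 8420769 - 200000 - 200265</userinput>
-8020504
-&prompt.root; <userinput>expr 8020504 - 1000000 - 2724408</userinput>
-4296096
-&prompt.root; <userinput>expr 4296096 / 2048</userinput>
-2097</screen>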
-
- <para>The left-hand side of
- <xref linkend="ad2b4aft"> below shows what spindle ad2 will
- look like at the beginning of phase 4.
- The right-hand side shows what it will look like at the end.</para>
-
- <figure id="ad2b4aft">
- <title>Spindle ad2 Before and After Vinum</title>
-
- <mediaobject>
- <imageobject>
- <imagedata fileref="ad2b4aft" format="EPS">
- </imageobject>
-
- <textobject>
- <literallayout class="monospaced"> ad2 Before Vinum Offset (blocks) ad2 After Vinum
-+----------------------+ <-- 0--> +----------------------+
-| /rootback | | /rootback |
-| /dev/ad2s1e | | /dev/ad2s1a |
-+----------------------+ <-- 200000--> +----------------------+
-| swap | | swap |
-| /dev/ad2s1b | | /dev/ad2s1b |
-| | 400000--> +----------------------+
-| | | Vinum drive UpWindow |
-| | | /dev/ad2s1h |
-+----------------------+ <-- 400265--> +-----------------+ |
-| /NOFUTURE | | Vinum sd | |
-| /dev/ad2s1f | | home.p1.s0 | |
-| | 1400265--> +-----------------+ |
-| | | Vinum sd | |
-| | | usr.p1.s0 | |
-| | 4124673--> +-----------------+ |
-| | | Vinum sd | |
-| | | hope.p0.s0 | |
-+----------------------+ <--8420769--> +-----------------+----+
-Not to scale</literallayout>
- </textobject>
-
- <textobject>
- <phrase>Spindle ad2 Before and After Vinum</phrase>
- </textobject>
- </mediaobject>
- </figure>
-
- </section>
-
- <section id=floppy>
- <title>Preparation of Tools</title>
-
- <para>The <literal>bootvinum</literal> Perl script given below in
- <xref linkend=Perl> will make the
- <application>Vinum</application> bootstrapping process much
- easier if you can run it on the machine being bootstrapped.
- It is over 200 lines long, so you will not want to type it in.
- At this point, I recommend that you
- copy it to a floppy or arrange some
- other way to have it on hand
- when it is needed later.
- For example:</para>
-
-<screen>&prompt.root; <userinput>fdformat -f 1440 /dev/fd0</userinput>
-&prompt.root; <userinput>newfs_msdos -f 1440 /dev/fd0</userinput>
-&prompt.root; <userinput>mount_msdos /dev/fd0 /mnt</userinput>
-&prompt.root; <userinput>cp /usr/share/examples/vinum/bootvinum /mnt</userinput></screen>
-
- <para>XXX Someday, I would like this script to live in
- <filename>/usr/share/examples/vinum</filename>, as the
- <command>cp</command> command above assumes.
- Until then, please use this
- <ulink url="http://www.BGPBook.Com/vinum/bootvinum">link</ulink>
- to get a copy, and copy it from wherever you saved it.</para>
- </section>
- </section>
- </section>
-
- <section id="P2">
- <title>Bootstrapping Phase 2: Minimal OS Installation</title>
-
- <para>Our goal in this phase is to complete the smallest possible
- FreeBSD installation in such a way that we can later install
- <application>Vinum</application>.
- We will use only
- partitions of type <literal>4.2BSD</literal> (i.e., regular UFS
- filesystems) since that is the only type supported by
- <command>/stand/sysinstall</command>.</para>
-
- <section id="P2E">
- <title>Phase 2 Example</title>
-
- <procedure>
- <step>
- <para>Start up the FreeBSD installation process by running
- <command>/stand/sysinstall</command> from
- installation media as you normally would.</para></step>
-
- <step>
- <para>Fdisk partition all spindles as needed.</para>
-
- <important>
- <para>Make sure to select BootMgr for all spindles.</para></important>
- </step>
-
- <step>
- <para>Partition the root spindle with appropriate block
- allocations as described above in <xref linkend=AssignSmall>.
- For this example on a 2 GB spindle, I will use
- 200,000 blocks for root, 200,265 blocks for swap,
- 1,000,000 blocks for <filename>/home</filename>, and
- the rest of the spindle (2,724,408 blocks) for
- <filename>/usr</filename>.
- (<command>/stand/sysinstall</command>
- should automatically assign these to
- <devicename>/dev/ad0s1a</devicename>,
- <devicename>/dev/ad0s1b</devicename>,
- <devicename>/dev/ad0s1e</devicename>, and
- <devicename>/dev/ad0s1f</devicename>
- by default.)</para>
-
- <note><para>If you prefer Soft Updates as I do and you are
- using 4.4-RELEASE or later, this is a good time to enable
- them.</para></note>
-
- </step>
-
- <step>
- <para>Partition the rootback spindle with the appropriate block
- allocations as described above in <xref linkend=AssignLarge>.
- For this example on a 4 GB spindle, I will use
- 200,000 blocks for <filename>/rootback</filename>,
- 200,265 blocks for swap, and
- the rest of the spindle (8,020,504 blocks) for
- <filename>/NOFUTURE</filename>.
- (<command>/stand/sysinstall</command>
- should automatically assign these to
- <devicename>/dev/ad2s1e</devicename>,
- <devicename>/dev/ad2s1b</devicename>, and
- <devicename>/dev/ad2s1f</devicename> by default.)</para>
-
- <note>
- <para>We do not really want to have a
- <filename>/NOFUTURE</filename> UFS filesystem (we
- want a vinum partition instead), but that is the
- best choice we have for the space given the limitations of
- <command>/stand/sysinstall</command>.
- Mount point names beginning with <literal>NOFUTURE</literal>
- and <literal>rootback</literal>
- serve as sentinels to the bootstrapping
- script presented in <xref linkend=Perl> below.</para></note>
- </step>
-
- <step>
- <para>Partition any other spindles with swap if desired and a
- single <filename>/NOFUTURExx</filename> filesystem.</para>
- </step>
-
- <step>
- <para>Select a minimum system install for now even if you
- want to end up with more distributions loaded later.</para>
-
- <tip>
- <para>Do not worry about system configuration options at this
- point--get <application>Vinum</application>
- set up and get the partitions in
- the right places first.</para></tip>
- </step>
-
- <step>
- <para>Exit <command>/stand/sysinstall</command> and reboot.
- Do a quick test to verify that the minimum
- installation was successful.</para>
- </step>
- </procedure>
-
- <para>The left-hand side of <xref linkend=ad0b4aft> above
- and the left-hand side of <xref linkend=ad2b4aft> above
- show how the disks will look at this point.</para>
- </section>
- </section>
-
- <section id="P3">
- <title>Bootstrapping Phase 3: Root Spindle Setup</title>
-
- <para>Our goal in this phase is to get <application>Vinum</application>
- set up and running on the
- root spindle.
- We will embed the existing
- <filename>/usr</filename> and
- <filename>/home</filename> filesystems in a
- <application>Vinum</application> partition.
- Note that the <application>Vinum</application>
- volumes created will not yet be
- failure-resilient since we have
- only one underlying <application>Vinum</application>
- drive to hold them.
- The resulting system will automatically start
- <application>Vinum</application> as it boots to multi-user mode.</para>
-
- <section id="P3E">
- <title>Phase 3 Example</title>
-
- <procedure>
- <step>
- <para>Log in as root.</para>
- </step>
-
- <step>
- <para>We will need a directory in the root filesystem in
- which to keep a few files that will be used in the
- <application>Vinum</application>
- bootstrapping process.</para>
-
- <screen>&prompt.root; <userinput>mkdir /bootvinum</userinput>
-&prompt.root; <userinput>cd /bootvinum</userinput></screen>
- </step>
-
- <step>
- <para>Several files need to be prepared for use in bootstrapping.
- I have written a Perl script that makes all the required
- files for you.
- Copy this script to <filename>/bootvinum</filename> by
- floppy disk, tape, network, or any convenient means and
- then run it.
- (If you cannot get this script copied onto the machine being
- bootstrapped, then see <xref linkend=ManualBoot>
- below for a manual alternative.)</para>
-
- <screen>&prompt.root; <userinput>cp /mnt/bootvinum .</userinput>
-&prompt.root; <userinput>./bootvinum</userinput></screen>
-
- <note><para><literal>bootvinum</literal> produces no output
- when run successfully.
- If you get any errors,
- something may have gone wrong when you were creating
- partitions with
- <command>/stand/sysinstall</command> above.</para></note>
-
- <para>Running <literal>bootvinum</literal> will:</para>
-
- <itemizedlist>
- <listitem><para>
- Create <filename>/etc/fstab.vinum</filename>
- based on what it finds
- in your existing <filename>/etc/fstab</filename>
- </para></listitem>
- <listitem><para>
- Create new disk labels for each spindle mentioned
- in <filename>/etc/fstab</filename> and keep copies of the
- current disk labels
- </para></listitem>
- <listitem><para>
- Create files needed as input to <command>vinum</command>
- <option>create</option> for building
- <application>Vinum</application> objects on each spindle
- </para></listitem>
- <listitem><para>
- Create many alternates to <filename>/etc/fstab.vinum</filename>
- that might come in handy should a spindle fail
- </para></listitem>
- </itemizedlist>
-
- <para>You may want to take a look at these files to learn more
- about the disk partitioning required for
- <application>Vinum</application> or to learn more about the
- commands needed to create
- <application>Vinum</application> objects.</para>
-
- </step>
-
- <step>
- <para>We now need to install new spindle partitioning for
- <devicename>/dev/ad0</devicename>.
- This requires that
- <devicename>/dev/ad0s1b</devicename> not be in use for
- swapping so we have to reboot in single-user mode.</para>
-
- <substeps>
- <step>
- <para>First, reboot the system.</para>
-
- <screen>&prompt.root; <userinput>reboot</userinput></screen>
- </step>
-
- <step>
- <para>Next, enter single-user mode.</para>
- <screen>Hit [Enter] to boot immediately, or any other key for command prompt.
-Booting [kernel] in 8 seconds...
-
-Type '?' for a list of commands, 'help' for more detailed help.
-ok <userinput>boot -s</userinput></screen>
-
- </step>
- </substeps>
- </step>
-
- <step>
- <para>In single-user mode, install the new partitioning
- created above.</para>
-
- <screen>&prompt.root; <userinput>cd /bootvinum</userinput>
-&prompt.root; <userinput>disklabel -R ad0s1 disklabel.ad0s1</userinput>
-&prompt.root; <userinput>disklabel -R ad2s1 disklabel.ad2s1</userinput></screen>
-
- <note><para>If you have additional spindles, repeat the
- above commands as appropriate for them.</para></note>
-
- </step>
-
- <step>
- <para>We are about to start <application>Vinum</application>
- for the first time.
- It is going to want to create several device nodes under
- <filename>/dev/vinum</filename> so we will need to mount the
- root filesystem for read/write access.</para>
-
- <screen>&prompt.root; <userinput>fsck -p /</userinput>
-&prompt.root; <userinput>mount /</userinput></screen>
- </step>
-
- <step>
- <para>Now it is time to create the <application>Vinum</application>
- objects that
- will embed the existing non-root filesystems on
- the root spindle in a
- <application>Vinum</application> partition.
- This will load the <application>Vinum</application>
- kernel module and start <application>Vinum</application>
- as a side effect.</para>
-
- <screen>&prompt.root; <userinput>vinum create create.YouCrazy</userinput></screen>
-
- <para>
- You should see a list of <application>Vinum</application>
- objects created that looks like the following:</para>
-<screen>1 drives:
-D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
-
-2 volumes:
-V home State: up Plexes: 1 Size: 488 MB
-V usr State: up Plexes: 1 Size: 1330 MB
-
-2 plexes:
-P home.p0 C State: up Subdisks: 1 Size: 488 MB
-P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
-
-2 subdisks:
-S home.p0.s0 State: up PO: 0 B Size: 488 MB
-S usr.p0.s0 State: up PO: 0 B Size: 1330 MB</screen>
- <para>
- You should also see several kernel messages
- which state that the <application>Vinum</application>
- objects you have created are now <literal>up</literal>.</para>
- </step>
-
- <step>
- <para>Our non-root filesystems should now be embedded in a
- <application>Vinum</application> partition and
- hence available through <application>Vinum</application>
- volumes.
- It is important to test that this embedding worked.</para>
-
- <screen>&prompt.root; <userinput>fsck -n /dev/vinum/home</userinput>
-&prompt.root; <userinput>fsck -n /dev/vinum/usr</userinput></screen>
-
- <para>This should produce no errors.
- If it does produce errors <emphasis>do not fix them</emphasis>.
- Instead, go back and examine the root spindle partition tables
- before and after <application>Vinum</application>
- to see if you can spot the error.
- You can back out the partition table changes by using
- <command>disklabel -R</command> with the
- <filename>disklabel.*.b4vinum</filename> files.</para>
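-
- <para>As a sketch of that back-out, assuming the saved labels
- sit in <filename>/bootvinum</filename> and that the
- <filename>disklabel.*.b4vinum</filename> pattern expands to the
- file names used below (these exact names are an assumption, so
- check the directory listing first):</para>
-
-<screen>&prompt.root; <userinput>cd /bootvinum</userinput>
-&prompt.root; <userinput>ls disklabel.*</userinput>
-&prompt.root; <userinput>disklabel -R ad0s1 disklabel.ad0s1.b4vinum</userinput>
-&prompt.root; <userinput>disklabel -R ad2s1 disklabel.ad2s1.b4vinum</userinput></screen>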
- </step>
-
- <step>
- <para>While we have the root filesystem mounted read/write, this is
- a good time to install the new <filename>/etc/fstab</filename>
- that <literal>bootvinum</literal> generated.</para>
-
- <screen>&prompt.root; <userinput>mv /etc/fstab /etc/fstab.b4vinum</userinput>
-&prompt.root; <userinput>cp /etc/fstab.vinum /etc/fstab</userinput></screen>
- </step>
-
- <step>
- <para>We are now done with tasks requiring single-user
- mode, so it is safe to go multi-user from here on.</para>
-
- <screen>&prompt.root; <userinput>^D</userinput></screen>
- </step>
-
- <step>
- <para>Log in as root.</para>
- </step>
-
- <step>
- <para>Edit <filename>/etc/rc.conf</filename> and add this line:
- <programlisting>start_vinum="YES"</programlisting></para>
- </step>
- </procedure>
- </section>
- </section>
-
- <section id="P4">
- <title>Bootstrapping Phase 4: Rootback Spindle Setup</title>
-
- <para>Our goal in this phase is to get redundant copies of all data
- from the root spindle to the rootback spindle.
- We will first create the necessary <application>Vinum</application>
- objects on the rootback spindle.
- Then we will ask <application>Vinum</application>
- to copy the data from the root spindle to the
- rootback spindle.
- Finally, we use <command>dump</command> and <command>restore</command>
- to copy the root filesystem.</para>
-
- <section id="P4E">
- <title>Phase 4 Example</title>
-
- <procedure>
- <step>
- <para>Now that <application>Vinum</application>
- is running on the root spindle, we can bring
- it up on the rootback spindle so that our
- <application>Vinum</application> volumes can become
- failure-resilient.</para>
-
- <screen>&prompt.root; <userinput>cd /bootvinum</userinput>
-&prompt.root; <userinput>vinum create create.UpWindow</userinput></screen>
-
- <para>You should see a list of <application>Vinum</application>
- objects created that
- looks like the following:</para>
-
-<screen>2 drives:
-D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
-D UpWindow State: up Device /dev/ad2s1h Avail: 2096/3915 MB (53%)
-
-2 volumes:
-V home State: up Plexes: 2 Size: 488 MB
-V usr State: up Plexes: 2 Size: 1330 MB
-
-4 plexes:
-P home.p0 C State: up Subdisks: 1 Size: 488 MB
-P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
-P home.p1 C State: faulty Subdisks: 1 Size: 488 MB
-P usr.p1 C State: faulty Subdisks: 1 Size: 1330 MB
-
-4 subdisks:
-S home.p0.s0 State: up PO: 0 B Size: 488 MB
-S usr.p0.s0 State: up PO: 0 B Size: 1330 MB
-S home.p1.s0 State: stale PO: 0 B Size: 488 MB
-S usr.p1.s0 State: stale PO: 0 B Size: 1330 MB</screen>
-
- <para>You should also see several kernel messages
- which state that some of the <application>Vinum</application>
- objects you have created are now <literal>up</literal>
- while others are <literal>faulty</literal> or
- <literal>stale</literal>.</para>
- </step>
-
- <step>
- <para>Now we ask <application>Vinum</application>
- to copy each of the subdisks on drive
- <literal>YouCrazy</literal> to drive <literal>UpWindow</literal>.
- This will change the state of the newly created
- <application>Vinum</application> subdisks
- from <literal>stale</literal> to <literal>up</literal>.
- It will also change the state of the newly created
- <application>Vinum</application> plexes
- from <literal>faulty</literal> to <literal>up</literal>.</para>
-
- <para>First, we do the new subdisk we
- added to <filename>/home</filename>.</para>
-
- <screen>&prompt.root; <userinput>vinum start -w home.p1.s0</userinput>
-reviving home.p1.s0
-<emphasis>(time passes . . . )</emphasis>
-home.p1.s0 is up by force
-home.p1 is up
-home.p1.s0 is up</screen>
- <note>
- <para>
- My 5,400 RPM EIDE spindles copied at about 3.5 MBytes/sec.
- Your mileage may vary.
- </para>
- </note>
- </step>
-
- <step>
- <para>Next we do the new subdisk we
- added to <filename>/usr</filename>.</para>
-
- <screen>&prompt.root; <userinput>vinum start -w usr.p1.s0</userinput>
-reviving usr.p1.s0
-<emphasis>(time passes . . . )</emphasis>
-usr.p1.s0 is up by force
-usr.p1 is up
-usr.p1.s0 is up</screen>
-
- <para>All <application>Vinum</application>
- objects should be in state <literal>up</literal> at this point.
- The output of
- <command>vinum list</command> should look
- like the following:</para>
-
-<screen>2 drives:
-D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
-D UpWindow State: up Device /dev/ad2s1h Avail: 2096/3915 MB (53%)
-
-2 volumes:
-V home State: up Plexes: 2 Size: 488 MB
-V usr State: up Plexes: 2 Size: 1330 MB
-
-4 plexes:
-P home.p0 C State: up Subdisks: 1 Size: 488 MB
-P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
-P home.p1 C State: up Subdisks: 1 Size: 488 MB
-P usr.p1 C State: up Subdisks: 1 Size: 1330 MB
-
-4 subdisks:
-S home.p0.s0 State: up PO: 0 B Size: 488 MB
-S usr.p0.s0 State: up PO: 0 B Size: 1330 MB
-S home.p1.s0 State: up PO: 0 B Size: 488 MB
-S usr.p1.s0 State: up PO: 0 B Size: 1330 MB</screen>
- </step>
-
- <step>
- <para>Copy the root filesystem so that you will have a backup.</para>
-
- <screen>&prompt.root; <userinput>cd /rootback</userinput>
-&prompt.root; <userinput>dump 0f - / | restore rf -</userinput>
-&prompt.root; <userinput>rm restoresymtable</userinput>
-&prompt.root; <userinput>cd /</userinput></screen>
-
- <note>
- <para>You may see errors like this:</para>
-
- <screen>./tmp/rstdir1001216411: (inode 558) not found on tape
-cannot find directory inode 265
-abort? [yn] <userinput>n</userinput>
-expected next file 492, got 491</screen>
-
- <para>They seem to cause no harm.
- I suspect they are a consequence of dumping the filesystem
- containing <filename>/tmp</filename> and/or the pipe
- connecting <command>dump</command> and
- <command>restore</command>.</para>
-
- </note>
- </step>
-
- <step>
- <para>Make a directory on which we can mount a damaged root
- filesystem during the recovery process.</para>
-
- <screen>&prompt.root; <userinput>mkdir /rootbad</userinput></screen>
-
- </step>
-
- <step>
- <para>Remove sentinel mount points that are now unused.</para>
-
- <screen>&prompt.root; <userinput>rmdir /NOFUTURE*</userinput></screen>
-
- </step>
-
- <step>
- <para>Create empty &vinum.ap; drives on remaining spindles.</para>
-
- <screen>&prompt.root; <userinput>vinum create create.ThruBank</userinput>
-&prompt.root; <userinput>...</userinput></screen>
-
- </step>
- </procedure>
-
- <para>At this point, the reliable server foundation is complete.
- The right-hand side of <xref linkend=ad0b4aft> above
- and the right-hand side of <xref linkend=ad2b4aft> above
- show how the disks will look.</para>
-
- <para>You may want to reboot to multi-user mode and give the system
- a quick test drive.
- This is also a good point to complete installation
- of other distributions beyond the minimal install.
- Add packages, ports, and users as required.
- Configure <filename>/etc/rc.conf</filename> as required.</para>
-
- <tip>
- <para>After you have completed your server configuration,
- remember to do one more copy of root to
- <filename>/rootback</filename> as shown above before placing
- the server into production.</para></tip>
-
- <tip>
- <para>Make a schedule to refresh
- <filename>/rootback</filename> periodically.</para></tip>
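-
- <para>One possible shape for such a refresh is sketched below.
- It is a hypothetical script, not part of
- <literal>bootvinum</literal>, and it mirrors the recovery
- commands used elsewhere in this article.
- It assumes that <filename>/rootback</filename> lives on
- <devicename>/dev/ad2s1a</devicename> and has a read/write entry in
- <filename>/etc/fstab</filename>.
- Note that it empties the backup before refilling it, so there is
- a window during which no good copy exists; run it by hand when
- the root filesystem is known to be healthy.</para>
-
-<programlisting>#!/bin/sh
-# Hypothetical /rootback refresh script; adjust before use.
-# Assumes /rootback is on /dev/ad2s1a and is listed in /etc/fstab.
-set -e                          # stop at the first failed command
-umount /rootback
-newfs /dev/ad2s1a               # start from an empty filesystem
-mount /rootback
-cd /rootback
-dump 0f - / | restore rf -      # same copy as in Phase 4
-rm restoresymtable</programlisting>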
-
- <tip>
- <para>It may be a good idea to mount
- <filename>/rootback</filename> read-only for normal operation
- of the server.
- This does, however, complicate the periodic refresh a bit.</para></tip>
-
- <tip>
- <para>Do not forget to watch
- <filename>/var/log/messages</filename> carefully for errors.
- <application>Vinum</application>
- may automatically avoid failed hardware in a way that users
- do not notice.
- You must watch for such failures and get them repaired before a
- second failure results in data loss.
- You may see
- <application>Vinum</application> noting damaged objects
- at server boot time.</para></tip>
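-
- <para>Between log reviews, one quick way to spot trouble is to
- list any <application>Vinum</application> object whose state is
- not <literal>up</literal>.
- This is only a sketch that relies on the
- <command>vinum list</command> output format shown above; it
- should print nothing when every object is
- <literal>up</literal>.</para>
-
-<screen>&prompt.root; <userinput>vinum list | grep 'State:' | grep -v 'State: up'</userinput></screen>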
-
- </section>
- </section>
- </section>
-
- <section id="FromHere">
- <title>Where to Go from Here?</title>
-
- <para>Now that you have established the foundation of a reliable server,
- there are several things you might want to try next.</para>
-
- <section>
- <title>Make a Vinum Volume with Remaining Space</title>
-
- <para>The following steps create another
- <application>Vinum</application> volume with space remaining
- on the rootback spindle.</para>
-
- <note><para>This volume will not be resilient to spindle failure
- since it has only one plex on a single spindle.</para></note>
-
- <procedure>
- <step>
- <para>Create a file with the following contents:</para>
-
- <programlisting>volume hope
- plex name hope.p0 org concat volume hope
- sd name hope.p0.s0 drive UpWindow plex hope.p0 len 0</programlisting>
-
- <note>
- <para>Specifying a length of <literal>0</literal> for
- the <filename>hope.p0.s0</filename> subdisk
- asks <application>Vinum</application>
- to use whatever space is left available on the underlying
- drive.</para></note>
- </step>
-
- <step>
- <para>Feed these commands into <command>vinum</command> <option>create</option>.</para>
- <screen>&prompt.root; <userinput>vinum create <replaceable>filename</replaceable></userinput></screen>
- </step>
-
- <step>
- <para>Now we <command>newfs</command> the volume and
- <command>mount</command> it.</para>
-
- <screen>&prompt.root; <userinput>newfs -v /dev/vinum/hope</userinput>
-&prompt.root; <userinput>mkdir /hope</userinput>
-&prompt.root; <userinput>mount /dev/vinum/hope /hope</userinput></screen>
-
- </step>
-
- <step>
- <para>Edit <filename>/etc/fstab</filename> if you want
- <filename>/hope</filename> mounted at boot time.</para>
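-
- <para>A minimal sketch of such an entry follows.
- The device and mount point come from the steps above; the
- options and the dump and pass numbers are assumptions based on
- the usual settings for a non-root UFS filesystem, so adjust them
- to match the other entries in your file.</para>
-
-<programlisting>/dev/vinum/hope		/hope		ufs	rw		2	2</programlisting>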
- </step>
-
- </procedure>
- </section>
-
- <section>
- <title>Try Out More Vinum Commands</title>
-
- <para>You might already be familiar with
- <command>vinum</command> <option>list</option> to get a list of
- all <application>Vinum</application> objects.
- Try <option>-v</option> following it to see more detail.</para>
-
- <para>If you have more spindles and you want to bring them up as
- concatenated, mirrored, or striped volumes, then give
- <command>vinum</command> <option>concat</option> <replaceable>drivelist</replaceable>,
- <command>vinum</command> <option>mirror</option> <replaceable>drivelist</replaceable>, or
- <command>vinum</command> <option>stripe</option> <replaceable>drivelist</replaceable> a try.</para>
-
- <para>See &man.vinum.8; for sample configurations and important
- performance considerations before settling on a final organization
- for your additional spindles.</para>
-
- <para>The failure recovery instructions below will also give you
- some experience using more <application>Vinum</application>
- commands.</para>
-
- </section>
- </section>
-
- <section id="Failures">
- <title>Failure Scenarios</title>
-
- <para>This section contains descriptions of various failure scenarios.
- For each scenario, there is a subsection on how to configure your
- server for degraded mode operation, how to recover from the failure,
- how to exit degraded mode, and how to simulate the failure.</para>
-
- <tip>
- <para>Make a hard copy of these instructions and leave them inside the CPU
- case, being careful not to interfere with ventilation.</para></tip>
-
- <section id="ad0RootBad">
- <title>Root Filesystem on ad0 Unusable, Rest of Drive OK</title>
-
- <note>
- <para>We assume here that the boot blocks and disk label on
- <devicename>/dev/ad0</devicename> are ok.
- If your BIOS can boot from a drive other than
- <devicename>C:</devicename>, you may be able to get around this
- limitation.</para></note>
-
- <section id="enter1">
- <title>Configure Server for Degraded Mode</title>
-
- <procedure>
- <step>
- <para>Use BootMgr to load the kernel from
- <devicename>/dev/ad2s1a</devicename>.</para>
-
- <substeps>
- <step>
- <para>Hit <keycap>F5</keycap> in BootMgr to select
- <literal>Drive 1</literal>.</para>
- </step>
-
- <step>
- <para>Hit <keycap>F1</keycap> to select
- <literal>FreeBSD</literal>.</para>
- </step>
- </substeps>
- </step>
-
- <step>
- <para>After the kernel is loaded, hit any key but
- <keycap>Enter</keycap> to interrupt the boot sequence.
- Boot into single-user mode and allow explicit entry of
- a root filesystem.</para>
-
- <screen>Hit [Enter] to boot immediately, or any other key for command prompt.
-Booting [kernel] in 8 seconds...
-
-Type '?' for a list of commands, 'help' for more detailed help.
-ok <userinput>boot -as</userinput></screen>
-
- </step>
-
- <step>
- <para>Select <filename>/rootback</filename>
- as your root filesystem.</para>
-
- <screen>Manual root filesystem specification:
- &lt;fstype>:&lt;device> Mount &lt;device> using filesystem &lt;fstype>
- e.g. ufs:/dev/da0s1a
- ? List valid disk boot devices
- &lt;empty line> Abort manual input
-
- mountroot> <userinput>ufs:/dev/ad2s1a</userinput></screen>
- </step>
-
- <step>
- <para>Now that you are in single-user mode, change
- <filename>/etc/fstab</filename> to avoid the
- bad root filesystem.</para>
-
- <tip>
- <para>If you used the <literal>bootvinum</literal> Perl script from <xref linkend=Perl>
- below, then these commands should configure your server for
- degraded mode.</para>
-
- <screen>&prompt.root; <userinput>fsck -p /</userinput>
-&prompt.root; <userinput>mount /</userinput>
-&prompt.root; <userinput>cd /etc</userinput>
-&prompt.root; <userinput>mv fstab fstab.bak</userinput>
-&prompt.root; <userinput>cp fstab_ad0s1_root_bad fstab</userinput>
-&prompt.root; <userinput>cd /</userinput>
-&prompt.root; <userinput>mount -o ro /</userinput>
-&prompt.root; <userinput>vinum start</userinput>
-&prompt.root; <userinput>fsck -p</userinput>
-&prompt.root; <userinput>^D</userinput></screen>
- </tip>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Recovery</title>
-
- <procedure>
- <step>
- <para>Restore <devicename>/dev/ad0s1a</devicename> from
- backups or copy
- <filename>/rootback</filename> to it with these commands:</para>
-
- <screen>&prompt.root; <userinput>umount /rootbad</userinput>
-&prompt.root; <userinput>newfs /dev/ad0s1a</userinput>
-&prompt.root; <userinput>tunefs -n enable /dev/ad0s1a</userinput>
-&prompt.root; <userinput>mount /rootbad</userinput>
-&prompt.root; <userinput>cd /rootbad</userinput>
-&prompt.root; <userinput>dump 0f - / | restore rf -</userinput>
-&prompt.root; <userinput>rm restoresymtable</userinput></screen>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Exiting Degraded Mode</title>
-
- <procedure>
- <step>
- <para>Enter single-user mode.</para>
-
- <screen>&prompt.root; <userinput>shutdown now</userinput></screen>
- </step>
-
- <step>
- <para>Put <filename>/etc/fstab</filename> back to
- normal and reboot.</para>
-
- <screen>&prompt.root; <userinput>cd /rootbad/etc</userinput>
-&prompt.root; <userinput>rm fstab</userinput>
-&prompt.root; <userinput>mv fstab.bak fstab</userinput>
-&prompt.root; <userinput>reboot</userinput></screen>
- </step>
-
- <step>
- <para>When BootMgr prompts you during the reboot, hit
- <keycap>F1</keycap> to boot from
- <devicename>/dev/ad0</devicename>.</para>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Simulation</title>
-
- <para>This kind of failure can be simulated by shutting down to
- single-user mode and then booting as shown above in
- <xref linkend=enter1>.</para>
- </section>
- </section>
-
- <section id="ad2Bad">
- <title>Drive ad2 Fails</title>
-
- <para>This section deals with the total failure of
- <devicename>/dev/ad2</devicename>.</para>
-
- <section>
- <title>Configure Server for Degraded Mode</title>
-
- <procedure>
- <step>
- <para>After the kernel is loaded, hit any key but
- <keycap>Enter</keycap> to interrupt the boot sequence.
- Boot into single-user mode.</para>
-
- <screen>Hit [Enter] to boot immediately, or any other key for command prompt.
-Booting [kernel] in 8 seconds...
-
-Type '?' for a list of commands, 'help' for more detailed help.
-ok <userinput>boot -s</userinput></screen>
-
- </step>
-
- <step>
- <para>Change
- <filename>/etc/fstab</filename> to avoid the bad drive.
- If you used the <literal>bootvinum</literal> Perl script from <xref linkend=Perl>
- below, then
- these commands should configure your server for
- degraded mode.</para>
-
- <screen>&prompt.root; <userinput>fsck -p /</userinput>
-&prompt.root; <userinput>mount /</userinput>
-&prompt.root; <userinput>cd /etc</userinput>
-&prompt.root; <userinput>mv fstab fstab.bak</userinput>
-&prompt.root; <userinput>cp fstab_only_have_ad0s1 fstab</userinput>
-&prompt.root; <userinput>cd /</userinput>
-&prompt.root; <userinput>mount -o ro /</userinput>
-&prompt.root; <userinput>vinum start</userinput>
-&prompt.root; <userinput>fsck -p</userinput>
-&prompt.root; <userinput>^D</userinput></screen>
-
- <para>If you do not have modified versions of
- <filename>/etc/fstab</filename> that are ready for use,
- then you can use <command>ed</command> to make one.
- Alternatively, you can <command>fsck</command> and
- <command>mount</command>
- <filename>/usr</filename> and then use your
- favorite editor.</para>
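-
- <para>As a rough sketch of what such a hand-made degraded-mode
- <filename>/etc/fstab</filename> looks like, the commands below comment
- out every line that uses the failed spindle, which is the same
- substitution the <literal>bootvinum</literal> script applies.
- The sample fstab contents and the variable names are assumptions
- made for illustration only; check the result against your own
- <filename>/etc/fstab</filename> before installing it.</para>

```shell
# Hypothetical three-line fstab; your devices and mount points will differ.
fstab_sample='/dev/ad0s1a / ufs rw 1 1
/dev/ad2s1a /rootback ufs rw 2 1
/dev/vinum/usr /usr ufs rw 2 2'

# Comment out every line whose device lives on the failed ad2s1 slice.
# In sed, "&" stands for the matched text, so the line is kept but disabled.
fstab_degraded=$(printf '%s\n' "$fstab_sample" | sed 's:^/dev/ad2s1:#&:')

printf '%s\n' "$fstab_degraded"
```

- <para>Lines on the surviving spindle and the
- <application>Vinum</application> volumes pass through unchanged.</para>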
-
- </step>
- </procedure>
- </section>
-
- <section id=ad2Recov>
- <title>Recovery</title>
- <procedure>
-
- <para>We assume here that your server is up and running multi-user in
- degraded mode on just
- <devicename>/dev/ad0</devicename> and that you have
- a new spindle now on
- <devicename>/dev/ad2</devicename> ready to go.</para>
-
- <para>You will need a new spindle with enough room to hold root and swap
- partitions plus a <application>Vinum</application>
- partition large enough to hold
- <filename>/home</filename> and <filename>/usr</filename>.</para>
-
- <step>
- <para>Create a BIOS partition (slice) on the new spindle.</para>
-
- <screen>&prompt.root; <userinput>/stand/sysinstall</userinput></screen>
-
- <substeps>
- <step><para>Select <literal>Custom</literal>.</para></step>
- <step><para>Select <literal>Partition</literal>.</para></step>
- <step><para>Select <devicename>ad2</devicename>.</para></step>
- <step><para>Create a FreeBSD (type 165) slice
- large enough to hold everything mentioned above.</para></step>
- <step><para>Write changes.</para></step>
- <step><para>Yes, you are absolutely sure.</para></step>
- <step><para>Select BootMgr.</para></step>
- <step><para>Quit Partitioning.</para></step>
- <step><para>Exit <command>/stand/sysinstall</command>.</para></step>
- </substeps>
- </step>
-
- <step>
- <para>Create disk label partitioning based on current
- <devicename>/dev/ad0</devicename> partitioning.</para>
-
- <screen>&prompt.root; <userinput>disklabel ad0 > /tmp/ad0</userinput>
-&prompt.root; <userinput>disklabel -e ad2</userinput></screen>
-
- <para>This will drop you into your favorite editor.</para>
-
- <substeps>
- <step>
- <para>Copy the lines for the <literal>a</literal> and
- <literal>b</literal> partitions from
- <filename>/tmp/ad0</filename> to the
- <devicename>ad2</devicename> disklabel.</para>
- </step>
-
- <step>
- <para>Add the <literal>size</literal> of the
- <literal>a</literal> and
- <literal>b</literal> partitions to find the proper
- <literal>offset</literal> for the
- <literal>h</literal> partition.</para>
- </step>
-
- <step>
- <para>Subtract this <literal>offset</literal> from the
- <literal>size</literal> of the <literal>c</literal>
- partition to find the proper <literal>size</literal> for the <literal>h</literal>
- partition.</para>
- </step>
-
- <step>
- <para>Define an <literal>h</literal> partition with the
- <literal>size</literal> and
- <literal>offset</literal> calculated above.</para>
- </step>
-
- <step>
- <para>Set the <literal>fstype</literal> column to
- <literal>vinum</literal>.</para>
- </step>
-
- <step>
- <para>Save the file and quit your editor.</para>
- </step>
- </substeps>
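-
- <para>The arithmetic in the substeps above can be sketched with
- shell arithmetic.
- The sector counts here are invented for illustration, and the
- sketch assumes the <literal>a</literal> partition starts at
- offset 0 with <literal>b</literal> immediately after it;
- substitute the values from your own disklabel.</para>

```shell
# Hypothetical sizes in sectors, as they would appear in the disklabel.
a_size=262144      # size of the a (root) partition
b_size=1048576     # size of the b (swap) partition
c_size=40000000    # size of the c (whole slice) partition

# The h partition starts right after a and b ...
h_offset=$((a_size + b_size))
# ... and runs to the end of the slice.
h_size=$((c_size - h_offset))

echo "h: size=$h_size offset=$h_offset fstype=vinum"
```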
- </step>
-
- <step>
- <para>Tell <application>Vinum</application>
- about the new drive.</para>
-
- <substeps>
- <step>
- <para>Ask <application>Vinum</application> to start an
- editor with a copy of the current configuration.</para>
-
- <screen>&prompt.root; <userinput>vinum create</userinput></screen>
-
- </step>
-
- <step>
- <para>Uncomment the drive line referring to drive
- <literal>UpWindow</literal> and set
- <literal>device</literal> to
- <devicename>/dev/ad2s1h</devicename>.</para></step>
-
- <step>
- <para>Save the file and quit your editor.</para></step>
-
- </substeps>
- </step>
-
- <step>
- <para>Now that <application>Vinum</application>
- has two spindles again, revive the mirrors.</para>
-
- <screen>&prompt.root; <userinput>vinum start -w usr.p1.s0</userinput>
-&prompt.root; <userinput>vinum start -w home.p1.s0</userinput></screen>
- </step>
-
- <step>
- <para>Finally, rebuild
- <filename>/rootback</filename> as a current copy of the
- root filesystem with these commands:</para>
-
- <screen>&prompt.root; <userinput>newfs /dev/ad2s1a</userinput>
-&prompt.root; <userinput>tunefs -n enable /dev/ad2s1a</userinput>
-&prompt.root; <userinput>mount /dev/ad2s1a /mnt</userinput>
-&prompt.root; <userinput>cd /mnt</userinput>
-&prompt.root; <userinput>dump 0f - / | restore rf -</userinput>
-&prompt.root; <userinput>rm restoresymtable</userinput>
-&prompt.root; <userinput>cd /</userinput>
-&prompt.root; <userinput>umount /mnt</userinput></screen>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Exiting Degraded Mode</title>
-
- <procedure>
- <step>
- <para>Enter single-user mode.</para>
-
- <screen>&prompt.root; <userinput>shutdown now</userinput></screen>
- </step>
-
- <step>
- <para>Return <filename>/etc/fstab</filename> to
- its normal state and reboot.</para>
-
- <screen>&prompt.root; <userinput>cd /etc</userinput>
-&prompt.root; <userinput>rm fstab</userinput>
-&prompt.root; <userinput>mv fstab.bak fstab</userinput>
-&prompt.root; <userinput>reboot</userinput></screen>
- </step>
- </procedure>
- </section>
-
- <section>
- <title>Simulation</title>
-
- <para>You can simulate this kind of failure by unplugging
- <devicename>/dev/ad2</devicename>, write-protecting it,
- or by this procedure:</para>
-
- <procedure>
- <step>
- <para>Shut down to single-user mode.</para>
- </step>
-
- <step>
- <para>Unmount all non-root filesystems.</para>
- </step>
-
- <step>
- <para>Clobber any existing <application>Vinum</application>
- configuration and partitioning on
- <devicename>/dev/ad2</devicename>.</para>
-
- <screen>&prompt.root; <userinput>vinum stop</userinput>
-&prompt.root; <userinput>dd if=/dev/zero of=/dev/ad2s1h count=512</userinput>
-&prompt.root; <userinput>dd if=/dev/zero of=/dev/ad2 count=512</userinput></screen>
- </step>
- </procedure>
- </section>
- </section>
-
- <section id="ad0Bad">
- <title>Drive ad0 Fails</title>
-
- <para>Some BIOSes can boot from drive 1 or drive 2 (often called
- <devicename>C:</devicename> or <devicename>D:</devicename>),
- while others can boot only from drive 1.
- If your BIOS can boot from either, the fastest road to recovery
- might be to boot directly from <devicename>/dev/ad2</devicename>
- in single-user mode and
- install <filename>/etc/fstab_only_have_ad2s1</filename> as
- <filename>/etc/fstab</filename>.
- You would then have to adapt the <devicename>/dev/ad2</devicename>
- failure recovery instructions from <xref linkend=ad2Recov> above.</para>
-
- <para>If your BIOS can boot only from drive 1, then you will have to
- unplug drive <literal>YouCrazy</literal> from the controller for
- <devicename>/dev/ad2</devicename> and plug it
- into the controller for <devicename>/dev/ad0</devicename>.
- Then continue with the instructions for
- <devicename>/dev/ad2</devicename> failure recovery
- in <xref linkend=ad2Recov> above.</para>
- </section>
- </section>
-
- <appendix id="Perl">
- <title>bootvinum Perl Script</title>
-
- <para>The <literal>bootvinum</literal> Perl script below reads <filename>/etc/fstab</filename>
- and current drive partitioning.
- It then writes several files in the current directory and several
- variants of <filename>/etc/fstab</filename> in <filename>/etc</filename>.
- These files significantly simplify the installation of
- <application>Vinum</application> and recovery from
- spindle failures.</para>
-
- <programlisting>#!/usr/bin/perl -w
-use strict;
-use FileHandle;
-
-my $config_tag1 = '$Id: article.sgml,v 1.15 2004-08-08 13:43:56 hrs Exp $';
-# Copyright (C) 2001 Robert A. Van Valzah
-#
-# Bootstrap Vinum
-#
-# Read /etc/fstab and current partitioning for all spindles mentioned there.
-# Generate files needed to mirror all filesystems on root spindle.
-# A new partition table for each spindle
-# Input for the vinum create command to create Vinum objects on each spindle
-# A copy of fstab mounting Vinum volumes instead of BSD partitions
-# Copies of fstab altered for server's degraded modes of operation
-# See handbook for instructions on how to use the files generated.
-# N.B. This bootstrapping method shrinks size of swap partition by the size
-# of Vinum's on-disk configuration (265 sectors). It embeds existing file
-# systems on the root spindle in Vinum objects without having to copy them.
-# Thanks to Greg Lehey for suggesting this bootstrapping method.
-# Expectations:
-# The root spindle must contain at least root, swap, and /usr partitions
-# The rootback spindle must have matching /rootback and swap partitions
-# Other spindles should only have a /NOFUTURE* filesystem and maybe swap
-# File systems named /NOFUTURE* will be replaced with Vinum drives
-
-# Change configuration variables below to suit your taste
-my $vip = 'h'; # VInum Partition
-my @drv = ('YouCrazy', 'UpWindow', 'ThruBank', # Vinum DRiVe names
- 'OutSnakes', 'MeWild', 'InMovie', 'HomeJames', 'DownPrices', 'WhileBlind');
-# No configuration variables beyond this point
-
-my %vols; # One entry per Vinum volume to be created
-my @spndl; # One entry per SPiNDLe
-my $rsp; # Root SPindle (as in /dev/$rsp)
-my $rbsp; # RootBack SPindle (as in /dev/$rbsp)
-my $cfgsiz = 265; # Size of Vinum on-disk configuration info in sectors
-my $nxtpas = 2; # Next fsck pass number for non-root filesystems
-
-# Parse fstab, generating the version we'll need for Vinum and noting
-# spindles in use.
-my $fsin = "/etc/fstab";
-#my $fsin = "simu/fstab";
-open(FSIN, "$fsin") || die("Couldn't open $fsin: $!\n");
-
-my $fsout = "/etc/fstab.vinum";
-open(FSOUT, ">$fsout") || die("Couldn't open $fsout for writing: $!\n");
-
-while (&lt;FSIN>) {
- my ($dev, $mnt, $fstyp, $opt, $dump, $pass) = split;
- next if $dev =~ /^#/;
- if ($mnt eq '/' || $mnt eq '/rootback' || $mnt =~ /^\/NOFUTURE/) {
- my $dn = substr($dev, 5, length($dev)-6); # Device Name without /dev/
- push(@spndl, $dn) unless grep($_ eq $dn, @spndl);
- $rsp = $dn if $mnt eq '/';
- next if $mnt =~ /^\/NOFUTURE/;
- }
- # Move /rootback from partition e to a
- if ($mnt =~ /^\/rootback/) {
- $dev =~ s/e$/a/;
- $pass = 1;
- $rbsp = substr($dev, 5, length($dev)-6);
- print FSOUT "$dev\t\t$mnt\t$fstyp\t$opt\t\t$dump\t$pass\n";
- next;
- }
- # Move non-root filesystems on smallest spindle into Vinum
- if (defined($rsp) && $dev =~ /^\/dev\/$rsp/ && $dev =~ /[d-h]$/) {
- $pass = $nxtpas++;
- print FSOUT "/dev/vinum$mnt\t\t$mnt\t\t$fstyp\t$opt\t\t$dump\t$pass\n";
- $vols{$dev}->{mnt} = substr($mnt, 1);
- next;
- }
- print FSOUT $_;
-}
-close(FSOUT);
-die("Found more spindles than we have abstract names\n") if $#spndl > $#drv;
-die("Didn't find a root partition!\n") if !defined($rsp);
-die("Didn't find a /rootback partition!\n") if !defined($rbsp);
-
-# Table of server's Degraded Modes
-# One row per mode with hash keys
-# fn FileName
-# xpr eXPRession needed to convert fstab lines for this mode
-# cm1 CoMment 1 describing this mode
-# cm2 CoMment 2 describing this mode
-# FH FileHandle (dynamically initialized below)
-my @DM = (
- { cm1 => "When we only have $rsp, comment out lines using $rbsp",
- fn => "/etc/fstab_only_have_$rsp",
- xpr => "s:^/dev/$rbsp:#\$&:",
- },
- { cm1 => "When we only have $rbsp, comment out lines using $rsp and",
- cm2 => "rootback becomes root",
- fn => "/etc/fstab_only_have_$rbsp",
- xpr => "s:^/dev/$rsp:#\$&: || s:/rootback:/\t:",
- },
- { cm1 => "When only $rsp root is bad, /rootback becomes root and",
- cm2 => "root becomes /rootbad",
- fn => "/etc/fstab_${rsp}_root_bad",
- xpr => "s:\t/\t:\t/rootbad: || s:/rootback:/\t:",
- },
-);
-
-# Initialize output FileHandles and write comments
-foreach my $dm (@DM) {
- my $fh = new FileHandle;
- $fh->open(">$dm->{fn}") || die("Can't write $dm->{fn}: $!\n");
- print $fh "# $dm->{cm1}\n" if $dm->{cm1};
- print $fh "# $dm->{cm2}\n" if $dm->{cm2};
- $dm->{FH} = $fh;
-}
-
-# Parse the Vinum version of fstab written above and write versions needed
-# for server's degraded modes.
-open(FSOUT, "$fsout") || die("Couldn't open $fsout: $!\n");
-while (&lt;FSOUT>) {
- my $line = $_;
- foreach my $dm (@DM) {
- $_ = $line;
- eval $dm->{xpr};
- print {$dm->{FH}} $_;
- }
-}
-
-# Parse partition table for each spindle and write versions needed for Vinum
-my $rootsiz; # ROOT partition SIZe
-my $swapsiz; # SWAP partition SIZe
-my $rspminoff; # Root SPindle MINimum OFFset of non-root, non-swap, non-c parts
-my $rspsiz; # Root SPindle SIZe
-my $rbspsiz; # RootBack SPindle SIZe
-foreach my $i (0..$#spndl) {
- my $dlin = "disklabel $spndl[$i] |";
-# my $dlin = "simu/disklabel.$spndl[$i]";
- open(DLIN, "$dlin") || die("Couldn't open $dlin: $!\n");
-
- my $dlout = "disklabel.$spndl[$i]";
- open(DLOUT, ">$dlout") || die("Couldn't open $dlout for writing: $!\n");
-
- my $dlb4 = "$dlout.b4vinum";
- open(DLB4, ">$dlb4") || die("Couldn't open $dlb4 for writing: $!\n");
-
- my $minoff; # MINimum OFFset of non-root, non-swap, non-c partitions
- my $totsiz = 0; # TOTal SIZe of all non-root, non-swap, non-c partitions
- my $swapspndl = 0; # True if SWAP partition on this SPiNDLe
- while (&lt;DLIN>) {
- print DLB4 $_;
- my ($part, $siz, $off, $fstyp, $fsiz, $bsiz, $bps) = split;
-
- if ($part && $part eq 'a:' && $spndl[$i] eq $rsp) {
- $rootsiz = $siz;
- }
- if ($part && $part eq 'e:' && $spndl[$i] eq $rbsp) {
- if ($rootsiz != $siz) {
- die("Rootback size ($siz) != root size ($rootsiz)\n");
- }
- }
- if ($part && $part eq 'c:') {
- $rspsiz = $siz if $spndl[$i] eq $rsp;
- $rbspsiz = $siz if $spndl[$i] eq $rbsp;
- }
- # Make swap partition $cfgsiz sectors smaller
- if ($part && $part eq 'b:') {
- if ($spndl[$i] eq $rsp) {
- $swapsiz = $siz;
- } else {
- if ($swapsiz != $siz) {
- die("Swap partition sizes unequal across spindles\n");
- }
- }
- printf DLOUT "%4s%9d%9d%10s\n", $part, $siz-$cfgsiz, $off, $fstyp;
- $swapspndl = 1;
- next;
- }
- # Move rootback spindle e partitions to a
- if ($part && $part eq 'e:' && $spndl[$i] eq $rbsp) {
- printf DLOUT "%4s%9d%9d%10s%9d%6d%6d\n", 'a:', $siz, $off, $fstyp,
- $fsiz, $bsiz, $bps;
- next;
- }
- # Delete non-root, non-swap, non-c partitions but note their minimum
- # offset and total size that're needed below.
- if ($part && $part =~ /^[d-h]:$/) {
- $minoff = $off unless $minoff;
- $minoff = $off if $off &lt; $minoff;
- $totsiz += $siz;
- if ($spndl[$i] eq $rsp) { # If doing spindle containing root
- my $dev = "/dev/$spndl[$i]" . substr($part, 0, 1);
- $vols{$dev}->{siz} = $siz;
- $vols{$dev}->{off} = $off;
- $rspminoff = $minoff;
- }
- next;
- }
- print DLOUT $_;
- }
- if ($swapspndl) { # If there was a swap partition on this spindle
- # Make a Vinum partition the size of all non-root, non-swap,
- # non-c partitions + the size of Vinum's on-disk configuration.
- # Set its offset so that the start of the first subdisk it contains
- # coincides with the first filesystem we're embedding in Vinum.
- printf DLOUT "%4s%9d%9d%10s\n", "$vip:", $totsiz+$cfgsiz, $minoff-$cfgsiz,
- 'vinum';
- } else {
- # No need to mess with size and offset if there was no swap
- printf DLOUT "%4s%9d%9d%10s\n", "$vip:", $totsiz, $minoff,
- 'vinum';
- }
-}
-die("Swap partition not found\n") unless $swapsiz;
-die("Swap partition not larger than $cfgsiz blocks\n") unless $swapsiz>$cfgsiz;
-die("Rootback spindle size not >= root spindle size\n") unless $rbspsiz>=$rspsiz;
-
-# Generate input to vinum create command needed for each spindle.
-foreach my $i (0..$#spndl) {
- my $cfn = "create.$drv[$i]"; # Create File Name
- open(CF, ">$cfn") || die("Can't open $cfn for writing: $!\n");
- print CF "drive $drv[$i] device /dev/$spndl[$i]$vip\n";
- next unless $spndl[$i] eq $rsp || $spndl[$i] eq $rbsp;
- foreach my $dev (keys(%vols)) {
- my $mnt = $vols{$dev}->{mnt};
- my $siz = $vols{$dev}->{siz};
- my $off = $vols{$dev}->{off}-$rspminoff+$cfgsiz;
- print CF "volume $mnt\n" if $spndl[$i] eq $rsp;
- print CF &lt;&lt;EOF;
- plex name $mnt.p$i org concat volume $mnt
- sd name $mnt.p$i.s0 drive $drv[$i] plex $mnt.p$i len ${siz}s driveoffset ${off}s
-EOF
- }
-}</programlisting>
- </appendix>
-
- <appendix id=ManualBoot>
- <title>Manual Vinum Bootstrapping</title>
-
- <para>The <literal>bootvinum</literal> Perl script in <xref linkend=Perl> makes life easier, but
- it may be necessary to manually perform some or all of the steps that
- it automates.
- This appendix describes how you would manually mimic the script.</para>
-
- <procedure>
- <step>
- <para>Make a copy of <filename>/etc/fstab</filename>
- to be customized.</para>
-
- <screen>&prompt.root; <userinput>cp /etc/fstab /etc/fstab.vinum</userinput></screen>
- </step>
-
- <step>
- <para>Edit <filename>/etc/fstab.vinum</filename>.</para>
-
- <substeps>
- <step>
- <para>Change the <literal>device</literal> column of
- non-root partitions on the root spindle to
- <filename>/dev/vinum/mnt</filename>.</para></step>
-
- <step>
- <para>Change the <literal>pass</literal> column of
- non-root partitions on the root spindle to <userinput>2</userinput>,
- <userinput>3</userinput>, etc.</para></step>
-
- <step>
- <para>Delete any lines with mountpoint
- matching <filename>/NOFUTURE*</filename>.</para></step>
-
- <step>
- <para>Change the <literal>device</literal> column of
- <filename>/rootback</filename>
- from <literal>e</literal> to
- <literal>a</literal>.</para></step>
-
- <step>
- <para>Change the <literal>pass</literal> column of
- <filename>/rootback</filename> to
- <userinput>1</userinput>.</para></step>
-
- </substeps>
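-
- <para>The edits above can be sketched as a single
- <command>awk</command> pass over a sample file.
- This is only an illustration, not the script's own method: the
- sample fstab, the <literal>ad0s1</literal> root spindle, and the
- variable names are all assumptions.</para>

```shell
# Hypothetical fstab before Vinum; ad0s1 is assumed to be the root spindle.
fstab_sample='/dev/ad0s1a / ufs rw 1 1
/dev/ad0s1b none swap sw 0 0
/dev/ad0s1e /home ufs rw 2 2
/dev/ad0s1f /usr ufs rw 2 2
/dev/ad2s1e /rootback ufs rw 2 2
/dev/ad2s1f /NOFUTURE ufs rw 2 2'

fstab_vinum=$(printf '%s\n' "$fstab_sample" | awk -v OFS='\t' '
    # Delete placeholder filesystems.
    $2 ~ /^\/NOFUTURE/ { next }
    # Move /rootback from partition e to a and check it in pass 1.
    $2 == "/rootback" { sub(/e$/, "a", $1); $6 = 1; print; next }
    # Non-root partitions on the root spindle become Vinum volumes,
    # checked in passes 2, 3, and so on.
    $1 ~ /^\/dev\/ad0s1[d-h]$/ { $1 = "/dev/vinum" $2; $6 = ++n + 1; print; next }
    # Everything else (root, swap) is copied unchanged.
    { print }
')

printf '%s\n' "$fstab_vinum"
```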
- </step>
-
- <step>
- <para>Prepare disklabels for editing:</para>
-
- <screen>&prompt.root; <userinput>cd /bootvinum</userinput>
-&prompt.root; <userinput>disklabel ad0s1 > disklabel.ad0s1</userinput>
-&prompt.root; <userinput>cp disklabel.ad0s1 disklabel.ad0s1.b4vinum</userinput>
-&prompt.root; <userinput>disklabel ad2s1 > disklabel.ad2s1</userinput>
-&prompt.root; <userinput>cp disklabel.ad2s1 disklabel.ad2s1.b4vinum</userinput></screen>
- </step>
-
- <step>
- <para>Edit <filename>/bootvinum/disklabel.ad?s1</filename>.</para>
-
- <substeps>
- <step>
- <para>On the root spindle:</para>
-
- <substeps>
- <step>
- <para>Decrease the <literal>size</literal> of the
- <literal>b</literal> partition by 265 blocks.</para></step>
-
- <step>
- <para>Note the <literal>size</literal> and
- <literal>offset</literal> of the <literal>a</literal> and
- <literal>b</literal> partitions.</para></step>
-
- <step>
- <para>Note the smallest <literal>offset</literal> for partitions
- <literal>d</literal>-<literal>h</literal>.</para></step>
-
- <step>
- <para>Note the <literal>size</literal> and
- <literal>offset</literal> for all non-root, non-swap
- partitions (<filename>/home</filename> was probably on
- <literal>e</literal> and <filename>/usr</filename> was
- probably on <literal>f</literal>).</para></step>
-
- <step>
- <para>Delete partitions
- <literal>d</literal>-<literal>h</literal>.</para></step>
-
- <step>
- <para>Create a new <literal>h</literal> partition with
- <literal>offset</literal> 265 blocks less than the
- smallest <literal>offset</literal>
- for partitions <literal>d</literal>-<literal>h</literal>
- noted above.
- Set its <literal>size</literal> to the <literal>size</literal>
- of the <literal>c</literal> partition minus the
- smallest <literal>offset</literal>
- for partitions <literal>d</literal>-<literal>h</literal>
- noted above, plus 265 blocks.</para>
-
- <note>
- <para><application>Vinum</application>
- can use any partition other than <literal>c</literal>.
- It is not strictly necessary to use <literal>h</literal>
- for all your <application>Vinum</application>
- partitions, but it is good practice to
- be consistent across all spindles.</para></note>
- </step>
-
- <step>
- <para>Set the <literal>fstype</literal> of this new
- partition to <userinput>vinum</userinput>.</para></step>
- </substeps>
- </step>
-
- <step>
- <para>On the rootback spindle:</para>
-
- <substeps>
- <step>
- <para>Move the <literal>e</literal> partition to
- <literal>a</literal>.</para></step>
-
- <step>
- <para>Verify that the <literal>size</literal> of the
- <literal>a</literal> and
- <literal>b</literal> partitions matches the
- root spindle.</para></step>
-
- <step>
- <para>Note the smallest <literal>offset</literal> for partitions
- <literal>d</literal>-<literal>h</literal>.</para></step>
-
- <step>
- <para>Delete partitions
- <literal>d</literal>-<literal>h</literal>.</para></step>
-
- <step>
- <para>Create a new <literal>h</literal> partition with
- <literal>offset</literal> 265 blocks less than the
- smallest <literal>offset</literal>
- noted above for partitions
- <literal>d</literal>-<literal>h</literal>.
- Set its <literal>size</literal> to the <literal>size</literal>
- of the <literal>c</literal> partition minus the
- smallest <literal>offset</literal>
- for partitions <literal>d</literal>-<literal>h</literal>
- noted above, plus 265 blocks.</para></step>
-
- <step>
- <para>Set the <literal>fstype</literal> of this new
- partition to <userinput>vinum</userinput>.</para></step>
- </substeps>
- </step>
-
- </substeps>
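-
- <para>A worked example of the disklabel arithmetic described above,
- using invented sector counts.
- The figure of 265 blocks is the size of
- <application>Vinum</application>'s on-disk configuration given in
- this article; every other number and variable name is an assumption
- to be replaced with values from your own disklabel.</para>

```shell
cfgsiz=265           # Vinum on-disk configuration size, in blocks

# Hypothetical values read from the existing disklabel.
b_size=1048576       # current size of the b (swap) partition
c_size=40000000      # size of the c (whole slice) partition
min_offset=1310720   # smallest offset among partitions d-h

# Swap gives up 265 blocks to make room for the Vinum configuration.
new_b_size=$((b_size - cfgsiz))

# The new h partition starts 265 blocks before the first old filesystem
# and extends to the end of the slice.
h_offset=$((min_offset - cfgsiz))
h_size=$((c_size - min_offset + cfgsiz))

echo "b: size=$new_b_size"
echo "h: size=$h_size offset=$h_offset fstype=vinum"
```

- <para>With these numbers the <literal>h</literal> partition still
- ends exactly at the end of the <literal>c</literal> partition, which
- is a useful sanity check before writing the label.</para>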
- </step>
-
- <step>
- <para>Create a file named
- <filename>create.YouCrazy</filename> that contains:</para>
-
- <programlisting>drive YouCrazy device /dev/ad0s1h
-volume home
- plex name home.p0 org concat volume home
- sd name home.p0.s0 drive YouCrazy plex home.p0 len $hl driveoffset $ho
-volume usr
- plex name usr.p0 org concat volume usr
- sd name usr.p0.s0 drive YouCrazy plex usr.p0 len $ul driveoffset $uo</programlisting>
-
- <para>Where:</para>
- <itemizedlist>
- <listitem><para>
- <literal>$hl</literal> is the length noted above for
- <filename>/home</filename>.</para></listitem>
-
- <listitem><para>
- <literal>$ho</literal> is the offset noted above for
- <filename>/home</filename> minus the smallest offset
- noted above, plus 265 blocks.</para></listitem>
-
- <listitem><para>
- <literal>$ul</literal> is the length noted above for
- <filename>/usr</filename>.</para></listitem>
-
- <listitem><para>
- <literal>$uo</literal> is the offset noted above for
- <filename>/usr</filename> minus the smallest offset
- noted above, plus 265 blocks.</para></listitem>
- </itemizedlist>
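-
- <para>The four values can be computed as in the sketch below.
- The partition offsets and lengths are invented for illustration,
- and the trailing <literal>s</literal> marks a value in sectors, as in
- the output of the <literal>bootvinum</literal> script.</para>

```shell
cfgsiz=265            # Vinum on-disk configuration size, in blocks

# Hypothetical values noted from the root spindle's old disklabel.
home_size=2097152;  home_offset=1310720   # old e partition (/home)
usr_size=36592128;  usr_offset=3407872    # old f partition (/usr)
min_offset=1310720                        # smallest offset among d-h

hl=$home_size
ho=$((home_offset - min_offset + cfgsiz))
ul=$usr_size
uo=$((usr_offset - min_offset + cfgsiz))

echo "sd name home.p0.s0 drive YouCrazy plex home.p0 len ${hl}s driveoffset ${ho}s"
echo "sd name usr.p0.s0 drive YouCrazy plex usr.p0 len ${ul}s driveoffset ${uo}s"
```

- <para>The first subdisk always lands at drive offset 265, directly
- after <application>Vinum</application>'s configuration area, so the
- existing filesystem data does not have to move.</para>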
- </step>
-
- <step>
- <para>Create a file named
- <filename>create.UpWindow</filename> containing:</para>
-
- <programlisting>drive UpWindow device /dev/ad2s1h
- plex name home.p1 org concat volume home
- sd name home.p1.s0 drive UpWindow plex home.p1 len $hl driveoffset $ho
- plex name usr.p1 org concat volume usr
- sd name usr.p1.s0 drive UpWindow plex usr.p1 len $ul driveoffset $uo</programlisting>
-
- <para>Where <literal>$hl</literal>, <literal>$ho</literal>, <literal>$ul</literal>, and <literal>$uo</literal> are set as above.</para>
- </step>
- </procedure>
- </appendix>
-
- <appendix id="Acknowledgements">
- <title>Acknowledgements</title>
-
- <para>I would like to thank Greg Lehey for writing &vinum.ap; and for
- providing very helpful comments on early drafts.
- Several others made helpful suggestions after reviewing later drafts
- including
- Dag-Erling Sm&oslash;rgrav,
- Michael Splendoria,
- Chern Lee,
- Stefan Aeschbacher,
- Fleming Froekjaer,
- Bernd Walter,
- Aleksey Baranov, and
- Doug Swarin.</para>
- </appendix>
-</article>