<?xml version="1.0" encoding="iso-8859-1"?>
<!--
     The FreeBSD Documentation Project
     $FreeBSD$

-->

<chapter id="geom">
  <chapterinfo>
    <authorgroup>
      <author>
	<firstname>Tom</firstname>
	<surname>Rhodes</surname>
	<contrib>Written by </contrib>
      </author>
    </authorgroup>
  </chapterinfo>

  <title>GEOM: Modular Disk Transformation Framework</title>

  <sect1 id="geom-synopsis">
    <title>Synopsis</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>GEOM Disk Framework</primary>
      <see>GEOM</see>
    </indexterm>

    <para>This chapter covers the use of disks under the GEOM
      framework in &os;.  This includes the major <acronym
	role="Redundant Array of Inexpensive Disks">RAID</acronym>
      control utilities which use the framework for configuration.
      This chapter does not discuss in depth how GEOM handles or
      controls I/O, the underlying subsystem, or the code.
      This information is provided in &man.geom.4; and its various
      <literal>SEE ALSO</literal> references.  This chapter is also
      not a definitive guide to <acronym>RAID</acronym> configurations
      and only GEOM-supported <acronym>RAID</acronym> classifications
      will be discussed.</para>

    <para>After reading this chapter, you will know:</para>

    <itemizedlist>
      <listitem>
	<para>What type of <acronym>RAID</acronym> support is
	  available through GEOM.</para>
      </listitem>

      <listitem>
	<para>How to use the base utilities to configure, maintain,
	  and manipulate the various <acronym>RAID</acronym>
	  levels.</para>
      </listitem>

      <listitem>
	<para>How to mirror, stripe, encrypt, and remotely connect
	  disk devices through GEOM.</para>
      </listitem>

      <listitem>
	<para>How to troubleshoot disks attached to the GEOM
	  framework.</para>
      </listitem>
    </itemizedlist>

    <para>Before reading this chapter, you should:</para>

    <itemizedlist>
      <listitem>
	<para>Understand how &os; treats <link
	    linkend="disks">disk devices</link>.</para>
      </listitem>

      <listitem>
	<para>Know how to configure and install a new <link
	    linkend="kernelconfig">&os; kernel</link>.</para>
      </listitem>
    </itemizedlist>
  </sect1>

  <sect1 id="geom-intro">
    <title>GEOM Introduction</title>

    <para>GEOM permits access to and control of classes, such as
      Master Boot Records and <acronym>BSD</acronym> labels, through
      the use of providers, which are the special files in <filename
	class="directory">/dev</filename>.  GEOM supports various
      software <acronym>RAID</acronym> configurations and makes them
      transparently accessible to the operating system and its
      utilities.</para>
  </sect1>

  <sect1 id="geom-striping">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	  <contrib>Written by </contrib>
	</author>
	<author>
	  <firstname>Murray</firstname>
	  <surname>Stokely</surname>
	</author>
      </authorgroup>
    </sect1info>

    <title>RAID0 - Striping</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>Striping</primary>
    </indexterm>

    <para>Striping combines several disk drives into a single volume.
      In many cases, this is done through the use of hardware
      controllers.  The GEOM disk subsystem provides software support
      for <acronym>RAID</acronym>0, also known as disk
      striping.</para>

    <para>In a <acronym>RAID</acronym>0 system, data is split into
      blocks that get written across all the drives in the array.
      Instead of having to wait on the system to write 256k to one
      disk, a <acronym>RAID</acronym>0 system can simultaneously write
      64k to each of four different disks, offering superior I/O
      performance.  This performance can be enhanced further by using
      multiple disk controllers.</para>
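
    <para>The interleaving can be illustrated with a little
      arithmetic.  The following <command>sh</command> fragment is
      only a sketch of generic <acronym>RAID</acronym>0 address
      mapping, not the actual GEOM implementation.  The stripe size,
      disk count, and offset are assumed values chosen to match the
      four-disk, 64k example above:</para>

    <programlisting># Illustrative values only (assumptions, not gstripe defaults).
stripe=65536      # bytes per stripe block (64k)
ndisks=4          # number of disks in the array
offset=200000     # logical byte offset within the volume

block=$((offset / stripe))        # index of the stripe block
disk=$((block % ndisks))          # disk that holds this block
row=$((block / ndisks))           # stripe row on that disk
diskoff=$((row * stripe + offset % stripe))

echo "logical offset $offset maps to disk $disk, offset $diskoff"</programlisting>

    <para>Consecutive stripe blocks land on consecutive disks, which
      is why a large request can be serviced by several drives at
      once.</para>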

    <para>Each disk in a <acronym>RAID</acronym>0 stripe must be of
      the same size, since I/O requests are interleaved to read or
      write to multiple disks in parallel.</para>

    <mediaobject>
      <imageobject>
	<imagedata fileref="geom/striping" align="center"/>
      </imageobject>

      <textobject>
	<phrase>Disk Striping Illustration</phrase>
      </textobject>
    </mediaobject>

    <procedure>
      <title>Creating a Stripe of Unformatted ATA Disks</title>

      <step>
	<para>Load the <filename>geom_stripe.ko</filename>
	  module:</para>

	<screen>&prompt.root; <userinput>kldload geom_stripe</userinput></screen>
      </step>

      <step>
	<para>Ensure that a suitable mount point exists.  If this
	  volume will become a root partition, then temporarily use
	  another mount point such as <filename
	    class="directory">/mnt</filename>:</para>

	<screen>&prompt.root; <userinput>mkdir /mnt</userinput></screen>
      </step>

      <step>
	<para>Determine the device names for the disks which will
	  be striped, and create the new stripe device.  For example,
	  to stripe two unused and unpartitioned
	  <acronym>ATA</acronym> disks with device names of
	  <filename>/dev/ad2</filename> and
	  <filename>/dev/ad3</filename>:</para>

	<screen>&prompt.root; <userinput>gstripe label -v st0 /dev/ad2 /dev/ad3</userinput>
Metadata value stored on /dev/ad2.
Metadata value stored on /dev/ad3.
Done.</screen>
      </step>

      <step>
	<para>Write a standard label, also known as a partition table,
	  on the new volume and install the default bootstrap
	  code:</para>

	<screen>&prompt.root; <userinput>bsdlabel -wB /dev/stripe/st0</userinput></screen>
      </step>

      <step>
	<para>This process should create two other devices in
	  <filename class="directory">/dev/stripe</filename> in
	  addition to <devicename>st0</devicename>.  Those include
	  <devicename>st0a</devicename> and
	  <devicename>st0c</devicename>.  At this point, a file system
	  may be created on <devicename>st0a</devicename> using
	  <command>newfs</command>:</para>

	<screen>&prompt.root; <userinput>newfs -U /dev/stripe/st0a</userinput></screen>

	<para>Many numbers will glide across the screen, and after a
	  few seconds, the process will be complete.  The volume has
	  been created and is ready to be mounted.</para>
      </step>
    </procedure>

    <para>To manually mount the created disk stripe:</para>

    <screen>&prompt.root; <userinput>mount /dev/stripe/st0a /mnt</userinput></screen>
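
    <para>As a quick check, and assuming the device and mount point
      names used in this example, the state of the stripe and the
      mounted file system can be inspected with
      <command>gstripe status</command> and
      <command>df</command>.  Output is not shown here because it
      depends on the disks in use:</para>

    <screen>&prompt.root; <userinput>gstripe status</userinput>
&prompt.root; <userinput>df -h /mnt</userinput></screen>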

    <para>To mount this striped file system automatically during the
      boot process, place the volume information in
      <filename>/etc/fstab</filename>.  In this example, a
      permanent mount point, <filename
	class="directory">/stripe</filename>, is created:</para>

    <screen>&prompt.root; <userinput>mkdir /stripe</userinput>
&prompt.root; <userinput>echo "/dev/stripe/st0a /stripe ufs rw 2 2" \</userinput>
    <userinput>&gt;&gt; /etc/fstab</userinput></screen>

    <para>The <filename>geom_stripe.ko</filename> module must also be
      loaded automatically during system initialization.  To do so,
      add a line to <filename>/boot/loader.conf</filename>:</para>

    <screen>&prompt.root; <userinput>echo 'geom_stripe_load="YES"' &gt;&gt; /boot/loader.conf</userinput></screen>
  </sect1>

  <sect1 id="geom-mirror">
    <title>RAID1 - Mirroring</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>Disk Mirroring</primary>
    </indexterm>
    <indexterm>
      <primary>RAID1</primary>
    </indexterm>

    <para><acronym>RAID1</acronym>, or
      <firstterm>mirroring</firstterm>, is the technique of writing
      the same data to more than one disk drive.  Mirrors are usually
      used to guard against data loss due to drive failure.  Each
      drive in a mirror contains an identical copy of the data.  When
      an individual drive fails, the mirror continues to work,
      providing data from the drives that are still functioning.  The
      computer keeps running, and the administrator has time to
      replace the failed drive without user interruption.</para>

    <para>Two common situations are illustrated in these examples.
      The first creates a mirror out of two new drives and uses it
      as a replacement for an existing single drive.  The second
      example creates a mirror on a single new drive, copies the old
      drive's data to it, then inserts the old drive into the
      mirror.  While this procedure is slightly more complicated, it
      only requires one new drive.</para>

    <para>Traditionally, the two drives in a mirror are identical in
      model and capacity, but &man.gmirror.8; does not require that.
      Mirrors created with dissimilar drives will have a capacity
      equal to that of the smallest drive in the mirror.  Extra space
      on larger drives will be unused.  Drives inserted into the
      mirror later must have at least as much capacity as the smallest
      drive already in the mirror.</para>

    <warning>
      <para>The mirroring procedures shown here are non-destructive,
	but as with any major disk operation, make a full backup
	first.</para>
    </warning>

    <sect2 id="geom-mirror-metadata">
      <title>Metadata Issues</title>

      <para>Many disk systems store metadata at the end of each disk.
	Old metadata should be erased before reusing the disk for a
	mirror.  Most problems are caused by two particular types of
	leftover metadata: GPT partition tables, and old
	&man.gmirror.8; metadata from a previous mirror.</para>

      <para>GPT metadata can be erased with &man.gpart.8;.  This
	example erases both primary and backup GPT partition tables
	from disk <devicename>ada8</devicename>:</para>

      <screen>&prompt.root; <userinput>gpart destroy -F ada8</userinput></screen>

      <para>&man.gmirror.8; can remove a disk from an active mirror
	and erase the metadata in one step.  Here, the example disk
	<devicename>ada8</devicename> is removed from the active
	mirror <devicename>gm4</devicename>:</para>

      <screen>&prompt.root; <userinput>gmirror remove gm4 ada8</userinput></screen>

      <para>If the mirror is not running but old mirror metadata is
	still on the disk, use <command>gmirror clear</command> to
	remove it:</para>

      <screen>&prompt.root; <userinput>gmirror clear ada8</userinput></screen>

      <para>&man.gmirror.8; stores one block of metadata at the end of
	the disk.  Because GPT partition schemes also store metadata
	at the end of the disk, mirroring full GPT disks with
	&man.gmirror.8; is not recommended.  MBR partitioning is used
	here because it only stores a partition table at the start of
	the disk and does not conflict with &man.gmirror.8;.</para>
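
      <para>To see what leftover metadata a disk carries before
	erasing it, the partition table and any mirror metadata can
	be displayed first.  This is only a sketch that reuses the
	example disk <devicename>ada8</devicename>; output is not
	shown because it depends on the disk:</para>

      <screen>&prompt.root; <userinput>gpart show ada8</userinput>
&prompt.root; <userinput>gmirror dump ada8</userinput></screen>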
    </sect2>

    <sect2 id="geom-mirror-two-new-disks">
      <title>Creating a Mirror with Two New Disks</title>

      <para>In this example, &os; has already been installed on a
	single disk, <devicename>ada0</devicename>.  Two new disks,
	<devicename>ada1</devicename> and
	<devicename>ada2</devicename>, have been connected to the
	system.  A new mirror will be created on these two disks and
	used to replace the old single disk.</para>

      <para>&man.gmirror.8; requires a kernel module,
	<filename>geom_mirror.ko</filename>, either built into the
	kernel or loaded at boot- or run-time.  Manually load the
	kernel module now:</para>

      <screen>&prompt.root; <userinput>gmirror load</userinput></screen>

      <para>Create the mirror with the two new drives:</para>

      <screen>&prompt.root; <userinput>gmirror label -v gm0 /dev/ada1 /dev/ada2</userinput></screen>

      <para><devicename>gm0</devicename> is a user-chosen device name
	assigned to the new mirror.  After the mirror has been
	started, this device name will appear in
	<filename>/dev/mirror/</filename>.</para>
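
      <para>Assuming the mirror was created as shown above, its
	state and component disks can be checked with
	<command>gmirror status</command>, and the new device node
	can be listed.  Output is omitted because it varies by
	system:</para>

      <screen>&prompt.root; <userinput>gmirror status</userinput>
&prompt.root; <userinput>ls /dev/mirror</userinput></screen>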

      <para>MBR and bsdlabel partition tables can now be created on
	the mirror with &man.gpart.8;.  Here we show a traditional
	split-filesystem layout, with partitions for
	<filename>/</filename>, swap, <filename>/var</filename>,
	<filename>/tmp</filename>, and <filename>/usr</filename>.  A
	single <filename>/</filename> filesystem and a swap partition
	will also work.</para>

      <para>Partitions on the mirror do not have to be the same size
	as those on the existing disk, but they must be large enough
	to hold all the data already present on
	<devicename>ada0</devicename>.</para>

      <screen>&prompt.root; <userinput>gpart create -s MBR mirror/gm0</userinput>
&prompt.root; <userinput>gpart add -t freebsd -a 4k mirror/gm0</userinput>
&prompt.root; <userinput>gpart show mirror/gm0</userinput>
=>       63  156301423  mirror/gm0  MBR  (74G)
         63         63                    - free -  (31k)
        126  156301299                 1  freebsd  (74G)
  156301425         61                    - free -  (30k)</screen>

      <screen>&prompt.root; <userinput>gpart create -s BSD mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 2g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-swap -a 4k -s 4g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 2g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 1g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k       mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart show mirror/gm0s1</userinput>
=>        0  156301299  mirror/gm0s1  BSD  (74G)
          0          2                      - free -  (1.0k)
          2    4194304                   1  freebsd-ufs  (2.0G)
    4194306    8388608                   2  freebsd-swap  (4.0G)
   12582914    4194304                   4  freebsd-ufs  (2.0G)
   16777218    2097152                   5  freebsd-ufs  (1.0G)
   18874370  137426928                   6  freebsd-ufs  (65G)
  156301298          1                      - free -  (512B)</screen>

      <para>Make the mirror bootable by installing bootcode in the MBR
	and bsdlabel and setting the active slice:</para>

      <screen>&prompt.root; <userinput>gpart bootcode -b /boot/mbr mirror/gm0</userinput>
&prompt.root; <userinput>gpart set -a active -i 1 mirror/gm0</userinput>
&prompt.root; <userinput>gpart bootcode -b /boot/boot mirror/gm0s1</userinput></screen>

      <para>Format the filesystems on the new mirror, enabling
	soft-updates:</para>

      <screen>&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1a</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1d</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1e</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1f</userinput></screen>

      <para>Filesystems from the original
	<devicename>ada0</devicename> disk can now be copied onto the
	mirror with &man.dump.8; and &man.restore.8;.</para>

      <screen>&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - / | (cd /mnt &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1d /mnt/var</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1e /mnt/tmp</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1f /mnt/usr</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /var | (cd /mnt/var &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /tmp | (cd /mnt/tmp &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr &amp;&amp; restore -rf -)</userinput></screen>

      <para><filename>/mnt/etc/fstab</filename> must be edited to
	point to the new mirror filesystems:</para>

      <programlisting># Device		Mountpoint	FStype	Options	Dump	Pass#
/dev/mirror/gm0s1a	/		ufs	rw	1	1
/dev/mirror/gm0s1b	none		swap	sw	0	0
/dev/mirror/gm0s1d	/var		ufs	rw	2	2
/dev/mirror/gm0s1e	/tmp		ufs	rw	2	2
/dev/mirror/gm0s1f	/usr		ufs	rw	2	2</programlisting>

      <para>If the &man.gmirror.8; kernel module has not been built
	into the kernel, <filename>/mnt/boot/loader.conf</filename>
	must be edited to load the module at boot:</para>

      <programlisting>geom_mirror_load="YES"</programlisting>

      <para>Reboot the system to test the new mirror and verify that
	all data has been copied.  The BIOS will see the mirror as two
	individual drives rather than a mirror.  Because the drives
	are identical, it does not matter which is selected to
	boot.</para>

      <para>See the <link
	  linkend="gmirror-troubleshooting">Troubleshooting</link>
	section if there are problems booting.  Powering down and
	disconnecting the original <devicename>ada0</devicename> disk
	will allow it to be kept as an offline backup.</para>

      <para>In use, the mirror will behave just like the original
	single drive.</para>
    </sect2>

    <sect2 id="geom-mirror-existing-drive">
      <title>Creating a Mirror with an Existing Drive</title>

      <para>In this example, &os; has already been installed on a
	single disk, <devicename>ada0</devicename>.  A new disk,
	<devicename>ada1</devicename>, has been connected to the
	system.  A one-disk mirror will be created on the new disk,
	the existing system copied onto it, and then the old disk will
	be inserted into the mirror.  This slightly complex procedure
	is required because &man.gmirror.8; needs to put a 512-byte
	block of metadata at the end of each disk, and the existing
	<devicename>ada0</devicename> has usually had all of its space
	already allocated.</para>

      <para>Load the &man.gmirror.8; kernel module:</para>

      <screen>&prompt.root; <userinput>gmirror load</userinput></screen>

      <para>Check the media size of the original disk with
	&man.diskinfo.8;:</para>

      <screen>&prompt.root; <userinput>diskinfo -v ada0 | head -n3</userinput>
/dev/ada0
	512             # sectorsize
	1000204821504   # mediasize in bytes (931G)</screen>
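
      <para>As an optional cross-check that is not part of the
	procedure itself, dividing the media size by the sector size
	gives the number of sectors on the disk, which is the unit
	&man.gpart.8; uses when showing sizes.  This sketch uses the
	values from the example output above; substitute the numbers
	reported for the actual disk:</para>

      <screen>&prompt.root; <userinput>expr 1000204821504 / 512</userinput>
1953525042</screen>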

      <para>Create a mirror on the new disk.  To make certain that the
	mirror capacity is not any larger than the original drive,
	&man.gnop.8; is used to create a fake drive of the exact same
	size.  This drive does not store any data, but is used only to
	limit the size of the mirror.  When &man.gmirror.8; creates
	the mirror, it will restrict the capacity to the size of
	<devicename>gzero.nop</devicename>, even if the new drive
	(<devicename>ada1</devicename>) has more space.  Note that the
	<replaceable>1000204821504</replaceable> in the second line
	should be equal to <devicename>ada0</devicename>'s media size
	as shown by &man.diskinfo.8; above.</para>

      <screen>&prompt.root; <userinput>geom zero load</userinput>
&prompt.root; <userinput>gnop create -s 1000204821504 gzero</userinput>
&prompt.root; <userinput>gmirror label -v gm0 gzero.nop ada1</userinput>
&prompt.root; <userinput>gmirror forget gm0</userinput></screen>

      <para><devicename>gzero.nop</devicename> does not store any
	data, so the mirror does not see it as connected.  The mirror
	is told to <quote>forget</quote> unconnected components,
	removing references to <devicename>gzero.nop</devicename>.
	The result is a mirror device containing only a single disk,
	<devicename>ada1</devicename>.</para>

      <para>After creating <devicename>gm0</devicename>, view the
	partition table on <devicename>ada0</devicename>.</para>

      <para>The example output below is from a 1&nbsp;TB drive.  If
	there is some unallocated space at the end of the drive, the
	contents may be copied directly from
	<devicename>ada0</devicename> to the new mirror.</para>

      <para>However, if the output shows that all of the space on the
	disk is allocated like the following listing, there is no
	space available for the 512-byte &man.gmirror.8; metadata at
	the end of the disk.</para>

      <screen>&prompt.root; <userinput>gpart show ada0</userinput>
=>        63  1953525105        ada0  MBR  (931G)
          63  1953525105           1  freebsd  [active]  (931G)</screen>

      <para>In this case, the partition table must be edited to reduce
	the capacity by one sector on
	<devicename>mirror/gm0</devicename>.  The procedure is
	explained below.</para>

      <para>In either case, partition tables on the primary disk
	should be copied first with the &man.gpart.8;
	<command>backup</command> and <command>restore</command>
	subcommands.</para>

      <screen>&prompt.root; <userinput>gpart backup ada0 &gt; table.ada0</userinput>
&prompt.root; <userinput>gpart backup ada0s1 &gt; table.ada0s1</userinput></screen>

      <para>These commands create two files,
	<filename>table.ada0</filename> and
	<filename>table.ada0s1</filename>.  This example is from a
	1&nbsp;TB drive:</para>

      <screen>&prompt.root; <userinput>cat table.ada0</userinput>
MBR 4
1 freebsd         63 1953525105   [active]</screen>

      <screen>&prompt.root; <userinput>cat table.ada0s1</userinput>
BSD 8
1  freebsd-ufs          0    4194304
2 freebsd-swap    4194304   33554432
4  freebsd-ufs   37748736   50331648
5  freebsd-ufs   88080384   41943040
6  freebsd-ufs  130023424  838860800
7  freebsd-ufs  968884224  984640881</screen>

      <para>If the output of <command>gpart show</command> shows no
	free space at the end of the disk, the size of both the slice
	and the last partition must be reduced by one sector.  Edit
	the two files, subtracting one from the size (the last
	number) on the final line of each listing:</para>

      <screen>&prompt.root; <userinput>cat table.ada0</userinput>
MBR 4
1 freebsd         63 <emphasis>1953525104</emphasis>   [active]</screen>

      <screen>&prompt.root; <userinput>cat table.ada0s1</userinput>
BSD 8
1  freebsd-ufs          0    4194304
2 freebsd-swap    4194304   33554432
4  freebsd-ufs   37748736   50331648
5  freebsd-ufs   88080384   41943040
6  freebsd-ufs  130023424  838860800
7  freebsd-ufs  968884224  <emphasis>984640880</emphasis></screen>

      <para>If at least one sector was unallocated at the end of the
	disk, these two files can be used without modification.</para>

      <para>Now restore the partition table into
	<devicename>mirror/gm0</devicename>:</para>

      <screen>&prompt.root; <userinput>gpart restore mirror/gm0 &lt; table.ada0</userinput>
&prompt.root; <userinput>gpart restore mirror/gm0s1 &lt; table.ada0s1</userinput></screen>

      <para>Check the partition table with
	<command>gpart show</command>.  This example has
	<devicename>gm0s1a</devicename> for <filename>/</filename>,
	<devicename>gm0s1d</devicename> for <filename>/var</filename>,
	<devicename>gm0s1e</devicename> for <filename>/usr</filename>,
	<devicename>gm0s1f</devicename> for
	<filename>/data1</filename>, and
	<devicename>gm0s1g</devicename> for
	<filename>/data2</filename>.</para>

      <screen>&prompt.root; <userinput>gpart show mirror/gm0</userinput>
=>        63  1953525104  mirror/gm0  MBR  (931G)
          63  1953525042           1  freebsd  [active]  (931G)
  1953525105          62              - free -  (31k)

&prompt.root; <userinput>gpart show mirror/gm0s1</userinput>
=>         0  1953525042  mirror/gm0s1  BSD  (931G)
           0     2097152             1  freebsd-ufs  (1.0G)
     2097152    16777216             2  freebsd-swap  (8.0G)
    18874368    41943040             4  freebsd-ufs  (20G)
    60817408    20971520             5  freebsd-ufs  (10G)
    81788928   629145600             6  freebsd-ufs  (300G)
   710934528  1242590514             7  freebsd-ufs  (592G)
  1953525042          63                - free -  (31k)</screen>

      <para>Both the slice and the last partition should now be
	followed by some free space.</para>

      <para>Create filesystems on these new partitions.  The
	number of partitions will vary, matching the partitions on the
	original disk, <devicename>ada0</devicename>.</para>

      <screen>&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1a</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1d</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1e</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1f</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1g</userinput></screen>

      <para>Make the mirror bootable by installing bootcode in the MBR
	and bsdlabel and setting the active slice:</para>

      <screen>&prompt.root; <userinput>gpart bootcode -b /boot/mbr mirror/gm0</userinput>
&prompt.root; <userinput>gpart set -a active -i 1 mirror/gm0</userinput>
&prompt.root; <userinput>gpart bootcode -b /boot/boot mirror/gm0s1</userinput></screen>

      <para>Adjust <filename>/etc/fstab</filename> to use the
	new partitions on the mirror.  Back up this file first by
	copying it to <filename>/etc/fstab.orig</filename>.</para>

      <screen>&prompt.root; <userinput>cp /etc/fstab /etc/fstab.orig</userinput></screen>

      <para>Edit <filename>/etc/fstab</filename>, replacing
	<devicename>/dev/ada0</devicename> with
	<devicename>/dev/mirror/gm0</devicename>.</para>

      <programlisting># Device		Mountpoint	FStype	Options	Dump	Pass#
/dev/mirror/gm0s1a	/		ufs	rw	1	1
/dev/mirror/gm0s1b	none		swap	sw	0	0
/dev/mirror/gm0s1d	/var		ufs	rw	2	2
/dev/mirror/gm0s1e	/usr		ufs	rw	2	2
/dev/mirror/gm0s1f	/data1		ufs	rw	2	2
/dev/mirror/gm0s1g	/data2		ufs	rw	2	2</programlisting>
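
      <para>The substitution can also be scripted.  This is a minimal
	sketch, assuming that every entry to be changed begins with
	<devicename>/dev/ada0</devicename> and that the
	<filename>/etc/fstab.orig</filename> backup created above
	exists.  Entries for other disks are left untouched.  Review
	the resulting file before rebooting:</para>

      <screen>&prompt.root; <userinput>sed 's|/dev/ada0|/dev/mirror/gm0|' /etc/fstab.orig &gt; /etc/fstab</userinput>
&prompt.root; <userinput>cat /etc/fstab</userinput></screen>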

      <para>If the &man.gmirror.8; kernel module has not been built
	into the kernel, edit <filename>/boot/loader.conf</filename>
	to load it:</para>

      <programlisting>geom_mirror_load="YES"</programlisting>

      <para>Filesystems from the original disk can now be copied onto
	the mirror with &man.dump.8; and &man.restore.8;.  Note that
	it may take some time to create a snapshot for each filesystem
	dumped with <command>dump -L</command>.</para>

      <screen>&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /    | (cd /mnt &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1d /mnt/var</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1e /mnt/usr</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1f /mnt/data1</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1g /mnt/data2</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /var | (cd /mnt/var &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /data1 | (cd /mnt/data1 &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /data2 | (cd /mnt/data2 &amp;&amp; restore -rf -)</userinput></screen>

      <para>Restart the system, booting from
	<devicename>ada1</devicename>.  If everything is working, the
	system will boot from <devicename>mirror/gm0</devicename>,
	which now contains the same data as
	<devicename>ada0</devicename> had previously.  See the
	<link linkend="gmirror-troubleshooting">Troubleshooting</link>
	section if there are problems booting.</para>

      <para>At this point, the mirror still consists of only the
	single <devicename>ada1</devicename> disk.</para>

      <para>After booting from <devicename>mirror/gm0</devicename>
	successfully, the final step is inserting
	<devicename>ada0</devicename> into the mirror.</para>

      <important>
	<para>When <devicename>ada0</devicename> is inserted into the
	  mirror, its former contents will be overwritten by data on
	  the mirror.  Make certain that
	  <devicename>mirror/gm0</devicename> has the same contents as
	  <devicename>ada0</devicename> before adding
	  <devicename>ada0</devicename> to the mirror.  If there is
	  something wrong with the contents copied by &man.dump.8; and
	  &man.restore.8;, revert <filename>/etc/fstab</filename> to
	  mount the filesystems on <devicename>ada0</devicename>,
	  reboot, and try the whole procedure again.</para>
      </important>

      <screen>&prompt.root; <userinput>gmirror insert gm0 ada0</userinput>
GEOM_MIRROR: Device gm0: rebuilding provider ada0</screen>

      <para>Synchronization between the two disks will start
	immediately.  &man.gmirror.8; <command>status</command>
	shows the progress.</para>

      <screen>&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  DEGRADED  ada1 (ACTIVE)
                      ada0 (SYNCHRONIZING, 64%)</screen>

      <para>After a while, synchronization will finish.</para>

      <screen>GEOM_MIRROR: Device gm0: rebuilding provider ada0 finished.
&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  COMPLETE  ada1 (ACTIVE)
                      ada0 (ACTIVE)</screen>

      <para><devicename>mirror/gm0</devicename> now consists of
	the two disks <devicename>ada0</devicename> and
	<devicename>ada1</devicename>, and the contents are
	automatically synchronized with each other.  In use,
	<devicename>mirror/gm0</devicename> will behave just like the
	original single drive.</para>
    </sect2>

    <sect2 id="gmirror-troubleshooting">
      <title>Troubleshooting</title>

      <sect3>
	<title>Problems with Booting</title>

	<sect4>
	  <title>BIOS Settings</title>

	  <para>BIOS settings may have to be changed to boot from one
	    of the new mirrored drives.  Either mirror drive can be
	    used for booting, as they contain identical data.</para>
	</sect4>

	<sect4>
	  <title>Boot Problems</title>

	  <para>If the boot stops with this message, something is
	    wrong with the mirror device:</para>

	  <screen>Mounting from ufs:/dev/mirror/gm0s1a failed with error 19.

Loader variables:
  vfs.root.mountfrom=ufs:/dev/mirror/gm0s1a
  vfs.root.mountfrom.options=rw

Manual root filesystem specification:
  &lt;fstype&gt;:&lt;device&gt; [options]
      Mount &lt;device&gt; using filesystem &lt;fstype&gt;
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:tank
        cd9660:/dev/acd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/acd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  &lt;empty line&gt;    Abort manual input

mountroot&gt;</screen>

	  <para>Forgetting to load the
	    <filename>geom_mirror</filename> module in
	    <filename>/boot/loader.conf</filename> can cause this
	    problem.  To fix it, boot from &os;&nbsp;9.0 or later
	    installation media and choose <literal>Shell</literal> at
	    the first prompt.  Then load the mirror module and mount
	    the mirror device:</para>

	  <screen>&prompt.root; <userinput>gmirror load</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput></screen>

	  <para>Edit <filename>/mnt/boot/loader.conf</filename>,
	    adding a line to load the mirror module:</para>

	  <programlisting>geom_mirror_load="YES"</programlisting>

	  <para>Save the file and reboot.</para>

	  <para>Other problems that cause <literal>error 19</literal>
	    require more effort to fix.  Enter
	    <literal>ufs:/dev/ada0s1a</literal> at the
	    <literal>mountroot&gt;</literal> prompt.  Although the
	    system should boot from <devicename>ada0</devicename>,
	    another prompt to select a shell appears because
	    <filename>/etc/fstab</filename> is incorrect.  Press the
	    Enter key at the prompt.  Undo the modifications so far by
	    restoring the original <filename>/etc/fstab</filename>, so
	    that filesystems are mounted from the original disk
	    (<devicename>ada0</devicename>) instead of the mirror.
	    Reboot the system and try the procedure again.</para>

	  <screen>Enter full pathname of shell or RETURN for /bin/sh:
&prompt.root; <userinput>cp /etc/fstab.orig /etc/fstab</userinput>
&prompt.root; <userinput>reboot</userinput></screen>
	</sect4>
      </sect3>
    </sect2>

    <sect2>
      <title>Recovering from Disk Failure</title>

      <para>The benefit of disk mirroring is that an individual disk
	can fail without causing the mirror to lose any data.  In the
	above example, if <devicename>ada0</devicename> fails, the
	mirror will continue to work, providing data from the
	remaining working drive, <devicename>ada1</devicename>.</para>

      <para>To replace the failed drive, shut down the system and
	physically replace the failed drive with a new drive of equal
	or greater capacity.  Manufacturers use somewhat arbitrary
	values when rating drives in gigabytes, and the only way to
	really be sure is to compare the total count of sectors shown
	by <command>diskinfo -v</command>.  A drive with larger
	capacity than the mirror will work, although the extra space
	on the new drive will not be used.</para>
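
      <para>A minimal sketch of that comparison, assuming the
	surviving mirror member is <devicename>ada1</devicename> and
	the replacement drive appears as
	<devicename>ada4</devicename>; device names will differ
	between systems.  The sector count reported for the
	replacement must not be smaller than that of the surviving
	drive:</para>

      <screen>&prompt.root; <userinput>diskinfo -v ada1 | grep "mediasize in sectors"</userinput>
&prompt.root; <userinput>diskinfo -v ada4 | grep "mediasize in sectors"</userinput></screen>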

      <para>After the computer is powered back up, the mirror will be
	running in a <quote>degraded</quote> mode with only one drive.
	The mirror is told to forget drives that are not currently
	connected:</para>

      <screen>&prompt.root; <userinput>gmirror forget gm0</userinput></screen>

      <para>Any old metadata should be <link
	  linkend="geom-mirror-metadata">cleared from the replacement
	  disk</link>.  Then the disk, <devicename>ada4</devicename>
	for this example, is inserted into the mirror:</para>

      <screen>&prompt.root; <userinput>gmirror insert gm0 /dev/ada4</userinput></screen>

      <para>Resynchronization begins when the new drive is inserted
	into the mirror.  This process of copying mirror data to a new
	drive can take a while.  Performance of the mirror will be
	greatly reduced during the copy, so inserting new drives is
	best done when there is low demand on the computer.</para>

      <para>Progress can be monitored with <command>gmirror
	  status</command>, which shows drives that are being
	synchronized and the percentage of completion.  During
	resynchronization, the status will be
	<computeroutput>DEGRADED</computeroutput>, changing to
	<computeroutput>COMPLETE</computeroutput> when the process is
	finished.</para>
    </sect2>
  </sect1>

  <sect1 id="geom-graid">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Warren</firstname>
	  <surname>Block</surname>
	  <contrib>Originally contributed by </contrib>
	</author>
      </authorgroup>
    </sect1info>
    <title>Software <acronym>RAID</acronym> Devices</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>Software RAID Devices</primary>
      <secondary>Hardware-assisted RAID</secondary>
    </indexterm>

    <para>Some motherboards and expansion cards add some simple
      hardware, usually just a <acronym>ROM</acronym>, that allows the
      computer to boot from a <acronym>RAID</acronym> array.  After
      booting, access to the <acronym>RAID</acronym> array is handled
      by software running on the computer's main processor.  This
      <quote>hardware-assisted software
	<acronym>RAID</acronym></quote> gives <acronym>RAID</acronym>
      arrays that are not dependent on any particular operating
      system, and which are functional even before an operating system
      is loaded.</para>

    <para>Several levels of <acronym>RAID</acronym> are supported,
      depending on the hardware in use.  See &man.graid.8; for a
      complete list.</para>

    <para>&man.graid.8; requires the <filename>geom_raid.ko</filename>
      kernel module, which is included in the
      <filename>GENERIC</filename> kernel starting with &os;&nbsp;9.1.
      If needed, it can be loaded manually with
      <command>graid load</command>.</para>
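
    <para>On kernels that do not include the module, it can
      presumably also be loaded at boot by adding a line to
      <filename>/boot/loader.conf</filename>.  This sketch follows the
      usual
      <literal><replaceable>module</replaceable>_load</literal>
      convention used for <filename>geom_mirror.ko</filename> earlier
      in this chapter and is an assumption, not something stated
      above:</para>

    <programlisting>geom_raid_load="YES"</programlisting>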

    <sect2 id="geom-graid-creating">
      <title>Creating an Array</title>

      <para>Software <acronym>RAID</acronym> devices often have a menu
	that can be entered by pressing special keys when the computer
	is booting.  The menu can be used to create and delete
	<acronym>RAID</acronym> arrays.  &man.graid.8; can also create
	arrays directly from the command line.</para>

      <para><command>graid label</command> is used to create a new
	array.  The motherboard used for this example has an Intel
	software <acronym>RAID</acronym> chipset, so the Intel
	metadata format is specified.  The new array is given a label
	of <devicename>gm0</devicename>, it is a mirror
	(<acronym>RAID1</acronym>), and uses drives
	<devicename>ada0</devicename> and
	<devicename>ada1</devicename>.</para>

      <caution>
	<para>Some space on the drives will be overwritten when they
	  are made into a new array.  Back up existing data
	  first!</para>
      </caution>

      <screen>&prompt.root; <userinput>graid label Intel gm0 RAID1 ada0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Array Intel-a29ea104 created.
GEOM_RAID: Intel-a29ea104: Disk ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:0-ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Array started.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from STARTING to OPTIMAL.
Intel-a29ea104 created
GEOM_RAID: Intel-a29ea104: Provider raid/r0 for volume gm0 created.</screen>

      <para>A status check shows the new mirror is ready for
	use:</para>

      <screen>&prompt.root; <userinput>graid status</userinput>
   Name   Status  Components
raid/r0  OPTIMAL  ada0 (ACTIVE (ACTIVE))
                  ada1 (ACTIVE (ACTIVE))</screen>

      <para>The array device appears in
	<filename>/dev/raid/</filename>.  The first array is called
	<devicename>r0</devicename>.  Additional arrays, if present,
	will be <devicename>r1</devicename>,
	<devicename>r2</devicename>, and so on.</para>

      <para>The <acronym>BIOS</acronym> menu on some of these devices
	can create arrays with special characters in their names.  To
	avoid problems with those special characters, arrays are given
	simple numbered names like <devicename>r0</devicename>.  To
	show the actual labels, like <devicename>gm0</devicename> in
	the example above, use &man.sysctl.8;:</para>

      <screen>&prompt.root; <userinput>sysctl kern.geom.raid.name_format=1</userinput></screen>
    </sect2>

    <sect2 id="geom-graid-volumes">
      <title>Multiple Volumes</title>

      <para>Some software <acronym>RAID</acronym> devices support
	more than one <emphasis>volume</emphasis> on an array.
	Volumes work like partitions, allowing space on the physical
	drives to be split and used in different ways.  For example,
	Intel software <acronym>RAID</acronym> devices support two
	volumes.  This example creates a 40&nbsp;G mirror for safely
	storing the operating system, followed by a 20&nbsp;G
	<acronym>RAID0</acronym> (stripe) volume for fast temporary
	storage:</para>

      <screen>&prompt.root; <userinput>graid label -S 40G Intel gm0 RAID1 ada0 ada1</userinput>
&prompt.root; <userinput>graid add -S 20G gm0 RAID0</userinput></screen>

      <para>Volumes appear as additional
	<devicename>r<replaceable>X</replaceable></devicename> entries
	in <filename>/dev/raid/</filename>.  An array with two volumes
	will show <devicename>r0</devicename> and
	<devicename>r1</devicename>.</para>

      <para>See &man.graid.8; for the number of volumes supported by
	different software <acronym>RAID</acronym> devices.</para>
    </sect2>

    <sect2 id="geom-graid-converting">
      <title>Converting a Single Drive to a Mirror</title>

      <para>Under certain specific conditions, it is possible to
	convert an existing single drive to a &man.graid.8; array
	without reformatting.  To avoid data loss during the
	conversion, the existing drive must meet these minimum
	requirements:</para>

      <itemizedlist>
	<listitem>
	  <para>The drive must be partitioned with the
	    <acronym>MBR</acronym> partitioning scheme.
	    <acronym>GPT</acronym> or other partitioning schemes with
	    metadata at the end of the drive will be overwritten and
	    corrupted by the &man.graid.8; metadata.</para>
	</listitem>

	<listitem>
	  <para>There must be enough unpartitioned and unused space at
	    the end of the drive to hold the &man.graid.8; metadata.
	    This metadata varies in size, but the largest occupies
	    64&nbsp;M, so at least that much free space is
	    recommended.</para>
	</listitem>
      </itemizedlist>

      <para>If the drive meets these requirements, start by making a
	full backup.  Then create a single-drive mirror with that
	drive:</para>

      <screen>&prompt.root; <userinput>graid label Intel gm0 RAID1 ada0 NONE</userinput></screen>

      <para>&man.graid.8; metadata was written to the end of the drive
	in the unused space.  A second drive can now be inserted into
	the mirror:</para>

      <screen>&prompt.root; <userinput>graid insert raid/r0 ada1</userinput></screen>

      <para>Data from the original drive will immediately begin to be
	copied to the second drive.  The mirror will operate in
	degraded status until the copy is complete.</para>
    </sect2>

    <sect2 id="geom-graid-inserting">
      <title>Inserting New Drives into the Array</title>

      <para>Drives can be inserted into an array as replacements for
	drives that have failed or are missing.  If there are no
	failed or missing drives, the new drive becomes a spare.  For
	example, inserting a new drive into a working two-drive mirror
	results in a two-drive mirror with one spare drive, not a
	three-drive mirror.</para>

      <para>In the example mirror array, data immediately begins to be
	copied to the newly-inserted drive.  Any existing information
	on the new drive will be overwritten.</para>

      <screen>&prompt.root; <userinput>graid insert raid/r0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to NEW.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NEW to REBUILD.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 rebuild start at 0.</screen>
    </sect2>

    <sect2 id="geom-graid-removing">
      <title>Removing Drives from the Array</title>

      <para>Individual drives can be permanently removed from an
	array and their metadata erased:</para>

      <screen>&prompt.root; <userinput>graid remove raid/r0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from ACTIVE to OFFLINE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-[unknown] state changed from ACTIVE to NONE.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from OPTIMAL to DEGRADED.</screen>
    </sect2>

    <sect2 id="geom-graid-stopping">
      <title>Stopping the Array</title>

      <para>An array can be stopped without removing metadata from the
	drives.  The array will be restarted when the system is
	booted.</para>

      <screen>&prompt.root; <userinput>graid stop raid/r0</userinput></screen>
    </sect2>

    <sect2 id="geom-graid-status">
      <title>Checking Array Status</title>

      <para>Array status can be checked at any time.  After a drive
	was added to the mirror in the example above, data is being
	copied from the original drive to the new drive:</para>

      <screen>&prompt.root; <userinput>graid status</userinput>
   Name    Status  Components
raid/r0  DEGRADED  ada0 (ACTIVE (ACTIVE))
                   ada1 (ACTIVE (REBUILD 28%))</screen>

      <para>Some types of arrays, like <literal>RAID0</literal> or
	<literal>CONCAT</literal>, may not be shown in the status
	report if disks have failed.  To see these partially-failed
	arrays, add <option>-ga</option>:</para>

      <screen>&prompt.root; <userinput>graid status -ga</userinput>
          Name  Status  Components
Intel-e2d07d9a  BROKEN  ada6 (ACTIVE (ACTIVE))</screen>
    </sect2>

    <sect2 id="geom-graid-deleting">
      <title>Deleting Arrays</title>

      <para>Arrays are destroyed by deleting all of the volumes from
	them.  When the last volume present is deleted, the array is
	stopped and metadata is removed from the drives:</para>

      <screen>&prompt.root; <userinput>graid delete raid/r0</userinput></screen>
    </sect2>

    <sect2 id="geom-graid-unexpected">
      <title>Deleting Unexpected Arrays</title>

      <para>Drives may unexpectedly contain &man.graid.8; metadata,
	either from previous use or manufacturer testing.
	&man.graid.8; will detect these drives and create an array,
	interfering with access to the individual drive.  To remove
	the unwanted metadata:</para>

      <procedure>
	<step>
	  <para>Boot the system.  At the boot menu, select
	    <literal>2</literal> for the loader prompt.  Enter:</para>

	  <screen>OK <userinput>set kern.geom.raid.enable=0</userinput>
OK <userinput>boot</userinput></screen>

	  <para>The system will boot with &man.graid.8;
	    disabled.</para>
	</step>

	<step>
	  <para>Back up all data on the affected drive.</para>
	</step>

	<step>
	  <para>As a workaround, &man.graid.8; array detection
	    can be disabled by adding</para>

	  <programlisting>kern.geom.raid.enable=0</programlisting>

	  <para>to <filename>/boot/loader.conf</filename>.</para>

	  <para>To permanently remove the &man.graid.8; metadata
	    from the affected drive, boot a &os; installation
	    <acronym>CD-ROM</acronym> or memory stick, and select
	    <literal>Shell</literal>.  Use
	    <command>graid status</command> to find the name of the
	    array, typically <literal>raid/r0</literal>:</para>

	  <screen>&prompt.root; <userinput>graid status</userinput>
   Name   Status  Components
raid/r0  OPTIMAL  ada0 (ACTIVE (ACTIVE))
                  ada1 (ACTIVE (ACTIVE))</screen>

	  <para>Delete the volume by name:</para>

	  <screen>&prompt.root; <userinput>graid delete raid/r0</userinput></screen>

	  <para>If there is more than one volume shown, repeat the
	    process for each volume.  After the last volume has been
	    deleted, the array will be destroyed.</para>

	  <para>Reboot and verify data, restoring from backup if
	    necessary.  After the metadata has been removed, the
	    <literal>kern.geom.raid.enable=0</literal> entry in
	    <filename>/boot/loader.conf</filename> can also be
	    removed.</para>
	</step>
      </procedure>
    </sect2>
  </sect1>

  <sect1 id="geom-raid3">
    <sect1info>
      <authorgroup>
	<author>
	  <firstname>Mark</firstname>
	  <surname>Gladman</surname>
	  <contrib>Written by </contrib>
	</author>
	<author>
	  <firstname>Daniel</firstname>
	  <surname>Gerzo</surname>
	</author>
      </authorgroup>
      <authorgroup>
	<author>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	  <contrib>Based on documentation by </contrib>
	</author>
	<author>
	  <firstname>Murray</firstname>
	  <surname>Stokely</surname>
	</author>
      </authorgroup>
    </sect1info>

    <title><acronym>RAID</acronym>3 - Byte-level Striping with
      Dedicated Parity</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>RAID3</primary>
    </indexterm>

    <para><acronym>RAID</acronym>3 is a method used to combine several
      disk drives into a single volume with a dedicated parity disk.
      In a <acronym>RAID</acronym>3 system, data is split into bytes
      that are striped across all the drives in the array except for
      one, which acts as a dedicated parity disk.  This means that
      reading 1024KB from a <acronym>RAID</acronym>3 implementation
      will access all disks in the array.  Performance can be
      enhanced by using multiple disk controllers.  The
      <acronym>RAID</acronym>3 array tolerates the failure of one
      drive and provides a capacity of 1 - 1/n times the total
      capacity of all drives in the array, where n is the number of
      hard drives in the array.  Such a configuration is best suited
      to storing large files, such as multimedia files.</para>

    <para>At least 3 physical hard drives are required to build a
      <acronym>RAID</acronym>3 array.  Each disk must be of the same
      size, since I/O requests are interleaved to read or write to
      multiple disks in parallel.  Also, due to the nature of
      <acronym>RAID</acronym>3, the number of drives must be
      3, 5, 9, 17, and so on: a power-of-two number of data drives
      plus one parity drive, or 2^n + 1.</para>
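
    <para>As a rough sketch of the arithmetic above, the following
      <command>sh</command> fragment computes the usable capacity of
      an array and checks whether a drive count has the form
      2^n + 1.  The function names
      <literal>raid3_capacity</literal> and
      <literal>raid3_valid_count</literal> are hypothetical helpers
      written for this illustration and are not part of &os;:</para>

    <programlisting>#!/bin/sh
# Hypothetical helper: usable capacity of a RAID3 array built from
# "disks" drives of "size_gb" gigabytes each.
raid3_capacity() {
    disks=$1
    size_gb=$2
    # One drive holds parity, so disks - 1 drives hold data.
    echo $(( (disks - 1) * size_gb ))
}

# Hypothetical helper: succeeds only when the drive count has the
# form 2^n + 1 (3, 5, 9, 17, ...).
raid3_valid_count() {
    data=$(( $1 - 1 ))
    [ "$data" -ge 2 ] || return 1
    # Halve the data-drive count while it is even; a power of two
    # ends at exactly 1.
    while [ $(( data % 2 )) -eq 0 ]; do
        data=$(( data / 2 ))
    done
    [ "$data" -eq 1 ]
}

raid3_capacity 3 500    # three 500 GB drives
raid3_valid_count 4 || echo "4 drives: not a valid RAID3 layout"</programlisting>

    <para>With three 500&nbsp;GB drives, the first call prints
      <literal>1000</literal>, since one drive is reserved for
      parity.  A four-drive layout is rejected because three data
      drives is not a power of two.</para>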

    <sect2>
      <title>Creating a Dedicated <acronym>RAID</acronym>3
	Array</title>

      <para>In &os;, support for <acronym>RAID</acronym>3 is
	implemented by the &man.graid3.8; <acronym>GEOM</acronym>
	class.  Creating a dedicated
	<acronym>RAID</acronym>3 array on &os; requires the following
	steps.</para>

      <note>
	<para>While it is theoretically possible to boot from a
	  <acronym>RAID</acronym>3 array on &os;, that configuration
	  is uncommon and is not advised.</para>
      </note>

      <procedure>
	<step>
	  <para>First, load the <filename>geom_raid3.ko</filename>
	    kernel module by issuing the following command:</para>

	  <screen>&prompt.root; <userinput>graid3 load</userinput></screen>

	  <para>Alternatively, it is possible to manually load the
	    <filename>geom_raid3.ko</filename> module:</para>

	  <screen>&prompt.root; <userinput>kldload geom_raid3.ko</userinput></screen>
	</step>

	<step>
	  <para>Create or ensure that a suitable mount point
	    exists:</para>

	  <screen>&prompt.root; <userinput>mkdir <replaceable>/multimedia/</replaceable></userinput></screen>
	</step>

	<step>
	  <para>Determine the device names for the disks which will be
	    added to the array, and create the new
	    <acronym>RAID</acronym>3 device.  The final device listed
	    will act as the dedicated parity disk.  This
	    example uses three unpartitioned
	    <acronym>ATA</acronym> drives:
	    <devicename><replaceable>ada1</replaceable></devicename>
	    and
	    <devicename><replaceable>ada2</replaceable></devicename>
	    for data, and
	    <devicename><replaceable>ada3</replaceable></devicename>
	    for parity.</para>

	  <screen>&prompt.root; <userinput>graid3 label -v gr0 /dev/ada1 /dev/ada2 /dev/ada3</userinput>
Metadata value stored on /dev/ada1.
Metadata value stored on /dev/ada2.
Metadata value stored on /dev/ada3.
Done.</screen>
	</step>

	<step>
	  <para>Partition the newly created
	    <devicename>gr0</devicename> device and put a UFS file
	    system on it:</para>

	  <screen>&prompt.root; <userinput>gpart create -s GPT /dev/raid3/gr0</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs /dev/raid3/gr0</userinput>
&prompt.root; <userinput>newfs -j /dev/raid3/gr0p1</userinput></screen>

	  <para>&man.newfs.8; prints the locations of the superblock
	    backups as it works.  Once it finishes, the volume has
	    been created and is ready to be mounted:</para>

	  <screen>&prompt.root; <userinput>mount /dev/raid3/gr0p1 /multimedia/</userinput></screen>

	  <para>The <acronym>RAID</acronym>3 array is now ready to
	    use.</para>
	</step>
      </procedure>

      <para>Additional configuration is needed to retain the above
	setup across system reboots.</para>

      <procedure>
	<step>
	  <para>The <filename>geom_raid3.ko</filename> module must be
	    loaded before the array can be mounted.  To automatically
	    load the kernel module during system initialization, add
	    the following line to
	    <filename>/boot/loader.conf</filename>:</para>

	  <programlisting>geom_raid3_load="YES"</programlisting>
	</step>

	<step>
	  <para>The following volume information must be added to
	    <filename>/etc/fstab</filename> in order to
	    automatically mount the array's file system during
	    the system boot process:</para>

	  <programlisting>/dev/raid3/gr0p1	/multimedia	ufs	rw	2	2</programlisting>
	</step>
      </procedure>
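
      <para>After a reboot, the state of the array can be checked
	with the <command>status</command> subcommand of
	&man.graid3.8;.  This is only a sketch of the check, assuming
	the array is named <devicename>gr0</devicename> as in the
	example above.  The output depends on the system and is
	omitted here:</para>

      <screen>&prompt.root; <userinput>graid3 status</userinput>
&prompt.root; <userinput>mount | grep raid3</userinput></screen>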
    </sect2>
  </sect1>

  <sect1 id="geom-ggate">
    <title>GEOM Gate Network Devices</title>

    <para>GEOM provides remote access to devices such as disks,
      CD-ROMs, and files through the GEOM Gate utilities,
      &man.ggated.8; and <command>ggatec</command>.  This is similar
      to <acronym>NFS</acronym>.</para>

    <para>To begin, an exports file must be created.  This file
      specifies who is permitted to access the exported resources and
      what level of access they are offered.  For example, to export
      a partition on the fourth slice of the first
      <acronym>SCSI</acronym> disk, create
      <filename>/etc/gg.exports</filename> with the following
      line:</para>

    <programlisting>192.168.1.0/24 RW /dev/da0s4d</programlisting>

    <para>This allows all hosts inside the specified private network
      access to the file system on the <devicename>da0s4d</devicename>
      partition.</para>

    <para>To export this device, ensure it is not currently mounted,
      and start the &man.ggated.8; server daemon:</para>

    <screen>&prompt.root; <userinput>ggated</userinput></screen>

    <para>To <command>mount</command> the device on the client
      machine, issue the following commands:</para>

    <screen>&prompt.root; <userinput>ggatec create -o rw 192.168.1.1 /dev/da0s4d</userinput>
ggate0
&prompt.root; <userinput>mount /dev/ggate0 /mnt</userinput></screen>

    <para>The device may now be accessed through the <filename
	class="directory">/mnt</filename> mount point.</para>

    <note>
      <para>Mounting will fail if the device is currently mounted
	on either the server machine or any other machine on the
	network.</para>
    </note>

    <para>When the device is no longer needed, unmount it with
      &man.umount.8;, similar to any other disk device.</para>
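
    <para>As a sketch of the cleanup on the client, assuming the
      device was attached as <devicename>ggate0</devicename> and
      mounted on <filename class="directory">/mnt</filename> as in
      the example above, unmount the file system and then detach
      the device with <command>ggatec destroy</command>.  The unit
      number <literal>0</literal> corresponds to
      <devicename>ggate0</devicename>:</para>

    <screen>&prompt.root; <userinput>umount /mnt</userinput>
&prompt.root; <userinput>ggatec destroy -u 0</userinput></screen>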
  </sect1>

  <sect1 id="geom-glabel">
    <title>Labeling Disk Devices</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>Disk Labels</primary>
    </indexterm>

    <para>During system initialization, the &os; kernel creates
      device nodes as devices are found.  This method of probing for
      devices raises some issues.  For instance, what if a new disk
      device is added via <acronym>USB</acronym>?  A flash device
      may be given the device name <devicename>da0</devicename>,
      shifting the original <devicename>da0</devicename> to
      <devicename>da1</devicename>.  File systems listed under the
      old names in <filename>/etc/fstab</filename> will then fail to
      mount, which may also prevent the system from booting.</para>

    <para>One solution is to chain <acronym>SCSI</acronym> devices
      in order so a new device added to the <acronym>SCSI</acronym>
      card will be issued unused device numbers.  But what about
      <acronym>USB</acronym> devices which may replace the primary
      <acronym>SCSI</acronym> disk?  This happens because
      <acronym>USB</acronym> devices are usually probed before the
      <acronym>SCSI</acronym> card.  One solution is to only insert
      these devices after the system has been booted.  Another method
      is to use only a single <acronym>ATA</acronym> drive and never
      list the <acronym>SCSI</acronym> devices in
      <filename>/etc/fstab</filename>.</para>

    <para>A better solution is to use <command>glabel</command> to
      label the disk devices and use the labels in
      <filename>/etc/fstab</filename>.  Because
      <command>glabel</command> stores the label in the last sector of
      a given provider, the label will remain persistent across
      reboots.  By using this label as a device, the file system may
      always be mounted regardless of what device node it is accessed
      through.</para>

    <note>
      <para><command>glabel</command> can create both transient and
	permanent labels.  Only permanent labels are consistent across
	reboots.  Refer to &man.glabel.8; for more information on the
	differences between labels.</para>
    </note>

    <sect2>
      <title>Label Types and Examples</title>

      <para>A permanent label is either a generic label or a file
	system label.
	Permanent file system labels can be created with
	&man.tunefs.8; or &man.newfs.8;.  These types of labels are
	created in a sub-directory of <filename
	  class="directory">/dev</filename>, and will be named
	according to the file system type.  For example,
	<acronym>UFS</acronym>2 file system labels will be created in
	<filename class="directory">/dev/ufs</filename>.  Generic
	permanent labels can be created with <command>glabel
	  label</command>.  These are not file system specific and
	will be created in <filename
	  class="directory">/dev/label</filename>.</para>

      <para>Temporary labels are destroyed at the next reboot.  These
	labels are created in <filename
	  class="directory">/dev/label</filename> and are suited to
	experimentation.  A temporary label can be created using
	<command>glabel create</command>.</para>
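
      <para>For instance, a temporary label could be created as
	follows.  The label name
	<replaceable>scratch</replaceable> and the device
	<replaceable>/dev/da3</replaceable> are placeholders chosen
	for this sketch:</para>

      <screen>&prompt.root; <userinput>glabel create <replaceable>scratch</replaceable> <replaceable>/dev/da3</replaceable></userinput></screen>

      <para>The provider should then be reachable as
	<devicename>/dev/label/scratch</devicename> until the next
	reboot.</para>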

<!-- XXXTR: How do you create a file system label without running newfs
	    or when there is no newfs (e.g.: cd9660)? -->

      <para>To create a permanent label for a
	<acronym>UFS</acronym>2 file system without destroying any
	data, issue the following command:</para>

      <screen>&prompt.root; <userinput>tunefs -L <replaceable>home</replaceable> <replaceable>/dev/da3</replaceable></userinput></screen>

      <warning>
	<para>If the file system is full, this may cause data
	  corruption.</para>
      </warning>

      <para>A label should now exist in <filename
	  class="directory">/dev/ufs</filename> which may be added
	to <filename>/etc/fstab</filename>:</para>

      <programlisting>/dev/ufs/home		/home            ufs     rw              2      2</programlisting>

      <note>
	<para>The file system must not be mounted while attempting
	  to run <command>tunefs</command>.</para>
      </note>

      <para>Now the file system may be mounted:</para>

      <screen>&prompt.root; <userinput>mount /home</userinput></screen>

      <para>From this point on, so long as the
	<filename>geom_label.ko</filename> kernel module is loaded at
	boot with <filename>/boot/loader.conf</filename> or the
	<literal>GEOM_LABEL</literal> kernel option is present,
	the device node may change without any ill effect on the
	system.</para>

      <para>File systems may also be created with a default label
	by using the <option>-L</option> flag with
	<command>newfs</command>.  Refer to &man.newfs.8; for
	more information.</para>
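
      <para>A minimal sketch, assuming
	<replaceable>/dev/da3</replaceable> holds no data worth
	keeping, since <command>newfs</command> creates a new file
	system and destroys the existing contents.  The label name
	<replaceable>home</replaceable> is reused from the example
	above:</para>

      <screen>&prompt.root; <userinput>newfs -L <replaceable>home</replaceable> <replaceable>/dev/da3</replaceable></userinput></screen>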

      <para>The following command can be used to destroy the
	label:</para>

      <screen>&prompt.root; <userinput>glabel destroy home</userinput></screen>

      <para>The following example shows how to label the partitions of
	a boot disk.</para>

      <example>
	<title>Labeling Partitions on the Boot Disk</title>

	<para>By permanently labeling the partitions on the boot disk,
	  the system should be able to continue to boot normally, even
	  if the disk is moved to another controller or transferred to
	  a different system.  For this example, it is assumed that a
	  single <acronym>ATA</acronym> disk is used, which is
	  currently recognized by the system as
	  <devicename>ad0</devicename>.  It is also assumed that the
	  standard &os; partition scheme is used, with
	  <filename class="directory">/</filename>,
	  <filename class="directory">/var</filename>,
	  <filename class="directory">/usr</filename> and
	  <filename class="directory">/tmp</filename>, as
	  well as a swap partition.</para>

	<para>Reboot the system, and at the &man.loader.8; prompt,
	  press <keycap>4</keycap> to boot into single user mode.
	  Then enter the following commands:</para>

	<screen>&prompt.root; <userinput>glabel label rootfs /dev/ad0s1a</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1a is label/rootfs
&prompt.root; <userinput>glabel label var /dev/ad0s1d</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1d is label/var
&prompt.root; <userinput>glabel label usr /dev/ad0s1f</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1f is label/usr
&prompt.root; <userinput>glabel label tmp /dev/ad0s1e</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1e is label/tmp
&prompt.root; <userinput>glabel label swap /dev/ad0s1b</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1b is label/swap
&prompt.root; <userinput>exit</userinput></screen>

	<para>The system will continue with multi-user boot.  After
	  the boot completes, edit <filename>/etc/fstab</filename> and
	  replace the conventional device names with their respective
	  labels.  The final <filename>/etc/fstab</filename> will
	  look like this:</para>

	<programlisting># Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/label/swap         none            swap    sw              0       0
/dev/label/rootfs       /               ufs     rw              1       1
/dev/label/tmp          /tmp            ufs     rw              2       2
/dev/label/usr          /usr            ufs     rw              2       2
/dev/label/var          /var            ufs     rw              2       2</programlisting>

	<para>The system can now be rebooted.  If everything went
	  well, it will come up normally and <command>mount</command>
	  will show:</para>

	<screen>&prompt.root; <userinput>mount</userinput>
/dev/label/rootfs on / (ufs, local)
devfs on /dev (devfs, local)
/dev/label/tmp on /tmp (ufs, local, soft-updates)
/dev/label/usr on /usr (ufs, local, soft-updates)
/dev/label/var on /var (ufs, local, soft-updates)</screen>
      </example>

      <para>Starting with &os;&nbsp;7.2, the &man.glabel.8; class
	supports a new label type for <acronym>UFS</acronym> file
	systems, based on the unique file system id,
	<literal>ufsid</literal>.  These labels may be found in
	<filename class="directory">/dev/ufsid</filename> and are
	created automatically during system startup.  It is possible
	to use <literal>ufsid</literal> labels to mount partitions
	using <filename>/etc/fstab</filename>.  Use <command>glabel
	  status</command> to receive a list of file systems and their
	corresponding <literal>ufsid</literal> labels:</para>

      <screen>&prompt.user; <userinput>glabel status</userinput>
                  Name  Status  Components
ufsid/486b6fc38d330916     N/A  ad4s1d
ufsid/486b6fc16926168e     N/A  ad4s1f</screen>

      <para>In the above example, <devicename>ad4s1d</devicename>
	represents <filename class="directory">/var</filename>,
	while <devicename>ad4s1f</devicename> represents
	<filename class="directory">/usr</filename>.
	Using the <literal>ufsid</literal> values shown, these
	partitions may now be mounted with the following entries in
	<filename>/etc/fstab</filename>:</para>

      <programlisting>/dev/ufsid/486b6fc38d330916        /var        ufs        rw        2      2
/dev/ufsid/486b6fc16926168e        /usr        ufs        rw        2      2</programlisting>

      <para>Any partitions with <literal>ufsid</literal> labels can be
	mounted in this way, eliminating the need to manually create
	permanent labels, while still enjoying the benefits of device
	name independent mounting.</para>
    </sect2>
  </sect1>

  <sect1 id="geom-gjournal">
    <title>UFS Journaling Through GEOM</title>

    <indexterm>
      <primary>GEOM</primary>
    </indexterm>
    <indexterm>
      <primary>Journaling</primary>
    </indexterm>

    <para>Beginning with &os;&nbsp;7.0, support for UFS journals is
      available.  The implementation is provided through the
      <acronym>GEOM</acronym> subsystem and is configured using
      &man.gjournal.8;.</para>

    <para>Journaling stores a log of file system transactions, such as
      changes that make up a complete disk write operation, before
      meta-data and file writes are committed to the disk.  This
      transaction log can later be replayed to redo file system
      transactions, preventing file system inconsistencies.</para>

    <para>This method provides another mechanism to protect against
      data loss and inconsistencies of the file system.  Unlike Soft
      Updates, which tracks and enforces meta-data updates, and
      snapshots, which create an image of the file system,
      journaling stores a log in disk space set aside specifically
      for this task.  In some cases, the log may be stored on
      another disk entirely.</para>

    <para>Unlike other file system journaling implementations, the
      <command>gjournal</command> method is block based and not
      implemented as part of the file system.  It is a
      <acronym>GEOM</acronym> extension.</para>

    <para>To enable support for <command>gjournal</command>, the
      &os; kernel must have the following option, which is the
      default on &os;&nbsp;7.0 and later:</para>

    <programlisting>options	UFS_GJOURNAL</programlisting>

    <para>If journaled volumes need to be mounted during startup, the
      <filename>geom_journal.ko</filename> kernel module needs to be
      loaded, by adding the following line to
      <filename>/boot/loader.conf</filename>:</para>

    <programlisting>geom_journal_load="YES"</programlisting>

    <para>Alternatively, this function can be built into a custom
      kernel, by adding the following line in the kernel configuration
      file:</para>

    <programlisting>options	GEOM_JOURNAL</programlisting>

    <para>A journal can now be created on an unused device using
      the following steps.  In this example,
      <devicename>da4</devicename> is a new <acronym>SCSI</acronym>
      disk:</para>

    <screen>&prompt.root; <userinput>gjournal load</userinput>
&prompt.root; <userinput>gjournal label /dev/da4</userinput></screen>

    <para>At this point, there should be a
      <devicename>/dev/da4</devicename> device node and a
      <devicename>/dev/da4.journal</devicename> device node.
      A file system may now be created on this device:</para>

    <screen>&prompt.root; <userinput>newfs -O 2 -J /dev/da4.journal</userinput></screen>

    <para>This command will create a <acronym>UFS</acronym>2 file
      system on the journaled device.</para>

    <para><command>mount</command> the device at the desired point
      with:</para>

    <screen>&prompt.root; <userinput>mount /dev/da4.journal <replaceable>/mnt</replaceable></userinput></screen>

    <note>
      <para>In the case of several slices, a journal will be created
	for each individual slice.  For instance, if
	<devicename>ad4s1</devicename> and
	<devicename>ad4s2</devicename> are both slices, then
	<command>gjournal</command> will create
	<devicename>ad4s1.journal</devicename> and
	<devicename>ad4s2.journal</devicename>.</para>
    </note>

    <para>For better performance, the journal may be kept on another
      disk.  In this configuration, list the journal provider after
      the data provider when running <command>gjournal
	label</command>.  Journaling may also be enabled on existing
      file systems by using <command>tunefs</command>.  However,
      <emphasis>always</emphasis> make a backup before attempting to
      alter a file system.  In most cases, <command>gjournal</command>
      will fail if it is unable to create the journal, but this does
      not protect against data loss incurred as a result of misusing
      <command>tunefs</command>.</para>
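
    <para>As a sketch of the separate-journal configuration
      described above, assuming
      <devicename>da5</devicename> is a hypothetical second disk
      dedicated to the journal, the data provider is listed first
      and the journal provider second:</para>

    <screen>&prompt.root; <userinput>gjournal label /dev/da4 /dev/da5</userinput></screen>

    <para>For an existing, unmounted <acronym>UFS</acronym> file
      system on a provider labeled this way, and only after a
      backup has been made, the journal flag can be turned on with
      the <option>-J</option> flag of <command>tunefs</command>.
      The device name follows the earlier example and is an
      assumption:</para>

    <screen>&prompt.root; <userinput>tunefs -J enable /dev/da4.journal</userinput></screen>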

    <para>It is also possible to journal the boot disk of a &os;
      system.  Refer to the article <ulink
	url="&url.articles.gjournal-desktop;">Implementing UFS
	Journaling on a Desktop PC</ulink> for detailed
      instructions.</para>
  </sect1>
</chapter>