<?xml version="1.0" encoding="iso-8859-1"?>
<!--
     The FreeBSD Documentation Project
     $FreeBSD$

-->
<chapter xmlns="http://docbook.org/ns/docbook"
  xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
  xml:id="geom">

  <info>
    <title>GEOM: Modular Disk Transformation Framework</title>

    <authorgroup>
      <author>
	<personname>
	  <firstname>Tom</firstname>
	  <surname>Rhodes</surname>
	</personname>
	<contrib>Written by </contrib>
      </author>
    </authorgroup>
  </info>

  <sect1 xml:id="geom-synopsis">
    <title>Synopsis</title>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary><acronym>GEOM</acronym> Disk Framework</primary>
      <see><acronym>GEOM</acronym></see>
    </indexterm>

    <para>In &os;, the <acronym>GEOM</acronym> framework permits
      access to and control of classes, such as Master Boot Records
      and <acronym>BSD</acronym> labels, through the use of
      providers, the disk devices in <filename>/dev</filename>.  By
      supporting various software <acronym>RAID</acronym>
      configurations, <acronym>GEOM</acronym> transparently provides
      the operating system and operating system utilities with
      access to those configurations.</para>

    <para>This chapter covers the use of disks under the
      <acronym>GEOM</acronym> framework in &os;.  This includes the
      major <acronym>RAID</acronym> control utilities that use the
      framework for configuration.  This chapter is not a
      definitive guide to <acronym>RAID</acronym> configurations;
      only <acronym>GEOM</acronym>-supported
      <acronym>RAID</acronym> classifications are discussed.</para>

    <para>After reading this chapter, you will know:</para>

    <itemizedlist>
      <listitem>
	<para>What type of <acronym>RAID</acronym> support is
	  available through <acronym>GEOM</acronym>.</para>
      </listitem>

      <listitem>
	<para>How to use the base utilities to configure, maintain,
	  and manipulate the various <acronym>RAID</acronym>
	  levels.</para>
      </listitem>

      <listitem>
	<para>How to mirror, stripe, encrypt, and remotely connect
	  disk devices through <acronym>GEOM</acronym>.</para>
      </listitem>

      <listitem>
	<para>How to troubleshoot disks attached to the
	  <acronym>GEOM</acronym> framework.</para>
      </listitem>
    </itemizedlist>

    <para>Before reading this chapter, you should:</para>

    <itemizedlist>
      <listitem>
	<para>Understand how &os; treats disk devices (<xref
	    linkend="disks"/>).</para>
      </listitem>

      <listitem>
	<para>Know how to configure and install a new kernel (<xref
	    linkend="kernelconfig"/>).</para>
      </listitem>
    </itemizedlist>
  </sect1>

  <sect1 xml:id="geom-striping">
    <info>
      <title>RAID0 - Striping</title>

      <authorgroup>
	<author>
	  <personname>
	    <firstname>Tom</firstname>
	    <surname>Rhodes</surname>
	  </personname>
	  <contrib>Written by </contrib>
	</author>

	<author>
	  <personname>
	    <firstname>Murray</firstname>
	    <surname>Stokely</surname>
	  </personname>
	</author>
      </authorgroup>
    </info>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>Striping</primary>
    </indexterm>

    <para>Striping combines several disk drives into a single volume.
      Striping can be performed through the use of hardware
      <acronym>RAID</acronym> controllers.  The
      <acronym>GEOM</acronym> disk subsystem provides software support
      for disk striping, also known as <acronym>RAID0</acronym>,
      without the need for a <acronym>RAID</acronym> disk
      controller.</para>

    <para>In <acronym>RAID0</acronym>, data is split into blocks that
      are written across all the drives in the array.  As seen in the
      following illustration, instead of having to wait on the system
      to write 256k to one disk, <acronym>RAID0</acronym> can
      simultaneously write 64k to each of the four disks in the array,
      offering superior <acronym>I/O</acronym> performance.  This
      performance can be enhanced further by using multiple disk
      controllers.</para>

    <mediaobject>
      <imageobject>
	<imagedata fileref="geom/striping" align="center"/>
      </imageobject>

      <textobject>
	<phrase>Disk Striping Illustration</phrase>
      </textobject>
    </mediaobject>

    <para>Each disk in a <acronym>RAID0</acronym> stripe must be of
      the same size, since <acronym>I/O</acronym> requests are
      interleaved to read or write to multiple disks in
      parallel.</para>
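
    <para>For example, the size of each candidate disk can be
      compared with <command>diskinfo</command> before creating the
      stripe, to confirm that the media sizes reported for the disks
      match:</para>

    <screen>&prompt.root; <userinput>diskinfo /dev/ad2 /dev/ad3</userinput></screen>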

    <note>
      <para><acronym>RAID0</acronym> does <emphasis>not</emphasis>
	provide any redundancy.  This means that if one disk in the
	array fails, all of the data on the disks is lost.  If the
	data is important, implement a backup strategy that regularly
	saves backups to a remote system or device.</para>
    </note>

    <para>The process for creating a software
      <acronym>GEOM</acronym>-based <acronym>RAID0</acronym> on a
      &os; system using commodity disks is as follows.  Once the
      stripe is created, refer to &man.gstripe.8; for more
      information on how to control an existing stripe.</para>

    <procedure>
      <title>Creating a Stripe of Unformatted <acronym>ATA</acronym>
	Disks</title>

      <step>
	<para>Load the <filename>geom_stripe.ko</filename>
	  module:</para>

	<screen>&prompt.root; <userinput>kldload geom_stripe</userinput></screen>
      </step>

      <step>
	<para>Ensure that a suitable mount point exists.  If this
	  volume will become a root partition, then temporarily use
	  another mount point such as
	  <filename>/mnt</filename>.</para>
      </step>

      <step>
	<para>Determine the device names for the disks which will
	  be striped, and create the new stripe device.  For example,
	  to stripe two unused and unpartitioned
	  <acronym>ATA</acronym> disks with device names of
	  <filename>/dev/ad2</filename> and
	  <filename>/dev/ad3</filename>:</para>

	<screen>&prompt.root; <userinput>gstripe label -v st0 /dev/ad2 /dev/ad3</userinput>
Metadata value stored on /dev/ad2.
Metadata value stored on /dev/ad3.
Done.</screen>
      </step>

      <step>
	<para>Write a standard label, also known as a partition table,
	  on the new volume and install the default bootstrap
	  code:</para>

	<screen>&prompt.root; <userinput>bsdlabel -wB /dev/stripe/st0</userinput></screen>
      </step>

      <step>
	<para>This process should create two other devices in
	  <filename>/dev/stripe</filename> in addition to
	  <filename>st0</filename>.  Those include
	  <filename>st0a</filename> and <filename>st0c</filename>.  At
	  this point, a <acronym>UFS</acronym> file system can be
	  created on <filename>st0a</filename> using
	  <command>newfs</command>:</para>

	<screen>&prompt.root; <userinput>newfs -U /dev/stripe/st0a</userinput></screen>

	<para>Many numbers will glide across the screen, and after a
	  few seconds, the process will be complete.  The volume has
	  been created and is ready to be mounted.</para>
      </step>

      <step>
	<para>To manually mount the created disk stripe:</para>

	<screen>&prompt.root; <userinput>mount /dev/stripe/st0a /mnt</userinput></screen>
      </step>

      <step>
	<para>To mount this striped file system automatically during
	  the boot process, place the volume information in
	  <filename>/etc/fstab</filename>.  In this example, a
	  permanent mount point, named <filename>stripe</filename>, is
	  created:</para>

	<screen>&prompt.root; <userinput>mkdir /stripe</userinput>
&prompt.root; <userinput>echo "/dev/stripe/st0a /stripe ufs rw 2 2" \</userinput>
<userinput>&gt;&gt; /etc/fstab</userinput></screen>
      </step>

      <step>
	<para>The <filename>geom_stripe.ko</filename> module must also
	  be loaded automatically during system initialization by
	  adding a line to
	  <filename>/boot/loader.conf</filename>:</para>

	<screen>&prompt.root; <userinput>echo 'geom_stripe_load="YES"' &gt;&gt; /boot/loader.conf</userinput></screen>
      </step>
    </procedure>
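
    <para>To verify that the stripe was created and is running, check
      it with <command>gstripe status</command>.  The output will be
      similar to:</para>

    <screen>&prompt.root; <userinput>gstripe status</userinput>
      Name  Status  Components
stripe/st0      UP  ad2
                    ad3</screen>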
  </sect1>

  <sect1 xml:id="geom-mirror">
    <title>RAID1 - Mirroring</title>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>Disk Mirroring</primary>
    </indexterm>
    <indexterm>
      <primary>RAID1</primary>
    </indexterm>

    <para><acronym>RAID1</acronym>, or
      <emphasis>mirroring</emphasis>, is the technique of writing
      the same data to more than one disk drive.  Mirrors are usually
      used to guard against data loss due to drive failure.  Each
      drive in a mirror contains an identical copy of the data.  When
      an individual drive fails, the mirror continues to work,
      providing data from the drives that are still functioning.  The
      computer keeps running, and the administrator has time to
      replace the failed drive without user interruption.</para>

    <para>Two common situations are illustrated in these examples.
      The first creates a mirror out of two new drives and uses it as
      a replacement for an existing single drive.  The second example
      creates a mirror on a single new drive, copies the old drive's
      data to it, then inserts the old drive into the mirror.  While
      this procedure is slightly more complicated, it only requires
      one new drive.</para>

    <para>Traditionally, the two drives in a mirror are identical in
      model and capacity, but &man.gmirror.8; does not require that.
      Mirrors created with dissimilar drives will have a capacity
      equal to that of the smallest drive in the mirror.  Extra space
      on larger drives will be unused.  Drives inserted into the
      mirror later must have at least as much capacity as the smallest
      drive already in the mirror.</para>

    <warning>
      <para>The mirroring procedures shown here are non-destructive,
	but as with any major disk operation, make a full backup
	first.</para>
    </warning>

    <warning>
      <para>While &man.dump.8; is used in these procedures
	to copy file systems, it does not work on file systems with
	soft updates journaling.  See &man.tunefs.8; for information
	on detecting and disabling soft updates journaling.</para>
    </warning>

    <sect2 xml:id="geom-mirror-metadata">
      <title>Metadata Issues</title>

      <para>Many disk systems store metadata at the end of each disk.
	Old metadata should be erased before reusing the disk for a
	mirror.  Most problems are caused by two particular types of
	leftover metadata: <acronym>GPT</acronym> partition tables and
	old metadata from a previous mirror.</para>

      <para><acronym>GPT</acronym> metadata can be erased with
	&man.gpart.8;.  This example erases both primary and backup
	<acronym>GPT</acronym> partition tables from disk
	<filename>ada8</filename>:</para>

      <screen>&prompt.root; <userinput>gpart destroy -F ada8</userinput></screen>

      <para>A disk can be removed from an active mirror and the
	metadata erased in one step using &man.gmirror.8;.  Here, the
	example disk <filename>ada8</filename> is removed from the
	active mirror <filename>gm4</filename>:</para>

      <screen>&prompt.root; <userinput>gmirror remove gm4 ada8</userinput></screen>

      <para>If the mirror is not running, but old mirror metadata is
	still on the disk, use <command>gmirror clear</command> to
	remove it:</para>

      <screen>&prompt.root; <userinput>gmirror clear ada8</userinput></screen>

      <para>&man.gmirror.8; stores one block of metadata at the end of
	the disk.  As <acronym>GPT</acronym> partition schemes
	also store metadata at the end of the disk, mirroring entire
	<acronym>GPT</acronym> disks with &man.gmirror.8; is not
	recommended.  <acronym>MBR</acronym> partitioning is used here
	because it only stores a partition table at the start of the
	disk and does not conflict with the mirror metadata.</para>
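
      <para>To check whether leftover &man.gmirror.8; metadata is
	present on a disk, the stored metadata, if any, can be
	printed with <command>gmirror dump</command>.  An error
	message indicates that no valid metadata was found:</para>

      <screen>&prompt.root; <userinput>gmirror dump ada8</userinput></screen>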
    </sect2>

    <sect2 xml:id="geom-mirror-two-new-disks">
      <title>Creating a Mirror with Two New Disks</title>

      <para>In this example, &os; has already been installed on a
	single disk, <filename>ada0</filename>.  Two new disks,
	<filename>ada1</filename> and <filename>ada2</filename>, have
	been connected to the system.  A new mirror will be created on
	these two disks and used to replace the old single
	disk.</para>

      <para>The <filename>geom_mirror.ko</filename> kernel module must
	either be built into the kernel or loaded at boot- or
	run-time.  Manually load the kernel module now:</para>

      <screen>&prompt.root; <userinput>gmirror load</userinput></screen>

      <para>Create the mirror with the two new drives:</para>

      <screen>&prompt.root; <userinput>gmirror label -v gm0 /dev/ada1 /dev/ada2</userinput></screen>

      <para><filename>gm0</filename> is a user-chosen device name
	assigned to the new mirror.  After the mirror has been
	started, this device name appears in
	<filename>/dev/mirror/</filename>.</para>
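
      <para>To confirm that the new mirror has started, check its
	status.  The output will be similar to:</para>

      <screen>&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  COMPLETE  ada1 (ACTIVE)
                      ada2 (ACTIVE)</screen>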

      <para><acronym>MBR</acronym> and
	<application>bsdlabel</application> partition tables can now
	be created on the mirror with &man.gpart.8;.  This example
	uses a traditional file system layout, with partitions for
	<filename>/</filename>, swap, <filename>/var</filename>,
	<filename>/tmp</filename>, and <filename>/usr</filename>.  A
	single <filename>/</filename> and a swap partition
	will also work.</para>

      <para>Partitions on the mirror do not have to be the same size
	as those on the existing disk, but they must be large enough
	to hold all the data already present on
	<filename>ada0</filename>.</para>

      <screen>&prompt.root; <userinput>gpart create -s MBR mirror/gm0</userinput>
&prompt.root; <userinput>gpart add -t freebsd -a 4k mirror/gm0</userinput>
&prompt.root; <userinput>gpart show mirror/gm0</userinput>
=&gt;       63  156301423  mirror/gm0  MBR  (74G)
         63         63                    - free -  (31k)
        126  156301299                 1  freebsd  (74G)
  156301425         61                    - free -  (30k)</screen>

      <screen>&prompt.root; <userinput>gpart create -s BSD mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 2g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-swap -a 4k -s 4g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 2g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k -s 1g mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs  -a 4k       mirror/gm0s1</userinput>
&prompt.root; <userinput>gpart show mirror/gm0s1</userinput>
=&gt;        0  156301299  mirror/gm0s1  BSD  (74G)
          0          2                      - free -  (1.0k)
          2    4194304                   1  freebsd-ufs  (2.0G)
    4194306    8388608                   2  freebsd-swap  (4.0G)
   12582914    4194304                   4  freebsd-ufs  (2.0G)
   16777218    2097152                   5  freebsd-ufs  (1.0G)
   18874370  137426928                   6  freebsd-ufs  (65G)
  156301298          1                      - free -  (512B)</screen>

      <para>Make the mirror bootable by installing bootcode in the
	<acronym>MBR</acronym> and bsdlabel and setting the active
	slice:</para>

      <screen>&prompt.root; <userinput>gpart bootcode -b /boot/mbr mirror/gm0</userinput>
&prompt.root; <userinput>gpart set -a active -i 1 mirror/gm0</userinput>
&prompt.root; <userinput>gpart bootcode -b /boot/boot mirror/gm0s1</userinput></screen>

      <para>Format the file systems on the new mirror, enabling
	soft updates:</para>

      <screen>&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1a</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1d</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1e</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1f</userinput></screen>

      <para>File systems from the original <filename>ada0</filename>
	disk can now be copied onto the mirror with &man.dump.8; and
	&man.restore.8;:</para>

      <screen>&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - / | (cd /mnt &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1d /mnt/var</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1e /mnt/tmp</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1f /mnt/usr</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /var | (cd /mnt/var &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /tmp | (cd /mnt/tmp &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr &amp;&amp; restore -rf -)</userinput></screen>

      <para>Edit <filename>/mnt/etc/fstab</filename> to point to
	the new mirror file systems:</para>

      <programlisting># Device		Mountpoint	FStype	Options	Dump	Pass#
/dev/mirror/gm0s1a	/		ufs	rw	1	1
/dev/mirror/gm0s1b	none		swap	sw	0	0
/dev/mirror/gm0s1d	/var		ufs	rw	2	2
/dev/mirror/gm0s1e	/tmp		ufs	rw	2	2
/dev/mirror/gm0s1f	/usr		ufs	rw	2	2</programlisting>

      <para>If the <filename>geom_mirror.ko</filename> kernel module
	has not been built into the kernel,
	<filename>/mnt/boot/loader.conf</filename> is edited to load
	the module at boot:</para>

      <programlisting>geom_mirror_load="YES"</programlisting>

      <para>Reboot the system to test the new mirror and verify that
	all data has been copied.  The <acronym>BIOS</acronym> will
	see the mirror as two individual drives rather than a mirror.
	Since the drives are identical, it does not matter which is
	selected to boot.</para>

      <para>See <xref linkend="gmirror-troubleshooting"/> if there are
	problems booting.  Powering down and disconnecting the
	original <filename>ada0</filename> disk will allow it to be
	kept as an offline backup.</para>

      <para>In use, the mirror will behave just like the original
	single drive.</para>
    </sect2>

    <sect2 xml:id="geom-mirror-existing-drive">
      <title>Creating a Mirror with an Existing Drive</title>

      <para>In this example, &os; has already been installed on a
	single disk, <filename>ada0</filename>.  A new disk,
	<filename>ada1</filename>, has been connected to the system.
	A one-disk mirror will be created on the new disk, the
	existing system copied onto it, and then the old disk will be
	inserted into the mirror.  This slightly complex procedure is
	required because <command>gmirror</command> needs to put a
	512-byte block of metadata at the end of each disk, and the
	existing <filename>ada0</filename> has usually had all of its
	space already allocated.</para>

      <para>Load the <filename>geom_mirror.ko</filename> kernel
	module:</para>

      <screen>&prompt.root; <userinput>gmirror load</userinput></screen>

      <para>Check the media size of the original disk with
	<command>diskinfo</command>:</para>

      <screen>&prompt.root; <userinput>diskinfo -v ada0 | head -n3</userinput>
/dev/ada0
	512             # sectorsize
	1000204821504   # mediasize in bytes (931G)</screen>

      <para>Create a mirror on the new disk.  To make certain that
	the mirror capacity is no larger than the original
	<filename>ada0</filename> drive, &man.gnop.8; is used to
	create a fake drive of exactly the same size.  This drive
	does not store any data, but is used only to limit the size
	of the
	mirror.  When &man.gmirror.8; creates the mirror, it will
	restrict the capacity to the size of
	<filename>gzero.nop</filename>, even if the new
	<filename>ada1</filename> drive has more space.  Note that the
	<replaceable>1000204821504</replaceable> in the second line is
	equal to <filename>ada0</filename>'s media size as shown by
	<command>diskinfo</command> above.</para>

      <screen>&prompt.root; <userinput>geom zero load</userinput>
&prompt.root; <userinput>gnop create -s 1000204821504 gzero</userinput>
&prompt.root; <userinput>gmirror label -v gm0 gzero.nop ada1</userinput>
&prompt.root; <userinput>gmirror forget gm0</userinput></screen>

      <para>Since <filename>gzero.nop</filename> does not store any
	data, the mirror does not see it as connected.  The mirror is
	told to <quote>forget</quote> unconnected components, removing
	references to <filename>gzero.nop</filename>.  The result is a
	mirror device containing only a single disk,
	<filename>ada1</filename>.</para>

      <para>After creating <filename>gm0</filename>, view the
	partition table on <filename>ada0</filename>.  This output is
	from a 1&nbsp;TB drive.  If there is some unallocated space at
	the end of the drive, the contents may be copied directly from
	<filename>ada0</filename> to the new mirror.</para>

      <para>However, if the output shows that all of the space on the
	disk is allocated, as in the following listing, there is no
	space available for the 512-byte mirror metadata at the end of
	the disk.</para>

      <screen>&prompt.root; <userinput>gpart show ada0</userinput>
=&gt;        63  1953525105        ada0  MBR  (931G)
          63  1953525105           1  freebsd  [active]  (931G)</screen>

      <para>In this case, the partition table must be edited to
	reduce the capacity by one sector on
	<filename>mirror/gm0</filename>.  The procedure for doing so
	is shown below.</para>

      <para>In either case, partition tables on the primary disk
	should first be copied using <command>gpart backup</command>
	and <command>gpart restore</command>:</para>

      <screen>&prompt.root; <userinput>gpart backup ada0 &gt; table.ada0</userinput>
&prompt.root; <userinput>gpart backup ada0s1 &gt; table.ada0s1</userinput></screen>

      <para>These commands create two files,
	<filename>table.ada0</filename> and
	<filename>table.ada0s1</filename>.  This example is from a
	1&nbsp;TB drive:</para>

      <screen>&prompt.root; <userinput>cat table.ada0</userinput>
MBR 4
1 freebsd         63 1953525105   [active]</screen>

      <screen>&prompt.root; <userinput>cat table.ada0s1</userinput>
BSD 8
1  freebsd-ufs          0    4194304
2 freebsd-swap    4194304   33554432
4  freebsd-ufs   37748736   50331648
5  freebsd-ufs   88080384   41943040
6  freebsd-ufs  130023424  838860800
7  freebsd-ufs  968884224  984640881</screen>

      <para>If no free space is shown at the end of the disk, the size
	of both the slice and the last partition must be reduced by
	one sector.  Edit the two files, reducing the size of both the
	slice and last partition by one.  These are the last numbers
	in each listing.</para>

      <screen>&prompt.root; <userinput>cat table.ada0</userinput>
MBR 4
1 freebsd         63 <emphasis>1953525104</emphasis>   [active]</screen>

      <screen>&prompt.root; <userinput>cat table.ada0s1</userinput>
BSD 8
1  freebsd-ufs          0    4194304
2 freebsd-swap    4194304   33554432
4  freebsd-ufs   37748736   50331648
5  freebsd-ufs   88080384   41943040
6  freebsd-ufs  130023424  838860800
7  freebsd-ufs  968884224  <emphasis>984640880</emphasis></screen>

      <para>If at least one sector was unallocated at the end of the
	disk, these two files can be used without modification.</para>

      <para>Now restore the partition table into
	<filename>mirror/gm0</filename>:</para>

      <screen>&prompt.root; <userinput>gpart restore mirror/gm0 &lt; table.ada0</userinput>
&prompt.root; <userinput>gpart restore mirror/gm0s1 &lt; table.ada0s1</userinput></screen>

      <para>Check the partition table with
	<command>gpart show</command>.  This example has
	<filename>gm0s1a</filename> for <filename>/</filename>,
	<filename>gm0s1d</filename> for <filename>/var</filename>,
	<filename>gm0s1e</filename> for <filename>/usr</filename>,
	<filename>gm0s1f</filename> for <filename>/data1</filename>,
	and <filename>gm0s1g</filename> for
	<filename>/data2</filename>.</para>

      <screen>&prompt.root; <userinput>gpart show mirror/gm0</userinput>
=&gt;        63  1953525104  mirror/gm0  MBR  (931G)
          63  1953525042           1  freebsd  [active]  (931G)
  1953525105          62              - free -  (31k)

&prompt.root; <userinput>gpart show mirror/gm0s1</userinput>
=&gt;         0  1953525042  mirror/gm0s1  BSD  (931G)
           0     2097152             1  freebsd-ufs  (1.0G)
     2097152    16777216             2  freebsd-swap  (8.0G)
    18874368    41943040             4  freebsd-ufs  (20G)
    60817408    20971520             5  freebsd-ufs  (10G)
    81788928   629145600             6  freebsd-ufs  (300G)
   710934528  1242590514             7  freebsd-ufs  (592G)
  1953525042          63                - free -  (31k)</screen>

      <para>Both the slice and the last partition must have at least
	one free block at the end of the disk.</para>

      <para>Create file systems on these new partitions.  The number
	of partitions will vary to match the original disk,
	<filename>ada0</filename>.</para>

      <screen>&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1a</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1d</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1e</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1f</userinput>
&prompt.root; <userinput>newfs -U /dev/mirror/gm0s1g</userinput></screen>

      <para>Make the mirror bootable by installing bootcode in the
	<acronym>MBR</acronym> and bsdlabel and setting the active
	slice:</para>

      <screen>&prompt.root; <userinput>gpart bootcode -b /boot/mbr mirror/gm0</userinput>
&prompt.root; <userinput>gpart set -a active -i 1 mirror/gm0</userinput>
&prompt.root; <userinput>gpart bootcode -b /boot/boot mirror/gm0s1</userinput></screen>

      <para>Adjust <filename>/etc/fstab</filename> to use the new
	partitions on the mirror.  Back up this file first by copying
	it to <filename>/etc/fstab.orig</filename>.</para>

      <screen>&prompt.root; <userinput>cp /etc/fstab /etc/fstab.orig</userinput></screen>

      <para>Edit <filename>/etc/fstab</filename>, replacing
	<filename>/dev/ada0</filename> with
	<filename>mirror/gm0</filename>.</para>

      <programlisting># Device		Mountpoint	FStype	Options	Dump	Pass#
/dev/mirror/gm0s1a	/		ufs	rw	1	1
/dev/mirror/gm0s1b	none		swap	sw	0	0
/dev/mirror/gm0s1d	/var		ufs	rw	2	2
/dev/mirror/gm0s1e	/usr		ufs	rw	2	2
/dev/mirror/gm0s1f	/data1		ufs	rw	2	2
/dev/mirror/gm0s1g	/data2		ufs	rw	2	2</programlisting>

      <para>If the <filename>geom_mirror.ko</filename> kernel module
	has not been built into the kernel, edit
	<filename>/boot/loader.conf</filename> to load it at
	boot:</para>

      <programlisting>geom_mirror_load="YES"</programlisting>

      <para>File systems from the original disk can now be copied onto
	the mirror with &man.dump.8; and &man.restore.8;.  Each file
	system dumped with <command>dump -L</command> will create a
	snapshot first, which can take some time.</para>

      <screen>&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /    | (cd /mnt &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1d /mnt/var</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1e /mnt/usr</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1f /mnt/data1</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1g /mnt/data2</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /usr | (cd /mnt/usr &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /var | (cd /mnt/var &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /data1 | (cd /mnt/data1 &amp;&amp; restore -rf -)</userinput>
&prompt.root; <userinput>dump -C16 -b64 -0aL -f - /data2 | (cd /mnt/data2 &amp;&amp; restore -rf -)</userinput></screen>

      <para>Restart the system, booting from
	<filename>ada1</filename>.  If everything is working, the
	system will boot from <filename>mirror/gm0</filename>, which
	now contains the same data as <filename>ada0</filename> had
	previously.  See <xref linkend="gmirror-troubleshooting"/> if
	there are problems booting.</para>

      <para>At this point, the mirror still consists of only the
	single <filename>ada1</filename> disk.</para>

      <para>After booting from <filename>mirror/gm0</filename>
	successfully, the final step is inserting
	<filename>ada0</filename> into the mirror.</para>

      <important>
	<para>When <filename>ada0</filename> is inserted into the
	  mirror, its former contents will be overwritten by data from
	  the mirror.  Make certain that
	  <filename>mirror/gm0</filename> has the same contents as
	  <filename>ada0</filename> before adding
	  <filename>ada0</filename> to the mirror.  If the contents
	  previously copied by &man.dump.8; and &man.restore.8; are
	  not identical to what was on <filename>ada0</filename>,
	  revert <filename>/etc/fstab</filename> to mount the file
	  systems on <filename>ada0</filename>, reboot, and start the
	  whole procedure again.</para>
      </important>

      <screen>&prompt.root; <userinput>gmirror insert gm0 ada0</userinput>
GEOM_MIRROR: Device gm0: rebuilding provider ada0</screen>

      <para>Synchronization between the two disks will start
	immediately.  Use <command>gmirror status</command> to view
	the progress.</para>

      <screen>&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  DEGRADED  ada1 (ACTIVE)
                      ada0 (SYNCHRONIZING, 64%)</screen>

      <para>After a while, synchronization will finish.</para>

      <screen>GEOM_MIRROR: Device gm0: rebuilding provider ada0 finished.
&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  COMPLETE  ada1 (ACTIVE)
                      ada0 (ACTIVE)</screen>

      <para><filename>mirror/gm0</filename> now consists
	of the two disks <filename>ada0</filename> and
	<filename>ada1</filename>, and the contents are automatically
	synchronized with each other.  In use,
	<filename>mirror/gm0</filename> will behave just like the
	original single drive.</para>
    </sect2>

    <sect2 xml:id="gmirror-troubleshooting">
      <title>Troubleshooting</title>

      <para>If the system no longer boots, <acronym>BIOS</acronym>
	settings may have to be changed to boot from one of the new
	mirrored drives.  Either mirror drive can be used for booting,
	as they contain identical data.</para>

      <para>If the boot stops with this message, something is wrong
	with the mirror device:</para>

      <screen>Mounting from ufs:/dev/mirror/gm0s1a failed with error 19.

Loader variables:
  vfs.root.mountfrom=ufs:/dev/mirror/gm0s1a
  vfs.root.mountfrom.options=rw

Manual root filesystem specification:
  &lt;fstype&gt;:&lt;device&gt; [options]
      Mount &lt;device&gt; using filesystem &lt;fstype&gt;
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:tank
        cd9660:/dev/acd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/acd0 /)

  ?               List valid disk boot devices
  .               Yield 1 second (for background tasks)
  &lt;empty line&gt;    Abort manual input

mountroot&gt;</screen>

      <para>Forgetting to load the <filename>geom_mirror.ko</filename>
	module in <filename>/boot/loader.conf</filename> can cause
	this problem.  To fix it, boot from &os;
	installation media and choose <literal>Shell</literal> at the
	first prompt.  Then load the mirror module and mount the
	mirror device:</para>

      <screen>&prompt.root; <userinput>gmirror load</userinput>
&prompt.root; <userinput>mount /dev/mirror/gm0s1a /mnt</userinput></screen>

      <para>Edit <filename>/mnt/boot/loader.conf</filename>, adding a
	line to load the mirror module:</para>

      <programlisting>geom_mirror_load="YES"</programlisting>

      <para>Save the file and reboot.</para>

      <para>Other problems that cause <errorname>error 19</errorname>
	require more effort to fix.  Although the system should boot
	from <filename>ada0</filename>, another prompt to select a
	shell will appear if <filename>/etc/fstab</filename> is
	incorrect.  Enter <literal>ufs:/dev/ada0s1a</literal> at the
	boot loader prompt and press <keycap>Enter</keycap>.  Undo the
	edits in <filename>/etc/fstab</filename> then mount the file
	systems from the original disk (<filename>ada0</filename>)
	instead of the mirror.  Reboot the system and try the
	procedure again.</para>

      <screen>Enter full pathname of shell or RETURN for /bin/sh:
&prompt.root; <userinput>cp /etc/fstab.orig /etc/fstab</userinput>
&prompt.root; <userinput>reboot</userinput></screen>
    </sect2>

    <sect2>
      <title>Recovering from Disk Failure</title>

      <para>The benefit of disk mirroring is that an individual disk
	can fail without causing the mirror to lose any data.  In the
	above example, if <filename>ada0</filename> fails, the mirror
	will continue to work, providing data from the remaining
	working drive, <filename>ada1</filename>.</para>

      <para>To replace the failed drive, shut down the system and
	physically replace the failed drive with a new drive of equal
	or greater capacity.  Manufacturers use somewhat arbitrary
	values when rating drives in gigabytes, and the only way to
	really be sure is to compare the total count of sectors shown
	by <command>diskinfo -v</command>.  A drive with larger
	capacity than the mirror will work, although the extra space
	on the new drive will not be used.</para>

      <para>After the computer is powered back up, the mirror will be
	running in a <quote>degraded</quote> mode with only one drive.
	The mirror is told to forget drives that are not currently
	connected:</para>

      <screen>&prompt.root; <userinput>gmirror forget gm0</userinput></screen>

      <para>Any old metadata should be cleared from the replacement
	disk using the instructions in
	<xref linkend="geom-mirror-metadata"/>.  Then the replacement
	disk, <filename>ada4</filename> for this example, is inserted
	into the mirror:</para>

      <screen>&prompt.root; <userinput>gmirror insert gm0 /dev/ada4</userinput></screen>

      <para>Resynchronization begins when the new drive is inserted
	into the mirror.  This process of copying mirror data to a new
	drive can take a while.  Performance of the mirror will be
	greatly reduced during the copy, so inserting new drives is
	best done when there is low demand on the computer.</para>

      <para>Progress can be monitored with <command>gmirror
	  status</command>, which shows drives that are being
	synchronized and the percentage of completion.  During
	resynchronization, the status will be
	<computeroutput>DEGRADED</computeroutput>, changing to
	<computeroutput>COMPLETE</computeroutput> when the process is
	finished.</para>
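
      <para>For example, while the replacement disk
	<filename>ada4</filename> is being synchronized, the output
	will be similar to:</para>

      <screen>&prompt.root; <userinput>gmirror status</userinput>
      Name    Status  Components
mirror/gm0  DEGRADED  ada1 (ACTIVE)
                      ada4 (SYNCHRONIZING, 19%)</screen>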
    </sect2>
  </sect1>

  <sect1 xml:id="geom-raid3">
    <info>
      <title><acronym>RAID</acronym>3 - Byte-level Striping with
	Dedicated Parity</title>

      <authorgroup>
	<author>
	  <personname>
	    <firstname>Mark</firstname>
	    <surname>Gladman</surname>
	  </personname>
	  <contrib>Written by </contrib>
	</author>

	<author>
	  <personname>
	    <firstname>Daniel</firstname>
	    <surname>Gerzo</surname>
	  </personname>
	</author>
      </authorgroup>

      <authorgroup>
	<author>
	  <personname>
	    <firstname>Tom</firstname>
	    <surname>Rhodes</surname>
	  </personname>
	  <contrib>Based on documentation by </contrib>
	</author>

	<author>
	  <personname>
	    <firstname>Murray</firstname>
	    <surname>Stokely</surname>
	  </personname>
	</author>
      </authorgroup>
    </info>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>RAID3</primary>
    </indexterm>

    <para><acronym>RAID</acronym>3 is a method used to combine
      several disk drives into a single volume with a dedicated
      parity disk.  In a <acronym>RAID</acronym>3 system, data is
      split at the byte level and written across all the drives in
      the array except for one, which acts as a dedicated parity
      disk.  This means that reads from a <acronym>RAID</acronym>3
      implementation access all disks in the array.  Performance can
      be enhanced by using multiple disk controllers.  The
      <acronym>RAID</acronym>3 array provides a fault tolerance of
      one drive, while providing a capacity of 1 - 1/n times the
      total capacity of all drives in the array, where n is the
      number of hard drives in the array.  For example, an array of
      three 1&nbsp;TB drives offers 2&nbsp;TB of usable capacity.
      Such a configuration is mostly suitable for storing data of
      larger sizes such as multimedia files.</para>

    <para>At least 3 physical hard drives are required to build a
      <acronym>RAID</acronym>3 array.  Each disk must be of the same
      size, since <acronym>I/O</acronym> requests are interleaved to
      read or write to multiple disks in parallel.  Also, due to the
      nature of <acronym>RAID</acronym>3, the number of drives must be
      equal to 3, 5, 9, 17, and so on, or 2^n + 1.</para>

    <para>This section demonstrates how to create a software
      <acronym>RAID</acronym>3 on a &os; system.</para>

    <note>
      <para>While it is theoretically possible to boot from a
	<acronym>RAID</acronym>3 array on &os;, that configuration is
	uncommon and is not advised.</para>
    </note>

    <sect2>
      <title>Creating a Dedicated <acronym>RAID</acronym>3
	Array</title>

      <para>In &os;, support for <acronym>RAID</acronym>3 is
	implemented by the &man.graid3.8; <acronym>GEOM</acronym>
	class.  Creating a dedicated <acronym>RAID</acronym>3 array on
	&os; requires the following steps.</para>

      <procedure>
	<step>
	  <para>First, load the <filename>geom_raid3.ko</filename>
	    kernel module by issuing one of the following
	    commands:</para>

	  <screen>&prompt.root; <userinput>graid3 load</userinput></screen>

	  <para>or:</para>

	  <screen>&prompt.root; <userinput>kldload geom_raid3</userinput></screen>
	</step>

	<step>
	  <para>Ensure that a suitable mount point exists.  This
	    command creates a new directory to use as the mount
	    point:</para>

	  <screen>&prompt.root; <userinput>mkdir <replaceable>/multimedia</replaceable></userinput></screen>
	</step>

	<step>
	  <para>Determine the device names for the disks which will be
	    added to the array, and create the new
	    <acronym>RAID</acronym>3 device.  The final device listed
	    will act as the dedicated parity disk.  This example uses
	    three unpartitioned <acronym>ATA</acronym> drives:
	    <filename><replaceable>ada1</replaceable></filename> and
	    <filename><replaceable>ada2</replaceable></filename> for
	    data, and
	    <filename><replaceable>ada3</replaceable></filename> for
	    parity.</para>

	  <screen>&prompt.root; <userinput>graid3 label -v gr0 /dev/ada1 /dev/ada2 /dev/ada3</userinput>
Metadata value stored on /dev/ada1.
Metadata value stored on /dev/ada2.
Metadata value stored on /dev/ada3.
Done.</screen>
	</step>

	<step>
	  <para>Partition the newly created <filename>gr0</filename>
	    device and put a <acronym>UFS</acronym> file system on
	    it:</para>

	  <screen>&prompt.root; <userinput>gpart create -s GPT /dev/raid3/gr0</userinput>
&prompt.root; <userinput>gpart add -t freebsd-ufs /dev/raid3/gr0</userinput>
&prompt.root; <userinput>newfs -j /dev/raid3/gr0p1</userinput></screen>

	  <para>Many numbers will glide across the screen, and after a
	    bit of time, the process will be complete.  The volume has
	    been created and is ready to be mounted:</para>

	  <screen>&prompt.root; <userinput>mount /dev/raid3/gr0p1 /multimedia/</userinput></screen>

	  <para>The <acronym>RAID</acronym>3 array is now ready to
	    use.</para>
	</step>
      </procedure>

      <para>Additional configuration is needed to retain this setup
	across system reboots.</para>

      <procedure>
	<step>
	  <para>The <filename>geom_raid3.ko</filename> module must be
	    loaded before the array can be mounted.  To automatically
	    load the kernel module during system initialization, add
	    the following line to
	    <filename>/boot/loader.conf</filename>:</para>

	  <programlisting>geom_raid3_load="YES"</programlisting>
	</step>

	<step>
	  <para>The following volume information must be added to
	    <filename>/etc/fstab</filename> in order to
	    automatically mount the array's file system during the
	    system boot process:</para>

	  <programlisting>/dev/raid3/gr0p1	/multimedia	ufs	rw	2	2</programlisting>
	</step>
      </procedure>
    </sect2>
  </sect1>

  <sect1 xml:id="geom-graid">
    <info>
      <title>Software <acronym>RAID</acronym> Devices</title>

      <authorgroup>
	<author>
	  <personname>
	    <firstname>Warren</firstname>
	    <surname>Block</surname>
	  </personname>
	  <contrib>Originally contributed by </contrib>
	</author>
      </authorgroup>
    </info>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>Software RAID Devices</primary>
      <secondary>Hardware-assisted RAID</secondary>
    </indexterm>

    <para>Some motherboards and expansion cards add some simple
      hardware, usually just a <acronym>ROM</acronym>, that allows the
      computer to boot from a <acronym>RAID</acronym> array.  After
      booting, access to the <acronym>RAID</acronym> array is handled
      by software running on the computer's main processor.  This
      <quote>hardware-assisted software
	<acronym>RAID</acronym></quote> provides
      <acronym>RAID</acronym> arrays that are not dependent on any
      particular operating system, and which are functional even
      before an operating system is loaded.</para>

    <para>Several levels of <acronym>RAID</acronym> are supported,
      depending on the hardware in use.  See &man.graid.8; for a
      complete list.</para>

    <para>&man.graid.8; requires the <filename>geom_raid.ko</filename>
      kernel module, which is included in the
      <filename>GENERIC</filename> kernel starting with &os;&nbsp;9.1.
      If needed, it can be loaded manually with
      <command>graid load</command>.</para>
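
    <para>To load the module automatically at boot on systems where
      it is not built into the kernel, add this line to
      <filename>/boot/loader.conf</filename>:</para>

    <programlisting>geom_raid_load="YES"</programlisting>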

    <sect2 xml:id="geom-graid-creating">
      <title>Creating an Array</title>

      <para>Software <acronym>RAID</acronym> devices often have a menu
	that can be entered by pressing special keys when the computer
	is booting.  The menu can be used to create and delete
	<acronym>RAID</acronym> arrays.  &man.graid.8; can also create
	arrays directly from the command line.</para>

      <para><command>graid label</command> is used to create a new
	array.  The motherboard used for this example has an Intel
	software <acronym>RAID</acronym> chipset, so the Intel
	metadata format is specified.  The new array is given a label
	of <filename>gm0</filename>, it is a mirror
	(<acronym>RAID1</acronym>), and uses drives
	<filename>ada0</filename> and
	<filename>ada1</filename>.</para>

      <caution>
	<para>Some space on the drives will be overwritten when they
	  are made into a new array.  Back up existing data
	  first!</para>
      </caution>

      <screen>&prompt.root; <userinput>graid label Intel gm0 RAID1 ada0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Array Intel-a29ea104 created.
GEOM_RAID: Intel-a29ea104: Disk ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:0-ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Array started.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from STARTING to OPTIMAL.
Intel-a29ea104 created
GEOM_RAID: Intel-a29ea104: Provider raid/r0 for volume gm0 created.</screen>

      <para>A status check shows the new mirror is ready for
	use:</para>

      <screen>&prompt.root; <userinput>graid status</userinput>
   Name   Status  Components
raid/r0  OPTIMAL  ada0 (ACTIVE (ACTIVE))
                  ada1 (ACTIVE (ACTIVE))</screen>

      <para>The array device appears in
	<filename>/dev/raid/</filename>.  The first array is called
	<filename>r0</filename>.  Additional arrays, if present, will
	be <filename>r1</filename>, <filename>r2</filename>, and so
	on.</para>

      <para>The <acronym>BIOS</acronym> menu on some of these devices
	can create arrays with special characters in their names.  To
	avoid problems with those special characters, arrays are given
	simple numbered names like <filename>r0</filename>.  To show
	the actual labels, like <filename>gm0</filename> in the
	example above, use &man.sysctl.8;:</para>

      <screen>&prompt.root; <userinput>sysctl kern.geom.raid.name_format=1</userinput></screen>
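
      <para>To make this setting persist across reboots, add it to
	<filename>/etc/sysctl.conf</filename>:</para>

      <programlisting>kern.geom.raid.name_format=1</programlisting>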
    </sect2>

    <sect2 xml:id="geom-graid-volumes">
      <title>Multiple Volumes</title>

      <para>Some software <acronym>RAID</acronym> devices support
	more than one <emphasis>volume</emphasis> on an array.
	Volumes work like partitions, allowing space on the physical
	drives to be split and used in different ways.  For example,
	Intel software <acronym>RAID</acronym> devices support two
	volumes.  This example creates a 40&nbsp;G mirror for safely
	storing the operating system, followed by a 20&nbsp;G
	<acronym>RAID0</acronym> (stripe) volume for fast temporary
	storage:</para>

      <screen>&prompt.root; <userinput>graid label -S 40G Intel gm0 RAID1 ada0 ada1</userinput>
&prompt.root; <userinput>graid add -S 20G gm0 RAID0</userinput></screen>

      <para>Volumes appear as additional
	<filename>r<replaceable>X</replaceable></filename> entries
	in <filename>/dev/raid/</filename>.  An array with two volumes
	will show <filename>r0</filename> and
	<filename>r1</filename>.</para>

      <para>See &man.graid.8; for the number of volumes supported by
	different software <acronym>RAID</acronym> devices.</para>
    </sect2>

    <sect2 xml:id="geom-graid-converting">
      <title>Converting a Single Drive to a Mirror</title>

      <para>Under certain conditions, it is possible to
	convert an existing single drive to a &man.graid.8; array
	without reformatting.  To avoid data loss during the
	conversion, the existing drive must meet these minimum
	requirements:</para>

      <itemizedlist>
	<listitem>
	  <para>The drive must be partitioned with the
	    <acronym>MBR</acronym> partitioning scheme.
	    <acronym>GPT</acronym> or other partitioning schemes with
	    metadata at the end of the drive will be overwritten and
	    corrupted by the &man.graid.8; metadata.</para>
	</listitem>

	<listitem>
	  <para>There must be enough unpartitioned and unused space at
	    the end of the drive to hold the &man.graid.8; metadata.
	    This metadata varies in size, but the largest occupies
	    64&nbsp;M, so at least that much free space is
	    recommended.</para>
	</listitem>
      </itemizedlist>

      <para>If the drive meets these requirements, start by making a
	full backup.  Then create a single-drive mirror with that
	drive:</para>

      <screen>&prompt.root; <userinput>graid label Intel gm0 RAID1 ada0 NONE</userinput></screen>

      <para>&man.graid.8; metadata was written to the end of the drive
	in the unused space.  A second drive can now be inserted into
	the mirror:</para>

      <screen>&prompt.root; <userinput>graid insert raid/r0 ada1</userinput></screen>

      <para>Data from the original drive will immediately begin to be
	copied to the second drive.  The mirror will operate in
	degraded status until the copy is complete.</para>
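
      <para>The progress of the copy can be watched with
	<command>graid status</command>.  The output will be similar
	to:</para>

      <screen>&prompt.root; <userinput>graid status</userinput>
   Name    Status  Components
raid/r0  DEGRADED  ada0 (ACTIVE (ACTIVE))
                   ada1 (ACTIVE (REBUILD 4%))</screen>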
    </sect2>

    <sect2 xml:id="geom-graid-inserting">
      <title>Inserting New Drives into the Array</title>

      <para>Drives can be inserted into an array as replacements for
	drives that have failed or are missing.  If there are no
	failed or missing drives, the new drive becomes a spare.  For
	example, inserting a new drive into a working two-drive mirror
	results in a two-drive mirror with one spare drive, not a
	three-drive mirror.</para>

      <para>In the example mirror array, data immediately begins to be
	copied to the newly-inserted drive.  Any existing information
	on the new drive will be overwritten.</para>

      <screen>&prompt.root; <userinput>graid insert raid/r0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NONE to NEW.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 state changed from NEW to REBUILD.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-ada1 rebuild start at 0.</screen>
    </sect2>

    <sect2 xml:id="geom-graid-removing">
      <title>Removing Drives from the Array</title>

      <para>Individual drives can be permanently removed from an
	array and their metadata erased:</para>

      <screen>&prompt.root; <userinput>graid remove raid/r0 ada1</userinput>
GEOM_RAID: Intel-a29ea104: Disk ada1 state changed from ACTIVE to OFFLINE.
GEOM_RAID: Intel-a29ea104: Subdisk gm0:1-[unknown] state changed from ACTIVE to NONE.
GEOM_RAID: Intel-a29ea104: Volume gm0 state changed from OPTIMAL to DEGRADED.</screen>
    </sect2>

    <sect2 xml:id="geom-graid-stopping">
      <title>Stopping the Array</title>

      <para>An array can be stopped without removing metadata from the
	drives.  The array will be restarted when the system is
	booted.</para>

      <screen>&prompt.root; <userinput>graid stop raid/r0</userinput></screen>
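
      <para>To confirm that the array has stopped, consult
	<command>graid status</command>.  The stopped array should no
	longer be listed until it is detected again at the next
	boot:</para>

      <screen>&prompt.root; <userinput>graid status</userinput></screen>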
    </sect2>

    <sect2 xml:id="geom-graid-status">
      <title>Checking Array Status</title>

      <para>Array status can be checked at any time.  After a drive
	has been added to the mirror in the example above, data is
	being copied from the original drive to the new drive:</para>

      <screen>&prompt.root; <userinput>graid status</userinput>
   Name    Status  Components
raid/r0  DEGRADED  ada0 (ACTIVE (ACTIVE))
                   ada1 (ACTIVE (REBUILD 28%))</screen>

      <para>Some types of arrays, like <literal>RAID0</literal> or
	<literal>CONCAT</literal>, may not be shown in the status
	report if disks have failed.  To see these partially-failed
	arrays, add <option>-ga</option>:</para>

      <screen>&prompt.root; <userinput>graid status -ga</userinput>
          Name  Status  Components
Intel-e2d07d9a  BROKEN  ada6 (ACTIVE (ACTIVE))</screen>
    </sect2>

    <sect2 xml:id="geom-graid-deleting">
      <title>Deleting Arrays</title>

      <para>Arrays are destroyed by deleting all of the volumes from
	them.  When the last volume present is deleted, the array is
	stopped and metadata is removed from the drives:</para>

      <screen>&prompt.root; <userinput>graid delete raid/r0</userinput></screen>
    </sect2>

    <sect2 xml:id="geom-graid-unexpected">
      <title>Deleting Unexpected Arrays</title>

      <para>Drives may unexpectedly contain &man.graid.8; metadata,
	either from previous use or manufacturer testing.
	&man.graid.8; will detect these drives and create an array,
	interfering with access to the individual drive.  To remove
	the unwanted metadata:</para>

      <procedure>
	<step>
	  <para>Boot the system.  At the boot menu, select
	    <literal>2</literal> for the loader prompt.  Enter:</para>

	  <screen>OK <userinput>set kern.geom.raid.enable=0</userinput>
OK <userinput>boot</userinput></screen>

	  <para>The system will boot with &man.graid.8;
	    disabled.</para>
	</step>

	<step>
	  <para>Back up all data on the affected drive.</para>
	</step>

	<step>
	  <para>As a workaround, &man.graid.8; array detection
	    can be disabled by adding</para>

	  <programlisting>kern.geom.raid.enable=0</programlisting>

	  <para>to <filename>/boot/loader.conf</filename>.</para>

	  <para>To permanently remove the &man.graid.8; metadata
	    from the affected drive, boot a &os; installation
	    <acronym>CD-ROM</acronym> or memory stick, and select
	    <literal>Shell</literal>.  Use <command>graid
	      status</command> to find the name of the array,
	    typically <literal>raid/r0</literal>:</para>

	  <screen>&prompt.root; <userinput>graid status</userinput>
   Name   Status  Components
raid/r0  OPTIMAL  ada0 (ACTIVE (ACTIVE))
                  ada1 (ACTIVE (ACTIVE))</screen>

	  <para>Delete the volume by name:</para>

	  <screen>&prompt.root; <userinput>graid delete raid/r0</userinput></screen>

	  <para>If more than one volume is shown, repeat the process
	    for each volume.  After the last volume has been deleted,
	    the array will be destroyed.</para>

	  <para>Reboot and verify data, restoring from backup if
	    necessary.  After the metadata has been removed, the
	    <literal>kern.geom.raid.enable=0</literal> entry in
	    <filename>/boot/loader.conf</filename> can also be
	    removed.</para>
	</step>
      </procedure>
    </sect2>
  </sect1>

  <sect1 xml:id="geom-ggate">
    <title><acronym>GEOM</acronym> Gate Network</title>

    <para><acronym>GEOM</acronym> provides a simple mechanism for
      remote access to devices such as disks,
      <acronym>CD</acronym>s, and file systems through the
      <acronym>GEOM</acronym> Gate network daemon,
      <application>ggated</application>.  The system with the device
      runs the server daemon, which handles requests made by clients
      using <application>ggatec</application>.  The devices should not
      contain any sensitive data as the connection between the client
      and the server is not encrypted.</para>

    <para>Similar to <acronym>NFS</acronym>, which is discussed in
      <xref linkend="network-nfs"/>, <application>ggated</application>
      is configured using an exports file.  This file specifies which
      systems are permitted to access the exported resources and what
      level of access they are offered.  For example, to give the
      client <systemitem class="ipaddress">192.168.1.5</systemitem>
      read and write access to the fourth slice on the first
      <acronym>SCSI</acronym> disk, create
      <filename>/etc/gg.exports</filename> with this line:</para>

    <programlisting>192.168.1.5 RW /dev/da0s4d</programlisting>

    <para>Before exporting the device, ensure it is not currently
      mounted.  Then, start <application>ggated</application>:</para>

    <screen>&prompt.root; <userinput>ggated</userinput></screen>

    <para>Several options are available for specifying an alternate
      listening port or changing the default location of the exports
      file.  Refer to &man.ggated.8; for details.</para>
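
    <para>For example, this hypothetical invocation (the port number
      and file name are arbitrary) listens on an alternate port and
      reads an alternate exports file:</para>

    <screen>&prompt.root; <userinput>ggated -p <replaceable>3080</replaceable> <replaceable>/etc/gg2.exports</replaceable></userinput></screen>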

    <para>To access the exported device on the client machine, first
      use <command>ggatec</command> to specify the
      <acronym>IP</acronym> address of the server and the device name
      of the exported device.  If successful, this command will
      display a <literal>ggate</literal> device name to mount.  Mount
      that specified device name on a free mount point.  This example
      connects to the <filename>/dev/da0s4d</filename> partition on
      <literal>192.168.1.1</literal>, then mounts
      <filename>/dev/ggate0</filename> on
      <filename>/mnt</filename>:</para>

    <screen>&prompt.root; <userinput>ggatec create -o rw 192.168.1.1 /dev/da0s4d</userinput>
ggate0
&prompt.root; <userinput>mount /dev/ggate0 /mnt</userinput></screen>

    <para>The device on the server may now be accessed through
      <filename>/mnt</filename> on the client.  For more details about
      <command>ggatec</command> and a few usage examples, refer to
      &man.ggatec.8;.</para>

    <note>
      <para>The mount will fail if the device is currently mounted on
	either the server or any other client on the network.  If
	simultaneous access is needed to network resources, use
	<acronym>NFS</acronym> instead.</para>
    </note>

    <para>When the device is no longer needed, unmount it with
      <command>umount</command> so that the resource is available to
      other clients.</para>
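
    <para>For example, to release the device created earlier, unmount
      it and then destroy the <literal>ggate</literal> device, where
      unit <literal>0</literal> corresponds to
      <filename>ggate0</filename>:</para>

    <screen>&prompt.root; <userinput>umount /mnt</userinput>
&prompt.root; <userinput>ggatec destroy -u <replaceable>0</replaceable></userinput></screen>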
  </sect1>

  <sect1 xml:id="geom-glabel">
    <title>Labeling Disk Devices</title>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>Disk Labels</primary>
    </indexterm>

    <para>During system initialization, the &os; kernel creates
      device nodes as devices are found.  This method of probing for
      devices raises some issues.  For instance, what if a new disk
      device is added via <acronym>USB</acronym>?  It is likely that
      the flash device will be handed the device name
      <filename>da0</filename> and the original
      <filename>da0</filename> shifted to
      <filename>da1</filename>.  This will cause issues mounting
      file systems listed in <filename>/etc/fstab</filename>, and
      may even prevent the system from booting.</para>

    <para>One solution is to chain <acronym>SCSI</acronym> devices
      in order so a new device added to the <acronym>SCSI</acronym>
      card will be issued unused device numbers.  But what about
      <acronym>USB</acronym> devices which may replace the primary
      <acronym>SCSI</acronym> disk?  This can happen because
      <acronym>USB</acronym> devices are usually probed before the
      <acronym>SCSI</acronym> card.  One solution is to only insert
      these devices after the system has been booted.  Another method
      is to use only a single <acronym>ATA</acronym> drive and never
      list the <acronym>SCSI</acronym> devices in
      <filename>/etc/fstab</filename>.</para>

    <para>A better solution is to use <command>glabel</command> to
      label the disk devices and use the labels in
      <filename>/etc/fstab</filename>.
      Since <command>glabel</command> stores the label in the last
      sector of a given provider, the label will remain persistent
      across reboots.  By using this label as a device, the
      file system may always be mounted regardless of which
      device node it is accessed through.</para>

    <note>
      <para><command>glabel</command> can create both transient and
	permanent labels.  Only permanent labels are consistent across
	reboots.  Refer to &man.glabel.8; for more information on the
	differences between labels.</para>
    </note>

    <sect2>
      <title>Label Types and Examples</title>

      <para>A permanent label can be either a generic label or a
	file system label.  Permanent file system labels can be created with
	&man.tunefs.8; or &man.newfs.8;.  These types of labels are
	created in a sub-directory of <filename>/dev</filename>, and
	will be named according to the file system type.  For example,
	<acronym>UFS</acronym>2 file system labels will be created in
	<filename>/dev/ufs</filename>.  Generic permanent labels can
	be created with <command>glabel label</command>.  These are
	not file system specific and will be created in
	<filename>/dev/label</filename>.</para>
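
      <para>For example, with an illustrative device name, this
	creates a generic permanent label
	<filename>/dev/label/backup</filename>:</para>

      <screen>&prompt.root; <userinput>glabel label <replaceable>backup</replaceable> <replaceable>/dev/da2</replaceable></userinput></screen>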

      <para>Temporary labels are destroyed at the next reboot.  These
	labels are created in <filename>/dev/label</filename> and are
	suited to experimentation.  A temporary label can be created
	using <command>glabel create</command>.</para>
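
      <para>For instance, again with an illustrative device name,
	this creates the transient label
	<filename>/dev/label/tmpdisk</filename>, which will disappear
	at the next reboot:</para>

      <screen>&prompt.root; <userinput>glabel create <replaceable>tmpdisk</replaceable> <replaceable>/dev/da2</replaceable></userinput></screen>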

      <para>To create a permanent label for a
	<acronym>UFS</acronym>2 file system without destroying any
	data, issue the following command:</para>

      <screen>&prompt.root; <userinput>tunefs -L <replaceable>home</replaceable> <replaceable>/dev/da3</replaceable></userinput></screen>

      <para>A label should now exist in <filename>/dev/ufs</filename>,
	which may be added to <filename>/etc/fstab</filename>:</para>

      <programlisting>/dev/ufs/home		/home            ufs     rw              2      2</programlisting>

      <note>
	<para>The file system must not be mounted while attempting
	  to run <command>tunefs</command>.</para>
      </note>

      <para>Now the file system may be mounted:</para>

      <screen>&prompt.root; <userinput>mount /home</userinput></screen>

      <para>From this point on, so long as the
	<filename>geom_label.ko</filename> kernel module is loaded at
	boot via <filename>/boot/loader.conf</filename> or the
	<literal>GEOM_LABEL</literal> kernel option is present,
	the device node may change without any ill effect on the
	system.</para>

      <para>File systems may also be created with a default label
	by using the <option>-L</option> flag with
	<command>newfs</command>.  Refer to &man.newfs.8; for
	more information.</para>
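
      <para>For example, this creates a new file system on the
	hypothetical device <filename>da3</filename>, labeled
	<literal>data</literal> and available as
	<filename>/dev/ufs/data</filename>.  Note that
	<command>newfs</command> destroys any existing data on the
	device:</para>

      <screen>&prompt.root; <userinput>newfs -L <replaceable>data</replaceable> <replaceable>/dev/da3</replaceable></userinput></screen>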

      <para>The following command can be used to destroy the
	label:</para>

      <screen>&prompt.root; <userinput>glabel destroy home</userinput></screen>

      <para>The following example shows how to label the partitions of
	a boot disk.</para>

      <example>
	<title>Labeling Partitions on the Boot Disk</title>

	<para>By permanently labeling the partitions on the boot disk,
	  the system should be able to continue to boot normally, even
	  if the disk is moved to another controller or transferred to
	  a different system.  For this example, it is assumed that a
	  single <acronym>ATA</acronym> disk is used, which is
	  currently recognized by the system as
	  <filename>ad0</filename>.  It is also assumed that the
	  standard &os; partition scheme is used, with
	  <filename>/</filename>,
	  <filename>/var</filename>,
	  <filename>/usr</filename> and
	  <filename>/tmp</filename>, as
	  well as a swap partition.</para>

	<para>Reboot the system, and at the &man.loader.8; prompt,
	  press <keycap>4</keycap> to boot into single user mode.
	  Then enter the following commands:</para>

	<screen>&prompt.root; <userinput>glabel label rootfs /dev/ad0s1a</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1a is label/rootfs
&prompt.root; <userinput>glabel label var /dev/ad0s1d</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1d is label/var
&prompt.root; <userinput>glabel label usr /dev/ad0s1f</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1f is label/usr
&prompt.root; <userinput>glabel label tmp /dev/ad0s1e</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1e is label/tmp
&prompt.root; <userinput>glabel label swap /dev/ad0s1b</userinput>
GEOM_LABEL: Label for provider /dev/ad0s1b is label/swap
&prompt.root; <userinput>exit</userinput></screen>

	<para>The system will continue with multi-user boot.  After
	  the boot completes, edit <filename>/etc/fstab</filename> and
	  replace the conventional device names with their respective
	  labels.  The final <filename>/etc/fstab</filename> will
	  look like this:</para>

	<programlisting># Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/label/swap         none            swap    sw              0       0
/dev/label/rootfs       /               ufs     rw              1       1
/dev/label/tmp          /tmp            ufs     rw              2       2
/dev/label/usr          /usr            ufs     rw              2       2
/dev/label/var          /var            ufs     rw              2       2</programlisting>

	<para>The system can now be rebooted.  If everything went
	  well, it will come up normally and <command>mount</command>
	  will show:</para>

	<screen>&prompt.root; <userinput>mount</userinput>
/dev/label/rootfs on / (ufs, local)
devfs on /dev (devfs, local)
/dev/label/tmp on /tmp (ufs, local, soft-updates)
/dev/label/usr on /usr (ufs, local, soft-updates)
/dev/label/var on /var (ufs, local, soft-updates)</screen>
      </example>

      <para>The &man.glabel.8; class
	supports a label type for <acronym>UFS</acronym> file
	systems, based on the unique file system id,
	<literal>ufsid</literal>.  These labels may be found in
	<filename>/dev/ufsid</filename> and are
	created automatically during system startup.  It is possible
	to use <literal>ufsid</literal> labels to mount partitions
	using <filename>/etc/fstab</filename>.  Use <command>glabel
	  status</command> to receive a list of file systems and their
	corresponding <literal>ufsid</literal> labels:</para>

      <screen>&prompt.user; <userinput>glabel status</userinput>
                  Name  Status  Components
ufsid/486b6fc38d330916     N/A  ad4s1d
ufsid/486b6fc16926168e     N/A  ad4s1f</screen>

      <para>In the above example, <filename>ad4s1d</filename>
	represents <filename>/var</filename>,
	while <filename>ad4s1f</filename> represents
	<filename>/usr</filename>.
	Using the <literal>ufsid</literal> values shown, these
	partitions may now be mounted with the following entries in
	<filename>/etc/fstab</filename>:</para>

      <programlisting>/dev/ufsid/486b6fc38d330916        /var        ufs        rw        2      2
/dev/ufsid/486b6fc16926168e        /usr        ufs        rw        2      2</programlisting>

      <para>Any partitions with <literal>ufsid</literal> labels can be
	mounted in this way, eliminating the need to manually create
	permanent labels, while still enjoying the benefits of device
	name independent mounting.</para>
    </sect2>
  </sect1>

  <sect1 xml:id="geom-gjournal">
    <title>UFS Journaling Through <acronym>GEOM</acronym></title>

    <indexterm>
      <primary><acronym>GEOM</acronym></primary>
    </indexterm>
    <indexterm>
      <primary>Journaling</primary>
    </indexterm>

    <para>Support for journals on
      <acronym>UFS</acronym> file systems is available on &os;.  The
      implementation is provided through the <acronym>GEOM</acronym>
      subsystem and is configured using <command>gjournal</command>.
      Unlike other file system journaling implementations, the
      <command>gjournal</command> method is block based and not
      implemented as part of the file system.  It is a
      <acronym>GEOM</acronym> extension.</para>

    <para>Journaling stores a log of file system transactions, such as
      changes that make up a complete disk write operation, before
      meta-data and file writes are committed to the disk.  This
      transaction log can later be replayed to redo file system
      transactions, preventing file system inconsistencies.</para>

    <para>This method provides another mechanism to protect against
      data loss and inconsistencies of the file system.  Unlike Soft
      Updates, which tracks and enforces meta-data updates, and
      snapshots, which create an image of the file system, the log is
      stored in disk space specifically reserved for this task.  For
      better performance, the journal may be stored on another disk.
      In this configuration, the journal provider or storage device is
      listed after the device on which to enable journaling.</para>
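
    <para>For example, assuming <filename>da4</filename> holds the
      data and <filename>da5</filename> is a second disk dedicated to
      the journal, such a configuration could be created with the
      following command once the kernel support described below is
      loaded:</para>

    <screen>&prompt.root; <userinput>gjournal label <replaceable>da4</replaceable> <replaceable>da5</replaceable></userinput></screen>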

    <para>The <filename>GENERIC</filename> kernel provides support for
      <command>gjournal</command>.  To automatically load the
      <filename>geom_journal.ko</filename> kernel module at boot time,
      add the following line to
      <filename>/boot/loader.conf</filename>:</para>

    <programlisting>geom_journal_load="YES"</programlisting>

    <para>If a custom kernel is used, ensure the following line is in
      the kernel configuration file:</para>

    <programlisting>options	GEOM_JOURNAL</programlisting>

    <para>Once the module is loaded, a journal can be created on a new
      file system using the following steps.  In this example,
      <filename>da4</filename> is a new <acronym>SCSI</acronym>
      disk:</para>

    <screen>&prompt.root; <userinput>gjournal load</userinput>
&prompt.root; <userinput>gjournal label /dev/<replaceable>da4</replaceable></userinput></screen>

    <para>This will load the module and create the
      <filename>/dev/da4.journal</filename> device node on top of
      <filename>/dev/da4</filename>.</para>

    <para>A <acronym>UFS</acronym> file system may now be created on
      the journaled device, then mounted on an existing mount
      point:</para>

    <screen>&prompt.root; <userinput>newfs -O 2 -J /dev/<replaceable>da4</replaceable>.journal</userinput>
&prompt.root; <userinput>mount /dev/<replaceable>da4</replaceable>.journal <replaceable>/mnt</replaceable></userinput></screen>

    <note>
      <para>In the case of several slices, a journal will be created
	for each individual slice.  For instance, if
	<filename>ad4s1</filename> and <filename>ad4s2</filename> are
	both slices, then <command>gjournal</command> will create
	<filename>ad4s1.journal</filename> and
	<filename>ad4s2.journal</filename>.</para>
    </note>

    <para>Journaling may also be enabled on current file systems by
      using <command>tunefs</command>.  However,
      <emphasis>always</emphasis> make a backup before attempting to
      alter an existing file system.  In most cases,
      <command>gjournal</command> will fail if it is unable to create
      the journal, but this does not protect against data loss
      incurred as a result of misusing <command>tunefs</command>.
      Refer to &man.gjournal.8; and &man.tunefs.8; for more
      information about these commands.</para>
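
    <para>As a rough sketch only, assuming an existing
      <acronym>UFS</acronym> file system on the hypothetical device
      <filename>ad4s1</filename> that has already been backed up and
      unmounted, journaling might be enabled like this:</para>

    <screen>&prompt.root; <userinput>gjournal label <replaceable>ad4s1</replaceable></userinput>
&prompt.root; <userinput>tunefs -J enable -n disable <replaceable>ad4s1</replaceable>.journal</userinput>
&prompt.root; <userinput>mount -o async /dev/<replaceable>ad4s1</replaceable>.journal <replaceable>/mnt</replaceable></userinput></screen>

    <para>Here soft updates are disabled and the file system is
      mounted asynchronously, the combination suggested in the article
      referenced below.</para>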

    <para>It is possible to journal the boot disk of a &os; system.
      Refer to the article <link
	xlink:href="&url.articles.gjournal-desktop;">Implementing UFS
	Journaling on a Desktop PC</link> for detailed
      instructions.</para>
  </sect1>
</chapter>