1 files changed, 848 insertions, 0 deletions
diff --git a/zh_TW.UTF-8/books/developers-handbook/kerneldebug/chapter.xml b/zh_TW.UTF-8/books/developers-handbook/kerneldebug/chapter.xml
new file mode 100644
index 0000000000..a2cd16fa99
--- /dev/null
+++ b/zh_TW.UTF-8/books/developers-handbook/kerneldebug/chapter.xml
@@ -0,0 +1,848 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!--
+     The FreeBSD Documentation Project
+
+     $FreeBSD$
+-->
+<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0" xml:id="kerneldebug">
+  <info><title>Kernel Debugging</title>
+    <authorgroup>
+      <author><personname><firstname>Paul</firstname><surname>Richards</surname></personname><contrib>Contributed by </contrib></author>
+      <author><personname><firstname>J&ouml;rg</firstname><surname>Wunsch</surname></personname></author>
+    </authorgroup>
+  </info>
+
+  
+
+  <sect1 xml:id="kerneldebug-obtain">
+    <title>Obtaining a Kernel Crash Dump</title>
+
+    <para>When running a development kernel (e.g., &os.current;), running
+      a kernel under extreme conditions (e.g., very high load averages,
+      tens of thousands of connections, an exceedingly high number of
+      concurrent users, hundreds of &man.jail.8;s), or using a
+      new feature or device driver on &os.stable; (e.g.,
+      <acronym>PAE</acronym>), a kernel will sometimes panic.  If it
+      does, this chapter demonstrates how to extract
+      useful information from the crash.</para>
+
+    <para>A system reboot is inevitable once a kernel panics.  Once a
+      system is rebooted, the contents of its physical memory
+      (<acronym>RAM</acronym>) are lost, as well as any bits that were
+      on the swap device before the panic.  To preserve the bits in
+      physical memory, the kernel uses the swap device as a
+      temporary place to store the contents of RAM across the
+      reboot that follows a crash.  When &os; boots after a
+      crash, this memory image can then be extracted and debugging can
+      take place.</para>
+
+    <note><para>A swap device that has been configured as a dump
+      device still acts as a swap device.  Dumps to non-swap devices
+      (such as tapes or CDRWs, for example) are not supported at this time.  A
+      <quote>swap device</quote> is synonymous with a <quote>swap
+      partition.</quote></para></note>
+
+    <para>To be able to extract a usable core, it is required that at
+      least one swap partition be large enough to hold all of the bits
+      in physical memory.  When a kernel panics, before the system
+      reboots, the kernel is smart enough to check to see if a swap
+      device has been configured as a dump device.  If there is a
+      valid dump device, the kernel dumps the contents of what is in
+      physical memory to the swap device.</para>
+
+    <sect2 xml:id="config-dumpdev">
+      <title>Configuring the Dump Device</title>
+
+      <para>Before the kernel will dump the contents of its physical
+	memory to a dump device, a dump device must be configured.  A
+	dump device is specified by using the &man.dumpon.8; command
+	to tell the kernel where to save kernel crash dumps.  The
+	&man.dumpon.8; program must be called after the swap partition
+	has been configured with &man.swapon.8;.  This is normally
+	handled by setting the <varname>dumpdev</varname> variable in
+	&man.rc.conf.5; to the path of the swap device (the
+	recommended way to extract a kernel dump).</para>
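+
+      <para>As a minimal sketch, the corresponding &man.rc.conf.5;
+	entries might look like the following.  The device path
+	<filename>/dev/ad0s1b</filename> is only an assumption borrowed
+	from the example later in this section; substitute the swap
+	device of the system in question.</para>
+
+      <programlisting>dumpdev="/dev/ad0s1b"	# assumed swap device; adjust to match the system
+dumpdir="/var/crash"	# directory where savecore(8) places the extracted dump</programlisting>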
+
+      <para>Alternatively, the dump device can be hard-coded via the
+	<literal>dump</literal> clause in the &man.config.5; line of
+	a kernel configuration file.  This approach is deprecated and should
+	be used only if a kernel is crashing before &man.dumpon.8; can be executed.</para>
+
+      <tip><para>Check <filename>/etc/fstab</filename> or
+	&man.swapinfo.8; for a list of swap devices.</para></tip>
+
+      <important><para>Make sure the <varname>dumpdir</varname>
+        specified in &man.rc.conf.5; exists before a kernel
+        crash!</para>
+
+        <screen>&prompt.root; <userinput>mkdir /var/crash</userinput>
+&prompt.root; <userinput>chmod 700 /var/crash</userinput></screen>
+
+        <para>Also, remember that the contents of
+	  <filename>/var/crash</filename> are sensitive and very likely
+	  contain confidential information such as passwords.</para>
+      </important>
+    </sect2>
+
+    <sect2 xml:id="extract-dump">
+      <title>Extracting a Kernel Dump</title>
+
+        <para>Once a dump has been written to a dump device, the dump
+	  must be extracted before the swap device is mounted.
+	  To extract a dump
+	  from a dump device, use the &man.savecore.8; program.  If
+	  <varname>dumpdev</varname> has been set in &man.rc.conf.5;,
+	  &man.savecore.8; will be called automatically on the first
+	  multi-user boot after the crash and before the swap device
+	  is mounted.  The extracted core is placed in the directory
+	  named by the &man.rc.conf.5; variable
+	  <varname>dumpdir</varname> (by default
+	  <filename>/var/crash</filename>) and is named
+	  <filename>vmcore.0</filename>.</para>
+
+        <para>In the event that there is already a file called
+          <filename>vmcore.0</filename> in
+          <filename>/var/crash</filename> (or whatever
+          <varname>dumpdir</varname> is set to), &man.savecore.8;
+          increments the trailing number for every crash to avoid
+          overwriting an existing <filename>vmcore</filename> (e.g.,
+          <filename>vmcore.1</filename>).  While debugging, you will
+          most likely want the
+          <filename>vmcore</filename> with the highest number in
+          <filename>/var/crash</filename>.</para>
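+
+        <para>As a rough illustration, listing the dump directory
+          newest-first shows which <filename>vmcore</filename> was
+          written most recently.  The path assumes the default
+          <varname>dumpdir</varname>; adjust it if a different
+          directory has been configured:</para>
+
+        <screen>&prompt.root; <userinput>ls -lt /var/crash</userinput></screen>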
+
+    <tip>
+      <para>If you are testing a new kernel but need to boot a different one in
+      order to get your system up and running again, boot it only into single
+      user mode using the <option>-s</option> flag at the boot prompt, and
+      then perform the following steps:</para>
+
+    <screen>&prompt.root; <userinput>fsck -p</userinput>
+&prompt.root; <userinput>mount -a -t ufs</userinput>       # make sure /var/crash is writable
+&prompt.root; <userinput>savecore /var/crash /dev/ad0s1b</userinput>
+&prompt.root; <userinput>exit</userinput>                  # exit to multi-user</screen>
+
+    <para>This instructs &man.savecore.8; to extract a kernel dump
+      from <filename>/dev/ad0s1b</filename> and place the contents in
+      <filename>/var/crash</filename>.  Do not forget to make sure the
+      destination directory <filename>/var/crash</filename> has enough
+      space for the dump.  Also, specify the correct path to your swap
+      device, as it is likely different from
+      <filename>/dev/ad0s1b</filename>!</para></tip>
+
+      <para>The recommended, and certainly the easiest way to automate
+        obtaining crash dumps is to use the <varname>dumpdev</varname>
+        variable in &man.rc.conf.5;.</para>
+    </sect2>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-gdb">
+    <title>Debugging a Kernel Crash Dump with <command>kgdb</command></title>
+
+    <note>
+      <para>This section covers &man.kgdb.1; as found in &os;&nbsp;5.3
+	and later.  In previous versions, one must use
+	<command>gdb -k</command> to read a core dump file.</para>
+    </note>
+
+    <para>Once a dump has been obtained, getting useful information
+      out of the dump is relatively easy for simple problems.  Before
+      launching into the internals of &man.kgdb.1; to debug
+      the crash dump, locate the debug version of your kernel
+      (normally called <filename>kernel.debug</filename>) and the path
+      to the source files used to build your kernel (normally
+      <filename>/usr/obj/usr/src/sys/KERNCONF</filename>,
+      where <filename>KERNCONF</filename>
+      is the <varname>ident</varname> specified in a kernel
+      &man.config.5;).  With those two pieces of information, let the
+      debugging commence!</para>
+
+    <para>To enter into the debugger and begin getting information
+      from the dump, the following steps are required at a minimum:</para>
+
+    <screen>&prompt.root; <userinput>cd /usr/obj/usr/src/sys/KERNCONF</userinput>
+&prompt.root; <userinput>kgdb kernel.debug /var/crash/vmcore.0</userinput></screen>
+
+    <para>You can debug the crash dump using the kernel sources just like
+      you can for any other program.</para>
+
+    <para>This first dump is from a 5.2-BETA kernel and the crash
+      comes from deep within the kernel.  The output below has been
+      modified to include line numbers on the left.  This first trace
+      inspects the instruction pointer and obtains a back trace.  The
+      address that is used on line 41 for the <command>list</command>
+      command is the instruction pointer and can be found on line
+      17.  Most developers will request having at least this
+      information sent to them if you are unable to debug the problem
+      yourself.  If, however, you do solve the problem, make sure that
+      your patch winds its way into the source tree via a problem
+      report, mailing lists, or by being able to commit it!</para>
+
+      <screen> 1:&prompt.root; <userinput>cd /usr/obj/usr/src/sys/KERNCONF</userinput>
+ 2:&prompt.root; <userinput>kgdb kernel.debug /var/crash/vmcore.0</userinput>
+ 3:GNU gdb 5.2.1 (FreeBSD)
+ 4:Copyright 2002 Free Software Foundation, Inc.
+ 5:GDB is free software, covered by the GNU General Public License, and you are
+ 6:welcome to change it and/or distribute copies of it under certain conditions.
+ 7:Type "show copying" to see the conditions.
+ 8:There is absolutely no warranty for GDB.  Type "show warranty" for details.
+ 9:This GDB was configured as "i386-undermydesk-freebsd"...
+10:panic: page fault
+11:panic messages:
+12:---
+13:Fatal trap 12: page fault while in kernel mode
+14:cpuid = 0; apic id = 00
+15:fault virtual address   = 0x300
+16:fault code:             = supervisor read, page not present
+17:instruction pointer     = 0x8:0xc0713860
+18:stack pointer           = 0x10:0xdc1d0b70
+19:frame pointer           = 0x10:0xdc1d0b7c
+20:code segment            = base 0x0, limit 0xfffff, type 0x1b
+21:                        = DPL 0, pres 1, def32 1, gran 1
+22:processor eflags        = resume, IOPL = 0
+23:current process         = 14394 (uname)
+24:trap number             = 12
+25:panic: page fault
+26:      cpuid = 0;
+27:Stack backtrace:
+28:
+29:syncing disks, buffers remaining... 2199 2199 panic: mi_switch: switch in a critical section
+30:cpuid = 0;
+31:Uptime: 2h43m19s
+32:Dumping 255 MB
+33: 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
+34:---
+35:Reading symbols from /boot/kernel/snd_maestro3.ko...done.
+36:Loaded symbols for /boot/kernel/snd_maestro3.ko
+37:Reading symbols from /boot/kernel/snd_pcm.ko...done.
+38:Loaded symbols for /boot/kernel/snd_pcm.ko
+39:#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
+40:240             dumping++;
+41:<prompt>(kgdb)</prompt> <userinput>list *0xc0713860</userinput>
+42:0xc0713860 is in lapic_ipi_wait (/usr/src/sys/i386/i386/local_apic.c:663).
+43:658                     incr = 0;
+44:659                     delay = 1;
+45:660             } else
+46:661                     incr = 1;
+47:662             for (x = 0; x &lt; delay; x += incr) {
+48:663                     if ((lapic-&gt;icr_lo &amp; APIC_DELSTAT_MASK) == APIC_DELSTAT_IDLE)
+49:664                             return (1);
+50:665                     ia32_pause();
+51:666             }
+52:667             return (0);
+53:<prompt>(kgdb)</prompt> <userinput>backtrace</userinput>
+54:#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
+55:#1  0xc055fd9b in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372
+56:#2  0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550
+57:#3  0xc0567ef5 in mi_switch () at /usr/src/sys/kern/kern_synch.c:470
+58:#4  0xc055fa87 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:312
+59:#5  0xc056019d in panic () at /usr/src/sys/kern/kern_shutdown.c:550
+60:#6  0xc0720c66 in trap_fatal (frame=0xdc1d0b30, eva=0)
+61:    at /usr/src/sys/i386/i386/trap.c:821
+62:#7  0xc07202b3 in trap (frame=
+63:      {tf_fs = -1065484264, tf_es = -1065484272, tf_ds = -1065484272, tf_edi = 1, tf_esi = 0, tf_ebp = -602076292, tf_isp = -602076324, tf_ebx = 0, tf_edx = 0, tf_ecx = 1000000, tf_eax = 243, tf_trapno = 12, tf_err = 0, tf_eip = -1066321824, tf_cs = 8, tf_eflags = 65671, tf_esp = 243, tf_ss = 0})
+64:    at /usr/src/sys/i386/i386/trap.c:250
+65:#8  0xc070c9f8 in calltrap () at {standard input}:94
+66:#9  0xc07139f3 in lapic_ipi_vectored (vector=0, dest=0)
+67:    at /usr/src/sys/i386/i386/local_apic.c:733
+68:#10 0xc0718b23 in ipi_selected (cpus=1, ipi=1)
+69:    at /usr/src/sys/i386/i386/mp_machdep.c:1115
+70:#11 0xc057473e in kseq_notify (ke=0xcc05e360, cpu=0)
+71:    at /usr/src/sys/kern/sched_ule.c:520
+72:#12 0xc0575cad in sched_add (td=0xcbcf5c80)
+73:    at /usr/src/sys/kern/sched_ule.c:1366
+74:#13 0xc05666c6 in setrunqueue (td=0xcc05e360)
+75:    at /usr/src/sys/kern/kern_switch.c:422
+76:#14 0xc05752f4 in sched_wakeup (td=0xcbcf5c80)
+77:    at /usr/src/sys/kern/sched_ule.c:999
+78:#15 0xc056816c in setrunnable (td=0xcbcf5c80)
+79:    at /usr/src/sys/kern/kern_synch.c:570
+80:#16 0xc0567d53 in wakeup (ident=0xcbcf5c80)
+81:    at /usr/src/sys/kern/kern_synch.c:411
+82:#17 0xc05490a8 in exit1 (td=0xcbcf5b40, rv=0)
+83:    at /usr/src/sys/kern/kern_exit.c:509
+84:#18 0xc0548011 in sys_exit () at /usr/src/sys/kern/kern_exit.c:102
+85:#19 0xc0720fd0 in syscall (frame=
+86:      {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = -1, tf_ebp = -1077940712, tf_isp = -602075788, tf_ebx = 672411944, tf_edx = 10, tf_ecx = 672411600, tf_eax = 1, tf_trapno = 12, tf_err = 2, tf_eip = 671899563, tf_cs = 31, tf_eflags = 642, tf_esp = -1077940740, tf_ss = 47})
+87:    at /usr/src/sys/i386/i386/trap.c:1010
+88:#20 0xc070ca4d in Xint0x80_syscall () at {standard input}:136
+89:---Can't read userspace from dump, or kernel process---
+90:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>
+
+
+    <para>This next trace is an older dump from the FreeBSD 2 time
+      frame, but is more involved and demonstrates more of the
+      features of <command>gdb</command>.  Long lines have been folded
+      to improve readability, and the lines are numbered for
+      reference. Despite this, it is a real-world error trace taken
+      during the development of the pcvt console driver.</para>
+
+<screen> 1:Script started on Fri Dec 30 23:15:22 1994
+ 2:&prompt.root; <userinput>cd /sys/compile/URIAH</userinput>
+ 3:&prompt.root; <userinput>gdb -k kernel /var/crash/vmcore.1</userinput>
+ 4:Reading symbol data from /usr/src/sys/compile/URIAH/kernel
+...done.
+ 5:IdlePTD 1f3000
+ 6:panic: because you said to!
+ 7:current pcb at 1e3f70
+ 8:Reading in symbols for ../../i386/i386/machdep.c...done.
+ 9:<prompt>(kgdb)</prompt> <userinput>backtrace</userinput>
+10:#0  boot (arghowto=256) (../../i386/i386/machdep.c line 767)
+11:#1  0xf0115159 in panic ()
+12:#2  0xf01955bd in diediedie () (../../i386/i386/machdep.c line 698)
+13:#3  0xf010185e in db_fncall ()
+14:#4  0xf0101586 in db_command (-266509132, -266509516, -267381073)
+15:#5  0xf0101711 in db_command_loop ()
+16:#6  0xf01040a0 in db_trap ()
+17:#7  0xf0192976 in kdb_trap (12, 0, -272630436, -266743723)
+18:#8  0xf019d2eb in trap_fatal (...)
+19:#9  0xf019ce60 in trap_pfault (...)
+20:#10 0xf019cb2f in trap (...)
+21:#11 0xf01932a1 in exception:calltrap ()
+22:#12 0xf0191503 in cnopen (...)
+23:#13 0xf0132c34 in spec_open ()
+24:#14 0xf012d014 in vn_open ()
+25:#15 0xf012a183 in open ()
+26:#16 0xf019d4eb in syscall (...)
+27:<prompt>(kgdb)</prompt> <userinput>up 10</userinput>
+28:Reading in symbols for ../../i386/i386/trap.c...done.
+29:#10 0xf019cb2f in trap (frame={tf_es = -260440048, tf_ds = 16, tf_\
+30:edi = 3072, tf_esi = -266445372, tf_ebp = -272630356, tf_isp = -27\
+31:2630396, tf_ebx = -266427884, tf_edx = 12, tf_ecx = -266427884, tf\
+32:_eax = 64772224, tf_trapno = 12, tf_err = -272695296, tf_eip = -26\
+33:6672343, tf_cs = -266469368, tf_eflags = 66066, tf_esp = 3072, tf_\
+34:ss = -266427884}) (../../i386/i386/trap.c line 283)
+35:283                             (void) trap_pfault(&amp;frame, FALSE);
+36:<prompt>(kgdb)</prompt> <userinput>frame frame-&gt;tf_ebp frame-&gt;tf_eip</userinput>
+37:Reading in symbols for ../../i386/isa/pcvt/pcvt_drv.c...done.
+38:#0  0xf01ae729 in pcopen (dev=3072, flag=3, mode=8192, p=(struct p\
+39:roc *) 0xf07c0c00) (../../i386/isa/pcvt/pcvt_drv.c line 403)
+40:403             return ((*linesw[tp-&gt;t_line].l_open)(dev, tp));
+41:<prompt>(kgdb)</prompt> <userinput>list</userinput>
+42:398
+43:399             tp-&gt;t_state |= TS_CARR_ON;
+44:400             tp-&gt;t_cflag |= CLOCAL;  /* cannot be a modem (:-) */
+45:401
+46:402     #if PCVT_NETBSD || (PCVT_FREEBSD &gt;= 200)
+47:403             return ((*linesw[tp-&gt;t_line].l_open)(dev, tp));
+48:404     #else
+49:405             return ((*linesw[tp-&gt;t_line].l_open)(dev, tp, flag));
+50:406     #endif /* PCVT_NETBSD || (PCVT_FREEBSD &gt;= 200) */
+51:407     }
+52:<prompt>(kgdb)</prompt> <userinput>print tp</userinput>
+53:Reading in symbols for ../../i386/i386/cons.c...done.
+54:$1 = (struct tty *) 0x1bae
+55:<prompt>(kgdb)</prompt> <userinput>print tp-&gt;t_line</userinput>
+56:$2 = 1767990816
+57:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+58:#1  0xf0191503 in cnopen (dev=0x00000000, flag=3, mode=8192, p=(st\
+59:ruct proc *) 0xf07c0c00) (../../i386/i386/cons.c line 126)
+60:       return ((*cdevsw[major(dev)].d_open)(dev, flag, mode, p));
+61:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+62:#2  0xf0132c34 in spec_open ()
+63:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+64:#3  0xf012d014 in vn_open ()
+65:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+66:#4  0xf012a183 in open ()
+67:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+68:#5  0xf019d4eb in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi =\
+69: 2158592, tf_esi = 0, tf_ebp = -272638436, tf_isp = -272629788, tf\
+70:_ebx = 7086, tf_edx = 1, tf_ecx = 0, tf_eax = 5, tf_trapno = 582, \
+71:tf_err = 582, tf_eip = 75749, tf_cs = 31, tf_eflags = 582, tf_esp \
+72:= -272638456, tf_ss = 39}) (../../i386/i386/trap.c line 673)
+73:673             error = (*callp-&gt;sy_call)(p, args, rval);
+74:<prompt>(kgdb)</prompt> <userinput>up</userinput>
+75:Initial frame selected; you cannot go up.
+76:<prompt>(kgdb)</prompt> <userinput>quit</userinput></screen>
+    <para>Comments on the above script:</para>
+
+    <variablelist>
+      <varlistentry>
+	<term>line 6:</term>
+
+	<listitem>
+	  <para>This is a dump taken from within DDB (see below), hence the
+	    panic comment <quote>because you said to!</quote>, and a rather
+	    long stack trace; the initial reason for going into DDB was a
+	    page fault trap, though.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>line 20:</term>
+
+	<listitem>
+	  <para>This is the location of function <function>trap()</function>
+	    in the stack trace.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>line 36:</term>
+
+	<listitem>
+	  <para>Force usage of a new stack frame; this is no longer necessary.
+	    The stack frames are supposed to point to the right
+	    locations now, even in case of a trap.
+	    From looking at the code in source line 403, there is a
+	    high probability that either the pointer access for
+	    <quote>tp</quote> was messed up, or the array access was out of
+	    bounds.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>line 52:</term>
+
+	<listitem>
+	  <para>The pointer looks suspicious, but happens to be a valid
+	    address.</para>
+	</listitem>
+      </varlistentry>
+
+      <varlistentry>
+	<term>line 56:</term>
+
+	<listitem>
+	  <para>However, it obviously points to garbage, so we have found our
+	    error! (For those unfamiliar with that particular piece of code:
+	    <literal>tp-&gt;t_line</literal> refers to the line discipline of
+	    the console device here, which must be a rather small
+	    integer.)</para>
+	</listitem>
+      </varlistentry>
+    </variablelist>
+
+    <tip><para>If your system is crashing regularly and you are running
+      out of disk space, deleting old <filename>vmcore</filename>
+      files in <filename>/var/crash</filename> could save a
+      considerable amount of disk space!</para></tip>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-ddd">
+    <title>Debugging a Crash Dump with DDD</title>
+
+    <para>Examining a kernel crash dump with a graphical debugger like
+      <command>ddd</command> is also possible (you will need to install
+      the <package>devel/ddd</package> port in order to use the
+      <command>ddd</command> debugger).  Add the <option>-k</option>
+      option to the <command>ddd</command> command line you would use
+      normally.  For example:</para>
+
+    <screen>&prompt.root; <userinput>ddd -k /var/crash/kernel.0 /var/crash/vmcore.0</userinput></screen>
+
+    <para>You should then be able to go about looking at the crash dump using
+      <command>ddd</command>'s graphical interface.</para>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-post-mortem">
+    <title>Post-Mortem Analysis of a Dump</title>
+
+    <para>What do you do if a kernel dumped core but you did not expect it,
+      and it is therefore not compiled using <command>config -g</command>? Not
+      everything is lost here.  Do not panic!</para>
+
+    <para>Of course, you still need to enable crash dumps.  See above for the
+      options you have to specify in order to do this.</para>
+
+    <para>Go to your kernel config directory
+      (<filename>/usr/src/sys/arch/conf</filename>)
+      and edit your configuration file.  Uncomment (or add, if it does not
+      exist) the following line:</para>
+
+    <programlisting>makeoptions    DEBUG=-g                #Build kernel with gdb(1) debug symbols</programlisting>
+
+    <para>Rebuild the kernel.  Due to the time stamp change on the Makefile,
+      some other object files will be rebuilt, for example
+      <filename>trap.o</filename>.  With a bit of luck, the added
+      <option>-g</option> option will not change anything for the generated
+      code, so you will finally get a new kernel with similar code to the
+      faulting one but with some debugging symbols.  You should at least verify the
+      old and new sizes with the &man.size.1; command.  If there is a
+      mismatch, you probably need to give up here.</para>
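+
+    <para>Such a comparison might look as follows.  Both paths are
+      hypothetical; substitute the location of the kernel that crashed
+      and of the newly built kernel.  If the added
+      <option>-g</option> option did not alter the generated code, the
+      <literal>text</literal>, <literal>data</literal>, and
+      <literal>bss</literal> columns reported for the two files should
+      agree:</para>
+
+    <screen>&prompt.root; <userinput>size /kernel.old /usr/src/sys/compile/KERNCONF/kernel</userinput></screen>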
+
+    <para>Go and examine the dump as described above.  The debugging symbols
+      might be incomplete for some places, as can be seen in the stack trace
+      in the example above where some functions are displayed without line
+      numbers and argument lists.  If you need more debugging symbols, remove
+      the appropriate object files, recompile the kernel again and repeat the
+      <command>gdb -k</command>
+      session until you know enough.</para>
+
+    <para>None of this is guaranteed to work, but it works fine in most
+      cases.</para>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-online-ddb">
+    <title>On-Line Kernel Debugging Using DDB</title>
+
+    <para>While <command>gdb -k</command> as an off-line debugger provides a very
+      high-level user interface, there are some things it cannot do, the
+      most important being breakpointing and single-stepping kernel
+      code.</para>
+
+    <para>If you need to do low-level debugging on your kernel, there is an
+      on-line debugger available called DDB.  It allows setting of
+      breakpoints, single-stepping kernel functions, examining and changing
+      kernel variables, etc.  However, it cannot access kernel source files,
+      and only has access to the global and static symbols, not to the full
+      debug information like <command>gdb</command> does.</para>
+
+    <para>To configure your kernel to include DDB, add the option line
+
+      <programlisting>options DDB</programlisting>
+
+      to your config file, and rebuild.  (See <link xlink:href="&url.books.handbook;/index.html">The FreeBSD Handbook</link> for details on
+      configuring the FreeBSD kernel).</para>
+
+    <note>
+      <para>If you have an older version of the boot blocks, your
+	debugger symbols might not be loaded at all.  Update the boot blocks;
+	the recent ones load the DDB symbols automatically.</para>
+    </note>
+
+    <para>Once your DDB kernel is running, there are several ways to enter
+      DDB.  The first, and earliest, way is to type the boot flag
+      <option>-d</option> right at the boot prompt.  The kernel will start up
+      in debug mode and enter DDB prior to any device probing.  Hence you can
+      even debug the device probe/attach functions.</para>
+
+    <para>The second scenario is to drop to the debugger once the
+      system has booted.  There are two simple ways to accomplish
+      this.  If you would like to break to the debugger from the
+      command prompt, simply type the command:</para>
+
+    <screen>&prompt.root; <userinput>sysctl debug.enter_debugger=ddb</userinput></screen>
+
+    <para>Alternatively, if you are at the system console, you may use
+      a hot-key on the keyboard.  The default break-to-debugger
+      sequence is <keycombo action="simul"><keycap>Ctrl</keycap>
+      <keycap>Alt</keycap><keycap>ESC</keycap></keycombo>.  For
+      syscons, this sequence can be remapped and some of the
+      distributed maps out there do this, so check to make sure you
+      know the right sequence to use.  There is an option available
+      for serial consoles that allows the use of a serial line BREAK on the
+      console line to enter DDB (<literal>options BREAK_TO_DEBUGGER</literal>
+      in the kernel config file).  It is not the default since there are a lot
+      of serial adapters around that gratuitously generate a BREAK
+      condition, for example when pulling the cable.</para>
+
+    <para>The third way is that any panic condition will branch to DDB if the
+      kernel is configured to use it.  For this reason, it is not wise to
+      configure a kernel with DDB for a machine running unattended.</para>
+
+    <para>The DDB commands roughly resemble some <command>gdb</command>
+      commands.  The first thing you probably need to do is to set a
+      breakpoint:</para>
+
+    <screen><userinput>b function-name</userinput>
+<userinput>b address</userinput></screen>
+
+    <para>Numbers are taken as hexadecimal by default, but to distinguish them
+      from symbol names, hexadecimal numbers starting with the letters
+      <literal>a-f</literal> need to be preceded with <literal>0x</literal>
+      (this is optional for other numbers).  Simple expressions are allowed,
+      for example: <literal>function-name + 0x103</literal>.</para>
+
+    <para>To continue the operation of an interrupted kernel, simply
+      type:</para>
+
+    <screen><userinput>c</userinput></screen>
+
+    <para>To get a stack trace, use:</para>
+
+    <screen><userinput>trace</userinput></screen>
+
+    <note>
+      <para>Note that when entering DDB via a hot-key, the kernel is currently
+	servicing an interrupt, so the stack trace might not be of much use
+	to you.</para>
+    </note>
+
+    <para>If you want to remove a breakpoint, use:</para>
+
+
+    <screen><userinput>del</userinput>
+<userinput>del address-expression</userinput></screen>
+
+    <para>The first form will be accepted immediately after a breakpoint hit,
+      and deletes the current breakpoint.  The second form can remove any
+      breakpoint, but you need to specify the exact address; this can be
+      obtained from:</para>
+
+    <screen><userinput>show b</userinput></screen>
+
+    <para>To single-step the kernel, try:</para>
+
+    <screen><userinput>s</userinput></screen>
+
+    <para>This will step into functions, but you can make DDB trace them until
+      the matching return statement is reached by:</para>
+
+    <screen><userinput>n</userinput></screen>
+
+    <note>
+      <para>This is different from <command>gdb</command>'s
+	<command>next</command> statement; it is like <command>gdb</command>'s
+	<command>finish</command>.</para>
+    </note>
+
+    <para>To examine data from memory, use (for example):
+
+      <screen><userinput>x/wx 0xf0133fe0,40</userinput>
+<userinput>x/hd db_symtab_space</userinput>
+<userinput>x/bc termbuf,10</userinput>
+<userinput>x/s stringbuf</userinput></screen>
+
+      for word/halfword/byte access, and hexadecimal/decimal/character/string
+      display.  The number after the comma is the object count.  To display
+      the next 0x10 items, simply use:</para>
+
+    <screen><userinput>x ,10</userinput></screen>
+
+    <para>Similarly, use
+
+      <screen><userinput>x/ia foofunc,10</userinput></screen>
+
+      to disassemble the first 0x10 instructions of
+      <function>foofunc</function>, and display them along with their offset
+      from the beginning of <function>foofunc</function>.</para>
+
+    <para>To modify memory, use the write command:</para>
+
+    <screen><userinput>w/b termbuf 0xa 0xb 0</userinput>
+<userinput>w/w 0xf0010030 0 0</userinput></screen>
+
+    <para>The command modifier
+      (<literal>b</literal>/<literal>h</literal>/<literal>w</literal>)
+      specifies the size of the data to be written, the first following
+      expression is the address to write to and the remainder is interpreted
+      as data to write to successive memory locations.</para>
+
+    <para>If you need to know the current registers, use:</para>
+
+    <screen><userinput>show reg</userinput></screen>
+
+    <para>Alternatively, you can display a single register value with, for example:
+
+      <screen><userinput>p $eax</userinput></screen>
+
+      and modify it by:</para>
+
+    <screen><userinput>set $eax new-value</userinput></screen>
+
+    <para>Should you need to call some kernel functions from DDB, simply
+      say:</para>
+
+    <screen><userinput>call func(arg1, arg2, ...)</userinput></screen>
+
+    <para>The return value will be printed.</para>
+
+    <para>For a &man.ps.1; style summary of all running processes, use:</para>
+
+    <screen><userinput>ps</userinput></screen>
+
+    <para>Now you have examined why your kernel failed, and you wish to
+      reboot.  Remember that, depending on the severity of previous
+      malfunctioning, not all parts of the kernel might still be working as
+      expected.  Perform one of the following actions to shut down and reboot
+      your system:</para>
+
+    <screen><userinput>panic</userinput></screen>
+
+    <para>This will cause your kernel to dump core and reboot, so you can
+      later analyze the core on a higher level with <command>gdb</command>.  This command
+      usually must be followed by another <command>continue</command>
+      statement.</para>
+
+    <screen><userinput>call boot(0)</userinput></screen>
+
+    <para>This might be a good way to cleanly shut down the running system,
+      <function>sync()</function> all disks, and finally reboot.  As long as
+      the disk and filesystem interfaces of the kernel are not damaged, the
+      result is an almost clean shutdown.</para>
+
+    <screen><userinput>call cpu_reset()</userinput></screen>
+
+    <para>This is the final way out of disaster and almost the same as hitting the
+      Big Red Button.</para>
+
+    <para>If you need a short command summary, simply type:</para>
+
+    <screen><userinput>help</userinput></screen>
+
+    <para>However, it is highly recommended to have a printed copy of the
+      &man.ddb.4; manual page ready for a debugging
+      session.  Remember that it is hard to read the on-line manual while
+      single-stepping the kernel.</para>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-online-gdb">
+    <title>On-Line Kernel Debugging Using Remote GDB</title>
+
+    <para>This feature has been supported since FreeBSD 2.2, and it is
+      actually a very neat one.</para>
+
+    <para>GDB has supported <emphasis>remote debugging</emphasis> for
+      a long time.  This is done using a very simple protocol over a serial
+      line.  Unlike the other methods described above, you will need two
+      machines for this.  One is the host providing the debugging
+      environment, including all the sources and a copy of the kernel binary
+      with all the symbols in it, and the other is the target machine, which
+      simply runs a copy of the very same kernel (but stripped of the
+      debugging information).</para>
+
+    <para>Configure the kernel in question with <command>config
+      -g</command>, include <option>DDB</option> in the configuration, and
+      compile it as usual.  This produces a large binary because of the
+      debugging information.  Copy this kernel to the target machine, strip
+      the debugging symbols off with <command>strip -x</command>, and boot it
+      using the <option>-d</option> boot option.  Connect the serial line
+      of the target machine that has "flags 0x80" set on its sio device
+      to any serial line of the debugging host.
+      Now, on the debugging machine, go to the compile directory of the target
+      kernel, and start <command>gdb</command>:</para>
+
+    <screen>&prompt.user; <userinput>gdb -k kernel</userinput>
+GDB is free software and you are welcome to distribute copies of it
+ under certain conditions; type "show copying" to see the conditions.
+There is absolutely no warranty for GDB; type "show warranty" for details.
+GDB 4.16 (i386-unknown-freebsd),
+Copyright 1996 Free Software Foundation, Inc...
+<prompt>(kgdb)</prompt> </screen>
+
+    <para>Initialize the remote debugging session (assuming the first serial
+      port is being used) by:</para>
+
+    <screen><prompt>(kgdb)</prompt> <userinput>target remote /dev/cuaa0</userinput></screen>
+
+    <para>Now, on the target host (the one that entered DDB right before even
+      starting the device probe), type:</para>
+
+    <screen>Debugger("Boot flags requested debugger")
+Stopped at Debugger+0x35: movb	$0, edata+0x51bc
+<prompt>db&gt;</prompt> <userinput>gdb</userinput></screen>
+
+    <para>DDB will respond with:</para>
+
+    <screen>Next trap will enter GDB remote protocol mode</screen>
+
+    <para>Every time you type <command>gdb</command>, the mode is toggled
+      between remote GDB and local DDB.  To force the next trap
+      immediately, simply type <command>s</command> (step).  GDB on the
+      debugging host will now gain control over the target kernel:</para>
+
+    <screen>Remote debugging using /dev/cuaa0
+Debugger (msg=0xf01b0383 "Boot flags requested debugger")
+    at ../../i386/i386/db_interface.c:257
+<prompt>(kgdb)</prompt></screen>
+
+    <para>You can use this session almost like any other GDB session, including
+      full access to the source and running it in gud-mode inside an Emacs window
+      (which gives you an automatic source code display in another Emacs
+      window).</para>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-kld">
+    <title>Debugging Loadable Modules Using GDB</title>
+
+    <para>When debugging a panic that occurred within a module, or
+      using remote GDB against a machine that uses dynamic modules,
+      you need to tell GDB how to obtain symbol information for those
+      modules.</para>
+
+    <para>First, you need to build the module(s) with debugging
+      information:</para>
+
+    <screen>&prompt.root; <userinput>cd /sys/modules/linux</userinput>
+&prompt.root; <userinput>make clean; make COPTS=-g</userinput></screen>
+
+    <para>If you are using remote GDB, you can run
+      <command>kldstat</command> on the target machine to find out
+      where the module was loaded:</para>
+
+    <screen>&prompt.root; <userinput>kldstat</userinput>
+Id Refs Address    Size     Name
+ 1    4 0xc0100000 1c1678   kernel
+ 2    1 0xc0a9e000 6000     linprocfs.ko
+ 3    1 0xc0ad7000 2000     warp_saver.ko
+ 4    1 0xc0adc000 11000    linux.ko</screen>
+
+    <para>If you are debugging a crash dump, you will need to walk the
+      <literal>linker_files</literal> list, starting at
+      <literal>linker_files-&gt;tqh_first</literal> and following the
+      <literal>link.tqe_next</literal> pointers until you find the
+      entry with the <literal>filename</literal> you are looking for.
+      The <literal>address</literal> member of that entry is the load
+      address of the module.</para>
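+
+    <para>A minimal sketch of that walk in a <command>kgdb</command>
+      session follows.  It assumes the list head and member names given
+      above; the convenience variable <varname>$lf</varname> is an
+      arbitrary name chosen for this illustration.  Output is omitted
+      because it depends on the modules loaded on the crashed system.
+      Repeat the third and fourth commands until the
+      <literal>filename</literal> you are looking for is printed, then
+      print the <literal>address</literal> member:</para>
+
+    <screen><prompt>(kgdb)</prompt> <userinput>set $lf = linker_files-&gt;tqh_first</userinput>
+<prompt>(kgdb)</prompt> <userinput>print $lf-&gt;filename</userinput>
+<prompt>(kgdb)</prompt> <userinput>set $lf = $lf-&gt;link.tqe_next</userinput>
+<prompt>(kgdb)</prompt> <userinput>print $lf-&gt;filename</userinput>
+<prompt>(kgdb)</prompt> <userinput>print $lf-&gt;address</userinput></screen>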
+
+    <para>Next, you need to find out the offset of the text section
+      within the module:</para>
+
+    <screen>&prompt.root; <userinput>objdump --section-headers /sys/modules/linux/linux.ko | grep text</userinput>
+  3 .rel.text     000016e0  000038e0  000038e0  000038e0  2**2
+ 10 .text         00007f34  000062d0  000062d0  000062d0  2**2</screen>
+
+    <para>The one you want is the <literal>.text</literal> section,
+      section 10 in the above example.  The fourth hexadecimal field
+      (sixth field overall) is the offset of the text section within
+      the file. Add this offset to the load address of the module to
+      obtain the relocation address for the module's code. In our
+      example, we get 0xc0adc000 + 0x62d0 = 0xc0ae22d0.  Use the
+      <command>add-symbol-file</command> command in GDB to tell the
+      debugger about the module:</para>
+
+    <screen><prompt>(kgdb)</prompt> <userinput>add-symbol-file /sys/modules/linux/linux.ko 0xc0ae22d0</userinput>
+add symbol table from file "/sys/modules/linux/linux.ko" at text_addr = 0xc0ae22d0?
+(y or n) <userinput>y</userinput>
+Reading symbols from /sys/modules/linux/linux.ko...done.
+<prompt>(kgdb)</prompt></screen>
+
+    <para>You should now have access to all the symbols in the
+      module.</para>
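+
+    <para>If you want to double-check the relocation address, GDB can
+      do the hexadecimal arithmetic for you.  The sketch below uses the
+      load address and text offset from the example in this section;
+      substitute your own values.  The value history number
+      (<literal>$1</literal>) may differ in your session:</para>
+
+    <screen><prompt>(kgdb)</prompt> <userinput>print/x 0xc0adc000 + 0x62d0</userinput>
+$1 = 0xc0ae22d0</screen>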
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-console">
+    <title>Debugging a Console Driver</title>
+
+    <para>Since you need a console driver to run DDB on, things are more
+      complicated if the console driver itself is failing.  Remember that
+      you can use a serial console (either with modified boot blocks, or by
+      specifying <option>-h</option> at the <prompt>Boot:</prompt> prompt)
+      and hook up a standard terminal to your first serial port.  DDB works
+      on any configured console driver, including a serial
+      console.</para>
+  </sect1>
+
+  <sect1 xml:id="kerneldebug-deadlocks">
+    <title>Debugging Deadlocks</title>
+
+    <para>You may experience so-called deadlocks, a situation in which the
+      system stops doing useful work. To provide a helpful bug report
+      in this situation, use DDB as described above. Include
+      the output of <command>ps</command> and
+      <command>trace</command> for the suspected processes in the
+      report.</para>
+
+    <para>If possible, consider investigating further. The recipe
+      below is especially useful if you suspect that the deadlock occurs
+      in the VFS layer. Add the following options
+      <programlisting>makeoptions		DEBUG=-g
+	options		INVARIANTS
+	options		INVARIANT_SUPPORT
+	options		WITNESS
+	options		DEBUG_LOCKS
+	options		DEBUG_VFS_LOCKS
+	options		DIAGNOSTIC</programlisting>
+
+      to the kernel config. When a deadlock occurs, in addition to the
+      output of the <command>ps</command> command, provide the output
+      of <command>show allpcpu</command>, <command>show
+      alllocks</command>, <command>show lockedvnods</command> and
+      <command>show alltrace</command>.</para>
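+
+    <para>A DDB session that collects this information might look like
+      the following sketch.  It uses only the commands named above, and
+      it assumes a kernel built with the listed options.  The output is
+      omitted because it varies from system to system:</para>
+
+    <screen><prompt>db&gt;</prompt> <userinput>ps</userinput>
+<prompt>db&gt;</prompt> <userinput>show allpcpu</userinput>
+<prompt>db&gt;</prompt> <userinput>show alllocks</userinput>
+<prompt>db&gt;</prompt> <userinput>show lockedvnods</userinput>
+<prompt>db&gt;</prompt> <userinput>show alltrace</userinput></screen>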
+
+    <para>For threaded processes, to obtain meaningful backtraces, use
+      <command>thread thread-id</command> to switch to the thread's
+      stack, and then obtain a backtrace with <command>where</command>.</para>
+  </sect1>
+</chapter>