22 Kernel Debugging

Ladebug supports kernel debugging, which is a task normally performed by systems engineers or system administrators. A systems engineer might debug a kernel space program, which is built as part of the kernel and which references kernel data structures. A systems administrator might debug a kernel when a process is hung, or kernel parameters need to be examined or modified, or the operating system hangs, panics, or crashes. Kernel debugging aids in analyzing crash dumps.

Security

You may need to be the superuser (root login) to examine either the running system or crash dumps. Whether or not you need to be the superuser depends on the directory and file protections for the files you attempt to examine.

Compiling a Kernel for Debugging

Compilation of a kernel should be done without full optimization and without stripping the kernel of its symbol table information. Otherwise, your ability to debug the kernel is greatly reduced.

By default, compilation does not strip the symbol table information. By default, optimization is only partial. If you do not change these defaults, there should not be a problem.

Adding or Deleting Symbol Table Information

From within the debugger, you can selectively add or delete symbol table information for a kernel image, with the addstb or delstb commands. These commands can be useful because symbol table information can impact debugger performance and take up considerable disk space. The syntax is as follows:

addstb kernel_image

delstb kernel_image

Patching a Disk File

From within the debugger, you can use the patch command to correct bad data or instructions in an executable disk file. The text, initialized data, or read-only data areas can be patched. The bss segment cannot be patched because it does not exist in disk files.

The syntax is as follows:

patch expression1 = expression2

For example,

(ladebug)  patch @foo = 20

Setting the Thread Context

The debugger variable $tid contains the thread identifier of the current thread. The $tid value is updated implicitly by the debugger when program execution stops or completes.

You can modify the current thread context by setting $tid to a valid thread identifier.

When there is no process or program, $tid is set to 0.

The debugger variable $tid is the same as $curthread except that $tid is used for kernel debugging.

Summary and Additional Information

The remainder of this chapter briefly describes the use of Ladebug to

Perform local kernel debugging
Analyze crash dumps
Perform remote kernel debugging with the kdebug debugger

You can find additional information on kernel debugging in

The kdbx(8) reference page
The kdebug(8) reference page
The DEC OSF/1 Kernel Debugging manual

The kernel debugging functionality supported by Ladebug is very similar to the functionality described in the dbx sources listed above, with the term ladebug substituted for the term dbx.

22.1 Local Kernel Debugging

When you have a problem with a process, you can debug the running kernel or examine the values assigned to system parameters. (It is generally recommended that you avoid modifying the value of the parameters, which can cause problems with the kernel.)

Invoke the debugger with the following command:

#  ladebug -k /vmunix /dev/mem

The -k flag maps virtual to physical addresses to enable local kernel debugging. The /vmunix and /dev/mem parameters cause the debugger to operate on the running kernel.

Now you can use Ladebug commands to display the current process identification numbers (process IDs), and trace the execution of processes. The following example shows the use of the command kps (which is the alias of the show process command) to display the process IDs:

(ladebug) kps

  PID     COMM
00000     kernel idle
00001     init
00014     kloadsrv
00016     update
  .
  .
  .

The Ladebug commands cont, next, rerun, run, step, and stop are not available when you do local kernel debugging, and you cannot set registers. (Stopping the kernel would also stop the debugger.)

If you want to examine the stack of, for example, the kloadsrv daemon, you set the $pid symbol to its process ID (14) and enter the where command, as in the following example (the spacing in the example has been altered to fit the page):

(ladebug) set $pid = 14
(ladebug) where

>  0 thread_block()
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/kern/sched_prim.c":1623, 0xfffffc000043d77c]
   1 mpsleep(0xffffffff92586f00, 0x11a, 0xfffffc0000279cf4, 0x0, 0x0)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/kern_synch.c":411, 0xfffffc000040adc0]
   2 sosleep(0xffffffff92586f00, 0x1, 0xfffffc000000011a, 0x0, 0xffffffff81274210)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_socket2.c":654, 0xfffffc0000254ff8]
   3 sosbwait(0xffffffff92586f60, 0xffffffff92586f00, 0x0, 0xffffffff92586f00, 0x10180)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_socket2.c":630, 0xfffffc0000254f64]
   4 soreceive(0x0, 0xffffffff9a64f658, 0xffffffff9a64f680, 0x8000004300000000, 0x0)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_socket.c":1297, 0xfffffc0000253338]
   5 recvit(0xfffffc0000456fe8, 0xffffffff9a64f718, 0x14000c6d8, 0xffffffff9a64f8b8, 0xfffffc000043d724)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_syscalls.c":1002, 0xfffffc00002574f0]
   6 recvfrom(0xffffffff81274210, 0xffffffff9a64f8c8, 0xffffffff9a64f8b8, 0xffffffff9a64f8c8, 0xfffffc0000457570)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_syscalls.c":860, 0xfffffc000025712c]
   7 orecvfrom(0xffffffff9a64f8b8, 0xffffffff9a64f8c8, 0xfffffc0000457570, 0x1, 0xfffffc0000456fe8)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/bsd/uipc_syscalls.c":825, 0xfffffc000025708c]
   8 syscall(0x120024078, 0xffffffffffffffff, 0xffffffffffffffff, 0x21, 0x7d)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/arch/alpha/syscall_trap.c":515, 0xfffffc0000456fe4]
   9 _Xsyscall(0x8, 0x12001acb8, 0x14000eed0, 0x4, 0x1400109d0)
["/usr/sde/osf1/build/goldos.nightly1/src/kernel/arch/alpha/locore.s":1046, 0xfffffc00004486e4]

Examining the stack trace may reveal the problem. Then you can modify parameters, restart daemons, or take other corrective actions.
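The step from the kps listing to the set $pid command can be scripted when you work from saved debugger output. The following is a minimal sketch, not part of Ladebug: pid_of is a hypothetical helper name, and the sketch assumes the two-column PID/COMM layout shown in the kps example and a single-word command name.

```shell
#!/bin/sh
# pid_of NAME: read saved kps output on standard input and print the
# decimal process ID of the first entry whose COMM column equals NAME.
# Hypothetical helper; assumes the PID/COMM layout shown in this section.
pid_of() {
    awk -v name="$1" '
        # Only consider rows whose first field is all digits (skips the header).
        $1 ~ /^[0-9]+$/ && $2 == name {
            print $1 + 0    # adding 0 drops the leading zeros
            exit
        }
    '
}

# Example with the listing from this section:
printf '  PID     COMM\n00000     kernel idle\n00001     init\n00014     kloadsrv\n' |
    pid_of kloadsrv    # prints 14
```

The printed value is the number to use in set $pid, as in the kloadsrv example.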

The kdbx Interface

The kdbx interface is a crash analysis and kernel debugging tool. It serves as a front end to the Ladebug debugger. The kdbx interface is extensible, customizable, and insensitive to changes to offsets and sizes of fields in structures. The only dependencies on kernel header files are for bit definitions in flag fields.

The kdbx interface has facilities for interpreting various symbols and kernel data structures. It can format and display these symbols and data structures in the following ways:

In a predefined form as specified in the source code modules that currently accompany the kdbx interface
As defined in user-written source code modules according to a standardized format for the contents of the kdbx modules

The Ladebug commands (except signals such as Ctrl/P) are available when you use the kdbx interface. (Many of these commands have aliases that match dbx commands, for the convenience of users who are accustomed to debugging kernels with the dbx debugger.) In general, kdbx assumes hexadecimal addresses for commands that perform input and output.

The sections that follow explain using kdbx to debug kernel programs.

Beginning a kdbx Session

Using the kdbx interface, you can examine either the running kernel or dump files created by the savecore utility. In either case, you examine an object file and a core file. For running systems, these files are usually /vmunix and /dev/mem, respectively. The savecore utility saves dump files it creates in the directory specified by the /sbin/init.d/savecore script. By default, the savecore utility saves dump files in the /var/adm/crash directory.

To examine a running system, enter the kdbx command with the following parameters:

     # kdbx -k /vmunix /dev/mem

When you begin a debugging session, kdbx reads and executes the commands in the system initialization file /var/kdbx/system.kdbxrc. The initialization file contains setup commands and alias definitions. (For a list of kdbx aliases, see the kdbx(8) reference page.) You can further customize the kdbx environment by adding commands and aliases to one of the following initialization files:

/var/kdbx/site.kdbxrc, which contains customized commands and alias definitions for a particular system
~/.kdbxrc, which contains customized commands and alias definitions for a specific user
./.kdbxrc, which contains customized commands and alias definitions for a specific project (this file must reside in the current working directory when kdbx is invoked)

The kdbx Interface Commands

The kdbx interface provides the following commands:

alias [name] [command-string]

Sets or displays aliases. If you omit all arguments, alias displays all aliases. If you specify the variable name, alias displays the alias for name, if one exists. If you specify name and command-string, alias establishes name as an alias for command-string.

context proc | user

Sets context to the user's aliases or the extension's aliases. This command is used only by the extensions.

coredata start_address end_address

Dumps, in hexadecimal, the contents of the core file starting at start_address and ending before end_address.

ladebug command-string

Passes the variable command-string to Ladebug. Specifying ladebug is optional; if kdbx does not recognize a command, it passes the command to Ladebug automatically.

help [-long] [args]

Prints help text.

proc [flags] [extension] [arguments]

Executes an extension and gives it control of the kdbx session until it quits. The variable extension specifies the named extension file and passes arguments to it as specified by the variable arguments. Valid flags are as follows:

-debug

Causes input to and output from the extension to be displayed on the screen.

-pipe in_pipe out_pipe

Used in conjunction with the dbx debugger for debugging extensions.

-print_output

Causes the output of the extension to be sent to the invoker of the extension without interpretation as kdbx commands.

-redirect_output

Used by extensions that execute other extensions to receive the output from the called extensions. Otherwise, the user receives the output.

-tty

Causes kdbx to communicate with the subprocess through a terminal line instead of pipes. If you specify the -pipe flag, proc ignores it.

print string

Displays string on the terminal. If this command is used by an extension, the terminal receives no output.

quit

Exits the kdbx interface.

source [-x] [file(s)]

Reads and interprets files as kdbx commands in the context of the current aliases. If you specify the -x flag, the debugger displays commands as they are executed.

unalias name

Removes the alias, if any, from name.

The kdbx interface contains many predefined aliases, which are defined in the kdbx startup file /var/kdbx/system.kdbxrc.

Using kdbx Extensions

In addition to its commands, the kdbx interface provides extensions. You execute extensions using the kdbx command proc. For example, to execute the arp extension, you enter the following command:

kdbx> proc arp

You can create your own kdbx extensions.

For more information on the extensions, see the DEC OSF/1 Kernel Debugging manual.

22.2 Crash Dump Analysis

If your system panics or crashes, you can often find the cause by using either Ladebug or kdbx to analyze a crash dump.

The operating system can crash in the following ways:

Hardware trap - A hardware problem often results in the kernel trap() function being invoked.
Software panic - A software panic, resulting from a software failure, calls the kernel panic() function.
Hung system - When the system hangs, you can force the creation of dump files.

If the system crashes because of a hardware fault or an unrecoverable software state, a dump function is invoked. The dump function copies the core memory into the primary default swap disk area as specified by the /etc/fstab file structure table and the /sbin/swapdefault file. At system reboot time, the information is copied into a file, called a crash dump file.

You can analyze the crash dump file to determine what caused the crash. For example, if a hardware trap occurred, you can examine variables, such as savedefp, the program counter (pc), and the stack pointer (sp), to help you determine why the crash occurred. If a software panic caused the crash, you can use the Ladebug debugger to examine the crash dump and the uerf utility to examine the error log. Using these tools, you can determine what function called the panic() routine.

Crash dump files, such as vmunix.n and vmcore.n, usually reside in the /var/adm/crash directory. The version number (n in vmunix.n and vmcore.n) must be the same for the two files you examine together.

For example, you might use the following command to examine dump files:

# ladebug -k vmunix.1 vmcore.1
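Because the two files must carry the same number, it can help to list which numbers have a complete pair before you start the debugger. The following is a minimal sketch under stated assumptions: matched_dumps is a hypothetical helper name, and the default directory is the /var/adm/crash location mentioned above.

```shell
#!/bin/sh
# matched_dumps [DIR]: print each crash number n for which both
# vmunix.n and vmcore.n exist in DIR (default: /var/adm/crash).
# Hypothetical helper for illustration only.
matched_dumps() {
    dir=${1:-/var/adm/crash}
    for core in "$dir"/vmcore.*; do
        [ -e "$core" ] || continue      # the pattern matched no files
        n=${core##*.}                   # text after the last dot
        case $n in
            ''|*[!0-9]*) continue ;;    # skip non-numeric suffixes such as .Z
        esac
        if [ -e "$dir/vmunix.$n" ]; then
            echo "$n"
        fi
    done
}
```

If matched_dumps prints 1, for example, the pair vmunix.1 and vmcore.1 is complete and can be passed to ladebug -k as shown above.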

Examining the Exception Frame

When you debug your code by working with a crash dump file, you can examine the exception frame using Ladebug. The variable savedefp contains the location of the exception frame. (No exception frames are created when you force a system to dump.) Refer to the header file /usr/include/machine/reg.h to determine where registers are stored in the exception frame. The following example shows an exception frame:

(ladebug) print savedefp/33X

ffffffff9618d940:   0000000000000000 fffffc000046f888
ffffffff9618d950:   ffffffff86329ed0 0000000079cd612f
   .
   .
   .
ffffffff9618da30:   0000000000901402 0000000000001001
ffffffff9618da40:   0000000000002000

Extracting the Character Message Buffer

You can use Ladebug to extract the preserved message buffer from a running system or dump files to display system messages logged by the kernel. For example:

(ladebug) print *pmsgbuf

struct {
       msg_magic = 405601
       msg_bufx = 1181
       msg_bufr = 1181
       msg_bufc = "Alpha boot: memory from 0x68a000 to 0x6000000
DEC OSF/1 T1.2-2   (Rev. 5); Thu Dec 03 11:20:36 EST 1992
physical memory = 94.00 megabytes.
available memory = 83.63 megabytes.
using 360 buffers containing 2.81 megabytes of memory
tc0 at nexus
scc0 at tc0 slot 7
asc0 at tc0 slot 6
rz1 at asc0 bus 0 target 1 lun 0 (DEC        RZ25        (C) DEC 0700)
rz2 at asc0 bus 0 target 2 lun 0 (DEC        RZ25        (C) DEC 0700)
rz3 at asc0 bus 0 target 3 lun 0 (DEC        RZ26        (C) DEC T384)
rz4 at asc0 bus 0 target 4 lun 0 (DEC        RRD42     (C) DEC   4.5d)
tz5 at asc0 bus 0 target 5 lun 0 (DEC        TLZ06        (C)DEC 0374)
asc1 at tc0 slot 6
fb0 at tc0 slot 8
  1280X1024
ln0: DEC LANCE Module Name: PMAD-BA
ln0 at tc0 slot 7
ln0: DEC LANCE Ethernet Interface, hardware address: 08:00:2b:2c:f6:9f
DEC3000 - M500 system
Firmware revision: 1.1
PALcode: OSF version 1.14
lvm0: configured.
lvm1: configured.
setconf: bootdevice_parser translated 'SCSI 0 6 0 0 300 0 FLAMG-IO' to 'rz3' " }
(ladebug)

The crashdc Utility

The crashdc utility collects critical data from operating system crash dump files or from a running kernel. You can use the data it collects to analyze the cause of a system crash. The crashdc utility uses existing system tools and utilities to extract information from crash dumps. The information garnered from crash dump files or from the running kernel includes the hardware and software configuration, current processes, the panic string (if any), and swap information.

The crashdc utility is invoked each time the system is booted. If it finds a current crash dump, crashdc creates a data collection file with the same numerical file name extension as the crash dump.

You can also invoke crashdc manually. The syntax of the command for invoking the data collection script is as follows:

/bin/crashdc vmunix.n vmcore.n

See the DEC OSF/1 Kernel Debugging manual for an example of the output from the crashdc command.

Managing Crash Dump File Creation

To ensure that you are able to analyze crash dump files following a system crash, you must understand the crash dump file creation process. This process requires that you reserve space on the system for crash dump files. The amount of space you save depends upon your system configuration and the type of crash dump file you want the system to create.

Saving Dumps to a File System

When the system reboots, it attempts to save a crash dump from the crash dump partition to a file system. The savecore utility (/sbin/savecore), which is invoked during system startup before the dump partition is accessed, checks to see if the system crashed or was rebooted. If the system crashed within the last three days, the savecore utility performs the following tasks as the system reboots:

Checks to see if a dump has been made within the last three days and that there is enough space to save it.
Saves the dump file and kernel image into a specified directory. The default files for the kernel image and the dump file are vmunix.n and vmcore.n, respectively.
The variable n gives the number of the crash. The number of the crash is recorded in the bounds file. After the first crash, the bounds file is created in the crash dump directory and the value one is stored in it. That value is incremented for each succeeding crash.
Logs a reboot message using the facility LOG_CRIT, which logs critical conditions. For more information, refer to the syslog(3) reference page.
Logs the panic string in both the ASCII and binary error log files, if the system crashed as a result of a panic.
Attempts to save the kernel syslog message buffer from the dump files. The msgbuf.err entry in /etc/syslog.conf file specifies the file name and location for the msgbuf dump file. The default /etc/syslog.conf file specification is as follows:
```
msgbuf.err                        /var/adm/crash/msgbuf.savecore
```
If the msgbuf.err entry is not specified in the /etc/syslog.conf file, the msgbuf dump file is not saved. The msgbuf dump file cannot be forwarded to any system.
When the syslogd daemon is later initialized, it checks for the msgbuf dump file. If a msgbuf dump file is found, syslogd processes the file and then deletes it.
Creates the file binlogdumpfile.n in the /var/adm/crash directory. The variable n is determined by the value of the bounds file.
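The numbering scheme in the list above can be sketched as a small shell function that reports the file names the next saved dump would receive. This is an illustration only: next_dump_names is a hypothetical helper, and the sketch assumes that the bounds file holds the number to be used for the next dump and that a missing bounds file means numbering starts at 0. Verify both assumptions against the savecore documentation for your system.

```shell
#!/bin/sh
# next_dump_names [DIR]: print the kernel image and dump file names that
# the next saved crash dump is expected to use in DIR.
# Assumptions (not confirmed by this manual): the bounds file stores the
# number for the NEXT dump, and no bounds file means the number is 0.
next_dump_names() {
    dir=${1:-/var/adm/crash}
    n=0
    if [ -r "$dir/bounds" ]; then
        n=$(tr -dc '0-9' < "$dir/bounds")   # keep only the digits
        n=${n:-0}                           # empty file: fall back to 0
    fi
    echo "vmunix.$n vmcore.$n"
}
```

Run against a crash directory whose bounds file contains 1, the function prints vmunix.1 vmcore.1.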

You can modify the system default for the location of dump files by using the rcmgr command to specify another directory path for the /sbin/savecore utility:

# /usr/sbin/rcmgr set SAVECORE_DIR </newpath>

The /sbin/init.d/savecore script invokes the /sbin/savecore utility.

Crash Dump Files

Crash dump files are either partial (the default) or full. The following sections describe each type of file and explain how to allocate the proper amount of space in the crash dump partition and file system.

Partial Crash Dump Files

Unlike a full crash dump, a partial crash dump file has a size proportional to the amount of system activity at the time of the crash: the higher the level of system activity and the larger the amount of memory in use, the larger the partial crash dump file. For example, when a system with 96 megabytes (MB) of memory crashes, it creates a vmcore.n file of 10 to 96 MB (depending upon system activity) and a vmunix.n file of approximately 6 MB.

Note

If you compress a core dump file from a partial crash dump, you must use care in decompressing it. Using the uncompress command with no options results in a core file equal to the size of memory. To ensure that the decompressed core file remains at its partial dump size, you need to use the uncompress command with the -c option and the dd command with the conv=sparse option. For example, to decompress a core file named vmcore.0.Z, enter the following command:

# uncompress -c vmcore.0.Z | dd of=vmcore.0 conv=sparse
262144+0 records in
262144+0 records out

Full Crash Dump Files

Full crash dump files can be very large because the vmunix.n file is a copy of the running kernel and the vmcore.n file is slightly larger than the amount of physical memory on the system that crashed. For example, when a system with 96 MB of memory crashes, it creates a vmcore.n file of approximately 96 MB and a vmunix.n file of approximately 6 MB.

Selecting a Crash Dump Type

The default is to use partial crash dumps. If you want to use full dumps, you can modify the default behavior in the following ways:

By specifying the d flag to the boot_osflags console environment variable.
By modifying the kernel's partial_dump variable to 0 using the Ladebug debugger (discussed in Chapter 8) as follows:
```
(ladebug) a partial_dump = 0
```
A partial_dump value of 1 indicates that partial dumps are to be generated.

Determining Crash Dump Partition Size

If you intend to save full crash dumps, you need to reserve disk space equal to the size of memory, plus one additional block for the dump header. For example, if your system has 128 MB of memory, you need a crash dump partition of at least 128 MB, plus one block (512 bytes).

If you intend to save partial crash dumps, the size of the disk partition may vary, depending upon system activity. For example, for a system with 128 MB of memory, if peak system activity is low (never using more than 60 MB of memory), the size of the crash dump partition can be 60 MB. If peak system activity is high (using all of memory), 128 MB of disk space is needed.

If full dumps are turned on and there is not enough disk space to create dump files for a full dump, partial dumps are automatically invoked.
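The full-dump sizing rule above is simple arithmetic: the size of memory plus one 512-byte block. A minimal sketch follows; full_dump_bytes is a hypothetical helper name, and the sketch assumes that 1 MB means 1048576 bytes.

```shell
#!/bin/sh
# full_dump_bytes MEM_MB: bytes to reserve in the crash dump partition
# for a full dump: physical memory plus one 512-byte block for the
# dump header. Assumes 1 MB = 1048576 bytes.
full_dump_bytes() {
    echo $(( $1 * 1024 * 1024 + 512 ))
}

full_dump_bytes 128    # the 128 MB example from the text; prints 134218240
```

For partial dumps, substitute the peak amount of memory in use (60 MB in the low-activity example) for the total memory size.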

Determining File System Space for Saving Crash Dumps

The size of the file system needed for saving crash dumps depends on the size and the number of crash dumps you want to retain. A general guideline is to reserve, at a minimum, the size of your crash dump partition, plus 10 MB. If necessary, you can increase this amount later.
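The guideline above can be turned into a quick check of the free space in the crash dump directory. The following sketch is illustrative: fs_has_room is a hypothetical helper that takes the partition size in megabytes and the available space in kilobytes, so that the comparison itself does not depend on the output format of any particular df implementation.

```shell
#!/bin/sh
# fs_has_room PARTITION_MB AVAIL_KB: succeed if AVAIL_KB (free space, in
# kilobytes) is at least the guideline minimum of the crash dump
# partition size plus 10 MB.
fs_has_room() {
    need_kb=$(( ($1 + 10) * 1024 ))
    [ "$2" -ge "$need_kb" ]
}

# Example: a 128 MB dump partition needs at least 138 MB (141312 KB) free.
if fs_has_room 128 141312; then
    echo "enough space for one dump"
fi
# The available figure could come from a command such as
#   df -k /var/adm/crash
# but the column holding free space varies between systems, so that
# extraction is left out of this sketch.
```

Retaining several dumps requires correspondingly more space than this minimum.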

If your system cannot save a crash dump due to insufficient disk space, it returns to single user mode. This return to single user mode prevents system swapping from corrupting the dump file. You can then make space available in the crash dump directory, or in the alternate directory you specified, before continuing to multiuser mode. You can override this behavior using the following command:

# /usr/sbin/rcmgr set SAVECORE_FLAGS M

This command causes the system to always boot to multiuser mode even if it cannot save a dump.

Procedures for Creating Dumps of a Hung System

If necessary, you can force the system to create dump files when the system hangs. The method for forcing crash dumps varies according to the hardware platform. The methods are described in the DEC OSF/1 Kernel Debugging manual.

Guidelines for Examining Crash Dump Files

In examining crash dump files, there is no one way to determine the cause of a system crash. However, the following guidelines should assist you in identifying the events that led to the crash:

Gather some facts about the system (for example, operating system type, version number, revision level, and hardware configuration).
Look at the panic string, if one exists. This string is contained in the preserved message buffer, pmsgbuf, and in the panicstr global variable.
Locate the thread executing at the time of the crash. (Use the where command.) Most likely, this thread will contain the events that led to the panic.
Determine whether you can fix the problem. If the system crashed because of lack of resources (for example, swap space), you can probably eliminate the problem by adding more of that resource.
If the problem is with the software, you may need to file a report with your local Digital Customer Support Center.

For more information, and for examples, see the DEC OSF/1 Kernel Debugging manual. This manual contains detailed information on the following topics related to crash dump analysis:

The crashdc utility
Managing crash dump file creation
Saving dumps to a file system
Selecting full or partial crash dump files
Determining crash dump partition size and file system space
Procedures (according to hardware platform) for creating dumps of a hung system

Note: Crash dump analysis is possible only with local, not remote, kernel debugging.

22.3 Remote Kernel Debugging with the kdebug Debugger

For remote kernel debugging, Ladebug is used in conjunction with the kdebug debugger, which is a tool for executing, testing, and debugging test kernels. The kdebug code runs inside the kernel to be debugged on a test system, while Ladebug runs on a remote system and communicates with kdebug over a serial line or a gateway system.

You use Ladebug commands to start and stop kernel execution, examine variable and register values, and perform other debugging tasks, just as you would when debugging user space programs. The kdebug debugger, not Ladebug, performs the actual reads and writes to registers, memory, and the image itself (for example, when breakpoints are set).

Connections Needed

The kernel code to be debugged runs on a test system. Ladebug runs on a remote build system and communicates with the kernel code over a serial communication line or through a gateway system.

You use a gateway system when you cannot physically connect the test and build systems. The build system is connected to the gateway system by a network line. The gateway system is connected to the test system by a serial communication line.

The following diagram shows the physical connection of the test and build systems (with no gateway):

  Build system          Serial line         Test system
(with Ladebug) <---------------------> (kernel code here)

The following diagram shows the connections when you use a gateway system:

  Build system       Network    Gateway      Serial line         Test system
(with Ladebug) <-----------> system <---------------------> (kernel code here)
                                with
                                kdebug
                                daemon

System Requirements

The test, build, and (if used) gateway systems must meet the following requirements for kdebug:

Test system
Must be running Version 2.0 or higher of the Digital UNIX operating system, must have the Kernel Debugging Tools subset loaded, and must have the Kernel Breakpoint Debugger kernel option configured.
Build system
Must be running Version 3.2 or higher of the Digital UNIX operating system. Also, this system must contain a copy of the kernel code you are testing and, preferably, the source used to build that kernel code.
Gateway system
Must be running Version 2.0 or higher of the Digital UNIX operating system, and must have the Kernel Debugging Tools subset loaded.

Getting Ready to Use the kdebug Debugger

To use the kdebug debugger, first do the following:

Attach the test system and the build system or test system and gateway system. See your hardware documentation for information about connecting systems to serial lines and networks.
Configure the kernel to be debugged with the configuration file option OPTIONS KDEBUG. If you are debugging the installed kernel, you can do this by selecting KERNEL BREAKPOINT DEBUGGING from the kernel options menu.
Recompile kernel files, if necessary. By default, the kernel is compiled with only partial debugging information, occasionally causing Ladebug to display erroneous arguments or mismatched source lines. To correct this, recompile selected source files specifying the CDEBUGOPTS=-g argument.
Copy the kernel to be tested to /vmunix on the test system. Retain an exact copy of this image on the build system.
Install the Product Authorization Key (PAK) for the Developer's kit (OSF-DEV), if it is not already installed. For information about installing PAKs, see the Installation Guide.
Determine the debugger variable settings or command-line options you will use, as follows:
Debugger variables:

On the build system, add the following lines to your .dbxinit file if you need to override the default values (and you choose not to use the corresponding options, described below). Alternatively, you can use these lines within the debugger session, at the (ladebug) prompt:
```
          set $kdebug_host="gateway_system"
          set $kdebug_line="serial_line"
          set $kdebug_dbgtty="tty"
```
$kdebug_host specifies the node or address of the gateway system. By default, $kdebug_host is set to localhost, for when a gateway system is not used.
$kdebug_line specifies the serial line to use as defined in the /etc/remote file of the build system (or the gateway system, if one is being used). By default, $kdebug_line is set to kdebug .
$kdebug_dbgtty sets the terminal on the gateway system to display the communication between the build and test systems, which is useful in debugging your setup. To determine the terminal name to supply to the $kdebug_dbgtty variable, enter the tty command in the desired window on the gateway system. By default, $kdebug_dbgtty is null.
Options:

Instead of using debugger variables, you can specify any of the following options on the ladebug command line:
- The -rn option specifies the node or address of the gateway system, and can be used instead of $kdebug_host.
- The -line option specifies the serial line, and can be used instead of $kdebug_line.
- The -tty option specifies the terminal name, and can be used instead of $kdebug_dbgtty.
The above three options require the -remote option or its alternative, the -rp kdebug option.
The variables you set in your .dbxinit file will override any options you use on the ladebug command line. In your debugging session, you can still override the .dbxinit variable settings by using the set command at the (ladebug) prompt, prior to issuing the run command.
If you are debugging on an SMP system, set the lockmode system attribute to four, as shown:
```
#  sysconfig -r lockmode = 4
```
Setting this system attribute makes debugging on an SMP system easier.

Invoking the Debugger

When the setup is complete, start up the debugger as follows:

Invoke the Ladebug debugger on the build system, supplying the pathname of the copy of the test kernel that resides on the build system. Set a breakpoint and start running Ladebug as follows (assuming that vmunix resides in the /usr/test directory):
```
# ladebug -remote /usr/test/vmunix
   .
   .
   .
(ladebug)  stop in hard_clock
[2] stop in hard_clock
(ladebug)  run
```
Because Ctrl/C cannot be used as an interrupt, you should set at least one breakpoint if you wish the debugger to gain control of kernel execution. You can set a breakpoint anytime after the execution of the kdebug_bootstrap() routine. Setting a breakpoint prior to the execution of this routine can result in unpredictable behavior.
Note
Pressing Ctrl/C causes the remote debugger to exit, not interrupt as it does during local debugging.
Halt the test system and, at the console prompt, set the boot_osflags console variable to contain the k option, and then boot the system. For example:
```
>>>  set boot_osflags k
>>>  boot
```
Alternatively, you can enter:
```
>>>  boot -A k
```

Once you boot the kernel, it begins executing. The Ladebug debugger halts execution at the breakpoint you specified, and you can begin issuing Ladebug debugging commands. All Ladebug commands are available, except kps, attach, and detach. (See Part V, Command Reference, for information on Ladebug debugging commands.)

Breakpoint Behavior on SMP Systems

If you set breakpoints in code that is executed on an SMP system, the breakpoints are handled serially. When a breakpoint is encountered on a particular CPU, the state of all the other processors in the system is saved and those processors spin, similarly to how execution stops when a simple lock is obtained on a particular CPU.

When the breakpoint is dismissed (for example, because you entered a step or cont command to the debugger), processing resumes on all processors.

Troubleshooting Tips

If you have completed the kdebug setup and it fails to work, refer to the following list for help:

Be sure the serial line is attached properly. Use the tip command to test the connection: log on to the build system (or the gateway system, if one is being used) as root and enter the following command:
```
      # tip kdebug
```
If the command does not return the message "connected," another process, such as a print daemon, might be using the serial line port that you have dedicated to the kdebug debugger. To remedy this condition, do the following:
- Check the /etc/inittab file to see if any processes are using that line. If so, disable these lines until you finish with the kdebug session. See the inittab(4) reference page for information on disabling lines.
- Use the ps command to see if any processes are using the line. For example, if you are using the /dev/tty00 serial port for your kdebug session, check for other processes using the serial line as follows:
```
             # ps agxt00
```
  If a process is using tty00, kill that process.
- Determine whether any unused kdebugd gateway daemons are running:
```
              # ps agx | grep kdebugd
```
  If one is running, kill the process.
If the test system boots to single user or beyond, then kdebug has not been configured into the kernel as specified in the section Getting Ready to Use the kdebug Debugger. Ensure that the boot_osflags console environment variable specifies the k flag and try booting the system again:
```
     >>>  set boot_osflags k
     >>>  boot
```
Be sure you defined the Ladebug variables in your .dbxinit file correctly, or specify them correctly on the command line.
Determine which pseudoterminal line you ran tip from by issuing the /usr/bin/tty command. For example:
```
# /usr/bin/tty
/dev/ttyp2
```
The example shows that you are using pseudoterminal /dev/ttyp2. Edit your $HOME/.dbxinit file on the build system as follows:
1. Set the $kdebug_dbgtty variable to /dev/ttyp2 with this command:
```
           set $kdebug_dbgtty="/dev/ttyp2"
```
2. Set the variable $kdebug_host to the host name of the system from which you entered the tip command. For example, if the host name is decosf, the entry in the .dbxinit file should be:
```
           set $kdebug_host="decosf"
```
3. Remove any settings of the $kdebug_line variable:
```
           set $kdebug_line=""
```
Start Ladebug on the build system. You should see informational messages on the pseudoterminal line, /dev/ttyp2, indicating that kdebug is starting.
If you are using a gateway system, ensure that the inetd daemon is running on the gateway system. Also, check the TCP/IP connection between the build and gateway system using one of the following commands: rlogin, rsh, or rcp.
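The three .dbxinit settings described in the numbered steps above can be produced by a small shell function. This is only a sketch: the function name kdebug_dbxinit_lines is hypothetical, and the arguments are the pseudoterminal reported by /usr/bin/tty and the host name from which tip was entered.

```shell
#!/bin/sh
# Hypothetical helper: print the .dbxinit settings described above.
#   $1 = pseudoterminal that tip was run from (output of /usr/bin/tty)
#   $2 = host name of the system from which tip was entered
kdebug_dbxinit_lines() {
    # Single quotes keep $kdebug_... literal; %s is filled from the arguments.
    printf 'set $kdebug_dbgtty="%s"\n' "$1"
    printf 'set $kdebug_host="%s"\n' "$2"
    # Clear any earlier setting of $kdebug_line.
    printf 'set $kdebug_line=""\n'
}

# Example using the values from the text (/dev/ttyp2 and decosf).
kdebug_dbxinit_lines /dev/ttyp2 decosf
```

The output can be reviewed and then appended to $HOME/.dbxinit on the build system by hand. The function only prints and does not modify any file.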

+ Used alone, kdebug has its own syntax and commands, and allows local nonsymbolic debugging of a running kernel across a serial line. See the kdebug(8) manpage for information about kdebug local kernel debugging.

22.3.1 Analyzing a Crash Dump

When the operating system crashes, the dump function copies core memory into swap partitions. Then at system reboot time, this copy of core memory is copied into the crash dump file, which you can analyze.

When the operating system hangs, you may need to force a crash dump.
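As a hedged illustration of the analysis step, the sketch below prints (but does not run) a debugger command line for a saved dump. Several details are assumptions not stated in this section and should be checked against your system: that the dump is saved as a numbered pair named vmunix.n and vmcore.n, that the pair resides in /var/adm/crash, and that the -k option selects kernel debugging. The helper name crash_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a Ladebug command line for crash dump number $1.
# Assumptions (not from this section): dumps are saved as vmunix.N / vmcore.N
# in the directory $2 (default /var/adm/crash), and -k selects kernel debugging.
crash_cmd() {
    n=$1
    dir=${2:-/var/adm/crash}
    printf 'ladebug -k %s/vmunix.%s %s/vmcore.%s\n' "$dir" "$n" "$dir" "$n"
}

# Example: command line for the first saved dump.
crash_cmd 0
```

Printing the command rather than executing it lets you confirm the file names and protections first; as noted under Security, you may need to be the superuser to read the dump files.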

22.4 Debugging Loadable Drivers

The procedure for debugging loadable drivers depends on whether you are doing local or remote kernel debugging.

Loadable Drivers and Local Kernel Debugging

For local kernel debugging, any loadable drivers already present in the kernel are automatically loaded into the debugger once when the debugger is started. Since the kernel is running, additional drivers can be loaded at any time. If you wish to obtain the most current list of loaded drivers, you can manually load any new driver information with the following command:

(ladebug) readsharedobj /driver-directory/driver.mod

The list of drivers currently known to ladebug can be displayed as follows:

(ladebug) listobj

ObjectName                        Start Addr           Size        Symbols
                                                    (bytes)         Loaded
----------------------------------------------------------------------------
/vmunix                    0xfffffc0000230000        2442992          Yes
/var/subsys/dna_netman.mod
                           0xffffffff90ce0000          49152          Yes
/var/subsys/dna_dli.mod
                           0xffffffff90cf0000          57344          Yes
/var/subsys/dna_base.mod
                           0xffffffff90d04000         393216          Yes
/var/subsys/dna_xti.mod
                           0xffffffff90b0a000           8192          Yes

Loadable Drivers and Remote Kernel Debugging

For remote kernel debugging, you can debug loadable drivers as follows:

On your remote machine, create a directory called (for example)
```
/usr/opt/TMU100
```
Put both the source file and the loadable driver's .mod file into this directory. Assuming a loadable driver called tmux, get the tmux.mod file, and make sure you have permission to read, write, and execute the file.

Configure your system as follows:

Create a /usr/opt/TMU100/stanza.loadable file:

tmux:
   Subsystem_Description = TMUX device driver
   Module_Config_Name = tmux
   Module_Config1 = controller tmux0 at *
   Module_Type = Dynamic
   Module_Path = /usr/opt/TMU100/tmux.mod
   Device_Dir = /dev/streams
   Device_Char_Major = Any
   Device_Char_Minor = 0
   Device_Char_Files = tmux0

Run the sysconfigdb and sysconfig utilities:

sysconfigdb -a -f /usr/opt/TMU100/stanza.loadable tmux
sysconfigdb -s
cp /usr/opt/TMU100/tmux.mod /subsys/tmux.mod
cd /subsys
ln -s device.mth tmux.mth

Ensure that there is a copy of the driver tmux.mod residing in the same directory on the local machine, for example, /subsys/tmux.mod.
Start up Ladebug with the "-remote" option, and set a breakpoint in the routine subsys_preconfigure. Issue the run command:
```
    (ladebug) stop in subsys_preconfigure
    [#1: stop in void subsys_preconfigure(caddr_t) ]
    (ladebug) run
```
subsys_preconfigure is provided to assist developers in debugging configuration routines. It is needed because the debugger is not notified when a subsystem is loaded, and a user-defined breakpoint cannot be set until the load has occurred. subsys_preconfigure is guaranteed to be called following a subsystem load but prior to a configuration.
If your kernel has been built with symbolic information, once stopped in subsys_preconfigure you can examine the variable subsys to see if this event corresponds to your driver being loaded:
```
    (ladebug) print subsys
    0xfffffc0000620cd4="generic"
```
Depending on the number of subsystems being automatically loaded, you may stop in subsys_preconfigure many times. If your driver is loaded last, or if you are loading your driver manually (as described in the next step), you may prefer to set a breakpoint that stops closer to the subsys_preconfigure call that takes place immediately before your driver is loaded.
If you are manually loading your driver, you will need to run the remote kernel to the point where you have the hash (#) single-user prompt. Once there, configure the driver as follows:
```
    # sysconfig -c tmux
```
Once your driver has been loaded into the kernel, the debugger will stop at your subsys_preconfigure breakpoint. At this time you can issue the following commands:
```
    (ladebug) print subsys
    0xfffffc0000620cd4="tmux"
    (ladebug) readsharedobj /subsys/tmux.mod
```
The readsharedobj command causes ladebug to retrieve the symbolic information associated with your driver. A lot of information is transferred over a serial line during this operation, so expect it to take several seconds to complete.

Now Ladebug knows about your driver, so you can proceed with symbolic debugging as you would with any other program. For example:

(ladebug) stop in tmux_configure
[#2: stop in int tmux_configure(cfg_op_t, caddr_t, ulong, caddr_t, ulong) ]
(ladebug) cont
[2] stopped at [int tmux_configure(cfg_op_t, caddr_t, ulong, caddr_t,
  ulong):933 0xffffffff89a14028]
    933     sa.sa_version       = OSF_STREAMS_11;
(ladebug) next
stopped at [int tmux_configure(cfg_op_t, caddr_t, ulong, caddr_t,
  ulong):934 0xffffffff89a14034]
    934     sa.sa_flags         = STR_IS_DEVICE | STR_SYSV4_OPEN;
(ladebug) print sa.sa_version
84107547

Note

If you use sysconfig to unload and then subsequently reload a driver in a kernel actively being debugged by ladebug, any breakpoints previously present in that driver will be lost. To reestablish those breakpoints in the newly loaded subsystem, issue the following ladebug commands prior to continuing:

    (ladebug) disable *
    (ladebug) enable *
    (ladebug) cont