Computer Crash Analysis Tool

User Guide

This guide explains how to use the Computer Crash Analysis Tool (CCAT) to analyze crash files on the supported operating systems.

Product Name: Computer Crash Analysis Tool (CCAT)
Product Version: 5.1.1
Operating Systems:
    Microsoft® Windows® 2000 and XP
    HP Tru64 UNIX® versions 4.0F, 4.0G, 5.0A or higher
    HP OpenVMS Alpha versions 7.2–2 or higher
Document Date: 4 December 2003

1  Introduction

Computer Crash Analysis Tool (CCAT) is a software application that enables Hewlett-Packard customer service engineers and system administrators to analyze operating system crashes.

CCAT compares the information collected about a crash with a set of operating system-specific rules to determine whether the crash footprint matches a known footprint for which a solution or corrective action has been found. Using CCAT reduces customer downtime by shortening the time required to analyze system crashes, and can eliminate the need for customer site visits.
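The idea of matching a footprint against known footprints can be pictured with a minimal Python sketch. It is an illustration only, not the CCAT rule engine: the rule structure, the field names, and the sample rule are all hypothetical, and this guide does not describe the real rule format.

```python
from typing import Any, Dict, List, Optional

# Hypothetical rule format (NOT the CCAT rule syntax): each rule names the
# footprint fields it requires and the corrective action to report.
Rule = Dict[str, Any]


def find_matching_rule(footprint: Dict[str, str], rules: List[Rule]) -> Optional[Rule]:
    """Return the first rule whose required fields all equal the footprint's values."""
    for rule in rules:
        required = rule["match"]
        if all(footprint.get(name) == value for name, value in required.items()):
            return rule
    return None


# Made-up sample data, for illustration only.
SAMPLE_RULES: List[Rule] = [
    {
        "match": {"os": "example-os", "failing_module": "example.sys"},
        "action": "Install the (fictitious) example.sys update.",
    },
]

if __name__ == "__main__":
    crash = {"os": "example-os", "failing_module": "example.sys", "host": "node1"}
    rule = find_matching_rule(crash, SAMPLE_RULES)
    print(rule["action"] if rule else "No known footprint matched.")
```

A footprint may carry more fields than a rule requires; in this sketch only the fields named by the rule are compared.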

1.1  Gathering Crash Data

The method used to gather crash data varies depending on your operating system.

1.1.1  Windows

In order to generate a crash data file that CCAT can use, Windows systems must have Crash Analysis Data Collector (CADC) installed.

CADC reads the binary crash information stored in the memory.dmp file created by the operating system in the event of a crash. CADC processes the memory.dmp file and creates a new file named NtFootPrint.txt. CCAT can only process crash files that have been pre-processed by CADC. CCAT cannot process a raw memory.dmp file.
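As a rough illustration of the distinction between the two files, the following Python sketch tells a CADC footprint from a raw dump by file name alone. The name-based test and the function name are assumptions made for the example, and the sample paths are hypothetical; CCAT itself may identify its input differently.

```python
from pathlib import PureWindowsPath

RAW_DUMP_NAME = "memory.dmp"        # written by Windows; CCAT cannot read it
FOOTPRINT_NAME = "NtFootPrint.txt"  # written by CADC; the input CCAT expects


def is_ccat_input(path: str) -> bool:
    """True if the file name is the CADC footprint (case-insensitive, as on Windows)."""
    return PureWindowsPath(path).name.lower() == FOOTPRINT_NAME.lower()


if __name__ == "__main__":
    # Hypothetical locations, used only to exercise the check.
    for candidate in (r"C:\temp\NtFootPrint.txt", r"C:\temp\memory.dmp"):
        print(candidate, "->", "usable" if is_ccat_input(candidate) else "needs CADC first")
```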

Note


The current version of CCAT will not work with the original version of CADC. You must have version 3.1 or higher of CADC installed in order to use CCAT. You can install CADC either before or after you install CCAT.


You can download CADC from this URL:
http://www.compaq.com/support/svctools/webes/ccat/cadc.html

Once CADC is installed, you must configure your machine to create a memory.dmp file when it crashes; without that file, CADC (and subsequently CCAT) cannot work. In Windows, these settings are in the System utility in the Control Panel.

For Windows NT, choose the tab labeled Startup/Shutdown from the System window.

For Windows 2000, select the Advanced tab, then click the Startup and Recovery button. In the Write Debugging Information section, do the following:

Once installation and configuration are complete, each time your Windows system crashes, CADC reads and processes the memory.dmp file and creates a new NtFootPrint.txt file. Once CADC has created the footprint, CCAT can process the crash data.

1.1.2  Tru64 UNIX

Each time your Tru64 UNIX system crashes, a system utility collects data about the crash and saves it in a crash data file.

1.1.3  OpenVMS

Each time your OpenVMS system crashes, a system utility collects data about the crash and saves it in a crash data file.

1.2  CCAT Functionality

Once the footprint has been created, CCAT can perform the following functions automatically:

For more information about DSNlink, contact the CSC or see the following web site:
http://www.compaq.com/support/svctools/connectivity
For more information about PRS, contact the CSC or see the following web site:
http://www.compaq.com/manage/remoteservices

CCAT can also be run at any time as a GUI, enabling you to manually process crash data files.

1.3  Security and Required Permissions

In order to enhance security, only privileged users can access the WEBES directory tree and run Compaq Analyze commands. The requirements for each operating system are given here.

1.3.1  Windows

The following actions are restricted to privileged users:

To perform restricted actions, your user ID must be either:

1.3.2  Tru64 UNIX

The following actions are restricted to privileged users:

Only the "root" user can perform these actions. The /usr/opt/hp/svctools directory is owned by root, and has rwx (read, write, and execute) permissions for root (owner), and no permissions for any other user (group or world).
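The ownership and permissions described above correspond to mode 0700 with owner uid 0. A minimal Python sketch of that check follows; it assumes a Python interpreter is available on the system, which this guide does not state, and the function names are invented for the example.

```python
import os
import stat


def mode_is_owner_only(mode: int) -> bool:
    """True if the permission bits are exactly rwx for the owner and nothing else."""
    return stat.S_IMODE(mode) == 0o700


def is_root_only(path: str) -> bool:
    """True if path is owned by uid 0 (root) and its permission bits are rwx------."""
    info = os.stat(path)
    return info.st_uid == 0 and mode_is_owner_only(info.st_mode)


if __name__ == "__main__":
    path = "/usr/opt/hp/svctools"
    if os.path.isdir(path):
        print(path, stat.filemode(os.stat(path).st_mode), is_root_only(path))
    else:
        print(path, "not found on this system")
```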

1.3.3  OpenVMS

Commands—To execute any Compaq Analyze commands (desta or wccat commands from the command prompt), the user needs all of the following OpenVMS privileges. Note that these are a subset of the privileges required to install, upgrade, or uninstall WEBES as described in the WEBES Installation Guide:

ALTPRI
BUGCHK
CMKRNL

DIAGNOSE
IMPERSONATE
NETMBX

SYSPRV
TMPMBX
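Because all eight privileges are required, it can help to compare the list above with the privileges an account actually holds. The Python sketch below does only the comparison; how the held privileges are obtained (for instance, copied by hand from the output of SHOW PROCESS/PRIVILEGES) is left to the reader, and the function name is hypothetical.

```python
# The eight privileges listed in this section.
REQUIRED_PRIVILEGES = frozenset({
    "ALTPRI", "BUGCHK", "CMKRNL", "DIAGNOSE",
    "IMPERSONATE", "NETMBX", "SYSPRV", "TMPMBX",
})


def missing_privileges(held):
    """Return the required privileges absent from `held`, sorted by name."""
    normalized = {name.strip().upper() for name in held}
    return sorted(REQUIRED_PRIVILEGES - normalized)


if __name__ == "__main__":
    # Example input: an account that holds only the two mailbox privileges.
    print(missing_privileges(["NETMBX", "TMPMBX"]))
```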


Files—File access is restricted in the WEBES installed directory tree pointed to by the SVCTOOLS_HOME logical (SYS$COMMON:[HP] by default). To view these files, you must be a member of the System group, your user ID must have all privileges, or you must issue the SET PROCESS /PRIV=ALL command.

All directories and files in the SVCTOOLS_HOME tree are owned by the System user, and have System, Owner, and Group permissions of RWED (Read, Write, Execute, and Delete). There are no permissions for World.
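For readers who think in terms of protection specifications, the permissions above can be written compactly. The Python sketch below builds such a string in the style of a DCL protection specification; it only formats text and does not query or change any file, and the helper name is invented for the example.

```python
def protection_mask(access_by_class):
    """Format access per class (S, O, G, W) as a DCL-style protection string.

    A class with no access is written as the bare class letter.
    """
    parts = []
    for cls in ("S", "O", "G", "W"):
        access = access_by_class.get(cls, "")
        parts.append(f"{cls}:{access}" if access else cls)
    return "(" + ",".join(parts) + ")"


# Permissions described for the SVCTOOLS_HOME tree: RWED for System, Owner,
# and Group; nothing for World.
SVCTOOLS_PROTECTION = protection_mask({"S": "RWED", "O": "RWED", "G": "RWED", "W": ""})

if __name__ == "__main__":
    print(SVCTOOLS_PROTECTION)
```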

1.4  Intended Audience

The Computer Crash Analysis Tool User Guide is intended for use by system administrators and Hewlett-Packard Customer Services engineers who use the CCAT software on all supported operating systems, including Windows 2000 and XP, Tru64 UNIX, and OpenVMS Alpha.

1.5  Further Information

CCAT is a member of the Web-Based Enterprise Services (WEBES) suite of products. For more information on the other WEBES applications, visit the support web site:
http://www.compaq.com/support/svctools/webes

2  Running CCAT Automatically

This manual tells you how to use CCAT to process crash files manually. However, CCAT is used most efficiently as an automatic process requiring no input from the user. This section describes the automatic operation of CCAT.

2.1  Automatic Mode Process

Note


If you are running an older unsupported operating system or OpenVMS VAX, you must use the Crash Analysis Data Collector (CADC) for operating system diagnostics. For more information about installing and using CADC, see the CADC user documentation for your operating system.


The automated CCAT process begins when a system crashes and consists of the following steps:

  1. When the system reboots, a system utility or other software collects data about the cause of the crash and creates a crash file.
  2. CCAT automatically starts when a system reboots from a crash, and detects that there is a crash file to process. CCAT analyzes the crash file against the local CCAT knowledge base and produces a results file which contains the crash parameters, and may include the possible cause and solution for the system crash.
  3. CCAT sends an email message to the system administrator or other specified local addressee containing information about the crash.

Note


In order for CCAT to perform the following functions automatically, either DSNlink or PRS must be installed and running on the system.


  4. CCAT opens a service request containing the crash parameters and the crash data analysis file at the Customer Support Center (CSC) using DSNlink or PRS. (If neither DSNlink nor PRS is available, the customer can provide the crash data analysis file to the CSC via ftp, email, or a storage medium such as diskette or tape.)
  5. When the message containing the crash parameters and the results file arrives at the CSC, the crash is analyzed again, because the CCAT server at the CSC may have updated rule sets that provide additional insight into the cause of the crash and its resolution.
  6. The results of the analysis performed at the customer site and at the CSC are entered into the Call Handling System.
  7. The CSC monitors open calls in the Call Handling System, and notifies the customer of the final analysis results via email or by means of a call from a crash analysis specialist.
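The dependency on DSNlink or PRS can be summarized in a short Python sketch. It models only which steps of the process above are expected to run; the step descriptions are paraphrases, and the function and constant names are hypothetical rather than part of CCAT.

```python
# Steps that run on the crashed system regardless of connectivity.
LOCAL_STEPS = [
    "collect crash data and create the crash file",
    "analyze the crash file against the local knowledge base",
    "email crash information to the local addressee",
]

# Steps that require DSNlink or PRS to be installed and running.
CSC_STEPS = [
    "open a service request at the CSC",
    "re-analyze the crash at the CSC with its current rule sets",
    "enter both sets of results into the Call Handling System",
    "notify the customer of the final analysis",
]


def planned_steps(dsnlink_running: bool, prs_running: bool):
    """Return the steps expected to run automatically for this configuration."""
    steps = list(LOCAL_STEPS)
    if dsnlink_running or prs_running:
        steps += CSC_STEPS
    return steps


if __name__ == "__main__":
    for number, step in enumerate(planned_steps(dsnlink_running=False, prs_running=True), 1):
        print(f"{number}. {step}")
```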

2.2  Configuring CCAT To Run Automatically

If you want CCAT to process a footprint automatically and send the footprint and the results to the CSC, you must do the following:

3  Using the CCAT GUI

The CCAT GUI is an interactive tool you can use to analyze crash files manually. It is important to keep in mind that the CCAT GUI is used only for onsite manual tasks. It does not log calls or send crash parameters or results files to the CSC, nor does it send email notification to anyone.

The CCAT GUI allows you to perform the following tasks:

3.1  Verifying the WEBES Director

The DESTA Director must be running before you start the CCAT GUI. Ordinarily, the WEBES Common Components installation configures your startup procedure so that the DESTA Director starts every time your system reboots. If the DESTA Director fails to start at system startup, you will not be able to analyze crash files.

You can verify that the DESTA Director is running by executing the following command:

   desta status
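If the check needs to be scripted, a thin Python wrapper around the same command is one option. The sketch below assumes that desta can be found on the PATH and that a Python interpreter is available; it returns the raw output without interpreting it, because this guide does not document the output format.

```python
import subprocess


def run_desta_status(runner=subprocess.run):
    """Run `desta status` and return (exit_code, combined_output).

    `runner` defaults to subprocess.run; it is a parameter only so that the
    function can be exercised without desta installed.
    """
    result = runner(["desta", "status"], capture_output=True, text=True)
    return result.returncode, (result.stdout or "") + (result.stderr or "")


if __name__ == "__main__":
    try:
        code, output = run_desta_status()
    except FileNotFoundError:
        print("desta was not found on the PATH")
    else:
        print(f"exit code: {code}")
        print(output)
```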

If circumstances require it, you can manually start the Director by following the instructions for your operating system.

3.1.1  Windows

To start the WEBES Director, start the DESTA_Service Windows service using one of the following methods:

Using the desta start command on Windows systems is unsupported. The command starts the Director, but also generates error messages. Starting the Director this way is not recommended because:

On Windows, the desta start/stop functionality is intended only as a tool for investigating WEBES operational problems. If the Director is started with desta start, it must be stopped with desta stop.

3.1.2  Tru64 UNIX

Enter /usr/sbin/desta start at a shell prompt.

On TruClusters, you can run the /usr/sbin/webes_install_update program and choose the Start WEBES Director option to start the Director on either all the nodes in the cluster or a selected group of nodes that you choose.

3.1.3  OpenVMS

Enter DESTA START at the OpenVMS command line prompt.

On OpenVMS clusters, you can use the SYSMAN utility to issue the command do desta start on either all the nodes in the cluster or a specific group of nodes that you choose.

3.2  Starting the GUI

Start the CCAT GUI according to your operating system:

Windows:

Start | Programs | Hewlett-Packard Service Tools | Computer Crash Analysis Tool | Computer Crash Analysis Tool

Tru64 UNIX:

# /usr/sbin/wccat gui

OpenVMS:

(Before you start the CCAT GUI, make sure your user account page file quota is set to at least 300,000 blocks.)

$ @SVCTOOLS_HOME:[COMMON.BIN]WCCAT GUI
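The per-system start methods above, together with the OpenVMS page file quota requirement, can be collected in a small Python lookup. This is a sketch for illustration: the keys "windows", "tru64", and "openvms" and the function name are hypothetical, and the code only returns text; it does not start the GUI.

```python
# Start methods exactly as given in this section.
GUI_LAUNCH = {
    "windows": ("Start | Programs | Hewlett-Packard Service Tools | "
                "Computer Crash Analysis Tool | Computer Crash Analysis Tool"),
    "tru64": "/usr/sbin/wccat gui",
    "openvms": "@SVCTOOLS_HOME:[COMMON.BIN]WCCAT GUI",
}

OPENVMS_MIN_PAGE_FILE_QUOTA = 300_000  # blocks


def gui_launch_hint(os_name, page_file_quota_blocks=None):
    """Return the start method for os_name, or a quota warning on OpenVMS."""
    key = os_name.strip().lower()
    if key not in GUI_LAUNCH:
        raise ValueError(f"unsupported operating system: {os_name!r}")
    if (key == "openvms"
            and page_file_quota_blocks is not None
            and page_file_quota_blocks < OPENVMS_MIN_PAGE_FILE_QUOTA):
        return "Raise the page file quota to at least 300000 blocks before starting the GUI."
    return GUI_LAUNCH[key]


if __name__ == "__main__":
    print(gui_launch_hint("Tru64"))
    print(gui_launch_hint("OpenVMS", page_file_quota_blocks=100_000))
```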


3.3  CCAT GUI

Starting the GUI displays the CCAT window (Figure 3–1).

Figure 3–1 Computer Crash Analysis Tool Window
figures/ccat_window.gif

Note the horizontal scroll bar at the bottom of the upper frame of the CCAT window. You can resize the CCAT window to best suit your needs and the size of your monitor. Use the scroll bar to view information in the crash data parameter fields that falls outside the frame area.

3.4  Performing a Manual Crash Analysis

To analyze a crash manually, you must enter the parameters from the crash data file into the fields in the CCAT window.

3.4.1  Crash Data Parameters

The crash data parameters that you need to enter vary depending on your operating system.

Windows

Table 3–1 Windows Crash Data Parameters 
Parameter                       Explanation
OS Version                      The version number of the failed Windows operating system
Minor Version                   The NT build number (for NT 4.0, 1381; for Windows 2000, 2195)
Service Pack                    The number of the Service Pack installed on the failed machine
Machine Image Type              "intel"
BugCheckCode                    The number of the stop that occurred, which can be used to determine what trap occurred
BugCheckParam #1                The four parameters normally included with the BugCheckCode that give clues to the nature of the BugCheckCode
BugCheckParam #2
BugCheckParam #3
BugCheckParam #4
Failing Module                  The name of the driver that failed
Failing Module Offset           The offset of the failed driver
Failing Module Timestamp        The date and time the failed driver was built
Crash Process Name              The name of the process that was running when the system crashed
Failing Routine                 The name of the failing routine
Failing Routine Offset          The failing address location within the failing routine, offset from the start of the routine
Pool Information                The address within a Page or NonPage pool, depending on the stopcode
Canonical Stopcode Parameter 1  Address or status register variables (see the Kanalyze documentation for more information)
Canonical Stopcode Parameter 2
Canonical Stopcode Parameter 3
Canonical Stopcode Parameter 4
Keyword 1                       Items on the stack that point to the cause of the failure (see the Kanalyze documentation for more information)
Keyword 2
Keyword 3
Keyword 4
Driver List                     The Driver Name, Driver Load Address, Driver Size and Driver Date. These values are derived from the failing address information contained in the Bugcheck Parameter fields. Which Bugcheck Parameter field you use depends on the Bugcheck Code. The Driver List corresponds to the driver base address when compared to the address of the Stopcode.
Stack Trace                     A list of the functions the system was executing when it crashed, with the ending line of code for each
Call Site List                  Addresses taken from the Stack Trace used to identify failing areas
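The two build numbers given for the Minor Version field can be kept in a small lookup, sketched here in Python. Only the values stated in Table 3–1 are included; the function name is hypothetical, and systems not listed in the table return None rather than a guessed value.

```python
# Build numbers stated in Table 3-1 for the Minor Version field.
NT_BUILD_NUMBERS = {
    "Windows NT 4.0": 1381,
    "Windows 2000": 2195,
}


def minor_version_for(os_name):
    """Return the build number for the Minor Version field, or None if not listed."""
    return NT_BUILD_NUMBERS.get(os_name)


if __name__ == "__main__":
    for name in ("Windows NT 4.0", "Windows 2000", "Windows XP"):
        print(name, "->", minor_version_for(name))
```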

Tru64 UNIX

Table 3–2 Tru64 UNIX Crash Data Parameters 
Parameter                Explanation
OS Version               The version number of the failed operating system
Architecture             The hardware architecture (e.g., alpha)
Panic String             A brief description of why the system crashed
Stack Trace              A list of the functions the system was executing when it crashed, with the ending line of code for each
Crash Time               The time of the system crash
Uptime                   How long the system that crashed had been running since the last reboot
Host Name                The node on which the crash occurred
Firmware Revision        The machine hardware type of the failed CPU
System String            The System Information String, e.g., AlphaServer 4100 5/400 4MB
Number of CPUs           The number of CPUs available to the system
Physical Memory          The memory in megabytes
Panic CPU                The CPU that caused the system to crash
Available CPUs           The CPUs that are currently being used
Virtual Address          The virtual address that caused a kernel memory fault, and subsequent system crash (valid for kernel memory fault panics only)
Faulting PC              The PC on which the fault occurred
Exception Frame Pointer  A pointer to the exception frame that contains register information about the state of the failed CPU (valid prior to V4.0 only)
PC/I Module              The Program Counter/Instruction at the time of the trap or exception that led to the system crash (valid prior to V4.0 only)
Return Address           The address of the instruction immediately prior to the trap or exception that led to the system crash (valid prior to V4.0 only)

OpenVMS Alpha

Table 3–3 OpenVMS Alpha Crash Data Parameters
Parameter             Explanation
OS Version            The version number of the failed operating system
Crash Time            The date and time the system crash occurred
Bugcheck              The type of diagnostic check logged by the operating system
Host Name             The node on which the crash occurred
CPU Type              The model number of the failed CPU
Process Name          The name of the process active at the time of the crash
Image Name            The name of the image being executed at the time of the crash
Signal Array          The Signal Array count. The Signal Array contains the exception code, zero or more exception parameters, the PC, and the PSL.
Condition Code        The symbolic value assigned to the specific condition
Reason Mask           The longword mask
Virtual Address       The virtual address the failing instruction tried to reference
Exception PC          The instruction whose attempted execution resulted in the unexpected executive or kernel mode exception
Exception PSL         Processor Status Longword (PSL) at the time of the exception
Module Name           The name of the failed module
Module Offset         The offset of the failed module
Instruction           The failing instruction corresponding to the exception PC
Map Module            The name of the map module in use when the crash occurred
Map Offset            The beginning memory location where the map module driver resides
Caller Module         The first module identified on the stack below the failing PC
Caller Module Offset  The first module offset identified on the stack below the failing PC
Instruction M1        The instruction executed immediately before the Failing Instruction (helps to locate the Failing Instruction precisely in the code)
Instruction M2        The next-to-last instruction executed before the Failing Instruction (helps to locate the Failing Instruction precisely in the code)
Instruction P1        The first instruction that would have been executed after the Failing Instruction (helps to locate the Failing Instruction precisely in the code)
Instruction P2        The second instruction that would have been executed after the Failing Instruction (helps to locate the Failing Instruction precisely in the code)

3.4.2  Entering Parameters

You can enter crash data parameters in any of the following ways:

Note


To edit the contents of a parameter field, click on the field and use the arrow and Backspace keys to remove unwanted characters. Do not click on the Clear button. The Clear button clears all of the parameter fields.


3.4.2.1  Selecting And Opening a Crash File

To populate the crash data parameter fields by selecting and opening a crash file, follow these steps:

  1. Choose Select Crash File... from the File pull-down menu.
  2. Select the desired crash file using the appropriate procedure for your operating system.
  3. Once you have the correct file selected or entered, click the Open button. The CCAT Message Processing window appears, telling you that the crash file is being analyzed.

During analysis, CCAT populates the parameter fields. When analysis is complete, the results appear in the CCAT Results frame at the bottom of the CCAT window, as shown in Figure 3–2.

Note


When the results file is displayed in the CCAT Results frame, the frame at the top of the CCAT window may be grayed out. To display the contents of this frame, click on the operating system tab.


Figure 3–2 CCAT Analysis Results
figures/results.gif

3.4.2.2  Typing In Crash Parameters

To enter a crash parameter manually, click on the appropriate field and type the parameter exactly as it appears in the crash data file or use the Copy and Paste functions to copy information into the fields.

When you are entering crash parameters manually, it is important to remember the following:

If you make a mistake or need to edit the contents of a parameter field, click on the field and use the arrow and Backspace keys to remove unwanted characters. Do not click on the Clear button. The Clear button clears all of the parameter fields.

The crash data file may not contain all of the parameters listed in the CCAT window. When the crash data file does not contain a parameter, leave the tilde (~) in the field to indicate that the parameter is not available.
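The tilde convention is easy to apply mechanically when parameters are prepared outside the GUI, for example in a worksheet before they are typed in. The Python sketch below is a hypothetical helper, not part of CCAT; the sample value "V5.1" is made up for the demonstration.

```python
MISSING = "~"  # placeholder CCAT shows for a parameter that is not available


def fill_fields(field_names, known_values):
    """Map every field to its known value, or to '~' when the value is absent or blank.

    Known values are kept exactly as supplied, since parameters must be
    entered exactly as they appear in the crash data file.
    """
    filled = {}
    for name in field_names:
        value = known_values.get(name, "")
        filled[name] = value if value.strip() else MISSING
    return filled


if __name__ == "__main__":
    fields = ["OS Version", "Panic String", "Uptime"]
    print(fill_fields(fields, {"OS Version": "V5.1", "Uptime": ""}))
```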

Once you have entered all the crash data parameters available to you, click on the Apply button on the right side of the CCAT window to start the crash analysis.

When CCAT has completed the crash analysis, the results file is displayed in the frame at the bottom of the CCAT window. You can resize the window and use the scroll bar to view the file.

3.4.3  Saving the Results File

If you want to save the results file so you can view it again later, make sure the file is still displayed in the frame at the bottom of the CCAT window. Then follow these steps:

  1. Select Save Results File As from the File pull-down menu. The Save window appears.
  2. Use the Look In field to select the directory where you want to save the results.
  3. Enter the name you want to assign to the saved results file in the File Name field and click the Save button.

3.5  Viewing Saved Results Files

To view a previously saved results file, follow these steps:

  1. Select View Saved Results File from the File pull-down menu. The Open window appears.
  2. Use the Look In field to select the directory where the results file is saved.
  3. Click on the results file you want to display. The name of the file you selected appears in the File Name field.
  4. Click on Open.

CCAT displays the results file in the Results frame at the bottom of the CCAT window, as shown in Figure 3–3.

Figure 3–3 Typical CCAT Analysis Results
figures/view_results.gif

3.6  Exiting From the CCAT GUI

To exit from the CCAT GUI, select Exit from the File pull-down menu.

A CCAT Information message window appears, telling you that the communication interface has been shut down, as shown in Figure 3–4.

Figure 3–4 Exit CCAT Information Window
figures/exit_ccat_info.gif

Click on OK to exit from CCAT.

Copyright Statement

© 2003 Hewlett-Packard Company

Microsoft, Windows, MS Windows, Windows NT, and MS-DOS are US registered trademarks of Microsoft Corporation. Intel is a US registered trademark of Intel Corporation. UNIX is a registered trademark of The Open Group. Java is a US trademark of Sun Microsystems, Inc.

Confidential computer software. Valid license from Hewlett-Packard required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Hewlett-Packard shall not be liable for technical or editorial errors or omissions contained herein. The information is provided "as is" without warranty of any kind and is subject to change without notice. The warranties for Hewlett-Packard products are set forth in the express limited warranty statements accompanying such products. Nothing herein should be construed as constituting an additional warranty.

This service tool software is the property of, and contains confidential technology of Hewlett-Packard Company or its affiliates. Possession and use of this software is authorized only pursuant to the Proprietary Service Tool Software License contained in the software or documentation accompanying this software.

Hewlett-Packard service tool software, including associated documentation, is the property of and contains confidential technology of Hewlett-Packard Company or its affiliates. Service customer is hereby licensed to use the software only for activities directly relating to the delivery of, and only during the term of, the applicable services delivered by Hewlett-Packard or its authorized service provider. Customer may not modify or reverse engineer, remove or transfer the software or make the software or any resultant diagnosis or system management data available to other parties without Hewlett-Packard's or its authorized service provider's consent. Upon termination of the services, customer will, at Hewlett-Packard's or its service provider's option, destroy or return the software and associated documentation in its possession.

Examples used throughout this document are fictitious. Any resemblance to actual companies, persons, or events is purely coincidental.