Tips and Tricks site for advanced HP-UX Engineers

04 Sep 09 Q4 Crash Dump Analysis to Analyze System Dump Files

What follows is a document I found on the forums. It can also be found on the docs.hp.com site, but this is a paraphrase, with some extra commentary.

1) You need to have foresight. Before you have a crash you must enable your system to save crash dumps.

2) vi /etc/rc.config.d/savecrash and set the first parameter to 1. Now when your system crashes, and some day it probably will, you can perform q4 analysis and send the results to HP. I think this document originated within HP. I have one written somewhere on the forums, but this one is better.
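A quick way to confirm the setting took effect is to grep the config file rather than eyeball it. The sketch below is only an illustration: it assumes the "first parameter" is the variable SAVECRASH (check your own copy of the file), and the function name savecrash_enabled is made up for this example.

```shell
#!/usr/bin/sh
# Report whether crash-dump saving is enabled in an rc config file.
# ASSUMPTION: the controlling variable is named SAVECRASH; verify this
# against your own /etc/rc.config.d/savecrash before relying on it.
savecrash_enabled() {
    conf=${1:-/etc/rc.config.d/savecrash}
    # Match an uncommented "SAVECRASH=1" line, allowing leading whitespace
    # and rejecting longer values such as SAVECRASH=10.
    grep -Eq '^[[:space:]]*SAVECRASH=1([^0-9]|$)' "$conf" 2>/dev/null
}
```

Typical use: savecrash_enabled && echo "crash dumps will be saved". The function returns non-zero if the line is missing, commented out, or set to another value.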

USING Q4 TO ANALYZE SYSTEM DUMP FILES

-------------------------------------

When an 11.X HP-UX system crashes, it saves a snapshot of RAM in swap and, during the reboot, copies it into /var/adm/crash. Because these files are binary, a utility called “q4” is used to analyze them and create readable text from which the response center can determine the failure cause.

============================ STEP 1 ===========================

Dumps are normally saved to /var/adm/crash.

Verify you have a dump to analyze by doing:

# ll /var/adm/crash/cr*

You may see:

/var/adm/crash/crash.0/INDEX

/var/adm/crash/crash.0/vmunix.gz

/var/adm/crash/crash.0/image.0.1.gz

/var/adm/crash/crash.0/image.0.2.gz

/var/adm/crash/crash.0/image.0.3.gz

/var/adm/crash/crash.0/image.0.4.gz

(the numeric suffixes may vary on your system)

Both the INDEX file and /etc/shutdownlog contain the “panic” statement.
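To pull the panic statement out without paging through the file, a small helper along these lines may be enough. It is a sketch, not part of the original procedure: the function name show_panic is invented here, and it simply assumes the word "panic" appears on the relevant line of INDEX, as the text above says.

```shell
#!/usr/bin/sh
# Print the line(s) mentioning "panic" from the INDEX file of one saved dump.
# The default directory is the example used in this article; pass your own.
show_panic() {
    dumpdir=${1:-/var/adm/crash/crash.0}
    if [ ! -f "$dumpdir/INDEX" ]; then
        echo "no INDEX file in $dumpdir" >&2
        return 1
    fi
    # Case-insensitive so "panic" and "PANIC" are both found.
    grep -i 'panic' "$dumpdir/INDEX"
}
```

The same grep -i 'panic' can be pointed at /etc/shutdownlog to see the statement recorded there.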

============================ STEP 2 ===========================

The following commands must all be run from the dump directory:

  • cd to the dump directory, i.e.: cd /var/adm/crash/crash.0 (substitute your own dump directory)

  • # /usr/contrib/bin/gunzip vmunix.gz

(uncompresses the kernel file – may already be done)

  • # q4prep -p

(ignore the error if this was previously done)

  • Now type:

# q4 -p .

(note the trailing ‘dot’, which refers to the current directory)

This will put you at the q4 utility prompt: q4>

  • The next command will get you a “fingerprint” of what was going on on the system at the time of the failure.

  • If you are working with an HP RCE at this time, type the following line and read the results to him:

trace event 0

Otherwise, simply type this next line and continue.

trace event 0 > trace

  • At the prompt type: include analyze.pl

(the last character of “analyze.pl” is the letter “el”, not the digit one)

  • At the next prompt type: run Analyze AU >> ana.out

  • At the next prompt type: exit
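The shell-side part of Step 2 (everything before the q4> prompt) could be wrapped in a small function so the gunzip is skipped when the kernel is already uncompressed. This is a rough sketch under assumptions: prep_dump, run_cmd and the DRY_RUN variable are names invented for this example, and the interactive q4 commands are still typed by hand afterwards.

```shell
#!/usr/bin/sh
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Prepare one dump directory for q4. The body runs in a subshell, so the
# caller's working directory is left unchanged.
prep_dump() (
    dumpdir=${1:-/var/adm/crash/crash.0}
    cd "$dumpdir" || exit 1
    # Uncompress the kernel file only if that has not been done already.
    if [ -f vmunix.gz ] && [ ! -f vmunix ]; then
        run_cmd /usr/contrib/bin/gunzip vmunix.gz || exit 1
    fi
    # q4prep reports an error if it was run before; that error is ignored.
    run_cmd q4prep -p || true
    echo "Next: cd $dumpdir and run: q4 -p ."
)
```

Running DRY_RUN=1 prep_dump /var/adm/crash/crash.0 first shows what would be executed without touching the dump.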

============================ STEP 3 ===========================

Generate a patch list:

# swlist -l product PH\* > patch_list

Using the CALL ID as the subject, email patch_list, ana.out, and possibly the trace file and what.out to: hpcu@atl.hp.com

NOTE: Max 3MB email size
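Since ana.out can get large, it may be worth adding up the file sizes before mailing. The following is a minimal sketch: the function name attachments_fit is hypothetical, and it assumes "3MB" means 3 * 1024 * 1024 bytes of raw file data, ignoring any growth from mail encoding.

```shell
#!/usr/bin/sh
# ASSUMPTION: the 3MB limit is 3 * 1024 * 1024 bytes of raw file data.
MAX_BYTES=$((3 * 1024 * 1024))

# Succeed only if the combined size of the given files is within the limit.
attachments_fit() {
    total=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 2
        fi
        size=$(wc -c < "$f")        # byte count of this file
        total=$((total + size))
    done
    echo "total bytes: $total"
    [ "$total" -le "$MAX_BYTES" ]
}
```

For example: attachments_fit patch_list ana.out trace || echo "too big, leave out the trace file".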

To speed future calls of this nature, open a call with the Response Center and inform them that you will send email with the call ID as the subject. Then send the ana.out and patch_list file to the email address listed above.
