How to collect a crash dump on Solaris x86 servers
Sometimes a system crashes or panics all of a sudden, and you find that no crash dump files were created for analysis. The vendor, however, always requires system dump files to analyze the server and provide an RCA (root cause analysis). Here I present a simple way to make sure crash dump files are generated when this happens. I am assuming that the dump device is already configured (with dumpadm) and that the crash directory (/var/crash/) is present.
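The assumption above can be checked before going further. The following is only a minimal sketch and not part of the original procedure: `savecore_dir` is a hypothetical helper name, and it assumes that `dumpadm`, run with no arguments as root, prints a line labelled `Savecore directory:`.

```shell
# Hypothetical helper: read `dumpadm` output on stdin and print the
# savecore directory. Assumes a line of the form
#   Savecore directory: /var/crash/<hostname>
savecore_dir() {
    sed -n 's/^ *Savecore directory: *//p'
}
```

On the server itself it might be used as `dir=$(dumpadm | savecore_dir); [ -d "$dir" ] && echo "crash directory present: $dir"`. The `$( )` form assumes a POSIX shell such as ksh or bash rather than the legacy /bin/sh.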
1. Edit /boot/grub/menu.lst and add "kmdb" to the multiboot kernel line, as in the following example.
Example:
title Solaris 10 10/08 s10x_u6wos_07 X86
findroot (rootfs0,0,a)
kernel /platform/i86pc/multiboot kmdb -B console=ttya
module /platform/i86pc/boot_archive
2. Edit /etc/system and add the following line.
set pcplusmp:apic_kmdb_on_nmi=1
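Before rebooting, the two edits above can be double-checked. This is a rough sketch under stated assumptions, not an official tool: the function names `menu_has_kmdb` and `system_has_nmi_setting` are hypothetical, the kernel line is assumed to start with the word `kernel` (optionally indented with spaces), and the /etc/system entry is assumed to be written on a single line as shown above.

```shell
# Succeeds if a "kernel" line in the given GRUB menu file contains
# the word kmdb. Commented-out lines (starting with '#') do not match.
menu_has_kmdb() {
    grep '^ *kernel' "$1" | grep -w kmdb >/dev/null
}

# Succeeds if the given system file sets pcplusmp:apic_kmdb_on_nmi to 1.
# Lines starting with '*' are comments in /etc/system and do not match.
system_has_nmi_setting() {
    grep '^ *set  *pcplusmp:apic_kmdb_on_nmi *= *1 *$' "$1" >/dev/null
}
```

For example, `menu_has_kmdb /boot/grub/menu.lst && system_has_nmi_setting /etc/system && echo "ready to reboot"` prints the message only when both files contain the expected entries.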
3. Reboot the system.
4. When the problem happens, open the system console with the following command at the service processor prompt.
-> start /SP/console
5. After that, run the ipmitool command from a remote system as follows, giving the address of the server's service processor after -H.
ipmitool -H <SP address> -U root chassis power diag
6. The system console will then show the kmdb prompt, as follows.
NMI detected, entering kmdb.
Welcome to kmdb
kmdb: unable to determine terminal type: assuming `vt100'
Loaded modules: [ crypto cpc uppc neti ptm ufs unix mpt zfs krtld s1394 sppp ipc
nca uhci hook lofs genunix ip logindmux usba specfs pcplusmp nfs md random
cpu.generic sctp arp ]
[0]>
7. Then type the following at the prompt.
[0]> $<systemdump
The system crash dump will then be saved to the primary swap space.
nopanicdebug: 0 = 0x1
panic[cpu0]/thread=fffffe8000005c80: BAD TRAP: type=e (#pf Page fault) rp=fffffe80000059a0 addr=0 occurred in module “” due to a NULL pointer dereference
sched: #pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0xfffffe8000005a98, eflags=0x10002
cr0: 8005003b cr4: 6f0
cr2: 0 cr3: 10bd8000 cr8: c
rdi: fffffffffbc7ef30 rsi: 3f8 rdx: 3f8
rcx: a r8: 0 r9: fffffffffbc4f560
rax: fffffffffbce6500 rbx: ffffffffefc661f8 rbp: fffffe8000005aa0
r10: fffffe80000059e0 r11: 0 r12: fffffe8000005b10
r13: ffffffff8a537c80 r14: fffffffffbc54ca0 r15: 1
fsb: ffffffff80000000 gsb: fffffffffbc27fc0 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 10 rip: 0
cs: 28 rfl: 10002 rsp: fffffe8000005a98
ss: 30
fffffe80000058b0 unix:die+da ()
fffffe8000005990 unix:trap+5e6 ()
fffffe80000059a0 unix:_cmntrap+140 ()
fffffe8000005aa0 0 ()
fffffe8000005ab0 genunix:kdi_dvec_enter+d ()
fffffe8000005ad0 unix:debug_enter+66 ()
fffffe8000005ae0 pcplusmp:apic_nmi_intr+90 ()
fffffe8000005b00 unix:av_dispatch_nmivect+1f ()
fffffe8000005b10 unix:nmiint+17e ()
fffffe8000005c00 unix:i86_mwait+d ()
fffffe8000005c40 unix:cpu_halt_mwait+b4 ()
fffffe8000005c60 unix:idle+89 ()
fffffe8000005c70 unix:thread_start+8 ()
syncing file systems… done
dumping to /dev/dsk/c0t0d0s1, offset 108593152, content: kernel
100% done: 121718 pages dumped, compression ratio 4.69, dump succeeded
rebooting…
8. The vmcore.X and unix.X files will be saved once the next Solaris boot completes. The files will be located in /var/crash/`uname -n`.
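Once the system is back up, the saved files from step 8 can be checked with a small loop. The sketch below is only an illustration: `list_dumps` is a hypothetical helper, and it assumes the files follow the vmcore.X / unix.X naming mentioned above.

```shell
# Hypothetical helper: for every vmcore.N in the given crash directory,
# report whether the matching unix.N file is also present.
list_dumps() {
    dir=$1
    for core in "$dir"/vmcore.*; do
        # Skip the unexpanded pattern when no vmcore files exist.
        [ -f "$core" ] || continue
        # Keep only the numeric suffix after the last dot.
        n=`echo "$core" | sed 's/.*\.//'`
        if [ -f "$dir/unix.$n" ]; then
            echo "dump $n: complete"
        else
            echo "dump $n: unix.$n missing"
        fi
    done
}
```

A typical call would be list_dumps /var/crash/`uname -n`, which prints one line per saved dump and nothing at all if the directory holds no vmcore files.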
/usr/bin/echo apic_kmdb_on_nmi/X | mdb -kw
;
/usr/bin/echo apic_kmdb_on_nmi/w1 | mdb -kw
cd /etc/init.d/ ; echo "/usr/bin/echo apic_kmdb_on_nmi/w1 | mdb -kw" > nmi
chmod 744 nmi ; chown root:sys nmi; ln nmi /etc/rc3.d/S85nmi
ls -li nmi ; ls -li /etc/rc3.d/S85nmi
… I do it this way. Do we really need the entry in the "system" file on Solaris 10 when setting it via mdb? :)
Karn, thanks for sharing an alternative way.