Red Hat Linux: Collecting a System Diagnostic Report for a Support Call
Red Hat Enterprise Linux 4.5 and previous
On a default installation the Sysreport package should already be installed. If it is not, install the package “sysreport-.rpm” with the following command:
# rpm -ivh sysreport-.rpm
or, if your system is registered with the Red Hat Network (RHN), simply run
# up2date -i sysreport
This will install the latest version of Sysreport on your system.
To collect the information needed to start troubleshooting, enter the command
# sysreport
and follow the on-screen instructions. At the end you are given the name and location of the compressed file containing the information collected by this script. Please keep this data for further support.
Note that Sysreport needs some time to collect all the data, depending on the speed of the system and the number of packages installed.
If Sysreport seems to hang and does not return after a while, you can pass the parameter “-norpm” to the command. This skips the check of the RPM database, which may be broken.
Red Hat Enterprise Linux 4.6 and later
The “sosreport” command is a tool that collects information about a Red Hat Enterprise Linux system. To run sosreport, the “sos” package must be installed. The package should be installed by default, but if the package is not installed, follow the steps below:
Installation on Red Hat Enterprise Linux 4.6 and later
If the system is registered with Red Hat Network (RHN), “sos” can be installed using the up2date command:
# up2date sos
Installation on Red Hat Enterprise Linux 5 and later
If the system is registered with RHN, use the yum command:
# yum install sos
If the system is not registered with RHN, the “sos” package can be downloaded from the RHN website or found on the installation CDs. The RPM command can be used to install the package on any version of Red Hat Enterprise Linux:
# rpm -Uvh sos-..rpm
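The choice between up2date and yum described above can be written down as a small helper. This is only a sketch: the function name sos_install_cmd is made up for illustration, it takes the RHEL major version as an argument, and it only prints the command from this article instead of running it.

```shell
# Hypothetical helper: print the install command this article suggests
# for a given RHEL major version. It does not install anything.
sos_install_cmd() {
    major="$1"
    # Reject empty or non-numeric input.
    case "$major" in
        ''|*[!0-9]*)
            echo "not a version number: $major" >&2
            return 1 ;;
    esac
    if [ "$major" -ge 5 ]; then
        # RHEL 5 and later: yum
        echo "yum install sos"
    elif [ "$major" -eq 4 ]; then
        # RHEL 4 (4.6 and later, per the article): up2date
        echo "up2date sos"
    else
        echo "RHEL $major is not covered by this article" >&2
        return 1
    fi
}
```

For example, `sos_install_cmd 5` prints `yum install sos`. On a system that is not registered with RHN, the rpm command shown above is still the way to go.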
To collect the system information needed to start troubleshooting, enter the following command and follow the instructions:
# sosreport
The sosreport run takes several minutes and may take longer depending on the system. Once completed, sosreport generates a compressed bz2 file under /tmp. Normally, the bz2 file is about 3MB in size.
sosreport has plugins that can be turned on and off. The following command lists the plugins:
# sosreport -l
If sosreport seems to hang and does not return after a while, you can pass the parameter “-k rpm.rpmva=off” to the command. This skips the verification of all installed packages.
# sosreport -k rpm.rpmva=off
Even though Sysreport and Sosreport collect most of the data needed for analysis, it is suggested that the content of the directory “/var/log/” also be provided, so that all relevant data is available (such as older message files, service-related log files, mcelogs, etc.).
You can tar this data with the following command:
# tar czvf logfiles.tar.gz /var/log
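If you collect logs from several servers, it can help to put the host name and a timestamp into the archive name. The sketch below is one possible wrapper around the same tar command; the function name archive_logs, the default output directory /tmp, and the file name pattern are assumptions for illustration, not part of any Red Hat tool.

```shell
# Hypothetical helper: archive a log directory into a file named after
# the host and the current time, then print the archive path.
# $1 = directory to archive (default /var/log)
# $2 = output directory (default /tmp)
archive_logs() {
    src="${1:-/var/log}"
    out_dir="${2:-/tmp}"
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$out_dir/logfiles-$(uname -n)-$stamp.tar.gz"

    # -C changes into the parent directory so the archive holds
    # relative paths (log/...) instead of absolute ones.
    tar czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    status=$?

    # GNU tar exits with 1 when a file changed while it was being read,
    # which is common for live logs. Only treat 2 and above as failure.
    [ "$status" -le 1 ] || return "$status"
    echo "$archive"
}
```

Called without arguments, `archive_logs` behaves like the command above but writes to /tmp with a name such as logfiles-HOST-DATE-TIME.tar.gz.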
Be cautious: one of my real experiences
According to Red Hat these tools are safe to run on a production system at any time, but I ran into a problem when I ran sosreport on a production machine that had a failed power supply fan.
The actual scenario:
One fine morning, a Linux server (part of a three-node VCS cluster hosting a critical application) running on HP hardware threw a hardware alert. According to the iLO logs, the machine had a power supply fan issue. To raise a Red Hat support call we needed the sosreport output, so we started the command during production hours. This created more CPU and disk activity on the machine, which in turn raised its temperature (because the cooling fan had already failed). The CPU overheating caused the server to stop responding to external connections, although it was still pingable on the network.
In this VCS setup, if one node crashes, another node should automatically pick up the applications and continue to operate. In this case the system was only hung (it didn't respond to external connections) but was still pinging, so VCS couldn't make a quick decision to fail over the application to the running nodes, which in turn caused all customer connections to fail. To recover, we had to forcefully halt the troubled node and manually fail over all the services to a working node.
This whole process took 20 minutes, and later we had to deal with many customer escalations asking
“why was the diagnosis run during production hours?”. After that, sysadmins were instructed to get permission from the business team before running any diagnosis on a production server during production hours.
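A small guard can make this kind of mistake a little less likely. The sketch below is my own assumption, not a Red Hat recommendation: it refuses to start a diagnostic command when the 1-minute load average is above a limit, and otherwise runs it at the lowest CPU priority with nice. The names load_below, run_diagnostic and LOAD_LIMIT are made up for illustration. It would not detect a failed fan, and it does not replace asking the business team first.

```shell
# Hypothetical helper: succeed only if the 1-minute load average is
# below a limit.
# $1 = limit (e.g. 2.0)
# $2 = file in /proc/loadavg format (default /proc/loadavg)
load_below() {
    limit="$1"
    file="${2:-/proc/loadavg}"
    # The first field of the first line is the 1-minute load average.
    awk -v limit="$limit" '
        NR == 1 { ok = ($1 + 0 < limit + 0) }
        END     { exit !ok }
    ' "$file"
}

# Hypothetical wrapper: run a command at low CPU priority, but only
# when the system is not already busy. LOAD_LIMIT defaults to 2.0.
run_diagnostic() {
    if load_below "${LOAD_LIMIT:-2.0}"; then
        nice -n 19 "$@"
    else
        echo "load too high, not starting: $*" >&2
        return 1
    fi
}
```

Usage would be `run_diagnostic sosreport`. Note that nice only lowers the scheduling priority; the command still does the same amount of work.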
Good one Anna…
Thanks for sharing your experience :)
sos-1.7-9.35.el4
I experienced a strange issue running sosreport (sos-1.7-9.35.el4) after a LUN presented from EMC was attached to the host.
I ran sosreport and the filesystem mounted on that EMC device disappeared. After that I had to reboot the server and ask the SAN team to re-present the LUN.
It works fine as long as we don't run sosreport, but when I run sosreport the file system mounted on /Data disappears.
Any idea where to check in sosreport why it is removing the filesystem?
Hi Shekar, why did you run sosreport? I ran sosreport to send the report to RHEL support. In your case, first check your LUN and file system to see if you are missing anything. Why did you ask the SAN team to re-present the LUN? Was the LUN not visible to the OS before? Was a reboot required? In my experience sos has nothing to do with storage; it only gets system information.
Thanks for your response S. Yes, you are correct, sosreport is for generating a report on the Linux OS. It is affecting only the Data FS (2 TB). Initially, for data, we always ask the SAN team to present the LUN to the host, and using the powermt command we generate the pseudo device and mount it (/dev/emcpowera/VGxyz /Data). It works fine as long as you don't run sosreport. Once I ran sosreport, the file system /Data disappeared from the df -k output, and I can't export or import the VG. The only option I have left is to reboot the box and ask the SAN team to re-present the same LUN. This is a bug. I even reported it to RH and to date there is no solution. I'm still investigating it. This is strange. If anyone has come across this, please let me know. (I'm running RH 4.x on a virtual machine.)
When I run pvscan, pvs or vgs, I can't find any device other than rootvg.
Output before running sosreport:
[root@host1 ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/cciss/c0d0p1 rootvg lvm2 a- 4G 0 ( Native disk is not affected )
/dev/emcpowerg VGxyz lvm2 a- 2T 0 ( affected, disappears )
Output after running sosreport:
[root@host1 Data]# pvs
PV VG Fmt Attr PSize PFree
/dev/cciss/c0d0p2 rootvg lvm2 a- 4G 0
Hi Shekar, can you please run “sosreport -vvv” and let me know the output that you see on the screen?
Hi Seema, thanks for trying to help for this.
@Shekar, yes, you are right. There was a bug with sosreport in RHEL 4.x and prior versions, but it was sorted out in RHEL 5.x. And I think Red Hat has not been able to provide a bug fix yet for the older versions, as you stated above.
Thanks buddy. If the given LUN is in EFI format (ee), it gets attached once at the beginning, but once we run sosreport it disappears. If we change the LUN and format it as native Linux LVM (8e), it holds the FS even after a sosreport run. This is tested. FYI.
In the past I have extended other FS with this command, after checking the PP size available:
chfs -a size=+1G /sap
Similarly, if PPs are available on rootvg:
If the root (/) FS on AIX has 0 space left, can I run
chfs -a size=+1G / ? Please advise.
@Shekar, I have done this in the past on AIX and it worked perfectly (for root too) if you have space available. chfs -a size=+1g
A live example:
yogesh-AIX#lsvg -o
rootvg
yogesh-AIX#lsvg rootvg
VOLUME GROUP: rootvg VG IDENTIFIER: 00c589e500004c000000013671848d74
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 319 (40832 megabytes)
MAX LVs: 256 FREE PPs: 87 (11136 megabytes)
LVs: 18 USED PPs: 232 (29696 megabytes)
OPEN LVs: 17 QUORUM: 1 (Disabled)
TOTAL PVs: 1 VG DESCRIPTORS: 2
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 1 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
PV RESTRICTION: none
yogesh-AIX#lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 1 1 closed/syncd N/A
hd6 paging 64 64 1 open/syncd N/A
hd8 jfs2log 1 1 1 open/syncd N/A
hd4 jfs2 4 4 1 open/syncd /
hd2 jfs2 48 48 1 open/syncd /usr
hd9var jfs2 8 8 1 open/syncd /var
hd3 jfs2 12 12 1 open/syncd /tmp
hd1 jfs2 1 1 1 open/syncd /home
hd10opt jfs2 9 9 1 open/syncd /opt
hd11admin jfs2 1 1 1 open/syncd /admin
livedump jfs2 2 2 1 open/syncd /var/adm/ras/livedump
rootlv jfs2 2 2 1 open/syncd /home/root
buildlv jfs2 2 2 1 open/syncd /build
nmonlv jfs2 8 8 1 open/syncd /nmon
hpovlv jfs2 4 4 1 open/syncd /var/opt/OV
mksysblv jfs2 40 40 1 open/syncd /mksysb_image
openvlv jfs2 16 16 1 open/syncd /usr/openv
pdumplv sysdump 9 9 1 open/syncd N/A
yogesh-AIX#
yogesh-AIX#
yogesh-AIX#df -g /
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 0.50 0.31 38% 10745 13% /
yogesh-AIX#
yogesh-AIX#chfs -a size=+1G /
Filesystem size changed to 3145728
yogesh-AIX#
yogesh-AIX#df -g /
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 1.50 1.31 13% 10745 4% /
yogesh-AIX#
yogesh-AIX#lsvg rootvg
VOLUME GROUP: rootvg VG IDENTIFIER: 00c589e500004c000000013671848d74
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 319 (40832 megabytes)
MAX LVs: 256 FREE PPs: 79 (10112 megabytes)
LVs: 18 USED PPs: 240 (30720 megabytes)
OPEN LVs: 17 QUORUM: 1 (Disabled)
TOTAL PVs: 1 VG DESCRIPTORS: 2
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 1 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
PV RESTRICTION: none
yogesh-AIX#
yogesh-AIX#
yogesh-AIX#
yogesh-AIX#chfs -a size=-1G /
Filesystem size changed to 1048576
yogesh-AIX#df -g /
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 0.50 0.31 38% 10745 13% /
yogesh-AIX#
yogesh-AIX#exit
@Shekar, but if you have 0 space left then you won't be able to increase any file system in any OS for any LVM. Either it throws an error or the command will hang.
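The numbers in the example above can be checked with a little arithmetic: with a PP size of 128 MB, growing by 1 GB (1024 MB) needs 8 PPs, which matches FREE PPs dropping from 87 to 79. The sketch below does that calculation; the function names pps_needed and can_grow are made up for illustration, and it assumes an unmirrored LV (a mirrored LV would need that many PPs per copy).

```shell
# Hypothetical helper: number of physical partitions needed to grow a
# file system by a given size, rounded up.
# $1 = growth in MB, $2 = PP size in MB
pps_needed() {
    size_mb="$1"
    pp_mb="$2"
    echo $(( (size_mb + pp_mb - 1) / pp_mb ))
}

# Hypothetical helper: succeed if the volume group has enough free PPs.
# $1 = FREE PPs (from lsvg), $2 = growth in MB, $3 = PP size in MB
can_grow() {
    [ "$(pps_needed "$2" "$3")" -le "$1" ]
}
```

With the values from the lsvg output, `pps_needed 1024 128` prints 8 and `can_grow 87 1024 128` succeeds, while `can_grow 0 1024 128` fails.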
I have a few questions regarding RHN:
1. Who takes care of the RHN part?
2. How do I find out whether a server is registered with RHN?
3. Under which RHN account is the server registered? Is the account ID stored in any file?
@Kiran –
>> RHN is taken care of by the infrastructure/engineering teams, who certify new packages/patches that can be used in the environment.
>> Registered servers have a digital ID stored in the file /etc/sysconfig/rhn/systemid, which is generated when we run the up2date --register command.
>> /etc/sysconfig/rhn/up2date contains the serverURL setting, which points to the RHN server of your network.
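The two files mentioned above can be checked with a few lines of shell. This is a rough sketch: the function name rhn_status is made up, the optional argument is an alternative root directory (useful for trying it out on copies of the files), and it assumes the up2date config uses plain key=value lines such as serverURL=https://...

```shell
# Hypothetical helper: report whether the RHN systemid file exists and
# which server URL is configured.
# $1 = optional root prefix (default: the real file system)
rhn_status() {
    root="${1:-}"
    systemid="$root/etc/sysconfig/rhn/systemid"
    conf="$root/etc/sysconfig/rhn/up2date"

    # A non-empty systemid file indicates the server was registered.
    if [ -s "$systemid" ]; then
        echo "registered: yes"
    else
        echo "registered: no"
    fi

    # Assumption: the setting appears as a line starting with
    # "serverURL=". Other lines in the file are ignored.
    if [ -r "$conf" ]; then
        sed -n 's/^serverURL=/serverURL: /p' "$conf"
    fi
}
```

Run as root with no arguments, `rhn_status` prints whether the systemid file is present and the configured server URL, if any.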
Thxs a lot anna :)