VxVM on Solaris (SPARC): Understanding Boot Failures

Possible reasons for a boot failure:

            1. The boot device cannot be opened.
            2. The system cannot be booted from unusable or stale plexes.
            3. A UNIX partition is invalid.
            4. There are incorrect entries in /etc/vfstab.
            5. Configuration files are missing or damaged.

i)  Boot device cannot be opened

Symptom:

SCSI device 0,0 is not responding

Can’t open boot device

Possible Reasons:

The following are common causes of the system PROM being unable to read the boot program from the boot drive:

  • The boot disk is not powered on.
  • The SCSI bus is not terminated.
  • There is a controller failure of some sort.
  • A disk is failing and locking the bus, which prevents any disks from identifying themselves to the controller and makes the controller assume that no disks are attached.

Actions:

  • Check carefully that everything on the SCSI bus is in order: look for disks that are powered off and for an unterminated bus. If any disks have failed, remove them from the bus.
  • If there are no hardware problems, the error is probably due to data errors on the boot disk. In that case, try to boot from an alternate boot disk.

ii)  Cannot boot from unusable or stale plexes

Possible Reasons:

  • The system was booted from one of the disks made bootable by VxVM while the original boot disk was turned off. The system boots normally, but the plexes that reside on the unpowered disk become stale. If the system is later rebooted from the original boot disk with that disk turned back on, it boots using the stale plex.
  • Errors in the VxVM headers on the boot disk prevent VxVM from properly identifying the disk. In this case, VxVM does not know the name of that disk. This is a problem because plexes are associated with disk names, so any plexes on the unidentified disk are unusable.
  • The root disk has a failure that affects the root volume plex. At the next boot attempt, the system still expects to use the failed root plex for booting. If the root disk was mirrored at the time of the failure, an alternate root disk can be specified for booting.

In any of the above situations, vxconfigd displays a message describing the error and the recovery advice, then halts the system.

Sample errors:

VxVM vxconfigd ERROR V-5-1-1049: System boot disk does not have a valid root plex

Please boot from one of the following disks:

Disk: disk01 Device: c0t1d0s2

vxvm:vxconfigd: Error: System startup failed

The system is down.

The above error tells you to boot from the alternate disk, disk01. Once you boot from the alternate disk, if the plexes on the original boot disk were simply stale, they are caught up automatically as the system comes up. If, on the other hand, there was a problem with the private area on the disk, or the disk failed, you need to re-add or replace the disk.

If the failure is in the private area of the root disk, we can identify the problem by running:

# vxdisk list

DEVICE       TYPE      DISK         GROUP        STATUS
-            -         rootdisk     bootdg       failed was: c0t3d0s2
c0t1d0s2     sliced    disk01       bootdg       ONLINE
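
When many disks are listed, the failed ones can be picked out with a small filter. The following is only a sketch: list_failed_disks is a hypothetical helper name, and it assumes the five-column layout of the sample output above, with the status in the fifth column and the last known device at the end of the line.

```shell
#!/bin/sh
# list_failed_disks (hypothetical helper): read `vxdisk list` output on
# stdin and print "<disk media name> <last known device>" for every
# line whose STATUS column starts with "failed".
list_failed_disks() {
    awk '$5 == "failed" {
        dev = $NF                 # last field holds the old device name
        sub(/^was:/, "", dev)     # tolerate "was:c0t3d0s2" without a space
        print $3, dev
    }'
}

# Demo on the sample output shown above. On a live system, use:
#   vxdisk list | list_failed_disks
list_failed_disks <<'EOF'
DEVICE       TYPE      DISK         GROUP        STATUS
-            -         rootdisk     bootdg       failed was: c0t3d0s2
c0t1d0s2     sliced    disk01       bootdg       ONLINE
EOF
# prints: rootdisk c0t3d0s2
```

The disk name printed here (rootdisk) is the one to re-add or replace.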

iii)  Invalid UNIX partition

The error looks like this: "File just loaded does not appear to be executable"

If this message appears during the boot attempt, the system should be booted from an alternate boot disk. While booting, most disk drivers display errors on the console about the invalid UNIX partition information on the failing disk. The messages are similar to this:

WARNING: unable to read label

WARNING: corrupt label_sdo

To resolve the issue, either re-add the failed disk or replace the failed boot disk.

iv)   Incorrect entries in /etc/vfstab

  • Damaged root (/) entry in /etc/vfstab

If the entry in /etc/vfstab for the root (/) file system is lost or is incorrect, the system boots in single-user mode. Messages similar to the following are displayed on booting the system:

INIT: Cannot create /var/adm/utmp or /var/adm/utmpx

INIT: failed write of utmpx entry: " "

Resolution:

At this point in the boot process, / is mounted read-only. Run fsck on the root partition first, then remount / read/write:

# fsck -F ufs /dev/rdsk/c0t0d0s0

# mount -o remount /dev/vx/dsk/rootvol /

After mounting / read/write, exit the shell. The system then prompts for a run level. Enter run level 3, and restore the / entry in /etc/vfstab after the system boots.

  • Damaged /usr entry in /etc/vfstab

The /etc/vfstab file has an entry for /usr only if /usr is located on a separate disk partition. After encapsulation of the disk containing the /usr partition, VxVM changes the entry in /etc/vfstab to use the corresponding volume. If this entry is damaged, boot from the CD-ROM and repair it:

ok boot cdrom -s                  (boot to single-user mode from the CD-ROM)

# mount /dev/dsk/c0t0d0s0 /a      (mount the root (/) file system on /a or /mnt)

Edit /a/etc/vfstab to fix the /usr entry:

/dev/vx/dsk/usr    /dev/vx/rdsk/usr    /usr    ufs    1    yes    -

Shut down and reboot the machine from the same root partition on which /etc/vfstab was restored.
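
Before rebooting, it can be worth confirming what the edited file actually says for a given mount point. The following is a minimal sketch, not a VxVM tool: vfstab_entry is a hypothetical helper name, and it assumes the standard vfstab column order (device to mount, device to fsck, mount point, FS type, and so on).

```shell
#!/bin/sh
# vfstab_entry FILE MOUNTPOINT (hypothetical helper): print the block
# device, raw device and file system type recorded for MOUNTPOINT.
# Comment lines are skipped. Returns non-zero if no entry is found.
# Note: on Solaris, /usr/bin/awk is the old awk and lacks -v; use nawk
# or /usr/xpg4/bin/awk there.
vfstab_entry() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $3 == mp { print $1, $2, $4; found = 1 }
        END { exit !found }
    ' "$1"
}

# Example use after editing the file from the CD-ROM environment:
#   vfstab_entry /a/etc/vfstab /usr || echo "no /usr entry found"
```

With the /usr line shown above, the helper would print the two /dev/vx device paths followed by ufs. A missing entry, or one that still points at a plain disk slice, is a sign that the file needs another look.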

v) Missing or damaged configuration files

If /etc/system is damaged or missing, and a saved copy of this file is not available on the root disk, the system cannot be booted with the VxVM rootability feature turned on.

The steps below allow you to boot the system without VxVM rootability and restore the configuration file:

a. ok> boot cdrom -s

b. # mount /dev/dsk/c0t0d0s0 /a

If a backup copy of /etc/system is available, restore it as /a/etc/system. If there is no backup, create a new /a/etc/system file with the following entries, which are required by VxVM:

set vxio:vol_rootdev_is_volume=1

forceload: drv/ < driver >

forceload: drv/vxio

forceload: drv/vxspec

forceload: drv/vxdmp

rootdev:/pseudo/vxio@0:0

To find the driver name for the entry “forceload: drv/ < driver >”, run:

# ls -al /dev/dsk/c0t0d0s2

lrwxrwxrwx … /dev/dsk/c0t0d0s2 ->  ../../devices/pci@1f,0/pci@1/pci@1/SUNW,isptwo@4/sd@0,0:c

The above output indicates that the root disk requires the “pci” and “sd” drivers, so the entries in /etc/system should look like this:

forceload: drv/pci

forceload: drv/sd
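
Putting the pieces together, the replacement file can be generated rather than typed by hand. This is only a sketch under the assumptions of this article: vxvm_system_entries is a hypothetical helper name, and the driver names passed to it (pci and sd here) must be the ones found for your own root disk.

```shell
#!/bin/sh
# vxvm_system_entries DRIVER... (hypothetical helper): print the
# /etc/system entries listed in this section, with one forceload line
# for each disk driver name given as an argument.
vxvm_system_entries() {
    echo "set vxio:vol_rootdev_is_volume=1"
    for drv in "$@"; do
        echo "forceload: drv/$drv"
    done
    echo "forceload: drv/vxio"
    echo "forceload: drv/vxspec"
    echo "forceload: drv/vxdmp"
    echo "rootdev:/pseudo/vxio@0:0"
}

# Example use from the CD-ROM environment, with root mounted on /a:
#   vxvm_system_entries pci sd > /a/etc/system
```

Review the generated file before rebooting. A restored backup of /etc/system is still preferable when one exists, because it keeps any other tuning entries the system had.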

Shut down and reboot the machine from the same root partition on which the configuration files were restored.


Ramdev


I started unixadminschool.com (aka gurkulindia.com) in 2009 as my own personal reference blog. Some time later I realized that my learnings might be helpful to other UNIX admins if I managed my knowledge base in a more user-friendly format, and the result is today's unixadminschool.com. You can connect with me at https://www.linkedin.com/in/unixadminschool/

7 Responses

  1. Sateesh says:

    It is a good one. I think this procedure is for a VxVM volume where the root file system is ufs; otherwise it should be like below. Please correct me if I am wrong. For iv):

    # fsck -F vxfs /dev/vx/rdsk/rootvol

    # mount -o remount /dev/vx/dsk/rootvol /

  2. Yogesh Raheja says:

    @Sateesh, yes, you are right. For a VxVM volume with a VxFS file system, you need to run fsck -F vxfs.

  3. virender says:

    Is there any alternative if both the server's root disk and its mirror disk have crashed?
    Please specify the steps. Thanks a lot.

  4. Ganesh Mane says:

    The above blog is very useful.
    I have some experience to share.

    Yesterday we faced an issue where the OS got corrupted.
    The Solaris server was not coming up; the boot went into a loop.
    Resolution:
    1) Boot in failsafe mode
    2) Update the boot archive
    3) Reboot the system

    It should be fine now.

    Technical steps:

    ok> boot -F failsafe

    # cd /a/platform/sun4u      (or sun4v)
    # mv boot_archive boot_archive.old
    # bootadm update-archive -R /a
    # cd /
    # umount /a
    # reboot
