Solaris Solstice DiskSuite: recover a metadevice that is in an unstable state OR how to recreate metadevice without losing data

When a metadevice goes into an undefined state, recovering it with the normal meta commands is problematic. The workaround is to re-create the metadevice after carefully recording the current configuration from metastat -p. The procedure is risky if not followed carefully, so keep a backup handy.


Examples of When to Use This Procedure

  1. A mirror is in a state where both sides are “Needs Maintenance”: neither side is “Last Erred”, and the normal commands cannot be used to clear it.
  2. A one-way mirror has had a non-fatal I/O error and has gone to “Needs Maintenance”, and it is difficult to put the mirror back online with the normal commands.
  3. There has been a reconfiguration reboot, and the underlying device files have changed controller or target address.
  4. An OK sub-mirror of a mirror (with soft partitions and in a metaset) in SVM/SDS goes into a stale state, and the sub-mirror with no data goes into the OK state. In this case you cannot use the normal “metareplace -e” command to resynchronise, because that would result in total loss of data.

Examples of When NOT to Use This Procedure

  1. One sub-mirror is “Needs Maintenance.” The other sub-mirror is “Last Erred”.
  2. One sub-mirror with valid data is “Okay.” The other sub-mirror is “Last Erred” or “Needs Maintenance”.

For these conditions, use the normal documented recovery procedures. If they fail without a known cause, continue with this procedure.

The procedure below can be used with all versions of Solstice DiskSuite and Solaris Volume Manager software. Note, however, that on Solstice DiskSuite versions prior to 4.2.1 the “meta” commands reside in /usr/opt/SUNWmd/sbin, so you should add this directory to your PATH before starting the procedure. To do this, type:

PATH=$PATH:/usr/opt/SUNWmd/sbin

On Solstice DiskSuite 4.2.1 and Solaris Volume Manager the commands are in /usr/sbin which is expected to be in root’s PATH by default.

Recovery procedure

With Solaris Volume Manager you can clear any metadevice and re-create it without losing data.  The following example, with metadevice d8 that cannot be repaired, illustrates such a recovery.

Step 1. Examine the metastat output carefully to determine the state of the mirror and its sub-mirrors.

# metastat [-s setname] d8
d8: Mirror
    Submirror 0: d82
      State: Needs Maintenance
    Submirror 1: d83
      State: Needs Maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 35307016 blocks (16 GB)

In the example, both sides are “Needs Maintenance.”  If one side is “Last Erred,” use the normal mirror recovery procedures described in the “Solaris Volume Manager User Guide.” If those procedures do not work, continue with this procedure.

Step 2. Determine if one or both sub-mirrors are active.

The purpose of this step is to determine which side to use as the first sub-mirror when you re-create it later. If you already have this information, or if the mirror is a one-way mirror, you can skip this step.

Use iostat to determine which sub-mirror is active:

# iostat -xnz 10 6                  # -z: do not show lines that are all zeros
                 extended device statistics              
 r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 0.1    4.7    0.8    5.3  0.1  0.1   10.9   19.8   5   4 d8
 0.1    4.7    0.8    5.3  0.0  0.1    0.0   18.6   0   4 d82
 0.0    4.7    0.0    5.3  0.0  0.1    0.0   18.3   0   3 d83
 0.1    6.4    0.8    6.1  0.0  0.1    0.0   21.6   0   6 c1t2d0
 0.0    6.4    0.0    6.1  0.0  0.1    0.0   21.3   0   6 c1t3d0

In this example, I/O is still active on both sides of the mirror. If only one side is active, that sub-mirror should be used as the initial sub-mirror in step 9.
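As a rough aid for reading the iostat report, the read and write rates of each sub-mirror can be totalled with a short script. This is only a sketch: the function name io_by_device is hypothetical, it assumes the 11-column "iostat -xn" layout shown above, and it adds up every report it is given, including the first one, which covers the time since boot. On Solaris, substitute nawk or /usr/xpg4/bin/awk, because the older /usr/bin/awk does not accept -v.

```shell
#!/bin/sh
# Sum r/s + w/s for the named devices in "iostat -xn" style output
# read from standard input. io_by_device is a hypothetical helper name.
io_by_device() {
    awk -v devs="$*" '
        BEGIN { n = split(devs, want, " ") }
        # Data lines have 11 fields and start with a number; headers do not.
        NF == 11 && $1 ~ /^[0-9.]+$/ { ops[$NF] += $1 + $2 }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %.1f\n", want[i], ops[want[i]] + 0
        }'
}

# Sample lines copied from the iostat report shown above.
io_by_device d82 d83 <<'EOF'
                 extended device statistics
 r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
 0.1    4.7    0.8    5.3  0.1  0.1   10.9   19.8   5   4 d8
 0.1    4.7    0.8    5.3  0.0  0.1    0.0   18.6   0   4 d82
 0.0    4.7    0.0    5.3  0.0  0.1    0.0   18.3   0   3 d83
EOF
```

For the sample this prints "d82 4.8" and "d83 4.7", which agrees with the observation that both sides are active. A sub-mirror that stays at 0.0 over several reports is probably not receiving I/O.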

Step 3.  To have a safe record of the original device, capture metastat -p for the failed metadevice:

# metastat [-s setname] -p d8 | tee d8.metastat
d8 -m d82 d83 1
d82 1 1 c1t2d0s0
d83 1 1 c1t3d0s0

(This information can also be taken from an old Explorer output, if one is available.)

Step 4. Check for any soft partitions above the device:

# metastat [-s setname] -p | grep d8 | tee d8.sp
d8 -m d82 d83 1
d82 1 1 c1t2d0s0
d83 1 1 c1t3d0s0
d106 -p d8 -o 28286982 -b 1433600
d105 -p d8 -o 29720583 -b 2097152
d104 -p d8 -o 27262981 -b 1024000
d103 -p d8 -o 25165828 -b 2097152
d102 -p d8 -o 20971523 -b 4194304
d101 -p d8 -o 8388610 -b 12582912  -o 31817736 -b 3481600
d100 -p d8 -o 1 -b 8388608

In this example, 7 soft partitions are on top of d8.

  • Note: Do not proceed without carefully capturing and saving this data.
  • Caution: The grep may collect lines for other devices besides the one you are repairing; for example, the pattern d8 also matches d18 or d80.
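One way to reduce the risk named in the caution is to match whole fields instead of substrings. The sketch below is an assumption-laden illustration, not part of the original procedure: lines_for_mirror is a hypothetical helper, it expects "metastat -p" output on standard input, and it treats every non-numeric, non-option field after -m as a sub-mirror name. The d80 and d200 lines in the sample are invented only to show the filtering. On Solaris, use nawk or /usr/xpg4/bin/awk.

```shell
#!/bin/sh
# Print only the "metastat -p" lines that belong to one mirror:
# the mirror itself, its sub-mirrors, and soft partitions built on it.
lines_for_mirror() {
    awk -v m="$1" '
        { line[NR] = $0 }
        $1 == m && $2 == "-m" {
            # Remaining fields are sub-mirror names plus the pass number.
            for (i = 3; i <= NF; i++)
                if ($i !~ /^-/ && $i !~ /^[0-9]+$/) sub_of[$i] = 1
        }
        END {
            for (n = 1; n <= NR; n++) {
                split(line[n], f, " ")
                if (f[1] == m || (f[1] in sub_of) || (f[2] == "-p" && f[3] == m))
                    print line[n]
            }
        }'
}

# First four lines are from the example above; d80 and d200 are hypothetical.
lines_for_mirror d8 <<'EOF'
d8 -m d82 d83 1
d82 1 1 c1t2d0s0
d83 1 1 c1t3d0s0
d100 -p d8 -o 1 -b 8388608
d80 1 1 c1t4d0s0
d200 -p d80 -o 1 -b 1024
EOF
```

With this sample, the d80 and d200 lines are dropped, whereas a plain "grep d8" would have kept them. Compare the result against the full metastat -p output before relying on it.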

As an alternative to steps 3 and 4, ensure md.tab is up to date:

For Solstice DiskSuite 4.2.1 and Solaris Volume Manager:

# mv /etc/lvm/md.tab /etc/lvm/md.tab.old
# metastat [-s setname] -p > /etc/lvm/md.tab

For versions of Solstice DiskSuite prior to 4.2.1, the md.tab file is not in the location stated above. Instead, use:

# mv /etc/opt/SUNWmd/md.tab /etc/opt/SUNWmd/md.tab.old
# metastat [-s setname] -p > /etc/opt/SUNWmd/md.tab

Step 5. Use commands such as “df -k” and “swap -l” to look for all filesystems and applications that use any of the metadevices.

A. Look for file systems:

# df -k | grep /dev/md/dsk/d8

In this example, there is no filesystem because d8 is only a base device for soft partitions. Otherwise, there would probably be a filesystem on the metadevice.

B. Look for swap devices:

# swap -l | grep  /dev/md/dsk/d8

The output shows that there is no swap device. If there were, you would need to take the normal steps to provide sufficient swap while this device is temporarily disabled.

Step 6. If you found soft partitions on the base metadevice in step 4, find and record the relevant filesystems, using a grep pattern that matches your circumstances (here, d10 matches the soft partitions d100 through d106):

# df -k | grep /dev/md/dsk/d10 | awk '{ print $6 }'
/data
/data/packages
/data/crash
/data/dl
/data/install
/data/patches
/data/samba

Step 7. Clear the soft partition metadevices:

# metaclear [-s setname] d100
# metaclear [-s setname] d101
# metaclear [-s setname] d102
# metaclear [-s setname] d103
# metaclear [-s setname] d104
# metaclear [-s setname] d105
# metaclear [-s setname] d106

Clear the soft partitions one by one, or use the -p option to clear all soft partitions on the mirror at once:

# metaclear [-s setname] -p <mirror>

Step 8. Recursively clear the mirror (the -f option could be required):

# metaclear [-s setname] -r d8

Step 9. Re-create the metadevices, first making sub-mirrors:

# metainit [-s setname] d82 1 1 c1t2d0s0
# metainit [-s setname] d8 -m d82

  • The initial sub-mirror used is d82. If step 2 showed I/O to only one sub-mirror, then use the same sub-mirror in this step.
  • If step 1 showed one sub-mirror as “Last Erred,” then use that sub-mirror as the initial sub-mirror in this step.
  • If you are in the state described in item 4 of “Examples of When to Use This Procedure,” you must use the sub-mirror that contains the valid data.
  • NOTE: Check this step closely, or data loss will occur.

Step 10. Re-create the soft partitions, which can be done while d8 is syncing up:

# metainit [-s setname] d100 -p d8 -o 1 -b 8388608
# metainit [-s setname] d101 -p d8 -o 8388610 -b 12582912  -o 31817736 -b 3481600
# metainit [-s setname] d102 -p d8 -o 20971523 -b 4194304
# metainit [-s setname] d103 -p d8 -o 25165828 -b 2097152
# metainit [-s setname] d104 -p d8 -o 27262981 -b 1024000
# metainit [-s setname] d105 -p d8 -o 29720583 -b 2097152
# metainit [-s setname] d106 -p d8 -o 28286982 -b 1433600
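Because each line saved from metastat -p is already in metainit argument form, the commands for this step can be generated from the saved file instead of being retyped. A minimal sketch, assuming the file from step 4 is fed on standard input; sp_commands is a hypothetical helper name, and it only prints the commands so that they can be checked against the saved offsets and sizes before anything is run. On Solaris, use nawk or /usr/xpg4/bin/awk.

```shell
#!/bin/sh
# Print a metainit command for every soft partition defined on the
# given metadevice in saved "metastat -p" output (standard input).
sp_commands() {
    awk -v base="$1" '$2 == "-p" && $3 == base { print "metainit " $0 }'
}

# Sample lines copied from the d8.sp capture in step 4.
sp_commands d8 <<'EOF'
d8 -m d82 d83 1
d82 1 1 c1t2d0s0
d106 -p d8 -o 28286982 -b 1433600
d100 -p d8 -o 1 -b 8388608
EOF
```

The mirror and sub-mirror lines are skipped because they are re-created by hand in step 9, where the choice of the initial sub-mirror matters. If a metaset is in use, the "-s setname" option would still have to be added to each printed command.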

Step 11 (optional). If time permits, run fsck -n on each filesystem:

  • If the fsck fails very quickly, you might have re-created the metadevice incorrectly.
  • If the fsck finds other errors, those errors might have been in the filesystem originally and are not related to this procedure.

Step 12. Mount the filesystems as they were originally:

  • If the mount fails, then you might have re-created the metadevice incorrectly (usually getting the order of component slices wrong).
  • To recover, use metaclear and metainit again.

Step 13. Once data availability is confirmed, re-initialize the other sub-mirror and attach it to the mirror:

# metainit [-s setname] d83 1 1 c1t3d0s0
# metattach [-s setname] d8 d83   

 

=======================================================================

Using This Procedure With Other Metadevice Types

  • RAID-5 metadevices: This same procedure can be used on a Solaris Volume Manager RAID-5 metadevice.

    The “-k” option.  When you rebuild the device in step 9, the “-k” option is of great importance.  This option tells Solaris Volume Manager not to initialize the metadevice, but to use the data that is already on the disk.  Failure to do this causes complete loss of existing data on the device.

  • Concat or stripe metadevices (or both):  This procedure can also be used on concat or stripe metadevices, or both.  Running metaclear and metainit does not affect user data.

    Stripes and concats must be re-created with each slice component in the original order. Re-creating a stripe or concat in the wrong order does no harm by itself, as long as the data (filesystem) is not modified in the meantime (for example with fsck -y). If you accidentally re-create a stripe in the wrong order, just metaclear it and re-create it in the correct order. If you do not know the correct order, repeat the metainit and fsck -n steps until fsck completes successfully.
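Since a missing “-k” on a RAID-5 rebuild destroys the data, it can be worth checking the argument line mechanically before passing it to metainit. The following is a hedged sketch: raid5_args_with_k is a hypothetical helper, the d45 definition is an invented example, and appending -k at the end of the line is an assumption about option ordering that should be confirmed against the metainit man page for your release.

```shell
#!/bin/sh
# Given the arguments of a RAID-5 definition (a line containing -r),
# print them with -k present, adding it only if it is missing.
raid5_args_with_k() {
    case " $1 " in
        *" -r "*) ;;
        *) echo "not a RAID-5 definition: $1" >&2; return 1 ;;
    esac
    case " $1 " in
        *" -k "*) printf '%s\n' "$1" ;;       # already safe, leave as is
        *)        printf '%s -k\n' "$1" ;;    # assumption: -k may go last
    esac
}

# Hypothetical RAID-5 definition; nothing is executed, only printed.
raid5_args_with_k "d45 -r c1t1d0s0 c1t2d0s0 c1t3d0s0 -i 32b"
```

The helper refuses lines without -r, so it cannot be applied by mistake to a mirror, stripe, or soft partition definition.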

 

========================================================================

Using This Procedure With Controller Numbers That Have Changed

  • This procedure can also be used when a reconfiguration reboot has changed the device tree’s c# value. Simply edit the saved files to reflect the new c#.
  • When a controller number changes, it is necessary to re-initialize all meta databases (MDBs) on those controllers.
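Editing the saved files by hand is error-prone when many slices are involved, so the controller rename can be sketched as a filter. This is an illustration under assumptions: remap_controller is a hypothetical helper, c3 is an invented new controller number, and only whole fields of the form c#t#d# or c#t#d#s# are rewritten, so that c1 does not accidentally match c11. On Solaris, use nawk or /usr/xpg4/bin/awk.

```shell
#!/bin/sh
# Rewrite the controller part of disk names in saved "metastat -p"
# output. Usage: remap_controller OLD NEW < saved-file
remap_controller() {
    awk -v old="$1" -v new="$2" '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ ("^" old "t[0-9A-Fa-f]+d[0-9]+(s[0-9]+)?$"))
                    $i = new substr($i, length(old) + 1)
            print
        }'
}

# Sub-mirror lines from step 3, moved from c1 to a hypothetical c3.
remap_controller c1 c3 <<'EOF'
d8 -m d82 d83 1
d82 1 1 c1t2d0s0
d83 1 1 c1t3d0s0
EOF
```

Write the result to a new file and compare it with the original (for example with diff) before using it for metainit; the original capture should be kept unchanged.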

Working with embedded MDBs

  • An embedded MDB (metadevice state database replica) is any sub-mirror or slice of the device you are re-creating that has “Yes” underneath Dbase in the metastat output.
  • An embedded MDB does not affect this procedure. The metadevice can be safely cleared and initialized without touching the embedded MDB. If you are using this procedure as part of a controller number change, you should re-initialize all MDBs.
  • Embedded MDBs are not best practice, and you should consider using this opportunity to remove them.

Ramdev

I started unixadminschool.com (aka gurkulindia.com) in 2009 as my own personal reference blog. Some time later I realized that my learnings might be helpful to other Unix admins if I managed my knowledge base in a more user-friendly format, and the result is today's unixadminschool.com. You can connect with me at https://www.linkedin.com/in/unixadminschool/

8 Responses

  1. Ian says:

    This article was very useful but I wonder if you could confirm something for me.

    I will need to replace an FC controller in one of our servers soon. This FC connects the host to our SAN.
    We have several soft partitions on the SAN disk presented through this controller. These soft partitions look like this:

    d152 -p c6t0d0s0 -o 73448517 -b 62914560 -o 1942969449 -b 41943040

    d153 -p c6t0d0s0 -o 157334599 -b 56623104

    I presume that first I need to carefully record the metastat -p output. Then, once the new FC card is fitted the c#t#d# for these metadevices will change, so I will need to delete and recreate these devices using metaclear and metainit to reflect the new device path to the disk.

    Is that correct, or have I missed something ?

    A reply to mail address would be very much appreciated.

    Thanks,

    Ian

    • admin admin says:

      Hello Ian, my understanding is that replacing the FC card does not change the controller numbers unless you are changing the slot or the card model.

      To be on the safe side, keep the following outputs handy before you go for maintenance: 1. echo | format 2. metastat 3. metastat -p.

      If the controller numbers do change, then yes, you need to re-create the devices with the same offsets and block counts. But please make sure you have a proper full backup before doing it; a single mistake could destroy all of your data.

      Please feel free to mail me if you need further help.

      Regards
      Ramdev

      • Ian says:

        Hi Ramdev,

        Many thanks for your reply.

        I may have to replace the FC card due to driver compatibility issues with our SAN. The FC card is a Qlogic QLA2342 (according to the OS). The driver version we are using is qla-5.04 (server is running Solaris 10 x86)

        I needed to replace the qla driver with the Sun qlc driver as we have now found out the qla driver is not supported by our SAN for x86 Solaris (only SPARC). I thought I could simply replace the driver so I contacted Qlogic Support to confirm this was possible.

        They said that if the card was a native Qlogic (as opposed to the Sun branded model of the card) then I could not install the qlc driver – it would not work.

        They asked me to run a script that would report the serial number of the card. From that they would be able to identify if the card was native Qlogic or Sun branded.

        I sent them the script output and they then told me the serial number did not appear in their database !!

        So now I’m a bit stuck. I have a card in a machine that I cannot fully identify, and therefore don’t know whether I can simply replace the qla driver with qlc. Unfortunately the machine is heavily used and is located over 100 miles away so I cannot easily check.

        Looking on the web, lots of people say the qlc driver will work fine, even if the card IS a native Qlogic, so right now I don’t know what, or who, to believe !!

        Thanks,

        Ian.

        • admin admin says:

          @Ian, I understand the situation. I wonder whether Oracle could help you identify the exact card details if you upload the explorer output. If you want to try a little more, you can break the root mirror (if you have SVM in place), install the new driver, and verify the SAN connectivity; in case of issues, just boot back from the second disk and re-create your mirror. You can go for the card replacement later.

  2. sluge says:

    Hello!
    Is there any way to restore a ZFS pool when the WWNs of the disks have changed?

  3. Sukhbir says:

    Is there any chance of applying the old VTOC using the new disk name?
