VxVM: Recovering a mirrored volume when both plexes are in an error state

This post is a continuation of Yogesh’s recent post “Recovering Disabled Disk groups” (which covers recovering a disk group that has no volume-level problems but simply failed to start during boot), and is also related to the post “Recovering an unstartable volume with a disabled plex in the RECOVER state” (which covers recovering a volume that has one STALE plex alongside another ACTIVE plex).

In this scenario we discuss recovering a mirrored volume that was DISABLED, with both of its underlying plexes in a state such as DISABLED, RECOVER or STALE (to learn more about these states, read the post Veritas Plex State Transition). This can happen after an unplanned system shutdown caused by a power supply failure.

 

We first tried to run fsck against the failed volume to see its status, but fsck failed with the device error below.

# fsck /dev/vx/rdsk/gurkul-dg/gurkuldata

Can’t open /dev/vx/rdsk/gurkul-dg/gurkuldata: No such device or address

After checking the logs further, we saw no errors on the disks related to this volume, so we knew the problem was limited to the volume and plex states recorded by Volume Manager.

The ideal recovery procedure is as follows:

 

First, offline the STALE plex.

# vxmend -g gurkul-dg off gurkuldata-02 

               

Force-start the volume. This starts the volume using the other plex, which is in the RECOVER state.

# vxvol -g gurkul-dg -f start gurkuldata 

 

Check the disk group status to confirm the effect of the above commands.

# vxprint -h -g gurkul-dg

TY NAME             ASSOC           KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0

dg gurkul-dg        gurkul-dg        –        –        –        –        –       –

dm d00              SUN35101_0       –        209704704 –       –        –       –

dm d10              SUN35100_3       –        209704704 –       FAILING  –       –

v  gurkuldata       fsgen            ENABLED  209700864 –       ACTIVE   –       –  *****

pl gurkuldata-01    gurkuldata       ENABLED  209700864 –       ACTIVE   –       –

sd d10-01           gurkuldata-01    ENABLED  209700864 0       –        –       –

pl gurkuldata-02    gurkuldata       DISABLED 209700864 –       OFFLINE  –       –  *****

sd d00-01           gurkuldata-02    ENABLED  209700864 0       RELOCATE –       –

pl gurkuldata-03    gurkuldata       ENABLED  LOGONLY  –        ACTIVE   –       –

sd d10-02           gurkuldata-03    ENABLED  2112     LOG      –        –       –
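When a disk group holds many volumes, picking out the plexes that still need attention by eye gets tedious. The following is a minimal sketch, not part of the original procedure, that filters `vxprint -h` style output for plexes that are not ENABLED/ACTIVE. The function name `unhealthy_plexes` is hypothetical, and the column positions are assumed to match the listing above.

```shell
# Hypothetical helper: print every plex whose KSTATE/STATE pair is not ENABLED/ACTIVE.
# Reads `vxprint -h` style output on stdin. Assumes the column order shown above:
# TY NAME ASSOC KSTATE LENGTH PLOFFS STATE ...
unhealthy_plexes() {
    awk '$1 == "pl" && !($4 == "ENABLED" && $7 == "ACTIVE") { print $2, $4, $7 }'
}

# Demo on two plex lines taken from the listing above.
unhealthy_plexes <<'EOF'
pl gurkuldata-01    gurkuldata       ENABLED  209700864 -       ACTIVE   -       -
pl gurkuldata-02    gurkuldata       DISABLED 209700864 -       OFFLINE  -       -
EOF
```

On a live system the same function could be fed directly, for example `vxprint -h -g gurkul-dg | unhealthy_plexes`. Empty output would mean every plex is ENABLED and ACTIVE.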

 

Now check that the file system on the raw volume device has no issues.

# fsck /dev/vx/rdsk/gurkul-dg/gurkuldata  

** /dev/vx/rdsk/gurkul-dg/gurkuldata

** Last Mounted on /global/gurkul

** Phase 1 – Check Blocks and Sizes

** Phase 2 – Check Pathnames

** Phase 3a – Check Connectivity

** Phase 3b – Verify Shadows/ACLs

** Phase 4 – Check Reference Counts

** Phase 5 – Check Cylinder Groups

6587 files, 206212 used, 102940067 free (14059 frags, 12865751 blocks, 0.0% fragmentation)

 

Since the first plex (gurkuldata-01) is now in the ENABLED and ACTIVE state, we bring the offline plex back online so that it can be synchronized from the ACTIVE plex.

# vxmend -g gurkul-dg on gurkuldata-02 

 

Now recover the volume to synchronize both plexes.

# vxrecover -g gurkul-dg gurkuldata

 

Check the disk group status again. The volume and both plexes should now be ENABLED and ACTIVE.

# vxprint -h -g gurkul-dg

TY NAME             ASSOC           KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0

dg gurkul-dg        gurkul-dg        –        –        –        –        –       –

dm d00              SUN35101_0       –        209704704 –       –        –       –

dm d10              SUN35100_2       –        209704704 –       FAILING  –       –

v  gurkuldata       fsgen            ENABLED  209700864 –       ACTIVE   –       –

pl gurkuldata-01    gurkuldata       ENABLED  209700864 –       ACTIVE   –       –

sd d10-01           gurkuldata-01    ENABLED  209700864 0       –        –       –

pl gurkuldata-02    gurkuldata       ENABLED  209700864 –       ACTIVE   –       –

sd d00-01           gurkuldata-02    ENABLED  209700864 0       –        –       –

pl gurkuldata-03    gurkuldata       ENABLED  LOGONLY  –        ACTIVE   –       –

sd d10-02           gurkuldata-03    ENABLED  2112     LOG      –        –       –
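The sequence above can be collected into a small wrapper for rehearsal. This is only a sketch under stated assumptions: `run` and `recover_mirror` are hypothetical names, the wrapper uses only the commands already shown in this post, and it defaults to a dry run that prints each command instead of executing it.

```shell
# Sketch only: wraps the commands from this post. run() and recover_mirror() are
# hypothetical names. With DRY_RUN=1 (the default) commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: recover_mirror <diskgroup> <volume> <stale_plex>
recover_mirror() {
    dg=$1
    vol=$2
    stale=$3
    run vxmend -g "$dg" off "$stale"   || return 1  # offline the STALE plex
    run vxvol -g "$dg" -f start "$vol" || return 1  # force-start on the other plex
    run fsck "/dev/vx/rdsk/$dg/$vol"   || return 1  # check the file system
    run vxmend -g "$dg" on "$stale"    || return 1  # bring the plex back online
    run vxrecover -g "$dg" "$vol"                   # resynchronize the mirror
}

# Dry run with the names used in this post.
recover_mirror gurkul-dg gurkuldata gurkuldata-02
```

Stopping at the first failing step is a deliberate choice here: if the forced start or the fsck fails, re-attaching and resynchronizing the other plex is not something to do blindly. In practice you would still review the `vxprint` output between steps, as done above.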

 

 

If we look at the above output, one of the related disks (dm d10) is still flagged as FAILING after the unexpected power-down. From the logs we know there is no error on the disk, so we clear the failing flag for the disk using the command below.

 

# vxedit -g gurkul-dg set failing=off d10

 

Now check the disk group status again. The FAILING flag on the disk has disappeared and everything looks good.

 

# vxprint -h -g gurkul-dg

TY NAME             ASSOC           KSTATE   LENGTH   PLOFFS   STATE    TUTIL0  PUTIL0

dg gurkul-dg        gurkul-dg        –        –        –        –        –       –

dm d00              SUN35101_0       –        209704704 –       –        –       –

dm d10              SUN35100_2       –        209704704 –       –        –       –

v  gurkuldata       fsgen            ENABLED  209700864 –       ACTIVE   –       –

pl gurkuldata-01    gurkuldata       ENABLED  209700864 –       ACTIVE   –       –

sd d10-01           gurkuldata-01    ENABLED  209700864 0       –        –       –

pl gurkuldata-02    gurkuldata       ENABLED  209700864 –       ACTIVE   –       –

sd d00-01           gurkuldata-02    ENABLED  209700864 0       –        –       –

pl gurkuldata-03    gurkuldata       ENABLED  LOGONLY  –        ACTIVE   –       –

sd d10-02           gurkuldata-03    ENABLED  2112     LOG      –        –       –

#
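To confirm that no disk in the group still carries the flag, the disk media (dm) records can be filtered in the same way. This is a hedged sketch: `failing_disks` is a hypothetical helper name, and it assumes the FAILING marker appears in the STATE column of `vxprint -h` output as it does in the listings above.

```shell
# Hypothetical helper: list disk media (dm) records whose STATE column reads FAILING.
# Reads `vxprint -h` style output on stdin.
failing_disks() {
    awk '$1 == "dm" && $7 == "FAILING" { print $2 }'
}

# Demo on the two dm lines as they looked before the flag was cleared.
failing_disks <<'EOF'
dm d00              SUN35101_0       -        209704704 -       -        -       -
dm d10              SUN35100_2       -        209704704 -       FAILING  -       -
EOF
```

Run against the live group as `vxprint -h -g gurkul-dg | failing_disks`. No output means no disk is marked FAILING.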


That’s it for the day. Please feel free to drop your comments and questions, and let us know your experience in recovering a disabled volume or disk group.

 

 

Ramdev


I started unixadminschool.com (aka gurkulindia.com) in 2009 as my own personal reference blog, and later realized that my learnings might be helpful to other Unix admins if I managed my knowledge base in a more user-friendly format. The result is today's unixadminschool.com. You can connect with me at https://www.linkedin.com/in/unixadminschool/
