VCS: Understanding the Jeopardy Condition with a Real-Time Scenario

Issue: a VCS service group did not fail over when the server was rebooted.

Error messages in /var/adm/messages file:

Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen ef2e0f membership 01

Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port a gen ef2e0f jeopardy ;1

Mar 18 15:26:49 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port b gen ef2e11 membership 01

Mar 18 15:26:49 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port b gen ef2e11 jeopardy ;1

Mar 18 15:26:49 p-vhs-ntt01b-d2 vxfen: [ID 416634 kern.notice] NOTICE: VXFEN INFO V-11-1-35 Fencing driver going into RUNNING state

Mar 18 15:26:51 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port h gen ef2e28 membership 01

Mar 18 15:26:51 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port h gen ef2e28 jeopardy ;1
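
The jeopardy lines above can be picked out of the log mechanically. The following is a minimal sketch, not a Veritas tool: the function names are made up for illustration, and it assumes the node field is positional (the character at position i stands for node i, with ";" as a placeholder for a node that is not listed), which fits the two-node output above but is not handled here for node IDs above 9.

```python
import re

# Matches the GAB membership/jeopardy lines shown in the log excerpt.
GAB_LINE = re.compile(
    r"Port (?P<port>\w) gen (?P<gen>[0-9a-f]+) "
    r"(?P<kind>membership|jeopardy) (?P<nodes>\S+)"
)

def decode_nodes(field):
    """Return node IDs from a positional field such as '01' or ';1'.

    Assumption: position i describes node i, and ';' marks a node that
    is absent from this list. Only single-digit node IDs are handled.
    """
    return [i for i, ch in enumerate(field) if ch != ";"]

def jeopardy_by_port(lines):
    """Map each GAB port letter to the node IDs reported in jeopardy."""
    result = {}
    for line in lines:
        m = GAB_LINE.search(line)
        if m and m.group("kind") == "jeopardy":
            result[m.group("port")] = decode_nodes(m.group("nodes"))
    return result

sample = [
    "Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] "
    "GAB INFO V-15-1-20036 Port a gen ef2e0f membership 01",
    "Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] "
    "GAB INFO V-15-1-20037 Port a gen ef2e0f jeopardy ;1",
    "Mar 18 15:26:51 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] "
    "GAB INFO V-15-1-20037 Port h gen ef2e28 jeopardy ;1",
]

print(jeopardy_by_port(sample))  # {'a': [1], 'h': [1]}
```

Under that assumption, node 1 (p-vhs-ntt01b-d2) is the one in jeopardy on each port.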

 

Further analysis revealed that a link was down:

 

# lltstat -nvv

Output:

LLT node information:

    Node                 State    Link    Status    Address
  * 0 p-vhs-ntt01a-d2    OPEN
                                  nxge0   UP        00:21:28:62:0B:D6
                                  nxge4   UP        00:21:28:63:27:B6
    1 p-vhs-ntt01b-d2    OPEN
                                  nxge0   UP        00:21:28:63:1C:96
                                  nxge4   DOWN
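
Spotting the DOWN link by eye is easy on two nodes but tedious on larger clusters. As a rough sketch, assuming the node/link layout shown above (a node line of "id name state" followed by one line per link), the output text can be scanned like this. The helper name `down_links` is hypothetical, and the sample text is simply the output from this post.

```python
def down_links(llt_output):
    """Return (node_name, link_name) pairs whose link status is DOWN."""
    down = []
    node = None
    for raw in llt_output.splitlines():
        # The '*' only marks the local node, so drop it before splitting.
        parts = raw.replace("*", " ").split()
        if len(parts) >= 3 and parts[0].isdigit():
            # Node line: "<id> <name> <state>"
            node = parts[1]
        elif len(parts) >= 2 and parts[1] == "DOWN":
            # Link line: "<link> <status> [<address>]"
            down.append((node, parts[0]))
    return down

sample = """\
LLT node information:
Node State Link Status Address
* 0 p-vhs-ntt01a-d2 OPEN
nxge0 UP 00:21:28:62:0B:D6
nxge4 UP 00:21:28:63:27:B6
1 p-vhs-ntt01b-d2 OPEN
nxge0 UP 00:21:28:63:1C:96
nxge4 DOWN
"""

print(down_links(sample))  # [('p-vhs-ntt01b-d2', 'nxge4')]
```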

 

EXPLANATION OF JEOPARDY STATUS:

VCS protects against network failure by requiring that all systems be connected via two or more communications channels.  When a system is down to a single heartbeat connection, VCS can no longer discriminate between the loss of a system and the loss of a network connection.  This situation is referred to as jeopardy.

When LLT on a system no longer receives heartbeat messages from another system on any of the configured LLT interfaces, GAB reports a change in membership.  When a system is down to only one interconnect link, GAB can no longer reliably discriminate between loss of a system and loss of the network.  The reliability of the system’s membership is considered at risk.  In this situation jeopardy membership takes effect.  This provides the best possible split-brain protection without membership arbitration and SCSI-3 devices.

Two actions take effect when a system that is in jeopardy loses its last interconnect link:

  • VCS places the service groups running on the system in the autodisabled state.  A service group in the autodisabled state may fail over on a resource or group fault, but cannot fail over on a system fault until the autodisabled flag is manually cleared.
  • VCS operates the system as a single-system cluster.  Other systems in the cluster are partitioned off in a separate cluster membership.
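
The rules above can be summarized as a toy model. This is only an illustration of the behaviour described in this post, not VCS code; the names `Membership`, `classify`, and `autodisabled_on_last_link_loss` are invented for the example.

```python
from enum import Enum

class Membership(Enum):
    REGULAR = "regular"
    JEOPARDY = "jeopardy"
    NONE = "no heartbeat"

def classify(links_up):
    """Classify a peer by how many interconnect links still carry heartbeats."""
    if links_up >= 2:
        return Membership.REGULAR
    if links_up == 1:
        return Membership.JEOPARDY
    return Membership.NONE

def autodisabled_on_last_link_loss(previous):
    """True if service groups are autodisabled when the last link is lost.

    Per the explanation above, this applies when the peer was already in
    jeopardy, so a system fault can no longer trigger failover.
    """
    return previous is Membership.JEOPARDY

# Scenario from this post: nxge4 is down, so node 1 has one link left.
state = classify(1)
print(state.value)                            # jeopardy
print(autodisabled_on_last_link_loss(state))  # True
```

This matches the reported symptom: with the node already in jeopardy, the reboot looked like the loss of the last link, so the service group was autodisabled rather than failed over.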

 

Resolution:

Fix the nxge4 link on node p-vhs-ntt01b-d2. Once both links are up again, the cluster returns to normal membership status.

 

Ramdev

I started unixadminschool.com (aka gurkulindia.com) in 2009 as my own personal reference blog, and later realized that my learnings might be helpful for other Unix admins if I managed my knowledge base in a more user-friendly format. The result is today's unixadminschool.com. You can connect with me at https://www.linkedin.com/in/unixadminschool/
