VCS: Understanding the Jeopardy Condition with a Real-Time Scenario
Issue: A VCS service group did not fail over when the server was rebooted.
Messages logged in the /var/adm/messages file:
Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen ef2e0f membership 01
Mar 18 15:26:48 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port a gen ef2e0f jeopardy ;1
Mar 18 15:26:49 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port b gen ef2e11 membership 01
Mar 18 15:26:49 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port b gen ef2e11 jeopardy ;1
Mar 18 15:26:49 p-vhs-ntt01b-d2 vxfen: [ID 416634 kern.notice] NOTICE: VXFEN INFO V-11-1-35 Fencing driver going into RUNNING state
Mar 18 15:26:51 p-vhs-ntt01b-d2 gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port h gen ef2e28 membership 01
Mar 18 15:26:51 p-vhs-ntt01b-d2 gab: [ID 608499 kern.notice] GAB INFO V-15-1-20037 Port h gen ef2e28 jeopardy ;1
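On a busy host these lines are easy to miss, so a small filter can help. In VCS, GAB port a is GAB itself, port b is I/O fencing, and port h is the VCS engine (had), so the log above shows all three in jeopardy. The sketch below lists which ports logged a jeopardy membership. The function name jeopardy_ports is hypothetical, and it assumes the line layout shown above ("Port <letter> gen <id> jeopardy ..."):

```shell
# Hypothetical helper: read syslog text on stdin and print each GAB port
# that logged a jeopardy membership, one port letter per line.
jeopardy_ports() {
    awk '/GAB INFO/ && /jeopardy/ {
        # The port letter is the field right after the word "Port".
        for (i = 1; i < NF; i++)
            if ($i == "Port") print $(i + 1)
    }' | sort -u
}

# Usage (log path as given in this article; it differs on other platforms):
#   jeopardy_ports < /var/adm/messages
```

The current membership, including any jeopardy entries, can also be checked directly with gabconfig -a.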
Further analysis revealed that a link was down:
# lltstat -nvv
LLT node information:
    Node                 State    Link    Status  Address
  * 0 p-vhs-ntt01a-d2    OPEN
                                  nxge0   UP      00:21:28:62:0B:D6
                                  nxge4   UP      00:21:28:63:27:B6
    1 p-vhs-ntt01b-d2    OPEN
                                  nxge0   UP      00:21:28:63:1C:96
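Node 1 shows only one link in the UP state, which is what puts it in jeopardy. The same check can be scripted. The sketch below counts UP links per node from output shaped like the listing above. The name count_up_links is hypothetical, and the parsing assumes node lines of the form "[*] <id> <name> <state>" followed by link lines of the form "<interface> <status> [<address>]":

```shell
# Hypothetical helper: read `lltstat -nvv` style output on stdin and print
# "<node name> <number of UP links>" for every node, in input order.
count_up_links() {
    awk '
        # Drop leading whitespace and the "*" that marks the local node.
        { sub(/^[ \t]*\*?[ \t]*/, "") }
        # Node line: numeric node ID, node name, state.
        $1 ~ /^[0-9]+$/ && NF >= 3 { node = $2; order[++n] = node; up[node] = 0; next }
        # Link line under the current node: interface, status, optional address.
        node != "" && $2 == "UP" { up[node]++ }
        END { for (i = 1; i <= n; i++) print order[i], up[order[i]] }
    '
}

# Usage: flag every node that has fewer than two UP links.
#   lltstat -nvv | count_up_links | awk '$2 < 2 { print $1 " has " $2 " UP link(s)" }'
```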
EXPLANATION OF JEOPARDY STATUS:
VCS protects against network failure by requiring that all systems be connected via two or more communications channels. When a system is down to a single heartbeat connection, VCS can no longer discriminate between the loss of a system and the loss of a network connection. This situation is referred to as jeopardy.
When LLT on a system no longer receives heartbeat messages from another system on any of the configured LLT interfaces, GAB reports a change in membership. When a system is down to only one interconnect link, GAB can no longer reliably discriminate between loss of a system and loss of the network. The reliability of the system’s membership is considered at risk. In this situation jeopardy membership takes effect. This provides the best possible split-brain protection without membership arbitration and SCSI-3 devices.
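The rule described above can be summarized as a small decision on the number of working heartbeat links to a peer. This is only an illustrative sketch of that rule, not how GAB is implemented, and the function name membership_state is hypothetical:

```shell
# Hypothetical illustration of the rule above: classify a peer by the
# number of interconnect links on which heartbeats are still received.
membership_state() {
    case $1 in
        ''|*[!0-9]*) echo "invalid link count: $1" >&2; return 1 ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "membership change"   # no heartbeat on any link
    elif [ "$1" -eq 1 ]; then
        echo "jeopardy"            # cannot tell system loss from network loss
    else
        echo "regular"             # two or more links: normal membership
    fi
}

membership_state 1
```

With one remaining link, as for node 1 in this scenario, the sketch prints "jeopardy".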
Two actions take effect when a system in jeopardy loses its last interconnect link:
- VCS places the service groups running on the system in the autodisabled state. A service group in the autodisabled state may fail over on a resource or group fault, but cannot fail over on a system fault until the autodisabled flag is manually cleared.
- VCS operates the system as a single-system cluster. Other systems in the cluster are partitioned off in a separate cluster membership.
This explains the issue in this scenario: the rebooted server was already down to one link, so when it went away the surviving node could not tell a system fault from a network fault and did not fail the service group over.
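Once the administrator has confirmed that the other system is really down, the autodisabled flag can be cleared with hagrp -autoenable. The sketch below only prints the commands so they can be reviewed first. The function name print_autoenable_cmds and the group names app_sg and db_sg are hypothetical placeholders:

```shell
# Dry run: print the hagrp commands that would clear the AutoDisabled flag
# for each listed service group on one system. Nothing is executed.
print_autoenable_cmds() {
    sys=$1
    shift
    for grp in "$@"; do
        printf 'hagrp -autoenable %s -sys %s\n' "$grp" "$sys"
    done
}

# Example with hypothetical service group names:
print_autoenable_cmds p-vhs-ntt01b-d2 app_sg db_sg
```

Clearing the flag while the other system is still running the group risks bringing the same group online on both sides, so verify the system state before running the printed commands.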
Fix: Repair the nxge4 link on node p-vhs-ntt01b-d2, the link missing from that node's entry in the LLT output above. Once the link was restored, the cluster returned to regular membership.