VCS (Veritas Cluster Server) Beginners Lesson – Cluster Communications
In our previous post “Beginners Lesson – Veritas Cluster Services for System Admin” we discussed the fundamentals of clustering and its purpose. In this post I would like to take you a little deeper inside the cluster to explore how VCS communicates internally within a cluster node (intra-system communication) as well as externally with the other cluster nodes (inter-system communication).
For ease of learning, we will classify cluster communications as follows:
1. Intra System Communications (inter-process communications) – These communications happen among the VCS components within a single node.
2. Inter System Communications – These communications happen between the cluster nodes, which are interconnected by the cluster heartbeat network connections. They are very important to VCS when it makes failover and failback decisions for individual service groups.
1. Intra System Communications
The following VCS components communicate with each other within a node:
a. Command line utilities
b. GUI (Graphical User Interface) utilities
c. Cluster agents
d. High Availability Daemon (HAD), also called the VCS engine
Within a system, the VCS engine (HAD) uses a VCS-specific communication protocol known as Inter Process Messaging (IPM) to communicate with the GUI, the command line, and the agents.
The agent uses the agent framework, which is compiled into the agent itself. For each resource type configured in a cluster, an agent (e.g. Sybase agent, Oracle agent, NFS agent …) runs on each cluster system. The agent handles all resources of that type. The engine passes commands to the agent, and the agent returns the status of command execution. For example, when an agent is commanded to bring a resource online, it responds with the success (or failure) of the operation. Once the resource is online, the agent communicates with the engine only if this status changes.
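The engine/agent exchange described above can be pictured with a small Python sketch. This is only an illustration, not the real agent framework or the IPM protocol: the class ToyAgent, its methods, and the resource name oradb are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for the agent of one resource type."""
    resource_type: str
    # Last state reported to the engine, keyed by resource name.
    reported: dict = field(default_factory=dict)

    def online(self, resource: str) -> str:
        """Simulate the online operation and return its outcome to the engine."""
        self.reported[resource] = "ONLINE"
        return "SUCCESS"

    def monitor(self, resource: str, observed: str):
        """Return a message for the engine only when the state has changed."""
        if self.reported.get(resource) == observed:
            return None  # nothing new, so the agent stays silent
        self.reported[resource] = observed
        return (resource, observed)


agent = ToyAgent("Oracle")
print(agent.online("oradb"))              # outcome of the command
print(agent.monitor("oradb", "ONLINE"))   # no change, nothing to report
print(agent.monitor("oradb", "OFFLINE"))  # change, reported to the engine
```

The point of the sketch is the last two calls: an unchanged status produces no message, while a changed status does.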
2. Inter System Communications
VCS uses the cluster interconnect for network communications between cluster systems. Each system runs as an independent unit and shares information at the cluster level. On each system the VCS High Availability Daemon (HAD), which is the decision logic for the cluster, maintains a view of the cluster configuration. This daemon operates as a replicated state machine, which means all systems in the cluster have a synchronized state of the cluster configuration. This is accomplished by the following:
- All systems run an identical copy of HAD.
- HAD on each system maintains the state of its own resources, and sends all cluster information about the local system to all other machines in the cluster.
- HAD on each system receives information from the other cluster systems to update its own view of the cluster.
- Each system follows the same code path for actions on the cluster.
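The replicated state machine idea in the list above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: ToyNode and replicate are hypothetical names, and the real HAD exchanges far richer messages. It only shows that nodes which apply the same updates in the same order, through the same code path, end up with the same view.

```python
class ToyNode:
    """Hypothetical cluster node holding its own copy of the cluster view."""

    def __init__(self, name):
        self.name = name
        self.view = {}  # (system, resource) -> state

    def apply(self, update):
        """Every node runs this same code path for every update."""
        system, resource, state = update
        self.view[(system, resource)] = state


def replicate(nodes, updates):
    """Deliver each update to every node, in the same order."""
    for update in updates:
        for node in nodes:
            node.apply(update)


nodes = [ToyNode("sysA"), ToyNode("sysB"), ToyNode("sysC")]
replicate(nodes, [
    ("sysA", "oradb", "ONLINE"),
    ("sysB", "nfs_share", "OFFLINE"),
    ("sysA", "oradb", "FAULTED"),
])
print(nodes[0].view == nodes[1].view == nodes[2].view)
```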
VCS uses two components to carry out these communications and keep the cluster configuration synchronized: Group Membership Services/Atomic Broadcast (GAB) and Low Latency Transport (LLT).
2.a. Group Membership Services/Atomic Broadcast (GAB)
GAB is responsible for cluster membership and reliable cluster communications.
- Cluster membership
GAB maintains cluster membership by receiving input on the status of the heartbeat from each system via LLT. When a system no longer receives heartbeats from a cluster peer, LLT passes the heartbeat loss to GAB. GAB marks the peer as DOWN and excludes it from the cluster. In most configurations, membership arbitration is used to prevent a network partition from splitting the cluster into independently acting parts.
- Cluster communications
GAB’s second function is reliable cluster communications. GAB provides guaranteed delivery of messages to all cluster systems. The Atomic Broadcast functionality is used by HAD to ensure that all systems within the cluster receive all configuration change messages, or are rolled back to the previous state, much like a database atomic commit. While the communications function in GAB is known as Atomic Broadcast, no actual network broadcast traffic is generated. An Atomic Broadcast message is a series of point to point unicast messages from the sending system to each receiving system, with a corresponding acknowledgement from each receiving system.
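The all-or-nothing behaviour described above can be sketched as follows. This is a toy model and not the GAB wire protocol: the Receiver class, its reachable flag, and the atomic_broadcast function are hypothetical. It only illustrates the idea of one unicast message per receiver, one acknowledgement per receiver, and a commit that happens only if every receiver acknowledged.

```python
class Receiver:
    """Hypothetical receiving system with a committed configuration log."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.config = []     # committed configuration changes
        self.pending = None  # change received but not yet committed

    def receive(self, message) -> bool:
        """Accept a unicast message; the return value is the acknowledgement."""
        if not self.reachable:
            return False
        self.pending = message
        return True

    def commit(self):
        self.config.append(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None


def atomic_broadcast(message, receivers) -> bool:
    """Send one unicast per receiver; commit everywhere or nowhere."""
    acks = [r.receive(message) for r in receivers]
    if all(acks):
        for r in receivers:
            r.commit()
        return True
    for r in receivers:
        r.rollback()  # previous state is kept on every system
    return False


peers = [Receiver("sysB"), Receiver("sysC")]
print(atomic_broadcast("add resource oradb", peers))
peers[1].reachable = False
print(atomic_broadcast("add resource nfs_share", peers))
print(peers[0].config)
```

After the second call, neither receiver holds the second change, which mirrors the "rolled back to the previous state" wording in the text.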
2.b. Low Latency Transport (LLT)
The Low Latency Transport protocol is used for all cluster communications as a high-performance, low-latency replacement for the IP stack. LLT has two major functions.
- Traffic distribution
LLT provides the communications backbone for GAB. LLT distributes (load balances) inter-system communication across all configured network links. This distribution ensures all cluster communications are evenly distributed across all network links for performance and fault resilience. If a link fails, traffic is redirected to the remaining links. A maximum of eight network links are supported.
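As a rough illustration of traffic distribution, the sketch below spreads messages over the links that are currently up. The round-robin policy is an assumption made for the example, since the text only says that LLT load balances; the function name distribute and the link names are hypothetical. The limit of eight links is taken from the text.

```python
MAX_LINKS = 8  # maximum number of network links, as stated in the text


def distribute(messages, links):
    """Assign each message to a link that is up, in round-robin order.

    links maps a link name to True (up) or False (failed).
    """
    if len(links) > MAX_LINKS:
        raise ValueError("more than eight links configured")
    up = [name for name, is_up in links.items() if is_up]
    if not up:
        raise RuntimeError("no cluster interconnect link available")
    return [(msg, up[i % len(up)]) for i, msg in enumerate(messages)]


links = {"link1": True, "link2": True}
print(distribute(["m1", "m2", "m3"], links))

links["link1"] = False  # simulate a link failure
print(distribute(["m1", "m2"], links))
```

With both links up, the messages alternate between them; once link1 fails, all traffic moves to link2.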
- Heartbeat
LLT is responsible for sending and receiving heartbeat traffic over each configured network link. The LLT heartbeat is an Ethernet broadcast packet. This broadcast heartbeat method allows a single packet to notify all other cluster members that the sender is functional, and also provides the address information the receiver needs to send unicast traffic back to the sender. The heartbeat is the only broadcast traffic generated by VCS. Each system sends 2 heartbeat packets per second per interface. All other cluster communications, including all status and configuration traffic, are point-to-point unicast. This heartbeat is used by the Group Membership Services to determine cluster membership.
The heartbeat signal is defined as follows:
- LLT on each system in the cluster sends heartbeat packets out on all configured LLT interfaces every half second.
- LLT on each system tracks the heartbeat status from each peer on each configured LLT interface.
- LLT on each system forwards the heartbeat status of each system in the cluster to the local Group Membership Services function of GAB.
- GAB receives the status of heartbeat from all cluster systems from LLT and makes membership determination based on this information.
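The four steps above can be sketched as a small tracker plus a membership function. This is a simplified model, not LLT or GAB code: HeartbeatTracker, membership, and the timeout value used in the example are hypothetical. In particular, the rule that a peer stays a member while at least one link still carries its heartbeat is an assumption made for the illustration.

```python
class HeartbeatTracker:
    """Hypothetical LLT-side record of when each peer was last heard per link."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a link counts as down
        self.last_seen = {}     # (peer, link) -> time of last heartbeat

    def record(self, peer, link, now):
        self.last_seen[(peer, link)] = now

    def link_status(self, now):
        """Return peer -> {link: True if a heartbeat arrived within the timeout}."""
        status = {}
        for (peer, link), seen in self.last_seen.items():
            status.setdefault(peer, {})[link] = (now - seen) <= self.timeout
        return status


def membership(status):
    """GAB-side decision (assumed rule): member if any link is still alive."""
    return {peer for peer, links in status.items() if any(links.values())}


tracker = HeartbeatTracker(timeout=2.0)  # illustrative value only
tracker.record("sysB", "link1", now=0.0)
tracker.record("sysB", "link2", now=0.0)
tracker.record("sysC", "link1", now=0.0)
tracker.record("sysB", "link2", now=3.0)  # sysB is still heard on link2

status = tracker.link_status(now=4.0)
print(status)
print(membership(status))
```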
LLT can be configured to designate specific cluster interconnect links as either high priority or low priority. High priority links are used for cluster communications to GAB as well as heartbeat signals. Low priority links, during normal operation, are used for heartbeat and link state maintenance only, and the frequency of heartbeats is reduced to 50% of normal to reduce network overhead.
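To make the overhead numbers concrete, the short sketch below adds up the heartbeat packets one system sends per second, using the figures from the text: 2 packets per second on a normal link and 50% of that on a low priority link. The function name heartbeats_per_second and its parameters are hypothetical.

```python
def heartbeats_per_second(high_links, low_links, base_rate=2.0, low_factor=0.5):
    """Heartbeat packets one system sends per second across all its links."""
    return high_links * base_rate + low_links * base_rate * low_factor


# Two high priority links and one low priority link:
# 2 * 2.0 + 1 * 2.0 * 0.5 = 5.0 packets per second per system.
print(heartbeats_per_second(high_links=2, low_links=1))
```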
If there is a failure of all configured high priority links, LLT will switch all cluster communications traffic to the first available low priority link. Communication traffic will revert back to the high priority links as soon as they become available.
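The failover and revert behaviour between link priorities can be sketched as a simple selection function. This is an illustration under assumptions, not LLT's actual logic: select_data_links and the link names are hypothetical, and "first available" is modelled as the first low priority link in configuration order.

```python
def select_data_links(links):
    """Pick the links that carry cluster communications traffic.

    links is a list of (name, priority, is_up) tuples in configuration
    order, where priority is "high" or "low".
    """
    high = [name for name, prio, up in links if prio == "high" and up]
    if high:
        return high  # normal operation, or after high priority links recover
    low = [name for name, prio, up in links if prio == "low" and up]
    return low[:1]   # all high priority links failed: first available low link


links = [("priv1", "high", True), ("priv2", "high", True), ("pub1", "low", True)]
print(select_data_links(links))

links = [("priv1", "high", False), ("priv2", "high", False), ("pub1", "low", True)]
print(select_data_links(links))
```

Because the function is evaluated against the current link states each time, traffic returns to the high priority links as soon as any of them is reported up again.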
While not required, it is best practice to configure at least one low priority link, and to configure two high priority links on dedicated cluster interconnects, to provide redundancy in the communications path. Low priority links are typically configured on the public or administrative network.