HeartBeat (V1, V2, Pacemaker) cluster component overview

One, what is Heartbeat

Heartbeat is an open-source high-availability cluster system for Linux. It consists of two main high-availability components: the heartbeat service and resource takeover. Heartbeat monitoring can run over network links and serial ports, and it supports redundant links. Nodes send messages to each other to report their current status. If a node does not receive a message from its peer within a specified time, it considers the peer to have failed, and the resource takeover module is activated to take over the resources or services that were running on the peer’s host.
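The timeout rule described above can be illustrated with a minimal Python sketch. It is not Heartbeat's implementation: the names PeerMonitor, deadtime_s, and check_and_take_over are hypothetical, and the takeover step is reduced to a callback.

```python
import time


class PeerMonitor:
    """Track the last heartbeat from a peer and decide when it has failed."""

    def __init__(self, deadtime_s, now=time.monotonic):
        self.deadtime_s = deadtime_s  # silence longer than this means "failed"
        self._now = now               # injectable clock, handy for testing
        self.last_seen = now()

    def on_heartbeat(self):
        """Record that a status message just arrived from the peer."""
        self.last_seen = self._now()

    def peer_is_dead(self):
        """True once no heartbeat has arrived within deadtime_s seconds."""
        return self._now() - self.last_seen > self.deadtime_s


def check_and_take_over(monitor, take_over):
    """Call take_over() if the peer is considered failed; report whether we did."""
    if monitor.peer_is_dead():
        take_over()
        return True
    return False
```

In a real cluster the callback would start the peer's resources locally; here it only marks the point at which the resource takeover module would be triggered.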

Two, HeartBeat Version

Heartbeat’s major versions fall into three stages.

1, Heartbeat v1.x

Heartbeat v1 already had the concept of resource management. Its resource manager ships with Heartbeat and is called haresources; haresources is also the name of the configuration file that serves as its configuration interface.

Heartbeat 1.x lets you configure cluster nodes and resources through two files in the /etc/ha.d directory:

ha.cf: defines the cluster nodes, the failure-detection and failover time intervals, the cluster event logging mechanism, and the node fencing method

haresources: defines the cluster resource groups. Each line defines a default node and a set of resources that fail over together. Resources include IP addresses, file systems, services, or applications
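As a rough illustration of the "one line, one default node, one group of resources" layout, the following Python sketch splits a haresources-style line into its parts. The node name, IP address, device, and service in the sample line are made up, and the parser ignores details such as backslash line continuation.

```python
def parse_haresources_line(line):
    """Split one haresources line into its default node and resource list.

    Each resource is returned as (script_name, [arguments]); arguments are
    separated from the script name by '::'. A bare IP address is Heartbeat's
    shorthand for the IPaddr resource script and is returned unchanged here.
    """
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None
    node, *resources = line.split()
    parsed = []
    for res in resources:
        name, *args = res.split("::")
        parsed.append((name, args))
    return {"node": node, "resources": parsed}


# Hypothetical example line: node1 normally owns a virtual IP, a file
# system, and the httpd service; all three move together on failover.
example = "node1 192.168.1.100 Filesystem::/dev/sdb1::/data::ext3 httpd"
print(parse_haresources_line(example))
```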

2, Heartbeat v2.x

Heartbeat v2 was a major improvement. Its resource manager can run as an independent process and accept user requests; it is called crm. A process called crmd must run on each node, and it usually listens on a socket, on port 5560. The server side is therefore called crmd and the client is called crm (also known as the crm shell), a command-line interface through which you communicate with the crm on the server side. Heartbeat also has a graphical tool, heartbeat-GUI, through which the cluster can be configured.

Building on the Heartbeat 1.x configuration, Heartbeat 2.0 introduces a modular configuration method, the Cluster Resource Manager (CRM).

The CRM model can support up to 16 nodes. This model uses an XML-based Cluster Information Base (CIB) for configuration.
The last official stable release of Heartbeat 2.x is 2.1.4.

The CIB file (/var/lib/heartbeat/crm/cib.xml) will be automatically copied between each node. It defines the following objects and actions:

Cluster nodes

Cluster resources, including attributes, priorities, groups, and dependencies

Logging, monitoring, quorum, and fencing criteria

The actions to perform when a service fails or when the configured criteria are met
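To give a feel for what the CIB contains, here is a small Python sketch that reads node and resource definitions from a heavily trimmed, hypothetical CIB document. The ids, node names, and the single resource are invented for illustration; a real cib.xml carries many more attributes and sections.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB used only for illustration.
SAMPLE_CIB = """
<cib>
  <configuration>
    <crm_config/>
    <nodes>
      <node id="n1" uname="node1"/>
      <node id="n2" uname="node2"/>
    </nodes>
    <resources>
      <primitive id="vip" class="ocf" provider="heartbeat" type="IPaddr"/>
    </resources>
    <constraints/>
  </configuration>
  <status/>
</cib>
"""


def summarize_cib(xml_text):
    """Return the node names and (id, type) of each primitive resource."""
    root = ET.fromstring(xml_text.strip())
    nodes = [n.get("uname") for n in root.findall("./configuration/nodes/node")]
    resources = [
        (p.get("id"), p.get("type"))
        for p in root.findall("./configuration/resources/primitive")
    ]
    return {"nodes": nodes, "resources": resources}


print(summarize_cib(SAMPLE_CIB))
```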

3, Heartbeat v3.x

Starting with v3, the Heartbeat project was split by function into separate sub-projects that are developed independently. The HA principle and the configuration are essentially the same as in Heartbeat 2.x. The project was split into heartbeat, pacemaker, and cluster-glue. Because the architecture is separated, each part can be combined with other components.

Heartbeat: the original messaging layer, separated out as the heartbeat project. The new heartbeat is only responsible for maintaining information about the nodes in the cluster and the communication between them.

Cluster Glue: an intermediate layer that connects heartbeat and pacemaker. It mainly contains two parts, LRM and STONITH.

Resource Agent: a collection of scripts used to start and stop services and monitor their status. The LRM calls these scripts to start, stop, and monitor the various resources.

Pacemaker: the Cluster Resource Manager (CRM for short), the control center that manages the entire HA cluster. Clients use pacemaker to configure, manage, and monitor the entire cluster.

The first officially released version of Heartbeat 3 is 3.0.2. Pacemaker replaced the original CRM, and the underlying messaging layer can be either heartbeat v3 or corosync. The details are not covered in this article; see clusterlabs.org.

Because it is a common misunderstanding, it is worth stressing that Pacemaker is a resource manager and does not provide heartbeat messaging. Pacemaker is the continuation of the CRM (also known as the Heartbeat v2 resource manager). It was originally written for heartbeat but has become an independent project.

Pacemaker core component description:

ccm component (Cluster Consensus Membership Service): links the lower and upper layers. It monitors the heartbeat information received by the lower layer; when heartbeats stop arriving, it recalculates the vote count and convergence state of the whole cluster and forwards the result to the upper layer so that the upper layer can decide what to do. ccm can also generate a topology overview of node states from the local node’s point of view, so that the node can take appropriate action under special circumstances.

crmd component (Cluster Resource Manager, that is, pacemaker): handles resource allocation. Every resource allocation action goes through crm, which makes it the core component. The crm on each node maintains a cib that defines resource-specific attributes and which resources are placed on the same node.

cib component (Cluster Information Base): the cluster resource configuration in XML format. It is stored in a file, is held in memory while the cluster is running, and changes must be propagated to the other nodes. Only the cib on the DC can be modified; the cibs on the other nodes are copies of the one on the DC. The cib can be configured in two ways: from the command line or through a graphical front end.

lrmd component (Local Resource Manager): obtains the status of local resources and manages them, for example by starting the local service process when the peer is found to have stopped sending heartbeats.

pengine component

PE (Policy Engine): defines the complete set of rules for moving resources. It only plans and does not take part in the resource move itself; it leaves the execution of its plan to the TE.

TE (Transition Engine): executes the plan made by the PE. The PE and TE run only on the DC.

stonithd component

STONITH (Shoot The Other Node In The Head, a “headshot”): this method operates the power switch directly. When a node fails and another node detects it, the healthy node sends a command over the network to the faulty node’s power switch, and the faulty node is restarted by briefly cutting and then restoring its power. This method requires hardware support.

STONITH use case (primary/standby servers): the primary server is so busy for a period of time that it cannot respond to heartbeats. The standby server then takes over the service resources even though the primary has not actually gone down. This leads to resource contention, with users able to reach the service on both the primary and the standby server. If there are only read operations this is tolerable, but write operations can corrupt the file system. So when resources are being taken over, an isolation mechanism is needed: while the standby server takes over the resources, it applies STONITH to the primary server, which is what is commonly called a “headshot”.

Resource scripts are scripts under Heartbeat’s control. They can add or remove IP aliases or secondary IP addresses, or start and stop other functions that handle packets beyond the service itself, and so on. Heartbeat usually looks for these scripts in the /etc/init.d/ or /etc/ha.d/resource.d/ directory. Heartbeat must always know exactly which node owns a “resource” or provides it. A script that starts or stops a resource must therefore determine unambiguously whether the relevant service is currently provided by the local system.

Three, Pacemaker features

Host and application level failure detection and recovery

Almost any redundant configuration is supported

Supports multiple cluster configuration modes at the same time

Configurable strategies for dealing with quorum loss (when multiple machines fail)

Supports application startup/shutdown ordering

Supports applications that must / must not run on the same machine

Supports applications with multiple modes (such as Master/Slave)

Can test the cluster’s response to any failure or cluster state

Four, classification of Pacemaker clusters

1. Dual-system hot backup (Active/Passive)

Official Note: In many high availability situations, a two-node master/slave cluster using Pacemaker and DRBD is a cost-effective solution.

2. Multi-node hot standby (N+1)

Official note: By supporting many nodes, Pacemaker can significantly reduce hardware costs by allowing several master/slave clusters to be combined and share a common backup node.

3. Multi-node shared storage (N-TO-N)

Official note: When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple services.

4. Shared storage hot backup (Split Site)

Official note: Pacemaker 1.2 will include enhancements to simplify the creation of split-site clusters.

Five, Pacemaker internal structure

stonithd: the fencing (STONITH) daemon.

lrmd: local resource management daemon. It provides a common interface to the supported resource types and calls the resource agents (scripts) directly.

pengine: policy engine. It calculates the next state of the cluster from the current state and the configuration, and produces a transition graph containing a list of actions and their dependencies.

CIB: Cluster Information Base. It contains the definitions of all cluster options, nodes, and resources, their relationships to one another, and their current status. Updates are synchronized to all cluster nodes.

CRMD: cluster resource management daemon. It is mainly a message broker for the PEngine and LRM, and it also elects a leader (DC) to coordinate the activities of the cluster (including starting and stopping resources).

OpenAIS: the OpenAIS messaging and membership layer.

Heartbeat: the Heartbeat messaging layer, an alternative to OpenAIS.

CCM: Consensus Cluster Membership, the Heartbeat membership layer.

Six, Heartbeat heartbeat transmission

Heartbeat running on the standby host monitors the operating status of the primary server over an Ethernet connection. As soon as it can no longer detect the primary server’s “heartbeat”, it automatically takes over the primary server’s resources. Normally the heartbeat connection between the primary and standby servers is a dedicated physical connection, either a serial cable or an Ethernet link made with a crossover cable. Heartbeat can even monitor the primary server over several physical connections at the same time; as long as one of the connections still delivers the information that the primary server is active, Heartbeat considers the primary server to be healthy. Practical experience suggests configuring several independent physical connections for Heartbeat so that the heartbeat link itself is not a single point of failure.

Serial cable: compared with a network connection, this is a slightly safer connection method, because an attacker cannot run programs such as telnet, ssh, or rsh over a serial link, which reduces the chance of breaking into the standby server from a hijacked server. However, a serial cable is limited in length, so the primary and standby servers must be very close to each other.

Ethernet connection: this method removes the length limitation of the serial cable. The same connection can also be used to synchronize file systems between the primary and standby servers, which reduces the bandwidth used on the normal communication links.

For redundancy, two physical connections should be used to carry heartbeat control information between the primary and standby servers. This prevents the failure of a single network or cable from making both nodes believe, at the same time, that they are the only active server and start competing for resources. This kind of resource contention is the so-called “split-brain” or “partitioned cluster”. When two nodes share the same physical device resources, split-brain can have quite dire consequences.
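The rule that a peer counts as healthy while at least one link still delivers heartbeats can be written down in a few lines of Python. This is only a sketch: the function name, the link names, and the deadtime parameter are hypothetical.

```python
def peer_alive(last_seen_by_link, now, deadtime_s):
    """Return True if any link heard a heartbeat within deadtime_s seconds.

    last_seen_by_link maps a link name (for example "eth1" or "ttyS0") to
    the timestamp of the most recent heartbeat received over that link.
    """
    return any(now - seen <= deadtime_s for seen in last_seen_by_link.values())


# One link has gone silent, but the other is still fresh, so the peer
# is not declared dead and no takeover (and no split-brain) follows.
links = {"eth1": 80.0, "ttyS0": 101.0}
print(peer_alive(links, now=105.0, deadtime_s=10))
```

With a single link, losing that link is indistinguishable from losing the peer, which is exactly the situation that leads to split-brain.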

Seven, Heartbeat control information

“Heartbeat” information (also known as status information): broadcast, unicast, or multicast packets of only about 150 bytes. Each node can be configured with how often it reports its “heartbeat” to the other nodes, and with how long the heartbeat process on the other nodes waits before concluding that the sending node has failed.

Cluster transition information: ip-request and ip-request-resp are the two most common types. When resources need to be migrated between nodes, they carry the conversation between the heartbeat processes on the different nodes. For example, when the primary node has been repaired and comes back “online”, it uses ip-request to ask the standby node to release the resources that the standby took over when the primary failed. The standby node then shuts down the service and uses ip-request-resp to tell the primary node that it no longer holds those resources. The primary node restarts the service after receiving ip-request-resp.

Retransmission request: when a cluster node finds that the heartbeat control messages it receives from another node are “out of order” (the heartbeat process uses sequence numbers to make sure packets are not dropped or corrupted in transit), it asks the sender to retransmit the control message. Heartbeat generally sends a retransmission request at most once per second to avoid flooding.
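A minimal Python sketch of sequence-number gap detection follows. It only illustrates the idea; the class name and its behavior on duplicates are assumptions, and the once-per-second rate limit on retransmission requests is not modeled.

```python
class SequenceTracker:
    """Detect gaps in a peer's message sequence numbers."""

    def __init__(self):
        self.expected = None  # next sequence number we expect from the peer

    def receive(self, seq):
        """Return the missing sequence numbers that should be re-requested."""
        if self.expected is None or seq == self.expected:
            self.expected = seq + 1
            return []
        if seq > self.expected:
            # Everything between the expected number and this one was lost.
            missing = list(range(self.expected, seq))
            self.expected = seq + 1
            return missing
        return []  # old or duplicate packet; nothing to request


tracker = SequenceTracker()
for seq in (1, 2, 5):
    print(seq, tracker.receive(seq))
```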

All three types of control information are transmitted over UDP, and the UDP port or multicast address to use can be specified in /etc/ha.d/ha.cf (when an Ethernet connection is used).

In addition to the “sequence number/acknowledgment” mechanism that makes the transmission of control information reliable, Heartbeat also signs each packet with MD5 or SHA1 to protect the control information in transit.
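Packet signing with a shared key can be sketched in Python with the standard hmac module. This is an assumption-laden illustration, not Heartbeat's wire format: the use of HMAC, the key, and the message layout are all hypothetical, and only the choice of SHA1 comes from the text above.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """Return a hex SHA1-based keyed signature for the payload."""
    return hmac.new(key, payload, hashlib.sha1).hexdigest()


def verify(payload: bytes, key: bytes, signature: str) -> bool:
    """Check the signature in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(payload, key), signature)


key = b"shared-cluster-secret"          # hypothetical shared key
message = b"status=active seq=42"       # hypothetical control message
signature = sign(message, key)
print(verify(message, key, signature))
print(verify(b"status=dead seq=42", key, signature))
```

A receiver that shares the key can reject any control packet whose signature does not match, so a third party on the network cannot forge heartbeats or takeover requests.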
