Heartbeat v1 + ldirectord implementation of LVS high availability

Introduction to High Availability Cluster

A high-availability (HA) cluster automates server failure detection and resource switching, minimizing the business interruption caused by a server outage. During operation, a server may stop providing services because of hardware or software failures. With high-availability cluster software, the standby node automatically detects the failure of the primary node, takes over the resources that were running on it, and immediately provides services to the outside world, achieving automatic business failover.


HA framework

The bottom layer is the Messaging Layer, used to transmit cluster transaction information and the heartbeat of each node, as well as everything the CRM layer needs passed between nodes. Once started, a service in this layer listens on a broadcast or multicast address. Implementations of this layer include heartbeat, Corosync, and cman (OpenAIS); the chosen service must run on every node.

Above it is the Membership layer, which manages the topology of the cluster and shares it with the upper CRM layer so that the CRM can make decisions. This layer tracks the current cluster members and their roles, including determining which node is the DC (Designated Coordinator; a cluster has exactly one DC).

The Resource Allocation layer implements resource management: defining resources, grouping them, setting constraints, and monitoring the running state of resources on each node. Its Policy Engine makes cluster-wide decisions and runs only on the DC; the Transition Engine carries out the decisions the Policy Engine makes. The cluster resource manager (CRM) of this layer operates on resources through the LRM (Local Resource Manager), which does the actual work mainly by executing the scripts in the /etc/init.d directory. Implementations of this layer include haresources (heartbeat v1), crm (heartbeat v2), pacemaker (heartbeat v3), and rgmanager.
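The LRM drives resources through that script interface. A minimal sketch of the start/stop/status contract such an /etc/init.d script must honor (the resource name `myresource` and its messages are made up for illustration):

```shell
# Sketch of the LSB-style interface heartbeat's LRM expects from scripts
# under /etc/init.d; "myresource" and the messages are hypothetical.
myresource() {
    case "$1" in
        start)  echo "Starting myresource" ;;
        stop)   echo "Stopping myresource" ;;
        status) echo "myresource is running" ;;
        *)      echo "Usage: myresource {start|stop|status}"; return 1 ;;
    esac
}
```

Heartbeat invokes such a script with start, stop, or status and judges success by its exit code.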


Notes on configuring a high-availability cluster:

1. The time of each node in the cluster must be synchronized.

2. Nodes communicate with each other by name (implemented via /etc/hosts).

3. SSH key authentication for password-free communication between nodes.

4. Provide a ping node (arbitration device).


Implementing an LVS high-availability cluster based on heartbeat v1 + haresources

The LVS model in this experiment is the NAT model. High availability for the Director is achieved with heartbeat v1 + haresources, and ldirectord is used to monitor the health of the back-end Real Servers.

Experimental environment:

Time server: 192.168.1.118

2 Directors (node1, node2): VIP: 192.168.1.200, DIP: 192.168.2.200

Real Server1: 192.168.2.12

Real Server2: 192.168.2.6

Configure the LVS environment

Configure the gateway on each Real Server

[root@node1 ~]# ip route add default via 192.168.2.200

Enable the forwarding function on the Director (active and standby nodes)

[root@vm1 ~]# echo 1 > /proc/sys/net/ipv4/ip_forward


time synchronization

on time server:

[root@vm1 ~]# vim /etc/ntp.conf
restrict 192.168.0.0 mask 255.255.0.0 nomodify notrap   # only allow this network segment to synchronize time
...
server cn.pool.ntp.org        # upstream time servers
server 0.cn.pool.ntp.org
server 127.127.1.0            # if the servers above are unreachable, use the local system clock
fudge 127.127.1.0 stratum 10  # stratum the local clock advertises to clients
[root@vm1 ~]# ntpstat
synchronised to NTP server (202.118.1.81) at stratum 3
   time correct to within 84 ms
   polling server every 512 s


On each node in the cluster:

[root@node1 ~]# vim /etc/ntp.conf
...
server 192.168.2.8
[root@node1 ha.d]# ntpstat
synchronised to NTP server (192.168.2.8) at stratum 4   # time has been synchronized

Configure /etc/hosts so that the two nodes can reach each other by host name

[root@node2 ~]# vim /etc/hosts
192.168.1.116 node1
192.168.1.117 node2
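As a sanity check before starting heartbeat, one can verify that both node names actually appear in the hosts file. A small sketch (the helper name is made up; the file path is a parameter so it can be tried on a copy):

```shell
# Check that a hosts file has an entry for the given host name,
# ignoring comment lines; hosts_has_node is a hypothetical helper.
hosts_has_node() {
    grep -qE "^[^#]*[[:space:]]$2([[:space:]]|\$)" "$1"
}
```

For example, `hosts_has_node /etc/hosts node2` succeeds only when node2 is resolvable from the file.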

Set up SSH key authentication for password-free communication between the two nodes

[root@node1 ~]# ssh-keygen -t rsa -P ''
[root@node1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
[root@node2 ~]# ssh-keygen -t rsa -P ''
[root@node2 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1

After configuring password-free communication, test whether the time on both nodes is synchronized

[root@node1 ~]# ssh node2 'date'; date
Wed Aug  5 23:49:43 CST 2015
Wed Aug  5 23:49:43 CST 2015

Install the required packages and edit the configuration files

Install these packages on each node of the cluster.

[root@node1 heartbeat]# yum install perl-TimeDate PyXML libnet net-snmp-libs heartbeat-ldirectord-2.1.4-12.el6.x86_64.rpm
[root@node2 heartbeat]# rpm -ivh heartbeat-2.1.4-12.el6.x86_64.rpm heartbeat-pils-2.1.4-12.el6.x86_64.rpm heartbeat-stonith-2.1.4-12.el6.x86_64.rpm

The heartbeat-ldirectord package mainly provides the ldirectord program. At startup, ldirectord automatically creates the IPVS table; it then monitors the health of the cluster's Real Servers and automatically removes a node from the IPVS table when it finds that it has failed.
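The "negotiate" style check that ldirectord performs for HTTP can be approximated in shell: request a page and compare the body with the expected reply. A sketch, assuming curl is available (the function name is made up; the page name and expected string mirror the ldirectord.cf used later):

```shell
# Approximation of an HTTP negotiate check: fetch $1 and succeed only
# when the body equals $2, as ldirectord does with request/receive.
check_real_server() {
    url=$1 expect=$2
    body=$(curl -s --max-time 3 "$url") || return 1
    [ "$body" = "$expect" ]
}
```

On a live setup this would be called as, hypothetically, `check_real_server http://192.168.2.12/test.html ok`.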

Main configuration file: /etc/ha.d/ha.cf

Authentication key: /etc/ha.d/authkeys

Files used to define resources: /etc/ha.d/haresources

ldirectord's configuration file: /etc/ha.d/ldirectord.cf

By default, the files above are not in the /etc/ha.d directory; the sample files need to be copied there. Then change the permissions of the /etc/ha.d/authkeys file to 400 or 600; if the file is more permissive than that, heartbeat will not start.

[root@node1 ha.d]# cp /usr/share/doc/heartbeat-2.1.4/{ha.cf,haresources,authkeys} . -p
[root@node1 ha.d]# chmod 600 authkeys
[root@node1 ha.d]# cp /usr/share/doc/heartbeat-ldirectord-2.1.4/ldirectord.cf /etc/ha.d
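Since heartbeat refuses to start when authkeys is too permissive, the 400/600 rule can be sketched as a small helper (the function name is made up):

```shell
# Heartbeat-style permission check: authkeys must be mode 400 or 600.
authkeys_mode_ok() {
    mode=$(stat -c '%a' "$1") || return 1
    [ "$mode" = "400" ] || [ "$mode" = "600" ]
}
```

Running `authkeys_mode_ok /etc/ha.d/authkeys` before starting heartbeat catches the most common misconfiguration early.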


Edit the main configuration file /etc/ha.d/ha.cf (everything else keeps the default configuration)

logfile /var/log/ha-log          # log file
keepalive 2                      # heartbeat information delivered every 2 seconds
udpport 694                      # UDP port used for heartbeats
mcast eth0 230.0.120.1 694 1 0   # multicast address for heartbeat delivery
auto_failback on                 # automatically fail resources back to the primary node
node node1                       # define each node
node node2
ping 192.168.1.1                 # ping node, used as an arbitration device
compression bz2                  # compress heartbeat information
compression_threshold 2          # only compress messages larger than 2 KB

Edit authkeys; each node uses this file to authenticate when transmitting information

[root@node2 heartbeat]# openssl rand -hex 8   # generate a random key
4ece364b077efd89
[root@node1 ha.d]# vim authkeys
#auth 1
#1 crc
#2 sha1 HI!
#3 md5 Hello!
auth 1                     # which authentication method to use
1 sha1 4ece364b077efd89    # serial number, algorithm, random key
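The two manual steps above (generate a key, paste it into authkeys) can be collapsed into one short sketch; it writes authkeys in the current directory, so on a real node the file would go to /etc/ha.d:

```shell
# Generate a random key, write an authkeys file that uses sha1, then
# tighten the permissions as heartbeat requires.
key=$(openssl rand -hex 8)
cat > authkeys <<EOF
auth 1
1 sha1 $key
EOF
chmod 600 authkeys
```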

Configure ldirectord (/etc/ha.d/ldirectord.cf). The parameters below "#http virtual service" define one cluster service, including the VIP, the RIPs, the LVS model, and the scheduling algorithm. When defining resources, only the ldirectord service needs to be defined (ipvsadm is not needed as a resource): at startup, ldirectord calls ipvsadm to add the cluster service.

# Global Directives                 # global settings, valid for all virtual services
checktimeout=3                     # timeout for a Real Server health check
checkinterval=1                    # interval between checks
#fallback=127.0.0.1:80
autoreload=yes                     # reload automatically when this file changes
#logfile="/var/log/ldirectord.log" # log to a file
#logfile="local0"                  # or log via syslog
#emailalert="admin@x.y.z"
#emailalertfreq=3600               # interval between alert mails
#emailalertstatus=all
quiescent=yes

# http virtual service
virtual=192.168.1.200:80           # VIP
        real=192.168.2.12:80 masq  # RIP; gate means the DR model, masq means the NAT model
        real=192.168.2.6:80 masq
        fallback=127.0.0.1:80 gate
        service=http               # protocol to use when checking Real Server health
        request="test.html"        # page requested during the check
        receive="ok"               # string the page is expected to contain
        scheduler=rr               # scheduling algorithm
        #persistent=600            # persistent-connection duration in seconds
        #netmask=255.255.255.255
        protocol=tcp               # the cluster service is TCP-based
        checktype=negotiate        # health-check method
        checkport=80               # port to check

After the configuration is complete, do not forget to add a test page to the DocumentRoot of httpd on each Real Server; the page must contain the string specified by receive. If fallback is configured, you also need to start the httpd service on each cluster node and add an index.html page, so that when all back-end Real Servers stop serving, this page gives users a notice.

On each Real Server:

[root@node1 ~]# vim /httpd_dir/test.html
ok

On each cluster node:

[root@node1 ha.d]# vim /var/www/html/index.html
Sorry

Define resources in the /etc/ha.d/haresources file

Format:

primary-node IP/mask/iface resource ...   # the named primary node owns all the resources; resources are separated by spaces or tabs

[root@node2 ~]# vim /etc/ha.d/haresources
node1 192.168.1.200/24/eth0 192.168.2.200/24/eth1 ldirectord::/etc/ha.d/ldirectord.cf

ldirectord needs its configuration file specified when it starts, hence the :: argument. Each row defines one group of resources; when the primary node fails, the whole group is transferred to the standby node. Above, 192.168.1.200 is the VIP and 192.168.2.200 is the DIP.
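The way heartbeat reads such a line can be illustrated with a short sketch: the first field is the owning node and every remaining field is one resource in the group (the function name is made up):

```shell
# Split one haresources line: field 1 is the primary node, the rest are
# the resources that fail over together as a group.
parse_haresources_line() {
    set -- $1
    echo "primary node: $1"
    shift
    for r in "$@"; do
        echo "resource: $r"
    done
}
```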


Copy the 4 configuration files to the other node (-p preserves permissions)

[root@node1 ha.d]# scp -p haresources ha.cf authkeys ldirectord.cf root@node2:/etc/ha.d/

Stop the ldirectord service on each node and make sure it does not start automatically at boot (heartbeat will manage it as a resource).

[root@node1 ~]# chkconfig ldirectord off; ssh node2 'chkconfig ldirectord off'


Start the service and test

[root@node1 ha.d]# service heartbeat start; ssh node2 'service heartbeat start'

On the main node:

The corresponding VIP, DIP, and cluster service have all been brought up.


Put the primary node into standby (simulating a server failure)

[root@node1 ha.d]# /usr/lib64/heartbeat/hb_standby
2015/08/06_01:17:42 Going standby [all].

The corresponding resource has been transferred to node2.

Stop the httpd service on one of the back-end Real Servers (Real Server 1)

[root@node1 ~]# service httpd stop
Stopping httpd:                                            [  OK  ]

On the front-end Director, that Real Server has been marked unavailable (because of quiescent=yes, its weight is set to 0 instead of the entry being removed), and only Real Server 2's page can be reached. Stop the httpd service on Real Server 2 as well; now only the fallback page's information is shown.

Testing complete.................^_^
