DRBD + HeartBeat + NFS High Available Architecture Notes

Reference 1: http://os.51cto.com/art/201412/461533.htm

Reference 2: http://www.voidcn.com/article/p-fhezbfkb-qq.html

Reference 3: http://network.51cto.com/art/201010/230237_all.htm

Network topology:

Two. Basic service configuration

1. Configure time synchronization

NFS1 side:

[root@M1 ~]# ntpdate pool.ntp.org

NFS2 side:

[root@M2 ~]# ntpdate pool.ntp.org
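
ntpdate only sets the clock once. To keep the two servers aligned you could also schedule it with cron; the entry below is only a sketch (the 5-minute interval and the pool.ntp.org server are assumptions, not part of the original notes):

# on both NFS1 and NFS2, add to root's crontab (crontab -e)
*/5 * * * * /usr/sbin/ntpdate pool.ntp.org > /dev/null 2>&1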

2. Add routing between hosts (optional)

First, verify that the server IPs on M1 and M2 match the addressing plan.

M1 side:

[root@M1 ~]# ifconfig | egrep 'Link encap|inet addr'

M2 side:

[root@M2 ~]# ifconfig | egrep 'Link encap|inet addr'
Check the existing routes, then add host routes on each node for the heartbeat link and the DRBD replication link.
The purpose is to keep heartbeat checks and data synchronization from being routed over the wrong interface and disturbed by other traffic.

M1 side:

route add -host 172.16.0.3 dev eth1
route add -host 172.16.100.3 dev eth2
echo 'route add -host 172.16.0.3 dev eth1' >> /etc/rc.local
echo 'route add -host 172.16.100.3 dev eth2' >> /etc/rc.local
traceroute 172.16.100.3

M2 side:

Similar to M1; the mirrored commands are sketched below.
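
A sketch of the equivalent commands on M2, assuming M1's heartbeat and DRBD addresses are 172.16.0.2 and 172.16.100.2 (the .2 addresses are inferred from the addressing plan and are not shown in the original notes):

route add -host 172.16.0.2 dev eth1
route add -host 172.16.100.2 dev eth2
echo 'route add -host 172.16.0.2 dev eth1' >> /etc/rc.local
echo 'route add -host 172.16.100.2 dev eth2' >> /etc/rc.local
traceroute 172.16.100.2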

Three. Deploy the heartbeat service

Only the installation on the NFS1 server is shown here; repeat the same steps on NFS2.

1. Install the heartbeat software

[root@M1 ~]# cd /etc/yum.repos.d/
[root@M1 yum.repos.d]# wget http://mirrors.163.com/.help/CentOS6-Base-163.repo
[root@M1 yum.repos.d]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@M1 yum.repos.d]# sed -i 's@#baseurl@baseurl@g' *        # (to be verified; did not work for me)
[root@M1 yum.repos.d]# sed -i 's@mirrorlist@#mirrorlist@g' *  # (to be verified; did not work for me)
[root@M1 yum.repos.d]# yum install heartbeat -y               # this command sometimes needs to be run twice

2. Configure the heartbeat service

[root@M1 yum.repos.d]# cd /usr/share/doc/heartbeat-3.0.4/
[root@M1 heartbeat-3.0.4]# ll | egrep 'ha.cf|authkeys|haresources'
[root@M1 heartbeat-3.0.4]# cp ha.cf authkeys haresources /etc/ha.d/
[root@M1 heartbeat-3.0.4]# cd /etc/ha.d/
[root@M1 ha.d]# chmod 600 authkeys
[root@M1 ha.d]# ls
authkeys  ha.cf  haresources  rc.d  README.config  resource.d  shellfuncs

Note: the configuration files (ha.cf, authkeys, haresources) are exactly the same on both the active and standby nodes. Their contents are listed below.

Configuring heartbeat mainly means editing three files: ha.cf, authkeys, and haresources. The contents below are listed for reference only.

ha.cf

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
warntime 6
udpport 694
ucast eth0 192.168.1.168
auto_failback on
node fuzai02
ping 192.168.1.199
respawn hacluster /usr/lib64/heartbeat/ipfail

authkeys

auth 1
1 crc

haresources

fuzai01 IPaddr::192.168.1.160/24/eth0

Note: the nfsd resource script is not shipped with heartbeat; you need to write it yourself.

The script must meet the following requirements:

1. It must be executable.

2. It must live in /etc/ha.d/resource.d or /etc/init.d.

3. It must implement both start and stop actions.

A sketch of such a script is shown below.
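
A minimal sketch of an nfsd resource script, assuming the stock init scripts used elsewhere in these notes; the original author's exact script is not reproduced here, so adjust as needed:

[root@M1 ~]# cat /etc/ha.d/resource.d/nfsd
#!/bin/bash
# minimal heartbeat resource wrapper for nfs (sketch, not the original author's script)
case "$1" in
    start)
        /etc/init.d/rpcbind start
        /etc/init.d/nfs start
        ;;
    stop)
        /etc/init.d/nfs stop
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
[root@M1 ~]# chmod +x /etc/ha.d/resource.d/nfsd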

4. Start heartbeat

[root@M1 ha.d]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO:  Resource is stopped
Done.
[root@M1 ha.d]# chkconfig heartbeat off

Note: auto-start at boot is disabled, so after a server reboot heartbeat must be started manually.

5. Test heartbeat

Before running this test, perform the steps above on NFS2 as well.

a. Normal state

[root@M1 ha.d]# ip a | grep eth0

[root@M2 ha.d]# ip a | grep eth0

Description: the M1 master node holds the VIP address, while the M2 node does not.

b. Status after the master node goes down (simulated)

[root@M1 ha.d]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.

[root@M2 ha.d]# ip a | grep eth0

Description: after M1 goes down, the VIP address drifts to the M2 node and M2 becomes the master node.

c. Status after the master node recovers

[root@M1 ha.d]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO:  Resource is stopped
Done.
[root@M1 ha.d]# ip a | grep eth0

Description: after the M1 node recovers, it preempts the VIP resource again (auto_failback on).

Four. DRBD installation and deployment

1. Partition the newly added hard disk

# fdisk /dev/sdb
----------------
# interactive sequence: n -> p -> 1 -> 1 -> +1G -> w
----------------
# mkdir /data

Method 2: use a logical volume (LV) instead. Creation order: Linux partition -> physical volume (PV) -> volume group (VG) -> logical volume (LV) -> mount into the file system.
Deletion runs in reverse: unmount the file system -> logical volume (LV) -> volume group (VG) -> physical volume (PV) -> Linux partition. A sketch of the commands follows.
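
A minimal sketch of the LVM alternative, assuming the partition is /dev/sdb1; the volume names vg_data and lv_data are hypothetical and not part of the original notes:

pvcreate /dev/sdb1                 # physical volume
vgcreate vg_data /dev/sdb1         # volume group (name is an assumption)
lvcreate -L 1G -n lv_data vg_data  # 1 GB logical volume, which would then serve as the DRBD backing device
# teardown runs in reverse: umount -> lvremove -> vgremove -> pvremove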

2. Install DRBD

DRBD can be installed either from yum or by compiling from source. Since the yum sources available at the time did not provide a DRBD rpm package, the compile-from-source method is used here.

[root@M1 ~]# yum -y install gcc gcc-c++ kernel-devel kernel-headers flex make
[root@M1 ~]# cd /usr/local/src
[root@M1 src]# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz
[root@M1 src]# tar zxf drbd-8.4.3.tar.gz
[root@M1 src]# cd drbd-8.4.3
[root@M1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km --with-heartbeat
[root@M1 drbd-8.4.3]# make KDIR=/usr/src/kernels/<your-kernel-version>/   # point KDIR at the running kernel's source directory
[root@M1 drbd-8.4.3]# make install
[root@M1 drbd-8.4.3]# mkdir -p /usr/local/drbd/var/run/drbd
[root@M1 drbd-8.4.3]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/init.d/
[root@M1 drbd-8.4.3]# chmod +x /etc/init.d/drbd
[root@M1 drbd-8.4.3]# modprobe drbd           # load the drbd module into the kernel
[root@M1 drbd-8.4.3]# lsmod | grep drbd       # check that drbd loaded correctly

Install DRBD via yum (recommended; done on dbm135 and dbm134):

[root@scj ~]# cd /usr/local/src/
[root@scj src]# wget
[root@scj src]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@scj src]# yum -y install kmod-drbd84      # this may take a while
[root@scj ~]# modprobe drbd                     # load the drbd module
FATAL: Module drbd not found.
Resolving the failure to load the drbd module: installing the kmod-drbd package also performed a kernel update, so the server must be rebooted for the updated kernel (and its drbd module) to take effect.
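
After the reboot, the module load can be verified with plain commands (a convenience check, not from the original notes):

[root@scj ~]# uname -r              # confirm the newly installed kernel is the one running
[root@scj ~]# modprobe drbd
[root@scj ~]# lsmod | grep drbd     # the module should now be listed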

3. Configure DRBD

DRBD's configuration mainly consists of global_common.conf and user-defined resource files (the resource definitions can also be written directly into global_common.conf).

Note: the following configuration files are exactly the same on both the active and standby nodes, M1 and M2.

global_common.conf:

global {
    usage-count no;
}
common {
    protocol C;
    disk {
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        al-extents 517;
    }
}

vi /usr/local/drbd/etc/drbd.d/drbd.res

resource drbd {
    on fuzai01 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   172.16.100.2:7789;
        meta-disk internal;
    }
    on fuzai02 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   172.16.100.3:7789;
        meta-disk internal;
    }
}


4. Initialize the meta partition

[root@M1 drbd]# drbdadm create-md drbd
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.

5. Start drbd service

Here we compare the DRBD device information on M1 and M2 before and after starting the drbd service.

M1 side:

[root@M1 drbd]# cat /proc/drbd        # device information before starting drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
[root@M1 drbd]# drbdadm up all        # the init script can also be used to start drbd
[root@M1 drbd]# cat /proc/drbd        # device information after starting drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    ep:1 wo:d oos:133615596

M2 side:

[root@M2 ~]# cat /proc/drbd

M1 side:

[root@M1 drbd]# drbdadm --overwrite-data-of-peer primary drbd
[root@M1 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:140132 nr:0 dw:0 dr:144024 al:0 bm:8 lo:0 pe:17 ua:26 ap:0
    ep:1 wo:d oos:133477612
    [>....................] sync'ed:  0.2% (130348/130480)M
    finish: 0:16:07 speed: 137,984 (137,984) K/sec

M2 side:

[root@M2 ~]# cat /proc/drbd

M1 side:

[root@M1 ~]# cat /proc/drbd
[root@M1 drbd]# mkfs.ext4 /dev/drbd0
[root@M1 drbd]# mount /dev/drbd0 /data    # (implied by the write test below) mount the new filesystem on /data

M1 side:

[root@M1 drbd]# dd if=/dev/zero of=/data/test bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.26333 s, 850 MB/s
[root@M1 drbd]# cat /proc/drbd
[root@M1 drbd]# umount /data/
[root@M1 drbd]# drbdadm down drbd      # shut down the resource named drbd

M2 side:

[root@M2 ~]# cat /proc/drbd    # after the primary node shuts down the resource, check the backup node:
                               # the primary's role is now shown as Unknown

Method 2: install DRBD via yum, see http://www.linuxidc.com/Linux/2013-08/89035p2.htm
wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

Five. NFS installation and deployment

This section again only shows NFS1; run the same commands on NFS2.

1. Install NFS

[root@M1 drbd]# yum install nfs-utils rpcbind -y
[root@M2 ~]# yum install nfs-utils rpcbind -y

2. Configure the NFS shared directory

[root@M1 drbd]# cat /etc/exports
/data 192.168.1.0/24(rw,sync,no_root_squash,anonuid=0,anongid=0)
[root@M2 ~]# cat /etc/exports
/data 192.168.1.0/24(rw,sync,no_root_squash,anonuid=0,anongid=0)
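
Once the nfs service is running (step 3 below), later changes to /etc/exports can be reloaded without restarting the service; this optional step is not in the original notes:

[root@M1 drbd]# exportfs -rv    # re-export everything in /etc/exports and show the result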

3. Start the rpcbind and nfs services

[root@M1 drbd]# /etc/init.d/rpcbind start; chkconfig rpcbind off
[root@M1 drbd]# /etc/init.d/nfs start; chkconfig nfs off
[root@M2 drbd]# /etc/init.d/nfs start; chkconfig nfs off

4. Test NFS

[root@C1 ~]# mount -t nfs -o noatime,nodiratime 192.168.1.160:/data /xxxxx/
[root@C1 ~]# df -h | grep data
[root@C1 data]# ls
lost+found  test
[root@C1 data]# echo 'nolinux' >> nihao
[root@C1 data]# ls
lost+found  nihao  test
[root@C1 data]# cat nihao
nolinux

Six. Integrate the Heartbeat, DRBD and NFS services
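
The original notes do not show the final haresources line that ties the three services together. A hedged sketch of what it usually looks like, assuming the DRBD resource is named drbd, the device /dev/drbd0 is formatted as ext4 and mounted on /data, and the custom nfsd script from section Three exists in /etc/ha.d/resource.d/ (drbddisk ships with DRBD's heartbeat integration and Filesystem with heartbeat itself):

fuzai01 IPaddr::192.168.1.160/24/eth0 drbddisk::drbd Filesystem::/dev/drbd0::/data::ext4 nfsd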

Test 1: whether the remote client can mount the share successfully

On the NFS server (dbm135, the Primary):

[root@dbm135 ~]# cd /data/
[root@dbm135 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 30 10:15 file1
-rw-r--r-- 1 root root     0 Jun 30 10:15 file2
-rw-r--r-- 1 root root     0 Jun 30 10:15 file3
-rw-r--r-- 1 root root     0 Jun 30 10:15 file4
-rw-r--r-- 1 root root     0 Jun 30 10:15 file5
-rw-r--r-- 1 root root     0 Jun 30 18:01 file6
drwx------ 2 root root 16384 Jun 30 10:14 lost+found

On another host, 192.168.186.131, mount the directory exported by the NFS server:

[root@scj131 ~]# showmount -e 192.168.186.150      # 150 is the virtual IP
Export list for 192.168.186.150:
/data 192.168.186.0/255.255.255.0
[root@scj131 ~]# mount 192.168.186.150:/data /data
[root@scj131 ~]# cd /data/
[root@scj131 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 30  2015 file1
-rw-r--r-- 1 root root     0 Jun 30  2015 file2
-rw-r--r-- 1 root root     0 Jun 30  2015 file3
-rw-r--r-- 1 root root     0 Jun 30  2015 file4
-rw-r--r-- 1 root root     0 Jun 30  2015 file5
-rw-r--r-- 1 root root     0 Jun 30  2015 file6
drwx------ 2 root root 16384 Jun 30  2015 lost+found
## mounted successfully

Test 2: restart the DRBD service on the primary node

    Restart the DRBD service on the primary node, watch how the Secondary node reacts, and check whether the mount on 192.168.186.131 still works.

Test 3: stop the DRBD service on the primary node (stop the heartbeat service first)

    Stop DRBD on the primary node, watch how the Secondary node reacts, and check whether the mount on 192.168.186.131 still works.

Test 4: stop the nfs service on the primary node

    Stop nfs on the primary node, watch how the Secondary node reacts, and check whether the mount on 192.168.186.131 still works.

## After stopping the nfs service on the primary node, the Secondary node does not change at all, and 192.168.186.131 can no longer mount the share.
## Workaround:
[root@dbm135 ~]# vi /opt/monitor/nfs/monitornfs.sh
#!/bin/bash
# Monitor the nfs service

while true
do
    # Determine the local DRBD role (Primary/Secondary)
    drbdstatus=`cat /proc/drbd 2> /dev/null | grep ro | tail -n1 | awk -F ':' '{print $4}' | awk -F '/' '{print $1}'`
    # Check whether nfs is running
    nfsstatus=`/etc/init.d/nfs status | grep -c running`

    if [ -z $drbdstatus ]; then
        sleep 10
        continue
    elif [ $drbdstatus == 'Primary' ]; then          # if DRBD is Primary
        if [ $nfsstatus -eq 0 ]; then                # if nfs is not running
            /etc/init.d/nfs start &> /dev/null       # start the nfs service
            /etc/init.d/nfs start &> /dev/null
            newnfsstatus=`/etc/init.d/nfs status | grep -c running`    # check again whether nfs started
            if [ $newnfsstatus -eq 0 ]; then         # nfs still not running, i.e. it cannot be started
                # stop heartbeat so the cluster fails over to the standby machine
                /etc/init.d/heartbeat stop &> /dev/null
                /etc/init.d/heartbeat stop &> /dev/null
            fi
        fi
    fi
    sleep 5
done

## Note: do not put this monitoring script under /data/; it would be hidden when the DRBD device is mounted there.

[root@dbm135 ~]# chmod u+x /opt/monitor/nfs/monitornfs.sh
[root@dbm135 ~]# nohup bash /opt/monitor/nfs/monitornfs.sh &     # run it in the background
## Don't forget to configure it to start automatically at boot (a sketch follows).
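
A minimal sketch for starting the monitor automatically at boot; using rc.local here is an assumption, matching how the static routes were persisted earlier in these notes:

[root@dbm135 ~]# echo 'nohup bash /opt/monitor/nfs/monitornfs.sh &' >> /etc/rc.local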

Test 5: stop or restart the HeartBeat service on the primary node

    Stop or restart HeartBeat on the primary node, watch how the Secondary node reacts, and check whether the mount on 192.168.186.131 still works.

## You can watch the changes on the Secondary node with the following commands:
[root@dbm134 ~]# cat /proc/drbd       # check whether the node switched from Secondary to Primary
[root@dbm134 ~]# df -h                # check whether the DRBD device was mounted successfully
[root@dbm134 ~]# ps -ef | grep nfs    # check whether nfs was started or restarted (the PIDs change)
## After this test, the primary/standby switchover works and the client mount stays normal.

Test 6: shut down or reboot the primary node server

    Shut down or reboot the primary node server, watch how the Secondary node reacts, and check whether the mount on 192.168.186.131 still works; after the primary node comes back up, watch how the Secondary node (the new primary) behaves.

## After the primary node is shut down, the primary/standby switchover works and the client mount stays normal.
## After the primary node recovers, no switchback happens; the Secondary node (the new primary) keeps serving clients.

Steps to restore the primary node: 1. stop heartbeat on the standby node; 2. start heartbeat on the primary node (sketched below).
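
A sketch of that recovery order, assuming dbm134 is the standby node and dbm135 the original primary, as elsewhere in these tests:

[root@dbm134 ~]# /etc/init.d/heartbeat stop     # 1. stop heartbeat on the standby node
[root@dbm135 ~]# /etc/init.d/heartbeat start    # 2. start heartbeat on the primary node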

Test 7: simulate a split-brain

    Take eth0 down on the primary node (ifdown eth0) and then bring it back up (ifup eth0); at this point both nodes end up in the StandAlone state.

## How to resolve the split-brain
## On the standby node:
[root@dbm134 ~]# drbdadm secondary r0
[root@dbm134 ~]# drbdadm disconnect all
[root@dbm134 ~]# drbdadm -- --discard-my-data connect r0

## On the primary node:
[root@dbm135 ~]# drbdadm disconnect all
[root@dbm135 ~]# drbdadm connect r0
[root@dbm135 ~]# drbdsetup /dev/drbd0 primary
[root@dbm135 ~]# mount /dev/drbd0 /data/

Note: when resolving a split-brain, even after all the steps above the client mount sometimes works and sometimes does not; if it does not, try restarting the heartbeat service on the primary node.
