Network topology:
II. Basic service configuration

1. Configure time synchronization
NFS1 side:
[root@M1 ~]# ntpdate pool.ntp.org
NFS2 side:
[root@M2 ~]# ntpdate pool.ntp.org
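A one-time ntpdate only aligns the clocks once. To keep them aligned afterwards, one hedged option (an assumption, not part of the original procedure) is a periodic sync from root's crontab, run on both nodes; the five-minute interval is arbitrary:
[root@M1 ~]# echo '*/5 * * * * /usr/sbin/ntpdate pool.ntp.org > /dev/null 2>&1' >> /var/spool/cron/root
[root@M1 ~]# crontab -l    # confirm the entry was added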
2. Add routing between hosts (optional)
First, verify that the server IPs of M1 and M2 match the plan.
M1 side:
[root@M1 ~]# ifconfig | egrep 'Link encap|inet addr'
M2 side:
[root@M2 ~]# ifconfig | egrep 'Link encap|inet addr'
Check the existing routes, then add end-to-end static routes for the heartbeat link and the DRBD data-transmission link.
The purpose is to keep heartbeat detection and data synchronization traffic from interfering with each other.
M1 side:
route add -host 172.16.0.3 dev eth1
route add -host 172.16.100.3 dev eth2
echo 'route add -host 172.16.0.3 dev eth1' >> /etc/rc.local
echo 'route add -host 172.16.100.3 dev eth2' >> /etc/rc.local
traceroute 172.16.100.3
M2 side: add the mirror-image static routes back to M1 in the same way; a sketch follows.
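The original does not list the M2-side commands. The sketch below assumes M1's heartbeat and DRBD addresses are 172.16.0.2 and 172.16.100.2 (172.16.100.2 is the address given for fuzai01 in the DRBD resource file later in this article); adjust to your own plan if it differs.
route add -host 172.16.0.2 dev eth1       # assumed M1 heartbeat address
route add -host 172.16.100.2 dev eth2     # assumed M1 DRBD address
echo 'route add -host 172.16.0.2 dev eth1' >> /etc/rc.local
echo 'route add -host 172.16.100.2 dev eth2' >> /etc/rc.local
traceroute 172.16.100.2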
III. Deploy the heartbeat service
Only the installation on NFS1 (M1) is demonstrated here; the same steps apply to NFS2 (M2).
1. Install the heartbeat software
[root@M1 ~]# cd /etc/yum.repos.d/
[root@M1 yum.repos.d]# wget http://mirrors.163.com/.help/CentOS6-Base-163.repo
[root@M1 yum.repos.d]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@M1 yum.repos.d]# sed -i 's@#baseurl@baseurl@g' *        # (to be verified; did not work for me)
[root@M1 yum.repos.d]# sed -i 's@mirrorlist@#mirrorlist@g' *  # (to be verified; did not work for me)
[root@M1 yum.repos.d]# yum install heartbeat -y   # this command sometimes needs to be run twice

2. Configure the heartbeat service
[root@M1 yum.repos.d]# cd /usr/share/doc/heartbeat-3.0.4/
[root@M1 heartbeat-3.0.4]# ll | egrep 'ha.cf|authkeys|haresources'
[root@M1 heartbeat-3.0.4]# cp ha.cf authkeys haresources /etc/ha.d/
[root@M1 heartbeat-3.0.4]# cd /etc/ha.d/
[root@M1 ha.d]# chmod 600 authkeys
[root@M1 ha.d]# ls
authkeys  ha.cf  haresources  rc.d  README.config  resource.d  shellfuncs
Note: the configuration files (ha.cf, authkeys, haresources) on the active and standby nodes are exactly the same. Their contents are listed below.
Configuring heartbeat mainly means editing three files: ha.cf, authkeys, and haresources. The contents I used for these three files are listed below, for reference only.
ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
warntime 6
udpport 694
ucast eth0 192.168.1.168
auto_failback on
node fuzai01
node fuzai02
ping 192.168.1.199
respawn hacluster /usr/lib64/heartbeat/ipfail

authkeys
auth 1
1 crc

haresources
fuzai01 IPaddr::192.168.1.160/24/eth0
Note: the nfsd resource script is not shipped with heartbeat; you have to write it yourself.
3. Write the nfsd resource script
The script must meet the following requirements:
1. It must be executable.
2. It must live in /etc/ha.d/resource.d/ or /etc/init.d/.
3. It must implement start and stop actions.
A sketch of such a script is given below.
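The original text does not reproduce its nfsd script at this point. A minimal sketch of what such a wrapper could look like, assuming it simply delegates to the distribution's /etc/init.d/nfs init script (this sketch is not the original author's script):
[root@M1 ha.d]# vi /etc/ha.d/resource.d/nfsd
#!/bin/bash
# nfsd: minimal heartbeat resource wrapper that delegates to the stock nfs init script
case "$1" in
    start)
        /etc/init.d/nfs start
        ;;
    stop)
        /etc/init.d/nfs stop
        ;;
    *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
esac
[root@M1 ha.d]# chmod 755 /etc/ha.d/resource.d/nfsd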
4. Start heartbeat
[root@M1 ha.d]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@M1 ha.d]# chkconfig heartbeat off
Note: start on boot is disabled; after a server reboot, heartbeat has to be started manually.
5. Test heartbeat
Before running this test, complete all the steps above on NFS2 as well!
a. Normal state
[root@M1 ha.d]# ip a | grep eth0
[root@M2 ha.d]# ip a | grep eth0
Note: the M1 master node holds the VIP address; the M2 node does not.
b. State after the master node goes down (simulated)
[root@M1 ha.d]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@M2 ha.d]# ip a | grep eth0
Note: after M1 goes down, the VIP address drifts to the M2 node, and M2 becomes the master node.

c. State after the master node recovers (simulated)
[root@M1 ha.d]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@M1 ha.d]# ip a | grep eth0
Note: after the M1 node recovers, it preempts the VIP resource again (auto_failback on).

IV. DRBD installation and deployment
1. Partition the newly added hard disk
# fdisk /dev/sdb
  (interactive answers: n, p, 1, 1, +1G, w)
# mkdir /data
Method 2: use LVM. Creation order for a logical volume (LV): Linux partition -> physical volume (PV) -> volume group (VG) -> logical volume (LV) -> mount to a file system.
Deletion order is the reverse: unmount the file system -> logical volume (LV) -> volume group (VG) -> physical volume (PV) -> Linux partition. A sketch of this method follows.
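A hedged sketch of the LVM route, assuming the new partition is /dev/sdb1; the volume group and logical volume names (vg_drbd, lv_drbd) and the 1G size are made up for illustration. The resulting LV would then take the place of /dev/sdb1 as the DRBD backing disk in the resource file.
# pvcreate /dev/sdb1                    # physical volume
# vgcreate vg_drbd /dev/sdb1            # volume group (name is hypothetical)
# lvcreate -L 1G -n lv_drbd vg_drbd     # logical volume (name and size are hypothetical)
# lvs                                   # verify; /dev/vg_drbd/lv_drbd would be used as the DRBD backing device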
2. Install DRBD
DRBD can be installed either from source or via yum. Because the yum sources I had at the time did not provide a DRBD rpm, the compile-from-source method is shown first; a yum-based method follows.
[root@M1 ~]# yum -y install gcc gcc-c++ kernel-devel kernel-headers flex make
[root@M1 ~]# cd /usr/local/src
[root@M1 src]# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz
[root@M1 src]# tar zxf drbd-8.4.3.tar.gz
[root@M1 src]# cd drbd-8.4.3
[root@M1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km --with-heartbeat
[root@M1 drbd-8.4.3]# make KDIR=/usr/src/kernels/<kernel-version>/   # fill in the kernel version on your system
[root@M1 drbd-8.4.3]# make install
[root@M1 drbd-8.4.3]# mkdir -p /usr/local/drbd/var/run/drbd
[root@M1 drbd-8.4.3]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/init.d/
[root@M1 drbd-8.4.3]# chmod +x /etc/init.d/drbd
[root@M1 ha.d]# modprobe drbd        # load the drbd module into the kernel
[root@M1 ha.d]# lsmod | grep drbd    # check that the drbd module is loaded
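The modprobe above only loads the module for the current boot. To have it load automatically after a reboot, one hedged option (an assumption, consistent with how this article already uses /etc/rc.local for the static routes) is:
[root@M1 ~]# echo 'modprobe drbd' >> /etc/rc.local
[root@M1 ~]# tail -1 /etc/rc.local    # confirm the line was appended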
Install DRBD via yum (recommended) (on dbm135 and dbm134):
[root@scj ~]# cd /usr/local/src/
[root@scj src]# wget <elrepo-release-6-6.el6.elrepo.noarch.rpm URL>
[root@scj src]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@scj src]# yum -y install kmod-drbd84    # this may take a while
[root@scj ~]# modprobe drbd                   # load the drbd module
FATAL: Module drbd not found.
Resolving the failure to load the drbd module: when yum -y install drbd83-utils kmod-drbd83 ran, the kernel was updated; reboot the server so that the updated kernel takes effect, then load the module again.

3. Configure DRBD
The DRBD configuration mainly involves global_common.conf and the user-defined resource file (the resource definition can, of course, also be written directly into global_common.conf).
Note: the following configuration files are exactly the same on both nodes, M1 and M2.
vi /usr/local/drbd/etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    disk {
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        al-extents 517;
    }
}

vi /usr/local/drbd/etc/drbd.d/drbd.res
resource drbd {
    on fuzai01 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 172.16.100.2:7789;
        meta-disk internal;
    }
    on fuzai02 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 172.16.100.3:7789;
        meta-disk internal;
    }
}
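Before initializing, it can help to syntax-check the configuration and to make sure both nodes use identical files. A hedged sketch (drbdadm dump parses and prints the configuration it would use; the scp line assumes SSH access between the nodes and the same install prefix on both):
[root@M1 drbd]# drbdadm dump drbd      # errors here usually mean a typo in drbd.res or global_common.conf
[root@M1 drbd]# scp /usr/local/drbd/etc/drbd.d/* root@fuzai02:/usr/local/drbd/etc/drbd.d/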
4. Initialize the meta partition
[root@M1 drbd]# drbdadm create-md drbd
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.

5. Start the drbd service
Here we can compare the DRBD device state on M1 and M2 before and after the drbd service is started.
M1 side:
[root@M1 drbd]# cat /proc/drbd    # device info before drbd starts
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
[root@M1 drbd]# drbdadm up all    # the init script can also be used to start it
[root@M1 drbd]# cat /proc/drbd    # device info after drbd starts
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
    ep:1 wo:d oos:133615596
M2 side:
[root@M2 ~]# cat /proc/drbd
M1 side:
[root@M1 drbd]# drbdadm --overwrite-data-of-peer primary drbd
[root@M1 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@M1, 2014-11-11 16:20:26
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:140132 nr:0 dw:0 dr:144024 al:0 bm:8 lo:0 pe:17 ua:26 ap:0
    ep:1 wo:d oos:133477612
        [>....................] sync'ed:  0.2% (130348/130480)M
        finish: 0:16:07 speed: 137,984 (137,984) K/sec
M2 side:
[root@M2 ~]# cat /proc/drbd
M1 side:
[root@M1 ~]# cat /proc/drbd
[root@M1 drbd]# mkfs.ext4 /dev/drbd0
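The original jumps straight from mkfs.ext4 to writing into /data, so the mount step is implied. A minimal sketch (the /data mount point matches the umount /data/ command a few lines below):
[root@M1 drbd]# mount /dev/drbd0 /data    # mount the DRBD device before the write test below
[root@M1 drbd]# df -h | grep drbd0        # confirm it is mounted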
M1 side:
[root@M1 drbd]# dd if=/dev/zero of=/data/test bs=1G count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.26333 s, 850 MB/s
[root@M1 drbd]# cat /proc/drbd
[root@M1 drbd]# umount /data/
[root@M1 drbd]# drbdadm down drbd    # shut down the resource named drbd
M2 side:
[root@M2 ~]# cat /proc/drbd
# After the primary shuts down the resource, check the secondary: the primary node's role now shows as Unknown.
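As an optional check that is not part of the original procedure, the data can be verified on M2 while M1's resource is down (this assumes the initial sync has completed and that /data exists on M2 as well):
[root@M2 ~]# drbdadm primary drbd      # M2 can be promoted because M1 brought the resource down
[root@M2 ~]# mount /dev/drbd0 /data
[root@M2 ~]# ls -lh /data              # the test file written on M1 should be visible here
[root@M2 ~]# umount /data
[root@M2 ~]# drbdadm secondary drbd    # return M2 to secondary before bringing M1 back up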
Method 2: install DRBD via yum; see http://www.linuxidc.com/Linux/2013-08/89035p2.htm
wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

V. NFS installation and deployment
This section again only uses NFS1 as the example; do the same on NFS2.
1. Install NFS
[root@M1 drbd]# yum install nfs-utils rpcbind -y
[root@M2 ~]# yum install nfs-utils rpcbind -y

2. Configure the NFS shared directory
[root@M1 drbd]# cat /etc/exports
/data 192.168.1.0/24(rw,sync,no_root_squash,anonuid=0,anongid=0)
[root@M2 ~]# cat /etc/exports
/data 192.168.1.0/24(rw,sync,no_root_squash,anonuid=0,anongid=0)
3. Start the rpcbind and nfs services
[root@M1 drbd]# /etc/init.d/rpcbind start; chkconfig rpcbind off
[root@M1 drbd]# /etc/init.d/nfs start; chkconfig nfs off
[root@M2 drbd]# /etc/init.d/nfs start; chkconfig nfs off
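Before moving to the client, a quick server-side check that the export is actually live can help; exportfs and showmount are standard nfs-utils commands (this check is an addition to the original procedure):
[root@M1 drbd]# exportfs -rv              # re-export everything in /etc/exports and print the result
[root@M1 drbd]# showmount -e localhost    # list the exports the server is currently offering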
4. Test NFS
[root@C1 ~]# mount -t nfs -o noatime,nodiratime 192.168.1.160:/data /xxxxx/
[root@C1 ~]# df -h | grep data
[root@C1 data]# ls
lost+found  test
[root@C1 data]# echo 'nolinux' >> nihao
[root@C1 data]# ls
lost+found  nihao  test
[root@C1 data]# cat nihao
nolinux

VI. Integrate Heartbeat, DRBD and NFS services
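The original does not restate the combined haresources entry at this point. As a hedged sketch only (it assumes the drbddisk and Filesystem resource scripts are present under /etc/ha.d/resource.d/, which is the usual case when DRBD is built with --with-heartbeat, and it uses the self-written nfsd wrapper from section III), the integration on the fuzai01/fuzai02 pair could look like a single haresources line:
fuzai01 IPaddr::192.168.1.160/24/eth0 drbddisk::drbd Filesystem::/dev/drbd0::/data::ext4 nfsd
The order matters: heartbeat starts resources left to right (VIP, DRBD promotion, mount, then NFS) and stops them right to left.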
Test 1: whether the share exported by the server can be mounted successfully from a remote client
NFS server dbm135 (Primary):
[root@dbm135 ~]# cd /data/
[root@dbm135 data]# ll
-rw-r--r-- 1 root root 0 Jun 30 10:15 file1
-rw-r--r-- 1 root root 0 Jun 30 10:15 file2
-rw-r--r-- 1 root root 0 Jun 30 10:15 file3
-rw-r--r-- 1 root root 0 Jun 30 10:15 file4
-rw-r--r-- 1 root root 0 Jun 30 10:15 file5
drwx------ 2 root root 16384 Jun 30 10:14 lost+found
On another host, 192.168.186.131, mount the directory shared by the NFS server:
[root@scj131 ~]# showmount -e 192.168.186.150    # 150 is the virtual IP
Export list for 192.168.186.150:
/data 192.168.186.0/255.255.255.0
[root@scj131 ~]# mount 192.168.186.150:/data /data
Test 2: restart the DRBD service on the primary node
Restart the DRBD service on the primary node and check whether the secondary node changes and whether the mount on 192.168.186.131 still works.
Test 3: stop the DRBD service on the primary node (stop the heartbeat service first)
Stop DRBD on the primary node and check whether the secondary node changes and whether the mount on 192.168.186.131 still works.
Test 4: stop the nfs service on the primary node
Stop the nfs service on the primary node and check whether the secondary node changes and whether the mount on 192.168.186.131 still works.
## After the nfs service on the primary node is stopped, nothing changes on the secondary node, and 192.168.186.131 can no longer mount the share.
## Solution:
[root@dbm135 ~]# vi /opt/monitor/nfs/monitornfs.sh
#!/bin/bash
# Watch the nfs service and fail over if it cannot be kept running.
while true
do
    # current DRBD role of this node (Primary/Secondary)
    drbdstatus=`cat /proc/drbd 2> /dev/null | grep ro | tail -n1 | awk -F ':' '{print $4}' | awk -F '/' '{print $1}'`
    # is nfs running?
    nfsstatus=`/etc/init.d/nfs status | grep -c running`
    if [ -z "$drbdstatus" ]; then
        sleep 10
        continue
    elif [ "$drbdstatus" == 'Primary' ]; then          # only act when this node is the DRBD Primary
        if [ $nfsstatus -eq 0 ]; then                  # nfs is not running
            /etc/init.d/nfs start &> /dev/null         # try to start nfs (twice, to be safe)
            /etc/init.d/nfs start &> /dev/null
            newnfsstatus=`/etc/init.d/nfs status | grep -c running`   # check again whether nfs came up
            if [ $newnfsstatus -eq 0 ]; then           # nfs still is not running, i.e. it cannot be started
                /etc/init.d/heartbeat stop &> /dev/null    # stop heartbeat so the standby node takes over
                /etc/init.d/heartbeat stop &> /dev/null
            fi
        fi
    fi
    sleep 5
done

## Note: do not put this monitoring script under /data/; it would be covered up when the drbd device is mounted there.
[root@dbm135 ~]# chmod u+x /opt/monitor/nfs/monitornfs.sh
[root@dbm135 ~]# nohup bash /opt/monitor/nfs/monitornfs.sh &    # run it in the background
## Do not forget to make it start automatically at boot (see below).
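The note above says to make the script start at boot but does not show how. One hedged way, consistent with how this article already uses /etc/rc.local elsewhere, is:
[root@dbm135 ~]# echo 'nohup bash /opt/monitor/nfs/monitornfs.sh &' >> /etc/rc.local
[root@dbm135 ~]# tail -1 /etc/rc.local    # confirm the entry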
Test 5: stop or restart the heartbeat service on the primary node
Stop or restart heartbeat on the primary node and check whether the secondary node changes and whether the mount on 192.168.186.131 still works.
## The secondary node can be checked with the following commands:
[root@dbm134 ~]# cat /proc/drbd       # check whether the node switched from Secondary to Primary
[root@dbm134 ~]# df -h                # check whether the drbd device was mounted successfully
[root@dbm134 ~]# ps -ef | grep nfs    # check whether the nfs service was started or restarted (PIDs change)
## After testing: the active/standby switchover works and the client mount stays normal.
Test 6: power off or reboot the primary node server
Power off or reboot the primary server and check whether the secondary node changes and whether the mount on 192.168.186.131 still works; after the old primary comes back up, check the secondary node (now the new primary) again.
## After the primary node is powered off, the active/standby switchover works and the client mount stays normal.
## After the old primary recovers, the nodes do not switch back; the former secondary (now the new primary) keeps serving clients.
Steps to restore the original primary node: 1. stop heartbeat on the secondary node; 2. start heartbeat on the primary node. A sketch of these two steps follows.
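A minimal sketch of those two steps, assuming dbm134 is currently acting as the new primary and dbm135 is the node being restored:
[root@dbm134 ~]# /etc/init.d/heartbeat stop     # step 1: stop heartbeat on the standby-turned-primary node
[root@dbm135 ~]# /etc/init.d/heartbeat start    # step 2: start heartbeat on the restored primary node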
Test 7: simulate a split brain
Take eth0 on the primary node down (ifdown eth0) and then bring it back up (ifup eth0); at this point both nodes end up in the StandAlone state.
## How to recover from the split brain
## On the standby node:
[root@dbm134 ~]# drbdadm secondary r0
[root@dbm134 ~]# drbdadm disconnect all
[root@dbm134 ~]# drbdadm -- --discard-my-data connect r0
## On the primary node:
[root@dbm135 ~]# drbdadm disconnect all
[root@dbm135 ~]# drbdadm connect r0
[root@dbm135 ~]# drbdsetup /dev/drbd0 primary
[root@dbm135 ~]# mount /dev/drbd0 /data/
Note: even after all of the steps above have been performed to resolve the split brain, the client mount sometimes works and sometimes does not; if it does not, try restarting the heartbeat service on the primary node. Afterwards, verify the data on the primary and the mount from the client:
[root@dbm135 ~]# cd /data/
[root@dbm135 data]# ll
total 16
-rw-r--r-- 1 root root 0 Jun 30 10:15 file1
-rw-r--r-- 1 root root 0 Jun 30 10:15 file2
-rw-r--r-- 1 root root 0 Jun 30 10:15 file3
-rw-r--r-- 1 root root 0 Jun 30 10:15 file4
-rw-r--r-- 1 root root 0 Jun 30 10:15 file5
-rw-r--r-- 1 root root 0 Jun 30 18:01 file6
drwx------ 2 root root 16384 Jun 30 10:14 lost+found
On the client:
[root@scj131 ~]# showmount -e 192.168.186.150    # 150 is the virtual IP
Export list for 192.168.186.150:
/data 192.168.186.0/255.255.255.0
[root@scj131 ~]# mount 192.168.186.150:/data /data
[root@scj131 ~]# cd /data/
[root@scj131 data]# ll
total 16
-rw-r--r-- 1 root root 0 Jun 30 2015 file1
-rw-r--r-- 1 root root 0 Jun 30 2015 file2
-rw-r--r-- 1 root root 0 Jun 30 2015 file3
-rw-r--r-- 1 root root 0 Jun 30 2015 file4
-rw-r--r-- 1 root root 0 Jun 30 2015 file5
-rw-r--r-- 1 root root 0 Jun 30 2015 file6
drwx------ 2 root root 16384 Jun 30 2015 lost+found
## The mount succeeded.