nginx High Availability with Pacemaker

To achieve higher system availability, critical services should be deployed as an active/standby (hot-standby) pair. This guide builds a two-node Pacemaker cluster that manages nginx and a floating virtual IP.

References:

https://access.redhat.com/documentation/zh-cn/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/index

https://blog.csdn.net/m0_51277041/article/details/124147404

https://www.cnblogs.com/chimeiwangliang/p/7975911.html

Preparation

Environment

10.0.0.11  primary node (node1)
10.0.0.12  standby node (node2)
10.0.0.10  virtual IP (VIP)

hosts

cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.11 node1
10.0.0.12 node2

Configure SSH key-based access

Allow node1 and node2 to log in to each other with SSH keys.

# run on both node1 and node2
ssh-keygen 
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2
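
To confirm key-based access works in both directions, a quick check (assuming the default key paths used above):

# should print the remote hostname without asking for a password
ssh root@node2 hostname   # run from node1
ssh root@node1 hostname   # run from node2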

Install pacemaker with yum

yum install -y fence-agents-all corosync pacemaker pcs

Configure the cluster user

pacemaker uses the hacluster user, which is created automatically when the packages are installed; only its password needs to be set. Set it on both nodes:

passwd hacluster
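
If this step is scripted, the password can also be set non-interactively on RHEL/CentOS; the password below is just a placeholder:

# set the hacluster password without an interactive prompt
echo "StrongPassword123" | passwd --stdin hacluster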

Configure authentication between cluster nodes

Start the pcsd service and authenticate the nodes to each other so they can communicate.

Start pcsd and enable it at boot

pcsd must be started on both nodes.

systemctl start pcsd.service 
systemctl enable pcsd.service

Once running, pcsd listens on port 2224 and serves a web management UI:

https://10.0.0.11:2224
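
One way to verify that pcsd is actually listening on its default port:

# check for a listener on TCP 2224
ss -tnlp | grep 2224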

Authenticate the nodes

The following command needs to be run on node1 only:

[root@node1 ~]# pcs cluster auth node1 node2
Username: hacluster
Password: 
node1: Authorized
node2: Authorized

Pacemaker resource configuration

Configure nginx

Install and test nginx on node1 and node2:

yum install nginx -y
echo "welcome to node1" > /usr/share/nginx/html/index.html   # on node2, write "welcome to node2"
systemctl start nginx.service
curl http://10.0.0.11   # on node2: curl http://10.0.0.12

Pacemaker will start and stop the nginx service itself, so after configuring and testing nginx on node1 and node2, stop it on both nodes.
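
A minimal way to do that; disabling the unit is optional but prevents systemd from starting nginx outside the cluster's control:

# run on both nodes
systemctl stop nginx.service
systemctl disable nginx.service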

Cluster configuration

Create and start the cluster

With the above in place, a cluster can be created on node1 and started.

Create a cluster named mycluster
whose nodes are node1 and node2:

[root@node1 ~]# pcs cluster setup --name mycluster node1 node2
Destroying cluster on nodes: node1, node2...
node1: Stopping Cluster (pacemaker)...
node2: Stopping Cluster (pacemaker)...
node1: Successfully destroyed cluster
node2: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'node1', 'node2'
node1: successful distribution of the file 'pacemaker_remote authkey'
node2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
node1: Succeeded
node2: Succeeded
Synchronizing pcsd certificates on nodes node1, node2...
node1: Success
node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
node1: Success
node2: Success
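
The setup step also writes the corosync configuration to both nodes; it can be inspected at the default RHEL 7 path:

# view the generated corosync configuration
cat /etc/corosync/corosync.conf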

Start the cluster and enable it at boot

[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster (corosync)...
node2: Starting Cluster (corosync)...
node2: Starting Cluster (pacemaker)...
node1: Starting Cluster (pacemaker)...
[root@node1 ~]# pcs cluster enable --all
node1: Cluster Enabled
node2: Cluster Enabled

Check the cluster status

[root@node1 ~]# pcs status
Cluster name: mycluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Stack: corosync
Current DC: node1 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Mar 21 16:52:39 2023
Last change: Tue Mar 21 16:51:33 2023 by hacluster via crmd on node1

2 nodes configured
0 resource instances configured

Online: [ node1 node2 ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Everything configured when the cluster was created on node1 is synchronized to node2. The status output shows that node1 and node2 are both online and that all the cluster daemons are active and enabled.

Add resources to the cluster

The "No resources" line in the status output shows that the cluster has no resources yet; next, add the VIP and the web service.

Add an IP address resource named VIP
using the ocf:heartbeat:IPaddr2 agent (heartbeat here is the OCF provider namespace, not the old heartbeat daemon)
monitored by the cluster every 30s:

[root@node1 ~]# pcs resource create VIP ocf:heartbeat:IPaddr2 ip=10.0.0.10 cidr_netmask=24 nic=eth0 op monitor interval=30s
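
The resource definition can be reviewed afterwards (pcs 0.9 syntax, as shipped with RHEL/CentOS 7):

# show the VIP resource's configuration
pcs resource show VIP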

Add an nginx resource named web:

[root@node1 ~]# pcs resource create web systemd:nginx op monitor interval=30s

Check that an nginx resource agent is available:

pcs resource list |grep nginx

To delete the resource, run:

[root@node1 ~]# pcs resource delete web

Check the cluster status

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 22 08:44:59 2023
Last change: Wed Mar 22 08:33:45 2023 by root via cibadmin on node1

2 nodes configured
2 resource instances configured

Online: [ node1 node2 ]

Full list of resources:

 VIP	(ocf::heartbeat:IPaddr2):	Started node1
 web	(systemd:nginx):	Started node1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Adjust the resources

After adding the resources they still need tuning. VIP and web must be tied together, to avoid the situation where the VIP sits on node1 while nginx runs on node2.

The other risk is that the cluster starts nginx first and only then brings up the VIP, which is the wrong order.

Colocation

[root@node1 ~]# pcs constraint colocation add web VIP INFINITY

To delete the constraint, run:

[root@node1 ~]# pcs constraint colocation remove web VIP

Set the resource start/stop order

Start VIP first, then start web:

[root@node1 ~]# pcs constraint order start VIP then start web
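
All configured constraints can be reviewed with:

# list colocation, ordering and location constraints
pcs constraint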

Priority

If node1 and node2 have different hardware, the nodes should be given different priorities so that resources run on the better machine and move to the weaker one only after it fails. In Pacemaker this is expressed with location constraints.

Adjust the location scores

A higher score means a higher priority:

[root@node1 ~]# pcs constraint location web prefers node1=10
[root@node1 ~]# pcs constraint location web prefers node2=5
[root@node1 ~]# pcs property set stonith-enabled=false
[root@node1 ~]# crm_simulate -sL

Current cluster status:
Online: [ node1 node2 ]

 VIP	(ocf::heartbeat:IPaddr2):	Started node1
 web	(systemd:nginx):	Started node1

Allocation scores:
pcmk__native_allocate: VIP allocation score on node1: 10
pcmk__native_allocate: VIP allocation score on node2: 5
pcmk__native_allocate: web allocation score on node1: 10
pcmk__native_allocate: web allocation score on node2: -INFINITY

Transition Summary:

Note: no fence device is configured in this walkthrough, so the cluster may report errors on startup. Fencing can be disabled with pcs property set stonith-enabled=false (already run above).
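
To confirm the property took effect, list the non-default cluster properties:

# stonith-enabled should now show as false
pcs property list | grep stonith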

At this point the Pacemaker cluster is fully configured; restart the cluster so that all the settings take effect.

Stop the cluster on all nodes

[root@node1 ~]# pcs cluster stop --all
node2: Stopping Cluster (pacemaker)...
node1: Stopping Cluster (pacemaker)...
node2: Stopping Cluster (corosync)...
node1: Stopping Cluster (corosync)...

Start the cluster on all nodes

[root@node1 ~]# pcs cluster start --all
node1: Starting Cluster (corosync)...
node2: Starting Cluster (corosync)...
node1: Starting Cluster (pacemaker)...
node2: Starting Cluster (pacemaker)...

Check the cluster status

[root@node1 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 22 08:44:59 2023
Last change: Wed Mar 22 08:33:45 2023 by root via cibadmin on node1

2 nodes configured
2 resource instances configured

Online: [ node1 node2 ]

Full list of resources:

 VIP	(ocf::heartbeat:IPaddr2):	Started node1
 web	(systemd:nginx):	Started node1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Verify that the VIP is up

[root@node1 ~]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:1a:a4:00 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.11/24 brd 10.0.0.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 10.0.0.10/24 brd 10.0.0.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::16dc:d558:23b:696d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: docker0: mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:32:bd:68:14 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

Verify that nginx is running

[root@node1 ~]# systemctl status nginx.service
● nginx.service - Cluster Controlled nginx
   Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/nginx.service.d
           └─50-pacemaker.conf
   Active: active (running) since Wed 2023-03-22 08:38:33 CST; 10min ago
     Docs: http://nginx.org/en/docs/
  Process: 1467 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=0/SUCCESS)
 Main PID: 1468 (nginx)
    Tasks: 2
   Memory: 3.3M
   CGroup: /system.slice/nginx.service
           ├─1468 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
           └─1469 nginx: worker process

Mar 22 08:38:33 node1 systemd[1]: Starting Cluster Controlled nginx...
Mar 22 08:38:33 node1 systemd[1]: Can't open PID file /var/run/nginx.pid (yet?) after start: No such file or directory
Mar 22 08:38:33 node1 systemd[1]: Started Cluster Controlled nginx.

[root@node1 ~]# curl http://10.0.0.10
welcome to node1

Under normal conditions the VIP sits on the primary node, 10.0.0.11. If the primary fails, node2 takes over the services automatically. To test this, reboot node1 and watch whether the standby node takes over the primary's resources.
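
As an alternative to a full reboot, a node can be drained with standby mode to force a failover and then reactivated:

# move all resources off node1 without rebooting it
pcs cluster standby node1
# allow node1 to host resources again
pcs cluster unstandby node1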

Test

Reboot node1 and observe from node2:

[root@node2 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: node2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 22 08:44:59 2023
Last change: Wed Mar 22 08:33:45 2023 by root via cibadmin on node1

2 nodes configured
2 resource instances configured

Online: [ node1 node2 ]

Full list of resources:

 VIP	(ocf::heartbeat:IPaddr2):	Started node2
 web	(systemd:nginx):	Started node2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@node2 ~]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:c2:52:cb brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.12/24 brd 10.0.0.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 10.0.0.10/24 brd 10.0.0.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::16dc:d558:23b:696d/64 scope link tentative noprefixroute dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::959b:b4bc:f1b2:41f3/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

[root@node2 ~]# systemctl status nginx.service
● nginx.service - Cluster Controlled nginx
   Loaded: loaded (/usr/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/nginx.service.d
           └─50-pacemaker.conf
   Active: active (running) since Wed 2023-03-22 08:51:59 CST; 8s ago
     Docs: http://nginx.org/en/docs/
  Process: 2546 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=0/SUCCESS)
 Main PID: 2548 (nginx)
   CGroup: /system.slice/nginx.service
           ├─2548 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
           └─2550 nginx: worker process

Mar 22 08:51:59 node2 systemd[1]: Starting Cluster Controlled nginx...
Mar 22 08:51:59 node2 systemd[1]: Can't open PID file /var/run/nginx.pid (yet?) after start: No such file or directory
Mar 22 08:51:59 node2 systemd[1]: Started Cluster Controlled nginx.

[root@node2 ~]# curl http://10.0.0.10
welcome to node2

Because node1 has the higher location score, the VIP and web resources fail back to node1 once it recovers.
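
If this automatic failback is not desired (every move causes a brief service interruption), a resource stickiness higher than the location-score difference keeps resources where they are. A sketch; the value 100 is arbitrary:

# prefer keeping resources in place over moving them back to a higher-scored node
pcs resource defaults resource-stickiness=100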
