
[My Story with openGauss] Adding a VIP to a Cluster

Author: daydayup
  • 2023-08-07
    Beijing


lqkitten, openGauss, 2023-08-04 18:01, published from Sichuan


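Since its release, openGauss has natively supported one primary with multiple standbys and an RTO under 10 seconds, which greatly strengthens high availability. Starting with openGauss 3.0, the cluster management suite CM was updated, and usability improved as well. For clients, however, a switchover on the database side still has to be handled manually.

Once a VIP is added to openGauss, clients connect much as they would to an Oracle RAC SCAN VIP and are unaware of switchovers on the server side.

A VIP can be planned before installation and specified in the configuration file, or it can be added manually to a cluster that is already installed. The manual method is tested below.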

1. Information about the installed cluster

Database version

gsql -V
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:37:13 commit 0 last mr


Cluster status


[omm@db1 srv]$ cm_ctl query -Cv
[  CMServer State   ]

node instance state
-----------------------
1 db1 1 Primary
2 db2 2 Standby
3 db3 3 Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node instance state | node instance state | node instance state
---------------------------------------------------------------------------------------------------------
1 db1 6001 P Primary Normal | 2 db2 6002 S Standby Normal | 3 db3 6003 S Standby Normal

[omm@db1 srv]$ gs_om -t status --detail
[ CMServer State ]

node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Primary
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Standby
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Primary Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Standby Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal

2. Grant sudo privileges to the omm user (run as root on all three machines)

echo "omm ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
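Appending directly to /etc/sudoers works, but a syntax slip there can break sudo on the host. A more cautious variant, sketched below, writes the same rule to a drop-in file and validates it with `visudo -cf` before installing it. The function names and the drop-in file name `omm-cm` are illustrative assumptions, and the sketch assumes the distribution's /etc/sudoers includes /etc/sudoers.d.

```shell
#!/bin/sh
# Sketch: install the omm sudo rule as a validated drop-in file.
# Function names and the drop-in file name are hypothetical.

# Print the sudoers rule for a given user (same rule as in the article).
sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD:ALL\n' "$1"
}

# Write the rule to a temp file, validate it, then install it (run as root).
install_sudoers_rule() {
    user=$1
    target=${2:-/etc/sudoers.d/omm-cm}
    tmp=$(mktemp) || return 1
    sudoers_rule "$user" > "$tmp"
    # visudo -c only checks syntax; -f points it at the candidate file.
    if visudo -cf "$tmp" >/dev/null; then
        install -m 0440 "$tmp" "$target"
        status=$?
    else
        echo "sudoers rule failed validation" >&2
        status=1
    fi
    rm -f "$tmp"
    return $status
}

# Usage (as root, on each of the three machines):
#   install_sudoers_rule omm
```

Keeping the rule in its own file also makes it easy to remove later without editing the main sudoers file.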

3. Add the VIP on the primary

Before adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Add the VIP as an alias address on enp0s8:

ifconfig enp0s8:15400 192.168.56.10 netmask 255.255.255.0 up
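On systems where `ifconfig` is not installed, the alias can be added with iproute2 instead. The sketch below validates the address first, since a malformed dotted quad is easy to type by hand, and then only prints the `ip addr add` command so it can be reviewed before running. The function names are hypothetical, and the printed command is meant as a rough equivalent of the `ifconfig` call above, not an exact one.

```shell
#!/bin/sh
# Sketch: validate a dotted-quad IPv4 address, then print the iproute2
# command that would add it as a labelled secondary address.
# Function names are hypothetical; nothing touches the network here.

is_ipv4() {
    # Reject empty input, non digit/dot characters, and stray dots.
    case $1 in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    # Split on dots and require exactly four octets in 0..255.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

vip_add_cmd() {
    # $1 = VIP, $2 = prefix length, $3 = device, $4 = label suffix
    is_ipv4 "$1" || { echo "invalid IPv4 address: $1" >&2; return 1; }
    printf 'ip addr add %s/%s dev %s label %s:%s\n' "$1" "$2" "$3" "$3" "$4"
}

# Usage: review the printed command, then run it with sudo.
#   vip_add_cmd 192.168.56.10 24 enp0s8 15400
```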


After adding

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

4. Add the VIP resource to the cluster

The VIP is managed as an openGauss resource.

[omm@db2 cm_agent]$ cm_ctl res --add --res_name="VIP_az1" --res_attr="resources_type=VIP,float_ip=192.168.56.10"
cm_ctl: add res(VIP_az1) success.


Add each instance to the resource

[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=1,res_instance_id=6001" --inst_attr=base_ip=192.168.56.11
cm_ctl: edit res(VIP_az1) success.
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=2,res_instance_id=6002" --inst_attr=base_ip=192.168.56.12
cm_ctl: edit res(VIP_az1) success.
[omm@db2 cm_agent]$ cm_ctl res --edit --res_name="VIP_az1" --add_inst="node_id=3,res_instance_id=6003" --inst_attr=base_ip=192.168.56.13
cm_ctl: edit res(VIP_az1) success.
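Typing one `cm_ctl res --edit` command per instance invites copy-and-paste slips in `node_id` or `base_ip`. One way to reduce that risk, sketched here, is to generate the commands from a small table and review them before running anything. The function names and the three-column table layout are hypothetical; the generated commands use only the options shown above.

```shell
#!/bin/sh
# Sketch: print one `cm_ctl res --edit` command per instance so the list
# can be reviewed before it is run. Function names are hypothetical.

add_inst_cmd() {
    # $1 = resource name, $2 = node_id, $3 = res_instance_id, $4 = base_ip
    printf 'cm_ctl res --edit --res_name="%s" --add_inst="node_id=%s,res_instance_id=%s" --inst_attr=base_ip=%s\n' \
        "$1" "$2" "$3" "$4"
}

gen_add_inst_cmds() {
    # $1 = resource name; reads "node_id res_instance_id base_ip" rows on stdin.
    while read -r node_id inst_id base_ip; do
        [ -n "$node_id" ] || continue   # skip blank rows
        add_inst_cmd "$1" "$node_id" "$inst_id" "$base_ip"
    done
}

# Usage: print the three commands for this cluster.
#   gen_add_inst_cmds VIP_az1 <<'EOF'
#   1 6001 192.168.56.11
#   2 6002 192.168.56.12
#   3 6003 192.168.56.13
#   EOF
```

Once the printed commands look right, they can be pasted into the omm shell or piped to `sh`.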


Check which node holds the VIP

[omm@db3 ~]$ cm_ctl show

[ Network Connect State ]

Network timeout: 6s
Current CMServer time: 2023-08-03 06:18:42
Network stat('Y' means connected, otherwise 'N'):
| \ | Y | Y |
| Y | \ | Y |
| Y | Y | \ |

[ Node Disk HB State ]

Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:18:43
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |

[ FloatIp Network State ]

node instance base_ip float_ip_name float_ip
----------------------------------------------------------
1 db1 6001 192.168.56.11 VIP_az1 192.168.56.10
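When scripting checks around a switchover, the node that currently holds the VIP can be pulled out of the `cm_ctl show` output. This is a minimal sketch, assuming the FloatIp section prints one row per line with the columns in the order shown above (node id, node name, instance, base_ip, resource name, float_ip). The function name `vip_holder` is hypothetical.

```shell
#!/bin/sh
# Sketch: print the name of the node whose FloatIp row carries the given VIP.
# Assumes the row layout from the sample output; vip_holder is a made-up name.

vip_holder() {
    # $1 = floating IP to look for; reads `cm_ctl show` output on stdin.
    awk -v vip="$1" '
        /^[ \t]*\[ *FloatIp Network State *\]/ { in_section = 1; next }
        /^[ \t]*\[/                            { in_section = 0 }
        in_section && $NF == vip               { print $2; exit }
    '
}

# Usage:
#   cm_ctl show | vip_holder 192.168.56.10
```

Matching on the last field keeps the header and separator rows out of the result without counting lines.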


Simulate a primary node failure

[omm@db3 ~]$ cm_ctl stop -n 1
cm_ctl: stop the node: 1.
cm_ctl: stop node, nodeid: 1...........
cm_ctl: stop node successfully.
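The VIP does not move at the exact instant `cm_ctl stop` returns, so a test script usually has to poll for it. Below is a small generic retry helper as a sketch; the name `wait_until` and the polling command in the usage comment are assumptions, and the `grep` pattern there relies on the FloatIp row format seen in the sample output.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_until is a hypothetical helper, not part of openGauss.

wait_until() {
    # $1 = max attempts, $2 = seconds between attempts, rest = command to run
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage: wait up to 30 seconds for the VIP to show up on db2.
#   wait_until 30 1 sh -c 'cm_ctl show | grep -q "db2 .*192\.168\.56\.10"'
```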


The primary has switched to node 2, and the VIP has moved to node 2 as well

[omm@db1 cm_agent]$ cm_ctl show

[ Network Connect State ]

Network timeout: 6s
Current CMServer time: 2023-08-03 06:19:40
Network stat('Y' means connected, otherwise 'N'):
| \ | N | N |
| N | \ | Y |
| N | Y | \ |

[ Node Disk HB State ]

Node disk hb timeout: 200s
Current CMServer time: 2023-08-03 06:19:41
Node disk hb stat('Y' means connected, otherwise 'N'):
| N | N | N |

[ FloatIp Network State ]

node instance base_ip float_ip_name float_ip
----------------------------------------------------------
2 db2 6002 192.168.56.12 VIP_az1 192.168.56.10


Node 1 no longer has 192.168.56.10

[omm@db1 cm_agent]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74572sec preferred_lft 74572sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:50:47:12 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.11/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet6 fe80::890a:d968:b65d:f59a/64 scope link noprefixroute
       valid_lft forever preferred_lft forever


Node 2 now has 192.168.56.10

[omm@db2 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:04:f9:a3 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute enp0s3
       valid_lft 74550sec preferred_lft 74550sec
    inet6 fe80::c8c2:7f4c:914f:a32d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:41:73:29 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.12/24 brd 192.168.56.255 scope global noprefixroute enp0s8
       valid_lft forever preferred_lft forever
    inet 192.168.56.10/24 brd 192.168.56.255 scope global secondary enp0s8:15400
       valid_lft forever preferred_lft forever
    inet6 fe80::5373:d66d:7a39:ddc2/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
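Scanning full `ip a` dumps by eye gets tedious with more than a couple of nodes. The check can be reduced to a one-line filter, sketched below; `has_addr` is a hypothetical helper name, and the usage loop assumes passwordless ssh between the nodes and the interface name enp0s8 from this test setup.

```shell
#!/bin/sh
# Sketch: succeed if `ip a` output on stdin lists the given IPv4 address.
# has_addr is a hypothetical helper name.

has_addr() {
    # -F matches a fixed string, so the dots are literal. The trailing "/"
    # stops 192.168.56.10 from also matching 192.168.56.100.
    grep -qF "inet $1/"
}

# Usage: report which node currently carries the VIP.
#   for h in db1 db2 db3; do
#       if ssh "$h" ip -4 addr show dev enp0s8 | has_addr 192.168.56.10; then
#           echo "$h holds 192.168.56.10"
#       fi
#   done
```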


Resource configuration file

[omm@db1 cm_agent]$ cat cm_resource.json
{
    "resources": [{
        "name": "VIP_az1",
        "resources_type": "VIP",
        "instances": [{
            "node_id": 1,
            "res_instance_id": 6001,
            "inst_attr": "base_ip=192.168.56.11"
        }, {
            "node_id": 2,
            "res_instance_id": 6002,
            "inst_attr": "base_ip=192.168.56.12"
        }, {
            "node_id": 3,
            "res_instance_id": 6003,
            "inst_attr": "base_ip=192.168.56.13"
        }],
        "float_ip": "192.168.56.10"
    }]
}


Sync the configuration file to the other nodes

scp cm_resource.json db2:/opt/huawei/data/cmserver/cm_agent
scp cm_resource.json db3:/opt/huawei/data/cmserver/cm_agent
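After copying, it can be worth confirming that every node really has the same file. A minimal sketch using POSIX `cksum` follows; the helper names are hypothetical, and the remote check in the usage comment assumes passwordless ssh and the target directory used in the `scp` commands above.

```shell
#!/bin/sh
# Sketch: compare file contents by checksum. Helper names are hypothetical.

file_sum() {
    # Print only the CRC field of cksum's output for the given file.
    cksum < "$1" | awk '{ print $1 }'
}

same_content() {
    # Succeed when both files have the same checksum.
    [ "$(file_sum "$1")" = "$(file_sum "$2")" ]
}

# Usage: compare the local file with the copy on db2.
#   local_sum=$(file_sum cm_resource.json)
#   remote_sum=$(ssh db2 'cksum < /opt/huawei/data/cmserver/cm_agent/cm_resource.json' | awk '{ print $1 }')
#   [ "$local_sum" = "$remote_sum" ] && echo "db2 is in sync"
```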


Start node 1

[omm@db3 ~]$ cm_ctl start -n 1
cm_ctl: start the node: 1.
cm_ctl: start node, nodeid: 1...........
cm_ctl: start node successfully.

[omm@db1 cm_agent]$ gs_om -t status --detail
[  CMServer State   ]

node node_ip instance state
-----------------------------------------------------------------------
1 db1 192.168.56.11 1 /opt/huawei/data/cmserver/cm_server Standby
2 db2 192.168.56.12 2 /opt/huawei/data/cmserver/cm_server Primary
3 db3 192.168.56.13 3 /opt/huawei/data/cmserver/cm_server Standby

[ Cluster State ]

cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state
-------------------------------------------------------------------------
1 db1 192.168.56.11 6001 /opt/huawei/install/data/dn P Standby Normal
2 db2 192.168.56.12 6002 /opt/huawei/install/data/dn S Primary Normal
3 db3 192.168.56.13 6003 /opt/huawei/install/data/dn S Standby Normal



The CM primary and the database primary are now on the same machine.
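To check this from a script instead of by eye, the two Primary rows can be extracted from the `gs_om -t status --detail` output. This is a rough sketch, assuming one row per line in the layout of the sample output above; `primaries` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print "<cm_primary_node> <dn_primary_node>" from gs_om status output.
# Assumes the row layout of the sample output; primaries is a made-up name.

primaries() {
    awk '
        /^[ \t]*\[ *CMServer State *\]/ { section = "cm"; next }
        /^[ \t]*\[ *Datanode State *\]/ { section = "dn"; next }
        /^[ \t]*\[/                     { section = "";   next }
        # CMServer rows end with the role.
        section == "cm" && $NF == "Primary" { cm = $2 }
        # Datanode rows end with "<role> <state>", e.g. "Primary Normal".
        section == "dn" && NF >= 2 && $(NF - 1) == "Primary" { dn = $2 }
        END { print cm, dn }
    '
}

# Usage:
#   set -- $(gs_om -t status --detail | primaries)
#   [ "$1" = "$2" ] && echo "CM primary and DN primary are both on $1"
```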
