[My Story with openGauss] Guide to Upgrading an openGauss 3.1.1 Enterprise Edition Primary/Standby Cluster to 5.0.0
- 2023-08-12, Beijing
Shang Lei, openGauss, published 2023-07-29 17:58 in Sichuan
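Preface: A few days ago I test-deployed openGauss 5.0 and wrote the article "CentOS/RHEL 7: Installing and Deploying openGauss 5.0 Enterprise Edition with One Primary, Two Standbys, and One Cascade Standby". More recently I tested upgrading openGauss from 3.1.1 to 5.0.0 and ran into several problems along the way. If you follow this guide and hit a problem, or disagree with anything in it, please get in touch; I would be grateful for the feedback.
The environment used here is an openGauss 3.1.1 Enterprise Edition cluster with one primary and one standby. Before openGauss 3.1.1 was installed, the environment was prepared as the official openGauss documentation requires: dependency packages installed, the firewall and SELinux disabled, kernel parameters adjusted, and so on. The environment details are as follows:
If you are not familiar with installing an openGauss 3 Enterprise Edition cluster, see my earlier article "CentOS 7: Installing and Deploying an openGauss 3.1.0 Cluster with One Primary and Two Standbys": https://www.modb.pro/db/551221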
1.1 Host names
1.2 Host addresses
1.3 Ports
1.4 Users and groups
1.5 Software directories
1.6 XML configuration file
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
<!-- Cluster-wide openGauss settings -->
<CLUSTER>
<!-- Cluster name -->
<PARAM name="clusterName" value="openGSDB" />
<!-- Database node names (hostnames) -->
<PARAM name="nodeNames" value="opengauss-db1,opengauss-db2" />
<!-- Node IPs, in the same order as nodeNames -->
<PARAM name="backIp1s" value=","/>
<!-- Database installation directory -->
<PARAM name="gaussdbAppPath" value="/opt/gaussdb/install/app" />
<!-- Log directory -->
<PARAM name="gaussdbLogPath" value="/var/log/omm" />
<!-- Temporary file directory -->
<PARAM name="tmpMppdbPath" value="/opt/gaussdb/tmp"/>
<PARAM name="gaussdbToolPath" value="/opt/gaussdb/install/om" />
<PARAM name="corePath" value="/opt/gaussdb/corefile"/>
<!-- openGauss type; "single-inst" is the single-instance form with one primary and multiple standbys -->
<PARAM name="clusterType" value="single-inst"/>
</CLUSTER>
<!-- Node deployment on each server -->
<DEVICELIST>
<!-- Node deployment on opengauss-db1 -->
<DEVICE sn="1000001">
<!-- Hostname of opengauss-db1 -->
<PARAM name="name" value="opengauss-db1"/>
<!-- AZ of opengauss-db1 and its AZ priority -->
<PARAM name="azName" value="AZ1"/>
<PARAM name="azPriority" value="1"/>
<!-- If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
<PARAM name="backIp1" value=""/>
<PARAM name="sshIp1" value=""/>
<PARAM name="cmDir" value="/opt/gaussdb/install/cm" />
<PARAM name="cmsNum" value="1" />
<PARAM name="cmServerPortBase" value="15300" />
<PARAM name="cmServerlevel" value="1" />
<PARAM name="cmServerListenIp1" value="," />
<PARAM name="cmServerRelation" value="opengauss-db1,opengauss-db2" />
<PARAM name="dataNum" value="1"/>
<PARAM name="dataPortBase" value="26000"/>
<PARAM name="dataNode1" value="/opt/gaussdb/install/data/dn,opengauss-db2,/opt/gaussdb/install/data/dn"/>
<PARAM name="dataNode1_syncNum" value="0"/>
</DEVICE>
<!-- Node deployment on opengauss-db2; "name" is set to the hostname -->
<DEVICE sn="1000002">
<PARAM name="name" value="opengauss-db2"/>
<PARAM name="azName" value="AZ1"/>
<PARAM name="azPriority" value="1"/>
<!-- If the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
<PARAM name="backIp1" value=""/>
<PARAM name="sshIp1" value=""/>
<PARAM name="cmDir" value="/opt/gaussdb/install/cm" />
</DEVICE>
</DEVICELIST>
</ROOT>
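2.1 Download the 5.0.0 installation package
2.1.1 Download the package
Log in with a registered account to the download page of the official openGauss site, https://www.opengauss.org/zh/download/, and download the openGauss 5.0.0 package that matches your operating system. Choose the openGauss_5.0.0 Enterprise Edition package and upload it to /opt/software/openGauss on the server.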

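Note: if the database server has internet access, you can right-click the download button, choose "Copy link", and fetch the openGauss 5.0.0 Enterprise Edition package directly on the server with wget.
# Run as root [primary node]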
[root@opengauss-db1 ~]# cd /opt/software/openGauss
[root@opengauss-db1 openGauss]# wget https://opengauss.obs.cn-south-1.myhuaweicloud.com/5.0.0/x86/openGauss-5.0.0-CentOS-64bit-all.tar.gz
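2.1.2 Verify the package
Copy the SHA256 value published for the package on the official download page and paste it into a text file. The value is: aa9fc724c5030f4cc79dad201675183029c8f36a07667028e681169a2f6482f5. Then run sha256sum against the downloaded file to confirm that the package is intact.
# Run as root [primary node]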
[root@opengauss-db1 openGauss]# sha256sum openGauss-5.0.0-CentOS-64bit-all.tar.gz
aa9fc724c5030f4cc79dad201675183029c8f36a07667028e681169a2f6482f5 openGauss-5.0.0-CentOS-64bit-all.tar.gz
-- If the computed value matches the SHA256 value on the official site, the file is intact
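The manual comparison above can also be scripted. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available; the function name `verify_sha256` is hypothetical and not part of openGauss.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Succeeds when the SHA256 digest of FILE equals EXPECTED_HEX.
verify_sha256() {
    # sha256sum prints "<digest>  <file>"; keep only the digest.
    vs_actual=$(sha256sum "$1" 2>/dev/null | awk '{print $1}')
    if [ -n "$vs_actual" ] && [ "$vs_actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got '$vs_actual')" >&2
        return 1
    fi
}
```

For the package in this article the call would be `verify_sha256 openGauss-5.0.0-CentOS-64bit-all.tar.gz aa9fc724c5030f4cc79dad201675183029c8f36a07667028e681169a2f6482f5`; a non-zero exit status means the download should be repeated.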
2.1.3 Extract the package
# Run as root [primary node]
[root@opengauss-db1 ~]# cd /opt/software/openGauss
[root@opengauss-db1 openGauss]# tar -zxvf openGauss-5.0.0-CentOS-64bit-all.tar.gz
[root@opengauss-db1 openGauss]# tar -zxvf openGauss-5.0.0-CentOS-64bit-om.tar.gz
[root@opengauss-db1 openGauss]# ll
total 261040
drwxr-xr-x 14 root root 302 Mar 29 03:22 lib
-rw-r--r-- 1 root root 133071038 Mar 29 20:11 openGauss-5.0.0-CentOS-64bit-all.tar.gz
-rw-r--r-- 1 root root 105 Mar 29 03:23 openGauss-5.0.0-CentOS-64bit-cm.sha256
-rw-r--r-- 1 root root 22356000 Mar 29 03:23 openGauss-5.0.0-CentOS-64bit-cm.tar.gz
-rw-r--r-- 1 root root 65 Mar 29 03:22 openGauss-5.0.0-CentOS-64bit-om.sha256
-rw-r--r-- 1 root root 11963876 Mar 29 03:22 openGauss-5.0.0-CentOS-64bit-om.tar.gz
-rw-r--r-- 1 root root 65 Mar 29 03:23 openGauss-5.0.0-CentOS-64bit.sha256
-rw-r--r-- 1 root root 99384569 Mar 29 03:23 openGauss-5.0.0-CentOS-64bit.tar.bz2
drwxr-xr-x 10 root root 4096 Mar 29 03:22 script
-rw------- 1 root root 65 Mar 29 03:21 upgrade_sql.sha256
-rw------- 1 root root 493211 Mar 29 03:21 upgrade_sql.tar.gz
-rw-r--r-- 1 root root 32 Mar 29 03:22 version.cfg
2.2 Check OS health
# Run as root [any node]
-- Run gs_checkos -i A
[root@opengauss-dbxxx ~]# /opt/software/openGauss/script/gs_checkos -i A --detail
Checking items:
A1. [ OS version status ] : Normal
A2. [ Kernel version status ] : Normal
The names about all kernel versions are same. The value is "3.10.0-1160.92.1.el7.x86_64".
A3. [ Unicode status ] : Normal
The values of all unicode are same. The value is "LANG=en_US.UTF-8".
A4. [ Time zone status ] : Normal
The informations about all timezones are same. The value is "+0800".
A5. [ Swap memory status ] : Normal
The value about swap memory is correct.
A6. [ System control parameters status ] : Normal
All values about system control parameters are correct.
A7. [ File system configuration status ] : Normal
Both soft nofile and hard nofile are correct.
A8. [ Disk configuration status ] : Normal
The value about XFS mount parameters is correct.
A9. [ Pre-read block size status ] : Normal
The value about Logical block size is correct.
A11.[ Network card configuration status ] : Normal
The configuration about network card is correct.
A12.[ Time consistency status ] : Normal
The ntpd service is started, local time is "2023-07-21 16:24:44".
A13.[ Firewall service status ] : Normal
The firewall service is stopped.
A14.[ THP service status ] : Normal
The THP service is stopped.
Total numbers:13. Abnormal numbers:0. Warning numbers:0.
-- Adjust any item whose status is not Normal
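With many check items it is easy to overlook one that is not Normal. The sketch below filters saved `gs_checkos` output down to the item lines whose status differs from Normal. It assumes the item lines keep the `A<n>. [ ... ] : <status>` shape shown above; the function name `list_non_normal` is hypothetical.

```shell
#!/bin/sh
# list_non_normal: read `gs_checkos -i A --detail` output on stdin and print
# every check item line whose status is not "Normal".
list_non_normal() {
    # Item lines start with "A<number>."; detail lines are ignored.
    grep -E '^[[:space:]]*A[0-9]+\.' |
        grep -v -E ':[[:space:]]*Normal[[:space:]]*$'
}
```

A possible use is `/opt/software/openGauss/script/gs_checkos -i A --detail | list_non_normal`. Empty output (and a non-zero exit status from the final `grep`) means every item reported Normal.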
2.3 Check disk space
# Run as root [all nodes]
-- Use df -h and df -i to confirm that enough disk space and inodes are available
-- df -h shows disk space usage
-- df -i shows free inodes
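The two `df` checks can be turned into a pass/fail test. This is a sketch only: the helper names are hypothetical, and the minimum values are whatever you decide your upgrade needs, since this article does not state a required size.

```shell
#!/bin/sh
# free_kb PATH: print the available space, in KiB, on the file system holding PATH.
free_kb() {
    # -P forces one line per file system; column 4 is the "Available" column.
    df -Pk "$1" | awk 'NR == 2 {print $4}'
}

# free_inodes PATH: print the number of free inodes on the same file system.
free_inodes() {
    # With -i, column 4 is the "IFree" column.
    df -Pi "$1" | awk 'NR == 2 {print $4}'
}

# has_free_kb PATH MIN_KB: succeed when at least MIN_KB KiB are available.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

For example, `has_free_kb /opt/gaussdb 10485760 || echo "less than 10 GiB free"` would flag a nearly full installation file system; the 10 GiB figure is purely illustrative.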
2.4 Check version information
-- Run as omm [any node]
-- Query the version on all nodes
[root@opengauss-dbxxx ~]# su - omm
Last login: Fri Jul 21 16:07:06 CST 2023 on pts/1
[omm@opengauss-dbxxx ~]$ gs_ssh -c "gsql -V"
Successfully execute command on all nodes.
[SUCCESS] opengauss-db1:
gsql (openGauss 3.1.1 build 70980198) compiled at 2023-01-06 09:34:59 commit 0 last mr
[SUCCESS] opengauss-db2:
gsql (openGauss 3.1.1 build 70980198) compiled at 2023-01-06 09:34:59 commit 0 last mr
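To confirm that every node reports the same version without reading each line, the output can be reduced to its distinct version and build pairs. The sketch below assumes the `gsql -V` line format shown above; `distinct_versions` is a hypothetical helper name.

```shell
#!/bin/sh
# distinct_versions: read `gs_ssh -c "gsql -V"` output on stdin and print each
# distinct "<version> <build>" pair once.
distinct_versions() {
    sed -n 's/.*(openGauss \([0-9.]*\) build \([0-9a-f]*\)).*/\1 \2/p' | sort -u
}
```

Running `gs_ssh -c "gsql -V" | distinct_versions` should print a single line when all nodes run the same build; more than one line means the nodes are on different versions.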

2.5 Check cluster status
-- Run as omm [any node]
[omm@opengauss-dbxxx ~]$ gs_om -t status --detail
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Primary Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Standby Normal
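Before going further it is worth gating on `cluster_state`. The following sketch extracts that value from the status output so a script can stop when the cluster is not Normal; `cluster_state` as a function name is hypothetical, and the parsing assumes the `cluster_state : <value>` line shown above.

```shell
#!/bin/sh
# cluster_state: read `gs_om -t status --detail` output on stdin and print the
# value of the first cluster_state line (for example "Normal").
cluster_state() {
    awk -F':' '!seen && /^[[:space:]]*cluster_state/ {
        gsub(/[[:space:]]/, "", $2)   # strip padding around the value
        print $2
        seen = 1
    }'
}
```

A script could then use `[ "$(gs_om -t status --detail | cluster_state)" = Normal ] || exit 1` as a guard before the backup and upgrade steps.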

2.6 Back up the database
-- Run as omm [primary node]
[root@opengauss-db1 ~]# su - omm
Last login: Fri Jul 21 16:51:53 CST 2023 on pts/1
-- Create the backup directory
[omm@opengauss-db1 ~]$ BACKUP_DIR=/opt/gaussdb/backup/`date '+%Y%m%d_%H%M%S'`
[omm@opengauss-db1 ~]$ mkdir -p $BACKUP_DIR
-- Run the physical backup
[omm@opengauss-db1 backup]$ gs_basebackup -D $BACKUP_DIR -p 26000 -P -l $BACKUP_DIR
INFO: The starting position of the xlog copy of the full build is: 0/400E8B0. The slot minimum LSN is: 0/400E8B0. The disaster slot minimum LSN is: 0/0. The logical slot minimum LSN is: 0/0.
[2023-07-21 17:11:55]:begin build tablespace list
[2023-07-21 17:11:55]:finish build tablespace list
[2023-07-21 17:11:55]:begin get xlog by xlogstream
check identify system successpace[2023-07-21 17:11:55]:
[2023-07-21 17:11:55]: send START_REPLICATION 0/4000000 success
[2023-07-21 17:11:55]: keepalive message is received
[2023-07-21 17:11:55]: keepalive message is received
97981/97981 kB (100%), 1/1 tablespace
[2023-07-21 17:12:00]:gs_basebackup: base backup successfully
-- Inspect the backup
[omm@opengauss-db1 ~]$ ls -l /opt/gaussdb/backup/20230721_171855
total 5084
-rw------- 1 omm dbgrp 216 Jul 21 17:19 backup_label
-rw------- 1 omm dbgrp 198 Jul 21 17:19 backup_label.old
drwx------ 5 omm dbgrp 4096 Jul 21 17:19 base
-rw------- 1 omm dbgrp 0 Jul 21 17:19 build_completed.done
-rw------- 1 omm dbgrp 4399 Jul 21 17:19 cacert.pem
drwx------ 4 omm dbgrp 4096 Jul 21 17:19 dbe_perf_standby
-rw------- 1 omm dbgrp 56 Jul 21 17:19 full_backup_label
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 global
-rw------- 1 omm dbgrp 4915200 Jul 21 17:19 gswlm_userinfo.cfg
-rw------- 1 omm dbgrp 21016 Jul 21 17:19 mot.conf
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_clog
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_csnlog
-rw------- 1 omm dbgrp 0 Jul 21 17:19 pg_ctl.lock
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_errorinfo
-rw------- 1 omm dbgrp 4676 Jul 21 17:19 pg_hba.conf
-rw------- 1 omm dbgrp 4676 Jul 21 17:19 pg_hba.conf.bak
-rw------- 1 omm dbgrp 1024 Jul 21 17:19 pg_hba.conf.lock
-rw------- 1 omm dbgrp 1636 Jul 21 17:19 pg_ident.conf
drwx------ 4 omm dbgrp 4096 Jul 21 17:19 pg_llog
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_logical
drwx------ 4 omm dbgrp 4096 Jul 21 17:19 pg_multixact
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_notify
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_replslot
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_serial
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_snapshots
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_stat_tmp
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_tblspc
drwx------ 2 omm dbgrp 4096 Jul 21 17:19 pg_twophase
-rw------- 1 omm dbgrp 4 Jul 21 17:19 PG_VERSION
drwx------ 3 omm dbgrp 4096 Jul 21 17:19 pg_xlog
-rw------- 1 omm dbgrp 35919 Jul 21 17:19 postgresql.conf
-rw------- 1 omm dbgrp 35919 Jul 21 17:19 postgresql.conf.guc.bak
-rw------- 1 omm dbgrp 1024 Jul 21 17:19 postgresql.conf.lock
-rw------- 1 omm dbgrp 35919 Jul 21 17:19 postgresql.conf.wal.bak
-rw------- 1 omm dbgrp 0 Jul 21 17:19 postmaster.pid.lock
-rw------- 1 omm dbgrp 10 Jul 21 17:19 rewind_lable
-rw------- 1 omm dbgrp 4402 Jul 21 17:19 server.crt
-rw------- 1 omm dbgrp 1766 Jul 21 17:19 server.key
-rw------- 1 omm dbgrp 56 Jul 21 17:19 server.key.cipher
-rw------- 1 omm dbgrp 24 Jul 21 17:19 server.key.rand
-rw------- 1 omm dbgrp 4 Jul 21 17:19 term_file
drwx------ 5 omm dbgrp 4096 Jul 21 17:19 undo
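A quick sanity check on the backup directory can catch an obviously incomplete copy. The sketch below only tests for a few entries taken from the listing above (`backup_label`, `PG_VERSION`, `base`, `global`, `pg_xlog`); that list is my assumption of a reasonable minimum and is not an official completeness check. The function name is hypothetical.

```shell
#!/bin/sh
# backup_has_core_files DIR: succeed when DIR contains a handful of entries
# that appear in the gs_basebackup listing above. This is a heuristic only.
backup_has_core_files() {
    for entry in backup_label PG_VERSION base global pg_xlog; do
        if [ ! -e "$1/$entry" ]; then
            echo "missing: $1/$entry" >&2
            return 1
        fi
    done
}
```

It would be called as `backup_has_core_files "$BACKUP_DIR"` right after `gs_basebackup` returns. Passing this check does not prove the backup is restorable; a test restore remains the only real proof.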

2.7 Stop the cluster
-- Stop the cluster; run as omm [primary node]
[omm@opengauss-db1 ~]$ gs_om -t stop
Stopping cluster.
Successfully stopped cluster.
End stop cluster.
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Down
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Down
cm_ctl: can't connect to cm_server.
Maybe cm_server is not running, or timeout expired. Please try again.

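2.8 Back up directories and files
-- Run as root [all nodes]
-- Before upgrading, back up the directories and files referenced in cluster_config.xml so that the environment can be restored if the upgrade fails
-- The directories used in this test environment are listed below; adjust them for your production environment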
<PARAM name="gaussdbAppPath" value="/opt/gaussdb/install/app" />
<PARAM name="gaussdbLogPath" value="/var/log/omm" />
<PARAM name="tmpMppdbPath" value="/opt/gaussdb/tmp" />
<PARAM name="gaussdbToolPath" value="/opt/gaussdb/install/om" />
<PARAM name="corePath" value="/opt/gaussdb/corefile" />
<PARAM name="dataNode1" value="/opt/gaussdb/install/data/dn,opengauss-db2,/opt/gaussdb/install/data/dn"/>
-- Back up the directory
[root@opengauss-dbxxx ~]# cd /opt
[root@opengauss-dbxxx opt]# tar -czf gaussdb_3.1.1.tar ./gaussdb/
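An archive that cannot be read back is of no use during a rollback, so it helps to list it once after creation. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# archive_is_readable FILE: succeed when tar can decompress and list every
# member of the gzip-compressed archive FILE.
archive_is_readable() {
    tar -tzf "$1" > /dev/null 2>&1
}
```

Run it against the archive created above, for example `archive_is_readable /opt/gaussdb_3.1.1.tar || echo "archive is damaged"`. The `-z` flag is needed because the archive was written with `tar -czf`, even though its name ends in `.tar`.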
2.9 Start the cluster
-- Start the cluster; run as omm [primary node]
[omm@opengauss-db1 ~]$ gs_om -t start
Starting cluster.
Successfully started primary instance. Wait for standby instance.
Successfully started cluster.
cluster_state : Normal
redistributing : No
node_count : 2
Datanode State
primary : 1
standby : 1
secondary : 0
cascade_standby : 0
building : 0
abnormal : 0
down : 0
Successfully started cluster.
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Primary Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Standby Normal

3.1 Pre-upgrade check
# Run as root [primary node]
[root@opengauss-db1 ~]# python3 /opt/software/openGauss/script/gs_preinstall -U omm -G dbgrp -X /opt/software/openGauss/cluster_config.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)?yes -- enter yes
Please enter password for root
Successfully created SSH trust for the root permission user.
Setting host ip env
Successfully set host ip env.
Distributing package.
Begin to distribute package to tool path.
Successfully distribute package to tool path.
Begin to distribute package to package path.
Successfully distribute package to package path.
Successfully distributed package.
Are you sure you want to create the user[omm] and create trust for it (yes/no)? no -- enter no
Preparing SSH service.
Successfully prepared SSH service.
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Checking hostname mapping.
Successfully checked hostname mapping.
Checking OS software.
Successfully check os software.
Checking OS version.
Successfully checked OS version.
Creating cluster's path.
Successfully created cluster's path.
Set and check OS parameter.
Setting OS parameters.
Successfully set OS parameters.
Warning: Installation environment contains some warning messages.
Please get more details by "/opt/software/openGauss/script/gs_checkos -i A -h opengauss-db1,opengauss-db2 --detail".
Set and check OS parameter completed.
Preparing CRON service.
Successfully prepared CRON service.
Setting user environmental variables.
Successfully set user environmental variables.
Setting the dynamic link library.
Successfully set the dynamic link library.
Setting Core file
Successfully set core path.
Setting pssh path
Successfully set pssh path.
Setting Cgroup.
Successfully set Cgroup.
Set ARM Optimization.
No need to set ARM Optimization.
Fixing server package owner.
Setting finish flag.
Successfully set finish flag.
Preinstallation succeeded.
-- Run /opt/software/openGauss/script/gs_checkos -i A -h opengauss-db1,opengauss-db2 --detail to see the pre-check details, and resolve any warnings it reports
3.2 Run the upgrade
# Run as root [primary node]
[root@opengauss-db1 ~]# chmod -R 755 /opt/software/openGauss/script/
[root@opengauss-db1 ~]# chown -R omm:dbgrp /opt/software/openGauss/script/
-- Grey upgrade (run as omm)
[omm@opengauss-db1 ~]$ /opt/software/openGauss/script/gs_upgradectl -t auto-upgrade --grey -X /opt/software/openGauss/cluster_config.xml
Static configuration matched with old static configuration files.
Wait for the cluster status normal or degrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Successfully set upgrade_mode to 0.
Checking upgrade environment.
Successfully checked upgrade environment.
Start to do health check.
Successfully checked cluster status.
Upgrade all nodes.
NOTICE: The directory /opt/gaussdb/install/app_70980198 will be deleted after commit-upgrade, please make sure there is no personal data.
Performing grey rollback.
No need to rollback.
The directory /opt/gaussdb/install/app_70980198 will be deleted after commit-upgrade, please make sure there is no personal data.
Installing new binary.
Wait for the cluster status normal or degrade.
copy certs from /opt/gaussdb/install/app_70980198 to /opt/gaussdb/install/app_a07d57c3.
Successfully copy certs from /opt/gaussdb/install/app_70980198 to /opt/gaussdb/install/app_a07d57c3.
Successfully backup hotpatch config file.
Sync cluster configuration.
Successfully synced cluster configuration.
Switch symbolic link to new binary directory.
Successfully switch symbolic link to new binary directory.
Start check CMS parameter.
Old cluster version number less than 92574.
Switching all db processes.
Check cluster state.
Cluster state: [ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
1 opengauss-db1 26000 6001 P Primary Normal
2 opengauss-db2 26000 6002 S Standby Normal
Wait for the cluster status normal or degrade.
Wait for the cluster status normal or degrade.
Create checkpoint before switching.
Start to wait for om_monitor.
Switching DN processes.
Switch DN processes for rolling upgrade.
Ready to grey start cluster.
Grey start cluster successfully.
Wait for the cluster status normal or degrade.
Successfully switch all process version
The nodes ['opengauss-db1', 'opengauss-db2'] have been successfully upgraded to new version. Then do health check.
Start to do health check.
Successfully checked cluster status.
Waiting for the cluster status to become normal.
The cluster status is normal.
Upgrade main process has been finished, user can do some check now.
Once the check done, please execute following command to commit upgrade:
gs_upgradectl -t commit-upgrade -X /opt/software/openGauss/cluster_config.xml
Successfully upgrade all nodes.
-- Commit the upgrade
[omm@opengauss-db1 ~]$ gs_upgradectl -t commit-upgrade -X /opt/software/openGauss/cluster_config.xml
Wait for the cluster status normal or degrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Start to do health check.
Successfully checked cluster status.
Wait for the cluster status normal or degrade.
Wait for the cluster status normal or degrade.
Start check CMS parameter.
Old cluster version number less than 92574.
Successfully cleaned old install path.
Commit upgrade succeeded.
Start check CMS parameter.
Old cluster version number less than 92574.

3.3 Post-upgrade verification
3.3.1 Check version information
# Run as omm [any node]
-- Check the version
-- The version is now 5.0.0
[omm@opengauss-db1 ~]$ gs_om -V
gs_om (openGauss OM 5.0.0 build 244a7e05) compiled at 2023-03-29 03:22:22 commit 0 last mr
-- Check the database version on both nodes; both have been upgraded to 5.0.0
[omm@opengauss-db1 ~]$ gs_ssh -c "gsql -V"
Successfully execute command on all nodes.
[SUCCESS] opengauss-db1:
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr
[SUCCESS] opengauss-db2:
gsql (openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr
3.3.2 Check cluster status
# Run as omm [any node]
-- Cluster status
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Standby Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Primary Normal
-- The output shows that the primary and standby roles have switched after the upgrade
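Because the roles can change during the upgrade, scripts should not assume that opengauss-db1 is still the primary. The sketch below reads the status output and prints the host name of the datanode currently marked Primary. It assumes the host name is the second column of the Datanode State rows, as in the output above; `primary_node` is a hypothetical helper name.

```shell
#!/bin/sh
# primary_node: read `gs_om -t status --detail` output on stdin and print the
# host name of the datanode whose current role is Primary.
primary_node() {
    awk '/\[ *Datanode State *\]/ {in_dn = 1; next}   # skip the CMServer section
         in_dn && /Primary/ {print $2}'
}
```

Used as `gs_om -t status --detail | primary_node`, it would print `opengauss-db2` for the status shown above, which is the node to connect to for write operations such as CREATE DATABASE.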

3.3.3 Check database information
# Run as omm [any node]
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Standby Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Primary Normal
[omm@opengauss-db1 ~]$
[omm@opengauss-db1 ~]$
[omm@opengauss-db1 ~]$ gsql -d postgres -p 26000
gsql ((openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
openGauss=# CREATE DATABASE gaussdb WITH ENCODING 'UTF8' template = template0;
ERROR: cannot execute CREATE DATABASE in a read-only transaction
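-- Because of the switchover, this session is connected to the standby node, where a database cannot be created
4.1 The owner and group of version.cfg must be changed
Before running the upgrade, change the owner and group of /opt/software/openGauss/version.cfg on both the primary and the standby node. If they are not changed, the upgrade fails.
-- If the owner and group of version.cfg are not changed on both nodes, the upgrade fails with the following error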
[omm@opengauss-db1 ~]$ /opt/software/openGauss/script/gs_upgradectl -t auto-upgrade --grey -X /opt/software/openGauss/cluster_config.xml
[Errno 13] Permission denied: '/opt/software/openGauss/version.cfg'
[Errno 13] Permission denied: '/opt/software/openGauss/version.cfg'
Start check CMS parameter.
float() argument must be a string or a number, not 'NoneType'
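The ownership can be checked before the upgrade is started. The sketch below assumes GNU `stat` and assumes the target owner is `omm:dbgrp`, matching the `chown` applied to the script directory in section 3.2; the function name `owner_differs` is hypothetical.

```shell
#!/bin/sh
# owner_differs FILE WANT: succeed when the "user:group" that owns FILE is
# different from WANT (for example "omm:dbgrp").
owner_differs() {
    # stat -c is the GNU coreutils form; %U and %G are the owner and group names.
    [ "$(stat -c '%U:%G' "$1")" != "$2" ]
}
```

As root on each node, `owner_differs /opt/software/openGauss/version.cfg omm:dbgrp && chown omm:dbgrp /opt/software/openGauss/version.cfg` would change the file only when needed.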
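4.2 Changing the NIC MTU can break SSH between the primary and standby nodes
If the NIC MTU on the primary and standby nodes was changed during the pre-upgrade check, gs_upgradectl hangs and the upgrade eventually fails. At that point the two nodes can still ping each other but cannot connect to each other over SSH.
The fix is to set the MTU back to the default of 1500 and restart the SSH service.
-- The pre-upgrade check suggested raising the MTU on both nodes from 1500 to 8192. After the NIC MTU was changed, gs_upgradectl hung and finally reported an error; the upgrade log contains the following: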
[2023-07-21 22:45:39.414838][20984][gs_sshexkey][DEBUG]:Successfully to add id_rsa in ssh-agent
[2023-07-21 22:45:39.415698][20984][gs_sshexkey][DEBUG]:Ssh agent register successfully.
[2023-07-21 22:45:39.416461][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step5]:Successfully created the local key files.
[2023-07-21 22:45:39.417283][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step6]:Appending local ID to authorized_keys.
[2023-07-21 22:45:39.418192][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step6]:Successfully appended local ID to authorized_keys.
[2023-07-21 22:45:39.429370][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step7]:Updating the known_hosts file.
[2023-07-21 22:45:40.311033][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step7]:Successfully updated the known_hosts file.
[2023-07-21 22:45:40.311665][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step8]:Appending authorized_key on the remote node.
[2023-07-21 22:45:40.679766][20984][gs_sshexkey][DEBUG]:Send to
Successfully appended authorized_key on remote node
[2023-07-21 22:45:40.864480][20984][gs_sshexkey][DEBUG]:Send to
Successfully appended authorized_key on remote node
[2023-07-21 22:45:40.921407][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step8]:Successfully appended authorized_key on all remote node.
[2023-07-21 22:45:40.921956][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step9]:Checking common authentication file content.
[2023-07-21 22:45:40.927562][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step9]:Successfully checked common authentication content.
[2023-07-21 22:45:40.928391][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step10]:Distributing SSH trust file to all node.
[2023-07-21 22:47:41.046988][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 3, retry again.
[2023-07-21 22:47:41.047776][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:47:41.089878][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:49:51.205162][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 2, retry again.
[2023-07-21 22:49:51.206276][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:49:51.240173][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:52:01.367717][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 1, retry again.
[2023-07-21 22:52:01.368465][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:52:01.425251][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:54:11.538969][20984][gs_sshexkey][ERROR]:[GAUSS-50223] : Failed to update the authentication files.cmd is source /root/.bashrc;scp -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" /root/.ssh/id_om /root/.ssh/id_om.pub && temp_auth=$(grep '#OM' /root/.ssh/authorized_keys) && ssh "sed -i '/#OM/d' /root/.ssh/authorized_keys; echo *** >> /root/.ssh/authorized_keys" && temp_auth=$(grep '#OM' /root/.ssh/known_hosts) && ssh "sed -i '/#OM/d' /root/.ssh/known_hosts; echo *** >> /root/.ssh/known_hosts"; Node: Error:
1, lost connection
[2023-07-21 22:54:12.110072][20463][gs_preinstall][DEBUG]:The $GAUSSHOME/bin is exist.
[2023-07-21 22:54:12.111040][20463][gs_preinstall][DEBUG]:The $GAUSS_ENV is 2.
[2023-07-21 22:54:12.111678][20463][gs_preinstall][DEBUG]:There is the upgrade is in progress.
[2023-07-21 22:54:12.112467][20463][gs_preinstall][DEBUG]:In upgrade process, no need to delete /opt/gaussdb/install/om.
[2023-07-21 22:54:12.113237][20463][gs_preinstall][ERROR]:[GAUSS-51632] : Failed to do gs_sshexkey.Error: Please enter password for current user[root].
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
[GAUSS-50223] : Failed to update the authentication files.cmd is source /root/.bashrc;scp -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" /root/.ssh/id_om /root/.ssh/id_om.pub && temp_auth=$(grep '#OM' /root/.ssh/authorized_keys) && ssh "sed -i '/#OM/d' /root/.ssh/authorized_keys; echo *** >> /root/.ssh/authorized_keys" && temp_auth=$(grep '#OM' /root/.ssh/known_hosts) && ssh "sed -i '/#OM/d' /root/.ssh/known_hosts; echo *** >> /root/.ssh/known_hosts"; Node: Error:
1, lost connection
-- At this point the SSH status on the primary and standby nodes is also abnormal
[root@opengauss-db2 ~]# systemctl status sshd.service
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2023-07-21 11:03:03 CST; 12h ago
Docs: man:sshd(8)
Main PID: 2160 (sshd)
Tasks: 1
Memory: 4.2M
CGroup: /system.slice/sshd.service
└─2160 /usr/sbin/sshd -D
Jul 21 17:44:20 opengauss-db2 sshd[6374]: Accepted publickey for root from port 63717 ssh2: ED25519 SHA256:hUo4iBgUOVXW5ONlVeD2QMdS+4snKsRs0K1K3jBLO8E
Jul 21 17:44:22 opengauss-db2 sshd[6417]: Accepted publickey for root from port 63721 ssh2: ED25519 SHA256:hUo4iBgUOVXW5ONlVeD2QMdS+4snKsRs0K1K3jBLO8E
Jul 21 17:44:24 opengauss-db2 sshd[6463]: Accepted publickey for root from port 63723 ssh2: ED25519 SHA256:hUo4iBgUOVXW5ONlVeD2QMdS+4snKsRs0K1K3jBLO8E
Jul 21 22:45:32 opengauss-db2 sshd[4829]: Accepted password for root from port 30166 ssh2
Jul 21 22:45:37 opengauss-db2 sshd[4883]: Accepted password for root from port 30172 ssh2
Jul 21 22:45:39 opengauss-db2 sshd[4922]: Connection closed by port 30178 [preauth]
Jul 21 22:45:39 opengauss-db2 sshd[4928]: Connection closed by port 30182 [preauth]
Jul 21 22:45:40 opengauss-db2 sshd[4930]: Accepted password for root from port 30188 ssh2
Jul 21 23:06:46 opengauss-db2 sshd[13949]: Connection closed by port 50810 [preauth]
Jul 21 23:27:22 opengauss-db2 sshd[22723]: Connection closed by port 31050 [preauth]
-- Reset the MTU and restart the SSH service on both nodes
[root@opengauss-db2 ~]# systemctl restart sshd.service
[root@opengauss-db2 ~]# systemctl status sshd.service
● sshd.service - OpenSSH server daemon
Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2023-07-21 23:33:29 CST; 1s ago
Docs: man:sshd(8)
Main PID: 25303 (sshd)
Tasks: 1
Memory: 1.3M
CGroup: /system.slice/sshd.service
└─25303 /usr/sbin/sshd -D
Jul 21 23:33:28 opengauss-db2 systemd[1]: Starting OpenSSH server daemon...
Jul 21 23:33:29 opengauss-db2 sshd[25303]: Server listening on port 60002.
Jul 21 23:33:29 opengauss-db2 sshd[25303]: Server listening on :: port 60002.
Jul 21 23:33:29 opengauss-db2 systemd[1]: Started OpenSSH server daemon.
Jul 21 23:33:29 opengauss-db2 sshd[25303]: Server listening on port 22.
Jul 21 23:33:29 opengauss-db2 sshd[25303]: Server listening on :: port 22.
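The MTU a node is actually using can be read from sysfs before and after the change. The sketch below is Linux-specific; the helper names and the interface name `eth0` used in the usage note are assumptions, so substitute the NIC your cluster uses.

```shell
#!/bin/sh
# iface_mtu IFACE [SYSFS_ROOT]: print the MTU the kernel reports for IFACE.
# SYSFS_ROOT defaults to /sys/class/net and exists only to make testing easy.
iface_mtu() {
    cat "${2:-/sys/class/net}/$1/mtu"
}

# mtu_is_default IFACE [SYSFS_ROOT]: succeed when the MTU is the default 1500.
mtu_is_default() {
    [ "$(iface_mtu "$1" "$2")" -eq 1500 ]
}
```

If `mtu_is_default eth0` fails, `ip link set dev eth0 mtu 1500` resets the value at runtime; a persistent change also needs the interface configuration file updated. One plausible explanation for "ping works but SSH does not", which this article did not verify, is that small ICMP packets fit through a path that cannot carry 8192-byte frames. `ping -M do -s 8164 opengauss-db2` (8192 minus 28 bytes of IP and ICMP headers) would test that, since it forbids fragmentation.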
4.3 A broken python3 prevents viewing the cluster status
-- If the installed python3 is broken, gs_om cannot show the cluster status
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
-bash: /opt/gaussdb/install/om/script/gs_om: Permission denied
4.4 A primary/standby switchover occurs after the cluster upgrade
-- Cluster status before the upgrade
[omm@opengauss-db1 dn]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Standby Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Primary Normal
-- Cluster status after the upgrade
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Standby Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Primary Normal
-- Connected to the original primary: CREATE DATABASE fails
[omm@opengauss-db1 ~]$ gsql -d postgres -p 26000
gsql ((openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
openGauss=# CREATE DATABASE gaussdb WITH ENCODING 'UTF8' template = template0;
ERROR: cannot execute CREATE DATABASE in a read-only transaction
-- Connected to the new primary: the database is created successfully
[omm@opengauss-db2 ~]$ gsql -d postgres -p 26000
gsql ((openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
openGauss=# CREATE DATABASE gaussdb WITH ENCODING 'UTF8' template = template0;
openGauss=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
gaussdb | omm | UTF8 | C | C |
postgres | omm | SQL_ASCII | C | C |
template0 | omm | SQL_ASCII | C | C | =c/omm +
| | | | | omm=CTc/omm
template1 | omm | SQL_ASCII | C | C | =c/omm +
| | | | | omm=CTc/omm
(4 rows)
[root@opengauss-db1 ~]# python3 /opt/software/openGauss/script/gs_preinstall -U omm -G dbgrp -X /opt/software/openGauss/cluster_config.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)?no
Setting host ip env
[GAUSS-51400] : Failed to execute the command: sed -i '/^export[ ]*HOST_IP=/d' /etc/profile. Result:{'opengauss-db1': 'Success', 'opengauss-db2': 'Failure'}.
[SUCCESS] opengauss-db1:
[FAILURE] opengauss-db2:
[2023-07-21 22:45:39.414838][20984][gs_sshexkey][DEBUG]:Successfully to add id_rsa in ssh-agent
[2023-07-21 22:45:39.415698][20984][gs_sshexkey][DEBUG]:Ssh agent register successfully.
[2023-07-21 22:45:39.416461][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step5]:Successfully created the local key files.
[2023-07-21 22:45:39.417283][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step6]:Appending local ID to authorized_keys.
[2023-07-21 22:45:39.418192][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step6]:Successfully appended local ID to authorized_keys.
[2023-07-21 22:45:39.429370][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step7]:Updating the known_hosts file.
[2023-07-21 22:45:40.311033][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step7]:Successfully updated the known_hosts file.
[2023-07-21 22:45:40.311665][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step8]:Appending authorized_key on the remote node.
[2023-07-21 22:45:40.679766][20984][gs_sshexkey][DEBUG]:Send to
Successfully appended authorized_key on remote node
[2023-07-21 22:45:40.864480][20984][gs_sshexkey][DEBUG]:Send to
Successfully appended authorized_key on remote node
[2023-07-21 22:45:40.921407][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step8]:Successfully appended authorized_key on all remote node.
[2023-07-21 22:45:40.921956][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step9]:Checking common authentication file content.
[2023-07-21 22:45:40.927562][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step9]:Successfully checked common authentication content.
[2023-07-21 22:45:40.928391][20984][gs_sshexkey(_log:1396)][gs_sshexkey][LOG][Step10]:Distributing SSH trust file to all node.
[2023-07-21 22:47:41.046988][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 3, retry again.
[2023-07-21 22:47:41.047776][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:47:41.089878][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:49:51.205162][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 2, retry again.
[2023-07-21 22:49:51.206276][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:49:51.240173][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:52:01.367717][20984][gs_sshexkey][DEBUG]:send_trust_file failed, coutdown 1, retry again.
[2023-07-21 22:52:01.368465][20984][gs_sshexkey][DEBUG]:errorinfo: hostip:, status: 1, output: lost connection,
[2023-07-21 22:52:01.425251][20984][gs_sshexkey][DEBUG]:check os info: drwx------ 2 root root 4096 Jul 21 22:45 .ssh
-rwxr-xr-x 1 root root 885 Dec 12 2022 ssh_key.sh
-rw-r--r-- 1 root root 521 Jul 21 11:36 sshtrust.sh
total 32
drwx------ 2 root root 4096 Jul 21 22:45 .
dr-xr-x---. 11 root root 4096 Jul 21 22:45 ..
-rw------- 1 root root 504 Jul 21 22:45 authorized_keys
-rw------- 1 root root 464 Jul 21 22:45 id_om
-rw------- 1 root root 100 Jul 21 22:45 id_om.pub
-rw------- 1 root root 1679 Jul 21 11:35 id_rsa
-rw------- 1 root root 400 Jul 21 11:35 id_rsa.pub
-rw------- 1 root root 1012 Jul 21 22:45 known_hosts
[2023-07-21 22:54:11.538969][20984][gs_sshexkey][ERROR]:[GAUSS-50223] : Failed to update the authentication files.cmd is source /root/.bashrc;scp -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" /root/.ssh/id_om /root/.ssh/id_om.pub && temp_auth=$(grep '#OM' /root/.ssh/authorized_keys) && ssh "sed -i '/#OM/d' /root/.ssh/authorized_keys; echo *** >> /root/.ssh/authorized_keys" && temp_auth=$(grep '#OM' /root/.ssh/known_hosts) && ssh "sed -i '/#OM/d' /root/.ssh/known_hosts; echo *** >> /root/.ssh/known_hosts"; Node: Error:
1, lost connection
[2023-07-21 22:54:12.110072][20463][gs_preinstall][DEBUG]:The $GAUSSHOME/bin is exist.
[2023-07-21 22:54:12.111040][20463][gs_preinstall][DEBUG]:The $GAUSS_ENV is 2.
[2023-07-21 22:54:12.111678][20463][gs_preinstall][DEBUG]:There is the upgrade is in progress.
[2023-07-21 22:54:12.112467][20463][gs_preinstall][DEBUG]:In upgrade process, no need to delete /opt/gaussdb/install/om.
[2023-07-21 22:54:12.113237][20463][gs_preinstall][ERROR]:[GAUSS-51632] : Failed to do gs_sshexkey.Error: Please enter password for current user[root].
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
[GAUSS-50223] : Failed to update the authentication files.cmd is source /root/.bashrc;scp -q -o "BatchMode yes" -o "NumberOfPasswordPrompts 0" /root/.ssh/id_om /root/.ssh/id_om.pub && temp_auth=$(grep '#OM' /root/.ssh/authorized_keys) && ssh "sed -i '/#OM/d' /root/.ssh/authorized_keys; echo *** >> /root/.ssh/authorized_keys" && temp_auth=$(grep '#OM' /root/.ssh/known_hosts) && ssh "sed -i '/#OM/d' /root/.ssh/known_hosts; echo *** >> /root/.ssh/known_hosts"; Node: Error:
1, lost connection
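The failed `gs_preinstall` run above mixes console messages with debug log lines, and the actual errors are easy to lose among the repeated retry output. As a rough sketch, assuming error codes always appear in the bracketed `[GAUSS-NNNNN]` form seen in this log, the following Python lists the distinct codes in order of first appearance. The helper name `gauss_codes` is hypothetical, and the sample lines are shortened excerpts of the messages above.

```python
import re

# Matches bracketed codes such as "[GAUSS-50223]".
GAUSS_CODE = re.compile(r"\[(GAUSS-\d+)\]")


def gauss_codes(log_text):
    """Return the distinct GAUSS error codes in order of first appearance."""
    seen = []
    for code in GAUSS_CODE.findall(log_text):
        if code not in seen:
            seen.append(code)
    return seen


# Shortened excerpts of the error lines shown in this article.
SAMPLE = """\
[GAUSS-51400] : Failed to execute the command: sed -i '/^export[ ]*HOST_IP=/d' /etc/profile.
[2023-07-21 22:54:11.538969][20984][gs_sshexkey][ERROR]:[GAUSS-50223] : Failed to update the authentication files.
[2023-07-21 22:54:12.113237][20463][gs_preinstall][ERROR]:[GAUSS-51632] : Failed to do gs_sshexkey.
[GAUSS-50223] : Failed to update the authentication files.
"""

for code in gauss_codes(SAMPLE):
    print(code)
```

For this run the list is GAUSS-51400, GAUSS-50223 and GAUSS-51632, which matches the order in which the problems surfaced: the `sed` on `/etc/profile` failed on opengauss-db2, then the SSH trust files could not be distributed, and finally `gs_preinstall` reported that `gs_sshexkey` failed.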
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Down
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Down
cm_ctl: can't connect to cm_server.
Maybe cm_server is not running, or timeout expired. Please try again.
[omm@opengauss-db1 ~]$ cm_ctl switchover -a
cm_ctl: send switchover msg to cm_server, connect fail node_id:0, data_path:.
[omm@opengauss-db1 ~]$ cm_ctl query -Cv
[ CMServer State ]
node instance state
1 opengauss-db1 1 Primary
2 opengauss-db2 2 Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node instance state | node instance state
1 opengauss-db1 6001 P Primary Normal | 2 opengauss-db2 6002 S Standby Normal
[omm@opengauss-db1 ~]$ gs_om -t status --detail --all
[ CMServer State ]
node node_ip instance state
1 opengauss-db1 1 /opt/gaussdb/install/cm/cm_server Primary
2 opengauss-db2 2 /opt/gaussdb/install/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
1 opengauss-db1 6001 /opt/gaussdb/install/data/dn P Primary Normal
2 opengauss-db2 6002 /opt/gaussdb/install/data/dn S Standby Normal
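The status outputs in this section differ in one field worth watching: `balanced` is `No` while the datanode roles are swapped relative to their initial P/S assignment, and `Yes` once opengauss-db1 is Primary again. As a minimal sketch, assuming the `[ Cluster State ]` block keeps the `key : value` layout shown here, the following Python reads that block into a dictionary and flags a cluster that is healthy but not balanced, which is the situation where the article turns to `cm_ctl switchover -a`. The function names are illustrative, not an openGauss API.

```python
def cluster_state(status_text):
    """Collect the 'key : value' pairs under '[ Cluster State ]' into a dict."""
    state = {}
    in_section = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new bracketed header starts or ends the section of interest.
            in_section = "Cluster State" in stripped
            continue
        if in_section and ":" in stripped:
            key, _, value = stripped.partition(":")
            state[key.strip()] = value.strip()
    return state


def needs_rebalance(status_text):
    """True when the cluster is Normal but roles differ from the initial layout."""
    state = cluster_state(status_text)
    return state.get("cluster_state") == "Normal" and state.get("balanced") == "No"


# Cluster State blocks copied from the status outputs in this article.
AFTER_UPGRADE = """\
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : No
current_az : AZ_ALL
[ Datanode State ]
"""
AFTER_SWITCHOVER = AFTER_UPGRADE.replace("balanced : No", "balanced : Yes")

print(needs_rebalance(AFTER_UPGRADE))
print(needs_rebalance(AFTER_SWITCHOVER))
```

The check requires `cluster_state` to be `Normal` on purpose: when the CM servers are down, as in the failed `cm_ctl switchover -a` attempt above, there is no Cluster State block to read and a switchover cannot be sent.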

