写点什么

MGR 新节点 RECOVERING 状态的分析与解决:caching_sha2_password 验证插件的影响

作者:GreatSQL
  • 2023-09-13
    福建
  • 本文字数:6660 字

    阅读完需:约 22 分钟

起因

在 GreatSQL 社区上有一位用户提出了“手工构建 MGR 碰到的次节点一直处于 recovering 状态”,经过排查后,发现了是因为新密码验证插件caching_sha2_password导致的从节点一直无法连接主节点,帖子地址:(https://greatsql.cn/thread-420-2-1.html))

复现

环境介绍

本文验证环境,以及本文所采用数据库为GreatSQL 8.0.32-24


$ cat /etc/system-releaseRed Hat Enterprise Linux Server release 7.9 (Maipo)$ uname -aLinux gip 3.10.0-1160.el7.x86_64 #1 SMP Tue Aug 18 14:50:17 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
复制代码


部署准备:


采用的是单机多实例的部署方式,如何部署单机多实例可以前往(https://gitee.com/GreatSQL/GreatSQL-Manual/blob/master/6-oper-guide/6-6-multi-instances.md)



MGR 有关配置参数:


#mgr settingsloose-plugin_load_add = 'mysql_clone.so'loose-plugin_load_add = 'group_replication.so'loose-group_replication_group_name = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaa1"loose-group_replication_group_seeds = '172.17.139.77:33061,172.17.139.77:33071'loose-group_replication_start_on_boot = ONloose-group_replication_bootstrap_group = OFFloose-group_replication_exit_state_action = READ_ONLYloose-group_replication_flow_control_mode = "DISABLED"loose-group_replication_single_primary_mode = ONloose-group_replication_communication_max_message_size = 10Mloose-group_replication_transaction_size_limit = 3Gloose-group_replication_arbitrator = 0loose-group_replication_single_primary_fast_mode = 0loose-group_replication_request_time_threshold = 20000report_host = "172.17.139.77"
复制代码


MGR01 节点配置如下:


[mysqld@mgr01]datadir=/data/GreatSQL/mgr01socket=/data/GreatSQL/mgr01/mysql.sockport=3306server_id=103306log-error=/data/GreatSQL/mgr01/error.logloose-group_replication_local_address= "172.17.139.77:33061"
复制代码


MGR02 节点配置如下:


[mysqld@mgr02]datadir=/data/GreatSQL/mgr02socket=/data/GreatSQL/mgr02/mysql.sockport=3307server_id=103317log-error=/data/GreatSQL/mgr02/error.logloose-group_replication_local_address= "172.17.139.77:33071"
复制代码


启动 MGR01 实例、MGR02 实例,并修改密码


#启动两个实例$ systemctl restart greatsql@mgr01 &$ systemctl restart greatsql@mgr02 &#获取初始化密码$ grep root /data/GreatSQL/mgr01/error.log$ grep root /data/GreatSQL/mgr02/error.log#登录数据库并修改密码$ mysql -S /data/GreatSQL/mgr01/mysql.sock -uroot -pgreatsql> alter user root@'localhost' identified by 'GreatSQL@666';$ mysql -S /data/GreatSQL/mgr02/mysql.sock -uroot -pgreatsql> alter user root@'localhost' identified by 'GreatSQL@666';
复制代码


检查两个实例是否正确加载group_replicaiton 插件


greatsql> show plugins;+----------------------------------+----------+--------------------+----------------------+---------+| Name                             | Status   | Type               | Library              | License |+----------------------------------+----------+--------------------+----------------------+---------+| group_replication                | ACTIVE   | GROUP REPLICATION  | group_replication.so | GPL     |+----------------------------------+----------+--------------------+----------------------+---------+
复制代码


没有加载的话可以手动加载这个 plugin


greatsql> install plugin group_replication soname 'group_replication.so';
复制代码

搭建 MGR

接下来就可以手工搭建 MGR,流程如下可参考安装部署MGR集群 | 深入浅出MGR(https://gitee.com/GreatSQL/GreatSQL-Doc/blob/master/deep-dive-mgr/deep-dive-mgr-03.md))


MGR01 实例操作:


greatsql> set session sql_log_bin=0;# 特别注意下面因为8.0.4版本开始使用的默认是“caching_sha2_password”,所以这样创建会采用最新的身份认证插件greatsql> create user repl@'%' identified by 'GreatSQL@666';greatsql> GRANT BACKUP_ADMIN, REPLICATION SLAVE ON *.* TO `repl`@`%`;greatsql> set session sql_log_bin=1;greatsql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='GreatSQL@666' FOR CHANNEL 'group_replication_recovery';
复制代码


接下来即可启动 MGR 集群:


greatsql> set global group_replication_bootstrap_group=ON;
greatsql> start group_replication;
greatsql> select * from performance_schema.replication_group_members\G*************************** 1. row *************************** CHANNEL_NAME: group_replication_applier MEMBER_ID: 2920447e-35bf-11ee-89a5-00163e566da1 MEMBER_HOST: 172.17.139.77 MEMBER_PORT: 3306 MEMBER_STATE: ONLINE MEMBER_ROLE: PRIMARY MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom
复制代码


MGR02 实例操作:


greatsql> set session sql_log_bin=0;greatsql> create user repl@'%' identified by 'GreatSQL@666';greatsql> GRANT BACKUP_ADMIN, REPLICATION SLAVE ON *.* TO `repl`@`%`;greatsql> set session sql_log_bin=1;greatsql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='GreatSQL@666' FOR CHANNEL 'group_replication_recovery';greatsql> start group_replication;Query OK, 0 rows affected (5.39 sec)
复制代码


此时创建的用户采用的都是caching_sha2_password身份认证插件


greatsql> SELECT USER,PLUGIN FROM mysql.`user` ;+------------------+-----------------------+| USER             | PLUGIN                |+------------------+-----------------------+| repl             | caching_sha2_password || mysql.infoschema | caching_sha2_password || mysql.session    | caching_sha2_password || mysql.sys        | caching_sha2_password || root             | caching_sha2_password |+------------------+-----------------------+
复制代码


虽然启动 MGR 成功,但是查看下节点状态:


greatsql> select * from performance_schema.replication_group_members\G*************************** 1. row ***************************              CHANNEL_NAME: group_replication_applier                 MEMBER_ID: 2920447e-35bf-11ee-89a5-00163e566da1               MEMBER_HOST: 172.17.139.77               MEMBER_PORT: 3306              MEMBER_STATE: ONLINE               MEMBER_ROLE: PRIMARY            MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom*************************** 2. row ***************************              CHANNEL_NAME: group_replication_applier                 MEMBER_ID: 2a4f068b-35bf-11ee-9504-00163e566da1               MEMBER_HOST: 172.17.139.77               MEMBER_PORT: 3307              MEMBER_STATE: RECOVERING               MEMBER_ROLE: SECONDARY            MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom2 rows in set (0.00 sec)
复制代码


此时节点一直处于RECOVERING状态,查看 mgr02 实例的错误日志如下:


2023-08-08T08:00:47.034870Z 42 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'group_replication_recovery': error connecting to master 'repl@172.17.139.77:3306' - retry-time: 60 retries: 1 message: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection. Error_code: MY-0020612023-08-08T08:00:47.037631Z 35 [ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'2023-08-08T08:00:47.037671Z 35 [ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
复制代码


这是由于caching_sha2_password 是 MySQL 8.0.4 引入的一个新的身份验证插件,caching_sha2_password 对密码安全性要求更高,要求用户认证过程中在网络传输的密码是加密的,所以导致的这个问题的出现,caching_sha2_password的介绍可以看社区文章“浅谈 MySQL 新的身份验证插件 caching_sha2_password【微信导入、微信导入】”

解决方式

1、采用旧密码验证插件

旧的身份验证插件mysql_native_passwordmysql_native_password的特点是不需要加密的连接。该插件验证速度特别快,但是不够安全,只需要更改创建用户的语句


create user repl@'%' identified with mysql_native_password by 'GreatSQL@666';
复制代码


旧密码验证插件容易被破解,如果有 GreatSQL 服务要公网上使用,建议还是尽量使用 caching_sha2_password作为认证插件

2、启用 group_replication_recovery_get_public_key

设置 group_replication_recovery_get_public_key=ON 可以确保从节点在连接到主节点时能够获取所需的公钥,从而允许安全连接并成功进行身份验证,避免了连接错误和身份验证问题。


手册中也有明确说明:


18.6.3.1.1 Replication User With The Caching SHA-2 Authentication PluginBy default, users created in MySQL 8 use Section 6.4.1.2, “Caching SHA-2 Pluggable Authentication”. If the replication user you configure for distributed recovery uses the caching SHA-2 authentication plugin, and you are not using SSL for distributed recovery connections, RSA key-pairs are used for password exchange. For more information on RSA key-pairs, see Section 6.3.3, “Creating SSL and RSA Certificates and Keys”.
In this situation, you can either copy the public key of the to the joining member, or configure the donors to provide the public key when requested. The more secure approach is to copy the public key of the replication user account to the joining member. Then you need to configure the group_replication_recovery_public_key_path system variable on the joining member with the path to the public key for the replication user account. rpl_user
The less secure approach is to set group_replication_recovery_get_public_key=ON on donors so that they provide the public key of the replication user account to joining members. There is no way to verify the identity of a server, therefore only set group_replication_recovery_get_public_key=ON when you are sure there is no risk of server identity being compromised, for example by a man-in-the-middle attack.
复制代码


可以看到,当确认环境安全以及没人任何人攻击集群时,如果不配置 ssl,可以最低配置group_replication_recovery_get_public_key=ON来在请求复制用户密钥时给公钥

3、为组复制通道启用 SSL 支持

以下操作方法仅使用于 GreatSQL/MySQL 8.0.27 版本及以上


更安全的方法是将 repl 用户所需的公钥文件复制到 joiner 节点的 Server 所在主机中。然后,在 joiner 节点的 Server 中配置group_replication_recovery_public_key_path系统变量,指定 rpl_user 用户所需的公钥文件路径。


使用caching_sha2_password 插件身份验证会在数据目录下生成如下两个 RSA 文件:


private_key.pempublic_key.pem
复制代码


  • private_key.pem:RSA 私钥

  • public_key.pem: RSA 公钥


对于 MGR ,如果设置 group_replication_ssl_mode=DISABLED 必须使用下面的变量来指定 RSA 公钥,否则报错:


  • group_replication_recovery_get_public_key :向服务端请求 RSA 公钥;

  • group_replication_recovery_public_key_path :指定本地 RSA 公钥文件;


指定本地 RSA 公钥,首先需要全局 MGR 配置开启 SSL


[mysqld]#开启use_ssl,指定组成员之间的组复制分布式恢复连接是否应使用 SSLloose-group_replication_recovery_use_ssl=ON
复制代码


进入 MGR01 实例配置


greatsql> set session sql_log_bin=0;# 此时就可以使用“caching_sha2_password”身份认证插件greatsql> create user repl@'%' identified by 'GreatSQL@666';greatsql> GRANT BACKUP_ADMIN, REPLICATION SLAVE ON *.* TO `repl`@`%`;greatsql> set session sql_log_bin=1;greatsql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='GreatSQL@666' FOR CHANNEL 'group_replication_recovery';
复制代码


启动 MGR01 实例的 MGR 集群


greatsql> set global group_replication_bootstrap_group=ON;greatsql> start group_replication;greatsql> select * from performance_schema.replication_group_members\G*************************** 1. row ***************************              CHANNEL_NAME: group_replication_applier                 MEMBER_ID: 35b653d2-3658-11ee-93c9-00163e566da1               MEMBER_HOST: 172.17.139.77               MEMBER_PORT: 3306              MEMBER_STATE: ONLINE               MEMBER_ROLE: PRIMARY            MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom
复制代码


启动成功后,需要把 MGR01 节点的RSA公钥拷贝到 MGR02 节点上,因为 MGR02 也会生成此公钥,所以最好创建一个文件夹


$ mkdir mgr01_key$ chown mysql:mysql mgr01_key/# 将public_key.pem移动到MGR02$ mv /data/GreatSQL/mgr01/public_key.pem /data/GreatSQL/mgr02/mgr01_key/
复制代码


当然,如果有多个节点,也需要把主节点的 RSA 公钥移动到各个节点上


MGR02 节点操作


greatsql> set session sql_log_bin=0;greatsql> create user repl@'%' identified by 'GreatSQL@666';greatsql> GRANT BACKUP_ADMIN, REPLICATION SLAVE ON *.* TO `repl`@`%`;greatsql> set session sql_log_bin=1;greatsql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='GreatSQL@666' FOR CHANNEL 'group_replication_recovery';
# 此命令设置完成后,最好写进my.cnf文件中持久化greatsql> set global group_replication_recovery_public_key_path = "/data/GreatSQL/mgr02/mgr01key/public_key.pem";
greatsql> start group_replication;greatsql> select * from performance_schema.replication_group_members\G*************************** 1. row *************************** CHANNEL_NAME: group_replication_applier MEMBER_ID: 35b653d2-3658-11ee-93c9-00163e566da1 MEMBER_HOST: 172.17.139.77 MEMBER_PORT: 3306 MEMBER_STATE: ONLINE MEMBER_ROLE: PRIMARY MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom*************************** 2. row *************************** CHANNEL_NAME: group_replication_applier MEMBER_ID: aa031fb9-365a-11ee-9925-00163e566da1 MEMBER_HOST: 172.17.139.77 MEMBER_PORT: 3307 MEMBER_STATE: ONLINE MEMBER_ROLE: SECONDARY MEMBER_VERSION: 8.0.32MEMBER_COMMUNICATION_STACK: XCom
复制代码


可以看到双节点ONLINE,新加入的节点不会一直是RECOVERING状态

总结

新身份验证插件caching_sha2_password安全度相比其他的身份验证插件,既解决安全性问题又解决性能问题,建议使用新密码验证插件。


也感谢社区用户指出 GreatSQL 社区文档中的不足,并给予用户金币奖励,同时欢迎大家来 GreatSQL 社区捉虫~


发布于: 刚刚阅读数: 2
用户头像

GreatSQL

关注

GreatSQL社区 2023-01-31 加入

GreatSQL是由万里数据库维护的MySQL分支,专注于提升MGR可靠性及性能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版本。 社区:https://greatsql.cn/ Gitee: https://gitee.com/GreatSQL/GreatSQL

评论

发布
暂无评论
MGR新节点RECOVERING状态的分析与解决:caching_sha2_password验证插件的影响_greatsql_GreatSQL_InfoQ写作社区