详解如何在数仓中搭建细粒度容灾应用

2024-02-01
广东
本文字数：6687 字
阅读完需：约 22 分钟

本文分享自华为云社区《GaussDB(DWS)细粒度容灾使用介绍》，作者：天蓝蓝。

1. 前言

适用版本：【8.2.1.210 及以上】

当前数仓承载的客户业务越来越多，从而导致客户对于数仓的可靠性要求不断增加。尤其在金融领域，容灾备份机制是信息系统必须提供的能力之一。本文介绍了在云上环境的双集群(不跨 Region 不跨 VPC)后台手动部署并使用细粒度容灾的主要步骤，使得用户能快速方便得搭建起细粒度容灾。

2. 细粒度容灾简介

对于 MPPDB 集群的容灾而言，目前业界的常见方案要么是部署两套规格配置同等的集群，要么通过逻辑双加载方式去实现，这两个方案缺点比较明显，存在架构复杂、建设成本高等问题，不仅使得灾备部署难度增大，还导致资源浪费。在此背景下，GaussDB(DWS)基于列存表实现细粒度容灾能力，既满足核心分析型业务在极端场景的业务连续性要求，同时也能大幅降低容灾方案的建设成本，而且容灾架构轻量化，容灾系统易运维、易操作、易演练，从而帮助用户迅捷、经济、按需构建核心业务的容灾系统。

相比于传统的容灾方案，细粒度容灾有以下优势：

主集群和备集群双活（Active-Active）（备集群除灾备表只读外，其余表可读写）
主备集群只需满足 DN 整数倍关系，降低备集群建设资源，方便灵活部署
利用列存表的数据和元数据分离的特点，通过元数据逻辑同步 + 数据物理同步的方式，同步增量，结合 UDF 的增量抽取/回放接口，保障表级数据一致性
粒度可控，支持表级、schema 级、库级容灾
支持表 DDL，部分 DCL 元数据同步，达到切换可用

细粒度容灾示意图

3. 容灾前准备

3.1 配置互信

前提条件

确保 ssh 服务打开。
确保 ssh 端口不会被防火墙关闭。
确保所有机器节点间网络畅通。

配置互信

依次登录主备集群各节点沙箱外，执行步骤 2-4。

将主集群和备集群所有节点 ip 和 hostname 添加到/home/Ruby/.ssh/authorized_keys 和/var/chroot/home/Ruby/.ssh/authorized_keys 的 from 列表中。

将主集群和备集群所有节点 ip 和 hostname 的映射添加到/etc/hosts 和/var/chroot/etc/hosts 文件中。

遍历主集群和备集群所有节点 ip 和 hostname，执行以下命令。

ssh-keyscan -t rsa -T 120 ${ip/hostname} >> /home/Ruby/.ssh/known_hostsssh-keyscan -t rsa -T 120 ${ip/hostname} >> /var/chroot/home/Ruby/.ssh/known_hosts

复制代码

校验互信

从主集群任一节点能免密 ssh 到备集群任一节点且从备集群任一节点能免密 ssh 到主集群任一节点，即代表互信已配置成功。

3.2 配置容灾配置文件

Ruby 用户分别登录主备集群主节点沙箱内。
创建容灾目录，假设目录为/DWS/data2/finedisaster。mkdir -p /DWS/data2/finedisaster
在/DWS/data2/finedisaster 目录下，创建容灾配置文件 backupRestore.ini 和主备集群倒换配置文件 sw_backupRestore.ini。

注意：

细粒度容灾的容灾备份过程支持与物理细粒度备份业务并行执行。可通过以下措施隔离：

两个备份任务指定不同的 metadata 目录和 media 目录，实现备份数据隔离。容灾备份目录通过容灾配置文件(backupRestore.ini 和 sw_backupRestore.ini)中的参数 primary-media-destination 和 primary-metadata-destination 指定。

细粒度容灾的容灾备份端口与物理细粒度备份的 master-port 指定为不同的值，实现备份进程端口的隔离。容灾的备份端口通过容灾配置文件(backupRestore.ini 和 sw_backupRestore.ini)中的参数 backup-port 指定。

容灾配置文件 backupRestore.ini

以下是 backupRestore.ini 文件示例，可根据具体业务场景修改配置参数值。

# Configuration file for SyncDataToStby tool# The backup life cyclelife-cycle=10m# The table that records backups infobackuprestoreinfo-file=backuprestoreinfo.csv# The cluster user name for clusterusername=Ruby# The primary cluster env setprimary-env=# The standby cluster env setstandby-env=# Time interval between each full backup, uint: minfull-backup-exec-time-interval=N/A# Time interval between each backupbackup-exec-time-interval=2# One of the backup hostsprimary-host-ip=XXX.XXX.XXX.XX# The media type that restore backup files, DISKmedia-type=disk# The number of process that should be used. Range is (1~32)parrallel-process=4# The compression level that should be used for backup. Range(0~9)compression-level=1compression-type=2# Media destination where the backup must be storedprimary-media-destination=/DWS/data2/finedisaster/mediadata# Metadata destination where the metadata must be storedprimary-metadata-destination=/DWS/data2/finedisaster/metadata# The master-port in which the backup must be executedbackup-port=XXXX# Logging level for the log contents of backup:FATAL,ERROR,INFO,DEBUGprimary-cluster-logging-level=INFO# Time interval between each restore, uint: minrestore-interval=2# One of the restore hostsrestore-host-ip=XXX.XXX.XXX.XX# Media destination where the backup contents must be stored in the standby clusterrestore-media-destination=/DWS/data2/finedisaster/mediadata# Metadata destination where the backup contents must be stored in the standby clusterrestore-metadata-destination=/DWS/data2/finedisaster/metadata# The master-port in which the restore must be executedrestore-port=XXXX# Logging level for the log contents of restorestandby-cluster-logging-level=INFO# The maximum number of log files that should be created. Range is (5~1024).log-file-count=50# The retry times of checking cluster balance for resuem backup or double clusterscheck-balance-retry-times=0# cluster idprimary-cluster-id=11111111-1111-1111-1111-111111111111standby-cluster-id=22222222-2222-2222-2222-222222222222# Processes tables for which fine-grained disaster recovery is no longer performed# Value should be 'none', 'log-desync-table', 'drop-desync-table'desync-table-operation=drop-desync-table# Number of CPU cores that can be used by each DR process. Range is (1~1024).cpu-cores=8# The max number of rch files that can be reserved on master cluster before sent to standby cluster. 0 means no limit.local-reserve-file-count=160

复制代码

主备集群倒换配置文件 sw_backupRestore.ini 相比于 backupRestore.ini 只需颠倒下列参数

primary-host-ip / restore-host-ip
backup-port / restore-port
primary-media-destination / restore-media-destination
primary-metadata-destination / restore-metadata-destination
primary-cluster-id / standby-cluster-id

配置文件主要参数说明

3.3 主备集群准备

Ruby 用户登录主集群主节点沙箱内。

设置集群 GUC 参数。

local-dn-num 为本集群 DN 数，remote-dn-num 为对方集群 DN 数，可替换为实际值。

python3 $GPHOME/script/DisasterFineGrained.py -t prepare --local-dn-num 6 --remote-dn-num 3 --config-file /DWS/data2/finedisaster/backupRestore.ini

复制代码

3.4 容灾表数据准备

主集群连接数据库，准备容灾表数据。

新建容灾表

create table schema1.table_1(id int, name text) with (orientation = column, colversion="2.0", enable_disaster_cstore="on", enable_delta=false) DISTRIBUTE BY hash(id);

复制代码

存量表(非容灾表)转为容灾表

alter table schema1.table_1 set (enable_disaster_cstore='on');--查询表的分布方式select pclocatortype from pg_catalog.pgxc_class where pcrelid = 'schema1.table_1'::regclass::oid limit 1;--若表的分布方式为H（HASH分布），获取一个hash表的分布列，后续以'id'列分布为例select pg_catalog.getdistributekey('schema1.table_1');--对表做重分布alter table schema1.table_1 DISTRIBUTE BY HASH(id);--若表的分布方式为N(ROUNDROBIN分布)，对表做重分布alter table schema1.table_1 DISTRIBUTE BY ROUNDROBIN;--若表的分布方式为R(REPLICATION分布)，对表做重分布alter table schema1.table_1 DISTRIBUTE BY REPLICATION;

复制代码

4. 细粒度容灾操作

4.1 定义发布

Ruby 用户登录主集群主节点沙箱，以下操作均在沙箱内进行。

连接数据库，通过发布语法指定需要容灾的主表，语法如下：

--发布所有表CREATE PUBLICATION _pub_for_fine_dr FOR ALL TABLES;--发布schemaCREATE PUBLICATION _pub_for_fine_dr FOR ALL TABLES IN SCHEMA schema1, schema2;--发布表和schemaCREATE PUBLICATION _pub_for_fine_dr FOR ALL TABLES IN SCHEMA schema1, TABLE schema2.table_1;--增加一个发布表ALTER PUBLICATION _pub_for_fine_dr ADD TABLE schema1.table_1;--增加发布SCHEMAALTER PUBLICATION _pub_for_fine_dr ADD ALL TABLES IN SCHEMA schema2;

复制代码

定义发布后，将容灾 publication 放到参数文件中，以 dbname.pubname 形式，例如：

echo 'dbname._pub_for_fine_dr' > /DWS/data2/finedisaster/pub.list

复制代码

若需要解除发布，通过 DisasterFineGrained.py 脚本下发 cancel-publication 命令

# 1、创建需要取消的容灾对象列表文件# 解除发布表echo 'db_name.schema_name.table_name' > /DWS/data2/finedisaster/config/disaster_object_list.txt# 解除发布SCHEMAecho 'db_name.schema_name' > /DWS/data2/finedisaster/config/disaster_object_list.txt
# 2、下发cancel-publication命令python3 $GPHOME/script/DisasterFineGrained.py -t cancel-publication --disaster-object-list-file /DWS/data2/finedisaster/config/disaster_object_list.txt --config-file /DWS/data2/finedisaster/backupRestore.ini
# 3、取消全部发布python3 $GPHOME/script/DisasterFineGrained.py -t cancel-publication --config-file /DWS/data2/finedisaster/backupRestore.ini --all-cancel

复制代码

4.2 启动容灾

注意：

云上集群沙箱内无法启动 crontab 任务，需要在主备集群沙箱外手动添加定时备份恢复任务

HCS 8.3.0 及以上环境，沙箱外 crontab 设置 ssh 到沙箱内的定时任务，为了防止沙箱逃逸，ssh 前需要加上"sudo python3 /rds/datastore/dws/XXXXXX/sudo_lib/checkBashrcFile.py && source /etc/profile && source ~/.bashrc && "，否则定时任务不生效。checkBashrcFile.py 文件路径与版本号有关。

主集群主节点沙箱外设置定时任务，crontab -e。注意替换主节点 IP。

*/1 * * * * nohup ssh XXX.XXX.XXX.XXX "source /etc/profile;if [ -f ~/.profile ];then source ~/.profile;fi;source ~/.bashrc;nohup python3 /opt/dws/tools/script/SyncDataToStby.py -t backup  --config-file /DWS/data2/finedisaster/backupRestore.ini --disaster-fine-grained --publication-list /DWS/data2/finedisaster/pub.list  >>/dev/null 2>&1 &"  >>/dev/null 2>&1 &

复制代码

备集群主节点沙箱外设置定时任务，crontab -e。注意替换主节点 IP。

*/1 * * * * nohup ssh XXX.XXX.XXX.XXX "source /etc/profile;if [ -f ~/.profile ];then source ~/.profile;fi;source ~/.bashrc;nohup python3 /opt/dws/tools/script/SyncDataToStby.py -t restore  --config-file /DWS/data2/finedisaster/backupRestore.ini  --disaster-fine-grained  >>/dev/null 2>&1 &" >>/dev/null 2>&1 &

复制代码

成功启动备份确认。

Ruby 用户分别登录主备集群主节点沙箱，查询 SyncDataToStby.py 进程是否存在。

ps ux | grep SyncDataToStby | grep -v grep

复制代码

Ruby 用户登录主/备集群主节点沙箱，使用 roach 的 show-progress 监控工具，查看备份/恢复进度。show-progress 命令提供主备集群备份恢复进度等信息，显示结果为 json 格式，各字段含义请参考产品文档。

python3 $GPHOME/script/SyncDataToStby.py -t show-progress --config-file /DWS/data2/finedisaster/backupRestore.ini

复制代码

系统回显：

{  "primary cluster": {    "key": "20231109_212030",    "priorKey": "20231109_211754",    "actionType": "Backup",    "progress": "100.00%",    "backupRate": {      "producerRate": "0MB/s",      "compressRate": "0MB/s",      "consumerRate": "0MB/s"    },    "currentStep": "FINISH",    "unrestoreKeys": "N/A",    "failedStep": "INIT",    "errorMsg": "",    "errorCode": "",    "actionStartTime": "2023-11-09 21:20:28",    "actionEndTime": "2023-11-09 21:20:49",    "updateTime": "2023-11-09 21:20:50"  },  "standby cluster": {    "key": "20231109_175002",    "priorKey": "N/A",    "actionType": "Restore",    "progress": "100.00%",    "backupRate": {      "producerRate": "0MB/s",      "compressRate": "0MB/s",      "consumerRate": "0MB/s"    },    "currentStep": "FINISH",    "unrestoreKeys": "20231109_211754,20231109_212030",    "failedStep": "INIT",    "errorMsg": "",    "errorCode": "",    "actionStartTime": "2023-11-09 17:53:07",    "actionEndTime": "2023-11-09 17:53:15",    "updateTime": "2023-11-09 17:53:15"  },  "apply": {    "backupState": "waiting",    "restoreState": "waiting",    "backupSuccessTime": "2023-11-09 21:21:24",    "restoreSuccessTime": "2023-11-09 17:53:55"  },  "latestBarrierTime": "",  "recovery point objective": "3:32:17",  "failover recovery point time": ""}

复制代码

show-progress 命令显示的主要字段释义如下

priorKey：该备份集是基于这个 backup key 生成的。
actionType：备份集当前的操作类型。
取值包括如下：Backup，表示备份阶段。Restore，表示恢复阶段。
progress：备份或恢复操作的进度。
currentStep：备份或恢复正在执行的步骤。
unrestoreKeys：待恢复的 key 列表。
failedStep：备份或恢复失败的步骤，初始值默认为 INIT。
errorMsg：备份或恢复失败的错误信息，如果成功，该字段则显示成功的信息。
errorCode：备份或恢复的错误码，该字段为预留字段，暂未使用。
actionStartTime：当前操作的开始时间。
actionEndTime：当前操作的结束时间。
updateTime：当前操作进度的刷新时间。
backupState：当前备份状态（backuping/stopped/waiting/abnormal）。
restoreState：当前恢复状态（restoring/stopped/waiting/abnormal）。
backupSuccessTime：上次成功备份结束的时间。
restoreSuccessTime：上次成功恢复结束的时间。
latestBarrierTime：上次主备集群一致点时间。
recovery point objective：rpo 时间（当前主集群时间到最后一个恢复成功的备份集备份开始时间）。
failover recovery point time：failover 未同步时间点。

4.3 结果验证

4.3.1 功能验证

同步前后进行数据一致性校验，主表、备表数据进行 checksum 校验，检查是否同步正确（需排除由于接入业务导致的差异）。

4.3.2 性能验证

通过 show-progress 监控工具可以看到最近一次备份、恢复耗时。

5. 解除容灾

5.1 停止容灾

# 主备集群取消沙箱外的定时容灾任务，在备份/恢复任务前添加注释符“#”，取消定时任务crontab -e# 主集群Ruby用户登录主节点沙箱，停止备份python3 $GPHOME/script/SyncDataToStby.py -t stop-backup --config-file /DWS/data2/finedisaster/backupRestore.ini --disaster-fine-grained# 备集群Ruby用户登录主节点沙箱，停止恢复python3 $GPHOME/script/SyncDataToStby.py -t stop-restore --config-file /DWS/data2/finedisaster/backupRestore.ini --disaster-fine-grained

复制代码

5.2 删除容灾

警告

当不再需要容灾任务的时候，可以解除主备关系，恢复备集群的读写能力。删除容灾前需要先停止容灾。

Ruby 用户登录主集群主节点的沙箱内。

python3 $GPHOME/script/SyncDataToStby.py -t set-independent --config-file /DWS/data2/finedisaster/backupRestore.ini --disaster-fine-grained

复制代码

系统回显：

Delete csv file.Delete roachbackup file from XXX.XXX.XXX.XXDelete roachbackup file from XXX.XXX.XXX.XXClear cluster disaster state.

复制代码

6. 总结

本文介绍了在云上环境的双集群(不跨 Region 不跨 VPC)后台手动部署并使用细粒度容灾的主要步骤，分为容灾前准备、细粒度容灾操作和解除容灾，使得用户能快速方便得搭建起细粒度容灾。

点击关注，第一时间了解华为云新鲜技术~

发布于: 刚刚阅读数: 4

原文链接:【http://xie.infoq.cn/article/c59c5f9ba24d65be21cfc2a92】。文章转载请联系作者。

华为云开发者联盟

关注

提供全面深入的云计算技术干货 2020-07-14 加入

生于云，长于云，让开发者成为决定性力量

发布

暂无评论

创作场景

详解如何在数仓中搭建细粒度容灾应用

1. 前言

2. 细粒度容灾简介

3. 容灾前准备

3.1 配置互信

3.2 配置容灾配置文件

3.3 主备集群准备

3.4 容灾表数据准备

4. 细粒度容灾操作

4.1 定义发布

4.2 启动容灾

4.3 结果验证

4.3.1 功能验证

4.3.2 性能验证

5. 解除容灾

5.1 停止容灾

5.2 删除容灾

6. 总结

华为云开发者联盟

评论