
Notes on backing up to local storage with the br tool

  • 2023-09-15
    Beijing

Author: hellogitee. Original source: https://tidb.net/blog/83544639

Background

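Recently the business asked for protection against a datacenter-level failure: we want to build a standby TiDB cluster in a remote datacenter so that we can switch over at the datacenter level at any time. The natural choice for this is TiCDC replication, and the first step is to take a backup with the br tool before starting the sync.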

Official documentation and FAQ

Choosing backup storage


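The official documentation recommends S3 or NFS. With local storage, br saves each TiKV node's data to a local directory on that node, so before restoring you have to merge the backup data from all TiKV nodes into one place. That is a hassle, so the official docs do not recommend it.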


But we don't have that kind of setup. Merging is a hassle, but it is still a workable path.
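To make "merge them together" a bit more concrete, here is a minimal sketch, not an official procedure. It assumes each TiKV node's backup directory has already been copied (for example with scp or rsync) into its own staging directory on one host. The function name merge_backup_dirs and the staging paths are hypothetical. It also assumes that files from different nodes do not share names, and it stops instead of overwriting if they do.

```shell
# merge_backup_dirs DEST SRC...
# Copy the top-level entries of every SRC directory into DEST.
# Stops with an error instead of overwriting when a name already exists in DEST.
merge_backup_dirs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for src in "$@"; do
        for f in "$src"/*; do
            [ -e "$f" ] || continue      # empty source directory: nothing to copy
            name=$(basename "$f")
            if [ -e "$dest/$name" ]; then
                echo "conflict: $name already exists in $dest" >&2
                return 1
            fi
            cp -Rp "$f" "$dest/" || return 1
        done
    done
}

# Hypothetical usage, with one staging directory per TiKV node:
# merge_backup_dirs /data1/restore /data1/staging/dbpnew129v \
#     /data1/staging/dbpnew130v /data1/staging/dbpnew131v
```

Refusing to overwrite is a deliberate choice here: a silent overwrite during the merge would only show up later as a broken restore.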


https://docs.pingcap.com/zh/tidb/dev/br-use-overview# 如何管理备份数据


Backup user permissions and caveats


According to the FAQ, the backup directory must be readable and writable, and if br and TiKV run on different machines, the users must have the same UID.


The permission requirement makes sense, but why must the UIDs be identical as well?


https://docs.pingcap.com/zh/tidb/stable/backup-and-restore-faq# 遇到 -permission-denied- 或者 -no-such-file-or-directory- 错误即使用 -root- 运行 -br- 命令行工具也无法解决该如何处理


The test steps follow.

Test steps

Environment setup

Three test machines are used:


dbpnew129v    10.10.10.1
dbpnew130v    10.10.10.2
dbpnew131v    10.10.10.3


Check the uid of the backup user on the three machines (why the kibana user? Because I am also testing ES on them...)


[kibana@dbpnew129v backup]$ id
uid=49480(kibana) gid=49479(kibana) groups=49479(kibana)
[kibana@dbpnew130v ~]$ id
uid=49479(kibana) gid=49479(kibana) groups=49479(kibana)
[kibana@dbpnew131v ~]$ id
uid=49478(kibana) gid=49479(kibana) groups=49479(kibana)

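With more than a handful of nodes, comparing the id output by eye gets tedious. A small sketch of the comparison is below. The helper name all_same is made up for illustration, and the commented ssh loop assumes passwordless ssh to each host, which is not part of this test setup.

```shell
# all_same VALUE...
# Succeeds only when every argument equals the first one.
all_same() {
    first=$1
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
    return 0
}

# Hypothetical way to collect the numeric UIDs (assumes passwordless ssh):
# uids=$(for h in 10.10.10.1 10.10.10.2 10.10.10.3; do ssh "$h" id -u kibana; done)
# if all_same $uids; then echo "UIDs match"; else echo "UIDs differ"; fi
```

Fed the three values shown above (49480, 49479, 49478), the check fails, which confirms that this test really does run with mismatched UIDs.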


TiDB version under test



[kibana@dbpnew129v backup]$ tiup cluster display test2
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/kibana/.tiup/components/cluster/v1.13.0/tiup-cluster display test2
Cluster type:       tidb
Cluster name:       test2
Cluster version:    v6.5.2
Deploy user:        kibana
SSH type:           builtin
Dashboard URL:      http://10.10.10.1:2379/dashboard
Grafana URL:        http://10.10.10.1:3000
ID                Role          Host        Ports        OS/Arch       Status   Data Dir                            Deploy Dir
--                ----          ----        -----        -------       ------   --------                            ----------
10.10.10.1:9093   alertmanager  10.10.10.1  9093/9094    linux/x86_64  Up       /data1/tidb-data/alertmanager-9093  /data1/tidb-deploy/alertmanager-9093
10.10.10.1:3000   grafana       10.10.10.1  3000         linux/x86_64  Up       -                                   /data1/tidb-deploy/grafana-3000
10.10.10.2:2379   pd            10.10.10.2  2379/2380    linux/x86_64  Up       /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.1:2379   pd            10.10.10.1  2379/2380    linux/x86_64  Up|L|UI  /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.3:2379   pd            10.10.10.3  2379/2380    linux/x86_64  Up       /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
10.10.10.1:9090   prometheus    10.10.10.1  9090/12020   linux/x86_64  Up       /data1/tidb-data/prometheus-9090    /data1/tidb-deploy/prometheus-9090
10.10.10.2:4000   tidb          10.10.10.2  4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.1:4000   tidb          10.10.10.1  4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.3:4000   tidb          10.10.10.3  4000/10080   linux/x86_64  Up       -                                   /data1/tidb-deploy/tidb-4000
10.10.10.2:20160  tikv          10.10.10.2  20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
10.10.10.1:20160  tikv          10.10.10.1  20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
10.10.10.3:20160  tikv          10.10.10.3  20160/20180  linux/x86_64  Up       /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160


Starting the backup


[kibana@dbpnew129v data1]$ tiup br backup full --pd 10.10.10.2:2379 --storage "local:///data1/backup"


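/data1 has 777 permissions, but the /data1/backup subdirectory given in the command had not been created beforehand, so the backup spat out a wall of error messages. A full screen of pain...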


## Partial log excerpt
[2023/09/11 10:55:24.686 +08:00] [INFO] [collector.go:77] ["Full Backup failed summary"] [total-ranges=80] [ranges-succeed=0] [ranges-failed=80] [backup-total-ranges=80] [backup-total-regions=82] [unit-name="range start:7480000000000000485f720000000000000000 end:7480000000000000485f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\
t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000185f720000000000000000 end:7480000000000000185f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:748000fffffffffffd5f720000000000000000 end:748000fffffffffffd5f72ffffffffffffffff00"] 
[error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000205f720000000000000000 end:7480000000000000205f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:74800000000000002e5f69800000000000000300 end:74800000000000002e5f698000000000000003fb"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000345f720000000000000000 end:7480000000000000345f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000365f720000000000000000 end:7480000000000000365f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000105f69800000000000000100 end:7480000000000000105f698000000000000001fb"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1189\ngithub.com/pingcap/tidb/br/pkg/conn/util.GetAllTiKVStores\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/util/util.go:39\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:83\ngithub.com/pingcap/tidb/br/pkg/utils.WithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/retry.go:56\ngithub.com/pingcap/tidb/br/pkg/conn.GetAllTiKVStoresWithRetry\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/conn/conn.go:80\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:893\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"] [unit-name="range start:7480000000000000165f720000000000000000 end:7480000000000000165f72ffffffffffffffff00"] [error="rpc error: code = Canceled desc = context canceled"] [errorVerbose="rpc error: code = Canceled desc = context 
canceled\ngithub.com/tikv/pd/client.(*client).respForErr\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1582\ngithub.com/tikv/pd/client.(*client).GetAllStores\n\t/go/pkg/mod/github.com/tikv/pd/client@v0.0.0-20230724080549-de985b8e0afc/client.go:1
Error: error happen in store 5 at 10.10.10.2:20160: File or directory not found on TiKV Node (store id: 5; Address: 10.10.10.2:20160). work around:please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid.: [BR:KV:ErrKVStorage]tikv storage occur I/O error


The last line of output says that a file or directory does not exist on a TiKV node.


Next, look at the backup log that br writes under /tmp (the br.log.* file):


[2023/09/11 10:55:24.680 +08:00] [ERROR] [push.go:206] [range-sn=0] [error="[BR:KV:ErrKVStorage]tikv storage occur I/O error: File or directory not found on TiKV Node (store id: 5; Address: 10.10.10.2:20160). work around:please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid."] [stack="github.com/pingcap/tidb/br/pkg/backup.(*pushDown).pushBackup\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/push.go:206\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRange\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:938\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).BackupRanges.func2\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/backup/client.go:852\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:76\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.3.0/errgroup/errgroup.go:75"]



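The message says that br and the TiKV nodes must share the same storage, and that the user running br and the user running the TiKV nodes must have the same uid.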
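Since the useful part of the message is buried under stack traces, a quick way to see which stores are complaining is to pull only the "store id ... Address ..." fragments out of the log. This is a rough sketch: the function name failed_stores is invented here, and the pattern is based solely on the message text shown above, so it may need adjusting for other br versions.

```shell
# failed_stores LOGFILE
# Print each distinct "store id: N; Address: host:port" fragment found in LOGFILE.
failed_stores() {
    grep -oE 'store id: [0-9]+; Address: [^)]+' "$1" | sort -u
}

# Hypothetical usage against a br log file under /tmp:
# failed_stores /tmp/br.log.<timestamp>
```

On a log holding the error above, this should print a single line naming store 5 at 10.10.10.2:20160, which tells you which node's directory to go and look at.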

Solving the problem

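Taken at face value, this error means the only way out is S3 or shared NFS storage. But since it complains about a missing file or directory, what if I create the directory in advance?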


## On all three TiKV nodes, create /data1/backup in advance as the br backup user
[kibana@dbpnew131v data1]$ mkdir /data1/backup
## Run the backup again with the br tool
[kibana@dbpnew129v data1]$ tiup br backup full --pd 10.10.10.2:2379 --storage "local:///data1/backup"
tiup is checking updates for component br ...
Starting component `br`: /home/kibana/.tiup/components/br/v7.3.0/br backup full --pd 10.10.10.2:2379 --storage local:///data1/backup
Detail BR log in /tmp/br.log.2023-09-11T11.40.17+0800
Full Backup <------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2023/09/11 11:40:24.602 +08:00] [INFO] [collector.go:77] ["Full Backup success summary"] [total-ranges=27] [ranges-succeed=27] [ranges-failed=0] [backup-checksum=569.677625ms] [backup-fast-checksum=9.318469ms] [backup-total-ranges=80] [backup-total-regions=82] [total-take=6.64338341s] [total-kv-size=86.32MB] [average-speed=12.99MB/s] [backup-data-size(after-compressed)=5.027MB] [Size=5026872] [BackupTS=444177742312505345] [total-kv=2098905]
[kibana@dbpnew129v data1]$


It actually worked!!
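The success summary carries a BackupTS field. Since the plan is to start TiCDC after the backup, that timestamp is presumably what the later sync would start from; that follow-up step is my assumption and is not shown in this article. A small sketch for pulling the value out is below. It assumes the summary line has been saved to a file (for example by redirecting br's output), and the function name backup_ts is made up.

```shell
# backup_ts FILE
# Print the number from the last BackupTS=<number> field found in FILE.
backup_ts() {
    grep -oE 'BackupTS=[0-9]+' "$1" | tail -n 1 | cut -d= -f2
}

# Hypothetical usage, with br's output previously saved to backup.out:
# ts=$(backup_ts backup.out)
# echo "backup snapshot timestamp: $ts"
```

Taking the last match rather than the first keeps the result sensible if the same file has collected the output of several runs.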

Summary

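  • When backing up with br to local storage, it is not enough to make sure the backup directory is readable and writable by the user that runs each TiKV node; the directory named in the backup command must also actually exist (on the br node, br creates a backup directory with 777 permissions by itself);

  • The hint in the backup log is misleading: "please ensure br and tikv nodes share a same storage and the user of br and tikv has same uid" does not match what actually happened. The real cause was simply that the directory named in the backup command had not been created;

  • The FAQ requirement for local-disk backups, "if br and TiKV are on different machines, the users must have the same UID", is not mandatory: my uids differed and the backup still succeeded;

  • Later tests showed that even when the user running tikv and the user running br are different, the backup succeeds as long as the directory exists and is readable and writable;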


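In one sentence: make sure the directory named in the backup command actually exists and is readable and writable by tikv (checking only the parent directory is not enough, because on the TiKV nodes the backup does not create it for us). Then whichever user you use, and whether or not the uids match, the backup will succeed!!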
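That takeaway can be turned into a small pre-flight check to run before the backup. The sketch below is only an illustration: the function names storage_path and check_backup_dir are invented, and the check is meant to be run on each TiKV node as the account that runs tikv, since it only tests the current user's access.

```shell
# storage_path URL
# Strip the local:// scheme from a --storage value and print the remaining path.
storage_path() {
    case $1 in
        local://*) printf '%s\n' "${1#local://}" ;;
        *) echo "not a local:// storage URL: $1" >&2; return 1 ;;
    esac
}

# check_backup_dir DIR
# Succeeds only if DIR exists as a directory and the current user can write to it.
check_backup_dir() {
    if [ ! -d "$1" ]; then
        echo "missing directory: $1" >&2
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "directory not writable: $1" >&2
        return 1
    fi
}

# Hypothetical usage on a TiKV node:
# check_backup_dir "$(storage_path local:///data1/backup)" && echo "ready for backup"
```

Checking the exact directory, rather than its parent, is the whole point: in the failed run above /data1 was wide open and the backup still failed.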

