
Data Migration High-Availability Drill

  • July 11, 2022

Author: Haaahei. Original source: https://tidb.net/blog/3898f3dd


To make sure DM can run stably in production, we planned a drill of its high-availability mechanisms. The drill covers the following items:


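| Item | What to verify | Steps | Conclusion |
| --- | --- | --- | --- |
| dm-worker HA | A dm-worker crashes: whether the sync task is transferred to another worker; how the sync task behaves (latency, status, etc.); whether the crashed dm-worker starts automatically and rejoins the cluster once it is brought back | See below | See below |
| dm-master HA | The dm-master leader crashes: whether a new leader is elected normally; how the sync tasks behave during the election (latency, status, etc.); whether dm-master starts automatically and rejoins the cluster once its host is brought back | See below | See below |
| Rolling upgrade | Upgrade DM to v2.0.6: whether a leader is elected normally; how the sync tasks behave | See below | See below |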

Steps and conclusions

dm-worker HA

  1. Simulate a dm-worker crash


```
date; kill -9 pid; mv <deploy dir> <deploy dir>-1  # force-kill the dm-worker process, then rename the deployment directory so that it cannot be restarted automatically
```



2. Observe how the task is switched over
3. Record the relevant data: switchover time, task status, latency
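
The task status mentioned in step 3 can be captured from the output of query-status. The snippet below is a minimal sketch, not the procedure used in this drill: `SAMPLE` is a hypothetical, hand-written document shaped like the JSON that dmctl prints for `query-status <task>`, and the field names (`sources`, `sourceStatus.worker`, `subTaskStatus`, `stage`, `sync.synced`) are assumptions that should be checked against the DM version in use. The helper `summarize` is likewise illustrative.

```python
import json
from typing import List, Tuple

# Hypothetical sample shaped like `query-status <task>` output (assumed layout).
# The task, source and worker names are the ones that appear in this drill.
SAMPLE = """
{
  "result": true,
  "msg": "",
  "sources": [
    {
      "result": true,
      "msg": "",
      "sourceStatus": {"source": "ds-mysql_report", "worker": "dm-172.18.78.254-8265"},
      "subTaskStatus": [
        {"name": "dm-mysql_report", "stage": "Running", "unit": "Sync", "sync": {"synced": true}}
      ]
    }
  ]
}
"""

def summarize(status_json: str) -> List[Tuple[str, str, str, bool]]:
    """Return one (task, worker, stage, synced) tuple per subtask."""
    doc = json.loads(status_json)
    rows = []
    for source in doc.get("sources", []):
        worker = source.get("sourceStatus", {}).get("worker", "")
        for sub in source.get("subTaskStatus", []):
            synced = bool(sub.get("sync", {}).get("synced", False))
            rows.append((sub.get("name", ""), worker, sub.get("stage", ""), synced))
    return rows

for task, worker, stage, synced in summarize(SAMPLE):
    print(f"task={task} worker={worker} stage={stage} synced={synced}")
```

Running this before and after the kill makes it easy to see which worker owns the task and whether the stage is still Running.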
**Conclusion:**
- Whether the sync task is transferred
```
[2021/08/17 13:28:04.712 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
...
[2021/08/17 13:28:51.576 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:28:54.876 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:28:57.913 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:28:58.159 +08:00] [INFO] [keepalive.go:216] ["receive dm-worker keep alive event"] [operation=DELETE] [kv=/dm-worker/a/646d2d3137322e31372e3230312e3131352d38323632]
[2021/08/17 13:28:58.163 +08:00] [INFO] [scheduler.go:1506] ["receive worker status change event"] [component=scheduler] [delete=true] [event="{\"worker-name\":\"dm-172.17.201.115-8262\",\"join-time\":\"0001-01-01T00:00:00Z\"}"]
[2021/08/17 13:28:58.165 +08:00] [INFO] [scheduler.go:1662] ["unbound the worker for source"] [component=scheduler] [bound="{\"source\":\"ds-mysql_report\",\"worker\":\"dm-172.17.201.115-8262\"}"] [event="{\"worker-name\":\"dm-172.17.201.115-8262\",\"join-time\":\"0001-01-01T00:00:00Z\"}"]
[2021/08/17 13:28:58.165 +08:00] [INFO] [scheduler.go:1838] ["found free worker when source bound"] [component=scheduler] [worker=dm-172.18.78.254-8265] [source=ds-mysql_report]
[2021/08/17 13:28:58.168 +08:00] [INFO] [scheduler.go:1876] ["bound the source to worker"] [component=scheduler] [bound="{\"source\":\"ds-mysql_report\",\"worker\":\"dm-172.18.78.254-8265\"}"]
```


- After roughly 60 s, a new dm-worker successfully took over the sync task, and query-status showed a normal sync state.
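
The switchover time can be estimated from the dm-master log above by subtracting two timestamps. The sketch below uses the first "connection refused" warning shown in the excerpt and the "bound the source to worker" entry. Two caveats: the exact moment of the kill is not in the log, and the excerpt is elided with "...", so the result is only a lower bound on the real gap. The helper names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the dm-master log excerpt.
FIRST_REFUSED = "2021/08/17 13:28:04.712 +08:00"   # first "connection refused" warning shown
SOURCE_REBOUND = "2021/08/17 13:28:58.168 +08:00"  # "bound the source to worker"

def parse_log_time(text: str) -> datetime:
    """Parse a timestamp in the format used at the start of each DM log line."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M:%S.%f %z")

def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps."""
    return (parse_log_time(end) - parse_log_time(start)).total_seconds()

switchover = elapsed_seconds(FIRST_REFUSED, SOURCE_REBOUND)
print(f"switchover took at least {switchover:.3f} s")
```

For this excerpt the difference is about 53.5 s, which is consistent with the "roughly 60 s" observed.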


  • Sync task status


[2021/08/17 13:28:58.168 +08:00] [INFO] [server.go:581] ["receive source bound"] [bound="{\"source\":\"ds-mysql_report\",\"worker\":\"dm-172.18.78.254-8265\"}"] ["is deleted"=false]
[2021/08/17 13:28:58.170 +08:00] [WARN] [task.go:826] ["session variable 'time_zone' is overwritten by default UTC timezone."] [time_zone=+00:00]
[2021/08/17 13:28:58.170 +08:00] [INFO] [server.go:836] ["will start a new worker"] [sourceID=ds-mysql_report]
[2021/08/17 13:28:58.170 +08:00] [INFO] [worker.go:120] [initialized] [component="worker controller"] 
[cfg="{\"enable-gtid\":true,\"auto-fix-gtid\":false,\"relay-dir\":\"relay-dir\",\"meta-dir\":\"\",\"flavor\":\"mysql\",\"charset\":\"\",\"enable-relay\":false,\"relay-binlog-name\":\"\",\"relay-binlog-gtid\":\"\",\"source-id\":\"ds-mysql_report\",\"from\":{\"host\":\"172.16.150.53\",\"port\":15381,\"user\":\"dm_sync\",\"max-allowed-packet\":null,\"session\":{\"time_zone\":\"+00:00\"},\"security\":null},\"purge\":{\"interval\":3600,\"expires\":0,\"remain-space\":15},\"checker\":{\"check-enable\":true,\"backoff-rollback\":{\"Duration\":\"5m0s\"},\"backoff-max\":{\"Duration\":\"5m0s\"}},\"server-id\":429548349,\"case-sensitive\":false,\"filters\":null}"] [2021/08/17 13:28:58.170 +08:00] [INFO] [worker.go:135] ["start running"] [component="worker controller"] [2021/08/17 13:28:58.270 +08:00] [INFO] [worker.go:310] ["enter EnableHandleSubtasks"] [component="worker controller"] [2021/08/17 13:28:58.272 +08:00] [WARN] [task.go:826] ["session variable 'time_zone' is overwritten by default UTC timezone."] [time_zone=+00:00] [2021/08/17 13:28:58.272 +08:00] [WARN] [task.go:826] ["session variable 'time_zone' is overwritten by default UTC timezone."] [time_zone=+00:00] [2021/08/17 13:28:58.273 +08:00] [INFO] [worker.go:326] ["starting to handle mysql source"] [component="worker controller"] [sourceCfg="{\"enable-gtid\":true,\"auto-fix-gtid\":false,\"relay-dir\":\"relay-dir\",\"meta-dir\":\"\",\"flavor\":\"mysql\",\"charset\":\"\",\"enable-relay\":false,\"relay-binlog-name\":\"\",\"relay-binlog-gtid\":\"\",\"source-id\":\"ds-mysql_report\",\"from\":{\"host\":\"172.16.150.53\",\"port\":15381,\"user\":\"dm_sync\",\"max-allowed-packet\":null,\"session\":{\"time_zone\":\"+00:00\"},\"security\":null},\"purge\":{\"interval\":3600,\"expires\":0,\"remain-space\":15},\"checker\":{\"check-enable\":true,\"backoff-rollback\":{\"Duration\":\"5m0s\"},\"backoff-max\":{\"Duration\":\"5m0s\"}},\"server-id\":429548349,\"case-sensitive\":false,\"filters\":null}"] 
[subTasks="{\"dm-mysql_report\":{\"is-sharding\":false,\"shard-mode\":\"\",\"online-ddl-scheme\":\"gh-ost\",\"case-sensitive\":false,\"name\":\"dm-mysql_report\",\"mode\":\"incremental\",\"ignore-checking-items\":[\"dump_privilege\"],\"source-id\":\"ds-mysql_report\",\"server-id\":429548349,\"flavor\":\"mysql\",\"meta-schema\":\"dm_meta\",\"heartbeat-update-interval\":1,\"heartbeat-report-interval\":10,\"enable-heartbeat\":false,\"meta\":{\"BinLogName\":\"\",\"BinLogPos\":0,\"BinLogGTID\":\"34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-168290280,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207\"},\"timezone\":\"\",\"relay-dir\":\"relay-dir\",\"use-relay\":false,\"from\":{\"host\":\"172.16.150.53\",\"port\":15381,\"user\":\"dm_sync\",\"max-allowed-packet\":null,\"session\":{\"time_zone\":\"+00:00\"},\"security\":null},\"to\":{\"host\":\"172.21.35.233\",\"port\":15381,\"user\":\"dm_load\",\"max-allowed-packet\":null,\"session\":{\"tidb_txn_mode\":\"optimistic\",\"time_zone\":\"+00:00\"},\"security\":null},\"route-rules\":[{\"schema-pattern\":\"reverse_flow\",\"table-pattern\":\"\",\"target-schema\":\"reverse_center\",\"target-table\":\"\"}],\"filter-rules\":[],\"mapping-rule\":[],\"black-white-list\":null,\"block-allow-list\":{\"do-tables\":[{\"db-name\":\"reverse_flow\",\"tbl-name\":\"rc_reverse_record_integration\"}],\"do-dbs\":[\"reverse_flow\"],\"ignore-tables\":null,\"ignore-dbs\":null},\"mydumper-path\":\"./bin/mydumper\",\"threads\":1,\"chunk-filesize\":\"64\",\"statement-size\":0,\"rows\":1000,\"where\":\"\",\"skip-tz-utc\":true,\"extra-args\":\"--consistency 
none\",\"pool-size\":8,\"dir\":\"./dm-mysql_report.dm-mysql_report\",\"meta-file\":\"\",\"worker-count\":128,\"batch\":100,\"queue-size\":1024,\"checkpoint-flush-interval\":30,\"max-retry\":0,\"auto-fix-gtid\":false,\"enable-gtid\":true,\"disable-detect\":false,\"safe-mode\":false,\"enable-ansi-quotes\":false,\"log-level\":\"\",\"log-file\":\"\",\"log-format\":\"\",\"log-rotate\":\"\",\"pprof-addr\":\"\",\"status-addr\":\"\",\"config-file\":\"\",\"clean-dump-file\":false,\"ansi-quotes\":false}}"] [2021/08/17 13:28:58.273 +08:00] [INFO] [worker.go:333] ["start to create subtask"] [component="worker controller"] [sourceID=ds-mysql_report] [task=dm-mysql_report] [2021/08/17 13:28:58.273 +08:00] [INFO] [worker.go:426] ["subtask created"] [component="worker controller"] [config="{\"is-sharding\":false,\"shard-mode\":\"\",\"online-ddl-scheme\":\"gh-ost\",\"case-sensitive\":false,\"name\":\"dm-mysql_report\",\"mode\":\"incremental\",\"ignore-checking-items\":[\"dump_privilege\"],\"source-id\":\"ds-mysql_report\",\"server-id\":429548349,\"flavor\":\"mysql\",\"meta-schema\":\"dm_meta\",\"heartbeat-update-interval\":1,\"heartbeat-report-interval\":10,\"enable-heartbeat\":false,\"meta\":{\"BinLogName\":\"\",\"BinLogPos\":0,\"BinLogGTID\":\"34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-168290280,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207\"},\"timezone\":\"\",\"relay-dir\":\"relay-dir\",\"use-relay\":false,\"from\":{\"host\":\"172.16.150.53\",\"port\":15381,\"user\":\"dm_sync\",\"max-allowed-packet\":null,\"session\":{\"time_zone\":\"+00:00\"},\"security\":null},\"to\":{\"host\":\"172.21.35.233\",\"port\":15381,\"user\":\"dm_load\",\"max-allowed-packet\":null,\"session\":{\"tidb_txn_mode\":\"optimistic\",\"t
ime_zone\":\"+00:00\"},\"security\":null},\"route-rules\":[{\"schema-pattern\":\"reverse_flow\",\"table-pattern\":\"\",\"target-schema\":\"reverse_center\",\"target-table\":\"\"}],\"filter-rules\":[],\"mapping-rule\":[],\"black-white-list\":null,\"block-allow-list\":{\"do-tables\":[{\"db-name\":\"reverse_flow\",\"tbl-name\":\"rc_reverse_record_integration\"}],\"do-dbs\":[\"reverse_flow\"],\"ignore-tables\":null,\"ignore-dbs\":null},\"mydumper-path\":\"./bin/mydumper\",\"threads\":1,\"chunk-filesize\":\"64\",\"statement-size\":0,\"rows\":1000,\"where\":\"\",\"skip-tz-utc\":true,\"extra-args\":\"--consistency none\",\"pool-size\":8,\"dir\":\"./dm-mysql_report.dm-mysql_report\",\"meta-file\":\"\",\"worker-count\":128,\"batch\":100,\"queue-size\":1024,\"checkpoint-flush-interval\":30,\"max-retry\":0,\"auto-fix-gtid\":false,\"enable-gtid\":true,\"disable-detect\":false,\"safe-mode\":false,\"enable-ansi-quotes\":false,\"log-level\":\"\",\"log-file\":\"\",\"log-format\":\"\",\"log-rotate\":\"\",\"pprof-addr\":\"\",\"status-addr\":\"\",\"config-file\":\"\",\"clean-dump-file\":false,\"ansi-quotes\":false}"] [2021/08/17 13:28:58.273 +08:00] [INFO] [syncer.go:3024] ["use timezone"] [task=dm-mysql_report] [unit="binlog replication"] [location=UTC] [2021/08/17 13:28:58.891 +08:00] [INFO] [config.go:599] ["detect server type"] [task=dm-mysql_report] [unit="binlog replication"] [scope=upstream] [type=MySQL] [2021/08/17 13:28:58.891 +08:00] [INFO] [config.go:618] ["detect server version"] [task=dm-mysql_report] [unit="binlog replication"] [scope=upstream] [version=5.7.20-log] [2021/08/17 13:28:58.894 +08:00] [INFO] [config.go:599] ["detect server type"] [task=dm-mysql_report] [unit="binlog replication"] [scope=downstream] [type=TiDB] [2021/08/17 13:28:58.894 +08:00] [INFO] [config.go:618] ["detect server version"] [task=dm-mysql_report] [unit="binlog replication"] [scope=downstream] [version=4.0.13] [2021/08/17 13:28:59.422 +08:00] [INFO] [checkpoint.go:699] ["create checkpoint 
schema"] [task=dm-mysql_report] [unit="binlog replication"] [component="remote checkpoint"] [statement="CREATE SCHEMA IF NOT EXISTS `dm_meta`"]
[2021/08/17 13:28:59.426 +08:00] [INFO] [checkpoint.go:723] ["create checkpoint table"] [task=dm-mysql_report] [unit="binlog replication"] [component="remote checkpoint"] [statements="[\"CREATE TABLE IF NOT EXISTS `dm_meta`.`dm-mysql_report_syncer_checkpoint` (\n\t\t\tid VARCHAR(32) NOT NULL,\n\t\t\tcp_schema VARCHAR(128) NOT NULL,\n\t\t\tcp_table VARCHAR(128) NOT NULL,\n\t\t\tbinlog_name VARCHAR(128),\n\t\t\tbinlog_pos INT UNSIGNED,\n\t\t\tbinlog_gtid TEXT,\n\t\t\texit_safe_binlog_name VARCHAR(128) DEFAULT '',\n\t\t\texit_safe_binlog_pos INT UNSIGNED DEFAULT 0,\n\t\t\texit_safe_binlog_gtid TEXT,\n\t\t\ttable_info JSON NOT NULL,\n\t\t\tis_global BOOLEAN,\n\t\t\tcreate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,\n\t\t\tupdate_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n\t\t\tUNIQUE KEY uk_id_schema_table (id, cp_schema, cp_table)\n\t\t)\"]"]
[2021/08/17 13:28:59.429 +08:00] [INFO] [checkpoint.go:785] ["fetch global checkpoint from DB"] [task=dm-mysql_report] [unit="binlog replication"] [component="remote checkpoint"] ["global checkpoint"="position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207(flushed position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207)"]
[2021/08/17 13:28:59.431 +08:00] [INFO] [subtask.go:226] ["start to run"] [subtask=dm-mysql_report] [unit=Sync]
[2021/08/17 13:28:59.431 +08:00] [INFO] [worker.go:351] ["handling subtask enabled"] [component="worker controller"]
[2021/08/17 13:28:59.432 +08:00] [INFO] [syncer.go:1342] ["replicate binlog from checkpoint"] [task=dm-mysql_report] [unit="binlog replication"] [checkpoint="position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207"]
[2021/08/17 13:28:59.440 +08:00] [INFO] [streamer_controller.go:72] ["last slave connection"] [task=dm-mysql_report] [unit="binlog replication"] ["connection ID"=31610609]
[2021/08/17 13:28:59.440 +08:00] [INFO] [mode.go:100] ["change count"] [task=dm-mysql_report] [unit="binlog replication"] ["previous count"=0] ["new count"=0]
[2021/08/17 13:28:59.440 +08:00] [INFO] [mode.go:100] ["change count"] [task=dm-mysql_report] [unit="binlog replication"] ["previous count"=0] ["new count"=1]
[2021/08/17 13:28:59.440 +08:00] [INFO] [mode.go:59] ["enable safe-mode because of task initialization"] [task=dm-mysql_report] [unit="binlog replication"] ["duration in seconds"=60]
[2021/08/17 13:29:00.075 +08:00] [INFO] [syncer.go:1690] ["meet heartbeat event and then flush jobs"] [task=dm-mysql_report] [unit="binlog replication"]
[2021/08/17 13:29:00.075 +08:00] [INFO] [syncer.go:2746] ["flush all jobs"] [task=dm-mysql_report] [unit="binlog replication"] ["global checkpoint"="position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207(flushed position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207)"]
[2021/08/17 13:29:00.080 +08:00] [INFO] [syncer.go:1003] ["flushed checkpoint"] [task=dm-mysql_report] [unit="binlog replication"] [checkpoint="position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207(flushed position: (mysql-bin.001906, 820109405), gtid-set: 34474b1e-2bf3-11e8-8515-00163e1040fb:1-1278385,767d5889-e08e-11ea-bf83-00163e0e3732:1,7fbb40a3-8240-11eb-8cda-00163e17fb0e:1-344070205,803ffea1-7b9d-11e9-87b0-00163e0e3732:1-435424493,a2d27a7f-de3c-11e7-82cd-00163e1040fb:1-304056,a33473fb-de3c-11e7-8140-00163e0e6470:1-68232043,bfbebe4d-1582-11e9-8e63-00163e082a23:1-36011466,e033b7c4-7b9d-11e9-8e45-00163e097eeb:1-608218207)"]
[2021/08/17 13:29:13.098 +08:00] [INFO] [server.go:753] [request=QueryStatus] [payload="name:\"dm-mysql_report\" "]
[2021/08/17 13:29:13.098 +08:00] [INFO] [worker.go:509] ["will open a connection to get master status"] [component="worker controller"] ["upstream config"="{\"host\":\"172.16.150.53\",\"port\":15381,\"user\":\"dm_sync\",\"max-allowed-packet\":null,\"session\":{\"time_zone\":\"+00:00\"},\"security\":null}"]
[2021/08/17 13:29:29.443 +08:00] [INFO] [syncer.go:2627] ["binlog replication progress"] [task=dm-mysql_report] [unit="binlog replication"] ["total binlog size"=12632410] ["last binlog size"=0] ["cost time"=30] [bytes/Second=421080] ["unsynced binlog size"=0] ["estimate time to catch up"=0]


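- After the new dm-worker took over, the sync task ran normally. Because the switchover takes about 60 s, replication latency during the switchover is at least 60 s.
- Whether the crashed dm-worker starts automatically and rejoins the cluster once it is brought back: it rejoins the cluster automatically, and the dm-master leader tries to restart the crashed dm-worker.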
```
[2021/08/17 13:30:28.796 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:30:31.625 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:30:35.190 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {172.17.201.115:8262 <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 172.17.201.115:8262: connect: connection refused\". Reconnecting..."] [component="embed etcd"]
[2021/08/17 13:30:37.523 +08:00] [INFO] [server.go:2206] [payload="name:\"dm-172.17.201.115-8262\" address:\"172.17.201.115:8262\" "] [request=RegisterWorker]
[2021/08/17 13:30:37.523 +08:00] [WARN] [scheduler.go:836] ["add the same worker again"] [component=scheduler] ["worker info"="{\"name\":\"dm-172.17.201.115-8262\",\"addr\":\"172.17.201.115:8262\"}"]
[2021/08/17 13:30:37.523 +08:00] [INFO] [server.go:309] ["register worker successfully"] [name=dm-172.17.201.115-8262] [address=172.17.201.115:8262]
[2021/08/17 13:30:37.529 +08:00] [INFO] [keepalive.go:216] ["receive dm-worker keep alive event"] [operation=PUT] [kv=/dm-worker/a/646d2d3137322e31372e3230312e3131352d38323632]
[2021/08/17 13:30:37.529 +08:00] [INFO] [scheduler.go:1506] ["receive worker status change event"] [component=scheduler] [delete=false] [event="{\"worker-name\":\"dm-172.17.201.115-8262\",\"join-time\":\"2021-08-17T13:30:37.524837339+08:00\"}"]
[2021/08/17 13:30:37.529 +08:00] [INFO] [scheduler.go:1739] ["no unbound sources need to bound"] [component=scheduler] [worker="{\"name\":\"dm-172.17.201.115-8262\",\"addr\":\"172.17.201.115:8262\"}"]
```
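
The keep-alive entries in the dm-master log identify the worker by a key such as `/dm-worker/a/646d2d31...`. In these logs the last path segment appears to be the hex encoding of the worker name, so decoding it is a quick way to confirm which worker a DELETE or PUT event refers to. The sketch below rests on that observation only; the helper name is illustrative.

```python
# Key copied from the "receive dm-worker keep alive event" log entries.
KEY = "/dm-worker/a/646d2d3137322e31372e3230312e3131352d38323632"

def worker_name_from_key(key: str) -> str:
    """Decode the hex-encoded last path segment of a keep-alive key."""
    hex_part = key.rsplit("/", 1)[-1]
    return bytes.fromhex(hex_part).decode("utf-8")

print(worker_name_from_key(KEY))  # dm-172.17.201.115-8262
```

The decoded name matches the `worker-name` field in the "receive worker status change event" entry that follows each keep-alive event.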



dm-master HA

  1. Simulate a dm-master crash


    date; kill -9 pid; mv <deploy dir> <deploy dir>-1  # force-kill the dm-master process, then rename the deployment directory so that it cannot be restarted automatically


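  2. Observe how the leader is switched over

  3. Record the relevant data: leader switchover time, status of all tasks, latency


Conclusion:


  • Whether a new leader is elected normally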
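
One way to record the leader switchover time is to poll for the current leader and time how long it takes to change. The helper below is a generic sketch, not the tooling used in this drill: `wait_for_change` and its parameters are hypothetical, and `read_value` is whatever callable returns the current leader name. In practice it could wrap the output of dmctl's `list-member --leader`, which is an assumption here. The demo uses a fake reader so that the block runs without a cluster.

```python
import time
from typing import Callable, Optional

def wait_for_change(read_value: Callable[[], Optional[str]],
                    old_value: str,
                    timeout: float = 120.0,
                    interval: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Poll read_value() until it returns a value other than old_value.

    None from read_value() means "no answer yet" (for example, no leader
    during the election). Returns the seconds waited, or None on timeout.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        value = read_value()
        if value is not None and value != old_value:
            return time.monotonic() - start
        sleep(interval)
    return None

# Demo with a fake reader: still the old leader, then no answer, then a new one.
fake_answers = iter(["master-1", None, "master-2"])
waited = wait_for_change(lambda: next(fake_answers), "master-1",
                         timeout=5.0, interval=0.0)
print("leader changed" if waited is not None else "timed out")
```

Start the poll just before killing the leader. The returned duration then approximates the election time to within one polling interval.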

