
Database | DR-AUTO-SYNC Cluster Setup and Primary/DR Switchover Manual

Recently, our company planned to deploy a cluster in the dr-auto-sync architecture. Since we had never run this kind of system in a production environment before, we took a step-by-step, exploratory approach.

After sustained effort, we eventually completed the setup successfully. For ease of reference, the detailed steps of the build process are shared below.

Author: 刘昊 | Database Engineer

I. Cluster Architecture

II. Topology Planning

Plan the topology according to the cluster architecture:

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb/tidb-deploy"
  data_dir: "/tidb/tidb-data"
  arch: "arm64"

monitored:
  node_exporter_port: 19100
  blackbox_exporter_port: 19115

server_configs:
  tidb:
  tikv:
  pd:
    dashboard.enable-telemetry: false
    log.file.max-backups: 100
    log.file.max-days: 90
    replication.isolation-level: logic
    replication.location-labels:
      - dc
      - logic
      - rack
      - host
    replication.max-replicas: 5
    schedule.max-store-down-time: 30m

pd_servers:
  - host: 10.3.65.1
    client_port: 12379
    peer_port: 12380
  - host: 10.3.65.2
    client_port: 12379
    peer_port: 12380
  - host: 10.3.65.3
    client_port: 12379
    peer_port: 12380
  - host: 10.3.65.1
    client_port: 12379
    peer_port: 12380
  - host: 10.3.65.2
    client_port: 12379
    peer_port: 12380

tidb_servers:
  - host: 10.3.65.1
    port: 24000
    status_port: 20080
  - host: 10.3.65.2
    port: 24000
    status_port: 20080
  - host: 10.3.65.3
    port: 24000
    status_port: 20080
  - host: 10.3.65.1
    port: 24000
    status_port: 20080
  - host: 10.3.65.2
    port: 24000
    status_port: 20080
  - host: 10.3.65.3
    port: 24000
    status_port: 20080

tikv_servers:
  - host: 10.3.65.1
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc1
        logic: logic1
        rack: rack1
        host: host1
  - host: 10.3.65.2
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc1
        logic: logic2
        rack: rack1
        host: host1
  - host: 10.3.65.3
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc1
        logic: logic3
        rack: rack1
        host: host1
  - host: 10.3.65.1
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc2
        logic: logic4
        rack: rack1
        host: host1
  - host: 10.3.65.2
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc2
        logic: logic5
        rack: rack1
        host: host1
  - host: 10.3.65.3
    port: 20160
    status_port: 20180
    config:
      server.labels:
        dc: dc2
        logic: logic6
        rack: rack1
        host: host1

monitoring_servers:
  - host: 10.3.65.3
    port: 29090
    ng_port: 22020

grafana_servers:
  - host: 10.3.65.3
    port: 23000

alertmanager_servers:
  - host: 10.3.65.3
    web_port: 29093
    cluster_port: 29094
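
Before deploying, the topology file and target hosts can be validated with tiup's pre-check. A brief sketch, not part of the original steps; the file name dr-auto-sync.yaml is taken from the deploy command in the next section:

tiup cluster check dr-auto-sync.yaml --user tidb -p
# optionally let tiup try to fix the failed check items automatically
tiup cluster check dr-auto-sync.yaml --apply --user tidb -p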

III. Cluster Deployment

1. Deploy the cluster

tiup cluster deploy dr-auto-sync v6.5.4 dr-auto-sync.yaml --user tidb -p
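
After the deploy finishes, the cluster still has to be started before the later steps. A brief sketch assuming the standard tiup workflow; these commands are not shown in the original write-up:

tiup cluster start dr-auto-sync
tiup cluster display dr-auto-sync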

2. Write the placement-rule JSON file for the dr-auto-sync cluster. The rule below places 3 voter replicas in dc1, and 2 follower replicas plus 1 learner replica in dc2, matching the primary-replicas and dr-replicas settings configured later.

vim rule.json

[
  {
    "group_id": "pd",
    "group_index": 0,
    "group_override": false,
    "rules": [
      {
        "group_id": "pd",
        "id": "dc1",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "count": 3,
        "location_labels": ["dc", "logic", "rack", "host"],
        "label_constraints": [{"key": "dc", "op": "in", "values": ["dc1"]}]
      },
      {
        "group_id": "pd",
        "id": "dc2",
        "start_key": "",
        "end_key": "",
        "role": "follower",
        "count": 2,
        "location_labels": ["dc", "logic", "rack", "host"],
        "label_constraints": [{"key": "dc", "op": "in", "values": ["dc2"]}]
      },
      {
        "group_id": "pd",
        "id": "dc2-1",
        "start_key": "",
        "end_key": "",
        "role": "learner",
        "count": 1,
        "location_labels": ["dc", "logic", "rack", "host"],
        "label_constraints": [{"key": "dc", "op": "in", "values": ["dc2"]}]
      }
    ]
  }
]
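
Before loading the file, it can be worth making sure it is valid JSON, for example with jq (assuming jq is installed on the control machine; any JSON validator works):

jq . rule.json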

3. Load the placement-rule JSON file so that it takes effect

[tidb@tidb141 ~]$ tiup ctl:v6.5.4 pd -u 10.3.65.141:22379 -i
» config placement-rules rule-bundle save --in="/home/tidb/rule.json"

Check that the configuration has taken effect:

» config placement-rules show

4. Switch the replication mode to dr-auto-sync

config set replication-mode dr-auto-sync

5. Configure the data-center label key for dr-auto-sync

config set replication-mode dr-auto-sync label-key dc

6. Configure the primary data center

config set replication-mode dr-auto-sync primary dc1

7. Configure the DR (secondary) data center

config set replication-mode dr-auto-sync dr dc2

8. Configure the number of replicas in the primary data center

config set replication-mode dr-auto-sync primary-replicas 3

9. Configure the number of replicas in the DR data center

config set replication-mode dr-auto-sync dr-replicas 2
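
To confirm that the settings in steps 4-9 were accepted, you can dump the PD configuration in the same pd-ctl session; a minimal check using the generic config show command (the exact layout of the replication-mode section may vary between versions):

» config show all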

10. If the cluster is a cross-data-center dr-auto-sync deployment, make sure the PD leader always stays in the primary data center. To do this, give the primary-DC PD members a higher leader priority than the DR-DC members; the larger the value, the higher the priority and the more likely that member is to become the PD leader.

tiup ctl:v6.5.3 pd -u 192.168.113.1:12379 -i
member leader_priority pd-192.168.113.1-12379 100
member leader_priority pd-192.168.113.2-12379 100
member leader_priority pd-192.168.113.3-12379 100
member leader_priority pd-192.168.113.4-12379 50
member leader_priority pd-192.168.113.5-12379 50
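
After setting the priorities, it may be worth checking which member currently holds PD leadership; a small check using pd-ctl's standard member subcommands, run in the same interactive session (member leader show prints the current leader, and member lists all members along with their leader_priority values):

member leader show
member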

11. Check the cluster replication status

[tidb@tidb141 ~]$ curl http://10.3.65.141:22379/pd/api/v1/replication_mode/status
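
If everything above is in effect, the status API should report the dr-auto-sync mode in the sync state. The response is roughly of the following shape (illustrative only; field values depend on the cluster):

{
  "mode": "dr-auto-sync",
  "dr-auto-sync": {
    "label_key": "dc",
    "state": "sync"
  }
}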

IV. Testing

1. Manually stop the TiKV nodes in the DR data center, wait about one minute, and check whether the replication level is automatically downgraded to async.

[tidb@tidb141 ~]$ tiup cluster stop dr-auto-sync -N 10.3.65.142:10160,10.3.65.142:40160,10.3.65.142:50160

The replication level is automatically downgraded to async (asynchronous replication).
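
This can be confirmed by re-running the status query from step 11; the state reported for dr-auto-sync should now be async instead of sync:

curl http://10.3.65.141:22379/pd/api/v1/replication_mode/status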

2. Start the stopped TiKV nodes again, wait about one minute, and check whether the replication level is automatically upgraded back to sync.

[tidb@tidb141 ~]$ tiup cluster start dr-auto-sync -N 10.3.65.142:10160,10.3.65.142:40160,10.3.65.142:50160

The replication level is automatically upgraded back to sync, as expected.

At this point, the dr-auto-sync cluster has been deployed successfully.

V. Summary

A dr-auto-sync cluster is actually not very different from an ordinary cluster. As long as you plan the cluster topology, labels, and placement-rule JSON file according to your needs, there should be few problems; you can treat it much like deploying an ordinary cluster. A few points do deserve attention:

1. For cross-data-center deployments, configure PD leader priorities to keep the PD leader from drifting to the DR data center, which would hurt overall performance.

2. In the 6.5.4 version I used, dr-auto-sync has a bug: after the configuration is complete, you need to reload the TiKV nodes to trigger new Region leader elections; only then does the replication link upgrade to the sync state, otherwise it stays stuck in the sync_recover phase.
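
For reference, the reload mentioned above can be done per role with tiup; a minimal sketch using the cluster name from the deployment step:

tiup cluster reload dr-auto-sync -R tikv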

Copyright notice: this article was compiled and written by the 神州数码云基地 (Digital China Cloud Base) team. Please credit the source when reposting.

Search for 神州数码云基地 on WeChat official accounts for more technical content.
