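1. What is Orchestrator
Orchestrator is an open-source tool, written in Go, for high availability and visual topology management of MySQL replication. It actively discovers the current topology and replication status, and supports adjusting the replication topology, automatic failover when a master fails, and manual switchover.
Orchestrator stores its metadata in a MySQL or SQLite backend. It provides a web interface that shows the topology and instance status of MySQL clusters, and some instance settings can be changed from that interface. It also provides a command line and an API for more flexible automated operations. Master failover is either automatic or manual. Manual failover comes in several forms: recover, force-master-failover, force-master-takeover, and graceful-master-takeover.
Compared with MHA, Orchestrator focuses more on managing the replication topology: it can rearrange any MySQL replication topology and builds MySQL high availability on top of that. Orchestrator itself can also be deployed on multiple nodes, using the raft distributed consensus protocol for its own high availability.
2. Setting up Orchestrator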
Host | hostname | Notes |
192.168.88.129 | mysql_orch | Orchestrator host |
192.168.88.130 | mysql00 | Master, MySQL 8.0 |
192.168.88.131 | mysql01 | Replica, MySQL 8.0 |
192.168.88.132 | mysql02 | Replica, MySQL 8.0 |
a. Set the hostname (mysql00 shown here; use each host's own name)
vim /etc/sysconfig/network
hostname=mysql00
b. On every machine, add the following entries
echo '192.168.88.129 mysql_orch' >> /etc/hosts
echo '192.168.88.130 mysql00' >> /etc/hosts
echo '192.168.88.131 mysql01' >> /etc/hosts
echo '192.168.88.132 mysql02' >> /etc/hosts
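Running the `echo ... >> /etc/hosts` lines twice leaves duplicate entries. The sketch below appends an entry only when the hostname is not mapped yet. The function name `add_host_entry` and its optional third argument (a file other than /etc/hosts, useful for a dry run) are hypothetical helpers, not part of the original setup.

```shell
#!/bin/bash
# add_host_entry IP NAME [FILE]
# Append "IP NAME" to FILE (default: /etc/hosts) unless NAME already appears
# there as a whole word. FILE is an assumed convenience for dry runs.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # -F: fixed string, -w: whole-word match, -q: no output
    if grep -qwF -- "$name" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (run as root on every machine):
#   add_host_entry 192.168.88.129 mysql_orch
#   add_host_entry 192.168.88.130 mysql00
#   add_host_entry 192.168.88.131 mysql01
#   add_host_entry 192.168.88.132 mysql02
```

Because the check is by hostname only, an existing entry with a different IP is left untouched and should be corrected by hand.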
c. Set up passwordless SSH login on every host
ssh-keygen -t rsa ## press Enter three times
ssh-copy-id -i ~/.ssh/id_rsa.pub mysql_orch
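ssh-copy-id -i ~/.ssh/id_rsa.pub mysql00 ## enables passwordless login to 192.168.88.130
ssh-copy-id -i ~/.ssh/id_rsa.pub mysql01 ## enables passwordless login to 192.168.88.131
ssh-copy-id -i ~/.ssh/id_rsa.pub mysql02 ## enables passwordless login to 192.168.88.132
d. Turn off the firewall on every machine
Setting up the master and replicas is skipped here; GTID and enhanced (lossless) semi-synchronous replication must be enabled.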
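Since the VIP script later depends on passwordless login, it may be worth verifying it right away. This is a minimal sketch; the function name `check_ssh` is hypothetical. `BatchMode=yes` makes ssh fail instead of prompting for a password, so a host without a working key shows up as FAIL.

```shell
#!/bin/bash
# check_ssh HOST...
# Print OK/FAIL per host depending on whether key-based login works.
# Returns 1 if any host failed.
check_ssh() {
    local host failed=0
    for host in "$@"; do
        # "true" is the remote command: it does nothing and exits 0
        if ssh -o BatchMode=yes -o ConnectTimeout=3 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   check_ssh mysql_orch mysql00 mysql01 mysql02
```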
On the MySQL instance at 192.168.88.129, create a database and an orchestrator account; Orchestrator uses them to store the information about the high-availability topology.
- CREATE DATABASE IF NOT EXISTS orchestrator;
- create user 'orchestrator'@'%' identified by '123456';
- GRANT ALL PRIVILEGES ON `orchestrator`.* TO 'orchestrator'@'%';
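On the MySQL instances at 192.168.88.130, 192.168.88.131, and 192.168.88.132, create an orchestrator account. Orchestrator uses it to collect information from each instance, read the replication structure, and switch masters and replicas.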
- create user 'orchestrator'@'%' identified by '123456';
- GRANT SUPER, PROCESS, REPLICATION SLAVE, RELOAD ON *.* TO 'orchestrator'@'%';
- GRANT SELECT ON mysql.slave_master_info TO 'orchestrator'@'%';
This account must be created on every MySQL instance.
a. Download Orchestrator
GitHub - openark/orchestrator: MySQL replication topology management and HA
orchestrator-3.2.6-1.x86_64.rpm
b. Install orchestrator-3.2.6-1.x86_64.rpm
Run yum localinstall orchestrator-3.2.6-1.x86_64.rpm, and install any dependency packages it asks for.
c. Configure orchestrator
After the yum install, Orchestrator is located in /usr/local/orchestrator.
Copy the sample file to create orchestrator.conf.json:
cp orchestrator-sample.conf.json orchestrator.conf.json
orchestrator.conf.json is a JSON file; the main parameters are:
- {
- "Debug": true,
- "EnableSyslog": false,
- "ListenAddress": ":3000",
- "MySQLTopologyUser": "orchestrator",
- "MySQLTopologyPassword": "123456",
- "MySQLTopologyCredentialsConfigFile": "",
- "MySQLTopologySSLPrivateKeyFile": "",
- "MySQLTopologySSLCertFile": "",
- "MySQLTopologySSLCAFile": "",
- "MySQLTopologySSLSkipVerify": true,
- "MySQLTopologyUseMutualTLS": false,
- "MySQLOrchestratorHost": "192.168.88.129",
- "MySQLOrchestratorPort": 3306,
- "MySQLOrchestratorDatabase": "orchestrator",
- "MySQLOrchestratorUser": "orchestrator",
- "MySQLOrchestratorPassword": "123456",
- "MySQLOrchestratorCredentialsConfigFile": "",
- "MySQLOrchestratorSSLPrivateKeyFile": "",
- "MySQLOrchestratorSSLCertFile": "",
- "MySQLOrchestratorSSLCAFile": "",
- "MySQLOrchestratorSSLSkipVerify": true,
- "MySQLOrchestratorUseMutualTLS": false,
- "MySQLConnectTimeoutSeconds": 1,
- "DefaultInstancePort": 3306,
- "DiscoverByShowSlaveHosts": true,
- "InstancePollSeconds": 5,
- "DiscoveryIgnoreReplicaHostnameFilters": [
- "a_host_i_want_to_ignore[.]example[.]com",
- ".*[.]ignore_all_hosts_from_this_domain[.]example[.]com",
- "a_host_with_extra_port_i_want_to_ignore[.]example[.]com:3307"
- ],
- "UnseenInstanceForgetHours": 240,
- "SnapshotTopologiesIntervalHours": 0,
- "InstanceBulkOperationsWaitTimeoutSeconds": 10,
- "HostnameResolveMethod": "default",
- "MySQLHostnameResolveMethod": "@@hostname",
- "SkipBinlogServerUnresolveCheck": true,
- "ExpiryHostnameResolvesMinutes": 60,
- "RejectHostnameResolvePattern": "",
- "ReasonableReplicationLagSeconds": 10,
- "ProblemIgnoreHostnameFilters": [],
- "VerifyReplicationFilters": false,
- "ReasonableMaintenanceReplicationLagSeconds": 20,
- "CandidateInstanceExpireMinutes": 60,
- "AuditLogFile": "",
- "AuditToSyslog": false,
- "RemoveTextFromHostnameDisplay": ".mydomain.com:3306",
- "ReadOnly": false,
- "AuthenticationMethod": "",
- "HTTPAuthUser": "",
- "HTTPAuthPassword": "",
- "AuthUserHeader": "",
- "PowerAuthUsers": [
- "*"
- ],
- "ClusterNameToAlias": {
- "127.0.0.1": "test suite"
- },
- "ReplicationLagQuery": "",
- "DetectClusterAliasQuery": "SELECT SUBSTRING_INDEX(@@hostname, '.', 1)",
- "DetectClusterDomainQuery": "",
- "DetectInstanceAliasQuery": "",
- "DetectPromotionRuleQuery": "",
- "DataCenterPattern": "[.]([^.]+)[.][^.]+[.]mydomain[.]com",
- "PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]mydomain[.]com",
- "PromotionIgnoreHostnameFilters": [],
- "DetectSemiSyncEnforcedQuery": "",
- "ServeAgentsHttp": false,
- "AgentsServerPort": ":3001",
- "AgentsUseSSL": false,
- "AgentsUseMutualTLS": false,
- "AgentSSLSkipVerify": false,
- "AgentSSLPrivateKeyFile": "",
- "AgentSSLCertFile": "",
- "AgentSSLCAFile": "",
- "AgentSSLValidOUs": [],
- "UseSSL": false,
- "UseMutualTLS": false,
- "SSLSkipVerify": false,
- "SSLPrivateKeyFile": "",
- "SSLCertFile": "",
- "SSLCAFile": "",
- "SSLValidOUs": [],
- "URLPrefix": "",
- "StatusEndpoint": "/api/status",
- "StatusSimpleHealth": true,
- "StatusOUVerify": false,
- "AgentPollMinutes": 60,
- "UnseenAgentForgetHours": 6,
- "StaleSeedFailMinutes": 60,
- "SeedAcceptableBytesDiff": 8192,
- "PseudoGTIDPattern": "",
- "PseudoGTIDPatternIsFixedSubstring": false,
- "PseudoGTIDMonotonicHint": "asc:",
- "DetectPseudoGTIDQuery": "",
- "BinlogEventsChunkSize": 10000,
- "SkipBinlogEventsContaining": [],
- "ReduceReplicationAnalysisCount": true,
- "FailureDetectionPeriodBlockMinutes": 1,
- "FailMasterPromotionOnLagMinutes": 0,
- "RecoveryPeriodBlockSeconds": 30,
- "RecoveryIgnoreHostnameFilters": [],
- "RecoverMasterClusterFilters": [
- "*"
- ],
- "RecoverIntermediateMasterClusterFilters": [
- "*"
- ],
- "OnFailureDetectionProcesses": [
- "echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log"
- ],
- "PreGracefulTakeoverProcesses": [
- "echo 'Planned takeover about to take place on {failureCluster}. Master will switch to read_only' >> /tmp/recovery.log"
- ],
- "PreFailoverProcesses": [
- "echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
- ],
- "PostFailoverProcesses": [
- "echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}{failureClusterAlias}' >> /tmp/recovery.log", "/home/orch/orch_hook.sh {failureType} {failureClusterAlias} {failedHost} {successorHost} >> /tmp/orch.log"
- ],
- "PostUnsuccessfulFailoverProcesses": [],
- "PostMasterFailoverProcesses": [
- "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log"
- ],
- "PostIntermediateMasterFailoverProcesses": [
- "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
- ],
- "PostGracefulTakeoverProcesses": [
- "echo 'Planned takeover complete' >> /tmp/recovery.log"
- ],
- "CoMasterRecoveryMustPromoteOtherCoMaster": true,
- "DetachLostSlavesAfterMasterFailover": true,
- "ApplyMySQLPromotionAfterMasterFailover": true,
- "PreventCrossDataCenterMasterFailover": false,
- "PreventCrossRegionMasterFailover": false,
- "MasterFailoverDetachReplicaMasterHost": false,
- "MasterFailoverLostInstancesDowntimeMinutes": 0,
- "PostponeReplicaRecoveryOnLagMinutes": 0,
- "OSCIgnoreHostnameFilters": [],
- "GraphiteAddr": "",
- "GraphitePath": "",
- "GraphiteConvertHostnameDotsToUnderscores": true,
- "ConsulAddress": "",
- "ConsulAclToken": "",
- "ConsulKVStoreProvider": "consul"
- }

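Settings to change:
MySQLTopologyUser and MySQLTopologyPassword
are the MySQL account on 192.168.88.130, 192.168.88.131, and 192.168.88.132.
"MySQLOrchestratorHost": "192.168.88.129",
"MySQLOrchestratorPort": 3306,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "orchestrator",
"MySQLOrchestratorPassword": "123456",
The parameters above point to the database on 192.168.88.129 that stores Orchestrator's own data. The user must be the account created on that instance earlier. Make sure the distinction is clear: the two sets of credentials are easy to confuse.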
RecoverMasterClusterFilters
RecoverIntermediateMasterClusterFilters
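Setting both parameters to * is recommended; otherwise, when the master is switched repeatedly, the second switch will not succeed.
RecoveryPeriodBlockSeconds 3600 ## after a recovery, the next recovery is blocked until 3600s have passed
FailureDetectionPeriodBlockMinutes 1 ## a failure that recurs within this period is not detected again
In a test environment, lower these two values; otherwise repeated master switches will run into problems.
d. Start orchestrator
Run the following in /usr/local/orchestrator: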
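Because the topology credentials and the backend credentials are easy to mix up, a quick look at what the config file actually contains can help. The sketch below reads single values out of the JSON file with sed. It assumes the one-pair-per-line layout of the sample file and scalar values without embedded commas or quotes; `conf_get` and `check_backend_user` are hypothetical helper names.

```shell
#!/bin/bash
# conf_get KEY FILE
# Print the value of a scalar KEY from an orchestrator.conf.json-style file.
# Assumes each "Key": value pair sits on its own line (as in the sample file).
conf_get() {
    local key="$1" file="$2"
    sed -n -E "s/^[[:space:]]*\"${key}\"[[:space:]]*:[[:space:]]*\"?([^\",]*)\"?,?[[:space:]]*\$/\\1/p" "$file" | head -n 1
}

# check_backend_user FILE EXPECTED
# Compare MySQLOrchestratorUser in FILE with the account name created in MySQL.
check_backend_user() {
    local file="$1" expected="$2" actual
    actual=$(conf_get MySQLOrchestratorUser "$file")
    if [ "$actual" = "$expected" ]; then
        echo "backend user OK: $actual"
    else
        echo "backend user mismatch: config has '$actual', expected '$expected'"
        return 1
    fi
}

# Example:
#   conf_get RecoveryPeriodBlockSeconds /usr/local/orchestrator/orchestrator.conf.json
#   check_backend_user /usr/local/orchestrator/orchestrator.conf.json orchestrator
```

A real JSON parser is more robust; sed is used here only to avoid extra dependencies on the Orchestrator host.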
nohup ./orchestrator --debug http > orc.log 2>&1 &
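e. Open the orchestrator web page
On the Discover page, add the MySQL replication cluster: enter the master, 192.168.88.130, as the hostname and click Submit.
If the connection succeeds, Orchestrator displays the topology graph.
3. Other related questions
Shut down MySQL on 192.168.88.130 and refresh Orchestrator: you can see that Orchestrator has automatically elected a new master.
a. orch first checks the master itself, to see whether it can be reached.
b. It then checks whether any of the replicas it can still connect to can reach the master. If both a and b fail, the master is considered dead.
orch relies on the replicas' IO threads to detect a dead master (after failing to reach the master itself, it also checks the master through the replicas). With a default CHANGE MASTER setup, replicas take too long to notice that the master is down, so the replication settings need a small adjustment: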
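The same discovery can be scripted through Orchestrator's HTTP API instead of the web form, using the `/api/discover/<host>/<port>` endpoint. In this sketch the base URL is an assumption built from the host 192.168.88.129 and the ListenAddress ":3000" used in this setup; the variable `ORCH_API` and the function `orch_discover` are hypothetical names.

```shell
#!/bin/bash
# Assumed base URL: Orchestrator host and ListenAddress from this setup.
ORCH_API="${ORCH_API:-http://192.168.88.129:3000/api}"

# orch_discover HOST [PORT]
# Ask Orchestrator to discover one MySQL instance (PORT defaults to 3306).
orch_discover() {
    local host="$1" port="${2:-3306}"
    curl -s "${ORCH_API}/discover/${host}/${port}"
}

# Example: discover the master; Orchestrator then crawls to its replicas.
#   orch_discover 192.168.88.130
```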
change master to master_host='192.168.163.131',master_port=3307,master_user='rep',master_password='rep',master_auto_position=1,MASTER_HEARTBEAT_PERIOD=2,MASTER_CONNECT_RETRY=1, MASTER_RETRY_COUNT=86400;
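set global slave_net_timeout=8; ## timeout for deciding that the replica has lost its connection to the master
slave_net_timeout: if the replica receives no data at all from the master (neither binlog events nor heartbeats) within slave_net_timeout seconds, it considers the connection broken and reconnects. The default is 60s.
MASTER_HEARTBEAT_PERIOD: the heartbeat interval. When the master has had no binlog event to send to the replica for MASTER_HEARTBEAT_PERIOD seconds, it sends a heartbeat to the replica instead.
MASTER_CONNECT_RETRY: once slave_net_timeout expires, the replica reconnects immediately; the interval between later reconnect attempts is set by the MASTER_CONNECT_RETRY option of CHANGE MASTER TO. The default is 60s.
MASTER_RETRY_COUNT: limits the number of reconnect attempts.
To choose a new master, Orchestrator considers each replica's executed position, data center, version compatibility, binlog format, whether the required parameters are satisfied, the promotion_rule settings, and so on. It looks for a candidate that meets the configured conditions and tries to promote it to master, which completes the automatic recovery.
By default it picks the replica with the largest Executed_Gtid_Set, that is, the one with the most complete data.
In some situations data really can be lost, leaving master and replicas inconsistent, because orch does not backfill missing data on its own. So always use GTID and enhanced semi-synchronous replication.
Check whether the data on the old master and the new master differ, then run change master to: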
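The heartbeat interval only helps if it is shorter than slave_net_timeout; otherwise an idle but healthy connection could time out between two heartbeats. A small sanity check for a pair of values, as a sketch (the function name `heartbeat_ok` is hypothetical; awk is used because the heartbeat period may be fractional):

```shell
#!/bin/bash
# heartbeat_ok HEARTBEAT_PERIOD SLAVE_NET_TIMEOUT
# Succeeds (exit 0) when the heartbeat period is strictly shorter than the
# timeout, so at least one heartbeat arrives before the replica gives up.
heartbeat_ok() {
    awk -v hb="$1" -v to="$2" 'BEGIN { exit !(hb + 0 < to + 0) }'
}

# Example with the values used above (heartbeat 2s, timeout 8s):
#   heartbeat_ok 2 8 && echo "settings are consistent"
```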
change master to master_host='192.168.88.130', master_port=3306, master_user='repl', master_password='123456', master_auto_position=1, MASTER_HEARTBEAT_PERIOD=2,MASTER_CONNECT_RETRY=1, MASTER_RETRY_COUNT=86400;
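Configure a hook in Orchestrator by adding the following to the orchestrator.conf.json file.
The idea: add a shell script to PostFailoverProcesses; Orchestrator runs it after a master switch succeeds. Other hooks can be configured in the same place to run scripts at different stages. Only the VIP-related hook is covered here.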
"/home/orch/orch_hook.sh {failureType} {failureClusterAlias} {failedHost} {successorHost} >> /tmp/orch.log"
If anything is unclear, see the config file above; the line can be copied straight from it.
a. The shell script therefore lives at /home/orch/orch_hook.sh
- #!/bin/bash
-
- isitdead=$1
- cluster=$2
- oldmaster=$3
- newmaster=$4
- mysqluser="orchestrator"
- export MYSQL_PWD="123456"
-
- logfile="/var/log/orch_hook.log"
-
- # list of clusternames
- #clusternames=(rep blea lajos)
-
- # clustername=( interface IP user Inter_IP)
- #rep=( ens32 "192.168.56.121" root "192.168.56.125")
-
- if [[ $isitdead == "DeadMaster" ]]; then
-
- array=( ens33 "192.168.88.200" root "192.168.88.129")
- interface=${array[0]}
- IP=${array[1]}
- user=${array[2]}
-
- if [ ! -z ${IP} ] ; then
-
- echo $(date)
- echo "Recovering from: $isitdead"
- echo "New master is: $newmaster"
- echo "/usr/local/orchestrator/orch_vip.sh -d 1 -n $newmaster -i ${interface} -I ${IP} -u ${user} -o $oldmaster" | tee $logfile
- /usr/local/orchestrator/orch_vip.sh -d 1 -n $newmaster -i ${interface} -I ${IP} -u ${user} -o $oldmaster
- #mysql -h$newmaster -u$mysqluser < /usr/local/bin/orch_event.sql
- else
-
- echo "Cluster does not exist!" | tee $logfile
-
- fi
- elif [[ $isitdead == "DeadIntermediateMasterWithSingleSlaveFailingToConnect" ]]; then
-
- array=( ens33 "192.168.88.200" root "192.168.88.129")
- interface=${array[0]}
- IP=${array[1]}
- user=${array[2]}
- slavehost=`echo $5 | cut -d":" -f1`
-
- echo $(date)
- echo "Recovering from: $isitdead"
- echo "New intermediate master is: $slavehost"
- echo "/usr/local/orchestrator/orch_vip.sh -d 1 -n $slavehost -i ${interface} -I ${IP} -u ${user} -o $oldmaster" | tee $logfile
- /usr/local/orchestrator/orch_vip.sh -d 1 -n $slavehost -i ${interface} -I ${IP} -u ${user} -o $oldmaster
-
-
- elif [[ $isitdead == "DeadIntermediateMaster" ]]; then
-
- array=( ens33 "192.168.88.200" root "192.168.88.129")
- interface=${array[0]}
- IP=${array[3]}
- user=${array[2]}
- slavehost=`echo $5 | sed -E "s/:[0-9]+//g" | sed -E "s/,/ /g"`
- showslave=`mysql -h$newmaster -u$mysqluser -sN -e "SHOW SLAVE HOSTS;" | awk '{print $2}'`
- newintermediatemaster=`echo $slavehost $showslave | tr ' ' '\n' | sort | uniq -d`
-
- echo $(date)
- echo "Recovering from: $isitdead"
- echo "New intermediate master is: $newintermediatemaster"
- echo "/usr/local/orchestrator/orch_vip.sh -d 1 -n $newintermediatemaster -i ${interface} -I ${IP} -u ${user} -o $oldmaster" | tee $logfile
- /usr/local/orchestrator/orch_vip.sh -d 1 -n $newintermediatemaster -i ${interface} -I ${IP} -u ${user} -o $oldmaster
-
- fi

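In the script above, edit the line array=( ens33 "192.168.88.200" root "192.168.88.129")
ens33 is the network interface; use the one on your own machine
192.168.88.200 is the VIP address
192.168.88.129 is the IP of the host where orchestrator is installed (the script's own comment names this fourth field Inter_IP, and the DeadIntermediateMaster branch uses it as the VIP)
root is the Linux account that has passwordless login
b. /usr/local/orchestrator/orch_vip.sh, the VIP drift script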
- [root@localhost orchestrator]# cat /usr/local/orchestrator/orch_vip.sh
- #!/bin/bash
-
- emailaddress="email@example.com"
- sendmail=0
-
- function usage {
- cat << EOF
- usage: $0 [-h] [-d master is dead] [-o old master ] [-s ssh options] [-n new master] [-i interface] [-I] [-u SSH user]
-
- OPTIONS:
- -h Show this message
- -o string Old master hostname or IP address
- -d int If master is dead should be 1 otherwise it is 0
- -s string SSH options
- -n string New master hostname or IP address
- -i string Interface example eth0:1
- -I string Virtual IP
- -u string SSH user
- EOF
-
- }
-
- while getopts ho:d:s:n:i:I:u: flag; do
- case $flag in
- o)
- orig_master="$OPTARG";
- ;;
- d)
- isitdead="${OPTARG}";
- ;;
- s)
- ssh_options="${OPTARG}";
- ;;
- n)
- new_master="$OPTARG";
- ;;
- i)
- interface="$OPTARG";
- ;;
- I)
- vip="$OPTARG";
- ;;
- u)
- ssh_user="$OPTARG";
- ;;
- h)
- usage;
- exit 0;
- ;;
- *)
- usage;
- exit 1;
- ;;
- esac
- done
-
-
- if [ $OPTIND -eq 1 ]; then
- echo "No options were passed";
- usage;
- fi
-
- shift $(( OPTIND - 1 ));
-
- # discover commands from our path
- ssh=$(which ssh)
- arping=$(which arping)
- ip2util=$(which ip)
-
- # command for adding our vip
- cmd_vip_add="sudo -n $ip2util address add ${vip} dev ${interface}"
- # command for deleting our vip
- cmd_vip_del="sudo -n $ip2util address del ${vip}/32 dev ${interface}"
- # command for discovering if our vip is enabled
- cmd_vip_chk="sudo -n $ip2util address show dev ${interface} to ${vip%/*}/32"
- # command for sending gratuitous arp to announce ip move
- cmd_arp_fix="sudo -n $arping -c 1 -I ${interface} ${vip%/*} "
- # command for sending gratuitous arp to announce ip move on current server
- cmd_local_arp_fix="sudo -n $arping -c 1 -I ${interface} ${vip%/*} "
-
- vip_stop() {
- rc=0
-
- # ensure the vip is removed
- $ssh ${ssh_options} -tt ${ssh_user}@${orig_master} \
- "[ -n \"\$(${cmd_vip_chk})\" ] && ${cmd_vip_del} && sudo ${ip2util} route flush cache || [ -z \"\$(${cmd_vip_chk})\" ]"
- rc=$?
- return $rc
- }
-
- vip_start() {
- rc=0
-
- # ensure the vip is added
- # this command should exit with failure if we are unable to add the vip
- # if the vip already exists always exit 0 (whether or not we added it)
- $ssh ${ssh_options} -tt ${ssh_user}@${new_master} \
- "[ -z \"\$(${cmd_vip_chk})\" ] && ${cmd_vip_add} && ${cmd_arp_fix} || [ -n \"\$(${cmd_vip_chk})\" ]"
- rc=$?
- $cmd_local_arp_fix
- return $rc
- }
-
- vip_status() {
- $arping -c 1 -I ${interface} ${vip%/*}
- if ping -c 1 -W 1 "$vip"; then
- return 0
- else
- return 1
- fi
- }
-
- if [[ $isitdead == 0 ]]; then
- echo "Online failover"
- if vip_stop; then
- if vip_start; then
- echo "$vip is moved to $new_master."
- if [ $sendmail -eq 1 ]; then mail -s "$vip is moved to $new_master." "$emailaddress" < /dev/null &> /dev/null ; fi
- else
- echo "Can't add $vip on $new_master!"
- if [ $sendmail -eq 1 ]; then mail -s "Can't add $vip on $new_master!" "$emailaddress" < /dev/null &> /dev/null ; fi
- exit 1
- fi
- else
- echo $rc
- echo "Can't remove the $vip from orig_master!"
- if [ $sendmail -eq 1 ]; then mail -s "Can't remove the $vip from orig_master!" "$emailaddress" < /dev/null &> /dev/null ; fi
- exit 1
- fi
-
-
- elif [[ $isitdead == 1 ]]; then
- echo "Master is dead, failover"
- # make sure the vip is not available
- if vip_status; then
- if vip_stop; then
- if [ $sendmail -eq 1 ]; then mail -s "$vip is removed from orig_master." "$emailaddress" < /dev/null &> /dev/null ; fi
- else
- if [ $sendmail -eq 1 ]; then mail -s "Couldn't remove $vip from orig_master." "$emailaddress" < /dev/null &> /dev/null ; fi
- exit 1
- fi
- fi
-
- if vip_start; then
- echo "$vip is moved to $new_master."
- if [ $sendmail -eq 1 ]; then mail -s "$vip is moved to $new_master." "$emailaddress" < /dev/null &> /dev/null ; fi
-
- else
- echo "Can't add $vip on $new_master!"
- if [ $sendmail -eq 1 ]; then mail -s "Can't add $vip on $new_master!" "$emailaddress" < /dev/null &> /dev/null ; fi
- exit 1
- fi
- else
- echo "Wrong argument, the master is dead or live?"
-
- fi

Remember to set the ownership and the execute permission on both files.
c. Test the VIP drift
Add the VIP on the master:
ip addr add 192.168.88.200 dev ens33
Then shut down MySQL on the master and check whether the VIP moves along with the master switch.
If something goes wrong, check these logs:
/var/log/orch_hook.log
/tmp/recovery.log
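Note: passwordless SSH login is mandatory, otherwise this cannot work; orch_vip.sh has to log in to the new master's Linux host to bind the VIP there.
4. Reference posts
Thanks to the authors below for their material!
## This article is really good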
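To see where the VIP currently lives, the interface can be inspected on each database host. The sketch below parses the one-line output of `ip -o -4 addr show`; the function name `has_vip` is hypothetical, and the interface name ens33 is the one used in this setup.

```shell
#!/bin/bash
# has_vip VIP INTERFACE
# Succeeds if VIP is configured on INTERFACE of the local host.
has_vip() {
    local vip="$1" interface="$2"
    # -o prints one line per address; match " <vip>/" so that
    # 192.168.88.20 does not match 192.168.88.200
    ip -o -4 addr show dev "$interface" | grep -qF " ${vip}/"
}

# Example, run on the old and the new master after the failover:
#   has_vip 192.168.88.200 ens33 && echo "VIP is here" || echo "VIP is not here"
```

After a successful failover the check should succeed on the new master only.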
https://www.cnblogs.com/chasetimeyang/p/15063858.html
https://www.cnblogs.com/dh17/articles/14790321.html
## These are very good too
https://blog.csdn.net/baijiu1/article/details/89395654
https://blog.csdn.net/du18020126395/article/details/115289099
https://www.cnblogs.com/zhoujinyi/p/10394389.html