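Passwordless SSH login failed. In my case this happened because passwordless login was set up for a regular user, while the MHA files had been created as root. The fix is to make the regular user the owner of every MHA-related file and directory; after that, passwordless login succeeds. (It appears to be a permissions problem.)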
[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
Sun Jun 13 20:29:27 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] MHA::MasterMonitor version 0.58.
Sun Jun 13 20:29:29 2021 - [info] GTID failover mode = 0
Sun Jun 13 20:29:29 2021 - [info] Dead Servers:
Sun Jun 13 20:29:29 2021 - [info] Alive Servers:
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.91(192.168.72.91:3306)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.92(192.168.72.92:3306)
Sun Jun 13 20:29:29 2021 - [info] Alive Slaves:
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.91(192.168.72.91:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info] Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.92(192.168.72.92:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info] Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info] Current Alive Master: 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Checking slave configurations..
Sun Jun 13 20:29:29 2021 - [info] Checking replication filtering settings..
Sun Jun 13 20:29:29 2021 - [info] binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Sun Jun 13 20:29:29 2021 - [info] Replication filtering check ok.
Sun Jun 13 20:29:29 2021 - [info] GTID (with auto-pos) is not supported
Sun Jun 13 20:29:29 2021 - [info] Starting SSH connection tests..
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. SSH Configuration Check Failed!
at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 372.
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 20:29:31 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
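Troubleshooting approach:
1. According to the error message, passwordless SSH between the cluster nodes was not set up correctly. I checked carefully, and passwordless SSH worked on all of my machines. cat id_rsa.pub >> authorized_keys
2. Because I ran many commands through sudo and then switched to root, I changed the passwordless user to root.
Running the check command as root was then OK, with no problem.
3. Later I suspected that a permissions problem was what blocked passwordless login.
4. Following that idea, I changed the ownership of the files I had created.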
Note:
sudo chown -R hado:hado /etc/masterha_default.cnf
-------------------------------------
Note:
The mha directory is owned by root:
sudo chown -R hado:hado /etc/mha
Change it to be owned by hado.
------------------------------
This one is owned by root too:
sudo chown -R hado:hado /var/log/mha_manager
Change the ownership to hado.
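To confirm that no root-owned leftovers remain after the chown commands above, a small helper can list anything still owned by someone else. This is only a sketch: the function name not_owned_by is made up for illustration, and the user hado and the three paths are the ones used in these notes.

```shell
#!/usr/bin/env bash
# Print every file or directory under the given paths that is NOT owned
# by the expected user (name or numeric uid). Empty output means all good.
# Usage: not_owned_by USER PATH...
not_owned_by() {
    local user="$1"
    shift
    find "$@" ! -user "$user" -print
}

# Example with the paths from these notes (run on the MHA manager host):
# not_owned_by hado /etc/masterha_default.cnf /etc/mha /var/log/mha_manager
```

If the example prints nothing, every listed path and everything below it belongs to hado.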
Next, a "cannot read path" problem appeared:
[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
......
Sun Jun 13 20:38:30 2021 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000003
Sun Jun 13 20:38:30 2021 - [info] Connecting to hado@192.168.72.90(192.168.72.90:22)..
Failed to save binary log: readdir() attempted on invalid dirhandle $dir at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 271.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
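Sun Jun 13 20:38:30 2021 - [info] Got exit code 1 (Not master dead).
Note: this happens on 90, i.e. the master. The user hado cannot read the MySQL binary logs there, so hado has to be put in the same group as mysql.
sudo usermod -G mysql hado    (run on the master; in fact all three nodes need it. Note that -G without -a replaces hado's other supplementary groups; usermod -aG mysql hado appends instead.)
Then a "cannot open" problem appeared on a slave (the user still has to be in the mysql group, so run the command above there as well):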
Sun Jun 13 21:03:09 2021 - [info] Connecting to hado@192.168.72.91(192.168.72.91:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ...Could not open relay-log-info file /var/lib/mysql/relay-log.info.
at /usr/bin/apply_diff_relay_logs line 347.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln208] Slaves settings check failed!
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln416] Slave configuration failed.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 21:03:09 2021 - [info] Got exit code 1 (Not master dead).
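Both failures above come down to whether hado can read files under /var/lib/mysql on each node. A quick way to verify the group change is sketched below. The function name in_group is made up for illustration; hado and mysql are the names used in these notes.

```shell
#!/usr/bin/env bash
# Succeeds if GROUP is among the groups of USER
# (or of the current process when USER is omitted).
# Usage: in_group GROUP [USER]
in_group() {
    id -nG ${2:+"$2"} | tr ' ' '\n' | grep -qx -- "$1"
}

# Example, to be run on each MySQL node:
# in_group mysql hado && echo "hado is in the mysql group"
# test -r /var/lib/mysql/relay-log.info && echo "relay-log.info is readable"
```

With a user name, id reads the group database, so it reflects the usermod change immediately; a shell that was already open keeps its old group list until the next login.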
Then this insufficient-permission problem appeared:
Sun Jun 13 21:10:07 2021 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/log/mha_manager/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000004
Sun Jun 13 21:10:07 2021 - [info] Connecting to hado@192.168.72.90(192.168.72.90:22)..
Creating /var/log/mha_manager/app1 if not exists.. Creating directory /var/log/mha_manager/app1.. Failed to save binary log: failed to create dir:/var/log/mha_manager/app1:mkdir /var/log/mha_manager: Permission denied at /usr/share/perl5/vendor_perl/MHA/NodeUtil.pm line 36.
at /usr/bin/save_binary_logs line 132.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
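Sun Jun 13 21:10:07 2021 - [info] Got exit code 1 (Not master dead).
So the following changes are needed:
sudo mkdir -p /var/log/mha_manager    (create the directory first)
sudo chown -R hado:hado /var/log/mha_manager    (then give hado ownership of it)
[hado@amaster ~]$ sudo usermod -G mysql hado
Adding the day-to-day admin user to the mysql group lets it read the error log files; use this command to grant that access (needed on all 3 nodes).
This error appeared because I had added this (circled in red) to the configuration file /etc/mha/app1.cnf
As root this would be fine, but a regular user has no permission to create directories under /var/log.
Ran it again, and it was OK.
Later I rolled back to a snapshot and repeated the commands above, and found that one step can be skipped and the check still succeeds, namely these two commands. That was because I had left this out of the configuration:
- sudo mkdir -p /var/log/mha_manager    (create the directory first)
- sudo chown -R hado:hado /var/log/mha_manager    (then give hado ownership of it)
The next error is because the master's entry was missing from the MHA configuration file: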
Mon Jun 14 11:04:31 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Jun 14 11:04:31 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Jun 14 11:04:31 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Mon Jun 14 11:04:31 2021 - [info] MHA::MasterMonitor version 0.58.
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln671] Master 192.168.72.90:3306 from which slave 192.168.72.91(192.168.72.91:3306) replicates is not defined in the configuration file!
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 11:04:33 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
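The failure arose because the original master was started again while the MHA configuration did not contain its entry. After the switchover command ran, the standby master went back to being a slave, yet slave 2 kept pointing at the standby master.
From searching online: stopping one slave and swapping in a new MySQL slave can also cause this.
Here is the state after master 90 recovered (slave 92 was still pointing at slave 91):
Master 192.168.72.90(192.168.72.90:3306), dead
Master 192.168.72.91(192.168.72.91:3306), replicating from 192.168.72.90(192.168.72.90:3306), read-only
Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln726] Slave 192.168.72.92(192.168.72.92:3306) replicates from 192.168.72.91:3306, but real master is 192.168.72.90(192.168.72.90:3306)!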
Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 11:45:39 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
I tried running the command that points slave 92 at the master:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=1284;
ERROR 3021 (HY000): This operation cannot be performed with a running slave io thread; run STOP SLAVE IO_THREAD FOR CHANNEL '' first.
Following the hint:
1. STOP SLAVE IO_THREAD; // stop the IO thread
2. CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='000000',MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=1284;
3. START SLAVE IO_THREAD; // start the IO thread
Then the check reported that the IO thread on slave 91 was not running:
192.168.72.90(192.168.72.90:3306) (current master)
+--192.168.72.91(192.168.72.91:3306)
+--192.168.72.92(192.168.72.92:3306)
Mon Jun 14 12:35:50 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed!
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
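Mon Jun 14 12:35:50 2021 - [info] Got exit code 1 (Not master dead).
I tried running the reset command:
mysql> reset slave all;
ERROR 3081 (HY000): This operation cannot be performed with running replication threads; run STOP SLAVE FOR CHANNEL '' first
The fix was as follows.
First check the master's binary log (file name and position), then on the slave:
mysql> stop slave;    // stop the slave
Query OK, 0 rows affected (0.01 sec)
mysql> reset master;    // meant to reset the bound master; note that RESET MASTER actually clears this server's own binary logs, while RESET SLAVE is the statement that clears the replication settings
Query OK, 0 rows affected (0.02 sec)
mysql> CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=154;
Query OK, 0 rows affected, 2 warnings (0.02 sec)    // replicate from the master using the file and position it reported
mysql> start slave;    // restart the slave
Query OK, 0 rows affected (0.01 sec)
Then check the status:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Run the check command again on the monitoring host.
The check showed that 92 also had Slave_IO_Running: No, so:
mysql> STOP SLAVE IO_THREAD;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='000000',MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=154;
Query OK, 0 rows affected, 2 warnings (0.00 sec)
mysql> START SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)
These two attempts show that the key step is the CHANGE MASTER TO statement that points the slave at the master; everything else is just stopping and starting.
Then I ran the check command on the monitoring host once more (finally back to normal).
Then I started MHA and checked the master node information, and found: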
[hado@aproxy app1]$ nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha_manager/app1/manager.log 2>&1 &
[1] 4444
[hado@aproxy app1]$ masterha_check_status --conf=/etc/mha/app1.cnf
app1 monitoring program is now on initialization phase(10:INITIALIZING_MONITOR). Wait for a while and try checking again.
(Wait a moment and check again; this was actually because the failure had left two masters.) Checking again showed no problem:
[hado@aproxy app1]$ masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:4444) is running(0:PING_OK), master:192.168.72.90
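I ran another failover and recovered the original master; after running the replication command on slave 92 to point it at the master, a SQL error appeared: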
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 192.168.72.92(192.168.72.92:3306)! Errno:1007, Error:Error 'Can't create database 'UserTest'; database exists' on query. Default database: 'UserTest'. Query: 'create database UserTest'
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln671] Master 192.168.72.90:3306 from which slave 192.168.72.92(192.168.72.92:3306) replicates is not defined in the configuration file!
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
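Mon Jun 14 13:36:11 2021 - [info] Got exit code 1 (Not master dead).
The fix was as follows (the counter makes the slave skip the one failing event):
stop slave;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
start slave;
show slave status\G
Running the check command afterwards showed: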
Mon Jun 14 13:49:00 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed!
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 13:49:00 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
Checking the slave status on 91:
Master_Log_File: mysql-bin.000011
Read_Master_Log_Pos: 154
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 1035
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1007
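Last_Error: Error 'Can't create database 'UserTest'; database exists' on query. Default database: 'UserTest'. Query: 'create database UserTest'
The message says the database cannot be created because it already exists.
I then looked at the status of slave 92, which was healthy, and found one difference, as follows:
Then I ran the following command and everything went back to normal.
Running failover again, I found that when the master went down, 91, as the standby master, should have been promoted to master, but an IO problem appeared instead: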
Mon Jun 14 14:19:59 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed!
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 14:19:59 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
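On 91, I found Slave_IO_Running: Connecting
Fix: I restarted the MySQL service on the master, waited a moment, and it changed to Yes.
Recovering the original master, I hit another problem: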
Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size'
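I found the same problem described online, copied the fix, and it worked, so I am recording it here.
Normally, master-slave replication is only expected to synchronize data from that point onward; existing data can first be synchronized with a database sync tool, with replication set up afterwards.
The correct procedure is:
1. Open the master MySQL server and log in to mysql.
2. Run flush logs; // the master then creates a new binlog file
3. On the master run show master status\G; the output is as follows:
4. Switch to the slave and log in to mysql;
5. stop slave;
6. CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000015', MASTER_LOG_POS=154; // the file and pos here are the ones shown by the master above
7. start slave; // at this point it should work (both show Yes)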
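Copying the file name and position by hand between steps 3 and 6 is easy to get wrong. As a rough sketch, the statement for step 6 can be generated from the master's output. The function name change_master_from_status is made up for illustration, and the parsing assumes the usual "Name: value" layout that \G output uses.

```shell
#!/usr/bin/env bash
# Read "SHOW MASTER STATUS\G" output on stdin and print the matching
# CHANGE MASTER TO statement. Fails if File or Position is missing.
change_master_from_status() {
    awk -F': *' '
        { sub(/^[ \t]+/, "", $1) }          # drop the left padding of \G output
        $1 == "File"     { file = $2 }
        $1 == "Position" { pos  = $2 }
        END {
            if (file == "" || pos == "") exit 1
            # \047 is a single quote
            printf "CHANGE MASTER TO MASTER_LOG_FILE=\047%s\047, MASTER_LOG_POS=%s;\n", file, pos
        }'
}

# Example (run on the master; connection options omitted):
# mysql -e 'SHOW MASTER STATUS\G' | change_master_from_status
```

The printed statement still has to be run on the slave between stop slave and start slave, as in steps 5 to 7.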
Then I switched over once more, checked again, and found:
Mon Jun 14 15:34:05 2021 - [info] Checking replication health on 192.168.72.90..
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.90(192.168.72.90:3306)
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed!
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
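Mon Jun 14 15:34:05 2021 - [info] Got exit code 1 (Not master dead).
I looked on master 90 (the slave threads had been started on the master as well, so the check failed; strange that the master would also try to act as a slave):
stop slave
change master to master_host='192.168.72.90',master_user='hado',..............
start slave, but it reported the following error: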
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).
In other words, the slave I/O thread stopped because the master and the slave have the same server ID; the IDs must differ for replication to work.
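In these notes the CHANGE MASTER TO on 90 pointed at 192.168.72.90 itself, which is one way to end up with equal IDs. More generally, a duplicate can be spotted by collecting each node's server_id and looking for repeats. The sketch below assumes the host/ID pairs have already been gathered (for instance with mysql -h HOST -N -e 'SELECT @@server_id'); the function name duplicate_server_ids and the sample IDs are invented for illustration.

```shell
#!/usr/bin/env bash
# Read "host server_id" pairs on stdin and print each server_id shared by
# more than one host, followed by those hosts.
# Exit status is 0 if a duplicate was found, 1 otherwise (like grep).
duplicate_server_ids() {
    awk '
        NF >= 2 { hosts[$2] = hosts[$2] " " $1; count[$2]++ }
        END {
            for (id in count)
                if (count[id] > 1) { print id ":" hosts[id]; dup = 1 }
            exit (dup ? 0 : 1)
        }'
}

# Example with made-up IDs:
# printf '%s\n' '192.168.72.90 1' '192.168.72.91 2' '192.168.72.92 1' | duplicate_server_ids
```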
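Following the solutions found online, I did the following:
show variables like 'server_id';
set global server_id=2;
start slave;
The final solution was this:
In the end nothing I tried fixed it, so I changed approach: start MHA and take the original master down so that MHA fails over and promotes a slave to be the new master. First stop, then synchronize, then start.
When I ran the command on the original master that pointed it at the original master itself, the error below suddenly appeared (I had pointed it at the wrong host to begin with, which set off the whole series of errors; thinking back on it later, I noted it down here).
Note: after the original master is started and recovered, point it at the current new master so that the recovered original master becomes a slave.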
ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 cannot be executed because @@GLOBAL.GTID_MODE = OFF.
https://bugs.mysql.com/bug.php?id=70167
For reference, from that bug report:
mysql 5.6.13-log (root) [test]> change master to master_auto_position=0;
ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 can only be executed when @@GLOBAL.GTID_MODE = ON.
Suggested fix: it should be possible to set MASTER_AUTO_POSITION=0 even when the server is running with gtid_mode=OFF.
Mon Jun 14 23:54:04 2021 - [info] /var/log/mha_manager/scripts/master_ip_failover --command=status --ssh_user=hado --orig_master_host=192.168.72.91 --orig_master_ip=192.168.72.91 --orig_master_port=3306
"my" variable $vip masks earlier declaration in same scope at /var/log/mha_manager/scripts/master_ip_failover line 84.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln229] Failed to get master_ip_failover_script status with return code 255:0.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 23:54:04 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
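That is, a second "my $vip" declaration at line 84 of /var/log/mha_manager/scripts/master_ip_failover masks an earlier declaration in the same scope.
Commenting out the duplicate declaration is enough.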
[hado@aslave1 ~]$ sudo ifconfig eth1:1 192.168.72.200/24
SIOCSIFADDR: No such device
eth1:1: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Perhaps the network interface is not named "eth0" but something else. Running "ifconfig -a" in the virtual machine to list all interfaces showed that mine has no "eth0"; it is called "ens33".
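Checking the interface name before assigning the virtual IP avoids the "No such device" errors. The sketch below reads the names from /sys/class/net, which is where Linux exposes them; the function names list_ifaces and iface_exists are made up for illustration, and the optional directory argument exists only to make the helpers easy to try out.

```shell
#!/usr/bin/env bash
# Print the network interface names found under sysfs, one per line.
# Usage: list_ifaces [SYSFS_DIR]
list_ifaces() {
    local dir="${1:-/sys/class/net}"
    local path
    for path in "$dir"/*; do
        if [ -e "$path" ]; then
            basename "$path"
        fi
    done
}

# Succeeds only if the named interface exists.
# Usage: iface_exists NAME [SYSFS_DIR]
iface_exists() {
    [ -e "${2:-/sys/class/net}/$1" ]
}

# Example with the interface name and address from these notes:
# iface_exists ens33 && sudo ifconfig ens33:1 192.168.72.200/24
```

If the master_ip_failover script hard-codes an interface name, it presumably needs the same correction.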
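I configured a virtual IP so that the VIP floats to the new master on failure. The check then produced the output below, even though the master-slave topology itself was fine.
Cause: a double quote was missing in the configuration: master_binlog_dir="/var/lib/mysql"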
Wed Jun 16 20:46:42 2021 - [info] Connecting to hado@192.168.72.91(192.168.72.91:22)..
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jun 16 20:46:42 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
Then checking again produced this:
[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
Wed Jun 16 21:10:38 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Wed Jun 16 21:10:38 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Wed Jun 16 21:10:38 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Wed Jun 16 21:10:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. Failed to get IP address on host 192.168.72.91": Name or service not known
at /usr/share/perl5/vendor_perl/MHA/Config.pm line 63.
Wed Jun 16 21:10:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jun 16 21:10:39 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
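Could it not get the IP? The log shows a stray double quote after 192.168.72.91. I re-entered the IP information of the three hosts in /etc/mha/app1.cnf, checked again, and the problem was gone.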
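Both of the last two failures trace back to a stray or missing double quote in the configuration file. A simple scan for lines with an odd number of double quotes catches this kind of typo. This is a sketch, not an MHA feature: the function name check_quotes is made up, and it assumes each setting sits on a single line.

```shell
#!/usr/bin/env bash
# Print "LINE_NUMBER: text" for every line of FILE that contains an odd
# number of double quotes. Exit status is 0 only if no such line exists.
# Usage: check_quotes FILE
check_quotes() {
    awk '
        {
            line = $0
            n = gsub(/"/, "", line)     # gsub returns how many quotes it removed
            if (n % 2) { print NR ": " $0; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }' "$1"
}

# Example:
# check_quotes /etc/mha/app1.cnf || echo "unbalanced quotes found"
```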
I dropped a table, which led to this:
Thu Jun 17 07:04:42 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Jun 17 07:04:42 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Jun 17 07:04:42 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Jun 17 07:04:42 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 192.168.72.92(192.168.72.92:3306)! Errno:1051, Error:Error 'Unknown table 'UserTest.position'' on query. Default database: 'UserTest'. Query: 'DROP TABLE `position` /* generated by server */'
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jun 17 07:04:43 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
1. Run flush logs; // the master then creates a new binlog file
2. Then re-point all the slave nodes at the master.
Thu Jun 17 20:11:58 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Jun 17 20:11:58 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Jun 17 20:11:58 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Jun 17 20:11:58 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jun 17 20:12:00 2021 - [info] Got exit code 1 (Not master dead).MySQL Replication Health is NOT OK!
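How could there be no slave nodes? I then checked all three nodes and found that only the 92 entry remained; the other two were gone, probably as a result of last night's testing. I then re-pointed the other two at the master node.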