
Problems I hit when running the MHA checks on my setup, with analysis and fixes


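The passwordless SSH check failed. In my case the cause was that passwordless SSH had been set up for a regular user, while the MHA files had been created as root. Changing the owner of every MHA-related file and directory to that regular user fixed it. (It appears to be a permissions problem.)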

[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
Sun Jun 13 20:29:27 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] MHA::MasterMonitor version 0.58.
Sun Jun 13 20:29:29 2021 - [info] GTID failover mode = 0
Sun Jun 13 20:29:29 2021 - [info] Dead Servers:
Sun Jun 13 20:29:29 2021 - [info] Alive Servers:
Sun Jun 13 20:29:29 2021 - [info]   192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info]   192.168.72.91(192.168.72.91:3306)
Sun Jun 13 20:29:29 2021 - [info]   192.168.72.92(192.168.72.92:3306)
Sun Jun 13 20:29:29 2021 - [info] Alive Slaves:
Sun Jun 13 20:29:29 2021 - [info]   192.168.72.91(192.168.72.91:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info]     Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info]   192.168.72.92(192.168.72.92:3306)  Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info]     Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info] Current Alive Master: 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Checking slave configurations..
Sun Jun 13 20:29:29 2021 - [info] Checking replication filtering settings..
Sun Jun 13 20:29:29 2021 - [info]  binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Sun Jun 13 20:29:29 2021 - [info]  Replication filtering check ok.
Sun Jun 13 20:29:29 2021 - [info] GTID (with auto-pos) is not supported
Sun Jun 13 20:29:29 2021 - [info] Starting SSH connection tests..
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. SSH Configuration Check Failed!

 at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 372.
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 20:29:31 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

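Troubleshooting steps:

1. The error says passwordless SSH login is not set up correctly in the cluster, yet a careful check showed that passwordless SSH worked on all of my machines (the public key had been appended with cat id_rsa.pub >> authorized_keys).

2. Because I had been running most commands through sudo, I switched to root and made root the passwordless SSH user.

Running the check as root then passed without any problem.

3. That made me suspect that file permissions were what blocked the check for the regular user.

4. Based on that idea, I changed the ownership of the files I had created.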
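Before changing ownership, it can help to list which MHA-related paths are not owned by the regular user. The function below is a minimal sketch of such an audit; the name not_owned_by is made up for illustration, and the paths in the usage comment are the ones that appear in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every file or directory under the given roots
# that is NOT owned by the expected user (name or numeric uid).
# It prints nothing when ownership is already correct.
not_owned_by() {
    local user="$1"
    shift
    # find's "! -user" test matches entries owned by anyone else.
    find "$@" ! -user "$user" -print
}

# Usage on the manager host (paths taken from this article):
#   not_owned_by hado /etc/masterha_default.cnf /etc/mha /var/log/mha_manager
```

Any path it prints is a candidate for the chown commands that follow.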

Note:

sudo chown -R hado:hado /etc/masterha_default.cnf

-------------------------------------

Note:

The mha directory is owned by root:

sudo chown -R hado:hado /etc/mha

This changes its owner to hado.

------------------------------

The log directory is also owned by root:

sudo chown -R hado:hado /var/log/mha_manager

This changes its owner to hado.

Next, a directory could not be read:

[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
...
Sun Jun 13 20:38:30 2021 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000003 
Sun Jun 13 20:38:30 2021 - [info]   Connecting to hado@192.168.72.90(192.168.72.90:22).
Failed to save binary log: readdir() attempted on invalid dirhandle $dir at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 271.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 20:38:30 2021 - [info] Got exit code 1 (Not master dead).

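Note: this happens on 90, the master, where hado cannot read the MySQL binary logs, so hado has to be added to the mysql group.

sudo usermod -aG mysql hado   (run on the master; in fact all three nodes need it. With -a the mysql group is appended, whereas plain -G would replace hado's other supplementary groups)

Then a file could not be opened (hado must be in the mysql group on this node as well, so run the command above there too):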

Sun Jun 13 21:03:09 2021 - [info]   Connecting to hado@192.168.72.91(192.168.72.91:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ...Could not open relay-log-info file /var/lib/mysql/relay-log.info.
 at /usr/bin/apply_diff_relay_logs line 347.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln208] Slaves settings check failed!
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln416] Slave configuration failed.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 21:03:09 2021 - [info] Got exit code 1 (Not master dead).

Then this permission-denied error appeared:

Sun Jun 13 21:10:07 2021 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/log/mha_manager/app1/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000004 
Sun Jun 13 21:10:07 2021 - [info]   Connecting to hado@192.168.72.90(192.168.72.90:22).. 
  Creating /var/log/mha_manager/app1 if not exists.. Creating directory /var/log/mha_manager/app1.. Failed to save binary log: failed to create dir:/var/log/mha_manager/app1:mkdir /var/log/mha_manager: Permission denied at /usr/share/perl5/vendor_perl/MHA/NodeUtil.pm line 36.
 at /usr/bin/save_binary_logs line 132.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 21:10:07 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 21:10:07 2021 - [info] Got exit code 1 (Not master dead).

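So the following changes are needed:

  1. sudo mkdir -p /var/log/mha_manager    (create the directory first)
  2. sudo chown -R hado:hado /var/log/mha_manager    (then hand it over to hado)
  3. [hado@amaster ~]$ sudo usermod -aG mysql hado
  4. Step 3 adds the day-to-day admin user to the mysql group so that it can read MySQL's log files; -a keeps the user's existing groups. Run it on all 3 nodes.

This error appeared because of a setting I had added to /etc/mha/app1.cnf (circled in red in the original screenshot). The logs show its effect: the working directory on the nodes moved from /var/tmp to /var/log/mha_manager/app1.

That is fine for root, but a regular user has no permission to create directories under /var/log.

After running the check again, everything was normal.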
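Since several of the failures above come down to hado not being in the mysql group on one of the nodes, a quick membership test can save a round trip through masterha_check_repl. This is a small sketch; in_group is a hypothetical helper name, and it only relies on the standard id command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if USER belongs to GROUP
# (as primary or supplementary group).
in_group() {
    local user="$1" group="$2"
    # "id -nG" prints all group names of the user on one line.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$group"
}

# Usage on each node (user and group names from this article):
#   in_group hado mysql || sudo usermod -aG mysql hado
```

Group changes apply to new login sessions, so an already open shell for hado will not see the new group until it logs in again.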

  

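Later I rolled back to a snapshot and went through the steps above again, and found that the check succeeded even without the following two commands, because that time I had left this setting out of the configuration:

  1. sudo mkdir -p /var/log/mha_manager    (create the directory first)
  2. sudo chown -R hado:hado /var/log/mha_manager    (then hand it over to hado)

The next error occurred because the master was missing from the MHA configuration file: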

Mon Jun 14 11:04:31 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Jun 14 11:04:31 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Jun 14 11:04:31 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Mon Jun 14 11:04:31 2021 - [info] MHA::MasterMonitor version 0.58.
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln671] Master 192.168.72.90:3306 from which slave 192.168.72.91(192.168.72.91:3306) replicates is not defined in the configuration file!
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 11:04:33 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 11:04:33 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

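The failure occurred because I restarted the original master while its entry was missing from the MHA configuration. (When masterha_manager is started with --remove_dead_master_conf, as it is later in this article, it removes the dead master's section from the configuration file after a failover.) After the switchover the candidate master went back to being a slave, and slave 2 kept pointing at the candidate master.

According to posts online, the same error also appears when a slave is stopped and replaced by a new MySQL slave.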
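The "is not defined in the configuration file" error can be checked for ahead of time by listing the hosts that the MHA configuration actually contains. The sketch below assumes the usual MHA layout of [serverN] sections with hostname= lines; the author's real app1.cnf is not shown in this article, and the function names are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting an MHA application config.

# Print the value of every "hostname=" line, one per line.
mha_hosts() {
    sed -n -e 's/^[[:space:]]*hostname[[:space:]]*=[[:space:]]*//p' "$1" |
        sed -e 's/[[:space:]]*$//'
}

# Succeed if HOST appears as a hostname value in CONF.
host_in_conf() {
    local host="$1" conf="$2"
    mha_hosts "$conf" | grep -qxF -- "$host"
}

# Usage (path and address from this article):
#   host_in_conf 192.168.72.90 /etc/mha/app1.cnf || echo "master 90 is missing"
```

If the current master is missing from the list, its section has to be added back before masterha_check_repl can pass.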

In this case, master 90 had come back, but slave 92 was still replicating from 91:

Master 192.168.72.90(192.168.72.90:3306), dead
Master 192.168.72.91(192.168.72.91:3306), replicating from 192.168.72.90(192.168.72.90:3306), read-only

Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln726] Slave 192.168.72.92(192.168.72.92:3306) replicates from 192.168.72.91:3306, but real master is 192.168.72.90(192.168.72.90:3306)!
Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 11:45:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 11:45:39 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

I tried to point slave 92 at the master:

mysql>   CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=1284;
ERROR 3021 (HY000): This operation cannot be performed with a running slave io thread; run STOP SLAVE IO_THREAD FOR CHANNEL '' first.

Following the hint in the error message:

  1. STOP SLAVE IO_THREAD;    -- stop the IO thread
  2. CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='000000',MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=1284;
  3. START SLAVE IO_THREAD;   -- start the IO thread again

The check then reported that the IO thread on slave 91 was not running:

192.168.72.90(192.168.72.90:3306) (current master)
 +--192.168.72.91(192.168.72.91:3306)
 +--192.168.72.92(192.168.72.92:3306)

Mon Jun 14 12:35:50 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526]  failed!
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 12:35:50 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 12:35:50 2021 - [info] Got exit code 1 (Not master dead).

I tried to reset the slave:

mysql> reset slave all;
ERROR 3081 (HY000): This operation cannot be performed with running replication threads; run STOP SLAVE FOR CHANNEL '' first

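The fix:

First look up the current binary log file and position on the master.

mysql> stop slave;     -- stop replication on the slave
Query OK, 0 rows affected (0.01 sec)
mysql> reset master;   -- note: this clears the slave's own binary logs; it does not reset the configured master (that would be RESET SLAVE)
Query OK, 0 rows affected (0.02 sec)
mysql> CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=154;
Query OK, 0 rows affected, 2 warnings (0.02 sec)   -- re-point replication using the file and position reported by the master
mysql> start slave;    -- start replication again
Query OK, 0 rows affected (0.01 sec)

Then check the status: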

  Slave_IO_Running: Yes
 Slave_SQL_Running: Yes

Run the check command again on the monitoring host.

It showed that 92 also had Slave_IO_Running: No, so I did the same there:

mysql> STOP SLAVE IO_THREAD;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> CHANGE MASTER TO MASTER_HOST='192.168.72.90',MASTER_USER='hado',MASTER_PASSWORD='000000',MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=154;
Query OK, 0 rows affected, 2 warnings (0.00 sec)
mysql> START SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)

These two rounds show that the key step is the CHANGE MASTER TO statement that points the slave at the master; everything else is just stopping and starting replication.
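Because the same three statements were typed by hand on each slave, a tiny generator keeps the file name and position consistent between nodes. This is only a sketch: repoint_sql is a hypothetical name, the arguments are the master host, replication user, password, binlog file and position, and the output still has to be piped into a mysql client on the slave.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the statements that re-point a slave at a master.
# Arguments: <master_host> <repl_user> <repl_password> <binlog_file> <binlog_pos>
repoint_sql() {
    local host="$1" user="$2" password="$3" file="$4" pos="$5"
    cat <<EOF
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='${host}', MASTER_USER='${user}', MASTER_PASSWORD='${password}', MASTER_LOG_FILE='${file}', MASTER_LOG_POS=${pos};
START SLAVE;
EOF
}

# Usage on a slave (values from this article; REPL_PASSWORD is a placeholder
# environment variable, not something defined by MySQL or MHA):
#   repoint_sql 192.168.72.90 hado "$REPL_PASSWORD" mysql-bin.000010 154 | mysql -uroot -p
```

The file and position must be the ones currently reported by the master; stale values lead to errors like the ones described further down.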

Running the check on the monitoring host once more finally reported a healthy state.

I then started MHA and checked the master node information:

[hado@aproxy app1]$ nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha_manager/app1/manager.log 2>&1 &
[1] 4444
[hado@aproxy app1]$ masterha_check_status --conf=/etc/mha/app1.cnf
app1 monitoring program is now on initialization phase(10:INITIALIZING_MONITOR). Wait for a while and try checking again.
(Wait a moment and check again. The delay was actually because the failure had left two masters.)

Checking again, there was no problem:

[hado@aproxy app1]$ masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:4444) is running(0:PING_OK), master:192.168.72.90

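I performed another failover. After bringing the original master back, I ran the CHANGE MASTER TO statement on slave 92, and the SQL thread failed: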

Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 192.168.72.92(192.168.72.92:3306)! Errno:1007, Error:Error 'Can't create database 'UserTest'; database exists' on query. Default database: 'UserTest'. Query: 'create database UserTest'
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln671] Master 192.168.72.90:3306 from which slave 192.168.72.92(192.168.72.92:3306) replicates is not defined in the configuration file!
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Mon Jun 14 13:36:11 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 13:36:11 2021 - [info] Got exit code 1 (Not master dead).

The fix:

stop slave;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
start slave;
show slave status\G

Running the check command again then showed:


Mon Jun 14 13:49:00 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526]  failed!
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 13:49:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 13:49:00 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

Checking the slave status on 91:

  Master_Log_File: mysql-bin.000011
  Read_Master_Log_Pos: 154
  Relay_Log_File: mysql-relay-bin.000002
  Relay_Log_Pos: 1035
  Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: No

              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1007
                   Last_Error: Error 'Can't create database 'UserTest'; database exists' on query. Default database: 'UserTest'. Query: 'create database UserTest'

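It says the database cannot be created because it already exists.

I then looked at the status of slave 92, which was healthy, and found one difference.

After running the corresponding commands, replication returned to normal.

On the next failover test the master went down and 91, as the candidate master, should have been promoted to master, but an IO problem appeared instead: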

Mon Jun 14 14:19:59 2021 - [info] Checking replication health on 192.168.72.91..
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.91(192.168.72.91:3306)
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526]  failed!
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 14:19:59 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 14:19:59 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

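On 91 I found Slave_IO_Running: Connecting.

Fix: I restarted the MySQL service on the master, waited a moment, and the value changed to Yes.

After restoring the original master, another problem appeared: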

 Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from position > file size'

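I found the same problem described online, copied the solution, and it worked, so I am recording it here.

Master-slave replication is normally only expected to keep data in sync from the moment it is set up; older data can be copied over first with a database synchronization tool, and replication configured afterwards.

The correct procedure is:

1. Log in to MySQL on the master.

2. Run flush logs; the master then starts a new binlog file.

3. Run show master status\G on the master and note the File and Position values.

4. Switch to the slave and log in to MySQL.

5. stop slave;

6. CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000015', MASTER_LOG_POS=154;   -- the file and position are the ones shown by the master in step 3

7. start slave;   -- replication should now work (both threads show Yes)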
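Copying the file name and position by eye is where the "position > file size" error tends to come from, so the two values can be pulled out of the show master status\G output mechanically. The sketch below assumes the standard vertical output format with File: and Position: lines; master_coords is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "SHOW MASTER STATUS\G" output on stdin and
# print "<binlog file> <position>". Fails if either value is missing.
master_coords() {
    awk -F': *' '
        /^ *File:/     { file = $2 }
        /^ *Position:/ { pos  = $2 }
        END {
            if (file != "" && pos != "") print file, pos
            else exit 1
        }
    '
}

# Usage on the master (credentials are up to the reader):
#   mysql -uroot -p -e 'SHOW MASTER STATUS\G' | master_coords
```

The two values it prints are what go into MASTER_LOG_FILE and MASTER_LOG_POS in step 6.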

Then I switched over again, ran the check once more, and found:
Mon Jun 14 15:34:05 2021 - [info] Checking replication health on 192.168.72.90..
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.72.90(192.168.72.90:3306)
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526]  failed!
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.
Mon Jun 14 15:34:05 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 15:34:05 2021 - [info] Got exit code 1 (Not master dead). 

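I checked on host 90. (Replication had been started on the master as well, which is why the check failed. It is odd for the master to take on the slave role too.)

stop slave

change master   to  master_host='192.168.72.90',master_user='hado',..............

start slave, which reported the following error: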

 Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).

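In short: the slave IO thread stopped because the master and the slave have the same MySQL server id, and the ids must differ for replication to work. Here master_host was 192.168.72.90 and the statement was run on 90 itself, so the server was trying to replicate from itself.

Following suggestions found online, I tried:

show variables like 'server_id';
set global server_id=2;
start slave ;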
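To rule out a genuine server_id clash between nodes (as opposed to a server pointed at itself, which is what happened here), the ids of all nodes can be collected and checked for duplicates. This is a rough sketch; duplicate_server_ids is an invented name and the input format, one "host id" pair per line, is an assumption of the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stdin carries one "<host> <server_id>" pair per line;
# print every server_id that is used by more than one host.
duplicate_server_ids() {
    awk '{ count[$2]++ }
         END { for (id in count) if (count[id] > 1) print id }' | sort -n
}

# Usage from the manager host (addresses from this article; supply credentials as needed):
#   for h in 192.168.72.90 192.168.72.91 192.168.72.92; do
#       echo "$h $(mysql -h "$h" -N -e 'SELECT @@server_id')"
#   done | duplicate_server_ids
```

Empty output means the ids are unique. Note that set global server_id only lasts until the next restart of mysqld; a permanent change belongs in my.cnf.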

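The final solution:

Nothing I tried fixed it, so in the end I took a different approach: start MHA and take the original master down, letting MHA fail over and promote a slave to master. Then, on the node that has to follow the new master: stop slave, re-point replication, start slave.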

  

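While pointing replication on the original master at the original master itself, the error below suddenly appeared. (Pointing it at the wrong host in the first place is what triggered the whole chain of errors; I only realized this later and noted it down here.)

Note: once the original master is back up, point it at the current new master, so that the recovered host rejoins as a slave.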

ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 cannot be executed because @@GLOBAL.GTID_MODE = OFF.

Reference (MySQL bug report): https://bugs.mysql.com/bug.php?id=70167

mysql 5.6.13-log (root) [test]> change master to master_auto_position=0;
ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 can only be executed when @@GLOBAL.GTID_MODE = ON.
Suggested fix:
It should be possible to set MASTER_AUTO_POSITION=0 even if the server is running with gtid_mode=OFF.

Mon Jun 14 23:54:04 2021 - [info]   /var/log/mha_manager/scripts/master_ip_failover --command=status --ssh_user=hado --orig_master_host=192.168.72.91 --orig_master_ip=192.168.72.91 --orig_master_port=3306 
"my" variable $vip masks earlier declaration in same scope at /var/log/mha_manager/scripts/master_ip_failover line 84.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln229]  Failed to get master_ip_failover_script status with return code 255:0.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
Mon Jun 14 23:54:04 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Mon Jun 14 23:54:04 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

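In other words, the "my" variable $vip at line 84 of /var/log/mha_manager/scripts/master_ip_failover masks an earlier declaration in the same scope.

Commenting out the duplicate declaration is enough.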
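A quick way to find such duplicates before running the check is to count the declarations in the script. The sketch below is deliberately simple: count_my_decls is a hypothetical name, and it only catches declarations of the form my $name at the start of a line, not list forms such as my ($a, $b).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count lines in FILE that start with "my $NAME".
# NAME is given without the "$" sigil.
count_my_decls() {
    local name="$1" file="$2"
    # \$ matches a literal dollar sign; the trailing group stops "vip"
    # from also matching a longer name such as "vip_backup".
    local pattern='^[[:space:]]*my[[:space:]]+\$'"$name"'([^[:alnum:]_]|$)'
    grep -cE -- "$pattern" "$file"
}

# Usage (script path from this article):
#   count_my_decls vip /var/log/mha_manager/scripts/master_ip_failover
```

A count above 1 in the same scope is what produces the "masks earlier declaration" message.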

[hado@aslave1 ~]$ sudo  ifconfig eth1:1 192.168.72.200/24
SIOCSIFADDR: No such device
eth1:1: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device

The network interface probably has a different name than the one used in the command. Running ifconfig -a in the virtual machine lists all interfaces with their parameters, and it showed that mine is not called eth0 but ens33.
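The interface name can also be verified in a script before the VIP is assigned. On Linux each interface has an entry under /sys/class/net, which the sketch below relies on; iface_exists is a hypothetical helper, and its optional second argument exists only so the directory can be swapped out for testing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the network interface exists.
# An alias label such as "ens33:1" is reduced to its base name "ens33".
iface_exists() {
    local name="${1%%:*}"
    local root="${2:-/sys/class/net}"
    [ -e "${root}/${name}" ]
}

# Usage (address from this article):
#   iface_exists ens33:1 && sudo ifconfig ens33:1 192.168.72.200/24
```

The same name has to be used inside the master_ip_failover script, otherwise the VIP cannot be brought up during a failover.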

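After I configured a virtual IP so that the VIP moves to the new master on failover, the check produced the error below, even though the master-slave topology itself was healthy.

Cause: a double quote was missing in the configuration. The line should read    master_binlog_dir="/var/lib/mysql"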

Wed Jun 16 20:46:42 2021 - [info]   Connecting to hado@192.168.72.91(192.168.72.91:22).. 
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48.
Wed Jun 16 20:46:42 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jun 16 20:46:42 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

Checking again then produced this:

[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf                 
Wed Jun 16 21:10:38 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Wed Jun 16 21:10:38 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Wed Jun 16 21:10:38 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Wed Jun 16 21:10:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. Failed to get IP address on host 192.168.72.91": Name or service not known
 at /usr/share/perl5/vendor_perl/MHA/Config.pm line 63.
Wed Jun 16 21:10:39 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jun 16 21:10:39 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

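So it could not resolve the IP address. The message shows a stray double quote after 192.168.72.91, which means the hostname value itself was malformed. I re-entered the IP addresses of all three hosts in /etc/mha/app1.cnf, ran the check again, and the problem was gone.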
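Both of the last two errors came from a stray or missing double quote in the configuration file, which is easy to scan for. The function below is a minimal sketch, with unbalanced_quotes as a made-up name: it flags every line that contains an odd number of double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<line number>: <line>" for every line of FILE
# that contains an odd number of double-quote characters.
unbalanced_quotes() {
    # gsub returns how many quotes it found; the replacement leaves the line unchanged.
    awk '{ n = gsub(/"/, "\""); if (n % 2 == 1) print NR ": " $0 }' "$1"
}

# Usage (path from this article):
#   unbalanced_quotes /etc/mha/app1.cnf
```

No output means every quoted value in the file is properly closed.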

I dropped a table, which led to this:

Thu Jun 17 07:04:42 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Jun 17 07:04:42 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Jun 17 07:04:42 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Jun 17 07:04:42 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln935] SQL Thread is stopped(error) on 192.168.72.92(192.168.72.92:3306)! Errno:1051, Error:Error 'Unknown table 'UserTest.position'' on query. Default database: 'UserTest'. Query: 'DROP TABLE `position` /* generated by server */'
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jun 17 07:04:43 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jun 17 07:04:43 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

1. Run flush logs; on the master, which makes it start a new binlog file.

2. Then re-point all slave nodes at the master.

Thu Jun 17 20:11:58 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Jun 17 20:11:58 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Jun 17 20:11:58 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Jun 17 20:11:58 2021 - [info] MHA::MasterMonitor version 0.58.
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu Jun 17 20:12:00 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Jun 17 20:12:00 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

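How could there be no slaves? I went through all three nodes and found that only the 92 IP was still there; the other two were gone, probably as a result of last night's testing. I fixed it by pointing the other two back at the master.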

This article was contributed by a site user. When reposting, please credit the source: https://www.wpsshop.cn/w/羊村懒王/article/detail/160419