@hucong
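When canal.log fills up with errors such as "Connection reset by peer", the cause is usually that the binlog position recorded by canal differs from the binlog position MySQL actually reports.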
2019-01-15 10:52:20.941 [New I/O server worker #1-3] ERROR c.a.otter.canal.server.netty.handler.SessionHandler - something goes wrong with channel:[id: 0x2a1b7b6f, /149.129.68.40:48252 => /172.26.100.222:11111], exception=java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:322)
at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:281)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:201)
at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
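1. First stop the canal server with sh bin/stop.sh, then note the binlog position that canal has recorded. It is stored in the meta.dat file of the corresponding instance under canal's conf directory:
vim /usr/local/canal/conf/example/meta.dat
Find the binlog information: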
"journalName":"mysql-bin.000001","position":43581207,"
2. Note the binlog position of the MySQL node that the canal server reads from.
Open the MySQL command line:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 |  6399145 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.09 sec)
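Neither the file name nor the position matches. There are two ways to fix this:
In testing, the reset method resolved most problems. Note that reset master deletes all existing binlog files, so it is suited to test environments.
Binlog reset method, run in the MySQL command line: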
mysql> reset master;
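The comparison made in steps 1 and 2 can be scripted before deciding to reset anything. The following is a minimal Python sketch, not part of canal. It assumes meta.dat is JSON and searches it for the first object that carries both journalName and position (the two keys shown above). The layout of the sample string and the helper names are assumptions, and the MySQL values are copied by hand from show master status.

```python
import json

def find_binlog_position(node):
    """Depth-first search for the first dict holding both journalName and position."""
    if isinstance(node, dict):
        if "journalName" in node and "position" in node:
            return node["journalName"], int(node["position"])
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_binlog_position(child)
        if found is not None:
            return found
    return None

def positions_match(meta_text, master_file, master_position):
    """Compare the position stored by canal with the one reported by MySQL."""
    canal_position = find_binlog_position(json.loads(meta_text))
    return canal_position == (master_file, int(master_position))

# Assumed minimal layout; a real meta.dat contains more fields around the position.
sample_meta = '{"cursor": {"postion": {"journalName": "mysql-bin.000001", "position": 43581207}}}'
print(find_binlog_position(json.loads(sample_meta)))
print(positions_match(sample_meta, "mysql-bin.000002", 6399145))
```

With the values from this article the second call reports a mismatch, which is the situation described above.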
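1. Install and configure the JDK and Maven; see the article "Installing and configuring JDK 8 + Maven on CentOS 7.3".
2. Clone the canal-receiver code (if git is missing, install it first with yum install -y git)
# git clone https://gitee.com/xingcyun/canal-receiver.git
3. Build the code by running the following in the canal-receiver directory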
# mvn clean
# mvn install
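If both commands report success, the build succeeded and the program directory ./target has been generated.
4. Start the program by running the following in the ./target directory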
# java -jar canal-receiver.jar start 1
For running the program in the background, see the article "Viewing, stopping, and running background jobs on Linux".
###canal without MQ
Edit the instance.properties of the corresponding instance (example by default)
# vim conf/example/instance.properties
Change the following settings
#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=1234
# position info
canal.instance.master.address=127.0.0.1:3306
# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
canal.instance.defaultDatabaseName =material_1703
# table regex
canal.instance.filter.regex=material_1703.bi_bill
#################################################
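To see which tables a value of canal.instance.filter.regex selects, the matching can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it treats the value as comma-separated patterns that must match the whole schema.table name, and it uses Python's re module, whereas canal uses its own regex engine, so edge cases may differ. The function name matches_filter and the second pattern and table names in the example are hypothetical.

```python
import re

def matches_filter(filter_regex, schema, table):
    """Return True if schema.table fully matches any comma-separated pattern."""
    full_name = f"{schema}.{table}"
    patterns = [p.strip() for p in filter_regex.split(",") if p.strip()]
    return any(re.fullmatch(p, full_name) for p in patterns)

# The filter from the configuration above selects exactly one table.
print(matches_filter("material_1703.bi_bill", "material_1703", "bi_bill"))
print(matches_filter("material_1703.bi_bill", "material_1703", "bi_bill_detail"))

# Several patterns can be listed, separated by commas (second pattern is hypothetical).
print(matches_filter(r"material_1703\.bi_bill,material_1703\.bi_.*", "material_1703", "bi_order"))
```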
###canal + kafka configuration
#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=1234
# position info
canal.instance.master.address=127.0.0.1:3306
# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
canal.instance.defaultDatabaseName =material_1703
# table regex
canal.instance.filter.regex=material_1703.bi_bill
# mq config
canal.mq.topic=TopicReceiver
#################################################
Note: canal.mq.topic must match the topic that was actually created.
Then set the following in conf/canal.properties:
canal.zkServers =39.98.41.26:2181
# tcp, kafka, RocketMQ
canal.serverMode = kafka
canal.destinations = example
canal.mq.servers = 39.98.41.26:9092
canal.mq.retries = 0
canal.mq.batchSize = 16384
canal.mq.maxRequestSize = 1048576
canal.mq.lingerMs = 1
canal.mq.bufferMemory = 33554432
canal.mq.canalBatchSize = 50
canal.mq.canalGetTimeout = 100
canal.mq.flatMessage = true
canal.mq.compressionType = none
canal.mq.acks = all
Note: ZooKeeper listens on port 2181 and Kafka on port 9092.
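With canal.mq.flatMessage = true, each Kafka message is a JSON document rather than a protobuf payload. The sketch below shows how a consumer might unpack one such message in Python. It does not connect to Kafka; the SAMPLE payload is hypothetical (the columns id and amount are made up), and the function name summarize is an assumption. The field names data, old, database, table, type, and isDdl are those of canal's flat message format as far as I know; old is assumed to hold only the previous values of the columns that changed.

```python
import json

# Hypothetical flat message; the columns id and amount are made up for illustration.
SAMPLE = (
    '{"data":[{"id":"1","amount":"9.50"}],"database":"material_1703",'
    '"es":1547520740000,"id":1,"isDdl":false,"old":[{"amount":"8.00"}],'
    '"pkNames":["id"],"sql":"","table":"bi_bill","ts":1547520740941,"type":"UPDATE"}'
)

def summarize(raw):
    """Return one (type, schema.table, row, changed_columns) tuple per changed row."""
    msg = json.loads(raw)
    target = f'{msg.get("database")}.{msg.get("table")}'
    if msg.get("isDdl"):
        # DDL messages carry the statement in "sql" and no row data.
        return [(msg.get("type"), target, msg.get("sql"), [])]
    rows = msg.get("data") or []
    olds = msg.get("old") or []
    result = []
    for i, row in enumerate(rows):
        old = olds[i] if i < len(olds) else None
        changed = sorted(old) if old else []
        result.append((msg.get("type"), target, row, changed))
    return result

for change in summarize(SAMPLE):
    print(change)
```

In a real consumer the raw string would come from the topic configured in canal.mq.topic.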
###Common problems
Problem 1:
ERROR c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - dump address /192.168.1.50:3306 has an error, retrying. caused by
com.alibaba.otter.canal.parse.exception.CanalParseException: can't find start position for example
Cause: the position stored in meta.dat does not match the position in the database, so canal cannot pick up database changes.
Fix: delete meta.dat and restart canal.
In a cluster: connect to the ZooKeeper cluster used by canal, delete the node /otter/canal/destinations/xxxxx/1001/cursor, and restart canal to recover.
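For the single-node fix, a cautious variant is to move meta.dat aside instead of deleting it, so the old position can still be inspected later. The Python sketch below does that; renaming rather than deleting is a substitution made here, not what the fix above prescribes, and the function name set_aside_meta and the backup naming scheme are assumptions. Stop canal before running it and restart canal afterwards.

```python
import time
from pathlib import Path

def set_aside_meta(instance_dir, now=None):
    """Rename <instance_dir>/meta.dat to a timestamped backup; return the new path."""
    meta = Path(instance_dir) / "meta.dat"
    if not meta.exists():
        return None  # nothing to do, canal will pick a start position on its own
    stamp = time.strftime("%Y%m%d%H%M%S", time.localtime(now))
    backup = meta.with_name(f"meta.dat.bak.{stamp}")
    meta.rename(backup)
    return backup

if __name__ == "__main__":
    # Path taken from the article; adjust it to the actual instance directory.
    print(set_aside_meta("/usr/local/canal/conf/example"))
```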
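Problem 2:
java.lang.OutOfMemoryError: Java heap space
Cause: the canal consumer was down for too long, so the position stored in the ZooKeeper node
/otter/canal/destinations/test_db/1001/cursor
is very old. When canal restarts, it resumes consuming from that old position and the canal server runs out of memory.
Another symptom: when listening for database changes, only TransactionBegin/TransactionEnd events arrive and no row-data EventType is received.
A possible cause is canal.instance.filter.black.regex=.*\\..*; set canal.instance.filter.black.regex= (empty), restart, and check again.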
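Whether a stored cursor is dangerously old can be checked before restarting canal. The sketch below builds the cursor path used above and computes the age of a cursor from its JSON content. It is written under assumptions: the cursor node is taken to contain JSON with a timestamp field in milliseconds somewhere inside it, the sample string is hypothetical, and fetching the node from ZooKeeper (for example with zkCli) is left out. The helper names are made up.

```python
import json

def cursor_path(destination, client_id=1001):
    """Build the ZooKeeper path of a destination's cursor node."""
    return f"/otter/canal/destinations/{destination}/{client_id}/cursor"

def find_key(node, key):
    """Depth-first search for the first value stored under key."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None

def cursor_age_hours(cursor_json, now_ms):
    """Hours between the cursor's timestamp (assumed milliseconds) and now_ms."""
    timestamp = find_key(json.loads(cursor_json), "timestamp")
    if timestamp is None:
        return None
    return (now_ms - int(timestamp)) / 3_600_000

# Hypothetical cursor content, reduced to the fields used here.
sample = '{"postion": {"journalName": "mysql-bin.000001", "position": 4, "timestamp": 1547520740000}}'
print(cursor_path("test_db"))
print(cursor_age_hours(sample, 1547520740000 + 3 * 3_600_000))
```

In practice now_ms would be int(time.time() * 1000). If the cursor turns out to be days old, deleting the node as in Problem 1 avoids replaying the whole backlog.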
Problem 3:
ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:fdyb_db[com.alibaba.otter.canal.parse.exception.CanalParseException: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: parse row data failed.
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: com.google.common.collect.ComputationException: com.alibaba.otter.canal.parse.exception.CanalParseException: fetch failed by table meta:`mysql`.`pds_4490277`
Caused by: com.google.common.collect.ComputationException: com.alibaba.otter.canal.parse.exception.CanalParseException: fetch failed by table meta:`mysql`.`pds_4490277`
Caused by: com.alibaba.otter.canal.parse.exception.CanalParseException: fetch failed by table meta:`mysql`.`pds_4490277`
Caused by: java.io.IOException: ErrorPacket [errorNumber=1142, fieldCount=-1, message=SELECT command denied to user 'cy_canal'@'11.217.0.224' for table 'pds_4490277', sqlState=42000, sqlStateMarker=#]
with command: desc `mysql`.`pds_4490277`
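Analysis: tables in the mysql system schema require higher privileges. canal fails to read the metadata for that table while parsing its binlog events, so the position cannot advance.
Fix: add all tables under mysql to the blacklist: canal.instance.filter.black.regex = mysql\\..* . After this change the canal cluster recovers without a restart.
Another point to check: whether CanalConnector calls subscribe(filter). If it does, the filter must be the same as canal.instance.filter.regex in instance.properties; otherwise the filter passed to subscribe overrides the instance configuration. If the subscribe filter is .*\\..*, you are effectively consuming every change.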
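The interaction between the three filters can be modelled in a few lines. The Python sketch below is a simplified model of the behaviour described above, not canal's actual code: a non-empty subscribe filter replaces the instance filter, and the blacklist is applied regardless. Patterns are matched with Python's re module against the whole schema.table name, which may differ from canal's regex engine in edge cases. The function names are hypothetical.

```python
import re

def _any_match(patterns, name):
    """True if name fully matches any comma-separated pattern."""
    return any(re.fullmatch(p.strip(), name) for p in patterns.split(",") if p.strip())

def is_delivered(name, instance_filter, black_filter="", subscribe_filter=""):
    """Model: the subscribe filter overrides the instance filter; the blacklist always wins."""
    white = subscribe_filter or instance_filter
    if black_filter and _any_match(black_filter, name):
        return False
    return _any_match(white, name)

instance = r"material_1703\.bi_bill"

# Without subscribe(filter), only the configured table is delivered.
print(is_delivered("mysql.pds_4490277", instance))
# subscribe(".*\\..*") overrides the instance filter, so the system table is parsed too.
print(is_delivered("mysql.pds_4490277", instance, subscribe_filter=r".*\..*"))
# The blacklist from the fix excludes it again.
print(is_delivered("mysql.pds_4490277", instance, black_filter=r"mysql\..*", subscribe_filter=r".*\..*"))
```

Note that the patterns are written here with a single backslash because Python raw strings need no extra escaping, while the .properties file uses a double backslash.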
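Problem 4:
Symptom: after the database is modified, the canal application does not see the binlog events and the data is not consumed.
Diagnosis: 1. Check the logs of the canal server, the canal application, and the ZooKeeper server: no errors. 2. Check the MySQL and ES servers: no errors. 3. Check the configuration of the canal server and the canal application: the canal server's canal.properties is misconfigured.
Cause: canal.properties sets both canal.ip and canal.zkServers. If a canal server running in ZooKeeper cluster mode also has canal.ip set, connections go to the canal server by IP first, which disables the ZooKeeper functionality, and the position file is stored locally instead. Once the local position file goes wrong, nothing logs an error and the problem is very hard to track down.
Fix: leave canal.ip empty, stop the canal server and the canal application, delete the nodes in ZooKeeper, then restart the canal server and the canal application.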
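Because this misconfiguration produces no error log, a small check of canal.properties can save time. The sketch below parses a properties text in a simplified way (key=value lines, # and ! comments, no line continuations) and reports whether canal.ip and canal.zkServers are both set. The function names are hypothetical and the IP 10.0.0.5 in the sample is made up.

```python
def parse_properties(text):
    """Parse simple key=value lines; comments and blank lines are skipped."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

def ip_shadows_zookeeper(text):
    """True when canal.ip and canal.zkServers are both non-empty."""
    props = parse_properties(text)
    return bool(props.get("canal.ip")) and bool(props.get("canal.zkServers"))

# 10.0.0.5 is a made-up address; the zkServers value is the one used in this article.
sample = """
canal.ip = 10.0.0.5
canal.zkServers =39.98.41.26:2181
"""
print(ip_shadows_zookeeper(sample))
```

To check a real installation, pass the contents of conf/canal.properties, for example open("conf/canal.properties").read().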
Problem 5:
Client exception:
com.alibaba.otter.canal.protocol.exception.CanalClientException: java.net.ConnectException: Connection timed out: connect
at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.doConnect(SimpleCanalConnector.java:189)
at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.connect(SimpleCanalConnector.java:113)
at com.thon.util.BillSyncUtil.process(BillSyncUtil.java:111)
at com.thon.util.BillSyncUtil$2.run(BillSyncUtil.java:73)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection timed out: connect
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.doConnect(SimpleCanalConnector.java:148)
... 4 more
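Cause: the binlog information was modified without first stopping the canal server, which left the server in a faulty state, so the client could not connect.
Fix: restart the canal server.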
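After the restart, a quick TCP probe confirms that the canal server port accepts connections again before the client is started. This is a generic socket check written in Python, not a canal API; the function name can_connect is hypothetical, and 11111 is the canal server port that appears in the log at the top of this article.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    # Replace the host with the address of the canal server.
    print(can_connect("127.0.0.1", 11111))
```

A False result points to the server or the network rather than to the client code.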