1、java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
The Kafka process died and refused to restart. Checking the logs turned up this error; the cause was a full disk. Clearing out kafka-logs fixed it.
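A quick way to confirm this, as a minimal sketch (/tmp/kafka-logs is the default log.dirs location and may differ on your install):
df -h                      # check whether the data disk is full
du -sh /tmp/kafka-logs/*   # find the largest topic-partition directories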
2、Kafka's scheduled log cleanup does not take effect
Many documents online say to set log.retention.hours and similar parameters.
The default retention is 7 days, but in my tests the logs did not change at all.
The reason is that Kafka only reclaims closed log segments: the retention settings had no effect because the data was never rolled into a new segment, so nothing was eligible for deletion.
Adding log.roll.hours=12 to the configuration solves the problem (a new segment is rolled every 12 hours).
# flush to disk after this many messages
log.flush.interval.messages=10000
# flush at least once per second
log.flush.interval.ms=1000
# keep data for 24 hours
log.retention.hours=24
# roll a new segment every 12 hours so closed segments become eligible for deletion
log.roll.hours=12
# delete (rather than compact) expired segments
log.cleanup.policy=delete
# cap each partition at 5 GB
log.retention.bytes=5368709120
At a minimum, these settings need to be configured.
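Retention can also be overridden per topic to verify the behavior. A sketch, reusing the ZooKeeper address from issue 4 below and a hypothetical topic name my-topic (newer Kafka versions take --bootstrap-server instead of --zookeeper):
kafka-configs.sh --zookeeper 192.168.1.66:2181 --alter --entity-type topics --entity-name my-topic --add-config retention.ms=86400000,segment.ms=43200000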
3、Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner authentication information from the user
If you have confirmed that your configuration files are correct, first run:
kdestroy
kinit -kt /var/lib/keytab/kafka.keytab kafka
This clears the Kerberos ticket cache before reconnecting.
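This error usually means the client fell back to prompting for a password instead of reading the keytab. A minimal jaas.conf sketch, assuming the keytab path above and a placeholder realm EXAMPLE.COM (point the JVM at it with -Djava.security.auth.login.config=/path/to/jaas.conf):
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/var/lib/keytab/kafka.keytab"
  principal="kafka@EXAMPLE.COM";
};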
4、Failed to elect leader for partition crawer-contact-info-23 under strategy
Cause: the offset of a newly added replica was newer than the leader's, so leader election failed.
Fix:
In the bin directory under the Kafka home path, run the built-in preferred-replica election script:
kafka-preferred-replica-election.sh --zookeeper 192.168.1.66:2181
Then restart Kafka; the problem was resolved.
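On Kafka 2.4 and later this script was replaced by kafka-leader-election.sh. An equivalent invocation, assuming a broker listening on the placeholder address 192.168.1.66:9092:
kafka-leader-election.sh --bootstrap-server 192.168.1.66:9092 --election-type preferred --all-topic-partitions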
5、ZooKeeper went down, causing Kafka to report: There are 60 offline partitions.
Cause: the topic metadata Kafka had previously written to ZooKeeper was still there, so recreating the topics conflicted with the stale data and failed.
Go into ZooKeeper, delete the stale data, then restart Kafka.
# 1. Open the ZooKeeper CLI
sh /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/zookeeper/bin/zkCli.sh
# 2. Delete the stale data (note: this removes the metadata for every topic)
deleteall /brokers/topics
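It is worth inspecting what will be removed before deleting. A sketch (deleteall requires ZooKeeper 3.5+; older CLIs use rmr instead):
ls /brokers/topics
rmr /brokers/topics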
6、Kafka consumption error
Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
15:10:10.857 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] Failing OffsetCommit request since the consumer is not part of an active group
15:10:10.857 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] ERROR o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer - Consumer exception
java.lang.IllegalStateException: This error handler cannot process 'org.apache.kafka.clients.consumer.CommitFailedException's; no record information is available
at org.springframework.kafka.listener.SeekUtils.seekOrRecover(SeekUtils.java:151)
at org.springframework.kafka.listener.SeekToCurrentErrorHandler.handle(SeekToCurrentErrorHandler.java:103)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.handleConsumerException(KafkaMessageListenerContainer.java:1241)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1002)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.kafka.clients.consumer.CommitFailedException: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:1109)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:976)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1511)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.commitSync(KafkaMessageListenerContainer.java:2149)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.commitIfNecessary(KafkaMessageListenerContainer.java:2134)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.timedAcks(KafkaMessageListenerContainer.java:1981)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.processCommits(KafkaMessageListenerContainer.java:1961)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1036)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:970)
... 3 common frames omitted
15:10:10.858 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] Giving away all assigned partitions as lost since generation has been reset, indicating that consumer is no longer part of the group
15:10:10.858 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.ConsumerCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] Lost previously assigned partitions zxk-ann-5
15:10:10.858 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.s.k.l.KafkaMessageListenerContainer - info01: partitions lost: [zxk-ann-5]
15:10:10.858 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.s.k.l.KafkaMessageListenerContainer - info01: partitions revoked: [zxk-ann-5]
15:10:10.858 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] (Re-)joining group
15:10:10.871 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] Join group failed with org.apache.kafka.common.errors.MemberIdRequiredException: The group member needs to have a valid member id before actually entering a consumer group
15:10:10.871 logback [org.springframework.kafka.KafkaListenerEndpointContainer#0-2-C-1] INFO o.a.k.c.c.i.AbstractCoordinator - [Consumer clientId=consumer-info01-3, groupId=info01] (Re-)joining group
Cause analysis:
1. A surge in Kafka data volume: the program had not reported errors in the previous two days, and no problems had been observed before.
2. Kafka is co-installed with other components in one cluster, and its performance is being consumed by them.
3. Resolution: the data being consumed was annual-report data with very large message bodies. The per-poll batch size max.poll.records was too large, so processing one batch exceeded max.poll.interval.ms. After lowering max.poll.records to 50 (the default is 500), consumption returned to normal.
On the client, increase max.poll.interval.ms or decrease max.poll.records so that consuming one batch of messages does not exceed max.poll.interval.ms (see the sketch after the parameter notes below).
Parameter notes:
max.poll.interval.ms: the maximum delay between poll() calls when using consumer group management. This puts an upper bound on how long the consumer may be idle before fetching more records. If poll() is not called within this timeout, the consumer is considered failed and the group rebalances, reassigning its partitions to other members.
max.poll.records: the maximum number of records returned in a single call to poll().
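Putting the fix together, a minimal consumer-properties sketch (the 600000 ms value is an illustrative assumption; size it to your actual per-batch processing time):
max.poll.records=50
max.poll.interval.ms=600000
With Spring Kafka, as in the log above, the same values can be set in application.properties via spring.kafka.consumer.max-poll-records=50 and spring.kafka.consumer.properties.max.poll.interval.ms=600000.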