
Burrow: A Kafka Consumer Lag Monitoring Tool


Kafka itself does not offer a good way to monitor consumer lag. Although the Kafka broker ships with scripts such as kafka-topics.sh, kafka-consumer-groups.sh, and kafka-console-consumer.sh, collecting lag data with these scripts is very unreliable on large production clusters.


Introduction to Burrow

Burrow is actively developed by the Data Infrastructure Streaming SRE team at LinkedIn. It is written in Go, released under the Apache License, and hosted in the Burrow repository on GitHub.

It collects information about the consumer groups in a cluster and computes a single status for each group, telling us whether the group is running normally, falling behind, slowing down, or has stopped working, and in this way monitors consumer health. It does not require thresholds derived from group progress, although users can still obtain the number of messages a group is lagging behind.

Burrow's Design

(Figure: Burrow architecture diagram)
Burrow automatically monitors all consumers and every partition they consume. It obtains consumer offsets by consuming the special internal Kafka topic (__consumer_offsets). Burrow then provides this consumer information as a centralized service, separate from any individual consumer. Consumer status is determined by evaluating the consumer's behavior over a sliding window.

This information is broken down into a status for each partition and then rolled up into a single status for the consumer. The status can be OK, WARNING (the consumer is working but falling behind on messages), or ERROR (the consumer has stopped consuming or is offline). The status can be fetched from Burrow with a simple HTTP request, or Burrow can check it periodically and push notifications out via email or to a separate HTTP endpoint (for example, a monitoring or alerting system).

Burrow monitors how far consumers lag behind when consuming messages, and therefore the health of the applications, and it can monitor multiple Kafka clusters at the same time. The HTTP service for retrieving information about Kafka clusters and consumers is separate from the lag status, which makes it very useful for applications that help manage Kafka clusters in environments where a Java Kafka client cannot be run.

Burrow Installation and Versions

Burrow is developed in Go, and version v1.1 has already been released.
Burrow is also available as a Docker image.

Burrow_1.2.2_checksums.txt              297 Bytes

Burrow_1.2.2_darwin_amd64.tar.gz        4.25 MB

Burrow_1.1.0_linux_amd64.tar.gz       3.22 MB (CentOS 6)

Burrow_1.2.2_linux_amd64.tar.gz        4.31 MB (CentOS 7 Require GLIBC >= 2.14)

Burrow_1.2.2_windows_amd64.tar.gz        4 MB

Source code (zip)

Source code (tar.gz)
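As noted above, the v1.2.2 linux_amd64 build requires glibc 2.14 or newer (CentOS 7 ships glibc 2.17; CentOS 6 ships an older glibc, which is presumably why a separate 1.1.0 package is listed for it). Before choosing a package, the host's glibc version can be checked with standard tooling, nothing Burrow-specific:

  # getconf GNU_LIBC_VERSION
  # ldd --version | head -n 1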

This release contains several important fixes for issues found in the initial 1.0.0 release, including:

  • Support for Kafka 1.0 and newer versions (#306)
  • Fix for ZooKeeper watch handling (#328)

There are also a few minor feature updates:

  • Store a ring of recent broker offsets to avoid false alerts on stalled partitions
  • Add a configurable notification interval
  • Add support for configuration via environment variables
  • Support a configurable queue depth in the storage module
Changelog - version 1.2

  • [d244fce922] - Bump sarama to 1.20.1 (Vlad Gorodetsky)
  • [793430d249] - Golang 1.9.x is no longer supported (Vlad Gorodetsky)
  • [735fcb7c82] - Replace deprecated megacheck with staticcheck (Vlad Gorodetsky)
  • [3d49b2588b] - Link the README to the Compose file in the project (Jordan Moore)
  • [3a59b36d94] - Tests fixed (Mikhail Chugunkov)
  • [6684c5e4db] - Added unit test for v3 value decoding (Mikhail Chugunkov)
  • [10d4dc39eb] - Added v3 messages protocol support (Mikhail Chugunkov)
  • [d6b075b781] - Replace deprecated MAINTAINER directive with a label (Vlad Gorodetsky)
  • [52606499a6] - Refactor parseKafkaVersion to reduce method complexity (gocyclo) (Vlad Gorodetsky)
  • [b0440f9dea] - Add gcc to build zstd (Vlad Gorodetsky)
  • [6898a8de26] - Add libc-dev to build zstd (Vlad Gorodetsky)
  • [b81089aada] - Add support for Kafka 2.1.0 (Vlad Gorodetsky)
  • [cb004f9405] - Build with Go 1.11 (Vlad Gorodetsky)
  • [679a95fb38] - Fix golint import path (golint fixer)
  • [f88bb7d3a8] - Update docker-compose Readme section with working url. (Daniel Wojda)
  • [3f888cdb2d] - Upgrade sarama to support Kafka 2.0.0 (#440) (daniel)
  • [1150f6fef9] - Support linux/arm64 using Dup3() instead of Dup2() (Mpampis Kostas)
  • [1b65b4b2f2] - Add support for Kafka 1.1.0 (#403) (Vlad Gorodetsky)
  • [74b309fc8d] - code coverage for newly added lines (Clemens Valiente)
  • [279c75375c] - accidentally reverted this (Clemens Valiente)
  • [192878c69c] - gofmt (Clemens Valiente)
  • [33bc8defcd] - make first regex test case a proper match everything (Clemens Valiente)
  • [279b256b27] - only set whitelist / blacklist if it's not empty string (Clemens Valiente)
  • [b48d30d18c] - naming (Clemens Valiente)
  • [7d6c6ccb03] - variable naming (Clemens Valiente)
  • [4e051e973f] - add tests (Clemens Valiente)
  • [545bec66d0] - add blacklist for memory store (Clemens Valiente)
  • [07af26d2f1] - Updated burrow endpoint in README : #401 (Ratish Ravindran)
  • [fecab1ea88] - pass custom headers to http notifications. (#357) (vixns)
Changelog - version 1.1

  • fecab1e pass custom headers to http notifications. (#357)
  • 7c0b8b1 Add minimum-complete config for the evaluator (#388)
  • dc4cb84 Fix mail template (#369)
  • e2216d7 Fetch goreleaser via curl instead of 'go get' as compilation only works in 1.10 (#387)
  • f3659d1 Add a send-interval configuration parameter (#364)
  • 3e488a2 Allow env vars to be used for configuration (#363)
  • b7428c9 Fix typo in slack close (#361)
  • 5b546cc Create the broker offset rings earlier (#360)
  • 61f097a Metadata refresh on detecting a deleted topic must not be for that topic (#359)
  • b890885 Make inmemory module request channel's size configurable (#352)
  • 9911709 Update sarama to support 10.2.1 too. (#345)
  • a1bdcde Adjusting docker build to be self-contained (#344)
  • a91cf4d Fix an incorrect cast from #338 and add a test to cover it (#340)
  • 389ef47 Store broker offset history (#338)
  • 1a60efe Fix alert closing (#334)
  • b75a6f3 Fix typo in Cluster reference
  • cacf05e Reject offsets that are older than the group expiration time (#330)
  • b6184ff Fix typo in the config checked for TLS no-verify #316 (#329)
  • 3b765ea Sync Gopkg.lock with Gopkg.toml (#312)
  • e47ec4c Fix ZK watch problem (#328)
  • 846d785 Assume backward-compatible consumer protocol version (fix #313) (#327)
  • e3a1493 Update sarama to support Kafka 1.0.0 (#306)
  • 946a425 Fixing requests for StorageFetchConsumersForTopic (#310)
  • 52e3e5d Update burrow.toml (#300)
  • 3a4372f Upgrade sarama dependency to support Kafka 0.11.0 (#297)
  • 8993eb7 Fix goreleaser condition (#299)
  • d088c99 Add gitter webhook to travis config (#296)
  • 08e9328 Merge branch 'gitter-badger-gitter-badge'
  • 76db0a9 Fix positioning
  • dddd0ea Add Gitter badge
Burrow can be installed either by compiling from source or by using the official binary packages.

Using the binary package is the recommended approach here.

Burrow keeps no local state; it is a CPU-intensive and network-I/O-intensive application.

Installation

  # wget https://github.com/linkedin/Burrow/releases/download/v1.1.0/Burrow_1.1.0_linux_amd64.tar.gz
  # mkdir burrow
  # tar -xf Burrow_1.1.0_linux_amd64.tar.gz -C burrow
  # cp burrow/burrow /usr/bin/
  # mkdir /etc/burrow
  # cp burrow/config/* /etc/burrow/
  # chkconfig --add burrow
  # /etc/init.d/burrow start
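After the service is started (the chkconfig and /etc/init.d/burrow steps assume the startup script shown later in this article has already been installed as /etc/init.d/burrow), a quick smoke test confirms the HTTP API is answering. The port 8000 and the cluster name production below come from the configuration file in the next section; the first request lists the configured clusters, the second lists the consumer groups in a cluster:

  # curl -s http://localhost:8000/v3/kafka
  # curl -s http://localhost:8000/v3/kafka/production/consumer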

Configuration File

  [general]
  pidfile="/var/run/burrow.pid"
  stdout-logfile="/var/log/burrow.log"
  access-control-allow-origin="mysite.example.com"

  [logging]
  filename="/var/log/burrow.log"
  level="info"
  maxsize=512
  maxbackups=30
  maxage=10
  use-localtime=true
  use-compression=true

  [zookeeper]
  servers=[ "test1.localhost:2181", "test2.localhost:2181" ]
  timeout=6
  root-path="/burrow"

  [client-profile.prod]
  client-id="burrow-lagchecker"
  kafka-version="0.10.0"

  [cluster.production]
  class-name="kafka"
  servers=[ "test1.localhost:9092", "test2.localhost:9092" ]
  client-profile="prod"
  topic-refresh=180
  offset-refresh=30

  [consumer.production_kafka]
  class-name="kafka"
  cluster="production"
  servers=[ "test1.localhost:9092", "test2.localhost:9092" ]
  client-profile="prod"
  start-latest=false
  group-blacklist="^(console-consumer-|python-kafka-consumer-|quick-|test).*$"
  group-whitelist=""

  [consumer.production_consumer_zk]
  class-name="kafka_zk"
  cluster="production"
  servers=[ "test1.localhost:2181", "test2.localhost:2181" ]
  #zookeeper-path="/"
  # If specified, this is the root of the Kafka cluster metadata in the Zookeeper ensemble. If not specified, the root path is used.
  zookeeper-timeout=30
  group-blacklist="^(console-consumer-|python-kafka-consumer-|quick-|test).*$"
  group-whitelist=""

  [httpserver.default]
  address=":8000"

  [storage.default]
  class-name="inmemory"
  workers=20
  intervals=15
  expire-group=604800
  min-distance=1

  #[notifier.default]
  #class-name="http"
  #url-open="http://127.0.0.1:1467/v1/event"
  #interval=60
  #timeout=5
  #keepalive=30
  #extras={ api_key="REDACTED", app="burrow", tier="STG", fabric="mydc" }
  #template-open="/etc/burrow/default-http-post.tmpl"
  #template-close="/etc/burrow/default-http-delete.tmpl"
  #method-close="DELETE"
  #send-close=false
  ##send-close=true
  #threshold=1
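The group-blacklist regular expression in the consumer sections above keeps throwaway groups (console consumers, test clients, and so on) from generating alerts. A quick sanity check of the pattern with plain grep, using a made-up auto-generated console consumer group name as input:

  # echo "console-consumer-81245" | grep -E '^(console-consumer-|python-kafka-consumer-|quick-|test).*$'
  console-consumer-81245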

Startup script:

  #!/bin/bash
  #
  # Comments to support chkconfig
  # chkconfig: - 98 02
  # description: Burrow is a Kafka lag checking program by LinkedIn, Inc.
  #
  # Source function library.
  . /etc/init.d/functions

  ### Default variables
  prog_name="burrow"
  prog_path="/usr/bin/${prog_name}"
  pidfile="/var/run/${prog_name}.pid"
  options="-config-dir /etc/burrow/"

  # Check if requirements are met
  [ -x "${prog_path}" ] || exit 1

  RETVAL=0

  start(){
      echo -n $"Starting $prog_name: "
      #pidfileofproc $prog_name
      #killproc $prog_path
      PID=$(pidofproc -p $pidfile $prog_name)
      #daemon $prog_path $options
      if [ -z $PID ]; then
          $prog_path $options > /dev/null 2>&1 &
          [ ! -e $pidfile ] && sleep 1
      fi
      [ -z $PID ] && PID=$(pidof ${prog_path})
      if [ -f $pidfile -a -d "/proc/$PID" ]; then
          #RETVAL=$?
          RETVAL=0
          #[ ! -z "${PID}" ] && echo ${PID} > ${pidfile}
          echo_success
          [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog_name
      else
          RETVAL=1
          echo_failure
      fi
      echo
      return $RETVAL
  }

  stop(){
      echo -n $"Shutting down $prog_name: "
      killproc -p ${pidfile} $prog_name
      RETVAL=$?
      echo
      [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog_name
      return $RETVAL
  }

  restart() {
      stop
      start
  }

  case "$1" in
      start)
          start
          ;;
      stop)
          stop
          ;;
      restart)
          restart
          ;;
      status)
          status $prog_path
          RETVAL=$?
          ;;
      *)
          echo $"Usage: $0 {start|stop|restart|status}"
          RETVAL=1
  esac
  exit $RETVAL
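Note that the chkconfig --add burrow step in the installation section only takes effect once this script is in place. Assuming the script above has been saved locally as burrow.init (a placeholder filename), installing and enabling it looks like this:

  # cp burrow.init /etc/init.d/burrow
  # chmod +x /etc/init.d/burrow
  # chkconfig --add burrow
  # service burrow start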

The default configuration file is burrow.toml.


Usage

List consumer groups:

GET /v3/kafka/(cluster)/consumer

All of Burrow's endpoints return responses as JSON objects, which makes them very convenient for secondary collection and processing (see the sketch after the endpoint list below).

Get the status or the consumer lag of a specific consumer group:

GET /v3/kafka/(cluster)/consumer/(group)/status
GET /v3/kafka/(cluster)/consumer/(group)/lag

List topics:

GET /v3/kafka/(cluster)/topic

Get offset information for a specific topic:

GET /v3/kafka/(cluster)/topic/(topic)
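Because every response is a JSON object, the output plugs straight into other tooling. Below is a minimal collection sketch using curl and jq (both assumed to be installed); the cluster name production comes from the configuration above, while the consumers array and the status.group / status.status / status.totallag fields reflect Burrow's v3 HTTP API responses and should be verified against the version you deploy:

  #!/bin/bash
  # Walk every consumer group Burrow knows about in the "production" cluster
  # and print the group name, its evaluated status (OK/WARN/ERR/...), and total lag.
  BURROW="http://localhost:8000"
  CLUSTER="production"

  for group in $(curl -s "${BURROW}/v3/kafka/${CLUSTER}/consumer" | jq -r '.consumers[]'); do
      curl -s "${BURROW}/v3/kafka/${CLUSTER}/consumer/${group}/lag" \
          | jq -r '[.status.group, .status.status, (.status.totallag | tostring)] | join(" ")'
  done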

Reposted from: https://blog.51cto.com/professor/2119071
