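To show the real-time Kafka message production rate and the message backlog (consumer lag) of each group_id on a topic more intuitively, this setup uses kafka_exporter, Prometheus, and Grafana to display the relevant metrics in real time.
1. Download kafka_exporter (the host must be able to reach the Kafka cluster over the network)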
wget https://github.com/danielqsj/kafka_exporter/releases/download/v1.2.0/kafka_exporter-1.2.0.linux-amd64.tar.gz
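Extract: tar -zxvf kafka_exporter-1.2.0.linux-amd64.tar.gz
Change into the directory: cd kafka_exporter-1.2.0.linux-amd64
./kafka_exporter --kafka.server=<Kafka IP or hostname>:9092 & (the address of a single broker in the cluster is enough)
The exporter serves its metrics on port 9308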
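Before wiring the exporter into Prometheus, it can help to confirm that port 9308 really returns Kafka metrics. The following is a minimal sketch, not part of the original setup: `parse_samples` and `fetch_metrics` are hypothetical helper names, and the URL assumes the exporter runs on the local machine with its default `/metrics` path.

```python
import re
import urllib.request

# One sample line looks like: metric_name{label="value",...} 123
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric_name):
    """Return (labels, value) pairs for one metric from Prometheus text-format output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


def fetch_metrics(url="http://localhost:9308/metrics", timeout=5):
    """Download the raw metrics text (assumed URL: exporter on localhost)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the exporter running, `parse_samples(fetch_metrics(), "kafka_consumergroup_lag")` should list one entry per consumer group, topic, and partition. An empty list usually means that no consumer group has committed offsets yet, or that the exporter cannot reach the broker.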
2. Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.15.1/prometheus-2.15.1.linux-amd64.tar.gz
Extract:
tar -zxvf prometheus-2.15.1.linux-amd64.tar.gz
cd ./prometheus-2.15.1.linux-amd64
prometheus.yml is the Prometheus configuration file. Before adding kafka_exporter, start Prometheus with the default file to confirm the service works:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  # evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
#alerting:
#  alertmanagers:
#  - static_configs:
#    - targets:
#      - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
Run Prometheus:
./prometheus --config.file=prometheus.yml
The web UI is then available at ip:9090
Add the kafka_exporter service to Prometheus (appending the job at the end of the configuration file is enough)
    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'vpc_md_kafka'
    static_configs:
    - targets: ['localhost:9308']
Restart Prometheus (running in the background):
nohup ./prometheus --config.file=prometheus.yml 2>&1 &
Once the target shows as healthy under Status > Targets, the next step is to visualize the metrics with Grafana
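The same health check can also be done without the web page, through the Prometheus HTTP API endpoint `/api/v1/targets`. This is a rough sketch only: `unhealthy_targets` and `fetch_targets` are hypothetical helper names, and the base URL assumes Prometheus runs locally on port 9090 as configured above.

```python
import json
import urllib.request


def unhealthy_targets(targets_response):
    """List (job, scrape URL, last error) for every active target whose health is not 'up'."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        (
            target.get("labels", {}).get("job", ""),
            target.get("scrapeUrl", ""),
            target.get("lastError", ""),
        )
        for target in active
        if target.get("health") != "up"
    ]


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Fetch the target list as parsed JSON (assumed URL: Prometheus on localhost)."""
    with urllib.request.urlopen(base_url + "/api/v1/targets", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `unhealthy_targets(fetch_targets())` returns an empty list when both the `prometheus` and `vpc_md_kafka` jobs are being scraped successfully. Otherwise the last error string usually points at the cause, such as a refused connection on port 9308.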
3. Download Grafana
wget https://dl.grafana.com/oss/release/grafana-6.5.2-1.x86_64.rpm
Run as root:
yum localinstall grafana-6.5.2-1.x86_64.rpm
Start Grafana:
service grafana-server start     # start the service
service grafana-server status    # check its status
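Open the Grafana web UI at ip:3000 and add Prometheus as a data source
Import a dashboard. An official dashboard already exists for this, so there is no need to build one yourself
Enter 7589 in the import box; as soon as the cursor leaves the box, Grafana moves on to the next step
This is what the official dashboard looks like. Mine is a test environment, so there is hardly any data
You can also write your own queries to fit your needs. The production dashboard is fairly simple as well, with three panels
Production dashboard configuration and the corresponding queries
The queries behind the panels are: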
sum(irate(kafka_topic_partition_current_offset{topic !~ "__consumer_offsets|__transaction_state|test",env="$env",app="$app"}[30s])) by (topic) >= 0
sum(kafka_consumergroup_lag{env="$env",app="$app"}) by (topic,consumergroup)
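To make the arithmetic behind these queries concrete, the sketch below imitates them on hand-written samples. It is an illustration under assumptions, not how Prometheus is implemented: `irate` and `sum_by` are hypothetical Python helpers, the topic and group names are invented, and the `env`/`app` label filters are left out.

```python
from collections import defaultdict


def irate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples in the window."""
    if len(samples) < 2:
        return None  # like PromQL irate(), two samples inside the range are required
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    delta = v_last - v_prev
    if delta < 0:
        delta = v_last  # value went backwards: treated as a counter reset
    return delta / (t_last - t_prev)


def sum_by(series, keys):
    """Add up (labels, value) pairs grouped by the given label names, like sum(...) by (keys)."""
    totals = defaultdict(float)
    for labels, value in series:
        if value is None:
            continue
        totals[tuple(labels.get(key, "") for key in keys)] += value
    return dict(totals)


# Made-up offsets for two partitions of one topic, scraped 15 seconds apart.
offsets = [
    ({"topic": "orders", "partition": "0"}, [(0, 1000), (15, 1150)]),
    ({"topic": "orders", "partition": "1"}, [(0, 500), (15, 575)]),
]
produce_rate = sum_by([(labels, irate(points)) for labels, points in offsets], ["topic"])

# Made-up per-partition lag for one consumer group.
lag = [
    ({"topic": "orders", "consumergroup": "g1", "partition": "0"}, 12),
    ({"topic": "orders", "consumergroup": "g1", "partition": "1"}, 5),
]
total_lag = sum_by(lag, ["topic", "consumergroup"])
```

Here partition 0 advances by 150 offsets in 15 seconds (10 messages per second) and partition 1 by 75 (5 per second), so the topic-level rate is 15 messages per second, while the group's lag is the sum over its partitions, 17. Because `irate` needs two samples inside the range, a `[30s]` window only just fits a 15-second scrape interval.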