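The Prometheus host was overloaded: memory was exhausted and the load spiked, so it was no longer possible to log in.
The host was restarted from the public cloud web console, but after the restart memory was still insufficient and the load quickly climbed again.
The instance was then resized from 4C+8G to 4C+16G.
After that, Grafana could not retrieve any metrics, even though the metrics exposed by the remote target hosts could still be fetched with curl.
The Prometheus log showed the following warnings: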
level=warn ts=2024-05-23T11:04:46.410Z caller=scrape.go:1094 component="scrape manager" scrape_pool=jws2-development-hangzhou target=http://172.16.185.173:9100/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2024-05-23T11:04:46.423Z caller=scrape.go:1378 component="scrape manager" scrape_pool=jws2-development-hangzhou target=http://172.16.185.182:9100/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=507
level=warn ts=2024-05-23T11:04:46.423Z caller=scrape.go:1094 component="scrape manager" scrape_pool=jws2-development-hangzhou target=http://172.16.185.182:9100/metrics msg="Appending scrape report failed" err="out of bounds"
level=warn ts=2024-05-23T11:04:46.505Z caller=scrape.go:1378 component="scrape manager" scrape_pool=jws2-development-hangzhou target=http://172.16.185.172:9100/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=435
level=warn ts=2024-05-23T11:04:46.505Z caller=scrape.go:1094 component="scrape manager" scrape_pool=jws2-development-hangzhou target=http://172.16.185.172:9100/metrics msg="Appending scrape report failed" err="out of bounds"
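To see how many targets are affected, warnings like the ones above can be grouped by target. This is only a minimal sketch, assuming the Prometheus log has been saved to a file; the function name count_out_of_bounds and the file path in the usage line are hypothetical and not part of Prometheus.

```shell
# Count "out of bounds" warnings per scrape target in a saved Prometheus log.
# count_out_of_bounds is a hypothetical helper name; $1 is the log file path.
count_out_of_bounds() {
    grep 'err="out of bounds"' "$1" \
        | grep -o 'target=[^ ]*' \
        | sort | uniq -c \
        | awk '{ print $1, $2 }'   # normalize to "<count> target=<url>"
}
```

Usage, with an assumed log location: count_out_of_bounds /tmp/prometheus.log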
Check the current time and time zone
timedatectl
Set the time zone to Shanghai
timedatectl set-timezone Asia/Shanghai
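Set the hardware clock to UTC
Purpose of the WAL files: they are only used to record events and to restore the in-memory state at startup, so the wal directory was moved out of the way and recreated: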
[root@iZbp15tsl5bp6tjrwi2ksjZ data]# mv wal /tmp/
[root@iZbp15tsl5bp6tjrwi2ksjZ data]# mkdir wal
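The two commands above can be wrapped so that the old WAL is kept under a timestamped name instead of being overwritten by a later attempt. A rough sketch, assuming Prometheus has been stopped first; backup_wal is a hypothetical helper name, and the backup location under /tmp is an assumption. Note that samples that exist only in the WAL and have not yet been written to a block will no longer be loaded once the directory is moved away.

```shell
# Move the WAL directory aside and recreate it empty.
# backup_wal is a hypothetical helper; $1 is the Prometheus data directory.
# Stop Prometheus before running this and start it again afterwards.
backup_wal() {
    data_dir=$1
    backup="${TMPDIR:-/tmp}/wal.$(date +%Y%m%d%H%M%S)"
    mv "$data_dir/wal" "$backup" \
        && mkdir "$data_dir/wal" \
        && echo "$backup"          # print where the old WAL was kept
}
```

Keeping the old directory makes it possible to inspect it or put it back if the restart does not help.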
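After the WAL was removed and Prometheus was restarted again, the log returned to normal and Grafana displayed data correctly.
Root cause: when the instance specification was changed, the system time on the Prometheus host switched to UTC, while the target hosts were still on Shanghai time. Prometheus started recording data from the new time, so the current timestamps conflicted with the timestamps Prometheus had already recorded (the new samples were too old). That is why it reported "Error on ingesting samples that are too old or are too far into the future".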
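A quick way to check for this kind of clock shift is to compare Unix epoch seconds between hosts, since epoch seconds do not depend on the configured time zone. The following is a minimal sketch; clock_skew is a hypothetical helper name and target-host is a placeholder. A clock that is off by the UTC/Shanghai offset would show a difference of about 28800 seconds (8 hours).

```shell
# Print the absolute difference in seconds between two Unix timestamps.
# clock_skew is a hypothetical helper; $1 and $2 are epoch seconds.
clock_skew() {
    diff=$(( $1 - $2 ))
    echo "${diff#-}"               # strip a leading minus sign, if any
}

# Example (target-host is a placeholder, not from the original setup):
# clock_skew "$(date +%s)" "$(ssh target-host date +%s)"
```

A result close to zero means the clocks agree, whatever time zone each host displays.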
Reference: https://github.com/prometheus/prometheus/issues/6554