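Prometheus is an open-source monitoring and alerting framework that is well suited to monitoring large clusters. It was the second project to join the CNCF, after Kubernetes, and among CNCF projects its popularity is second only to Kubernetes. This article walks through setting up a complete Prometheus monitoring and alerting service.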
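Prometheus is today's mainstream monitoring system. It is not a single program but a stack of several services used together.
[Figure: overall architecture]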
This tutorial builds the monitoring stack one service at a time. Except for the nodes that collect metrics, every service is deployed with docker-compose.
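The Prometheus server is responsible for collecting and storing metrics, and it provides the PromQL query language. Deploying it takes two steps: preparing the configuration files, then creating and starting the docker-compose service.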
All configuration files for the stack live under /root/prometheus. First create the directory /root/prometheus/prometheus for the Prometheus service, then create the following layout:
root@ubuntu-System-Product-Name:~/prometheus# tree . -L 3
.
├── docker-compose.yaml
└── prometheus
    ├── config
    │   └── prometheus.yml
    └── data
Create prometheus.yml, the main configuration file of the Prometheus service:
global:
  scrape_interval: 30s      # scrape metrics every 30s
  evaluation_interval: 30s  # evaluate alerting rules every 30s
scrape_configs:
  # scrape the Prometheus server itself
  - job_name: prometheus
    static_configs:
      - targets: ['172.16.9.124:9090']
        labels:
          instance: prometheus
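A malformed prometheus.yml keeps the server from starting, so it can help to validate the file before bringing the service up. The following is a minimal sketch, assuming Docker is installed and the config directory is the one used in this article. It relies on promtool, which ships in the prom/prometheus image; the function name check_prom_config and the CONFIG_DIR variable are illustrative and not part of the original setup.

```shell
# Assumption: the config directory from this article; override CONFIG_DIR if yours differs.
CONFIG_DIR="${CONFIG_DIR:-/root/prometheus/prometheus/config}"

# Hypothetical helper: run "promtool check config" in a throwaway container.
check_prom_config() {
  # Mount the config read-only and replace the image's entrypoint with promtool.
  docker run --rm \
    -v "${CONFIG_DIR}:/etc/prometheus:ro" \
    --entrypoint /bin/promtool \
    prom/prometheus check config /etc/prometheus/prometheus.yml
}
```

Calling check_prom_config returns a non-zero exit status when promtool rejects the file, so it can gate the docker-compose up step.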
Change the permissions of the data directory so that the container is allowed to write its data there:
chmod 777 data
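chmod 777 works, but it makes the directory writable by every user on the host. A narrower alternative is sketched below. It assumes the official prom/prometheus image runs the server as the nobody user with uid and gid 65534; that is an assumption to verify on your image, for example with docker run --rm --entrypoint id prom/prometheus. The names restrict_data_dir, DATA_DIR and PROM_UID are illustrative.

```shell
# Assumption: data directory from this article.
DATA_DIR="${DATA_DIR:-/root/prometheus/prometheus/data}"
# Assumption: uid/gid of the user Prometheus runs as inside the container.
PROM_UID="${PROM_UID:-65534}"

# Hypothetical helper: hand the data directory to the container user only.
restrict_data_dir() {
  chown -R "${PROM_UID}:${PROM_UID}" "${DATA_DIR}"
  chmod 755 "${DATA_DIR}"
}
```

Run restrict_data_dir as root before starting the container; if the uid does not match the one in your image, Prometheus will report permission errors in its logs.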
Create docker-compose.yaml:
version: '3'
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    restart: always
    ports:
      - "9090:9090"
    volumes:
      - /root/prometheus/prometheus/config:/etc/prometheus
      - /root/prometheus/prometheus/data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.enable-lifecycle'
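Parameter notes:
command:
  --config.file: path of the main configuration file inside the container.
  --storage.tsdb.path: directory inside the container where the TSDB stores its data.
  --web.enable-lifecycle: enables the HTTP lifecycle endpoints (/-/reload and /-/quit).
volumes:
  /root/prometheus/prometheus/config is mounted at /etc/prometheus (configuration).
  /root/prometheus/prometheus/data is mounted at /prometheus (TSDB data).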
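Because the compose file passes --web.enable-lifecycle, Prometheus accepts a POST to /-/reload and re-reads prometheus.yml without restarting the container. A minimal sketch follows, assuming the server is reachable at the address used in this article; reload_prometheus and PROM_URL are illustrative names.

```shell
# Assumption: address of the Prometheus server from this article.
PROM_URL="${PROM_URL:-http://172.16.9.124:9090}"

# Hypothetical helper: ask Prometheus to reload its configuration.
reload_prometheus() {
  # -f makes curl exit non-zero on an HTTP error, e.g. when the new config is invalid.
  curl -fsS -X POST "${PROM_URL}/-/reload"
}
```

After editing prometheus.yml on the host, run reload_prometheus and check the container logs for the "Loading configuration file" entry.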
Start the service with docker-compose:
docker-compose up -d
Check that the container is running:
root@ubuntu-System-Product-Name:~/prometheus# docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS                                       NAMES
776772d69b20   prom/prometheus   "/bin/prometheus --c…"   5 minutes ago   Up 5 minutes   0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   prometheus
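A container in the Up state is not necessarily serving requests yet. The sketch below polls the /-/ready endpoint and then runs the PromQL query up through the HTTP API, as one possible way to confirm that the server is scraping itself. It assumes the address used in this article; wait_for_prometheus, query_up and PROM_URL are illustrative names.

```shell
# Assumption: address of the Prometheus server from this article.
PROM_URL="${PROM_URL:-http://172.16.9.124:9090}"

# Hypothetical helper: poll /-/ready once per second, up to $1 attempts (default 30).
wait_for_prometheus() {
  local attempts="${1:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -fsS -o /dev/null "${PROM_URL}/-/ready"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Hypothetical helper: run the PromQL query "up" via the HTTP API.
# In the JSON result, a value of 1 means the last scrape of that target succeeded.
query_up() {
  curl -fsS -G "${PROM_URL}/api/v1/query" --data-urlencode 'query=up'
}
```

A typical use is wait_for_prometheus 30 && query_up, which prints the JSON response for the prometheus job configured above.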
View the container logs:
docker logs -f 776
ts=2023-12-25T10:21:17.560Z caller=main.go:478 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2023-12-25T10:21:17.560Z caller=main.go:515 level=info msg="Starting prometheus" version="(version=2.32.1, branch=HEAD, revision=41f1a8125e664985dd30674e5bdf6b683eff5d32)"
ts=2023-12-25T10:21:17.561Z caller=main.go:520 level=info build_context="(go=go1.17.5, user=root@54b6dbd48b97, date=20211217-22:08:06)"
ts=2023-12-25T10:21:17.561Z caller=main.go:521 level=info host_details="(Linux 5.15.0-56-generic #62~20.04.1-Ubuntu SMP Tue Nov 22 21:24:20 UTC 2022 x86_64 776772d69b20 (none))"
ts=2023-12-25T10:21:17.561Z caller=main.go:522 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2023-12-25T10:21:17.561Z caller=main.go:523 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2023-12-25T10:21:17.562Z caller=web.go:570 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2023-12-25T10:21:17.562Z caller=main.go:924 level=info msg="Starting TSDB ..."
ts=2023-12-25T10:21:17.562Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2023-12-25T10:21:17.564Z caller=head.go:488 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2023-12-25T10:21:17.564Z caller=head.go:522 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=1.305µs
ts=2023-12-25T10:21:17.564Z caller=head.go:528 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2023-12-25T10:21:17.564Z caller=head.go:599 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=1
ts=2023-12-25T10:21:17.564Z caller=head.go:599 level=info component=tsdb msg="WAL segment loaded" segment=1 maxSegment=1
ts=2023-12-25T10:21:17.564Z caller=head.go:605 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=14.305µs wal_replay_duration=301.534µs total_replay_duration=327.342µs
ts=2023-12-25T10:21:17.565Z caller=main.go:945 level=info fs_type=EXT4_SUPER_MAGIC
ts=2023-12-25T10:21:17.565Z caller=main.go:948 level=info msg="TSDB started"
ts=2023-12-25T10:21:17.565Z caller=main.go:1129 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
ts=2023-12-25T10:21:17.565Z caller=main.go:1166 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=217.62µs db_storage=555ns remote_storage=860ns web_handler=182ns query_engine=371ns scrape=90.382µs scrape_sd=10.238µs notify=450ns notify_sd=788ns rules=737ns
ts=2023-12-25T10:21:17.565Z caller=main.go:897 level=info msg="Server is ready to receive web requests."
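Every entry above is level=info, which is what a healthy start looks like. When something goes wrong, the interesting entries are the warn and error ones, and a small filter makes them easier to spot. The sketch below assumes the key=value log format shown above; prom_log_problems is an illustrative name.

```shell
# Hypothetical helper: read Prometheus log lines on stdin and keep only warn/error entries.
prom_log_problems() {
  # Match the level field as a whole token so that text inside msg="..." is less likely to match.
  grep -E '(^| )level=(warn|error)( |$)'
}
```

One way to use it is docker logs prometheus 2>&1 | prom_log_problems, where prometheus is the container_name from the compose file and 2>&1 is needed because Prometheus writes its log to stderr. The filter prints nothing when there are no problems.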
Logs are important, and it bears repeating: logs are important!