
Monitoring containers, a node server, and MySQL with Prometheus


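What is Docker Compose?

Docker Compose is a tool for starting several containers at once on a single host, i.e. container orchestration on one machine. For each container, the compose file declares which image to use, which configuration to load, which ports to publish, and whether to mount volumes.

What does cAdvisor do?

cAdvisor (short for container Advisor) analyzes and exposes resource usage and performance data from running containers.
It reports resource usage for the host as well as for each container.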


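I. Monitoring containers with Prometheus

All files for this setup live in /sc.

    [root@k8snode1 sc]# pwd
    /sc

1. Edit prometheus.yml

The target cadvisor:8080 is the Compose service name, which resolves on the network that Compose creates for the project.

    [root@k8snode1 sc]# cat prometheus.yml
    scrape_configs:
    - job_name: cadvisor
      scrape_interval: 5s
      static_configs:
      - targets:
        - cadvisor:8080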

2. Edit docker-compose.yml

    [root@k8snode1 sc]# cat docker-compose.yml
    version: '3.2'
    services:
      prometheus:
        image: prom/prometheus:latest
        container_name: prometheus
        ports:
        - 9090:9090
        command:
        - --config.file=/etc/prometheus/prometheus.yml
        volumes:
        - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
        depends_on:
        - cadvisor
      cadvisor:
        image: gcr.io/cadvisor/cadvisor:latest
        container_name: cadvisor
        ports:
        - 8080:8080
        volumes:
        - /:/rootfs:ro
        - /var/run:/var/run:rw
        - /sys:/sys:ro
        - /var/lib/docker/:/var/lib/docker:ro
        depends_on:
        - redis
      redis:
        image: redis:latest
        container_name: redis
        ports:
        - 6379:6379

3. Download the cAdvisor image archive and load it into Docker.

    [root@k8snode1 prom]# ls
    cadvisor.tar
    [root@k8snode1 prom]# docker load -i cadvisor.tar
    [root@k8snode1 sc]# docker images
    REPOSITORY                 TAG       IMAGE ID       CREATED         SIZE
    redis                      latest    7614ae9453d1   18 months ago   113MB
    prom/prometheus            latest    a3d385fc29f9   18 months ago   201MB
    gcr.io/cadvisor/cadvisor   latest    68c29634fe49   2 years ago     163MB

4. Start the containers with docker compose

Running in the foreground attaches to the logs of all three containers; pressing Ctrl+C stops them again.

    [root@k8snode1 sc]# docker compose up
    [+] Running 4/2
     ⠿ Network sc_default    Created                                                          0.1s
     ⠿ Container redis       Created                                                          0.1s
     ⠿ Container cadvisor    Created                                                          0.0s
     ⠿ Container prometheus  Created                                                          0.0s
    Attaching to cadvisor, prometheus, redis
    redis       | 1:C 15 Jun 2023 10:22:08.193 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    redis       | 1:C 15 Jun 2023 10:22:08.193 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
    redis       | 1:C 15 Jun 2023 10:22:08.193 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
    redis       | 1:M 15 Jun 2023 10:22:08.194 * monotonic clock: POSIX clock_gettime
    redis       | 1:M 15 Jun 2023 10:22:08.194 * Running mode=standalone, port=6379.
    redis       | 1:M 15 Jun 2023 10:22:08.195 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    redis       | 1:M 15 Jun 2023 10:22:08.195 # Server initialized
    redis       | 1:M 15 Jun 2023 10:22:08.195 * Ready to accept connections
    cadvisor    | W0615 10:22:08.704129       1 manager.go:288] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:478 level=info msg="No time or size retention was set so using the default time retention" duration=15d
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:515 level=info msg="Starting Prometheus" version="(version=2.32.1, branch=HEAD, revision=41f1a8125e664985dd30674e5bdf6b683eff5d32)"
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:520 level=info build_context="(go=go1.17.5, user=root@54b6dbd48b97, date=20211217-22:08:06)"
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:521 level=info host_details="(Linux 3.10.0-1160.88.1.el7.x86_64 #1 SMP Tue Mar 7 15:41:52 UTC 2023 x86_64 1bc7affdad3c (none))"
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:522 level=info fd_limits="(soft=1048576, hard=1048576)"
    prometheus  | ts=2023-06-15T10:22:09.372Z caller=main.go:523 level=info vm_limits="(soft=unlimited, hard=unlimited)"
    prometheus  | ts=2023-06-15T10:22:09.376Z caller=web.go:570 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
    prometheus  | ts=2023-06-15T10:22:09.378Z caller=main.go:924 level=info msg="Starting TSDB ..."
    prometheus  | ts=2023-06-15T10:22:09.386Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
    prometheus  | ts=2023-06-15T10:22:09.388Z caller=head.go:488 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
    prometheus  | ts=2023-06-15T10:22:09.389Z caller=head.go:522 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=164.444µs
    prometheus  | ts=2023-06-15T10:22:09.389Z caller=head.go:528 level=info component=tsdb msg="Replaying WAL, this may take a while"
    prometheus  | ts=2023-06-15T10:22:09.391Z caller=head.go:599 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
    prometheus  | ts=2023-06-15T10:22:09.391Z caller=head.go:605 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=94.48µs wal_replay_duration=1.637257ms total_replay_duration=1.946212ms
    prometheus  | ts=2023-06-15T10:22:09.392Z caller=main.go:945 level=info fs_type=XFS_SUPER_MAGIC
    prometheus  | ts=2023-06-15T10:22:09.392Z caller=main.go:948 level=info msg="TSDB started"
    prometheus  | ts=2023-06-15T10:22:09.392Z caller=main.go:1129 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
    prometheus  | ts=2023-06-15T10:22:09.395Z caller=main.go:1166 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=2.855148ms db_storage=1.181µs remote_storage=27.878µs web_handler=595ns query_engine=1.107µs scrape=1.47533ms scrape_sd=69.063µs notify=1.869µs notify_sd=2.79µs rules=323.126µs
    prometheus  | ts=2023-06-15T10:22:09.395Z caller=main.go:897 level=info msg="Server is ready to receive web requests."
    ^CGracefully stopping... (press Ctrl+C again to force)  # press Ctrl+C to exit
    Aborting on container exit...
    [+] Running 3/3
     ⠿ Container prometheus  Stopped                                                          0.1s
     ⠿ Container cadvisor    Stopped                                                          0.1s
     ⠿ Container redis       Stopped                                                          0.2s
    canceled

Start the containers with docker compose in the background (detached mode):

    [root@k8snode1 sc]# docker compose up -d
    [+] Running 3/3
     ⠿ Container redis       Started                                                          0.5s
     ⠿ Container cadvisor    Started                                                          0.9s
     ⠿ Container prometheus  Started                                                          1.5s
    [root@k8snode1 sc]# docker compose ps  # list the running containers
    NAME                IMAGE                             COMMAND                  SERVICE             CREATED             STATUS                    PORTS
    cadvisor            gcr.io/cadvisor/cadvisor:latest   "/usr/bin/cadvisor -…"   cadvisor            4 minutes ago       Up 30 seconds (healthy)   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp
    prometheus          prom/prometheus:latest            "/bin/prometheus --c…"   prometheus          4 minutes ago       Up 30 seconds             0.0.0.0:9090->9090/tcp, :::9090->9090/tcp
    redis               redis:latest                      "docker-entrypoint.s…"   redis               4 minutes ago       Up 30 seconds             0.0.0.0:6379->6379/tcp, :::6379->6379/tcp

5. Access cAdvisor and Prometheus

cAdvisor: http://192.168.102.137:8080/

Prometheus: http://192.168.102.137:9090/graph
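
Besides the web UI, the same data can be read through the Prometheus HTTP API. The following is a minimal Python sketch, not part of the original setup: it builds an instant-query URL for /api/v1/query and extracts values from a vector result. The sample response is hand-written for illustration, and it assumes that cAdvisor attaches a `name` label holding the container name, which is its usual behaviour with Docker.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Prometheus address used in this article.
PROMETHEUS_URL = "http://192.168.102.137:9090"


def build_query_url(base_url, promql):
    """Return the URL of an instant query against /api/v1/query."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def extract_samples(response, label):
    """Map the given label's value to the sample value of each series."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    samples = {}
    for series in response["data"]["result"]:
        # An instant-vector entry carries "value": [unix_timestamp, "value as string"].
        _, raw_value = series["value"]
        samples[series["metric"].get(label, "")] = float(raw_value)
    return samples


def query_prometheus(promql, base_url=PROMETHEUS_URL, timeout=5):
    """Run the query over HTTP. Needs a reachable server, so it is not called here."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return json.load(resp)


# Offline demonstration with a hand-written (hypothetical) response.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"name": "redis"}, "value": [1686824529.0, "7340032"]},
            {"metric": {"name": "prometheus"}, "value": [1686824529.0, "52428800"]},
        ],
    },
}
memory_by_container = extract_samples(sample_response, "name")
print(memory_by_container)
```

Against the running stack, something like `extract_samples(query_prometheus("container_memory_usage_bytes"), "name")` should return one entry per container, provided the server is reachable from where the script runs.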

6. Stop docker compose

    [root@k8snode1 sc]# docker compose stop  # stop the containers
    [+] Running 3/3
     ⠿ Container prometheus  Stopped                                                          0.1s
     ⠿ Container cadvisor    Stopped                                                          0.1s
     ⠿ Container redis       Stopped                                                          0.1s
    [root@k8snode1 sc]# docker compose down  # stop and remove the containers and the project network

II. Monitoring a node server with Prometheus and node_exporter
1. Upload the node_exporter archive to the server to be monitored and extract it.

    [root@slave ~]# mkdir /node_exporter
    [root@slave ~]# cd /node_exporter/
    [root@slave node_exporter]# ls
    node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
    [root@slave node_exporter]# tar xf node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
    [root@slave node_exporter]# ls
    node_exporter-1.4.0-rc.0.linux-amd64.tar.gz
    node_exporter-1.4.0-rc.0.linux-amd64

2. Start the node_exporter agent

    [root@slave node_exporter-1.4.0-rc.0.linux-amd64]# PATH=/node_exporter/node_exporter-1.4.0-rc.0.linux-amd64:$PATH
    [root@slave node_exporter-1.4.0-rc.0.linux-amd64]# which node_exporter
    /node_exporter/node_exporter-1.4.0-rc.0.linux-amd64/node_exporter
    [root@slave node_exporter-1.4.0-rc.0.linux-amd64]# node_exporter --help  # show the usage help
    [root@slave node_exporter-1.4.0-rc.0.linux-amd64]# nohup node_exporter --web.listen-address='0.0.0.0:9100' &

Confirm that node_exporter is running and listening on port 9100:

    [root@slave node_exporter-1.4.0-rc.0.linux-amd64]# ps aux|grep node
    root      17227  0.0  0.7 716544 13112 pts/1    Sl   20:19   0:00 node_exporter --web.listen-address=0.0.0.0:9100
    root      17236  0.0  0.0 112824   988 pts/1    S+   20:20   0:00 grep --color=auto node

Open the metrics page to verify that the installation works:
http://192.168.102.136:9100/metrics
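
The /metrics page is plain text in the Prometheus exposition format, so it can also be checked from a script. The sketch below is a simplified Python parser (it skips comment lines and ignores optional timestamps), shown here on a hand-written sample rather than real output from this host. `node_load1` and `node_cpu_seconds_total` are metric names that node_exporter normally exposes; the values are made up.

```python
import urllib.request


def parse_metrics(text):
    """Parse exposition-format text into {series: value}; comments are skipped."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            head, _, rest = line.rpartition("}")
            series = head + "}"
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if fields:
            # The first field is the value; an optional timestamp may follow.
            metrics[series] = float(fields[0])
    return metrics


def fetch_metrics(url, timeout=5):
    """Download and parse a /metrics page. Not called here; needs a live exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hand-written sample in the exposition format (values are illustrative).
sample_text = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.08
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 5123.45
"""
parsed = parse_metrics(sample_text)
print(parsed["node_load1"])
```

With the exporter running, `fetch_metrics("http://192.168.102.136:9100/metrics")` would return the live values in the same shape.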

3. Add the monitored host to the Prometheus server
Do this on the server:

    [root@k8snode1 sc]# pwd
    /sc
    [root@k8snode1 sc]# cat prometheus.yml
    scrape_configs:
    - job_name: cadvisor
      scrape_interval: 5s
      static_configs:
      - targets:
        - cadvisor:8080
    # Add the server to be monitored
    - job_name: slave
      scrape_interval: 5s
      static_configs:
      - targets:
        - 192.168.102.136:9100
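
Every job in this file has the same shape, so further hosts can be added by copying the stanza. As an illustration only, here is a small Python helper (a hypothetical name, not part of Prometheus) that renders such a stanza as text. It assumes plain job names and host:port targets that need no YAML quoting.

```python
def render_scrape_job(job_name, targets, scrape_interval="5s"):
    """Render one scrape_configs entry in the same layout as the file above."""
    lines = [
        f"- job_name: {job_name}",
        f"  scrape_interval: {scrape_interval}",
        "  static_configs:",
        "  - targets:",
    ]
    lines += [f"    - {target}" for target in targets]
    return "\n".join(lines) + "\n"


# Reproduces the stanza added for the slave host.
stanza = render_scrape_job("slave", ["192.168.102.136:9100"])
print(stanza, end="")
```

The returned text can be appended under scrape_configs in prometheus.yml; Prometheus only picks it up after the configuration is reloaded.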

4. Restart Prometheus. There is no dedicated restart script, so this is done by hand.
Because Prometheus runs in a container here, restart the Compose stack so that it rereads prometheus.yml.

    [root@k8snode1 sc]# docker compose stop
    [+] Running 3/3
     ⠿ Container prometheus  Stopped                                                          0.1s
     ⠿ Container cadvisor    Stopped                                                          0.1s
     ⠿ Container redis       Stopped
    [root@k8snode1 sc]# docker compose up -d
    [+] Running 3/3
     ⠿ Container redis       Started                                                          0.3s
     ⠿ Container cadvisor    Started                                                          0.7s
     ⠿ Container prometheus  Started                                                          1.1s
    [root@k8snode1 sc]#

5. Check the newly added target on the Prometheus targets page
http://192.168.102.137:9090/targets
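
The information on the targets page is also available as JSON from /api/v1/targets, which is handy for scripted checks. The Python sketch below summarizes the `activeTargets` list of such a response and picks out targets whose health is not "up". The sample payload, including the error text, is hand-written for illustration.

```python
import json
import urllib.request


def summarize_targets(response):
    """Return sorted (job, instance, health, last_error) tuples for active targets."""
    rows = []
    for target in response["data"]["activeTargets"]:
        labels = target.get("labels", {})
        rows.append((
            labels.get("job", ""),
            labels.get("instance", ""),
            target.get("health", "unknown"),
            target.get("lastError", ""),
        ))
    return sorted(rows)


def unhealthy(rows):
    """Keep only the targets that Prometheus does not report as up."""
    return [row for row in rows if row[2] != "up"]


def fetch_targets(base_url, timeout=5):
    """Fetch /api/v1/targets. Not called here; needs a reachable Prometheus."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


# Hand-written (hypothetical) response with one healthy and one failing target.
sample_targets = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "slave", "instance": "192.168.102.136:9100"},
             "health": "down", "lastError": "connection refused"},
            {"labels": {"job": "cadvisor", "instance": "cadvisor:8080"},
             "health": "up", "lastError": ""},
        ],
        "droppedTargets": [],
    },
}
rows = summarize_targets(sample_targets)
failing = unhealthy(rows)
print(failing)
```

On the setup in this article, `unhealthy(summarize_targets(fetch_targets("http://192.168.102.137:9090")))` should return an empty list once both jobs are being scraped successfully.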


III. Monitoring MySQL with Prometheus

1. Install mysqld on a server (with a script or with yum), install mysqld_exporter on the same server, and then add that server as a target in Prometheus. Here mysqld is already running on the slave host:

    [root@slave node_exporter]# mkdir /mysqld_exporter
    [root@slave node_exporter]# cd /mysqld_exporter/
    [root@slave mysqld_exporter]# ps aux |grep mysqld
    root       6652  0.0  0.0 115536  1716 ?        S    09:36   0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/slave.pid
    mysql      6991  0.1 12.5 1677892 233308 ?      Sl   09:36   0:05 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=slave.err --open-files-limit=8192 --pid-file=/data/mysql/slave.pid --socket=/data/mysql/mysql.sock --port=3306
    root      16954  0.0  0.0 112824   988 pts/0    S+   10:33   0:00 grep --color=auto mysqld

2. Download mysqld_exporter, upload it to the Linux host with Xftp, and extract it.

    [root@slave mysqld_exporter]# ls
    mysqld_exporter-0.14.0.linux-amd64.tar.gz
    [root@slave mysqld_exporter]# tar xf mysqld_exporter-0.14.0.linux-amd64.tar.gz
    [root@slave mysqld_exporter]# cd mysqld_exporter-0.14.0.linux-amd64
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# ls
    LICENSE  mysqld_exporter  NOTICE
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# PATH=/mysqld_exporter/mysqld_exporter-0.14.0.linux-amd64:$PATH
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# which mysqld_exporter
    /mysqld_exporter/mysqld_exporter-0.14.0.linux-amd64/mysqld_exporter

3. Create a MySQL user for the exporter and grant it the privileges it needs.

The GRANT ... IDENTIFIED BY form below works on the MySQL 5.7 server used here (it produces the warning shown in the output). MySQL 8.0 removed it, so there the user has to be created with CREATE USER before the GRANT.

    [root@slave mysqld_exporter-0.14.0.linux-amd64]# mysql -uroot -pSanchuang1234#
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 4
    Server version: 5.7.37 MySQL Community Server (GPL)
    Copyright (c) 2000, 2022, Oracle and/or its affiliates.
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    root@(none) 21:00  mysql>grant select, replication client, process on *.* to 'prom'@'localhost' identified by 'sc123456';
    Query OK, 0 rows affected, 1 warning (0.01 sec)
    root@(none) 21:00  mysql>exit
    Bye

4. Put the MySQL account details in a my.cnf file for the exporter

    [root@slave mysqld_exporter-0.14.0.linux-amd64]# vim my.cnf
    [client]
    user=prom
    password=sc123456
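
Since this file holds a password in clear text, it is worth keeping it readable by its owner only. The article creates the file with vim; as an alternative sketch, the Python helper below (a hypothetical name) writes the same [client] section with mode 0600. The demonstration writes to a temporary directory rather than to the real path.

```python
import configparser
import os
import stat
import tempfile


def write_client_cnf(path, user, password):
    """Write a [client] section with user and password, readable by the owner only."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["client"] = {"user": user, "password": password}
    # Create the file with mode 0600 so the password is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        parser.write(handle, space_around_delimiters=False)
    # Also tighten the mode if the file already existed with looser permissions.
    os.chmod(path, 0o600)


# Demonstration in a temporary directory, using the credentials from this article.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "my.cnf")
write_client_cnf(demo_path, "prom", "sc123456")
with open(demo_path) as handle:
    demo_lines = handle.read().strip().splitlines()
demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
print(demo_lines, oct(demo_mode))
```

For a file that already exists, `chmod 600 my.cnf` in the shell achieves the same permission change.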

5. Start mysqld_exporter

After starting it, check that the process is running and that it listens on port 9104.

    [root@slave mysqld_exporter-0.14.0.linux-amd64]# pwd
    /mysqld_exporter/mysqld_exporter-0.14.0.linux-amd64
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# nohup mysqld_exporter  --config.my-cnf=/mysqld_exporter/mysqld_exporter-0.14.0.linux-amd64/my.cnf &
    [2] 17291
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# nohup: ignoring input and appending output to 'nohup.out'
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# ls
    LICENSE  my.cnf  mysqld_exporter  nohup.out  NOTICE
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# ps aux |grep mysqld
    root       6642  0.0  0.0 115536  1716 ?        S    18:37   0:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/slave.pid
    mysql      6982  0.1 12.1 1743692 227048 ?      Sl   18:37   0:10 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=slave.err --open-files-limit=8192 --pid-file=/data/mysql/slave.pid --socket=/data/mysql/mysql.sock --port=3306
    root      17291  0.0  0.6 712340 11936 pts/1    Sl   21:05   0:00 mysqld_exporter --config.my-cnf=/mysqld_exporter/mysqld_exporter-0.14.0.linux-amd64/my.cnf
    root      17301  0.0  0.0 112824   984 pts/1    S+   21:08   0:00 grep --color=auto mysqld
    [root@slave mysqld_exporter-0.14.0.linux-amd64]# netstat -anpult|grep mysqld
    tcp6       0      0 :::3306                 :::*                    LISTEN      6982/mysqld
    tcp6       0      0 :::9104                 :::*                    LISTEN      17291/mysqld_export
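
netstat confirms the listener on the exporter host itself. Prometheus scrapes over the network, so it can also help to test the port from another machine, where a firewall might still block it. The following Python sketch does a plain TCP connect; the demonstration uses a throwaway local listener so that it runs without the exporter.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Offline demonstration: open a local listener on a free port and probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
reachable = port_open("127.0.0.1", demo_port)
listener.close()
print(reachable)
```

Run on the Prometheus host, `port_open("192.168.102.136", 9104)` checks the mysqld_exporter port and `port_open("192.168.102.136", 9100)` the node_exporter port. A successful connect only shows that the port is reachable, not that the exporter returns valid metrics.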

6. On the Prometheus server, add a mysqld_exporter job to prometheus.yml and restart the Compose stack

    [root@k8snode1 sc]# cat prometheus.yml
    scrape_configs:
    - job_name: cadvisor
      scrape_interval: 5s
      static_configs:
      - targets:
        - cadvisor:8080
    - job_name: slave
      scrape_interval: 5s
      static_configs:
      - targets:
        - 192.168.102.136:9100
    - job_name: mysqld_exporter
      scrape_interval: 5s
      static_configs:
      - targets:
        - 192.168.102.136:9104
    [root@k8snode1 sc]# docker compose stop
    [+] Running 3/3
     ⠿ Container prometheus  Stopped                                                          0.2s
     ⠿ Container cadvisor    Stopped                                                          0.1s
     ⠿ Container redis       Stopped                                                          0.3s
    [root@k8snode1 sc]# docker compose up -d
    [+] Running 3/3
     ⠿ Container redis       Started                                                          0.4s
     ⠿ Container cadvisor    Started                                                          0.9s
     ⠿ Container prometheus  Started                                                          1.4s
    [root@k8snode1 sc]# docker ps
    CONTAINER ID   IMAGE                             COMMAND                   CREATED        STATUS                             PORTS                                       NAMES
    1bc7affdad3c   prom/prometheus:latest            "/bin/prometheus --c…"   16 hours ago   Up 14 seconds                      0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   prometheus
    8838f2e63e1c   gcr.io/cadvisor/cadvisor:latest   "/usr/bin/cadvisor -…"   16 hours ago   Up 15 seconds (health: starting)   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   cadvisor
    c9aeedb12daa   redis:latest                      "docker-entrypoint.s…"   16 hours ago   Up 15 seconds                      0.0.0.0:6379->6379/tcp, :::6379->6379/tcp   redis

7. Check the newly added target on the Prometheus targets page
http://192.168.102.137:9090/targets
