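The purpose of monitoring is to keep the business running normally.
Some recommended tools: the Chinaz webmaster tools at tool.chinaz.com (the "super ping" feature)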
smokeping(https://oss.oetiker.ch/smokeping/)
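Some good third-party monitoring services: Jiankongbao (监控宝), Bonree (博睿), and Networkbench (基调, also known as Tingyun/听云)
Related blog post: Installing Zabbix 3.4 on CentOS 6.5
Hardware monitoring is generally implemented with IPMI tools, which can monitor low-level hardware such as server fans and server temperature. IBM's official documentation is a good reference. IPMI support requires three conditions:
1) The server supports it (most current servers do)
2) The operating system supports it (Linux does)
3) The software itself supports it
On CentOS, the IPMI tools can be installed directly with yum: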
yum install OpenIPMI ipmitool -y
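After installation, lsmod shows no ipmi kernel modules, because the system does not load them automatically. The ipmi service has to be started (/etc/init.d/ipmi start); this fails in a virtual machine but works on a physical machine.
One drawback of IPMI is that it cannot report the state of the server's hard disks. The MegaCli tool can be used to retrieve the RAID array status of the disks: http://www.ttlsa.com/html/tag/megacli/
- [root@Centos69 ~]# /etc/init.d/ipmi start # fails in a virtual machine; succeeds on physical hardware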
- Starting ipmi drivers: [FAILED]
- [root@Centos69 ~]# lsmod |grep ipmi
- [root@Centos69 ~]# ipmitool help # show help
- Commands:
- raw Send a RAW IPMI request and print response
- i2c Send an I2C Master Write-Read command and print response
- spd Print SPD info from remote I2C device
- lan Configure LAN Channels
- chassis Get chassis status and set power state
- power Shortcut to chassis power commands
- event Send pre-defined events to MC
- mc Management Controller status and global enables
- sdr Print Sensor Data Repository entries and readings # prints sensor data
- sensor Print detailed sensor information # prints detailed sensor data
- fru Print built-in FRU and scan SDR for FRU locators
- gendev Read/Write Device associated with Generic Device locators sdr
- sel Print System Event Log (SEL)
- pef Configure Platform Event Filtering (PEF)
- sol Configure and connect IPMIv2.0 Serial-over-LAN
- tsol Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
- isol Configure IPMIv1.5 Serial-over-LAN
- user Configure Management Controller users
- channel Configure Management Controller channels
- session Print session information
- dcmi Data Center Management Interface
- sunoem OEM Commands for Sun servers
- kontronoem OEM Commands for Kontron devices
- picmg Run a PICMG/ATCA extended cmd
- fwum Update IPMC using Kontron OEM Firmware Update Manager
- firewall Configure Firmware Firewall
- delloem OEM Commands for Dell systems
- shell Launch interactive IPMI shell
- exec Run list of commands from file
- set Set runtime variable for shell and exec
- hpm Update HPM components using PICMG HPM.1 file
- ekanalyzer run FRU-Ekeying analyzer using FRU files
- ime Update Intel Manageability Engine Firmware
- # ipmitool sensor list
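To feed sensor readings into a monitoring system, the text printed by `ipmitool sensor` has to be turned into numbers. The following is a minimal sketch of that step, assuming the usual pipe-separated layout (name | value | unit | status | thresholds...). The sample rows, the sensor names, and the helper `parse_sensors` are hypothetical and only serve as illustration.

```python
# Hypothetical sample in the pipe-separated layout printed by `ipmitool sensor`.
# Sensor names and values are made up for illustration.
SAMPLE = """\
CPU Temp         | 45.000     | degrees C  | ok    | na      | na | na | 85.000 | 90.000 | na
FAN1             | 5400.000   | RPM        | ok    | 300.000 | na | na | na     | na     | na
PS1 Status       | 0x1        | discrete   | 0x0100| na      | na | na | na     | na     | na
"""


def parse_sensors(text):
    """Return {sensor name: (value, unit, status)} for numeric sensors only."""
    readings = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, value, unit, status = fields[:4]
        try:
            readings[name] = (float(value), unit, status)
        except ValueError:
            continue  # discrete or unreadable sensor such as "0x1" or "na"
    return readings


readings = parse_sensors(SAMPLE)
for name, (value, unit, status) in sorted(readings.items()):
    print(name, value, unit, status)
```

On a physical machine the same function could be given the real command output instead of `SAMPLE`.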
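- # Most applications expose a status interface; apache and nginx, for example, each provide a status page that shows state information. redis and memcached have similar interfaces. Here is a redis example: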
- [root@Centos69 ~]# yum install redis
- [root@Centos69 ~]# /etc/init.d/redis start
- Starting redis-server: [ OK ]
- [root@Centos69 ~]# redis-cli
- 127.0.0.1:6379> info
- # Server
- redis_version:3.2.11
- redis_git_sha1:00000000
- redis_git_dirty:0
- redis_build_id:6ad59081ae574f13
- redis_mode:standalone
- os:Linux 2.6.32-696.el6.x86_64 x86_64
- arch_bits:64
- multiplexing_api:epoll
- gcc_version:4.4.7
- process_id:1601
- run_id:50e5711635c89f29bd3406772ab0f82117aa8e7b
- tcp_port:6379
- uptime_in_seconds:9
- uptime_in_days:0
- hz:10
- lru_clock:15300014
- executable:/usr/bin/redis-server
- config_file:/etc/redis.conf
- ......
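The `info` output above is plain `key:value` text with `#` section headers, so a monitoring script can turn it into a dictionary with a few lines of code. The sketch below uses a shortened copy of the output shown above as sample data; the helper name `parse_info` is an assumption made for this example.

```python
# Shortened copy of the `info` output shown above, used as sample input.
SAMPLE_INFO = """\
# Server
redis_version:3.2.11
redis_mode:standalone
os:Linux 2.6.32-696.el6.x86_64 x86_64
tcp_port:6379
uptime_in_seconds:9
uptime_in_days:0
"""


def parse_info(text):
    """Parse redis INFO text into a dict of string keys and string values."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Server"
        key, _, value = line.partition(":")  # split on the first colon only
        info[key] = value
    return info


info = parse_info(SAMPLE_INFO)
print(info["redis_version"], int(info["uptime_in_seconds"]))
```

Values stay strings on purpose; the caller converts the ones it wants to graph, such as `uptime_in_seconds`.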
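- Automated monitoring
Google Analytics, SEO (search engine optimization)
Open-source analytics software: Piwik (now Matomo), matomo.org
# Note: zabbix-server does not support Windows; the agent (client) does.
Environment: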
zabbix_server ip:10.0.0.77 (hostname:linux-node1)
zabbix_agent ip:10.0.0.88 (hostname:linux-node2)
- # Install the Zabbix repository package (version 2.4)
- [root@linux-node1 ~]# yum install http://repo.zabbix.com/zabbix/2.4/rhel/6/x86_64/zabbix-release-2.4-1.el6.noarch.rpm -y
-
- # Install Zabbix with yum
- [root@linux-node1 ~]# rpm -ql zabbix-release
- /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX
- /etc/yum.repos.d/zabbix.repo
- /usr/share/doc/zabbix-release-2.4
- /usr/share/doc/zabbix-release-2.4/GPL
- [root@linux-node1 ~]# yum install zabbix zabbix-agent zabbix-server zabbix-get zabbix-server-mysql zabbix-web zabbix-web-mysql -y
-
- # Edit the database configuration and start the database
- [root@linux-node1 ~]# cp /usr/share/mysql/my-medium.cnf /etc/my.cnf
- cp: overwrite `/etc/my.cnf'? y
- [root@linux-node1 ~]# vim /etc/my.cnf # add the following under the [mysqld] section to set the character set
- character-set-server = utf8
- init-connect = 'SET NAMES utf8'
- collation-server = utf8_general_ci
- [root@linux-node1 ~]# /etc/init.d/mysqld start # start the mysql service
- # Import the database schema and data
- [root@linux-node1 ~]# cd /usr/share/doc/zabbix-server-mysql-2.4.8/create/
- [root@linux-node1 create]# mysql -e 'create database zabbix character set utf8 collate utf8_bin;'
- [root@linux-node1 create]# mysql -e "grant all on zabbix.* to zabbix@localhost identified by 'zabbix';"
- [root@linux-node1 create]# mysql -uzabbix -pzabbix zabbix <schema.sql
- [root@linux-node1 create]# mysql -uzabbix -pzabbix zabbix <images.sql
- [root@linux-node1 create]# mysql -uzabbix -pzabbix zabbix <data.sql
- # Edit zabbix.conf in the apache configuration and start the httpd service
- [root@linux-node1 ~]# vim /etc/httpd/conf.d/zabbix.conf
- php_value date.timezone Asia/Shanghai
- [root@linux-node1 ~]# /etc/init.d/httpd start
- Starting httpd: httpd: apr_sockaddr_info_get() failed for linux-node1
- httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
- [ OK ]
-
- # Edit the zabbix-server configuration file so it can connect to the zabbix database, then restart the httpd service
- [root@linux-node1 ~]# vim /etc/zabbix/zabbix_server.conf
- DBHost=localhost
- DBName=zabbix
- DBUser=zabbix
- DBPassword=zabbix
- [root@linux-node1 ~]# /etc/init.d/httpd restart
- Stopping httpd: [ OK ]
- Starting httpd: httpd: apr_sockaddr_info_get() failed for linux-node1
- httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
- [ OK ]
- # Start the zabbix-server service
- [root@linux-node1 ~]# /etc/init.d/zabbix-server start
- Starting Zabbix server: [ OK ]
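A typo in the database settings is a common reason for the server failing to connect after it starts. As a rough sketch, the check below parses `key=value` text in the style of zabbix_server.conf and reports which of the four settings used in this walkthrough are missing. The names `EXPECTED`, `parse_conf`, and `missing_keys` are assumptions made for this example, and the list reflects only what this walkthrough sets, not what Zabbix itself requires.

```python
# Sample text in the style of /etc/zabbix/zabbix_server.conf, using the values
# from this walkthrough. Commented-out lines start with "#".
SAMPLE_CONF = """\
# DBHost=localhost
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=zabbix
"""

# The four settings edited in this walkthrough (an assumption of this sketch).
EXPECTED = ("DBHost", "DBName", "DBUser", "DBPassword")


def parse_conf(text):
    """Parse key=value lines, ignoring blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings, expected=EXPECTED):
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not settings.get(key)]


print(missing_keys(parse_conf(SAMPLE_CONF)))
```

To check a real installation, the file contents could be read and passed to `parse_conf` in place of `SAMPLE_CONF`.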
Complete the Zabbix installation in the web interface.
- # Edit the zabbix-agent configuration file, using the local machine for testing, and start zabbix-agent
- [root@linux-node1 ~]# vim /etc/zabbix/zabbix_agentd.conf
- Server=10.0.0.77
- [root@linux-node1 ~]# /etc/init.d/zabbix-agent start
- Starting Zabbix agent: [ OK ]
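Monitor the number of logged-in users reported by uptime. The following command extracts the user count, which is then written into the agent configuration file:
uptime | awk -F ' ' '{print $4}'
- # Add the custom item to the configuration file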
- [root@linux-node2 ~]# uptime
- 10:32:51 up 7:23, 2 users, load average: 0.23, 0.16, 0.05
- [root@linux-node2 ~]# uptime | awk -F ' ' '{print $4}'
- 2
-
- # Define a custom item under UserParameter=, in the form UserParameter=key,command (both parts are required; the command can be a command line or a script). A user parameter can return at most 512 KB of data.
- [root@linux-node2 ~]# vim /etc/zabbix/zabbix_agentd.conf
- UserParameter=login-user,uptime | awk -F ' ' '{print $4}'
- [root@linux-node2 ~]# /etc/init.d/zabbix-agent restart # the zabbix-agent service must be restarted
- Shutting down Zabbix agent: [ OK ]
- Starting Zabbix agent: [ OK ]
-
- # On the zabbix-server side, use zabbix_get to test whether this key can be retrieved from the agent
- [root@linux-node1 ~]# zabbix_get --help
- Zabbix get v2.4.8 (revision 59539) (20 April 2016)
-
- usage: zabbix_get [-hV] -s <host name or IP> [-p <port>] [-I <IP address>] -k <key>
-
- Options:
- -s --host <host name or IP> Specify host name or IP address of a host
- -p --port <port number> Specify port number of agent running on the host. Default is 10050
- -I --source-address <IP address> Specify source IP address
-
- -k --key <key of metric> Specify key of item to retrieve value for
-
- -h --help Display help information
- -V --version Display version number
-
- Example: zabbix_get -s 127.0.0.1 -p 10050 -k "system.cpu.load[all,avg1]"
- [root@linux-node1 ~]# zabbix_get -s 10.0.0.88 -k login-user # retrieved successfully
- 2
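The login-user key above takes the fourth whitespace-separated field of the uptime output. That works while the machine has been up for less than a day, but once uptime prints something like "up 3 days," the fourth field becomes "days," instead of the user count. A script used as the UserParameter command could match the number in front of "user(s)" instead. The following is a minimal sketch of that idea; the function name `user_count` and the second sample line are hypothetical.

```python
import re


def user_count(uptime_line):
    """Extract the logged-in user count from one line of uptime output."""
    match = re.search(r"(\d+)\s+users?\b", uptime_line)
    if match is None:
        raise ValueError("no user count found in: %r" % uptime_line)
    return int(match.group(1))


# First line is the output shown above; the second is a made-up line for a
# machine that has been up for several days.
print(user_count(" 10:32:51 up 7:23, 2 users, load average: 0.23, 0.16, 0.05"))
print(user_count(" 10:32:51 up 3 days, 7:23, 1 user, load average: 0.23, 0.16, 0.05"))
```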
Add a graph
- # Here we define our own alert script
- [root@linux-node1 ~]# grep "alertscripts" /etc/zabbix/zabbix_server.conf
- # AlertScriptsPath=${datadir}/zabbix/alertscripts
- AlertScriptsPath=/usr/lib/zabbix/alertscripts
- [root@linux-node1 ~]# cd /usr/lib/zabbix/alertscripts
- [root@linux-node1 alertscripts]# cat Send_mail.py
- #!/usr/bin/python
- # coding: utf-8
- # Usage: Send_mail.py <receiver> <subject> <mailbody>
- import smtplib
- import sys
- from email.mime.text import MIMEText
- from email.header import Header
-
- receiver = sys.argv[1]
- subject = sys.argv[2]
- mailbody = sys.argv[3]
- smtpserver = 'smtp.163.com'
- username = 'cactirsq@163.com'
- password = 'xxxxxx'
- sender = username
-
- # Build an HTML message encoded as UTF-8
- msg = MIMEText(mailbody, 'html', 'utf-8')
- msg['Subject'] = Header(subject, 'utf-8')
- msg['From'] = sender
- msg['To'] = receiver
-
- smtp = smtplib.SMTP()
- smtp.connect(smtpserver)
- smtp.starttls()  # switch to TLS before sending the credentials
- smtp.login(username, password)
- smtp.sendmail(sender, [receiver], msg.as_string())
- smtp.quit()
- [root@linux-node1 alertscripts]# chmod +x Send_mail.py
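The script reads the receiver, subject, and body from its three command-line arguments and wraps them in an HTML message. To check the message-building part without contacting a mail server, that step can be pulled into a function, as in the sketch below. The function name `build_message` and the example.com addresses are placeholders assumed for this example; they are not part of the original script.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_message(sender, receiver, subject, body):
    """Build a UTF-8 HTML mail message like the one Send_mail.py sends."""
    msg = MIMEText(body, "html", "utf-8")
    msg["Subject"] = Header(subject, "utf-8")
    msg["From"] = sender
    msg["To"] = receiver
    return msg


# Placeholder values for a dry run; nothing is sent.
msg = build_message(
    "zabbix@example.com",
    "ops@example.com",
    "PROBLEM: too many logged-in users",
    "<b>login-user is 5</b>",
)
print(msg["To"])
print(msg.get_content_type())
```

Keeping message construction separate from the SMTP calls makes it possible to inspect what an alert would look like before wiring the script into Zabbix.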
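Then configure it in the web interface.
Then give this user administrator permissions.
To test, open several more terminal sessions and wait a few minutes.
Reposted from https://blog.csdn.net/mr_rsq/article/details/80214937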