Kubernetes -- Pod Health Checks (Checking Pod Status with kubectl)

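Contents

I. Basic Concepts of Pod Probes

1. Pod Status

2. Judging Pod State More Accurately

3. Container Probes

4. Probe Results

II. Using Liveness Probes

1. Liveness Probe Example

2. Liveness Probe Flow

3. Viewing Liveness Probe Information

4. Advanced Probe Configuration

5. Advanced Probe Configuration

6. Liveness Probe - HTTP

7. Liveness Probe - TCP

III. Using Readiness Probes

1. Readiness Probes

2. Liveness Probes vs. Readiness Probes

3. Creating an HTTP Service

4. Viewing Endpoint Status

1. Check the Service status; the endpoints are as follows:

2. The Pod status is as follows:

3. Now enter the first container and delete its index.html file

5. Viewing the State After the Failure

1. Check the Service status

2. Recover the failed Pod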

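I. Basic Concepts of Pod Probes

1. Pod Status

1. Pod status information is defined in PodStatus. Its phase field holds the familiar values Pending, Running, Succeeded, Failed, and Unknown.

2. In which of these states can a Pod actually serve traffic?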

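2. Judging Pod State More Accurately

Kubernetes uses the probe (Probes) mechanism to judge Pod state more accurately.

A probe periodically checks the state of a running container and returns a result.

        1. Liveness probe.

                A liveness probe detects whether the container is still alive.

                If the probe fails, the kubelet tries to recover the container according to its restart policy.

        2. Readiness probe.

                If the readiness probe fails,

                the kubelet considers the container not ready to serve traffic,

                and the endpoints controller removes the Pod's address from the endpoints of every Service that matches the Pod.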

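3. Container Probes

The kubelet can periodically run diagnostics on a container.

To run a diagnostic, the kubelet calls a Handler implemented by the container. There are three Handler types:

        1. ExecAction: runs the specified command inside the container.

            If the command exits with status code 0 (the command succeeded), the diagnostic is considered successful.

        2. TCPSocketAction: performs a TCP check against the container's IP address on the specified port.

            If the port is open, the diagnostic is considered successful.

        3. HTTPGetAction: performs an HTTP GET request against the container's IP address on the specified port and path.

            If the response status code is >= 200 and < 400, the diagnostic is considered successful.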
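
The success rules for the exec and HTTP handlers can be tried out locally, without a cluster. The following is only a sketch of the decision rule, not of how the kubelet implements it; the function names `exec_probe` and `http_status_ok` are made up for this illustration.

```shell
#!/bin/sh
# Illustration only: these helpers are hypothetical, not part of kubectl or the kubelet.

# ExecAction rule: the probe succeeds when the command exits with status 0.
exec_probe() {
  if "$@" >/dev/null 2>&1; then
    echo success
  else
    echo failure
  fi
}

# HTTPGetAction rule: the probe succeeds when 200 <= status code < 400.
http_status_ok() {
  if [ "$1" -ge 200 ] && [ "$1" -lt 400 ]; then
    echo success
  else
    echo failure
  fi
}

exec_probe true       # prints: success
exec_probe false      # prints: failure
http_status_ok 302    # prints: success
http_status_ok 500    # prints: failure
```

A redirect such as 302 still counts as healthy under this rule, while any 4xx or 5xx response does not.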

4. Probe Results

Each probe returns one of three results: Success, Failure, or Unknown.

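II. Using Liveness Probes

1. Liveness Probe Example

1. This example uses a liveness probe in ExecAction mode.

2. The livenessProbe field defines the liveness probe in detail:

        - The handler is exec

        - It works by running the command cat /tmp/healthy

        - The initial delay and the probe period are both 5 seconds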

    $ kubectl apply -f- <<EOF    # create a Pod with a liveness probe; handler type is ExecAction
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        test: liveness
      name: liveness-exec
    spec:
      containers:
      - name: liveness
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        image: busybox
        livenessProbe:
          exec:                     # handler type
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 5    # seconds to wait before the first probe
          periodSeconds: 5          # interval between probes, in seconds
    EOF

Reference: Configure Liveness, Readiness and Startup Probes | Kubernetes
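
To get a feel for when the container in this example will be restarted, the timing can be estimated with a little arithmetic. The sketch below makes several simplifying assumptions: probes fire exactly at initialDelaySeconds plus whole multiples of periodSeconds (the real kubelet timing is less exact), a probe that fires at the same instant the file is removed still succeeds, and the failure threshold is 3. The function name `kill_decision_time` is hypothetical.

```shell
#!/bin/sh
# Illustration only: estimate the second at which the failure threshold is reached.
# Arguments: initial delay, period, failure threshold, second at which probes start failing.
kill_decision_time() {
  delay=$1
  period=$2
  threshold=$3
  fail_at=$4
  t=$delay
  # Advance to the first probe that fires strictly after the failure begins.
  while [ "$t" -le "$fail_at" ]; do
    t=$((t + period))
  done
  # The threshold is reached (threshold - 1) periods after the first failed probe.
  echo $((t + (threshold - 1) * period))
}

kill_decision_time 5 5 3 30    # prints: 45
```

Under these assumptions the file disappears at second 30, the probes at seconds 35, 40, and 45 fail, and the restart decision falls at about second 45. In the output in the next section the restart shows up roughly 75 seconds after the Pod was created; the gap is plausibly image pull and container shutdown time, though that is a guess.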

2. Liveness Probe Flow

    $ kubectl get pods liveness-exec    # check the Pod; RESTARTS shows that the container has been restarted once
    NAME            READY   STATUS    RESTARTS     AGE
    liveness-exec   1/1     Running   1 (5s ago)   80s
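
If only the restart count is of interest, it can be read directly from the Pod status with a JSONPath query. A minimal sketch, assuming the Pod has a single container; the wrapper name `restart_count` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the restart count of the first container in a Pod.
# Needs kubectl and access to the cluster when it is actually called.
restart_count() {
  kubectl get pod "$1" -o jsonpath='{.status.containerStatuses[0].restartCount}'
}

# Example call (commented out so the snippet can be loaded without a cluster):
# restart_count liveness-exec
```

To follow the restarts as they happen, `kubectl get pods liveness-exec -w` keeps watching the Pod and prints a new line on each change.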

3. Viewing Liveness Probe Information

Use the describe command to view the Pod details.

    $ kubectl describe pods liveness-exec    # view the current probe settings
    Name:         liveness-exec
    Namespace:    default
    Priority:     0
    Node:         k8s-worker1/192.168.147.103
    Start Time:   Fri, 23 Sep 2022 02:14:50 +0000
    Labels:       test=liveness
    Annotations:  cni.projectcalico.org/containerID: 48099a7a7855d118ee67216855ffd4caf24ee712a8082556717b8b2bfe971081
                  cni.projectcalico.org/podIP: 172.16.194.125/32
                  cni.projectcalico.org/podIPs: 172.16.194.125/32
    Status:       Running
    IP:           172.16.194.125
    IPs:
      IP:  172.16.194.125
    Containers:
      liveness:
        Container ID:  docker://36377afbec966d304f1e9b0ef4eaac51f3c582fe1aaf7c60698315ca46395d89
        Image:         busybox
        Image ID:      docker-pullable://busybox@sha256:5acba83a746c7608ed544dc1533b87c737a0b0fb730301639a0179f9344b1678
        Port:          <none>
        Host Port:     <none>
        Args:
          /bin/sh
          -c
          touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        State:          Running
          Started:      Fri, 23 Sep 2022 02:14:51 +0000
        Ready:          True
        Restart Count:  0
        Liveness:       exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lrgdh (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             True
      ContainersReady   True
      PodScheduled      True
    Volumes:
      kube-api-access-lrgdh:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Scheduled  24s   default-scheduler  Successfully assigned default/liveness-exec to k8s-worker1
      Normal  Pulling    24s   kubelet            Pulling image "busybox"
      Normal  Pulled     23s   kubelet            Successfully pulled image "busybox" in 793.091641ms
      Normal  Created    23s   kubelet            Created container liveness
      Normal  Started    23s   kubelet            Started container liveness

    $ kubectl describe pods liveness-exec | grep -i 'liveness:.*exec'    # filter for the probe settings
        Liveness:       exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
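
The abbreviations in that Liveness line correspond to fields of the probe spec. As a rough aid, the sketch below rewrites such a line into the matching field names; `probe_fields` is a hypothetical helper that only handles the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: translate the probe summary printed by `kubectl describe`
# into the corresponding probe field names, one per line.
probe_fields() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n \
    -e 's/^delay=\([0-9]*\)s$/initialDelaySeconds: \1/p' \
    -e 's/^timeout=\([0-9]*\)s$/timeoutSeconds: \1/p' \
    -e 's/^period=\([0-9]*\)s$/periodSeconds: \1/p' \
    -e 's/^#success=\([0-9]*\)$/successThreshold: \1/p' \
    -e 's/^#failure=\([0-9]*\)$/failureThreshold: \1/p'
}

probe_fields 'Liveness: exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3'
```

For the line above this prints initialDelaySeconds: 5, timeoutSeconds: 1, periodSeconds: 5, successThreshold: 1, and failureThreshold: 3, each on its own line.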

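4. Advanced Probe Configuration

1. The describe output in the previous step shows several probe settings.

2. delay=5s        The first probe runs 5 seconds after the container starts.

3. timeout=1s     The container must respond to the probe within 1 second; otherwise the probe counts as failed.

4. period=5s       A probe runs every 5 seconds.

5. #success=1    One successful probe is enough for the check to count as successful.

6. #failure=3       After 3 consecutive failed probes the check is considered failed and the container is restarted.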
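
The word "consecutive" matters here: a single success in between resets the count. The following sketch simulates that counting rule on a list of probe results; it illustrates the rule as described above and is not kubelet code, and `restart_trigger` is a made-up name.

```shell
#!/bin/sh
# Illustration only: count consecutive failures against a failure threshold.
# Arguments: the threshold, then a sequence of probe results ("ok" or "fail").
# Prints the 1-based index of the probe that triggers a restart, or "none".
restart_trigger() {
  threshold=$1
  shift
  streak=0
  index=0
  for result in "$@"; do
    index=$((index + 1))
    if [ "$result" = "fail" ]; then
      streak=$((streak + 1))
    else
      streak=0    # a success resets the failure streak
    fi
    if [ "$streak" -ge "$threshold" ]; then
      echo "$index"
      return 0
    fi
  done
  echo none
}

restart_trigger 3 ok fail fail ok fail fail fail    # prints: 7
restart_trigger 3 ok fail fail ok                   # prints: none
```

In the first call the two early failures do not count toward the restart, because the success at position 4 resets the streak.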

5. Advanced Probe Configuration

These settings can be specified explicitly in the probe definition. A sample configuration follows.

It does the same thing as the probe configured earlier, with an explicit timeout added.

Create

    $ kubectl apply -f- <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        test: liveness
      # name: liveness-exec
      name: liveness-exec3
    spec:
      containers:
      - name: liveness
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        image: busybox
        livenessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 5
          periodSeconds: 5
          # one line added
          timeoutSeconds: 3
    EOF

Query

    $ kubectl describe pods liveness-exec3 | grep -i 'liveness:.*'    # view the probe settings
      liveness:
        Liveness:       exec [cat /tmp/healthy] delay=5s timeout=3s period=5s #success=1 #failure=3

6. Liveness Probe - HTTP

1. An HTTP liveness probe periodically sends an HTTP GET request to the container.

    The probe definition specifies the request path, port, request headers, and so on.

2. The probe reports success only when the status code is >= 200 and < 400. In this example the probe starts failing about 10 seconds in (the events further down show status code 500), and

    the kubelet then restarts the container.

3. Create a Pod with an HTTP liveness probe (against an endpoint that ends up failing)

    $ kubectl apply -f- <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        test: liveness
      name: liveness-http
    spec:
      containers:
      - name: liveness
        image: mirrorgooglecontainers/liveness
        args:
        - /server
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            httpHeaders:
            - name: X-Custom-Header
              value: Awesome
          initialDelaySeconds: 3
          periodSeconds: 3
    EOF

4. Query

    $ kubectl get pods liveness-http
    NAME            READY   STATUS    RESTARTS   AGE
    liveness-http   1/1     Running   0          19s

    $ kubectl describe pods liveness-http    # view the Pod details
    Name:         liveness-http
    Namespace:    default
    Priority:     0
    Node:         k8s-worker1/192.168.147.103
    Start Time:   Fri, 23 Sep 2022 02:50:33 +0000
    Labels:       test=liveness
    Annotations:  cni.projectcalico.org/containerID: 02fe605f301c9c1c1c5e704c2864efbf567f24a4b2239fdda0fddb608ef8122f
                  cni.projectcalico.org/podIP: 172.16.194.127/32
                  cni.projectcalico.org/podIPs: 172.16.194.127/32
    Status:       Running
    IP:           172.16.194.127
    IPs:
      IP:  172.16.194.127
    Containers:
      liveness:
        Container ID:  docker://ec0d01ecaeed511e3e108f037ec539006c4983e81085c1cf51178cc081bb06bd
        Image:         mirrorgooglecontainers/liveness
        Image ID:      docker-pullable://mirrorgooglecontainers/liveness@sha256:854458862be990608ad916980f9d3c552ac978ff70ceb0f90508858ec8fc4a62
        Port:          <none>
        Host Port:     <none>
        Args:
          /server
        State:          Running
          Started:      Fri, 23 Sep 2022 02:52:20 +0000
        Last State:     Terminated
          Reason:       Error
          Exit Code:    2
          Started:      Fri, 23 Sep 2022 02:51:46 +0000
          Finished:     Fri, 23 Sep 2022 02:52:03 +0000
        Ready:          True
        Restart Count:  3
        Liveness:       http-get http://:8080/healthz delay=3s timeout=1s period=3s #success=1 #failure=3    # probe type http-get, with port, path, and parameters
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mwpj8 (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             True
      ContainersReady   True
      PodScheduled      True
    Volumes:
      kube-api-access-mwpj8:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason     Age                 From               Message
      ----     ------     ----                ----               -------
      Normal   Scheduled  2m7s                default-scheduler  Successfully assigned default/liveness-http to k8s-worker1    # the node the Pod was scheduled to
      Normal   Pulled     111s                kubelet            Successfully pulled image "mirrorgooglecontainers/liveness" in 14.77337733s
      Normal   Pulled     73s                 kubelet            Successfully pulled image "mirrorgooglecontainers/liveness" in 21.122850752s
      Normal   Created    54s (x3 over 111s)  kubelet            Created container liveness
      Normal   Started    54s (x3 over 111s)  kubelet            Started container liveness
      Normal   Pulled     54s                 kubelet            Successfully pulled image "mirrorgooglecontainers/liveness" in 747.964898ms
      Warning  Unhealthy  37s (x9 over 100s)  kubelet            Liveness probe failed: HTTP probe failed with statuscode: 500    # failed: 500 is outside the healthy range (>= 200 and < 400)
      Normal   Killing    37s (x3 over 94s)   kubelet            Container liveness failed liveness probe, will be restarted    # restarted because it is unhealthy
      Normal   Pulling    37s (x4 over 2m6s)  kubelet            Pulling image "mirrorgooglecontainers/liveness"

7. Liveness Probe - TCP

1. A TCP probe checks whether a connection can be established. This experiment deploys a telnet service and probes port 23.

2. The TCP probe parameters are similar to those of the HTTP probe.

3. Create the TCP probe

    $ kubectl apply -f- <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: ubuntu
      labels:
        app: ubuntu
    spec:
      containers:
      - name: ubuntu
        image: ubuntu
        args:
        - /bin/sh
        - -c
        - apt-get update && apt-get -y install openbsd-inetd telnetd && /etc/init.d/openbsd-inetd start; sleep 30000
        livenessProbe:
          tcpSocket:
            port: 23
          initialDelaySeconds: 60
          periodSeconds: 20
    EOF

4. Query

    $ kubectl get pods ubuntu    # check the status
    NAME     READY   STATUS    RESTARTS   AGE
    ubuntu   1/1     Running   0          60s
    $ kubectl get pods -o wide ubuntu    # look up the Pod IP
    NAME     READY   STATUS    RESTARTS   AGE     IP              NODE          NOMINATED NODE   READINESS GATES
    ubuntu   1/1     Running   0          2m13s   172.16.194.65   k8s-worker1   <none>           <none>

5. Test

    $ telnet 172.16.194.65
    Trying 172.16.194.65...
    Connected to 172.16.194.65.
    Escape character is '^]'.
    Ubuntu 20.04.3 LTS
    ubuntu login:
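III. Using Readiness Probes

1. Readiness Probes

In the manifest, the main difference between a liveness probe and a readiness probe is the keyword (livenessProbe vs. readinessProbe).

1. A Pod being alive does not mean it can serve traffic. After it is created,

    it usually still has to prepare data, install and start programs, and so on before it can serve requests.

2. A liveness probe indicates whether the Pod is alive;

    a readiness probe indicates whether the container is fully prepared and ready to serve traffic.

2. Liveness Probes vs. Readiness Probes

1. Like liveness probes, readiness probes

    can use any of the three handlers: ExecAction, TCPSocketAction, and HTTPGetAction.

2. A readiness probe detects and reports whether the Pod is ready to serve traffic.

    In real deployments the readiness check should be tied to the actual application.

|                              | Readiness probe                                     | Liveness probe                                                   |
|------------------------------|-----------------------------------------------------|------------------------------------------------------------------|
| When the Pod fails the check | Wait                                                | The container is killed and restarted, per the restart policy   |
| Service                      | On failure, the Pod is removed from the endpoints   | The endpoints are updated automatically with the new Pod info   |
| Purpose                      | Whether the Pod is ready to serve traffic           | Whether the Pod is alive                                         |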

3. Creating an HTTP Service

Create a Deployment and a Service for httpd,

and add a readiness probe that checks whether the index.html file exists.

    $ kubectl apply -f- <<EOF    # uses an exec probe to check whether the index file exists
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: httpd-deployment
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: httpd
      template:
        metadata:
          labels:
            app: httpd
        spec:
          containers:
          - name: httpd
            image: httpd
            ports:
            - containerPort: 80
            readinessProbe:
              exec:
                command:
                - cat
                - /usr/local/apache2/htdocs/index.html
              initialDelaySeconds: 5
              periodSeconds: 5
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: httpd-svc
    spec:
      selector:
        app: httpd
      ports:
      - protocol: TCP
        port: 8080
        targetPort: 80
    EOF
    # output after running:
    deployment.apps/httpd-deployment created
    service/httpd-svc created

4. Viewing Endpoint Status

1. Check the Service status; the endpoints are as follows:

    $ kubectl get deployments.apps httpd-deployment    # view the Deployment
    NAME               READY   UP-TO-DATE   AVAILABLE   AGE
    httpd-deployment   3/3     3            3           3m51s

    $ kubectl get service httpd-svc    # view the Service
    NAME        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
    httpd-svc   ClusterIP   10.98.114.43   <none>        8080/TCP   4m8s

    $ kubectl get endpoints | awk '(NR==1){print $0} /httpd-svc/{print $0}'    # view the endpoints
    NAME        ENDPOINTS                                           AGE
    httpd-svc   172.16.126.6:80,172.16.194.67:80,172.16.194.68:80   8m56s

2. The Pod status is as follows:

    $ kubectl get pod -l app=httpd -o wide    # list the Pods by label, with details
    NAME                                READY   STATUS    RESTARTS   AGE   IP              NODE          NOMINATED NODE   READINESS GATES
    httpd-deployment-564dc969bb-k4sbr   1/1     Running   0          12m   172.16.194.68   k8s-worker1   <none>           <none>
    httpd-deployment-564dc969bb-qhdd6   1/1     Running   0          12m   172.16.126.6    k8s-worker2   <none>           <none>
    httpd-deployment-564dc969bb-wxqt7   1/1     Running   0          12m   172.16.194.67   k8s-worker1   <none>           <none>

3. Now enter the first container and delete its index.html file

    $ kubectl exec -it httpd-deployment-564dc969bb-k4sbr -- /bin/sh    # open a shell in the Pod
    # rm /usr/local/apache2/htdocs/index.html    # inside the container: delete the index file
5. Viewing the State After the Failure

1. Check the Service status

The affected Pod is now reported as not ready (0/1), and its address has been removed from the endpoints:

    $ kubectl get pods -l app=httpd -o wide
    NAME                                READY   STATUS    RESTARTS   AGE   IP              NODE          NOMINATED NODE   READINESS GATES
    httpd-deployment-564dc969bb-k4sbr   0/1     Running   0          22m   172.16.194.68   k8s-worker1   <none>           <none>
    httpd-deployment-564dc969bb-qhdd6   1/1     Running   0          22m   172.16.126.6    k8s-worker2   <none>           <none>
    httpd-deployment-564dc969bb-wxqt7   1/1     Running   0          22m   172.16.194.67   k8s-worker1   <no
    $ kubectl get endpoints httpd-svc    # the endpoints list is one entry shorter
    NAME        ENDPOINTS                          AGE
    httpd-svc   172.16.126.6:80,172.16.194.67:80   23m
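
The READY column can also be checked per Pod by querying the Pod's Ready condition. A minimal sketch; `pod_ready` is a made-up wrapper name, and for the Pod whose index.html was removed it would be expected to print False.

```shell
#!/bin/sh
# Hypothetical helper: print the status of a Pod's Ready condition ("True" or "False").
# Needs kubectl and access to the cluster when it is actually called.
pod_ready() {
  kubectl get pod "$1" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Example call (commented out so the snippet can be loaded without a cluster):
# pod_ready httpd-deployment-564dc969bb-k4sbr
```

The STATUS column stays Running throughout, because a failed readiness probe does not stop or restart the container; only readiness changes.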

2. Recover the failed Pod

    $ kubectl delete pods httpd-deployment-564dc969bb-k4sbr    # after the Pod is deleted, the Deployment creates a replacement
    pod "httpd-deployment-564dc969bb-k4sbr" deleted
    $ kubectl get pods -l app=httpd    # check again: back to normal
    NAME                                READY   STATUS    RESTARTS   AGE
    httpd-deployment-564dc969bb-fwvq8   1/1     Running   0          32s
    httpd-deployment-564dc969bb-qhdd6   1/1     Running   0          28m
    httpd-deployment-564dc969bb-wxqt7   1/1     Running   0          28m

    $ kubectl get endpoints httpd-svc    # the endpoints have recovered as well
    NAME        ENDPOINTS                                           AGE
    httpd-svc   172.16.126.6:80,172.16.194.67:80,172.16.194.69:80   31m
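
Deleting the Pod is one way to recover. Since a failed readiness probe leaves the container running, another option is to put the missing file back in the running Pod; once the probe succeeds again, the Pod should return to 1/1 and its address should reappear in the endpoints. A sketch, assuming the Pod is still running; `restore_index` is a made-up name and the HTML content is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: recreate the file checked by the readiness probe inside a Pod.
# The HTML content is a placeholder chosen for this example.
restore_index() {
  kubectl exec "$1" -- /bin/sh -c \
    'echo "<html><body><h1>It works!</h1></body></html>" > /usr/local/apache2/htdocs/index.html'
}

# Example call (commented out; needs a running Pod whose file was removed):
# restore_index httpd-deployment-564dc969bb-k4sbr
```

This keeps the original Pod and its IP address, whereas deleting the Pod gives the replacement a new name and a new address, as the output above shows.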
