Kubernetes for Development Engineers: Readiness Probes and Services


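In "Kubernetes for Development Engineers: Startup, Liveness, and Readiness Probes" we described the special relationship between readiness probes and Services. A failed readiness probe does not mean the whole program is dead. The container may only be temporarily unable to serve traffic, for example when a CPU spike makes the probe time out. In that case the readiness probe, unlike a liveness probe, does not try to restart the container. It simply removes the Pod from the Services associated with it.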
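To make that behavior concrete, here is a minimal shell sketch of how a run of probe results maps onto the Ready state. It is a simplified model, not kubelet's actual code. The function name `simulate_readiness` and its argument layout are hypothetical; the two thresholds correspond to the probe fields `failureThreshold` and `successThreshold`.

```shell
#!/bin/sh
# Simplified model of readiness bookkeeping (an illustration, not kubelet code).
# Usage: simulate_readiness FAILURE_THRESHOLD SUCCESS_THRESHOLD RESULT...
# Each RESULT is "ok" or "fail". Prints the Ready state after every probe.
simulate_readiness() {
  failure_threshold=$1
  success_threshold=$2
  shift 2
  ready=false   # a container with a readiness probe starts out not ready
  ok_run=0      # consecutive successes so far
  fail_run=0    # consecutive failures so far
  for result in "$@"; do
    if [ "$result" = "ok" ]; then
      ok_run=$((ok_run + 1))
      fail_run=0
      if [ "$ok_run" -ge "$success_threshold" ]; then
        ready=true    # Pod is (re)added to the Service's ready addresses
      fi
    else
      fail_run=$((fail_run + 1))
      ok_run=0
      if [ "$fail_run" -ge "$failure_threshold" ]; then
        ready=false   # Pod is taken out of the Service; no restart happens
      fi
    fi
    echo "probe=$result ready=$ready"
  done
}
```

Calling `simulate_readiness 2 1 fail ok fail fail ok` shows that a single failure below the threshold leaves the state unchanged, and that one success is enough to become ready again when the success threshold is 1.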

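Nginx with a Readiness Probe

The container's command waits 3 seconds, creates /tempdir/readiness-nginx, and then idles in a loop; the readiness probe runs cat on that file. Note that setting command replaces the image's default startup command, so the nginx server itself is not started in this experiment.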

apiVersion: apps/v1
kind: Deployment
metadata:
  name: readiness-nginx-deployment
spec:
  selector:
    matchLabels:
      app: readiness-nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: readiness-nginx
    spec:
      containers:
      - name: readiness-nginx-container
        image: nginx
        ports:
        - containerPort: 80
        command: ["/bin/sh", "-c", "sleep 3; touch /tempdir/readiness-nginx; while true; do sleep 5; done"]
        volumeMounts:
        - name:  probe-volume
          mountPath:  /tempdir
        readinessProbe:
          exec:
            command:
            - cat
            - /tempdir/readiness-nginx
          initialDelaySeconds: 2
          failureThreshold: 6
          periodSeconds: 1
          successThreshold: 1
      volumes:
      - name: probe-volume
        emptyDir: 
          medium: Memory
          sizeLimit: 1Gi

The Service for the Nginx Pods

kind: Service
apiVersion: v1
metadata:
  name: readiness-nginx-service
spec:
  selector:
    app: readiness-nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

Experiment

After creating the components above, we can see that the following Pods have started:

kubectl get pod -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP             NODE      NOMINATED NODE   READINESS GATES
readiness-nginx-deployment-57b7fd5644-7x7wc   1/1     Running   0          25s   10.1.43.223    ubuntuc   <none>           <none>
readiness-nginx-deployment-57b7fd5644-lhszp   1/1     Running   0          25s   10.1.209.155   ubuntub   <none>           <none>

The Service's endpoints also list these Pod IPs:

kubectl describe endpoints readiness-nginx-service 
Name:         readiness-nginx-service
Namespace:    default
Labels:       <none>
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2023-08-14T14:35:33Z
Subsets:
  Addresses:          10.1.209.155,10.1.43.223
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  80    TCP

Events:  <none>

Now pick one Pod (readiness-nginx-deployment-57b7fd5644-7x7wc, 10.1.43.223) and look at its status and events:

kubectl describe pod readiness-nginx-deployment-57b7fd5644-7x7wc
Name:             readiness-nginx-deployment-57b7fd5644-7x7wc
Namespace:        default
Priority:         0
Service Account:  default
Node:             ubuntuc/172.22.247.176
Start Time:       Mon, 14 Aug 2023 14:35:27 +0000
Labels:           app=readiness-nginx
                  pod-template-hash=57b7fd5644
Annotations:      cni.projectcalico.org/containerID: c475d3e82ff0d5adbd35252ab990608ad75955f8d0862bb8b0c54ee60a0878eb
                  cni.projectcalico.org/podIP: 10.1.43.223/32
                  cni.projectcalico.org/podIPs: 10.1.43.223/32
Status:           Running
IP:               10.1.43.223
IPs:
  IP:           10.1.43.223
Controlled By:  ReplicaSet/readiness-nginx-deployment-57b7fd5644
Containers:
  readiness-nginx-container:
    Container ID:  containerd://5d82d8467bc6e0c8151e40ee3258d54bffec8659bcdad4a441848ea8f77a3223
    Image:         nginx
    Image ID:      docker.io/library/nginx@sha256:67f9a4f10d147a6e04629340e6493c9703300ca23a2f7f3aa56fe615d75d31ca
    Port:          80/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      -c
      sleep 3; touch /tempdir/readiness-nginx; while true; do sleep 5; done
    State:          Running
      Started:      Mon, 14 Aug 2023 14:35:30 +0000
    Ready:          True
    Restart Count:  0
    Readiness:      exec [cat /tempdir/readiness-nginx] delay=2s timeout=1s period=1s #success=1 #failure=6
    Environment:    <none>
    Mounts:
      /tempdir from probe-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c4tcl (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  probe-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  1Gi
  kube-api-access-c4tcl:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  3m53s                  default-scheduler  Successfully assigned default/readiness-nginx-deployment-57b7fd5644-7x7wc to ubuntuc
  Normal   Pulling    3m53s                  kubelet            Pulling image "nginx"
  Normal   Pulled     3m50s                  kubelet            Successfully pulled image "nginx" in 2.489885583s (2.489893984s including waiting)
  Normal   Created    3m50s                  kubelet            Created container readiness-nginx-container
  Normal   Started    3m50s                  kubelet            Started container readiness-nginx-container
  Warning  Unhealthy  3m48s (x2 over 3m48s)  kubelet            Readiness probe failed: cat: /tempdir/readiness-nginx: No such file or directory

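The events show that the readiness probe failed twice while the file did not exist yet, and then succeeded on the third check once the file had been created. At this point the Pod's Ready and ContainersReady conditions are both True.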
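The two initial failures are consistent with the probe settings. The following is a rough sketch under an idealized timeline: it assumes probes fire exactly at initialDelaySeconds and then every periodSeconds after the container starts, and that the file appears just after the 3-second sleep ends. Real kubelet scheduling adds jitter, so the count can differ by one in practice. The function name `count_initial_failures` is hypothetical.

```shell
#!/bin/sh
# count_initial_failures INITIAL_DELAY PERIOD FILE_DELAY   (all in whole seconds)
# Prints how many probes fail before the first success in an idealized timeline.
count_initial_failures() {
  initial_delay=$1
  period=$2
  file_delay=$3
  t=$initial_delay   # time of the first probe
  failures=0
  # Assumption: a probe at second t succeeds only if t is strictly later than
  # file_delay, because `touch` runs just after `sleep 3` returns.
  while [ "$t" -le "$file_delay" ]; do
    failures=$((failures + 1))
    t=$((t + period))
  done
  echo "$failures"
}

# Values from the Deployment: initialDelaySeconds=2, periodSeconds=1, sleep 3.
count_initial_failures 2 1 3   # prints 2
```

Under these assumptions the probes at 2s and 3s fail and the one at 4s succeeds, which lines up with the "x2" count in the Unhealthy event.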

Ready -> Not Ready

Now delete the readiness marker file:

kubectl exec pods/readiness-nginx-deployment-57b7fd5644-7x7wc --container readiness-nginx-container -- rm /tempdir/readiness-nginx

Describing the Pod again shows the following:

Name:             readiness-nginx-deployment-57b7fd5644-7x7wc
Namespace:        default
Priority:         0
Service Account:  default
Node:             ubuntuc/172.22.247.176
Start Time:       Mon, 14 Aug 2023 14:35:27 +0000
Labels:           app=readiness-nginx
                  pod-template-hash=57b7fd5644
Annotations:      cni.projectcalico.org/containerID: c475d3e82ff0d5adbd35252ab990608ad75955f8d0862bb8b0c54ee60a0878eb
                  cni.projectcalico.org/podIP: 10.1.43.223/32
                  cni.projectcalico.org/podIPs: 10.1.43.223/32
Status:           Running
IP:               10.1.43.223
IPs:
  IP:           10.1.43.223
Controlled By:  ReplicaSet/readiness-nginx-deployment-57b7fd5644
Containers:
  readiness-nginx-container:
    Container ID:  containerd://5d82d8467bc6e0c8151e40ee3258d54bffec8659bcdad4a441848ea8f77a3223
    Image:         nginx
    Image ID:      docker.io/library/nginx@sha256:67f9a4f10d147a6e04629340e6493c9703300ca23a2f7f3aa56fe615d75d31ca
    Port:          80/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      -c
      sleep 3; touch /tempdir/readiness-nginx; while true; do sleep 5; done
    State:          Running
      Started:      Mon, 14 Aug 2023 14:35:30 +0000
    Ready:          False
    Restart Count:  0
    Readiness:      exec [cat /tempdir/readiness-nginx] delay=2s timeout=1s period=1s #success=1 #failure=6
    Environment:    <none>
    Mounts:
      /tempdir from probe-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c4tcl (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  probe-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  1Gi
  kube-api-access-c4tcl:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                 From     Message
  ----     ------     ----                ----     -------
  Warning  Unhealthy  7s (x22 over 6m6s)  kubelet  Readiness probe failed: cat: /tempdir/readiness-nginx: No such file or directory

Ready and ContainersReady have both changed to False.
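Instead of scanning the full describe output, the Ready condition can also be read on its own. This is a minimal sketch that assumes kubectl is configured for the cluster used in this experiment. The wrapper name `pod_ready_status` is hypothetical; the JSONPath filter is standard kubectl output syntax.

```shell
#!/bin/sh
# Prints the status ("True" or "False") of a Pod's Ready condition.
# $1 is the Pod name.
pod_ready_status() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Example (requires a cluster):
#   pod_ready_status readiness-nginx-deployment-57b7fd5644-7x7wc
```

At this point in the experiment the call should report False for the Pod whose marker file was deleted.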
Now look at the Service again:

kubectl describe endpoints readiness-nginx-service 
Name:         readiness-nginx-service
Namespace:    default
Labels:       <none>
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2023-08-14T14:41:18Z
Subsets:
  Addresses:          10.1.209.155
  NotReadyAddresses:  10.1.43.223
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  80    TCP

Events:  <none>

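The Pod whose readiness marker file was deleted has been removed from the Service's ready addresses and now appears under NotReadyAddresses.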
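When repeating this check several times, it can help to reduce the describe output to the two address lines. The sketch below reads the text format shown above from standard input; the function name `summarize_endpoints` is hypothetical, and it assumes the field labels stay exactly `Addresses:` and `NotReadyAddresses:`.

```shell
#!/bin/sh
# Reads `kubectl describe endpoints <name>` output on stdin and prints the
# ready and not-ready address lists, one per line.
summarize_endpoints() {
  awk '
    $1 == "Addresses:"         { print "ready=" $2 }
    $1 == "NotReadyAddresses:" { print "not_ready=" $2 }
  '
}

# Example (requires a cluster):
#   kubectl describe endpoints readiness-nginx-service | summarize_endpoints
```

Fed the output above, it prints `ready=10.1.209.155` followed by `not_ready=10.1.43.223`.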

Not Ready -> Ready

Now restore the marker file:

kubectl exec pods/readiness-nginx-deployment-57b7fd5644-7x7wc --container readiness-nginx-container -- touch /tempdir/readiness-nginx

Looking at the Pod again, its Ready and ContainersReady conditions are True once more:

Name:             readiness-nginx-deployment-57b7fd5644-7x7wc
Namespace:        default
Priority:         0
Service Account:  default
Node:             ubuntuc/172.22.247.176
Start Time:       Mon, 14 Aug 2023 14:35:27 +0000
Labels:           app=readiness-nginx
                  pod-template-hash=57b7fd5644
Annotations:      cni.projectcalico.org/containerID: c475d3e82ff0d5adbd35252ab990608ad75955f8d0862bb8b0c54ee60a0878eb
                  cni.projectcalico.org/podIP: 10.1.43.223/32
                  cni.projectcalico.org/podIPs: 10.1.43.223/32
Status:           Running
IP:               10.1.43.223
IPs:
  IP:           10.1.43.223
Controlled By:  ReplicaSet/readiness-nginx-deployment-57b7fd5644
Containers:
  readiness-nginx-container:
    Container ID:  containerd://5d82d8467bc6e0c8151e40ee3258d54bffec8659bcdad4a441848ea8f77a3223
    Image:         nginx
    Image ID:      docker.io/library/nginx@sha256:67f9a4f10d147a6e04629340e6493c9703300ca23a2f7f3aa56fe615d75d31ca
    Port:          80/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      -c
      sleep 3; touch /tempdir/readiness-nginx; while true; do sleep 5; done
    State:          Running
      Started:      Mon, 14 Aug 2023 14:35:30 +0000
    Ready:          True
    Restart Count:  0
    Readiness:      exec [cat /tempdir/readiness-nginx] delay=2s timeout=1s period=1s #success=1 #failure=6
    Environment:    <none>
    Mounts:
      /tempdir from probe-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c4tcl (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  probe-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  1Gi
  kube-api-access-c4tcl:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From     Message
  ----     ------     ----                  ----     -------
  Warning  Unhealthy  3m5s (x262 over 13m)  kubelet  Readiness probe failed: cat: /tempdir/readiness-nginx: No such file or directory

The Service has added it back as well:

Name:         readiness-nginx-service
Namespace:    default
Labels:       <none>
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2023-08-14T14:48:23Z
Subsets:
  Addresses:          10.1.209.155,10.1.43.223
  NotReadyAddresses:  <none>
  Ports:
    Name     Port  Protocol
    ----     ----  --------
    <unset>  80    TCP

Events:  <none>
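Both transitions come down to deleting or recreating one file, so they can be wrapped in a small helper for repeated runs. This is a sketch only: the function name `set_ready_marker` is hypothetical, the container name and file path are the ones from the Deployment above, and `rm -f` is used instead of plain `rm` so that running it twice does not fail.

```shell
#!/bin/sh
# set_ready_marker POD on|off
#   on  -> recreate the marker file (the Pod should become Ready again)
#   off -> delete the marker file (the Pod should become Not Ready)
set_ready_marker() {
  pod=$1
  state=$2
  case "$state" in
    on)  set -- touch ;;
    off) set -- rm -f ;;
    *)   echo "usage: set_ready_marker POD on|off" >&2; return 2 ;;
  esac
  # "$@" now holds the command to run inside the container.
  kubectl exec "pods/$pod" --container readiness-nginx-container -- \
    "$@" /tempdir/readiness-nginx
}

# Example (requires a cluster):
#   set_ready_marker readiness-nginx-deployment-57b7fd5644-7x7wc off
```

With failureThreshold set to 6 and periodSeconds set to 1, expect a delay of a few seconds after `off` before the Pod moves to NotReadyAddresses.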