Some time after dealing with the kubelet.go node "master" not found problem, I hit the same issue again on other nodes. This time it surfaced as /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory. I had already written up the earlier incident in Fixing the k8s kubelet.go node "master" not found problem[1]. If you follow that earlier approach and delete /etc/kubernetes/bootstrap-kubelet.conf, the kubelet.go node "master" not found error is likely to come back, and back then I resolved it by swapping admin.conf in as the kubelet's startup kubeconfig.

But I later realized what had actually happened: the kubelet's certificate had expired, and renewing the certificates is what triggered the error above. That misled me into deleting the --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf flag from 10-kubeadm.conf, restarting, and replacing kubelet.conf with the master's admin.conf. That made the error go away, but it only masked the real problem.

The root cause is that the kubelet's certificate was never renewed. This happens after manually extending the certificate expiry dates: kubeadm's certificate renewal does not renew the kubelet's own certificate (in reality, client certificate rotation had failed). So once the kubelet was restarted, the certificates no longer matched. The earlier trick of replacing kubelet.conf with the master's admin.conf only appeared to work because the kubelet had not been restarted afterwards.
Let's look at the same error, here on Kubernetes 1.16:
- 2月 09 16:41:11 master systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
- 2月 09 16:41:11 master systemd[1]: Unit kubelet.service entered failed state.
- 2月 09 16:41:11 master systemd[1]: kubelet.service failed.
- 2月 09 16:41:22 master systemd[1]: kubelet.service holdoff time over, scheduling restart.
- 2月 09 16:41:22 master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
- 2月 09 16:41:22 master systemd[1]: Started kubelet: The Kubernetes Node Agent.
- 2月 09 16:41:22 master kubelet[74138]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
- 2月 09 16:41:22 master kubelet[74138]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
- 2月 09 16:41:22 master kubelet[74138]: I0209 16:41:22.222741 74138 server.go:410] Version: v1.16.3
- 2月 09 16:41:22 master kubelet[74138]: I0209 16:41:22.223911 74138 plugins.go:100] No cloud provider specified.
- 2月 09 16:41:22 master kubelet[74138]: I0209 16:41:22.223954 74138 server.go:773] Client rotation is on, will bootstrap in background
- 2月 09 16:41:22 master systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
- 2月 09 16:41:22 master kubelet[74138]: E0209 16:41:22.227202 74138 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2021-03-18 08:46:29 +0000 UTC
- 2月 09 16:41:22 master kubelet[74138]: F0209 16:41:22.227239 74138 server.go:271] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
- 2月 09 16:41:22 master systemd[1]: Unit kubelet.service entered failed state.
- 2月 09 16:41:22 master systemd[1]: kubelet.service failed.
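The two decisive lines in that journal dump are the E0209 bootstrap.go one (part of the existing bootstrap client certificate is expired) and the F0209 server.go one (bootstrap-kubelet.conf missing). A small filter can pull exactly these out of a longer journal dump; the function name and the pattern list below are my own shorthand, not part of any tool:

```shell
# extract_cert_failures: filter a kubelet journal dump down to the lines that
# point at certificate problems (expired certs, missing bootstrap kubeconfig,
# x509 validation errors). The pattern list is an assumption based on the
# errors shown in the log above.
extract_cert_failures() {
  grep -E 'certificate is expired|bootstrap-kubelet\.conf|x509' "$@"
}

# Typical use on a node:
# journalctl -u kubelet --no-pager | extract_cert_failures
```

This keeps the triage focused: the deprecation warnings and restart churn from systemd are noise; the certificate lines are the signal.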

The earlier approach was simply to delete the /etc/kubernetes/bootstrap-kubelet.conf part (this is a kubeadm install). That flag lives in the kubelet's startup configuration, which you can inspect with the command below (the dates in this output don't matter; it is only for illustration):
- [root@master ~]# systemctl status kubelet
- ● kubelet.service - kubelet: The Kubernetes Node Agent
- Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
- Drop-In: /usr/lib/systemd/system/kubelet.service.d
- └─10-kubeadm.conf
- Active: active (running) since Thu 2021-12-30 03:08:09 CST; 1 months 23 days ago
- Docs: https://kubernetes.io/docs/
- Main PID: 32478 (kubelet)
- Tasks: 29
- Memory: 106.9M
- CGroup: /system.slice/kubelet.service
- └─32478 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/confi...
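The --bootstrap-kubeconfig flag visible in the CGroup line above comes from the 10-kubeadm.conf drop-in. A quick way to list exactly which kubeconfig files the unit will ask the kubelet to load (the helper name is mine; the drop-in path shown is the kubeadm default):

```shell
# list_kubeconfig_flags: print the kubeconfig-related flags found in a kubelet
# unit file or drop-in, one per line, so you can see exactly which files the
# kubelet will try to load on start.
list_kubeconfig_flags() {
  # '--' separates options from the pattern, which itself starts with '--'
  grep -oE -- '--(bootstrap-)?kubeconfig=[^" ]+' "$1"
}

# On a kubeadm node you would typically inspect the drop-in:
# list_kubeconfig_flags /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
```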
First, check the certificates managed by kubeadm:
- [root@master pki]# kubeadm alpha certs check-expiration
- CERTIFICATE EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
- admin.conf Feb 07, 2032 08:31 UTC 9y no
- apiserver Feb 07, 2032 08:31 UTC 9y no
- apiserver-etcd-client Feb 07, 2032 08:31 UTC 9y no
- apiserver-kubelet-client Feb 07, 2032 08:31 UTC 9y no
- controller-manager.conf Feb 07, 2032 08:31 UTC 9y no
- etcd-healthcheck-client Feb 07, 2032 08:31 UTC 9y no
- etcd-peer Feb 07, 2032 08:31 UTC 9y no
- etcd-server Feb 07, 2032 08:31 UTC 9y no
- front-proxy-client Feb 07, 2032 08:31 UTC 9y no
- scheduler.conf Feb 07, 2032 08:31 UTC 9y no
The dates shown here are fine. Next, check the kubelet's own certificate: the file referenced by kubelet.conf is a symlink under /var/lib/kubelet/pki, so we check the expiry dates there:
- [root@master ]# cd /var/lib/kubelet/pki
- [root@master pki]# ls
- kubelet-client-2020-03-18-16-46-37.pem kubelet-client-2021-01-28-09-11-35.pem kubelet-client-current.pem kubelet.key
- kubelet-client-2020-03-18-16-47-03.pem kubelet-client-2022-02-09-16-22-05.pem kubelet.crt
- [root@master pki]# openssl x509 -noout -enddate -in ./kubelet.crt
- notAfter=Mar 18 07:46:26 2021 GMT
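This expiry check is worth wrapping in a small helper so it can be run against every file under /var/lib/kubelet/pki. The function name is mine; openssl and GNU date are the only dependencies:

```shell
# check_cert_expiry: print whether a PEM certificate has already expired,
# by comparing its notAfter date against the current time.
check_cert_expiry() {
  cert="$1"
  end=$(openssl x509 -noout -enddate -in "$cert" | cut -d= -f2)
  end_ts=$(date -d "$end" +%s)   # GNU date; parses "Mar 18 07:46:26 2021 GMT"
  now_ts=$(date +%s)
  if [ "$end_ts" -le "$now_ts" ]; then
    echo "EXPIRED: $end"
  else
    echo "valid until: $end"
  fi
}

# On the node from the listing above you would run, e.g.:
# check_cert_expiry /var/lib/kubelet/pki/kubelet.crt
# check_cert_expiry /var/lib/kubelet/pki/kubelet-client-current.pem
```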
We can see the certificate expired at Mar 18 07:46:26 2021 GMT, i.e. it has been expired since March 18, 2021. The kubelet-client-2022-02-09-16-22-05.pem file was produced by kubeadm alpha certs renew all, hence the different date. The kubeadm-managed certificates have ten years of validity, so they are not the problem; but this pem's date still doesn't line up with ours, and the kubelet client certificate itself was never renewed.
This comes from the article Kubelet client certificate rotation fails[2] (from the kubeadm troubleshooting docs); the original reads:
By default, kubeadm configures a kubelet with automatic rotation of client certificates by using the /var/lib/kubelet/pki/kubelet-client-current.pem symlink specified in /etc/kubernetes/kubelet.conf. If this rotation process fails you might see errors such as x509: certificate has expired or is not yet valid in kube-apiserver logs. To fix the issue you must follow these steps:
1. Backup and delete /etc/kubernetes/kubelet.conf and /var/lib/kubelet/pki/kubelet-client* from the failed node.
2. From a working control plane node in the cluster that has /etc/kubernetes/pki/ca.key execute kubeadm kubeconfig user --org system:nodes --client-name system:node:$NODE > kubelet.conf. $NODE must be set to the name of the existing failed node in the cluster. Modify the resulted kubelet.conf manually to adjust the cluster name and server endpoint, or pass kubeconfig user --config (it accepts InitConfiguration). If your cluster does not have the ca.key you must sign the embedded certificates in the kubelet.conf externally.
3. Copy this resulted kubelet.conf to /etc/kubernetes/kubelet.conf on the failed node.
4. Restart the kubelet (systemctl restart kubelet) on the failed node and wait for /var/lib/kubelet/pki/kubelet-client-current.pem to be recreated.
5. Manually edit the kubelet.conf to point to the rotated kubelet client certificates, by replacing client-certificate-data and client-key-data with:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
6. Restart the kubelet.
7. Make sure the node becomes Ready.
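Step 5 of those instructions is easy to verify afterwards: a correctly repaired kubelet.conf references the rotating symlink instead of embedding base64 credential blobs. A minimal check (the function name is mine; the paths are the kubeadm defaults):

```shell
# uses_rotating_symlink: succeed (exit 0) if the given kubeconfig references
# the kubelet's rotating client-certificate symlink rather than embedded
# client-certificate-data / client-key-data blobs.
uses_rotating_symlink() {
  grep -q 'client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem' "$1" &&
  grep -q 'client-key: /var/lib/kubelet/pki/kubelet-client-current.pem' "$1" &&
  ! grep -q 'client-certificate-data:' "$1"
}

# Typical use on the repaired node:
# uses_rotating_symlink /etc/kubernetes/kubelet.conf && echo "rotation wired up"
```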
There are several approaches to this on GitHub, though this particular one was panned by some heavyweights there as too crude. The fix is to copy the values of the client-certificate-data and client-key-data keys from /etc/kubernetes/admin.conf and paste those strings into the same keys in /etc/kubernetes/kubelet.conf. After that, it is just a service kubelet restart.
- [root@master kubernetes]# cat admin.conf
- apiVersion: v1
- clusters:
- - cluster:
- certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01ETXhPREE0TkRZeU4xb1hEVE13TURNeE5qQTRORFl5TjFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTGFaClRWNODRKWVBPM09yKzdVbS9KN29sRVFEa3RGT3RWWHg0NWhQU0MrVkhWVEZib1JvOWEKNnVHT05iTWNHWVJjcERBbUZSU2pycnFlaFhmbTNjVWJaRUxrdmpTNXFsaFVONGlYak9idFFVYnQ4cHREYU9QSgo1cDUybjRnczdKMU92bzhKRjYzYU83Vy91cHdJS05MOEovWlpUVTh0YlU1TklkUzZCMXE1cFRSQTFBVT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
- server: https://master:6443
- name: kubernetes
- contexts:
- - context:
- cluster: kubernetes
- user: kubernetes-admin
- name: kubernetes-admin@kubernetes
- current-context: kubernetes-admin@kubernetes
- kind: Config
- preferences: {}
- users:
- - name: kubernetes-admin
- user:
- client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJUm91STNYU1ZTak13RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TURBek1UZ3dPRFEyTWpkYUZ3MHpNakF5TURjd09ETXhNVGxhTURReApGekFWQmdOVkJBb1REb9FUmJWenpRQndxZ1djMkMrbmVmRlNYK0FQMHdrL2VmdXJpdGRqUTAKeFhVNjgwNnF0b1hzM3VHaWtNQkc1WmQzT2srLzc5NlZGM29TZllObU5CaVAxY3FjVUJIcVFpOTdQNVZSL2RmawpaR0phMVJoNE5aRk9IaXVqRXFFOGQxUFVLOTg0SHNxOTcxN0dIelRaZGNDMW1EcFF3d3FUdktVRlZOa3hQdFljCjdDWkl1QUltZWFwcXlQVkFhdEp5Vk5kVy9NRlVya0ZjTHZFMnlRQ1pXd1NxL3RnSDFtMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
- client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBcXVmRUo2NG9wR2txM1Vzd21SNGFiOTRuS0RjTTFMSWRsYnBXVkIraDAzZGp5K0ZICnJsRVVSVUdESEtBZjIvN2EwbTNrS0xoSWVudC9GRVRxSm5Kd3RUUzdmUDlDVzVwUGR2OHdEQ3o3U1dzK1ZrczcKTTVjcXhMNFovem5ySU9LZ2FmQzIyaTVFdjgrRjBqdW85b1lES3VwMFQ0bmxON3dNeXdjN1dFS0dNcGtEZGNnTgpwem1kTGZDSzQvNXdWeFhVcDFvTDJ1OHowV0RLKzcyN3plaFVMcFpZN0lXRG1PRnd2YzFxcmp6RFBCYWNxd3MwCnJyMkx6RXllRWt6cUZpd3BkcXBmbE4rYkxTZkN3ekNlWFdTcEVQ5UEVnV0dEWFlaYUhGTzBRZVF0a2Vnd2xoeWdXeXNZOTBBZnArbQpOeVByZW8zRngzaTlBUG9QeWRuNHFtbVd2dmhiT2FhUGZyK1pBUmFOa0JCaXc1OUw3eW5IMVhLcExMMDBGZHlCClFRYS8rUUtCZ1FDYzFLaXV3Ui9ZWGY5aGtKeWVZRTZHUXhKeEc2OWl2MDNuZm1ldi9zeExKZDY3WmxBemRrbDgKc3Vtb29uK0dhc0V4SGFqQUhkVVlNZmplU2ZxUkNOR1FISWM4cGFNYjQxbFErRGowRlBydzRHeThjcTBNWEtleQpIelduazQrVmpXeW9URVJoTnpkSEVUdXFKUG51TFdqbFhSaFhLWCtIVmVZVUdwN3pRNHFXQWc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=

After modification it looks like this:
- [root@master kubernetes]# cat kubelet.conf
- apiVersion: v1
- clusters:
- - cluster:
- certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01ETXhPREE0TkRZeU4xb1hEVE13TURNeE5qQTRORFl5TjFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTGFaClRWNODRKWVBPM09yKzdVbS9KN29sRVFEa3RGT3RWWHg0NWhQU0MrVkhWVEZib1JvOWEKNnVHT05iTWNHWVJjcERBbUZSU2pycnFlaFhmbTNjVWJaRUxrdmpTNXFsaFVONGlYak9idFFVYnQ4cHREYU9QSgo1cDUybjRnczdKMU92bzhKRjYzYU83Vy91cHdJS05MOEovWlpUVTh0YlU1TklkUzZCMXE1cFRSQTFBVT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
- server: https://master:6443
- name: kubernetes
- contexts:
- - context:
- cluster: kubernetes
- user: system:node:master
- name: system:node:master@kubernetes
- current-context: system:node:master@kubernetes
- kind: Config
- preferences: {}
- users:
- - name: system:node:master
- user:
- client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJUm91STNYU1ZTak13RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB5TURBek1UZ3dPRFEyTWpkYUZ3MHpNakF5TURjd09ETXhNVGxhTURReApGekFWQmdOVkJBb1REb9FUmJWenpRQndxZ1djMkMrbmVmRlNYK0FQMHdrL2VmdXJpdGRqUTAKeFhVNjgwNnF0b1hzM3VHaWtNQkc1WmQzT2srLzc5NlZGM29TZllObU5CaVAxY3FjVUJIcVFpOTdQNVZSL2RmawpaR0phMVJoNE5aRk9IaXVqRXFFOGQxUFVLOTg0SHNxOTcxN0dIelRaZGNDMW1EcFF3d3FUdktVRlZOa3hQdFljCjdDWkl1QUltZWFwcXlQVkFhdEp5Vk5kVy9NRlVya0ZjTHZFMnlRQ1pXd1NxL3RnSDFtMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
- client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBcXVmRUo2NG9wR2txM1Vzd21SNGFiOTRuS0RjTTFMSWRsYnBXVkIraDAzZGp5K0ZICnJsRVVSVUdESEtBZjIvN2EwbTNrS0xoSWVudC9GRVRxSm5Kd3RUUzdmUDlDVzVwUGR2OHdEQ3o3U1dzK1ZrczcKTTVjcXhMNFovem5ySU9LZ2FmQzIyaTVFdjgrRjBqdW85b1lES3VwMFQ0bmxON3dNeXdjN1dFS0dNcGtEZGNnTgpwem1kTGZDSzQvNXdWeFhVcDFvTDJ1OHowV0RLKzcyN3plaFVMcFpZN0lXRG1PRnd2YzFxcmp6RFBCYWNxd3MwCnJyMkx6RXllRWt6cUZpd3BkcXBmbE4rYkxTZkN3ekNlWFdTcEVQ5UEVnV0dEWFlaYUhGTzBRZVF0a2Vnd2xoeWdXeXNZOTBBZnArbQpOeVByZW8zRngzaTlBUG9QeWRuNHFtbVd2dmhiT2FhUGZyK1pBUmFOa0JCaXc1OUw3eW5IMVhLcExMMDBGZHlCClFRYS8rUUtCZ1FDYzFLaXV3Ui9ZWGY5aGtKeWVZRTZHUXhKeEc2OWl2MDNuZm1ldi9zeExKZDY3WmxBemRrbDgKc3Vtb29uK0dhc0V4SGFqQUhkVVlNZmplU2ZxUkNOR1FISWM4cGFNYjQxbFErRGowRlBydzRHeThjcTBNWEtleQpIelduazQrVmpXeW9URVJoTnpkSEVUdXFKUG51TFdqbFhSaFhLWCtIVmVZVUdwN3pRNHFXQWc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
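The copy-and-paste just shown can also be scripted. This is only a sketch of the crude fix described above (the helper name is mine; the paths are the kubeadm defaults), and it keeps a backup since it overwrites credentials in place:

```shell
# copy_client_creds: copy the client-certificate-data / client-key-data values
# from one kubeconfig (e.g. admin.conf) into another (e.g. kubelet.conf),
# keeping a .bak copy of the destination. Base64 values contain only
# [A-Za-z0-9+/=], so '|' is a safe sed delimiter.
copy_client_creds() {
  src="$1"; dst="$2"
  cert=$(grep 'client-certificate-data:' "$src" | awk '{print $2}')
  key=$(grep 'client-key-data:' "$src" | awk '{print $2}')
  cp "$dst" "$dst.bak"
  sed -i "s|client-certificate-data:.*|client-certificate-data: $cert|" "$dst"
  sed -i "s|client-key-data:.*|client-key-data: $key|" "$dst"
}

# On the node you would then run (kubeadm default paths):
# copy_client_creds /etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf
# systemctl restart kubelet
```

Note that, as discussed above, this only papers over the failed rotation; the kubeadm-documented repair that recreates kubelet-client-current.pem is the cleaner path.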

The conclusion we end up with: renewing the cluster certificates with kubeadm alpha certs renew all does not renew the certificate used by kubelet.conf, and this has been discussed and confirmed further on GitHub:
Kubelet can't running after renew certificates[3]
Fixing the k8s kubelet.go node "master" not found problem[4]
[1] Fixing the k8s kubelet.go node "master" not found problem: https://www.linuxea.com/2580.html
[2] Kubelet client certificate rotation fails: https://www.linuxea.com/https
[3] Kubelet can't running after renew certificates: https://github.com/kubernetes/kubeadm/issues/2054
[4] Fixing the k8s kubelet.go node "master" not found problem: https://www.linuxea.com/2580.html
Original article: https://www.linuxea.com/2626.html