Below is the cluster architecture diagram from the official Kubernetes documentation.
hostname | ip | components | cluster role | flannel version | kubectl version | kubeadm version | keepalived | OS | docker version | docker-root-data | cgroup driver |
---|---|---|---|---|---|---|---|---|---|---|---|
www.datang001.com | 10.176.10.20 | kube-apiserver | master01 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
www.datang002.com | 10.176.10.21 | kube-apiserver | master02 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
www.datang003.com | 10.176.10.22 | kube-apiserver | master03 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
www.datang004.com | 10.176.10.23 | kubelet/kube-proxy | node01 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
www.datang005.com | 10.176.10.24 | kubelet/kube-proxy | node02 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
www.datang006.com | 10.176.10.25 | kubelet/kube-proxy | node03 | v1.18.20 | v1.18.20 | v1.18.20 | | CentOS 7.9 | 20.10.2 | /var/lib/docker | systemd |
apiserver-lb.com | 10.176.10.250 | VIP | | | | | | | | | |
This guide uses kubeadm to build a highly available k8s cluster. High availability of a k8s cluster in practice means high availability of its core components. This deployment adopts the active-standby model; the architecture is as follows.
How each core component achieves high availability in the active-standby model:
core components | high availability mode | high availability implementation |
---|---|---|
apiserver | active-standby | keepalived + haproxy |
controller-manager | active-standby | leader election |
scheduler | active-standby | leader election |
etcd | cluster | kubeadm |
Run the following command on every control-plane and worker node to populate the /etc/hosts file:
cat >> /etc/hosts <<EOF
10.176.10.20 www.datang001.com
10.176.10.21 www.datang002.com
10.176.10.22 www.datang003.com
10.176.10.23 www.datang004.com
10.176.10.24 www.datang005.com
10.176.10.25 www.datang006.com
10.176.10.250 apiserver-lb.com
EOF
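As a quick sanity check (an optional sketch), confirm on each host that all of the names above resolve from /etc/hosts:
for h in www.datang00{1..6}.com apiserver-lb.com; do getent hosts $h; done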
Temporarily disable the swap partition (this does not survive a reboot); run on all machines:
[root@www.datang001.com ~]# swapoff -a
[root@www.datang001.com ~]# free -m
total used free shared buff/cache available
Mem: 128770 4073 120315 330 4381 123763
Swap: 0 0 0
Permanently disable the swap partition by commenting out the swap entry in /etc/fstab; run on all machines:
[root@www.datang001.com ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Thu Oct 5 04:55:59 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
...
...
#/dev/mapper/rootvg-swap swap swap defaults 0 0
...
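A minimal sketch for commenting out the swap entry automatically (assuming the swap line is not already commented; a .bak backup of /etc/fstab is kept):
sed -r -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab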
Disable SELinux. First check the SELinux status:
[root@www.datang001.com ~]# sestatus -v
SELinux status: disabled
Temporarily disable:
setenforce 0
Permanently disable:
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
For cluster stability, and to prevent business containers from exhausting node memory later on, a production environment must upgrade the Linux kernel to 4.19 or above. CentOS 7 ships with kernel 3.10.x by default, and in practice memory leaks can occur; the root cause is a memory leak in the cgroup kmem accounting feature (for a detailed analysis see 低内核造成k8s内存泄露). So be sure to upgrade the kernel on all machines before deploying the k8s cluster. [Do not upgrade immediately in a production environment that is already in use: workloads are already running on Kubernetes, and a kernel upgrade will cause business containers to be rescheduled to other nodes and, in severe cases, may leave them unable to run properly.]
Kernel upgrade procedure:
(omitted)
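For reference only, a commonly used route on CentOS 7 is the ELRepo kernel packages; this is a rough sketch, not the exact procedure the author followed, so adapt the kernel package and repo URL to your environment:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
yum -y --enablerepo=elrepo-kernel install kernel-lt   # long-term 5.4.x kernel, satisfies the >= 4.19 requirement
grub2-set-default 0                                   # boot the newly installed kernel by default
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot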
Forward IPv4 and let iptables see bridged traffic.
Run lsmod | grep br_netfilter to verify whether the br_netfilter module is loaded.
To load the module explicitly, run sudo modprobe br_netfilter.
For iptables on the Linux nodes to correctly see bridged traffic, confirm that net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration. For example:
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Set the required sysctl parameters; these persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply the sysctl parameters without rebooting
sudo sysctl --system
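To confirm the modules are loaded and the parameters took effect, a quick check along these lines:
lsmod | grep -e overlay -e br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward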
Kubernetes yum repo: in China the repo is usually pointed at the Aliyun mirror; if the servers have unrestricted internet access, use the default Google repo directly.
Without unrestricted access, use the Aliyun mirror:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
With unrestricted access, use the Google repo:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
First refresh the yum cache:
yum clean all
yum update
yum -y makecache
Install Docker first. Note that the Docker data directory should sit on a large disk if possible, because we will pull a lot of images later. In addition, the Cgroup Driver should be set to systemd and the Storage Driver to overlay2.
/etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"],
"data-root": "/data01/docker"
}
Install the specified Docker version:
yum list docker-ce --showduplicates | sort -r
yum install -y docker-ce-19.03.9-3.el7
systemctl start docker && systemctl enable docker
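After Docker starts, it is worth confirming that the daemon picked up the settings from daemon.json; for example:
docker info | grep -E 'Cgroup Driver|Storage Driver|Docker Root Dir'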
Install the specified versions of kubeadm, kubelet, and kubectl:
yum -y install kubeadm-1.18.20 kubelet-1.18.20 kubectl-1.18.20
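Optionally, lock these packages so a routine yum update cannot bump them unexpectedly; a sketch using the versionlock plugin:
yum -y install yum-plugin-versionlock
yum versionlock add kubeadm kubelet kubectl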
Offline installation of docker, kubeadm, kubelet, and kubectl, plus pre-pulling images: sometimes our machines sit on an internal network and cannot reach the internet, so we need to prepare the rpm packages and install from them. These rpm packages depend on each other, so they must be installed in order: kubectl depends on cri-tools, kubernetes-cni and kubelet depend on each other, and kubeadm depends on kubectl, kubelet, and cri-tools.
[root@www.datang001.com rpm]# pwd
/home/shutang/k8s/rpm
[root@www.datang001.com rpm]# ls
cri-tools-1.19.0-0.x86_64.rpm kubeadm-1.18.20-0.x86_64.rpm kubectl-1.18.20-0.x86_64.rpm kubelet-1.18.20-0.x86_64.rpm kubernetes-cni-0.8.7-0.x86_64.rpm
[root@www.datang001.com rpm]# yum -y install ./cri-tools-1.19.0-0.x86_64.rpm
[root@www.datang001.com rpm]# yum -y install ./kubectl-1.18.20-0.x86_64.rpm
[root@www.datang001.com rpm]# yum -y install ./kubernetes-cni-0.8.7-0.x86_64.rpm ./kubelet-1.18.20-0.x86_64.rpm
[root@www.datang001.com rpm]# yum -y install ./kubeadm-1.18.20-0.x86_64.rpm
[root@www.datang001.com rpm]# whereis kubeadm
kubeadm: /usr/bin/kubeadm
[root@www.datang001.com rpm]# whereis kubelet
kubelet: /usr/bin/kubelet
[root@www.datang001.com rpm]# whereis kubectl
kubectl: /usr/bin/kubectl
List the images that need to be downloaded in advance:
[root@phx11-gliws-u23 ~]# kubeadm config images list --kubernetes-version v1.18.20
W1112 20:10:37.628119 20654 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
k8s.gcr.io/kube-apiserver:v1.18.20
k8s.gcr.io/kube-controller-manager:v1.18.20
k8s.gcr.io/kube-scheduler:v1.18.20
k8s.gcr.io/kube-proxy:v1.18.20
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7
# Note: master nodes need all of the images above; worker nodes only need k8s.gcr.io/kube-proxy:v1.18.20, k8s.gcr.io/pause:3.2, and k8s.gcr.io/coredns:1.6.7
If we cannot reach k8s.gcr.io, we can use the images in the repository provided by Aliyun. The versions in the Aliyun repository are sometimes not in sync with k8s.gcr.io; if you need a newer version that Aliyun does not carry, try the repository provided by DaoCloud, the Tsinghua mirror, or another domestic mirror.
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.20
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.20
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.20
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.20
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7
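Because the kubeadm-config.yaml below keeps imageRepository: k8s.gcr.io, one way to use the images pulled from the Aliyun mirror is to retag them locally; a sketch:
for img in kube-apiserver:v1.18.20 kube-controller-manager:v1.18.20 kube-scheduler:v1.18.20 kube-proxy:v1.18.20 pause:3.2 etcd:3.4.3-0 coredns:1.6.7; do
  docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$img k8s.gcr.io/$img
done
Alternatively, switch the imageRepository line in kubeadm-config.yaml to the Aliyun registry instead of retagging.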
The pause image that kubelet uses by default comes from the k8s.gcr.io registry, which may be unreachable from inside China, so configure kubelet to use the Aliyun pause image address instead.
DOCKER_CGROUPS=$(docker info | grep 'Cgroup Driver' | cut -d ' ' -f4)
cat > /etc/sysconfig/kubelet <<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2"
EOF
Set kubelet to start on boot:
systemctl daemon-reload
systemctl enable --now kubelet
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-config/
The kubeadm-config.yaml configuration file on the master01 node is as follows:
apiVersion: kubeadm.k8s.io/v1beta2
apiServer:
  certSANs:
  - apiserver-lb.com
  - www.datang001.com
  - www.datang002.com
  - www.datang003.com
  - www.datang004.com
  - www.datang005.com
  - www.datang006.com
  - 10.172.10.20
  - 10.172.10.21
  - 10.172.10.22
  - 10.172.10.23
  - 10.172.10.24
  - 10.172.10.25
  - 10.172.10.250
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: apiserver-lb.com:16443   # change this to the load balancer address
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
#imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
#imageRepository: daocloud.io/daocloud
kind: ClusterConfiguration
kubernetesVersion: v1.18.20
networking:
  dnsDomain: cluster.local
  podSubnet: 172.26.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Later, when we upgrade the Kubernetes version and initialize again, some of the APIs used in the kubeadm-config file may have changed; in that case generate a new kubeadm-config file from the old one:
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
Pre-pull the images on all nodes to save time during initialization:
kubeadm config images pull --config kubeadm-config.yaml
Initialize the master01 node. After initialization, the corresponding certificates and configuration files are generated under /etc/kubernetes. The --upload-certs flag uploads the certificates generated on master01 so that they are automatically synced to other control-plane nodes when they join the cluster:
[root@www.datang001.com k8s]# sudo kubeadm init --config kubeadm.yaml --upload-certs
W0514 23:06:11.417640   20494 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.20
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [www.datang001.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local apiserver-lb.com] and IPs [10.96.0.1 10.222.175.201]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [phx11-gliws-u23 localhost] and IPs [10.172.10.20 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [www.datang001.com localhost] and IPs [10.172.10.20 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0514 23:06:16.003004   20494 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0514 23:06:16.004606   20494 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 20.502817 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
**************************************************
[mark-control-plane] Marking the node www.datang001.com as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node www.datang001.com as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: ixhv5g.n37m33eybijtb13q
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
    --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e \
    --control-plane --certificate-key *************************************************

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
    --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e

[root@www.datang001.com k8s]# mkdir -p $HOME/.kube
[root@www.datang001.com k8s]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@www.datang001.com k8s]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, initialize without a config file:
kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --image-repository daocloud.io/daocloud --upload-certs
If initialization fails, reset and then initialize again:
kubeadm reset
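kubeadm reset does not clean up everything; a commonly used follow-up cleanup (a sketch, adjust to what is actually present on the node):
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm --clear                      # only if ipvsadm is installed
rm -rf /etc/cni/net.d $HOME/.kube/config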
Handling token expiration: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-token/#cmd-token-create
Joining the master and worker nodes to the cluster. A master (control-plane) node joins with:
kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
--discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e \
--control-plane --certificate-key *************************************************
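On each control-plane node that joins, kubectl can be set up the same way as on master01:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config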
A worker node joins with:
kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
--discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e
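After the joins finish, verify from master01 that every node shows up:
kubectl get nodes -o wide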
Install keepalived and haproxy on the master01 node:
yum -y install keepalived haproxy
The keepalived.conf configuration file and the check_apiserver.sh script on the master01 machine:
# keepalived.conf contents
! Configuration File for keepalived
global_defs {
    router_id www.datang001.com
}
# Define the health-check script
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 2
    weight -5
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.176.10.250
    }
    # Invoke the check script
    track_script {
        check_apiserver
    }
}

# check_apiserver.sh health-check script
#!/bin/bash
function check_apiserver(){
    for ((i=0;i<5;i++))
    do
        apiserver_job_id=$(pgrep kube-apiserver)
        if [[ ! -z ${apiserver_job_id} ]];then
            return
        else
            sleep 2
        fi
    done
    apiserver_job_id=0
}

# 1->running 0->stopped
check_apiserver
if [[ $apiserver_job_id -eq 0 ]];then
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
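The health-check script must be executable for keepalived to run it (assuming the path used above):
chmod +x /etc/keepalived/check_apiserver.sh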
Start keepalived:
systemctl enable --now keepalived.service
The haproxy configuration file haproxy.cfg on master01:
global
    log         /dev/log local0 warning
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    timeout connect         5000
    timeout client          50000
    timeout server          50000

listen status_page
    bind 0.0.0.0:1080
    stats enable
    stats uri /haproxy-status
    stats auth admin:nihaoma
    stats realm "Welcome to the haproxy load balancer status page"
    stats hide-version
    stats admin if TRUE
    stats refresh 5s

frontend kube-apiserver
    bind *:16443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server www.datang001.com 10.172.10.20:6443 check   # Replace the IP addresses with your own.
    server www.datang002.com 10.172.10.21:6443 check
    server www.datang003.com 10.172.10.22:6443 check
Start haproxy:
systemctl enable --now haproxy
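A quick check that haproxy is listening on the frontend port:
ss -lntp | grep 16443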
The keepalived.conf configuration file and the check_apiserver.sh script on the master02 machine (the script is the same as on master01):
! Configuration File for keepalived
global_defs {
    router_id www.datang002.com
}
# Define the health-check script
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 2
    weight -5
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 50
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.172.10.250
    }
    # Invoke the check script
    #track_script {
    #    check_apiserver
    #}
}
The haproxy configuration file haproxy.cfg on master02:
global
    log         /dev/log local0 warning
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    timeout connect         5000
    timeout client          50000
    timeout server          50000

listen status_page
    bind 0.0.0.0:1080
    stats enable
    stats uri /haproxy-status
    stats auth admin:nihaoma
    stats realm "Welcome to the haproxy load balancer status page"
    stats hide-version
    stats admin if TRUE
    stats refresh 5s

frontend kube-apiserver
    bind *:16443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server www.datang001.com 10.172.10.20:6443 check   # Replace the IP addresses with your own.
    server www.datang002.com 10.172.10.21:6443 check
    server www.datang003.com 10.172.10.22:6443 check
The keepalived.conf configuration file and the check_apiserver.sh script on the master03 machine follow the same pattern as master02 (state BACKUP, with router_id www.datang003.com and a lower priority). The haproxy configuration file haproxy.cfg on master03 is the same as on master01 and master02.
Deploy the flannel network plugin:
[root@www.datang001.com network]$ wget https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
[root@www.datang001.com network]$ kubectl apply -f kube-flannel.yml
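Watch the flannel and coredns pods come up before checking node status; something like:
kubectl get pods -n kube-system -o wide | grep -E 'flannel|coredns'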
[shutang@www.datang001.com network]# kubectl get nodes -o wide
NAME                STATUS   ROLES    AGE    VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
www.datang001.com   Ready    <none>   160m   v1.18.20   10.172.10.20   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
www.datang002.com   Ready    <none>   160m   v1.18.20   10.172.10.21   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
www.datang003.com   Ready    <none>   161m   v1.18.20   10.172.10.22   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
www.datang004.com   Ready    master   162m   v1.18.20   10.172.10.23   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
www.datang005.com   Ready    master   163m   v1.18.20   10.172.10.24   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.6
www.datang006.com   Ready    master   166m   v1.18.20   10.172.10.25   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2

# Run kubectl get cs. In practice the first two components may show as unhealthy; how to deal with that is covered later in the article.
[root@www.datang001.com ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
NodePort port range: in a Kubernetes cluster, the default NodePort range is 30000-32767. In some cases, due to company network policy restrictions, you may need to modify the NodePort port range.
Modify kube-apiserver.yaml: when a k8s cluster is installed with kubeadm, the file /etc/kubernetes/manifests/kube-apiserver.yaml exists on your control-plane nodes. Modify this file and add --service-node-port-range=1-65535 (use your own desired port range), as follows:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.172.10.20:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=10.172.10.20
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=10.96.0.0/12
    - --service-node-port-range=1-65535   # add this line
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: k8s.gcr.io/kube-apiserver:v1.18.20
    imagePullPolicy: IfNotPresent
    ......
Restart the apiserver:
# get apiserver pod name
export apiserver_pods=$(kubectl get pods --selector=component=kube-apiserver -n kube-system --output=jsonpath={.items..metadata.name})
# delete apiserver pod
kubectl delete pod $apiserver_pods -n kube-system
Check that the apiserver is healthy again:
kubectl describe pod $apiserver_pods -n kube-system
# Check that the line we added above appears in the startup command parameters; if it does, the change is in effect.
Extending the certificate validity period: certificates issued by kubeadm are valid for one year by default. To extend them, rebuild kubeadm from the Kubernetes source with a larger CertificateValidity and then renew the certificates. First check the installed kubeadm version, then download the matching source tree and Go toolchain:
[root@www.datang001.com ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.20", GitCommit:"1f3e19b7beb1cc0110255668c4238ed63dadb7ad", GitTreeState:"clean", BuildDate:"2021-06-16T12:56:41Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
[root@www.datang001.com update-cert]# wget https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.18.20.tar.gz && wget https://golang.google.cn/dl/go1.13.15.linux-amd64.tar.gz
[root@www.datang001.com update-cert]# tar -zxf v1.18.20.tar.gz && tar -zxf go1.13.15.linux-amd64.tar.gz -C /usr/local/
[root@www.datang001.com update-cert]# cat > /etc/profile.d/go.sh <<EOF
export PATH=$PATH:/usr/local/go/bin
EOF
[root@www.datang001.com update-cert]# source /etc/profile.d/go.sh
[root@www.datang001.com update-cert]# go version
go version go1.13.15 linux/amd64
[root@www.datang001.com update-cert]#
[root@www.datang001.com ~]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 May 15, 2023 06:06 UTC   364d                                    no
apiserver                  May 15, 2023 06:06 UTC   364d            ca                      no
apiserver-etcd-client      May 15, 2023 06:06 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   May 15, 2023 06:06 UTC   364d            ca                      no
controller-manager.conf    May 15, 2023 06:06 UTC   364d                                    no
etcd-healthcheck-client    May 15, 2023 06:06 UTC   364d            etcd-ca                 no
etcd-peer                  May 15, 2023 06:06 UTC   364d            etcd-ca                 no
etcd-server                May 15, 2023 06:06 UTC   364d            etcd-ca                 no
front-proxy-client         May 15, 2023 06:06 UTC   364d            front-proxy-ca          no
scheduler.conf             May 15, 2023 06:06 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 12, 2032 06:06 UTC   9y              no
etcd-ca                 May 12, 2032 06:06 UTC   9y              no
front-proxy-ca          May 12, 2032 06:06 UTC   9y              no
Modify the certificate validity period on www.datang001.com:
[root@www.datang001.com update-cert]# pwd
/home/shutang/k8s/update-cert
[root@www.datang001.com update-cert]# cd kubernetes-1.18.20
[root@www.datang001.com kubernetes-1.18.20]# cd cmd/kubeadm/app/constants/
[root@www.datang001.com constants]# cat constants |grep 10
cat: constants: No such file or directory
[root@www.datang001.com constants]# cat constants.go |grep 10
        CertificateValidity = time.Hour * 24 * 365 * 10
[root@www.datang001.com kubernetes-1.18.20]# make WHAT=cmd/kubeadm
+++ [0515 08:40:42] Building go targets for linux/amd64: ./vendor/k8s.io/code-generator/cmd/deepcopy-gen
+++ [0515 08:40:52] Building go targets for linux/amd64: ./vendor/k8s.io/code-generator/cmd/defaulter-gen
+++ [0515 08:40:59] Building go targets for linux/amd64: ./vendor/k8s.io/code-generator/cmd/conversion-gen
+++ [0515 08:41:11] Building go targets for linux/amd64: ./vendor/k8s.io/kube-openapi/cmd/openapi-gen
+++ [0515 08:41:22] Building go targets for linux/amd64: ./vendor/github.com/go-bindata/go-bindata/go-bindata
warning: ignoring symlink /home/shutang/k8s/update-cert/kubernetes-1.18.20/_output/local/go/src/k8s.io/kubernetes
go: warning: "k8s.io/kubernetes/vendor/github.com/go-bindata/go-bindata/..." matched no packages
+++ [0515 08:41:24] Building go targets for linux/amd64: cmd/kubeadm
# backup the old kubeadm
[root@www.datang001.com kubernetes-1.18.20]# mv /usr/bin/kubeadm /usr/bin/kubeadm.old
[root@www.datang001.com kubernetes-1.18.20]# cp _output/bin/kubeadm /usr/bin/kubeadm
[root@www.datang001.com kubernetes-1.18.20]# cd /etc/kubernetes/pki/
[root@www.datang001.com pki]# ls -lah
total 60K
drwxr-xr-x 3 root root 4.0K May 14 23:06 .
drwxr-xr-x 4 root root  125 May 14 23:06 ..
-rw-r--r-- 1 root root 1.3K May 14 23:06 apiserver.crt
-rw-r--r-- 1 root root 1.1K May 14 23:06 apiserver-etcd-client.crt
-rw------- 1 root root 1.7K May 14 23:06 apiserver-etcd-client.key
-rw------- 1 root root 1.7K May 14 23:06 apiserver.key
-rw-r--r-- 1 root root 1.1K May 14 23:06 apiserver-kubelet-client.crt
-rw------- 1 root root 1.7K May 14 23:06 apiserver-kubelet-client.key
-rw-r--r-- 1 root root 1.1K May 14 23:06 ca.crt
-rw------- 1 root root 1.7K May 14 23:06 ca.key
drwxr-xr-x 2 root root  162 May 14 23:06 etcd
-rw-r--r-- 1 root root 1.1K May 14 23:06 front-proxy-ca.crt
-rw------- 1 root root 1.7K May 14 23:06 front-proxy-ca.key
-rw-r--r-- 1 root root 1.1K May 14 23:06 front-proxy-client.crt
-rw------- 1 root root 1.7K May 14 23:06 front-proxy-client.key
-rw------- 1 root root 1.7K May 14 23:06 sa.key
-rw------- 1 root root  451 May 14 23:06 sa.pub
[root@phx11-gliws-u23 pki]# kubeadm alpha certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
[root@phx11-gliws-u23 pki]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 May 12, 2032 15:46 UTC   9y                                      no
apiserver                  May 12, 2032 15:46 UTC   9y              ca                      no
apiserver-etcd-client      May 12, 2032 15:46 UTC   9y              etcd-ca                 no
apiserver-kubelet-client   May 12, 2032 15:46 UTC   9y              ca                      no
controller-manager.conf    May 12, 2032 15:46 UTC   9y                                      no
etcd-healthcheck-client    May 12, 2032 15:46 UTC   9y              etcd-ca                 no
etcd-peer                  May 12, 2032 15:46 UTC   9y              etcd-ca                 no
etcd-server                May 12, 2032 15:46 UTC   9y              etcd-ca                 no
front-proxy-client         May 12, 2032 15:46 UTC   9y              front-proxy-ca          no
scheduler.conf             May 12, 2032 15:46 UTC   9y                                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 12, 2032 06:06 UTC   9y              no
etcd-ca                 May 12, 2032 06:06 UTC   9y              no
front-proxy-ca          May 12, 2032 06:06 UTC   9y              no
Switching the kube-proxy proxy mode to ipvs. First list the kube-proxy pods:
[root@www.datang001.com ~]# kubectl get pods -n kube-system |grep proxy
kube-proxy-2qc7r 1/1 Running 13 182d
kube-proxy-6nfzm 1/1 Running 13 182d
kube-proxy-frwcg 1/1 Running 15 182d
kube-proxy-l6xg2 1/1 Running 13 182d
kube-proxy-r96hz 1/1 Running 13 182d
kube-proxy-sgwfh 1/1 Running 13 182d
Check the logs of any one of these pods:
[shutang@phx11-gliws-u23 ~]$ kubectl logs -f kube-proxy-2qc7r -n kube-system
I1021 04:57:16.251139 1 node.go:136] Successfully retrieved node IP: 10.222.175.237
I1021 04:57:16.251166 1 server_others.go:259] Using iptables Proxier.
I1021 04:57:16.251599 1 server.go:583] Version: v1.18.20
I1021 04:57:16.252018 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1021 04:57:16.252241 1 config.go:133] Starting endpoints config controller
I1021 04:57:16.252270 1 shared_informer.go:223] Waiting for caches to sync for endpoints config
I1021 04:57:16.252300 1 config.go:315] Starting service config controller
I1021 04:57:16.252306 1 shared_informer.go:223] Waiting for caches to sync for service config
I1021 04:57:16.352464 1 shared_informer.go:230] Caches are synced for endpoints config
I1021 04:57:16.352528 1 shared_informer.go:230] Caches are synced for service config
E1103 07:57:22.505665 1 graceful_termination.go:89] Try delete
At this point the kube-proxy default proxy mode is iptables.
To switch kube-proxy to ipvs mode, first make sure the ipvs kernel modules are loaded:
[shutang@www.datang001.com ~]# lsmod |grep ip_vs
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 167
ip_vs 145458 173 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 139264 7 ip_vs,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
libcrc32c 12644 4 xfs,ip_vs,nf_nat,nf_conntrack
If the ipvs modules are not loaded, run the following commands:
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
Then execute:
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
Edit the kube-proxy ConfigMap (kubectl edit configmap kube-proxy -n kube-system):
...
...
...
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: "ipvs" #修改此处,原为空
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
...
...
...
Restart kube-proxy:
kubectl rollout restart daemonset kube-proxy -n kube-system
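To confirm the switch took effect, check the restarted pods' logs for "Using ipvs Proxier" and inspect the ipvs rules (ipvsadm can be installed with yum -y install ipvsadm):
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50 | grep -i proxier
ipvsadm -Ln | head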