Pitfalls of installing k8s and KubeSphere on cloud servers (KubeKey image pulls failing)

The setup: three cloud servers, one each from Tencent Cloud, Baidu Cloud, and JD Cloud, with 4C4G, 2C4G and 2C4G. As a broke student, that is all the hardware I could get.

Installing k8s and KubeSphere on Ubuntu 20.04 with KubeKey

Reference:
在 Ubuntu 22.04 上安装 KubeSphere 实战教程
That article installs k8s and KubeSphere with KubeKey, with all three servers acting as master nodes. With multiple masters the control plane can re-elect a leader and is much harder to kill, but I never got it fully running: I was stuck on the etcd part for quite a while, and once past it I thought the install was nearly done, only to hit a pile of additional components and more pitfalls. In the end I gave up on this route.

Pitfall 1

00:02:04 CST [PullModule] Start to pull images on all nodes
00:02:04 CST message: [master]
downloading image: kubesphere/pause:3.7
00:02:04 CST message: [node2]
downloading image: kubesphere/pause:3.7
00:02:04 CST message: [node1]
downloading image: kubesphere/pause:3.7
00:02:04 CST message: [master]
pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:04.611117    9644 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1
00:02:04 CST retry: [master]
00:02:07 CST message: [node1]
downloading image: kubesphere/kube-proxy:v1.24.2
00:02:09 CST message: [node2]
downloading image: kubesphere/kube-proxy:v1.24.2
00:02:09 CST message: [master]
downloading image: kubesphere/pause:3.7
00:02:09 CST message: [master]
pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:09.650287    9665 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1
00:02:09 CST retry: [master]
00:02:14 CST message: [master]
downloading image: kubesphere/pause:3.7
00:02:14 CST message: [master]
pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:14.690706    9697 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1

Add this to the containerd config file:

cat > /etc/containerd/config.toml <<EOF
[plugins."io.containerd.grpc.v1.cri"]
systemd_cgroup = true
EOF

Then restart containerd:

systemctl restart containerd
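
As a quick sanity check (my own addition, not from the original notes), you can retry the pull that was failing while pointing crictl explicitly at the containerd socket:

# retry the failing pull through containerd's CRI socket
crictl --runtime-endpoint unix:///run/containerd/containerd.sock pull kubesphere/pause:3.7
crictl --runtime-endpoint unix:///run/containerd/containerd.sock images | grep pause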

Pitfall 2

error: code = Unknown desc = failed to pull and unpack image "docker.io/kubesphere/k8s-dns-node-cache:1.15.12": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://registry-1.docker.io/v2/kubesphere/k8s-dns-node-cache/manifests/sha256:8e765f63b3a5b4832c484b4397f4932bd607713ec2bb3e639118bc164ab4a958": net/http: TLS handshake timeout: Process exited with status 1

Point crictl at the containerd endpoint:

crictl config runtime-endpoint /run/containerd/containerd.sock
# Alternatively you can edit the config file instead (I did not verify this route)
# vi /etc/crictl.yaml
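
For reference, a minimal /etc/crictl.yaml that pins both endpoints to containerd would look roughly like this (my sketch; adjust the socket path if yours differs):

# /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false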

etcd fails to start during cluster deployment with: request sent was ignored (cluster ID mismatch: peer[c39bdec535db1fd5]=cdf818194e3a8c

The commands for clearing out the stale etcd data are in this reference:
k8s集群部署中etcd启动报错request sent was ignored (cluster ID mismatch: peer[c39bdec535db1fd5]=cdf818194e3a8c

On cloud servers, the key is getting the etcd configuration file right;
some of the listen URLs need to point at the local (private) NIC.

kube@k8s-master-0:~/kubekey$ cat /etc/etcd.env
# Environment file for etcd v3.4.13
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://<public IP>:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://<public IP>:2380
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://<private IP>:2379,https://127.0.0.1:2379
ETCD_ELECTION_TIMEOUT=5000
ETCD_HEARTBEAT_INTERVAL=250
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://<private IP>:2380
ETCD_NAME=etcd-k8s-master-0
ETCD_PROXY=off
ETCD_ENABLE_V2=true
ETCD_INITIAL_CLUSTER=etcd-k8s-master-0=https://<public IP>:2380,etcd-k8s-master-1=https://<public IP>:2380,etcd-k8s-master-2=https://<public IP>:2380
ETCD_AUTO_COMPACTION_RETENTION=8
ETCD_SNAPSHOT_COUNT=10000

# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0-key.pem
ETCD_CLIENT_CERT_AUTH=true

ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=True

# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY_FILE=/etc/ssl/etcd/ssl/admin-k8s-master-0-key.pem
ETCDCTL_CERT_FILE=/etc/ssl/etcd/ssl/admin-k8s-master-0.pem
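
After editing /etc/etcd.env on each master I would restart etcd and check member health. A rough sketch, assuming the systemd unit and certificate paths that KubeKey generated above:

systemctl restart etcd
export ETCDCTL_API=3
# health of the local endpoint
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-k8s-master-0.pem \
  --key=/etc/ssl/etcd/ssl/admin-k8s-master-0-key.pem \
  endpoint health
# list the members the cluster currently knows about
etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-k8s-master-0.pem \
  --key=/etc/ssl/etcd/ssl/admin-k8s-master-0-key.pem \
  member list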

KubeKey configuration

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: k8s-master-0, address: <public IP>, internalAddress: <public IP>, user: kube, password: ""}
  - {name: k8s-master-1, address: <public IP>, internalAddress: <public IP>, user: kube, privateKeyPath: "~/.ssh/id_ed25519"}
  - {name: k8s-master-2, address: <public IP>, internalAddress: <public IP>, user: kube, privateKeyPath: "~/.ssh/id_ed25519"}
  roleGroups:
    etcd:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
    control-plane:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
    worker:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: haproxy

    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.25.5
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: containerd
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []



---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.3.2
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  namespace_override: ""
  # dev_tag: ""
  etcd:
    monitoring: false
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    # apiserver:
    #  resources: {}
    # controllerManager:
    #  resources: {}
    redis:
      enabled: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      # type: external
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:
    enabled: false
    # resources: {}
    jenkinsMemoryLim: 8Gi
    jenkinsMemoryReq: 4Gi
    jenkinsVolumeSize: 8Gi
  events:
    enabled: false
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    node_exporter:
      port: 9100
      # resources: {}
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1
    #   volumeSize: 20Gi
    #   resources: {}
    #   operator:
    #     resources: {}
    # alertmanager:
    #   replicas: 1
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
    istio:
      components:
        ingressGateways:
        - name: istio-ingressgateway
          enabled: false
        cni:
          enabled: false
  edgeruntime:
    enabled: false
    kubeedge:
      enabled: false
      cloudCore:
        cloudHub:
          advertiseAddress:
            - ""
        service:
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudhubHttpsNodePort: "30002"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
        # resources: {}
        # hostNetWork: false
      iptables-manager:
        enabled: true
        mode: "external"
        # resources: {}
      # edgeService:
      #   resources: {}
  terminal:
    timeout: 600
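
With the config saved (as config-sample.yaml, say), the cluster is created with KubeKey roughly like this; the file name and the KubeSphere version flag below are assumptions, use whatever you generated the config with:

export KKZONE=cn   # download binaries and images from the CN mirror
./kk create cluster -f config-sample.yaml --with-kubesphere v3.3.2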

Those were the pitfalls I hit on Ubuntu; I never got the KubeKey one-click install fully working there, so below I switch to CentOS 7.9 and install with kubeadm instead.

Installing k8s on CentOS 7.9 with kubeadm

Following the article referenced below, the CentOS install succeeded. This approach pins a single master node, so if the master server dies the whole cluster dies with it, but the install is simple and it is fine for learning.

The k8s setup below follows that article.

Overall workflow

The main steps:

  1. Prepare the cloud hosts and upgrade CentOS to 7.9
  2. Install Docker and kubeadm on all nodes and pull the required images
  3. Initialize the cluster on the Master node, including kubectl and the CNI network plugin
  4. Join the worker nodes to the k8s cluster

For a dashboard UI and a private image registry, see other articles:
1. Deploy the Dashboard web UI to browse Kubernetes resources visually; see my next post: k8s dashboard installation
2. Deploy a Harbor private registry for images (optional, not covered here)

Below are the detailed steps for each stage.

Environment preparation
Cloud hosts
Provider       Region          CentOS  Node      Spec  Public IP       Tools installed
Tencent Cloud  Shanghai Zone 3 7.9     master01  2C4G  101.34.112.190  docker, kubeadm, kubelet, kubectl, flannel
Tencent Cloud  Shanghai Zone 2 7.9     node01    1C2G  81.68.126.69    same as above
Tencent Cloud  Shanghai Zone 2 7.9     node02    1C2G  81.68.92.49     same as above

CentOS upgrade
If you are below CentOS 7.9, upgrade first:

$ yum update -y
$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

PS: do not skip this step. Installing kubeadm on an older CentOS is very likely to fail; mine only worked after I failed and then upgraded CentOS!!

CentOS settings on all nodes
Base settings
PS: this step was done on CentOS Linux release 7.9.2009 (Core); versions 7.2 - 7.6 seem to fail, so if things break halfway, consider upgrading or switching the CentOS version!

You can create a k8s-pre-install-centos.sh script to apply everything in one go:

$ vim k8s-pre-install-centos.sh

#!/bin/sh

function set_base(){
  # Disable the firewall. PS: on a cloud server you also need to disable the firewall
  # in the provider's console, or open all required ports there.
  systemctl stop firewalld
  systemctl disable firewalld

  # Disable SELinux so containers can read the host filesystem.
  setenforce 0

  # Permanently disable swap; kubeadm requires swap to be off.
  swapoff -a
  sed -ri 's/.*swap.*/#&/' /etc/fstab

  # iptables configuration
  cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

  cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

  # Apply the sysctl parameters
  sysctl --system
}

set_base

Run the script:

$ chmod 777 k8s-pre-install-centos.sh && ./k8s-pre-install-centos.sh
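
To confirm the script took effect, a few quick checks I would run afterwards (not part of the original article):

getenforce                                  # should print Permissive or Disabled
swapon --show                               # should print nothing
lsmod | grep br_netfilter                   # if empty, run 'modprobe br_netfilter' once and re-run sysctl --system
sysctl net.bridge.bridge-nf-call-iptables   # should be 1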

Hostname setup
Set the hostnames

On the master:

hostnamectl set-hostname master01

On node 1:

hostnamectl set-hostname node01

On node 2:

hostnamectl set-hostname node02

Edit the hosts file
Run on every machine:

$ vim /etc/hosts
101.34.112.190 master01
81.68.126.69 node01
81.68.92.49 node02
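
A quick way to confirm the entries resolve as expected (my addition; ping may be blocked by the provider's security group, so a name lookup is enough):

getent hosts master01 node01 node02   # should print the public IPs from /etc/hosts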

PS: replace these with your own public IPs (public, not private)!

Install Docker on all nodes
k8s supports three container runtimes; here we use the familiar Docker. Make sure you are on CentOS 7 or later, and defer to the official docs for the latest requirements.

Install the yum repository

$ sudo yum install -y yum-utils
$ sudo yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo

Install Docker
This includes the CLI, the engine, Docker Compose, etc.

$ sudo yum install docker-ce-20.10.14-3.el7 docker-ce-cli-20.10.14-3.el7 containerd.io docker-compose-plugin

PS: this tutorial uses Docker version 20.10.14.

Configure the Docker daemon
In particular, let systemd manage the containers' cgroups, and configure an Aliyun registry mirror to speed up pulls:

$ sudo mkdir /etc/docker
$ cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "registry-mirrors": ["https://6ijb8ubo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

registry-mirrors: registry mirror for faster pulls.
cgroupdriver: use systemd.
log-driver: JSON logs, capped at 100m.
Start Docker and enable it at boot

$ sudo systemctl enable docker
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ systemctl status docker # make sure it is in the running state

Confirm the Cgroup Driver is systemd

$ docker info | grep "Cgroup Driver"
 Cgroup Driver: systemd

PS: k8s defaults to systemd as the cgroup driver; if Docker uses a different driver, the cluster can become unstable.

Install kubeadm on all nodes
To avoid stale instructions, defer to the official documentation:

Configure the yum repo (using Aliyun; you know why Google's is out of reach)

$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

$ yum makecache # refresh the yum cache

Install kubeadm, kubelet, kubectl

$ sudo yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6 --disableexcludes=kubernetes

PS: k8s moves fast; to keep this walkthrough reproducible, use the same versions.

Start kubelet and enable it at boot

$ sudo systemctl start kubelet
$ sudo systemctl enable kubelet

PS: kubeadm uses the kubelet service to deploy and start the main Kubernetes components as containers.

Pull the Docker images on all nodes
PS: k8s 1.23.6 needs Docker 20; newer Docker versions are not guaranteed to work.
Pull the Docker images

List the images needed for initialization:

$ kubeadm config images list

k8s.gcr.io/kube-apiserver:v1.23.6
k8s.gcr.io/kube-controller-manager:v1.23.6
k8s.gcr.io/kube-scheduler:v1.23.6
k8s.gcr.io/kube-proxy:v1.23.6
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6

Switch the k8s image registry
The default registry is k8s.gcr.io, which is unreachable for well-known reasons.

So create a kubeadm-config-image.yaml that switches to the Aliyun mirror:

$ vim kubeadm-config-image.yaml

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
# The default k8s.gcr.io is unreachable, so point at the Aliyun mirror instead
imageRepository: registry.aliyuncs.com/google_containers


Confirm the image registry changed
List the images again; only move on once the addresses have changed:

$ kubeadm config images list --config kubeadm-config-image.yaml

registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.6
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.6
registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.6
registry.aliyuncs.com/google_containers/kube-proxy:v1.23.6
registry.aliyuncs.com/google_containers/pause:3.6
registry.aliyuncs.com/google_containers/etcd:3.5.1-0
registry.aliyuncs.com/google_containers/coredns:v1.8.6

Pull the images

$ kubeadm config images pull --config kubeadm-config-image.yaml

Run this on all machines so the images are pulled ahead of time.

Initialize the cluster on the Master node
Generate the default kubeadm-config.yaml
and change the following items:

$ kubeadm config print init-defaults > kubeadm-config.yaml

kubernetes-version: the cluster version; the kubeadm version installed above must be less than or equal to it, see https://kubernetes.io/releases/.
pod-network-cidr: the pod network CIDR; it must match the network plugin's setting. Flannel's default is 10.244.0.0/16, Calico's is 192.168.0.0/16.
api-server: the Master doubles as the api-server, so this is the master machine's IP address.
image-repository: the registry to pull images from; default is k8s.gcr.io.
nodeRegistration.name: change it to master01
The final file looks like this:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 101.34.112.190 # the master node's IP address (public)
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master01  # change to the master's hostname
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers  # the default k8s.gcr.io is unreachable, so use the Aliyun mirror
kind: ClusterConfiguration
kubernetesVersion: 1.23.6  # the kubernetes version; the value generated by kubeadm config print init-defaults is fine
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16  # pod CIDR; 10.244.0.0/16 matches flannel's default
scheduler: {}

The configuration above is equivalent to:

$ kubeadm init \
--kubernetes-version=v1.23.6 \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=101.34.112.190 --ignore-preflight-errors=Swap

Or, for a 1-core master, initialize with --ignore-preflight-errors=NumCPU (required on a 1-core ECS instance, otherwise it errors out because k8s requires at least 2 cores):

$ kubeadm init \
--kubernetes-version=v1.23.6 \
--apiserver-advertise-address=101.34.112.190 \
--ignore-preflight-errors=NumCPU \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--image-repository registry.aliyuncs.com/google_containers \
--v=6

PS: the config-file approach is recommended; the raw command line is harder to read.

Check the environment

$ kubeadm init phase preflight --config=kubeadm-config.yaml

This command checks that the config file is correct and that the system environment supports installing with kubeadm.

Initialize the kubeadm cluster
Run the following on the master only:

$ kubeadm init --config=kubeadm-config.yaml

PS: this is the hardest part. I was stuck here for a whole day and went through all sorts of material before solving it, so if you fail here too, that is normal. Compared with an on-prem install, this is the trickiest and most troublesome step of running k8s on public cloud; once this succeeds, the rest is easy.
I finally solved it with the help of: https://blog.51cto.com/u_15152259/2690063

At this point there are two possible outcomes:

If you are on a private network and the Docker and kubeadm versions above are correct, it will succeed; skip ahead to step 4.
If you are on a cloud server (Tencent Cloud, Alibaba Cloud, ...), it is bound to fail (the cause and the fix are below):

[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
// ...
[kubelet-check] Initial timeout of 40s passed.

Note: be sure to run the initialization above first (it generates the k8s config files; without it you will not find etcd.yaml in the steps below), and apply the fix below only after it has failed!!

How to fix the failed initialization on a cloud server
1) Edit the etcd config file
The file lives at /etc/kubernetes/manifests/etcd.yaml

- --listen-client-urls=https://127.0.0.1:2379,https://101.34.112.190:2379
- --listen-peer-urls=https://101.34.112.190:2380

Change them to:

- --listen-client-urls=https://127.0.0.1:2379
- --listen-peer-urls=https://127.0.0.1:2380
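
Once the kubelet restarts the etcd static pod, you can confirm etcd is now only listening on loopback (a check of my own; netstat is the same net-tools command used later in this article):

netstat -lntp | grep -E '2379|2380'   # both ports should be bound to 127.0.0.1 only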

Quoting from 在腾讯云安装K8S集群 (Installing a K8S cluster on Tencent Cloud):
Here "118.195.137.68" is the Tencent Cloud public IP; the flags to pay attention to are --listen-client-urls and --listen-peer-urls. Delete the public IP after --listen-client-urls, and change --listen-peer-urls to 127.0.0.1:2380.
The reason is that any VPC network on Tencent Cloud maps the public IP onto the private NIC via NAT (look up NAT if you are curious). This is exactly why so many people cannot get a k8s cluster installed on Tencent Cloud or Alibaba Cloud.

2) Manually stop the processes that were started

Stop kubelet first:
$ systemctl stop kubelet

Then kill all remaining kube processes:

$ netstat -anp | grep kube
Note: do NOT run kubeadm reset. First systemctl stop kubelet, then find the PIDs with netstat -anp | grep kube and kill them with kill -9 <pid>. Otherwise the broken etcd config gets regenerated; this is critical!!!

3) Re-initialize, but skip the checks for files that already exist:

# restart kubelet
$ systemctl start kubelet
# re-initialize, skipping the config-generation phases so the etcd edits are not overwritten
$ kubeadm init --config=kubeadm-config.yaml --skip-phases=preflight,certs,kubeconfig,kubelet-start,control-plane,etcd
Successful initialization
If everything is configured correctly, the following output appears quickly (within seconds); otherwise it most likely failed (network timeouts, etc.):

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.200:6443 --token abcdef.0123456789abcdef \
	--discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748

Save the token and sha256 value from the output above.
That is, the snippet below, which is what lets the worker nodes join the k8s cluster:

kubeadm join 101.34.112.190:6443 --token abcdef.0123456789abcdef \
	--discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748
Common errors
Initial timeout of 40s passed
Possibility 1: check the image versions; a mismatch or a bad local re-tag, or etcd failing to start because of the public IP. Run journalctl -xeu kubelet to see the exact error, or journalctl -f -u kubelet to follow the output live during initialization; run kubeadm reset before the next initialization attempt.
Possibility 2: the CentOS version is too old; 7.8 or later is recommended. I failed on 7.2 and 7.5 and only succeeded after yum update -y to 7.9.

Forgot the certificate hash:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2> /dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

Forgot the token:
kubeadm token list
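
Alternatively, instead of piecing the token and hash together by hand, kubeadm can print a fresh, complete join command (a standard kubeadm subcommand, not from the original article):

kubeadm token create --print-join-command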
Configure kubectl (master)
Prepare the config file
kubectl must be authenticated and authorized by the API server before it can perform any management operation. A kubeadm-built cluster generates an admin kubeconfig at /etc/kubernetes/admin.conf, which kubectl loads from the default path "$HOME/.kube/config".

Copy the config file to kubectl's default load path:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Use kubectl to inspect the cluster
On the Master node, print the cluster info:

$ kubectl get nodes
NAME       STATUS     ROLES                  AGE   VERSION
master01   NotReady   control-plane,master   15m   v1.23.6

$ kubectl get cs
etcd-0               Healthy   {"health":"true","reason":""}
controller-manager   Healthy   ok
scheduler            Healthy   ok

STATUS is NotReady because the network is not configured yet; that comes next.

Install the CNI network plugin (master)
$ curl https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml >> kube-flannel.yml
$ chmod 777 kube-flannel.yml
$ kubectl apply -f kube-flannel.yml

Wait a few minutes, then check the Master node again; its status goes from NotReady to Ready:

$ kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
master01   Ready    control-plane,master   15m   v1.23.6
Allow the master node to schedule pods
The k8s master is now installed. To avoid wasting cloud server resources, let the master node run pods too, by executing the following.

Check the scheduling policy:
$ kubectl describe node | grep -E "Name:|Taints:"

Name:    master01
Taints:  node-role.kubernetes.io/master:NoSchedule

NoSchedule: never schedule onto this node
PreferNoSchedule: avoid scheduling if possible
NoExecute: do not schedule, and evict pods already on the node

Allow pods to be scheduled onto the master:
$ kubectl taint nodes --all node-role.kubernetes.io/master-

Check that it took effect:
$ kubectl describe node | grep -E "Name:|Taints:"
Name:    master01
Taints:  <none>
Join the worker nodes to the cluster
On each node, run the join command that kubeadm printed earlier (your token and sha256 will differ):

$ kubeadm join 192.168.1.200:6443 --token abcdef.0123456789abcdef \
	--discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:

  • Certificate signing request was sent to apiserver and a response was received.
  • The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Now run the following on the master; the node has joined successfully:

$ kubectl get nodes
NAME       STATUS     ROLES                  AGE   VERSION
master01   Ready      control-plane,master   60m   v1.23.6
node01     NotReady   <none>                 54s   v1.23.6

After about 5 minutes, node01's status becomes Ready.

Repeat this step on the other node machine to join it to the cluster as well!

Test the cluster
Create an nginx Pod
Run the following on the master node:

$ kubectl run --image=nginx nginx-app --port=80
$ kubectl run --image=nginx nginx-app1 --port=81

Then run:

$ kubectl get pods
NAME        READY   STATUS              RESTARTS   AGE
nginx-app   0/1     ContainerCreating   0          18s

$ kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
nginx-app   1/1     Running   0          26s

$ kubectl get pods -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
nginx-app   1/1     Running   0          57s   10.244.1.2   node01

Both pods reach the Running state, which shows the k8s cluster was installed successfully.

Dashboard UI installation
See my next post: k8s dashboard installation.

(The walkthrough above is adapted from the CSDN post by Go和分布式IM, CC 4.0 BY-SA; original: https://blog.csdn.net/xmcy001122/article/details/127221661)

Before building the cluster, remember to also set up the private-to-public address mapping; the k8s nodes talk to each other over their private addresses.

Map private IPs to public IPs

Run this on every node; every machine's private/public pair has to be mapped on every node.

There is also the cluster-network issue: k8s nodes reach each other via private IPs, which are not routable between different cloud providers. So use iptables to map each private IP to the corresponding public IP, and use kubectl to associate each node name with its public IP:

iptables -t nat -A OUTPUT -d <private IP> -j DNAT --to-destination <public IP>
iptables -t nat -A OUTPUT -d <private IP> -j DNAT --to-destination <public IP>
iptables -t nat -A OUTPUT -d <private IP> -j DNAT --to-destination <public IP>

kubectl annotate node k8s-master-0 flannel.alpha.coreos.com/public-ip-overwrite=<node public IP> --overwrite
kubectl annotate node k8s-master-1 flannel.alpha.coreos.com/public-ip-overwrite=<node public IP> --overwrite
kubectl annotate node k8s-master-2 flannel.alpha.coreos.com/public-ip-overwrite=<node public IP> --overwrite

Then set up KubeSphere.

Before that, set up NFS and mount it as the default storage class.
NFS setup
2.3 Configure the NFS network file system
Install the NFS packages on all three node machines:
yum install -y nfs-utils
NFS configuration on Node1 (the master / NFS server):

NFS server side

mkdir -p /nfs/data
echo "/nfs/data/ *(insecure,rw,sync,no_root_squash)" > /etc/exports

Enable at boot & start now: the remote bind service

systemctl enable rpcbind --now

The NFS service

systemctl enable nfs-server --now

Apply the export configuration

exportfs -r

Check

exportfs
That completes the basic NFS setup on the Node1 server.

NFS configuration on Node2 (client):

Check which directories the remote machine exports (use the master machine's IP address):

showmount -e 192.168.47.139

Mount the shared directory from the NFS server onto the local path:

mkdir -p /nfs/data

Sync with the remote machine:

mount -t nfs 192.168.47.139:/nfs/data /nfs/data
NFS configuration on Node3 (client):

Check which directories the remote machine exports (use the master machine's IP address):

showmount -e 192.168.47.139

Mount the shared directory from the NFS server onto the local path:

mkdir -p /nfs/data

Sync with the remote machine:

mount -t nfs 192.168.47.139:/nfs/data /nfs/data
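
The mount above does not survive a reboot. If you want it to persist, an /etc/fstab entry along these lines would do it (my addition, assuming the same server and path as above):

# /etc/fstab
192.168.47.139:/nfs/data  /nfs/data  nfs  defaults  0 0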
Test that the NFS network file system works.

Write a test file on any machine:

echo "hello nfs" > /nfs/data/test.txt

Read it from the other machines:

cat /nfs/data/test.txt

2.4 Configure the default storage class
This only needs to be done on the Node1 master.

Create a configuration file named storageclass.yaml.
Replace spec > env > value and volumes > server with your own master node's IP address.

This creates a storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  archiveOnDelete: "true"  ## whether to back up the PV contents when the PV is deleted
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/nfs-subdir-external-provisioner:v4.0.2
          # resources:
          #   limits:
          #     cpu: 10m
          #   requests:
          #     cpu: 10m
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              value: 192.168.47.139  ## your NFS server address
            - name: NFS_PATH
              value: /nfs/data       ## the directory shared by the NFS server
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.47.139   ## your NFS server address
            path: /nfs/data
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

Apply the storageclass.yaml file:
kubectl apply -f storageclass.yaml

Check the default storage class:
kubectl get sc

Check the NFS client pod status:
kubectl get pods -A
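
To confirm dynamic provisioning works end to end, you can create a small test PVC and check that it becomes Bound (a sketch of my own; the names are arbitrary):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-storage
  resources:
    requests:
      storage: 200Mi
EOF
kubectl get pvc nfs-test-pvc     # STATUS should turn Bound
kubectl delete pvc nfs-test-pvc  # clean up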

2.5 Configure the cluster metrics component
This only needs to be done on the Node1 master.

By default there is no component monitoring nodes and pods.

Metrics-Server: the cluster metrics component. It talks to the API server and collects metrics from the Kubernetes cluster, so you can see CPU and memory usage for pods, nodes and other resources.

KubeSphere: acts as a dashboard for Kubernetes, so for KubeSphere to show cluster metrics, some component has to supply that data; that collection is done by Metrics-Server.

Set up the Metrics-Server monitoring service.
Create a file named metrics-server.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  - configmaps
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --kubelet-insecure-tls
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        image: registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/metrics-server:v0.4.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          periodSeconds: 10
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

Go into the manifests folder (/etc/kubernetes/manifests) and edit kube-apiserver.yaml,
adding the flag --enable-aggregator-routing=true:

spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.47.139
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --enable-aggregator-routing=true

Restart kubelet:
systemctl daemon-reload
systemctl restart kubelet

Pull the Metrics-Server image with Docker:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.4.3

Apply the metrics-server.yaml file:
kubectl apply -f metrics-server.yaml

Check the metrics-server status:
kubectl get pods -n kube-system

Verify the metrics-server component:
# check the k8s nodes
kubectl top nodes

# check the pods
kubectl top pods -A

At this point the KubeSphere base environment and the default storage class are in place.

(The NFS and metrics-server walkthrough above is adapted from a CSDN post, CC 4.0 BY-SA; original: https://blog.csdn.net/weixin_46389877/article/details/128497112)

Then install KubeSphere following the official guide:
KubeSphere installation

When downloading KubeSphere, do not blindly follow posts that tell you to flip every false in cluster-configuration.yaml to true. I got burned by this before I understood that those flags are KubeSphere's pluggable components. If your servers are weak, enable them one at a time and see what they can handle; mine could not cope, and enabling just the logging component blew up my master.

Then came another error:

kubectl cannot get pods on a worker node.

k8s error: The connection to the server localhost:8080 was refused
On a k8s worker node, kubectl commands such as kubectl get pods --all-namespaces fail with:

[root@k8s-node239 ~]# kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Solutions

I usually use solution 2.

Solution 1: log in as a non-root user and run:

sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf

Solution 2:

The root cause is that kubectl needs to run with the kubernetes-admin identity; /etc/kubernetes/admin.conf was generated on the master during the "kubeadm init" step.

So copy /etc/kubernetes/admin.conf from the master node to the same path on each worker node:

# copy admin.conf; run this on the master node
scp /etc/kubernetes/admin.conf 172.16.2.202:/etc/kubernetes/admin.conf
scp /etc/kubernetes/admin.conf 172.16.2.203:/etc/kubernetes/admin.conf

Then set the environment variable on each worker node:

# set the kubeconfig file
export KUBECONFIG=/etc/kubernetes/admin.conf
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile