First, the three cloud servers: one each from Tencent Cloud, Baidu Cloud, and JD Cloud, with 4C4G, 2C4G, and 2C4G respectively (as a broke college student, these are the only specs I could afford).
Reference:
A hands-on tutorial for installing KubeSphere on Ubuntu 22.04 (在 Ubuntu 22.04 上安装 KubeSphere 实战教程)
That article uses KubeKey to install k8s and KubeSphere with all three servers acting as master nodes, so the masters can hold elections and the cluster is harder to kill. I never got it working, though: I was stuck on the etcd part for quite a while, and when I finally got past it I thought the install was almost done, but there was still a pile of components to install and more pitfalls, so in the end I gave up.
00:02:04 CST [PullModule] Start to pull images on all nodes
00:02:04 CST message: [master] downloading image: kubesphere/pause:3.7
00:02:04 CST message: [node2] downloading image: kubesphere/pause:3.7
00:02:04 CST message: [node1] downloading image: kubesphere/pause:3.7
00:02:04 CST message: [master] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:04.611117 9644 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1
00:02:04 CST retry: [master]
00:02:07 CST message: [node1] downloading image: kubesphere/kube-proxy:v1.24.2
00:02:09 CST message: [node2] downloading image: kubesphere/kube-proxy:v1.24.2
00:02:09 CST message: [master] downloading image: kubesphere/pause:3.7
00:02:09 CST message: [master] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:09.650287 9665 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1
00:02:09 CST retry: [master]
00:02:14 CST message: [master] downloading image: kubesphere/pause:3.7
00:02:14 CST message: [master] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH crictl pull kubesphere/pause:3.7"
WARN[0000] image connect using default endpoints: [unix:///var/run/dockershim.sock unix:///run/containerd/containerd.sock unix:///run/crio/crio.sock unix:///var/run/cri-dockerd.sock]. As the default settings are now deprecated, you should set the endpoint instead.
ERRO[0000] unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory"
E0819 00:02:14.690706 9697 remote_image.go:218] "PullImage from image service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService" image="kubesphere/pause:3.7"
FATA[0000] pulling image: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService: Process exited with status 1
cat > /etc/containerd/config.toml <<EOF
[plugins."io.containerd.grpc.v1.cri"]
systemd_cgroup = true
EOF
systemctl restart containerd
error: code = Unknown desc = failed to pull and unpack image "docker.io/kubesphere/k8s-dns-node-cache:1.15.12": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://registry-1.docker.io/v2/kubesphere/k8s-dns-node-cache/manifests/sha256:8e765f63b3a5b4832c484b4397f4932bd607713ec2bb3e639118bc164ab4a958": net/http: TLS handshake timeout: Process exited with status 1
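For those TLS handshake timeouts when pulling from docker.io, pointing containerd at a registry mirror can help. A minimal sketch (the mirror endpoint is a placeholder, and in practice you would merge this into your existing /etc/containerd/config.toml rather than overwrite it):
cat > /etc/containerd/config.toml <<EOF
[plugins."io.containerd.grpc.v1.cri"]
  systemd_cgroup = true
  [plugins."io.containerd.grpc.v1.cri".registry]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
        # replace <your-mirror> with an accelerator endpoint you have access to
        endpoint = ["https://<your-mirror>", "https://registry-1.docker.io"]
EOF
systemctl restart containerd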
Adjust the crictl config:
crictl config runtime-endpoint /run/containerd/containerd.sock
# Alternatively, edit the config file instead; I did not verify this approach myself
# vi /etc/crictl.yaml
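For reference, a sketch of what that /etc/crictl.yaml would contain, assuming containerd as the runtime (this is the file-based equivalent of the crictl config command above):
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false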
The commands below are for clearing etcd's stale data.
During cluster deployment, etcd fails to start with: request sent was ignored (cluster ID mismatch: peer[c39bdec535db1fd5]=cdf818194e3a8c
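The exact commands weren't preserved here, but the usual fix for a cluster ID mismatch is to stop etcd on the out-of-sync member, wipe its stale data directory, and let it rejoin. A sketch assuming the kubekey layout (systemd-managed etcd with data in /var/lib/etcd):
systemctl stop etcd
rm -rf /var/lib/etcd/*     # wipes this member's stale state; make sure the other members hold the data you want to keep
# if the member is rejoining an existing cluster, ETCD_INITIAL_CLUSTER_STATE=existing should be set in /etc/etcd.env
systemctl start etcd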
On cloud servers the key point is getting the etcd config file right.
Some of the listen addresses need to be changed to the local NIC (private IP).
kube@k8s-master-0:~/kubekey$ cat /etc/etcd.env
# Environment file for etcd v3.4.13
ETCD_DATA_DIR=/var/lib/etcd
ETCD_ADVERTISE_CLIENT_URLS=https://<public-ip>:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://<public-ip>:2380
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_METRICS=basic
ETCD_LISTEN_CLIENT_URLS=https://<private-ip>:2379,https://127.0.0.1:2379
ETCD_ELECTION_TIMEOUT=5000
ETCD_HEARTBEAT_INTERVAL=250
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_LISTEN_PEER_URLS=https://<private-ip>:2380
ETCD_NAME=etcd-k8s-master-0
ETCD_PROXY=off
ETCD_ENABLE_V2=true
ETCD_INITIAL_CLUSTER=etcd-k8s-master-0=https://<public-ip>:2380,etcd-k8s-master-1=https://<public-ip>:2380,etcd-k8s-master-2=https://<public-ip>:2380
ETCD_AUTO_COMPACTION_RETENTION=8
ETCD_SNAPSHOT_COUNT=10000

# TLS settings
ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0.pem
ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0-key.pem
ETCD_CLIENT_CERT_AUTH=true
ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0.pem
ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-k8s-master-0-key.pem
ETCD_PEER_CLIENT_CERT_AUTH=True

# CLI settings
ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
ETCDCTL_CA_FILE=/etc/ssl/etcd/ssl/ca.pem
ETCDCTL_KEY_FILE=/etc/ssl/etcd/ssl/admin-k8s-master-0-key.pem
ETCDCTL_CERT_FILE=/etc/ssl/etcd/ssl/admin-k8s-master-0.pem
KubeKey configuration:
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: k8s-master-0, address: <public-ip>, internalAddress: <public-ip>, user: kube, password: ""}
  - {name: k8s-master-1, address: <public-ip>, internalAddress: <public-ip>, user: kube, privateKeyPath: "~/.ssh/id_ed25519"}
  - {name: k8s-master-2, address: <public-ip>, internalAddress: <public-ip>, user: kube, privateKeyPath: "~/.ssh/id_ed25519"}
  roleGroups:
    etcd:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
    control-plane:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
    worker:
    - k8s-master-0
    - k8s-master-1
    - k8s-master-2
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.25.5
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: containerd
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []
---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.3.2
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  zone: ""
  local_registry: ""
  namespace_override: ""
  # dev_tag: ""
  etcd:
    monitoring: false
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    # apiserver:
    #   resources: {}
    # controllerManager:
    #   resources: {}
    redis:
      enabled: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      # type: external
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
      - resourceName: "nvidia.com/gpu"
        resourceType: "GPU"
        default: true
    es:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:
    enabled: false
    # resources: {}
    jenkinsMemoryLim: 8Gi
    jenkinsMemoryReq: 4Gi
    jenkinsVolumeSize: 8Gi
  events:
    enabled: false
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    # ruler:
    #   enabled: true
    #   replicas: 2
    #   resources: {}
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    node_exporter:
      port: 9100
      # resources: {}
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1
    #   volumeSize: 20Gi
    #   resources: {}
    # operator:
    #   resources: {}
    # alertmanager:
    #   replicas: 1
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
    istio:
      components:
        ingressGateways:
        - name: istio-ingressgateway
          enabled: false
        cni:
          enabled: false
  edgeruntime:
    enabled: false
    kubeedge:
      enabled: false
      cloudCore:
        cloudHub:
          advertiseAddress:
          - ""
        service:
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudhubHttpsNodePort: "30002"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
        # resources: {}
        # hostNetWork: false
      iptables-manager:
        enabled: true
        mode: "external"
        # resources: {}
      # edgeService:
      #   resources: {}
  terminal:
    timeout: 600
Those were the pitfalls I hit on Ubuntu. I never managed the one-click KubeKey install there, so from here on I switch to CentOS 7.9 and install with kubeadm.
On CentOS the install succeeded by following the article below. This approach pins a single master node, so if that master server dies the whole cluster goes down with it, but it is simple to set up, and for learning purposes it is good enough for now.
Overall workflow
The main steps are as follows:
For the dashboard UI and a private image registry, see other articles:
1. Deploy the Dashboard web UI to view Kubernetes resources visually; see my next article: installing the k8s dashboard.
2. Deploy a Harbor private registry to store images (optional, not covered here).
Below are the detailed configuration steps for each stage.
Environment preparation
Cloud hosts
Cloud server | Region | CentOS | Node role | Spec | Public IP | Tools installed
Tencent Cloud | Shanghai Zone 3 | 7.9 | master01 | 2C4G | 101.34.112.190 | docker, kubeadm, kubelet, kubectl, flannel
Same as above | Shanghai Zone 2 | 7.9 | node01 | 1C2G | 81.68.126.69 | same as above
Same as above | Shanghai Zone 2 | 7.9 | node02 | 1C2G | 81.68.92.49 | same as above
Upgrading CentOS
If you are below CentOS 7.9, upgrade first:
$ yum update -y
$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
PS: do not skip this step; installing kubeadm on older CentOS versions is very likely to fail. The author only succeeded after failing and then upgrading CentOS!
CentOS setup on all nodes
Basic settings
PS: this step assumes CentOS Linux release 7.9.2009 (Core); versions 7.2 - 7.6 seem to fail, so if you get stuck partway, consider upgrading or changing the CentOS version.
You can create a k8s-pre-install-centos.sh script to apply everything in one go:
$ vim k8s-pre-install-centos.sh

#!/bin/sh
function set_base(){
  # Disable the firewall. PS: on a cloud server, also disable the firewall in the provider console or allow all ports.
  systemctl stop firewalld
  systemctl disable firewalld

  # Disable SELinux so containers can read the host filesystem.
  setenforce 0

  # Permanently disable swap; kubeadm requires it to be off.
  swapoff -a
  sed -ri 's/.*swap.*/#&/' /etc/fstab

  # iptables configuration
  cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
  cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
  # Apply the sysctl parameters
  sysctl --system
}
set_base
Run the script:
$ chmod 777 k8s-pre-install-centos.sh && ./k8s-pre-install-centos.sh
Hostname setup
Set the hostnames
On the master:
hostnamectl set-hostname master01
On node 1:
hostnamectl set-hostname node01
On node 2:
hostnamectl set-hostname node02
Edit the hosts file
Run on every machine:
$ vim /etc/hosts
101.34.112.190 master01
81.68.126.69 node01
81.68.92.49 node02
PS: replace these with your own public IPs (note: public IPs, not private ones)!
Install Docker on all nodes
k8s supports three container runtimes; here we stick with the familiar Docker. Make sure you are on CentOS 7 or above, and defer to the official docs for the latest requirements.
Add the yum repository
$ sudo yum install -y yum-utils
$ sudo yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
Install Docker
This includes the CLI, the engine, docker compose, etc.
$ sudo yum install docker-ce-20.10.14-3.el7 docker-ce-cli-20.10.14-3.el7 containerd.io docker-compose-plugin
PS: this tutorial uses Docker version 20.10.14.
Configure the Docker daemon
In particular, use systemd to manage the containers' cgroups, and configure the Alibaba Cloud registry mirror to speed up image pulls.
$ sudo mkdir /etc/docker
$ cat <<EOF | sudo tee /etc/docker/daemon.json
{
"registry-mirrors": ["https://6ijb8ubo.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
registry-mirrors: registry mirror for faster pulls.
cgroupdriver: use systemd.
log-driver: JSON log files, capped at 100m.
Start Docker and enable it at boot
$ sudo systemctl enable docker
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ systemctl status docker # make sure it is in the running state
Confirm the Cgroup Driver is systemd
$ docker info | grep "Cgroup Driver"
Cgroup Driver: systemd
PS: k8s defaults to systemd as the cgroup driver; if Docker uses a different driver, the cluster may become unstable.
Install kubeadm on all nodes
To avoid outdated instructions, always defer to the official documentation:
Configure the yum source (using Aliyun, because Google is, well, you know)
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
$ yum makecache # refresh the yum metadata cache

Install kubeadm, kubelet, kubectl
$ sudo yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6 --disableexcludes=kubernetes
PS: k8s moves fast; to keep this tutorial reproducible, use the same versions.

Start kubelet and enable it at boot
$ sudo systemctl start kubelet
$ sudo systemctl enable kubelet
PS: kubeadm uses the kubelet service to deploy and start the main Kubernetes components as containers.
Pull the Docker images on all nodes
PS: this 1.23.6 release needs Docker 20; newer Docker versions are not guaranteed to work.
Pull the Docker images
List the images needed for initialization
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.23.6
k8s.gcr.io/kube-controller-manager:v1.23.6
k8s.gcr.io/kube-scheduler:v1.23.6
k8s.gcr.io/kube-proxy:v1.23.6
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
Switch the k8s image registry
The default k8s image registry is k8s.gcr.io, which is unreachable for well-known reasons.
So create a kubeadm-config-image.yaml config to replace it with the Aliyun registry:
$ vim kubeadm-config-image.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
# Defaults to k8s.gcr.io, which is unreachable, so replace it with the Aliyun mirror
imageRepository: registry.aliyuncs.com/google_containers
Confirm the registry has changed
Check again; only proceed to the next step once the image addresses have changed:
$ kubeadm config images list --config kubeadm-config-image.yaml
registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.6
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.6
registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.6
registry.aliyuncs.com/google_containers/kube-proxy:v1.23.6
registry.aliyuncs.com/google_containers/pause:3.6
registry.aliyuncs.com/google_containers/etcd:3.5.1-0
registry.aliyuncs.com/google_containers/coredns:v1.8.6
Pull the images
$ kubeadm config images pull --config kubeadm-config-image.yaml
Run this on all machines to pre-pull the images.
Initialize the cluster on the master node
Generate the default kubeadm-config.yaml
and change the following items:
$ kubeadm config print init-defaults > kubeadm-config.yaml
kubernetes-version: the cluster version; the kubeadm version installed above must be less than or equal to this. See https://kubernetes.io/releases/.
pod-network-cidr: the Pod network CIDR; it must match the value used by the Pod network plugin. Typically Flannel defaults to 10.244.0.0/16 and Calico to 192.168.0.0/16.
api-server: the master acts as the API server, so this is the master machine's IP address.
image-repository: the registry to pull images from; defaults to k8s.gcr.io.
nodeRegistration.name: change it to master01.
The final config looks like this:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 101.34.112.190 # the master node's IP address (public)
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master01 # change to the master's hostname
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers # defaults to k8s.gcr.io, which is unreachable, so use the Aliyun mirror
kind: ClusterConfiguration
kubernetesVersion: 1.23.6 # the kubernetes version; use what kubeadm config print init-defaults generated
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # the pod CIDR; 10.244.0.0/16 matches flannel's default
scheduler: {}
The configuration above is equivalent to:
$ kubeadm init \
--kubernetes-version=v1.23.6 \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=101.34.112.190 --ignore-preflight-errors=Swap
Or, for a 1-core master, initialize like this (--ignore-preflight-errors=NumCPU must be added on a 1-core ECS server, otherwise it errors out, because k8s requires at least 2 cores):
$ kubeadm init \
--kubernetes-version=v1.23.6 \
--apiserver-advertise-address=101.34.112.190 \
--ignore-preflight-errors=NumCPU \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--image-repository registry.aliyuncs.com/google_containers \
--v=6
PS: the config-file approach is recommended; the raw command line is harder to read.
Check the environment
$ kubeadm init phase preflight --config=kubeadm-config.yaml
This command checks that the config file is correct and that the system environment supports installing kubeadm.
Initialize the kubeadm cluster
Run the following on the master only:
$ kubeadm init --config=kubeadm-config.yaml
PS: this is the hardest part. The author was stuck here for an entire day and only solved it after digging through all kinds of material, so if you fail here too, that is normal. Compared with deploying k8s on a private network, this is where a public-network deployment is most painful; once this step succeeds, the rest is easy.
It was finally solved with the help of: https://blog.51cto.com/u_15152259/2690063
At this point there are two possible outcomes:
On a private network, if the Docker and kubeadm versions above are correct, it will succeed; skip straight to step 4.
On a cloud server (Tencent Cloud, Alibaba Cloud) it will definitely fail (the reason and the fix are below):
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
// ...
[kubelet-check] Initial timeout of 40s passed.
Tip: be sure to run the init above first (so that the k8s config files get generated; otherwise you won't find etcd.yaml in the steps below), and only after it fails proceed with the steps below!
Fix for the cloud-server init failure
1) Edit the etcd config file
Config file location: /etc/kubernetes/manifests/etcd.yaml
- --listen-client-urls=https://127.0.0.1:2379,https://101.34.112.190:2379
- --listen-peer-urls=https://101.34.112.190:2380
change to
- --listen-client-urls=https://127.0.0.1:2379
- --listen-peer-urls=https://127.0.0.1:2380
Quoting from "Installing a K8S cluster on Tencent Cloud" (在腾讯云安装K8S集群):
Here "118.195.137.68" is the Tencent Cloud public IP; the flags to watch are "--listen-client-urls" and "--listen-peer-urls". You need to remove the public IP from --listen-client-urls and change --listen-peer-urls to 127.0.0.1:2380.
The reason is that on Tencent Cloud, any instance in a VPC has its public IP mapped onto the private NIC via NAT (look up NAT if you're curious). This is also why so many people fail to install a k8s cluster on Tencent Cloud or Alibaba Cloud.
2) Manually stop the processes that were started
Stop kubelet first
$ systemctl stop kubelet
$ netstat -anp |grep kube
Note: do NOT run kubeadm reset. First systemctl stop kubelet, then find the pids manually with netstat -anp |grep kube and force-kill them with kill -9 pid. Otherwise a broken etcd config file gets regenerated; this part is critical!
3) Re-initialize, but skip the checks for files that already exist:
# restart kubelet
$ systemctl start kubelet
# re-initialize, skipping the config-generation phases so the etcd changes are not overwritten
$ kubeadm init --config=kubeadm-config.yaml --skip-phases=preflight,certs,kubeconfig,kubelet-start,control-plane,etcd
Successful initialization
If everything is configured correctly, the output below appears quickly (within seconds), which means success; otherwise it has most likely failed (network timeouts and the like):
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.200:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748
Save the token and sha256 value from the output above.
That is, the snippet below; this is the command that joins a worker node to the k8s cluster:
kubeadm join 101.34.112.190:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748
Common errors
Initial timeout of 40s passed
Possibility 1: check the image versions; it may be a version mismatch, a mistake when retagging locally, or etcd failing to start because of the public IP. Run journalctl -xeu kubelet to see the exact error, or journalctl -f -u kubelet to follow the init output live; run kubeadm reset before the next init attempt.
Possibility 2: the CentOS version is too old; 7.8+ is recommended. I failed on both 7.2 and 7.5, and only succeeded after yum update -y upgraded me to 7.9.
If you forgot the certificate hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2> /dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
If you forgot the token
kubeadm token list
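If both the token and the hash are lost, kubeadm can also print a fresh, complete join command in one go:
$ kubeadm token create --print-join-command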
Configure kubectl (master)
Prepare the config file
kubectl must be authenticated and authorized by the API server before it can perform management operations. A kubeadm-deployed cluster generates an admin-privileged kubeconfig at /etc/kubernetes/admin.conf, which kubectl loads from the default path "$HOME/.kube/config".
Copy the config file to kubectl's default load path:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Use kubectl to view cluster info
Run on the master node to print the cluster info:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 NotReady control-plane,master 15m v1.23.6
$ kubectl get cs
etcd-0 Healthy {"health":"true","reason":""}
controller-manager Healthy ok
scheduler Healthy ok
The STATUS is NotReady here because the network has not been configured yet; that is covered next.
Install the CNI network (master)
$ curl https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml>>kube-flannel.yml
$ chmod 777 kube-flannel.yml
$ kubectl apply -f kube-flannel.yml
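To watch the flannel pods come up in the meantime (the namespace differs between flannel versions, hence the grep):
$ kubectl get pods -A | grep flannel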
Wait a few minutes and check the master node again; it changes from NotReady to Ready:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane,master 15m v1.23.6
Allow pods to be scheduled on the master node
At this point the k8s master node is installed. To avoid wasting cloud server resources, we also want the master node to run pods, which requires the commands below.
Check the scheduling policy (taints)
$ kubectl describe node|grep -E "Name:|Taints:"
Name: master01
Taints: node-role.kubernetes.io/master:NoSchedule
NoSchedule: pods will never be scheduled onto the node
PreferNoSchedule: avoid scheduling onto the node if possible
NoExecute: not only refuses new pods, but also evicts pods already running on the node
Make the master node schedulable for pods
$ kubectl taint nodes --all node-role.kubernetes.io/master-
Check that it took effect
$ kubectl describe node|grep -E "Name:|Taints:"
Name: master01
Taints:
Join the worker nodes to the cluster
On each node, run the command that kubeadm printed earlier (your token and sha256 values will differ):
$ kubeadm join 192.168.1.200:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:af2a6e096cb404da729ef3802e77482f0a8a579fa602d7c071ef5c5415aac748
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster…
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap…
This node has joined the cluster:
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Now, running the following on the master shows the node has joined successfully:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane,master 60m v1.23.6
node01 NotReady 54s v1.23.6
After about 5 minutes, node01's status changes to Ready.
For the other node machine, simply repeat this step to join the cluster!
Test the cluster
Create an nginx Pod
Run the following on the master node:
$ kubectl run --image=nginx nginx-app --port=80
$ kubectl run --image=nginx nginx-app1 --port=81
Then run:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-app 0/1 ContainerCreating 0 18s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-app 1/1 Running 0 26s
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-app 1/1 Running 0 57s 10.244.1.2 node01
You can see the pods reach the Running state, which proves the k8s cluster was installed successfully.
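As an extra sanity check (not part of the original write-up), you can expose one of the pods as a NodePort Service and curl it; the names below are just the ones created above, and remember to open the NodePort in the cloud security group:
$ kubectl expose pod nginx-app --port=80 --type=NodePort
$ kubectl get svc nginx-app          # note the mapped NodePort, e.g. 80:3xxxx/TCP
$ curl http://<node-public-ip>:<nodeport>   # should return the nginx welcome page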
Installing the Dashboard UI
Please see the next article: installing the k8s dashboard.
The CentOS walkthrough above is adapted from an original article by CSDN blogger 「Go和分布式IM」, under the CC 4.0 BY-SA license: https://blog.csdn.net/xmcy001122/article/details/127221661
Before building the cluster, remember to also set up private-address mapping: the k8s cluster members communicate with each other over private addresses.
This must be done on every node; each machine's private/public address pair needs a mapping on every node.
There is also the cluster network problem: cluster members talk to each other over private addresses, but across cloud providers the private networks cannot reach each other. Use iptables to map each private IP to the corresponding public IP, and use kubectl annotations to tie each node name to its public IP:
iptables -t nat -A OUTPUT -d <private-ip> -j DNAT --to-destination <public-ip>
iptables -t nat -A OUTPUT -d <private-ip> -j DNAT --to-destination <public-ip>
iptables -t nat -A OUTPUT -d <private-ip> -j DNAT --to-destination <public-ip>
kubectl annotate node k8s-master-0 flannel.alpha.coreos.com/public-ip-overwrite=<node-public-ip> --overwrite
kubectl annotate node k8s-master-1 flannel.alpha.coreos.com/public-ip-overwrite=<node-public-ip> --overwrite
kubectl annotate node k8s-master-2 flannel.alpha.coreos.com/public-ip-overwrite=<node-public-ip> --overwrite
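Note that these iptables NAT rules live only in memory and are lost on reboot. One way to persist them on CentOS 7 (an assumption, using the iptables-services package, which is not part of the original steps) is:
yum install -y iptables-services
service iptables save        # dumps the current rules to /etc/sysconfig/iptables
systemctl enable iptables    # reloads the saved rules at boot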
Before installing KubeSphere, set up NFS first and mount a default storage volume.
NFS setup
2.3 Configure the NFS network filesystem
This step installs the NFS environment on all three node machines.
yum install -y nfs-utils
NFS configuration on the Node1 master.
mkdir -p /nfs/data
echo "/nfs/data/ *(insecure,rw,sync,no_root_squash)" > /etc/exports
systemctl enable rpcbind --now
systemctl enable nfs-server --now
exportfs -r
exportfs
That completes the basic NFS environment on the Node1 master.
NFS configuration on the Node2 worker.
showmount -e 192.168.47.139
mkdir -p /nfs/data
mount -t nfs 192.168.47.139:/nfs/data /nfs/data
NFS configuration on the Node3 worker.
showmount -e 192.168.47.139
mkdir -p /nfs/data
mount -t nfs 192.168.47.139:/nfs/data /nfs/data
Test whether the NFS network filesystem works.
echo "hello nfs" > /nfs/data/test.txt
cat /nfs/data/test.txt
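The mount on the worker nodes is not persistent; if you want it to survive reboots, an /etc/fstab entry along these lines (using the example NFS server IP from above) would do it:
echo "192.168.47.139:/nfs/data /nfs/data nfs defaults 0 0" >> /etc/fstab
mount -a    # re-reads fstab and mounts anything not yet mounted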
2.4 Configure the default StorageClass
This only needs to be done on the Node1 master.
Create a config file named storageclass.yaml.
Replace spec > env > value and volumes > server with your own master node's IP address (see the sketch after the listing below).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: default
rules:
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: default
subjects:
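The fragment above only keeps the object headers; the Deployment spec and the StorageClass themselves are missing. A sketch of those pieces, based on the commonly used nfs-subdir-external-provisioner manifest (the image, provisioner name, and server IP are assumptions; replace spec > env > value and volumes > server with your own master IP as noted above):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # makes this the default StorageClass
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  archiveOnDelete: "true"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  namespace: default
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: registry.cn-hangzhou.aliyuncs.com/lfy_k8s_images/nfs-subdir-external-provisioner:v4.0.2  # example mirror image
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s-sigs.io/nfs-subdir-external-provisioner
            - name: NFS_SERVER
              value: 192.168.47.139   # replace with your master's IP (spec > env > value)
            - name: NFS_PATH
              value: /nfs/data
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.47.139    # replace with your master's IP (volumes > server)
            path: /nfs/data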
Apply the storageclass.yaml file.
kubectl apply -f storageclass.yaml
Check the default StorageClass
kubectl get sc
Check the NFS client pod status
kubectl get pods -A
2.5 Configure the cluster metrics component
This only needs to be done on the Node1 master.
By default there is no component that monitors nodes and Pods.
Metrics-Server: the cluster metrics component. It talks to the API Server to collect metrics from the Kubernetes cluster; with it you can see CPU and memory usage for Pods, Nodes, and other resources.
KubeSphere: can act as the Kubernetes dashboard (visual console). For KubeSphere to obtain the cluster's metrics, some component has to supply that data, and that collection is handled by Metrics-Server.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
Go into the manifests folder and edit kube-apiserver.yaml.
Add the flag --enable-aggregator-routing=true (it goes under spec > containers > command, as sketched below).
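For context, a trimmed sketch of what that section of /etc/kubernetes/manifests/kube-apiserver.yaml ends up looking like:
spec:
  containers:
  - command:
    - kube-apiserver
    # ...existing flags left as they are...
    - --enable-aggregator-routing=true   # the line to add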
Restart the kubelet service
systemctl daemon-reload
systemctl restart kubelet
Pull the Metrics-Server image with Docker
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.4.3
Apply the metrics-server.yaml file
kubectl apply -f metrics-server.yaml
Check the metrics-server status
kubectl get pods -n kube-system
Verify the metrics-server component
# verify the K8s nodes
kubectl top nodes
# verify the Pods
kubectl top pods -A
At this point, the basic environment for KubeSphere and the default storage (PVC) are in place.
The NFS and metrics-server steps above are adapted from an original CSDN article, under the CC 4.0 BY-SA license: https://blog.csdn.net/weixin_46389877/article/details/128497112
Then install KubeSphere by following the official docs.
KubeSphere installation
When setting up KubeSphere, do not follow the online advice of flipping every false in cluster-configuration.yaml to true. I got burned by this before I understood that those switches are KubeSphere's pluggable components: if your servers are underpowered, enable them one at a time and see what they can handle. I turned on just the logging component and it blew up my master.
Cannot get pods from a worker node
k8s error: The connection to the server localhost:8080 was refused
When running kubectl on a k8s worker node, e.g. kubectl get pods --all-namespaces, the following error appears:
[root@k8s-node239 ~]# kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?
I usually use solution 2.
Solution 1: log in as a non-root user, then run:
sudo cp /etc/kubernetes/admin.conf $HOME/
sudo chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf

Solution 2:
The problem occurs because kubectl needs to run with the kubernetes-admin identity, and /etc/kubernetes/admin.conf is only generated on the master during "kubeadm init".
So the fix is to copy /etc/kubernetes/admin.conf from the master to the same path on each worker node:
# copy admin.conf; run this on the master
scp /etc/kubernetes/admin.conf 172.16.2.202:/etc/kubernetes/admin.conf
scp /etc/kubernetes/admin.conf 172.16.2.203:/etc/kubernetes/admin.conf
Then set the environment variable on each worker node:
# set the kubeconfig file
export KUBECONFIG=/etc/kubernetes/admin.conf
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile