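The rise and adoption of Kubernetes has not only accelerated the development of containers but also driven the popularity of cloud-native technology. The financial industry, too, is seeing more and more systems move to the cloud. To make our bank's later move to the cloud easier and quicker to pick up, I have started some study and experiments with k8s and cloud-native technology. Hands-on experimentation is the fastest way to learn, so I chose to begin with installation and deployment: first own a Kubernetes cluster, then study it with questions in mind. Later I will also read books for a more systematic understanding, and I hope I can keep it up. Below, a k8s cluster is built offline on RHEL 7.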
IP | Hostname | Role |
---|---|---|
172.16.131.83 | k8s-master | master (control-plane) node |
172.16.131.84 | k8s-node1 | worker node 1 |
172.16.131.85 | k8s-node2 | worker node 2 |
172.16.131.86 | k8s-node3 | worker node 3 |
172.16.131.87 | registry-harbor | image registry |
172.16.131.88 | k8s-zhongzhuan | internet-connected relay host |
1) On the k8s nodes, add the host name mappings to /etc/hosts:
cp /etc/hosts /etc/hosts_`date +%y%m%d`
echo "
172.16.131.83 k8s-master
172.16.131.84 k8s-node1
172.16.131.85 k8s-node2
172.16.131.86 k8s-node3
172.16.131.87 registry-harbor
" >> /etc/hosts
2) Configure kernel parameters:
echo "fs.file-max = 6815744
kernel.sem = 10000 10240000 10000 1024
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 751619276800
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.wmem_default = 16777216
fs.aio-max-nr = 6194304
vm.dirty_ratio=20
vm.dirty_background_ratio=3
vm.dirty_writeback_centisecs=100
vm.dirty_expire_centisecs=500
vm.min_free_kbytes=524288
net.core.netdev_max_backlog = 30000
net.core.netdev_budget = 600
#vm.nr_hugepages =
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
net.ipv4.ipfrag_time = 60
net.ipv4.ipfrag_low_thresh = 6291456
net.ipv4.ipfrag_high_thresh = 8388608
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0" >> /etc/sysctl.conf && sysctl -p
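The two net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so `sysctl -p` can report them as unknown on a fresh host. As a minimal sketch, the Kubernetes-specific settings from the list above could also be kept in their own drop-in file. The function name, the directory argument and the file name 99-kubernetes.conf are illustrative assumptions, not part of the original procedure.

```shell
# Write the Kubernetes-related kernel settings to a separate drop-in file.
# The target directory is a parameter (default /etc/sysctl.d) so the function
# can be tried against a scratch directory first.
write_k8s_sysctl() (
    dir=${1:-/etc/sysctl.d}
    cat > "$dir/99-kubernetes.conf" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness = 0
EOF
)
```

On a real node this would typically be used as `modprobe br_netfilter && write_k8s_sysctl && sysctl --system`.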
3) Configure user resource limits:
cp /etc/security/limits.conf /etc/security/limits_`date +"%Y%m%d_%H%M%S"`.conf
echo "
* soft nproc 655350
* hard nproc 655350
* soft nofile 655360
* hard nofile 655360
* soft stack 102400
* hard stack 327680
* soft memlock -1
* hard memlock -1" >>/etc/security/limits.conf
4) Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
5) Disable SELinux:
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
6) Disable transparent huge pages (THP):
[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled
[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
grep transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
grep redhat_transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local
[ -x /etc/rc.d/rc.local ] || chmod +x /etc/rc.d/rc.local
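To confirm the setting took effect, the kernel marks the active THP mode with square brackets in the same file. A small sketch for reading it back follows; the function name `thp_mode` and its optional file argument are illustrative assumptions.

```shell
# Print the active THP mode, i.e. the word enclosed in [brackets]
# in a line such as "always madvise [never]".
thp_mode() (
    file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
)
```

After the steps above, calling `thp_mode` without arguments is expected to print `never`.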
7) Disable swap:
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
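The sed command above prefixes every line that mentions swap, so running it a second time adds another `#` to lines that are already commented. A hedged alternative that skips existing comments is sketched below; it assumes GNU sed, and the function name and file argument are illustrative.

```shell
# Comment out fstab entries that have a "swap" field, leaving lines that are
# already comments untouched so the function can be run repeatedly.
comment_out_swap() (
    file=${1:-/etc/fstab}
    sed -i -e '/^[[:space:]]*#/!{' \
           -e '/[[:space:]]swap[[:space:]]/s/^/#/' \
           -e '}' "$file"
)
```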
8) Configure passwordless SSH (see the appendix for the contents of sshUserSetup.sh):
sh sshUserSetup.sh -user root -hosts "k8s-master k8s-node1 k8s-node2 k8s-node3"
9) Synchronize the clocks (the other nodes sync from the master):
On the master:
vi /etc/ntp.conf
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
On the other machines:
crontab -e
*/2 * * * * /usr/sbin/ntpdate 172.16.131.83
date && ssh k8s-node1 date && ssh k8s-node2 date && ssh k8s-node3 date
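Comparing the printed dates by eye works for a handful of nodes. As a rough sketch, the spread can also be computed from epoch timestamps; the helper name `max_skew` is an illustrative assumption, and because the SSH calls run one after another the result includes a little connection latency.

```shell
# Read one epoch timestamp per line on stdin and print the spread
# (largest minus smallest) in seconds.
max_skew() {
    awk 'NR == 1 { min = $1; max = $1 }
         $1 < min { min = $1 }
         $1 > max { max = $1 }
         END { if (NR) print max - min }'
}
```

A possible invocation: `for h in k8s-master k8s-node1 k8s-node2 k8s-node3; do ssh "$h" date +%s; done | max_skew`.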
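Since Kubernetes 1.24, dockershim has been removed, so Kubernetes no longer works well with Docker as the container runtime out of the box; during deployment and initialization, images may fail to be pulled from the private registry. There are two solutions: option one, deploy cri-dockerd alongside the Docker runtime; option two, use containerd as the container runtime. Here we choose the first option.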
1) Install the required software (the local repo is sufficient):
yum install -y yum-utils device-mapper-persistent-data lvm2 wget
2) Install EPEL (requires a CentOS 7 repo):
Fetch the CentOS 7 repo file from the Aliyun mirror:
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
3) Edit CentOS-Base.repo and replace every $releasever in the file with the version number 7:
vi /etc/yum.repos.d/CentOS-Base.repo
%s/$releasever/7/g
4) Clean and rebuild the yum cache:
yum clean all&& yum makecache fast
5) Install epel-release.noarch:
yum install -y epel-release.noarch
6) Add the Docker repo:
yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
or
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
7) Enable the yum repository:
yum-config-manager --enable docker-ce-nightly
(Check which Docker versions can be installed: yum list docker-ce --showduplicates | sort -r)
Note: when checking the installable Docker versions, an error similar to the following may appear:
https://mirrors.aliyun.com/docker-ce/linux/centos/7Server/x86_64/stable/repodata/7cc100684a6630e5382cf07c92483acecdff60eb94243af9acb95654c2913d70-primary.sqlite.bz2: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.
The main cause is that $releasever in the repo configuration cannot be resolved. In that case, do the following:
vi /etc/yum.repos.d/docker-ce.repo
%s/$releasever/7/g
8) Clean and rebuild the yum cache:
yum clean all&& yum makecache fast
9) Download the RPMs for the chosen Docker version:
mkdir -p /app/soft/docker
cd /app/soft/docker
yumdownloader --resolve docker-ce-23.0.1
10) Package them:
cd /app/soft
tar -cvzf docker_offline_pkg.tar.gz docker
11) Send docker_offline_pkg.tar.gz to the offline machines:
scp -rp docker_offline_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp docker_offline_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp docker_offline_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp docker_offline_pkg.tar.gz 172.16.131.86:/app/soft/
scp -rp docker_offline_pkg.tar.gz 172.16.131.87:/app/soft/
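The same copy step recurs several times in this guide, so it may be convenient to wrap it in a loop. The sketch below is an assumption-laden helper, not part of the original procedure: the function name `distribute` and the DRY_RUN switch are made up for illustration.

```shell
# Copy one file to the same directory on several hosts.
# With DRY_RUN=1 the scp commands are only printed, not executed.
distribute() (
    file=$1; dest=$2; shift 2
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp -p $file $host:$dest"
        else
            scp -p "$file" "$host:$dest" || echo "copy to $host failed" >&2
        fi
    done
)
```

For example, `DRY_RUN=1 distribute docker_offline_pkg.tar.gz /app/soft/ 172.16.131.83 172.16.131.84` prints the two commands it would run.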
12) Download the cri-dockerd binary package
Download URL:
https://github.com/Mirantis/cri-dockerd/releases/
Choose the binary package:
cri-dockerd-0.3.1.amd64.tgz
1) On the offline machines, extract the offline packages:
tar -xvzf docker_offline_pkg.tar.gz -C /app/soft/
tar -xvzf cri-dockerd-0.3.1.amd64.tgz -C /app/soft/
2) Install Docker:
cd /app/soft/
yum install docker/*.rpm
3) Start Docker:
systemctl start docker && systemctl enable docker
4) Install cri-dockerd; extract the package:
tar -xvzf cri-dockerd-0.3.1.amd64.tgz -C /app/soft
5) Copy the binary to /usr/bin and make it executable:
cd /app/soft/cri-dockerd
cp cri-dockerd /usr/bin/
chmod +x /usr/bin/cri-dockerd
6) Create the systemd unit file for cri-dockerd:
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=172.16.131.87:1088/kubernetes-deploy/pause:3.7
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
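Note: the unit file must include the pod-infra-container-image option. Otherwise, during the later Kubernetes deployment, the pause image is pulled by default from k8s.gcr.io/pause:3.7, which cannot be reached from the offline environment. With this option set, the image is pulled from the registry we specify. The exact option is: --pod-infra-container-image=172.16.131.87:1088/kubernetes-deploy/pause:3.7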
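Because a wrong pause image path only shows up later as a failed pull, it can help to read the value back from the unit file and compare it with what was pushed to the registry. The helper below is a sketch; its name and the optional path argument are illustrative assumptions.

```shell
# Print the value of --pod-infra-container-image found in a systemd unit file.
pause_image_from_unit() (
    unit=${1:-/usr/lib/systemd/system/cri-docker.service}
    sed -n 's/.*--pod-infra-container-image=\([^[:space:]]*\).*/\1/p' "$unit"
)
```

The printed reference should name an image that actually exists in the private registry.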
7) Create the socket unit file:
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
8) Start the cri-dockerd runtime:
systemctl daemon-reload
systemctl start cri-docker
systemctl enable cri-docker
systemctl status cri-docker
1) Download and configure the EPEL repo:
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
2) Download docker-compose
Check the available versions:
yum list docker-compose --showduplicates | sort -r
Create the directory:
mkdir -p /app/soft/docker-compose
cd /app/soft/docker-compose
Download the specified version:
yumdownloader --resolve docker-compose-1.18.0
3) Package the docker-compose RPMs:
cd /app/soft
tar -cvzf docker-compose_offline_pkg_v1.18.0.tar.gz docker-compose
4) Send the docker-compose package to the offline machines:
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.83:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.84:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.85:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.86:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.87:/app/soft/
5) On the offline machine, extract the offline package:
tar -xvzf /app/soft/docker-compose_offline_pkg_v1.18.0.tar.gz -C /app/soft/
6) On the offline machine, install docker-compose:
cd /app/soft/docker-compose
yum install *.rpm
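7) Download the Harbor offline installer (on the internet-connected relay host):
curl -LO https://github.com/goharbor/harbor/releases/download/v2.7.1/harbor-offline-installer-v2.7.1.tgz
Alternatively, download it manually from GitHub and upload it.
8) Transfer the offline package to the registry-harbor host and extract it: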
scp -rp /app/soft/harbor-offline-installer-v2.7.1.tgz 172.16.131.87:/app/soft/
tar -xvzf /app/soft/harbor-offline-installer-v2.7.1.tgz -C /app/
9) Edit the YAML configuration file as needed:
cd /app/harbor
cp harbor.yml.tmpl harbor.yml
vi harbor.yml
The main changes are:
# Configuration file of Harbor

# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: 172.16.131.87

# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 1088

# https related config
#https:
#  # https port for harbor, default is 443
#  port: 443
#  # The path of cert and key files for nginx
#  certificate: /your/certificate/path
#  private_key: /your/private/key/path

# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
#   # set enabled to true means internal tls is enabled
#   enabled: true
#   # put your cert and key files on dir
#   dir: /etc/harbor/tls/internal

# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433

# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: Harbor@1234

# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: Harbor@1234
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 100
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 900
  # The maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's age.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_lifetime: 5m
  # The maximum amount of time a connection may be idle. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's idle time.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_idle_time: 0

# The default data volume
data_volume: /app/data

# Harbor Storage settings by default is using /data dir on local filesystem
# Uncomment storage_service setting If you want to using external storage
# storage_service:
#   # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
#   # of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.
#   ca_bundle:

#   # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
#   # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
#   filesystem:
#     maxthreads: 100
#   # set disable to true when you want to disable registry redirect
#   redirect:
#     disabled: false

# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # The offline_scan option prevents Trivy from sending API requests to identify dependencies.
  # Scanning JAR files and pom.xml may require Internet access for better detection, but this option tries to avoid it.
  # For example, the offline mode will not try to resolve transitive dependencies in pom.xml when the dependency doesn't
  # exist in the local repositories. It means a number of detected vulnerabilities might be fewer in offline mode.
  # It would work if all the dependencies are in local.
  # This option doesn't affect DB download. You need to specify "skip-update" as well as "offline-scan" in an air-gapped environment.
  offline_scan: false
  #
  # Comma-separated list of what security issues to detect. Possible values are `vuln`, `config` and `secret`. Defaults to `vuln`.
  security_check: vuln
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
  # github_token The GitHub access token to download Trivy DB
  #
  # Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
  # for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
  # requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
  # https://developer.github.com/v3/#rate-limiting
  #
  # You can create a GitHub token by following the instructions in
  # https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
  #
  # github_token: xxx

jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10

notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 10

chart:
  # Change the value of absolute_url to enabled can enable absolute url in chart
  absolute_url: disabled

# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /app/harbor/log

  # Uncomment following lines to enable external syslog endpoint.
  # external_endpoint:
  #   # protocol used to transmit log to external endpoint, options is tcp or udp
  #   protocol: tcp
  #   # The host of external endpoint
  #   host: localhost
  #   # Port of external endpoint
  #   port: 5140

#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.7.0

# Uncomment external_database if using external database.
# external_database:
#   harbor:
#     host: harbor_db_host
#     port: harbor_db_port
#     db_name: harbor_db_name
#     username: harbor_db_username
#     password: harbor_db_password
#     ssl_mode: disable
#     max_idle_conns: 2
#     max_open_conns: 0
#   notary_signer:
#     host: notary_signer_db_host
#     port: notary_signer_db_port
#     db_name: notary_signer_db_name
#     username: notary_signer_db_username
#     password: notary_signer_db_password
#     ssl_mode: disable
#   notary_server:
#     host: notary_server_db_host
#     port: notary_server_db_port
#     db_name: notary_server_db_name
#     username: notary_server_db_username
#     password: notary_server_db_password
#     ssl_mode: disable

# Uncomment external_redis if using external Redis server
# external_redis:
#   # support redis, redis+sentinel
#   # host for redis: <host_redis>:<port_redis>
#   # host for redis+sentinel:
#   #  <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3>
#   host: redis:6379
#   password:
#   # sentinel_master_set must be set to support redis+sentinel
#   #sentinel_master_set:
#   # db_index 0 is for core, it's unchangeable
#   registry_db_index: 1
#   jobservice_db_index: 2
#   chartmuseum_db_index: 3
#   trivy_db_index: 5
#   idle_timeout_seconds: 30

# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
#   ca_file: /path/to/ca

# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components doesn't need to connect to each others via http proxy.
# Remove component from `components` array if want disable proxy
# for it. If you want use proxy for replication, MUST enable proxy
# for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add domain to the `no_proxy` field, when you want disable proxy
# for some special registry.
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy

# metric:
#   enabled: false
#   port: 9090
#   path: /metrics

# Trace related config
# only can enable one trace provider(jaeger or otel) at the same time,
# and when using jaeger as provider, can only enable it with agent mode or collector mode.
# if using jaeger collector mode, uncomment endpoint and uncomment username, password if needed
# if using jaeger agetn mode uncomment agent_host and agent_port
# trace:
#   enabled: true
#   # set sample_rate to 1 if you wanna sampling 100% of trace data; set 0.5 if you wanna sampling 50% of trace data, and so forth
#   sample_rate: 1
#   # # namespace used to differenciate different harbor services
#   # namespace:
#   # # attributes is a key value dict contains user defined attributes used to initialize trace provider
#   # attributes:
#   #   application: harbor
#   # # jaeger should be 1.26 or newer.
#   # jaeger:
#   #   endpoint: http://hostname:14268/api/traces
#   #   username:
#   #   password:
#   #   agent_host: hostname
#   #   # export trace data by jaeger.thrift in compact mode
#   #   agent_port: 6831
#   # otel:
#   #   endpoint: hostname:4318
#   #   url_path: /v1/traces
#   #   compression: false
#   #   insecure: true
#   #   timeout: 10s

# enable purge _upload directories
upload_purging:
  enabled: true
  # remove files in _upload directories which exist for a period of time, default is one week.
  age: 168h
  # the interval of the purge operations
  interval: 24h
  dryrun: false

# cache layer configurations
# If this feature enabled, harbor will cache the resource
# `project/project_metadata/repository/artifact/manifest` in the redis
# which can especially help to improve the performance of high concurrent
# manifest pulling.
# NOTICE
# If you are deploying Harbor in HA mode, make sure that all the harbor
# instances have the same behaviour, all with caching enabled or disabled,
# otherwise it can lead to potential data inconsistency.
cache:
  # not enabled by default
  enabled: false
  # keep cache for one day by default
  expire_hours: 24
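The hostname and HTTP port set here form the registry address that Docker, cri-dockerd and kubeadm must all use in the later steps. As a sketch, that address can be derived from harbor.yml instead of being retyped. The helper name `harbor_endpoint` is an illustrative assumption, and the parsing is deliberately simple: it only looks at the top-level `hostname:` key and the `port:` key inside the uncommented `http:` block.

```shell
# Print "<hostname>:<http port>" as configured in a harbor.yml file.
harbor_endpoint() (
    yml=${1:-harbor.yml}
    awk '
        /^hostname:/             { host = $2 }
        /^http:/                 { in_http = 1; next }
        /^[^[:space:]#]/         { in_http = 0 }   # any other top-level key ends the http block
        in_http && $1 == "port:" { port = $2 }
        END { if (host != "" && port != "") print host ":" port }' "$yml"
)
```

The printed value should match the `insecure-registries` entry and the image prefixes used in the following sections.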
10) Install Harbor:
cd /app/harbor
./install.sh
11) On every server, point the container registry at the internal Harbor and switch Docker's cgroup driver to systemd:
cat > /etc/docker/daemon.json<<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"insecure-registries": ["172.16.131.87:1088"]
}
EOF
systemctl restart docker
docker info | grep Cgroup
reboot
1) Configure the Kubernetes yum repo:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2) Reload the yum repos:
yum clean all && yum makecache
3) Check the available versions of kubelet, kubeadm and kubectl:
yum list kubelet --showduplicates | sort -r
yum list kubeadm --showduplicates | sort -r
yum list kubectl --showduplicates | sort -r
4) Download kubeadm and the related packages:
yumdownloader kubelet-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubelet
yumdownloader kubeadm-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubeadm
yumdownloader kubectl-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubectl
5) After the download, move the kubectl-1.26.3 and kubelet-1.26.3 packages that were pulled into the kubeadm folder out of the way, then package the remaining RPMs:
cd /app/soft/kubernetes/kubeadm/
mv *kubectl-1.26.3*.rpm *kubelet-1.26*.rpm ../../
cd /app/soft
tar -cvzf kubeadm_1.25.6_offline_install_pkg.tar.gz kubernetes
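Before packaging, it may be worth checking that no kubelet, kubeadm or kubectl RPM of another version is left in the tree. The sketch below assumes the downloaded file names contain `<package>-<version>-` (for example `...-kubelet-1.25.6-0.x86_64.rpm`); the function name is illustrative.

```shell
# List kubelet/kubeadm/kubectl RPMs under a directory whose version
# differs from the wanted one. Prints nothing when the tree is clean.
stray_k8s_rpms() (
    dir=$1; want=$2
    find "$dir" -name '*.rpm' \
        | grep -E '(kubelet|kubeadm|kubectl)-[0-9]' \
        | grep -v -E "(kubelet|kubeadm|kubectl)-$want-" \
        | sort
)
```

For example, `stray_k8s_rpms /app/soft/kubernetes 1.25.6` should print nothing once the 1.26.x packages have been moved away.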
6) Transfer to all offline nodes:
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.86:/app/soft/
1) On all machines, extract and install kubelet, kubectl and kubeadm:
tar -xvzf /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz -C /app/
cd /app/kubernetes/kubelet/
yum install -y *.rpm
cd /app/kubernetes/kubectl/
yum install -y *.rpm
cd /app/kubernetes/kubeadm/
yum install -y *.rpm
2) Start the kubelet service:
systemctl start kubelet && systemctl enable kubelet && systemctl status kubelet
3) Check the image versions required to deploy Kubernetes:
kubeadm config images list --kubernetes-version=v1.25.6
registry.k8s.io/kube-apiserver:v1.25.6
registry.k8s.io/kube-controller-manager:v1.25.6
registry.k8s.io/kube-scheduler:v1.25.6
registry.k8s.io/kube-proxy:v1.25.6
registry.k8s.io/pause:3.7
registry.k8s.io/etcd:3.5.6-0
registry.k8s.io/coredns/coredns:v1.8.6
1) Download the k8s images (on the internet-connected relay host):
mkdir -p /app/soft/k8s_images
cd /app/soft/k8s_images
docker pull dyrnq/kube-apiserver:v1.25.6
docker pull dyrnq/kube-controller-manager:v1.25.6
docker pull dyrnq/kube-scheduler:v1.25.6
docker pull dyrnq/kube-proxy:v1.25.6
docker pull dyrnq/pause:3.7
docker pull dyrnq/etcd:3.5.6-0
docker pull dyrnq/coredns:v1.8.6
docker pull registry:latest
docker pull quay.io/coreos/flannel:v0.15.1
docker pull flannel/flannel-cni-plugin:v1.1.2
docker pull nginx:latest
2) Save the images to tar files:
docker save dyrnq/kube-apiserver:v1.25.6 -o kube-apiserver_v1.25.6.tar
docker save dyrnq/kube-controller-manager:v1.25.6 -o kube-controller-manager_v1.25.6.tar
docker save dyrnq/kube-scheduler:v1.25.6 -o kube-scheduler_v1.25.6.tar
docker save dyrnq/kube-proxy:v1.25.6 -o kube-proxy_v1.25.6.tar
docker save dyrnq/pause:3.7 -o pause_v1.25.6.tar
docker save dyrnq/etcd:3.5.6-0 -o etcd_v1.25.6.tar
docker save dyrnq/coredns:v1.8.6 -o coredns_v1.25.6.tar
docker save registry:latest -o registry_latest.tar
docker save quay.io/coreos/flannel:v0.15.1 -o flannel_v0.15.1.tar
docker save flannel/flannel-cni-plugin:v1.1.2 -o flannel-cni-plugin_v1.1.2.tar
docker save nginx:latest -o nginx_latest.tar
3) Compress the saved images and transfer them to the k8s master node:
tar -cvzf /app/soft/k8s_images.tar.gz -C /app/soft k8s_images
scp -rp /app/soft/k8s_images.tar.gz 172.16.131.83:/app/soft
1) Extract the images (on the master node):
tar -xvzf /app/soft/k8s_images.tar.gz -C /app/soft
2) Load the images:
cd /app/soft/k8s_images
for i in *
do
docker load -i "$i"
done
3) Retag the images:
docker images|awk '{print "docker tag " $1 ":" $2 " 172.16.131.87:1088/kubernetes-deploy/" $1 ":" $2}'|sed 1d
docker tag dyrnq/kube-apiserver:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-apiserver:v1.25.6
docker tag dyrnq/kube-controller-manager:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-controller-manager:v1.25.6
docker tag dyrnq/kube-scheduler:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-scheduler:v1.25.6
docker tag dyrnq/kube-proxy:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-proxy:v1.25.6
docker tag dyrnq/pause:3.7 172.16.131.87:1088/kubernetes-deploy/pause:3.7
docker tag dyrnq/etcd:3.5.6-0 172.16.131.87:1088/kubernetes-deploy/etcd:3.5.6-0
docker tag dyrnq/coredns:v1.8.6 172.16.131.87:1088/kubernetes-deploy/coredns:v1.8.6
docker tag registry:latest 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker tag quay.io/coreos/flannel:v0.15.1 172.16.131.87:1088/kubernetes-deploy/flannel:v0.15.1
docker tag flannel/flannel-cni-plugin:v1.1.2 172.16.131.87:1088/kubernetes-deploy/flannel-cni-plugin:v1.1.2
docker tag nginx:latest 172.16.131.87:1088/kubernetes-deploy/nginx:latest
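Note that the awk one-liner above keeps the original namespace (for example `dyrnq/`) in the target name, whereas the tag commands listed after it drop it. A sketch that produces the listed form directly is shown below; the function names and the REGISTRY and PROJECT variables are illustrative assumptions whose defaults mirror the values used in this guide.

```shell
# Map an image reference to its name in the private registry: keep only the
# last path component (name:tag) and prepend registry/project.
private_ref() {
    echo "${REGISTRY:-172.16.131.87:1088}/${PROJECT:-kubernetes-deploy}/${1##*/}"
}

# Read image references on stdin and print one "docker tag" command per line.
print_tag_commands() {
    while IFS= read -r img; do
        if [ -n "$img" ]; then
            echo "docker tag $img $(private_ref "$img")"
        fi
    done
}
```

One way to feed it is `docker images --format '{{.Repository}}:{{.Tag}}' | print_tag_commands`; the printed commands can be reviewed and then piped to `sh`.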
4) Log in to the private registry on every node, then push the retagged images to the registry from the master node:
docker login 172.16.131.87:1088
Username: admin
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
docker images|grep "172.16.131.87"|awk '{print "docker push " $1 ":" $2}'
docker push 172.16.131.87:1088/kubernetes-deploy/kube-apiserver:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-controller-manager:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-scheduler:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-proxy:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/pause:3.7
docker push 172.16.131.87:1088/kubernetes-deploy/etcd:3.5.6-0
docker push 172.16.131.87:1088/kubernetes-deploy/coredns:v1.8.6
docker push 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker push 172.16.131.87:1088/kubernetes-deploy/flannel:v0.15.1
docker push 172.16.131.87:1088/kubernetes-deploy/flannel-cni-plugin:v1.1.2
docker push 172.16.131.87:1088/kubernetes-deploy/nginx:latest
5) Initialize the Kubernetes cluster on the master node
On the master node, generate the cluster initialization configuration file:
kubeadm config print init-defaults > kubeadm.yaml
Edit the configuration file:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.131.83
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: k8s-master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: 172.16.131.87:1088/kubernetes-deploy
kind: ClusterConfiguration
kubernetesVersion: 1.25.6
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  # must match "Network" in the net-conf.json of kube-flannel.yml
  podSubnet: 10.244.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
6) Edit the containerd CRI configuration file:
vi /etc/containerd/config.toml
Comment out the line disabled_plugins = ["cri"]
7) Initialize Kubernetes:
kubeadm init --config=kubeadm.yaml
[init] Using Kubernetes version: v1.25.6
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.1.0.1 172.16.131.83]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.002750 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: f93xna.7kr79tn4z6fmzf23
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 \
    --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590
8) Set up access to the cluster as instructed in the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Note:
To re-initialize the cluster, run a reset first as shown below (--cri-socket unix:///var/run/cri-dockerd.sock is required, otherwise the command fails):
kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock
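9) Configure the flannel (or Calico) network, which carries container traffic between hosts:
On the internet-connected relay host, download the flannel YAML manifest:
wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml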
10)修改kube-flannel.yml文件
---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    k8s-app: flannel
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
    k8s-app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel-cni-plugin:v1.1.2
        #image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.2
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel:v0.15.1
        #image: docker.io/rancher/mirrored-flannelcni-flannel:v0.21.4
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel:v0.15.1
        #image: docker.io/rancher/mirrored-flannelcni-flannel:v0.21.4
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
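In an offline setup every image named in the manifest has to exist in the local registry before the manifest is applied. The sketch below lists the distinct, uncommented `image:` references in a manifest so they can be compared against what was pushed to Harbor. `list_images` is a hypothetical helper name, and the line-based matching is an assumption that holds for manifests laid out like the one above, not a real YAML parser.

```shell
# Sketch: print each distinct image reference from uncommented "image:" lines.
# list_images is a hypothetical helper; it matches text lines, not YAML nodes.
list_images() {
    grep -E '^[[:space:]]*(-[[:space:]]+)?image:' "$1" \
        | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//; s/[[:space:]]+#.*$//' \
        | sort -u
}
```

Usage: `list_images /apps/flannel/kube-flannel.yml`. Lines that begin with `#image:` are skipped, so only the references that will actually be pulled are printed.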
11) Deploy the flannel network from the master:
kubectl apply -f /apps/flannel/kube-flannel.yml
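12) Following the prompt shown above, run the join command on each of the other nodes to add them to the Kubernetes cluster (append --cri-socket unix:///var/run/cri-dockerd.sock to the command, otherwise the join fails):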
kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590 --cri-socket unix:///var/run/cri-dockerd.sock
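Alternatively, generate a fresh join command on the master as shown below, then copy it to the other nodes and run it there, again with the --cri-socket option appended: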
kubeadm token create --print-join-command
kubeadm join 172.16.131.83:6443 --token r7oaex.qgqvdqvlyuubt5aw --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590 --cri-socket unix:///var/run/cri-dockerd.sock
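Since the `--cri-socket` option has to be added by hand each time, a small wrapper can append it only when it is missing. This is a minimal sketch; `with_cri_socket` is a hypothetical helper name, and the socket path is the one used in this article.

```shell
# Sketch: append the cri-dockerd socket option to a join command if absent.
# with_cri_socket is a hypothetical helper name.
with_cri_socket() {
    cmd="$1"
    socket="unix:///var/run/cri-dockerd.sock"
    case "$cmd" in
        # Already present: print the command unchanged.
        *--cri-socket*) printf '%s\n' "$cmd" ;;
        # Missing: append the option.
        *) printf '%s --cri-socket %s\n' "$cmd" "$socket" ;;
    esac
}
```

On the master it could be used as `with_cri_socket "$(kubeadm token create --print-join-command)"`, and the printed line then copied to each node.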
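13) After the command runs on a node, output like the following means the node has joined the cluster. New nodes are added to the Kubernetes cluster in the same way later on: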
e922a2410d4e0ebac590
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
14) Check the cluster status:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 1h v1.25.6
k8s-node1 Ready <none> 2h v1.25.6
k8s-node2 Ready <none> 1h v1.25.6
k8s-node3 Ready <none> 1h v1.25.6
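The readiness check above can also be scripted, which is handy when waiting for several nodes. The sketch below reads `kubectl get nodes` output on standard input and prints the names of nodes whose STATUS column is not exactly `Ready`. `not_ready_nodes` is a hypothetical helper name; note that, as written, it would also report a cordoned node (`Ready,SchedulingDisabled`).

```shell
# Sketch: list nodes whose STATUS is not exactly "Ready".
# Input: `kubectl get nodes` output on stdin (with or without header).
# not_ready_nodes is a hypothetical helper name.
not_ready_nodes() {
    awk '
        $1 == "NAME" && $2 == "STATUS" { next }   # skip the header line
        $2 != "Ready" { print $1 }                # print the node name
    '
}
```

Usage: `kubectl get nodes | not_ready_nodes`. Empty output means every node reports Ready.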
Note:
After finishing the deployment, I found that the worker nodes stayed in the NotReady state for a long time:
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 1h v1.25.6
k8s-node1 NotReady <none> 2h v1.25.6
k8s-node2 NotReady <none> 1h v1.25.6
k8s-node3 NotReady <none> 1h v1.25.6
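The Kubernetes cluster is not healthy in this state, so it needs troubleshooting. Running the following command on a k8s node shows the kubelet log, which makes the problem easier to track down: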
journalctl -u kubelet -f
The log showed two errors:
k8s-node1 kubelet[27242]: I1014 11:17:29.409068 27242 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Oct 14 11:17:29 k8s-node1 kubelet[27242]: E1014 11:17:29.996079 27242 kubelet.go:2332] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
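The kubelet log is noisy, so it can help to narrow it down to the network-plugin messages shown above. The following is a minimal sketch; `cni_errors` is a hypothetical helper name, and the patterns are taken only from the two messages quoted in this article, so other CNI failures may need additional patterns.

```shell
# Sketch: keep only kubelet log lines that mention the CNI problems seen here.
# cni_errors is a hypothetical helper; it reads log lines on stdin.
cni_errors() {
    grep -E 'cni config|NetworkPluginNotReady|network plugin is not ready'
}
```

Usage on a node: `journalctl -u kubelet --no-pager -n 200 | cni_errors`.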
Fix for issue 1: the other nodes are missing the CNI configuration file, so copy the network configuration from the master to each of them:
scp -rp /etc/cni k8s-node1:/etc/
scp -rp /etc/cni k8s-node2:/etc/
scp -rp /etc/cni k8s-node3:/etc/
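The three copies above can be written as one loop over the node names. This is a sketch under assumptions: `copy_cni_conf` is a hypothetical helper name, `DRY_RUN` is a hypothetical switch that prints the commands instead of running them, and passwordless SSH from the master to the nodes is assumed. Judging from the manifest, the `install-cni` init container writes /etc/cni/net.d/10-flannel.conflist on each node once the flannel pod can start there, so the manual copy is presumably only needed while those pods cannot run.

```shell
# Sketch: copy /etc/cni from the master to each node given as an argument.
# copy_cni_conf and DRY_RUN are hypothetical names used for illustration.
copy_cni_conf() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: only show what would be executed.
            echo "scp -rp /etc/cni ${node}:/etc/"
        else
            scp -rp /etc/cni "${node}:/etc/"
        fi
    done
}
```

Usage: `copy_cni_conf k8s-node1 k8s-node2 k8s-node3`, or set `DRY_RUN=1` first to preview the commands.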
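All nodes now report Ready, so the Kubernetes cluster is in the correct state.
Issue 2 arose because, when the registry was set up and the k8s images were uploaded, the kubernetes-deploy project was set to private, so the images could not be pulled. The simplest fix is to set the project to public in Harbor (how to pull images from a private project will be discussed later).
That completes the offline kubeadm deployment of k8s on RHEL 7. The next step is to deploy an nginx to verify that the whole cluster works.
Appendix: sshUserSetup.sh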
#!/bin/sh # Nitin Jerath - Aug 2005 #Usage sshUserSetup.sh -user <user name> [ -hosts \"<space separated hostlist>\" | -hostfile <absolute path of cluster configuration file> ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile <desired absolute path of logfile> ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase] #eg. sshUserSetup.sh -hosts "host1 host2" -user njerath -advanced #This script is used to setup SSH connectivity from the host on which it is # run to the specified remote hosts. After this script is run, the user can use # SSH to run commands on the remote hosts or copy files between the local host # and the remote hosts without being prompted for passwords or confirmations. # The list of remote hosts and the user name on the remote host is specified as # a command line parameter to the script. Note that in case the user on the # remote host has its home directory NFS mounted or shared across the remote # hosts, this script should be used with -shared option. #Specifying the -advanced option on the command line would result in SSH # connectivity being setup among the remote hosts which means that SSH can be # used to run commands on one remote host from the other remote host or copy # files between the remote hosts without being prompted for passwords or # confirmations. #Please note that the script would remove write permissions on the remote hosts #for the user home directory and ~/.ssh directory for "group" and "others". This # is an SSH requirement. The user would be explicitly informed about this by teh script and prompted to continue. In case the user presses no, the script would exit. In case the user does not want to be prompted, he can use -confirm option. # As a part of the setup, the script would use SSH to create files within ~/.ssh # directory of the remote node and to setup the requisite permissions. 
The #script also uses SCP to copy the local host public key to the remote hosts so # that the remote hosts trust the local host for SSH. At the time, the script #performs these steps, SSH connectivity has not been completely setup hence # the script would prompt the user for the remote host password. #For each remote host, for remote users with non-shared homes this would be # done once for SSH and once for SCP. If the number of remote hosts are x, the # user would be prompted 2x times for passwords. For remote users with shared # homes, the user would be prompted only twice, once each for SCP and SSH. #For security reasons, the script does not save passwords and reuse it. Also, # for security reasons, the script does not accept passwords redirected from a #file. The user has to key in the confirmations and passwords at the prompts. #The -verify option means that the user just wants to verify whether SSH has #been set up. In this case, the script would not setup SSH but would only check # whether SSH connectivity has been setup from the local host to the remote # hosts. The script would run the date command on each remote host using SSH. In # case the user is prompted for a password or sees a warning message for a #particular host, it means SSH connectivity has not been setup correctly for # that host. #In case the -verify option is not specified, the script would setup SSH and #then do the verification as well. #In case the user speciies the -exverify option, an exhaustive verification would be done. In that case, the following would be checked: # 1. SSH connectivity from local host to all remote hosts. # 2. SSH connectivity from each remote host to itself and other remote hosts. 
#echo Parsing command line arguments numargs=$# ADVANCED=false HOSTNAME=`hostname` CONFIRM=no SHARED=false i=1 USR=$USER if test -z "$TEMP" then TEMP=/tmp fi IDENTITY=id_rsa LOGFILE=$TEMP/sshUserSetup_`date +%F-%H-%M-%S`.log VERIFY=false EXHAUSTIVE_VERIFY=false HELP=false PASSPHRASE=no RERUN_SSHKEYGEN=no NO_PROMPT_PASSPHRASE=no while [ $i -le $numargs ] do j=$1 if [ $j = "-hosts" ] then HOSTS=$2 shift 1 i=`expr $i + 1` fi if [ $j = "-user" ] then USR=$2 shift 1 i=`expr $i + 1` fi if [ $j = "-logfile" ] then LOGFILE=$2 shift 1 i=`expr $i + 1` fi if [ $j = "-confirm" ] then CONFIRM=yes fi if [ $j = "-hostfile" ] then CLUSTER_CONFIGURATION_FILE=$2 shift 1 i=`expr $i + 1` fi if [ $j = "-usePassphrase" ] then PASSPHRASE=yes fi if [ $j = "-noPromptPassphrase" ] then NO_PROMPT_PASSPHRASE=yes fi if [ $j = "-shared" ] then SHARED=true fi if [ $j = "-exverify" ] then EXHAUSTIVE_VERIFY=true fi if [ $j = "-verify" ] then VERIFY=true fi if [ $j = "-advanced" ] then ADVANCED=true fi if [ $j = "-help" ] then HELP=true fi i=`expr $i + 1` shift 1 done if [ $HELP = "true" ] then echo "Usage $0 -user <user name> [ -hosts \"<space separated hostlist>\" | -hostfile <absolute path of cluster configuration file> ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile <desired absolute path of logfile> ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]" echo "This script is used to setup SSH connectivity from the host on which it is run to the specified remote hosts. After this script is run, the user can use SSH to run commands on the remote hosts or copy files between the local host and the remote hosts without being prompted for passwords or confirmations. The list of remote hosts and the user name on the remote host is specified as a command line parameter to the script. " echo "-user : User on remote hosts. " echo "-hosts : Space separated remote hosts list. 
" echo "-hostfile : The user can specify the host names either through the -hosts option or by specifying the absolute path of a cluster configuration file. A sample host file contents are below: " echo echo " stacg30 stacg30int 10.1.0.0 stacg30v -" echo " stacg34 stacg34int 10.1.0.1 stacg34v -" echo echo " The first column in each row of the host file will be used as the host name." echo echo "-usePassphrase : The user wants to set up passphrase to encrypt the private key on the local host. " echo "-noPromptPassphrase : The user does not want to be prompted for passphrase related questions. This is for users who want the default behavior to be followed." echo "-shared : In case the user on the remote host has its home directory NFS mounted or shared across the remote hosts, this script should be used with -shared option. " echo " It is possible for the user to determine whether a user's home directory is shared or non-shared. Let us say we want to determine that user user1's home directory is shared across hosts A, B and C." echo " Follow the following steps:" echo " 1. On host A, touch ~user1/checkSharedHome.tmp" echo " 2. On hosts B and C, ls -al ~user1/checkSharedHome.tmp" echo " 3. If the file is present on hosts B and C in ~user1 directory and" echo " is identical on all hosts A, B, C, it means that the user's home " echo " directory is shared." echo " 4. On host A, rm -f ~user1/checkSharedHome.tmp" echo " In case the user accidentally passes -shared option for non-shared homes or viceversa,SSH connectivity would only be set up for a subset of the hosts. The user would have to re-run the setyp script with the correct option to rectify this problem." 
echo "-advanced : Specifying the -advanced option on the command line would result in SSH connectivity being setup among the remote hosts which means that SSH can be used to run commands on one remote host from the other remote host or copy files between the remote hosts without being prompted for passwords or confirmations." echo "-confirm: The script would remove write permissions on the remote hosts for the user home directory and ~/.ssh directory for "group" and "others". This is an SSH requirement. The user would be explicitly informed about this by the script and prompted to continue. In case the user presses no, the script would exit. In case the user does not want to be prompted, he can use -confirm option." echo "As a part of the setup, the script would use SSH to create files within ~/.ssh directory of the remote node and to setup the requisite permissions. The script also uses SCP to copy the local host public key to the remote hosts so that the remote hosts trust the local host for SSH. At the time, the script performs these steps, SSH connectivity has not been completely setup hence the script would prompt the user for the remote host password. " echo "For each remote host, for remote users with non-shared homes this would be done once for SSH and once for SCP. If the number of remote hosts are x, the user would be prompted 2x times for passwords. For remote users with shared homes, the user would be prompted only twice, once each for SCP and SSH. For security reasons, the script does not save passwords and reuse it. Also, for security reasons, the script does not accept passwords redirected from a file. The user has to key in the confirmations and passwords at the prompts. " echo "-verify : -verify option means that the user just wants to verify whether SSH has been set up. In this case, the script would not setup SSH but would only check whether SSH connectivity has been setup from the local host to the remote hosts. 
The script would run the date command on each remote host using SSH. In case the user is prompted for a password or sees a warning message for a particular host, it means SSH connectivity has not been setup correctly for that host. In case the -verify option is not specified, the script would setup SSH and then do the verification as well. " echo "-exverify : In case the user speciies the -exverify option, an exhaustive verification for all hosts would be done. In that case, the following would be checked: " echo " 1. SSH connectivity from local host to all remote hosts. " echo " 2. SSH connectivity from each remote host to itself and other remote hosts. " echo The -exverify option can be used in conjunction with the -verify option as well to do an exhaustive verification once the setup has been done. echo "Taking some examples: Let us say local host is Z, remote hosts are A,B and C. Local user is njerath. Remote users are racqa(non-shared), aime(shared)." echo "$0 -user racqa -hosts "A B C" -advanced -exverify -confirm" echo "Script would set up connectivity from Z -> A, Z -> B, Z -> C, A -> A, A -> B, A -> C, B -> A, B -> B, B -> C, C -> A, C -> B, C -> C." echo "Since user has given -exverify option, all these scenario would be verified too." echo echo "Now the user runs : $0 -user racqa -hosts "A B C" -verify" echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. Also, since -exverify or -advanced options are not given, script would only verify connectivity from Z -> A, Z -> B, Z -> C" echo "Now the user runs : $0 -user racqa -hosts "A B C" -verify -advanced" echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. 
Also, since -advanced options is given, script would verify connectivity from Z -> A, Z -> B, Z -> C, A-> A, A->B, A->C, A->D" echo "Now the user runs:" echo "$0 -user aime -hosts "A B C" -confirm -shared" echo "Script would set up connectivity between Z->A, Z->B, Z->C only since advanced option is not given." echo "All these scenarios would be verified too." exit fi if test -z "$HOSTS" then if test -n "$CLUSTER_CONFIGURATION_FILE" && test -f "$CLUSTER_CONFIGURATION_FILE" then HOSTS=`awk '$1 !~ /^#/ { str = str " " $1 } END { print str }' $CLUSTER_CONFIGURATION_FILE` elif ! test -f "$CLUSTER_CONFIGURATION_FILE" then echo "Please specify a valid and existing cluster configuration file." fi fi if test -z "$HOSTS" || test -z $USR then echo "Either user name or host information is missing" echo "Usage $0 -user <user name> [ -hosts \"<space separated hostlist>\" | -hostfile <absolute path of cluster configuration file> ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile <desired absolute path of logfile> ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]" exit 1 fi if [ -d $LOGFILE ]; then echo $LOGFILE is a directory, setting logfile to $LOGFILE/ssh.log LOGFILE=$LOGFILE/ssh.log fi echo The output of this script is also logged into $LOGFILE | tee -a $LOGFILE if [ `echo $?` != 0 ]; then echo Error writing to the logfile $LOGFILE, Exiting exit 1 fi echo Hosts are $HOSTS | tee -a $LOGFILE echo user is $USR | tee -a $LOGFILE SSH="/usr/bin/ssh" SCP="/usr/bin/scp" SSH_KEYGEN="/usr/bin/ssh-keygen" calculateOS() { platform=`uname -s` case "$platform" in "SunOS") os=solaris;; "Linux") os=linux;; "HP-UX") os=hpunix;; "AIX") os=aix;; *) echo "Sorry, $platform is not currently supported." 
| tee -a $LOGFILE exit 1;; esac echo "Platform:- $platform " | tee -a $LOGFILE } calculateOS BITS=1024 ENCR="rsa" deadhosts="" alivehosts="" if [ $platform = "Linux" ] then PING="/bin/ping" else PING="/usr/sbin/ping" fi #bug 9044791 if [ -n "$SSH_PATH" ]; then SSH=$SSH_PATH fi if [ -n "$SCP_PATH" ]; then SCP=$SCP_PATH fi if [ -n "$SSH_KEYGEN_PATH" ]; then SSH_KEYGEN=$SSH_KEYGEN_PATH fi if [ -n "$PING_PATH" ]; then PING=$PING_PATH fi PATH_ERROR=0 if test ! -x $SSH ; then echo "ssh not found at $SSH. Please set the variable SSH_PATH to the correct location of ssh and retry." PATH_ERROR=1 fi if test ! -x $SCP ; then echo "scp not found at $SCP. Please set the variable SCP_PATH to the correct location of scp and retry." PATH_ERROR=1 fi if test ! -x $SSH_KEYGEN ; then echo "ssh-keygen not found at $SSH_KEYGEN. Please set the variable SSH_KEYGEN_PATH to the correct location of ssh-keygen and retry." PATH_ERROR=1 fi if test ! -x $PING ; then echo "ping not found at $PING. Please set the variable PING_PATH to the correct location of ping and retry." PATH_ERROR=1 fi if [ $PATH_ERROR = 1 ]; then echo "ERROR: one or more of the required binaries not found, exiting" exit 1 fi #9044791 end echo Checking if the remote hosts are reachable | tee -a $LOGFILE for host in $HOSTS do if [ $platform = "SunOS" ]; then $PING -s $host 5 5 elif [ $platform = "HP-UX" ]; then $PING $host -n 5 -m 5 else $PING -c 5 -w 5 $host fi exitcode=`echo $?` if [ $exitcode = 0 ] then alivehosts="$alivehosts $host" else deadhosts="$deadhosts $host" fi done if test -z "$deadhosts" then echo Remote host reachability check succeeded. | tee -a $LOGFILE echo The following hosts are reachable: $alivehosts. | tee -a $LOGFILE echo The following hosts are not reachable: $deadhosts. | tee -a $LOGFILE echo All hosts are reachable. Proceeding further... | tee -a $LOGFILE else echo Remote host reachability check failed. | tee -a $LOGFILE echo The following hosts are reachable: $alivehosts. 
| tee -a $LOGFILE echo The following hosts are not reachable: $deadhosts. | tee -a $LOGFILE echo Please ensure that all the hosts are up and re-run the script. | tee -a $LOGFILE echo Exiting now... | tee -a $LOGFILE exit 1 fi firsthost=`echo $HOSTS | awk '{print $1}; END { }'` echo firsthost $firsthost numhosts=`echo $HOSTS | awk '{ }; END {print NF}'` echo numhosts $numhosts if [ $VERIFY = "true" ] then echo Since user has specified -verify option, SSH setup would not be done. Only, existing SSH setup would be verified. | tee -a $LOGFILE continue else echo The script will setup SSH connectivity from the host ''`hostname`'' to all | tee -a $LOGFILE echo the remote hosts. After the script is executed, the user can use SSH to run | tee -a $LOGFILE echo commands on the remote hosts or copy files between this host ''`hostname`'' | tee -a $LOGFILE echo and the remote hosts without being prompted for passwords or confirmations. | tee -a $LOGFILE echo | tee -a $LOGFILE echo NOTE 1: | tee -a $LOGFILE echo As part of the setup procedure, this script will use 'ssh' and 'scp' to copy | tee -a $LOGFILE echo files between the local host and the remote hosts. Since the script does not | tee -a $LOGFILE echo store passwords, you may be prompted for the passwords during the execution of | tee -a $LOGFILE echo the script whenever 'ssh' or 'scp' is invoked. | tee -a $LOGFILE echo | tee -a $LOGFILE echo NOTE 2: | tee -a $LOGFILE echo "AS PER SSH REQUIREMENTS, THIS SCRIPT WILL SECURE THE USER HOME DIRECTORY" | tee -a $LOGFILE echo AND THE .ssh DIRECTORY BY REVOKING GROUP AND WORLD WRITE PRIVILEGES TO THESE | tee -a $LOGFILE echo "directories." | tee -a $LOGFILE echo | tee -a $LOGFILE echo "Do you want to continue and let the script make the above mentioned changes (yes/no)?" 
| tee -a $LOGFILE if [ "$CONFIRM" = "no" ] then read CONFIRM else echo "Confirmation provided on the command line" | tee -a $LOGFILE fi echo | tee -a $LOGFILE echo The user chose ''$CONFIRM'' | tee -a $LOGFILE if [ -z "$CONFIRM" -o "$CONFIRM" != "yes" -a "$CONFIRM" != "no" ] then echo "You haven't specified proper input. Please enter 'yes' or 'no'. Exiting...." exit 0 fi if [ "$CONFIRM" = "no" ] then echo "SSH setup is not done." | tee -a $LOGFILE exit 1 else if [ $NO_PROMPT_PASSPHRASE = "yes" ] then echo "User chose to skip passphrase related questions." | tee -a $LOGFILE else if [ $SHARED = "true" ] then hostcount=`expr ${numhosts} + 1` PASSPHRASE_PROMPT=`expr 2 \* $hostcount` else PASSPHRASE_PROMPT=`expr 2 \* ${numhosts}` fi echo "Please specify if you want to specify a passphrase for the private key this script will create for the local host. Passphrase is used to encrypt the private key and makes SSH much more secure. Type 'yes' or 'no' and then press enter. In case you press 'yes', you would need to enter the passphrase whenever the script executes ssh or scp. $PASSPHRASE " | tee -a $LOGFILE echo "The estimated number of times the user would be prompted for a passphrase is $PASSPHRASE_PROMPT. In addition, if the private-public files are also newly created, the user would have to specify the passphrase on one additional occasion. " | tee -a $LOGFILE echo "Enter 'yes' or 'no'." | tee -a $LOGFILE if [ "$PASSPHRASE" = "no" ] then read PASSPHRASE else echo "Confirmation provided on the command line" | tee -a $LOGFILE fi echo | tee -a $LOGFILE echo The user chose ''$PASSPHRASE'' | tee -a $LOGFILE if [ -z "$PASSPHRASE" -o "$PASSPHRASE" != "yes" -a "$PASSPHRASE" != "no" ] then echo "You haven't specified whether to use Passphrase or not. Please specify 'yes' or 'no'. Exiting..." 
exit 0 fi if [ "$PASSPHRASE" = "yes" ] then RERUN_SSHKEYGEN="yes" #Checking for existence of ${IDENTITY} file if test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY} then echo "The files containing the client public and private keys already exist on the local host. The current private key may or may not have a passphrase associated with it. In case you remember the passphrase and do not want to re-run ssh-keygen, press 'no' and enter. If you press 'no', the script will not attempt to create any new public/private key pairs. If you press 'yes', the script will remove the old private/public key files existing and create new ones prompting the user to enter the passphrase. If you enter 'yes', any previous SSH user setups would be reset. If you press 'change', the script will associate a new passphrase with the old keys." | tee -a $LOGFILE echo "Press 'yes', 'no' or 'change'" | tee -a $LOGFILE read RERUN_SSHKEYGEN echo The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ] then echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..." exit 0; fi fi else if test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY} then echo "The files containing the client public and private keys already exist on the local host. The current private key may have a passphrase associated with it. In case you find using passphrase inconvenient(although it is more secure), you can change to it empty through this script. Press 'change' if you want the script to change the passphrase for you. Press 'no' if you want to use your old passphrase, if you had one." 
read RERUN_SSHKEYGEN echo The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ] then echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..." exit 0 fi fi fi fi echo Creating .ssh directory on local host, if not present already | tee -a $LOGFILE mkdir -p $HOME/.ssh | tee -a $LOGFILE echo Creating authorized_keys file on local host | tee -a $LOGFILE touch $HOME/.ssh/authorized_keys | tee -a $LOGFILE echo Changing permissions on authorized_keys to 644 on local host | tee -a $LOGFILE chmod 644 $HOME/.ssh/authorized_keys | tee -a $LOGFILE mv -f $HOME/.ssh/authorized_keys $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILE echo Creating known_hosts file on local host | tee -a $LOGFILE touch $HOME/.ssh/known_hosts | tee -a $LOGFILE echo Changing permissions on known_hosts to 644 on local host | tee -a $LOGFILE chmod 644 $HOME/.ssh/known_hosts | tee -a $LOGFILE mv -f $HOME/.ssh/known_hosts $HOME/.ssh/known_hosts.tmp | tee -a $LOGFILE echo Creating config file on local host | tee -a $LOGFILE echo If a config file exists already at $HOME/.ssh/config, it would be backed up to $HOME/.ssh/config.backup. 
echo "Host *" > $HOME/.ssh/config.tmp | tee -a $LOGFILE echo "ForwardX11 no" >> $HOME/.ssh/config.tmp | tee -a $LOGFILE if test -f $HOME/.ssh/config then cp -f $HOME/.ssh/config $HOME/.ssh/config.backup fi mv -f $HOME/.ssh/config.tmp $HOME/.ssh/config | tee -a $LOGFILE chmod 644 $HOME/.ssh/config if [ "$RERUN_SSHKEYGEN" = "yes" ] then echo Removing old private/public keys on local host | tee -a $LOGFILE rm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE rm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILE echo Running SSH keygen on local host | tee -a $LOGFILE $SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE elif [ "$RERUN_SSHKEYGEN" = "change" ] then echo Running SSH Keygen on local host to change the passphrase associated with the existing private key | tee -a $LOGFILE $SSH_KEYGEN -p -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE elif test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY} then continue else echo Removing old private/public keys on local host | tee -a $LOGFILE rm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE rm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILE echo Running SSH keygen on local host with empty passphrase | tee -a $LOGFILE $SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} -N '' | tee -a $LOGFILE fi if [ $SHARED = "true" ] then if [ $USER = $USR ] then #No remote operations required echo Remote user is same as local user | tee -a $LOGFILE REMOTEHOSTS="" chmod og-w $HOME $HOME/.ssh | tee -a $LOGFILE else REMOTEHOSTS="${firsthost}" fi else REMOTEHOSTS="$HOSTS" fi for host in $REMOTEHOSTS do echo Creating .ssh directory and setting permissions on remote host $host | tee -a $LOGFILE echo "THE SCRIPT WOULD ALSO BE REVOKING WRITE PERMISSIONS FOR "group" AND "others" ON THE HOME DIRECTORY FOR $USR. THIS IS AN SSH REQUIREMENT." | tee -a $LOGFILE echo The script would create ~$USR/.ssh/config file on remote host $host. 
If a config file exists already at ~$USR/.ssh/config, it would be backed up to ~$USR/.ssh/config.backup. | tee -a $LOGFILE echo The user may be prompted for a password here since the script would be running SSH on host $host. | tee -a $LOGFILE $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \" mkdir -p .ssh ; chmod og-w . .ssh; touch .ssh/authorized_keys .ssh/known_hosts; chmod 644 .ssh/authorized_keys .ssh/known_hosts; cp .ssh/authorized_keys .ssh/authorized_keys.tmp ; cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\"" | tee -a $LOGFILE echo Done with creating .ssh directory and setting permissions on remote host $host. | tee -a $LOGFILE done for host in $REMOTEHOSTS do echo Copying local host public key to the remote host $host | tee -a $LOGFILE echo The user may be prompted for a password or passphrase here since the script would be using SCP for host $host. | tee -a $LOGFILE $SCP $HOME/.ssh/${IDENTITY}.pub $USR@$host:.ssh/authorized_keys | tee -a $LOGFILE echo Done copying local host public key to the remote host $host | tee -a $LOGFILE done cat $HOME/.ssh/${IDENTITY}.pub >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE for host in $HOSTS do if [ "$ADVANCED" = "true" ] then echo Creating keys on remote host $host if they do not exist already. This is required to setup SSH on host $host. 
| tee -a $LOGFILE
    if [ "$SHARED" = "true" ]
    then
      IDENTITY_FILE_NAME=${IDENTITY}_$host
      COALESCE_IDENTITY_FILES_COMMAND="cat .ssh/${IDENTITY_FILE_NAME}.pub >> .ssh/authorized_keys"
    else
      IDENTITY_FILE_NAME=${IDENTITY}
    fi
    $SSH -o StrictHostKeyChecking=no -x -l $USR $host " /bin/sh -c \"if test -f .ssh/${IDENTITY_FILE_NAME}.pub && test -f .ssh/${IDENTITY_FILE_NAME}; then echo; else rm -f .ssh/${IDENTITY_FILE_NAME} ; rm -f .ssh/${IDENTITY_FILE_NAME}.pub ; $SSH_KEYGEN -t $ENCR -b $BITS -f .ssh/${IDENTITY_FILE_NAME} -N '' ; fi; ${COALESCE_IDENTITY_FILES_COMMAND} \"" | tee -a $LOGFILE
  else
    #At least get the host keys from all hosts for shared case - advanced option not set
    if test $SHARED = "true" && test $ADVANCED = "false"
    then
      if [ "$PASSPHRASE" = "yes" ]
      then
        echo "The script will fetch the host keys from all hosts. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
      fi
      $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c true"
    fi
  fi
done

for host in $REMOTEHOSTS
do
  if test $ADVANCED = "true" && test $SHARED = "false"
  then
    $SCP $USR@$host:.ssh/${IDENTITY}.pub $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILE
    cat $HOME/.ssh/${IDENTITY}.pub.$host >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE
    rm -f $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILE
  fi
done

for host in $REMOTEHOSTS
do
  if [ "$ADVANCED" = "true" ]
  then
    if [ "$SHARED" != "true" ]
    then
      echo Updating authorized_keys file on remote host $host | tee -a $LOGFILE
      $SCP $HOME/.ssh/authorized_keys $USR@$host:.ssh/authorized_keys | tee -a $LOGFILE
    fi
    echo Updating known_hosts file on remote host $host | tee -a $LOGFILE
    $SCP $HOME/.ssh/known_hosts $USR@$host:.ssh/known_hosts | tee -a $LOGFILE
  fi
  if [ "$PASSPHRASE" = "yes" ]
  then
    echo "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
  fi
  $SSH -x -l $USR $host "/bin/sh -c \"cat .ssh/authorized_keys.tmp >> .ssh/authorized_keys; cat .ssh/known_hosts.tmp >> .ssh/known_hosts; rm -f .ssh/known_hosts.tmp .ssh/authorized_keys.tmp\"" | tee -a $LOGFILE
done
cat $HOME/.ssh/known_hosts.tmp >> $HOME/.ssh/known_hosts | tee -a $LOGFILE
cat $HOME/.ssh/authorized_keys.tmp >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE
#Added chmod to fix BUG NO 5238814
chmod 644 $HOME/.ssh/authorized_keys
#Fix for BUG NO 5157782
chmod 644 $HOME/.ssh/config
rm -f $HOME/.ssh/known_hosts.tmp $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILE
echo SSH setup is complete. | tee -a $LOGFILE
fi
fi

echo | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
echo Verifying SSH setup | tee -a $LOGFILE
echo =================== | tee -a $LOGFILE
echo The script will now run the 'date' command on the remote nodes using ssh | tee -a $LOGFILE
echo to verify if ssh is setup correctly. IF THE SETUP IS CORRECTLY SETUP, | tee -a $LOGFILE
echo THERE SHOULD BE NO OUTPUT OTHER THAN THE DATE AND SSH SHOULD NOT ASK FOR | tee -a $LOGFILE
echo PASSWORDS. If you see any output other than date or are prompted for the | tee -a $LOGFILE
echo password, ssh is not setup correctly and you will need to resolve the | tee -a $LOGFILE
echo issue and set up ssh again. | tee -a $LOGFILE
echo The possible causes for failure could be: | tee -a $LOGFILE
echo 1. The server settings in /etc/ssh/sshd_config file do not allow ssh | tee -a $LOGFILE
echo for user $USR. | tee -a $LOGFILE
echo 2. The server may have disabled public key based authentication.
echo 3. The client public key on the server may be outdated.
echo 4. ~$USR or ~$USR/.ssh on the remote host may not be owned by $USR. | tee -a $LOGFILE
echo 5. User may not have passed -shared option for shared remote users or | tee -a $LOGFILE
echo may be passing the -shared option for non-shared remote users. | tee -a $LOGFILE
echo 6. If there is output in addition to the date, but no password is asked, | tee -a $LOGFILE
echo it may be a security alert shown as part of company policy. Append the | tee -a $LOGFILE
echo "additional text to the <OMS HOME>/sysman/prov/resources/ignoreMessages.txt file." | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
#read -t 30 dummy
for host in $HOSTS
do
  echo --$host:-- | tee -a $LOGFILE
  echo Running $SSH -x -l $USR $host date to verify SSH connectivity has been setup from local host to $host. | tee -a $LOGFILE
  echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL. Please note that being prompted for a passphrase may be OK but being prompted for a password is ERROR." | tee -a $LOGFILE
  if [ "$PASSPHRASE" = "yes" ]
  then
    echo "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
  fi
  $SSH -l $USR $host "/bin/sh -c date" | tee -a $LOGFILE
  echo ------------------------------------------------------------------------ | tee -a $LOGFILE
done

if [ "$EXHAUSTIVE_VERIFY" = "true" ]
then
  for clienthost in $HOSTS
  do
    if [ "$SHARED" = "true" ]
    then
      REMOTESSH="$SSH -i .ssh/${IDENTITY}_${clienthost}"
    else
      REMOTESSH=$SSH
    fi
    for serverhost in $HOSTS
    do
      echo ------------------------------------------------------------------------ | tee -a $LOGFILE
      echo Verifying SSH connectivity has been setup from $clienthost to $serverhost | tee -a $LOGFILE
      echo ------------------------------------------------------------------------ | tee -a $LOGFILE
      echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." | tee -a $LOGFILE
      $SSH -l $USR $clienthost "$REMOTESSH $serverhost \"/bin/sh -c date\"" | tee -a $LOGFILE
      echo ------------------------------------------------------------------------ | tee -a $LOGFILE
    done
    echo -Verification from $clienthost complete- | tee -a $LOGFILE
  done
else
  if [ "$ADVANCED" = "true" ]
  then
    if [ "$SHARED" = "true" ]
    then
      REMOTESSH="$SSH -i .ssh/${IDENTITY}_${firsthost}"
    else
      REMOTESSH=$SSH
    fi
    for host in $HOSTS
    do
      echo ------------------------------------------------------------------------ | tee -a $LOGFILE
      echo Verifying SSH connectivity has been setup from $firsthost to $host | tee -a $LOGFILE
      echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." | tee -a $LOGFILE
      $SSH -l $USR $firsthost "$REMOTESSH $host \"/bin/sh -c date\"" | tee -a $LOGFILE
      echo ------------------------------------------------------------------------ | tee -a $LOGFILE
    done
    echo -Verification from $firsthost complete- | tee -a $LOGFILE
  fi
fi
echo "SSH verification complete." | tee -a $LOGFILE
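The verification step in the script depends on someone watching the output for password prompts. For a repeatable check after the script has run, a non-interactive test may be more convenient. The following is only a minimal sketch, not part of the original script: the function name `check_ssh_hosts` and the `SSH_BIN` variable are hypothetical, and it assumes the standard OpenSSH client, whose `BatchMode=yes` option makes `ssh` fail rather than ask for a password.

```shell
#!/bin/sh
# Hypothetical helper (not part of the setup script): report, per host,
# whether key-based SSH login works without any interactive prompt.
# SSH_BIN is an assumed override hook; it defaults to the system ssh client.
SSH_BIN=${SSH_BIN:-ssh}

check_ssh_hosts() {
    user=$1
    shift
    failed=0
    for host in "$@"
    do
        # BatchMode=yes: fail immediately instead of prompting for a password.
        # ConnectTimeout=5: do not hang on an unreachable host.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 -l "$user" "$host" date >/dev/null 2>&1
        then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Return non-zero if any host could not be reached without a prompt.
    return $failed
}
```

With the host names from the table at the top of this article, and assuming the `root` user, it could be called as `check_ssh_hosts root k8s-master k8s-node1 k8s-node2 k8s-node3`. A non-zero return code means at least one host still requires a password or is unreachable.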