Ceph is a highly scalable distributed storage solution that provides object, file, and block storage. On each storage node you will find a filesystem holding Ceph's storage objects and a Ceph OSD (Object Storage Daemon) process. A Ceph cluster also runs Ceph MON (monitor) daemons, which keep the cluster highly available.
For more on Ceph, see: https://www.cnblogs.com/itzgr/category/1382602.html
Rook is an open-source cloud-native storage orchestrator: it provides the platform, framework, and support needed to integrate various storage solutions natively into cloud-native environments. It currently focuses on file, block, and object storage services for cloud-native environments, and implements a self-managing, self-scaling, self-healing distributed storage service.
Rook automates deployment, bootstrapping, configuration, provisioning, scaling up and down, upgrades, migration, disaster recovery, monitoring, and resource management. To achieve all of this, Rook relies on the underlying container orchestration platform, such as Kubernetes.
Rook currently supports building storage with Ceph, NFS, Minio Object Store, EdgeFS, Cassandra, and CockroachDB.
How Rook works:
Rook provides volume plugins that extend the Kubernetes storage system, so Pods can mount block devices and filesystems managed by Rook through the kubelet agent.
The Rook Operator starts and monitors the entire underlying storage system, such as the Ceph Pods and Ceph OSDs, and also manages the CRDs, object stores, and filesystems.
The Rook Agent runs as a Pod on every Kubernetes node. Each agent Pod is configured with a Flexvolume driver that integrates with the Kubernetes volume control framework; node-local operations such as attaching storage devices, mounting, formatting, and removing storage are performed by this agent.
For more details, see the official sites:
https://rook.io
https://ceph.com/
The Rook architecture is shown below:
The architecture of Rook integrated with Kubernetes is shown below:
Host                Disk   IP
centos8-master01    sdb    192.168.10.131
centos8-master02    sdb    192.168.10.132
centos8-master03    sdb    192.168.10.133
centos8-node01      sdb    192.168.10.181
centos8-node02      sdb    192.168.10.182
centos8-node03      sdb    192.168.10.183
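Before deploying, it is worth confirming on every node that the sdb disk listed above is an unused raw device with no partitions or filesystem signatures; a minimal check (device path /dev/sdb assumed from the table above):
lsblk -f /dev/sdb
wipefs /dev/sdb    # prints nothing if the disk is clean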
# Cloning from GitHub is slow here; use the gitee mirror instead
git clone --single-branch --branch release-1.5 https://gitee.com/estarhaohao/rook.git
If you are deploying onto the master nodes, adding the taints below is not required and that step can be skipped.
Enter the Ceph deployment directory:
[root@centos8-master01 ]# cd rook/cluster/examples/kubernetes/ceph
[root@centos8-master01 ceph]# kubectl taint node centos8-master01 node-role.kubernetes.io/master="":NoSchedule
[root@centos8-master01 ceph]# kubectl taint node centos8-master02 node-role.kubernetes.io/master="":NoSchedule
[root@centos8-master01 ceph]# kubectl taint node centos8-master03 node-role.kubernetes.io/master="":NoSchedule
# Label the master nodes so that the Ceph components are scheduled onto them
[root@k8smaster01 ceph]# kubectl label nodes {centos8-master01,centos8-master02,centos8-master03} ceph-osd=enabled
[root@k8smaster01 ceph]# kubectl label nodes {centos8-master01,centos8-master02,centos8-master03} ceph-mon=enabled
[root@k8smaster01 ceph]# kubectl label nodes centos8-master01 ceph-mgr=enabled
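A quick, optional way to confirm that the taints and labels were applied (node names are the ones used above):
kubectl describe node centos8-master01 | grep Taints
kubectl get nodes -l ceph-osd=enabled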
[root@k8smaster01 ceph]# kubectl create -f common.yaml #this file needs no changes
[root@k8smaster01 ceph]# kubectl create -f operator.yaml
# Change the image addresses as shown below; they all point to Aliyun mirrors in China and can be used as-is
# You can copy and paste this directly
################################################################################################################# # The deployment for the rook operator # Contains the common settings for most Kubernetes deployments. # For example, to create the rook-ceph cluster: # kubectl create -f crds.yaml -f common.yaml -f operator.yaml # kubectl create -f cluster.yaml # # Also see other operator sample files for variations of operator.yaml: # - operator-openshift.yaml: Common settings for running in OpenShift ############################################################################################################### # Rook Ceph Operator Config ConfigMap # Use this ConfigMap to override Rook-Ceph Operator configurations. # NOTE! Precedence will be given to this config if the same Env Var config also exists in the # Operator Deployment. # To move a configuration(s) from the Operator Deployment to this ConfigMap, add the config # here. It is recommended to then remove it from the Deployment to eliminate any future confusion. kind: ConfigMap apiVersion: v1 metadata: name: rook-ceph-operator-config # should be in the namespace of the operator namespace: rook-ceph # namespace:operator data: # Enable the CSI driver. # To run the non-default version of the CSI driver, see the override-able image properties in operator.yaml ROOK_CSI_ENABLE_CEPHFS: "true" # Enable the default version of the CSI RBD driver. To start another version of the CSI driver, see image properties below. ROOK_CSI_ENABLE_RBD: "true" ROOK_CSI_ENABLE_GRPC_METRICS: "false" # Set logging level for csi containers. # Supported values from 0 to 5. 0 for general useful logs, 5 for trace level verbosity. # CSI_LOG_LEVEL: "0" # OMAP generator will generate the omap mapping between the PV name and the RBD image. # CSI_ENABLE_OMAP_GENERATOR need to be enabled when we are using rbd mirroring feature. # By default OMAP generator sidecar is deployed with CSI provisioner pod, to disable # it set it to false. # CSI_ENABLE_OMAP_GENERATOR: "false" # set to false to disable deployment of snapshotter container in CephFS provisioner pod. CSI_ENABLE_CEPHFS_SNAPSHOTTER: "true" # set to false to disable deployment of snapshotter container in RBD provisioner pod. CSI_ENABLE_RBD_SNAPSHOTTER: "true" # Enable cephfs kernel driver instead of ceph-fuse. # If you disable the kernel client, your application may be disrupted during upgrade. # See the upgrade guide: https://rook.io/docs/rook/master/ceph-upgrade.html # NOTE! cephfs quota is not supported in kernel version < 4.17 CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true" # (Optional) policy for modifying a volume's ownership or permissions when the RBD PVC is being mounted. # supported values are documented at https://kubernetes-csi.github.io/docs/support-fsgroup.html CSI_RBD_FSGROUPPOLICY: "ReadWriteOnceWithFSType" # (Optional) policy for modifying a volume's ownership or permissions when the CephFS PVC is being mounted. # supported values are documented at https://kubernetes-csi.github.io/docs/support-fsgroup.html CSI_CEPHFS_FSGROUPPOLICY: "ReadWriteOnceWithFSType" # (Optional) Allow starting unsupported ceph-csi image ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false" # The default version of CSI supported by Rook will be started. To change the version # of the CSI driver to something other than what is officially supported, change # these images to the desired release of the CSI driver. 
#修改这里 ROOK_CSI_CEPH_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/cephcsi:v3.2.2" ROOK_CSI_REGISTRAR_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/csi-node-driver-registrar:v2.0.1" ROOK_CSI_RESIZER_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/csi-resizer:v1.0.1" ROOK_CSI_PROVISIONER_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/csi-provisioner:v2.0.4" ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/csi-snapshotter:v3.0.2" ROOK_CSI_ATTACHER_IMAGE: "registry.cn-hangzhou.aliyuncs.com/haoyustorage/csi-attacher:v3.0.2" # (Optional) set user created priorityclassName for csi plugin pods. # CSI_PLUGIN_PRIORITY_CLASSNAME: "system-node-critical" # (Optional) set user created priorityclassName for csi provisioner pods. # CSI_PROVISIONER_PRIORITY_CLASSNAME: "system-cluster-critical" # CSI CephFS plugin daemonset update strategy, supported values are OnDelete and RollingUpdate. # Default value is RollingUpdate. # CSI_CEPHFS_PLUGIN_UPDATE_STRATEGY: "OnDelete" # CSI RBD plugin daemonset update strategy, supported values are OnDelete and RollingUpdate. # Default value is RollingUpdate. # CSI_RBD_PLUGIN_UPDATE_STRATEGY: "OnDelete" # kubelet directory path, if kubelet configured to use other than /var/lib/kubelet path. ROOK_CSI_KUBELET_DIR_PATH: "/data/kubernetes/kubelet" # Labels to add to the CSI CephFS Deployments and DaemonSets Pods. # ROOK_CSI_CEPHFS_POD_LABELS: "key1=value1,key2=value2" # Labels to add to the CSI RBD Deployments and DaemonSets Pods. # ROOK_CSI_RBD_POD_LABELS: "key1=value1,key2=value2" # (Optional) Ceph Provisioner NodeAffinity. # CSI_PROVISIONER_NODE_AFFINITY: "role=storage-node; storage=rook, ceph" # (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format. # CSI provisioner would be best to start on the same nodes as other ceph daemons. # CSI_PROVISIONER_TOLERATIONS: | # - effect: NoSchedule # key: node-role.kubernetes.io/controlplane # operator: Exists # - effect: NoExecute # key: node-role.kubernetes.io/etcd # operator: Exists # (Optional) Ceph CSI plugin NodeAffinity. # CSI_PLUGIN_NODE_AFFINITY: "role=storage-node; storage=rook, ceph" # (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format. # CSI plugins need to be started on all the nodes where the clients need to mount the storage. 
# CSI_PLUGIN_TOLERATIONS: | # - effect: NoSchedule # key: node-role.kubernetes.io/controlplane # operator: Exists # - effect: NoExecute # key: node-role.kubernetes.io/etcd # operator: Exists # (Optional) CEPH CSI RBD provisioner resource requirement list, Put here list of resource # requests and limits you want to apply for provisioner pod # CSI_RBD_PROVISIONER_RESOURCE: | # - name : csi-provisioner # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-resizer # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-attacher # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-snapshotter # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-rbdplugin # resource: # requests: # memory: 512Mi # cpu: 250m # limits: # memory: 1Gi # cpu: 500m # - name : liveness-prometheus # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # (Optional) CEPH CSI RBD plugin resource requirement list, Put here list of resource # requests and limits you want to apply for plugin pod # CSI_RBD_PLUGIN_RESOURCE: | # - name : driver-registrar # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # - name : csi-rbdplugin # resource: # requests: # memory: 512Mi # cpu: 250m # limits: # memory: 1Gi # cpu: 500m # - name : liveness-prometheus # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # (Optional) CEPH CSI CephFS provisioner resource requirement list, Put here list of resource # requests and limits you want to apply for provisioner pod # CSI_CEPHFS_PROVISIONER_RESOURCE: | # - name : csi-provisioner # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-resizer # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-attacher # resource: # requests: # memory: 128Mi # cpu: 100m # limits: # memory: 256Mi # cpu: 200m # - name : csi-cephfsplugin # resource: # requests: # memory: 512Mi # cpu: 250m # limits: # memory: 1Gi # cpu: 500m # - name : liveness-prometheus # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # (Optional) CEPH CSI CephFS plugin resource requirement list, Put here list of resource # requests and limits you want to apply for plugin pod # CSI_CEPHFS_PLUGIN_RESOURCE: | # - name : driver-registrar # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # - name : csi-cephfsplugin # resource: # requests: # memory: 512Mi # cpu: 250m # limits: # memory: 1Gi # cpu: 500m # - name : liveness-prometheus # resource: # requests: # memory: 128Mi # cpu: 50m # limits: # memory: 256Mi # cpu: 100m # Configure CSI CSI Ceph FS grpc and liveness metrics port # CSI_CEPHFS_GRPC_METRICS_PORT: "9091" # CSI_CEPHFS_LIVENESS_METRICS_PORT: "9081" # Configure CSI RBD grpc and liveness metrics port # CSI_RBD_GRPC_METRICS_PORT: "9090" # CSI_RBD_LIVENESS_METRICS_PORT: "9080" # Whether the OBC provisioner should watch on the operator namespace or not, if not the namespace of the cluster will be used ROOK_OBC_WATCH_OPERATOR_NAMESPACE: "true" # (Optional) Admission controller NodeAffinity. # ADMISSION_CONTROLLER_NODE_AFFINITY: "role=storage-node; storage=rook, ceph" # (Optional) Admission controller tolerations list. 
Put here list of taints you want to tolerate in YAML format. # Admission controller would be best to start on the same nodes as other ceph daemons. # ADMISSION_CONTROLLER_TOLERATIONS: | # - effect: NoSchedule # key: node-role.kubernetes.io/controlplane # operator: Exists # - effect: NoExecute # key: node-role.kubernetes.io/etcd # operator: Exists --- # OLM: BEGIN OPERATOR DEPLOYMENT apiVersion: apps/v1 kind: Deployment metadata: name: rook-ceph-operator namespace: rook-ceph # namespace:operator labels: operator: rook storage-backend: ceph spec: selector: matchLabels: app: rook-ceph-operator replicas: 1 template: metadata: labels: app: rook-ceph-operator spec: serviceAccountName: rook-ceph-system containers: - name: rook-ceph-operator image: registry.cn-hangzhou.aliyuncs.com/haoyustorage/rookceph:v1.5.12 args: ["ceph", "operator"] volumeMounts: - mountPath: /var/lib/rook name: rook-config - mountPath: /etc/ceph name: default-config-dir env: # If the operator should only watch for cluster CRDs in the same namespace, set this to "true". # If this is not set to true, the operator will watch for cluster CRDs in all namespaces. - name: ROOK_CURRENT_NAMESPACE_ONLY value: "false" # To disable RBAC, uncomment the following: # - name: RBAC_ENABLED # value: "false" # Rook Agent toleration. Will tolerate all taints with all keys. # Choose between NoSchedule, PreferNoSchedule and NoExecute: # - name: AGENT_TOLERATION # value: "NoSchedule" # (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate # - name: AGENT_TOLERATION_KEY # value: "<KeyOfTheTaintToTolerate>" # (Optional) Rook Agent tolerations list. Put here list of taints you want to tolerate in YAML format. # - name: AGENT_TOLERATIONS # value: | # - effect: NoSchedule # key: node-role.kubernetes.io/controlplane # operator: Exists # - effect: NoExecute # key: node-role.kubernetes.io/etcd # operator: Exists # (Optional) Rook Agent priority class name to set on the pod(s) # - name: AGENT_PRIORITY_CLASS_NAME # value: "<PriorityClassName>" # (Optional) Rook Agent NodeAffinity. # - name: AGENT_NODE_AFFINITY # value: "role=storage-node; storage=rook,ceph" # (Optional) Rook Agent mount security mode. Can by `Any` or `Restricted`. # `Any` uses Ceph admin credentials by default/fallback. # For using `Restricted` you must have a Ceph secret in each namespace storage should be consumed from and # set `mountUser` to the Ceph user, `mountSecret` to the Kubernetes secret name. # to the namespace in which the `mountSecret` Kubernetes secret namespace. # - name: AGENT_MOUNT_SECURITY_MODE # value: "Any" # Set the path where the Rook agent can find the flex volumes # - name: FLEXVOLUME_DIR_PATH # value: "<PathToFlexVolumes>" # Set the path where kernel modules can be found # - name: LIB_MODULES_DIR_PATH # value: "<PathToLibModules>" # Mount any extra directories into the agent container # - name: AGENT_MOUNTS # value: "somemount=/host/path:/container/path,someothermount=/host/path2:/container/path2" # Rook Discover toleration. Will tolerate all taints with all keys. # Choose between NoSchedule, PreferNoSchedule and NoExecute: # - name: DISCOVER_TOLERATION # value: "NoSchedule" # (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate # - name: DISCOVER_TOLERATION_KEY # value: "<KeyOfTheTaintToTolerate>" # (Optional) Rook Discover tolerations list. Put here list of taints you want to tolerate in YAML format. 
# - name: DISCOVER_TOLERATIONS # value: | # - effect: NoSchedule # key: node-role.kubernetes.io/controlplane # operator: Exists # - effect: NoExecute # key: node-role.kubernetes.io/etcd # operator: Exists # (Optional) Rook Discover priority class name to set on the pod(s) # - name: DISCOVER_PRIORITY_CLASS_NAME # value: "<PriorityClassName>" # (Optional) Discover Agent NodeAffinity. # - name: DISCOVER_AGENT_NODE_AFFINITY # value: "role=storage-node; storage=rook, ceph" # (Optional) Discover Agent Pod Labels. # - name: DISCOVER_AGENT_POD_LABELS # value: "key1=value1,key2=value2" # Allow rook to create multiple file systems. Note: This is considered # an experimental feature in Ceph as described at # http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster # which might cause mons to crash as seen in https://github.com/rook/rook/issues/1027 - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS value: "false" # The logging level for the operator: INFO | DEBUG - name: ROOK_LOG_LEVEL value: "INFO" # The duration between discovering devices in the rook-discover daemonset. - name: ROOK_DISCOVER_DEVICES_INTERVAL value: "60m" # Whether to start pods as privileged that mount a host path, which includes the Ceph mon and osd pods. # Set this to true if SELinux is enabled (e.g. OpenShift) to workaround the anyuid issues. # For more details see https://github.com/rook/rook/issues/1314#issuecomment-355799641 - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED value: "false" # In some situations SELinux relabelling breaks (times out) on large filesystems, and doesn't work with cephfs ReadWriteMany volumes (last relabel wins). # Disable it here if you have similar issues. # For more details see https://github.com/rook/rook/issues/2417 - name: ROOK_ENABLE_SELINUX_RELABELING value: "true" # In large volumes it will take some time to chown all the files. Disable it here if you have performance issues. # For more details see https://github.com/rook/rook/issues/2254 - name: ROOK_ENABLE_FSGROUP value: "true" # Disable automatic orchestration when new devices are discovered - name: ROOK_DISABLE_DEVICE_HOTPLUG value: "false" # Provide customised regex as the values using comma. For eg. regex for rbd based volume, value will be like "(?i)rbd[0-9]+". # In case of more than one regex, use comma to separate between them. # Default regex will be "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+" # Add regex expression after putting a comma to blacklist a disk # If value is empty, the default regex will be used. - name: DISCOVER_DAEMON_UDEV_BLACKLIST value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+" # Whether to enable the flex driver. By default it is enabled and is fully supported, but will be deprecated in some future release # in favor of the CSI driver. - name: ROOK_ENABLE_FLEX_DRIVER value: "false" # Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster. # This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs. - name: ROOK_ENABLE_DISCOVERY_DAEMON value: "false" # Time to wait until the node controller will move Rook pods to other # nodes after detecting an unreachable node. 
# Pods affected by this setting are: # mgr, rbd, mds, rgw, nfs, PVC based mons and osds, and ceph toolbox # The value used in this variable replaces the default value of 300 secs # added automatically by k8s as Toleration for # <node.kubernetes.io/unreachable> # The total amount of time to reschedule Rook pods in healthy nodes # before detecting a <not ready node> condition will be the sum of: # --> node-monitor-grace-period: 40 seconds (k8s kube-controller-manager flag) # --> ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS: 5 seconds - name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS value: "5" # The name of the node to pass with the downward API - name: NODE_NAME valueFrom: fieldRef: fieldPath: spec.nodeName # The pod name to pass with the downward API - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name # The pod namespace to pass with the downward API - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace # Uncomment it to run lib bucket provisioner in multithreaded mode #- name: LIB_BUCKET_PROVISIONER_THREADS # value: "5" # Uncomment it to run rook operator on the host network #hostNetwork: true volumes: - name: rook-config emptyDir: {} - name: default-config-dir emptyDir: {} # OLM: END OPERATOR DEPLOYMENT
Explanation: the above creates the required base resources (such as the serviceaccounts and RBAC rules) and the rook-ceph-operator. Because this operator.yaml disables both the flex driver and the discovery daemon, the operator rolls out the Ceph CSI plugin daemonsets on each node rather than rook-ceph-agent and rook-discover pods.
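Before moving on, a simple check that the operator itself is up (the pod name suffix will differ in each cluster):
kubectl -n rook-ceph get deployment rook-ceph-operator
kubectl -n rook-ceph get pod -l app=rook-ceph-operator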
2.4 Configure the cluster
[root@k8smaster01 ceph]# vi cluster.yaml
################################################################################################################# # Define the settings for the rook-ceph cluster with common settings for a production cluster. # All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required # in this example. See the documentation for more details on storage settings available. # For example, to create the cluster: # kubectl create -f crds.yaml -f common.yaml -f operator.yaml # kubectl create -f cluster.yaml ################################################################################################################# apiVersion: ceph.rook.io/v1 kind: CephCluster metadata: name: rook-ceph namespace: rook-ceph # namespace:cluster spec: cephVersion: # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw). # v13 is mimic, v14 is nautilus, and v15 is octopus. # RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/. # If you want to be more precise, you can always use a timestamp tag such ceph/ceph:v15.2.11-20200419 # This tag might not contain a new Ceph version, just security fixes from the underlying operating system, which will reduce vulnerabilities image: registry.cn-hangzhou.aliyuncs.com/haoyustorage/ceph:v15.2.11 # Whether to allow unsupported versions of Ceph. Currently `nautilus` and `octopus` are supported. # Future versions such as `pacific` would require this to be set to `true`. # Do not set to true in production. allowUnsupported: false # The path on the host where configuration files will be persisted. Must be specified. # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster. # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment. dataDirHostPath: /var/lib/rook # Whether or not upgrade should continue even if a check fails # This means Ceph's status could be degraded and we don't recommend upgrading but you might decide otherwise # Use at your OWN risk # To understand Rook's upgrade process of Ceph, read https://rook.io/docs/rook/master/ceph-upgrade.html#ceph-version-upgrades skipUpgradeChecks: false # Whether or not continue if PGs are not clean during an upgrade continueUpgradeAfterChecksEvenIfNotHealthy: false # WaitTimeoutForHealthyOSDInMinutes defines the time (in minutes) the operator would wait before an OSD can be stopped for upgrade or restart. # If the timeout exceeds and OSD is not ok to stop, then the operator would skip upgrade for the current OSD and proceed with the next one # if `continueUpgradeAfterChecksEvenIfNotHealthy` is `false`. If `continueUpgradeAfterChecksEvenIfNotHealthy` is `true`, then opertor would # continue with the upgrade of an OSD even if its not ok to stop after the timeout. This timeout won't be applied if `skipUpgradeChecks` is `true`. # The default wait timeout is 10 minutes. 
waitTimeoutForHealthyOSDInMinutes: 10 mon: count: 3 allowMultiplePerNode: false mgr: modules: - name: pg_autoscaler enabled: true # enable the ceph dashboard for viewing cluster status dashboard: enabled: true # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy) # urlPrefix: /ceph-dashboard # serve the dashboard at the given port. # port: 8443 # serve the dashboard using SSL ssl: true # enable prometheus alerting for cluster monitoring: # requires Prometheus to be pre-installed enabled: false # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used. # Recommended: # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty. # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions. rulesNamespace: rook-ceph network: # enable host networking #provider: host # EXPERIMENTAL: enable the Multus network provider #provider: multus #selectors: # The selector keys are required to be `public` and `cluster`. # Based on the configuration, the operator will do the following: # 1. if only the `public` selector key is specified both public_network and cluster_network Ceph settings will listen on that interface # 2. if both `public` and `cluster` selector keys are specified the first one will point to 'public_network' flag and the second one to 'cluster_network' # # In order to work, each selector value must match a NetworkAttachmentDefinition object in Multus # #public: public-conf --> NetworkAttachmentDefinition object name in Multus #cluster: cluster-conf --> NetworkAttachmentDefinition object name in Multus # Provide internet protocol version. IPv6, IPv4 or empty string are valid options. Empty string would mean IPv4 #ipFamily: "IPv6" # enable the crash collector for ceph daemon crash collection crashCollector: disable: false # enable log collector, daemons will log on files and rotate # logCollector: # enabled: true # periodicity: 24h # SUFFIX may be 'h' for hours or 'd' for days. # automate [data cleanup process](https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md#delete-the-data-on-hosts) in cluster destruction. cleanupPolicy: # Since cluster cleanup is destructive to data, confirmation is required. # To destroy all Rook data on hosts during uninstall, confirmation must be set to "yes-really-destroy-data". # This value should only be set when the cluster is about to be deleted. After the confirmation is set, # Rook will immediately stop configuring the cluster and only wait for the delete command. # If the empty string is set, Rook will not destroy any data on hosts during uninstall. confirmation: "" # sanitizeDisks represents settings for sanitizing OSD disks on cluster deletion # To control where various services will be scheduled by kubernetes, use the placement configuration sections below. # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and # tolerate taints with a key of 'storage-node'. 
placement: #配置特定节点亲和力保证Node作为存储节点 # all: # nodeAffinity: # requiredDuringSchedulingIgnoredDuringExecution: # nodeSelectorTerms: # - matchExpressions: # - key: role # operator: In # values: # - storage-node # tolerations: # - key: storage-node # operator: Exists mon: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: ceph-mon operator: In values: - enabled tolerations: - key: ceph-mon operator: Exists ods: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: ceph-osd operator: In values: - enabled tolerations: - key: ceph-osd operator: Exists mgr: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: ceph-mgr operator: In values: - enabled tolerations: - key: ceph-mgr operator: Exists annotations: resources: # placement: # all: # nodeAffinity: # requiredDuringSchedulingIgnoredDuringExecution: # nodeSelectorTerms: # - matchExpressions: # - key: role # operator: In # values: # - storage-node # podAffinity: # podAntiAffinity: # topologySpreadConstraints: # tolerations: # - key: storage-node # operator: Exists # The above placement information can also be specified for mon, osd, and mgr components # mon: # Monitor deployments may contain an anti-affinity rule for avoiding monitor # collocation on the same node. This is a required rule when host network is used # or when AllowMultiplePerNode is false. Otherwise this anti-affinity rule is a # preferred rule with weight: 50. # osd: # mgr: # cleanup: # all: # mon: # osd: # cleanup: # prepareosd: # If no mgr annotations are set, prometheus scrape annotations will be set by default. # mon: # osd: # cleanup: # mgr: # prepareosd: # The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory # mgr: # limits: # cpu: "500m" # memory: "1024Mi" # requests: # cpu: "500m" # memory: "1024Mi" # The above example requests/limits can also be added to the mon and osd components # mon: # osd: # prepareosd: # crashcollector: # logcollector: # cleanup: # The option to automatically remove OSDs that are out and are safe to destroy. removeOSDsIfOutAndSafeToRemove: false # priorityClassNames: # all: rook-ceph-default-priority-class # mon: rook-ceph-mon-priority-class # osd: rook-ceph-osd-priority-class # mgr: rook-ceph-mgr-priority-class storage: useAllNodes: false #关闭使用所有Node useAllDevices: false #关闭使用所有设备 deviceFilter: sdb config: metadataDevice: databaseSizeMB: "1024" journalSizeMB: "1024" nodes: - name: "centos8-master01" #指定存储节点主机 config: storeType: bluestore #指定类型为裸磁盘 devices: - name: "sdb" #指定磁盘为sdb - name: "centos8-master02" config: storeType: bluestore devices: - name: "sdb" - name: "centos8-master03" config: storeType: bluestore devices: - name: "sdb" # crushRoot: "custom-root" # specify a non-default root label for the CRUSH map # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore. # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB # journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller # osdsPerDevice: "1" # this value can be overridden at the node or device level # encryptedDevice: "true" # the default value for this option is "false" # Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named # nodes below will be used as storage resources. 
Each node's 'name' field should match their 'kubernetes.io/hostname' label. # nodes: # - name: "172.17.4.201" # devices: # specific devices to use for storage can be specified for each node # - name: "sdb" # - name: "nvme01" # multiple osds can be created on high performance devices # config: # osdsPerDevice: "5" # - name: "/dev/disk/by-id/ata-ST4000DM004-XXXX" # devices can be specified using full udev paths # config: # configuration can be specified at the node level which overrides the cluster level config # storeType: filestore # - name: "172.17.4.301" # deviceFilter: "^sd." # The section for configuring management of daemon disruptions during upgrade or fencing. disruptionManagement: # If true, the operator will create and manage PodDisruptionBudgets for OSD, Mon, RGW, and MDS daemons. OSD PDBs are managed dynamically # via the strategy outlined in the [design](https://github.com/rook/rook/blob/master/design/ceph/ceph-managed-disruptionbudgets.md). The operator will # block eviction of OSDs by default and unblock them safely when drains are detected. managePodBudgets: false # A duration in minutes that determines how long an entire failureDomain like `region/zone/host` will be held in `noout` (in addition to the # default DOWN/OUT interval) when it is draining. This is only relevant when `managePodBudgets` is `true`. The default value is `30` minutes. osdMaintenanceTimeout: 30 # A duration in minutes that the operator will wait for the placement groups to become healthy (active+clean) after a drain was completed and OSDs came back up. # Operator will continue with the next drain if the timeout exceeds. It only works if `managePodBudgets` is `true`. # No values or 0 means that the operator will wait until the placement groups are healthy before unblocking the next drain. pgHealthCheckTimeout: 0 # If true, the operator will create and manage MachineDisruptionBudgets to ensure OSDs are only fenced when the cluster is healthy. # Only available on OpenShift. manageMachineDisruptionBudgets: false # Namespace in which to watch for the MachineDisruptionBudgets. machineDisruptionBudgetNamespace: openshift-machine-api # healthChecks # Valid values for daemons are 'mon', 'osd', 'status' healthCheck: daemonHealth: mon: disabled: false interval: 45s osd: disabled: false interval: 60s status: disabled: false interval: 60s # Change pod liveness probe, it works for all mon,mgr,osd daemons livenessProbe: mon: disabled: false mgr: disabled: false osd: disabled: false
Tip: for more CephCluster CRD settings, see https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md.
https://blog.gmem.cc/rook-based-k8s-storage-solution
[root@k8smaster01 ceph]# kubectl create -f cluster.yaml [root@k8smaster01 ceph]# kubectl logs -f -n rook-ceph rook-ceph-operator-cb47c46bc-pszfh #可查看部署log [root@k8smaster01 ceph]# kubectl get pods -n rook-ceph -o wide #需要等待一定时间,部分中间态容器可能会波动 [root@centos8-master01 ceph]# kubectl get pod -n rook-ceph NAME READY STATUS RESTARTS AGE csi-cephfsplugin-5fpwr 3/3 Running 0 168m csi-cephfsplugin-7zpd6 3/3 Running 0 168m csi-cephfsplugin-cp27r 3/3 Running 0 168m csi-cephfsplugin-nfl9t 3/3 Running 2 168m csi-cephfsplugin-provisioner-f57576c9f-6k7dd 6/6 Running 5 168m csi-cephfsplugin-provisioner-f57576c9f-9td6k 6/6 Running 0 168m csi-cephfsplugin-qf7kw 3/3 Running 0 168m csi-cephfsplugin-srf8k 3/3 Running 0 168m csi-rbdplugin-5srf4 3/3 Running 0 168m csi-rbdplugin-br5gm 3/3 Running 0 168m csi-rbdplugin-dnmnl 3/3 Running 0 168m csi-rbdplugin-gbpf4 3/3 Running 0 168m csi-rbdplugin-n5bm9 3/3 Running 0 168m csi-rbdplugin-provisioner-8557f6cd8-7t4nb 6/6 Running 5 168m csi-rbdplugin-provisioner-8557f6cd8-pnn6f 6/6 Running 0 168m csi-rbdplugin-rj6j5 3/3 Running 2 168m rook-ceph-crashcollector-centos8-master01-64f57f48b8-xfgpb 1/1 Running 0 166m rook-ceph-crashcollector-centos8-master02-778cf6d7f6-wpnxv 1/1 Running 0 166m rook-ceph-crashcollector-centos8-master03-56476849b7-hp5m8 1/1 Running 0 166m rook-ceph-mgr-a-f5d8d9fc8-5plv9 1/1 Running 0 166m rook-ceph-mon-a-66c78577df-jsj2m 1/1 Running 0 168m rook-ceph-mon-b-5cb6bdbb-b7g4r 1/1 Running 0 167m rook-ceph-mon-c-78c5889d7-hwpxk 1/1 Running 0 166m rook-ceph-operator-7968ff9886-tszjt 1/1 Running 0 169m rook-ceph-osd-0-7dcddcbc6b-6m2hc 1/1 Running 0 166m rook-ceph-osd-1-54b47bf778-zlftn 1/1 Running 0 166m rook-ceph-osd-2-56c5b9c48b-jcbcp 1/1 Running 0 166m rook-ceph-osd-prepare-centos8-master01-q9v7d 0/1 Completed 0 95m rook-ceph-osd-prepare-centos8-master02-kqlqz 0/1 Completed 0 95m rook-ceph-osd-prepare-centos8-master03-jxdk8 0/1 Completed 0 94m
To remove the deployment, run on a master node: [root@k8smaster01 ceph]# kubectl delete -f ./
Then run the following cleanup on all master (storage) nodes:
rm -rf /var/lib/rook
rm -rf /dev/mapper/ceph-*
dmsetup ls
dmsetup remove_all
dd if=/dev/zero of=/dev/sdb bs=512k count=1
wipefs -af /dev/sdb
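If dmsetup remove_all feels too aggressive, a narrower sketch that removes only the leftover ceph device-mapper entries (names created by ceph-volume are assumed to start with ceph--) is:
for m in $(dmsetup ls | awk '/^ceph--/{print $1}'); do dmsetup remove "$m"; done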
The toolbox is a container shipping the Rook/Ceph tool set; the commands inside it are used to debug and test Rook, and ad-hoc Ceph test operations are normally run from this container.
[root@centos8-master01 ceph]# kubectl create -f toolbox.yaml
[root@centos8-master01 ceph]# kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
NAME READY STATUS RESTARTS AGE
rook-ceph-tools-8574b74c5d-25bp9 1/1 Running 0 143m
[root@centos8-master01 ceph]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead. [root@rook-ceph-tools-8574b74c5d-25bp9 /]# ceph status cluster: id: 0b5933ca-7a97-4176-b17f-9a07aa19560b health: HEALTH_OK services: mon: 3 daemons, quorum a,b,c (age 98m) mgr: a(active, since 2h) osd: 3 osds: 3 up (since 2h), 3 in (since 2h) data: pools: 2 pools, 33 pgs objects: 5 objects, 19 B usage: 3.0 GiB used, 897 GiB / 900 GiB avail pgs: 33 active+clean [root@rook-ceph-tools-8574b74c5d-25bp9 /]# ceph osd status ID HOST USED AVAIL WR OPS WR DATA RD OPS RD DATA STATE 0 centos8-master02 1027M 298G 0 0 0 0 exists,up 1 centos8-master03 1027M 298G 0 0 0 0 exists,up 2 centos8-master01 1027M 298G 0 0 0 0 exists,up [root@rook-ceph-tools-8574b74c5d-25bp9 /]# ceph df --- RAW STORAGE --- CLASS SIZE AVAIL USED RAW USED %RAW USED hdd 900 GiB 897 GiB 11 MiB 3.0 GiB 0.33 TOTAL 900 GiB 897 GiB 11 MiB 3.0 GiB 0.33 --- POOLS --- POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL device_health_metrics 1 1 0 B 0 0 B 0 284 GiB replicapool 2 32 19 B 5 192 KiB 0 284 GiB [root@rook-ceph-tools-8574b74c5d-25bp9 /]# rados df POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR device_health_metrics 0 B 0 0 0 0 0 0 0 0 B 0 0 B 0 B 0 B replicapool 192 KiB 5 0 15 0 0 0 1214 8.8 MiB 220 1.6 MiB 0 B 0 B total_objects 5 total_used 3.0 GiB total_avail 897 GiB total_space 900 GiB [root@rook-ceph-tools-8574b74c5d-25bp9 /]# ceph auth ls installed auth entries: osd.0 key: AQAfvIxhHjXgLRAA1oy7wvyANwp0EE9wm5Q+cQ== caps: [mgr] allow profile osd caps: [mon] allow profile osd caps: [osd] allow * osd.1 key: AQAgvIxhF9MVABAAN5XEgvloNL5SCFUwjcL99g== caps: [mgr] allow profile osd caps: [mon] allow profile osd caps: [osd] allow * osd.2 key: AQAivIxh7oI7MRAAq1QMd4Pnkc1n93mcS8ibzw== caps: [mgr] allow profile osd caps: [mon] allow profile osd caps: [osd] allow * client.admin key: AQCCu4xhsRX/AhAA9j6pgi7ZxSmcOz9g1bnQXA== caps: [mds] allow * caps: [mgr] allow * caps: [mon] allow * caps: [osd] allow * client.bootstrap-mds key: AQDvu4xhLtqcOhAARwiC7ZGStBjajADg3d/dTQ== caps: [mon] allow profile bootstrap-mds client.bootstrap-mgr key: AQDvu4xhd+KcOhAA0x/8kUkT7ibC+6VfXszGSw== caps: [mon] allow profile bootstrap-mgr client.bootstrap-osd key: AQDvu4xh4emcOhAAUnKWOm+ZFZ9wL24JyFPtrA== caps: [mon] allow profile bootstrap-osd client.bootstrap-rbd key: AQDvu4xhbfKcOhAAi+7TE6qVR5c5PDZJVwLBRg== caps: [mon] allow profile bootstrap-rbd client.bootstrap-rbd-mirror key: AQDvu4xhbvqcOhAA78ZXb46BrBfL6A7xRLZuDw== caps: [mon] allow profile bootstrap-rbd-mirror client.bootstrap-rgw key: AQDvu4xhigKdOhAA4IEL9YbcPTmb2kbCPYsSOw== caps: [mon] allow profile bootstrap-rgw client.crash key: AQAUvIxhhvI3DhAAx+p9+90bnD7HNp8/ilGLSg== caps: [mgr] allow profile crash caps: [mon] allow profile crash client.csi-cephfs-node key: AQATvIxh6NPyNxAAZopNxf/8vkU4raGaRl5B1g== caps: [mds] allow rw caps: [mgr] allow rw caps: [mon] allow r caps: [osd] allow rw tag cephfs *=* client.csi-cephfs-provisioner key: AQATvIxheVlIKxAAI+pnqJaqu9XXdTEezaHL9g== caps: [mgr] allow rw caps: [mon] allow r caps: [osd] allow rw tag cephfs metadata=* client.csi-rbd-node key: AQATvIxhOer9HRAAWwnDk/dmahHyWZbGWSbWjg== caps: [mgr] allow rw caps: [mon] profile rbd caps: [osd] profile rbd client.csi-rbd-provisioner key: AQATvIxhI1kPEBAA2KRxn7qIdb92z/GSUruZxw== 
caps: [mgr] allow rw caps: [mon] profile rbd caps: [osd] profile rbd mgr.a key: AQAUvIxhEmq6KhAA2cOgyM4GyoysKPZfqDOlsw== caps: [mds] allow * caps: [mon] allow profile mgr caps: [osd] allow * [root@rook-ceph-tools-8574b74c5d-25bp9 /]# ceph version ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)
Tip: for more on Ceph administration, see 《008.RHCS-管理Ceph存储集群》. The toolbox also supports standalone ceph commands, e.g. ceph osd pool create ceph-test 512 to create a pool, but in a Kubernetes Rook deployment you should not operate the underlying Ceph directly, to avoid inconsistencies with the Kubernetes layer above it.
For convenience, the Ceph keyring and config can also be copied onto a master node, so the Rook Ceph cluster can be inspected from a host outside Kubernetes.
[root@k8smaster01 ~]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') cat /etc/ceph/ceph.conf > /etc/ceph/ceph.conf
[root@k8smaster01 ~]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') cat /etc/ceph/keyring > /etc/ceph/keyring
[root@k8smaster01 ceph]# tee /etc/yum.repos.d/ceph.repo <<-'EOF'
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
EOF
[root@k8smaster01 ceph]# yum -y install ceph-common ceph-fuse #install the client
[root@k8smaster01 ~]# ceph status
Tip: the rpm-nautilus repository version should match the Ceph version seen in 2.8 (switch to the corresponding rpm-* repository if they differ). For a Rook Ceph cluster running on Kubernetes, managing it directly with the ceph command is strongly discouraged, as it may introduce inconsistencies; use the cluster as described in step 3 and restrict the ceph command to simple, read-only inspection.
Before block storage can be provisioned, a StorageClass and a storage pool must be created. Kubernetes needs these two resources to interact with Rook and allocate persistent volumes (PVs).
[root@k8smaster01 ceph]# kubectl create -f csi/rbd/storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
    requireSafeReplicaSize: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph # namespace:cluster
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # namespace:cluster
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete
[root@centos8-master01 rbd]# kubectl get sc
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   107m
[root@centos8-master01 rbd]# cat pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
[root@centos8-master01 rbd]# kubectl get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
block-pvc   Bound    pvc-6328beff-bfe6-4a26-be53-4c1ffc4c9bb3   200Mi      RWO            rook-ceph-block   8s
[root@centos8-master01 rbd]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS      REASON   AGE
pvc-6328beff-bfe6-4a26-be53-4c1ffc4c9bb3   200Mi      RWO            Delete           Bound    default/block-pvc   rook-ceph-block            9s
Explanation: this creates a PVC whose storageClassName is rook-ceph-block, the class backed by the Rook Ceph cluster.
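To see the RBD image backing the PVC, the pool can be inspected from the rook-ceph-tools pod (the pool name replicapool comes from the StorageClass above; run these inside the toolbox):
rbd ls -p replicapool
rbd info replicapool/$(rbd ls -p replicapool | head -n 1)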
[root@centos8-master01 rbd]# cat podo1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: rookpod01
spec:
  restartPolicy: OnFailure
  containers:
  - name: test-container
    image: busybox
    volumeMounts:
    - name: block-pvc
      mountPath: /var/test
    command: ['sh', '-c', 'echo "Hello World" > /var/test/data; exit 0']
  volumes:
  - name: block-pvc
    persistentVolumeClaim:
      claimName: block-pvc
[root@centos8-master01 rbd]# kubectl apply -f podo1.yaml
pod/rookpod01 created
[root@centos8-master01 rbd]# kubectl get pod
NAME        READY   STATUS              RESTARTS   AGE
rookpod01   0/1     ContainerCreating   0          9s
[root@centos8-master01 rbd]# kubectl get pod
NAME        READY   STATUS      RESTARTS   AGE
rookpod01   0/1     Completed   0          81s
Explanation: create the Pod above, which mounts the PVC created in 3.2, and wait for it to complete.
[root@centos8-master01 rbd]# kubectl delete -f podo1.yaml
pod "rookpod01" deleted
[root@centos8-master01 rbd]# cat pod02.yaml
apiVersion: v1
kind: Pod
metadata:
  name: rookpod02
spec:
  restartPolicy: OnFailure
  containers:
  - name: test-container
    image: busybox
    volumeMounts:
    - name: block-pvc
      mountPath: /var/test
    command: ['sh', '-c', 'cat /var/test/data; exit 0']
  volumes:
  - name: block-pvc
    persistentVolumeClaim:
      claimName: block-pvc
[root@k8smaster01 ceph]# kubectl create -f pod02.yaml
[root@k8smaster01 ceph]# kubectl logs rookpod02 test-container
Hello World
Explanation: rookpod02 reuses the same PVC, verifying that the data written by rookpod01 persists.
Tip: for more on Ceph block devices, see 《003.RHCS-RBD块存储使用》.
Before object storage can be provided, the supporting resources must be created first. The default yaml provided upstream (below) deploys a CephObjectStore for object storage.
[root@k8smaster01 ceph]# kubectl create -f object.yaml apiVersion: ceph.rook.io/v1 kind: CephObjectStore metadata: name: my-store namespace: rook-ceph # namespace:cluster spec: # The pool spec used to create the metadata pools. Must use replication. metadataPool: failureDomain: host replicated: size: 3 # Disallow setting pool with replica 1, this could lead to data loss without recovery. # Make sure you're *ABSOLUTELY CERTAIN* that is what you want requireSafeReplicaSize: true parameters: # Inline compression mode for the data pool # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression compression_mode: none # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size #target_size_ratio: ".5" # The pool spec used to create the data pool. Can use replication or erasure coding. dataPool: failureDomain: host replicated: size: 3 # Disallow setting pool with replica 1, this could lead to data loss without recovery. # Make sure you're *ABSOLUTELY CERTAIN* that is what you want requireSafeReplicaSize: true parameters: # Inline compression mode for the data pool # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression compression_mode: none # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size #target_size_ratio: ".5" # Whether to preserve metadata and data pools on object store deletion preservePoolsOnDelete: false # The gateway service configuration gateway: # type of the gateway (s3) type: s3 # A reference to the secret in the rook namespace where the ssl certificate is stored sslCertificateRef: # The port that RGW pods will listen on (http) port: 80 # The port that RGW pods will listen on (https). An ssl certificate is required. # securePort: 443 # The number of pods in the rgw deployment instances: 1 # The affinity rules to apply to the rgw deployment or daemonset. placement: # nodeAffinity: # requiredDuringSchedulingIgnoredDuringExecution: # nodeSelectorTerms: # - matchExpressions: # - key: role # operator: In # values: # - rgw-node # topologySpreadConstraints: # tolerations: # - key: rgw-node # operator: Exists # podAffinity: # podAntiAffinity: # A key/value list of annotations annotations: # key: value # A key/value list of labels labels: # key: value resources: # The requests and limits set here, allow the object store gateway Pod(s) to use half of one CPU core and 1 gigabyte of memory # limits: # cpu: "500m" # memory: "1024Mi" # requests: # cpu: "500m" # memory: "1024Mi" # priorityClassName: my-priority-class #zone: #name: zone-a # service endpoint healthcheck healthCheck: bucket: disabled: false interval: 60s # Configure the pod liveness probe for the rgw daemon livenessProbe: disabled: false [root@centos8-master01 ceph]# kubectl apply -f object.yaml cephobjectstore.ceph.rook.io/my-store created [root@centos8-master01 ceph]# kubectl -n rook-ceph get pod -l app=rook-ceph-rgw -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES rook-ceph-rgw-my-store-a-857d775fbd-bl2x7 1/1 Running 0 39s 10.10.108.168 centos8-master02 <none> <none>
The default yaml provided upstream (below) deploys the StorageClass for object storage.
[root@centos8-master01 ceph]# cat storageclass-bucket-delete.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-delete-bucket
provisioner: rook-ceph.ceph.rook.io/bucket # driver:namespace:cluster
# set the reclaim policy to delete the bucket and all objects
# when its OBC is deleted.
reclaimPolicy: Delete
parameters:
  objectStoreName: my-store
  objectStoreNamespace: rook-ceph # namespace:cluster
  region: us-east-1
  # To accommodate brownfield cases reference the existing bucket name here instead
  # of in the ObjectBucketClaim (OBC). In this case the provisioner will grant
  # access to the bucket by creating a new user, attaching it to the bucket, and
  # providing the credentials via a Secret in the namespace of the requesting OBC.
  #bucketName:
[root@centos8-master01 ceph]# kubectl apply -f storageclass-bucket-delete.yaml
storageclass.storage.k8s.io/rook-ceph-delete-bucket created
[root@centos8-master01 ceph]# kubectl get sc
NAME                      PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block           rook-ceph.rbd.csi.ceph.com      Delete          Immediate           true                   3h2m
rook-ceph-delete-bucket   rook-ceph.ceph.rook.io/bucket   Delete          Immediate           false                  37s
The default yaml provided upstream (below) creates an object storage bucket via an ObjectBucketClaim.
[root@centos8-master01 ceph]# cat object-bucket-claim-delete.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-delete-bucket
spec:
  # To create a new bucket specify either `bucketName` or
  # `generateBucketName` here. Both cannot be used. To access
  # an existing bucket the bucket name needs to be defined in
  # the StorageClass referenced here, and both `bucketName` and
  # `generateBucketName` must be omitted in the OBC.
  #bucketName:
  generateBucketName: ceph-bkt
  storageClassName: rook-ceph-delete-bucket
  additionalConfig:
    # To set for quota for OBC
    #maxObjects: "1000"
    #maxSize: "2G"
[root@centos8-master01 ceph]# kubectl get cm
NAME                 DATA   AGE
ceph-delete-bucket   5      41s
4.4 Set up object storage access information
[root@k8smaster01 ceph]# kubectl -n default get cm ceph-delete-bucket -o yaml | grep BUCKET_HOST | awk '{print $2}'
rook-ceph-rgw-my-store.rook-ceph
[root@k8smaster01 ceph]# export AWS_HOST=$(kubectl -n default get cm ceph-delete-bucket -o yaml | grep BUCKET_HOST | awk '{print $2}')
[root@k8smaster01 ceph]# export AWS_ACCESS_KEY_ID=$(kubectl -n default get secret ceph-delete-bucket -o yaml | grep AWS_ACCESS_KEY_ID | awk '{print $2}' | base64 --decode)
[root@k8smaster01 ceph]# export AWS_SECRET_ACCESS_KEY=$(kubectl -n default get secret ceph-delete-bucket -o yaml | grep AWS_SECRET_ACCESS_KEY | awk '{print $2}' | base64 --decode)
[root@k8smaster01 ceph]# s3cmd put test.txt --no-ssl --host=${AWS_HOST} --host-bucket= s3://ceph-bkt-377bf96f-aea8-4838-82bc-2cb2c16cccfb/test.txt #test uploading to the bucket
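With the variables exported above, access can also be verified with a read-only listing; a sketch assuming s3cmd is installed on the node and that the RGW service address in AWS_HOST is reachable from it (the bucket name is the one generated by the OBC above):
s3cmd ls --no-ssl --host=${AWS_HOST} --host-bucket= --access_key=${AWS_ACCESS_KEY_ID} --secret_key=${AWS_SECRET_ACCESS_KEY} s3://ceph-bkt-377bf96f-aea8-4838-82bc-2cb2c16cccfb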
Tip: for more on Rook object storage, such as creating users, see https://rook.io/docs/rook/v1.1/ceph-object.html.
5 Ceph file storage
5.1 Create a CephFilesystem
By default the cluster does not deploy CephFS support. The default yaml provided upstream (below) deploys a filesystem for file storage.
[root@k8smaster01 ceph]# kubectl create -f filesystem.yaml
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - failureDomain: host
      replicated:
        size: 3
  preservePoolsOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
    placement:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - rook-ceph-mds
          topologyKey: kubernetes.io/hostname
    annotations:
    resources:
[root@k8smaster01 ceph]# kubectl get cephfilesystems.ceph.rook.io -n rook-ceph
NAME ACTIVEMDS AGE
myfs 1 27s
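The filesystem is served by MDS pods; with activeCount: 1 and activeStandby: true, two of them are expected. A quick check:
kubectl -n rook-ceph get pod -l app=rook-ceph-mds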
5.2 Create a StorageClass
[root@k8smaster01 ceph]# kubectl create -f csi/cephfs/storageclass.yaml
The default yaml provided upstream (below) deploys the StorageClass for file storage.
[root@k8smaster01 ceph]# vi csi/cephfs/storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
mountOptions:
[root@k8smaster01 ceph]# kubectl get sc
NAME PROVISIONER AGE
csi-cephfs rook-ceph.cephfs.csi.ceph.com 10m
5.3 Create a PVC
[root@k8smaster01 ceph]# vi rookpvc03.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  storageClassName: csi-cephfs
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
[root@k8smaster01 ceph]# kubectl create -f rookpvc03.yaml
[root@k8smaster01 ceph]# kubectl get pv
[root@k8smaster01 ceph]# kubectl get pvc
5.4 Consume the PVC
[root@k8smaster01 ceph]# vi rookpod03.yaml
---
apiVersion: v1
kind: Pod
metadata:
  name: csicephfs-demo-pod
spec:
  containers:
  - name: web-server
    image: nginx
    volumeMounts:
    - name: mypvc
      mountPath: /var/lib/www/html
  volumes:
  - name: mypvc
    persistentVolumeClaim:
      claimName: cephfs-pvc
      readOnly: false
[root@k8smaster01 ceph]# kubectl create -f rookpod03.yaml
[root@k8smaster01 ceph]# kubectl get pods
NAME READY STATUS RESTARTS AGE
csicephfs-demo-pod 1/1 Running 0 24s
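To confirm the CephFS volume is really mounted inside the pod, a simple check (the mount path comes from the pod spec above):
kubectl exec csicephfs-demo-pod -- df -h /var/lib/www/html
kubectl exec csicephfs-demo-pod -- sh -c 'echo hello > /var/lib/www/html/index.html && cat /var/lib/www/html/index.html'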
6 Configure the dashboard
6.1 Deploy a NodePort Service
The dashboard was already enabled in step 2.4, but it is only exposed through a ClusterIP service. The default yaml provided upstream (below) exposes the dashboard externally via a NodePort service.
[root@k8smaster01 ceph]# kubectl create -f dashboard-external-https.yaml
[root@k8smaster01 ceph]# vi dashboard-external-https.yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  ports:
  - name: dashboard
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
[root@k8smaster01 ceph]# kubectl get svc -n rook-ceph
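To extract just the NodePort that was allocated (service name as defined in the yaml above):
kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-https -o jsonpath='{.spec.ports[0].nodePort}'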
6.2 Verify
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath='{.data.password}' | base64 --decode #retrieve the initial password
Browse to https://172.24.8.71:31097 (any node IP plus the NodePort shown above).
Username: admin; password: the value retrieved above.
clipboard
7 Cluster management
7.1 Modify the configuration
The Ceph cluster's configuration parameters are generated when the Cluster is created. To change them after deployment, proceed as follows:
[root@k8smaster01 ceph]# kubectl -n rook-ceph get configmap rook-config-override -o yaml #view the current parameters
[root@k8snode02 ~]# cat /var/lib/rook/rook-ceph/rook-ceph.config #can also be viewed on any node
[root@k8smaster01 ceph]# kubectl -n rook-ceph edit configmap rook-config-override -o yaml #edit the parameters
……
apiVersion: v1
data:
  config: |
    [global]
    osd pool default size = 2
……
Restart the Ceph components one by one:
[root@k8smaster01 ceph]# kubectl -n rook-ceph delete pod rook-ceph-mgr-a-5699bb7984-kpxgp
[root@k8smaster01 ceph]# kubectl -n rook-ceph delete pod rook-ceph-mon-a-85698dfff9-w5l8c
[root@k8smaster01 ceph]# kubectl -n rook-ceph delete pod rook-ceph-mgr-a-d58847d5-dj62p
[root@k8smaster01 ceph]# kubectl -n rook-ceph delete pod rook-ceph-mon-b-76559bf966-652nl
[root@k8smaster01 ceph]# kubectl -n rook-ceph delete pod rook-ceph-mon-c-74dd86589d-s84cz
Note: delete the ceph-mon and ceph-osd pods one by one, waiting for the Ceph cluster to return to HEALTH_OK before deleting the next one.
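Between deletions, the cluster health can be checked from the toolbox using the same pattern as earlier (wait for HEALTH_OK):
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status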
Tip: for more Rook configuration parameters see https://rook.io/docs/rook/v1.1/.
7.2 Create a Pool
For pool creation on a Rook Ceph cluster, use Kubernetes resources rather than the ceph command in the toolbox.
The default yaml provided upstream (below) creates a pool.
[root@k8smaster01 ceph]# kubectl create -f pool.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool2
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
  annotations:
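After creation, the pool can be confirmed through its CRD rather than the ceph CLI:
kubectl -n rook-ceph get cephblockpool replicapool2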
7.3 Delete a Pool
[root@k8smaster01 ceph]# kubectl delete -f pool.yaml
Tip: for more pool management, such as erasure-coded pools, see https://rook.io/docs/rook/v1.1/ceph-pool-crd.html.
7.4 Add an OSD node
This step simulates adding k8smaster01's sdb as an OSD.
[root@k8smaster01 ceph]# kubectl taint node k8smaster01 node-role.kubernetes.io/master- #allow Pods to be scheduled on this node
[root@k8smaster01 ceph]# kubectl label nodes k8smaster01 ceph-osd=enabled #set the label
[root@k8smaster01 ceph]# vi cluster.yaml #append the k8smaster01 entry
……
- name: "k8smaster01"
  config:
    storeType: bluestore
  devices:
  - name: "sdb"
……
[root@k8smaster01 ceph]# kubectl apply -f cluster.yaml
[root@k8smaster01 ceph]# kubectl -n rook-ceph get pod -o wide -w
In the toolbox, confirm that the new OSD has joined: ceph osd tree
7.5 Remove an OSD node
[root@k8smaster01 ceph]# kubectl label nodes k8smaster01 ceph-osd- #remove the label
[root@k8smaster01 ceph]# vi cluster.yaml #remove the k8smaster01 entry below
……
- name: "k8smaster01"
  config:
    storeType: bluestore
  devices:
  - name: "sdb"
……
[root@k8smaster01 ceph]# kubectl apply -f cluster.yaml
[root@k8smaster01 ceph]# kubectl -n rook-ceph get pod -o wide -w
[root@k8smaster01 ceph]# rm -rf /var/lib/rook
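If the freed disk is to be reused, also wipe it on that node, reusing the same commands as in the cleanup section earlier (disk assumed to be /dev/sdb):
dd if=/dev/zero of=/dev/sdb bs=512k count=1
wipefs -af /dev/sdb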
7.6 Delete the cluster
For a complete, graceful teardown of a Rook cluster, see: https://github.com/rook/rook/blob/master/Documentation/ceph-teardown.md
7.7 Upgrade Rook
Reference: http://www.yangguanjun.com/2018/12/28/rook-ceph-practice-part2/
More official documentation: https://rook.github.io/docs/rook/v1.1/
Recommended posts: http://www.yangguanjun.com/archives/
https://sealyun.com/post/rook/