Special notes:
- This approach can also be used for a single-node deployment with only one Monitor (at the cost of a single point of failure); the minimum requirement is two partitions to create 2 OSDs (because the default minimum replica count is 2). If you do not need CephFS, the MDS service can be skipped; if you do not use object storage, the RGW service can be skipped. Ceph added the Manager service in 11.x (kraken) as an optional component, and starting with 12.x (luminous) it is mandatory.
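- For such a single-node test deployment, the pool defaults can be relaxed so that placement groups become healthy on one host. A minimal ceph.conf sketch, assuming a lab setup (these values are illustrative and are not part of the 3-node deployment below; osd_crush_chooseleaf_type = 0 lets replicas land on different OSDs of the same host):
- $ vi ceph.conf
- [global]
- osd_pool_default_size = 2
- osd_pool_default_min_size = 1
- osd_crush_chooseleaf_type = 0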
System environment
- DNS names and IP addresses of the 3 nodes (the hostnames are the same as the DNS names):
- $ cat /etc/hosts
- ...
-
- 172.29.101.166 osdev01
- 172.29.101.167 osdev02
- 172.29.101.168 osdev03
-
- ...
- Kernel and distribution versions:
- $ uname -r
- 3.10.0-862.11.6.el7.x86_64
-
- $ cat /etc/redhat-release
- CentOS Linux release 7.5.1804 (Core)
- All 3 nodes use sdb as the OSD disk; use the dd command to clear any partition information that may exist on it (this destroys the data on the disk, so proceed with caution):
- $ lsblk
- NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
- sda 8:0 0 222.6G 0 disk
- ├─sda1 8:1 0 1G 0 part /boot
- └─sda2 8:2 0 221.6G 0 part /
- sdb 8:16 0 7.3T 0 disk
-
- $ dd if=/dev/zero of=/dev/sdb bs=512 count=1024
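- Alternatively, existing filesystem and LVM signatures can be cleared with wipefs (just as destructive as dd, so double-check the device name first):
- $ wipefs --all /dev/sdb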
System configuration
Yum configuration
- Install the epel repository:
$ yum install -y epel-release
- Install the yum priorities plugin:
$ yum install -y yum-plugin-priorities --enablerepo=rhel-7-server-optional-rpms
System settings
- Install and enable the NTP service:
- $ yum install -y ntp ntpdate ntp-doc
-
- $ systemctl enable ntpd.service && systemctl start ntpd.service && systemctl status ntpd.service
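- Optionally, verify that time synchronization is actually working (the peer list shown depends on the NTP servers configured on your systems):
- $ ntpq -p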
- Add the osdev user and grant it passwordless sudo (you can also simply use the root user; this step is only a security precaution):
- $ useradd -d /home/osdev -m osdev
- $ passwd osdev
-
- $ echo "osdev ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/osdev
- $ chmod 0440 /etc/sudoers.d/osdev
- Disable the firewall:
$ systemctl stop firewalld && systemctl disable firewalld && systemctl status firewalld
- Disable SELinux:
- $ sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config && cat /etc/selinux/config
- # setenforce 0 && sestatus
- $ reboot
- $ sestatus
- SELinux status: disabled
SSH configuration
- Install the SSH server package:
$ yum install -y openssh-server
- Set up passwordless SSH login between the nodes:
- $ ssh-keygen
- $ ssh-copy-id osdev@osdev01
- $ ssh-copy-id osdev@osdev02
- $ ssh-copy-id osdev@osdev03
- Configure the default SSH user, or specify the user with --username when running ceph-deploy (note: this configuration also makes Kolla-Ansible use this user as its default, which can cause permission errors; you can add the configuration below under the osdev user and run Kolla-Ansible as root instead):
- $ vi ~/.ssh/config
- Host osdev01
- Hostname osdev01
- User osdev
- Host osdev02
- Hostname osdev02
- User osdev
- Host osdev03
- Hostname osdev03
- User osdev
- Test that passwordless login works:
- [root@osdev01 ~]# ssh osdev01
- Last login: Wed Aug 22 16:53:56 2018 from osdev01
- [osdev@osdev01 ~]$ exit
- logout
- Connection to osdev01 closed.
- [root@osdev01 ~]# ssh osdev02
- Last login: Wed Aug 22 16:55:06 2018 from osdev01
- [osdev@osdev02 ~]$ exit
- logout
- Connection to osdev02 closed.
- [root@osdev01 ~]# ssh osdev03
- Last login: Wed Aug 22 16:55:35 2018 from osdev01
- [osdev@osdev03 ~]$ exit
- logout
- Connection to osdev03 closed.
Starting the deployment
Initializing the system
- Install ceph-deploy:
$ yum install -y ceph-deploy
- Create the ceph-deploy configuration directory:
- $ su - osdev
- $ mkdir -pv /opt/ceph/deploy && cd /opt/ceph/deploy
- Create a Ceph cluster, using osdev01, osdev02, and osdev03 as Monitor nodes:
$ ceph-deploy new osdev01 osdev02 osdev03
- Check the generated configuration files:
- $ ls
- ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
-
- $ cat ceph.conf
- [global]
- fsid = 42ded78e-211b-4095-b795-a33f116727fc
- mon_initial_members = osdev01, osdev02, osdev03
- mon_host = 172.29.101.166,172.29.101.167,172.29.101.168
- auth_cluster_required = cephx
- auth_service_required = cephx
- auth_client_required = cephx
- Edit the Ceph cluster configuration:
- $ vi ceph.conf
- public_network = 172.29.101.0/24
- cluster_network = 172.29.101.0/24
-
- osd_pool_default_size = 3
- osd_pool_default_min_size = 1
- osd_pool_default_pg_num = 8
- osd_pool_default_pgp_num = 8
- osd_crush_chooseleaf_type = 1
-
- [mon]
- mon_clock_drift_allowed = 0.5
-
- [osd]
- osd_mkfs_type = xfs
- osd_mkfs_options_xfs = -f
- filestore_max_sync_interval = 5
- filestore_min_sync_interval = 0.1
- filestore_fd_cache_size = 655350
- filestore_omap_header_cache_size = 655350
- filestore_fd_cache_random = true
- osd op threads = 8
- osd disk threads = 4
- filestore op threads = 8
- max_open_files = 655350
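- If ceph.conf is changed again later, the updated file can be pushed from this deploy directory to the nodes with ceph-deploy's config subcommand (a sketch):
- $ ceph-deploy --overwrite-conf config push osdev01 osdev02 osdev03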
Installing packages
- Install the Ceph packages on the 3 nodes (if an error occurs, first remove the release package on each of the 3 nodes):
- # sudo yum remove -y ceph-release
- $ ceph-deploy install osdev01 osdev02 osdev03
Deploying the Monitors
- Deploy the initial Monitors:
$ ceph-deploy mon create-initial
- Check the generated configuration and keyring files:
- $ ls
- ceph.bootstrap-mds.keyring ceph.bootstrap-mgr.keyring ceph.bootstrap-osd.keyring ceph.bootstrap-rgw.keyring ceph.client.admin.keyring ceph.conf ceph-deploy-ceph.log ceph.mon.keyring
-
- $ sudo chmod a+r /etc/ceph/ceph.client.admin.keyring
- Copy the configuration and keyring files to the specified nodes:
$ ceph-deploy --overwrite-conf admin osdev01 osdev02 osdev03
- Configure the warning threshold for the Monitor's remaining available data space on osdev01:
- $ ceph -s
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- mon osdev01 is low on available space
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev03(active), standbys: osdev02, osdev01
- osd: 3 osds: 3 up, 3 in
- rgw: 3 daemons active
-
- data:
- pools: 10 pools, 176 pgs
- objects: 578 objects, 477 MiB
- usage: 4.0 GiB used, 22 TiB / 22 TiB avail
- pgs: 176 active+clean
-
- $ ceph daemon mon.osdev01 config get mon_data_avail_warn
- {
- "mon_data_avail_warn": "30"
- }
-
- $ ceph daemon mon.osdev01 config set mon_data_avail_warn 10
- {
- "success": "mon_data_avail_warn = '10' (not observed, change may require restart) "
- }
-
- $ vi /etc/ceph/ceph.conf
- [mon]
- mon_clock_drift_allowed = 0.5
- mon allow pool delete = true
- mon_data_avail_warn = 10
-
- $ systemctl restart ceph-mon@osdev01.service
-
- $ ceph daemon mon.osdev01 config get mon_data_avail_warn
- {
- "mon_data_avail_warn": "10"
- }
-
- $ ceph -s
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_OK
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev03(active), standbys: osdev02, osdev01
- osd: 3 osds: 3 up, 3 in
- rgw: 3 daemons active
-
- data:
- pools: 10 pools, 176 pgs
- objects: 578 objects, 477 MiB
- usage: 4.0 GiB used, 22 TiB / 22 TiB avail
- pgs: 176 active+clean
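- The same option can also be changed at runtime on all Monitors with injectargs instead of editing ceph.conf (a sketch; changes made this way are not persisted across restarts):
- $ ceph tell mon.\* injectargs '--mon_data_avail_warn=10'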
Removing a Monitor
- Remove the Monitor service from osdev01:
- $ ceph-deploy mon destroy osdev01
- [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon destroy osdev01
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] subcommand : destroy
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f2a70e3db00>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] mon : ['osdev01']
- [ceph_deploy.cli][INFO ] func : <function mon at 0x7f2a7129c848>
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] default_release : False
- [ceph_deploy.mon][DEBUG ] Removing mon from osdev01
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][DEBUG ] get remote short hostname
- [osdev01][INFO ] Running command: ceph --cluster=ceph -n mon. -k /var/lib/ceph/mon/ceph-osdev01/keyring mon remove osdev01
- [osdev01][WARNIN] removing mon.osdev01 at 172.29.101.166:6789/0, there will be 2 monitors
- [osdev01][INFO ] polling the daemon to verify it stopped
- [osdev01][INFO ] Running command: systemctl stop ceph-mon@osdev01.service
- [osdev01][INFO ] Running command: mkdir -p /var/lib/ceph/mon-removed
- [osdev01][DEBUG ] move old monitor data
- Add the Monitor service back on osdev01:
- $ ceph-deploy mon add osdev01
- [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon add osdev01
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] subcommand : add
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f791d413878>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] mon : ['osdev01']
- [ceph_deploy.cli][INFO ] func : <function mon at 0x7f791d870848>
- [ceph_deploy.cli][INFO ] address : None
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] default_release : False
- [ceph_deploy.mon][INFO ] ensuring configuration of new mon host: osdev01
- [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to osdev01
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
- [ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host osdev01
- [ceph_deploy.mon][DEBUG ] using mon address by resolving host: 172.29.101.166
- [ceph_deploy.mon][DEBUG ] detecting platform for host osdev01 ...
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.5.1804 Core
- [osdev01][DEBUG ] determining if provided host has same hostname in remote
- [osdev01][DEBUG ] get remote short hostname
- [osdev01][DEBUG ] adding mon to osdev01
- [osdev01][DEBUG ] get remote short hostname
- [osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
- [osdev01][DEBUG ] create the mon path if it does not exist
- [osdev01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-osdev01/done
- [osdev01][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-osdev01/done
- [osdev01][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-osdev01.mon.keyring
- [osdev01][DEBUG ] create the monitor keyring file
- [osdev01][INFO ] Running command: ceph --cluster ceph mon getmap -o /var/lib/ceph/tmp/ceph.osdev01.monmap
- [osdev01][WARNIN] got monmap epoch 3
- [osdev01][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i osdev01 --monmap /var/lib/ceph/tmp/ceph.osdev01.monmap --keyring /var/lib/ceph/tmp/ceph-osdev01.mon.keyring --setuser 167 --setgroup 167
- [osdev01][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-osdev01.mon.keyring
- [osdev01][DEBUG ] create a done file to avoid re-doing the mon deployment
- [osdev01][DEBUG ] create the init path if it does not exist
- [osdev01][INFO ] Running command: systemctl enable ceph.target
- [osdev01][INFO ] Running command: systemctl enable ceph-mon@osdev01
- [osdev01][INFO ] Running command: systemctl start ceph-mon@osdev01
- [osdev01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.osdev01.asok mon_status
- [osdev01][WARNIN] monitor osdev01 does not exist in monmap
- [osdev01][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.osdev01.asok mon_status
- [osdev01][DEBUG ] ********************************************************************************
- [osdev01][DEBUG ] status for monitor: mon.osdev01
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "election_epoch": 0,
- [osdev01][DEBUG ] "extra_probe_peers": [],
- [osdev01][DEBUG ] "feature_map": {
- [osdev01][DEBUG ] "client": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x1ffddff8eea4fffb",
- [osdev01][DEBUG ] "num": 1,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
- [osdev01][DEBUG ] "num": 1,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ],
- [osdev01][DEBUG ] "mds": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
- [osdev01][DEBUG ] "num": 2,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ],
- [osdev01][DEBUG ] "mgr": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
- [osdev01][DEBUG ] "num": 3,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ],
- [osdev01][DEBUG ] "mon": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
- [osdev01][DEBUG ] "num": 1,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ],
- [osdev01][DEBUG ] "osd": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "features": "0x3ffddff8ffa4fffb",
- [osdev01][DEBUG ] "num": 2,
- [osdev01][DEBUG ] "release": "luminous"
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ]
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] "features": {
- [osdev01][DEBUG ] "quorum_con": "0",
- [osdev01][DEBUG ] "quorum_mon": [],
- [osdev01][DEBUG ] "required_con": "144115188346404864",
- [osdev01][DEBUG ] "required_mon": [
- [osdev01][DEBUG ] "kraken",
- [osdev01][DEBUG ] "luminous",
- [osdev01][DEBUG ] "mimic",
- [osdev01][DEBUG ] "osdmap-prune"
- [osdev01][DEBUG ] ]
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] "monmap": {
- [osdev01][DEBUG ] "created": "2018-08-23 10:55:27.755434",
- [osdev01][DEBUG ] "epoch": 3,
- [osdev01][DEBUG ] "features": {
- [osdev01][DEBUG ] "optional": [],
- [osdev01][DEBUG ] "persistent": [
- [osdev01][DEBUG ] "kraken",
- [osdev01][DEBUG ] "luminous",
- [osdev01][DEBUG ] "mimic",
- [osdev01][DEBUG ] "osdmap-prune"
- [osdev01][DEBUG ] ]
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] "fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
- [osdev01][DEBUG ] "modified": "2018-09-19 14:57:08.984472",
- [osdev01][DEBUG ] "mons": [
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "addr": "172.29.101.167:6789/0",
- [osdev01][DEBUG ] "name": "osdev02",
- [osdev01][DEBUG ] "public_addr": "172.29.101.167:6789/0",
- [osdev01][DEBUG ] "rank": 0
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] {
- [osdev01][DEBUG ] "addr": "172.29.101.168:6789/0",
- [osdev01][DEBUG ] "name": "osdev03",
- [osdev01][DEBUG ] "public_addr": "172.29.101.168:6789/0",
- [osdev01][DEBUG ] "rank": 1
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ]
- [osdev01][DEBUG ] },
- [osdev01][DEBUG ] "name": "osdev01",
- [osdev01][DEBUG ] "outside_quorum": [],
- [osdev01][DEBUG ] "quorum": [],
- [osdev01][DEBUG ] "rank": -1,
- [osdev01][DEBUG ] "state": "probing",
- [osdev01][DEBUG ] "sync_provider": []
- [osdev01][DEBUG ] }
- [osdev01][DEBUG ] ********************************************************************************
- [osdev01][INFO ] monitor: mon.osdev01 is currently at the state of probing
Deploying the Managers
- Deploy the Manager service on the 3 nodes (the service was introduced in kraken and has been mandatory since luminous):
$ ceph-deploy mgr create osdev01 osdev02 osdev03
- Check the cluster status; only one of the 3 Managers is active:
- $ sudo ceph -s
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- mon osdev01 is low on available space
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev01(active), standbys: osdev03, osdev02
- osd: 0 osds: 0 up, 0 in
-
- data:
- pools: 0 pools, 0 pgs
- objects: 0 objects, 0 B
- usage: 0 B used, 0 B / 0 B avail
- pgs:
- Check the current cluster quorum status:
- $ sudo ceph quorum_status --format json-pretty
- {
- "election_epoch": 8,
- "quorum": [
- 0,
- 1,
- 2
- ],
- "quorum_names": [
- "osdev01",
- "osdev02",
- "osdev03"
- ],
- "quorum_leader_name": "osdev01",
- "monmap": {
- "epoch": 2,
- "fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
- "modified": "2018-08-23 10:55:53.598952",
- "created": "2018-08-23 10:55:27.755434",
- "features": {
- "persistent": [
- "kraken",
- "luminous",
- "mimic",
- "osdmap-prune"
- ],
- "optional": []
- },
- "mons": [
- {
- "rank": 0,
- "name": "osdev01",
- "addr": "172.29.101.166:6789/0",
- "public_addr": "172.29.101.166:6789/0"
- },
- {
- "rank": 1,
- "name": "osdev02",
- "addr": "172.29.101.167:6789/0",
- "public_addr": "172.29.101.167:6789/0"
- },
- {
- "rank": 2,
- "name": "osdev03",
- "addr": "172.29.101.168:6789/0",
- "public_addr": "172.29.101.168:6789/0"
- }
- ]
- }
- }
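- Optionally, the Manager dashboard module can be enabled to get a web UI served by the active Manager (a sketch for mimic, which also expects a certificate and login credentials; the URL is shown by ceph mgr services):
- $ ceph mgr module enable dashboard
- $ ceph dashboard create-self-signed-cert
- $ ceph mgr services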
Deploying the OSDs
- If OSDs were deployed on these disks before, clean up the leftover LVM volumes first:
$ sudo lvs | awk 'NR!=1 {if($1~"osd-block-") print $2 "/" $1}' | xargs -I {} sudo lvremove -y {}
- Wipe the disk data (this can be skipped if the disks were already wiped with dd and there are no LVM volumes):
- $ ceph-deploy disk zap osdev01 /dev/sdb
- $ ceph-deploy disk zap osdev02 /dev/sdb
- $ ceph-deploy disk zap osdev03 /dev/sdb
- Deploy the OSD service on the 3 nodes; bluestore is used by default, with no separate journal or block_db:
- $ ceph-deploy osd create --data /dev/sdb osdev01
- [ceph_deploy.conf][DEBUG ] found configuration file at: /home/osdev/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create --data /dev/sdb osdev01
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] bluestore : None
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7ff06aa94d88>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] fs_type : xfs
- [ceph_deploy.cli][INFO ] block_wal : None
- [ceph_deploy.cli][INFO ] default_release : False
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] journal : None
- [ceph_deploy.cli][INFO ] subcommand : create
- [ceph_deploy.cli][INFO ] host : osdev01
- [ceph_deploy.cli][INFO ] filestore : None
- [ceph_deploy.cli][INFO ] func : <function osd at 0x7ff06b2efb90>
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] zap_disk : False
- [ceph_deploy.cli][INFO ] data : /dev/sdb
- [ceph_deploy.cli][INFO ] block_db : None
- [ceph_deploy.cli][INFO ] dmcrypt : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] debug : False
- [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
- [osdev01][DEBUG ] connection detected need for sudo
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
- [ceph_deploy.osd][DEBUG ] Deploying osd to osdev01
- [osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
- [osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 3c3d6c5a-c82e-4318-a8fb-134de5444ca7
- [osdev01][DEBUG ] Running command: /usr/sbin/vgcreate --force --yes ceph-95b94aa4-22df-401c-822b-dd62f82f6b08 /dev/sdb
- [osdev01][DEBUG ] stdout: Physical volume "/dev/sdb" successfully created.
- [osdev01][DEBUG ] stdout: Volume group "ceph-95b94aa4-22df-401c-822b-dd62f82f6b08" successfully created
- [osdev01][DEBUG ] Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 ceph-95b94aa4-22df-401c-822b-dd62f82f6b08
- [osdev01][DEBUG ] stdout: Logical volume "osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7" created.
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
- [osdev01][DEBUG ] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
- [osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
- [osdev01][DEBUG ] Running command: /bin/ln -s /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 /var/lib/ceph/osd/ceph-1/block
- [osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-1/activate.monmap
- [osdev01][DEBUG ] stderr: got monmap epoch 1
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-1/keyring --create-keyring --name osd.1 --add-key AQDxF35bOAdNHBAAelXgl7laeMnVsGAlHl0dxQ==
- [osdev01][DEBUG ] stdout: creating /var/lib/ceph/osd/ceph-1/keyring
- [osdev01][DEBUG ] added entity osd.1 auth auth(auid = 18446744073709551615 key=AQDxF35bOAdNHBAAelXgl7laeMnVsGAlHl0dxQ== with 0 caps)
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/keyring
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1/
- [osdev01][DEBUG ] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 1 --monmap /var/lib/ceph/osd/ceph-1/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-1/ --osd-uuid 3c3d6c5a-c82e-4318-a8fb-134de5444ca7 --setuser ceph --setgroup ceph
- [osdev01][DEBUG ] --> ceph-volume lvm prepare successful for: /dev/sdb
- [osdev01][DEBUG ] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 --path /var/lib/ceph/osd/ceph-1 --no-mon-config
- [osdev01][DEBUG ] Running command: /bin/ln -snf /dev/ceph-95b94aa4-22df-401c-822b-dd62f82f6b08/osd-block-3c3d6c5a-c82e-4318-a8fb-134de5444ca7 /var/lib/ceph/osd/ceph-1/block
- [osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
- [osdev01][DEBUG ] Running command: /bin/systemctl enable ceph-volume@lvm-1-3c3d6c5a-c82e-4318-a8fb-134de5444ca7
- [osdev01][DEBUG ] stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-1-3c3d6c5a-c82e-4318-a8fb-134de5444ca7.service to /usr/lib/systemd/system/ceph-volume@.service.
- [osdev01][DEBUG ] Running command: /bin/systemctl start ceph-osd@1
- [osdev01][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 1
- [osdev01][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdb
- [osdev01][INFO ] checking OSD status...
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
- [osdev01][WARNIN] there is 1 OSD down
- [osdev01][WARNIN] there is 1 OSD out
- [ceph_deploy.osd][DEBUG ] Host osdev01 is now ready for osd use.
-
- $ ceph-deploy osd create --data /dev/sdb osdev02
- $ ceph-deploy osd create --data /dev/sdb osdev03
- Check how the OSD uses the disk; newer Ceph releases default to bluestore:
- $ ceph-deploy osd list osdev01
- [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd list osdev01
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] debug : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] subcommand : list
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f54d13d9ef0>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] host : ['osdev01']
- [ceph_deploy.cli][INFO ] func : <function osd at 0x7f54d1c34b90>
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] default_release : False
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
- [ceph_deploy.osd][DEBUG ] Listing disks on osdev01...
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: /usr/sbin/ceph-volume lvm list
- [osdev01][DEBUG ]
- [osdev01][DEBUG ]
- [osdev01][DEBUG ] ====== osd.0 =======
- [osdev01][DEBUG ]
- [osdev01][DEBUG ] [block] /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
- [osdev01][DEBUG ]
- [osdev01][DEBUG ] type block
- [osdev01][DEBUG ] osd id 0
- [osdev01][DEBUG ] cluster fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- [osdev01][DEBUG ] cluster name ceph
- [osdev01][DEBUG ] osd fsid 2cb30e7c-7b98-4a6c-816a-2de7201a7669
- [osdev01][DEBUG ] encrypted 0
- [osdev01][DEBUG ] cephx lockbox secret
- [osdev01][DEBUG ] block uuid AL5bfk-acAQ-9guP-tl61-A4Jf-RQOF-nFnE9o
- [osdev01][DEBUG ] block device /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
- [osdev01][DEBUG ] vdo 0
- [osdev01][DEBUG ] crush device class None
- [osdev01][DEBUG ] devices /dev/sdb
-
- $ lvs
- LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
- osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669 ceph-a2130090-fb78-4b65-838f-7496c63fa025 -wi-ao---- <7.28t
-
- $ pvs
- PV VG Fmt Attr PSize PFree
- /dev/sdb ceph-a2130090-fb78-4b65-838f-7496c63fa025 lvm2 a-- <7.28t 0
-
- # osdev01
- $ df -h | grep ceph
- tmpfs 189G 24K 189G 1% /var/lib/ceph/osd/ceph-0
-
- $ ll /var/lib/ceph/osd/ceph-0
- total 24
- lrwxrwxrwx 1 ceph ceph 93 Aug 29 15:15 block -> /dev/ceph-a2130090-fb78-4b65-838f-7496c63fa025/osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669
- -rw------- 1 ceph ceph 37 Aug 29 15:15 ceph_fsid
- -rw------- 1 ceph ceph 37 Aug 29 15:15 fsid
- -rw------- 1 ceph ceph 55 Aug 29 15:15 keyring
- -rw------- 1 ceph ceph 6 Aug 29 15:15 ready
- -rw------- 1 ceph ceph 10 Aug 29 15:15 type
- -rw------- 1 ceph ceph 2 Aug 29 15:15 whoami
-
- $ cat /var/lib/ceph/osd/ceph-0/whoami
- 0
- $ cat /var/lib/ceph/osd/ceph-0/type
- bluestore
- $ cat /var/lib/ceph/osd/ceph-0/ready
- ready
- $ cat /var/lib/ceph/osd/ceph-0/fsid
- 2cb30e7c-7b98-4a6c-816a-2de7201a7669
-
- # osdev02
- $ df -h | grep ceph
- tmpfs 189G 48K 189G 1% /var/lib/ceph/osd/ceph-1
- Check the cluster status:
- $ sudo ceph health
- HEALTH_WARN mon osdev01 is low on available space
-
- $ sudo ceph -s
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- mon osdev01 is low on available space
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev01(active), standbys: osdev03, osdev02
- osd: 3 osds: 3 up, 3 in
-
- data:
- pools: 0 pools, 0 pgs
- objects: 0 objects, 0 B
- usage: 3.0 GiB used, 22 TiB / 22 TiB avail
- pgs:
- Check the OSD status:
- $ sudo ceph osd tree
- ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
- -1 21.83066 root default
- -3 7.27689 host osdev01
- 0 hdd 7.27689 osd.0 up 1.00000 1.00000
- -5 7.27689 host osdev02
- 1 hdd 7.27689 osd.1 up 1.00000 1.00000
- -7 7.27689 host osdev03
- 2 hdd 7.27689 osd.2 up 1.00000 1.00000
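- Per-OSD capacity usage and PG distribution can also be checked (output omitted here):
- $ sudo ceph osd df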
Removing an OSD
- Mark the OSD out of the cluster:
- $ ceph osd out 0
- marked out osd.0.
- Watch the data migration:
$ ceph -w
- Stop the OSD service on the corresponding node:
$ systemctl stop ceph-osd@0
- Remove the OSD from the CRUSH map:
- $ ceph osd crush remove osd.0
- removed item id 0 name 'osd.0' from crush map
- Delete the OSD's authentication key:
- $ ceph auth del osd.0
- updated
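- To also remove the OSD id from the cluster map, ceph osd rm can be run in addition (otherwise osd.0 keeps showing up as down/out in ceph osd tree, as seen further below):
- $ ceph osd rm 0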
- Clean up the OSD's disk:
- $ sudo lvs | awk 'NR!=1 {if($1~"osd-block-") print $2 "/" $1}' | xargs -I {} sudo lvremove -y {}
- Logical volume "osd-block-2cb30e7c-7b98-4a6c-816a-2de7201a7669" successfully removed
-
- $ ceph-deploy disk zap osdev01 /dev/sdb
- [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap osdev01 /dev/sdb
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] debug : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] subcommand : zap
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f3029a04c20>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] host : osdev01
- [ceph_deploy.cli][INFO ] func : <function disk at 0x7f3029e50d70>
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] default_release : False
- [ceph_deploy.cli][INFO ] disk : ['/dev/sdb']
- [ceph_deploy.osd][DEBUG ] zapping /dev/sdb on osdev01
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
- [osdev01][DEBUG ] zeroing last few blocks of device
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdb
- [osdev01][DEBUG ] --> Zapping: /dev/sdb
- [osdev01][DEBUG ] Running command: /usr/sbin/cryptsetup status /dev/mapper/
- [osdev01][DEBUG ] stdout: /dev/mapper/ is inactive.
- [osdev01][DEBUG ] Running command: /usr/sbin/wipefs --all /dev/sdb
- [osdev01][DEBUG ] stdout: /dev/sdb: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
- [osdev01][DEBUG ] Running command: /bin/dd if=/dev/zero of=/dev/sdb bs=1M count=10
- [osdev01][DEBUG ] stderr: 10+0 records in
- [osdev01][DEBUG ] 10+0 records out
- [osdev01][DEBUG ] 10485760 bytes (10 MB) copied
- [osdev01][DEBUG ] stderr: , 0.0131341 s, 798 MB/s
- [osdev01][DEBUG ] --> Zapping successful for: /dev/sdb
- Add the OSD back:
- $ ceph-deploy osd create --data /dev/sdb osdev01
- [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
- [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create --data /dev/sdb osdev01
- [ceph_deploy.cli][INFO ] ceph-deploy options:
- [ceph_deploy.cli][INFO ] verbose : False
- [ceph_deploy.cli][INFO ] bluestore : None
- [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f6594673d40>
- [ceph_deploy.cli][INFO ] cluster : ceph
- [ceph_deploy.cli][INFO ] fs_type : xfs
- [ceph_deploy.cli][INFO ] block_wal : None
- [ceph_deploy.cli][INFO ] default_release : False
- [ceph_deploy.cli][INFO ] username : None
- [ceph_deploy.cli][INFO ] journal : None
- [ceph_deploy.cli][INFO ] subcommand : create
- [ceph_deploy.cli][INFO ] host : osdev01
- [ceph_deploy.cli][INFO ] filestore : None
- [ceph_deploy.cli][INFO ] func : <function osd at 0x7f6594abacf8>
- [ceph_deploy.cli][INFO ] ceph_conf : None
- [ceph_deploy.cli][INFO ] zap_disk : False
- [ceph_deploy.cli][INFO ] data : /dev/sdb
- [ceph_deploy.cli][INFO ] block_db : None
- [ceph_deploy.cli][INFO ] dmcrypt : False
- [ceph_deploy.cli][INFO ] overwrite_conf : False
- [ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
- [ceph_deploy.cli][INFO ] quiet : False
- [ceph_deploy.cli][INFO ] debug : False
- [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdb
- [osdev01][DEBUG ] connected to host: osdev01
- [osdev01][DEBUG ] detect platform information from remote host
- [osdev01][DEBUG ] detect machine type
- [osdev01][DEBUG ] find the location of an executable
- [ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
- [ceph_deploy.osd][DEBUG ] Deploying osd to osdev01
- [osdev01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
- [osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new df124d5a-122a-48b4-9173-87088c6e6aac
- [osdev01][DEBUG ] Running command: /usr/sbin/vgcreate --force --yes ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320 /dev/sdb
- [osdev01][DEBUG ] stdout: Physical volume "/dev/sdb" successfully created.
- [osdev01][DEBUG ] stdout: Volume group "ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320" successfully created
- [osdev01][DEBUG ] Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n osd-block-df124d5a-122a-48b4-9173-87088c6e6aac ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320
- [osdev01][DEBUG ] stdout: Logical volume "osd-block-df124d5a-122a-48b4-9173-87088c6e6aac" created.
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
- [osdev01][DEBUG ] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
- [osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
- [osdev01][DEBUG ] Running command: /bin/ln -s /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac /var/lib/ceph/osd/ceph-3/block
- [osdev01][DEBUG ] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
- [osdev01][DEBUG ] stderr: got monmap epoch 4
- [osdev01][DEBUG ] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg==
- [osdev01][DEBUG ] stdout: creating /var/lib/ceph/osd/ceph-3/keyring
- [osdev01][DEBUG ] stdout: added entity osd.3 auth auth(auid = 18446744073709551615 key=AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg== with 0 caps)
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
- [osdev01][DEBUG ] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid df124d5a-122a-48b4-9173-87088c6e6aac --setuser ceph --setgroup ceph
- [osdev01][DEBUG ] --> ceph-volume lvm prepare successful for: /dev/sdb
- [osdev01][DEBUG ] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac --path /var/lib/ceph/osd/ceph-3 --no-mon-config
- [osdev01][DEBUG ] Running command: /bin/ln -snf /dev/ceph-5cddc4d4-2b62-452a-8ba1-61df276d5320/osd-block-df124d5a-122a-48b4-9173-87088c6e6aac /var/lib/ceph/osd/ceph-3/block
- [osdev01][DEBUG ] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/block
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /dev/dm-0
- [osdev01][DEBUG ] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
- [osdev01][DEBUG ] Running command: /bin/systemctl enable ceph-volume@lvm-3-df124d5a-122a-48b4-9173-87088c6e6aac
- [osdev01][DEBUG ] stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-3-df124d5a-122a-48b4-9173-87088c6e6aac.service to /usr/lib/systemd/system/ceph-volume@.service.
- [osdev01][DEBUG ] Running command: /bin/systemctl start ceph-osd@3
- [osdev01][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 3
- [osdev01][DEBUG ] --> ceph-volume lvm create successful for: /dev/sdb
- [osdev01][INFO ] checking OSD status...
- [osdev01][DEBUG ] find the location of an executable
- [osdev01][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
- [osdev01][WARNIN] there is 1 OSD down
- [osdev01][WARNIN] there is 1 OSD out
- [ceph_deploy.osd][DEBUG ] Host osdev01 is now ready for osd use.
- Check the OSD status:
- $ ceph osd tree
- ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
- -1 21.83066 root default
- -3 7.27689 host osdev01
- 3 hdd 7.27689 osd.3 up 1.00000 1.00000
- -5 7.27689 host osdev02
- 1 hdd 7.27689 osd.1 up 1.00000 1.00000
- -7 7.27689 host osdev03
- 2 hdd 7.27689 osd.2 up 1.00000 1.00000
- 0 0 osd.0 down 0 1.00000
-
- $ ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-3/block
- {
- "/var/lib/ceph/osd/ceph-3/block": {
- "osd_uuid": "df124d5a-122a-48b4-9173-87088c6e6aac",
- "size": 8000995590144,
- "btime": "2018-09-19 15:12:17.376253",
- "description": "main",
- "bluefs": "1",
- "ceph_fsid": "383237bd-becf-49d5-9bd6-deb0bc35ab2a",
- "kv_backend": "rocksdb",
- "magic": "ceph osd volume v026",
- "mkfs_done": "yes",
- "osd_key": "AQDP9qFbXoYRERAAMMz5EHjYAdlveVdDe1uAYg==",
- "ready": "ready",
- "whoami": "3"
- }
- }
- Check the data migration status:
- $ ceph -w
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- Degraded data redundancy: 4825/16156 objects degraded (29.865%), 83 pgs degraded, 63 pgs undersized
- clock skew detected on mon.osdev02
- mon osdev01 is low on available space
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev03(active), standbys: osdev02, osdev01
- osd: 4 osds: 3 up, 3 in; 63 remapped pgs
- rgw: 3 daemons active
-
- data:
- pools: 10 pools, 176 pgs
- objects: 5.39 k objects, 19 GiB
- usage: 43 GiB used, 22 TiB / 22 TiB avail
- pgs: 4825/16156 objects degraded (29.865%)
- 88 active+clean
- 48 active+undersized+degraded+remapped+backfill_wait
- 19 active+recovery_wait+degraded
- 15 active+recovery_wait+undersized+degraded+remapped
- 5 active+recovery_wait
- 1 active+recovering+degraded
-
- io:
- recovery: 15 MiB/s, 3 objects/s
-
-
- 2018-09-19 15:14:35.149958 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4825/16156 objects degraded (29.865%), 83 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:14:40.154936 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4802/16156 objects degraded (29.723%), 83 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:14:45.155511 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4785/16156 objects degraded (29.617%), 72 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:14:50.156258 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4761/16156 objects degraded (29.469%), 70 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:14:55.157259 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4736/16156 objects degraded (29.314%), 66 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:00.157805 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4715/16156 objects degraded (29.184%), 66 pgs degraded, 63 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:05.159788 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4700/16156 objects degraded (29.091%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:10.160347 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4687/16156 objects degraded (29.011%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:15.161346 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4663/16156 objects degraded (28.862%), 65 pgs degraded, 62 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:20.163878 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4639/16156 objects degraded (28.714%), 64 pgs degraded, 62 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:25.166626 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4634/16156 objects degraded (28.683%), 64 pgs degraded, 62 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:30.168933 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4612/16156 objects degraded (28.547%), 62 pgs degraded, 61 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:35.170116 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4590/16156 objects degraded (28.410%), 62 pgs degraded, 61 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:35.310448 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
- 2018-09-19 15:15:40.170608 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4578/16156 objects degraded (28.336%), 60 pgs degraded, 60 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:41.314443 mon.osdev01 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg inactive, 1 pg peering)
- 2018-09-19 15:15:45.171537 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4564/16156 objects degraded (28.250%), 60 pgs degraded, 60 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:50.172340 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4546/16156 objects degraded (28.138%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:15:55.173243 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4536/16156 objects degraded (28.076%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:00.174125 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4514/16156 objects degraded (27.940%), 59 pgs degraded, 59 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:05.176502 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4496/16156 objects degraded (27.829%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:10.177113 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4486/16156 objects degraded (27.767%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:15.178024 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4464/16156 objects degraded (27.631%), 58 pgs degraded, 58 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:20.178774 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4457/16156 objects degraded (27.587%), 57 pgs degraded, 57 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:25.179609 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4436/16156 objects degraded (27.457%), 57 pgs degraded, 57 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:30.180333 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4426/16156 objects degraded (27.395%), 56 pgs degraded, 56 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:35.180850 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4404/16156 objects degraded (27.259%), 56 pgs degraded, 56 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:37.760009 mon.osdev01 [WRN] mon.1 172.29.101.167:6789/0 clock skew 1.47964s > max 0.5s
- 2018-09-19 15:16:40.181520 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4383/16156 objects degraded (27.129%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:45.183101 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4373/16156 objects degraded (27.067%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:50.184008 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4351/16156 objects degraded (26.931%), 55 pgs degraded, 55 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:51.434708 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
- 2018-09-19 15:16:55.184869 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4336/16156 objects degraded (26.838%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:16:56.238863 mon.osdev01 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg inactive, 1 pg peering)
- 2018-09-19 15:17:00.185629 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4318/16156 objects degraded (26.727%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:05.186503 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4296/16156 objects degraded (26.591%), 54 pgs degraded, 54 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:10.187331 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4283/16156 objects degraded (26.510%), 52 pgs degraded, 52 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:15.188170 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4261/16156 objects degraded (26.374%), 52 pgs degraded, 52 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:20.189922 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4243/16156 objects degraded (26.263%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:25.190843 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4227/16156 objects degraded (26.164%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:30.191813 mon.osdev01 [WRN] Health check update: Degraded data redundancy: 4205/16156 objects degraded (26.027%), 51 pgs degraded, 51 pgs undersized (PG_DEGRADED)
- 2018-09-19 15:17:32.348305 mon.osdev01 [WRN] Health check failed: Reduced data availability: 1 pg inactive, 1 pg peering (PG_AVAILABILITY)
- ...
-
- $ watch -n1 ceph -s
- Every 1.0s: ceph -s Wed Sep 19 15:21:12 2018
-
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- Degraded data redundancy: 3372/16156 objects degraded (20.872%), 36 pgs degraded, 36 pgs undersized
- clock skew detected on mon.osdev02
- mon osdev01 is low on available space
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev03(active), standbys: osdev02, osdev01
- osd: 4 osds: 3 up, 3 in; 36 remapped pgs
- rgw: 3 daemons active
-
- data:
- pools: 10 pools, 176 pgs
- objects: 5.39 k objects, 19 GiB
- usage: 48 GiB used, 22 TiB / 22 TiB avail
- pgs: 3372/16156 objects degraded (20.872%)
- 140 active+clean
- 35 active+undersized+degraded+remapped+backfill_wait
- 1 active+undersized+degraded+remapped+backfilling
-
- io:
- recovery: 17 MiB/s, 4 objects/s
Deploying the MDS
- Deploy the MDS service on the 3 nodes:
$ ceph-deploy mds create osdev01 osdev02 osdev03
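- The MDS daemons stay in standby until a filesystem is created. A minimal CephFS sketch (the pool names and PG counts here are illustrative assumptions, not part of this deployment):
- $ ceph osd pool create cephfs_data 32
- $ ceph osd pool create cephfs_metadata 8
- $ ceph fs new cephfs cephfs_metadata cephfs_data
- $ ceph fs ls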
Deploying the RGW
- Deploy the RGW service on the 3 nodes:
$ ceph-deploy rgw create osdev01 osdev02 osdev03
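- By default each RGW instance listens on port 7480 (civetweb); a quick sanity check against one node should return an anonymous S3 ListAllMyBucketsResult XML document (a sketch):
- $ curl http://osdev01:7480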
- Check the cluster status:
- $ sudo ceph -s
- cluster:
- id: 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- health: HEALTH_WARN
- too few PGs per OSD (22 < min 30)
-
- services:
- mon: 3 daemons, quorum osdev01,osdev02,osdev03
- mgr: osdev01(active), standbys: osdev03, osdev02
- osd: 3 osds: 3 up, 3 in
- rgw: 1 daemon active
-
- data:
- pools: 4 pools, 32 pgs
- objects: 16 objects, 3.2 KiB
- usage: 3.0 GiB used, 22 TiB / 22 TiB avail
- pgs: 31.250% pgs unknown
- 3.125% pgs not active
- 21 active+clean
- 10 unknown
- 1 creating+peering
-
- io:
- client: 2.4 KiB/s rd, 731 B/s wr, 3 op/s rd, 0 op/s wr
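- The "too few PGs per OSD" warning can be cleared by raising pg_num on the pools. A common rule of thumb is (number of OSDs x 100) / replica size, rounded up to a power of two, i.e. (3 x 100) / 3 = 100 -> 128 PGs in total across all pools. A sketch for one pool (the pool name and value here are illustrative):
- $ ceph osd pool set default.rgw.log pg_num 16
- $ ceph osd pool set default.rgw.log pgp_num 16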
Uninstalling Ceph
- Uninstall the deployed Ceph, including packages and configuration:
- # destroy and uninstall all packages
- $ ceph-deploy purge osdev01 osdev02 osdev03
-
- # destroy data
- $ ceph-deploy purgedata osdev01 osdev02 osdev03
-
- $ ceph-deploy forgetkeys
-
- # remove all keys
- $ rm -rfv ceph.*
Testing and usage
Creating a pool
- View the current pool information; you can see several default pools created by the RGW gateway:
- $ rados lspools
- .rgw.root
- default.rgw.control
- default.rgw.meta
- default.rgw.log
-
- $ rados -p .rgw.root ls
- zone_info.4741b9cf-cc27-43d8-9bbc-59eee875b4db
- zone_info.c775c6a6-036a-43ab-b558-ab0df40c3ad2
- zonegroup_info.df77b60a-8423-4570-b9ae-ae4ef06a13a2
- zone_info.0e5daa99-3863-4411-8d75-7d14a3f9a014
- zonegroup_info.f652f53f-94bb-4599-a1c1-737f792a9510
- zonegroup_info.5a4fb515-ef63-4ddc-85e0-5cf8339d9472
- zone_names.default
- zonegroups_names.default
-
- $ ceph osd pool get .rgw.root pg_num
- pg_num: 8
-
- $ ceph osd dump
- epoch 25
- fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- created 2018-08-23 10:55:49.409542
- modified 2018-08-23 16:23:00.574710
- flags sortbitwise,recovery_deletes,purged_snapdirs
- crush_version 7
- full_ratio 0.95
- backfillfull_ratio 0.9
- nearfull_ratio 0.85
- require_min_compat_client jewel
- min_compat_client jewel
- require_osd_release mimic
- pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
- pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
- pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
- pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
- max_osd 3
- osd.0 up in weight 1 up_from 5 up_thru 23 down_at 0 last_clean_interval [0,0) 172.29.101.166:6801/719880 172.29.101.166:6802/719880 172.29.101.166:6803/719880 172.29.101.166:6804/719880 exists,up 2cb30e7c-7b98-4a6c-816a-2de7201a7669
- osd.1 up in weight 1 up_from 15 up_thru 23 down_at 14 last_clean_interval [9,14) 172.29.101.167:6800/189449 172.29.101.167:6804/1189449 172.29.101.167:6805/1189449 172.29.101.167:6806/1189449 exists,up 9d3bafa9-9ea0-401c-ad67-a08ef7c2d9f7
- osd.2 up in weight 1 up_from 13 up_thru 23 down_at 0 last_clean_interval [0,0) 172.29.101.168:6800/188591 172.29.101.168:6801/188591 172.29.101.168:6802/188591 172.29.101.168:6803/188591 exists,up a41fa4e0-c80b-4091-95cc-b58af291f387
- Create a pool:
- $ ceph osd pool create glance 32 32
- pool 'glance' created
- Try to delete a pool; deletion turns out to be blocked:
- $ ceph osd pool delete glance
- Error EPERM: WARNING: this will *PERMANENTLY DESTROY* all data stored in pool glance. If you are *ABSOLUTELY CERTAIN* that is what you want, pass the pool name *twice*, followed by --yes-i-really-really-mean-it.
-
- $ ceph osd pool delete glance glance --yes-i-really-really-mean-it
- Error EPERM: pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool
- Configure the cluster to allow pool deletion:
- $ vi /etc/ceph/ceph.conf
- [mon]
- mon allow pool delete = true
-
- $ systemctl restart ceph-mon.target
- Delete the pool again:
- $ ceph osd pool delete glance glance --yes-i-really-really-mean-it
- pool 'glance' removed
Creating an object
- Create a test pool and set its replica count to 3:
- $ ceph osd pool create test-pool 128 128
- $ ceph osd lspools
- 1 .rgw.root
- 2 default.rgw.control
- 3 default.rgw.meta
- 4 default.rgw.log
- 5 test-pool
-
- $ ceph osd dump | grep pool
- pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
- pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
- pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
- pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
- pool 5 'test-pool' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 26 flags hashpspool stripe_width 0
-
- $ rados lspools
- .rgw.root
- default.rgw.control
- default.rgw.meta
- default.rgw.log
- test-pool
-
- # set replicated size
- $ ceph osd pool set test-pool size 3
- set pool 5 size to 3
-
- $ rados -p test-pool ls
- Create a test file:
$ echo "Hello Ceph, You are Awesome like MJ" > hello_ceph
- Create an object:
$ rados -p test-pool put object1 hello_ceph
- Check the object's OSD map; you can see its name, the PG and OSDs it maps to, and their status:
- $ ceph osd map test-pool object1
- osdmap e29 pool 'test-pool' (5) object 'object1' -> pg 5.bac5debc (5.3c) -> up ([0,1,2], p0) acting ([0,1,2], p0)
-
- $ rados -p test-pool ls
- object1
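- The object can also be read back and inspected with rados (a sketch; the local output path is arbitrary):
- $ rados -p test-pool get object1 /tmp/object1.out
- $ rados -p test-pool stat object1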
Creating an RBD
- Create an RBD pool:
- $ ceph osd pool create rbd 8 8
- $ rbd pool init rbd
- Create an RBD image:
$ rbd create rbd_test --size 10240
- Check the changes in RADOS and the OSDs; you can see that creating the RBD adds 3 more objects:
- $ rbd ls
- rbd_test
-
- $ rados -p rbd ls
- rbd_directory
- rbd_header.11856b8b4567
- rbd_info
- rbd_object_map.11856b8b4567
- rbd_id.rbd_test
-
- $ ceph osd dump | grep pool
- pool 1 '.rgw.root' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 flags hashpspool stripe_width 0 application rgw
- pool 2 'default.rgw.control' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 20 flags hashpspool stripe_width 0 application rgw
- pool 3 'default.rgw.meta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 22 flags hashpspool stripe_width 0 application rgw
- pool 4 'default.rgw.log' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 flags hashpspool stripe_width 0 application rgw
- pool 5 'test-pool' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 29 flags hashpspool stripe_width 0
- pool 6 'rbd' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 35 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
Mapping the RBD
- Load the RBD kernel module:
- $ uname -r
- 3.10.0-862.11.6.el7.x86_64
-
- $ modprobe rbd
-
- $ lsmod | grep rbd
- rbd 83728 0
- libceph 301687 1 rbd
- Map the RBD block device; the mapping fails because the kernel version is too old for the image's features:
- $ rbd map rbd_test
- rbd: sysfs write failed
- RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable rbd_test object-map fast-diff deep-flatten".
- In some cases useful info is found in syslog - try "dmesg | tail".
- rbd: map failed: (6) No such device or address
-
- $ dmesg | tail
- [150078.190941] Key type dns_resolver registered
- [150078.231155] Key type ceph registered
- [150078.231538] libceph: loaded (mon/osd proto 15/24)
- [150078.239110] rbd: loaded
- [152620.392095] libceph: mon1 172.29.101.167:6789 session established
- [152620.392821] libceph: client4522 fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- [152620.646943] rbd: image rbd_test: image uses unsupported features: 0x38
- [152648.322295] libceph: mon0 172.29.101.166:6789 session established
- [152648.322845] libceph: client4530 fsid 383237bd-becf-49d5-9bd6-deb0bc35ab2a
- [152648.357522] rbd: image rbd_test: image uses unsupported features: 0x38
- Check the RBD image's features:
- $ rbd info rbd_test
- rbd image 'rbd_test':
- size 10 GiB in 2560 objects
- order 22 (4 MiB objects)
- id: 11856b8b4567
- block_name_prefix: rbd_data.11856b8b4567
- format: 2
- features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
- op_features:
- flags:
- create_timestamp: Fri Aug 24 10:21:11 2018
-
- layering: layering support
- striping: striping v2 support
- exclusive-lock: exclusive locking support
- object-map: object map support (requires exclusive-lock)
- fast-diff: fast diff calculation (requires object-map)
- deep-flatten: snapshot flatten support
- journaling: journaled IO support (requires exclusive-lock)
- One way to solve this is to change Ceph's default RBD features:
- $ vi /etc/ceph/ceph.conf
- rbd_default_features = 1
-
- $ ceph --show-config | grep rbd | grep features
- rbd_default_features = 1
- Alternatively, specify the desired features when creating the RBD:
$ rbd create rbd_test --size 10G --image-format 2 --image-feature layering
- Disable the features that the kernel does not support:
- $ rbd feature disable rbd_test object-map fast-diff deep-flatten
- $ rbd info rbd_test
- rbd image 'rbd_test':
- size 10 GiB in 2560 objects
- order 22 (4 MiB objects)
- id: 11856b8b4567
- block_name_prefix: rbd_data.11856b8b4567
- format: 2
- features: layering, exclusive-lock
- op_features:
- flags:
- create_timestamp: Fri Aug 24 10:21:11 2018
- Map the RBD again:
- # rbd map rbd/rbd_test
- $ rbd map rbd_test
- /dev/rbd0
-
- $ rbd showmapped
- id pool image snap device
- 0 rbd rbd_test - /dev/rbd0
-
- $ lsblk | grep rbd0
- rbd0 252:0 0 10.2G 0 disk
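- To have the mapping restored automatically at boot, the image can be listed in /etc/ceph/rbdmap and the rbdmap service enabled (a sketch, assuming the default rbd pool and the client.admin keyring):
- $ echo "rbd/rbd_test id=admin,keyring=/etc/ceph/ceph.client.admin.keyring" >> /etc/ceph/rbdmap
- $ systemctl enable rbdmap.service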
Using the RBD
- Create a filesystem:
- $ mkfs.xfs /dev/rbd0
- meta-data=/dev/rbd0 isize=512 agcount=16, agsize=167936 blks
- = sectsz=512 attr=2, projid32bit=1
- = crc=1 finobt=0, sparse=0
- data = bsize=4096 blocks=2682880, imaxpct=25
- = sunit=1024 swidth=1024 blks
- naming =version 2 bsize=4096 ascii-ci=0 ftype=1
- log =internal log bsize=4096 blocks=2560, version=2
- = sectsz=512 sunit=8 blks, lazy-count=1
- realtime =none extsz=4096 blocks=0, rtextents=0
- Mount the RBD and write some data:
- $ mkdir -pv /mnt/rbd_test
- mkdir: created directory "/mnt/rbd_test"
-
- $ mount /dev/rbd0 /mnt/rbd_test
-
- $ dd if=/dev/zero of=/mnt/rbd_test/fi1e1 count=100 bs=1M
- Check the changes in RADOS; you can see that an RBD is split into many small objects:
- $ ll -h /mnt/rbd_test/
- total 100M
- -rw-r--r-- 1 root root 100M Aug 24 11:35 fi1e1
-
- $ rados -p rbd ls | grep 1185
- rbd_data.11856b8b4567.0000000000000003
- rbd_data.11856b8b4567.00000000000003d8
- rbd_data.11856b8b4567.0000000000000d74
- rbd_data.11856b8b4567.0000000000001294
- rbd_data.11856b8b4567.0000000000000522
- rbd_data.11856b8b4567.0000000000000007
- rbd_data.11856b8b4567.0000000000001338
- rbd_data.11856b8b4567.0000000000000018
- rbd_data.11856b8b4567.000000000000000d
- rbd_data.11856b8b4567.0000000000000148
- rbd_data.11856b8b4567.00000000000000a4
- rbd_data.11856b8b4567.00000000000013dc
- rbd_data.11856b8b4567.0000000000000013
- rbd_header.11856b8b4567
- rbd_data.11856b8b4567.0000000000000000
- rbd_data.11856b8b4567.0000000000000a40
- rbd_data.11856b8b4567.000000000000114c
- rbd_data.11856b8b4567.0000000000000008
- rbd_data.11856b8b4567.0000000000000b88
- rbd_data.11856b8b4567.0000000000000009
- rbd_data.11856b8b4567.0000000000000521
- rbd_data.11856b8b4567.0000000000000010
- rbd_data.11856b8b4567.00000000000008f8
- rbd_data.11856b8b4567.0000000000000012
- rbd_data.11856b8b4567.0000000000000016
- rbd_data.11856b8b4567.0000000000000014
- rbd_data.11856b8b4567.000000000000001a
- rbd_data.11856b8b4567.0000000000000854
- rbd_data.11856b8b4567.000000000000000c
- rbd_data.11856b8b4567.0000000000000ae4
- rbd_data.11856b8b4567.000000000000047c
- rbd_data.11856b8b4567.0000000000000005
- rbd_data.11856b8b4567.0000000000000e18
- rbd_data.11856b8b4567.000000000000000f
- rbd_data.11856b8b4567.0000000000000cd0
- rbd_data.11856b8b4567.00000000000001ec
- rbd_data.11856b8b4567.0000000000000017
- rbd_data.11856b8b4567.0000000000000a3b
- rbd_data.11856b8b4567.0000000000000011
- rbd_data.11856b8b4567.000000000000070c
- rbd_data.11856b8b4567.0000000000000520
- rbd_data.11856b8b4567.00000000000010a8
- rbd_data.11856b8b4567.0000000000000015
- rbd_data.11856b8b4567.0000000000000004
- rbd_data.11856b8b4567.000000000000099c
- rbd_data.11856b8b4567.0000000000000001
- rbd_data.11856b8b4567.000000000000000b
- rbd_data.11856b8b4567.0000000000000c2c
- rbd_data.11856b8b4567.0000000000000334
- rbd_data.11856b8b4567.00000000000005c4
- rbd_data.11856b8b4567.000000000000000a
- rbd_data.11856b8b4567.0000000000000006
- rbd_data.11856b8b4567.0000000000000668
- rbd_data.11856b8b4567.0000000000001004
- rbd_data.11856b8b4567.0000000000000019
- rbd_data.11856b8b4567.00000000000011f0
- rbd_data.11856b8b4567.000000000000000e
- rbd_data.11856b8b4567.0000000000000f60
- rbd_data.11856b8b4567.00000000000007b0
- rbd_data.11856b8b4567.0000000000000290
- rbd_data.11856b8b4567.0000000000000ebc
- rbd_data.11856b8b4567.0000000000000002
-
- $ rados -p rbd ls | grep 1185 | wc -l
- 62
- 再次写入数据并查看变化,随着写入的数据变多,其中的对象也会变多:
- $ dd if=/dev/zero of=/mnt/rbd_test/fi1e1 count=200 bs=1M
- 记录了200+0 的读入
- 记录了200+0 的写出
- 209715200字节(210 MB)已复制,0.441176 秒,475 MB/秒
-
- $ rados -p rbd ls | grep 1185 | wc -l
- 87
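- RBD 镜像是按需分配空间的,可以用 rbd du 对比实际占用与置备大小(示意命令):
- $ rbd du rbd_test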
调整RBD
- 调整 RBD 大小:
- $ rbd resize rbd_test --size 20480
- Resizing image: 100% complete...done.
- 调整文件系统大小:
- $ xfs_growfs -d /mnt/rbd_test/
- meta-data=/dev/rbd0 isize=512 agcount=16, agsize=167936 blks
- = sectsz=512 attr=2, projid32bit=1
- = crc=1 finobt=0 spinodes=0
- data = bsize=4096 blocks=2682880, imaxpct=25
- = sunit=1024 swidth=1024 blks
- naming =version 2 bsize=4096 ascii-ci=0 ftype=1
- log =internal bsize=4096 blocks=2560, version=2
- = sectsz=512 sunit=8 blks, lazy-count=1
- realtime =none extsz=4096 blocks=0, rtextents=0
- data blocks changed from 2682880 to 5242880
- 查看 RBD 变化:
- $ rbd info rbd_test
- rbd image 'rbd_test':
- size 20 GiB in 5120 objects
- order 22 (4 MiB objects)
- id: 11856b8b4567
- block_name_prefix: rbd_data.11856b8b4567
- format: 2
- features: layering, exclusive-lock
- op_features:
- flags:
- create_timestamp: Fri Aug 24 10:21:11 2018
-
- $ lsblk | grep rbd0
- rbd0 252:0 0 20G 0 disk /mnt/rbd_test
-
- $ df -h | grep rbd
- /dev/rbd0 20G 234M 20G 2% /mnt/rbd_test
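- rbd resize 默认只允许扩容,如果要缩小镜像需要加 --allow-shrink 参数(示意命令如下);缩小前需要先缩小其上的文件系统,而 XFS 不支持缩小,谨慎操作:
- $ rbd resize rbd_test --size 10240 --allow-shrink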
快照RBD
- 创建测试文件:
- $ echo "Hello Ceph This is snapshot test" > /mnt/rbd_test/file2
-
- $ ls -lh /mnt/rbd_test/
- 总用量 201M
- -rw-r--r-- 1 root root 200M 8月 24 15:46 fi1e1
- -rw-r--r-- 1 root root 33 8月 24 15:51 file2
-
- $ cat /mnt/rbd_test/file2
- Hello Ceph This is snapshot test
- 创建 RBD 快照:
- $ rbd snap create rbd_test@snap1
- $ rbd snap ls rbd_test
- SNAPID NAME SIZE TIMESTAMP
- 4 snap1 20 GiB Fri Aug 24 15:52:49 2018
- 删除文件:
- $ rm -rfv /mnt/rbd_test/file2
- 已删除"/mnt/rbd_test/file2"
- $ ls -lh /mnt/rbd_test/
- 总用量 200M
- -rw-r--r-- 1 root root 200M 8月 24 15:46 fi1e1
- 卸载并取消 RBD 映射:
- $ umount /mnt/rbd_test
- $ rbd unmap rbd_test
- 回滚 RBD:
- $ rbd snap rollback rbd_test@snap1
- Rolling back to snapshot: 100% complete...done.
- 重新映射和挂载 RBD,并查看文件:
- $ rbd map rbd_test
- /dev/rbd0
-
- $ mount /dev/rbd0 /mnt/rbd_test
- $ ls -lh /mnt/rbd_test/
- 总用量 201M
- -rw-r--r-- 1 root root 200M 8月 24 15:46 fi1e1
- -rw-r--r-- 1 root root 33 8月 24 15:51 file2
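- 基于快照还可以克隆出新的镜像(依赖 layering 特性),以下为示意流程:先保护快照,再克隆,必要时用 flatten 解除克隆对父快照的依赖:
- $ rbd snap protect rbd_test@snap1
- $ rbd clone rbd_test@snap1 rbd_test_clone
- $ rbd children rbd_test@snap1
- $ rbd flatten rbd_test_clone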
观察PG
- 随意查看 rbd 存储池中对象的 OSDMap,可以看到其中 PG 的 OSD 顺序并不完全相同,而且同一个 Pool 中对象的 PG 的 ID 中小数点前的数字(即存储池 ID)是一样的:
- $ ceph osd map rbd rbd_info
- osdmap e74 pool 'rbd' (6) object 'rbd_info' -> pg 6.ac0e573a (6.2) -> up ([1,0,2], p1) acting ([1,0,2], p1)
-
- $ ceph osd map rbd rbd_directory
- osdmap e74 pool 'rbd' (6) object 'rbd_directory' -> pg 6.30a98c1c (6.4) -> up ([0,1,2], p0) acting ([0,1,2], p0)
-
- $ ceph osd map rbd rbd_id.rbd_test
- osdmap e74 pool 'rbd' (6) object 'rbd_id.rbd_test' -> pg 6.818788b3 (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
-
- $ ceph osd map rbd rbd_data.11856b8b4567.0000000000000022
- osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.0000000000000022' -> pg 6.deee7c73 (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
-
- $ ceph osd map rbd rbd_data.11856b8b4567.000000000000000a
- osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.000000000000000a' -> pg 6.561c344b (6.3) -> up ([1,2,0], p1) acting ([1,2,0], p1)
-
- $ ceph osd map rbd rbd_data.11856b8b4567.00000000000007b0
- osdmap e74 pool 'rbd' (6) object 'rbd_data.11856b8b4567.00000000000007b0' -> pg 6.a603e1f (6.7) -> up ([1,0,2], p1) acting ([1,0,2], p1)
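- 也可以直接按 PG 的 ID 查看它映射到的 OSD(示意命令):
- $ ceph pg map 6.3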
- 创建一个两副本的存储池,可以看到同一个存储池中不同对象的 PG 也可能会使用不同的 OSD 组合:
- $ ceph osd pool create pg_test 8 8
- pool 'pg_test' created
-
- $ ceph osd dump | grep pg_test
- pool 12 'pg_test' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 75 flags hashpspool stripe_width 0
-
- $ ceph osd pool set pg_test size 2
- set pool 12 size to 2
- $ ceph osd dump | grep pg_test
- pool 12 'pg_test' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 78 flags hashpspool stripe_width 0
-
- $ rados -p pg_test put object1 /etc/hosts
- $ rados -p pg_test put object2 /etc/hosts
- $ rados -p pg_test put object3 /etc/hosts
- $ rados -p pg_test put object4 /etc/hosts
- $ rados -p pg_test put object5 /etc/hosts
-
- $ rados -p pg_test ls
- object1
- object2
- object3
- object4
- object5
-
- $ ceph osd map pg_test object1
- osdmap e79 pool 'pg_test' (12) object 'object1' -> pg 12.bac5debc (12.4) -> up ([2,0], p2) acting ([2,0], p2)
-
- $ ceph osd map pg_test object2
- osdmap e79 pool 'pg_test' (12) object 'object2' -> pg 12.f85a416a (12.2) -> up ([2,0], p2) acting ([2,0], p2)
-
- $ ceph osd map pg_test object3
- osdmap e79 pool 'pg_test' (12) object 'object3' -> pg 12.f877ac20 (12.0) -> up ([1,0], p1) acting ([1,0], p1)
-
- $ ceph osd map pg_test object4
- osdmap e79 pool 'pg_test' (12) object 'object4' -> pg 12.9d9216ab (12.3) -> up ([2,1], p2) acting ([2,1], p2)
-
- $ ceph osd map pg_test object5
- osdmap e79 pool 'pg_test' (12) object 'object5' -> pg 12.e1acd6d (12.5) -> up ([1,2], p1) acting ([1,2], p1)
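- 观察完毕后可以删除 pg_test 存储池(删除存储池需要 Monitor 允许 mon_allow_pool_delete,以下命令仅作示意,会删除池中全部数据,谨慎操作):
- $ ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
- $ ceph osd pool rm pg_test pg_test --yes-i-really-really-mean-it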
测试性能
- 写入性能测试(使用 --no-cleanup 保留写入的对象,供后面的读取测试使用):
- $ rados bench -p test-pool 10 write --no-cleanup
- hints = 1
- Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
- Object prefix: benchmark_data_osdev01_1827771
- sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
- 0 0 0 0 0 0 - 0
- 1 16 31 15 59.8716 60 0.388146 0.666288
- 2 16 49 33 65.9176 72 0.62486 0.824162
- 3 16 65 49 65.2595 64 1.18038 0.834558
- 4 16 86 70 69.8978 84 0.657194 0.834779
- 5 16 107 91 72.7115 84 0.594541 0.829814
- 6 16 125 109 72.5838 72 0.371435 0.796664
- 7 16 149 133 75.8989 96 1.17764 0.803259
- 8 16 165 149 74.4101 64 0.568129 0.797091
- 9 16 185 169 75.01 80 0.813372 0.81463
- 10 16 203 187 74.7085 72 0.728715 0.812529
- Total time run: 10.3161
- Total writes made: 203
- Write size: 4194304
- Object size: 4194304
- Bandwidth (MB/sec): 78.7122
- Stddev Bandwidth: 11.1634
- Max bandwidth (MB/sec): 96
- Min bandwidth (MB/sec): 60
- Average IOPS: 19
- Stddev IOPS: 2
- Max IOPS: 24
- Min IOPS: 15
- Average Latency(s): 0.80954
- Stddev Latency(s): 0.293645
- Max latency(s): 1.77366
- Min latency(s): 0.240024
- 顺序读取性能测试:
- $ rados bench -p test-pool 10 seq
- hints = 1
- sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
- 0 0 0 0 0 0 - 0
- 1 16 72 56 223.808 224 0.0519066 0.217292
- 2 16 111 95 189.736 156 0.658876 0.289657
- 3 16 160 144 191.663 196 0.0658452 0.301259
- 4 16 203 187 186.745 172 0.210803 0.297584
- Total time run: 4.43386
- Total reads made: 203
- Read size: 4194304
- Object size: 4194304
- Bandwidth (MB/sec): 183.136
- Average IOPS: 45
- Stddev IOPS: 7
- Max IOPS: 56
- Min IOPS: 39
- Average Latency(s): 0.346754
- Max latency(s): 1.37891
- Min latency(s): 0.0249563
- 随机读取性能测试:
- $ rados bench -p test-pool 10 rand
- hints = 1
- sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
- 0 0 0 0 0 0 - 0
- 1 16 59 43 171.94 172 0.271225 0.222279
- 2 16 108 92 183.95 196 1.06429 0.275433
- 3 16 153 137 182.618 180 0.00350975 0.304582
- 4 16 224 208 207.951 284 0.0678476 0.278888
- 5 16 267 251 200.757 172 0.00386545 0.289519
- 6 16 319 303 201.955 208 0.866646 0.294983
- 7 16 360 344 196.529 164 0.00428517 0.30615
- 8 16 405 389 194.458 180 0.903073 0.311316
- 9 16 455 439 195.071 200 0.00368576 0.316057
- 10 16 517 501 200.36 248 0.621325 0.309242
- Total time run: 10.5614
- Total reads made: 518
- Read size: 4194304
- Object size: 4194304
- Bandwidth (MB/sec): 196.187
- Average IOPS: 49
- Stddev IOPS: 9
- Max IOPS: 71
- Min IOPS: 41
- Average Latency(s): 0.321834
- Max latency(s): 1.16304
- Min latency(s): 0.0026629
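- 由于写入测试使用了 --no-cleanup,读取测试结束后可以手动清理 benchmark 生成的对象(示意命令):
- $ rados -p test-pool cleanup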
- 使用 fio 进行测试:
- $ yum install -y fio "*librbd*"
-
- $ rbd create fio_test --size 20480
-
- $ vi write.fio
- [global]
- description="write test with block size of 4M"
- ioengine=rbd
- clustername=ceph
- clientname=admin
- pool=rbd
- rbdname=fio_test
- iodepth=32
- runtime=120
- rw=write
- bs=4M
-
- [logging]
- write_iops_log=write_iops_log
- write_bw_log=write_bw_log
- write_lat_log=write_lat_log
-
-
- $ fio write.fio
- logging: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=rbd, iodepth=32
- fio-3.1
- Starting 1 process
- Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
- logging: (groupid=0, jobs=1): err= 0: pid=161962: Wed Aug 29 19:17:17 2018
- Description : ["write test with block size of 4M"]
- write: IOPS=15, BW=60.4MiB/s (63.3MB/s)(7252MiB/120085msec)
- slat (usec): min=665, max=14535, avg=1584.29, stdev=860.28
- clat (msec): min=1828, max=4353, avg=2092.28, stdev=180.12
- lat (msec): min=1829, max=4354, avg=2093.87, stdev=180.15
- clat percentiles (msec):
- | 1.00th=[ 1838], 5.00th=[ 1938], 10.00th=[ 1989], 20.00th=[ 2022],
- | 30.00th=[ 2039], 40.00th=[ 2056], 50.00th=[ 2072], 60.00th=[ 2106],
- | 70.00th=[ 2123], 80.00th=[ 2165], 90.00th=[ 2198], 95.00th=[ 2232],
- | 99.00th=[ 2333], 99.50th=[ 3977], 99.90th=[ 4111], 99.95th=[ 4329],
- | 99.99th=[ 4329]
- bw ( KiB/s): min= 963, max= 2294, per=3.26%, avg=2013.72, stdev=117.50, samples=1813
- iops : min= 1, max= 1, avg= 1.00, stdev= 0.00, samples=1813
- lat (msec) : 2000=13.40%, >=2000=86.60%
- cpu : usr=1.94%, sys=0.40%, ctx=157, majf=0, minf=157364
- IO depths : 1=2.3%, 2=6.0%, 4=12.6%, 8=25.2%, 16=50.3%, 32=3.6%, >=64=0.0%
- submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
- complete : 0=0.0%, 4=97.0%, 8=0.0%, 16=0.0%, 32=3.0%, 64=0.0%, >=64=0.0%
- issued rwt: total=0,1813,0, short=0,0,0, dropped=0,0,0
- latency : target=0, window=0, percentile=100.00%, depth=32
-
- Run status group 0 (all jobs):
- WRITE: bw=60.4MiB/s (63.3MB/s), 60.4MiB/s-60.4MiB/s (63.3MB/s-63.3MB/s), io=7252MiB (7604MB), run=120085-120085msec
-
- Disk stats (read/write):
- sda: ios=5/653, merge=0/6, ticks=6/2818, in_queue=2824, util=0.17%
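- 也可以参照上面的写入任务再编写一个随机读测试,以下 job 文件仅为笔者示意,块大小和队列深度可按需调整:
- $ vi randread.fio
- [global]
- ioengine=rbd
- clientname=admin
- pool=rbd
- rbdname=fio_test
- iodepth=32
- runtime=120
- rw=randread
- bs=4k
-
- [randread-test]
-
- $ fio randread.fio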