
Fixing errors when creating the initial Ceph monitor(s)

 

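The official documentation does not call these points out, and most Ceph setup articles found online leave steps out. As a result, creating the initial Ceph monitor(s) can fail with a variety of errors. This article describes several fixes for reference.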

Run ceph-deploy mon create-initial

Part of the error output:

[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor: mon.ceph2, might not be running yet
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor ceph2 does not exist in monmap
[ceph2][WARNIN] neither public_addr nor public_network keys are defined for monitors
[ceph2][WARNIN] monitors may not be able to form quorum

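Note the public_network warning in the output: it appears because public_network is not set in ceph.conf.

Fix:

Edit the ceph.conf configuration file and add public_network = 192.168.1.0/24 (choose the subnet that matches your own network).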
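
As a minimal sketch of the ceph.conf change above, the shell function below adds public_network to the [global] section only when the key is not already set. The function name ensure_public_network is hypothetical. The sketch assumes that the file being edited is the ceph.conf in the ceph-deploy working directory and that it contains a [global] section; 192.168.1.0/24 is the example subnet from this article.

```shell
# Hypothetical helper: make sure ceph.conf defines public_network.
# Assumption: the file has a [global] section; if it does not,
# nothing is added.
ensure_public_network() {
    conf="$1"      # path to ceph.conf
    network="$2"   # e.g. 192.168.1.0/24, adjust to your own subnet

    # Ceph accepts both "public_network" and "public network".
    if grep -Eq '^[[:space:]]*public[_ ]network[[:space:]]*=' "$conf"; then
        return 0   # already configured, leave the file untouched
    fi

    tmp="$(mktemp)"
    # Copy every line and emit the new key right after the [global] header.
    awk -v net="$network" '
        { print }
        /^\[global\]/ { print "public_network = " net }
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

A possible call would be: ensure_public_network ./ceph.conf 192.168.1.0/24. Calling it a second time changes nothing, because the key is then already present.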


After this change, running ceph-deploy mon create-initial again still fails. Part of the error output:

[ceph3][WARNIN] provided hostname must match remote hostname
[ceph3][WARNIN] provided hostname: ceph3
[ceph3][WARNIN] remote hostname: localhost
[ceph3][WARNIN] monitors may not reach quorum and create-keys will not complete
[ceph3][WARNIN] ********************************************************************************
[ceph3][DEBUG ] deploying mon to ceph3
[ceph3][DEBUG ] get remote short hostname
[ceph3][DEBUG ] remote hostname: localhost
[ceph3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy][ERROR ] GenericError: Failed to create 3 monitors

The error says that the existing /etc/ceph/ceph.conf has different content, and suggests using --overwrite-conf to overwrite it.

The command is:

ceph-deploy --overwrite-conf config push ceph1 ceph2 ceph3


After this change, running ceph-deploy mon create-initial again still fails. Part of the error output:

[ceph3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph3][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph3 monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] ceph1
[ceph_deploy.mon][ERROR ] ceph3
[ceph_deploy.mon][ERROR ] ceph2

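Investigation showed that the nodes' hostnames did not match the entries in /etc/hosts (the earlier log already reported remote hostname: localhost where ceph3 was expected).

Fix: change each node's hostname so that it matches /etc/hosts.

On node 1, run: hostnamectl set-hostname ceph1
On node 2, run: hostnamectl set-hostname ceph2
On node 3, run: hostnamectl set-hostname ceph3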
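
To confirm the hostname fix on a node, something like the following could be used. It is only a sketch: the helper names hosts_has_name and hostname_matches are hypothetical, and it assumes /etc/hosts uses the usual layout of an IP address followed by one or more names.

```shell
# Hypothetical helper: succeed if NAME is listed as a hostname or alias
# on a non-comment line of the given hosts file.
hosts_has_name() {
    hosts_file="$1"
    name="$2"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }        # skip whole-line comments
        {
            sub(/#.*/, "")               # drop trailing comments
            for (i = 2; i <= NF; i++)    # field 1 is the IP address
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}

# Hypothetical helper: succeed if the local short hostname equals NAME.
hostname_matches() {
    [ "$(hostname -s)" = "$1" ]
}
```

On ceph1, for example, running hostname_matches ceph1 && hosts_has_name /etc/hosts ceph1 should succeed once the hostname has been changed.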


After this change, running ceph-deploy mon create-initial again still fails, but this time with a different error. The middle part of the output:

[ceph2][ERROR ] no valid command found; 10 closest matches:
[ceph2][ERROR ] perf dump {<logger>} {<counter>}
[ceph2][ERROR ] log reopen
[ceph2][ERROR ] help
[ceph2][ERROR ] git_version
[ceph2][ERROR ] log flush
[ceph2][ERROR ] log dump
[ceph2][ERROR ] config unset <var>
[ceph2][ERROR ] config show
[ceph2][ERROR ] get_command_descriptions
[ceph2][ERROR ] dump_mempools
[ceph2][ERROR ] admin_socket: invalid command
[ceph_deploy.mon][WARNIN] mon.ceph2 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

Fix: run sudo pkill ceph on every node, then run ceph-deploy mon create-initial again on the deploy node.
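
The two steps above can be sketched as a small script run from the deploy node. This is an illustration under assumptions, not part of the original procedure: it assumes the deploy node can ssh to each node and that sudo works there without a password prompt. The names run and reset_and_retry_mons are hypothetical. By default the sketch only prints the commands (DRY_RUN=1); set DRY_RUN=0 to actually execute them.

```shell
# Node names taken from this article; adjust to your cluster.
NODES="${NODES:-ceph1 ceph2 ceph3}"
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop leftover ceph processes on every node,
# then retry the initial monitor creation from the deploy node.
reset_and_retry_mons() {
    for node in $NODES; do
        # pkill exits non-zero when no process matched; ignore that case.
        run ssh "$node" sudo pkill ceph || true
    done
    run ceph-deploy mon create-initial
}
```

Running reset_and_retry_mons with the default settings prints one ssh line per node followed by the ceph-deploy command, which makes it easy to review what would be executed.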

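This time the ERROR messages are gone: the initial monitor(s) are configured and all keys have been gathered. The following keyrings now appear in the current directory: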

ceph.bootstrap-mds.keyring
ceph.bootstrap-mgr.keyring
ceph.bootstrap-osd.keyring
ceph.bootstrap-rgw.keyring
ceph.client.admin.keyring
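
A quick way to check that every keyring in the list above is present is sketched below. The names EXPECTED_KEYRINGS and missing_keyrings are hypothetical, and the list of files is simply the one shown in this article; it may differ for other Ceph releases.

```shell
# Keyring names as listed in this article.
EXPECTED_KEYRINGS="ceph.bootstrap-mds.keyring ceph.bootstrap-mgr.keyring ceph.bootstrap-osd.keyring ceph.bootstrap-rgw.keyring ceph.client.admin.keyring"

# Hypothetical helper: print each expected keyring that is missing
# from DIR (default: current directory); return 0 only if none is missing.
missing_keyrings() {
    dir="${1:-.}"
    status=0
    for k in $EXPECTED_KEYRINGS; do
        if [ ! -f "$dir/$k" ]; then
            echo "$k"
            status=1
        fi
    done
    return "$status"
}
```

Running missing_keyrings in the ceph-deploy working directory prints nothing and succeeds when all five files exist.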

