Travis Rhoden
2013-05-24 19:40:46 UTC
Hi folks,
It's the first time I've gotten to try out ceph-deploy. I'm starting
with a small test system, where I am running everything locally on one
chassis. I've run several Ceph clusters now, so I am aware of the
implications of this -- it's just a test setup.
I walked through the docs, and did:
ceph-deploy install
ceph-deploy new
ceph-deploy mon create
All looked good.
When I ran ceph-deploy gatherkeys, I saw errors about it not finding
keys... I've seen this in the ML a few times now.
I dug through logs, and found that it appears that the upstart job
ceph-create-keys hadn't created the keys for me (hence, no keys to
gather). Looking in /var/log/upstart/ceph-create-keys.log, I see:
INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
repeated ad infinitum. So the mon is not in quorum. Question is,
why? Since there is only one mon, it should always be in quorum. =)
The mon log shows:
2013-05-24 11:58:50.959058 7fe8cb5c1780 0 ceph version 0.61.2
(fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 3124
2013-05-24 11:58:50.973588 7fe8cb5c1780 -1 asok(0x2ea8000)
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen:
failed to bind the UNIX domain socket to
'/var/run/ceph/ceph-mon.ceph.asok': (2) No such file or directory
2013-05-24 11:58:51.021834 7f992fa96780 0 ceph version 0.61.2
(fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 3130
2013-05-24 11:58:51.037512 7f992fa96780 1 mon.ceph@-1(probing) e0
preinit fsid 773097e3-10d1-425a-961a-40480bec0493
2013-05-24 11:58:51.037617 7f992fa96780 1 mon.ceph@-1(probing) e0
initial_members ceph-cluster, filtering seed monmap
2013-05-24 11:58:51.039514 7f992fa7f700 0 -- 10.10.1.1:6789/0 >>
0.0.0.0:0/1 pipe(0x1aea500 sd=20 :0 s=1 pgs=0 cs=0 l=0).fault
2013-05-24 11:59:51.038094 7f992a093700 0
mon.ceph@-1(probing).data_health(0) update_stats avail 91% total
59094864 used 1867904 avail 54225112
That last line is just repeated forever. I saw the bit about not
being able to bind to the admin socket, but when I tried "ceph
--admin-daemon <admin sock> config show", it worked, so I think the
socket is working fine. In fact, I ran the command that
/usr/sbin/ceph-create-keys uses to check on status, and here is what
that gives:
root@ceph:~# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph.asok mon_status
{ "name": "ceph",
"rank": -1,
"state": "probing",
"election_epoch": 0,
"quorum": [],
"outside_quorum": [],
"extra_probe_peers": [
"10.10.1.1:6789\/0"],
"monmap": { "epoch": 0,
"fsid": "773097e3-10d1-425a-961a-40480bec0493",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "ceph-cluster",
"addr": "0.0.0.0:0\/1"}]}}
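For anyone comparing notes, here is a minimal sketch of how the mon_status output above could be checked programmatically. It only looks at fields that appear in that output (name, rank, monmap.mons). The helper name diagnose_mon_status and the three checks are assumptions made for illustration, not anything ceph-create-keys or ceph-deploy does itself, and the findings are things to look into rather than a confirmed diagnosis.

```python
import json

# The mon_status output quoted above, pasted verbatim. A raw string keeps
# the "\/" sequences intact; they are valid JSON escapes for "/".
SAMPLE = r'''
{ "name": "ceph",
  "rank": -1,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": [],
  "extra_probe_peers": [
        "10.10.1.1:6789\/0"],
  "monmap": { "epoch": 0,
      "fsid": "773097e3-10d1-425a-961a-40480bec0493",
      "modified": "0.000000",
      "created": "0.000000",
      "mons": [
            { "rank": 0,
              "name": "ceph-cluster",
              "addr": "0.0.0.0:0\/1"}]}}
'''


def diagnose_mon_status(status):
    """Return a list of observations about a parsed mon_status dict.

    Hypothetical helper: the checks are guesses at what is worth a look,
    not an official list of failure conditions.
    """
    findings = []
    name = status.get("name")
    mons = status.get("monmap", {}).get("mons", [])
    monmap_names = [m.get("name") for m in mons]

    # A negative rank suggests the daemon has no rank in the monmap.
    rank = status.get("rank", -1)
    if rank < 0:
        findings.append("rank is %d (negative)" % rank)

    # The daemon's own name should normally appear in the monmap.
    if name not in monmap_names:
        findings.append(
            "mon name %r is not among monmap names %r" % (name, monmap_names)
        )

    # An all-zero address looks like a placeholder that was never filled in.
    for m in mons:
        if m.get("addr", "").startswith("0.0.0.0:"):
            findings.append(
                "monmap entry %r has address %s" % (m.get("name"), m.get("addr"))
            )
    return findings


if __name__ == "__main__":
    for finding in diagnose_mon_status(json.loads(SAMPLE)):
        print(finding)
```

On the output above, this flags that the daemon calls itself "ceph" while the only monmap entry (and initial_members in the mon log) is "ceph-cluster", and that the entry's address is 0.0.0.0:0/1.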
So, how do I go about figuring out why the mon is not in quorum? I'm
stumped. =)
Thanks,
- Travis