Discussion:
[ceph-users] ceph-deploy, single mon not in quorum
Travis Rhoden
2013-05-24 19:40:46 UTC
Permalink
Hi folks,

It's the first time I've gotten to try out ceph-deploy. I'm starting
with a small test system, where I am running everything locally on one
chassis. I've run several Ceph clusters now, so I am aware of the
implications of this -- it's just a test setup.

I walked through the docs, and did:
ceph-deploy install
ceph-deploy new
ceph-deploy mon create

All looked good.
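
With the hostname written out (the name below is just a placeholder for
the single box, not the real hostname), that sequence took the form:

ceph-deploy install node1
ceph-deploy new node1
ceph-deploy mon create node1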

When I ran ceph-deploy gatherkeys, I saw errors about it not finding
keys... I've seen this in the ML a few times now.

I dug through the logs and found that the upstart job ceph-create-keys
apparently hadn't created the keys for me (hence, no keys to
gather). Looking in /var/log/upstart/ceph-create-keys.log, I see:

INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'

repeated ad infinitum. So the mon is not in quorum. Question is,
why? Since there is only one mon, it should always be in quorum. =)
The mon log shows:

2013-05-24 11:58:50.959058 7fe8cb5c1780 0 ceph version 0.61.2
(fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 3124
2013-05-24 11:58:50.973588 7fe8cb5c1780 -1 asok(0x2ea8000)
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen:
failed to bind the UNIX domain socket to
'/var/run/ceph/ceph-mon.ceph.asok': (2) No such file or directory
2013-05-24 11:58:51.021834 7f992fa96780 0 ceph version 0.61.2
(fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mon, pid 3130
2013-05-24 11:58:51.037512 7f992fa96780 1 mon.ceph@-1(probing) e0
preinit fsid 773097e3-10d1-425a-961a-40480bec0493
2013-05-24 11:58:51.037617 7f992fa96780 1 mon.ceph@-1(probing) e0
initial_members ceph-cluster, filtering seed monmap
2013-05-24 11:58:51.039514 7f992fa7f700 0 -- 10.10.1.1:6789/0 >>
0.0.0.0:0/1 pipe(0x1aea500 sd=20 :0 s=1 pgs=0 cs=0 l=0).fault
2013-05-24 11:59:51.038094 7f992a093700 0
mon.ceph@-1(probing).data_health(0) update_stats avail 91% total
59094864 used 1867904 avail 54225112

That last line is just repeated forever. I saw the bit about not
being able to bind to the admin socket, but when I tried "ceph
--admin-daemon <admin sock> config show", it worked, so I think the
socket is working fine. In fact, I ran the command that
/usr/sbin/ceph-create-keys uses to check on status, and here is what
that gives:

root@ceph:~# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph.asok mon_status
{ "name": "ceph",
"rank": -1,
"state": "probing",
"election_epoch": 0,
"quorum": [],
"outside_quorum": [],
"extra_probe_peers": [
"10.10.1.1:6789\/0"],
"monmap": { "epoch": 0,
"fsid": "773097e3-10d1-425a-961a-40480bec0493",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "ceph-cluster",
"addr": "0.0.0.0:0\/1"}]}}


So, how do I go about figuring out why the mon is not in quorum? I'm
stumped. =)

Thanks,

- Travis
Mordur Ingolfsson
2014-01-09 05:40:07 UTC
Permalink
Hi Travis,

Did you figure this out? I'm dealing with exactly the same thing over here.

Best,
Moe
Travis Rhoden
2014-01-09 14:45:28 UTC
Permalink
Hi Mordur,

I'm definitely straining my memory on this one, but happy to help if I can.

I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.

Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.

Sorry that's probably not much help.

- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Best,
Moe
Alfredo Deza
2014-01-09 14:48:09 UTC
Permalink
Post by Travis Rhoden
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
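A simple way to capture that output for a report is plain shell
redirection, e.g. (the subcommand being whatever step fails for you):

ceph-deploy mon create <hostname> 2>&1 | tee mon-create.log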
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Travis Rhoden
2014-01-09 14:51:51 UTC
Permalink
Post by Alfredo Deza
Post by Travis Rhoden
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And in case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Mordur Ingolfsson
2014-01-09 16:15:33 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I
ask you to be patient with my newbie-ness.

I'm dealing with the same issue although I'm not using ceph-deploy. I
manually installed (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for OSDs. I had managed to get
this working; all seemed healthy. I then simulated a catastrophic event
by pulling the plug on all three nodes. After that I haven't been able
to get things working. There is no quorum reached on the single-mon setup
and a ceph-create-keys process is hanging.


This is my ceph.conf

http://pastebin.com/qyqeu5E4
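
(The pastebin contents aren't reproduced here; for orientation, a
minimal single-mon ceph.conf of that era generally looks something like
the following -- the fsid is elided and the values are illustrative
placeholders, not necessarily what the actual file contains:)

[global]
fsid = <uuid>
mon initial members = ceph0
mon host = 192.168.10.200
auth cluster required = cephx
auth service required = cephx
auth client required = cephx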


This is what the process list pertaining to ceph looks like on the mon
node after a reboot; please note that ceph-create-keys hangs:

root@ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd:
ceph [priv]
ceph 1470 0.0 0.0 94844 1740 ? S 15:38 0:00 sshd:
ceph@pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph

So as you can see, no mon process is started; I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this state of the
system, after a reboot:

2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store
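
(That store lives under the mon data directory; with the default layout
and a cluster named "ceph", that would be the path below -- worth
checking that it exists and is readable:)

ls -l /var/lib/ceph/mon/ceph-ceph0/store.db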

If I manually start the ceph-mon process by:

start ceph-mon id=ceph0

it starts fine, and "ceph
--admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status" outputs:

{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}


The mon process seems ok, but the ceph-create-keys keeps hanging and
there is no quorum.

If I kill the ceph-create-keys process and run "/usr/bin/python
/usr/sbin/ceph-create-keys -i cehp0" manually i get:

"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."

every second or so. This is what happens when I terminate the manually
started ceph-create-keys process:

^CTraceback (most recent call last):
  File "/usr/sbin/ceph-create-keys", line 227, in <module>
    main()
  File "/usr/sbin/ceph-create-keys", line 213, in main
    wait_for_quorum(cluster=args.cluster, mon_id=args.id)
  File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
    time.sleep(1)
KeyboardInterrupt
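
(What that loop is waiting on can be checked by hand: roughly,
ceph-create-keys keeps polling mon_status on the monitor's admin socket
until the reported state is "leader" or "peon". A shell paraphrase of
that wait, using the ceph0 socket path from above, would be:)

until ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status \
      | grep -Eq '"state": "(leader|peon)"'; do
    sleep 1
done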


I will finish this long post by pasting what happens if I try to restart
all services on the cluster, just so you know that the mon problem is
only the first problem I'm battling with here :)

http://pastebin.com/mPGhiYu5

Please note that after the above global restart, the hanging
ceph-create-keys process is back.


Best,
Moe
Post by Travis Rhoden
Post by Alfredo Deza
Post by Travis Rhoden
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Alfredo Deza
2014-01-09 18:32:54 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I ask
you to be patient with my newbie-ness.
I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get this
working, all seemed healthy. I then simulated a catastrophic event by
pulling the plug on all three nodes. After that I haven't been able to get
things working. There is no quorum reached on a single mon setup and a
ceph-create-keys process is hanging hanging. This is my ceph.conf
This is my ceph.conf
http://pastebin.com/qyqeu5E4
This is what a process list pertaining to ceph looks like on the mon node
root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd: ceph
[priv]
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph
So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store
start ceph-mon id=ceph0
it starts fine, and "ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok
{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}
The mon process seems ok, but the ceph-create-keys keeps hanging and there
is no quorum.
If I kill the ceph-create-keys process and run "/usr/bin/python
"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."
I think you have a typo that is causing you issues. Is it 'cehp0' or 'ceph0'?
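A quick way to confirm is to list which mon admin socket actually
exists and then re-run the keys script with the id the socket name
shows (a sketch; substitute the id you actually see there):

ls /var/run/ceph/ceph-mon.*.asok
/usr/bin/python /usr/sbin/ceph-create-keys -i <id-from-socket-name>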
every second or so. This is what happens when I terminate the manually
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt
I will finish this long post by pasting what happens if I try to restart all
services on the cluster, just so you know that the mon problem is only the
first problem I'm battling with here :)
http://pastebin.com/mPGhiYu5
Please note, that after the above global restart, the ceph-create-keys
hanging process is back.
Best,
Moe
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Best,
Moe
Alfredo Deza
2014-01-09 18:32:54 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I ask
you to be patient with my newbie-ness.
I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get this
working, all seemed healthy. I then simulated a catastrophic event by
pulling the plug on all three nodes. After that I haven't been able to get
things working. There is no quorum reached on a single mon setup and a
ceph-create-keys process is hanging hanging. This is my ceph.conf
This is my ceph.conf
http://pastebin.com/qyqeu5E4
This is what a process list pertaining to ceph looks like on the mon node
root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd: ceph
[priv]
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph
So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store
start ceph-mon id=ceph0
it starts fine, and "ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok
{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}
The mon process seems ok, but the ceph-create-keys keeps hanging and there
is no quorum.
If I kill the ceph-create-keys process and run "/usr/bin/python
"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."
I think you have a typo that is causing you issues. Is it 'cehp0' or 'ceph0' ?
every second or so. This is what happens when I terminate the manually
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt
I will finish this long post by pasting what happens if I try to restart all
services on the cluster, just so you know that the mon problem is only the
first problem I'm battling with here :)
http://pastebin.com/mPGhiYu5
Please note, that after the above global restart, the ceph-create-keys
hanging process is back.
Best,
Moe
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Best,
Moe
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Alfredo Deza
2014-01-09 18:32:54 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I ask
you to be patient with my newbie-ness.
I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get this
working, all seemed healthy. I then simulated a catastrophic event by
pulling the plug on all three nodes. After that I haven't been able to get
things working. There is no quorum reached on a single mon setup and a
ceph-create-keys process is hanging hanging. This is my ceph.conf
This is my ceph.conf
http://pastebin.com/qyqeu5E4
This is what a process list pertaining to ceph looks like on the mon node
root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd: ceph
[priv]
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph
So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store
start ceph-mon id=ceph0
it starts fine, and "ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok
{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}
The mon process seems ok, but the ceph-create-keys keeps hanging and there
is no quorum.
If I kill the ceph-create-keys process and run "/usr/bin/python
"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."
I think you have a typo that is causing you issues. Is it 'cehp0' or 'ceph0' ?
every second or so. This is what happens when I terminate the manually
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt
I will finish this long post by pasting what happens if I try to restart all
services on the cluster, just so you know that the mon problem is only the
first problem I'm battling with here :)
http://pastebin.com/mPGhiYu5
Please note, that after the above global restart, the ceph-create-keys
hanging process is back.
Best,
Moe
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Best,
Moe
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Alfredo Deza
2014-01-09 18:32:54 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I ask
you to be patient with my newbie-ness.
I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get this
working, all seemed healthy. I then simulated a catastrophic event by
pulling the plug on all three nodes. After that I haven't been able to get
things working. There is no quorum reached on a single mon setup and a
ceph-create-keys process is hanging hanging. This is my ceph.conf
This is my ceph.conf
http://pastebin.com/qyqeu5E4
This is what a process list pertaining to ceph looks like on the mon node
root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd: ceph
[priv]
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph
So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store
start ceph-mon id=ceph0
it starts fine, and "ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok
{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}
The mon process seems ok, but the ceph-create-keys keeps hanging and there
is no quorum.
If I kill the ceph-create-keys process and run "/usr/bin/python
"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."
I think you have a typo that is causing you issues. Is it 'cehp0' or 'ceph0' ?
every second or so. This is what happens when I terminate the manually
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt
I will finish this long post by pasting what happens if I try to restart all
services on the cluster, just so you know that the mon problem is only the
first problem I'm battling with here :)
http://pastebin.com/mPGhiYu5
Please note, that after the above global restart, the ceph-create-keys
hanging process is back.
Best,
Moe
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Best,
Moe
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Mordur Ingolfsson
2014-01-09 16:15:33 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I
ask you to be patient with my newbie-ness.

I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get
this working, all seemed healthy. I then simulated a catastrophic event
by pulling the plug on all three nodes. After that I haven't been able
to get things working. There is no quorum reached on a single mon setup
and a ceph-create-keys process is hanging hanging. This is my ceph.conf


This is my ceph.conf

http://pastebin.com/qyqeu5E4


This is what a process list pertaining to ceph looks like on the mon
node after a reboot, please note that the ceph-create-keys hangs:

root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd:
ceph [priv]
ceph 1470 0.0 0.0 94844 1740 ? S 15:38 0:00 sshd:
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph

So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
system, after a reboot:

2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store

If I manually start the ceph process by:

start ceph-mon id=ceph0

it starts fine, and "ceph
--admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status" outputs:

{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}


The mon process seems ok, but the ceph-create-keys keeps hanging and
there is no quorum.

If I kill the ceph-create-keys process and run "/usr/bin/python
/usr/sbin/ceph-create-keys -i cehp0" manually i get:

"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."

every second or so. This is what happens when I terminate the manually
started ceph-create-keys process:

^CTraceback (most recent call last):
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt


I will finish this long post by pasting what happens if I try to restart
all services on the cluster, just so you know that the mon problem is
only the first problem I'm battling with here :)

http://pastebin.com/mPGhiYu5

Please note, that after the above global restart, the ceph-create-keys
hanging process is back.


Best,
Moe
Post by Travis Rhoden
Post by Alfredo Deza
Post by Travis Rhoden
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140109/e2c28a97/attachment-0002.htm>
Mordur Ingolfsson
2014-01-09 16:15:33 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I
ask you to be patient with my newbie-ness.

I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get
this working, all seemed healthy. I then simulated a catastrophic event
by pulling the plug on all three nodes. After that I haven't been able
to get things working. There is no quorum reached on a single mon setup
and a ceph-create-keys process is hanging hanging. This is my ceph.conf


This is my ceph.conf

http://pastebin.com/qyqeu5E4


This is what a process list pertaining to ceph looks like on the mon
node after a reboot, please note that the ceph-create-keys hangs:

root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd:
ceph [priv]
ceph 1470 0.0 0.0 94844 1740 ? S 15:38 0:00 sshd:
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph

So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
system, after a reboot:

2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store

If I manually start the ceph process by:

start ceph-mon id=ceph0

it starts fine, and "ceph
--admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status" outputs:

{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}


The mon process seems ok, but the ceph-create-keys keeps hanging and
there is no quorum.

If I kill the ceph-create-keys process and run "/usr/bin/python
/usr/sbin/ceph-create-keys -i cehp0" manually i get:

"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."

every second or so. This is what happens when I terminate the manually
started ceph-create-keys process:

^CTraceback (most recent call last):
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt


I will finish this long post by pasting what happens if I try to restart
all services on the cluster, just so you know that the mon problem is
only the first problem I'm battling with here :)

http://pastebin.com/mPGhiYu5

Please note, that after the above global restart, the ceph-create-keys
hanging process is back.


Best,
Moe
Post by Travis Rhoden
Post by Alfredo Deza
Post by Travis Rhoden
HI Mordur,
I'm definitely straining my memory on this one, but happy to help if I can?
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitiously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And it case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20140109/e2c28a97/attachment-0003.htm>
Mordur Ingolfsson
2014-01-09 16:15:33 UTC
Permalink
Hi guys, thank you very much for your feedback. I'm new to Ceph, so I
ask you to be patient with my newbie-ness.

I'm dealing with the same issue although I'm not using ceph-deploy. I
installed manually (for learning purposes) a small test cluster of three
nodes, one to host the single mon and two for osd. I had managed to get
this working, all seemed healthy. I then simulated a catastrophic event
by pulling the plug on all three nodes. After that I haven't been able
to get things working. There is no quorum reached on a single mon setup
and a ceph-create-keys process is hanging hanging. This is my ceph.conf


This is my ceph.conf

http://pastebin.com/qyqeu5E4


This is what a process list pertaining to ceph looks like on the mon
node after a reboot, please note that the ceph-create-keys hangs:

root at ceph0:/var/log/ceph# ps aux | grep ceph
root 988 0.2 0.2 34204 7368 ? S 15:36 0:00
/usr/bin/python /usr/sbin/ceph-create-keys -i cehp0
root 1449 0.0 0.1 94844 3972 ? Ss 15:38 0:00 sshd:
ceph [priv]
ceph 1470 0.0 0.0 94844 1740 ? S 15:38 0:00 sshd:
ceph at pts/0
ceph 1471 0.3 0.1 22308 3384 pts/0 Ss 15:38 0:00 -bash
root 1670 0.0 0.0 9452 904 pts/0 R+ 15:38 0:00 grep
--color=auto ceph

So as you can see, no mon process is started, I presume that this is
somehow a result of the ceph-create-keys process hanging.
/var/log/ceph-mon.cehp0.log shows the following in this status of the
system, after a reboot:

2014-01-09 15:49:44.433943 7f9e45eb97c0 0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 972
2014-01-09 15:49:44.535436 7f9e45eb97c0 -1 failed to create new leveldb
store

If I manually start the ceph process by:

start ceph-mon id=ceph0

it starts fine, and "ceph
--admin-daemon=/var/run/ceph/ceph-mon.ceph0.asok mon_status" outputs:

{ "name": "ceph0",
"rank": 0,
"state": "leader",
"election_epoch": 1,
"quorum": [
0],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "e0696edf-ac8d-4095-beaf-6a2592964060",
"modified": "2014-01-08 02:00:23.264895",
"created": "2014-01-08 02:00:23.264895",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "192.168.10.200:6789\/0"}]}}


The mon process seems ok, but the ceph-create-keys keeps hanging and
there is no quorum.

If I kill the ceph-create-keys process and run "/usr/bin/python
/usr/sbin/ceph-create-keys -i cehp0" manually i get:

"admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet."

every second or so. This is what happens when I terminate the manually
started ceph-create-keys process:

^CTraceback (most recent call last):
File "/usr/sbin/ceph-create-keys", line 227, in <module>
main()
File "/usr/sbin/ceph-create-keys", line 213, in main
wait_for_quorum(cluster=args.cluster, mon_id=args.id)
File "/usr/sbin/ceph-create-keys", line 34, in wait_for_quorum
time.sleep(1)
KeyboardInterrupt
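
ceph-create-keys itself does little more than poll the monitor's admin
socket, and it builds the socket path from its --cluster and -i
arguments (by default something like
/var/run/ceph/<cluster>-mon.<id>.asok). When it loops on "admin socket
not ready yet" even though the mon answers on its own socket, it is
worth comparing what is actually in /var/run/ceph with the id the hung
process was launched with:

ls /var/run/ceph/
ps aux | grep [c]eph-create-keys
# the <id> in the mon's .asok file name and the value after '-i' should
# match; if they don't, ceph-create-keys polls a socket that never appears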


I will finish this long post by pasting what happens if I try to restart
all services on the cluster, just so you know that the mon problem is
only the first problem I'm battling with here :)

http://pastebin.com/mPGhiYu5

Please note that after the above global restart, the hanging
ceph-create-keys process is back.
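
In case it helps with reproducing this, a full restart of the local
daemons on this kind of Ubuntu install goes through either upstart or
the sysvinit script, depending on which marker files the daemons were
set up with; roughly (a sketch, not necessarily the exact commands
behind the pastebin):

# upstart-managed daemons
sudo stop ceph-all; sudo start ceph-all
# sysvinit-managed daemons
sudo service ceph -a restart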


Best,
Moe
Post by Travis Rhoden
Post by Alfredo Deza
Post by Travis Rhoden
Hi Mordur,
I'm definitely straining my memory on this one, but happy to help if I can!
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And in case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Travis Rhoden
2014-01-09 14:51:51 UTC
Permalink
Post by Alfredo Deza
Post by Travis Rhoden
Hi Mordur,
I'm definitely straining my memory on this one, but happy to help if I can!
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
I do not, it was long long ago... And in case it was ambiguous, let
me explicitly say I was not recommending the use of mkcephfs at all
(is that even still possible?). ceph-deploy is certainly the tool to
use.
Post by Alfredo Deza
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Alfredo Deza
2014-01-09 14:48:09 UTC
Permalink
Post by Travis Rhoden
Hi Mordur,
I'm definitely straining my memory on this one, but happy to help if I can!
I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.
Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
Sorry that's probably not much help.
- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Can you share what exactly you are having problems with? ceph-deploy's
log output has been
much improved and it is super useful to have that when dealing with
possible issues.
Post by Travis Rhoden
Post by Mordur Ingolfsson
Best,
Moe
Travis Rhoden
2014-01-09 14:45:28 UTC
Permalink
Hi Mordur,

I'm definitely straining my memory on this one, but happy to help if I can!

I'm pretty sure I did not figure it out -- you can see I didn't get
any feedback from the list. What I did do, however, was uninstall
everything and try the same setup with mkcephfs, which worked fine at
the time. This was 8 months ago, though, and I have since used
ceph-deploy many times with great success. I am not sure if I have
ever tried a similar set up, though, with just one node and one
monitor. Fortuitously, I may be trying that very setup today or
tomorrow. If I still have issues, I will be sure to post them here.

Are you using both the latest ceph-deploy and the latest Ceph packages
(Emperor or newer dev packages)? There have been lots of changes in
the monitor area, including in the upstart scripts, that made many
things more robust in this area. I did have a cluster a few months
ago that had a flaky monitor that refused to join quorum after
install, and I had to just blow it away and re-install/deploy it and
then it was fine, which I thought was odd.
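
Something along these lines, run on the admin host and on a cluster
node, shows exactly which versions are in play (just a quick sketch):

ceph-deploy --version
ceph --version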

Sorry that's probably not much help.

- Travis
Post by Mordur Ingolfsson
Hi Travis,
Did you figure this out? I'm dealing with exactly the same thing over here.
Best,
Moe
Mordur Ingolfsson
2014-01-09 05:40:07 UTC
Permalink
Hi Travis,

Did you figure this out? I'm dealing with exactly the same thing over here.

Best,
Moe