Discussion:
[ceph-users] Customized Crush location hooks in Mimic
Oliver Freyermuth
2018-11-30 10:44:51 UTC
Dear Cephalopodians,

I'm probably missing something obvious, but I am at a loss here on how to actually make use of a customized crush location hook.

I'm currently on "ceph version 13.2.1" on CentOS 7 (i.e. the last version before the upgrade-preventing bugs). Here's what I did:

1. Write a script /usr/local/bin/customized-ceph-crush-location. The script can be executed by user "ceph":
# sudo -u ceph /usr/local/bin/customized-ceph-crush-location
host=osd001 datacenter=FTD root=default

2. Add the following to ceph.conf:
[osd]
crush_location_hook = /usr/local/bin/customized-ceph-crush-location

3. Restart an OSD and confirm that it is picked up:
# systemctl restart ceph-***@0
# ceph config show-with-defaults osd.0
...
crush_location_hook /usr/local/bin/customized-ceph-crush-location file
...
osd_crush_update_on_start true default
...

However, the script is not executed. I am sure of that because the script should also write a log to /tmp, and that log is never created.
Also, the "datacenter" type does not show up in the crush tree.
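For readers who want to reproduce the setup, a hook of the kind described in step 1 might look like the minimal sketch below. The host-to-datacenter mapping and the function names are assumptions made for illustration; the thread does not show the actual script. Ceph calls the hook with --cluster, --id and --type arguments and expects a single location line on stdout.

```shell
#!/bin/sh
# Hypothetical CRUSH location hook; not the script used in this thread.
# Ceph invokes it roughly as: <hook> --cluster <name> --id <osd-id> --type osd
# and reads one line of key=value pairs from stdout.

# Assumed site-specific mapping from short hostname to datacenter bucket.
datacenter_for_host() {
    case "$1" in
        osd001) echo "FTD" ;;
        *)      echo "" ;;
    esac
}

# Print the CRUSH location line for the given short hostname.
crush_location() {
    host="$1"
    dc="$(datacenter_for_host "$host")"
    if [ -n "$dc" ]; then
        echo "host=${host} datacenter=${dc} root=default"
    else
        # Unknown host: leave the datacenter out instead of guessing.
        echo "host=${host} root=default"
    fi
}

# Use the local short hostname (strip any domain part).
full_name="$(uname -n)"
crush_location "${full_name%%.*}"
```

Keeping the mapping in a separate function makes it easy to check the output for a given hostname by hand before pointing crush_location_hook at the script.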

I have already disabled SELinux just to make sure.

Any ideas what I am missing here?

Cheers and thanks in advance,
Oliver
Oliver Freyermuth
2018-11-30 14:25:14 UTC
Dear Cephalopodians,

Further experiments revealed that the crush location hook is indeed called!
Only my check (writing to a file in /tmp from inside the hook) somehow failed. Using "logger" works for debugging.
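A small sketch of the "logger" approach, under stated assumptions: the syslog tag and the function names are made up for illustration. One plausible reason for the failed /tmp check, not confirmed anywhere in this thread, is that the OSD's systemd unit runs with a private /tmp, so files written by the hook never show up in the host's /tmp.

```shell
#!/bin/sh
# Sketch: debug a hook via syslog instead of a file in /tmp.
# The tag "crush-location-hook" is an arbitrary, assumed name.
log_debug() {
    if command -v logger >/dev/null 2>&1; then
        # Never let a logging failure break the hook.
        logger -t crush-location-hook "$*" 2>/dev/null || true
    else
        echo "crush-location-hook: $*" >&2
    fi
}

emit_location() {
    log_debug "called with arguments: $*"
    # Only the location line may go to stdout.
    echo "host=osd001 datacenter=FTD root=default"
}

emit_location "$@"
```

On a systemd host the messages should then be visible with something like "journalctl -t crush-location-hook". Debug output is kept off stdout on purpose, since Ceph parses stdout as the location.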

So now, my hook outputs:
host=osd001 datacenter=FTD root=default
as explained before. I have also explicitly created the buckets beforehand in case that is needed.
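Creating a bucket beforehand can be sketched as follows. The thread does not show the exact commands that were run, so this is an assumption based on the standard "ceph osd crush add-bucket" and "ceph osd crush move" subcommands. The CEPH variable and the function name are hypothetical; by default the script only prints the commands.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of executing them.
# Set CEPH=ceph in the environment to run them against a real cluster.
CEPH="${CEPH:-echo ceph}"

# Create an empty datacenter bucket and attach it below root=default.
create_datacenter_bucket() {
    $CEPH osd crush add-bucket "$1" datacenter
    $CEPH osd crush move "$1" root=default
}

create_datacenter_bucket FTD
```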

The tree looks like this:
# ceph osd tree
ID  CLASS WEIGHT   TYPE NAME            STATUS REWEIGHT PRI-AFF
 -1       55.23582 root default
 -9              0     datacenter FTD
-12       18.41194     datacenter FTD_1
 -3       18.41194         host osd001
  0   hdd  3.68239             osd.0        up  1.00000 1.00000
  1   hdd  3.68239             osd.1        up  1.00000 1.00000
  2   hdd  3.68239             osd.2        up  1.00000 1.00000
  3   hdd  3.68239             osd.3        up  1.00000 1.00000
  4   hdd  3.68239             osd.4        up  1.00000 1.00000
-11              0     datacenter FTD_2
 -5       18.41194     host osd002
  5   hdd  3.68239         osd.5            up  1.00000 1.00000
  6   hdd  3.68239         osd.6            up  1.00000 1.00000
  7   hdd  3.68239         osd.7            up  1.00000 1.00000
  8   hdd  3.68239         osd.8            up  1.00000 1.00000
  9   hdd  3.68239         osd.9            up  1.00000 1.00000
 -7       18.41194     host osd003
 10   hdd  3.68239         osd.10           up  1.00000 1.00000
 11   hdd  3.68239         osd.11           up  1.00000 1.00000
 12   hdd  3.68239         osd.12           up  1.00000 1.00000
 13   hdd  3.68239         osd.13           up  1.00000 1.00000
 14   hdd  3.68239         osd.14           up  1.00000 1.00000

So naively, I would expect that when I restart osd.0, it should move itself into datacenter=FTD.
But that does not happen...

Any idea what I am missing?

Cheers,
Oliver
Oliver Freyermuth
2018-11-30 14:46:03 UTC
Dear Cephalopodians,

Sorry for the spam, but I found the following in the mon logs just now and am finally out of ideas:
------------------------------------------------------------------------------------------
2018-11-30 15:43:05.207 7f9d64aac700 0 ***@0(leader) e3 handle_command mon_command({"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["1"]} v 0) v1
2018-11-30 15:43:05.207 7f9d64aac700 0 log_channel(audit) log [INF] : from='osd.1 10.160.12.101:6816/90528' entity='osd.1' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["1"]}]: dispatch
2018-11-30 15:43:05.208 7f9d64aac700 0 ***@0(leader) e3 handle_command mon_command({"prefix": "osd crush create-or-move", "id": 1, "weight":3.6824, "args": ["datacenter=FTD", "host=osd001", "root=default"]} v 0) v1
2018-11-30 15:43:05.208 7f9d64aac700 0 log_channel(audit) log [INF] : from='osd.1 10.160.12.101:6816/90528' entity='osd.1' cmd=[{"prefix": "osd crush create-or-move", "id": 1, "weight":3.6824, "args": ["datacenter=FTD", "host=osd001", "root=default"]}]: dispatch
2018-11-30 15:43:05.208 7f9d64aac700 0 ***@0(leader).osd e2464 create-or-move crush item name 'osd.1' initial_weight 3.6824 at location {datacenter=FTD,host=osd001,root=default}
------------------------------------------------------------------------------------------
So the request to move to datacenter=FTD arrives at the mon, but no action is taken, and the OSD is left in FTD_1.

Cheers,
Oliver
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Gregory Farnum
2018-11-30 17:38:54 UTC
I’m pretty sure the monitor command there won’t move intermediate buckets
like the host. This is so if an osd has incomplete metadata it doesn’t
inadvertently move 11 other OSDs into a different rack/row/whatever.

So in this case, it finds the host osd001 and matches it, but since the
crush map already knows about osd001 it doesn’t pay any attention to the
datacenter field.
Whereas if you tried setting it with mynewhost, the monitor wouldn’t know
where that host exists and would look at the other fields to set it in the
specified data center.
-Greg
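
The behaviour described above can be illustrated with a toy model. This is only a sketch of the decision as explained in this message, not Ceph's actual implementation; the function name and its arguments are hypothetical.

```shell
#!/bin/sh
# Toy model only: illustrates the explanation above, not Ceph's code.
# $1: space-separated list of host buckets already in the CRUSH map
# $2: host reported by the hook
# $3: datacenter reported by the hook
decide_placement() {
    known_hosts="$1"
    host="$2"
    datacenter="$3"
    for h in $known_hosts; do
        if [ "$h" = "$host" ]; then
            # Existing host bucket: it is matched and left where it is.
            echo "host $host exists: keep its position, ignore datacenter=$datacenter"
            return 0
        fi
    done
    # Unknown host: the remaining fields decide where it is created.
    echo "host $host is new: create it under datacenter=$datacenter"
}

decide_placement "osd001 osd002 osd003" osd001 FTD
decide_placement "osd001 osd002 osd003" mynewhost FTD
```

The first call corresponds to the situation in this thread, the second to the "mynewhost" case.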
Oliver Freyermuth
2018-11-30 23:56:06 UTC
Dear Greg,
Thanks! That's a good and clear explanation. This was not apparent to me from the documentation, but it sounds like the safest way to go.
So in the end, crush-location-hooks are mostly useful for freshly created OSDs, e.g. on a new host (they should then directly go to the correct rack / datacenter etc.).

I wonder whether that is the only sensible use case; right now it seems so to me.
So for our scheme, I will indeed use it for that, and move hosts manually (when moving them physically...) by moving the Ceph buckets to the other rack / datacenter.
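
Such a manual move could be sketched as below, assuming the standard "ceph osd crush move" subcommand. The CEPH variable and the function name are hypothetical, and the script only prints the command unless CEPH=ceph is set.

```shell
#!/bin/sh
# Dry run by default: prints the command instead of executing it.
# Set CEPH=ceph in the environment to run it against a real cluster.
CEPH="${CEPH:-echo ceph}"

# Move an existing host bucket (with all its OSDs) below a datacenter bucket.
move_host_to_datacenter() {
    $CEPH osd crush move "$1" "datacenter=$2"
}

move_host_to_datacenter osd001 FTD
```

Moving a host bucket changes CRUSH placement, so data movement should be expected once the command is actually executed.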

Thanks for the explanation!
Cheers,
Oliver