Discussion:
ceph-disk vs. ceph-volume: both error prone
Nico Schottelius
2018-02-09 15:48:42 UTC
Dear list,

for a few days we have been dissecting ceph-disk and ceph-volume to find
out what the appropriate way of creating partitions for ceph is.

For years I have found ceph-disk (and especially ceph-deploy) very
error prone, and we at ungleich are considering rewriting both into a
ceph-block-do-what-I-want-tool.

Only considering bluestore, I see that ceph-disk creates two partitions:

Device       Start        End    Sectors   Size Type
/dev/sde1     2048     206847     204800   100M Ceph OSD
/dev/sde2   206848 2049966046 2049759199 977.4G unknown
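
The numbers in this listing can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming 512-byte logical sectors (the listing does not state the sector size) and that the Size column uses binary units; the function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# (device, start, end, sectors) copied from the partition listing
PARTITIONS = [
    ("/dev/sde1", 2048, 206847, 204800),
    ("/dev/sde2", 206848, 2049966046, 2049759199),
]


def sector_count(start, end):
    """Number of sectors in a partition; start and end are both inclusive."""
    return end - start + 1


def size_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Partition size in MiB for the given sector count."""
    return sectors * sector_bytes / 2**20


for device, start, end, sectors in PARTITIONS:
    # The Sectors column should equal End - Start + 1.
    assert sector_count(start, end) == sectors
    mib = size_mib(sectors)
    print(f"{device}: {sectors} sectors, {mib:.1f} MiB, {mib / 1024:.1f} GiB")
```

Under these assumptions the first partition comes out at exactly 100 MiB and the second at roughly 977.4 GiB, matching the Size column.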

Does anybody know what exactly belongs on the xfs-formatted first
partition (sde1), and how the data/wal/db device sde2 is formatted?

What I would really like to know is how we can best extract this
information, so that we no longer depend on ceph-{disk,volume}.

Any pointer to the on-disk format would be much appreciated!

Best,

Nico




--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
Alfredo Deza
2018-02-09 20:56:55 UTC
On Fri, Feb 9, 2018 at 10:48 AM, Nico Schottelius
Post by Nico Schottelius
Dear list,
for a few days we are dissecting ceph-disk and ceph-volume to find out,
what is the appropriate way of creating partitions for ceph.
ceph-volume does not create partitions for Ceph.
Post by Nico Schottelius
For years already I found ceph-disk (and especially ceph-deploy) very
error prone and we at ungleich are considering to rewrite both into a
ceph-block-do-what-I-want-tool.
This is not very simple, which is why there are tools that
do this for you.
Post by Nico Schottelius
Device       Start        End    Sectors   Size Type
/dev/sde1     2048     206847     204800   100M Ceph OSD
/dev/sde2   206848 2049966046 2049759199 977.4G unknown
Does somebody know, what exactly belongs onto the xfs formatted first
disk and how is the data/wal/db device sde2 formatted?
If you must, I would encourage you to try ceph-disk out with full
verbosity and dissect all the system calls, which will show how the
partitions are formatted.
Post by Nico Schottelius
What I really would like to know is, how can we best extract this
information so that we are not depending on ceph-{disk,volume} anymore.
Initially you mentioned partitions, but you want to avoid ceph-disk
and ceph-volume wholesale? That is going to take a lot more effort.
These tools not only "prepare" devices for Ceph consumption, they also
"activate" them when a system boots, talk to the cluster to register
the OSDs, etc... It isn't just partitioning (for ceph-disk).
Post by Nico Schottelius
Any pointer for the on disk format would be much appreciated!
Again, if you are only interested in how ceph-disk partitions or how
ceph-volume formats, you should try them out with full verbosity
(ceph-volume logs everything to /var/log/ceph/ceph-volume.log).
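
One way to turn such a log into a list of the external commands that were run is to filter it for the lines that record them. The following is a minimal sketch, assuming each executed command is logged on a line containing the text "Running command: "; that marker is an assumption to check against your own log, and the function names are made up for illustration.

```python
from typing import Iterable, List

# Assumption: each external command is logged with this marker text.
# Check your own log and adjust if the wording differs.
MARKER = "Running command: "

# Log path as given in the message above.
DEFAULT_LOG = "/var/log/ceph/ceph-volume.log"


def extract_commands(lines: Iterable[str], marker: str = MARKER) -> List[str]:
    """Return the text following the marker on every line that contains it."""
    commands = []
    for line in lines:
        _, found, rest = line.partition(marker)
        if found:
            commands.append(rest.strip())
    return commands


def print_commands(path: str = DEFAULT_LOG) -> None:
    """Print every command found in the log file at the given path."""
    with open(path, errors="replace") as handle:
        for command in extract_commands(handle):
            print(command)
```

Calling `print_commands()` on a host where the log exists would print one command per line, in the order they were logged; the same filter could be applied to captured verbose output from ceph-disk if it uses a similar marker.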
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Willem Jan Withagen
2018-02-11 18:00:12 UTC
Post by Alfredo Deza
Initially you mentioned partitions, but you want to avoid ceph-disk
and ceph-volume wholesale? That is going to take a lot more effort.
These tools not only "prepare" devices for Ceph consumption, they also
"activate" them when a system boots, talk to the cluster to register
the OSDs, etc... It isn't just partitioning (for ceph-disk).
I personally find it very annoying that ceph-disk tries to be friends
with all the init tools that come with the various Linux distributions,
let alone all the udev machinery that starts working on disks as soon
as they are introduced into the system.

And for FreeBSD I do not suggest using it, since it does not fit
the FreeBSD paradigm that things like this are not started
automagically.

So if it is only about creating the ceph-infra, things are relatively easy.

The actual work on the partitions is done with ceph-osd --mkfs, and there
is little magic about it. A few more options tell it where the parts
for BlueStore go if you want something other than the standard location.
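
To make that concrete, here is a minimal sketch of assembling such a ceph-osd --mkfs invocation. It only builds and prints the argument list and runs nothing. The helper name, the OSD id, the UUID and the paths are placeholders, and the --bluestore-block-*-path spellings are the bluestore_block_path, bluestore_block_db_path and bluestore_block_wal_path config options written as command-line overrides, which is an assumption to verify against your Ceph release. A real setup also needs keys and cluster registration, which this sketch leaves out.

```python
import shlex
from typing import List, Optional


def mkfs_command(osd_id: int, osd_uuid: str, osd_data: str,
                 block_path: Optional[str] = None,
                 db_path: Optional[str] = None,
                 wal_path: Optional[str] = None) -> List[str]:
    """Build (but do not run) a ceph-osd --mkfs argument list for BlueStore."""
    cmd = [
        "ceph-osd", "-i", str(osd_id), "--mkfs",
        "--osd-objectstore", "bluestore",
        "--osd-data", osd_data,
        "--osd-uuid", osd_uuid,
    ]
    # Optional overrides for non-default locations of the BlueStore parts.
    # Assumption: these config options are accepted as command-line flags.
    overrides = (
        ("--bluestore-block-path", block_path),
        ("--bluestore-block-db-path", db_path),
        ("--bluestore-block-wal-path", wal_path),
    )
    for option, path in overrides:
        if path:
            cmd += [option, path]
    return cmd


# Placeholder values for illustration only.
example = mkfs_command(
    0,
    "00000000-0000-0000-0000-000000000000",
    "/var/lib/ceph/osd/ceph-0",
    block_path="/dev/sde2",
)
print(" ".join(shlex.quote(arg) for arg in example))
```

Keeping the command as a list makes it easy to inspect or log before handing it to something like subprocess.run.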

Also, a large part of ceph-disk is complicated/obfuscated by the desire
to run on encrypted disks and/or multipath disk providers...
Running it in verbose mode gives a bit of info, but the Python code is
convoluted and complex until you have it figured out. Then it starts to
become simpler, but never easy. ;-)

Writing a script that does what ceph-disk does? Take a look at
src/vstart.sh in the source. That script builds a full cluster during
testing and is far more legible.
I did so for my FreeBSD multi-server cluster tests, and it is not
complex at all.

Just my 2cts,
--WjW
