[ceph-users] ceph-volume lvm deactivate/destroy/zap

Post by Dan van der Ster
Hi,
For someone who is not an lvm expert, does anyone have a recipe for
destroying a ceph-volume lvm osd?
(I have a failed disk which I want to deactivate / wipe before
physically removing from the host, and the tooling for this doesn't
exist yet http://tracker.ceph.com/issues/22287)

ceph-volume lvm zap /dev/sdu # does not work

Zapping: /dev/sdu
Running command: sudo wipefs --all /dev/sdu
Device or resource busy
How does one tear that down so it can be zapped?

wipefs -fa /dev/the/device
dd if=/dev/zero of=/dev/the/device bs=1M count=1

^^ I have succesfully re-created ceph-volume lvm bluestore OSDs with
above method (assuming you have done the ceph osd purge osd.$ID part
already and brought down the OSD process itself).

Gr. Stefan

--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl

Dan van der Ster

2017-12-21 15:26:45 UTC

ceph-volume lvm zap /dev/sdu # does not work

Zapping: /dev/sdu
Running command: sudo wipefs --all /dev/sdu
Device or resource busy
How does one tear that down so it can be zapped?

wipefs -fa /dev/the/device
dd if=/dev/zero of=/dev/the/device bs=1M count=1

Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

-- dan

Post by Stefan Kooman
^^ I have succesfully re-created ceph-volume lvm bluestore OSDs with
above method (assuming you have done the ceph osd purge osd.$ID part
already and brought down the OSD process itself).
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351

Stefan Kooman

2017-12-21 16:35:50 UTC

Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

Ah, you want to clean up properly before that. Sure:

lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)

So ideally there should be a ceph-volume lvm destroy / zap option that
takes care of this:

1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there

Gr. Stefan

--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl

Alfredo Deza

2018-01-08 15:37:46 UTC

Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there

ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.

Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/

The reason for not attempting removal is that an LV might not be a
1-to-1 device to volume group. It is being suggested here to "vgremove
<volume_group>"
but what if the group has several other LVs that should not get
removed? Similarly, what if the logical volume is not a single PV but
many?

We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.

Post by Stefan Kooman
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Dan van der Ster

2018-01-08 15:53:39 UTC

Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not be a
1-to-1 device to volume group. It is being suggested here to "vgremove
<volume_group>"
but what if the group has several other LVs that should not get
removed? Similarly, what if the logical volume is not a single PV but
many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.

Maybe I'm missing something, but aren't most (almost all?) use-cases just

ceph-volume lvm create /dev/<thewholedisk>

? Or do you expect most deployments to do something more complicated with lvm?

In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.

Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.

Cheers, Dan

Alfredo Deza

2018-01-08 16:41:38 UTC

Post by Dan van der Ster

Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not be a
1-to-1 device to volume group. It is being suggested here to "vgremove
<volume_group>"
but what if the group has several other LVs that should not get
removed? Similarly, what if the logical volume is not a single PV but
many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.

Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>

Post by Dan van der Ster
? Or do you expect most deployments to do something more complicated with lvm?

Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume, but it can be vary on how it is implemented
behind the scenes

Post by Dan van der Ster
In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.

Right, that would work if that was the only supported way of dealing
with lvm. We aren't imposing this, we added it as a convenience if a
user did not want
to deal with lvm at all. LVM has a plethora of ways to create an LV,
and we don't want to either restrict users to our view of LVM or
attempt to understand all the many different
ways that may be and assume some behavior is desired (like removing a VG)

Post by Dan van der Ster
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan

Reed Dier

2018-01-09 18:35:20 UTC

I would just like to mirror what Dan van der Sterâs sentiments are.

As someone attempting to move an OSD to bluestore, with limited/no LVM experience, it is a completely different beast and complexity level compared to the ceph-disk/filestore days.

ceph-deploy was a very simple tool that did exactly what I was looking to do, but now we have deprecated ceph-disk halfway into a release, ceph-deploy doesnât appear to fully support ceph-volume, which is now the official way to manage OSDs moving forward.

My ceph-volume create statement âsucceededâ but the OSD doesnât start, so now I am trying to zap the disk to try to recreate the OSD, and the zap is failing as Danâs did.

And yes, I was able to get it zapped using the lvremove, vgremove, pvremove commands, but that is not obvious to someone who hasnât used LVM extensively for storage management before.

I also want to mirror Danâs sentiments about the unnecessary complexity imposed on what I expect is the default use case of an entire disk being used. I canât see anything more than the âentire diskâ method being the largest use case for users of ceph, especially the smaller clusters trying to maximize hardware/spend.

Just wanted to piggy back this thread to echo Danâs frustration.

Thanks,

Reed

Post by Dan van der Ster

Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not be a
1-to-1 device to volume group. It is being suggested here to "vgremove
<volume_group>"
but what if the group has several other LVs that should not get
removed? Similarly, what if the logical volume is not a single PV but
many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.

Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>

Post by Dan van der Ster
? Or do you expect most deployments to do something more complicated with lvm?

Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume, but it can be vary on how it is implemented
behind the scenes

Post by Dan van der Ster
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan

_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Alfredo Deza

2018-01-09 19:14:51 UTC

I would just like to mirror what Dan van der Ster’s sentiments are.
As someone attempting to move an OSD to bluestore, with limited/no LVM
experience, it is a completely different beast and complexity level compared
to the ceph-disk/filestore days.
ceph-deploy was a very simple tool that did exactly what I was looking to
do, but now we have deprecated ceph-disk halfway into a release, ceph-deploy
doesn’t appear to fully support ceph-volume, which is now the official way
to manage OSDs moving forward.

ceph-deploy now fully supports ceph-volume, we should get a release soon

My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so
now I am trying to zap the disk to try to recreate the OSD, and the zap is
failing as Dan’s did.

I would encourage you to open a ticket in the tracker so that we can
improve on what failed for you

http://tracker.ceph.com/projects/ceph-volume/issues/new

ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and
/var/log/ceph/ceph-volume-systemd.log

If you create a ticket, please make sure to add all the output and
steps that you can

And yes, I was able to get it zapped using the lvremove, vgremove, pvremove
commands, but that is not obvious to someone who hasn’t used LVM extensively
for storage management before.
I also want to mirror Dan’s sentiments about the unnecessary complexity
imposed on what I expect is the default use case of an entire disk being
used. I can’t see anything more than the ‘entire disk’ method being the
largest use case for users of ceph, especially the smaller clusters trying
to maximize hardware/spend.

We don't take lightly the introduction of LVM here. The new tool is
addressing several insurmountable issues with how ceph-disk operated.

Although using an entire disk might be easier in the use case you are
in, it is certainly not the only thing we have to support, so then
again, we can't
reliably decide what strategy would be best to destroy that volume, or
group, or if the PV should be destroyed as well.

The 'zap' sub-command will allow that lv to be reused for an OSD and
that should work. Again, if it isn't sufficient, we really do need
more information and a
ticket in the tracker is the best way.

Just wanted to piggy back this thread to echo Dan’s frustration.
Thanks,
Reed
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not be a
1-to-1 device to volume group. It is being suggested here to "vgremove
<volume_group>"
but what if the group has several other LVs that should not get
removed? Similarly, what if the logical volume is not a single PV but
many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>
No
? Or do you expect most deployments to do something more complicated with lvm?
Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume, but it can be vary on how it is implemented
behind the scenes
In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.
Right, that would work if that was the only supported way of dealing
with lvm. We aren't imposing this, we added it as a convenience if a
user did not want
to deal with lvm at all. LVM has a plethora of ways to create an LV,
and we don't want to either restrict users to our view of LVM or
attempt to understand all the many different
ways that may be and assume some behavior is desired (like removing a VG)
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Fabian Grünbichler

2018-01-10 07:10:40 UTC

ceph-deploy now fully supports ceph-volume, we should get a release soon

My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so
now I am trying to zap the disk to try to recreate the OSD, and the zap is
failing as Dan’s did.

I would encourage you to open a ticket in the tracker so that we can
improve on what failed for you
http://tracker.ceph.com/projects/ceph-volume/issues/new
ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and
/var/log/ceph/ceph-volume-systemd.log
If you create a ticket, please make sure to add all the output and
steps that you can

wouldn't it be possible to detect on creation that it is a full physical
disk that gets initialized completely by ceph-volume, store that in the
metadata somewhere and clean up accordingly when destroying the OSD?

Post by Alfredo Deza
The 'zap' sub-command will allow that lv to be reused for an OSD and
that should work. Again, if it isn't sufficient, we really do need
more information and a
ticket in the tracker is the best way.

Alfredo Deza

2018-01-10 12:57:59 UTC

Post by Fabian GrÃ¼nbichler

On Wed, Jan 10, 2018 at 2:10 AM, Fabian Grünbichler