Discussion:
[ceph-users] ceph-volume lvm deactivate/destroy/zap
Dan van der Ster
2017-12-21 14:50:07 UTC
Hi,

For someone who is not an lvm expert, does anyone have a recipe for
destroying a ceph-volume lvm osd?
(I have a failed disk which I want to deactivate / wipe before
physically removing from the host, and the tooling for this doesn't
exist yet http://tracker.ceph.com/issues/22287)
ceph-volume lvm zap /dev/sdu # does not work
Zapping: /dev/sdu
Running command: sudo wipefs --all /dev/sdu
stderr: wipefs: error: /dev/sdu: probing initialization failed:
Device or resource busy
--> RuntimeError: command returned non-zero exit status: 1

This is the drive I want to remove:

===== osd.240 ======

[block] /dev/ceph-<the cluster fsid>/osd-block-f1455f38-b94b-4501-86df-6d6c96727d02

type block
osd id 240
cluster fsid xxx
cluster name ceph
osd fsid f1455f38-b94b-4501-86df-6d6c96727d02
block uuid N4fpLc-O3y0-hvfN-oRpD-y6kH-znfl-4EaVLi
block device /dev/ceph-<the cluster fsid>/osd-block-f1455f38-b94b-4501-86df-6d6c96727d02

How does one tear that down so it can be zapped?

Best Regards,

Dan
Stefan Kooman
2017-12-21 14:59:13 UTC
Post by Dan van der Ster
Hi,
For someone who is not an lvm expert, does anyone have a recipe for
destroying a ceph-volume lvm osd?
(I have a failed disk which I want to deactivate / wipe before
physically removing from the host, and the tooling for this doesn't
exist yet http://tracker.ceph.com/issues/22287)
ceph-volume lvm zap /dev/sdu # does not work
Zapping: /dev/sdu
Running command: sudo wipefs --all /dev/sdu
Device or resource busy
How does one tear that down so it can be zapped?
wipefs -fa /dev/the/device
dd if=/dev/zero of=/dev/the/device bs=1M count=1

^^ I have successfully re-created ceph-volume lvm bluestore OSDs with the
above method (assuming you have already done the ceph osd purge osd.$ID part
and brought down the OSD process itself).
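
Concretely, for the failed drive in this thread, a minimal end-to-end sketch
might look like this (assuming osd.240 and /dev/sdu from Dan's listing, and
a systemd-managed OSD; adjust the id and device to your setup):

systemctl stop ceph-osd@240                  # bring the OSD process down first
ceph osd purge 240 --yes-i-really-mean-it    # remove it from the crush map, auth, and osd map
wipefs -fa /dev/sdu                          # force-wipe filesystem/LVM signatures
dd if=/dev/zero of=/dev/sdu bs=1M count=1    # zero the first MiB for leftover labels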

Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl
Dan van der Ster
2017-12-21 15:26:45 UTC
Post by Stefan Kooman
Post by Dan van der Ster
Hi,
For someone who is not an lvm expert, does anyone have a recipe for
destroying a ceph-volume lvm osd?
(I have a failed disk which I want to deactivate / wipe before
physically removing from the host, and the tooling for this doesn't
exist yet http://tracker.ceph.com/issues/22287)
ceph-volume lvm zap /dev/sdu # does not work
Zapping: /dev/sdu
Running command: sudo wipefs --all /dev/sdu
Device or resource busy
How does one tear that down so it can be zapped?
wipefs -fa /dev/the/device
dd if=/dev/zero of=/dev/the/device bs=1M count=1
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?

-- dan
Post by Stefan Kooman
^^ I have successfully re-created ceph-volume lvm bluestore OSDs with the
above method (assuming you have already done the ceph osd purge osd.$ID part
and brought down the OSD process itself).
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
Stefan Kooman
2017-12-21 16:35:50 UTC
Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
Ah, you want to clean up properly before that. Sure:

lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
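
For example, plugging in the names from Dan's osd.240 listing (placeholders
only; check lvs/vgs/pvs for the real names on your host):

lvremove -f ceph-<cluster-fsid>/osd-block-f1455f38-b94b-4501-86df-6d6c96727d02
vgremove ceph-<cluster-fsid>
pvremove /dev/sdu    # the PV that backed the now-removed VG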

So ideally there should be a ceph-volume lvm destroy / zap option that
takes care of this:

1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there

Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl
Alfredo Deza
2018-01-08 15:37:46 UTC
Post by Stefan Kooman
Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.

Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/

The reason for not attempting removal is that an LV might not map
1-to-1 onto a volume group or device. It is being suggested here to
"vgremove <volume_group>", but what if the group has several other LVs
that should not get removed? Similarly, what if the logical volume is
backed not by a single PV but by many?

We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
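
To see what there is to lose before removing anything, the standard LVM
reporting tools show whether a VG hosts other LVs and how many PVs back it,
e.g.:

vgs -o vg_name,lv_count,pv_count    # other LVs in the group? multiple PVs?
lvs -o lv_name,vg_name,devices      # which physical devices each LV sits on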
Post by Stefan Kooman
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Dan van der Ster
2018-01-08 15:53:39 UTC
Post by Alfredo Deza
Post by Stefan Kooman
Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not map
1-to-1 onto a volume group or device. It is being suggested here to
"vgremove <volume_group>", but what if the group has several other LVs
that should not get removed? Similarly, what if the logical volume is
backed not by a single PV but by many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
Maybe I'm missing something, but aren't most (almost all?) use-cases just

ceph-volume lvm create /dev/<thewholedisk>

? Or do you expect most deployments to do something more complicated with lvm?

In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.
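
To illustrate the kind of teardown command I mean, here is a rough sketch
(it assumes ceph-volume's LVM tags such as ceph.osd_id, and a VG dedicated
to the one disk; both are assumptions, not guaranteed behaviour):

OSD_ID=240
# find the LV that carries this OSD's ceph.osd_id tag
LV=$(lvs --noheadings -o lv_path,lv_tags | awk -v t="ceph.osd_id=$OSD_ID" '$0 ~ t {print $1}')
VG=$(lvs --noheadings -o vg_name "$LV" | tr -d '[:space:]')
lvremove -f "$LV"    # destroys the OSD's data
vgremove "$VG"       # only safe if the VG held nothing else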

Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.

Cheers, Dan
Post by Alfredo Deza
Post by Stefan Kooman
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Alfredo Deza
2018-01-08 16:41:38 UTC
Post by Dan van der Ster
Post by Alfredo Deza
Post by Stefan Kooman
Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not map
1-to-1 onto a volume group or device. It is being suggested here to
"vgremove <volume_group>", but what if the group has several other LVs
that should not get removed? Similarly, what if the logical volume is
backed not by a single PV but by many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>
No
Post by Dan van der Ster
? Or do you expect most deployments to do something more complicated with lvm?
Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume but can vary in how it is implemented behind the
scenes.
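
(For what it's worth, LVM's own reporting does expose the difference; a
dm-cache backed LV reports a 'cache' segment type where a plain one reports
'linear', even though ceph-volume treats both as just an LV:

lvs -o lv_name,vg_name,segtype    # 'cache' vs 'linear' shows what is behind the LV
)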
Post by Dan van der Ster
In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.
Right, that would work if that were the only supported way of dealing
with lvm. We aren't imposing this; we added it as a convenience if a
user did not want to deal with lvm at all. LVM has a plethora of ways
to create an LV, and we don't want to either restrict users to our view
of LVM or attempt to understand all the many different ways it may be
used and assume some behavior is desired (like removing a VG).
Post by Dan van der Ster
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan
Post by Alfredo Deza
Post by Stefan Kooman
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Reed Dier
2018-01-09 18:35:20 UTC
I would just like to mirror Dan van der Ster’s sentiments.

As someone attempting to move an OSD to bluestore, with limited/no LVM experience, it is a completely different beast and complexity level compared to the ceph-disk/filestore days.

ceph-deploy was a very simple tool that did exactly what I was looking to do, but now we have deprecated ceph-disk halfway into a release, and ceph-deploy doesn’t appear to fully support ceph-volume, which is now the official way to manage OSDs moving forward.

My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so now I am trying to zap the disk to try to recreate the OSD, and the zap is failing as Dan’s did.

And yes, I was able to get it zapped using the lvremove, vgremove, pvremove commands, but that is not obvious to someone who hasn’t used LVM extensively for storage management before.

I also want to mirror Dan’s sentiments about the unnecessary complexity imposed on what I expect is the default use case of an entire disk being used. I can’t see anything other than the ‘entire disk’ method being the largest use case for users of ceph, especially the smaller clusters trying to maximize hardware/spend.

Just wanted to piggy back this thread to echo Dan’s frustration.

Thanks,

Reed
Post by Alfredo Deza
Post by Dan van der Ster
Post by Alfredo Deza
Post by Stefan Kooman
Post by Dan van der Ster
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not map
1-to-1 onto a volume group or device. It is being suggested here to
"vgremove <volume_group>", but what if the group has several other LVs
that should not get removed? Similarly, what if the logical volume is
backed not by a single PV but by many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>
No
Post by Dan van der Ster
? Or do you expect most deployments to do something more complicated with lvm?
Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume but can vary in how it is implemented behind the
scenes.
Post by Dan van der Ster
In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.
Right, that would work if that were the only supported way of dealing
with lvm. We aren't imposing this; we added it as a convenience if a
user did not want to deal with lvm at all. LVM has a plethora of ways
to create an LV, and we don't want to either restrict users to our view
of LVM or attempt to understand all the many different ways it may be
used and assume some behavior is desired (like removing a VG).
Post by Dan van der Ster
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan
Post by Alfredo Deza
Post by Stefan Kooman
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Alfredo Deza
2018-01-09 19:14:51 UTC
I would just like to mirror Dan van der Ster’s sentiments.
As someone attempting to move an OSD to bluestore, with limited/no LVM
experience, it is a completely different beast and complexity level compared
to the ceph-disk/filestore days.
ceph-deploy was a very simple tool that did exactly what I was looking to
do, but now we have deprecated ceph-disk halfway into a release, and ceph-deploy
doesn’t appear to fully support ceph-volume, which is now the official way
to manage OSDs moving forward.
ceph-deploy now fully supports ceph-volume; we should get a release soon
My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so
now I am trying to zap the disk to try to recreate the OSD, and the zap is
failing as Dan’s did.
I would encourage you to open a ticket in the tracker so that we can
improve on what failed for you

http://tracker.ceph.com/projects/ceph-volume/issues/new

ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and
/var/log/ceph/ceph-volume-systemd.log

If you create a ticket, please make sure to add all the output and
steps that you can
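
A minimal set of things to attach to such a ticket, using the paths above,
would be something like:

tail -n 100 /var/log/ceph/ceph-volume.log
tail -n 100 /var/log/ceph/ceph-volume-systemd.log
ceph-volume lvm list    # current state of the OSD devices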
And yes, I was able to get it zapped using the lvremove, vgremove, pvremove
commands, but that is not obvious to someone who hasn’t used LVM extensively
for storage management before.
I also want to mirror Dan’s sentiments about the unnecessary complexity
imposed on what I expect is the default use case of an entire disk being
used. I can’t see anything other than the ‘entire disk’ method being the
largest use case for users of ceph, especially the smaller clusters trying
to maximize hardware/spend.
We don't take the introduction of LVM lightly here. The new tool is
addressing several insurmountable issues with how ceph-disk operated.

Although using an entire disk might be easier in the use case you are
in, it is certainly not the only thing we have to support, so then
again, we can't
reliably decide what strategy would be best to destroy that volume, or
group, or if the PV should be destroyed as well.

The 'zap' sub-command will allow that lv to be reused for an OSD and
that should work. Again, if it isn't sufficient, we really do need
more information and a
ticket in the tracker is the best way.
Just wanted to piggy back this thread to echo Dan’s frustration.
Thanks,
Reed
Thanks Stefan. But isn't there also some vgremove or lvremove magic
that needs to bring down these /dev/dm-... devices I have?
lvremove -f <volume_group>/<logical_volume>
vgremove <volume_group>
pvremove /dev/ceph-device (should wipe labels)
So ideally there should be a ceph-volume lvm destroy / zap option that
1) Properly remove LV/VG/PV as shown above
2) wipefs to get rid of LVM signatures
3) dd zeroes to get rid of signatures that might still be there
ceph-volume does have a 'zap' subcommand, but it does not remove
logical volumes or groups. It is intended to leave those in place for
re-use. It uses wipefs, but
not in a way that would end up removing LVM signatures.
Docs for zap are at: http://docs.ceph.com/docs/master/ceph-volume/lvm/zap/
The reason for not attempting removal is that an LV might not map
1-to-1 onto a volume group or device. It is being suggested here to
"vgremove <volume_group>", but what if the group has several other LVs
that should not get removed? Similarly, what if the logical volume is
backed not by a single PV but by many?
We believe that these operations should be up to the administrator
with better context as to what goes where and what (if anything)
really needs to be removed
from LVM.
Maybe I'm missing something, but aren't most (almost all?) use-cases just
ceph-volume lvm create /dev/<thewholedisk>
No
? Or do you expect most deployments to do something more complicated with lvm?
Yes, we do. For example dmcache, which to ceph-volume looks like a
plain logical volume but can vary in how it is implemented behind the
scenes.
In that above whole-disk case, I think it would be useful to have a
very simple cmd to tear down whatever ceph-volume created, so that
ceph admins don't need to reverse engineer what ceph-volume is doing
with lvm.
Right, that would work if that were the only supported way of dealing
with lvm. We aren't imposing this; we added it as a convenience if a
user did not want to deal with lvm at all. LVM has a plethora of ways
to create an LV, and we don't want to either restrict users to our view
of LVM or attempt to understand all the many different ways it may be
used and assume some behavior is desired (like removing a VG).
Otherwise, perhaps it would be useful to document the expected normal
lifecycle of an lvm osd: create, failure / replacement handling,
decommissioning.
Cheers, Dan
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Fabian Grünbichler
2018-01-10 07:10:40 UTC
Post by Alfredo Deza
I would just like to mirror Dan van der Ster’s sentiments.
As someone attempting to move an OSD to bluestore, with limited/no LVM
experience, it is a completely different beast and complexity level compared
to the ceph-disk/filestore days.
ceph-deploy was a very simple tool that did exactly what I was looking to
do, but now we have deprecated ceph-disk halfway into a release, and ceph-deploy
doesn’t appear to fully support ceph-volume, which is now the official way
to manage OSDs moving forward.
ceph-deploy now fully supports ceph-volume; we should get a release soon
My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so
now I am trying to zap the disk to try to recreate the OSD, and the zap is
failing as Dan’s did.
I would encourage you to open a ticket in the tracker so that we can
improve on what failed for you
http://tracker.ceph.com/projects/ceph-volume/issues/new
ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and
/var/log/ceph/ceph-volume-systemd.log
If you create a ticket, please make sure to add all the output and
steps that you can
And yes, I was able to get it zapped using the lvremove, vgremove, pvremove
commands, but that is not obvious to someone who hasn’t used LVM extensively
for storage management before.
I also want to mirror Dan’s sentiments about the unnecessary complexity
imposed on what I expect is the default use case of an entire disk being
used. I can’t see anything other than the ‘entire disk’ method being the
largest use case for users of ceph, especially the smaller clusters trying
to maximize hardware/spend.
We don't take the introduction of LVM lightly here. The new tool is
addressing several insurmountable issues with how ceph-disk operated.
Although using an entire disk might be easier in the use case you are
in, it is certainly not the only thing we have to support, so then
again, we can't
reliably decide what strategy would be best to destroy that volume, or
group, or if the PV should be destroyed as well.
wouldn't it be possible to detect on creation that it is a full physical
disk that gets initialized completely by ceph-volume, store that in the
metadata somewhere and clean up accordingly when destroying the OSD?
Post by Alfredo Deza
The 'zap' sub-command will allow that lv to be reused for an OSD and
that should work. Again, if it isn't sufficient, we really do need
more information and a
ticket in the tracker is the best way.
Alfredo Deza
2018-01-10 12:57:59 UTC
On Wed, Jan 10, 2018 at 2:10 AM, Fabian Grünbichler
Post by Fabian Grünbichler
Post by Alfredo Deza
I would just like to mirror Dan van der Ster’s sentiments.
As someone attempting to move an OSD to bluestore, with limited/no LVM
experience, it is a completely different beast and complexity level compared
to the ceph-disk/filestore days.
ceph-deploy was a very simple tool that did exactly what I was looking to
do, but now we have deprecated ceph-disk halfway into a release, and ceph-deploy
doesn’t appear to fully support ceph-volume, which is now the official way
to manage OSDs moving forward.
ceph-deploy now fully supports ceph-volume; we should get a release soon
My ceph-volume create statement ‘succeeded’ but the OSD doesn’t start, so
now I am trying to zap the disk to try to recreate the OSD, and the zap is
failing as Dan’s did.
I would encourage you to open a ticket in the tracker so that we can
improve on what failed for you
http://tracker.ceph.com/projects/ceph-volume/issues/new
ceph-volume keeps thorough logs in /var/log/ceph/ceph-volume.log and
/var/log/ceph/ceph-volume-systemd.log
If you create a ticket, please make sure to add all the output and
steps that you can
And yes, I was able to get it zapped using the lvremove, vgremove, pvremove
commands, but that is not obvious to someone who hasn’t used LVM extensively
for storage management before.
I also want to mirror Dan’s sentiments about the unnecessary complexity
imposed on what I expect is the default use case of an entire disk being
used. I can’t see anything other than the ‘entire disk’ method being the
largest use case for users of ceph, especially the smaller clusters trying
to maximize hardware/spend.
We don't take the introduction of LVM lightly here. The new tool is
addressing several insurmountable issues with how ceph-disk operated.
Although using an entire disk might be easier in the use case you are
in, it is certainly not the only thing we have to support, so then
again, we can't
reliably decide what strategy would be best to destroy that volume, or
group, or if the PV should be destroyed as well.
wouldn't it be possible to detect on creation that it is a full physical
disk that gets initialized completely by ceph-volume, store that in the
metadata somewhere and clean up accordingly when destroying the OSD?
When the OSD is created, we capture a lot of metadata about devices:
what goes where (even if the device changes names), and which devices
are part of an OSD. For example, we can accurately tell if a device is
a journal and what OSD it is associated with.
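
That captured metadata is what `ceph-volume lvm list` reports; presumably
it is what produced the osd.240 listing at the top of this thread:

ceph-volume lvm list              # every OSD and its devices, from the stored tags
ceph-volume lvm list /dev/sdu     # or scoped to a single device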

The removal of an LV and its corresponding VG is very destructive, with
no way to revert, and even though we allow a simplistic approach of
creating the VG and LV for you, it doesn't necessarily mean that an
operator will want to have a VG fully destroyed when zapping an LV.

There are two use cases here:

1) An operator is redeploying and wants to completely remove the VG
(including the PV and LV), that may or may not have been created by
ceph-volume
2) An operator already has VGs and LVs in place and wants to reuse
them for an OSD - no need to destroy the underlying VG

We must support #2, but I see that there are a lot of users who would
like a more transparent removal of LVM-related devices, like what
ceph-volume does when creating.

How about a flag that allows that behavior (although not enabled by
default) so that `zap` can destroy the LVM devices as well? So instead
of:

ceph-volume lvm zap vg/lv

We would offer:

ceph-volume lvm zap --destroy vg/lv

Which would get rid of the lv, vg, and pv as well
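
For reference, the manual equivalent of that proposed --destroy flag,
assuming the default whole-disk layout (one LV, in one VG, on one PV),
would be roughly:

lvremove -f vg/lv              # remove the logical volume
vgremove vg                    # remove the now-empty volume group
pvremove /dev/the/device       # clear the PV label
wipefs -fa /dev/the/device     # plus any remaining signatures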
Post by Fabian Grünbichler
Post by Alfredo Deza
The 'zap' sub-command will allow that lv to be reused for an OSD and
that should work. Again, if it isn't sufficient, we really do need
more information and a
ticket in the tracker is the best way.