[ceph-users] Rotating Cephx Keys

Discussion:

Graeme Gillies

2018-07-09 01:05:59 UTC

Hi,

I was wondering how (if?) people handle rotating cephx keys while
keeping cluster up/available.

Part of meeting compliance standards such as PCI DSS is making sure that
data encryption keys and security credentials are rotated regularly and
during other key points (such as notable staff turnover).

We are currently looking at using Ceph as a storage solution and was
wondering how people handle rotating cephx keys (at the very least, the
admin and client.$user keys) while causing minimal/no downtime to ceph
or the clients.

My understanding is that if you change the keys stored in the ceph kv db
then any existing sessions should still continue to work, but any new
ones (say, a hypervisor establishing new connections to osds for a new
vm volume) will fail until the key on the client side is also updated.

I attempted to set two keys against the same client to see if I can have
an "overlap" period of new and old keys before rotating out the old key,
but it seems that ceph only has the concept of 1 key per user.

Any hints, advice, or any information on how to achieve this would be
much appreciated.

Thanks in advance,

Graeme

Gregory Farnum

2018-07-09 18:40:55 UTC

Permalink

Post by Graeme Gillies
Thanks in advance,
Graeme
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Graeme Gillies

2018-07-09 23:57:20 UTC

Permalink

Post by Graeme Gillies
Hi,
I was wondering how (if?) people handle rotating cephx keys while
keeping cluster up/available.
Part of meeting compliance standards such as PCI DSS is making sure that
data encryption keys and security credentials are rotated
regularly and
during other key points (such as notable staff turnover).
We are currently looking at using Ceph as a storage solution and was
wondering how people handle rotating cephx keys (at the very least, the
admin and client.$user keys) while causing minimal/no downtime to ceph
or the clients.
My understanding is that if you change the keys stored in the ceph kv db
then any existing sessions should still continue to work, but any new
ones (say, a hypervisor establishing new connections to osds for a new
vm volume) will fail until the key on the client side is also updated.
I attempted to set two keys against the same client to see if I can have
an "overlap" period of new and old keys before rotating out the old key,
but it seems that ceph only has the concept of 1 key per user.
Any hints, advice, or any information on how to achieve this would be
much appreciated.
This isn't something I've seen come up much. Your understanding sounds
correct to me, so as a naive developer I'd assume you just change the
key in the monitors and distribute the new one to whoever should have
it. There's a small window in which the admin with the old key can't
do anything, but presumably you can coordinate around that?
The big issue I'm aware of is that orchestration systems like
OpenStack don't always do a good job supporting those changes â eg, I
think it embeds some keys in its database descriptor for the rbd
volume? :/
-Greg

I think the biggest problem with simply changing keys is that lets say I
have a client connecting to ceph using a ceph.client.user account. If I
want to rotate the keys for that I can simply do that ceph cluster side,
but then I also need to do that on the client side (in my case virtual
machine hypervisors). DUring this window (which might be tiny with
decent tooling, but still non-zero) my clients can't do new connections
to the ceph cluster, which I assume will cause issues.

I do wonder if an RFE to allow ceph auth to accept multiple keys for
client would be accepted? That way I would add my new key to the ceph
auth (so clients can authenticate with either key), then rotate it out
on my hypervisors, then remove the old key from ceph auth when done.

As for Openstack, when I used it I was pretty sure it simply used
ceph.conf of the nova-compute hosts to connect to ceph (at least for
libvirt) however that doesn't mean it does something else for other
hypervisors or implementations.

Regards,

Graeme

Post by Graeme Gillies
Â
Thanks in advance,
Graeme
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
<https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com&d=DwMFaQ&c=96ZbZZcaMF4w0F4jpN6LZg&r=xj3ftZ-YqnF9lrwuKqTuKqnVjnR0xteMhkRGTrOYYdo&m=zjBoDFxwgvY8z1hrXjQlTuTT-ahxsiJPmUGoLYKoSa4&s=Q7T-1zinvB8CyQKTLjBy8E5EFeBScmFX7T-5RJ8svHs&e=>

Gregory Farnum

2018-07-10 01:29:29 UTC

Permalink

Post by Gregory Farnum

This isn't something I've seen come up much. Your understanding sounds
correct to me, so as a naive developer I'd assume you just change the key
in the monitors and distribute the new one to whoever should have it.
There's a small window in which the admin with the old key can't do
anything, but presumably you can coordinate around that?
The big issue I'm aware of is that orchestration systems like OpenStack
don't always do a good job supporting those changes â eg, I think it embeds
some keys in its database descriptor for the rbd volume? :/
-Greg
I think the biggest problem with simply changing keys is that lets say I
have a client connecting to ceph using a ceph.client.user account. If I
want to rotate the keys for that I can simply do that ceph cluster side,
but then I also need to do that on the client side (in my case virtual
machine hypervisors). DUring this window (which might be tiny with decent
tooling, but still non-zero) my clients can't do new connections to the
ceph cluster, which I assume will cause issues.

Well, it will depend on what they're doing, right? Most Ceph clients just
set up a monitor connection and then maintain it until they shut down.
Although I guess if they need to establish a session to a *new* monitor and
you've changed the key, that might not go well â the client libraries
really aren't set up for that. Hrm.

Post by Gregory Farnum
I do wonder if an RFE to allow ceph auth to accept multiple keys for
client would be accepted? That way I would add my new key to the ceph auth
(so clients can authenticate with either key), then rotate it out on my
hypervisors, then remove the old key from ceph auth when done.

PRs are always welcome! This would probably take some work, though â the
"AuthMonitor" storage would need a pretty serious change, and the client
libraries extended to deal with changing them online as well.
-Greg

Konstantin Shalygin

2018-07-10 04:37:30 UTC

Permalink

Post by Graeme Gillies
If I
want to rotate the keys for that I can simply do that ceph cluster side,
but then I also need to do that on the client side (in my case virtual
machine hypervisors). DUring this window (which might be tiny with
decent tooling, but still non-zero) my clients can't do new connections
to the ceph cluster, which I assume will cause issues.

It's depends on orchestrator. For example, oVirt maintain cephx keys by
ovirt-engine. So, if key is changed we need to update key in oVirt,
after this - every new client will use new key = zero downtime. Simple
k,v storage.

Don't know how it looks in pure OpenStack, but oVirt hosts not need
ceph.conf, keys always pushed by ovirt-engine.

k

Graeme Gillies

2018-07-10 04:41:17 UTC

Permalink

Post by Konstantin Shalygin

I think you are missing the part where if you update a key in ceph, in
the space between that and when you update it in ovirt-engine any new
connections to ceph by any ovirt nodes will fail (as the key they have
ovirt side no longer matches what you have in ovirt-engine and all the
ovirt nodes).

That's the problem (unless I am misunderstanding what you are saying)

Post by Konstantin Shalygin
Don't know how it looks in pure OpenStack, but oVirt hosts not need
ceph.conf, keys always pushed by ovirt-engine.
k

Konstantin Shalygin

2018-07-10 05:21:07 UTC

Permalink

Post by Graeme Gillies
I think you are missing the part where if you update a key in ceph, in
the space between that and when you update it in ovirt-engine any new
connections to ceph by any ovirt nodes will fail

Yes, this is should be seconds. But, actually startup will not be failed
for user, because if first start is failed, ovirt-engine will continue
try's on another hosts.
If you start 1000+ vms per sec this is will not works for you.

Post by Graeme Gillies
(as the key they have
ovirt side no longer matches what you have in ovirt-engine and all the
ovirt nodes).

oVirt-hosts does not have any cephx keys. oVirt-hosts don't know about
Ceph anything. Keys is always pushed by ovirt-engine in xml to libvirt.

k