Discussion:
[ceph-users] RGW Swift metadata dropped when S3 bucket versioning enabled
Maxime Guyot
2018-11-27 19:28:11 UTC
Hi,

I'm running into an issue with the RadosGW Swift API when the S3 bucket
versioning is enabled. It looks like it silently drops any metadata sent
with the "X-Object-Meta-foo" header (see example below).
This is observed on a Luminous 12.2.8 cluster. Is that a normal thing? Am I
misconfiguring something here?


With S3 bucket versioning OFF:
$ openstack object set --property foo=bar test test.dat
$ openstack object show test test.dat
+----------------+----------------------------------+
| Field          | Value                            |
+----------------+----------------------------------+
| account        | v1                               |
| container      | test                             |
| content-length | 507904                           |
| content-type   | binary/octet-stream              |
| etag           | 03e8a398f343ade4e1e1d7c81a66e400 |
| last-modified  | Tue, 27 Nov 2018 13:53:54 GMT    |
| object         | test.dat                         |
| properties     | Foo='bar'                        |  <= Metadata is here
+----------------+----------------------------------+

With S3 bucket versioning ON:
$ openstack object set --property foo=bar test test2.dat
$ openstack object show test test2.dat
+----------------+----------------------------------+
| Field          | Value                            |
+----------------+----------------------------------+
| account        | v1                               |
| container      | test                             |
| content-length | 507904                           |
| content-type   | binary/octet-stream              |
| etag           | 03e8a398f343ade4e1e1d7c81a66e400 |
| last-modified  | Tue, 27 Nov 2018 13:56:50 GMT    |
| object         | test2.dat                        |  <= Metadata is absent
+----------------+----------------------------------+
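
For reference, a rough sketch of what the two commands above amount to at the HTTP level. It assumes the client turns "--property foo=bar" into an "X-Object-Meta-Foo: bar" header on a Swift POST, and builds the "properties" row from the X-Object-Meta-* headers of a HEAD response. The helper names are made up for illustration; this is not the openstack client's actual code.

```python
META_PREFIX = "x-object-meta-"


def property_headers(props):
    """Build Swift request headers for a dict of object properties."""
    return {"X-Object-Meta-" + key.capitalize(): value
            for key, value in props.items()}


def properties_from_headers(headers):
    """Extract user metadata from HEAD response headers.

    Header names are matched case-insensitively; the part after the
    prefix is kept exactly as the server returned it.
    """
    found = {}
    for name, value in headers.items():
        if name.lower().startswith(META_PREFIX):
            found[name[len(META_PREFIX):]] = value
    return found


# Versioning off: the header comes back and shows up as a property.
print(properties_from_headers(
    {"Content-Length": "507904", "X-Object-Meta-Foo": "bar"}))
# Versioning on (as reported above): no X-Object-Meta-* header comes back.
print(properties_from_headers({"Content-Length": "507904"}))
```

Comparing the raw HEAD headers in both cases (for example with "openstack --debug object show ...") would show whether the header is missing from the response itself or only from the client's rendering.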

Cheers,

/ Maxime
Florian Haas
2018-11-28 16:58:39 UTC
Can you elaborate on what exactly you're doing here to enable S3 bucket
versioning? Do I assume correctly that you are creating the "test"
container using the swift or openstack client, then sending a
VersioningConfiguration request against the "test" bucket, as explained
in the AWS documentation?
https://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html#how-to-enable-disable-versioning-intro
Semi-related: I've seen some interesting things when mucking around with
a single container/bucket while switching APIs, when it comes to
container properties and metadata. For example, if you set a public read
ACL on an S3 bucket, the corresponding Swift container is also
publicly readable but its read ACL looks empty (i.e. private) when you
ask via the Swift API.
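
A small sketch of how one might flag that situation automatically. It only compares what each API reports, under two assumptions: the S3 side is given as (grantee URI, permission) pairs (with boto these could be read from the "uri" and "permission" attributes of bucket.get_acl().acl.grants), and the Swift side is the raw X-Container-Read header value. The function names are hypothetical.

```python
# Grantee URI that S3 uses for "everyone".
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def s3_reports_public_read(grants):
    """True if any (grantee_uri, permission) pair grants read to AllUsers."""
    return any(uri == ALL_USERS_URI and permission in ("READ", "FULL_CONTROL")
               for uri, permission in grants)


def swift_reports_public_read(read_acl):
    """True if the X-Container-Read value contains the '.r:*' element."""
    elements = [element.strip() for element in (read_acl or "").split(",")]
    return ".r:*" in elements


def acl_views_disagree(grants, read_acl):
    """True when one API reports public read access and the other does not."""
    return s3_reports_public_read(grants) != swift_reports_public_read(read_acl)


# The case described above: public-read via S3, empty read ACL via Swift.
print(acl_views_disagree([(ALL_USERS_URI, "READ")], ""))
```

This says nothing about what is actually reachable anonymously; it only detects that the two views differ.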

Cheers,
Florian
Maxime Guyot
2018-11-28 18:06:39 UTC
Hi Florian,

You assumed correctly: the "test" container (private) was created with
"openstack container create test", and I then use the S3 API to
enable/disable object versioning on it.
I use the following Python snippet to enable/disable S3 bucket versioning:

import boto
import boto.s3.connection

# Connect to the RGW S3 endpoint; credentials and host are redacted.
conn = boto.connect_s3(
    aws_access_key_id='***',
    aws_secret_access_key='***',
    host='***',
    port=8080,
    calling_format=boto.s3.connection.OrdinaryCallingFormat(),
)
bucket = conn.get_bucket('test')
bucket.configure_versioning(True)  # Or False to disable S3 bucket versioning
print(bucket.get_versioning_status())  # e.g. {'Versioning': 'Enabled'}
Post by Florian Haas
Semi-related: I've seen some interesting things when mucking around with
a single container/bucket while switching APIs, when it comes to
container properties and metadata. For example, if you set a public read
ACL on an S3 bucket, the corresponding Swift container is also
publicly readable but its read ACL looks empty (i.e. private) when you
ask via the Swift API.
This can definitely become a problem if the Swift API says "private" but the
data is actually publicly available.
Since the doc says "S3 and Swift APIs share a common namespace, so you may
write data with one API and retrieve it with the other", it might be useful
to document this kind of limitation somewhere.

Cheers,
/ Maxime
Yehuda Sadeh-Weinraub
2018-11-29 11:34:46 UTC
Post by Maxime Guyot
This can definitely become a problem if the Swift API says "private" but the data is actually publicly available.
Since the doc says "S3 and Swift APIs share a common namespace, so you may write data with one API and retrieve it with the other", it might be useful to document this kind of limitation somewhere.
Note that Swift ACLs and S3 ACLs don't map perfectly to each other. An
S3 public-read ACL on a bucket doesn't mean that the data is accessible,
but rather that the bucket can be listed. In Swift, the container ACLs
are about the objects inside. I'm not sure there is an equivalent Swift
ACL that deals only with the ability to list objects in the container.

Yehuda
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Florian Haas
2018-11-30 21:28:09 UTC
Thanks for making this so easy to reproduce! I must confess upfront that
I've found myself unable to reproduce your problem, but I've retraced
your steps and maybe you'll find this useful to develop a hypothesis as
to what's happening in your case.

$ openstack object show -f shell foo bar
account="AUTH_5ed51981f4a8468292bf2c578806ebf7"
container="foo"
content_length="12"
content_type="text/plain"
last_modified="Thu, 22 Nov 2018 15:02:57 GMT"
object="bar"
properties="S3cmd-Attrs='atime:1542629253/ctime:1542629253/gid:1000/gname:florian/md5:6f5902ac237024bdd0c176cb93063dc4/mode:33204/mtime:1542629253/uid:1000/uname:florian'"

See the properties that are set there? These are obviously not
properties ever set through the Swift API, but instead they were set
when I uploaded this object into the corresponding bucket, using the S3 API.
Via the S3 API (boto):

>>> foo = conn.get_bucket('foo')
>>> bar = foo.get_key('bar')
>>> bar.metadata
{'s3cmd-attrs':
u'atime:1542629253/ctime:1542629253/gid:1000/gname:florian/md5:6f5902ac237024bdd0c176cb93063dc4/mode:33204/mtime:1542629253/uid:1000/uname:florian'}
>>> foo.configure_versioning(True)
True
>>> foo.get_versioning_status()
{'Versioning': 'Enabled'}
>>> bar.metadata
{'s3cmd-attrs':
u'atime:1542629253/ctime:1542629253/gid:1000/gname:florian/md5:6f5902ac237024bdd0c176cb93063dc4/mode:33204/mtime:1542629253/uid:1000/uname:florian'}
>>> bar = foo.get_key('bar')
>>> bar.metadata
{'s3cmd-attrs':
u'atime:1542629253/ctime:1542629253/gid:1000/gname:florian/md5:6f5902ac237024bdd0c176cb93063dc4/mode:33204/mtime:1542629253/uid:1000/uname:florian'}
>>> foo.configure_versioning(False)
True
>>> foo.get_versioning_status()
{'Versioning': 'Suspended'}

Now add a property using the Swift API:

$ openstack object set --property spam=eggs foo bar

And read it back:

$ openstack object show -f shell foo bar
account="AUTH_5ed51981f4a8468292bf2c578806ebf7"
container="foo"
content_length="12"
content_type="text/plain"
last_modified="Wed, 28 Nov 2018 19:52:48 GMT"
object="bar"
properties="Spam='eggs'"

Notice that not only has the property been set, it has *overwritten* the
S3 properties that were set before. I am not sure if this is meant to be
this way, i.e. if native Swift acts this way too, but it appears to be
how radosgw does it.
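
For what it's worth, the OpenStack Object Storage API reference describes an object POST as replacing the object's existing user metadata rather than merging into it, so the overwrite is in line with native Swift. A minimal sketch of how a client could keep the old keys, assuming it first fetches the current headers with a HEAD request and then resends everything in a single POST (the function name is hypothetical):

```python
META_PREFIX = "x-object-meta-"


def merge_object_meta(existing_headers, new_props):
    """Return the full set of metadata headers to send in one Swift POST.

    existing_headers: headers from a HEAD on the object (any case).
    new_props: dict of property name -> value to add or override.
    Header names are lower-cased, which is fine because HTTP header
    names are case-insensitive.
    """
    merged = {}
    for name, value in existing_headers.items():
        if name.lower().startswith(META_PREFIX):
            merged[name.lower()] = value
    for key, value in new_props.items():
        merged[META_PREFIX + key.lower()] = value
    return merged


current = {"Content-Type": "text/plain",
           "X-Object-Meta-S3cmd-Attrs": "uid:1000/uname:florian"}
print(merge_object_meta(current, {"spam": "eggs"}))
```

Sending the merged headers keeps the s3cmd attributes alongside the new "spam" property instead of dropping them.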

However, now that I have the "spam" property set, I go ahead and re-enable
versioning:

>>> foo.configure_versioning(True)
True
>>> foo.get_versioning_status()
{'Versioning': 'Enabled'}

And then I re-query my object:

$ openstack object show -f shell foo bar
account="AUTH_5ed51981f4a8468292bf2c578806ebf7"
container="foo"
content_length="12"
content_type="text/plain"
last_modified="Thu, 29 Nov 2018 11:47:41 GMT"
object="bar"
properties="Spam='eggs'"

So as you can see, in my case the "spam" property, when defined with the
Swift API, did get preserved even across enabling versioning.

I can also use s3cmd to check that the Swift meta header looks like an
x-amz-meta header when querying the same object via S3:

$ s3cmd info s3://foo/bar
s3://foo/bar (object):
File size: 12
Last mod: Thu, 29 Nov 2018 11:47:41 GMT
MIME type: text/plain
Storage: STANDARD
MD5 sum: 6f5902ac237024bdd0c176cb93063dc4
SSE: none
Policy: none
CORS: none
x-amz-meta-spam: eggs
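
The naming correspondence seen in that output can be written down as a small sketch. It only illustrates how the two header prefixes line up for the same user metadata key; it makes no claim about how radosgw stores the attribute internally, and the function names are made up.

```python
SWIFT_PREFIX = "x-object-meta-"
S3_PREFIX = "x-amz-meta-"


def swift_to_s3_meta(header_name):
    """Map a Swift user-metadata header name to its S3 counterpart."""
    lowered = header_name.lower()
    if not lowered.startswith(SWIFT_PREFIX):
        raise ValueError("not a Swift object metadata header: %r" % header_name)
    return S3_PREFIX + lowered[len(SWIFT_PREFIX):]


def s3_to_swift_meta(header_name):
    """Map an S3 user-metadata header name to its Swift counterpart."""
    lowered = header_name.lower()
    if not lowered.startswith(S3_PREFIX):
        raise ValueError("not an S3 user metadata header: %r" % header_name)
    return SWIFT_PREFIX + lowered[len(S3_PREFIX):]


print(swift_to_s3_meta("X-Object-Meta-Spam"))
print(s3_to_swift_meta("x-amz-meta-s3cmd-attrs"))
```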

So could it be that there is *something* you're doing that just
overwrites your metadata?
Post by Maxime Guyot
This can definitely become a problem if the Swift API says "private" but
the data is actually publicly available.
Since the doc says "S3 and Swift APIs share a common namespace, so you
may write data with one API and retrieve it with the other", it might be
useful to document this kind of limitation somewhere.
I agree, but as an aside I believe that in your particular case you're
highlighting one of the inherent limitations of the dual-API capability.
S3 bucket versioning works, as far as I understand, quite differently
from versioned objects in Swift, so making radosgw behave correctly for
versioning in both APIs — as soon as it's enabled in only one — strikes
me as really, really difficult. Similar to what Yehuda mentioned about
ACL differences.

Not sure if this helps?

Cheers,
Florian
Maxime Guyot
2018-12-05 16:35:24 UTC
Hi Florian,

Thanks for the help. I did further testing and narrowed it down to objects
that were uploaded while the bucket had versioning enabled.
Objects created before versioning was enabled are not affected: all metadata
operations still work on them.

Here is a simple way to reproduce this:
http://paste.openstack.org/show/736713/
And here is the snippet to easily turn on/off S3 versioning on a given
bucket: https://gist.github.com/Miouge1/b8ae19b71411655154e74e609b61f24e
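
A rough sketch of that kind of check, written against the plain Swift HTTP API with only the Python standard library. It is not the content of the paste: the URL layout, the token handling and the function names are assumptions. The idea is to run it once with versioning suspended and once with versioning enabled on the bucket, and compare the results.

```python
import urllib.request


def swift_request(method, url, token, headers=None, data=None,
                  opener=urllib.request.urlopen):
    """Send one Swift API request; return response headers (lower-cased names)."""
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("X-Auth-Token", token)
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    with opener(request) as response:
        return {name.lower(): value for name, value in response.getheaders()}


def metadata_survives(object_url, token, opener=urllib.request.urlopen):
    """Upload an object, set one property, and report whether it reads back."""
    swift_request("PUT", object_url, token,
                  headers={"Content-Type": "application/octet-stream"},
                  data=b"test data", opener=opener)
    swift_request("POST", object_url, token,
                  headers={"X-Object-Meta-Foo": "bar"}, opener=opener)
    head = swift_request("HEAD", object_url, token, opener=opener)
    return head.get("x-object-meta-foo") == "bar"


# Example call (placeholder values, not a real endpoint or token):
# metadata_survives("http://rgw.example.com:8080/swift/v1/test/test2.dat", "<token>")
```

The "opener" parameter exists only so the HTTP layer can be swapped out, for instance with a stub when trying the logic without a cluster.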

Cheers,
Maxime
Florian Haas
2018-12-05 16:41:53 UTC
All right, by my reckoning this would very much look like a bug then.
You probably want to chuck an issue for this into
https://tracker.ceph.com/projects/rgw.

Out of curiosity, are you also seeing Swift metadata getting borked when
you're enabling *Swift* versioning? (Wholly different animal, I know,
but still worth taking a look I think.)

Cheers
Florian
Matt Benjamin
2018-12-05 17:20:40 UTC
Agree, please file a tracker issue with the info, we'll prioritize
reproducing it.

Cheers,

Matt
--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309