Wido den Hollander
2018-11-15 12:43:01 UTC
Hi,
Recently we've seen multiple messages on the mailing lists from people
seeing HEALTH_WARN due to large OMAP objects in their cluster. This is
because OSDs warn about this starting with 12.2.6.
Multiple people have asked me the same questions, so I've done some
digging around.
Somebody on the ML wrote this script:
for bucket in `radosgw-admin metadata list bucket | jq -r '.[]' | sort`; do
    actual_id=`radosgw-admin bucket stats --bucket=${bucket} | jq -r '.id'`
    for instance in `radosgw-admin metadata list bucket.instance | jq -r '.[]' | grep ${bucket}: | cut -d ':' -f 2`
    do
        if [ "$actual_id" != "$instance" ]
        then
            radosgw-admin bi purge --bucket=${bucket} --bucket-id=${instance}
            radosgw-admin metadata rm bucket.instance:${bucket}:${instance}
        fi
    done
done
That partially works, but it does not handle 'orphaned' objects in the index pool.
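Since that script purges as it loops, it can help to first print what it would remove. A minimal sketch, assuming each bucket.instance entry has the form <bucket>:<instance id> (which is what the cut in the script above implies); stale_instances is a hypothetical helper name, and it matches the bucket name exactly rather than with grep:

```shell
#!/bin/bash
# Print the instance IDs of a bucket that differ from its current ID.
# Reads "<bucket>:<instance id>" lines on stdin and deletes nothing.
# stale_instances is a hypothetical helper, not part of radosgw-admin.
stale_instances() {
    local bucket=$1 actual_id=$2
    # $1 == b is an exact match, so bucket "foo" does not also match "barfoo"
    awk -F ':' -v b="$bucket" -v id="$actual_id" \
        '$1 == b && $2 != id { print $2 }'
}

# Example wiring (commented out, needs a live cluster; "mybucket" is a placeholder):
# id=$(radosgw-admin bucket stats --bucket=mybucket | jq -r '.id')
# radosgw-admin metadata list bucket.instance | jq -r '.[]' \
#     | stale_instances mybucket "$id"
```

Only once the printed list looks right would the bi purge and metadata rm steps be run on it.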
So I wrote my own script [0]:
#!/bin/bash
# List index pool objects whose bucket instance ID is not in the RGW metadata.
INDEX_POOL=$1
if [ -z "$INDEX_POOL" ]; then
    echo "Usage: $0 <index pool>"
    exit 1
fi
INDEXES=$(mktemp)
METADATA=$(mktemp)
trap 'rm -f "$INDEXES" "$METADATA"' EXIT
radosgw-admin metadata list bucket.instance | jq -r '.[]' > "$METADATA"
rados -p "$INDEX_POOL" ls > "$INDEXES"
while read -r OBJECT; do
    # Drop the leading ".dir." and keep the three-part instance ID
    MARKER=$(echo "$OBJECT" | cut -d '.' -f 3,4,5)
    # -F matches the marker literally, so its dots are not regex wildcards
    if ! grep -qF -- "$MARKER" "$METADATA"; then
        echo "$OBJECT"
    fi
done < "$INDEXES"
The script does not remove anything; it only prints candidates. For example, it returns these objects:
.dir.eb32b1ca-807a-4867-aea5-ff43ef7647c6.10406917.5752
.dir.eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6162
.dir.eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6186
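The ID the script compares for each of these names can be checked in isolation. A small sketch of the cut step on its own; marker_of is a hypothetical helper name, and the second call assumes a sharded index object that carries an extra trailing shard number:

```shell
#!/bin/bash
# Extract the instance ID from an index object name.
# Splitting on '.', field 1 is empty, field 2 is "dir",
# and fields 3-5 are the three-part instance ID.
marker_of() {
    printf '%s\n' "$1" | cut -d '.' -f 3,4,5
}

marker_of .dir.eb32b1ca-807a-4867-aea5-ff43ef7647c6.10406917.5752
# prints eb32b1ca-807a-4867-aea5-ff43ef7647c6.10406917.5752

# Assumed sharded name: the trailing ".3" is field 6 and is dropped
marker_of .dir.eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6162.3
# prints eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6162
```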
The output of:
$ radosgw-admin metadata list bucket.instance|jq -r '.[]'
does not contain:
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10406917.5752
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6162
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6186
So to me these objects do not seem to be tied to any bucket and seem to
be leftovers which were not cleaned up.
By contrast, I do see these IDs tied to a bucket:
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6160
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6188
- eb32b1ca-807a-4867-aea5-ff43ef7647c6.10289105.6167
But notice the difference: 6160, 6188 and 6167 are there, but 6162 and 6186 are not.
Before I remove these objects I want to verify with other users whether
they see the same and whether my thinking is correct.
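One read-only check that could be run on the candidates before deleting anything is to count their omap keys, to see which of them are actually large. A minimal sketch, assuming the candidate names are fed on stdin (for example, the output of the script above); omap_key_counts is a hypothetical helper name, and the only Ceph call it makes is rados listomapkeys:

```shell
#!/bin/bash
# Print "<omap key count> <object>" for each object name read from stdin.
# Read-only: nothing is modified or removed.
# omap_key_counts is a hypothetical helper, not a Ceph command.
omap_key_counts() {
    local pool=$1 object count
    while read -r object; do
        # </dev/null keeps rados from consuming the object list on stdin
        count=$(rados -p "$pool" listomapkeys "$object" </dev/null | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "$object"
    done
}

# Example wiring (commented out, needs a live cluster; script and pool names are placeholders):
# ./find-orphans.sh my.index.pool | omap_key_counts my.index.pool | sort -n
```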
Wido
[0]: https://gist.github.com/wido/6650e66b09770ef02df89636891bef04